Got a quick question? Ask the CPA in your pocket

Compliance Software

A chat between a user and ChatCPA, in which the user asks how to minimize their tax burden

Imagine if you had a chatbot on your phone with an encyclopaedic knowledge of tax law. For at least a year, a chatbot on WhatsApp has been answering accounting questions for business owners and accountants in private and public practice. ChatCPA is the personal project of Nick Miller, a CFO raised in Portland who now works in Denmark. He has trained and refined the chatbot on ChatGPT 3.5 for the past year and has more than 5,000 users paying by subscription.

Miller is not a software engineer; ChatCPA emerged from a database Miller built to save time looking up accounting standards such as US GAAP and IFRS. Now, mid-sized accounting firms with 400 employees are buying the app for their compliance teams.

The interview has been edited for length and clarity.

DigitalFirst: I'm surprised that Wolters Kluwer or Thomson Reuters hasn't done something similar because it seems like a really smart thing to do. How did you get into doing this?

Miller: I was in audit for eight years, right out of college, where I studied accounting. It was for a small firm in Portland, Oregon. I was doing statutory audits of governments and nonprofits.

I came to the realisation that I needed a more sophisticated way to access and interpret all different kinds of accounting literature and guidance, whether it's tax laws from the IRC or IFRS versus US GAAP – as basic as that. 

I wanted something that could read it for me and give me insights in real time. Not final answers or anything, but just point me in the right direction, where I could then use my professional discretion and experience to fill in the gaps of what it couldn't do. 

DigitalFirst: So it was a lookup service, of sorts?

Miller: Yes. But it was nothing compared to what OpenAI put on the market last December. It was out well before that. I was just using it for myself as a housing of data and inputs that I've crunched for years, and I was looking up and recalling things that I'd already taught it. I wasn't as ambitious as I could have been. It didn't matter to me to get it in the hands of other people.

DigitalFirst: How did you go from being an auditor to “I'm building my own AI-driven lookup service”? 

Miller: That’s too nice of a way to put it. I didn't go from an auditor to some AI guru tech genius. I was just building this platform that had the means to look up, and not interpret, but recall what I've already taught it. 

DigitalFirst: Many accountants have the need for this but clearly not many have gone to the effort. Why you?

Miller: You're right. The Venn diagram of accountants and tech-inclined people, it's not exactly a perfect circle. In fact, those circles probably don't even touch. It's just a hobby of mine. I’m not really a programmer but I like building computers and pushing the limits of what I'm able to code. 

DigitalFirst: So you downloaded a whole stack of publicly available information into a database and were querying that database to help you just find things faster. 

Miller: Yes. In 2018 I was working at a large telecommunications company and doing some audits on the side. I still needed to comply with US GAAP and learn the reporting standards that I no longer had access to at my old audit firm. So I had this database that I could look up or query, and I was making it smarter. 

DigitalFirst: What was the feedback process?

Miller: It could generate a reply – not like a large language model, but it could anticipate what I was asking based on my historical communication. It could recall what I had last sent to it, and that's where it would start. 

DigitalFirst: For a non-programmer that sounds quite sophisticated. It definitely sounds more than just a stack of PDFs. 

Miller: It's not just a Control-F on a large PDF file. It was a little bit better than that. Then I moved to Denmark and started working at a fintech startup, where I have been for the last three years. And the project fell by the wayside. I wasn't really doing any audits. US GAAP was the foundation of the original model and it didn't matter (in Denmark) because it's not US GAAP or IFRS. 

Then last year, ChatGPT became more mainstream. And I thought, that sounds quite familiar, let's see what it can do. And it was really impressive. It was a much more commercially viable setup than the rinky-dink model I had on my old desktop.

So I began fine-tuning a GPT-3 model, which was a lot easier than what I had done before. I could upload the same databases that I was using in the past and teach the model how to interpret the data even better than GPT-3 can. I was running questions through it, testing it, seeing if the answer was right and where it needed help, and I could tell it, “No, this is the correct answer”.

I just did that for months and months until I felt the model was ready. 
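The review loop Miller describes (ask a question, check the answer, supply the correct one where needed) can be sketched as a small data-preparation step. This is a minimal illustration, not ChatCPA's actual pipeline: the field names `question`, `model_answer` and `corrected_answer` are hypothetical, and the prompt/completion JSONL layout is assumed because it is a common format for fine-tuning data.

```python
import json


def to_training_records(reviewed):
    """Turn reviewed Q&A items into prompt/completion pairs.

    Each item is a dict with 'question', 'model_answer' and an optional
    'corrected_answer' added by the human reviewer (hypothetical names).
    """
    records = []
    for item in reviewed:
        # Prefer the reviewer's correction; otherwise keep the model's answer
        answer = item.get("corrected_answer") or item["model_answer"]
        records.append({"prompt": item["question"], "completion": answer})
    return records


def to_jsonl(records):
    """Serialise records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


# Placeholder review data for illustration only
reviewed = [
    {
        "question": "Which ASC topic covers revenue from contracts with customers?",
        "model_answer": "ASC 606",
    },
    {
        "question": "Placeholder question the model got wrong",
        "model_answer": "placeholder wrong answer",
        "corrected_answer": "placeholder corrected answer",
    },
]

records = to_training_records(reviewed)
print(to_jsonl(records))
```

Keeping accepted answers alongside corrected ones means the training file reinforces what the model already gets right as well as fixing what it gets wrong.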

The one thing I think ChatGPT lacked was a user-friendly interface. I found a way to connect the fine-tuned model into WhatsApp so anyone could query it.

DigitalFirst: Why did you choose a mobile app as an interface and not a desktop? It's something you're more likely to use at a desk than at a bus stop.

Miller: It's just that's what everyone uses. The mobile is the most important. If you looked at my user base, you would know over 90 percent of my active daily users are using mobile. You can pull up WhatsApp or Facebook Messenger on your desktop and it works just the same.

DigitalFirst: And when you talk about fine-tuning, obviously there's an infinite number of questions you could ask and it’s impossible to correct everything. Plus you have to deal with potential hallucinations. How did you get around that?

Miller: There's literally an infinite number of things you could ask and it'll never be perfect. But I don't consider that too big a downside. If you're asking a CPA how to fill out a form and by what date, they are not perfect either. But the more you feed it, I find the better it gets. I'll test it on any random day and upload questions. I'll also have it make its own questions and then answer them, and (decide whether it got them right). 

It's still kind of baffling to me to even contemplate that concept, where it's generating its own test questions. 

I’m teaching it what I know and what I consider the correct way to interpret this IRC section or this US GAAP standard or the codification ASC 606. 

But that's not really enough for a user to pay a subscription every month. I considered the model ready for the marketplace when it could easily pass any CPA test exam that I gave it. I was using Wiley and Becker and any other free test software that I could use, and it was passing a lot better than I scored.
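The readiness check Miller describes, running practice exams through the model and seeing whether it passes, could look something like the sketch below. Everything here is an assumption for illustration: `answer_fn` stands in for a call to the model, the exam is a list of multiple-choice question and answer-key pairs, and the 0.75 pass mark is an arbitrary threshold rather than the real CPA exam scoring method.

```python
def score_exam(answer_fn, exam, pass_mark=0.75):
    """Grade a multiple-choice exam.

    answer_fn: callable taking a question string and returning a choice
               such as "A" (a stand-in for querying the model).
    exam:      list of (question, expected_choice) pairs.
    Returns (fraction_correct, passed).
    """
    correct = 0
    for question, expected in exam:
        given = answer_fn(question).strip().upper()
        if given == expected.strip().upper():
            correct += 1
    score = correct / len(exam)
    return score, score >= pass_mark


# Placeholder exam and a stub "model" that looks answers up in a table
exam = [("Q1", "A"), ("Q2", "C"), ("Q3", "B"), ("Q4", "D")]
stub_answers = {"Q1": "a", "Q2": "C", "Q3": "B ", "Q4": "A"}  # Q4 is wrong

score, passed = score_exam(lambda q: stub_answers[q], exam)
print(f"score={score:.2f} passed={passed}")
```

Normalising case and whitespace before comparing keeps the grader from failing an answer over formatting alone.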

DigitalFirst: I guess that's the thing. It doesn't need to be perfect, it just needs to be better than the standard CPA.

Miller: It needs to be better than your average CPA. I can say pretty easily that it is. It's much better than me. It can produce good enough quality answers that I trust it to guide the users with very little oversight. I hardly interfere or step in, in the middle of a conversation.

DigitalFirst: With ChatGPT it's not retaining the knowledge from the conversations you've had with it previously. But you're saying that the model you've set up is remembering the answers given to particular questions, and you are able to correct its future output?

Miller: So that's the difference between a fine-tuned model and what the general population is using on ChatGPT.

The limitation of GPT-4 is, I think, up to 36,000 tokens, which is essentially 20,000 words. A fine-tuned model doesn't have that limitation. It recalls everything that I tell it.

Live users are limited by what GPT-4 can do, which is up to 36,000 tokens.

DigitalFirst: So if you compare that to the average CPA, they would remember at least some of the details of a client when they return year after year. 

Miller: They have their notepad and write down some personal facts about them. But that's about it. 

DigitalFirst: So ChatCPA would hit that limit of 36,000 tokens and then would essentially start afresh?

Miller: It wouldn't start afresh. I think it just removes the last several thousand tokens (to make room) for the new set.
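One common way to keep a chat inside a fixed token budget is a sliding window that drops the oldest messages first. The sketch below illustrates that general technique and is not a description of how ChatCPA or GPT-4 trims a conversation; `count_tokens` is a crude word-count stand-in for a real tokenizer, and the tiny budget is chosen only to make the behaviour visible.

```python
def count_tokens(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def trim_history(messages, max_tokens):
    """Keep the most recent messages whose combined size fits the budget.

    messages are ordered oldest to newest; the result keeps that order.
    """
    kept = []
    total = 0
    # Walk from newest to oldest, stopping at the first message that won't fit
    for message in reversed(messages):
        cost = count_tokens(message)
        if total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    kept.reverse()
    return kept


history = ["a b c", "d e", "f g h i"]  # oldest first
window = trim_history(history, max_tokens=6)
print(window)
```

With a budget of six tokens, the two newest messages fit and the oldest one falls out of the window, which is why a user sees short-term rather than long-term memory.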

DigitalFirst: So for users, it just has a short term memory not a long term memory. 

Miller: Exactly. My model only has a long term memory for the things that I have taught it.

DigitalFirst: The fact that it only has a short term memory from the user perspective hasn't been an impediment to people using it?

Miller: I don't think so. I haven't heard any complaints. 

DigitalFirst: What has the feedback been like?

Miller: I’m always collecting feedback. I just started out with the WhatsApp platform and it's grown to seven different platforms now. And I'm collecting good ideas that I couldn't have thought of, like screenshot reading. Someone asked if they could copy and paste their question from a textbook rather than type it out. Originally, I couldn't do that, but I can do it now.

DigitalFirst: And accounting firms are using it more than individual businesses? What's the split? Are they using it for the same thing or for different things?

Miller: If you look at the number of end users, I would say at least 50 percent are CPAs at accounting firms, right? But if you look at just my clients, which are businesses that don’t want to hire a CPA but want some CPA answers, they'll use it just for themselves. So I would say the breakdown is that 25 percent are non-accountant individuals who want the help of a CPA, 25 percent are CPAs looking to help their career, and 50 percent are employees at an accounting firm.

DigitalFirst: What have been the biggest improvements you've made over the past six months?

Miller: Definitely the screenshot reader. Probably the SMS platform and iMessage as well. Here in the EU, WhatsApp is by far the most used platform for chatting so it wasn't really on my radar to be SMS capable. But back in the US that's the number one. I'd say the Slackbot is really useful for my firm clients and my enterprise clients. 

DigitalFirst: What are the plans in the next six to 12 months?

Miller: I'm on the waiting list for GPT-4. I think it's still in beta mode or you have to be accepted into it and I haven't yet been.

DigitalFirst: What would upgrading to GPT-4 do for ChatCPA?

Miller: I don't know much about it, to be honest, but I'd say it’s at least 10 times more sophisticated than 3.5. It's just a much, much faster, smarter baseline model. It might not make a difference because this is so specialised and a lot of it is based on what I've taught it. But I'm excited to see if there's a noticeable improvement.

If it's capable of searching the internet in real time, or uploading a Google Doc or Excel file – let's say a set of financials – that would be by far the biggest paradigm shift. I’d be very excited. 

DigitalFirst: What would be possible if you could upload a set of financial statements?

Miller: I don't want to be hyperbolic, but it could reduce the number of hours needed to audit a complex set of financials. I mean, it's already really good at formulating unique Excel formulas. It could find your errors or find patterns or just insights or analysis, looking at one large Excel file at a time rather than one question at a time, meticulously typed in.

It would change my value proposition when I talk to a CPA firm because it's no longer just a tool for your employees to quickly get some guidance on US GAAP or whatever. It would be a tool to – audit is too strong a word – but review a set of financials automatically and probably more accurately than the average set of eyes could. Yeah, I'm waiting for that day.
