“Go Clean Your Room” and Other Ways to Get a Straight Answer from AI
You’ve seen the headlines. Accounting jobs will be automated!
According to McKinsey, sixty percent of finance and accounting tasks could be automated with existing technology. So why doesn’t it feel that way?
Because “could be automated” and “can actually rely 100% on the output” are two very different things. And right now, there’s a big gap between them.
We’re stuck between two bad options: full manual mode, which feels increasingly slow, or autonomous AI that just does stuff and won’t tell you why.
Neither one is actually usable in accounting, where we’re expected to work faster than ever, AND every number has to be defensible.
Sasha Orloff, co-founder and CEO of Puzzle, joined me on a recent livestream. He laid out a roadmap for an approach to AI in accounting that actually works.
He calls the destination “governed automation.” But the analogy that got me was a parenting story.
“Go Clean Your Room”
Every parent has said it. And every parent knows the problem: you tell your child to go clean their room. Your child disappears for 20 minutes, and now they’re back, telling you it’s done.
But what’s the probability that it’s actually done?
It’s probabilistic. Maybe the floor is clear, but clothes are stuffed under the bed.
“Clean” can mean something entirely different to a child than it does to you. They answered the instruction. They just answered it according to their own interpretation.
That’s how most AI accounting products work right now.
You give it a task, it does its best guess, and says it’s done. You have no idea what happened in between.
Governed Automation
The alternative version, the deterministic one, looks like this: before your child goes anywhere, they ask a few clarifying questions.
Do you mean just the floor? Or the whole room?
Do I have to fold the clothes or just put them away?
Once you agree on the definition of “clean,” they execute each step. Then they come back and show you: floor cleared, clothes folded, bed made. Done.
You can walk in and verify each condition. Yes or no, not “probably.”
That’s governed automation. And it’s the right design for accounting.
The Difference Between “Done” and “Actually Done”
There’s no “probably correct” in a general ledger.
Either the depreciation schedule is right, or it isn’t. Either the software categorized the transaction correctly, or it didn’t.
You can’t have a system running autonomous best-guess operations against books the client is going to stake business decisions on (and that you’re staking your reputation on).
As Sasha pointed out, you have to build a deterministic system on top of probabilistic technology. That’s the piece most accounting software companies miss.
Puzzle uses a large language model (LLM) to interpret what you want, then converts that intent into a series of concrete, auditable steps.
The AI drafts a plan, shows you the plan, asks if that’s what you meant, and waits for you to say yes before it touches anything.
Nothing happens without explicit permission.
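To make the plan-then-approve pattern concrete, here is a minimal sketch in Python. It is not Puzzle’s implementation. The names Step, Plan, render, and execute_if_approved are hypothetical, and the step list is written by hand where a real system would have the LLM draft it.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    description: str             # human-readable, shown before anything runs
    action: Callable[[], None]   # the concrete change this step makes


@dataclass
class Plan:
    request: str                 # what the user asked for, in their own words
    steps: List[Step]
    executed: List[str] = field(default_factory=list)


def render(plan: Plan) -> str:
    """Format the plan so a reviewer can read every step before approving."""
    lines = [f"Request: {plan.request}"]
    lines += [f"  {i}. {s.description}" for i, s in enumerate(plan.steps, start=1)]
    return "\n".join(lines)


def execute_if_approved(plan: Plan, approve: Callable[[str], bool]) -> bool:
    """Run the steps only after the reviewer says yes to the rendered plan."""
    if not approve(render(plan)):
        return False             # nothing is touched without explicit approval
    for step in plan.steps:
        step.action()
        plan.executed.append(step.description)
    return True
```

The approve callback is where the human sits. In this sketch it receives the full rendered plan as text, and a "no" returns before any step has run.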
You Still Have to Define “Clean”
Back to the room analogy for a second, because Sasha’s extension of it is great.
He pointed out that when the system doesn’t work, it usually fails for one of two reasons. Either:
The AI hasn’t been given the underlying functions to execute the steps (i.e., it doesn’t know what a “drawer” is), or
The user gave vague instructions.
“Go clean your room,” without specifying what clean means, is a recipe for disappointment. Same with “reconcile these accounts” without specifying your firm’s standards, your capitalization threshold, or your categorization rules.
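Here is one way defining “clean” might look once it is written down instead of assumed. This is a hedged sketch: FirmRules, treatment, and categorize are made-up names, the 2,500 threshold and the keyword rules are placeholder values, and treating an amount exactly at the threshold as capitalized is an assumption your own policy may not share.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Optional


@dataclass
class FirmRules:
    capitalization_threshold: Decimal   # illustrative value only, set by the firm
    category_rules: Dict[str, str]      # hypothetical: keyword -> account name


def treatment(amount: Decimal, rules: FirmRules) -> str:
    """Assumed policy: amounts at or above the threshold are capitalized."""
    return "capitalize" if amount >= rules.capitalization_threshold else "expense"


def categorize(description: str, rules: FirmRules) -> Optional[str]:
    """Return the first matching account, or None so a human decides."""
    text = description.lower()
    for keyword, account in rules.category_rules.items():
        if keyword in text:
            return account
    return None


# Placeholder rules for illustration, not a recommendation.
rules = FirmRules(
    capitalization_threshold=Decimal("2500"),
    category_rules={"aws": "Cloud Hosting", "adobe": "Software Subscriptions"},
)
```

Note what happens when no rule matches: categorize returns None rather than guessing. That is the code version of the child coming back to ask a question instead of stuffing clothes under the bed.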
The AI isn’t magic. It’s a very fast, very capable executor of instructions, but only instructions it understands.
By design, the accountant remains responsible for clearly defining those instructions.
You set the rules, review the output, and sign off.
The AI does the processing. Humans do the thinking.
You’re outsourcing labor, not judgment.
Trust Is Still the Product
The accounting profession has always been about trust.
Clients trust you with their numbers. AI makes that trust relationship more complicated, not less.
The answer isn’t to hand everything over to a machine. It’s to build systems that log every decision, trace every change, and ensure nothing happens without your explicit sign-off.
The best implementation of AI takes your prompt, asks questions to convert it into a deterministic checklist, and follows that checklist step by step with your approval at every stage. That’s how you get AI you can actually use as part of your close process. It’s a documented, auditable workflow you designed and signed off on.
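A rough sketch of that checklist idea, with approval at every stage and a log of every decision, could look like the following. Everything here is an assumption for illustration: ChecklistItem, record, and run_checklist are hypothetical names, and the audit log is a plain in-memory list where a real tool would use durable storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Tuple


@dataclass
class ChecklistItem:
    name: str
    run: Callable[[], None]      # makes the change
    verify: Callable[[], bool]   # yes/no check that the change landed


AuditLog = List[Tuple[str, str, str]]   # (UTC timestamp, item name, event)


def record(log: AuditLog, item: str, event: str) -> None:
    """Append one timestamped entry so every decision can be traced later."""
    log.append((datetime.now(timezone.utc).isoformat(), item, event))


def run_checklist(
    items: List[ChecklistItem],
    approve: Callable[[str], bool],
    log: AuditLog,
) -> bool:
    """Work through the checklist in order, stopping at the first 'no'."""
    for item in items:
        if not approve(item.name):
            record(log, item.name, "not approved, stopped")
            return False
        record(log, item.name, "approved")
        item.run()
        if not item.verify():
            record(log, item.name, "verification failed, stopped")
            return False
        record(log, item.name, "verified")
    return True
```

The design choice worth noticing is that the function stops at the first declined approval or failed check instead of pressing on. A half-finished checklist with a clear log is easier to defend than a finished one nobody can explain.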
That’s what I want to see in every AI tool accountants use. And it’s what buyers should demand when they evaluate new software.
“Go clean your room” is a terrible instruction.
“Clear the floor, fold the clothes, shirts in the shirt drawer, pants in the pant drawer, make the bed, and show me when it’s done” is a workflow.
Build the workflow.
If you want to see what this looks like in practice, watch the full livestream here.
And if you’re ready to try it yourself, Puzzle is opening the waitlist for AI Close now.