The AI Accounting Benchmark

How do AI models actually perform on real accounting work?

The AI Accounting Benchmark tests major AI models — Claude, GPT-4o, Gemini, Copilot, and others — against real-world accounting scenarios: tax code interpretation, journal entries, audit procedures, financial statement analysis, and more. Every quarter, I rerun the tests and publish updated results.

To my knowledge, this is the first standardized benchmark built specifically to test AI performance on professional accounting work. The methodology and all results are published openly.

View the Latest Results →