AI and management consulting

Nov 5

Researchers at NVIDIA just released ProfBench, a benchmark that tests AI models on real professional tasks across four domains: Physics PhD, Chemistry PhD, Finance MBA, and Consulting MBA. These aren’t simple Q&A tests. The tasks call for complex, multi-page reports requiring genuine domain expertise. Exactly the kind of work consultants do. The results? Even GPT-5 only achieves 65.9% performance. Finance shows the biggest gaps, with a 15% performance difference between top closed and open-source models.

Meanwhile.

A recent Reuters article describes consulting firms facing their “Kodak moment.” The industry is in crisis, with shares down 30% while the S&P jumped 50%. One UK finance chief explained the economics: a project that costs a client $1M to do themselves, or $200K with Accenture, can now supposedly be done with AI for just $10K.

So we appear to have empirical evidence that AI still struggles significantly with professional-grade work, yet the market is turning on the consulting industry anyway, based on expectations of AI capability. Perhaps worryingly for the big consultancies, some clients appear to perceive AI as “good enough”. The ProfBench scores show AI’s deficiencies, but the market doesn’t care about 65.9% versus 90%. It cares about $10K versus $200K.

It will be interesting to see how the consulting world responds.

ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge
AI sets up Kodak moment for global consultants

Andrew Tope
