4/10 Products & Tools 27 May 2026, 15:00 UTC

OpenAI, Thrive, and Crete build a self-improving tax agent using Codex to automate filings and improve accuracy.

Applying LLMs to rule-bound domains like tax law often fails because a hallucinated figure or rule is unacceptable in a filing, which makes a self-improving Codex agent notable. If the feedback loop can enforce regulatory compliance through generated code, it would be a significant step for agentic workflows in finance.

OpenAI, in collaboration with Thrive and Crete, has detailed the development of a self-improving tax agent powered by Codex. The system is designed to automate tax filings, iteratively improve its own accuracy, and significantly accelerate financial workflows.

Technical Analysis

The most compelling architectural choice here is the use of Codex, a system oriented toward code generation, rather than a standard conversational LLM. Tax law is written in natural language, but much of it reduces to deterministic rules, conditional logic, and arithmetic. By leveraging Codex, the engineering team is likely translating tax regulations into executable code (e.g., Python scripts) that performs the calculations. This "code-as-reasoning" approach moves the arithmetic out of the model's free-text output and into code that can be executed and tested, which should reduce the hallucination risk typically associated with generative AI in high-stakes financial environments.
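To make the "code-as-reasoning" idea concrete, here is a minimal sketch of a progressive tax rule expressed as executable Python. The brackets, rates, and the `tax_owed` function are hypothetical and chosen only for illustration; they are not real tax figures, and nothing here is taken from the OpenAI, Thrive, or Crete system.

```python
from decimal import Decimal

# Hypothetical brackets for illustration only; not real tax rates.
# Each entry is (upper bound of the bracket, marginal rate); None means no upper bound.
BRACKETS = [
    (Decimal("10000"), Decimal("0.10")),
    (Decimal("40000"), Decimal("0.20")),
    (None, Decimal("0.30")),
]


def tax_owed(taxable_income: Decimal) -> Decimal:
    """Apply marginal rates bracket by bracket and return the total tax."""
    if taxable_income < 0:
        raise ValueError("taxable income cannot be negative")
    total = Decimal("0")
    lower = Decimal("0")
    for upper, rate in BRACKETS:
        if upper is None or taxable_income < upper:
            # Income ends inside this bracket: tax the remaining slice and stop.
            total += (taxable_income - lower) * rate
            break
        # Income spans the whole bracket: tax the full slice and move up.
        total += (upper - lower) * rate
        lower = upper
    return total.quantize(Decimal("0.01"))


print(tax_owed(Decimal("50000")))  # 1000 + 6000 + 3000 under the made-up brackets
```

Once a rule is in this form, the same input always yields the same output, and the result can be checked with ordinary unit tests instead of being trusted from generated prose.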

The "self-improving" claim suggests an iterative feedback loop, possibly running in a REPL (Read-Eval-Print Loop) or a similar execution environment. The agent can generate tax logic, run it against test cases or historical filing data, check the output for errors, and rewrite the code until the output matches the expected results.
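The generate, run, evaluate, and rewrite loop described above can be sketched as follows. This is an assumption-laden illustration, not the published design: the `CANDIDATES` list stands in for successive model outputs, and `TEST_CASES`, `evaluate`, and `refine` are hypothetical names. No real model API is called.

```python
# Hypothetical test cases: (income, expected tax) under a made-up rule of
# 10% on income above a 1,000 exemption. Illustrative numbers, not real filings.
TEST_CASES = [(0, 0), (1000, 0), (5000, 400)]

# Stand-in for a code-generating model: candidate sources in the order produced.
# A real agent would condition each new attempt on the previous failure report.
CANDIDATES = [
    "def compute_tax(income):\n    return income * 0.10\n",
    "def compute_tax(income):\n    return max(income - 1000, 0) // 10\n",
]


def evaluate(source: str) -> list[str]:
    """Run a candidate against every test case and return failure messages."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # illustration only; sandbox untrusted code in practice
        fn = namespace["compute_tax"]
    except Exception as exc:
        return [f"candidate did not load: {exc!r}"]
    failures = []
    for income, expected in TEST_CASES:
        try:
            actual = fn(income)
        except Exception as exc:
            failures.append(f"income={income}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"income={income}: expected {expected}, got {actual}")
    return failures


def refine(candidates: list[str], max_rounds: int = 5):
    """Try candidates in turn until one passes all tests or the budget runs out."""
    history = []
    for round_no, source in enumerate(candidates[:max_rounds], start=1):
        failures = evaluate(source)
        history.append((round_no, failures))
        if not failures:
            return source, history
    return None, history


accepted, history = refine(CANDIDATES)
for round_no, failures in history:
    print(round_no, "passed" if not failures else failures)
```

The first candidate ignores the exemption and fails two cases; the second passes, so the loop stops there. The round limit matters in practice: without it, an agent that never converges would keep rewriting indefinitely.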

Why It Matters

Applying autonomous agents to heavily regulated domains with little tolerance for error, such as tax preparation, is notoriously difficult. If Thrive and Crete have built a reliable, self-correcting pipeline, it would support a different approach to compliance software: instead of hardcoding thousands of ever-changing tax rules, engineers build scaffolding that lets an agent read the tax code, write the corresponding software logic, and test its own accuracy.

What to Watch Next

Engineers should monitor how this system handles edge cases and sudden regulatory shifts. The real test of a self-improving agent is whether its accuracy decays over time as the underlying tax rules change. If this Codex-driven, logic-generation pattern proves reliable at scale, expect similar architectures in other compliance-heavy sectors such as healthcare billing, insurance underwriting, and legal discovery.
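One way to watch for that kind of decay is to re-run generated logic against expected results for each rule version and flag the versions where it no longer matches. The sketch below assumes a hypothetical layout in which expectations are keyed by tax year; `EXPECTED_BY_VERSION`, `generated_tax`, `pass_rate`, and `stale_versions` are illustrative names with invented numbers, not part of any described system.

```python
# Hypothetical expected results keyed by rule version (for example, a tax year).
# Values are (income, expected tax) pairs invented for illustration.
EXPECTED_BY_VERSION = {
    "2025": [(1000, 100), (2000, 200)],
    "2026": [(1000, 120), (2000, 240)],  # the made-up rate rose from 10% to 12%
}


def generated_tax(income: int) -> int:
    """Stand-in for agent-generated logic written against the 2025 rules."""
    return income * 10 // 100


def pass_rate(fn, cases) -> float:
    """Fraction of cases where the function matches the expected value."""
    if not cases:
        return 0.0
    passed = sum(1 for income, expected in cases if fn(income) == expected)
    return passed / len(cases)


def stale_versions(fn, expected_by_version, threshold: float = 1.0) -> list[str]:
    """Return the rule versions for which the logic falls below the threshold."""
    return [
        version
        for version, cases in sorted(expected_by_version.items())
        if pass_rate(fn, cases) < threshold
    ]


print(stale_versions(generated_tax, EXPECTED_BY_VERSION))
```

Here the logic still matches every 2025 case but none of the 2026 cases, so only the newer version is flagged. A check like this only detects staleness; regenerating the logic for the new rules is a separate step.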

Sources

https://openai.com/index/building-self-improving-tax-agents-with-codex

openai codex ai-agents fintech automation