Which framework for AI-first development in 2026?

18 May 2026

I built the same application nine times in fully autonomous agent mode so that you can have one empirical data point on token efficiency, quality, and the effectiveness of MCPs and skills.

What I found was that:

- Clojure and OCaml burned roughly 9x to 13x the tokens of Go, and roughly 3x to 7x the tokens of Next.js and Rails
- Framework-specific skills and MCPs are not ubiquitous, but they probably should be
- Being popular in the training set means a much cheaper overall implementation
- You can get the job done with ANY language, but your overall quality and cost may vary

There is no established best practice for AI-assisted development. You have to experiment constantly to understand how new tooling can be adopted. My small contribution here is to burn a billion tokens building the same project over and over, so that we can better understand which tools work best with AI and what you should consider for your next greenfield project.

The hypothesis

My hypothesis was that functional languages would use fewer tokens, and that a strong type system would also push the agent towards leaner overall solutions.

I can confidently say this did not hold in my runs. Functional languages appear to be the worst in token efficiency, although the resulting solutions are very compact in lines of code. 2026 may not be the year for my beloved esoteric functional languages after all!

What was built

For this test project I built a feature-voting SaaS where product teams collect and prioritize user feedback through a public board.

The app covers the full product loop end-to-end:

- Visitors can:
  - Submit requests
  - Vote
  - Get email notifications on status changes
- Operators can:
  - Authenticate to an admin panel
  - Moderate posts
  - Bulk status update
  - View/filter requests
  - Publish a changelog
- Vote integrity managed with rate limiting and spike detection
- SSO via signed JWT or email auth flow
- Public board with search, sort, status filtering and voting
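The post does not show how any of the generated apps implemented vote integrity, so the following is only a minimal sketch of what "rate limiting and spike detection" can look like. The class name `VoteGuard`, its parameters, and the thresholds are all hypothetical and chosen for illustration; timestamps are passed in explicitly so the logic is easy to test.

```python
from collections import defaultdict, deque


class VoteGuard:
    """Sliding-window rate limit per voter plus a simple per-post spike check.

    Hypothetical illustration only: names and thresholds are assumptions,
    not taken from any of the nine implementations.
    """

    def __init__(self, max_votes=5, window_s=60.0, spike_factor=5.0, min_spike_votes=20):
        self.max_votes = max_votes              # votes allowed per voter per window
        self.window_s = window_s                # window length in seconds
        self.spike_factor = spike_factor        # growth multiple that counts as a spike
        self.min_spike_votes = min_spike_votes  # ignore spikes on very small counts
        self._by_voter = defaultdict(deque)     # voter_id -> recent vote timestamps
        self._by_post = defaultdict(deque)      # post_id -> accepted vote timestamps

    def allow(self, voter_id, post_id, now):
        """Record and accept the vote unless the voter exceeded the rate limit."""
        recent = self._by_voter[voter_id]
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        if len(recent) >= self.max_votes:
            return False
        recent.append(now)
        self._by_post[post_id].append(now)
        return True

    def is_spiking(self, post_id, now):
        """True when the latest window has far more votes than the one before it."""
        stamps = self._by_post[post_id]
        # Only the current and the previous window are needed for the comparison.
        while stamps and now - stamps[0] >= 2 * self.window_s:
            stamps.popleft()
        current = sum(1 for t in stamps if now - t < self.window_s)
        previous = len(stamps) - current
        return (
            current >= self.min_spike_votes
            and current > self.spike_factor * max(previous, 1)
        )
```

A post flagged by `is_spiking` would typically be queued for operator review rather than having its votes silently dropped, though that policy is also an assumption here.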

How it was built

I started with a detailed PRD that broke the work into 26 user stories across 9 epics, phased into three releases. I used the beads issue tracker to manage the plan in Claude Code. Each run needed between 2 and 5 nudges to keep the agent working until the project was finished.

What was the result?

Total token burn

Stack	Total Tokens	Tests?
Go	~14M
Next.js	~26M
Rails	~36M
C	~52M	yes
Java Spring	~88M
Laravel	~91M	yes
OCaml	~122M	yes
Clojure v1	~133M	yes
Clojure v2	~181M	yes

- Go, Next.js and Rails stand out as the most token efficient
- Writing tests forces your agent to take more turns, which naturally means more tokens burned
- Functional (and esoteric) languages really do burn a lot of tokens
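To make the relative costs concrete, here is a small sketch that turns the approximate totals from the table into multiples of a chosen baseline stack. The numbers are the rounded figures above; the function name `ratios_to` is just an illustrative choice.

```python
# Approximate totals in millions of tokens, copied from the table above.
TOKENS_M = {
    "Go": 14,
    "Next.js": 26,
    "Rails": 36,
    "C": 52,
    "Java Spring": 88,
    "Laravel": 91,
    "OCaml": 122,
    "Clojure v1": 133,
    "Clojure v2": 181,
}


def ratios_to(baseline, totals=TOKENS_M):
    """Return each stack's token total as a multiple of the baseline stack."""
    base = totals[baseline]
    return {stack: round(total / base, 1) for stack, total in totals.items()}


if __name__ == "__main__":
    for stack, ratio in ratios_to("Go").items():
        print(f"{stack:<12} {ratio:>5.1f}x")
```

Against Go, OCaml comes out at about 8.7x and Clojure v2 at about 12.9x; against Rails the same two are about 3.4x and 5.0x. The gap depends heavily on which of the cheap stacks you compare with.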

Total LOC count

Stack	Primary Lang	Primary LOC	Total Code LOC
OCaml	OCaml	3,918 (8 files, 1,000 in test)	3,928
Rails	Ruby	1,732 (88 files)	4,950
C	C	2,315 (1 file)	5,150
Clojure v2 (with REPL)	Clojure	3,677 (29 files, 1,241 in test)	5,346
Clojure v1 (no REPL)	Clojure	2,847 (15 files, 599 in test)	5,799
Go	Go	3,095 (25 files)	7,013
Next.js	TypeScript	4,743 (72 files)	10,230
Laravel	PHP + TypeScript	6,011 PHP + 7,530 TS (2,923 in tests)	17,423

Clojure, Rails and OCaml are incredibly terse, especially when you consider that a significant share of the OCaml and Clojure code is tests.
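One way to combine the two tables is tokens burned per line of code that ended up in the repository. The sketch below uses the approximate token totals and the "Total Code LOC" column; Java Spring is left out because the LOC table has no row for it. Treat the metric as a rough indicator only, since the token totals are rounded and the runs differ in how much testing they did.

```python
# (total tokens in millions, total code LOC), both taken from the tables above.
# Java Spring is omitted because the LOC table has no row for it.
RUNS = {
    "OCaml": (122, 3_928),
    "Rails": (36, 4_950),
    "C": (52, 5_150),
    "Clojure v2": (181, 5_346),
    "Clojure v1": (133, 5_799),
    "Go": (14, 7_013),
    "Next.js": (26, 10_230),
    "Laravel": (91, 17_423),
}


def tokens_per_loc(tokens_m, loc):
    """Tokens spent for each line of code in the final codebase."""
    return tokens_m * 1_000_000 / loc


def ranked(runs=RUNS):
    """Return (stack, tokens per LOC) pairs, cheapest first."""
    rows = [(stack, tokens_per_loc(t, loc)) for stack, (t, loc) in runs.items()]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for stack, cost in ranked():
        print(f"{stack:<12} {cost:>8,.0f} tokens per line")
```

On these figures Go is the cheapest at roughly 2,000 tokens per line, while OCaml and Clojure v2 sit above 30,000: the terse codebases were also the most expensive to produce per line.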

How did framework skills impact the outcome?

Next.js and Laravel have the most coherent recommended practice for AI-based development. They ship skill files, MCPs and recommended claude.md structures. Within the Next.js ecosystem, there is an ORM-specific skill (Prisma) and a design system-specific skill (shadcn). Laravel included an MCP that can unify backend and frontend logs, and a significant suite of skill files including 'best practice' and 'documentation links'.

Next.js was a surprisingly token-efficient implementation despite a high line count, though the shadcn components do account for a lot of those lines. The Laravel implementation wrote a full test suite and took many turns to iterate and validate through testing. As a result the token cost was higher, but this feels worthwhile in terms of the maintainability of the end solution.

There is surprisingly little out there in the way of official framework-specific skill files, MCPs or other AI harness tooling. There is real opportunity here for framework developers, as many people will be approaching new projects with an LLM-first approach.

The most interesting tooling story, though, was Clojure. My experience watching Claude edit Clojure was ... painful. The agent would get in such a twist balancing parentheses when editing code that it would eventually start writing Python scripts to do the editing for it.

After a small amount of research I came across clojure-mcp-light — a recommended MCP that encourages REPL-driven development and provides a small command-line utility that helps the agent balance parentheses during edits. What made it unique was a deterministic hook that runs after every edit.
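To illustrate the idea of a deterministic post-edit check, here is a minimal sketch of a delimiter balance checker for Clojure-like source. This is not clojure-mcp-light's actual implementation, which I have not reproduced here; the function name `check_balance` and its return shape are assumptions made for the example. It skips string contents, `;` comments and character literals such as `\(`.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def check_balance(source):
    """Return None if delimiters balance, else (line, column, message).

    Hypothetical sketch of a post-edit check, not the real tool's logic.
    """
    stack = []  # (opener, line, column) for every delimiter still open
    in_string = in_comment = skip_next = False
    line, col = 1, 0
    for ch in source:
        if ch == "\n":
            line, col = line + 1, 0
            in_comment = skip_next = False  # comments end at the line break
            continue
        col += 1
        if skip_next:  # character following a backslash is never a delimiter
            skip_next = False
            continue
        if in_comment:
            continue
        if ch == "\\":  # string escape or character literal such as \(
            skip_next = True
            continue
        if in_string:
            if ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == ";":
            in_comment = True
        elif ch in OPENERS:
            stack.append((ch, line, col))
        elif ch in PAIRS:
            if not stack or stack[-1][0] != PAIRS[ch]:
                return (line, col, f"unexpected '{ch}'")
            stack.pop()
    if stack:
        opener, open_line, open_col = stack[-1]
        return (open_line, open_col, f"unclosed '{opener}'")
    return None
```

A hook built on something like this would run after each file edit and hand any returned position and message back to the agent, so an unbalanced form is caught immediately instead of surfacing later as a confusing compile error.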

The result surprised me: there was no token efficiency gain (the second run actually used more tokens, ~181M against ~133M), but the agent took a fundamentally different approach to the whole project. It produced 2x the tests and fewer overall lines of code than the first attempt. The right tooling didn't make Clojure cheaper, but it did change how the agent approached the problem.

So what framework should I use?

Although AI is capable of making a solution in any language, you should consider longer-term maintainability. That comes from simplicity in the overall solution, reduced churn, and consistent use of patterns — code that a human can actually follow. Right now the bottleneck is humans understanding the system, so you really want the agent writing in a structure optimised for human consumption.

It is hard to move away from mature frameworks when you want to keep a system going for 10 years and have multiple people maintaining and extending it. Lines of code should not be the only consideration, and neither should token efficiency.

Right now the tooling is so new that there is no clear winner in out-of-the-box skills and MCPs. The addition of framework-specific AI tooling is a nice-to-have, but not yet a game changer.

I would encourage you to consider the following before committing to a framework in 2026:

- What can I hire for reliably over the next 10 years?
- What will I enjoy working on?
- Choose your top 3 tools and vibe-code a prototype in each, then see which solution fits the way your brain works.