Short answer first: for high-volume agent loops and coding on a budget, Sakana Fugu AI wins on price and gets you Fable-5-class output for about a quarter of Fusion's cost.
For the very hardest multi-step problems, Fugu Ultra edges ahead, and on one specific benchmark, SW Bench Pro, Fable 5 still rules.
Fusion is the closest cousin to Fugu in how it works, but it is pricier.
That is the whole fight in three sentences, and the rest of this article shows you exactly why I landed there after testing all three.
Here is the thing most people miss.
This is not a “which model is smartest” question.
It is a “which multi-agent setup gets me the best answer for the least money” question, and the answer changes depending on what you are building.
What Sakana Fugu actually is (and why it is not just another model)
Sakana is a Japanese AI lab, and Sakana Fugu is their full multi-agent orchestration system delivered through a single model API.
You do not sign up to ten different models.
You hit one API, and behind the scenes a panel of models — closed and open source — competes head-on on your prompt.
A judge then synthesises one answer from the panel.
If that sounds familiar, it should.
That is exactly how Fusion works.
The clever part is auto model selection and delegation: Fugu picks the right models and routes the work for you, so you get the wisdom of a crowd without the admin of managing a crowd.
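To make the panel-and-judge idea concrete, here is a minimal Python sketch. It is not Sakana's or Fusion's actual implementation: the three "models" are hypothetical stand-in functions, and the judge is a deliberately crude placeholder that picks the longest draft, where a real judge would be another model synthesising the drafts into one answer.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for panel members. In a real system each of these
# would be a call to a different closed or open-source model.
def terse_model(prompt: str) -> str:
    return f"Short answer to: {prompt}"

def detailed_model(prompt: str) -> str:
    return f"Detailed, step-by-step answer to: {prompt}"

def cautious_model(prompt: str) -> str:
    return f"Answer with caveats to: {prompt}"

PANEL: Dict[str, Callable[[str], str]] = {
    "terse": terse_model,
    "detailed": detailed_model,
    "cautious": cautious_model,
}

def judge(drafts: Dict[str, str]) -> str:
    """Toy judge (assumption): return the longest draft.

    A real judge would read every draft and synthesise a single answer.
    """
    return max(drafts.values(), key=len)

def run_panel(prompt: str) -> str:
    """One prompt in, every panel member drafts, one answer out."""
    drafts = {name: model(prompt) for name, model in PANEL.items()}
    return judge(drafts)

answer = run_panel("What is 2+2?")
```

From the caller's side the whole panel collapses into one function call, which is the point of delivering it through a single API.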
There are two tiers, and the difference matters for this comparison:
Fugu — low latency, fast, built for coding (think Codex-style work) and customer-facing tasks where speed counts.
Fugu Ultra — the flagship, tuned for maximum answer quality on hard multi-step problems like AI research. It costs more, and it earns it on the toughest jobs.
One thing to set expectations on.
Both Fugu and Fusion are one-shot.
You send the prompt, the panel deliberates, and you wait for one answer.
It is not the back-and-forth chat loop you get from a Claude CLI session.
That is a feature for agent loops and a small adjustment if you are used to conversational coding.
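Here is a rough sketch of the difference between one-shot and conversational use. Everything in it is hypothetical: `echo_backend` stands in for whatever model or panel you call, and `ChatSession` is just an illustration of a client that resends its history each turn.

```python
from typing import Callable, List

Backend = Callable[[str], str]

def echo_backend(prompt: str) -> str:
    """Hypothetical backend: reports how much text it was sent."""
    return f"[{len(prompt)} chars received]"

def one_shot(ask: Backend, prompt: str) -> str:
    """One-shot: the whole task goes in one prompt and no state is kept."""
    return ask(prompt)

class ChatSession:
    """Conversational: every turn is sent together with the history so far."""

    def __init__(self, ask: Backend) -> None:
        self.ask = ask
        self.history: List[str] = []

    def send(self, message: str) -> str:
        self.history.append(f"user: {message}")
        reply = self.ask("\n".join(self.history))
        self.history.append(f"assistant: {reply}")
        return reply

# An agent loop only needs the stateless form: one call per task.
tasks = ["summarise ticket 1", "summarise ticket 2"]
results = [one_shot(echo_backend, task) for task in tasks]
```

An agent loop is a list of independent jobs, so the stateless call fits it naturally. Interactive coding is where you would miss the history.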
The 3-way comparison: Sakana Fugu vs Fusion vs Fable 5
Here is the side-by-side so you can see the shape of it before we get into the numbers.
| Setup | Approach | Speed | Cost | Best use |
| --- | --- | --- | --- | --- |
| Sakana Fugu | Multi-agent panel via one API, judge synthesises one answer; auto model selection | Fast (Fugu tier is low latency); one-shot | ~25% of Fusion for the same prompts, plus a flat-rate subscription | High-volume agent loops, coding, customer-facing tasks on a budget |
| Fusion | Multi-agent panel, judge synthesises one answer (same idea as Fugu) | One-shot; depends on panel | Pay-per-usage on OpenRouter, pricier | Same approach as Fugu, available now, good for side-by-side testing |
| Fable 5 | Single flagship model | Conversational, back-and-forth | Premium single-model pricing | Hard coding (SW Bench Pro), interactive sessions, top-tier reasoning |
The headline pattern: Fugu and Fusion are the same species (multi-agent panels), and Fable 5 is the single-model champion they are trying to match.
Sakana claims Fugu matches Fable 5 and Mythos, and the benchmarks mostly back that up.
Sakana Fugu vs Fusion: the cost gap is the whole story
Functionally, Fugu and Fusion do the same job.
Panel competes, judge decides, you get one answer.
So why would you pick one over the other?
Money and access.
Fusion runs on OpenRouter as pay-per-usage, and it is the pricier option.
Sakana Fugu comes in at roughly 25% of Fusion’s cost for the same prompts.
That is not a rounding error.
That is a 4x difference, and when you are running agent loops that fire thousands of times a day, a 4x cost gap decides whether the whole project is viable.
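The arithmetic is worth spelling out. In the sketch below the only figure taken from this article is the 25% ratio. The call volume and the per-call Fusion price are made-up placeholders, because neither service's actual rates are quoted here.

```python
# From the article: Fugu costs roughly 25% of Fusion for the same prompts.
FUGU_COST_RATIO = 0.25

def daily_costs(calls_per_day: int, fusion_cost_per_call: float) -> dict:
    """Compare daily spend on Fusion and Fugu for the same call volume."""
    fusion = calls_per_day * fusion_cost_per_call
    fugu = fusion * FUGU_COST_RATIO
    return {
        "fusion": fusion,
        "fugu": fugu,
        "saved": fusion - fugu,
        "multiple": 1 / FUGU_COST_RATIO,  # how many times pricier Fusion is
    }

# Placeholder numbers (assumptions): 5,000 calls a day at 0.25 per call on Fusion.
costs = daily_costs(calls_per_day=5000, fusion_cost_per_call=0.25)
print(costs)
```

Swap in your own call volume and the real per-call price you see on your bill. The 4x multiple stays the same, but the absolute saving is what decides whether the project is viable.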
On top of that, Fugu offers a flat-rate subscription.
For high-volume agent work, flat-rate is the dream — you stop watching the meter and just let the agents run.
That alone makes Fugu the better default for anyone building always-on automation.
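If you want to check whether flat-rate actually beats pay-per-usage for your volume, a break-even calculation is enough. This is a sketch only: the monthly fee and per-call price below are hypothetical placeholders, since no actual prices are quoted in this article.

```python
import math

def break_even_calls(flat_monthly_fee: float, cost_per_call: float) -> int:
    """Smallest monthly call count at which the flat fee is no dearer than paying per call."""
    return math.ceil(flat_monthly_fee / cost_per_call)

def cheaper_plan(calls_per_month: int, flat_monthly_fee: float, cost_per_call: float) -> str:
    """Name the cheaper option for a given monthly volume (ties go to flat-rate)."""
    pay_per_use_total = calls_per_month * cost_per_call
    return "flat-rate" if flat_monthly_fee <= pay_per_use_total else "pay-per-use"

# Placeholder prices (assumptions): a 100.0 monthly fee versus 0.25 per call.
threshold = break_even_calls(flat_monthly_fee=100.0, cost_per_call=0.25)
```

Below the threshold you are better off paying per call. Above it, every extra agent run on flat-rate costs you nothing more.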
The catch on access: Sakana Fugu is not available in the EU or UK at launch because of GDPR.
That stings for me here in the UK and probably for you too.
So in practice, the “Sakana Fugu vs Fusion” decision today is often made for you by geography: if you are in the UK or EU, Fusion is what you can actually run, and you keep Fugu on the watchlist for when it lands.
The benchmarks: where Fugu Ultra wins and where Fable 5 fights back
Let me give you the real numbers, because this is where the comparison earns its keep.
No invented figures — just what was published.
Terminal Bench — Fable 5 scores 80.4, Fugu 80.2, Fugu Ultra 82.1. Basically a dead heat, with Fugu Ultra nosing in front.
SW Bench Pro — Fable 5 clearly beats both Fugu tiers. This is the one place the single-model champion pulls away.
Live Code Bench — Fugu Ultra lands around 93.2, which is genuinely strong.
Read it as a whole and the picture is clear.
On most of these benchmarks, Fugu is level with or slightly ahead of Fable 5.
Fugu Ultra is the top-quality option for the hardest multi-step problems.
The lone exception is SW Bench Pro, where Fable 5 wins outright — so if your work looks like SW Bench Pro style tasks, Fable 5 keeps its crown.
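To see how tight the Terminal Bench race is, you can put the three published scores into a few lines of Python. Only the Terminal Bench numbers are used, because it is the only benchmark above with a score for all three. The helper names are my own.

```python
# Terminal Bench scores as quoted in this article.
TERMINAL_BENCH = {"Fable 5": 80.4, "Fugu": 80.2, "Fugu Ultra": 82.1}

def leader(scores: dict) -> str:
    """Return the name with the highest score."""
    return max(scores, key=scores.get)

def margins_vs(scores: dict, baseline: str) -> dict:
    """Point difference of every other entry against the baseline, to one decimal place."""
    return {
        name: round(score - scores[baseline], 1)
        for name, score in scores.items()
        if name != baseline
    }

print(leader(TERMINAL_BENCH))                 # Fugu Ultra
print(margins_vs(TERMINAL_BENCH, "Fable 5"))  # {'Fugu': -0.2, 'Fugu Ultra': 1.7}
```

A 0.2-point gap between Fugu and Fable 5 is well within what I would call a tie, and 1.7 points for Fugu Ultra is a lead, not a landslide.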
This is why I keep saying it depends on the job.
There is no single winner across every benchmark.
There is a winner per use case, and that is far more useful to you than a leaderboard trophy.
An honesty beat on benchmarks (read this before you trust any number)
Quick warning, because the AI space is full of marketing dressed up as science.
Beware self-scored benchmarks.
The “Le Chaton Fat” hoax went viral precisely because a flashy self-reported score is easy to fake and easy to share.
So I do not take any lab’s word for it, including Sakana’s.
I test myself.
I run my own prompts through Goldie Bench so I can see how these setups behave on the work I actually do, not the work a marketing team picked to look good.
Hands-on: what I actually built with Fugu
Benchmarks are one thing.
Building is another.
So I put Fugu to work on the kind of stuff that exposes a model fast: a website, a maze game, a spiral galaxy simulation, and an orbit and solar-system simulation.
Then I ran the same prompts side-by-side against GLM 5.2, Opus 4.8 and Fusion.
The output across the board was strong.
What stood out was design taste: Fugu’s builds simply looked nicer and more interesting to me than the others.
The maze and the galaxy sim in particular had a polish that felt deliberate rather than generic.
That is a subjective call, and you should run your own builds before you take my word for it.
But it lines up with the benchmark story: this is a setup that competes at the top tier, not a budget knock-off that happens to be cheap.
Fugu Ultra vs standard Fugu: which tier do you pick?
Simple rule of thumb from all of this.
Pick standard Fugu when you want speed and you are doing coding or customer-facing work. Low latency, quality on par with Fable 5, lowest cost. This is your everyday workhorse.
Pick Fugu Ultra when the problem is genuinely hard and multi-step — deep research, gnarly reasoning chains, the stuff where a wrong answer costs you hours. It is pricier, but it is the top of the range and the benchmarks show it.
For most people running agent loops, standard Fugu is the right default, and you reach for Fugu Ultra on the hard 10% of tasks.
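That rule of thumb is simple enough to write down as a routing function. This is a sketch under my own assumptions: the `Task` fields are invented for illustration, and the tier strings are plain labels, not confirmed API model identifiers.

```python
from dataclasses import dataclass

@dataclass
class Task:
    description: str
    multi_step: bool         # needs a long chain of dependent steps
    high_stakes: bool        # a wrong answer costs hours
    latency_sensitive: bool  # a user is waiting on the reply

def choose_tier(task: Task) -> str:
    """Default to standard Fugu; escalate to Fugu Ultra only for hard, slow-is-fine work."""
    if task.multi_step and task.high_stakes and not task.latency_sensitive:
        return "fugu-ultra"  # hypothetical label for the flagship tier
    return "fugu"            # hypothetical label for the fast tier

support_reply = Task("Answer a billing question", False, False, True)
research_job = Task("Design and critique an experiment plan", True, True, False)

print(choose_tier(support_reply))  # fugu
print(choose_tier(research_job))   # fugu-ultra
```

Keeping the default on the cheaper tier means you only pay the premium on the tasks that explicitly qualify for it.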
That mix gives you Fable-5-class results while keeping most of your volume on the cheapest tier.
How I built Sakana into my Agent OS
Here is the part that makes all of this practical.
I built Sakana into my Agent OS in about an hour.
Fusion is already built in too, which is exactly why I could run the side-by-side tests above without any faff — same prompts, two panels, instant comparison.
That is the whole point of an Agent OS.
You wire in the model APIs once, and then you can swap, compare and route work without rebuilding anything.
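In code, "wire it in once, then swap and compare" can be as small as a registry of named backends. The sketch below is not my actual Agent OS. `AgentRegistry` and the two stub backends are hypothetical, and in a real setup each stub would wrap a proper API client.

```python
from typing import Callable, Dict

Backend = Callable[[str], str]

class AgentRegistry:
    """Register each backend once, then run, compare, or swap by name."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}
        self.default: str = ""

    def register(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend
        if not self.default:
            self.default = name  # first backend registered becomes the default

    def run(self, prompt: str, name: str = "") -> str:
        """Send the prompt to the named backend, or to the default one."""
        return self._backends[name or self.default](prompt)

    def compare(self, prompt: str) -> Dict[str, str]:
        """Send the same prompt to every backend for a side-by-side view."""
        return {name: backend(prompt) for name, backend in self._backends.items()}

# Stub backends (assumptions) standing in for real API clients.
def fusion_stub(prompt: str) -> str:
    return f"fusion answer to: {prompt}"

def fugu_stub(prompt: str) -> str:
    return f"fugu answer to: {prompt}"

registry = AgentRegistry()
registry.register("fusion", fusion_stub)
registry.register("fugu", fugu_stub)

side_by_side = registry.compare("Build a maze game")
registry.default = "fugu"  # swapping the default is a one-line change
```

Because every backend has the same shape, prompt in and answer out, nothing downstream needs to change when you swap the default.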
When Fugu finally lands in the UK, it slots straight in next to Fusion and I flip a switch.
You access Sakana Fugu at sakana.ai, where you will find the two APIs and a technical report if you want to go deep on how the panel and judge work.
So which multi-agent setup wins?
Let me bring it home.
Building high-volume agent loops and want the best value? Sakana Fugu wins — Fable-5-class output at ~25% of Fusion’s cost, plus flat-rate. (UK and EU: hold tight until it lands.)
In the UK or EU today and want the same approach now? Fusion wins by default — same panel-and-judge design, available immediately.
Doing the hardest multi-step reasoning? Fugu Ultra wins on quality.
Working on SW Bench Pro style coding? Fable 5 still wins that specific fight.
No trophy for “best overall”, because that question is lazy.
The right move is to match the setup to the job and to test it yourself rather than trusting a self-scored chart.
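If you prefer the verdict as logic rather than a list, here is one way to encode it. The workload labels and the order of the checks are my own assumptions. In particular, I treat UK and EU availability as overriding the Fugu tiers, since you cannot pick what you cannot run.

```python
# Regions where Fugu is not available at launch, per this article.
FUGU_BLOCKED_REGIONS = {"UK", "EU"}

def pick_setup(region: str, workload: str) -> str:
    """Map a region and a workload label (hypothetical labels) to a setup."""
    if workload == "sw-bench-pro-style":
        return "Fable 5"      # the one benchmark where Fable 5 wins outright
    if region.upper() in FUGU_BLOCKED_REGIONS:
        return "Fusion"       # same panel-and-judge design, available now
    if workload == "hard-multi-step":
        return "Fugu Ultra"   # top quality on the hardest problems
    return "Sakana Fugu"      # best-value default for high-volume loops

print(pick_setup("US", "agent-loop"))          # Sakana Fugu
print(pick_setup("UK", "agent-loop"))          # Fusion
print(pick_setup("US", "hard-multi-step"))     # Fugu Ultra
print(pick_setup("EU", "sw-bench-pro-style"))  # Fable 5
```

Treat the output as a starting point for your own side-by-side test, not as a final answer.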
What is Sakana Fugu AI?
Sakana Fugu AI is a full multi-agent orchestration system from the Japanese lab Sakana, delivered through a single model API.
A panel of models competes head-on and a judge synthesises one answer.
It comes in two tiers: Fugu for low-latency coding and customer-facing work, and Fugu Ultra for maximum quality on hard multi-step problems.
What is the difference between Sakana Fugu and Fusion?
Both run a multi-agent panel and return one synthesised answer in a single shot.
The big difference is cost and access.
Sakana Fugu runs at roughly 25% of Fusion’s cost for the same prompts and offers a flat-rate subscription, while Fusion on OpenRouter is pay-per-usage and pricier.
Fugu is not available in the EU or UK at launch.
Is Sakana Fugu better than Fable 5 for coding?
It is close.
On Terminal Bench, Fable 5 scores 80.4, Fugu 80.2 and Fugu Ultra 82.1.
On Live Code Bench, Fugu Ultra hits around 93.2.
The exception is SW Bench Pro, where Fable 5 clearly beats both Fugu tiers.
For most coding tasks Fugu is even with or slightly ahead of Fable 5.
How much does Sakana Fugu cost?
Sakana Fugu costs roughly 25% of what Fusion charges for the same prompts and also offers a flat-rate subscription, which is ideal for high-volume agent loops.
You access it through two APIs at sakana.ai.
Fugu Ultra is priced higher than standard Fugu.
Can I use Sakana Fugu in the UK?
Not at launch.
Sakana Fugu is not available in the EU or UK due to GDPR.
You can still test the approach today using Fusion, which behaves similarly and is built into my Agent OS for side-by-side comparison.
About Julian
I am Julian Goldie.
I run a 7-figure SEO and link building agency with a team of 70+, I have built a YouTube channel of 400K+ subscribers and 163K followers on X, and I am the author of Link Building Mastery.
I run the AI Profit Boardroom, a community of 3,600+ members building real AI agent systems, and I spend my days testing tools like Sakana Fugu AI so you do not have to take a marketing chart’s word for anything. Instead, you can match the right multi-agent setup to your actual work.