Side-by-side benchmark comparison across agentic, coding, multimodal, knowledge, reasoning, and math workflows.
Sibling matchup inside the Holo3 family.
Holo3-122B-A10B: ~79 overall (winner, 1/8 categories)
Holo3-35B-A3B: ~78 overall (0/8 categories)
Holo3-122B-A10B makes more sense if agentic performance is the priority, while Holo3-35B-A3B is the cleaner fit if you want the cheaper token bill.
Holo3-122B-A10B and Holo3-35B-A3B sit in the same Holo3 family. This page is less about two unrelated model lineages and more about how the siblings trade off on benchmark shape, token costs, and practical limits like context window.
Holo3-122B-A10B finishes one point ahead overall, 79 to 78. That is enough to call, but not enough to treat as a blowout. This matchup comes down to a few meaningful edges rather than one model dominating the board.
Holo3-122B-A10B's sharpest advantage is in agentic, where it averages 78.9 against 77.8. The only benchmark with scores on this page so far is OSWorld-Verified, at 78.8% to 77.8%.
Holo3-122B-A10B is also the more expensive model on tokens at $0.40 input / $3.00 output per 1M tokens, versus $0.25 input / $1.80 output per 1M tokens for Holo3-35B-A3B.
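To make the price gap concrete, here is a minimal sketch that turns the listed per-1M-token prices into a cost for a given workload. The prices come from the paragraph above; the function name `workload_cost` and the 50M input / 10M output monthly workload are assumptions chosen purely for illustration, not figures from BenchLM.

```python
# Prices in USD per 1M tokens, as listed in the comparison above.
PRICES = {
    "Holo3-122B-A10B": {"input": 0.40, "output": 3.00},
    "Holo3-35B-A3B": {"input": 0.25, "output": 1.80},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the listed per-1M-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical monthly workload (an assumption, not sourced data):
    # 50M input tokens and 10M output tokens.
    for model in PRICES:
        cost = workload_cost(model, 50_000_000, 10_000_000)
        print(f"{model}: ${cost:.2f}")
```

Under that assumed workload the larger model comes to roughly $50 and the smaller one to roughly $30.50. Because output tokens are priced much higher than input tokens for both models, the gap widens for output-heavy workloads.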
BenchLM keeps the benchmark table and the operator tradeoffs on the same page so a better score does not hide a materially slower, pricier, or smaller-context model.
Runtime metrics show N/A when BenchLM does not have a sourced snapshot for that exact model. The scoring rules and freshness policy are documented on the methodology page.
| Benchmark | Holo3-122B-A10B | Holo3-35B-A3B |
|---|---|---|
| Agentic (Holo3-122B-A10B wins) | | |
| OSWorld-Verified | 78.8% | 77.8% |
| Coding | | |
| Coming soon | | |
| Multimodal & Grounded | | |
| Coming soon | | |
| Reasoning | | |
| Coming soon | | |
| Knowledge | | |
| Coming soon | | |
| Instruction Following | | |
| Coming soon | | |
| Multilingual | | |
| Coming soon | | |
| Mathematics | | |
| Coming soon | | |
Holo3-122B-A10B and Holo3-35B-A3B are sibling variants in the Holo3 family, so the right pick depends on whether you value the better benchmark line, cheaper tokens, or the larger context window. Holo3-122B-A10B is ahead overall 79 to 78.
Holo3-122B-A10B has the edge for agentic tasks in this comparison, averaging 78.9 versus 77.8. Inside this category, OSWorld-Verified is the benchmark that creates the most daylight between them.