Structured Output Benchmark Value Accuracy (SOB Value Acc)

Name: Structured Output Benchmark Value Accuracy
Creator: BenchLM

A structured-output benchmark from Interfaze measuring whether extracted JSON leaf values exactly match verified ground truth.

Benchmark score on SOB Value Acc — June 18, 2026

BenchLM mirrors the published score view for SOB Value Acc. Interfaze Beta leads the public snapshot at 79.5%. BenchLM does not use these results to rank models overall.

1. Interfaze Beta (Interfaze, Closed): 79.5%

Overall: — · Context: 1M

1 model · Instruction Following · Current · Display only · Updated June 18, 2026

About SOB Value Acc

Year: 2026
Tasks: Structured output extraction
Format: Value accuracy
Difficulty: Production structured-output reliability

SOB Value Accuracy goes beyond JSON parse success: it measures whether values in the structured response are correct and grounded in the source context across text, image, and audio-normalized inputs.
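To make the distinction between "parses as JSON" and "has correct values" concrete, here is a minimal sketch of a leaf-level exact-match score. It is not the benchmark's actual scoring harness, which this page does not describe. The function names (`flatten_leaves`, `value_accuracy`), the type-strict comparison, and the choice to count missing leaves as wrong are all assumptions made for illustration.

```python
from typing import Any, Dict, Tuple

Path = Tuple[Any, ...]
_MISSING = object()  # sentinel for a leaf the prediction did not produce


def flatten_leaves(node: Any, path: Path = ()) -> Dict[Path, Any]:
    """Map each leaf's path (dict keys and list indices) to its value.

    Empty objects and arrays contribute no leaves (an assumption).
    """
    leaves: Dict[Path, Any] = {}
    if isinstance(node, dict):
        for key, child in node.items():
            leaves.update(flatten_leaves(child, path + (key,)))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            leaves.update(flatten_leaves(child, path + (index,)))
    else:
        leaves[path] = node
    return leaves


def _exact(predicted: Any, expected: Any) -> bool:
    # Type-strict equality, so "2" != 2 and True != 1 (an assumption).
    return type(predicted) is type(expected) and predicted == expected


def value_accuracy(predicted: Any, truth: Any) -> float:
    """Fraction of ground-truth leaves that the prediction matches exactly."""
    truth_leaves = flatten_leaves(truth)
    if not truth_leaves:
        raise ValueError("ground truth has no leaf values")
    predicted_leaves = flatten_leaves(predicted)
    correct = sum(
        1
        for path, expected in truth_leaves.items()
        if _exact(predicted_leaves.get(path, _MISSING), expected)
    )
    return correct / len(truth_leaves)


# Hypothetical example data: the prediction is valid JSON with the right
# shape, but one of its four leaf values has the wrong type.
truth = {"invoice": {"id": "A-17", "total": 42.5}, "items": [{"sku": "X1", "qty": 2}]}
predicted = {"invoice": {"id": "A-17", "total": 42.5}, "items": [{"sku": "X1", "qty": "2"}]}
print(value_accuracy(predicted, truth))  # 0.75
```

In this sketch a response can parse successfully and match the expected structure while still scoring below 100%, which is the gap a value-accuracy metric is meant to expose.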

Structured Output Benchmark Leaderboard

BenchLM freshness & provenance

Version: SOB Value Acc 2026
Refresh cadence: Quarterly
Staleness state: Current
Question availability: Public benchmark set

Current · Display only

BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.

Benchmark score table (1 model)

Interfaze Beta (Interfaze, Closed): 79.5%

FAQ

What does SOB Value Acc measure?

A structured-output benchmark from Interfaze measuring whether extracted JSON leaf values exactly match verified ground truth.

Which model scores highest on SOB Value Acc?

Interfaze Beta by Interfaze currently leads with a score of 79.5% on SOB Value Acc.

How many models are evaluated on SOB Value Acc?

1 AI model has been evaluated on SOB Value Acc on BenchLM.

Last updated: June 18, 2026 · BenchLM version SOB Value Acc 2026
