An open benchmark for AI coding agents on real-world React Native implementation tasks, emphasizing working app behavior, recommended architecture choices, and strict constraint adherence.
As of March 2026, Composer 2 leads the React Native Evals leaderboard with 97.2%, followed by Claude Opus 4.6 (84.7%) and GPT-5.4 (82.8%).
Composer 2 (Cursor)
Claude Opus 4.6 (Anthropic)
GPT-5.4 (OpenAI)
According to BenchLM.ai, Composer 2 leads the React Native Evals benchmark with a score of 97.2%, followed by Claude Opus 4.6 (84.7%) and GPT-5.4 (82.8%). There is significant spread across the leaderboard, making this benchmark effective at differentiating model capabilities.
13 models have been evaluated on React Native Evals. The benchmark falls in the Coding category, which carries a 20% weight in BenchLM.ai's overall scoring system. React Native Evals itself, however, is currently displayed for reference only and excluded from the scoring formula, so it does not directly affect overall rankings.
Year: 2026
Tasks: React Native app implementation tasks
Format: Framework-specific app development evaluation
Difficulty: Production mobile app engineering
React Native Evals focuses on framework-specific mobile work that generic coding benchmarks often miss. The public dashboard groups tasks into areas like navigation, animation, and async state, with repeated runs and cost tracking across models.
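The task areas listed on the dashboard correspond to standard React Native engineering patterns. As one illustration of the kind of pattern an async-state task might exercise, here is a minimal TypeScript sketch of a loading/success/error state machine, the shape commonly used behind network-backed screens. The type and function names are hypothetical and not taken from the benchmark itself:

```typescript
// Illustrative only: a discriminated-union state machine for async data,
// the pattern React Native screens typically use with useReducer.
// All names (AsyncState, Action, reduce) are hypothetical.

type AsyncState<T> =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; data: T }
  | { status: "error"; message: string };

type Action<T> =
  | { type: "fetch" }
  | { type: "resolve"; data: T }
  | { type: "reject"; message: string };

function reduce<T>(state: AsyncState<T>, action: Action<T>): AsyncState<T> {
  switch (action.type) {
    case "fetch":
      return { status: "loading" };
    case "resolve":
      return { status: "success", data: action.data };
    case "reject":
      return { status: "error", message: action.message };
  }
}

// Walk a screen through a typical fetch lifecycle.
let state: AsyncState<string[]> = { status: "idle" };
state = reduce(state, { type: "fetch" });                     // now loading
state = reduce(state, { type: "resolve", data: ["a", "b"] }); // now success
```

The discriminated union makes illegal states unrepresentable (e.g. `data` is only accessible when `status` is `"success"`), which is the sort of architecture choice a framework-specific benchmark can reward where a generic coding benchmark would not.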