Analysis of Reasoning Trajectories - Comparing Closed Weight Models vs Open Weight Models - Claude Sonnet 4 vs Kimi K2 Instruct
Abstract: This study presents a comprehensive analysis of SWE-agent trajectories comparing Kimi K2 Instruct and Claude Sonnet 4 performance on software engineering tasks from the SWE-bench dataset. Through detailed examination of action category distr...
Aug 5, 2025 · 6 min read
