According to OpenAI, o3 outperforms o1 by nearly 23 percentage points on the SWE-Bench Verified coding test, scores more than 60 points higher on the Codeforces benchmark, and missed ...