SWE-bench Verified is a benchmark released by OpenAI in August 2024 and has been widely used as a representative indicator for ...