| Reference (Gold) | 10 | 100.00% | 21.00 | NA | Analysis | Github |
| OpenHands (subset of all) | 2 | 41.24% | 116.76 | 11/25/2024 | Analysis | Github |
| Claude Sonnet 3.5 - Fill-in + Unit Test Feedback | 0 | 30.59% | 552.79 | 09/25/2024 | Analysis | Github |
| Claude Sonnet 3.5 - Fill-in | 0 | 18.63% | 22.47 | 09/25/2024 | Analysis | Github |
| Claude Sonnet 3.5 - Base | 0 | 18.38% | 16.83 | 09/25/2024 | Analysis | Github |
| Claude Sonnet 3.5 - Fill-in (subset of all) | 0 | 15.79% | 64.49 | 09/25/2024 | Analysis | Github |
| SWE-Agent (subset of all) | 0 | 9.70% | 17.96 | 11/26/2024 | Analysis | Github |