Role overview
You'll annotate frontier-model trajectories on SWE-bench–style tasks derived from real open-source repositories. Currently, closed-source models do not expose their internal reasoning traces, making it difficult to understand how LLMs approach problem-solving.
To address this gap, you'll reconstruct and annotate the reasoning portions of model trajectories, using your own problem-solving process and the full task context to infer and fill in the underlying thought process at each step.
What you'll work on
- Review model-generated code trajectories on realistic software engineering tasks
- Reconstruct chain-of-thought reasoning that explains each step of the solution process
- Annotate decision points, debugging logic, and problem-solving strategies
- Use the full task context (codebase, issue descriptions, test cases) to infer plausible reasoning
- Ensure annotations reflect realistic developer thought processes and technical accuracy
What we're looking for
- 2+ years of experience in software engineering, with hands-on debugging and problem-solving in real codebases
- Degree in Software Engineering, Computer Science, or a related field (Bachelor's minimum; advanced degree preferred)
- Strong proficiency in Python, JavaScript, TypeScript, or other common languages found in open-source projects
- Familiarity with version control workflows (Git, PRs, issue tracking)
- Ability to articulate technical reasoning in clear, structured writing