Role overview
We’re looking for an ML engineer to own large-scale training of our Lumen Enterprise models, our family of open-source-based software engineering LLMs.
You’ll work on supervised fine-tuning (SFT), reinforcement learning (RL), and continued pretraining on top of open-source base models to push state-of-the-art performance on real software engineering tasks: reading and modifying large codebases, using tools, and reasoning about complex systems.
If you enjoy working close to the metal with PyTorch and distributed training, and you like making big models actually work in practice, this role is for you.
What you'll work on
- Take open-source base models (code + general LLMs) and turn them into high-performance Lumen Enterprise SWE agents via SFT and RL.
- Design and run large-scale training experiments on multi-node GPU clusters, including long-context training and MoE-style architectures.
- Build and iterate on large-scale RL loops where models write code, run tests or tools, and get rewarded (or penalized) accordingly; a minimal sketch of this kind of execution-based reward follows this list.
- Work hands-on across the stack: custom PyTorch dataloaders, distributed training primitives, RL objectives, and evaluation on real-world repos and tasks.
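To give a flavor of the execution-based reward loops above, here is a minimal sketch assuming a repo checkout with a pytest suite; the invocation, output parsing, and shaping weights are illustrative assumptions, not our actual setup:

```python
import subprocess

def execution_reward(repo_dir: str, timeout_s: int = 120) -> float:
    """Score a model-written patch by running the repo's test suite.

    Illustrative shaping: +1 if everything passes, partial credit for
    the passing fraction, penalties for timeouts and broken collection.
    The pytest invocation and weights are assumptions for this sketch.
    """
    try:
        proc = subprocess.run(
            ["python", "-m", "pytest", "-q", "--tb=no"],
            cwd=repo_dir, capture_output=True, text=True, timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return -1.0  # runaway or hanging code: strong negative signal

    if proc.returncode == 0:
        return 1.0  # all tests passed

    # pytest's -q summary line looks like "3 failed, 7 passed in 1.2s";
    # parse rough pass/fail counts for partial credit.
    tail = proc.stdout.strip().splitlines()[-1] if proc.stdout.strip() else ""
    passed = failed = 0
    for chunk in tail.split(","):
        parts = chunk.split()
        if len(parts) >= 2 and parts[0].isdigit():
            if "passed" in parts[1]:
                passed = int(parts[0])
            elif "failed" in parts[1] or "error" in parts[1]:
                failed += int(parts[0])
    total = passed + failed
    return passed / total if total else -0.5  # suite didn't even collect

```

In a real loop this scalar would feed a policy-gradient update (e.g. PPO or GRPO) over the model's generated patch tokens, with proper sandboxing around the test run.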
You’ll collaborate closely with infra, product, and research to decide what to train next, how to train it, and how to measure whether it’s actually better for engineers.
What we're looking for
You don’t need all of these, but the more you have, the faster you’ll hit the ground running:
- Continued pretraining and long-context experience:
  - Have run continued pretraining on domain-specific or long-context corpora.
  - Familiarity with techniques like RoPE scaling, YaRN-style extrapolation, context parallelism, or similar (a toy RoPE interpolation sketch appears after this list).
- Code-focused RL and evaluation:
  - Experience building RL loops where rewards come from code execution (tests, linters, static analysis, fuzzing, runtime traces).
  - Familiarity with evaluation benchmarks for code models (e.g. HumanEval, MBPP, SWE-bench, or internal equivalents).
- Experience with modern LLM training stacks:
  - Hands-on work with large MoE models and expert/tensor parallelism is a plus.
- Serving and online training:
  - Experience tuning inference performance in open-source serving frameworks such as vLLM or SGLang.
- Safety, robustness, and reward shaping:
  - Experience with LLM-as-a-judge, reward hacking detection, or robustness evaluation.
- Open-source contributions or research:
  - Contributions to open-source LLM tooling, RL libraries, or relevant research papers in LLM training / RLHF / code models.
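For the long-context bullet above, here is a toy sketch of RoPE position interpolation, one of the simplest context-extension tricks in that family; the dimensions and scale factor are made up for the example, and YaRN-style methods refine this by scaling frequencies non-uniformly:

```python
import torch

def rope_angles(seq_len: int, head_dim: int, base: float = 10000.0,
                scale: float = 1.0) -> torch.Tensor:
    """Rotation angles for rotary position embeddings (RoPE).

    scale=1.0 is vanilla RoPE; scale>1.0 performs position interpolation:
    positions are compressed by 1/scale, so a model pretrained on an
    N-token context can address N*scale positions while the angles stay
    inside the range seen during pretraining.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(seq_len).float() / scale  # the interpolation step
    return torch.outer(positions, inv_freq)  # shape: (seq_len, head_dim // 2)

# Example: stretch a hypothetical 4k-context model to 16k positions.
angles = rope_angles(seq_len=16_384, head_dim=128, scale=4.0)
print(angles.shape)  # torch.Size([16384, 64])
```

In practice a short fine-tune at the extended length usually follows the frequency change; plain interpolation alone tends to cost some short-context quality, which is part of what YaRN-style schemes address.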