Role overview
We are seeking a Research Engineer in Natural Language Processing (NLP) and Large Language Models (LLMs) to contribute to the design, training, and evaluation of next-generation foundation models. The role sits at the intersection of research and production-grade engineering, with a strong emphasis on post-training, multimodality, and advanced generative modeling techniques, including diffusion-based approaches.
You will work closely with researchers and applied scientists to translate novel ideas into scalable, reproducible systems, and to push the state of the art in open, responsible, and multilingual AI.
What you'll work on
- Design, implement, and maintain training and post-training pipelines for large language and multimodal models (e.g., instruction tuning, alignment, preference optimization)
- Conduct research and engineering on post-training methods
- Contribute to multimodal modeling, integrating text with modalities such as vision, speech, or audio
- Explore and apply diffusion-based models and hybrid generative approaches for language and multimodal representation learning
- Optimize large-scale training and inference
- Develop evaluation pipelines and benchmarks for language understanding, reasoning, alignment, and multimodal performance
- Collaborate with researchers to prototype new ideas, reproduce results from the literature, and contribute to publications or technical reports
- Ensure code quality, reproducibility, and documentation suitable for long-term research and open-source release
What we're looking for
- Experience with diffusion models (e.g., text diffusion, latent diffusion, or multimodal diffusion)
- Hands-on experience with multimodal models (e.g., text-image, text-audio, or speech-language systems)
- Exposure to LLM alignment, safety, or evaluation beyond standard language modeling metrics
- Experience with distributed training and large-scale model experimentation
- Familiarity with multilingual or low-resource language settings
- Contributions to open-source ML or published research in NLP, multimodality, or generative modeling