Responsibilities
- Research and develop agent frameworks that continuously learn and improve from execution traces, user feedback, and environmental signals.
- Build large-scale log analytics pipelines to extract quality signals, usage patterns, and actionable insights from model and agent invocation logs, driving data-informed system and model improvements.
- Explore and apply frontier techniques in LLM post-training, reasoning, and planning to enhance agent capabilities.
- Collaborate across algorithm research, platform engineering, and product teams to turn research ideas into production-grade systems at scale.
Basic qualifications
- Currently completing, or recently completed, a Ph.D. in Computer Science, Artificial Intelligence, Machine Learning, or a closely related discipline.
- Strong theoretical and practical foundation in machine learning, deep learning, reinforcement learning, or optimization.
- Research experience in at least one of the following areas: LLM-based agents, planning and reasoning, multi-agent systems, continual/lifelong learning, or LLM post-training (e.g., RLHF, DPO, GRPO, self-play).
- Strong programming skills in Python and proficiency with ML frameworks (e.g., PyTorch, TensorFlow, JAX).
- Publication record at top-tier venues (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL, AAAI, AAMAS, COLM).
- Strong problem-solving skills and ability to thrive in a fast-paced, collaborative environment.
Preferred qualifications
- Publications in areas directly related to agent learning and adaptation, such as tool use, self-improvement, skill discovery, trajectory optimization, reward modeling, or agent evaluation.
- Research experience in LLM reasoning and planning, including chain-of-thought, tree/graph search, Monte Carlo methods, or inference-time compute scaling.
- Experience training or fine-tuning large language models, including supervised fine-tuning, preference optimization, or curriculum learning.
- Hands-on experience building or evaluating LLM-based agent systems (e.g., ReAct, function calling, code generation agents, or multi-agent orchestration).
- Familiarity with meta-learning, few-shot generalization, or transfer learning in the context of LLM-based systems.
- Experience with feedback-driven optimization loops, such as online learning, bandit methods, or evolutionary strategies applied to agent improvement.
- Strong interest in bridging frontier AI research with production-grade engineering: turning papers into systems that work at scale.
- Internship experience at technology companies or research organizations.
- Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues.
- Appropriately handling and managing confidential information, including proprietary and trade secret information, and access to information technology systems.
- Exercising sound judgment.