Responsibilities
- Design and implement scalable, reliable, and high-performance machine learning infrastructure for foundation models across text, image, speech, and multi-modal domains
- Collaborate with other teams to productionize state-of-the-art AI algorithms
- Optimize models for performance, efficiency, and on-device intelligence
- Implement machine learning systems with stringent privacy and security requirements
- Manage a small team of engineers, if required
Basic qualifications
- MS or PhD in Computer Science, Machine Learning, or related technical field
- Expert-level programming skills in Python
- Proficiency in machine learning frameworks such as JAX, PyTorch, or TensorFlow
- Strong background in distributed training, model optimization, and machine learning infrastructure
- Experience with large-scale model training and deployment
- Familiarity with Kubernetes, Docker, cloud platforms (AWS, GCP, Azure), and distributed computing frameworks
Preferred qualifications
- Experience with foundation models and large language models
- Background in multi-modal AI systems
- Demonstrated ability to transform research prototypes into production systems
- Published research or significant contributions to open-source ML projects
- Understanding of on-device machine learning techniques