Responsibilities

Develop and enhance ML compilers tailored for Qualcomm’s Neural Processing Unit (NPU) architecture, using PyTorch and C++.
Design and implement advanced quantization techniques to improve model efficiency and accuracy.
Optimize ML workloads for deployment across diverse devices, ensuring top-tier performance and energy efficiency.
Work directly with strategic customers to address specific challenges and deliver tailored ML solutions.
Collaborate with architecture and software engineering teams to integrate cutting-edge ML technologies into Qualcomm platforms.
Research, prototype, and implement innovative solutions in ML compiler design and system-level optimizations.
Evaluate and debug ML performance to identify and resolve bottlenecks in complex workflows.
Create thorough technical documentation to share knowledge and advancements with the team.

Qualifications

Bachelor’s or advanced degree in Computer Science, Electrical Engineering, Machine Learning, or a related discipline.
Strong expertise in PyTorch and C++ programming.
Experience with ML workload analysis, compiler development, and quantization techniques.
Familiarity with other deep learning frameworks and model formats, such as TensorFlow or ONNX, is a plus.
Proven track record of solving complex performance and efficiency challenges in hardware-aware ML solutions.
Ability to work collaboratively with strategic customers and deliver impactful results.
Excellent analytical skills and ability to thrive in a high-performance team environment.

Basic qualifications

Bachelor's degree in Science, Engineering, or related field and 2+ years of ASIC design, verification, validation, integration, or related work experience.
References to a particular number of years of experience are for indicative purposes only. Applications from candidates with equivalent experience will be considered, provided that the candidate can demonstrate an ability to fulfill the principal duties of the role and possesses the required competencies.

Preferred qualifications

Experience with ML model deployment on hardware accelerators such as GPUs, TPUs, or NPUs.
Understanding of system-level architecture and low-level programming.
Contributions to ML research or publications in relevant fields are an advantage.

Tags & focus areas

Machine Learning, PyTorch, AI

Machine Learning Engineer - PyTorch/C++ Development (NPU Architecture Team)