Responsibilities
- Collaborate with other engineers and scientists to develop scalable data pipelines for diverse AEC data sources used in production ML systems, including BIM, CAD, and infrastructure design data
- Work with large-scale, multi-modal datasets, including text and geometric data, to design novel methods for preprocessing, augmentation, analysis, and content understanding
- Transform unstructured AEC and infrastructure data into representations suitable for machine learning
- Lead cross-functional collaboration with ML Research Scientists and Engineers to align data formats with downstream training and fine-tuning of LLMs
- Apply deduplication, normalization, and validation techniques to ensure high-quality data in production environments
- Architect and optimize pipelines for scalability, reproducibility, and cloud deployment
- Mentor junior engineers and provide technical guidance on complex data challenges
- Drive technical decision-making and influence best practices across the team
- Perform requirements analysis with senior stakeholders, ensuring technical solutions meet both immediate project goals and long-term research objectives
- Communicate findings and technical insights through quantitative analysis, visualizations, and clear documentation
- Contribute to agile workflows, ensuring flexibility and responsiveness to evolving project needs
- Participate in technical planning and roadmap development
Basic qualifications
- MSc or PhD in Computer Science, Engineering, or a related field
- 5–8+ years of experience in Machine Learning, Engineering, or related fields
- Proven technical leadership, including leading complex projects and influencing technical direction in cross-functional teams
- Strong experience in geometric data modeling and processing, including complex 2D/3D representations, computational geometry, and data architectures
- Familiarity with machine learning concepts and frameworks and how data is represented for training
- Proficiency in Python and strong software development practices
- Ability to translate research ideas into production-grade systems
- Excellent communication skills, with the ability to influence and guide technical decisions
- Background in Architecture, Engineering, or Construction (AEC)
Preferred qualifications
- Experience with AEC data formats and workflows (e.g., BIM, IFC, CAD, or civil infrastructure design models)
- Exposure to AEC, infrastructure, or reality capture workflows and related platforms such as Autodesk Civil 3D, InfraWorks, ReCap, or similar systems is a plus
- Experience delivering production ML or data systems
- Strong foundations in core computer science (data structures, algorithms, systems, and scalability)
- Understanding of deep learning architectures (CNNs, Transformers) and familiarity with frameworks such as PyTorch
- Experience building scalable data or ML pipelines in cloud environments (e.g., AWS, SageMaker)
- Experience mentoring senior engineers or leading small technical teams
- Track record of driving technical innovation and engineering best practices
- You are passionate about solving problems for AEC (Architecture, Engineering, and Construction) and infrastructure customers by applying machine learning techniques
- You are comfortable working in newly forming, ambiguous areas where learning and adaptability are key skills
- You easily collaborate with others and are comfortable with minimal direction
- You are constantly striving to learn new technologies and methodologies
- You seek new ways to solve hard problems
- You are unafraid to put your ideas out there and fail fast