Role overview
As research and innovation in low-carbon processes and sustainable solutions accelerate, IFPEN plays a major role in the threefold ecological, energy and digital transition, as an institute open to society and a trusted third party for public authorities.
In this context, the Physics and Analysis Direction produces a large volume of analyses and characterizations across several fields, including various types of spectroscopy and microscopy. These data, and the specialists' interpretations of them, are stored in a wide variety of forms, such as databases and Microsoft Office files. Over the years, numerous experiments have generated a large volume of heterogeneous documents (reports, notes, meeting summaries) containing valuable information on protocols, parameters, and results. These archives remain difficult to exploit for understanding experimental dynamics and putting new studies in context.
To address this challenge, the Digital and Science Technology team aims to develop a hybrid AI architecture that combines language models and deep learning to turn these archives into a tool for analysis and trend identification.
What you'll work on
- Large Language Models (LLMs) for information extraction and structuring from heterogeneous documents
- Hybrid representations combining semantic embeddings and knowledge graphs
- Deep learning methods for similarity analysis (embedding-based similarity, metric learning), clustering, and trend analysis
- Graph-based learning approaches (e.g., Graph Neural Networks) to model relationships and experimental dynamics
What we're looking for
- Background in computer science, data science, artificial intelligence, or a related field
- Knowledge of machine learning and deep learning fundamentals
- Proficiency in Python programming and common ML/DL libraries (e.g., PyTorch, TensorFlow, scikit-learn)
- Familiarity with natural language processing and representation learning (embeddings)
- Ability to work autonomously and to collaborate within a multidisciplinary team
- Curiosity and good interpersonal skills