We are seeking talented Data & AI Engineers to support the development of an enterprise-grade AI Assistant. This is a high-impact initiative at the intersection of modern data engineering and applied AI. The role focuses on building the data pipeline that transforms unstructured corporate documents into a searchable, intelligent knowledge base that powers a Large Language Model (LLM), and on automating actions such as ticket creation and workflow integrations.
Tasks
Data Engineering
- Build and maintain scalable data pipelines using Databricks and Azure Data Lake Storage Gen2 (ADLS Gen2), ensuring reliability and performance across all pipeline stages.
Data Transformation
- Implement the Medallion Architecture (Raw → Silver → Gold layers) to clean, normalise, and structure technical documentation for downstream AI consumption.
AI Orchestration
- Design and develop the Reasoning Layer using frameworks such as LangChain or LlamaIndex to manage LLM logic, prompt routing, and tool orchestration.
Search Optimisation
- Manage and tune Vector Databases (e.g. Pinecone, Weaviate, or Azure AI Search) to enable fast and accurate semantic retrieval across the corporate knowledge base.
Integration & Automation
- Develop Azure Apps and Azure Functions to connect the AI system to external enterprise platforms such as Salesforce and SharePoint, enabling end-to-end automation.
Requirements
- 3–4 years of experience in Data Engineering or Software Development, with a strong focus on cloud-based data processing.
- Strong proficiency in Databricks and Python; comfortable working across the full data engineering lifecycle.
- Hands-on experience with LLM orchestration frameworks: LangChain, LlamaIndex, or similar.
- Solid understanding of the Azure ecosystem: Storage (ADLS Gen2), Identity (AAD, now Microsoft Entra ID), and Serverless (Azure Functions).
- Familiarity with modern AI coding assistants (GitHub Copilot, Cursor, OpenAI Codex) to accelerate development velocity.