Responsibilities
- Design and implement sophisticated LLM prompts aligned to internal scoring rubrics and customer-specific needs.
- Test and evaluate prompts against labeled datasets to verify scoring accuracy.
- Collaborate with engineering to improve tooling and infrastructure for managing LLM scoring workflows.
- Maintain a centralized prompt library to ensure reuse and standardization across use cases.
- Document prompt strategies, testing approaches, and outcomes for internal learning.
- Identify and resolve inefficiencies in the prompt development and deployment process.
- Partner with product and data labeling teams to improve the quality and consistency of scoring outputs.
Basic qualifications
- Strong understanding of LLMs, NLP systems, and prompt engineering concepts.
- Ability to structure and document complex logic in a scalable and repeatable manner.
- Demonstrated ability to collaborate effectively with both technical and non-technical teams.
- Analytical mindset with an interest in experimentation, iteration, and data-driven decision-making.
Preferred qualifications
- Experience in structured data labeling or QA workflows.
- Familiarity with tools such as LangChain, OpenAI/GPT APIs, or vector databases.
- Background in linguistics, cognitive science, or HCI.
- Proficiency in Python or scripting tools for evaluation or automation.