Responsibilities
- Design and build durable, idempotent ingestion pipelines for creative and performance data at scale (queues, retries, backpressure, dedup, schema evolution)
- Generate and manage embeddings for multi-modal creative assets; select and operate the right vector store for the workload
- Build and maintain retrieval pipelines that serve AI agent tools with accurate, low-latency responses
- Ship agent-style systems with tool use, state management, and multi-step reasoning workflows
- Develop and maintain the React frontend for the creative intelligence library and query interface
- Own the full lifecycle of your systems: design, build, deploy, monitor, and iterate
- Contribute to stack decisions with clear reasoning grounded in production experience
- Collaborate closely with product and enterprise partners to translate requirements into reliable, scalable systems
Basic qualifications
- Strong TypeScript — you use types as a design tool, not a formality
- Production experience with serverless or edge runtimes (Cloudflare Workers, Vercel, Lambda, Deno Deploy, or equivalent)
- Demonstrated experience building durable, idempotent ingestion pipelines with queuing, retry logic, backpressure handling, deduplication, and schema evolution
- Practical, production-level understanding of embeddings, chunking strategies, and retrieval quality tuning
- At least one agent-style system shipped to production, with tool use and stateful multi-step workflows; the framework matters less than the experience
- React fluency with modern patterns and component architecture
- Comfort operating across two cloud environments; able to reason clearly about when to use edge compute vs. managed data/AI services, and how to bridge them
- Prior remote work experience and fluency with remote collaboration tools and platforms (such as Slack, Zoom, Google Workspace, Linear, or similar) are required; experience working with US- or UK-based companies is ideal. Applications without prior remote work experience will not be considered.
Preferred qualifications
- Experience building or operating RAG systems in production
- Familiarity with current embedding models and the tradeoffs across dimension, quality, and cost
- Background in ETL design, observability for data pipelines, or evaluation frameworks for retrieval quality
- Adtech, performance marketing, or marketing analytics background — understanding what channels, attribution, and creative testing look like in a live production context
- Opinions on vector databases (Cloudflare Vectorize, Vertex AI Vector Search, Turbopuffer, or similar) backed by hands-on experience
Tech stack
- TypeScript (primary language across the stack)
- Cloudflare Workers, Queues, and Agents SDK (or equivalent edge runtime)
- GCP — Vertex AI for embeddings and related data/AI services
- Vector database (to be selected: Cloudflare Vectorize, Vertex AI Vector Search, Turbopuffer, or similar)
- React with Remix or TanStack Start (TBD)
- Google Workspace, Slack, Zoom, and standard remote collaboration tooling