Role overview
KissMyButton · Greece (Remote/Hybrid)
About Us
KissMyButton is a dedicated team of professional software developers. We are passionate about our work and aim to extend our clients' potential through high-impact technical excellence.
The Role
We are building a high-performance, secure, and eventually sovereign RAG engine. We are looking for a Senior Go Engineer who is a "security-first" architect. You will start by building the hardened back-end services for our AI Assistant and lead the transition to on-prem, high-throughput inference using cutting-edge GPU orchestration.
What you'll work on
- Go Orchestration: Develop lightning-fast, type-safe APIs and middleware in Go (Golang) to handle streaming LLM data and complex RAG logic.
- AI Security & Guardrails: Design and implement the "Shield": protect the system from prompt injection and data exfiltration, and enforce strict PII/PHI filtering.
- Inference Engineering: Deploy and tune vLLM and llm-d on Kubernetes to maximize GPU throughput and minimize "Time to First Token" (TTFT).
- System Observability: Build the backend infrastructure for RAG evaluation (integration with tools like LangSmith or Arize) to track the "faithfulness" of our AI.
- Performance: Optimize Go routines and memory management to support high-concurrency enterprise traffic.
What we're looking for
- Go Mastery: 3+ years of professional experience building resilient distributed systems, durable execution, and high-performance backends in Go.
- Architecture: Solid understanding of system design, concurrency, and data consistency.
- Infrastructure DNA: Strong experience with Kubernetes (K8s) and Docker. You should be comfortable managing GPU-enabled nodes.
- Security-First Mindset: Deep understanding of modern AuthN/AuthZ (OIDC, OAuth2) and API hardening.
- AI Enthusiast: You are an active user of AI agentic tools (Claude, Cursor, Copilot) and have a strong curiosity about how LLMs work under the hood (Weights, Quantization, Context Windows).
- Strong communication and collaboration skills.
- Experience with vLLM, llm-d, TGI, or NVIDIA Triton Inference Server.
- Familiarity with Python (for data/AI scripting) and the Lang* ecosystem.
- Knowledge of Vector Databases (Qdrant, Weaviate, or Milvus).
- Experience with Open Policy Agent (OPA) for fine-grained access control.