Indicium AI is trusted by the world's leading enterprises to deliver AI into production at scale. We are a global AI-native consultancy with proven experience across Financial Services, Energy & Utilities, Healthcare & Life Sciences, Retail & CPG, and Manufacturing. From strategy, to build, to business outcomes, we unlock value from AI with unmatched clarity, speed, and capability.
Powered by 600+ AI experts serving 50+ enterprise clients from 5 global locations, we work side by side with top partners, including Anthropic, Databricks, AWS, OpenAI, and Microsoft, to deliver modern AI with speed and measurable impact.
Overview
We're seeking an experienced AI Engineer to design, build, and deploy production-grade AI systems powered by large language models. This role sits at the intersection of software engineering and AI implementation, focusing on building reliable, scalable applications rather than model training or research.
You'll work with cutting-edge LLM technologies, building advanced AI systems that solve complex real-world problems through multi-agent orchestration, intelligent tool integration, and robust production workflows.
You'll craft the orchestration layer that makes these systems production-ready: handling failure modes, optimizing agent collaboration, and ensuring consistent, reliable outputs at scale.
You'll combine strong software engineering fundamentals with deep practical knowledge of LLM capabilities, limitations, and best practices for building non-deterministic systems that users can trust.
Responsibilities
- Design and implement production AI systems integrating LLMs, RAG pipelines, vector databases, and agentic frameworks
- Create evaluation frameworks to measure and monitor system performance, accuracy, and reliability
- Build and maintain production-grade AI applications with clean code, appropriate error handling, APIs, and data pipelines
- Implement, maintain, and evaluate retrieval systems (vector/graph databases, ingestion pipelines, chunking strategies, retrieval techniques such as HyDE)
- Implement feedback loops and observability to continuously improve system performance
- Craft effective prompts and optimize for latency, cost, and quality across different model providers and configurations
Requirements
- Hands-on experience building applications with LLM APIs, and a deep understanding of their capabilities, limitations, and failure modes
- Practical experience implementing RAG architectures, vector databases, knowledge graphs, and prompt engineering
- Experience building multi-step LLM workflows and agentic systems using frameworks (e.g. SDK, Strands, Claude Agent SDK, LangGraph) or custom implementations where needed
- Strong proficiency in Python (or another modern programming language), with experience developing production APIs and services and working knowledge of cloud platforms (AWS, GCP, Azure)
- Understanding of distributed systems, CI/CD, testing frameworks, and deployment pipelines
- Solid grounding in the requirements, design, and implementation of production-grade, cloud-native platforms and infrastructure
- Strong data manipulation skills (pandas, SQL) and understanding of evaluation strategies for LLM-based systems
- Ability to work with ambiguity and optimize non-deterministic systems through experimentation and evaluation while balancing latency, cost, and quality tradeoffs
- Strong written and spoken English and Portuguese