We are seeking an experienced AI Systems Architect to design, build, and scale high-performance distributed AI systems. The ideal candidate will have deep expertise in GenAI, LLMs, and cloud-native architectures, along with hands-on experience in building enterprise-scale AI/ML platforms and agent-based systems.

Must-Have Skills
Strong experience in designing and implementing high-performance, large-scale distributed systems
Proven experience in implementing and deploying AI/ML platforms at scale
Expertise in agent-based architectures, evaluation frameworks, and prompt and context engineering
Knowledge of MCP (Model Context Protocol) servers
Hands-on experience in LLM inference optimization, including batching and caching strategies
Strong experience with Kubernetes and cloud infrastructure (AWS/Azure/GCP)
Proficiency in at least one programming language, such as Python, Java, or Go
Expertise in designing agent data stacks and retrieval systems, including vector databases, hybrid search, data freshness strategies, memory systems, graph reasoning, BM25, and advanced retrieval techniques

Key Responsibilities
Architect and deliver scalable, high-performance distributed systems
Design and deploy AI/ML and GenAI platforms at enterprise scale
Build and manage agent-based architectures, including prompt and context engineering, MCP servers, and evaluation frameworks
Optimize LLM inference pipelines for latency, throughput, and efficiency
Design and implement agent data and retrieval systems (vector databases, hybrid search, memory systems, graph-based reasoning)
Lead Kubernetes-based, cloud-native deployments
Provide technical leadership, architecture governance, and hands-on mentoring to engineering teams

Nice to Have
Experience with RAG (Retrieval-Augmented Generation) frameworks
Familiarity with multi-agent systems and orchestration frameworks
Exposure to real-time data pipelines and streaming architectures

AI Systems Architect

Micasa Global