Senior Solutions Architect, AI and Kubernetes

Kubex about 18 hours ago

Remote

Senior Level

Full-Time

Top Benefits

Competitive Compensation

Equity

Benefits

About the role

Kubex is on a mission to enable autonomous, AI-driven management of cloud infrastructure resources across Kubernetes, IaaS, and GPU-backed environments. As AI inference workloads increasingly run at scale on Kubernetes, Kubex is expanding its capabilities in GPU sharing, vLLM optimization, MIG automation, and cross-provider GPU cost management.

We are looking for a Senior Solutions Architect with a strong technical evangelism streak to work in close partnership with our CTO in the field. This is a hands-on role with broad scope across existing customers, active sales engagements, and partner relationships. The right candidate combines deep Kubernetes expertise with real, production-level exposure to AI workloads and the ability to hold peer-level technical conversations with MLOps, AI Platform, and AI Infrastructure engineers who go deep on GPU infrastructure, inference serving, and the operational challenges of running AI workloads at scale.

This is not a role for someone who wants to stay at the surface. You will be the most senior technical voice in many of the engagements you lead, and the quality of your field work will directly shape how customers experience Kubex and how our product evolves. Travel is required.

Key Responsibilities

Own and lead high-stakes customer engagements from initial architecture reviews and workshops through to production adoption, working closely alongside the CTO on the accounts and opportunities that matter most. Scope covers existing customers, active sales cycles, and partner relationships across the full range of Kubex capabilities.

Go deep with MLOps, AI Platform, and AI Infrastructure engineering teams on the real challenges of running GPU-accelerated workloads in Kubernetes. That means credible, peer-level conversations on GPU sharing strategies, inference serving frameworks like vLLM, TRT-LLM, and NIM, inference optimization trade-offs, and how Kubex's agentic and human-in-the-loop automation fits into their stack.

Work closely with Sales, Business Development, and Customer Success to shape and progress engagements, and bring structured field insight back to Product and Engineering to influence roadmap direction.

Carry Kubex's technical voice into the communities that matter: KubeCon, Kubernetes User Groups, and MLOps and AI Platform forums. Contribute to technical content including blogs, solution briefs, reference architectures, webinars, and conference talks.

Contribute to go-to-market strategy and product positioning, particularly as Kubex expands its AI and GPU optimization capabilities, and help translate complex technical capabilities into messaging that resonates with both technical buyers and business stakeholders.

Required Experience and Skills

7+ years in a highly technical, hands-on role, with significant time spent in senior customer-facing positions such as Solutions Architect, Senior Pre-Sales Engineer, or Senior SE.

Deep, proven expertise with Kubernetes in production environments including architecture, resource management at scale (HPA, VPA, quotas, admission controllers, Karpenter), and deployment across major cloud platforms (AWS/EKS, Azure/AKS, GCP/GKE) and on-prem distributions including OpenShift and Rancher.

Hands-on exposure to GPU-accelerated workloads in Kubernetes, particularly inference. Working knowledge of GPU sharing and partitioning approaches such as MIG, MPS, and time slicing, and how these are scheduled and managed in real clusters.

Practical familiarity with LLM inference serving frameworks. vLLM experience is strongly preferred; exposure to TRT-LLM, NVIDIA NIM, or SGLang is a plus. Enough depth on inference optimization trade-offs, covering areas like KV cache, quantization, and parallelism, to engage credibly with customers who work in these details every day.

Familiarity with GPU observability and performance monitoring for AI workloads in Kubernetes, through tools such as Prometheus, OpenTelemetry, or vendor-specific telemetry.

Strong communication skills across audiences, with the ability to move comfortably between executive conversations and deep engineering sessions. Background in technical community engagement, content creation, or developer advocacy is important given the evangelism dimension of this role.

Solid grounding in DevOps practices, CI/CD pipelines, FinOps, cloud-native infrastructure automation, and API integrations.

Preferred Qualifications

Hands-on experience with vLLM or a comparable inference server in a production setting, including benchmarking and tuning for real workload requirements.

Experience with KAI scheduler, NVIDIA run:ai, or similar GPU-aware scheduling solutions.

Awareness of the broader accelerator landscape beyond NVIDIA, including AMD, TPUs, and custom silicon, and how workload requirements map to different chip types.

Kubernetes certifications such as CKA, CKAD, or CKS.

Prior experience in a startup or high-growth SaaS environment, particularly in infrastructure, DevOps, AI, or observability.

Exposure to resource optimization problems: bin-packing, scheduling trade-offs, or infrastructure cost management at scale.

Why Join Kubex?

GPU waste in production Kubernetes environments is one of the most expensive and least understood problems in enterprise AI infrastructure right now. Kubex is building the automation to fix it, and enterprise demand is accelerating faster than the market has solutions for it.

You will work directly alongside the CTO on the engagements that matter most, with genuine input into product direction, roadmap, and how Kubex positions itself technically in the market.

Kubex is expanding its AI optimization capabilities at the right moment: vLLM tuning, MIG automation, KAI scheduler integration, and cross-provider GPU cost management are all in active development, and this role puts you at the center of how those capabilities reach customers.

The technology has real depth. Kubex's optimization systems are built on ML-driven analysis and a patent-pending GPU and XPU matching engine, not another layer of dashboards on top of existing tooling.

Competitive compensation, equity, and benefits in a remote-first culture that measures impact over activity.

Kubex is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

About Kubex

Software Development
