AI Infrastructure Engineer – BlueCat Horizon Platform
About the role
Have you heard of BlueCat? We’re one of those hidden gems that’s been disrupting the market as a key player in the rapidly growing space of Intelligent Network Operations. Organizations require a new model of network operations that links foundational core services with a deep, predictive understanding of network health and performance to improve change readiness. BlueCat’s Intelligent NetOps is a first-to-market combination of systems of understanding and change. BlueCat enables teams to enhance agility and mitigate risks from high rates of change with a unified management lifecycle, from provisioning to proactive troubleshooting and remediation.
At BlueCat, we take immense pride in our award-winning culture, an integral part of our identity. We are proud recipients of several prestigious accolades, including the "Great Place to Work" certification. By becoming a part of our team, you not only join a company at the forefront of technology but also become an integral member of Canada's top workplaces in various categories, including Technology, Today's Youth and Women, and Mental Health and Inclusion.
Job Description:
The BlueCat Horizon team is responsible for powering all BlueCat SaaS products. Our mission is to deliver BlueCat products on a reliable, fast, globally distributed, and cost-effective enterprise-grade cloud infrastructure. Central to this mission is our AI-first strategy, as we fully embrace a product model where AI is integral to everything we create.
The AI Infrastructure Engineer role is a high-impact, implementation-focused position centered on building a production-grade agentic platform for Horizon. You will be the lead coder for our autonomous agent runtime, leveraging Amazon Bedrock AgentCore as the core framework and Kubernetes (EKS) as the orchestration engine.
Key Responsibilities:
You will bridge the gap between "experimental agents" and "production systems." Your mission is to build the secure, scalable, and stateful infrastructure that allows agents to reason, access enterprise tools, and persist memory. You will spend the majority of your time writing Go for systems-level Kubernetes extensions and Python for the agentic framework logic.
You will work closely with the Architecture Team, driving architectural decisions through implementation and operation, and interact with the product management team to understand use cases and requirements and to develop and present technical solutions. Your work will directly impact the scalability, performance, and reliability of the BlueCat Horizon Platform, ensuring that it meets the demanding needs of versatile agentic AI workloads.
Responsibilities/Duties:
- Runtime Implementation: Deploy and optimize the AgentCore Runtime on Amazon EKS, ensuring agents have a secure, high-performance environment for long-running tasks.
- Secure Gateway Logic: Build the AgentCore Gateway using Go to mediate between autonomous agents and internal microservices, enforcing zero-trust security.
- State & Memory Management: Architect persistent state layers, ensuring agents maintain context across sessions or specialized vector stores.
- Platform Integration: Engineer the "connective tissue" between AgentCore and the Horizon Kubernetes platform, ensuring agents have native access to cluster resources and internal services.
- Standardized Tooling: Leverage the Model Context Protocol (MCP) to integrate diverse data sources and internal tools into the agent ecosystem.
- Secure Gateways: Build the "connective tissue" in Go or Python that allows agents to securely interact with enterprise APIs via the AgentCore Gateway.
- Evaluation: Design and implement automated evaluation frameworks to verify that Horizon agents are performing tasks accurately, safely, and within BlueCat's operational guardrails.
- Observability: Build the infrastructure to capture agent execution traces and user feedback, feeding it back into the evaluation pipeline to continuously improve agent reliability.
- Horizon-Specific Tooling: Build secure, high-performance interfaces in Go and Python that allow agents to interact with Horizon APIs, telemetry data, and configuration engines.
- IaC Mastery: Lead the implementation of Terraform or AWS CDK modules to deploy the full AgentCore stack (Identity, Memory, and Gateway) in a repeatable, multi-account fashion.
- Provide the infrastructure support for Retrieval-Augmented Generation (RAG) systems, ensuring low-latency access to vector databases.
- Knowledge Graph Support: Support the integration of Knowledge Graphs into the agent reasoning loop to provide structured enterprise context.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field; Master’s degree preferred.
- 10+ years of experience in software engineering, with 5+ years of commercial experience in cloud distributed systems and high-scale designs using Golang and async Python.
- Hands-on experience with AWS AgentCore (or similar agentic AI platforms), agent SDKs (Strands, OpenAI, LangChain), and protocols (MCP, A2A).
- Agentic Expertise: Proven ability to move agents beyond "chat" into autonomous "action" loops (MCP, A2A, RAG, KnowledgeBase).
- Must have 2+ years of hands-on proficiency in Kubernetes, Kubernetes operators, and containers.
- Experience with Helm charts, API gateways, and ingress/egress gateways.
- You are passionate about building great REST APIs (and helping others do the same).
- Passion for engineering rigor and operational excellence (design principles and patterns, unit testing, best practices for security and privacy, CI/CD, etc.).
- Experience with CI/CD tools (GitLab) and automation.
- Strong experience with infrastructure-as-code tools like Terraform.
- Excellent written and verbal/presentation communication skills.
- Ability to work well with a distributed team.
If you share our enthusiasm for the future of our company and are eager to contribute to our vibrant workplace, we look forward to receiving your application! Our comprehensive benefits encompass your health, financial well-being, and overall wellness, and we are committed to providing an exceptional work environment, enriching employee programs, and fostering a remarkable company culture. At our core, we champion values such as transparency, curiosity, respect, and above all, the pursuit of enjoyment.
In addition, we offer a range of appealing perks, including:
A Professional Development Budget
Dedicated Wellness Days and Wellness Week
A Lifestyle Spending Account
An Employee Recognition Program
Join us in shaping the future of our organization, where your talent and dedication can truly thrive. We invite you to apply and become a valuable member of our team!
BlueCat is an Equal Opportunity Employer that is committed to inclusion and diversity. We also take affirmative action to offer employment and advancement opportunities to all applicants, including minorities, women, protected veterans, and individuals with disabilities. BlueCat will not discriminate or retaliate against applicants who inquire about, disclose, or discuss their compensation or that of other applicants.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.