About the role
Job Description – Site Reliability Engineer (SRE) – GCP & Kubernetes

Job Title: Site Reliability Engineer (SRE)
Location: North York, ON (Hybrid)
Experience: 4+ Years

Job Summary

We are looking for an experienced Site Reliability Engineer (SRE) with strong expertise in Google Cloud Platform (GCP), Kubernetes, and DevOps practices. The ideal candidate will be responsible for designing, deploying, automating, and supporting cloud infrastructure and applications while ensuring high availability, scalability, security, and operational excellence.

Key Responsibilities

- Design, deploy, and support enterprise solutions on Google Cloud Platform (GCP).
- Build and manage cloud infrastructure, including VPC, IAM, Service Accounts, networking, and security.
- Develop and maintain Infrastructure as Code (IaC) using Terraform and Ansible.
- Implement and support CI/CD pipelines using Jenkins, Git/Bitbucket, SonarQube, Nexus, Maven, Gradle, JIRA, and Rundeck.
- Deploy, manage, and troubleshoot Kubernetes clusters, including Google Kubernetes Engine (GKE).
- Administer Linux and Windows environments while automating operational tasks.
- Configure and maintain application servers such as Tomcat and NGINX.
- Develop automation scripts using Bash, Shell, or Python.
- Monitor system performance, troubleshoot production issues, and implement reliability improvements.
- Collaborate with architecture, development, and operations teams to deliver scalable and secure cloud solutions.
- Create and maintain technical documentation and operational runbooks.
- Provide technical guidance to onshore and offshore teams and support continuous improvement initiatives.

Required Skills

- 4+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering.
- 2+ years of hands-on experience with Google Cloud Platform (GCP).
- Strong knowledge of GCP services, IAM, VPC, networking, and cloud security.
- Experience with Kubernetes and Google Kubernetes Engine (GKE).
- Hands-on experience with Terraform and Ansible.
- Experience with Docker and containerized applications.
- Strong understanding of CI/CD tools including Jenkins, Git/Bitbucket, Maven, Gradle, SonarQube, Nexus, JIRA, and Rundeck.
- Experience with Linux/Windows administration and automation tools such as Chef, Puppet, Ansible, or SaltStack.
- Proficiency in Bash, Shell scripting, or Python.
- Experience with Tomcat and NGINX administration.
- Strong analytical, troubleshooting, and communication skills.

Preferred Qualifications

- Google Cloud certifications such as Associate Cloud Engineer, Professional Cloud DevOps Engineer, or Professional Cloud Architect.
- Experience with Azure cloud is a plus.
- Strong understanding of SRE principles, monitoring, incident management, and production support.