Google Cloud DevOps / Site Reliability Engineer (SRE) Job at Purple Drive, Alpharetta, GA

  • Purple Drive
  • Alpharetta, GA

Job Description

Role: Google Cloud DevOps / Site Reliability Engineer (SRE)

Location: Alpharetta, GA
Experience: 8-12 Years (Senior Level)

Job Summary

We are seeking an experienced Google Cloud DevOps / SRE Engineer to design, build, and operate highly reliable, scalable, and secure cloud infrastructure on Google Cloud Platform (GCP). The ideal candidate will bring deep Linux expertise, strong cloud networking and security knowledge, and hands-on experience with automation, CI/CD, and Kubernetes-based deployments. This role plays a critical part in ensuring system reliability, performance, and operational excellence across large-scale distributed systems.

Key Responsibilities

Cloud Infrastructure & Platform Engineering

  • Design, deploy, and manage cloud infrastructure using Google Cloud Platform services including Compute Engine, GKE, VPC, IAM, Cloud Storage, and Cloud SQL.

  • Architect and support highly available, scalable, and fault-tolerant systems on GCP.

  • Implement and manage Shared VPCs, VPC peering, firewall rules, load balancers, DNS, and VPN tunnels.

DevOps & Automation

  • Build and maintain CI/CD pipelines using Jenkins (Declarative & Scripted) and GitHub Actions.

  • Automate infrastructure provisioning and configuration using Terraform, including module development, remote state management, dependency handling, and DRY principles.

  • Implement modern deployment strategies such as Canary releases and Blue/Green deployments.

  • Manage container artifacts using Docker and Helm.

Site Reliability & Operations

  • Ensure high availability, performance, and reliability of production systems.

  • Troubleshoot complex system issues, including CPU, memory, and disk I/O bottlenecks, kernel issues, and system boot failures.

  • Analyze logs and metrics to proactively identify and resolve performance and stability issues.

  • Support incident response, root cause analysis, and post-incident reviews.

Linux Systems Engineering (Must Have)

  • Demonstrate deep hands-on expertise with Linux systems (RHEL, Ubuntu, CentOS).

  • Perform kernel tuning, system optimization, storage management (LVM), and systemd administration.

  • Maintain OS-level security, patching, and performance best practices.

Security & Identity Management

  • Implement and troubleshoot Cloud IAM, service accounts, and Workload Identity Federation.

  • Enforce least privilege access and security best practices across environments.

  • Partner with security teams to maintain compliance and secure cloud operations.

Collaboration & Process

  • Work closely with application teams, architects, and security stakeholders.

  • Participate in on-call rotations and incident management processes.

  • Contribute to operational documentation, runbooks, and best practices.

Required Skills & Qualifications

Must-Have Skills

  • Strong hands-on experience with Google Cloud Platform (GCP).

  • Deep expertise in Linux systems engineering (RHEL, Ubuntu, CentOS).

  • Proficiency in at least one programming language: Python, Go (Golang), or Java.

  • Strong troubleshooting and debugging skills across infrastructure and application layers.

  • Hands-on experience with Terraform for infrastructure as code.

  • Experience with CI/CD pipelines using Jenkins and/or GitHub Actions.

  • Kubernetes experience with GKE, Docker, and Helm.

Preferred Qualifications

  • GCP Certifications:

    • Google Professional Cloud DevOps Engineer

    • Google Professional Cloud Architect

  • CKA (Certified Kubernetes Administrator).

  • Experience supporting large-scale distributed systems and microservices architectures.

  • Familiarity with ITIL processes, Change Advisory Board (CAB) workflows, and incident management.

Soft Skills

  • Strong analytical and problem-solving abilities.

  • Excellent communication skills with the ability to collaborate across teams.

  • Ownership mindset with a focus on reliability and continuous improvement.

  • Ability to work in fast-paced, production-critical environments.

Job Tags

Remote work
