As a Site Reliability Engineer in the Digital Experience group, you will join our passionate colleagues in delivery, optimization, resilience, and availability of high-value and high-transaction rate services trusted by some of the largest brands in the world.
SREs have the competence and experience to provide direct technical contributions to major projects, both in code and in building and optimizing the production environment.
You align with your colleagues across engineering, deliver domain expertise for the infrastructure within your product area, and draw on your strong communication skills to solidify partnerships with teams across geographies.
Your points of view help develop and support successful delivery of reliability engineering, and you influence by way of metrics and data.
What you'll do
Extend our product services and production environment using traditional software engineering best practices.
Contribute to the technical direction of our cloud enterprise solutions.
Collaborate with various internal teams to provide a high-quality customer experience.
Contribute to service metrics and measurement.
Deliver automation to prevent problem recurrence and automate responses to all non-exceptional service conditions.
Establish credibility with the quality of your technical execution.
Participate in a cross-regional on-call rotation.
Continually evaluate and adopt the latest industry technologies to optimize costs and increase efficiency.
Participate fully in a culture that supports innovation and creativity while delivering high output in a predictable and reliable way.
What you need to succeed
7 or more years of experience in commercial software development, technical operations, and managing large-scale cloud-based applications.
MUST possess a BS / MS in Computer Science or equivalent.
Able to design and deliver infrastructure solutions for scalability, reliability, high availability, performance, security, software maintainability, and operational excellence.
Experience with Linux-based open source software.
Experience with AWS technologies, Kubernetes, and Terraform.
Expertise with configuration management tools (Ansible / SaltStack / Puppet), NoSQL technologies (Hadoop / Cassandra / MongoDB), and monitoring and logging solutions (preferably Prometheus, Splunk, Grafana).
Expertise with at least one programming language (preferably Java or Python).
Excellent communication skills (verbal and written) are essential to the role.
Able to work effectively across multiple time zones to collaborate with peers in other geographies.