Site Reliability Engineer III
Company Description
McDonald’s new growth strategy, Accelerating the Arches, encompasses all aspects of our business as the leading global omni-channel restaurant brand. As the consumer landscape shifts, we are using our competitive advantages to further strengthen our brand. One of our core growth strategies is to Double Down on the 3Ds (Delivery, Digital and Drive Thru). McDonald’s will accelerate technology innovation so 65M+ customers a day will enjoy a fast, easy experience, whether at one of our 25,000 and growing Drive Thrus, through McDelivery, dine-in or takeaway.
Leading this tech revolution is McDonald’s Global Technology organization, made up of intrapreneurs who get to build really cool tech with scary smart people using the latest innovations like AI, IoT, and edge computing. We do this working alongside diverse, global teams who are always hungry for a challenge. It’s bonus points when you get to see your family and friends use the tech you build at their favorite McD restaurant.
This role will collaborate closely with segment and market leads, project managers, and Global Technology Solutions teams to ensure the reliable and efficient operation of McDonald’s Edge program platform. In addition, this role will be responsible for managing the Edge technical platform and collaborating with the various capability teams across consumer, restaurant, and company platform engineering. This person will work closely with others in Global Technology Risk Management and other areas of Global Technology to ensure that our services meet the needs of markets, application teams, and other stakeholders.
Check out the Global Technology Technical Blog to learn how technology is directly enabling the Accelerating the Arches strategy.
Job Description
This opportunity is part of the Global Technology Infrastructure & Operations team (GTIO), where our mission is to deliver modern and relevant technology that supports the way McDonald’s works. We provide outstanding foundational technology products and services including Global Networking, Cloud, End User Computing, and IT Service Management. It’s our goal to always provide an engaging, relevant, and simple experience for our customers.
The Site Reliability Engineer (SRE) – Edge Platform is a key member of the Edge Operations and SRE team within Global Technology Infrastructure & Operations. This role is responsible for ensuring the reliability, scalability, and operational excellence of the Edge computing platform that supports McDonald’s global restaurant technology ecosystem.
You will work closely with Architecture, Platform Engineering, and Security teams to implement observability, automation, and incident response strategies that ensure the Edge platform is resilient and maintainable. This is a unique opportunity to influence the operational maturity of a global platform and drive continuous improvement across infrastructure and services.
Responsibilities & Accountabilities:
- Operate and maintain Edge platform infrastructure to ensure 24x7x365 availability, reliability, and performance.
- Design and implement observability frameworks using tools such as Prometheus, Grafana, Jaeger, and Datadog.
- Collaborate with Platform Engineering and Edge Solution Delivery teams to ensure platform features are operable, maintainable, and supportable in production environments.
- Develop and maintain runbooks, playbooks, and automation scripts to streamline operations and reduce manual toil.
- Lead incident response, root cause analysis, and post-incident reviews to drive continuous improvement.
- Participate in capacity planning, performance tuning, and disaster recovery exercises.
- Implement and manage CI/CD pipelines and Infrastructure-as-Code (IaC) for operational tooling and automation.
- Architect and maintain self-healing and auto-scaling capabilities across Edge clusters.
- Partner with security teams to ensure compliance with enterprise standards and implement secure operational practices.
- Contribute to platform architecture discussions with a focus on operational readiness and supportability.
- Stay current with industry trends in SRE, edge computing, and distributed systems.
Skills and experience required:
- Experience in Site Reliability Engineering, DevOps, or Platform Operations.
- Experience supporting Edge computing or hybrid cloud environments.
- Strong expertise in observability tools (Prometheus, Grafana, Jaeger, Datadog, ELK).
- Experience with container orchestration platforms (Kubernetes, GKE) and virtualization technologies.
- Proficiency in scripting and automation (Python, Bash, PowerShell).
- Hands-on experience with CI/CD tools (GitHub Actions, Jenkins, ArgoCD) and IaC (Terraform).
- Solid understanding of cloud platforms (GCP, AWS) and distributed systems.
- Strong problem-solving skills and ability to work in a fast-paced, collaborative environment.
- Excellent communication and documentation skills.
- GCP or AWS certification preferred.
- Experience with Agile methodologies is a plus.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related field; or equivalent experience.
Additional Information
At McDonald’s we are People from all Walks of Life...
People are at the heart of everything we do, and they make the McDonald’s experience. We embrace diversity and are committed to creating an inclusive culture that means people can be their best authentic self in our restaurants and offices, which helps us to better serve our customers. We have a strong heritage of diversity and representation within our communities, which we are proud of. The diversity of our people, customers, Franchisees and suppliers gives us strength.
We do not tolerate inequality, injustice or discrimination of any kind. These are hugely important issues and a brand with our reach and relevance means we have a very meaningful role to play.
We also recognise our responsibility as a large employer to continue being active in our communities, helping to develop skills and drive aspirations that will help people to be more aware of the world of work and more successful within it, whether with McDonald’s or elsewhere.