We are looking for an experienced Site Reliability Engineer to join our Infrastructure Support Team. You will drive deep stability into new and established systems and provide support for day-to-day operational issues, including but not limited to operating, monitoring, and maintaining high availability of software services.
Manage Service Level Objectives and Indicators.
Risk acceptance and mitigation plans.
Proactive monitoring.
Release and deployment.
Ability to gather and aggregate metrics and logs into structured applications.
Troubleshoot production issues and coordinate with the development team to streamline code deployment.
Provision, configure and maintain infrastructure.
Manage production failures, infrastructure issues (disk/memory), security, and monitoring.
Automate system tests for security, performance, and availability.
Perform infrastructure cost analysis and optimization.
1 to 3 years of experience in building and maintaining AWS infrastructure.
Flexibility, adaptability, and desire to learn new languages and technologies.
Experience in developing CI/CD workflows and tools.
Hands-on experience in deploying and managing infrastructure.
Experience with development platforms (AWS, Kubernetes, Terraform, Docker).
Experience in configuration management, test-driven development, and release management.
A solid foundation in networking and Linux administration.
Experience with Docker, Bitbucket, ELK, and deploying applications on AWS.
Previous experience in game testing.
Good understanding of iOS and Android platforms.
Hands-on programming experience.
ISTQB foundation certification.