Description
Kloia is a recognized AWS Partner with a deep focus on Application Modernization and Digital Transition projects.
Our teams are growing rapidly, and we’re hiring a Site Reliability Engineer, mainly for the managed services we provide to our customers but also for internal projects that build a scalable and reliable platform of common services.
What does an SRE do?
At Kloia, the SRE team focuses on eliminating toil in production workloads. Our main aim is to deliver a 24x7 SLA with a support system and team that follow the sun.
Key parts of this role are to take part in the design and development process, to help make the right trade-offs between performance, cost, security, and reliability, and to be a reliable escalation point supporting systems in production.
As an SRE you will:
- Eliminate toil through automation, re-architecting, and refactoring.
- Approach incidents with an “Automate Everything” mindset.
- Pair with software engineers to troubleshoot incidents.
- Drive complex infrastructure changes with zero downtime and a high level of transparency and communication.
- Design and implement self-healing, reliable and scalable infrastructure in a cloud-native environment.
- Guide and unblock developers across multiple teams, and get the right things done to push their products forward.
- Define SLOs and error budgets for services destined to run in production.
- Support and be a critical part of our DevOps culture, including participation in our follow-the-sun on-call rota.
Position: SRE (Site Reliability Engineer)
Location: Remote - LATAM / APAC
Level: Junior/Mid-level