Engineering

Site Reliability Engineer - 3

What you will do
  • Help build a platform from scratch that can handle scale.

  • Containerize services with Docker and orchestrate them with Kubernetes (K8s).

  • Work on Elasticsearch, MongoDB, Snowflake, and Kafka clusters.

  • Implement best practices, challenge the status quo, and track industry and technical trends, changes, and developments so the team always strives for best-in-class work.

  • Manage capacity, build security into every layer and reduce cost.

  • Monitor at scale with Prometheus and similar tools.

  • Maintain services once they are live by measuring and monitoring availability, latency and overall system reliability.

  • Implement secure networking, key management, user management, access management, and image management.


What you will need

  • Experience with the AWS platform and Kubernetes (K8s).
  • Linux and networking fundamentals.
  • Proficiency in Python or shell scripting.
  • A mindset to automate anything.
  • Awareness of cloud security concepts and best practices.
  • Experience with CI/CD practices, deployment patterns, and relevant toolsets.
  • Observability practices and toolchains (monitoring, metrics, logging, alerts, and tracing).
  • Infrastructure-as-code tools such as Terraform and Ansible.


Good to have:

  • AWS Certified Solutions Architect certification.
  • Certified Kubernetes Administrator (CKA) certification.
  • Certified Kubernetes Application Developer (CKAD) certification.
  • Experience with configuration management tools.
  • Strong code analysis skills in Python.