Design and deliver reliable infrastructure, tools, and processes congruent with a cross-organization DevOps mindset and culture.
Script and code automation, infrastructure-as-code components, and platform services.
Participate in continually refining and prioritizing the work necessary to support DevOps tools and infrastructure.
Provide system administration and operations support for existing infrastructure and pipelines, while executing a plan to create tools and automation that enable more of this responsibility to be moved into the engineering teams.
Introduce operations that enable engineers to get real-time telemetry on application performance, exception handling, logging, and product usage information.
Promote the adoption and execution of industry best practices related to continuous integration and delivery, automated deployments, operations, infrastructure, support, and test automation.
Manage the high availability and growth of systems, along with service updates, maintenance, and validation.
Participate in 24x7 site reliability rotations, escalation workflows, and production incident management.
Skills & Requirements:
Knowledge of at least one cloud platform (AWS, Azure, or GCP).
Good hands-on knowledge of configuration management, deployment, and security tools such as Git, Bitbucket, Graylog and Elasticsearch, Zabbix, Site24x7, Prometheus, and Grafana.
Proficient in scripting, Git, and Git workflows.
Experience in developing Continuous Integration/Continuous Delivery (CI/CD) pipelines.
Experience managing self-hosted Kubernetes clusters and one of EKS or GKE.
Working knowledge of networking technologies such as switching, routing, firewalls, and load balancing for high-performance, highly available web applications.