Coredge

Senior DevOps Engineer

Cloud Operations | Noida | Full Time

Position Overview:

We are seeking a highly skilled and motivated Senior DevOps Engineer to join our team. The ideal candidate will have 4–5 years of DevOps experience with a strong focus on on-premises or self-managed Kubernetes environments. This role involves deploying, operating, monitoring, and troubleshooting Kubernetes clusters and Linux infrastructure to ensure high availability, reliability, and performance of production systems.

Key Responsibilities:

Linux Administration:

Administer and support Linux-based systems in production environments.
Deploy, manage, and troubleshoot applications running on Linux.
Perform root cause analysis for OS-level issues to maintain high availability and performance.
Ensure system stability, security hardening, and performance tuning.

Kubernetes (On-Prem / Self-Managed) – Must Have:

Deploy, configure, and maintain on-premises or self-managed Kubernetes clusters (bare metal or VM-based).
Troubleshoot Kubernetes issues related to pod scheduling, networking, and storage; cluster components and service failures; and application deployment and scaling.
Debug containerized workloads and ensure reliable rollouts.
Manage Kubernetes resources such as Deployments, Services, ConfigMaps, Secrets, and CronJobs.
Work with Argo CD / Argo Workflows for Kubernetes-native application delivery and workflow orchestration (mandatory).

Monitoring & Observability – Must Have:

Implement and maintain Prometheus and Grafana for infrastructure and application monitoring.
Create and manage real-time Grafana dashboards for cluster health, application metrics, and alerts.
Analyze monitoring data to proactively identify and resolve performance and reliability issues.
Support incident response using observability tools.

Automation & CronJobs:

Configure and manage Kubernetes CronJobs and Linux-based scheduled tasks.
Troubleshoot failed or delayed automation jobs.
Improve operational efficiency through scripting and automation.
Develop automation using Shell, Python, or Ansible.

Platform / Portal Exposure (Good to Have):

Gain working knowledge of Horizon / platform portals used for infrastructure or operational visibility.
Monitor and track infrastructure health and incidents using internal portals.
Utilize portals for operational insights and incident management.

Cloud Awareness (Limited / Supporting Role):

Understand basic cloud computing concepts and architectures.
Provide support or troubleshooting for cloud-related dependencies when required.
Note: This role is primarily focused on on-prem Kubernetes, not public cloud operations.

Key Requirements:

Bachelor’s degree in computer science, IT, or a related field (or equivalent hands-on experience).
4–5 years of experience in a DevOps / SRE / Production Support role.
Strong expertise in Linux system administration.
Hands-on experience with on-prem, self-managed, or unmanaged Kubernetes clusters.
Proven ability to deploy, debug, and troubleshoot Kubernetes environments.
Strong experience with Prometheus and Grafana.
Mandatory exposure to Argo CD / Argo Workflows.
Experience with automation and scripting (Shell, Python, Ansible).
Ability to handle production incidents independently.
Excellent troubleshooting, analytical, and communication skills.

Preferred Qualifications:

Kubernetes certifications such as CKA or CKAD.
Experience with CI/CD pipelines integrated with Kubernetes.
Exposure to container security, RBAC, and cluster hardening.
Experience supporting high-availability on-prem infrastructure.
