Kubernetes Governance Explained
July 05, 2022
IDC projects that global data volume will reach 175 zettabytes by 2025 (IDC: Expect 175 zettabytes of data worldwide by 2025). As data volume grows, so does the need for data centers to comply with standards such as PCI DSS, CIS, and NIST. Complying with these standards requires DevSecOps practices and automation that ensure security requirements are met at every level: operating system, IaaS, PaaS, and application/workload.
Governance refers to a well-defined set of rules, processes, and procedures intended to ensure accountability, transparency, and responsibility. The same holds for modern data centers, which may comprise several clouds or K8s clusters. DevSecOps is critical for enforcing and maintaining policies in IT infrastructure. For K8s clusters, these policies define the procedures that keep the workloads and applications running on the cluster secure. In addition, business and compliance needs require some organizations to meet security standards such as CIS and PCI DSS.
Let us look at governance at three levels: user, application, and organization.
1. User level:
User-level governance is a means of defining WHO can access WHAT. In terms of K8s, it means WHO (user/group/serviceaccount) can perform WHAT action (CREATE/DELETE/GET/LIST etc.) on WHICH K8s RESOURCE (pods/deployments/configmaps etc.).
User-level governance can be achieved by enabling RBAC (Role-Based Access Control) and carefully defining Roles, RoleBindings, ClusterRoles, and ClusterRoleBindings. RBAC determines who has access to the Kubernetes cluster and to what extent.
Everything in Kubernetes is a resource: pods, nodes, services, service accounts, etc. Kubernetes lets us control access to these resources by creating Roles and ClusterRoles, which define the actions (verbs such as CREATE/DELETE/GET/LIST) that a user can perform on a particular resource. A Role is namespace-scoped, while a ClusterRole is cluster-scoped. As a standard practice, granting ClusterRoles should be avoided as much as possible.
It is preferable to create distinct categories of roles based on cluster/organization needs and then define role bindings for different users. The following best practices help when managing RBAC for users:
- Do not provide ClusterRole freely.
- Prefer Role/RoleBinding over ClusterRole/ClusterRoleBinding, since the former are namespace-scoped and the latter are cluster-scoped.
- Provide ClusterRole/ClusterRoleBinding only when needed.
- List apiGroups, resources, and verbs explicitly in the Role instead of using the * wildcard.
One sample is given below for illustration:
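A minimal sketch of such a role, reconstructed from the description that follows and written as a Python dict that mirrors the fields of a Kubernetes Role manifest. The role name `workload-editor`, the namespace `dev`, and the explicit list standing in for "all permissions" are assumptions, and the helper `is_allowed` is hypothetical: it only illustrates how rules are matched, while the real authorization decision is made by the API server.

```python
# Explicit verb list used in place of the "*" wildcard (an assumption about
# what "all permissions" should expand to).
ALL_VERBS = ["get", "list", "watch", "create", "update", "patch",
             "delete", "deletecollection"]

# Dict mirroring a Role manifest; name and namespace are hypothetical.
role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "workload-editor", "namespace": "dev"},
    "rules": [
        {
            "apiGroups": [""],  # "" denotes the core API group
            "resources": ["pods", "podtemplates", "replicationcontrollers"],
            "verbs": ALL_VERBS,
        },
        {
            "apiGroups": [""],
            "resources": ["configmaps"],
            "verbs": ["get", "list", "watch"],  # read-only access
        },
    ],
}


def is_allowed(role, api_group, resource, verb):
    """Simplified rule matching: allow only if some rule lists all three values."""
    for rule in role["rules"]:
        if (api_group in rule["apiGroups"]
                and resource in rule["resources"]
                and verb in rule["verbs"]):
            return True
    return False  # RBAC is additive: anything not granted is denied


print(is_allowed(role, "", "pods", "delete"))        # True
print(is_allowed(role, "", "configmaps", "delete"))  # False
```

To take effect in a cluster, a role like this would still need a RoleBinding that attaches it to a user, group, or service account.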
In the above role definition, for the core API group (represented by apiGroups: ""), all permissions are given for pods, podtemplates, and replicationcontrollers, but only get/list/watch permissions are given for configmaps. The definition is deliberately specific: it grants different permissions to different resources within the same apiGroup.
2. Application level:
Application-level governance means enforcing policies on the workloads admitted to the cluster and scanning application images regularly to ensure that they contain no known vulnerabilities.
Application-level governance can be achieved with Open Policy Agent (OPA) Gatekeeper, a CNCF (Cloud Native Computing Foundation) project that provides a simple approach for defining and enforcing policies at scale. Gatekeeper is an admission controller that validates incoming requests and allows or disallows CREATE/DELETE/UPDATE of K8s resources. It lets users define policies as YAML files with embedded Rego for programmability. For example, a user may define a policy that any pod created on a cluster must carry certain labels; see gatekeeper-library/template.yaml at master · open-policy-agent/gatekeeper-library.
Gatekeeper runs as an admission controller webhook that receives the API request when a K8s resource is created (for example, a POST call on any K8s resource, including custom resources). Once Gatekeeper receives a request, it evaluates the policies and accepts or denies the creation of the resource depending on whether a policy is violated.
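The accept/deny flow can be sketched in a few lines of Python. This is not how Gatekeeper is implemented (its policies are written in Rego); it is a simplified stand-in for a required-labels policy. The label set in `REQUIRED_LABELS`, the function `review`, and the sample requests are all hypothetical, and the request and response dicts only loosely mirror the `uid`, `object`, `allowed`, and `status` fields of a Kubernetes AdmissionReview.

```python
# Hypothetical policy parameter: labels every admitted object must carry.
REQUIRED_LABELS = {"owner", "app"}


def review(admission_request):
    """Return an AdmissionReview-style response for a simplified request dict."""
    obj = admission_request["object"]
    labels = obj.get("metadata", {}).get("labels") or {}
    missing = sorted(REQUIRED_LABELS - set(labels))
    response = {"uid": admission_request["uid"], "allowed": not missing}
    if missing:
        # A denial carries a message explaining which policy was violated.
        response["status"] = {
            "message": "missing required labels: " + ", ".join(missing)
        }
    return response


good = {
    "uid": "1",
    "operation": "CREATE",
    "object": {"kind": "Pod",
               "metadata": {"name": "web",
                            "labels": {"owner": "team-a", "app": "web"}}},
}
bad = {
    "uid": "2",
    "operation": "CREATE",
    "object": {"kind": "Pod",
               "metadata": {"name": "web", "labels": {"app": "web"}}},
}

print(review(good)["allowed"])           # True
print(review(bad)["allowed"])            # False
print(review(bad)["status"]["message"])  # missing required labels: owner
```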
Another aspect of application-level governance is scanning and hardening application/container images. Various tools can perform static scanning and detect vulnerabilities. Hardening a container image reduces the application's exposure to other applications. Scanning should be repeated periodically to ensure that the container images remain free of known CVEs.
3. Organization level:
Depending on the organization’s governance requirements, security standards such as CIS and PCI may apply, but implementing them can be complex. Let us analyze how one such requirement can be implemented in a K8s cluster.
Let us consider PCI requirement 2.2.1: “Where virtualization technologies are in use, implement only one primary function per virtual system component.” To implement this requirement for a workload/pod running on a K8s cluster, the extra ports on the container/pod need to be disabled. For example, if the workload/pod serves on port Y, all other open ports need to be disabled.
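As a rough illustration, the check for extra ports could start with an audit of the pod specification. The sketch below assumes a pod described as a plain dict with the same field names as a Pod manifest; the function `extra_ports`, the sample pod, and the port numbers are hypothetical. Note that `containerPort` entries are declarative: a process can still listen on an undeclared port, so an audit like this only flags declared ports, and actually blocking traffic would need a further control such as a NetworkPolicy.

```python
def extra_ports(pod, allowed_ports):
    """Return (container name, port) pairs declared in the pod but not allowed."""
    extra = []
    for container in pod["spec"]["containers"]:
        for port in container.get("ports", []):
            if port["containerPort"] not in allowed_ports:
                extra.append((container["name"], port["containerPort"]))
    return extra


# Hypothetical pod: serves on 8080 but also declares a debug port.
pod = {
    "kind": "Pod",
    "metadata": {"name": "web"},
    "spec": {
        "containers": [
            {"name": "web",
             "ports": [{"containerPort": 8080}, {"containerPort": 9229}]},
        ]
    },
}

print(extra_ports(pod, allowed_ports={8080}))  # [('web', 9229)]
```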
The above example illustrates how a top-level PCI standard translates down to the workload/pod level through DevSecOps policies and processes. Requirements become more complex when compliance with additional standards is needed, which calls for more tools, policies, and DevSecOps practices.
Governance and compliance play a significant role in data centers, both for business needs and for general security. Achieving governance requires tightening security at the application, user, and organizational levels by adopting sophisticated tools, DevSecOps practices, policies, and procedures.