Written by Hasham Haider, Cloud Ops and Content Marketing, Replex
Kubernetes has seen rapid adoption
in the last couple of years, firmly establishing itself as a leader in the
container orchestration space. Dan Kohn, Executive Director of CNCF, predicts that
eventually much of the world's legacy software, worth about $100 trillion in net
GDP, will be ported to Kubernetes for better servicing.
One of the reasons behind this
accelerating adoption is the fact that Kubernetes is easy to get up and
running. Any developer can spin up a cluster with a couple of nodes running
containerized applications in a matter of minutes.
Running mission-critical
applications in production, with the requisite Security, Governance,
Compliance, Operational and disaster recovery framework in place, is, however, a
different ball game altogether.
Enterprises typically have a robust
Governance, Compliance and Operational framework supporting applications,
infrastructure and technology. These frameworks evolve over time and
incorporate a lot of internal tribal knowledge, making them unique to each
enterprise.
For Kubernetes to see Enterprise
adoption at the level Dan Kohn envisages, Kubernetes needs an equally robust
set of tools that allows organizations to create a comprehensive Governance,
Compliance and Operational framework around it.
In this article, we will explore a
Governance and Compliance framework for Kubernetes. We will identify the
individual features of such a framework as well as open-source and native
Kubernetes tooling that can support some of these features.
However, before we do that, let's
quickly review the concepts of Governance and Compliance and why the
introduction of the cloud and now Kubernetes necessitates this new framework.
At its most basic level, Governance
refers to a set of rules that allow enterprises to minimize risk, control
costs, and drive efficiency, transparency and accountability. Governance rules
are codified as policies that are then implemented enterprise-wide for a
consistent Governance framework.
Once Governance rules and policies
have been identified and codified, enterprises need to ensure they are
enforced. This process of monitoring and ensuring Governance policies are followed
is known as compliance.
Now that we have reviewed the
concepts of Governance and Compliance and identified the key drivers of a
Governance framework, let's look at the individual elements of such a
framework through the lens of Kubernetes. We will also be reviewing both native
and open-source tools that allow us to manage these elements of the Governance
framework.
Authentication, Authorization & Access Control
Authentication, authorization and
access control tooling together allow organizations to identify users,
implement a security paradigm and govern the use of resources.
Authentication
Authentication is the process of identifying users before giving them access to
resources. In Kubernetes, requests are authenticated as coming from either user
accounts or service accounts. User accounts are meant for humans such as team
members. Kubernetes has no API object for them, so they are managed outside the
cluster, for example by admins issuing client certificates or through an
identity provider. Service accounts are managed by the Kubernetes API, are bound
to specific namespaces and are meant for processes running in pods. A default
service account is created automatically in each namespace, and admins can
create additional ones via calls to the API.
Kubernetes supports a number of authentication strategies ranging from X509 client
certs and static token files to service account tokens and OpenID Connect
tokens. It can also be integrated with other authentication protocols including
LDAP, SAML and Kerberos, using an authenticating proxy or an authentication
webhook.
Together these authentication
strategies provide a wide range of options for enterprises to implement a
secure authentication regime for their Kubernetes environments.
Authorization
Once users have been authenticated
they next need to be authorized. Authorization is the process of giving
subjects (groups, user accounts, service accounts) access to Kubernetes
resources.
Kubernetes supports a number of
authorization modules. These include Node, RBAC and
Webhook. Node authorization is specific to the kubelet and authorizes the API
requests made by it.
Kubernetes RBAC allows the creation
of a set of rules (permissions) packaged as Roles. Roles can then be allotted
to users or service accounts using Role Bindings. With Kubernetes Roles,
cluster admins can control both the resources (pods, deployments etc.) that
users are allowed to access as well as the actions (verbs: get, list, update
etc.) that users are allowed to perform on those resources.
Roles are by default restricted to a
specific namespace and can be used to grant access to resources only within
that specific namespace. Roles can also be created for cluster-wide use using
Cluster Roles and allotted to users using Cluster Role Bindings.
Kubernetes RBAC gives cluster admins
fine-grained control over access and allows them to govern the use of
Kubernetes resources in line with the overall Governance framework.
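To make the relationship between Roles, Role Bindings and requests concrete,
here is a minimal Python sketch. It is not how the Kubernetes authorizer is
implemented; it only models the idea that a request is allowed when a Role
Binding in the namespace links the user to a Role whose rules cover the
resource and verb. The manifests follow the shape of
rbac.authorization.k8s.io/v1 objects, while the names dev, pod-reader,
read-pods and jane and the helper is_allowed are hypothetical.

```python
# Illustrative only: a simplified model of RBAC evaluation.
# The real authorizer also handles API groups, resourceNames, wildcards,
# groups, service accounts and ClusterRoles.

role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"namespace": "dev", "name": "pod-reader"},
    "rules": [
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]},
    ],
}

role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"namespace": "dev", "name": "read-pods"},
    "subjects": [
        {"kind": "User", "name": "jane", "apiGroup": "rbac.authorization.k8s.io"},
    ],
    "roleRef": {
        "kind": "Role",
        "name": "pod-reader",
        "apiGroup": "rbac.authorization.k8s.io",
    },
}


def is_allowed(user, verb, resource, namespace, roles, bindings):
    """Return True if a binding in the namespace grants the user this verb on this resource."""
    roles_by_key = {
        (r["metadata"]["namespace"], r["metadata"]["name"]): r for r in roles
    }
    for binding in bindings:
        if binding["metadata"]["namespace"] != namespace:
            continue  # Role Bindings only apply inside their own namespace
        if not any(s["kind"] == "User" and s["name"] == user for s in binding["subjects"]):
            continue
        bound_role = roles_by_key.get((namespace, binding["roleRef"]["name"]))
        if bound_role is None:
            continue
        for rule in bound_role["rules"]:
            if resource in rule["resources"] and verb in rule["verbs"]:
                return True
    return False  # nothing matched: RBAC permissions are purely additive


if __name__ == "__main__":
    print(is_allowed("jane", "list", "pods", "dev", [role], [role_binding]))
    print(is_allowed("jane", "delete", "pods", "dev", [role], [role_binding]))
```

In this model jane can list pods in dev but cannot delete them, and has no
access in any other namespace, which mirrors the namespace scoping described
above.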
In addition to authentication and
authorization, Kubernetes also provides an additional layer that API requests
can be filtered through. These filters are called Admission Controllers
and come into play once requests have been authenticated and authorized. We
will take a closer look at Admission Controllers in the Policy and Compliance
section.
Cost Management
Cost Management is another important
element of the enterprise Governance framework. Cost Management refers to the
continuing process of optimizing IT spend by putting in place policies to
control costs.
In this section we will look at the
discipline of Kubernetes Cost Management in the overall context of an
enterprise Governance framework. We will consider two aspects of Cost
Management: ensuring optimal utilization of resources and exercising control
over the provisioning and consumption of resources.
Kubernetes abstracts resources from
the underlying cloud or on-prem infrastructure (e.g. nodes) and allows them to be
consumed by containers and pods. Tracking usage and utilization of Kubernetes
clusters gives us an idea of which nodes are efficiently utilized and
which are underutilized. Right-sizing Kubernetes clusters based on these
metrics can yield significant cost savings. Regular usage and utilization alerts
and notifications are an important piece of the puzzle, allowing teams to
respond to events quickly.
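As a rough illustration of the utilization tracking described above, the
following Python sketch flags nodes whose CPU usage falls below a threshold.
The node names, the sample figures, the 30% threshold and the helper names are
all assumptions made for the example. In practice the numbers would come from
a monitoring system rather than a hard-coded list.

```python
# Illustrative only: flag underutilized nodes from sample metrics.
# All values are CPU millicores; the data below is made up.

def cpu_utilization(nodes):
    """Map each node name to used CPU divided by allocatable CPU."""
    return {n["name"]: n["cpu_used_m"] / n["cpu_allocatable_m"] for n in nodes}


def underutilized(nodes, threshold=0.3):
    """Return the names of nodes whose utilization is below the threshold."""
    utilization = cpu_utilization(nodes)
    return sorted(name for name, value in utilization.items() if value < threshold)


sample_nodes = [
    {"name": "node-a", "cpu_used_m": 3200, "cpu_allocatable_m": 4000},
    {"name": "node-b", "cpu_used_m": 400, "cpu_allocatable_m": 4000},
]

if __name__ == "__main__":
    print(underutilized(sample_nodes))
```

Nodes returned by underutilized are candidates for consolidation or for a
smaller instance type, subject to headroom and availability requirements.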
There are a number of open source
tools that allow organizations to track usage and utilization for Kubernetes
clusters. A monitoring pipeline incorporating both Prometheus and Grafana is a
good starting point. We have covered the deployment of just such a pipeline in
this blog
post. In the post, we first identify the
metrics that are important in the context of Kubernetes, set up the tools
required and then outline the expressions used to monitor those metrics in both
Prometheus and Grafana.
This open source monitoring pipeline
is, however, limited to tracking resource metrics for native Kubernetes
abstractions like namespaces and nodes. Building in a degree of automation,
where clusters can be dynamically re-sized based on utilization metrics, makes
the process even more productive.
The cloud and now Kubernetes have
made it very easy for developers or DevOps teams to spin up resources without
oversight from IT management. This can at times lead to resource provisioning
beyond immediate needs and, in turn, increased costs. In this context, control
over who can provision, read, update and delete resources, and in what
quantities, becomes very important. We have covered RBAC, which is one way to
govern the use of resources, in the previous section.
Kubernetes, however, does provide
additional knobs that allow IT managers and Kubernetes administrators to
control resource provisioning. The following are some of the ways IT managers can
control resource provisioning and consumption for individual namespaces:
- Configure default resource requests
and limits: Ensure that each container created
inside that namespace is allocated default resource requests and
limits when it does not specify its own.
- Configure minimum and maximum
resource limits: Ensure that the resource requests and
limits of each container created inside that namespace do not go below the
minimum or exceed the maximum values defined.
- Configure resource quotas for
Namespaces: Ensure that the total resource consumption of a namespace does not
exceed the value specified.
- Configure quotas for other
Kubernetes objects: Set limits on the total number of Kubernetes objects (Pods,
Persistent Volume Claims, Services) that can exist inside that namespace.
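The controls listed above can be sketched in a few lines of Python. This is a
simplified model under stated assumptions, not the Kubernetes implementation:
values are CPU millicores only, each pod has a single container, and the
field names in LIMIT_RANGE and QUOTA as well as the helper names are
hypothetical. In a real cluster these settings are expressed with the native
LimitRange and ResourceQuota objects.

```python
# Illustrative model of namespace-level controls; values are CPU millicores.
# Field names are simplified and do not mirror the real object schemas.

LIMIT_RANGE = {"default_request": 100, "default_limit": 500, "min": 50, "max": 1000}
QUOTA = {"cpu_requests": 2000, "pods": 10}


def apply_defaults(container, limit_range):
    """Fill in a missing request or limit with the namespace default."""
    return {
        "request": container.get("request", limit_range["default_request"]),
        "limit": container.get("limit", limit_range["default_limit"]),
    }


def within_limits(container, limit_range):
    """Check min <= request <= limit <= max for one container."""
    return (
        limit_range["min"]
        <= container["request"]
        <= container["limit"]
        <= limit_range["max"]
    )


def fits_quota(existing, new, quota):
    """Check namespace totals after adding one single-container pod."""
    total_requests = sum(c["request"] for c in existing) + new["request"]
    pod_count = len(existing) + 1
    return total_requests <= quota["cpu_requests"] and pod_count <= quota["pods"]


def admit(container, existing, limit_range=LIMIT_RANGE, quota=QUOTA):
    """Apply defaults, then enforce per-container limits and the namespace quota."""
    candidate = apply_defaults(container, limit_range)
    return within_limits(candidate, limit_range) and fits_quota(existing, candidate, quota)


if __name__ == "__main__":
    print(admit({}, []))                                # defaults are applied
    print(admit({"request": 100, "limit": 2000}, []))   # limit above the maximum
```

The order matters in this sketch: defaults are applied first, so a container
that specifies nothing is still checked against the limits and counted
against the quota.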
Similar to the open source
monitoring pipeline, these controls are restricted to namespaces and cannot be
applied to custom organizational groupings like teams, clients or departments.
Policy and Compliance
Policies represent rules that govern
how management would like a system to behave. Every enterprise has a set of policies that reflect its
unique requirements around cost management, security, legislative landscape,
tribal knowledge and internal conventions. This is true in the
Kubernetes context too, where IT Managers and Kubernetes administrators require
more control over how Kubernetes is used and how it functions inside the
enterprise.
Once policies have been identified,
they also need to be monitored and enforced as
part of internal compliance requirements.
Kubernetes Admission Webhooks
allow organizations to incorporate
custom governance and compliance policies into their Kubernetes environments.
Admission Webhooks are a type of Admission Controller. Admission Controllers
serve as an additional filter that requests for creating, updating or deleting
Kubernetes resources have to go through. Requests are only allowed after being
checked against the Admission Controllers that are currently enabled.
Admission Webhooks come in two
flavours: Mutating and Validating. Validating Admission Webhooks can only
accept or reject requests
based on whether they conform to custom policies whereas Mutating Admission
Webhooks can also modify requests and enforce default policies.
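The difference between the two flavours can be shown with a small Python
sketch of the decision logic only. The HTTPS server, TLS setup and webhook
registration that a real deployment needs are left out. The response shape
follows the admission.k8s.io/v1 AdmissionReview format, while the policy
itself (every object must carry a team label, defaulting to unassigned) and
the function names are assumptions made for the example.

```python
import base64
import json


def _review(response):
    """Wrap a response dict in an AdmissionReview envelope."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


def validate(review):
    """Validating logic: accept or reject, never modify the object."""
    request = review["request"]
    labels = request["object"]["metadata"].get("labels", {})
    allowed = "team" in labels
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {"message": "objects must carry a 'team' label"}
    return _review(response)


def mutate(review):
    """Mutating logic: allow the request and add a default label via JSON Patch."""
    request = review["request"]
    metadata = request["object"]["metadata"]
    patch = []
    if "labels" not in metadata:
        patch.append(
            {"op": "add", "path": "/metadata/labels", "value": {"team": "unassigned"}}
        )
    elif "team" not in metadata["labels"]:
        patch.append(
            {"op": "add", "path": "/metadata/labels/team", "value": "unassigned"}
        )
    response = {"uid": request["uid"], "allowed": True}
    if patch:
        # The patch is sent as base64-encoded JSON Patch operations.
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return _review(response)


if __name__ == "__main__":
    incoming = {"request": {"uid": "abc-123", "object": {"metadata": {"name": "web"}}}}
    print(json.dumps(validate(incoming), indent=2))
    print(json.dumps(mutate(incoming), indent=2))
```

Because mutating webhooks run before validating ones, the two functions could
be combined in a cluster: the mutating step fills in the default label and the
validating step then rejects anything that still lacks one.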
Kubernetes also provides a set of
standard hard-coded Admission
Controllers that reflect commonly enforced
policies.
The Open Policy Agent, which is a CNCF project, is a great tool that allows
organizations to easily create and enforce custom policies for their Kubernetes
environments.
Cost Allocation, Showback and Chargeback
Budgeting and cost
allocation are important aspects of enterprise
IT environments. Cost allocation allows enterprises to initiate showback and
chargeback procedures as well as compare actual costs to budgeted amounts.
This, in turn, enables ROI analysis and ensures that IT spend drives business
value.
IT governance also revolves around
the concepts of accountability and transparency, both of which are outcomes of
cost allocation. Additionally, cost allocation is important in the context of
multi-tenant clusters, where costs need to be allocated to multiple clients
sharing the Kubernetes cluster.
Kubernetes, by introducing an
additional abstraction layer on top of the already existing cloud or
virtualization layer, makes it hard to correlate the resource consumption of
individual Kubernetes objects with the costs of the underlying infrastructure.
The dashboard outlined here is a good start in visualizing Kubernetes costs. It does,
however, hard-code costs into the dashboard and is therefore not very suitable
for dynamic environments using many different instance and storage types. It
also does not provide any insight into Kubernetes costs for individual teams,
departments or clients.
Tagging is the key to Kubernetes
cost allocation and showback efforts. It is also essential to accountability
and resource governance efforts. Tagging makes resources discoverable
and traceable, reducing the probability of them slipping under the
radar and adding unnecessary costs.
For tags to aid in cost allocation
efforts they need to propagate across both Kubernetes and cloud environments.
We explore the topic of cost
allocation for Kubernetes here where we use
Kops cloudlabels in combination with the AWS billing and cost explorer tools to
allocate costs.
However, as is the case with the
dashboard mentioned above, this process is highly manual and does not lend
itself well to dynamic environments spanning multiple VMs, clusters or cloud
providers.
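To illustrate the idea behind tag-based allocation, here is a minimal Python
sketch that splits the hourly cost of one node across teams in proportion to
the CPU requests of their pods. This is only one possible allocation model and
not the method used in the posts referenced above. The team label key, the
pod data, the node price and the function name allocate_costs are all
hypothetical.

```python
from collections import defaultdict


def allocate_costs(node_hourly_cost, pods, label="team"):
    """Split a node's hourly cost across label values by share of CPU requests.

    Pods without the label are grouped under 'untagged' so that their cost
    stays visible instead of disappearing from the report.
    """
    total_requests = sum(p["cpu_request_m"] for p in pods)
    if total_requests == 0:
        return {}
    costs = defaultdict(float)
    for pod in pods:
        owner = pod["labels"].get(label, "untagged")
        costs[owner] += node_hourly_cost * pod["cpu_request_m"] / total_requests
    return dict(costs)


# Made-up sample data: CPU requests in millicores, cost in dollars per hour.
sample_pods = [
    {"name": "api", "labels": {"team": "payments"}, "cpu_request_m": 1500},
    {"name": "indexer", "labels": {"team": "search"}, "cpu_request_m": 500},
    {"name": "debug", "labels": {}, "cpu_request_m": 2000},
]

if __name__ == "__main__":
    for owner, cost in sorted(allocate_costs(0.40, sample_pods).items()):
        print(f"{owner}: ${cost:.2f}/hour")
```

The untagged bucket is the practical argument for consistent tagging: any
resource without a label ends up as cost that nobody is accountable for.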
Conclusion
As Kubernetes
sees increased enterprise adoption, topics like Security, Governance,
Compliance, Operations and Cost Management take centre stage. A bare-bones
Kubernetes environment, even though feature-rich, falls short when it
comes to these enterprise requirements.
Replex aims to
fill this gap by providing a comprehensive Kubernetes Governance and Cost
Management solution to the modern cloud-native enterprise.