Limitations with current database operators in Kubernetes: Consistency (Part 2)

Tesera
Jul 14, 2021

This is the second entry in the blog series about the current landscape of Kubernetes operators for databases.

As introduced in Part 1, the database ecosystem in Kubernetes is highly fragmented. The lack of unified frameworks and common standards leads developers to reimplement the same abstractions over and over, using different patterns and strategies.

This is reflected on the experience side as well. With hosted solutions, developers use the same interface to deploy any database and do not need much operational knowledge. In contrast, Kubernetes still requires in-house expertise in both the operator and the database.

This consistency gap is why organizations still prefer hosted solutions over native Kubernetes deployments for databases. Expanding on this topic, there are three nuances to consider.

Life cycle

The life cycle is the series of steps a database goes through from creation to deletion. Broadly speaking, we can define four stages: create, update, scale and delete. The main pain points are usually update and scale: due to the lack of common standards, more often than not these actions require a complete rolling update of the cluster, which is prone to errors.

Ideally, an operator should be able to handle the complete life cycle of a database. In practice, we only find partial support. These are some of the most common unimplemented features:

  • Data storage volumes cannot be resized.
  • Clusters cannot be scaled after the initial formation.
  • Not all configuration parameters can be modified.
  • Scale-down is not performed in the correct order.
  • Orphaned PVCs are not deleted when the cluster is removed.

Some of these limitations are a side effect of using a StatefulSet as the underlying Kubernetes resource. Despite the many automations and utilities the StatefulSet provides, it is hard to accommodate the different deployment strategies and nuances of each database, many of which are not cloud native.

For example, Zookeeper performs a rolling update in a specific order (i.e. the leader last) when a new node is added, and Redis requires a manual join command to connect a node to the cluster before it is ready. This could be implemented with custom update strategies or post hooks on the StatefulSet. However, it feels more like a set of complex and brittle patches on top of a black box than natively integrated logic. Since there are no clear strategies and guides for writing these patches, it is up to each developer to come up with their own patterns.
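
As a rough illustration of what these patches look like, below is a minimal sketch of a StatefulSet that disables the built-in rolling update and joins each node to the cluster through a post-start hook. The names, image and join command are placeholders for illustration, not the recipe of any particular operator:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster                # hypothetical name
spec:
  serviceName: redis-cluster
  replicas: 3
  selector:
    matchLabels:
      app: redis-cluster
  updateStrategy:
    type: OnDelete                   # updates are applied by deleting Pods by hand, in a custom order
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
        - name: redis
          image: redis:6.2           # choosing the image is left to the user, not the operator
          ports:
            - containerPort: 6379
          lifecycle:
            postStart:
              exec:
                # join the node to the cluster as soon as the container starts;
                # the exact command and seed address are assumptions for illustration
                command:
                  - sh
                  - -c
                  - redis-cli -h redis-cluster-0.redis-cluster cluster meet "$(hostname -i)" 6379 || true

All of the database-specific knowledge (restart order, join command, readiness semantics) has to be re-encoded by hand around the StatefulSet rather than expressed natively inside it.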

Moreover, as of the current 1.21 version of Kubernetes, the StatefulSet has some important limitations. First, PVC volumes defined on the StatefulSet cannot be resized (ISSUE-68737). Second, a failed StatefulSet update cannot recover on its own and requires manual intervention (ISSUE-67250). Though there are workarounds for these problems, as of today not many operators implement them.
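
For instance, the usual workaround for the first limitation is to expand each generated PVC directly while the volumeClaimTemplates on the StatefulSet stay stale. A minimal sketch, assuming a StorageClass with allowVolumeExpansion enabled and a hypothetical claim name:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # claims created by a StatefulSet follow the pattern <template>-<statefulset>-<ordinal>;
  # this name is a placeholder
  name: data-my-db-0
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ssd              # placeholder; must allow volume expansion
  resources:
    requests:
      storage: 20Gi                  # expanded size; repeat the edit for each replica's claim

Each replica's claim has to be edited one by one, and the template on the StatefulSet keeps advertising the old size.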

Custom Resource Definition

The Custom Resource Definition or CRD is a direct reflection of the lifecycle and capabilities of the operator. It describes the YAML schema that developers can use to interact with it.

Again, there does not seem to be a uniform schema to describe databases. Though most operators use a single CRD for the cluster, they use different patterns to describe computational resources, configurations, topologies, internal resources, etc. Let's look at a couple of examples in more detail.

Configuration:

apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: custom-configuration
spec:
  replicas: 1
  rabbitmq:
    additionalConfig: |
      log.console.level = debug

In the first example, the developer inputs a raw configuration file. It is up to them to know which fields are available and the type, format and purpose of each one.

apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
  name: zk-with-istio
spec:
  replicas: 3
  config:
    initLimit: 10
    tickTime: 2000
    syncLimit: 5
    quorumListenOnAllIPs: true

In the second case, the CRD offers an abstraction for that configuration file and the developer only has to input specific typed parameters.
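
Under the hood, the operator translates those typed fields into the actual configuration file the database reads. A rough sketch of what the rendered artifact might look like (the ConfigMap name and exact layout are assumptions, not the operator's documented output):

apiVersion: v1
kind: ConfigMap
metadata:
  name: zk-with-istio-config         # hypothetical name
data:
  zoo.cfg: |
    # generated by the operator from the typed spec.config fields
    initLimit=10
    tickTime=2000
    syncLimit=5
    quorumListenOnAllIPs=true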

Resources:

There is a similar duality when describing resources and Pod configurations:

apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallationTemplate
metadata:
  name: template-01
spec:
  defaults:
    templates:
      dataVolumeClaimTemplate: data-volumeclaim-template
  templates:
    volumeClaimTemplates:
      - name: data-volumeclaim-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 2Gi
  templating:
    policy: Auto

On the one hand, some operators use the StatefulSet template format to define resources and storage. In the most extreme cases, it is also up to the developer to define the Docker images and readiness probes, details that should already be known facts for the operator rather than left to the user.

apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: production-ready
spec:
  replicas: 3
  resources:
    requests:
      cpu: 4
      memory: 10Gi
  persistence:
    storageClassName: ssd
    storage: 500Gi

On the other hand, other operators provide an abstraction layer, and the developer only needs to input specific fields for the deployment.

This split is repeated over and over across the ecosystem: operators that are just thin logic layers on top of a StatefulSet, exposing many arbitrary values, versus operators that provide full abstractions and encode much more in-house knowledge and best practices about the database. The latter model provides an experience closer to a hosted solution.

Workflows

Besides the basic life cycle of the database, there are other actions related to the business and operational side of an organization that need to be considered:

  • Deploying an operator across multiple namespaces.
  • Handling any sensitive field as a Kubernetes Secret (see the sketch after this list).
  • Monitoring and insights from the cluster.
  • In-transit and at-rest encryption.
  • Autoscaling.
  • And more…
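
As an illustration of the second point, the Kubernetes-native pattern is to keep credentials in a Secret and have the custom resource reference it by name instead of carrying plaintext fields. A minimal sketch with made-up names (how each operator consumes the Secret varies by CRD):

apiVersion: v1
kind: Secret
metadata:
  name: my-db-credentials            # hypothetical name, referenced from the database custom resource
type: Opaque
stringData:
  username: admin                    # placeholder values
  password: change-me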

Though these actions do not interfere with the normal execution of the database, they define how the database coordinates with and extends into other parts of the organization.

With the advent of Kubernetes and the Cloud Native Computing Foundation, some of these workflows are being standardized around common protocols: for example, Prometheus for monitoring or a service mesh for in-transit encryption. However, enabling and configuring them still requires implementation on the operator side.
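
For example, once an operator exposes a metrics endpoint, hooking it into a Prometheus Operator installation comes down to a ServiceMonitor. A sketch assuming a Service labeled app: my-db that exposes a port named metrics (both names are assumptions):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-db-metrics                # hypothetical name
spec:
  selector:
    matchLabels:
      app: my-db                     # must match the labels on the database Service
  endpoints:
    - port: metrics                  # name of the port exposing Prometheus metrics
      interval: 30s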

Conclusion

Developers still rely solely on StatefulSets to create database operators. Due to limitations in their design and the lack of common frameworks and best practices, each operator ends up with its own patterns, utilities and workflows.

Rather than offering a user experience akin to hosted solutions with full abstractions, many operators only offer thin layers of helper logic that require expertise in both the database and the operator itself.

In this fragmented and complex ecosystem, organizations need to invest in a knowledgeable DevOps team to understand each operator, its capabilities and how it fits into the company's existing security and deployment workflows. Once again, despite using Kubernetes, the process of database deployment and management becomes centralized and siloed within the organization, unlike that of their stateless counterparts.

It is interesting that, although Kubernetes standardized a consistent execution model for stateless containers, it has also made a fragmented model of database deployment possible.

What’s next?

Part 3: Limitations in StatefulSets.

This post was originally published at Tesera (https://tesera.io).
