Ceph storage for Kubernetes

Tags: Ceph, Storage


Opposites attract.  Stateful and stateless.

Storage and container management systems are almost polar opposites.  One deals with permanently storing and protecting data for as long as it’s needed.  The other automatically manages highly dynamic workloads, scaling resources up and down as required.

More organisations are taking a container-first approach to application deployment and management, but the underlying challenge of safely and securely storing data remains the same.  Any storage system needs to be able to protect against hardware failure and maintain the availability of an organisation’s most important asset – its data.

Data growth happens alarmingly quickly: it is estimated that more than 2,500 petabytes (PB) of new data are created every day.  Thankfully, this is distributed across many organisations, so no one has to deal with this kind of growth on their own, yet.  Scale-out storage systems like Ceph are a great fit for storage growth in an organisation of any size.  By adding more nodes to a cluster you add not just capacity but also more compute power, so that performance and capacity scale hand in hand.
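As a rough illustration of how capacity grows with node count, here is a minimal Python sketch. It assumes a replicated pool that keeps three copies of every object (the usual default for Ceph replicated pools). The node counts and disk sizes are hypothetical, and the estimate ignores the free-space headroom a real cluster needs for recovery.

```python
def usable_capacity_tb(nodes: int, disks_per_node: int, disk_tb: float,
                       replicas: int = 3) -> float:
    """Estimate usable capacity (TB) of a cluster using replicated pools.

    Raw capacity grows linearly with the number of nodes; usable capacity
    is the raw capacity divided by the number of copies kept of each object.
    """
    raw_tb = nodes * disks_per_node * disk_tb
    return raw_tb / replicas


if __name__ == "__main__":
    # Hypothetical hardware: 4 disks of 10 TB in every node.
    for nodes in (3, 6, 12):
        usable = usable_capacity_tb(nodes, disks_per_node=4, disk_tb=10.0)
        print(f"{nodes} nodes: {usable} TB usable")
```

Doubling the node count doubles the usable capacity, and each added node also brings its own CPU, memory and network bandwidth to the cluster.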

Creating data takes time and effort too.  Some datasets simply cannot be recreated if they are lost – for example, photos or videos, medical records or financial transactions.  The significance of a loss differs across datasets – losing a photo is less critical than losing a credit to your bank account.  There are also regulatory requirements to consider: medical data, for example, may need to be preserved for as long as 100 years.

How do we bring the two together?

A container management system such as Kubernetes uses the Container Storage Interface (CSI) to integrate with external storage systems for its block and file-based storage needs.

This provides two interfaces for interaction:

  • A control plane that is used for storage management – specifically the creation, allocation and reclamation of volumes.
  • A data plane used for high speed parallel access to data stored in the storage system.
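From the application side, the control plane is typically exercised by submitting a PersistentVolumeClaim. The sketch below builds one as a Python dictionary and prints it as JSON, a format that `kubectl apply -f` accepts alongside YAML. The claim name `demo-data` and the storage class name `ceph-rbd` are hypothetical placeholders, not names taken from this article.

```python
import json


def make_pvc(name: str, storage_class: str, size: str) -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            # Hypothetical class name; it must match a class defined in the cluster.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


pvc = make_pvc("demo-data", "ceph-rbd", "10Gi")
print(json.dumps(pvc, indent=2))
```

Once the claim is bound to a volume, a pod that mounts it reads and writes through the data plane, without involving the control plane for each I/O.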

Kubernetes also provides storage classes, each served by a CSI driver, so that underlying storage types can be mapped to classes based on their performance capabilities.  For certain workloads it might be important that the data is stored on the fastest SSD-based disks, but for others, like an archive, the data could be stored on less expensive, capacity-oriented NL-SAS or SATA disks.
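To make the mapping concrete, the sketch below generates one StorageClass manifest per performance tier. The provisioner string is the one Rook commonly registers for Ceph block storage. The tier names, pool names and `clusterID` value are assumptions for illustration, and a working class would need further driver-specific parameters (such as secret references) that are omitted here.

```python
import json

# Hypothetical mapping of performance tier -> Ceph pool backing it.
TIER_POOLS = {
    "fast": "ssd-pool",     # assumed pool placed on SSDs
    "archive": "hdd-pool",  # assumed pool placed on capacity disks
}


def make_storage_class(tier: str, pool: str) -> dict:
    """Build a (deliberately incomplete) StorageClass manifest for one tier."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": f"ceph-{tier}"},
        "provisioner": "rook-ceph.rbd.csi.ceph.com",
        "parameters": {
            "clusterID": "rook-ceph",  # assumption: depends on the deployment
            "pool": pool,
        },
        "reclaimPolicy": "Delete",
    }


classes = [make_storage_class(tier, pool) for tier, pool in TIER_POOLS.items()]
print(json.dumps(classes, indent=2))
```

A workload then selects a tier simply by naming the class in its volume claim, without needing to know which disks sit behind it.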

Ceph: a secure, scalable and trusted storage solution

From research organisations such as the Wellcome Sanger Institute and CERN to global telecom giants like Deutsche Telekom and BT, thousands of organisations the world over have turned to Ceph to provide reliable and scalable storage for their needs.

Alongside the scaling and resiliency features of Ceph, another significant advantage is that it provides interfaces for multiple storage types within a single cluster.  This eliminates the need for multiple storage solutions or specialised hardware and reduces management overheads.

Try out Ceph for Kubernetes

If you want to quickly test a robust, production-ready storage solution that integrates well with Kubernetes, try MicroCeph. MicroK8s is the easiest and fastest way to get Kubernetes up and running. You can experiment with the latest upstream features, toggle services on and off, and seamlessly move your work from dev to production.

In the latest MicroK8s release (1.28), the rook-ceph addon was included to allow for easy integration with an external Ceph cluster.  In this guide we deploy a 3-node MicroCeph cluster, deploy MicroK8s, and then integrate the two to create a powerful compute and storage cluster.
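The overall flow can be sketched as an ordered command plan. The Python below only builds and prints the plan and does not run anything. The host names and the disk path are hypothetical, the join token is a placeholder for the value printed by the preceding `cluster add` step, and placing MicroK8s on the first MicroCeph node is an assumption of this sketch. The linked guide remains the authoritative source for the exact steps.

```python
from typing import List, Tuple


def deployment_plan(nodes: List[str], disk: str) -> List[Tuple[str, str]]:
    """Return (host, command) pairs for a MicroCeph + MicroK8s test setup."""
    first, others = nodes[0], nodes[1:]

    # Install MicroCeph everywhere, then form the cluster from the first node.
    plan = [(node, "sudo snap install microceph") for node in nodes]
    plan.append((first, "sudo microceph cluster bootstrap"))
    for node in others:
        plan.append((first, f"sudo microceph cluster add {node}"))
        plan.append((node, "sudo microceph cluster join <token from previous step>"))

    # Give every node a disk to store data on (this wipes the disk).
    plan += [(node, f"sudo microceph disk add {disk} --wipe") for node in nodes]

    # Install MicroK8s and connect it to the Ceph cluster via the addon.
    plan.append((first, "sudo snap install microk8s --classic --channel=1.28/stable"))
    plan.append((first, "sudo microk8s enable rook-ceph"))
    plan.append((first, "sudo microk8s connect-external-ceph"))
    return plan


plan = deployment_plan(["node1", "node2", "node3"], disk="/dev/sdb")
for host, command in plan:
    print(f"[{host}] {command}")
```

Keeping the plan as data makes it easy to review before running anything by hand on real machines.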

Learn more

In this whitepaper, we discuss using Ceph in a containerised environment in much more detail.  You can learn about the different types of storage systems, the different access methods used to store and retrieve data, the way that data can be protected, more detail on how the two systems are interconnected, and a detailed introduction to Ceph.

Additional resources

What is Ceph?

Ceph is a software-defined storage (SDS) solution designed to address the object, block, and file storage needs of both small and large data centres.

It's an optimised and easy-to-integrate solution for companies adopting open source as the new norm for high-growth block storage, object stores and data lakes.

Learn more about Ceph ›

How to optimise your cloud storage costs

Cloud storage is amazing: it's on demand and ready to go in a couple of clicks. But is it the most cost-effective approach for large, predictable data sets?

In our whitepaper, learn how to understand the true costs of storing data in a public cloud, and how open source Ceph can provide a cost-effective alternative.

Access the whitepaper ›


Interested in running Ubuntu in your organisation? Talk to us today

A guide to software-defined storage for enterprises

In our whitepaper, explore how Ceph can replace proprietary storage systems in the enterprise.

Access the whitepaper ›

