The Wellcome Sanger Institute: sharing genomic research worldwide securely with supported Ceph

This article was last updated 2 years ago.


A world-leading genomic research centre, the Wellcome Sanger Institute uses advanced DNA sequencing technology for large-scale studies that surpass the capabilities of many other organisations. Among other projects, the Institute is currently heading the UK-wide Darwin Tree of Life Project to map the genetic code of 60,000 complex species. It is also working with expert groups across Britain to analyse the genetic code of COVID-19 samples, helping public health agencies to combat this now widespread virus.

For advanced research, genomic scientists need access to vast amounts of data, and they need to share that data with other scientists worldwide securely and reliably. To meet this data storage and retrieval challenge, the Institute opted for Ceph on Ubuntu as an on-premises solution offering superior robustness and scalability. Authorised users, both internal and external to the Institute, can store and retrieve any volume of data from any location via the S3 protocol.
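As a rough illustration of what storing and retrieving data over the S3 protocol can look like, here is a minimal Python sketch. Ceph exposes an S3-compatible API through its Object Gateway, so a standard S3 client library can talk to it. The article does not describe the Institute's actual tooling: the use of boto3, the endpoint URL, the bucket and object names, and the credentials below are all hypothetical placeholders.

```python
"""Sketch: storing and retrieving a file through an S3-compatible endpoint.

Assumptions (not from the article): boto3 is the client library, and the
endpoint URL, bucket, key and credentials are hypothetical placeholders.
"""
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Endpoint:
    """Connection details for an S3-compatible service (hypothetical values)."""
    url: str
    access_key: str
    secret_key: str


def make_client(endpoint: S3Endpoint):
    """Build a boto3 S3 client pointed at a non-AWS, S3-compatible endpoint."""
    import boto3  # third-party; imported lazily so the helpers below work without it

    return boto3.client(
        "s3",
        endpoint_url=endpoint.url,
        aws_access_key_id=endpoint.access_key,
        aws_secret_access_key=endpoint.secret_key,
    )


def store(client, bucket: str, key: str, local_path: str) -> None:
    """Upload a local file as object `key` in `bucket`."""
    client.upload_file(local_path, bucket, key)


def retrieve(client, bucket: str, key: str, local_path: str) -> None:
    """Download object `key` from `bucket` to a local file."""
    client.download_file(bucket, key, local_path)


# Example usage (placeholders only; requires boto3 and a reachable endpoint):
#   client = make_client(S3Endpoint("https://s3.example.org", "ACCESS", "SECRET"))
#   store(client, "shared-data", "run42/reads.cram", "reads.cram")
#   retrieve(client, "shared-data", "run42/reads.cram", "reads_copy.cram")
```

Because only the endpoint URL and credentials differ, the same client code would work from inside or outside the campus network, which is what makes the S3 protocol convenient for sharing with external collaborators.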

Dr Matthew Vernon, Principal Systems Administrator at the Institute, says, “We have about 55 petabytes of data on the campus of which now about 20 petabytes or so are stored in our Ceph clusters. A lot of that is data that we want to share securely with collaborators via our S3 service. We were looking for support and stability for the infrastructure that our scientists could rely upon.”

After evaluating a range of providers, the Wellcome Sanger Institute chose Canonical for Ceph support.

As Matthew explains, “We have quite a lot of Ceph expertise on site, but we needed someone with really in-depth knowledge of the system. We saw that Canonical could provide this expertise for us and we were already using Ubuntu for the operating system and the Ceph packages. So, that made it natural to look at Canonical as our support provider.”

With the IT infrastructure at the Wellcome Sanger Institute a key factor in pushing back the boundaries of science, Dr Peter Clapham, Informatics Support Group Team Leader, says, “With Canonical, we have a platform in place for meeting leading edge requirements, ensuring resilience, and making sure that as it grows, the Institute has a provider that can grow with it and its support needs.” He adds, “We’ve engaged with Canonical for the confidence that we’re not just meeting challenges from today, but that we’re also looking to the future and the continuity of our technical solutions.”

To see more about how the Institute is meeting new pharmaceutical and technical goals, watch the full interview below. Alternatively, learn more in the case study.


What is Ceph?

Ceph is a software-defined storage (SDS) solution designed to address the object, block, and file storage needs of both small and large data centres.

It's an optimised and easy-to-integrate solution for companies adopting open source as the new norm for high-growth block storage, object stores and data lakes.



