The monitor node only keeps track of the state of the cluster, a task that is much easier to scale than running a 'registry' for data storage and retrieval itself.

It's important to keep in mind that the Ceph monitor node does not store or process any metadata about the data stored on the cluster. It only keeps track of the CRUSH map for both clients and individual storage nodes. Data always flows directly from the storage node towards the client and vice versa.

## Ceph Scalability

A storage client will contact the appropriate storage node directly to store or retrieve data. There are no components in between, except for the network, which you will need to size accordingly.

Because there are no intermediate components or proxies that could potentially create a bottleneck, a Ceph cluster can really scale horizontally in both capacity and performance.

And while scaling storage and performance, data is protected by redundancy.
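To make the split between the monitor and the data path more concrete, here is a toy sketch in Python. Everything in it is hypothetical: the `Monitor`, `StorageNode` and `Client` classes are illustration only and are not Ceph APIs, and the placement function (CRC32 modulo the node count) is a deliberately naive stand-in, not the CRUSH algorithm. The only point is that the client asks the monitor for the cluster map once and then talks to storage nodes directly.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Hypothetical monitor: hands out the cluster map, never sees object data."""
    node_names: list[str]
    map_requests: int = 0

    def get_map(self) -> list[str]:
        self.map_requests += 1
        return list(self.node_names)


@dataclass
class StorageNode:
    """Hypothetical storage node: a named bucket of objects."""
    name: str
    objects: dict[str, bytes] = field(default_factory=dict)


class Client:
    def __init__(self, monitor: Monitor, nodes: dict[str, StorageNode]):
        self.nodes = nodes
        # The map is fetched once and cached; reads and writes never go
        # through the monitor.
        self.cluster_map = monitor.get_map()

    def _pick(self, key: str) -> StorageNode:
        # Naive stand-in placement (assumption): NOT the CRUSH algorithm.
        index = zlib.crc32(key.encode()) % len(self.cluster_map)
        return self.nodes[self.cluster_map[index]]

    def put(self, key: str, data: bytes) -> None:
        self._pick(key).objects[key] = data  # direct client-to-node write

    def get(self, key: str) -> bytes:
        return self._pick(key).objects[key]  # direct node-to-client read


nodes = {name: StorageNode(name) for name in ("node-a", "node-b", "node-c")}
monitor = Monitor(node_names=list(nodes))
client = Client(monitor, nodes)

for i in range(100):
    client.put(f"object-{i}", b"payload")

print("monitor requests:", monitor.map_requests)
print("objects per node:", {n.name: len(n.objects) for n in nodes.values()})
```

After a hundred writes, the monitor has still been contacted only once, which is the property the text describes: the monitor tracks cluster state and stays out of the data path.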
Ceph manages redundancy in software, which is why a hardware RAID controller is not required. A Ceph storage node at its core is more like a JBOD. The hardware is simple and 'dumb'; all the intelligence resides in software.

This means that the risk of hardware vendor lock-in is considerably reduced. You are not tied to any particular proprietary hardware.

## What makes Ceph special?

At the heart of the Ceph storage cluster is the CRUSH algorithm, developed by Sage Weil, the co-creator of Ceph.

The CRUSH algorithm allows storage clients to calculate which storage node needs to be contacted for retrieving or storing data. The storage client can, on its own, determine what to do with data or where to get it.

So to reiterate: given a particular state of the storage cluster, the client can calculate which storage node to contact for storage or retrieval of data.

This is special because there is no centralised 'registry' that keeps track of the location of data on the cluster (metadata). Such a registry can become a performance bottleneck, preventing further expansion.

Ceph does away with this concept of a centralised registry for data storage and retrieval. This is why Ceph can scale in capacity and performance while assuring availability.

At the core of the CRUSH algorithm is the CRUSH map. That map contains information about the storage nodes in the cluster. It is the basis for the calculations the storage client needs to perform in order to decide which storage node to contact.

This CRUSH map is distributed across the cluster from a special server: the 'monitor' node. Regardless of the size of the Ceph storage cluster, you typically need just three (3) monitor nodes for the whole cluster. Those nodes are contacted by both the storage nodes and the storage clients.

So Ceph does have some kind of centralised 'registry', but it serves a totally different purpose.
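The idea that a client can calculate the location of data from nothing but a map of the cluster can be illustrated with a small Python sketch. To be clear, this is not the CRUSH algorithm: it uses rendezvous (highest-random-weight) hashing as a much simpler stand-in that shares the key property, namely that any client holding the same map computes the same answer without asking a registry. The `locate` function, the node names and the object name are all made up for the example.

```python
import hashlib


def _score(object_name: str, node: str) -> int:
    """Deterministic pseudo-random score for an (object, node) pair."""
    digest = hashlib.sha256(f"{object_name}:{node}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def locate(object_name: str, cluster_map: list[str], replicas: int = 1) -> list[str]:
    """Return the nodes that should hold the object, best score first.

    Stand-in for CRUSH (assumption): rendezvous hashing over a flat list
    of node names. The real CRUSH map is hierarchical and weighted.
    """
    ranked = sorted(
        cluster_map,
        key=lambda node: _score(object_name, node),
        reverse=True,
    )
    return ranked[:replicas]


# Hypothetical cluster map, as a client might receive it.
cluster_map = ["node-a", "node-b", "node-c", "node-d"]

placement = locate("photo.jpg", cluster_map, replicas=2)
print(placement)
```

Two clients with the same `cluster_map` will always get the same `placement` for the same object name, so neither needs to look anything up before contacting a storage node.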
The basic building block of a Ceph storage cluster is the storage node. These storage nodes are just commodity (COTS) servers containing a lot of hard drives and/or flash storage.

Ceph is meant to scale, and you scale by adding additional storage nodes. You will need multiple servers to satisfy your capacity, performance and resiliency requirements. And as you expand the cluster with extra storage nodes, capacity, performance and resiliency (if needed) will all increase at the same time.

You don't need to start with petabytes of storage. You can actually start very small, with just a few storage nodes, and expand as your needs increase.

I want to touch upon a technical detail because it illustrates the mindset surrounding Ceph. With Ceph, you don't even need a RAID controller anymore; a 'dumb' HBA is sufficient.
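As a rough illustration of how capacity grows when you add storage nodes, here is a back-of-the-envelope calculation in Python. The numbers are invented for the example (12 drives of 4 TB per node), and it assumes plain three-way replication with no other overhead; a real cluster's usable capacity depends on its redundancy settings and on how full you are willing to run it.

```python
def usable_capacity_tb(nodes: int, drives_per_node: int, drive_tb: float,
                       replicas: int = 3) -> float:
    """Usable capacity in TB, assuming every object is stored `replicas` times."""
    raw_tb = nodes * drives_per_node * drive_tb
    return raw_tb / replicas


# Hypothetical hardware: 12 drives of 4 TB in each storage node.
for node_count in (3, 6, 12):
    usable = usable_capacity_tb(node_count, drives_per_node=12, drive_tb=4.0)
    print(f"{node_count:>2} nodes -> {usable:.0f} TB usable")
```

Under these assumptions, doubling the number of nodes doubles the usable capacity, which is what starting small and expanding later relies on.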
In this blog post I will try to explain why I believe Ceph is such an interesting storage solution. After you have finished reading this blog post, you should have a good high-level overview of Ceph. I've written this blog post purely because I'm a storage enthusiast and I find Ceph an interesting technology.

## What is Ceph?

Ceph is a software-defined storage solution that can scale both in performance and capacity. Ceph is used to build multi-petabyte storage clusters. For example, CERN has built a 65-petabyte Ceph storage cluster.