Ceph Storage Cluster on CentOS/RHEL 7.x

If you are familiar with cloud platforms, you have probably already come across Ceph Object Storage and/or Ceph Block Device.
A Ceph Storage Cluster requires at least one Ceph Monitor and at least two Ceph OSD Daemons. Ceph stores a client’s data as objects within storage pools. Using the CRUSH algorithm, Ceph calculates which placement group should contain the object, and further calculates which Ceph OSD Daemon should store the placement group. The CRUSH algorithm enables the Ceph Storage Cluster to scale, rebalance, and recover dynamically.

Components:

  • Ceph OSDs: A Ceph OSD Daemon (Ceph OSD) stores data, handles data replication, recovery, backfilling, rebalancing, and provides some monitoring information to Ceph Monitors by checking other Ceph OSD Daemons for a heartbeat. A Ceph Storage Cluster requires at least two Ceph OSD Daemons to achieve an active + clean state when the cluster makes two copies of your data (Ceph makes 3 copies by default, but you can adjust it).
  • Monitors: A Ceph Monitor maintains maps of the cluster state, including the monitor map, the OSD map, the Placement Group (PG) map, and the CRUSH map. Ceph maintains a history (called an “epoch”) of each state change in the Ceph Monitors, Ceph OSD Daemons, and PGs.
  • MDSs: A Ceph Metadata Server (MDS) stores metadata on behalf of the Ceph Filesystem (i.e., Ceph Block Devices and Ceph Object Storage do not use MDS). Ceph Metadata Servers make it feasible for POSIX file system users to execute basic commands like ls, find, etc. without placing an enormous burden on the Ceph Storage Cluster.
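
Once the cluster built in the steps below is up, you can watch CRUSH at work with a single command. This is only an illustrative lookup: the pool name rbd is the default block-device pool and myobject is an arbitrary, made-up object name.

# Ask the cluster which placement group and OSDs an object would map to
ceph osd map rbd myobject
# example output -- epoch, pg and OSD ids will differ in your cluster:
# osdmap e65 pool 'rbd' (0) object 'myobject' -> pg 0.5f3a9c84 (0.4) -> up ([4,7], p4) acting ([4,7], p4)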

 

Easy steps to set up your first Ceph Storage Cluster using LXC containers and bare metal:

Install the OS on 7 machines with the following roles:
1 LXC container for the Ceph Admin node
3 LXC containers for Ceph Monitors
3 bare-metal machines for Ceph OSDs

 

Create DNS records for all involved containers and bare-metal machines
Use your existing DNS servers for this, or add entries to /etc/hosts on every node.
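
If you go the /etc/hosts route, a minimal sketch could look like this. The monitor addresses match the 192.168.7.0/24 public network used later in this guide; the admin and storage addresses are made up for the example and must be replaced with your own.

# /etc/hosts -- distribute to every node taking part in the cluster
192.168.7.100   cephadmin01    # example address for the admin container
192.168.7.101   cephmon01
192.168.7.102   cephmon02
192.168.7.103   cephmon03
192.168.7.111   cephstor01     # example addresses for the OSD machines
192.168.7.112   cephstor02
192.168.7.113   cephstor03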

 

Install python-pip
It is needed by Ceph on all nodes.

for i in {1..3}; do ssh root@cephmon0$i "yum -y install python-pip"; done && \
    for i in {1..3}; do ssh root@cephstor0$i "yum -y install python-pip"; done

 

Configure passwordless login on all nodes
ssh-keygen -t dsa && \
    for i in {1..3}; do ssh-copy-id root@cephmon0$i; ssh-copy-id root@cephstor0$i; done

 

Preconfigure Ceph Admin node
mkdir /root/ceph && cd /root/ceph && \
yum -y install python-pip && \
easy_install -U pip && \
pip install -U ceph-deploy

 

Define the new cluster with its monitors
Pass all three monitors in a single call, since ceph-deploy new writes the initial ceph.conf:

ceph-deploy new cephmon01 cephmon02 cephmon03

 

Make sure you use only 2 replicas for a test cluster
Append (do not overwrite) these settings to the ceph.conf generated in the previous step:

tee -a ceph.conf << extra
osd pool default size = 2
public network = 192.168.7.0/24
extra
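
After appending these two lines, the ceph.conf in /root/ceph should look roughly like this (the fsid and monitor addresses are taken from the sample output further down and will be different in your cluster):

[global]
fsid = a1e89c56-3b7d-4771-a555-b583b41461bd
mon_initial_members = cephmon01, cephmon02, cephmon03
mon_host = 192.168.7.101,192.168.7.102,192.168.7.103
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 2
public network = 192.168.7.0/24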

 

Install Ceph on all nodes
for i in {1..3}; do ceph-deploy install --release jewel cephmon0$i cephstor0$i; done

 

Initialize monitors
ceph-deploy mon create-initial


Zap, prepare and activate OSDs on the Ceph Storage nodes (zapping wipes vdb, vdc and vdd, so make sure these disks hold nothing you need)
for i in {1..3}; do ceph-deploy disk zap cephstor0$i:vdb cephstor0$i:vdc cephstor0$i:vdd ; done

for i in {1..3}; do ceph-deploy osd prepare cephstor0$i:vdb cephstor0$i:vdc cephstor0$i:vdd ; done

for i in {1..3}; do ceph-deploy osd activate cephstor0$i:/dev/vdb1:/dev/vdb2 cephstor0$i:/dev/vdc1:/dev/vdc2 cephstor0$i:/dev/vdd1:/dev/vdd2 ; done

 

Add a RADOS Gateway (RGW) on each Ceph Monitor node
for i in {1..3}; do ceph-deploy rgw create cephmon0$i ; done
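
With the Jewel release, ceph-deploy configures the gateway to use the embedded civetweb server on port 7480 by default. A quick sanity check, assuming the gateways came up cleanly, is an anonymous request against any of them:

curl http://cephmon01:7480
# should return an empty S3 bucket listing, e.g.:
# <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult ...>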


Push the configuration and admin keyring to all nodes

for i in {1..3}; do ceph-deploy admin cephadmin01 cephmon0$i cephstor0$i ; done

 

Check your newly created Ceph Storage Cluster
ceph status && ceph osd tree

You should see output similar to this:

[root@cephmon01 ~]# ceph status
    cluster a1e89c56-3b7d-4771-a555-b583b41461bd
     health HEALTH_OK
     monmap e3: 3 mons at {cephmon01=192.168.7.101:6789/0,cephmon02=192.168.7.102:6789/0,cephmon03=192.168.7.103:6789/0}
            election epoch 10, quorum 0,1,2 cephmon01,cephmon02,cephmon03
     osdmap e65: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v293: 304 pgs, 7 pools, 848 bytes data, 170 objects
            321 MB used, 404 GB / 404 GB avail
                 304 active+clean

[root@cephmon01 ~]# ceph osd tree
ID WEIGHT  TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.39510 root default
-2 0.13170     host cephstor01
 0 0.04390         osd.0            up  1.00000          1.00000
 1 0.04390         osd.1            up  1.00000          1.00000
 2 0.04390         osd.2            up  1.00000          1.00000
-3 0.13170     host cephstor02
 3 0.04390         osd.3            up  1.00000          1.00000
 4 0.04390         osd.4            up  1.00000          1.00000
 5 0.04390         osd.5            up  1.00000          1.00000
-4 0.13170     host cephstor03
 6 0.04390         osd.6            up  1.00000          1.00000
 7 0.04390         osd.7            up  1.00000          1.00000
 8 0.04390         osd.8            up  1.00000          1.00000

 

How to create an RBD device

Creating a 5 GB RADOS Block Device (RBD) image:

rbd create test_ceph -s 5120 --image-feature layering

rbd map test_ceph

mkfs.xfs /dev/rbd0

mkdir test_rados_block_device ; mount /dev/rbd0 test_rados_block_device

 

Here is how it should look:

[root@cephmon01 ~]# df -h
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/cl-root  8.0G  3.0G  5.1G  38% /
/dev/rbd0              5G   33M    5G   1% /root/test_rados_block_device

[root@cephmon01 ~]# rbd info test_ceph
rbd image 'test_ceph':
        size 5120 MB in 1280 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.233b7238e1f29
        format: 2
        features: layering
        flags:

 

Now you can start using your new RBD drive.
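
When you are done testing, the block device can be torn down again. A rough sketch, run on the node where the image is mapped:

umount /root/test_rados_block_device && \
    rbd unmap /dev/rbd0 && \
    rbd rm test_ceph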
