:_mod-docs-content-type: PROCEDURE

[id="migrating-ceph-mds_{context}"]
= Migrating {Ceph} MDS to new nodes within the existing cluster

You can migrate the MDS daemon when {rhos_component_storage_file_first_ref}, deployed with either a cephfs-native or ceph-nfs back end, is part of the overcloud deployment. The MDS migration is performed by `cephadm`, and you move the placement of the daemons from a hosts-based approach to a label-based approach.
ifeval::["{build}" != "upstream"]
This ensures that you can visualize the status of the cluster and where the daemons are placed by using the `ceph orch host` command. You can also get a general view of how the daemons are co-located within a given host, as described in the Red Hat Knowledgebase article https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations].
endif::[]
ifeval::["{build}" != "downstream"]
This ensures that you can visualize the status of the cluster and where the daemons are placed by using the `ceph orch host` command, and get a general view of how the daemons are co-located within a given host.
endif::[]

.Prerequisites

* Complete the tasks in your {rhos_prev_long} {rhos_prev_ver} environment. For more information, see xref:red-hat-ceph-storage-prerequisites_configuring-network[{Ceph} prerequisites].

.Procedure

. Verify that the {CephCluster} cluster is healthy and check the MDS status:
+
----
$ sudo cephadm shell -- ceph fs ls
name: cephfs, metadata pool: manila_metadata, data pools: [manila_data ]

$ sudo cephadm shell -- ceph mds stat
cephfs:1 {0=mds.controller-2.oebubl=up:active} 2 up:standby

$ sudo cephadm shell -- ceph fs status cephfs

cephfs - 0 clients
======
RANK  STATE            MDS                ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  mds.controller-2.oebubl  Reqs:    0 /s   696    196    173      0
      POOL          TYPE     USED  AVAIL
manila_metadata  metadata   152M   141G
  manila_data      data    3072M   141G
    STANDBY MDS
mds.controller-0.anwiwd
mds.controller-1.cwzhog
----
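+
The commands above report only the file system and MDS state. If you also want to confirm the overall health of the {CephCluster} cluster before you continue, you can optionally run a general status check, for example:
+
----
$ sudo cephadm shell -- ceph -s
$ sudo cephadm shell -- ceph health detail
----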
. Retrieve more detailed information on the Ceph File System (CephFS) MDS status:
+
----
$ sudo cephadm shell -- ceph fs dump

e8
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 1

Filesystem 'cephfs' (1)
fs_name cephfs
epoch   5
flags   12 joinable allow_snaps allow_multimds_snaps
created 2024-01-18T19:04:01.633820+0000
modified        2024-01-18T19:04:05.393046+0000
tableserver     0
root    0
session_timeout 60
session_autoclose       300
max_file_size   1099511627776
required_client_features        {}
last_failure    0
last_failure_osd_epoch  0
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in      0
up      {0=24553}
failed
damaged
stopped
data_pools      [7]
metadata_pool   9
inline_data     disabled
balancer
standby_count_wanted    1
[mds.mds.controller-2.oebubl{0:24553} state up:active seq 2 addr [v2:172.17.3.114:6800/680266012,v1:172.17.3.114:6801/680266012] compat {c=[1],r=[1],i=[7ff]}]

Standby daemons:

[mds.mds.controller-0.anwiwd{-1:14715} state up:standby seq 1 addr [v2:172.17.3.20:6802/3969145800,v1:172.17.3.20:6803/3969145800] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.controller-1.cwzhog{-1:24566} state up:standby seq 1 addr [v2:172.17.3.43:6800/2227381308,v1:172.17.3.43:6801/2227381308] compat {c=[1],r=[1],i=[7ff]}]
dumped fsmap epoch 8
----

. Check the OSD blocklist and clean up the client list:
+
----
$ sudo cephadm shell -- ceph osd blocklist ls
..
..
$ for item in $(sudo cephadm shell -- ceph osd blocklist ls | awk '{print $1}'); do
      sudo cephadm shell -- ceph osd blocklist rm $item;
  done
----
+
[NOTE]
====
When a file system client is unresponsive or misbehaving, its access to the file system might be forcibly terminated. This process is called eviction. Evicting a CephFS client prevents it from communicating further with MDS daemons and OSD daemons. Ordinarily, a blocklisted client cannot reconnect to the servers; you must unmount and then remount the client. However, permitting a client that was evicted to attempt to reconnect can be useful. Because CephFS uses the RADOS OSD blocklist to control client eviction, you can permit CephFS clients to reconnect by removing them from the blocklist.
====

. Get the hosts that are currently part of the {Ceph} cluster:
+
----
[ceph: root@controller-0 /]# ceph orch host ls
HOST                        ADDR           LABELS          STATUS
cephstorage-0.redhat.local  192.168.24.25  osd
cephstorage-1.redhat.local  192.168.24.50  osd
cephstorage-2.redhat.local  192.168.24.47  osd
controller-0.redhat.local   192.168.24.24  _admin mgr mon
controller-1.redhat.local   192.168.24.42  mgr _admin mon
controller-2.redhat.local   192.168.24.37  mgr _admin mon
6 hosts in cluster
----

. Apply the MDS labels to the target nodes:
+
----
$ for item in $(sudo cephadm shell -- ceph orch host ls --format json | jq -r '.[].hostname'); do
      sudo cephadm shell -- ceph orch host label add $item mds;
  done
----
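+
The loop above adds the `mds` label to every host in the cluster; you remove the label from the Controller nodes later in this procedure. If you prefer to label only specific target nodes, you can optionally add the label host by host instead. The host name in the following example is taken from this cluster:
+
----
$ sudo cephadm shell -- ceph orch host label add cephstorage-0.redhat.local mds
----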
. Verify that all the hosts have the MDS label:
+
----
$ sudo cephadm shell -- ceph orch host ls

HOST                        ADDR           LABELS
cephstorage-0.redhat.local  192.168.24.11  osd mds
cephstorage-1.redhat.local  192.168.24.12  osd mds
cephstorage-2.redhat.local  192.168.24.47  osd mds
controller-0.redhat.local   192.168.24.35  _admin mon mgr mds
controller-1.redhat.local   192.168.24.53  mon _admin mgr mds
controller-2.redhat.local   192.168.24.10  mon _admin mgr mds
----

. Dump the current MDS spec:
+
----
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export mds > ${SPEC_DIR}/mds
----

. Edit the retrieved spec and replace the `placement.hosts` section with `placement.label`:
+
----
service_type: mds
service_id: mds
service_name: mds.mds
placement:
  label: mds
----

. Use the Ceph orchestrator to apply the new MDS spec:
+
----
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mds -- ceph orch apply -i /mnt/mds
Scheduling new mds deployment ...
----
+
This results in an increased number of MDS daemons.

. Check the new standby daemons that are temporarily added to the CephFS file system:
+
----
$ sudo cephadm shell -- ceph fs dump

Active

standby_count_wanted    1
[mds.mds.controller-0.awzplm{0:463158} state up:active seq 307 join_fscid=1 addr [v2:172.17.3.20:6802/51565420,v1:172.17.3.20:6803/51565420] compat {c=[1],r=[1],i=[7ff]}]

Standby daemons:

[mds.mds.cephstorage-1.jkvomp{-1:463800} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/2075903648,v1:172.17.3.135:6821/2075903648] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.controller-2.gfrqvc{-1:475945} state up:standby seq 1 addr [v2:172.17.3.114:6800/2452517189,v1:172.17.3.114:6801/2452517189] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.cephstorage-0.fqcshx{-1:476503} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.controller-1.tyiziq{-1:499136} state up:standby seq 1 addr [v2:172.17.3.43:6800/3615018301,v1:172.17.3.43:6801/3615018301] compat {c=[1],r=[1],i=[7ff]}]
----

. To migrate the MDS daemons to the target nodes, set the MDS affinity that manages the MDS failover:
+
[NOTE]
You can elect a dedicated MDS as "active" for a particular file system. To configure this preference, CephFS provides a configuration option for MDS called `mds_join_fs`, which enforces this affinity. When failing over MDS daemons, the cluster monitors prefer standby daemons with `mds_join_fs` equal to the file system name of the failed rank. If no standby exists with `mds_join_fs` equal to the file system name, the monitors choose an unqualified standby as a replacement.
+
----
$ sudo cephadm shell -- ceph config set mds.mds.cephstorage-0.fqcshx mds_join_fs cephfs
----
+
Replace `mds.mds.cephstorage-0.fqcshx` with the daemon deployed on `cephstorage-0` that you retrieved in the previous step.

. Remove the MDS labels from the Controller nodes and force the MDS failover to the target node:
+
----
$ for i in 0 1 2; do sudo cephadm shell -- ceph orch host label rm "controller-$i.redhat.local" mds; done

Removed label mds from host controller-0.redhat.local
Removed label mds from host controller-1.redhat.local
Removed label mds from host controller-2.redhat.local
----
+
The switch to the target node happens in the background.
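+
If you want to follow the failover while it completes, you can optionally poll the file system status, for example:
+
----
$ sudo cephadm shell -- ceph fs status cephfs
----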
+
The new active MDS is the daemon that you configured by setting the `mds_join_fs` option.

. Check the result of the failover and the newly deployed daemons:
+
----
$ sudo cephadm shell -- ceph fs dump
…
…

standby_count_wanted    1
[mds.mds.cephstorage-0.fqcshx{0:476503} state up:active seq 168 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]

Standby daemons:

[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.cephstorage-1.jkvomp{-1:499760} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/452139733,v1:172.17.3.135:6821/452139733] compat {c=[1],r=[1],i=[7ff]}]

$ sudo cephadm shell -- ceph orch ls

NAME      PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
crash            6/6      10m ago    10d  *
mds.mds          3/3      10m ago    32m  label:mds

$ sudo cephadm shell -- ceph orch ps | grep mds

mds.mds.cephstorage-0.fqcshx  cephstorage-0.redhat.local  running (79m)  3m ago  79m  27.2M  -  17.2.6-100.el9cp  1af7b794f353  2a2dc5ba6d57
mds.mds.cephstorage-1.jkvomp  cephstorage-1.redhat.local  running (79m)  3m ago  79m  21.5M  -  17.2.6-100.el9cp  1af7b794f353  7198b87104c8
mds.mds.cephstorage-2.gnfhfe  cephstorage-2.redhat.local  running (79m)  3m ago  79m  24.2M  -  17.2.6-100.el9cp  1af7b794f353  f3cb859e2a15
----

ifeval::["{build}" != "downstream"]
.Useful resources

* https://docs.ceph.com/en/reef/cephfs/eviction[cephfs - eviction]
* https://docs.ceph.com/en/reef/cephfs/standby/#configuring-mds-file-system-affinity[ceph mds - affinity]
endif::[]