{"id":35048,"date":"2018-10-02T09:00:52","date_gmt":"2018-10-02T07:00:52","guid":{"rendered":"https:\/\/nolabnoparty.com\/?p=35048"},"modified":"2018-10-04T13:17:54","modified_gmt":"2018-10-04T11:17:54","slug":"replace-failed-host-vsan-cluster","status":"publish","type":"post","link":"https:\/\/nolabnoparty.com\/en\/replace-failed-host-vsan-cluster\/","title":{"rendered":"Replace a failed host in a vSAN cluster"},"content":{"rendered":"<p><img decoding=\"async\" class=\"aligncenter wp-image-35050 size-full\" title=\"replace-failed-host-vsan-cluster-01\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/09\/replace-failed-host-vsan-cluster-01.jpg\" alt=\"replace-failed-host-vsan-cluster-01\" width=\"602\" height=\"202\" \/><\/p>\n<p>If an ESXi host that is a member of the <a href=\"https:\/\/nolabnoparty.com\/en\/virtual-san-2-node-cluster-installtion-robo-pt1\/\">vSAN cluster<\/a> fails for any reason,\u00a0you should replace the failed host as soon as possible to <strong>avoid data loss<\/strong>.<\/p>\n<p>A host may fail, or its ESXi installation may become corrupted, <strong>disrupting the services it provides<\/strong>. If the vSAN cluster is well designed, data stored on the vSAN datastore remains available even if a host fails, but <strong>data integrity and availability<\/strong> cannot be guaranteed if a second host also fails.<\/p>\n<p>Failed hardware must be replaced and ESXi reinstalled as soon as possible, leaving the disks used by vSAN untouched (unless a disk is the failed component, of course) to\u00a0<strong>preserve the logical structure<\/strong>.<!--more--><\/p>\n<p>&nbsp;<\/p>\n<h2>Replace the failed host<\/h2>\n<p>Once the failed hardware has been replaced\u00a0or ESXi reinstalled, power on the host. Select the vSAN cluster and go to the\u00a0<strong>Monitor &gt; vSAN<\/strong>\u00a0area to check the <strong>health status<\/strong> of the vSAN. 
The\u00a0<strong>State<\/strong>\u00a0of the replaced host should be reported as\u00a0<strong>Abnormal<\/strong>\u00a0if a fresh installation was required to restore the host's functionality. Clearly, <strong>something is wrong<\/strong> in the vSAN cluster.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-35076 size-large\" title=\"replace-failed-host-vsan-cluster-02\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/10\/replace-failed-host-vsan-cluster-02-600x428.jpg\" alt=\"replace-failed-host-vsan-cluster-02\" width=\"600\" height=\"428\" \/><\/p>\n<p>&nbsp;<\/p>\n<h4>Check vSAN cluster status<\/h4>\n<p>To figure out what's going on with the replaced host, you have to work at the host level and check the cluster status using\u00a0<strong>esxcli<\/strong>\u00a0commands. Enable the <strong>SSH service<\/strong> on the replaced ESXi host and log in with the\u00a0<em>root<\/em> credentials.<\/p>\n<p>From the console, run the following command to <strong>get information about the vSAN cluster<\/strong>:<\/p>\n<p><span style=\"color: #0000ff;\"># esxcli vsan cluster get<\/span><\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-35106 size-large\" title=\"replace-failed-host-vsan-cluster-03\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/10\/replace-failed-host-vsan-cluster-03-600x64.jpg\" alt=\"replace-failed-host-vsan-cluster-03\" width=\"600\" height=\"64\" \/><\/p>\n<p>As expected, the new host is <strong>not a member of any vSAN cluster<\/strong>\u00a0because the fresh installation procedure deleted all configuration settings. If an up-to-date\u00a0<strong>backup of the host's configuration<\/strong> is available, you can save a lot of time when restoring the configuration.<\/p>\n<p>&nbsp;<\/p>\n<h4>Join the replaced host to the vSAN cluster<\/h4>\n<p>To join the host to the vSAN cluster, you need to know the UUID the cluster is using. 
<strong>SSH to a working ESXi<\/strong>\u00a0member of the vSAN cluster and run the following <em>esxcli<\/em> command to <strong>retrieve the UUID<\/strong>:<\/p>\n<p><span style=\"color: #0000ff;\"># esxcli vsan cluster get<\/span><\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-35108 size-large\" title=\"replace-failed-host-vsan-cluster-04\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/10\/replace-failed-host-vsan-cluster-04-600x330.jpg\" alt=\"replace-failed-host-vsan-cluster-04\" width=\"600\" height=\"330\" \/><\/p>\n<p>Once the <strong>UUID has been identified<\/strong>, write it down (or simply copy it) and go back to the console of the replaced host. To <strong>join the new host<\/strong> to the vSAN cluster, run the following command from the replaced host's console:<\/p>\n<p><span style=\"color: #0000ff;\"># esxcli vsan cluster join -u &lt;UUID&gt;<\/span><\/p>\n<p>where <em>UUID<\/em> is the value you noted previously.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-35110 size-large\" title=\"replace-failed-host-vsan-cluster-05\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/10\/replace-failed-host-vsan-cluster-05-600x69.jpg\" alt=\"replace-failed-host-vsan-cluster-05\" width=\"600\" height=\"69\" \/><\/p>\n<p>When the process has completed, run the following command once again to get the <strong>vSAN cluster info<\/strong>:<\/p>\n<p><span style=\"color: #0000ff;\"># esxcli vsan cluster get<\/span><\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-35112 size-large\" title=\"replace-failed-host-vsan-cluster-06\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/10\/replace-failed-host-vsan-cluster-06-600x316.jpg\" alt=\"replace-failed-host-vsan-cluster-06\" width=\"600\" height=\"316\" \/><\/p>\n<p>The new ESXi host <strong>is now a member<\/strong> of the vSAN cluster.<\/p>\n<p>From the vSphere Web Client, select the vSAN cluster and go to the <strong>Monitor &gt; 
vSAN<\/strong> section. Click the <strong>Retest<\/strong> button to check the health status.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-35114 size-large\" title=\"replace-failed-host-vsan-cluster-07\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/10\/replace-failed-host-vsan-cluster-07-600x315.jpg\" alt=\"replace-failed-host-vsan-cluster-07\" width=\"600\" height=\"315\" \/><\/p>\n<p>Some errors related to <strong>data availability<\/strong> are still reported. To fix the problem, click the <strong>Repair Objects Immediately<\/strong> button.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-35116 size-large\" title=\"replace-failed-host-vsan-cluster-08\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/10\/replace-failed-host-vsan-cluster-08-600x446.jpg\" alt=\"replace-failed-host-vsan-cluster-08\" width=\"600\" height=\"446\" \/><\/p>\n<p>The <strong>Health<\/strong> status looks much better now and the <a href=\"https:\/\/nolabnoparty.com\/en\/runecast-analyzer-1-7-vsan-support\/\">vSAN datastore<\/a> is operating properly.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-35118 size-large\" title=\"replace-failed-host-vsan-cluster-09\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/10\/replace-failed-host-vsan-cluster-09-600x287.jpg\" alt=\"replace-failed-host-vsan-cluster-09\" width=\"600\" height=\"287\" \/><\/p>\n<p>In the <strong>Datastores<\/strong> area, the vSAN <strong>Status<\/strong> is reported as <strong>Normal<\/strong>.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-35120 size-large\" title=\"replace-failed-host-vsan-cluster-10\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/2018\/10\/replace-failed-host-vsan-cluster-10-600x242.jpg\" alt=\"replace-failed-host-vsan-cluster-10\" width=\"600\" height=\"242\" \/><\/p>\n<p>Make sure to have a <a 
href=\"https:\/\/docs.vmware.com\/en\/VMware-vSphere\/6.5\/com.vmware.vsphere.virtualsan.doc\/GUID-57575456-0AD9-4655-9D6B-58509C1DF33C.html\" target=\"_blank\" rel=\"noopener\">robust design<\/a>\u00a0for your vSAN cluster and a good <strong>backup strategy<\/strong> in place to avoid data loss in case the vSAN cannot be restored.<\/p>\n<p><img decoding=\"async\" title=\"signature\" src=\"https:\/\/nolabnoparty.com\/wp-content\/uploads\/images\/firma.jpg\" alt=\"signature\" \/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If an ESXi host that is a member of the vSAN cluster fails for any reason,\u00a0you should replace the failed host as soon as possible to avoid data loss. A host may fail, or its ESXi installation may become corrupted, disrupting the services it provides. If the vSAN cluster is well designed, data stored on the vSAN datastore remains available even if a host fails, but data integrity and availability cannot be guaranteed if a second host also fails. 
Failed hardware must be replaced and the ESXi reinstalled as soon as possible leaving the disks used in the vSAN <\/p>\n","protected":false},"author":3,"featured_media":35050,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"rop_custom_images_group":[],"rop_custom_messages_group":[],"rop_publish_now":"initial","rop_publish_now_accounts":{"linkedin_93tdZWzMZc_93tdZWzMZc":"","facebook_2879994398731222_17841400390232720":"","twitter_113568041_113568041":"","mastodon_115463926174894442_115463926174894442":""},"rop_publish_now_history":[],"rop_publish_now_status":"pending","footnotes":""},"categories":[903,1853],"tags":[28,1778,1678,1664],"class_list":["post-35048","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-vmware-en","category-vsan-en","tag-cluster","tag-host","tag-replace","tag-vsan","has_thumb"],"_links":{"self":[{"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/posts\/35048","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/comments?post=35048"}],"version-history":[{"count":0,"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/posts\/35048\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/media\/35050"}],"wp:attachment":[{"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/media?parent=35048"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/categories?post=35048"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nolabnoparty.com\/en\/wp-json\/wp\/v2\/tags?post=35048"}],"curies":[{"name":"wp",
"href":"https:\/\/api.w.org\/{rel}","templated":true}]}}