The other day I was upgrading my Isilon cluster to OneFS 7.0.1.8. Since I wanted to minimize customer impact, I elected to do a rolling restart, something I’ve done several times before without problems. Isilon has a feature in SmartConnect Advanced that allows the cluster to dynamically move IP addresses between nodes. For my upgrade, that means when a node reboots, its IP address moves to a different node and my clients don’t notice the outage. The entire upgrade process usually takes about 10 minutes per node.
About three nodes into the rolling reboot, the node I was running the upgrade from lost network connectivity because of an unrelated problem. This stopped my upgrade process, leaving the cluster running two different versions of OneFS. The fix was actually pretty simple: I had to restart my upgrade process. To do that, I first had to stop the upgrade service, which was still running on the nodes.
[box]
Isilon-n1# isi services -a |grep upgrade
isi_upgrade_d Upgrade Daemon Enabled
Isilon-n1# isi services -a isi_upgrade_d disable
Isilon-n1# isi services -a |grep upgrade
isi_upgrade_d Upgrade Daemon Disabled
[/box]
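If you'd rather script the check, disable, and re-check sequence, here is a minimal sketch. It assumes `isi services -a` prints the `isi_upgrade_d` line in the same shape as the output above, with the state as the last field. The function names `parse_upgrade_state` and `stop_upgrade_daemon` are hypothetical helpers of mine, not part of OneFS, and the only `isi` invocations used are the two shown above.

```shell
# Sketch only: both function names are hypothetical helpers, not OneFS commands.
# Assumes the isi_upgrade_d line ends with its state (Enabled or Disabled).

# Read `isi services -a` output on stdin and print the state of isi_upgrade_d.
parse_upgrade_state() {
    awk '$1 == "isi_upgrade_d" { print $NF }'
}

# Disable the upgrade daemon if it is currently enabled, then re-check it.
# Returns success only when the daemon reports Disabled at the end.
stop_upgrade_daemon() {
    state=$(isi services -a | parse_upgrade_state)
    if [ "$state" = "Enabled" ]; then
        isi services -a isi_upgrade_d disable
        state=$(isi services -a | parse_upgrade_state)
    fi
    echo "isi_upgrade_d is ${state:-unknown}"
    [ "$state" = "Disabled" ]
}
```

Running `stop_upgrade_daemon` on a node should leave the daemon disabled, and the non-zero return code makes it easy to bail out before re-running the upgrade if the daemon is somehow still enabled.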
Once this service was disabled, I was able to re-run my upgrade process, and the nodes that were already at the new version of code were left unaffected. All in all, while the upgrade delayed some work I needed to do around the house, the problem was a small hiccup.