Tuesday, June 12, 2007

VMware: The dreadful sticky snapshot

When you work with VMware ESX's snapshot technology and/or VCB for a while, you will notice that sometimes a snapshot doesn't get committed correctly. This happens, for instance, when an ESX host crashes or hangs during a VCB snapshot.

The problem here is that a redo file (also called a delta or snapshot file) is created, but never removed. When you look at the snapshot manager, no snapshots appear to be available. Snapshots are devastating for your datastore performance, so we must remove them as quickly as possible! In the worst case, the redo file and the vmware-x.log files will eat up all the remaining disk space on the affected datastore and a message will be displayed (AAAaaaaaaaaaahh).

When browsing the datastore, you will notice a number of <originaldisk>-00000X.vmdk and <originaldisk>-00000X-delta.vmdk files. When you look at your Virtual Machine disk properties (via Edit Settings), you will see that the disk name has changed from <originaldisk>.vmdk to <originaldisk>-00000X.vmdk, meaning a snapshot version is being used.
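If you prefer the Service Console over the datastore browser, the check above can be sketched as a small shell function. This is only a rough sketch: the function name list_snapshot_disks is made up for this example, and it assumes the snapshot files use a six-digit sequence number as in the names shown above.

```shell
#!/bin/sh
# List snapshot (redo) disk files in a single VM directory.
# Assumption: files are named <originaldisk>-00000X.vmdk and
# <originaldisk>-00000X-delta.vmdk, with a six-digit sequence number.
# list_snapshot_disks is a hypothetical helper name, not a VMware tool.
list_snapshot_disks() {
    vm_dir="$1"   # e.g. /vmfs/volumes/<datastorename>/<vmname>
    find "$vm_dir" -maxdepth 1 -type f \
        \( -name '*-[0-9][0-9][0-9][0-9][0-9][0-9].vmdk' \
        -o -name '*-[0-9][0-9][0-9][0-9][0-9][0-9]-delta.vmdk' \) | sort
}
```

Call it as list_snapshot_disks /vmfs/volumes/<datastorename>/<vmname>. If it prints anything while the snapshot manager shows no snapshots, you are probably looking at a sticky snapshot.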

You can use 2 tricks to recover from this scenario:

  • Execute vmware-cmd /vmfs/volumes/<datastorename>/<vmname>/<vmname>.vmx removesnapshots on the Service Console. However, most of the time this will not work and you will have to fall back on the method below.
  • Manually create a snapshot in the VI Client and remove it once it has been created. This will remove ALL existing snapshots, including the sticky one, and switch the VM back to <originaldisk>.vmdk. Great huh?
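The second trick can probably be tried from the Service Console as well. The sketch below is an assumption on my part, not something the steps above confirm: it assumes vmware-cmd on ESX 3.x accepts createsnapshot with the arguments name, description, quiesce and memory (check the help output of vmware-cmd on your host first). The function name commit_sticky_snapshot and the snapshot name cleanup are made up for this example. By default it only prints the commands; nothing is executed.

```shell
#!/bin/sh
# Dry run by default: RUN=echo only prints the commands.
# Set RUN="" in the environment to really execute them on the host.
RUN="${RUN-echo}"

# commit_sticky_snapshot is a hypothetical helper name.
commit_sticky_snapshot() {
    vmx="$1"   # full path to the VM's .vmx file
    # Create a throwaway snapshot (assumed argument order:
    # name, description, quiesce=0, memory=0).
    $RUN vmware-cmd "$vmx" createsnapshot cleanup "temporary snapshot" 0 0
    # Remove all snapshots, which should also commit the sticky redo file.
    $RUN vmware-cmd "$vmx" removesnapshots
}
```

Run commit_sticky_snapshot /vmfs/volumes/<datastorename>/<vmname>/<vmname>.vmx to see the two commands it would issue. If the console route fails, the VI Client method above is still the one to use.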

1 comment:

Unknown said

Pretty good trick! I ran into a similar issue and used CONVERTER to get out of it! I wrote a quick note about it @ http://www.ipmer.com/2008/05/70gb-snapshot-yikes.html

Thanks for the post.
Carlo.