Monday, 19 December 2016

REPORTED ISSUES WHEN ESXi RUNS OUT OF VMFS HEAP SPACE


The VMFS heap is a portion of the host's physical memory that the kernel reserves for file handling on VMFS volumes. The heap holds pointers to the data blocks of VMDK files on VMFS volumes. A host can run out of VMFS3 heap space when a large amount of virtual disk capacity (.vmdk files) is open on a single ESXi/ESX host.

The default VMFS3.MaxHeapSizeMB settings and the corresponding maximum open VMDK capacity are:
  • The default heap size in ESXi/ESX 3.5/4.0 for VMFS-3 is 16 MB. This allows for a maximum of 4 TB of open virtual disk capacity.
  • The default heap size was increased to 80 MB in ESXi/ESX 4.1 and ESXi 5.x, which allows for 8 TB to 10 TB of open virtual disk capacity on a single ESXi/ESX host.
  • The default heap size was further increased to 640 MB in ESXi 5.0 patch ESXi500-201303001, which should allow for 60 TB of open virtual disk capacity on a single ESXi/ESX host.
  • vSphere 5.5 supports a maximum heap size of 256 MB and enables hosts to access the entire address space of a 64 TB VMFS datastore.
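
To make the relationship between heap size and open virtual disk capacity concrete, here is a minimal Python sketch that checks a host's open VMDK capacity against the limits quoted in the list above. The function name and the example host are hypothetical. Where the text gives a range (8 TB to 10 TB), the sketch assumes the lower bound, and the 256 MB setting is left out because the list gives no open-capacity figure for it.

```python
# Heap size (MB) -> open virtual disk capacity (TB), as quoted in the list above.
# Assumption: where the text gives a range (8 TB to 10 TB), the lower bound is used.
HEAP_TO_OPEN_CAPACITY_TB = {
    16: 4,    # ESXi/ESX 3.5/4.0 default
    80: 8,    # ESXi/ESX 4.1 and ESXi 5.x default
    640: 60,  # default after ESXi 5.0 patch ESXi500-201303001
}


def open_capacity_headroom_tb(heap_size_mb, open_vmdk_sizes_tb):
    """Return the remaining open-VMDK capacity in TB (negative means over the limit)."""
    limit_tb = HEAP_TO_OPEN_CAPACITY_TB[heap_size_mb]
    return limit_tb - sum(open_vmdk_sizes_tb)


if __name__ == "__main__":
    # Hypothetical host: 80 MB heap with five open 2 TB virtual disks.
    print(open_capacity_headroom_tb(80, [2, 2, 2, 2, 2]))  # prints -2
```

A negative result means the host has more virtual disk capacity open than the quoted limit for that heap size, which is the situation in which the errors described in this post tend to appear.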

Heap depletion can have serious side effects in an environment. A variety of issues have been reported, and the symptoms depend on the action being performed. The most commonly reported issues are listed below; “Cannot allocate memory” appears to be the most common error when the heap is full.

This is a known issue on ESXi 5.5 Update 3, ESXi 6.0 GA, ESXi 6.0u1, and ESXi 6.0u1a.

Reported issues:
  • Multiple virtual machines shut down suddenly.
  • Virtual machines run more slowly than usual or become unresponsive.
  • Virtual machines cannot be powered on.
  • When you try to manually power on a migrated virtual machine, you may see the error: The VM failed to resume on the destination during early power on.
    Reason: 0 (Cannot allocate memory).
  • Adding a VMDK to a virtual machine running on an ESXi/ESX host where the VMFS-3 heap is maxed out fails.
  • Migrating or Storage vMotioning a virtual machine to a destination ESXi/ESX host on which the VMFS-3 heap is maxed out fails.
  • Cloning a virtual machine using the vmkfstools -i command fails and you see the error: Clone: 43% done. Failed to clone disk: Cannot allocate memory (786441)
  • Initialization of a VMFS disk (terabytes in size) fails.
  • Write errors reported in Windows guests when copying large amounts of data from and to large VMDK files
  • Errors running Microsoft Exchange Server Jetstress
  • Windows reporting lack of storage capacity.
  • Issues reported in the vSphere Client when performing storage migrations (Storage vMotion).
    A Storage vMotion of a large VMDK fails at 10%. A Storage vMotion of a smaller VMDK completes with no issues.
    Relocate virtual machine <virtual machine name>. A general system error occurred: Storage VMotion failed to copy one or more of the VM’s disks. Please consult the VM’s log for more details, looking for lines starting with “SVMotion”.
    “Failed to create one or more destination disks. Canceling Storage vMotion.
    Storage vMotion failed to create the destination disk /vmfs/volumes/…. (Cannot allocate memory).”
  • Issues while performing a backup. Backup applications mount VMDK files, so each large VMDK that is mounted during a backup job adds to the open capacity, and the heap size maximum can be reached.
  • A corresponding event appears in vSphere Tasks and Events.
  • In /vmfs/volumes/DatastoreName/VirtualMachineName/vmware.log you will see “Cannot allocate memory” errors such as the following:
2016-12-16T17:14:58.522Z| vcpu-5| W110: A core file is available in "/vmfs/volumes/569ffae2-bb128d7a-f26b-00110a6aa6c8/VMNAME/vmx-zdump.001"
2016-12-16T17:14:58.522Z| vcpu-5| W110: Writing monitor corefile "/vmfs/volumes/569ffae2-bb128d7a-f26b-00110a6aa6c8/VMNAME/vmmcores.gz"
2016-12-16T17:14:58.572Z| vcpu-5| W110: CoreDump error line 2160, error Cannot allocate memory
2016-12-16T17:14:58.572Z| vcpu-5| I120: Backtrace:
2016-12-16T17:14:58.572Z| vcpu-5| I120: Backtrace[0] 000003ffe3e422a0 rip=00000000255d104e rbx=00000000255d0b30 rbp=000003ffe3e422c0 r12=0000000000000000 r13=00000000325f4620 r14=000003ffe3e429ac r15=000003ffe3e429a0
  • You will find traces like the following in vmkernel.log:
2016-10-14T03:05:07.037Z cpu14:12491419)UserDump: 1907: Dumping cartel 12491419 (from world 12491419) to file /vmfs/volumes/5721ac4d-6e24ee9a-9627-00110a6633e8/VMNAME/vmx-zdump.000 ...
2016-10-14T03:05:07.187Z cpu13:49413)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x93 (0x439dc1202f80, 12491419) to dev "naa.600508b1001c7d402733d0abc9ef2e1a" on path "vmhba1:C0:T0:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2016-10-14T03:05:13.119Z cpu0:12491419)UserDump: 2031: Userworld coredump complete.
  • An attempt to open or use a console session may show the following error: “Unable to connect to the MKS: Virtual Machine Config File Does Not Exist”
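
Since “Cannot allocate memory” is the common thread in the log excerpts above, a quick way to confirm the symptom is to scan a copy of vmware.log or vmkernel.log for that string. The following is a minimal Python sketch, not a VMware tool; the function name is hypothetical, and it assumes each log line starts with an ISO timestamp as in the excerpts above.

```python
import re

# Matches the leading ISO timestamp seen in the vmware.log and vmkernel.log excerpts.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z)")
NEEDLE = "Cannot allocate memory"


def find_allocation_errors(lines):
    """Return (timestamp, line) pairs for every line that mentions the error string."""
    hits = []
    for line in lines:
        if NEEDLE in line:
            match = TIMESTAMP.match(line)
            # Timestamp is None when a line does not start with one.
            hits.append((match.group(1) if match else None, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Two lines taken from the vmware.log excerpt above.
    sample = [
        '2016-12-16T17:14:58.522Z| vcpu-5| W110: Writing monitor corefile '
        '"/vmfs/volumes/569ffae2-bb128d7a-f26b-00110a6aa6c8/VMNAME/vmmcores.gz"',
        "2016-12-16T17:14:58.572Z| vcpu-5| W110: CoreDump error line 2160, "
        "error Cannot allocate memory",
    ]
    for stamp, text in find_allocation_errors(sample):
        print(stamp, text)
```

To scan a real file, pass an open file object instead of the sample list, for example `find_allocation_errors(open("vmware.log"))` on a copy of the log.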
  
Solution:

This issue is resolved in:
ESXi 5.5 Update 3b
ESXi 6.0 Update 1b and later

Alternatively, follow the steps below to raise VMFS3.MaxHeapSizeMB to its maximum of 256 MB:
1.     Log into vCenter Server or the ESXi/ESX host using the vSphere Client. If connecting to vCenter Server, select the ESXi/ESX host from the inventory.
2.     Under the Configuration tab, click Advanced Settings.
3.     Click VMFS3.
4.     Set the VMFS3.MaxHeapSizeMB field to 256.
5.     Reboot the ESXi/ESX host for the changes to take effect.
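
The same setting can usually be changed from the host's command line instead of the vSphere Client. The Python sketch below only builds the command string so it can be reviewed before being run in an ESXi shell. It assumes the `esxcli system settings advanced set` command and the `/VMFS3/MaxHeapSizeMB` option path are available on your ESXi version; the helper name is hypothetical, and the lower bound of 1 is a sanity check rather than a documented minimum. A reboot is still required afterwards, as in step 5.

```python
MAX_HEAP_SIZE_MB = 256  # maximum quoted in the steps above


def build_set_heap_command(size_mb=MAX_HEAP_SIZE_MB):
    """Build the esxcli argument list that sets VMFS3.MaxHeapSizeMB."""
    # Assumption: 1 is only a sanity-check lower bound, not a documented minimum.
    if not 1 <= size_mb <= MAX_HEAP_SIZE_MB:
        raise ValueError(f"heap size must be between 1 and {MAX_HEAP_SIZE_MB} MB")
    return [
        "esxcli", "system", "settings", "advanced", "set",
        "-o", "/VMFS3/MaxHeapSizeMB",
        "-i", str(size_mb),
    ]


if __name__ == "__main__":
    # Print the command for review; run it manually on the host, then reboot.
    print(" ".join(build_set_heap_command()))
```

Running the sketch prints `esxcli system settings advanced set -o /VMFS3/MaxHeapSizeMB -i 256`.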

Note: VMware has confirmed that the heap size and maximum virtual disk capacity per host apply to VMFS only. The heap size constraint does not apply to RDMs or to NFS datastores. For VMFS-5 datastores, the default heap size becomes 640 MB, which supports 60 TB of open VMDK files per host.