
VMware Best Practice Guide


This document illustrates recommended configurations for various aspects of a VMware environment used in tandem with a StorTrends SAN. Examples are given within particular sections to further illustrate how a certain feature or configuration benefits the user. The StorTrends family of SANs utilizes a variety of product integrations and performance tuning mechanisms to enhance and simplify the overall performance and manageability of a virtualized environment.


To get the most out of the information found within this document, readers should have a good understanding of the following topics:

  • Configuration and operation of VMware vSphere
  • Configuration and operation of the StorTrends SAN
  • Operating systems such as Windows and various Linux distributions

Additional Resources

In addition to the recommendations found within this document, the following guides provide recommendations for specific applications within the virtualized environment:

Intended Audience

This guide is intended for IT Managers, Solutions Architects and Server Administrators with the desire to implement a StorTrends SAN within their current, or a new, VMware environment.

NOTE: The information found in this guide is intended to give general recommendations that should apply to a majority of VMware environments. In some cases, these recommendations may not be applicable. That determination should be based on individual and business requirements.

Host Based Connectivity

The StorTrends family of SANs offers iSCSI connectivity at both 1 GbE and 10 GbE. The recommendations below will help users optimize the connections between their StorTrends SAN and their VMware environment.

Networking Recommendations

Within each VMware ESXi host, there should be separate port adapters for management and iSCSI traffic purposes. If there are multiple iSCSI ports within the VMware ESXi hosts that are under the same subnet, these ports should be added to the ‘Network Port Binding’ list for each host.

NOTE: For more information on port binding, see VMware KB article 2038869, which goes into further detail on the subject.

In order to add these ports to that list:

  • Go to the specific VMware ESXi host within your vCenter and go under the ‘Manage’ tab.
  • From this tab, go to the ‘Storage’ sub-tab and choose ‘Storage Adapters’.
  • Find the iSCSI Software Adapter (usually at the end of the list).
  • Under the ‘Adapter Details’ section, find ‘Network Port Binding’ and click on the green ‘+’ to bring up a list of available VMkernel adapters.

NOTE: Do NOT add management port adapters to this list.
NOTE: VMware vSphere versions 4.x, 5.0, 5.1, 5.5 and 6.0 do not support routing for any port being used for port binding. For this reason, be sure that the StorTrends SAN and all iSCSI VMkernel ports are on the same subnet and are able to communicate directly with each other.
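For hosts managed from the command line, the same binding can be applied with esxcli. This is a sketch; the adapter and VMkernel names (vmhba33, vmk1, vmk2) are hypothetical placeholders, so substitute the values reported by your own host:

```shell
# Bind two iSCSI VMkernel ports to the software iSCSI adapter.
# vmhba33, vmk1 and vmk2 are hypothetical names - list yours first with:
#   esxcli iscsi adapter list
#   esxcli network ip interface list
esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk1
esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2

# Verify the bindings
esxcli iscsi networkportal list --adapter=vmhba33
```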


Jumbo Frames

Jumbo frames, while not required, are recommended for environments with high throughput requirements. Jumbo frames must be enabled on the StorTrends SAN, on the switch, and on each individual VMware ESXi host.

In order to enable this on the StorTrends SAN:

  • Login to the ManageTrends UI and click on the ‘Control Panel’ for the Left Controller found on the left-hand navigation tree.
  • Under the ‘Network’ section, click on ‘TCP-IP’.
  • Find the interface that will have jumbo frames enabled from the dropdown and check the jumbo frame status to the right of the dropdown. If the status shows a gray ‘X’, click the ‘Enable’ button to enable jumbo frames for that interface.

In order to enable this on the VMware ESXi host:

  • Login to vCenter and navigate to the host in question. Click on the ‘Manage’ tab and then click on the ‘Networking’ sub-tab.
  • Under the ‘Virtual switches’ option, find the proper vSwitch from the list and click on the pencil icon to change its settings. Change the MTU value from 1500 to 9000.
  • Under the ‘VMkernel adapters’ option, find the VMkernel that correlates to the vSwitch that was modified above and click the pencil icon to change its settings. Go under ‘NIC settings’ and change the MTU to 9000.
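On hosts managed from the command line, the same MTU change can be sketched with esxcli. The vSwitch, VMkernel, and portal address values below (vSwitch1, vmk1, 10.0.0.50) are hypothetical placeholders:

```shell
# Set a 9000-byte MTU on the vSwitch and on its iSCSI VMkernel port.
# vSwitch1 and vmk1 are hypothetical names - substitute your own.
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
esxcli network ip interface set --interface-name=vmk1 --mtu=9000

# Confirm end-to-end jumbo frame support by pinging the SAN without fragmentation
# (10.0.0.50 is a placeholder for a StorTrends iSCSI portal address)
vmkping -d -s 8972 10.0.0.50
```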

In order to enable jumbo frames for your switch, consult the switch’s user manual.

Teaming Options and MCS/MPIO

There are a few options for increasing the available throughput of the environment as a whole, both on the StorTrends SAN side and on the individual VMs (if the data drives are presented through the VM’s iSCSI initiator).

On the StorTrends SAN side, teaming the physical NICs of each controller can give a boost to throughput capabilities. There are a few options for teaming including Round Robin, Adaptive Load Balancing, and Link Aggregation Control Protocol (802.3ad). To gain more knowledge on these options, take a look at the StorTrends Alias Configuration Guide.

For individual VMs, Multiple Connections per Session (MC/S) or Multipath I/O (MPIO) can be helpful protocols for increasing throughput of the VM. The main difference between the two protocols is that MC/S creates multiple connections within the same iSCSI session whereas MPIO creates multiple paths from physical connections. For more information on both protocols, and to see how to implement these protocols in Windows environments, take a look at the StorTrends MPIO/MCS Comparison Guide.

Round Robin Multipathing

Round robin multipathing is a VMware ESXi parameter that will create both redundancy and increased throughput for iSCSI connections. This setting needs to be applied on a per target basis. Below are steps to follow in order to do this manually:

  • Login to vCenter and navigate to the host in question. Click on the ‘Manage’ tab and then click on the ‘Storage’ sub-tab. Click on the ‘Storage Devices’ option and choose a device that is connected to the StorTrends SAN. Then, under the ‘Device Details’, click on ‘Edit Multipathing…’
  • In the dropdown under ‘Path selection policy:’, choose ‘Round Robin (VMware)’.
NOTE: If there are a multitude of targets to go through, StorTrends Support can offer a customized script that will make the changes for all targets that are connecting to a StorTrends LUN.
NOTE: It is recommended that round robin multipath be set for all iSCSI connections associated with the StorTrends SAN. This will allow for additional performance as seen in the performance section, below.

iSCSI Timeout Values

All StorTrends SANs are fully redundant with dual controllers. In the event of a failover, the user must ensure that proper timeout values are set throughout the environment in order to ensure no loss of connectivity. For VMware, change the following option in the iSCSI adapter.

  • Login to vCenter and navigate to the host in question. Click on the ‘Manage’ tab and then click on the ‘Storage’ sub-tab. Under the ‘Storage Adapters’ option, scroll down until the iSCSI Software Adapter is found and then click on the ‘Advanced Options’ tab in the ‘Adapter Details’ section. Click on the ‘Edit…’ button.
  • Scroll down until the entry for ‘RecoveryTimeout’ is found. Change the value to 120.
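The same timeout can also be set from the ESXi shell; a sketch, assuming a hypothetical adapter name of vmhba33:

```shell
# Set the iSCSI RecoveryTimeout to 120 seconds.
# vmhba33 is a hypothetical name - find yours with: esxcli iscsi adapter list
esxcli iscsi adapter param set --adapter=vmhba33 --key=RecoveryTimeout --value=120

# Verify the new value
esxcli iscsi adapter param get --adapter=vmhba33 | grep RecoveryTimeout
```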

Other timeout recommendations may be found in the StorTrends Timeout Guide.

Data Drive Considerations

There are a few considerations when deciding how to deploy datastores (both for OS data and for application data), including how the datastores will be provisioned and how the StorTrends volumes will be connected to the virtual machines (VMs) in question. The below sections shed light on each of these considerations.

Provisioning Considerations

A general rule of thumb when deciding what provisioning type to use for a specific datastore is to use the opposite type from what was used on the StorTrends SAN.

NOTE: In the StorTrends 2610i, all volumes are created as thin provisioned volumes on the SAN.

Thin Provisioning

VMware vSphere thin provisioning allows for space overcommit at the datastore level. A thin provisioned virtual disk only allocates space in the datastore as it is needed. If the virtual machines being deployed in the VMware environment have unpredictable space growth, thin provisioning is a viable option. With this type of provisioning, however, VMs running on these datastores are susceptible to out-of-space conditions. For this reason, it is recommended that the ‘Datastore usage on disk’ alarm is enabled. To enable this alarm, log in to vCenter from the vSphere Web Client and follow the steps below:

  • In the left-hand navigation tree, click on vCenter object and choose the ‘Manage’ tab. Choose the ‘Alarm Definitions’ sub-tab and type ‘usage’ in the search box to narrow down the alarms.
  • Click ‘Edit…’ to get to the settings for the alarm. Be sure that ‘Datastores’ are selected as the type being monitored and that the radio button for ‘specific conditions or state, for example CPU usage’ is chosen. Click ‘Next’.
  • The default values are 75% usage for a warning condition and 85% usage for a critical condition. These may be changed by clicking on each percentage and choosing a new value from the dropdown menu.
  • In the ‘Actions’ section, choose what actions to take for each condition. It is recommended to have email notifications sent out when these conditions are met to ensure that the correct people are notified.
NOTE: If the virtual disk supports clustering solutions such as Fault Tolerance, it should NOT be provisioned as a thin provisioned datastore.

Lazy Zeroed and Eager Zeroed Thick Provisioning

Lazy zeroed thick provisioning will create a virtual disk in a default thick format. With this format, the space required for the virtual disk is allocated at creation time. Data remaining on the physical device will be zeroed out on demand at a later time on first write from the VM.

Eager zeroed thick provisioning supports clustering features such as Fault Tolerance. With this format, the space required for the virtual disk is allocated at creation time. The data remaining on the physical disk is zeroed out when the virtual disk is created. This type of provisioning could take longer than other types of provisioning.

NOTE: Eager zeroed thick provisioning is the recommended option for provisioning virtual disks.

Methods for Connectivity

There are a few options to consider when thinking about how to connect a disk to a VM. There are two options that involve the VMware ESXi hosts – Raw device mapped (RDM) and creating a disk on a datastore and presenting that to a VM. The third option utilizes the VM’s iSCSI initiator.


Creating a VMDK on a VMFS Datastore

One of the most common methods of connecting data drives to VMs is creating a VMDK disk on a VMFS volume. This method requires a user to carve out space on a datastore and assign that new space to a VM for use.


Pros:

  • Ability to add storage to a VM from free space in the datastore that is hosting the VM or provision out a new datastore
  • Visible in vCenter, offering light administrative overhead
    • Allows the user to take advantage of vCenter tasks such as vMotion and cloning


Cons:

  • Unable to take application aware snapshots of data residing on VMDK

Raw Device Mapped (RDM) LUNs

Another method for connecting data drives to a VM within VMware is by utilizing RDM LUNs. This method bypasses the necessity to create a datastore and directly connects the RAW device to the VM.


Pros:

  • No real advantages over other methods


Cons:

  • Unable to take application aware snapshots of data
  • No performance gains as compared to other methods
  • There is a 256 target maximum per iSCSI initiator per VMware ESXi host and each RDM has to be a separate target

VM’s iSCSI Initiator

A third option for connecting data drives to a VM is to use the VM’s built-in iSCSI initiator. With this method, Windows and Linux VMs depend on their own iSCSI initiators and completely bypass the VMware layer.


Pros:

  • For Windows servers: ability to take application aware snapshots with the use of SnapTrends for VSS aware snapshots.
  • Ability to isolate data drives for disaster recovery (DR) purposes. This allows your data drive to be on a different replication schedule than the OS drive.
  • Can be easily mounted on either a virtual or physical server for quick recovery on a physical server in the case that the virtual environment is down.
  • When going from a physical server to a virtual one, there is no need to change any of the best practices already in place


Cons:

  • Not visible to vCenter, which can cause management overhead.

ATS Heartbeat

Starting with ESXi 5.5 Update 2, VMware changed its method for VMFS heartbeat updates from plain SCSI reads and writes, with the VMware ESXi kernel handling validation, to offloading this procedure to the SAN via the ATS VAAI primitive. In some cases, this can cause unwanted temporary loss of access to datastores. If you see any of the following messages, you may be experiencing this issue:

  • In the /var/run/log/vobd.log file and vCenter Events, the VOB message:
    Lost access to volume <uuid><volume name> due to connectivity issues. Recovery attempt is in progress and the outcome will be reported shortly
  • In the /var/run/log/vmkernel.log file, the message:
    ATS Miscompare detected between test and set HB images at offset XXX on vol YYY
  • In the /var/log/vmkernel.log file, similar error messages indicating an ATS miscompare:
    2015-11-20T22:12:47.194Z cpu13:33467)ScsiDeviceIO: 2645: Cmd(0x439dd0d7c400) 0x89, CmdSN 0x2f3dd6 from world 3937473 to dev "naa.50002ac0049412fa" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0.
  • Hosts disconnecting from vSphere vCenter.
  • Virtual machines hanging on I/O operations.

StorTrends recommends disabling the ATS Heartbeat, whether or not you have experienced any of the symptoms above, to ensure proper and stable connections to the StorTrends LUNs. VMware KB article 2113956 has the proper steps to take depending on whether you utilize VMFS5 or VMFS3 datastores.



Performance Tuning Recommendations

The following section walks through changes that can be made to the environment in order to fine-tune it for greater overall performance.

Native Multipathing Plugin (NMP) Configuration

Starting in version 4.0, VMware improved its Native Multipathing Plugin (NMP) by introducing Storage Array Type Plugins (SATP) and Path Selection Plugins (PSP) as part of the VMware APIs for Pluggable Storage Architecture (PSA). With these plugins, storage systems are given the ability to aggregate I/Os across multiple connections as well as implement failover methods between those connections. VMware has three options for handling these, and StorTrends recommends the “Round Robin” option, as seen in the ‘Round Robin Multipathing’ section earlier in this guide.

In this section, we explore additional settings that can be applied when using Round Robin multipathing to realize increased throughput. When Round Robin is configured, a parameter dictates how often each connection is utilized for servicing I/Os. By default, VMware switches paths every 1,000 I/Os, which can lead to uneven usage of the connections as well as unrealized throughput gains. StorTrends’ recommendation is to set this parameter to switch paths on every I/O, enabling a complete balance of utilization between connections and ensuring that each connection is fully utilized. As an added bonus, in the case of a failover, the time it takes to complete the failover from VMware’s perspective is reduced, as the switching between connections happens more quickly. Making this change does incur a minute amount of additional CPU cost on the ESXi hosts, but not enough to cause any undesirable side effects.

The exact commands vary slightly depending on your ESXi version, but in all cases the change must be made from each ESXi host’s SSH shell. The scripts below ensure that any new StorTrends LUNs presented to the ESXi host automatically take these settings when Round Robin is set, and also make the change for any current StorTrends LUNs.

NOTE: In order for the rules for new connections to take effect, a reboot of the ESXi host is required.
For ESXi versions 5.0 and newer:
#!/bin/sh
# set rule for current connections
esxcli storage nmp satp set --default-psp=VMW_PSP_RR --satp=VMW_SATP_DEFAULT_AA
for i in `esxcli storage nmp device list | grep AMI | awk '{print $7}' | sed 's/(//g' | sed 's/)//g'` ; do esxcli storage nmp device set -d $i --psp=VMW_PSP_RR ; esxcli storage nmp psp roundrobin deviceconfig set -d $i -I 1 -t iops ; done
# set rule for new connections
esxcli storage nmp satp rule add -s "VMW_SATP_DEFAULT_AA" -V "AMI" -M "StorTrends" -P "VMW_PSP_RR" -O "iops=1"
For ESXi versions prior to 5.0:
#!/bin/sh
# set rule for current connections
for i in `esxcli nmp device list | grep -i -B1 "ay Name: AMI" | grep -i "eui." | grep -i -v "ay Name"` ; do esxcli nmp device setpolicy --device $i --psp VMW_PSP_RR; done
# this command needs to be re-run any time a new StorTrends LUN is added to the ESXi host
for i in `esxcli nmp device list | grep -i -B1 "ay Name: AMI" | grep -i "eui." | grep -i -v "ay Name"` ; do esxcli nmp roundrobin setconfig --device $i --iops 1 --type iops; done
# set rule for new connections
esxcli nmp satp setdefaultpsp --satp VMW_SATP_DEFAULT_AA --psp VMW_PSP_RR
esxcli corestorage claimrule load
esxcli corestorage claimrule run

Alignment Considerations

Misaligned partitions at any layer within the overall environment can cause a negative impact on performance on any VMs. The reason for this is that with a misaligned partition, every I/O operation from the VM may require multiple I/O operations at the VMware ESXi host and/or StorTrends SAN.

There are two levels of alignment that may need to be considered when implementing a solution with a StorTrends SAN:

  • Alignment of the datastore associated with the LUN hosted on the StorTrends SAN
  • Alignment of the VM running on that datastore
NOTE: Beginning with VMware ESXi 5.0, any datastore created through either the thick client or the web client is automatically aligned along the 1 MB boundary. This means that for any datastore created on a VMware ESXi 5.0 or later host, no extra steps need to be taken in order to ensure that datastores are properly aligned.

Detecting Misalignment at the Virtual Machine Level

Misaligned partitions can occur on VMs running on both Windows and Linux operating systems. The below sections will elaborate on how to check each OS for partition alignment issues as well as ways to correct these issues.

Detect Misalignment on Windows Virtual Machines

NOTE: These steps only need to be taken on VMs running Windows Server 2003 or earlier. Windows Server 2008 and later automatically aligns partitions correctly.

To check the alignment of the drives running on a Windows server, use the ‘wmic’ command, which reports the starting offset of every partition on the server.

A partition is aligned to the 4 KB boundary only if its starting offset is evenly divisible by 4096. Any partition whose starting offset fails this test is not properly aligned.
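The divisibility check can be sketched as a short script. The wmic invocation in the comment is what would be run on the Windows guest; the offset values below are hypothetical examples, not output from a real host:

```shell
# On the Windows guest, starting offsets are reported by:
#   wmic partition get Name, StartingOffset
# The offsets below are hypothetical examples.
check_alignment() {
  if [ $(( $1 % 4096 )) -eq 0 ]; then echo aligned; else echo misaligned; fi
}
check_alignment 1048576   # 1 MB offset (Windows Server 2008+ default) -> aligned
check_alignment 32256     # 63-sector offset (63 * 512), common on 2003 -> misaligned
```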

Detect Misalignment on Linux Virtual Machines

To find information about a partition, including its starting offset, either parted or fdisk (or gdisk for GPT devices) can be utilized. The example below utilizes gdisk to get the starting sector of the device in question. Multiply the starting sector by 512, then check whether the result is evenly divisible by 4096 to determine whether or not the partition is misaligned.

# gdisk -l /dev/sdb
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdb: 209715200 sectors, 100.0 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 66AFD151-6A3F-45DA-AF05-6A4D2903BF9E
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 209715166
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2049       209715166   100.0 GiB   8300  Linux filesystem

In the example above, the starting sector for this device is not divisible by 4096 after multiplying by 512, therefore this partition is misaligned.
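The arithmetic from this example can be verified with shell arithmetic:

```shell
# Start sector 2049 with 512-byte logical sectors, as in the gdisk output above
start_sector=2049
offset_bytes=$(( start_sector * 512 ))   # 1049088 bytes
if [ $(( offset_bytes % 4096 )) -eq 0 ]; then
  echo "partition is aligned"
else
  echo "partition is misaligned"         # this branch is taken: the remainder is 512
fi
# A start sector of 2048 (the usual 1 MiB boundary) would leave no remainder
echo $(( 2048 * 512 % 4096 ))            # 0
```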

Resolving Misalignment issues

If partitions are found to be misaligned, refer to the VMware technical paper Recommendations for Aligning VMFS Partitions. For Windows partitions in particular, Microsoft KB article 929491 provides information on proper partition alignment.

Queue Depth Considerations

Queue depth is a tricky concept to fully grasp, and many considerations come into play when changing queue depth settings. In its simplest form, queue depth is the number of transactions allowed to be outstanding between an initiator (the VMware ESXi host) and a target (the StorTrends SAN).

In most cases, particularly clustered environments, there will be multiple initiators sending data to the StorTrends SAN. Queue depth comes into play to keep the SAN from becoming flooded with I/O requests. If the SAN’s queue depth is saturated, transactions pile up on the ESXi hosts, and higher latencies and degraded performance are the result. The difficult part is determining what the queue depth should be set to on the individual ESXi hosts and/or the StorTrends SAN to achieve optimal performance. If the queue depth is set too high, the StorTrends SAN may come under too much stress and I/O may time out. If the queue depth is set too low, VMs on the ESXi host may shut down due to latency issues.

As a general recommendation, setting the queue depth of each Software iSCSI Initiator to 64 should suffice. StorTrends SANs have their queue depth set to 64, and matching the queue depths between initiator and target will ensure that the ESXi host does not report latencies not seen by the StorTrends SAN. In order to set this queue depth on each ESXi host, the following script may be run by SSH’ing into each host (please be sure to enter maintenance mode on the ESXi host before making these changes):

For ESXi versions 5.1 and prior:

#!/bin/sh
vim-cmd hostsvc/advopt/update Disk.SchedNumReqOutstanding long 64
esxcfg-module iscsi_vmk -s "iscsivmk_HostQDepth=1024 iscsivmk_LunQDepth=64"

For ESXi versions 5.5 and newer:

#!/bin/sh
for i in `esxcli storage core device list | grep -B 1 " Vendor: AMI" | grep Path | cut -d'/' -f 5`; do esxcli storage core device set --device=$i -O 64; done
esxcfg-module iscsi_vmk -s "iscsivmk_HostQDepth=1024 iscsivmk_LunQDepth=64"

After running the above script, a reboot of the individual ESXi hosts will need to take place. A StorTrends Support Engineer will be able to assist you in properly running the script if assistance is required.

Virtual Machine-Datastore Ratio

For the reasons illustrated in the Queue Depth Considerations section above, StorTrends recommends that a 1:1 VM-to-datastore ratio be maintained for VMs that have higher I/O demands. While this can cause a bigger management hurdle, each VM will be guaranteed a proper amount of queue depth for its I/O. For less demanding VMs, StorTrends recommends no more than 10 VMs be on a single datastore.

Block Size Considerations

The maximum block size supported by an individual ESXi host is configurable; by default, it is set to 32 MB. Block size correlates directly to the number of IOPS the StorTrends SAN must serve, as the worked example at the end of this section shows.

The recommended maximum block size for optimal performance to a StorTrends SAN is 128 KB. This allows VMware to send out packets quicker and with less queueing, resulting in decreased latencies.

In order to make this change, follow the steps below:

  • Login to vCenter and navigate to the host in question. Click on the ‘Manage’ tab and then click on the ‘Settings’ sub-tab. Click on ‘Advanced System Settings’ under ‘System’ to get a long list of settings for the host. Search for “Disk.DiskMax”
  • You will need to change the value of the setting that comes up. This change can be made by selecting the setting in the list and then clicking on the pencil icon to edit the value. No reboot is necessary for these changes to take effect.
    • DiskMaxIOSize = 128
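The same change can also be sketched from the ESXi shell:

```shell
# Set the maximum I/O size passed down to the SAN to 128 KB (no reboot required)
esxcli system settings advanced set --option=/Disk/DiskMaxIOSize --int-value=128

# Verify the current value
esxcli system settings advanced list --option=/Disk/DiskMaxIOSize
```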

As an example, take a clustered environment with 5 hosts, 2 interfaces per host, and the default queue depth and block size on each host. The block size for a StorTrends 3610i is 4 KB. Plugging in the numbers, the SAN would need to serve 20,971,520 IOPS in order to maintain low latency.

Now, if we implement our recommendations for queue depth and block size, the SAN only needs to be able to serve 20,480 IOPS in order to maintain a low latency.
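The formula behind these two figures did not survive into this copy of the document, but both numbers are reproduced by: worst-case outstanding IOPS = hosts x interfaces per host x queue depth x (maximum block size / SAN block size), assuming a default queue depth of 256. A quick sketch in shell arithmetic, with that assumption clearly labeled:

```shell
# Worst-case IOPS = hosts * interfaces_per_host * queue_depth * (max_block_KB / san_block_KB)
# ASSUMPTION: the defaults are a queue depth of 256 and a 32 MB maximum block size.
hosts=5; interfaces=2
default_iops=$(( hosts * interfaces * 256 * (32 * 1024 / 4) ))  # 32 MB blocks on a 4 KB SAN block
tuned_iops=$((   hosts * interfaces * 64  * (128 / 4) ))        # queue depth 64, 128 KB blocks
echo "default: $default_iops"   # default: 20971520
echo "tuned:   $tuned_iops"     # tuned:   20480
```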

The above illustrates an example of how making just a couple of adjustments can vastly improve the performance and response time of the environment as a whole.

vStorage APIs for Array Integration (VAAI)

In a traditional virtualized environment, storage operations force the virtualization host to endure significant overhead, consuming the physical resources available on the host. These functions are more efficient when run on the storage device itself.

The StorTrends family of SANs supports the various vStorage APIs for Array Integration (VAAI). These primitives allow the VMware ESXi hosts to communicate directly with the StorTrends SAN, enabling the ESXi host to offload storage-specific operations to the SAN and in turn reducing the overhead seen by the ESXi hosts. This allows for significant improvements in the performance of storage-centric operations such as cloning, zeroing, etc. The overall goal of these VAAI primitives is to create a direct line of constant communication between the StorTrends SAN and the VMware environment, providing hardware assistance for I/O operations that the StorTrends SAN can complete more efficiently. The following table lists the supported VAAI primitives, as well as a brief description of what they do and how they benefit the customer.

1. Atomic Test and Set (ATS) [Hardware Assisted Locking]
   Enables granular locking of block storage devices, accelerating performance. ATS allows for the locking of blocks instead of locking the entire LUN when performing specific I/O. This enables performance advantages by sharing LUNs within the StorTrends SAN.

2. Cloning Blocks [Full Copy, Extended Copy]
   Calls for the array to make a mirror of a LUN (clone, vMotion). With this primitive enabled, VMs can be cloned in less than 2 minutes as opposed to the average of 8 minutes for cloning VMs without the primitive. This is the signature VAAI command from VMware and allows for huge OPEX savings. Rather than having to read the data from the array and wait as it gets written back, the hypervisor can command the array to make a mirror of a range of data on its behalf. The biggest advantage here is that cloning and vMotioning are lightning quick.

3. Zeroing File Blocks [Block Zeroing]
   Communication mechanism for thin provisioning arrays. Block zeroing is done at the metadata table on the StorTrends SAN, which helps speed up the creation of VMs. This helps administrators cut down on the time needed to deploy VMs by allowing them to create VMs using Thick Eager Zeroing. This type of provisioning instructs the StorTrends SAN to automatically write 0s to the metadata table, in turn allowing VMs to be created in a fraction of the time of traditional deployment methods.

4. Out of Space Condition [Thin Provisioning Stun]
   “Pauses” a running VM when full capacity is reached. This command enables the StorTrends array to notify vCenter to suspend all VMs on a LUN that is reaching max capacity due to thin provisioning over-commit. The suspension is lifted when space is freed up or added to either the datastore in question or the LUN hosted on the StorTrends SAN.

5. UNMAP [Space Reclamation]
   Allows thin arrays to clear unused VMFS space. UNMAP is a SCSI command which was added to VAAI to allow thin-provision capable storage arrays to clear unused VMFS space. By using UNMAP, both vCenter and the StorTrends SAN report matching capacity usage. As an added benefit, deleted changes are not replicated in a DR situation.

6. Quota Exceeded Behavior [TP Soft Threshold]
   Allows vSphere to react before an out-of-space condition. This primitive allows StorTrends to alert VMware preemptively; a potential benefit of this would be triggering a Storage DRS rebalance. As an added benefit, VMware can query the StorTrends SAN for its threshold values using SCSI ‘mode sense’, allowing vSphere to be proactive and react before an out-of-space condition occurs.

7. TP LUN Reporting [Report Thin Capacity]
   Enables vSphere to determine LUN thin provisioning status. A simple, yet elegant command that allows vSphere to query the StorTrends SAN to determine if the LUN is thinly provisioned.


Atomic Test & Set (ATS) – Hardware Assisted Locking

The ATS primitive is one of the most important VAAI primitives available. ATS will allow multiple VMware ESXi hosts to access the same datastore simultaneously by allowing each host to only lock the blocks it is currently working on, rather than locking up the entire datastore.

Many operations on VMFS files greatly benefit from ATS, including:

  • Acquiring on-disk locks
  • Upgrading an optimistic lock to an exclusive/physical lock
  • Unlocking a read-only/multiwriter lock
  • Acquiring a heartbeat
  • Clearing a heartbeat
  • Replaying a heartbeat
  • Reclaiming a heartbeat
  • Acquiring on-disk lock with dead owner

In order for ATS to successfully handle multiple hosts requesting locks on the same datastore, the locking mechanism needs to move away from the traditional use of SCSI reservations, which locks an entire LUN. The benefits of this VAAI primitive include the eradication of contention issues with SCSI reservations as well as allowing a user to scale VMFS volumes to larger sizes.

NOTE: Although ATS drastically reduces SCSI reservation based latency and allows for an increase in the VM-to-datastore ratio, it should not be inferred that a large quantity of VMs can reside on a single datastore. This is because the primitive handles SCSI reservation based latency exclusively and does not take into consideration other performance inhibiting factors such as queue depth and network bandwidth requirements.

Cloning Blocks – Full Copy, Extended Copy

The extended copy VAAI primitive allows for operations such as cloning and Storage vMotion to finish in a much more efficient manner. This is the signature VAAI command from VMware. Without the use of this VAAI primitive, the VMware ESXi host utilizes the VMkernel software Data Mover driver. With this driver, the operation could take many minutes to hours because it will heavily consume CPU cycles, DMA buffers and SCSI commands in the HBA queue. With this VAAI primitive enabled, the full copy of the blocks happen on the StorTrends SAN, alleviating the extra strain on the host. The below table helps illustrate the advantages of this VAAI primitive on a StorTrends SAN.

Detail      Description                                                                       Time (min:sec)
VAAI        Time taken to clone a 25 GB VM with acceleration                                  2:59
No VAAI     Time without acceleration (with a large increase in network utilization)          9:15
Advantage   Time savings (not including benefits from reduced network or disk utilization)    6:16
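As a quick sanity check, the savings row in the table can be reproduced with a short shell calculation (the mm:ss values are taken from the table above; the helper function is illustrative):

```shell
# Convert the mm:ss figures from the table to seconds and take the difference.
to_sec() { echo $(( ${1%%:*} * 60 + ${1##*:} )); }
no_vaai=$(to_sec 9:15)   # clone time without acceleration
vaai=$(to_sec 2:59)      # clone time with the Extended Copy primitive
saved=$(( no_vaai - vaai ))
printf 'Time saved: %d:%02d\n' $(( saved / 60 )) $(( saved % 60 ))   # prints: Time saved: 6:16
```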


Zeroing File Blocks – Block Zeroing (Write Same)

The Write Same primitive allows for the zeroing of blocks to happen on the StorTrends SAN rather than on the VMware ESXi host itself. This is most helpful when using the Eager Zeroed Thick Provisioning method.  Without the use of this VAAI primitive, similar constraints to that of not using the extended copy primitive occur on the host. With this VAAI primitive enabled, these operations are offloaded to the StorTrends SAN, where the zeroing of blocks happens in a vastly more efficient manner.

One of the most common operations on virtual disks is initializing considerable amounts of a disk with zeroes in order to help isolate VMs and promote security. The use of the Write Same primitive offloads this task without transferring the data over the wire. This aids in improving the performance of the following operations:

  • Cloning operations
  • Allocating new file blocks for thin provisioned virtual disks
  • Initializing previously unwritten file blocks for virtual disks

The below table helps illustrate the advantages of this VAAI primitive on a StorTrends SAN.

Detail      Description                                                                       Time (min:sec)
VAAI        Time taken to create a 70 GB VM with Write Same                                   1:03
No VAAI     Time without Write Same (with a large increase in network utilization)            12:35
Advantage   Time savings (not including benefits from reduced network or disk utilization)    11:32


NOTE: Some storage arrays will write zeroes directly down to disk. Other arrays, including StorTrends, do not need to perform this write and instead simply make a metadata update marking the page as all zeroes. With this advantage, the user can observe significant performance differences when using this primitive.

Out of Space Condition – Thin Provisioning Stun

The Out of Space Condition primitive enables vCenter to suspend VMs that have reached 100% utilization of their capacity. Without this VAAI primitive enabled, if any VM on a particular datastore experiences this condition, all VMs within that datastore will be suspended. With this VAAI primitive enabled, only those VMs that reach 100% usage will be suspended, leaving the other VMs within the same datastore free to continue running. In order to resume a suspended VM, either the volume on the StorTrends SAN can be expanded, or the VMDK for that particular VM can be expanded from vCenter if space is available.

Quota Exceeded Behavior – Thin Provisioning (TP) Soft Threshold

The TP Soft Threshold primitive enables the StorTrends SAN to notify VMware preemptively of a pending out of space (OOS) condition. vSphere by default has no ability to recognize how much space a thin provisioned datastore is consuming on the StorTrends SAN. The only way vSphere will be made aware is when the VM(s) on that datastore stop running. This VAAI primitive provides a warning that pops up in vCenter notifying the user about a datastore nearing its capacity. This gives the user ample time to take necessary steps – such as adding more storage, extending the datastore, etc. – before an OOS condition occurs.

Thin Provisioning (TP) LUN Reporting – Report Thin Capacity

TP LUN reporting simply allows vCenter to query the StorTrends SAN to determine if a LUN is thin provisioned. To check this reporting, an SSH session into the particular VMware ESXi host will need to be opened in order to run the following command: esxcli storage core device list -d <device name>. Below is an example of the expected output:

# esxcli storage core device list -d eui.5b5fbb54c4d81900
   Display Name: AMI iSCSI Disk (eui.5b5fbb54c4d81900)
   Has Settable Display Name: true
   Size: 512000
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/eui.5b5fbb54c4d81900
   Vendor: AMI
   Model: StorTrends iTX
   Revision: 2.8s
   SCSI Level: 6
   Is Pseudo: false
   Status: degraded
   Is RDM Capable: true
   Is Local: false
   Is Removable: false
   Is SSD: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters:
   VAAI Status: supported
   Other UIDs: vml.01000000003562356662623534633464383139303053746f725472
   Is Local SAS Device: false
   Is USB: false
   Is Boot USB Device: false
   No of outstanding IOs with competing worlds: 32

UNMAP – Space Reclamation

UNMAP is a formal SCSI command which was added into VMware’s VAAI primitives to allow thin provisioned storage arrays to clear unused VMFS space. This allows both vCenter and StorTrends to report matching used capacities. For users who have SAR replication in place for disaster recovery purposes, this will also help cut back on the amount of changes that get replicated.

At its inception, this VAAI primitive was set to be an automatic function. Concerns with performance and an array’s ability to reclaim the space within an optimal time frame caused this VAAI primitive to become the only manually triggered primitive. This manual trigger is initiated through the CLI for a particular VMware ESXi host. The commands differ slightly for versions of ESXi prior to 5.5 and those 5.5+. Below are examples of each with a reference to their corresponding KB articles for further details.

UNMAP in VMware ESXi 5.0 U1 or later and 5.1

The first step will be to change the directory to the root of the VMFS volume that has space available to reclaim. Once there, the command to initiate the UNMAP primitive is: vmkfstools -y <% of free space to unmap>.

# vmkfstools -y 60
Attempting to reclaim 60% of free capacity 48.8 GB (29.3 GB) on VMFS-5 file system 'source-datastore' with max file size 64 TB. Create file .vmfsBalloontsWt8w of size 29.3 GB to reclaim free blocks.
NOTE: If a percentage value in the high 90s or 100 is specified, the temporary balloon file that is created during the reclamation operation might fill up the VMFS volume. Any growth of current VMDK files due to running virtual machines writing to their disks or the creation of new files, such as snapshots, might fail due to unavailable space. Care should be taken when calculating the amount of free space to reclaim.
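One way to stay clear of that failure mode is to derive the percentage from the volume's free space minus a headroom allowance before running vmkfstools -y. The figures below are illustrative placeholders, not measured values:

```shell
# Sketch: compute a conservative reclaim percentage, leaving headroom
# for VMDK growth and snapshot creation while the balloon file exists.
free_gb=48      # free capacity reported on the VMFS volume (example value)
headroom_gb=10  # space to keep free during reclamation (example value)
pct=$(( (free_gb - headroom_gb) * 100 / free_gb ))
echo "vmkfstools -y $pct"   # prints: vmkfstools -y 79
```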

For more information on this, take a look at VMware KB article 2014849.

UNMAP in VMware ESXi 5.5 and later

The command for initiating the UNMAP primitive in VMware ESXi 5.5 and later is: esxcli storage vmfs unmap --volume-label=volume_label|--volume-uuid=volume_uuid --reclaim-unit=number.

For this command, either the --volume-label or the --volume-uuid flag (but not both) is required, and --reclaim-unit is optional (StorTrends recommends keeping the default value, so this flag can be omitted).

# esxcli storage vmfs unmap --volume-label=TestDatastore

# esxcli storage vmfs unmap --volume-uuid=51d7e2ca-744caa39-5a7d-00259091db57
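When several datastores need reclamation, the command can be scripted per datastore label. The sketch below is a dry run that only prints each command (the labels are placeholders); removing the echo would execute them on the host:

```shell
# Dry run: print one UNMAP command per datastore label.
for ds in Datastore1 Datastore2; do
  echo "esxcli storage vmfs unmap --volume-label=$ds"
done
```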

For more information on this, take a look at VMware KB article 2057513.

Check Support and Enable/Disable VAAI Primitives

There are a few methods of checking whether VAAI primitives are supported for particular devices. To check support for specific primitives, run the following command in an SSH session: esxcli storage core device vaai status get -d <device name>

# esxcli storage core device vaai status get -d eui.5b5fbb54c4d81900
eui.5b5fbb54c4d81900
   VAAI Plugin Name:
   ATS Status: supported
   Clone Status: supported
   Zero Status: supported
   Delete Status: supported

From the output above, it can be inferred that the ATS, Clone (Extended Copy), Zero (Write Same), and Delete (UNMAP) VAAI primitives are supported for this device.
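When the same check must run across many devices, the status lines can be filtered out with a small awk expression. In this sketch, a here-document stands in for the live output; in practice the esxcli command would be piped into awk:

```shell
# Extract each primitive's status from the 'vaai status get' output.
# The here-doc reproduces sample output for demonstration purposes.
awk -F': ' '/Status:/ { gsub(/^ +/, "", $1); print $1 " = " $2 }' <<'EOF'
eui.5b5fbb54c4d81900
   VAAI Plugin Name:
   ATS Status: supported
   Clone Status: supported
   Zero Status: supported
   Delete Status: supported
EOF
```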

From the VMware vSphere Web Client, the overall status of VAAI primitive support can be found by checking the ‘Hardware Acceleration’ status of a particular device. The possible statuses are Supported (shown if ATS is supported), Not Supported (shown if none of ATS, Extended Copy, or Write Same are supported), or Unknown (shown when support has not yet been determined, typically because the primitives have not yet been exercised). To check this, follow the steps below:

  • Log in to vCenter and navigate to the host in question. Click on the ‘Manage’ tab and then click on the ‘Storage’ sub-tab. Click on the ‘Storage Devices’ option and choose a device that is connected to the StorTrends SAN. Then, under ‘Device Details’, look for the ‘Hardware Acceleration’ field and confirm that it shows ‘Supported’ as its status. If the status shows ‘Unknown’, initiating a clone/vMotion or provisioning a virtual disk as eager zeroed thick should change the status to ‘Supported’.

VAAI primitives are enabled and disabled for the VMware ESXi host as a whole, not just for individual devices. The status of these primitives can be found directly from the VMware vSphere Web Client and may also be enabled/disabled as needed. In order to check the status or change the status of these primitives, follow the steps below:

NOTE: For all of the entries below, a status of ‘1’ signifies that the primitive is enabled whereas a status of ‘0’ signifies that the primitive is disabled.
  • Log in to vCenter and navigate to the host in question. Click on the ‘Manage’ tab and then click on the ‘Settings’ sub-tab. Click on ‘Advanced System Settings’ under ‘System’ to get a long list of settings for the host. In the search bar, type ‘VMFS3’. Similarly, type ‘DataMover’ to bring up a different list for the other VAAI primitives. The following entries are of concern:
    • DataMover.HardwareAcceleratedMove – Extended Copy
    • DataMover.HardwareAcceleratedInit – Write Same
    • VMFS3.EnableBlockDelete – UNMAP
    • VMFS3.HardwareAcceleratedLocking – ATS
  • To change the current status of any of the above primitives, select the particular primitive and click on the pencil icon at the top of the list of items. Simply input the value for the status required and click ‘OK’.
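These same settings can also be inspected and changed from an SSH session using esxcli's advanced-settings commands. The sketch below is a dry run that only prints each command; removing the echo would run them, and swapping --int-value 1 for 0 would disable a primitive:

```shell
# Dry run: print the esxcli commands to inspect and enable each VAAI
# primitive's host-wide advanced setting.
for opt in /DataMover/HardwareAcceleratedMove \
           /DataMover/HardwareAcceleratedInit \
           /VMFS3/EnableBlockDelete \
           /VMFS3/HardwareAcceleratedLocking; do
  echo "esxcli system settings advanced list --option $opt"
  echo "esxcli system settings advanced set --option $opt --int-value 1"
done
```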



vSphere Plugin Features

The StorTrends vSphere Plugin offers insight into the StorTrends SAN from right within the VMware vSphere Web Client. The plugin will also allow for the creation, cloning, expansion and deletion of virtual machines (VMs). The plugin is conveniently found at the Home tab of the vSphere Web Client, as shown below:

After clicking on the StorTrends icon, a login screen will come up. The StorTrends SAN’s IP and CLI username and password will be needed. Once in, there will be tabs for SAN Summary, Storage Statistics, Volume Statistics, and Management of VMs and datastores.

StorTrends SAN Summary

The ‘Summary’ tab will show you a summary of how the StorTrends SAN is being utilized within the VMware environment.

As shown in the figure above, a list of datastores and VMs running off of each datastore that are hosted on the StorTrends SAN are shown. The correlating StorTrends volume as well as used capacity, overall capacity, and correlating storage pool are shown as well.

Storage Statistics

The ‘Storage Statistics’ tab will elaborate on the overall historical Latency, IOPS, Throughput and Read and Write IO Distribution for the last day as seen from the StorTrends SAN. This view allows a user to see how much of a demand is being put on the StorTrends SAN coming from the VMware ESXi host(s) and can aid in determining if performance adjustments should be made.

NOTE: This tab will only be available for StorTrends 2610i SANs.

Volume Statistics

The ‘Volume Statistics’ tab will elaborate on the historical Latency, IOPS, Throughput and Read and Write IO Distribution for the last day, as seen from the StorTrends SAN, for each individual volume that is connected to the VMware ESXi host(s) managed by vCenter. The drop-down next to ‘Volume:’ will allow a user to select different volumes hosted on the StorTrends SAN that are mounted in the VMware environment. There is also the ability to drill down even further by choosing individual VMs running on the associated datastore.

NOTE: This tab will only be available for StorTrends 2610i SANs.

Virtual Machine Management

The final tab within the StorTrends vSphere Plugin has options for VM management. Within this tab, a user will have the ability to create new VMs from scratch, clone VMs, expand datastores and delete datastores, as needed.

Virtual Machine Creation

The ‘Virtual Machine Creation’ option will allow a user to create virtual machines over a new datastore. In order to create this new VM, a wizard will walk the user through creating a volume on the StorTrends SAN, as seen below:

The next step within the wizard will be to choose a datacenter and a host within that datacenter that will contain the datastore for the VM:

The final step in the wizard will be to input the required specifications for the new VM. This will include a name for the VM, the OS type, CPU count, disk size, and memory size:

The user will see a summary of everything that was specified and, as long as all looks correct, the VM creation process will commence. Once the wizard finishes, the new VM will be available from the navigation tree.

Virtual Machine Cloning

The ‘Virtual Machine Cloning’ option will allow a user to create multiple clones of a virtual machine. There are two types of clones that StorTrends supports – Data Clones and Instant Clones.

Data clones will clone the volume on the StorTrends SAN and connect to the new volume from the VMware environment. The datastore will be re-signatured and the VM will be registered as a new VM.

Instant Clones will not clone the volume on the StorTrends SAN. Instead, a snapshot of the volume will be taken on the SAN and then a writable snapshot will be created and mounted from that. The writable snapshot will be used for cloning of the datastore and VM.

Datastore Expansion

The ‘Datastore Expansion’ option will allow a user to expand a datastore if space is running low. This wizard will automatically expand both the volume on the StorTrends SAN as well as the datastore itself. Within the wizard, the user simply chooses the datastore to expand, then enters the new desired capacity.

Datastore Deletion

The ‘Datastore Deletion’ option will allow a user a simple way of deleting a datastore. This will also delete the correlating volume from the StorTrends SAN.

Snapshot Management

The ‘Snapshot Management’ option will allow a user a view into the various snapshots on the various volumes housed on the StorTrends SAN. A user will also have a convenient button to quickly mount the snapshot on the ESXi host in order to recover data that may have been deleted, altered, or damaged in some way. If deemed necessary, the user can create a manual snapshot on the volume for various purposes.
