EMC FAST: Whether to File and/or Block Tier


Storage performance needs in today’s data center can change on a moment’s notice. Data that needs the backing of a freight train today may only need the performance of a Vespa tomorrow. Having the ability to react to the ever-changing needs of one’s data in an automated fashion allows efficiencies never before seen in EMC’s midrange product line. Generally, as data ages, its importance lessens from both a business and a usage perspective. Utilizing FAST allows EMC customers to place data on the appropriate storage tier based on application requirements and service levels. Choosing between cost (SATA/NL-SAS) and performance (EFDs/SAS) is a thing of the past. Below are the what, when and why of EMC’s FAST. The intent is to help you make an informed decision based on the needs of your organization.

Block Tiering (What, When and Why)

The What: FAST for VNX/CLARiiON is an array-based feature that utilizes Analyzer to move block-based data (slices of LUNs). By capturing performance characteristics, it can intelligently predict where that data will be best utilized. Data is moved at the sub-LUN level in 1 GB slices, eliminating the need for (and overhead of) moving the full LUN. This means that portions of a LUN can exist on multiple disk types (FC, SATA, EFD). Migrations are seamless to the host and occur bidirectionally based on performance needs, i.e. FC to SATA, FC to EFD, SATA to FC, SAS to NL-SAS, etc. FAST is utilized at the storage pool layer and is not available within traditional RAID groups. To utilize FAST v2 (which is sub-LUN tiering) you must be at FLARE 30 or above (4.30.000.5.xxx) and have both the Analyzer and FAST enablers installed on the array. Existing LUNs/data can migrate seamlessly and non-disruptively into storage pools using the VNX/CLARiiON LUN migration feature. Additionally, FAST operates with other array-based features such as SnapView, MirrorView, SAN Copy, RecoverPoint, etc., without issue. All FAST operations and scheduling are configurable through Unisphere.

The When: Automated tiering is a scheduled batch event and does not happen dynamically.

The Why: To better align your application service levels with the best storage type. Ease of management is another driver: FAST requires storage pools, and storage pools allow for concise management and eased growth from one location. Individual RAID group and MetaLUN management is no longer needed to obtain high-end service levels when using storage pools and FAST. The idea going forward is to minimize disk purchasing requirements by moving hot and cold data to and from disk types that meet the specific service levels for that data. If data is accessed frequently, it makes sense that it lives on either EFD (enterprise Flash drives) or FC/SAS. If data is not accessed frequently, it ideally should live on SATA/NL-SAS. By utilizing FAST in your environment, you are utilizing your array in the most efficient manner while minimizing capex costs.
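To make the slice-ranking idea concrete, here is a purely illustrative PowerShell sketch of the kind of decision a relocation engine makes: rank 1 GB slices by recent I/O and fill the fastest tier first. The LUN names, I/O counts and tier capacity are invented for the example; this is not EMC’s actual FAST logic or any array CLI.

# Illustrative only: rank 1 GB slices by recent I/O and assign tiers by capacity.
$slices = @(
    [pscustomobject]@{ Lun = 'LUN_10'; SliceId = 0; IoCount = 84000 },
    [pscustomobject]@{ Lun = 'LUN_10'; SliceId = 1; IoCount = 120   },
    [pscustomobject]@{ Lun = 'LUN_11'; SliceId = 0; IoCount = 56000 },
    [pscustomobject]@{ Lun = 'LUN_11'; SliceId = 1; IoCount = 30    }
)

$efdSlots = 2   # hypothetical number of 1 GB slices the EFD tier can hold
$ranked   = $slices | Sort-Object IoCount -Descending

$i = 0
foreach ($slice in $ranked) {
    # Hottest slices go to EFD until it is full, warm slices to SAS, cold to NL-SAS.
    $tier = if ($i -lt $efdSlots) { 'EFD' } elseif ($slice.IoCount -gt 1000) { 'SAS' } else { 'NL-SAS' }
    "{0} slice {1} -> {2}" -f $slice.Lun, $slice.SliceId, $tier
    $i++
}

In the real product the relocation happens during the scheduled window mentioned above, but the underlying trade-off (fastest tier is scarce, so spend it on the hottest slices) is the same.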

File Tiering (What, When and Why)

The What: FAST for VNX File/Celerra utilizes the Cloud Tiering Appliance (formerly FMA, previously known as Rainfinity). The CTA utilizes a policy engine that allows movement of infrequently used files across different storage tiers based on last access time, modify time, size, filename, etc. As data is moved, the user perception is that the files still exist on primary storage. File retrieval (or recall) is initiated simply by clicking on the file, which is then copied back to its original location. The appliance itself is available as a virtual appliance that can be imported into your existing VMware infrastructure via vCenter, or as a physical appliance (hardware plus the software). Unlike FAST for VNX/CLARiiON, FAST for File allows you to tier across arrays (Celerra <-> VNX, Isilon or third-party arrays) or cloud service providers (Atmos namely, with other providers coming). The introduction of CTA to your environment is non-disruptive. All operations for CTA are configurable through the CTA GUI. In summary, CTA can be used as a tiering engine, an archiving engine or a migration engine based on the requirements of your business. From an archiving perspective, CTA can utilize both Data Domain and Centera targets for long-term enforced file-level retention. As a migration engine, CTA can be utilized for permanent file moves from one array to another during technology refreshes or platform conversions. Note: CTA has no knowledge of the storage type; it simply moves files from one tier to another based on pre-defined criteria.

The When: Automated tiering is designed to run at scheduled intervals (in batches); it does not happen dynamically or continuously.

The Why: Unstructured data, data that exists outside of a pre-defined data model such as SQL, is growing at an alarming rate. Think about how many Word docs, Excel spreadsheets, pictures and text files exist in your current NAS or general file-serving environments. Out of that number, what percentage hasn’t been touched since its initial creation? A fair assessment would be 50% of that data; a more accurate assessment is probably closer to 80%. Archiving and tiering via CTA simply allows for more efficient use of your high-end and low-end storage. If 80% of your data is not accessed, or is accessed infrequently, it has no business being on fast spinning disk (FC or SAS). Ultimately this allows you to curb your future spending on pricey high-end disk and focus purchasing on capacity where your data should sit: on low-end storage.
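The kind of policy CTA applies can be approximated with a quick PowerShell check against one of your own shares, just to gauge how much cold data you actually have. This is only a rough sketch (the share path and 180-day threshold are arbitrary examples), not a CTA command.

# Rough sketch: measure how much data under a share hasn't been read in 180 days.
# NTFS last-access updates may be disabled on some systems, so treat this as an estimate.
$path   = '\\nas01\users'          # hypothetical share
$cutoff = (Get-Date).AddDays(-180)

$files = Get-ChildItem -Path $path -Recurse -File -ErrorAction SilentlyContinue
$cold  = $files | Where-Object { $_.LastAccessTime -lt $cutoff }

$coldGB  = [math]::Round(($cold  | Measure-Object Length -Sum).Sum / 1GB, 1)
$totalGB = [math]::Round(($files | Measure-Object Length -Sum).Sum / 1GB, 1)

"Cold data: $coldGB GB of $totalGB GB total"

If that cold number lands anywhere near the 80% mark above, the case for tiering or archiving that share largely makes itself.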

***Update***

As brought to my attention on the twitters (thanks -> @PBradz and @veverything), there is of course another option. Historically, the data LUNs used by the data movers for file-specific data (CIFS, NFS) have only been supported on traditional RAID Group LUNs. With the introduction of the VNX, support has been extended to pool LUNs. This means that you can utilize FAST block tiering for the data that encompasses those LUNs. A couple of things to keep in mind when designing and utilizing it in this manner (more info here)…

  • The entire pool should be used by file only
  • Thick LUNs only within the pool
  • Same tiering policy for each pool LUN
  • Utilize compression and dedupe on the file side. Stay clear of block thin provisioning and compression.

There are of course numerous other recommendations that should be noted if you decide to go this route. Personally, it’s taken me a while to warm up to storage pools. Like any new technology, it needs to gain my trust before I go all in on recommending it. Inherent bugs and inefficiencies early on have caused me to be somewhat cautious. Assuming you walk the line on how your pools are configured, this is a very viable means to file tier (so to speak) with the purchase of FAST Block only. That being said, there is still benefit to using the CTA for long-term archiving, primarily off array, as FAST Block is currently array-bound only. Define the requirements up front so you’re not surprised on the back end as to what the technology can and cannot do. If the partner you’re working with is worth their salt, you’ll know all applicable options prior to that PO being cut…

THE DISK IS OFFLINE BECAUSE OF POLICY SET BY AN ADMINISTRATOR

Note from Tanny:

This post did not work for me as written, but it is worth sharing… in my case it was simply a matter of bringing the storage resource online in the cluster resource manager.

For a 2008 R2 clustered environment, take a look at the cluster resource manager. We swing a LUN between different servers for quick backups and restores. The instructions below did not work for me, but usually, after presenting the LUN to the cluster or to any of the stand-alone environments, a quick rescan will bring the disk online and keep the previous drive letter.
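If you would rather do that check from PowerShell than from the Failover Cluster Manager GUI, a minimal sketch using the FailoverClusters module (available on 2008 R2 and later) looks like this; the resource names and states will of course differ in your cluster.

# Sketch: find offline physical disk resources in the cluster and bring them online.
Import-Module FailoverClusters

Get-ClusterResource |
    Where-Object { "$($_.ResourceType)" -eq 'Physical Disk' -and "$($_.State)" -eq 'Offline' } |
    Start-ClusterResource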

Source: The disk is offline because of policy set by an administrator (repost from the Happy SysAdm blog)

You have just installed or cloned a VM with Windows 2008 Enterprise or Datacenter or you have upgraded the VM to Virtual Hardware 7 and under Disk Management you get an error message saying:
“the disk is offline because of policy set by an administrator”.
This is because, by design, all virtual machine disk files (VMDKs) are presented to VMs as SAN disks starting with virtual hardware version 7 (the version introduced with vSphere 4.0).
At the same time, and this is by design too, Microsoft has changed how SAN disks are handled by its Windows 2008 Enterprise and Datacenter editions.
In fact, on Windows Server 2008 Enterprise and Windows Server 2008 Datacenter (and this is true for R2 too), the default SAN policy is now VDS_SP_OFFLINE_SHARED for all SAN disks except the boot disk.
Having the policy set to Offline Shared means that your SAN disks will simply be offline at startup of your server, and if your paging file is on one of these secondary disks it will be unavailable.
Here’s the solution to this annoying problem.
What you have to do first is query the current SAN policy from the command line with DISKPART by issuing the following commands:
= = = = = = = = = = = = = = = = = =
DISKPART.EXE
 
DISKPART> san
 
SAN Policy : Offline Shared
= = = = = = = = = = = = = = = = = =
Once you have verified that the applied policy is Offline Shared, you have two options to set the disk to Online.
The first one is to log in to your system as an Administrator, click Computer Management > Storage > Disk Management, right-click the disk and choose Online.
The second one is to make a SAN policy change, then select the offline disk, force a clear of its readonly flag and bring it online. Follow these steps:
= = = = = = = = = = = = = = = = = =
DISKPART> san policy=OnlineAll
 
DiskPart successfully changed the SAN policy for the current operating system.
DISKPART> LIST DISK
 
Disk ###  Status   Size     Free     Dyn  Gpt
--------  -------  -------  -------  ---  ---
Disk 0    Online     40 GB      0 B
* Disk 1  Offline    10 GB  1024 KB
 
DISKPART> select disk 1
 
Disk 1 is now the selected disk.
 
DISKPART> ATTRIBUTES DISK CLEAR READONLY
 
Disk attributes cleared successfully.
 
DISKPART> attributes disk
Current Read-only State : No
Read-only : No
Boot Disk : No
Pagefile Disk : No
Hibernation File Disk : No
Crashdump Disk : No
Clustered Disk : No
 
DISKPART> ONLINE DISK
 
DiskPart successfully onlined the selected disk.
= = = = = = = = = = = = = = = = = =
Once that is done, the drive mounts automagically.
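On Server 2012 and later the same fix can be scripted with the Storage module instead of DISKPART. A minimal sketch follows; disk number 1 is assumed, so adjust for your system, and if Set-StorageSetting is not present on your build, use the DISKPART method above.

# Sketch for Server 2012+: set the SAN policy, then clear offline/read-only on disk 1.
Set-StorageSetting -NewDiskPolicy OnlineAll        # equivalent of DISKPART "san policy=OnlineAll"

Set-Disk -Number 1 -IsReadOnly $false
Set-Disk -Number 1 -IsOffline  $false

Get-Disk -Number 1 | Select-Object Number, FriendlyName, IsOffline, IsReadOnly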
  1. So, I’m trying all this but the return message I get in DiskPart is “DiskPart failed to clear disk attributes.” Any further advice?

    DISKPART> san policy=OnlineAll

    DiskPart successfully changed the SAN policy for the current operating system.

    DISKPART> rescan

    Please wait while DiskPart scans your configuration…

    DiskPart has finished scanning your configuration.

    DISKPART> select disk 1

    Disk 1 is now the selected disk.

    DISKPART> attributes disk clear readonly

    DiskPart failed to clear disk attributes.

    DISKPART> attributes disk
    Current Read-only State : Yes
    Read-only : Yes
    Boot Disk : No
    Pagefile Disk : No
    Hibernation File Disk : No
    Crashdump Disk : No
    Clustered Disk : Yes

    DISKPART> san

    SAN Policy : Online All

    (Note from Tanny: take a look at the cluster resource manager and bring the storage resource online.)

  2. I see your problem. Have you checked that you have full access to the volume you want to change attributes for? Is it a cluster resource? I think so, because your log says “Clustered Disk: Yes”. In this case you should stop all nodes but one, and then you will be allowed to use diskpart to reset the flags. The general idea is to grant the server you are connected to write access to the volume.
    Let me know if you need more help and, if so, please post more details about your configuration (servers and LUNs).
    Regards


  3. I am having this same problem. It is in cluster and I have shut down the other node. I am still unable to change the read only flag.
    Please help?!

  4. Wacky problem – a SAN volume mounted to a 2008 (not R2) 32-bit Enterprise server had been working fine. After a reboot of the server, the disk was offline. Putting it back online was no problem; diskpart details for the volume showed “Read-only: No”. Got support from Dell and found that the volume was listed as read-only. Simple fix: change the volume to “Read-only: No” with diskpart. 4 hours later, the volume is marked as read-only again. No changes made by us, nothing in the Windows logs.
    The disk is a Dell/EMC SAN LUN, fibre connected, for exclusive use by this machine. Have another LUN, almost the same size, attached the same way to this machine, with no problems there. Appreciate any thoughts or places to look.


  5. Ahhh, nice! A perfect tutorial! Thanks a lot!


  6. Great article! I just spent 2 hours trying to figure out why my san disks weren’t showing and this was the fix.

    Thank you!


  7. Thank you, thank you, thank you! This article helped me with an IBM DS3000 and an IBM System x3650M3 Windows Server 2008 R2. Thumbs up to you! I’d be still trying to figure why I couldn’t configure these drives!


  8. These settings are good for Windows Server 2008 and 2008 R2. It breaks again with R2 SP1 ;-(. Is there any solution for R2 SP1?


  9. Thanks. Very helpful.


  10. Wonderful article..thanks a lot dude!!


  11. This worked perfectly for me. I tried figuring it out on my own but just couldn’t get it to work within VMware Workstation.


  12. Let me know how to remove the read-only attribute and bring the disk online. If I access the SAN directly then it is possible.
    I have two servers; on one server the disk shows as online, but on the second server it displays the reserved / disk offline message.

    I’m also trying all this, but the return message I get in DiskPart is the same problem: “DiskPart failed to clear disk attributes.” Any further advice?

    DISKPART> san policy=OnlineAll

    DiskPart successfully changed the SAN policy for the current operating system.

    DISKPART> rescan

    Please wait while DiskPart scans your configuration…

    DiskPart has finished scanning your configuration.

    DISKPART> select disk 1

    Disk 1 is now the selected disk.

    DISKPART> attributes disk clear readonly

    DiskPart failed to clear disk attributes.

    DISKPART> attributes disk
    Current Read-only State : Yes
    Read-only : Yes
    Boot Disk : No
    Pagefile Disk : No
    Hibernation File Disk : No
    Crashdump Disk : No
    Clustered Disk : Yes

    DISKPART> san

    SAN Policy : Online All


  13. Exactly the answer I was looking for!


  14. Well done – fixed me right up.


  15. Perfect answer for a vexing problem. I had no clue where to look for


  16. This is really helpful article ! Many Thanks.


  17. Thanks for your reply!

  18. Thanks, this was very helpful for me.


  19. Hi, same problem here. The disk says it’s a clustered disk, but I don’t have it in Failover Cluster Manager. It’s just a disk dedicated to one server from the SAN. I have cleared simultaneous connections and only one server is connected now, but it still won’t come online. Any help would be great.
    Thanks



CLUSTERING SERVER 2012 R2 WITH ISCSI STORAGE

Wednesday, December 31, 2014

Source: Exit The Fast Lane

Yay, last post of 2014! Haven’t invested in the hyperconverged Software Defined Storage model yet? No problem, there’s still time. In the meanwhile, here is how to cluster Server 2012 R2 using tried and true EqualLogic iSCSI shared storage.

EQL Group Manager

First, prepare your storage array(s) by logging into EQL Group Manager. This post assumes that your basic array IP, access and security settings are in place. Set up your local CHAP account to be used later. Your organization’s security access policies or requirements might dictate a different standard here.


Create and assign an Access Policy to the VDS/VSS volume in Group Manager, otherwise this volume will not be accessible. This will make subsequent steps easier when it’s time to configure ASM.

Create some volumes in Group Manager now so you can connect your initiators easily in the next step. It’s a good idea to create your cluster quorum LUN now as well.


Host Network Configuration

First configure the interfaces you intend to use for iSCSI on your cluster nodes. Best practice says that you should limit your iSCSI traffic to a private Layer 2 segment, not routed and only connecting to the devices that will participate in the fabric. This is no different from Fibre Channel in that regard, unless you are using a converged methodology and sharing your higher-bandwidth NICs. If using Broadcom NICs you can choose Jumbo Frames or hardware offload; the larger frames will likely net a greater performance benefit. Each host NIC used to access your storage targets should have a unique IP address able to access the network of those targets within the same private Layer 2 segment.

While these NICs can technically be teamed using the native Windows LBFO mechanism, best practice says that you shouldn’t, especially if you plan to use MPIO to load balance traffic. If your NICs will be shared (not dedicated to iSCSI alone) then LBFO teaming is supported in that configuration. To keep things clean and simple I’ll be using 4 NICs: 2 dedicated to LAN, 2 dedicated to iSCSI SAN. Both LAN and SAN connections are physically separated onto their own switching fabrics as well, which is also a best practice.
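As a rough PowerShell sketch of that NIC layout on Server 2012 R2: the adapter names and addresses below are examples, and the jumbo-frame registry keyword is driver-dependent, so check your NIC’s advanced properties before relying on it.

# Sketch: give each dedicated iSCSI NIC a static IP on the private SAN segment
# and enable jumbo frames; no gateway and no DNS registration on these ports.
foreach ($nic in @(
        @{ Name = 'iSCSI-A'; IP = '10.10.50.11' },
        @{ Name = 'iSCSI-B'; IP = '10.10.50.12' })) {

    New-NetIPAddress -InterfaceAlias $nic.Name -IPAddress $nic.IP -PrefixLength 24
    Set-DnsClient    -InterfaceAlias $nic.Name -RegisterThisConnectionsAddress $false

    # "*JumboPacket" is the usual advanced-property keyword, but it varies by driver.
    Set-NetAdapterAdvancedProperty -Name $nic.Name -RegistryKeyword '*JumboPacket' -RegistryValue 9014
}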


MPIO – the manual method

First, start the MS iSCSI service, which you will be prompted to do, and check its status in PowerShell using Get-Service -Name msiscsi.


Next, install MPIO using Install-WindowsFeature Multipath-IO

Once installed and your server has been rebooted, you can set additional options in PowerShell or via the MPIO dialog under File and Storage Services > Tools.


Open the MPIO settings and tick “Add support for iSCSI devices” under Discover Multi-Paths, then reboot again. Any change you make here will require a reboot, so make all your changes at once so you only have to do this one time.
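If you would rather script those MPIO steps than click through the dialog, a sketch of the equivalent PowerShell on 2012 R2 looks like this; it still requires the reboots mentioned above.

# Sketch: start the iSCSI service, install MPIO, and claim iSCSI devices for MPIO.
Set-Service   -Name msiscsi -StartupType Automatic
Start-Service -Name msiscsi

Install-WindowsFeature -Name Multipath-IO
# Reboot here before continuing.

Enable-MSDSMAutomaticClaim -BusType iSCSI     # same effect as ticking "Add support for iSCSI devices"
# Reboot again for the claim to take effect.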


The easier way to do this from the onset is using the EqualLogic Host Integration Tools (HIT Kit) on your hosts. If you don’t want to use HIT for some reason, you can skip from here down to the “Connect to iSCSI Storage” section.

Install EQL HIT Kit (The Easier Method)

The EqualLogic HIT Kit will make it much easier to connect to your storage array as well as configure the MPIO DSM for the EQL arrays. Better integration, easier to optimize performance, better analytics. If there is a HIT Kit available for your chosen OS, you should absolutely install and use it. Fortunately there is indeed a HIT Kit available for Server 2012 R2.


Configure MPIO and PS group access via the links in the resulting dialog.


In ASM (launched via the “configure…” links above), add the PS group and configure its access. Connect to the VSS volume using the CHAP account and password specified previously. If the VDS/VSS volume is not accessible on your EQL array, this step will fail!


Connect to iSCSI targets

Once your server is back up from the last reboot, launch the iSCSI Initiator tool and you should see any discovered targets, assuming they are configured and online. If you used the HIT Kit you will already be connected to the VSS control volume and will see the Dell EQL MPIO tab.


Choose an inactive target in the discovered targets list and click Connect; be sure to enable multi-path in the pop-up that follows, then click Advanced.


Enable CHAP log on and specify the user/password set up previously.


If your configuration is good the status of your target will change to Connected immediately. Once your targets are connected, the raw disks will be visible in Disk Manager and can be brought online by Windows.
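The same connection can also be made from PowerShell with the built-in iSCSI module. A minimal sketch follows, with the portal address and CHAP credentials as placeholders; adjust them to match what you configured in Group Manager.

# Sketch: register the group IP as a target portal, then connect each target with MPIO and CHAP.
New-IscsiTargetPortal -TargetPortalAddress 10.10.50.100        # hypothetical EQL group IP

Get-IscsiTarget | Where-Object { -not $_.IsConnected } | ForEach-Object {
    Connect-IscsiTarget -NodeAddress $_.NodeAddress `
                        -IsMultipathEnabled $true `
                        -IsPersistent $true `
                        -AuthenticationType ONEWAYCHAP `
                        -ChapUsername 'chapuser' `
                        -ChapSecret 'chapsecret123'
}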


When you create new volumes on these disks, save yourself some pain down the road and give them the same label as what you assigned in Group Manager! This information can also be pulled out of the ASM tool for each volume.


Failover Clustering

With all the storage prerequisites in place you can now build your cluster. Setting up a Failover Cluster has never been easier, assuming all your ducks are in a row. Create your new cluster using the Failover Cluster Manager tool and let it run all compatibility checks.


Make sure your patches and software levels are identical between cluster nodes or you’ll likely fail the clustering pre-check with differing DSM versions.


Once the cluster is built, you can manipulate your cluster disks and bring any online as required. Cluster disks cannot be brought online until all nodes in the cluster can access the disk.


Next add your cluster disks to Cluster Shared Volumes to enable multi-host read/write and HA.


The new status will be reflected once this change is made.


Configure your Quorum to use the disk witness volume you created earlier. This disk does not need to be a CSV.


Check your cluster networks and make sure that the iSCSI network is set to not allow cluster network communication, and that your cluster (LAN) network is set up to allow cluster network communication as well as client connections. This can of course be further segregated if desired using additional NICs to separate cluster and client communication.
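For reference, the cluster build, disk, quorum and network steps described above can also be driven from PowerShell. This is only a hedged sketch with hypothetical node, cluster, network and disk names; the exact disk and network names will match whatever your environment shows in Failover Cluster Manager.

# Sketch: validate, build the cluster, add disks/CSVs, set the quorum witness,
# and keep the iSCSI network out of cluster communication.
Test-Cluster -Node node1, node2
New-Cluster  -Name CLU01 -Node node1, node2 -StaticAddress 10.10.10.50

Get-ClusterAvailableDisk | Add-ClusterDisk
Add-ClusterSharedVolume -Name 'Cluster Disk 2'            # data volume(s)
Set-ClusterQuorum -DiskWitness 'Cluster Disk 1'           # the small quorum LUN created earlier

# Network roles: 0 = none (iSCSI), 3 = cluster and client (LAN).
(Get-ClusterNetwork 'iSCSI').Role = 0
(Get-ClusterNetwork 'LAN').Role   = 3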


Now your cluster is complete and you can begin adding HA VMs (if using Hyper-V), SQL, File Server, or other roles as required.

References:

http://blogs.technet.com/b/keithmayer/archive/2013/03/12/speaking-iscsi-with-windows-server-2012-and-hyper-v.aspx

http://blogs.technet.com/b/askpfeplat/archive/2013/03/18/is-nic-teaming-in-windows-server-2012-supported-for-iscsi-or-not-supported-for-iscsi-that-is-the-question.aspx

VNX 5300/VMware: Troubleshoot ESXi connectivity to SAN via iSCSI

Troubleshoot VMware ESXi/ESX to iSCSI array connectivity:

Note: A rescan is required after every storage presentation change in the environment.

1. Log into the ESXi/ESX host and verify that the VMkernel interface (vmk) on the host can vmkping the iSCSI targets with this command:

# vmkping target_ip

If you are running an ESX host, also check that Service Console interface (vswif) on the host can ping the iSCSI target with:

# ping target_ip

Note: Pinging the storage array only applies when using the Software iSCSI initiator. In ESXi, ping and ping6 both run vmkping. For more information about vmkping, see Testing VMkernel network connectivity with the vmkping command (1003728).

2. Use netcat (nc) to verify that you can reach the iSCSI TCP port (default 3260) on the storage array from the host:

# nc -z target_ip 3260

Example output:

Connection to 10.1.10.100 3260 port [tcp/http] succeeded!

Note: The netcat command is available with ESX 4.x and ESXi 4.1 and later.

3. Verify that the host bus adapters (HBAs) are able to access the shared storage. For more information, see Obtaining LUN pathing information for ESX or ESXi hosts (1003973).

4. Confirm that no firewall is interfering with iSCSI traffic. For details on the ports and firewall requirements for iSCSI, see Port and firewall requirements for NFS and SW iSCSI traffic (1021626). For more information, see Troubleshooting network connection issues caused by firewall configuration (1007911).

Note: Check the SAN and switch configuration, especially if you are using jumbo frames (supported from ESX 4.x). To test the ping to a storage array with jumbo frames from an ESXi/ESX host, run this command:

# vmkping -s MTUSIZE IPADDRESS_OF_SAN -d

Where MTUSIZE is 9000 minus a header of 216 (that is, 8784), and the -d option indicates “do not fragment”.

5. Ensure that the LUNs are presented to the ESXi/ESX hosts. On the array side, ensure that the LUN IQNs and access control list (ACL) allow the ESXi/ESX host HBAs to access the array targets. For more information, see Troubleshooting LUN connectivity issues on ESXi/ESX hosts (1003955).

Additionally, ensure that the HOST ID on the array for the LUN (on ESX it shows up as the LUN ID) is not greater than 255. The maximum LUN ID is 255, so any LUN that has a HOST ID greater than 255 may not show as available under Storage Adapters, even though on the array it may reside in the same storage group as other LUNs with host IDs of 255 or less. This limitation exists in all versions of ESXi/ESX from ESX 2.x to ESXi 5.x. This information can be found in the configuration maximums guide for the particular version of ESXi/ESX having the issue.

6. Verify that a rescan of the HBAs displays presented LUNs in the Storage Adapters view of an ESXi/ESX host. For more information, see Performing a rescan of the storage on an ESXi/ESX host (1003988). (A PowerCLI sketch of this rescan is included after this list.)

7. Verify your CHAP authentication. If CHAP is configured on the array, ensure that the authentication settings for the ESXi/ESX hosts are the same as the settings on the array. For more information, see Checking CHAP authentication on the ESXi/ESX host (1004029).

8. Consider pinging any ESXi/ESX host iSCSI initiator (HBA) from the array’s targets. This is done from the iSCSI host.

9. Verify that the storage array is listed on the Storage/SAN Compatibility Guide. For more information, see Confirming ESXi/ESX host hardware (System, Storage, and I/O) compatibility (1003916).

Note: Some array vendors have a minimum-recommended microcode/firmware version to operate with VMware ESXi/ESX. This information can be obtained from the array vendor and the VMware Hardware Compatibility Guide.

10. Verify that the physical hardware is functioning correctly, including:

◦ The Storage Processors (sometimes known as heads) on the array

◦ The storage array itself

◦ The SAN and switch configuration, especially if you are using jumbo frames (supported from ESX 4.x). To test the ping to a storage array with jumbo frames from ESXi/ESX, run this command:

# vmkping -s MTUSIZE STORAGE_ARRAY_IPADDRESS

Where MTUSIZE is 9000 minus a header of 216, which is 8784.

Note: Consult your storage array vendor if you require assistance.

11. Perform some form of network packet tracing and analysis, if required. For more information, see:

◦ Capturing virtual switch traffic with tcpdump and other utilities (1000880)

◦ Troubleshooting network issues by capturing and sniffing network traffic via tcpdump (1004090)
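If you manage your hosts with VMware PowerCLI, the HBA check and rescan from steps 3 and 6 above can also be scripted. This is only a minimal sketch under the assumption that PowerCLI is installed; the vCenter and host names are placeholders.

# Sketch (PowerCLI): rescan storage and list iSCSI HBAs and visible LUNs on one host.
Connect-VIServer -Server vcenter.lab.local          # hypothetical vCenter

$vmhost = Get-VMHost -Name esxi01.lab.local         # hypothetical ESXi host
Get-VMHostStorage -VMHost $vmhost -RescanAllHba -RescanVmfs | Out-Null

Get-VMHostHba -VMHost $vmhost -Type iScsi | Select-Object Device, Status, IScsiName
Get-ScsiLun   -VmHost $vmhost -LunType disk | Select-Object CanonicalName, CapacityGB, MultipathPolicy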