Design & Build

Data Center Solution, Storage-Validation Test: Brocade VCS Fabric and Nimble CS220G-X2 Flash Storage Array

 

 

Preface

 

Overview

The Solid State Ready (SSR) program is a comprehensive testing and configuration initiative to validate the interoperability of Fibre Channel and IP flash storage with a Brocade network infrastructure. This program provides testing of multiple fabrics, heterogeneous servers, NICs and HBAs in large port-count Brocade environments.

 

The SSR qualification program will help verify seamless interoperability and optimum performance of solid state storage in Brocade FC and Ethernet fabrics.

 

Purpose of This Document

The goal of this document is to demonstrate the compatibility of the Nimble CS220G-X2 array in a Brocade Ethernet fabric. This document provides a test report on the SSR qualification test plan executed on the Nimble CS220G-X2 array.

 

Audience

The target audience for this document includes storage administrators, solution architects, system engineers, and technical development representatives.

 

Objective

  1. Test the Nimble CS220G-X2 array with the Brocade VCS Ethernet fabric under different stress and error recovery scenarios to validate the interoperability and integration of the Nimble array with the Brocade VCS fabric.
  2. Validate the performance of the Brocade VCS fabric in a solid state storage environment for high throughput and low latency applications.

Test Conclusions

  1. Achieved a 100% pass rate on all test cases in the SSR qualification test plan. The network and the storage were able to handle the various stress and error recovery scenarios without any issues.
  2. Different I/O workload scenarios were simulated using the Medusa and VMware IOAnalyzer tools, and sustained performance levels were achieved across all workload types. The Brocade VCS fabric handled both low-latency and high-throughput I/O workloads with equal efficiency, without any I/O errors or packet drops.
  3. The results confirm that the Nimble CS220G-X2 array interoperates seamlessly with the Brocade VCS fabric and demonstrates high availability and sustained performance.
  4. For optimal availability and performance, consider configuring host link aggregation to form a vLAG with the Brocade VCS fabric.
  5. Host multipathing tools should be used when connecting to iSCSI target devices so that multiple paths to the storage target are discovered and utilized.
  6. The switches in the VCS fabric should have a sufficient number of ISLs, with multiple uplinks, to provide adequate bandwidth and redundancy.

Related Documents

References

 

Document History

 

Date                           Version                Description

2014-08-20                   1.0                       Initial Release

 

Key Contributors

The content in this guide was provided by the following key contributors.

  • Test Architect: Mike Astry, Patrick Stander
  • Test Engineer: Subhish Pillai

About Brocade

Brocade networking solutions help the world’s leading organizations transition smoothly to a world where applications and information reside anywhere. This vision is realized through the Brocade One™ strategy, which is designed to deliver key business benefits such as unmatched simplicity, non-stop networking, application optimization, and investment protection.

 

Innovative Ethernet and storage networking solutions for data center, campus, and service provider networks help reduce complexity and cost while enabling virtualization and cloud computing to increase business agility.

 

To help ensure a complete solution, Brocade partners with world-class IT companies and provides comprehensive education, support, and professional services offerings.

 

To learn more, visit www.brocade.com.

 

About Nimble

Nimble Storage, Inc. provides a flash-optimized hybrid storage platform. Nimble Storage CS-Series arrays are the building blocks of Adaptive Flash, a storage platform that dynamically and intelligently allocates storage resources to satisfy the changing needs of business-critical applications.

Our Adaptive Flash platform comprises two core innovations: CASL™, our flash-optimized file system, and InfoSight™, our cloud-based management software. CASL is industry leading in its ability to leverage flash and disk for the broadest range of workloads, delivering the performance of flash-only arrays while scaling cost-effectively to petabyte-scale deployments. In addition, CASL delivers best-in-class resilience and application-integrated data protection. InfoSight monitors our customers’ infrastructure in near real time and averts potential problems before they occur, radically simplifying our customers’ day-to-day operations and improving the health of their infrastructure.

 

Test Plan

 

Scope

Testing will be performed with a mix of GA and development versions of Brocade’s Network Operating System (NOS) running on Brocade VDX switches configured to form a Brocade VCS Ethernet fabric.

 

Testing will be at the system level, including interoperability of storage devices with the Brocade VDX switches. Performance is observed within the context of a best-practice fabric configuration; however, absolute maximum benchmark reporting of storage performance is beyond the scope of this test.

 

Details of the test steps are covered in the “Test Case Descriptions” section. The standard test bed setup includes IBM/HP/Dell chassis server hosts with Brocade/QLogic/Emulex/Intel/Broadcom CNAs and NICs, with two uplinks from every host to the Brocade VCS fabric. The 10Gb Ethernet ports on the Nimble array “active” and “standby” controllers are spread across the VCS fabric.

 

Test Configuration

The following shows the devices under test (DUT) and the test equipment.

 

Test Configuration.jpg 

    Test Configuration

 

DUT Descriptions

 

Storage Array

DUT ID               Model        Vendor    Description                                              Notes
Nimble CS220 Array   CS220G-X2    Nimble    Nimble CS series iSCSI array with 640GB of SSD flash.    Each controller has 2x1GbE ports and 2x10GbE ports in an active-standby configuration.

 

Switch

DUT ID             Model          Vendor     Description                         Notes
VDX 6740_1/2       VDX 6740       Brocade    48x10GbE and 4x40GbE QSFP+ ports    Supports Auto-NAS
VDX 6730-32_1/2    VDX 6730-32    Brocade    24x10GbE and 8x8Gbps FC ports
VDX 6730-76_1/2    VDX 6730-76    Brocade    60x10GbE and 16x8Gbps FC ports
VDX 6720-24_1      VDX 6720-24    Brocade    24x10GbE ports
VDX 6720-60_1      VDX 6720-60    Brocade    60x10GbE ports

 

 

DUT Specifications

Device                   Release              Configuration Options         Notes
Nimble CS220G-X2 Array   2.1.4.0-100755-opt   Two-network topology used.    Data and Mgmt traffic configured on separate networks.
VDX 6740                 NOS v4.1.2           VCS Fabric License
VDX 6730-32/76           NOS v4.1.2           VCS Fabric License
VDX 6720-24/60           NOS v4.1.2           VCS Fabric License

 

 

Test Equipment Specifications

Device Type        Model
Server (SRV1-8)    HP DL380p G8, HP DL360 G7, IBM x3630M4, IBM x3650M4, IBM x3550M3, Dell R710, Dell R810
CNA                QLogic (Brocade) 1860-CNA, Brocade 1020-CNA, Emulex OCe14102-UM, Intel X520-SR2, Broadcom NetXtreme II (BCM57711, BCM57810)
Analyzer/Jammer    JDSU Xgig
I/O Generator      Medusa v6.0, VMware IOAnalyzer

 

Configure Equipment

Some of the required and recommended configurations for the test bed systems are covered here.

 

Step 1. Brocade VCS Network Configuration

  1. The Brocade VDX switches are configured to form a Brocade VCS fabric in Logical Chassis cluster mode. Refer to the Network OS Administrator’s Guide (see the Related Documents section above) for the steps to configure the VCS fabric.

  2. Configure a VLAN on the VCS fabric to separate the Nimble iSCSI network and associate all the initiator and target switch ports to that VLAN.
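A minimal NOS CLI sketch of this VLAN assignment is shown below. VLAN 7 and the interface number are illustrative values borrowed from the configuration example later in this document, not a record of the exact test bed configuration:

<==========>
! Create the iSCSI VLAN (VLAN ID is illustrative)
interface Vlan 7
!
! Place an initiator- or target-facing edge port into the iSCSI VLAN
interface TenGigabitEthernet 75/0/18
 switchport
 switchport mode access
 switchport access vlan 7
 no shutdown
<==========>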

Step 2. Nimble Array Configuration

The CS220G-X2 model array has an active-standby controller architecture.

 

  1. All disks on the array are members of the default disk pool and are set up per standard Nimble best practices.
  2. The volumes are configured with the “default” performance policy, with 100% Volume Reserve (thick provisioned) and 100% Volume Quota.
  3. No Snapshot or Replication policies are configured.
  4. For volumes attached to ESX hosts, the “Allow multiple initiator access” box is checked during volume creation.

Nimble Volumes Configuration.jpg 

   Nimble Volumes Configuration

 

  5. The network topology is configured as “Two Dedicated Networks” to separate Management and Data traffic. The 10GbE interfaces on the controllers are bound to the Data subnet and connected to the Brocade VCS fabric. The screenshots below show the network configuration used:

Nimble Network Configuration-Screen 1.jpg

   Nimble Network Configuration-Screen 1

 

Nimble Network Configuration-Screen 2.jpg 

   Nimble Network Configuration-Screen 2

 

Step 3. Host Configuration

 

For Windows Servers

  1. Enable Windows MPIO on the host.
  2. Install the “Nimble Connection Manager” utility for Windows systems from the Nimble Connect support portal. The utility installation will add an entry for the Nimble devices to the MPIO Devices list.

Windows Server MPIO Properties.jpg 

   Windows Server MPIO Properties

 

  3. Use the Nimble Connection Manager utility to discover and connect to the provisioned LUNs (a command-line sketch for checking MPIO follows the screenshots below).

Nimble Connection Manager-Volumes Configuration-Screen 1.jpg    Nimble Connection Manager-Volumes Configuration-Screen 1

 

Nimble Connection Manager-Volumes Configuration-Screen 2.jpg    Nimble Connection Manager-Volumes Configuration-Screen 2
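As an illustrative check (not part of the original procedure), the MPIO feature and the disks claimed by MPIO can be verified from an elevated prompt on Windows Server 2012 or later; command availability varies by Windows version:

<==========>
# Enable the native Windows MPIO feature (a reboot is required before devices are claimed)
Install-WindowsFeature -Name Multipath-IO

# After installing Nimble Connection Manager, list the disks currently claimed by MPIO
mpclaim -s -d
<==========>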

 

For Linux Servers

Install the required iSCSI Initiator and Multipath tools.
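As an illustrative example (package names are distribution-specific and are not taken from the test plan), the iSCSI initiator and device-mapper multipath tools can be installed as follows:

<==========>
# RHEL/CentOS hosts
yum install -y iscsi-initiator-utils device-mapper-multipath

# Debian/Ubuntu hosts
apt-get install -y open-iscsi multipath-tools
<==========>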

 

  1. Add the following to /etc/multipath.conf 

<==========>

# Device entry for Nimble volumes; device sections must be nested inside a devices section
devices {
    device {
        vendor "Nimble"
        product "Server"
        path_selector "round-robin 0"
        features "1 queue_if_no_path"
        no_path_retry 20
        path_grouping_policy group_by_serial
        path_checker tur
        rr_min_io 20
        failback immediate
        rr_weight priorities
    }
}

<==========>
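After editing the file, the multipath configuration should be reloaded so the new device policy takes effect. This is a generic sketch; the exact service command depends on the distribution:

<==========>
# Reload multipathd and rebuild the multipath maps
service multipathd reload
multipath -r

# Verify that the Nimble LUNs are claimed with the new policy
multipath -ll
<==========>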

 

  2. Discover the Nimble iSCSI target using the “iscsiadm” utility and log in to the discovered nodes.

<==========>

iscsiadm -m discovery -t st -p 192.168.7.214

iscsiadm -m node -L all

<==========>

 

  3. Set the iSCSI timeout settings for Nimble storage. The iSCSI timeout parameters in the /etc/iscsi/iscsid.conf file should be set as follows:

 

<==========>
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 10
<==========>

 

Linux Server Configuration Example

<==========>

# iscsiadm -m node

192.168.7.218:3260,2460 iqn.2007-11.com.nimblestorage:hb067123-vol3-v7bae3bba8a170f6f.0000003b.f1b934d5

192.168.7.217:3260,2460 iqn.2007-11.com.nimblestorage:hb067123-vol3-v7bae3bba8a170f6f.0000003b.f1b934d5

 

# iscsiadm -m session

tcp: [31] 192.168.7.218:3260,2460 iqn.2007-11.com.nimblestorage:hb067123-vol3-v7bae3bba8a170f6f.0000003b.f1b934d5

tcp: [32] 192.168.7.217:3260,2460 iqn.2007-11.com.nimblestorage:hb067123-vol3-v7bae3bba8a170f6f.0000003b.f1b934d5

 

# multipath -ll

mpathd (2447f498620a8eb9a6c9ce900d534b9f1) dm-3 Nimble,Server

size=50G features='1 queue_if_no_path' hwhandler='0' wp=rw

`-+- policy='round-robin 0' prio=1 status=active

  |- 38:0:0:0 sde 8:64  active ready running

  `- 37:0:0:0 sdd 8:48  active ready running

<==========>

 

For VMware Servers

  1. Create a VMkernel port bound to a physical adapter, set up IP connectivity to the Nimble iSCSI target IP, and create a binding between the VMkernel port and the “iSCSI Software Adapter”.

VMware iSCSI Adaptor Configuration.jpg 

   VMware iSCSI Adaptor Configuration

 

  2. Add the discovery IP addresses under the “Dynamic Discovery” tab and rescan the adapter to discover and list the iSCSI devices.

 

  3. Change the “Path Selection” policy on all the devices to “Round Robin” (an esxcli sketch of these steps follows the screenshots below).

VMware iSCSI Adapter Scan.jpg 

   VMware iSCSI Adapter Scan

  

VMware iSCSI Multipath Configuration.jpg

   VMware iSCSI Multipath Configuration
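The same binding, discovery, and path-policy steps can also be scripted from the ESXi shell. The sketch below is illustrative only: the software iSCSI adapter name (vmhba33), the VMkernel port (vmk1), and the device identifier are placeholders, and the discovery address reuses the one shown in the Linux example above:

<==========>
# Bind the VMkernel port to the software iSCSI adapter
esxcli iscsi networkportal add --adapter vmhba33 --nic vmk1

# Add the Nimble discovery address and rescan for devices
esxcli iscsi adapter discovery sendtarget add --adapter vmhba33 --address 192.168.7.214
esxcli storage core adapter rescan --adapter vmhba33

# Set the Round Robin path selection policy on a discovered device
esxcli storage nmp device set --device eui.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR
<==========>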

 

Step 4. Other Recommendations

  1. Configure host LACP link aggregation forming a vLAG across multiple switches in the VCS fabric. This provides fault tolerance at the host and allows better utilization of the available links.
  2. Enable Jumbo Frames (MTU=9000) on the host interfaces and the corresponding switch ports in the VCS fabric.
  3. For host adapters supporting the Data Center Bridging (DCB) protocol for iSCSI, DCB QoS can be configured on the VCS fabric. Here is an example CEE map with iSCSI CoS value “4” configured in addition to the Auto-NAS configuration (a vLAG configuration sketch for recommendation 1 follows this example).

 <==========>

# show running-config protocol lldp

protocol lldp

 advertise dcbx-iscsi-app-tlv

  

# show running-config cee-map

cee-map default

 precedence 1

 priority-group-table 1 weight 40 pfc on

 priority-group-table 15.0 pfc off

 priority-group-table 15.1 pfc off

 priority-group-table 15.2 pfc off

 priority-group-table 15.3 pfc off

 priority-group-table 15.4 pfc off

 priority-group-table 15.5 pfc off

 priority-group-table 15.6 pfc off

 priority-group-table 15.7 pfc off

 priority-group-table 2 weight 20 pfc off

 priority-group-table 3 weight 20 pfc off    → NAS

 priority-group-table 4 weight 20 pfc on     → iSCSI

 priority-table 2 2 3 1 4 2 2 15.0    

 remap fabric-priority priority 0

 remap lossless-priority priority 0

 

# sh run int te 75/0/18

interface TenGigabitEthernet 75/0/18

 cee default

 mtu 9216

 fabric isl enable

 fabric trunk enable

 switchport

 switchport mode access

 switchport access vlan 7

 spanning-tree shutdown

 no shutdown

<==========>
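For recommendation 1 above, a host-facing vLAG is formed by adding member ports on different fabric switches to the same port-channel. The following is a sketch only, assuming port-channel 10 and one illustrative member port each on RBridge 75 and RBridge 76; LACP must also be enabled on the host NIC team:

<==========>
! Port-channel (vLAG) definition; the same ID on multiple RBridges forms the vLAG
interface Port-channel 10
 mtu 9216
 switchport
 switchport mode access
 switchport access vlan 7
 no shutdown
!
! Member ports on two different switches (interface numbers are illustrative)
interface TenGigabitEthernet 75/0/1
 channel-group 10 mode active type standard
 no shutdown
!
interface TenGigabitEthernet 76/0/1
 channel-group 10 mode active type standard
 no shutdown
<==========>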

 

Test Cases

The following summarizes the test cases used in this test plan.

 

1.0      FABRIC INITIALIZATION – BASE FUNCTIONALITY
1.0.1    Storage Device – Physical and Logical Login with Speed Negotiation
1.0.2    iSCSI LUN Mapping
1.0.3    Storage Device Multipath Configuration – iSCSI Path Integrity
1.1      ETHERNET STORAGE – ADVANCED FUNCTIONALITY
1.1.1    Storage Device – Jumbo Frame/MTU Size Validation
1.1.2    iSCSI Bandwidth Validation
1.1.3    Storage Device – w/Congested Fabric
1.1.4    Storage Device – iSCSI Protocol Jammer Test Suite
1.2      STRESS & ERROR RECOVERY
1.2.1    Storage Device Fabric IO Integrity – Congested Fabric
1.2.2    Storage Device Integrity – Device Recovery from Port Toggle – Manual Cable Pull
1.2.3    Storage Device Integrity – Device Recovery from Device Relocation
1.2.4    Storage Device Stress – Device Recovery from Device Port Toggle – Extended Run
1.2.5    Storage Device Recovery – ISL Port Toggle – Extended Run
1.2.6    Storage Device Recovery – ISL Port Toggle (Entire Switch)
1.2.7    Storage Device Recovery – Switch Offline
1.2.8    Storage Device Recovery – Switch Firmware Download HCL (where applicable)
1.2.9    Workload Simulation Test Suite

 

Test Case Descriptions

 

1.0 FABRIC INITIALIZATION – BASE FUNCTIONALITY

 

1.0.1 Storage Device – Physical and Logical Login with Speed Negotiation

 

Test Objective

  1. Verify device login to the VDX switch with all supported speed settings.
  2. Configure the VDX switch for Auto-NAS.
  3. Configure Storage Port for iSCSI connectivity. Validate Login & base connectivity.
  4. Configure Storage Port for NAS connectivity. Validate Login & base connectivity.

Procedure

  1. Change the switch port speed to Auto and 10G. [Setting the speed to 1G requires a supported SFP.]
  2. Validate link states on the array and IP connectivity between the array and hosts.

Result

1. PASS. IP connectivity verified.

 

1.0.2 iSCSI LUN Mapping

 

Test Objective

  1. Verify host to LUN access with each mapped OS-type.

Procedure

  1. Establish IP connectivity between host and array.
  2. Create Host Groups and LUNs on the array with access to iSCSI initiator IQN.
  3. Verify host login to the target and read/write access to the LUNs (see the sketch below).
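On a Linux initiator, for example, the logged-in sessions and the LUNs presented through them can be checked as follows (a generic sketch, not output from this test bed):

<==========>
# Show active iSCSI sessions along with the attached SCSI devices and their state
iscsiadm -m session -P 3

# List the block devices presented by the array
lsblk
<==========>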

Result

1. PASS. Able to perform read/write operations on the LUNs.

 

1.0.3 Storage Device Multipath Configuration – iSCSI Path Integrity

 

Test Objective

  1. Verify multipath configures successfully.
  2. Each adapter and storage port resides on a different switch.
  3. For all device paths, consecutively isolate individual paths and validate I/O integrity and path recovery.

Procedure

  1. Set up the host with at least 2 initiator ports. (Create a LAG on the host ports or assign IP addresses to each port in the same subnet to access the target.)
  2. Set up multipath on the host.
  3. Establish iSCSI target sessions on both of the target IP addresses.
  4. Start I/O.
  5. Perform sequential port toggles across the initiator and target switch ports to isolate paths (a switch-side sketch follows this list).
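Port toggles on the switch side can be performed by shutting and re-enabling the edge interface. A minimal NOS CLI sketch is shown below; the interface number is illustrative:

<==========>
! Toggle an edge port to isolate one path (interface number is illustrative)
interface TenGigabitEthernet 75/0/18
 shutdown
! ... wait for I/O to fail over to the remaining path, then restore it ...
 no shutdown
<==========>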

Result

1. PASS. I/O failed over to the remaining available paths and recovered when the disrupted path was restored.

 

1.1 ETHERNET STORAGE – ADVANCED FUNCTIONALITY

 

1.1.1 Storage Device – Jumbo Frame/MTU Size Validation

 

Test Objective

  1. Perform IO validation testing while incrementing MTU Size from minimum to maximum with reasonable increments.
  2. Include Jumbo Frame size as well as maximum negotiated/supported between device and switch.

Procedure

  1. Set the MTU on the storage interfaces to 1500, 3000, 6000, and 9000.
  2. Verify I/O operations complete at all MTU sizes (a host-side verification sketch follows this list).
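As an illustrative host-side check (not part of the original procedure), the MTU can be set on a Linux initiator interface and the jumbo path verified end to end with a non-fragmenting ping. The interface name is a placeholder, and the target IP reuses the discovery address from the Linux example:

<==========>
# Set a 9000-byte MTU on the host data interface (eth2 is a placeholder name)
ip link set dev eth2 mtu 9000

# Verify the jumbo path to the array: 8972 = 9000 bytes minus 28 bytes of IP/ICMP headers
ping -M do -s 8972 -c 5 192.168.7.214
<==========>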

Result

1. PASS. Verified I/O completed without issues. Verified I/O size adapts to the changed MTU value.

 

1.1.2 iSCSI Bandwidth Validation

 

Test Objective

  1. Validate maximum sustained bandwidth to storage port via iSCSI.
  2. After 15 minutes, verify I/O completes error free.

Procedure

  1. Start iSCSI I/O to the storage array from multiple iSCSI initiators.
  2. Verify I/O runs without errors.

Result

1. PASS. All I/O operations completed without errors. 

 

1.1.3 Storage Device – w/Congested Fabric

 

Test Objective

  1. Create network bottleneck through a single Fabric ISL.
  2. Configure multiple ‘iSCSI/NAS to host’ data streams sufficient to saturate the ISL’s available bandwidth for 30 minutes. 
  3. Verify IO completes error free.

Procedure

  1. Start iSCSI I/O to the storage array from multiple hosts.
  2. Disable redundant ISL links in the VCS fabric to isolate a single ISL.
  3. Verify I/O runs without errors.

Result

1. PASS. I/O completed successfully on all hosts.

 

1.1.4 Storage Device – iSCSI Protocol Jammer Test Suite

 

Test Objective

  1. Perform Protocol Jammer Tests including areas such as:
    - CRC corruption,
    - packet corruption,
    - missing frame,
    - host error recovery,
    - target error recovery

Procedure

  1. Insert Jammer device in the I/O path on the storage end.
  2. Execute the following Jammer scenarios:
    - CRC corruption
    - Drop packets to and from the target.
    - Replace IDLE with Pause Frame
  3. Verify Jammer operations and recovery with Analyzer.

Result

1. PASS. I/O recovered in all instances after the jammer operations.

 

1.2 STRESS & ERROR RECOVERY

 

1.2.1 Storage Device Fabric IO Integrity – Congested Fabric

 

Test Objective

  1. From all available initiators, start a mixture of READ/WRITE/VERIFY traffic with random data patterns continuously to all their targets overnight. 
  2. Verify no host application failover or unexpected change in I/O throughput occurs.
  3. Configure fabric & devices for maximum link & device saturation.
  4. Include both iSCSI & NAS/CIFS traffic. (if needed -- add L2 Ethernet traffic to fill available bandwidth)

Procedure

  1. Start iSCSI I/O to the storage array from multiple hosts.
  2. Setup a mix of READ/WRITE traffic.
  3. Verify all I/O complete without issues.

Result

1. PASS. All I/O completed without errors.

 

1.2.2 Storage Device Integrity – Device Recovery from Port Toggle – Manual Cable Pull

 

Test Objective

  1. With I/O running, perform a quick port toggle on every Storage Device & Adapter port.
  2. Verify host I/O will recover.
  3. Sequentially performed for each Storage Device & Adapter port.

Procedure

  1. Set up multipath on the host and start I/O.
  2. Perform multiple iterations of sequential port toggles across initiator and target switch ports.

Result

1. PASS. I/O failed over and recovered successfully.

 

1.2.3 Storage Device Integrity – Device Recovery from Device Relocation

 

Test Objective

  1. With I/O running, manually disconnect and reconnect port to different switch in same fabric.
  2. Verify host I/O will failover to alternate path and toggled path will recover.
  3. Sequentially performed for each Storage Device & Adapter port.
  4. Repeat test for all switch types.

Procedure

  1. Set up multipath on the host and start I/O.
  2. Move storage target ports to different switch ports in the fabric.

Result

1. PASS. I/O failed over and recovered successfully.

 

1.2.4 Storage Device Stress – Device Recovery from Device Port Toggle – Extended Run

 

Test Objective

  1. Sequentially toggle each Initiator and Target port in the fabric.
  2. Verify host I/O will recover to alternate path and toggled path will recover.
  3. Run for 24 hours.

Procedure

  1. Set up multipath on the host and start I/O.
  2. Perform multiple iterations of sequential port toggles across initiator and target switch ports.

Result

1. PASS. I/O failed over and recovered successfully.

 

1.2.5 Storage Device Recovery – ISL Port Toggle – Extended Run

 

Test Objective

  1. Sequentially toggle each ISL path on all switches.  Host I/O may pause, but should recover.
  2. Verify fabric ISL path redundancy between hosts & storage devices.
  3. Verify host I/O throughout test.

Procedure

  1. Set up host multipath with links on different switches in the VCS fabric and start I/O.
  2. Perform multiple iterations of sequential ISL toggles across the fabric.

Result

1. PASS. I/O re-routes to the available paths in the VCS fabric and recovers when the link is restored.

 

1.2.6 Storage Device Recovery – ISL Port Toggle (Entire Switch)

 

Test Objective

  1. Sequentially, and for all switches, disable all ISLs on the switch under test.
  2. Verify fabric switch path redundancy between hosts & storage devices.
  3. Verify switch can merge back in to the fabric.
  4. Verify host I/O path throughout test.

Procedure

  1. Set up host multipath with links on different switches in the VCS fabric and start I/O.
  2. Perform multiple iterations of sequentially disabling all ISLs on a switch in the fabric.

Result

1. PASS. I/O failed over to alternate path and recovered once the switch merged back in the fabric.

 

1.2.7 Storage Device Recovery – Switch Offline

 

Test Objective

  1. Toggle each switch in sequential order. 
  2. Include switch enable/disable, power on/off, and reboot testing.

Procedure

  1. Set up host multipath with links on different switches in the VCS fabric and start I/O.
  2. Perform multiple iterations of sequential disable/enable, power on/off and reboot of all the switches in the fabric.

Result

1. PASS. I/O failed over to alternate path and recovered once the switch merged back in the fabric.

 

1.2.8 Storage Device Recovery – Switch Firmware Download HCL (Where Applicable)

 

Test Objective

  1. Sequentially perform the firmware maintenance procedure on all device-connected switches under test.
  2. Verify host I/O will continue (with minimal disruption) through the “firmware download” and device pathing will remain consistent.

Procedure

  1. Set up host multipath with links on different switches in the VCS fabric and start I/O.
  2. Sequentially perform firmware upgrades on all switches in the fabric.

Result

1. PASS. I/O failed over during the switch reloads. All switches need to be at the same code level to rejoin the fabric.

 

1.2.9 Workload Simulation Test Suite

 

Test Objective

  1. Validate Storage/Fabric behavior while running a workload simulation test suite.
  2. Areas of focus may include VM environments, de-duplication/compression data patterns, and database simulation.

Procedure

1. Set up six standalone hosts for driving iSCSI I/O. Use the Medusa I/O tool for generating I/O and simulating workloads.

- Run random and sequential I/O in a loop at block transfer sizes of 512, 4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, and 1m. Include a nested loop of 100% read, 100% write, and 50% read/write.

- Run File Server simulation workload

- Run Microsoft Exchange Server simulation workload

 

2. Set up an ESX cluster of two hosts with four worker VMs per host. Use the VMware IOAnalyzer tool for generating I/O and simulating workloads.

- Run random and sequential IO at large and small block transfer sizes.

- Run SQL Server simulation workload

- Run OLTP simulation workload

- Run Web Server simulation workload

- Run Video on Demand simulation workload

- Run Workstation simulation workload

- Run Exchange server simulation workload

 

Result

1. PASS. All workload runs were monitored at the host, storage, and fabric, and all runs completed without any I/O errors or faults.