Friday, June 29, 2012

iSCSI

Here are a few concepts I've been studying.  A TPG (Target Portal Group) is basically a way to let a server talk to your storage system via iSCSI through multiple interfaces, with multiple connections in a single session.  This means you can enable MPIO by having multiple TPGs.  Here's the dummy breakdown (with a quick sketch of these relationships after the list):
- Each interface (virtual or real) can only be part of one TPG.
- Each TPG can have multiple interfaces.
- Each session can have multiple connections.
- Each connection can only be in one session.
- Each session can only communicate through one TPG.
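To make those rules concrete, here's a minimal Python sketch of the relationships.  The class names and checks are mine, purely for illustration, not anything out of a real iSCSI stack:

class Interface:
    """A physical or virtual network interface (portal) on the target."""
    def __init__(self, name):
        self.name = name
        self.tpg = None                    # an interface belongs to at most one TPG


class TargetPortalGroup:
    """Groups one or more interfaces; identified by a TPG tag."""
    def __init__(self, tag):
        self.tag = tag
        self.interfaces = []               # a TPG can hold many interfaces

    def add_interface(self, iface):
        if iface.tpg is not None:
            raise ValueError("an interface can only be part of one TPG")
        iface.tpg = self
        self.interfaces.append(iface)


class Connection:
    """A single TCP connection carrying iSCSI traffic."""
    def __init__(self, interface):
        self.interface = interface
        self.session = None                # a connection belongs to exactly one session


class Session:
    """An iSCSI session; it talks through exactly one TPG."""
    def __init__(self, tpg):
        self.tpg = tpg
        self.connections = []              # a session can carry multiple connections

    def add_connection(self, conn):
        if conn.session is not None:
            raise ValueError("a connection can only be in one session")
        if conn.interface.tpg is not self.tpg:
            raise ValueError("every connection in a session must use the session's TPG")
        conn.session = self
        self.connections.append(conn)

Multiple TPGs then means multiple independent sessions, and those independent sessions are what give MPIO its separate paths.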

A great use of this is when a server hosts multiple virtual OSes and therefore needs multiple connections to the same storage system.  If you have multiple paths, you need ALUA.

When getting your head around ALUA (Asymmetric Logical Unit Access), it helps to know it's also called Target Port Group Support.  Basically it's a protocol for determining the best path from the server to the LUN (hence the LU in ALUA).  The protocol is standardized, so it works with any vendor's iSCSI hardware.
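As a rough illustration of what "determining the best path" means, here's a hypothetical Python sketch that ranks paths by their ALUA access state and picks the preferred one.  The portal addresses and the ranking table are made up for the example:

# Simplified preference ranking of ALUA target port group states
ALUA_RANK = {
    "active/optimized": 0,        # direct path to the controller that owns the LUN
    "active/non-optimized": 1,    # works, but traffic crosses over to the owning controller
    "standby": 2,
    "unavailable": 3,
}

def best_path(paths):
    """paths is a list of (portal, alua_state); return the most preferred usable path."""
    usable = [p for p in paths if p[1] != "unavailable"]
    if not usable:
        raise RuntimeError("no usable path to the LUN")
    return min(usable, key=lambda p: ALUA_RANK[p[1]])

# Example: two portals report different states for the same LUN
paths = [("10.0.0.1:3260", "active/non-optimized"),
         ("10.0.0.2:3260", "active/optimized")]
print(best_path(paths))           # ('10.0.0.2:3260', 'active/optimized')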

iSNS is basically DNS for iSCSI, but a little smarter in that it also understands TPGs and helps systems find each other that way.
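For illustration only, here's a toy Python version of that idea: an iSNS server is essentially a registry that initiators query to find targets along with their portals and TPG tags.  The IQN and addresses below are invented for the example:

# Toy iSNS-style registry: target IQN -> list of (portal, TPG tag)
REGISTRY = {
    "iqn.1992-08.com.example:storage.sn.12345": [
        ("192.168.10.10:3260", 1000),
        ("192.168.20.10:3260", 1001),
    ],
}

def discover(target_iqn):
    """Return every registered portal (and its TPG tag) for a target."""
    return REGISTRY.get(target_iqn, [])

for portal, tpg_tag in discover("iqn.1992-08.com.example:storage.sn.12345"):
    print("log in via %s (TPG %d)" % (portal, tpg_tag))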

Tuesday, June 26, 2012

NetApp SPC Results

NetApp is buzzing over the latest SPC results.  The reason they're important is that we proved we're able to do a lot with a little.  The test was a first in a few ways:
1.  It's our first performance test of the new cluster-mode software, which is going to be huge.
2.  The test was FC, a space where our competitors have historically dismissed NetApp as just the NAS experts.
3.  Looking at the hardware, both in cost and quantity, we delivered the same or better latency up to 250,000 IOPS with fewer controllers and disks.

I know list prices don't mean much, but as a rough guesstimate, look at the 3PAR vs. NetApp comparison in the recoverymonkey blog (linked below).  Here's the breakdown:

3PAR
List: $5,885,148
IOPS: 450,212.66
Latency: 13.67 ms

NetApp
List: $1,672,602
IOPS: 250,039.97
Latency: 3.35 ms


You could buy two of the NetApp solutions for a bit over half the price of the 3PAR system, get roughly 50,000 more IOPS, and run at about a quarter of the latency.  Wow.
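The arithmetic behind that claim, as a quick Python sanity check using the list numbers above:

netapp_price,   netapp_iops,   netapp_latency   = 1672602, 250039.97, 3.35
threepar_price, threepar_iops, threepar_latency = 5885148, 450212.66, 13.67

print(2 * netapp_price / threepar_price)      # ~0.57 -> two NetApp systems cost a bit over half the 3PAR
print(2 * netapp_iops - threepar_iops)        # ~49,867 -> roughly 50,000 more IOPS
print(netapp_latency / threepar_latency)      # ~0.25 -> about a quarter of the latency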


http://www.theregister.co.uk/2012/06/26/netapp_cluster_mode_spc_benchmark/
http://recoverymonkey.org/2012/06/20/netapp-posts-great-cluster-mode-spc-1-result


Disclaimer: I'm affiliated with NetApp and I'm appropriately biased, but not paid for any of the content on this site.

Tuesday, June 5, 2012

NetApp Experience: Shelf ADD => Disk Fail => Failover

During a shelf add last week, I experienced as big a system outage as I've ever encountered on NetApp equipment.  We started seeing a few of these errors, which are normally spurious:


[ses.exceptionShelfLog:info]: Retrieving Exception SES Shelf Log information on channel 0h ESH module A disk shelf ID 4.
[ses.exceptionShelfLog:info]: Retrieving Exception SES Shelf Log information on channel 6b ESH module B disk shelf ID 5.


The first connection went smoothly, but when I unplugged the second connection from the existing loop, I started seeing some scary results.  Here's the order of important messages:

NOTE: Currently 14 disks are unowned. Use 'disk show -n' for additional information.
[fci.link.break:error]: Link break detected on Fibre Channel adapter 0h.

[disk.senseError:error]: Disk 7b.32: op 0x2a:1bc91268:0100 sector 0 SCSI:aborted command -  (b 47 1 4e)
[raid.disk.maint.start:notice]: Disk /aggr3_thin/plex0/rg0/7b.32 Shelf 2 Bay 0  will be tested.
[diskown.errorReadingOwnership:warning]: error 46 (disk condition triggered maintenance testing) while reading ownership on disk 7b.32
[disk.failmsg:error]: Disk 7b.32 (JXWGA8UM): sense information: SCSI:aborted command(0x0b), ASC(0x47), ASCQ(0x01), FRU(0x00).
[raid.rg.recons.missing:notice]: RAID group /aggr3_thin/plex0/rg0 is missing 1 disk(s).
Spare disk 0b.32 will be used to reconstruct one missing disk in RAID group /aggr3_thin/plex0/rg0.
[raid.rg.recons.start:notice]: /aggr3_thin/plex0/rg0: starting reconstruction, using disk 0b.32

[disk.senseError:error]: Disk 7b.41: op 0x2a:190ca400:0100 sector 0 SCSI:aborted command - (b 47 1 4e)
[diskown.errorReadingOwnership:warning]: error 46 (disk condition triggered maintenance testing) while reading ownership on disk 7b.41
[raid.disk.maint.start:notice]: Disk /aggr3_thin/plex0/rg1/7b.41 Shelf 2 Bay 9  will be tested
[disk.senseError:error]: Disk 7b.37: op 0x2a:190ca500:0100 sector 0 SCSI:aborted command - (b 47 1 4e)
[raid.config.filesystem.disk.failed:error]: File system Disk /aggr3_thin/plex0/rg1/7b.37 Shelf 2 Bay 5  failed.
[disk.senseError:error]: Disk 7b.40: op 0x2a:190ca500:0100 sector 0 SCSI:aborted command - (b 47 1 4e)
[raid.config.filesystem.disk.failed:error]: File system Disk /aggr3_thin/plex0/rg1/7b.40 Shelf 2 Bay 8  failed.
[raid.vol.failed:CRITICAL]: Aggregate aggr3_thin: Failed due to multi-disk error

[disk.failmsg:error]: Disk 7b.37 (JXWG6MLM): sense information: SCSI:aborted command(0x0b), ASC(0x47), ASCQ(0x01), FRU(0x00).
[disk.failmsg:error]: Disk 7b.40 (JXWEEB3M): sense information: SCSI:aborted command(0x0b), ASC(0x47), ASCQ(0x01), FRU(0x00).
[raid.disk.unload.done:info]: Unload of Disk 7b.37 Shelf 2 Bay 5 has completed successfully
[raid.disk.unload.done:info]: Unload of Disk 7b.40 Shelf 2 Bay 8  has completed successfully

Waiting to be taken over.  REBOOT in 17 seconds.

[cf.fsm.takeover.mdp:ALERT]: Cluster monitor: takeover attempted after multi-disk failure on partner


Long story short, this system had caused numerous issues in the past, and we replaced both a dead disk and an ESH module.  After that, the system stabilized: "Since the ESH module replacement there were no new loop or link breaks noticed in subsequent ASUPs."