IT engineering and a little bit of hacking: April 2011

Wednesday, April 27, 2011

"Directory cannot be written"

I needed to install a .jnlp file on my fresh Windows 7 install and received "This directory cannot be written! Please choose another directory!" after I confirmed the install path:

C:\Program Files (x86)\ProgramName\

I'm sure there are several nifty ways to accomplish this, but my hands were tied by the Group Policies set by my company's IT staff (Note: if you have engineers, give them as much power as possible over their own tools). Since Googling didn't turn up any quick solutions, here was my workaround:

Start
Type "cmd"
Right click the result
Select "Run as administrator"
Type in the path to the .jnlp file
Press enter

This allowed the installation to complete; however, the program wouldn't run after install. I uninstalled, re-downloaded the .jnlp using IE in Admin mode, and re-ran the install.

Booyah.

(Alternate solution: turn off UAC!)

Tuesday, April 26, 2011

NetApp Training Brain Dump: Troubleshooting

Beginner tips for troubleshooting with NetApp FAS Systems:

Special Boot Menu - Option "4a" (Pre-ONTAP 8.0 only)

Look here for a mapping of how to access this menu.
Essentially a factory reset.
Wipes out the data on all the disks and re-establishes RAID-DP.

Also wipes out the OS (obviously), which resides on the data drives.
Wipes out all customization and settings.

Re-installs a base level of the Data ONTAP OS from the Compact Flash.
Creates a 3-disk aggregate with a flexible volume.
Boots into setup.
Does not wipe out the RLM settings/IP address.

RLM (aka BMC and SP and RMC)

Each controller has its own RLM
Port is referred to as e0p and labeled with a wrench on the back of the CPU module.
system power {off | on | cycle | status} controls power to the controller from the RLM.
Has its own IP and is a completely separate entity from the controllers. You can think of this as a direct KVM (keyboard, video, mouse) into the controller. Same basic functionality as an HP ILO/IBM RSA/IBM IMM.
system console drops you down to the controller console.
Control-D brings you back to the RLM console.

aggr status -r is your friend. If you want a basic view of the health of the machine, this command will show you what data is healthy and presenting, and whether any of your aggregates have an issue.
disk show {-v | -a} is very useful for checking physical disk details, such as controller ownership.
LEDs

LEDs are not 100% reliable.
Solid green usually means the system identifies the disk, is able to use it, but possibly is not using it.
Blinking green means the disk is healthy and in use.
Obviously, amber is bad news.
You can manually turn on LEDs one drive at a time, or a whole loop (update: be careful with whole loop LEDs. Word is it can freak out your system and take over IO).
The NVRAM LED should be totally off when you pull out the CPU Module. If that light is still on, then data was not able to be flushed to the disk and is being kept alive by the battery. This is bad news bears. Exception: "if the controller was waiting for giveback, the flashing led can be ignored."
When you enter maintenance mode, all LEDs will be on automatically.

Easy to confuse (Pre-ONTAP 8.0 only):

Look here for a mapping of how to access these menus.
Maintenance Mode:

Prompt: "*>"
Accessible from Special Boot Menu.
TBD (I'll update later)
"halt" to get out.

Diagnostic mode:

Prompt: "Enter Diag, Command, or Option:"
Accessible from CFE> shell.
Allows you to run hardware diagnostic tests, among other things.
"exit" to get out.

ONTAP versions are customized per FAS system. FAS2020 ONTAP 7.0 is not the same code as FAS3020 ONTAP 7.0.

Are commands universal though? I'll update later.

Replacing the PCI NVRAM card changes the system ID of the controller, as the ID is hard coded into the NVRAM logic. (what models is this true for? I'll update later)
Control-R: At the console, ONTAP kicks out messages regularly. These can interrupt you mid-command. Use Control-R to start a new line and (r)etrieve what you'd typed.
Control-C:

Use this to cancel any process you've started and get a new line.
Use this to quit whatever command you've been typing and get a new line.

Control-G: At the console, use this to access the RLM console.
Set PuTTY up to automatically save your logs.

NetApp Experience: 4a Issue

Putting this on the web because I couldn't find a single page on the internet with this error on it. This will be kind of a sparse post since it's not meant to be noob-friendly.

I was working on 4a-ing a controller on a FAS3050c with 1 shelf of 500GB SATA drives. Here's the series of events:

I had 4a'd the system and the console said it had completed zeroing disks.
RLM Console stopped responding, LCD read "Zeroing disks"
Plugged directly into console, still not responding
Console issued these errors:

Tue Apr 26 16:47:07 GMT [ems.engine.inputSuppress:info]: Event 'driver.com.overflow' suppressed 66 times since Tue Apr 26 16:42:07 GMT 2011.
Tue Apr 26 16:47:07 GMT [driver.com.overflow:info]: serial controller's input buffer is full.

I power cycled with RLM
FAS booted up, stated the controller had panicked, and did a core dump. Rebooted.
I 4a'd again.
Repeated "driver.com.overflow" errors.
Came back up just fine. I was able to netboot and get the system up and running.
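
For anyone who saves their console logs and wants to count how often an event like this fires, here is a minimal Python sketch. The line format in the regex is inferred only from the two messages pasted above, not from any NetApp specification, and the function name parse_console_line is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the two console lines in this post (an assumption):
#   <day> <month> <date> <hh:mm:ss> <tz> [<event name>:<severity>]: <message>
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \w+) "
    r"\[(?P<event>[\w.]+):(?P<severity>\w+)\]: "
    r"(?P<message>.*)$"
)

def parse_console_line(line):
    """Return the fields of one console message as a dict, or None if it doesn't match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None

lines = [
    "Tue Apr 26 16:47:07 GMT [ems.engine.inputSuppress:info]: Event "
    "'driver.com.overflow' suppressed 66 times since Tue Apr 26 16:42:07 GMT 2011.",
    "Tue Apr 26 16:47:07 GMT [driver.com.overflow:info]: serial controller's "
    "input buffer is full.",
]

# Count how many times each event name shows up in the captured log.
parsed = [p for p in map(parse_console_line, lines) if p]
counts = Counter(p["event"] for p in parsed)
for event, n in counts.items():
    print(event, n)
```

Lines that don't fit the pattern (boot output, prompts) simply come back as None and are skipped.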

Monday, April 25, 2011

Displaying Code in HTML

Found this out today - If you're going to include code in your blog post/web page, there are a few ways to do it.

1. Add an HTML element that creates a code-quoting area that isn't parsed as HTML
2. Use <pre><code>code here</code></pre>. I had mixed results.
3. Just be aware of HTML's special character codes. In my case, I'm not a super HTML wizard, I don't have a slick HTML customization GUI (google's blogspot isn't feature-rich), and I needed to color and customize the text so I couldn't just copy paste into the HTML code section.

Some Tips on option #3:
< >: Use & l t ; (no spaces) for < and & g t ; (no spaces) for > to avoid this hassle when you're composing your blog.
Watch out for words like "object": wrapped in angle brackets, it gets parsed as a real tag, and this can cause headaches (<object> added a video in place of my text).
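
If you'd rather not type the entities by hand, a script can do the escaping for you. This is a minimal sketch using html.escape from Python's standard library; the helper name escape_for_post and the sample string are made up for illustration.

```python
import html

def escape_for_post(code: str) -> str:
    """Replace HTML special characters so a browser shows the text literally."""
    # quote=False leaves " and ' alone; only &, < and > are converted.
    return html.escape(code, quote=False)

sample = "if (a < b && b > c) { <object> }"
print(escape_for_post(sample))
```

Paste the printed result into the HTML view of your post and the angle brackets show up as text instead of being treated as tags.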

List of special characters:
http://htmlhelp.com/reference/html40/entities/special.html

Nifty link to add <> symbols to your words:
http://accessify.com/tools-and-wizards/developer-tools/quick-escape/default.php

NetApp Training Brain Dump: RAID Groups, Disks, and Shelves

A brain dump of fundamental NetApp concepts and terms with regard to RAID Groups (RG), Disks, and Shelves.

Note: You can use this calculator to play with the numbers within NetApp's best practices and find the optimal solution for your environment.

RAID Group Best Practices*1:

An aggregate is made up of RAID Groups. You cannot split a RAID Group between aggregates, but you can have multiple RAID Groups that make up a single aggregate.
Always use RAID-DP, which is an implementation of RAID-6 that uses striped data drives and 2 disks reserved solely for parity and diagonal parity. This allows you to lose two disks per RAID Group without losing data. As the ratio of data disks to parity disks goes up, your space efficiency goes up, but also the risk of losing 3 disks in a RG increases. There are also performance implications for a high data disk to parity disk ratio.
Same size/speed/type disks in a RG. No mix and match here. Any imbalance will result in over-utilization of some disks compared to others (resulting in increased likelihood of failure), as well as sub-par performance and uneven utilization of capacity.
Max RAID group size is 28 disks (for SAS/FC), best practice being 16 disks per RG, for a 14:2 data:parity ratio. This ratio is the balance between data protection and storage utilization: you can lose 2 disks out of 16 and still be up and running. As mentioned before, higher ratios mean less protection but more efficient space utilization.

Interesting note: HP's EVA line uses RAID 5 groups of 8 disks, with the capability of losing 1 disk per RAID group.
Minimum best practice RG is 7 disks.
SATA RG max size is 20 disks as of ONTAP 8.0.1; it was 16 disks before.
Lastly, the performance increase for adding more spindles drops dramatically to a flat line after 14 data drives, partially for the reasons above.

Make RAID groups all the same size, for the same reason that you use disks of the same speed and size: homogeneity creates the most efficient systems.
Remember to take into account the number of disks you have and their size when determining your RAID group layout. ONTAP 8.0 supports aggregates of up to 100TB (depending on the system), but any pre-ONTAP 8.0 systems support only 16TB aggregates. Spares and parity disks are not included in this limit.
Remember, shelves are not owned. Ownership by CPU Modules is disk-level.
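
To put numbers on the data:parity trade-off described above, here is a rough Python sketch. It only counts disks: it assumes two parity disks per RAID-DP group and ignores spares, right-sizing, and any filesystem reserve, so treat the percentages as an upper bound rather than real usable capacity.

```python
PARITY_DISKS = 2  # RAID-DP: one parity disk plus one diagonal-parity disk

def raid_dp_efficiency(group_size: int) -> float:
    """Fraction of the disks in one RAID-DP group that hold data."""
    if group_size <= PARITY_DISKS:
        raise ValueError("a RAID-DP group needs more than 2 disks")
    return (group_size - PARITY_DISKS) / group_size

# Group sizes mentioned in this post: minimum best practice, best practice,
# SATA max (ONTAP 8.0.1), and SAS/FC max.
for size in (7, 16, 20, 28):
    data = size - PARITY_DISKS
    print(f"{size:2d} disks -> {data}:{PARITY_DISKS} data:parity, "
          f"{raid_dp_efficiency(size):.1%} of disks hold data")
```

For the 16-disk best practice this works out to 14/16, or 87.5% of the disks holding data. Bigger groups squeeze out a few more percent, at the cost of more disks sharing the same two parity disks.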

Aggregate Limits by System*2

FAS6080 = 100TB
FAS6070 = 100TB
FAS6040 = 70TB
FAS6030 = 70TB
FAS3170 = 70TB
FAS3160 = 50TB
FAS3140 = 40TB
FAS3070 = 50TB (requires a PVR)
FAS3050 = 40TB (requires a PVR)
FAS3040 = 40TB (requires a PVR)
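
The table above is easy to turn into a lookup when you're planning a layout. This is a sketch only: it assumes 1TB = 1000GB and uses the marketed disk size rather than the right-sized capacity, and per the note above it counts data disks only (spares and parity are outside the limit). The names AGGREGATE_LIMIT_TB, NEEDS_PVR, and max_data_disks are made up for illustration.

```python
# Aggregate limits in TB, copied from the table in this post.
AGGREGATE_LIMIT_TB = {
    "FAS6080": 100, "FAS6070": 100,
    "FAS6040": 70, "FAS6030": 70, "FAS3170": 70,
    "FAS3160": 50, "FAS3140": 40,
    "FAS3070": 50, "FAS3050": 40, "FAS3040": 40,
}

# Models the table marks as "requires a PVR".
NEEDS_PVR = {"FAS3070", "FAS3050", "FAS3040"}

def max_data_disks(model: str, disk_gb: int) -> int:
    """Most data disks of disk_gb each that fit under the model's aggregate limit."""
    limit_gb = AGGREGATE_LIMIT_TB[model] * 1000  # assumption: 1TB = 1000GB
    return limit_gb // disk_gb

model, disk_gb = "FAS3050", 500
note = " (requires a PVR)" if model in NEEDS_PVR else ""
print(f"{model}: up to {max_data_disks(model, disk_gb)} x {disk_gb}GB data disks{note}")
```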

Disks:

When a disk fails, the system "bypasses" the disk and recreates the data on the fly from parity calculations to serve requests. It also pulls any unused disks available into the RAID Group and rebuilds to that disk using the parity calculations.
The term "spare" is nuanced, as only a disk assigned to a controller (but not part of a RG) is considered a spare in proper terminology. This is because only a disk assigned to a controller will be automatically pulled into a RG and built as a data disk. Assigned disks also have a consistent stream of traffic to them, checking them for validity.
In choosing the number of spares for a system, use this document: http://partners.netapp.com/go/techontap/matl/storage_resiliency.html
If 2 disks fail and there are no spares available, you have 24 hours to replace a failed disk before the system halts to protect the data (I would love to find the override for this).
When a "disk show" displays (xx) for a specific disk, that indicates the system can't read that disk at all. Usually this occurs when someone pulls a disk and replaces it before the system is able to recognize the old one was pulled. Typically, you want to give the system 60s after you pull the disk before you put in a new one.

Shelves:

Let shelves spin for at least 2 minutes before powering on the controllers per NetApp best practice.
SFP (Small Form Factor Pluggable): Standard size of a plug-in. Smaller than QSFP.
OSFP (Optical Small Form Factor Pluggable): Fibre connection for FCP, same size as regular SFP. One use for this is from the shelf IOM to the FAS system for SAS. AKA GBIC, Optical SFP Transceiver
QSFP (Quad Small Form Factor Pluggable): Copper connection plug-in. Bigger plug-in than SFP.
DS14mX (Disk Shelf 14-Drive generation X). 14 drives, dual shelf modules.

Nomenclature: a group of DS14 shelves linked together is called a loop.
6 shelves/loop max.
DS14mX-FC is a Fibre Channel shelf supported by FAS systems.

ESH/ESH2/ESH4: Shelf modules with connections back to the FAS system or other shelves. These can use SFP for copper interconnect or OSFPs for FCP.

DS14mX-AT is a SATA shelf supported by FAS systems.

AT-FCX: Shelf module for a SATA shelf to communicate over FCP to other shelves/FAS system.

DSXXXX (DS4243, 2246, etc): SAS/SATA/SSD shelves supported by FAS systems. 24 drives, dual modules (called IOMs).

Naming convention: DS + #U + #drives + throughput per port in Gb. DS4243 is a 4U 24 disk 3Gb shelf.
Nomenclature: a group of linked SAS/SATA DS4243s is called a stack rather than a loop.
IOM3/IOM6: Shelf modules with connections back to the FAS system or other shelves. These use QSFP copper from the IOM to the FAS system/other shelves.
SAS backplanes can take SATA drives, but not vice versa.
10 shelves/stack max.
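
The DS naming convention above is regular enough to decode in a few lines. This sketch assumes the four-digit form described here (one digit for rack units, two for drive count, one for Gb per port); it deliberately rejects DS14-style names, which don't follow that pattern. ShelfModel and parse_shelf_model are made-up names.

```python
import re
from typing import NamedTuple

class ShelfModel(NamedTuple):
    rack_units: int      # height in U
    drives: int          # number of drive bays
    gbit_per_port: int   # throughput per port in Gb

# DS + #U (1 digit) + #drives (2 digits) + Gb per port (1 digit)
_MODEL_RE = re.compile(r"^DS(\d)(\d{2})(\d)$")

def parse_shelf_model(name: str) -> ShelfModel:
    """Decode a shelf name such as DS4243 using the convention above."""
    match = _MODEL_RE.match(name.strip().upper())
    if match is None:
        raise ValueError(f"{name!r} does not follow the DS + U + drives + Gb pattern")
    rack_units, drives, gbit = (int(group) for group in match.groups())
    return ShelfModel(rack_units, drives, gbit)

for name in ("DS4243", "DS2246"):
    print(name, parse_shelf_model(name))
```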

Credit: Me!

Sources:
*1 http://media.netapp.com/documents/tr-3437.pdf
*2 http://lockienotlucky.wordpress.com/2011/01/28/notes-for-netapp-basics/

Great table for Aggregate/Disk size/RAID Group optimization.
http://now.netapp.com/NOW/knowledge/docs/ontap/rel73/html/ontap/rnote/rel_notes/reference/r_oc_rn_feat73_aggr-size-max-drives.html

NetApp Training Brain Dump: Terms and Acronyms

Here is a list of useful acronyms and definitions. The list is seriously incomplete; I'll keep it updated as I learn. These definitions are not meant to be exhaustive, but are meant to be concise and accurate enough to give you the general idea, in plain English, quickly.

Terms:

Anodefile: Haven't defined this yet.
The RLM/BMC/SP have essentially the same functionality. They are control modules for management of the device, giving you remote console access in case other connections go down. RLM is the oldest, SP the newest version of this module.
FlexClone: Copy of an existing volume. Looks like a volume, acts like a volume, takes up no space until you change something from the original. R/W. Basically a writable snapshot.
Snapshot: Point in time copy of an existing volume. Cannot be changed. Snapshots take up no space until data is changed on the original, because it's really just a bunch of pointers that are still pointing toward the original blocks.
Fingerprint database:
Deswizzling: background WAFL scanner establishing the relationship between the PVBNs and VVBNs. Only impacts destination, only impacts volumes.
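
The "bunch of pointers" idea behind Snapshot and FlexClone is easier to see in a toy model. This is only an illustration of the concept as defined above, not how WAFL is actually implemented; the ToyVolume class and all of its method names are made up.

```python
class ToyVolume:
    """Toy volume: data lives in a shared block store, views are lists of pointers."""

    def __init__(self, blocks):
        self.store = {}       # block id -> data, shared by the volume and its snapshots
        self.active = []      # the live volume: an ordered list of block ids
        self.snapshots = {}   # snapshot name -> frozen tuple of block ids
        for data in blocks:
            self.active.append(self._allocate(data))

    def _allocate(self, data):
        block_id = len(self.store)
        self.store[block_id] = data
        return block_id

    def snapshot(self, name):
        # Copies the pointers only, so no new data blocks are used.
        self.snapshots[name] = tuple(self.active)

    def write(self, index, data):
        # New data lands in a new block; any snapshot still points at the old one.
        self.active[index] = self._allocate(data)

    def read(self):
        return [self.store[b] for b in self.active]

    def read_snapshot(self, name):
        return [self.store[b] for b in self.snapshots[name]]


vol = ToyVolume(["a", "b", "c"])
vol.snapshot("hourly.0")
print("blocks used after snapshot:", len(vol.store))
vol.write(1, "B")
print("blocks used after one change:", len(vol.store))
print("live volume:", vol.read())
print("snapshot:   ", vol.read_snapshot("hourly.0"))
```

Taking the snapshot uses no extra blocks; space is only consumed once the live volume changes, and the snapshot still reads back the original data.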

Acronyms:

Product designations*2

Denotes single CPU Module (Controller):

FAS CI (Controller/IOXM): One controller, one IOXM, one chassis. IOXM adds PCI slots to support additional ports.
FAS CB (Controller/Blank): Self explanatory.
FAS E (Expansion): One controller, one IOXM, one chassis.

Denotes only HA (dual) CPU Modules:

FAS CC (Controller/Controller): Two controllers, one chassis
FAS c (Controller/Controller): Two controllers, one chassis
FAS AE (Active/Expansion): Two chassis set up for an HA pair. One controller, one IOXM per chassis.
FAS A (Active/Active): Indicates the same thing as CC.

VIF (Virtual Interface): virtual NIC, also known as a trunked or teamed NIC. They come in single mode (redundancy) or multi mode (load balanced). There are two types of multi-mode VIFs, static EtherChannel and LACP.*1
Static EtherChannel: older protocol for combining NICs into a single virtual NIC. Load balances just as well as LACP.
LACP (Link Aggregation Control Protocol): Newer protocol for combining NICs into a single virtual NIC. This is an enhanced option over static EtherChannel because of better error detection and handling. Pick LACP when possible.
ACP (Alternate Control Path): Backup path for the CPU modules to control the shelf modules.
Wiregauge: software that tests whether a FAS system is correctly wired for MPIO/HA.
NGS: NetApp Global Support.
NRD (non-return disks): Client has paid for the right to keep disks after they've failed. Never take these offsite.
IOXM (Input/Output Expansion Module): Module that goes in the place of a CPU module in a FAS system. Provides more PCI slots for network connectivity.
IOM3/IOM6 (Input/Output module): Redundant shelf module for DS42XX series.
FC-AL (Fibre Channel-Arbitrated Loop):
FC-VI (Fibre Channel-Virtual Interface):
TOI (Transfer of information).
IMAC (Install, Add, Move, Change).
NDMP: Network Data Management Protocol. This protocol sets up communication between the NAS device (e.g. Filer) and your backup device (e.g. tape library), bypassing the backup server. Typically the backup server is running enterprise backup software to facilitate the exchange, but doesn't want to be the middleman in the flow of data. You can think of the backup server as the witness in a duel, and the code of honor each dueler follows as NDMP. When the witness says go, take 10 steps, turn and shoot - very important, but the witness doesn't want to stand in between the two and pass on the bullets :-)
WWPN or WWN (World Wide Port Name): In a SAN, these unique names are used at a software level to route data to and from the correct ports. Each port has its own name that is, ostensibly, unique in the world.
NPIV (N_Port ID Virtualization): How multiple

For more info on this, check out http://blog.scottlowe.org/2009/11/27/understanding-npiv-and-npv/

RDM (Raw Device Mapping): A term for presenting the LUN to the server via SCSI/FCoE/FCP.
VMFS (Virtual Machine File System): VMware's cluster file system.
NDU (Non-Disruptive Upgrades): refers to whether a firmware upgrade takes down the service.
PVBN (Physical Volume Block Numbers): how WAFL identifies the blocks of data. Essentially an address.
VVBN (Virtual Volume Block Numbers): how WAFL identifies the blocks to mirror changes. Each block has the same VVBN on the source and dest volume.
FUD (Fear, Uncertainty, Doubt): how small minded people justify avoiding work.
QSM (Qtree SnapMirror)
RTO (Recovery Time Objective): How long it takes to failover and be up from a disaster.
RPO (Recovery Point Objective): How much data the system will lose in the event of a disaster. Zero RPO means no data lost.
SDS (Storage Design Studio): Software within NetApp Dynamics that is used to design a SAN, including network, disk, and volume levels.
ALUA (Asymmetric Logical Unit Access): Since only one controller at a time owns and writes to each hard drive, ALUA software uses awareness of this to send traffic to the owning controller. This optimizes performance.
LREP (Logical Replication): Used for the first, full transfer of data for either replication or backup.
ASIS (Advanced Single Instance Storage)*3: NetApp's old name for data dedupe.
RBAC (Role Based Access Control)
RAS (Reliability, Availability, and Serviceability)

Positions (loose definitions):

- TPM (Third Party Maintainer): 3rd party NetApp contracted break/fix/install.

- ASE (Accredited Services Engineer): Somewhat analogous to TPM.

- FSE (Field Services Engineer): Third party firefighter. Expert level ASE.

- PSE (Professional Services Engineer): Consultant expert for implementations and base software like Operations Manager.
- TSE (Technical Support Engineer): The guys at NGS.

- PSC (Professional Services Consultant): Consultant guru, highly specialized expert. Architect.

Sources

*1. http://communities.netapp.com/message/8185?tstart=898
*2. http://www.netapp.com/us/communities/tech-ontap/tot-fas3200-boost-1101.html
*3. http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/250-netapp-asis.html

NetApp Training Brain Dump: Quick Hits

Here are some important facts to note that don't really belong in my other posts. I'll keep this updated over the next couple weeks.

Quick Hits:

You can suspend a specific protocol (e.g., turn off CIFS access).
500A098 is NetApp's base WWPN string.
~80% of NetApp's hardware support calls are the hot swappable units (Fan, PSU, Drive).
There are about 40,000 FAS systems in the US, 150,000 in the world (4/24/11).
Don't use HyperTerminal to access a FAS system console. PuTTY is your friend.
Give SAS/SATA/FC their own shelves and loops. Don't mix and match.
The FAS3140 doesn't have the 2 minute PSU countdown to a system halt, but best practice is still 2 minute replacement time.
FAS2000 series internal disks are unusual and specific.
FC disks are up to 750GB now (4/24/11)
ONTAP data blocks are minimum 4k.
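
The 500A098 base string from the list above makes for a quick sanity check when you're staring at a list of WWPNs. A minimal sketch: it assumes the WWPN is 16 hex digits, optionally separated by colons or dashes, and the two sample WWPNs are invented for illustration rather than taken from real hardware.

```python
NETAPP_WWPN_PREFIX = "500a098"  # base string noted in the quick hits above

def normalize_wwpn(wwpn: str) -> str:
    """Strip separators and lowercase; a WWPN is 16 hex digits (64 bits)."""
    digits = wwpn.replace(":", "").replace("-", "").strip().lower()
    if len(digits) != 16 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"{wwpn!r} is not a 16-hex-digit WWPN")
    return digits

def looks_like_netapp(wwpn: str) -> bool:
    """True if the WWPN starts with NetApp's base string."""
    return normalize_wwpn(wwpn).startswith(NETAPP_WWPN_PREFIX)

# Made-up example values, not from a real system.
for wwpn in ("50:0a:09:81:86:f7:c1:2b", "21:00:00:e0:8b:05:05:04"):
    print(wwpn, looks_like_netapp(wwpn))
```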

NetApp Training Brain Dump: ONTAP Commands and System Manager Setup

ONTAP syntax explained:
Symbols:
| means "Or"
[] means the enclosed command or argument is optional.
{x | y} means pick one argument, x or y (required).
< > indicates input argument required here.

Simple Example:
ONTAP> command1 [ -a | command2 <address> | command3 {optionA | optionB}]

This would mean command1 is required, and from there you have 3 options: -a, command2, or command3.
If you go with -a, that's all you can type.
Example: ONTAP> command1 -a

If you go with command2, you need to input an address in the correct format or else the command fails.
Example: ONTAP> command1 command2 10.10.10.192

If you go with command3, you need to select from your preset options.
Example: ONTAP> command1 command3 optionB
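
If it helps to see those rules spelled out as logic, here is a small Python sketch that checks a typed line against the Simple Example syntax. command1, command2, and command3 are the placeholder names from the example, not real ONTAP commands, and treating <address> as an IP address is my assumption based on the sample value.

```python
import ipaddress

def is_valid_simple_example(line: str) -> bool:
    """Check a line against: command1 [ -a | command2 <address> | command3 {optionA | optionB}]"""
    tokens = line.split()
    if not tokens or tokens[0] != "command1":
        return False                      # command1 is required
    rest = tokens[1:]
    if not rest:
        return True                       # everything inside [] is optional
    if rest == ["-a"]:
        return True                       # -a stands alone
    if rest[0] == "command2" and len(rest) == 2:
        try:
            ipaddress.ip_address(rest[1])  # <address>: assumed to be an IP address
        except ValueError:
            return False
        return True
    if rest[0] == "command3" and len(rest) == 2:
        return rest[1] in {"optionA", "optionB"}  # {x | y}: pick exactly one
    return False

for line in ("command1 -a",
             "command1 command2 10.10.10.192",
             "command1 command3 optionB",
             "command1 -a command3 optionB"):
    print(f"{line!r}: {is_valid_simple_example(line)}")
```

The last line fails because | means "or": once you pick -a, you can't also pick command3.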

Advanced Example:
(HINT: a new line does not necessarily indicate any syntax. Watch the brackets.)
ONTAP> command1 [-a | command2 [
   [command3 {optionA | optionB}] [command4 <input>]
   [command5 {optionC | optionD}]]
   [command6 {optionE | optionF}]]

In this case, note that command 5 is the last command you can add to command2.
Example of a valid command: ONTAP> command1 command2 thisobject command6 15

To use command6, you need to add it to command1. You can follow this based upon the red brackets []
TIP: When trying to zero out a setting, try using a space where the argument should be.

System Manager Setup:
Basic notes here:
1. Don't use e0p for this, as it is generally reserved for ACP. Pick an open data ethernet port.
2. Configure a NIC or VIF
3. Enable SNMP:
  a. options snmp.enable on
  b. Verify community
4. Enable SSL:
   a. options ssl.enable on
   b. secureadmin setup ssl
5. Install System Manager on machine.
6. Add storage system from System Manager.

If you receive "API invoke failed" messages, you likely did not enable SSL correctly.

Wednesday, April 13, 2011

Uninstall IE9

I highly recommend not moving to IE9 yet. MS needs to work out a ton of bugs, and the web community has to adjust their sites to the new browser. In the meantime, if you've already made the jump and regret it, I recommend this fantastic walkthrough on settling back into the tried and true IE8.

http://www.tech-recipes.com/rx/7905/ie9-how-to-uninstall-internet-explorer-9/

Monday, April 11, 2011

NetApp Training Brain Dump: Problem and Solution

In SAN management, you're always working to meet several requirements depending on the nature of the data. You may need to just add more space, in which case throwing more disks at the problem might be the best solution. Other issues can be more complex to address, such as increasing performance while working under a strict budget. Below is a detailed progression of options as we search for the best solution.

Potential Options:
1) If you add more disks, the workload might be spread over more spindles and performance may increase.
2) What speed/size disk are you using? Do you have well tiered data management solutions?
3) It might be smart to upgrade the RAM/Cache/Controllers in your SAN storage unit.
4) Utilization management: what applications are driving this change in performance requirements? Can your SQL queries be more efficiently written? Or maybe backup processes are poorly designed?

5) If you have all your high-transaction data on the same disks, you may want to spread that more evenly with your low-IO data. This may require better software to be able to manage this solution (NetApp V-Series).

6) If a small amount of your data is responsible for a disproportionate amount of writes, you may want to implement a Storage Acceleration solution in front of slower disks, giving you the performance you need at a lower cost.
7) If high read access is required, NetApp offers cache upgrades of high performance Flash memory to reduce the impact on your disks.

NetApp V-Series: Somewhat similar to HP SVSP, you place these devices between the clients of storage (servers, etc) and providers of storage (IBM XIV, etc). This allows you to utilize NetApp's awesome software w/ your existing SAN technology.

NetApp SA (Storage Acceleration): Again, place these between your clients and providers of storage. The device determines what data to hold in local, very fast memory, and what to relegate back to the actual SAN disk. Pretty much the same relationship as your laptop's memory vs hard drive. Increases performance.

Flash Cache: Implement this in your SAN storage unit for an inexpensive solution to drastically boost read operations! Marketing says the performance of this technology is "comparable to SSD."

Wednesday, April 6, 2011

NetApp Training Brain Dump: Bird's Eye View

Preparing for a deep dive into NetApp technology! In an intelligence report to King George in 1776, England's spies wrote about John Adams's strength being that he "sees large things largely." I try to take that approach of not getting caught in minutiae when approaching a new technology, to better grasp the big picture. The next few posts will be my journey into that, and I'm sure that in trying to encapsulate complex ideas I will be slightly incorrect in some of these statements. Nuance comes with time! So here we go, basic terms, spelled out in English:

Product Definitions:

- FAS system (aka filer): NetApp's term for the custom machine that manages the storage. Roughly equivalent in purpose to HP EVA, IBM XIV, etc. Capable of serving storage over ethernet NAS (file based protocols like HTTP, FTP, CIFS, etc) or SAN block based protocols (FCoE, iSCSI, or FC). FAS (Fabric Attached Storage) designates that the filer is operating on FCoE, iSCSI, or FC rather than simply as a NAS device.

- SnapVault (OSSV): NetApp's backup solution. Allows full or incremental backups to be transferred from a server directly to a NetApp storage system.

- SnapMirror: Real time replication. Effectively creates software layer RAID 1 by creating exact clones of volumes or qtrees (can't mirror an aggregate from what I've read). This enables NetApp's Metrocluster.

- Metrocluster: their version of DR implementation. Two options: stretch (both controllers in one datacenter) or fabric attached (replication across an ISL (inter-site link) with one controller in each datacenter).

- SyncMirror:

- SnapDrive:

- FlexShare: Allows you to set processing priority for volumes within an aggregate.

- iGroup: Initiator group. All LUNs are mapped to an iGroup, which handles LUN masking based upon the client system. The iGroups basically contain the specifications for the OS-App combo etc to communicate to the LUN. Typically, each server (or cluster) should have its own iGroup based upon the OS, Application (SQL, VMware, etc), and SAN protocol.

Break it down: There are a few layers where the building blocks of storage are combined to form higher level concepts for easier management, each with NetApp-specific jargon. No worries, I'm here to translate and simplify:

- Layer 1: Disk drives. duh.

- Layer 2: RAID Group. This is a group of up to 28 disks operating as a pool of storage, 16 best practice. You want all the RGs in a specific aggregate to be the same size. Two parity disks per RG.

- Layer 2.5: Plex. A plex is a physical copy of the WAFL storage within the aggregate. A mirrored aggregate consists of two plexes; unmirrored aggregates contain a single plex. Take 11 players from the Chicago Bears and NE Patriots, and they're a football team. Move them around a bit, and you can put them in shotgun formation. You can say that they're a set of players (aggregate), and they're distinctly from the Bears and the Patriots (volumes in the aggregate), and that they're a formation (plex)...there are many ways to view the organization of data.

- Layer 3: Aggregate. This is a group of RAID Groups. A RAID group cannot be assigned to more than one Aggregate.

- Layer 4: Volume. This is space carved out inside an aggregate. Typically this is space for 1 LUN + reserve space.
- Layer 5: LUN. This is space carved out inside a volume. There can be multiple LUNs per volume, but that can be inadvisable. The LUN is the actual virtual disk being presented to the server.

- Layer 6: QTree. Essentially, this is space carved out inside a volume for a particular directory, sometimes with a hard limit.

I'll keep these definitions updated as I learn the nuances or need to make corrections.

Monday, April 4, 2011

SAN protocols for dummies

Just doing a brush-up on the basic storage concepts. Noobs will appreciate the simplification in this post; experts will likely find it oversimplified. More detail can be found in the links or more recent posts!

For reference, the OSI model:
7. Application
6. Presentation
5. Session
4. Transport
3. Network
2. Datalink
1. Physical

Quick hits*1:
SCSI, SATA, FC, and SAS are layer 1 and 2 protocols. iSCSI is a layer 5 protocol. Ethernet is a layer 2 protocol. FCP is a layer 1, 2, and above protocol. FCoE is a layer 1-6 protocol.

Photo Courtesy of FCoE.ru

Local protocols: these are how the CPU communicates to the hard drives. They are all layer 1 and 2 protocols specifying the hardware and electronic signals needed to send data between the drive and the CPU. This has trended toward serial protocols (away from parallel) for performance and cost reasons.

1. SCSI (Small Computer System Interface): a high performance parallel standard that specifies hardware level communication over a local BUS.

2. SATA (Serial Advanced Technology Attachment): Slow, inexpensive. Used mainly for unimportant, low change data.

3. Fibre Channel: Fast and expensive, this serial protocol is commonly used in enterprise SANs.

4. SAS (Serial Attached SCSI): The SCSI protocol was modified to take advantage of cost and speed improvements in serial technology. This is also commonly found in enterprise SANs.

5. FATA (Fibre Attached Technology Adapted): Slow, inexpensive. Really is SATA wrapped in a FC interface to gain from shelf technology. This is less used than the other types.

Network Protocols: This is how a server communicates over your SAN to its storage. These protocols essentially virtualize the relationship between a computer and its local drives, allowing your server to think a virtual drive in a datacenter somewhere is actually plugged directly into it.

1. iSCSI: Protocol carries SCSI commands inside TCP/IP packets, so it runs over ordinary ethernet.
Advantage: Works over existing ethernet networks (if given enough bandwidth).
Disadvantage: Some risks involved with having all your traffic on the same cables. Theoretically high overhead since it's higher up on the stack.

2. Fibre Channel Protocol: protocol that requires special fibre cabling and an entire alternate network to support communication.
Advantage: Can be fast, separates traffic and enhances stability.
Disadvantage: Expensive, requires special cabling.

3. Fibre Channel over Ethernet: FCoE. Protocol simulates the FC protocol by wrapping it in ethernet-friendly packets.
Advantage: Can be very fast, and cheaper than FC.
Disadvantage: Theoretically more overhead. Doesn't separate traffic.

Sources:
You can find a great performance discussion here:
http://jmichelmetz.wordpress.com/2010/03/24/fcoe-vs-iscsi-the-cagefight-performance

*1: FCOE Discussion:
http://www.fcoe.ru/index.php?option=com_content&task=view&id=296&Itemid=65&lang=english

FC Discussion
http://bit.ly/dQkn8h