Wednesday, August 31, 2011

Volume Fractional Reserve vs Snap Reserve

(Note: this won't make sense to you unless you already understand snapshots. Read here for a common-sense explanation of how WAFL handles snapshots.)


Fractional Reserve: the amount of space reserved so your data can keep changing while snapshots hold on to the old blocks. In NetApp's words,
"A volume option that enables you to determine how much space Data ONTAP reserves for Snapshot copy overwrites for LUNs, as well as for space-reserved files when all other space in the volume is used. Fractional reserve is generally used for volumes that hold LUNs with a small percentage of data overwrite."


Snapshot Reserve:  In NetApp's words, "a set percentage of disk space for Snapshot copies."


The concept here is that a snapshot can become as large as the original dataset in the volume (100%), and space needs to be reserved for that. Remember that the space occupied by data in the volume is the sum of the existing LUNs/qtrees and any snapshots that exist in that volume. Empty space in the volume is ignored by snapshots.


Consequently, for each snapshot, the Fractional Reserve is between 0% and 100% of the space taken up by data in the volume, plus any space reserved for thick-provisioned LUNs.


Say volume volx is 20GB and has a single thin-provisioned LUN holding 6GB of data. If there is one snapshot of volx, the volume fractional reserve need not be larger than 6GB. That's not to say that 6GB is required; it's just the ceiling.
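To make that concrete, here's a minimal sketch in 7-Mode syntax (volx is the hypothetical volume from above; double-check the option name on your ONTAP version):

filer1> vol options volx fractional_reserve 100
filer1> df -r volx

The first command sets the reserve percentage as a volume option, and df -r shows a "reserved" column so you can see how much space the reserve is actually holding.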


The best explanation on fractional reserve I've seen yet is by Chris Kranz: http://communities.netapp.com/groups/chris-kranz-hardware-pro/blog/2009/03/05/fractional-reservation--lun-overwrite


Quick comparison between Fractional Reserve and Snap Reserve:
With fractional reserve, changes to existing data (aka data overwrites) are written first to blank space, and then to the fractional reserve space. When a LUN is space-reserved (aka fully allocated, or thick), the fractional reserve is space specifically set aside for overwrites to that LUN.

If fractional reserve is set to 100%, the fractional reserve space will be the sum of:
1) The size of all space-reserved LUNs
2) The max size of all snapshots

With snap reserve, new data cannot be written to the reserved space; it is solely for the snapshot blocks that preserve overwritten data. This means that if all the open space is occupied, the LUN cannot grow even if the snap reserve is not full. At that point the data in the LUN can change, but the space taken up by the LUN cannot.

If you are using Qtrees, it makes sense to use snap reserve.  If you are using LUNs, go with fractional reserve.
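Checking or changing the snap reserve is a one-liner (volx again hypothetical):

filer1> snap reserve volx 20
filer1> snap reserve volx

The first line reserves 20% of volx for snapshots; the second, with no percentage, just displays the current setting.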

NCDA Notes: Perf Tools

Quick description of tools used for performance analysis and monitoring in conjunction with a FAS system.
  • netstat lists open connections and statistics on them
  • ifstat lists NICs and statistics on them
  • pktt gathers data on a single interface and the network traffic on that interface for analysis
  • statit used to produce/analyze statistics on component performance
  • stats low-level performance counter data
  • sysstat high-level perf summary: CPs, CPU, protocols, etc.
  • perfmon downloadable all-purpose performance tool. Big gun.
  • netmon essentially a lightweight version of Wireshark.
  • ethereal network analysis tool. The basic idea is it tries to grab all available packets and figure out what's going on.
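To make a few of those concrete, here are invocations that work in 7-Mode (the interface name e0a is just an example; flags vary a bit by version):

filer1> sysstat -x 1
filer1> ifstat -a
filer1> pktt start e0a
filer1> pktt stop e0a
filer1> statit -b
filer1> statit -e

sysstat -x 1 prints the extended counter view once per second; pktt start/stop brackets a packet capture on e0a and writes a trace file you can open in Wireshark/Ethereal; statit -b begins a collection window and statit -e ends it and prints the report.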

Tuesday, August 30, 2011

NetApp Experience: Joining the FAS System to Active Directory

While enabling CIFS, you are asked if you want to join a domain. This can get a bit complicated depending on whether you have privileges to add computers to the domain, and specifically which OU you have rights to add a computer to. Unfortunately, ONTAP lists OUs in a very unhelpful way, so if you aren't able to get the OU=Filers,DC=contoso,DC=com syntax right, you'll find yourself frustrated. Luckily, there's a backdoor way to accomplish this.

"By creating a computer account in Active Directory before the computer joins the domain, you can determine exactly where in the directory the computer account will be placed."*1

So go ahead and create the account in ADUC, and then join the domain using the CIFS setup technique like you normally would.  Problem solved!

*1 http://ptgmedia.pearsoncmg.com/images/9780789736178/samplechapter/0789736179_CH02.pdf
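If you'd rather script the pre-creation than click through ADUC, something like this from a domain-joined Windows box should do it (the computer name and OU path here are made up; substitute your own):

C:\> dsadd computer "CN=FILER1,OU=Filers,DC=contoso,DC=com"

dsadd computer takes the full distinguished name, which gives you exactly the control over OU placement that the quote above describes.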

Saturday, August 27, 2011

NetApp Training Brain Dump: Cabling Standards

I encourage anyone who owns a FAS system to read over either of these two documents and make sure their system is correctly cabled.  It could save you a big headache in the future!

DS4243 Installation and Service Guide

SAS Cabling Guide

Feel free to email me if you need a copy but don't have a NOW account.  My email address can be found in my "about me."

Friday, August 26, 2011

NetApp!

I'm happy to announce that I've received an assignment to work for NetApp as a PSE for the next 6 months.  I'm looking forward to learning a ton surrounded by such smart people! 

Friday, August 12, 2011

NetApp Training Brain Dump: Volume Snapshots

Here's something I bet you've never thought of before: how can a volume snapshot, which is a picture of the state of the volume, be stored in the same volume? It's kind of like trying to take a picture of your entire self, hand included, while holding the camera.

Think about it. A volume snapshot creates pointers to all the blocks in the volume, and retains them no matter what happens to the active data. If you write a couple of new blocks within the volume to record that a new snapshot exists (its name, timestamp, etc.), you've changed the volume, and that data must be written somewhere new, since the old space is all protected. If the entire volume's space were protected, every new snapshot would make the volume read-only until the volume was expanded.

Even crazier: since snapshots reside inside the volume, this means when you take a second volume snapshot, you're taking a snapshot of a snapshot.  Whoa.

Solution:
Volume snapshots don't capture the whole volume: free space is not included. This means that volume snapshots only protect the blocks that are currently occupied by data; the rest of the space carved out for the volume is not part of the deal.

Let's say you have a 20GB volume with a 1GB LUN inside. Take a volume snapshot: the largest that snapshot can get is 1GB, and if your LUN's data changes completely but the LUN does not grow, only 2GB of your 20GB will be occupied by data.

Let's say you have a 20GB volume with four 1GB LUNs inside. Take a volume snapshot: the largest that snapshot can get is 4GB, since that is the sum total of space occupied by data. Let's say that happens: our original four LUNs are still 1GB each, but all their data has changed, so the total space occupied by data is 8GB. If you take a snapshot of the volume now, it will save the state of both the LUNs and the existing snapshot, since snapshots do indeed count as data. The largest your new snapshot can grow is 8GB.
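You can watch this play out with df, which breaks snapshot usage onto its own line. A mocked-up example (columns trimmed and numbers illustrative, not real output):

filer1> df -h volx
Filesystem              total     used    avail  capacity
/vol/volx/               20GB      8GB     12GB       40%
/vol/volx/.snapshot       0GB      4GB      0GB      ---

With snap reserve at 0%, the .snapshot line shows 0 total but still reports the space the snapshots are consuming out of the volume.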

A note of interest: the snapshot info (name of snapshot, date taken, block pointers (aka inode tables), file allocation, and metadata like ACLs) all resides inside the volume.

Wednesday, August 10, 2011

NetApp Training Brain Dump: Snap Command

This post will be pretty straightforward: describing the snap command, which is used to manage snapshots in Data ONTAP. Click here if you need more background on snapshots. There are more options than I'll go through below, but I've done you the favor of leaving out the less useful information and putting the basics on top.

Basics:
snap create <volume name> <snapshot name>
snap delete <volume name> <snapshot name>. Add -a (and omit the snapshot name) to delete all snapshots belonging to the volume.
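A quick session (volume and snapshot names are made up):

filer1> snap create volx before_upgrade
filer1> snap list volx
filer1> snap delete volx before_upgrade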

Advanced: Snap Restore
Note: read this if you need help understanding NetApp's command syntax.
snap restore [ -f ] [ -t vol | file ] [ -s snapshot_name ] [ -r restore_as_path ] vol_name | restore_from_path
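For example, to revert all of volx to a previous snapshot (this assumes a snapshot named nightly.0 exists, and the command requires the SnapRestore license):

filer1> snap restore -t vol -s nightly.0 volx

ONTAP prompts for confirmation before it reverts, since everything written after nightly.0 will be lost.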

Informational:
snap list <volume name>. Run it with no volume name to list the snapshots of every volume; add -l to show date created and retention policy. (Note that -A operates on aggregates rather than volumes.)
snap delta <volume name> <snapshot 1 name> <snapshot 2 name>. Compares two snapshots within a volume and tells you the rate of change between them. Gives some really cool tables. Run it with just the volume name to compare every snapshot in the volume.
snap reclaimable <volume name> <snapshot 1 name> <snapshot 2 name> ... this command can take a while. It calculates the amount of space you can get back by deleting the snapshots you list.
snap rename <volume name> <old snapshot name> <new snapshot name>
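A couple of the informational commands in action (the nightly.* names assume the default schedule):

filer1> snap delta volx nightly.1 nightly.0
filer1> snap reclaimable volx nightly.1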

Systematic:
snap sched <volume name> <#weekly> <#daily> <#hourly>@list. For each #, supply an integer: ONTAP will keep that many snapshots for that time period. For the @list, use 24-hour times to designate when hourly snapshots are taken. For example, a 2 in the #weekly spot keeps the two most recent weekly snapshots, which are taken Sundays at 24:00. Daily snapshots are taken at midnight.
snap autodelete <volume name>. Allows the volume to automatically delete snapshots based on triggers you set. This gets complicated quickly, involving which kinds of locked snapshots you'll allow to be autodeleted and under what circumstances.
snap reserve <volume name> <%> reserves space for snapshots in the given volume.
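Putting the scheduling pieces together (volx hypothetical, as before):

filer1> snap sched volx 2 6 8@8,12,16,20
filer1> snap autodelete volx on

The first line keeps 2 weekly, 6 daily, and 8 hourly snapshots, with the hourlies taken at 08:00, 12:00, 16:00, and 20:00; the second simply switches autodelete on with the volume's current trigger settings.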

Monday, August 8, 2011

NetApp Training Brain Dump: snapmirror.conf

ONTAP saves the information about which SnapMirror relationships have been set up in a file called /etc/snapmirror.conf. What /etc/rc is to ifconfig, /etc/snapmirror.conf is to snapmirror initialize. The basic idea is that even if you set up a SnapMirror relationship, that data replication will cease when the system restarts...unless it is in snapmirror.conf.
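For a taste of the format, here's one line from a hypothetical snapmirror.conf (filer and volume names made up). Each entry is source, destination, arguments, then a cron-style schedule of minute, hour, day-of-month, day-of-week:

filer1:vol2 filer2:vol2 - 15 * * *

The dash means default arguments, and this schedule kicks off an update at 15 minutes past every hour.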

There are several amazing walkthroughs for this file, so I won't be redundant. Please check out the links below from the very talented Chris Kranz!

http://www.wafl.co.uk/tag/snapmirrorconf/

http://www.wafl.co.uk/snapmirrorconf-2/

Saturday, August 6, 2011

NetApp Training Brain Dump: /etc/rc

Interesting history:

"The letters stand for Run-Com, the name of a command
file on early DEC operating systems. The Unix system's
original "rc" file was /etc/rc, which executes commands
when the system "boots" (or starts up). The name spread
to the C shell startup file .cshrc, the Mail startup
file .mailrc, the Berknet initialization file .netrc,
and the Netnews startup file .newsrc. Programmers could
have chosen a better suffix (such as init) but they
wanted to retain a realm of mystery in the system."


http://www.anvari.org/fortune/Miscellaneous_Collections/305494_what-does-rc-stand-for-and-why-are-there-so-many-rc-files.html

Friday, August 5, 2011

NetApp Training Brain Dump: Clusters


Doing my best to translate tech-speak into common sense, one day at a time.


In NetApp, a cluster is two controllers that are both capable of accessing any disk in the system. When data is sent to a particular controller to be written to disk, that data is sent to the local cache, and then mirrored to the partner controller's cache. The purpose of this is for failover: if one controller goes down, the other controller can 100% emulate the failed one and not miss a beat.

This is called a "takeover". If one partner "panics" (fails), the other controller will take over its disks, its IP addresses, and its traffic. Pretty cool. It knows which IP addresses to spoof because when you set it up, you put the partner's addresses in the /etc/rc file. You typically want no more than 50% utilization on either controller, so that in the case of a failover, the surviving controller can handle the total sum of traffic.

When you are confident the failed controller is back up and operational, you can initiate a "giveback," in which the controller coming back online will re-sync with its partner's cache, and then resume owning disks, handling traffic, and getting its IPs back. Givebacks take 1-2 minutes or so, during which the taken-over system is unavailable, and there are complications for people accessing files via CIFS. The giveback command is issued from the partner that took over the down controller.

There are a number of options you can configure to handle this behavior. You can:

  • Alter how file sessions are terminated in CIFS before a giveback, including warning the user.
  • Delay/disable automatic givebacks.
  • Have ONTAP disable long-running operations before giveback.
  • Not allow the up controller to check the down controller before initiating giveback (bad idea).
  • Allow controllers to take each other over in case of hardware failure, and specify what IP/port to notify the partner on.
  • In a MetroCluster, change FSIDs on the partner's volumes and aggregates during a takeover.
  • Change how quickly an automatic takeover occurs if the partner is not responding. 
  • Disable automatic takeovers.
  • Allow automatic takeovers when a discrepancy in disk shelf numbers is discovered.
  • Allow automatic takeovers when one NIC or all NICs fail.
  • Allow automatic takeovers on panic.
  • Allow automatic takeovers on partner reboot.
  • Allow automatic takeovers if the partner fails within 60s of being online.
The command used to initiate and control takeover/giveback is cf. Here are your main options:
  • cf disable: disables the clustering behavior.
  • cf enable: enable the clustering behavior.
  • cf takeover takes down the partner and spoofs it. cf giveback allows the down controller to take back its functionality.  ONTAP won't allow these to be initiated if the system fails checks for whether the action would be disruptive or cause data loss.  
  • cf giveback -f: bypasses ONTAP's first level of checks as long as data corruption and filer error aren't a possibility.
  • cf takeover -f: allows the takeover even if it will abort a coredump on the partner. Add -n to also ignore whether the partner has a compatible version of ONTAP.
  • cf forcegiveback: ignores whether it is safe to do a giveback. Can result in data loss. 
  • cf forcetakeover: ignores whether it is safe to do a takeover.  Can result in data loss. -d bypasses all of ONTAP's checks and initiates the takeover regardless of data loss or disruption. -f also bypasses the prompt for confirmation.
  • cf status will inform you of the current state of the clustered relationship.
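A minimal session showing the flow (output paraphrased, not verbatim; the (takeover) prompt is genuine 7-Mode behavior):

filer1> cf status
Cluster enabled, filer2 is up.
filer1> cf takeover
filer1(takeover)> cf giveback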

Thursday, August 4, 2011

NetApp Training Brain Dump: Data Replication

There are apparently different levels of synchronization, which we can observe from this sentence: "A (level-0 resync in progress) message indicates that a plex cannot be rejoined."*1  


"If the mirrored aggregate was in an initial resynchronization state (level-0) before the forced takeover, you cannot rejoin the two aggregates. You must re-create the synchronous mirror to reestablish the MetroCluster configuration."


Level-0 means that there was never a fully synced dataset: although the relationship is set up, the initial full transfer never completed.

*1 http://hd.kvsconsulting.us/netappdoc/733docs/html/v-series/metro/disaste3.htm

http://publib.boulder.ibm.com/infocenter/cicsts/v3r2/index.jsp?topic=%2Fcom.ibm.cics.ts.intercommunication.doc%2Ftopics%2Fdfht1c0082.html

Monday, August 1, 2011

DATA ONTAP 8.0 Simulator: SnapMirror

Instructions for setting up SnapMirror on two ONTAP 8.0 Simulators.


Your sims probably have the same SYS_SERIAL_NUM.  You'll need to change that on one of them.  Boot to the SIMLOADER prompt (click here if you don't know how) and type the following:


SIMLOADER> SET SYS_SERIAL_NUM="123456-78-4"
SIMLOADER> boot

After that step, the filer just kept rebooting, and after several minutes of trial and error I was able to see error messages pointing at disk ownership.


So, I booted into maintenance mode (click here if you don't know how):
*> disk remove_ownership all
*> disk assign all
*> halt


On reboot, we're in business! Next step: try to ping from one filer to the other using the filer names. Since I don't have DNS, I added each filer to its partner's hosts file.


Filer2> rdfile /etc/hosts
#Auto-generated by setup Mon Jul 25 16:11:01 GMT 2011
127.0.0.1 localhost localhost-stack
127.0.10.1 localhost-10 localhost-bsd
127.0.20.1 localhost-20 localhost-sk
# 0.0.0.0      filer2-e0a
# 0.0.0.0       filer2-e0b
192.168.3.3     filer2-simfasmsoe-e0c
# 0.0.0.0       filer2-e0d


Copy the text above and run
FILER2> wrfile /etc/hosts
<Right-click here to paste the text>
192.168.3.2     filer1


Make sure you press Enter after adding filer1's line, then press Ctrl-C. (wrfile replaces the whole file, which is why you copy the existing contents first.) Then run another rdfile /etc/hosts to verify. Perform this on each filer, and then test by pinging.


Next step is to create the aggregates and volumes you'll need (see the link at the end if you need help). After that, there are a few more setup items, the first being licensing (you can find licenses here; please note that primary and secondary licenses are different).
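If it helps, the volume-creation step looks roughly like this (names and sizes are arbitrary; the key detail is that a SnapMirror destination volume must be restricted before it can be initialized):

filer2> aggr create aggr1 5
filer2> vol create vol2 aggr1 1g
filer2> vol restrict vol2
filer2> license add <code>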


Next steps:

filer1> options snapmirror.enable
snapmirror.enable on
filer2> options snapmirror.enable
snapmirror.enable on

Next steps:

filer1> wrfile /etc/snapmirror.allow
filer2
control-c
filer2> wrfile /etc/snapmirror.allow
filer1
control-c

Last, kick it off!
filer2> snapmirror initialize -S filer1:vol2 filer2:vol2

and...it failed.  Error message: 

Mon Aug  1 15:15:47 CDT [replication.dst.err:error]: SnapMirror: destination transfer from simfas:vol2 to vol2a : could not read from socket.

After turning off my firewalls, this thread tipped me off that it might be an access-list issue with the snapmirror.access option, so more steps!

filer1> options snapmirror.access hosts=filer2
snapmirror.access hosts=filer2
filer2> options snapmirror.access hosts=filer1
snapmirror.access hosts=filer1
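With access sorted out, you can confirm the relationship from the destination (output paraphrased from memory, not verbatim):

filer2> snapmirror status
Source        Destination   State         Lag       Status
filer1:vol2   filer2:vol2   Snapmirrored  00:05:00  Idle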

Awesome.

Credit: The blog below helped guide me through it the first time, but seems to be for an older version so YMMV.
http://blog.szamosattila.hu/index.php/2011/03/netapp-snapmirror-lab-from-scratch