IT engineering and a little bit of hacking: June 2011

Tuesday, June 28, 2011

NetApp Training Brain Dump: ONTAP 8.0 Simulator Network

Problems explained simply.

In a previous post I covered installing the simulator and working through a couple of bugs. Now for another challenge: network configuration. You can't set a NIC to DHCP in ONTAP, which makes things a bit more difficult.

Here's a rundown on the NICs you have:

Bridged: you can consider your bridged card as being on the same network as your host card. So it should have an IP address of 192.168.2.x if your host OS is at 192.168.2.4.

NAT: a NAT card is the equivalent of having multiple devices connected to one of those home routers. You talk to your local network directly when doing local stuff. When accessing the Internet, your router translates your internal IP address to its own external (WAN) IP address before sending your request out. When an answer comes back, the reverse translation is done.

Host-only: normally used to connect to the host or to other VMs on the same host-only network.
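If the NAT description feels abstract, here is a toy model of the translation table in Python. It is only a sketch to show the idea: the class name, the starting port number, and the addresses in the usage note are all made up for illustration, and a real NAT device (including the VMware one) tracks far more state than this.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNat:
    """Hypothetical, heavily simplified NAT: one WAN address, one mapping table."""
    wan_ip: str
    next_port: int = 40000                      # arbitrary starting port (assumption)
    table: dict = field(default_factory=dict)   # wan_port -> (lan_ip, lan_port)

    def outbound(self, lan_ip: str, lan_port: int) -> tuple:
        """Translate an internal source address to the router's external address."""
        for wan_port, inside in self.table.items():
            if inside == (lan_ip, lan_port):
                return self.wan_ip, wan_port    # reuse the existing mapping
        wan_port = self.next_port
        self.next_port += 1
        self.table[wan_port] = (lan_ip, lan_port)
        return self.wan_ip, wan_port

    def inbound(self, wan_port: int):
        """Reverse translation for a reply; None means there is no mapping."""
        return self.table.get(wan_port)
```

For example, `ToyNat("203.0.113.7").outbound("192.168.2.20", 5000)` hands back the WAN address plus a new external port, and `inbound()` on that port returns the original internal address.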

Let's work with the bridged connection, which should be NIC3, referred to as e0c in ONTAP. The e means Ethernet, the 0 means it's onboard, and the c means it's the third port. Go ahead and right-click each connection in the lower right-hand corner of your VMware screen: they should look like little monitors. Disconnect NIC1, NIC2, and NIC4.

Credit: Me!

Now set the IP: in the ONTAP console, type ifconfig e0c <IP address> netmask <netmask>
e.g., ifconfig e0c 192.168.2.10 netmask 255.255.255.0 (use any free address on your host's subnet, not the host's own IP)
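Since there's no DHCP to pick an address for you, it can be worth sanity-checking the one you plan to type. Here's a minimal sketch using Python's standard ipaddress module. The function name is my own invention, and it only checks the basics: same subnet as the host, not the host's own address, and not the network or broadcast address. It can't tell you whether some other device already has that IP.

```python
import ipaddress


def check_sim_address(host_ip: str, netmask: str, sim_ip: str) -> list:
    """Return a list of problems with the proposed simulator IP (empty if none found)."""
    # strict=False lets us build the network from a host address plus netmask
    network = ipaddress.ip_network(f"{host_ip}/{netmask}", strict=False)
    sim = ipaddress.ip_address(sim_ip)
    problems = []
    if sim not in network:
        problems.append(f"{sim} is not on the host's network {network}")
    if sim == ipaddress.ip_address(host_ip):
        problems.append(f"{sim} is already used by the host")
    if sim in (network.network_address, network.broadcast_address):
        problems.append(f"{sim} is the network or broadcast address")
    return problems
```

With a host at 192.168.2.4 and a 255.255.255.0 netmask, a candidate like 192.168.2.10 comes back clean, while 192.168.2.4 itself is flagged as clashing with the host.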

Then type ifconfig e0c to see the results. Note that the NIC should show as 'up' - you see this right before the word "flowcontrol" at the bottom.

Credit: Me!

Great, now it's time to test! I recommend:

ping your gateway from ONTAP
ping your host system from ONTAP (this was spotty for me)
ping ONTAP from your host system
PuTTY to port 22 on ONTAP from your host system.
PuTTY to port 23 on ONTAP from your host system. For some reason, I couldn't do this. Haven't figured that out yet.
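If you'd rather script the last two checks than click through PuTTY, a plain TCP connect test does the job. This is just a sketch with the Python standard library. The function names are mine, and a successful connect only tells you something is listening on the port, not that SSH or telnet logins will actually work.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


def check_services(host: str, ports: dict) -> dict:
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

Something like `check_services("192.168.2.10", {"ssh": 22, "telnet": 23})` run from the host (substituting whatever address you gave e0c) gives a quick yes/no for each port.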

Finally, NetApp has awesomely provided trial licenses for your simulator, covering just about anything you'll want to check out. NetApp login required!

Continue here if you're still having trouble.

Monday, June 27, 2011

NetApp Insights: Pros and Cons (EMC vs NetApp)

Us engineers really only care about the way things are. We're not emotional about technology, and we don't get too attached to this tool or that tool without good reason. Unfortunately, sales guys mess everything up for everyone.

I'm talking of course about the amazing talent that sales and marketing have for muddying up the water and making it very difficult to compare two technologies.

Let's say you're an engineer, and you need some storage. Maybe for VMware, maybe for a database, whatever. Storage costs crazy money, but you don't want to get in the middle of a Sharks vs Jets turf war. You just want to get the best technology and move on with life.

I ran across a guy doing just that recently, and he used some crowd sourcing to let them battle it out in a NetApp FAS3240 vs. EMC VNX5500 debate. Reading it definitely helped me understand the craziness a little better, so enjoy!
http://www.linkedin.com/groupItem?view=&srchtype=discussedNews&gid=45867&item=56257529&type=member&trk=eml-anet_dig-b_pd-ttl-cn

Tuesday, June 21, 2011

NetApp Insights: WINS, HTTPS, CIFS

Couple questions came up recently, one being "Why does my FilerView only work over HTTPS now?" I did some digging into it, and the reason is that SSL is enabled by default on ONTAP 8:

FASNAME> secureadmin status
ssh2 - active
ssh1 - inactive
ssl - active

That option works in tandem with the httpd.admin.ssl.enable option (set to on), which specifically enables HTTPS for FilerView. If you are strongly against using HTTPS, you can turn this setting off, but if you don't have a specific need we'd obviously lean toward the more secure option :-)

"How do I configure WINS?" You have to run cifs setup. This will allow you to enter WINS servers, and you can run CIFS setup without having CIFS licensed. Here's a sample script of responses:

cifs terminate
cifs setup
This process will enable CIFS access to the filer from a Windows(R) system.

Do you want to make the system visible by WINS?: y
IPv4 address(es) of your WINS name server(s): 10.10.10.10
Would you like to specify additional WINS name servers? [n]: y
IPv4 address(es) of your WINS name server(s): 10.10.10.11
Would you like to specify additional WINS name servers? [n]: n
Would you like to reconfigure this filer as an NTFS only filer?: n
Would you like to change this name? n
Selection 1-4: (options are 1 enter a AD domain name, 2 Windows NT domain, 3 workgroup name, 4 NIS/LDAP)
Enter the name:

Done!

When you're done, run options cifs.wins_servers. That command should let you know what WINS servers you have entered.

Pro reading: http://www.redbooks.ibm.com/redpapers/pdfs/redp4074.pdf

Friday, June 17, 2011

NetApp Training Brain Dump: ONTAP 8.0 Boot Menu Option Flowchart

As promised, the 8.0 Boot Menu Options Diagram. Note it is a significantly changed landscape! You can find the 7.3 Boot Menu Options Diagram here.

Credit: Me!

Thursday, June 16, 2011

NetApp Training Brain Dump: DHCP and Soft SAN Zoning

Quick note I feel deserves to be called out: I believe there is no way to set an interface to DHCP. Wild!

Oh, and one more note: "soft" FC SAN zoning does not actually block traffic between ports/WWNs. It works by creating silos in the fabric name service, which essentially refuses to do a lookup for any ports/WWNs outside the requester's zone. "Hard" SAN zoning works much like VLANs on a switch, blocking traffic to ports the sender is not allowed to reach.
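To make the soft vs. hard difference concrete, here's a toy model in Python. Everything in it (zone names, WWN strings, function names) is hypothetical, and real fabrics are far more involved. The point is only that soft zoning filters the name lookup while hard zoning filters the traffic itself.

```python
# Hypothetical zones: each zone is a set of WWNs allowed to see each other
ZONES = {
    "zone_a": {"wwn_host1", "wwn_array1"},
    "zone_b": {"wwn_host2", "wwn_array2"},
}
FABRIC = set().union(*ZONES.values())


def share_zone(a: str, b: str) -> bool:
    """True if both WWNs are members of at least one common zone."""
    return any(a in members and b in members for members in ZONES.values())


def name_server_lookup(requester: str) -> set:
    """Soft zoning: the name service only lists WWNs zoned with the requester."""
    return {w for w in FABRIC if w != requester and share_zone(requester, w)}


def deliver(src: str, dst: str, hard_zoning: bool) -> bool:
    """Return True if a frame from src would reach dst in this toy fabric."""
    if dst not in FABRIC:
        return False
    if hard_zoning:
        return share_zone(src, dst)   # every frame is checked against the zones
    return True                       # soft zoning never inspects the frame
```

In this model `name_server_lookup("wwn_host1")` only returns `wwn_array1`, yet `deliver("wwn_host1", "wwn_array2", hard_zoning=False)` still succeeds if host1 already knows the address. With `hard_zoning=True` the same frame is dropped.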

Wednesday, June 15, 2011

NetApp Training Brain Dump: SnapVault

Data replication (DR) is a pretty easy concept: you copy stuff from this system to another. Ta-da!
Backups are a pretty easy concept: you copy stuff from this system to another spot (either on the same system or another). Ta-da!

Where you'll need to concentrate your brainpower is on the nuances of the implementation, both in terms of what a DR/Backup product does and how you need to use that to keep your data protected. Here's a quick rundown of SnapVault, NetApp's flagship DR/Backup product*1:

SnapVault can copy volumes or Qtrees. You use this product to copy data from one FAS system (primary) to another (secondary, usually cheaper storage system) by setting up a schedule with retention periods. SV will delete ("roll off") old backups as they get too old based upon how you set it up ("only keep 3 latest copies...").
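The "roll off" behaviour is easy to picture as code. This is a sketch of the idea only, not how SnapVault is implemented: the function name and the use of plain timestamps to stand in for backups are my assumptions.

```python
from datetime import datetime, timedelta


def roll_off(snapshots: list, keep: int) -> tuple:
    """Split snapshot timestamps into (kept, deleted), keeping the newest `keep`."""
    ordered = sorted(snapshots, reverse=True)   # newest first
    return ordered[:keep], ordered[keep:]


# Five daily backups with a "keep only the 3 latest copies" retention setting
start = datetime(2011, 6, 1)
dailies = [start + timedelta(days=i) for i in range(5)]
kept, deleted = roll_off(dailies, keep=3)
```

Here `kept` holds the three newest dates and `deleted` holds the two oldest, which is the roll-off described above.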

SV does a basic full copy at first, which obviously takes a bit. After that, it copies snapshots based upon your schedule, which essentially are incremental backups*2. This is the most space-efficient method of preserving data, since only the data that has changed is replicated after the full backup.

The built-in options for SV are hourly, daily, and weekly backups. Monthly and other options are available but require specialized scripts. SV gives you the flexibility to keep a different number of snapshots on the primary than on the secondary. All backups are read-only, but LUNs/qtrees can be presented normally and used as read-only.

This setup is pretty darn space efficient, extremely fast, protects your data from a lot of dangers, gives you flexibility to set up backup architectures (e.g., present a read-only LUN from the backup to a backup server to send the data to tape), and saves you money by allowing the low-use backup data to be on cheaper storage.

Digging a little bit deeper, here's how SV works:

1. You set up a schedule for SV to take snapshots on your primary.
2. You set up a schedule for SV to replicate those snapshots at a certain rate to the secondary.
3. SV does a full copy to the secondary.
4. SV does a snapshot on the primary.
5. From the secondary, SV pulls the primary's first snapshot.
6. SV continues doing snapshots on the primary.
7. On the secondary, SV compares the current snapshot it has to the newest snapshot and replicates any changes.
8. Repeat steps 6 and 7 indefinitely.

Remember, we're talking about snapshots vs. regular incrementals: the initial snapshot being replicated (step 5) takes up no space of its own, since none of the data has changed yet. By step 7, that initial snapshot has grown significantly. But it is not the first snapshot that gets replicated: it is the latest one. And the latest snapshot on the primary will itself hold less data than has changed since the secondary last received a replication, so the secondary must work out the changes since the last replicated snapshot and copy those. Draw this out for yourself if you have trouble picturing it.
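If drawing it out doesn't help, here's the same idea as a toy Python model. I'm treating each snapshot as a simple block-number-to-contents map, which is an assumption for illustration and not SnapVault's actual on-disk or wire format. The function names are made up too.

```python
def changed_blocks(base: dict, newest: dict) -> tuple:
    """Work out what must be sent to turn `base` into `newest`.

    Returns (changed, removed): blocks that are new or different, and block
    numbers that no longer exist in the newest snapshot.
    """
    changed = {blk: data for blk, data in newest.items() if base.get(blk) != data}
    removed = [blk for blk in base if blk not in newest]
    return changed, removed


def apply_update(secondary: dict, changed: dict, removed: list) -> dict:
    """Apply a transfer to the secondary's copy and return the updated copy."""
    updated = dict(secondary)
    updated.update(changed)
    for blk in removed:
        updated.pop(blk, None)
    return updated


snap1 = {0: "A", 1: "B", 2: "C"}              # last snapshot the secondary received
snap2 = {0: "A", 1: "B2", 2: "C"}             # taken on the primary, never replicated
snap3 = {0: "A", 1: "B2", 2: "C3", 3: "D"}    # newest snapshot on the primary

to_send, to_drop = changed_blocks(snap1, snap3)
```

Comparing snap3 against snap2 would only turn up blocks 2 and 3. The secondary last saw snap1, though, so `to_send` also has to include block 1. That's the gap described in the paragraph above.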

Credit: me!

*1: Don't confuse SnapVault with SnapMirror. SnapMirror does a one to one copy for failover, while SnapVault just copies data for retention.

*2: refresher: full backups are complete copies, incremental backups are "what's changed since the last full or incremental," and differential backups are "what's changed since the last full." Differential backups obviously become larger and waste more space as the time goes on since the latest full.
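A bit of arithmetic makes the incremental vs. differential difference in the footnote obvious. This sketch assumes the worst case where each day's changes touch different data, so a differential is simply the running total since the last full. The function name and the sample numbers are mine.

```python
from itertools import accumulate


def backup_sizes(daily_changes_gb: list) -> tuple:
    """Return (incremental, differential) backup sizes for each day after a full.

    Assumes each day's changes hit different data (no overlap between days).
    """
    incremental = list(daily_changes_gb)                 # only that day's changes
    differential = list(accumulate(daily_changes_gb))    # everything since the full
    return incremental, differential


incr, diff = backup_sizes([5, 3, 4])
```

With 5GB, 3GB, and 4GB of daily change, the incrementals stay at 5, 3, and 4GB while the differentials climb to 5, 8, and 12GB.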

Great, honest convo back-and-forth about SV's strengths and weaknesses:
http://communities.netapp.com/message/26593#26593

Official NetApp SnapVault Doc: tr-3487.pdf

Monday, June 13, 2011

NetApp Training Brain Dump: Snapshots

The concept here is that a snapshot can become as large as the original dataset in the volume (100%). Remember that the space occupied by data in the volume is the sum of the existing LUNs/qtrees and any snapshots that exist in that volume. Empty space in the volume is ignored by snapshots.

Here's the important background idea: WAFL does not update-in-place when existing data is changed. This means that for a normal LUN that has no snapshots, when data changes, it is written to a new location (total space occupied increases) and then the old data is deleted (total space occupied goes back to pre-change levels).

Illustration: If in a LUN with 6GB of data a 4KB block is changed, the sum total of space occupied by data rises to 6GB + 4KB, then back to 6GB as the out of date 4KB block is deleted and reclaimed. WAFL handles this so quickly that your LUN effectively does not increase in size. This is a great advantage for WAFL because update in place can cause data corruption.

This concept is essential to understanding how snapshots work in ONTAP. Let's go back to our 6GB LUN with a 4KB change: WAFL writes the new 4KB data to new, unoccupied space and the snapshot is left occupying the space that would be otherwise deleted. So as data changes, it is not actually the snapshot that is allocated more space, but its existence means that the space that could be reclaimed is now solely assigned to the snapshot. So any data that is only assigned to the snapshot is considered occupied by the snapshot. In this example, the snapshot would be considered to be 4KB in size.

The size taken up by the snapshot increases in concert with the changes to the original LUN: 500MB of changes to the original LUN means that the snapshot will grow from 0 to 500MB in size. For a 20GB volume that has 6GB of data (including LUNs and other snapshots), the next snapshot can grow as large as 6GB, making the sum total of the original data and the new snapshot 12GB.
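Here's a toy Python model of that accounting. It's a sketch under big simplifying assumptions: one snapshot at most, 4KB blocks, and a plain logical-to-physical map standing in for WAFL's real structures. The class and method names are made up for illustration.

```python
BLOCK_KB = 4


class ToyVolume:
    """Hypothetical write-anywhere volume with at most one snapshot."""

    def __init__(self, n_blocks: int):
        self.active = {i: i for i in range(n_blocks)}   # logical -> physical block
        self.next_physical = n_blocks
        self.snapshot = None

    def take_snapshot(self):
        self.snapshot = dict(self.active)   # freeze the current block map

    def overwrite(self, logical: int):
        # Never update in place: the new data always lands in a fresh block
        self.active[logical] = self.next_physical
        self.next_physical += 1

    def snapshot_kb(self) -> int:
        """Space held only by the snapshot (blocks the active data no longer uses)."""
        if self.snapshot is None:
            return 0
        only_snap = set(self.snapshot.values()) - set(self.active.values())
        return len(only_snap) * BLOCK_KB

    def used_kb(self) -> int:
        """Total space referenced by the active data or the snapshot."""
        in_use = set(self.active.values())
        if self.snapshot is not None:
            in_use |= set(self.snapshot.values())
        return len(in_use) * BLOCK_KB
```

Take a snapshot, overwrite one block, and `snapshot_kb()` goes from 0 to 4. Overwrite every block and the snapshot ends up as large as the original data, doubling `used_kb()`.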

You can find commands to control snapshots here.

Tuesday, June 7, 2011

NetApp Training Brain Dump: Disk LED's and Loops

Wanted to pass on a nifty trick I learned recently: when dealing with multiple loops on a system, an easy way to figure out which shelves are physically on which loop is to turn on the LED of the corresponding disk on the first shelf. For example, let's say you have DS14s cabled MPHA with 4 loops of 1 shelf each. If one controller has a shelf on loop 1a-2a, turn on the disk 17 (first disk) LED and the disk 18 (second disk) LED on that loop. If 4b-7c is another loop, maybe turn on disks 3 and 6. Obviously, which disks you light up is up to you: pick whatever pattern works.

e.g.: led_on 1a.17

I also learned something interesting: according to this NetApp engineer, if you use the command that turns on all the LEDs in a loop, it will take over all the I/O on the loop and make the shelves completely inaccessible. Wild.
