Friday, September 20, 2013

Snap Sched

Need to know your snapshot schedules?  Remember, daily and weekly snapshots are always at midnight. 

snap sched syntax:
<# of weeklies to keep> <# of dailies to keep> <# of hourlies to keep>@<comma-separated list of hours, 24-hour clock>

example 1:
Volume n1b_sys_vol: 0 2 6@8,12,16,20
takes no weekly snapshots, takes a daily snapshot and keeps up to 2 of them, and takes an hourly snapshot at 8am, noon, 4pm, and 8pm, keeping up to 6 of those hourlies.

example 2:
Volume n1b_sys_vol_snaps: 1 1 2@10,16

Takes a weekly snapshot (keeps 1), a daily snapshot (keeps 1), and hourly snapshots at 10am and 4pm (keeps only 2).

example 3:
Volume n1a_corp_g1: 0 31 120
Takes no weekly snapshots, takes a daily snapshot (keeps 31), and takes hourly snapshots (keeps 120; with no hour list specified, hourlies are taken every hour on the hour).
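As a quick sanity check, the schedule string can be split apart with standard shell tools. This is just an illustrative parser for the format above (the schedule value is a made-up example), not anything NetApp ships:

```shell
# Parse a snap sched string like "0 2 6@8,12,16,20" into its fields.
sched="0 2 6@8,12,16,20"   # hypothetical example schedule

weeklies=$(echo "$sched" | awk '{print $1}')
dailies=$(echo "$sched" | awk '{print $2}')
hourlies=$(echo "$sched" | awk '{print $3}' | cut -d@ -f1)
hours=$(echo "$sched" | awk '{print $3}' | cut -s -d@ -f2)

# If no @list is present, $hours comes back empty (hourlies every hour).
echo "keep $weeklies weekly, $dailies daily, $hourlies hourly (at hours: ${hours:-every hour})"
```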

Tuesday, July 23, 2013

Clustered ONTAP Networking

  • ifgrps are made up of physical ports.  ifgrps are named "a<number><letter>," e.g. a0a, a0b.
    • All members of an ifgrp must have the same role set (data, mgmt)
    • Cluster ports cannot be in ifgrps
    • A port with its own failover designation cannot be added to an ifgrp
    • All ports in an ifgrp must be on the same physical node
  • A LIF (logical interface) is basically an IP address assigned to a port or ifgrp.  That IP address entity is given other properties like "home port," VLAN, failover group, etc.  Each IP needs to carry these properties independently of other IPs.
  • Spanning tree should not be enabled on the switch for any ports connecting to the controllers except the management ports.
  • It is a best practice to set flowcontrol to none for all ports except UTA ports.
    • UTA ports set to full.
  • You can of course add VLAN tagging and multiple LIFs per port; VLAN ports are named "<port>-<vlan-id>," e.g. a0b-441
  • Each LIF gets a routing group that you may or may not want to alter.
    • You'll need to create a default routing group (one per vserver) for LIFs to default to.
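Tying the basic pieces together, a build-out might look like the following clustered ONTAP command sketch. The node, port, vserver, and address values are all illustrative assumptions; verify the exact flags against your ONTAP version:

```shell
# Sketch only -- names and addresses are made up for illustration.

# Create a multimode LACP ifgrp from two physical data ports
network port ifgrp create -node node1 -ifgrp a0a -distr-func ip -mode multimode_lacp
network port ifgrp add-port -node node1 -ifgrp a0a -port e1a
network port ifgrp add-port -node node1 -ifgrp a0a -port e1b

# Tag a VLAN on the ifgrp (creates port a0a-441)
network port vlan create -node node1 -vlan-name a0a-441

# Create a data LIF homed on the tagged port
network interface create -vserver vs1 -lif nfs_lif1 -role data \
    -home-node node1 -home-port a0a-441 \
    -address 192.168.41.10 -netmask 255.255.255.0
```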
Once you have those concepts figured out, you can move on to the advanced.
  • For snapmirror/snapvault, each node needs its own "intercluster LIF" to identify which ports and IP are used for replication.
  • You'll need inter-cluster routing groups on each node to set up replication.
  • You have to create a cluster peer to establish a relationship for replication.
  • Network interface failover groups are simply cluster-wide lists of ports (or ifgrps) to which LIFs are allowed to migrate.  For example, you don't want your 10Gb NFS LIF failing over to another controller's 1Gb port.
    • Your node management LIF can't fail over to other nodes (obviously; it's the portal to your physical node), so it should have its own failover group that contains only local ports.
    • Be careful not to include e0M in any failover group that contains data ports.
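The advanced pieces above can also be sketched in 8.2-era syntax. Again, every name and address here is an illustrative assumption, not a recommendation:

```shell
# Sketch only -- names and addresses are made up for illustration.

# Failover group containing only the 10Gb ifgrp ports on each node
network interface failover-groups create -failover-group fg_10g -node node1 -port a0a
network interface failover-groups create -failover-group fg_10g -node node2 -port a0a

# Intercluster LIF on each node for snapmirror/snapvault replication
network interface create -vserver node1 -lif ic1 -role intercluster \
    -home-node node1 -home-port e0c \
    -address 192.168.50.11 -netmask 255.255.255.0

# Establish the cluster peer relationship against the remote intercluster IPs
cluster peer create -peer-addrs 192.168.50.21,192.168.50.22
```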

Some examples of real life setups:
  • A vserver (virtual storage server) that is strictly SAN would need only one IP address, assigned to a physical port, for management.

Monday, March 11, 2013

Different Sized Disks in a RAID Group

We encountered an interesting situation up here recently, and it generated a lot of debate between the PSCs.  We finally locked down the answer, and I figured I would write it up and spread the knowledge!

 The issue was that a RAID DP group on a system had 13 450GB data drives, 1 450GB Parity drive, 1 600GB data drive, and 1 600GB DP drive.  The data drives were all right-sized to the same “effective used” size, but the 600GB DP drive was showing as not right-sized.  This was in debate because it was commonly believed that all disks within a single RAID Group were required to be the same effective size.

  In an amazing FAQ document (please find it attached), we found the following:

  The document is written by Jay White, a Technical Marketing Engineer widely regarded as the authority on NetApp storage subsystems.  It’s a fascinating series of scenario studies that comes highly recommended.

FAS2240 Cabling

Interesting: single-chassis HA FAS2240s installed without shelves still require SAS cabling.  In order to be considered MPHA, cable 0a-0b and 0b-0a across the two controllers.  A lot of NetApp people still don't know this, so be sure your sales guy puts SAS cables on the SO.


Sunday, March 10, 2013


Yesterday we booted up a new 7-mode cluster and saw a couple of things:
  • The systems were rebooting rather than going to the setup prompt.
  • The systems’ NVRAM cards’ LEDs weren’t lit up correctly.
  • And an error message:

Cannot determine whether configuration should be stand-alone or HA. Chassis is in default configuration, controller is in default configuration, and the non-volatile memory is dirty. Boot into Maintenance mode and run the 'ha-config modify' command to set the controller and chassis configuration to stand-alone or HA, as appropriate. Setting the wrong configuration might lead to data loss. If you need assistance, contact support.

Mar 07 04:01:46 [localhost:haosc.config.unknown.nv:ALERT]: Cannot determine whether configuration should be stand-alone or HA. Chassis is in default configuration, controller is in default configuration, and the nonvolatile memory is dirty.

This is what we did to fix it.
  • Boot to special boot menu.
  • Choose 5 to get to maintenance mode.
  • Set these:
    • ha-config modify chassis ha
    • ha-config modify controller ha
  • Reboot to setup prompt
  • Configure initial setup
  • cf enable
  • Verify via sysconfig that each system now knows its partner’s hostname and sysid
  • cf disable
  • Halt to the loader prompt
    • printenv
    • Verify psm-cc-config is true
    • Verify partner-sysid was properly set
  • Boot up and cf enable
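The steps above boil down to a short console session. This is a condensed sketch of the fix, with console output elided; run it on each controller:

```shell
# From the special boot menu, option 5 drops into Maintenance mode:
*> ha-config modify chassis ha
*> ha-config modify controller ha
*> halt

# After initial setup, 'cf enable', the sysconfig check, and 'cf disable',
# halt to the loader and inspect the environment:
LOADER> printenv
# ...confirm psm-cc-config is true and partner-sysid is set, then boot
# and run 'cf enable' again.
```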