Wednesday, July 27, 2011

NetApp Experience: Hardware

Got a few more hardware knowledge hits for you.  For one, there is a best practice around connecting shelves to a SAS PCI card, and it has to do with the internal architecture of the card.  The general idea is that there are two single points of failure inside each card, called ASICs.  Basically, ports A/B (or 1/2) are paired to one ASIC, and ports C/D (or 3/4) to the other.

For this reason, when you are connecting a stack to a single SAS PCI card (which you should try to avoid in the first place, but is occasionally unavoidable), you should use A/C as the start of the paths and B/D as the return paths.  That way, losing one ASIC does not take out every path to the stack.

Onboard ASICs are paired between 0a-0b, 0c-0d, 0e-0f, etc.
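
If you want to sanity-check a cabling plan against the pairing above, here is a rough Python sketch.  It is not a NetApp tool; the function names `asic_index` and `paths_are_redundant` are made up for illustration, and it assumes ports always pair up in order (A/B, C/D or 1/2, 3/4 or 0a/0b, 0c/0d) the way described above.

```python
def asic_index(port: str) -> int:
    """Return a zero-based ASIC number for a port name.

    Assumption: ports pair up in order, so A/B (or 1/2, or 0a/0b) share
    ASIC 0, C/D (or 3/4, or 0c/0d) share ASIC 1, and so on.
    Only the last character of the name is used to pick the port.
    """
    label = port.strip().lower()[-1]
    if label.isdigit():
        position = int(label) - 1          # ports numbered 1, 2, 3, 4
    elif "a" <= label <= "z":
        position = ord(label) - ord("a")   # ports lettered a, b, c, d
    else:
        raise ValueError(f"unrecognized port name: {port!r}")
    if position < 0:
        raise ValueError(f"unrecognized port name: {port!r}")
    return position // 2


def paths_are_redundant(port_one: str, port_two: str) -> bool:
    """True when the two ports sit on different ASICs."""
    return asic_index(port_one) != asic_index(port_two)


if __name__ == "__main__":
    for pair in [("A", "C"), ("A", "B"), ("0c", "0d"), ("0a", "0e")]:
        print(pair, "redundant" if paths_are_redundant(*pair) else "same ASIC")
```

The point is only that two ports on the same ASIC (A and B, or 0c and 0d) give you two cables but still one point of failure.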

Second, I ran into a customer that had a TON of problems with a system.  The problems showed up in all sorts of weird ways, leading the admins to update ONTAP and all the firmware.  They finally traced the issue to a single disk, which they replaced.  But the replacement disk failed, and so did the next replacement, which they pulled, leaving the problem slot empty.  Long story short, we swapped out the entire shelf chassis, pulling out the disks, ESH modules, and power supplies and placing them in the new chassis.  After all this, we made the call to put an entirely new disk into the new chassis.

Although this resolved the customer's issue, one interesting note was that the ESH modules did not retain the shelf ID.  It turns out that while the shelf ID is held in the ESH module's volatile memory, it is actually stored permanently in the internal circuitry of the shelf chassis, and read by the ESH module upon boot up.  Whoa!
