IT engineering and a little bit of hacking: QOS


Thursday, June 16, 2016

SolidFire Architecture #1

It's time I write a long-overdue overview of SolidFire: how it works, how it solves problems, and why service providers love it. So here is Part 1!

First, SolidFire is not the solution to everything. But it is the best in the world at what it does solve, which is why it won Gartner awards for the last two years. Since this is an engineering blog, let's talk about how it works.

SolidFire hardware is regular servers with SSDs and no RAID, so you get commodity hardware prices and a truly software-defined architecture. It protects data by writing it in two places using an algorithm we call Double Helix, and then earns space back with inline compression and global inline dedupe. The global inline dedupe allows for much greater dedupe ratios than anything else on the market, because each unique block of data is stored only once across the entire cluster. Other storage solutions have silos of dedupe: pools of blocks that are unique locally but duplicated many times throughout the environment.
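To make the silo-versus-global point concrete, here is a minimal Python sketch. It is not SolidFire's actual implementation: the SHA-256 fingerprint, the function names, and the toy volumes are all assumptions for illustration. It just counts how many blocks have to be stored when each volume dedupes on its own versus when the whole cluster shares one fingerprint index.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Content hash used as a block's identity (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


def stored_blocks_siloed(volumes: dict[str, list[bytes]]) -> int:
    """Each volume dedupes only against its own blocks."""
    return sum(len({fingerprint(b) for b in blocks}) for blocks in volumes.values())


def stored_blocks_global(volumes: dict[str, list[bytes]]) -> int:
    """One fingerprint index shared by every volume in the cluster."""
    return len({fingerprint(b) for blocks in volumes.values() for b in blocks})


# Hypothetical workload: three VMs that share an OS image.
volumes = {
    "vm1": [b"os-image", b"os-image", b"app-a"],
    "vm2": [b"os-image", b"app-b"],
    "vm3": [b"os-image", b"app-a"],
}

siloed = stored_blocks_siloed(volumes)   # 2 + 2 + 2 unique blocks
global_ = stored_blocks_global(volumes)  # os-image, app-a, app-b

print(f"siloed dedupe stores {siloed} blocks, global dedupe stores {global_}")
```

With these made-up volumes the siloed approach keeps the OS image three times, once per silo, while the global index keeps it once.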

The SolidFire robot

Today SF is iSCSI and FCP only. When you create a LUN, SF chooses where in the cluster to place the data, removing the enormous complexity of what we call the "placement question." Let's spend some time on that: in most traditional storage environments, you have a couple of storage nodes that form capacity and performance silos. When you scale out to 20 or 1000 nodes, your provisioning runs into a complex question: where do I place this data? That spurs hours of performance and capacity analysis, trending, and peak-versus-average conversations. On SF, the cluster does it for you.

It also solves the performance question that multi-tenancy brings by allowing you to provision performance. Not just capacity, but performance! SF does this by allowing you to set a minimum, maximum, and burst for each volume, guaranteeing a service level.
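A rough way to picture how a maximum and a burst setting can interact is a credit model: a volume that runs below its maximum banks credits, and can later spend them to exceed the maximum, up to the burst ceiling. The Python sketch below is only that, a sketch under assumptions. The class name, the one-second interval, and the `credit_cap` field are hypothetical and not taken from SolidFire's implementation, and the minimum (the floor a volume is promised when the cluster is contended) is carried as a field but not modeled in `grant`.

```python
from dataclasses import dataclass


@dataclass
class VolumeQos:
    min_iops: int        # floor promised under contention (not modeled in grant)
    max_iops: int        # sustained ceiling
    burst_iops: int      # short-term ceiling while credits last
    credits: int = 0     # banked IOPS credits
    credit_cap: int = 0  # hypothetical limit on how many credits can be banked

    def grant(self, requested: int) -> int:
        """Return the IOPS granted for one 1-second interval."""
        if requested <= self.max_iops:
            # Running under the ceiling: bank the unused headroom as credits.
            unused = self.max_iops - requested
            self.credits = min(self.credit_cap, self.credits + unused)
            return requested
        # Over the ceiling: spend credits, but never exceed the burst setting.
        extra_wanted = min(requested, self.burst_iops) - self.max_iops
        extra = min(extra_wanted, self.credits)
        self.credits -= extra
        return self.max_iops + extra


qos = VolumeQos(min_iops=500, max_iops=1000, burst_iops=2000, credit_cap=3000)
grants = [qos.grant(r) for r in (400, 400, 2500, 2500, 2500)]
print(grants)
```

In this toy run the volume idles for two intervals, bursts to its burst setting once, gets a partial burst on the next interval as the credits run out, and then settles back at its maximum.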

I've only scratched the surface on this one: we'll save the scale cluster model and more for the next blog post.

Wednesday, March 16, 2016

SolidFire and ONTAP

I had a reseller ask last week "Now that NetApp bought SolidFire, are they going to kill all-flash FAS?" My answer: not on your life.

NetApp has sold tens of thousands of all-flash FAS (AFF) systems, which run CDOT, our flagship operating system. It's a great product that enormous enterprises (and governments) are spending a billion dollars a year on: there's no way we'd back down from continuing to invest in R&D there.

Besides that, SolidFire has a completely different architecture than AFF. One way to understand it is that AFF's architecture starts with smaller building blocks. Here's what I mean:

AFF dedupes each volume individually: SolidFire dedupes the entire cluster.
AFF protects each disk using RAID: SolidFire protects each node using two copies of everything.
AFF puts QOS on each volume: SolidFire shows you whether your QOS promises exceed the cluster's ability.
AFF deals with node failure by having a redundant partner take over: SolidFire deals with node failure by having ALL the other nodes pick up the slack.
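The QOS comparison above comes down to simple arithmetic: add up the minimum IOPS you have promised and compare the total to what the cluster can deliver. Here is a minimal Python sketch of that check. The function name and all the numbers are made up for illustration; this is not how SolidFire computes or reports it.

```python
def min_iops_overcommit(volume_min_iops: list[int], node_iops: list[int]) -> float:
    """Ratio of promised minimum IOPS to cluster IOPS; above 1.0 means over-promised."""
    return sum(volume_min_iops) / sum(node_iops)


# Hypothetical cluster: four nodes rated at 50,000 IOPS each.
nodes = [50_000] * 4

# Hypothetical tenants: 150 small volumes and 20 large ones.
promised = [1_000] * 150 + [5_000] * 20

ratio = min_iops_overcommit(promised, nodes)
print(f"promised/available = {ratio:.2f}")
if ratio > 1.0:
    print("QOS minimums exceed what the cluster can deliver")
```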

These are different architectures, which solve different needs. I thought this was a great overview of SolidFire as well:
http://www.virtualtothecore.com/en/solidfire-a-quality-storage/

Friday, February 19, 2016

SolidFire

SolidFire is a "make private cloud easy" solution primarily designed for service providers. It's a "born in OpenStack" all-flash whitebox solution that aims to be stupid-easy to deploy and manage.

The goal for SolidFire is not to be the fastest, the most resilient, or the most feature-rich. It aims to answer one question, best in class: "How do I easily deploy Storage as a Service?" You can see this in their design choices:

Because this is a product service providers sell, they're flash only, have required QOS policies, and skip all the management tools, leaving that to OpenStack.
Because they use two copies of everything instead of RAID, they achieve node level resiliency and skip expensive hardware and software, using inline dedupe/compression to recover the space delta. This also spreads performance requirements across the entire cluster.
Because they expect you'll be deploying a single configuration thousands of times, they support only 1 protocol and have very limited configuration options.
Because this is for a cloud, not a single-purpose system, the cluster (up to 100 nodes) auto-grows when you add a new node and recovers quickly when you lose one.
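The "recover the space delta" trade-off in the list above is easy to put into numbers: keeping two copies halves the raw capacity, and dedupe/compression multiplies it back. The Python sketch below shows the arithmetic. The function name, the cluster size, and the efficiency ratios are hypothetical, not vendor figures.

```python
def effective_capacity_tb(raw_tb: float, copies: int = 2, efficiency_ratio: float = 1.0) -> float:
    """Usable capacity after keeping `copies` of every block and applying
    a combined dedupe/compression ratio (e.g. 4.0 for 4:1)."""
    return raw_tb / copies * efficiency_ratio


raw = 96.0  # hypothetical cluster with 96 TB of raw flash

no_savings = effective_capacity_tb(raw)                        # two copies, no efficiency
break_even = effective_capacity_tb(raw, efficiency_ratio=2.0)  # 2:1 wins back the second copy
four_to_one = effective_capacity_tb(raw, efficiency_ratio=4.0)

print(no_savings, break_even, four_to_one)
```

Put another way, any combined efficiency ratio above 2:1 means the two-copy design ends up with more usable space than raw flash.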

A few technical details:

Platform today is Dell servers. Now that Dell owns EMC, it'll probably convert to Cisco.

10 drives per node
SF2405: 5-10TB and 50k IOPS
SF4805: 10-20TB and 50k IOPS
SF9605: 20-40TB and 50k IOPS
SF9010: 20-40TB and 75k IOPS

Features:

Inline dedupe and compression
For QOS you can set min, max, and burst limits.
Mix any node platform
You can hot remove nodes
iSCSI, FCP (with a gateway device)
Native snapshots, with backup to any Amazon Web Services S3 or OpenStack Swift-compatible API.

Under the hood:

Nodes are connected via 10GbE over your shared network. Not a private intracluster network.
“All connections for a particular LUN presented to storage go back to the primary node for that LUN. IE: multipath doesn't help you weather a failover. They're dependent on long iSCSI timeouts to give them time to fail a node and redirect traffic.”

Performance and QOS: http://www.solidfire.com/resources/provision-control-and-change-storage-performance-on-the-fly
Node Loss Demo: http://www.solidfire.com/resources/demonstration-of-solidfires-automated-self-healing-ha

SolidFire wins Gold in the Storage magazine/SearchStorage.com 2015 Products of the Year awards, in the Storage Systems: All-Flash Systems category. http://searchsolidstatestorage.techtarget.com/feature/SolidFire-SF9605

Wednesday, January 6, 2016

CDOT 8.3.2: What's New

The next release of our ONTAP operating system, CDOT 8.3.2, is expected around February and brings several key features. 8.3.2RC2 is already out! Here’s what’s new:

1) Inline Dedupe: supported on All-flash and FlashPool enabled systems. This feature reduces the transactions to disk and reduces storage footprint by deduping in-memory.
a. Enabled on all-flash arrays by default
b. Uses 4k block size and stays deduped when replicated.
c. Syntax: volume efficiency modify -vserver NorthAmerica -volume production-001 -inline-dedupe true

2) QOS (adjustable performance limit) is now supported on clusters up to 24 nodes (up from 8 nodes)!

3) Support for 3.84TB TLC SSDs. These huge flash drives have a lower cost per GB than smaller SSDs. They also draw less power (85% less) and take up less rack space (82% fewer RU) than an HDD array of the same performance.

4) Static Adaptive Compression: in 8.3.1 we introduced a high-efficiency compression algorithm for inline compression. 8.3.2 gives you the ability to run this algorithm on an existing volume or dataset.
a. This is important for when you do a vol move from a system without inline compression enabled to one with it enabled, for example from a HDD FAS to an all-flash FAS.

5) Copy Free Transition: This feature allows you to shut down a 7-mode cluster and hook the disk shelves into a CDOT system. The data is converted to CDOT, vfilers are converted to vservers, and in a 2-8 hour window the entire 7 to C migration is complete.

6) Migrate a volume between vservers: the equivalent of vfiler move, this allows you to re-host a volume between vservers. This only works once, and only for volumes transitioned from 7-mode.

7) Inline Foreign LUN Import: Migrate FCP data by presenting the LUN at the NetApp controller, and NetApp then presents the LUN to the host. In the background, NetApp will copy the data over and then you can retire the old system and LUN. This is vendor-agnostic as of 8.3.1 and works for All-Flash FAS in 8.3.2.

8) SVM-DR MSID replication. SVM-DR allows you to replicate an entire vserver and all its configuration for push-button disaster recovery. MSIDs are the unique identifiers for volumes, and replicating them allows applications like VMware to accept the DR exports without re-mounting them, greatly shortening your RTO.

9) Audit Logging enhancements: records login failures and IP addresses of logins.

Read more at 8.3.2RC2 release notes: https://library.netapp.com/ecm/ecm_get_file/ECMLP2348067

Monday, June 1, 2015

CDOT Tip #4

Cool trick – You can watch traffic to an individual file in CDOT 8.3 with QOS policy groups:

cdot::qos policy-group> create -policy-group file_iops -vserver tdnas1

cdot::qos policy-group> show
Name             Vserver     Class        Wklds Throughput
---------------- ----------- ------------ ----- ------------
file_iops        tdnas1      user-defined 0     0-INF

Now you set this policy group per file:

cdot::> volume file modify -vserver tdnas1 -volume datastore_volume -file windows.vmdk -qos-policy-group file_iops

Then you can do a perf analysis on the QOS policy group:

cdot::qos statistics performance> show -policy-group file_iops
Policy Group         IOPS     Throughput      Latency
-------------------- -------- --------------- ----------
file_iops            1        0KB/s           0ms
file_iops            1        0.00KB/s        2.00ms

Note: you only see the file_iops policy group in the statistics output while there is actual traffic on the file assigned to it. Otherwise you’ll only see -total-, which reflects the whole cluster.
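If you capture that statistics output to a text file, a few lines of Python can turn the rows into numbers you can trend. This is a minimal sketch that assumes the rows look exactly like the sample above (throughput in KB/s, latency in ms). Real output may use other units, so treat the regular expression and the field names as assumptions.

```python
import re

# Matches data rows such as "file_iops 1 0.00KB/s 2.00ms".
# The header and dashed separator lines do not match and are skipped.
ROW = re.compile(
    r"^(?P<group>\S+)\s+(?P<iops>\d+)\s+(?P<kbps>[\d.]+)KB/s\s+(?P<ms>[\d.]+)ms$"
)


def parse_rows(text: str) -> list[dict]:
    """Parse captured 'qos statistics performance show' rows into dicts."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append({
                "group": match["group"],
                "iops": int(match["iops"]),
                "throughput_kbps": float(match["kbps"]),
                "latency_ms": float(match["ms"]),
            })
    return rows


sample = """\
Policy Group IOPS Throughput Latency
-------------------- -------- --------------- ----------
file_iops 1 0KB/s 0ms
file_iops 1 0.00KB/s 2.00ms
"""

rows = parse_rows(sample)
avg_latency_ms = sum(r["latency_ms"] for r in rows) / len(rows)
print(len(rows), "samples, average latency", avg_latency_ms, "ms")
```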