
May 11, 2010




Wow! A lot of information to consume. Thanks Chad - as always, your posts are amazingly detailed, and very clear.

Any technology we can use to make our storage, and thereby our infrastructure, perform better, faster, and for less $$ - is always welcome.

Keep them coming!!

Cristi Romano

Solved the December problem with VDI impact on storage. What's next? :-)
I have an idea for "compression": why don't you use Data Domain technology to deduplicate instead of compressing? Of course, you'd need to do it in a more random read/write style than Data Domain's sequential one. It should also be enabled on a per-LUN basis, given the CPU/delay/etc. impact. You could also use FAST Cache for the containers. Crazy idea, but... who knows.

All the best,

Itzik Reich

I can think of at least one more use case: vCenter SRM failover testing.
If I run several Recovery Plans at the same time, the same boot storm that occurs in VMware View will happen here..


Dave B


Or MS patching with SCCM / WSUS. Patch windows are often limited, so you wind up boot storming those too :-)

(Unfortunately, servers usually aren't linked clones - they won't fit as well in the cache)


Thanks for the great post, Chad. Interesting use of SSDs as cache.

Since I'm with NetApp, naturally I have some questions regarding the new caching scheme.

I keep reading in the various EMC SSD cache posts "we cache writes!"

Caching writes is necessary with EMC's architecture; NetApp uses a different way of writing to disk, but that's a different discussion.

My questions:

1. At what part of the write path is the SSD cache? More like a second level cache?

2. What's the page size? Same as sub-LUN FAST (768KB?) or something smaller?

3. Is it tunable by LUN or some other way?

4. What's the latency? NetApp developed the custom cache boards because they fit right in the PCI-E slots of the controllers, for maximum throughput and lowest latency.




When you get sub-LUN FAST, doesn't that negate much of the need for FAST Cache in most environments? The flash cache is still disks in your disk enclosures, not on internal expansion boards like NetApp's, so the "cache" I/Os still flow through the same back-end data channels. This seems to me more of an 'interim' feature until sub-LUN optimization is possible.

So I think the characterization that you are reducing disk I/O is a bit misleading. You are reducing rotating disk I/Os, but you could just have a whole LUN on flash and reduce spinning disk I/Os to zero. Or wait for sub-LUN FAST and then frequently used blocks are on SSD.

Chad Sakac


Every array is different architecturally in many, many ways. [Warning - I do not claim to be a NetApp expert.] NetApp's approach of NVRAM and journaling means that write caching isn't needed in the same way it is in EMC's arrays - there's no 1:1 analogy.

That doesn't mean it's not material, as large spike write workloads that could be buffered might not be - forcing NVRAM flushes to be faster than the timed flush (and the backend to work harder and less optimally). Hence the ongoing growth in NVRAM over the years.

But - it is incorrect for anyone to make a 1:1 correlation and jump to an erroneous conclusion.

Of course - NetApp is not the only EMC storage competitor :-) I try to avoid writing a post with any competitor in mind, so any comment I make isn't directed at anyone in particular.

Answering the questions:

1) the cache pages are small - the default is 8KB (customizable). FYI, as a correction (if we're going to compete, it might as well be on a correct basis): the sub-LUN FAST granularity on midrange is 1GB (as of the July release), and will initially be 768KB on the Symmetrix. Smaller sub-LUN granularity needs more "oomph" to pull off at scale. Expect, however, this granularity to get finer as hardware continues to accelerate. We have designed towards the massive multicore generation we are now in.

2) although to the system it looks like an extension of system cache, it is hierarchically below main system cache (metadata operations strive to leverage system read/write cache before hitting FAST Cache).

3) Yes, it's totally tunable on a LUN/Filesystem basis. Of course, on top of cache handling, we can apply QoS policy to the other shared system attributes.

4) Latency is fantastic - microseconds - thousands of times better than rotating magnetic media. We looked at length at the various ways to implement this. Latency was a few nanoseconds better when the flash was directly on the PCIe card itself, but there were a few really big downsides to that approach:

a) We couldn't make the cache shared across controllers - so a customer would need to buy twice as much, and ALUA behavior couldn't expect cached content when devices moved. Also, on SP failure/reboot, the cache needed to be rewarmed - not what people expect. Did NetApp crack this, or are the PAM II/Flash Cache cards needed on each storage processor, acting independently? That seems like a big downside if it's the case... I think I know the answer, but don't want to presume, since I'm not an expert on NetApp. Would love to hear it from someone who is...

b) it was harder to make it very simple and easy to add more cache the PCIe-card way vs. the "just add SSDs" way. This is material because the cost per unit of flash is changing VERY fast. With the EMC Unified FAST Cache design, it's a customer-installable option: customers can start with as little as two 200GB SSDs for a few thousand dollars, and then add more in the future as their needs grow - at which point SSDs will likely be 2-10x cheaper. This design made it easy for all current-generation EMC midrange customers to get started efficiently and easily.

Again, I'm not an expert on NetApp, so perhaps you can tell me - what's the minimum incremental cost of a small PAM II/Flash Cache module? This component - how quickly will NetApp be able to keep up with the plummeting commodity cost of SSD with their specific part - which will sell a fraction of the total SSD market?

And of course how easy is it to add PAM II/Flash Cache to an existing environment - how disruptive is that operation? Surely removing the entire array storage processor, disassembling it and adding a PCIe card - doesn't sound simple or easy to me. Or, is the proposal the cost-efficient and simple "just buy a new box, and migrate everything over"?
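To make answers 1 and 2 concrete, here is a minimal sketch of a two-level lookup hierarchy of the kind described - DRAM checked first, an SSD-backed tier below it, rotating disk behind both. The class, names, and LRU policy are my own illustrative assumptions, not EMC's implementation:

```python
from collections import OrderedDict

class TieredCache:
    """Illustrative two-level lookup: DRAM cache checked first, an
    SSD-backed tier below it, rotating disk as the backing store."""
    def __init__(self, dram_pages, flash_pages, page_size=8 * 1024):
        self.page_size = page_size              # 8KB pages, per the post
        self.dram = OrderedDict()               # LRU order: oldest first
        self.flash = OrderedDict()
        self.dram_pages, self.flash_pages = dram_pages, flash_pages

    def read(self, offset):
        page = offset // self.page_size
        if page in self.dram:                   # fastest: main system cache
            self.dram.move_to_end(page)
            return "dram"
        if page in self.flash:                  # next: SSD-backed cache tier
            self.flash.move_to_end(page)
            self._install(self.dram, page, self.dram_pages)
            return "flash"
        self._install(self.flash, page, self.flash_pages)  # warm on miss
        return "disk"

    @staticmethod
    def _install(tier, page, limit):
        tier[page] = True
        if len(tier) > limit:
            tier.popitem(last=False)            # evict least-recently-used

cache = TieredCache(dram_pages=4, flash_pages=16)
assert cache.read(0) == "disk"     # cold read served by rotating media
assert cache.read(0) == "flash"    # now cached in the flash tier
assert cache.read(0) == "dram"     # and promoted up to DRAM
```

The point of the hierarchy is that the flash tier only sees what DRAM missed, so it can be large and comparatively cheap while DRAM stays small and fast.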

Thanks for the questions, and would love to hear your thoughts on my questions!

Chad Sakac


If you read the answer to Dikrek's question - you understand that sub-LUN FAST and FAST Cache are complementary capabilities.

The characterization of reducing disk IO (and improving latency) is NOT misleading - it's borne out and supported by the oodles of data I put in the post.

Of course, you're right in the sense that if you just construct the whole config out of SSDs, FAST Cache isn't needed - but that is economically impossible today at the current (current!!) $/GB of SSD.

No - until SSDs have roughly the same $/GB as slow magnetic media (which WILL happen - but not for many years), we will have to have shared storage subsystems with both SSD and slow magnetic media.

Now, before going on - I want to restate what I mentioned above:

FAST Cache has latencies for cached IOs that are measured in microseconds; the fact that there are loops and SFPs in the middle only adds nanoseconds. With both NetApp's approach and the EMC one, the microsecond-class latencies come from the flash controller and the flash itself - the only way to get down further, into nanoseconds, is to use SRAM (aka regular cache), which drives cost up a lot.

It is notable that EMC does offer TB-class global shared SRAM caches on the enterprise arrays. In those cases, SRAM connected via SFPs/backend loops would be a bad idea (since SRAM latency is nanosecond-fast, the incremental nanoseconds are material) - and of course, that's not how EMC does it for the global shared SRAM cache on VMAX (which uses a very low-latency serial interface).

So, if the comparison latencies for a PCIe flash based cache and a PCIe-connected via SFPs/loops flash based cache is the same, let's talk about how FAST Cache and sub-LUN FAST are like... well.. an "AND not an OR" (sorry NetApp folks - I can't help it - that marketing slogan you're using makes me laugh :-)

sub-LUN FAST **doesn't** move a page from one tier to another immediately (that would be a lot of unnecessary, and not "free", IOs). There's a set of metadata that tracks usage and, based on internal profiling, moves data over time. This means that a momentary IO requirement, at any given moment, is served by **where the data is**. If it's on SSD, it will be served in microseconds (roughly the same speed as if it were in FAST Cache); if it's not, it will be served up in milliseconds (how many milliseconds depends on 15K, 10K, 7.2K or 5.4K rotational speed more than anything else).
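A toy sketch of the idea described above - per-slice usage counters plus a scheduled relocation pass, rather than per-IO movement. The class name, profiling window, and promotion policy are illustrative assumptions, not the actual FAST algorithm:

```python
from collections import Counter

SLICE = 1 << 30   # 1GB sub-LUN granularity on midrange, per the post

class AutoTier:
    """Track per-slice IO heat, then relocate on a schedule."""
    def __init__(self, ssd_slices):
        self.heat = Counter()          # slice index -> IOs this window
        self.on_ssd = set()            # slices currently on the fast tier
        self.ssd_slices = ssd_slices   # how many slices fit on SSD

    def record_io(self, offset):
        self.heat[offset // SLICE] += 1     # cheap metadata update per IO

    def relocate(self):
        """Runs periodically, not per-IO: promote the hottest slices,
        letting everything else settle onto slower media."""
        hottest = [s for s, _ in self.heat.most_common(self.ssd_slices)]
        self.on_ssd = set(hottest)     # real moves happen gradually
        self.heat.clear()              # start a fresh profiling window

tiers = AutoTier(ssd_slices=2)
for _ in range(100):
    tiers.record_io(0)                 # slice 0 is consistently hot
tiers.record_io(5 * SLICE)             # slice 5 touched once
tiers.relocate()
assert 0 in tiers.on_ssd               # the hot slice earns SSD placement
```

The key property matching the description above: an IO is always served by wherever the slice currently lives, and promotion is a deferred, profiled decision.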

FAST Cache means that writes will immediately be completed (buffering up more IO, "deduplicating" reads, and enabling the array to spend fewer backend IOs by doing more coalescing and other things) and increases the likelihood that a read will be served from cache (in microseconds, thousands of times faster than the milliseconds of the fastest rotating magnetic media).

The main effect of sub-LUN FAST is: enabling you to build a configuration with a given amount of IOps and a given amount of GBs for cheaper, by mixing drive types into a big pool that sorts itself out over time - ergo cost/GB efficiency. Just remember: an SSD is 170x cheaper than magnetic media RIGHT NOW when it comes to $/IO, and huge slow SATA is 20x cheaper than SSD RIGHT NOW when it comes to $/GB. Of course, workload requirements change over their life, and optimizing the cost per IO and per GB over that lifetime is what sub-LUN FAST makes more efficient (and since it's transparent, it makes it easier to configure a big pool).
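To illustrate the economics, here is a back-of-envelope sketch. The drive prices, capacities, and IOPS figures are made-up placeholders, chosen only to roughly match the $/GB and $/IO ratios quoted above:

```python
# Hypothetical drive economics: (price $, capacity GB, IOPS).
# Chosen so SATA is ~20x cheaper than SSD per GB and SSD is
# ~170x cheaper than 15K FC per IO, per the figures in the post.
drives = {
    "ssd":     (1800,  200, 45000),
    "fc_15k":  (1200,  600,   180),
    "sata_7k": ( 900, 2000,    80),
}

def pool_cost(counts):
    """Blended $/GB and $/IOPS for a pool of mixed drive types."""
    cost = sum(drives[d][0] * n for d, n in counts.items())
    gb   = sum(drives[d][1] * n for d, n in counts.items())
    iops = sum(drives[d][2] * n for d, n in counts.items())
    return cost / gb, cost / iops

# A few SSDs for IOPS plus lots of SATA for capacity vs. all-FC:
mixed  = pool_cost({"ssd": 5, "sata_7k": 50})
all_fc = pool_cost({"fc_15k": 100})
assert mixed[0] < all_fc[0]   # cheaper per GB
assert mixed[1] < all_fc[1]   # and cheaper per IOPS
```

With these placeholder numbers, the mixed pool beats the all-FC pool on both axes at once - which is exactly the "big pool that sorts itself out" argument: the tiering software's job is keeping the right data on the right drives so the blended numbers hold in practice.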

The main effect of FAST Cache is: improving the response time and overall efficiency of the system for every IO, regardless of where it finds itself at any given moment - ergo IO latency/$ efficiency.

How they work together: heavy write and read IO workloads, if consistently heavy, will be on SSDs (sub-LUN FAST will move them). Periodic bursts of read/write IO will get SSD response times from FAST Cache, which will drain to slower magnetic media (since sub-LUN FAST will have moved the not-heavily-used workloads to slow media).

The sub-LUN FAST and FAST Cache combination is the best of both. Both are in the same EMC Unified Storage software release (in Beta now, GA in July).

As I showed in the demos above - configuring these is so simple that we expect it to become the default model (big pools that figure themselves out).

I suppose my question to NetApp folks out there (since these questions are both very NetApp centric) - do you feel **SO** confident in WAFL under all workload conditions (remembering of course that it was invented almost 20 years ago now) that you buy the "we don't need to auto-tier" argument?

I'd have to imagine that with the underlying NetApp plex/aggregate/flexvol structure, auto-tiering would be architecturally hard - perhaps that's the root of the - current - "we don't need it" worldview that seems to be going on there?

I think the fact that many (Compellent of course, but 3PAR and others as well) are delivering sub-LUN automated tiering (to leverage the trend towards SSD and large-SATA economics), and some (we certainly are) are also doing large low-cost cache (e.g. FAST Cache/PAM II) models, raises the question:

**what has NetApp indicated about auto-tiering plans, and what is the REAL plan - surely people aren't really telling Georgens that it's "just not needed"???**

Jonas Irwin

Chad - ntap is the new baby shampoo. NO TEARS!!! I'm waiting for them to put their money where their mouth is and only ship a single tier: SATA. Until they do that, their customers still need to manually figure out what data goes on SATA and what data goes on FC. This is complex, and adding native SSD to their systems (if they ever do, as Jay Kidd said they would) will make things even more complex.

I agree with your comment about ntap not being EMC's only competitor. If I remember correctly, they are somewhere around #5 (or is it 6?) in the overall storage market. Maybe I don't get it but wouldn't it make more sense for them to try and knock off their nearest competitors first before throwing rocks and broken glass at EMC?

I'd like to point out a few things about ntap's architecture that may help you understand things a bit better. If someone from ntap or anywhere else can enlighten me with new data or features that I’m not privy to, I’d love to learn more.

WAFL does a decent job with purely small-block random workloads only when it has boatloads of very contiguous free space. Take an empty ntap and an empty emc box. Carefully fill both systems with, say, 10-20% or so data. Then run a heavy random load with lots of writes, but be very careful to avoid aging the system beyond even 12 or so hours. Surprise! ntap will win the race every time. It's no accident that all the ntap-funded "independent 3rd party tests" comparing emc and ntap arrays are intentionally set up this way. Take an average of all their "independent 3rd party" papers' use of space, and you will find that they allocate no more than 10% of raw disk as usable space to their test kits. ntap's biggest architectural challenge, which will likely keep them at #5 in the market, is that customers simply don't run their arrays like this in the data center.
Once you start filling a ntap system, taking and deleting snapshots, and most importantly aging the file system, even over a moderate period of time (say 100 hours), the ntap performance profile changes very dramatically for the worse. Why? Metadata becomes a challenge (not only in the form of space overhead), as there are many levels of indirection a filer must traverse to find the real physical block being requested for a simple read. One simple read I/O can easily turn into five I/Os (or more) to traverse the metadata structures and locate the actual physical block. This can create massive read latency, and the poor customer is left trying to figure out what happened to their initial zippy performance. This issue is also why ntap has almost no footprint in the data warehouse space... sequential reads almost always turn into random reads. Their PAM cards can help mask this behavior (for random loads) by bringing the metadata read overhead onto SSD, but those I/Os still need to be performed somewhere. PAM cards are a very expensive "customer-paid fix" for ntap's intrinsic scaling problem, IMHO. I'm not sure, but in addition to needing PAM cards in each filer head (2x the cost to cover both sides of the cluster, plus rewarming over many hours when a routine failover occurs) and chewing up slots on the PCI bus, wouldn't that limit the overall number of front-end or back-end ports you could add to their filers?
On one hand, ntap tells customers to build the "big magical pool of storage" to share all available I/Os across all apps; on the other hand, they limit the size of the aggregate to 16TB, which seems really small given the growing size of drives these days. Maybe when they were shipping 144GB drives (who makes these? where did that extra 2GB go? ;-)) it might have worked okay. I think they can do better with ONTAP 8, but I've never met anybody who actually runs ONTAP 8 yet.
I only mentioned the impact on reads above, but what happens to what they tout as their "advantage": writes? Writes are also a huge challenge when the system is aged and moderately full, because free space is unfortunately trapped in smaller holes, and new writes must seek many times over to find the open free space. Again, this results in unexpected latency and unpredictable behavior. The reallocate utility attempts to come to the rescue, but it needs a ton of CPU resources and lots of time, and will drag the system down. I suppose this is why, in the real world, many ntap arrays rarely get more than 1/3 full of the disks their spec sheets say they can scale to. I can't tell you how many shops end up with 30 or more of these things (only 1/4 to 1/3 full) scattered all around their data centers. It's as if ntap has created a new industry storage category: "Networked DAS".


@ Jonas:

Throwing FUD is not conducive to respectful selling. Those same points have been the mantra of anti-NetApp competitive sales for the last 10 years, but the real-life success stories, the company's earnings, and its amazing growth tell the real story.

I have large customers with 10,000+ replicated snaps on their arrays, and they seem to be running just fine (full, lots of I/O, data warehouses, complex DBs, tons of VMware, etc. - all without PAM). Funny, that snapshot comment coming from EMC, a company that only allows 8 snaps per LUN (and with a well-publicized, huge 50% performance hit...)

Indeed, even though you work for EMC, you will probably use our storage at least a few times today, since we provide the back-end disk for most of the online providers.

Maybe you need to read http://bit.ly/aNMwon and http://bit.ly/cnO2

Back to actually discussing technology.

This is turning into a post about NetApp instead of answering Chad’s legitimate questions. Let's put it this way:

NetApp provided thought leadership by shipping the PAM cache years before EMC even announced something similar (let's not forget FLARE 30 and sub-LUN FAST with the gigantic 1GB chunk are not even here yet, and won't get wide adoption until they mature). It's silly to think we're not working on new stuff for others to have to catch up on (again) :)

Regarding thought leadership in auto-tiering: Compellent was first with their auto-tiering and has a 512K minimum chunk. How do they do it?

Regarding thought leadership in (true) Unified Storage: NetApp, obviously. The (true) unified EMC system is coming what, (maybe) 2011? Almost 10 years later than NetApp?

Regarding thought leadership in true block-level deduplication of all primary storage protocols: NetApp again. Nobody else is there yet.

What about deduplication-aware cache? Which, in turn, deduplicates the cache itself. Since nobody else deduplicates all primary storage protocols at the block level, nobody else has this cache deduplication technology.

Enough with the trash talk. BTW, I like the V-Max. I hope Enginuity is getting the SSD cache.

Auto-tiering is a great concept, but everyone doing it seems to suffer from potential performance issues because the data-movement algorithm won't react fast enough to rapidly changing workloads. It can work well if the workload is predictable and stable over time - enabling you to just dump your data into an array and have it (eventually) figure out where the different hot/cold areas should reside.

The addition of huge chunks of cache goes a great way towards alleviating this, but it's only part of the answer. Otherwise, it's a solution waiting for a problem. Good for some workloads, but not all. Great to have if it gets out of the way when needed.

To answer Chad's question: each cache card is separate and only seen by its controller - this is, fundamentally, an architectural difference, and it seems to work well in the real world. Upon controller failure, the other cache card has to warm up with the workload from the failed controller. The cards are fast enough that this happens very rapidly (each board is much faster than several STEC SSDs - the benefit of a custom design - and no, the warm-up doesn't take "many hours").

But, of course, I will not just go ahead and divulge the NetApp roadmap just because Chad is asking :) (just as Chad wouldn't divulge EMC's roadmap if I were asking, no matter how nicely).

I’ll give you my thoughts on the no-tiering message (may or may not agree with the NetApp CEO, it’s my own opinion):

In many situations, a decently designed box (NetApp with PAM, XIV, possibly CX with FLARE 30 and SSD cache) can get a lot of performance out of just SATA (NetApp has public SPC-1 and SPEC benchmarks for both OLTP and file workloads where PAM+SATA performed just as well as FC drives without PAM).

However, I don’t believe a single SATA tier covers all possible performance scenarios just yet (which is why I don’t agree with the SATA-only XIV approach – once the cache runs out, it has severe scaling problems and you can’t put any other kind of drive in it).

When I build systems, there are typically either 1 or 2 tiers + PAM. Never more than 2 tiers of disks, and very frequently, 1 tier (either all the largest 15K SAS drives, or all SATA if the sizing allows it). I see it this way:

It’s fairly easy to put data that should be on SATA there in the first place – most people know what that is. If you make a mistake, the large cache helps with that. It’s also fairly easy to put the rest of the data in a better-performing layer. Is it ideal? Not really. Should tiering be automated? Sure. But, until someone figures out how to do it without causing problems, the technology is not ready.

I will leave you with a final question: for everyone doing sub-LUN auto-tiering at the moment, how do you deal with LUNs whose hot spots are spatially spread out across the LUN? (This is not an edge case.) For instance, take a 2TB LUN (say, for VMware). Imagine this LUN is like a sheet of finely squared paper. Now, imagine the hot spots are spread out among the little squares.

Depending on the size of your chunk, each "hot" chunk will encompass many of the surrounding little squares (pity I can't attach an image to this reply), whether they're "hot" or not.

With sub-LUN auto-tiering, the larger the chunk, the more inefficient this becomes. Suddenly, due to the large chunk size, you may find half your LUN is now on SSD, where maybe only 1% of it needs to be there. Cache helps more in that case, since it uses a small block size (4K on NetApp, 8K on EMC). It's an efficiency thing.
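The scattered-hot-spot argument above can be sketched numerically: spread some hot 8KB blocks uniformly across a 2TB LUN, then count how much capacity gets promoted if any chunk containing a hot block moves to SSD. The LUN size, hot-spot count, and uniform spread are illustrative assumptions:

```python
import random

random.seed(1)
LUN = 2 * 1024**4                      # 2TB LUN
hot_offsets = [random.randrange(LUN) for _ in range(20000)]
# 20,000 hot 8KB blocks = ~156MB of genuinely hot data

def promoted_bytes(chunk):
    """Capacity promoted if every chunk holding a hot block moves."""
    touched = {off // chunk for off in hot_offsets}
    return len(touched) * chunk

for chunk in (1 << 30, 768 * 1024, 8 * 1024):   # 1GB, 768KB, 8KB
    gb = promoted_bytes(chunk) / 1024**3
    print(f"chunk {chunk >> 10:>7} KB -> {gb:10.1f} GB promoted")
```

With uniformly scattered hot spots, the 1GB chunk ends up promoting nearly the whole LUN, while finer granularities promote capacity much closer to the actual hot footprint - which is the efficiency point being made (and, per the reply that follows, why real workloads with temporal/spatial locality behave far better than this worst case).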

It’s not that easy for a cool concept to become useful technology.


Jonas Irwin

@Dikrek -
It wasn't my intention to turn Chad's post into a debate with ntap. I could easily respond with lots of questions for you, and just as easily refute all the stuff (FUD?) you said about emc, but I'll save that for another time and place ;-). Perhaps over a beer or something. To respond with a little honey instead of the predictable piss and vinegar - we agree that you guys were first with PAM... nice work! If memory serves, we had EFD first, and that still has merit for truly cache-hostile workloads - that's probably why you guys still partner with Texas Memory for V-Series.

I'll try to answer your statements and questions about 1GB slices being "huge". Really? 1GB is only 0.05% of a single 2TB SATA drive. Storage pools will easily have a hundred drives of mixed types, but will very often predominantly consist of 2TB drives. For the CX, I'd argue 1GB is relatively small. Is it perfect? Nope. 512KB sounds great, but the tradeoff is all the metadata baggage that needs to be stored and tracked for each slice. Each auto-tiering implementation has pros and cons, I suppose. We've seen some phenomenal results with the 1GB slice, but it's by no means an answer to all workloads and all types of data. You could probably play with IOMeter and create an artificial workload that makes it look really bad :-). For cold data, mega caches complement auto-tiered data quite nicely, with the added benefit of accelerating not only reads but writes as well. The long-term benefit of FAST is that all the stale stuff, which is most of what sits on the array, ends up trickling down to low-cost SATA. Like it or not, it makes for a great TCO in every dimension.

To your question about a highly scattered random workload being bad for auto-tiering with a 1GB slice: we can probably agree that there's a temporal nature to most workloads - not all, but most. What begins to emerge for our customers is a sort of datacenter-wide storage "working set". We've seen data from literally thousands of VMware environments and have been able to generate thermographic charts that show very favorable patterns for auto-tiering. Net/net: the pros of mobility of a 1GB slice outweigh the cons for the vast majority of use cases. Ultimately, as long as customers use a variety of drive speeds and EFD, the ability to throw all drive speeds into a few pools and let the array figure it out is something most customers find very appealing :-)

Chad Sakac

@dikrek @jonas:

Dikrek - I try to stay away from claims of "FIRST!" and "THOUGHT LEADERSHIP!" - in the end, the first is transient, and the second is in the eye of the beholder.

Jonas is a good guy - I think you made him snap a bit. As the "800lb gorilla" (I don't mean that in an arrogant way, rather as a simple "we are the biggest"), you can't imagine how often we hear statements like "EMC sucks at X, we rock" - and eventually it gets frustrating, and one wants to punch back.

While perhaps inevitable (everyone compares themselves against the largest player in every category), it gets hard after a while.

For example - the "well documented" snapshot thing you describe - that was the test NetApp commissioned and ran.

Also, EMC doesn't do just 8 snapshots. That's a CLARiiON snapshot limit. We can do 1,000 file-level snapshots, each writeable. We can do 96 filesystem snapshots, 16 writeable at a time. We can do continuous data protection (effectively "infinite snapshots") with RecoverPoint. We can do 128 writeable snapshots on Symmetrix. I know that might make us hard to follow, but it also means that almost any time anyone says something about us, they are wrong - which makes competing easier :-)

Likewise - our approach on Unified Storage has been focused on this:
- yes, customers want single storage solutions to support multiple protocols.
- the implementation of HOW that gets done is less important, what is important is that it's easy, and that they get the functionality they want/need.
- that they get it at the right price, via the channel they want, and with the support they need - in presales and in post-sales.

In our mind, we're able to do that well (and doing well in the market - which is of course the ultimate judge, as you point out).

Our approach (encapsulation of key functions) has enabled us to innovate down several axes at once, without getting into the effect of merging complex kernel codestreams that become co-dependent.

We have merged big chunks (underlying storage allocation logic, iSCSI target code and more) across kernels and will continue where it makes sense. It has enabled us to do things that are obscenely hard the other way.

The unification of our management models (unifying block, NAS, and also CDP) helps our customers. That was the only real beef people had (as opposed to FUD about the underlying implementation). Check. Done. Don't start the "multiple ways of doing one thing" argument with me - I'll bring a world of examples where you need a ton of interfaces, kernels, and management models to do a set of things :-)

That isn't to imply that the NetApp choices are by definition wrong, just DIFFERENT.

Want an example? Your question on autotiering and granularity is architecturally just totally different for each of us. As an example, if you have a VMware guest, and it has a guest swap - it's likely to be contiguous, or mostly contiguous with a virtual pool model that uses a traditional underlying block layout scheme. Likewise, same holds true for database structures.

This means that the data bears out that auto-tiering at 1GB level granularity has a huge beneficial effect. I certainly invite people to say "uh - you don't want it, you won't save anything". We can demonstrate materially that they will - at which point the person saying "they sux, we rox" will look flat out silly.

Conversely, if you use a "reallocate on every write" journaled model - if you didn't auto-tier at or close to the allocation size (in WAFLs case, 4K), the benefit would be much much lower. Your comment of "Imagine this LUN is like a sheet of finely squared paper. Now, imagine the hot spots are spread out in the little squares." is very WAFL-oriented. The "little squares" have more locality in alternate approaches. It's a superpower and a kryptonite at the same time - like ANY of the core design decisions we all have to make when designing a platform.

Again - not right/wrong - just different.

That difference meant it was easier for NetApp than for others to implement a sub-file-level deduplication approach. And since the cache refers to blocks via higher-level inode/pointer structures, a block referenced several times is cached only once - a second-order benefit that fell out of that design.
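A rough sketch of that second-order benefit - logical blocks that dedupe down to the same physical block share a single cache entry. The structure and names are illustrative, not NetApp's implementation:

```python
# Post-dedupe block map: two logical blocks point at one physical block.
logical_to_physical = {
    ("vm1.vmdk", 0): "phys_A",
    ("vm2.vmdk", 0): "phys_A",          # identical OS block, deduped
    ("vm2.vmdk", 1): "phys_B",
}
cache = {}                              # keyed by PHYSICAL block

def read(file, block, disk):
    """Resolve the logical address, then cache by physical block,
    so every logical reference shares one cached copy."""
    phys = logical_to_physical[(file, block)]
    if phys not in cache:
        cache[phys] = disk[phys]        # single fill serves all references
    return cache[phys]

disk = {"phys_A": b"os", "phys_B": b"data"}
read("vm1.vmdk", 0, disk)
read("vm2.vmdk", 0, disk)               # cache hit: same physical block
assert len(cache) == 1                  # two logical reads, one cache entry
```

Keying the cache on the physical (post-dedupe) block rather than the logical address is what makes the cache itself "deduplicated" - N virtual machines booting from near-identical images consume roughly one image's worth of cache.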

Again - not right/wrong - just different. In the same way, we are working at capacity efficiency from different angles, starting with what we can do to help customers the most, and the most quickly. Hence the focus on file-level dedupe on primary storage coupled with compression now, and sub-file block-level dedupe on backup targets. I would argue (and do) that we can compete with ANYONE when it comes to $/GB and $/IO in every dimension. It's not about any given feature, but rather the solution efficiency.

Of course we are working on sub-file-level dedupe, in the same way I'm SURE there's someone working on auto-tiering at NetApp.

Re: "thought leadership" - I don't know how Compellent does their auto-tiering, but regardless, they were certainly the first player of any size to lead the way there. Each of us has innovated over and over again - no one has an exclusive license on innovation. EMC has created new categories of storage - not once, but several times - as well as point innovation. Just look at your Bycast acquisition - in essence validating that customers have a need for Atmos-like storage models.

Personally, I'm glad that my employer tries to embrace innovation from different sources (R&D, M&A, and watching what competitors do).

We've intro'ed dense storage configs (like Copan), auto-tiering (like Compellent), and unified NAS/SAN management (like NetApp) - all while introducing new things (like primary storage compression and global cache coherency)... the list goes on and on.


Chad et al,
When it comes to thought leadership on automatic sub-LUN tiering, I think we should all acknowledge that HP Labs can probably claim the moral high ground, thanks to the excellent work done by John Wilkes, Richard Golding, Carl Staelin, and Tim Sullivan in creating a RAID array with automatic tiering back in 1996! See http://www.hpl.hp.com/research/ssp/papers/AutoRAID.TOCS.pdf

Unfortunately, like much of the other work done in HP's storage labs, this never got the support it needed from the rest of HP to turn into a successful product, and when they bought Compaq, the somewhat misnamed "Enterprise Virtual Array" (which IMHO is neither Enterprise-class nor particularly "Virtual") killed off what could have been a much better product.

The good thing about EMC and NetApp (I work for NetApp, btw, and have never worked for HP) is that our focus means our good ideas become good products, and we invest heavily in making sure our customers can take advantage of the technology we imagine and create. Ultimately it doesn't matter who thought of it first; what matters is who is able to solve problems in the most efficient manner.

Vision and thought leadership are cool, but to paraphrase the CIO of Morgan Stanley, the true measure of differentiation is execution.

Personally, I'm more impressed by shipping products and happy customers than by lab results, white papers, and products that haven't been released yet. Having said that, the engineer in me is looking forward to seeing how well EMC will execute on their vision.

John Martin

Cristi Romano

Hi Chad,

I have a small tech question.
Can I enable Fast Cache on all of the five Vault Pack SSD drives?
Is it advisable?


Mike Riley


Sometimes ex-NetApp employees make for the most passionate evangelists for the competition. That's great passion! All the best to you in your career at EMC - just not too much when competing against us over here at NetApp :-) if you don't mind.

Unfortunately, your WAFL analysis is dated (i.e., measured in years) in some areas, and your proof points are simply wrong. I would caution against using some of those "aged NetApp system" points in the field, and against your data warehouse example. Those are just softballs over the middle of the plate for most of the NetApp field nowadays. I'm not telling EMC how to train their sales folks - in fact, the NetApp in me says keep heading down this path - but we really don't need to get down into the weeds on how WAFL works. At the very least, competitors start with the name WAFL and let their imaginations run wild from there. In sales campaigns, once a competitor pulls out the FUD paper, it's almost like witnessing a fender-bender: you know it's going to end badly for them, but you just can't take your eyes off it. You can use these rants if you want, but I don't think they work out all that well for you. I do think, to one of Chad's points, that most customers don't care *how* the solution works. They want to know whether or not it solves their problem and *what* benefits they will see.

Much of what has been talked about here - unified storage, snapshots, primary storage dedupe, flash as cache - aren't important because NetApp pioneered in these areas. From a NetApp point of view, these were relatively easy to do because they were already part of the WAFL DNA. Whether by luck or design, WAFL lends itself very well to market shifts, particularly the shift towards efficiency and Cloud architectures. It's not about big beating small anymore. It's about fast beating slow. Nimble 800# gorilla is an oxymoron of sorts, isn't it? :-)

Anyway, Jonas is a good guy. I wish him success and I'm sure he will do right by his customers. Based on his post, though, I'm pretty sure WAFL isn't his strong suit but that's O.K. He works for EMC. Have him tell you why you should buy from EMC rather than why you shouldn't buy from NetApp.


O.K. - I had to chuckle a little at this statement: "I know that might make us hard to follow, but it also means almost anytime anyone says something about us, they are wrong, which makes competing easier :-)" I'm not sure the "Where's Waldo" strategy turns out all that well. I'm thinking having a bunch of incongruous approaches to answer the same basic problem wouldn't be a strength, at least not in a customer's eyes. The implicit challenge is to find Waldo, and performance is Waldo for EMC. It's a challenge to be dealt with. That's simply not a variable for NetApp, nor does NetApp have to amortize development across a wide variety of platforms and features. It just means that, comparatively, NetApp can be more nimble with their solutions and adapt quickly to changes in customer demands. I'm not saying EMC can't - not wrong; just... different.

Chad Sakac

@Mike - it's not a variety of approaches to solve one problem, it's a variety of approaches to solve different problems.

- Sometimes customers consolidate everything, including a broad set of open systems (not just Windows/Linux) and mainframes.
- Sometimes the midrange "lose a significant percentage of brains/ports/cache when you upgrade or have a storage brain fail" behavior is a deal breaker for a customer.
- Sometimes customers need heterogeneous replication.
- Sometimes customers' requirements demand inline dedupe.
- Sometimes customers need an RPO of zero - not sometimes, but always.
- Sometimes customers need to support tens of thousands of devices.

There is a long, long list.

Sure, in many cases simple, easy & efficient is good enough, and we're happy to fight that out with any respected competitor.

But the "one way all the time" thing - I think perhaps even you folks don't really buy that.

Does it cause any cognitive dissonance that:

1) ONTAP 8 has two distinctly different modes, with different feature sets and capabilities (yes, yes, to be merged at some future date).
2) If one way to back up were always the right way (snap and replicate - which, of course, we can also do) and inline dedupe is bad, why the bid for Data Domain?
3) Bycast (a good buy, IMO) - does that run ONTAP? Hmmm, no. And I guess it DOES highlight that at very high internet-class scale, object models with rich metadata (a la EMC Atmos and Amazon S3) that have no intrinsic dependency on a given filesystem make sense after all, huh?
4) One way all the time, right? So why so many SnapManager products? Why not unify them (we have - one Replication Manager)?

Look - this isn't to claim we're perfect, and you'll note that in my post I didn't make ANY comparison to anyone (NetApp included). EMC has lots to improve, and I wake up every day to try to make it a little bit better (after kissing my wife and kids - I've got my priorities straight :-). An example where we can improve is more consistency - a common look/feel/function across our capabilities - and if you look at my simplicity post from EMC World, you can see our massive progress in that area, something we're pretty excited about.

My comment about our different approaches to solutions was humor - nothing more, nothing less. Yes, we think that sometimes different technology answers to different technology questions are the way to go. Whether we're right, and whether those answers are sufficiently differentiated and sufficiently integrated - well, that's up to the customer.

I will restate my comment, though - I've found that what NetApp thinks it knows about our products is usually at LEAST as wrong as what we think we know about theirs.

Mike Riley

Hi, Chad.

Since this seems to have taken a decided NetApp-centric turn, I posted my response on this NetApp blog:

When is it FUD? When is it Ignorance?

Have a great week.


Chad Sakac

@Mike - sad how it turned into a NetApp/EMC bash fest. That happens far too often. Note that in the post, I referred to NetApp only once, and that was to acknowledge innovation around Flash used as an extension of system cache.

The record is clear (you can see it all above in the post and the comment thread), the pile-on started with a series of questions posed by pro-NetApp folks, and then it all went downhill from there.

We each see the world through our own eyes, and through the lens of our own experiences.
I always try to make the posts not refer to anyone else (except where the only right thing to do is to make an acknowledgement). I don't filter comments (I haven't, and will continue to try not to) - heck, they aren't even moderated. That means there's no stopping anyone from making any comment.

I'm going to try a new technique - and we'll see how it goes.

As soon as the dialog starts going sideways, rather than participating (fueling?), I'm simply going to say:


As in "Tomato"/"TomAHto"

We're both doing fine in the marketplace, both gaining share, both have many innovations, both have fans and detractors.

The more I get sucked into the back n forth, the more I think it just hurts us both when the blogosphere/twitpiss contests happen.

Don't get me wrong, I still vehemently disagree with you on many things, but that's OK.

Thanks for the comment.


Chad - surely you mean TomAYto ? heehee not even safe on that one!

In the end there are always two parts - the blog and your comments which are controllable - and the outside comments which can be wild and free.

Sadly, Jonas Irwin fell into the trap of believing he is still an authority on a company once he has left - it never works, and it just invites personal questions, doesn't it? His job is "Director, Competitive Strategy, EMC" after all?


I have a Celerra NS-120 - will these features be supported on it? How would I go about adding one Flash drive? Would this be a new shelf?

Chad Sakac

@PxPx - thank you for being an EMC customer!

Yup, those features are all GA now, and you can get them when you upgrade to FLARE 30 and DART 6.0. You get a LOT with that (Unisphere, VAAI support, and a lot more), it's free (though the FAST Suite is licensed), and it's non-disruptive. I would highly encourage it.

EMC's implementation of SSD support is very flexible. You can stick any SSD in any enclosure - basically anywhere you would put a disk.

When doing your update, please download and use the EMC Procedure Generator. It will generate a personalized process JUST FOR YOU.

Thanks again!


Do I need to download the Clariion Procedure generator, or the Celerra Procedure generator, or should I get both?

Don't I need to add 2 flash drives minimum in a raid-1 for fast cache?

Chad Sakac

@Shane - you need both the CLARiiON and Celerra Procedure Generators (yes, they are actively being merged, analogously to how we are merging the products). Note that as I'm posting this comment, there is a momentary pause on FLARE 30 upgrades, which should be lifted shortly.

You can have an "unprotected" FAST Cache, but then it operates in read-only mode. If you want to use it in read/write mode (and you do), the minimum is a RAID 1 config. We're seeing most deployments go out the door in parity configs, but RAID 1 is fine.
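To illustrate why the protection level changes the caching mode, here is a hypothetical sketch (not EMC code, and not the actual FAST Cache algorithm): an unprotected cache can safely serve reads, because a lost cache page can always be re-read from the backing disks, but caching writes means the cache temporarily holds the *only* copy of a dirty page, so the cache devices must be mirrored to survive a drive failure.

```python
# Illustrative sketch only: why a read/write cache needs mirrored devices,
# while an unprotected cache must fall back to read-only (write-through).

class FastCacheSketch:
    def __init__(self, mirrored: bool):
        self.mirrored = mirrored
        self.pages = {}     # page -> cached data
        self.dirty = set()  # pages absorbed by the cache, not yet on disk

    def read(self, page, backing):
        # Reads are always safe to cache: losing a clean cache page
        # just means re-reading it from the backing disks.
        if page not in self.pages:
            self.pages[page] = backing[page]
        return self.pages[page]

    def write(self, page, data, backing):
        if not self.mirrored:
            # Unprotected -> read-only mode: write through to disk
            # so the cache never holds the sole copy of anything.
            backing[page] = data
            self.pages[page] = data
        else:
            # Mirrored -> read/write mode: absorb the write now
            # (fast), destage it to disk later.
            self.pages[page] = data
            self.dirty.add(page)

    def flush(self, backing):
        # Destage dirty pages to the backing disks.
        for page in list(self.dirty):
            backing[page] = self.pages[page]
            self.dirty.discard(page)

backing = {0: "old"}
cache = FastCacheSketch(mirrored=True)
cache.write(0, "new", backing)
# The dirty page exists only in cache until flushed - hence the mirror.
assert backing[0] == "old"
cache.flush(backing)
assert backing[0] == "new"
```

The class and method names here are made up for the sketch; the point is only the trade-off: absorbing writes buys latency, but only if the cache itself is redundant.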

Thanks for being a customer!


I have Unisphere 6.0.36-4 installed with FLARE 30 on my Celerra NS-120. However, my Unisphere looks nothing like the version in your video demonstration. How do I do the auto-tiering and compression on the Celerra version of Unisphere? Those same options don't show up.


Here is what I got back from EMC support. Supposedly these features don't work yet on Celerra LUNs.

Certain new CLARiiON FLARE Release 30 features are not supported by Celerra and the initial 6.0 NAS code release. CLARiiON LUNs containing the following features will not be diskmarked by the Celerra, resulting in diskmark failures similar to those described in the previous Symptom statements:
• CLARiiON FAST Auto-Tiering is not supported on LUNs used by Celerra.
• CLARiiON Fully Provisioned Pool-based LUNs (DLUs) are not supported on LUNs used by Celerra.
• CLARiiON LUN Compression is not supported on LUNs used by Celerra.

Chad Sakac


- FAST Cache is fully supported for NAS.
- FAST is supported for direct block storage provided by your Celerra today, and will be supported for NAS volumes VERY shortly. Today, fully automated storage tiering for NAS is done at the FILE level, not at the block level (but, as noted, block-level support will be available shortly).
- For your NAS volumes, you're already ahead in that race :-) You get file-level dedupe and compression as a native part of what the Celerra offers.
- Thin provisioning is also provided for NAS.

@Derek, @Direk - this whole comment thread only makes my point, now that NetApp has introduced native SSDs as a non-volatile tier in addition to their Flash Cache (though not yet with automated tiering).

Moral of the story - each vendor tends to "pooh pooh" the other's approach, while furiously evaluating whether it's a good idea (EMC certainly does).

Hence my efforts to not say bad stuff about the other guy.

I think it might warrant a blog post :-)


I have a question about EMC RecoverPoint licensing. I have one RP-HW-1U-GN4B in the main site and another at the remote site. I need a 100TB CRR license for remote replication. Please tell me which product number and what quantity I must choose for RecoverPoint licensing.




  • The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by Dell Technologies and does not necessarily reflect the views and opinions of Dell Technologies or any part of Dell Technologies. This is my blog; it is not a Dell Technologies blog.