Fresh Thinking on IT Operations for 100,000 Industry Executives

 

Andy Monshaw


I met up with Andy Monshaw, General Manager of IBM’s Storage Division, last week for an exclusive Hot Aisle interview and found him in ebullient mood. Andy has been running Storage for IBM since January 2005 and is an industry veteran. He had prepared a huge list of new products and initiatives he wanted to share with me.

He started with a staggering set of industry statistics, the first being that newly created data is growing at an annual compound rate of 100%. So for every Terabyte of new data last year we are seeing two Terabytes this year. Gulp….

“Digital X-Ray, surveillance, high definition media, global Internet adoption and social networking are driving newly created storage demand through the roof.”

The second statistic that Andy shared is that backup and archive data is growing at 300% year on year, making backup a major challenge for vendors and customers alike. I remember years ago hearing that tape was dead and we would move to disk-based backup solutions. Like lots of these predictions, it isn’t happening. Andy told me that IBM had 8 Terabyte tape units in development and 100 Terabyte tape units in research, that investment in tape technology was accelerating and that he saw no slow-up in tape sales anytime soon.

“Tape is not dead and in fact is the second greenest way to store data. The greenest way is either not to store it in the first place or to delete it completely when its value is gone.”

Expanding on the green point, Andy explained the acquisition of Diligent, the second of two Israeli companies that joined the IBM fold this year. IBM bought Diligent for its state-of-the-art in-line data de-duplication software.

“Our Diligent acquisition enables IBM to deliver real-time 900MB/s in-line de-duplication that stops, at source, the need for our customers to create multiple unnecessary copies of the same data. Data de-duplication is an emerging technology that organizations are investing in today and Diligent’s innovative technology provides a single solution to support data protection, archive and data-retention applications – all while maintaining the integrity of the data.”
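To make the idea of in-line de-duplication concrete, here is a minimal Python sketch. It is a generic hash-based illustration and not Diligent's algorithm, which the interview does not describe; the `DedupStore` class, the fixed chunk size and the use of SHA-256 are all assumptions made for the example. Each incoming chunk is fingerprinted as it is written, and a chunk that has been seen before is recorded by reference instead of being stored again.

```python
import hashlib


class DedupStore:
    """Toy in-line de-duplicating store (hypothetical, for illustration only)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size  # assumed fixed-size chunking
        self.chunks = {}              # fingerprint -> chunk bytes, stored once
        self.bytes_in = 0             # total bytes the hosts asked us to write

    def write(self, data):
        """Store data and return its 'recipe': the list of chunk fingerprints."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Only keep the chunk if this fingerprint has not been seen before.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        self.bytes_in += len(data)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(self.chunks[fingerprint] for fingerprint in recipe)

    def stored_bytes(self):
        """Bytes physically kept after de-duplication."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Writing the same backup twice to such a store grows `bytes_in` but leaves `stored_bytes()` unchanged, which is the saving Andy is pointing at. A real product also has to handle fingerprint lookups at line rate, which this sketch ignores.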

He told me that IBM now had over 5,000 SAN Volume Controller (SVC) customers with 13,000 instances in production and as a result SVC was still the clear market leader in Storage Virtualization. IBM have added a few additional capabilities to SVC (latest version 4.3) since I last looked, including Thin Provisioning and the capability to incorporate solid state disk.

Although IBM all but invented the modern storage industry (Computer Tape – 1950, Winchester Disk – 1954…), EMC have owned it commercially for well over a decade. Andy smiled broadly as he told me:

“EMC’s Invista has tanked with about fifty customers, exclusively EMC hardware at the back end and functionality limited to a data mover between storage tiers. This is in marked contrast to SVC, which is vendor independent, rich in functionality and widely accepted as the market leader in storage virtualization.”

The first Israeli company IBM acquired was XIV. XIV is the Israeli storage company started by Moshe Yanai, the ex-EMC Chief Engineer and inventor of the Symmetrix line of storage controllers. Yanai left EMC after being removed as Chief Engineer overseeing the company’s hardware division and sidelined to become an adviser to EMC CEO Joseph Tucci. EMC had decided to change direction and focus on software (VMware, Documentum etc.); Yanai wanted to continue with the hardware path that had delivered huge growth in shareholder value over EMC’s 20 year history.

Yanai apparently started XIV in retribution for the way he was treated at EMC, developing a storage product that moves the enterprise storage array away from a centralized, tightly coupled architecture (shared cache etc.) to loosely coupled cloud storage better suited to distributed systems.

Steve Duplessie takes up the story in his blog Steve’s IT Rants:

Andrew Monshaw of IBM and Moshe appear to share the same deep rooted (and possibly psychotic) hatred for the evil machine corp. (aka – EMC – The Hot Aisle)

The story is most likely far more interesting than the truth in this case. Anyway, post acquisition there wasn’t much news. That is, until the last week or so. There has been more XIV bashing in the blogosphere in the last few weeks than there should be in 8 years. I was blissfully trying to avoid the whole thing but there is something afoot, so I started paying attention. Here’s the poop:

All of a sudden all the EMC bloggers started tossing XIV grenades, starting with Barry Burke – aka the Storage Anarchist – about how the XIV box sucks (I specifically adore the reference to a lack of performance benchmark data from IBM, seeing how EMC will literally shoot anyone who even suggests the thought in the center of their corporate quad at noon). He goes on to talk about how the XIV box sucks more power than a Sym or a Clariion (which is comical on several levels). There are LOTS of media blogs on the subject, basically questioning everything from whether it is worth the money to how IBM is going to die because they are hiring 1985 EMC sales people to sell the thing.

XIV is a very interesting investment for IBM. It is the antithesis of the Enterprise Storage Controller, being constructed from lots and lots of Intel-based computers, loosely coupled into a cluster with some very smart software. Here is what IBM say about the architecture:

The IBM XIV Storage System is built as a grid-based storage system of independent modules.

The modules are implemented using off-the-shelf, Intel-based servers within a customized, Linux-based architecture, and are interconnected over redundant Gigabit Ethernet switches. The modules act jointly as a large data grid devoid of a common backplane. The grid is managed by sophisticated distributed algorithms and delivers enterprise-class performance, reliability, and functionality.
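IBM does not spell out those distributed algorithms here, so the following Python sketch is only my illustration of the general idea. It borrows two details from the comment thread below (data spread in 1MB chunks, and mirroring) and invents everything else: the `place_partition` and `survives` helpers and the hash-based placement rule are assumptions, not XIV's actual method.

```python
import hashlib

PARTITION_SIZE = 1024 * 1024  # 1MB chunks, as mentioned in the comments


def place_partition(volume, index, modules):
    """Pick (primary, mirror) modules for one partition; the two always differ.

    Placement is derived from a hash of the volume name and partition index,
    which is an assumed stand-in for whatever the real system does.
    """
    if len(modules) < 2:
        raise ValueError("mirroring needs at least two modules")
    digest = hashlib.sha256(f"{volume}:{index}".encode()).digest()
    primary = int.from_bytes(digest[:8], "big") % len(modules)
    # An offset between 1 and len(modules) - 1 guarantees a different module.
    offset = 1 + int.from_bytes(digest[8:16], "big") % (len(modules) - 1)
    mirror = (primary + offset) % len(modules)
    return modules[primary], modules[mirror]


def survives(placements, failed):
    """True if every partition still has a copy on a module that has not failed."""
    return all(p not in failed or m not in failed for p, m in placements)
```

Because the two copies of every partition sit on different modules, losing any single module leaves a full copy of the data elsewhere in the grid, and the surviving copies are scattered across all the remaining modules, so many of them can take part in a rebuild at once.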

As Rich Kucharski, IBM’s Managing Director of Solutions Architecture, described, the fault tolerance of the XIV is stunning:

“We ran a set of seriously destructive tests with a client which involved yanking out whole modules of the XIV whilst it was running live with hosts reading and writing data. The modular, loosely coupled architecture and smart recovery software made the exercise completely transparent to the host systems. We managed to take out over 50% of the modules before anything host affecting happened and when we put a module back in the XIV recovered, the host was able to restart writing. Needless to say the client was staggered.”

The XIV system is built modularly, from the following components:

  • Data modules. The data modules are Intel-based servers with a large number of disks. The modules store the data and perform all advanced storage functionality, such as redundancy, snapshots, and caching. Some of the data modules also contain interface connectivity; these are responsible for accepting host I/O commands (via FC or iSCSI) and forwarding them to the appropriate data module.
  • Gigabit Ethernet switches. The switches are the interconnect between all the data and interface modules.
  • UPS units. The UPS units ensure that the system has enough time to de-stage all cached data upon a power outage.

The XIV system provides a management function that handles all system-wide management functions: allocating new volumes, etc. In keeping with IBM XIV’s streamlined, standardized approach to architecture, the function does not have its own dedicated hardware. Instead, it runs on one of the standard modules and, in the event of failure of that module, automatically restarts on another module.
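As a rough illustration of a management function with no dedicated hardware, here is a small Python sketch. The `Cluster` class, its method names and the "first healthy module" election rule are all hypothetical; IBM's description above does not say how the replacement module is chosen.

```python
class Cluster:
    """Toy cluster whose management function runs on an ordinary module."""

    def __init__(self, module_names):
        self.healthy = list(module_names)
        self.manager = self.healthy[0]  # no dedicated management hardware

    def fail(self, name):
        """Mark a module as failed; restart the manager elsewhere if needed."""
        self.healthy.remove(name)
        if name == self.manager:
            if not self.healthy:
                raise RuntimeError("no healthy modules left")
            # Assumed rule: restart on the first module that is still healthy.
            self.manager = self.healthy[0]

    def allocate_volume(self, volume, size_gb):
        """A management request, served by whichever module hosts the function."""
        return f"{volume} ({size_gb} GB) allocated via {self.manager}"
```

Losing the module that hosts the function simply moves the function: after `fail("m1")` on a cluster of `m1`, `m2` and `m3`, volume allocation carries on via `m2`.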

  • http://www.storagerap.com marc farley

    But not exactly very green.

  • http://www.thehotaisle.com thehotaisle

    Hi Marc,

    Thanks for the comment but…. I thought that the point of the XIV was it used single 1TB SATA disks rather than the equivalent three 16,000 RPM energy sucking 300 GB monsters in a DMX, Tagma or DS8000? SATA drives use a hell of a lot less power.

    XIV gets its performance from not having a single backplane, spreading data in 1MB chunks across lots of spindles and using smart caching software in a distributed cache rather than the normal enterprise controller shared cache.

    I am keen to see the Watts per GB stored and Watts per GB transferred figures to do a real comparison though. Steve Duplessie (Steve's IT Rants) picks up that an awful lot of the XIV bashing seems to be coming from EMC-connected figures. Maybe EMC are worried; after all, DMX 4 is the last of the range. Where do EMC go after this?

    Steve

  • http://www.storagerap.com marc farley

    Yes, 1 TB SATA drives help, but you have to start with a large number of them (180, I believe) and you have to use mirroring, and that's where my “anti-green” comment came from.

    Wide striping is good and I suspect XIV will have decent streaming performance, but their distributed cache connectivity is 10GB Ethernet (I believe, again) which has great bandwidth characteristics, but relatively weak latency characteristics, which will likely impact transaction processing capabilities.

    So, I'm from 3PAR and of course any product that a customer could purchase instead of ours is a concern to me. The reaction from EMC probably comes from a couple places: their competitive culture and wanting to stomp on the perception that the company is still running on Moshe's fumes. BTW, I like your question Steve. Hulk and Maui appear to be a large distributed file system type of product with content management built in.

  • Richard

    With 'single point of failure' data modules (a single controller with many disks), Rich certainly needs to demonstrate a failure of a single data module … and a simultaneous failure of the replacement module during the rebuild. It would be interesting to know how long it takes to rebuild a fully loaded system in the above scenario and what the performance is while this is being done.

  • Rich Kucharski

    When you look at system performance and capacity, the XIV system is in a class by itself. Current generation platforms have a direct connection between performance and spindle speed, which also impacts CapEx and OpEx. With current generation storage arrays, the faster a drive is, the smaller its capacity, e.g. 15K RPM at 146 GB. The XIV platform is not constrained by this limitation. As a result, we are able to leverage larger and “slower” drives to get similar performance. This simple fact enables several things, including CONSISTENT performance, a more power-efficient environment and the ability to eliminate storage tiers, which lets customers manage their storage environment more easily.

    The platform employs a unique architecture that maximizes system performance through maximized spindle utilization and superior load distribution – the inherent technological limitations of SATA drives become less pronounced when factored into overall system performance and architecture. In various customer environments we have proved that real-life applications exhibit better performance with XIV than with DMX or other platforms, including OLTP applications and proprietary applications.

    This architecture also enables extremely fast data rebuilds after drive failures or module failures. Since the architecture is based on data availability and data performance, we can quickly and efficiently rebuild data when it becomes suspect or there is a hardware failure; during this rebuild time system performance and availability are maintained. Our architecture is not a statement that “our drives are better than your drives”; it is a statement that “our platform provides better availability / performance than others.” As mentioned above, the system is based on a massively parallel architecture, which enables drive rebuilds to be measured in minutes and entire module rebuilds in an hour or two.

    We would invite you to take it for a POC and see for yourself.

  • http://www.processmaster.com Alan Crean

    $300m for XIV is credible as you really can see it as a radical innovation; it's the $200m for Diligent that seems a high price to pay, as there are a plethora of deduplication vendors out there now – I am sure it all has something to do with the fact that the same guy started and owned both companies.

  • http://www.thehotaisle.com thehotaisle

    Alan,

    You may well be right, Moshe started the company with Doron Kempel in EMC's old Israeli R&D labs. Interestingly EMC got a 24% stake for a notional $5M investment and Moshe put in $10M of his own money.

    According to Andy Monshaw, Diligent apparently have the fastest de-duplication technology running at 5 times the speed of the nearest competition.

    A smart move if it can deliver de-duplication at wire speeds.

    Steve
