I met up with Andy Monshaw, General Manager of IBM’s Storage Division, last week for an exclusive Hot Aisle interview and found him in ebullient mood. Andy has been running Storage for IBM since January 2005 and is an industry veteran. He had prepared a huge list of new products and initiatives he wanted to share with me.
He started with a staggering set of industry statistics, the first being that newly created data is growing at an annual compound rate of 100%. So for every Terabyte of new data last year we are seeing two Terabytes this year. Gulp….
“Digital X-Ray, surveillance, high definition media, global Internet adoption and social networking are driving newly created storage demand through the roof.”
The second statistic that Andy shared is that backup and archive data is growing at 300% year on year, making backup a major challenge for vendors and customers alike. I remember years ago hearing that tape was dead and that we would move to disk-based backup solutions. Like lots of these predictions, it isn’t happening. Andy told me that IBM had 8 Terabyte tape units in development and 100 Terabyte tape units in research, that investment in tape technology was accelerating and that he saw no slow-up in tape sales anytime soon.
“Tape is not dead and in fact is the second greenest way to store data. The greenest way is either not to store it in the first place or to delete it completely when its value is gone.”
Expanding on the green point, Andy explained the acquisition of Diligent, the second of two Israeli companies that joined the IBM fold this year. IBM bought Diligent for its state-of-the-art in-line data de-duplication software.
“Our Diligent acquisition enables IBM to deliver real time 900MB/s in line de-duplication that stops, at source, the need for our customers to create multiple unnecessary copies of the same data. Data de-duplication is an emerging technology that organizations are investing in today and Diligent’s innovative technology provides a single solution to support data protection, archive and data-retention applications – all while maintaining the integrity of the data.”
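To make the idea of in-line de-duplication concrete, here is a minimal sketch in Python. It is a generic content-hash approach chosen for illustration only, not a description of how Diligent's product works; the class name `DedupStore`, the fixed chunk size and the backup names are all hypothetical.

```python
import hashlib


class DedupStore:
    """Toy in-line de-duplicating store: each unique chunk is kept once.

    Illustrative only; fixed-size chunking and SHA-256 fingerprints are
    assumptions of this sketch, not details taken from any IBM product.
    """

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes, stored once
        self.files = {}   # name -> ordered list of fingerprints

    def write(self, name, data):
        fingerprints = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            # De-duplicate as the data arrives: only unseen chunks are stored.
            self.chunks.setdefault(fp, chunk)
            fingerprints.append(fp)
        self.files[name] = fingerprints

    def read(self, name):
        # Rebuild the original bytes from the stored chunks.
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = DedupStore(chunk_size=4)
store.write("monday-backup", b"AAAABBBBCCCC")
store.write("tuesday-backup", b"AAAABBBBDDDD")
print(store.stored_bytes())  # 24 bytes written, fewer actually stored
```

The two backups share two of their three chunks, so the store holds four unique chunks rather than six, and both backups still read back intact.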
He told me that IBM now had over 5,000 SAN Volume Controller (SVC) customers with 13,000 instances in production and that, as a result, SVC was still the clear market leader in Storage Virtualization. IBM have added a few additional capabilities to SVC (latest version 4.3) since I last looked, including Thin Provisioning and the capability to incorporate solid state disk.
Although IBM all but invented the modern storage industry (Computer Tape – 1950, Winchester Disk – 1954…), EMC have owned it commercially for well over a decade. Andy smiled broadly as he told me:
“EMC’s Invista has tanked with about fifty customers, exclusively EMC hardware at the back end and functionality limited to data mover between storage tiers. This is in marked contrast to SVC which is vendor independent, rich in functionality and widely accepted as the market leader in storage virtualization.”
The first Israeli company IBM acquired was XIV, the storage company started by Moshe Yanai, the ex-EMC Chief Engineer and inventor of the Symmetrix line of storage controllers. Yanai left EMC after being removed as Chief Engineer overseeing the company’s hardware division and sidelined to become an adviser to EMC CEO Joseph Tucci. EMC had decided to change direction and focus on software (VMware, Documentum etc.); Yanai wanted to continue with the hardware path that had delivered huge growth in shareholder value over EMC’s 20 year history.
Yanai apparently started XIV in retribution for the way he was treated at EMC, developing a storage product that changes the model of the enterprise storage array away from a centralized and tightly coupled architecture (shared cache etc.) to loosely coupled cloud storage better suited to distributed systems.
Steve Duplessie takes up the story in his blog Steve’s IT Rants:
Andrew Monshaw of IBM and Moshe appear to share the same deep rooted (and possibly psychotic) hatred for the evil machine corp. (aka – EMC – The Hot Aisle)
The story is most likely far more interesting than the truth in this case. Anyway, post acquisition there wasn’t much news. That is, until the last week or so. There has been more XIV bashing in the blogosphere in the last few weeks than there should be in 8 years. I was blissfully trying to avoid the whole thing but there is something afoot, so I started paying attention. Here’s the poop:
All of a sudden all the EMC bloggers started tossing XIV grenades, starting with Barry Burke – aka the Storage Anarchist – about how the XIV box sucks (I specifically adore the reference to a lack of performance benchmark data from IBM seeing how EMC will literally shoot anyone who even suggests the thought in the center of their corporate quad at noon). He goes on to talk about how the XIV box sucks more power than a Sym or a Clariion (which is comical on several levels). There are LOTS of media blogs on the subject, basically questioning everything from is it worth the money to how IBM is going to die because they are hiring 1985 EMC sales people to sell the thing.
XIV is a very interesting investment for IBM. It is the antithesis of the Enterprise Storage Controller, being constructed from lots and lots of Intel-based computers, loosely coupled into a cluster with some very smart software. Here is what IBM say about the architecture:
The IBM XIV Storage System is built as a grid-based storage system of independent modules.
The modules are implemented using off-the-shelf, Intel-based servers within a customized, Linux-based architecture, and are interconnected over redundant Gigabit Ethernet switches. The modules act jointly as a large data grid devoid of a common backplane. The grid is managed by sophisticated distributed algorithms and delivers enterprise-class performance, reliability, and functionality.
As Rich Kucharski, IBM’s Managing Director of Solutions Architecture, described, the fault tolerance of the XIV is stunning:
“We ran a set of seriously destructive tests with a client which involved yanking out whole modules of the XIV whilst it was running live with hosts reading and writing data. The modular, loosely coupled architecture and smart recovery software made the exercise completely transparent to the host systems. We managed to take out over 50% of the modules before anything host affecting happened and when we put a module back in the XIV recovered, the host was able to restart writing. Needless to say the client was staggered.”
The XIV system is built modularly, from the following components:
- Data modules. The data modules are Intel-based servers with a large number of disks. The modules store the data and perform all advanced storage functionality, such as redundancy, snapshots, and caching. Some of the data modules also contain interface connectivity; these are responsible for accepting host I/O commands (via FC or iSCSI) and forwarding them to the appropriate data module.
- Gigabit Ethernet switches. The switches are the interconnect between all the data and interface modules.
- UPS units. The UPS units ensure that the system has enough time to de-stage all cached data upon a power outage.
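The component list above can be sketched as a toy grid in Python. This is only an illustration of the general idea of forwarding host I/O to data modules that each hold a redundant copy; the hash-based `placement` function, the two-copy policy and the module names are assumptions of this sketch, not IBM's published distribution algorithm, and the sketch does not model rebuilding redundancy after a failure.

```python
import hashlib


def placement(volume, block, modules, copies=2):
    """Choose `copies` distinct modules for a block (hypothetical scheme)."""
    key = f"{volume}:{block}".encode()
    start = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(modules)
    return [modules[(start + i) % len(modules)] for i in range(copies)]


class Grid:
    """Toy grid of data modules with no shared backplane or shared cache."""

    def __init__(self, module_names):
        self.modules = list(module_names)
        self.failed = set()
        self.data = {m: {} for m in self.modules}  # per-module private storage

    def write(self, volume, block, payload):
        # An interface module would forward the write to every live replica.
        for m in placement(volume, block, self.modules):
            if m not in self.failed:
                self.data[m][(volume, block)] = payload

    def read(self, volume, block):
        # Serve the read from the first surviving module that holds a copy.
        for m in placement(volume, block, self.modules):
            if m not in self.failed and (volume, block) in self.data[m]:
                return self.data[m][(volume, block)]
        raise LookupError("no surviving copy of this block")

    def fail(self, module):
        self.failed.add(module)


grid = Grid([f"module-{n}" for n in range(1, 7)])
for block in range(100):
    grid.write("vol1", block, f"payload-{block}")

grid.fail("module-3")  # pull one module while the data is live
recovered = [grid.read("vol1", block) for block in range(100)]
```

Because every block lives on two different modules, pulling any single module leaves a readable copy of everything. Surviving further failures, as in the test Rich describes, would also need the grid to re-create the lost copies on the remaining modules.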
The XIV system provides a management function that handles all system-wide management functions: allocating new volumes, etc. In keeping with IBM XIV’s streamlined, standardized approach to architecture, the function does not have its own dedicated hardware. Instead, it runs on one of the standard modules and, in the event of failure of that module, automatically restarts on another module.
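A rough sketch of that restart behaviour, in Python, might look like the following. The `Module` and `ManagementFunction` classes and the "first surviving module" selection rule are hypothetical, and the sketch simply assumes the management state (here, a list of volume names) survives the failure; how the real system preserves and relocates that state is not described in the source.

```python
class Module:
    """A standard grid module that may or may not be alive."""

    def __init__(self, name):
        self.name = name
        self.alive = True


class ManagementFunction:
    """Management service with no dedicated hardware of its own."""

    def __init__(self, modules):
        self.modules = modules
        self.host = None
        self.volumes = []  # assumed to survive a host failure (sketch only)
        self.ensure_running()

    def ensure_running(self):
        # If the current host module has failed, restart on a survivor.
        if self.host is None or not self.host.alive:
            survivors = [m for m in self.modules if m.alive]
            if not survivors:
                raise RuntimeError("no module left to host management")
            self.host = survivors[0]
        return self.host

    def allocate_volume(self, name):
        host = self.ensure_running()
        self.volumes.append(name)
        return host.name  # which module served the request


modules = [Module("module-1"), Module("module-2"), Module("module-3")]
mgmt = ManagementFunction(modules)
served_before = mgmt.allocate_volume("vol1")

modules[0].alive = False  # the module hosting management fails
served_after = mgmt.allocate_volume("vol2")
```

The second allocation is served from a different module with no operator action, which is the behaviour the IBM description promises.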