Fresh Thinking on IT Operations for 100,000 Industry Executives

I published a brief article last month about a set of tests we were running on a production data warehouse that was performing very poorly. Running the database on a Dell server with 4 x 73GB SAS drives configured in RAID 0, we got a respectable 575 I/Os per second (IOPS) on a random write test. Running the same tests on the same database with a Fusion-io 160GB ioDrive, we got a staggering 22,327 IOPS. That is 38.8 times faster and significantly better than I expected.

Random reads were 21 times faster, sequential writes 25.3 times faster and sequential reads 2.4 times faster.
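For readers who want to try this kind of comparison themselves, the sketch below shows the general shape of a random-write IOPS test. It is not the tool we used; the file path, block size, and run length are illustrative assumptions, and a purpose-built benchmark (O_DIRECT, aligned buffers, many outstanding I/Os, a longer warm-up) will give more trustworthy numbers.

```python
# Minimal random-write IOPS micro-benchmark sketch.
# Illustrative only: path, sizes and duration are assumptions, not the original test setup.
import os, random, time

PATH = "/data/iops_test.bin"   # hypothetical test file on the device under test
FILE_SIZE = 4 * 1024**3        # 4 GiB working set
BLOCK = 4096                   # 4 KiB blocks, a common random-I/O block size
DURATION = 30                  # seconds to run

# Pre-size the test file (a real test would also pre-fill it so no writes hit sparse blocks).
with open(PATH, "wb") as f:
    f.truncate(FILE_SIZE)

buf = os.urandom(BLOCK)
# O_DSYNC forces each write to stable storage so we measure the device, not the page cache;
# serious tools use O_DIRECT with aligned buffers and multiple threads instead.
fd = os.open(PATH, os.O_WRONLY | os.O_DSYNC)

ops, start = 0, time.time()
while time.time() - start < DURATION:
    offset = random.randrange(FILE_SIZE // BLOCK) * BLOCK   # random block-aligned offset
    os.pwrite(fd, buf, offset)
    ops += 1
os.close(fd)

elapsed = time.time() - start
print(f"random write IOPS: {ops / elapsed:.0f}")
```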

The full test results are published on my Mac iDrive here. The file is called Test Results 15 Feb 09.
There are a couple of other interesting documents there that are worth a read, including the McKinsey data center strategy document.

There Are 3 Responses So Far. »

  1. I think that the area of I/O performance is becoming a fascinating one, with several competing approaches and rates of technological change opening new possibilities and creating new bottlenecks, reducing some costs and increasing others. The key, and it creates a major issue in the data centre, is to understand the current and projected business needs, so that the performance/cost/flexibility requirements can be traced from the line-of-business application portfolio down to how they are supported by the assets and operational costs.

    In principle, the best bang for the buck in I/O performance comes from reducing the amount of I/O: change the application-level software, change the DBMS products, change configurations, and/or throw in-server RAM at the DBMS cache. Then you can throw SSD at the problem. The counter-problems come from re-use of storage assets and network contention (either in the IP network or the storage network), and the corresponding increase in the operational processes and technical skills required.

    At the simple economic level of price per GB for the various types of storage, SSD is dropping faster than server RAM, and the technology gives you a much higher total capacity, which is crucial if you go beyond the RAM limits of your servers. I didn't get a view of the size of the DW data set, but for a standard RDBMS you can get very large storage needs (see tpc.org's TPC-H results, where ratios of total storage to data-set size of 70-128x are typical in the 3TB data warehouse range). That pumps up the bill if you have to make it all SSD: say $20 per GB gives a storage bill of 70 x $60k = $4.2M for the storage, comparable to the $5.4M five-year cost of ownership of an IBM solution at this scale (a quick worked version of this arithmetic follows below). Using a specialised DW DBMS could drop this by an order of magnitude, but that still looks like it would need more work to get the solution into the business case.

    Getting anywhere near optimal takes a lot of effort: first to get the business to agree what parameters it is happy to sign up to for all business applications, and then to get the whole IT/IS development and operations group aligned around those parameters.
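
    As a rough sanity check, the storage-cost arithmetic above can be worked through explicitly. All of the figures are the assumptions stated in the comment ($20 per GB of SSD, a 3TB warehouse, a 70x storage-to-dataset ratio), not measured prices:

    ```python
    # Back-of-envelope SSD storage cost for the data warehouse sizing discussed above.
    # All inputs are the commenter's assumptions, not quoted prices.
    dataset_gb = 3_000        # 3TB warehouse data set
    storage_ratio = 70        # total provisioned storage ~70x the data set (low end of 70-128x)
    price_per_gb = 20         # assumed USD per GB of SSD

    total_storage_gb = dataset_gb * storage_ratio      # 210,000 GB provisioned
    storage_cost = total_storage_gb * price_per_gb     # 210,000 GB * $20/GB

    print(f"total storage: {total_storage_gb:,} GB")   # 210,000 GB
    print(f"SSD storage cost: ${storage_cost:,}")      # $4,200,000 (~$4.2M)
    ```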

  2. Steve,
    I like the “Green IT through Discipline” panel set. Have you tried to push the efficiency incentives back up the IT supply chain into application development? As things stand, new software will soak up any capacity that's available, and there's a tendency at each stage of an application's life to over-estimate and over-provision, with no closed-loop process that I've seen to measure actual resource use against business benefit over the lifecycle.
    Tim


  3. [...] Read the entire blog entry here >> [...] Tags: data warehouse, Fusion-io, iops, it leadership [...]

Post a Response