
Last night a few friends (IT Operations Geeks like me) and I had a great “Brains Trust” event at the OXO Tower in London. The topic was a continuation of the last Hot Aisle blog entry, Why do we try to solve backup when restore is the problem? Apart from the food, view and wine being quite exceptional, we had a great conversation.

The first key insight is that everyone, without exception, thought that data protection was way too complex and too risky, and that restores were always touch and go (heart in mouth). Everyone hated what they currently did. Everyone was clear that the tools and processes used to manage data protection and copy data were flawed at best and not fit for purpose in the main. This was a big insight for most of the brains trust, as they all operated data protection the way they did because they had always done it that way and were just too busy to stop and think about how dumb it was.

We discussed Big Data, as this was universally seen as a huge exacerbation of the data protection issue, and a few of us felt that Big Data was going to bust data protection, making it unsustainable and useless. We all saw the importance of Big Data being driven by an increased business demand to measure business performance and customer behaviour. Everyone felt Big Data was inevitable in their organisation.

The topic got really interesting quite quickly when it became evident that Restore wasn’t the only problem that needed to be addressed. As IT Operations guys, we create Copy Data of Production Data for a number of key reasons:

  1. Data Protection (software error, user mistake, sabotage, hardware failure, loss of data centre)
  2. Development Support (Snapshots to help in development, UAT and non-functional tests)
  3. Regulatory Compliance (Maintaining data to abide by national and international rules and laws, contractual commitments)
  4. Performance Enhancement (Creating point-in-time copies to run Business Intelligence reporting against because the production system can’t manage the extra IO load)

We then started considering how often data got deleted as a matter of operational best practice, and all but one of us agreed that the pain and risk of deleting data was so great that we didn’t do it any more. The single exception was our operations guy from a very large global legal firm, who said his firm took proactive data deletion very seriously and that controlling data retention and immutability was a critical issue for him.

Everyone was emphatic that production data and copy data MUST be kept separate, but few could put hand on heart and claim 100% compliance. Everyone found managing copy data complex and inefficient – when can we delete copy data? The fear of messing up and causing a problem is so great that we keep everything.

We then started thinking about the relative ratio of production data to copy data (for all of the reasons above) and came to the conclusion that in real environments we could easily have somewhere between five and fifteen copies of each piece of production data, depending mainly on how smart we were about copying the copy data (e.g. backing up a development snapshot or the BI data).  The unanimous conclusion was that production data volumes are growing exponentially (maybe 40% CAGR) and copy data exacerbates the issue because we keep each of the four types of copy data in different silos.
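To make that arithmetic concrete, here is a rough back-of-the-envelope sketch. The 40% CAGR and the five-to-fifteen copy range are the figures from our discussion; the 100 TB starting volume is purely illustrative:

```python
# Back-of-the-envelope sketch: how copy data multiplies a production footprint.
# Assumptions (illustrative): 100 TB of production data today, ~40% CAGR growth,
# and between 5 and 15 copies kept of each piece of production data.

production_tb = 100      # starting production data volume (illustrative)
cagr = 0.40              # ~40% compound annual growth rate
years = 5

for year in range(years + 1):
    prod = production_tb * (1 + cagr) ** year
    low = prod * (1 + 5)      # production plus 5 copies
    high = prod * (1 + 15)    # production plus 15 copies
    print(f"Year {year}: production {prod:,.0f} TB, "
          f"total footprint {low:,.0f}-{high:,.0f} TB")
```

Even with these rough numbers, 100 TB of production data today becomes several petabytes of total footprint within five years once the copies are counted, which is why the silos matter so much.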

We then started thinking about why we did things this way and why we keep these four different siloed approaches going. The unanimous conclusion was that we did it this way because we have always done it this way, and no one has stopped for long enough to think about a better approach.