I recently chaired a very interesting panel debate and discussion hosted by BLADE Network Technologies in London.
The panel was made up of:
- Harkeeret (Harqs) Singh, Global Head Data Centre Energy Optimisation, Thomson Reuters
- Finlay MacLeod, VP IT Infrastructure, EMEA, First Data Corp
- Charles Ferland, VP of EMEA, BLADE Network Technologies
- Brian Peterson, VP International Operations, Emulex
- Tikiri Wanduragala, EMEA xSeries Senior Consultant, IBM
- Steve O’Donnell, Managing Director, Enterprise Strategy Group (ESG)
Steve O’Donnell: As a result of mankind’s insatiable appetite for ever cheaper and cleaner 24/7 computing power, the next decade will place extraordinary demands on Britain’s data centres that will dwarf those of the past few years. How ready are today’s data centre users, network and solution providers to meet the challenge?
Harkeeret Singh: The role of Head of Data Centre Optimisation is a new one that came about because energy is the biggest cost component of the data centre. With IT estates continually growing, energy prices on the increase, and initiatives such as The Green Grid, new European legislation and a US carbon tax in the offing, energy consumption has become an important consideration for companies. A recent study predicted that data centre emissions are growing faster than those of the aviation industry, and we have to do something about it.
In most organisations there are silos of IT guys; nobody is really joining it all up and making the whole process efficient. In many organisations the objectives within IT aren’t aligned with the business. It’s important that we get energy joined up – all the way from the guys who pay the bills, through the facilities management teams who provide the premises and infrastructure, to the virtualised environment chaps who are doing amazing things with the technology. It’s only when you look at it holistically that you get the real picture of what is happening in an organisation.
Tikiri Wanduragala: IBM’s message is also about joining things up – we’re looking for the joined-up planet. In our resource-constrained world we see a lot of IT and instrument capability going out – we carry it around in our briefcases, not to mention all of the stuff crammed back in the data centre. We need to make better decisions about how we use resources: we have to start using less and start switching our devices off. We have to manage our energy in a controlled fashion, perhaps even matching the workflow of a company to the amount of energy available.
Steve O’Donnell: There are vast amounts of complexity across the enterprise and certainly it’s obvious that parts of the organisation are not bringing IT together. What is happening in the network space, Charles?
Charles Ferland: The trend has to be to look at both the SAN and IP networks and migrate them into one platform. If you have two switching fabrics and you can migrate them into one, then you not only save half of your infrastructure capital costs but also reduce the total power consumption and cooling requirements. That’s the scenario we are looking at with Converged Enhanced Ethernet (CEE).
Brian Peterson: Convergence has been on the table for some time – it’s not about how but when! We have been working on bringing down the power consumption of our products and minimising the amount of hardware needed in order to bring down cost and energy usage. Virtualisation of the environment through software solutions that create a virtual space has been a key element of this strategy.
Finlay MacLeod: What we are finding is that new equipment from vendors is still power hungry, and this leads to increased cooling requirements. In a data centre we are constantly replacing the cabling infrastructure, upgrading to faster copper and fibre. However, some organisations don’t take the old cables out when they have finished with them, further reducing airflow and increasing the need for cooling, which accounts for a large percentage of the data centre’s energy usage.
Change has to be managed diligently: when we upgrade the infrastructure, the old cabling should usually be removed. Of course there are always exceptions – some cabling, depending on where it sits in the data centre, may be better left in situ. But the more cables you put in the cable tray, the less air can flow through it.
Harkeeret Singh: Thomson Reuters has embarked on a holistic improvement programme for energy. And we’ve started with some very simple, but effective things such as looking at the actual energy bills. In the past people used to guess at what the energy bills would be. Now we gather the energy bills and look at them. It’s amazing how varied these bills are from across the world. Just by validating the energy bills there is an opportunity to make savings. In just 12 weeks we have shown savings of US $1M.
At the moment the focus is on the operational side. At our worst performing sites we are setting a 10% energy saving challenge; at the better sites it’s only a few percent. Although we aren’t setting enormous targets, just a few percent will make a lot of difference to our overall consumption and bills. It’s a joint objective shared by all the operational teams.
We have more than 20 data centres that are a significant size and for them consolidation and virtualisation will be the focus. Virtualisation isn’t appropriate for all areas of the business. Some of our data feeds provide time critical market data and these may not be suitable for virtualisation, but we will put power management on all the boxes so that we can start to understand the data and its usage. Only by monitoring and measuring can you start to understand the issues and work out how to gain improvements.
Tikiri Wanduragala: Power consumption is unique to each machine and the environment it is running in, so it’s important to understand what your servers are really doing. Once you have this real time data, then you can start to make more accurate projections and the organisation can take the business decision to either optimise the environment or perhaps to run that application somewhere else. If you can’t measure it then you can’t manage it.
Harkeeret Singh: It’s important to have a holistic measuring strategy and you need a number of metrics across all of the silos of IT. If you just increase server utilisation then there can be a challenge around the lifecycle of things. Servers tend to last about three years with some being moved on after only a year, and yet we are trying to build data centres that last 15 years or more with technology that may not see the year out. In order to build the perfect data centre we need to understand exactly where the technology is going. For example, if we are going to have liquid cooled servers in the future, should we start putting our cabling in overhead so that we can put liquid cooling in underneath?
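One concrete example of the kind of metric Singh describes – my illustration, not one the panel names here, though The Green Grid initiative mentioned earlier popularised it – is PUE (Power Usage Effectiveness), which compares total facility power against the power that actually reaches IT equipment. A minimal sketch:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power.

    A PUE of 1.0 would mean every watt drawn from the grid powers IT
    equipment; real facilities run higher because cooling and power
    distribution consume energy too. Figures below are hypothetical.
    """
    if it_equipment_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_equipment_kw

# Example: 1,800 kW at the meter, 1,000 kW reaching the IT equipment
print(round(pue(1800, 1000), 2))  # → 1.8
```

The ratio only tells you something if, as Singh says, you actually meter both sides – which is exactly why validating the energy bills was the first step.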
Steve O’Donnell: With M&E (mechanical and electrical), cabling, plant, racks and storage all with different life cycles, how do you keep it all together and what does network convergence really mean for the data centre?
Charles Ferland: 10Gb Ethernet has been around for some time and is becoming the de facto standard in the data centre. The price point is interesting: at less than $500 per port we now have a technology that can carry both data and storage traffic affordably.
Harkeeret Singh: At Thomson Reuters we are replacing servers to gain more performance; however, the TCO equation for a server is dominated by the electricity it consumes rather than the cost of the hardware. In energy terms it takes us 18 months to recoup the capital cost of a server. We are pushing vendors to provide us with clean data on CPU power utilisation. Providing a power specification is not yet mandatory, and vendors tend to give us only the best figure they can get from a fully optimised server – and that doesn’t equate to real life.
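Singh’s 18-month figure is easy to sanity-check with back-of-envelope arithmetic. The sketch below is mine, with entirely hypothetical numbers (the article gives no actual prices or tariffs); it folds facility overhead in via a PUE-style multiplier:

```python
def energy_payback_months(server_cost_gbp: float,
                          avg_draw_kw: float,
                          tariff_gbp_per_kwh: float,
                          pue: float = 2.0) -> float:
    """Months until cumulative electricity spend equals the server's
    capital cost. Cooling and distribution overhead are approximated
    by the PUE multiplier. All inputs are illustrative assumptions,
    not Thomson Reuters figures.
    """
    hours_per_month = 24 * 365 / 12            # ~730 h of 24/7 running
    monthly_kwh = avg_draw_kw * pue * hours_per_month
    monthly_cost_gbp = monthly_kwh * tariff_gbp_per_kwh
    return server_cost_gbp / monthly_cost_gbp

# Hypothetical: £2,000 server drawing 0.75 kW, 10p/kWh, PUE of 2.0
print(round(energy_payback_months(2000, 0.75, 0.10), 1))  # → 18.3
```

With plausible inputs the payback lands right around a year and a half, which is why the electricity term, not the purchase price, dominates the TCO.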
Brian Peterson: It’s true to say that we have the computing power; it’s there and can be gobbled up by these virtualised machines. However, they also need to be energy efficient, and that is more complex. Another issue is that the finance guys are still writing this equipment down over four years, but we may be changing it every year. And then there are the disposal requirements. Servers without hazardous substances are much easier to get rid of.
Tikiri Wanduragala: At IBM we changed our manufacturing process some 18 months ago, refreshing the whole line to meet the RoHS (Restriction of the Use of Certain Hazardous Substances in Electrical and Electronic Equipment Regulations 2008) regulations. For us it was an absolute requirement. Fortunately the technology to do this already exists but not in the commodity end of the market. We already manufacture highly efficient power supplies, and water cooling already exists so what we have done is taken ideas from our mid range systems and are applying them to the manufacture of our distributed systems.
Steve O’Donnell: 98% of organisations have already adopted some form of virtualisation. Recognising that having loads of servers running at 5% utilisation is a waste, virtualisation has become a key driver for energy efficiency.
Charles Ferland: The next step is to virtualise storage and then the network. Having multiple cables and switching devices running at low utilisation is going to use more energy than a consolidated environment. Networking is one area that needs to be virtualised. If you are going to be running thousands of virtual machines that can move around the enterprise then you need to instil the same level of agility into the network. If the servers are going to be running at 90% capacity then the infrastructure needs to scale to give the same level of bandwidth and performance. Aggregations of single Gigabit Ethernet cables are much more difficult to manage than a single pair of 10Gb network ports. Admittedly network virtualisation is a lot more complicated than server virtualisation, but if 98% of enterprises are claiming to have virtualised servers, it’s only a question of time before virtualisation in all its forms becomes commoditised.
Steve O’Donnell: With more enterprises using Hyper-V, Citrix and Xen Server, what does that mean for the network?
Brian Peterson: Certainly VMware was a good lead ahead of everyone – now Xen, Citrix and Microsoft have solutions that endorse virtualisation and give end users choice, flexibility and a competitive marketplace. Emulex provides the I/O and partners with all of the providers; BLADE Network Technologies works with them all too. It’s important for customers to keep that in mind: today I’m using this product, but tomorrow I may want to use something else, so it’s important to look for a network vendor that is agnostic and uses open standards to provide the data centre customer with future-proof connectivity.
Tikiri Wanduragala: If you look at virtualisation in the mid range, it was owned by the manufacturers. The revolution we have today in this age of virtualisation is being able to play with all players, and as a manufacturer we have to support the whole range. For servers the leader at the moment is VMware, but Citrix and Oracle are pushing Xen. The world is going in two directions, with players like Oracle and Cisco pushing the ‘all pre-built and it works’ route – but will that format really serve everybody’s needs?
Finlay MacLeod: We still have a lot of complexity in our existing software and moving from the old legacy stuff to this new virtualised world is difficult. Virtualisation is around but is it really in the production environment? How much mission critical stuff is virtualised? We are virtualising databases, email, test and development and low i/o stuff. There are still a lot of issues surrounding it.
In the old days moving a server physically had a process associated with it – you called the IT guys, involved the facility management services, physically moved the server and made sure that all the change management tasks were completed before it came back online. Moving a server virtually doesn’t really have the same set of processes associated with it: how can you be sure that it is taking the same set of network characteristics with it when it moves? The machine it’s moving on to may not have the same security parameters. This is a complex business. All of a sudden you have drag-and-drop virtual machines, and that is a scary concept for IT managers.
Steve O’Donnell: We have been talking about Ethernet becoming the main transport – how do we handle the storage traffic and bring the two together? Is this happening as the 10Gb pipe becomes available and affordable?
Brian Peterson: The performance and latency we can see on Ethernet, and the re-usability of the equipment, are important. A 10Gb Ethernet switching fabric is easier to use.
Steve O’Donnell: How long is the network card refresh cycle in the data centre? Are customers really reusing cards?
Charles Ferland: Refresh on the storage side is longer than on the server side. It has also been a long cycle from 1Gb to 10Gb; however, because of virtualisation we are seeing a legitimate requirement for more bandwidth. For a server running tens of applications, latency is sometimes more important than bandwidth.
Finlay MacLeod: Data centres are full of equipment and applications and the reality is that they take a long time to change. At First Data Corp, the adoption of 10Gb Ethernet is incremental, starting with the high performance, mission critical i/o stuff.
Harkeeret Singh: At Thomson Reuters we are looking at it, but not implementing it.
Charles Ferland: At BLADE we have worked to reduce the price point so that customers can afford to adopt it. There is huge adoption at the edge, with the fastest growth in connecting servers to servers and servers to storage, because that is where most of the traffic is. The edge is where it is happening, and it may in time migrate to the core. Core switches are a $3m investment because you need two of them; replacing the servers at the edge with a rack of blades is a much cheaper option. So at the moment we are finding a load of traffic going round the edge of the network rather than passing through the core.
If you are using the interconnect between the servers as your backbone, then it makes sense to put the interconnect at the top of the rack. The technology is going to move away from loads of cables going back to a patch panel and towards top-of-rack switching, so expect to see data-centre-scale rack units appearing in the future. We’re progressing from server to blade to rack-level computing.
Steve O’Donnell: And that brings us to the thorny issue of cooling. In the high performance market we are already seeing liquid cooled, coldplate technology, is that coming to the standard data centre?
Tikiri Wanduragala: IBM is certainly getting hot on liquid cooling. Everybody went down the air-cooled route because people are essentially frightened of the idea of liquid near circuits. There is also the issue of the retrofit. As chip densities have increased we have introduced a water-cooled back door that fits on an enterprise rack, so that even older equipment can benefit. Given that water can move heat 4,000 times more quickly than air, it makes sense to bring the cool liquid closer to the hot components.
The next developments will be heat sinks with water channels, and in ten years’ time we will be putting the liquid channels on the chips themselves – perhaps even fully immersing the circuit boards in liquid to remove the heat that way. It’s not new technology: the early Cray machines were immersed in oil, and that’s how they got their nickname ‘Bubbles’.
Steve O’Donnell: Can you summarise the key drivers to energy efficiency in the data centre?
Harkeeret Singh: I believe that they fall into six categories:
1. Revenue protection – greener products are easier to sell
2. Brand improvement – moving up the league table
3. Cost reduction
4. Mitigation against future power costs which are predicted to double in the next 10 years
5. Legislation in both Europe and the US – there will be targets and a carbon tax and someone is going to have to pay
6. Providing a greener planet for our kids
Finlay MacLeod: There is another one – London is running out of power capacity. When your data centre runs out of power, you have to choose equipment that uses less energy. Every watt you take out also saves on the cooling bill.