I met with David Wright from Verari yesterday, and we talked about high-density computing and power, particularly Verari’s data-center-in-a-container product. That got us talking about power distribution problems and the fact that some of the big firms have been buying up sites in old US steel towns, where power is plentiful and there are suitably large sites for the huge sheds needed to house these Web 2.0 data centers.
I thought that it might be useful to look at data center locations and work out what the options are for optimal siting.
Ideally a data center ought to be located in a cold place with plenty of electrical power, close to a consumer (market garden, manufacturing process, swimming pool complex) that can use the warm water or air that is a byproduct of operations. The site should be fairly level, subject neither to ground heave or slippage nor to flooding. It should not be close to incineration plants or other industrial processes that expel contaminants or dust into the air. Trees that give off sap are also best kept a reasonable distance away.
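One way to turn criteria like these into something comparable across candidate sites is a simple weighted checklist. The criteria names, weights, and example site below are all hypothetical, picked just to sketch the idea:

```python
# A minimal weighted-checklist sketch for comparing candidate sites.
# The criteria and weights are illustrative assumptions, not data.
CRITERIA_WEIGHTS = {
    "cold_climate": 3,
    "cheap_power": 5,
    "waste_heat_consumer": 2,
    "stable_level_ground": 4,
    "clean_air": 2,
}

def site_score(site: dict) -> int:
    """Sum the weights of every criterion the site satisfies."""
    return sum(w for crit, w in CRITERIA_WEIGHTS.items() if site.get(crit))

# A hypothetical old-steel-town site: cheap power, flat brownfield land,
# and nearby industry that could take the waste heat.
old_steel_town = {
    "cheap_power": True,
    "stable_level_ground": True,
    "waste_heat_consumer": True,
}
print(site_score(old_steel_town))  # 11
```

In practice the weights would come from the operator’s cost model, but even a crude score like this makes it obvious why the steel towns keep coming up.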
Even though avoiding flooding is an important issue, being close to a source of cold water, like a large lake or a fast-flowing river, can make cooling much less costly if there is no industrial process that can use our waste heat.
Being close to multiple sources of high capacity network connections is also pretty essential.
So much for the site’s physical characteristics; now we need to look at the purpose of the data center, because that also imposes a number of constraints. What is the data center to be used for? There are a few basic types:
- Data storage bunker
- Streaming video source
- Application hosting
- Web 2.0 services
The main characteristics we need to consider are bandwidth demands and latency. For data storage, latency should not be an issue, but bandwidth could be important; this is why the Icelandic government has recently been promoting Iceland as a great place to store data online. Data storage bunkers often need to be provisioned in pairs, and the distance between them needs to be tuned: far enough apart that a single incident cannot cause a simultaneous outage, yet close enough to support synchronous file replication.
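That distance trade-off can be roughed out from propagation delay alone. The sketch below assumes light travels through fibre at about two thirds of c (roughly 200,000 km/s) and that the fibre route equals the straight-line distance; real routes are longer, so these are lower bounds:

```python
# Propagation-only view of the synchronous-replication distance limit.
# Assumes fibre at ~200,000 km/s and a straight-line route (optimistic).
FIBRE_SPEED_KM_PER_MS = 200.0  # ~200,000 km/s, expressed per millisecond

def replication_round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation time between a pair of storage sites."""
    return 2 * distance_km / FIBRE_SPEED_KM_PER_MS

def max_sync_distance_km(rtt_budget_ms: float) -> float:
    """Largest separation that fits a synchronous-write RTT budget."""
    return rtt_budget_ms * FIBRE_SPEED_KM_PER_MS / 2

# A 1 ms round-trip budget caps the separation at about 100 km:
print(replication_round_trip_ms(100))  # 1.0 (ms)
print(max_sync_distance_km(1.0))       # 100.0 (km)
```

Add switching, serialization, and storage-commit time on top and the practical synchronous radius shrinks further, which is exactly why the pairing distance has to be "tuned" rather than simply maximized.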
Streaming video tends to need high bandwidth and reasonably low latency, so data centers need to be located close to the consumer. Buffered video (YouTube) is less bandwidth and latency sensitive, so it is less critical that the caching data centers are close to the consumer.
Application hosting is a mixed bag: some applications (Gmail, Hotmail) are not latency or bandwidth constrained, as they use relatively efficient transmission protocols; others, such as financial services market data feeds, are severely constrained by latency, where a few milliseconds’ advantage can let a trader dominate the market.
Low latency means that these applications need to be close to the consumer, which explains why wholesale bankers must have sites close to Wall Street and the City of London.
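A back-of-envelope calculation shows why those milliseconds force colocation. Again assuming fibre at roughly 200,000 km/s and straight-line routes (real routes and switching would only make the gap bigger):

```python
# Propagation delay alone, for a trading site at a given distance from
# the exchange. Fibre speed (~200,000 km/s) and straight-line routing
# are simplifying assumptions.
FIBRE_SPEED_KM_PER_MS = 200.0

def one_way_delay_ms(distance_km: float) -> float:
    """One-way propagation delay to the exchange, in milliseconds."""
    return distance_km / FIBRE_SPEED_KM_PER_MS

# A colocated trader 2 km from the exchange vs a site 400 km away:
print(one_way_delay_ms(2))    # 0.01 (ms)
print(one_way_delay_ms(400))  # 2.0  (ms)
```

The distant site concedes about 2 ms on every market data tick before its servers have done any work at all, which on a fast-moving feed is the whole game.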
Web 2.0 services are interesting as they encompass so many types: web storage, web applications, web services (authentication, security, etc.). They largely need to be relatively well distributed, with very high capacity network links between sites. Nevertheless, scalability demands that these sites are enormous, with extremely high performance connectivity inside the data center. Cisco’s recent Unified Computing System product launch shows that vendors are now beginning to understand the importance of simplifying the networking platform and improving connectivity and throughput.