Most people who read this won’t have a clue what a Hollerith punched card is. I only just caught the end of that era at university, where I learned to program in FORTRAN, coding one punched card at a time. Once the stack of cards was complete, I delivered it to the computer operator for scheduling and execution.
Jobs were scheduled one at a time because that is how the primitive Burroughs scheduler and operating system were designed. Running more than one program at a time was still a pipe dream in those days, so hardware engineers focused on making programs run faster by scaling up the hardware: faster processors, faster IO, more main memory, and more powerful instruction sets that did more in fewer clock cycles.
This propensity to scale up, to make computers more and more powerful and IO faster and faster, has been at the center of the whole industry for decades: an arms race for more clock cycles. Gordon Moore, a co-founder of Intel, gave his name to Moore’s Law, his observation that the number of transistors on a chip doubles roughly every two years, and it has become shorthand for the rapid and continuous improvements in processor performance we have seen over the last 40 years.
The same is true for networking. In the early days of the LAN, Token Ring networks ran at 4 Mbit/s and Ethernet at 10 Mbit/s. Now 10 Gbit/s is the norm for new installations, a three orders of magnitude improvement in 20 years.
Storage systems have also shown massive performance improvements, with systems like Oracle’s Exadata offering performance levels of 1M IOPS. Database technology has seen massive performance improvements too, driven in part by smart data design and great database technology. The performance levels we see today in these scale up systems were unimaginable only a few decades ago.
I remember in 1975 Donald Michie, Professor of the Machine Intelligence and Perception unit at Edinburgh University, proving mathematically that we would never see a computer beat a grandmaster at chess within our lifetimes. The problem was too big to solve with the technology of the day, and the rate of performance growth required to beat a grandmaster, he told us, was just unbelievable.
The fact is that the unbelievable levels of performance we see today are still not enough for the largest Internet scale tasks, such as hosting Twitter, Facebook or LinkedIn, or managing the search indexes at Yahoo or Google. Scale up just doesn’t scale up enough. None of these Internet scale enterprises use scale up technology any more. They scale out at every level: compute, storage, network, application architecture and even the database.
Scale out applications are becoming more common with developers adopting a MapReduce style approach to coding, where a master process splits the problem into a number of smaller parts and then farms them out to a large number of processes that derive the answer. The master process then combines the answers to deliver a single consolidated output. For the largest scale computational problems this is often the only way to get to the answer in a meaningful timescale.
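To make the split, farm out and combine pattern concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular MapReduce framework is implemented: the word counting task and the function names (`split_into_parts`, `count_words`, `combine`, `word_count`) are my own assumptions, and a local thread pool stands in for what would really be a large number of processes spread across many machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(lines, parts):
    """Master, step 1: split the input into roughly equal chunks."""
    size = max(1, -(-len(lines) // parts))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def count_words(chunk):
    """Worker (the 'map' side): count words in one chunk, independently."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def combine(partials):
    """Master, step 2 (the 'reduce' side): merge the partial answers."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(lines, workers=4):
    """Split the problem, farm the parts out, then consolidate the output."""
    chunks = split_into_parts(lines, workers)
    # Assumption: a thread pool stands in for remote worker processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return combine(partials)


if __name__ == "__main__":
    text = ["scale up", "scale out", "scale out wins"]
    print(word_count(text, workers=2))
```

The point to notice is that `count_words` needs nothing but its own chunk, which is what lets the work spread over as many workers as you care to add.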
Scale out compute is now commonplace, with any number of hypervisor technologies (VMware, Xen, KVM, Hyper-V) supported by a cloud operating system to handle virtualisation and load balancing.
Scale out storage is also a growth industry, with products like HP’s X9000 (IBRIX) and IBM’s XIV gaining traction in the market. Object storage is gaining popularity too, with HTTP access to objects addressed by URI becoming commonplace on any number of offerings such as Amazon’s S3. Open source file systems such as the one in Apache Hadoop (HDFS) add the further feature of understanding the location of the data, so that compute and storage elements can be closely co-located to reduce network latency and end to end bandwidth demands.
Scale out networking follows the logic that most network traffic in a scale out world is edge to edge, so why bother with a core network? The approach is to standardise on 10G lossless Ethernet, using top of rack switches that support iSCSI, NAS and HTTP protocols, and so converge the SAN and LAN into a common routable IP system.
Scale out databases are now commonly referred to as NoSQL databases. Many of them go back in time to pre-relational designs that do not provide full ACID guarantees (atomicity, consistency, isolation, durability), but they allow sharding, which splits the data sets over multiple systems to improve the parallelism of the overall system.
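As a rough illustration of sharding, the sketch below routes each key to one of several shards by hashing it. It is a toy under stated assumptions, not the design of any real NoSQL product: the `ShardedStore` class is hypothetical, each shard is just an in-memory dictionary standing in for a separate server, and simple modulo hashing is used where production systems often prefer schemes such as consistent hashing so that adding a shard moves less data.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys over a fixed number of shards."""

    def __init__(self, shard_count):
        # Assumption: each dict stands in for a separate database server.
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, key):
        """Pick a shard from a stable hash of the key."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key, default=None):
        return self.shards[self.shard_for(key)].get(key, default)


if __name__ == "__main__":
    store = ShardedStore(shard_count=4)
    for user in ["alice", "bob", "carol", "dave"]:
        store.put(user, {"name": user})
    print(store.get("carol"))
    print([len(shard) for shard in store.shards])
```

Every read or write touches exactly one shard, so different keys can be served in parallel by different systems. The price is that an operation spanning several keys may span several shards, which is where the ACID guarantees become hard to keep.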
The legacy of the punched card is still with us because Information Technology is an evolutionary process. Scale up approaches continue to support the evolution, but one day the dinosaurs will die out.







