Computer cluster
[Image: a Linux computer cluster at Purdue University]
A computer cluster is a group of loosely coupled computers that work together closely so that in many respects they can be viewed as a single computer. Clusters are commonly (but not always) connected through fast local area networks. Clusters are usually deployed to improve speed and/or reliability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or reliability.
Cluster types
There are many ways to categorize clusters, but the following is a common categorization:
- High-availability clusters
- Load-balancing clusters
- High-performance clusters
High-availability (HA) clusters are implemented primarily to improve the availability of the services the cluster provides. They operate by having redundant nodes, which are used to provide service when system components fail. The most common size for an HA cluster is two nodes, since that is the minimum required to provide redundancy. HA cluster implementations attempt to manage the redundancy inherent in a cluster to eliminate single points of failure. There are many commercial implementations of high-availability clusters for many operating systems; the Linux-HA project is one commonly used free software HA package for the Linux OS.
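As an illustration of the failover behavior described above, the following C sketch shows a standby node watching heartbeats from a primary node and taking over after consecutive misses. It is a conceptual, self-contained simulation: the heartbeat data, miss limit, and messages are hypothetical and are not taken from Linux-HA or any commercial HA product.

    /* Conceptual failover loop run by a standby node in a two-node HA cluster.
       The heartbeat results are simulated; a real cluster manager would read
       them from a dedicated network or serial link. */
    #include <stdio.h>
    #include <stdbool.h>

    #define MISS_LIMIT 2   /* consecutive missed heartbeats before takeover */

    int main(void)
    {
        /* Simulated heartbeat results from the primary: 1 = alive, 0 = missed. */
        const int heartbeat[] = { 1, 1, 1, 0, 0, 0 };
        const int slots = sizeof heartbeat / sizeof heartbeat[0];
        int misses = 0;
        bool running_here = false;

        for (int t = 0; t < slots; t++) {
            if (heartbeat[t]) {
                misses = 0;                 /* primary is healthy */
            } else if (++misses >= MISS_LIMIT && !running_here) {
                running_here = true;        /* take over the service locally */
                printf("t=%d: primary presumed failed, standby taking over\n", t);
            }
        }
        return 0;
    }

In a real HA package the takeover step would acquire the cluster's shared service address and storage and restart the protected service, and fencing would typically be used to keep a failed primary from writing to shared data.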
Load-balancing clusters operate by having all workload come through one or more load-balancing front ends, which then distribute it to a collection of back-end servers. Although they are implemented primarily for improved performance, they commonly include high-availability features as well. Such a cluster of computers is sometimes referred to as a server farm. There are many commercial load balancers available; the Linux Virtual Server project provides one commonly used free software package for the Linux OS.
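The distribution step performed by a load-balancing front end can be as simple as round-robin selection. The following C sketch shows only that idea; the back-end addresses are placeholders, and a real director such as the Linux Virtual Server forwards network connections rather than printing strings.

    /* Round-robin selection of a back-end server, as a load-balancing
       front end might perform for each incoming request. */
    #include <stdio.h>

    static const char *backends[] = { "10.0.0.1", "10.0.0.2", "10.0.0.3" };
    #define NBACKENDS (sizeof backends / sizeof backends[0])

    /* Return the back end that should receive the next request. */
    static const char *next_backend(void)
    {
        static unsigned next = 0;
        return backends[next++ % NBACKENDS];
    }

    int main(void)
    {
        for (int request = 0; request < 6; request++)
            printf("request %d -> %s\n", request, next_backend());
        return 0;
    }

Real load balancers typically offer further scheduling policies (for example, least-connections or weighted round-robin) and stop sending requests to back ends that fail health checks.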
High-performance computing (HPC) clusters are implemented primarily to provide increased performance by splitting a computational task across many different nodes in the cluster, and are most commonly used in scientific computing. One of the more popular HPC implementations is a cluster with nodes running Linux as the OS and free software to implement the parallelism; this configuration is often referred to as a Beowulf cluster. Such clusters commonly run custom programs designed to exploit the parallelism available on HPC clusters. Many such programs use libraries such as MPI, which are specially designed for writing scientific applications for HPC computers.
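As a concrete example of splitting a task across nodes, the short MPI program below sums the integers 1..N, with each process handling a strided share of the range and the partial sums combined on rank 0. It is a minimal sketch rather than a realistic scientific application, and it assumes an MPI implementation such as MPICH is installed.

    /* Minimal MPI example: each process sums part of a range and the
       partial results are combined on rank 0 with MPI_Reduce. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        const long long N = 1000000;
        long long local = 0, total = 0;
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's index */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of processes */

        /* Each process sums a strided share of 1..N. */
        for (long long i = rank + 1; i <= N; i += size)
            local += i;

        /* Combine the partial sums on the root process. */
        MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum of 1..%lld = %lld using %d processes\n", N, total, size);

        MPI_Finalize();
        return 0;
    }

Such a program is typically compiled with mpicc and launched across the cluster's nodes with mpirun, which starts one copy of the program per processor.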
High-performance cluster implementations
The TOP500 project (http://top500.org) publishes a list of the 500 fastest computer systems twice a year. It is a collaboration between the University of Mannheim, the University of Tennessee, and the National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory. The current top supercomputer is the Department of Energy's BlueGene/L system, with a performance of 136.8 TFlops. Second place is held by another BlueGene/L system, with a performance of 91.29 TFlops.
Clustering can provide significant performance benefits relative to price. The System X supercomputer at Virginia Tech, the seventh most powerful supercomputer on Earth as of November 2004, is a 12.25 TFlops computer cluster of 1100 Apple Xserve G5 2.3 GHz dual-processor machines (4 GB RAM, 80 GB SATA hard disk) running Mac OS X. The cluster initially consisted of Power Mac G5s; the Xserves are smaller, reducing the size of the cluster. The total cost of the previous Power Mac system was $5.2 million, a tenth of the cost of slower mainframe supercomputers. The Power Mac G5s were sold off.
The central concept of a Beowulf cluster is the use of commercial off-the-shelf (COTS) computers to produce a cost-effective alternative to a traditional supercomputer. One project that took this to an extreme was the Stone Soupercomputer.
John Koza has the largest computer cluster owned by an individual.
Cluster history
The first commodity clustering product was ARCnet, developed by Datapoint in 1977. ARCnet was not a commercial success, and clustering did not really take off until DEC released its VAXcluster product in the 1980s for the VAX/VMS operating system. The ARCnet and VAXcluster products not only supported parallel computing, but also shared file systems and peripheral devices. The intent was to provide the advantages of parallel processing while maintaining data reliability and uniqueness.
The history of cluster computing is intimately tied up with the evolution of networking technology. As networking technology has become cheaper and faster, cluster computers have become significantly more attractive.
Cluster technologies
MPI is a widely available communications library that enables parallel programs to be written in C and Fortran; it is used, for example, in the climate modeling program MM5.
The GNU/Linux world sports various cluster software, such as:
- Beowulf, distcc, MPICH and others - mostly specialized application clustering; distcc provides parallel compilation when using GCC.
- Linux Virtual Server, Linux-HA - director-based clusters that allow incoming requests for services to be distributed across multiple cluster nodes.
- Mosix, openMosix, Kerrighed, OpenSSI - full-blown clusters integrated into the kernel that provide for automatic process migration among homogeneous nodes. OpenSSI and Kerrighed are single-system image implementations.
DragonFly BSD, a recent fork of FreeBSD 4.8, is being redesigned at its core to enable native clustering capabilities. It also aims to achieve single-system image capabilities.
Microsoft Cluster Server (MSCS) is Microsoft's high-availability cluster service for Windows, based on technology developed by Digital Equipment Corporation. The current version supports up to eight nodes in a single cluster, typically connected to a SAN. A set of APIs supports cluster-aware applications, while generic templates provide support for applications that are not cluster-aware.
Grid computing is a technology closely related to cluster computing. The key difference between grids and traditional clusters is that grids connect collections of computers which do not fully trust each other, and hence operate more like a computing utility than like a single computer. In addition, grids typically support more heterogeneous collections of machines than are commonly supported in clusters.
Free software used
The following free software can be used to build a computer cluster:
- OSCAR (http://oscar.openclustergroup.org/)
References
- Greg Pfister: In Search of Clusters, Prentice Hall, ISBN 0138997098
- Evan Marcus, Hal Stern: Blueprints for High Availability: Designing Resilient Distributed Systems, John Wiley & Sons, ISBN 0471356018
- Karl Kopper: The Linux Enterprise Cluster: Build a Highly Available Cluster with Commodity Hardware and Free Software, No Starch Press, ISBN 1593270364
External links
- Linux clustering information center (http://lcic.org/)
- LinuxHPC (http://www.linuxhpc.org/)
- Beowulf (http://www.beowulf.org/)
- Top 500 Supercomputer List (http://www.top500.org/)
- Cplant, a non-Beowulf Linux cluster (http://www.cs.sandia.gov/cplant)
- IEEE task force on cluster computing, the leading academic community on cluster computing (http://www.ieeetfcc.org/)
- The cajo project (https://cajo.dev.java.net) Free clustered computing using Java. (LGPL)
- Understanding How Cluster Quorums Work (http://www.windowsnetworking.com/articles_tutorials/Cluster-Quorums.html)