
Guide to server architectures

The following tip is excerpted from Chapter 1, Choosing your server, of our expert e-book, "Windows servers and storage." This excerpt breaks down symmetric multiprocessing (SMP) systems and analyzes clusters and grid computing.

This chapter touches on aspects of server hardware, beginning with architectures (system components and interconnects) and ways to build servers from these components.

SMP (Symmetric Multi-Processing)

In an SMP system there are multiple processors, each with the same access to system resources (the 'Systems Interconnect' diagram shows an SMP). Since modern systems require caches to deliver useful performance, this means that an SMP must be cache-coherent.

A cache-coherent system ensures that whenever a processor accesses data, it sees the same value as any other processor would, regardless of whether the data is cached in one processor or held only in memory. The mechanisms used to ensure this uniformity rely on a notification system that informs "interested caches" when a data value held in several caches has been changed. Traditionally, SMPs use a shared bus, which makes it simple to maintain cache coherence since every processor sees every bus transaction.
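The notification idea described above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions (write-through memory, write-invalidate snooping, one value per address); the `Bus` and `Cache` classes are hypothetical names and do not model any particular processor's protocol.

```python
class Bus:
    """Toy shared bus: every attached cache observes every write."""

    def __init__(self):
        self.memory = {}   # main memory: address -> value
        self.caches = []   # all caches snooping this bus

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_write(self, writer, addr, value):
        # Assumption: write-through, so memory is always up to date.
        self.memory[addr] = value
        # Notify every other cache so it can drop a stale copy.
        for cache in self.caches:
            if cache is not writer:
                cache.invalidate(addr)


class Cache:
    """Toy per-processor cache that snoops the shared bus."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}    # cached copies: address -> value
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:                 # miss: fetch from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast_write(self, addr, value)

    def invalidate(self, addr):
        self.lines.pop(addr, None)


bus = Bus()
cpu_a, cpu_b = Cache(bus), Cache(bus)

cpu_a.write(0x10, 1)
first = cpu_b.read(0x10)    # B now caches the value written by A
cpu_a.write(0x10, 2)        # the bus write invalidates B's copy
second = cpu_b.read(0x10)   # B misses and re-fetches the new value
print(first, second)
```

Because every write is visible on the one bus, each cache can act on it locally; that is the property that becomes hard to preserve once the bus is replaced by a larger interconnect.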

However, a shared bus does not scale well. Large-scale systems must use some other form of interconnect, and generally look like a collection of independent SMP systems (each with its own processors, cache and portion of main memory) connected through a cache-coherent interconnect. In such a system, a processor in one SMP system takes longer to access data in memory attached directly to another SMP system than to access data in its own local memory. Such a system is referred to as a cache-coherent, non-uniform memory access (CCNUMA) system, and the ratio of the access time to remote memory to the access time to local memory is called the NUMA factor. If software is not aware of the NUMA nature of the system and the factor goes above four, performance can degrade badly. CCNUMA systems therefore require software variants, particularly in the memory management system. That software decides where to place memory allocations and how to schedule processes so as to minimize the overhead of remote accesses.

To illustrate, when a process running on processor A is suspended and resumes on a distant processor, the cost of moving data across the system and of accessing data in more distant memory can easily negate the performance gains from having more processors.
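The effect of the NUMA factor can be put into rough numbers. The sketch below assumes a simple two-level model (every access is either local or remote, with no caching) and a hypothetical local latency of 100 ns; neither figure comes from the chapter, and `average_access_time` is an illustrative helper, not a standard formula from any tool.

```python
def average_access_time(local_ns, numa_factor, local_fraction):
    """Mean memory access time in a two-level (local/remote) model.

    local_ns       -- latency of a local access (assumed value)
    numa_factor    -- remote latency divided by local latency
    local_fraction -- share of accesses that hit local memory (0.0 to 1.0)
    """
    remote_ns = local_ns * numa_factor
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Hypothetical machine: 100 ns local latency, NUMA factor of 4.
for fraction in (1.0, 0.9, 0.5):
    mean_ns = average_access_time(100.0, 4.0, fraction)
    print(f"{fraction:.0%} local -> about {mean_ns:.0f} ns per access")
```

In this model, a process whose data stays local sees 100 ns per access, while one that has been moved so that only half of its accesses are local sees about 250 ns, which is why NUMA-aware placement and scheduling matter.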

The key advantage of using SMP is that it reflects a view of how computers work, promulgated by most mainstream operating systems and assumed by mainstream sequential programming languages. To oversimplify, it promotes a programming model in which cooperating processes share memory. This allows low-cost interprocess communication; large chunks of data can be moved by passing pointers rather than copying the data itself.
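As a loose analogy for passing pointers rather than copying, the sketch below uses two Python threads, which share one address space. It is an assumption-laden stand-in for cooperating processes on an SMP, and the names `buffer`, `mailbox` and `consumer` are purely illustrative.

```python
import queue
import threading

buffer = bytearray(8 * 1024 * 1024)   # 8 MiB standing in for a large data set
mailbox = queue.Queue()               # channel used to hand over the reference
result = {}


def consumer():
    data = mailbox.get()              # receives a reference, not a copy
    data[0] = 0xFF                    # the change is visible to the producer
    result["same_object"] = data is buffer


worker = threading.Thread(target=consumer)
worker.start()
mailbox.put(buffer)                   # hand over the "pointer" only
worker.join()

print(result["same_object"], buffer[0])
```

Only a reference crosses the queue, so the cost of the handover does not depend on the size of the buffer.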

But SMP's strength is also its weakness. The systems are fragile. An error in the cache of one processor, if not detected and fixed immediately, can pollute data for the whole system, causing an irrecoverable error.

A small SMP offers a simple means of getting greater server performance than a uniprocessor, and two- to four-processor systems are remarkably cost effective. Furthermore, generic mainstream software written with some awareness of multithreading, as is common these days, should run faster on a small SMP.

Very large SMP systems comprising 32 to 128 processors are probably best suited to scientific computing in which very large data structures are manipulated by parallelizable code. Commercial applications -- including transaction processing, Web serving, Google databases and decision support with few updates -- have workloads that emphasize multiple independent transactions or interactions and have less need for efficient cache coherence. Transaction processing in particular is also not well served by a fragile system.

All server vendors offer SMPs: Dell, IBM, Sun, Hewlett-Packard and, at the lower end, some of the PC players as well as Apple.


Clusters

A cluster is a collection of computer systems interconnected in such a way that a program running on one machine can access the resources of another machine, but only indirectly. For instance, a processor is unable to perform a load or store to remote memory. Instead, it must ask software running on a remote processor to perform the access and forward the data.

This sounds remarkably inefficient, but such an isolationist approach provides a major benefit: robustness. In a cluster, if a memory system or cache failure occurs on one node, it is quite difficult for the polluted data to instantly diffuse through the rest of the system.

The apparent inefficiencies matter more in some applications than others. Even for scientific computing, when large shared data areas are manipulated, it is possible for appropriate workloads to allocate the resources in such a way that the interprocessor communications costs are a small part of the overall computational burden.

However, a small cluster rarely makes sense, simply because the programming model of choice (for most software) is one that best fits an SMP view of the world. Writing software for a cluster means writing it differently than you would for an SMP: shared data is essentially unavailable, so explicit message passing is required. As a result, there is not a great deal of mainstream software for cluster architectures; what exists is mainly the applications that scale well on clusters, including databases and transaction-processing systems.
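The explicit message-passing style can be sketched with a small simulation. Here two threads play the role of cluster nodes and in-process queues stand in for the network; the `Node` class and its methods are hypothetical, and a real cluster would send copies of the data over an interconnect rather than share a Python process.

```python
import queue
import threading


class Node:
    """Simulated cluster node: private memory plus a message inbox."""

    def __init__(self, name):
        self.name = name
        self.memory = {}             # reachable only by this node's own code
        self.inbox = queue.Queue()   # stand-in for a network endpoint

    def serve(self):
        # Software on the remote node performs the access on request.
        while True:
            message = self.inbox.get()
            if message is None:      # shutdown signal
                break
            key, reply_to = message
            reply_to.put(self.memory.get(key))

    def remote_read(self, other, key):
        # No direct load from other.memory: send a request, wait for a reply.
        reply = queue.Queue()
        other.inbox.put((key, reply))
        return reply.get()


node_a, node_b = Node("A"), Node("B")
node_b.memory["balance"] = 42

server = threading.Thread(target=node_b.serve)
server.start()
value = node_a.remote_read(node_b, "balance")
node_b.inbox.put(None)               # stop the simulated server
server.join()

print(value)
```

Every remote access costs a request and a reply, which is the apparent inefficiency; in exchange, node A never touches node B's memory directly, which is the source of the robustness.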

In the real world, there is no battle between SMP and cluster. Effective clusters are constructed by clustering small-to-medium scale SMPs. Choose a cluster solution when your application requires robust scaling of data-processing capabilities, memory size or data storage. However, a cluster can only be deployed when a version of the application is already available for cluster architectures.

As with SMPs, all the server vendors offer clusters: IBM, Sun, Hewlett-Packard, Dell, the other PC players and Apple.

Grid computing

Grid computing is a fashionable term referring to the ability, through software, to leverage a large number of independent and possibly heterogeneous computer systems, which happen to be connected through a network. Its name resonates with the concept of the electricity grid, a vanilla means of delivering enough electricity anywhere.

Grid computing is simply the result of being able to distribute software tasks to a large population of network-connected processors. Real-world grid computing leverages software that virtualizes available resources, with the goal of presenting a view of having a large enough, rich enough computing resource available.

To the extent that virtualization works for the tasks to be executed on it, a grid offers some advantages over a cluster. A cluster has similar properties (and its nodes may be virtualized in the same manner), but its nodes generally must be close together: while it is certainly possible to run I/O connections across WAN-scale distances (perhaps using iSCSI), there are no comparably scalable, low-latency, WAN-friendly processor interconnects. A grid has no such restriction. Distributed computing systems relax the need for physical proximity, which historically went hand in hand with strong homogeneity, so grid computing allows organizations to deploy heterogeneous distributed systems.

Grid computing can be an excellent choice for workloads that naturally break into convenient parcels of computation on well-defined, relatively small amounts of data. In such cases, data and programs can be distributed to a remote computer, the task run and the results collected. With proper program sizing, execution time, data sizing and bandwidth, you can obtain fairly high efficiencies.
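The parcel-based pattern described above can be sketched as follows. A local thread pool stands in for the remote machines of a grid, and the task (`analyze`, a sum of squares) and all numbers are hypothetical placeholders. The `parcel_efficiency` helper is a simple assumed model, compute time divided by compute time plus transfer time, rather than a measure taken from the chapter.

```python
from concurrent.futures import ThreadPoolExecutor


def make_parcels(samples, parcel_size):
    """Split the workload into independent, fixed-size parcels."""
    return [samples[i:i + parcel_size]
            for i in range(0, len(samples), parcel_size)]


def analyze(parcel):
    """Placeholder computation run on one parcel (sum of squares)."""
    return sum(x * x for x in parcel)


def parcel_efficiency(compute_s, data_bytes, bandwidth_bytes_per_s):
    """Fraction of wall time spent computing, under an assumed model
    in which a parcel is fully transferred before it is processed."""
    transfer_s = data_bytes / bandwidth_bytes_per_s
    return compute_s / (compute_s + transfer_s)


samples = list(range(1000))
parcels = make_parcels(samples, 100)

# The pool's workers stand in for remote grid nodes.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(analyze, parcels))   # distribute and run
total = sum(partials)                             # collect the results

print(len(parcels), total)
# Hypothetical parcel: 60 s of compute, 10 MB of data, 1 MB/s link.
print(round(parcel_efficiency(60.0, 10_000_000, 1_000_000), 3))
```

In this model, efficiency rises as the compute time per parcel grows relative to the time needed to ship its data, which is why the sizing of programs and data against available bandwidth matters.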

Good grid computing candidates include anonymous computation processes, such as the search for extra-terrestrial intelligence through digital signal processing analysis of radio signals, as well as simulation and what-if analyses for financial and aerospace domains. Virtualization also allows the potentially vast amounts of storage in a grid system to be geographically distributed and replicated transparently for increased robustness.

Grid computing relies on specialized software rather than on hardware designed specifically for it, so any machine may be used. The major server vendors all offer grid support.

About the authors:
René J Chevance is an independent consultant. He formerly worked as Chief Scientist of Bull, a European-based global IT supplier.

Pete Wilson is Chief Scientist of Kiva Design, a small consultancy and research company specializing in issues surrounding the move to multi-core computing platforms, with special emphasis on the embedded space. Prior to that, he spent seven years at Motorola/Freescale.
