Understanding the basic building blocks and activities that together make up a Web transaction is key to understanding the factors that contribute to Web performance and, ultimately, to capacity planning. These building blocks not only illuminate the many potential culprits for Web-server performance problems, but they also explain why assigning responsibility and tackling optimization can't always solve what ultimately turns out to be "somebody else's problem." As I've said in the first 2 parts of this 3-part tip, this might be summarized as "if it ain't yours, you can't fix it."
To recap, all Web transactions consist of 3 separate but interlocking sets of activities: (1) a client request that initiates activity and eventually has to handle some kind of response, (2) a set of network connections to ferry requests from clients to servers and responses from servers to clients, and (3) a Web server that handles incoming requests and provides either the information requested or some kind of alternative response (such as a redirect, an error message, and so forth). In today's tip, we examine the third and final part of the puzzle, as we explore Web transactions from the server's point of view.
From the server perspective, there is a lot of activity involved in handling incoming requests for various services, not only because so many requests can arrive (and be processed in parallel) but also because the server is a natural "gathering place" for activities that center on the content and services it offers. Then, too, it's possible for a single "logical server" to consist of a physical cluster of servers, or even--when appropriate IP handling and load balancing capabilities are available--for that single logical presence to be instantiated in the form of multiple, mirrored physical presences at various locations around the globe.
To begin, let's explore the anatomy of a request for Web access from a client:
- A client makes some request, which arrives at the server via a network connection of some kind.
- The server parses the request, and determines what type of HTTP operation is requested.
- Based on the HTTP request, the server executes some HTTP method (which may be a GET, HEAD, or other method to match or reply to the incoming request).
- For a GET request (by far the most common type that Web servers handle), the server locates a file somewhere in its file system, either in cache or on a disk or network mount point or share of some kind.
- Once located, the file is accessed and its contents are copied to the network port that matches the incoming HTTP request.
- When the file contents are completely delivered, the server closes the HTTP connection when non-persistent HTTP is used, or maintains an open connection where persistent HTTP is used.
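The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular Web server is implemented: the names `handle_request` and `_response` are hypothetical, only GET and HEAD are handled, query strings and caching are ignored, and the keep-open decision assumes the usual defaults (persistent for HTTP/1.1 unless the client sends `Connection: close`, non-persistent for HTTP/1.0 unless it sends `Connection: keep-alive`).

```python
from pathlib import Path


def _response(status: str, body: bytes, keep_open: bool, send_body: bool = True) -> bytes:
    """Build a bare-bones HTTP response (hypothetical helper for this sketch)."""
    head = (
        f"HTTP/1.1 {status}\r\n"
        f"Content-Length: {len(body)}\r\n"
        f"Connection: {'keep-alive' if keep_open else 'close'}\r\n\r\n"
    )
    return head.encode("ascii") + (body if send_body else b"")


def handle_request(raw: bytes, doc_root: Path) -> tuple:
    """Return (response_bytes, keep_open) for one raw HTTP request."""
    # Parse the request line, e.g. "GET /index.html HTTP/1.1".
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    try:
        method, target, version = lines[0].split(" ")
    except ValueError:
        return _response("400 Bad Request", b"", False), False

    # Collect the request headers into a dict with lowercase names.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Decide whether the connection stays open after the response.
    conn = headers.get("connection", "").lower()
    keep_open = (version == "HTTP/1.1" and conn != "close") or (
        version == "HTTP/1.0" and conn == "keep-alive"
    )

    # Determine which HTTP method was requested.
    if method not in ("GET", "HEAD"):
        return _response("501 Not Implemented", b"", keep_open), keep_open

    # Locate the file, refusing any path that escapes the document root.
    root = doc_root.resolve()
    path = (root / target.lstrip("/")).resolve()
    if root not in path.parents or not path.is_file():
        return _response("404 Not Found", b"", keep_open), keep_open

    # Read the file and copy its contents into the response (headers only for HEAD).
    body = path.read_bytes()
    return _response("200 OK", body, keep_open, send_body=(method == "GET")), keep_open
```

A real server would write the returned bytes to the client's socket and then either close the connection or wait for the next request on it, depending on the second value returned.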
It's also important to recognize that numerous such requests will be processed in parallel. Generally, more traffic means more request processing is occurring on any given server.
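As a rough illustration of parallel request processing, the sketch below hands a batch of simulated requests to a pool of worker threads. The `handle` function is a hypothetical stand-in that merely sleeps to imitate file-system or network waits, and the worker count of 8 is an arbitrary assumption; real servers use a variety of concurrency models (threads, processes, event loops).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request_id: int) -> str:
    """Hypothetical stand-in for handling a single request."""
    time.sleep(0.05)  # imitates time spent waiting on disk or network
    return f"response-{request_id}"


def serve_batch(request_ids, workers: int = 8) -> list:
    """Process several requests concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle, request_ids))


if __name__ == "__main__":
    start = time.perf_counter()
    responses = serve_batch(range(8))
    elapsed = time.perf_counter() - start
    print(len(responses), "responses in", round(elapsed, 2), "seconds")
```

Because the eight simulated requests overlap their waiting time, the batch should finish in roughly the time one request takes rather than eight times that, which is why busy servers rely on handling requests side by side.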
The importance of cache and memory for server performance should be obvious when revisiting the preceding sequence with a critical eye. That's because a server's ability to handle multiple simultaneous transactions--given especially that so many such transactions involve file system access--will depend not just on how much memory is available to process such transactions, but also on how much memory is available to cache recently-accessed files, objects, and resources. Because memory access is at least 3 orders of magnitude faster than file system access, this explains why more memory so often translates into improved Web server performance. Likewise, because file system access from time to time will be unavoidable, this also explains why Web servers typically incorporate the fastest forms of storage that are affordable, be they RAID arrays or even high-speed, high-dollar network-attached storage devices of some kind.
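To make the role of memory concrete, here is a minimal sketch of an in-memory cache for recently accessed files, with least-recently-used eviction once a byte budget is exceeded. The class name `FileCache` and its `max_bytes` parameter are assumptions made for illustration; production servers and operating systems use far more elaborate caching schemes.

```python
from collections import OrderedDict
from pathlib import Path


class FileCache:
    """Keep recently read files in memory, evicting the least recently used."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes    # memory budget for cached file contents
        self.used = 0                 # bytes currently held in the cache
        self.entries = OrderedDict()  # path -> contents, oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self.entries:
            # Cache hit: served from memory, no file-system access needed.
            self.entries.move_to_end(path)
            self.hits += 1
            return self.entries[path]

        # Cache miss: fall back to the (much slower) file system.
        self.misses += 1
        data = Path(path).read_bytes()
        if len(data) <= self.max_bytes:
            self.entries[path] = data
            self.used += len(data)
            # Evict the oldest entries until the cache fits its budget again.
            while self.used > self.max_bytes:
                _, evicted = self.entries.popitem(last=False)
                self.used -= len(evicted)
        return data
```

With a larger `max_bytes`, more files stay resident and the ratio of hits to misses rises, which is the effect the paragraph above attributes to adding memory.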
Likewise, because access requests naturally aggregate at Web servers, they benefit disproportionately from high-bandwidth Internet connections as well. When the most typical bottlenecks for local server performance--file system access and transaction processing--can be satisfactorily addressed, network bandwidth appears as the next bottleneck in the processing environment. This explains why co-location for small-scale servers is so important: it gets such installations closer to the backbone than they would otherwise be. This also explains why data centers, clustered servers, and load-balancing across multiple sites become increasingly important as Web servers increase in scale and capacity: all of these techniques represent effective (albeit increasingly expensive) ways to increase the aggregate bandwidth that servers can manage.
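A back-of-the-envelope calculation shows how quickly a network link can become the ceiling. The sketch below is a simplification that ignores protocol overhead and assumes decimal units (1 Mbps = 1,000,000 bits per second, 1 KB = 1,000 bytes); the function name and the sample figures are illustrative assumptions, not measurements.

```python
def max_requests_per_second(link_mbps: float, avg_response_kb: float) -> float:
    """Upper bound on responses per second that a link can carry."""
    bytes_per_second = link_mbps * 1_000_000 / 8
    return bytes_per_second / (avg_response_kb * 1_000)


if __name__ == "__main__":
    # Compare a 1.544 Mbps line with a 100 Mbps link for 50 KB responses.
    for mbps in (1.544, 100.0):
        print(mbps, "Mbps:", round(max_requests_per_second(mbps, 50.0), 2), "responses/s")
```

Under these assumptions, the slower line tops out below 4 responses per second, while the 100 Mbps link allows about 250, regardless of how fast the server's disks and memory are.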
When you consider the whole of this puzzle--client side, network, and server side--it's interesting that speed (or at least consumable bandwidth) plays an important role overall. But at both client and server ends, anything that helps to facilitate re-use of data and to lower overall traffic or file-system access (especially through caching) also plays an important role. With these key building blocks in mind, you should be able to address the areas of performance and capacity that you control, and to make sure the right kinds of resources and facilities are available to mitigate the impact of resources outside your scope and control as well.
Ed Tittel presides over LANWrights, Inc., a company that specializes in Web markup languages and tools, with a decidedly Microsoft flavor. He is also a co-author of more than 100 books on a variety of topics, and a regular contributor to numerous TechTarget web sites. Contact Ed via e-mail at email@example.com.