Web server
from Wikipedia

PC clients communicating via the network with a web server serving static content only
The inside and front of a Dell PowerEdge server, a computer designed to be mounted in a rack mount environment. Servers similar to this one are often used as web servers.
Multiple web servers may be used for a high-traffic website.
Server farm with thousands of web servers used for super-high traffic websites
ADSL modem running an embedded web server serving dynamic web pages used for modem configuration

A web server is computer software and underlying hardware that accepts requests via HTTP (the network protocol created to distribute web content) or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initiates communication by making a request for a web page or other resource using HTTP, and the server responds with the content of that resource or an error message. A web server can also accept and store resources sent from the user agent if configured to do so.[1][2][3][4][5]

The hardware used to run a web server can vary according to the volume of requests that it needs to handle. At the low end of the range are embedded systems, such as a router that runs a small web server as its configuration interface. A high-traffic Internet website might handle requests with hundreds of servers that run on racks of high-speed computers.[6]

A resource sent from a web server can be a pre-existing file (static content) available to the web server, or it can be generated at the time of the request (dynamic content) by another program that communicates with the server software. The former usually can be served faster and can be more easily cached for repeated requests, while the latter supports a broader range of applications.

Technologies such as REST and SOAP, which use HTTP as a basis for general computer-to-computer communication, as well as support for WebDAV extensions, have extended the application of web servers well beyond their original purpose of serving human-readable pages.

History

First web proposal (1989) evaluated as "vague but exciting..."
The world's first web server, a NeXT Computer workstation with Ethernet, 1990. The case label reads: "This machine is a server. DO NOT POWER IT DOWN!!"

This is a very brief history of web server programs, so some information necessarily overlaps with the histories of web browsers, the World Wide Web and the Internet; for the sake of clarity and understandability, some key historical information reported below may therefore be similar to that found in one or more of those history articles.[7]

Initial WWW project (1989–1991)


In March 1989, Sir Tim Berners-Lee proposed a new project to his employer CERN, with the goal of easing the exchange of information between scientists by using a hypertext system. The proposal, titled "HyperText and CERN", asked for comments and was read by several people. In October 1990 the proposal was reformulated and enriched (with Robert Cailliau as co-author) and finally approved.[8][9][10]

Between late 1990 and early 1991 the project resulted in Berners-Lee and his developers writing and testing several software libraries along with three programs, which initially ran on NeXTSTEP OS installed on NeXT workstations:[11][12][10]

  • a web browser and editor, called WorldWideWeb;
  • a web server, later known as CERN httpd;
  • a line-mode web browser, usable from simple character terminals.

Those early browsers retrieved web pages written in a simple early form of HTML from web servers using a new basic communication protocol that was named HTTP 0.9.

In August 1991 Tim Berners-Lee announced the birth of WWW technology and encouraged scientists to adopt and develop it.[13] Soon after, those programs, along with their source code, were made available to people interested in their usage.[11] Although the source code was not formally licensed or placed in the public domain, CERN informally allowed users and developers to experiment and further develop on top of them. Berners-Lee started promoting the adoption and the usage of those programs along with their porting to other operating systems.[10]

Fast and wild development (1991–1995)


In December 1991, the first web server outside Europe was installed at SLAC (U.S.A.).[12] This was a very important event because it started trans-continental web communications between web browsers and web servers.

In 1991–1993, the CERN web server program continued to be actively developed by the WWW group; meanwhile, thanks to the availability of its source code and to the public specifications of the HTTP protocol, many other implementations of web servers started to be developed.

In April 1993, CERN issued a public official statement stating that the three components of Web software (the basic line-mode client, the web server and the library of common code), along with their source code, were put in the public domain.[14] This statement freed web server developers from any possible legal issue about the development of derivative work based on that source code (a threat that in practice never existed).

At the beginning of 1994, the most notable among new web servers was NCSA httpd, which ran on a variety of Unix-based OSs and could serve dynamically generated content by implementing the POST HTTP method and the Common Gateway Interface (CGI) to communicate with external programs. These capabilities, along with the multimedia features of NCSA's Mosaic browser (also able to manage HTML forms in order to send data to a web server), highlighted the potential of web technology for publishing and distributed computing applications.

Number of active web sites (1991–1996)[15][16]

In the second half of 1994, the development of NCSA httpd stalled to the point that a group of external software developers, webmasters and other professionals interested in that server started to write and collect patches, which was possible because the NCSA httpd source code was publicly available. At the beginning of 1995 those patches were all applied to the last release of the NCSA source code and, after several tests, the Apache HTTP Server project was started.[17][18]

At the end of 1994, a new commercial web server, named Netsite, was released with its own specific features. It was the first of many similar products that were developed first by Netscape, then by Sun Microsystems, and finally by Oracle Corporation.

In mid-1995, the first version of IIS was released by Microsoft for the Windows NT OS. This marked the entry into the field of World Wide Web technologies of a very important commercial developer and vendor, which has played and still plays a key role on both the client and the server side of the web.

In the second half of 1995, usage of the CERN and NCSA web servers started to decline (in global percentage terms) because of the widespread adoption of new web servers which had a much faster development cycle, more features, more fixes applied and better performance than the previous ones.

Explosive growth and competition (1996–2014)

Number of active web sites (1996–2002)[16][19]
Sun's Cobalt Qube 3 – a computer server appliance (2002, discontinued)

At the end of 1996, there were already over fifty known, different web server software programs available to anyone who wanted to own an Internet domain name or to host websites.[20] Many of them were short-lived and were replaced by other web servers.

The publication of RFCs about protocol versions HTTP/1.0 (1996) and HTTP/1.1 (1997, 1999), forced most web servers to comply (not always completely) with those standards. The use of TCP/IP persistent connections (HTTP/1.1) required web servers both to increase the maximum number of concurrent connections allowed and to improve their level of scalability.

Between 1996 and 1999, Netscape Enterprise Server and Microsoft's IIS emerged among the leading commercial options whereas among the freely available and open-source programs Apache HTTP Server held the lead as the preferred server (because of its reliability and its many features).

In those years there was also another commercial, highly innovative and thus notable web server, called Zeus (now discontinued), that was known as one of the fastest and most scalable web servers available on the market, at least until the first decade of the 2000s, despite its low percentage of usage.

Apache was the most used web server from mid-1996 to the end of 2015 when, after a few years of decline, it was surpassed initially by IIS and then by Nginx. Afterward IIS dropped to much lower percentages of usage than Apache (see also market share).

From 2005–2006, Apache started to improve its speed and its scalability by introducing new performance features (e.g., the event MPM and a new content cache).[21][22] As those new performance features were initially marked as experimental, they were not enabled by its users for a long time, and so Apache suffered even more from the competition of commercial servers and, above all, of other open-source servers which had meanwhile achieved far superior performance (mostly when serving static content) since the beginning of their development and which, at the time of the Apache decline, could also offer a long enough list of well-tested advanced features.

In the early 2000s, not only other commercial and highly competitive web servers (e.g., LiteSpeed) but also many other open-source programs, such as Hiawatha, Cherokee HTTP server, Lighttpd and Nginx, emerged, along with other derived and related products also available with commercial support.

Around 2007–2008, most popular web browsers increased their previous default limit of 2 persistent connections per host-domain (a limit recommended by RFC 2616)[23] to 4, 6 or 8 persistent connections per host-domain, in order to speed up the retrieval of heavy web pages with lots of images and to mitigate the shortage of persistent connections dedicated to dynamic objects used for bi-directional notification of events in web pages.[24] Within a year, these changes, on average, nearly tripled the maximum number of persistent connections that web servers had to manage. This trend (of increasing the number of persistent connections) definitely gave a strong impetus to the adoption of reverse proxies in front of slower web servers, and it also gave one more chance to the emerging new web servers that could show all their speed and their capability to handle very high numbers of concurrent connections without requiring too many hardware resources (expensive computers with lots of CPUs, RAM and fast disks).[25]

New challenges (2015 and later years)


In 2015, new RFCs published the new protocol version HTTP/2, and as the implementation of the new specifications was not trivial at all, a dilemma arose among developers of less popular web servers (e.g., those with a percentage of usage lower than 1–2%) about whether or not to add support for that new protocol version.[26][27]

In fact, supporting HTTP/2 often required radical changes to their internal implementation due to many factors (encrypted connections required in practice, the capability to distinguish between HTTP/1.x and HTTP/2 connections on the same TCP port, the binary representation of HTTP messages, message priority, compression of HTTP headers, the use of streams, also known as TCP/IP sub-connections, and the related flow control, etc.), and so a few developers of those web servers opted not to support the new HTTP/2 version (at least in the near future), mainly for these reasons:[26][27]

  • the HTTP/1.x protocols would be supported anyway by browsers for a very long time (maybe forever), so that there would be no incompatibility between clients and servers in the near future;
  • implementing HTTP/2 was considered a task of overwhelming complexity that could open the door to a whole new class of bugs that did not exist before 2015, and so it would have required notable investments in developing and testing the implementation of the new protocol;
  • HTTP/2 support could always be added in the future, should the effort be justified.

Instead, developers of the most popular web servers rushed to offer the new protocol, not only because they had the workforce and the time to do so, but also because usually their previous implementation of the SPDY protocol could be reused as a starting point and because the most used web browsers implemented HTTP/2 very quickly for the same reason. Another reason that prompted those developers to act quickly was that webmasters felt the pressure of ever-increasing web traffic and really wanted to install and try, as soon as possible, something that could drastically lower the number of TCP/IP connections and speed up access to hosted websites.[28]

In 2020–2021 the dynamics of HTTP/2 implementation (by top web servers and popular web browsers) were partly replicated after the publication of advanced drafts of the future RFC for the HTTP/3 protocol.

Technical overview

PC clients connected to a web server via Internet

The following technical overview gives only a few limited examples of features that may be implemented in a web server and of tasks that it may perform, in order to sketch a sufficiently broad picture of the topic.

A web server program plays the role of a server in a client–server model by implementing one or more versions of HTTP protocol, often including the HTTPS secure variant and other features and extensions that are considered useful for its planned usage.

The complexity and the efficiency of a web server program may vary a lot depending on:[1]

  • common features implemented;
  • common tasks performed;
  • target performance and scalability level;
  • software model and techniques adopted to achieve the desired performance and scalability level;
  • target hardware and category of usage (e.g., embedded system, low-medium traffic web server, high traffic Internet web server).

Common features


Although web server programs differ in how they are implemented, most of them offer the following common basic features.

  • Static content serving: the ability to serve static content (web files) to clients via the HTTP protocol.
  • HTTP support: support for one or more versions of the HTTP protocol in order to send HTTP responses compatible with the versions of the client HTTP requests (e.g., HTTP/1.0, HTTP/1.1, HTTP/2, HTTP/3).
  • Logging: usually web servers also have the capability of logging some information about client requests and server responses to log files, for security and statistical purposes.

A few other more advanced and popular features (only a very short selection) are the following.

Common tasks


A web server program, when it is running, usually performs several general tasks:[1]

  • starts, optionally reads and applies settings found in its configuration files or elsewhere, optionally opens log file, starts listening to client connections and requests;
  • optionally tries to adapt its general behavior according to its settings and its current operating conditions;
  • manages client connections (accepting new ones or closing the existing ones as required);
  • receives client requests (by reading HTTP messages);
  • executes or refuses the requested HTTP method;
  • replies to client requests by sending proper HTTP responses (e.g., requested resources or error messages), possibly verifying or adding HTTP headers to those sent by dynamic programs or modules;
  • optionally logs (partially or totally) client requests or its responses to an external user log file or to a system log file by syslog, usually using common log format;
  • optionally logs process messages about detected anomalies or other notable events (e.g., in client requests or in its internal functioning) using syslog or some other system facilities; these log messages usually have a debug, warning, error, alert level which can be filtered (not logged) depending on some settings, see also severity level;
  • optionally generates statistics about web traffic managed or its performances;
  • other custom tasks.

Read request message


Web server programs are able:[29][30][31]

  • to read an HTTP request message;
  • to interpret it;
  • to verify its syntax;
  • to identify known HTTP headers and to extract their values from them.

Once an HTTP request message has been decoded and verified, its values can be used to determine whether that request can be satisfied or not. This requires many other steps, including security checks.
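
To make these steps concrete, here is a minimal Python sketch (illustrative only, not the code of any real web server; the helper name parse_request and its error handling are invented for this example) that reads the head of an HTTP/1.x request, splits the request line, verifies basic syntax and collects the header fields:

# Minimal sketch of reading and verifying an HTTP/1.x request head.
# Illustrative only: real servers also enforce size limits, timeouts
# and many more syntax rules from the HTTP specifications.
def parse_request(raw: bytes):
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Request line: METHOD SP request-target SP HTTP-version
    try:
        method, target, version = lines[0].split(" ")
    except ValueError:
        raise ValueError("400 Bad Request: malformed request line")
    if not version.startswith("HTTP/"):
        raise ValueError("400 Bad Request: bad protocol version")
    # Header fields: "Name: value" pairs, one per line.
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep or not name.strip():
            raise ValueError("400 Bad Request: malformed header")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers

if __name__ == "__main__":
    raw = b"GET /path/file.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
    print(parse_request(raw))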

URL normalization


Web server programs usually perform some type of URL normalization (of the URL found in most HTTP request messages) in order to:

  • make the resource path a clean, uniform path from the root directory of the website;
  • lower security risks (e.g., by more easily intercepting attempts to access static resources outside the root directory of the website, or attempts to access portions of paths below the website root directory that are forbidden or require authorization);
  • make the paths of web resources more recognizable by human beings and by web log analysis programs (also known as log analyzers or statistical applications).

The term URL normalization refers to the process of modifying and standardizing a URL in a consistent manner. There are several types of normalization that may be performed, including the conversion of the scheme and host to lowercase. Among the most important normalizations are the removal of "." and ".." path segments and the addition of a trailing slash to a non-empty path component.
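
As an illustration of dot-segment removal and scheme/host lowercasing, the following Python sketch performs a simplified normalization (it is only a rough approximation of RFC 3986 behavior, not the routine of any particular server):

from urllib.parse import urlsplit, urlunsplit

def normalize_url(url: str) -> str:
    # Simplified normalization: lowercase scheme and host, remove "." and
    # ".." path segments, collapse empty segments, keep a trailing slash.
    parts = urlsplit(url)
    segments = []
    for seg in parts.path.split("/"):
        if seg in ("", "."):
            continue                      # skip empty and "." segments
        if seg == "..":
            if segments:
                segments.pop()            # go up one level, never above root
        else:
            segments.append(seg)
    path = "/" + "/".join(segments)
    if parts.path.endswith("/") and path != "/":
        path += "/"                       # directories keep their final slash
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       path, parts.query, parts.fragment))

print(normalize_url("HTTP://WWW.Example.COM/a/./b/../c/"))
# prints: http://www.example.com/a/c/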

URL mapping


"URL mapping is the process by which a URL is analyzed to figure out what resource it is referring to, so that that resource can be returned to the requesting client. This process is performed with every request that is made to a web server, with some of the requests being served with a file, such as an HTML document, or a gif image, others with the results of running a CGI program, and others by some other process, such as a built-in module handler, a PHP document, or a Java servlet."[32][needs update]

In practice, web server programs that implement advanced features, beyond the simple static content serving (e.g., URL rewrite engine, dynamic content serving), usually have to figure out how that URL has to be handled as a:

  • URL redirection, a redirection to another URL;
  • static request of file content;
  • dynamic request of:
    • directory listing of files or other sub-directories contained in that directory;
    • other types of dynamic request in order to identify the program or module processor able to handle that kind of URL path and to pass to it other URL parts, (i.e., usually path-info and query string variables).

One or more configuration files of the web server may specify the mapping of parts of the URL path (e.g., initial parts of the file path, the filename extension and other path components) to a specific URL handler (file, directory, external program or internal module).[33]

When a web server implements one or more of the above-mentioned advanced features, the path part of a valid URL may not always match an existing file system path under the website directory tree (a file or a directory in the file system), because it can refer to a virtual name of an internal or external module processor for dynamic requests.
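
As a hypothetical illustration of such configuration-driven mapping (the rule table and handler names below are invented for this sketch and do not come from any real server's configuration syntax), a very small first-match mapping could look like this:

# Hypothetical URL-to-handler mapping rules, loosely inspired by the kind
# of rules found in web server configuration files (names are illustrative).
MAPPING_RULES = [
    ("prefix", "/cgi-bin/",  "cgi_handler"),       # external programs
    ("suffix", ".php",       "php_module"),        # internal module processor
    ("prefix", "/old-docs/", "redirect_handler"),  # URL redirection
]
DEFAULT_HANDLER = "static_file_handler"

def map_url(path: str) -> str:
    for kind, pattern, handler in MAPPING_RULES:
        if kind == "prefix" and path.startswith(pattern):
            return handler                         # first match wins
        if kind == "suffix" and path.endswith(pattern):
            return handler
    return DEFAULT_HANDLER                         # plain static request

print(map_url("/cgi-bin/forum.php"))   # cgi_handler (prefix rule matches first)
print(map_url("/path/file.html"))      # static_file_handler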

URL path translation to file system


Web server programs are able to translate a URL path (all of it, or part of it) that refers to a physical file system path into an absolute path under the target website's root directory.[33]

The website's root directory may be specified by a configuration file or by some internal rule of the web server, using the name of the website, which is the host part of the URL found in the HTTP client request.[33]

Path translation to file system is done for the following types of web resources:

  • a local, usually non-executable, file (static request for file content);
  • a local directory (dynamic request: directory listing generated on the fly);
  • a program name (a dynamic request that is executed using a CGI or SCGI interface, and whose output is read by the web server and resent to the client that made the HTTP request).

The web server takes the path found in the requested URL (HTTP request message) and appends it to the path of the (Host) website's root directory. On an Apache server, this is commonly /home/www/website (on Unix machines, it usually is /var/www/website). See the following examples of how this may turn out.

URL path translation for a static file request

Example of a static request of an existing file specified by the following URL:

http://www.example.com/path/file.html

The client's user agent connects to www.example.com and then sends the following HTTP/1.1 request:

GET /path/file.html HTTP/1.1
Host: www.example.com
Connection: keep-alive

The result is the local file system resource:

/home/www/www.example.com/path/file.html

The web server then reads the file, if it exists, and sends a response to the client's web browser. The response will describe the content of the file and contain the file itself, or an error message will be returned saying that the file does not exist or that access to it is forbidden.
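
The translation and containment checks described above can be sketched as follows; the snippet assumes the /home/www/www.example.com root used in this example, and the normpath-based check is only one simple way to refuse paths that would escape the website root:

import os.path

DOCUMENT_ROOT = "/home/www/www.example.com"    # root of the (Host) website

def translate_path(url_path: str) -> str:
    # Append the URL path to the website root directory and refuse any
    # result that would escape the root (e.g., through ".." segments).
    candidate = os.path.normpath(
        os.path.join(DOCUMENT_ROOT, url_path.lstrip("/")))
    inside_root = (candidate == DOCUMENT_ROOT or
                   candidate.startswith(DOCUMENT_ROOT + os.sep))
    if not inside_root:
        raise PermissionError("403 Forbidden: path escapes the website root")
    return candidate

print(translate_path("/path/file.html"))
# prints: /home/www/www.example.com/path/file.html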

URL path translation for a directory request (without a static index file)

Example of an implicit dynamic request of an existing directory specified by the following URL:

http://www.example.com/directory1/directory2/

The client's user agent connects to www.example.com and then sends the following HTTP/1.1 request:

GET /directory1/directory2 HTTP/1.1
Host: www.example.com
Connection: keep-alive

The result is the local directory path:

/home/www/www.example.com/directory1/directory2/

The web server then verifies the existence of the directory; if it exists and can be accessed, it tries to find an index file (which in this case does not exist), and so it passes the request to an internal module or to a program dedicated to directory listings, reads the resulting output and finally sends a response to the client's web browser. The response will describe the content of the directory (the list of contained subdirectories and files), or an error message will be returned saying that the directory does not exist or that access to it is forbidden.

URL path translation for a dynamic program request

For a dynamic request the URL path specified by the client should refer to an existing external program (usually an executable file with a CGI) used by the web server to generate dynamic content.[34]

Example of a dynamic request using a program file to generate output:

http://www.example.com/cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15

The client's user agent connects to www.example.com and then sends the following HTTP/1.1 request:

GET /cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15 HTTP/1.1
Host: www.example.com
Connection: keep-alive

The result is the local file path of the program (in this example, a PHP program):

/home/www/www.example.com/cgi-bin/forum.php

The web server executes that program, passing in the path-info and the query string action=view&orderby=thread&date=2021-10-15 so that the program has the info it needs to run. (In this case, it will return an HTML document containing a view of forum entries ordered by thread from October 15, 2021). In addition to this, the web server reads data sent from the external program and resends that data to the client that made the request.
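
The way the query string and the path info reach the external program can be sketched with the classic CGI convention of environment variables; the helper below (a hypothetical illustration of the data flow, not the code of a real server) launches the program with Python's subprocess module and returns whatever it writes to standard output:

import os, subprocess

def run_cgi(script_path: str, query_string: str, path_info: str = "") -> bytes:
    # CGI passes request data to the external program mainly through
    # environment variables; its standard output becomes the HTTP response
    # (headers produced by the program, followed by the body).
    env = dict(os.environ,
               GATEWAY_INTERFACE="CGI/1.1",
               REQUEST_METHOD="GET",
               SCRIPT_NAME=script_path,
               QUERY_STRING=query_string,
               PATH_INFO=path_info)
    result = subprocess.run([script_path], env=env,
                            capture_output=True, check=True)
    return result.stdout

# Hypothetical usage, mirroring the forum.php example above (the script
# must exist and be executable for this call to succeed):
# run_cgi("/home/www/www.example.com/cgi-bin/forum.php",
#         "action=view&orderby=thread&date=2021-10-15")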

Manage request message


Once a request has been read, interpreted, and verified, it has to be managed depending on its method, its URL, and its parameters, which may include values of HTTP headers.

In practice, the web server has to handle the request by using one of these response paths (a minimal sketch of this decision flow follows the list):[33]

  • if something in the request was not acceptable (in its status line or message headers), the web server has already sent an error response;
  • if the request has a method (e.g., OPTIONS) that can be satisfied by the general code of the web server, then a successful response is sent;
  • if the URL requires authorization, then an authorization error message is sent;
  • if the URL maps to a redirection, then a redirect message is sent;
  • if the URL maps to a dynamic resource (a virtual path or a directory listing), then its handler (an internal module or an external program) is called and the request parameters (query string and path info) are passed to it in order to allow it to reply to that request;
  • if the URL maps to a static resource (usually a file on the file system), then the internal static handler is called to send that file;
  • if the request method is not known or if there is some other unacceptable condition (e.g., resource not found, internal server error, etc.), then an error response is sent.
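
A toy dispatcher that follows roughly the order of the list above might look like the following sketch (the Request class and the rule tables are invented placeholders standing in for the real request object and server configuration):

from dataclasses import dataclass

@dataclass
class Request:                 # hypothetical, highly simplified request object
    method: str
    path: str

# Toy rule tables standing in for the server configuration.
REDIRECTS = {"/directory1/directory2": "/directory1/directory2/"}
PROTECTED = ("/private/",)
DYNAMIC   = ("/cgi-bin/",)

def handle_request(req: Request) -> str:
    if req.method not in {"OPTIONS", "HEAD", "GET", "POST"}:
        return "501 Not Implemented"                 # unknown method
    if req.method == "OPTIONS":
        return "204 No Content"                      # satisfied by general code
    if any(req.path.startswith(p) for p in PROTECTED):
        return "401 Unauthorized"                    # ask for credentials
    if req.path in REDIRECTS:
        return "301 Moved Permanently -> " + REDIRECTS[req.path]
    if any(req.path.startswith(p) for p in DYNAMIC):
        return "200 OK (dynamic handler output)"     # module/program called here
    return "200 OK (static file sent)"               # internal static handler

print(handle_request(Request("GET", "/directory1/directory2")))
print(handle_request(Request("GET", "/cgi-bin/forum.php")))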

Serve static content

PC clients communicating via network with a web server serving static content only

If a web server program is capable of serving static content and has been configured to do so, then it is able to send file content whenever a request message has a valid URL path matching (after URL mapping, URL translation and URL redirection) that of an existing file under the root directory of a website, and that file has attributes which match those required by the internal rules of the web server program.[33]

That kind of content is called static because usually it is not changed by the web server when it is sent to clients and because it remains the same until it is modified (file modification) by some program.

NOTE: when serving static content only, a web server program usually does not change file contents of served websites (as they are only read and never written) and so it suffices to support only these HTTP methods:

  • OPTIONS
  • HEAD
  • GET

Serving static file content can be sped up by a file cache.

Directory index files

If a web server program receives a client request message with a URL whose path matches that of an existing directory, that directory is accessible, and serving directory index files is enabled, then the web server may try to serve the first of the known (or configured) static index file names (a regular file) found in that directory; if no index file is found or other conditions are not met, then an error message is returned.

The most commonly used names for static index files are: index.html, index.htm and Default.htm.
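
A minimal sketch of this lookup, assuming the three index file names mentioned above (the search order and the names are configurable in real servers), could be:

import os

INDEX_NAMES = ["index.html", "index.htm", "Default.htm"]   # assumed order

def find_index_file(directory: str):
    # Return the first configured index name that exists as a regular file
    # in the requested directory, or None if there is no index file.
    for name in INDEX_NAMES:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None     # caller may generate a directory listing or an error

print(find_index_file("/home/www/www.example.com/directory1/directory2/"))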

Regular files

If a web server program receives a client request message with a URL whose path matches the file name of an existing file, that file is accessible by the web server program and its attributes match the internal rules of the web server program, then the web server program can send that file to the client.

Usually, for security reasons, most web server programs are pre-configured to serve only regular files and to avoid using special file types such as device files, along with symbolic links or hard links to them. The aim is to avoid undesirable side effects when serving static web resources.[35]

Serve dynamic content

PC clients communicating via network with a web server serving static and dynamic content

If a web server program is capable of serving dynamic content and has been configured to do so, then it is able to communicate with the proper internal module or external program (associated with the requested URL path) in order to pass it the parameters of the client request. The web server program then reads the data response that the module or program has generated (often on the fly) and resends it to the client program that made the request.

NOTE: when serving static and dynamic content, a web server program usually also has to support the following HTTP method, in order to be able to safely receive data from clients and thus to host websites with interactive forms that may send large data sets (e.g., lots of data entry or file uploads) to the web server and to its external programs or modules:

  • POST

In order to be able to communicate with its internal modules or external programs, a web server program must have implemented one or more of the many available gateway interfaces (see also Web Server Gateway Interfaces used for dynamic content).

The three standard and historical gateway interfaces are the following ones.

CGI
An external CGI program is run by the web server program for each dynamic request; the web server program then reads the generated data response and resends it to the client.
SCGI
An external SCGI program (usually a long-running process) is started once, by the web server program or by some other program or process, and then waits for network connections; every time there is a new request for it, the web server program makes a new network connection to it in order to send the request parameters and to read its data response, after which the network connection is closed.
FastCGI
An external FastCGI program (usually a long-running process) is started once, by the web server program or by some other program or process, and then waits for a network connection which is established permanently by the web server; through that connection the request parameters are sent and the data responses are read.
Directory listings
Directory listing dynamically generated by a web server

A web server program may be capable of managing the dynamic generation (on the fly) of a directory index listing the files and sub-directories of a directory.[36]

If a web server program is configured to do so, a requested URL path matches an existing directory, access to that directory is allowed and no static index file is found under it, then a web page (usually in HTML format) containing the list of files or subdirectories of the above-mentioned directory is dynamically generated (on the fly). If it cannot be generated, an error is returned.

Some web server programs allow the customization of directory listings by allowing the use of a web page template (an HTML document containing placeholders, e.g., $(FILE_NAME), $(FILE_SIZE), etc., that are replaced by the web server with the field values of each file entry found in the directory, e.g., index.tpl), the use of HTML with embedded source code that is interpreted and executed (e.g., index.asp), or the use of dynamic index programs such as CGIs, SCGIs or FCGIs (e.g., index.cgi, index.php, index.fcgi).

Usage of dynamically generated directory listings is usually avoided or limited to a few selected directories of a website because that generation takes much more OS resources than sending a static index page.

The main usage of directory listings is to allow the download of files as they are (usually when their names, sizes, modification date-times or file attributes may change randomly and frequently), without requiring any further information from the requesting user.[37]
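
A very small sketch of such on-the-fly generation (plain HTML, without the sizes, dates, sorting or templates that real servers usually add) might be:

import html, os

def directory_listing(fs_dir: str, url_path: str) -> str:
    # Build a tiny HTML page listing the files and sub-directories found
    # in fs_dir; directory names get a trailing slash in the displayed text.
    rows = []
    for name in sorted(os.listdir(fs_dir)):
        is_dir = os.path.isdir(os.path.join(fs_dir, name))
        display = name + ("/" if is_dir else "")
        rows.append('<li><a href="%s">%s</a></li>'
                    % (html.escape(url_path + name), html.escape(display)))
    return ("<html><body><h1>Index of %s</h1><ul>%s</ul></body></html>"
            % (html.escape(url_path), "".join(rows)))

# Hypothetical usage (the directory must exist and be readable):
# print(directory_listing("/home/www/www.example.com/directory1/", "/directory1/"))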

Program or module processing

An external program or an internal module (processing unit) can execute some sort of application function that may be used to get data from, or to store data to, one or more data repositories:

  • files (file system);
  • databases (DBs);
  • other sources located in local computer or in other computers.

A processing unit can return any kind of web content, possibly using data retrieved from a data repository.

In practice, whenever there is content that may vary depending on one or more parameters contained in the client request or in the configuration settings, it is usually generated dynamically.

Send response message


Web server programs are able to send response messages as replies to client request messages.[29]

An error response message may be sent because a request message could not be successfully read or decoded or analyzed or executed.[30]

NOTE: the following sections are reported only as examples to help understand what a web server, more or less, does; these sections are by no means exhaustive or complete.

Error message


A web server program may reply to a client request message with many kinds of error messages; anyway, these errors are divided mainly into two categories:

  • errors found in the client request (reported with 4xx HTTP status codes);
  • errors encountered while serving the request, in the server itself or in its external programs or modules (reported with 5xx HTTP status codes).

When an error response or message is received by a client browser, if it is related to the main user request (e.g., the URL of a web resource such as a web page), then that error message is usually shown in a browser window or message box.

URL authorization


A web server program may be able to verify whether the requested URL path:[40]

  • can be freely accessed by everybody;
  • requires a user authentication (request of user credentials such as user name and password);
  • is forbidden to some or all kinds of users.

If the authorization or access-rights feature has been implemented and enabled and access to the web resource is not granted, then, depending on the required access rights, a web server program (see the sketch after this list):

  • can deny access by sending a specific error message (e.g., access forbidden);
  • may deny access by sending a specific error message (e.g., access unauthorized) that usually forces the client browser to ask the human user to provide the required user credentials; if authentication credentials are provided, then the web server program verifies and accepts or rejects them.
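
The difference between the two cases (403 Forbidden versus 401 Unauthorized with a credentials challenge) can be sketched with a toy HTTP Basic authentication check; the access rules and credentials below are invented for this example:

import base64
from typing import Optional

# Hypothetical access rules: path prefix -> required (user, password),
# where None means "forbidden to everybody".
ACCESS_RULES = {"/private/": ("alice", "secret"), "/internal/": None}

def check_authorization(path: str, auth_header: Optional[str]) -> int:
    for prefix, credentials in ACCESS_RULES.items():
        if not path.startswith(prefix):
            continue
        if credentials is None:
            return 403                       # access forbidden to everybody
        if auth_header and auth_header.startswith("Basic "):
            decoded = base64.b64decode(auth_header[6:]).decode()
            if tuple(decoded.split(":", 1)) == credentials:
                return 200                   # credentials verified and accepted
        return 401                           # challenge: ask for credentials
    return 200                               # freely accessible path

print(check_authorization("/private/report.html", None))               # 401
token = base64.b64encode(b"alice:secret").decode()
print(check_authorization("/private/report.html", "Basic " + token))   # 200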

URL redirection


A web server program may have the capability of doing URL redirections to new URLs (new locations), which consists of replying to a client request message with a response message containing a new URL suited to accessing a valid or existing web resource (the client should redo the request with the new URL).[41]

URL redirection of location is used:[41]

  • to fix a directory name by adding a final slash '/';[36]
  • to give a new URL for a no-longer-existing URL path, pointing to a new path where that kind of web resource can be found;
  • to redirect to another domain when the current domain has too much load.

Example 1: a URL path points to a directory name but does not have a final slash '/', so the web server sends a redirect to the client in order to instruct it to redo the request with the fixed path name.[36]

From:
  /directory1/directory2
To:
  /directory1/directory2/

Example 2: a whole set of documents has been moved inside website in order to reorganize their file system paths.

From:
  /directory1/directory2/2021-10-08/
To:
  /directory1/directory2/2021/10/08/

Example 3: a whole set of documents has been moved to a new website and now it is mandatory to use secure HTTPS connections to access them.

From:
  http://www.example.com/directory1/directory2/2021-10-08/
To:
  https://docs.example.com/directory1/2021-10-08/

The above examples are only a few of the possible kinds of redirection.
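
A redirect response itself is small; the following sketch builds a minimal HTTP/1.1 redirect message (a simplified illustration, not the exact output of any particular server):

def redirect_response(new_url: str, permanent: bool = True) -> bytes:
    # Minimal HTTP/1.1 redirect: the Location header carries the new URL,
    # plus a tiny HTML body for very old clients that ignore the header.
    status = "301 Moved Permanently" if permanent else "302 Found"
    body = ('<html><body>Moved to <a href="%s">%s</a></body></html>'
            % (new_url, new_url)).encode()
    head = ("HTTP/1.1 %s\r\n"
            "Location: %s\r\n"
            "Content-Type: text/html\r\n"
            "Content-Length: %d\r\n\r\n" % (status, new_url, len(body)))
    return head.encode() + body

print(redirect_response("/directory1/directory2/").decode())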

Successful message


A web server program is able to reply to a valid client request message with a successful message, optionally containing requested web resource data.[42]

If web resource data is sent back to client, then it can be static content or dynamic content depending on how it has been retrieved (from a file or from the output of some program or module).

Content cache


In order to speed up web server responses by lowering average HTTP response times and the hardware resources used, many popular web servers implement one or more content caches, each one specialized in a content category.[43][44]

Content is usually cached by its origin:

File cache


Historically, static content found in files which had to be accessed frequently, randomly and quickly has been stored mostly on electro-mechanical disks since the mid-late 1960s and 1970s; regrettably, reads from and writes to those kinds of devices have always been considered very slow operations compared to RAM speed, and so, since early OSs, first disk caches and then also OS file cache sub-systems were developed to speed up I/O operations on frequently accessed data.

Even with the aid of an OS file cache, the relative or occasional slowness of I/O operations involving directories and files stored on disks soon became a bottleneck for the performance increases expected from top-level web servers, especially since the mid-late 1990s, when Internet web traffic started to grow exponentially along with the constant increase in the speed of Internet and network lines.

The problem of how to further and efficiently speed up the serving of static files, thus increasing the maximum number of requests/responses per second (RPS), started to be studied and researched in the mid-1990s, with the aim of proposing useful cache models that could be implemented in web server programs.[45]

In practice, nowadays, many web server programs include their own userland file cache, tailored for web server usage and using their own specific implementation and parameters.[46][47][48]

The widespread adoption of RAID and of fast solid-state drives (storage hardware with very high I/O speed) has slightly reduced, but of course not eliminated, the advantage of having a file cache incorporated in a web server.

Dynamic cache


Dynamic content, output by an internal module or an external program, may not always change very frequently (given a unique URL with keys or parameters) and so, maybe for a while (e.g., from one second to several hours or more), the resulting output can be cached in RAM or even on a fast disk.[49]

The typical usage of a dynamic cache is when a website has dynamic web pages about news, weather, images, maps, etc. that do not change frequently (e.g., every n minutes) and that are accessed by a huge number of clients per minute or per hour; in those cases it is useful to return cached content too (without calling the internal module or the external program), because clients often do not have an updated copy of the requested content in their browser caches.[50]

Anyway, in most cases those kinds of caches are implemented by external servers (e.g., a reverse proxy) or by storing dynamic data output on separate computers managed by specific applications (e.g., memcached), in order not to compete with web servers for hardware resources (CPU, RAM, disks).[51][52]
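
The core idea of a dynamic cache (keep the generated output for a limited time, keyed by the full URL) can be sketched in a few lines; the time-to-live value and the URL key below are arbitrary examples:

import time

class DynamicCache:
    """Tiny in-RAM cache with a time-to-live, keyed by the full request URL."""
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.entries = {}                     # url -> (expiry_time, content)

    def get(self, url: str):
        entry = self.entries.get(url)
        if entry and entry[0] > time.monotonic():
            return entry[1]                   # fresh copy: skip the generator
        return None                           # missing or expired

    def put(self, url: str, content: bytes):
        self.entries[url] = (time.monotonic() + self.ttl, content)

cache = DynamicCache(ttl_seconds=60)          # e.g., cache a weather page 1 min
url = "/weather?city=rome"
if cache.get(url) is None:
    content = b"<html>...output generated by a module...</html>"  # expensive call
    cache.put(url, content)
print(cache.get(url))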

Kernel-mode and user-mode web servers


Web server software can either be incorporated into the OS and executed in kernel space, or be executed in user space (like other regular applications).

Web servers that run in kernel mode (usually called kernel-space web servers) can have direct access to kernel resources and so they can be, in theory, faster than those running in user mode; the disadvantages of running a web server in kernel mode (e.g., difficulties in developing and debugging the software) include the fact that critical run-time errors may lead to serious problems in the OS kernel.

Web servers that run in user-mode have to ask the system for permission to use more memory or more CPU resources. Not only do these requests to the kernel take time, but they might not always be satisfied because the system reserves resources for its own usage and has the responsibility to share hardware resources with all the other running applications. Executing in user mode can also mean using more buffer or data copies (between user-space and kernel-space) which can lead to a decrease in the performance of a user-mode web server.

Nowadays almost all web server software is executed in user mode (because many of the aforementioned small disadvantages have been overcome by faster hardware, new OS versions, much faster OS system calls and new optimized web server software). See also the comparison of web server software to discover which of them run in kernel mode or in user mode (also referred to as kernel space or user space).

Performances


To improve the user experience (on the client or browser side), a web server should reply quickly (as soon as possible) to client requests; unless content responses are throttled (by configuration) for some types of files (e.g., big or huge files), the returned data content should also be sent as fast as possible (high transfer speed).

In other words, a web server should always be very responsive, even under high web traffic load, in order to keep the user's total wait for a response (the sum of browser time + network time + web server response time) as low as possible.

Performance metrics


For web server software, the main key performance metrics (measured under varying operating conditions) usually are at least the following ones:[53]

  • number of requests per second (RPS, similar to QPS, depending on HTTP version and configuration, type of HTTP requests and other operating conditions);
  • number of connections per second (CPS), i.e., the number of connections per second accepted by the web server (useful when using HTTP/1.0 or HTTP/1.1 with a very low limit of requests/responses per connection, i.e., 1–20);
  • network latency + response time for each new client request; usually the benchmark tool shows how many requests have been satisfied within a scale of time lapses (e.g., within 1 ms, 3 ms, 5 ms, 10 ms, 20 ms, 30 ms, 40 ms), or the shortest, the average and the longest response time;
  • throughput of responses, in bytes per second.

Among the operating conditions, the number (1 .. n) of concurrent client connections used during a test is an important parameter, because it allows the concurrency level supported by the web server to be correlated with the results of the tested performance metrics.
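
As a toy example of how these metrics relate to raw benchmark data, the following snippet computes RPS, average and worst response time, and throughput from a small, made-up sample of per-request timings:

# Made-up benchmark sample: per-request response times (seconds) collected
# during a 2-second test run, each response about 15 kB.
response_times = [0.004, 0.006, 0.003, 0.012, 0.005, 0.009, 0.004, 0.007]
test_duration_s = 2.0
bytes_sent = len(response_times) * 15_000

rps        = len(response_times) / test_duration_s
avg_ms     = 1000 * sum(response_times) / len(response_times)
worst_ms   = 1000 * max(response_times)
throughput = bytes_sent / test_duration_s      # bytes per second

print(f"RPS: {rps:.1f}, avg: {avg_ms:.1f} ms, "
      f"max: {worst_ms:.1f} ms, throughput: {throughput:.0f} B/s")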

Software efficiency


The specific web server software design and model adopted:

  • single process or multi-process;
  • single thread (no thread) or multi-thread for each process;
  • usage of coroutines or not;

... and other programming techniques, such as:

... used to implement a web server program can greatly affect its performance and, in particular, the scalability level that can be achieved under heavy load or when using high-end hardware (many CPUs, disks and lots of RAM).

In practice, some web server software models may require more OS resources (especially more CPU and more RAM) than others in order to work well and achieve target performance.

Operating conditions


There are many operating conditions that can affect the performances of a web server; performance values may vary depending on:

  • the settings of web server (including the fact that log file is or is not enabled, etc.);
  • the HTTP version used by client requests;
  • the average HTTP request type (method, length of HTTP headers and optional body);
  • whether the requested content is static or dynamic;
  • whether the content is cached or not cached (by server or client);
  • whether the content is compressed on the fly (when transferred), pre-compressed (i.e., when a file resource is stored on disk already compressed so that web server can send that file directly to the network with the only indication that its content is compressed) or not compressed at all;
  • whether the connections are or are not encrypted;
  • the average network speed between web server and its clients;
  • the number of active TCP connections;
  • the number of active processes managed by web server (including external CGI, SCGI, FCGI programs);
  • the hardware and software limitations or settings of the OS of the computers on which the web server runs;
  • other minor conditions.

Benchmarking


Performances of a web server are typically benchmarked by using one or more of the available automated load testing tools.

Load limits


A web server (program installation) usually has pre-defined load limits for each combination of operating conditions, also because it is limited by OS resources and because it can handle only a limited number of concurrent client connections (usually between 2 and several tens of thousands for each active web server process, see also the C10k problem and the C10M problem).

When a web server is near to or over its load limits, it gets overloaded and so it may become unresponsive.

Causes of overload


At any time web servers can be overloaded due to one or more of the following causes:

  • Excess legitimate web traffic. Thousands or even millions of clients connecting to the website in a short amount of time (e.g., the Slashdot effect).
  • Distributed Denial of Service attacks. A denial-of-service attack (DoS attack) or distributed denial-of-service attack (DDoS attack) is an attempt to make a computer or network resource unavailable to its intended users.
  • Computer worms that sometimes cause abnormal traffic because of millions of infected computers (not coordinated among them).
  • XSS worms can cause high traffic because of millions of infected browsers or web servers.
  • Internet bots: traffic not filtered or limited on large websites with very few network resources (e.g., bandwidth) or hardware resources (CPUs, RAM, disks).
  • Internet (network) slowdowns (e.g., due to packet losses) so that client requests are served more slowly and the number of connections increases so much that server limits are reached.
  • Web servers, serving dynamic content, waiting for slow responses coming from back-end computers (e.g., databases), maybe because of too many queries mixed with too many inserts or updates of DB data; in these cases web servers have to wait for back-end data responses before replying to HTTP clients but during these waits too many new client connections or requests arrive and so they become overloaded.
  • Web servers (computers) partial unavailability. This can happen because of required or urgent maintenance or upgrade, hardware or software failures such as back-end (e.g., database) failures; in these cases the remaining web servers may get too much traffic and become overloaded.

Symptoms of overload


The symptoms of an overloaded web server are usually the following ones:

  • Requests are served with (possibly long) delays (from one second to a few hundred seconds).
  • The web server returns an HTTP error code, such as 500, 502,[54][55] 503,[56] 504,[57] 408, or even an intermittent 404.
  • The web server refuses or resets (interrupts) TCP connections before it returns any content.
  • In very rare cases, the web server returns only a part of the requested content. This behavior can be considered a bug, even if it usually arises as a symptom of overload.

Anti-overload techniques


To partially overcome the load limits described above and to prevent overload, most popular websites use common techniques like the following:

  • Tuning OS parameters for hardware capabilities and usage.
  • Tuning web servers parameters to improve their security and performances.
  • Deploying web cache techniques (not only for static contents but, whenever possible, for dynamic contents too).
  • Managing network traffic, by using:
    • Firewalls to block unwanted traffic coming from bad IP sources or having bad patterns;
    • HTTP traffic managers to drop, redirect or rewrite requests having bad HTTP patterns;
    • Bandwidth management and traffic shaping, in order to smooth down peaks in network usage.
  • Using different domain names, IP addresses and computers to serve different kinds of content (static and dynamic); the aim is to separate big or huge files (download.*, a domain that might also be replaced by a CDN) from small and medium-sized files (static.*) and from the main dynamic site (www.*, maybe with some content stored in a backend database). The idea is to be able to efficiently serve big or huge (over 10 – 1000 MB) files (maybe throttling downloads) and to fully cache small and medium-sized files, without affecting the performance of the dynamic site under heavy load, by using different settings for each (group of) web server computer(s):
    • https://download.example.com
    • https://static.example.com
    • https://www.example.com
  • Using many web servers (computers) that are grouped together behind a load balancer so that they act or are seen as one big web server.
  • Adding more hardware resources (i.e., RAM, fast disks) to each computer.
  • Using more efficient computer programs for web servers (see also: software efficiency).
  • Using the most efficient Web Server Gateway Interface to process dynamic requests (spawning one or more external programs every time a dynamic page is retrieved kills performance).
  • Using other programming techniques and workarounds, especially if dynamic content is involved, to speed up HTTP responses (e.g., avoiding dynamic calls to retrieve objects, such as style sheets, images and scripts, that never change or change very rarely, by copying that content to static files once and then keeping it synchronized with the dynamic content).
  • Using the latest efficient versions of HTTP (e.g., beyond the common HTTP/1.1, also enabling HTTP/2 and maybe HTTP/3, whenever the available web server software has reliable support for the latter two protocols) in order to greatly reduce the number of TCP/IP connections started by each client and the size of the data exchanged (thanks to more compact HTTP header representation and possibly data compression). This may not prevent overloads of RAM and CPU caused by the need for encryption, nor overloads caused by excessively large files uploaded at high speed, because these protocols are optimized for concurrency.[58][59]

Market share

Chart: Market share of all sites for the most popular web servers, 2005–2021
Chart: Market share of all sites for the most popular web servers, 1995–2005

Below are the latest statistics of the market share of all sites of the top web servers on the Internet by Netcraft.

Web server: Market share of all sites
Date nginx (Nginx, Inc.) Apache (ASF) OpenResty (OpenResty Software Foundation) Cloudflare Server (Cloudflare, Inc.) IIS (Microsoft) GWS (Google) Others
October 2021[60] 34.95% 24.63% 6.45% 4.87% 4.00% (*) 4.00% (*) Less than 22%
February 2021[61] 34.54% 26.32% 6.36% 5.0% 6.5% 3.90% Less than 18%
February 2020[62] 36.48% 24.5% 4.00% 3.0% 14.21% 3.18% Less than 15%
February 2019[63] 25.34% 26.16% N/A N/A 28.42% 1.66% Less than 19%
February 2018[64] 24.32% 27.45% N/A N/A 34.50% 1.20% Less than 13%
February 2017[65] 19.42% 20.89% N/A N/A 43.16% 1.03% Less than 15%
February 2016[66] 16.61% 32.80% N/A N/A 29.83% 2.21% Less than 19%

NOTE: (*) percentage rounded to integer number, because its decimal values are not publicly reported by source page (only its rounded value is reported in graph).

See also


Standard Web Server Gateway Interfaces used for dynamic contents:

  • CGI Common Gateway Interface
  • SCGI Simple Common Gateway Interface
  • FastCGI Fast Common Gateway Interface

A few other Web Server Interfaces (server or programming language specific) used for dynamic contents:

  • SSI Server Side Includes: rarely used; static HTML documents containing SSI directives are interpreted by the server software to include small pieces of dynamic data on the fly when pages are served (e.g., date and time, other static file contents, etc.).
  • SAPI Server Application Programming Interface:
    • ISAPI Internet Server Application Programming Interface
    • NSAPI Netscape Server Application Programming Interface
  • PSGI Perl Web Server Gateway Interface
  • WSGI Python Web Server Gateway Interface
  • Rack Rack Web Server Gateway Interface
  • JSGI JavaScript Web Server Gateway Interface
  • Java Servlet, JavaServer Pages
  • Active Server Pages, ASP.NET

References

from Grokipedia
A web server is a computer system that provides World Wide Web (WWW) services on the Internet, consisting of hardware, an operating system, web server software (such as Apache or Microsoft's IIS), and website content including web pages. In the context of the Hypertext Transfer Protocol (HTTP), the web server acts as the origin server, listening for incoming connections from clients such as web browsers, interpreting their requests, and returning appropriate responses, typically containing hypertext documents and associated resources such as images, stylesheets, and scripts.

The concept of the web server originated with the invention of the World Wide Web by Tim Berners-Lee at CERN in 1989, where he proposed a system for sharing hypertext documents among researchers. By the end of 1990, Berners-Lee had implemented the first web server, later known as CERN httpd, running on a NeXT computer, which served the inaugural webpage describing the project itself. This early server laid the foundation for HTTP, a stateless application-level protocol designed for distributed, collaborative hypermedia information systems, as formalized in subsequent IETF specifications starting with RFC 1945 in 1996.

Web servers function by maintaining a connection with clients over TCP/IP, processing HTTP requests (such as GET or POST methods), and delivering responses with status codes (e.g., 200 OK for success or 404 Not Found for missing resources). They can be categorized as static servers, which deliver pre-existing files without modification, or dynamic servers, which generate content in real time by integrating with application servers, databases, or scripting languages like PHP or Python to handle user-specific data. Common architectures include process-based models, where each request spawns a new process; thread-based models for concurrent handling within a single process; and event-driven models for high concurrency, as seen in modern asynchronous servers.

Among the most widely used web server software as of November 2025, Nginx holds the largest market share at 33.2%, valued for its efficiency in managing numerous simultaneous connections, followed closely by Cloudflare Server at 25.1% and Apache at 25.0%, the latter known for its modular extensibility and long-standing dominance since its release in 1995. LiteSpeed holds 14.9%, while Microsoft's IIS commands 3.6% of the market, primarily in enterprise Windows environments. These servers are essential for hosting websites, web applications, and APIs, supporting content exchange across over 1.35 billion websites and enabling functionalities from simple static sites to complex platforms.

Overview

Definition and Role

A web server is either software or a combination of software and hardware designed to accept requests from clients, such as web browsers, via the Hypertext Transfer Protocol (HTTP) and deliver corresponding web pages or resources, typically transmitted over the Transmission Control Protocol/Internet Protocol (TCP/IP). In the client-server model of the Web, the web server fulfills the server-side role by processing incoming requests and returning responses, which may include static files like HTML documents, CSS stylesheets, and images, or dynamically generated content produced by interfacing with backend systems such as server-side scripts or databases. This setup enables the distribution of hypermedia information across networks, supporting collaborative and interactive web experiences. The concept of the web server originated as a key component in Tim Berners-Lee's vision for the World Wide Web, proposed in 1989 at CERN to facilitate global information sharing among researchers.

At its core, HTTP serves as the foundational protocol, defined as a stateless application-level protocol that treats each request-response exchange independently, without retaining session information between interactions unless explicitly managed by additional mechanisms. For secure communications, HTTPS extends HTTP by layering it over Transport Layer Security (TLS), encrypting data in transit to protect against eavesdropping and tampering. Web servers employ standard HTTP methods to handle specific actions, such as GET for retrieving a resource without altering server state and POST for submitting data to be processed, often triggering updates or creation on the server. To ensure clients interpret responses correctly, web servers specify MIME types, which identify the media format of content (e.g., text/html for HTML files or image/png for images), distinguishing web-specific delivery from other server types. Unlike file servers, which provide generic network file access via protocols like SMB without content-type negotiation, or database servers, which manage structured retrieval and storage through query languages like SQL, web servers are optimized for HTTP-based web content dissemination and formatting.

Types and Classifications

Web servers can be classified based on their content handling capabilities, distinguishing between traditional static-only servers and modern dynamic-capable ones. Static web servers primarily deliver pre-built files such as HTML, CSS, and images without processing, making them suitable for simple, unchanging websites with low computational demands. In contrast, dynamic web servers integrate additional software modules to generate content on the fly, often using languages like PHP or CGI programs to interact with databases and produce personalized responses based on user input or session data. This evolution allows dynamic servers to support interactive applications, though they require more resources for execution.

Architectural designs for web servers vary to optimize performance under different loads, including process-based forking, multi-threaded, event-driven, and hybrid models. Forking architectures, such as the pre-fork model, create a pool of child processes in advance to handle incoming connections, ensuring isolation but consuming more memory per request. Threaded models employ multiple threads within a single process to manage concurrent requests, offering better resource sharing than forking while reducing overhead, as seen in Apache's worker MPM. Event-driven architectures use non-blocking I/O to process multiple requests asynchronously with minimal threads, excelling in high-concurrency scenarios like those addressed by the C10k problem, exemplified by servers like Nginx. Hybrid approaches combine elements, such as event-driven handling for static content with threaded processing for dynamic tasks, to balance efficiency and scalability.

Deployment models classify web servers by their physical and operational environments, encompassing software-based, hardware appliances, cloud services, and embedded systems. Software servers, installed on general-purpose hardware, provide flexible configuration for custom needs, with examples including Apache for versatile hosting. Hardware appliances integrate web serving with dedicated processors and optimized firmware for reliability in enterprise settings, such as F5 BIG-IP devices that combine load balancing and HTTP handling. Cloud-based deployments leverage virtualized infrastructure for elastic scaling, like AWS Elastic Load Balancing (ELB), which distributes traffic across targets without managing underlying servers. Embedded web servers run on resource-constrained devices for local management, common in IoT applications such as smart thermostats using lightweight frameworks like HOKA to expose configuration interfaces via HTTP.

Web servers also differ in licensing models, with open-source options promoting community-driven development and proprietary ones emphasizing vendor support. Open-source servers like Apache and Nginx offer free access to source code, enabling customization and rapid bug fixes through global contributions, though they may require expertise for secure implementation. Proprietary examples include Oracle HTTP Server, which extends Apache with integrated modules for enterprise security and performance tuning, and Microsoft IIS, tightly coupled with Windows for seamless integration. Open-source models reduce licensing costs and foster innovation but can expose vulnerabilities if patches are delayed, while proprietary servers provide dedicated support and compliance certifications at the expense of higher fees and limited modifications.

Emerging classifications reflect shifts toward distributed and efficient paradigms, including serverless architectures and edge servers for CDN integration.
Serverless models abstract server management entirely, allowing functions such as AWS Lambda to handle web requests on demand, scaling automatically for event-driven workloads without provisioning infrastructure. Edge servers, positioned near users in CDN networks, cache and serve content to minimize latency, as in Cloudflare's edge infrastructure, which processes HTTP requests closer to the end user than central data centers. These types address modern demands for low-latency, cost-effective delivery in global applications.
Licensing model | Examples | Pros | Cons
Open-source | Apache HTTP Server, NGINX | Cost-free, highly customizable, strong community support | Potential security gaps without vigilant maintenance; steeper learning curve for advanced setups
Proprietary | Oracle HTTP Server, Microsoft IIS | Vendor-backed support, integrated security features, easier enterprise compliance | Licensing expenses; restricted code access limiting flexibility
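To make the event-driven architecture described above more concrete, here is a minimal Python sketch of a single-threaded event loop that serves a fixed HTTP response using non-blocking sockets and the selectors module. The listening address 127.0.0.1:8080 and the canned response are illustrative, and production servers such as NGINX implement the same idea far more robustly on top of kernel interfaces like epoll.

```python
# Toy event-driven server loop: one thread multiplexes many connections.
import selectors
import socket

sel = selectors.DefaultSelector()
RESPONSE = (
    b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n"
    b"Content-Length: 13\r\nConnection: close\r\n\r\nHello, world!"
)

def accept(server_sock):
    conn, _addr = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    data = conn.recv(4096)          # non-blocking read of the request bytes
    if data:
        conn.sendall(RESPONSE)      # tiny response, so a single send suffices here
    sel.unregister(conn)
    conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 8080))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

while True:                          # single-threaded event loop
    for key, _mask in sel.select():
        key.data(key.fileobj)        # dispatch to accept() or handle()
```

A forking or threaded server would instead dedicate a process or thread to each connection, trading memory for isolation, which is the contrast drawn in the classification above.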

History

Origins in the WWW Project (1989–1993)

In March 1989, Tim Berners-Lee, a researcher at CERN, submitted a memorandum proposing a hypertext-based information management system to facilitate sharing scientific data among physicists worldwide. This proposal outlined a distributed network of hypertext documents linked via a simple protocol, laying the groundwork for what would become the World Wide Web (WWW) and its foundational Hypertext Transfer Protocol (HTTP). By 1991, Berners-Lee had implemented the first version of HTTP, known as HTTP 0.9, as part of this initiative to enable seamless document retrieval over the Internet.

The inaugural web server, CERN httpd, emerged from this project in late 1990, developed by Berners-Lee on a NeXT computer running the NeXTSTEP operating system. This server was designed to host and deliver static documents, with the first website—dedicated to explaining the WWW project itself—going live at http://info.cern.ch on December 20, 1990. Initially confined to CERN's internal network, the server operated as a basic file-serving daemon, responding to HTTP requests by transmitting raw HTML content without advanced processing capabilities.

Key advancements followed in 1991, including the release of the libwww library by Berners-Lee, a public-domain toolkit that provided developers with core functions for handling HTTP communications and parsing hypertext. This library facilitated the creation of compatible clients and servers, promoting interoperability in the nascent ecosystem. Later that year, on December 17, 1991, Berners-Lee delivered the first public demonstration of the WWW at the Hypertext '91 conference in San Antonio, Texas, showcasing the integrated browser, server, and hypertext navigation to an audience of researchers. By 1993, the project extended beyond CERN with the development of the NCSA HTTPd prototype by Rob McCool at the National Center for Supercomputing Applications (NCSA), which began in early 1993 and was first publicly released on April 22 as version 0.3, introducing enhancements like configurable access controls while remaining rooted in HTTP 0.9 compatibility.

Early web servers during this period faced significant constraints, primarily limited to academic and research environments due to their experimental nature and reliance on existing infrastructure; by late 1993 there were only about 500 known servers, accounting for roughly 1% of Internet traffic. Lacking built-in security features such as authentication or encryption, they were vulnerable to unrestricted access and unsuitable for transmitting sensitive data. Functionality was restricted to serving static content, with no support for dynamic content generation or user interactions beyond basic retrieval.

The protocol foundations established in HTTP 0.9 emphasized simplicity to accelerate adoption: requests consisted of a single line in the format "GET /path", without headers, version indicators, or methods beyond retrieval, while responses delivered unadorned HTML documents directly over TCP connections. This minimalist design avoided complexity but exposed limitations, such as the inability to specify content types or handle errors, prompting early discussions—particularly through NCSA's extensions—on incorporating status codes, headers, and multiple methods, ideas that evolved toward HTTP 1.0.

Expansion and Early Servers (1994–2000)

The release of the Mosaic browser in 1993 catalyzed widespread adoption of the World Wide Web, dramatically increasing demand for web server software and propelling the NCSA HTTPd server—developed at the National Center for Supercomputing Applications—to become the first widely used implementation from 1993 to 1994. By supporting inline images and a user-friendly graphical interface, Mosaic transformed the web from an academic tool into a broadly accessible platform, leading to exponential growth in web usage and server deployments. This surge prompted NCSA HTTPd to handle a growing number of sites, with web pages indexed by early search tools reaching around 110,000 by late 1994.

In response to the stalling development of NCSA HTTPd, a group of web administrators formed the Apache Group in February 1995 to coordinate enhancements through email-shared patches, resulting in the initial public release of the Apache HTTP Server (version 0.6.2) in April 1995. The project culminated in Apache 1.0 on December 1, 1995, which quickly surpassed NCSA HTTPd to become the dominant web server by April 1996, owing to its innovative modular architecture that enabled developers to add or extend features via loadable modules without altering the core code. This design fostered rapid community contributions and adaptability to diverse hosting needs.

Concurrent with Apache's rise, commercial alternatives emerged to meet enterprise demands. Microsoft released Internet Information Server (IIS) 1.0 in May 1995 as a free add-on for Windows NT, integrating web serving with its operating-system ecosystem for easier deployment on Windows platforms. Netscape Communications launched its Enterprise Server 2.0 in March 1996, building on its earlier Netsite software to offer robust features like load balancing and security for business applications. These servers competed in a burgeoning market, where the introduction of the Common Gateway Interface (CGI) specification in 1993 had standardized dynamic content generation by allowing web servers to execute external scripts in response to user requests.

Standardization efforts further supported this expansion. The HTTP/1.1 protocol, formalized in RFC 2068 in January 1997, introduced persistent connections to reuse TCP sockets across multiple requests, reducing latency, and added support for name-based virtual hosting via the Host header to serve multiple domains from a single IP address. Simultaneously, Netscape pioneered Secure Sockets Layer (SSL) integration in 1994 with version 1.0 (though not publicly released due to flaws), laying the groundwork for encrypted web communications in subsequent versions like SSL 2.0 in 1995.

The period marked explosive growth, with the number of websites expanding from approximately 2,700 in 1994 to over 17 million by mid-2000, while web server installations surged into the millions as measured by early surveys. This proliferation reflected the web's transition to a commercial medium, driven by easier dynamic content generation and security features that enabled e-commerce and broader accessibility.

Maturation and Modern Era (2001–Present)

Following the dot-com bust, web server technology shifted toward efficiency and scalability to handle surging internet traffic. In 2004, NGINX was released by Igor Sysoev as an open-source server emphasizing an asynchronous, event-driven architecture that excelled at managing high concurrency without the threading overhead of traditional servers like Apache. This innovation addressed limitations in handling thousands of simultaneous connections, making NGINX a staple for high-traffic sites. Similarly, lighttpd, released in 2003 by Jan Kneschke, emerged as a lightweight alternative optimized for resource-constrained environments such as embedded systems and low-power devices, featuring a single-process model with support for dynamic content. In 2019, F5 Networks acquired Nginx Inc., enhancing its enterprise features and support for modern deployments.

Protocol advancements further matured web servers by improving speed and reliability. HTTP/2, standardized by the IETF in May 2015 via RFC 7540, introduced multiplexing to allow multiple requests over a single TCP connection, along with HPACK header compression to reduce overhead and server push for proactive resource delivery. Building on this, HTTP/3 was published in June 2022 as RFC 9114, leveraging QUIC—a UDP-based transport protocol originally developed at Google—to provide built-in encryption, lower latency through 0-RTT handshakes, and resilience to packet loss, significantly enhancing performance on mobile and variable networks. Major web servers quickly adopted these protocols, with widespread implementation by the early 2020s to support modern web applications.

The rise of containerization and cloud-native computing transformed web server deployment. Docker, released in 2013 by Solomon Hykes and his team at dotCloud, popularized container packaging, enabling lightweight, portable web server instances that could be scaled rapidly across distributed environments without the overhead of full operating-system virtualization. Complementing this, serverless architectures gained traction with AWS Lambda's launch in November 2014, allowing developers to run web server code in response to events without managing underlying infrastructure, thus abstracting away traditional server provisioning. Edge computing expanded via content delivery networks (CDNs), with Cloudflare—founded in 2009—growing prominently in the 2010s to cache and serve content closer to users, reducing latency for global web server loads.

Security enhancements became integral as threats evolved. Web Application Firewalls (WAFs) were integrated into servers during the 2000s, exemplified by ModSecurity's Apache module released in 2002, which provided rule-based protection against common attacks like SQL injection and XSS. The 2014 Heartbleed vulnerability in OpenSSL exposed flaws in TLS implementations, prompting urgent patches across servers like Apache and NGINX and accelerating the adoption of secure defaults. Let's Encrypt, launched by the Internet Security Research Group in April 2016 (following a December 2015 public beta), democratized TLS certificates with free, automated issuance, leading to over 80% of web servers using HTTPS by 2020.

By the 2020s, web servers incorporated emerging trends for performance and sustainability. Support for WebAssembly (Wasm) on the server side advanced with runtimes like Wasmtime (2019) and frameworks like Fastly's Compute@Edge (2020), enabling secure, high-performance execution of non-JavaScript code for web backends, with continued development through 2025. Sustainability efforts after 2020 focused on energy-efficient designs in web infrastructure, including idle-time power reduction in server hardware and the adoption of carbon-aware computing practices in data centers to minimize environmental impact during peak loads.

Technical Fundamentals

Core Architecture

The core architecture of a web server encompasses the fundamental structural elements that enable it to accept, process, and respond to HTTP requests efficiently. At its foundation, a web server employs a listener mechanism to monitor incoming network connections on designated ports, typically port 80 for HTTP and 443 for HTTPS. In the Apache HTTP Server, this is managed through Multi-Processing Modules (MPMs), where a parent process launches child processes or threads dedicated to listening and handling connections; for instance, the worker MPM creates a fixed number of server threads per child process to manage concurrency, while the event MPM uses separate listener threads to accept connections and assign them to idle worker threads for processing. Similarly, NGINX utilizes a master process that binds to listen sockets and spawns multiple worker processes, each employing an event-driven, non-blocking I/O model via kernel interfaces like epoll to efficiently multiplex thousands of connections without dedicated listener threads per worker.

Configuration files form a critical part of this architecture, defining server behavior, loaded modules, and resource limits. Apache's primary configuration file, httpd.conf, centralizes directives for global settings such as the server root, listening ports, and module inclusions, often supplemented by additional files like apache2.conf on some distributions for modular organization. These files use a declarative syntax to scope directives to specific contexts, ensuring flexible yet controlled server operation.

A hallmark of modern web server design is modularity, allowing the core to be extended without altering the base code. Apache exemplifies this through its loadable module system, where extensions are dynamically or statically linked at runtime; for example, mod_rewrite provides a rule-based rewriting engine using PCRE regular expressions to manipulate requested URLs on the fly, enabling features like clean URLs and redirects. Likewise, mod_ssl integrates SSL/TLS encryption by leveraging OpenSSL for secure connections, handling certificate management and protocol negotiation within the server's request pipeline. This plug-in architecture promotes extensibility, with over 50 core modules available for functions ranging from authentication to content compression.

Memory and resource management in web servers balances efficiency and scalability, distinguishing between stack allocation for transient, fixed-size data like local variables and function call frames, and heap allocation for dynamic structures such as request buffers, response objects, and connection states that persist across operations. To mitigate the overhead of frequent allocations, servers implement pooling strategies: Apache's threaded MPMs maintain thread pools to reuse resources for handling multiple requests, reducing creation costs, while NGINX's event-driven design minimizes memory use by avoiding per-connection threads and instead pooling upstream connections to backend services.

Web servers predominantly operate in user space for security and portability, executing application logic outside the kernel to limit privileges and prevent crashes from affecting the OS core. Traditional implementations like Apache run entirely in user mode, relying on kernel system calls (e.g., accept() and read()) for I/O operations, which introduce context switches but ensure isolation. NGINX follows a similar user-space model but supports kernel-assisted optimizations; for high-performance scenarios, specialized variants or deployments (such as ports using user-space networking frameworks like F-Stack or custom kernels like Junction) can employ kernel-bypass techniques to process packets directly in user space, eliminating kernel network-stack involvement and reducing latency for latency-sensitive applications.

Integration layers facilitate communication with external components, enhancing the server's role in dynamic environments. Protocols like FastCGI serve as a binary interface between the web server and backend applications, enabling persistent processes (e.g., for PHP or Python scripts) to handle multiple requests over TCP or Unix sockets, thus avoiding the per-request overhead of traditional CGI. Logging mechanisms provide essential observability, with access logs recording details of every request (e.g., client IP address, requested resource, and status code) in a standard layout such as the Combined Log Format, and error logs capturing diagnostics such as syntax errors or resource failures for troubleshooting. In Apache, these are configured via directives in httpd.conf, directing output to files like access_log and error_log, which are often rotated for manageability.
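As a small illustration of the logging conventions just mentioned, the following Python sketch parses one access-log line in the Combined Log Format; the sample line is invented for the example and does not come from any particular server's log.

```python
# Sketch: parse one Combined Log Format line into its named fields
# (client IP, identity, user, timestamp, request line, status, bytes,
# referer, user agent).
import re

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

sample = ('203.0.113.7 - - [10/Oct/2025:13:55:36 +0000] '
          '"GET /index.html HTTP/1.1" 200 2326 '
          '"https://example.com/" "Mozilla/5.0"')

match = LOG_PATTERN.match(sample)
if match:
    fields = match.groupdict()
    print(fields["ip"], fields["request"], fields["status"])
```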

Request-Response Mechanism

The request-response mechanism forms the core of how web servers interact with clients over the Hypertext Transfer Protocol (HTTP), a stateless application-level protocol designed for distributed hypertext systems. In this cycle, a client establishes a TCP connection to the server—typically on port 80 for HTTP or 443 for HTTPS—and sends an HTTP request message specifying the desired resource and parameters; the server processes the request independently of prior interactions before generating and transmitting an HTTP response message. This stateless nature means each request contains all the information the server needs to fulfill it, without retaining session state across requests unless explicitly managed through mechanisms like cookies or tokens; this design enhances scalability by allowing servers to handle requests from any client without context dependency.

HTTP request messages follow a structured, text-based format consisting of a start line (the request line), zero or more header fields written as name: value pairs, a blank line to delimit the headers, and an optional message body for methods like POST that include payload data. The request line specifies the HTTP method (such as GET for retrieval or POST for submission), the request-target (usually a URI path like /index.html), and the protocol version (e.g., HTTP/1.1), enabling the server to identify the action and the target resource. Header fields provide additional metadata, including the Host field for domain identification, Content-Type for the body's media type, and Accept headers for client preferences; for instance, Accept: text/html indicates a preference for HTML content. Responses mirror this structure but begin with a status line containing the HTTP version, a three-digit status code (e.g., 200 for success), and a reason phrase (e.g., OK), followed by headers like Content-Length or Server, and an optional body carrying the resource representation, such as HTML or JSON.

Upon receiving a request over the connection, the server performs initial parsing to validate the message syntax, extract the method, URI, and headers, and ensure compliance with the protocol version; invalid requests may trigger early termination. Routing then occurs based on the Host header, allowing a single server to manage multiple virtual hosts by directing requests to the appropriate configurations or backends for different domains sharing the same IP address. Content negotiation follows, where the server evaluates client-provided Accept, Accept-Language, and Accept-Encoding headers to select the most suitable response variant, such as delivering gzip-compressed content when the client advertises support for that encoding, prioritizing quality factors (q-values) from 0 to 1 to resolve preferences. Basic error handling is integrated throughout: client errors like malformed syntax result in 4xx status codes (e.g., 400 Bad Request), while server-internal issues yield 5xx codes (e.g., 500 Internal Server Error), both reported in the response status line.

To manage concurrency efficiently, modern web servers employ asynchronous handling via event loops, which monitor multiple connections non-blockingly and dispatch events like incoming data or timers without dedicating a thread to each request, thus avoiding blocking on I/O operations such as socket reads. This event-driven approach, as implemented in servers like NGINX, enables a single process to interleave the handling of thousands of simultaneous requests by queuing and processing events in a loop, significantly improving throughput under load compared with traditional thread-per-connection models.
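The content-negotiation step described above can be sketched in Python as follows: parse the Accept header into media ranges with their q-values and pick the best match among the variants the server can produce. This is a simplified illustration that considers only media types and q-values (other parameters are ignored); the header and variant list are examples.

```python
# Simplified server-side content negotiation on an Accept header.
def parse_accept(header):
    prefs = []
    for part in header.split(","):
        fields = part.strip().split(";")
        media = fields[0].strip()
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media, q))
    return prefs

def negotiate(accept_header, available):
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for media in available:
        for pattern, q in prefs:
            type_, _, subtype = pattern.partition("/")
            matches = (pattern == media or pattern == "*/*"
                       or (subtype == "*" and media.startswith(type_ + "/")))
            if matches and q > best_q:
                best, best_q = media, q
    return best  # None would mean no acceptable variant (406 Not Acceptable)

print(negotiate("text/html,application/xml;q=0.9,*/*;q=0.8",
                ["application/json", "text/html"]))   # -> text/html
```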

Operations and Features

Processing Incoming Requests

Web servers initiate the processing of incoming requests by accepting connections over TCP (for HTTP/1.x and HTTP/2) or QUIC over UDP (for HTTP/3) on designated ports, such as port 80 for unencrypted HTTP and port 443 for HTTPS. In TCP-based implementations, a listener socket is created to monitor for incoming connections, and upon detection, the server invokes the operating system's accept() call to create a new socket dedicated to the client connection. For HTTP/3, QUIC handles connection setup via UDP datagrams rather than traditional TCP sockets. The server then receives the raw HTTP message over the transport stream, which includes the request line (comprising the HTTP method, request URI, and protocol version) followed by headers and, optionally, a body. Parsing occurs line by line, identifying key headers like Host (mandatory in HTTP/1.1 to support virtual hosting) and User-Agent (indicating the client's software).

Following reception, the server handles the request URI through normalization to ensure consistent interpretation. This involves decoding percent-encoded characters (e.g., converting %20 to a space) while preserving reserved characters like / and ?, resolving relative paths by merging with the base URI if needed, and removing redundant segments such as multiple slashes or dot segments (. and ..). The normalized URI is then mapped to internal server paths, often via configuration rules that translate it to filesystem locations or application handlers, enabling dynamic routing without exposing the underlying structure.

Validation steps follow to safeguard against malformed or abusive requests. The server verifies support for the HTTP method (e.g., GET, POST, PUT, DELETE) against configured allowances, rejecting unsupported ones with a 501 Not Implemented status. Headers undergo sanitization to strip or escape potentially harmful content, such as invalid characters in fields like Content-Type, and the overall request size—including headers and body—is checked against limits (typically around 8 KB for headers, with configurable maxima such as 1 MB for bodies in many implementations) to mitigate denial-of-service attacks from oversized payloads.

For environments hosting multiple domains on a single IP address, virtual hosting routes requests appropriately. In plain HTTP, the Host header determines the target site, while for HTTPS, Server Name Indication (SNI) extends the TLS handshake by including the requested hostname in the ClientHello message, allowing the server to select the correct certificate and configuration without requiring separate IP addresses or ports. HTTP/3 supports the same name indication within its QUIC/TLS integration.

Throughout processing, servers log request metadata to access logs for auditing and analysis. Common entries include the client's IP address, a timestamp, the request method, the normalized URI, the protocol version, and the response status code, formatted in standards like the Common Log Format or extended variants that add richer details such as response time and bytes sent (recorded once processing completes). Inbound details are captured early in the request phase so that they are recorded accurately.
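A minimal sketch of the URI normalization and mapping step described above, in Python: percent-decode the path, collapse dot segments and duplicate slashes, and refuse paths that would escape the document root. The DOCUMENT_ROOT value and the sample request target are illustrative assumptions, not taken from any particular server.

```python
# Sketch: normalize a request-target and map it onto a document root.
from urllib.parse import unquote, urlsplit
import posixpath

DOCUMENT_ROOT = "/var/www/html"   # hypothetical root for this example

def map_request_target(target):
    path = urlsplit(target).path          # drop the ?query component
    path = unquote(path)                  # "%20" -> " ", "%2E" -> ".", etc.
    path = posixpath.normpath(path)       # resolve "." and "..", squeeze "//"
    if not path.startswith("/"):
        path = "/" + path
    mapped = posixpath.join(DOCUMENT_ROOT, path.lstrip("/"))
    # Reject traversal attempts that would leave the document root.
    if not mapped.startswith(DOCUMENT_ROOT + "/") and mapped != DOCUMENT_ROOT:
        raise ValueError("path escapes document root")
    return mapped

print(map_request_target("/a//b/../index%2Ehtml?x=1"))
# -> /var/www/html/a/index.html
```

Real servers decode and normalize more carefully (for example, treating encoded reserved characters specially), but the overall flow is the same.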

Generating and Sending Responses

Once the web server has processed an incoming request, it generates a response by assembling the appropriate content and headers according to the HTTP protocol specifications. For static content, the server retrieves the requested file directly from the filesystem, determines its MIME type from the file extension using predefined mappings, and sets the corresponding Content-Type header, such as text/html for .html files or image/jpeg for .jpg files. This type assignment ensures the client interprets the content correctly, as standardized in the IANA media types registry. To optimize transmission, servers often apply compression algorithms such as gzip to the response body if the client supports it, as indicated by the Accept-Encoding header, reducing bandwidth usage by encoding the content and adding a Content-Encoding: gzip header.

For dynamic content, the server invokes backend scripts or applications to generate the response on the fly, for example executing PHP code via the mod_php module in Apache, which embeds the PHP interpreter to run scripts and produce HTML or other output. This invocation typically occurs after mapping the request URI to a script file, with the server passing environment variables and request input to the backend for processing. Response buffering is employed during this stage to collect the generated output in memory before transmission, allowing efficient handling of variable-length content and preventing partial sends that could lead to incomplete responses.

Before sending the response, the server performs modifications based on security and routing needs, including authorization checks such as HTTP Basic Authentication, where the server verifies credentials sent in the Authorization header against stored user data. If authorization fails, a 401 Unauthorized status is returned, prompting the client for credentials. For redirections, the server issues a 301 Moved Permanently or 302 Found status code along with a Location header specifying the new URI, instructing the client to refetch the resource elsewhere. Error handling involves customizing pages for status codes such as 404 Not Found, when the requested resource is absent, or 500 Internal Server Error, for server-side failures, often using server-specific templates to provide user-friendly messages instead of bare protocol errors.

The assembled response is then transmitted to the client over the established connection, using techniques like chunked transfer encoding for streaming dynamic or large content in HTTP/1.1, where the body is sent in sequential chunks, each preceded by its size in hexadecimal, allowing responses of indefinite length without a prior Content-Length header. HTTP/2 and HTTP/3 use framed streams for similar purposes. Persistent connections, enabled by the Connection: keep-alive header in HTTP/1.1, permit reusing the same TCP connection for multiple requests and responses, reducing the overhead of repeated handshakes; HTTP/2 provides built-in multiplexing over a single connection without additional negotiation. For secure transmission, HTTPS involves a TLS handshake prior to content exchange, during which the server authenticates itself with a certificate, negotiates keys, and establishes an encrypted channel to protect the response data in transit; in HTTP/3, TLS is integrated directly into QUIC.

Finally, the server finalizes the response by appending informational headers, such as the Server header identifying the software (e.g., Server: Apache/2.4.58), which aids debugging and statistics but can be customized or omitted to reduce fingerprinting. The connection is then either closed explicitly with a Connection: close header or reused if keep-alive is supported and no errors occurred, ensuring efficient resource management.
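To illustrate the chunked transfer encoding mentioned above, the following Python sketch frames a sequence of body pieces as HTTP/1.1 chunks, each preceded by its size in hexadecimal and terminated by a zero-length chunk; the body pieces themselves are arbitrary examples.

```python
# Sketch of HTTP/1.1 chunked transfer encoding of a response body.
def encode_chunked(pieces):
    out = bytearray()
    for piece in pieces:
        data = piece.encode("utf-8")
        out += f"{len(data):x}\r\n".encode("ascii")  # chunk size in hexadecimal
        out += data + b"\r\n"
    out += b"0\r\n\r\n"                              # final zero-length chunk
    return bytes(out)

body = encode_chunked(["<html>", "<body>Hello</body>", "</html>"])
print(body)
# b'6\r\n<html>\r\n12\r\n<body>Hello</body>\r\n7\r\n</html>\r\n0\r\n\r\n'
```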

Caching and Optimization Techniques

Web servers employ caching mechanisms to store frequently accessed resources, reducing the need to regenerate or retrieve content from origin sources and thereby improving response times and reducing server load. These techniques are essential for handling high-traffic scenarios, as they minimize redundant computation and network transfers. Optimization methods further enhance efficiency by compressing data, streamlining delivery, and leveraging protocol advancements.

Caching in web servers is categorized into static and dynamic types. Static file caching involves storing unchanging assets, such as images, CSS, and JavaScript files, either in memory for rapid access or on disk for persistence, allowing servers like NGINX or Apache to serve them directly without processing. Dynamic caching, on the other hand, handles variable content by using conditional validation mechanisms; for instance, the ETag header provides a validation token for a specific version of a resource, while the Last-Modified header indicates the last update timestamp, enabling clients to request updates only if changes have occurred, often resulting in a 304 Not Modified response. These mechanisms are supported across HTTP versions, including HTTP/3.

Cache control is managed through HTTP headers that dictate storage and freshness rules. The Cache-Control header specifies directives like max-age, which sets the maximum time in seconds a resource can be considered fresh before revalidation, and no-cache, which requires validation even if the response is stored. The Vary header informs caches about request headers (e.g., Accept-Language) that influence response variations, ensuring the correct variant is served to different clients. Proxy caching, including reverse proxies such as Varnish or NGINX, extends this by storing responses at intermediate layers to offload origin servers, with directives like public allowing shared caches or private restricting storage to the client side.

Optimization techniques focus on reducing payload size and transmission overhead. Content minification removes unnecessary characters from HTML, CSS, and JavaScript files without altering functionality, shrinking transfer sizes by up to 20-30% in typical cases. Preloading, via the Link header with rel="preload", hints browsers to fetch critical resources early. Compression algorithms such as Brotli, introduced in 2016, achieve better ratios than gzip for text-based assets, reducing bandwidth needs by 20-26% on average when enabled server-side. Integration with load balancers allows cached content to be distributed across nodes, while connection multiplexing in HTTP/2 and HTTP/3 enables multiple requests over a single connection, eliminating the need for many parallel connections and improving throughput for concurrent assets.

Cache invalidation ensures stale data is removed to maintain accuracy. Time-based expiration relies on TTL values set via Cache-Control's max-age or the Expires header, automatically discarding entries after a defined period to balance freshness and efficiency. Purge strategies actively invalidate specific entries upon content updates, often triggered by application logic or webhooks in content management systems. Content delivery networks (CDNs) facilitate global offloading by caching at edge locations, with invalidation propagated via purge APIs to synchronize changes across distributed nodes, reducing latency for international users.

Advanced optimizations target language-specific bottlenecks. Opcode caching, such as PHP's OPcache, precompiles scripts into bytecode stored in shared memory, bypassing parsing on subsequent requests and yielding up to 3x performance gains for dynamic PHP applications.
This is particularly effective for web servers running PHP, where repeated compilation otherwise consumes significant CPU cycles.
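A simplified sketch of the conditional-validation flow described above, using an ETag derived from a content hash together with a Cache-Control lifetime; the hashing scheme, the max-age value, and the page content are illustrative choices rather than any particular server's actual behavior.

```python
# Sketch: ETag-based revalidation yielding 200 on first fetch, 304 afterwards.
import hashlib

def make_etag(body: bytes) -> str:
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(request_headers: dict, body: bytes):
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # Client copy is still valid: send headers only, no body.
        return 304, {"ETag": etag, "Cache-Control": "max-age=3600"}, b""
    return 200, {"ETag": etag,
                 "Cache-Control": "max-age=3600",
                 "Content-Length": str(len(body))}, body

page = b"<html><body>cached page</body></html>"
status, headers, _ = respond({}, page)                            # first fetch
print(status, headers["ETag"])                                    # 200 "..."
status, _, _ = respond({"If-None-Match": headers["ETag"]}, page)  # revalidation
print(status)                                                     # 304
```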

Performance Considerations

Key Metrics and Evaluation

Web server performance is assessed through several primary metrics that quantify its ability to handle traffic efficiently. Throughput measures the number of requests processed per second (RPS), indicating the server's capacity to deliver content under load. Latency, often expressed as time to first byte (TTFB), captures the duration from request initiation to receipt of the initial response byte, directly impacting user experience. Concurrency evaluates the maximum number of simultaneous connections the server can maintain without degradation, a critical factor for high-traffic scenarios. Resource utilization tracks consumption of CPU, RAM, and bandwidth, revealing bottlenecks in hardware efficiency during operation.

Beyond these, efficiency factors provide deeper insight into operational overhead. CPU cycles per request quantify the computational cost of handling individual requests, with lower values signaling optimized processing. Memory footprint assesses the RAM allocated per connection or process, essential for scaling on resource-constrained systems. I/O wait times measure delays due to disk or network operations, which can accumulate under sustained load. For software like NGINX, events per second—handled via its event-driven model—highlight efficiency in managing asynchronous I/O without blocking threads.

Benchmarking tools standardize these evaluations. ApacheBench (ab), bundled with the Apache HTTP Server, simulates concurrent requests to compute RPS and latency on targeted endpoints. wrk, a multithreaded HTTP benchmarking tool, excels at generating high loads on multi-core systems, reporting detailed latency distributions and throughput. Siege performs load testing by emulating user behavior across multiple URLs, yielding metrics on transaction rates and response times. Standardized suites like the TechEmpower Framework Benchmarks provide industry comparisons of web server throughput under realistic workloads, including dynamic content generation, with ongoing rounds such as Round 23 as of 2025.

Evaluations occur in varied contexts to reflect real-world demands. Steady-state testing applies consistent loads to gauge sustained performance, while burst loads simulate sudden spikes to assess recovery and peak handling. Real-world scenarios incorporate actual traffic patterns, contrasting with synthetic benchmarks that use controlled, repeatable inputs for isolation. OS tuning, such as enabling epoll on Linux for efficient event notification, significantly influences outcomes by reducing polling overhead in high-concurrency environments.

Historical benchmarks illustrate evolving capabilities, particularly in high-concurrency settings. Systematic reviews from the 2010s show NGINX outperforming Apache in RPS under concurrent loads, achieving up to 2-3 times higher throughput for static content thanks to its non-blocking, event-driven architecture, while Apache excels in modular dynamic processing but incurs higher resource costs at scale. These comparisons, often using tools like wrk, underscore NGINX's edge in scenarios exceeding 10,000 simultaneous connections since the mid-2010s.
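In the spirit of tools like ab and wrk, the following Python sketch issues a batch of concurrent GET requests and reports throughput and latency percentiles. The target URL, request count, and concurrency level are arbitrary assumptions for the example, and a dedicated benchmarking tool should be preferred for real measurements.

```python
# Toy throughput/latency probe: N requests through a small thread pool.
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://127.0.0.1:8080/"   # assumed local test server
REQUESTS, CONCURRENCY = 200, 10

def timed_get(_):
    start = time.perf_counter()
    with urllib.request.urlopen(URL) as resp:
        resp.read()
    return time.perf_counter() - start

t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    latencies = list(pool.map(timed_get, range(REQUESTS)))
elapsed = time.perf_counter() - t0

latencies.sort()
print(f"throughput: {REQUESTS / elapsed:.1f} req/s")
print(f"median latency: {statistics.median(latencies) * 1000:.1f} ms")
print(f"p95 latency: {latencies[int(0.95 * len(latencies)) - 1] * 1000:.1f} ms")
```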

Load Handling and Scalability

Web servers encounter overload primarily through resource exhaustion, where excessive concurrent threads or connections deplete CPU, memory, or network capacity, leading to system instability. Distributed denial-of-service (DDoS) attacks exacerbate this by flooding servers with illegitimate traffic, consuming bandwidth and computational resources and blocking legitimate users. Slow backend services, such as database queries or external APIs, can create cascading bottlenecks, while high disk I/O demands from logging or file operations further amplify delays under peak loads.

Overload manifests in symptoms such as elevated response latency, as processing queues lengthen and resources become contested; dropped connections, which occur when the server rejects new incoming requests to preserve stability; and elevated error rates, notably HTTP 503 Service Unavailable responses, signaling a temporary inability to handle demand. These indicators can be systematically monitored using open-source tools like Prometheus, which collect time-series data on metrics such as request duration, connection counts, and error ratios to enable early detection and alerting.

To counteract overload, web servers employ anti-overload techniques including rate limiting, implemented via modules such as NGINX's limit_req, which caps requests per client IP or other key to prevent abuse and preserve resources. Queuing systems, such as NGINX's upstream module, buffer excess requests in a wait queue rather than rejecting them outright, allowing controlled processing during surges. In cloud infrastructures, auto-scaling dynamically provisions additional instances based on predefined thresholds, while graceful degradation prioritizes core functionality—such as serving static content over dynamic pages—to maintain partial availability under stress.

Scalability strategies extend beyond immediate mitigation to long-term capacity planning. Horizontal scaling distributes load across multiple servers via load balancers such as HAProxy, which route traffic using algorithms like least connections or round-robin for even utilization. Vertical scaling upgrades individual server hardware, adding CPU cores, RAM, or storage to accommodate higher loads without architectural changes, though it faces physical limits on single-machine capacity. For stateful applications maintaining session data, sharding partitions workloads and data across nodes, ensuring balanced distribution while preserving consistency through coordination tools.

Contemporary advances in load handling incorporate microservices integration, where web servers interface with decomposed, independently scalable services to isolate failures and optimize resource use in distributed environments. AI-driven solutions, leveraging machine-learning models for traffic prediction, enable proactive scaling by forecasting demand patterns and preemptively adjusting server pools, reducing latency spikes in dynamic setups, as demonstrated in service function chain optimizations.
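The rate-limiting idea described above can be sketched with a per-client token bucket, one common mechanism for request throttling (NGINX's limit_req uses the closely related leaky-bucket approach); the rate, burst size, and client key used here are illustrative.

```python
# Sketch: per-client token-bucket rate limiting.
import time

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False   # caller would answer 429 Too Many Requests or 503

buckets = {}           # one bucket per client IP
def allowed(client_ip: str) -> bool:
    bucket = buckets.setdefault(client_ip, TokenBucket(rate_per_sec=5, burst=10))
    return bucket.allow()

for i in range(12):
    print(i, allowed("203.0.113.7"))   # roughly the first 10 pass, then throttled
```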

Deployment and Ecosystem

The Apache HTTP Server is modular, open-source web server software that has been a foundational implementation since its initial release in 1995, emphasizing extensibility through loadable modules for handling diverse functionality such as proxying and authentication. Key modules like mod_proxy enable reverse proxy capabilities, allowing it to forward requests to backend servers while supporting dynamic content via integrations with scripting languages. Historically significant as one of the earliest widely adopted open-source servers, it powers a broad range of use cases from small websites to large-scale enterprise deployments thanks to its robust configuration options and active community-driven development.

NGINX operates on an asynchronous, event-driven architecture, making it particularly efficient for serving static content and managing high-concurrency scenarios without blocking processes on individual requests. It excels as a reverse proxy, load balancer, and HTTP cache, with directive-based configuration files that simplify setup for caching dynamic content or accelerating media delivery. Originating in 2004 as a solution to the C10k problem of handling 10,000 concurrent connections, NGINX has evolved into a versatile platform for modern web applications, including API gateways and Kubernetes ingress controllers.

Microsoft Internet Information Services (IIS) is a proprietary web server tightly integrated with the Windows operating system, providing seamless support for ASP.NET applications and native management through the IIS Manager graphical interface. It offers features such as built-in compression, URL rewriting, and role-based security, making it well suited to enterprise environments hosting .NET-based web apps or services. Evolving alongside Windows versions since IIS 1.0 in 1995, it emphasizes administrative ease and integration with ecosystem tools for authentication and diagnostics.

Among other notable software implementations, LiteSpeed Web Server serves as a commercial, drop-in alternative to Apache, offering enhanced performance through event-driven processing and built-in caching while maintaining compatibility with Apache configurations; it is commonly used in shared hosting environments for its resource efficiency and anti-DDoS protections. Caddy provides automatic certificate management via ACME (Let's Encrypt) integration, simplifying secure deployments with a concise configuration syntax; written in Go, it is well suited to personal projects or small-scale services. For embedded or resource-constrained systems, lighttpd offers a lightweight, single-threaded design optimized for fast static file serving and FastCGI support, often deployed in IoT devices or small-scale applications.

On the hardware and appliance side, F5 BIG-IP functions as an application delivery controller that combines web serving with advanced load balancing, SSL offloading, and security features like web application firewalls, targeting enterprise data centers for high-availability traffic management. Cisco ACE, a legacy module for Catalyst switches, provided intelligent load balancing for protocols including HTTP and SIP, with session persistence and health monitoring, and was historically used in service-provider networks before its end-of-sale. Cloud-native options like Google Cloud Load Balancing deliver global anycast IP distribution for HTTP(S), TCP, and UDP traffic, with autoscaling and integration with Google Kubernetes Engine, making them well suited to distributed applications requiring low-latency edge delivery.
Selection of a web server implementation depends on specific requirements such as ease of configuration, community support for troubleshooting, and alignment with existing infrastructure; for instance, open-source options like Apache or NGINX suit diverse ecosystems, while integrated solutions like IIS fit Windows-centric setups.

As of November 2025, NGINX holds the largest market share among web servers, powering 33.2% of all websites with known server software, followed closely by Cloudflare Server at 25.1% and Apache at 25.0%. LiteSpeed accounts for 14.9% and Node.js for 5.3%, while Microsoft IIS has declined to just 3.6%, reflecting a broader shift away from proprietary solutions toward open-source alternatives. These figures, derived from surveys of millions of websites, underscore NGINX's dominance in high-traffic environments and the rising role of cloud-based proxies like Cloudflare in handling global traffic.

Several factors are driving these shifts in market share. Cloud migration has significantly boosted adoption of NGINX and modern proxies like Envoy, which excel in containerized and microservices architectures common in cloud-native deployments; for instance, a 2025 survey indicated that 65% of new application deployments favor NGINX due to its efficiency in scalable cloud setups. Security concerns have also propelled servers with built-in automation, such as Caddy, which automatically provisions and renews TLS certificates, reducing misconfiguration risks in an era of rising cyber threats.

Emerging trends point to further evolution beyond traditional web servers. There is a notable shift toward API gateways like Kong, which integrate authentication, rate limiting, and traffic analytics for microservice ecosystems, with the API management market projected to grow at a 24% CAGR through 2030. In the Web3 space, decentralized servers using IPFS gateways are gaining traction for content distribution without central points of failure, enabling resilient applications in blockchain-based ecosystems. Additionally, sustainability metrics are influencing choices, with green hosting providers emphasizing renewable energy and carbon offsetting to meet regulatory and consumer demands for eco-friendly infrastructure.

Looking ahead, serverless architectures are forecast to expand significantly, with the market expected to grow from $26.5 billion in 2025 to $76.9 billion by 2030 at a 23.7% CAGR, while traditional on-premises deployments continue to decline amid widespread cloud adoption. Edge computing is also expected to grow substantially to meet low-latency needs in IoT and real-time applications.
