Cross-site tracking cookies have a bleak future but can still cause privacy woes for unwary users
For many years, privacy advocates have been sounding the alarm on the use of cookies to track, profile, and serve personalized ads to web users. The discussion has been especially acute over cookies used for cross-site tracking, in which a website leaks or offers visitor data to third-party services included in the site.
In response, some of the major web browser vendors stepped up their efforts in the past two years to offer improved or new options to block third-party cookies. In 2020, Apple updated Intelligent Tracking Prevention in Safari and, in 2021, Mozilla rolled out Total Cookie Protection in Firefox to clamp down on tracking via third-party cookies.
Google has gone as far as promising to disable third-party cookies in Chrome, but not until a privacy-preserving alternative – currently being explored under the Privacy Sandbox initiative – is developed for businesses in need of advertising and analytics services.
However, all of this effort put into blocking third-party cookies may be for naught if the user fails to audit the settings for their browser of choice. A freshly installed web browser may not be blocking third-party cookies by default. A notable exception is Firefox for desktop, which has Total Cookie Protection turned on by default as of June 2022.
To better understand the concerns around cookies, we will take a brief look at Hypertext Transfer Protocol (HTTP) header fields, and then dive deep into what cookies look like, how web browsers handle them, and some of the security and privacy implications of their use.
Why do websites use cookies anyway?
Websites use HTTP to serve up web pages requested by visitors. Using this protocol, a client – for example, a web browser like Google Chrome – sends an HTTP request to a server and the server returns an HTTP response. Note that in this article we use “HTTP” to mean HTTP or HTTPS.
HTTP is a stateless protocol, meaning a server can process each request without depending on any other request. However, by using cookies, servers can maintain state – they can identify multiple requests as coming from the same source across page reloads, navigations, browser restarts, and even third-party sites. This was the rationale behind the introduction of cookies.
What are HTTP header fields?
Without getting too bogged down in the details of HTTP, what is most relevant to understand here is that an HTTP request contains header fields that modify or convey information about the request. Let’s consider the following client request:
This request has two header fields: User-Agent and Host.
The User-Agent header field indicates that the client making the request is Chrome version 103 running on a 64-bit Windows 10 machine. Note that the User-Agent can be spoofed. The Host header field indicates the domain, and optionally the connection port, that the request is made to, and is required in all HTTP/1.1 requests; in this case the domain is example.com.
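For illustration, a request like the one described above could be written out and picked apart as follows. This is only a minimal sketch: the exact User-Agent string and the parse_request helper are assumptions made for this example, not a copy of the original request.

```python
# Hypothetical raw HTTP/1.1 request carrying the two header fields discussed above.
# The User-Agent string is an assumed example for Chrome 103 on 64-bit Windows 10.
RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36\r\n"
    "Host: example.com\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw request into its request-line parts and a dict of header fields."""
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        # Each header field has the form "Name: value".
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, target, version, headers


method, target, version, headers = parse_request(RAW_REQUEST)
print(method, target, version)
for name, value in headers.items():
    print(f"{name}: {value}")
```

Because the User-Agent is just text supplied by the client, nothing in this parsing step can verify it, which is why the field is so easy to spoof.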
The example.com server might send a response to the above request that looks like this:
An HTTP response also contains header fields that modify the response, and this particular response even contains message content. Again, the main idea here is that HTTP requests and responses use header fields that affect their processing via the information they deliver, which may include cookies.
What are cookies?
A cookie is a piece of data delivered by a server to a client typically via the Set-Cookie header field in the form of a name=value pair. Let’s redo the HTTP response above, but this time the server will attempt to set a cookie.
In this example cookie, SessionID is the cookie name and 31d4d96e407aad42 is the cookie value.
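As a rough sketch of how a client might pull that name=value pair out of a response, the snippet below uses Python's standard http.cookies module. The response head and the extract_set_cookie helper are assumptions made for this example; only the SessionID cookie comes from the text above.

```python
from http.cookies import SimpleCookie

# Hypothetical response head; only the Set-Cookie line matters for this sketch.
RAW_RESPONSE_HEAD = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Set-Cookie: SessionID=31d4d96e407aad42\r\n"
)


def extract_set_cookie(head):
    """Return the value of the first Set-Cookie header field, or None if absent."""
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "set-cookie":
            return value.strip()
    return None


cookie = SimpleCookie()
cookie.load(extract_set_cookie(RAW_RESPONSE_HEAD))
for name, morsel in cookie.items():
    print(name, "=", morsel.value)
```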
Where are cookies stored?
When a Google Chrome browser running on Windows receives an HTTP response with cookies, it saves the cookies on disk in an SQLite version 3 database called Cookies:
This database contains a table called cookies where the cookie value is encrypted and stored in a column called encrypted_value, along with associated metadata, as can be seen from the other columns in the table:
A partial row from the cookies table might look like this:
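To make the layout more concrete, here is a small sketch that builds a stand-in for such a table in an in-memory SQLite database and reads one row back. It does not touch a real browser profile. Apart from the cookies table and the encrypted_value column named above, the column names and all values are assumptions chosen for illustration.

```python
import sqlite3

# Placeholder bytes standing in for an encrypted cookie value (not real ciphertext).
PLACEHOLDER_CIPHERTEXT = b"\x00placeholder-ciphertext"

conn = sqlite3.connect(":memory:")
# Simplified, assumed subset of columns; a real Cookies database has more.
conn.execute(
    """
    CREATE TABLE cookies (
        host_key        TEXT,
        name            TEXT,
        encrypted_value BLOB,
        path            TEXT,
        is_secure       INTEGER,
        is_httponly     INTEGER
    )
    """
)
conn.execute(
    "INSERT INTO cookies VALUES (?, ?, ?, ?, ?, ?)",
    ("example.com", "SessionID", PLACEHOLDER_CIPHERTEXT, "/", 1, 1),
)

# Read the metadata back; the value itself stays an opaque blob.
rows = conn.execute(
    "SELECT host_key, name, length(encrypted_value), path "
    "FROM cookies WHERE host_key = ?",
    ("example.com",),
).fetchall()
for host_key, name, value_length, path in rows:
    print(host_key, name, f"{value_length} encrypted bytes", path)
```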
Tools that attempt to access the Cookies database and decrypt cookie values can be detected by ESET products’ Real-time file system protection. For example, this Python script available on GitHub is detected as Python/PSW.Stealer.AD:
However, the Chrome browser allows you to view the decrypted cookie value in Chrome DevTools:
Even though it is possible to view the decrypted cookie value in Chrome DevTools, the value will likely make little sense because it may either be a unique, random value (for example, a session identifier) or contain data that has been further encrypted and signed by the issuing server, and often encoded in some “text-safe” way such as base64.
Whatever the data stored in the cookie, the length of the name=value cookie pair cannot exceed four kilobytes. This is probably where the popular description of cookies storing “small” bits or pieces of data originates.
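A size check along these lines can be sketched in a few lines of code. The helper names are hypothetical, and counting the UTF-8 bytes of the name plus the value (without the = sign) is an assumption about how the limit is measured; browsers may differ in the details.

```python
MAX_PAIR_BYTES = 4096  # four kilobytes, the limit described above


def pair_size(name, value):
    """Size of the cookie pair in bytes, counting name and value (assumed convention)."""
    return len(name.encode("utf-8")) + len(value.encode("utf-8"))


def fits_limit(name, value):
    """Return True if the pair stays within the four-kilobyte limit."""
    return pair_size(name, value) <= MAX_PAIR_BYTES


print(fits_limit("SessionID", "31d4d96e407aad42"))  # small session identifier
print(fits_limit("big", "x" * 4096))                # value alone already fills the limit
```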
Returning cookies to the server
Once a cookie is set, future client requests to the server that set the cookie may include the cookie in a Cookie header field. Let’s redo the HTTP request above, but this time include the previously set cookie:
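As a sketch of that round trip, the client below serializes its stored cookie into a Cookie header field and the server parses it back out. The stored dictionary and the cookie_header helper are assumptions made for this example.

```python
from http.cookies import SimpleCookie

# Cookies the client previously received from example.com (assumed in-memory store).
stored = {"SessionID": "31d4d96e407aad42"}


def cookie_header(jar):
    """Serialize stored cookies as the value of a Cookie header field."""
    return "; ".join(f"{name}={value}" for name, value in jar.items())


# Client side: the request now carries the cookie back to the server.
request = (
    "GET / HTTP/1.1\r\n"
    "Host: example.com\r\n"
    f"Cookie: {cookie_header(stored)}\r\n"
    "\r\n"
)
print(request)

# Server side: parse the Cookie header value to recognize the returning client.
received = SimpleCookie()
received.load(cookie_header(stored))
print(received["SessionID"].value)
```

Note that only the name=value pair travels back to the server; attributes that accompanied the cookie when it was set are not included in the Cookie header field.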
One of the critical points impacting privacy and security on the web is the client’s decision logic about whether to include cookies in an HTTP request to the originating server. This largely boils down to whether the request is being initiated in a first-party context on the site that set the cookie or in a third-party context on a different site that includes resources from the site that set the cookie.
Next, let’s take a look at how cookie security and privacy features affect the client’s decision to return cookies.
Cookie security
Let’s say I log into my account on a website. I expect the server to remember that I am logged in. So the server sends a cookie after I authenticate. As long as the client returns that cookie to the server in subsequent requests, the server knows I am logged in and there is no need to reauthenticate with every request.
Now, imagine that an attacker somehow steals that cookie, perhaps via malware delivered by email. Possessing that stolen cookie is nearly as good as having my authentication credentials because the server associates the use of that cookie with my authenticated self.
To mitigate the dangers from such cookie theft, the server can implement a few measures.
First, this particular cookie can be set to expire after a short period of inactivity. After its expiry, a stolen cookie becomes useless to a thief because the account is effectively logged out.
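One way a server might enforce such an inactivity timeout is sketched below. The SessionStore class and the 15-minute timeout are assumptions for illustration, not a description of any particular framework; timestamps are passed in explicitly so the behavior is easy to follow.

```python
IDLE_TIMEOUT_SECONDS = 15 * 60  # assumed timeout; pick a value that suits the service


class SessionStore:
    """Tracks when each session was last used and expires idle ones."""

    def __init__(self, idle_timeout=IDLE_TIMEOUT_SECONDS):
        self.idle_timeout = idle_timeout
        self._last_seen = {}

    def create(self, session_id, now):
        """Register a new session at time `now` (seconds)."""
        self._last_seen[session_id] = now

    def is_valid(self, session_id, now):
        """Return True and refresh the timer if the session is still active."""
        last = self._last_seen.get(session_id)
        if last is None:
            return False
        if now - last > self.idle_timeout:
            # Idle for too long: forget the session so a stolen cookie is useless.
            del self._last_seen[session_id]
            return False
        self._last_seen[session_id] = now
        return True


store = SessionStore()
store.create("31d4d96e407aad42", now=0)
print(store.is_valid("31d4d96e407aad42", now=600))   # used within the timeout
print(store.is_valid("31d4d96e407aad42", now=2400))  # idle for 30 minutes
```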
Second, the server can require any critical actions, such as resetting the account password or, say, transferring more than a nominal amount in a banking application, to be confirmed with the current password or some other mechanism like a verification code. A cookie thief should not be able to reset my password or empty my bank account by having the cookie alone.
Finally, the server can set this cookie with as many attributes for more stringent security as appropriate for the cookie’s purpose. This means using the following attributes:
- Secure, which instructs clients to not include the cookie in unencrypted HTTP requests [this is a mitigation against adversary-in-the-middle (AitM) attacks];
- HttpOnly, which instructs clients to prevent non-HTTP APIs like JavaScript from accessing the cookie [this is a mitigation against cross-site scripting (XSS) attacks];
- SameSite=Strict, which instructs clients to include the cookie only in requests to domains that match the current site displayed in the browser’s address bar [this is a mitigation against cross-site request forgery (CSRF) attacks];
- Path=/, which instructs clients to include the cookie in requests to any path of the domain. In combination with the next point in this list, the cookie can be considered as “locked” to the domain; and
- no Domain attribute, in order to prevent the cookie from being included in requests to subdomains of the host that set the cookie. For example, a cookie set by google.com should not be sent to accounts.google.com.
Attempting to set such a fortified cookie would look like this:
Set-Cookie: SessionID=31d4d96e407aad42; Secure; HttpOnly; SameSite=Strict; Path=/
Here, the attributes that follow the first name=value pair are also part of the cookie.
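On the server side, a header value like this can be assembled with Python's standard http.cookies module, as in the sketch below (the samesite key requires Python 3.8 or later). The module may emit the attributes in a different order than shown above, which does not change their meaning.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["SessionID"] = "31d4d96e407aad42"

morsel = cookie["SessionID"]
morsel["secure"] = True        # never send over unencrypted HTTP
morsel["httponly"] = True      # hide from JavaScript
morsel["samesite"] = "Strict"  # send only in same-site requests
morsel["path"] = "/"           # valid for every path on the host
# Deliberately no morsel["domain"], so the cookie stays tied to the host that set it.

header_value = morsel.OutputString()
print("Set-Cookie:", header_value)
```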
Taking further measures to protect a site against AitM, XSS, and CSRF attacks also contributes to the security of cookies and the services they help provide.
Of course, cookies have more uses than handling logged-in users. They can also be used to keep items in a shopping cart, remember user preferences, and track user behavior.
First-party cookies vs. third-party cookies
Tracking via cookies can happen in both first-party and third-party contexts. Nowadays, tracking via first-party cookies is par for the course, provided it is disclosed as required by privacy laws, and little can be done against it. One option is to block all cookies, which can break websites. Another is to limit tracking by browsing in private or incognito mode, so that you appear as a new visitor each time you visit the site after opening a new window or tab and starting a new browser session.
But what exactly is a first-party cookie? Let’s use Google as an example. If you open https://google.com in your web browser, then all the cookies set by the google.com server and included in your client (browser) requests to google.com are considered first-party cookies. An easy way to check this is to look for cookies with a domain attribute value of google.com as these are a match for the domain displayed in the browser’s address bar.
Chrome DevTools has a Filters toolbar to expedite finding requests by their domain property and a Cookies tab to view the cookies sent with each request:
And what is a third-party cookie? If you visit a non-Google site like welivesecurity.com that triggers requests to google.com – perhaps the web page has an embedded YouTube video that loads a script hosted on google.com – the cookies included in these requests are considered third-party. Again, an easy way to check this is to look for cookies with a domain attribute value of google.com, as these are not a match for the domain displayed in the browser’s address bar:
Notice how few cookies are returned to google.com when visiting this WeLiveSecurity article compared to the horde of cookies that are returned when directly on google.com. This is due to the cookie’s SameSite attribute. In a third-party context, only cookies that are set with both the SameSite=None and Secure attributes may be returned.
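The rule described above can be sketched as a small decision function. This is a deliberately simplified model, not how any browser implements it: the Cookie class, the is_returned function, and the EXAMPLE_PREF cookie are hypothetical, sites are compared as plain strings, and details such as top-level navigations or a user setting that blocks third-party cookies are ignored.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    name: str
    site: str        # site that set the cookie, e.g. "google.com"
    same_site: str   # "Strict", "Lax", or "None"
    secure: bool


def is_returned(cookie, request_site, top_level_site):
    """Decide whether the client includes the cookie in a request (simplified)."""
    if cookie.site != request_site:
        return False  # cookies only go back to the site that set them
    if request_site == top_level_site:
        return True   # first-party context
    # Third-party context: only SameSite=None; Secure cookies are eligible.
    return cookie.same_site == "None" and cookie.secure


nid = Cookie("NID", "google.com", same_site="None", secure=True)
pref = Cookie("EXAMPLE_PREF", "google.com", same_site="Lax", secure=True)  # hypothetical

for c in (nid, pref):
    first_party = is_returned(c, "google.com", "google.com")
    third_party = is_returned(c, "google.com", "welivesecurity.com")
    print(c.name, "first-party:", first_party, "third-party:", third_party)
```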
This is why companies in the business of analytics, advertising, and personalization are strongly interested in SameSite=None; Secure cookies. Google’s NID cookie, for example, is a super tracker that helps:
- remember preferences, such as preferred language, the number of results to show on a search results page, and whether Google’s SafeSearch filter is turned on
- collect analytics on Google Search
- show targeted Google ads in Google services to users that are not signed in
- enable personalized autocomplete as users type search terms in Google Search
Because the NID cookie is reset to expire six months after your last use of a Google service (for example, each time you log in or out of your account), it could last indefinitely – a scary proposition – unless you manually delete it.
Login fingerprinting
To get a stronger idea of the tracking capability of third-party cookies, consider visiting a site that uses a piece of HTML and JavaScript code (hat tip to Robin Linus) to make a specially crafted request to the Google login service after the page has loaded.
Clicking on the Run button below will result in one of two actions. If third-party cookies are enabled in this browser session, the code will display the Google favicon below the Run button and open an alert dialog that says, “You are logged into Google in this browser”. But if third-party cookies are blocked in this browser, the code will not display the favicon below the Run button and will open an alert dialog that says “I don’t know if you are logged into Google”. You can test both actions by refreshing this page between runs.
<img onload="alert('You are logged into Google in this browser')" onerror="alert('I don\'t know if you are logged into Google')" src="https://accounts.google.com/ServiceLogin?continue=https%3A%2F%2Fwww.google.com%2Ffavicon.ico">
Figure 12. Clicking the Run button checks whether you are logged into Google in this browser session
Google uses a cookie called __Host-3PLSID that can be included in requests from a third-party context. If you are logged in, this cookie will be included in the request, making the request successful and thereby leaking your login status to the third-party site.
The same issue applies to PayPal, although multiple runs may lead to PayPal requiring a CAPTCHA to be solved that then prevents login fingerprinting:
<img onload="alert('You are logged into PayPal in this browser')" onerror="alert('I don\'t know if you are logged into PayPal')" src="https://www.paypal.com/signin?returnUri=https%3A%2F%2Ft.paypal.com%2Fts?v=1.6.8">
Figure 13. Clicking the Run button checks whether you are logged into PayPal in this browser session
Nearly all the cookies that paypal.com sets are eligible to be returned in a third-party context. PayPal seems to use at least two cookies called id_token and HaC80bwXscjqZ7KM6VOxULOB534 to identify logged-in users.
Blocking third-party cookies
Login fingerprinting will not work on all sites because it exploits a weakness in how the server has implemented its login mechanism and its handling of redirects, although not every service provider seems to be concerned about this weakness. To avoid being tracked across websites and to prevent possible leaks of your login status, make sure to turn on any settings your browser has for blocking third-party cookies.
The following list describes where to find the third-party cookie settings in a smattering of the most popular web browsers.
Firefox
As we said at the outset, Firefox for desktop has had Total Cookie Protection turned on by default since June 2022. Aside from the blog post we just linked to, this support article provides a more in-depth technical discussion of this feature, including how to troubleshoot sites that might not work properly with the feature enabled. More adventurous users might wish to fine-tune the default settings, found here:
Chrome
The Chrome browser provides the settings for cookies under “Privacy and security”:
Once you have checked the “Block third-party cookies” option, all third-party cookies are blocked – they will not be returned to the server, nor can they be set on the client:
Edge
For the Microsoft Edge browser, follow the numbers in the image below to block third-party cookies:
Safari on iOS
In the settings for Safari on iOS, turn on “Prevent Cross-Site Tracking”:
Third-party browsers on iOS
iPhones have an “Allow Cross-Website Tracking” setting that is available for each third-party browser via the Settings app. Thus, in addition to checking the third-party cookie settings offered by each browser app, make sure this setting is not selected:
Conclusion: Predicting the death of third-party tracking cookies
The noose around third-party cookies for tracking is tightening from at least three points. First, from users who are turning on cookie-blocking technology on their devices and apps. Second, from web browser vendors who are strengthening their default browser settings to limit tracking. Third, from web developers who are using alternative storage mechanisms to handle cross-site resources.
With these growing efforts to undercut online tracking, cross-site tracking cookies sit on a precarious footing for their long-term survival, and we can predict their demise in the not-too-distant future.