Cache-Control Recommendations

Cache-Control is one of the most frequently misunderstood HTTP headers, due to its overlapping and perplexingly-named directives. Confusion around it has led to numerous security incidents, and many configurations across the web contain unsafe or impossible combinations of directives. Furthermore, the interactions between various directives can have surprisingly different behavior depending on your browser.

The objective of this document is to provide a small set of recommendations for developers and system administrators who serve documents over HTTP. Although these recommendations are not necessarily optimal in all cases, they are designed to minimize the risk of setting invalid or dangerous Cache-Control directives.

Recommendations

| Recommendation | Safe for PII | Use Cases | Header Value |
| --- | --- | --- | --- |
| Don't cache (default) | Yes | API calls, direct messages, pages with personal data, anything you're unsure about | max-age=0, must-revalidate, no-cache, no-store, private |
| Static, versioned resources | No | Versioned files (such as JavaScript bundles, CSS bundles, and images), commonly with names such as loader.0a168275.js | max-age=n, immutable |
| Infrequently changing public resources, or low-risk authenticated resources | No | Images, avatars, background images, and fonts | max-age=n |

Don't cache (default): max-age=0, must-revalidate, no-cache, no-store, private

When you're unsure, the above is the safest possible set of directives for Cache-Control. It instructs browsers, proxies, and other caching systems to not cache the contents of the response. Although it can have a significant performance impact if used on frequently-accessed public resources, it is a safe state that prevents the caching of any information.

It may seem that using no-store alone should stop all caching, but it only prevents the caching of data to permanent storage. Many browsers will still cache these resources in memory, even if they don't write them to disk. This can cause issues on shared systems, where sensitive information may linger, such as browsers maintaining cached documents for logged out users.

Although no-store may seem sufficient to instruct content delivery networks (CDNs) to not cache private data, many CDNs ignore it to varying degrees. Adding private in combination with the above directives is sufficient to disable caching both for CDNs and other middleboxes.

Static, versioned resources: max-age=n, immutable

If you have versioned resources such as JavaScript and CSS bundles, this instructs browsers (and CDNs) to cache the resources for n seconds, while not purging their caches even when intentionally refreshing. This maximizes performance, while minimizing the amount of complexity that needs to get pushed further downstream (e.g. service workers). Care should be taken such that this combination of directives isn't used on private or mutable resources, as the only way to "bust" the cache is to use an updated source document that refers to new URLs.

The value to use for n depends upon the application, and is ideally set to a bit longer than the expected document lifetime. One year (31536000) is a reasonable value if you're unsure, but you might want to use as low as a week (604800) for resources that you want the browser to purge faster.

Infrequently changing public resources or low-risk authenticated resources: max-age=n

If you have public resources that are likely to change, simply set max-age to a number of seconds (n) that makes sense for your application. Using max-age alone still allows user agents to use stale resources in some circumstances, such as when there is poor connectivity.

There is no need to add must-revalidate outside of the unlikely circumstance where the resource contains information that must be reloaded if the resource is stale.
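As a minimal sketch, the three recommendations above could be wrapped in a small helper so that the safe value is what you get when you're unsure. The function name and the kind labels ("versioned", "public") are hypothetical and chosen only for illustration; they do not come from any library.

```python
# Hypothetical helper; the names below are illustrative, not from a library.
ONE_WEEK = 7 * 24 * 60 * 60      # 604800 seconds
ONE_YEAR = 365 * 24 * 60 * 60    # 31536000 seconds

DONT_CACHE = "max-age=0, must-revalidate, no-cache, no-store, private"


def cache_control(kind: str = "default", max_age: int = ONE_YEAR) -> str:
    """Return a Cache-Control value for one of the three recommendations."""
    if max_age < 0:
        raise ValueError("max_age must be non-negative")
    if kind == "versioned":      # static, versioned resources
        return f"max-age={max_age}, immutable"
    if kind == "public":         # infrequently changing public resources
        return f"max-age={max_age}"
    # Anything else, including a mistyped kind, falls back to the safe default.
    return DONT_CACHE
```

Falling back to the don't-cache value for unknown input is a deliberate choice here: a typo costs performance rather than leaking a private document into a cache.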

Directives

For brevity, this only covers the most common directives used inside Cache-Control. If you are looking for additional information, the MDN article on Cache-Control is pretty exhaustive. Note that its recommendations differ from the recommendations in this document.

max-age=n (and s-maxage=n)

  • instructs the user agent to cache a resource for n seconds, after which time it is considered "stale"
  • s-maxage works the same as max-age, but only applies to intermediary systems such as CDNs

no-store

  • tells user agents and intermediaries not to cache anything at all in permanent storage, but note that some browsers will continue to cache in memory

no-cache

  • contrary to everything you would think, does not tell browsers not to cache, but instead forces them to check to see if the resource has been updated via ETag or Last-Modified
  • essentially the same as max-age=0, must-revalidate

must-revalidate

  • forces a validation when cache is stale – this can mean that browsers will fail to use a cached resource if it is stale but the site is down
  • generally only useful for things like HTML with time-specific or transactional data inside
  • if max-age is set, must-revalidate doesn't do anything until it expires

immutable

  • indicates that the response body will never change
  • when combined with a large max-age, instructs the browser to not check to see if it's still valid, even when the user purposefully chooses to refresh their browser

public

  • indicates that even normally non-cacheable responses (typically those requiring Authorization) can be cached on public systems, such as CDNs and proxies
  • recommended to not use unless you're certain, as it's probably better to waste bytes than to make the mistake of having a private document get cached on a CDN

private

  • indicates that caching can happen only in private browser (or client) caches, not on CDNs
  • note that this wording can be deceiving, as “private” documents are frequently cached on CDNs, with high-entropy URLs
  • documents behind authentication are an example of a good target for the private directive

stale-while-revalidate=n

  • instructs browsers to use cached resources which have been stale for less than n seconds, while also firing off an asynchronous request to refresh the cache so that the resource is fresh on next use
  • great for services where some amount of staleness is acceptable (e.g. weather forecasts, profile images, etc.)
  • can provide a decent performance boost, as long as you're careful to avoid any issues where you require multiple resources to be fresh in a synchronized manner
  • browser support is still limited, so if you decrease max-age to compensate, note that it will affect browsers that don't yet support stale-while-revalidate
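When auditing a configuration, it can help to split a Cache-Control value into its individual directives before reasoning about how they interact. The following is a rough sketch with a hypothetical function name. It assumes a simple comma-separated value and does not handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive, so normalize to lowercase.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives
```

For example, `parse_cache_control("max-age=604800, Must-Revalidate")` yields a dictionary with the keys `max-age` (argument `"604800"`) and `must-revalidate` (no argument).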

Common anti-patterns and pitfalls

Surveys of Cache-Control across the internet have identified numerous anti-patterns in broad usage. This list is not meant to be exhaustive, but simply to demonstrate how complex and sometimes misleading the Cache-Control header can be.

  • max-age=604800, must-revalidate

While there are times that max-age and must-revalidate are useful in combination, for the most part this is saying that you can cache a file but then must immediately distrust it afterwards even if the hosting server is down. Instead use max-age=604800, which says to cache it for a week while still allowing the use of a stale version if the resource is unavailable.

  • max-age=604800, must-revalidate, no-cache

no-cache tells user agents that they must check to see if a resource is unmodified via ETag and/or Last-Modified with each request, and so neither max-age=604800 nor must-revalidate does anything.

  • pre-check=0, post-check=0

You still see these directives appearing in Cache-Control responses, as part of some long-treasured lore for controlling how Internet Explorer caches. But these directives have never worked, and you're wasting precious bytes [1] by continuing to send them.

  • Expires: Fri, 09 April 2021 12:00:00 GMT

While the HTTP Expires header works the same way as max-age in theory, the complexity of its date format means that it is extremely easy to make a minor error that looks valid, but where browsers treat it as max-age=0. As a result, it should be avoided in favor of the far simpler max-age directive.
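The example above shows how easy such an error is to miss: it spells the month out in full, whereas the preferred HTTP date format uses three-letter month abbreviations. If you must emit Expires, a sketch like the following generates the date with Python's standard library instead of writing it by hand. The function names are hypothetical, and the pattern checks only the shape of the string, not whether the weekday matches the date.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# Strict pattern for the preferred HTTP date format (shape only).
HTTP_DATE = re.compile(
    r"^(Mon|Tue|Wed|Thu|Fri|Sat|Sun), \d{2} "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"\d{4} \d{2}:\d{2}:\d{2} GMT$"
)


def looks_like_http_date(value: str) -> bool:
    """Return True if the value has the shape of a preferred-format HTTP date."""
    return HTTP_DATE.match(value) is not None


def http_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP date in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)
```

Here `looks_like_http_date("Fri, 09 April 2021 12:00:00 GMT")` is false, while the string produced by `http_date` for the same moment passes the check.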

  • Pragma: no-cache

Not only is the behavior of Pragma: no-cache largely undefined, but HTTP/1.0 client compatibility hasn't been necessary for about 20 years.
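Some of the Cache-Control anti-patterns above are mechanical enough to flag automatically. The sketch below is a hypothetical linter, not part of any existing tool, and it covers only the three Cache-Control cases listed in this section. Note that it deliberately does not flag max-age=0 combined with no-cache, since that combination is part of the recommended don't-cache value.

```python
def lint_cache_control(value: str) -> list:
    """Return warnings for a few well-known Cache-Control anti-patterns."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name.strip():
            directives[name.strip().lower()] = arg.strip()

    warnings = []
    max_age = directives.get("max-age", "")
    long_lived = max_age.isdigit() and int(max_age) > 0

    if long_lived and "no-cache" in directives:
        warnings.append("no-cache makes the non-zero max-age pointless")
    elif long_lived and "must-revalidate" in directives:
        warnings.append("must-revalidate forbids using a stale copy when the server is down")

    for legacy in ("pre-check", "post-check"):
        if legacy in directives:
            warnings.append(f"{legacy} has never worked; remove it")
    return warnings
```

A plain `max-age=604800` produces no warnings, while `max-age=604800, must-revalidate, no-cache` produces one.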

Glossary

  • fresh — a resource that was last validated less than max-age seconds ago
  • immutable — a resource that never changes, as opposed to mutable
  • stale — the opposite of fresh, a resource that was last validated more than max-age seconds ago
  • user agent — a user's browser, mobile client, etc.
  • validated — the user agent requested a resource from a server, and the server either provided an up-to-date resource or indicated that it hasn't changed from the last request
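The fresh and stale definitions above, together with the earlier description of stale-while-revalidate, amount to a simple comparison. Here is a minimal sketch with a hypothetical function name. Treating the exact boundary (age equal to max-age) as stale is an assumption, since the glossary only defines the "less than" and "more than" cases.

```python
def classify(age: int, max_age: int, stale_while_revalidate: int = 0) -> str:
    """Classify a cached response by the seconds since it was last validated."""
    if age < max_age:
        return "fresh"
    # Stale, but for less than the stale-while-revalidate window.
    if age - max_age < stale_while_revalidate:
        return "stale, usable while revalidating"
    return "stale"
```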

Footnotes

  1. Technically not true thanks to HTTP/2 header compression, but don't send them regardless.

[Category: Security] [Permalink]


Analysis of the Alexa Top 1M sites (April 2019)

Prior to the release of the Mozilla Observatory in June of 2016, I ran a scan of the Alexa Top 1M websites. Despite being available for years, the usage rates of modern defensive security technologies were frustratingly low. A lack of tooling combined with poor and scattered documentation had led to minimal awareness around countermeasures such as Content Security Policy (CSP), HTTP Strict Transport Security (HSTS), and Subresource Integrity (SRI).

Since then, a number of additional assessments have been done, including in October 2016, June 2017, and February 2018. All three surveys demonstrated clear and continual improvement in the state of web security. As a year has gone by since the last survey, it seemed like the perfect time to give the world wide web another assessment.

April 2019 Scan

| Technology | February 2018 | April 2019 | % Change (Feb. 2018) | % Change (All-Time [1]) |
| --- | --- | --- | --- | --- |
| Content Security Policy (CSP) | .022% [2] / .112% [3] | .026% [2] / .142% [3] | +18% / +27% | +420% / +1083% |
| Cookies (Secure/HttpOnly) [4] | 8.97% | 10.79% | +20% | +474% |
| - Cookies (SameSite) [4] | | .514% | | |
| Cross-origin Resource Sharing (CORS) [5] | 96.89% | 97.57% | +.70% | +4.0% |
| HTTPS | 54.31% | 71.67% | +32% | +142% |
| HTTP → HTTPS Redirection | 21.46% [6] / 32.82% [7] | 35.92% [6] / 52.15% [7] | +67% / +59% | +610% / +485% |
| Public Key Pinning (HPKP) | 1.07% | 1.73% | +62% | +302% |
| - HPKP Preloaded [8] | 0.70% | 1.73% | +141% | +308% |
| Strict Transport Security (HSTS) [9] | 6.03% | 8.68% | +44% | +396% |
| - HSTS Preloaded [8] | .631% | .570% | -10% | +261% |
| Subresource Integrity (SRI) | 0.182% [11] | 0.770% [11] | +323% | +5033% |
| X-Content-Type-Options (XCTO) | 11.72% | 16.27% | +38% | +163% |
| X-Frame-Options (XFO) [12] | 12.55% | 16.42% | +31% | +140% |
| X-XSS-Protection (XXSSP) [13] | 10.36% | 11.74% | +13% | +133% |
 
Number of sites successfully scanned: 976,431

The overall growth in adoption continues to be encouraging, particularly the rise in HTTPS and in redirections to HTTPS. Overall, an additional 170,000 sites on the Alexa Top 1M now support HTTPS and about 190,000 of the top million websites have decided to do so automatically by redirecting to their HTTPS counterpart.
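The figures in this paragraph can be roughly reproduced from the table. The sketch below uses hypothetical function names and assumes, as an approximation, that both scans covered about the same number of sites, using the 976,431 successful scans reported for April 2019 as the base.

```python
SITES_SCANNED = 976_431  # successful scans in the April 2019 survey


def relative_change(old_pct: float, new_pct: float) -> float:
    """Percent change between two adoption rates (the '% Change' columns)."""
    return (new_pct - old_pct) / old_pct * 100


def additional_sites(old_pct: float, new_pct: float, total: int = SITES_SCANNED) -> int:
    """Approximate number of sites behind a change in adoption rate."""
    return round((new_pct - old_pct) / 100 * total)


https_growth = relative_change(54.31, 71.67)    # about +32%
https_sites = additional_sites(54.31, 71.67)    # just under 170,000 sites
redirects = additional_sites(32.82, 52.15)      # just under 190,000 sites
```

The relative change (+32%) and the percentage-point difference (17.36 points) are different quantities. The site counts come from the latter.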

Subresource Integrity has also seen a sharp increase in uptake, as more and more libraries and content delivery networks work to make its usage a simple copy-and-paste operation. We've also seen X-Content-Type-Options gain significantly increased usage, particularly given that its usage enables cross-origin read blocking and helps protect against side-channel attacks like Meltdown and Spectre.

While the usage of Content Security Policy has continued to grow, it seems to be slowing down a bit. Tools like the Mozilla Laboratory make policy generation a lot easier, but it still remains extremely difficult to retrofit CSP to old and sprawling websites like so many of the top million.

Lastly, whether a result of policy changes in how the HTTP Strict Transport Security preload list is administered or some weird bug in my code, the percentage of the Alexa Top 1M contained in the preload list fell slightly. Oddly enough, of the 20,105 sites that set preload, only 5,540 of them are actually preloaded.

Mozilla Observatory Grading

Progress continues to be made amongst the Alexa Top 1M websites, but the vast majority still do not use Content Security Policy, Strict Transport Security, or Subresource Integrity. When properly used, these technologies can nearly eliminate huge classes of attacks against sites and their users, and so they are given significant weight in Observatory grading.

Here are the overall grade changes over the last year. Please keep in mind that what is being tested now isn't the same as what was being tested three years ago. An A+ in April 2016 was considerably easier to acquire than an A+ is now.

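| Grade | April 2016 | October 2016 | June 2017 | February 2018 | April 2019 | % Change |
| --- | --- | --- | --- | --- | --- | --- |
| A+ | .003% | .008% | .013% | .018% | .028% | +58% |
| A | .006% | .012% | .029% | .011% | .014% | +26% |
| B | .202% | .347% | .622% | 1.08% | 1.48% | +37% |
| C | .321% | .727% | 1.38% | 2.04% | 1.82% | -11% |
| D | 1.87% | 2.82% | 4.51% | 6.12% | 4.62% | -24% |
| F | 97.60% | 96.09% | 93.45% | 90.73% | 92.03% | +1.43% |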

It's interesting to notice growth at both the top and the bottom. Over the last year, Observatory tests have gotten more difficult, particularly with regard to loading JavaScript over protocol-independent URLs such as this:

<script src="//example.com/script.js">

As a result, the bifurcation in scores likely indicates that more sites have decided to take web security seriously while others at the tail have fallen further into failure.

The Mozilla Observatory recently passed an important milestone of 10 million scans and has now helped over 175,000 websites improve their web security.

That's a big number, but I would love to see it continue to grow. So please share the Observatory so that the web can keep getting safer. Thanks so much!



Footnotes:

  1. Since April 2016
  2. Allows 'unsafe-inline' in neither script-src nor style-src
  3. Allows 'unsafe-inline' in style-src only
  4. Amongst sites that set cookies
  5. Disallows foreign origins from reading the domain's contents within user's context
  6. Redirects from HTTP to HTTPS on the same domain, which allows HSTS to be set
  7. Redirects from HTTP to HTTPS, regardless of the final domain
  8. As listed in the Chromium preload list
  9. max-age set to at least six months
  10. Percentage is of sites that load scripts from a foreign origin
  11. Percentage is of sites that load scripts
  12. CSP frame-ancestors directive is allowed in lieu of an XFO header
  13. Strong CSP policy forbidding 'unsafe-inline' is allowed in lieu of an XXSSP header

[Category: Security] [Permalink]


Lore of MTG - Battlemage

Magic: The Gathering: BattleMage

Released in 1997, Magic: The Gathering - BattleMage was a real-time strategy game by Acclaim Entertainment. Published for the PlayStation and PC, its gameplay bears little resemblance to the Magic we know today. Nevertheless, it is filled with an incredible amount of lore from early Magic history.

Due to its age and rarity, as well as the storyline's many branching paths, this lore has long been considered lost to the Vorthos community.

Given Magic’s return to Dominaria, and BattleMage's significance in cards such as Time of Ice, I thought it best to crawl through BattleMage's code to extract the lore contained within.

I do hope you enjoy these texts, which are ordered as they appear in the game. Please contact me if you notice any mistakes in the editing. Thanks!

Stories

Geography

[Category: Magic] [Permalink]


Analysis of the Alexa Top 1M sites (Feb 2018)

Prior to the release of the Mozilla Observatory in June of 2016, I ran a scan of the Alexa Top 1M websites. Despite being available for years, the usage rates of modern defensive security technologies were frustratingly low. A lack of tooling combined with poor and scattered documentation had led to minimal awareness around countermeasures such as Content Security Policy (CSP), HTTP Strict Transport Security (HSTS), and Subresource Integrity (SRI).

Since then, a number of additional assessments have been done, including in October 2016 and June 2017. Both of those surveys demonstrated clear and continual improvement in the state of internet security. But now that tools like the Mozilla Observatory, securityheaders.io and Hardenize have become more commonplace, has the excitement for improvement been tempered?

February 2018 Scan

| Technology | June 2017 | February 2018 | % Change (June 2017) | % Change (All-Time [1]) |
| --- | --- | --- | --- | --- |
| Content Security Policy (CSP) | .018% [2] / .043% [3] | .022% [2] / .112% [3] | +22% / +161% | +340% / +833% |
| Cookies (Secure/HttpOnly) [4] | 6.50% | 8.97% | +38% | +139% |
| Cross-origin Resource Sharing (CORS) [5] | 96.55% | 96.89% | +.35% | +3.3% |
| HTTPS | 45.80% | 54.31% | +19% | +83% |
| HTTP → HTTPS Redirection | 14.38% [6] / 22.88% [7] | 21.46% [6] / 32.82% [7] | +49% / +43% | +324% / +268% |
| Public Key Pinning (HPKP) | 0.71% | 1.07% | +51% | +148% |
| - HPKP Preloaded [8] | 0.43% | 0.70% | +63% | +71% |
| Strict Transport Security (HSTS) [9] | 4.37% | 6.03% | +38% | +245% |
| - HSTS Preloaded [8] | .337% | .631% | +87% | +299% |
| Subresource Integrity (SRI) | 0.113% [10] | 0.182% [11] | +61% | +1113% |
| X-Content-Type-Options (XCTO) | 9.41% | 11.72% | +21% | +89% |
| X-Frame-Options (XFO) [12] | 10.98% | 12.55% | +14% | +84% |
| X-XSS-Protection (XXSSP) [13] | 8.12% | 10.36% | +28% | +106% |

Improvement across the web appears to be continuing at a steady rate. Although a 19% increase in the number of sites that support HTTPS might seem small, the absolute numbers are quite large — it represents over 83,000 websites, a slight slowdown from the previous survey's 119,000 jump, but still a great sign of progress in encrypting the web's long tail.

Not only that, but an additional 97,000 of the top websites have chosen to be HTTPS by default, with another 16,000 of them forbidding any HTTP access at all through the use of HTTP Strict Transport Security (HSTS). Also notable is the jump in websites that have chosen to opt into being preloaded in major web browsers, via a process known as HSTS preloading. Until browsers switch to HTTPS by default, HSTS preloading is the best method for solving the trust-on-first-use problem in HSTS.

Content Security Policy (CSP) — one of the most important recent advances due to its ability to prevent cross-site scripting (XSS) attacks — continues to see strong growth. Growth is faster in policies that ignore inline stylesheets (CSS), perhaps reflecting the difficulties that many sites have with separating their presentation from their content. Nevertheless, improvements brought about by specification additions such as 'strict-dynamic' and policy generators such as the Mozilla Laboratory continue to push forward CSP adoption.

Mozilla Observatory Grading

Despite this progress, the vast majority of top websites around the web continue not to use Content Security Policy, Strict Transport Security, or Subresource Integrity. As these technologies — when properly used — can nearly eliminate huge classes of attacks against sites and their users, they are given a significant amount of weight in Observatory scans.

As a result of the low usage rates of these technologies, top websites typically receive failing grades from the Observatory. But despite new tests and harsher grading, I continue to see improvements across the board:

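| Grade | April 2016 | October 2016 | June 2017 | February 2018 | % Change |
| --- | --- | --- | --- | --- | --- |
| A+ | .003% | .008% | .013% | .018% | +38% |
| A | .006% | .012% | .029% | .011% | -62% |
| B | .202% | .347% | .622% | 1.08% | +74% |
| C | .321% | .727% | 1.38% | 2.04% | +48% |
| D | 1.87% | 2.82% | 4.51% | 6.12% | +36% |
| F | 97.60% | 96.09% | 93.45% | 90.73% | -2.9% |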

As 976,930 scans were successfully completed in the last survey, a decrease in failing grades by 2.9% implies that over 27,000 of the top sites in the world have improved from a failing grade in the last eight months alone. Note that the drop in A grades is due to a recent change where extra credit points can no longer be used to move up to an A grade.

Thus far, over 140,000 websites around the web have directly used the Mozilla Observatory to improve their grades, indicated by making an improvement to their website after an initial scan. Of these 140,000 websites, over 2,800 have improved all the way from a failing grade to an A or A+ grade.

When I first built the Observatory at Mozilla, I had never imagined that it would see such widespread use. 6.6M scans across 2.3M unique domains later, it seems to have made a significant difference across the internet. I couldn't have done it without the support of Mozilla and the security researchers who have helped to improve it.

Please share the Mozilla Observatory so that the web can continue to see improvements over the years to come!



Footnotes:

  1. Since April 2016
  2. Allows 'unsafe-inline' in neither script-src nor style-src
  3. Allows 'unsafe-inline' in style-src only
  4. Amongst sites that set cookies
  5. Disallows foreign origins from reading the domain's contents within user's context
  6. Redirects from HTTP to HTTPS on the same domain, which allows HSTS to be set
  7. Redirects from HTTP to HTTPS, regardless of the final domain
  8. As listed in the Chromium preload list
  9. max-age set to at least six months
  10. Percentage is of sites that load scripts from a foreign origin
  11. Percentage is of sites that load scripts
  12. CSP frame-ancestors directive is allowed in lieu of an XFO header
  13. Strong CSP policy forbidding 'unsafe-inline' is allowed in lieu of an XXSSP header

[Category: Security] [Permalink]


HTTP Status Code Handling

I was recently writing some code for the Mozilla Observatory to store and interact with the HTTP status codes. As part of my code, I wanted to ensure that I would only store these status codes if they were an integer as per the HTTP/1.1 specification:

   The status-code element is a three-digit integer code giving the
   result of the attempt to understand and satisfy the request.
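A minimal sketch of such a check is shown below. It is a hypothetical helper written for illustration, not the Observatory's actual code, and it assumes that a leading zero should also be rejected.

```python
def parse_status_code(raw):
    """Return the status code as an int, or None if it is not a three-digit integer."""
    text = str(raw).strip()
    # isascii() rejects non-ASCII digits that str.isdigit() would accept.
    if text.isascii() and text.isdigit() and len(text) == 3 and text[0] != "0":
        return int(text)
    return None
```

Returning None rather than raising lets the caller decide whether an out-of-range value should be stored as missing or rejected outright.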

While it is easy to create test cases for conditions that don't satisfy this requirement, it is somewhat more difficult to determine how third-party libraries will handle HTTP requests that fall outside this constraint. I looked around the internet for websites to help me test weird status codes, but most of them only let me test with the known status codes. As such, I decided to add arbitrary HTTP status codes to my naughty httpbin fork, called misbehaving.site.

What I discovered is that the various browser manufacturers have wildly different behavior in how they handle unknown HTTP status codes. Here is what the HTTP specification says that browsers should do:

   HTTP status codes are extensible.  HTTP clients are not required to
   understand the meaning of all registered status codes, though such
   understanding is obviously desirable.  However, a client MUST
   understand the class of any status code, as indicated by the first
   digit, and treat an unrecognized status code as being equivalent to
   the x00 status code of that class, with the exception that a
   recipient MUST NOT cache a response with an unrecognized status code.
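Read literally, the rule quoted above could be sketched as follows. The function name is hypothetical, "recognized" is approximated by membership in Python's http.HTTPStatus enumeration (whose contents vary slightly between Python versions), and raising an error outside 100-599 is an assumption, since the quoted text does not say what to do there.

```python
from http import HTTPStatus


def effective_status(code: int):
    """Return (status to act on, recognized). Unrecognized responses must not be cached."""
    try:
        HTTPStatus(code)
        return code, True
    except ValueError:
        pass
    if 100 <= code <= 599:
        # Fall back to the x00 status code of the same class.
        return code // 100 * 100, False
    raise ValueError(f"status code {code} has no defined class")
```

Under this sketch an unknown code such as 299 is handled as a 200, and 471 as a 400, with the second element signalling that the response must not be cached.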

…so what happens in reality?

Chrome

Chrome's behavior is strange, but surprisingly not the strangest of the major browsers:

[Image: Chrome's HTTP status code behavior]

For negative status codes, Chrome always displays HTTP status code 200. For 0, it simply displays Finished instead of the actual status code. It otherwise simply reflects the status code, unless it exceeds 2147483647 (2^31 - 1), in which case it displays 2147483647.

Note that when exceeding 2147483647, it displays this error in the console, despite the page otherwise loading normally:

[Image: Chrome's HTTP status code behavior]
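The Chrome behavior described in this section can be summarized as a small model. This is only a sketch of what was observed, with a hypothetical function name. It is not Chrome's actual implementation.

```python
INT32_MAX = 2**31 - 1  # 2147483647


def chrome_displayed_status(code: int) -> str:
    """Model of the status Chrome displays, per the observations above."""
    if code < 0:
        return "200"
    if code == 0:
        return "Finished"
    # Positive codes are reflected, clamped to the largest signed 32-bit integer.
    return str(min(code, INT32_MAX))
```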

Firefox

It actually took me quite a while to figure out Firefox's behavior. Let's take a look:

[Image: Firefox's HTTP status code behavior]

Status codes in Firefox are modulo 65536 (2^16), unless it works out to 0, in which case it displays status code 200.

This works up to a certain point, when it starts to display different behavior:

[Image: Firefox's HTTP status code behavior]

Note how the status icon (blue dot, yellow triangle, etc.) is dependent on the first digit of the status code, once Firefox has finished interpreting it.
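The modulo behavior described at the start of this section can be modeled in a couple of lines. This sketch uses a hypothetical function name and covers only non-negative codes below the point where Firefox starts behaving differently; that threshold is not stated here, so the model does not attempt to capture it.

```python
def firefox_displayed_status(code: int) -> int:
    """Model of the status Firefox displays for moderately sized codes."""
    wrapped = code % 65536  # 2^16
    # A result of 0 is displayed as 200.
    return 200 if wrapped == 0 else wrapped
```

For instance, a server sending 65940 (65536 + 404) would be shown as 404 under this model.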

Safari

Safari only accepts status codes between 1 and 999. Should the status code fall outside that range, it reflects the entire HTTP request as plaintext, headers and all:

[Image: Safari's HTTP status code behavior]

It also displays this error in the browser console. I'm not sure why, as the output is just JSON and there isn't any script on the page:

[Image: Safari's HTTP status code behavior]

Note if you serve from localhost instead of a remote server, it displays a different error:

[Image: Safari's HTTP status code behavior]

Edge

Not to be left behind, Edge also has some unusual HTTP status code handling:

[Image: Edge's HTTP status code behavior]

For status code 0, it displays (Pending), although the page otherwise loads normally. For negative status codes, it displays them as the status code modulo 4294967296 (2^32). This is unless the status code is less than -4294967295, in which case it displays 1.

For positive status codes, it simply reflects them. This is until the status code reaches 4294967296 or higher, in which case it shows (Pending) and the browser displays this error:

[Image: Edge's HTTP status code behavior]
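Edge's rules from this section can likewise be written down as a small model. As with the other browsers, this is a sketch of the observed behavior under a hypothetical function name, not Edge's implementation.

```python
UINT32_RANGE = 2**32  # 4294967296


def edge_displayed_status(code: int) -> str:
    """Model of the status Edge displays, per the observations above."""
    if code == 0 or code >= UINT32_RANGE:
        return "(Pending)"
    if code < -(UINT32_RANGE - 1):
        return "1"
    # Python's % is non-negative for a positive modulus, so negative codes wrap
    # around and positive codes below 2^32 are returned unchanged.
    return str(code % UINT32_RANGE)
```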

Final words

Those who have been around in computing for a long time are likely familiar with Postel's Law:

Be liberal in what you accept, and conservative in what you send.

While it seems like the neighborly thing to do, it is the bane of those of us who enjoy consistent software behavior. If the specification had simply stated that status codes falling outside 100-599 should be treated as an unrecoverable error, then we wouldn't see the unusual behavior that we see today.

Luckily, while all of the browsers have their own idiosyncrasies, none of them are actually harmful in this case.

If you enjoyed this post and would like to test how browsers handle other quirky HTTP responses, please consider opening an issue or sending a pull request to the misbehaving.site GitHub repository.

[Category: Security] [Permalink]