Caching

Performance

Caching is the temporary storage of web resources such as images, CSS, JavaScript, and HTML at or near the client to avoid repeated work by the origin server. In the web performance context, HTTP caching uses freshness lifetimes and validators so responses can be reused while still fresh or revalidated quickly once stale. Effective caching reduces latency, bandwidth use, and server load, improving user experience and stabilising Core Web Vitals. For images, layered caching across browsers, CDNs, and proxies is central to fast delivery and consistent rendering across devices.

Overview

Caching reuses prior responses instead of refetching or recomputing them. A cache stores a representation identified by its cache key: typically the URL plus the values of any request headers that the response names in Vary, such as Accept. When a subsequent request arrives, the cache serves the stored response if it is still fresh (its age is within max-age). If it is stale but carries a validator (ETag or Last-Modified), the cache can revalidate it cheaply with a conditional request, which yields a 304 Not Modified when the content is unchanged. This avoids full payload transfers and, when a hit is satisfied locally or at the network edge, avoids round-trips to the origin as well.
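
The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete HTTP cache: the names StoredResponse, cache_key, and decide are hypothetical, and the sketch assumes max-age is the only source of freshness.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredResponse:
    body: bytes
    stored_at: float            # seconds since epoch when the response was cached
    max_age: int                # freshness lifetime in seconds
    etag: Optional[str] = None  # validator, if the origin sent one


def cache_key(url: str, request_headers: dict, vary: list) -> tuple:
    """Build a key from the URL plus the request headers named by Vary."""
    lowered = {k.lower(): v for k, v in request_headers.items()}
    names = sorted(name.lower() for name in vary)
    return (url, tuple((name, lowered.get(name, "")) for name in names))


def decide(entry: Optional[StoredResponse], now: float) -> str:
    """Return 'miss', 'fresh', or 'revalidate' for a lookup at time `now`."""
    if entry is None:
        return "miss"
    if now - entry.stored_at < entry.max_age:
        return "fresh"
    # Stale: a validator allows a cheap conditional request (304 if unchanged).
    # Without one, the sketch treats the entry as a miss and refetches in full.
    return "revalidate" if entry.etag else "miss"
```

Headers that are not listed in Vary (User-Agent, for example) do not enter the key, so they cannot fragment the cache.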

Freshness and validation are controlled by HTTP headers. Cache-Control governs lifetimes (max-age, s-maxage) and behaviours (public, private, no-cache, no-store, immutable) for browsers and shared caches. Validators (ETag, Last-Modified) enable conditional GETs, while directives such as stale-while-revalidate and stale-if-error provide resilience by serving a slightly stale object during background refresh or when the origin is unavailable. Vary defines which request headers contribute to variants, ensuring the right response is served for content negotiation such as image formats and device-specific versions.
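
To make the interaction between these directives concrete, the following sketch parses a Cache-Control value and classifies a stored response by age. It is a simplification under stated assumptions: the function names are hypothetical, directive arguments are assumed to be well-formed integers, and details such as the Age header, heuristic freshness, and the field-name form of no-cache are ignored.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header into {directive: argument-or-True}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def freshness_lifetime(directives: dict, shared: bool) -> int:
    """Seconds a response may be reused without revalidation (0 if none)."""
    if "no-store" in directives or "no-cache" in directives:
        return 0
    if shared and "private" in directives:
        return 0
    if shared and "s-maxage" in directives:
        return int(directives["s-maxage"])  # s-maxage overrides max-age in shared caches
    return int(directives.get("max-age", 0))


def classify(age: int, directives: dict, shared: bool) -> str:
    """Classify a stored response of the given age (in seconds)."""
    lifetime = freshness_lifetime(directives, shared)
    if age < lifetime:
        return "fresh"
    swr = int(directives.get("stale-while-revalidate", 0))
    if lifetime > 0 and age < lifetime + swr:
        return "serve-stale-and-refresh"
    return "stale"
```

With public, max-age=60, s-maxage=600, stale-while-revalidate=30, a browser treats the response as fresh for 60 seconds and may serve it stale for a further 30 while refreshing in the background, whereas a shared cache keeps it fresh for 600 seconds.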

In practice, caching is layered. Browsers maintain memory and disk caches subject to eviction policies and size limits. Service workers can implement fine-grained strategies in the client. Upstream, CDNs and reverse proxies cache shared content close to users, offloading origin traffic and smoothing traffic peaks. For image optimisation, this layered approach allows device-appropriate formats (AVIF, WebP), responsive sizes, and DPR-aware variants to be served quickly without recomputation for every request.

Caching layers for images

Browser and service worker

Browsers cache images aggressively when responses carry a positive max-age; because the browser cache is private, the public directive is not required for it to store them. Disk caches typically retain large objects longer; memory caches prefer smaller, recently used items. The browser’s cache key includes the URL and variant-relevant headers, so query strings and Vary can fragment the cache. A service worker can pre-cache hero images, implement stale-while-revalidate logic, or serve offline fallbacks via the Cache Storage API, providing deterministic behaviour that complements default browser caching policies.

CDN and reverse proxy

CDNs cache images at the edge using Cache-Control and s-maxage, often with support for surrogate directives and keys. They can respect Vary: Accept to cache AVIF/WebP/JPEG variants per capability, and Vary on device hints (e.g., DPR, Width) for responsive imaging. Features like stale-while-revalidate, stale-if-error, and soft purge improve resilience and control. Reverse proxies such as Varnish or NGINX serve as a shield cache in front of the origin, reducing origin I/O and CPU when resizing or format-converting images dynamically.
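
Raw Accept values differ between browsers and versions, so keying on the header verbatim can create many variants of the same output. One common mitigation is to collapse Accept into a small set of format buckets before it enters the cache key. The sketch below illustrates the idea; the function names and the key layout are hypothetical, and q-values are ignored for brevity.

```python
def negotiate_format(accept: str) -> str:
    """Collapse an Accept header into one of three image format buckets."""
    offered = {item.split(";")[0].strip().lower() for item in accept.split(",")}
    # Prefer the most efficient format the client advertises.
    for mime, fmt in (("image/avif", "avif"), ("image/webp", "webp")):
        if mime in offered:
            return fmt
    return "jpeg"  # universal fallback


def edge_cache_key(path: str, accept: str) -> str:
    """Key on the negotiated bucket rather than the raw Accept string."""
    return f"{path}#fmt={negotiate_format(accept)}"
```

Every client then maps to at most three cached variants per image, however its Accept header is worded.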

Origin and application layer

Applications may cache image metadata, transformation results, or database lookups in memory stores (e.g., Redis) to avoid reprocessing. When images are generated on-demand, emitting strong validators (content hash ETags) and long-lived shared TTLs allows outer layers to absorb traffic. Pairing immutable, versioned asset URLs with far-future caching and explicit purging or key-based invalidation ensures reproducibility and minimises cache fragmentation across layers.
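
A content-hash validator and a versioned URL can both be derived from the image bytes alone. The following is a minimal sketch; the helper names, the choice of SHA-256, and the digest lengths are illustrative assumptions rather than a prescribed scheme.

```python
import hashlib
from pathlib import PurePosixPath


def content_etag(body: bytes) -> str:
    """Strong ETag derived only from the bytes, so every origin node agrees."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def versioned_path(path: str, body: bytes, digest_len: int = 10) -> str:
    """Insert a short content hash before the extension: a.jpg -> a.<hash>.jpg."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(body).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because the path changes whenever the bytes change, a URL produced this way never needs purging and can safely carry a far-future lifetime.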

Caching risks and cache invalidation

Misconfiguration is the main risk. Overly permissive policies can expose private images via shared caches, while overly restrictive policies (no-store) eliminate performance benefits. Using Vary too broadly can explode the variant space and collapse cache hit ratios. Validators that differ per origin node (e.g., ETags derived from server-specific file metadata rather than from content) cause revalidation to fail and the full response to be resent. When HTML carries no explicit freshness information, caches may apply heuristic freshness and serve outdated metadata. Each of these affects freshness guarantees and can lead to stale, inconsistent, or uncacheable experiences.

Invalidation is notoriously hard because cached copies exist in multiple layers and geographies. Strategies that minimise the need for purges are most reliable: immutable, content-hashed filenames for static images; long-lived TTLs in shared caches; and short, revalidated lifetimes for HTML. When purging is required, prefer targeted methods such as surrogate keys or tag-based invalidation rather than wildcard URL patterns, which can be slow and disruptive at scale. Plan for propagation delays across edge networks when timing a release.
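
Tag-based invalidation can be pictured as a reverse index from tag to cached URLs. The toy in-memory cache below is only a sketch of that idea: the class and method names are hypothetical, and a real CDN exposes this through its own purge API and propagates the purge across many edge nodes.

```python
from collections import defaultdict


class TaggedCache:
    """In-memory cache whose entries can be purged by surrogate key (tag)."""

    def __init__(self) -> None:
        self._entries: dict = {}
        self._urls_by_tag = defaultdict(set)

    def put(self, url: str, body: bytes, tags: list) -> None:
        self._entries[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str):
        return self._entries.get(url)

    def purge_tag(self, tag: str) -> int:
        """Remove every entry carrying `tag`; return how many were removed."""
        urls = self._urls_by_tag.pop(tag, set())
        removed = 0
        for url in urls:
            # An entry may already be gone if another of its tags was purged.
            if self._entries.pop(url, None) is not None:
                removed += 1
        return removed
```

Purging the tag for one product removes all of its image variants in a single operation and leaves unrelated entries untouched, which is the property that makes this approach preferable to wildcard URL patterns.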

For SEO, caching must not impede timely updates of critical HTML and directives. Over-caching robots.txt, sitemaps, or canonical tags can delay crawlers from seeing changes. For images, the risk is lower, but ensure redirects and content negotiation are consistent so that bots and users receive the same formats or correct fallbacks. Signed URLs and authenticated images should be marked private or uncacheable by shared layers to avoid leakage while still allowing browser reuse when appropriate.

Implementation notes

Choose cache keys and lifetimes deliberately. For static, versioned images, emit Cache-Control: public, max-age=31536000, immutable to enable long-lived browser and CDN caching. For dynamically generated or personalised responses, separate them from static assets and mark HTML as short-lived with validators (e.g., Cache-Control: private, no-cache plus strong ETag). Use s-maxage to give shared caches longer lifetimes than browsers when appropriate, and consider stale-while-revalidate/stale-if-error to smooth origin load and tolerate brief outages.
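
One way to keep these choices deliberate is to centralise them in a single mapping from asset class to policy. In the sketch below the class names and the numeric lifetimes for dynamic and authenticated images are illustrative assumptions; only the static and HTML policies are taken directly from the guidance above.

```python
POLICIES = {
    # Versioned, content-hashed assets: cache for a year and never revalidate.
    "versioned-static": "public, max-age=31536000, immutable",
    # HTML: storable by the browser only, revalidated on every use.
    "html": "private, no-cache",
    # Dynamically generated images: short browser TTL, longer shared TTL
    # (hypothetical values), with tolerance for brief origin outages.
    "dynamic-image": (
        "public, max-age=300, s-maxage=86400, "
        "stale-while-revalidate=60, stale-if-error=86400"
    ),
    # Signed or authenticated images: browser reuse only (hypothetical TTL).
    "authenticated-image": "private, max-age=600",
}


def cache_control_for(kind: str) -> str:
    """Return the Cache-Control value for an asset class, failing loudly if unknown."""
    try:
        return POLICIES[kind]
    except KeyError:
        raise ValueError(f"no cache policy defined for {kind!r}") from None
```

Raising on unknown classes avoids silently falling back to a default that might be either too permissive or too restrictive.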

Emit strong validators so revalidation is cheap and accurate. Prefer content hash ETags that are invariant across origin nodes, or a stable Last-Modified timestamp based on the actual asset. Set Vary narrowly to the headers that truly affect the image variant: commonly Accept (for format negotiation) and device hints (e.g., DPR, Width) when serving responsive variants. For compressible responses, ensure Accept-Encoding is part of the cache key (for example via Vary: Accept-Encoding) so that a gzip or br encoded body is never served to a client or proxy that did not ask for it.
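
On the origin side, revalidation amounts to comparing the client's If-None-Match value against the current ETag. The handler below is a minimal sketch: the function names are hypothetical, header names are assumed to arrive in canonical case, and entity tags are assumed not to contain commas.

```python
def _opaque(tag: str) -> str:
    """Strip the weak prefix; If-None-Match uses weak comparison."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(request_headers: dict, body: bytes, etag: str) -> tuple:
    """Return (status, headers, body), honouring If-None-Match."""
    headers = {"ETag": etag}
    raw = request_headers.get("If-None-Match", "")
    candidates = {_opaque(t) for t in raw.split(",") if t.strip()}
    if "*" in candidates or _opaque(etag) in candidates:
        return 304, headers, b""   # unchanged: no payload is resent
    return 200, headers, body      # first request or changed content
```

A 304 carries only headers, so a revalidated image costs one round-trip but no payload transfer.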

Monitor effectiveness with hit ratio, byte hit ratio, TTFB distribution, and origin offload. Static assets with hashed URLs often achieve 90%+ hit ratios in shared caches. Track error budgets for purges and use staged rollouts to minimise cache churn. For image pipelines, normalise transformation parameters to reduce unique variants, and prefer deterministic URLs for the same output so caches can consolidate demand across users and sessions.
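
Hit ratio and byte hit ratio can diverge noticeably for images, because a few large misses may dominate traffic. The sketch below computes both from a list of log records; the (status, bytes) record shape is an assumption for illustration, since real CDN log formats vary.

```python
def cache_metrics(records: list) -> dict:
    """Summarise (status, size_in_bytes) records, where status is 'HIT' or 'MISS'."""
    total = len(records)
    total_bytes = sum(size for _, size in records)
    hits = sum(1 for status, _ in records if status == "HIT")
    hit_bytes = sum(size for status, size in records if status == "HIT")
    return {
        "hit_ratio": hits / total if total else 0.0,
        "byte_hit_ratio": hit_bytes / total_bytes if total_bytes else 0.0,
        "origin_requests": total - hits,  # requests the origin still had to serve
    }
```

For example, three 100-byte hits and one 700-byte miss give a hit ratio of 0.75 but a byte hit ratio of only 0.3, which is why both figures are worth tracking.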

Comparisons

Caching vs compression

Compression reduces the size of data transferred; caching reduces how often data is transferred. They are complementary: a cached object may never traverse the network again during its lifetime, while a compressed but uncached object still incurs latency and bandwidth on every request. Image optimisation typically uses both—modern formats (AVIF/WebP) for size reduction and layered caching for delivery speed and origin offload.

Caching vs preload/prefetch

Preload and prefetch influence when the browser fetches a resource; caching influences whether a fetch is necessary and how it is served. Preloading a resource that is not cacheable wastes bandwidth, while prefetching cacheable resources can warm the cache ahead of use. Effective strategies combine both: preload critical hero images that are cacheable for immediate display, and rely on caching for subsequent navigations and repeat visits.

Caching vs lazy loading

Lazy loading defers when an image is fetched based on viewport visibility; caching determines how quickly the image can be delivered once requested. Lazy loading reduces initial page weight and network contention, while caching accelerates delivery for images that are eventually needed. Using both avoids over-fetching and still ensures that deferred images render fast when scrolled into view.

FAQs

Which headers control browser and CDN caching?

Cache-Control defines lifetimes and scope (public/private), with s-maxage targeting shared caches such as CDNs. ETag and Last-Modified enable conditional requests. Vary specifies which request headers affect variants. For evergreen static assets, combine Cache-Control: public, max-age=31536000, immutable with content-hashed URLs. Many CDNs also support surrogate mechanisms beyond standard HTTP controls, such as Surrogate-Control for CDN-only lifetimes and surrogate keys for more granular invalidation.

How long should images be cached?

Static images with versioned filenames can be cached for up to a year (max-age=31536000, the conventional upper bound) in both browsers and shared caches. Dynamic or frequently changing images should use shorter lifetimes with strong validators to allow quick revalidation. For responsive or format-negotiated variants, keep consistent URLs for identical outputs so that long TTLs remain safe and hit ratios stay high across devices and geographies.
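
Keeping URLs consistent for identical outputs usually means normalising transformation parameters before they reach the cache. The sketch below assumes a hypothetical URL scheme with w (width) and q (quality) query parameters and an illustrative list of allowed widths; unknown parameters are dropped and the remainder are sorted.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

ALLOWED_WIDTHS = (320, 640, 960, 1280, 1920)  # hypothetical breakpoints


def normalise_image_url(url: str) -> str:
    """Map equivalent transformation requests onto one canonical URL."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    out = {}
    if "w" in params:
        requested = int(params["w"])
        # Round up to the next allowed width, capping at the largest.
        out["w"] = next((w for w in ALLOWED_WIDTHS if w >= requested), ALLOWED_WIDTHS[-1])
    if "q" in params:
        out["q"] = min(100, max(1, int(params["q"])))
    query = urlencode(sorted(out.items()))  # stable parameter order
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
```

Requests for w=600 and w=633 then share a single cached 640-pixel variant instead of producing two near-identical objects.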

What is the difference between no-cache and no-store?

no-cache allows a cache to store the response but requires revalidation before reuse. no-store forbids storing the response at all, even in memory. Use no-store for highly sensitive or personalised content that must never be retained, and no-cache when freshness must be confirmed but storage is acceptable to reduce transfer if the content is unchanged (via 304 responses).

Are query-string cache busters still recommended for images?

Appending ?v=123 can work but is less robust than content-hashed filenames. Some intermediaries treat query strings conservatively, which reduces cacheability, and query changes may fragment caches unnecessarily. Hash-based filenames coupled with long TTLs and immutable are more predictable, and they allow precise purging when needed without affecting unrelated assets or variants.

Does caching affect SEO or Core Web Vitals?

Yes. Faster repeat views and edge-served hits reduce TTFB and speed up rendering, which can improve LCP and overall perceived performance. HTML should remain fresh so crawlers see updates promptly; static assets can be cached long. Search engines handle standard HTTP caching correctly, and serving optimised image formats via content negotiation is acceptable when fallbacks are consistent and accessible.

Synonyms

web caching, HTTP caching, browser caching, proxy caching, edge caching