<link rel="stylesheet" href="main.css?hash=sha384-5rcfZgbOPW7..." integrity="sha384-5rcfZgbOPW7..."/>
ETag: "sha384-5rcfZgbOPW7..."
Cache-Control: max-age=31536000, immutable

How do you do version updates? Add a content hash to every file except the root index.html.
Cache everything forever, except for index.html.
To deploy a new version, upload all files, making sure index.html goes last.
Since all hashed paths are unique, the old version continues to be served until the new index.html lands.
No cache invalidation is required since all files have unique paths, except index.html, which was never cached.
You have to be absolutely sure you have proper content hashes for everything: images, CSS, JS. Everything.
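The scheme above amounts to a tiny build step. Here is a minimal sketch; the `hashedName` helper and the truncated sha384 digest are illustrative choices, not a prescribed implementation — any strong digest works:

```typescript
import { createHash } from "node:crypto";

// Derive a content-addressed filename for an asset, e.g. main.css -> main.<hash>.css.
// Any change to the file contents produces a new path, so old versions can be
// cached forever and a new deploy never collides with them.
function hashedName(filename: string, contents: string | Buffer): string {
  const hash = createHash("sha384").update(contents).digest("hex").slice(0, 16);
  const dot = filename.lastIndexOf(".");
  return `${filename.slice(0, dot)}.${hash}${filename.slice(dot)}`;
}
```

Rewriting references inside index.html to point at the hashed names is then the only per-deploy mutation.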
“Incremental Static Regeneration” is also one of the funniest things to come out of this tech cycle.
The build logic to decide which things to rebuild is of course probably the interesting bit, but we don't need all these services... </grey-beard-rant>
Edit: to be less ranty, they are more or less building static sites out of their Next.js codebase, but on-demand updated etc., which is indeed interesting. But none of this needs Cloudflare/hyperscaler tech.
Not sure how many customers/sites they have. Perhaps they don't want to spend CPU regenerating all sites on every deployment? They do describe a content-driven pre-warmer but I'm still unclear why this couldn't be a content-driven static site generator running on some build machine
I have come to conclude it is that way because they focus on optimizing for a demo case that presents well to non-technical stakeholders. Doing one particular thing that looks good at a glance gets the buy-in, and then those who bought in never have to deal with the consequences of the decision once it is time to build something other than the demo.
For example, with CloudFront and S3, you use If-None-Match when uploading to ensure the deploy fails on conflict.
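A minimal sketch of that conditional upload, assuming an S3-compatible endpoint that honors `If-None-Match` on PUT (auth and signing omitted): sending `If-None-Match: *` means "only create, never overwrite", so a concurrent deploy surfaces as a 412 Precondition Failed instead of silently clobbering the object.

```typescript
// Build a conditional PUT request: If-None-Match: * tells the server to
// reject the write with 412 if an object already exists at that key.
function conditionalPut(url: string, body: string): Request {
  return new Request(url, {
    method: "PUT",
    headers: { "If-None-Match": "*" },
    body,
  });
}
```

Since every hashed path should be written at most once, any 412 here signals a real conflict worth failing the deploy over.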
However, it's probably more inexperience than anything. Nobody senior was around to tell our founders that they should go for an SSG architecture when they started /shrug. It's mostly worked out anyways though haha.
I already had HAProxy set up, so I added a stale-while-revalidate-compatible header from HAProxy. Cloudflare handles the rest.
Edit: I am not using vercel. Self hosted using docker on EC2.
It obviously can be done, but it's clearly not the intended solution, which really bothers me.
Now that I have the proper header added by HAProxy, Cloudflare's cache rules for stale-while-revalidate work.
If anyone can reach Cloudflare: please let us forcefully use stale-while-revalidate even when the upstream server says otherwise.
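For reference, adding such a header from HAProxy is a one-liner with the `http-response set-header` directive; the specific max-age and stale-while-revalidate values below are placeholders, not a recommendation:

```
# haproxy.cfg (backend or frontend section): tell downstream caches such as
# Cloudflare they may serve a stale copy for up to a day while revalidating
http-response set-header Cache-Control "public, max-age=60, stale-while-revalidate=86400"
```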
Mintlify powers documentation for tens of thousands of developer sites, serving 72 million monthly page views. Every pageload matters when millions of developers and AI agents depend on your platform for technical information.
We had a problem. Nearly one in four visitors experienced slow cold starts when accessing documentation pages. Our existing Next.js ISR caching solution could not keep up with deployment velocity that kept climbing as our engineering team grew.
We ship code updates multiple times per day, and each deployment invalidated the entire cache across all customer sites. This post walks through how we architected a custom edge caching layer to decouple deployments from cache invalidation, bringing our cache hit rate from 76% to effectively 100%.
We achieved our goal of fully eliminating cold starts and used a veritable smorgasbord of Cloudflare products to get there.

| Component | Purpose |
|---|---|
| Workers | docs-proxy handles requests; revalidation-worker consumes the queue |
| KV | Store deployment configs, version IDs, connected domains |
| Durable Objects | Global singleton coordination for revalidation locks |
| Queues | Async message processing for cache warming |
| CDN Cache | Edge caching with custom cache keys via fetch with cf options |
| Zones/DNS | Route traffic to workers |
We could have built a similar system on any hyperscaler, but leaning on Cloudflare's CDN expertise, especially for configuring tiered cache, was a huge help.
It is important to understand the difference between two key terms used throughout the following solution explanation: prewarming, the proactive cache warming we trigger when a customer updates their documentation, and revalidation, the reactive warming we trigger when we detect a version mismatch. Both ultimately warm the cache by fetching pages, but they differ in when and why they're triggered. More on this in sections 2 through 4 below.
We placed a Cloudflare Worker in front of all traffic to Mintlify hosted sites. It proxies every request and contains business logic for both updating and using the associated cache. When a request comes in, the worker proceeds through the following steps.
First, it computes a cache key based on the path, deployment ID, and request type. Our cache key structure is shown below. The cachePrefix roughly maps to the name of a particular customer, deploymentId identifies which Vercel deployment to proxy to, path identifies the correct page to fetch, and contentType lets us store both HTML and RSC variants of every page.
`${cachePrefix}/${deploymentId}/${path}#${kind}:${contentType}`;
For example: acme/dpl_abc123/getting-started:html and acme/dpl_abc123/getting-started:rsc.
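As a sketch, the key construction is plain string assembly; `buildCacheKey` is a hypothetical helper name, and the values of the `kind` segment are not spelled out in the post, so the one used here is an assumption:

```typescript
// Assemble the edge cache key: customer prefix, deployment, path, and variant.
// The #kind segment's possible values are an assumption for illustration.
function buildCacheKey(
  cachePrefix: string,
  deploymentId: string,
  path: string,
  kind: string,
  contentType: "html" | "rsc"
): string {
  return `${cachePrefix}/${deploymentId}/${path}#${kind}:${contentType}`;
}
```

Because deploymentId is part of the key, two deployments of the same site can never collide in the cache, which is what makes atomic version switches possible.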
The most innovative aspect of our solution is automatic version mismatch detection.
When we deploy a new version of our Next.js client to production, Vercel sends a deployment.succeeded webhook. Our backend receives this and writes the new deployment ID to Cloudflare's KV.
KV.put(`DEPLOY:${projectId}:id`, deploymentId);
Then, when user requests come through the docs-proxy worker, it extracts version information from the origin response headers and compares it against the expected version in KV.
const gotVersion = originResponse.headers.get('x-version');
const projectId = originResponse.headers.get('x-vercel-project-id');
const wantVersion = await KV.get(`DEPLOY:${projectId}:id`);
const shouldRevalidate = wantVersion !== gotVersion;
When a version mismatch is detected, the worker automatically triggers revalidation in the background using ctx.waitUntil(). The user gets the previously cached stale version immediately. Meanwhile, cache warming of the new version happens asynchronously in the background.
We do not start serving the new version of pages until we have warmed all paths in the sitemap: once a user loads the new version of any page after an update, all subsequent navigations must fetch that same version. If you were on v2 and then randomly saw v1 designs when navigating to a new page, it would be jarring and worse than pages loading slowly.
Our first concern when triggering revalidations for sites was that we were going to create a race condition where we had multiple updates in parallel for a given customer and start serving traffic for both new and old versions at the same time.
We decided to use Cloudflare's Durable Objects (DO) as a lock around the update process to prevent this. We execute the following steps during every attempted revalidation trigger.
1. Check DO storage for any in-flight updates, and ignore the trigger if there is one.
2. Write to DO storage to track that we are starting an update and "lock".
3. Enqueue the cachePrefix, deploymentId, and host info for the revalidation worker to process.
4. On completion, clear the DO state and unlock.

We also added a failsafe where we automatically delete the DO's data and unlock in step 1 if the lock has been held for 30 minutes. We know from our analytics that no update should take that long, so it is a safe timeout.
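The locking behavior can be sketched independently of the Durable Objects API. The in-memory class below, with an injectable clock, illustrates the same acquire/ignore/timeout logic; it is not Cloudflare's actual DO interface, and the class and method names are invented for illustration:

```typescript
const LOCK_TIMEOUT_MS = 30 * 60 * 1000; // failsafe: reclaim stuck locks after 30 minutes

class RevalidationLock {
  private heldSince: number | null = null;

  // The clock is injected so the timeout path is testable.
  constructor(private now: () => number = Date.now) {}

  // Returns true if this trigger acquired the lock; false if an
  // update is already in flight (the trigger should be ignored).
  tryAcquire(): boolean {
    if (this.heldSince !== null && this.now() - this.heldSince < LOCK_TIMEOUT_MS) {
      return false; // in-flight update: ignore this trigger
    }
    this.heldSince = this.now(); // free or stale: (re)take the lock
    return true;
  }

  // Called when the revalidation worker reports completion.
  release(): void {
    this.heldSince = null;
  }
}
```

In the real system the "held since" state lives in DO storage, so the singleton guarantee comes from Durable Objects rather than process memory.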
Cloudflare Queues make it easy to attach a worker that can consume and process messages, so we have a dedicated revalidation worker that handles both prewarming (proactive) and version revalidation (reactive). Using a queue to control the rate of cache warming requests was mission critical since without it, we'd cause a thundering herd that takes down our own databases.
Each queue message contains the full context for a deployment: cachePrefix, deploymentId, and either a list of paths or enough info to fetch them from our sitemap API. The worker then warms all pages for that deployment before reporting completion.
// Get paths from the message, or fall back to the sitemap API
const paths = message.paths ?? await fetchSitemap(cachePrefix);

// Process in batches of 6 (Cloudflare's concurrent connection limit)
for (const batch of chunks(paths, 6)) {
  await Promise.all(
    batch.flatMap((path) =>
      // Warm both HTML and RSC variants of each page
      ["html", "rsc"].map((variant) => {
        const cacheKey = `${cachePrefix}/${deploymentId}/${path}#${variant}`;
        const headers = { "X-Cache-Key": cacheKey };
        if (variant === "rsc") headers["RSC"] = "1";
        return fetchWithRetry(originUrl, { headers });
      })
    )
  );
}
Once all paths are warmed, the worker reads the current doc version from the coordinator's DO storage to ensure we're not overwriting a newer version with an older one. If the version is still valid, it updates the DEPLOYMENT:{domain} key in KV for all connected domains and notifies the coordinator that cache warming is complete. The coordinator only unlocks after receiving this completion signal.
Beyond reactive revalidation, we also proactively prewarm caches when customers update their documentation. After processing a docs update, our backend calls the Cloudflare Worker's admin API to trigger prewarming:
POST /admin/prewarm HTTP/1.1
Host: workerUrl
Content-Type: application/json
{
"paths": ["/docs/intro", "/docs/quickstart", "..."],
"cachePrefix": "acme/42",
"deploymentId": "dpl_abc123",
"isPrewarm": true
}
The admin endpoint accepts batch prewarm requests and queues them for processing. It also updates the doc version in the coordinator's DO to prevent older versions from overwriting newer cached content.
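A sketch of how a backend might assemble that call; the endpoint path and body shape are taken from the request above, while `buildPrewarmRequest` itself is an illustrative helper, not our actual code:

```typescript
interface PrewarmRequest {
  paths: string[];
  cachePrefix: string;
  deploymentId: string;
  isPrewarm: true;
}

// Build the POST the backend sends to the worker's admin API after a docs update.
function buildPrewarmRequest(workerUrl: string, body: PrewarmRequest): Request {
  return new Request(`${workerUrl}/admin/prewarm`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
}
```

Batching all changed paths into one request keeps the queue as the single place where warming concurrency is controlled.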
This two-pronged approach ensures caches stay warm through both proactive prewarming, triggered by documentation content updates, and reactive revalidation, triggered by new codebase deployments.
We have successfully moved our cache hit rate to effectively 100%, based on monitoring logs from the Cloudflare proxy worker over the past two weeks.
Our system is also self-healing. If a revalidation fails, the next request will trigger it again. If a lock gets stuck, alarms clean it up automatically after 30 minutes. And because we cache at the edge with a 15-day TTL, even if the origin goes down, users still get fast responses from the cache. Improving reliability as well as speed!
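The edge-caching knobs described here map onto the `cf` options of the Workers `fetch` API. A sketch of the options object, using the 15-day TTL mentioned above; note that `cacheKey`, `cacheTtl`, and `cacheEverything` are real Workers fetch options, but custom cache keys may require an Enterprise zone:

```typescript
const FIFTEEN_DAYS_S = 15 * 24 * 60 * 60; // 1,296,000 seconds

// cf options for the proxied fetch: cache at the edge under our custom key.
function cacheOptions(cacheKey: string) {
  return {
    cf: {
      cacheKey,                  // the composite key from earlier
      cacheTtl: FIFTEEN_DAYS_S,  // long TTL so the edge can cover origin outages
      cacheEverything: true,     // cache HTML responses, not just static assets
    },
  };
}
```

Inside the worker this would be used roughly as `fetch(originUrl, cacheOptions(key))`, with invalidation handled by changing the key rather than purging.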
If you're running a dynamic site and chasing P99 latency at the origin, consider whether that's actually the right battle. We spent weeks trying to optimize ours (RSCs, multiple databases, signed S3 URLs) and the system was too complicated to debug meaningfully.
The breakthrough came when we stopped trying to make dynamic requests faster and instead made them not happen at all. Push your dynamic site towards being static wherever possible. Cache aggressively, prewarm proactively, and let the edge do what it's good at.