Describe the bug
Several archived pages fail to load dynamic content (news listings, publications, events, datasets) because PyWB cannot consistently replay POST requests to a WordPress REST API endpoint. The pages appear to load, but the content blocks populated by the API remain empty.
This appears to be a variant of the known POST request replay limitation documented in Issue #768 and the POST Request Replay wiki page.
Steps to reproduce the bug
Visit the following example pages in The National Archives web archive, which we have been working on:
https://webarchive.nationalarchives.gov.uk/ukgwa/20260206124441/https://www.nceo.ac.uk/news/
https://webarchive.nationalarchives.gov.uk/ukgwa/20260206124441/https://www.nceo.ac.uk/news/latest-events/
https://webarchive.nationalarchives.gov.uk/ukgwa/20260106092954/https://www.nceo.ac.uk/data-facilities/datasets-tools/
https://webarchive.nationalarchives.gov.uk/ukgwa/20260106092954/https://www.nceo.ac.uk/our-research/publications/
The dynamic content blocks (news, publications, events, datasets) under the HTML content-area div do not populate (see the screenshots for a comparison). In replay, the POST request to https://www.nceo.ac.uk/wp-json/v2/archive-block/results is not consistently resolved to the captured response, so the frontend receives no usable data for those blocks.
Expected behavior
Dynamic content should render as captured, with POST API responses replayed correctly from the WARC.
Screenshots
See PyWB replay of News page below:
And the live version of the same page below:
Environment
- OS: macOS Tahoe 26.3
- Browser: Chrome
- Version: 145.0.7632.159
Additional context
This appears to be a replay-time POST matching issue rather than a capture gap. The WARC contains successful 200 responses for the endpoint, but during replay PyWB does not always match or reconstruct the original POST request context (method, body canonicalization, and the resulting index key), which can result in a 404 or in no response being delivered for the same URL. This is consistent with known PyWB limitations around POST replay and index compatibility. The POST body is standard application/x-www-form-urlencoded and relatively small.
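To illustrate the kind of matching involved, here is a minimal sketch of folding a POST method and form body into a lookup URL, so that the same key can be derived at index time and at replay time. This is not PyWB's actual implementation: the function name `post_lookup_url`, the `__wb_method` marker, the choice to sort parameters, and the body fields `page` and `type` are all assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def post_lookup_url(url: str, body: str, method: str = "POST") -> str:
    """Fold the request method and a form-encoded body into the URL query.

    Hypothetical helper: the "__wb_method" marker and the sorting step are
    illustrative assumptions, not a description of PyWB internals.
    """
    parts = urlsplit(url)
    # Start from any query parameters already present on the URL.
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Record the method so POST and GET captures get different keys.
    params.append(("__wb_method", method))
    # Append the form fields from the application/x-www-form-urlencoded body.
    params.extend(parse_qsl(body, keep_blank_values=True))
    # Sort so that field order in the body does not change the key.
    params.sort()
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), "")
    )


if __name__ == "__main__":
    endpoint = "https://www.nceo.ac.uk/wp-json/v2/archive-block/results"
    # Hypothetical body fields; the real request fields are not shown here.
    print(post_lookup_url(endpoint, "page=1&type=news"))
```

Under this scheme, replay only succeeds if the key derived from the browser's POST matches the key stored in the index. Any difference in how the body is canonicalized at the two stages (or any field whose value changes between capture and replay) would produce a different key and therefore a miss.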
There is an additional complicating factor: the endpoint https://www.nceo.ac.uk/wp-json/v2/archive-block/results is POST-only. A GET request to the same URL returns a 404 rest_no_route error, which makes it difficult to inspect or test the response directly in a live environment and rules out standard fuzzy matching workarounds.
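Because a plain GET cannot be used to inspect the endpoint, a small POST probe may help when comparing the live response with the replayed one. The sketch below uses only the Python standard library; the helper names `build_probe` and `send` and the example field `page` are hypothetical, and the real form fields would need to be copied from the browser's network panel.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "https://www.nceo.ac.uk/wp-json/v2/archive-block/results"


def build_probe(fields: dict[str, str], url: str = ENDPOINT) -> Request:
    """Build a form-encoded POST request (hypothetical helper)."""
    data = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def send(req: Request) -> tuple[int, bytes]:
    """Send the request and return (status, body), including error responses."""
    try:
        with urlopen(req, timeout=30) as resp:
            return resp.status, resp.read()
    except HTTPError as err:
        # A 404 such as rest_no_route still carries a body worth inspecting.
        return err.code, err.read()


if __name__ == "__main__":
    # "page" is a placeholder field, not taken from the real request.
    status, body = send(build_probe({"page": "1"}))
    print(status, body[:200])
```

Passing the archived replay URL as `url` instead of the live endpoint should allow a like-for-like comparison of status codes, assuming the replay server accepts direct POST requests.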
This issue is similar to the previously raised issue #768. However, it differs in that it is not specific to the OutbackCDX backend, and the POST-only nature of the endpoint (GET → 404 rest_no_route) eliminates fallback matching strategies that might otherwise apply.