
Example record in CDXJ spec contains extra digest field #163

@extua


The example record in section 8 of the CDXJ spec contains two digest fields: digest and recordDigest.
Only digest is mentioned in the spec.

org,example)/index.html 20220106150849300 {"url":"https://example.org/index.html","digest":"sha-256:a8c5ac6f47aa34c5c5183daedc6ebbc7ca1e53fd2ec7db5e98d71bffb163b2ce","mime":"image/png","offset":283,"length":2269,"recordDigest":"sha256:e520b333999144ff38f593f6d76f5333d24895701953b2ea0507ed041d20ca2c","status":200,"filename":"data.warc.gz"}

As I understand it, the digest value can be copied from the WARC-Payload-Digest field in the WARC record header, but on rereading the WARC spec this is not entirely clear.

What does the extra recordDigest field refer to?
The two values differ, so they presumably refer to different things.
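
To make the discrepancy concrete, here is a minimal Python sketch that splits the example record into its key, timestamp, and JSON block, then lists every field whose name contains "digest". The helper name `parse_cdxj_line` is hypothetical, and the sketch assumes the three parts are separated by single spaces, as they are in the example; it is not taken from any reference implementation.

```python
import json

# Example record copied verbatim from the issue text above.
LINE = (
    'org,example)/index.html 20220106150849300 '
    '{"url":"https://example.org/index.html",'
    '"digest":"sha-256:a8c5ac6f47aa34c5c5183daedc6ebbc7ca1e53fd2ec7db5e98d71bffb163b2ce",'
    '"mime":"image/png","offset":283,"length":2269,'
    '"recordDigest":"sha256:e520b333999144ff38f593f6d76f5333d24895701953b2ea0507ed041d20ca2c",'
    '"status":200,"filename":"data.warc.gz"}'
)


def parse_cdxj_line(line):
    """Split a CDXJ line into (key, timestamp, fields).

    Hypothetical helper: assumes single-space separators between the
    searchable key, the timestamp, and the JSON block.
    """
    key, timestamp, json_block = line.split(" ", 2)
    return key, timestamp, json.loads(json_block)


key, timestamp, fields = parse_cdxj_line(LINE)

# Collect every field whose name mentions "digest".
digest_fields = {k: v for k, v in fields.items() if "digest" in k.lower()}

# The part before the first colon is the algorithm label.
labels = {k: v.split(":", 1)[0] for k, v in digest_fields.items()}

for name, value in digest_fields.items():
    print(f"{name}: {value}")
```

Run against the example, this reports both `digest` and `recordDigest`, with different hash values. It also shows that the two fields spell the algorithm label differently (`sha-256` versus `sha256`), which is a second inconsistency in the example.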

Simply removing recordDigest from the example in the spec would clear up some confusion.
