Mark CDX entries as actual page or a subasset #31

@vcavallo

Description

  • Add the page/subasset marker to the WARC headers
  • Include the y column when building CDX indices
  • Decide how to merge CDX records that differ on this point (probably just be maximally inclusive: if anyone marks a capture as a page, it is a page for everyone)
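
A rough sketch of the "maximally inclusive" merge from the last bullet, in case it helps the discussion. Everything here is an assumption for illustration: `CdxRecord`, its `is_page` field, and `merge_records` are hypothetical names, and the sketch treats two records as the same capture when they share a URL key and timestamp. It is not tied to any existing CDX library or to how the flag ends up being stored.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class CdxRecord:
    """Hypothetical, simplified CDX record (real records carry more fields)."""
    urlkey: str      # canonicalized URL key
    timestamp: str   # 14-digit capture timestamp
    is_page: bool    # assumed flag: True = actual page, False = subasset


def merge_records(records: Iterable[CdxRecord]) -> List[CdxRecord]:
    """Merge records for the same capture, OR-ing the page flag together."""
    merged: Dict[Tuple[str, str], CdxRecord] = {}
    for rec in records:
        key = (rec.urlkey, rec.timestamp)
        seen = merged.get(key)
        if seen is None:
            merged[key] = rec
        elif rec.is_page and not seen.is_page:
            # Any source that calls it a page wins over "subasset".
            merged[key] = replace(seen, is_page=True)
    return sorted(merged.values(), key=lambda r: (r.urlkey, r.timestamp))


# Two sources disagree about the same capture of the home page.
records = [
    CdxRecord("com,example)/", "20240101000000", False),
    CdxRecord("com,example)/", "20240101000000", True),
    CdxRecord("com,example)/logo.png", "20240101000001", False),
]
merged = merge_records(records)
```

With this rule the flag can only move from subasset to page during a merge, never the other way, so the result does not depend on the order in which indices are combined.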
