EIDF Catalogue


1 dataset found

Organisations: Data-hosting for Common Crawl augmentations; Formats: collection

  • Common Crawl URL index for August 2019 with Last-Modified timestamps

    Data-hosting for Common Crawl augmentations
    This dataset consists of a complete set of augmented index files for CC-MAIN-2019-35 [1]. This version of the index contains one additional field, lastmod, in about 18% of the entries, giving the value of the Last-Modified header from the HTTP response as a POSIX-format timestamp, enabling much...
    Formats: TSV, collection
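
The lastmod field described above holds the value of an HTTP Last-Modified header as a POSIX timestamp. As a minimal sketch of that conversion, assuming the header is in the standard HTTP date format, the Python below parses a header value and returns seconds since the epoch. The function name `lastmod_to_posix` and the sample header value are hypothetical and are not taken from the dataset; the index itself may have been produced by a different method.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def lastmod_to_posix(header_value: str) -> int:
    """Convert an HTTP Last-Modified header value to a POSIX timestamp.

    Hypothetical helper for illustration only; not the dataset's own tooling.
    """
    # Parse an RFC 5322 / HTTP-date string such as
    # "Wed, 21 Aug 2019 07:28:00 GMT" into a datetime.
    dt = parsedate_to_datetime(header_value)
    # Some inputs (e.g. a "-0000" zone) yield a naive datetime;
    # treat those as UTC, which is what HTTP dates use.
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


if __name__ == "__main__":
    # Sample value invented for this example.
    print(lastmod_to_posix("Wed, 21 Aug 2019 07:28:00 GMT"))
```

Malformed header values make `parsedate_to_datetime` raise an exception, so a caller processing real crawl responses would likely want to catch that and skip the entry.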
You can also access this registry using the API (see API Docs).