EIDF Catalogue

Data-hosting for Common Crawl augmentations

Followers: 0
Datasets: 1

1 dataset found

  • Common Crawl URL index for August 2019 with Last-Modified timestamps

    Data-hosting for Common Crawl augmentations
    This dataset consists of a complete set of augmented index files for CC-MAIN-2019-35 [1]. This version of the index contains one additional field, lastmod, in about 18% of the entries, giving the value of the Last-Modified header from the HTTP response as a POSIX-format timestamp, enabling much...
    Formats: TSV, collection
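
The description above says that lastmod holds the Last-Modified value as a POSIX-format timestamp. As a minimal sketch of what that means in practice, the Python below converts such a value to a UTC datetime. The function name `lastmod_to_utc` and the sample value `1566000000` are hypothetical, chosen only for illustration. The layout of the index lines is not described here, so the sketch assumes the lastmod value has already been extracted as a string of whole seconds.

```python
from datetime import datetime, timezone


def lastmod_to_utc(lastmod: str) -> datetime:
    """Convert a POSIX timestamp string (seconds since the epoch) to an aware UTC datetime.

    Assumption: the value is an integer number of seconds, as is usual for
    POSIX timestamps; the real index files may differ.
    """
    return datetime.fromtimestamp(int(lastmod), tz=timezone.utc)


# Hypothetical sample value, not taken from the dataset.
sample = lastmod_to_utc("1566000000")
print(sample.isoformat())
```

Keeping the result timezone-aware avoids silently mixing local time into comparisons between Last-Modified values and crawl dates.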