Common Crawl URL index for August 2019 with Last-Modified timestamps
Data-hosting for Common Crawl augmentations
This dataset consists of a complete set of augmented index files for CC-MAIN-2019-35 [1]. This version of the index contains one additional field, lastmod, in about 18% of the entries, giving the value of the Last-Modified header from the HTTP response as a POSIX-format timestamp, enabling much...
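As a rough illustration of how the lastmod field might be consumed, the following sketch parses one index line and converts the POSIX timestamp to a UTC datetime. It assumes the usual Common Crawl CDX line layout (SURT key, capture timestamp, then a JSON object) and assumes that lastmod appears as a key in that JSON object holding integer seconds, either as a number or as a string. The sample lines and the helper names parse_index_line and lastmod_datetime are hypothetical, not taken from the dataset.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def parse_index_line(line: str) -> dict:
    """Split one index line into its SURT key, capture timestamp and JSON fields.

    Assumed layout: "<urlkey> <capture timestamp> <JSON object>".
    """
    urlkey, capture_ts, payload = line.rstrip("\n").split(" ", 2)
    fields = json.loads(payload)
    fields["urlkey"] = urlkey
    fields["capture_ts"] = capture_ts
    return fields


def lastmod_datetime(fields: dict) -> Optional[datetime]:
    """Return lastmod as a timezone-aware UTC datetime, or None if absent.

    Only a minority of entries carry the field, so None is the common case.
    Assumption: the value is integer POSIX seconds (number or string).
    """
    raw = fields.get("lastmod")
    if raw is None:
        return None
    return datetime.fromtimestamp(int(raw), tz=timezone.utc)


# Hypothetical sample lines, made up for illustration only.
with_lastmod = (
    'com,example)/ 20190817000000 '
    '{"url": "http://example.com/", "status": "200", "lastmod": "1546300800"}'
)
without_lastmod = (
    'com,example)/about 20190817000500 '
    '{"url": "http://example.com/about", "status": "200"}'
)

for sample in (with_lastmod, without_lastmod):
    entry = parse_index_line(sample)
    print(entry["url"], lastmod_datetime(entry))
```

Returning None for entries without lastmod keeps the caller explicit about the large share of records that have no Last-Modified value, rather than substituting the capture time.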