• heartlessevil@lemmy.one
      1 year ago

      For context, based on historical pushshift data:

      • 80 GB zipped decompresses to ~1100 GB of text data
      • 80 GB zipped would only cover the most recent ~4 months of comments
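
For anyone who wants the implied ratios, here is a quick back-of-the-envelope sketch. It uses only the approximate figures quoted above; the variable names are mine, and the per-month numbers assume comment volume is spread evenly across the ~4 months, which it almost certainly isn't.

```python
# Rough figures from the pushshift comparison above (all approximate).
compressed_gb = 80       # size of the zipped dump
uncompressed_gb = 1100   # size after decompression
months_covered = 4       # roughly how many months of comments that covers

# Compression ratio: how many GB of text each zipped GB expands to.
ratio = uncompressed_gb / compressed_gb

# Assumption: volume is uniform across the months covered.
compressed_per_month = compressed_gb / months_covered
uncompressed_per_month = uncompressed_gb / months_covered

print(f"compression ratio: ~{ratio:.2f}x")
print(f"per month: ~{compressed_per_month:.0f} GB zipped, "
      f"~{uncompressed_per_month:.0f} GB of text")
```

So on these numbers the text expands by a bit under 14x, and a month of comments works out to something like 20 GB zipped.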

      They do indicate, though, that the data they have is more valuable, pointing in particular to how users are being tracked (GDPR alarm bells ringing) or censored.