This started as a comment, but then I decided to make it a post so people can find it more easily =)
Thanks to @InternetPirate@lemmy.fmhy.ml for finding the link to https://the-eye.eu/redarcs
(original comment: https://lemmy.dbzer0.com/comment/129402)
There are 19,980 subreddits archived at the-eye. To download, extract, and view them on Linux, do the following:
Download archives:
wget https://the-eye.eu/redarcs/files/Piracy_submissions.zst
(size: 42 MB)
wget https://the-eye.eu/redarcs/files/Piracy_comments.zst
(size: 208 MB)
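If you want a different subreddit, the two URLs above suggest a simple pattern. Here is a minimal sketch, assuming every archive follows the same `<Name>_submissions.zst` / `<Name>_comments.zst` naming as the Piracy files (I have not checked this for all of them). `redarcs_urls` is a made-up helper name, not part of any tool.

```shell
# Hypothetical helper: print the two archive URLs for a subreddit name.
# Assumes the <Name>_submissions.zst / <Name>_comments.zst naming seen above.
redarcs_urls() {
    base="https://the-eye.eu/redarcs/files"
    for kind in submissions comments; do
        printf '%s/%s_%s.zst\n' "$base" "$1" "$kind"
    done
}

# Print the URLs for the Piracy archive (no download happens here)
redarcs_urls Piracy
```

To actually download, the output can be piped into wget, for example `redarcs_urls Piracy | wget -c -i -`, where `-i -` reads URLs from stdin and `-c` resumes a partial download.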
I also recommend downloading the index page for faster offline viewing:
curl -A Firefox https://the-eye.eu/redarcs/ -o redarcs.html
Then just drag and drop redarcs.html into Firefox or Chrome to view it.
To extract all the links from the index page:
grep -oE "href='[^']*'" redarcs.html | cut -d\' -f2
You can also save them to a text file:
grep -oE "href='[^']*'" redarcs.html | cut -d\' -f2 > links.txt
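One detail worth knowing when pulling links out of HTML with grep: `-E` has no lazy quantifiers, so a bracket expression such as `[^']*` is the usual way to stop at the first closing quote when a line holds more than one link. Below is a small sketch on a made-up sample file. The real redarcs.html markup may differ, and `extract_links` is a hypothetical helper name.

```shell
# Hypothetical sample standing in for redarcs.html; the real page may look different.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a href='files/Piracy_submissions.zst'>submissions</a> <a href='files/Piracy_comments.zst'>comments</a>
<a href='files/linux_submissions.zst'>submissions</a>
EOF

# Hypothetical helper: print every single-quoted href value, one per line.
# [^']* matches up to the first closing quote, so two links on one line stay separate.
extract_links() {
    grep -oE "href='[^']*'" "$1" | cut -d\' -f2
}

# Keep only the archive links
extract_links "$sample" | grep '\.zst$'
```

On the real page this would be `extract_links redarcs.html`, optionally followed by `sort -u` to drop duplicates.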
Install the zstd package (as root or with sudo):
- Arch:
sudo pacman -S zstd
- Ubuntu:
sudo apt install zstd
Extract files:
zstd -d Piracy_submissions.zst
(extracted size: 593 MB)
zstd -d Piracy_comments.zst
(extracted size: 2.4 GB)
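Since the extracted comments file is 2.4 GB, it can be handy to peek at or search an archive without writing the extracted copy to disk. Here is a sketch using zstd's `-c` (write to stdout) together with `-d`. `zcat_zst` is a made-up helper name, and the tiny archive built here is only a stand-in so the example runs on its own.

```shell
# Hypothetical helper: stream a .zst archive to stdout instead of extracting it
# (-d decompress, -c write to standard output).
zcat_zst() {
    zstd -dc "$1"
}

# Build a tiny sample archive so the sketch is self-contained.
# The content is made up and only stands in for Piracy_submissions.zst.
tmp=$(mktemp -d)
printf '%s\n' '{"title":"first"}' '{"title":"second"}' '{"title":"third"}' > "$tmp/sample"
zstd -q "$tmp/sample" -o "$tmp/sample.zst"

# Show the first two records, then count the lines containing "third"
zcat_zst "$tmp/sample.zst" | head -n 2
zcat_zst "$tmp/sample.zst" | grep -c "third"
```

With the real files this would look like `zcat_zst Piracy_comments.zst | grep "word"`. If zstd refuses an archive because of its window size, adding `--long=31` raises the decoder's limit. Whether these particular archives need it is something I have not verified.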
View files with head/tail/grep:
cat Piracy_submissions | head -10
cat Piracy_submissions | tail -10
cat Piracy_submissions | grep "word"
Note: the files appear to be JSON, with one object per line. You can use the jq tool to query them:
cat Piracy_submissions | jq -r "."
or, to print only the titles:
cat Piracy_submissions | jq -r ".title"
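jq can also filter records and print several fields at once. The sketch below runs on made-up sample records. Apart from `title`, the field names (`author`, `score`) are my assumptions about what the dump contains, so check a real record first. `top_titles` is a hypothetical helper name.

```shell
# Made-up sample records, one JSON object per line. Field names other than
# "title" are assumptions about the real dump.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"title":"Where to find old books","author":"user_a","score":12}
{"title":"Megathread update","author":"user_b","score":250}
{"title":"Question about seeding","author":"user_c","score":3}
EOF

# Hypothetical helper: print "score<TAB>title" for posts at or above a minimum score.
# The filter is single-quoted so the shell does not try to expand $min itself.
top_titles() {
    jq -r --argjson min "$2" 'select(.score >= $min) | [.score, .title] | @tsv' "$1"
}

top_titles "$sample" 10
```

On the extracted file, something like `top_titles Piracy_submissions 100 | sort -rn | head` should list the highest-scoring titles, provided the records really carry a numeric `score` field.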