In reply to https://journal.robbi.my/posts/221218071647/
Robbie, that's a very helpful post. That is so wonderfully systematic of you. By the way, the link to archivenow
just loops back to your page. But you gave me enough clues to find it on Github. Thanks.
What I really like is your use of the webhooks for webmentions. I need to figure out how to do that.
My 'build' process is rather manual and crude. Basically, once I make a post (or similar), then I
- run a script that does a
git commit
to Github - wait for Netlify (or its app) to notice the changes and build my site
- job more or less done.
To process webmentions, I run my getMentions.sh
script (see here for details), and then run my other script to build the site. I need to do some thinking as to where in all of that I do the automatic thing with the webmention webhooks. That's a project for another day.
Meanwhile, I've taken your “submit10url2backway”
scripted and adapted it for my site.
#!/usr/bin/env bash
# Copyright Robbi Nespu <contact@robbi.my> , 2022
# License: MIT
# This script is used to archive my website to the Internet Archive.
# Modified for my site - 2022-12-19 by Peter Smith
# Fetch RSS and read the first 10 URLs send send it to the Wayback Machine
urls1=$(curl -s https://petersmith.org/index.xml | grep "<link>" | head -n 10 | awk -F"<guid>" '{print $2} ' | awk -F"</guid>" '{print $1}')
for i in $urls1
do archivenow --ia "$i"
done
# Same like previous but for the IndieWeb RSS
urls2=$(curl -s https://robbinespu.gitlab.io/indieweb/index.xml | grep "<link>" | head -n 10 | awk -F"<guid>" '{print $2} ' | awk -F"</guid>" '{print $1}')
for i in $urls2
do archivenow --ia "$i"
done
# Archive the main website
archivenow --ia "https://petersmith.org/"
echo "Done!"