Observation on archival sites: Archive.Today vs. Internet Archive

Some of my followers may have noticed that I've lately been archiving a number of older posts from my previous account.

In doing this, I've noticed a few things about Archive.Today (a/k/a Archive.Is) and the Internet Archive's Wayback Machine.

It turns out that Archive.Today is really convenient to invoke with DDG (DuckDuckGo) set as my default search engine: I highlight the URL in the navigation bar, prepend the "!ais" bang followed by a space, and hit return.
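For anyone who would rather use a bookmark or script than type the bang by hand, the same search can be written as a URL. This is a minimal sketch, assuming DuckDuckGo's usual `?q=` query parameter; the function name `bang_search_url` is made up for illustration and is not part of any tool.

```python
from urllib.parse import urlencode


def bang_search_url(page_url: str, bang: str = "!ais") -> str:
    """Return a DuckDuckGo search URL that runs `bang` against `page_url`."""
    # The query is the bang, a space, then the page URL, just as it would be
    # typed into the navigation bar; urlencode percent-escapes it for a URL.
    return "https://duckduckgo.com/?" + urlencode({"q": f"{bang} {page_url}"})


# Example: build the search URL for a (hypothetical) post address.
print(bang_search_url("https://example.com/posts/123"))
```

Opening the printed URL should behave the same as typing the bang search manually, provided the bang is still registered.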

Archive.Today helpfully offers links for other potential archive sites, including the Internet Archive, so I don't have to independently call up that URL.

Archive.Today responds very quickly. There's a practically instant response that the page is or is not archived, and if not, the "save" form also pops up nearly instantly.

By contrast, the Internet Archive takes a few seconds to respond whether or not the page is archived, and a few further seconds when requesting a page be saved.

(Both sites have a two-stage submission. The Internet Archive does have a submission URL which should work in one fell swoop, though it occasionally breaks and error-detection is ... difficult.)
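A sketch of the one-step route, assuming the Wayback Machine's `https://web.archive.org/save/` prefix and its availability lookup at `https://archive.org/wayback/available`. The function names are hypothetical, and the reply shape checked in `has_snapshot` is an assumption about what that lookup returns. Nothing here touches the network; it only builds the URLs and inspects an already-decoded JSON reply.

```python
from urllib.parse import urlencode


def wayback_save_url(page_url: str) -> str:
    """URL that asks the Wayback Machine to capture `page_url` (assumed prefix)."""
    return "https://web.archive.org/save/" + page_url


def wayback_check_url(page_url: str) -> str:
    """URL for the availability lookup of `page_url`."""
    return "https://archive.org/wayback/available?" + urlencode({"url": page_url})


def has_snapshot(reply: dict) -> bool:
    """True if a decoded availability reply reports a usable snapshot."""
    # Assumed shape: {"archived_snapshots": {"closest": {"available": true, ...}}}
    snapshots = reply.get("archived_snapshots") or {}
    closest = snapshots.get("closest") or {}
    return bool(closest.get("available"))
```

Running the lookup some time after a submission is one possible way to notice a save that quietly failed, though with a long processing queue a missing snapshot may only mean the request is still waiting.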

Archive.Today's processing queue ranges from 0 to 10k or so slots.

The Internet Archive is currently reporting ~10 hours to process archival requests.

Archive.Today (AT) does include comments on Diaspora* posts. The Internet Archive (IA) does not.

My manual workflow has evolved to:

1. Pull up page, reload in Diaspora* (otherwise cookies may not be current, forcing a log-out / log-in cycle, also annoying).
2. Mark the post "tagged" to indicate it's been archived. I typically also "like" it to set a sharper visual indicator.
3. Prepend '!ais ' to the navigation bar and hit <enter>.
4. Open "Search in Internet Archive" in a new tab, then select that tab to get IA working on finding the post.
5. Switch back to the Archive.Today tab and select save, then confirm. At that point the request is processing.
6. Switch back to the Internet Archive tab, wait for the page to fully load, request archive, wait for that page to load, confirm, and wait for the request to return.
Even after this stage, the IA request may still fail. Detecting this is ... difficult.

I may also save content from the original (JoindiasporaCom) address, though mostly I'm working through Glasswings. I have run an automated submission of all my posts from the take-out JSON archive, and will run that once or twice more before final shutdown. That will at least preserve post content online, but not the comment threads :-(
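A rough sketch of what such an automated run might look like, not the actual script. The field name `guid`, the `/posts/<guid>` address pattern, and the pod address `https://pod.example` are assumptions for illustration, and the real take-out layout may differ. It walks whatever JSON it is given, collects every string stored under `guid`, and turns each into a Wayback Machine save URL (assuming the `https://web.archive.org/save/` prefix).

```python
import json


def collect_values(node, key="guid"):
    """Yield every string stored under `key` anywhere in decoded JSON."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key and isinstance(value, str):
                yield value
            else:
                yield from collect_values(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from collect_values(item, key)


def save_urls(export, pod="https://pod.example"):
    """Build one save URL per collected guid (assumed post address pattern)."""
    return [f"https://web.archive.org/save/{pod}/posts/{guid}"
            for guid in collect_values(export)]


# Hypothetical miniature export, only to show the shape of the output.
sample = json.loads('{"posts": [{"guid": "abc123"}, {"guid": "def456"}]}')
for url in save_urls(sample):
    print(url)
```

In a real export the same key may also appear on records that are not posts, so the collected list would likely need filtering before anything is submitted, and requests should be spaced out rather than sent in a burst.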

Hopefully this information may be useful to others.

#Archival #WebArchival #ArchiveIs #ArchiveToday #InternetArchive #WaybackMachine
