Sunday, May 01, 2022

Using Internet Archive to Find Broken Links

One of the most frustrating things about the Internet is that many links to sites I pointed to in past years (from 2005 on) are now broken. If you look at articles I posted on this blog in the 2000s, you will find links that no longer work in many places.

Here's what I do to overcome that problem.  

First, when I find a broken link, I look at the root of the link. That's the first part, which I've underlined in the example below.  

www.tutormentorexchange.net is the root.  I remove the rest of the link and see if the original root site is still working.
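If you have many links to check, the same "keep only the root" step can be sketched in a few lines of Python. This is only an illustration, not a required part of the process: the function name `root_of` is made up for this example, and it assumes that a link pasted without `http://` or `https://` should be treated as `http://`.

```python
from urllib.parse import urlsplit


def root_of(link: str) -> str:
    """Return the root of a link: the scheme and host, without the rest."""
    # A link pasted without a scheme (for example "www.example.org/page")
    # would parse with an empty host, so assume http:// in that case.
    if "://" not in link:
        link = "http://" + link
    parts = urlsplit(link)
    # Drop the path, query string, and fragment; keep only scheme and host.
    return f"{parts.scheme}://{parts.netloc}"


if __name__ == "__main__":
    print(root_of("https://tutormentorexchange.net/mapping-the-programs"))
```

Opening the address this returns in a browser is the same test described above: it shows whether the original site is still working.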


If the site still works and has a search field, I enter the name of the website or article and see if it's still on the site, but in another place.  If it is, I update my blog article or web library with the new link, so it is there for the next person who looks at the article.

If I can't find the article on the original website, I put the article name in a Google search and see if it is available in another place.  If I can find it, I replace the old link with the new one.

However, I'm often not able to find the article.  That's when I visit the Internet Archive at https://archive.org/web/ 


At the top of the home page you'll see the image below:

Enter the URL of the broken link, then click on the "browse history" button.  Below you can see the result of my search for https://tutormentorexchange.net/mapping-the-programs

Your search will reveal any history available in the Internet Archive, or tell you that no history exists. For the web address I input, you can see along the middle bar that there are many results, dating back to 2009.  In the calendar below the middle timeline, archive dates are shown.
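The same "is there any history?" question can also be asked without the calendar page. The Internet Archive offers an availability endpoint at https://archive.org/wayback/available that answers in JSON. Below is a minimal sketch of how a lookup might be written. The function names are made up for this example, and the response fields it reads (`archived_snapshots`, `closest`, `available`, `url`) are my understanding of the format, so check them against the Archive's own documentation before relying on this.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(link, timestamp=None):
    """Build the address that asks the Archive about one link."""
    params = {"url": link}
    if timestamp:
        # Optional YYYYMMDD (or longer) date to find the capture closest to it.
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return the archived address from a decoded response, or None."""
    # Assumed response shape: {"archived_snapshots": {"closest": {...}}}
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(link, timestamp=None):
    """Ask the Archive over the network; needs an Internet connection."""
    with urlopen(availability_query(link, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))
```

Calling `lookup("https://tutormentorexchange.net/mapping-the-programs")` should return an archived address if one exists, or `None` if the Archive has no history for that link.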

If you put your mouse over a highlighted date on the calendar, then click the link provided, the archived page will open.  You can see my archived page for March 23, 2022.


In the circled area at the top of the page, in your browser address line, you'll find the location of this page in the web archive.  For this page it is: https://web.archive.org/web/20220323074928/https://tutormentorexchange.net/mapping-the-programs

That is the address you put in your blog article and/or library as the archived location of the page with the broken link.
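The archived address above follows a simple pattern: the Wayback Machine prefix, a 14-digit capture time (year, month, day, hour, minute, second), and then the original link. The short Python sketch below takes such an address apart and puts it back together. The function names are made up for this example, and it assumes the full 14-digit form shown above.

```python
from datetime import datetime

WAYBACK_PREFIX = "https://web.archive.org/web/"


def archived_address(timestamp, original):
    """Join a 14-digit capture time and the original link."""
    return f"{WAYBACK_PREFIX}{timestamp}/{original}"


def split_archived_address(address):
    """Return (capture time, original link) from an archived address."""
    if not address.startswith(WAYBACK_PREFIX):
        raise ValueError("not a Wayback Machine address")
    # Everything up to the first "/" after the prefix is the capture time;
    # everything after it is the original link, slashes and all.
    timestamp, _, original = address[len(WAYBACK_PREFIX):].partition("/")
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return captured, original


if __name__ == "__main__":
    when, link = split_archived_address(
        "https://web.archive.org/web/20220323074928/"
        "https://tutormentorexchange.net/mapping-the-programs"
    )
    print(when.date(), link)
```

Reading the capture time this way is a quick check that the address you are about to paste into an article really points at the date you meant to use.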

What's interesting is that you can look at this page at different points in time.  This page has been saved 79 times since 2009.  Click on any of the black bars and the page saved for that date will appear.  


I clicked on a 2009 page and was able to look at the version of the site from that date.


Let's look at another website.  I used http://www.tutormentorconnection.org as my primary website address from the late 1990s, when my first site was launched.  That site was taken offline in January 2022, so it is now available only in the Internet Archive.  Below you can see the result when I entered that URL in the search bar.


The site has been saved 450 times since 2000.  If you open the November 4, 2021 capture, you'll see the site as it was just before it shut down. 


However, if you open the site from October 2007, this is what you'll find.

Below is the home page from August 2004, the year before the tech team at IUPUI rebuilt the website. 

I'm only showing the home pages. For most of these sites you can open interior pages and see most of what was on the site at that time.

Here's one more valuable feature of the Internet Archive.  You can add new links to it.  Here's an example.  

At https://www.vialogues.com/vialogues/play/34104 you can find one of several videos that I put on the Vialogues site in past years.  Earlier this year I learned that the site will stop working in the near future.


To preserve these video conversations I went back to the Wayback Machine.  In the image below you can see a "save page now" box.  I entered the Vialogues address for the video above.


Below is the search result.  I've circled the date of the archive, which was March 18, 2022, and the new URL for finding the page. 


This is the saved page.  The new URL is in the browser line at the top of the page. 
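The "save page now" box can also be reached by putting https://web.archive.org/save/ in front of the address you want saved. The Python sketch below builds that address and, optionally, requests it. Treat it as an assumption-heavy illustration: the function names and the User-Agent text are made up, and I am assuming the address reported back after the request is the new archived location. Saving through the web form, as described above, remains the simplest way.

```python
from urllib.request import Request, urlopen

SAVE_PREFIX = "https://web.archive.org/save/"


def save_request_address(link):
    """Build the address that asks the Wayback Machine to save a page."""
    return SAVE_PREFIX + link


def save_page(link):
    """Request a save over the network; needs an Internet connection."""
    request = Request(
        save_request_address(link),
        # Hypothetical User-Agent label, used only to identify this script.
        headers={"User-Agent": "broken-link-helper-example"},
    )
    with urlopen(request, timeout=120) as response:
        # Assumption: the final address after any redirects is the new capture.
        return response.geturl()


if __name__ == "__main__":
    print(save_request_address("https://www.vialogues.com/vialogues/play/34104"))
```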


I put five of my videos in the Internet Archive, so now they will remain available as long as the Archive is available.  This blog and other websites that hold information that I've collected over the past 25 years are all available on the Internet Archive.  That means future social engineers, historians and community builders will be able to find this information well beyond my lifetime and the active life of the sites I still maintain.  I hope they use it to build Tutor/Mentor Connection type strategies in every major city in the world that support a wide range of Birth-to-Work volunteer-based youth learning programs. 

The Internet Archive is a non-profit and needs donations to remain available. It's one of the sites I make a small annual contribution to. 

It's pretty interesting, and a valuable resource for finding and replacing broken links on a website or blog.  I hope you find this introduction useful.


1 comment:

Dogtrax said...

This kind of tutorial is very helpful, Daniel. Thanks.
Kevin