Difference between revisions of "Help:WebArchive Notes"

(Created page with "=Web Archive= The Internet Archive (also known as the WayBack Machine) [https://archive.org/web/] is an archive of various websites that started in 1996. Information availabl...")
 
 
(16 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
=Web Archive=
 
=Web Archive=
The Internet Archive (also known as the WayBack Machine) [https://archive.org/web/] is an archive of various websites that started in 1996.  Information available through the web is not static.  It is not uncommon for websites run by businesses to be reconfigured periodically and old information removed to improve its utility for its current customers.  Also companies themselves close or reorganized and their websites can fully disappear.  Privately run websites that serve as long standing information caches can also experience change or disappearance.  But the use of internet archive can help in retrieving some information that was historically available on the web.
+
The Internet Archive (also known as the Wayback Machine) [https://archive.org/web/ https://archive.org/web/] is an archive of various websites that started in 1996.  As many know and have experienced, information available through the web is not static.  It is not uncommon for business websites to be reconfigured periodically and old information removed to improve its utility for its current customers.  Also as companies themselves may close or reorganize their websites can change or disappear.  Individually run websites that serve as long standing information caches can also experience change or disappearance.  However, the internet archive can help retrieve information that was available on a certain website at an earlier point in time.
  
This is a useful tool, but some cautions are worth noting.  The internet archive was (and continues to be) built by taking periodic yet selective snapshots of websites via bots that crawled the web.  This means its coverage can be limited by the following issues.
+
This is a useful tool for research, but some cautions are worth noting.  The Internet Archive was (and continues to be) built by taking periodic yet selective snapshots of websites via bots that crawled the web. (Archival practices are influenced by the archive's available storage and bandwidth but also by the technical aspects of the capture process, which itself has developed over the years.) This means that its coverage can be affected by the following issues.
1) Its snapshots are discrete, not continuous.  Thus information that changed rapidly (and so changed between snapshots) would not be captured.
+
# Its snapshots are discrete, not continuous.  Thus information that changed rapidly (and so changed between snapshots) would not be captured.
2) Its snapshots were selective.  The web crawlers might not navigate through every subpage and link on each pass meaning that some subpages might not be archived or archived less frequency.
+
# Its snapshots are selective.  The web crawlers might not navigate through every subpage and link on each pass meaning that some subpages might not be archived or archived with less frequency.
3) Some webpages had set ups that obstructing their being archived, whether intentionally or unintentionally.
+
# Some webpages had set-ups that obstructed their being archived, whether intentionally or unintentionally.
4) Information behind an authorization wall (i.e., requiring log in) could generally not be accessed and so would not be archivable
+
# Information behind an authorization wall (i.e., requiring log in) could generally not be accessed by web crawlers and so would not be archivable.
5) Search features by and large do not operate within archived pages.  [The Internet Archive captures the front-end, i.e. various pages of the website, but not any back-end servers or databases which would be needed for certain operation like search.]
+
# Search features by and large do not operate within archived pages.  [The Internet Archive captures the front-end, i.e. various pages of the website, but not any back-end servers or databases which would be needed for certain operation like search.]
6) Some websites (such as web stores) formed certain pages by a type of query or search from their database.  This would yield several pages of results that could be paged through by a user in normal usage.  The Internet Archive may capture the first page for inquiries coming from a built in link, but capturing the second or additional pages, while possible, tends to be rare.
+
# Some websites (such as web stores) populate certain pages by a type of query or search from their database.  This would yield several pages of results that could be paged through by a user in normal usage.  The Internet Archive may capture the first page for inquiries coming from a built in link, but capturing the second or additional pages, while possible, tends to be less frequent.
7) Images were captured at times, but not always.  Also downloadable files likewise were sometimes captured, but not always.
+
# Images were captured at times, but not always.  Also downloadable files likewise were sometimes captured, but not always.
  
An interesting consequence of points 6 and 7 is that information presented as plain text is more likely to archived than information bundled into a pdf download or arranged using a more sophisticated or interactive user interface.
+
An interesting consequence of points 6 and 7 is that information presented as plain text is more likely to have been archived than information that was bundled into a pdf download or arranged using a more sophisticated or interactive user interface.
  
Following links through old pages generally leads to archived versions of the linked pages.  However the capture date for the linked page may differ slightly or even greatly from the capture date of the starting page.  Also some links will redirect to the page according to a redirect that was captured.  But some links will yield a miss non archived page message.
+
Following links in old pages shown in the Wayback Machine generally leads to archived versions of the linked pages.  However the capture date for the linked page may differ slightly or even greatly from the capture date of the starting page.  Also some links will redirect to an archived according to the redirect that was captured.  But some links will yield a 'non-archived URL' error page.
 
 
If dating the page is important, the Way Back Machine does show a task bar indicating the day the page was captured.  However, this can interact in unusual ways with webpages that use frames.  Each frame could come from captures from different times that the main overall page was captured.  However, if you view the frame source, you will find at the end remarks indicating both day that frame was captured.
 
  
 +
The Wayback machine also has a toolbar that shows the capture date of the currently viewed page and allows one to navigate to earlier or later versions of the webpage.  However, this toolbar can interact in unusual ways with webpages that use frames.  Each frame (as well as other page components) may have different capture times than the time that the main overall page may appeared to have been captured.  However, if you view the frame source, you will find embedded comments indicating the day/time that the selected frame was captured.
  
 
=Old BattleTech pages=
 
=Old BattleTech pages=
To use the Wayback Machine you need to know the URL for the old page you are trying to access.  Because of modern day redirects, it can take some sleuthing to determine where to start.  Below is a list of links to old webpages available through the WayBack Machines that are significant to BattleTech.
+
To use the Wayback Machine you need to know the URL for the old page you are trying to access.  Because of modern day redirects, it can take some sleuthing to determine where to start.  Below is a list of links to old webpages available through the Wayback Machine that are significant to BattleTech.
  
 
==FASA==
 
==FASA==
FASA Main Page 1996-2001  
+
FASA Main Page (1996-2001)
 
[https://web.archive.org/web/19961104125009/http://www.fasa.com/ Web Archive (www.fasa.com)]
 
[https://web.archive.org/web/19961104125009/http://www.fasa.com/ Web Archive (www.fasa.com)]
  
FASA BattleTech Page 1998-2001  
+
FASA BattleTech Page (1998-2001)
 
[https://web.archive.org/web/19981206041133/http://www.fasa.com/BattleTech/index.html  Web Archive (www.fasa.com/battletech/index.html)]
 
[https://web.archive.org/web/19981206041133/http://www.fasa.com/BattleTech/index.html  Web Archive (www.fasa.com/battletech/index.html)]
  
FASA Web Store/Catalog 2000-2001 [https://web.archive.org/web/20000903061928/http://store.fasa.com/ Web Archive (store.fasa.com)]
+
FASA Web Store/Catalog (2000-2001) [https://web.archive.org/web/20000903061928/http://store.fasa.com/ Web Archive (store.fasa.com)]
  
FASA Web Store/Catalog BattleTech 2000-2001  
+
FASA Web Store/Catalog BattleTech (2000-2001)
 
[https://web.archive.org/web/20000821001024/http://store.fasa.com/battletech/index.asp Web Archive (store.fasa.com/BattleTech/index.asp)]
 
[https://web.archive.org/web/20000821001024/http://store.fasa.com/battletech/index.asp Web Archive (store.fasa.com/BattleTech/index.asp)]
  
 
==WizKids==
 
==WizKids==
WizKids Main Page 2001-2008 [https://web.archive.org/web/20020213212713/http://www.wizkidsgames.com/wk_home.asp Web Archive (www.wizkidsgames.com/wk_home.asp)]
+
WizKids Main Page (2001-2008) [https://web.archive.org/web/20020213212713/http://www.wizkidsgames.com/wk_home.asp Web Archive (www.wizkidsgames.com/wk_home.asp)]
  
 
==Catalyst Game Labs==
 
==Catalyst Game Labs==
Catalyst Game Labs Main Page 2007-present [https://web.archive.org/web/20070925031026/http://www.catalystgamelabs.com/ Web Archive(catalystgamelabs.com)]
+
Catalyst Game Labs Main Page (2007-present) [https://web.archive.org/web/20070925031026/http://www.catalystgamelabs.com/ Web Archive(catalystgamelabs.com)]
  
Catalyst Game Labs Classic BattleTech Page 2007-2009 [https://web.archive.org/web/20071012090038/catalystgamelabs.com/classicbattletech/ Web Archive(catalystgamelabs.com/classicbattletech/)]
+
Catalyst Game Labs Classic BattleTech Page (2007-2009) [https://web.archive.org/web/20071012090038/catalystgamelabs.com/classicbattletech/ Web Archive(catalystgamelabs.com/classicbattletech/)]
  
Catalyst Game Labs BattleTech Page 2009-present [https://web.archive.org/web/20090628084600/catalystgamelabs.com/battletech/ Web Archive(catalystgamelabs.com/battletech/)]
+
Catalyst Game Labs BattleTech Page (2009-present) [https://web.archive.org/web/20090628084600/catalystgamelabs.com/battletech/ Web Archive(catalystgamelabs.com/battletech/)]
  
 
==Related Sites==
 
==Related Sites==
Classic BattleTech Website (FanPro 2001-2007, Catalyst Game Labs 2007-2011) www.classicbattletech.com/ [would be replaced by bg.battletech.com]
+
Classic BattleTech Website (FanPro 2001-2007, Catalyst Game Labs 2007-2011) [https://web.archive.org/web/20011026060840/http://www.classicbattletech.com Web Archive(www.classicbattletech.com/)] [would be replaced by bg.battletech.com]
  
BattleTech Board Game Website 2011-present bg.battletech.com
+
BattleTech Board Game Website (2011-present) [https://web.archive.org/web/20111029035901/http://bg.battletech.com/ Web Archive(bg.battletech.com)]
  
BattleCorps Website 2004-2017 http://www.battlecorps.com/BC2/index.html
+
BattleCorps Website (2003-2004) [https://web.archive.org/web/20031020184810/http://www.battlecorps.com/  Web Archive(www.battlecorps.com)] [Redirects to www.battlecorps.com/BC2 for 2004-2017]
 +
BattleCorps Website (2004-2017) [https://web.archive.org/web/20040806025039/http://www.battlecorps.com/BC2/ Web Archive(www.battlecorps.com/BC2)]
  
 
==Miniatures==
 
==Miniatures==
  
===Ral Partha Miniatures===
+
===Ral Partha===
Ral Partha Main Page 1998-1999 http://ralpartha.com/ralpartha/ral.html
+
Ral Partha Main Page (1998-1999) [https://web.archive.org/web/19980109024235/http://ralpartha.com/ralpartha/ral.html Web Archive(http://ralpartha.com/ralpartha/ral.html)]
  
Ral Partha BattleTech Page 1998-1999 http://ralpartha.com/cgi-bin/RalPartha/battletech.html?7
+
Ral Partha BattleTech Page (1998-1999) [https://web.archive.org/web/19980109024534/http://ralpartha.com/cgi-bin/RalPartha/battletech.html?7 Web Archive(http://ralpartha.com/cgi-bin/RalPartha/battletech.html?7)]
  
Ral Partha Main Page 1999-2001 (moved to FASA site) http://www.fasa.com/ralpartha/index.html
+
Ral Partha Main Page (1999-2001) (moved to FASA site due to acquisition by FASA) [https://web.archive.org/web/20000304200326/http://www.fasa.com/ralpartha/index.html Web Archive(http://www.fasa.com/ralpartha/index.html)]
  
 
===Iron Wind Metals===
 
===Iron Wind Metals===
Iron Wind Metals Main Page 2002-2008, 2015-present http://www.ironwindmetals.com [Note: Issues with Java made some of the early versions of the page inaccessible to Web Archive.] [Redirects active for 2008-2015]
+
Iron Wind Metals Main Page (2002-2008, 2015-present) [https://web.archive.org/web/20020105034248/http://www.ironwindmetals.com/ Web Archive(www.ironwindmetals.com)] [Redirects to www.ironwindmetals.com/d for 2008-2015]
 +
 
 +
Iron Wind Metals Main Page (2008-2015) [https://web.archive.org/web/20080907233224/http://www.ironwindmetals.com/d/ Web Archive(http://www.ironwindmetals.com/d)]
  
Iron Wind Metals Main Page 2008-2015 http://ironwindmetals.com/d/
+
Note: Issues with Java resulted in some of the early versions of Iron Winds not being archived.  Also Iron Wind Metals' web store and catalog were archived with highly varying degrees of coverage and success over the years.  
  
 
===Ral Partha Europe===
 
===Ral Partha Europe===
Ral Partha Europe Main Page 2001-present www.ralparthaeurope.co.uk
+
Ral Partha Europe Main Page (2001-present) [https://web.archive.org/web/20010810215452/http://www.ralparthaeurope.co.uk/ Web Archive(www.ralparthaeurope.co.uk)]
  
 
===Military Simulations (Australia)===
 
===Military Simulations (Australia)===
Military Simulations Main Page 2000-present www.milsims.com.au [Note: Website uses frames and frames may come from different times of capture than displayed in WebArchive toolbar.]
+
Military Simulations Main Page (2000-present) [https://web.archive.org/web/20000301030323/http://www.milsims.com.au/ Web Archive(www.milsims.com.au)]
 +
 
 +
Note: Military Simulations' website uses frames.  Each frame may used a capture from times that differ from each other and from the time displayed in WebArchive toolbar. If dating is important, then view Frame Source to identify time of capture.]
 +
 
 +
===Armorcast===
 +
Armorcast Main Page (1996-present) [https://web.archive.org/web/19961227012146/http://armorcast.com/ Web Archive (armorcast.com)] (Armorcast was acquired in 2007, BattleTech license ended in 2007)
 +
 
 +
Armorcast BattleTech Page (2007) [https://web.archive.org/web/20071024150425/http://www.timdp.members.sonic.net/battletech/ Web Archive (www.timdp.members.sonic.net/battletech/)] (page would continue to be posted for many years after)
 +
 
 +
 
 +
[[Category: Help|WebArchive Notes]]

Latest revision as of 15:47, 23 June 2021

Web Archive

The Internet Archive's Wayback Machine (https://archive.org/web/) is an archive of websites with captures going back to 1996. As many know and have experienced, information available through the web is not static. It is not uncommon for business websites to be reconfigured periodically and for old information to be removed to make the site more useful to current customers. Also, as companies close or reorganize, their websites can change or disappear. Individually run websites that serve as long-standing information caches can likewise change or disappear. However, the Internet Archive can help retrieve information that was available on a certain website at an earlier point in time.

This is a useful tool for research, but some cautions are worth noting. The Internet Archive was (and continues to be) built by taking periodic yet selective snapshots of websites via bots that crawl the web. (Archival practices are influenced by the archive's available storage and bandwidth, and also by the technical aspects of the capture process, which has itself developed over the years.) This means that its coverage can be affected by the following issues.

  1. Its snapshots are discrete, not continuous. Thus information that changed rapidly (and so changed between snapshots) would not be captured.
  2. Its snapshots are selective. The web crawlers might not navigate through every subpage and link on each pass, meaning that some subpages might not be archived or might be archived less frequently.
  3. Some webpages had setups that obstructed their being archived, whether intentionally or unintentionally.
  4. Information behind an authorization wall (i.e., requiring a login) could generally not be accessed by web crawlers and so would not be archivable.
  5. Search features by and large do not operate within archived pages. [The Internet Archive captures the front-end, i.e. various pages of the website, but not any back-end servers or databases, which would be needed for certain operations such as search.]
  6. Some websites (such as web stores) populate certain pages by a type of query or search from their database. This would yield several pages of results that could be paged through by a user in normal usage. The Internet Archive may capture the first page for queries coming from a built-in link, but captures of the second or later pages, while possible, tend to be less frequent.
  7. Images were captured at times, but not always. Also downloadable files likewise were sometimes captured, but not always.

An interesting consequence of points 6 and 7 is that information presented as plain text is more likely to have been archived than information that was bundled into a PDF download or arranged using a more sophisticated or interactive user interface.

Following links in old pages shown in the Wayback Machine generally leads to archived versions of the linked pages. However, the capture date for the linked page may differ slightly or even greatly from the capture date of the starting page. Also, some links will redirect to another archived page according to the redirect that was captured, while other links will yield a 'non-archived URL' error page.
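The size of such a gap can be checked from the archive URLs themselves, since the links listed further down this page all carry a 14-digit capture timestamp (YYYYMMDDhhmmss) right after /web/. The following is a minimal sketch in Python, not an official tool: the function name parse_capture and the regular expression are illustrative assumptions, and the two FASA URLs from the list below are used only as stand-ins for a starting page and a linked page.

```python
import re
from datetime import datetime

# Assumed URL shape, based on the links listed on this page:
# https://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_capture(url):
    """Return (capture datetime, original URL) for a Wayback Machine URL.

    Hypothetical helper for illustration; raises ValueError if the URL
    does not match the assumed shape.
    """
    match = WAYBACK_RE.match(url)
    if match is None:
        raise ValueError(f"not a recognised Wayback Machine URL: {url}")
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured, match.group(2)


# Stand-ins for a starting page and a page reached by following a link.
start_url = "https://web.archive.org/web/19961104125009/http://www.fasa.com/"
linked_url = (
    "https://web.archive.org/web/19981206041133/"
    "http://www.fasa.com/BattleTech/index.html"
)

start_time, start_original = parse_capture(start_url)
linked_time, linked_original = parse_capture(linked_url)
gap = linked_time - start_time

print(f"{start_original} captured {start_time}")
print(f"{linked_original} captured {linked_time}")
print(f"Gap between captures: {gap.days} days")
```

Comparing the two timestamps this way makes it easy to notice when a linked page was captured months or years away from the page that led to it.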

The Wayback Machine also has a toolbar that shows the capture date of the currently viewed page and allows one to navigate to earlier or later versions of the webpage. However, this toolbar can interact in unusual ways with webpages that use frames. Each frame (as well as other page components) may have a capture time that differs from the time at which the main overall page appears to have been captured. If you view the frame source, though, you will find embedded comments indicating the day and time that the selected frame was captured.

Old BattleTech pages

To use the Wayback Machine, you need to know the URL of the old page you are trying to access. Because of modern-day redirects, it can take some sleuthing to determine where to start. Below is a list of links to old webpages available through the Wayback Machine that are significant to BattleTech.
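Once the old URL is known, an archive link can be assembled by hand in the same shape as the links below: https://web.archive.org/web/, then a timestamp, then the original URL. The following is a minimal Python sketch of that pattern. The function name build_wayback_url is a hypothetical helper, and the assumption that a shortened timestamp (such as a year alone) leads the Wayback Machine to a nearby capture reflects commonly observed behavior rather than anything stated on this page.

```python
def build_wayback_url(original_url, timestamp):
    """Assemble a Wayback Machine link from an original URL and a timestamp.

    Hypothetical helper for illustration. `timestamp` is a string of
    digits in YYYYMMDDhhmmss order; a shorter prefix such as "2000" is
    accepted here on the assumption that the Wayback Machine then picks
    a capture near that date.
    """
    if not timestamp.isdigit() or not 4 <= len(timestamp) <= 14:
        raise ValueError("timestamp must be 4 to 14 digits (YYYY[MMDDhhmmss])")
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


# Rebuilds the first FASA link listed on this page.
print(build_wayback_url("http://www.fasa.com/", "19961104125009"))

# A year-only timestamp, useful when the exact capture time is unknown.
print(build_wayback_url("http://store.fasa.com/", "2000"))
```

The original URL is appended unchanged, which matches how the links on this page are written; whether a capture actually exists at or near the chosen date still has to be checked in the Wayback Machine itself.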

FASA

FASA Main Page (1996-2001) Web Archive (www.fasa.com)

FASA BattleTech Page (1998-2001) Web Archive (www.fasa.com/battletech/index.html)

FASA Web Store/Catalog (2000-2001) Web Archive (store.fasa.com)

FASA Web Store/Catalog BattleTech (2000-2001) Web Archive (store.fasa.com/BattleTech/index.asp)

WizKids

WizKids Main Page (2001-2008) Web Archive (www.wizkidsgames.com/wk_home.asp)

Catalyst Game Labs

Catalyst Game Labs Main Page (2007-present) Web Archive(catalystgamelabs.com)

Catalyst Game Labs Classic BattleTech Page (2007-2009) Web Archive(catalystgamelabs.com/classicbattletech/)

Catalyst Game Labs BattleTech Page (2009-present) Web Archive(catalystgamelabs.com/battletech/)

Related Sites

Classic BattleTech Website (FanPro 2001-2007, Catalyst Game Labs 2007-2011) Web Archive(www.classicbattletech.com/) [would be replaced by bg.battletech.com]

BattleTech Board Game Website (2011-present) Web Archive(bg.battletech.com)

BattleCorps Website (2003-2004) Web Archive(www.battlecorps.com) [Redirects to www.battlecorps.com/BC2 for 2004-2017]

BattleCorps Website (2004-2017) Web Archive(www.battlecorps.com/BC2)

Miniatures

Ral Partha

Ral Partha Main Page (1998-1999) Web Archive(http://ralpartha.com/ralpartha/ral.html)

Ral Partha BattleTech Page (1998-1999) Web Archive(http://ralpartha.com/cgi-bin/RalPartha/battletech.html?7)

Ral Partha Main Page (1999-2001) (moved to FASA site due to acquisition by FASA) Web Archive(http://www.fasa.com/ralpartha/index.html)

Iron Wind Metals

Iron Wind Metals Main Page (2002-2008, 2015-present) Web Archive(www.ironwindmetals.com) [Redirects to www.ironwindmetals.com/d for 2008-2015]

Iron Wind Metals Main Page (2008-2015) Web Archive(http://www.ironwindmetals.com/d)

Note: Issues with Java resulted in some of the early versions of the Iron Wind Metals site not being archived. Also, Iron Wind Metals' web store and catalog were archived with highly varying degrees of coverage and success over the years.

Ral Partha Europe

Ral Partha Europe Main Page (2001-present) Web Archive(www.ralparthaeurope.co.uk)

Military Simulations (Australia)

Military Simulations Main Page (2000-present) Web Archive(www.milsims.com.au)

Note: Military Simulations' website uses frames. Each frame may use a capture from a time that differs from the other frames and from the time displayed in the Wayback Machine toolbar. If dating is important, view the frame source to identify the time of capture.

Armorcast

Armorcast Main Page (1996-present) Web Archive (armorcast.com) (Armorcast was acquired in 2007, BattleTech license ended in 2007)

Armorcast BattleTech Page (2007) Web Archive (www.timdp.members.sonic.net/battletech/) (the page continued to be posted for many years afterward)