Monday, March 12, 2012

Memento


Time travel for the web...

We cannot say whether real time travel will ever be possible, but on the web it already is, at least virtually. In fact, this kind of time travel was possible even before Memento, thanks to the Internet Archive's Wayback Machine, which stores old copies of the pages it archives.

Time travel for the web means that old versions of websites become as easy to find as current ones, and that is what Memento aims for. The Internet Archive strives to archive the whole web, and various other archive sites work towards the same goal. All of these sites employ web crawlers, which periodically crawl the web in an orderly fashion.

Why Memento?

One might ask why Memento is needed when archive sites already archive the web. An example explains it. If you were to access cnn.com, you would be presented with today's version of the page. But what if you wanted to see how it looked one year ago? You would need to visit the Internet Archive's Wayback Machine, find the list of old copies of the page it has archived, and click on one of the links. And if the Internet Archive didn't have the page archived, you would have to search other web archives for an archived version. This is potentially a lot of work.

Thus, Memento...

Memento makes this effort transparent to the user and makes it simpler to find old versions of web pages. All the user has to do is open a Memento-enabled browser, visit the URL, and supply the desired date. Memento then fetches the archived page for that date and displays it. The user does not need to search manually across many archive sites.

Challenges in building Memento...

Archived copies have URIs that are, at the protocol level, disconnected from the URI of the resource whose prior state they represent. This is because HTTP, the most common web protocol, has no notion of a temporal dimension. As a result, you cannot get an archived page through the URI of its original. Instead, you have to follow a multitude of links from the original to the archived resource, or search the web archives.


Memento is a protocol-based solution to this problem: archived resources can be accessed seamlessly through the URI of their original by adding a time dimension to content negotiation.

Experience it...

You can see this in action by using the Memento add-on for the Mozilla Firefox browser. Even after you provide a date, you may not get the page exactly as it was on that date. This is not Memento's fault. It is a limitation of web archiving in general: Memento can only return the archived copy closest to the date you asked for. Of course, this works only if previous versions are available and are on servers that support the Memento framework.

How does Memento work?

Memento uses a standard feature of HTTP, namely content negotiation. Your browser performs content negotiation for every page you request, though you may not be aware of it happening. This negotiation allows a single URL to serve different representations of the same resource, depending on what the browser asks for. Usually it is the language that is negotiated for each page request. But HTTP content negotiation is not limited to arbitrating between media formats and languages. So Memento adds another dimension, date and time, to this negotiation on each page request.
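To make the idea concrete, here is a minimal sketch of what such a request could look like from the client side. It relies on the `Accept-Datetime` request header that the Memento protocol defines for date and time negotiation (RFC 7089). The helper names are hypothetical, and sending the header straight to cnn.com is only an illustration: whether a given server actually honours it depends on its Memento support. The request is built but deliberately not sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def accept_datetime_value(when: datetime) -> str:
    """Format a datetime as an HTTP date in GMT, the format Accept-Datetime expects."""
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)


def build_memento_request(url: str, when: datetime) -> Request:
    """Build (but do not send) a request asking for `url` as it was at `when`."""
    # Accept-Datetime carries the extra "time" dimension of the negotiation,
    # next to ordinary headers such as Accept-Language.
    return Request(url, headers={"Accept-Datetime": accept_datetime_value(when)})


# Hypothetical example: ask for cnn.com as it looked one year before this post.
one_year_ago = datetime(2011, 3, 12, tzinfo=timezone.utc)
request = build_memento_request("http://cnn.com/", one_year_ago)
print(request.header_items())
```

A Memento-aware server would answer with an archived copy and report the copy's actual date in a `Memento-Datetime` response header, which may differ from the date requested.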

Memento comprises both server and browser software. On a server running the Apache web server, just a few lines of code are needed to build in date and time negotiation. On the browser end, there is a provision for the user to supply the desired date and time. Of course, this requires website owners to store many more time-stamped versions of their pages. The web pages themselves, however, need no extra capability. Only the web servers need to be able to intercept the date-time requests made by the user.
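The server-side decision can be pictured as choosing, among the time-stamped versions it has stored, the one nearest to the date the user asked for. The sketch below is only an illustration of that idea, not Memento's actual server code: the function name and the sample snapshot dates are made up, and "nearest in time" is an assumed selection rule.

```python
from datetime import datetime, timezone


def closest_snapshot(snapshots, requested):
    """Return the snapshot timestamp nearest to the requested date-time."""
    if not snapshots:
        raise ValueError("no archived versions available")
    # Compare absolute distance in time, so earlier and later copies both qualify.
    return min(snapshots, key=lambda taken: abs(taken - requested))


# Hypothetical archive holding two time-stamped versions of one page.
archived = [
    datetime(2011, 2, 1, tzinfo=timezone.utc),
    datetime(2011, 9, 1, tzinfo=timezone.utc),
]
wanted = datetime(2011, 3, 12, tzinfo=timezone.utc)
print(closest_snapshot(archived, wanted))
```

This also shows why the page you get back may not match your date exactly: with only these two copies stored, a request for 12 March 2011 is served the copy from 1 February 2011.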

Who drives Memento?

Memento is a collaboration between:
  Memento is funded by the Library of Congress.

More Details...

For more technical details, visit Memento Guide Introduction.