Archive.is

[Other]

Archive.is is a time capsule for web pages.


It takes a ‘snapshot’ of a webpage that will always be online even if the original page disappears. It saves a text and a graphical copy of the page for better accuracy and provides a short and reliable link to an unalterable record of any web page, including those from Web 2.0 sites:

https://archive.ph/2020.04.21/rt.live/
https://archive.ph/2014.06.26/google.com/maps/…

This can be useful if you want to take a ‘snapshot’ of a page which could change soon: price list, job offer, real estate listing, drunk blog post, …

Saved pages will have no active elements and no scripts, so they keep you safe as they cannot have any popups or malware!

FAQ
===

Which parts of a web page are saved?
- Textual content of the web page.
- Images.
- Content of the frames.
- Content and images loaded or generated by Javascript on Web 2.0 sites.
- Screenshot of 1024x768 pixels.

Which parts of a web page are not saved?
- Flash and content loaded by Flash.
- Video and sounds. It makes no sense to archive youtube.com unless you want to archive the title of the video and the comments. The video itself will not be saved.
- PDF.
- RSS and other XML pages are not saved reliably. Most of them are not saved or are saved as a blank page.

How long does it take to make a snapshot?
About the same time as it takes to load the page in your browser. However, saving pages with heavy scripts or pages full of ads may take up to a few minutes. There is a 5-minute timeout: if the page is not fully loaded within 5 minutes, the save is considered failed. This does not happen often, but it happens.

Is there a limit on the page size?
The stored page with all images must be smaller than 50 MB.

What software do you run and how is data stored?
The archive runs Apache Hadoop and Apache Accumulo. All data is stored on HDFS; textual content is duplicated 3 times among servers in different datacenters and images are duplicated 2 times. All datacenters are in Europe.

How long will the page be stored?
Virtually forever.
We have a lot of free space, and although the archive grows with time, storage and bandwidth get cheaper.

Do you delete my stored page(s)?
Pages which violate our hoster’s rules (cracks, porn, etc.) may be deleted. Also, completely empty pages (or pages which have nothing but text like “502 Server Timeout”) may be deleted.

How is the archive funded?
It is privately funded; there are no complex finances behind it. It may look more or less reliable compared to startup-style funding or a university project, depending on which risks are taken into account.

Will advertising appear on the archive one day?
I cannot promise that it will not. With the current growth rate I am able to keep the archive free of ads. Well, I can promise it will have no ads at least till the end of 2014.

How to refer to the saved page?
Each page has a short URL http://archive.is/XXXXX, where XXXXX is the unique identifier of a page. The page can also be referred to with URLs like:
- http://archive.is/2013/http://www.google.de/ - the newest snapshot in year 2013.
- http://archive.is/201301/http://www.google.de/ - the newest snapshot in January 2013.
- http://archive.is/20130101/http://www.google.de/ - the newest snapshot within the day of 1st January 2013.

The date can be extended further with hours, minutes and seconds:
- http://archive.is/2013010103/http://www.google.de/
- http://archive.is/201301010313/http://www.google.de/
- http://archive.is/20130101031355/http://www.google.de/

Year, month, day, hours, minutes and seconds can be separated with dots, dashes or colons to increase readability:
- http://archive.is/2013-04-17/http://blog.bo.lt/
- http://archive.is/2013.04.17-12:08:20/http://blog.bo.lt/

It is also possible to refer to all snapshots of a given URL:
- http://archive.is/http://www.google.de/

All saved pages from the domain:
- http://archive.is/www.google.de

All saved pages from all the subdomains:
- http://archive.is/*.google.de

Is there a way to link to the most recent archive of an article by including the URL in an archive.is link?
Yes.
- http://archive.is/newest/http://reddit.com/

There is also
- http://archive.is/oldest/http://reddit.com/

How to refer to an exact part of a long page?
There are two options:
- Add a hashtag with the scroll position as a number between 0 (top of the page) and 100 (bottom). For example http://archive.is/FWVL#40%
- Select some text on the page and get a URL with a hashtag referring to the selection. For example http://archive.is/FWVL#selection-1493.0-1493.53

Does it support any API?
archive.is supports the MementoWeb API. More info can be found here.

Can I have an account to manage my bookmarks?
No. But you can keep bookmarks to archived pages in one of the existing bookmark managers, like Delicious, Google Bookmarks, …

Why does archive.is not obey robots.txt?
Because it is not a free-walking crawler; it saves only one page, acting as a direct agent of the human user. Such services don’t obey robots.txt (e.g. Google Feedfetcher, screenshot- or pdf-making services, isup.me, …).

Is IPv6 supported?
Yes.
- http://archive.is/[2A00:1450:400C:C00::69]
- http://archive.is/ipv6.google.com

Are domains with national characters supported?
Yes.
- http://archive.is/www.maroñas.com.uy
- http://archive.is/*.测试

Do you preserve archivers’ privacy? E.g. not disclose the source IP address?
Yes. But keep in mind that when you archive a page, your IP is sent to the website you archive, as though you were using a proxy (in the X-Forwarded-For header). This feature allows websites (e.g. shops or sites with weather forecasts) to target your region, not mine.

I found incorrect/inaccurate/obsolete information. Can I request it to be altered or deleted?
The archive is not a news agency nor an authoritative source of reference information. It merely certifies that at the given point in time there was a page on the web. The page might well contain a fairy tale, and although “One day Little Red Riding Hood goes to visit her granny” is a false statement, that is no reason to burn the books. Note that weather forecasts on the archived pages are outdated as well.

My question is not here!
More questions and answers: http://blog.archive.is/archive
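
The date-based and newest/oldest link forms described in the FAQ follow one pattern: base URL, an optional selector, then the original URL. A minimal sketch in Python, assuming only the URL shapes listed above; the function name `snapshot_url` and its parameters are illustrative and not part of the service:

```python
BASE = "http://archive.is"  # base URL as it is written in the FAQ

# Date prefixes in the FAQ have year, month, day, hour, minute or second
# precision, i.e. 4, 6, 8, 10, 12 or 14 digits.
VALID_PREFIX_LENGTHS = (4, 6, 8, 10, 12, 14)


def snapshot_url(target, when=None):
    """Build a link to snapshots of `target` (hypothetical helper).

    when=None              -> all snapshots of the URL
    when="newest"/"oldest" -> most recent / earliest snapshot
    when="2013", "201301"  -> newest snapshot within that period
    """
    if when is None:
        return f"{BASE}/{target}"
    if when in ("newest", "oldest"):
        return f"{BASE}/{when}/{target}"
    if when.isascii() and when.isdigit() and len(when) in VALID_PREFIX_LENGTHS:
        return f"{BASE}/{when}/{target}"
    raise ValueError(f"unsupported snapshot selector: {when!r}")


if __name__ == "__main__":
    print(snapshot_url("http://www.google.de/", "201301"))
    print(snapshot_url("http://reddit.com/", "newest"))
```

The helper only assembles strings; it does not check whether a snapshot actually exists for the requested period.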
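
Because the FAQ allows dots, dashes and colons as separators inside the date part, a readable timestamp such as 2013.04.17-12:08:20 and its compact form refer to the same moment. The sketch below shows one way to reduce the readable form to plain digits and sanity-check it; `compact_timestamp` is a hypothetical name, and the validation rules are an assumption based on the examples above, not the archive's own parser:

```python
import re
from datetime import datetime

# strptime formats keyed by the number of digits in the compact timestamp.
_FORMATS = {
    4: "%Y",
    6: "%Y%m",
    8: "%Y%m%d",
    10: "%Y%m%d%H",
    12: "%Y%m%d%H%M",
    14: "%Y%m%d%H%M%S",
}


def compact_timestamp(text):
    """Strip '.', '-' and ':' separators and validate the remaining digits."""
    digits = re.sub(r"[.\-:]", "", text)
    fmt = _FORMATS.get(len(digits))
    if fmt is None or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a recognised timestamp: {text!r}")
    # strptime raises ValueError for impossible dates such as month 13.
    datetime.strptime(digits, fmt)
    return digits


if __name__ == "__main__":
    print(compact_timestamp("2013.04.17-12:08:20"))
```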
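
The scroll-position option for linking to part of a long page can be sketched the same way. This assumes only the `#40%` form shown in the FAQ; the helper name `scroll_link` is illustrative, and the selection-based form is left out because the FAQ does not explain how its numbers are derived:

```python
def scroll_link(snapshot, percent):
    """Append a scroll-position fragment (0 = top, 100 = bottom)."""
    if not 0 <= percent <= 100:
        raise ValueError("scroll position must be between 0 and 100")
    return f"{snapshot}#{percent}%"


if __name__ == "__main__":
    print(scroll_link("http://archive.is/FWVL", 40))
```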
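
The FAQ states that the MementoWeb API is supported but does not give endpoint addresses. In the Memento protocol (RFC 7089) a client asks for a snapshot near a given time by sending an Accept-Datetime request header with an HTTP-date value. The sketch below only builds that header; the function name is hypothetical, and the request URL is deliberately omitted because it is not given in the text above:

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_header(moment):
    """Return a Memento Accept-Datetime header for a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    # Memento uses the HTTP-date format, always expressed in GMT.
    http_date = format_datetime(moment.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": http_date}


if __name__ == "__main__":
    when = datetime(2013, 1, 1, 3, 13, 55, tzinfo=timezone.utc)
    print(accept_datetime_header(when))
```

The resulting dictionary can be passed as request headers to any HTTP client once the service's actual Memento endpoint is known.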

Reviews (1) (Average Rating 5.0 / 5.0)


Roosevelt D. Frankli, 2023-09-19

Cool story bro