
Archive Team: Difference between revisions

From Wikipedia, the free encyclopedia
Tech234a (talk | contribs)
Undid revision 1190890993 by 46.1.106.37 (talk)
Updated current projects
Tags: Visual edit Mobile edit Mobile web edit Advanced mobile edit
Line 14:


* [[Imgur]]: The image host Imgur updated their terms of service on April 19, 2023. This update focused on removing old, unused, and inactive content that is not tied to a user account, along with NSFW content.<ref>{{Cite web |title=Imgur Terms of Service Update |url=https://help.imgur.com/hc/en-us/articles/14415587638029-Imgur-Terms-of-Service-Update |url-status=live |archive-url=https://web.archive.org/web/20230531003216/https://help.imgur.com/hc/en-us/articles/14415587638029-Imgur-Terms-of-Service-Update-April-19-2023- |archive-date=31 May 2023 |access-date=9 June 2023 |website=Imgur Help}}</ref>
* [[Issuu]]: On December 20, 2022, Issuu changed their pricing page to reduce the limits of free accounts from 500 pages and 100 MB per upload to 50 pages and 50 MB.<ref>{{Cite web |title=Issuu - Archiveteam |url=https://wiki.archiveteam.org/index.php/Issuu |access-date=2023-06-09 |website=wiki.archiveteam.org |archive-date=2023-05-17 |archive-url=https://web.archive.org/web/20230517174953/https://wiki.archiveteam.org/index.php/Issuu |url-status=live }}</ref>
* [[Blogger (service)|Blogger]]: In May 2023, Google announced that inactive accounts would be deleted starting on 2023-12-01 across their platform, including Blogger blogs.<ref>{{Cite web |title=Blogger - Archiveteam |url=https://wiki.archiveteam.org/index.php/Blogger |access-date=2024-01-02 |website=wiki.archiveteam.org}}</ref>
* [[Reddit]]: Reddit Inc. has been banning communities that generate bad PR for the company and restricted access to its APIs and data on June 19, 2023.<ref>{{Cite web |last=KeyserSosa |date=2023-04-18 |title=An Update Regarding Reddit's API |url=http://www.reddit.com/r/reddit/comments/12qwagm/an_update_regarding_reddits_api/ |access-date=2023-06-09 |website=r/reddit |archive-date=2023-06-08 |archive-url=https://web.archive.org/web/20230608234424/https://www.reddit.com/r/reddit/comments/12qwagm/an_update_regarding_reddits_api/ |url-status=live }}</ref>
* [[Russian invasion of Ukraine]]: Archiving various [[.ua]] sites in the wake of the Russian government's invasion.<ref>{{Cite web |title=.ua - Archiveteam |url=https://wiki.archiveteam.org/index.php/.ua |access-date=2023-06-09 |website=wiki.archiveteam.org |archive-date=2023-03-23 |archive-url=https://web.archive.org/web/20230323013739/https://wiki.archiveteam.org/index.php/.ua |url-status=live }}</ref>
* [[Telegram (software)|Telegram]]: Archiving public messages in various newsworthy and/or otherwise notable Telegram channels.<ref>{{Cite web |title=Telegram - Archiveteam |url=https://wiki.archiveteam.org/index.php/Telegram |access-date=2023-06-09 |website=wiki.archiveteam.org |archive-date=2023-05-29 |archive-url=https://web.archive.org/web/20230529001705/https://wiki.archiveteam.org/index.php/Telegram |url-status=live }}</ref>
* [[GitHub]]: When it was bought by Microsoft in 2018, many archivists and users were worried the site would become more restrictive. This project archives the UI parts of GitHub and the code of each repository.<ref>{{Cite web |title=GitHub - Archiveteam |url=https://wiki.archiveteam.org/index.php/GitHub |access-date=2023-06-09 |website=wiki.archiveteam.org |archive-date=2023-05-27 |archive-url=https://web.archive.org/web/20230527000727/https://wiki.archiveteam.org/index.php/GitHub |url-status=live }}</ref>
* [[MediaFire]]: On 2020-12-18, users reported receiving emails from MediaFire stating that it planned to classify accounts as abandoned if they failed to meet certain criteria, starting in January.<ref>{{Cite web |title=MediaFire - Archiveteam |url=https://wiki.archiveteam.org/index.php/MediaFire |access-date=2024-01-02 |website=wiki.archiveteam.org}}</ref>
* [[Google Drive]]: As of September 13, 2021, Google started requiring that, in order to access files and folders, users either have permissions tied to their signed-in Google Accounts or access the item through a URL with a random per-item parameter called resourceKey.<ref>{{Cite web |title=Security update for Google Drive - Google Drive Help |url=https://support.google.com/drive/answer/10729743 |access-date=2023-06-09 |website=support.google.com |archive-date=2023-03-31 |archive-url=https://web.archive.org/web/20230331101442/https://support.google.com/drive/answer/10729743 |url-status=live }}</ref> As a result, millions of links across the Web will effectively break.
* Coronavirus Outbreak: Documenting and preserving data, events, and impacts of [[COVID-19]] on society.<ref>{{Cite web |title=Coronavirus - Archiveteam |url=https://wiki.archiveteam.org/index.php/Coronavirus |access-date=2023-06-09 |website=wiki.archiveteam.org |archive-date=2023-06-09 |archive-url=https://web.archive.org/web/20230609170016/https://wiki.archiveteam.org/index.php/Coronavirus |url-status=live }}</ref>
* [[YouTube]]: Saving metadata, thumbnails, comments and selected videos. Videos and channels are limited to channels that may be deleted because the company went bankrupt, the channel owner died, or YouTube is banning certain content, as well as channels related to world events and politics.<ref>{{Cite web |title=YouTube - Archiveteam |url=https://wiki.archiveteam.org/index.php/YouTube |access-date=2024-01-02 |website=wiki.archiveteam.org |language=en}}</ref>
* WikiTeam: Saving wiki [[XML]] dumps.<ref>{{Cite web |title=WikiTeam - Archiveteam |url=https://wiki.archiveteam.org/index.php/WikiTeam |access-date=2024-01-02 |website=wiki.archiveteam.org}}</ref>
* URLTeam: Saving [[URL shortening|URL shorteners]].<ref>{{Cite web |title=URLTeam - Archiveteam |url=https://wiki.archiveteam.org/index.php/URLTeam |access-date=2024-01-02 |website=wiki.archiveteam.org |language=en}}</ref>
* URLs: Archiving [[URL|URLs]] from various sources.<ref>{{Cite web |title=URLs - Archiveteam |url=https://wiki.archiveteam.org/index.php/URLs |access-date=2024-01-02 |website=wiki.archiveteam.org}}</ref>


{{As of|2023|June|9}}, the largest project on ArchiveTeam is [[Reddit]], with over 2.82 petabytes archived.<ref>{{Cite web |title=Reddit tracker Dashboard |url=https://tracker.archiveteam.org/reddit/ |access-date=2023-06-09 |website=tracker.archiveteam.org |archive-date=2023-05-23 |archive-url=https://web.archive.org/web/20230523211101/https://tracker.archiveteam.org/reddit/ |url-status=live }}</ref>

Revision as of 19:01, 2 January 2024


Archive Team is a group dedicated to digital preservation and web archiving that was co-founded by Jason Scott in 2009.[1][2]

Its primary focus is the copying and preservation of content housed by at-risk online services. Some of its projects include the partial preservation of GeoCities,[3][4] Yahoo! Video, Google Video, Splinder, Friendster, FortuneCity,[5][6][7][8][9][10][11][12][excessive citations] TwitPic,[13] SoundCloud,[14] and the "Aaron Swartz Memorial JSTOR Liberator".[15] Archive Team also archives URL shortener services[16] and wikis[17] on a regular basis.

According to Jason Scott, "Archive Team was started out of anger and a feeling of powerlessness, this feeling that we were letting companies decide for us what was going to survive and what was going to die."[18] Scott continues, "it's not our job to figure out what's valuable, to figure out what's meaningful. We work by three virtues: rage, paranoia and kleptomania."[19]

Warrior/Tracker system

Archive Team is composed of a loose community of independent contributors/users.[citation needed] Their archival process makes use of a "Warrior", a virtual machine environment. Individuals run the Warrior in their desktop environments to download content without requiring technical expertise. Tasks are allocated by a centrally managed Tracker that networks with and allocates items to Warriors. The Tracker also monitors user upload activity and displays a leaderboard.[20]
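
The division of labour described above can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not Archive Team's actual software: the class name `Tracker`, its methods, and the item and user names are all hypothetical, and a real tracker would communicate with Warriors over a network rather than through in-process calls.

```python
from collections import Counter, deque


class Tracker:
    """Toy central tracker (hypothetical): hands out items, tallies uploads."""

    def __init__(self, items):
        self.queue = deque(items)   # items not yet handed out
        self.claimed = {}           # item -> warrior currently working on it
        self.uploaded = Counter()   # warrior -> total bytes uploaded

    def request_item(self, warrior):
        """Give the next unclaimed item to a warrior, or None if none remain."""
        if not self.queue:
            return None
        item = self.queue.popleft()
        self.claimed[item] = warrior
        return item

    def report_done(self, warrior, item, size_bytes):
        """Record a finished item and credit the warrior's upload total."""
        if self.claimed.get(item) != warrior:
            raise ValueError(f"{item!r} is not claimed by {warrior!r}")
        del self.claimed[item]
        self.uploaded[warrior] += size_bytes

    def leaderboard(self):
        """Return (warrior, bytes) pairs, largest uploader first."""
        return self.uploaded.most_common()


# Two hypothetical warriors work through three items.
tracker = Tracker(["item-1", "item-2", "item-3"])
first = tracker.request_item("alice")
second = tracker.request_item("bob")
tracker.report_done("alice", first, 500)
tracker.report_done("bob", second, 200)
third = tracker.request_item("alice")
tracker.report_done("alice", third, 100)
board = tracker.leaderboard()
print(board)
```

In this sketch the tracker is the only component that knows which items remain, so each Warrior needs no configuration beyond asking for work and reporting back, which is consistent with the article's point that contributors do not need technical expertise.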

Projects

There are several projects currently running:

  • Imgur: The image host Imgur updated their terms of service on April 19, 2023. This update focused on removing old, unused, and inactive content that is not tied to a user account, along with NSFW content.[21]
  • Blogger: In May 2023, Google announced that inactive accounts would be deleted starting on 2023-12-01 across their platform, including Blogger blogs.[22]
  • Reddit: Reddit Inc. has been banning communities that generate bad PR for the company and restricted access to its APIs and data on June 19, 2023.[23]
  • Russian invasion of Ukraine: Archiving various .ua sites in the wake of the Russian government's invasion.[24]
  • Telegram: Archiving public messages in various newsworthy and/or otherwise notable Telegram channels.[25]
  • GitHub: When it was bought by Microsoft in 2018, many archivists and users were worried the site would become more restrictive. This project archives the UI parts of GitHub and the code of each repository.[26]
  • MediaFire: On 2020-12-18, users reported receiving emails from MediaFire stating that it planned to classify accounts as abandoned if they failed to meet certain criteria, starting in January.[27]
  • Coronavirus Outbreak: Documenting and preserving data, events, and impacts of COVID-19 on society.[28]
  • YouTube: Saving metadata, thumbnails, comments and selected videos. Videos and channels are limited to channels that may be deleted because the company went bankrupt, the channel owner died, or YouTube is banning certain content, as well as channels related to world events and politics.[29]
  • WikiTeam: Saving wiki XML dumps.[30]
  • URLTeam: Saving URL shorteners.[31]
  • URLs: Archiving URLs from various sources.[32]

As of 9 June 2023, the largest project on ArchiveTeam is Reddit, with over 2.82 petabytes archived.[33]

See also

References

  1. ^ Scott, Jason (January 6, 2009). "Team Archive is GO". ASCII by Jason Scott. Archived from the original on 2016-11-02. Retrieved December 30, 2016.
  2. ^ "Revision history of "Main Page"". Archive Team. Archived from the original on 2016-12-31. Retrieved December 30, 2016.
  3. ^ Gilbertson, Scott (2010-11-01). "Geocities Lives On as Massive Torrent Download". Wired. Archived from the original on 2012-04-25.
  4. ^ Modine, Austin (2009-04-28). "Web 0.2 archivists save Geocities from deletion". The Register. Archived from the original on 2012-05-03.
  5. ^ Sullivan, Mark (2012-04-13). "The 'Archive Team' Rescues User Content From Doomed Sites". PC World. Archived from the original on 2012-04-20.
  6. ^ Schwartz, Matt (January 2012). "Fire in the Library". Technology Review. Archived from the original on 2012-01-24.
  7. ^ Garfield, Bob; Scott, Jason (2012-03-23). "The Archive Team". OnTheMedia. Archived from the original on 2012-04-27. Retrieved 2012-04-19.
  8. ^ Masnick, Mike (2012-04-12). "Historic Archive Of Websites From The January 18th SOPA Blackout". Techdirt. Archived from the original on 2012-04-15.
  9. ^ Scott, Jason (2012-03-06). "Click: The Archive Team - Jason Scott talks about his mission to salvage our digital heritage". BBC. Archived from the original on 2015-04-03.
  10. ^ Morton, Simon; Scott, Jason (2012-03-03). "The Archive Team". RadioNZ. Archived from the original on 2012-04-21.
  11. ^ Misener, Dan (2011-04-29). "Full Interview: Jason Scott on online video and digital heritage". CBC. Archived from the original on 2012-10-26.
  12. ^ Paul-Choudhury, Sumit (May 6, 2011). "Amateur heroes of online heritage". New Scientist. Archived from the original on April 2, 2015. Retrieved March 9, 2015.
  13. ^ "TwitPic - Archiveteam". Archived from the original on 2014-09-09. Retrieved 2014-09-17.
  14. ^ "Archive Team promises to back up SoundCloud amid worries of a shutdown". 2017-07-18. Archived from the original on 2018-10-21. Retrieved 2018-11-28.
  15. ^ "Aaron Swartz Memorial JSTOR Liberator sets public domain academic articles free". 2013-01-15. Archived from the original on 2018-03-23. Retrieved 2018-11-28.
  16. ^ "url shortening was a fucking awful idea". URLTE.AM. Archived from the original on 2011-06-11.
  17. ^ "WikiTeam: We archive wikis, from Wikipedia to tiniest wikis". Archived from the original on 2016-02-10.
  18. ^ "Open Source Bridge 2012 Keynote - Jason Scott". YouTube. Archived from the original on 2017-09-14. Retrieved 2018-11-28.
  19. ^ "Open Source Bridge 2012 Keynote - Jason Scott". YouTube. Archived from the original on 2017-09-14. Retrieved 2018-11-28.
  20. ^ Ogden, Jessica (October 21, 2021). ""Everything on the internet can be saved": Archive Team, Tumblr and the cultural significance of web archiving". Internet Histories. 6 (1–2): 113–132. doi:10.1080/24701475.2021.1985835. S2CID 239510759.
  21. ^ "Imgur Terms of Service Update". Imgur Help. Archived from the original on 31 May 2023. Retrieved 9 June 2023.
  22. ^ "Blogger - Archiveteam". wiki.archiveteam.org. Retrieved 2024-01-02.
  23. ^ KeyserSosa (2023-04-18). "An Update Regarding Reddit's API". r/reddit. Archived from the original on 2023-06-08. Retrieved 2023-06-09.
  24. ^ ".ua - Archiveteam". wiki.archiveteam.org. Archived from the original on 2023-03-23. Retrieved 2023-06-09.
  25. ^ "Telegram - Archiveteam". wiki.archiveteam.org. Archived from the original on 2023-05-29. Retrieved 2023-06-09.
  26. ^ "GitHub - Archiveteam". wiki.archiveteam.org. Archived from the original on 2023-05-27. Retrieved 2023-06-09.
  27. ^ "MediaFire - Archiveteam". wiki.archiveteam.org. Retrieved 2024-01-02.
  28. ^ "Coronavirus - Archiveteam". wiki.archiveteam.org. Archived from the original on 2023-06-09. Retrieved 2023-06-09.
  29. ^ "YouTube - Archiveteam". wiki.archiveteam.org. Retrieved 2024-01-02.
  30. ^ "WikiTeam - Archiveteam". wiki.archiveteam.org. Retrieved 2024-01-02.
  31. ^ "URLTeam - Archiveteam". wiki.archiveteam.org. Retrieved 2024-01-02.
  32. ^ "URLs - Archiveteam". wiki.archiveteam.org. Retrieved 2024-01-02.
  33. ^ "Reddit tracker Dashboard". tracker.archiveteam.org. Archived from the original on 2023-05-23. Retrieved 2023-06-09.