
Save the Page, Not the Link

Updated June 8, 2026

A quarter of the webpages that existed between 2013 and 2023 are already gone. Pew Research Center measured it: "A quarter of all webpages that existed at one point between 2013 and 2023 are no longer accessible, as of October 2023" 1. Every bookmark you keep is a bet against those odds, and the odds worsen each year. So save the substance, not the link.

This is the part people miss. A bookmark is a promise made by someone else's server — that the page will still be there, unchanged, when you return. The promise breaks quietly. Pages move, sites shut down, articles get rewritten, and the link survives long after the thing it pointed to is gone. The fix is not a better bookmark manager. It is a different habit: when something matters, copy the words into a note you control.

How much of the web actually disappears?

A lot, and the share grows with a page's age. Pew found that "38% of webpages that existed in 2013 are not available today, compared with 8% of pages that existed in 2023" 2. A decade-old page is roughly five times as likely to be gone as a year-old one. Pew calls the phenomenon "digital decay."
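The "roughly five times" comparison is plain arithmetic on the two Pew percentages quoted above. A minimal sketch, with variable names chosen only for illustration:

```python
# Shares of pages no longer available, in percent, as quoted from Pew above.
gone_2013_pct = 38  # pages that existed in 2013
gone_2023_pct = 8   # pages that existed in 2023

# How many times likelier a 2013 page is to be gone than a 2023 page.
ratio = gone_2013_pct / gone_2023_pct

print(f"2013 pages are {ratio:.2f}x as likely to be gone as 2023 pages")
```

The result is a ratio of two sampled estimates, so it is best read as "about five," not as a precise multiplier.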

The decay is not confined to abandoned blogs. Pew found that "23% of news webpages contain at least one broken link, as do 21% of webpages from government sites" 3. Even the reference layer crumbles: "54% of Wikipedia pages contain at least one link in their 'References' section that points to a page that no longer exists" 4. Social posts vanish quickly as well: "Nearly one-in-five tweets are no longer publicly visible on the site just months after being posted" 5.

Doesn't this only happen to small sites?

No. It happens to the paper of record. Researchers at Harvard Law School studied every outbound link in The New York Times from 1996 to 2019, and the Berkman Klein Center summarized the result plainly: a quarter of the paper's deep links are now rotten, leading to completely inaccessible pages 6. Even careful, well-funded institutions lose their citations.

The full finding is worth quoting: "A quarter of the deep links in The New York Times' articles are now rotten, leading to completely inaccessible pages, according to a team of researchers from Harvard Law School, who worked with the Times' digital team" 6. Written up by its authors in the Columbia Journalism Review, the same study showed how rot compounds with age: "6 percent of links from 2018 had rotted, as compared to 43 percent of links from 2008 and 72 percent of links from 1998" 7. More than half the articles were touched — "Fifty-three percent of all articles that contained deep links had at least one rotted link" 8.

Scholarship fares no better. A Harvard study of legal citations "found more than 70 percent of all URLs no longer produced the information originally cited," and across all U.S. Supreme Court opinions, "50 percent of referenced URLs likewise suffered from link or reference rot" 9. The footnotes of the law itself are decaying.

Can't I just trust the Wayback Machine?

You can, and you should — but archiving the page is the archivist's answer, not yours. Web archives genuinely work. The Internet Archive describes the goal precisely: saved pages "will continue to exist even after the original page changes or is removed from the web" 10. That is real preservation, and it deserves your support.

It also assumes the archive caught the page, kept it, and stays online itself. Clare Stanton, who runs the Perma.cc preservation project at Harvard's Library Innovation Lab, frames why a frozen copy matters for anyone who cites: "If you are writing a paper for a journal, or if you're writing a brief for court, you want to make sure that the exact thing you were referencing from the internet can be seen in that exact same iteration when someone's reading your citation" 11. That is the institutional answer — back up the web, for everyone. The personal answer is smaller and entirely in your hands: keep your own copy of the part you needed.

Why copy the substance instead of the page?

Because you rarely needed the whole page. You needed one figure, one sentence, one paragraph — and that is what a note holds best. A saved page is a brittle snapshot of someone else's layout. A note is the distilled claim, in your own words, searchable and yours. The link decays; the meaning you extracted does not have to.

This is the behavioral shift the research argues for. The studies show that link rot is not a rare accident but the web's default state. The Columbia Journalism Review authors put it bluntly: "Linkrot is already blighting that record—and it's not going away on its own" 7. If decay is the baseline, then leaning on a remote URL to still resolve in five years is the risky move. Copying the substance into a durable, plain-text note you own is the conservative one.

Capture the substance the moment it matters, in a format that will outlive the page. The move is manual and takes seconds: select the part you actually need, paste it into a note, and record where it came from. Here is the durable version of that habit.

  1. Copy the quote, figure, or paragraph — not the URL alone. Paste the exact text you would want to cite later, while you can still see it.
  2. Record the source inline. Add the title, author, publication, the URL, and the date you accessed it. The URL may rot; the citation still tells a reader what you saw.
  3. Add one line in your own words. Why did this matter to you? That sentence is the part no archive can reconstruct, and the part you will actually search for.
  4. Keep it in plain text you own. A Markdown note stored on your own device survives the source site, the bookmarking app, and the company behind either. For the longer case on why plain text endures, see Plain Text Is a Love Letter.
  5. Submit the page to an archive too, if it is load-bearing. For a citation you must defend, save it to the Wayback Machine or Perma.cc as well. Belt and suspenders: the public archive for the record, your note for yourself.
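The first four steps above can be sketched as a small script that turns one excerpt into a Markdown note. This is an illustration under stated assumptions, not a prescribed tool: the `Capture` class, its field names, and the note layout are all hypothetical, and the "why it matters" line is invented for the example. The quote and citation details come from footnote 1.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Capture:
    """One saved excerpt plus the citation details from step 2 (hypothetical schema)."""
    quote: str
    title: str
    author: str
    publication: str
    url: str
    accessed: date
    why_it_matters: str


def to_markdown(c: Capture) -> str:
    """Render the capture as a plain-text Markdown note."""
    # Prefix every line of the excerpt with "> " so it reads as a blockquote.
    quoted = "\n".join(f"> {line}" for line in c.quote.strip().splitlines())
    return (
        f"# {c.title}\n\n"
        f"{quoted}\n\n"
        f"Source: {c.author}, {c.publication}. {c.url}\n"
        f"Accessed: {c.accessed.isoformat()}\n\n"
        f"Why it matters: {c.why_it_matters}\n"
    )


note = to_markdown(Capture(
    quote=("A quarter of all webpages that existed at one point between "
           "2013 and 2023 are no longer accessible, as of October 2023"),
    title="When Online Content Disappears",
    author="Chapekis, Bestvater, Remy & Rivero",
    publication="Pew Research Center",
    url="https://www.pewresearch.org/data-labs/2024/05/17/when-online-content-disappears/",
    accessed=date(2026, 6, 6),
    # Illustrative sentence only; this is the part you write yourself (step 3).
    why_it_matters="Baseline figure for how unreliable a saved link is.",
))

print(note)
```

Saving the returned string to a `.md` file on your own device covers step 4. Step 5, submitting the page to an archive, stays a separate action.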

The caveats, honestly

These numbers come from samples, not a full census of the web. Pew "sampled pages collected by Common Crawl each year from 2013 through 2023 (approximately 90,000 pages per year)" 4, which is a careful method, but it yields an estimate, not a head count. "No longer accessible" also includes pages that were moved or deliberately removed, not only sites that died.

The figures also use different yardsticks, and they should not be blended into one number. Pew sampled the open web at random; the Harvard team studied one publisher's outbound links; a separate study by Ahrefs, an SEO firm, found that "Since January 2013, 66.5% of the links pointing to the 2,062,173 websites we sampled have rotted" — but that measured backlink profiles of large domains, a different population entirely 12. Treat the Ahrefs figure as corroboration, not as "two-thirds of the web is dead." The honest claim is narrower and still alarming: across several rigorous methods, a large and growing share of what we link to is already gone.

Frequently Asked Questions

The short version: a quarter of the decade's webpages are already inaccessible, the rate climbs the older a page is, and even The New York Times, government sites, and Wikipedia's references rot. Archiving tools help, but the durable personal move is to copy the substance you needed into a plain note you own. The questions below answer the specifics people search for most.

How much of the internet disappears every year?

There is no clean annual rate, but the cumulative loss is large. Pew found that "a quarter of all webpages that existed at one point between 2013 and 2023 are no longer accessible, as of October 2023" 1, and that older pages fare worse: 38% of 2013 pages were gone versus 8% of 2023 pages 2. Decay compounds with age rather than arriving on a fixed schedule.

What percentage of links are broken?

It depends on what you measure. Of The New York Times' outbound "deep links," a Harvard study found "25 percent of all links were completely inaccessible" 7. A separate Ahrefs study of large-domain backlinks found "66.5% of the links pointing to the 2,062,173 websites we sampled have rotted" since 2013 12. The methods differ, so the numbers differ — do not merge them into one figure.

Why do links stop working?

Because a link is a pointer to someone else's server, and that target changes constantly. Pages get moved, renamed, or deleted; sites shut down; content is rewritten; companies fold. Pew calls the result "digital decay" 4. The link itself can survive long after the thing it pointed to is gone, which is why a working-looking bookmark is no guarantee the content is still there.

How long do links last?

Not as long as you think, and survival drops sharply with age. The Harvard New York Times study found "6 percent of links from 2018 had rotted, as compared to 43 percent of links from 2008 and 72 percent of links from 1998" 7. A link a few years old is fairly likely to work; a link from two decades ago is more likely dead than alive.

How do I save a webpage permanently before it disappears?

Two layers. Copy the exact quote, figure, or paragraph you need into a plain-text note you own, with the source and access date recorded inline — that survives the original site and any app. For anything you must cite or defend, also submit the page to a web archive like the Wayback Machine or Perma.cc, whose saved pages "will continue to exist even after the original page changes or is removed from the web" 10.
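For the archive layer, it can help to check whether the Wayback Machine already holds a snapshot before relying on one. The sketch below assumes the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) and the JSON shape it has returned historically, so verify both against the current documentation. The function names and the `sample` payload are illustrative, and no network call is made unless `lookup` is invoked.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str) -> str:
    """Build the availability query URL for a page."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url})}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the closest snapshot URL from a response payload, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timeout: float = 10.0) -> Optional[str]:
    """Query the live endpoint (network access required)."""
    with urlopen(availability_query(page_url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))


# Illustrative payload in the assumed response shape; not real lookup output.
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "status": "200",
            "timestamp": "20130919044612",
            "url": "http://web.archive.org/web/20130919044612/http://example.com/",
        }
    }
}

print(closest_snapshot(sample))
print(closest_snapshot({"archived_snapshots": {}}))  # no snapshot recorded
```

A `None` result is the cue to submit the page yourself. Either way, the note you own remains the copy you control.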

What is link rot?

Link rot is the tendency of hyperlinks to stop pointing to their intended content over time, because the target page has moved, changed, or vanished. It is distinct from "content drift," where the link still resolves but the page now says something different. Both forms are pervasive: even 54% of Wikipedia pages carry at least one reference link to a page that no longer exists 4.
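The difference between link rot and content drift can be expressed as a small check against a quote you saved earlier. This is a rough sketch under simple assumptions: the HTTP status and page text come from whatever client you already use, the `classify` function and its labels are hypothetical, and a "soft 404" (a not-found page served with status 200) would be reported here as content drift.

```python
from typing import Optional


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case so layout changes do not matter."""
    return " ".join(text.split()).casefold()


def classify(saved_quote: str, status: Optional[int], page_text: Optional[str]) -> str:
    """Label a link as 'link rot', 'content drift', or 'intact'.

    status is None when the request failed outright (DNS error, timeout).
    """
    # The page no longer resolves: the link has rotted.
    if status is None or status >= 400 or page_text is None:
        return "link rot"
    # The page resolves, but the passage you cited is no longer on it.
    if normalize(saved_quote) not in normalize(page_text):
        return "content drift"
    return "intact"


print(classify("a quarter of all webpages", 404, None))
print(classify("a quarter of all webpages", 200, "Our site has a new look."))
print(classify("A quarter  of all webpages", 200, "Intro. A quarter of all\nwebpages are gone."))
```

The check only works because the quote was saved in the first place, which is the habit the article argues for.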


The web is not a library you can return to. It is a tide, and a quarter of one decade's pages have already gone out with it. Keep what mattered above the waterline — in a plain note, on your own device, in words you wrote down while you could still see the source.


In MNMNOTE, notes live locally on your own device in plain Markdown, so the paragraph you saved today stays readable long after the page you found it on is gone.




References


  1. Chapekis, A., Bestvater, S., Remy, E., & Rivero, G. "When Online Content Disappears." Pew Research Center, 2024-05-17. https://www.pewresearch.org/data-labs/2024/05/17/when-online-content-disappears/ — "A quarter of all webpages that existed at one point between 2013 and 2023 are no longer accessible, as of October 2023." Accessed 2026-06-06.

  2. Pew Research Center, "When Online Content Disappears," 2024-05-17. https://www.pewresearch.org/data-labs/2024/05/17/when-online-content-disappears/ — "38% of webpages that existed in 2013 are not available today, compared with 8% of pages that existed in 2023." Accessed 2026-06-06.

  3. Pew Research Center, "When Online Content Disappears," 2024-05-17. https://www.pewresearch.org/data-labs/2024/05/17/when-online-content-disappears/ — "23% of news webpages contain at least one broken link, as do 21% of webpages from government sites." Accessed 2026-06-06.

  4. Pew Research Center, "When Online Content Disappears," 2024-05-17. https://www.pewresearch.org/data-labs/2024/05/17/when-online-content-disappears/ — "54% of Wikipedia pages contain at least one link in their 'References' section that points to a page that no longer exists"; methodology: "sampled pages collected by Common Crawl each year from 2013 through 2023 (approximately 90,000 pages per year)"; framing term "digital decay." Accessed 2026-06-06.

  5. Pew Research Center, "When Online Content Disappears," 2024-05-17. https://www.pewresearch.org/data-labs/2024/05/17/when-online-content-disappears/ — "Nearly one-in-five tweets are no longer publicly visible on the site just months after being posted." Accessed 2026-06-06.

  6. Berkman Klein Center for Internet & Society. "New research shows how many important links on the web get lost to time." cyber.harvard.edu, last updated 2021-05-24. https://cyber.harvard.edu/story/2021-05/new-research-shows-how-many-important-links-web-get-lost-time — "A quarter of the deep links in The New York Times' articles are now rotten, leading to completely inaccessible pages, according to a team of researchers from Harvard Law School, who worked with the Times' digital team." Accessed 2026-06-06.

  7. Bowers, J., Stanton, C., & Zittrain, J. "What the ephemerality of the Web means for your hyperlinks." Columbia Journalism Review, 2021-05-21. https://www.cjr.org/analysis/linkrot-content-drift-new-york-times.php — "Of these deep links, 25 percent of all links were completely inaccessible"; "6 percent of links from 2018 had rotted, as compared to 43 percent of links from 2008 and 72 percent of links from 1998"; "Linkrot is already blighting that record—and it's not going away on its own." Accessed 2026-06-06.

  8. Bowers, J., Stanton, C., & Zittrain, J. "What the ephemerality of the Web means for your hyperlinks." Columbia Journalism Review, 2021-05-21. https://www.cjr.org/analysis/linkrot-content-drift-new-york-times.php — "Fifty-three percent of all articles that contained deep links had at least one rotted link." Accessed 2026-06-06.

  9. Harvard Center for the Legal Profession, "Pausing the Internet" (reporting Zittrain, Albert & Lessig, "Perma: Scoping and Addressing the Problem of Link and Reference Rot in Legal Citations," Harvard Law Review Forum vol. 127, 2014). https://clp.law.harvard.edu/knowledge-hub/magazine/issues/the-evolution-of-law-libraries/pausing-the-internet/ — "Within their sample of academic journals, the authors found more than 70 percent of all URLs no longer produced the information originally cited. In surveying all published Supreme Court opinions, they found that 50 percent of referenced URLs likewise suffered from link or reference rot." Accessed 2026-06-06.

  10. Internet Archive Help Center. "Save Pages in the Wayback Machine." help.archive.org. https://help.archive.org/help/save-pages-in-the-wayback-machine/ — "These saved pages can be cited, shared, linked to – and they will continue to exist even after the original page changes or is removed from the web." Accessed 2026-06-06.

  11. Harvard Center for the Legal Profession, "Pausing the Internet." https://clp.law.harvard.edu/knowledge-hub/magazine/issues/the-evolution-of-law-libraries/pausing-the-internet/ — Clare Stanton (Perma.cc / Harvard Library Innovation Lab): "If you are writing a paper for a journal, or if you're writing a brief for court, you want to make sure that the exact thing you were referencing from the internet can be seen in that exact same iteration when someone's reading your citation." Accessed 2026-06-06.

  12. Stox, P. "At Least 66.5% of Links to Sites in the Last 9 Years Are Dead (Ahrefs Study on Link Rot)." Ahrefs, 2022-04-29 (modified 2024-02-02). https://ahrefs.com/blog/link-rot-study/ — "Since January 2013, 66.5% of the links pointing to the 2,062,173 websites we sampled have rotted." Accessed 2026-06-06.