Adactio: Tags—access

Denial

The Wikimedia Foundation, stewards of the finest projects on the web, have written about the hammering their servers are taking from the scraping bots that feed large language models.

Our infrastructure is built to sustain sudden traffic spikes from humans during high-interest events, but the amount of traffic generated by scraper bots is unprecedented and presents growing risks and costs.

Drew DeVault puts it more bluntly, saying Please stop externalizing your costs directly into my face:

Over the past few months, instead of working on our priorities at SourceHut, I have spent anywhere from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale.

And no, a robots.txt file doesn’t help.

If you think these crawlers respect robots.txt then you are several assumptions of good faith removed from reality. These bots crawl everything they can find, robots.txt be damned.

Free and open source projects are particularly vulnerable. FOSS infrastructure is under attack by AI companies:

LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.

You try to do the right thing by making knowledge and tools freely available. This is how you get repaid. AI bots are destroying Open Access:

There’s a war going on on the Internet. AI companies with billions to burn are hard at work destroying the websites of libraries, archives, non-profit organizations, and scholarly publishers, anyone who is working to make quality information universally available on the internet.

My own experience with The Session bears this out.

Ars Technica has a piece on this: Open source devs say AI crawlers dominate traffic, forcing blocks on entire countries .

So does MIT Technology Review: AI crawler wars threaten to make the web more closed for everyone.

When we talk about the unfair practices and harm done by training large language models, we usually talk about it in the past tense: how they were trained on other people’s creative work without permission. But this is an ongoing problem that’s just getting worse.

The worst of the internet is continuously attacking the best of the internet. This is a distributed denial of service attack on the good parts of the World Wide Web.

If you’re using the products powered by these attacks, you’re part of the problem. Don’t pretend it’s cute to ask ChatGPT for something. Don’t pretend it’s somehow being technologically open-minded to continuously search for nails to hit with the latest “AI” hammers.

If you’re going to use generative tools powered by large language models, don’t pretend you don’t know how your sausage is made.

Open source devs say AI crawlers dominate traffic, forcing blocks on entire countries - Ars Technica

As it currently stands, both the rapid growth of AI-generated content overwhelming online spaces and aggressive web-crawling practices by AI firms threaten the sustainability of essential online resources. The current approach taken by some large AI companies—extracting vast amounts of data from open-source projects without clear consent or compensation—risks severely damaging the very digital ecosystem on which these AI models depend.

Go To Hellman: AI bots are destroying Open Access

AI companies with billions to burn are hard at work destroying the websites of libraries, archives, non-profit organizations, and scholarly publishers, anyone who is working to make quality information universally available on the internet.

“Wait, not like that”: Free and open access in the age of generative AI

Anyone at an AI company who stops to think for half a second should be able to recognize they have a vampiric relationship with the commons. While they rely on these repositories for their sustenance, their adversarial and disrespectful relationships with creators reduce the incentives for anyone to make their work publicly available going forward (freely licensed or otherwise). They drain resources from maintainers of those common repositories often without any compensation.

Even if AI companies don’t care about the benefit to the common good, it shouldn’t be hard for them to understand that by bleeding these projects dry, they are destroying their own food supply.

And yet many AI companies seem to give very little thought to this, seemingly looking only at the months in front of them rather than operating on years-long timescales. (Though perhaps anyone who has observed AI companies’ activities more generally will be unsurprised to see that they do not act as though they believe their businesses will be sustainable on the order of years.)

It would be very wise for these companies to immediately begin prioritizing the ongoing health of the commons, so that they do not wind up strangling their golden goose. It would also be very wise for the rest of us to not rely on AI companies to suddenly, miraculously come to their senses or develop a conscience en masse.

Instead, we must ensure that mechanisms are in place to force AI companies to engage with these repositories on their creators’ terms.

The web was always about redistribution of power. Let’s bring that back.

Many of us got excited about technology because of the web, and are discovering, latterly, that it was always the web itself — rather than technology as a whole — that we were excited about. The web is a movement: more than a set of protocols, languages, and software, it was always about bringing about a social and cultural shift that removed traditional gatekeepers to publishing and being heard.

Century-Scale Storage

This magnificent piece by Maxwell Neely-Cohen—with some tasteful art-direction—is right up my alley!

This piece looks at a single question. If you, right now, had the goal of digitally storing something for 100 years, how should you even begin to think about making that happen? How should the bits in your stewardship be stored with such a target in mind? How do our methods and platforms look when considered under the harsh unknowns of a century? There are plenty of worthy related subjects and discourses that this piece does not touch at all. This is not a piece about the sheer volume of data we are creating each day, and how we might store all of it. Nor is it a piece about the extremely tough curatorial process of deciding what is and isn’t worth preserving and storing. It is about longevity, about the potential methods of preserving what we make for future generations, about how we make bits endure. If you had to store something for 100 years, how would you do it? That’s it.

W3C@30: W3C and me - YouTube

This is a lovely, lovely talk from Léonie!

Hire HTML and CSS people

Every problem at every company I’ve ever worked at eventually boils down to “please dear god can we just hire people who know how to write HTML and CSS.”

80 / 20 accessibility · marcus.io

So my observation is that 80% of the subject of accessibility consists of fairly simple basics that can probably be learnt in 20% of the time available. The remaining 20% are the difficult situations, edge cases, assistive technology support gaps and corners of specialised knowledge, but these are extrapolated to 100% of the subject, giving it a bad, anxiety-inducing and difficult reputation overall.

How do we build the future with AI? – Chelsea Troy

This is the transcript of a fantastic talk called “The Tools We Still Need to Build with AI.”

Absorb every word!

Applying the four principles of accessibility

Web Content Accessibility Guidelines—or WCAG—looks very daunting. It’s a lot to take in. It’s kind of overwhelming. It’s hard to know where to start.

I recommend taking a deep breath and focusing on the four principles of accessibility. Together they spell out the cutesy acronym POUR:

Perceivable
Operable
Understandable
Robust

A lot of work has gone into distilling WCAG down to these four guidelines. Here’s how I apply them in my work…

Perceivable

I interpret this as:

Content will be legible, regardless of how it is accessed.

For example:

The contrast between background and foreground colours will meet the ratios defined in WCAG 2.
Content will be grouped into semantically-sensible HTML regions such as navigation, main, footer, etc.

Operable

I interpret this as:

Core functionality will be available, regardless of how it is accessed.

For example:

I will ensure that interactive controls such as links and form inputs will be navigable with a keyboard.
Every form control will be labelled, ideally with a visible label.

Understandable

I interpret this as:

Content will make sense, regardless of how it is accessed.

For example:

Images will have meaningful alternative text.
I will make sensible use of heading levels.

This is where it starts to get quite collaboritive. Working at an agency, there will some parts of website creation and maintenance that will require ongoing accessibility knowledge even when our work is finished.

For example:

Images uploaded through a content management system will need sensible alternative text.
Articles uploaded through a content management system will need sensible heading levels.

Robust

I interpret this as:

Content and core functionality will still work, regardless of how it is accessed.

For example:

Drop-down controls will use the HTML select element rather than a more fragile imitation.
I will only use JavaScript to provide functionality that isn’t possible with HTML and CSS alone.

If you’re applying a mindset of progressive enhancement, this part comes for you. If you take a different approach, you’re going to have a bad time.

Taken together, these four guidelines will get you very far without having to dive too deeply into the rest of WCAG.

The Web Accessibility Cookbook

Manu’s book is available to pre-order now. I’ve had a sneak peek and I highly recommend it!

You’ll learn how to build common patterns written accessibly in HTML, CSS, and JavaScript. You’ll also start to understand how good and bad practices affect people, especially those with disabilities.

Home - Sa11y

Another handy accessibility testing tool that can be used as a bookmarklet.

Manifesto for a Humane Web

I endorse this message.

This manifesto is intended as a personal response to the current state of the web. It is a statement of intent and a call to arms, inviting you, the reader, to go forth and build humane websites, and to resist the erosion of the web we know and love.

Write Alt Text Like You’re Talking To A Friend – Cloud Four

This is good advice:

Write alternative text as if you’re describing the image to a friend.

Faster Connectivity !== Faster Websites - Jim Nielsen’s Blog

The bar to overriding browser defaults should be way higher than it is.

Amen!

Bookmarklets for testing your website

I’m at day two of Indie Web Camp Brighton.

Day one was excellent. It was really hard to choose which sessions to go to because they all sounded interesting. That’s a good problem to have.

I ended up participating in:

a session on POSSE,
a session on NFC tags,
a session on writing, and
a session on testing your website that was hosted by Ros

In that testing session I shared some of the bookmarklets I use regularly.

Bookmarklets? They’re bookmarks that sit in the toolbar of your desktop browser. Just like any other bookmark, they’re links. The difference is that these links begin with javascript: rather than http. That means you can put programmatic instructions inside the link. Click the bookmark and the JavaScript gets executed.

In my mind, there are two different approaches to making a bookmarklet. One kind of bookmarklet contains lots of clever JavaScript—that’s where the smart stuff happens. The other kind of bookmarklet is deliberately dumb. All they do is take the URL of the current page and pass it to another service—that’s where the smart stuff happens.

I like that second kind of bookmarklet.

Here are some bookmarklets I’ve made. You can drag any of them up to the toolbar of your browser. Or you could create a folder called, say, “bookmarklets”, and drag these links up there.

Validation: This bookmarklet will validate the HTML of whatever page you’re on.

Validate HTML

Carbon: This bookmarklet will run the domain through the website carbon calculator.

Calculate carbon

Accessibility: This bookmarklet will run the current page through the Website Accessibility Evaluation Tools.

WAVE

Performance: This bookmarklet will take the current page and it run it through PageSpeed Insights, which includes a Lighthouse test.

PageSpeed

HTTPS: This bookmarklet will run your site through the SSL checker from SSL Labs.

SSL Report

Headers: This bookmarklet will test the security headers on your website.

Security Headers

Drag any of those links to your browser’s toolbar to “install” them. If you don’t like one, you can delete it the same way you can delete any other bookmark.

On Nielsen’s ideas about generative UI for resolving accessibility

Per Axbom quite rightly tears Jakob Nielsen a new one.

I particularly like his suggestion that you re-read Nielsen’s argument but replace the word “accessibility” with “usability”:

Assessed this way, the ~~accessibility~~usabiity movement has been a miserable failure.

~~Accessibility~~Usability is too expensive for most companies to be able to afford everything that’s needed with the current, clumsy implementation.

jgarber623/aria-collapsible: A dependency-free Web Component that generates progressively-enhanced collapsible regions using ARIA States and Properties.

This is a really lovely little HTML web component from Jason. It does just one thing—wires up a trigger button to toggle-able content, taking care of all the ARIA for you behind the scenes.

The Folly of Chasing Demographics - YouTube

I just attended this talk from Heydon at axe-con and it was great! Of course it was highly amusing, but he also makes a profound and fundamental point about how we should be going about working on the web.

Monday, April 7th, 2025

Friday, March 28th, 2025

Wednesday, March 26th, 2025

Sunday, March 16th, 2025

Sunday, March 2nd, 2025

Sunday, December 15th, 2024

Wednesday, October 30th, 2024

Friday, September 27th, 2024

Thursday, August 29th, 2024

Thursday, June 27th, 2024

Thursday, May 30th, 2024

Perceivable

Operable

Understandable

Robust

Tuesday, May 28th, 2024

Monday, May 20th, 2024

Monday, May 13th, 2024

Sunday, April 28th, 2024

Wednesday, April 17th, 2024

Sunday, March 10th, 2024

Sunday, March 3rd, 2024

Tuesday, February 27th, 2024

Wednesday, February 21st, 2024