I Don’t Need No Stinking API: Web Scraping For Fun and Profit | Hartley Brody

A handy step-by-step guide to scraping HTML to get data out. Useful for services (—cough—Twitter—cough—) that keep changing the rules of their API use.
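For a flavour of what scraping HTML involves, here is a minimal sketch using only Python's standard library. It is not taken from the linked guide; the `LinkExtractor` class, the `extract_links` helper and the sample markup are all made up for illustration, and a real scraper would fetch the page over HTTP first.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href> element in a document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Hypothetical helper: return all (href, text) pairs found in html."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up markup standing in for a page that offers no API or feed.
SAMPLE_HTML = """
<ul>
  <li><a href="/posts/1">First post</a></li>
  <li><a href="/posts/2">Second <em>post</em></a></li>
</ul>
"""

print(extract_links(SAMPLE_HTML))
```

The catch with this approach is that it depends entirely on the page's markup staying the same, which is the trade-off you accept when there is no API to rely on.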

Related links

kimono : Turn websites into structured APIs from your browser in seconds

This tool for building ScrAPIs is an interesting development—the current trend for not providing a simple API (or even a simple RSS feed) is being interpreted as damage and routed around.

Official Google Webmaster Central Blog [EN]: More options to help websites preview their content on Google Search

Google’s pissing over HTML again, but for once, it’s not by making up rel values:

A new way to help limit which part of a page is eligible to be shown as a snippet is the “data-nosnippet” HTML attribute on span, div, and section elements.

This is a direct contradiction of how data-* attributes are intended to be used:

…these attributes are intended for use by the site’s own scripts, and are not a generic extension mechanism for publicly-usable metadata.
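To make the effect of the attribute concrete, here is a rough sketch of how a third-party consumer might honour it: collect a page's text while skipping anything inside a span, div, or section marked with data-nosnippet. This is only an illustration under my own assumptions, not Google's implementation; the `SnippetText` class and `snippet_eligible_text` function are hypothetical names, and the code assumes well-formed, properly nested markup.

```python
from html.parser import HTMLParser

# The announcement names these three elements as carriers of the attribute.
NOSNIPPET_TAGS = {"span", "div", "section"}


class SnippetText(HTMLParser):
    """Gather text that is not inside a data-nosnippet span/div/section."""

    def __init__(self):
        super().__init__()
        self.parts = []
        # One flag per open span/div/section: True if its text is excluded.
        self._stack = []

    def handle_starttag(self, tag, attrs):
        if tag in NOSNIPPET_TAGS:
            excluded = "data-nosnippet" in dict(attrs)
            inherited = bool(self._stack) and self._stack[-1]
            self._stack.append(excluded or inherited)

    def handle_endtag(self, tag):
        # Assumes well-formed nesting; stray end tags are not reconciled.
        if tag in NOSNIPPET_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if not (self._stack and self._stack[-1]):
            self.parts.append(data)


def snippet_eligible_text(html):
    """Hypothetical helper: text left over once nosnippet regions are dropped."""
    parser = SnippetText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the removed regions.
    return " ".join("".join(parser.parts).split())


SAMPLE = "<p>Public intro. <span data-nosnippet>Secret detail.</span> More public text.</p>"
print(snippet_eligible_text(SAMPLE))
```

The point of the sketch is that the attribute is read by an outside consumer rather than by the site's own scripts, which is exactly the usage the specification text quoted above rules out.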
