
Commit 6a418f9

Stop fetching spec info from Specref (#1668)
This completely drops the logic that used to fetch spec info from Specref.

The motivation for the change is to avoid a circular dependency between browser-specs and Specref, which makes some updates cumbersome. In practice, only 6 specs remained for which browser-specs was leveraging Specref: 3 from ISO, 2 from TC39, and the Web Bluetooth API.

One consequence is that the titles of the TC39 specs will change: the actual specs include a registered mark and the year, which are now captured, leading to "ECMAScript® 2025 Language Specification". Maybe not ideal, but then that's per spec...

README and code comments updated to remove mentions of Specref.
1 parent 31ce416 commit 6a418f9

4 files changed: +41 -165 lines changed


README.md

Lines changed: 18 additions & 12 deletions
@@ -8,6 +8,11 @@ The list is used in a variety of ways, which include:
   by tools such as [ReSpec](https://respec.org/docs/) and
   [Bikeshed](https://speced.github.io/bikeshed/) to create terminology
   and reference links between Web specifications.
+* [BCD](https://github.com/mdn/browser-compat-data) and
+  [web-features](https://github.com/web-platform-dx/web-features) to validate
+  specification URLs
+* [Specref](https://www.specref.org/) to complete the list of specifications
+  that can be referenced.
 * Analyzers of browser technologies to create reports on test coverage,
   WebIDL, and specification quality.

@@ -186,10 +191,11 @@ The `shortname` property is always set.
 ### `title`

-The title of the spec. The title is either retrieved from the
-[W3C API](https://w3c.github.io/w3c-api/) for W3C specs,
-[Specref](https://www.specref.org/) or from the spec itself. The
-[`source`](#source) property details the actual provenance.
+The title of the spec. The title is either retrieved from an official source
+(the [W3C API](https://w3c.github.io/w3c-api/) for W3C specs, the
+[workstreams database](https://github.com/whatwg/sg/blob/main/db.json) for
+WHATWG specs, etc.), or from the spec itself. The [`source`](#source) property
+details the actual provenance.

 The `title` property is always set.

@@ -485,11 +491,12 @@ available.
 The URL of the latest Editor's Draft or of the living standard.

-The URL is either retrieved from the [W3C API](https://w3c.github.io/w3c-api/)
-for W3C specs, or [Specref](https://www.specref.org/). The document at the
-versioned URL is considered to be the latest Editor's Draft if the spec does
-neither exist in the W3C API nor in Specref. The [`source`](#source) property
-details the actual provenance.
+The URL is either retrieved from an official source (the
+[W3C API](https://w3c.github.io/w3c-api/) for W3C specs, the
+[workstreams database](https://github.com/whatwg/sg/blob/main/db.json) for
+WHATWG specs, etc.) when possible. The document at the versioned URL is
+considered to be the latest Editor's Draft otherwise. The [`source`](#source)
+property details the actual provenance.

 The URL should be relatively stable but may still change over time. See
 [Spec identifiers](#spec-identifiers) for details.
@@ -552,8 +559,7 @@ The `pages` property is only set for specs identified as multipage specs.
 The URL of the repository that contains the source of the Editor's Draft or of
 the living standard.

-The URL is either retrieved from the [Specref](https://www.specref.org/) or
-computed from `nightly.url`.
+The URL is computed from `nightly.url`.

 The `repository` property is always set except for IETF specs where such a repo does not always exist.
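
The README now states that the repository URL is computed from `nightly.url`, but the diff does not show how. As a rough illustration only, the sketch below assumes the common `<org>.github.io/<repo>/` pattern for GitHub Pages; the helper name `computeRepository` is hypothetical and is not taken from browser-specs, which likely handles more cases.

```javascript
// Hypothetical helper (not part of browser-specs): derives a GitHub repository
// URL from a nightly URL of the form "https://<org>.github.io/<repo>/...".
// Returns null for any other kind of URL.
function computeRepository(nightlyUrl) {
  const url = new URL(nightlyUrl);
  const match = url.hostname.match(/^([^.]+)\.github\.io$/);
  if (!match) {
    return null;
  }
  // The first non-empty path segment is assumed to be the repository name.
  const repo = url.pathname.split("/").filter(Boolean)[0];
  if (!repo) {
    return null;
  }
  return `https://github.com/${match[1]}/${repo}`;
}

console.log(computeRepository("https://w3c.github.io/presentation-api/"));
// → https://github.com/w3c/presentation-api
```

URLs hosted elsewhere (for example under `drafts.csswg.org`) would need their own mapping rules, which this sketch does not attempt.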

@@ -621,7 +627,7 @@ The `excludePaths` property is seldom set.
 The provenance for the `title` and `nightly` property values. Can be one of:
 - `w3c`: information retrieved from the [W3C API](https://w3c.github.io/w3c-api/)
-- `specref`: information retrieved from [Specref](https://www.specref.org/)
+- `whatwg`: information retrieved from [WHATWG](https://spec.whatwg.org/)
 - `ietf`: information retrieved from the [IETF datatracker](https://datatracker.ietf.org)
 - `spec`: information retrieved from the spec itself

schema/definitions.json

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@
 "source": {
   "type": "string",
-  "enum": ["w3c", "specref", "spec", "ietf", "whatwg"]
+  "enum": ["w3c", "spec", "ietf", "whatwg"]
 },

 "nightly": {

src/fetch-info.js

Lines changed: 7 additions & 113 deletions
@@ -2,8 +2,8 @@
  * Module that exports a function that takes an array of specifications objects
  * that each have at least a "url" and a "short" property. The function returns
  * an object indexed by specification "shortname" with additional information
- * about the specification fetched from the W3C API, Specref, or from the spec
- * itself. Object returned for each specification contains the following
+ * about the specification fetched from the W3C API, WHATWG, IETF or from the
+ * spec itself. Object returned for each specification contains the following
  * properties:
  *
  * - "nightly": an object that describes the nightly version. The object will
@@ -15,8 +15,8 @@
  *   feature the URL of the TR document for W3C specs when it exists, and is not
  *   present for specs that don't have release versions (WHATWG specs, CG drafts).
  * - "title": the title of the specification. Always set.
- * - "source": one of "w3c", "specref", "spec", depending on how the information
- *   was determined.
+ * - "source": one of "w3c", "ietf", "whatwg", "spec", depending on how the
+ *   information was determined.
  *
  * The function throws when something goes wrong, e.g. if the given spec object
  * describes a /TR/ specification but the specification has actually not been
@@ -25,17 +25,14 @@
  *
  * The function will start by querying the W3C API, using the given "shortname"
  * properties. For specifications where this fails, the function will query
- * SpecRef, using the given "shortname" as well. If that too fails, the function
- * assumes that the given "url" is the URL of the Editor's Draft, and will fetch
- * it to determine the title.
+ * IETF, then WHATWG, using the given "shortname" as well. If that too fails,
+ * the function assumes that the given "url" is the URL of the Editor's Draft,
+ * and will fetch it to determine the title.
  *
  * If the function needs to retrieve the spec itself, note that it will parse
  * the HTTP response body as a string, applying regular expressions to extract
  * the title. It will not parse it as HTML in particular. This means that the
  * function will fail if the title cannot easily be extracted for some reason.
- *
- * Note: the function operates on a list of specs and not only on one spec to
- * bundle requests to Specref.
  */

 import puppeteer from "puppeteer";
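
The header comment above describes extracting the title by applying regular expressions to the response body rather than parsing it as HTML. The extraction code itself is not part of this diff, so the following is only a minimal sketch of that idea: the function name `extractTitle`, the choice to try `<title>` before `<h1>`, and the error message are all assumptions made for illustration.

```javascript
// Hypothetical helper (not taken from src/fetch-info.js): pulls a title out of
// raw HTML text using regular expressions only, without building a DOM.
function extractTitle(html) {
  // Assumption: prefer <title>, fall back to the first <h1>.
  const match =
    html.match(/<title[^>]*>([\s\S]*?)<\/title>/i) ??
    html.match(/<h1[^>]*>([\s\S]*?)<\/h1>/i);
  if (!match) {
    // Mirrors the documented behavior: fail when no title can be extracted.
    throw new Error("Could not extract a title from the document");
  }
  // Drop nested tags and collapse runs of whitespace.
  return match[1].replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
}

console.log(extractTitle("<html><head><title>Presentation API</title></head></html>"));
// → Presentation API
```

A regex-based approach like this is cheap but brittle, which matches the caveat in the comment that the function fails when the title cannot easily be extracted.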
@@ -45,17 +42,6 @@ import Octokit from "./octokit.js";
 import ThrottledQueue from "./throttled-queue.js";
 import fetchJSON from "./fetch-json.js";

-// Map spec statuses returned by Specref to those used in specs
-// Note we typically won't get /TR statuses from Specref, since all /TR URLs
-// are handled through the W3C API. Also, "Proposal for a CSS module" entries
-// were probably manually hardcoded in Specref, they are really just Editor's
-// Drafts in practice.
-const specrefStatusMapping = {
-  "ED": "Editor's Draft",
-  "Proposal for a CSS module": "Editor's Draft",
-  "cg-draft": "Draft Community Group Report"
-};
-
 async function useLastInfoForDiscontinuedSpecs(specs) {
   const results = {};
   for (const spec of specs) {
@@ -215,97 +201,6 @@ async function fetchInfoFromWHATWG(specs, options) {
   return specInfo;
 }

-async function fetchInfoFromSpecref(specs, options) {
-  function chunkArray(arr, len) {
-    let chunks = [];
-    let i = 0;
-    let n = arr.length;
-    while (i < n) {
-      chunks.push(arr.slice(i, i += len));
-    }
-    return chunks;
-  }
-
-  // Browser-specs contributes specs to Specref. By definition, we cannot rely
-  // on information from Specref about these specs. Unfortunately, the Specref
-  // API does not return the "source" field, so we need to retrieve the list
-  // ourselves from Specref's GitHub repository.
-  const specrefBrowserspecsUrl = "https://raw.githubusercontent.com/tobie/specref/main/refs/browser-specs.json";
-  const browserSpecs = await fetchJSON(specrefBrowserspecsUrl, options);
-  specs = specs.filter(spec => !browserSpecs[spec.shortname.toUpperCase()]);
-
-  // Browser-specs now acts as source for Specref for the WICG specs and W3C
-  // Editor's Drafts that have not yet been published to /TR. Let's filter out
-  // these specs to avoid a catch-22 where the info in browser-specs gets stuck
-  // to the that in Specref.
-  const filteredSpecs = specs.filter(spec =>
-    !spec.url.match(/\/\/(wicg|w3c)\.github\.io\//) &&
-    !spec.url.match(/\/\/www\.w3\.org\//) &&
-    !spec.url.match(/\/\/drafts\.csswg\.org\//));
-
-  const chunks = chunkArray(filteredSpecs, 50);
-  const chunksRes = await Promise.all(chunks.map(async chunk => {
-    let specrefUrl = "https://api.specref.org/bibrefs?refs=" +
-      chunk.map(spec => spec.shortname).join(',');
-    return fetchJSON(specrefUrl, options);
-  }));
-
-  const results = {};
-  chunksRes.forEach(chunkRes => {
-
-    // Specref manages aliases, let's follow the chain to the final spec
-    function resolveAlias(name, counter) {
-      counter = counter || 0;
-      if (counter > 100) {
-        throw "Too many aliases returned by Respec";
-      }
-      if (chunkRes[name].aliasOf) {
-        return resolveAlias(chunkRes[name].aliasOf, counter + 1);
-      }
-      else {
-        return name;
-      }
-    }
-    Object.keys(chunkRes).forEach(name => {
-      if (specs.find(spec => spec.shortname === name)) {
-        const info = chunkRes[resolveAlias(name)];
-        if (info.edDraft?.startsWith('http:')) {
-          console.warn(`[warning] force HTTPS for nightly of ` +
-            `"${spec.shortname}", Specref returned "${info.edDraft}"`);
-        }
-        if (info.href?.startsWith('http:')) {
-          console.warn(`[warning] force HTTPS for nightly of ` +
-            `"${spec.shortname}", Specref returned "${info.href}"`);
-        }
-        const nightly =
-          info.edDraft?.replace(/^http:/, 'https:') ??
-          info.href?.replace(/^http:/, 'https:') ??
-          null;
-        const status =
-          specrefStatusMapping[info.status] ??
-          info.status ??
-          "Editor's Draft";
-        if (nightly?.startsWith("https://www.iso.org/")) {
-          // The URL is to a page that describes the spec, not to the spec
-          // itself (ISO specs are not public).
-          results[name] = {
-            title: info.title
-          }
-        }
-        else {
-          results[name] = {
-            nightly: { url: nightly, status },
-            title: info.title
-          };
-        }
-      }
-    });
-  });
-
-  return results;
-}
-
-
 async function fetchInfoFromIETF(specs, options) {
   async function fetchRFCName(docUrl) {
     const body = await fetchJSON(docUrl, options);
@@ -611,7 +506,6 @@ async function fetchInfo(specs, options) {
     { name: 'w3c', fn: fetchInfoFromW3CApi },
     { name: 'ietf', fn: fetchInfoFromIETF },
     { name: 'whatwg', fn: fetchInfoFromWHATWG },
-    { name: 'specref', fn: fetchInfoFromSpecref },
     { name: 'spec', fn: fetchInfoFromSpecs }
   ];
   let remainingSpecs = specs;
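
The hunk above only shows the ordered list of sources and the `remainingSpecs` variable; the loop that consumes them is outside the diff. As a hedged sketch of how such a fallback chain could work, the code below tries each source in order and hands the specs that are still unresolved to the next one. The names `fakeSource` and `fetchInfoSketch`, as well as the lookup tables and titles, are made up for illustration and stand in for the real fetchers.

```javascript
// Builds a fake source from a lookup table (illustrative data only).
// The returned function only answers for the specs it knows about.
function fakeSource(table) {
  return async specs => Object.fromEntries(
    specs
      .filter(spec => spec.shortname in table)
      .map(spec => [spec.shortname, table[spec.shortname]]));
}

// Same order as in the diff, with Specref gone. All fetchers are stand-ins.
const sources = [
  { name: "w3c", fn: fakeSource({ "presentation-api": { title: "Presentation API" } }) },
  { name: "ietf", fn: fakeSource({}) },
  { name: "whatwg", fn: fakeSource({ "fetch": { title: "Fetch Standard" } }) },
  // Last resort: pretend the title was read from the spec itself.
  { name: "spec", fn: async specs => Object.fromEntries(
      specs.map(spec => [spec.shortname, { title: `Title read from ${spec.url}` }])) }
];

// Assumed control flow: each source sees only the specs no earlier source resolved.
async function fetchInfoSketch(specs, sources) {
  const info = {};
  let remainingSpecs = specs;
  for (const { name, fn } of sources) {
    if (remainingSpecs.length === 0) break;
    const found = await fn(remainingSpecs);
    for (const [shortname, details] of Object.entries(found)) {
      info[shortname] = { ...details, source: name };
    }
    remainingSpecs = remainingSpecs.filter(spec => !(spec.shortname in info));
  }
  return info;
}

const specs = [
  { shortname: "presentation-api", url: "https://www.w3.org/TR/presentation-api/" },
  { shortname: "fetch", url: "https://fetch.spec.whatwg.org/" },
  { shortname: "ANGLE_instanced_arrays", url: "https://registry.khronos.org/webgl/extensions/ANGLE_instanced_arrays/" }
];

fetchInfoSketch(specs, sources).then(info => console.log(info));
```

With Specref removed from the list, a spec unknown to the W3C API, the IETF datatracker and WHATWG falls straight through to the `spec` source, which is consistent with the removed ISO test that used to expect `source` to be `"specref"`.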

test/fetch-info.js

Lines changed: 15 additions & 39 deletions
@@ -35,45 +35,6 @@ describe("fetch-info module", function () {
     });
   });

-  describe("fetch from Specref", () => {
-    it("works on an ISO spec", async () => {
-      const spec = {
-        url: "https://www.iso.org/standard/85253.html",
-        shortname: "iso18181-2"
-      };
-      const info = await fetchInfo([spec]);
-      assert.ok(info[spec.shortname]);
-      assert.equal(info[spec.shortname].source, "specref");
-      assert.equal(info[spec.shortname].title, "Information technology — JPEG XL image coding system — Part 2: File format");
-      assert.equal(info[spec.shortname].nightly, undefined);
-    });
-
-    it("can operate on multiple specs at once", async () => {
-      const spec = getW3CSpec("presentation-api");
-      const other = getW3CSpec("hr-time-2");
-      const info = await fetchInfo([spec, other]);
-      assert.ok(info[spec.shortname]);
-      assert.equal(info[spec.shortname].source, "w3c");
-      assert.equal(info[spec.shortname].nightly.url, "https://w3c.github.io/presentation-api/");
-      assert.equal(info[spec.shortname].title, "Presentation API");
-
-      assert.ok(info[other.shortname]);
-      assert.equal(info[other.shortname].source, "w3c");
-      assert.equal(info[other.shortname].nightly.url, "https://w3c.github.io/hr-time/");
-      assert.equal(info[other.shortname].title, "High Resolution Time Level 2");
-    });
-
-    it("does not retrieve info from a spec that got contributed to Specref", async () => {
-      const spec = {
-        url: "https://registry.khronos.org/webgl/extensions/ANGLE_instanced_arrays/",
-        shortname: "ANGLE_instanced_arrays"
-      };
-      const info = await fetchInfo([spec]);
-      assert.ok(info[spec.shortname]);
-      assert.equal(info[spec.shortname].source, "spec");
-    });
-  });
-
   describe("fetch from IETF datatracker", () => {
     it("fetches info about RFCs from datatracker", async () => {
       const spec = {
@@ -337,6 +298,21 @@ describe("fetch-info module", function () {
         fetchInfo([spec]),
         /^Error: W3C API redirects "webaudio" to "webaudio-.*"/);
     });
+
+    it("can operate on multiple specs at once", async () => {
+      const spec = getW3CSpec("presentation-api");
+      const other = getW3CSpec("hr-time-2");
+      const info = await fetchInfo([spec, other]);
+      assert.ok(info[spec.shortname]);
+      assert.equal(info[spec.shortname].source, "w3c");
+      assert.equal(info[spec.shortname].nightly.url, "https://w3c.github.io/presentation-api/");
+      assert.equal(info[spec.shortname].title, "Presentation API");
+
+      assert.ok(info[other.shortname]);
+      assert.equal(info[other.shortname].source, "w3c");
+      assert.equal(info[other.shortname].nightly.url, "https://w3c.github.io/hr-time/");
+      assert.equal(info[other.shortname].title, "High Resolution Time Level 2");
+    });
   });

   describe("fetch from all sources", () => {
