Wikidata:Property proposal/attested as
attested as
Originally proposed at Wikidata:Property proposal/Lexemes
Description | how the word is attested in a text, if different from the current way to write it. Should be used with attested in (P5323) |
---|---|
Data type | String |
Domain | lexeme |
Example 1 | persuade (L11073) → "perswade", attested in (P5323) The Merry Wives of Windsor (Q844836) |
Example 2 | begi (L49195) → "begui" ... |
Example 3 | hacer (L39182) → "fazer" ... |
Expected completeness | always incomplete (Q21873886) |
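
As an illustration only, Example 1 can be sketched as a small Python structure. The layout is a simplified, hypothetical one (it is not the Wikibase JSON serialization), and modelling "attested in" as a qualifier of the "attested as" statement is just one possible reading of the example. The ID P7855 is taken from the "Property talk:P7855" mention in the discussion below; `make_attestation` is a made-up helper name.

```python
# Hypothetical, simplified layout; NOT the actual Wikibase JSON format.
ATTESTED_AS = "P7855"  # ID taken from the "Property talk:P7855" mention in the discussion
ATTESTED_IN = "P5323"  # attested in


def make_attestation(lexeme_id: str, spelling: str, work_id: str) -> dict:
    """Build one 'attested as' statement qualified by 'attested in' (assumed modelling)."""
    return {
        "subject": lexeme_id,
        "property": ATTESTED_AS,
        "value": spelling,  # plain string, matching the proposed String datatype
        "qualifiers": {ATTESTED_IN: [work_id]},
    }


# Example 1: persuade (L11073) -> "perswade", attested in The Merry Wives of Windsor (Q844836)
example_1 = make_attestation("L11073", "perswade", "Q844836")
```

Keeping the value a plain string mirrors the proposed datatype; the alternatives raised in the discussion (a Form or a separate Lexeme as value) would replace `value` with an entity ID instead.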
Motivation
Many texts come from pre-standardization sources and contain lexemes whose spelling differs from today's, although they do not belong to an ancient language. They are only written variants of the same words that are no longer in use. As we upload more and more lexemes and try to link them with Wikisource, this will happen even more often. We should have attested as, attested in and first attestment properties as a triad. Theklan (talk) 12:52, 23 December 2019 (UTC)
Discussion
Support -Theklan (talk) 12:52, 23 December 2019 (UTC)
- Support. Nomen ad hoc (talk) 16:24, 23 December 2019 (UTC).
- Support ArthurPSmith (talk) 17:52, 24 December 2019 (UTC)
Comment How would one represent that a ‘word’ (lexeme?) was attested in a certain sense?―BlaueBlüte (talk) 07:46, 26 December 2019 (UTC)
- @BlaueBlüte: I assume the statement would be on the sense, rather than on the main lexeme? ArthurPSmith (talk) 21:37, 30 December 2019 (UTC)
- @Theklan: Do you agree that the domain for this property should be Sense, or “Sense or Lexeme”, rather than Lexeme? If so, can you update the proposal?―BlaueBlüte (talk) 21:46, 30 December 2019 (UTC)
- @BlaueBlüte: Yes, it may be Sense or Lexeme. -Theklan (talk) 12:50, 20 January 2020 (UTC)
- Support, but… I think the datatype should be another lexeme. --Tinker Bell ★ ♥ 20:50, 26 December 2019 (UTC)
- @Tinker Bell: why? Perhaps it should be added as another form of the given lexeme; the meaning expected here wouldn't make sense at all though if it was a completely different lexeme?? ArthurPSmith (talk) 21:37, 30 December 2019 (UTC)
- @Tinker Bell: I could see the datatype being a Form (of the same Lexeme); and if the Form in question is not in current use, it would somehow be marked as obsolete. That way one could also make statements about the attested form.―BlaueBlüte (talk) 21:46, 30 December 2019 (UTC)
Comment How about a more general model designed after the example of
- usage example (P5831) with qualifiers
and in which more general model the main property, say “attestation” or “quotation”, would indeed be of datatype (monolingual) string and capture the attestation context, with the rest (which (potentially obsolete) Form, which Sense, reference, date) being handled by qualifiers (maybe even co-opting subject form (P5830) and subject sense (P6072), perhaps upon expanding their scope and adjusting their labels)? This would neatly take care of “attested as, attested in and first attestment properties as a triad” [as per Theklan] even though a query might be necessary to get the first attestation. Furthermore, using qualifiers, it would be much more expandable, for example to model ‘how obscure’ or ‘how idiomatic’ or ‘how representative’ an attestation is.―BlaueBlüte (talk) 22:07, 30 December 2019 (UTC)
- @BlaueBlüte, ArthurPSmith: I think we could do persuade (L11073) <pre-standarized/archaic form> L<perswade>, attested in (P5323) The Merry Wives of Windsor (Q844836), although we should make a difference between this property and derived from lexeme (P5191). The domain can be the whole Lexeme, or a particular sense. --Tinker Bell ★ ♥ 22:57, 30 December 2019 (UTC)
- @Tinker Bell: So all archaic forms should be their own lexemes? Then, when using <pre-standarized/archaic form> on a lexeme, how exactly would (or should) that be different from derived from lexeme (P5191)? And when using <pre-standarized/archaic form> on a sense, how would that be different from a translation to an earlier development stage of the (otherwise same) language?
As for the use of the qualifier attested in (P5323) you’re proposing as part of <pre-standarized/archaic form> statements: that doesn’t strike me as semantically appropriate. For the object of the qualifier (e.g., The Merry Wives of Windsor (Q844836)) cannot attest the meaning of the statement, namely that the object of the statement is a ‘translation’ or ‘pre-standardized/archaic form’ of a modern lexeme/sense, because the modern lexeme/sense had not yet evolved at the time of the ‘attestation’.―BlaueBlüte (talk) 01:49, 31 December 2019 (UTC)
- @BlaueBlüte: an archaic form isn't necessarily an etymon: persuade (L11073) derives from Latin persuadeo. From persuadeo derived persuade (L11073) and perswade, but the first one was standardized. So, both lexemes are related. Another example could be the Spanish word for king: rei and rey, the latter being the form used today. I think, in this case, attested in (P5323) shouldn't be used as a qualifier, but on the archaic Lexeme. --Tinker Bell ★ ♥ 01:30, 1 January 2020 (UTC)
- Support this would be very good, at least for English, whose spelling was somehow even worse back a few hundred years! DemonDays64 | Talk to me 01:56, 14 January 2020 (UTC)
- Support this would also be very good for minoritized languages that have been mostly oral and whose spelling was standardized very late --Aitalvivem (talk) 16:19, 16 January 2020 (UTC)
- Done @Theklan, BlaueBlüte, Nomen ad hoc, ArthurPSmith, Aitalvivem: @DemonDays64: please make good use of it. --- Jura 11:16, 20 January 2020 (UTC)
Comment Lexeme and Form should both be listed in the allowed entity types constraint. Many attested forms, and even (historical) dictionary headwords, are not the form that is represented today as the canonical lexeme form, but another form of the lexeme. In those cases the attested form should be linked to the lexeme form, not the lexeme. An example: abade (L48666): F2 "abadea" is what is attested as canonical form in a 1745 dictionary, not "abade". DL2204 (talk) 15:01, 26 October 2020 (UTC)
- @Jura1: Support Can you take a look into this issue, please? Theklan (talk) 16:03, 26 October 2020 (UTC)
- The property was already created months ago. Further discussion should take place elsewhere, e.g. on Property talk:P7855 or Wikidata_talk:Lexicographical_data. --- Jura 16:07, 26 October 2020 (UTC)
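
For readers comparing the alternatives, the qualifier-based model suggested in the discussion above (a string-valued main property, with Form, Sense, source and date carried as qualifiers, and the first attestation obtained by a query) could be sketched roughly as follows. This is a minimal Python sketch under stated assumptions: the `Attestation` class, its field names and `first_attestation` are hypothetical, and apart from the 1745 dictionary mentioned for abade (L48666) the records are placeholders, not real data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Attestation:
    """One attestation: the attested string plus qualifier-like fields (all hypothetical)."""
    text: str                     # the attested spelling or quotation
    form: Optional[str] = None    # Form ID, e.g. "L48666-F2"
    sense: Optional[str] = None   # Sense ID, if the attestation is tied to a sense
    source: Optional[str] = None  # item ID of the work the string is attested in
    year: Optional[int] = None    # date of the source, reduced to a year


def first_attestation(records: Sequence[Attestation]) -> Optional[Attestation]:
    """Return the earliest dated attestation, or None if no record carries a date."""
    dated = [r for r in records if r.year is not None]
    return min(dated, key=lambda r: r.year) if dated else None


records = [
    Attestation(text="abadea", form="L48666-F2", year=1745),  # year from the discussion
    Attestation(text="abadea", form="L48666-F2", year=1900),  # placeholder year, not real data
    Attestation(text="abadea", form="L48666-F2"),             # undated placeholder
]
earliest = first_attestation(records)
```

Under this model no separate "first attestation" property is stored; it is derived from the dates on demand, which is the trade-off noted in the comment.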