Wikidata talk:Notability/sandbox

My thoughts

  • I don't think category items should be listed in the structural needs section. Structural needs should focus on Wikidata. The notability of items for categories should come from sitelinks. ChristianKl 12:26, 6 September 2020 (UTC)
    • This is mostly for Structured Data on Wikimedia Commons (SDC), which has a strong structural need for such a change in the Wikidata notability guidelines. Wikidata items don't just exist for Wikidata itself but also to help other Wikimedia websites; if Wikimedia Commons can benefit from having items for every Commonswiki category, then Wikidata should be able to fulfill this. -- Donald Trung/徵國單  (討論 🀄) (方孔錢 💴) 20:29, 6 September 2020 (UTC)
      • Wikimedia Commons categories have no problem having sitelinks if it's desirable to have a Wikidata item for every one of those categories. There's no need to handle that via the structural needs provision. ChristianKl 15:07, 9 September 2020 (UTC)
  • I don't think the hierarchy you proposed is helpful. For scientific facts it's often good to have citations to the study that established the facts, and such citations are much superior to a New York Times article on the same topic.
I would much rather have a link to the document by the prosecutor that lays out what a person is charged with than a link to a newspaper article about the charge. ChristianKl 12:26, 6 September 2020 (UTC)
  • A verifiability policy should lay out what it means by terms like reliable and self-published if it creates such a hierarchy. ChristianKl 12:26, 6 September 2020 (UTC)
  • The speedy deletion policy seems too bureaucratic to me and would unnecessarily drive more discussion to https://www.wikidata.org/wiki/Wikidata:Requests_for_deletions, which is already overrun. A solution to the issue might be to have a way that allows admins (and maybe other people with a tag-deletion right) to tag items to be deleted. Those could then be automatically deleted by a bot after 3 days. ChristianKl 12:26, 6 September 2020 (UTC)
    • Unfortunately, bots can't actually read what they are deleting. More eyes on an item can help preserve it, as people who would otherwise never have found the item can add more data to better establish its notability on Wikidata. Fewer speedy deletions can also have other advantages; I don't think that there's much benefit in rushing the deletion of items that are otherwise not harmful. -- Donald Trung/徵國單  (討論 🀄) (方孔錢 💴) 20:31, 6 September 2020 (UTC)
      • While bots can't read what they are deleting, tagging an item with a property would put the deletion of the item on watchlists and thus allow people who edited the item before to object. That would mean that items get more eyeballs on them than currently happens. At the same time there would be no bureaucracy that reduces the effectiveness of removing undesirable items.
      • It would also allow the setting up of Listeria lists of items in specific domains that are nominated for speedy deletion (and people can watch those). ChristianKl 15:05, 9 September 2020 (UTC)
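
As a rough illustration of the tag-and-wait idea discussed in this thread, here is a minimal sketch of the selection step such a bot might run. Everything in it is an assumption made for illustration: the record layout, the `objected` flag, the function name and the placeholder item IDs are hypothetical, and no existing Wikidata property, bot or API is implied. Only the 3-day wait comes from the proposal itself.

```python
from datetime import datetime, timedelta, timezone

# The 3-day wait suggested in the thread; everything else here is hypothetical.
GRACE_PERIOD = timedelta(days=3)


def items_due_for_deletion(tags, now, grace=GRACE_PERIOD):
    """Return IDs of tagged items that are old enough and have no objection.

    `tags` is a list of dicts with the (assumed) keys:
      item      - placeholder item ID
      tagged_at - timezone-aware datetime when the deletion tag was added
      objected  - True if a watcher objected to the deletion
    """
    due = []
    for tag in tags:
        old_enough = now - tag["tagged_at"] >= grace
        if old_enough and not tag["objected"]:
            due.append(tag["item"])
    return due


now = datetime(2020, 9, 9, 15, 0, tzinfo=timezone.utc)
tags = [
    {"item": "ITEM_A", "tagged_at": now - timedelta(days=4), "objected": False},
    {"item": "ITEM_B", "tagged_at": now - timedelta(days=4), "objected": True},
    {"item": "ITEM_C", "tagged_at": now - timedelta(days=1), "objected": False},
]
print(items_due_for_deletion(tags, now))
```

In this sketch an objection simply holds the item back from automatic deletion; what should happen to it next (for example a regular deletion discussion) is left open, as it is in the thread.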

Works cited in Wikipedia

I wonder where in these criteria a notability claim for a 'work' cited in Wikipedia would fit.
The "structural need" criterion is focused on internal use within Wikidata (i.e. 'this item is needed in order to make a proper statement in another item'), but it doesn't cover the "structural needs" of other sister projects.
The "one valid sitelink" rule is focused on sister projects, but only for whole mainspace pages (e.g. 'a WP article about this subject').

It is my assumption that if a work (e.g. a book) is cited in a Wikipedia article, then that work would be considered inherently notable for WD. I assume this to be true as a matter of general principle, but also because it would serve the 'structural need' of supporting things like the 'cite q' template (or equivalent) on Wikipedia, which generates footnotes that draw their data from WD items.

Going further: what granularity of the definition of 'work' is appropriate? I think we can all agree that the full work (e.g. Nineteen Eighty-Four (Q208460)) would be notable, but Wikisource refers to individual specific editions of the work (e.g. today's featured work on Wikisource is A Simplified Grammar of the Swedish Language (Q19025640), which is the 1902 edition of A Simplified Grammar of the Swedish Language (Q55238808)). This specificity is accounted for in the aforementioned "one valid sitelink" criterion, but could and should it also be considered valid according to the 'structural need' definition I describe in the previous paragraph?

Going further, and this is where I really want to get to: is a specific URL (e.g. a BBC News article) a valid 'work' for the purposes of what I'm describing? We DO have an item for BBC News Online (Q4704926), but if a Wikipedia article [and, indeed, a Wikidata statement] is referencing a specific news article, could/should that news article be considered WD notable in its own right, for the same reason that a book is considered notable? That is, it is the individual and specific 'work' to which a Wikipedian/Wikidatan is trying to make a reference.

I don't expect you to be able to answer this last question within this draft document - it would need a new RfC probably - but I think the earlier paragraphs could probably be addressed within this document.

[I'm writing this comment in my volunteer user account because it is of personal interest, but this is also highly relevant to my WMF work as WikiCite project coordinator.] Sincerely, Wittylama (talk) 12:22, 9 September 2020 (UTC)

The "structural need" criterion is indeed not well-defined, but use of Wikidata content/items in other Wikimedia projects is usually considered to be "structural need". In particular, if you use a Wikidata item for a citation as you outlined above, it will be covered by the "structural need" use case. Re. works and editions, see Wikidata:WikiProject Books (we use the FRBR model, thus there should be separate items for the work and for each edition). An online article can also be given a separate item, although I would be careful with this approach and only do it on an individual basis when I had a specific use case in mind. -- MisterSynergy (talk) 12:53, 9 September 2020 (UTC)
The current policy proposal considers having an external identifier to be sufficient for notability under (1). Books that have ISBNs or OCLC work IDs would fall under that definition. When it comes to individual citations on Wikipedia, I think it could be worthwhile to state that being cited creates notability. ChristianKl 14:21, 9 September 2020 (UTC)
1) So ChristianKl/MisterSynergy, if I understand correctly, you're saying that any book, at any level of FRBR granularity, is worthy of an item by virtue of having an external identifier (e.g. ISBN), irrespective of whether it is used in any other way on any sister project? I thought that Wikidata accepted the more granular 'work' items (e.g. "the 1902 edition of a book") only if they were being used in a sister project, like Wikisource. I think it would be worthwhile to clarify this point because it seems to be easy (for me at least) to misunderstand the scope of the current rules (let alone any potential changes).
2) With regard to URLs, I want to give a specific example as a test case: Footnote 4 in the English Wikipedia article Emu War - https://en.wikipedia.org/wiki/Emu_War#cite_note-defended-4
This is notable because I used to work for the National Library of Australia (which hosts Trove, the digitised newspaper collection this footnote cites), and this footnote was one of the biggest single sources of inbound 'deep links' to Trove. To here: https://trove.nla.gov.au/newspaper/article/4509731
That Wikipedia footnote links to a specific and stable URL, which is ALSO a specific historic newspaper article. It's a rare case of a clean/neat overlap of citing a website and citing a newspaper. This is a valid and important footnote, with very stable and structured metadata [The Argus (Melbourne, Vic. : 1848 - 1957) / Sat 19 Nov 1932 / Page 22 / "EMU WAR" DEFENDED].
Should that specific URL/newspaper article deserve a Wikidata item purely on the notability criteria of being used in Wikipedia? Wittylama (talk) 14:39, 9 September 2020 (UTC)
I'm speaking about the current proposal in the sandbox, not the approved notability policy that's in place for Wikidata. The proposal in the sandbox does specify that an external identifier is sufficient. As far as I'm concerned, individual books (editions) with OCLC/ISBNs that you can reference are also okay under our approved notability policy (but I do consider the intent of our bot policy to be a limiting factor for larger-scale creation of items). ChristianKl 15:11, 9 September 2020 (UTC)
Okay, some more general remarks: the notability policy governs which content can be covered by an item here with a pretty generic set of rules. The purpose is basically to ensure that all content here is properly identified (i.e. linked) against a resource that provides information about the entity; this also helps for basic verification of data, although individual references for claims would be more desirable of course. This identification can be a Wikimedia sitelink (which usually contains external references), or an external database/identifier, or in many cases "structural need" (backlinks, use in Wikimedia projects in a certain context, etc.). We do not want to have items here which require expert knowledge or excessive research to be understood by random users/editors.
That said, an important finding of plenty of discussions in recent months is that Wikidata will certainly not be able to host items about everything which is "notable" in this sense. The technical infrastructure has several difficult scaling problems which render infinite growth impossible (the technical limits are quite close in fact), and socially we are also not able to maintain an arbitrary amount of data in good shape. This problem becomes more and more evident, but the solution to it is not very straightforward to elaborate. It is clear, however, that any bulk import (e.g. "any book - at any level of FRBR granularity") is problematic if it only consumes resources (computational, community efforts, ...), but does not serve a purpose. If you do have a specific purpose in mind such as use in a Wikimedia project, items are safe here. If you plan to import a larger amount of such items, maybe because some Wikipedia language edition wants to move to Wikidata-based references completely, a discussion would be necessary in which the number of items required for that undertaking should be estimated. -- MisterSynergy (talk) 21:44, 9 September 2020 (UTC)
Thanks for the reply, MisterSynergy. To dispel any concern in this regard: I'm not intending/requesting to mass import granular levels of types of book. I raised that as an example to request clarification from earlier commenters.
But I COULD see a future where individual URLs which are used as References in Wikipedia are mass imported as standalone QIDs. I am asking colleagues at the InternetArchive if they can get an estimate of how many unique items that might actually be (the ones that have been 'archived' at least), to inform the debate. As a practical example, I return to the question of the 'emu war' URL. Do the people commenting here (ping also @ChristianKl, Mike Peel:) think that THAT newspaper article, with its stable URL to match, is worthy of being its own Wikidata item based solely on the 'notability claim' that it is used as a Wikipedia reference? Or equally, what about the URL which is Reference number 1 in the English Wikipedia article Impact of the COVID-19 pandemic on the arts and cultural heritage, which links to the IFLA covid response page (InternetArchive version here)? This is a specific document that is used 8 times within that WP article (and potentially many other times in other articles in other languages). Should " https://www.ifla.org/covid-19-and-libraries " be worthy of its own Wikidata item based on the notability criterion of being cited by a Wikipedia article? -- Wittylama (talk) 14:12, 10 September 2020 (UTC)
I think I can provide some estimations about URLs as well: enwiki has ~102M different URLs, in use ~145M times; dewiki has ~22M different URLs, in use ~26M times; Wikidata currently has ~51M URLs, in use ~56M times (all data from the externallinks tables). One can definitely argue that some of them are not relevant here, but as ballpark figures you can use these numbers I guess.
Interestingly, when I first heard of Wikicite years ago, before anything was done in that project, I initially thought it would be just that: all Wikipedia references being put into individual Wikidata items, including offline and online resources, so that they can be used in Wikipedia with templates. Wikicite turned out to be something different, but the idea is not forgotten. With the technical limitations we are now experiencing, I think Wikidata would probably not be the place to host such Wikipedia citation data if there were a bulk import from one of the large Wikipedias. Wikibase has meanwhile been developed much further, and the SD at Commons shows us how we could potentially build another knowledge base for some specific purpose such as "all Wikipedia citations" that makes use of Wikidata content as well. For the time being, individual citations need to be put into Wikidata. -- MisterSynergy (talk) 20:21, 10 September 2020 (UTC)
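
To put the ballpark figures quoted above in perspective, a small worked calculation may help. It uses only the numbers given in this thread (in millions); the variable and function names are made up for illustration, and the summed total is only an upper bound, since the same URL can appear on more than one wiki.

```python
# Ballpark figures quoted in the thread, in millions: (distinct URLs, total uses).
url_counts = {
    "enwiki": (102, 145),
    "dewiki": (22, 26),
    "wikidata": (51, 56),
}


def uses_per_url(distinct, uses):
    """Average number of times each distinct URL is used."""
    return uses / distinct


for wiki, (distinct, uses) in url_counts.items():
    print(f"{wiki}: ~{uses_per_url(distinct, uses):.2f} uses per distinct URL")

# Upper bound on distinct URLs across the three wikis (overlaps not removed).
total_distinct_upper_bound = sum(distinct for distinct, _ in url_counts.values())
print(f"at most ~{total_distinct_upper_bound}M distinct URLs in total")
```

On these figures most URLs are used only once or twice, so deduplication would save little: one item per distinct URL would still mean roughly 102M items for enwiki alone.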

Speedy deletion

@GZWDer: I've removed "If the item meets Wikidata:Speedy deletion, it can be deleted immediately without discussion. Otherwise, you may nominate the item for deletion if you can not find sources that would make the item notable." - while I agree with it, it seems to be a more complicated issue per Wikidata_talk:Requests_for_deletions#Splitting_this_process_into_speedy_deletions_and_deletion_discussions?. I think we should focus on notability first, and then move on to the other pages. Similarly with this bit about verifiability, it seems sensible but I think it's out of scope for this change. Thanks. Mike Peel (talk) 19:02, 12 September 2020 (UTC)