Wikidata talk:Requests for comment/Music-related items
@Moebeus: I've made a first stab. It's not really clear what this is about yet. It probably needs a better structure and it might need to frame the problems better: structural issues, property issues, lack of imports... It might need more questions, or fewer questions. The italics indicate things that I'm not sure about. Jc86035 (talk) 13:46, 10 December 2018 (UTC)
- @Jc86035: This is brilliant stuff, a great start👍 I'm going to need some time to digest this and formulate some thoughts of my own - this is so important if WD is ever to become a serious, trusted source for music information. Moebeus (talk) 14:14, 10 December 2018 (UTC)
- @Moebeus: Thanks. I've notified the project list and no one has responded (although page views show that a lot of people probably clicked through), so I think the main issues remain the same. I haven't edited it in a few days. Jc86035 (talk) 11:56, 15 December 2018 (UTC)
@Moebeus: Some thoughts about this:
- Now that the deletion discussion for B-side (DEPRECATED) (P1432) has concluded, I think it would be appropriate to treat it as consensus for separating singles from songs. Assuming that there is consensus for doing so, this leaves no real consensus for whether to separate tracks from compositions, and a de facto convention of lumping them all together anyway. Asking the question again probably wouldn't achieve much.
- This means that the more important question for the RfC would probably be "how do we clean up these 87,000 items?". (Also, I'm a bit slow at cleaning items up; I tried migrating a P1432 use and it took me at least half an hour to fix up Burn the Witch / Spectre (Q60197400) and all of the associated items. Unless you're much, much faster than me I think time would be much better spent organizing a large-scale import – one group of items fixed is really a drop in the bucket.) On the other hand, a lot of large-scale automated imports have taken place without much community discussion, so assuming that MusicBrainz data is usually accurate, we might not really need an RfC for creating more than 250,000 items based on MusicBrainz works, releases and recordings (leaving the original item as the single, excluding singles where more than one track has an article; and moving the sitelinks to the new item, excluding the aforementioned singles as well as singles where the article is about more than one song).
- @Jc86035: Importing MB Artist, MB Release Group, MB Release and MB Work is great. MB Recording much less so: as I'm sure you've noticed, they have a MASSIVE number of duplicates. In that case I would import only recordings that have an ISRC attached, or use some other criterion that narrows it down a bit; otherwise we have a new, much larger clean-up job on our hands (though a simpler one, as the clean-up would entail not much more than merging).
- Fallout would have to be anticipated on wikis that use Wikidata infoboxes if no-one fixes them in time (I'm still not entirely sure how it would work, although someone more experienced than me with WD infoboxes would probably be able to hammer something out in about twenty minutes), but I don't know if it's possible to get usage statistics for all wikis. frwiki reports just 28 uses of performer (P175), so I'm not expecting a massive amount of usage.
- Adding too many seemingly random questions to the RfC (e.g. which entities should some obscure property/qualifier/type of statement be used on?) is probably not going to be helpful, although in this particular context I think it would be useful to figure out exactly what properties should be used on what items.
- The RfC could be seen as a way to get around previous consensus (e.g. the property proposal where Pigsonthewing (probably accidentally) derailed the entire discussion by not reading usernames correctly) depending on how it's written. [Perhaps it would be appropriate to ask for closure of that property proposal at WD:AN, since it's likely not going to go anywhere just for being too long for users frequenting WD:PP to make up their minds about it in five to ten seconds.]
- I'm not familiar with WD:AN - Is that a kind of "higher authority" with the power to drive stuff through?
- RfCs seem to take a very long time, especially if there's no consensus – at least compared to the English Wikipedia. Yes/no questions would probably be useful.
- It might or might not be useful to solicit input from Wikipedia contributors. I think creating an infobox demo would be a prerequisite (depending on what the RfC is actually about).
- I'll probably be taking a wikibreak in the near future, so I might not really work on this. (I have no objection to you or someone else cleaning up and posting the RfC.)
- 2. That would probably be an issue, although it might be resolvable by (automatically?) going through releases to be imported based on date, format and number of tracks, and checking for tracks with similar length and title. When I was writing the comment I forgot to consider importing albums, although this wouldn't really be necessary or useful unless all the existing song/single items are cleaned up first.
- 6. Not really; the noticeboard is just a good place to ask uninvolved admins or other users to do something (e.g. semi-protect an item, close a discussion). I'm surprised you're not familiar with it; it's right at the top of Wikidata:Project chat (although perhaps the link isn't very visible). Jc86035 (talk) 09:46, 3 January 2019 (UTC)
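As a rough illustration of the two filtering ideas raised in this thread (importing only recordings that carry an ISRC, and checking for tracks with similar length and title), here is a minimal Python sketch. The `Recording` shape, the function names and both thresholds are assumptions made up for this example; they are not the MusicBrainz schema or any existing bot's logic, and flagged pairs would still need a human to confirm them.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass
class Recording:
    # Hypothetical record shape for illustration; not the MusicBrainz schema.
    mbid: str
    title: str
    length_ms: int
    isrcs: tuple = ()


def has_isrc(rec):
    """Keep only recordings with at least one ISRC attached."""
    return bool(rec.isrcs)


def likely_duplicates(a, b, max_length_diff_ms=2000, min_title_ratio=0.9):
    """Heuristic: similar length and near-identical title (case-insensitive).

    Both thresholds are arbitrary assumptions, not tuned values.
    """
    if abs(a.length_ms - b.length_ms) > max_length_diff_ms:
        return False
    ratio = SequenceMatcher(None, a.title.casefold(), b.title.casefold()).ratio()
    return ratio >= min_title_ratio


def candidate_pairs(recordings):
    """Return MBID pairs that a human should review as possible duplicates."""
    kept = [r for r in recordings if has_isrc(r)]
    return [
        (a.mbid, b.mbid)
        for a, b in combinations(kept, 2)
        if likely_duplicates(a, b)
    ]
```

A comparison of every pair is quadratic in the number of recordings, so in practice this would only be reasonable per artist or per release group, not over the whole database.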
Comment on 10 January 2019
- Comment based on this example: All existing MusicBrainz IDs SHOULD be checked manually, until it is clear how they managed to create two IDs for one artist (an 11-year-old kid) out of thin air (a lousy enwiki BLP), one with records released six years before the kid was born. Summary of issues: Alleged (v=Sn7wh19THyQ) "Child exploitation", circular references (featuring alternative facts: enwiki to MusicBrainz to Wikidata), a bot in need of an emergency STOP (presumably it cannot "see" the released 2001 vs. born 2007 bug, but "there is already another ID" should require human intervention). If MusicBrainz is another Archive.is, kill it. –220.127.116.11 07:34, 10 January 2019 (UTC)
- I've moved this here because the RfC is still a draft and might not actually be opened. Jc86035 (talk) 11:13, 12 January 2019 (UTC)
- @Moebeus: After having made a few hundred edits to MusicBrainz, I think automatically importing data for more than one artist at a time (even only for official releases containing one or more first releases of tracks) would probably introduce too many errors to be acceptable. Tracks not having associated works would also be an issue. Jc86035 (talk) 11:26, 12 January 2019 (UTC)
- @Jc86035: I agree with that assessment 1000%. It's unfortunate (for us), but MusicBrainz is heavily influenced by the users of the Picard music tagger, at least historically. Because of that there is an unnatural number of compilations and pirated releases with massive numbers of tracks (I'm guessing torrent sources) and artificially few singles; little work has been done to consolidate recordings and establish works/compositions, while cover art has high importance. I still think MB is a great source, but manual checks are absolutely necessary. Moebeus (talk) 11:46, 12 January 2019 (UTC)
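For reference, the two conditions mentioned in the 10 January comment (a release dated before the artist was born, and an item that already has another MusicBrainz ID) are easy to express as a pre-import check. The following Python sketch is only an assumption about how such a check could look; the function name and parameters are hypothetical and it does not describe how any existing bot works.

```python
def review_reasons(birth_year, release_years, existing_ids, new_id):
    """Return reasons why an ID assignment should go to a human reviewer.

    All parameters are hypothetical inputs for illustration:
    birth_year     -- artist's year of birth, or None if unknown
    release_years  -- years of releases credited under the new ID
    existing_ids   -- MusicBrainz IDs already on the item
    new_id         -- the ID a bot is about to add
    """
    reasons = []
    if birth_year is not None:
        early = sorted(y for y in release_years if y < birth_year)
        if early:
            reasons.append(
                f"release dated {early[0]} precedes birth year {birth_year}"
            )
    if existing_ids and new_id not in existing_ids:
        reasons.append("item already has a different MusicBrainz ID")
    return reasons
```

An empty list would mean neither condition fired; it would not mean the match is correct, so this complements manual checks rather than replacing them.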