Shortcut: WD:DEV

Wikidata:Contact the development team


Wikidata development is ongoing. You can leave notes for the development team here, on the #wikidata IRC channel, and on the mailing list, or report bugs on Phabricator. (See the list of open bugs on Phabricator.)

Regarding the accounts of the Wikidata development team, we have decided on the following rules:

  • Wikidata developers can have clearly marked staff accounts (in the form "Fullname (WMDE)"), and these can receive admin and bureaucrat rights.
  • These staff accounts should be used only for development, testing, spam-fighting, and emergencies.
  • The private accounts of staff members do not get admin and bureaucrat rights by default. If staff members desire admin and bureaucrat rights for their private accounts, those should be gained going through the processes developed by the community.
  • Every staff member is free to use their private account just as everyone else, obviously. Especially if they want to work on content in Wikidata, this is the account they should be using, not their staff account.
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2016/07.


One site link per item limitation

I would like to bring Wikidata:Property proposal/same as (permanently duplicated item) to your attention. Besides the messed up situation with Commons linking (see this RFC, where it was basically decided that having multiple links would be the right solution) this is another example where it turns out that the one site link per item limitation has not worked out. At this point, the Wikidata community is building various workarounds and "hacks" to make it work, but always at the expense of our data model and with considerable effort, both in discussion and implementation. Another area where this shows are our various items for Wikipedia categories, which move Wikidata further away from an "ontology of the world" and more towards an "ontology of the Wikipedia projects". Therefore I would really wish that you would seriously consider changing this to a model that is more suited to the needs of the community. --Srittau (talk) 06:56, 20 June 2016 (UTC)

@Srittau: While I agree with you about the trend of modifying Wikidata to adapt it to the various policies of the Wikipedias, I think you are wrong to start the discussion here: the development team is not responsible for how the community uses Wikidata. Better to start the discussion in the project chat or, better still, find a group of people with the same concerns and organize an RfC with them.
Wikidata lacks some common principles that have to be respected by everyone in order to develop common tools for queries or automatic infobox building. Your example about Commons clearly shows the current trend: people are looking for particular solutions and don't have a reference, so they create domain rules that contradict other rules applied in other fields.
Again, the only solution is to provide a list of general principles and to adopt them in an RfC. Snipre (talk) 08:49, 20 June 2016 (UTC)
There have been multiple discussions and multiple RFCs. --Srittau (talk) 11:43, 20 June 2016 (UTC)
None of which have really taken into account the effects of such a change. Have you reviewed what Danny said at the last RFC? Were you okay with what he said? --Izno (talk) 12:05, 20 June 2016 (UTC)
How will adopting a list of principles help? Adopting "there should be one item per concept" as a principle will not change anything because Wikidata does not allow us to do that. Adopting "there can be multiple items for a single concept" as a principle would be silly, why would we want duplicates? - Nikki (talk) 12:42, 20 June 2016 (UTC)
If the development team really doesn't want to allow multiple sitelinks in general, perhaps a compromise would be to somehow extend how sitelinks work so that there can be multiple subtypes which act like separate sites. For example, Commons has institution and gallery pages, we could have two subtypes for them and allow a "Commons (institution)" sitelink and a "Commons (gallery)" sitelink on the same item. I think that could solve the problem for the majority of cases where a project deliberately has multiple pages for the same concept. It would work for script variants ("Serbo-Croatian (Latin)", "Serbo-Croatian (Cyrillic)", "Ladino (Latin)", "Ladino (Hebrew)", "Hakka (Latin)", "Hakka (Han)", etc). It would work for regional variants ("Armenian (Western)" and "Armenian (Eastern)", "Northern Frisian (Sölring)", "Northern Frisian (Fering)", "Northern Frisian (Öömrang)", "Romansh (Sursilvan)", "Romansh (Grischun)", etc). It would maybe even work for multilingual sites (e.g. Wikidata's project chat pages, help pages in different languages on Meta). The interwiki links could also display which subtype it is, so both pages could be linked from the sidebar of articles and still be distinguishable. - Nikki (talk) 12:42, 20 June 2016 (UTC)
Agree with Nikki's proposal. -- Jura 12:47, 20 June 2016 (UTC)
I'd like to see that, too. StevenJ81 (talk) 15:23, 20 June 2016 (UTC)
Do you mean something like "one sitelink per namespace on a wiki"? This would solve some of the problems, as namespaces totally can (and do) handle some kind of basic content classification. author  TomT0m / talk page 17:49, 20 June 2016 (UTC)
I suppose it could work that way. But on Ladino Wikipedia, we don't put the lad-hebr pages into a separate namespace; they are part of main namespace. In a sense, what this would do at face value is give certain specific Wikipedias the right to have two-to-one links. There are a few ways to accomplish it. Your way is one. Another way is to use page content language as a deciding factor—for example, if the page content language is lad-hebr I can put it there, if lad-latn I cannot. But I suppose the easiest way to do it is simply to make it an honor system. Those few wikis, for specific reasons, are allowed two entries (e.g., a lad-latn and a lad-hebr); it is up to them to make sure they use them properly, and not to use one for Bonnie Parker and one for Clyde Barrow. Given the specific circumstances, I don't see why that wouldn't work. StevenJ81 (talk) 19:06, 20 June 2016 (UTC)
Is there (yet) any model for how you should solve it for multi-language-wikis like betawikiversity, incubator and oldwikisource? Maybe those wikis can use the same solution as Ladino/Frisian/Armenian/Nynorsk etc? If it is two, three or fifty languages within a multi-language/script-project shouldn't matter. -- Innocent bystander (talk) 19:26, 20 June 2016 (UTC)
Good question. What I suggested would probably work for all three, although for Incubator it would probably be better if it understood prefixes (i.e. the sitelink entry for a certain project in the Incubator should ideally only allow pages with that project's prefix). Actually, the same sort of thing would apply to Commons, a "Commons (institution)" sitelink should ideally only allow institution pages (but an imperfect solution which gets implemented will solve more of our problems than a perfect solution which never gets implemented). - Nikki (talk) 22:49, 20 June 2016 (UTC)
Nynorsk uses prefixes for their Högnorsk-articles. -- Innocent bystander (talk) 09:44, 21 June 2016 (UTC)
This probably really should be an RFC to capture all aspects of the problem - for example what about the redirect problem - and note all the times roughly the same issue has come up in project chat. That said, it would be good to hear first from the developers exactly what would be technically feasible as a solution here... ArthurPSmith (talk) 20:19, 20 June 2016 (UTC)
I'm not sure discussing both simultaneously would be a good idea. Whether to allow redirects as sitelinks and whether to allow multiple sitelinks on an item seem to be pretty much unrelated problems. They're also both contentious issues. Any broad discussion covering both is probably going to get too complex to draw any useful conclusions from. I do agree that we need more input from the developers though. We've already discussed both issues so many times... - Nikki (talk) 22:49, 20 June 2016 (UTC)
RFCs and detailed discussions would be required if we would get the ability to have multiple site links. But as long as we don't have the technical foundations, or at least the buy-in of the development team, RFCs would just be another discussion without merit. But I think discussions in the past have clearly shown that multiple sitelinks - possibly requiring different "qualifiers" for namespace or language - would have merit and are very likely to be adopted in several areas. --Srittau (talk) 09:10, 21 June 2016 (UTC)
Yes, and it is basically the same issue as far as the technical/UI element goes on the client wiki pages: if Wikidata is the hub for inter-language links between the Wikipedias, then how do we handle the cases where (whether due to redirects or to the other factors mentioned above) the one-to-one relationship between a Wikidata item and an article in a given language Wikipedia is broken? That needs some technical clarification or at least some sort of specific proposal for how it ought to be handled. I don't remember seeing anything on that yet. ArthurPSmith (talk) 14:28, 21 June 2016 (UTC)
Namespaces or prefixes won't solve the "Bonnie and Clyde problem" where there are three concepts Bonnie Parker (Q2319886), Clyde Barrow (Q3320282) and Bonnie and Clyde (Q219937) and Wikipedias may have 1 (e.g. ca), 2 (e.g. pt) or 3 (e.g. en) articles that only properly interwiki with other Wikipedias with the same structure. There is no current way to link the ca and pt articles for example. Thryduulf (talk) 19:40, 21 June 2016 (UTC)
@Thryduulf: Potential solutions to this problem, and the technical issues in implementing them, are described here: WD:XLINK. It needs manpower, however. author  TomT0m / talk page 20:48, 21 June 2016 (UTC)

We have in some projects pages like Wikidata:Glossary/ca. It has been discussed somewhere how they could be treated. My proposal is: Each page in a multi-language/-script wiki should have a language set for it in the "page-settings". Every such project should have a "default" language. Whether the default is "English", "West Armenian", "Multilang", "Unspecified" or "Ladino (Latin)" depends on the nature of the project. Pages based on the Translation extension, like the Catalan Glossary here at Wikidata, should have "Catalan" as their language, and it cannot be changed manually; it is controlled by the Translation extension. Wikidata:Diskutejo, on the other hand, is not based on the Translation extension, and the language of that page could be changed to Esperanto on a special page devoted to that purpose. Not all namespaces should have the option to be changed manually; only a limited set of namespaces should have that option (depending on the nature of the project). The set of languages should be limited. In projects like Wikidata/Meta/Commons the number of languages is large but still limited. The set of languages in oldwikisource could look very strange if somebody creates a page in Urdu with Mongolian script, but I do not think that will become a problem. Wikisource as a project is very inclusive of unusual languages and scripts, and I think we can find pragmatic solutions when such problems arise. The number of languages in the Armenian wiki should only be two, East and West. The right to change the language of a page should be limited to a user group, but whether that user group is Autoconfirmed, Sysops or a specially designed user group is up to the local community to decide. When somebody creates a page in Northern Armenian, there has to be a community discussion about whether to include this new language in the Armenian wiki. Thereafter the community goes to Phabricator to have this new language added to the project.
This has to be done every time in the Incubator wikis as well, but the procedure should probably be simpler.

Here at Wikidata, each multi-language/-script wiki can have more than one sitelink as long as the linked pages use different language codes in their page settings. Each time the language setting in the client is changed, the related sitelink is removed here at Wikidata to prevent us from having two West Armenian pages in the same item. The visible interwiki at the Esperanto Wikipedia Project Chat should be linked to the Esperanto Wikidata Project Chat (if there is one). Otherwise it should link to the "default" Wikidata Project Chat (if there is one). My opinion is that this should not be used to solve the problems with namespaces in Commons. -- Innocent bystander (talk) 06:28, 22 June 2016 (UTC)
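
To make the proposed rule concrete, here is a minimal sketch of "at most one sitelink per site and page language, dropped again when the page language changes on the client". It is an illustration only and not Wikibase code: the Item class, its method names, the site id "ladwiki" and the page titles are all hypothetical, and only the lad-latn / lad-hebr codes are taken from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical item whose sitelinks are keyed by (site, page language)."""

    qid: str
    sitelinks: dict = field(default_factory=dict)

    def add_sitelink(self, site: str, lang: str, title: str) -> None:
        """Add a sitelink; refuse a second page in the same language on one site."""
        key = (site, lang)
        if key in self.sitelinks:
            raise ValueError(f"{self.qid} already has a {lang} sitelink on {site}")
        self.sitelinks[key] = title

    def on_page_language_changed(self, site: str, title: str) -> None:
        """Drop the sitelink to a page whose language setting changed on the client."""
        stale = [k for k, t in self.sitelinks.items() if k[0] == site and t == title]
        for key in stale:
            del self.sitelinks[key]


# Placeholder identifiers and titles, for illustration only.
item = Item("Q_EXAMPLE")
item.add_sitelink("ladwiki", "lad-latn", "Latin-script page")
item.add_sitelink("ladwiki", "lad-hebr", "Hebrew-script page")
```

Keying on the pair (site, language) rather than on the site alone is what allows two Ladino pages on one item while still rejecting a second lad-hebr page.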

I really like the idea of being able to set the language/variant as part of the page's metadata for multilingual wikis. Combining this idea with what I suggested above where Wikidata would allow one sitelink per language/variant sounds like a really nice solution. It seems like that would just leave Commons and Incubator. - Nikki (talk) 22:25, 1 July 2016 (UTC)

Adding a dummy comment here to prevent archiving. I still need to look into this and reply but I have not had time yet. Sorry :( --Lydia Pintscher (WMDE) (talk) 12:51, 11 July 2016 (UTC)

Maybe we can just modify the timestamp, such as "--Liuxinyu970226 (talk) 23:59, 31 August 2016 (UTC)"? --Liuxinyu970226 (talk) 12:47, 21 July 2016 (UTC)
  • Essentially this concerns Commons and one specific language. Splitting that language into two Wikipedias would solve most of the problem. Most other languages only have few duplicates.
    --- Jura 13:10, 21 July 2016 (UTC)

Datatype of IUCN ID

IUCN-ID (P627) should have the datatype "external-id". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:56, 10 July 2016 (UTC)

Please comment at Wikidata:Identifier migration/0#Not going to convert if you think this should be converted, instead of pestering people at this page about your own arbitrary opinions. There are other people working on the identifiers 'project' who have different opinions to your own. I.e. knock it off. --Izno (talk) 23:22, 10 July 2016 (UTC)
"pestering people" Well, there's an interesting approach to assuming good faith. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:32, 11 July 2016 (UTC)

Does anyone wish to attempt to provide any evidence that IUCN-ID (P627) is either i) not external to Wikidata or ii) not an identifier? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:03, 16 July 2016 (UTC)

Until recently IUCN-ID (P627) was intended to be used as the source of IUCN conservation status (P141) (see property proposal). Without any further discussion you promoted the property to one pointing to the taxon concept supported by the IUCN. So why are you not using the regular way as suggested above, Mr. Mabbett?! --Succu (talk) 20:48, 16 July 2016 (UTC)
I find your assertion of the intended use to be an unsupported opinion. I also note that you offer none of the evidence I invited. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:41, 17 July 2016 (UTC)

Wikidata coping with Wikipedia growth

@Lydia Pintscher (WMDE): there seems to be an issue when creating items for new articles and pages at Wikipedia/Wikimedia sites.

Currently a WMF employee is trying to convince GZWDer, who creates most of these items, to limit their creation. Given that it shouldn't matter whether one user creates all items or 1000 users each create 1 item, it seems that we have hit some limit to Wikidata's growth.

Would you attempt to assess the size of the problem? Maybe new items could be created directly by MediaWiki (at least for some pages).
--- Jura 10:57, 11 July 2016 (UTC)

My understanding of the issue is that it is the rate the items are being created causing server issues, not who is creating them. Thryduulf (talk) 11:03, 11 July 2016 (UTC)
The problem is that someone needs to create all those items (continuously).
I wonder how many new items have to be created every day merely to keep up with Wikipedia and how much below (or above) the currently possible rate that is.
--- Jura 11:13, 11 July 2016 (UTC)
I think we have to discuss again the rule one WP article = one WD item. The structure used in some WPs leads to an increase in items that doesn't match any classification scheme. Example from the project chat: in en:WP, they have an article about one person, Trayvon Martin (Q13864562), one about the death of this person, shooting of Trayvon Martin (Q913747), one about the timeline of the death of this person, timeline of the shooting of Trayvon Martin (Q7806459), and even one about the president's speech on the death of this person, Trayvon Martin could have been me 35 years ago (Q16957966). So if we follow that example, i.e. for each person item we have an additional item for their death and an item for the timeline of the death, this significantly increases the number of items. But if this logic is applied to the end, then we should do the same for all life events of all persons (birth, marriage, divorce, ...), and then the number of items gets out of hand. Snipre (talk) 12:12, 11 July 2016 (UTC)
But somebody then has to do that job, and it looks like a hypothetical situation. I am more worried about the discussions we have had about elections. The proposals have said that each candidate in each election should have an item about them, telling how many votes they got. That leads to thousands of persons in every municipality, every X years, for every municipality in all democratic nations. One out of a thousand of such persons is notable enough for WP. All we know about them is their first and last names and sometimes which political party they represented. They can even be fictional, since being a living person is not required to get votes. -- Innocent bystander (talk) 12:36, 11 July 2016 (UTC)
Innocent bystander This is a similar problem, which can be defined as a granularity problem: to what level of detail do we want to go? And is it worth having a small number of items with a high degree of detail and the rest with too few statements to allow any comparison? Snipre (talk) 13:07, 11 July 2016 (UTC)
For Wikipedia it is of no interest to know all these thousands of names. On svwiki and regarding Swedish municipalities we are not even interested in the names of the elected candidates. Only the number of seats each political party gets is of interest. -- Innocent bystander (talk) 13:39, 11 July 2016 (UTC)
The problem isn't the total amount of edits but the high frequency of the edits or page creations. I just checked and there are about 10000 new Wikipedia articles per day according to the graph at https://reportcard.wmflabs.org/#secondary-graphs-tab. --Lydia Pintscher (WMDE) (talk) 12:50, 11 July 2016 (UTC)
One large article-creator is Lsjbot. He is paused now for July because of holidays, but there is probably a large lag in the creation of items between sv/cebwiki and Wikidata. Please keep me (or Lsj) informed how this holiday affects this issue! If necessary, we can probably slow down Lsjbot. Lsj himself will not be happy about it, but I think he will understand the situation, since read-only problems with the servers are a constant problem for him as well. -- Innocent bystander (talk) 13:28, 11 July 2016 (UTC)
  • It seems that the number of required new items hasn't grown that much, but we might not have sufficient capacity to catch up with missing items. I think WMDE should look into this and try to come up with a plan.
    --- Jura 05:44, 13 July 2016 (UTC)
  • To me this issue seems like it should be technical in nature. Is there hope that it will be solved at a technical level sooner or later? I myself would hope that Wikidata has billions of items in a few years. If ProveIt creates Wikidata items, the speed of item creation might double or grow even stronger by the end of this year. ChristianKl (talk) 18:41, 14 July 2016 (UTC)
  • For me a matter of „ruthlessness“ not unknown to GZWDer. --Succu (talk) 21:12, 16 July 2016 (UTC)

Deletion on WikiSpecies not on Wikidata?

The page at https://species.wikimedia.org/wiki/T.C._Narendran was deleted, but the sitelink at Q7672638 wasn't removed.
--- Jura 05:45, 13 July 2016 (UTC)

The administrator who deleted the page on Wikispecies doesn't have an account on Wikidata. Mbch331 (talk) 16:14, 13 July 2016 (UTC)
Seems odd that this has an impact.
--- Jura 11:08, 18 July 2016 (UTC)

Arbitrary access on Commons

After arbitrary access was enabled on Commons, we started experimenting with how best to use it. Three possible paths emerged:

  1. The current approach is not to use sitelinks at all, but to hardwire a Q-code into every page that wants to access Wikidata. Example: pick any Commons Creator template; most of them have a Q-code. It is simple and reliable, but it is prone to problems when Q-codes are deleted for some reason. For example, in the past Commons pages were linking to Q1066592 or Q18516705. Also, with Wikidata pages linking to Commons (through properties like Commons category (P373), Commons Creator page (P1472), Commons Institution page (P1612) and Commons gallery (P935)) and the Commons pages linking back to Wikidata, one needs to keep those links in sync, and if the wrong pages are linked, then changes have to be made on two projects.
  2. Approach number 2, favored by the Wikidata community, is to have an item for every Commons page: items for article pages, category pages, creator pages, institution pages, etc. Most of those items would not have properties other than one linking to the article item, where all the properties would be kept. Each item would have a sitelink to the Commons page, and that page could access properties using Lua code. There are two issues with this approach. The first is that it produces a lot of extra items that do nothing other than serve as redirects to the items where the properties are kept. The second is that it does not work for Commons creator and institution templates, which, as templates, are transcluded on other pages, where sitelinks will not work. However, this could work for Commons categories.
  3. The third approach would require a change in the Wikibase software, and before I go any further with lobbying for it, I would like feedback from the development team on whether it is even possible to implement. The idea is to allow Lua to perform queries like "return the article entity that has value X in property Y", for properties that are supposed to have unique values among items. That way I can check which item points to the current category page on Commons through P373, or which item points to a creator page through P1472 (the page name is stored in the creator's linkback variable and does not change when transcluded).

Some of this was already discussed here and here. Is that third option technically possible? --Jarekt (talk) 13:14, 15 July 2016 (UTC)

@Jarekt: Assuming option number 3 would be limited to identifier type statements, I think that's feasible yes. Quite some ground work is required for that, so I don't think it can be done fast, see T99899 for that. Cheers, Hoo man (talk) 00:08, 24 July 2016 (UTC)
Hoo man, I was mostly imagining this for any property that has a uniqueness constraint, and in my example I was hoping to use it for detecting links from Wikidata, i.e. detecting pages that have a specific value in the Commons category (P373), Commons Creator page (P1472), Commons Institution page (P1612), image (P18) and Commons gallery (P935) properties. --Jarekt (talk) 16:24, 24 July 2016 (UTC)
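
To illustrate what the third option asks for, here is a rough sketch of a reverse lookup from (property, value) to the single item holding that value. It is written in Python purely for illustration and makes no claim about how Wikibase or its Lua interface would implement this; the function names, the statement triples, the item ids and the page names are all hypothetical placeholders. Only the property ids P373 and P1472 come from the discussion above.

```python
def build_reverse_index(statements):
    """Map (property, value) to the set of item ids that carry that value."""
    index = {}
    for item_id, prop, value in statements:
        index.setdefault((prop, value), set()).add(item_id)
    return index


def find_item(index, prop, value):
    """Return the one item with `value` in `prop`, or None if absent or ambiguous."""
    matches = index.get((prop, value), set())
    if len(matches) == 1:
        return next(iter(matches))
    return None


# Placeholder data: (item id, property id, value).
statements = [
    ("Q_A", "P373", "Example category"),
    ("Q_B", "P1472", "Example creator"),
    ("Q_C", "P1472", "Duplicated creator"),
    ("Q_D", "P1472", "Duplicated creator"),
]
index = build_reverse_index(statements)
found = find_item(index, "P373", "Example category")
ambiguous = find_item(index, "P1472", "Duplicated creator")
```

Returning nothing for an ambiguous match reflects the restriction to properties whose values are supposed to be unique: when the uniqueness constraint is violated, there is no single item to hand back.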

Varying caps in English labels for language codes mul, zzx, ukn

  • "mul" is currently "Multiple Languages". This should probably be "multiple languages". Sample: Q5030507#P1476
  • "zzx" is currently "No linguistic content". Please change to "no linguistic content". Sample: Q18510551#P1684
  • "ukn" is currently "Unknown Language". Please change it to "unknown language" Sample: Q3596394#P1476

If this is something I can change myself, please advise me where. Thanks.
--- Jura 11:08, 18 July 2016 (UTC)

I was also looking for this, this looks rather ugly. Sjoerd de Bruin (talk) 11:10, 18 July 2016 (UTC)
Exactly what is the intention with the code "ukn"? The Swedish label ("obestämt språk") is hard to interpret without knowing the circumstances. -- Innocent bystander (talk) 06:51, 22 July 2016 (UTC)
I prefer "undetermined". Wikidata:Database reports/monolingual text/undetermined language seems to use it correctly (unless one knows).
--- Jura 06:59, 22 July 2016 (UTC)
The inscription of U 932 looks like Old Norse language (Q35505) to me, and that is what the claim actually says. -- Innocent bystander (talk) 07:08, 22 July 2016 (UTC)
In that case it should be "mis" not "ukn", at least until https://phabricator.wikimedia.org/T137115 actually works.
--- Jura 07:14, 22 July 2016 (UTC)

Links in redirect-related edit comments

It would be nice to have a link in edit comments like this and this. It would make it easier to verify a merge and would simplify the cases when they need to be reverted. -- Innocent bystander (talk) 16:30, 20 July 2016 (UTC)

Created phab:T141188.--GZWDer (talk) 18:14, 23 July 2016 (UTC)

Playing around

Do we have a Game to merge items? Isn't it a little too easy to do bad merges already without turning it into pure entertainment? -- Innocent bystander (talk) 06:44, 22 July 2016 (UTC)

We have had this game since 2014 and there have always been some bad merges which happen anyway. Matěj Suchánek (talk) 07:20, 22 July 2016 (UTC)
I don't think tools should allow merging disambiguation items into anything other than disambiguation items.
--- Jura 07:27, 22 July 2016 (UTC)
Agree. The Game helped with a lot of mergers in the past, but it needs more control mechanisms. The workflow on WD has changed since 2014: there were a lot of easy and clear mergers in the past, but there are a lot of borderline situations now. Another disadvantage of the game is that it is link-based, not item-based, so mergers based on "correct" links (but "incorrect" labels and properties) are happening. Undoing old merges is sometimes very hard. --Jklamo (talk) 08:44, 22 July 2016 (UTC)
And maybe we also need constraint reports for such things as P31:Wikimedia disambiguation page (Q4167410), not only ones based on the properties. -- Innocent bystander (talk) 08:54, 22 July 2016 (UTC)

Bot login stuff

Hello! My User:SKbot now gets the following in reply to action=login:

Main-account login via action=login is deprecated and may stop working without warning. To continue login with action=login, see Special:BotPasswords. To safely continue using main-account login, see action=clientlogin.

Why? What is Special:BotPasswords? -- Sergey kudryavtsev (talk) 16:35, 23 July 2016 (UTC)

PS: I tried again, and now action=login works again... What's going on?! -- Sergey kudryavtsev (talk) 16:42, 23 July 2016 (UTC)

@Sergey kudryavtsev: This is a general MediaWiki thing, so you probably want to have a thorough look at mw:API:Login for this. Cheers, Hoo man (talk) 00:04, 24 July 2016 (UTC)
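
For bot operators hitting the same warning, a minimal sketch of the bot-password login flow, as I understand it from mw:API:Login, might look like the following. Treat it as an assumption-laden outline rather than a reference implementation: `post` is a hypothetical callable that sends the given parameters to the wiki's api.php within one cookie-keeping session and returns the parsed JSON reply, and the bot name and password are whatever was created at Special:BotPasswords.

```python
def login_with_bot_password(post, username, bot_name, bot_password):
    """Log in through action=login with a bot password; return True on success.

    `post` is a hypothetical helper supplied by the caller (see the lead-in).
    """
    # Step 1: fetch a login token.
    token_reply = post({
        "action": "query",
        "meta": "tokens",
        "type": "login",
        "format": "json",
    })
    token = token_reply["query"]["tokens"]["logintoken"]

    # Step 2: log in. Bot-password user names take the form "User@BotName".
    login_reply = post({
        "action": "login",
        "lgname": f"{username}@{bot_name}",
        "lgpassword": bot_password,
        "lgtoken": token,
        "format": "json",
    })
    return login_reply["login"]["result"] == "Success"
```

Passing the HTTP helper in as an argument keeps the sketch independent of any particular HTTP library; in practice the helper must reuse the same session cookies for both requests, otherwise the login token will not be accepted.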