Wikidata:Contact the development team

Shortcut: WD:DEV
Wikidata development is ongoing. You can leave notes for the development team here, on the #wikidata IRC channel, or on the mailing list, or report bugs on Bugzilla.
See the list of bugs on Bugzilla.

Regarding the accounts of the Wikidata development team, we have decided on the following rules:

  • Wikidata developers can have clearly marked staff accounts (in the form "Fullname (WMDE)"), and these can receive admin and bureaucrat rights.
  • These staff accounts should be used only for development, testing, spam-fighting, and emergencies.
  • The private accounts of staff members do not get admin and bureaucrat rights by default. If staff members desire admin and bureaucrat rights for their private accounts, those should be gained going through the processes developed by the community.
  • Every staff member is free to use their private account just as everyone else, obviously. Especially if they want to work on content in Wikidata, this is the account they should be using, not their staff account.

On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at April.


After a long discussion in the PC, I am wondering whether someone from the development team who deals with rights and licenses can give feedback on the problem of large parts of other databases being incorporated into WD through the independent imports of hundreds of contributors. Isolated imports are covered by the short citation rule, and by giving the source you can import what you want. But when this is done thousands of times and the same sources are used, important parts of some documents/databases will be integrated into WD, and this will be a potential problem under the law (both EU and US). Shouldn't we modify our license to reduce the problem by applying CC-BY, whose attribution requirement is often one of the main conditions set by other databases or public documents? And don't point to the WMF, because I already asked them that question and they just told me to take it up with WD. Snipre (talk) 19:38, 9 April 2014 (UTC)

I am not a lawyer and therefore will refrain from advice on the legal parts of this. I am however absolutely sure that CC-0 is the only right license for Wikidata and that we should not change it. There are several reasons for this. Here are a few:
  • No matter what license we switch to it will not solve the import problem. There are simply too many licenses out there to make this a deciding factor.
  • Licensability of data is disputed anyway. We're just making sure everyone knows our stance on this is "do whatever you want" which is basically what people can do most of the time anyway.
  • Licensing is either a pain on the producer or re-user. I am absolutely convinced that we have to be the ones bearing the pain if we want to be successful.
  • We do not need to import other large databases. We're not about having all the data out there. Let's grow slowly and build a useful knowledgebase - not a data graveyard.
I am absolutely convinced that a license change is one of the very few things that can kill Wikidata. Let's not do that.

--Lydia Pintscher (WMDE) (talk) 11:42, 10 April 2014 (UTC)

@Lydia Pintscher (WMDE): Perhaps my comment was not clear enough, and I should focus on the real problem of data import instead of trying to propose a solution. Take the example of the French or the Swiss communes: data about these administrative entities is provided only by the statistical services of the two countries. So even if you use another document as a source, the primary source is and remains the statistical services. Both services hold some copyright on the data: the French one asks only for credit here, and the Swiss one asks for credit and for no commercial use (here). So under a strict application of CC0, we won't be able to use any data, directly or indirectly, from these two databases, because with a CC0 license we cannot ensure that credit is given outside of WD. That's the first thing.
The second thing is about data import. Right now, without any massive data import, we can import data individually from these databases under the short citation rule, but as contributors keep working, the totality of the databases will be integrated into WD. We will then have a problem under EU law because those databases will be present in WD without their licenses being respected. And as these databases are the only ones providing data about the communes, it won't be difficult to identify the source.
So the question is this: WD is under the CC0 license. Under EU database law, and without any massive import from other databases that have at least a CC-BY license (credit), what will happen if, as the result of the independent efforts of several contributors, a major part or the totality of a database under a CC-BY license ends up in WD?
Can we argue good faith if faced with a demand for deletion, or a demand to modify our license in order to keep these data, and that is all?
Can we expect that public databases won't say anything in that situation?
As most public databases in Europe use a license similar to CC-BY, I think we have to clarify this question before letting people do what they want, because even if each of them individually respects the system, in the end WD won't comply with EU law.
And I already tried to ask the WMF that question, and they replied that it is a WD problem. So as far as I can tell we can continue our work without thinking about the future, but I think we will have bad surprises once WD reaches a certain size and is used by a lot of people.
You can have a CC0 license, but in that case, in my opinion, we have to be sure that our sources are diverse enough to avoid any risk of holding a large share of a single database. And for some data this is not possible because there is only one source.
I know that you are not a lawyer, but since everyone is expected to know the law, it is perhaps a good idea to find someone who can give at least an assessment of the risk (high risk vs. small risk) of working in our situation. Snipre (talk) 14:38, 10 April 2014 (UTC)
Knowing too much can be equally bad for us because anyone who sues can then argue that we knowingly infringed. We should definitely avoid that. We should act in a best-effort manner. Changing our license will not change anything there. And this is all I can say on-record. Sorry :/ --Lydia Pintscher (WMDE) (talk) 14:55, 10 April 2014 (UTC)
I am no lawyer either, but from what I could gather the general principle of copyright license is pretty clear, at least for the EU and the US: databases can be protected in the EU, they can't in the US. As a US-hosted database, it seems unlikely that Wikidata can get into serious trouble for copyright reasons, but we may be required to remove some content to comply with European law. My hunch is that if we have to remove content that clients have grown used to relying on, it could be much more harmful to the project than adopting a more restrictive license right now. That said, I guess we could also bluntly decide that we do not care about European copyright policies, and we could probably get away with that. It seems that by itself, CC0 is incompatible with at least French and Swedish law anyway. --Zolo (talk) 20:25, 10 April 2014 (UTC)
@Zolo. Don't say that the US doesn't have copyright for databases, because a database can be copyrighted if there is a creative act such as the evaluation/selection of data. Snipre (talk) 09:11, 11 April 2014 (UTC)
@Lydia. It's up to you, but I think this should be mentioned in some way to the community, to be sure that everyone who contributes knows that his work is not protected against possible deletion even if the contributor is doing the job properly. In my case, I would say that without a clear position I will avoid importing more data into WD. Snipre (talk) 09:11, 11 April 2014 (UTC)
@Lydia Pintscher (WMDE): Hi, Lydia, I understand your intentions, and they are laudable. I hope that all government offices and scientific institutions will in the future release their data into the public domain, but for now that is not the case! Maybe Wikidata could contribute by its example... --Paperoastro (talk) 21:20, 14 April 2014 (UTC)
@Snipre, Lydia Pintscher (WMDE): We now have "custom value", "unknown value", and "no value" as possible values. Would it be a reasonable solution to have "external value" and point in the source to where the value is, which license it uses, and how to extract it? --Micru (talk) 21:48, 14 April 2014 (UTC)
That's technically infeasible for the foreseeable future. It's also counter to what we want to do here. We want to give everyone easy access to a lot of free and open data. We don't want them to go through the hassle of having to get their data from a lot of different sources and attribute each of them. In addition individual data points are not an issue at all. --Lydia Pintscher (WMDE) (talk) 08:56, 15 April 2014 (UTC)
If what we want is a CC0 database and some data is not CC0, then we can either ignore that data or point towards it. All the data that we have and that we generate will still be CC0, free, open and hassle-free. If in addition to that someone wants to get more data, right now they already have pointers to some external sources, but that is a poor solution which doesn't convey the license information of each site or the extraction methods... In a way we wouldn't be adding more hassle than already exists, just simplifying and automating it. --Micru (talk) 12:33, 15 April 2014 (UTC)
Interesting idea. Would need quite a bit more elaboration, but interesting idea. --Denny (talk) 16:06, 17 April 2014 (UTC)
@Denny: Another idea is to use a property to refer to non-free data as suggested here: Wikidata:Property_proposal/Generic#non-free_data_available_at --Micru (talk) 20:28, 18 April 2014 (UTC)

Different items with same sitelink

I found pairs of items which have the same sitelink:

Category:Edmonton Drillers (2007) players (Q13260681) <-> Category:Edmonton Drillers (2007) players (Q15254147) and (no label) (Q9740449) <-> (no label) (Q16320263)

Is it a bug? --Pasleim (talk) 18:58, 12 April 2014 (UTC)

See WD:True duplicates. Matěj Suchánek (talk) 19:29, 12 April 2014 (UTC)
Thanks for this link. I reported there a new list with duplicates. --Pasleim (talk) 21:02, 13 April 2014 (UTC)

Errors in summaries

See summaries of recent edits of Matěj Suchánek (talk) 21:32, 12 April 2014 (UTC)

Please be more specific ;-) They look fine to me here. --Lydia Pintscher (WMDE) (talk) 21:34, 12 April 2014 (UTC)
+1; I see no problem. --Succu (talk) 21:36, 12 April 2014 (UTC)
I think he means the fact it is saying 'Added link to [shwiki]: shwiki'. Since the latter 'shwiki' should really be the page added not the wiki name which was repeated less than 10 pixels away. John F. Lewis (talk) 21:40, 12 April 2014 (UTC)
So what's (e.g.) wrong in the history of (E)-4-Hydroxy-3-methyl-but-2-enyl pyrophosphate (Q2708007)? --Succu (talk) 22:04, 12 April 2014 (UTC)
"Added link to [shwiki]: shwiki" should be "Added link to [shwiki]: (E)-4-Hidroksi-3-metil-but-2-enil pirofosfat" like the previous edit "Removed link to [shwiki]: (E)-4-Hidroksi-3-metil-but-2-enil pirofosfat". John F. Lewis (talk) 22:08, 12 April 2014 (UTC)
I also saw "Added link to [shwiki]: shwiki"--GZWDer (talk) 12:14, 14 April 2014 (UTC)
Aha! Ok now I see it. I have been looking at a few other edits and they seem fine. Does anyone else see a pattern that'd help track this down? --Lydia Pintscher (WMDE) (talk) 12:16, 14 April 2014 (UTC)
Ohhhh. Looking at the bot's contributions they are all like that. The bot is setting the edit summary there, right? Then it is an issue with the bot it seems. --Lydia Pintscher (WMDE) (talk) 12:18, 14 April 2014 (UTC)
@Dcirovic: Please use meaningful edit summary.--GZWDer (talk) 12:20, 14 April 2014 (UTC)
At the beginning of my recent set of bot additions of interwiki links, I was having difficulties logging into Wikidata as a bot, even though I was logged in as such elsewhere. For that reason, a few edits were unintentionally made from my IP address. The "(E)-4-Hidroksi-3-metil-but-2-enil pirofosfat" entry, and a couple of others, were used for debugging. I would remove my IP change, modify the bot configuration, and test it. The sequence was repeated a few times until the cause of the problem was identified. No harm was done, and only a small number of such changes were made.
I think that change comments such as "Added link to [shwiki]" are quite sufficient. I assume that the cause of the confusion is the deletion of my IP edits. If this community requires more elaborate comments in such cases, I will be glad to accommodate that. --Dcirovic (talk) 17:32, 14 April 2014 (UTC)
@Dcirovic: For your tests you should use --Succu (talk) 17:50, 14 April 2014 (UTC)

Searching categories

See Wikidata:Project_chat/Archive/2014/04#Searching_categories. Q9107354 is not in the result of [1].--GZWDer (talk) 12:44, 14 April 2014 (UTC)

The issue is that Category:Unionoida is treated as one term and not two. The same will happen with project pages for example. Can you please file a bug for that? We'll have to see what we can do about it. Adding an alias Unionoida makes it work. --Lydia Pintscher (WMDE) (talk) 13:00, 14 April 2014 (UTC)
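The "one term and not two" behaviour can be illustrated with a small sketch. This is not how the Wikidata search is implemented; both tokenizer functions are hypothetical and only contrast splitting on whitespace with splitting on punctuation such as the colon.

```python
import re


def whitespace_terms(label: str) -> list[str]:
    """Split a label on whitespace only; 'Category:Unionoida' stays whole."""
    return label.lower().split()


def punctuation_terms(label: str) -> list[str]:
    """Split on any run of non-word characters, including the colon."""
    return [term for term in re.split(r"\W+", label.lower()) if term]


label = "Category:Unionoida"

# With whitespace splitting the only term is 'category:unionoida',
# so a search for 'unionoida' has nothing to match.
print("unionoida" in whitespace_terms(label))

# Splitting on the colon yields 'category' and 'unionoida' as two terms.
print("unionoida" in punctuation_terms(label))
```

Under this assumption, adding the alias Unionoida works because the alias is indexed as a term of its own.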

Defective rendering of large numbers in French

It was reported that field number of deaths (P1120) of World War I (Q361), rendered as "16,563,868" in English, appears as "16&nbsp;563&nbsp;868" in French (should be "16 563 868"). Is there a bug tracking this issue ? LaddΩ chat ;) 16:01, 14 April 2014 (UTC)

Same in Hungarian for example for population (P1082). --JulesWinnfield-hu (talk) 16:22, 14 April 2014 (UTC)

That should be bugzilla:61911. The fix is done and waiting for the next deployment. --Lydia Pintscher (WMDE) (talk) 09:19, 15 April 2014 (UTC)
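The thread does not state the cause of bugzilla:61911. One common way such output arises, shown here purely as an assumption, is a formatter that emits the separator as the entity text "&nbsp;", which is then HTML-escaped a second time. The helper `group_thousands` is hypothetical.

```python
import html

NBSP = "\u00a0"  # the non-breaking space character itself


def group_thousands(number: int, separator: str) -> str:
    """Format an integer with the given thousands separator."""
    return f"{number:,}".replace(",", separator)


# Separator given as entity text: escaping turns '&' into '&amp;',
# so a browser would display the entity literally.
as_entity = html.escape(group_thousands(16563868, "&nbsp;"))

# Separator given as the character itself: escaping leaves it untouched.
as_char = html.escape(group_thousands(16563868, NBSP))

print(as_entity)
print(as_char == "16\u00a0563\u00a0868")
```

Passing the character instead of the entity keeps the number unbreakable without depending on how many times the string is escaped later.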

Wrong rendering of time values

See the source of instance of (P31) in Aristobulus I (Q335124). I've added it via wbeditentity. --Ricordisamoa 05:53, 15 April 2014 (UTC)

<value time="+00000002014-04-15T05:25:44Z" timezone="0" before="0" after="0" precision="14" calendarmodel="" />
You have added a precision > 11 and written hour, minute and second.
That looks like a known bug to me. -- Lavallen (talk) 07:17, 15 April 2014 (UTC)
Yes that would be precision of seconds which are not supported at the moment - only days or larger. --Lydia Pintscher (WMDE) (talk) 08:59, 15 April 2014 (UTC)
This edit should no longer be possible with the next deployment. --Lydia Pintscher (WMDE) (talk) 09:02, 15 April 2014 (UTC)
Have you considered that it could be more useful to support hour, minutes and seconds correctly? Otherwise, calling the datatype "time" is a bit confusing. --Ricordisamoa 19:45, 15 April 2014 (UTC)
Of course ;-) But that is quite a bit more complicated (conversions between timezones anyone?) So for now this is the solution. --Lydia Pintscher (WMDE) (talk) 20:41, 15 April 2014 (UTC)
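For bot authors, the constraint discussed above (precision 11 means day; 12, 13 and 14 mean hour, minute and second) can be checked before calling wbeditentity. The following is a minimal sketch; `clamp_to_day` is a hypothetical helper and not part of Wikibase, and the dictionary simply mirrors the attributes of the value quoted above.

```python
import re

# Wikibase time precision codes: 9 = year, 10 = month, 11 = day,
# 12 = hour, 13 = minute, 14 = second.
PRECISION_DAY = 11

TIME_RE = re.compile(r"^([+-]\d+-\d{2}-\d{2})T(\d{2}):(\d{2}):(\d{2})Z$")


def clamp_to_day(value: dict) -> dict:
    """Return a copy of a time value limited to day precision.

    Any precision above 11 is lowered to 11 and the hour, minute and
    second fields are set to zero. Other values are returned unchanged.
    """
    match = TIME_RE.match(value["time"])
    if match is None:
        raise ValueError(f"unexpected time format: {value['time']!r}")
    result = dict(value)
    if value["precision"] > PRECISION_DAY:
        result["precision"] = PRECISION_DAY
        result["time"] = f"{match.group(1)}T00:00:00Z"
    return result


submitted = {
    "time": "+00000002014-04-15T05:25:44Z",
    "timezone": 0,
    "before": 0,
    "after": 0,
    "precision": 14,
    "calendarmodel": "",
}
cleaned = clamp_to_day(submitted)
print(cleaned["time"], cleaned["precision"])
```

The sketch leaves the empty calendarmodel as submitted; it only addresses the precision issue raised in this thread.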

Empty diff

? --Ricordisamoa 09:48, 16 April 2014 (UTC)

There is no difference. The IP changed P3x back to the original values. --Succu (talk) 10:14, 16 April 2014 (UTC)
So why did the rollback succeed? --Ricordisamoa 12:57, 16 April 2014 (UTC)
This is interesting because when you move a statement upwards or downwards, there is no visible change in the diff. (Actually, I didn't check whether the IP really moved a statement. The page is too big...) Matěj Suchánek (talk) 14:01, 16 April 2014 (UTC)
I think your revert should simply have been ignored, because there was nothing to revert. Maybe this is a bug?
Moving statements around should produce an edit summary. I think this is a bug. --Succu (talk) 20:46, 16 April 2014 (UTC)
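The hypothesis raised above, that moving a statement leaves no visible change in the diff, can be sketched as follows. This assumes a diff that compares statements without regard to position, which the thread does not confirm; the function names and statement labels are hypothetical.

```python
def changed_statements(old: list[str], new: list[str]) -> set[str]:
    """Statements added or removed, ignoring their position."""
    return set(old) ^ set(new)


def order_changed(old: list[str], new: list[str]) -> bool:
    """True when the same statements appear in a different order."""
    return sorted(old) == sorted(new) and old != new


before = ["statement-a", "statement-b", "statement-c"]
after = ["statement-b", "statement-a", "statement-c"]

# An order-insensitive comparison reports nothing, so the diff looks empty.
print(changed_statements(before, after))

# The revisions still differ, which would explain why a rollback can succeed.
print(order_changed(before, after))
```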

Problems with AbuseFilter

Although a filter is set to warn, users are not warned. (This has been happening since April 9.) Matěj Suchánek (talk) 16:33, 18 April 2014 (UTC)

file usage commons

If an item uses a picture (e.g. [2] in wolf (Q18498)), it should be listed among the used files on Commons (at the bottom of the linked file page). Greetings, Conny (talk) 06:57, 19 April 2014 (UTC).