Wikidata talk:WikidataCon 2017

Previous discussion was archived at Wikidata talk:WikidataCon 2017/Archive 1 on 2017-05-10.

Sebastian Wallroth (talkcontribs)

Hi WikidataCon,

I'd like to make some recordings for the podcasts meta:WikiJabber and de:Wikipedia:WikiStammtisch. What I need is a quiet place to sit down and talk for about an hour per person, a power outlet, and access to the event. Would that be possible?

Lea Lacroix (WMDE) (talkcontribs)

Hello Sebastian,

Thanks for your interest in the WikidataCon. Unfortunately, we don't have any seats left for more attendees, and since the venue will be quite crowded, I can't promise a quiet place to do interviews. In these conditions, I think it's better not to do it.

Other Wikidata-related events take place regularly in Berlin and elsewhere, all listed on Wikidata:Events, if you want to meet Wikidata editors :)

Reply to "Podcasting"

Looking for a keynote speaker: ontology specialist?

Lea Lacroix (WMDE) (talkcontribs)

Hello all,

We plan to have two keynote sessions during the WikidataCon, and we're currently looking for more ideas.

A lot of people mentioned that they would like to have a specialist in ontologies who could give us an overview of how to model high-level ontologies and an external point of view on this topic.

Does any name come to mind? Feel free to share names (explaining a bit why this person would be relevant) and we will contact them quickly.


Multichill (talkcontribs)

I mentioned Frank van Harmelen (Q5490605) before, but it might be a bit short notice now...

Ls1g (talkcontribs)

Giancarlo Guizzardi could potentially be a match ( - the website is outdated, he is in Italy now).

(Conflict of interest warning: he is a colleague of mine)

Gamaliel (talkcontribs)

Tpt (talkcontribs)

You could maybe ask Jamie Taylor who is giving a keynote at ISWC in Vienna a few days before. He is the head of the Knowledge Graph Schema Team at Google and was the schema person of Freebase.

Another possible person is Fabian Suchanek, who does a lot of research around schemas.

GerardM (talkcontribs)

Consider Barend Mons. He is the guy behind Wikiproteins; his paper describes the development and notions of a wiki for biomedical purposes, and it is what made OmegaWiki possible. He does have an article and an item.

Reply to "Looking for a keynote speaker: ontology specialist?"
GoranSM (talkcontribs)

Hi there,

In 2017, we started developing a system that provides statistical analyses and insights from Wikidata usage across the sister projects: the Wikidata Concepts Monitor (WDCM). We have a working prototype that relies on 20 large projects and are currently working to scale it over the whole range of projects. Methods of distributional semantics (topic models, for example) are used to discover important structural information from the statistical patterns of Wikidata usage; there's a front-end under development where visualizations and indicators will be provided. Everything is being developed in R.

If anyone is interested in this topic - not only the WDCM system in itself, but the empirical study of Wikidata usage and drawing important conclusions for the community from it - let me know and we could propose an event together. Anything like a round table or a discussion sounds like an interesting format. I guess people interested in semantics, statistics, content insights, data science, text-mining, sociology of knowledge, cognitive science and similar could profit directly from getting in touch with this project. However, anyone from the community could find this interesting, I guess. Not to mention the value of your inputs for the people who are involved in planning and development.



GerardM (talkcontribs)

Will you include the language that content is requested in?

GoranSM (talkcontribs)

@GerardM English, I guess.

GerardM (talkcontribs)

There are some 280+ languages Wikipedia is available in. Wikidata serves all of them. It is really relevant to understand how important Wikidata content is for each Wikipedia, i.e. each language. We also provide service to Wikisource, Wikinews, etc. English alone does not cut it.

GoranSM (talkcontribs)

@GerardM Sorry, I think I've misunderstood your question. Here we go:

(1) The WDCM system is currently being scaled to track and analyze Wikidata usage across all sister projects, not just Wikipedia, but also Wikisource, Wikinews, etc. By the end of the year, and very probably already in early autumn, the scaled system will be available and ready to use.

(2) As for language content tracking: the current design of the database schema that is crucial for this project does not allow us to track language-specific Wikidata usage precisely (not at the item level of analysis; as already explained, we have the project-level data, e.g. we know what's happening on enwiki, dewiki, frwiki, etc). In the future, if these important tables get re-designed or enhanced, we will be able to track language-specific usage at the level of individual Wikidata entities.

Hope this sounds better?

GerardM (talkcontribs)

Hoi, yes that is good news. You indicate that you will know the use from a Wiki and thereby deduce the percentage that is in a language. You also indicate that the tables do not know about the language used. Given the increasing number of projects from outside the WMF, this is a miss.

How significant this is, and whether it grows relative to WMF traffic, only shows when you measure traffic not identified as coming from a Wiki project. I sincerely hope that the tables allow for this.

GoranSM (talkcontribs)

@GerardM Maybe a prototype explains what the WDCM is:

Reply to "Topics to discuss: Wikidata usage"
Lea Lacroix (WMDE) (talkcontribs)

Hello all,

I'm glad to announce that registration for the WikidataCon is now open! You can access all the information and fill in the form.

Important: due to the time people need to get visas (about 3 months), we changed the deadline for the scholarship applications. You can apply for a scholarship before July 16th. We will then make sure that the applicants receive a response on July 25th.

Reminder: you can still propose a project for the program until July 31st!

If you have any questions or problems, or notice that some information is missing, feel free to reach out to me.

Multichill (talkcontribs)

What kind of crazy system is this? Lessons learned from dozens of previously organized events: never ever make tickets free. Random people will sign up and won't show up, and your actual audience will get annoyed.

I consider myself an active member of the community. I thought this would be a nice event to attend and pay for myself. Forcing me to sign up to some stupid waiting list is not nice at all. I doubt if I will be there.

Reply to "Registration open!"
Lea Lacroix (WMDE) (talkcontribs)

Hello all,

Thanks again for all the feedback and ideas. The call for projects for the WikidataCon is now open. We will accept a great diversity of formats and topics. So, if you want to submit any project, alone or with others, technical or not, it's the perfect time!

You can read the details here; the call for submissions runs until July 31st. If you have any questions, feel free to reach out to the program committee.

(just mentioning people that were involved in the previous discussions: @Snipre @Magnus Manske @Husky @ArthurPSmith @Multichill @Ainali @MichaelSchoenitzer @Amit_gkp @Alangi derick @Sjoerddebruin)


Reply to "Call for projects!"
Lea Lacroix (WMDE) (talkcontribs)

Hello all,

If you plan to apply for a scholarship for WikidataCon, please read the information and apply here :)

The application process will close on July 31st.

Reply to "Scholarships process is now open"
Snipre (talkcontribs)

We already mentioned this in the past, but I think we never had the opportunity to put some effort into a tutorial about ontology for WD. The best would be to invite an ontology specialist and ask them to provide 1) some basics about ontology building (the main questions to answer before starting, the main characteristics and the possible models, with pros and cons), 2) some technical rules we should define and 3) perhaps an expert's vision of what the ontology model for WD could be. For a name I can propose Barry Smith because I read some of his works, but I think it should be possible to find a German professor working in that field. Snipre (talk) 11:33, 22 May 2017 (UTC)

Micru (talkcontribs)

I agree with @Snipre that it would be good to invite someone from an external group, it could be Barry Smith or it could be anyone else from the BFO group or another one. There is a lot to learn from external expertise and a keynote on basic principles would be most useful.

Snipre (talkcontribs)

Ping Léa @Lea Lacroix (WMDE).

TomT0m (talkcontribs)

After reading I think this indeed could be a major source of knowledge for us (and hopefully consensus).

Micru (talkcontribs)

@TomT0m If it is not translated into layman's terms and does not have some practical applications, it is pretty much useless for us in its current form.

TomT0m (talkcontribs)

The point was not to discuss this article but that (one of) its authors is relevant as a speaker. After playing around a lot on Wikidata I can assure you that many items can be linked together just with instance of, part of and probably the properties discussed in this article. This is enough to model a huge number of things in Wikidata without having to go through the property creation process, which is a pain. Temporal relationships are another area where we haven’t done much, and they are very useful for modelling processes, for example. I’d welcome Barry to present us some useful and generic properties that can be applied all around Wikidata.

There is at least one application: an item filled with statements made of those properties, but with no label, description or article in someone’s language, should still give that person a sufficient definition, provided they understand the properties. If we can, with the help of Barry, create and define such a minimal set of properties that can be used on almost every item, I think we will have the basis of some sort of minimal but expressive common language to define items and create a description in any language. We just have to agree on and translate the initial definitions, which is a lot easier than describing a lot of specific properties for many fields. This is the whole point of having an ontology.
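
To make the idea a bit more concrete, here is a rough sketch in Python. It only assumes a tiny set of generic properties whose wording has been translated once per language; the templates, labels and the sample item are hand-written for illustration and are not taken from Wikidata or any real API.

```python
# Hypothetical data: a minimal set of generic properties, each with one
# sentence template per language. Nothing here is fetched from Wikidata.
PROPERTY_TEMPLATES = {
    "instance of": {"en": "is a {value}", "fr": "est un(e) {value}"},
    "part of": {"en": "is part of {value}", "fr": "fait partie de {value}"},
}

# Hypothetical labels for the items used as statement values.
ITEM_LABELS = {
    "star": {"en": "star", "fr": "étoile"},
    "Milky Way": {"en": "the Milky Way", "fr": "la Voie lactée"},
}

# Word used to join several phrases in each language.
CONNECTORS = {"en": " and ", "fr": " et "}


def describe(statements, lang):
    """Build a crude description from (property, value) statements."""
    phrases = []
    for prop, value in statements:
        template = PROPERTY_TEMPLATES[prop][lang]
        phrases.append(template.format(value=ITEM_LABELS[value][lang]))
    return CONNECTORS[lang].join(phrases)


# An item that has statements but no label or description of its own.
unlabeled_item = [("instance of", "star"), ("part of", "Milky Way")]

print(describe(unlabeled_item, "en"))
print(describe(unlabeled_item, "fr"))
```

Only the few property templates need translating; every item described with those properties then gets a rough definition in each language for free.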

Snipre (talkcontribs)

That's typically a subject which should be covered in that presentation: what are the pros/cons of having a classification which is property-oriented vs. a classification relying mainly on the instance/subclass structure.

TomT0m (talkcontribs)

As far as I know, there is no such thing as a «property oriented» classification. Classification is the process of making classes :)

Snipre (talkcontribs)

But you can classify using a lot of classes with few properties, or using a small number of classes and using properties to add more data to an item. Is the comment clearer now?

TomT0m (talkcontribs)

I think I get the idea but I don’t really think this is a well-posed question. First, having a lot of properties is not orthogonal to having a lot of classes.

I think what you want to know is whether it is always relevant to have explicit class membership with an « instance of » statement, and whether it is relevant to have items for specific classes (how deep?).

Take the concept of a « main sequence » star (main sequence (Q3450)). Conceptually, this maps to a class of stars. To retrieve all the stars of this class, we could surely, assuming our star items have enough information about their magnitude and color, create a SPARQL query that finds all its instances. Note that this query must combine the values of magnitude and color to compute the set of stars that match.

I’ll explore two solutions : 1 : we assert the « instance of » relationship on the item 2 : we don’t, and we assume the query is enough, and just mark the stars as « instance of : star»

There is probably a third one, which could be to create a property « sequence of the star », but imho there are so few pros that we should not even think about it. The only thing I could find is that such a field is widely used in infoboxes.

Pros of 2 : we don’t have to put an explicit «instance of» statement. Cons of 2 : The class membership won’t be shown on the Wikidata page about the star item atm. A star that we only know to be in the main sequence, without knowing its magnitude, won’t show up in the query result. Queries have no items atm, so the query does not even exist as far as the Wikidata data model knows.

Pros of 1 : We can add a star that would not show up in the query, because we lack the magnitude or the color, as an instance of «main sequence star». We can always put the «instance of » to the most precise class we know and don’t have to stop at an arbitrary level.

In both cases we keep the item main sequence (Q3450). We probably in both cases have to assert the subclass relationship between the two classes. We keep the properties «magnitude» and «color», so this has no real impact on the number of properties in the database.

As a conclusion, I think this definitely is a false dilemma. There «is» a strong relationship between properties, queries and classes. It does not make sense to oppose any of these three to the others. We just don’t have the technical means yet to reflect them (and eventually use them) in Wikidata. The right question is for the dev team : how to implement and exploit this relationship in Wikibase. The question is more a WikiProject Reasoning question than an ontological one.
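
The two solutions can be compared in a small Python sketch. This is only a toy under stated assumptions: the `Star` records, their values and the `looks_main_sequence` rule (with its made-up thresholds) are hypothetical stand-ins for real items and for the real magnitude/color query, not actual astronomy or Wikidata data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Star:
    name: str
    magnitude: Optional[float] = None     # None = value not known
    color: Optional[float] = None         # None = value not known
    asserted_main_sequence: bool = False  # solution 1: explicit statement


def looks_main_sequence(star: Star) -> bool:
    """Solution 2: derive membership from magnitude and color.

    The linear band used here is invented for illustration only.
    """
    if star.magnitude is None or star.color is None:
        return False  # the query cannot match items with missing data
    return abs(star.magnitude - 6.0 * star.color) <= 2.0


stars = [
    Star("Star A", magnitude=4.8, color=0.65, asserted_main_sequence=True),
    Star("Star B", magnitude=-5.0, color=1.5),  # does not fit the band
    Star("Star C", asserted_main_sequence=True),  # magnitude/color unknown
]

by_query = [s.name for s in stars if looks_main_sequence(s)]
by_statement = [s.name for s in stars if s.asserted_main_sequence]

print("found by query:", by_query)
print("found by explicit statement:", by_statement)
```

Star C illustrates the con of solution 2: it is known to be on the main sequence but lacks the data the query needs, so only the explicit statement finds it. The magnitude and color properties are kept in both cases.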

LauraHale (talkcontribs)

How things are categorized using properties is useful. At present, the structure is such that you can end up in situations where you know the specific type of query you want to run, but it may be impossible without very advanced queries because of sequencing. This makes queries less useful at times, since in some areas there is no way around running them. The application becomes less useful for the query engine unless it can handle the increased load of these complex queries. (It often can't, and you can cause serious server drain with certain accidental query attempts of this nature.) The other issue is that a lot more work is needed on the development side to explain to lay people how to run queries.

If you want a good example of these issues, try to create a list of people who have competed at the Olympic Games or a list of sportspeople by disability sport classification.

TomT0m (talkcontribs)

@LauraHale: Did not understand a word of what you mean, sorry.

GerardM (talkcontribs)

When a speaker is not able to explain in terms understood by us all, he/she is a waste of time because he will not be convincing. As it is there is a huge resistance to the overly detailed sub-classes. Some see it as a two camp situation but the problem is that there is no feedback from the splitters why they do this and how they mitigate the negative effects.

If we are to agree, we have to listen to the arguments by others. When they cannot go beyond the "it is so because it is" we will get a situation similar to the insistence at the beginning of Wikidata to standardise on what the GND does. We are still suffering the consequences; all the issues have not been resolved. This will happen because the mission of Wikidata is huge and much prior art and knowledge is a child of its time and will not fit properly in what we are aiming for. Multi linguality is something that is key to what Wikidata needs to be. We do suffer the consequences of research that is dominantly about the English Wikipedia and the existing data on resources like Wikidata do not consider 280+ languages.

Micru (talkcontribs)

@GerardM It is hard to appraise if the speaker will be understood or not, because we haven't heard him yet ;) That a person publishes a difficult paper, doesn't mean that they cannot adapt the message to the audience.

Wikidata has evolved on its own, but I also agree with the others that we can get insights from researchers in the field. We are not in isolation, we are part of a community, and we can hear what they suggest too. Later on we can see if we like it or not, but let's not close our ears even before hearing what they have to say!

I hope we can agree that inviting one or several ontology experts and hearing their advice/criticism could be a good thing for us. The opposite seems to go more in the direction of becoming an echo chamber, and I hope you prefer to avoid this effect too.

GerardM (talkcontribs)

Yes, we could invite ontology experts to present their opinion on how Wikidata could grow, given what Wikidata is. When this is only as a speaker at some event, it will not make much of an impact because not everyone will be there or have the opportunity to converse about it.

Inviting experts to express an opinion is a good idea, we should. However, when the basics of Wikidata are not properly addressed in their opinion it will not resonate. Multi linguality is something we have a big problem with, we are coping badly imho <grin> and if you ask I will explain again </grin> but when I do it is seriously meant for you to rebut arguments! Thanks,

Lea Lacroix (WMDE) (talkcontribs)

Thank you for your ideas! We plan a "keynote" format for the WikidataCon: a person from outside the Wikidata world coming to give an enlightening presentation or another point of view related to our project. I think that what you suggest would fit perfectly.

Feel free to give me other names if you find some relevant people! We will contact them soon.

TomT0m (talkcontribs)

I think the best thing to do would be not to invite someone for an abstract talk about formal ontology. Obviously we don’t have the basis to build a strong, consistent reasoning system with academic properties in Wikidata - Wikidata is not about truth. I don’t think it should be about top-level ontologies either, as that is a little too abstract for the common Wikipedian. I think it should be about presenting ontology properties and how to use them to build useful models.

GerardM (talkcontribs)

Hoi, nothing wrong with a great presentation that presents its point well. We do not need an abstract talk but we need arguments well presented, proper arguments that explain. Arguments that we can ponder, arguments that we can see in the light of what we aim to achieve. They may even help us to (re-)consider what it is we may achieve. So let us consider who we are and what we achieve and bring out the best in all of us.

When ontology properties help us build models, use those properties in the light of our multi lingual project, a project where the granularity of the least granular languages should be the maximum granularity for our project. <grin> do you see this point </grin> Thanks

TomT0m (talkcontribs)

I don’t know what the granularity of a language is. I don’t really think it matters anyway, since what is important is that the properties can be explained in any of our languages, which should not be a problem. Say we write an English text that explains how to use a property. That text will surely be translatable and adaptable into any human language. The worst that can happen is that it will be longer in some languages than in others.

GerardM (talkcontribs)

Hoi, you cannot explain something in a language that has no concept for what you take for granted. No, the notion that everything is translatable is a fallacy. We should not make the expressiveness of English rule our classes and subclasses.

TomT0m (talkcontribs)

Replace «translate in the … language» by «can be explained in the … language» and see if you think the claim is still valid. If there is no word, you can translate the definition. If you can’t explain, then it’s of no use for Wikidata and it’s pointless to even try. We should stop right now :p

A concept is no word. When you learn a concept at school the teacher has to explain it, the bare string carries no meaning. He may first explain the concept before giving it a name … concepts can be invented and later named (I’d say this happens all the time).

GerardM (talkcontribs)

<grin> in one of the Inuit languages they are said to have more than 20 words for ice and snow. They cannot explain it to me because I do not share the same references with them. But you overreact. We should not go overboard with subclasses. At that I am allergic to the infinite splitting that is going on. It may work for you but for me we have an audience, an audience that are not served by such folly.

TomT0m (talkcontribs)

I don’t understand how you want to solve the Inuit snow problem. One way or another you would have to create items to describe them, and this whether or not you decide they are snow subclasses. If you are referring to ruler subclassing, the item contains the information on which kingdom is used. This should be exploitable with automatic labels or descriptions, avoid using qualifiers all the time, and reduce redundancy that way.

GerardM (talkcontribs)

Hoi, <grin> maybe some problems do not need to be solved. They are all frozen water as far as I am concerned </grin>. Your problem is that you want to solve it, I do not. I want us to share first and foremost. When solutions are not experienced as solutions they are not a solution.

TomT0m (talkcontribs)

So your solution to sharing would be … not sharing? This is nonsense.

GerardM (talkcontribs)

No, the solution would be not to go to the same depth of sub subs.

Reply to "Topic to discuss: ontology and WD"
Micru (talkcontribs)

As many of you know, there was a proposed Wikidata Community User Group back in 2015. It was not pushed further because it was not seen as necessary, as most of the potential activities are managed by WMDE. Maybe it could be a topic of discussion, especially if more organizations want to get involved in Wikidata.

Sjoerddebruin (talkcontribs)

Yeah, it would be great to have some discussion about this.

Lea Lacroix (WMDE) (talkcontribs)

Good idea, since this topic has also been discussed during the meetup in Berlin. We will have a "discussion" format in the program :)

Reply to "Wikidata Community User Group"
Lea Lacroix (WMDE) (talkcontribs)

Hello all,

The WikidataCon will be built by and for the Wikidata community. In that regard, we are actively looking for volunteers to take part in the most important projects of the organization.

  • Program committee: will coordinate and select the content of the event. → Learn more and apply before May 17th
  • Scholarships committee: will define criteria and select people that will receive a scholarship. → Learn more and apply before May 17th

But also open groups on various topics:

  • Accessibility: can be involved on any topic regarding the accessibility (on a large scale) of the event, respect for diversity, and inclusiveness.
  • Social events: will help people connect with each other during the conference.
  • Communication: logos, pictures, posters, signs, goodies, and all the things.
  • Logistics: about all practical questions.
  • Catering: because food is an important topic.
  • Documentation: keep track of everything happening.

If you don't find answers to your questions on the sub-pages of Wikidata:WikidataCon_2017, feel free to ask on this talk page! At any time you can reach me to talk about the event.

Reply to "Join a volunteer team :)"