User:ArthurPSmith/5th Birthday

From Wikidata

Happy Birthday Wikidata!!!

I first ran into Wikidata while I was trying to reconcile the French (frwiki) and English (enwiki) Wikipedia lists and descriptions of the universities in France. Many French universities went through a major reorganization starting in July 2013 thanks to an initiative of French president François Hollande. The changes, which included mergers, splits, and renamings of many major institutions, showed up quickly on frwiki, but even two years later there was almost no sign of the new situation on enwiki. Where I work, we were at the time relying on a source of organization information that pulled almost all its data from enwiki, and so we had many people coming to us claiming to be from French universities that our source didn't know about. I spent some time reading the frwiki pages and some of their references (and university websites) to try to sort things out, and then started updating enwiki pages (or creating new ones) to reflect the current situation. That led to dealing with the interwiki links, and to discovering this amazing wiki-based dataset behind it all. I suddenly felt that here was the solution to a lot of problems I'd been dealing with over the years!

Beyond looking into how best to represent universities and other organizations, my background in physics made me interested in what Wikidata was doing with regard to physical properties, particles and concepts. Physicists haven't had much luck with "theories of everything" (our theories of "almost everything" are pretty good), but it looked to me like Wikidata came very close to providing a "data model for everything"! Leaving the class hierarchy to properties rather than making it something intrinsic to the data model means that a Wikidata item can represent anything at just about any conceptual level. And the property system is very rich. I was particularly excited when the Quantity datatype *with units* became available a little after I started using Wikidata. I learned how to use QuickStatements to do a lot of edits quickly (fixing up relationships between universities, for example), but I was also soon running a "bot" to add half-life (and other) data pulled from the US National Nuclear Database to the various items for radionuclides in Wikidata. Check out the Wikidata Chart of the Nuclides - https://tools.wmflabs.org/ptable/nuclides - a web application I wrote with Ricordisamoa, which pulls its information entirely via Wikidata properties and links to every one of the over 6000 nuclides listed in Wikidata.
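To make the point about the class hierarchy living in properties a little more concrete, here is a minimal Python sketch. It assumes a toy, in-memory list of statements rather than the real Wikidata API; the item IDs (`Q_carbon14` and so on) and the helper names are hypothetical placeholders invented for illustration. Only the property IDs P31 ("instance of") and P279 ("subclass of") are real Wikidata properties.

```python
from collections import defaultdict

P_INSTANCE_OF = "P31"   # real Wikidata property: "instance of"
P_SUBCLASS_OF = "P279"  # real Wikidata property: "subclass of"

# Toy statements as (subject, property, value) triples.
# The item IDs are hypothetical placeholders, not real Wikidata QIDs.
STATEMENTS = [
    ("Q_carbon14", P_INSTANCE_OF, "Q_radionuclide"),
    ("Q_radionuclide", P_SUBCLASS_OF, "Q_nuclide"),
    ("Q_nuclide", P_SUBCLASS_OF, "Q_physical_object"),
    ("Q_some_university", P_INSTANCE_OF, "Q_university"),
    ("Q_university", P_SUBCLASS_OF, "Q_organization"),
]


def build_index(statements):
    """Map (subject, property) to the set of values stated for it."""
    index = defaultdict(set)
    for subject, prop, value in statements:
        index[(subject, prop)].add(value)
    return index


def superclasses(index, cls):
    """Return every class reachable from cls by following P279 statements."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in index.get((current, P_SUBCLASS_OF), ()):
            if parent not in seen:  # guards against cycles in the data
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance_of(index, item, cls):
    """True if item has a P31 value that is cls or a (transitive) subclass of cls."""
    for direct in index.get((item, P_INSTANCE_OF), ()):
        if direct == cls or cls in superclasses(index, direct):
            return True
    return False
```

Nothing in the code treats "class" as a special kind of thing: the hierarchy is just more statements, so the same item could be a class in one statement and an instance in another.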

Since properties seemed so central to everything that made Wikidata so great, I quickly got involved in the property proposal discussion process, and it wasn't long before I was given the opportunity to be a property creator. Since then I've added over 240 properties to Wikidata and participated in around a thousand discussions. The vast majority of these (as with the overall collection of properties) are external ID types - the migration of old string IDs to external IDs was another big effort I got drawn into, including providing another app to support some of the IDs, like IMDb, that have complex formatter URL logic.
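As a rough illustration of why some IDs need more than a plain formatter URL, here is a small Python sketch. It is not the app mentioned above; the function names and the prefix table are assumptions made for this example. It relies only on the Wikidata convention that a formatter URL contains `$1` where the ID goes, and on the fact that IMDb IDs start with a prefix such as `tt` (titles), `nm` (names) or `co` (companies) that determines the URL path.

```python
def expand_formatter(formatter_url, external_id):
    """Apply a Wikidata-style formatter URL by substituting the ID for '$1'."""
    return formatter_url.replace("$1", external_id)


# Assumed mapping from IMDb ID prefix to URL path segment (illustrative subset).
IMDB_PATHS = {
    "tt": "title",
    "nm": "name",
    "co": "company",
}


def imdb_url(external_id):
    """Pick the URL pattern from the ID's two-letter prefix, then expand it."""
    path = IMDB_PATHS.get(external_id[:2])
    if path is None:
        raise ValueError(f"unrecognized IMDb ID prefix in {external_id!r}")
    return expand_formatter(f"https://www.imdb.com/{path}/$1/", external_id)
```

A single formatter URL covers the simple case, while `imdb_url` shows the extra dispatch step that a one-pattern property cannot express on its own.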

While Wikidata can do wonderful things now, and there are more great things in the development pipeline, it seems to me we have barely scratched the surface of what this amazing wiki-powered database can do. I am most excited about the applications that can be built on top of it. The "chart of the nuclides" app took me only a few hours of work, and I was pretty new to coding in Python. I've been really impressed with Scholia, but I think even that is only a taste of what Wikidata will empower in coming years. There are many businesses and analysts whose income is based on their ability to take data that other people don't know how to get or understand, and put it together into pretty charts and summaries. On the one hand, maybe all those folks will be out of jobs when the same can be done with a few clicks in the Wikidata query interface, or through some free apps built on top of it. On the other hand, maybe the ease of access to powerful datasets will greatly expand the power of what these businesses and analysts do, enabling far better decision-making in a wide variety of human endeavors all over the world. Either way, the world is a better place thanks to Wikidata!

OK, yes, Wikidata has been a big time sink for me. But I've loved it, and I feel I've gotten to know a lot of Wikidatans who are doing their part to make good quality free data available to the world. A few of them I've now met personally thanks to events like WikiCite, and I've had a wonderful time meeting lots more at WikidataCon in Berlin this weekend. Yes, we don't always agree on everything - but if this stuff were easy, somebody else would have done it already. I think Wikidata is one of the greatest (if so far little known) projects of the 21st century. Happy 5th Birthday to Wikidata!