Wikidata talk:WikiProject Taxonomy

On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at July.

Count of taxon name (P225) and parent taxon (P171)

Count of P225 and P171

We now have more than 500,000 parent taxon (P171) statements, but there is still a lot of work to do. --Succu (talk) 16:00, 20 March 2014 (UTC)

Yes, this is less than a third? - Brya (talk) 19:22, 20 March 2014 (UTC)

It is noteworthy that the curve of P225 appears to be flattening, stabilizing a little below two million. This would be within the expected range for an endpoint. - Brya (talk) 05:51, 21 March 2014 (UTC)

250,000 more items are now connected via parent taxon (P171). --Succu (talk) 14:03, 18 May 2014 (UTC)

A 50% increase over the last report, and approaching the halfway mark! P225 appears indeed to have stabilized. - Brya (talk) 05:47, 20 May 2014 (UTC)

I updated the diagram. --Succu (talk) 08:27, 29 June 2014 (UTC)

Interesting, P225 is rising again, while the climb of P171 is flattening. Still not reached the halfway point for P171. - Brya (talk) 10:14, 29 June 2014 (UTC)

Archive index is broken on this page

The archive index of this page doesn't show a link to the 2014 archives (although the archives exist). Please fix it.

AlgaeBase property proposal

I've added a proposal for an AlgaeBase property. Please comment. -- Alexander Vasenin (talk) 21:28, 10 June 2014 (UTC)

also known as

I see that the designers of the Wikidata game have chosen to treat an "also known as" as if it were an "is the same as". Usually this will be a wrong assumption. I guess this means that we have to eliminate just about all the "also known as"-statements so as to prevent stupid merges. And all the information will have to go to the Talk page. - Brya (talk) 17:13, 11 June 2014 (UTC)

@Brya: why don't you contact Magnus Manske and explain the problem to him? That seems more efficient if a solution can be found at the root of the problem. TomT0m (talk) 17:19, 11 June 2014 (UTC)
I am not treating any of these properties in any way. However, if one of the items links to the other (irrespective of the property), they should not be shown in the game. In any case, bad merges are the fault of the user ordering the merge, not the game. --Magnus Manske (talk) 10:39, 12 June 2014 (UTC)
Well, this page says "Some topics have duplicate items on Wikidata. Two items with the same title or alias will be suggested to you." Usually, there won't be a link to the other item, as Wikidata is quite short on fundamental properties, and "also known as" is the only field available.
But, yes, I agree that bad merges are the fault of the user ordering the merge, not the game, and given how popular the game is, the number of bad merges is not all that high. Still, if that number gets higher it will become necessary to eliminate "also known as". - Brya (talk) 10:54, 12 June 2014 (UTC)
Hi Magnus Manske, I'm not here to hand out points or say whose fault is what :) But if there is a way to solve this problem, either by reducing the set of merge candidates or by warning that things are sometimes a bit subtle in taxonomy items, maybe it's a good idea to do so. "The game" is addressed to anyone, and a lot of its users are not aware of taxonomy. There are subtleties in that field, such as the doubling of species and other kinds of living organisms with almost the same name, that are not really easy to catch with the naked eye. TomT0m (talk) 10:59, 12 June 2014 (UTC)
And I'm saying that it's a game because merge decisions usually need a human to decide; otherwise, it could be a bot :-) That aside, I believe I now understand what you mean; you are calling aliases "also known as", right? Well, at no point do I suggest that the two items the game shows are the same (again, if I knew that, I could just make a bot instead). As quoted above, they are items "with the same title or alias", nothing more. And yes, many of these pairs are the same topic (the number of "same" and "different" decisions is about 50:50), which is why they should be merged. --Magnus Manske (talk) 11:33, 12 June 2014 (UTC)

Side note: @Magnus Manske, when you have an afterthought that you might have made a mistake in "The game", there are not a lot of ways to review your work, other than looking at your edit history on Wikidata. Any idea on how to improve this? TomT0m (talk) 11:19, 12 June 2014 (UTC)

If you click on Settings (or your user name in the title), you get a list of your last actions in the game. More views are certainly possible, but not a priority for me right now. Feel free to submit code. --Magnus Manske (talk) 11:33, 12 June 2014 (UTC)
Yes, but many are not the same (see here for some of the known exceptions) and should not be merged. This is all the more so in the case of the "also known as". - Brya (talk) 18:45, 12 June 2014 (UTC)

User script for taxonomy statements sorting and highlighting

I've written a small script which can be useful for our project. It reorders statements so that instance of (P31) is always on top of the list, then come taxon name (P225), taxon rank (P105), parent taxon (P171), and then the other properties. It also highlights statements with colors (by type) for easier editing. To use it, add the following line: importScript('User:Alex.vasenin/taxohelper.js'); to your common.js file. Hope you find it useful ;-) -- Alexander Vasenin (talk) 20:40, 12 June 2014 (UTC)
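The ordering rule described above can be illustrated with a minimal sketch. This is not the user script itself (which is JavaScript and is not reproduced on this page); it is only a Python illustration of the sorting idea, and the names PRIORITY and sort_statements are hypothetical.

```python
# Hypothetical priority list mirroring the order described in the post:
# instance of, taxon name, taxon rank, parent taxon, then everything else.
PRIORITY = ["P31", "P225", "P105", "P171"]


def sort_statements(property_ids):
    """Return the property IDs with the priority ones first.

    Properties that are not in PRIORITY all share the same sort key, and
    sorted() is stable, so they keep their original relative order.
    """
    rank = {pid: i for i, pid in enumerate(PRIORITY)}
    return sorted(property_ids, key=lambda pid: rank.get(pid, len(PRIORITY)))


# P18 and P846 are arbitrary examples of "other" properties.
example = ["P18", "P171", "P225", "P846", "P31", "P105"]
print(sort_statements(example))
# ['P31', 'P225', 'P105', 'P171', 'P18', 'P846']
```

Anyone who prefers a different order only has to change the priority list, which matches the later suggestion to edit the arrays in a private copy of the script.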

Why do you think instance of (P31) should be the topmost statement? --Succu (talk) 20:55, 12 June 2014 (UTC)
Because it's a fundamental property of any entity. It answers the question: what is it? -- Alexander Vasenin (talk) 21:03, 12 June 2014 (UTC)
P31=taxon is the most superfluous of statements, and is often not added. Somebody should still present a case of why it is there at all. - Brya (talk) 05:15, 13 June 2014 (UTC)
The purpose of the script is to present statements in the most digestible and user-friendly way. If you don't like instance of (P31), feel free to copy the script to your userspace and remove the first elements from both JavaScript arrays -- Alexander Vasenin (talk) 07:58, 13 June 2014 (UTC)
"Somebody should still present a case of why it is there at all": We did, multiple times; you just do not want to hear it. It's not useful to you because you have your own classification system here. It's totally redundant with the one used in the rest of the project, so actually, YOU have to make a case for why you don't use it. Here "taxonomy" is treated as a synonym of biological taxonomy, which ignores that the meaning of this term is actually broader. See taxonomy (Q7211) (View with Reasonator). TomT0m (talk) 10:29, 13 June 2014 (UTC)

Separate project for biology?

Is general biology in the scope of this project? I think we should have a quick vote if we should include properties like ploidy (P1349) and spore print color (P787). My only concern is that there will be many more such properties in the future and that this project will be too overloaded for new contributors. It might be better to focus on identifiers and taxonomy here and outsource the rest of biology. Please also think about biometrics in your decision. I am sure that we will have a barrage of properties like "average heart beat", "number of teeth", "liver volume" and "average hibernation duration" as soon as those data-types become available. -Tobias1984 (talk) 11:09, 13 June 2014 (UTC)

Tobias, yes, I think a separate project for biology would make sense. Properties like ploidy (P1349) are in scope for the Wikiproject Molecular biology, but spore print color (P787), "average heart beat", "number of teeth", "liver volume", "average hibernation duration" and other more macroscopic biological properties are not. They are also outside the scope of taxonomy, although they might be of interest to taxonomy (as they are to molecular biology). Emw (talk) 11:52, 13 June 2014 (UTC)
Comment: This is not real opposition to making a separate project, just pulling on a thread I opened in the previous discussion. Before genetic taxonomy and modern phylogenetic tools, taxonomy was closely related to the study of the common characteristics of organisms, as I understand it. Vertebrates, for example, were the class of all animals with this characteristic. This makes a link between OWL class expressions and biological taxonomy on Wikidata: these two WikiProjects (biology and taxonomy) are (will be) in fact closely related, as we can define classes of organisms as a function of the properties (as in Wikidata properties) in Wikidata :) Metaclasses like Clade are actually modern developments of taxonomy. Modern languages like OWL2 are perfectly fine with classifying the classes and reasoning about taxonomy itself. Actually, metaclasses are interesting for studying the history of science (this one is for Emw :) ). TomT0m (talk) 15:27, 13 June 2014 (UTC)
To answer your initial question Tobias, it's clearly beyond the scope of the project. --Succu (talk) 21:37, 13 June 2014 (UTC)
I agree with Succu, although I would not know what "general biology" is. - Brya (talk) 05:07, 14 June 2014 (UTC)
@Brya: Now that I read the sentence again, it does sound like I am referring to an introductory class at a US university. What I meant is "biology with the exception of taxonomy", and it would only be a Wikidata-organizational thing.
Another thought: It is important that the WikiProjects have a healthy size and a healthy volume of activity. People only watch pages where they get a reasonable amount of relevant information in digestible portions. -Tobias1984 (talk) 21:18, 14 June 2014 (UTC)
Tobias1984: Dog breeds task force and Cat breeds task force are „dead”. What are we doing with these? --Succu (talk) 21:31, 14 June 2014 (UTC)
@Tobias1984: I'm new here, but I have some experience in project planning. The best way to refine the project scope is to get back to the project goal. The goal of phase 2 is to deliver infoboxes to the Wikipedias. There are quite a lot of taxoboxes in the Wikipedias, hence our project. General biology infoboxes are different (despite taxonomy still being a part of biology). -- Alexander Vasenin (talk) 21:45, 14 June 2014 (UTC)
✓ Done Wikidata:WikiProject Biology -Tobias1984 (talk) 11:07, 19 June 2014 (UTC)


There are some infoboxes describing fossil taxa, which are sometimes marked with † (dagger). Is there an established practice for marking a taxon as extinct in Wikidata? -- Alexander Vasenin (talk) 23:07, 14 June 2014 (UTC)

Not that I know of, but I do agree there should be a way to include this information. BTW: to me "fossil" and "extinct" are different things. - Brya (talk) 05:22, 15 June 2014 (UTC)
@Alex.vasenin: For taxa that went extinct you can set temporal range end (P524) (together with temporal range start (P523)). Helpful link to all time slices: User:Tobias1984/Geologic Time Scale. For species that went extinct recently you can set temporal range end (P524) = Holocene. But we need an additional property for "calendar date of extinction" or "last sighting" for the year the last specimen was sighted. -Tobias1984 (talk) 10:48, 18 June 2014 (UTC)


Paul Silva (1922-2014). - Brya (talk) 05:27, 15 June 2014 (UTC)

Paul Claude Silva (Q10346275) for reference. -Tobias1984 (talk) 10:43, 18 June 2014 (UTC)
For those who needed that reference, Paul Silva was the founder (and builder) of the AlgaeBase which just got its own property and may be regarded as the "Brummitt for Algae". - Brya (talk) 16:46, 18 June 2014 (UTC)

Ptereleotridae vs Ptereleotrinae

Does anyone have a suggestion for how to fix this mess: Ptereleotridae (Q1423033)? The family Ptereleotridae and the subfamily Ptereleotrinae are not the same thing, but it looks like the Wikipedias treat them as synonyms (for example, enwp redirects both terms to dartfish). -- Alexander Vasenin (talk) 16:51, 15 June 2014 (UTC)

It is best to just have one rank per item / each rank its own item. In this case this is obligatory as the French Wikipedia has a page on each. - Brya (talk) 17:41, 15 June 2014 (UTC)
I prefer clarity too, but that way we lose many of the interwiki links. Well, at least someone might notice they aren't synonyms. Thanks. -- Alexander Vasenin (talk) 18:37, 15 June 2014 (UTC)
Interwiki links are important, and it is always nice to have as many as possible linking together. In this case the interwiki links must be divided over two items, so there is no choice. It is good anyway to have a separate item for each rank and for each name (with exceptions for homotypic names and for names used at several ranks), so as to organize the data. As the Wikipedias grow, the number of interwiki links will grow as well, so it is best to plan for it and have sufficient items here that will offer a place for them. - Brya (talk) 05:15, 16 June 2014 (UTC)

Lsjbot now starting with plant species

For your information: Lsjbot is now starting with plant species. Lsjbot has already started generating its bot articles on warwiki. --Succu (talk) 18:58, 16 June 2014 (UTC)

Just great. Is there a possibility that somebody could explain to Lsjbot about ranks: these entries look like gibberish. - Brya (talk) 05:50, 17 June 2014 (UTC)
Also on ceb-wiki. The recent edits on ceb-wiki do take ranks into account, although the earlier ones didn't. The edits on war-wiki still don't. Mysteries. - Brya (talk) 18:27, 19 June 2014 (UTC)

Since the end of May, Cheers!-bot has been creating bot articles for viwiki (vi:Thể loại:Bài do bot tạo). --Succu (talk) 18:16, 20 June 2014 (UTC)

Yes, but these are based on The Plant List, and the level of error is a lot lower than in CoL, which Lsjbot uses. - Brya (talk) 18:47, 20 June 2014 (UTC)

Sit down and read: For This Author, 10,000 Wikipedia Articles Is a Good Day's Work: In Sweden, Sverker Johansson and His 'Bot' Have Created 2.7 Million Articles; Some Purists Complain. --Succu (talk) 19:46, 21 July 2014 (UTC)

Yes, I had seen it: "Mr. Johansson has to find a reliable database, [...]" No mention of the fact that he is not literate enough in his chosen subject to manage a straightforward copy-and-paste, let alone recognize a reliable database. - Brya (talk) 05:37, 22 July 2014 (UTC)

Wikidata:Requests for permissions/Bot/Structor

I'm planning (and already doing) some structural edits to species items. Specifically, I add Russian/French/German standard labels and descriptions, instance of (P31) taxon (Q16521) and parent taxon (P171). Example of an edit. The algorithm is as follows. First I search for the Latin genus name and check whether there are homonymous taxa. If there are not, I take the search results as a draft list and then filter it manually. Then my program performs automatic edits through the wbeditentity module of the API. Does anyone mind? --Infovarius (talk) 11:29, 1 July 2014 (UTC)
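The homonym check in the first step could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name split_by_homonymy, the (item ID, taxon name) input shape and the placeholder item IDs are all hypothetical, and nothing here is taken from the bot's actual code. It also only catches homonyms that already have items in the search results.

```python
from collections import defaultdict


def split_by_homonymy(items):
    """Separate taxon names held by one item from names held by several.

    items: iterable of (item_id, taxon_name) pairs, e.g. from a search.
    Returns (unique, ambiguous):
      unique    maps taxon_name -> the single item_id using that name
      ambiguous maps taxon_name -> sorted list of item_ids sharing it
    """
    by_name = defaultdict(set)
    for item_id, name in items:
        by_name[name].add(item_id)

    unique = {n: next(iter(ids)) for n, ids in by_name.items() if len(ids) == 1}
    ambiguous = {n: sorted(ids) for n, ids in by_name.items() if len(ids) > 1}
    return unique, ambiguous


# Placeholder IDs, not real Wikidata items. "Pieris" is used because the
# name exists both for a plant genus and for a butterfly genus.
search_results = [
    ("Q-example-1", "Quercus"),
    ("Q-example-2", "Pieris"),
    ("Q-example-3", "Pieris"),
]
unique, ambiguous = split_by_homonymy(search_results)
print(unique)     # {'Quercus': 'Q-example-1'}
print(ambiguous)  # {'Pieris': ['Q-example-2', 'Q-example-3']}
```

Only the names in unique would be candidates for automatic edits; everything in ambiguous would be left for manual review.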

Some remarks:
  1. For French labels see Add Label = taxon name (P225) above.
  2. German labels and descriptions are set by my bot. If you want to do this too, please check the German label of the genus and use the form Art der Gattung trivialname (scientific name) if appropriate.
  3. Checking the uniqueness of a genus name within Wikidata is not enough. You have to ensure that you do not mix an animal species with a plant genus. How will you do this?
--Succu (talk) 17:02, 2 July 2014 (UTC)
1. According to Wikidata:Requests for comment/Automatic labelling I can add taxon name (P225) as a label for de/fr/es/it/ru. 2. I am already using that form. 3. Species with genus? How is that possible? Maybe you mean animal species with plant species. In that case I assume that we already have items for both genera, and so I can separate them. Infovarius (talk) 17:55, 2 July 2014 (UTC)
  1. German has no votes, and the voter for French refused to do this with his bot (see above). But I don't care.
  2. Fine, but I haven't seen a test edit.
  3. I mean adding a genus belonging to the animal kingdom as parent taxon (P171) to a plant species with the same generic name. You assume a lot. But on what basis? A bot shouldn't. Have a look at the history of "Unique value" violations of taxon name (P225). Nearly every day you'll find new generic names belonging to different kingdoms, or later homonyms within the same kingdom. So „guessing“ is not an option.
--Succu (talk) 18:29, 2 July 2014 (UTC)
@Infovarius: What exactly do we gain from this subpage? --Succu (talk) 20:25, 14 July 2014 (UTC)
You can propose the next genus to work on. You can also check that I am not doing anything wrong (I plan to avoid ambiguous taxa at first). --Infovarius (talk) 20:29, 14 July 2014 (UTC)
@Infovarius: On what grounds will you be doing this (choosing the next genus to work on)? --Succu (talk) 20:36, 14 July 2014 (UTC)
At present arbitrarily. But I can consider some systematic approach if you wish. Infovarius (talk) 20:39, 14 July 2014 (UTC)
„arbitrarily“ fits a lot of your contributions well. A „systematic approach“ on your part is more than overdue. --Succu (talk) 20:45, 14 July 2014 (UTC)
I like you too, but please avoid personal attacks. Thread is frozen. --Infovarius (talk) 22:01, 14 July 2014 (UTC)


Hardly a major botanist, but a productive one: Lyn Craven 1945-2014. - Brya (talk) 19:21, 11 July 2014 (UTC)

Would you mind updating Lyndley Craven (Q6708677) accordingly? --Succu (talk) 20:03, 11 July 2014 (UTC)
I would rather not. I don't see at all why somebody who went through life as Lyn Craven should suddenly be transformed to Wikimedia's Lindley Alan Craven (not even registering his actual name). It does not make sense to me (close to Original Research) and I would rather have nothing to do with it. - Brya (talk) 04:56, 12 July 2014 (UTC)
@Brya: His nickname is even mentioned in this biography, so there is no original research involved. I updated the item and the Wikis. -Tobias1984 (talk) 09:38, 22 July 2014 (UTC)
Well, birth names are all too often mentioned in biographies (unless the person took great care never to let it leak), but for most people it is never used in real life. I subscribe to the school of thought that people should be referred to by the name they used all their life, and not by the name they were registered under, at birth. There are plenty of people who indeed are careful to keep their birth name concealed, and their birth names are not their fault, in the first place. - Brya (talk) 16:40, 22 July 2014 (UTC)

Autolist link

Go to Autolist2 and put claim[225] AND claim[31] AND noclaim[31:16521,31:310890,31:502895,31:713623] in the WDQ section. This lists all items that have a taxon name and an instance of, where that instance of is not taxon, monotypic taxon, common name or clade. There are 333 of them. Some have "no value" for taxon name, so they are not mistakes, but there are some mistakes in there that mix disambiguation pages with taxa. 10:16, 18 July 2014 (UTC)
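To make the logic of that query easier to follow, here is a small sketch of the same filter applied to items held in memory. It does not call Autolist or WDQ; the dictionary layout (numeric property ID mapped to a set of numeric item IDs) and the names EXCLUDED_P31 and matches_query are assumptions made for illustration.

```python
# The four instance-of values excluded by the noclaim[...] part of the query:
# taxon, monotypic taxon, common name, clade.
EXCLUDED_P31 = {16521, 310890, 502895, 713623}


def matches_query(claims):
    """Return True if an item would be listed by the query.

    claims: dict mapping a numeric property ID to the set of numeric item
    IDs used as values (assumed layout; P225 values are ignored here, so
    an empty set stands in for any taxon name, including "no value").
    """
    has_taxon_name = 225 in claims            # claim[225]
    has_instance_of = 31 in claims            # claim[31]
    # noclaim[31:...]: none of the excluded classes may appear under P31.
    has_excluded = bool(claims.get(31, set()) & EXCLUDED_P31)
    return has_taxon_name and has_instance_of and not has_excluded


ordinary_taxon = {225: set(), 31: {16521}}
# Q4167410 is "Wikimedia disambiguation page".
disambiguation_with_taxon_name = {225: set(), 31: {4167410}}

print(matches_query(ordinary_taxon))                  # False
print(matches_query(disambiguation_with_taxon_name))  # True
```

An item that is both a disambiguation page and a taxon (P31 holding both values) would not be listed, because one excluded value is enough to fail the noclaim condition.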