Wikidata:Database reports
A collection of database reports relevant to the upkeep of Wikidata. Inspired by en:WP:DBR. Requests for new reports can be made on the talk page. The source code is posted on GitHub. Patches and contributions are all welcome.
Statistics
- Bots by edit count (last updated at 19:03, 2013-05-22)
- No. of labels, descriptions, aliases and links for items per language
- Some statistics about labels, descriptions and links
- Statistics for statements, claims and sources
Interwiki
- Items with links to userpages (last updated at 00:30, 25 May 2013 (UTC))
- Items missing a specific language's link (last updated at 03:24, 10 April 2013 (UTC))
- Pages with local langlinks
- Remaining interwikis - lists of articles that still have interwiki links (updated constantly, although with millions of articles some entries are likely to be wrong)
- Progress of sitelink removal
Properties
- Most used properties (last updated at 00:53, 25 May 2013 (UTC))
- Most linked to items (last updated at 11:03, 14 May 2013 (UTC))
- Items that are missing a property that is logically inherited (last updated at 01:50, 25 May 2013 (UTC))
- Non-unique values, invalid value formats, etc.: Wikidata:Database reports/Constraint violations
- List of all properties with language and usage statistics (from database dumps - may be outdated)
- Items using themselves as values for claims, qualifiers or sources (from database dumps - may be outdated)
- Conflicting or missing sex of items used as value for a sex-specific property (from database dumps - may be outdated)
- Items which are a subclass, but not of same type as the superclass (from database dumps - may be outdated)
- Disambiguation items used in claims, qualifiers or sources (from database dumps - may be outdated)
Labels And Descriptions
- Properties without label
- Persons without an English (en) label
- Terminator: Top 1000 labels without descriptions (English, German, French, Spanish, Italian)
Possible Conflicts
Taxa
- Taxon check – Different items with same taxon names (may be duplicate items or wrong taxon names)
- Used taxons – List of all items used as taxon ranks (some may not be taxa)
- Conflicting taxa – Conflicts between use of taxon specific properties and taxon rank property
Sources
Reports can be based on several sources:
- Toolserver queries
  - Age depends on the replication lag (replag) of the Toolserver. This can be limited to minutes, but sometimes grows to days or weeks.
    For information about replag, see: https://wiki.toolserver.org/view/Replag
    If Wikidata is on server 5, for the current lag see: http://toolserver.org/~bryan/stats/replag/?cluster=s5
- Full database dumps
  - Age depends on the availability of a new full dump. The latest dump can be less than a week old, but is sometimes weeks or months old.
    For the last available ones, see: http://dumps.wikimedia.org/wikidatawiki/
- Incremental database dumps
  - These are made available daily and include most information since the last full dump.
    For the last available ones, see: http://dumps.wikimedia.org/other/incr/wikidatawiki/
    See also: status of today's incremental dump
- Wikidata API
  - This provides live data. See: http://www.wikidata.org/w/api.php
Depending on the source used to generate the reports, these can be more or less up to date.
The planned query feature will allow reports to be queried and displayed directly on Wikidata.
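As an illustration of the Wikidata API source listed above, the following is a minimal sketch (not the code used by any of the reports on this page) of how a per-item count of labels, descriptions, aliases and links could be derived. It assumes the `wbgetentities` API module and its usual JSON layout, in which labels, descriptions and sitelinks are keyed by language or site and aliases are lists per language. The helper names `build_entity_query` and `count_terms` and the sample record are hypothetical and hand-written for this example.

```python
from urllib.parse import urlencode

# Endpoint named in the list of sources above.
API_URL = "http://www.wikidata.org/w/api.php"


def build_entity_query(item_ids, props=("labels", "descriptions", "aliases", "sitelinks")):
    """Return a wbgetentities request URL for the given item IDs (hypothetical helper)."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(item_ids),    # several IDs are separated by "|"
        "props": "|".join(props),     # restrict the response to the parts we count
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def count_terms(entity):
    """Count labels, descriptions, aliases and sitelinks in one entity record."""
    return {
        "labels": len(entity.get("labels", {})),
        "descriptions": len(entity.get("descriptions", {})),
        # Aliases are assumed to be a list per language, so sum the list lengths.
        "aliases": sum(len(values) for values in entity.get("aliases", {}).values()),
        "sitelinks": len(entity.get("sitelinks", {})),
    }


# Hand-written sample record in the assumed layout; this is not real Wikidata data.
sample_entity = {
    "id": "Q1",
    "labels": {
        "en": {"language": "en", "value": "example"},
        "de": {"language": "de", "value": "Beispiel"},
    },
    "descriptions": {"en": {"language": "en", "value": "a sample item"}},
    "aliases": {
        "en": [
            {"language": "en", "value": "sample"},
            {"language": "en", "value": "specimen"},
        ]
    },
    "sitelinks": {"enwiki": {"site": "enwiki", "title": "Example"}},
}

if __name__ == "__main__":
    print(build_entity_query(["Q1", "Q2"]))
    print(count_terms(sample_entity))
```

The sketch deliberately makes no network request: fetching the URL and decoding the response is left to the caller, so the counting logic can be applied equally to records read from a database dump, provided they follow the same layout.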