A collection of database reports relevant to the upkeep of Wikidata, inspired by en:WP:DBR. Requests for new reports can be made on the talk page. The source code is posted on GitHub. Patches and contributions are welcome.
- Bots by edit count (last updated at 15:32, 2014-03-09)
- Some statistics about labels, descriptions and links
- No. of labels, descriptions and aliases for items per language
- Some statistics about sitelinks
- links per language (near real time)
- progress based on the two weekly dumps
- Items with links to userpages (last updated at 00:30, 15 March 2014 (UTC))
- Items missing a specific language's link
- Pages with local langlinks
- Remaining Interwikis - lists of articles that still have interwiki links (updated constantly, although with millions of articles some entries are likely to be wrong)
- progress of sitelink removal
- No-interwiki - articles with no interwikis in a given Wikipedia
- Lonely Interwiki links - Items which only have one Interwiki link, developed by User:Sanyi4.
- Most used properties (last updated at 00:55, 15 March 2014 (UTC))
- Most linked to items (last updated at 01:30, 15 March 2014 (UTC))
- Items that are missing a property that is logically inherited (last updated at 01:50, 15 March 2014 (UTC))
- Wikidata:Database reports/Constraint violations and constraint violation reports: non-unique values, invalid value formats, etc.
- List of all properties with language and usage statistics (from database dumps - may be outdated)
- Items using themselves as values for claims, qualifiers or sources (from database dumps - may be outdated)
- Conflicting or missing sex of items used as value for a sex specific property (from database dumps - may be outdated)
- Items which are a subclass, but not of same type as the superclass (from database dumps - may be outdated)
- Disambiguation items used in claims, qualifiers or sources (from database dumps - may be outdated)
- Burial locations (from database dumps - outdated)
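As an illustration of the "items using themselves as values" report listed above, the check could be sketched roughly as follows. This is not the code behind the actual report. It assumes the entity JSON layout served by the Wikidata API and JSON dumps (`claims`, `mainsnak`, `qualifiers`, `references`), and that each entity value carries an `id` field. The function names and the sample item are hypothetical.

```python
from typing import Any, Dict, Iterator


def iter_snaks(entity: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every snak of an entity: main values, qualifiers and sources."""
    for statements in entity.get("claims", {}).values():
        for statement in statements:
            yield statement["mainsnak"]
            for snaks in statement.get("qualifiers", {}).values():
                yield from snaks
            for reference in statement.get("references", []):
                for snaks in reference.get("snaks", {}).values():
                    yield from snaks


def uses_itself_as_value(entity: Dict[str, Any]) -> bool:
    """Return True if any snak of the entity points back at the entity."""
    for snak in iter_snaks(entity):
        if snak.get("snaktype") != "value":
            continue  # "novalue" and "somevalue" snaks carry no datavalue
        datavalue = snak.get("datavalue", {})
        if datavalue.get("type") != "wikibase-entityid":
            continue  # strings, times, quantities etc. cannot self-reference
        # Assumption: the value includes the full "id" (e.g. "Q1").
        if datavalue["value"].get("id") == entity["id"]:
            return True
    return False


# Made-up sample item whose only claim points at the item itself.
sample = {
    "id": "Q1",
    "claims": {
        "P31": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P31",
                    "datavalue": {
                        "type": "wikibase-entityid",
                        "value": {"entity-type": "item", "id": "Q1"},
                    },
                }
            }
        ]
    },
}

print(uses_itself_as_value(sample))
```

Walking qualifiers and references in the same generator keeps the check in line with the report's scope of "claims, qualifiers or sources".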
Labels And Descriptions
- Properties without label
- Persons without en label
- Terminator:Top 1000 labels without descriptions (English, German, French, Spanish, Italian)
- Language as term
- Different items with same commonscat link
- Different items with same number and pattern
- Different items with same name and same supercategories
- Disambiguation page conflict
- User:Akkakk/issues (link to wiki, but no label; brackets in label; description present, but no label; links to userpages; items using themselves as value; disambiguations/categories/templates with wrong/without description; disambiguations/persons without en-label; string-values with wrong format; multiple values on single value properties; etc.)
- Empty items: short pages, based on page length.
- Taxon check – Different items with same taxon names (may be duplicate items or wrong taxon names)
- Used taxons – List of all items used as taxon ranks (some may not be taxa)
- Conflicting taxa – Conflicts between use of taxon specific properties and taxon rank property
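A few of the label checks named in the User:Akkakk/issues entry above can be sketched in a handful of lines. This is an illustrative sketch only, not the code behind any of the listed reports. It assumes the entity JSON layout served by the Wikidata API (`labels`, `descriptions`, `sitelinks`), and it interprets "link to wiki, but no label" as "a sitelink to the wiki of a language exists, but the label in that language is missing". The function name is hypothetical.

```python
from typing import Any, Dict, List


def label_issues(entity: Dict[str, Any], lang: str = "en") -> List[str]:
    """Return the label problems found on an entity for one language."""
    issues = []
    label = entity.get("labels", {}).get(lang, {}).get("value", "")
    description = entity.get("descriptions", {}).get(lang, {}).get("value", "")

    if description and not label:
        issues.append("description present, but no label")
    # Assumption: the matching wiki is identified as "<lang>wiki".
    if (lang + "wiki") in entity.get("sitelinks", {}) and not label:
        issues.append("link to wiki, but no label")
    if "(" in label or ")" in label:
        issues.append("brackets in label")
    return issues


# Made-up sample: a label that still carries a disambiguating suffix.
sample = {"id": "Q1", "labels": {"en": {"language": "en", "value": "Foo (bar)"}}}
print(label_issues(sample))
```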
Reports can be based on several sources:
- Toolserver queries
- age depends on the replication lag of the Toolserver. This can be limited to minutes, but sometimes grows to days or weeks.
For information about replag, see https://wiki.toolserver.org/view/Replag . If Wikidata is on server 5, for current lag see http://toolserver.org/~bryan/stats/replag/?cluster=s5
- Full database dumps
- age depends on the availability of a new full dump. The latest dump can be less than a week old, but its age sometimes grows to weeks or months.
For the last available ones, see http://dumps.wikimedia.org/wikidatawiki/ . See also: http://dumps.wikimedia.org/wikidatawiki/latest/
- Incremental database dumps
- these are made available daily and include most changes since the last full dump.
For the last available ones, see http://dumps.wikimedia.org/other/incr/wikidatawiki/ . See also: status of today's incremental dump
- Wikidata API
- This provides live data. See http://www.wikidata.org/w/api.php
Depending on the source used to generate the reports, these can be more or less up to date.
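For reports that need live data, a request to the Wikidata API might look like the sketch below. It uses the `wbgetentities` action with the `ids`, `props` and `format` parameters. The helper names are hypothetical, the endpoint is the https form of the API URL given above, and the sample response is made up to mirror the documented response shape. The block itself makes no network call.

```python
from typing import Any, Dict, Iterable
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"


def build_entity_url(
    ids: Iterable[str],
    props: Iterable[str] = ("labels", "descriptions", "sitelinks"),
) -> str:
    """Build a wbgetentities request URL for the given item IDs."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),      # multiple values are pipe-separated
        "props": "|".join(props),
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def count_terms(response: Dict[str, Any]) -> Dict[str, Dict[str, int]]:
    """Count labels, descriptions and sitelinks per entity in a response."""
    counts = {}
    for entity_id, entity in response.get("entities", {}).items():
        counts[entity_id] = {
            "labels": len(entity.get("labels", {})),
            "descriptions": len(entity.get("descriptions", {})),
            "sitelinks": len(entity.get("sitelinks", {})),
        }
    return counts


# Made-up response fragment in the shape returned by wbgetentities.
sample_response = {
    "entities": {
        "Q42": {
            "id": "Q42",
            "labels": {
                "en": {"language": "en", "value": "Douglas Adams"},
                "de": {"language": "de", "value": "Douglas Adams"},
            },
            "descriptions": {"en": {"language": "en", "value": "English writer"}},
            "sitelinks": {"enwiki": {"site": "enwiki", "title": "Douglas Adams"}},
        }
    }
}

print(build_entity_url(["Q42", "Q64"]))
print(count_terms(sample_response))
```

To fetch real data, the URL can be passed to `urllib.request.urlopen` and the body decoded with `json.load`. A report built this way reflects the current state of the items rather than the state at the last dump.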
The planned query feature will allow reports to be queried and displayed directly on Wikidata.