Property talk:P10632


Documentation

OpenSanctions ID
identifier of persons, companies, luxury vessels of political, criminal, or economic interest at opensanctions.org
Applicable "stated in" value: OpenSanctions (Q110087116)
Data type: External identifier
Domain: human (Q5), organization (Q43229), ship (Q11446) or position (Q4164871)
Allowed values: [a-z]+(-[0-9a-zA-Z]+)+|Q[1-9]\d*|NK-[A-Za-z0-9]{22}
Examples:
Birgit Honé (Q20606538): eu-cor-2030915
André Viola (Q2848794): eu-cor-2032180
Semion Mogilevich (Q471862): Q471862
Irbis Air Company (Q3396960): ch-seco-4058
Alexander Nikolaevich Bushuyev (Q112194249): ua-nabc-person-14653-bushuyev-alexander-nikolaevich, ua-nsdc-person-176-2018-213, NK-33dvfMLd6hCdByWC8Gjntp, Q112194249, ru-inn-645205630830
Formatter URL: https://www.opensanctions.org/entities/$1/
Lists
Proposal discussion: Proposal discussion
Current uses
Total: 240,037
Main statement: 205,122 out of 204,469 (100% complete); 85.5% of uses
Qualifier: 9; <0.1% of uses
Reference: 34,906; 14.5% of uses
Single value: this property generally contains a single value. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P10632#Single value, SPARQL
Distinct values: this property likely contains a value that is different from all other items. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P10632#Unique value, SPARQL (every item), SPARQL (by value)
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P10632#Entity types
Scope is as main value (Q54828448), as reference (Q54828450): the property must be used only in the specified way (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P10632#Scope, SPARQL
Type “human (Q5), organization (Q43229), ship (Q11446), position (Q4164871)”: item must contain property “instance of (P31)” with classes “human (Q5), organization (Q43229), ship (Q11446), position (Q4164871)” or their subclasses (defined using subclass of (P279)). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P10632#Type Q5, Q43229, Q11446, Q4164871, SPARQL
Format “[a-z]+(-[0-9a-zA-Z]+)+|Q[1-9]\d*|NK-[A-Za-z0-9]{22}”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P10632#Format, SPARQL
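
A minimal sketch of how the format constraint and formatter URL documented above could be applied to a value. The function names `is_valid_id` and `entity_url` are illustrative only and not part of any Wikidata or OpenSanctions API; Python's `re.fullmatch` is assumed here as a stand-in for the whole-value PCRE match.

```python
import re

# Pattern copied from the "Allowed values" / format constraint above.
ALLOWED = re.compile(r"[a-z]+(-[0-9a-zA-Z]+)+|Q[1-9]\d*|NK-[A-Za-z0-9]{22}")

# Formatter URL from the property documentation ($1 is the identifier).
FORMATTER_URL = "https://www.opensanctions.org/entities/$1/"


def is_valid_id(value: str) -> bool:
    """Return True if the whole value matches the format constraint."""
    return ALLOWED.fullmatch(value) is not None


def entity_url(value: str) -> str:
    """Substitute the identifier into the formatter URL (hypothetical helper)."""
    if not is_valid_id(value):
        raise ValueError(f"not a valid OpenSanctions ID: {value!r}")
    return FORMATTER_URL.replace("$1", value)


if __name__ == "__main__":
    # Example values taken from the documentation box above.
    for example in ("eu-cor-2030915", "Q471862", "NK-33dvfMLd6hCdByWC8Gjntp"):
        print(example, "->", entity_url(example))
```

The trailing slash is kept in the URL template, matching the formatter URL as documented.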

formatter URL

@RShigapov: Why did you remove the trailing slash from the "formatter URL"? The URL without it redirects to the one with a trailing slash, which is why I put it in the formatter: to avoid an extra redirect. Cc @Dhx1: --Vladimir Alexiev (talk) 12:53, 26 April 2022 (UTC)

Only for aesthetic reasons. I am not against keeping the trailing slash. RShigapov (talk) 11:50, 2 May 2022 (UTC)

format regex

@pamputt: I changed the format regex to

[a-z-]+-[\d-]+|Q[1-9]\d*|NK-\w+

to fit more cases, as per https://www.wikidata.org/wiki/Wikidata:Database_reports/Constraint_violations/P10632#%22Format%22_violations, e.g.:

Jaroslav Doubrava (Q12023774): ua-nsdc-person-176-2018-1763
Jaroslav Holík (Q15249906): ua-nsdc-person-133-2017-45
Ladislav Zemánek (Q19602329): ua-nsdc-person-176-2018-1192
Škoda JS (Q46966976): NK-YHaUtCn6nApumiJPhQNu32
SUEX OTC (Q111605855): NK-RDMqFGqcEPf2DpSoZjisjj
Guga Arm (Q111605895): NK-WsBynC7ryHSLnT59gpyZZy
František Hudec (Q111605917): interpol-red-1994-45042
František Procházka (Q111606020): interpol-red-2007-54803
Lacno (Q111606916): NK-BAVEkjjnFuZCorkKf79bkE

Could you change "URL match pattern" accordingly?

--Vladimir Alexiev (talk) 12:53, 26 April 2022 (UTC)

Hi Vladimir Alexiev, I have slightly modified your regex to be a bit more restrictive. Pamputt (talk) 13:56, 26 April 2022 (UTC)
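
The difference between the pattern proposed in this thread and a stricter one can be checked against identifiers quoted on this page. The sketch below assumes that the stricter revision is the pattern currently shown in the format constraint (the thread does not quote it), and uses Python's `re.fullmatch` as a stand-in for the whole-value PCRE match. The value `NK-abc` is made up for illustration.

```python
import re

# Pattern proposed in this thread.
PROPOSED = re.compile(r"[a-z-]+-[\d-]+|Q[1-9]\d*|NK-\w+")
# Pattern currently shown in the format constraint (ASSUMED to be the
# "more restrictive" revision mentioned in the reply).
CURRENT = re.compile(r"[a-z]+(-[0-9a-zA-Z]+)+|Q[1-9]\d*|NK-[A-Za-z0-9]{22}")

SAMPLES = [
    "ua-nsdc-person-176-2018-1763",
    "interpol-red-1994-45042",
    "NK-YHaUtCn6nApumiJPhQNu32",
    "ua-nabc-person-14653-bushuyev-alexander-nikolaevich",
    "NK-abc",  # made-up value, shorter than the 22-character NK ids
]


def check(value: str) -> tuple[bool, bool]:
    """Return (matches PROPOSED, matches CURRENT) for a whole-value match."""
    return (
        PROPOSED.fullmatch(value) is not None,
        CURRENT.fullmatch(value) is not None,
    )


if __name__ == "__main__":
    for value in SAMPLES:
        proposed_ok, current_ok = check(value)
        print(f"{value}: proposed={proposed_ok} current={current_ok}")
```

Under these assumptions, the proposed pattern requires a numeric tail and so rejects ids that end in a name slug, while `NK-\w+` accepts NK ids of any length; the stricter pattern fixes the NK suffix at 22 alphanumeric characters.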

Importing the rest of OpenSanctions

MD Imtiaz Ahammad Kopiersperre Jklamo ArthurPSmith S.K. Givegivetake fnielsen rjlabs ChristianKl Vladimir Alexiev Parikan User:Cardinha00 MB-one User:Simonmarch User:Jneubert Mathieudu68 User:Kippelboy User:Datawiki30 User:PKM User:RollTide882071 Andber08 Sidpark SilentSpike Susanna Ånäs (Susannaanas) User:Johanricher User:Celead User:Finnusertop cdo256 Mathieu Kappler RShigapov User:So9q User:1-Byte pmt Rtnf econterms Dollarsign8 User:Izolight maiki c960657 User:Automotom applsdev Bubalina Fordaemdur DaxServer

Notified participants of WikiProject Companies

Mcnabber091 (talk) 00:29, 18 June 2014 (UTC) Tobias1984 (talk) 10:23, 8 November 2015 (UTC) Note 1 PAC2 (talk) 09:29, 26 September 2016 (UTC) Rjlabs (talk) 20:30, 14 March 2017 (UTC) Datawiki30 (talk) 11:55, 2 September 2018 (UTC) Sidpark (talk) 09:31, 2 December 2018 (UTC) Mathieu Kappler (talk) 11:44, 6 September 2021 (UTC)

Notified participants of WikiProject Economics

WikiProject every politician has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

@RShigapov, So9q, OpenSanctions, Nikola Tulechki, Borko1990:

Currently we have 112,361 ids that are the same as the WD id and 11 that differ from it:

NK-YHaUtCn6nApumiJPhQNu32
NK-RDMqFGqcEPf2DpSoZjisjj
NK-WsBynC7ryHSLnT59gpyZZy
interpol-red-1994-45042
interpol-red-2007-54803
ua-nsdc-person-176-2018-1763
ua-nsdc-person-176-2018-1192
ch-seco-4058
ua-nsdc-person-133-2017-45
NK-BAVEkjjnFuZCorkKf79bkE
NK-CSTHWr7TNQckHEURKfFGMH
rupep-person-14804
rupep-person-16821

In the source, there are:

  • 112,504 ids that are the same as the WD id, so 143 were lost due to QS errors. IMHO that's too few to worry about.
  • 92,097 ids that differ from WD.

Are there volunteers to help import/match these entities to WD? I'm not sure Mix'n'match (MnM) is the best way, since we need richer data.

(From https://github.com/opensanctions/opensanctions/issues/198)

Breakdown per type:

csvtk freq -f schema openSanctions-no-WD.csv
Airplane,269
Vessel,415
Company,2455
CryptoWallet,7457
Organization,4335
LegalEntity,4581
Person,57754

Russian entities (total and per type):

csvtk grep -f countries -rp '.*ru.*' openSanctions-no-WD.csv|csvtk nrow
9448

csvtk grep -f countries -rp '.*ru.*' openSanctions-no-WD.csv|csvtk freq -f schema
Airplane,1
Vessel,40
Company,466
LegalEntity,438
Organization,580
Person,7923
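
For readers without csvtk, the same per-schema counts could be reproduced with the Python standard library. This is only a sketch: the column names `schema` and `countries` come from the commands above, but the sample rows are made up and the real `openSanctions-no-WD.csv` is not read here. The `;` separator inside `countries` is an assumption based on how the `identifiers` column is shown on this page.

```python
import csv
import io
import re
from collections import Counter

# Made-up sample rows; only the column names `schema` and `countries`
# are taken from the csvtk commands above.
SAMPLE_CSV = """id,schema,name,countries
x-1,Person,Example Person A,ru
x-2,Person,Example Person B,ua
x-3,Vessel,Example Vessel,ru;pa
x-4,Company,Example Company,cy
"""


def schema_counts(rows, country_pattern=None):
    """Count rows per schema, optionally keeping only rows whose
    `countries` field matches the regex (like `csvtk grep -f countries -rp`)."""
    regex = re.compile(country_pattern) if country_pattern else None
    counts = Counter()
    for row in rows:
        if regex is None or regex.search(row["countries"]):
            counts[row["schema"]] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
all_counts = schema_counts(rows)        # like `csvtk freq -f schema`
ru_counts = schema_counts(rows, r"ru")  # '.*ru.*' is equivalent to searching for 'ru'

if __name__ == "__main__":
    print(dict(all_counts))
    print(dict(ru_counts), "total:", sum(ru_counts.values()))
```

Because the pattern is an unanchored substring match, it keeps any row whose `countries` value contains "ru" anywhere; splitting the field and comparing whole codes would be stricter.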

E.g., if we look for Airplanes:

csvtk grep -f schema -p Airplane openSanctions-no-WD.csv|csvtk cut -f id,name,aliases,identifiers |head -20
id,name,aliases,identifiers
NK-27foGhde2c676yHk4Szsir,EX-301,,524;EX-301
NK-2Dm77ySwRh5BkuNFcKWd47,EP-MOF,,149;3149;EP-MOF
NK-2eLCszh7MxME37JiDXyhEu,MSN 550,,2-WGLP;550;MSN 550
NK-2oY2W562WbtAajN2bZsGrB,YV2964,,19000646;YV2964
NK-2q4YRztADsVsmYPEWLg4KU,EP-IBK,,671;EP-IBK
NK-2RcsM4umoACDaCAdAnWsg8,YV3033,,208B5140;YV3033
NK-2SkGGxw32P4qw6MnyXveKd,EP-ICE,,139;EP-ICE
NK-2XP4SpC6mZrEeRvNhaefCw,EP-CFM,,11394;EP-CFM
NK-3RboG5aQQdFse7aGYR6yvQ,YV3034,,208B5142;YV3034
NK-3s9cpVTo8iWdZ5v5nJEZZd,EP-ITE,,1424;EP-ITE
NK-3sqVBEnk3WWSZykfuJkhod,N488RC,,228;N488RC
NK-3sYzggNQbXwW7neddLEhaX,EP-MNV,,567;EP-MNV
NK-3TXRoCSy64gFjoXA9ztzgX,EP-MHF,,55;EP-MHF
NK-4ChbzQJpWyoyuSe7qkSmSk,YV2726,,136;YV2726
NK-4E2sKLvJYCFiDndeux4X3p,EP-MHA,,160;EP-MHA
NK-4eH4pto6F4XbDdVaUv9GtH,EP-IED,,345;EP-IED
NK-4hdbWLAqSnPHDWYzVvbFww,EP-ICF,,173;EP-ICF
NK-4jxr45VrPqoqRzyZbBStJt,UR-CKX,,131;3131;UR-CKX
NK-4mThxv7oHM2bJopxXhEWpb,YK-AGD,,1670;22360;YK-AGD

I checked the first 10 or so and they are not in WD, so IMHO they are worth creating. However, I should work not with the CSV but with one of the 2 JSONs. E.g. this plane https://www.opensanctions.org/entities/NK-27foGhde2c676yHk4Szsir/ is owned by MAHAN AIR, which is in WD as https://www.wikidata.org/wiki/Q1149762 (OS does not yet know that WD id, but MnM can find that coreference easily). We want the airline-aircraft link in WD, which means we should import the remaining entities in tiers: first persons and orgs, then assets.

E.g., Russian marine vessels (tankers, yachts, etc.):

csvtk grep -f countries -rp '.*ru.*' openSanctions-no-WD.csv|csvtk grep -f schema -p Vessel

E.g.:

ofac-36401,Vessel,Lady Sevda,LADY SEVDA,,ru,,273342180;IMO 9683738;UBWL7,Program - SDN List - Block - Executive Order 14024;RUSSIA-EO14024,,,US OFAC Specially Designated Nationals (SDN) List;US Trade Consolidated Screening List (CSL),2022-04-06 18:24:16,2022-04-14 06:17:01

That tanker is not in WD but can be found via AIS; it is currently crossing from the Black Sea into the Sea of Azov: https://www.vesselfinder.com/?imo=9683738, https://www.marinetraffic.com/en/ais/details/ships/shipid:1130342/mmsi:273342180/imo:9683738/vessel:LADY_SEVDA
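
When matching vessels like this one, the IMO number in the `identifiers` column is a convenient key. The sketch below pulls it out of the semicolon-separated field and verifies the standard IMO check digit (the first six digits are multiplied by 7 down to 2, and the last digit of the sum must equal the seventh digit). The function names are hypothetical, and the `IMO <7 digits>` layout is assumed from the single row quoted above.

```python
import re

# Layout assumed from the quoted row: "273342180;IMO 9683738;UBWL7".
IMO_RE = re.compile(r"IMO (\d{7})")


def extract_imo(identifiers: str):
    """Return the 7-digit IMO number from a ';'-separated field, or None."""
    for part in identifiers.split(";"):
        match = IMO_RE.fullmatch(part.strip())
        if match:
            return match.group(1)
    return None


def imo_check_digit_ok(imo: str) -> bool:
    """Validate the IMO check digit: weights 7..2 on the first six digits."""
    if len(imo) != 7 or not (imo.isascii() and imo.isdigit()):
        return False
    total = sum(int(d) * w for d, w in zip(imo[:6], range(7, 1, -1)))
    return total % 10 == int(imo[6])


if __name__ == "__main__":
    imo = extract_imo("273342180;IMO 9683738;UBWL7")
    print(imo, imo_check_digit_ok(imo))
```

A check like this could filter out mistyped IMO numbers before they are used to look up or create vessel items.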

--Vladimir Alexiev (talk) 13:19, 26 April 2022 (UTC)

IMHO MnM is a good idea. If the catalogue is created (or scraped) properly, it can be easily updated, you can easily download only the unmatched entries (to create new items), etc. --Jklamo (talk) 11:26, 29 April 2022 (UTC)
@Jklamo: MnM is good for the following kinds (though it cannot create relations between them):
Company,2455
Organization,4335
LegalEntity,4581
Person,57754

But for the following kinds, I think we should create them in bulk, in a second step, together with their relations (i.e. using a bot, not MnM). Otherwise we'll get items like "EX-301, isa Airplane; OpenSanctions id" that are rather poor and won't be appreciated by the community.

Airplane,269
Vessel,415

Cheers --Vladimir Alexiev (talk) 14:53, 1 May 2022 (UTC)
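
The two-step plan in this reply could be sketched as a partition of the unmatched entities by schema. The tier names and the `import_tier` function are hypothetical; the schema lists come from this reply and the counts from the breakdown earlier on this page. CryptoWallet is not assigned to either step in the thread, so the sketch leaves it unassigned.

```python
from collections import defaultdict

# Schemas proposed for Mix'n'match in this reply.
MNM_SCHEMAS = {"Company", "Organization", "LegalEntity", "Person"}
# Schemas proposed for bulk creation by a bot, together with their relations.
BOT_SCHEMAS = {"Airplane", "Vessel"}

# Counts from the per-type breakdown earlier on this page.
COUNTS = {
    "Airplane": 269,
    "Vessel": 415,
    "Company": 2455,
    "CryptoWallet": 7457,
    "Organization": 4335,
    "LegalEntity": 4581,
    "Person": 57754,
}


def import_tier(schema: str) -> str:
    """Return the (hypothetical) import tier name for a schema."""
    if schema in MNM_SCHEMAS:
        return "mnm"
    if schema in BOT_SCHEMAS:
        return "bot"
    return "unassigned"  # e.g. CryptoWallet, not discussed in the thread


per_tier = defaultdict(int)
for schema, count in COUNTS.items():
    per_tier[import_tier(schema)] += count

if __name__ == "__main__":
    for tier, total in sorted(per_tier.items()):
        print(tier, total)
```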

The lists are constantly updated, so we need a more permanent solution than a one-time import/linking. MnM is convenient for monitoring interconnections. For creating new items, it can serve as the source of unmatched entries, which can then be populated using OpenRefine and the OpenSanctions datasets. --Jklamo (talk) 11:52, 8 May 2022 (UTC)