Wikidata talk:WikiProject Companies/Archive 2
This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.
Small businesses and notability
The Wikidata:Notability guidelines are significantly looser than those of the English Wikipedia. The requirement is that an entry refers to an instance of a clearly identifiable conceptual or material entity. The entity must be notable, in the sense that it can be described using serious and publicly available references. For "small businesses", would an entry in a chamber of commerce directory (for example [1]) which includes a business website and hours of operation be sufficient to meet this threshold?
Slightly differently, is there a rule regarding individual Walmart/McDonalds franchise locations? Right now these appear to be excluded; is this by policy or simply because they haven't been added yet?
I posed this at Wikidata:Project chat, but this may be a better place for the discussion. Power~enwiki (talk) 20:45, 28 April 2018 (UTC)
- Thanks for posting here. We have discussed notability of small businesses and franchises here and here on this page; the conclusion generally was that Wikidata just doesn't have the human capacity to deal with all such entities, and so "small businesses" or facilities of larger companies (with fewer than 50 employees) would normally be excluded unless they met other notability criteria (in particular having a Wikipedia page of their own). However, if you think the human capacity issue is maybe not such a problem, we can certainly discuss this further! ArthurPSmith (talk) 23:08, 28 April 2018 (UTC)
- If I have any ideas as to how to do mass creations without a ton of labor, I'll come back here. Government records of business listings are too indiscriminate, and Chamber of Commerce / YellowPages listings aren't licensed correctly and also aren't formatted for automatic processing. Power~enwiki (talk) 21:01, 9 May 2018 (UTC)
- It's worth noting that if you want to add many records for businesses automatically, that should go through a bot request, where the discussion would be had whether or not the data should be added. ChristianKl ❪✉❫ 19:08, 22 July 2018 (UTC)
Business enterprise as a collection of companies?
I'm not sure exactly how a company like Marmalade Insurance (Q8058284) should be represented in Wikidata. It's a marginally notable enterprise that has an enwiki article at Young Marmalade. Like many enterprises, it has multiple companies that change over time. I've identified four (with British Companies House IDs):
- 04627884 Young Marmalade (2003)
- 06779950 Provisional Marmalade Limited (2008)
- 07639886 Intelligent Marmalade Ltd (2011)
- 08676228 Marmalade Ltd (2013)
I'd be reluctant to create items for all of these in Wikidata, since even if they are all considered notable, and it would be feasible in this case, it doesn't seem very useful. It's easier to just treat them as a single business enterprise. E.g., it would be unclear which of the companies should be linked with the enwiki article: not 08676228, which seems to now be the parent company, because the inception date is wrong, and not 04627884, which was apparently the original company but doesn't have the main insurance entity.
Larger enterprises may have dozens or hundreds of subsidiary companies, constantly changing, and keeping track of them could be a full-time job for somebody.
Would it be feasible to make a way of listing these subsidiary companies on a single Wikidata item for the enterprise? Ghouston (talk) 01:40, 15 June 2018 (UTC)
- Currently, business (Q4830453) is a subclass of company (Q783794), which seems like strange logic. Businesses can be structured in various ways, including various types of company depending on jurisdiction. There are also partnerships and sole traders, which are businesses but not companies. Ghouston (talk) 07:43, 15 June 2018 (UTC)
- So we have sole proprietorship (Q2912172) as a subclass of business (Q4830453), which is a good reason not to make business (Q4830453) a subclass of company (Q783794). Ghouston (talk) 07:45, 15 June 2018 (UTC)
- corporate group (Q197952) is what I'm looking for, as an item for a group of related companies. Ghouston (talk) 07:47, 15 June 2018 (UTC)
- @Ghouston: I guess the question for separate items vs one item is mainly, can statements on one item represent all of them adequately (as a group in some way) - for example, are they in the same country, have the same headquarters location, produce the same products, etc. If it's essentially one company that's just changed its official name a few times, I would definitely say one item with the official name (and any associated ID's) qualified by start and end dates etc. If it's subdivided into separate business units that do different things, I would say separate items, but maybe not all of the sub-units need an item of their own if they are small or otherwise not really notable. ArthurPSmith (talk) 14:44, 15 June 2018 (UTC)
- Yes, it seems to me that one item would be sufficient in this particular example, since the enterprise only seems to be known for one thing in one place. The individual companies are a not very interesting detail, but it would still be nice to name them with their Companies House ids, now that I've discovered them. Ghouston (talk) 03:28, 16 June 2018 (UTC)
- Perhaps creating items for the companies is the only way of retaining all the information, including full names, name changes, inception dates, and Companies House ids. The group page would also be needed for linking with enwiki, presumably the companies would be part of (P361) the group. Ghouston (talk) 05:39, 17 June 2018 (UTC)
- Marmalade Insurance (Q8058284) is now set up as a corporate group (vaguely defined) with 4 companies as parts. Ghouston (talk) 09:24, 21 June 2018 (UTC)
Adding historical information of companies
I wonder if I can add historical information about a company simply because it is currently possible in Wikidata. From a historical perspective it is really interesting to collect information about companies in the context of time: events like changes of the official name or a move of the headquarters location. I added an example of this at Level-5 (Q674686), with two statements about headquarters location (P159). It works fine for me but I am not sure if that's in the spirit of Wikidata. Do you have any thoughts about it? Diggr (talk) 14:26, 10 July 2018 (UTC)
- It is totally OK to store old headquarters and names of a company at Wikidata. Just do not forget to mark the current data with the proper rank.--Jklamo (talk) 17:19, 10 July 2018 (UTC)
- Okay, I will do that. Thank you! Diggr (talk) 14:57, 11 July 2018 (UTC)
Bot for Legal Entity Identifier (LEI)
The GLEIF – Global Legal Entity Identifier Foundation - publishes a CSV file with BIC-LEI mappings every month. [2]
What do you think of the idea of using a bot to read the CSV and write the LEI to all available banks that have a BIC identifier? Datawiki30 (talk) 09:49, 21 July 2018 (UTC)
- @Datawiki30: that's a brilliant idea! I have recently been working with that dataset, adding LEI IDs and French company numbers for a few French financial institutions. https://tools.wmflabs.org/editgroups/b/OR/11cc9a3/ . I used OpenRefine for that. − Pintoch (talk) 17:14, 21 July 2018 (UTC)
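The bot idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing tool: the column names `LEI` and `BIC`, the helper name `lei_statements`, and the sample row are hypothetical, and the property number P1278 for the Legal Entity Identifier should be verified before use. The lookup from BIC to item (`qid_by_bic`) is assumed to come from a separate query of existing Wikidata items.

```python
import csv
import io

LEI_PROPERTY = "P1278"  # assumed property for the Legal Entity Identifier; verify first


def lei_statements(mapping_csv, qid_by_bic):
    """Yield QuickStatements V1 lines that add an LEI to items matched by BIC.

    mapping_csv: open text file with (assumed) columns "LEI" and "BIC".
    qid_by_bic:  dict mapping a BIC to the Wikidata item that already carries it.
    """
    for row in csv.DictReader(mapping_csv):
        bic = row["BIC"].strip().upper()
        lei = row["LEI"].strip().upper()
        qid = qid_by_bic.get(bic)
        # Skip banks without an item and malformed LEIs (an LEI has 20 characters).
        if qid is None or len(lei) != 20 or not lei.isalnum():
            continue
        yield f'{qid}\t{LEI_PROPERTY}\t"{lei}"'


# Hypothetical sample row; Q4115189 is the Wikidata sandbox item.
sample = io.StringIO("LEI,BIC\n529900EXAMPLE00000XX,EXAMDEFFXXX\n")
for line in lei_statements(sample, {"EXAMDEFFXXX": "Q4115189"}):
    print(line)
```

The output is meant to be reviewed by hand before being pasted into QuickStatements; a real run would presumably go through a bot request.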
Automating tier 1 capital ratio extraction from published pdf
Notified participants of WikiProject Companies
The capital ratio (P2663) is a very important regulatory indicator for banks. Usually every EU bank publishes this ratio on its own homepage. The problem is that the data is in a PDF along with other data. I would like to try to automatically download and read the data in order to import it into Wikidata.
Once I have downloaded the PDF files, I would have more than 1500 files for Germany alone. Does anyone have experience with searching and extracting information from multiple PDF files? I found two interesting things which could help:
- pdfgrep [3] - this command-line utility can read multiple files and write the results to a text file. This is fine, but after that I would need a script to structure/cut out only the data I need.
- Tabula [4] - this should also work very well, but it would have the same problem as pdfgrep.
You can see one example file here [5]. The advantage is that these PDFs are highly standardised. Datawiki30 (talk) 21:27, 30 July 2018 (UTC)
- I have written a Python script to help find and extract the data. The code has the following main tasks:
- 1. Search for the selected company and the document with the data using the Qwant.com API
- 2. Get the document
- 3. Search the document for keywords using pdfgrep and regular expressions and extract the value
- 4. Write the values in QuickStatements format
- 5. Optionally save the page to the Web Archive
- Are you interested in the code? Maybe we can use this for other purposes... --Datawiki30 (talk) 20:03, 10 September 2018 (UTC)
- Have you tested this yet? Usually for bulk imports of this sort you should use a bot account for which there's an approval process - at the least it would be good to have somebody review what you've done! ArthurPSmith (talk) 20:06, 10 September 2018 (UTC)
- Yes. For the first run I did this for 10 banks and checked the values. After that I did it for about 200 banks. Of course I cross-checked the values before importing them to WD. But indeed there was no second person to check the values again.
- I would like to use the same method to update the values for another 600-700 German cooperative banks. I've tested this for about 50 banks - see here: WD-Query
- What should I do next? Should I use a separate bot account? --Datawiki30 (talk) 20:30, 10 September 2018 (UTC)
- Thanks for the query example - however, I think it would be better if you could list some specific changes that followed your current process. When I look at for example Volksbank Eisenberg (Q2531693) I see some changes that I wonder where they came from - for example modifying the point in time value after the fact. Also I looked at the reference and I don't see where the value "20.6" came from - are you doing some computation beyond just extracting the values from the PDF? ArthurPSmith (talk) 19:20, 11 September 2018 (UTC)
- Thank you for the review! I remember that there were about 20 edits where I had to correct the point in time property after the batch. You have to search for "Harte Kernkapitalquote". To find the value you have to search for "20,6". (Sometimes there are scanned PDFs, or PDFs in some format, that are not searchable.) Here you can find another QuickStatements batch for German savings banks: https://tools.wmflabs.org/quickstatements/#/batch/3777. --Datawiki30 (talk) 21:06, 11 September 2018 (UTC)
- Ok - sorry to be slow getting back to you - I've taken a look and you seem to be approaching this in a reasonable manner, the added data looks good. As far as I'm concerned you can go ahead! ArthurPSmith (talk) 20:03, 13 September 2018 (UTC)
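Steps 3 and 4 of the script described in this thread could look roughly like the sketch below. It is not Datawiki30's actual code: the function names, the regular expression, the sample text, the date and the source URL are all illustrative assumptions. It operates on text that has already been extracted from the PDF (for example with pdfgrep), converts the German decimal comma ("20,6") to a number, and writes a QuickStatements V1 line with point in time (P585) as qualifier and reference URL (S854) as source. No unit is attached to the value here.

```python
import re

# Label quoted in the thread, followed within a few non-digit characters
# by a German-formatted number such as "20,6".
RATIO_PATTERN = re.compile(r"Harte\s+Kernkapitalquote\D{0,40}?(\d{1,3}(?:,\d+)?)")


def extract_ratio(text):
    """Return the ratio as a float, or None if the label is not found."""
    match = RATIO_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", "."))


def to_quickstatement(qid, ratio, date_iso, source_url):
    """Format one QuickStatements V1 line: value, point in time, reference URL."""
    return (
        f"{qid}\tP2663\t{ratio}"
        f"\tP585\t+{date_iso}T00:00:00Z/11"
        f'\tS854\t"{source_url}"'
    )


# Hypothetical text line and report URL, for illustration only.
ratio = extract_ratio("Harte Kernkapitalquote in % 20,6")
if ratio is not None:
    print(to_quickstatement("Q2531693", ratio, "2017-12-31", "https://example.org/report.pdf"))
```

A pattern this simple picks up the first digits after the label, so a label variant that itself contains a digit, or a scanned PDF without a text layer, would still need manual checking.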
type of business entity (Q1269299) and list of business entities (Q53400657) HELP!
If you ever hated the term legal person welcome to the mosh pit!
see this for prior discussion, moving here because it rapidly rolled off "Project Chat" https://www.wikidata.org/w/index.php?diff=680757693&oldid=680755302&title=Wikidata:Project_chat#type_of_business_entity_(Q1269299)_and_list_of_business_entities_(Q53400657)
type of business entity https://www.wikidata.org/wiki/Q1269299
list of business entities (Q53400657) (has now been updated to "list of legal entity types by country") https://www.wikidata.org/wiki/Q53400657
... there a real difference between "type of business entity" and "business entity". Q1269299 uses the first as label and the second as alias; maybe we can switch them to make it clearer (or even change to "legal form", which is another alias but also the label of the corresponding property legal form (P1454)). I'm not a specialist in this subject either; can someone else pitch in?
- What's been published in the English Wikipedia does not fit well into the Wikidata structure. I have moved list of business entities (Q53400657) in Wikipedia to list of legal entity types by country. That article might be broken down into separate articles - "list of legal entity types in the United States", Japan, U.K. ... - and that might then play better in Wikidata. The Wikipedia article Legal person roughly translates to Legal entity and roughly translates to legal entity type (either list or by geolocation). (Note that the article Legal person has been moved to Legal entity and been reverted; see talk page.) Is your head spinning yet? Help stop the spin. Thanks! Rjlabs (talk) 19:34, 6 August 2018 (UTC)
- Indeed, there seems to be another structure in the English Wikipedia to describe the legal entity form. The Global LEI Foundation also publishes the entity legal form (see this - GLEIF Legal Entity Form Code ). They have also published their list of legal forms, derived from ISO 20275 (see PDF-legal-Form ). When I scrolled to the USA I found that there are different legal forms for different states... I don't think that we can solve this situation. Maybe the type of business entity could be the parent entity of all the other legal entity forms according to the GLEIF list. --Datawiki30 (talk) 19:57, 10 August 2018 (UTC)
- It's normal that each US state has a separate ELF sublist, because each is a separate jurisdiction. I had high expectations about the ELF list, but there are misspellings and some unexpected values (at least comparing to the BG Trade Register), and commonly-established abbreviations are not used in the ELF --Vladimir Alexiev (talk) 09:05, 23 August 2018 (UTC)
- @ Vladimir: Are you talking about "Командиртно дружество с акции" :-)? Why don't you write them an e-mail? They can correct the misspellings you found. You can also propose to them the abbreviations you have in mind. --Datawiki30 (talk) 21:11, 10 September 2018 (UTC) PS: Other things may not be so easy to correct. I found some incomplete legal addresses in GLEIF for companies with two or more legal addresses. I challenged the LEI and the issuer pointed to the German register. When I checked the register I found that the address there is also incomplete - there were two towns but only one ZIP code. So I called the local court to ask them about that, but they said the company is responsible for the address data. So I called the company - they said that they have never had problems with their partners about the address... :-/ All in all this was a very interesting experience ;-)
Bot for nominal GDP-values - Data from the World Bank
Notified participants of WikiProject Companies
I tried to start a discussion about this on the Economy WD project, but there is no activity there. I would like to request a bot for reading GDP values from the World Bank DB and writing them to Wikidata. The prototype is already done and it worked well on test.wikidata.org. The scripts can easily be adapted to write other statistics like inflation, GDP per capita etc. What do you think of the idea? --Datawiki30 (talk) 20:10, 10 September 2018 (UTC)
- Good idea.--Jklamo (talk) 13:59, 25 September 2018 (UTC)
- Thank you Jklamo. I would appreciate your support for the request for the bot-permission here: Wikidata:Requests_for_permissions/Bot/WDBot. Other members of the Companies-Project are also welcome to support the bot request :-). Cheers! --WDBot (talk) 09:26, 30 September 2018 (UTC)
- The bot has been approved and is running. Cheers! Datawiki30 (talk) 20:53, 20 October 2018 (UTC)
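For readers curious what such an import involves, here is a minimal sketch, which is not the approved bot's code. It assumes the World Bank API v2 JSON layout (a two-element list of paging metadata followed by yearly observations with `date` and `value` fields), nominal GDP as P2131, US dollar as unit Q4917 and point in time as P585; all of these should be checked before use. The function name and the sample payload are hypothetical.

```python
def gdp_statements(qid, payload):
    """Turn a World Bank API v2 style JSON payload into QuickStatements V1 lines."""
    # Assumed layout: [paging metadata, list of yearly observations].
    _metadata, observations = payload
    lines = []
    for obs in observations or []:
        if obs.get("value") is None:  # years without data are reported as null
            continue
        year = int(obs["date"])
        amount = round(obs["value"])  # whole US dollars
        # /9 marks year precision in the QuickStatements time format.
        lines.append(f"{qid}\tP2131\t{amount}U4917\tP585\t+{year}-00-00T00:00:00Z/9")
    return lines


# Hypothetical payload with made-up figures for Germany (Q183).
sample_payload = [
    {"page": 1, "pages": 1},
    [
        {"date": "2017", "value": 3500000000000},
        {"date": "2018", "value": None},
    ],
]
for line in gdp_statements("Q183", sample_payload):
    print(line)
```

Keeping the parsing separate from the download makes it easy to test against a saved response before anything is written to Wikidata.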
Company, business, etc.
There is a mess with the concepts about companies. I'm not talking about different type of companies, but about the basic definitions, on which we now have the following 5 items:
- business activity (Q19862406)
- business (Q4830453)
- company (Q52834234)
- enterprise (Q6881511)
- company (Q783794)
I think we should have instead only 3 distinct items that refer to 3 distinct concepts:
- a legal entity that carries on a business activity (in English "company", in Italian "società", in French "compagnie/société");
- the business activity carried out by the entrepreneur (in English "business", in Italian "impresa", in French "entreprise");
- the complex of assets (goods and human capital) organized by the entrepreneur to carry on the business activity (in Italian "azienda").
How can we match these concepts with the items above? I opened an Interwiki conflict some months ago, but I didn't receive any answer. I ping @Alan ffm: because I've seen that he made some edits on this theme. --BohemianRhapsody (talk) 22:16, 26 September 2018 (UTC)
- Note business (Q4830453) used to be labeled "business enterprise" in English, and was sort of the catch-all for the organization/corporation etc. - there are thousands (hundreds of thousands?) of items that use this item as their value for instance of (P31). I guess this corresponds to your first concept? business activity (Q19862406) I believe is your second concept - an activity rather than an organization. ArthurPSmith (talk) 23:27, 26 September 2018 (UTC)
"Replaced by" (P1366) or "dissolved, abolished or demolished" (P576) for company mergers
I have asked myself which property is best when two (or more) companies merge. I would suggest the following:
a) The old companies and the new merged company are available on Wikidata
- "Replaced by" (P1366) for the old companies pointing at the new company
- "replaces" (P1365) for the new company pointing at the old companies
- in this case the property "dissolved, abolished or demolished" (P576) should not be used (redundant)
b) The new company is not available on Wikidata -> "dissolved, abolished or demolished" (P576) for the old companies
What do you think about this? --Datawiki30 (talk) 18:10, 29 September 2018 (UTC)
- Depends on the type of merger, as something referred to as a merger can actually mean different things, like:
- creation of a new entity into which the old entities are merged
- creation of a new entity, old entities become subsidiaries of this entity
- renaming one entity, second entity merged into the renamed entity
- renaming one entity, second entity becomes a subsidiary of the renamed entity.
- --Jklamo (talk) 10:28, 30 September 2018 (UTC)
- Thank you for the 4 examples. I would suggest:
- creation of a new entity into which the old entities are merged -> option a) above
- creation of a new entity, old entities become subsidiaries of this entity -> parent organization (P749) for the new entity and subsidiary (P355) for the old entities.
- renaming one entity, second entity merged into the renamed entity -> option a) above
- renaming one entity, second entity becomes a subsidiary of the renamed entity -> parent organization (P749) for the renamed entity and subsidiary (P355) for the other entity.
- --Datawiki30 (talk) 13:51, 30 September 2018 (UTC)
- Companies can be difficult. Recently Fairfax Media (Q1393218) (an Australian company owning numerous newspapers, websites etc.) was taken over by Nine Entertainment Co. (Q16999054), which is apparently extinguishing the Fairfax brand. Newspaper articles say that Fairfax has ceased to exist, and editors at en:Fairfax Media have followed suit. However, the company didn't go away on the day of the takeover, but continues to exist as a subsidiary of Nine (see government database entries at [6] and [7]). There's unlikely to be enough public information available in future to work out its on-going status. On Wikidata, it still owns numerous newspapers (this may be technically correct at present, but may change over time). I've put a dissolved, abolished or demolished date (P576) on the Wikidata item, but I'm not really sure how it should be handled. Ghouston (talk) 03:54, 12 December 2018 (UTC)
- Now the English Wikipedia has changed its mind and reinstated Fairfax Media as en:Nine Publishing. The other sitelinks on the item still call it Fairfax Media. Yet the Fairfax Media company still exists, according to the Australian business register [8]. Ghouston (talk) 01:19, 5 March 2019 (UTC)
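Options a) and b) proposed at the top of this section can be written down as a small rule, sketched below. This is only an illustration of the proposal, not an agreed convention: the function name and the placeholder item IDs are hypothetical, and the subsidiary cases raised later in the thread are deliberately left out.

```python
def merger_statements(old_qids, new_qid=None, merger_date=None):
    """Return QuickStatements V1 lines for a merger, following options a) and b).

    a) the merged company has an item: link old and new with P1366 / P1365
       and do not add P576 (treated as redundant in the proposal).
    b) the merged company has no item: record P576 on the old companies.
    """
    lines = []
    for old in old_qids:
        if new_qid is not None:
            lines.append(f"{old}\tP1366\t{new_qid}")  # replaced by
            lines.append(f"{new_qid}\tP1365\t{old}")  # replaces
        elif merger_date is not None:
            # /11 marks day precision in the QuickStatements time format.
            lines.append(f"{old}\tP576\t+{merger_date}T00:00:00Z/11")
    return lines


# Placeholder item IDs, for illustration only.
print("\n".join(merger_statements(["Q111", "Q222"], new_qid="Q333")))
print("\n".join(merger_statements(["Q111"], merger_date="2018-09-29")))
```

Cases like the Fairfax Media takeover, where the old company continues to exist as a subsidiary, do not fit either branch and would still need a manual decision.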
Where would website or privacy policy go?
I am working on a project aiming to catalog data tied to how companies handle data protection concerns. For instance I would like to have a framework to add website URLs, as well as privacy policy URLs and content. That's very much just a start. What would be the best way to do this? I would appreciate any help as I am new to Wikidata. Thank you. Pdehaye (talk) 13:13, 19 November 2018 (UTC)
- Website URL's can be added using the property official website (P856). I don't think we currently have a property to specifically link to privacy policies - you could do this now using URL (P2699) with a qualifier applies to part, aspect, or form (P518) privacy policy (Q1999831); that seems the right way to me at least but somebody else may suggest a better way to model this. ArthurPSmith (talk) 15:26, 19 November 2018 (UTC)
- What if I want to create a new item for a specific privacy policy?
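The modelling suggested in the reply above, URL (P2699) with the qualifier applies to part, aspect, or form (P518) set to privacy policy (Q1999831), could be produced in bulk with a helper like this sketch. The function name and the example URL are hypothetical; the property and item numbers are the ones named in the reply.

```python
def privacy_policy_statement(qid, policy_url):
    """Format one QuickStatements V1 line linking an item to its privacy policy."""
    # URL (P2699) qualified with "applies to part" (P518) = privacy policy (Q1999831).
    return f'{qid}\tP2699\t"{policy_url}"\tP518\tQ1999831'


# Q4115189 is the Wikidata sandbox item; the URL is a placeholder.
print(privacy_policy_statement("Q4115189", "https://example.org/privacy"))
```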
Mix n Match with SIRENE, French register for French companies
I've launched a mix n match to link Wikidata to the SIRENE database, i.e. the French register for companies. Only big companies with more than 1,000 employees have been selected --PAC2 (talk) 07:43, 30 January 2019 (UTC)
CX and similar
At Q55841490, I tried to add some more data on this one. Obviously active ones could be more interesting, but Q55841490 might be a good benchmark for the type of data that should be available. --- Jura 18:06, 11 February 2019 (UTC)
Gender pay gap data
Disclosure requirements in the UK for the gender pay gap of most companies have produced a treasure trove of data: https://gender-pay-gap.service.gov.uk/ Countless secondary sources have published articles about this: see an example of usage inline in English Wikipedia articles.
The CSV dump contains CompanyNumber and/or SicCodes for each line which should help match Wikidata items. The Open Government License was previously discussed at Wikidata:Project_chat/Archive/2017/10#OGL licence for data. Nemo 09:41, 12 August 2019 (UTC)
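Matching the dump against existing items could start from the CompanyNumber column, as in the rough sketch below. The function name, the sample rows and the `EmployerName` column are hypothetical, and the lookup table from Companies House ID to item is assumed to have been built separately. Padding numbers to eight characters is an assumption for files that have lost their leading zeros.

```python
import csv
import io


def match_rows(csv_file, qid_by_company_number):
    """Pair each CSV row with the Wikidata item holding the same company number."""
    matched = []
    for row in csv.DictReader(csv_file):
        number = (row.get("CompanyNumber") or "").strip().upper()
        if not number:
            continue  # some employers have no company number
        qid = qid_by_company_number.get(number.zfill(8))
        if qid is not None:
            matched.append((qid, row))
    return matched


# Made-up rows; Q4115189 is the Wikidata sandbox item.
sample = io.StringIO("EmployerName,CompanyNumber\nExample Ltd,2906991\nNo Number Ltd,\n")
for qid, row in match_rows(sample, {"02906991": "Q4115189"}):
    print(qid, row["EmployerName"])
```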
Difference between net worth (P2218) and total equity (P2137)
net worth (P2218) is described as applying to Persons only. However, net worth (Q1933764) points to w:Net_worth, which is said to apply to companies, individuals, governments, economic sectors or entire countries.
So I'd suggest to merge the two props and their aliases? Barring that, can someone explain what is the difference? I'm trying to map those props to a Company Graph model.
Cheers! --Vladimir Alexiev (talk) 04:53, 25 September 2019 (UTC)
- Check this link, for example. In my humble opinion, merging is not a good idea; keeping both the "generic" net worth (P2218) applied to persons and the more specific total equity (P2137) for companies makes sense, for example for setting constraint violations. The same goes for users: company infoboxes use total equity (P2137), while person infoboxes use net worth (P2218).--Jklamo (talk) 07:51, 25 September 2019 (UTC)
- The enwiki article doesn't define the property. Property proposal definitions and discussion on the property page define it. It seems the difference is that they have different constraints with one being for people and the other for organizations. I feel weakly in favor of merging but merging means deleting one property and thus is a discussion that would happen through Property deletion requests. ChristianKl ❪✉❫ 08:27, 25 September 2019 (UTC)
- Thanks @Jklamo: although the explanation at that link is a bit circuitous, I'm satisfied there is enough of a historical difference to keep both. --Vladimir Alexiev (talk) 14:47, 21 October 2019 (UTC)
Company name and business changes over time
Should a company (listed, public company in this example) be coded in a single Wikidata item with time-qualified name, ticker code etc, or as multiple items linked by replaces (P1365)/replaced by (P1366)? My question is triggered by an EnWiki AFD discussion about a stub article about a company. The company had already changed its name, and in trying to clean up the article, I discovered there was also a Wikipedia article about the company under an earlier name. For now, I have created a third Wikidata entity and linked them with follows/followed by, but I wonder if the three should be merged and have time qualifiers on the properties that have changed. Miller's Retail (Q6859005) already existed, Specialty Fashion Group (Q28183744) was the article brought to AFD, I created City Chic Collective (Q83484143) for its current name.
To add to the complexity, the company ran a number of retail chains of shops, and has recently sold off all but one of those (the one that matches its current name - City Chic). The chains it sold include one with the original company name (Miller's/Millers). Should each chain/brand also have its own Wikidata item so that ownership changes can be tracked over time (or is that a question for another wikiproject?)? --ScottDavis (talk) 23:59, 23 January 2020 (UTC)
- @ScottDavis: If it's clearly the same entity with a different name, then a single item should suffice. If there was some structural change along with the name change (for example a relocation, merger, change of business activity, etc.) then maybe two items would be better. ArthurPSmith (talk) 15:42, 24 January 2020 (UTC)
Adding the Forbes 2000 rank to a company
I would like to add the Forbes 2000 rank to companies (for many years). This can be done by the statement "part_of" Forbes_Global_2000, with the qualifiers ranking and point_in_time (see https://www.wikidata.org/wiki/Q26463 for an example). The full Forbes 2000 list for the year 2017 can be found on Kaggle. To do so I would need to use a bot.
My question: if I read the terms of use of Forbes (https://www.forbes.com/terms-and-conditions), the list is clearly protected by copyright. On Wikipedia, we can already find the top 20 ranks. How should I proceed? Do I need to ask Forbes if they agree to have their list in Wikidata?
- I'm unsure about the copyright status, but don't think part of (P361) is the right property for indicating something is present in a list. --SilentSpike (talk) 10:14, 3 March 2020 (UTC)
Model items
I think it would be a good idea to establish some model item (P5869) items for business (Q4830453) and company (Q783794) (and any other items that should be maintained by this WikiProject - currently just those two). McDonald’s (Q38076) seems like a decent starting point as potentially the most globally well known business which could be easily fleshed out with sourced information. --SilentSpike (talk) 23:12, 14 April 2020 (UTC)
- Seems to be in good shape, but some statements are not referenced. Apple (Q312) is in even better shape.--Jklamo (talk) 07:40, 15 April 2020 (UTC)
- @Jklamo: Good pick, looks to be very fleshed out. I think we could add both as model items (as they are) and the data can be cleaned up where needed (things I notice missing most are start/end times and non-wikipedia import sources) only serving to make them even more model. Have gone ahead and done so now. --SilentSpike (talk) 11:11, 15 April 2020 (UTC)
Navboxes
Currently some properties related to companies are on Template:Organisation_properties. Should all properties for companies go there or should we make a separate navbox for companies?
Notified participants of WikiProject Companies
Iwan.Aucamp (talk) 14:19, 19 May 2020 (UTC)
- Thanks for the link, I was not aware of that template. We do not have a similar template here and the properties seem to be similar. I think it would be a good idea to add company properties there.--Jklamo (talk) 16:43, 19 May 2020 (UTC)
I merged the item from Finnish Wikipedia into this item. The Finnish article is about a Finnish and an international position, specifically as a historical feature. After merging I noticed that all the articles linked to this item represent a French position, while the Wikidata item is country-independent. How would you prefer to model this? I would prefer to have a more abstract parent item and country-specific subitems. I hope you could help in remodeling and pointing out the problems, for example infoboxes that use this value. – Susanna Ånäs (Susannaanas) (talk) 14:16, 14 May 2020 (UTC)
- We have the general items chief executive officer (Q484876) and president (Q1255921) that seem to apply in most countries; this is a combination of the two? ArthurPSmith (talk) 17:48, 15 May 2020 (UTC)
- @Susannaanas, ArthurPSmith: My Finnish isn't particularly strong, but I get the impression from the article that Pääjohtaja might perhaps be closer to director general (Q1501800). To confound the issue, I also fear that Q428322 may be misleadingly labelled, and should perhaps stick with something closer to its linked enwiki article (PDG). --Oravrattas (talk) 20:50, 23 June 2020 (UTC)
Importing data from OpenCorporates
Notified participants of WikiProject Companies
I have recently had a look at ways to improve our connections with OpenCorporates (Q7095760). I have discovered that the OpenCorporates ID (P1320) consists of two predictable parts:
- a code for the jurisdiction of the entity;
- the company number for this legal entity in that jurisdiction - this is the original id from the national company register, not a made-up one from OpenCorporates.
Therefore it is possible to deduce a lot of OpenCorporates ID (P1320) values from properties for national registers (and conversely). For instance OpenCorporates ID (P1320) "fr/527678262" corresponds to SIREN number (P1616) "527678262", and OpenCorporates ID (P1320) "gb/02906991" corresponds to Companies House company ID (P2622) "02906991".
I have done a few of these derivations for various jurisdictions and created a table summarizing the correspondence between national prefixes and Wikidata properties.
I think this makes OpenCorporates ID (P1320) really interesting for Wikidata: it connects in a completely transparent way with national registers that we already link to. Moreover, they have an OpenRefine reconciliation interface that makes it easy to look for matches and upload the ids to Wikidata. (I am thinking about writing a tutorial about this workflow.)
For now, the limiting factor is the license: we cannot pull more than the ids because their API is under CC-BY-SA. I will have a call on Tuesday with them to see if they could still allow some data import in Wikidata. What sort of data would you be most interested in? − Pintoch (talk) 17:41, 21 July 2018 (UTC)
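The derivation described above can be sketched in a few lines. The two prefix mappings are the ones given in this post (fr with P1616, gb with P2622); the function names are hypothetical, and any further jurisdictions would have to be taken from the correspondence table mentioned above rather than guessed.

```python
# Jurisdiction prefix for each national register property named in the post.
PREFIX_BY_PROPERTY = {
    "P1616": "fr",  # SIREN number
    "P2622": "gb",  # Companies House company ID
}
PROPERTY_BY_PREFIX = {prefix: prop for prop, prefix in PREFIX_BY_PROPERTY.items()}


def opencorporates_id(register_property, register_value):
    """Build an OpenCorporates ID (P1320) from a national register statement."""
    return f"{PREFIX_BY_PROPERTY[register_property]}/{register_value}"


def national_register_statement(oc_id):
    """Split an OpenCorporates ID into (register property, company number)."""
    jurisdiction, _, number = oc_id.partition("/")
    prop = PROPERTY_BY_PREFIX.get(jurisdiction)
    return (prop, number) if prop else None


print(opencorporates_id("P1616", "527678262"))     # fr/527678262
print(national_register_statement("gb/02906991"))  # ('P2622', '02906991')
```

Jurisdictions missing from the table return `None` in the reverse direction, so unknown prefixes are skipped rather than mapped to the wrong register.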
- That is interesting. I think corporate relationships would be very useful - this entity is a subsidiary of that, etc. Official website if there is one. More detailed headquarters location info if applicable. Inception & dissolution/merger dates, replaces/replaced etc. if possible. If a more detailed corporate type is available and can be matched to a wikidata ID for P31 that could be useful. Stock index/ticker symbol I guess. ArthurPSmith (talk) 18:38, 21 July 2018 (UTC)
- Also financial data (e.g. total revenue (P2139), total assets (P2403)), information on leadership (director / manager (P1037)), seat (headquarters location (P159)) and legal status (legal form (P1454)) will be of interest. --MB-one (talk) 01:24, 22 July 2018 (UTC)
- OpenCorporates suggest that we narrow down to the top three most important data fields that we would like to import, to see if they could make a special agreement to import these in Wikidata. − Pintoch (talk) 15:11, 25 July 2018 (UTC)
- Well that's promising. I assume that's 3 fields beyond the ID? Do you have a list of their fields somewhere, and how often they are populated? Looking at a few examples, I think I'd say "Incorporation Date" and "Jurisdiction" may be the most important. ArthurPSmith (talk) 16:59, 26 July 2018 (UTC)
What is the current status (as of 31.03.2021) of importing data from OpenCorporates? Is anyone working on it? How many companies (out of 190+ million) are planned to be imported? RShigapov
- @RShigapov: Good question! @Pintoch: were you still in contact with them or was this handed off to somebody else? ArthurPSmith (talk) 17:24, 31 March 2021 (UTC)
- @ArthurPSmith: hello all, good to hear from you! I haven't continued the conversation beyond what is reported here, sorry… I am sure it is not too hard to pick it back up from where it was though. If someone wants to ask them again I can forward them the conversation I had (although the people behind the service might have changed in the meantime). − Pintoch (talk) 18:30, 31 March 2021 (UTC)
- @RShigapov: are you interested in taking this further? If not I think it's a useful project so I could look into it. ArthurPSmith (talk) 19:38, 31 March 2021 (UTC)
- @ArthurPSmith: As you know, I am not yet experienced here and do not yet know how everything works, so at this point I would prefer to support you rather than take action myself. @Pintoch: could you send the conversation to both of us, though? Meanwhile, I have also been thinking about your discussion above on OpenCorporates IDs. OpenCorporates even had a policy paper on their IDs, How OpenCorporates should handle company number problems. So OpenCorporates tries to keep their IDs as close as possible to the national registers. In the case of Germany (I work on creating a Wikibase instance for German company data at www.berd-bw.de), the OpenCorporates ID is formed from the code of a local court (Amtsgericht), the code corresponding to the legal form of the legal entity, and the company number. Would it make sense to create a new property in Wikidata, 'native company number', for the company numbers? Then a statement with the property 'native company number' could have a qualifier "issued by" (P2378) for the registration authority. Wikidata already has the local courts of Germany as items. For Germany there should be one more qualifier with the code for the legal form, but for some other countries the qualifier 'issued by' would be sufficient. If we want to make it applicable to all countries, then the datatype of 'native company number' has to be string, I guess, not external-id, because formatter URLs would be country-specific. Does that make sense? RShigapov P.S. How do you save the dates here? 07.05, 01 April 2021 (UTC)
- See Wikidata_talk:WikiProject_Companies#Missing_identifier_properties below. A property for OpenCorporates directory of identifiers could be interesting. --- Jura 08:38, 1 April 2021 (UTC)
- @Jura1: interesting! So you use catalog code (P528) with qualifier catalog (P972) for what I had thought of as 'native company number' with qualifier issued by (P2378). If we apply your scheme to the German case, how would we model the type of register (HRA, HRB, GnR, PR, VR), which corresponds to the groups of legal forms? Could we use catalog code (P528) for the company number, with a qualifier catalog code (P528) for the type of register and another qualifier catalog (P972) for the registration authority? RShigapov 10:05, 1 April 2021 (UTC)
- I used that as no specific property was available for Red Cross organizations of some countries. The applicable national/local identifiers can vary a lot, and for most we don't have properties yet. These may be the same identifiers as for companies, but the relevant ones can be others. In general, I would create specific properties though, especially when one has more than one value to add. --- Jura 12:44, 1 April 2021 (UTC)
Location types
As part of the EurHisFirm research project, work is in progress on a Wikibase instance in order to publish and enrich historical data related to European companies since the 19th century. The properties and, more broadly, the data model will be kept as close as possible to what already exists in Wikidata, and I would like to thank all the participants of this Wikiproject for their inspiring work until now. The instance is not yet live but, when it is, you will be notified and we hope that you will find this new source of data useful!
At the moment, we're focusing on the location types of the companies. For each company, the EurHisFirm source databases contain a list of "locations" such as headquarters, commercial headquarters, operational headquarters, branch office, point of sale, factory, place of liquidation, etc. The idea is to use a model similar to Wikidata's headquarters location (P159) (example here), but with a generic property such as "company location" that takes the location type (e.g. "headquarters") as its value and the location information (e.g. "city = Paris", "address = 14 rue roquépine", etc.) as qualifiers.
The company location property doesn't exist in Wikidata as far as I know, probably because this type of granular data is hard to source for most companies and not notable or even useful.
I'm wondering if this is the best way to do it. What do you think? – The preceding unsigned comment was added by Johanricher (talk • contribs) at 07:05, June 23, 2020 (UTC).
- @Johanricher: There is also a new proposal for a corporate domicile property, where that may differ from the location of the headquarters. We do of course have a generic location (P276) that could be used here with qualifiers to indicate other location types. ArthurPSmith (talk) 19:37, 23 June 2020 (UTC)
Automobile manufacturers
Hi, I recently joined Wikidata because I'm trying to build a car database from Wikidata data. To find all the automobile manufacturers, I have been running a query on instance of (P31) automobile manufacturer (Q786820). However, it turns out that a lot of manufacturers are excluded from the result. In many cases that's because they are just instance of (P31) business (Q4830453) (and sometimes enterprise (Q6881511)), without further classification. I realize that there is industry (P452) automotive industry (Q190117), but this includes other companies in the industry (e.g., automotive suppliers) that I am not interested in. User:Jklamo suggested that I should use industry (P452) and product, material, or service produced or provided (P1056) rather than instance of (P31). So shall I use product, material, or service produced or provided (P1056) motor car (Q1420) for automobile manufacturers? Thanks! --Ccdb.me (talk) 18:49, 6 December 2020 (UTC)
- @Ccdb.me: Well, that will only work if those statements have been actually already entered for these manufacturers. Wikidata is far from complete, so you may have to add some of this information yourself as you find out more about these things. But as a general rule, yes it's preferable to use more specific properties rather than refinements of instance of (P31). ArthurPSmith (talk) 19:16, 7 December 2020 (UTC)
- An alternative could be to query values of "manufacturer" or "brand" on car models. --- Jura 20:12, 7 December 2020 (UTC)
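For illustration, the two selection criteria discussed in this thread can be combined in a single SPARQL query. The sketch below only builds the query text in Python; the function name is hypothetical, the property and item IDs are the ones quoted above, and the result will still only contain companies for which one of these statements has actually been entered.

```python
# Graph patterns using only the property/item IDs quoted in this thread.
INSTANCE_OF_MANUFACTURER = "?company wdt:P31 wd:Q786820 ."  # instance of automobile manufacturer
PRODUCES_MOTOR_CAR = "?company wdt:P1056 wd:Q1420 ."        # product ... produced: motor car


def build_manufacturer_query(limit: int = 100) -> str:
    """Return SPARQL text selecting companies that match either pattern."""
    patterns = [INSTANCE_OF_MANUFACTURER, PRODUCES_MOTOR_CAR]
    union = "\n  UNION\n".join("  { " + p + " }" for p in patterns)
    return "SELECT DISTINCT ?company WHERE {\n" + union + "\n}\nLIMIT " + str(limit)


if __name__ == "__main__":
    # Paste the printed text into the Wikidata Query Service to try it.
    print(build_manufacturer_query(50))
```

Using UNION rather than requiring both statements keeps manufacturers that are typed only through instance of (P31) or only through P1056.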
Corporate name changes
I have been looking into publishers. The names of publishers have changed a few times between the publication of their works that are now in the public domain and today. It was a leap of faith for me to decide that "The Macmillan Company of Canada" is now "Macmillan Publishers", for instance.
What about separate items for each name change, complete with inception and demise and their monarchs and public faces, locations and subsidiaries or what have you? -- all gathering at the current name item. I think it would be a good way to capture the history of the corp and for sure easier to manage when dealing with versions of the product. At least for books this would be great and I suspect for many other products also.
Thank you for your consideration of this.
Notified participants of WikiProject Companies --RaboKarbakian (talk) 04:02, 23 December 2020 (UTC)
- Difficult question indeed. Briefly, we stick to the corporate identity: in the case of a mere renaming of the same legal entity, we use official name (P1448) (with start time (P580)/end time (P582)), but in fact that is a sporadic case (and note that different entities from different countries may have the same name, and even within one country different entities may have the same name over time). A company and its subsidiary are not the same entity, even though wiki pages tend to conflate multiple concepts into one article. We are more granular here at Wikidata.
- Broadly, it is often convenient to create separate items for the imprint and for the legal entity/publishing company, as a single publishing company may use multiple imprints, and the same imprint may be used by different publishing companies over time. You can connect imprint and publishing house easily using item operated (P121) and owned by (P127) (with start time (P580)/end time (P582)). --Jklamo (talk) 01:45, 31 December 2020 (UTC)
Missing identifier properties
What to do when we lack a dedicated property for a company's (or similar organization's) identifier?
- I went for using catalog code (P528) with qualifier catalog (P972). Sample at Q11282482#P528, registry there is Register of Charities (Q105651056).
BTW, I noticed we have a mostly unused OpenCorporates corporate grouping (P5256), but lack a property for its directory of registries. Maybe we could repurpose P5256 to that. --- Jura 09:27, 24 February 2021 (UTC)
- The best approach is to propose a property for the identifier. We already have 30+ properties for company identifiers and I am not aware of any rejected property proposal of this kind.
- I do not think that repurposing OpenCorporates corporate grouping (P5256) is a good idea. --Jklamo (talk) 13:10, 1 April 2021 (UTC)
- It's just not efficient for users to go through the property proposal process for one or two values. Especially when every organization uses a different one. BTW, the one above may not even be used by OC. --- Jura 13:21, 1 April 2021 (UTC)
- I will propose props for both https://opencorporates.com/registers and https://www.gleif.org/en/about-lei/code-lists/gleif-registration-authorities-list --Vladimir Alexiev (talk) 07:05, 10 April 2021 (UTC)
Where OpenCorporates covers the entity, you can use OpenCorporates ID (P1320), as a national identifier is often easily derivable from their ID (and thus easy to populate if a dedicated property is created in the future). Using catalog code (P528) is suboptimal; you can also use described at URL (P973) with a link to the national register.
- Indeed, see https://www.wikidata.org/wiki/Wikidata:Property_proposal/EIK: there's a query at the bottom that uses OpenCorporates ID (P1320) and EU VAT number (P3608) to populate BG EIK (P8894) --Vladimir Alexiev (talk) 07:05, 10 April 2021 (UTC)
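For anyone who has used the catalog code (P528) with qualifier catalog (P972) pattern from the start of this thread, the values can be retrieved again later, for example to migrate them once a dedicated property exists. The following is a minimal sketch that only builds the SPARQL text; the function name is hypothetical, and Q105651056 is the Register of Charities item from the sample above.

```python
def catalog_code_query(catalog_qid: str) -> str:
    """Return SPARQL text listing items whose catalog code (P528) statement
    carries the qualifier catalog (P972) = the given catalog item."""
    if not (catalog_qid.startswith("Q") and catalog_qid[1:].isdigit()):
        raise ValueError("expected an item ID such as 'Q105651056'")
    return "\n".join([
        "SELECT ?item ?code WHERE {",
        "  ?item p:P528 ?statement .",          # statement node, needed to reach qualifiers
        "  ?statement ps:P528 ?code ;",         # the identifier value itself
        "             pq:P972 wd:" + catalog_qid + " .",  # qualifier: which register
        "}",
    ])


if __name__ == "__main__":
    print(catalog_code_query("Q105651056"))
```

The query goes through the statement node (p:/ps:/pq:) because the register is stored as a qualifier rather than as a separate statement on the item.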