User:Jean-Frédéric/Videogames data model
The data model we follow for video game (Q7889) items is inconsistent and in many cases not granular enough. This is often inherited from the way things are modelled on Wikipedia.
Typically, handheld tie-ins to a big release have very little in common with the main title. Yet we often have only one item:
- FIFA: Road to World Cup 98 (Q913223), for both a home-console version and a Game Boy version.
- Tomb Raider: Legend (Q665785), for the multi-platform main title as well as the Nintendo DS and Game Boy Advance versions.
That being said, would it really make sense to have separate items for the PC, Xbox and PS versions?
Language versions and updates
- One item, Street Fighter Alpha 2 (Q2422795), for Street Fighter Alpha 2, Street Fighter Zero 2 and Street Fighter Zero 2 Alpha
Remakes, ports, etc.
- Plenty of remakes/remasters have separate items. Examples:
- Final Fantasy Tactics (Q591378) and Final Fantasy Tactics: The War of the Lions (Q1158777)
- The Legend of Zelda: A Link to the Past and Four Swords (Q1358804) and The Legend of Zelda: A Link to the Past (Q370055)
- The Legend of Zelda: Twilight Princess HD (Q23008476) and The Legend of Zelda: Twilight Princess (Q735613)
- Metal Gear Solid 3: Subsistence (Q12636722) and Metal Gear Solid 3: Snake Eater (Q247935), by virtue of some Wikipedias
- But not necessarily:
- The 16-bit Final Fantasy games re-released for PS/GBA.
- The Ico & Shadow of the Colossus Collection (Q7741308)
- Virtually every game re-released on Wii Virtual Console, Android etc.
- This makes things like using publication date (P577) very painful − what is the meaning of having publication date = 2010 for a 1995 game, because it was re-released on iOS?
(On that topic: remakes are often not linked to the original work using based on (P144).)
External databases are often more granular than Wikidata; cf. all the MobyGames examples above.
- Hall of Light ID (P4671) makes the distinction between Amiga version and CD32 version (while MobyGames does not)
- Metacritic ID (P1712) makes the distinction between each platform (see eg Q200864#P1712)
- Guardiana ID (P4710) makes the distinction between each platform, eg the Master System, MegaDrive and GameGear versions of Streets of Rage 2 (Q2569030) (neither Wikipedia nor MobyGames does), but not per region.
- Atarimania identifier (P4859) makes the distinction per platform (eg Atari 400 vs Atari ST), per region and per re-edition (see eg 6 entries for Monkey Island)
I feel like we should be able to answer basic questions from the data:
- Which was the original version of The NewZealand Story (Q2121419) or Night Gunner (Q16666551) and which were the ports?
- How do the reviews of Final Fantasy VII (Q214232) on PS, PC and iOS differ?
- How many copies of Street Fighter II: The World Warrior (Q1133204) were sold for SNES?
Game / Editions
With the current data model of one item to rule them all, the best workaround is often to slap a platform (P400) qualifier on virtually every statement, which is even more convoluted for properties like distribution format (P437), where the distribution format is often (though not always) entirely determined by the platform.
See User:Diggr/Data Models of Video Games for a deeper analysis of the current modeling options.
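To make the qualifier workaround concrete, here is a minimal Python sketch. It assumes a simplified, hypothetical representation of statements as `(property, value, qualifiers)` tuples; the game, platforms and values are placeholders, not real Wikidata data. It shows that statements qualified with platform (P400) can be regrouped mechanically into one bundle per platform, which is roughly what a per-edition item would hold.

```python
from collections import defaultdict

PLATFORM = "P400"  # platform (P400), used here as a qualifier key


def split_by_platform(statements):
    """Group qualifier-laden statements into one statement list per platform.

    Statements without a platform qualifier end up under the key None,
    i.e. they describe the game as a whole.
    """
    editions = defaultdict(list)
    for prop, value, qualifiers in statements:
        platform = qualifiers.get(PLATFORM)
        # Drop the platform qualifier: it is now implied by the grouping.
        rest = {k: v for k, v in qualifiers.items() if k != PLATFORM}
        editions[platform].append((prop, value, rest))
    return dict(editions)


# Placeholder statements on a single hypothetical item.
statements = [
    ("P577", "year-A", {"P400": "Platform A"}),        # publication date
    ("P437", "cartridge", {"P400": "Platform A"}),     # distribution format
    ("P577", "year-B", {"P400": "Platform B"}),
    ("P437", "optical disc", {"P400": "Platform B"}),
    ("P136", "platform game", {}),                     # genre, not platform-specific
]

editions = split_by_platform(statements)
```

In this sketch, everything under `None` would stay on the game item, while each remaining key corresponds to a candidate edition.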
Our vocabulary to model relationships between games is very limited. Sequels are appropriately done with follows (P155)/followed by (P156). But while Relationships among video games: Existing standards and new definitions (Q50180192) outlines 10 other relationship types (isPortOf, isRemakeOf, isRebootOf, isPrequelOf, isExpansionOf, isSidestoryOf, isSpinoffOf, isCrossoverOf, isSpiritualSuccessorOf, isInspiredBy), Wikidata more or less only has based on (P144).
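The loss of information can be illustrated with a small Python sketch. The enum below simply lists the ten relationship types named above; the function `flatten_to_wikidata` is hypothetical and encodes the simplifying assumption that each of them can currently only be expressed as based on (P144).

```python
from enum import Enum


class GameRelation(Enum):
    """The ten relationship types listed in Q50180192."""
    PORT_OF = "isPortOf"
    REMAKE_OF = "isRemakeOf"
    REBOOT_OF = "isRebootOf"
    PREQUEL_OF = "isPrequelOf"
    EXPANSION_OF = "isExpansionOf"
    SIDESTORY_OF = "isSidestoryOf"
    SPINOFF_OF = "isSpinoffOf"
    CROSSOVER_OF = "isCrossoverOf"
    SPIRITUAL_SUCCESSOR_OF = "isSpiritualSuccessorOf"
    INSPIRED_BY = "isInspiredBy"


def flatten_to_wikidata(relation: GameRelation) -> str:
    """Assumption for illustration: every relation collapses to based on (P144)."""
    return "P144"


distinct_relations = len(GameRelation)
distinct_properties = len({flatten_to_wikidata(r) for r in GameRelation})
```

Ten distinct relations map to a single property, so a port and a spiritual successor become indistinguishable once stored.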
Our vocabulary to describe game features is very limited, boiling down to game mode (P404), genre (P136) and some others inherited from different media (narrative location (P840), takes place in fictional universe (P1434), set in period (P2408), P (P)). Meanwhile, the Video Game Metadata Schema also describes games in terms of mechanics, mood, narrative genre, setting, theme, trope and visual style; IGDB has theme and player perspective; MobyGames has visual and perspective…
Wikipedias do not really distinguish much between platform versions. Although we do have eg Family Computer (Q491640) vs. Nintendo Entertainment System (Q172742) vs. Nintendo Entertainment System (Q34468618), we have only one item for SNES / Super Famicom. While this is in itself a convenient abstraction, it is not really accurate to say that a Japan-only game was published on the NES.
Another inconsistency concerns mandatory accessories: there is widespread use of Super CD-ROM2 (Q13574779), for example, and […] (but not with […]).
The case of arcade games is also special: current practice seems to be […], but we do have some use with arcade systems directly (eg […] or […]).
Another example: Amiga databases will make separate entries for chipset-versions (Original Chip Set (Q1969923), Amiga Enhanced Chip Set (Q1343048), Advanced Graphics Architecture (Q379575)) − eg for Pinball Fantasies (Q2095462): hol:1055 vs. hol:1056. Is that a level of detail we want to reach? This is also applicable, to a certain degree, to OS-compatibility: there have been quite a few Microsoft Windows (Q1406) operating systems.
As it turns out, there is scientific literature on the topic :) Whatever model we settle on should be informed by academic research.
Some relevant articles:
- Workflows zur datenbasierten Videospielforschung - Am Beispiel der populären Videospielserie Metal Gear Solid (Q50179733)
- Relationships among video games: Existing standards and new definitions (Q50180192)
- A conceptual model for video games and interactive media (Q50180436)
In the same way that books settled on using Functional Requirements for Bibliographic Records (Q16388), we should settle on using a more sophisticated data model. The current state of the research seems to be: we need to differentiate game, edition and local release.
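As a rough illustration of the game / edition / local release split, here is a minimal Python sketch. All class names, field names and sample values are hypothetical and are not taken from the cited articles. It also assumes, for simplicity, that the earliest dated release identifies the original version, which may not hold for every game.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LocalRelease:
    region: str
    title: str
    publication_date: Optional[str] = None  # ISO 8601 string, e.g. "1995-06-01"


@dataclass
class Edition:
    platform: str
    releases: List[LocalRelease] = field(default_factory=list)


@dataclass
class Game:
    label: str
    editions: List[Edition] = field(default_factory=list)

    def first_release(self) -> Optional[Tuple[Edition, LocalRelease]]:
        """Return the (edition, release) pair with the earliest known date."""
        dated = [
            (release.publication_date, edition, release)
            for edition in self.editions
            for release in edition.releases
            if release.publication_date
        ]
        if not dated:
            return None
        # ISO 8601 date strings sort chronologically as plain strings.
        _, edition, release = min(dated, key=lambda item: item[0])
        return edition, release


# Placeholder data: one game, two editions, three local releases.
game = Game("Example Game", [
    Edition("Platform A", [
        LocalRelease("JP", "Example Game Zero", "1995-06-01"),
        LocalRelease("US", "Example Game Alpha", "1996-01-15"),
    ]),
    Edition("Platform B", [
        LocalRelease("US", "Example Game Alpha", "2010-03-02"),
    ]),
])

original_edition, original_release = game.first_release()
```

With this shape, a re-release date lives on its own local release and no longer competes with the original date on the game itself, and regional titles have an obvious home.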
Ontologies & controlled vocabularies
- GAMECIP − The Game Metadata and Citation Project