Wikidata:Property proposal/position in sequence

position in biological sequence

Originally proposed at Wikidata:Property proposal/Natural science

Description: index or position of a nucleotide in a genomic sequence, or of an amino acid in a protein's amino acid sequence; used as qualifier
Represents: nucleotide (Q28745), amino acid position (Q66424100)
Data type: Quantity
Domain: property
Allowed values: integer > 0
Example 1: phenylalanine hydroxylase (Q420604)
    has part(s) (P527): O-phosphorylated residue (Q66735569)
    → "position in biological sequence" → 16
    of (P642): protein (Q8054)
Example 2: phenylalanine hydroxylase (Q420604)
    gene substitution association with (P1916): phenylketonuria (Q194041)
    → "position in biological sequence" → 39
    of (P642): protein (Q8054)
Example 3: PAH (Q14851781)
    gene substitution association with (P1916): phenylketonuria (Q194041)
    → "position in biological sequence" → 102912840
    of (P642): human chromosome 12 (Q847102)
Planned use: exactly specifying polymorphisms (mutations), hereditary diseases, PTMs
See also: genomic start (P644), genomic end (P645) (these should be renamed/redefined to include amino acid/proteins); note also series ordinal (P1545), which is abstractly similar but associated with series, not fixed sequences
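
As an illustration only, the three examples above could be modelled as qualified statements, with the "integer > 0" constraint checked when a statement is built. This is a minimal sketch: the class `QualifiedStatement` and its field names are hypothetical and are not part of Wikidata or any Wikidata client library. Only the item and property IDs are taken from the examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualifiedStatement:
    """Hypothetical container for one statement plus its two qualifiers."""
    subject: str    # item carrying the statement, e.g. "Q420604"
    prop: str       # main property, e.g. "P527"
    value: str      # main value, e.g. "Q66735569"
    position: int   # qualifier "position in biological sequence"
    of: str         # qualifier of (P642): the sequence being indexed

    def __post_init__(self):
        # Allowed values per the proposal: integer > 0.
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(self.position, bool) or not isinstance(self.position, int):
            raise TypeError("position must be an integer")
        if self.position <= 0:
            raise ValueError("position must be greater than 0")


# The three examples from the proposal.
examples = [
    QualifiedStatement("Q420604", "P527", "Q66735569", 16, "Q8054"),
    QualifiedStatement("Q420604", "P1916", "Q194041", 39, "Q8054"),
    QualifiedStatement("Q14851781", "P1916", "Q194041", 102912840, "Q847102"),
]
```

With this sketch, a position of 0 or a non-integer label such as "15X" is rejected at construction time, which is the behaviour the "Allowed values" row asks for.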

Motivation

See Wikidata:Property_proposal/amino_acid_(start,_end)_position. In particular, commenters there wished for nucleotide and amino acid sequences to be handled uniformly, so the redefinition of genomic start (P644) and genomic end (P645) should happen at the same time. SCIdude (talk) 08:27, 24 August 2019 (UTC)

Discussion

WikiProject Molecular biology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

@ChristianKl series ordinal (P1545) has other problems (see its talk page): it is not identical to a sequence index, since it allows arbitrary ordinals like 2,4 or 15X. Semantically, a series is not a sequence, and AI applications will have problems mapping series ordinal (P1545) to a sequence index. I would, however, agree to use an abstract "index/position in sequence" instead of this proposal. --SCIdude (talk) 07:01, 29 August 2019 (UTC)
@ChristianKl As I said, I'm in favor of a generic sequence index property. Do you think such a proposal would pass quickly? If so, it would make this one obsolete. --SCIdude (talk) 14:06, 19 September 2019 (UTC)
@SCIdude: When it comes to passing a proposal quickly, it's about making clear why one choice of modeling the domain is better than the other choices. As long as it's not clear which choice is best, the proposal should stay open. ChristianKl 14:33, 19 September 2019 (UTC)
@SCIdude, ديفيد عادل وهبة خليل 2, ArthurPSmith, TiagoLubiana, Yair rand, ChristianKl: @YULdigitalpreservation, Gtsulab: position in biological sequence (P8275) has been created. Pamputt (talk) 15:52, 2 June 2020 (UTC)