ID K7F9F2_PELSI Unreviewed; 788 AA.
AC K7F9F2;
DT 09-JAN-2013, integrated into UniProtKB/TrEMBL.
DT 09-JAN-2013, sequence version 1.
DT 27-MAR-2024, entry version 50.
DE SubName: Full=Collagen type XXVIII alpha 1 chain {ECO:0000313|Ensembl:ENSPSIP00000004662.1};
OS Pelodiscus sinensis (Chinese softshell turtle) (Trionyx sinensis).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archelosauria; Testudinata; Testudines; Cryptodira; Trionychia;
OC Trionychidae; Pelodiscus.
OX NCBI_TaxID=13735 {ECO:0000313|Ensembl:ENSPSIP00000004662.1, ECO:0000313|Proteomes:UP000007267};
RN [1] {ECO:0000313|Proteomes:UP000007267}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Daiwa-1 {ECO:0000313|Proteomes:UP000007267};
RG Soft-shell Turtle Genome Consortium;
RL Submitted (OCT-2011) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Proteomes:UP000007267}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Daiwa-1 {ECO:0000313|Proteomes:UP000007267};
RX PubMed=23624526; DOI=10.1038/ng.2615;
RA Wang Z., Pascual-Anaya J., Zadissa A., Li W., Niimura Y., Huang Z., Li C.,
RA White S., Xiong Z., Fang D., Wang B., Ming Y., Chen Y., Zheng Y.,
RA Kuraku S., Pignatelli M., Herrero J., Beal K., Nozawa M., Li Q., Wang J.,
RA Zhang H., Yu L., Shigenobu S., Wang J., Liu J., Flicek P., Searle S.,
RA Wang J., Kuratani S., Yin Y., Aken B., Zhang G., Irie N.;
RT "The draft genomes of soft-shell turtle and green sea turtle yield insights
RT into the development and evolution of the turtle-specific body plan.";
RL Nat. Genet. 45:701-706(2013).
RN [3] {ECO:0000313|Ensembl:ENSPSIP00000004662.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AGCU01029134; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01029135; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01029136; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR AlphaFoldDB; K7F9F2; -.
DR STRING; 13735.ENSPSIP00000004662; -.
DR Ensembl; ENSPSIT00000004688.1; ENSPSIP00000004662.1; ENSPSIG00000004348.1.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000163195; -.
DR HOGENOM; CLU_356246_0_0_1; -.
DR OMA; ILFEEKC; -.
DR TreeFam; TF331207; -.
DR Proteomes; UP000007267; Unassembled WGS sequence.
DR CDD; cd01450; vWFA_subfamily_ECM; 1.
DR Gene3D; 3.40.50.410; von Willebrand factor, type A domain; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF878; COLLAGEN ALPHA-1(XXVIII) CHAIN; 1.
DR Pfam; PF01391; Collagen; 2.
DR Pfam; PF00092; VWA; 1.
DR PRINTS; PR00453; VWFADOMAIN.
DR SMART; SM00327; VWA; 1.
DR SUPFAM; SSF53300; vWA-like; 1.
DR PROSITE; PS50234; VWFA; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000007267};
KW Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..23
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 24..788
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5003901464"
FT DOMAIN 49..229
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT REGION 243..770
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 262..282
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 627..655
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 741..755
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 788 AA; 79140 MW; AF6B14590A1D6151 CRC64;
MWKKHFAFCI LLLAVATDHV VYGQNRKKGQ KPILLAGKDD FQDPVCFIDI VFILDSSESA
KHVLFDKQKD FVENLSDKVF VMKPAKFRKY DVKLAVMQFS STVKIDHPFV AWKDLNNFKQ
EVKNMNYIGH GTYSYYAISN ATHLFKTEGR KGSVKVALLM TDGIDHPKSP DVQGISTTAR
DFGISFITIG LSSKKASKSN LHLLSGNPPN EPVLILNDPN LSEKIKEQLV TLFNKRCEQK
SCNCEKGDPG PPGPPGGHGG RGDKGDQGRK GDKGDSEKGE SGEKGQVGAP GNKGEKGGRG
ECGTPGLKGD RGLEGPLGLK GPRGPQGISG PPGMPGPNGI QGNKGEPGPS GPYGPTGPPG
VGLPGPKGDR GQEGRIGPPG PIGIGEPGLT GPRGPEGVQG ERGPPGEGFP GQKGEKGSEG
PAGPTGIAGE SIKGDKGERG PEGPQGPIGP PGIGSQGSQG IQGPKGIPGQ KGSQGIGVQG
PKGKEGESGH KGDPGPPGIG VPGPKGDPGT SGQPGQPGMK GQTGNNGKKG EKGAPGLRGL
EGPAGKGDVG QKGDQGEKGS PGVDGALGSQ GPPGPKGEPG KKGSDGSPGP SIRGLPGPKG
EPGETGLKGS DGNPGDSIMG PMGNPGPTGL PGLPGPKGDG YPGAPGPPGL PGSPGPRGPK
GFGDPGQKGE LGARGLPGPS GPRGIGILGP KGAVGQKGLP GPPGPQGDSI QGSKGERGVP
GLPGPRGSEG YGLPGPKGDY GEKGDPGKKG DKGDIGEPGP AGPKGLMGRK GELGLSKEEI
IRLIMEIC
//