ID G5C6R8_HETGA Unreviewed; 1679 AA.
AC G5C6R8;
DT 14-DEC-2011, integrated into UniProtKB/TrEMBL.
DT 14-DEC-2011, sequence version 1.
DT 24-JAN-2024, entry version 39.
DE SubName: Full=Collagen alpha-1(XXII) chain {ECO:0000313|EMBL:EHB17229.1};
GN ORFNames=GW7_08328 {ECO:0000313|EMBL:EHB17229.1};
OS Heterocephalus glaber (Naked mole rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Glires; Rodentia; Hystricomorpha; Bathyergidae;
OC Heterocephalus.
OX NCBI_TaxID=10181 {ECO:0000313|EMBL:EHB17229.1, ECO:0000313|Proteomes:UP000006813};
RN [1] {ECO:0000313|EMBL:EHB17229.1, ECO:0000313|Proteomes:UP000006813}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=21993625; DOI=10.1038/nature10533;
RA Kim E.B., Fang X., Fushan A.A., Huang Z., Lobanov A.V., Han L.,
RA Marino S.M., Sun X., Turanov A.A., Yang P., Yim S.H., Zhao X.,
RA Kasaikina M.V., Stoletzki N., Peng C., Polak P., Xiong Z., Kiezun A.,
RA Zhu Y., Chen Y., Kryukov G.V., Zhang Q., Peshkin L., Yang L., Bronson R.T.,
RA Buffenstein R., Wang B., Han C., Li Q., Chen L., Zhao W., Sunyaev S.R.,
RA Park T.J., Zhang G., Wang J., Gladyshev V.N.;
RT "Genome sequencing reveals insights into physiology and longevity of the
RT naked mole rat.";
RL Nature 479:223-227(2011).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; JH173571; EHB17229.1; -; Genomic_DNA.
DR STRING; 10181.G5C6R8; -.
DR eggNOG; KOG1217; Eukaryota.
DR eggNOG; KOG3544; Eukaryota.
DR InParanoid; G5C6R8; -.
DR Proteomes; UP000006813; Unassembled WGS sequence.
DR GO; GO:0110165; C:cellular anatomical entity; IEA:UniProt.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR Gene3D; 2.60.120.200; -; 1.
DR Gene3D; 3.40.50.410; von Willebrand factor, type A domain; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR048287; TSPN-like_N.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1074; LAM_G_DOMAIN DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01391; Collagen; 10.
DR Pfam; PF00092; VWA; 1.
DR PRINTS; PR00453; VWFADOMAIN.
DR SMART; SM00210; TSPN; 1.
DR SMART; SM00327; VWA; 1.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
DR SUPFAM; SSF53300; vWA-like; 1.
DR PROSITE; PS50234; VWFA; 1.
PE 4: Predicted;
KW Collagen {ECO:0000256|ARBA:ARBA00023119, ECO:0000313|EMBL:EHB17229.1};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000006813};
KW Secreted {ECO:0000256|ARBA:ARBA00022530}; Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..27
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 28..1679
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5003475183"
FT DOMAIN 38..213
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT REGION 479..675
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 814..954
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1267..1512
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1544..1661
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 591..606
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 830..844
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 857..871
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1488..1505
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1641..1658
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1679 AA; 173972 MW; 1CBCA1D848119159 CRC64;
MVGPRWSTAA CLLWILLLWS QDGSCQAQRA GCKSVHYDLL FLLDTSSSVG KGDFEKVRQW
VANLVDTFEV GPDHTRVGVV RYSDAPTTAF ELGRFSSRQE VKAAAGRIPY HGGNTNTGDA
LRYITGRSFS PQAGGRPGNR AFKQVAILLT DGRSQDLVLD AAVAAHGAGI RVFAVGVGAA
LKEELEEIAS EPKSAHIFHV SDFNAIDKIR GKLRRRLCEN VLCPSVRVEG DRFKHTNGGT
KEITGFDLMD LFSVKEILGK RENGAQSSYV RMGSFPVVQR TEDVFPQGLP DEYAFVTTFR
FRKTSRKEDW YIWQVIDQYG IPQVSIRLDG ENKAVEYNAV GAMKDAVRVV FRGPRVNVLF
DRDWHKMALS IQAQNVSLYV DCALVQMLPI EERENIDIQG KTVIGKRLYD SVPIDFDLQR
IVIYCDSRHA ELETCCDIAS GPCQVTVVTE PPPPPLQPPT PGSEQIGFLK TINCSCPPGE
KGERGFDGPV GLPGPKGDMG ITGPMGSPGP KGEKGDTGRG SFVQGKKAEK ESLGLPGPPG
RDGSKGMRGE PGEPGEPGLP GEVGMRGPQG LPGLPGPPGH VGAPGLQGER GERGIRGEKG
ERGLDGFPGK PGETGEQGGP GPPGVAGPQG EKVQREGLKG EQGAPGPRGH QGPPGPPGAP
GLMGPEGKEG PPGLQGLRVD EAVVQVAFSP AEEYELKLRS EELLNIRHCI HSGTVTTTTT
VTTINSIITT ATTITMTTIT MTTITNTITT TISSTTTTII TRVLLPRDSP LIGEFEPQRW
RFPPRCIIKL TLGNACLPVD FGFFSCPQGK KGDMGAPGLP GSSGLQGPSG PPGIPGPPGP
GGPSGLPGEI GFPGKPGAPG NPGPPGRDGL RGPPGLPGSK GEPGERGEGG LAGKPGPRGE
TGLPGAPGLP GVRGEKGDQG EKGGLGLPGL KGDRGEKGPP GELGPRGPTG PPDRALALWG
LSTVRRGLWE GLVCLESQSP LAMPVVPMQH IRAVGQGIVV SCPFQAFSSF FLWVPKDKMV
PMGLLEWQET PALLDLQGSL VPVASREAWA HLASEAPLGK MGSAVRREQQ VKKGTKGQLA
PGVILVLLGS PGHLEKGMMV NGGTKEHLGS LALLAAAVTQ ASGLLAPLDL EDCLDSPGPR
GQLARMVHQE VQEKGGPLGN RTYAVSALLV HQASLVYQGL KETKASQESQ AEKAWKGKRE
TQESGVQMVR LDRKVIRVVL EFQVSWGPQG SLGHQEQMES QELLDHQEFK GLRTPSQGFH
GVFMSRIFQG KEGPPGPQGP SGIPGIPGEE GKQGRDGKPG PPGEPGKVGD PGLPGPEGPR
GSPGFKGHTG DPGAPGPRGE PGATGPPGQE GTPGKEGDTG PAGPQGPRGP RGPLGNNGSP
GSPGDPGHPG TPGQKGSKGE NGNPGLPGFL GPRGPPDRPA RWPWCSVALV SLQGTPGKPG
EPGARGERGD PGIKGDKGLP GGKGQPGDPG IPGHKGHTGL MGPQGPPGES GPAGPPGPPG
QPGFPGLRGE SPSVDTLRRL IQEELGKQLE SRLAYLLAQM QPAHMKVSQG RPGPPGPPGK
DGVPGRAGLM GEPGRPGQGG LEGPSGPVGP KGERGAKGDP GVPGVGLRGE TGSPGIPGQP
GEPGYAKDGV PGSPGPQGET GPAGHPGPPG PPGPPGLCDP SQCAYFASLA ARPGNVKGP
//