ID G5ATA4_HETGA Unreviewed; 1448 AA.
AC G5ATA4;
DT 14-DEC-2011, integrated into UniProtKB/TrEMBL.
DT 14-DEC-2011, sequence version 1.
DT 27-MAR-2024, entry version 50.
DE SubName: Full=Collagen alpha-1(XVIII) chain {ECO:0000313|EMBL:EHB00265.1};
GN ORFNames=GW7_13728 {ECO:0000313|EMBL:EHB00265.1};
OS Heterocephalus glaber (Naked mole rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Glires; Rodentia; Hystricomorpha; Bathyergidae;
OC Heterocephalus.
OX NCBI_TaxID=10181 {ECO:0000313|EMBL:EHB00265.1, ECO:0000313|Proteomes:UP000006813};
RN [1] {ECO:0000313|EMBL:EHB00265.1, ECO:0000313|Proteomes:UP000006813}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=21993625; DOI=10.1038/nature10533;
RA Kim E.B., Fang X., Fushan A.A., Huang Z., Lobanov A.V., Han L.,
RA Marino S.M., Sun X., Turanov A.A., Yang P., Yim S.H., Zhao X.,
RA Kasaikina M.V., Stoletzki N., Peng C., Polak P., Xiong Z., Kiezun A.,
RA Zhu Y., Chen Y., Kryukov G.V., Zhang Q., Peshkin L., Yang L., Bronson R.T.,
RA Buffenstein R., Wang B., Han C., Li Q., Chen L., Zhao W., Sunyaev S.R.,
RA Park T.J., Zhang G., Wang J., Gladyshev V.N.;
RT "Genome sequencing reveals insights into physiology and longevity of the
RT naked mole rat.";
RL Nature 479:223-227(2011).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; JH166860; EHB00265.1; -; Genomic_DNA.
DR STRING; 10181.G5ATA4; -.
DR eggNOG; KOG3546; Eukaryota.
DR InParanoid; G5ATA4; -.
DR Proteomes; UP000006813; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR Gene3D; 2.60.120.200; -; 1.
DR Gene3D; 3.40.1620.70; -; 1.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 1.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR010515; Collagenase_NC10/endostatin.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR048287; TSPN-like_N.
DR InterPro; IPR045463; XV/XVIII_trimerization_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1034; COLLAGEN ALPHA-1(XVIII) CHAIN; 1.
DR Pfam; PF01391; Collagen; 4.
DR Pfam; PF20010; Collagen_trimer; 1.
DR Pfam; PF06482; Endostatin; 1.
DR SMART; SM00210; TSPN; 1.
DR SUPFAM; SSF56436; C-type lectin-like; 1.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
PE 4: Predicted;
KW Collagen {ECO:0000256|ARBA:ARBA00023119, ECO:0000313|EMBL:EHB00265.1};
KW Reference proteome {ECO:0000313|Proteomes:UP000006813};
KW Signal {ECO:0000256|ARBA:ARBA00022729}.
FT DOMAIN 170..358
FT /note="Thrombospondin-like N-terminal"
FT /evidence="ECO:0000259|SMART:SM00210"
FT REGION 399..1155
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 420..434
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 573..587
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 619..638
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 668..683
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 725..742
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 964..980
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1066..1149
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1448 AA; 146319 MW; 17B572279FE196FC CRC64;
MLLRQCVRTG ASADAGSKLR PVLRLPEEEQ GGVCHAPRCV AGGGSTQSPV SSSLGLPCLW
GAKSPGSFTE APETLVHPAW LSCSLGGGCG VPGPVGHLVC VFARQLCELG PGPGCHTADV
LVAPLSGPHP DPLRGLRGYP LFPPGPHVEG AWLLSKAAGP VLVGESAHQE VGLLQLLGEP
LPQQVRAIDD PEVGPAYVFG LDANSGQVAQ YHLPSPFFPD FSLLFHVRPD TEDAGMLFAI
TDAAQAAVVL GVKLSGVQDQ HQNISLLYSE PGTSQTHTAA SFRLPAFVGQ WTRFALSVDA
GSVALYVDCE EFQRLPFAQS LGRLELEPGA GLFVGQAGGA DPDKFQGAIA ELWVRGDPRV
SPVHCLDEED EDKEGASGDF GSGLEETGEL HRETGAMTQT LGLPLPPPVT PPPLARGGSV
EDARMEEMEE PSTAASPRHE AVPAKTLPGP GSNDAWDESA WSPGSSQGKG GLKGQKGEPG
VQGPPGPVGP QGPGGSMMHG PGAQGPPGPP GNDGAPGRDG EPGDPGEDGR PVSLPFFLPL
DGGGPLGPQG FPGPPGDVGP KGEKGDLGVG PRGPPGPQGP PGPPGPSFRH DKLVSLPGEP
GRFGLSPRAA GPHPPCVSVQ GPRGFPGPPG PPGVPGLPGE PGRFGVNSSH VLGPPGLPGV
PGRDGSPGFP GSPGPPGPPG KEGPPGRVGQ KGSPGDVGAP GPKGSQGDPG PEGSPGETGF
AGAPGPAGPP GPPGPPGPPG PRLGAGFDDM EGSGAPFWST AGGSAGLQGD PGLPGDKGEV
GADGARGFPG LPGSEGPTGP MGPKGEKGSR GERGDPGKDG VGQPGLPGPP GPPGAVVYVS
EPGVRTHMGP EGAAVLQPGF AGFPGPAGPK GDEGSKGAPG SPGPKGERGE PGAIFSPDGR
ALGPAHKGAK GEPGFRGPPG PYGRPGHKGE IGFPGRPGRP GMNGLKGEKG EPGDASLGSG
VRGMPGPPGP PGPPGPPGTP VYDSNAFVES GRAGPPGPRG LPGPSGPKGD KGEVGPPGPP
GQFPLDLLRL GAEMKGEKGD RGDAGQKGQM GTPGASGGIF GSSVPGPPGY PGPPGIPGPK
GDSIQGPPGP PGPQGPPGIG YEGRQGPPGP PGPEGRQGPP GPPSFPGPHR QTVSVPGPPG
PPGPPGPPGT MGTSSGVRAW ASYEAMLEKV HEVPEGWLLF VAEDEALYVR VRNGVRKVLL
EARMPLPSAT DNEVAALQPP LVQLHGGGPY PRREHSYSTA RPWRADDILA SPPRLPAPQP
YPGAPHRHGA YMHLRPASPT SLPTHTHHDF QPVLHLVALN SPLPGDMHGI RGADFQCFQQ
ARAVGLRGTF RAFLSSRLQD LYSIVRRADR GAVPIVNLKD EVLSPSWEAL FSGSRGQLKP
GARIFSFDSR DVLRHPAWPQ KSVWHGSDPS GRRLTESYCE TWRTEAAVAM GWPAGCWSRA
PRVAATPT
//