GenomeNet

Database: UniProt
Entry: W6US03_ECHGR
ID   W6US03_ECHGR            Unreviewed;      1561 AA.
AC   W6US03;
DT   16-APR-2014, integrated into UniProtKB/TrEMBL.
DT   16-APR-2014, sequence version 1.
DT   27-MAR-2024, entry version 38.
DE   SubName: Full=Collagen alpha-2(I) chain {ECO:0000313|EMBL:EUB61137.1};
GN   ORFNames=EGR_03985 {ECO:0000313|EMBL:EUB61137.1};
OS   Echinococcus granulosus (Hydatid tapeworm).
OC   Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes; Cestoda;
OC   Eucestoda; Cyclophyllidea; Taeniidae; Echinococcus;
OC   Echinococcus granulosus group.
OX   NCBI_TaxID=6210 {ECO:0000313|EMBL:EUB61137.1, ECO:0000313|Proteomes:UP000019149};
RN   [1] {ECO:0000313|EMBL:EUB61137.1, ECO:0000313|Proteomes:UP000019149}
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX   PubMed=24013640; DOI=10.1038/ng.2757;
RA   Zheng H., Zhang W., Zhang L., Zhang Z., Li J., Lu G., Zhu Y., Wang Y.,
RA   Huang Y., Liu J., Kang H., Chen J., Wang L., Chen A., Yu S., Gao Z.,
RA   Jin L., Gu W., Wang Z., Zhao L., Shi B., Wen H., Lin R., Jones M.K.,
RA   Brejova B., Vinar T., Zhao G., McManus D.P., Chen Z., Zhou Y., Wang S.;
RT   "The genome of the hydatid tapeworm Echinococcus granulosus.";
RL   Nat. Genet. 45:1168-1175(2013).
CC   -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC       whole genome shotgun (WGS) entry which is preliminary data.
CC       {ECO:0000313|EMBL:EUB61137.1}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; APAU02000023; EUB61137.1; -; Genomic_DNA.
DR   STRING; 6210.W6US03; -.
DR   EnsemblMetazoa; XM_024493234.1; XP_024352333.1; GeneID_36339700.
DR   OMA; HCREIME; -.
DR   OrthoDB; 2970887at2759; -.
DR   Proteomes; UP000019149; Unassembled WGS sequence.
DR   GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR   GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR   Gene3D; 2.60.120.1000; -; 1.
DR   Gene3D; 2.60.120.200; -; 1.
DR   InterPro; IPR008160; Collagen.
DR   InterPro; IPR013320; ConA-like_dom_sf.
DR   InterPro; IPR000885; Fib_collagen_C.
DR   PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR   PANTHER; PTHR24023:SF1082; COLLAGEN ALPHA-1(X) CHAIN; 1.
DR   Pfam; PF01410; COLFI; 1.
DR   Pfam; PF01391; Collagen; 10.
DR   SMART; SM00038; COLFI; 1.
DR   SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
DR   PROSITE; PS51461; NC1_FIB; 1.
PE   4: Predicted;
KW   Collagen {ECO:0000313|EMBL:EUB61137.1};
KW   Reference proteome {ECO:0000313|Proteomes:UP000019149}.
FT   DOMAIN          1318..1561
FT                   /note="Fibrillar collagen NC1"
FT                   /evidence="ECO:0000259|PROSITE:PS51461"
FT   REGION          143..191
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          223..754
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          802..1271
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        144..158
FT                   /note="Acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        159..173
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        382..417
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        647..661
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        686..702
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        1072..1087
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        1129..1144
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        1217..1240
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        1250..1269
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ   SEQUENCE   1561 AA;  155054 MW;  4CFC782A935FD184 CRC64;
     MKPGYKGELF IIYSSQGRVL LSVGVDKQVV VAYQEAAAGV RAARISTSGT AIHSEKIGPA
     LDDNEWHRVG LNFKDGRVRL AVDCKAVEES TIVLPVKFID KRNTLSLLPH GFNGVLQDLM
     LVLGDRSLEQ QCLIYTPDCP TDVDDAMMND EGEPGEQGDP GEPGEKGEPG KPGPRGPDGE
     QGPPGPTGSL FVVPLNLGIG GGANARAAHF RQLLQQHLES LRGVHGPRGL TGPTGPDGPM
     GPPGLKGETG PQGDTGLQGR RGPMGPVGFP GPRGKTGRDG DRGDPGPAGP KGPDGLPGDS
     GLMGPKGLRG PRGPPGPVGE QGSPGEEGSR GPTGDAGDQG DMGSYGPRGP PGLQGPLGPR
     GKTGSRGPVG LQGEVGEPGP TGMPGPMGPP GPAGPEGPAG PRGVVGPPGP PGKQGPPGAT
     GPEGRPGYPG SKGEMGLKGD TGPEGQKGEP GLPGPRGVKG ARGIVGISGP KGDTGKPGAV
     GVPGEAGPKG LKGSRGSPGV MGLMGPQGEK GEPGKSGPFG PRGDRGLQGP PGLMGPAGPQ
     GLLGDPGPTG PAGPQGREGD KGETGPVGPP GADGEQGQHG APGPRGIQGK VGPAGEKGDK
     GATGPPGPPG PTGELGFQGH PGPSGLPGPA GPMGRQGPQG FPGPRGEPGE IGPPGPRGPI
     GMTGPVGPAG EILNFFQQGP TGDPGKEGKD GPKGDKGPKG ERGNRGPPGQ RGPRGYLGLD
     GMPGPAGPEG FKGETGPDGE VGPTGPKGYR GDAGILGQVG PPVIQSISIW IGSLLPPEAT
     SKDKGNGTIF SISSEFLRLL TGPLGPRGAQ GIQGEKGPKG PDGAMGTVGP PGSRGSRGRS
     GPTGPIGAPG FDGEKGERGP PGPQGKKGPP GPQGPRGIVG PIGSAGMPGI QGPPGAPGQP
     GEPGPMGLIG PDGLTGPSGL PGPMGMEGAP GPIGKRGPKG DRGREGPQGS PGPMGPMGSM
     GLTGPAGNPG PTGPAGAVGE PGDKGQQGPR GSPGYEGSPG NQGPQGRKGP DGPAGQVGPP
     GDPGMEGPIG PQGPTGPLGP PGSQGARGIP GGRGSPGPVG EDGAPGNTGP QGKPGPRGPP
     GPQGPQGPAG QAGPQGLPGP RGSKGPKGEP GPQGPPGETG NPGPRGLSGP PGLRGPPGPA
     GLQGPPGVKG PEGEVGPEGI PGATGADGGK GETGPKGPKG AYGPVGPMGP PGPRGERGPT
     GVEGERGRIG PKGAQGPPGP TGDPGPPGDS GPPGPEGPMG VRGPVGPGGP RGPKGKPGPQ
     GPPGPPGPVK ILDLTAGYFK FEPGRTKRSI NEDELYDMKD PDASLFPNNV PAIGAVLRRL
     YARIESLESA VRYYRRPIGT RAYPARHCRE IMEATDSPHG PVSGEYWIDP NLGSSRDAIK
     VECKFSGSVA KTCVHATPES KALRLVNLRK SGGEGSWWFS KLLEGNTNGT QRLYYAPRNQ
     FNFLQLLHHR AEQSITALCR GSVVYYDSRN KNYDLAANLL LFNGKVINTH LDRRVRGEGG
     FVQLEVNIKD DCMDRSPTGS TANFDLVANH PELLPIIDMK MFDFGEDNQQ LGYYVNEVCF
     S
//