ID G7Y9R8_CLOSI Unreviewed; 1673 AA.
AC G7Y9R8;
DT 25-JAN-2012, integrated into UniProtKB/TrEMBL.
DT 25-JAN-2012, sequence version 1.
DT 27-MAR-2024, entry version 47.
DE SubName: Full=Collagen alpha-2(V) chain {ECO:0000313|EMBL:GAA49702.1};
GN ORFNames=CLF_103446 {ECO:0000313|EMBL:GAA49702.1};
OS Clonorchis sinensis (Chinese liver fluke).
OC Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes; Trematoda;
OC Digenea; Opisthorchiida; Opisthorchiata; Opisthorchiidae; Clonorchis.
OX NCBI_TaxID=79923 {ECO:0000313|EMBL:GAA49702.1, ECO:0000313|Proteomes:UP000008909};
RN [1] {ECO:0000313|EMBL:GAA49702.1}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Henan {ECO:0000313|EMBL:GAA49702.1};
RX PubMed=22023798; DOI=10.1186/gb-2011-12-10-r107;
RA Wang X., Chen W., Huang Y., Sun J., Men J., Liu H., Luo F., Guo L., Lv X.,
RA Deng C., Zhou C., Fan Y., Li X., Huang L., Hu Y., Liang C., Hu X., Xu J.,
RA Yu X.;
RT "The draft genome of the carcinogenic human liver fluke Clonorchis
RT sinensis.";
RL Genome Biol. 12:R107-R107(2011).
RN [2]
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=Henan;
RA Wang X., Huang Y., Chen W., Liu H., Guo L., Chen Y., Luo F., Zhou W.,
RA Sun J., Mao Q., Liang P., Zhou C., Tian Y., Men J., Lv X., Huang L.,
RA Zhou J., Hu Y., Li R., Zhang F., Lei H., Li X., Hu X., Liang C., Xu J.,
RA Wu Z., Yu X.;
RT "The genome and transcriptome sequence of Clonorchis sinensis provide
RT insights into the carcinogenic liver fluke.";
RL Submitted (OCT-2011) to the EMBL/GenBank/DDBJ databases.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; DF142979; GAA49702.1; -; Genomic_DNA.
DR Proteomes; UP000008909; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0016020; C:membrane; IEA:UniProtKB-KW.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR CDD; cd00110; LamG; 1.
DR Gene3D; 2.60.120.1000; -; 1.
DR Gene3D; 2.60.120.200; -; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR000885; Fib_collagen_C.
DR InterPro; IPR001791; Laminin_G.
DR InterPro; IPR048287; TSPN-like_N.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1082; COLLAGEN ALPHA-1(X) CHAIN; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 3.
DR SMART; SM00038; COLFI; 1.
DR SMART; SM00210; TSPN; 1.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
PE 4: Predicted;
KW Collagen {ECO:0000313|EMBL:GAA49702.1}; Membrane {ECO:0000256|SAM:Phobius};
KW Reference proteome {ECO:0000313|Proteomes:UP000008909};
KW Transmembrane {ECO:0000256|SAM:Phobius};
KW Transmembrane helix {ECO:0000256|SAM:Phobius}.
FT TRANSMEM 31..50
FT /note="Helical"
FT /evidence="ECO:0000256|SAM:Phobius"
FT DOMAIN 1426..1673
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 295..325
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 367..622
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 643..1377
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 528..553
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 783..802
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 990..1009
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1182..1199
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1329..1360
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1673 AA; 168605 MW; B3D4CCFD55700B0C CRC64;
MPVTFLSTET MQLYRWGTRI QRRSNAISSQ LAMMFILFLS SIILSASGLL SGRPESPQGA
PELLRAIHLA EQAYVEAFPG HCRTRLDEFG QPRPLNEPTN AYSIARGISL YIEGSRLIPN
LPLSELSVLF TARVDPGFAG ELFTLYDRHG HIQLSVTFGR KLSIRYLVRP TALPHPRISQ
VDRNQQTETV GFQTRLDDGL WHRVAISMKN AEVRLFVDCR QRESAAYNRT TITVDSRGRM
RILKDGIQGA IQDLFVAPTA DLANKQCDVY TTDCLEDQMM GDEAASQIEE ALRGPMGATG
AMGPPGDKGE MGHNGSTGLP GEPGQPGPAG SLFVIPLNLG VGDANYARAA VFRDLLQKHL
LSLKGVRGAR GMTGNPGPDG MQGERGIKGE PGAPGETGFQ GPRGPLGMPG RPGPAGRPGL
DGSRGDPGAN GAPGEDGRPG DSGYMGPKGL KGEVGPRGPK GDVGLPGGEG PRGPVGEPGE
QGDVGLPGLQ GPVGIPGPQG LRGKPGGPGP HGRKGETGEQ GLPGLAGPPG PVGAPGPEGI
AGPRGPPGPP GIRGRPGAVG VPGSDGPPGF PGPKGDIGPK GESGDPGQKG EAGVQGPRGY
KGERGRMGAV GIKGDTGKPG PIGVAGEIGP KGLKGDIGLI GPVGRPGPMG EKGNPGPPGP
IGARGDPGGK GFAGAVGPPG PPGPDGERGP IGPTGPRGRD GDKGEVGGQG PPGDSGEQGQ
RGLPGPRGPP GRPGPIGIKG DAGAVGPVGP AGPIGEPGVQ GPMGPVGPPG QQGAVGRPGP
AGFPGPRGEP GERGPPGQPG PLGATGPTGA TGDPGPDGER GPVGPRGPTG DTGKDGLPGR
DGSPGQKGVR GPQGLPGGRG PRGFSGFDGL PGRSGPPGAK GETGPDGEVG PIGSKGYQGD
QGPMGPIGSP GSMGPRGPPG PAGEKGTTGL DGPYGLVGSP GPRGPQGRPG PVGQIGPQGF
EGEKGSKGAS GSPGVKGDKG ANGIRGPPGP VGSPGLPGIQ GPPGIPGQPG EEGPQGLSGP
EGTRGIAGSP GQVGMRGAPG ISGPPGEKGD AGTPGPRGRT GSPGLPGPPG QAGVPGDIGS
LGPPGMAGEK GDAGQQGPPG SPGFEGISGL QGPVGSPGEE GPPGLEGPLG DMGEQGIQGP
KGETGPPGLP GPAGVKGPQG AKGAPGGVGE DGSLGPRGKP GEPGSRGPPG PPGPEGSPGM
TGPQGPDGPR GQKGVKGETG EVGPPGEPGR RGPIGASGPR GPPGAMGQRG QPGSIGLPGS
QGDRGPDGVT GPHGDPGEAG KPGPKGIKGA DGPIGPMGPP GPKGEKGIPG LSGERGPPGP
KGAMGPQGPE GRPGPPGPPG TEGPDGSPGY PGPIGPPGPQ GAKGAPGAMG PPGPPGAIKI
LDVSEGYYRF EDMPRRRRNV DYSLTKQDPD EAITPSNVPS IGAVLRRVYE RIENLENSMN
MFDRPAGTRQ HPARHCRDIL RSSEAPEALK FGQYWIDPNL GSKVDAFLAE CRFRDGKIAQ
TCIQPVPESQ TLPLTQFKKT DREKEWWFSR LKTQIPNQNF TRTQLYYAPH NQIRYLQMLH
QTVEQSVTFL CRNTAVYYDT KQRNHKSAVR ARLFNEVEIN TYEDRRVRAA SGTMLLEIKV
RDECMERASH RSTSVFEFIA RETQMLPILD FQLIEFGDTD QEIGYYVDSV CFS
//