ID G7YSI1_CLOSI Unreviewed; 5356 AA.
AC G7YSI1;
DT 25-JAN-2012, integrated into UniProtKB/TrEMBL.
DT 25-JAN-2012, sequence version 1.
DT 27-MAR-2024, entry version 33.
DE SubName: Full=Collagen alpha-1(VII) chain {ECO:0000313|EMBL:GAA55911.1};
DE Flags: Fragment;
GN ORFNames=CLF_109355 {ECO:0000313|EMBL:GAA55911.1};
OS Clonorchis sinensis (Chinese liver fluke).
OC Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes; Trematoda;
OC Digenea; Opisthorchiida; Opisthorchiata; Opisthorchiidae; Clonorchis.
OX NCBI_TaxID=79923 {ECO:0000313|EMBL:GAA55911.1, ECO:0000313|Proteomes:UP000008909};
RN [1] {ECO:0000313|EMBL:GAA55911.1}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Henan {ECO:0000313|EMBL:GAA55911.1};
RX PubMed=22023798; DOI=10.1186/gb-2011-12-10-r107;
RA Wang X., Chen W., Huang Y., Sun J., Men J., Liu H., Luo F., Guo L., Lv X.,
RA Deng C., Zhou C., Fan Y., Li X., Huang L., Hu Y., Liang C., Hu X., Xu J.,
RA Yu X.;
RT "The draft genome of the carcinogenic human liver fluke Clonorchis
RT sinensis.";
RL Genome Biol. 12:R107-R107(2011).
RN [2]
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=Henan;
RA Wang X., Huang Y., Chen W., Liu H., Guo L., Chen Y., Luo F., Zhou W.,
RA Sun J., Mao Q., Liang P., Zhou C., Tian Y., Men J., Lv X., Huang L.,
RA Zhou J., Hu Y., Li R., Zhang F., Lei H., Li X., Hu X., Liang C., Xu J.,
RA Wu Z., Yu X.;
RT "The genome and transcriptome sequence of Clonorchis sinensis provide
RT insights into the carcinogenic liver fluke.";
RL Submitted (OCT-2011) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: Lacks conserved residue(s) required for the propagation of
CC feature annotation. {ECO:0000256|PROSITE-ProRule:PRU01005}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; DF144112; GAA55911.1; -; Genomic_DNA.
DR Proteomes; UP000008909; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR CDD; cd19941; TIL; 1.
DR CDD; cd01450; vWFA_subfamily_ECM; 1.
DR Gene3D; 2.10.25.10; Laminin; 1.
DR Gene3D; 3.40.50.410; von Willebrand factor, type A domain; 1.
DR InterPro; IPR036084; Ser_inhib-like_sf.
DR InterPro; IPR003582; ShKT_dom.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR PANTHER; PTHR24020; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24020:SF70; PH DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF00092; VWA; 1.
DR SMART; SM00254; ShKT; 3.
DR SMART; SM00327; VWA; 1.
DR SUPFAM; SSF57567; Serine protease inhibitors; 1.
DR SUPFAM; SSF53300; vWA-like; 1.
DR PROSITE; PS51670; SHKT; 2.
DR PROSITE; PS50234; VWFA; 1.
PE 4: Predicted;
KW Collagen {ECO:0000313|EMBL:GAA55911.1};
KW Disulfide bond {ECO:0000256|PROSITE-ProRule:PRU01005};
KW Reference proteome {ECO:0000313|Proteomes:UP000008909}.
FT DOMAIN 267..303
FT /note="ShKT"
FT /evidence="ECO:0000259|PROSITE:PS51670"
FT DOMAIN 307..345
FT /note="ShKT"
FT /evidence="ECO:0000259|PROSITE:PS51670"
FT DOMAIN 4307..4486
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DISULFID 316..334
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU01005"
FT DISULFID 325..338
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU01005"
FT NON_TER 5356
FT /evidence="ECO:0000313|EMBL:GAA55911.1"
SQ SEQUENCE 5356 AA; 615605 MW; 2B8E8F5A91BE35AF CRC64;
MVQHRYILWC KTVPVKDKLT AFVLCLLVAG ISLASDNGLC KPEKSEVLCS ASGDRWIKRR
EQFIFINGTG CKVRQTISEE IIVCPRALAS GSNEPCRRGR QLVEMIRFVP KKCVCVPVKY
SQEISCPCRS SRLSDEDCRS NNGVRVITYE RRAGDECERR HVRVNNETCT QGHLRFAIEE
HHNDSISGHL MRSRRSAGSD IEACGKNQIF VVNRNPCPDT CERVLFGKPN VCHRAQAGPG
CECRQGYVLD GILCVPPSQC GCLNAICVDR APAEKCHQWR AEGRCEKDQV AMSKVCRATC
HNCQQPCEDM ISTVQCEQLK QEGKCQDDFY RTVCRLTCSP QECRCPPCRI DKGECNPETR
NVEIRQTCFR LQRDGQCKPL ITKQFKPCGE CPMGLVREPG PCNYCQRQRQ VVLRTFFRED
GGLFRCTVVK HVKVEPCECQ PVNLIKSTCV KDKLLVRKLL TRKQTSCSQC SLRKIQLPNR
LIVCKKPELR SGNCYRKGPN QPLIRKIVAI HEVARQCKCQ PMVQVREEIC NCPPNAREGP
TCFAHRNQLA FKTTAYVVQG KQCVPRIGHE WIPRPACPPS DKADEHHGKF ECNPETCERT
WHKTTWHWEE CKCKPITLTE PSGKCCCPKP RRVLSPCQKG MQKEQLITYE LVEGQCIRHV
QNRLIPCACP PPQHSYKCRT DTNERMIKTV LYKRLPDKNC GRIESVFVLP TTCKEEGIRR
ITECVRKPGE SRPTRNLLIT TPKVDNCVCK EPEAKVVDEA CGCPEAQEPP VKLIHKCPET
CHQQHGDHCG TECNEVFIWQ RFVYDKDSDG NAPRCKTEVI RKIVKPCCCP RNRRITRSCD
PTGQYEMFKV TENRLRHDKC VPILRVIRRP VHCTSGFQQR VWGPVEADGT RQMKLLFNHV
KRCHCATQAV QRACSQQCPK PIQTTTCDPG SRELVHTVEN FIPIGCRCDK RVFKRRTSVL
CPEHSELIST HCSPETNIET SKYRRHFQHN CECREEEITK HRPCGCSEPM LSEPVCDAKN
NQIHRIQTVF DVVENKCIPR KLAVHKPVDC RPFELEEAKR NHGAPRVELH CDAKSGRGRL
VAEVWQPVNC RCVKRPQVVR EGVCQCPPPQ KVMHCDGTRN LWIRKTVTFH LDKKTLKCRR
HIREEAQPTY CQSARIRWGP CNPRTGRRKL HLIYHRRDHC HCIREMRVME KPCVCESKQV
KKQRQCDSLR GILHTVFLTQ HWSDKLHQCV PARIIRSTTI KCLPLARLFR APCSQGKMLE
KIVESYRDVN ECQCRKRIRT EVRDCRCPTK MLIDGKCDGH QTHWDELMVE QQWSEEKRAC
VVERIKRARH HCRCDEPKVF KECAKGVMHE TKIISSLNVR TGSCDRSVLQ RQYKPACESL
MGMESQAQHD QLFVRNRQTQ CDPNTCTRQL ETYTRRYDPG RCSCDWSLVK TRRCTCCGCP
KPKVQASCQN NRELVGQVIY YTTHQQRCGH QCIEKTHAIH SEVDCHEYPP PPNGHWNDCD
RKTCVQHFIA YTYEVQMCWC ALTKKVLATR ACCCPPNVRK EQKCTHGLPE LVTYHTVLRD
GRCVEVAKPE PLPLVCPDKK EIRHGECRPD SCRQPIFELG WRVDPETCKC QTYQVLQGER
ECCCLEKSNV RAVCVGSCSV VVQEIFKYDQ QNERCIKENH IQRVCPECPK PHVIREECDR
ENTCLQTIKH VDFVLEDCHC KRLERAEKVS CCCPKSTVLG TRCLEENGQV ETKTLFYELV
NGQCARRVKL DRMPVLCPPP GAAKLTPSCD PQTCLEPVLE QRWLRVGCRC VEQKPQQYRT
CCCSNRISRT KRICNPDGSS LLMTQSWRLH NGHCVPQVIT KRLDVPVCHP QRVTALGTCD
TVTRRQVALV ERFAVKQCRC QAIFREKIER ICACPAPVEH AEPCQHATCL QRIVRIPWAL
DEQKGQCVRL PLQVQMRPCC CLREKEPPVE GTRCNPLTGE VELTRRSFVF NLVKRVCEVQ
EDKRYNKLDC PSHGRILRGQ CNPLNGIAID QVEMWEARLS ECQCRRLRKT IKRICSCSHL
DHVLPPMCLA DEGVLIKKHV VHHLRQNGCQ PEEKILKETI VCAEGEHPQF QCNPTTCDSV
RTVSWVERVG CKCVPRVREE RGKCCCPEPT EQQECTNGGR LLVLHKLSFI LDKTRQTCVP
QKDKVQKEIE CHETGPHVLK EYCDKETCHP VALLRQVVLR NCQCGNLVLR KRNKNTLCCC
PPPRFKTNCY PQYGVISRVM YRYELFKGHC ISRKFVDQDK IVCPKERINR GVCNRATGMR
PLLRRYYELV GCKCHLRIRK DTEPCSCPLP ETVKLPCDPA KPIRKVVRVS FDLEEDTINR
KMRCAKRIQH IRNEPCGCES TQIEKHCGRG ELVVIRKEHQ LTVKGDEPTC VQRTYVKRVP
VICAGELTKI YRSQCEGFRR KVIHVREFVD ANTCECRNQA RVTFEACNCH LKNRVHQDCK
NGVLLVSKEL YTSKPFVDHC IITTTRGVHP VVCKGRHEIV QTSGCTIQKD NGLFHSEEVR
WQQVENCQCI TRRKQTLRLC GCPEPQLTKR CLDTVSMAVY KTTFIRIADE CVPSQEVLTS
EVRCTDPAKI VERSTCETAG VDPSTNSLPG CFETFQVAVR QVEDCKCVEK LLSVRRRCCT
PEPVVTRQCD MTRGAWVTTL KKFTLVPGKN LFELGSVVIR DHVLNVVREQ QDQRVVCPEP
KVHENCDAQT GLWTRTVTRF EMANCHCIPM RQVERGRCQC PPPRITETEC KNNFRLRISD
SYELINGKCV QRHDSKRIRC GCPKPVRRVY CDGEGQWVKC YTQFLLNPAA STCRLVKHCV
RWYQDCPKDT TRVSVACSAQ TEFKHTLQRV RFVNNPKTCH CDPQILDQWT EMCQCDHLKR
KWLRCKLGLI EIKQLTHHLE NGDCVPKMVK HVKLPVCPKP HVHIRACDRN PNSPMKGWIL
KAVDTYHVRN CQCVRKRRLI KKPCDCNLIH PPRHARRCEE PAILHIKAIF WTQTEEKCIP
GFAIYTKHIK CPTERKLRLS PCVEGMDGIG RRLVEQIEQH VEHCECVWKV VVTQHRICKC
PPPKHSKECV DGGKAILRTL IAYTLKDDMC APIQRQIKEN PCERQELPPV PPDQTGRCNP
LTCSTERIHY QRVSVNCNCI TQRVVIKEAC CCPRPSKPVL RCHTARNVML HQQAFYQFKP
AMGEQRAHCL PVMHVKSIAI NCGPKLQRIL VKKCDGEFHR VMIIRTVVEN CLCKQRPTKE
LRIRCGCPRV IRRLPGPCIN QWANDRWIGF QAVVVKRSNL LKVNRVKCKP VIFEQKQRRC
ACPEPTEAVE CEGGKLRVKY RTIYKLNEEL NNCEKHVFRK EDNPVCPPTK VVQSQCGPES
NYEQTEITRA YAVENCECTP RLTKVKWICD CSARFPVQRL VHCLPLGTQR LVEIRRWIQT
GRECKPLVEK RTEPVECPKQ LHVIKGECDR EIPNRRKLIW LQQRPVNCRC VWHRLIHPNP
SVHRIKSSVA CRCRPKITVR QCRPPVYGQL AQMRMVTVNF QLRDGECRPE SRVDLHPVDC
KNGVQLMNGE CDPMTNTRIV KRIISHLEMN ECRCVTREFE RRCHCSCPPT KLYLHCQRQQ
GLVRHIKIAH FLAEDKCLCK AKRAIRTSKV VCPIKKRLIQ RGPCQHIPGK SMDSSDKYRQ
VVWEVSHRDG CQCVTKRVVN SEPCFCTPQP IVEKRCLDDQ LLETTHSERR LSEVDHKCIR
VVVNKIAKPV DLGKPEVIVR CNPQTGIEEV TELSPYAVNC KRNVRVKKSH RRCKCKVQPQ
LIHRDQCDST CQQRLVWRRE VQGEDGSCRL QHRLERRACC CPQTQQLEPR CNAAKRLLEV
GFRTFTLHAG VCVPQDQFMN KAIVCEPNEQ VTQHPQPNGF VRVERHFNEL DGCNCVRKKE
VRLNKWNCPE PITKQRCTNP EPGLYTIETV STRWHVPKES PVCSRLDTVV DSNPVDCSAK
QLVRADACQL EPQRHATVRV DHLFTSVNDG CRCVRNPPTL KVHVCQCMKP EDHVMCDRER
GLLEHVHIKY ELSGDGTRCI PHKTKRIWRP SCSPAAPRHV RTTPCDPSTG LMYQIFEEVG
QKDCRCVNLE KRVPIRCHCA KPIQESRCLS PQIREIRTTT FHLTPEGTCG KTVLTHKEEV
RCQLLPQLLQ SSPTYRVQRS QDKPNLIQHV HACHSAGRCL YKVREFLVAP DQSCACRWKF
RELVRACCCP TDPQPQPLSS RCDASKGLIV FRKVDWNKHN GVCRPNMHER VKQIVCEHEE
TMKPLGPCQD GRQKKLVKRW IRVGCKCVLH QQVVVKPCVC HTVSTAEVVF MIDGAAAGRQ
TQYQERVQQL LFKTIEQFLL SHETKSQSNS FRFAVMTYAD RPEIVFNLQN YEDPNRILEQ
VQELTLRGER ANLALALHTM LREVLPLKRP DVPVLLYIVT DGRDQDEDTV RLIGQVHRAQ
IQVTVAAMGA EPYGMNYLGA LVTPPKAVHL LRISNTDNVN QYLTRISETL CSRACPANYV
KSSDCSRETG CIGRSYIHTY QFDGHKGQCV GATRVQTRRC CCIQQAPPPV RQCEQNRLVQ
IISNWQLTKD GTCSKFVIKR DATPLLIQQC SPPVDVRVGQ CNAEGYAVEL TVRRFLDNCE
CKESKSRRIV RCRCSPVHET KRCLTDEIQI REITSQELIN GECHSRKLVK KTKLKCPPPL
VYRSDCDSMT CQRRVTIVEH VPEQCICKRF ARVTHETCCC KGQKAIRYEG CRYNALKTFV
EESIEPKLTD GACVKRTRHR FEPVACPRNP TTVRHQCRHV STMESGLRQK IAADPALIFR
QVEQFWWQMD SCECRQMRRA HFEACGCDQS AMPAESQFVH RCNQQNGVVI TYKQTLKLEV
DGTRQSSDQT LMIPDLSHAK CRPEHTIVSA RKIVCPEPKV FHTACETAAD GREYRAVQVH
RWNRQGCACH NLPPELVERQ LCSCRPEHVS KQCVTSRPGG PENRMIVKHI REEIVVTQDP
SGKRVRLCKL APIEEKVHLI SCPRSQIKYS GCVNGLVHIT LRLNLVTECQ CHMRIRRLVT
KCSSLPGAVN PAMLHGPVSS NNNVAVKSQA TQSPIKGHAQ LVLPVYQIAD CIDLLPVKQC
STLEHPPHQV CEKPGQIRDL LCRRTCKQCT TCPLDQTKFS VVLGRLKDAC IVSDKRYERM
FFIRALVKRG DLAACKLACA KTPECSSIDY YVPRNDEADQ PKCILNRVDP PTLMKRLDPG
TVRIGSTEQD LERWNARKCI LFRKTCAAEC PKPQTIELST CECHQFRPNY QVDPTEDQLK
PVTHCAKQVR VRYFTQTSYG RCVAKEWIGM TPCVVGTTTV LDGPQVNCLD TKPPLWCQSV
ITPNPTVCRD ISMKKV
//