ID A0A1B0FIW1_GLOMM Unreviewed; 2687 AA.
AC A0A1B0FIW1;
DT 05-OCT-2016, integrated into UniProtKB/TrEMBL.
DT 05-OCT-2016, sequence version 1.
DT 27-MAR-2024, entry version 43.
DE RecName: Full=Hemocytin {ECO:0008006|Google:ProtNLM};
OS Glossina morsitans morsitans (Savannah tsetse fly).
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; Hippoboscoidea;
OC Glossinidae; Glossina.
OX NCBI_TaxID=37546 {ECO:0000313|EnsemblMetazoa:GMOY003789-PA, ECO:0000313|Proteomes:UP000092444};
RN [1] {ECO:0000313|Proteomes:UP000092444}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Yale {ECO:0000313|Proteomes:UP000092444};
RA Lawson D.;
RT "Genome Sequence of the Tsetse Fly (Glossina morsitans): Vector of African
RT Trypanosomiasis.";
RL Submitted (MAR-2014) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Proteomes:UP000092444}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Yale {ECO:0000313|Proteomes:UP000092444};
RG International Glossina Genome Initiative W.H.O.;
RA Lawson D.;
RT "Genome Sequence of the Tsetse Fly (Glossina morsitans): Vector of African
RT Trypanosomiasis.";
RL Submitted (MAR-2014) to the EMBL/GenBank/DDBJ databases.
RN [3] {ECO:0000313|EnsemblMetazoa:GMOY003789-PA}
RP IDENTIFICATION.
RC STRAIN=Yale {ECO:0000313|EnsemblMetazoa:GMOY003789-PA};
RG EnsemblMetazoa;
RL Submitted (MAY-2020) to UniProtKB.
CC -!- SIMILARITY: Belongs to the thrombospondin family.
CC {ECO:0000256|ARBA:ARBA00009456}.
CC -!- CAUTION: Lacks conserved residue(s) required for the propagation of
CC feature annotation. {ECO:0000256|PROSITE-ProRule:PRU00039}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CCAG010011041; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR STRING; 37546.A0A1B0FIW1; -.
DR EnsemblMetazoa; GMOY003789-RA; GMOY003789-PA; GMOY003789.
DR VEuPathDB; VectorBase:GMOY003789; -.
DR PhylomeDB; A0A1B0FIW1; -.
DR Proteomes; UP000092444; Unassembled WGS sequence.
DR GO; GO:0005576; C:extracellular region; IEA:InterPro.
DR GO; GO:0008061; F:chitin binding; IEA:InterPro.
DR CDD; cd00057; FA58C; 2.
DR CDD; cd19941; TIL; 4.
DR Gene3D; 2.60.120.260; Galactose-binding domain-like; 2.
DR Gene3D; 2.10.25.10; Laminin; 3.
DR InterPro; IPR002557; Chitin-bd_dom.
DR InterPro; IPR036508; Chitin-bd_dom_sf.
DR InterPro; IPR006207; Cys_knot_C.
DR InterPro; IPR000421; FA58C.
DR InterPro; IPR008979; Galactose-bd-like_sf.
DR InterPro; IPR002172; LDrepeatLR_classA_rpt.
DR InterPro; IPR036084; Ser_inhib-like_sf.
DR InterPro; IPR002919; TIL_dom.
DR InterPro; IPR014853; VWF/SSPO/ZAN-like_Cys-rich_dom.
DR InterPro; IPR001007; VWF_dom.
DR InterPro; IPR001846; VWF_type-D.
DR PANTHER; PTHR11339; EXTRACELLULAR MATRIX GLYCOPROTEIN RELATED; 1.
DR PANTHER; PTHR11339:SF373; HEMOLECTIN, ISOFORM A; 1.
DR Pfam; PF08742; C8; 4.
DR Pfam; PF00754; F5_F8_type_C; 2.
DR Pfam; PF01826; TIL; 2.
DR Pfam; PF00094; VWD; 3.
DR SMART; SM00832; C8; 3.
DR SMART; SM00494; ChtBD2; 1.
DR SMART; SM00041; CT; 1.
DR SMART; SM00231; FA58C; 2.
DR SMART; SM00192; LDLa; 1.
DR SMART; SM00214; VWC; 2.
DR SMART; SM00216; VWD; 3.
DR SUPFAM; SSF57603; FnI-like domain; 1.
DR SUPFAM; SSF49785; Galactose-binding domain-like; 2.
DR SUPFAM; SSF57625; Invertebrate chitin-binding proteins; 1.
DR SUPFAM; SSF57567; Serine protease inhibitors; 2.
DR PROSITE; PS50940; CHIT_BIND_II; 1.
DR PROSITE; PS01185; CTCK_1; 1.
DR PROSITE; PS01225; CTCK_2; 1.
DR PROSITE; PS01286; FA58C_2; 1.
DR PROSITE; PS50022; FA58C_3; 2.
DR PROSITE; PS50068; LDLRA_2; 1.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
DR PROSITE; PS51233; VWFD; 3.
PE 3: Inferred from homology;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157, ECO:0000256|PROSITE-
KW ProRule:PRU00039}; Repeat {ECO:0000256|ARBA:ARBA00022737}.
FT DOMAIN 188..371
FT /note="VWFD"
FT /evidence="ECO:0000259|PROSITE:PS51233"
FT DOMAIN 707..773
FT /note="Chitin-binding type-2"
FT /evidence="ECO:0000259|PROSITE:PS50940"
FT DOMAIN 919..1073
FT /note="F5/8 type C"
FT /evidence="ECO:0000259|PROSITE:PS50022"
FT DOMAIN 1108..1257
FT /note="F5/8 type C"
FT /evidence="ECO:0000259|PROSITE:PS50022"
FT DOMAIN 1556..1736
FT /note="VWFD"
FT /evidence="ECO:0000259|PROSITE:PS51233"
FT DOMAIN 1870..2073
FT /note="VWFD"
FT /evidence="ECO:0000259|PROSITE:PS51233"
FT DOMAIN 2231..2299
FT /note="VWFC"
FT /evidence="ECO:0000259|PROSITE:PS50184"
FT DOMAIN 2552..2660
FT /note="CTCK"
FT /evidence="ECO:0000259|PROSITE:PS01225"
FT DISULFID 2600..2654
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00039"
FT DISULFID 2604..2656
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00039"
SQ SEQUENCE 2687 AA; 301952 MW; D5981FBC7CE3F1FB CRC64;
KYCAWVTEDI FQDCHWTVEP EQYYEDCLYD VCACKEEPEQ CYCPILSAYG AECMRQGIKT
GWRLAVKECA IKCPNGQVYD ECGDSCSHTC EDLATKNTCQ RDCVEGCRCP HGEYLNENNE
CVPQSKCHCS FDGMTFKAGY KEVRPGRNFL DLCTCRNGLW ECEDAEPGDD KNECQAGYWQ
CSDNGCESTC SVWGDSHFTT FDNHEFDFQG ACDYVLSKGV DPNGDGFTII IQNVLCGTMG
VTCSKSVEIS LTGNIHDTLT LTSDSSYLAD PSKSLMKKLR DTVNAKAHGA FHIYKAGVFV
VIEVVPLHLQ VKWDEGTRVY VRLGNEWKNK VNGLCGNYND NAMDDMQTPS QSPETSPLIF
GHSWKVQKYC AMPTQPIDAC KEHPERETWA QLKCGILKSP MFKDCHAEVP FERYLKRCIF
DTCACDQGGD CECLCTAIAA YAHACSQKGI NIRWRKPHFC PMQCDPHCSE YKACTPACAV
ETCDNFLDQS AAEHMCKNEN CVEGCLIKPC EEGLIYSNDT YRHCVPKSEC KPVCMVKDEK
TYYEGDILYQ DACSTCRCSK RKEVCSGVRC TEDIITDKPS KIIEGTTLKP ISDEMQSKCI
KGWTRWFDND NDSSGKLVRL NDEEKLPRYN RGESIYGTCQ TKYMKEIECR VVNSHEPSDF
MDENVDCNLQ NGLTCVGQCH DYEIRVFCQC QEKEISVFPH TEKPEIGQKC DSLMAEYKEH
PEDCHRFLHC TPKSTGEWTY VEKTCGDNMM FNPVMNTCDH IRVVQELKPM CNINDTDVDI
CPEGQIMSDC ANQCEHTCHF YGMILKKRGL CKEGEHCRPG CVPKERADCH DGGKFWRDEN
TCVEADECPC MDQSEKYVQP HMPFVGELEV CQCIDNAYTC VPNKIVVATL APVTNATPIT
DTTQFEEVFP TTIATLKHCD PELLTSVIQG AEPLPDSVFS ASSNLGPKYV PQNGRLLPSS
KKQTDAWAPM INDQMQYLQV TLPEREPLFG IAMAGHPDFD NYVTLFKILY SQDGVAYHYL
VDDTEMPQLF NGPLDSRIPV KALFKIPIEA KSVRIYALKW HGSIAMRVEL LGCGTTDSMT
TTEFVTEPSV RMSTPEEFLK EWEEEEKCID EMGVDNGKMG PNQIKVSSIW RISKPAQKPR
LIDSLKLSSN DGWKPMINTP NEYIEFDFLE PRNISGFITK GGPEGWVTGF KVLFSKNKLI
WNTALTPEGQ PKIFRANQDS DTEQISKFKT PILTQYIKIV PAKWENNINM RVEPLGCFIE
YPITRDSLVE LEQPKHPKCL NCEGVVDPNS KEGDCKCQDG LYWDGSSCVQ SNLCPCVENY
ITYPIGSKYE NRDCEECVCV LGGHSNCKPK KCPPCTDNLR PVVGSGCYCK CEPCPKEQKL
CPSSGDCIPE VLWCNGVQDC ADDEDATCRD KYDVIPEKLV QNDTEIITCP VPECPPQMKV
KLTEKKLRKM SQMFSSTFTE KYTVTREGNK VTKTKIITSS EEILPSTKQE IDFNREQACD
EFICVPIPVK IIKRNETTSC PDPKCPSNYD VEVDRSTSKP GDCPRYSCIL KPTKDDVCEI
SGKTFTTFDG IEFKYDTCSH ILARDLINSS WIITVHLQCT DESRKICRKM ITIKDMEKQS
ILTIMPNMRV NFNGFEYTVQ QLINSPICKA SFVVSQPGNT VLVVSPKRGF WVLYDDIGYI
KIGISSKYIK TVDGLCGFYN GVASDDKRTP NGTIIANTVK FGDSWFDKRI PKEECYPQTH
PTFLKCGKSV NYKQFVSKCM ETTCECLKAS NGDAKSCKCN LLQDFVKKCL TVNPNIQMDT
WRAVHMCEIT CPAPLVHSDC YKRRCELSCD TLNSDDCPVI SDACFSGCYC PEGTVRKGET
CVALADCKDC VCNTIGSSKY FTYDRNSFTF NGNCTYLLSR DIVLPGVHTF QVYVTMDDCH
KLGRTTAPEF SSCAKSLHIL NGDHVIHIQR SSDNPKALKV FVDGFEVKKL PYKDAWINLR
EVVGKELILT LHESHVELKA AFDDLIFSIG VSSVKYGSKM EGLCGDCDGN PNNDFQENPA
KKKKRKPSQD FVDIMNSWRA DEPKLGLDAS ECLSEEVVKE DCLPLPPEKD PCLWFFEETI
FGKCNMIVDP VVYVSACQQD ICKVGNDQKG ACESLSAYAN ECAKHGICLN WRRADLCPYD
CPLDMTYDAC GCSKTCETMK LLTEFQAVNM KTSAVVNTVS TDDICPIEER FEGCFCPPGK
VMENGKCIRE EMCVKCEDSD HILGDRWQKD KCTECLCDKN GKTQCVEHKC SVEENICAEG
YKPQKIVNED QCCPRYACVP EPKLPPPDVC LEPIMPVCGP GQFKKQKTGP DGCPQYICEC
KPKEECDPLE LVGYLKPGQT VVKEEIGCCP TQKVVCDKNI CPPKPQQCGM EFYEVYTKQE
PDDCCPSYEC GPPKDLCIVD YGPGKKFTKK IAEKWMDPRD PCKHEMCAYG PNDSTQVITT
TDVCEKKCLK GFKYVNADPT KCCGECIQTA CLHHDVLYEP EDTWKSTDNC TTYKCIKVGT
SLIVNSAQET CPDVSQCFGE LLEDEGGCCK ICKETPKSEY LKNCLPLSLA DTETVNLIKV
FKAGHGHCKN NQPILGFTEC MGTCNSGSKY NALTFAHDKV CHCCNIKSYK KITIPLTCDD
GVQVLKELDI PAACGCQPCS DSIEYHTDSF LDVRLQPLPL AQLIGGH
//