ID E2B4Q6_HARSA Unreviewed; 2196 AA.
AC E2B4Q6;
DT 30-NOV-2010, integrated into UniProtKB/TrEMBL.
DT 30-NOV-2010, sequence version 1.
DT 27-MAR-2024, entry version 72.
DE SubName: Full=Neurotrypsin {ECO:0000313|EMBL:EFN89337.1};
GN ORFNames=EAI_07659 {ECO:0000313|EMBL:EFN89337.1};
OS Harpegnathos saltator (Jerdon's jumping ant).
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Hymenoptera; Apocrita; Aculeata; Formicoidea;
OC Formicidae; Ponerinae; Ponerini; Harpegnathos.
OX NCBI_TaxID=610380 {ECO:0000313|Proteomes:UP000008237};
RN [1] {ECO:0000313|EMBL:EFN89337.1, ECO:0000313|Proteomes:UP000008237}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=R22 G/1 {ECO:0000313|EMBL:EFN89337.1,
RC ECO:0000313|Proteomes:UP000008237};
RX PubMed=20798317; DOI=10.1126/science.1192428;
RA Bonasio R., Zhang G., Ye C., Mutti N.S., Fang X., Qin N., Donahue G.,
RA Yang P., Li Q., Li C., Zhang P., Huang Z., Berger S.L., Reinberg D.,
RA Wang J., Liebig J.;
RT "Genomic comparison of the ants Camponotus floridanus and Harpegnathos
RT saltator.";
RL Science 329:1068-1071(2010).
CC -!- SUBCELLULAR LOCATION: Secreted {ECO:0000256|ARBA:ARBA00004613}.
CC -!- SIMILARITY: Belongs to the peptidase S1 family. CLIP subfamily.
CC {ECO:0000256|ARBA:ARBA00024195}.
CC -!- CAUTION: Lacks conserved residue(s) required for the propagation of
CC feature annotation. {ECO:0000256|PROSITE-ProRule:PRU00196}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; GL445584; EFN89337.1; -; Genomic_DNA.
DR MEROPS; S01.461; -.
DR InParanoid; E2B4Q6; -.
DR OMA; IMDCGPG; -.
DR Proteomes; UP000008237; Unassembled WGS sequence.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-SubCell.
DR GO; GO:0016020; C:membrane; IEA:InterPro.
DR GO; GO:0008061; F:chitin binding; IEA:InterPro.
DR GO; GO:0004252; F:serine-type endopeptidase activity; IEA:InterPro.
DR GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR CDD; cd00037; CLECT; 1.
DR CDD; cd00108; KR; 1.
DR CDD; cd00112; LDLa; 3.
DR CDD; cd01099; PAN_AP_HGF; 1.
DR CDD; cd00190; Tryp_SPc; 1.
DR Gene3D; 2.170.140.10; Chitin binding domain; 3.
DR Gene3D; 3.50.4.10; Hepatocyte Growth Factor; 1.
DR Gene3D; 4.10.400.10; Low-density Lipoprotein Receptor; 2.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 1.
DR Gene3D; 2.40.20.10; Plasminogen Kringle 4; 1.
DR Gene3D; 3.10.250.10; SRCR-like domain; 3.
DR Gene3D; 2.40.10.10; Trypsin-like serine proteases; 1.
DR InterPro; IPR001304; C-type_lectin-like.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR002557; Chitin-bd_dom.
DR InterPro; IPR036508; Chitin-bd_dom_sf.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR000001; Kringle.
DR InterPro; IPR013806; Kringle-like.
DR InterPro; IPR038178; Kringle_sf.
DR InterPro; IPR036055; LDL_receptor-like_sf.
DR InterPro; IPR023415; LDLR_class-A_CS.
DR InterPro; IPR002172; LDrepeatLR_classA_rpt.
DR InterPro; IPR003609; Pan_app.
DR InterPro; IPR009003; Peptidase_S1_PA.
DR InterPro; IPR043504; Peptidase_S1_PA_chymotrypsin.
DR InterPro; IPR001190; SRCR.
DR InterPro; IPR017448; SRCR-like_dom.
DR InterPro; IPR036772; SRCR-like_dom_sf.
DR InterPro; IPR001254; Trypsin_dom.
DR InterPro; IPR018114; TRYPSIN_HIS.
DR InterPro; IPR033116; TRYPSIN_SER.
DR PANTHER; PTHR24258; SERINE PROTEASE-RELATED; 1.
DR PANTHER; PTHR24258:SF128; TEQUILA, ISOFORM G; 1.
DR Pfam; PF01607; CBM_14; 3.
DR Pfam; PF00051; Kringle; 1.
DR Pfam; PF00057; Ldl_recept_a; 2.
DR Pfam; PF00024; PAN_1; 1.
DR Pfam; PF00530; SRCR; 3.
DR Pfam; PF00089; Trypsin; 1.
DR PRINTS; PR00018; KRINGLE.
DR PRINTS; PR00261; LDLRECEPTOR.
DR PRINTS; PR00258; SPERACTRCPTR.
DR SMART; SM00494; ChtBD2; 3.
DR SMART; SM00034; CLECT; 1.
DR SMART; SM00130; KR; 1.
DR SMART; SM00192; LDLa; 3.
DR SMART; SM00473; PAN_AP; 1.
DR SMART; SM00202; SR; 3.
DR SMART; SM00020; Tryp_SPc; 1.
DR SUPFAM; SSF56436; C-type lectin-like; 1.
DR SUPFAM; SSF57414; Hairpin loop containing domain-like; 1.
DR SUPFAM; SSF57625; Invertebrate chitin-binding proteins; 3.
DR SUPFAM; SSF57440; Kringle-like; 1.
DR SUPFAM; SSF57424; LDL receptor-like module; 2.
DR SUPFAM; SSF56487; SRCR-like; 3.
DR SUPFAM; SSF50494; Trypsin-like serine proteases; 1.
DR PROSITE; PS50041; C_TYPE_LECTIN_2; 1.
DR PROSITE; PS50940; CHIT_BIND_II; 3.
DR PROSITE; PS50070; KRINGLE_2; 1.
DR PROSITE; PS01209; LDLRA_1; 2.
DR PROSITE; PS50068; LDLRA_2; 2.
DR PROSITE; PS50948; PAN; 1.
DR PROSITE; PS00420; SRCR_1; 1.
DR PROSITE; PS50287; SRCR_2; 3.
DR PROSITE; PS50240; TRYPSIN_DOM; 1.
DR PROSITE; PS00134; TRYPSIN_HIS; 1.
DR PROSITE; PS00135; TRYPSIN_SER; 1.
PE 3: Inferred from homology;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157, ECO:0000256|PROSITE-
KW ProRule:PRU00196}; Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Hydrolase {ECO:0000256|ARBA:ARBA00022801, ECO:0000256|RuleBase:RU363034};
KW Kringle {ECO:0000256|ARBA:ARBA00022572, ECO:0000256|PROSITE-
KW ProRule:PRU00121};
KW Protease {ECO:0000256|ARBA:ARBA00022670, ECO:0000256|RuleBase:RU363034};
KW Reference proteome {ECO:0000313|Proteomes:UP000008237};
KW Serine protease {ECO:0000256|ARBA:ARBA00022825,
KW ECO:0000256|RuleBase:RU363034}; Signal {ECO:0000256|ARBA:ARBA00022729}.
FT DOMAIN 215..272
FT /note="Chitin-binding type-2"
FT /evidence="ECO:0000259|PROSITE:PS50940"
FT DOMAIN 297..354
FT /note="Chitin-binding type-2"
FT /evidence="ECO:0000259|PROSITE:PS50940"
FT DOMAIN 393..450
FT /note="Chitin-binding type-2"
FT /evidence="ECO:0000259|PROSITE:PS50940"
FT DOMAIN 1126..1230
FT /note="SRCR"
FT /evidence="ECO:0000259|PROSITE:PS50287"
FT DOMAIN 1244..1380
FT /note="C-type lectin"
FT /evidence="ECO:0000259|PROSITE:PS50041"
FT DOMAIN 1398..1481
FT /note="Kringle"
FT /evidence="ECO:0000259|PROSITE:PS50070"
FT DOMAIN 1523..1603
FT /note="Apple"
FT /evidence="ECO:0000259|PROSITE:PS50948"
FT DOMAIN 1650..1753
FT /note="SRCR"
FT /evidence="ECO:0000259|PROSITE:PS50287"
FT DOMAIN 1802..1902
FT /note="SRCR"
FT /evidence="ECO:0000259|PROSITE:PS50287"
FT DOMAIN 1949..2190
FT /note="Peptidase S1"
FT /evidence="ECO:0000259|PROSITE:PS50240"
FT REGION 28..95
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 135..191
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 450..517
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 550..867
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 919..1038
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 158..172
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 464..517
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 573..587
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 683..706
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 744..758
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 794..809
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 820..834
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 847..867
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 919..939
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 940..975
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 982..1032
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT DISULFID 1200..1210
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00196"
FT DISULFID 1614..1632
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00124"
FT DISULFID 1626..1641
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00124"
FT DISULFID 1722..1732
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00196"
FT DISULFID 1827..1891
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00196"
FT DISULFID 1840..1901
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00196"
FT DISULFID 1871..1881
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00196"
SQ SEQUENCE 2196 AA; 244502 MW; DAE035268454D780 CRC64;
MTASLDRFLV ALCAVHIIVF VGAIYDPRAT TTKPPERRES VLPWHSAAKG NESGIPAEGS
RAAKWQTYDD DDDDDVPSEP AWGPWRKKPG SEIEQRKLVD IQDQPAEQAP ERQHRPGAGE
WQQQENLEQP LFRREQPDKN HGGLADAPKY GSRTPPRNTI DFEHKLPPRY DDSEETEDED
LYEGVPLDST KPRKESILLS RKPQHTIARY DPRSGVQCPD RNSTGQFVYP PDCKFFVNCW
KGRAFVQPCA PGTHFNPDTL ECDFPHKVKC YEGESAGYTQ PIHPESQVVR NPHKLREPKC
PPYLTGLLPH QGDCSKFLQC ANGATYVMDC GPGTVFNPAV GVCDWPRNVE GCEAGERQNG
TFKAEEDVKA PLTPPSPQTH PYEHKSEYTE VKRIACPADF TGLLPHPETC KKFLQCANGA
TFVMDCGPGT AFNPLTTVCD WPHKVPSCKT DKPADGAHRT TNVFRPPSAP SGASGTVSWS
SSGQYNRTSG PRGHGSWPWT TTTTTTTTPK PAWRPVTTSP SRWAPMWTTA RPPYDHPGHS
ANHHYGRVYE DSQHRDYEPS RPEWQPNGGH FGRPGDYDHH AYDDRYDASR WPGSPVADRG
QQGTRHFGQY PGQSWPAEHR PPPDNRYGQD YGPPGRHYQH RYHGYGPPPP ASPSWGSGSA
ADGSYGRDYH GEYSPHPGGG HWPPDGDYHR RPPYRYDYHY FEDPAAPGEP DQRQFPAYSP
PGDGTKFGAD GNSQASRPAG YDFSSGGRED HGRTSPGFYR PDFDRTSSPA GNVFLNRGDP
LAGRWNQTSH GPRWGRPPAP PPGPGSTPWS RPDGPGFQPS AWHGQQSPEQ APARPWDQDT
QEDKFHQWQS GSDIYQQSTV DRENDAKMAQ WASRTNIFLS PKGQVASDSK NETQRPRTNV
YPAGIYVDAN GVKGHFITKE IEKNRTHPEI QHSYDTRRPL GRNDTLNTPA STDAANDYAP
STVTTTRTET VRLPSQAVPS RRGKVYVNTT TRGTGPRGSR NESRSNLPSV SIEEDDSSGA
TDTKQSGEFP DAVPQQETDD YVDVLDEKNE WKPKLVFENR SETTSTAPSV IMRITPKKTD
VELFNIEAAP FKEEEPPFPV YYVPPVRPLT HSRKTALPTP LSGQMIRLRG GSGPGDGYVE
VQGALPGWGI VCDSRNGWTL KKAHIVCKQL GYTRGAEMAW QGRNNRNGVP TWIAANTVTC
LGNETRFQSC KFTHNRECRV DRDAIGVQCV SNRIAHCRKD EIPHDGQCYH LAEPVGGSNH
DEVSDYCRRR NARLIDIISQ AENNFVSEWL SQSYPEVNSI MTSGLGFITM NLPLWLWGDS
SHAKFKFTKW WPGWMNDTKQ PPAVGSHPLC IVMKRKFPCH GLAESTCVAD YFFWDTEDCE
ASSRGHSFVC ERPYDDIGCV YGKGNQYTGK ANVTLSGNEC LPWAEQRIAH QLRVNVVSEE
VRKKLRTHNF CRNPNPAKES RPWCFVGPGT GKREYCDIPA CGNIDPKRST LTGQCKPRHF
ECMPGECIPS PWDCTNGADE RKCAVDLSLF EKSARHKLEG YDVEKWLNTP LKTCALRCKE
ADFTCRSFNH KAEGNICLLS DSNIGLTGAL KPDRQFDYYE RTERSVNCDG MYTCNNRKCI
NQTQVCNGKN DCNDRSDESM CTAENLDYDI RLSGTNNSHE GRIEVKVLGH WGQVCDDGFG
MINADVICRE LGFALGALEV RPGGFYGNLE PPTRFMVDQL KCRGNETTLR DCDFNGWGVH
NCQPEEAVGV VCKTAVNTCQ QDQWKCDSVP SCIPASFICD EVVDCPDSSD ESPQHCDAPF
ELRLVDGDSP LQGRVEVRHH GVWGTVCDDD FTNTAAAVIC RSLGYGGKAI AKKNGFFGPG
DGLIWLDEVF CHGNETQLYR CDHSHWGQHN CNHNEDAGVI CSPGNVNDTE RWAMVPELPE
RSIDEILPTN CGQRAKDFND DEDLIFAKVV HGSVAPKGTY PWQASIRVRG HSRSSHWCGA
VIVSPLHVLT AAHCLEGYNK GTYFVRAGDY NTDIEEGTEA EANIEDYYVH EEFRLGGHRM
NNDIALVLLK GPGIPLGKDI MPICLPHENT EYPAGLNCTI SGFGSIETGK TTQSKDLRYG
WVPLLDQSIC RAGYVYGEGA ISDGMMCAGY LDEGIDTCDG DSGGPLACHH NGAFTLYGIT
SWGQHCGKAN KPGVYVRVAY YRRWIDRKIK ESLAGR
//