ID R1D873_EMIHU Unreviewed; 1553 AA.
AC R1D873;
DT 26-JUN-2013, integrated into UniProtKB/TrEMBL.
DT 26-JUN-2013, sequence version 1.
DT 27-MAR-2024, entry version 57.
DE RecName: Full=Peptidase S1 domain-containing protein {ECO:0008006|Google:ProtNLM};
GN ORFNames=EMIHUDRAFT_468143 {ECO:0000313|EMBL:EOD31643.1};
OS Emiliania huxleyi (Coccolithophore) (Pontosphaera huxleyi).
OC Eukaryota; Haptista; Haptophyta; Prymnesiophyceae; Isochrysidales;
OC Noelaerhabdaceae; Emiliania.
OX NCBI_TaxID=2903 {ECO:0000313|EMBL:EOD31643.1};
RN [1] {ECO:0000313|EMBL:EOD31643.1}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=CCMP1516 {ECO:0000313|EMBL:EOD31643.1};
RG DOE Joint Genome Institute;
RA Read B., Kegel J., Klute M., Kuo A., Lefebvre S.C., Maumus F., Mayer C.,
RA Miller J., Allen A., Bidle K., Borodovsky M., Bowler C., Brownlee C.,
RA Claverie J.-M., Cock M., De Vargas C., Elias M., Frickenhaus S.,
RA Gladyshev V.N., Gonzalez K., Guda C., Hadaegh A., Herman E.,
RA Iglesias-Rodriguez D., Jones B., Lawson T., Leese F., Lin Y.-C.,
RA Lindquist E., Lobanov A., Lucas S., Malik S.-H.B., Marsh M.E., Mock T.,
RA Monier A., Moreau H., Mueller-Roeber B., Napier J., Ogata H., Parker M.,
RA Probert I., Quesneville H., Raines C., Rensing S., Riano-Pachon D.M.,
RA Richier S., Rokitta S., Salamov A., Sarno A.F., Schmutz J., Schroeder D.,
RA Shiraiwa Y., Soanes D.M., Valentin K., Van Der Giezen M., Van Der Peer Y.,
RA Vardi A., Verret F., Von Dassow P., Wheeler G., Williams B., Wilson W.,
RA Wolfe G., Wurch L.L., Young J., Dacks J.B., Delwiche C.F., Dyhrman S.,
RA Glockner G., John U., Richards T., Worden A.Z., Zhang X., Grigoriev I.V.;
RT "Genome variability drives Emilianias global distribution.";
RL Submitted (JUL-2012) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Proteomes:UP000013827}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=CCMP1516 {ECO:0000313|Proteomes:UP000013827};
RX PubMed=23760476; DOI=10.1038/nature12221;
RA Read B.A., Kegel J., Klute M.J., Kuo A., Lefebvre S.C., Maumus F.,
RA Mayer C., Miller J., Monier A., Salamov A., Young J., Aguilar M.,
RA Claverie J.M., Frickenhaus S., Gonzalez K., Herman E.K., Lin Y.C.,
RA Napier J., Ogata H., Sarno A.F., Shmutz J., Schroeder D., de Vargas C.,
RA Verret F., von Dassow P., Valentin K., Van de Peer Y., Wheeler G.,
RA Dacks J.B., Delwiche C.F., Dyhrman S.T., Glockner G., John U., Richards T.,
RA Worden A.Z., Zhang X., Grigoriev I.V., Allen A.E., Bidle K., Borodovsky M.,
RA Bowler C., Brownlee C., Cock J.M., Elias M., Gladyshev V.N., Groth M.,
RA Guda C., Hadaegh A., Iglesias-Rodriguez M.D., Jenkins J., Jones B.M.,
RA Lawson T., Leese F., Lindquist E., Lobanov A., Lomsadze A., Malik S.B.,
RA Marsh M.E., Mackinder L., Mock T., Mueller-Roeber B., Pagarete A.,
RA Parker M., Probert I., Quesneville H., Raines C., Rensing S.A.,
RA Riano-Pachon D.M., Richier S., Rokitta S., Shiraiwa Y., Soanes D.M.,
RA van der Giezen M., Wahlund T.M., Williams B., Wilson W., Wolfe G.,
RA Wurch L.L.;
RT "Pan genome of the phytoplankton Emiliania underpins its global
RT distribution.";
RL Nature 499:209-213(2013).
RN [3] {ECO:0000313|EnsemblProtists:EOD31643}
RP IDENTIFICATION.
RG EnsemblProtists;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- SIMILARITY: Belongs to the peptidase S1 family.
CC {ECO:0000256|ARBA:ARBA00007664}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; KB864594; EOD31643.1; -; Genomic_DNA.
DR RefSeq; XP_005784072.1; XM_005784015.1.
DR STRING; 2903.R1D873; -.
DR PaxDb; 2903-EOD31643; -.
DR EnsemblProtists; EOD31643; EOD31643; EMIHUDRAFT_468143.
DR GeneID; 17276917; -.
DR KEGG; ehx:EMIHUDRAFT_468143; -.
DR eggNOG; KOG3627; Eukaryota.
DR HOGENOM; CLU_246290_0_0_1; -.
DR OMA; NSWILIQ; -.
DR Proteomes; UP000013827; Unassembled WGS sequence.
DR GO; GO:0016020; C:membrane; IEA:UniProtKB-KW.
DR GO; GO:0004252; F:serine-type endopeptidase activity; IEA:InterPro.
DR GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR CDD; cd00041; CUB; 2.
DR CDD; cd00190; Tryp_SPc; 1.
DR Gene3D; 2.60.120.290; Spermadhesin, CUB domain; 2.
DR Gene3D; 2.40.10.10; Trypsin-like serine proteases; 3.
DR InterPro; IPR000859; CUB_dom.
DR InterPro; IPR009003; Peptidase_S1_PA.
DR InterPro; IPR043504; Peptidase_S1_PA_chymotrypsin.
DR InterPro; IPR001314; Peptidase_S1A.
DR InterPro; IPR035914; Sperma_CUB_dom_sf.
DR InterPro; IPR001254; Trypsin_dom.
DR InterPro; IPR018114; TRYPSIN_HIS.
DR InterPro; IPR033116; TRYPSIN_SER.
DR PANTHER; PTHR24276:SF91; AT26814P-RELATED; 1.
DR PANTHER; PTHR24276; POLYSERASE-RELATED; 1.
DR Pfam; PF00431; CUB; 1.
DR Pfam; PF00089; Trypsin; 1.
DR Pfam; PF13365; Trypsin_2; 1.
DR PRINTS; PR00722; CHYMOTRYPSIN.
DR SMART; SM00042; CUB; 1.
DR SMART; SM00020; Tryp_SPc; 1.
DR SUPFAM; SSF49854; Spermadhesin, CUB domain; 2.
DR SUPFAM; SSF50494; Trypsin-like serine proteases; 2.
DR PROSITE; PS01180; CUB; 2.
DR PROSITE; PS50240; TRYPSIN_DOM; 1.
DR PROSITE; PS00134; TRYPSIN_HIS; 2.
DR PROSITE; PS00135; TRYPSIN_SER; 1.
PE 3: Inferred from homology;
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Hydrolase {ECO:0000256|RuleBase:RU363034};
KW Membrane {ECO:0000256|SAM:Phobius};
KW Protease {ECO:0000256|RuleBase:RU363034};
KW Reference proteome {ECO:0000313|Proteomes:UP000013827};
KW Serine protease {ECO:0000256|RuleBase:RU363034};
KW Transmembrane {ECO:0000256|SAM:Phobius};
KW Transmembrane helix {ECO:0000256|SAM:Phobius}.
FT TRANSMEM 1506..1526
FT /note="Helical"
FT /evidence="ECO:0000256|SAM:Phobius"
FT DOMAIN 336..431
FT /note="CUB"
FT /evidence="ECO:0000259|PROSITE:PS01180"
FT DOMAIN 630..685
FT /note="CUB"
FT /evidence="ECO:0000259|PROSITE:PS01180"
FT DOMAIN 1182..1420
FT /note="Peptidase S1"
FT /evidence="ECO:0000259|PROSITE:PS50240"
FT REGION 304..326
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 433..518
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 719..796
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 864..898
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1008..1055
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1133..1160
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1435..1504
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1533..1553
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 307..326
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 439..518
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 726..796
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 864..882
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 883..898
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1011..1055
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1145..1160
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1446..1492
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1535..1553
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1553 AA; 158433 MW; 6FEA2CF4DCEAC3EF CRC64;
MLALVVLGAA APGRGPVAGP TPGPRVVYGS DDRRDHYALN DNDAADVMAK SLLENAIMAH
VIRSDLGSPD SDGVYRVLDD YPDFGRLGPA RGMCDDQRFL RDPRLSYCSA TWVGEDRVIT
AGHCVEDQFD CDNAAYVFKY YVTGYDKNGD PTFPEITTDD VYLCSSVTTD FADGGTDIAL
IKLDRAVVGG RTPPIYQKSQ VPTTFGQKLL MIGFPDGLPA KVEDGGAVTD TGSSNGYEFF
TASTDAFGGN SGSGVFDERG TMIGVLVRGV TDYVSDDERG CTVVNELPNS AGGEGITYVD
RITQLQPSQP SPPPPPPSPP LPPLAPCSVS VDSGPCSLTK NGTHSCVTSP GFPNDYPNDK
GCTITCLPAA NTVEAFDVEE ETSCQYDYLE VNGVKYCGTS GPPNGEPAGA VQWVSDFSVT
AKGWKICWEL ENGSSPSAPP SSMPSPPPSA SPMLPPPSAS PAPPPPSASP SPPPPSASPP
PPSASPSPPP SASPPPPSAS PAPPPSSASP SPPPPSASPA ALSIEFQVTV AGGSWQYEVS
WSLTCSGALI GSGGAPYSGT LSAPPGECTL DMNDSYGDGW NGDQWEGAGY TFTLDSGSTG
SATFTLAPSA SPPPPPPLAP CSVSVDSGPC SLTKNGTHSC VTSPGFPGDY PNDKGCTITC
LPPANTVEAF EVEDETSCQY DYLEVNGVKY CGTSGPPNGE PAGAVQWVSD FSICWELENG
SSPSAPPSSM PSPPPSASPA PPPSASPSPP PPSAPPPPPS ASPSPPPSAS PPPPSASPAP
PPPSASPSPP PPSASPAALS IEFQVTVAGG SYPSEVSWTL MCSGALIGSG GAPYSGTLSA
PPGECTLDMN DSYGDGWNGD QWEGAGNTFT LDSGSSGSAT FTLAPLPPPP PLPPPAPPAG
SPRYVACGRT GRCDESDRWA APSEKHEVRC CSDSRIDGWK KKAGCSVWSE SDDDEMGGKC
HHNKNFAQAS TICEDAGARL CTRAELQGNC ARGSGCGHDR DLIWSGTESE AAAPPPPPSA
SPSPPPSASL PPPSASPAPP PSSASPSPPP PSASPAALSI EFQVTVAGGS YPSEVSWTLM
CSGALIGSGG APYSGTLSAP PGECTLDMND SYGDGWNGDQ WEGAGNTFTL DSGSSGSATF
TLAPLPPSSP QPPSQPPQPP LPGYCFATSA VASKAVVDVP KIVGGAPVNP ARKYTWLVSL
DGPGSFIGKH YCGGTLIAPD WVLTAAHCTQ GTVGRVAVGM HVVADVNTDP CVSLHEVAEI
FNHEDYGSPN SESNDVSLLK LATPVTGYAP IDLLDGPGMD TPFQEHLAML TVAGWGTMSS
GGVSSPEAME VDVPVVQNED CNVDYSGQID DTMICAGDTV NGGEDACQGD SGGPLFGIDA
DGNYVLTGIV SWGIGCADAD FPGVYARVSH ASEWICARTD GVVCGEPDSL KSRNVKKAAK
KTASPLVPPL VPPVSPPPAP HVSPSPPPPP PPPSPPPPLP PPSPPPSPPL TVLGPGDDDD
DDKSTVLIAA TAVGGLFGVA LIAAVATRYC MTKKAASAEE PKKDTLPVDK SVA
//