ID A0A167P2N2_CALVF Unreviewed; 801 AA.
AC A0A167P2N2;
DT 06-JUL-2016, integrated into UniProtKB/TrEMBL.
DT 06-JUL-2016, sequence version 1.
DT 22-FEB-2023, entry version 22.
DE RecName: Full=MSP domain-containing protein {ECO:0000259|PROSITE:PS50202};
GN ORFNames=CALVIDRAFT_535431 {ECO:0000313|EMBL:KZO98355.1};
OS Calocera viscosa (strain TUFC12733).
OC Eukaryota; Fungi; Dikarya; Basidiomycota; Agaricomycotina; Dacrymycetes;
OC Dacrymycetales; Dacrymycetaceae; Calocera.
OX NCBI_TaxID=1330018 {ECO:0000313|EMBL:KZO98355.1, ECO:0000313|Proteomes:UP000076738};
RN [1] {ECO:0000313|EMBL:KZO98355.1, ECO:0000313|Proteomes:UP000076738}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=TUFC12733 {ECO:0000313|EMBL:KZO98355.1,
RC ECO:0000313|Proteomes:UP000076738};
RX PubMed=26659563; DOI=10.1093/molbev/msv337;
RA Nagy L.G., Riley R., Tritt A., Adam C., Daum C., Floudas D., Sun H.,
RA Yadav J.S., Pangilinan J., Larsson K.H., Matsuura K., Barry K., Labutti K.,
RA Kuo R., Ohm R.A., Bhattacharya S.S., Shirouzu T., Yoshinaga Y.,
RA Martin F.M., Grigoriev I.V., Hibbett D.S.;
RT "Comparative Genomics of Early-Diverging Mushroom-Forming Fungi Provides
RT Insights into the Origins of Lignocellulose Decay Capabilities.";
RL Mol. Biol. Evol. 33:959-970(2016).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; KV417276; KZO98355.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A167P2N2; -.
DR STRING; 1330018.A0A167P2N2; -.
DR OrthoDB; 1343558at2759; -.
DR Proteomes; UP000076738; Unassembled WGS sequence.
DR Gene3D; 2.60.40.10; Immunoglobulins; 1.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR000535; MSP_dom.
DR InterPro; IPR008962; PapD-like_sf.
DR Pfam; PF00635; Motile_Sperm; 1.
DR SUPFAM; SSF49354; PapD-like; 1.
DR PROSITE; PS50202; MSP; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000076738}.
FT DOMAIN 467..594
FT /note="MSP"
FT /evidence="ECO:0000259|PROSITE:PS50202"
FT REGION 1..479
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 598..681
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 61..87
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 94..127
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 168..198
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 199..226
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 234..263
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 264..286
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 331..360
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 391..410
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 429..443
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 598..654
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 801 AA; 87003 MW; 6BB826DD4BB9E071 CRC64;
MLGSQQGQQR PPDQPGASAA AGAAAGGYPY SGYSQKLAKE QAYAQLHPGN VEEHHPPPSY
HMATSPSTSP LSAAAQPQQQ QQQQKRTQQL RHKGRDRQVR DDTSSDQESF WEHAEREALR
LEHGTPKPDS DSQYYASPGP LAPNRASYQD PQAYGPYATQ SPPPHDAHYG SPSAPPPPSA
YRAPPPGSRE PQRPLPSPPQ SATSHGSHGS YPPSPSPQSQ GRPMSAPARS PPSRSPEQAR
PPPQAPSPTR VQQPHSPAPA RPSHPPHLEH HSHEHEHEHE HPAAPTGPRH LMHAVPSAPH
LTPEVLERDL PSLPHDSPSQ FSGSLPLHGE EPYPRGEERY EPLQDEMRLT RGDHGRESSG
ETVKPGPGPG YDGEAYPSEL TPLLPGGLRM LRAERDAREE RPRAEFGRSE EEEEVAEEEG
RAGRAQEEEQ DGDDDDDRDE EDEPGMHDLP GERRPLQPPP RIGTPEPLTI SPDTPDAPIL
LPNGNTKLLF RVVNPNRAPV AFKVMTTQPR VYSVRPNFGL IPPSGTLDLE VHAPADVASR
RRPAPGRGDR FAVLSRFLAQ EEEEELEEHE DFAALFPSKT TSAEGMRSQR FRIRPLMVPQ
NQNQGQGQNQ GGQGMGSGMS ATTAGSLMSA FSPSQRPRAL SSSGSTTALS PTYTAAPLPS
ILGPATTTPA APPGRPKSPS AIRFAPAVGA MASLAGVAGQ FAEAREAERR DEVLRESIGA
VREGVGELKD RLDEVIRRMQ LEEEAKERAL APDQMPAEME YSYPEYHTPG KKRAGLHPGR
VLGLALAVFV ATYYGRVFHV I
//