ID A0A1R3HC41_COCAP Unreviewed; 540 AA.
AC A0A1R3HC41;
DT 12-APR-2017, integrated into UniProtKB/TrEMBL.
DT 12-APR-2017, sequence version 1.
DT 27-MAR-2024, entry version 26.
DE SubName: Full=Major sperm protein {ECO:0000313|EMBL:OMO67897.1};
GN ORFNames=CCACVL1_20227 {ECO:0000313|EMBL:OMO67897.1};
OS Corchorus capsularis (Jute).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC rosids; malvids; Malvales; Malvaceae; Grewioideae; Apeibeae; Corchorus.
OX NCBI_TaxID=210143 {ECO:0000313|EMBL:OMO67897.1, ECO:0000313|Proteomes:UP000188268};
RN [1] {ECO:0000313|EMBL:OMO67897.1, ECO:0000313|Proteomes:UP000188268}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. CVL-1 {ECO:0000313|Proteomes:UP000188268};
RC TISSUE=Whole seedling {ECO:0000313|EMBL:OMO67897.1};
RA Alam M., Haque M.S., Islam M.S., Emdad E.M., Islam M.M., Ahmed B.,
RA Halim A., Hossen Q.M.M., Hossain M.Z., Ahmed R., Khan M.M., Islam R.,
RA Rashid M.M., Khan S.A., Rahman M.S., Alam M.;
RT "Corchorus capsularis genome sequencing.";
RL Submitted (SEP-2013) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:OMO67897.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AWWV01012333; OMO67897.1; -; Genomic_DNA.
DR AlphaFoldDB; A0A1R3HC41; -.
DR STRING; 210143.A0A1R3HC41; -.
DR EnsemblPlants; OMO67897; OMO67897; CCACVL1_20227.
DR Gramene; OMO67897; OMO67897; CCACVL1_20227.
DR OMA; HYAGISR; -.
DR OrthoDB; 2385921at2759; -.
DR Proteomes; UP000188268; Unassembled WGS sequence.
DR Gene3D; 1.25.40.20; Ankyrin repeat-containing domain; 2.
DR Gene3D; 2.60.40.10; Immunoglobulins; 1.
DR InterPro; IPR002110; Ankyrin_rpt.
DR InterPro; IPR036770; Ankyrin_rpt-contain_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR000535; MSP_dom.
DR InterPro; IPR008962; PapD-like_sf.
DR PANTHER; PTHR24201; ANK_REP_REGION DOMAIN-CONTAINING PROTEIN; 1.
DR PANTHER; PTHR24201:SF14; ANK_REP_REGION DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF00023; Ank; 1.
DR Pfam; PF12796; Ank_2; 2.
DR Pfam; PF13637; Ank_4; 1.
DR Pfam; PF00635; Motile_Sperm; 1.
DR PRINTS; PR01415; ANKYRIN.
DR SMART; SM00248; ANK; 9.
DR SUPFAM; SSF48403; Ankyrin repeat; 1.
DR SUPFAM; SSF49354; PapD-like; 1.
DR PROSITE; PS50297; ANK_REP_REGION; 6.
DR PROSITE; PS50088; ANK_REPEAT; 7.
DR PROSITE; PS50202; MSP; 1.
PE 4: Predicted;
KW ANK repeat {ECO:0000256|ARBA:ARBA00023043, ECO:0000256|PROSITE-
KW ProRule:PRU00023}; Reference proteome {ECO:0000313|Proteomes:UP000188268};
KW Repeat {ECO:0000256|ARBA:ARBA00022737}.
FT DOMAIN 4..134
FT /note="MSP"
FT /evidence="ECO:0000259|PROSITE:PS50202"
FT REPEAT 171..203
FT /note="ANK"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00023"
FT REPEAT 204..236
FT /note="ANK"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00023"
FT REPEAT 237..269
FT /note="ANK"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00023"
FT REPEAT 270..302
FT /note="ANK"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00023"
FT REPEAT 304..336
FT /note="ANK"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00023"
FT REPEAT 391..423
FT /note="ANK"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00023"
FT REPEAT 424..456
FT /note="ANK"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00023"
SQ SEQUENCE 540 AA; 58732 MW; DD2464B7037A07E4 CRC64;
MDRLISLEPS NLVAVRIEPG QKCYGELTLR NVMYTMPVAF RLQPVNKSRY SVKPQSGIIA
PLGTLTVEIV YHLPPGLLLP DSFPHSDDSF LLHSVVVPGA AIKETSNFDT VPNDWFTTKR
KQVFVDSGIK IMFVGSPILV QLVMDGSMDE IRDVLERSDP SWNPADSLDS QGQTLLHIAI
AQSRPDIVQL LLEFEPDVEF QSRSGSTPLE AAAGCGEELI VELLLAHKAS AERSESSSWG
PIHLAAVGGH LEVLRLLLLK GANVNSLTKD GNTALHLAVE ERRRDCARLL LANGAKADIR
NVRDGDTPLH IAAGLGDEQM VKLLLQKGAN KDIRNKTGKT AYDVAAEFGH MRLFDALKLG
DSLCLAARKG ELRTIQRLIE NGAVINGKDQ HGWTALHRAS FKGKIDAAKM LIDKGIDIDS
KDEDGYTALH CAVESGHSEV VEFLVKKGAD VEARTNKGVT PLQIAESLHY AGISRILIHG
GATKDGMPQQ ISAMPVSSPF GNGKMGKEIE TNKAPMMKRR PSRARALRGS FDRALPLAVV
//