ID R1CJI8_EMIHU Unreviewed; 3933 AA.
AC R1CJI8;
DT 26-JUN-2013, integrated into UniProtKB/TrEMBL.
DT 26-JUN-2013, sequence version 1.
DT 27-MAR-2024, entry version 48.
DE RecName: Full=VWFA domain-containing protein {ECO:0000259|PROSITE:PS50234};
GN ORFNames=EMIHUDRAFT_95626 {ECO:0000313|EMBL:EOD23033.1};
OS Emiliania huxleyi (Coccolithophore) (Pontosphaera huxleyi).
OC Eukaryota; Haptista; Haptophyta; Prymnesiophyceae; Isochrysidales;
OC Noelaerhabdaceae; Emiliania.
OX NCBI_TaxID=2903 {ECO:0000313|EMBL:EOD23033.1};
RN [1] {ECO:0000313|EMBL:EOD23033.1}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=CCMP1516 {ECO:0000313|EMBL:EOD23033.1};
RG DOE Joint Genome Institute;
RA Read B., Kegel J., Klute M., Kuo A., Lefebvre S.C., Maumus F., Mayer C.,
RA Miller J., Allen A., Bidle K., Borodovsky M., Bowler C., Brownlee C.,
RA Claverie J.-M., Cock M., De Vargas C., Elias M., Frickenhaus S.,
RA Gladyshev V.N., Gonzalez K., Guda C., Hadaegh A., Herman E.,
RA Iglesias-Rodriguez D., Jones B., Lawson T., Leese F., Lin Y.-C.,
RA Lindquist E., Lobanov A., Lucas S., Malik S.-H.B., Marsh M.E., Mock T.,
RA Monier A., Moreau H., Mueller-Roeber B., Napier J., Ogata H., Parker M.,
RA Probert I., Quesneville H., Raines C., Rensing S., Riano-Pachon D.M.,
RA Richier S., Rokitta S., Salamov A., Sarno A.F., Schmutz J., Schroeder D.,
RA Shiraiwa Y., Soanes D.M., Valentin K., Van Der Giezen M., Van Der Peer Y.,
RA Vardi A., Verret F., Von Dassow P., Wheeler G., Williams B., Wilson W.,
RA Wolfe G., Wurch L.L., Young J., Dacks J.B., Delwiche C.F., Dyhrman S.,
RA Glockner G., John U., Richards T., Worden A.Z., Zhang X., Grigoriev I.V.;
RT "Genome variability drives Emiliania's global distribution.";
RL Submitted (JUL-2012) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Proteomes:UP000013827}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=CCMP1516 {ECO:0000313|Proteomes:UP000013827};
RX PubMed=23760476; DOI=10.1038/nature12221;
RA Read B.A., Kegel J., Klute M.J., Kuo A., Lefebvre S.C., Maumus F.,
RA Mayer C., Miller J., Monier A., Salamov A., Young J., Aguilar M.,
RA Claverie J.M., Frickenhaus S., Gonzalez K., Herman E.K., Lin Y.C.,
RA Napier J., Ogata H., Sarno A.F., Shmutz J., Schroeder D., de Vargas C.,
RA Verret F., von Dassow P., Valentin K., Van de Peer Y., Wheeler G.,
RA Dacks J.B., Delwiche C.F., Dyhrman S.T., Glockner G., John U., Richards T.,
RA Worden A.Z., Zhang X., Grigoriev I.V., Allen A.E., Bidle K., Borodovsky M.,
RA Bowler C., Brownlee C., Cock J.M., Elias M., Gladyshev V.N., Groth M.,
RA Guda C., Hadaegh A., Iglesias-Rodriguez M.D., Jenkins J., Jones B.M.,
RA Lawson T., Leese F., Lindquist E., Lobanov A., Lomsadze A., Malik S.B.,
RA Marsh M.E., Mackinder L., Mock T., Mueller-Roeber B., Pagarete A.,
RA Parker M., Probert I., Quesneville H., Raines C., Rensing S.A.,
RA Riano-Pachon D.M., Richier S., Rokitta S., Shiraiwa Y., Soanes D.M.,
RA van der Giezen M., Wahlund T.M., Williams B., Wilson W., Wolfe G.,
RA Wurch L.L.;
RT "Pan genome of the phytoplankton Emiliania underpins its global
RT distribution.";
RL Nature 499:209-213(2013).
RN [3] {ECO:0000313|EnsemblProtists:EOD23033}
RP IDENTIFICATION.
RG EnsemblProtists;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; KB865668; EOD23033.1; -; Genomic_DNA.
DR RefSeq; XP_005775462.1; XM_005775405.1.
DR STRING; 2903.R1CJI8; -.
DR PaxDb; 2903-EOD23033; -.
DR EnsemblProtists; EOD23033; EOD23033; EMIHUDRAFT_95626.
DR GeneID; 17268580; -.
DR KEGG; ehx:EMIHUDRAFT_95626; -.
DR eggNOG; KOG3544; Eukaryota.
DR HOGENOM; CLU_224153_0_0_1; -.
DR Proteomes; UP000013827; Unassembled WGS sequence.
DR CDD; cd00198; vWFA; 13.
DR Gene3D; 3.40.50.410; von Willebrand factor, type A domain; 14.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR PANTHER; PTHR24020; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24020:SF70; PH DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF00092; VWA; 14.
DR SMART; SM00327; VWA; 14.
DR SUPFAM; SSF53300; vWA-like; 14.
DR PROSITE; PS50234; VWFA; 14.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000013827};
KW Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..42
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 43..3933
FT /note="VWFA domain-containing protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5014590375"
FT DOMAIN 198..329
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 373..547
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 591..765
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 809..983
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 1027..1201
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 1245..1419
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 1463..1637
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 1681..1855
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 1899..2073
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 2117..2252
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 2304..2478
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 2522..2696
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 2740..2914
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 2958..3140
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT REGION 333..364
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 551..582
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 769..800
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 987..1018
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1205..1236
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1423..1454
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1641..1672
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1859..1890
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2077..2108
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2482..2513
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2700..2731
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2918..2949
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 335..364
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 553..582
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 771..800
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 989..1018
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1207..1236
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1425..1454
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1643..1672
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1861..1890
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2079..2108
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2484..2513
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2702..2731
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2920..2949
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 3933 AA; 402506 MW; 85564803AF90C9A6 CRC64;
MPPARPGAVR ASDRLGRAQP LPQVGLQALV LALAALARPA AAVQLPANAV NVSSGETCPE
GTTQPTVDWC QWYAGELGVY YLDKSGFYPE EGALVACVHR LVNNYSDPDS LYEAVPMPEM
TVSYQFADGR CPDYKYCETC LPRVLQCYCI SDDSSPPPPL PPLSPGETLS PPMQIPNLRL
YVGQCIPHDG WAGVQNCQFS SDAATLTPLT GSLSEVLTAI DGASAAGGGT SVSDGLELGR
VEVNQGARAD VPRTILLLTD GVQTVDGNDD TAIAKAAVVK QDGISIVAVG FGGANEQTMR
AIASAPSSDF AFFGASMDEV RQHFASDQLC ELAASPKAPP PPPSTPAPPG LPPSPPSPSP
APPSPPACVV KIELVLVLDK SGSVQAEQSS LLAFAREMVS QFSLDATEGA RVGIVEFSSD
AATLTPLTGS LSEVLTAIDG ASAAGGGTSV SDGLELGRVE VNQGARADVP RTILLLTDGV
QTVDGNDDTA IAKAAVVKQD GISIVAVGFG GANEQTMRAI ASAPSSDFAF FGASMDEVRQ
HFASDQLCEL AASPKAPPPP PSTPAPPGLP PSPPSPSPAP PSPPACVVKI ELVLVLDKSG
SVQAEQSSLL AFAREMVSQF SLDATEGARV GIVEFSSDAA TLTPLTGSLS EVLTAIDGAS
AAGGGTSVSD GLELGRVEVN QGARADVPRT ILLLTDGVQT VDGNDDTAIA KAAVVKQDGI
SIVAVGFGGA NEQTMRAIAS APSSDFAFFG ASMDEVRQHF ASDQLCELAA SPKAPPPPPS
TPAPPGLPPS PPSPSPAPPS PPACVVKIEL VLVLDKSGSV QAEQSSLLAF AREMVSQFSL
DATEGARVGI VEFSSDAATL TPLTGSLSEV LTAIDGASAA GGGTSVSDGL ELGRVEVNQG
ARADVPRTIL LLTDGVQTVD GNDDTAIAKA AVVKQDGISI VAVGFGGANE QTMRAIASAP
SSDFAFFGAS MDEVRQHFAS DQLCELAASP KAPPPPPSTP APPGLPPSPP SPSPAPPSPP
ACVVKIELVL VLDKSGSVQA EQSSLLAFAR EMVSQFSLDA TEGARVGIVE FSSDAATLTP
LTGSLSEVLT AIDGASAAGG GTSVSDGLEL GRVEVNQGAR ADVPRTILLL TDGVQTVDGN
DDTAIAKAAV VKQDGISIVA VGFGGANEQT MRAIASAPSS DFAFFGASMD EVRQHFASDQ
LCELAASPKA PPPPPSTPAP PGLPPSPPSP SPAPPSPPAC VVKIELVLVL DKSGSVQAEQ
SSLLAFAREM VSQFSLDATE GARVGIVEFS SDAATLTPLT GSLSEVLTAI DGASAAGGGT
SVSDGLELGR VEVNQGARAD VPRTILLLTD GVQTVDGNDD TAIAKAAVVK QDGISIVAVG
FGGANEQTMR AIASAPSSDF AFFGASMDEV RQHFASDQLC ELAASPKAPP PPPSLPAPPG
LPPSPPSPSP APPSPPACVV KIELVLVLDK SGSVQAEQSS LLAFAREMVS QFSLDATEGA
RVGIVEFSSD AATLTPLTGS LSEVLTAIDG ASAAGGGTSV SDGLELGRVE VNQGARADVP
RTILLLTDGV QTVDGNDDTA IAKAAVVKQD GISIVAVGFG GANEQTMRAI ASAPSSDFAF
FGASMDEVRQ HFASDQLCEL AASPKAPPPP PSTPAPPGLP PSPPSPSPAP PSPPACVVKI
ELVLVLDKSG SVQAEQSSLL AFAREMVSQF SLDATEGARV GIVEFSSDAA TLTPLTGSLS
EVLTAIDGAS AAGGGTSVSD GLELGRVEVN QGARADVPRT ILLLTDGVQT VDGNDDTAIA
KAAVVKQDGI SIVAVGFGGA NEQTMRAIAS APSSDFAFFG ASMDEVRQHF ASDQLCELAA
SPKAPPPPPS TPAPPGLPPS PPSPSPAPPS PPACVVKIEL VLVLDKSGSV QAEQSSLLAF
AREMVSQFSL DATEGARVGI VEFSSDAATL TPLTGSLSEV LTAIDGASAA GGGTSVSDGL
ELGRVEVNQG ARADVPRTIL LLTDGVQTVD GNDDTAIAKA AVVKQDGISI VAVGFGGANE
QTMRAIASAP SSDFAFFGAS MDEVRQHFAS DQLCELAASP KAPPPPPSTP APPGLPPSPP
SPSPAPPSPP ACVVKIELVL VLDKSGSVQA EQSSLLAFAR EMVSQFSLDA TEGARVGIVE
FSSDAATLTP LTGSLSEVLT AIDGASAAGG GTSVSDGLEL GRVEVNQGAR ADVPRTILLL
TDGVQTVDGN DDTAIAKAAV VKQDGISIVA VGPPPSATSA SPTPPPPSPP PVIIIVGGGG
GGGAPSPPPS QQPFPPPACV VKIELVLVLD KSGSVQAEQS SLLAFAREMV SQFSLDATEG
ARVGIVEFSS DAATLTPLTG SLSEVLTAID GASAAGGGTS VSDGLELGRV EVNQGARADV
PRTILLLTDG VQTVDGNDDT AIAKAAVVKQ DGISIVAVGF GGANEQTMRA IASAPSSDFA
FFGASMDEVR QHFASDQLCE LAASPKAPPP PPSTPAPPGL PPSPPSPSPA PPSPPACVVK
IELVLVLDKS GSVQAEQSSL LAFAREMVSQ FSLDATEGAR VGIVEFSSDA ATLTPLTGSL
SEVLTAIDGA SAAGGGTSVS DGLELGRVEV NQGARADVPR TILLLTDGVQ TVDGNDDTAI
AKAAVVKQDG ISIVAVGFGG ANEQTMRAIA SAPSSDFAFF GASMDEVRQH FASDQLCELA
ASPKAPPPPP SLPAPPGLPP SPPSPSPAPP SPPACVVKIE LVLVLDKSGS VQAEQSSLLA
FAREMVSQFS LDATEGARVG IVEFSSDAAT LTPLTGSLSE VLTAIDGASA AGGGTSVSDG
LELGRVEVNQ GARADVPRTI LLLTDGVQTV DGNDDTAIAK AAVVKQDGIS IVAVGFGGAN
EQTMRAIASA PSSDFAFFGA SMDEVRQHFA SDQLCELAAS PKAPPPPPST PAPPGLPPSP
PSPSPAPPSP PACVVKIELV LVLDKSGSVQ AEQSSLLAFA REMVSQFSLD ATEGARVGIV
EFSSDAATLT PLTGSLSEVL TAIDGASAAG GGTSVSDGLE LGRVEVNQGA RADVPRTILL
LTDGVQTVDG NDDTAIAKAA VVKQDGISIV AVGFGGANEQ TMRAIASAPS SDFAFFGASM
DEAEPWYNCL DITSFMNNAI NISDDGMDIT HGASGQIFGQ TSLCDGKKVS DRIIDGFDLT
ALLYYMTGSF ADLADMNRDP SAVVTGLQGF TDVHLRCGTN ESYYGYYTAY ANNACTNGPF
IDGPNNADLN IDWDLSGRGT VTGRRQLAEE QAQEEGAPLK LWRSTDVFQQ RQQPEFDPPS
GVNISELVIR TRLWVEAEGG SWYQIAIGGE HVSANFQVQG VPWPQSFPRE ARLSNHERFP
RDGKAPTHDA HRVQLRFARH CEYRDECRSD CAIITSQLAN DVMLSEETLI YRQERTAGRQ
RICGIDLFMW VPAAPDSGSR RLQDGEQSKP CVKFGVAVPT AKCTPETCEC RDDDVEPLAP
PPYLPPALPP PPLLPSPLLP SPLILSQPSP PSPPSFPFID LVDDEGEALF TVIHDKAFTL
DVNGTEFARY LPRRRLDALS PVREVRRLAG AAPVEIGDVL IFKPWHEYPD RQARSCDGAA
DLSSNFGGVV DQNLEVSLRL VIPTYYYHGC FARGPVSAGG DLTELTFAVA AITAVVVHLP
PSMPPPARPP LPSSPPSPPL LSPAPPCGCS PYYDLERGLA SSPTFLNSTN ALFYTLETGS
VWPLSLPVYD PAWGCACSSS VCIASPAVNL NTASAGAEAP AAVIRCMTMR SRYAASISPC
SASSFASQSE ATVVATVPWG TVAYTLLGPR RYPSSLSGGS EYSTPILDPA WLRACAAADR
SPAITYWMFS RSCARGASAR SASLAARSPA SVARRAFCRS FLALRAAARA ASCSLVGSFF
FGAVRSMYER SAACTDSRSP AAFSCFRSTS MVA
//