ID A9UW31_MONBE Unreviewed; 890 AA.
AC A9UW31;
DT 05-FEB-2008, integrated into UniProtKB/TrEMBL.
DT 05-FEB-2008, sequence version 1.
DT 24-JAN-2024, entry version 58.
DE RecName: Full=Fibrillar collagen NC1 domain-containing protein {ECO:0000259|PROSITE:PS51461};
GN ORFNames=MONBRDRAFT_31892 {ECO:0000313|EMBL:EDQ90696.1};
OS Monosiga brevicollis (Choanoflagellate).
OC Eukaryota; Choanoflagellata; Craspedida; Salpingoecidae; Monosiga.
OX NCBI_TaxID=81824 {ECO:0000313|EMBL:EDQ90696.1, ECO:0000313|Proteomes:UP000001357};
RN [1] {ECO:0000313|EMBL:EDQ90696.1, ECO:0000313|Proteomes:UP000001357}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=MX1 / ATCC 50154 {ECO:0000313|Proteomes:UP000001357};
RX PubMed=18273011; DOI=10.1038/nature06617;
RG JGI Sequencing;
RA King N., Westbrook M.J., Young S.L., Kuo A., Abedin M., Chapman J.,
RA Fairclough S., Hellsten U., Isogai Y., Letunic I., Marr M., Pincus D.,
RA Putnam N., Rokas A., Wright K.J., Zuzow R., Dirks W., Good M.,
RA Goodstein D., Lemons D., Li W., Lyons J.B., Morris A., Nichols S.,
RA Richter D.J., Salamov A., Bork P., Lim W.A., Manning G., Miller W.T.,
RA McGinnis W., Shapiro H., Tjian R., Grigoriev I.V., Rokhsar D.;
RT "The genome of the choanoflagellate Monosiga brevicollis and the origin of
RT metazoans.";
RL Nature 451:783-788(2008).
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CH991547; EDQ90696.1; -; Genomic_DNA.
DR RefSeq; XP_001744747.1; XM_001744695.1.
DR AlphaFoldDB; A9UW31; -.
DR STRING; 81824.A9UW31; -.
DR EnsemblProtists; EDQ90696; EDQ90696; MONBRDRAFT_31892.
DR GeneID; 5889859; -.
DR KEGG; mbr:MONBRDRAFT_31892; -.
DR eggNOG; KOG3544; Eukaryota.
DR eggNOG; KOG4509; Eukaryota.
DR InParanoid; A9UW31; -.
DR OMA; TEMHARN; -.
DR Proteomes; UP000001357; Unassembled WGS sequence.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR Gene3D; 2.60.120.1000; -; 2.
DR Gene3D; 3.30.870.30; MITD, C-terminal phospholipase D-like domain; 1.
DR Gene3D; 1.20.58.80; Phosphotransferase system, lactose/cellobiose-type IIA subunit; 1.
DR InterPro; IPR000885; Fib_collagen_C.
DR InterPro; IPR036056; Fibrinogen-like_C.
DR InterPro; IPR007330; MIT_dom.
DR InterPro; IPR036181; MIT_dom_sf.
DR InterPro; IPR032341; MITD1_C.
DR InterPro; IPR038113; MITD1_C_sf.
DR PANTHER; PTHR21222:SF1; MIT DOMAIN-CONTAINING PROTEIN 1; 1.
DR PANTHER; PTHR21222; UNCHARACTERIZED; 1.
DR Pfam; PF01410; COLFI; 2.
DR Pfam; PF04212; MIT; 1.
DR Pfam; PF16565; MIT_C; 1.
DR SMART; SM00038; COLFI; 1.
DR SUPFAM; SSF56496; Fibrinogen C-terminal domain-like; 1.
DR SUPFAM; SSF116846; MIT domain; 1.
DR PROSITE; PS51461; NC1_FIB; 2.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000001357};
KW Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..24
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 25..890
FT /note="Fibrillar collagen NC1 domain-containing protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5002742420"
FT DOMAIN 211..432
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT DOMAIN 434..657
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
SQ SEQUENCE 890 AA; 98984 MW; 47FF40D1D885625B CRC64;
MALPWLRRGA VLLLLLTTVW SGLGAEPSME VTEGTIVMEA TDVEFVVLDG NYRTSVVQMR
NALQANISSL AGTIADTNQQ LNISLQATIR ELDELFTNSL NVTADTISTN LLQEIDEAIT
HSQLVLSQLN ELRAQTQSNI SALAEAAATA RSVLRTETQA NFSALAETEM HARNQLREEA
HANLTALQSS NQEVTDQLEA DLTVVTETTL PALSARVDEV AHNLSTILVR DGSSRGAAAA
SCQAIFNVNL GLTNGTYWLD PNGQSTHDAV QLKCERVDNT VYTVLPAPPA LPMGGHFSAD
ASADRWESVR SINDTVFQYG DIPDDQLRAL LARSSSGRQS IQLDCRGVIS SLWDDVDWYY
EYPVVLVGLS GEVWTFPTAD DLGFQGRVFR PNIVFDNCSQ NHVDELGLSI YDMEGPATAL
PIVDMWLSDI GNANESFGFS LSEVRLSEAF QGPVVAYGTR MRPGKSCLDI FMHGYGSNDT
YYYIDPNGGL RNDSVRVFCN MTGGGWTGIA PLEQVPFKAW NPSGTDGYRF FSLHTGGYEI
TYDMADHQLD LLLASSSEAF QVLTVACQDA LVYYYDSSNL YSYAFRYKGY NDVLWTYESD
GLNVPVDNCK ANDATTRSTV VEFTGTPGDL PIRDFAPRDM GGATEFIGVH FSFVWAVKFD
KEARYDLALN NYRVALELLV PCIAPQSTLP VDMRGNLQQR IRDYMSRAEI VKEAARKERL
CADQAAKAHS AEVLHIKQND VGYDYQTLFG KYFTGAQVVT VEDPYLYRPH QIINLVHFFE
AALAGIGRDN FKMAVVRTKH QQDGVNQAEV FAGLESNLQE YGVRLVVEYE DFHDRCVRFD
NGYIIGLGAG LDIYLRPERQ FQLGMHDYRL RKCRETRIVA SYNSREDRRT
//