ID E5C7G5_9BACE Unreviewed; 958 AA.
AC E5C7G5;
DT 08-FEB-2011, integrated into UniProtKB/TrEMBL.
DT 08-FEB-2011, sequence version 1.
DT 27-MAR-2024, entry version 42.
DE RecName: Full=Collagen-like protein {ECO:0008006|Google:ProtNLM};
GN ORFNames=BSGG_0441 {ECO:0000313|EMBL:EFS29741.1};
OS Bacteroides sp. D2.
OC Bacteria; Bacteroidota; Bacteroidia; Bacteroidales; Bacteroidaceae;
OC Bacteroides.
OX NCBI_TaxID=556259 {ECO:0000313|EMBL:EFS29741.1, ECO:0000313|Proteomes:UP000003135};
RN [1] {ECO:0000313|EMBL:EFS29741.1, ECO:0000313|Proteomes:UP000003135}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=D2 {ECO:0000313|EMBL:EFS29741.1,
RC ECO:0000313|Proteomes:UP000003135};
RG The Broad Institute Genome Sequencing Platform;
RA Earl A., Ward D., Feldgarden M., Gevers D., Allen-Vercoe E., Strauss J.,
RA Sibley C., White A., Young S.K., Zeng Q., Gargeya S., Fitzgerald M.,
RA Haas B., Abouelleil A., Alvarado L., Arachchi H.M., Berlin A., Brown A.,
RA Chapman S.B., Chen Z., Dunbar C., Freedman E., Gearin G., Goldberg J.,
RA Griggs A., Gujja S., Heiman D., Howarth C., Larson L., Lui A.,
RA MacDonald P.J.P., Montmayeur A., Murphy C., Neiman D., Pearson M.,
RA Priest M., Roberts A., Saif S., Shea T., Shenoy N., Sisk P., Stolte C.,
RA Sykes S., Wortman J., Nusbaum C., Birren B.;
RT "The Genome Sequence of Bacteroides sp. D2.";
RL Submitted (OCT-2011) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:EFS29741.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; ACGA02000035; EFS29741.1; -; Genomic_DNA.
DR RefSeq; WP_008999309.1; NZ_JH636022.1.
DR AlphaFoldDB; E5C7G5; -.
DR HOGENOM; CLU_281548_0_0_10; -.
DR OrthoDB; 1031347at2; -.
DR Proteomes; UP000003135; Unassembled WGS sequence.
DR Gene3D; 2.10.10.20; Carbohydrate-binding module superfamily 5/12; 1.
DR InterPro; IPR003006; Ig/MHC_CS.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF914; OTOLIN-1; 1.
DR PROSITE; PS00290; IG_MHC; 1.
PE 4: Predicted;
FT REGION 432..471
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 594..624
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 432..454
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 594..610
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 958 AA; 104358 MW; 6537363F0F3B80DD CRC64;
MIEVENKKVP HSFRNKYLRN SGSVSISTTT PTPINGGGAN LDVLKIDDGR TVSDENVFSS
LRSLFEIKSR IIALTDNNTA PTDDNTFSSL RIRQELYAAI DALKDSYLSK TAPDETQFLI
KLLGGLIVDN GLDVTKGIST DTLTATTVTT QILNVLDKLI AKSATFSGDI SSNDYAEGLI
GWLIGKDGHI DAKSLRLRDF LEVPELRYNR VSIVSGEEWN APGGGIIENI DESNRIIYLK
LEPGEIAEIE VDDICKGIFN DSTGFQTAYF RITEKIGDST FKYALRSGTT AHPCKAMHFV
SYGNFTSKDR QRSSYSTQSY VRYLTGVNSW EITKEMIAMQ LGDLSNLKLF GIEMTGHSAY
LRNVYMTGTI KQLSNDGITE VPVPAFKGVW TPGTYWYYDE VVCNGSTWIC IADKTIQEPT
DNSTDWLKYV SKGETGDKGD KGDKGDKGDT GATGAKGDKG DTGPTGSQGI PGTSQYFHVK
YSANANGNPM SDTPNTYIGT AVTTSATAPT GYASYKWVQL KGSQGPKGEQ GIAGPTGANG
QTSYLHIKYS DNGTSFTANN GETPGAWIGQ YVDFTAADST TFSKYIWTKV KGDTGDKGDK
GDKGDKGDQG GKGDTGATGL PGALIRPRGE WKANTNYVNN TQYRDTIIYN GNTYSCRVDH
NSGSSFDVTK WTLFNDFVNV ATQLLVAQNA TIDILGTSGL FIGNQAKTQG WLMTGGSIKH
NVTGLELTAD GKLSLPATGA ILVGNKTFIT NGKIVTDFID VKTLEVEKLN GATGTFKSLQ
GTKIVDNKEV VMCEIGFSTS EGKMYFEGDM QHQGTFKEPN GTNRSYRFLT ADLWCRGQFG
HQQMTSLSFN SASTSDFFAH IYNYGTDTTY HKYAQSGQPI DCIFLEGSGN YVIYICNSPR
RKMITIVNAS GYPKRVLTTW QSGGTYTLEP YRFAIFVTAE TYASVNNTSS TVNLHVMQ
//