ID R5HST4_9FIRM Unreviewed; 643 AA.
AC R5HST4;
DT 24-JUL-2013, integrated into UniProtKB/TrEMBL.
DT 24-JUL-2013, sequence version 1.
DT 27-MAR-2024, entry version 25.
DE RecName: Full=Collagen triple helix repeat protein {ECO:0008006|Google:ProtNLM};
GN ORFNames=BN469_02131 {ECO:0000313|EMBL:CCY27531.1};
OS Firmicutes bacterium CAG:114.
OC Bacteria; Bacillota.
OX NCBI_TaxID=1263001 {ECO:0000313|EMBL:CCY27531.1, ECO:0000313|Proteomes:UP000018090};
RN [1] {ECO:0000313|EMBL:CCY27531.1, ECO:0000313|Proteomes:UP000018090}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=MGS:114 {ECO:0000313|Proteomes:UP000018090};
RA Nielsen H.B., Almeida M., Juncker A.S., Rasmussen S., Li J., Sunagawa S.,
RA Plichta D., Gautier L., Le Chatelier E., Peletier E., Bonde I., Nielsen T.,
RA Manichanh C., Arumugam M., Batto J., Santos M.B.Q.D., Blom N., Borruel N.,
RA Burgdorf K.S., Boumezbeur F., Casellas F., Dore J., Guarner F., Hansen T.,
RA Hildebrand F., Kaas R.S., Kennedy S., Kristiansen K., Kultima J.R.,
RA Leonard P., Levenez F., Lund O., Moumen B., Le Paslier D., Pons N.,
RA Pedersen O., Prifti E., Qin J., Raes J., Tap J., Tims S., Ussery D.W.,
RA Yamada T., MetaHit consortium, Renault P., Sicheritz-Ponten T., Bork P.,
RA Wang J., Brunak S., Ehrlich S.D.;
RT "Dependencies among metagenomic species, viruses, plasmids and units of
RT genetic variation.";
RL Submitted (NOV-2012) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:CCY27531.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CAXW010000209; CCY27531.1; -; Genomic_DNA.
DR AlphaFoldDB; R5HST4; -.
DR STRING; 1263001.BN469_02131; -.
DR Proteomes; UP000018090; Unassembled WGS sequence.
DR Gene3D; 1.20.5.320; 6-Phosphogluconate Dehydrogenase, domain 3; 1.
DR InterPro; IPR008160; Collagen.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1082; COLLAGEN ALPHA-1(X) CHAIN; 1.
DR Pfam; PF01391; Collagen; 3.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000018090}.
FT REGION 17..189
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 221..263
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 285..331
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 353..396
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 423..485
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 239..263
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 643 AA; 64509 MW; C65AC8FAC61ADDB7 CRC64;
MNCYGYNYYK NCYPNPLPEE QVVSGLQGPQ GPRGEQGPQG EQGIKGDTGC PGPIGPRGMA
GPQGPRGPQG VRGDMGPKGD PGAVGPQGPR GDPGPMGPQG DRGPVGPKGD AGPMGPTGPR
GERGEQGERG PAGEQGPQGE QGYAGVRGPQ GPQGEMGCPG PQGEQGPQGE RGIQGERGET
GPQGERGETP TVAVGTVQLG DLPQVIANPT ETGISLDFVV PLGPTGPQGE VGPQGIQGPQ
GAQGETGPQG PTGASPTVSV GTVTAGEDPQ ITAVPTETGV SLSFVVPIGP TGPQGETGPQ
GETGARGPQG EPGATGPQGP AGGTPTVAVG SVTAGADPQV TAIPTETGIS LDFVVPVGPT
GPQGEAGPQG EQGEPGPQGA VGATGPQGLT GATPTVSVGS VTAGDIPQVT ATPTETGISL
AFVVPVGPTG PQGETGPQGE KGEAGPQGET GGIGPQGEKG DVGPQGESGP KGDPGTSPKI
TVEEDTPTTY KVKFTDDTQE IVSPNLRSNL KVYNKNLSVA GSSLEVPLES LILTAEYSGV
GTIRLSLRPK DTAAPVLADV RRTSIYGGLG AVEVQTLDNT KISTRTVIDD IVYDQSEEMH
WIRLRQQDPS TSLWSMCEVR TFSSKLGART SICVDWLYTG VTF
//