ID A0A151MF52_ALLMI Unreviewed; 1858 AA.
AC A0A151MF52;
DT 08-JUN-2016, integrated into UniProtKB/TrEMBL.
DT 08-JUN-2016, sequence version 1.
DT 27-MAR-2024, entry version 26.
DE SubName: Full=Collagen alpha-1(I) chain-like {ECO:0000313|EMBL:KYO23158.1};
GN ORFNames=Y1Q_0005597 {ECO:0000313|EMBL:KYO23158.1};
OS Alligator mississippiensis (American alligator).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archelosauria; Archosauria; Crocodylia; Alligatoridae; Alligatorinae;
OC Alligator.
OX NCBI_TaxID=8496 {ECO:0000313|EMBL:KYO23158.1};
RN [1] {ECO:0000313|EMBL:KYO23158.1, ECO:0000313|Proteomes:UP000050525}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=KSC_2009_1 {ECO:0000313|EMBL:KYO23158.1};
RX PubMed=22293439; DOI=10.1186/gb-2012-13-1-415;
RA St John J.A., Braun E.L., Isberg S.R., Miles L.G., Chong A.Y., Gongora J.,
RA Dalzell P., Moran C., Bed'hom B., Abzhanov A., Burgess S.C., Cooksey A.M.,
RA Castoe T.A., Crawford N.G., Densmore L.D., Drew J.C., Edwards S.V.,
RA Faircloth B.C., Fujita M.K., Greenwold M.J., Hoffmann F.G., Howard J.M.,
RA Iguchi T., Janes D.E., Khan S.Y., Kohno S., de Koning A.J., Lance S.L.,
RA McCarthy F.M., McCormack J.E., Merchant M.E., Peterson D.G., Pollock D.D.,
RA Pourmand N., Raney B.J., Roessler K.A., Sanford J.R., Sawyer R.H.,
RA Schmidt C.J., Triplett E.W., Tuberville T.D., Venegas-Anaya M.,
RA Howard J.T., Jarvis E.D., Guillette L.J.Jr., Glenn T.C., Green R.E.,
RA Ray D.A.;
RT "Sequencing three crocodilian genomes to illuminate the evolution of
RT archosaurs and amniotes.";
RL Genome Biol. 13:415-415(2012).
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:KYO23158.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AKHW03006215; KYO23158.1; -; Genomic_DNA.
DR STRING; 8496.A0A151MF52; -.
DR Proteomes; UP000050525; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR Gene3D; 2.60.120.1000; -; 1.
DR Gene3D; 2.60.120.200; -; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR000885; Fib_collagen_C.
DR InterPro; IPR001791; Laminin_G.
DR InterPro; IPR048287; TSPN-like_N.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1095; COLLAGEN ALPHA-1(I) CHAIN-LIKE ISOFORM X1; 1.
DR Pfam; PF01410; COLFI; 1.
DR Pfam; PF01391; Collagen; 8.
DR Pfam; PF02210; Laminin_G_2; 1.
DR SMART; SM00038; COLFI; 1.
DR SMART; SM00210; TSPN; 1.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
DR PROSITE; PS51461; NC1_FIB; 1.
PE 4: Predicted;
KW Collagen {ECO:0000313|EMBL:KYO23158.1};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000050525};
KW Secreted {ECO:0000256|ARBA:ARBA00022530};
KW Signal {ECO:0000256|ARBA:ARBA00022729}.
FT DOMAIN 1640..1858
FT /note="Fibrillar collagen NC1"
FT /evidence="ECO:0000259|PROSITE:PS51461"
FT REGION 248..298
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 457..568
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 604..1293
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1317..1593
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 248..282
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 492..516
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 523..538
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 604..624
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1081..1095
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1227..1259
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1858 AA; 188776 MW; B394BF462AA4CEE3 CRC64;
MHSDWETTSL RPSLEGEIPG NDFWLPQDEV NLLDKLTSYA QDLSNISLSY DETNCSILEI
GQYSTLSVPT REVFGDRFAD ELSVLLKLRY SLKEETSLLT ILSHHSHMLF QIRINPYALV
FVTTRRRHYE FPITTLGDGD WHRVALSITL ERLELRVDCQ LVESVSWSNY FGMGVTTEGL
VIIGGLIEPF EIPFEGALQE LTFVMGDPGA AREHCHSYNA SCTSAFGMAA FHNWTRFLWE
LSTEGSNFTN PDLTSRPTQV WEDPQKSQSP ESSGSALTSA TATKRPFVEE EEDEGNLSIE
EEGFELLNYT YNKITGPGRP GVGRSNLKEA RNTSESLTSV TSVSPTAPED AQVANEDLNL
PAASIKVNRL ASKKQHILKT TDENITTEKL KGDMGIWTSF SPSKPVDTIV DLDRSSAFSK
VSIEESETAA SSGNMPRLLS SISHGMVEVP VGRGMLENHS KASGSRKDAL NPPQTPLDAR
LVHTRYRKRI VHQRTTGTDT KDGTSVDRDQ RTGDRQLIRV KPGPPGPRGP PGLPGCPGRR
GPMGPKGDKG HPGAMGRTGV PGDPGPAGSP GVPTIVLWRN SEEDRLAFMQ SSFYQLLYAG
WPRQPGPPGP PGHPGKPGLT GPPGYPGEPG EKGQLGYVGE PGLQGFPGRA GYPGSDGLSG
MDGKPGPYGL PGEQGPQGFK GDQGPAGEKG EEGFVGDPGP PGEKGEKGAK GVKGENGLLG
PAGPQGMMGV KGAPGFQGPP GPQGGTGSTG SSGPAGPVGE PGQPGPVGLQ GVNGSRGEMG
PSGLPGPQGP KGPQGLQGRR GPPGPRGSQG PVGIEGLSGP KGDPGEAGPP GLRGEQGQEG
PMGLIGMPGP KGLSGEQGSG GPDGAKGDIG AKGNKGAQGL LGTKGPPGIR GQIGLPGFPG
PRGLAGLQGP DGEEGQIGLP GQNSSEGTKG SRGPDGPKGI AGPRGHRGRM GQRGLVGVPG
PPGLRGAPGI EGPEGKPGTQ GGSGVVGSPG RKGPAGFLGL QGDRGHNGPP GLLGPSGKPG
PDGDPGSPGL RGLLGKPGLE GPQGPVGMYG YPGPVGDRGK PGPMGEKGEP GVQGPSGFPG
KAGPKAPPGP PGSRGPPGLQ GRLGDPGPRG LPGLPGLPGS QGTDGIPGKP GVVGEMGLPG
PSGAAGPAGY MGREGPAGPE GPLGEEGEKG EAGDPGGAGI GGLDGEQGLP GLPGVEGQPG
VKGEQGEVGL LGDPGVKGSS GEPGDEGPKG HPGRPGELGK EGDVGDPGDP GEPGLKGEKG
DAGSRGPPGV PGSGASTGNI GEKGPKGDKG QEGLLGQVGV IGLAGAIGAR GRTGEAGLQG
LPGLDGPEGE KGDQGPPGVI GPSGVAGLEG APGEDGIKGD QGKPGPSGAP GSSGEKGIQG
FKGLFGPEGP KGDPGKKGET GQQGLHGSPG ETGAPGPLGS EGLAGKKGDQ GEPGEPGQTG
PQGQSGPEGC KGQKGEKGDI GRDGDTGFPG NPGRRGKRGR AGHRPPRGPS GPKGSQGEQG
PEGPKGPQGT HGIPGLKGVK GDRGPKGLRG EKGSVGFLGP QGDDGFKGKP GPSGPPGRSG
PKGDQGDIGL SGPQGFPGLP GTPGFFGQKG LKGLQGLPGC KGKPGLLGSP GLPGPRGPAL
NLSMEELKHL IYSSNKLNYD TVWALMGNLS HELKLLVDHP NGTKDNPATT CKELLLAQPH
LPDGYYYIDP NQGSPQDSLL AFCNFTAGGE TCISPVHNQV PIKAWLKAYA SEETFEWFSS
LPGGFLLEYQ GAGTVQLRFL KLHSSIATQK VSYSCRPEGD KDEPQPEKEI KFLADSRDQS
YLATLQGCVL DNESSITDTI FQFSTEELDL LPLRDLAVFH NGDASHQFGF TVGPVCFA
//