ID A0A1S3H1A5_LINUN Unreviewed; 1324 AA.
AC A0A1S3H1A5;
DT 12-APR-2017, integrated into UniProtKB/TrEMBL.
DT 12-APR-2017, sequence version 1.
DT 27-MAR-2024, entry version 27.
DE SubName: Full=Collagen alpha-1(I) chain {ECO:0000313|RefSeq:XP_013378924.1};
GN Name=LOC106150587 {ECO:0000313|RefSeq:XP_013378924.1};
OS Lingula unguis.
OC Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Brachiopoda; Linguliformea;
OC Lingulata; Lingulida; Linguloidea; Lingulidae; Lingula.
OX NCBI_TaxID=7574 {ECO:0000313|Proteomes:UP000085678, ECO:0000313|RefSeq:XP_013378924.1};
RN [1] {ECO:0000313|RefSeq:XP_013378924.1}
RP NUCLEOTIDE SEQUENCE.
RX PubMed=26383154;
RA Luo Y.J., Takeuchi T., Koyanagi R., Yamada L., Kanda M., Khalturina M.,
RA Fujie M., Yamasaki S.I., Endo K., Satoh N.;
RT "The Lingula genome provides insights into brachiopod evolution and the
RT origin of phosphate biomineralization.";
RL Nat. Commun. 6:8301-8301(2015).
RN [2] {ECO:0000313|RefSeq:XP_013378924.1}
RP IDENTIFICATION.
RG RefSeq;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- CAUTION: Lacks conserved residue(s) required for the propagation of
CC feature annotation. {ECO:0000256|PROSITE-ProRule:PRU00076}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR RefSeq; XP_013378924.1; XM_013523470.1.
DR STRING; 7574.A0A1S3H1A5; -.
DR EnsemblMetazoa; XM_013523470.1; XP_013378924.1; LOC106150587.
DR GeneID; 106150587; -.
DR KEGG; lak:106150587; -.
DR InParanoid; A0A1S3H1A5; -.
DR OrthoDB; 5475408at2759; -.
DR Proteomes; UP000085678; Unplaced.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005509; F:calcium ion binding; IEA:InterPro.
DR CDD; cd00054; EGF_CA; 6.
DR Gene3D; 2.10.25.10; Laminin; 6.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001881; EGF-like_Ca-bd_dom.
DR InterPro; IPR000742; EGF-like_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1082; COLLAGEN ALPHA-1(X) CHAIN; 1.
DR Pfam; PF01391; Collagen; 4.
DR Pfam; PF00008; EGF; 6.
DR SMART; SM00181; EGF; 7.
DR SMART; SM00179; EGF_CA; 6.
DR SUPFAM; SSF57196; EGF/Laminin; 6.
DR PROSITE; PS00022; EGF_1; 5.
DR PROSITE; PS01186; EGF_2; 3.
DR PROSITE; PS50026; EGF_3; 7.
PE 4: Predicted;
KW Collagen {ECO:0000313|RefSeq:XP_013378924.1};
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157, ECO:0000256|PROSITE-
KW ProRule:PRU00076}; EGF-like domain {ECO:0000256|PROSITE-ProRule:PRU00076};
KW Reference proteome {ECO:0000313|Proteomes:UP000085678};
KW Signal {ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..21
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 22..1324
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5010211758"
FT DOMAIN 74..117
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 216..252
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 253..290
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 296..334
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 335..373
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 381..420
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 421..459
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT REGION 469..1234
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT DISULFID 107..116
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 242..251
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 280..289
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 363..372
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 449..458
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
SQ SEQUENCE 1324 AA; 130356 MW; A0BD44AD2C42728C CRC64;
MSRLTLVAVV SVLSGLVLCQ AATVTPPAAN QTLDGADEYP MCSTCGSGIS PFPGAPVGMV
QSVLMYRTLL IMYAHIRCST FNPCMNNAEC YRRCDSEGYV RSYRCMCPCG HYGLYCEKGA
LECSIKEQIG EFDCNDRFFN VTCGLQNIKI ARVCEYSEND PYCSGPDTFF FKYKWRLGNR
CDGLQSCTAV MDEMCAEDSN PRLRFLYQCC PSEESPDVPC RHDPCQNAGT CVPEADSFKC
ICPPGFTGRL CDIPTACSPN PCLNDGQCVP MGFTAFRCVC PAGFVGHLCE NQFTIMISAC
ITDPCLNNGR CIPNGDSYTC DCSEATGFTG ARCETNYICD LVNPCQNGGT CNPFPSGGYW
CQCPPGWSGF QCDREEKSIT EVSPCDSDPC LNNGRCIPVG ANDYRCDCSR AVGFTGANCE
TNNICDILKP CLNDGVCRPL PNGGYWCDCA ERWVGFNCDV DMTIQIERFG VPGPQGVPGP
PGPQGDTGPS GEIGTTGPVG PTGSEGAGLV GPTGPRGDTG DTGPTGPAGP EGEPGPAGPR
GPKGDSGAAG SPGEPGRPGS QGPTGPTGPR GRDGSDGEPG RPGSTGASGS TGRQGPRGNP
GPAGSTGARG PQGPTGQRGS AGSPGEPGSA GRPGSTGPQG PTGPRGPSGA SVEGEPGRPG
APGRTGPQGP TGPAGPRGRD GFDGEDGRDG ARGPQGPRGP TGAQGAQGRP GSQGPPGPAG
RNGQTGPQGR PGAPGNDGED GPRGPTGPQG PSGRIGATGP RGFAGANGRP GEPGRAGPQG
PRGPTGAGVT GARGPAGPRG PTGPPGSDGE DGPAGPRGPS GPQGAPGEPG RPGANGRPGR
DGSPGRNGAP GATGPRGVGA RGPTGPSGNP GARGAQGERG PTGPQGPRGS NGSPGEPGRP
GSQGPQGRAG DRGPTGPAGP RGRDGSPGEP GRPGARGATG PGGSQGARGN PGPAGSPGAR
GPQGPTGPRG NPGNDGGIGA AGRPGVAGPQ GPTGPTGPRG QSVQGEPGRP GAPGSTGPQG
PRGVAGSPGR DGSDGEDGPQ GSTGSRGPQG PQGPAGAPGR PGDRGPEGSP GRDGSTGPQG
PRGTPGTNGE PGQMGPAGPT GPRGASGSPG QSGSPGSDGR PGEQGPQGPA GPQGPTGVGM
PGPQGPTGPQ GPQGPAGFPG EDGSAGPQGP TGPQGPAGEN GSPGEPGAPG VGSPGPQGPT
GPRGYNGEDG APGEPGPAGP QGPNGATGPT GPAGGSSSCR GGSSCWCGCN CQCGIHMCCP
RALYYSHDVI PDSTGTGCWC DGMNKGPCLY QGKCCPPYIY YTQGPDNNPL KPHIHKADGT
PVWQ
//