ID A0A4D9E0W0_9SAUR Unreviewed; 1395 AA.
AC A0A4D9E0W0;
DT 03-JUL-2019, integrated into UniProtKB/TrEMBL.
DT 03-JUL-2019, sequence version 1.
DT 28-JAN-2026, entry version 21.
DE SubName: Full=Collagen alpha-1(XV) chain {ECO:0000313|EMBL:TFK01882.1};
GN ORFNames=DR999_PMT15837 {ECO:0000313|EMBL:TFK01882.1};
OS Platysternon megacephalum (big-headed turtle).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archelosauria; Testudinata; Testudines; Cryptodira; Durocryptodira;
OC Testudinoidea; Platysternidae; Platysternon.
OX NCBI_TaxID=55544 {ECO:0000313|EMBL:TFK01882.1, ECO:0000313|Proteomes:UP000297703};
RN [1] {ECO:0000313|EMBL:TFK01882.1, ECO:0000313|Proteomes:UP000297703}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=DO16091913 {ECO:0000313|EMBL:TFK01882.1};
RC TISSUE=Muscle {ECO:0000313|EMBL:TFK01882.1};
RA Gong S.;
RT "Draft genome of the big-headed turtle Platysternon megacephalum.";
RL Submitted (APR-2019) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|EMBL:TFK01882.1, ECO:0000313|Proteomes:UP000297703}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=DO16091913 {ECO:0000313|EMBL:TFK01882.1};
RC TISSUE=Muscle {ECO:0000313|EMBL:TFK01882.1};
RA Gong S.;
RT "The genome sequence of big-headed turtle.";
RL Submitted (APR-2019) to the EMBL/GenBank/DDBJ databases.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix {ECO:0000256|ARBA:ARBA00004498}.
CC -!- SIMILARITY: Belongs to the multiplexin collagen family.
CC {ECO:0000256|ARBA:ARBA00061275}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:TFK01882.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; QXTE01000206; TFK01882.1; -; Genomic_DNA.
DR STRING; 55544.A0A4D9E0W0; -.
DR OrthoDB; 10060752at2759; -.
DR Proteomes; UP000297703; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0031012; C:extracellular matrix; IEA:TreeGrafter.
DR GO; GO:0005615; C:extracellular space; IEA:TreeGrafter.
DR GO; GO:0007155; P:cell adhesion; IEA:UniProtKB-KW.
DR CDD; cd00247; Endostatin-like; 1.
DR FunFam; 3.10.100.10:FF:000008; collagen alpha-1(XVIII) chain isoform X1; 1.
DR FunFam; 2.60.120.200:FF:000039; Collagen XV alpha 1 chain; 1.
DR Gene3D; 2.60.120.200; -; 1.
DR Gene3D; 3.40.1620.70; -; 1.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 1.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR050149; Collagen_superfamily.
DR InterPro; IPR010515; Collagenase_NC10/endostatin.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR001791; Laminin_G.
DR InterPro; IPR048287; TSPN-like_N.
DR InterPro; IPR045463; XV/XVIII_trimerization_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1082; COLLAGEN TRIPLE HELIX REPEAT; 1.
DR Pfam; PF01391; Collagen; 5.
DR Pfam; PF20010; Collagen_trimer; 1.
DR Pfam; PF06482; Endostatin; 1.
DR SMART; SM00282; LamG; 1.
DR SMART; SM00210; TSPN; 1.
DR SUPFAM; SSF56436; C-type lectin-like; 2.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
PE 3: Inferred from homology;
KW Cell adhesion {ECO:0000256|ARBA:ARBA00022889};
KW Collagen {ECO:0000256|ARBA:ARBA00023119, ECO:0000313|EMBL:TFK01882.1};
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW Proteoglycan {ECO:0000256|ARBA:ARBA00022974};
KW Reference proteome {ECO:0000313|Proteomes:UP000297703};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW Secreted {ECO:0000256|ARBA:ARBA00022525};
KW Signal {ECO:0000256|ARBA:ARBA00022729, ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..19
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 20..1395
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5020020753"
FT DOMAIN 133..321
FT /note="Thrombospondin-like N-terminal"
FT /evidence="ECO:0000259|SMART:SM00210"
FT DOMAIN 182..320
FT /note="Laminin G"
FT /evidence="ECO:0000259|SMART:SM00282"
FT REGION 321..798
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 877..916
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 994..1136
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 443..458
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 513..524
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 533..542
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 572..581
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 583..600
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 665..680
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 763..772
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 786..795
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 887..916
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1030..1047
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1057..1070
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1100..1113
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1395 AA; 143237 MW; B91335342DFC166C CRC64;
MFSKLAWIIF LMQSGFIEGQ WWWQRLWGTT QTTTPPVATA VNTITLNVSV PKGYVTNGVS
LSWTTVSARA PAGSITSTVS SGATGVPYES TLPGAEAVTN FSREKALSEL ELKTGKKISQ
GKQKKTERGS KGHLDLTELI GVPLPLSVSF ITGYGGFPAY SFGPDANIGR LTSTLIPQTF
YSDFAIVVTV KPNSDDGGVL FAITDAFQKT IYLGMRLSPV DDGTQRIIMY YTEPGSYISR
EAASFKVPVM TNKWNRFTVT VQGNYTVLFM DCEEYNRVQF QRSSQALQFE SNSGIFVGNA
GATGLEKFTG SIQQLMIKPD PRATEDQCED DDPYASGDSS GTGGIQEQEG LPETEEVIAS
SQPLPEETTA GPVGAPPTVS AQSEEMDFSG HHILDETPEP PTIKEQGSTS AGNDQHESDA
TTVAQEILKP EAGSGAIVLQ GVSREKGQKG ERGEKGEQGP RGPPGKSELE EMQTGIQGPP
GPLGKPGRDS EPGIPGKDGL PGERGLQGLP GLKGEDGLKG EKGEPGVGLP GPRGLPGPPG
PSAPFRGLSR LEPEGSGSGD LDRDNEVLRG LPGPPGPPGL PGAPGKPDSN SGPPGSPGKD
GPSGEPGPPG PQGHPGLDGM VGPPGKKGEK GDQGLPGAVG PKGDAGDIGS PGPEGQAGAD
GQPGKPGPQG PPGPPGPPGP GYGFGFEDME GSGSISLLSE PRIPGSRGPN GPVGETGQRG
PMGPKGEKGD TGPPGTAGLK GEQGADGKPG FPGVAGRPGD AGPKGDKGDT GSKGEPGQDG
ASIVGPPGPP GPPGPIIAVP QLLLNDTDGI SNLTGIKGLL GPPGPDGRPG LPGFPGPRGP
KGVIGLTGLQ GPKGQRGEKG EPGFIISADG SLRELTGRQG QKGERGAMGP PGRMGPVGPS
GPKGELGIPG RPGRPGLNGL KGVKGERGVT LYGPPGLPGR PGPPGPPGAV IHIKGTVFPI
SPRPHCKMPV GTAHPGNQEA NIYGMKGVPG SWGLHGPPGL KGEKGERGSP GLPGPPLPSA
YFSHFVNSMK GEKGDNGETG FKGEKGEPSG GLFMSGPPGP HGPPGRPGPV GPKGDSVVGP
NGPPGLPGLP GSPGYGIVGP PGPPGPPGPP GPPAIYGSAA AVPGPPGPPG EPGLSGTRNL
VTTFRNIDGM LQKVHLVAEG TLTYLSESSE VFIRVRGGWR RLQLGELIPI PADSPPPPAI
SGYGFQSLPA LRPVSTINHG KPTLHLVALN LPLSGAMRAD YQCFQQARAA GLMSTYRAFL
SSHLQDLSTV VRKSERYNLP IVNLKGEILF NNWESVFTGS GGQFNIQIPI YSFDGRNVMT
DPSWPHKIIW HGSTANGIRL VSNYCEAWRT ADMAVMGQAS PLTTGKLLDQ KPYSCSNKFI
VLCIENSFVS DIRRK
//