ID A0AA47NVV4_MERPO Unreviewed; 1436 AA.
AC A0AA47NVV4;
DT 24-JAN-2024, integrated into UniProtKB/TrEMBL.
DT 24-JAN-2024, sequence version 1.
DT 08-OCT-2025, entry version 9.
DE SubName: Full=Collagen alpha-1(XV) chain {ECO:0000313|EMBL:KAK0140581.1};
GN Name=Col15a1 {ECO:0000313|EMBL:KAK0140581.1};
GN ORFNames=N1851_022436 {ECO:0000313|EMBL:KAK0140581.1};
OS Merluccius polli (Benguela hake) (Merluccius cadenati).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Actinopterygii; Neopterygii; Teleostei; Neoteleostei; Acanthomorphata;
OC Zeiogadaria; Gadariae; Gadiformes; Gadoidei; Merlucciidae; Merluccius.
OX NCBI_TaxID=89951 {ECO:0000313|EMBL:KAK0140581.1, ECO:0000313|Proteomes:UP001174136};
RN [1] {ECO:0000313|EMBL:KAK0140581.1}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=C29 {ECO:0000313|EMBL:KAK0140581.1};
RC TISSUE=Fin {ECO:0000313|EMBL:KAK0140581.1};
RA Mateo J.L., Blanco-Fernandez C., Garcia-Vazquez E., Machado-Schiaffino G.;
RT "A new Merluccius polli reference genome to investigate the effects of
RT global change in West African waters.";
RL Front. Mar. Sci. 0:0-0(2023).
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix {ECO:0000256|ARBA:ARBA00004498}.
CC -!- SIMILARITY: Belongs to the multiplexin collagen family.
CC {ECO:0000256|ARBA:ARBA00061275}.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:KAK0140581.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; JAOPHQ010004053; KAK0140581.1; -; Genomic_DNA.
DR Proteomes; UP001174136; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0031012; C:extracellular matrix; IEA:TreeGrafter.
DR GO; GO:0005615; C:extracellular space; IEA:TreeGrafter.
DR GO; GO:0030020; F:extracellular matrix structural constituent conferring tensile strength; IEA:TreeGrafter.
DR GO; GO:0007155; P:cell adhesion; IEA:UniProtKB-KW.
DR GO; GO:0030198; P:extracellular matrix organization; IEA:TreeGrafter.
DR CDD; cd00247; Endostatin-like; 1.
DR FunFam; 3.10.100.10:FF:000008; collagen alpha-1(XVIII) chain isoform X1; 1.
DR FunFam; 2.60.120.200:FF:000039; Collagen XV alpha 1 chain; 1.
DR Gene3D; 2.60.120.200; -; 1.
DR Gene3D; 3.40.1620.70; -; 1.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 1.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR050149; Collagen_superfamily.
DR InterPro; IPR010515; Collagenase_NC10/endostatin.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR048287; TSPN-like_N.
DR InterPro; IPR045463; XV/XVIII_trimerization_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1118; LAMININ G DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01391; Collagen; 2.
DR Pfam; PF20010; Collagen_trimer; 1.
DR Pfam; PF06482; Endostatin; 1.
DR SMART; SM00210; TSPN; 1.
DR SUPFAM; SSF56436; C-type lectin-like; 1.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
PE 3: Inferred from homology;
KW Cell adhesion {ECO:0000256|ARBA:ARBA00022889};
KW Collagen {ECO:0000256|ARBA:ARBA00023119, ECO:0000313|EMBL:KAK0140581.1};
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW Proteoglycan {ECO:0000256|ARBA:ARBA00022974};
KW Reference proteome {ECO:0000313|Proteomes:UP001174136};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW Secreted {ECO:0000256|ARBA:ARBA00022525};
KW Signal {ECO:0000256|ARBA:ARBA00022729}.
FT DOMAIN 206..395
FT /note="Thrombospondin-like N-terminal"
FT /evidence="ECO:0000259|SMART:SM00210"
FT REGION 42..201
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 397..495
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 725..757
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 773..856
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 918..1155
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1210..1240
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 48..106
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 111..122
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 151..168
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 423..435
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 467..495
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 725..740
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 825..841
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 959..972
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 987..999
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1044..1064
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1104..1115
FT /note="Gly residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1116..1131
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1139..1148
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1220..1231
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1436 AA; 150032 MW; 839EAD570F09F7B0 CRC64;
MGCRVWEQSD DGWFFQQGSQ VVWVHVSAAL ELAVGRCCPG YTTNLDDTTP STTPSTPTSS
YPSTSYPTTS YPPTSTSTST STTSTSSSTS SSSSSSSSSS SSSSSSIIPP AVFPPSTSPD
PTPRGGTSED FTSGGTNGDL IAGPGFVGSN PSQGPAAAPT PAQLTAEPRS PSPGETLMPA
RSNGSRSKPP HKPLKLWKSE PGSGGHLDLT ELIGVPLPPG VSFARGFDSL PAFSFSPDAN
IGRLTRTFLP GPFYPDFTII ATVRPSSSRG GVLFAITDAH QKVVELGLVL TPAQKGLQSI
LLYYTDRQRY THSHKAAAFT VPEMTDQWTR FTVAVEGDEV RLYMDCGEAE LTVFQRGGAR
LSFSHDSGVF VGNAGGTRLH KFVGSIQQLV IRDDPRAAEE QCEEDDPYAS GDSSGDDGER
EEEVMKNKEE KKHNTYQDSV PVQAPPTEAP EADPDEYSGQ LTSTEANEVH TRPRSQERGV
RVQRWSSEKA LKEQGETKDL QAPLVYPAPQ APLLLLVKEC LAQPPLSLAP RGQQAPQDYQ
DPVDVRGELD RREARETMET LARKELRASQ VWLEKLEPKE RRETAVWGYQ GNQASQALLD
RPGHEVSLME WTLWALVLKT WRVKTQSWSG APPVPLVPPD PLDPGDMNLR TQQMAQGRLG
PQGPLGATAR MDSQELQASQ VEMGLQVFVE MLERRVIKDP QGHQEQSQDV PPCVLQGDCG
ALGVAGPPGV AGPPGVSGPQ GRPGPPGPAG GKYFMEDLEG SGKTDMLIGA GVRGPQGAPG
LPGPPGIKGR DGVPGTPGLS VKGESGDPGP DGQPGLAGLP GVRGAKGEKG SIGLKGDHGE
DGLSIPGSPG PPGPPGPIIN LQELLLNDTY GLFNFTEIRG PPGPAGPRGP KGDVGHLGIQ
GPAGLKGEKG EPGVTIAADG SLVSGAKGPQ GPRGLKGDRG VPGPAGFIGL IGPTGPKGEY
GFPGRPGRPG ASGRKGDRGD SSGTPGRPGP PGPPGPPGLP GRVVGLSGQT GVRVTGTKGE
KGEEGPPGES VTPVGSVETG TSDIRQEVAE EELGFRAEKG EKGRQGSPGR PGSPGRPGPP
GRSGLVGPKG EGTVGPPGPA GEPGQPGGPG FGRPGPRGPP GPIGPPGPPP MYGAAVNVPG
PPGPPGPPGA QGLFNPVRSF KTLQALARAS EGASEEVSEG SLAFVSERGE LYLRTHSGWR
PLQLGELLQM PSDSSPLSRA AERRARTHSQ ELQDSSRGYQ PSYNLLPQTA STVPGLHLVA
LNSPLRGDMR GIRGADYQCY QQARGVGLTA TYRAFLSSHL QDLATIVRRV DRHHMPVVNL
QGEVLFSSWS SIFSGNGGVF NPATPLYSFD GRDVMTDSAW PEKQVWHGSS AVGVRATTSY
CEAWRAGDRA VTGHASLLQT GQLIGQHSRS CSAPLVVLCI ENTYVERGGA TVEGRG
//