ID A0A3P9BST2_9CICH Unreviewed; 2517 AA.
AC A0A3P9BST2;
DT 13-FEB-2019, integrated into UniProtKB/TrEMBL.
DT 13-FEB-2019, sequence version 1.
DT 27-MAR-2024, entry version 24.
DE SubName: Full=Collagen type VII alpha 1 chain {ECO:0000313|Ensembl:ENSMZEP00005013023.1};
OS Maylandia zebra (zebra mbuna).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Actinopterygii; Neopterygii; Teleostei; Neoteleostei; Acanthomorphata;
OC Ovalentaria; Cichlomorphae; Cichliformes; Cichlidae; African cichlids;
OC Pseudocrenilabrinae; Haplochromini; Maylandia; Maylandia zebra complex.
OX NCBI_TaxID=106582 {ECO:0000313|Ensembl:ENSMZEP00005013023.1, ECO:0000313|Proteomes:UP000265160};
RN [1] {ECO:0000313|Ensembl:ENSMZEP00005013023.1, ECO:0000313|Proteomes:UP000265160}
RP NUCLEOTIDE SEQUENCE.
RX PubMed=25186727; DOI=10.1038/nature13726;
RA Brawand D., Wagner C.E., Li Y.I., Malinsky M., Keller I., Fan S.,
RA Simakov O., Ng A.Y., Lim Z.W., Bezault E., Turner-Maier J., Johnson J.,
RA Alcazar R., Noh H.J., Russell P., Aken B., Alfoldi J., Amemiya C.,
RA Azzouzi N., Baroiller J.F., Barloy-Hubler F., Berlin A., Bloomquist R.,
RA Carleton K.L., Conte M.A., D'Cotta H., Eshel O., Gaffney L., Galibert F.,
RA Gante H.F., Gnerre S., Greuter L., Guyon R., Haddad N.S., Haerty W.,
RA Harris R.M., Hofmann H.A., Hourlier T., Hulata G., Jaffe D.B., Lara M.,
RA Lee A.P., MacCallum I., Mwaiko S., Nikaido M., Nishihara H.,
RA Ozouf-Costaz C., Penman D.J., Przybylski D., Rakotomanga M., Renn S.C.P.,
RA Ribeiro F.J., Ron M., Salzburger W., Sanchez-Pulido L., Santos M.E.,
RA Searle S., Sharpe T., Swofford R., Tan F.J., Williams L., Young S., Yin S.,
RA Okada N., Kocher T.D., Miska E.A., Lander E.S., Venkatesh B., Fernald R.D.,
RA Meyer A., Ponting C.P., Streelman J.T., Lindblad-Toh K., Seehausen O.,
RA Di Palma F.;
RT "The genomic substrate for adaptive radiation in African cichlid fish.";
RL Nature 513:375-381(2014).
RN [2] {ECO:0000313|Ensembl:ENSMZEP00005013023.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (SEP-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR Ensembl; ENSMZET00005013472.1; ENSMZEP00005013023.1; ENSMZEG00005009670.1.
DR GeneTree; ENSGT00940000154368; -.
DR Proteomes; UP000265160; LG5.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0004867; F:serine-type endopeptidase inhibitor activity; IEA:InterPro.
DR GO; GO:0007155; P:cell adhesion; IEA:UniProtKB-KW.
DR CDD; cd00063; FN3; 8.
DR CDD; cd22627; Kunitz_collagen_alpha1_VII; 1.
DR CDD; cd01450; vWFA_subfamily_ECM; 1.
DR Gene3D; 1.20.5.320; 6-Phosphogluconate Dehydrogenase, domain 3; 1.
DR Gene3D; 2.60.40.10; Immunoglobulins; 8.
DR Gene3D; 4.10.410.10; Pancreatic trypsin inhibitor Kunitz domain; 1.
DR Gene3D; 3.40.50.410; von Willebrand factor, type A domain; 2.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR003961; FN3_dom.
DR InterPro; IPR036116; FN3_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR002223; Kunitz_BPTI.
DR InterPro; IPR036880; Kunitz_BPTI_sf.
DR InterPro; IPR020901; Prtase_inh_Kunz-CS.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1108; ENDOSTATIN DOMAIN-CONTAINING PROTEIN; 1.
DR Pfam; PF01391; Collagen; 14.
DR Pfam; PF00041; fn3; 6.
DR Pfam; PF00014; Kunitz_BPTI; 1.
DR Pfam; PF00092; VWA; 2.
DR PRINTS; PR00759; BASICPTASE.
DR PRINTS; PR00453; VWFADOMAIN.
DR SMART; SM00060; FN3; 9.
DR SMART; SM00327; VWA; 1.
DR SUPFAM; SSF57362; BPTI-like; 1.
DR SUPFAM; SSF49265; Fibronectin type III; 5.
DR SUPFAM; SSF53300; vWA-like; 2.
DR PROSITE; PS00280; BPTI_KUNITZ_1; 1.
DR PROSITE; PS50279; BPTI_KUNITZ_2; 1.
DR PROSITE; PS50853; FN3; 6.
DR PROSITE; PS50234; VWFA; 2.
PE 4: Predicted;
KW Cell adhesion {ECO:0000256|ARBA:ARBA00022889};
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000265160};
KW Secreted {ECO:0000256|ARBA:ARBA00022525}.
FT DOMAIN 9..179
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 198..293
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 294..387
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 463..548
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 631..717
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 719..807
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 808..894
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 928..1093
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 2463..2513
FT /note="BPTI/Kunitz inhibitor"
FT /evidence="ECO:0000259|PROSITE:PS50279"
FT REGION 1099..2414
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1099..1127
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1569..1596
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1603..1617
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1779..1799
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1839..1856
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2128..2143
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2399..2414
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 2517 AA; 255263 MW; D42586932B60AD6F CRC64;
RCDNVQAADI VFLVDGSSSI GRPNFLQVKG FMAGIVKPFA SSVGESGIRF GVAQYSDTSV
EFTLTAYLNG TELVNAVQNI NYKGGNTRTG DGLKFASDNF FNPASVRDVP KIMILITDGK
SQDNVNEPAQ KLRSQGVHVF AVIKSADRNE LAQISSQPSS DFTFFVGDFK LLNTLLPLVN
AQVCTKAGGD YASDAFVGPS NLQFIGQTSD SLRFRWSPAG GPVSAYVIQH VPLTGLGQPV
IAEMRQDSVT ANQRTYTARG LRSGTDYLVT VIAQYPNSVG DSVSAKQRTS LPGVSSLRLV
QAGFFSLTLG WNQPSTPVQG YRITYGPQQP AAQLLERTLS AGSTSVTLES LQPDTEYVIS
LYPLFPRNSA SPSVLNARTV QQLSVETESE GSVRVRWRGV SGARAYRLVW GPFTRDVETV
EVGGDINFYT LSGVQPDTEY IVTIIPLYEG NTEASPTTAR FKIQQVLRAA ITSPSSIRLT
WRLIQSSQGY RLEWREGEGP VRSLSFPRST TNSVLTGLKP NTKYIFTLYT LFEGREEATP
VSTTFRQPVG RVSNLRVVEY LGSTVRLGWT GVAGATQYRV IILNTDEVRL IPGNQTTLDL
RDLIVGVSYG VSVTALVGEN EGDPVTVNIK PVTNLRVINA NSRRIRITWT GVTGATGYRV
TWRQGNAEQS RTLGAELTAF TLEGLQPDEA LIIGVAAVAD QRVGEVATLA TQTNPQSGLL
SGLRVLDITP QRIRITWTLS SRATGYKITW RRDDVETSRQ VDASVSTYTI DGLQPDSAYT
VQVSTLTGSR EGTPAVLDVK TQSTVGTVTS LQVQEGRGEV VRVTWVGMQG ATSYRVSWRR
TDGEERSQLV AGDVTAVDLD QLDPGVQYEV QVMALVQNRE GAPVSVSVTT SLPTTTLRVV
EVTQDSVRLG WSPLQGATGY ILRWREESVV FLVPASTDRI NLARPLREFL TNTAESLITT
GALNTQIGVV VYGSRPKIWF LLNRHARSDT LLQEIQSIPF DESPGSNIEA VTFTRQYVLT
PSAGRRLRVP GVVVIIADKR SNDDLSVKVL AVGVDQADTV ELYQAVSDGN NNLLYTSYPA
QLNTLQSNLA DLLCGIGERG EKGERGRDGA DGRKGEPGRD GLPGRDGPRG PEGRPGTPGQ
SVPIDPSLVV KGEKGERGFP GIDGNPGLPG RPGAPGGSQG LPGVRGNPGE PGTPGPLGPK
GDKGERGEPG SVTSGGGGLP GRKGEPGIPG TDGIPGRLGR DGTKGEPGVP GTRGQDGRPG
IPGTPGLSGD SAEGQPGPPG KNGEPGDRGP RGPPGEIGSK GDRGQPGEPG SQGDRGERGP
AGETGGRGDS GKPGPPGPAG VRGLPGPSGP PGEKGNDGAR GEPGRTGERG VPGPESSKGE
KGDPGQRGEK GSPGSGGSGV AGPKGEPGER GESGLPGKPG ERGLRTGQPG PPGEKGDIGD
PGESGRNGDA GDPGEHGGKG TKGEAGTPGA PGLRGPEGQR GPPGTRQGER GASGLDGRPG
LDGKPGAPGP PGQRGDPGKQ GDPGRDGLPG LVGAQGPPGP VGPAGNPGIP GKAGENGKAG
PPGKTGEDGV PGEDGRKGDK GEAGAAGRDG RDGETGDRGP SGPLGSPGPP GAPGLPGSIG
PPGQVVYVKG ADATPIPGPQ GPPGTPGVPG IPGAAGTRGE RGLPGLKGET GDPGEDGAPG
KPGTSVDVQK ALAGFGIQGE KGDKGEPGQR GPSGADGARG FSGERGGKGD AGDRGPVGPA
GPPGRAIVER GSEGPAGPAG EPGKPGIPGL PGRAGELGEA GRPGDKGERG EKGDRGDSGE
SADSVLVGSP GARGPPGVSG LKGEPGATGP SGPKGDRGFV GPRGDKGERG EPGEKGRDGS
QGTPGETGKP GQDGKPGSTG PAGPRGQPGN PGEPGIMGPT GAMGPAGLPG PPGVKGNQGE
AGVGVQGPPG AQGSTGLPGP AGPPGALQGP QGPPGLPGQV QGEAGKPGVP GRDGVPGKEG
IHGLPGKQGI AGPPGQAGLK GEQGDSGPPG KAVAGPPGPK GERGPPGLTL PGTAGERGPI
GEKGNKGDKG VAGIKGERGQ AGESGEPGED GDPGMPGPAG PPGPAGKTGD SGPPGVRGEN
GQPGPPGPPG ERQGETGVGV PGARGERGDP GPRGEEGRAG LDGERGSSGD TVLVGGPPGE
KGNKGETGDR GPKGIQGEKG VKGQEGPPGE VGLRGEPGER GSTGFPGARG PGGQKGEAGQ
PGVPGESGLL GKDGLPGRKG EQGETGNMGI RGVKGDRGPK GICGGDGPKG EKGNSGANGR
SGLPGRKGEQ GDIGPSGAPG IPGKEGLVGP KGDRGFDGIA GPKGAQGEKG ERGLPGVPGP
PGPRGADGGP GLTGPQGPAG AKGPEGLQGQ KGERGPIGPA AVGPRGIPGI PGERGEAGDM
GPDGAKGDRG EAGMTEEEIR EYVRSEMSQH CGRELHMVVN TNDPDYEHVY SVESYDDPLE
EPCLLPMDEG SCGKYTMRWY FNRQAQACRP FIYSGCEGND NRFLHLEECE ETCLGEA
//