ID M4AHL7_XIPMA Unreviewed; 940 AA.
AC M4AHL7;
DT 01-MAY-2013, integrated into UniProtKB/TrEMBL.
DT 05-DEC-2018, sequence version 2.
DT 27-MAR-2024, entry version 53.
DE SubName: Full=Collagen alpha-1(XXI) chain-like {ECO:0000313|Ensembl:ENSXMAP00000013961.2};
OS Xiphophorus maculatus (Southern platyfish) (Platypoecilus maculatus).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Actinopterygii; Neopterygii; Teleostei; Neoteleostei; Acanthomorphata;
OC Ovalentaria; Atherinomorphae; Cyprinodontiformes; Poeciliidae; Poeciliinae;
OC Xiphophorus.
OX NCBI_TaxID=8083 {ECO:0000313|Ensembl:ENSXMAP00000013961.2, ECO:0000313|Proteomes:UP000002852};
RN [1] {ECO:0000313|Proteomes:UP000002852}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=JP 163 A {ECO:0000313|Proteomes:UP000002852};
RA Walter R., Schartl M., Warren W.;
RL Submitted (JAN-2012) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Proteomes:UP000002852}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=JP 163 A {ECO:0000313|Proteomes:UP000002852};
RX PubMed=23542700; DOI=10.1038/ng.2604;
RA Schartl M., Walter R.B., Shen Y., Garcia T., Catchen J., Amores A.,
RA Braasch I., Chalopin D., Volff J.N., Lesch K.P., Bisazza A., Minx P.,
RA Hillier L., Wilson R.K., Fuerstenberg S., Boore J., Searle S.,
RA Postlethwait J.H., Warren W.C.;
RT "The genome of the platyfish, Xiphophorus maculatus, provides insights into
RT evolutionary adaptation and several complex traits.";
RL Nat. Genet. 45:567-572(2013).
RN [3] {ECO:0000313|Ensembl:ENSXMAP00000013961.2}
RP IDENTIFICATION.
RC STRAIN=JP 163 A {ECO:0000313|Ensembl:ENSXMAP00000013961.2};
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR AlphaFoldDB; M4AHL7; -.
DR STRING; 8083.ENSXMAP00000013961; -.
DR Ensembl; ENSXMAT00000013980.2; ENSXMAP00000013961.2; ENSXMAG00000013932.2.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000164076; -.
DR HOGENOM; CLU_639275_0_0_1; -.
DR InParanoid; M4AHL7; -.
DR OMA; RTYNHRQ; -.
DR OrthoDB; 2883115at2759; -.
DR Proteomes; UP000002852; Unassembled WGS sequence.
DR Gene3D; 2.60.120.200; -; 1.
DR Gene3D; 1.20.5.320; 6-Phosphogluconate Dehydrogenase, domain 3; 1.
DR Gene3D; 3.40.50.410; von Willebrand factor, type A domain; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR048287; TSPN-like_N.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR PANTHER; PTHR24020; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24020:SF84; COLLAGEN ALPHA-1(XXI) CHAIN-LIKE ISOFORM X1; 1.
DR Pfam; PF01391; Collagen; 6.
DR Pfam; PF00092; VWA; 1.
DR PRINTS; PR00453; VWFADOMAIN.
DR SMART; SM00210; TSPN; 1.
DR SMART; SM00327; VWA; 1.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
DR SUPFAM; SSF53300; vWA-like; 1.
DR PROSITE; PS50234; VWFA; 1.
PE 4: Predicted;
KW Reference proteome {ECO:0000313|Proteomes:UP000002852};
KW Signal {ECO:0000256|ARBA:ARBA00022729, ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..18
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 19..940
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5017366395"
FT DOMAIN 34..207
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT REGION 445..776
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 812..911
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 678..698
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 717..737
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 940 AA; 97435 MW; 30E2FE942CB72D58 CRC64;
MLLWSVWSVM LLLAAAEAVE NEDVRAGCST AVNDLVYIVD GSWSVGPDDF ETAKLWLINI
SSQFDISLHY SQVAVVQYSD TPRLEIPLGT HHSGVQLIQA IHNINYLGGN TQTGRAIKFA
VDHVFSASQR VSQVKNRIAV VVTDGKSQDD VVDASTEARA QGITVFAVGV GSEITTSELI
SIANVPSSTY VLYAQDYTNI DRIRDSMEQK LCEESVCPTR IPVASRDEKG FELMLGMNIQ
KKAKKISGSL ASEAAYALTS STDVTENTRE IFPEGLPPSY VFVATIRLKG NSSQLIFDLW
RVLSKSKEIQ AAVTLNGKDQ SVIFTTTSVT EAEQKAIFKK GFKTLYDGKW HQLKILVSPQ
HAVSFLDDKL IQEIALKPVE PIYINGKTQI AKKRGSQVTV PVEIQKLRLY CDPHQSERET
ACEIYSVDDE RCPLNRTATV EQEEEDCNCA LGPPGQRGPP GRMGFRGEKG REGPPGPDGK
PGKQGIPGYA GPPGFPGIKG EEGPQGLRGE AGVKGNKGDQ GKPGLTGEPG PTGPQGLPGE
RGEPGPVGQT GLKGETGPPG MDGAAGLTGQ KGEMGDVGPA GDVGPKGAPG PIGPPGETGP
QGSPGEQGLA GLQGPPGPKG NLGVHGPQGP PGEIGRAGPK GSCGEPGVLG PPGLPGLKGL
VGEPGFPGQP GVLGRPGLKG HKGDRGESGS KGNRGQRGED GEPGTPGIPG DVGSKGWKGE
RGLKGESGPR GPEGKKGDTG HLGIVGPRGF PGQDGLPGQP GQPGYPGKPG KSPSDEHLVK
MCADVLRNQL PAFLQSLALQ ATCESCNTVK GPPGEPGAPG PKGSMGAAGY PGRSGAPGYP
GPPGMQGPAG IKGDIGERGP KGSKGEGYKG PPGPPGQPGM QGPRGFDGIG YPGSQGIQGK
PGLQGDPGKQ GIPGLPGVCD VSMCYRTYNH RQHYNKGPDV
//