ID W5KPZ0_ASTMX Unreviewed; 3031 AA.
AC W5KPZ0;
DT 16-APR-2014, integrated into UniProtKB/TrEMBL.
DT 05-DEC-2018, sequence version 2.
DT 27-MAR-2024, entry version 69.
DE SubName: Full=Cadherin EGF LAG seven-pass G-type receptor 1 {ECO:0000313|Ensembl:ENSAMXP00000009652.2};
OS Astyanax mexicanus (Blind cave fish) (Astyanax fasciatus mexicanus).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Actinopterygii; Neopterygii; Teleostei; Ostariophysi; Characiformes;
OC Characoidei; Characidae; Astyanax.
OX NCBI_TaxID=7994 {ECO:0000313|Ensembl:ENSAMXP00000009652.2, ECO:0000313|Proteomes:UP000018467};
RN [1] {ECO:0000313|Proteomes:UP000018467}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=female {ECO:0000313|Proteomes:UP000018467};
RA Jeffery W., Warren W., Wilson R.K.;
RL Submitted (MAR-2013) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Proteomes:UP000018467}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=female {ECO:0000313|Proteomes:UP000018467};
RX PubMed=25329095; DOI=10.1038/ncomms6307;
RA McGaugh S.E., Gross J.B., Aken B., Blin M., Borowsky R., Chalopin D.,
RA Hinaux H., Jeffery W.R., Keene A., Ma L., Minx P., Murphy D., O'Quin K.E.,
RA Retaux S., Rohner N., Searle S.M., Stahl B.A., Tabin C., Volff J.N.,
RA Yoshizawa M., Warren W.C.;
RT "The cavefish genome reveals candidate genes for eye loss.";
RL Nat. Commun. 5:5307-5307(2014).
RN [3] {ECO:0000313|Ensembl:ENSAMXP00000009652.2}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC -!- FUNCTION: Receptor that may have an important role in cell/cell
CC signaling during nervous system formation.
CC {ECO:0000256|ARBA:ARBA00002066}.
CC -!- SUBCELLULAR LOCATION: Cell membrane {ECO:0000256|ARBA:ARBA00004651};
CC Multi-pass membrane protein {ECO:0000256|ARBA:ARBA00004651}. Membrane
CC {ECO:0000256|ARBA:ARBA00004141}; Multi-pass membrane protein
CC {ECO:0000256|ARBA:ARBA00004141}.
CC -!- SIMILARITY: Belongs to the G-protein coupled receptor 2 family. LN-TM7
CC subfamily. {ECO:0000256|ARBA:ARBA00010933}.
CC -!- CAUTION: Lacks conserved residue(s) required for the propagation of
CC feature annotation. {ECO:0000256|PROSITE-ProRule:PRU00076}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR STRING; 7994.ENSAMXP00000009652; -.
DR Ensembl; ENSAMXT00000009652.2; ENSAMXP00000009652.2; ENSAMXG00000009342.2.
DR eggNOG; KOG4289; Eukaryota.
DR GeneTree; ENSGT00940000159839; -.
DR HOGENOM; CLU_000158_1_0_1; -.
DR InParanoid; W5KPZ0; -.
DR OrthoDB; 4006628at2759; -.
DR Proteomes; UP000018467; Unassembled WGS sequence.
DR Bgee; ENSAMXG00000009342; Expressed in embryo and 6 other cell types or tissues.
DR GO; GO:0005886; C:plasma membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005509; F:calcium ion binding; IEA:UniProtKB-UniRule.
DR GO; GO:0004930; F:G protein-coupled receptor activity; IEA:UniProtKB-KW.
DR GO; GO:0042074; P:cell migration involved in gastrulation; IEA:Ensembl.
DR GO; GO:0007166; P:cell surface receptor signaling pathway; IEA:InterPro.
DR GO; GO:0007507; P:heart development; IEA:Ensembl.
DR GO; GO:0007156; P:homophilic cell adhesion via plasma membrane adhesion molecules; IEA:InterPro.
DR GO; GO:0021915; P:neural tube development; IEA:Ensembl.
DR CDD; cd15991; 7tmB2_CELSR1; 1.
DR CDD; cd11304; Cadherin_repeat; 8.
DR CDD; cd00054; EGF_CA; 4.
DR CDD; cd00055; EGF_Lam; 1.
DR CDD; cd00110; LamG; 2.
DR Gene3D; 2.60.120.200; -; 2.
DR Gene3D; 2.60.220.50; -; 1.
DR Gene3D; 2.60.40.60; Cadherins; 9.
DR Gene3D; 4.10.1240.10; GPCR, family 2, extracellular hormone receptor domain; 1.
DR Gene3D; 2.10.25.10; Laminin; 5.
DR Gene3D; 1.20.1070.10; Rhodopsin 7-helix transmembrane proteins; 1.
DR Gene3D; 2.170.300.10; Tie2 ligand-binding domain superfamily; 1.
DR InterPro; IPR002126; Cadherin-like_dom.
DR InterPro; IPR015919; Cadherin-like_sf.
DR InterPro; IPR020894; Cadherin_CS.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR001881; EGF-like_Ca-bd_dom.
DR InterPro; IPR000742; EGF-like_dom.
DR InterPro; IPR032471; GAIN_dom_N.
DR InterPro; IPR046338; GAIN_dom_sf.
DR InterPro; IPR017981; GPCR_2-like_7TM.
DR InterPro; IPR036445; GPCR_2_extracell_dom_sf.
DR InterPro; IPR001879; GPCR_2_extracellular_dom.
DR InterPro; IPR000832; GPCR_2_secretin-like.
DR InterPro; IPR000203; GPS.
DR InterPro; IPR001791; Laminin_G.
DR InterPro; IPR002049; LE_dom.
DR PANTHER; PTHR24026:SF126; CADHERIN-89D; 1.
DR PANTHER; PTHR24026; FAT ATYPICAL CADHERIN-RELATED; 1.
DR Pfam; PF00002; 7tm_2; 1.
DR Pfam; PF00028; Cadherin; 8.
DR Pfam; PF00008; EGF; 3.
DR Pfam; PF16489; GAIN; 1.
DR Pfam; PF01825; GPS; 1.
DR Pfam; PF00053; Laminin_EGF; 2.
DR Pfam; PF02210; Laminin_G_2; 2.
DR PRINTS; PR00205; CADHERIN.
DR PRINTS; PR00249; GPCRSECRETIN.
DR SMART; SM00112; CA; 9.
DR SMART; SM00181; EGF; 6.
DR SMART; SM00179; EGF_CA; 4.
DR SMART; SM00180; EGF_Lam; 1.
DR SMART; SM00303; GPS; 1.
DR SMART; SM00008; HormR; 1.
DR SMART; SM00282; LamG; 2.
DR SUPFAM; SSF49313; Cadherin-like; 9.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 2.
DR SUPFAM; SSF57196; EGF/Laminin; 2.
DR SUPFAM; SSF81321; Family A G protein-coupled receptor-like; 1.
DR PROSITE; PS00232; CADHERIN_1; 6.
DR PROSITE; PS50268; CADHERIN_2; 9.
DR PROSITE; PS00022; EGF_1; 5.
DR PROSITE; PS01186; EGF_2; 2.
DR PROSITE; PS50026; EGF_3; 6.
DR PROSITE; PS01248; EGF_LAM_1; 1.
DR PROSITE; PS50027; EGF_LAM_2; 1.
DR PROSITE; PS50227; G_PROTEIN_RECEP_F2_3; 1.
DR PROSITE; PS50261; G_PROTEIN_RECEP_F2_4; 1.
DR PROSITE; PS50221; GPS; 1.
DR PROSITE; PS50025; LAM_G_DOMAIN; 2.
PE 3: Inferred from homology;
KW Calcium {ECO:0000256|ARBA:ARBA00022837, ECO:0000256|PROSITE-
KW ProRule:PRU00043}; Cell membrane {ECO:0000256|ARBA:ARBA00022475};
KW Developmental protein {ECO:0000256|ARBA:ARBA00022473};
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157, ECO:0000256|PROSITE-
KW ProRule:PRU00076};
KW EGF-like domain {ECO:0000256|ARBA:ARBA00022536, ECO:0000256|PROSITE-
KW ProRule:PRU00076};
KW G-protein coupled receptor {ECO:0000256|ARBA:ARBA00023040};
KW Hydroxylation {ECO:0000256|ARBA:ARBA00023278};
KW Laminin EGF-like domain {ECO:0000256|ARBA:ARBA00023292,
KW ECO:0000256|PROSITE-ProRule:PRU00460};
KW Membrane {ECO:0000256|ARBA:ARBA00023136, ECO:0000256|SAM:Phobius};
KW Receptor {ECO:0000256|ARBA:ARBA00023170};
KW Reference proteome {ECO:0000313|Proteomes:UP000018467};
KW Repeat {ECO:0000256|ARBA:ARBA00022737}; Signal {ECO:0000256|SAM:SignalP};
KW Transducer {ECO:0000256|ARBA:ARBA00023224};
KW Transmembrane {ECO:0000256|ARBA:ARBA00022692, ECO:0000256|SAM:Phobius};
KW Transmembrane helix {ECO:0000256|ARBA:ARBA00022989,
KW ECO:0000256|SAM:Phobius}.
FT SIGNAL 1..23
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 24..3031
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5017417403"
FT TRANSMEM 2470..2493
FT /note="Helical"
FT /evidence="ECO:0000256|SAM:Phobius"
FT TRANSMEM 2505..2523
FT /note="Helical"
FT /evidence="ECO:0000256|SAM:Phobius"
FT TRANSMEM 2529..2551
FT /note="Helical"
FT /evidence="ECO:0000256|SAM:Phobius"
FT TRANSMEM 2572..2591
FT /note="Helical"
FT /evidence="ECO:0000256|SAM:Phobius"
FT TRANSMEM 2611..2633
FT /note="Helical"
FT /evidence="ECO:0000256|SAM:Phobius"
FT TRANSMEM 2654..2675
FT /note="Helical"
FT /evidence="ECO:0000256|SAM:Phobius"
FT TRANSMEM 2681..2704
FT /note="Helical"
FT /evidence="ECO:0000256|SAM:Phobius"
FT DOMAIN 255..362
FT /note="Cadherin"
FT /evidence="ECO:0000259|PROSITE:PS50268"
FT DOMAIN 363..470
FT /note="Cadherin"
FT /evidence="ECO:0000259|PROSITE:PS50268"
FT DOMAIN 471..576
FT /note="Cadherin"
FT /evidence="ECO:0000259|PROSITE:PS50268"
FT DOMAIN 577..681
FT /note="Cadherin"
FT /evidence="ECO:0000259|PROSITE:PS50268"
FT DOMAIN 682..783
FT /note="Cadherin"
FT /evidence="ECO:0000259|PROSITE:PS50268"
FT DOMAIN 784..886
FT /note="Cadherin"
FT /evidence="ECO:0000259|PROSITE:PS50268"
FT DOMAIN 887..992
FT /note="Cadherin"
FT /evidence="ECO:0000259|PROSITE:PS50268"
FT DOMAIN 993..1094
FT /note="Cadherin"
FT /evidence="ECO:0000259|PROSITE:PS50268"
FT DOMAIN 1117..1217
FT /note="Cadherin"
FT /evidence="ECO:0000259|PROSITE:PS50268"
FT DOMAIN 1296..1354
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 1356..1392
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 1396..1434
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 1435..1639
FT /note="Laminin G"
FT /evidence="ECO:0000259|PROSITE:PS50025"
FT DOMAIN 1642..1678
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 1682..1863
FT /note="Laminin G"
FT /evidence="ECO:0000259|PROSITE:PS50025"
FT DOMAIN 1865..1901
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 1902..1939
FT /note="EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50026"
FT DOMAIN 1996..2043
FT /note="Laminin EGF-like"
FT /evidence="ECO:0000259|PROSITE:PS50027"
FT DOMAIN 2028..2101
FT /note="G-protein coupled receptors family 2 profile 1"
FT /evidence="ECO:0000259|PROSITE:PS50227"
FT DOMAIN 2468..2705
FT /note="G-protein coupled receptors family 2 profile 2"
FT /evidence="ECO:0000259|PROSITE:PS50261"
FT REGION 2301..2326
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2776..2949
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2957..2976
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 3012..3031
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2307..2321
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2800..2823
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2836..2850
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2851..2866
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2888..2902
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2903..2932
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2962..2976
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT DISULFID 1344..1353
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 1382..1391
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 1668..1677
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 1891..1900
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 1929..1938
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00076"
FT DISULFID 1996..2008
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00460"
FT DISULFID 1998..2015
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00460"
FT DISULFID 2017..2026
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU00460"
SQ SEQUENCE 3031 AA; 335079 MW; 3226CD6E135476C8 CRC64;
MDLPMKWIWF GVTLLVHLPL YGCFELHFPE SLQPGAALLN ASLGPGWIYS IDWSLTASSI
GRCVRIGSTD GVVTLARKVH CSRFTRLPAP LHFRLTSLLS GNAVLIHFNT FIHGQNCFSK
HKRKPPKREL DASIHLRSGY GTCFSSSMLP FSIYEHLPAF TWQAQILQAT CKEKQERHEK
FLSKASAGSQ QCLPQHATFL CTVPNATSLD RVLFLQVQLH FSPNHGHFHN QDFGLSSEQW
VKRLKRNANS APQFQLPNYQ VSVPENEPSG TRVITLKAFD SDNGDAGVLE YDIEALFDSR
SNDYFQINPD TGGITTLQPL DREMKDTHVF KVTATDNGTP RRSATAYLTI TVSDTNDHKP
VFEQTEYRVS IRENVEVGFE VMTIRATDGD APSNANMIYK IVNEEDVKSV FEIDPRNGLV
RIKVRPDREV KSEYQLRVEA NDQGKDPGPH SATATVHITI EDENDNYPQF SQKRYVVQVL
ENVAVNSEVA QVKATDKDNG VNAKVHYSII SGNVKGQFYI HSPTGVIDVI NPLDYEMIRE
YNLRIKAQDG GRPPLINGTG MVVVQVVDVN DNAPMFVSTP FQASVLENVP VGHSVIHIQA
IDADSGDNAL LEYRLTDTSP GFPFVINNST GWVTVCLELD REITEFYTFS VEARDHGLPG
MSSSASVSVT VLDVNDNVPT FTQKIYNLKI NEDAAVGASV LTVTAVDRDV NSVVTYQISS
GNTRNRFAIT SQSGGGLITL ALPLDYKQER QYVLTVTASD GTRFDNAQIF INVTDTNTHR
PVFQSANYQV LVSEDKPVGS TVVVISATDE DTGENARITY VMEDNVPQFK IHPDTGAITT
QIEIDYEDQA SYTLAIIARD NGIPQKSDTT YVEIIVLDAN DNAPQFLRDI YQGTVFEDAP
VYTSVLQVSA SDRDSGSNGR VSYTFQGGDD GEGDFFIEPY SGIIRTARKL DRENVALYNL
KAFAVDKGVP PLKATVDVQV TVLDINDNAP VFEKDELYIY VEENSAVGST LARISATDPD
EGTNAQILYQ IVEGNVPEVF QLDIFSGDLI ALSDLDYEAK MEYLIVVQAT SAPLVSRAIV
HVRLVDVNDN YPVLQDFEII FNNYVTNKSK SFPADVIGKV PAHDPDVSDK LLYMFVEGNE
LSLLILNQNT GELKLSKDLD NNRPLEATMR VTVSDGLHQV SALCTLRVTI ITDDMLTNSI
TVRLENMSQE HFLSPLLGLF AEGVAAVLST SREGVFVFNI QNDTDVGAPI LNVTFSAQQP
GGTLGRFFPT EELQEQMYLN RTLLRLISTQ HVLPFDDNIC LREPCENYMK CVSVLKFDSS
APFVASSTLL FRPIHPVNGL RCRCPDGFTG DYCETEIDLC YSSPCKNNGL CRSREGGYTC
ECLEDFTGEH CEVNSRSGRC VSGVCKNGGS CVDLLVGGFM CQCPEGEFEK PYCQMTTRSF
PGHAFVTFRG LRQRFHFTLS FMFATRERNA LLLYNGRFNE KHDFIAVEIV DEQIQLTFSA
GESKTTVSPF VAAGVSDGQW HTVWLHYYNK PNIGRLGLPH GPSEEKVAVV AVDDCDVALA
VRFGGQIGNY TCAARGTQTG KKKSLDLTGP LLLGGVPDLP EDFPIRNRDF VGCMRNLIID
SKPLDMATFI ANNGTAAGCP AKSDFCSRSV CHNGGVCVNK WNTHACSCPL GYGGKNCEHA
MTSPLHFDGN GMVSWSDPDI TIAIPWYIGL MFRTRKNAGT LLQASAGELS RFNLLISNRH
LRFQVFLGTR RMALLEFPQV RVNDGAWHHV LVELKSGKDG KDIKYMALVS LDYGMFQKTV
EIGNELPGLK VRGLWMGGLM KKDGSVLNGF NGCMQGVRMG ETSTNTVNIN VRQAQRVHVR
DGCDMANACV SNMCPVHSRC TDNWASHSCI CEPGYFGRDC VDACLLNPCE HISSCKRKPA
SKHGYTCECG RNYYGQYCEH RGDQPCARGW WGYPNCGPCN CDVNKGFKRD CNKTTGVCSC
KDNFYRPPGS DTCYPCECFH LGSQNRTCDL LTGQCPCKTG VVGRQCNRCD NPFAEVTASG
CVVVYDGCPK AFDVGIWWPK TMFGGPAAMN CPKGSSGTAV RHCSDEHGWL PPELFNCTSH
SFAKLRKEVE ELNGNSSRLD GERSKSLAAT LQSATKHTQH LYGSDVRTAY QLMSSILQRE
SLQQGFNLSA THDSSFNMNI VKASSSILDP ENKGHWEQIQ QSDGGVGFLL RQFEEYGTTL
VQNMRKTYLK PFTIVTDNMI VSVDFLDSST PDQSELPRFK DIYEVYSKEL ESSVHFPDMF
IKPPEYTDVA PTEPPVLPVS DPPSSFVGPE GTHSHNLTSP APSAKKRRHV ELPDLPSVGM
VIIYRSLGQL LPENYDPDRR SLRLPTRPVI NSPVVSVVAY KEGDSISAPL QRPITLTFRL
LETQERTKPV CVYWNHSILA SGSGAWSSKG CELIFRNSTH ISCQCNHMTS FAVLMDISKR
EHGDVLPLKV VTYTTVSASL VALLITFLLL AILRKLRSNL HSIHKNLVAA IFLSELIFLA
GINQTDNPFV CTVVAILLHY SYMCTFAWMF VEGLHIYRML TEMRNINQGH MRFYYAIGWG
IPAIITGLAV GLDPQGYGNP DFCWLSVYDT IIWSITGPII FVVLINITIF VLAAKASCGR
RQRTFEKSGA ISALRVAFLL LLLISATWLL GLMAVNSDVL TFHYLFAIIS CIQGICIFFF
HCILNKDVRK NLKSVFTGKK GAAEEPSTTR ATLLTRSLNG NTNIEDGCLY RTPIGESTVS
LESSVRSGKS HSSNYLAYKL RGEQKRSVPS SGRKSRSNEG DPSLYYRKSK RRADSDSDSE
LSVDEHSSSY ASSHSSDSEA EKRSSKPKWN NERSPVHSTP KVDSVSNHTK PYWPVEPPTA
SESEGTGGPE KLKVETKANV ELHETSKLSP NGELSQSEGP PSQPSSNQLP RRGILKNKII
YPPPLTDKNM KNRLREKLSD YNPPTPTIQC KSPSEGSNEG LCAVTENNGI VIKPPALTLT
TTLNGVTMGL GTGPALATEE TDSDGSNETS I
//