ID A0A267FGS0_9PLAT Unreviewed; 1772 AA.
AC A0A267FGS0;
DT 22-NOV-2017, integrated into UniProtKB/TrEMBL.
DT 22-NOV-2017, sequence version 1.
DT 27-MAR-2024, entry version 18.
DE RecName: Full=FRAS1-related extracellular matrix protein N-terminal domain-containing protein {ECO:0000259|Pfam:PF19309};
GN ORFNames=BOX15_Mlig031119g1 {ECO:0000313|EMBL:PAA72956.1};
OS Macrostomum lignano.
OC Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Platyhelminthes;
OC Rhabditophora; Macrostomorpha; Macrostomida; Macrostomidae; Macrostomum.
OX NCBI_TaxID=282301 {ECO:0000313|EMBL:PAA72956.1, ECO:0000313|Proteomes:UP000215902};
RN [1] {ECO:0000313|EMBL:PAA72956.1, ECO:0000313|Proteomes:UP000215902}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=DV1 {ECO:0000313|EMBL:PAA72956.1};
RC TISSUE=Whole organism {ECO:0000313|EMBL:PAA72956.1};
RA Berezikov E.;
RT "A platform for efficient transgenesis in Macrostomum lignano, a flatworm
RT model organism for stem cell research.";
RL Submitted (JUN-2017) to the EMBL/GenBank/DDBJ databases.
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:PAA72956.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; NIVC01001050; PAA72956.1; -; Genomic_DNA.
DR STRING; 282301.A0A267FGS0; -.
DR Proteomes; UP000215902; Unassembled WGS sequence.
DR InterPro; IPR039005; CSPG_rpt.
DR InterPro; IPR045658; FRAS1-rel_N.
DR PANTHER; PTHR45739:SF12; CHONDROITIN SULFATE PROTEOGLYCAN 4-LIKE ISOFORM X2; 1.
DR PANTHER; PTHR45739; MATRIX PROTEIN, PUTATIVE-RELATED; 1.
DR Pfam; PF16184; Cadherin_3; 6.
DR Pfam; PF19309; Frem_N; 2.
DR PROSITE; PS51854; CSPG; 4.
PE 4: Predicted;
KW Glycoprotein {ECO:0000256|ARBA:ARBA00023180};
KW Reference proteome {ECO:0000313|Proteomes:UP000215902};
KW Repeat {ECO:0000256|ARBA:ARBA00022737};
KW Signal {ECO:0000256|ARBA:ARBA00022729, ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..28
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 29..1772
FT /note="FRAS1-related extracellular matrix protein N-
FT terminal domain-containing protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5012492686"
FT DOMAIN 39..141
FT /note="FRAS1-related extracellular matrix protein N-
FT terminal"
FT /evidence="ECO:0000259|Pfam:PF19309"
FT DOMAIN 168..265
FT /note="FRAS1-related extracellular matrix protein N-
FT terminal"
FT /evidence="ECO:0000259|Pfam:PF19309"
FT REPEAT 846..937
FT /note="CSPG"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU01201"
FT REPEAT 1078..1172
FT /note="CSPG"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU01201"
FT REPEAT 1194..1294
FT /note="CSPG"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU01201"
FT REPEAT 1574..1667
FT /note="CSPG"
FT /evidence="ECO:0000256|PROSITE-ProRule:PRU01201"
FT REGION 336..358
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1772 AA; 197683 MW; E37835F4765838C3 CRC64;
MAATSMQHLL PVLLWLAVLI IDVTSCSSAT VVVPNASRPP IRVQQGKSVA LTAAHLLIRR
QPSGSGSLCR VEVDSSDPLT QTVGQMKPKV FDCYFTEGSV FYEHEGSPLR SHDSVRLLVY
SFQPGNRSEI APIRLAVEVQ LRSRRGSGQG AVGERSSSTA TYIVMPGPRP LTVPRFRGTS
APIDSNALNF VYDPLTTRCT VVYSDSVQAS SSLAFRDWPM VGRLVDDQST RVREFKHDCR
QFLLAGYRYE HQRPPNPDMD YIPLSVYLLR LDFNGSVDAT SDAIVTEKYY LPVHIDKAQK
SAPPQLQLRQ RSVSLAHTGG LPVVLPADLL LLPSASASST TSDSAPSTET DIRRSGDSSG
GITSGLVVMV TQIEPVLPFN INASFVDTRE PTAGQISGFQ PRDLREGRVA FQFATPTDTD
RVELRIRLRP IDAFFQDGPD VLLNLASTPN RQLQLFPRPL LAYTQATVRF NSRNLLCSGS
DCNLVSYRIR REPYSGRILM NNAPIEFFGP RTLNNPEEFY ITYTHNGNIP SEDSIELRTQ
VSKSSGTPIV FKFPIHVTRL WDSVSTLKSS NVNYRLGPGG QVCFGVEIID ESVLRTLKHD
CIDMRYEVVK RPEYGNLIRA LNPLSDWARS RSTSTFLTFA VSDLLQGKTI CYLQQRNLEP
GQTNVWDSMT LKQAGRPDFT PLLIRLRLHD KPRAPLINRS ARYLSAPTML ETDNNFVITR
DFLRYETEGL SPNQIVYEVL LHPYYAGTNI SIDAGRLLDK RKLERIFTTG DNVKAKVRIT
SFKEVGCLSQ FTQEDVDNGN VLYVPPLNDV GISNRTVNVY LQPYDTTWNK ASPHKLTFIV
LPVDNQEPKL EPQKLSVERG KRLTIKLNHL RPRDIDTKVD QLTLRLLKEP KHGKLYDGDV
SIGVGQWFKV GRVRSEKISY QHDGTTSSAD YFVLQVTDNN QNSPDYNIYL NITERQAWDY
GTKFYANNTI FVLEGGQTVL TTEMFPTLKS ETIDPRQISY LQGERPVKGE FELLGSSSRS
ISQFNHEDLR AGRIVYRHTT GEIGPRAIVD ICPMFVYSGS DWQTHRLTFS IQPVDNQPPS
ITDGAQVLVK EGDSTVLSGN VIKVRDADTE ADKLLLMVQD APKHGEIRLI NGSGATVNQF
SQQQLLAGEV AFLQNKNFGT EFTRDNFTVI ASDGERRSSP VTVQVYIYPV TDELPSIVGL
ADFAISKGES RVLNATYFSV TDADVPEEEV IVRIRRLPSV GTLFQDWFTA RISKKVTETS
NRQFTKRNLN YMRLVYHQDL DVTSLNDSFV VEASDPVYRI SKTCRIAIAT RNEQPPVFSS
PRSQLQLYYG EKVQLTENHL LVRDPDTPKE AISISVVSLP ANCELLRLEP SRSSVILSGR
SRDRLVALRL GESFSQRDVQ QSRMFVRASR LSSSSAASPA TVELRLMAGD GRFSTDAANL
MLRLVPKPAE VLELFLIRTS RLKAVTQRVT KLTQEEISVY YRDRVQNSAS FQVIPQQVSD
SQPCYSFELE GDPVVTDNET VFLMSDVLDG RVLFNATNCS KSKEDILFEV RSNWIGKSAS
GMLPVDLIRL DVAPPTLDHV GTLSLRAHEA VAIDQTVLRT SDADTPPQEI LYRIKNAPKQ
GLLVIGNSQL GQEDSFTQAD VNNGTVLWYR SLIEPPTGLD KLKFTVEDMN KTRGFCYKSA
IPRSSPVTLS IRVAPSTRNP IELVRVKSPS ELMSFGNGGR FGFRIKNDSL LAKPAMVLFE
LASQPGLGIL KLGESEIPVG GVSLSVTSIP DE
//