ID G1SM53_RABIT Unreviewed; 1333 AA.
AC G1SM53;
DT 19-OCT-2011, integrated into UniProtKB/TrEMBL.
DT 11-DEC-2019, sequence version 3.
DT 27-MAR-2024, entry version 70.
DE RecName: Full=Thrombospondin-like N-terminal domain-containing protein {ECO:0000259|SMART:SM00210};
OS Oryctolagus cuniculus (Rabbit).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Glires; Lagomorpha; Leporidae; Oryctolagus.
OX NCBI_TaxID=9986 {ECO:0000313|Ensembl:ENSOCUP00000003979.4, ECO:0000313|Proteomes:UP000001811};
RN [1] {ECO:0000313|Ensembl:ENSOCUP00000003979.4, ECO:0000313|Proteomes:UP000001811}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Thorbecke {ECO:0000313|Proteomes:UP000001811};
RX PubMed=21993624; DOI=10.1038/nature10530;
RA Lindblad-Toh K., Garber M., Zuk O., Lin M.F., Parker B.J., Washietl S.,
RA Kheradpour P., Ernst J., Jordan G., Mauceli E., Ward L.D., Lowe C.B.,
RA Holloway A.K., Clamp M., Gnerre S., Alfoldi J., Beal K., Chang J.,
RA Clawson H., Cuff J., Di Palma F., Fitzgerald S., Flicek P., Guttman M.,
RA Hubisz M.J., Jaffe D.B., Jungreis I., Kent W.J., Kostka D., Lara M.,
RA Martins A.L., Massingham T., Moltke I., Raney B.J., Rasmussen M.D.,
RA Robinson J., Stark A., Vilella A.J., Wen J., Xie X., Zody M.C., Baldwin J.,
RA Bloom T., Chin C.W., Heiman D., Nicol R., Nusbaum C., Young S.,
RA Wilkinson J., Worley K.C., Kovar C.L., Muzny D.M., Gibbs R.A., Cree A.,
RA Dihn H.H., Fowler G., Jhangiani S., Joshi V., Lee S., Lewis L.R.,
RA Nazareth L.V., Okwuonu G., Santibanez J., Warren W.C., Mardis E.R.,
RA Weinstock G.M., Wilson R.K., Delehaunty K., Dooling D., Fronik C.,
RA Fulton L., Fulton B., Graves T., Minx P., Sodergren E., Birney E.,
RA Margulies E.H., Herrero J., Green E.D., Haussler D., Siepel A., Goldman N.,
RA Pollard K.S., Pedersen J.S., Lander E.S., Kellis M.;
RT "A high-resolution map of human evolutionary constraint using 29 mammals.";
RL Nature 478:476-482(2011).
RN [2] {ECO:0000313|Ensembl:ENSOCUP00000003979.4}
RP IDENTIFICATION.
RC STRAIN=Thorbecke {ECO:0000313|Ensembl:ENSOCUP00000003979.4};
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR SMR; G1SM53; -.
DR STRING; 9986.ENSOCUP00000003979; -.
DR PaxDb; 9986-ENSOCUP00000003979; -.
DR Ensembl; ENSOCUT00000004612.4; ENSOCUP00000003979.4; ENSOCUG00000004590.4.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000167622; -.
DR HOGENOM; CLU_001074_19_1_1; -.
DR InParanoid; G1SM53; -.
DR TreeFam; TF344135; -.
DR Proteomes; UP000001811; Unplaced.
DR Bgee; ENSOCUG00000004590; Expressed in uterus and 14 other cell types or tissues.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR Gene3D; 2.60.120.200; -; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR048287; TSPN-like_N.
DR PANTHER; PTHR24023:SF1019; COLLAGEN; 1.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR Pfam; PF01391; Collagen; 3.
DR PRINTS; PR01217; PRICHEXTENSN.
DR SMART; SM00210; TSPN; 1.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
DR PROSITE; PS51257; PROKAR_LIPOPROTEIN; 1.
PE 4: Predicted;
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Reference proteome {ECO:0000313|Proteomes:UP000001811};
KW Signal {ECO:0000256|ARBA:ARBA00022729, ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..37
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 38..1333
FT /note="Thrombospondin-like N-terminal domain-containing
FT protein"
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5023916089"
FT DOMAIN 43..223
FT /note="Thrombospondin-like N-terminal"
FT /evidence="ECO:0000259|SMART:SM00210"
FT REGION 274..598
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 629..763
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 854..1309
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 363..386
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 387..403
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 404..419
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 434..454
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 520..541
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 629..672
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1045..1062
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1121..1144
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1211..1230
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1333 AA; 134023 MW; E22BFDD4EA9CA288 CRC64;
MPLGARGRTD RPASCLGTQG FLVAWILVSF ACHLASAQGA PEDVDVLQRL GLSWTKVGAG
RSPAPSGVIP FQSGFIFTQQ ARLQAPTAAV IPAALGTELA LVLSLCSHRV NHAFLFAVRS
RKHKLQLGLQ FLPGKTVVHL GPRRSVAFDL DVHDGRWHHL ALELRGRTVT LVTACGQRRL
PVPLPFRRDP ALDPGGAFLL GKMSPRSVQF EGALCQFSIH PVSQVAHNYC SHLRKQCVQA
DVYRPQLGPL FLRDSARTFA FHTDLALPGM ENLTTATPAL GPRPAGRGPK VTVAPATPPK
PLRTSSIDLG KHVTAGGPAW TTPPPAKQWA SRALPSAPPA PRASVAGSTR ASRPATAQPL
PHITAPRTPP SLSSKPSPPS SVPPTKSPRS APKTALPSPM QSTPPAQKLA PPTSHPLPAR
VPRPSEKSMQ RAPVTPRPPT PSTRPLPPTA GSSTKPSPPG AQTEPRMPSR ASKPAPALTS
THKPPQPTAS PPSPSSGSGY TRVPGPPATA VPPTSGIRIP RTTLPPTVTP SPASAGSKKP
TGSEASKKAG PKSSSPEPDP FRPGKAARDA PLNDPSTRPG PRQPQPSRQT TPALAMTPAL
AVAPARLLSS PSRDYPFFHL AGPTPFPLLM GPPGPKGDCG FPGPPGLPGL PGPPGARGPR
GPPGPYGNPG LPGPPGAKGQ KGDPGLSPGK AHDGAKGDVG LPGLVGNPGP LGRKGYKGYP
GPAGHPGEQG QPGPEGSPGA KGYPGRQGLP GPVGDPGPKG SRGYIGLPGL FGLPGSDGER
GLPGVPGKRG KMGRPGFPGD FGERGPPGLD GNPGELGLPG PPGVPGLIGD MGALGPIGYP
GPKGVKGLMG SVGEPGLKGD KGEQGLPGVS GDPGFQGDKG SQGLPGFPGA RGKPGPLGRV
GDKGSLGFPG PPGPEGFPGD IGPPGDNGPE GMKGKPGARG LPGPPGQLGP EGDEGPMGPP
GVPGLEGQPG RKGFPGRPGP DGLKGEPGNP GRPGPVGEQG LMGFIGLVGE PGIVGEKGDR
GVMGPPGAPG PKGAMGHPGT PGGVGTPGEP GPPGPPGSRG PPGPRGTKGR RGPRGPDGPT
GEQGSRGLKG PPGPQGRPGQ PGQQGAAGER GYSGARGFPG IPGPSGPPGT KGLPGEPGPQ
GPQGPVGPLG EMGPKGPPGA VGEPGLPGEA GMKGDLGPLG TPGEQGLVGQ RGEPGLEGDS
GPMGPDGLKG DRGDPGPDGE RGEKGQEGLK GEEGPPGPPG ITGVRGPEGK AGKQGEKGRD
RGQGCQGLPR TAGRDGCPWR PWSPWHARPQ RLPGQPGTNG CSRTDGGPRR ARTAWLRWTQ
RHCGTPRTSW TQR
//