ID G3X1N9_SARHA Unreviewed; 1785 AA.
AC G3X1N9;
DT 16-NOV-2011, integrated into UniProtKB/TrEMBL.
DT 07-APR-2021, sequence version 2.
DT 27-MAR-2024, entry version 68.
DE SubName: Full=Collagen type XIV alpha 1 chain {ECO:0000313|Ensembl:ENSSHAP00000021594.2};
GN Name=COL14A1 {ECO:0000313|Ensembl:ENSSHAP00000021594.2};
OS Sarcophilus harrisii (Tasmanian devil) (Sarcophilus laniarius).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Metatheria; Dasyuromorphia; Dasyuridae; Sarcophilus.
OX NCBI_TaxID=9305 {ECO:0000313|Ensembl:ENSSHAP00000021594.2, ECO:0000313|Proteomes:UP000007648};
RN [1] {ECO:0000313|Ensembl:ENSSHAP00000021594.2, ECO:0000313|Proteomes:UP000007648}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=21709235; DOI=10.1073/pnas.1102838108;
RA Miller W., Hayes V.M., Ratan A., Petersen D.C., Wittekindt N.E., Miller J.,
RA Walenz B., Knight J., Qi J., Zhao F., Wang Q., Bedoya-Reina O.C.,
RA Katiyar N., Tomsho L.P., Kasson L.M., Hardie R.A., Woodbridge P.,
RA Tindall E.A., Bertelsen M.F., Dixon D., Pyecroft S., Helgen K.M.,
RA Lesk A.M., Pringle T.H., Patterson N., Zhang Y., Kreiss A., Woods G.M.,
RA Jones M.E., Schuster S.C.;
RT "Genetic diversity and population structure of the endangered marsupial
RT Sarcophilus harrisii (Tasmanian devil).";
RL Proc. Natl. Acad. Sci. U.S.A. 108:12348-12353(2011).
RN [2] {ECO:0000313|Ensembl:ENSSHAP00000021594.2}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR Ensembl; ENSSHAT00000021768.2; ENSSHAP00000021594.2; ENSSHAG00000018289.2.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000153769; -.
DR HOGENOM; CLU_002527_2_0_1; -.
DR InParanoid; G3X1N9; -.
DR TreeFam; TF329914; -.
DR Proteomes; UP000007648; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0005614; C:interstitial matrix; IEA:Ensembl.
DR GO; GO:0030199; P:collagen fibril organization; IEA:Ensembl.
DR GO; GO:0048873; P:homeostasis of number of cells within a tissue; IEA:Ensembl.
DR GO; GO:0061050; P:regulation of cell growth involved in cardiac muscle cell development; IEA:Ensembl.
DR GO; GO:0003229; P:ventricular cardiac muscle tissue development; IEA:Ensembl.
DR CDD; cd00063; FN3; 7.
DR CDD; cd01482; vWA_collagen_alphaI-XII-like; 2.
DR Gene3D; 2.60.120.200; -; 1.
DR Gene3D; 2.60.40.10; Immunoglobulins; 8.
DR Gene3D; 3.40.50.410; von Willebrand factor, type A domain; 2.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR003961; FN3_dom.
DR InterPro; IPR036116; FN3_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR048287; TSPN-like_N.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR PANTHER; PTHR24020; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24020:SF15; COLLAGEN ALPHA-1(XIV) CHAIN; 1.
DR Pfam; PF01391; Collagen; 4.
DR Pfam; PF00041; fn3; 7.
DR Pfam; PF00092; VWA; 2.
DR PRINTS; PR00453; VWFADOMAIN.
DR SMART; SM00060; FN3; 8.
DR SMART; SM00210; TSPN; 1.
DR SMART; SM00327; VWA; 2.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
DR SUPFAM; SSF49265; Fibronectin type III; 7.
DR SUPFAM; SSF53300; vWA-like; 2.
DR PROSITE; PS50853; FN3; 7.
DR PROSITE; PS50234; VWFA; 2.
PE 4: Predicted;
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Reference proteome {ECO:0000313|Proteomes:UP000007648};
KW Secreted {ECO:0000256|ARBA:ARBA00022525};
KW Signal {ECO:0000256|ARBA:ARBA00022729, ECO:0000256|SAM:SignalP}.
FT SIGNAL 1..28
FT /evidence="ECO:0000256|SAM:SignalP"
FT CHAIN 29..1785
FT /evidence="ECO:0000256|SAM:SignalP"
FT /id="PRO_5029639644"
FT DOMAIN 32..122
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 159..331
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 356..445
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 446..537
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 538..628
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 629..716
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 739..831
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 833..923
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 1037..1210
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT REGION 126..148
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1461..1614
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1647..1785
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1565..1580
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1663..1689
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1738..1752
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1785 AA; 191726 MW; 8E0D282A128F4C97 CRC64;
MKTFQCKIWF CLLPALGVLV DFSSYVEGQV APPTRLRYSI LSHDSIQISW KAPKGKFSGY
KLLVTPNSGE KTNQLTLQNS ANKAIIQGLL PDEEYIVQLI AFDKDKESKP AQGQFRIKDL
EKRKEPKPKV KVVDKGNGSK PSPPAEENKF ICKTPAIADI VILVDGSWSI GRFNFRLVRL
FLENLVTAFD VGENKTRIGV AQYSGDPRIE WHLNTFSTKD AVIDAVRNLP YKGGNTLTGL
ALNYIFENSF KPEAGARIGV SKIGILITDG KSQDDIIPPS RTLRDSGVEL FAIGVKNADV
AELQEIASEP DSTHVYNVAE FDLMHTVVEG LTKTVCARVE EQEKEIKGSL VTDIGAPTDL
ITSEVTARSF RVTWTPAPGK VEKYRVVYYP TRGGKPEEVV VDGTESTAVL KGLMSLTEYQ
IAVFAIYSTT ASEGLRGTEM TLALPMVSDL ELYDVTENSL RARWNAVPGA SGYLILYAPL
TEGLAGDEKE MKIGETLTDI QLKGLLPNTE YTVTVYAMFG EEASDPVTGQ EMTLFLSPPK
NLRISNVGAN TARITWDPAS KNVNSYLISY AKTSGTETNE VEVDRVSTYS LKGLTALTDY
TVAISSIYEE GQSEPLTGSF TTKQIPAQQY LEVDEESQDS FRVSWKPLSA DVARQKLMWI
PVYGGKVEEV VLNEDEDTYI IEGLQPGTEY EVSLLAILSD ESETEVLTAV GTTLDAPSTV
STVTTVSPTG PVTSVFRTGI RNLVVDDEST SSLRVKWDIS DNRVEHFRVT YLTAKGDHAE
EVIGTVMVPG KQNQLLLKPL LSDTEYKVTV TPIYSDGEGV SVSAPGKTLP LSGPQNLRVS
DEWYNRLRIT WDPPSSPPKG YRIVYKPISV PGQALETFVG NDINTILIRN LASGTEYNVK
VFASYPSGFS DALTSEVKTL FLGVTDLEAE RIQTNSLCLK WKLHNHATAY RVAIENLKDG
KKQETTMGGG TSSHCFYGLM PDSEHKISVY TQLQEITGPS VSIMEKTRPI PTPPPATTTP
LPTIPPAKEV CKAAKADLVF MVDGSWSIGD DNFNKIIKFL YNTVGALDKI GADGTQVSIV
QFTDDPRTEF KLNTYKTKDT LLDGIKNLSY KGGNTKTGKA LKHVRDALFT AEGGTRRGIP
KVIVVITDGR SQDDVNKISK ELQLEGISIF AIGVADADYG ELLSIGSQPS ARHVFFVDDF
DAFKKIEDEL ITFVCETASA TCPLLNKDGN NLAGFKMMEL FGLVEKDFSA VDGVSMGPGT
FNVYPCYQLH KDALVSQPTK YLHPEGLPSD YTISFLFRIL PDTPQEPFAL WEILNKDSDP
LVGVILDNGG KTITYFNYDY QGDFQTVTFE GPEIKKIFYG SFHKLHIVIS KTLAKVFIDC
KQVGEKAVNA SGNITANGME VLGRMVRSRG PKENSAPFQL QMFDIVCSTS WANRDKCCEL
PGLRDEENCP ALPHSCSCSE ISKGASGPAG PPGGPGLRGP KGQRGDQGPK GPDGPRGETG
SPGPQGPPGP QGPSGLSIQG MPGAPGEKGE KGDLGFPGPQ GIPGNTGSPG RDGSQGHRGL
PGKDGPTGPP GPPGPIGIPG APGVPGVTGS TGPQGDLGPP GLPGAKGERG ERGDLQSHAM
VRSVARQVCE QLIQSHMARY TSILNQIPSH SSSIRTIPGP PGEPGRQGSP GTPGEPGPPG
RPGFPGNPGQ PGTPGERGLS GVKGEKGNPG TGTQGPRGPP GPAGPVGEGR TGSPGPQGSP
GPRGPPGHLG VPGPQGPSGQ PGYCDPSSCS SYGVGDVIPY NDYQH
//