ID K7G400_PELSI Unreviewed; 1173 AA.
AC K7G400;
DT 09-JAN-2013, integrated into UniProtKB/TrEMBL.
DT 09-JAN-2013, sequence version 1.
DT 27-MAR-2024, entry version 64.
DE SubName: Full=Collagen type VII alpha 1 chain {ECO:0000313|Ensembl:ENSPSIP00000015011.1};
GN Name=COL7A1 {ECO:0000313|Ensembl:ENSPSIP00000015011.1};
OS Pelodiscus sinensis (Chinese softshell turtle) (Trionyx sinensis).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archelosauria; Testudinata; Testudines; Cryptodira; Trionychia;
OC Trionychidae; Pelodiscus.
OX NCBI_TaxID=13735 {ECO:0000313|Ensembl:ENSPSIP00000015011.1, ECO:0000313|Proteomes:UP000007267};
RN [1] {ECO:0000313|Proteomes:UP000007267}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Daiwa-1 {ECO:0000313|Proteomes:UP000007267};
RG Soft-shell Turtle Genome Consortium;
RL Submitted (OCT-2011) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Proteomes:UP000007267}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Daiwa-1 {ECO:0000313|Proteomes:UP000007267};
RX PubMed=23624526; DOI=10.1038/ng.2615;
RA Wang Z., Pascual-Anaya J., Zadissa A., Li W., Niimura Y., Huang Z., Li C.,
RA White S., Xiong Z., Fang D., Wang B., Ming Y., Chen Y., Zheng Y.,
RA Kuraku S., Pignatelli M., Herrero J., Beal K., Nozawa M., Li Q., Wang J.,
RA Zhang H., Yu L., Shigenobu S., Wang J., Liu J., Flicek P., Searle S.,
RA Wang J., Kuratani S., Yin Y., Aken B., Zhang G., Irie N.;
RT "The draft genomes of soft-shell turtle and green sea turtle yield insights
RT into the development and evolution of the turtle-specific body plan.";
RL Nat. Genet. 45:701-706(2013).
RN [3] {ECO:0000313|Ensembl:ENSPSIP00000015011.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AGCU01138579; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01138580; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01138581; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01138582; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01138583; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01138584; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AGCU01138585; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR AlphaFoldDB; K7G400; -.
DR Ensembl; ENSPSIT00000015082.1; ENSPSIP00000015011.1; ENSPSIG00000013422.1.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000154865; -.
DR HOGENOM; CLU_001074_2_3_1; -.
DR OMA; TGFKGQM; -.
DR Proteomes; UP000007267; Unassembled WGS sequence.
DR GO; GO:0004867; F:serine-type endopeptidase inhibitor activity; IEA:InterPro.
DR GO; GO:0007155; P:cell adhesion; IEA:UniProtKB-KW.
DR GO; GO:0035987; P:endodermal cell differentiation; IEA:Ensembl.
DR CDD; cd22627; Kunitz_collagen_alpha1_VII; 1.
DR Gene3D; 4.10.410.10; Pancreatic trypsin inhibitor Kunitz domain; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR002223; Kunitz_BPTI.
DR InterPro; IPR036880; Kunitz_BPTI_sf.
DR InterPro; IPR020901; Prtase_inh_Kunz-CS.
DR PANTHER; PTHR37456:SF6; COLLAGEN ALPHA-1(XXIII) CHAIN ISOFORM X1-RELATED; 1.
DR PANTHER; PTHR37456; SI:CH211-266K2.1; 1.
DR Pfam; PF01391; Collagen; 13.
DR Pfam; PF00014; Kunitz_BPTI; 1.
DR PRINTS; PR00759; BASICPTASE.
DR SMART; SM00131; KU; 1.
DR SUPFAM; SSF57362; BPTI-like; 1.
DR PROSITE; PS00280; BPTI_KUNITZ_1; 1.
DR PROSITE; PS50279; BPTI_KUNITZ_2; 1.
PE 4: Predicted;
KW Cell adhesion {ECO:0000256|ARBA:ARBA00022889};
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Reference proteome {ECO:0000313|Proteomes:UP000007267}.
FT DOMAIN 1100..1150
FT /note="BPTI/Kunitz inhibitor"
FT /evidence="ECO:0000259|PROSITE:PS50279"
FT REGION 12..149
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 165..998
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 54..84
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 265..285
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 325..354
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 434..448
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 541..560
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 688..702
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 750..781
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 845..859
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1173 AA; 114993 MW; 202424B63E53FF02 CRC64;
MCLSPDLLLL QGLPGLRGEQ GPPGAVGPPG PPGLTGKPGD DGKPGPNGKN GEDGIPGEDG
RKGDKGEAGA PGRDGQEGPK GERGDRGAPG PLGPPGVPGQ VGIPGQGAPG LTGAAGQKGD
RGEPGSKGDQ GRPGDPGLRG EPGTVSNVER ALESYGIKIS SLREITGAFD GSTDPFLPYP
DRWRGPKGDS GDPGERGSPG KEGAMGFPGE KGPKGDKGDQ GPAGPQGPTG RAIGERGPEG
PPGQAGEPGK PGIPGVPGRA GELGEAGRPG EKGDRGEKGD RGEPGRDGVQ GPPGPPGPKA
DVVEGSLSGF PGERGPPGSK GAKGESGVDG ERGPKGDKGE AGQKGERGEA GEKGRDGSPG
LPGERGLAGP EGKPGMPGFP GVLGRPGNQG DPGPPGPAGL AGPPGAQGPS GIKGDPGEPG
SSIRGLPGPQ GNVGLPGPSG PPGLVGPQGA PGLPGQVGET GKPGVPGRDG VPGTQGEPGL
PGKIGVPGPV GPAGSKGEPG DAGLRGQAIV GPPGMKGEKG APGDIDGNLL GEPGAKGERG
LPGPRGEKGE SGRQGEPGDP GEDGAKGAPG VKGEKGSVGI GLQGPPGQDG PQGLKGDTGL
PGPPGSPGLP GTAGTPGQPG LRAENGQPGP PGPPGERGLI GFPGRDGTSG SPGPLGPPGP
AGAPGTPGLK GDKGDVGVGQ PGPRGERGDP GQRGEDGRPG LEGDRGPAGP PGNRGERGDK
GDVGALGLKG DKGDTVIVEG PAGVRGSKGE PGDRGLKGTE GEKGDKGDPG FPGEKGGKGE
QGEKGSAGFP GARGPGGQKG EVGESGDSGA PGLPGKDGIP GLRGEKGDMG PLGVRGPKGE
RGTKGACGQD GDKGEKGEPG IPGRMGLPGR KGELGELGMP GTPGIPGKEG LMGPKGDRGF
NGQQGAKGDQ GEKGDRGAPG VIGSPGARGN DGAPGPPGPP GSVGPKGPEG IQGQKGERGP
SGESTVGARG VPGMPGERGD QGSPGLEGPR GEKGEPSLTE QEIRSFVRQE MSQHCACGGQ
FSSEPRLLPN YPSTQPLPSA NAQLVPVLKL SHAEEEDGRE VHVVHTNDPE YEHVYAMEDY
EESLEADGTE STMLRDADPC SSPLDEGDCN RFTLRWYYNQ KVAECRPFIY SGCRGNLNRF
YTKEDCDLQC RHQADADAQI AADNTSAEKR LNA
//