ID H3B9J1_LATCH Unreviewed; 2976 AA.
AC H3B9J1;
DT 18-APR-2012, integrated into UniProtKB/TrEMBL.
DT 18-APR-2012, sequence version 1.
DT 27-MAR-2024, entry version 70.
DE RecName: Full=Collagen type VII alpha 1 chain {ECO:0008006|Google:ProtNLM};
GN Name=COL7A1 {ECO:0000313|Ensembl:ENSLACP00000018562.1};
OS Latimeria chalumnae (Coelacanth).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Coelacanthiformes; Coelacanthidae; Latimeria.
OX NCBI_TaxID=7897 {ECO:0000313|Ensembl:ENSLACP00000018562.1, ECO:0000313|Proteomes:UP000008672};
RN [1] {ECO:0000313|Proteomes:UP000008672}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Wild caught {ECO:0000313|Proteomes:UP000008672};
RA Di Palma F., Alfoldi J., Johnson J., Berlin A., Gnerre S., Jaffe D.,
RA MacCallum I., Young S., Walker B.J., Lander E., Lindblad-Toh K.;
RT "The draft genome of Latimeria chalumnae.";
RL Submitted (AUG-2011) to the EMBL/GenBank/DDBJ databases.
RN [2] {ECO:0000313|Ensembl:ENSLACP00000018562.1}
RP IDENTIFICATION.
RG Ensembl;
RL Submitted (NOV-2023) to UniProtKB.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AFYH01040458; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01040459; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01040460; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01040461; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01040462; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01040463; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01040464; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01040465; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01040466; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AFYH01040467; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR STRING; 7897.ENSLACP00000018562; -.
DR Ensembl; ENSLACT00000018695.1; ENSLACP00000018562.1; ENSLACG00000016343.1.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000163668; -.
DR HOGENOM; CLU_000510_0_0_1; -.
DR InParanoid; H3B9J1; -.
DR OMA; QTFFAVD; -.
DR TreeFam; TF351645; -.
DR Proteomes; UP000008672; Unassembled WGS sequence.
DR Bgee; ENSLACG00000016343; Expressed in mesonephros and 1 other cell type or tissue.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0004867; F:serine-type endopeptidase inhibitor activity; IEA:InterPro.
DR GO; GO:0007155; P:cell adhesion; IEA:UniProtKB-KW.
DR CDD; cd00063; FN3; 9.
DR CDD; cd22627; Kunitz_collagen_alpha1_VII; 1.
DR CDD; cd01450; vWFA_subfamily_ECM; 1.
DR Gene3D; 2.60.40.10; Immunoglobulins; 9.
DR Gene3D; 4.10.410.10; Pancreatic trypsin inhibitor Kunitz domain; 1.
DR Gene3D; 3.40.50.410; von Willebrand factor, type A domain; 2.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR003961; FN3_dom.
DR InterPro; IPR036116; FN3_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR002223; Kunitz_BPTI.
DR InterPro; IPR036880; Kunitz_BPTI_sf.
DR InterPro; IPR020901; Prtase_inh_Kunz-CS.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1071; COLLAGEN ALPHA-1(VII) CHAIN; 1.
DR Pfam; PF01391; Collagen; 16.
DR Pfam; PF00041; fn3; 8.
DR Pfam; PF00014; Kunitz_BPTI; 1.
DR Pfam; PF00092; VWA; 2.
DR PRINTS; PR00759; BASICPTASE.
DR PRINTS; PR00453; VWFADOMAIN.
DR SMART; SM00060; FN3; 9.
DR SMART; SM00327; VWA; 2.
DR SUPFAM; SSF57362; BPTI-like; 1.
DR SUPFAM; SSF49265; Fibronectin type III; 5.
DR SUPFAM; SSF53300; vWA-like; 2.
DR PROSITE; PS00280; BPTI_KUNITZ_1; 1.
DR PROSITE; PS50279; BPTI_KUNITZ_2; 1.
DR PROSITE; PS50853; FN3; 9.
DR PROSITE; PS50234; VWFA; 2.
PE 4: Predicted;
KW Cell adhesion {ECO:0000256|ARBA:ARBA00022889};
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Disulfide bond {ECO:0000256|ARBA:ARBA00023157};
KW Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW Reference proteome {ECO:0000313|Proteomes:UP000008672};
KW Secreted {ECO:0000256|ARBA:ARBA00022525}.
FT DOMAIN 8..181
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 204..299
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 300..387
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 388..479
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 482..570
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 579..665
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 669..757
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 761..849
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 854..942
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 943..1032
FT /note="Fibronectin type-III"
FT /evidence="ECO:0000259|PROSITE:PS50853"
FT DOMAIN 1036..1213
FT /note="VWFA"
FT /evidence="ECO:0000259|PROSITE:PS50234"
FT DOMAIN 2911..2961
FT /note="BPTI/Kunitz inhibitor"
FT /evidence="ECO:0000259|PROSITE:PS50279"
FT REGION 1239..1774
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1803..1952
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1984..2858
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1249..1263
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1460..1477
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1695..1709
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1858..1885
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1895..1914
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1992..2022
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2075..2095
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2208..2222
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2249..2263
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2361..2378
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2565..2579
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2639..2653
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2660..2675
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2823..2858
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 2976 AA; 303493 MW; 0A020CDDFE1A276B CRC64;
CENVVAGDIV FLVDGSSSIG RANFQQIKTF MEGVVAPFVS AISETGVRFG TVQYSDDSRV
EFTFQDYTNG TELMSAVRNL RYKGGNTRTG IGLKYVADNF FGPTIIRRDV PKITILITDG
KSQDNVDDPS TKLKSQDIKV FSVGIKNADK TELANVASEP SEDYSFYVND FKLLRTLLPL
VSRNVCSGIG GVLAADPASE AYTGPSGLVF TEESYNSLRI SWNAAGGPVT GYRVQYVPLG
GLDRPITAEL REISVRPADT VASLTGLKSA TDYQVTVIAQ HANSVGESVS GKGRTMRLEG
VQRLLGREVT SQSAQLTWSS VQGATGYRLA WGPIAGRGIR KVDVEGNKDS YLLQNLRPET
EYIVTLTALY GTSEGPAATA RFKTEAEESQ VLQTFSISPN SIQVSWKVIR EARGYRLEWR
RAAEGAPLQT ISLPTSTNSY DITGLQPGTE YFITLFTLYE DREIATPAST SQTGKVEPAA
GTVSNLRVID TRGNRVQLGW IGVPGATEYK ITIRNSEDGS EVTRRIPGSQ NRFELEELKE
RVTYIIRVAA LIGSQEGAAA SLSVQKLQNR PQQEISVSGV TDLRVIGTET NRIRIAWTGV
AGATGYKVTW RNTVGREISR VVAGDVTSYN IDRLRSNSPY TIGVTTLIGS REASPVIITA
RTEDASVGRV RDLQVTDAGS TRLRIAWTRV ARATGYRISW RRSDGAEVSR VISGDVTFME
LEGLQPDTAY AIRVSALVGS REGSATVVST RTGSGQVEVG SVRELQLLES RSNIARVTWV
GVPGATAYRI VWSRVDGGPE NSRVVPGNTS SIDIGDLMGG ASYLVKVIAL VGKQEGNPVS
IRVTTPPEVP VLNRVGNLRV LESSERRVRI AWTGVAGNTG YRIYWRTEDG GPETSRLIGG
NVNSFVIDNL EPGRHYIIRV AAISENREAD PVVLRANTAS LRPVSGFRVM EVNRNDVVVS
WAPVTGATSY VLRWRLTAER DDQQRIPLPG TASSYRVTGL RLGEQYTFTL QPVYGSEAGP
ESSVSERTVC RGVRVDIVFV VHGTQDNAYN ADRVRQFLFN VASAIPEMGP AETQVGVVVY
SYRGRTWTLL DRNRDLNAIL QQLRTIPFDE PSGTAIGGAI KFARDYLFDP SAGRRQGIPG
VMVVVADGPS SDDVIAPARE AKAAGIHILA VGMEEADQNQ LLRMVTDENP NNVFFSRDVN
ELYRLEGNVA EALCGIASGS GVGPDRCVVQ CPKGEKGEAG LMGRDGIPGP VGPPGPPGAP
GLPGRGELPG IPVKGEKGER GFSGADGVPG VPGRPGNTGS SGRPGNPGLP GVRGDPGEPG
GVGPKGDQGV PGEPGTVVSG GGIPGRKGEP GAQGYPGSAG PVGPRGPAGS LGTPGPPGLP
GLQGSPGVSI KGEKGEAGER GLPGFASGVL QKGQQGEPGR PGEAGSPGPR GPQGPPGQQG
EKGDVGEGVL GLPGRTGDPG ERGPRGHRGE QGVKGDRGEA GETGAQGSRG ERGPSGPTGQ
KGEPGQSGPV GPPGRTGVPG PVGPRGEKGD QGSPGEPGKS VPGADGKKGD KGNQGVLGPA
GPKGGKGDPA EKGEKGSVGY GVPGQSGPKG ELGERGNVGL TGKPGPKGSQ GEPGEKGEQG
KPGPPGQIGL RGKEGEKGEK GEDGTPGESG LPGKTGERGL RGLQGVSGRP GEKGDLGDPG
ENGRDGKPGP SGPRGDRGPQ GPPGPAGPPG TAVSVDQGVT IKGDKGDPGD PGESGFKGQR
GEQGSPGVPG ERGLEGQRGQ PGLRGDTGDR GSTGEKVRYA SIGFHGNAVI PVFEVVHCPV
SLEGDQGKPG DPGRDGLPGL RGVQGLTGPI GPPGPQGLPG KPGDDGKPGL SGKNGEDGTP
GEDGRKGEKG EVGVPGREGR NGVEGDRGEP GQTGPVGPPG LPGVPGPSGP PGQGIPGVPG
AAGIRGERGQ TGSQGEKGDP GDSGRPGLPG TAVNMEDALA TYGIKMTVLQ DAVQKYNRVL
EGTSVMEPGA RGERGAKGEK GDPGERGPSG REGERGFPGE RGLKGNKGES GQPGPVGPPG
RAIGERGPEG PPGQSGEPGK PGIPGVPGRV GEIGEAGRPG DKGDKGDKGD RGEAGEGVDL
SLPGLPGERG LPGFKGVKVS LRLRRGEVSP PGESGVKGEA GSKGDKCISG GNAGPKGDRG
ERGEAGEKGR DGAPGVAGTP GLQGQDGKPG QQGARGYPGS PGLPGFPGIL GRPGPPGPQG
PQGSPGVKGN QGEPGFGIVG SPGPQGNIGL PGLPGPPGPV GPQGSSGLPG QSGESGKPGV
PGRDGLPGKD GLNGLSGKMG LQGPLGPAGL KGEQGDAGPP GKMVVGPPGA KGEKGAPGEL
MAGGTGEQGL PGPQGEKGER GPVGPRAEKG EPGEPGDPGE DGVKGAPGLK GNKGEAGVGL
RGPPGQPGPQ GLKGDQGFPG PPGPPGPQGV AGTPGLPGLR GENGLPGAPG PAGERGFVGL
PGRDGSPGAQ GPPGPPGVTG EPTGEASSGE KGDPGVGQPG LRGERGDPGP RGEDGRPGLP
GERGPSGAAG SRGERGDKGD VGSIGLKGDK GDTVTVTGET GMKGSRGEPG ERGLKGTQGE
KGGKGDQGPV GERGLKGEQG DKGSIGFPGA RGPDGQKGEV GQSGTPGEPG LPGKEGFPGM
RGEKGERGIP GFRGPKGDRG FKGACGKDGM KGEKGEPGIP GRSGLPGRKG ETGEAGAPGN
AGTPGREGII GPKGDRGFDG MQGSKGEQGE KGDRGAPGAI GPPGPRGADG ARGQSGPPGP
VGPKGPEGLQ GQKGERGPPG PSVMGPRGIP GIPGERGDQG NGGPEGSKGD KGDPGMTEDQ
VRSFVRQEMS THCASQGQFF PPGSGPSNTQ SAQTDRSVTS RLVPVLKFTH REEDDVVNTN
DPDYEHFYTV ESYEDTLAPW FLCGDEAPNP CVLPLDEGNC HHFTMRWYYN RNAGECRPFV
YSGCGGNANR FSLKEDCELR CRSKRKGKET LCGGQS
//