GenomeNet

Database: UniProt
Entry: H2YGA3_CIOSA
ID   H2YGA3_CIOSA            Unreviewed;      1333 AA.
AC   H2YGA3;
DT   18-APR-2012, integrated into UniProtKB/TrEMBL.
DT   18-APR-2012, sequence version 1.
DT   27-MAR-2024, entry version 53.
DE   RecName: Full=Fibrillar collagen NC1 domain-containing protein {ECO:0000259|PROSITE:PS51461};
OS   Ciona savignyi (Pacific transparent sea squirt).
OC   Eukaryota; Metazoa; Chordata; Tunicata; Ascidiacea; Phlebobranchia;
OC   Cionidae; Ciona.
OX   NCBI_TaxID=51511 {ECO:0000313|Ensembl:ENSCSAVP00000004352.1, ECO:0000313|Proteomes:UP000007875};
RN   [1] {ECO:0000313|Proteomes:UP000007875}
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RA   Birren B., Nusbaum C., Abebe A., Abouelleil A., Adekoya E., Ait-zahra M.,
RA   Allen N., Allen T., An P., Anderson M., Anderson S., Arachchi H.,
RA   Armbruster J., Bachantsang P., Baldwin J., Barry A., Bayul T.,
RA   Blitshsteyn B., Bloom T., Blye J., Boguslavskiy L., Borowsky M.,
RA   Boukhgalter B., Brunache A., Butler J., Calixte N., Calvo S., Camarata J.,
RA   Campo K., Chang J., Cheshatsang Y., Citroen M., Collymore A., Considine T.,
RA   Cook A., Cooke P., Corum B., Cuomo C., David R., Dawoe T., Degray S.,
RA   Dodge S., Dooley K., Dorje P., Dorjee K., Dorris L., Duffey N., Dupes A.,
RA   Elkins T., Engels R., Erickson J., Farina A., Faro S., Ferreira P.,
RA   Fischer H., Fitzgerald M., Foley K., Gage D., Galagan J., Gearin G.,
RA   Gnerre S., Gnirke A., Goyette A., Graham J., Grandbois E., Gyaltsen K.,
RA   Hafez N., Hagopian D., Hagos B., Hall J., Hatcher B., Heller A.,
RA   Higgins H., Honan T., Horn A., Houde N., Hughes L., Hulme W., Husby E.,
RA   Iliev I., Jaffe D., Jones C., Kamal M., Kamat A., Kamvysselis M.,
RA   Karlsson E., Kells C., Kieu A., Kisner P., Kodira C., Kulbokas E.,
RA   Labutti K., Lama D., Landers T., Leger J., Levine S., Lewis D., Lewis T.,
RA   Lindblad-toh K., Liu X., Lokyitsang T., Lokyitsang Y., Lucien O., Lui A.,
RA   Ma L.J., Mabbitt R., Macdonald J., Maclean C., Major J., Manning J.,
RA   Marabella R., Maru K., Matthews C., Mauceli E., Mccarthy M., Mcdonough S.,
RA   Mcghee T., Meldrim J., Meneus L., Mesirov J., Mihalev A., Mihova T.,
RA   Mikkelsen T., Mlenga V., Moru K., Mozes J., Mulrain L., Munson G.,
RA   Naylor J., Newes C., Nguyen C., Nguyen N., Nguyen T., Nicol R., Nielsen C.,
RA   Nizzari M., Norbu C., Norbu N., O'donnell P., Okoawo O., O'leary S.,
RA   Omotosho B., O'neill K., Osman S., Parker S., Perrin D., Phunkhang P.,
RA   Piqani B., Purcell S., Rachupka T., Ramasamy U., Rameau R., Ray V.,
RA   Raymond C., Retta R., Richardson S., Rise C., Rodriguez J., Rogers J.,
RA   Rogov P., Rutman M., Schupbach R., Seaman C., Settipalli S., Sharpe T.,
RA   Sheridan J., Sherpa N., Shi J., Smirnov S., Smith C., Sougnez C.,
RA   Spencer B., Stalker J., Stange-thomann N., Stavropoulos S., Stetson K.,
RA   Stone C., Stone S., Stubbs M., Talamas J., Tchuinga P., Tenzing P.,
RA   Tesfaye S., Theodore J., Thoulutsang Y., Topham K., Towey S., Tsamla T.,
RA   Tsomo N., Vallee D., Vassiliev H., Venkataraman V., Vinson J., Vo A.,
RA   Wade C., Wang S., Wangchuk T., Wangdi T., Whittaker C., Wilkinson J.,
RA   Wu Y., Wyman D., Yadav S., Yang S., Yang X., Yeager S., Yee E., Young G.,
RA   Zainoun J., Zembeck L., Zimmer A., Zody M., Lander E.;
RL   Submitted (AUG-2003) to the EMBL/GenBank/DDBJ databases.
RN   [2] {ECO:0000313|Ensembl:ENSCSAVP00000004352.1}
RP   IDENTIFICATION.
RG   Ensembl;
RL   Submitted (NOV-2023) to UniProtKB.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   Ensembl; ENSCSAVT00000004416.1; ENSCSAVP00000004352.1; ENSCSAVG00000002577.1.
DR   GeneTree; ENSGT00940000167228; -.
DR   Proteomes; UP000007875; Unassembled WGS sequence.
DR   GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR   Gene3D; 2.60.120.1000; -; 1.
DR   InterPro; IPR008160; Collagen.
DR   InterPro; IPR000885; Fib_collagen_C.
DR   PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR   PANTHER; PTHR24023:SF1100; FIBRILLAR COLLAGEN NC1 DOMAIN-CONTAINING PROTEIN; 1.
DR   Pfam; PF01410; COLFI; 2.
DR   Pfam; PF01391; Collagen; 8.
DR   SMART; SM00038; COLFI; 1.
DR   PROSITE; PS51461; NC1_FIB; 1.
PE   4: Predicted;
KW   Extracellular matrix {ECO:0000256|ARBA:ARBA00022530};
KW   Reference proteome {ECO:0000313|Proteomes:UP000007875};
KW   Secreted {ECO:0000256|ARBA:ARBA00022530}.
FT   DOMAIN          1125..1332
FT                   /note="Fibrillar collagen NC1"
FT                   /evidence="ECO:0000259|PROSITE:PS51461"
FT   REGION          1..70
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          127..1116
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        18..41
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        304..318
FT                   /note="Polar residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        433..450
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        784..799
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        1005..1025
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        1094..1116
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ   SEQUENCE   1333 AA;  129917 MW;  DCDA05FB6A4506CD CRC64;
     QGQKGAKGEQ AVVEPGIFIP GPPGPPGDPG LDGPPGFQGP PGGPGEYGDR GPPGRDGLPG
     EKGLPGPPGR HIMIPFRIAQ PNGEKGPGSN APAEMQAQLA LTQARLAMQG PPGPQGLAGM
     PGTAGDMGPA GIKGESGEPG PMGQRGPRGS MGPDGMPGKP GRVGEDGLRG EPGAMGSKGD
     RGYDGLPGLP GPKGHRVSHV GPIGPPGGDG EQGEQGEAGP RGLPGEAGPR GEMGHKGPPG
     VSGPPGPQGE PGPPGQQGTS GPQGIPGPQG MMGPPGTKGP HGKAGLPGIP GADGAPGHPG
     NEGPTGSKGS QGQPGPQGPQ GYPGTRGVKG ENGVRGIKGN KGEKGLDGLP GFKGDMGPKG
     DTGIAGPTGS RGEDGPEGPK GREGARGELG PVGLTGEKGK IGVPGLPGYP GRSGIKGSLG
     KPGKPGQMGL KGDRGLEGKR GQEGQRGPKG PRGKRGPRGS TGKAGPKGDR GQDGPQGPIG
     ERGPPGPRGP SGYVGTKGPP GPPGKDGLPG HPGSRGETGF QGKTGPPGPA GVVGPQGPTG
     ENGPSGNRGH PGPPGPAGEP GLSGSAGKEG SKGDRGPRGP IGKMGATGSQ GFPGSRGPQG
     PVGAPGLKGS EGPPGPPGPL GAVGQRGPQG PAGQIGPAGS ANGPPGPQGE NGSPGGPGGP
     GPAGRDGLQG PVGLPGAPGS IGPRGEDGDK GEAGPPGATG LKGGKGEHGP PGPPGVQGNS
     GDPGPPGNDG EPGQRGQQGL YGEKGDEGPR GFPGPPGPRG LQGIPGPSGS KGDTGDAGPL
     GPPGPAGQRG PPGGPEGPPG TYGQPGNVGD KGEPGGPGAP GISGEPGPMG PKGENGEKGE
     AGLTGPQGEA GPRGPRGDDG PKGNPGPVGF PGDPGVSGIP GTPGDEGVPG DVGDTGAPGE
     PGPPGPSGEV GPSGGPGRRG ESGGIGPVGE PGPHGLQGKT GKRGTTGLQG LPGPAGAPGL
     PGSSGADGPV GPMGPSGLKG IKGEMGVSGE KGHPGLIGLV GPPGEEGEKG ERGPQGRDGP
     HGAKGDDGRP GPSGPVGPMG APGLPGSLGS KGNKGSLGPT GPKGDEGIQG PPGPPGPPGQ
     VYNASPLTAN SMKARRRRST EEEGLHREKR QAQDESFIEY PEGLEEIYAA METLKQELEM
     MKEPMGRTQD NPGRSCKDIW LCHPDFPSGN YWIDPNGGCS ADAIEVFCDF EAEGDTCISP
     VERTASVSWL TCLSDPPLVC IPQFNFLRLL SSQAKQRFTY KCVNSIGWEN QQTGSFDQAI
     HLLAANDEVL TYGSEHLTVI EDNCKTGHGN GQVVLELRTR EVDLLPLFDY KAFDFGTRSQ
     RHGYQLDRVC FSG
//
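
The entry above uses the line-oriented flat-file layout: each line begins with a two-letter code (ID, AC, DE, FT, SQ, ...), the sequence follows the SQ line as indented blocks of residues, and `//` ends the record. As an illustration only, the following is a minimal sketch of reading that layout. The function name `parse_flat_entry` is hypothetical and not part of any library, and the sample is a truncated excerpt of this entry (only the last sequence line is kept), so the sequence it recovers is 13 residues long rather than the full 1333.

```python
from collections import defaultdict


def parse_flat_entry(text):
    """Group the lines of one flat-file entry by line code and collect the sequence.

    Hypothetical helper for illustration; it is not a full or official parser.
    """
    fields = defaultdict(list)
    sequence_parts = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):
            break  # end-of-entry terminator
        if in_sequence:
            # Sequence lines are indented and split into blocks by spaces.
            sequence_parts.append("".join(line.split()))
            continue
        # The line code occupies the first two columns; data starts at column 6.
        code, value = line[:2], line[5:].rstrip()
        if code == "SQ":
            in_sequence = True
        if code.strip():
            fields[code].append(value)
    return dict(fields), "".join(sequence_parts)


# Truncated excerpt of the entry above (assumption: only the last
# sequence line is kept, to keep the example short).
sample = """ID   H2YGA3_CIOSA            Unreviewed;      1333 AA.
AC   H2YGA3;
FT   DOMAIN          1125..1332
FT                   /note="Fibrillar collagen NC1"
SQ   SEQUENCE   1333 AA;  129917 MW;  DCDA05FB6A4506CD CRC64;
     RHGYQLDRVC FSG
//
"""

fields, seq = parse_flat_entry(sample)
print(fields["AC"])
print(seq, len(seq))
```

Applied to the complete record, the length of the recovered sequence can be compared with the residue count stated on the ID and SQ lines as a basic consistency check.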