GenomeNet

Database: UniProt
Entry: A0A016U1S5_9BILA
ID   A0A016U1S5_9BILA        Unreviewed;       668 AA.
AC   A0A016U1S5;
DT   11-JUN-2014, integrated into UniProtKB/TrEMBL.
DT   11-JUN-2014, sequence version 1.
DT   27-MAR-2024, entry version 29.
DE   RecName: Full=Collagen triple helix repeat protein {ECO:0008006|Google:ProtNLM};
GN   Name=Acey_s0061.g3248 {ECO:0000313|EMBL:EYC09269.1};
GN   Synonyms=Acey-col-99 {ECO:0000313|EMBL:EYC09269.1};
GN   ORFNames=Y032_0061g3248 {ECO:0000313|EMBL:EYC09269.1};
OS   Ancylostoma ceylanicum.
OC   Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida;
OC   Rhabditina; Rhabditomorpha; Strongyloidea; Ancylostomatidae;
OC   Ancylostomatinae; Ancylostoma.
OX   NCBI_TaxID=53326 {ECO:0000313|EMBL:EYC09269.1, ECO:0000313|Proteomes:UP000024635};
RN   [1] {ECO:0000313|Proteomes:UP000024635}
RP   NUCLEOTIDE SEQUENCE.
RC   STRAIN=HY135 {ECO:0000313|Proteomes:UP000024635};
RX   PubMed=25730766; DOI=10.1038/ng.3237;
RA   Schwarz E.M., Hu Y., Antoshechkin I., Miller M.M., Sternberg P.W.,
RA   Aroian R.V.;
RT   "The genome and transcriptome of the zoonotic hookworm Ancylostoma
RT   ceylanicum identify infection-specific gene families.";
RL   Nat. Genet. 47:416-422(2015).
CC   -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC       whole genome shotgun (WGS) entry which is preliminary data.
CC       {ECO:0000313|EMBL:EYC09269.1}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; JARK01001397; EYC09269.1; -; Genomic_DNA.
DR   AlphaFoldDB; A0A016U1S5; -.
DR   Proteomes; UP000024635; Unassembled WGS sequence.
DR   GO; GO:0048856; P:anatomical structure development; IEA:UniProt.
DR   InterPro; IPR008160; Collagen.
DR   PANTHER; PTHR37456:SF5; -; 1.
DR   PANTHER; PTHR37456; SI:CH211-266K2.1; 1.
DR   Pfam; PF01391; Collagen; 7.
PE   4: Predicted;
KW   Reference proteome {ECO:0000313|Proteomes:UP000024635};
KW   Repeat {ECO:0000256|ARBA:ARBA00022737}.
FT   REGION          32..71
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          118..158
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          193..428
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          453..619
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        195..217
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        293..325
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        458..473
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        522..538
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        548..562
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ   SEQUENCE   668 AA;  66898 MW;  7E568EB1F2FFFB0F CRC64;
     MRGMESRLED SDKIPVEMRN ATLARSRTIR NTRDCVCPPG PRGDRGPPGP PGLSAPYYRR
     RTTPPPMASL DPRLHRKLRA LGMLYSPDGQ SIQLRGLPGP PGPPGTKGAR GYPGFPGPIG
     LDGPRGLPGT PGAKGEKGDR GPVGPPGYPG QKGEPGMVRH GQIQADAPDG WRALNQSRSH
     LYNEYRVGAP IPNAPLMIQG PPGPPGPPGP PGMPGHEGKP GPKGERGLPG FDGESKVGPK
     GDHGEPGRDG IPGLRGPPGE RGEKGEPGAP AAYGRHIAHP HTHAMPGPQG PQGPQGAQGP
     PGVPGPPGQP APPAFCPPGP PGKPGTDGRT GPRGPQGEKG DRGEPGIAGP RGHPGAPGGP
     TVSGAKVVVG PPGPPGRDGL NGEKGEKGET GAPGPIGPPG PEGIAGKRGR RGRNGEPGVC
     HQNCTSGGSQ DLERIAHELM PVLRKELKEI RIKHGYPGPP GVPGPPGAPG PVGPAGARGP
     QGLPGHSGEK GDRGEIGPPG LPGQPGQTYE IPSQTGAAAP GPRGPPGLPG PPGEKGEPGG
     PGLPGQPGSL GLPGPPGPMG LRGPPGTEGR AGKQGPVGPK GDMGPMGPAG TPGHPGSPGE
     RGADGTPGQR GEKGDQGIPG LDAPCPTGPD GLPLPYCSWK PMDNLEEAIR KHVAMLHEVI
     IAISSCVM
//