ID A0AAD9IZQ6_9ANNE Unreviewed; 1280 AA.
AC A0AAD9IZQ6;
DT 29-MAY-2024, integrated into UniProtKB/TrEMBL.
DT 29-MAY-2024, sequence version 1.
DT 28-JAN-2026, entry version 8.
DE RecName: Full=Integrase catalytic domain-containing protein {ECO:0000259|PROSITE:PS50994};
GN ORFNames=LSH36_803g00021 {ECO:0000313|EMBL:KAK2143892.1};
OS Paralvinella palmiformis.
OC Eukaryota; Metazoa; Spiralia; Lophotrochozoa; Annelida; Polychaeta;
OC Sedentaria; Canalipalpata; Terebellida; Terebelliformia; Alvinellidae;
OC Paralvinella.
OX NCBI_TaxID=53620 {ECO:0000313|EMBL:KAK2143892.1, ECO:0000313|Proteomes:UP001208570};
RN [1] {ECO:0000313|EMBL:KAK2143892.1}
RP NUCLEOTIDE SEQUENCE.
RC STRAIN=P08H-3 {ECO:0000313|EMBL:KAK2143892.1};
RX PubMed=37494294; DOI=10.1093/molbev/msad172;
RA Perez M., Aroh O., Sun Y., Lan Y., Juniper S.K., Young C.R., Angers B.,
RA Qian P.Y.;
RT "Third-Generation Sequencing Reveals the Adaptive Role of the Epigenome in
RT Three Deep-Sea Polychaetes.";
RL Mol. Biol. Evol. 40:msad172-msad172(2023).
CC -!- CAUTION: The sequence shown here is derived from an EMBL/GenBank/DDBJ
CC whole genome shotgun (WGS) entry which is preliminary data.
CC {ECO:0000313|EMBL:KAK2143892.1}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; JAODUP010000803; KAK2143892.1; -; Genomic_DNA.
DR AlphaFoldDB; A0AAD9IZQ6; -.
DR Proteomes; UP001208570; Unassembled WGS sequence.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0031012; C:extracellular matrix; IEA:TreeGrafter.
DR GO; GO:0005615; C:extracellular space; IEA:TreeGrafter.
DR GO; GO:0030020; F:extracellular matrix structural constituent conferring tensile strength; IEA:TreeGrafter.
DR GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR GO; GO:0015074; P:DNA integration; IEA:InterPro.
DR GO; GO:0030198; P:extracellular matrix organization; IEA:TreeGrafter.
DR Gene3D; 2.60.120.200; -; 1.
DR Gene3D; 3.40.1620.70; -; 1.
DR Gene3D; 3.10.100.10; Mannose-Binding Protein A, subunit A; 1.
DR Gene3D; 3.30.420.10; Ribonuclease H-like superfamily/Ribonuclease H; 1.
DR InterPro; IPR016186; C-type_lectin-like/link_sf.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR050149; Collagen_superfamily.
DR InterPro; IPR010515; Collagenase_NC10/endostatin.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR016187; CTDL_fold.
DR InterPro; IPR001584; Integrase_cat-core.
DR InterPro; IPR012337; RNaseH-like_sf.
DR InterPro; IPR036397; RNaseH_sf.
DR InterPro; IPR048287; TSPN-like_N.
DR InterPro; IPR045463; XV/XVIII_trimerization_dom.
DR PANTHER; PTHR24023; COLLAGEN ALPHA; 1.
DR PANTHER; PTHR24023:SF1113; COLLAGEN ALPHA-2(IX) CHAIN-LIKE ISOFORM X1; 1.
DR Pfam; PF01391; Collagen; 2.
DR Pfam; PF20010; Collagen_trimer; 1.
DR Pfam; PF06482; Endostatin; 1.
DR SMART; SM00210; TSPN; 1.
DR SUPFAM; SSF56436; C-type lectin-like; 1.
DR SUPFAM; SSF49899; Concanavalin A-like lectins/glucanases; 1.
DR SUPFAM; SSF53098; Ribonuclease H-like; 1.
DR PROSITE; PS50994; INTEGRASE; 1.
PE 4: Predicted;
KW Collagen {ECO:0000256|ARBA:ARBA00023119};
KW Reference proteome {ECO:0000313|Proteomes:UP001208570};
KW Repeat {ECO:0000256|ARBA:ARBA00022737}.
FT DOMAIN 1016..1108
FT /note="Integrase catalytic"
FT /evidence="ECO:0000259|PROSITE:PS50994"
FT REGION 240..390
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 490..823
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 240..256
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 264..274
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 340..349
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 351..361
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 362..372
FT /note="Gly residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 490..503
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 509..526
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 528..542
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 544..557
FT /note="Low complexity"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 677..686
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 740..785
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 792..807
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1280 AA; 135006 MW; 864C8DDAFDE483F8 CRC64;
MDLMQAIDPP RGSPLGQITD ITNISGNDSI SYLKGIDGYP AFLIKRNANI QKPAAAFLTT
LPDDFAILIT VIPSDDLGGF LFSVVNPAST IIQLGVSLSP SSSEGMMDIK LYYTDYRKSE
TSEVLVRFSV PSFVHQWTSI GLKIKNDVVT LYMDCMEYSR SETTGRKRGL RFDSGSSLFI
GRGPIQNEKS FAGGLQQLTI VEDADKAESQ CDDIDPYQEF SGSPDTSIIP RPGSRYMIKG
EKGDRGAPGL KGDRGPPGET VFITPPPITP PPPDFLGTKG EKGERGLQGP KGDRGTTGVK
GDKGDTGIPG IPGTGEQGLP GPPGLPGLPG KIGERGLPGP AIPGPPGPPG RDGIPGPRGL
PGTPGGNLGS GEDGTNLPGI SGLPGKPGVA GPRGEKSCTY LRAVHISELY LSQSCTYLRA
ELISELYLSQ SCTYLRAVLI SELYLTQSCS YLRAVLISEL YLSQSCTYLR AVLNSELYLS
QSCTYLRAGD RGDQGLRGHD GRDGMPGTPG LPGPPGPPGP PGPPSYIPEE EIFGGSGDED
ELLPSYPSGG SSGSVYPTWK GDRGFPGTPG LPGPRGPPGR DGTPGLSGTP GDSGQPGPRG
QKGDMGNPGI GTPGLPGSPG LQGPPGPPGR VVTAKGVDIT GPPGYPGQKG NKGEPGEGSS
RSCNDICYGV KGEIGPQGPP GSPGPRGPAG IMGEIGLPGF PGKPGDRGLP GLAVKGSKGD
RGSPGPPGTV VGDSGGIYIP APPGPPGPPG PPGHCPPGPP GPPGVGVPGP PGPPGNVPRP
EPPGHPADSP VQCPPGPPGP PGPPGPPGTG NGNTIIPSFG GDEQFTPGAG AVIFRNKDDL
FMVAHNIPFG TIIFLLDTEV LYVRVSEGFR EILTKVEYFP ANHFVQPTTP EPEIIPPPQD
TSEITGPMIQ FALESRLKKC KQANMPRAAA EDGCPWTPNI SVYAHKPLSK TPKAITKSDT
TSDEILNRVQ TKKKHSDRRY PVSTSTVQEM HVHNVTLHPM RDEDKQQQES KVITAVPPRP
WQKVASDLFA WGGNDYSICT DYYSSFFEID VLGETTSGEV IAKLKNSCAR YGIPETRVSD
NGSQYSSTSF SAFLKNGIFI NGSSTPLHLI AANQPYKGDI NGISGADYIC FKEAKAIGLQ
GTYRAFLSSR VQDLISIVHR KYDRKFPVVN LHNQKLFNSW NDIFNGAGAY FNDRVPIYSF
DGKDVLYDPK WPQKIIWHGA TSTGQRNEKS YCNAWSSESS TETGVASSLI KSMILDNEQF
ACNNGFILLC AENSYRTYRK
//