Purine synthesis pathway and identification of novel genes

dc.contributorInstituto de Física de São Carlos – IFSC/USPpt_BR
dc.contributor.authorThiemann, Otavio Henrique
dc.date.accessioned2017-05-26T19:50:53Z
dc.date.available2017-05-26T19:50:53Z
dc.date.issued2017-05-26
dc.description.abstractWe are currently involved as a sequencing laboratory in the SUCEST project. Our data mining aim is to search the EST database generated by the SUCEST effort along two themes. The first is the identification of genes involved in the purine synthesis and recycling pathways. The second is an exploratory analysis of those EST sequences whose identity remains unknown after the first database searches, the so-called "no hit sequences". The purine nucleotide pathways are of central importance to all living organisms and have been investigated as a possible target for chemotherapy because of differences between disease agents and their hosts. In human cells the purine nucleotides are synthesized from non-nucleotide precursors such as amino acids, ammonia and carbon dioxide. Purines can also be recycled through the salvage pathway. Another important enzyme, involved in both the salvage and the de novo pathways, is PRPP synthetase (PRS), which synthesizes the PRPP substrate used in all PRTase reactions. Knowledge of the sugarcane purine synthesis enzymes will open the possibility of using such enzymes as targets for drugs to combat phytopathogenic agents, as is being done with several parasitic targets. With our participation in the project as a sequencing laboratory, we have initiated a preliminary data mining effort using the following strategy. Representative enzyme sequences for each member of the purine de novo synthesis and recycling pathways have been chosen from the NCBI database. Those peptide sequences are being used to search the entire translated SUCEST database using the available BLAST facility. Retrieved EST clones are further tested for the statistical significance of the alignment by a Monte Carlo shuffling algorithm. To calibrate the Monte Carlo analysis, known protein sequences with different rates of divergence along the phylogenetic tree have been used.
Those sequences are compared to each other and to the EST clones. The resulting table of p-values indicates the degree of divergence of each enzyme across the different rates and with respect to the sugarcane EST clones. Preliminary results employing this strategy allowed us to identify at least one potential case of polymorphism in sugarcane, in the protein PRPP synthetase, a key enzyme of the purine synthesis pathway. Interestingly, two important genes from the de novo synthesis pathway, glutamine-PRPP-amidotransferase and GAR transformylase, have not been found in the SUCEST database so far. Those missing genes pose interesting questions that may be further investigated, such as whether those genes are of such low abundance that they go undetected in the current libraries. Alternatively, glutamine-PRPP-amidotransferase and GAR transformylase may be so divergent as to escape detection by our search strategy. The possibility that sugarcane employs an alternative purine metabolism is unlikely, since all the other enzymes involved have been identified with a high degree of similarity to the known sequences. In every genome effort undertaken to date, a variable number of unidentified sequences are encountered. Those genes are of great interest since they may be responsible for important and as yet unknown pathways of the organism studied. The EST sequencing effort of the sugarcane plant is no exception. Several EST sequences are being accumulated as "no hit sequences" that are not initially identified by the standard search method employed. Our purpose is to analyze as many of those sequences as possible in an attempt to identify sequences with marginal similarity scores. Although such an undertaking is laborious and will not permit the detailed examination of all the unknown sequences, we may be able to determine whether potentially valuable information is being lost as "no hit sequences".
If such is the case, it would justify further efforts to develop automatic search methods to explore those sequences. The strategy will be to collect individual sequences and search the public databases using the translated peptide sequence as the query. Such an approach is known to increase the sensitivity of the search and to return more reliable results. Marginal identities will be further analyzed by statistical methods, such as the Monte Carlo algorithm, and by phylogenetic reconstruction if the sequence length permits. The identified sequences will be made available for further study by the data mining laboratory working on the relevant pathway, or will be catalogued for further analysis.pt_BR
dc.description.notesGrantee: Prof. Dr. Otavio Henrique Thiemann; Universidade de São Paulo, USP, Instituto de Física de São Carlos, IFSC, Departamento de Física e Ciência Interdisciplinar, FCI, Grupo de Cristalografia, São Carlos, SP, Brazil.pt_BR
dc.description.sponsorshipFAPESP(00/07439-8)pt_BR
dc.format2 p.pt_BR
dc.format.mediumDigitalpt_BR
dc.identifier.urihttp://repositorio.ifsc.usp.br/handle/RIIFSC/8913
dc.language.isoengpt_BR
dc.rightsOpen accesspt_BR
dc.subjectGenetic sequencingpt_BR
dc.subjectGenomespt_BR
dc.subjectData miningpt_BR
dc.subjectSugarcanept_BR
dc.subjectGenoma Cana-de-Açúcar - SucESTpt_BR
dc.subject.classificationIFSC - FCIpt_BR
dc.titlePurine synthesis pathway and identification of novel genespt_BR
dc.type.categoryResearchpt_BR
usp.date.end2002-07-31
usp.date.initial2000-08-01
usp.date.ratification2000
usp.description.localSão Carlos, SP, Brazilpt_BR
usp.isreferencedbyhttp://www.bv.fapesp.br/pt/auxilios/29062/purine-synthesis-pathway-and-identification-of-novel-genes/pt_BR
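
The abstract describes testing the statistical significance of an alignment with a Monte Carlo shuffling algorithm but gives no implementation details. The following is only a minimal sketch of the general idea, not the project's actual method: one sequence is shuffled many times, each shuffled copy is re-aligned to the query, and the fraction of shuffles scoring at least as well as the real alignment is reported as an empirical p-value. The function names, the match/mismatch/gap scores (used here instead of a protein substitution matrix) and the number of shuffles are all assumptions made for illustration.

```python
import random


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not taken from the record.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def shuffle_p_value(query, subject, n_shuffles=200, seed=0):
    """Empirical p-value of the query/subject alignment score.

    The subject is shuffled n_shuffles times, which preserves its residue
    composition while destroying its order. The returned p-value is the
    add-one-corrected fraction of shuffles whose score reaches the observed
    score, so it is never exactly zero.
    """
    rng = random.Random(seed)
    observed = local_alignment_score(query, subject)
    residues = list(subject)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        if local_alignment_score(query, "".join(residues)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Hypothetical peptide fragments, invented for this example only.
    query = "MKVLAAGIVGLTSARRLLEAGHEV"
    subject = "MKVLSAGIVGLTSAKRLLEAGHDV"
    score, p = shuffle_p_value(query, subject)
    print(f"score={score} empirical p-value={p:.4f}")
```

A small p-value suggests that the observed similarity is unlikely to arise from residue composition alone. The smallest p-value this estimator can report is 1 / (n_shuffles + 1), so resolving marginal hits would require more shuffles.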

Files

Original bundle

Name: Purine synthesis pathway and identification of novel genes - 00_07439-8.pdf
Size: 152.04 KB
Format: Adobe Portable Document Format
