GENE ONTOLOGY DATA MINING AND SYSTEMS BIOLOGY OF CANCER

Sharma, Ajay Shiv

Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/14460

Title:	GENE ONTOLOGY DATA MINING AND SYSTEMS BIOLOGY OF CANCER
Authors:	Sharma, Ajay Shiv
Keywords:	Bioinformatics and Computational;Contemporary Bioinformatics;Cancer and Alzheimer’s;Physiological Models
Issue Date:	Jul-2014
Publisher:	Dept. of Electrical Engineering, IIT Roorkee
Abstract:	Bioinformatics and computational systems biology fuse several branches of applied science and engineering. They draw on basic sciences such as mathematics, physics, chemistry and computer science, which each present a partial picture, and on biological sciences such as molecular biology, structural biology and systems biology, which present the whole picture. To unravel complicated biological issues, this research work deals with interdisciplinary aspects of contemporary bioinformatics and computational systems biology as defined by the NIH's working definition of bioinformatics and computational biology (http://www.bisti.nih.gov/docs/compubiodef.pdf), under which many scientific fields come under the umbrella of bioinformatics, with numerous applications in the life sciences. Contemporary bioinformatics deals with computational tools that provide a user-friendly environment for disseminating life sciences knowledge through existing biological databases in a particular domain. Along similar lines, computational systems biology, which has its roots in the life sciences, primarily deals with analytical, data-oriented techniques and mathematical modelling to study and analyze complex biological systems. Accordingly, four research concerns for progressive biological discoveries are handled in this research work using bioinformatics and systems biology approaches. At present, research points to a key challenge for integrative biology: providing physiological models that could facilitate the development of novel drugs against diseases such as cancer and Alzheimer’s disease, for which effective therapeutics do not currently exist. Even though such full physiological models are not always attainable, owing to inadequate biological data and/or inadequate integration of those data, functional genomics can currently be considered a reliable functional basis upon which such models are expected to rely.
The research work provides novel insights into how biological databases, which are essentially descriptive physiological models, can be functionally improved in terms of contemporary bioinformatics, depending on the accessibility and integration of data. Most researchers agree that the challenge lies in managing, analysing, interpreting, modelling and understanding all the biological data being produced. However, a major issue prevails: these tasks are handled differently at different laboratories throughout the world, producing a plethora of biological data. To fill this research gap, an omics or integrative genomics revolution is needed that uses the power of gene ontology (GO). The first concern of this work is to provide theoretical models for this herculean task of integrating biological data by moving from knowledge gained from functional genomics to physiological models. Physiology can be understood as the science of the functioning of living systems, and the ultimate goal of full physiological models is a better understanding of many pathological conditions. To approach a full physiological model, the tremendous amount of biological knowledge contained in various databases needs to be sorted by discriminating between different types of data and subjecting them to a double integration: (i) vertically, from the molecular level, over the cell and organ levels, all the way to the level of a whole organism, and (ii) horizontally, comprising gene, anatomy and phenotype data. As such, a hypothetical full physiological model is supposed to have its full biological process (BP), full cellular component (CC) and full molecular function (MF), each with its specific full ontology. Connecting individual ontologies from various data resources is a key step towards a universal full physiological model; the proposed model is thus supposed to have its full BPCCMF with its specific full ontologies.
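As a rough illustration of the BP/CC/MF organisation described above, the following Python sketch groups annotation records by GO aspect and checks whether a gene is covered in all three. It is not taken from the thesis: the `Annotation` record, the helper names and the gene labels `geneA` and `geneB` are hypothetical, and the aspect codes P, C and F follow the convention used in GO annotation files.

```python
from collections import defaultdict
from dataclasses import dataclass

# Aspect codes as used in GO annotation files: Process, Component, Function.
ASPECTS = {
    "P": "biological_process",
    "C": "cellular_component",
    "F": "molecular_function",
}


@dataclass(frozen=True)
class Annotation:
    gene: str    # hypothetical gene label, for illustration only
    go_id: str   # GO term identifier
    term: str    # human-readable term name
    aspect: str  # one of "P", "C", "F"


def group_by_aspect(annotations):
    """Return {aspect name: {gene: set of GO ids}} for the given records."""
    model = {name: defaultdict(set) for name in ASPECTS.values()}
    for ann in annotations:
        model[ASPECTS[ann.aspect]][ann.gene].add(ann.go_id)
    return model


def has_full_annotation(model, gene):
    """True when the gene is annotated in all three aspects (BP, CC and MF)."""
    return all(gene in genes for genes in model.values())


# Illustrative records; the gene labels are placeholders, not real loci.
RECORDS = [
    Annotation("geneA", "GO:0015979", "photosynthesis", "P"),
    Annotation("geneA", "GO:0009507", "chloroplast", "C"),
    Annotation("geneA", "GO:0016168", "chlorophyll binding", "F"),
    Annotation("geneB", "GO:0015979", "photosynthesis", "P"),
]

if __name__ == "__main__":
    model = group_by_aspect(RECORDS)
    for gene in ("geneA", "geneB"):
        print(gene, has_full_annotation(model, gene))
```

Keeping one mapping per aspect makes the "full BPCCMF" condition a simple membership check across the three mappings.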
After the concept of full physiology is established, it is illustrated in this research work with a plant physiological model; the same approach can be extended to other organisms, pathological conditions, etc. The second primary concern of this work is the development of a gene ontology data mining tool using contemporary bioinformatics, focusing on the design of a plant physiology database that represents all biological knowledge in a computationally tractable and unambiguous way. The idea of serving the plant science community through contemporary bioinformatics came from the fact that plants have been the most studied organisms since the advent of classical genetics. Recent studies show that plants are biologically more complex, and there are enormous applications to be gained from researching plant genes: improving the uptake of nutrients from the soil, enhancing plant yields and addressing plant ailments that directly affect human health. This research work focuses on providing a centralized plant physiology database as a new searching and investigating tool, built by mining plant gene ontology data from the GO database. Applying contemporary database management led to the development of the Plants Physiology Database (PPDB), a searching and browsing tool based on mining the large amounts of gene ontology data currently available. The PPDB is publicly available and freely accessible online (http://www.iitr.ernet.in/ajayshiv/) through a user-friendly environment generated by Drupal-6.24. Another focus of this work is the systems biology of cancer. The last decade has witnessed the emergence of a new field of research, systems biology, which captures biological phenomena using data analysis, modelling and computational tools. Generations of scientists and physicians have dedicated their lives to improving patient care and fighting cancer. Systems biology offers promising insights towards defeating cancer.
Cancer is a major health issue: it was responsible for 8.2 million deaths in 2012, and 14.1 million new cancer cases were reported worldwide in 2013 (http://globocan.iarc.fr). The global yearly number of deaths is anticipated to reach 17 million in 2030. Research progress in cancer treatment is thus real but insufficient. Cancer is a genetic disease that causes a deregulation of the gene networks that control cell growth and dissemination. As a result, methods for modelling gene networks are central to any modern approach to the molecular biology of cancer. Moreover, the sequencing of the human genome and the subsequent genomic revolution have impacted cancer research at the molecular level through high-throughput technologies such as microarray databases (MDB). This research work therefore addresses both aspects of the systems biology of cancer, separating it into two computational approaches: data-driven systems biology and model-driven systems biology. Data-driven models, termed top-down models, are based on computational statistical tools that can handle high-throughput MDB. They involve two types of statistical analysis: low-level analysis, which deals with background correction and normalization using a model-based expression index (MBEI) method, and high-level analysis, which deals with filtering to find interesting genes, hierarchical clustering of the filtered genes, genetic association study and gene ontology data mining/enrichment analysis. The central dogma of microarray data analysis is the third research concern of this work. The information produced by these analyses can pave the way for innovative opportunities for early diagnosis of malignancies. At a later stage, this research work can enhance further research in diagnostics, prognostics, disease markers, target validation and targeted therapies using contemporary bioinformatics.
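The abstract does not say which statistic the enrichment step uses. A common choice for GO enrichment is a one-sided hypergeometric test, and the Python sketch below assumes that choice. The function names, the gene labels `g0` to `g19` and the terms `term_X` and `term_Y` are hypothetical placeholders, not data from the thesis.

```python
from math import comb


def enrichment_p_value(population, annotated, study, hits):
    """One-sided hypergeometric tail P(X >= hits).

    population: number of genes in the background
    annotated:  background genes carrying the GO term
    study:      number of filtered (interesting) genes
    hits:       study genes carrying the GO term
    """
    if not (0 <= hits <= study <= population and 0 <= annotated <= population):
        raise ValueError("inconsistent counts")
    upper = min(annotated, study)
    # math.comb returns 0 when k > n, so impossible draws contribute nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, study - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, study)


def enrich(study_genes, background, term_to_genes):
    """Return {term: p-value} for every term in term_to_genes."""
    study = study_genes & background  # ignore genes outside the background
    results = {}
    for term, genes in term_to_genes.items():
        annotated = genes & background
        hits = len(annotated & study)
        results[term] = enrichment_p_value(
            len(background), len(annotated), len(study), hits
        )
    return results


# Toy data: 20 background genes, 5 of them selected by the filtering step.
BACKGROUND = {f"g{i}" for i in range(20)}
STUDY = {f"g{i}" for i in range(5)}
TERMS = {
    "term_X": {f"g{i}" for i in range(5)},          # all five study genes
    "term_Y": {"g0", "g10", "g11", "g12", "g13"},   # one study gene
}

if __name__ == "__main__":
    for term, p in sorted(enrich(STUDY, BACKGROUND, TERMS).items()):
        print(term, p)
```

In practice the p-values would also be corrected for multiple testing across terms, which this sketch leaves out.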
The list of significant, or differentially expressed, genes helps to find the functional relationships between genes in MDB warehouses by linking it to GO annotations. For instance, a precautionary double mastectomy on finding the BRCA1 gene, with only an 87% probable chance of acquiring the disease, shows the promising nature of this field. On the other hand, a second approach shows how dynamical mathematical models can provide novel insights that cannot be obtained by experiment. The model-driven, dynamical or bottom-up approach is the opposite of the top-down approach. A bottom-up model begins with a hypothesis about a biological mechanism. Equations are then written down to describe how the components of the biological system interact with one another, and simulations are run to generate predictions of what would happen under different conditions. Keywords associated with bottom-up models include ordinary differential equations, computational tools of dynamical systems for interpreting the output, methods for parameter estimation, partial differential equations and stochastic models. The final research concern deals with developing models consisting of systems of differential equations and using computational tools of dynamical systems to interpret the results of their simulations. To this end, a multiscale computational approach to tumour growth modelling is presented. A mathematical model of tumour growth and angiogenesis is developed to simulate solid tumour growth/progression, with estimation of chemotherapy and anti-angiogenesis drugs, using partial differential equation (PDE) modelling. The PDE compartmental model incorporated spatiotemporal processes including cellular and tissue-mediated diffusion, cellular transport and migration, cell proliferation, angiogenesis, apoptosis, and vessel formation and maturation, in order to model tumour progression and the transition from avascular to vascular growth.
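The thesis's PDE system is not reproduced in this abstract. As a minimal sketch of the bottom-up workflow (hypothesis, equations, simulation), the Python example below assumes a much simpler, generic one-dimensional reaction-diffusion equation for tumour cell density n(x, t), namely dn/dt = D d2n/dx2 + r n (1 - n/K), solved with an explicit finite-difference scheme. The parameters D, r and K and the function names are illustrative assumptions, and angiogenesis, drugs and nutrients are not modelled.

```python
def step(n, D, r, K, dx, dt):
    """Advance the density profile by one explicit Euler time step.

    Zero-flux boundaries are imposed by mirroring the neighbouring cell.
    """
    m = len(n)
    new = [0.0] * m
    for i in range(m):
        left = n[i - 1] if i > 0 else n[1]
        right = n[i + 1] if i < m - 1 else n[m - 2]
        laplacian = (left - 2.0 * n[i] + right) / dx ** 2   # diffusion term
        growth = r * n[i] * (1.0 - n[i] / K)                # logistic proliferation
        new[i] = n[i] + dt * (D * laplacian + growth)
    return new


def simulate(n0, D, r, K, dx, dt, steps):
    """Run `steps` time steps from the initial profile n0 (at least 2 cells)."""
    if D * dt / dx ** 2 > 0.5:
        raise ValueError("time step too large for the explicit scheme")
    n = list(n0)
    for _ in range(steps):
        n = step(n, D, r, K, dx, dt)
    return n


if __name__ == "__main__":
    # Small seed of cells in the middle of an otherwise empty domain.
    initial = [0.0] * 51
    initial[25] = 0.5
    final = simulate(initial, D=0.1, r=1.0, K=1.0, dx=1.0, dt=0.1, steps=200)
    print("total cell mass:", sum(final))
    print("density at centre:", final[25])
```

The stability check in `simulate` reflects the usual restriction on explicit diffusion schemes; an implicit scheme would lift it at the cost of solving a linear system per step.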
The angiogenesis process, coupled with the solid tumour growth model in a reaction–diffusion kinetics framework, portrayed the spatiotemporal development of the generalised functions of a tumour’s micro-environment, viz. the nutrients and growth factors that regulate the tumour’s growth during angiogenesis. Most cancers involve an epidermal growth factor receptor/extracellular signal-regulated kinase (EGFR/ERK) signalling pathway that is related to the cell-division cycle and promotes tumour cells. Treatment is studied through tyrosine kinase inhibitors (TKI) of EGFR signalling, which are distributed through the blood vessels of a tumour’s microvasculature. This showed considerable potential for in vitro experiments, owing to the availability of clinical and expression data, which helps in learning about responses to treatment. Using ordinary differential equations to model the downstream pathway of EGFR signalling (SOS RAS RAF MEK ERK PI3K AKT), we performed computational simulations to determine the facilitation of glucose, oxygen, tumour angiogenesis factor (TAF), drug (TKI), tumour growth factor alpha (analogue of EGFR) and angiogenesis inhibitor. The simulation results showed that TKI-EGFR and IGF1R signalling regulation of the various active, migrating, proliferative, apoptotic and quiescent cells could act as a united behaviour across the entire profile of tumour growth. The results established the dual role played by angiogenesis when TKI-EGFR and VEGF inhibitors are supplied to diminish tumour invasion. In addition, the neovasculature can transport nutrients to neoplastic cells to sustain cell metabolism, thus enhancing the rate of cell survival. The simulation results further suggest that the coexpression of EGFR and IGF1R activates a higher number of ERK receptors than down- or over-expression. There is good agreement between the simulations, an experimental wild-type mouse model and clinical data.
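The equations and rate constants of the thesis's pathway model are not given in this abstract. The Python sketch below therefore uses a deliberately reduced, hypothetical three-tier cascade (active EGFR, RAS and ERK fractions) with a TKI that blocks receptor activation, integrated with a hand-written fourth-order Runge-Kutta step. All parameter names and values in `PARAMS` are assumptions chosen only to show how such an ODE simulation is set up.

```python
# Hypothetical rate constants (arbitrary units); not taken from the thesis.
PARAMS = {
    "k_on": 1.0, "k_off": 1.0,    # receptor activation / deactivation
    "a_ras": 1.0, "d_ras": 1.0,   # RAS activation by EGFR / deactivation
    "a_erk": 1.0, "d_erk": 1.0,   # ERK activation by RAS / deactivation
    "ki": 1.0,                    # TKI concentration giving 50% blockade
}


def cascade_rhs(state, ligand, tki, p):
    """Time derivatives of the active fractions (EGFR, RAS, ERK)."""
    egfr, ras, erk = state
    blocked = tki / (tki + p["ki"])  # fraction of receptor activation blocked
    d_egfr = p["k_on"] * ligand * (1.0 - blocked) * (1.0 - egfr) - p["k_off"] * egfr
    d_ras = p["a_ras"] * egfr * (1.0 - ras) - p["d_ras"] * ras
    d_erk = p["a_erk"] * ras * (1.0 - erk) - p["d_erk"] * erk
    return (d_egfr, d_ras, d_erk)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for state' = f(state)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def active_erk(ligand, tki, t_end=50.0, dt=0.01, params=PARAMS):
    """Active ERK fraction at time t_end, starting from an inactive pathway."""
    def f(state):
        return cascade_rhs(state, ligand, tki, params)

    state = (0.0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(f, state, dt)
    return state[2]


if __name__ == "__main__":
    for dose in (0.0, 1.0, 9.0):
        print("TKI dose", dose, "-> active ERK", active_erk(ligand=1.0, tki=dose))
```

A fuller model would add the remaining tiers (SOS, RAF, MEK, PI3K, AKT) as further state variables in `cascade_rhs` and estimate the rate constants from data.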
It can be concluded that this work may not solve the numerous convoluted issues in the field of biotechnology, but it does address issues in gene ontology data mining using contemporary bioinformatics, taking a plant physiology database as an example, along with state-of-the-art work on cancer systems biology.
URI:	http://hdl.handle.net/123456789/14460
Research Supervisor/Guide:	Prasad, Rajendra; Gupta, Hari Om
metadata.dc.type:	Thesis
Appears in Collections:	DOCTORAL THESES (Electrical Engg)

Files in This Item:

File	Description	Size	Format
G24319-ajay_T.pdf		7.08 MB	Adobe PDF