dc.description.abstract |
The preliminary step in rational drug discovery is identifying society's medical needs that are
not properly addressed by the available treatments. After prioritizing these unmet
medical needs, drug target identification is the first and key phase in the pipeline. In
target-based drug discovery to date, the failure rate has been very high. Many of these failures are
attributed to improper target identification. Our inadequate knowledge of the disease and its
molecular mechanisms has played a significant role. The wealth of data and information in this
'omics' era presents immense new opportunities to enhance our understanding of disease dynamics
at the cellular and molecular level. With these advancements, the successful identification of
therapeutic targets becomes more promising. However, no single method is sufficient on its own,
owing to the complexity of human diseases, the heterogeneity of biological data and the inherent
limitations of each approach. Therefore, systematically integrated computational methods can be
used to identify potential drug targets for high-burden drug-resistant diseases like tuberculosis
(TB).
TB is an infectious disease caused by the etiological agent Mycobacterium tuberculosis
(Mtb). It causes morbidity and mortality in millions of people every year. Mycobacterium
tuberculosis H37Rv is the most studied strain of the pathogen. The emergence and rise of drug
resistance is the main bottleneck for the management, control and eradication programs of TB.
Various strategies have been implemented to counter the problem of resistance. However, available
statistics indicate that resistant forms are still on the rise. Drugs currently used to treat
drug-resistant TB are expensive, toxic, associated with adverse side effects and ineffective
against the latent forms of the bacillus. These shortcomings highlight the need for new therapeutic
targets.
In this thesis, comprehensive protein-protein interactome network analyses have been carried out
to identify potential drug targets and co-targets of Mycobacterium tuberculosis H37Rv. Proteins
involved in the same cellular processes often interact with each other. Protein-protein interaction
network analysis is fundamental to understanding the complexity of biological systems because it
reveals hidden relationships between drugs, genes, proteins and diseases. An enormous amount of
protein-protein interaction data is available in various repositories owing to advances in
techniques such as two-hybrid systems, mass spectrometry and protein microarrays. These analyses
aim to obtain important system-level insights about TB and to counter the
challenges at the target identification phase of the drug discovery process. In silico molecular
modelling and structure analysis have also been carried out for protein translocase subunit SecY
(Rv0732).
The list of potential primary drug targets has been identified through comparative genome
analysis and network centrality measures on the protein-protein interaction network of the
pathogen. The interaction dataset was retrieved from STRING, one of the main sources of
protein-protein interaction data for TB. STRING acts as a meta-database by integrating interactions
from numerous sources such as experimental repositories, computational prediction methods and
public text collections. The protein-protein interaction dataset of Mycobacterium tuberculosis
H37Rv in STRING has been shown to be of low quality, containing false positives and false
negatives. This can affect the results of any analysis based on this dataset. To minimize
the impact, only the more reliable portion of the dataset has been considered. The four
centrality measures (degree, closeness, betweenness and eigenvector) have been used to identify
the most central proteins in the interactome network. Only proteins found at the centre of
gravity of the interactome network were considered. A BLAST search of protein-coding genes has
been carried out against the Database of Essential Genes (DEG) to retain genes that are essential
for the survival and growth of the pathogen. The corresponding protein sequences obtained after
the DEG search were subjected to a BLASTp search against the non-redundant database, restricted to
Homo sapiens and with an E-value cutoff of 0.005, to avoid possible host toxicity at the sequence
level. A list of 137 proteins has been proposed as potential primary drug targets of
Mycobacterium tuberculosis H37Rv. These proteins are believed to be reliable targets because they
are reported as essential for the growth and survival of the pathogen, have no detectable
homology with human proteins (which reduces the risk of host toxicity), and have been prioritized
by their network centrality values, all of them lying within the close neighbourhood of the
centre of gravity of the protein-protein interaction network. Many of the proteins in the list
have been reported as drug targets by other methods.
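The centrality-based filtering described above can be illustrated with a minimal sketch. It is not the thesis implementation: it computes only two of the four measures (degree and closeness), the protein identifiers and edges are invented, and the "above the mean on every measure" cutoff is an assumption standing in for the centre-of-gravity criterion, which the abstract does not define.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """(reachable nodes) / (sum of shortest-path lengths), via BFS per node."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores

def central_proteins(graph):
    """Keep proteins above the mean on every measure (assumed cutoff)."""
    keep = set(graph)
    for measure in (degree_centrality(graph), closeness_centrality(graph)):
        mean = sum(measure.values()) / len(measure)
        keep &= {v for v, score in measure.items() if score > mean}
    return sorted(keep)

# Hypothetical toy interactome; P1..P6 are placeholder identifiers.
EDGES = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P1", "P5"), ("P5", "P6")]
GRAPH = build_graph(EDGES)
print(central_proteins(GRAPH))  # ['P1', 'P5']
```

In this toy network only P1 and P5 exceed the mean on both measures, so only they would move on to the essentiality and homology filters.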
The potential primary drug targets have been further prioritized based on their influence on
resistance genes using a maximum flow approach on the weighted proteome interaction network of
the pathogen. The weighted protein-protein interaction network of the pathogen has been
constructed using a dataset retrieved from STRING. The combined score of each pair of
interacting proteins has been assigned as the weight of their interaction. The potential drug
targets and resistance genes have been taken as inputs. Then, the potential drug targets have been
prioritized based on their maximum flow value to resistance genes. This approach does not suffer
from bias towards shortest paths since it is based on flow. More importantly, the inhibition of a
protein which has more influence on the resistance genes of the existing drugs is expected to
disrupt the communication to these genes. Hence, it can be considered as an additional
druggability assessment criterion for drug-resistant diseases like TB.
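The maximum flow prioritization can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis code: it uses the Edmonds-Karp algorithm (the abstract does not name a specific max-flow algorithm), the network and its integer weights are invented in the style of STRING's 0 to 1000 combined score, and summing the flow over all resistance genes is one assumed way to aggregate a score per target.

```python
from collections import deque

def build_capacities(weighted_edges):
    """Each undirected interaction becomes two directed arcs of equal capacity."""
    cap = {}
    for a, b, score in weighted_edges:
        cap.setdefault(a, {})[b] = score
        cap.setdefault(b, {})[a] = score
    return cap

def max_flow(cap, source, sink):
    """Edmonds-Karp: augment along shortest residual paths until none remain.

    Assumes source and sink are distinct nodes of the network.
    """
    residual = {u: dict(nbrs) for u, nbrs in cap.items()}
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount and update the residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] = residual[v].get(u, 0) + bottleneck
            v = u
        flow += bottleneck

def rank_targets(cap, targets, resistance_genes):
    """Rank targets by total max flow to the resistance genes (assumed aggregation)."""
    totals = {
        t: sum(max_flow(cap, t, r) for r in resistance_genes if r != t)
        for t in targets
    }
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical network: T1, T2 are candidate targets, R a resistance gene, M a mediator.
WEIGHTED_EDGES = [("T1", "M", 900), ("M", "R", 700), ("T1", "R", 400), ("T2", "M", 500)]
CAP = build_capacities(WEIGHTED_EDGES)
print(rank_targets(CAP, ["T1", "T2"], ["R"]))  # [('T1', 1100), ('T2', 500)]
```

T1 reaches R both directly and through M, so its flow value (1100) reflects every available route rather than only the shortest one, which is the property the paragraph above attributes to the flow-based approach.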
Our limited system-level knowledge about the possible routes of resistance is one of the reasons
strategies against drug-resistant TB fail. Detailed analysis has been carried out to
explore the routes through which information required for triggering drug resistance may be
passed on in the cell. Proteins involved in the emergence of resistance by mediating information
between the drug target proteins of eight clinically used drugs in the current treatment regimen
of TB and resistance genes have been identified. These lists of proteins have been proposed as
potential co-targets of each drug. The analysis has been carried out on weighted drug-specific
protein-protein interaction networks of the pathogen. The validated drug targets and resistance
genes have been taken as inputs. The maximum flow values of proteins in the flow from validated
drug targets to resistance genes have been computed. Proteins have been prioritized based on their
maximum flow value. Subsequent filters have been applied: non-homology assessment to avoid host
toxicity, identification of proteins that interact with the host, and essentiality analysis.
The final refined lists of proteins have strong involvement in the emergence of drug
resistance, and targeting them in systematic combination with existing drugs is believed to be
effective in preventing the emergence of drug resistance.
In silico structural analysis of protein translocase subunit SecY (Rv0732) has been carried out to
obtain a descriptive three-dimensional structure. Rv0732 has been selected because it is a highly
ranked potential drug target without a solved three-dimensional structure. The active site has
been identified for protein-ligand or protein-inhibitor binding. |
en_US |