Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/18709
Title: NOVEL MACHINE LEARNING MODEL TO CAPTURE IN VIVO PROTEIN-DNA BINDING
Authors: Singla, Bhaviktisha
Issue Date: Apr-2024
Publisher: IIT, Roorkee
Abstract: Transcription factors (TFs) play a vital role in gene regulation, binding selectively to DNA at distinct sequence motifs that confer the specificity needed to control gene regulatory processes. Understanding TF binding within the cellular context is essential for comprehending growth, development, differentiation, evolution, and disease. Current computational methods for modeling and predicting TF-DNA binding focus mainly on sequence-specific binding data, often neglecting dependencies and lacking precision in their predictions. Various models have been developed, and sequences have been scored using conventional methods such as position weight matrices (PWMs) through MEME-ChIP. The MinSeq-ChIP algorithm generates MinSeqs from our sequences for scoring purposes. Our findings indicate a substantial performance improvement in the base model created using the MinSeq-ChIP algorithm. However, the algorithm has limitations: it assumes independence between all neighboring MinSeqs during scoring and, notably, applies threshold-based filtering that considers only MinSeqs with more than 50 counts. To improve its results, machine learning algorithms have been integrated in this study. Given that the base model of the MinSeq-ChIP algorithm already outperforms existing models, incorporating this algorithm into machine learning models yields remarkable performance in accurately and precisely identifying the binding motifs of transcription factors.
URI: http://localhost:8081/jspui/handle/123456789/18709
Research Supervisor/Guide: Bhimsaria, Devesh
Type: Dissertations
Appears in Collections: MASTERS' THESES (Bio.)

Files in This Item:
File: 22559002_BHAVIKTISHA SINGLA.pdf
Size: 5.78 MB
Format: Adobe PDF

