Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/19270
Title: AUTOMATIC HYPERPARAMETER OPTIMIZATION OF NEURAL ARCHITECTURES
Authors: G. M., Biju
Issue Date: Feb-2024
Publisher: IIT Roorkee
Abstract: The success of deep learning in time series analysis and computer vision relies heavily on high-performing neural architectures. This research presents a framework built around hyperparameter optimization and Neural Architecture Search (NAS) to optimize the architectural topology of deep learning models applied to diverse electrical system tasks and to image classification. The primary objective is to maximize classification or regression accuracy while advancing auxiliary objectives such as model interpretability and reduced search cost. Five key studies contribute to this unified approach.

In the first study, a fault classification model based on Long Short-Term Memory (LSTM) networks was developed for the Power System Machine Learning benchmark dataset, with a primary focus on strengthening reliability through enhanced interpretability. The study introduces novel metrics for gauging model interpretability, derived from the disentanglement of fault classification factors. Hyperparameter optimization, executed through multiobjective Bayesian optimization, was then employed to identify the optimal model architecture, with the objective of simultaneously maximizing interpretability and classification accuracy. The resulting Pareto-optimal set comprises diverse model architectures offering different trade-offs between accuracy and interpretability. The study also examines how interpretability manifests over input subsequences using Shapley Additive Explanations (SHAP), and analyzes the impact of class representation and architectural parameters on interpretability. Notably, the most accurate model on the Pareto front achieves highly competitive accuracy when benchmarked against published results on the dataset.

The second study focuses on architecture optimization for electrical load forecasting, incorporating a temporal attention mechanism into an LSTM-attention hybrid model. A novel metric assesses interpretability, and Pareto optimization seeks an architecture that balances interpretability and prediction accuracy. The investigation explores the relationship between model architecture and interpretability.

The third study applies Differentiable Architecture Search (DARTS) to electric load demand forecasting. By generating a customized RNN cell through NAS, the study demonstrates the advantage of tailoring the internal RNN structure: models built from these cells outperform general-purpose RNN variants.

The fourth study develops optimized transfer learning for electrical time series classification, leveraging models pre-trained on vision datasets. The proposed algorithm adapts pre-trained models to target tasks, achieving state-of-the-art performance: it identifies and retains features that contribute to target-task performance while pruning features that hinder accuracy gains. Once optimized, the pre-trained model extracts features from the spectrogram of the input electric time series (voltage, current, power, etc.), and a downstream classifier uses these features to produce class scores for categories such as grid fault types and power quality disturbance types. To improve the model's confidence estimates, the class scores are calibrated.
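The abstract does not name the calibration technique; a standard choice for calibrating class scores is temperature scaling, sketched below under that assumption. The function name and the toy logits/labels are illustrative only, not taken from the thesis.

```python
import torch
import torch.nn as nn

def calibrate_temperature(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Learn a scalar temperature T that rescales held-out logits
    to minimize negative log-likelihood (temperature scaling)."""
    temperature = nn.Parameter(torch.ones(1))
    nll = nn.CrossEntropyLoss()
    optimizer = torch.optim.LBFGS([temperature], lr=0.05, max_iter=50)

    def closure():
        optimizer.zero_grad()
        loss = nll(logits / temperature, labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return temperature.item()

# Toy usage: random "overconfident" validation logits for 5 classes.
logits = torch.randn(512, 5) * 3.0
labels = torch.randint(0, 5, (512,))
T = calibrate_temperature(logits, labels)
probs = torch.softmax(logits / T, dim=1)  # calibrated class probabilities
```

Dividing logits by a learned T > 1 softens overconfident softmax outputs without changing the predicted class, which is why calibration improves confidence estimates while leaving accuracy untouched.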
Experiments on 12 electric time series datasets demonstrate the method's effectiveness, with ablation studies underscoring the importance of both optimizing the pre-trained model and calibrating the class scores.

The fifth study introduces Sequential Neural Architecture Search (SQNAS), a novel algorithm that searches the nodes of a cell-based NAS space independently and sequentially. SQNAS substantially reduces search cost while achieving competitive performance on image classification tasks, and its efficiency is demonstrated across several datasets.

Collectively, these studies contribute to a unified understanding of how to optimize deep learning architectures for improved performance. The proposed methodologies, centered on hyperparameter optimization and NAS, demonstrate the importance of tailored architectures in achieving state-of-the-art results across diverse tasks.
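To make the hyperparameter-optimization machinery of the first two studies concrete, the following is a minimal sketch of a multiobjective Bayesian search over LSTM hyperparameters, assuming the Optuna library with its TPE sampler (a Bayesian-style sequential model-based optimizer). The synthetic dataset, the model, and the size-based interpretability score are placeholders; the thesis uses a disentanglement-based interpretability metric that is not reproduced here.

```python
import optuna
import torch
import torch.nn as nn

# Synthetic stand-in for a fault-classification time series dataset:
# 256 sequences, 30 time steps, 4 channels, 3 fault classes.
X = torch.randn(256, 30, 4)
y = torch.randint(0, 3, (256,))
X_train, y_train, X_val, y_val = X[:192], y[:192], X[192:], y[192:]

class LSTMClassifier(nn.Module):
    def __init__(self, hidden_size, num_layers, dropout):
        super().__init__()
        self.lstm = nn.LSTM(4, hidden_size, num_layers,
                            batch_first=True, dropout=dropout)
        self.head = nn.Linear(hidden_size, 3)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # classify from the last time step

def objective(trial):
    model = LSTMClassifier(
        hidden_size=trial.suggest_int("hidden_size", 16, 128),
        num_layers=trial.suggest_int("num_layers", 1, 3),
        dropout=trial.suggest_float("dropout", 0.0, 0.5),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(20):  # short full-batch training budget for the sketch
        opt.zero_grad()
        loss_fn(model(X_train), y_train).backward()
        opt.step()
    with torch.no_grad():
        accuracy = (model(X_val).argmax(1) == y_val).float().mean().item()
    # Placeholder interpretability score: smaller models score higher.
    n_params = sum(p.numel() for p in model.parameters())
    interpretability = 1.0 / (1.0 + n_params / 1e4)
    return accuracy, interpretability

study = optuna.create_study(directions=["maximize", "maximize"],
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=25)
for t in study.best_trials:  # the Pareto front of trials
    print(t.values, t.params)
```

After the search, study.best_trials holds the Pareto-optimal trials, each representing a different accuracy/interpretability trade-off, mirroring the Pareto fronts reported in the thesis.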
URI: http://localhost:8081/jspui/handle/123456789/19270
Research Supervisor/Guide: Pillai, P. K. Gopinatha and Seshadrinath, Jeevanand
Type: Thesis
Appears in Collections: DOCTORAL THESES (Electrical Engg)

Files in This Item:
File: 17914017_BIJU G. M..pdf
Size: 10.35 MB
Format: Adobe PDF

