Please use this identifier to cite or link to this item: http://localhost:8081/xmlui/handle/123456789/2381
Full metadata record
DC Field | Value | Language
dc.contributor.author | Kumar, Raj | -
dc.date.accessioned | 2014-09-27T07:18:49Z | -
dc.date.available | 2014-09-27T07:18:49Z | -
dc.date.issued | 2012 | -
dc.identifier | M.Tech | en_US
dc.identifier.uri | http://hdl.handle.net/123456789/2381 | -
dc.guide | Nigam, M. J. | -
dc.description.abstract | Reinforcement learning is a variant of unsupervised learning in which the learner must learn by interacting with the environment; the only performance measure available is the feedback signal from the environment. The learner performs an action on the environment and, after evaluating the result, must distinguish high-quality actions from low-quality ones. The learner therefore has to explore many of the possible actions before it can judge them, which makes learning very slow; discriminating among actions is also difficult in the long-term sense. In this dissertation, the working of Q-learning is first examined for the large, discretized state space of an inverted pendulum. The control algorithm is off-policy temporal difference learning, which approximates the action-value function directly; this simplifies the analysis of the algorithm and reduces the convergence time. The results obtained in the discrete state space show that the time taken to control the system is reduced. The policy in use determines which state-action pairs are visited and hence updated, and the resulting controller learns in a fairly reasonable time. The work is then extended to a continuous state space using the universal function approximation capability of fuzzy logic. This work therefore presents a self-tuning method for fuzzy logic controllers, in which the consequent part of the controller is tuned through the Q-learning algorithm of reinforcement learning. The off-policy temporal difference algorithm used for tuning directly approximates the action-value function that yields the maximum reward; in this way, Q-learning is applied to a continuous environment. The approach retains the advantages of a fuzzy logic controller: it is robust under environmental uncertainties, and no expert knowledge is required to design the rule base. (Illustrative sketches of the two learning updates described here follow this record.) | en_US
dc.language.iso | en | en_US
dc.subject | FUZZY LOGIC | en_US
dc.subject | CONTROLLER | en_US
dc.subject | REINFORCEMENT | en_US
dc.subject | ELECTRONICS AND COMPUTER ENGINEERING | en_US
dc.title | TUNING OF FUZZY LOGIC CONTROLLER THROUGH REINFORCEMENT LEARNING | en_US
dc.type | M.Tech Dissertation | en_US
dc.accession.number | G21990 | en_US
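
As a point of reference for the abstract above, the following is a minimal sketch of the off-policy temporal-difference (Q-learning) update it describes, applied to a discretized inverted-pendulum state space. All sizes, gains, and names here are illustrative assumptions; the thesis's actual discretization, reward function, and parameters are not given in this record.

```python
import numpy as np

# Hypothetical sizes and gains for illustration only; the thesis's actual
# discretization, reward shaping, and parameters are not given in this record.
N_STATES = 162    # e.g. a "boxes" discretization of (theta, theta_dot, x, x_dot)
N_ACTIONS = 2     # push the cart left or right
ALPHA = 0.1       # learning rate
GAMMA = 0.99      # discount factor
EPSILON = 0.1     # exploration rate of the epsilon-greedy behavior policy

Q = np.zeros((N_STATES, N_ACTIONS))   # tabular action-value function

def choose_action(state: int) -> int:
    """Epsilon-greedy behavior policy: mostly greedy, occasionally exploratory."""
    if np.random.rand() < EPSILON:
        return np.random.randint(N_ACTIONS)
    return int(np.argmax(Q[state]))

def q_update(state: int, action: int, reward: float, next_state: int) -> None:
    """Off-policy TD(0) update: the target bootstraps from max_a Q(s', a),
    independent of the action the behavior policy will actually take next."""
    td_target = reward + GAMMA * np.max(Q[next_state])
    Q[state, action] += ALPHA * (td_target - Q[state, action])
```

Because the target bootstraps from max_a Q(s', a) rather than from the behavior policy's next action, the update is off-policy: the greedy action-value function is approximated directly while an exploratory policy decides which state-action pairs are visited and hence updated, as the abstract notes.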
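The continuous-state extension tunes the consequent part of a fuzzy logic controller with the same Q-learning machinery. The sketch below follows a common fuzzy Q-learning scheme (per-rule candidate consequents with rule-level q-values, and a shared TD error distributed by firing strength); this specific scheme is an assumption about the thesis's method, and all names and sizes are illustrative.

```python
import numpy as np

# Assumed rule-base dimensions; the thesis's actual rule base is not given here.
N_RULES = 9
CANDIDATES = np.linspace(-10.0, 10.0, 5)  # candidate crisp consequents per rule
ALPHA, GAMMA, EPSILON = 0.05, 0.95, 0.1

q = np.zeros((N_RULES, len(CANDIDATES)))  # per-rule action values

def select_consequents() -> np.ndarray:
    """Epsilon-greedy choice of one candidate consequent for each rule."""
    idx = np.argmax(q, axis=1)
    explore = np.random.rand(N_RULES) < EPSILON
    idx[explore] = np.random.randint(len(CANDIDATES), size=int(explore.sum()))
    return idx

def controller_output(phi: np.ndarray, idx: np.ndarray) -> float:
    """Defuzzified control signal: firing-strength-weighted sum of the chosen
    consequents. phi holds the normalized rule firing strengths (summing to 1)."""
    return float(phi @ CANDIDATES[idx])

def fuzzy_q_update(phi: np.ndarray, idx: np.ndarray,
                   reward: float, phi_next: np.ndarray) -> None:
    """Share one TD error among the rules in proportion to firing strength."""
    q_taken = float(phi @ q[np.arange(N_RULES), idx])   # Q of the composite action
    q_star_next = float(phi_next @ q.max(axis=1))       # greedy next-state value
    td_error = reward + GAMMA * q_star_next - q_taken
    q[np.arange(N_RULES), idx] += ALPHA * td_error * phi
```

In this arrangement the fuzzy rule base serves as the universal function approximator the abstract mentions: the firing strengths interpolate the tabular update over the continuous state space, so no expert knowledge is needed to hand-design the consequents.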
Appears in Collections: MASTERS' THESES (E & C)

Files in This Item:
File | Description | Size | Format
ECDGF21990.pdf |  | 2.98 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.