Practical Reinforcement Learning

Develop self-evolving, intelligent agents with OpenAI Gym, Python, and Java

Dr Engr S.M Farrukh Akhtar

BIRMINGHAM - MUMBAI

Practical Reinforcement Learning

Copyright © 2017 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: October 2017
Production reference: 1131017

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78712-872-9
www.packtpub.com

Credits

Author: Dr Engr S.M Farrukh Akhtar
Reviewers: Ruben Oliva Ramos, Juan Tomás Oliva Ramos, Vijayakumar Ramdoss
Commissioning Editor: Wilson D'souza
Acquisition Editor: Tushar Gupta
Content Development Editor: Mayur Pawanikar
Technical Editor: Suwarna Patil
Copy Editors: Vikrant Phadkay, Alpha Singh
Project Coordinator: Nidhi Joshi
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Graphics: Tania Dutta
Production Coordinator: Aparna Bhagat

About the Author

Dr Engr S.M Farrukh Akhtar is an active researcher and speaker with more than 13 years of industrial experience analyzing, designing, developing, integrating, and managing large applications in different countries and diverse industries.
He has worked in Dubai, Pakistan, Germany, Singapore, and Malaysia. He is currently working at Hewlett Packard as an enterprise solution architect.

He received a PhD in artificial intelligence from European Global School, France. He also received two master's degrees: a master's of intelligent systems from the University Technology Malaysia, and an MBA in business strategy from the International University of Georgia. Farrukh completed his BSc in computer engineering at Sir Syed University of Engineering and Technology, Pakistan. He is also an active contributor to and member of the machine learning for data science research group at the University Technology Malaysia. His research and focus areas are mainly big data, deep learning, and reinforcement learning.

He has cross-platform expertise and has achieved recognition for his expertise from IBM, Sun Microsystems, Oracle, and Microsoft. Farrukh received the following accolades:

Sun Certified Java Programmer in 2001
Microsoft Certified Professional and Sun Certified Web Component Developer in 2002
Microsoft Certified Application Developer in 2003
Microsoft Certified Solution Developer in 2004
Oracle Certified Professional in 2005
IBM Certified Solution Developer - XML in 2006
IBM Certified Big Data Architect and Scrum Master Certified - For Agile Software Practitioners in 2017

He also contributes his experience and services as a member of the board of directors of K.K Abdal Institute of Engineering and Management Sciences, Pakistan, and is a board member of Alam Educational Society.

Skype id: farrukh.akhtar

...

Supervised learning
Unsupervised learning
Reinforcement learning
Introduction to reinforcement learning
Positive reinforcement learning
Negative reinforcement learning
Applications of reinforcement learning
...
Carlo and policy gradient with practical examples. The third part applies the most recent and widely used reinforcement learning algorithms to practical applications. We end with practical