www.it-ebooks.info

Python for Finance
Yves Hilpisch

Beijing • Cambridge • Farnham • Köln • Sebastopol • Tokyo

Preface

Not too long ago, Python as a programming language and platform technology was considered exotic, if not completely irrelevant, in the financial industry. By contrast, in 2014 there are many examples of large financial institutions (like Bank of America Merrill Lynch with its Quartz project, or JP Morgan Chase with the Athena project) that strategically use Python alongside other established technologies to build, enhance, and maintain some of their core IT systems. There is also a multitude of larger and smaller hedge funds that make heavy use of Python’s capabilities when it comes to efficient financial application development and productive financial analytics efforts.

Similarly, many of today’s Master of Financial Engineering programs (or programs awarding similar degrees) use Python as one of the core languages for teaching the translation of quantitative finance theory into executable computer code. Educational programs and trainings targeted to finance professionals are also increasingly incorporating Python into their curricula. Some now teach it as the main implementation language.

There are many reasons why Python has had such recent success and why it seems it will continue to do so in the future. Among these reasons are its syntax, the ecosystem of scientific and data analytics libraries available to developers using Python, its ease of integration with almost any other technology, and its status as open source. (See Chapter 1 for a few more insights in this regard.)
For that reason, there is an abundance of good books available that teach Python from different angles and with different focuses. This book is one of the first to introduce and teach Python for finance, in particular for quantitative finance and for financial analytics. The approach is a practical one, in that implementation and illustration come before theoretical details, and the focus is generally on the big picture rather than on the most arcane parameterization options of a certain class or function.

Most of this book has been written in the powerful, interactive, browser-based IPython Notebook environment (explained in more detail in Chapter 2). This makes it possible to provide the reader with executable, interactive versions of almost all examples used in this book.

Those who want to immediately get started with a full-fledged, interactive financial analytics environment for Python (and, for instance, R and Julia) should go to http://oreilly.quant-platform.com and try out the Python Quant Platform (in combination with the IPython Notebook files and code that come with this book). You should also have a look at DX analytics, a Python-based financial analytics library. My other book, Derivatives Analytics with Python (Wiley Finance), presents more details on the theory and numerical methods for advanced derivatives analytics. It also provides a wealth of readily usable Python code. Further material, in particular slide decks and videos of talks about Python for Quant Finance, can be found on my private website.

If you want to get involved in Python for Quant Finance community events, there are opportunities in the financial centers of the world. For example, I myself (co)organize meetup groups with this focus in London (cf. http://www.meetup.com/Python-for-Quant-Finance-London/) and New York City (cf. http://www.meetup.com/Python-for-Quant-Finance-NYC/). There are also For Python Quants conferences and workshops several times a year (cf.
http://forpythonquants.com and http://pythonquants.com).

I am really excited that Python has established itself as an important technology in the financial industry. I am also sure that it will play an even more important role there in the future, in fields like derivatives and risk analytics or high performance computing. My hope is that this book will help professionals, researchers, and students alike make the most of Python when facing the challenges of this fascinating field.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, and email addresses.
Constant width
    Used for program listings, as well as within paragraphs to refer to software packages, programming languages, file extensions, filenames, program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.

TIP
    This element signifies a tip or suggestion.
WARNING
    This element indicates a warning or caution.

Using Code Examples

Supplemental material (in particular, IPython Notebooks and Python scripts/modules) is available for download at http://oreilly.quant-platform.com.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Python for Finance by Yves Hilpisch (O’Reilly). Copyright 2015 Yves Hilpisch, 978-1-491-94528-5.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

Safari® Books Online

NOTE
    Safari Books Online is an on-demand digital library that delivers expert content in both book and video form from the world’s leading authors in technology and business.

Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.

Safari Books Online offers a range of plans and pricing for enterprise, government, education, and individuals.

Members have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and hundreds more. For more information about Safari Books Online, please visit us online.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/python-finance.

To comment or ask technical questions about this book, send email to
bookquestions@oreilly.com.

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia

Acknowledgments

I want to thank all those who helped to make this book a reality, in particular those who have provided honest feedback or even completely worked out examples, like Ben Lerner, James Powell, Michael Schwed, Thomas Wiecki, or Felix Zumstein. Similarly, I would like to thank reviewers Hugh Brown, Jennifer Pierce, Kevin Sheppard, and Galen Wilkerson. The book benefited from their valuable feedback and the many suggestions.

The book has also benefited significantly as a result of feedback I received from the participants of the many conferences and workshops I was able to present at in 2013 and 2014: PyData, For Python Quants, Big Data in Quant Finance, EuroPython, EuroScipy, PyCon DE, PyCon Ireland, Parallel Data Analysis, Budapest BI Forum, and CodeJam. I also got valuable feedback during my many presentations at Python meetups in Berlin, London, and New York City.

Last but not least, I want to thank my family, which fully accepts that I do what I love doing most and this, in general, rather intensively. Writing and finishing a book of this length over the course of a year requires a large time commitment, on top of my usually heavy workload and packed travel schedule, and sometimes makes it necessary to sit in solitude in front of the computer for more hours than expected. Therefore, thank you Sandra, Lilli, and Henry for your understanding and support. I dedicate this book to my lovely wife Sandra, who is the heart of our family.

Yves
Saarland, November 2014

Part I Python and Finance

This part introduces Python for finance. It consists of three chapters:

Chapter 1 briefly discusses Python in general and argues why Python is indeed well
suited to address the technological challenges in the finance industry and in financial (data) analytics.

Chapter 2, on Python infrastructure and tools, is meant to provide a concise overview of the most important things you have to know to get started with interactive analytics and application development in Python; the related Appendix A surveys some selected best practices for Python development.

Chapter 3 immediately dives into three specific financial examples; it illustrates how to calculate implied volatilities of options with Python, how to simulate a financial model with Python and the array library NumPy, and how to backtest a trend-based investment strategy. This chapter should give the reader a feeling for what it means to use Python for financial analytics; details are not that important at this stage, as they are all explained in Part II.

tables
    compressed, Working with Compressed Tables
    working with, Working with Tables
tail risk, Value-at-Risk
technical analysis
    backtesting example, Technical Analysis
    definition of, Technical Analysis
    retrieving time series data, Technical Analysis
    testing investment strategy, Technical Analysis
    trading signal rules, Technical Analysis
    trend strategy, Technical Analysis
technology, role in finance, Technology in Finance–The Rise of Real-Time Analytics
templating, Templating
testing, unit testing, Unit Testing
text
    reading/writing text files, Reading and Writing Text Files
    representation with strings, Strings
three-dimensional plotting, 3D Plotting
tools, Tools–Spyder
    IPython, IPython–System shell commands
    Python interpreter, Python
    Spyder, Spyder–Spyder
    (see also mathematical tools)
traders’ chat room application
    basic idea of, Traders’ Chat Room
    commenting functionality, Core functionality
    connection/log in, Core functionality
    data modeling, Data Modeling
    database infrastructure, Imports and database preliminaries
    importing libraries, The Python Code
    security issues, Core functionality
    styling, Styling
    templating, Templating
traits library, Graphical User Interfaces
traitsui.api library, Updating of Values
trapezoidal rule, Numerical Integration
tuples, Tuples
two-dimensional plotting
    importing libraries, Two-Dimensional Plotting
    one-dimensional data set, One-Dimensional Data Set–One-Dimensional Data Set
    other plot styles, Other Plot Styles–Other Plot Styles
    two-dimensional data set, Two-Dimensional Data Set–Two-Dimensional Data Set

U
unit testing best practices, Unit Testing
universal functions, Basic Vectorization, Basic Analytics
unsorted data, Unsorted data
updating of beliefs, Statistics
urllib library, urllib
URLs (uniform resource locators), urllib
user-defined functions (UDF), User-defined functions

V
valuation framework
    Fundamental Theorem of Asset Pricing, Fundamental Theorem of Asset Pricing
    overview of, Valuation Framework
    risk-neutral discounting, Risk-Neutral Discounting
valuation of contingent claims, Valuation–American Options
    American options, American Options
    European options, European Options
valuation theory, Least-Squares Monte Carlo
value-at-risk (VaR), Value-at-Risk
values, updating in GUI, Updating of Values
variance of returns, Normality Tests
variance reduction, Variance Reduction
vectorization
    basic, Basic Vectorization
    full with log Euler scheme, Full Vectorization with Log Euler Scheme
    fundamental idea of, Vectorization of Code
    memory layout, Memory Layout
    speed increase achieved by, Vectorization with NumPy
    with DataFrames, Financial Data
    with NumPy, Vectorization with NumPy
Vega
    definition of, Generic Valuation Class
    of a European option in BSM model, Implied Volatilities
visualization (see data visualization)
VIX volatility index, Volatility Options
volatility clustering, Financial Data
volatility index, The Financial Model
volatility options
    American on the VSTOXX, American Options on the VSTOXX–The Options Portfolio
    main index, Volatility Options
    model calibration, Model Calibration–Calibration Procedure
    tasks undertaken, Volatility Options
    VSTOXX data, The VSTOXX Data–VSTOXX Options Data
volatility smile, Implied Volatilities
volatility, stochastic model, Stochastic volatility
VSTOXX data
    futures data, VSTOXX Futures Data
    index data, VSTOXX Index Data
    libraries required, The VSTOXX Data
    options data, VSTOXX Options Data

W
web browser deployment, Python Quant Platform
web technologies
    communication protocols, Web Basics–urllib
    rapid web applications, Rapid Web Applications–Styling
    role in finance, Web Integration
    web plotting, Web Plotting–Real-time stock price quotes
    web services, Web Services–The Implementation
Werkzeug library, Rapid Web Applications
workbooks
    generating xls workbooks, Generating Workbooks (.xls)
    generating xlsx workbooks, Generating Workbooks (.xlsx)
    OpenPyxl library for, Using OpenPyxl
    pandas generated, Using pandas for Reading and Writing
    reading from, Reading from Workbooks

X
xlrd library, Basic Spreadsheet Interaction
xlsxwriter library, Basic Spreadsheet Interaction
xlwings library, xlwings
xlwt library, Basic Spreadsheet Interaction

Y
Yahoo! Finance, Financial Plots, Financial Data

Z
Zen of Python, What Is Python?
zero-based numbering schemes, Tuples

About the Author

Yves Hilpisch is the founder and managing partner of The Python Quants, an analytics software provider and financial engineering group. The Python Quants offer, among others, the Python Quant Platform (http://quant-platform.com) and DX Analytics (http://dx-analytics.com). Yves also lectures on mathematical finance and organizes meetups and conferences about Python for Quantitative Finance in New York and London.

Colophon

The animal on the cover of Python for Finance is a Hispaniolan solenodon. The Hispaniolan solenodon (Solenodon paradoxus) is an endangered mammal that lives on the Caribbean island of Hispaniola, which comprises Haiti and the Dominican Republic. It’s particularly rare in Haiti and a bit more common in the Dominican Republic.

Solenodons are known to eat arthropods, worms, snails, and reptiles. They also consume roots, fruit, and leaves on occasion. A solenodon weighs a pound or two and has a foot-long head and body plus a ten-inch tail, give or take. This ancient mammal looks somewhat like a big shrew. It’s quite furry, with reddish-brown coloring on top and lighter fur on its undersides, while its tail, legs, and prominent snout lack hair.

It has a rather sedentary lifestyle and often stays out of sight. When it does come out, its movements tend to be awkward, and it sometimes trips when running. However, being a night creature, it has developed an acute sense of hearing, smell, and touch. Its own distinctive scent is said to be “goatlike.”

It gets toxic saliva from a groove in the second lower incisor and uses it to paralyze and attack its invertebrate prey. As such, it is one of the few venomous mammals. Sometimes the venom is released when fighting among each other, and can be fatal to the solenodon itself. Often, after initial conflict, they establish a dominance relationship and get along in the same living quarters. Families tend to live together for a long time. Apparently,
it only drinks while bathing.

Many of the animals on O’Reilly covers are endangered; all of them are important to the world. To learn more about how you can help, go to animals.oreilly.com.

The cover image is from Wood’s Illustrated Natural History. The cover fonts are URW Typewriter and Guardian Sans. The text font is Adobe Minion Pro; the heading font is Adobe Myriad Condensed; and the code font is Dalton Maag’s Ubuntu Mono.

Python for Finance
Yves Hilpisch

Editor: Brian MacDonald
Editor: Meghan Blanchette

Revision History
2014-12-09 First release

Copyright © 2014 Yves Hilpisch

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Python for Finance, the cover image of a Hispaniolan solenodon, and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps.

While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such
licenses and/or rights.

This book is not intended as financial advice. Please consult a qualified professional if you require financial advice.

O’Reilly Media
1005 Gravenstein Highway North
Sebastopol, CA 95472

2014-12-10T07:08:11-08:00

Python for Finance

Table of Contents

Preface
  Conventions Used in This Book
  Using Code Examples
  Safari® Books Online
  How to Contact Us
  Acknowledgments
I Python and Finance
  1 Why Python for Finance?
    What Is Python?
      Brief History of Python
      The Python Ecosystem
      Python User Spectrum
      The Scientific Stack
    Technology in Finance
      Technology Spending
      Technology as Enabler
      Technology and Talent as Barriers to Entry
      Ever-Increasing Speeds, Frequencies, Data Volumes
      The Rise of Real-Time Analytics
    Python for Finance
      Finance and Python Syntax
      Efficiency and Productivity Through Python
        Shorter time-to-results
        Ensuring high performance
      From Prototyping to Production
    Conclusions
    Further Reading
  2 Infrastructure and Tools
    Python Deployment
      Anaconda
      Python Quant Platform
    Tools
      Python
      IPython
        From shell to browser
        Basic usage
        Markdown and LaTeX
        Magic commands
        System shell commands
      Spyder
    Conclusions
    Further Reading
  3 Introductory Examples
    Implied Volatilities
    Monte Carlo Simulation
      Pure Python
      Vectorization with NumPy
      Full Vectorization with Log Euler Scheme
      Graphical Analysis
    Technical Analysis
    Conclusions
    Further Reading
II Financial Analytics and Development
  4 Data Types and Structures
    Basic Data Types
      Integers
      Floats
      Strings
    Basic Data Structures
      Tuples
      Lists
      Excursion: Control Structures
      Excursion: Functional Programming
      Dicts
      Sets
    NumPy Data Structures
      Arrays with Python Lists
      Regular NumPy Arrays
      Structured Arrays
    Vectorization of Code
      Basic Vectorization
      Memory Layout
    Conclusions
    Further Reading
  5 Data Visualization
    Two-Dimensional Plotting
      One-Dimensional Data Set
      Two-Dimensional Data Set
      Other Plot Styles
    Financial Plots
    3D Plotting
    Conclusions
    Further Reading
  6 Financial Time Series
    pandas Basics
      First Steps with DataFrame Class
      Second Steps with DataFrame Class
      Basic Analytics
      Series Class
      GroupBy Operations
    Financial Data
    Regression Analysis
    High-Frequency Data
    Conclusions
    Further Reading
  7 Input/Output Operations
    Basic I/O with Python
      Writing Objects to Disk
      Reading and Writing Text Files
      SQL Databases
      Writing and Reading NumPy Arrays
    I/O with pandas
      SQL Database
      From SQL to pandas
      Data as CSV File
      Data as Excel File
    Fast I/O with PyTables
      Working with Tables
      Working with Compressed Tables
      Working with Arrays
      Out-of-Memory Computations
    Conclusions
    Further Reading
  8 Performance Python
    Python Paradigms and Performance
    Memory Layout and Performance
    Parallel Computing
      The Monte Carlo Algorithm
      The Sequential Calculation
      The Parallel Calculation
      Performance Comparison
    multiprocessing
    Dynamic Compiling
      Introductory Example
      Binomial Option Pricing
    Static Compiling with Cython
    Generation of Random Numbers on GPUs
    Conclusions
    Further Reading
  9 Mathematical Tools
    Approximation
      Regression
        Monomials as basis functions
        Individual basis functions
        Noisy data
        Unsorted data
        Multiple dimensions
      Interpolation
    Convex Optimization
      Global Optimization
      Local Optimization
      Constrained Optimization
    Integration
      Numerical Integration
      Integration by Simulation
    Symbolic Computation
      Basics
      Equations
      Integration
      Differentiation
    Conclusions
    Further Reading
  10 Stochastics
    Random Numbers
    Simulation
      Random Variables
      Stochastic Processes
        Geometric Brownian motion
        Square-root diffusion
        Stochastic volatility
        Jump diffusion
      Variance Reduction
    Valuation
      European Options
      American Options
    Risk Measures
      Value-at-Risk
      Credit Value Adjustments
    Conclusions
    Further Reading
  11 Statistics
    Normality Tests
      Benchmark Case
      Real-World Data
    Portfolio Optimization
      The Data
      The Basic Theory
      Portfolio Optimizations
      Efficient Frontier
      Capital Market Line
    Principal Component Analysis
      The DAX Index and Its 30 Stocks
      Applying PCA
      Constructing a PCA Index
    Bayesian Regression
      Bayes’s Formula
      PyMC3
      Introductory Example
      Real Data
    Conclusions
    Further Reading
  12 Excel Integration
    Basic Spreadsheet Interaction
      Generating Workbooks (.xls)
      Generating Workbooks (.xlsx)
      Reading from Workbooks
      Using OpenPyxl
      Using pandas for Reading and Writing
    Scripting Excel with Python
      Installing DataNitro
      Working with DataNitro
        Scripting with DataNitro
        Plotting with DataNitro
        User-defined functions
    xlwings
    Conclusions
    Further Reading
  13 Object Orientation and Graphical User Interfaces
    Object Orientation
      Basics of Python Classes
      Simple Short Rate Class
      Cash Flow Series Class
    Graphical User Interfaces
      Short Rate Class with GUI
      Updating of Values
      Cash Flow Series Class with GUI
    Conclusions
    Further Reading
  14 Web Integration
    Web Basics
      ftplib
      httplib
      urllib
    Web Plotting
      Static Plots
      Interactive Plots
      Real-Time Plots
        Real-time FX data
        Real-time stock price quotes
    Rapid Web Applications
      Traders’ Chat Room
      Data Modeling
      The Python Code
        Imports and database preliminaries
        Core functionality
      Templating
      Styling
    Web Services
      The Financial Model
      The Implementation
    Conclusions
    Further Reading
III Derivatives Analytics Library
  15 Valuation Framework
    Fundamental Theorem of Asset Pricing
      A Simple Example
      The General Results
    Risk-Neutral Discounting
      Modeling and Handling Dates
      Constant Short Rate
    Market Environments
    Conclusions
    Further Reading
  16 Simulation of Financial Models
    Random Number Generation
    Generic Simulation Class
    Geometric Brownian Motion
      The Simulation Class
      A Use Case
    Jump Diffusion
      The Simulation Class
      A Use Case
    Square-Root Diffusion
      The Simulation Class
      A Use Case
    Conclusions
    Further Reading
  17 Derivatives Valuation
    Generic Valuation Class
    European Exercise
      The Valuation Class
      A Use Case
    American Exercise
      Least-Squares Monte Carlo
      The Valuation Class
      A Use Case
    Conclusions
    Further Reading
  18 Portfolio Valuation
    Derivatives Positions
      The Class
      A Use Case
    Derivatives Portfolios
      The Class
      A Use Case
    Conclusions
    Further Reading
  19 Volatility Options
    The VSTOXX Data
      VSTOXX Index Data
      VSTOXX Futures Data
      VSTOXX Options Data
    Model Calibration
      Relevant Market Data
      Option Modeling
      Calibration Procedure
    American Options on the VSTOXX
      Modeling Option Positions
      The Options Portfolio
    Conclusions
    Further Reading
A Selected Best Practices
  Python Syntax
  Documentation
  Unit Testing
B Call Option Class
C Dates and Times
  Python
  NumPy
  pandas
Index
About the Author
Colophon
Copyright