Erlang and OTP in Action

DOCUMENT INFORMATION

Basic information

Format
Pages	397
Size	6.16 MB

Contents

©Manning Publications Co. Please post comments or corrections to the Author Online forum: http://www.manning-sandbox.com/forum.jspa?forumID=454

MEAP Edition
Manning Early Access Program
Copyright 2009 Manning Publications
For more information on this and other Manning titles go to www.manning.com

Table of Contents

Part One: Getting Past Pure Erlang; The OTP Basics
Chapter One: The Foundations of Erlang/OTP
Chapter Two: Erlang Essentials
Chapter Three: Writing a TCP based RPC Service
Chapter Four: OTP Packaging and Organization
Chapter Five: Processes, Linking and the Platform

Part Two: Building A Production System
Chapter Six: Implementing a Caching System
Chapter Seven: Logging and Eventing the Erlang/OTP way
Chapter Eight: Introducing Distributed Erlang/OTP
Chapter Nine: Converting the Cache into a Distributed Application
Chapter Ten: Packaging, Services and Deployment

Part Three: Working in a Modern Environment
Chapter Eleven: Non-native Erlang Distribution with TCP and REST
Chapter Twelve: Drivers and Multi-Language Interfaces
Chapter Thirteen: Communication between Erlang and Java via JInterface
Chapter Fourteen: Optimization and Performance
Chapter Fifteen: Make it Faster

Appendix A – Installing Erlang
Appendix B – Lists and Referential Transparency

1 The foundations of Erlang/OTP

Welcome to our book about Erlang and OTP in action!
You probably know already that Erlang is a programming language, and as such it is pretty interesting in itself, but our focus here will be on the practical and the “in action”, and for that we also need the OTP framework. This is always included in any Erlang distribution, and is such an integral part of Erlang these days that it is hard to say where the line is drawn between OTP and the plain standard libraries; hence, one often writes “Erlang/OTP” to refer to either or both.

But why should we learn to use the OTP framework, when we could just hack away, rolling our own solutions as we go? Well, these are some of the main points of OTP:

• Productivity: Using OTP makes it possible to produce production-quality systems in a very short time.
• Stability: Code written on top of OTP can focus on the logic, and avoid error-prone re-implementations of the typical things that every real-world system will need: process management, servers, state machines, etc.
• Supervision: The application structure provided by the framework makes it simple to supervise and control the running systems, both automatically and through graphical user interfaces.
• Upgradability: The framework provides patterns for handling code upgrades in a systematic way.
• Reliable code base: The code for the OTP framework itself is rock-solid and has been thoroughly battle tested.

Despite these advantages, it is probably true to say that to most Erlang programmers, OTP is still something of a secret art, learned partly by osmosis and partly by poring over the more impenetrable sections of the documentation. We would like to change this. This is to our knowledge the first book focused on learning to use OTP, and we want to show that it can be a much easier experience than you might think.
We are sure you won’t regret it.

In this first chapter, we will present the core features on which Erlang/OTP is built, and that are provided by the Erlang programming language and run-time system:

• Processes and concurrency
• Fault tolerance
• Distributed programming
• Erlang's core functional language

The point here is to get you acquainted with the thinking behind all the concrete stuff we’ll be diving into from chapter 2 onwards, rather than starting off by handing you a whole bunch of facts up front. Erlang is different, and many of the things you will see in this book will take some time to get accustomed to. With this chapter, we hope to give you some idea of why things work the way they do, before we get into technical details.

1.1 – Understanding processes and concurrency

Erlang was designed for concurrency (having multiple tasks running simultaneously) from the ground up; it was a central concern when the language was designed. Its built-in support for concurrency, which uses the process concept to get a clean separation between tasks, allows us to create fault tolerant architectures and fully utilize the multi-core hardware that is available to us today. Before we go any further, we should explain exactly what we mean by the words “process” and “concurrency”.

1.1.1 – Processes

Processes are at the heart of concurrency. A process is the embodiment of an ongoing activity: an agent that is running a piece of program code, concurrent to other processes running their own code, at their own pace. They are a bit like people: individuals, who don’t share things. That’s not to say that people are not generous, but if you eat food, I don’t get full, and furthermore, if you eat bad food, I don’t get sick from it.
You have your own brain and internals that keep you thinking and living independently of what I do. This is how processes behave; they are separate from one another and are guaranteed not to disturb one another through their own internal state changes.

[Figure: processes running their own code (some running the same code, at different points)]

A process has its own working memory and its own mailbox for incoming messages. Whereas threads in many other programming languages and operating systems are concurrent activities that share the same memory space (and have countless opportunities to step on each other’s toes), Erlang’s processes can safely work under the assumption that nobody else will be poking around and changing their data from one microsecond to the next. We say that processes encapsulate state.

PROCESSES: AN EXAMPLE

Consider a web server: it receives requests for web pages, and for each request it needs to do some work that involves finding the data for the page and either transmitting it back to the place the request came from (sometimes split into many chunks, sent one at a time), or replying with an error message in case of failure. Clearly, each request has very little to do with any other, but if the server accepted only one at a time and did not start handling the next request until the previous one was finished, there would quickly be thousands of requests in the queue if the web site was a popular one. If the server instead could start handling requests as soon as they arrived, each in a separate process, there would be no queue and most requests would take about the same time from start to finish. The state encapsulated by each process would then be: the specific URL for the request, who to reply to, and how far the handling has progressed so far. When the request is finished, the process disappears, cleanly forgetting all about the request and recycling the memory.
If a bug should cause one request to crash, only that process will die, while all the others keep working happily.

[Figure: the web server processes example]

When Erlang was invented, its focus was on handling phone calls; these days, it’s mostly Internet traffic, but the principles are the same.

THE ADVANTAGES OF ERLANG-STYLE PROCESSES

Because processes cannot directly change each other's internal state, it is possible to make significant advances in error handling. No matter how bad the code that a process is running, it cannot corrupt the internal state of your other processes. Even at a fine-grained level within your program, you can have the same isolation that you see between the web browser and the word processor on your computer desktop. This turns out to be very, very powerful, as we will see later on when we talk about process supervision.

Since processes can share no internal data, they must communicate by copying. If one process wants to exchange information with another, it sends a message; that message is a read-only copy of the data that the sender has. These fundamental semantics of message passing make distribution a natural part of Erlang. In real life, you can’t share data over the wire; you can only copy it. Erlang process communication always works as if the receiver gets a personal copy of the message, even if the sender happens to be on the same computer. This means that network programming is no different from coding on a single machine! This transparent distribution allows Erlang programmers to look at the network as simply a collection of resources. We don’t much care about whether process X is running on a different machine than process Y, because they are going to communicate in the exact same way no matter where they are located.
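To make the isolation described in this section a little more concrete, here is a minimal sketch. It is our own illustration under stated assumptions rather than code from the book: the module name isolation_demo, the functions run/0 and worker/0, and the messages crash and stop are all hypothetical. It starts two independent processes, makes one of them die on purpose, and then checks that the other one still answers a message.

```erlang
-module(isolation_demo).
-export([run/0]).

%% Hypothetical demo: two workers share nothing, so one dying
%% does not disturb the other.
run() ->
    Good = spawn(fun worker/0),
    Bad = spawn(fun worker/0),
    %% Monitor the second worker so that we are told when it dies.
    Ref = erlang:monitor(process, Bad),
    Bad ! crash,
    receive
        {'DOWN', Ref, process, Bad, _Reason} -> ok
    end,
    %% The first worker is unaffected and still replies.
    Good ! {self(), [1, 2, 3]},
    Result = receive
                 {Good, Sum} -> Sum
             after 1000 -> timeout
             end,
    Good ! stop,
    Result.

%% Each worker keeps to itself and reacts only to the messages it receives.
worker() ->
    receive
        crash ->
            exit(deliberate_crash);
        stop ->
            ok;
        {From, List} ->
            From ! {self(), lists:sum(List)},
            worker()
    end.
```

Calling isolation_demo:run() should return 6: the surviving worker sums the list it was sent and replies, even though its sibling has already terminated. The only thing that ever crosses the boundary between the processes is a message.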
1.1.2 – Concurrency explained

So what do we really mean by “concurrent”? Is it just another word for “in parallel”? Well, almost but not exactly, at least when we’re talking about computers and programming. One popular semi-formal definition reads something like “those things that don’t have anything that forces them to happen in a specific order are said to be concurrent”. For example, given the task to sort two packs of cards, you could sort one first, and then the other, or if you had some extra arms and eyes you could sort both in parallel. There is nothing that requires you to do them in a certain order; hence, they are concurrent tasks. They can be done in either order, or you can jump back and forth between the tasks until they’re both done, or, if you have the extra appendages (or perhaps someone to help you), you can perform them simultaneously in true parallel fashion.

[Figure: concurrent vs. order-constrained tasks]

This may sound strange: shouldn’t we say that tasks are concurrent only if they are actually happening at the same time? Well, the point with that definition is that they could happen at the same time, and we are free to schedule them at our convenience. Tasks that must be done simultaneously are not really separate tasks at all. Some tasks, though, are separate but non-concurrent and must be done in order, such as breaking the egg before making the omelet.

One of the really nice things that Erlang does for you is that it helps you with the physical execution: if there are extra CPUs (or cores or hyperthreads) available, it will use them to run more of your concurrent processes in parallel; if not, it will use what CPU power there is to do them all a bit at a time.
You will not need to think about such details, and your Erlang programs automatically adapt to different hardware: they just run more efficiently if there are more CPUs, as long as you have things lined up that can be done concurrently.

[Figure: Erlang processes running on a single core and on a multicore machine]

But what if your tasks are not concurrent? If your program must first do X, then Y, and finally Z? Well, that is where you need to start thinking about the real dependencies in the problem you are out to solve. Perhaps X and Y can be done in any order as long as it is before Z. Or perhaps you can start working on a part of Z as soon as parts of X and Y are done. There is no simple recipe, but surprisingly often a little bit of thinking can get you a long way, and it gets easier with experience.

Rethinking the problem in order to eliminate unnecessary dependencies can make the code run more efficiently on modern hardware. However, that should usually be your second concern. The most important effect of separating parts of the program that don’t really need to be together will be that it makes your code less confused, more readable, and allows you to focus on the real problems rather than on the mess that follows from trying to do several things at once. This means higher productivity and fewer bugs.

1.1.3 – Programming with processes in Erlang

When you build an Erlang program you ask yourself, “which activities here are concurrent, that is, can happen independently of one another?” Once you sketch out an answer to that question, you can start building a system where every single instance of those activities you identified becomes a separate process.

In contrast to most other languages, concurrency in Erlang is very cheap. Spawning a process is about as much work as allocating an object in your average object-oriented language. This can take some getting used to in the beginning, because it is such a foreign concept! Once you do get used to it, however, magic starts to happen. Picture a complex operation that has six concurrent parts, all modeled as separate processes. The operation starts, processes are spawned, data is manipulated, a result is produced, and at that very moment the processes involved simply disappear magically into oblivion, taking with them their internal state, database handles, sockets, and any other stuff that needs to be cleaned up that you don’t want to have to do manually.

[Figure: processes being set up, running, and disappearing]

In the rest of this section we are going to take a brief look at the characteristics of processes. We will show how quick and easy it is to start them, how lightweight they are, and how simple it is to communicate between them. This will enable us to talk in more detail about what you can really do with them and how they are the basis of the fault tolerance and scalability that OTP provides.

1.1.4 – Creating a process: “spawning”

Erlang processes are not operating system “threads”. They are much more lightweight, implemented by the Erlang run-time system, and Erlang is easily capable of spawning hundreds of thousands of processes on a single system running on commodity hardware. Each of these processes is separate from all the other processes in the run-time system; it shares no memory with the others, and in no way can it be corrupted by another process dying or going berserk.

A typical thread in a modern operating system reserves some megabytes of address space for its stack (which means that a 32-bit machine can never have more than about a thousand simultaneous threads), and it still crashes if it happens to use more stack space than expected.
Erlang processes, on the other hand, start out with only a couple of hundred bytes of stack space each, and they grow or shrink automatically as required.

[Figure: lots of Erlang processes]

The syntax for starting processes is quite straightforward, as illustrated by the following example. We are going to spawn a process whose job is to execute the function call io:format("erlang!") and then finish, and we do it like this:

    spawn(io, format, ["erlang!"])

That’s all. (Although the spawn function has some other variants, this is the simplest.) This will start a separate process which will print the text “erlang!” on the console, and then quit.

In chapter 2 we will get into details about the Erlang language and its syntax, but for the time being we hope you will simply be able to get the gist of our examples without further explanation. One of the strengths of Erlang is that it is generally pretty easy to understand the code even if you’ve never seen the language before. Let’s see if you agree.

1.1.5 – How processes talk

Processes need to do more than spawn and run, however: they need to communicate. Erlang makes this communication quite simple. The basic operator for sending a message is !, pronounced “bang”, and it is used in the form “Destination ! Message”. This is message passing at its most primitive, like mailing a postcard. OTP takes process communication to another level, and we will be diving into all that is good with OTP and messaging later on, but for now, let’s marvel at the simplicity of communicating between two independent and concurrent processes, illustrated in the following snippet:

    run() ->
        Pid = spawn(fun ping/0),
        Pid ! self(),
        receive
            pong -> ok
        end.

    ping() ->
        receive
            From -> From ! pong
        end.

Take a minute and look at the code above.
You can probably understand it without any previous knowledge of Erlang. Points worth noting are: another variant of the spawn function, that here gets just a single reference to “the function named ping that takes zero arguments” (fun ping/0); and also the function self() that produces the identifier of the current process, which is then sent on to the new process so that it knows where to reply.

That’s it in a nutshell: process communication. Every call to spawn yields a fresh process identifier that uniquely identifies the new child process. This process identifier can then be used to send messages to the child. Each process has a “process mailbox” where incoming messages are stored as they arrive, regardless of what the process is currently busy doing, and are kept there until it decides to look for messages. The process may then search and [...]

... print memory usage information

Please try out a few of these right now, for example listing or changing the current directory, printing the history, and printing the system and memory information. Look briefly at the output from running i() and note that, much like in an operating system, there is a whole bunch of things going on in the background apart from the shell prompt that you see.

2.1.4 – Escaping ...

... If your operating system is Windows, just open a web browser and go to www.erlang.org/download.html, then download and run the latest version from the top of the “Windows binary” column. For other operating systems, and further details on installing Erlang, see Appendix A.

2.1 – The Erlang shell

An Erlang system is a more interactive environment than you may be used to. With most programming languages, ...
... programming effort, and will most likely introduce subtle bugs that may take years to weed out. Erlang programs, on the other hand, are not much affected by this kind of problem. As we explained in Section 1.1.1, the way Erlang avoids sharing of data and communicates by copying makes the code immediately suitable for splitting over several machines. The kind of intricate data-sharing dependencies between ...

... proper binaries, whose length in bits is divisible by eight. Erlang has an advanced and somewhat intricate syntax for constructing new binaries or bitstrings as well as for matching and extracting data ...

... about integers, a few pages back). This correspondence is reflected in the names of some of the standard library functions in Erlang, such as atom_to_list(A), which returns the list of characters of any atom A.

STRINGS AND THE SHELL

The Erlang shell tries to maintain the illusion that strings are different from plain lists, by checking if they contain only printable characters. If they do, it prints them ...

2.1.1 – Starting the shell

We assume that you have downloaded and installed Erlang/OTP as we said above. If you are using Linux, Mac OS X, or any other UNIX-based system, just open a console window and run the erl command. If you are using Windows, you should click on the Erlang icon that the installer created for you; this runs the program called werl, which opens a special console for Erlang that avoids ...
ENTERING QUOTED STRINGS

When you enter double- or single-quoted strings (without going into detail about what this means, for now), a particular gotcha worth bringing up right away is that if you forget a closing quote character and press return, the shell will expect more characters and will print the same prompt again, much like above when we forgot the period. If this happens, type a single closing ...

... propagating further, it insulates the processes it is linked to from each other, and can also be entrusted with reporting failures and even restarting the failed subsystems. We call such processes supervisors.

[Figure: supervisor, workers, and signals]

The point of letting an entire subsystem terminate completely and be restarted is that it brings us back to a state known to function properly. Think ...

... ways of leaving the shell (and stopping the entire Erlang system). You should be familiar with all of them, because they all have their uses in managing and debugging a system.

CALLING Q() OR INIT:STOP() ...

... designed for running continuously, and for interactive development, debugging, and upgrading. Optimally, the only reason for restarting Erlang is because of a hardware failure, operating system upgrade, ...

Date posted: 08/03/2014, 18:20
