Human vs. Agent Experimental Economics

In 1993, after three decades of human-only experimental economics, a landmark paper combining traditional human experimental economics with software-agent market studies was published in the Journal of Political Economy by Gode and Sunder (G&S) [28]. G&S were interested in understanding how much of the efficiency of the CDA is due to the intelligence of traders, and how much is due to the organisation of the market. To test this, G&S introduced a very simple Zero Intelligence Constrained (ZIC) trading agent that generates random bid or ask prices drawn from a uniform distribution, subject to the constraint that the prices generated cannot be loss-making, i.e., sell prices are at or above the limit price and buy prices are at or below the limit price. G&S performed a series of ZIC-human experiments, with results demonstrating that the simple ZIC agents produced convergence towards the theoretical equilibrium and had human-like scores for allocative efficiency (Eq. (2)), suggesting that market convergence towards theoretical equilibrium is an emergent property of the CDA market mechanism and not of the intelligence of the traders.

Indeed, G&S found that the only way to differentiate the performance of humans and ZIC traders was by using their profit dispersion statistics (Eq. (3)). These results were striking and attracted considerable attention.
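As a minimal sketch of the ZIC constraint described above, the following Python function draws a quote uniformly at random from the non-loss-making side of the limit price. The function name `zic_quote` and the market-wide price bounds `p_min` and `p_max` are illustrative assumptions and are not taken from G&S's own implementation.

```python
import random


def zic_quote(limit_price, is_buyer, p_min=1.0, p_max=200.0, rng=random):
    """Draw a random, non-loss-making quote price for a ZIC-style trader.

    p_min and p_max are assumed market-wide price bounds (hypothetical values).
    Buyers never bid above their limit price; sellers never ask below it.
    """
    if not p_min <= limit_price <= p_max:
        raise ValueError("limit price must lie within [p_min, p_max]")
    if is_buyer:
        return rng.uniform(p_min, limit_price)
    return rng.uniform(limit_price, p_max)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demonstration repeatable
    print("bid:", zic_quote(150.0, is_buyer=True, rng=rng))
    print("ask:", zic_quote(150.0, is_buyer=False, rng=rng))
```

The only "intelligence" in this sketch is the budget constraint itself: the agent keeps no memory of earlier quotes or trades.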

In 1997, Dave Cliff [13] presented the first detailed mathematical analysis and replication of G&S’s results. Results demonstrated that the ability of ZIC traders to converge on equilibrium was dependent on the shape of the market’s demand and supply curves. In particular, ZIC traders were unable to equilibrate when acting in markets with demand and supply curves very different to those used by G&S. To address this issue, Cliff developed the Zero Intelligence Plus (ZIP) trading algorithm. Rather than issuing randomly generated bid and ask prices in the manner of ZIC, Cliff’s ZIP agents contain an internal profit margin from which bid and ask prices are calculated. When a buyer (seller) sees transactions happen at a price below (above) the trader’s current bid (ask) price, profit margin is raised, thus resulting in a lower (higher) bid (ask) price. Conversely, a buyer’s (seller’s) profit margin is lowered when order and transaction prices indicate that the buyer (seller) will need to raise (lower) bid (ask) price in order to transact [13, p.43].

46 J. Cartlidge and D. Cliff

The size of ZIP’s profit margin update is determined using a well-established machine learning mechanism (derived from the Widrow–Hoff Delta rule [56]).

Cliff’s autonomous and adaptive ZIP agents were shown to display human-like efficiency and equilibration behaviours in all markets, irrespective of the shape of demand and supply.
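The margin adjustment described above can be illustrated with a simplified Python sketch. It assumes a non-negative margin for both buyers and sellers and applies a single Widrow–Hoff (delta rule) step of the quote price towards a target price; the function names, the learning rate `beta`, and the omission of any momentum or randomised target are simplifying assumptions and do not reproduce Cliff's full ZIP algorithm.

```python
def quote_price(limit_price, margin, is_buyer):
    """Price implied by a limit price and a non-negative profit margin."""
    if is_buyer:
        return limit_price * (1.0 - margin)  # buyers shade bids below the limit
    return limit_price * (1.0 + margin)      # sellers mark asks up above the limit


def delta_rule(current, target, beta):
    """One Widrow-Hoff step: move `current` a fraction `beta` towards `target`."""
    return current + beta * (target - current)


def update_margin(limit_price, margin, target_price, is_buyer, beta=0.3):
    """Return the margin after nudging the quote price towards `target_price`.

    `beta` is a hypothetical learning rate chosen for illustration only.
    """
    new_price = delta_rule(quote_price(limit_price, margin, is_buyer),
                           target_price, beta)
    if is_buyer:
        new_margin = 1.0 - new_price / limit_price
    else:
        new_margin = new_price / limit_price - 1.0
    return max(0.0, new_margin)  # never quote at a loss


if __name__ == "__main__":
    # A buyer bidding 80 (limit 100, margin 0.2) sees a trade at 70:
    # the margin rises, so the next bid is lower.
    print(update_margin(100.0, 0.2, target_price=70.0, is_buyer=True, beta=0.5))
```

In this sketch a target price on the profitable side of the current quote raises the margin, and a target on the other side lowers it, matching the qualitative behaviour described in the text.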

Around the same time that ZIP was introduced, economists Steve Gjerstad and his former PhD supervisor John Dickhaut independently developed a trading algorithm that was later named GD after the inventors [27]. Using observed market activity (frequencies of bids, asks, accepted bids, and accepted asks) resulting in the most recent L transactions (where L = 5 in the original study), GD traders calculate a private, subjective “belief” of the probability that a counterparty will accept each quote price. The belief function is extended over all prices by applying cubic-spline interpolation between observed prices (although it has previously been suggested that using any smooth interpolation method is likely to suffice [19, p.17]). To trade, GD quotes a price to buy or sell that maximises expected surplus, calculated as the surplus obtained at a price multiplied by the belief function’s probability of a quote being accepted at that price. Simulated markets containing GD agents were shown to converge to the competitive equilibrium price and allocation in a fashion that closely resembled human equilibration in symmetric markets, but with greater efficiency than human traders achieved [27]. A modified GD (MGD) algorithm, where the belief function of bid (ask) prices below (above) the previous lowest (highest) transaction price was set to probability zero, was later introduced to counter unwanted price volatility.
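The quote-selection step of GD can be sketched as follows. This Python example assumes the belief function is already available as a callable and takes expected surplus to be the trader's profit relative to the limit price multiplied by the believed acceptance probability; the function name `best_quote`, the candidate price grid, and the toy linear belief functions are all illustrative assumptions, and the construction of the belief function from market history is not shown.

```python
def best_quote(limit_price, belief, candidate_prices, is_buyer):
    """Return the candidate price that maximises expected surplus.

    belief(p) is the trader's subjective probability that a quote at
    price p will be accepted.
    """
    def expected_surplus(price):
        surplus = limit_price - price if is_buyer else price - limit_price
        return surplus * belief(price)

    # Only consider prices that cannot make a loss.
    if is_buyer:
        feasible = [p for p in candidate_prices if p <= limit_price]
    else:
        feasible = [p for p in candidate_prices if p >= limit_price]
    return max(feasible, key=expected_surplus)


if __name__ == "__main__":
    # Toy belief: higher bids are more likely to be accepted (hypothetical).
    buyer_belief = lambda p: p / 100.0
    print(best_quote(100, buyer_belief, range(0, 101), is_buyer=True))
```

With the toy belief above, the buyer trades off a larger surplus at low bids against a higher acceptance probability at high bids, and settles in between.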

In 2001, a series of experiments were performed to compare ZIP and MGD in real-time heterogeneous markets [52]. MGD was shown to outperform ZIP. Also in 2001, the first ever human–agent experiments—with MGD and ZIP competing in the same market as human traders—were performed by Das et al., a team from IBM [15]. Results had two major conclusions: (a) firstly, mixed human–agent markets were off-equilibrium—somehow the mixture of humans and agents in the market reduces the ability of the CDA to equilibrate; (b) secondly, in all experiments reported, the efficiency scores of humans were lower than the efficiency scores of agents (both MGD and ZIP). In Das et al.’s own words, “. . . the successful demonstration of machine superiority in the CDA and other common auctions could have a much more direct and powerful impact—one that might be measured in billions of dollars annually” [15]. This result, demonstrating for the first time in human-algorithmic markets that agents can outperform humans, implied a future financial-market system where ATS replace humans at the point of execution.

Despite the growing industry in ATS in real financial markets, in academia there was a surprising lack of further human–agent market experiments over the following decade. In 2003 and 2006, Grossklags & Schmidt [30, 31] performed human–agent market experiments to study how human behaviour is altered by knowledge of whether or not agent traders are present in the market. In 2011, De Luca & Cliff successfully replicated Das et al.’s results, demonstrating that GDX (an extension of MGD, see [51]) outperforms ZIP in agent–agent and agent–human markets [17]. They further showed that Adaptive Aggressive (AA) agents (a trading agent developed by Vytelingum in 2006 that is loosely based on ZIP, with significant novel extensions including short-term and long-term adaptive components [54,55]) dominate GDX and ZIP, outperforming both in agent–agent and agent–human markets [18]. This work confirmed AA as the dominant trading-agent algorithm. (For a detailed review of how ZIP and AA have been modified over time, see [49, 50].) More recent human–agent experiments have focused on the emotional arousal level of humans, monitoring heart rate over time [57] and monitoring human emotions via EEG brain data [8].

Modelling Financial Markets Using Human–Agent Experiments 47

Complementary research comparing markets containing only humans against markets containing only agents—i.e., human-only or agent-only markets rather than markets in which agents and humans interact—can also shed light on market dynamics. For instance, Huber, Shubik, and Sunder (2010) compare dynamics of three market mechanisms (sell-all, buy-all, and double auction) in markets containing all humans against markets containing all agents. “The results suggest that abstracting away from all institutional details does not help understand dynamic aspects of market behaviour and that inclusion of mechanism differences into theory may enhance our understanding of important aspects of markets and money, and help link conventional analysis with dynamics” [33]. This research stream reinforces the necessity of including market design in our understanding of market dynamics.

However, it does not offer the rich interactions between humans and ATS that we observe in real markets, and that only human–agent interaction studies can offer.

4 Methodology

In this section, the experimental methodology and experimental trading platform (OpEx) are presented. Open Exchange (OpEx) is a real-time financial-market simulator specifically designed to enable economic trading experiments between humans and automated trading algorithms (robots). OpEx was designed and developed by Marco De Luca between 2009 and 2010 while he was a PhD student at the University of Bristol, and since Feb. 2012 has been freely available for open-source download from SourceForge, under the terms of the Creative Commons Public License (see footnote 5). Figure 2 shows the Lab-in-a-box hardware arranged ready for a human–agent trading experiment. For a detailed technical description of the OpEx platform, refer to [19, pp. 26–33].

Footnote 5: OpEx download available at: www.sourceforge.net/projects/open-exchange.

Fig. 2 The Lab-in-a-box hardware ready to run an Open Exchange (OpEx) human versus agent trading experiment. Six small netbook computers run human trader Sales GUIs, with three buyers (near-side) sitting opposite three sellers (far-side). Netbook clients are networked via Ethernet cable to a network switch for buyers and a network switch for sellers, which in turn are connected to a router. The central exchange and robots servers run on the dedicated hardware server (standing vertically, top-left), which is also networked to the router. Finally, an Administrator laptop (top table, centre) is used to configure and run experiments. Photograph: © J. Cartlidge, 2012

At the start of each experiment, six human participants were each seated at a terminal around a rectangular table, with three buyers on one side and three sellers opposite, and given a brief introduction and tutorial to the system (explaining the human trading GUI illustrated in Fig. 3), during which time they were able to make test trades among themselves while no robots were present in the market.

Participants were told that their aim during the experiment was to maximise profit by trading client orders that arrive over time. We refer to these assignments as permits, to signal that traders will simultaneously have multiple client orders to work, whereas in the traditional literature a new assignment would only be received once the previous assignment had been completed. For further details on the experimental method, refer to [9, pp. 9–11].

Trading Agents (Robots)

Agent-robots are independent software processes running on the multi-core hardware server that also hosts the central exchange server. Agents can act at any time, since there is no central controller coordinating when, or in which order, an agent can act, and the trading logic of agents does not explicitly include temporal information. To stop agents from issuing a rapid stream of quotes, a sleep timer is therefore introduced into the agent architecture. After each action, or decision not to act, an agent will sleep for ts milliseconds before waking and deciding upon the next action. We name this the sleep-wake cycle of agents. For instance, if ts = 100, the sleep-wake cycle is 0.1 s. To ensure agents do not miss important events during sleep, agents are also set to wake (i.e., sleep is interrupted) when a new assignment permit is received and/or when an agent is notified about a new trade execution. The parameter ts is used to configure the “speed” of agents for each experiment.
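The sleep-wake cycle can be illustrated with a small deterministic simulation. The Python sketch below is not OpEx code: the function name `wake_times`, the use of integer millisecond timestamps, and the `horizon_ms` cut-off are assumptions made for illustration. It computes when an agent would wake, given its sleep period and the times of interrupting events (new permits or trade notifications).

```python
def wake_times(ts_ms, event_times_ms, horizon_ms):
    """Return the times (ms) at which an agent wakes up to `horizon_ms`.

    The agent sleeps for ts_ms after each wake-up, but an event (new permit
    or trade notification) arriving during sleep wakes it immediately.
    """
    if ts_ms <= 0:
        raise ValueError("sleep period must be positive")
    events = sorted(event_times_ms)
    wakes = []
    now = 0
    i = 0
    while True:
        timeout = now + ts_ms
        # Discard events that occurred at or before the current time.
        while i < len(events) and events[i] <= now:
            i += 1
        # Wake on the earlier of the timer expiring or an interrupting event.
        next_wake = timeout
        if i < len(events) and events[i] < timeout:
            next_wake = events[i]
        if next_wake > horizon_ms:
            break
        wakes.append(next_wake)
        now = next_wake
    return wakes


if __name__ == "__main__":
    # A fast agent (ts = 100) and a slow agent (ts = 10,000), one trade at 250 ms.
    print(wake_times(100, [250], horizon_ms=400))
    print(wake_times(10_000, [250], horizon_ms=400))
```

In this sketch the slow agent acts only when interrupted by the trade notification, whereas the fast agent also acts on every timer expiry.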


Fig. 3 Trading GUI for a human buyer. New order assignments (or permits) arrive over time in the Client Orders panel (top-left) and are listed in descending order by potential profit. Assignments are selected by double-clicking. This opens a New Order dialogue pop-up (top-centre) where bid price and quantity are set before entering the new bid into the market by pressing the BUY button. The market Order Book is displayed top-right, with all bids and asks displayed. Bid orders that the trader currently has live in the market are listed in the Orders panel (middle), and can be amended from here by double-clicking. When an order executes it is removed from the Orders panel and listed in the Trades history panel (bottom). For further GUI screen shots, refer to [9, Appendix C]

Trading agents are configured to use the Adaptive Aggressive (AA) strategy logic [54,55], previously shown to be the dominant trading agent in the literature (see Sect. 3.3). AA agents have short-term and long-term adaptive components. In the short term, agents use learning parameters β1 and λ to adapt their order aggressiveness. Over a longer time frame, agents use the moving average of the previous N market transactions and a learning parameter β2 to estimate the market equilibrium price, p̂0. The aggressiveness of AA represents the tendency to accept lower profit for a greater chance of transacting. To achieve this, an agent with high (low) aggression will submit orders better (worse) than the estimated equilibrium price p̂0. For example, a buyer (seller) with high aggression and estimated equilibrium value p̂0 = 100 will submit bids (asks) with price p > 100 (price p < 100). Aggressiveness of buyers (sellers) increases when transaction prices are higher (lower) than p̂0, and decreases when transaction prices are lower (higher) than p̂0. The Widrow–Hoff mechanism [56] is used by AA to update aggressiveness in a similar way that it is used by ZIP to update profit margin (see Sect. 3.3). For all experiments reported here, we set parameter values β1 = 0.5, λ = 0.05, N = 30, and β2 = 0.5. The convergence rate of bids/asks to transaction price is set to η = 3.0.
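Two ingredients of the description above lend themselves to a short illustration: the equilibrium estimate taken over the last N transactions, and the direction in which aggressiveness moves after a trade. The Python sketch below is a deliberate simplification and is not Vytelingum's AA: it uses a plain (unweighted) moving average, does not model β1, β2, λ or η, and the names `EquilibriumEstimator` and `aggressiveness_direction` are hypothetical.

```python
from collections import deque


class EquilibriumEstimator:
    """Estimate the equilibrium price as the mean of the last n trade prices."""

    def __init__(self, n=30):
        self.prices = deque(maxlen=n)  # oldest prices are dropped automatically

    def update(self, transaction_price):
        self.prices.append(transaction_price)
        return sum(self.prices) / len(self.prices)


def aggressiveness_direction(is_buyer, transaction_price, p_hat):
    """Return +1 to increase aggressiveness, -1 to decrease it, 0 for no change.

    Buyers become more aggressive when trades occur above the estimate;
    sellers become more aggressive when trades occur below it.
    """
    if transaction_price == p_hat:
        return 0
    above = transaction_price > p_hat
    if is_buyer:
        return 1 if above else -1
    return -1 if above else 1


if __name__ == "__main__":
    estimator = EquilibriumEstimator(n=3)
    for price in (100, 200, 300, 400):
        p_hat = estimator.update(price)
    print(p_hat)  # mean of the three most recent prices
    print(aggressiveness_direction(True, 110, 100))
```

The size of each aggressiveness change, which the text attributes to a Widrow–Hoff update, is intentionally left out of this sketch.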


Table 1 Permit schedule for market efficiency experiments

Permit type   1         2(a)      3         4          5          6
Buyer 1       350 (0)   250 (4)   220 (7)   190 (9)    150 (14)   140 (16)
Buyer 2       340 (1)   270 (3)   210 (8)   180 (10)   170 (12)   130 (17)
Buyer 3       330 (2)   260 (4)   230 (6)   170 (11)   160 (13)   150 (15)
Seller 1       50 (0)   150 (4)   180 (7)   210 (9)    250 (14)   260 (16)
Seller 2       60 (1)   130 (3)   190 (8)   220 (10)   230 (12)   270 (17)
Seller 3       70 (2)   140 (4)   170 (6)   230 (11)   240 (13)   250 (15)

Six permit types are issued to each market participant, depending on their role. For each role (e.g., Buyer 1), there are two traders: one human (Human Buyer 1) and one robot (Robot Buyer 1). Thus, there are 12 traders in the market. Permit values show limit price (the maximum value at which to buy, or minimum value at which to sell) and the time-step at which they are issued (in parentheses). The length of each time-step is 10 s, making one full permit cycle duration 170 s. During a 20-min experiment there are seven full cycles.

(a) Type 2 permits were accidentally issued to Buyer 1/Seller 1 at time-step 4 rather than time-step 5

Exploring the Effects of Agent Speed on Market Efficiency: April–June 2011

All experiments were run at the University of Bristol between April and July 2011 using postgraduate students in non-financial but analytical subjects (i.e., students with skills suitable for a professional career in finance, but with no specific trading knowledge or experience). Participants were paid £20 for participating, with a further £40 bonus for making the most profit and a £20 bonus for making the second highest profit. Moving away from the artificial constraint of regular simultaneous replenishments of currency and stock historically used, assignment permits were issued at regular intervals. AA agents used one of two sleep-wake cycles: ts = 100 or ts = 10,000. We label these agents AA-0.1, to signify a sleep-wake cycle of 0.1 s, and AA-10, to signify a sleep-wake cycle of 10 s, respectively. A total of 7 experiments were performed, using the assignment permit schedules presented in Table 1. The supply and demand curves generated by these permits are shown in Fig. 4. We can see that for all experiments, P0 = 200 and Q0 = 126. Since each human only participates in one experiment, and since trading agents are reset at the beginning of each run, traders have no opportunity to learn the fixed value of P0 over repeated runs. For further details of experimental procedure, see [11].
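The equilibrium values quoted above can be checked directly from the limit prices in Table 1. The Python sketch below sorts buyer limits in descending order and seller limits in ascending order and counts the intra-marginal units; the function name `equilibrium` and the convention of reporting the midpoint of the market-clearing price range are assumptions made for illustration.

```python
def equilibrium(buy_limits, sell_limits):
    """Return (quantity, price) where stepped demand and supply curves cross.

    The price is reported as the midpoint of the market-clearing range.
    """
    buys = sorted(buy_limits, reverse=True)   # demand curve, highest first
    sells = sorted(sell_limits)               # supply curve, lowest first
    q = 0
    while q < min(len(buys), len(sells)) and buys[q] >= sells[q]:
        q += 1
    if q == 0:
        return 0, None
    low, high = sells[q - 1], buys[q - 1]
    if q < len(buys):
        low = max(low, buys[q])      # first excluded buyer bounds price from below
    if q < len(sells):
        high = min(high, sells[q])   # first excluded seller bounds price from above
    return q, (low + high) / 2


# Limit prices from Table 1, one copy per role.
BUYERS = [350, 250, 220, 190, 150, 140,
          340, 270, 210, 180, 170, 130,
          330, 260, 230, 170, 160, 150]
SELLERS = [50, 150, 180, 210, 250, 260,
           60, 130, 190, 220, 230, 270,
           70, 140, 170, 230, 240, 250]

if __name__ == "__main__":
    # Two traders (one human, one robot) per role, seven permit cycles per run.
    q_cycle, p0 = equilibrium(BUYERS * 2, SELLERS * 2)
    print(p0, q_cycle, q_cycle * 7)
```

Doubling the permit lists accounts for the human and robot sharing each role, and multiplying the per-cycle quantity by the seven cycles gives the figure for a full experiment.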

[Figure 4. Axes: Price (0–400) and Quantity (0–18). Curves: Supply and Demand, with P0 and Q0 marked.]

Fig. 4 Stepped supply and demand curves for permit schedule defined in Table 1. Curves show the aggregate quantity that participants are prepared to buy (demand) and sell (supply) at every price point. The point at which the two curves intersect is the theoretical equilibrium point for the market: P0 = 200 is the equilibrium price, and Q0 is the equilibrium quantity. As there are two traders in each role (one human and one robot), each permit cycle Q0 = 2 × 9 = 18, and over the seven permit cycles of one full experiment, Q0 = 18 × 7 = 126. The market is symmetric about P0

Exploring the Robot Phase Transition (RPT): March 2012

Twenty-four experiments were run on 21st March 2012, at Park House Business Centre, Park Street, Bristol, UK. Participants were selected on a first-come basis from the group of students that responded to adverts broadcast to two groups: (1) students enrolled in final year undergraduate and postgraduate module in computer science that includes coverage of the design of automated trading agents; (2) members of the Bristol Investment Society, a body of students interested in pursuing a career in finance. We assume that these students have the knowledge and skills to embark on a career as a trader in a financial institution. Volunteers were paid £25 for participating, and the two participants making the greatest profit received an iPad valued at £400. To reduce the total number of participants required, each group were used in a session of six separate experiments. Therefore, 24 experiments were run using the 24 participants. Between experiments, human participants rotated seats, so each played every role exactly once during the session of 6 experiments. Human roles were purposely mixed between experiment rounds to reduce the opportunity for collusion and counteract any bias in market role. Once again, agents used the AA algorithm with varying sleep-wake cycle, and assignment orders were released into the market at regular intervals.

Table 2 presents the assignment permit schedules used for each experiment, and the full supply and demand curves generated by these permits are plotted in Fig. 5.

At each price point, i.e., at each step in the permit schedule, two assignment permits are sent simultaneously to a human trader and to a robot trader, once every replenishment cycle. For all experiments, permits are allocated in pairs symmetric about P0 such that the equilibrium is not altered, and the inter-arrival time of permits is 4 s. Cycles last 72 s and are repeated eight times during a 10 min experiment.

Therefore, over a full experiment there are 2 × 8 = 16 permits issued at each price point. The expected equilibrium number of trades for the market, Q0, is 144 intra-marginal units. In each experiment, P0 is varied in the range 209–272 to stop humans from learning the equilibration properties of the market between experiments. Agents are reset each time and have no access to data from previous experiments. In cyclical markets, permits are allocated in strict sequence that is unaltered between cycles. In random markets, the permit sequence across the entire run is randomised. For further details on experimental procedure, see [9,10].
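The timing and quantity arithmetic for the cyclical RPT schedule can be reproduced in a few lines of Python. The sketch below uses the buyer limit-price offsets from Table 2 and assumes that the permit at time-step k of the first cycle is issued k × 4 s after the start; the constant and function names are illustrative.

```python
STEP_SECONDS = 4        # inter-arrival time between permits
STEPS_PER_CYCLE = 18    # time-steps 1..18 in Table 2
CYCLES = 8              # cycles per experiment
TRADERS_PER_ROLE = 2    # one human and one robot

# Buyer limit-price offsets (limit price minus P0) from Table 2.
BUYER_OFFSETS = [77, 27, 12, -9, -14, -29,
                 73, 35, 8, -5, -22, -25,
                 69, 31, 16, -1, -18, -33]


def issue_time(cycle, step):
    """Seconds from the start at which the permit at `step` of `cycle` is issued.

    Both arguments are 1-based.
    """
    return ((cycle - 1) * STEPS_PER_CYCLE + step) * STEP_SECONDS


cycle_length = STEPS_PER_CYCLE * STEP_SECONDS           # seconds per cycle
permits_per_price_point = TRADERS_PER_ROLE * CYCLES     # permits per price point
# Buyer permits priced above P0 are intra-marginal; sellers mirror them.
intra_marginal_points = sum(1 for offset in BUYER_OFFSETS if offset > 0)
q0 = intra_marginal_points * permits_per_price_point    # equilibrium trades

if __name__ == "__main__":
    print(cycle_length, permits_per_price_point, q0)
    print(issue_time(CYCLES, STEPS_PER_CYCLE))  # time of the final permit
```

Under these assumptions the final permit falls shortly before the end of the 10 min (600 s) run, which leaves a short window for the last permits to be traded.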


Table 2 Permit schedule for RPT experiments

Permit type   1          2          3          4          5           6
Buyer 1        77 (1)     27 (4)     12 (7)    −9 (10)    −14 (13)    −29 (16)
Buyer 2        73 (2)     35 (5)      8 (8)    −5 (11)    −22 (14)    −25 (17)
Buyer 3        69 (3)     31 (6)     16 (9)    −1 (12)    −18 (15)    −33 (18)
Seller 1      −77 (1)    −27 (4)    −12 (7)     9 (10)     14 (13)     29 (16)
Seller 2      −73 (2)    −35 (5)     −8 (8)     5 (11)     22 (14)     25 (17)
Seller 3      −69 (3)    −31 (6)    −16 (9)     1 (12)     18 (15)     33 (18)

Six permit types are issued to each market participant, depending on their role. For each role, there is one human and one robot participant. Permit values show limit price − P0. Thus, e.g., if P0 = 100, a permit of type 4 to Buyer 1 would have a limit price of 91. For buyers, limit prices are the maximum value to bid; and for sellers, limit prices are the minimum value to ask. Numbers in brackets show the time-step sequence in which permits are allocated. Thus, after 11 time-steps, Buyer 2 and Seller 2 each receive a permit of type 4. For all experiments, the inter-arrival time-step between permits is 4 s. Permits are always allocated in pairs, symmetric about P0. In cyclical markets, the sequence is repeated eight times: the last permits are issued to Buyer 3 and Seller 3 at time 576 s, and the experiment ends 24 s later. In non-cyclical or “random” markets, the time-step of permits is randomised across the run. Participants receive the same set of permits in both cyclical and random markets, but in a different order

Fig. 5 Stepped supply and demand curves for an entire run of the RPT experiments, defined by the permit schedules shown in Table 2. Curves show the aggregate quantity that participants are prepared to buy (demand) and sell (supply) at every price point. The point at which the two curves intersect is the theoretical equilibrium point for the market: Q0 = 144 is the equilibrium quantity, and P0 is the equilibrium price. In each experiment, the value of P0 is varied in the range 209–272 to avoid humans learning a fixed value of P0 over repeated trials. The market is symmetric about P0.
