This is a post mortem by the SEC of the system errors at Knight Capital that ultimately caused a $440m loss. The actual report is in PDF format but I have reproduced the technical sections below.
August 1, 2012 and Related Events
Preparation for NYSE Retail Liquidity Program
To enable its customers’ participation in the Retail Liquidity Program (“RLP”) at the New York Stock Exchange,5 which was scheduled to commence on August 1, 2012, Knight made a number of changes to its systems and software code related to its order handling processes. These changes included developing and deploying new software code in SMARS. SMARS is an automated, high speed, algorithmic router that sends orders into the market for execution. A core function of SMARS is to receive orders passed from other components of Knight’s trading platform (“parent” orders) and then, as needed based on the available liquidity, send one or more representative (or “child”) orders to external venues for execution.
Upon deployment, the new RLP code in SMARS was intended to replace unused code in the relevant portion of the order router. This unused code previously had been used for functionality called “Power Peg,” which Knight had discontinued using many years earlier. Despite the lack of use, the Power Peg functionality remained present and callable at the time of the RLP deployment. The new RLP code also repurposed a flag that was formerly used to activate the Power Peg code. Knight intended to delete the Power Peg code so that when this flag was set to “yes,” the new RLP functionality—rather than Power Peg—would be engaged.
When Knight used the Power Peg code previously, as child orders were executed, a cumulative quantity function counted the number of shares of the parent order that had been executed. This feature instructed the code to stop routing child orders after the parent order had been filled completely. In 2003, Knight ceased using the Power Peg functionality. In 2005, Knight moved the tracking of cumulative shares function in the Power Peg code to an earlier point in the SMARS code sequence. Knight did not retest the Power Peg code after moving the cumulative quantity function to determine whether Power Peg would still function correctly if called.
Beginning on July 27, 2012, Knight deployed the new RLP code in SMARS in stages by placing it on a limited number of servers in SMARS on successive days. During the deployment of the new code, however, one of Knight’s technicians did not copy the new code to one of the eight SMARS computer servers. Knight did not have a second technician review this deployment and no one at Knight realized that the Power Peg code had not been removed from the eighth server, nor the new RLP code added. Knight had no written procedures that required such a review.
Events of August 1, 2012
On August 1, Knight received orders from broker-dealers whose customers were eligible to participate in the RLP. The seven servers that received the new code processed these orders correctly. However, orders sent with the repurposed flag to the eighth server triggered the defective Power Peg code still present on that server. As a result, this server began sending child orders to certain trading centers for execution. Because the cumulative quantity function had been moved, this server continuously sent child orders, in rapid sequence, for each incoming parent order without regard to the number of share executions Knight had already received from trading centers. Although one part of Knight’s order handling system recognized that the parent orders had been filled, this information was not communicated to SMARS.
The consequences of the failures were substantial. For the 212 incoming parent orders that were processed by the defective Power Peg code, SMARS sent millions of child orders, resulting in 4 million executions in 154 stocks for more than 397 million shares in approximately 45 minutes. Knight inadvertently assumed an approximately $3.5 billion net long position in 80 stocks and an approximately $3.15 billion net short position in 74 stocks. Ultimately, Knight realized a $460 million loss on these positions.
The millions of erroneous executions influenced share prices during the 45 minute period. For example, for 75 of the stocks, Knight’s executions comprised more than 20 percent of the trading volume and contributed to price moves of greater than five percent. As to 37 of those stocks, the price moved by greater than ten percent, and Knight’s executions constituted more than 50 percent of the trading volume. These share price movements affected other market participants, with some participants receiving less favorable prices than they would have in the absence of these executions and others receiving more favorable prices.
BNET Reject E-mail Messages
On August 1, Knight also received orders eligible for the RLP but that were designated for pre-market trading.6 SMARS processed these orders and, beginning at approximately 8:01 a.m. ET, an internal system at Knight generated automated e-mail messages (called “BNET rejects”) that referenced SMARS and identified an error described as “Power Peg disabled.” Knight’s system sent 97 of these e-mail messages to a group of Knight personnel before the 9:30 a.m. market open. Knight did not design these types of messages to be system alerts, and Knight personnel generally did not review them when they were received. However, these messages were sent in real time, were caused by the code deployment failure, and provided Knight with a potential opportunity to identify and fix the coding issue prior to the market open. These notifications were not acted upon before the market opened and were not used to diagnose the problem after the open.
Controls and Supervisory Procedures
Knight had a number of controls in place prior to the point that orders reached SMARS. In particular, Knight’s customer interface, internal order management system, and system for internally executing customer orders all contained controls concerning the prevention of the entry of erroneous orders.
However, Knight did not have adequate controls in SMARS to prevent the entry of erroneous orders. For example, Knight did not have sufficient controls to monitor the output from SMARS, such as a control to compare orders leaving SMARS with those that entered it. Knight also did not have procedures in place to halt SMARS’s operations in response to its own aberrant activity. Knight had a control that capped the limit price on a parent order, and therefore related child orders, at 9.5 percent below the National Best Bid (for sell orders) or above the National Best Offer (for buy orders) for the stock at the time that SMARS had received the parent order. However, this control would not prevent the entry of erroneous orders in circumstances in which the National Best Bid or Offer moved by less than 9.5 percent. Further, it did not apply to orders—such as the 212 orders described above—that Knight received before the market open and intended to send to participate in the opening auction at the primary listing exchange for the stock.
Although Knight had position limits for some of its trading groups, these limits did not account for the firm’s exposure from outstanding orders. Knight also did not have pre-set capital thresholds in the aggregate for the firm that were linked to automated controls that would prevent the entry of orders if the thresholds were exceeded.
For example, Knight had an account—designated the 33 Account—that temporarily held multiple types of positions, including positions resulting from executions that Knight received back from the markets that its systems could not match to the unfilled quantity of a parent order. Knight assigned a $2 million gross position limit to the 33 Account, but it did not link this account to any automated controls concerning Knight’s overall financial exposure.
On the morning of August 1, the 33 Account began accumulating an unusually large position resulting from the millions of executions of the child orders that SMARS was sending to the market. Because Knight did not link the 33 Account to pre-set, firm-wide capital thresholds that would prevent the entry of orders, on an automated basis, that exceeded those thresholds, SMARS continued to send millions of child orders to the market despite the fact that the parent orders already had been completely filled. Moreover, because the 33 Account held positions from multiple sources, Knight personnel could not quickly determine the nature or source of the positions accumulating in the 33 Account on the morning of August 1.
Knight’s primary risk monitoring tool, known as “PMON,” is a post-execution position monitoring system. At the opening of the market, senior Knight personnel observed a large volume of positions accruing in the 33 Account. However, Knight did not link this tool to its entry of orders so that the entry of orders in the market would automatically stop when Knight exceeded pre-set capital thresholds or its gross position limits. PMON relied entirely on human monitoring and did not generate automated alerts regarding the firm’s financial exposure. PMON also did not display the limits for the accounts or trading groups; the person viewing PMON had to know the applicable limits to recognize that a limit had been exceeded. PMON experienced delays during high volume events, such as the one experienced on August 1, resulting in reports that were inaccurate.
Code Development and Deployment
Knight did not have written code development and deployment procedures for SMARS (although other groups at Knight had written procedures), and Knight did not require a second technician to review code deployment in SMARS. Knight also did not have a written protocol concerning the accessing of unused code on its production servers, such as a protocol requiring the testing of any such code after it had been accessed to ensure that the code still functioned properly.
On August 1, Knight did not have supervisory procedures concerning incident response. More specifically, Knight did not have supervisory procedures to guide its relevant personnel when significant issues developed. On August 1, Knight relied primarily on its technology team to attempt to identify and address the SMARS problem in a live trading environment. Knight’s system continued to send millions of child orders while its personnel attempted to identify the source of the problem. In one of its attempts to address the problem, Knight uninstalled the new RLP code from the seven servers where it had been deployed correctly. This action worsened the problem, causing additional incoming parent orders to activate the Power Peg code that was present on those servers, similar to what had already occurred on the eighth server.