FCA Final Notice 2014: Royal Bank of Scotland Plc, National Westminster Bank Plc and Ulster Bank Ltd–RBS

The full report is contained in a pdf on the linked page, but I have reproduced the IT-relevant sections below.


The root cause of the IT Incident

The batch scheduler failure.

Banks generally update that day’s transactions in the evening. They use a software tool known as a batch scheduler to process those updates. A batch scheduler coordinates the order in which data underlying the updates is processed. The data includes information about customer withdrawals and deposits, interbank clearing, money market transactions, payroll processing, and requests to change standing orders and addresses. The processes underlying the updates are called “jobs”. Batch schedulers place the jobs into queues and ensure that each job is processed in the correct sequence. That day’s batch processing is complete when all balances are final.
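
The sequencing described above can be illustrated with a minimal sketch. It is not the scheduler the Banks used, which the notice does not name. The job names and the dependency map below are hypothetical, chosen only to echo the kinds of data the notice lists. The sketch uses Python's standard `graphlib` module to derive an order in which every job runs after the jobs it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs (not from the notice): each key maps to the set of jobs
# that must finish before it may start.
JOB_DEPENDENCIES = {
    "post_deposits": set(),
    "post_withdrawals": set(),
    "interbank_clearing": {"post_deposits", "post_withdrawals"},
    "apply_interest": {"interbank_clearing"},
    "finalise_balances": {"apply_interest"},
}


def build_run_order(dependencies):
    """Return the jobs in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def run_batch(dependencies, run_job):
    """Run each job in dependency order and return the completed job names."""
    completed = []
    for job in build_run_order(dependencies):
        run_job(job)  # run_job is a caller-supplied function (hypothetical)
        completed.append(job)
    return completed


run_order = build_run_order(JOB_DEPENDENCIES)
```

In this sketch the batch is complete only once `finalise_balances` has run, which mirrors the notice's statement that a day's processing is complete when all balances are final. A job missing from the dependency map would simply never run, which is loosely analogous to jobs failing to appear in the queues.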

On Sunday 17 June 2012 a team from Technology Services upgraded the batch scheduler software that processed updates to customers’ accounts at NatWest and Ulster Bank because this software could no longer be sufficiently supported. They upgraded the batch scheduler software from Version 1 to Version 2A. Version 2A contained a modification known as a “patch”. (A separate batch scheduler processed updates to RBS’s accounts.)

On the evening of Monday 18 June 2012, the Technology Services team executed the first full batch run (set of updates) for the NatWest and Ulster Bank batch scheduler since the software update. During the evening the team noticed a number of anomalies. The mainframe computer was using a higher than normal percentage of its total processing capacity. This, in turn, caused the system to slow down and to experience several batch terminal failures. This meant that it did not properly update customers’ accounts. The team raised the failures with internal IT experts who were able to re-run the failed batches by entering commands into the system manually. This allowed the complete batch to run that night. The RBS batch scheduler was also affected because of interdependencies with the NatWest and Ulster Bank batch scheduler.

On Tuesday 19 June 2012, Technology Services backed out the software upgrade. The Technology Services team was not aware, however, that Version 2A, the upgraded version of the software, was not compatible with Version 1, the version that had been in place prior to the upgrade. It was not compatible because Version 2A contained the “A” patch modification. Technology Services had only tested the consequences of backing out Version 2 to Version 1. They had not tested the consequences of backing out Version 2A to Version 1. This was the underlying cause of the IT Incident.
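
The gap between what was tested and what was deployed can be expressed as a simple pre-flight check. This is a sketch only: the version labels come from the notice, but the set of tested paths as a data structure and the function names are hypothetical, and nothing here describes a check the Banks actually ran.

```python
# Back-out paths that have been tested, as (installed_version, target_version).
# The notice says only the Version 2 to Version 1 back-out had been tested.
TESTED_BACKOUT_PATHS = {("2", "1")}


def backout_is_tested(installed_version, target_version, tested_paths):
    """Return True only if this exact back-out path has been tested."""
    return (installed_version, target_version) in tested_paths


def require_tested_backout(installed_version, target_version, tested_paths):
    """Raise before a back-out is attempted along an untested path."""
    if not backout_is_tested(installed_version, target_version, tested_paths):
        raise RuntimeError(
            f"Back-out from {installed_version} to {target_version} "
            "has not been tested"
        )
```

Treating "2" and "2A" as distinct versions is the point of the sketch: a patch level changes the path being backed out, so a test of the unpatched version says nothing about the patched one.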

As a result of backing out the software upgrade, a significant number of jobs failed to appear in the batch queues and the unprocessed batch jobs began to multiply. The lack of compatibility between Version 2A and Version 1 of the software, which was unknown to Technology Services, and the subsequent release of incomplete batches in Ulster Bank’s and NatWest’s systems, were the actual cause of the IT Incident.

To resolve the problem, technical support staff focused on manually re-loading jobs into the batch queues. This process of manual intervention is implemented when a number of batch jobs fail to run. By the morning of 20 June 2012, the NatWest batch for 19 June was largely completed. However, the team had not completed processing Ulster Bank batches by then and that caused a significant backlog at the start of the following working day.

By 21 June 2012, batch processing for Ulster Bank was more than one day behind. This meant that the next day’s batch processing started before the current day’s batch processing was complete. The simultaneous processing of Ulster Bank’s batches interfered with each other because there were multiple days’ files in the processing system and multiple days’ jobs in the queues. This caused additional recovery problems and further backlogs.
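
One way to picture the overlap problem is a gate that refuses to start a day's batch until the previous day's batch is final. The sketch below is an assumption-laden illustration, not a description of the Banks' systems: the class and method names are hypothetical, and it treats every calendar day as a processing day, ignoring weekends and bank holidays.

```python
from datetime import date, timedelta


class BatchDayGate:
    """Tracks which daily batches are final and blocks out-of-order starts."""

    def __init__(self, first_batch_date):
        # The first day tracked has no predecessor to wait for.
        self.first_batch_date = first_batch_date
        self.completed = set()

    def mark_complete(self, batch_date):
        """Record that all balances for batch_date are final."""
        self.completed.add(batch_date)

    def can_start(self, batch_date):
        """Allow a batch only if the previous day's batch has completed."""
        if batch_date == self.first_batch_date:
            return True
        return (batch_date - timedelta(days=1)) in self.completed


gate = BatchDayGate(date(2012, 6, 19))
```

Whether holding back the next day's batch would have been the right operational choice is not something the notice addresses; the sketch only makes the ordering constraint explicit.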

The effects of the batch scheduler failure

The IT Incident affected all of the Banks. The effects of the IT Incident on RBS were not as severe as the effects on NatWest and Ulster Bank because a separate batch scheduler controlled the updates to RBS’s customers’ accounts. However, RBS was affected because some of the information it required to update its accounts was dependent upon receiving accurate and timely information from NatWest and Ulster Bank. That information included management information, finance and risk information as well as payments that customers from those banks were making to each other and to RBS customers as well.

By the beginning of 25 June 2012, Technology Services had managed to stabilise RBS’s and NatWest’s batch processes, although both banks’ records required some manual updating throughout the week. From 25 June 2012, the focus of effort was on the recovery of Ulster Bank’s batches. The Ulster Bank batch scheduler did not return to full functionality until 10 July 2012.

The IT Incident potentially affected 635 systems at the RBS Group. Of those systems, 75 were payment-related systems, which included the following functions (some systems had more than one function):

  1. The administration or updating of customer accounts (17 systems).
  2. The processing and execution of payments (68 systems).
  3. The application of interest and charges (11 systems).
  4. The reconciliation of accounting entries across the Banks (3 systems).

The effects of the IT Incident were wide-ranging and affected a number of the Banks’ systems and customers. The following are some examples of the ways the IT Incident affected the Banks’ systems and customers.

  1. ATMs were generally available, but they presented out of date balances because of missing or duplicate transactions. This meant that some customers were unable to withdraw cash. In addition, some customers ran the risk of overdrawing their accounts when they withdrew cash, particularly if their accounts were close to their limits and credits had not been applied. The IT Incident affected ATMs until:
    1. RBS: 27 June 2012 (system was not fully functional for 8 days);
    2. NatWest: 28 June 2012 (system was not fully functional for 9 days);
    3. Ulster Bank NI: 8 July 2012 (system was not fully functional for 19 days).
  2. Digital Banking is an internet based online banking service for personal and small business customers of RBS, including RBS International customers. The system remained technically available, but there were intermittent periods of outage for logins. Customers were affected if they were unable to log in and make online banking transactions, make payments and view correct balances and transaction histories. The IT Incident affected Digital Banking until:
    1. RBS: 25 June 2012 (system was not fully functional for 6 days);
    2. NatWest: 1 July 2012 (system was not fully functional for 12 days);
    3. Ulster Bank NI: 9 July 2012 (system was not fully functional for 20 days).
  3. Direct Banking/Telephony is the Banks’ telephone banking service. The system remained available but with intermittent periods of outage. Customers who tried to log in during the periods of outage were unable to make online banking transactions, make payments and view correct balances and transaction histories. The IT Incident affected Direct Banking/Telephony until:
    1. RBS: 25 June 2012 (system was not fully functional for 6 days);
    2. NatWest: 1 July 2012 (system was not fully functional for 12 days);
    3. Ulster Bank NI: 8 July 2012 (system was not fully functional for 19 days).
  4. Teller service was available at branches; however, the IT Incident meant that transactions from those branches were not updated in the central computer system and that caused the Banks’ overnight ledger balance to be inaccurate for affected customers. Those customers were unable to make or receive payments and could not be provided with their correct balances or transaction histories. The IT Incident affected teller services until:
    1. RBS: Unaffected;
    2. NatWest: 28 June 2012 (system was not fully functional for 9 days);
    3. Ulster Bank NI: 9 July 2012 (system was not fully functional for 20 days).
  5. Bankline Direct is a payments channel which provides customers with a method of making payments. Customers were not able to see up to date account information (balance and transactions). Customers could make payments, although this would be dependent on the account being up to date in some circumstances. The IT Incident affected Bankline Direct until:
    1. RBS: 27 June 2012 (system was not fully functional for 8 days);
    2. NatWest: 28 June 2012 (system was not fully functional for 9 days);
    3. Ulster Bank NI: Unaffected.
  6. Point of Sale is the system which provides a gateway between Visa and its users. It authorises debit card transactions, both domestic and international. The system remained available; however, authorisations were checked against incorrect balances. Customers may have lost the ability to pay for transactions, especially if credit was not applied to accounts leading to lack of available funds. The Banks partially mitigated the problem by arranging a £200 “stand-in” limit for debit cards which gave customers the ability to buy goods up to that limit. The IT Incident affected the Point of Sale systems until:
    1. RBS: 27 June 2012 (system was not fully functional for 8 days);
    2. NatWest: 28 June 2012 (system was not fully functional for 9 days);
    3. Ulster Bank NI: 8 July 2012 (system was not fully functional for 19 days).
  7. The Relationship Management Platform is an IT system RBS International used. The IT Incident affected corporate customers’ transactions and account records which in turn affected corporate customers’ ability to make payments to corporate accounts, draw invoices and make salary runs. The IT Incident affected the Relationship Management Platform until:
    1. RBS: 6 July 2012 (system was not fully functional for 17 days);
    2. NatWest: 6 July 2012 (system was not fully functional for 17 days);
    3. Ulster Bank NI: 6 July 2012 (system was not fully functional for 17 days).
  8. BankTrade GTS is a system the RBS Group used to process the bank’s international trades and UK bonds. Although the system was processing trades, the backlog delayed the processing of the current day’s trades. Commercial customers’ international transactions were potentially delayed exposing them to risk of non or late payment. The IT Incident affected the BankTrade GTS system until:
    1. RBS: 18 July 2012 (system was not fully functional for 29 days);
    2. NatWest: 18 July 2012 (system was not fully functional for 29 days);
    3. Ulster Bank NI: 18 July 2012 (system was not fully functional for 29 days).
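
The day counts in the list above are all consistent with counting from 19 June 2012, the day the upgrade was backed out. That start date is an inference from the figures, not something stated alongside them, so the sketch below labels it as an assumption. It checks the ATM figures as a worked example; the function name is hypothetical.

```python
from datetime import date

# Assumption: outage durations are counted from the day of the back-out.
INCIDENT_START = date(2012, 6, 19)

# ATM restoration dates as given in the list above.
ATM_RESTORED = {
    "RBS": date(2012, 6, 27),
    "NatWest": date(2012, 6, 28),
    "Ulster Bank NI": date(2012, 7, 8),
}


def days_not_fully_functional(restored_on, start=INCIDENT_START):
    """Return the number of days between the assumed start and restoration."""
    return (restored_on - start).days


atm_days = {
    bank: days_not_fully_functional(restored)
    for bank, restored in ATM_RESTORED.items()
}
```

Under that assumption the sketch gives 8, 9 and 19 days for RBS, NatWest and Ulster Bank NI respectively, matching the figures quoted for ATMs.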

Following the IT Incident, the Authority required the Banks to appoint a Skilled Person to independently assess the immediate causes, consequences and management of the IT Incident.

Link to Original Report