The Science Behind False Positive Tuning

Introduction

How do you know whether the thresholds in your OFAC (Office of Foreign Assets Control) Sanctions Filtering or BSA (Bank Secrecy Act) Transaction Monitoring system are set correctly? It is an important question, and one that remains something of a mystery in the world of anti-money laundering (AML). In the financial industry, and for the purposes of this article, tuning aimed at false positive reduction focuses on OFAC Sanctions Filtering and/or BSA Transaction Monitoring. Tuning may be undertaken voluntarily by an organization that wants to improve the quality of its alerts, or it may be mandated by regulators. Regardless of the reason, it is very important that tuning be done periodically and correctly. Done well, tuning reduces workload and allows more time to be spent on the alerts that are more meaningful, thereby improving quality. Done incorrectly, it creates greater risk because alerts that should have been generated are missed. The goal of this article is to explain the science behind what is called false positive tuning: the terminology used, the methodologies and iterations involved, and how this affects you.

The Definitions

Much of the industry is familiar with the term False Positives, but how familiar are you with the related terms? Let us first explain the terminology you should know: Positives, Negatives, False Positives (Type I Errors) and False Negatives (Type II Errors) (Wikipedia, 2015).

  • Positives are suspicious activity that generates an alert and requires a SAR (Suspicious Activity Report) to be completed and submitted to the government.
  • False Positives (Type I Errors) are non-suspicious activity that generates an alert.
  • False Negatives (Type II Errors) are suspicious activity that does not generate an alert, even though a SAR should have been completed and submitted to the government.
  • Negatives are non-suspicious activity that does not generate an alert.
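The four definitions above can be read as the cells of a two-by-two table: was an alert generated, and was the activity actually suspicious? The short Python sketch below illustrates that mapping. The function name `classify` and the sample data are hypothetical and only serve the illustration; they do not come from any particular AML product.

```python
from collections import Counter

def classify(alerted: bool, suspicious: bool) -> str:
    """Map one reviewed activity to one of the four outcome categories."""
    if alerted and suspicious:
        return "positive"          # alert generated, activity was suspicious
    if alerted and not suspicious:
        return "false_positive"    # Type I error: alert on non-suspicious activity
    if not alerted and suspicious:
        return "false_negative"    # Type II error: suspicious activity with no alert
    return "negative"              # no alert, activity was not suspicious

# Hypothetical reviewed sample: (alert generated?, confirmed suspicious?)
reviewed = [
    (True, True),
    (True, False),
    (True, False),
    (False, True),
    (False, False),
    (False, False),
]

counts = Counter(classify(alerted, suspicious) for alerted, suspicious in reviewed)
print(dict(counts))
```

Note that counting False Negatives requires some independent way of knowing that activity was suspicious even though no alert fired, for example a below-the-line sample review.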

The Risks

Next, let’s talk about what this means and how it applies to you. A process that produces only Positives and Negatives would be perfect, but in practice such a process rarely if ever exists; False Positives and False Negatives are always present to some degree. False Positives create risk because the additional work they generate pulls investigators' focus away from Positive alerts. False Negatives, however, are by far the greatest risk for any AML department and for the financial institution as a whole, because they represent suspicious activity that should have generated an alert and did not. Even an effective compliance operation has some level of False Negatives. The goal of tuning is to find the right balance between False Positives and False Negatives. AML software tools allow for various levels of tuning, and expert consulting companies bring various levels of expertise.

The Methodologies

One of the methodologies that we use is a statistical review. Our process is to review Positives, Negatives, False Positives (Type I Errors) and False Negatives (Type II Errors), and to develop a threshold that can be evaluated and justified. From there we use the standard deviation to identify a number of statistical clues, including the more obvious ones such as outliers, and a percentile formula to determine the 85th percentile (Bland & Altman, 1996). This methodology should be applied to each rule/scenario separately, so that parameters and thresholds are isolated relative to the data or population being analyzed. The aggregate of the alerts generated can also affect whether an individual result is judged a false positive or a false negative. The statistical analysis alone is not enough, however: you must also account for the AML risk profile/assessment of the organization to determine whether the result is right for your institution. In the end, it is critical that the statistics, AML risk profile, technical experience and compliance experience all weigh in on determining what the new proposed thresholds should be for iteration testing.
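As a rough illustration of the statistics mentioned above, the following Python sketch computes the mean, the sample standard deviation, the 85th percentile and a simple list of outliers for one rule's population, using only the standard library. The function name `threshold_stats`, the cutoff of two standard deviations and the sample amounts are assumptions made for this example; the article does not specify how outliers are defined or which percentile interpolation is used.

```python
import statistics

def threshold_stats(amounts, z_cutoff=2.0):
    """Summarize one rule's population of transaction amounts.

    z_cutoff is an assumed parameter: values further than this many
    standard deviations from the mean are reported as outliers.
    """
    mean = statistics.mean(amounts)
    stdev = statistics.stdev(amounts)  # sample standard deviation
    # quantiles(n=100) returns 99 cut points; index 84 is the 85th percentile
    p85 = statistics.quantiles(amounts, n=100, method="inclusive")[84]
    outliers = [a for a in amounts if abs(a - mean) > z_cutoff * stdev]
    return {"mean": mean, "stdev": stdev, "p85": p85, "outliers": outliers}

# Hypothetical amounts for a single rule/scenario
amounts = [1200, 1500, 1800, 2100, 2400, 2700, 3000, 3300, 3600, 25000]

stats = threshold_stats(amounts)
for name, value in stats.items():
    print(name, value)
```

In this sample a single very large amount pulls the mean and standard deviation upward while the 85th percentile stays close to the bulk of the data, which is one reason a percentile can be a more stable starting point for a proposed threshold than the mean. The figures still have to be weighed against the institution's risk profile before any threshold is proposed.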

Another methodology is iteration testing, or trial runs of the system. Typically, the first step is to determine how many iterations you expect to run. That depends on how many variables are involved in the process: how many threshold deviations, how many rule sets, how much data, and so on. As stated above, we use the results of the statistical review to set the thresholds and parameters for each iteration test. Generally, we see three to five iteration tests per rule set. This is best done in a clean test environment with all data files available and a reset or rollback process in place. In addition, each step should be planned, executed and documented, including a justification for each test case and a record of which steps and tests have been conducted.
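The idea of an iteration test can be sketched in a few lines of Python: replay the same labelled data against each candidate threshold and record the four outcome counts for every run. This is a simplified illustration only. A real rule usually has several parameters rather than a single amount threshold, and the names `IterationResult` and `run_iteration`, the sample transactions and the candidate thresholds are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class IterationResult:
    """Documented outcome of one iteration test."""
    threshold: float
    positives: int
    false_positives: int
    false_negatives: int
    negatives: int

def run_iteration(transactions, threshold):
    """Replay labelled transactions against one candidate threshold."""
    positives = false_positives = false_negatives = negatives = 0
    for amount, suspicious in transactions:
        alerted = amount >= threshold  # simplified single-parameter rule
        if alerted and suspicious:
            positives += 1
        elif alerted:
            false_positives += 1
        elif suspicious:
            false_negatives += 1
        else:
            negatives += 1
    return IterationResult(threshold, positives, false_positives,
                           false_negatives, negatives)

# Hypothetical labelled test data: (amount, confirmed suspicious?)
transactions = [
    (500, False), (900, False), (1500, False), (2500, True),
    (4000, False), (6000, True), (9000, True), (12000, False),
]

# Candidate thresholds would normally come from the statistical review
candidate_thresholds = [1000, 3000, 5000]

results = [run_iteration(transactions, t) for t in candidate_thresholds]
for result in results:
    print(result)
```

Because the input list is never modified, every iteration runs against identical data, which mirrors the reset or rollback requirement described above. Keeping each `IterationResult` also gives a record that can support the documentation and justification of each test case.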

These are just two of the methodologies used for false positive tuning. We often combine them to make the exercise more thorough and complete.

Risk vs. Quality

Managing alerts, and specifically their risk and quality, is the key to a successful AML Governance Program, since most AML Governance Programs are in the end really managed through your AML software. We distinguish between quality and risk, even though the two are intertwined. The quality of an alert process is improved by reducing false positives. The risk of an alert process is decreased by reducing false negatives. Note, however, that decreasing false positives improves the quality of alerts but does not affect risk; to affect risk, we have to decrease the number of false negatives. There are tools that give you more control over your false positives and false negatives without adjusting your 85th percentile. The most common approach to a tuning exercise in the industry is iteration testing alone, but we combine it with statistical analysis (standard deviations, percentiles, and distributions) for a more thorough review. We believe that one of the most important lessons is to understand both the quality of your alerts and the risks associated with them.
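One way to make the distinction concrete is to attach a simple measure to each idea. The Python sketch below treats quality as the share of alerts that turn out to be Positives, and risk as the share of suspicious activity that generated no alert. These two ratios are assumptions chosen for illustration, not definitions given in this article, and the function names and counts are hypothetical.

```python
def alert_quality(positives: int, false_positives: int) -> float:
    """Share of generated alerts that were truly suspicious."""
    total_alerts = positives + false_positives
    return positives / total_alerts if total_alerts else 0.0

def miss_rate(positives: int, false_negatives: int) -> float:
    """Share of suspicious activity that did not generate an alert."""
    total_suspicious = positives + false_negatives
    return false_negatives / total_suspicious if total_suspicious else 0.0

# Hypothetical counts before tuning
before = {"positives": 20, "false_positives": 180, "false_negatives": 5}
# Hypothetical counts after tuning that only removed false positives
after = {"positives": 20, "false_positives": 80, "false_negatives": 5}

quality_before = alert_quality(before["positives"], before["false_positives"])
quality_after = alert_quality(after["positives"], after["false_positives"])
risk_before = miss_rate(before["positives"], before["false_negatives"])
risk_after = miss_rate(after["positives"], after["false_negatives"])

print("quality:", quality_before, "->", quality_after)
print("risk:   ", risk_before, "->", risk_after)
```

In this example the quality measure improves when false positives fall, while the risk measure is unchanged because the number of false negatives is the same, which is the point made in the paragraph above.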

Conclusion

We hope this brief article has level-set some of the terminology, summarized our methodology, and discussed the difference between quality and risk as it relates to alerts. While there are varying methods to conduct the tuning process, it is most important that you understand your risk profile/assessment and the different types of positives and negatives. Further, a better understanding of the triggers generating the alerts will help improve quality while lowering risk.

References

Bland, J. M. & Altman, D. (June 29, 1996). Measurement Error. BMJ. Volume 312. Retrieved 2015, February 25. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2351401/pdf/bmj00548-0038.pdf

Wikipedia. (2015). Type I and type II errors. Retrieved 2015, February 25. http://en.wikipedia.org/wiki/Type_I_and_type_II_errors
