HMDA Data Integrity: Methods for Quantifying the Risk

The emphasis in managing all aspects of regulatory compliance in the current environment centers on understanding, measuring, and mitigating risk. For many facets of compliance, however, risk remains hard to pin down because of subjectivity and the range of unknowns that surround most issues. It is therefore often very difficult for an institution to measure and mitigate risk with any degree of certainty.

Identifying The Risk

Reducing risk is essentially reducing uncertainty. Reducing uncertainty requires measurable expectations regarding where an institution stands with respect to compliance, as well as likely future outcomes.

In a recent post, we noted the interagency HMDA transaction testing guidelines that were released this month. These guidelines provide error tolerance thresholds based on an institution’s HMDA volume along with the anticipated sample sizes examiners will use to evaluate HMDA data integrity. 

This is important not only because the data integrity review is almost always the first step in a compliance and/or fair lending review, but also because resubmission may be required if the errors exceed these thresholds. Civil money penalties can also be assessed depending on the extent of the errors.

Most HMDA reporting institutions have a system in place for accurately collecting and reporting HMDA data, but errors still occur. As compliance is an expense, the question becomes how resources should be allocated.

For example, if an institution discovers HMDA reporting errors, at what level of errors is a full (and costly) scrub required, and at what level do the errors still fall within regulatory tolerances? The new guidelines provide a way to quantify this and to conduct testing that assists in making these determinations.

The Process

The first step is to conduct a review of the institution’s HMDA reportable transactions for accuracy. This should be done by first selecting a random sample of HMDA reportable applications and then comparing the items reported on the LAR to the true values as reflected in the file. (It is critical that this be a truly random sample and not haphazardly selected.)
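As a minimal sketch of what "truly random" can look like in practice, the example below draws a simple random sample without replacement from a list of LAR record identifiers. The function name, the sequence-number identifiers, the LAR size of 1,000, and the seed are all illustrative assumptions, not part of the interagency guidelines.

```python
import random

def draw_review_sample(lar_ids, sample_size, seed=None):
    """Draw a simple random sample (without replacement) of LAR record IDs."""
    ids = list(lar_ids)
    if sample_size > len(ids):
        raise ValueError("sample_size cannot exceed the number of LAR records")
    # A dedicated generator with a recorded seed lets the draw be reproduced later.
    rng = random.Random(seed)
    return sorted(rng.sample(ids, sample_size))

# Hypothetical LAR of 1,000 applications identified by sequence number.
lar_ids = range(1, 1001)
sample = draw_review_sample(lar_ids, 100, seed=2017)
print(len(sample), "files selected for review")
```

Recording the seed alongside the review workpapers is one way to show that the files were not hand-picked.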

Once this is completed, an error rate can be determined. For example, if 100 applications were reviewed and 5 errors were noted, this is an error rate of 5%.  

The error rate based on the sample is not the “true” error rate but an estimate based on the sample. Once the estimated error rate is determined from the sample, a range can be constructed to determine the likely range for the true error rate. This range is referred to as the “confidence interval” and by convention is typically constructed at 95%.

This simply means that, if the sampling were repeated many times, about 95% of the ranges constructed this way would contain the true error rate. The range therefore provides an idea of how prevalent errors may or may not be in the institution’s HMDA data.

Taking A Closer Look

For example, in the left column of Table I below (shown in yellow), the “true” rate for a measured error rate of 5% based on a sample of 100 applications is likely to fall between 1.6% and 11.3%. This is a wide range, but the width of the confidence interval is a function of the variation in the data and the sample size. Therefore, a more precise measurement (i.e., a narrower confidence interval) can be obtained with a larger sample.

Note in the far-right column (shown in green) that a measured rate of 5% with a sample of 600 would mean the true rate likely lies between 3.4% and 7.1%. Likewise, a smaller error rate also produces a narrower range. For example, for an error rate of 1% in a sample of 100 (far-left corner of Table I, in blue), the range is between 0% and 5.4%.
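The post does not state which formula was used to build these ranges. One common choice for error counts is the exact (Clopper-Pearson) binomial interval, and the sketch below assumes that method. The function names are illustrative, and the interval is found by simple bisection using only the Python standard library.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) when X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(min(k, n) + 1))

def _solve(f, target, iterations=100):
    """Bisection on [0, 1] for a monotone function f, returning p with f(p) ~ target."""
    lo, hi = 0.0, 1.0
    increasing = f(1.0) > f(0.0)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid  # the solution lies to the right of mid
        else:
            hi = mid  # the solution lies to the left of mid
    return (lo + hi) / 2

def exact_interval(errors, n, confidence=0.95):
    """Clopper-Pearson interval for the true error rate (assumed method)."""
    alpha = 1 - confidence
    # Lower bound: smallest rate at which seeing this many errors or more is still plausible.
    lower = 0.0 if errors == 0 else _solve(
        lambda p: 1 - binom_cdf(errors - 1, n, p), alpha / 2)
    # Upper bound: largest rate at which seeing this few errors or fewer is still plausible.
    upper = 1.0 if errors == n else _solve(
        lambda p: binom_cdf(errors, n, p), alpha / 2)
    return lower, upper

for errors, n in [(5, 100), (30, 600), (1, 100)]:
    low, high = exact_interval(errors, n)
    print(f"{errors}/{n}: measured {errors / n:.1%}, 95% range {low:.1%} to {high:.1%}")
```

Under this assumption the results should land close to the ranges quoted above, and the narrowing from a sample of 100 to a sample of 600 shows how sample size drives precision. Other interval methods (Wilson, normal approximation) would give somewhat different endpoints.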

[Table I: likely range of the true error rate, by measured error rate and sample size (image A-70 I-1.png)]

The range given by the confidence interval can then be compared to the expected tolerances based on the institution’s LAR count. If the tolerance is less than the upper end of the confidence interval, then the review suggests the institution’s error rate may exceed the tolerance. The tolerances can be found on page 5 of the guidelines.

Once the range of the true error rate is determined, the institution may also want to consider what a review during an examination would determine based on the interagency guidance. Again, we can apply the same principles and base our assessment on an anticipated sample during an examination.  

To illustrate, let’s examine Table II below. We will assume the institution has a LAR count of 1,000 and has conducted a review which resulted in an estimated error rate of 2.5%. Assuming this is the true error rate, and examiners used a sample of 79 as per the guidelines, the error rate detected would likely fall between 0.3% and 8.8% (red shaded portion of Table II). This means that although the measured error rate of 2.5% is less than the tolerance of 5.1%, an examination may detect an error rate above the tolerance.

Note here (again looking at Table I above, shaded in gray) that if the institution had measured the error rate based on a sample of 300, the estimated true error rate would have been roughly 1% to 5.6%, nearly within the tolerance. However, a smaller sample drawn during an examination may still indicate that the institution’s error rate is outside of the tolerances. This is why it is prudent to consider both measures when assessing the risk.
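To put a rough number on this examination risk, the sketch below computes the chance that a random examiner sample shows an error rate above the tolerance, given an assumed true error rate. It uses the figures from the example (true rate 2.5%, sample of 79, tolerance 5.1%) and makes a simplifying assumption that the tolerance is applied to the overall share of sampled files with errors, which is not how the guidelines work in full detail. The function name is illustrative.

```python
from math import comb, floor

def prob_sample_exceeds_tolerance(true_rate, sample_size, tolerance):
    """Chance that a random sample shows an error rate above the tolerance.

    Simplifying assumption: each sampled file is independently in error with
    probability true_rate, and the tolerance applies to the overall sample rate.
    """
    # Largest error count that keeps the sample rate at or below the tolerance.
    # The small epsilon guards against floating-point rounding at exact boundaries.
    max_ok = floor(tolerance * sample_size + 1e-9)
    p_within = sum(
        comb(sample_size, i) * true_rate**i * (1 - true_rate)**(sample_size - i)
        for i in range(max_ok + 1)
    )
    return 1 - p_within

risk = prob_sample_exceeds_tolerance(true_rate=0.025, sample_size=79, tolerance=0.051)
print(f"Chance an exam sample of 79 exceeds the 5.1% tolerance: {risk:.1%}")
```

Running the same function with the upper end of the institution’s own confidence interval as the true rate gives a more conservative view of the same risk.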

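[Table II: likely range of error rates detected in an examination sample, given the institution’s estimated error rate (image A-70 I-2.png)]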

The above example is a general guide to quantifying risk based on overall error rates. The guidelines, however, are a bit more complex, and resubmission may be required even if overall error rates are within these tolerances. Institutions should study the guidance thoroughly and take into account all factors when quantifying risk.

How to cite this blog post (APA Style):
Premier Insights. (2017, September 28). HMDA Data Integrity: Methods for Quantifying the Risk [Blog post]. Retrieved from
