In studying your bank’s loan data, how can you determine the relationships among various factors in your lending policies, customer base, pricing, and more? One answer is regression modeling, an important tool in statistical analysis.
A key application of statistical analysis is measuring relationships between variables. There are two possible outcomes with regard to measuring these relationships:
- The variables are correlated (the variables are related)
- The variables are independent (the variables are unrelated)
The multivariate regression model allows these relationships to be measured and tested independently and simultaneously. For our purposes, we would most often utilize fair lending regression tables to find relationships in our data.
In evaluating statistical relationships from regression analysis, we are concerned with three things. We can call these “the three S’s”: size, sign, and significance.
Size is simply a measure of the strength of the relationship between one variable and another. In a typical regression table, it is shown by either the coefficient or the odds ratio. Both reflect the magnitude of the relationship.
The next thing we are concerned with is the direction of the relationship, that is, whether the correlation is positive or negative. This can be determined from the sign of the coefficient (or of the t or z statistic) or from the odds ratio.
If the coefficient or test statistic is negative, the relationship is negative; if it is positive, the relationship is positive. For an odds ratio, a value less than 1.0 indicates a negative relationship, a value greater than 1.0 indicates a positive relationship, and a value of exactly 1.0 indicates no relationship.
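The link between the sign of a coefficient and the odds ratio can be sketched in a few lines of Python. This assumes a logistic regression, where the odds ratio is the exponential of the coefficient. The variable names and coefficient values are hypothetical and chosen only for illustration.

```python
import math

def odds_ratio(coefficient: float) -> float:
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(coefficient)

def direction(coefficient: float) -> str:
    """Classify the direction of a relationship from the coefficient's sign."""
    if coefficient > 0:
        return "positive"
    if coefficient < 0:
        return "negative"
    return "none"

# Hypothetical coefficients, not taken from any real regression table
example_coefficients = {"variable_a": 0.25, "variable_b": -0.03}

for name, coef in example_coefficients.items():
    # A negative coefficient always maps to an odds ratio below 1.0,
    # and a positive coefficient to an odds ratio above 1.0
    print(f"{name}: {direction(coef)}, odds ratio = {odds_ratio(coef):.3f}")
```

Because the exponential of zero is 1.0, a coefficient of zero and an odds ratio of 1.0 both describe the same thing: no measured relationship.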
Finally, and probably most importantly, we are interested in significance. The regression model measures relationships between variables, but a measured relationship can be statistically insignificant. Simply put, this means that the observed correlation is too weak to rule out the possibility that the association is just random variation within the sample.
Statistical significance is determined by the reported p value for the variable of interest. The p value is the probability of observing a correlation at least as strong as the one measured if the variables were in fact unrelated, that is, if the result were due to random variation alone. By convention, a p value below .05 is generally required to conclude that a measured correlation is statistically significant.
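As a rough sketch of where a reported p value comes from, the snippet below converts a z statistic into a two-sided p value using the standard normal distribution and compares it with the .05 threshold. The function names and the z values are illustrative assumptions. Regression software may instead use a t distribution, which gives slightly different values in small samples.

```python
import math

def two_sided_p_value(z: float) -> float:
    """Two-sided p value for a z statistic under the standard normal distribution."""
    # P(|Z| > |z|) equals erfc(|z| / sqrt(2)) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))

def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Apply the conventional threshold: significant when p is below alpha."""
    return p_value < alpha

# Hypothetical z statistics, for illustration only
for z in (1.0, 2.5):
    p = two_sided_p_value(z)
    print(f"z = {z}: p = {p:.4f}, significant = {is_significant(p)}")
```

The sign of the z statistic does not affect the two-sided p value; it only indicates the direction of the relationship.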
Bringing It All Together
All three of the attributes of regression output described above are important to understanding regression results. Two examples are presented below which show how results from a typical regression table may be presented and how to interpret them.
Conclusion: The relationship between LTV and note rate in this dataset is positive and statistically significant. LTVs of 100% are associated with a 0.435 percentage point higher expected note rate relative to LTVs less than 100%.
Conclusion: The relationship between credit score and denials is negative and statistically significant. On average, each one-point increase in credit score is associated with a 3% lower probability of denial.
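When a regression table reports an odds ratio, a percentage figure like the one in the credit score example is typically derived by subtracting 1 from the odds ratio. The sketch below assumes a hypothetical odds ratio of 0.97, which is not taken from any table in this article. Strictly speaking, the result is a change in the odds of denial, which is close to but not identical to a change in probability.

```python
def percent_change_in_odds(odds_ratio: float, units: float = 1.0) -> float:
    """Percent change in the odds for a given change in the explanatory variable.

    The odds ratio applies per one-unit change, so a change of several
    units compounds multiplicatively rather than adding up.
    """
    return (odds_ratio ** units - 1.0) * 100.0

# Hypothetical odds ratio for credit score, for illustration only
credit_score_odds_ratio = 0.97

one_point = percent_change_in_odds(credit_score_odds_ratio)
twenty_points = percent_change_in_odds(credit_score_odds_ratio, units=20)

print(f"1-point increase: {one_point:.1f}% change in odds of denial")
print(f"20-point increase: {twenty_points:.1f}% change in odds of denial")
```

Note that a 20-point increase does not simply give 20 times the one-point effect, because each additional point multiplies the odds by the same ratio.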