Hlongwane R, Ramaboa KKKM, Mongwe W. Enhancing credit scoring accuracy with a comprehensive evaluation of alternative data.
PLoS One 2024;
19:e0303566. [PMID:
38771812 PMCID:
PMC11108212 DOI:
10.1371/journal.pone.0303566]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2024] [Accepted: 04/27/2024] [Indexed: 05/23/2024] Open
Abstract
This study explores the potential of utilizing alternative data sources to enhance the accuracy of credit scoring models, compared to relying solely on traditional data sources, such as credit bureau data. A comprehensive dataset from the Home Credit Group's home loan portfolio is analysed. The research examines the impact of incorporating alternative predictors that are typically overlooked, such as an applicant's social network default status, regional economic ratings, and local population characteristics. The modelling approach applies the model-X knockoffs framework for systematic variable selection. By including these alternative data sources, the credit scoring models demonstrate improved predictive performance, achieving an area under the curve metric of 0.79360 on the Kaggle Home Credit default risk competition dataset, outperforming models that relied solely on traditional data sources, such as credit bureau data. The findings highlight the significance of leveraging diverse, non-traditional data sources to augment credit risk assessment capabilities and overall model accuracy.
Collapse