Big data analytics in risk management: credit risk assessment and evaluation using machine learning algorithms.
Abstract
This Master's thesis introduces and analyzes the concept of credit scoring and credit risk evaluation through an extensive description and implementation of an end-to-end classification modelling process, from both an analytical and a business perspective. A large open-source dataset containing almost 1 million loan applications is used to construct and implement three different classification models in order to answer the following questions: What is the process of constructing a classification model for credit risk scoring? What incremental value does a classification model for predicting the probability of default add to a financial institution? And finally, which classification algorithm is best suited to credit scoring and risk assessment? Extensive literature is reviewed in order to build a proper theoretical and modelling framework for the final implementation of the three models. Their results are then presented, together with a detailed comparison in terms of accuracy and sensitivity.
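To make the two comparison criteria concrete, the following minimal sketch shows how accuracy and sensitivity could be computed for a binary default classifier. It is an illustration only and not the implementation used in the thesis: the function name `accuracy_and_sensitivity`, the convention that label 1 denotes a defaulted loan, and the toy labels are all assumptions introduced here.

```python
def accuracy_and_sensitivity(y_true, y_pred):
    """Return (accuracy, sensitivity) for binary labels.

    Assumption: label 1 marks a defaulted loan, label 0 a repaid one.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defaults caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # good loans kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # good loans flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defaults missed

    total = tp + tn + fp + fn
    # Accuracy: share of all applications classified correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity (recall): share of actual defaults that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


# Hypothetical toy labels, not taken from the thesis dataset.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]

acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(f"accuracy={acc:.3f}, sensitivity={sens:.3f}")
```

Reporting both measures matters in credit scoring because defaults are typically a minority class: a model can reach high accuracy while still missing many actual defaults, which is what sensitivity captures.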