Fallen Angel Bonds Investment and Bankruptcy Predictions Using Manual Models and Automated Machine Learning

Paper title

Fallen Angel Bonds Investment and Bankruptcy Predictions Using Manual Models and Automated Machine Learning

Authors

Mateika, Harrison; Jia, Juannan; Lillard, Linda; Cronbaugh, Noah; Shin, Will

Abstract

The primary aim of this research was to find the model that best predicts which fallen angel bonds are likely to return to investment grade and which are likely to fall into bankruptcy. To implement the solution, we judged that the ideal approach was to build an optimal machine learning model for predicting bankruptcies. Among the many machine learning models available, we chose four classification methods: logistic regression, k-nearest neighbors (KNN), support vector machines (SVM), and neural networks (NN). We also used the automated machine learning offered by Google Cloud. Our model comparisons showed that the models did not predict bankruptcies well on the original data set, with the exception of Google Cloud's machine learning, which achieved a high precision score. The models did, however, perform very well on our over-sampled and feature-selected data set. This is likely because the models were over-fitted to the over-sampled data, meaning they would not predict data outside this data set as accurately. We were therefore unable to create a model that we are confident would predict bankruptcies. Even so, the project produced two useful findings. First, Google Cloud's machine learning model either outperformed or performed on par with the other models on every metric and every data set. Second, feature selection did not substantially reduce predictive power, which means that less data needs to be collected in future experiments on bankruptcy prediction.
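The abstract's concerns about class imbalance, over-sampling, and precision can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say which over-sampling technique was used, so random duplication of minority-class rows is assumed here, and the function names `random_oversample` and `precision_recall` are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class is as
    large as the largest class. Assumption: the paper's actual
    over-sampling method is not stated in the abstract."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive class (here: bankruptcy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In general, over-sampling is applied to the training split only; if duplicated rows also end up in the evaluation data, scores can look better than the model's performance on unseen data. Whether this explains the paper's results is not stated in the abstract.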
