Footnote:
According to SSRN, the original version of the document was written in June 2023.
Description:
We fine-tune a large language model to classify accounting topics within financial disclosures, enabling efficient and accurate classification of accounting topics in large volumes of out-of-sample unlabeled text. Specifically, our model leverages innovations in supervised machine learning and large language models to overcome the challenges of manually labeling data for this task, and it outperforms latent Dirichlet allocation (LDA), the most prevalent topic classification method in accounting and finance research. We demonstrate the importance of these innovations with several examples of unlabeled disclosures that our model can classify into topics: custom notes to the financial statements, the MD&A section, and the risk factor section. We find that these disclosures contain meaningful topic-specific information that was previously difficult to uncover and that is predictive of specific accounting outcomes. Researchers and practitioners interested in identifying relevant and consistent information on accounting topics from large volumes of textual data can use our model.