Linking Industry Sectors and Financial Statements: A Hybrid Approach for Company Classification

Authors

  • Guy Stephane Waffo Dzuyo Forvis Mazars; LORIA, CNRS, Université de Lorraine
  • Gaël Guibon LORIA, CNRS, Université de Lorraine; Université Sorbonne Paris Nord, CNRS, Laboratoire d’Informatique de Paris Nord, LIPN, F-93430 Villetaneuse, France
  • Christophe Cerisara LORIA, CNRS, Université de Lorraine
  • Luis Belmar-Letelier Forvis Mazars

DOI:

https://doi.org/10.1609/aaai.v39i16.33806

Abstract

Identifying the financial characteristics of industry sectors is important in accounting audits, as it allows auditors to prioritize the most important areas during an audit. Existing company classification standards, such as the Standard Industrial Classification (SIC) code, map a company to a category based on its activity and products. In this paper, we explore the potential of machine learning algorithms and language models to analyze the relationship between those categories and companies' financial statements. We propose a supervised company classification methodology and analyze several types of representations for financial statements. Existing works address this task using solely the numerical information in financial records. Our findings show that, beyond numbers, the textual information in financial records can be leveraged by language models to match the performance of dedicated decision tree-based classifiers, while providing better explainability and more generic accounting representations. We believe this work can serve as a preliminary step towards semi-automatic auditing.

Published

2025-04-11

How to Cite

Waffo Dzuyo, G. S., Guibon, G., Cerisara, C., & Belmar-Letelier, L. (2025). Linking Industry Sectors and Financial Statements: A Hybrid Approach for Company Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 39(16), 16444–16452. https://doi.org/10.1609/aaai.v39i16.33806

Issue

Vol. 39 No. 16 (2025)

Section

AAAI Technical Track on Machine Learning II