Investigating the Role of Ensemble Learning in High-Value Wine Identification
Keywords: ensemble learning, fraud detection, wine classification
We tackle the problem of authenticating high-value Italian wines through machine learning classification. The problem is a serious one, since protecting high-quality wines from forgeries is worth several million euros each year. In previous work we identified some base models (in particular, classifiers based on Bayesian networks (BNC), the multi-layer perceptron (MLP) and sequential minimal optimization (SMO)) that behave well using inexpensive chemical analyses of the wines of interest. In the present paper, we investigate the role of ensemble learning in the construction of more robust classifiers; results suggest that, while bagging and boosting may significantly improve both BNC and MLP, the SMO model is already very robust and efficient as a base learner. We report results from cross-validation on two different datasets, as well as from experiments with models trained on these datasets and tested on a dataset of potentially fake wines; the latter was synthesized from a generative probabilistic model learned from real samples and expert knowledge.
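As an illustration of the evaluation scheme summarized above (bagging a base learner and scoring it by cross-validation), the following is a minimal sketch under stated assumptions. It is not the setup used in this work: the data are synthetic two-feature samples for two hypothetical classes, the base learner is a simple nearest-centroid classifier standing in for BNC, MLP or SMO, and all function names are introduced here for illustration only.

```python
import random
from collections import Counter

def make_data(n_per_class, rng):
    """Synthetic two-feature samples for two hypothetical wine classes (assumption)."""
    data = []
    for label, center in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            x = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            data.append((x, label))
    rng.shuffle(data)
    return data

def fit_centroids(train):
    """Base learner (stand-in, not BNC/MLP/SMO): one mean vector per class."""
    sums, counts = {}, Counter()
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict_centroids(model, x):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: dist2(model[y]))

def fit_bagging(train, n_models, rng):
    """Bagging: fit each base model on a bootstrap resample of the training set."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]  # sampling with replacement
        models.append(fit_centroids(sample))
    return models

def predict_bagging(models, x):
    """Combine the base models by majority vote."""
    votes = Counter(predict_centroids(m, x) for m in models)
    return votes.most_common(1)[0][0]

def cross_validate(data, k, n_models, rng):
    """k-fold cross-validation accuracy of the bagged ensemble."""
    folds = [data[i::k] for i in range(k)]
    correct = 0
    for i, test in enumerate(folds):
        train = [s for j, f in enumerate(folds) if j != i for s in f]
        models = fit_bagging(train, n_models, rng)
        correct += sum(predict_bagging(models, x) == y for x, y in test)
    return correct / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_data(50, rng)
accuracy = cross_validate(data, k=5, n_models=15, rng=rng)
print(f"bagged nearest-centroid accuracy (5-fold CV): {accuracy:.3f}")
```

Comparing this score with that of a single `fit_centroids` model on the same folds would mirror, in miniature, the base-learner versus ensemble comparison described above; boosting would replace the uniform bootstrap resampling with reweighting of misclassified samples.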