Representation Space Augmentation for Effective Self-Supervised Learning on Tabular Data

Authors

  • Moonjung Eo, LG AI Research
  • Kyungeun Lee, LG AI Research
  • Hye-Seung Cho, LG AI Research
  • Dongmin Kim, LG AI Research
  • Ye Seul Sim, LG AI Research
  • Woohyung Lim, LG AI Research

DOI:

https://doi.org/10.1609/aaai.v39i11.33265

Abstract

Tabular data, widely used across industries, remains underexplored in deep learning. Self-supervised learning (SSL) shows promise for pre-training deep neural networks (DNNs) on tabular data, but its potential is limited by the difficulty of designing suitable augmentations. Unlike image and text data, where SSL leverages inherent spatial or semantic structure, tabular data lacks such explicit structure. As a result, traditional input-level augmentations, such as modifying or removing features, struggle to preserve critical information while still introducing enough variability. To address these challenges, we propose RaTab, a novel method that shifts augmentation from the input level to the representation level using matrix factorization, specifically truncated SVD. This approach preserves essential data structures while generating diverse representations by applying dropout at various stages of the representation, significantly enhancing SSL performance for tabular data.
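The core idea of representation-level augmentation can be illustrated with a minimal NumPy sketch: factorize the data with a rank-k truncated SVD, then apply dropout to the low-rank representation to produce stochastic views for SSL. This is not the paper's implementation; the rank `k`, dropout rate, and the stage at which dropout is applied are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def truncated_svd(X, k):
    """Rank-k truncated SVD: X ~ U_k @ diag(S_k) @ Vt_k."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], S[:k], Vt[:k, :]

def augment(X, k=8, drop_p=0.2):
    """Generate one augmented view of X in representation space.

    Hypothetical choice: dropout is applied to the projected
    coordinates (U * S) before mapping back to feature space.
    """
    U, S, Vt = truncated_svd(X, k)
    Z = U * S                              # low-rank representation (n x k)
    mask = rng.random(Z.shape) > drop_p    # random dropout mask
    Z_aug = Z * mask / (1.0 - drop_p)      # inverted dropout scaling
    return Z_aug @ Vt                      # back to original feature space

# Two stochastic views of the same table, as a contrastive-SSL pair.
X = rng.standard_normal((100, 20))
view1, view2 = augment(X), augment(X)
```

Because the factorization captures the dominant structure of the table, dropping components of the low-rank representation perturbs the data without destroying the correlations that input-level feature corruption tends to break.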

Published

2025-04-11

How to Cite

Eo, M., Lee, K., Cho, H.-S., Kim, D., Sim, Y. S., & Lim, W. (2025). Representation Space Augmentation for Effective Self-Supervised Learning on Tabular Data. Proceedings of the AAAI Conference on Artificial Intelligence, 39(11), 11625-11633. https://doi.org/10.1609/aaai.v39i11.33265

Section

AAAI Technical Track on Data Mining & Knowledge Management I