Relational In-Context Learning on Structured Data via Neighborhood Aggregation and Structural Information (Extended Abstract)
DOI:
https://doi.org/10.1609/aaaiss.v9i1.42930
Abstract
Relational databases are among the most common data storage mechanisms across business domains, including healthcare, banking, e-commerce, logistics, and human resources. Traditionally, data science teams must join complex relational database structures into a single table, perform manual feature engineering, and train a separate machine learning model for each task. This prevents the exploitation of relational signal and increases the resources needed to scale across business use cases. There have been significant advancements in foundation model research across computer vision, natural language processing, and tabular deep learning. However, only limited work, such as Relational Transformer, has explored models that can predict directly on relational databases. Previous work has applied In-Context Learning (ICL) to tasks with simpler, homogeneous data. We leverage advancements in tabular foundation models (TFMs), such as TabPFN and ConTextTab, to perform ICL directly on multi-modal relational data. Specifically, we construct a heterogeneous graph from the primary and foreign keys of a relational database. We then apply a heterogeneous GraphSAGE model as a fixed random feature map to aggregate neighborhood information across the subgraphs. Additionally, we augment entity node representations by combining structural information with the entity embedding. Finally, we supply a TFM with the contextualized entity node representation to perform ICL on arbitrary tasks without any training. We evaluate our method on the public relational deep learning benchmark Relbench, which contains many diverse, real-world regression and classification tasks across healthcare, e-commerce, marketing, and enterprise sales. In a fully ICL regime, our model is competitive with several fully trained baselines on classification tasks.
On average, the method achieves 114% of the performance of the fully trained LightGBM, 95% of that of the fully trained relational deep learning model reported in Relbench, and 98% of the zero-shot performance of the pretrained Relational Transformer (RT). This signifies an advance in unlocking accurate predictions, without training, on the dominant business data structure across diverse domains.
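The pipeline described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the two toy tables, the helper names, the single edge type, the one-hop mean aggregation, and the use of node degree as the "structural information" are all assumptions made for illustration. The sketch links rows through a foreign key, applies one GraphSAGE-style layer whose weights are drawn at random and never trained, and appends the structural feature. The final step, passing the resulting rows to a tabular foundation model as in-context examples, is not shown.

```python
import math
import random

# Hypothetical toy database: two tables linked by a foreign key.
customers = {  # primary key -> numeric features
    "c1": [0.2, 1.0],
    "c2": [0.9, 0.0],
}
orders = {  # primary key -> (foreign key into customers, numeric features)
    "o1": ("c1", [10.0]),
    "o2": ("c1", [25.0]),
    "o3": ("c2", [5.0]),
}


def build_neighbors(customers, orders):
    """Map each customer to the order rows that reference it via the foreign key."""
    neighbors = {cid: [] for cid in customers}
    for oid, (cid, _) in orders.items():
        neighbors[cid].append(oid)
    return neighbors


def random_matrix(rng, n_in, n_out):
    """Fixed random weights: drawn once from a seeded generator, never trained."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.gauss(0.0, scale) for _ in range(n_out)] for _ in range(n_in)]


def matvec(x, w):
    """Multiply row vector x (length n_in) by matrix w (n_in x n_out)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def mean_rows(rows, width):
    """Column-wise mean of a list of rows; zeros if there are no rows."""
    if not rows:
        return [0.0] * width
    return [sum(r[j] for r in rows) / len(rows) for j in range(width)]


def embed_customers(customers, orders, hidden=4, seed=0):
    """One GraphSAGE-style layer with random weights, plus a degree feature."""
    rng = random.Random(seed)
    n_self = len(next(iter(customers.values())))
    n_nbr = len(next(iter(orders.values()))[1])
    w_self = random_matrix(rng, n_self, hidden)
    w_nbr = random_matrix(rng, n_nbr, hidden)
    neighbors = build_neighbors(customers, orders)

    embeddings = {}
    for cid, feats in customers.items():
        nbr_feats = [orders[oid][1] for oid in neighbors[cid]]
        agg = mean_rows(nbr_feats, n_nbr)  # mean over linked order rows
        combined = zip(matvec(feats, w_self), matvec(agg, w_nbr))
        h = [max(0.0, a + b) for a, b in combined]  # ReLU(self + neighbors)
        degree = float(len(neighbors[cid]))  # assumed structural feature
        embeddings[cid] = h + [degree]
    return embeddings


embeddings = embed_customers(customers, orders)
for cid, vec in embeddings.items():
    print(cid, [round(v, 3) for v in vec])
```

Because the weights come from a seeded generator, the feature map is deterministic and needs no fitting. Each output row has a fixed width (`hidden + 1` here) and could then be placed in the context of a tabular foundation model alongside labeled rows for the task at hand.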
Published
2026-06-23
How to Cite
Meyer, J., Palczewski, T., Shaikh, A., Mohammadi, M., Katupputhur Ramprasath, D., Paresh, K., … Li, M. (2026). Relational In-Context Learning on Structured Data via Neighborhood Aggregation and Structural Information (Extended Abstract). Proceedings of the AAAI Symposium Series, 9(1), 219–220. https://doi.org/10.1609/aaaiss.v9i1.42930
Issue
Section
AI in Business: Intelligent Transformation and Management (Extended Abstracts)