WhichStreams: A Dynamic Approach for Focused Data Capture from Large Social Media

Authors

  • Thibault Gisselbrecht, Université Pierre et Marie Curie
  • Ludovic Denoyer, Université Pierre et Marie Curie
  • Patrick Gallinari, Université Pierre et Marie Curie
  • Sylvain Lamprier, Université Pierre et Marie Curie

DOI:

https://doi.org/10.1609/icwsm.v9i1.14587

Keywords:

Social Media, Data Capture, Machine Learning

Abstract

Due to the huge amount of data produced on large social media, capturing useful content usually requires focusing on subsets of data that fit a pre-specified need. Considering the usual API restrictions of these media, we formulate this task of focused capture as a dynamic data source selection problem. We then propose a machine learning methodology, named WhichStreams, which is based on an extension of a recently proposed combinatorial bandit algorithm. The evaluation of our approach on various Twitter datasets, in both offline and online settings, demonstrates the relevance of the proposal for leveraging the real-time data streaming APIs offered by most of the main social media.

Published

2021-08-03

How to Cite

Gisselbrecht, T., Denoyer, L., Gallinari, P., & Lamprier, S. (2021). WhichStreams: A Dynamic Approach for Focused Data Capture from Large Social Media. Proceedings of the International AAAI Conference on Web and Social Media, 9(1), 130-139. https://doi.org/10.1609/icwsm.v9i1.14587