Towards Cognitive Automation of Data Science


  • Alain Biem, IBM Research
  • Maria Butrico, IBM Research
  • Mark Feblowitz, IBM Research
  • Tim Klinger, IBM Research
  • Yuri Malitsky, IBM Research
  • Kenney Ng, IBM Research
  • Adam Perer, IBM Research
  • Chandra Reddy, IBM Research
  • Anton Riabov, IBM Research
  • Horst Samulowitz, IBM Research
  • Daby Sow, IBM Research
  • Gerald Tesauro, IBM Research
  • Deepak Turaga, IBM Research



Keywords: Data Science, Automation, Reasoning Under Uncertainty, Visualization, NLP, Text Analytics


A data scientist typically performs a number of tedious and time-consuming steps to derive insight from a raw data set. The process usually starts with data ingestion, cleaning, and transformation (e.g., outlier removal, missing value imputation), then proceeds to model building, and ends with a presentation of predictions that align with the end user's objectives and preferences. It is a long, complex, and sometimes artful process that requires substantial time and effort, especially because of the combinatorial explosion in the choices of algorithms (and platforms), their parameters, and their compositions. Tools that help automate steps in this process have the potential to shorten the time-to-delivery of useful results, expand the reach of data science to non-experts, and offer a more systematic exploration of the available options. This work presents a step towards this goal.
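
To make the combinatorial growth concrete, the following is a minimal sketch of an exhaustive search over a tiny pipeline space. It is not the system described in this work: the toy data, the option lists, and every function name (impute, drop_outliers, fit, evaluate) are assumptions made up for illustration, and a brute-force loop stands in for whatever search strategy a real tool would use.

```python
from itertools import product
from statistics import mean, median

# Hypothetical toy data: (x, y) pairs; None marks a missing x value.
TRAIN = [(1.0, 2.0), (2.0, 4.0), (None, 6.0), (4.0, 8.0), (5.0, 10.0), (6.0, 60.0)]
VALID = [(1.5, 3.0), (3.5, 7.0), (5.5, 11.0)]

def impute(rows, strategy):
    """Fill missing x values with the mean or median of the known ones."""
    known = [x for x, _ in rows if x is not None]
    fill = mean(known) if strategy == "mean" else median(known)
    return [(fill if x is None else x, y) for x, y in rows]

def drop_outliers(rows, enabled):
    """Optionally drop rows whose y lies more than 3 MADs from the median y."""
    if not enabled:
        return list(rows)
    med = median(y for _, y in rows)
    mad = median(abs(y - med) for _, y in rows)
    return [(x, y) for x, y in rows if abs(y - med) <= 3 * mad]

def fit(rows, model):
    """Return a predictor: a constant mean model or a least-squares line."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    if model == "mean":
        return lambda x: y_bar
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    slope = sxy / sxx
    return lambda x: y_bar + slope * (x - x_bar)

def evaluate(config):
    """Mean absolute error of one pipeline configuration on the validation set."""
    rows = impute(TRAIN, config["impute"])
    rows = drop_outliers(rows, config["drop_outliers"])
    predict = fit(rows, config["model"])
    return mean(abs(predict(x) - y) for x, y in VALID)

# Each step adds a multiplicative factor to the number of candidate pipelines.
SPACE = {
    "impute": ["mean", "median"],
    "drop_outliers": [False, True],
    "model": ["mean", "linear"],
}

configs = [dict(zip(SPACE, choice)) for choice in product(*SPACE.values())]
results = [(evaluate(c), c) for c in configs]
best_score, best_config = min(results, key=lambda r: r[0])

print(len(configs), "candidate pipelines")
print("best:", best_config, "MAE = %.3f" % best_score)
```

Even with two options per step, three steps already yield eight pipelines; adding more algorithms, hyperparameters, or orderings multiplies that count, which is why exhaustive enumeration of this kind does not scale and why automated, more selective exploration is attractive.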




How to Cite

Biem, A., Butrico, M., Feblowitz, M., Klinger, T., Malitsky, Y., Ng, K., Perer, A., Reddy, C., Riabov, A., Samulowitz, H., Sow, D., Tesauro, G., & Turaga, D. (2015). Towards Cognitive Automation of Data Science. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1).