Challenges in Materials Discovery – Synthetic Generator and Real Datasets


  • Ronan Le Bras, Cornell University
  • Richard Bernstein, Cornell University
  • John Gregoire, California Institute of Technology
  • Santosh Suram, California Institute of Technology
  • Carla Gomes, Cornell University
  • Bart Selman, Cornell University
  • R. Bruce van Dover, Cornell University



Materials Discovery, Computational Sustainability, Dataset


Newly discovered materials have been central to recent technological advances. They have contributed significantly to breakthroughs in electronics, renewable energy and green buildings, and, overall, have promoted the advancement of global human welfare. Yet only a fraction of all possible materials has been explored. Accelerating the pace of materials discovery would foster technological innovation and could help address pressing issues in sustainability, such as energy production or consumption. The bottleneck of this discovery cycle, however, lies in the analysis of the materials data. As materials scientists have recently devised techniques to efficiently create thousands of materials, and experimentalists have developed new methods and tools to characterize them, the limiting factor has become the data analysis itself. Hence, the goal of this paper is to stimulate the development of new computational techniques for the analysis of materials data by bringing together the complementary expertise of materials scientists and computer scientists. In collaboration with two major research laboratories in materials science, we provide the first publicly available dataset for the phase map identification problem. In addition, we provide a parameterized synthetic data generator to assess the quality of proposed approaches, as well as tools for data visualization and solution evaluation.




How to Cite

Le Bras, R., Bernstein, R., Gregoire, J., Suram, S., Gomes, C., Selman, B., & van Dover, R. B. (2014). Challenges in Materials Discovery – Synthetic Generator and Real Datasets. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1).



Computational Sustainability and Artificial Intelligence