Physics-constrained Automatic Feature Engineering for Predictive Modeling in Materials Science


  • Ziyu Xiang Texas A&M University
  • Mingzhou Fan Texas A&M University
  • Guillermo Vázquez Tovar Texas A&M University
  • William Trehern Texas A&M University
  • Byung-Jun Yoon Texas A&M University; Brookhaven National Laboratory
  • Xiaofeng Qian Texas A&M University
  • Raymundo Arroyave Texas A&M University
  • Xiaoning Qian Texas A&M University



Feature Construction/Reformulation, Reinforcement Learning, Scalability of ML Systems, Natural Sciences


Automatic Feature Engineering (AFE) aims to extract useful knowledge from data to support interpretable predictions in machine learning tasks. Here, we develop AFE to extract dependency relationships that can be interpreted as functional formulas, revealing physical meaning or suggesting new hypotheses for the problems of interest. We focus on materials science applications, where interpretable predictive modeling may provide a principled understanding of materials systems and guide the discovery of new materials. Exhaustively constructing and searching the whole feature space of potential relationships to identify interpretable and predictive features is often computationally prohibitive. We therefore develop and evaluate new AFE strategies that explore a feature generation tree (FGT) with a deep Q-network (DQN) to learn scalable and efficient exploration policies. The developed DQN-based AFE strategies are benchmarked against existing AFE methods on several materials science datasets.
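To make the idea of exploring a feature generation tree concrete, the following is a minimal sketch and not the authors' implementation. It makes several assumptions that do not come from the abstract: a tabular epsilon-greedy Q-learning table stands in for the deep Q-network, the operator set is limited to three binary arithmetic operations, the reward is the absolute Pearson correlation between a generated feature and the target, and no physics constraints are enforced. The names `search_features`, `pearson`, and `OPS` are hypothetical.

```python
import random

# Hypothetical operator set; the paper's actual operators are not given in the abstract.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def search_features(primary, target, depth=2, episodes=300,
                    eps=0.2, alpha=0.5, gamma=0.9, seed=0):
    """Explore a small feature generation tree with tabular Q-learning.

    A state is the expression built so far; an action combines it with one
    primary feature through one operator. Returns (best_score, best_expression).
    """
    rng = random.Random(seed)
    names = sorted(primary)
    actions = [(op, name) for op in OPS for name in names]
    q = {}  # Q-table keyed by (expression, action); assumed stand-in for a DQN
    best = (0.0, None)
    for _ in range(episodes):
        # Each episode starts at a randomly chosen primary feature (tree root).
        expr = rng.choice(names)
        values = primary[expr]
        for step in range(depth):
            if rng.random() < eps:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q.get((expr, a), 0.0))  # exploit
            op, other = action
            new_expr = f"({expr} {op} {other})"
            new_values = [OPS[op](a, b) for a, b in zip(values, primary[other])]
            reward = abs(pearson(new_values, target))
            # No future value at the depth limit (terminal node of the tree).
            if step == depth - 1:
                future = 0.0
            else:
                future = max(q.get((new_expr, a), 0.0) for a in actions)
            old = q.get((expr, action), 0.0)
            q[(expr, action)] = old + alpha * (reward + gamma * future - old)
            if reward > best[0]:
                best = (reward, new_expr)
            expr, values = new_expr, new_values
    return best
```

Calling `search_features(primary, target)` with `primary` as a dictionary of named feature columns and `target` as the property to predict returns the highest-scoring expression found. A learned policy of this kind visits only part of the tree, which is the motivation for avoiding exhaustive enumeration when the operator set and depth grow.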




How to Cite

Xiang, Z., Fan, M., Vázquez Tovar, G., Trehern, W., Yoon, B.-J., Qian, X., Arroyave, R., & Qian, X. (2021). Physics-constrained Automatic Feature Engineering for Predictive Modeling in Materials Science. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12), 10414-10421.



AAAI Technical Track on Machine Learning V