A Nonparametric Online Model for Air Quality Prediction
Keywords: Gaussian Processes, Machine Learning, Spatio-Temporal, Online Learning, RT-AQF, Environmental Monitoring
We introduce a novel method for the continuous online prediction of airborne particulate matter (specifically PM10 and PM2.5) from sparse sensor information. A nonparametric model is developed using Gaussian Processes, which removes the need to formulate explicitly the internal, and usually very complex, dependencies between meteorological variables. Instead, the model uses historical data to extrapolate pollutant values both spatially (to areas with no sensor coverage) and temporally (into the near future). Each prediction is accompanied by a variance that quantifies its uncertainty, allowing a probabilistic treatment of the results. A novel training methodology, Structural Cross-Validation, is presented, which preserves the spatio-temporal structure of the available data during hyperparameter optimization. Tests were conducted using a real-time feed from a sensor network covering an area of roughly 50 × 80 km, alongside comparisons with other techniques for air pollution prediction. The promising results motivated the development of a smartphone application and a website, which are currently in use to increase the efficiency of air quality monitoring and control in the area.
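As an illustration of how a Gaussian Process yields both a prediction and a variance at unobserved locations and times, the following is a minimal sketch of standard GP regression. It is not the model described in this paper: the squared-exponential kernel, the constant mean, the hyperparameter values, the function names (`rbf_kernel`, `gp_predict`) and the sensor readings are all assumptions chosen for the example, and the hyperparameters are fixed by hand rather than optimized.

```python
import numpy as np


def rbf_kernel(A, B, length_scales, signal_var):
    """Squared-exponential kernel with one length scale per input dimension."""
    A_s = A / length_scales
    B_s = B / length_scales
    sq_dist = (
        np.sum(A_s**2, axis=1)[:, None]
        + np.sum(B_s**2, axis=1)[None, :]
        - 2.0 * A_s @ B_s.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0))


def gp_predict(X_train, y_train, X_query, length_scales, signal_var, noise_var):
    """Return the GP posterior mean and latent variance at X_query."""
    K = rbf_kernel(X_train, X_train, length_scales, signal_var)
    K += noise_var * np.eye(len(X_train))  # sensor noise on the diagonal
    K_s = rbf_kernel(X_train, X_query, length_scales, signal_var)

    L = np.linalg.cholesky(K)
    y_mean = y_train.mean()  # assumed constant prior mean
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = y_mean + K_s.T @ alpha

    v = np.linalg.solve(L, K_s)
    var = signal_var - np.sum(v**2, axis=0)  # variance of the noise-free field
    return mean, np.maximum(var, 0.0)


# Hypothetical readings: columns are (x in km, y in km, time in hours).
X_train = np.array([
    [5.0, 10.0, 0.0],
    [40.0, 60.0, 0.0],
    [5.0, 10.0, 1.0],
    [40.0, 60.0, 1.0],
])
y_train = np.array([30.0, 50.0, 32.0, 49.0])  # made-up PM10 values

# First query: a sensor location at an observed time.
# Second query: a location with no sensor, one hour into the future.
X_query = np.array([
    [5.0, 10.0, 1.0],
    [25.0, 35.0, 2.0],
])

mean, var = gp_predict(
    X_train, y_train, X_query,
    length_scales=np.array([20.0, 20.0, 3.0]),  # assumed, not fitted
    signal_var=100.0,
    noise_var=1.0,
)
for m, s2 in zip(mean, var):
    print(f"predicted PM10: {m:.1f}, variance: {s2:.1f}")
```

In this sketch the variance at the second query point is larger than at the first, since it lies away from every sensor in both space and time; this is the uncertainty information that the probabilistic treatment relies on.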