Publications
2022
Huang, Jiajing; Wen, Jin; Yoon, Hyunsoo; Pradhan, Ojas; Wu, Teresa; O’Neill, Zheng; Candan, Kasim Selcuk
Real vs. simulated: Questions on the capability of simulated datasets on building fault detection for energy efficiency from a data-driven perspective Journal Article
In: Energy and Buildings, vol. 259, pp. 111872, 2022, ISSN: 0378-7788.
@article{HUANG2022111872,
title = {Real vs. simulated: Questions on the capability of simulated datasets on building fault detection for energy efficiency from a data-driven perspective},
author = {Jiajing Huang and Jin Wen and Hyunsoo Yoon and Ojas Pradhan and Teresa Wu and Zheng O'Neill and Kasim Selcuk Candan},
url = {https://www.sciencedirect.com/science/article/pii/S0378778822000433},
doi = {10.1016/j.enbuild.2022.111872},
issn = {0378-7788},
year = {2022},
date = {2022-01-01},
journal = {Energy and Buildings},
volume = {259},
pages = {111872},
abstract = {Literature on building Automatic Fault Detection and Diagnosis (AFDD) mainly focuses on simulated system data because of the high cost and difficulty of obtaining and analyzing real building data. However, the performance and scalability of data-driven AFDD approaches developed on simulated data, and how they compare to approaches developed on real building data, have not been validated. In this study, we conduct two sets of experiments to answer this question. We first evaluate data-driven fault detection strategies on real and simulated building data separately. We observe that fault detection performance is not affected by the choice of fault detection strategy, the size of the training data, or the number of cross-validation folds when the training and blind test data come from the same source, namely simulated or real building data. Next, we conduct a cross-dataset study, that is, we develop the model using simulated data and test it on real building data. The results indicate that a model trained on simulated data does not generalize to real building data for fault detection. A Kolmogorov-Smirnov test is conducted to confirm that statistical differences exist between the simulated and real building data and to identify a subset of features that are similar across the two datasets. Using this feature subset, cross-dataset experiments show improved fault detection for most fault cases. We conclude that even when a system produces simulated data with the same fault symptoms from a physical-analysis perspective, not all features from simulated datasets are beneficial for AFDD; only a subset of features contains valuable information from a machine learning perspective.},
keywords = {Building AFDD, Machine learning, Real, Similarity, Simulated},
pubstate = {published},
tppubtype = {article}
}
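As a rough illustration of the cross-dataset workflow described in this abstract, the sketch below uses a two-sample Kolmogorov-Smirnov test to keep only the features whose simulated and real distributions look alike, then trains on simulated data and evaluates on real data. The tables X_sim, y_sim, X_real, y_real, the random-forest detector, and the 0.05 threshold are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: KS-based feature screening before cross-dataset fault detection.
# X_sim/X_real are feature tables (same columns) from simulated and real building data;
# y_sim/y_real are fault labels. All names and thresholds are assumptions.
import pandas as pd
from scipy.stats import ks_2samp
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score


def ks_similar_features(X_sim: pd.DataFrame, X_real: pd.DataFrame, alpha: float = 0.05):
    """Keep columns whose simulated and real distributions are not significantly
    different according to the two-sample Kolmogorov-Smirnov test."""
    keep = []
    for col in X_sim.columns:
        _, p_value = ks_2samp(X_sim[col].to_numpy(), X_real[col].to_numpy())
        if p_value > alpha:  # fail to reject H0: the two samples look alike
            keep.append(col)
    return keep


def cross_dataset_fault_detection(X_sim, y_sim, X_real, y_real):
    """Train on simulated data, test on real data, using only the KS-similar subset."""
    subset = ks_similar_features(X_sim, X_real)
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_sim[subset], y_sim)
    return subset, f1_score(y_real, model.predict(X_real[subset]))
```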
2021
Zhang, Liang; Wen, Jin; Li, Yanfei; Chen, Jianli; Ye, Yunyang; Fu, Yangyang; Livingood, William
A review of machine learning in building load prediction Journal Article
In: Applied Energy, vol. 285, pp. 116452, 2021, ISSN: 0306-2619.
@article{ZHANG2021116452,
title = {A review of machine learning in building load prediction},
author = {Liang Zhang and Jin Wen and Yanfei Li and Jianli Chen and Yunyang Ye and Yangyang Fu and William Livingood},
url = {https://www.sciencedirect.com/science/article/pii/S0306261921000209},
doi = {10.1016/j.apenergy.2021.116452},
issn = {0306-2619},
year = {2021},
date = {2021-01-01},
journal = {Applied Energy},
volume = {285},
pages = {116452},
abstract = {The surge of machine learning and the increasing accessibility of building data provide great opportunities for applying machine learning to building energy system modeling and analysis. Building load prediction is one of the most critical components of many building control and analytics activities, as well as of grid-interactive and energy-efficient building operation. While a large number of research papers exist on machine-learning-based building load prediction, a comprehensive review from the perspective of machine learning is missing. In this paper, we review the application of machine learning techniques to building load prediction, organized according to the logic of machine learning itself: performing a task T, measured by a performance measure P, based on learning from experience E. First, we review the applications of building load prediction models (task T). Then, we review the modeling algorithms that improve machine learning performance and accuracy (performance P). We also review the literature from the data perspective (experience E), including data engineering from the sensor level to the data level, pre-processing, and feature extraction and selection. Finally, we conclude with a discussion of well-studied and relatively unexplored areas as a reference for future research, identify gaps in current machine learning applications, and predict future trends and developments.},
keywords = {Building energy forecasting, Building energy system, Building load prediction, Data engineering, Feature engineering, Machine learning},
pubstate = {published},
tppubtype = {article}
}
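To make the task/performance/experience framing of this review concrete, here is a minimal, hypothetical sketch of a building load prediction pipeline: the task T is short-term load prediction, the experience E is a historical table of load, outdoor temperature, and timestamps, and the performance P is CV(RMSE). Column names such as load_kW, oat_C, and timestamp, and the gradient-boosting model, are assumptions for illustration only, not choices made in the paper.

```python
# Hypothetical sketch of a building load prediction pipeline (task T / performance P /
# experience E). Column names and the model choice are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor


def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    """Basic time and weather features from an hourly, timestamped load series (experience E)."""
    out = pd.DataFrame(index=df.index)
    out["hour"] = df["timestamp"].dt.hour
    out["day_of_week"] = df["timestamp"].dt.dayofweek
    out["oat_C"] = df["oat_C"]                    # outdoor air temperature
    out["load_lag_1h"] = df["load_kW"].shift(1)   # autoregressive lags
    out["load_lag_24h"] = df["load_kW"].shift(24)
    return out.dropna()


def train_and_score(df: pd.DataFrame, test_fraction: float = 0.2):
    """Chronological split, gradient-boosted regressor (task T), CV(RMSE) score (performance P)."""
    X = engineer_features(df)
    y = df.loc[X.index, "load_kW"]
    split = int(len(X) * (1 - test_fraction))
    model = GradientBoostingRegressor().fit(X.iloc[:split], y.iloc[:split])
    pred = model.predict(X.iloc[split:])
    cv_rmse = np.sqrt(np.mean((pred - y.iloc[split:]) ** 2)) / y.iloc[split:].mean()
    return model, cv_rmse
```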