House price prediction: A data-centric aspect approach on performance of combined principal component analysis with deep neural network model

Mostofi F., Toğan V., Başağa H. B.

Journal of Construction Engineering, Management & Innovation, vol.4, no.2, pp.106-116, 2021 (Peer-Reviewed Journal)


High dimensionality and skewness are two intrinsic characteristics of real estate datasets that affect the price prediction accuracy of deep neural networks (DNNs). The objective of this study is to investigate the effect of skewness on the prediction accuracy of a model that combines principal component analysis (PCA) with a DNN (PCA-DNN). The research follows a threefold approach on a high-dimensional, positively skewed real estate price dataset. First, the data distribution is brought closer to normality using three conventional skewness reduction techniques: square root transformation (SRT), cube root transformation (CRT), and logarithmic transformation (LT). Second, the dimensionality of the original, SRT, CRT, and LT datasets is reduced using PCA. Third, the price prediction accuracy of the PCA-DNN model on datasets with different skewness levels is compared by observing their error values. The results suggest that the CRT method can considerably improve both the prediction accuracy and the computational time of the PCA-DNN model while displaying good generalization ability. In contrast to the CRT method, the SRT and LT methods resulted in high error values and overfitting issues, respectively.
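
As an illustration of the first step only, the following minimal sketch applies the three transformations named in the abstract (SRT, CRT, LT) to a positively skewed variable and reports the resulting skewness. It is not the authors' code or data: the `prices` values are synthetic log-normal draws chosen for illustration, and the `skewness` helper and the `transforms` dictionary are hypothetical names. Because the synthetic data are log-normal by construction, the logarithm removes the skew almost entirely here; this says nothing about which transformation works best on the study's dataset, where the CRT method gave the best PCA-DNN results.

```python
import math
import random


def skewness(values):
    """Moment-based skewness g1 = m3 / m2^1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Assumption: synthetic, strictly positive, right-skewed "prices".
# These are log-normal draws, not the dataset used in the study.
rng = random.Random(0)
prices = [rng.lognormvariate(12.0, 0.6) for _ in range(5000)]

# The three skewness reduction techniques from the abstract,
# plus the untransformed data for comparison.
transforms = {
    "original": lambda x: x,
    "SRT": math.sqrt,                  # square root transformation
    "CRT": lambda x: x ** (1.0 / 3.0),  # cube root transformation
    "LT": math.log,                    # logarithmic transformation
}

skew_by_method = {
    name: skewness([f(p) for p in prices]) for name, f in transforms.items()
}

for name, s in skew_by_method.items():
    print(f"{name:8s} skewness = {s:+.3f}")
```

On positive data, each successively stronger concave transformation (square root, cube root, logarithm) pulls the long right tail in further, so the measured skewness decreases in that order. In a pipeline like the one described, each transformed dataset would then be passed separately to PCA and to the DNN.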