Machine learning-based regression analysis of the winter temperature anomalies in China and associated influencing factors
-
-
Abstract
In this study, mean winter temperatures recorded at 160 stations in China from 1951 to 2021, together with a number of atmospheric circulation and sea temperature indices, are used to investigate the relationship between the distribution of winter temperature anomalies and atmospheric circulation and external forcing factors. A fitting model is also established using machine learning methods, which allows us to assess to what extent the screened combination of influencing factors can explain the distribution of winter temperature anomalies in China. The Least Absolute Shrinkage and Selection Operator (LASSO) algorithm is used to extract the influencing factors related to winter temperature anomalies. In addition, to reflect the nonlinear relationships among these factors, the original features are augmented to polynomial features using Taylor's formula. To further study the nonlinear relationships among the selected factors, the Least Squares Gradient Boosting Decision Tree (LS-GBDT) algorithm is used to estimate and fit winter temperature. Experiments are conducted on the training samples and the test samples separately, and good results are achieved on both. The results verify that machine learning can be used to screen and analyze the importance of factors affecting winter temperature anomalies more reasonably, and that the established estimation model can, to a certain extent, reflect the nonlinear relationship between the factors influencing the climate system and winter temperature. This work provides a new way to simulate and estimate the distribution of winter temperature anomalies in China.
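The three-step workflow summarized above (polynomial augmentation, LASSO screening, least-squares gradient boosting) can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this study. Everything specific in it is an assumption made for illustration: the synthetic data stand in for the station temperatures and climate indices, the expansion is truncated at degree 2, the penalty `lam`, the learning rate and the number of boosting rounds are arbitrary, and depth-1 trees (stumps) replace full decision trees. All function names are hypothetical.

```python
import random

def poly2(row):
    """Append all degree-2 products (squares and cross terms) to a feature row."""
    n = len(row)
    return list(row) + [row[i] * row[j] for i in range(n) for j in range(i, n)]

def standardize(cols):
    """Centre each column and scale it to unit variance."""
    out = []
    for c in cols:
        m = sum(c) / len(c)
        s = (sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
        out.append([(v - m) / s for v in c])
    return out

def lasso_cd(cols, y, lam, sweeps=100):
    """Minimise (1/2n)||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.
    Assumes standardized columns and a centred target."""
    n, w, resid = len(y), [0.0] * len(cols), list(y)
    for _ in range(sweeps):
        for j, c in enumerate(cols):
            rho = sum(ci * ri for ci, ri in zip(c, resid)) / n + w[j]
            # Soft-thresholding: small correlations are set exactly to zero.
            new = (abs(rho) - lam) * (1 if rho > 0 else -1) if abs(rho) > lam else 0.0
            delta = new - w[j]
            if delta:
                resid = [ri - delta * ci for ri, ci in zip(resid, c)]
                w[j] = new
    return w

def fit_stump(X, r):
    """Best single-split regression stump for residuals r under squared error."""
    best, n, total = None, len(X), sum(r)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left = 0.0
        for k in range(1, n):
            left += r[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue
            # Maximising this quantity is equivalent to minimising the split SSE.
            gain = left ** 2 / k + (total - left) ** 2 / (n - k)
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left / k, (total - left) / (n - k))
    return best[1:]

def ls_boost(X, y, rounds=200, lr=0.1):
    """Least-squares gradient boosting: each stump fits the current residuals."""
    f0 = sum(y) / len(y)
    pred, stumps = [f0] * len(y), []
    for _ in range(rounds):
        resid = [t - p for t, p in zip(y, pred)]  # negative gradient of squared loss
        j, thr, lv, rv = fit_stump(X, resid)
        stumps.append((j, thr, lv, rv))
        pred = [p + lr * (lv if x[j] <= thr else rv) for p, x in zip(pred, X)]
    return f0, stumps, lr

def predict(model, x):
    f0, stumps, lr = model
    return f0 + sum(lr * (lv if x[j] <= thr else rv) for j, thr, lv, rv in stumps)

# Synthetic stand-in data (assumption): 4 raw "indices", 200 "winters".
random.seed(0)
raw = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [2 * r[0] - 1.5 * r[1] * r[2] + random.gauss(0, 0.1) for r in raw]
X = [poly2(r) for r in raw]
X_tr, y_tr, X_te, y_te = X[:150], y[:150], X[150:], y[150:]

# Step 1-2: screen the augmented features with LASSO on the training samples.
cols = standardize([list(c) for c in zip(*X_tr)])
y_mean = sum(y_tr) / len(y_tr)
w = lasso_cd(cols, [v - y_mean for v in y_tr], lam=0.2)
selected = [j for j, wj in enumerate(w) if wj != 0.0]

# Step 3: fit the boosted model on the selected features only.
keep = lambda rows: [[r[j] for j in selected] for r in rows]
model = ls_boost(keep(X_tr), y_tr)
mse = lambda rows, ys: sum((predict(model, x) - t) ** 2 for x, t in zip(rows, ys)) / len(ys)
train_mse, test_mse = mse(keep(X_tr), y_tr), mse(keep(X_te), y_te)
print(selected, round(train_mse, 3), round(test_mse, 3))
```

In this toy setting the target depends on one raw feature and one cross term, so the sketch shows why augmentation matters: the cross term only becomes available to LASSO after the polynomial expansion. Screening is done on the training samples alone so that the held-out error is not influenced by the test samples.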
-
-