Abstract: To compare the characteristics and applicability of mathematical-statistical and machine learning models in medium- and long-term runoff forecasting, stepwise regression and random forest were selected to build medium- and long-term forecasting models. Guided by the physical mechanisms of the meteorological factors, simple correlation coefficients and random forest importance analysis were combined to select key meteorological factors as model inputs. Runoff at the Wudongde and Pubugou reservoirs in the upper Yangtze River basin was simulated for 1959 to 1998 and predicted for 1999 to 2014. The results show that both models simulate runoff well overall and are stable. Random forest predicts more accurately than stepwise regression, although the difference is small. Random forest reduces the fitting error caused by abnormal changes in predictor values, but its overfitting problem is more pronounced. The method has application value for formulating operation schemes in the upper Yangtze River Basin, and the results provide a reference for understanding how mathematical-statistical and machine learning models behave in medium- and long-term forecasting.
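The correlation-based part of the factor screening can be illustrated with a minimal sketch. It ranks candidate predictors by the absolute value of their Pearson correlation with runoff over the 1959 to 1998 simulation period and holds back 1999 to 2014 for prediction. The data are synthetic, and the names `factor_a`, `factor_b`, `factor_c`, `pearson_r`, and `rank_factors` are hypothetical, introduced only for illustration. The random forest importance step and the two forecasting models are not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_factors(factors, runoff):
    """Return (name, r) pairs sorted by |r| with runoff, strongest first."""
    scores = {name: pearson_r(series, runoff) for name, series in factors.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Synthetic, illustrative series only (NOT the Wudongde or Pubugou records).
years = list(range(1959, 2015))
calib = [i for i, y in enumerate(years) if y <= 1998]  # simulation period
valid = [i for i, y in enumerate(years) if y >= 1999]  # prediction period

factor_a = [math.sin(0.7 * i) for i in range(len(years))]
factor_b = [math.cos(1.3 * i) for i in range(len(years))]
factor_c = [((i * 37) % 11) / 10.0 for i in range(len(years))]

# Assumed toy relationship: runoff depends strongly on factor_a, weakly on
# factor_b, and not at all on factor_c.
runoff = [100.0 + 20.0 * a - 5.0 * b for a, b in zip(factor_a, factor_b)]

factors = {"factor_a": factor_a, "factor_b": factor_b, "factor_c": factor_c}

# Screen on the simulation period only, so the prediction years stay unseen.
calib_factors = {k: [v[i] for i in calib] for k, v in factors.items()}
calib_runoff = [runoff[i] for i in calib]

ranking = rank_factors(calib_factors, calib_runoff)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Restricting the screening to the simulation years keeps the 1999 to 2014 period as an independent check. In practice the ranking would be combined with physical reasoning and an importance measure before fixing the model inputs.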