Exploring improvements to the time series forecasting system
The transition from traditional models to LSTM and the resulting improvements
During the development of our forecasting system, we initially tried traditional regression methods and simple machine learning algorithms (such as linear regression and random forests) to predict abortion rates. These models performed poorly on the nonlinear relationships and complex temporal dependencies in the data.
In particular, for inputs with time series characteristics, such as climate data and Google Trends data, the traditional models had difficulty capturing both long-term trends and short-term fluctuations. In addition, because real-world data is often noisy, these simple models proved highly sensitive to noise, and their predictions were not stable enough.
Therefore, we decided to switch to a long short-term memory (LSTM) network. LSTM models are designed for sequence data and can capture both short-term and long-term dependencies in a time series, which makes them well suited to complex forecasting tasks.
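To illustrate why an LSTM can hold on to information over many time steps, here is a minimal sketch of the standard LSTM cell update, reduced to a single input feature and a single hidden unit. It is not the model used in this project: the function name `lstm_step`, the `params` layout, and the hand-picked weights are all assumptions made for illustration, and a trained network would learn its weights from data.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (one input feature, one hidden unit)."""

    # Every gate sees the current input and the previous hidden state.
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much new information to write
    g = gate("candidate", math.tanh)   # the new information itself
    o = gate("output", sigmoid)        # how much of the cell state to expose

    c = f * c_prev + i * g             # cell state: the long-term memory
    h = o * math.tanh(c)               # hidden state: the per-step output
    return h, c


# Hypothetical, hand-picked weights as (w_x, w_h, bias); not learned values.
params = {
    "forget": (0.0, 0.0, 4.0),      # large bias keeps the forget gate near 1
    "input": (1.0, 0.0, -2.0),      # opens mainly when the input is large
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 4.0),
}

# One early spike followed by 23 quiet steps (24 steps in total).
sequence = [3.0] + [0.0] * 23

h, c = 0.0, 0.0
for x in sequence:
    h, c = lstm_step(x, h, c, params)

print(round(c, 3), round(h, 3))
```

With these weights the forget gate stays close to 1, so the spike written into the cell state at the first step is still clearly present 23 steps later. This is the mechanism that lets an LSTM relate an observation to events many months earlier.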
Compared with the models we tried before, the LSTM can identify short-term patterns within a small time window and also draw on longer spans of data to learn the influence of historical trends, which lets it predict future abortion rates more accurately. The LSTM is also more robust to noisy data and maintains good prediction accuracy even when data quality is low.
Finally, we applied the LSTM model to the preprocessed training data (January 2004 to December 2021), using a sliding window approach to cut the series into 24-month sequences, and this significantly improved forecasting performance. On the forecast set from January 2022 to December 2023, the LSTM model demonstrated good accuracy and captured the potential impact of climate change and Google Trends data on abortion rates.
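The sliding window step can be sketched as follows. This is a rough illustration under stated assumptions rather than the project's actual preprocessing code: the date ranges and the 24-month window come from the text above, while the helper names `month_range` and `make_windows`, the one-month-ahead target, the single-feature series, and the placeholder values are all assumptions.

```python
from datetime import date


def month_range(start, end):
    """Return the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def make_windows(values, window=24):
    """Pair each run of `window` consecutive values with the value after it."""
    inputs, targets = [], []
    for i in range(len(values) - window):
        inputs.append(values[i:i + window])
        targets.append(values[i + window])
    return inputs, targets


# Date ranges are taken from the text; the series itself is a placeholder.
months = month_range(date(2004, 1, 1), date(2023, 12, 1))
series = [float(i) for i in range(len(months))]  # stand-in values, not real data

# Split chronologically so no forecast-period value leaks into training.
split = months.index(date(2022, 1, 1))  # first month of the forecast set
train_values = series[:split]

X_train, y_train = make_windows(train_values, window=24)

# Assumption: each forecast month is predicted from the 24 months before it,
# so the test windows reach back into the end of the training period.
X_test, y_test = make_windows(series[split - 24:], window=24)

print(len(months), len(X_train), len(X_test))
```

In a real pipeline each window would hold one row per month with several features (for example, climate and Google Trends variables) rather than a single value, and the windows would be converted to arrays before being passed to the LSTM.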