Machine learning algorithms for forecasting and backcasting blood demand data with missing values and outliers: A study of Tema General Hospital of Ghana
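This study compares six machine learning models for forecasting and backcasting blood demand data from a hospital in Ghana. It finds a forecasting error of 12.55% for KNN and a backcasting error of 19.36% for ELM, and offers a reference for handling time-series data with missing values and outliers.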
The major challenge in managing blood products lies in the uncertainty of blood demand and supply, with a trade-off between shortage and wastage, especially in most developing countries. Reliable demand predictions are therefore essential for planning voluntary blood donation campaigns and improving blood availability in Ghanaian hospitals. However, most historical datasets on blood demand in Ghana are contaminated with missing values and outliers due to improper database management systems. Consequently, time-series prediction can be challenging, since data cleaning can affect models’ predictive power. In addition, the predictive power of machine learning (ML) models for backcasting lost data from past years is understudied compared with their forecasting abilities. This study therefore compares K-Nearest Neighbour regression (KNN), Generalised Regression Neural Network (GRNN), Neural Network Auto-regressive (NNAR), Multi-Layer Perceptron (MLP), Extreme Learning Machine (ELM) and Long Short-Term Memory (LSTM) models, via a rolling-origin strategy, for forecasting and backcasting blood demand data with missing values and outliers from a government hospital in Ghana. KNN performed well in forecasting blood demand (12.55% error), whereas ELM achieved the highest backcasting power (19.36% error). Future studies can also employ ML algorithms as a good alternative for backcasting past values of time-series data that are time-reversible.
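To make the rolling-origin comparison more concrete, the sketch below shows one way a KNN regression on lagged values could be evaluated for both forecasting and backcasting. It is an illustration under stated assumptions, not the study's implementation: the function names, the lag length, the number of neighbours, the expanding training window and the use of mean absolute percentage error (MAPE) as the error measure are all assumptions, since the abstract does not specify them. Backcasting is approximated here by reversing the series and forecasting the reversed sequence, which is only reasonable for time-reversible data.

```python
def knn_forecast(history, lags=3, k=2):
    """One-step-ahead KNN regression forecast on lagged windows (hypothetical helper)."""
    # Build (window, next value) training pairs from the history.
    pairs = [(history[i:i + lags], history[i + lags])
             for i in range(len(history) - lags)]
    query = history[-lags:]
    # Rank training windows by squared Euclidean distance to the latest window.
    pairs.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)))
    nearest = pairs[:k]
    # The forecast is the mean of the values that followed the nearest windows.
    return sum(target for _, target in nearest) / len(nearest)


def rolling_origin_mape(series, min_train=6, lags=3, k=2):
    """MAPE (in percent) over an expanding rolling origin.

    Assumes min_train > lags and strictly positive demand values,
    so that training pairs exist and the percentage error is defined.
    """
    errors = []
    for origin in range(min_train, len(series)):
        # Train on everything before the origin, predict the value at the origin.
        prediction = knn_forecast(series[:origin], lags, k)
        actual = series[origin]
        errors.append(abs(actual - prediction) / abs(actual))
    return 100.0 * sum(errors) / len(errors)


def rolling_origin_backcast_mape(series, min_train=6, lags=3, k=2):
    """Backcasting error: reverse the series, then forecast the reversed series."""
    return rolling_origin_mape(series[::-1], min_train, lags, k)


if __name__ == "__main__":
    # Made-up monthly demand values, for illustration only (not the hospital data).
    demand = [120.0, 135.0, 128.0, 140.0, 150.0, 132.0,
              125.0, 138.0, 130.0, 145.0, 152.0, 136.0]
    print("forecast MAPE (%):", rolling_origin_mape(demand))
    print("backcast MAPE (%):", rolling_origin_backcast_mape(demand))
```

The same two evaluation functions could wrap any of the other models by swapping out `knn_forecast`, which is what makes a like-for-like comparison of forecasting and backcasting error possible. Missing values and outliers would need to be handled before this step; the sketch assumes a complete, cleaned series.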