Dangut, Maren David
Skaf, Zakwan
Jennions, Ian K.

2022-05-23
2022-05-23
2022-05-14

Dangut MD, Skaf Z, Jennions IK. (2022) Handling imbalanced data for aircraft predictive maintenance using the BACHE algorithm, Applied Soft Computing, Volume 123, July 2022, Article number 108924
1568-4946
https://doi.org/10.1016/j.asoc.2022.108924
https://dspace.lib.cranfield.ac.uk/handle/1826/17946

Developing a prognostic model to predict an asset's health condition is a maintenance strategy that increases asset availability and reliability through better maintenance scheduling. Developing reliable vehicle health predictive models is therefore vital in the aerospace industry, especially for a safety-critical system such as an aircraft. However, one of the significant challenges in building reliable data-driven prognostic models is dataset imbalance. Training machine-learning models on an imbalanced dataset biases classifiers towards the majority class, resulting in poor predictive accuracy on the minority class. The problem becomes more challenging when the imbalance ratio is extreme and the classes overlap. In this paper, a novel approach called the Balanced Calibrated Hybrid Ensemble Technique (BACHE) is developed to tackle the severely imbalanced classification problem. The proposed method combines hybrid data sampling with ensemble-based learning. It uses a cascading balanced approach to decompose the original class imbalance problem into a set of subproblems, each characterized by a reduced imbalance ratio. It then uses calibrated boosting with a cost-sensitive decision tree to enhance recognition of hard-to-learn patterns, which improves prediction of the extreme minority class. BACHE is evaluated using a real-world aircraft dataset with rare component replacement instances, and a comparative experiment against other similar existing methods is conducted. The performance metrics used are precision, recall, G-mean, and area under the curve. The final results show that the proposed model outperforms other similar methods and that it can attain excellent performance on large, extremely imbalanced datasets.

en
Attribution-NonCommercial-NoDerivatives 4.0 International
http://creativecommons.org/licenses/by-nc-nd/4.0/

Prognostic
Imbalanced learning
Ensemble learning
Predictive maintenance
Aerospace

Handling imbalanced data for aircraft predictive maintenance using the BACHE algorithm
Article
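The abstract lists G-mean among its evaluation metrics and notes that imbalanced training biases classifiers towards the majority class. As a minimal illustration of why G-mean is informative in that setting, the sketch below computes it in its common binary form, the geometric mean of sensitivity and specificity. The function name `g_mean`, the toy labels, and the choice of this particular G-mean variant are assumptions made for illustration; the record does not state which definition the paper uses, and none of this is the authors' code or data.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Hypothetical helper for illustration; assumes the common binary
    definition sqrt(sensitivity * specificity).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy labels (made up): 8 majority samples (0) and 2 minority samples (1).
y_true = [0] * 8 + [1] * 2

# A classifier that always predicts the majority class.
y_all_majority = [0] * 10

# A classifier that finds one of the two minority samples
# at the cost of one false alarm.
y_model = [0] * 7 + [1] + [1, 0]

accuracy_majority = sum(t == p for t, p in zip(y_true, y_all_majority)) / len(y_true)
gmean_majority = g_mean(y_true, y_all_majority)
gmean_model = g_mean(y_true, y_model)
```

On these toy labels the always-majority classifier reaches 80% accuracy yet scores a G-mean of zero, because it never detects the minority class, while the second classifier scores a non-zero G-mean. This is the kind of behaviour that makes G-mean a reasonable companion to precision and recall on rare-event data.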