Abstract
Anomaly detection is a longstanding and active research area that has many applications in domains such as finance, security and manufacturing. However, the efficiency and performance of anomaly detection algorithms are challenged by the large-scale, high-dimensional and heterogeneous data that are prevalent in the era of big data. Isolation-based unsupervised anomaly detection is a novel and effective approach for identifying anomalies in data. It relies on the idea that anomalies are few and different from normal instances, and thus can be easily isolated by random partitioning. Isolation-based methods have several advantages over existing methods, such as low computational complexity, low memory usage, high scalability, robustness to noise and irrelevant features, and no need for prior knowledge or heavy parameter tuning. In this survey, we review the state-of-the-art isolation-based anomaly detection methods, including their data partitioning strategies, anomaly score functions, and algorithmic details. We also discuss some extensions and applications of isolation-based methods in different scenarios, such as detecting anomalies in streaming data, time series, trajectory and image datasets. Finally, we identify some open challenges and future directions for isolation-based anomaly detection research.
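The isolation idea stated above, that instances which are few and different are separated by fewer random partitions, can be illustrated with a minimal sketch. It is not the implementation of any method reviewed in the survey: the function names `isolation_depth` and `mean_isolation_depth`, the toy data, and the parameter values are assumptions made for illustration, and the sketch omits the height limit and score normalisation used by the original Isolation Forest.

```python
import random


def isolation_depth(point, data, rng, depth=0, max_depth=50):
    """Count random axis-parallel splits until `point` is alone in its partition."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))           # pick a random attribute
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:                              # no further split is possible
        return depth
    split = rng.uniform(lo, hi)               # pick a random split value
    # Keep only the side of the split that contains the query point.
    if point[dim] < split:
        side = [row for row in data if row[dim] < split]
    else:
        side = [row for row in data if row[dim] >= split]
    return isolation_depth(point, side, rng, depth + 1, max_depth)


def mean_isolation_depth(point, data, n_trees=100, sample_size=64, seed=0):
    """Average the isolation depth over many random subsamples (one per 'tree')."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        total += isolation_depth(point, sample, rng)
    return total / n_trees


# Toy data (assumed for illustration): one dense cluster plus one distant point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((8.0, 8.0))

outlier_depth = mean_isolation_depth((8.0, 8.0), data)
inlier_depth = mean_isolation_depth((0.0, 0.0), data)
print("mean depth, distant point:", outlier_depth)
print("mean depth, central point:", inlier_depth)
```

A shorter mean depth indicates a point that is easier to isolate and is therefore more likely to be an anomaly. Each tree works on a small subsample, which is what keeps the cost and memory use of this family of methods low.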
Acknowledgements
We appreciate the suggestions from Shuaibin Song, Zijing Wang and Zongyou Liu from Nanjing University, China, and Dr Xuyun Zhang from Macquarie University, Australia. Kai Ming Ting is supported by the National Natural Science Foundation of China (No. 62076120). This project is supported by the State Key Laboratory for Novel Software Technology at Nanjing University, China (No. KFKT2024A01). Open Access funding enabled and organized by CAUL and its Member Institutions.
Ethics declarations
The authors declare that they have no conflicts of interest related to this work.
Additional information
Colored figures are available in the online version at https://link.springer.com/journal/11633
Yang Cao received the B. Sc. degree in information technology from Monash University, Australia in 2020, the M. Sc. degree in data science and the Ph. D. degree in artificial intelligence from Deakin University, Australia in 2021 and 2025, respectively. He is currently a postdoctoral researcher at Great Bay University, China.
His research interests include clustering analysis, anomaly detection and their application in renewable energy.
Haolong Xiang received the Ph. D. degree in artificial intelligence in the School of Computing, Macquarie University, Australia in 2024. He is currently working in the School of Software of Nanjing University of Information Science and Technology, China.
His research interests include anomaly detection, data mining and machine learning.
Hang Zhang received the B. Sc. degree in computer science from Tongji University, China in 2020. He is currently a Ph. D. degree candidate in the School of Artificial Intelligence, Nanjing University, China, and a member of the Learning and Mining from Data (LAMDA) Group, China.
His research interests include machine learning, data mining and topological data analysis.
Ye Zhu received the Ph. D. degree in artificial intelligence with a Mollie Holman Medal for the best doctoral thesis of the year from Monash University, Australia in 2017. He is a senior lecturer at the School of Information Technology, Deakin University, Australia. He has published more than 50 papers in AI-related top international conferences or journals, including SIGKDD, AAAI, IJCAI, VLDB, AIJ, TKDE, PRJ, JAIR, ISJ and MLJ. He is on the program committee of SIGKDD, AAAI, IJCAI, PAKDD and ADMA. He has also secured several large research grants for multi-disciplinary research. He is an IEEE Senior Member.
His research interests include clustering analysis, anomaly detection, and their applications for pattern recognition and information retrieval.
Kai Ming Ting received the Ph. D. degree in computer science from the University of Sydney, Australia in 1996, and has worked at various universities in Australia and New Zealand, including the University of Waikato, Deakin University, Monash University, and Federation University. He has been a professor at Nanjing University, China since 2020. He is the principal driver of isolation-based methods, and a key originator of isolation forest, isolation kernel, isolation distributional kernel and mass-based similarity. He has received research grants from the National Natural Science Foundation of China, the Australian Research Council, the US Air Force Office of Scientific Research (AFOSR/AOARD), Toyota InfoTechnology Center, and the Australian Institute of Sport.
His research interests include data-dependent similarity measures, anomaly detection, ensemble approaches, data mining, and machine learning.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Cao, Y., Xiang, H., Zhang, H. et al. Anomaly Detection Based on Isolation Mechanisms: A Survey. Mach. Intell. Res. 22, 849–865 (2025). https://doi.org/10.1007/s11633-025-1554-4
DOI: https://doi.org/10.1007/s11633-025-1554-4