Abstract
Since its inception in 2009, Bitcoin has been mired in controversy for providing a haven for illegal activities. Several types of illicit users hide behind the blanket of anonymity, and uncovering these entities is key for forensic investigations. Current methods use machine learning to identify illicit entities; however, existing approaches focus on only a limited set of illicit user categories. This paper addresses the issue by implementing an ensemble of decision trees for supervised learning. Its larger number of parameters allows the ensemble model to learn discriminating features that separate multiple groups of illicit users from licit users. To evaluate the model, a dataset of 1216 real-life entities on Bitcoin was extracted from the blockchain. Nine features were engineered to train the model to distinguish 16 different licit-illicit categories of users. The proposed model provides a reliable tool for forensic study. An empirical evaluation against three existing benchmark models was performed to highlight its efficacy. Experiments showed that the specificity and sensitivity of the proposed model were comparable to those of the other models. Owing to the larger number of parameters of the ensemble tree model, the classification accuracy was 0.91 (95% CI: 0.8727, 0.9477). This was better than SVM and Logistic Regression, the two popular models in the literature, and comparable to the Random Forest and XGBoost models. CPU and RAM utilization were also monitored to assess the suitability of the proposed work for real-world deployment. RAM utilization for the proposed model was 30-45% higher than for the other three models. Hence, the proposed model is resource-intensive because it has more parameters than the other three models; the larger number of parameters also results in more accurate predictions.
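The evaluation quantities named in the abstract (sensitivity, specificity, and accuracy with a 95% confidence interval) can be illustrated with a minimal sketch. It rests on several assumptions: the problem is collapsed to a binary licit/illicit split rather than the 16 categories used in the paper, the confusion-matrix counts are invented placeholders rather than the paper's data, and the Wilson score interval is swapped in as one common choice, since the abstract does not state which interval method was used. The function names `binary_metrics` and `wilson_interval` are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (true-positive rate), specificity (true-negative rate)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented placeholder counts (NOT taken from the paper): "positive" = illicit.
counts = {"tp": 40, "fp": 6, "tn": 50, "fn": 4}

metrics = binary_metrics(**counts)
n_correct = counts["tp"] + counts["tn"]
n_total = sum(counts.values())
ci_low, ci_high = wilson_interval(n_correct, n_total)

print(metrics)
print(f"95% CI for accuracy: ({ci_low:.4f}, {ci_high:.4f})")
```

With a multi-class problem such as the one described above, sensitivity and specificity would typically be computed per class (one class versus the rest) and the interval would be taken over the overall fraction of correctly classified entities.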
Notes
In this paper, Bitcoin refers to the system, and bitcoin or BTC refers to the digital currency.
Cite this article
Nerurkar, P., Bhirud, S., Patel, D. et al. Supervised learning model for identifying illegal activities in Bitcoin. Appl Intell 51, 3824–3843 (2021). https://doi.org/10.1007/s10489-020-02048-w
DOI: https://doi.org/10.1007/s10489-020-02048-w