Abstract
As an emerging mobility-on-demand service, the bike-sharing system (BSS) has spread worldwide by offering citizens a flexible, cost-efficient, and environmentally friendly mode of transportation. Demand-supply imbalance is one of the main challenges in BSS, largely because existing repositioning strategies reallocate bikes on a pre-defined periodic schedule without accounting for highly dynamic user demand. Although reinforcement learning has been applied to repositioning problems to mitigate demand-supply imbalance, extending it to BSS faces significant barriers, chiefly the curse of dimensionality in the action space caused by the varying number of workers and bikes across the city. In this paper, we study these barriers and address them with a novel bike repositioning system, BikeBrain, which consists of a demand prediction model and a spatio-temporal bike repositioning algorithm. Specifically, to obtain accurate, real-time usage demand for efficient bike repositioning, we first present a prediction model, ST-NetPre, which directly predicts user demand while capturing highly dynamic spatio-temporal characteristics. We then propose a spatio-temporal cooperative multi-agent reinforcement learning method (ST-CBR) that learns a worker-based repositioning strategy in which each worker in the BSS is treated as an agent. In particular, ST-CBR adopts centralized learning with decentralized execution to achieve effective cooperation among large numbers of dynamic agents based on Mean Field Reinforcement Learning (MFRL), while avoiding a prohibitively large action space. To handle the dynamic action space, ST-CBR uses a SoftMax selector to choose specific actions. Meanwhile, to balance the benefits and costs of agents' operations, an efficient reward function is designed to seek an optimal control policy that accounts for both immediate and future rewards.
Extensive experiments conducted on large-scale real-world datasets show significant improvements of our proposed method over several state-of-the-art baselines in terms of both demand-supply gap and operation cost.
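To make the mean-field idea concrete, the following is a minimal, illustrative sketch (not the paper's implementation) of an MFRL-style tabular update combined with a SoftMax (Boltzmann) action selector: each agent's Q-value is conditioned on the mean action of its neighbours rather than the full joint action, which sidesteps the exponential growth of the joint action space. All names, the action set, and the hyperparameters are assumptions for illustration only.

```python
import numpy as np

N_ACTIONS = 4  # illustrative: e.g. stay, or reposition bikes toward one of three regions

def softmax(x, temperature=1.0):
    """Boltzmann (softmax) distribution over action values."""
    z = np.asarray(x, dtype=float) / temperature
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def mean_field_q_update(q, s, a, r, s_next, m, m_next,
                        alpha=0.1, gamma=0.95, temperature=0.5):
    """One MFRL-style tabular update for a single agent.

    q      : dict mapping (state, action, mean_action) -> value
    m      : mean action of the agent's neighbours at state s
    m_next : mean action of the neighbours at the next state
    """
    # Soft (Boltzmann-weighted) value of the next state, as in mean-field Q-learning.
    next_qs = np.array([q.get((s_next, b, m_next), 0.0) for b in range(N_ACTIONS)])
    v_next = float(softmax(next_qs, temperature) @ next_qs)
    key = (s, a, m)
    old = q.get(key, 0.0)
    q[key] = old + alpha * (r + gamma * v_next - old)

# Usage: sample an action from the SoftMax selector, then apply one update.
q = {}
probs = softmax([1.0, 2.0, 0.5, 0.0], temperature=0.5)
a = int(np.random.default_rng(0).choice(N_ACTIONS, p=probs))
mean_field_q_update(q, s=0, a=a, r=1.0, s_next=1, m=1, m_next=1)
```

The key design point is that the dimensionality of the Q-function is independent of the number of agents: each agent sees only its own state, its own action, and a neighbourhood mean action, so the method scales to a dynamic agent population.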