
Max-flow Min-cut Algorithm in Spark with Application to Road Networks

  • Conference paper

Abstract

The max-flow min-cut problem is one of the most explored and studied problems in the area of combinatorial algorithms and optimization. In this paper, we solve the max-flow min-cut problem on large random graphs with a log-normal distribution of out-degrees using the distributed Edmonds-Karp algorithm. The algorithm is implemented on a cluster using Spark. We compare the runtime of a single-machine implementation with that of a cluster implementation and analyze the impact of communication cost on runtime. In our experiments, we observe that the values recorded in practice across various graphs are much lower than the theoretical estimates, primarily because of the smaller diameter of the graphs. Additionally, we extend this model theoretically to a large urban road network to evaluate the minimum number of sensors required for surveillance of the entire network. To validate the feasibility of this theoretical extension, we tested the model on a large log-normal graph with \(\sim\)1.1 million edges and obtained a max-flow value of 54, which implies that the minimum-cut set of the graph consists of 54 edges. This is a reasonably small set of edges on which to place sensors, compared with the total number of edges. We believe that our approach can enhance the safety of road networks throughout the world.
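As an illustration of the relationship the abstract relies on, the following is a minimal single-machine sketch of Edmonds-Karp in Python. It is not the authors' distributed Spark implementation; the function name `edmonds_karp`, the edge-tuple input format, and the toy graph are all hypothetical. The sketch assumes unit edge capacities in the example, which is the setting in which the max-flow value equals the number of edges in the minimum cut.

```python
from collections import defaultdict, deque


def edmonds_karp(edges, source, sink):
    """Return (max_flow_value, min_cut_edges) for a directed graph.

    edges: list of (u, v, capacity) tuples (hypothetical input format).
    """
    if source == sink:
        raise ValueError("source and sink must differ")

    # Residual capacities; reverse edges start at 0.
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, cap in edges:
        residual[u][v] += cap
        residual[v][u] += 0

    def bfs_parents():
        # Breadth-first search over edges with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs_parents()
        if sink not in parent:
            break  # no augmenting path left

        # Find the bottleneck capacity along the shortest augmenting path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u

        # Push flow along the path and update the residual graph.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Vertices still reachable from the source define one side of a minimum cut.
    reachable = set(parent)
    cut = [(u, v) for u, v, cap in edges
           if cap > 0 and u in reachable and v not in reachable]
    return flow, cut


# Toy graph with unit capacities: two edge-disjoint paths from "s" to "t".
toy_edges = [("s", "a", 1), ("s", "b", 1), ("a", "t", 1), ("b", "t", 1), ("a", "b", 1)]
flow_value, cut_edges = edmonds_karp(toy_edges, "s", "t")
print(flow_value, sorted(cut_edges))  # 2 [('s', 'a'), ('s', 'b')]
```

On the toy graph the flow value is 2 and the cut contains 2 edges, mirroring the abstract's reading of a max-flow value of 54 as a 54-edge minimum cut. In the sensor-placement interpretation, the cut edges are the candidate road segments to monitor.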

V. Ramesh and S. Nagarajan contributed equally.



Author information


Correspondence to Varun Ramesh.


Copyright information

© 2017 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Ramesh, V., Nagarajan, S., Mukherjee, S. (2017). Max-flow Min-cut Algorithm in Spark with Application to Road Networks. In: Jung, J., Kim, P. (eds) Big Data Technologies and Applications. BDTA 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 194. Springer, Cham. https://doi.org/10.1007/978-3-319-58967-1_2


  • DOI: https://doi.org/10.1007/978-3-319-58967-1_2


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58966-4

  • Online ISBN: 978-3-319-58967-1

  • eBook Packages: Computer Science (R0)
