Low-cost crossed probing path planning for network failure localization

World Wide Web

Abstract

Links inevitably fail in expanding networks, causing service interruptions that users perceive directly. Localizing link failures quickly and accurately is therefore essential, and route-aware active probing makes it possible. Because routing capacity is limited and probe traffic is costly, cross verification offers a lightweight probing scheme: it verifies reachability over distinct subsets of crossed paths to pinpoint the exact faulty links. To optimize the crossed path design quickly, we propose the pruning genetic algorithm (PGA), which builds a pruning module on top of a genetic algorithm. By eliminating redundant paths, PGA consistently produces high-quality solutions across various networks and avoids slow convergence in an exponentially large solution space. PGA also introduces extra repair operations to guarantee solution feasibility after crossover and mutation. Experimental results on real-world network topologies show that PGA reduces probing cost by 23.0% to 58.3% and forwarding cost by 23.0% to 45.3% compared to its counterparts, with running times of seconds or even milliseconds.
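The idea of cross verification can be illustrated with a small sketch. It is not the PGA algorithm from the paper, and it assumes a single failed link, a probing path modeled as a plain set of links, and a greedy pruning pass standing in for the pruning module. All names (`signatures`, `is_feasible`, `localize`, `prune`) and the three-link topology are hypothetical. Each link gets a signature, namely the set of probing paths that cross it; if every signature is non-empty and unique, the set of failed probes identifies the faulty link.

```python
from typing import Dict, FrozenSet, List, Optional, Sequence

Link = str                      # hypothetical link label such as "A-B"
Path = FrozenSet[Link]          # a probing path, modeled as the set of links it crosses


def signatures(links: Sequence[Link], paths: Sequence[Path]) -> Dict[Link, FrozenSet[int]]:
    """Map each link to the indices of the probing paths that traverse it."""
    return {
        link: frozenset(i for i, path in enumerate(paths) if link in path)
        for link in links
    }


def is_feasible(links: Sequence[Link], paths: Sequence[Path]) -> bool:
    """A path set can localize any single link failure if every link is
    covered by at least one path and no two links share a signature."""
    values = list(signatures(links, paths).values())
    return all(values) and len(set(values)) == len(values)


def localize(links: Sequence[Link], paths: Sequence[Path],
             failed_probes: FrozenSet[int]) -> Optional[Link]:
    """Return the link whose signature equals the set of failed probes,
    or None if no single link explains the observation."""
    matches = [l for l, s in signatures(links, paths).items() if s == failed_probes]
    return matches[0] if len(matches) == 1 else None


def prune(links: Sequence[Link], paths: Sequence[Path]) -> List[Path]:
    """Greedily drop paths whose removal keeps the path set feasible.
    This is an assumed stand-in for eliminating redundant paths."""
    kept = list(paths)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]
        if is_feasible(links, trial):
            kept = trial        # path i was redundant; re-check the same index
        else:
            i += 1              # path i is needed; move on
    return kept


# Hypothetical triangle topology with three links and three candidate paths.
links = ["A-B", "B-C", "A-C"]
candidate_paths = [
    frozenset({"A-B", "B-C"}),
    frozenset({"B-C", "A-C"}),
    frozenset({"A-B"}),
]

pruned = prune(links, candidate_paths)
faulty = localize(links, pruned, frozenset({0, 1}))
print(len(pruned), faulty)
```

In this toy case the third path adds no distinguishing power, so the greedy pass removes it, and a failure of both remaining probes points to the one link they share. A genetic algorithm would instead search over many such path sets and apply repair steps whenever crossover or mutation breaks feasibility.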


Availability of data and materials

The real-world topology dataset used for evaluation can be accessed at www.topology-zoo.org.


Funding

This work is supported by the National Key Research and Development Program of China (No. 2022YFB4500702); the Shandong Provincial Natural Science Foundation (project ZR2022LZH018); the National Natural Science Foundation of China (grants 62141218 and 62372322); and the open project of Zhejiang Lab (2021DA0AM01/003).

Author information

Contributions

Hongyun Gao wrote the main manuscript text and prepared all the experiments and figures. Laiping Zhao guided the core logic of the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Laiping Zhao.

Ethics declarations

Competing interests

Not applicable.

Ethics approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gao, H., Zhao, L., Chen, S. et al. Low-cost crossed probing path planning for network failure localization. World Wide Web 26, 3891–3914 (2023). https://doi.org/10.1007/s11280-023-01206-7

  • DOI: https://doi.org/10.1007/s11280-023-01206-7
