A study on the classified model and the agent collaboration model for network configuration fault management

https://doi.org/10.1016/S0950-7051(03)00012-1

Abstract

This study presents a management model for handling network configuration faults in a system, together with diagnosis and recovery algorithms based on collaboration among agents. The management model comprises three stages (detection, diagnosis, and recovery), and each stage uses a set of rules in a Rule-Based Reasoning (RBR) database to diagnose and recover from network configuration faults. The study also presents an effective algorithm for managing network configuration in a system: agents distributed across the network in the management domain collaborate to handle faults that a system cannot identify on its own, diagnosing and recovering from them with the network condition taken into account.

Introduction

The Internet has advanced noticeably on the strength of recent computer and data telecommunication technologies. At the same time, the network equipment that makes up the Internet, and the systems connected to that equipment, have grown in complexity and scale. In this situation, it is extremely difficult to manage a fault when a system or network fault occurs. Currently, faults are managed directly by technicians in the corresponding fields and by Expert Systems in which their knowledge has been accumulated. However, this approach has clear limits, because it is difficult for an expert to grasp every condition of the system or network, and the same limitation applies to an Expert System that contains the expert's knowledge [6], [12].

In particular, it is hard to find the root cause of a configuration fault in a networked system, because such a fault can have more than one cause while the symptoms are mostly the same [1]. For example, the fact that the web service does not work on a system is not enough to conclude that the IP address or the subnet mask is wrong [9]. Errors of this kind can have many causes: the whole network might be down, the system's network interface device might be damaged, or an intermediate router might be down [8].

To handle these confusing errors, we use Rule-Based Reasoning (RBR) to detect network errors, diagnose them, and recover the network [2]. With RBR it is easier to adapt to the network environment and to diagnose network errors precisely [4], [5], [7]. The LODES system is one system that uses RBR to handle network errors. LODES can diagnose an error, but it cannot control or recover from it [11].

Above all, it is hard to judge a network fault from inside the system, because the system by itself cannot reliably assess the condition of the interior or exterior network. Therefore, to find the cause of an error precisely, agents have to collaborate online.

In the area of network fault management, Cisco proposes a TCP/IP network diagnosis tool that uses ping, traceroute, and packet debugging, and Lucent offers Navis™, network fault management software that detects network faults in real time. However, neither Cisco nor Lucent considers network management through collaboration among agents.

Therefore, this paper presents a methodology that uses RBR to manage faults automatically through three processes: fault detection, fault diagnosis, and fault recovery. It also shows that, by collaborating, agents can diagnose a fault precisely and recover from it based on an understanding of the state of the network.

In addition, this paper detects network failures actively, using a management tool such as PING, rather than through a management protocol such as the Simple Network Management Protocol (SNMP), the passive access method used in existing management systems.

The active fault management technique has the advantage that it avoids the waste of system resources incurred by the passive approach and allows the system to manage itself.
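As an illustration of active detection with a tool such as PING, the following is a minimal sketch and not the paper's implementation. It assumes a Linux-style `ping` binary (`-c` for count, `-W` for timeout in seconds); the function names `ping` and `detect_unreachable` are hypothetical. The probe is passed in as a parameter so that the detection logic can be exercised without touching a real network.

```python
import subprocess
from typing import Callable, Iterable, List


def ping(host: str, timeout_s: int = 1) -> bool:
    """Return True if one ICMP echo request to `host` is answered.

    Assumption: a Linux-style `ping` binary (`-c` count, `-W` timeout in seconds).
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # No ping binary available: treat the target as unreachable.
        return False
    return result.returncode == 0


def detect_unreachable(
    targets: Iterable[str],
    probe: Callable[[str], bool] = ping,
) -> List[str]:
    """Actively probe each target and return those that did not answer."""
    return [target for target in targets if not probe(target)]
```

A caller might probe the loopback address, the default gateway, and a host on another network in turn; which targets fail gives a first hint about where the fault lies.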

Moreover, the RBR-based method for network configuration fault management through collaboration among agents presented in this paper was verified against fault scenarios from actual network environments [3].

Section snippets

Network configuration fault management model

Network configuration fault management is composed of three components: fault detection, fault diagnosis, and fault recovery (Fig. 1). Fault detection is the stage in which errors within the system are detected using various fault detection tools. Fault diagnosis is the stage in which diagnostic rules are applied to the detected fault in order to find out where the fault has occurred and which system object has been affected. Fault diagnostic rule is in the rule-based
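The three stages can be sketched as a small rule-driven pipeline. This is a simplified illustration under stated assumptions, not the rule base from the paper: the observation keys (`loopback_ok`, `gateway_ok`), the two rules, and their diagnosis and recovery texts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical observation format: check name -> whether the check passed.
Observation = Dict[str, bool]


@dataclass
class Rule:
    name: str
    condition: Callable[[Observation], bool]
    diagnosis: str
    recovery: str


# Illustrative rules only; the paper's actual rule base is not reproduced here.
RULES: List[Rule] = [
    Rule(
        name="local_stack",
        condition=lambda o: not o["loopback_ok"],
        diagnosis="local TCP/IP stack or NIC fault",
        recovery="reload the NIC driver",
    ),
    Rule(
        name="gateway",
        condition=lambda o: o["loopback_ok"] and not o["gateway_ok"],
        diagnosis="gateway unreachable",
        recovery="ask a neighbouring agent to check the gateway",
    ),
]


def detect(obs: Observation) -> bool:
    """Detection stage: a fault exists if any check failed."""
    return not all(obs.values())


def diagnose(obs: Observation, rules: List[Rule]) -> Optional[Rule]:
    """Diagnosis stage: return the first rule whose condition matches."""
    for rule in rules:
        if rule.condition(obs):
            return rule
    return None


def manage(obs: Observation, rules: List[Rule] = RULES) -> str:
    """Run detection, diagnosis, and (as a description) recovery."""
    if not detect(obs):
        return "no fault"
    rule = diagnose(obs, rules)
    if rule is None:
        return "fault detected, no matching rule"
    return f"{rule.diagnosis}: {rule.recovery}"
```

Rules are evaluated in order, so more specific conditions should come first; a fault that matches no rule is reported rather than silently ignored.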

The configuration of agents

Three kinds of agents have been devised to manage network configuration faults effectively. They are distinguished by their location in the network: the This-Agent (T-Agent), which is installed on the system whose faults are to be managed; the Neighbor-Agent (N-Agent), which is installed on the same interior network as the T-Agent; and the R-Agent, which is installed on a network different from that of the T-Agent. All the agents exist in the same management domain, and they communicate one another
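The location-based distinction among the agents can be illustrated with a short sketch. It assumes that "the same interior network" can be approximated by membership in the IP subnet of the managed system; that approximation, the function name `classify_agent`, and the addresses in the usage note are assumptions, not details from the paper.

```python
import ipaddress


def classify_agent(managed_interface: str, agent_ip: str) -> str:
    """Classify an agent relative to the managed system.

    `managed_interface` is the managed system's address with prefix,
    e.g. "192.168.1.10/24". Subnet membership stands in for
    "same interior network" (an assumption of this sketch).
    """
    iface = ipaddress.ip_interface(managed_interface)
    addr = ipaddress.ip_address(agent_ip)
    if addr == iface.ip:
        return "T-Agent"  # installed on the managed system itself
    if addr in iface.network:
        return "N-Agent"  # same interior network as the T-Agent
    return "R-Agent"      # a different network from the T-Agent
```

For example, with a managed system at 192.168.1.10/24, an agent at 192.168.1.20 would be treated as an N-Agent and one at 10.0.0.5 as an R-Agent.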

Fault detection model

The fault detection model is used to detect the network configuration faults of a system. Network configuration faults are categorized into two kinds: physical and logical faults. Physical faults are hardware-related, such as a loose connection between the Network Interface Card (NIC) and the cable, a defect in the NIC itself, a disconnection between the network line and the hardware system, and the network being down. Logical faults are software-related, such as inappropriate installation of the NIC driver,
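The two fault categories can be represented as a simple lookup, sketched below. The fault labels are hypothetical names for the examples listed in the text; the logical category contains only the single example that is visible in this excerpt.

```python
from enum import Enum
from typing import Dict, Iterable, List


class FaultCategory(Enum):
    PHYSICAL = "physical"
    LOGICAL = "logical"


# Hypothetical labels for the example faults named in the text.
FAULT_CATEGORIES: Dict[str, FaultCategory] = {
    "nic_cable_loose": FaultCategory.PHYSICAL,
    "nic_defective": FaultCategory.PHYSICAL,
    "line_disconnected": FaultCategory.PHYSICAL,
    "network_down": FaultCategory.PHYSICAL,
    "nic_driver_misinstalled": FaultCategory.LOGICAL,
}


def categorize(fault: str) -> FaultCategory:
    """Return the category of a known fault label."""
    if fault not in FAULT_CATEGORIES:
        raise ValueError(f"unknown fault label: {fault}")
    return FAULT_CATEGORIES[fault]


def group_by_category(faults: Iterable[str]) -> Dict[FaultCategory, List[str]]:
    """Group detected fault labels by category, preserving input order."""
    grouped: Dict[FaultCategory, List[str]] = {c: [] for c in FaultCategory}
    for fault in faults:
        grouped[categorize(fault)].append(fault)
    return grouped
```

Keeping the category alongside each fault lets a detector choose the kind of check to run first, for instance a hardware check before a software check.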

Experimental environments

To assess the validity of the fault management algorithm presented in this paper, which manages network configuration faults through collaboration among agents, an experiment was carried out with the network at Sungkyunkwan University set as the management domain. Table 5 describes each system on which the Default Gateway, T-Agent, N-Agent, and R-Agent are installed. Systems 1, 2, and 3 were connected to the same interior LAN, and system 4 was connected to the other network on management

Conclusion

This paper presents an algorithm that diagnoses and recovers from network configuration faults in a system through collaboration among agents. The agents are divided into T-Agent, N-Agent, and R-Agent according to their location in the network, and they diagnose and recover from network configuration faults that occur in the system by collaborating in the form of queries and answers. Moreover, the paper reports on the network configuration model based on RBR. This management model is divided into

References (12)

  • C. Hunt, TCP/IP Network Administration, 1998.
  • D.W. Gurer, I. Khan, R. Ogier, R. Keffer, An Artificial Intelligence Approach to Network Fault Management, SRI...
  • E.L. Madruga et al., Fault management tools for a cooperative and decentralized network operations environment, IEEE Journal on Selected Areas in Communications, 1994.
  • J.-M. Yun, S.-J. Ahn, J.-W. Chung, Web Server Fault Diagnosis and Recovery Mechanism Using INBANCA, 2000, pp....
  • K.-H. Cho et al., Rule-based fault detection agent system for fault detection and location on LAN, Korea Information Processing Society, 2000.
  • K. Ohta et al., Proceedings of ISINM97, 1997.
There are more references available in the full text version of this article.

Kwang-Jong Cho received his MS degree in the School of Electrical and Computer Engineering from Sungkyunkwan University, Korea in 2002. Currently, he is an engineer in the Engineering Information Technology Center in Institute for Advanced Engineering, Korea. His research interests include network or system fault management and GRID.

Seong-Jin Ahn received his PhD degree from the Department of Information Engineering at Sungkyunkwan University, Korea in 1999. Currently, he is a professor in the Department of Computer Education at Sungkyunkwan University, Korea. His research interests include network management, UNIX systems, and high-speed network protocols.

Jin-Wook Chung is a professor in the School of Electrical and Computer Engineering, Sungkyunkwan University, Korea. He received his PhD degree in Computer Science from Seoul University, Korea, in 1991. For the last several years, he has been working on various research projects in network management that utilize Web and Java technologies. His research interests include network management, high-speed network protocols, and network security.

1 Tel.: +82-31-330-7454; fax: +82-31-330-7120.