Expert systems - do they live up to expectations?
The size and complexity of networks supporting high-value services mean that extensive network management facilities are needed to handle the stream of fault and performance information.
For large networks, the volume of this information flow can be considerable, even under normal operating conditions. When a major failure occurs, the amount of data can increase by several orders of magnitude. To improve manageability, low-value information is often discarded. But even when filtering is applied, the volume of information may still be too great for operators to handle. Under major fault conditions, the flood of information compounds the problem, and there is a danger that automatically discarded information may contain the key to identifying the underlying network problem. The result is that failures go undetected, ultimately leading to poorer quality of service for end customers.
To improve the situation, network operators have employed expert systems to detect and handle underlying network problems automatically. These systems usually maintain a knowledge base of failure scenarios and process large quantities of event information received from the network, attempting to detect a pre-defined failure 'signature'. Until now, the set of rules needed by these systems has been difficult, time-consuming and expensive to build, configure and maintain (for example, when the network topology changes, numerous new rules have to be added and old rules adapted or deleted).
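As a rough illustration of signature matching against a knowledge base, the sketch below may help. It is a minimal example under stated assumptions, not a description of any particular product: the `Event` type, the `KNOWLEDGE_BASE` contents, the element names and the event kinds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    element: str  # network element that raised the event (hypothetical names)
    kind: str     # event type, e.g. "LOSS_OF_SIGNAL" (hypothetical values)


# Hypothetical knowledge base: each failure scenario is described by a
# 'signature', here a set of (element, kind) pairs that must all be seen.
KNOWLEDGE_BASE = {
    "fibre cut between A and B": {
        ("A", "LOSS_OF_SIGNAL"),
        ("B", "LOSS_OF_SIGNAL"),
    },
    "router R1 failure": {
        ("R1", "HEARTBEAT_LOST"),
        ("A", "NEIGHBOUR_DOWN"),
    },
}


def detect(events):
    """Return the scenarios whose complete signature appears in the events."""
    seen = {(e.element, e.kind) for e in events}
    return sorted(
        name for name, signature in KNOWLEDGE_BASE.items() if signature <= seen
    )


events = [
    Event("A", "LOSS_OF_SIGNAL"),
    Event("C", "HIGH_TEMPERATURE"),  # unrelated event, ignored by the matcher
    Event("B", "LOSS_OF_SIGNAL"),
]
print(detect(events))
```

Because each signature in this sketch names specific elements, any change to the topology (for example, replacing element A) would require every signature that mentions it to be revised, which is the maintenance burden described above.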
Much of the complexity in constructing and maintaining such expert systems arises from the process of translating the detailed knowledge of particular failure scenarios (gained from experienced users) into a form that the expert system can use. Current systems typically employ large and complex sets of rules to handle even the most basic failure scenarios, because they have to maintain element state and connectivity information for the duration of the failure. Consequently, adding support for a new scenario is slow and requires expensive resources.
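To give a sense of the bookkeeping involved, the following sketch shows the kind of element state and connectivity information that has to be held while a failure is in progress. It is an illustrative assumption only: the `StatefulCorrelator` class, the `upstream` map and the "DOWN"/"UP" event kinds are hypothetical, and real rule sets are considerably more involved.

```python
class StatefulCorrelator:
    """Tracks which elements are down and uses connectivity to find root causes."""

    def __init__(self, upstream):
        # Connectivity (assumed): maps each element to the element it
        # depends on, or None if it has no upstream dependency.
        self.upstream = upstream
        # Element state: the set of elements currently reported as down.
        self.down = set()

    def on_event(self, element, kind):
        # State must be updated on every event for the duration of the failure.
        if kind == "DOWN":
            self.down.add(element)
        elif kind == "UP":
            self.down.discard(element)

    def root_causes(self):
        # A down element is treated as a root cause only if the element it
        # depends on is not itself down.
        return sorted(
            e for e in self.down if self.upstream.get(e) not in self.down
        )


topology = {"core": None, "edge1": "core", "edge2": "core", "cpe1": "edge1"}
correlator = StatefulCorrelator(topology)
for element in ("core", "edge1", "cpe1"):
    correlator.on_event(element, "DOWN")
print(correlator.root_causes())
```

Even in this small sketch, both the state set and the connectivity map have to be kept consistent with the live network, which hints at why hand-written rules for the same purpose grow large.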