doi:10.1016/j.inffus.2005.06.002
Copyright © 2005 Elsevier B.V. All rights reserved.
Taxonomic knowledge structure discovery from imagery-based data using the neural associative incremental learning (NAIL) algorithm
Bradley J. Rhodes
Multisensor Exploitation Directorate, Fusion Technology and Systems Division, Advanced Information Technologies, BAE Systems, 6 New England Executive Park, Burlington, MA 01803, USA
Received 31 August 2004;
revised 25 May 2005;
accepted 1 June 2005.
Available online 27 July 2005.
Abstract
An important component of higher level fusion is knowledge discovery. One form of knowledge is a set of relationships between concepts. This paper addresses the automated discovery of ontological knowledge representations such as taxonomies/thesauri from imagery-based data. Multi-target classification is used to transform each source data point into a set of conceptual predictions from a pre-defined lexicon. This classification pre-processing produces co-occurrence data that is suitable for input to an ontology learning algorithm. A neural network with an associative, incremental learning (NAIL) algorithm processes this co-occurrence data to find relationships between elements of the lexicon, thus uncovering the knowledge structure ‘hidden’ in the dataset. The efficacy of this approach is demonstrated on a dataset created from satellite imagery of a metropolitan region. The flexibility of the NAIL algorithm is illustrated by employing it on an additional dataset comprised of topic categories from a text document collection. The usefulness of the knowledge structure discovered from the imagery data is illustrated via construction of a Bayesian network, which produces an inference engine capable of exploiting the learned knowledge model. Effective automation of knowledge discovery in an information fusion context has considerable potential for aiding the development of machine-based situation awareness capabilities.
Keywords: Knowledge structure; Information fusion; Taxonomy; Ontology learning; Incremental learning; Associative learning; Multi-target classification
Fig. 1. Left: Raw data of target (Boston) region using original Red, Green and Blue Landsat 7 Thematic Mapper bands. Center: Fused image provided by the Neural Fusion processing as described in the text. Right: Ground truth areas for selected physical features: ocean, ice, river, beach, park, road, residential, and industrial. Thin vertical lines indicate the strips employed to provide independent samples for model building, validation, and evaluation purposes. (The reader is referred to the Web version of this article to more easily see the color-coded relationships between figures.)
Fig. 2. Hierarchical relationship between ground truth categories in the imagery dataset.
Fig. 3. Recall/Precision/F1 results for the TopN and Threshold prediction selection methods from ARTMAP activations classifying the imagery-based dataset.
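The two prediction selection methods named in Fig. 3 can be illustrated with a minimal sketch. It assumes that TopN keeps the N categories with the highest activation and that Threshold keeps every category whose activation meets a cutoff; the function names, the activation values, and the per-sample scoring are hypothetical and are not taken from the article.

```python
def select_top_n(activations, n):
    """Keep the n labels with the highest activation (assumed TopN rule)."""
    ranked = sorted(activations, key=activations.get, reverse=True)
    return set(ranked[:n])


def select_threshold(activations, threshold):
    """Keep every label whose activation is at least `threshold` (assumed rule)."""
    return {label for label, value in activations.items() if value >= threshold}


def precision_recall_f1(predicted, truth):
    """Score one set of predicted labels against the ground-truth label set."""
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up activations for a single sample (illustration only).
activations = {"ocean": 0.9, "river": 0.6, "park": 0.3, "road": 0.1}
truth = {"ocean", "river"}

top1 = select_top_n(activations, 1)
above = select_threshold(activations, 0.25)
top1_scores = precision_recall_f1(top1, truth)
above_scores = precision_recall_f1(above, truth)
```

In this toy case the single top prediction is precise but misses one true label, while the threshold rule recovers both true labels at the cost of one false positive, which is the trade-off that an F1 comparison summarizes.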
Fig. 4. Cartoon example of online learning in operation: (Step 1—top left) The network starts with nodes (unfilled circles A–D) that are fully connected with 0 weights; (Step 2) When two nodes are co-activated (filled circles A and C), the weights between them increase (A → C, C → A); (Step 3) When a different set of nodes is activated (B and C), the weights between the activated nodes increase (B → C, C → B). However, if any of the active nodes has a non-zero weight connection with an inactive node (C → A), the value of this connection weight will decrease as indicated. The A → C weight does not change because A is not active; (Step 4) Network operation is not restricted to pair-wise activation of nodes—here 3 nodes (A, C, and D) are co-activated with consequent weight changes illustrated. All weights associated with the active nodes increase from their Step 3 values except for the C → B weight, which decreases.
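The qualitative behavior described in the Fig. 4 caption can be reproduced with a small sketch. The update rule below is an assumption chosen only to match the caption (weights between co-active nodes move toward 1, weights from an active node to an inactive node decay toward 0, and outgoing weights of inactive nodes are left alone); the actual NAIL learning equations, the learning rate `rate`, and the function name `update_weights` are not given in this excerpt and are hypothetical.

```python
def update_weights(weights, active, rate=0.5):
    """Return a new weight matrix after one co-activation event.

    weights[i][j] is the directed weight i -> j. Only rows belonging to
    active nodes change (assumed rule, not the published NAIL equations).
    """
    active = set(active)
    new = [row[:] for row in weights]
    for i in active:
        for j in range(len(weights)):
            if i == j:
                continue
            if j in active:
                # Both nodes active: strengthen i -> j toward 1.
                new[i][j] += rate * (1.0 - weights[i][j])
            else:
                # Source active, target inactive: weaken i -> j toward 0.
                new[i][j] -= rate * weights[i][j]
    return new


A, B, C, D = range(4)
step1 = [[0.0] * 4 for _ in range(4)]      # Step 1: all weights are 0
step2 = update_weights(step1, {A, C})      # Step 2: A and C co-activated
step3 = update_weights(step2, {B, C})      # Step 3: B and C co-activated
step4 = update_weights(step3, {A, C, D})   # Step 4: A, C, and D co-activated
```

Because the update is applied one presentation at a time, the matrix can be refined as each new data point arrives without revisiting earlier data, which is the sense in which the caption describes the learning as online.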
Fig. 5. Relationships between categories determined from the NAIL algorithm on the imagery-based dataset. Thick arrows indicate original links; thin arrows indicate additional (yet logically true) links; thin dashed arrows indicate reciprocal links between closely associated categories. All weight values 0.8 except where noted.
Fig. 6. Relationships between categories determined from association rule analysis on the imagery-based dataset. Thick arrows indicate original links; thin arrows indicate additional (yet logically true) links; thin dashed arrows indicate reciprocal links between closely associated categories. All confidence levels 0.8 except where noted.
Fig. 7. Error results for the network produced by the NAIL algorithm. The amount of data presented to the algorithm is increased from left to right in each panel by iterating through the dataset. The solid horizontal line in each panel indicates the performance level of the association rule mining algorithm for the dataset. Multiple iterations through the data do not change the association rule mining performance (see text for details). Left: Comparison with the original reference ontology (Fig. 2); Right: Comparison with an extended reference ontology that adds logically true relationships to the original reference model (see text for details).
Fig. 8. Relationships between categories determined from the NAIL algorithm on the text mining dataset using a weight threshold of 0.35. Solid black arrows depict links from lower level nodes to higher level nodes. Where reciprocal link weights meet the threshold these are indicated by dashed arrows (with actual weights included). Abbreviations: BOP = balance of payments; fx = foreign exchange; Nat-gas = natural gas.
Fig. 9. DAG created from the set of relationships shown in Fig. 5.
Fig. 10. Example Bayesian network illustrating the resultant structure and conditional probability tables (CPTs) that parameterize the network.
Fig. 11. Top: Network marginal probabilities in the absence of evidence. Bottom: Network marginal probabilities in the presence of hard evidence for road. Double-line borders indicate increased probabilities with respect to the no-evidence baseline (shown above), and single-line borders indicate reduced probabilities.
Table 1.
Comparison of classifier performance using the maximum F1 value
