Constructing and navigating personalised views of the Web

doi:10.1016/S0306-4573(99)00018-7

Information Processing & Management

Volume 35, Issue 5, September 1999, Pages 679-689

Abstract

WebClass is a system that allows Web users to create personalised conceptual data which is dynamically merged with original HTML source by a specialised proxy server. This allows groups of users to share ‘views’ of the World Wide Web that include conceptual information such as annotations and subject information. WebClass records paths followed by users during Web exploration. Graph traversal operators can be used to answer a variety of questions about explored regions of Web space.

Introduction

WebClass is a system that allows users to organise their experience of the World Wide Web (WWW). The system merges personalised conceptual information with original Web pages to provide alternative views of those Web pages. The transformation from the original to the alternative view is done transparently by a proxy service. These ‘views’ can be shared with other users who subscribe to the service. Applications of the system include:

• Annotation. A user may associate a comment with a Web page. The comment will be visible to other subscribers who visit the same Web page.
• Indexing. A user may associate a particular category or keyword with a Web page. By searching their personalised categories, a user can locate the associated pages. This is similar to existing bookmarking schemes offered by Web browsers, where a uniform resource identifier (URI) can be filed under a user-defined heading. However, an advantage of WebClass is that the associations can be shared between groups of users; additions by any one user are accessible to all subscribers. In addition, the associations work in both directions: if a Web page has been assigned a category, this fact will be visible to any user that visits the page.
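
The two applications above both rest on associations that can be read in either direction. The following is a minimal sketch of such a two-way index, not the WebClass implementation: the class AssociationIndex, its method names and the example.org URIs are all assumptions made for illustration.

```python
from collections import defaultdict


class AssociationIndex:
    """Hypothetical two-way index between categories and URIs."""

    def __init__(self):
        self._uris_by_category = defaultdict(set)
        self._categories_by_uri = defaultdict(set)
        self._comments_by_uri = defaultdict(list)

    def categorise(self, uri, category):
        # Store the association in both directions so that either
        # side can be used as the starting point of a lookup.
        self._uris_by_category[category].add(uri)
        self._categories_by_uri[uri].add(category)

    def annotate(self, uri, user, comment):
        # Comments are kept in the order in which they were added.
        self._comments_by_uri[uri].append((user, comment))

    def pages_for(self, category):
        """Indexing direction: category -> pages filed under it."""
        return sorted(self._uris_by_category.get(category, ()))

    def categories_for(self, uri):
        """Reverse direction: page -> categories assigned to it."""
        return sorted(self._categories_by_uri.get(uri, ()))

    def comments_for(self, uri):
        return list(self._comments_by_uri.get(uri, ()))


index = AssociationIndex()
index.categorise("http://example.org/harvest", "resource discovery")
index.categorise("http://example.org/harvest", "indexing")
index.categorise("http://example.org/noetica", "data modelling")
index.annotate("http://example.org/harvest", "alice", "Good overview of gatherers.")
```

Because every subscriber would read and write the same index, an addition by one user becomes visible to the others without any further step.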

Conceptual information is defined as high-level semantic information. This may consist of text, structured descriptions of objects, and the relationships between these objects. WebClass represents conceptual data using a strongly-typed graph data model (Greenhill & Venkatesh, 1998). WebClass has two major components:

• A simple database system maintains associations between URIs and user-defined conceptual data objects. Given a database schema, it allows data objects to be created, edited and linked using a form-based Web interface. This is implemented as a Web server called the content server.
• A lightweight proxy server fetches Web pages as requested by a user's Web browser, and transforms them according to the associated data stored in the database. This is called the transformation server.

An important feature of the system is that queries about documents are executed automatically as the documents are fetched. The user does not need to explicitly execute a query in order to discover that there is information associated with a Web page. WebClass does not compete with large-scale indexing systems for resource discovery. However, it does allow users to organise their experience of the Web at a ‘personal’ level by keeping records of Web exploration, and by associating conceptual data with Web resources in a way that can be shared between groups of Web users.
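
The automatic lookup described above can be sketched as a function that a proxy might apply to each fetched page. This is an illustrative sketch only: the ANNOTATIONS dictionary stands in for the content server's database, and the function name transform, the CSS class webclass-notes and the choice to place notes directly after the opening body tag are assumptions, not details taken from WebClass.

```python
import html
import re

# Hypothetical in-memory stand-in for the content server's database.
ANNOTATIONS = {
    "http://example.org/page": ["Useful survey of indexing services."],
}

_BODY_TAG = re.compile(r"<body[^>]*>", re.IGNORECASE)


def transform(uri, page_html, annotations=ANNOTATIONS):
    """Return page_html with any comments stored for uri merged in."""
    comments = annotations.get(uri)
    if not comments:
        # Nothing is associated with this URI: pass the page through unchanged.
        return page_html
    # Escape user-supplied text so that it cannot inject markup.
    items = "".join(f"<li>{html.escape(c)}</li>" for c in comments)
    banner = f'<ul class="webclass-notes">{items}</ul>'
    match = _BODY_TAG.search(page_html)
    if match is None:
        # No <body> tag was found, so prepend the notes instead.
        return banner + page_html
    return page_html[:match.end()] + banner + page_html[match.end():]
```

The lookup runs on every fetch, so a user sees associated information simply by visiting the page; pages with no associations are returned byte-for-byte as received.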

This paper describes the implementation and use of WebClass, using the examples of annotation and indexing as potential applications. These examples are chosen to demonstrate the merging of conceptual data with Web content. Section 2 describes existing approaches to organisation of Web resources. Section 3 describes the data modelling facilities within WebClass. Section 4 describes how the system interacts with users. Visualisation of document relationships is described in Section 5.

Section snippets

Related work

The problem of organising the Web is complex and exists at many levels. Indexing services (e.g. Altavista, Excite, Yahoo, InfoSeek) attempt to cover large regions of the Web, and offer efficient search facilities based on keywords occurring in documents. These systems rely on regular traversals of the Web to fetch and index documents. An alternative distributed approach is used by the Harvest system (Bowman, Danzig, Hardy, Manber & Schwartz, 1995). Information Gatherers run at a Provider's

Data modelling

WebClass allows the association of abstract data objects with Web resources. These objects are organised in a type hierarchy, similar to that of an object-oriented (OO) database system. Simple types include numeric, boolean and textual types. More complex types can be built using record, variant and list constructors. Reference types allow references to objects and external resources (such as files and Web pages).
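
As a rough illustration of these constructors, the sketch below expresses a small type hierarchy with Python dataclasses. The type names (ConceptObject, Comment, Subject, ResourceRef) are hypothetical and are not the model shown in Fig. 1; WebClass defines its types in its own schema notation, which is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass(frozen=True)
class ResourceRef:
    """Reference type: points at an external resource such as a Web page."""
    uri: str


@dataclass
class ConceptObject:
    """Root of the (hypothetical) type hierarchy."""
    label: str  # simple textual type


@dataclass
class Comment(ConceptObject):
    """Record type: a fixed set of named fields."""
    text: str = ""
    public: bool = True  # simple boolean type


@dataclass
class Subject(ConceptObject):
    """A category that lists the resources filed under it."""
    pages: List[ResourceRef] = field(default_factory=list)  # list constructor


# Variant type: a value that is one of several alternatives.
Attachment = Union[Comment, Subject]


def describe(item: Attachment) -> str:
    """Dispatch on the alternative actually held by the variant."""
    if isinstance(item, Comment):
        return f"comment '{item.label}': {item.text}"
    return f"subject '{item.label}' with {len(item.pages)} page(s)"
```

Here inheritance plays the role of the type hierarchy, so both Comment and Subject can be handled wherever a ConceptObject is expected.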

Fig. 1 shows the definition of a simple data model within WebClass. Material is a class of

Using WebClass

Once the user has defined a data model, WebClass creates an empty database. To access the database, the user configures their Web browser to use WebClass as a proxy.

As users explore the Web, their browsers provide navigation information to servers through ‘Referer’ (sic) fields in the hyper-text transfer protocol (HTTP) request header. When a client requests a resource via a hyper-link, the Web browser informs the server which resource contains the link. Normally this information is used by
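
One way a proxy could turn Referer fields into a record of exploration is sketched below. It is an assumption-laden illustration rather than the WebClass code: the class PathRecorder and its method record are invented names, and the two mappings simply mirror the idea of links that can be followed in either direction.

```python
from collections import defaultdict


class PathRecorder:
    """Hypothetical recorder that builds a link graph from request headers."""

    def __init__(self):
        self.refers_to = defaultdict(set)      # source URI -> target URIs
        self.referred_from = defaultdict(set)  # target URI -> source URIs

    def record(self, requested_uri, headers):
        """Record one request; return True if a link could be inferred."""
        # Header names are case-insensitive in HTTP, so normalise them first.
        lowered = {name.lower(): value for name, value in headers.items()}
        referer = lowered.get("referer")
        if referer is None:
            # No Referer field (for example a typed URL or a bookmark),
            # so there is no link to record for this request.
            return False
        self.refers_to[referer].add(requested_uri)
        self.referred_from[requested_uri].add(referer)
        return True
```

A request for page B carrying a Referer of page A would add an edge from A to B, so the graph grows as a side effect of ordinary browsing.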

Path visualisation

As users traverse the Web, WebClass maintains a map of their explorations. Graph traversal operators can be used to answer a variety of questions about explored regions of Web space. Typical questions might be:

• Which documents are within a radius of r links from document A? This can be used to visualise the ‘neighborhood’ of a document. A traversal starts at A, and propagates out by following ‘refersTo’ and ‘referredFrom’ links until it reaches r links from the origin.
• Which subject is most
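
The radius question in the first item above can be answered with a breadth-first traversal. The sketch below is one possible formulation, assuming the explored region is held as two dictionaries of sets; the function name neighbourhood and the sample documents A to E are hypothetical.

```python
from collections import deque


def neighbourhood(start, radius, refers_to, referred_from):
    """Return {document: distance} for documents within radius links of start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        if distance[doc] >= radius:
            continue  # do not expand beyond the requested radius
        # Follow links in both directions from the current document.
        linked = set(refers_to.get(doc, ())) | set(referred_from.get(doc, ()))
        for other in sorted(linked):
            if other not in distance:
                distance[other] = distance[doc] + 1
                queue.append(other)
    return distance


# Hypothetical explored region: A -> B -> C -> D, and E -> A.
refers_to = {"A": {"B"}, "B": {"C"}, "C": {"D"}, "E": {"A"}}
referred_from = {"B": {"A"}, "C": {"B"}, "D": {"C"}, "A": {"E"}}
```

Breadth-first order guarantees that each document is labelled with its shortest link distance from the origin, which is the quantity a neighbourhood display would need.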

Conclusion

For an individual user, WebClass provides a simple system for maintaining personalised conceptual data, and for dynamically merging this with original HTML code. For groups of users, it provides a way of sharing conceptual information about Web objects. By using a strongly-typed graph data model, WebClass encourages the construction of formalised conceptual data structures.

While we emphasise the local generation of data, this could be disseminated in a distributed fashion. For example, WebClass

References (10)

S. Greenhill et al. Noetica: a tool for semantic data modelling. Information Processing and Management (1998).
C.M. Bowman et al. The Harvest information discovery and access system.
P. Buneman et al. Adding structure to unstructured data.
Fielding, R., Gettys, J., Mogul, J., Frystyk, H., & Berners-Lee, T. (1997). Hypertext Transfer Protocol - HTTP/1.1....

There are more references available in the full text version of this article.
