Constructing and navigating personalised views of the Web

https://doi.org/10.1016/S0306-4573(99)00018-7

Abstract

WebClass is a system that allows Web users to create personalised conceptual data which is dynamically merged with original HTML source by a specialised proxy server. This allows groups of users to share ‘views’ of the World Wide Web that include conceptual information such as annotations and subject information. WebClass records paths followed by users during Web exploration. Graph traversal operators can be used to answer a variety of questions about explored regions of Web space.

Introduction

WebClass is a system that allows users to organise their experience of the World Wide Web (WWW). The system merges personalised conceptual information with original Web pages to provide alternative views of those Web pages. The transformation from the original to the alternative view is done transparently by a proxy service. These ‘views’ can be shared with other users who subscribe to the service. Applications of the system include:

  • Annotation. A user may associate a comment with a Web page. The comment will be visible to other subscribers who visit the same Web page.

  • Indexing. A user may associate a particular category or keyword with a Web page. By searching their personalised categories, a user can locate the associated pages. This is similar to existing bookmarking schemes offered by Web browsers, where a uniform resource identifier (URI) can be filed under a user-defined heading. However, an advantage of WebClass is that the associations can be shared between groups of users; additions by any one user are accessible to all subscribers. In addition, the associations work in both directions: if a Web page has been assigned a category, this fact will be visible to any user that visits the page.

Conceptual information is defined as high-level semantic information. This may consist of text, structured descriptions of objects, and the relationships between these objects. WebClass represents conceptual data using a strongly-typed graph data model (Greenhill & Venkatesh, 1998). WebClass has two major components:

  • A simple database system maintains associations between URIs and user-defined conceptual data objects. Given a database schema, it allows data objects to be created, edited and linked using a form-based Web interface. This is implemented as a Web server called the content server.

  • A lightweight proxy server fetches Web pages as requested by a user's Web browser, and transforms them according to the associated data stored in the database. This is called the transformation server.
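The division of labour between the two components can be illustrated with a minimal sketch. It is not the WebClass implementation: the class `AssociationStore`, the function `transform`, the `webclass` CSS class and the choice to insert the data directly after the opening body tag are all assumptions made for illustration. The store keeps category associations in both directions, as described for indexing above.

```python
import html
import re
from collections import defaultdict


class AssociationStore:
    """Hypothetical in-memory stand-in for the content server's database."""

    def __init__(self):
        self.annotations = defaultdict(list)  # URI -> list of comments
        self.categories = defaultdict(set)    # URI -> set of categories
        self.pages = defaultdict(set)         # category -> set of URIs

    def annotate(self, uri, comment):
        self.annotations[uri].append(comment)

    def categorise(self, uri, category):
        # Store both directions so lookups work from a page or from a category.
        self.categories[uri].add(category)
        self.pages[category].add(uri)


def transform(uri, page, store):
    """Return the page with any associated data inserted after <body>."""
    notes = store.annotations.get(uri, [])
    cats = sorted(store.categories.get(uri, ()))
    if not notes and not cats:
        return page  # nothing associated: pass the page through unchanged
    items = ["<li>" + html.escape(note) + "</li>" for note in notes]
    if cats:
        items.append("<li>Categories: " + html.escape(", ".join(cats)) + "</li>")
    banner = '<ul class="webclass">' + "".join(items) + "</ul>"
    # Insert after the opening <body> tag if present, otherwise prepend.
    match = re.search(r"<body[^>]*>", page, flags=re.IGNORECASE)
    if match is None:
        return banner + page
    return page[:match.end()] + banner + page[match.end():]
```

In this sketch a proxy would call `transform` on every fetched page, so the lookup happens as a side effect of browsing rather than as an explicit query.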

An important feature of the system is that queries about documents are executed automatically as the documents are fetched. The user does not need to explicitly execute a query in order to discover that there is information associated with a Web page. WebClass does not compete with large-scale indexing systems for resource discovery. However, it does allow users to organise their experience of the Web at a ‘personal’ level by keeping records of Web exploration, and by associating conceptual data with Web resources in a way that can be shared between groups of Web users.

This paper describes the implementation and use of WebClass, using the examples of annotation and indexing as potential applications. These examples are chosen to demonstrate the merging of conceptual data with Web content. Section 2 describes existing approaches to organisation of Web resources. Section 3 describes the data modelling facilities within WebClass. Section 4 describes how the system interacts with users. Visualisation of document relationships is described in Section 5.

Related work

The problem of organising the Web is complex and exists at many levels. Indexing services (e.g. Altavista, Excite, Yahoo, InfoSeek) attempt to cover large regions of the Web, and offer efficient search facilities based on keywords occurring in documents. These systems rely on regular traversals of the Web to fetch and index documents. An alternative distributed approach is used by the Harvest system (Bowman, Danzig, Hardy, Manber & Schwartz, 1995). Information Gatherers run at a Provider's

Data modelling

WebClass allows the association of abstract data objects with Web resources. These objects are organised in a type hierarchy, similar to that of an object-oriented (OO) database system. Simple types include numeric, boolean and textual types. More complex types can be built using record, variant and list constructors. Reference types allow references to objects and external resources (such as files and Web pages).
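The type constructors listed above can be sketched as follows. This is only an illustration of the idea of strongly-typed data, not the WebClass schema language: the names `Simple`, `Record`, `Variant`, `ListOf`, `Reference`, `conforms` and the example type `Annotation` are hypothetical, references are restricted to URIs, and the class hierarchy (inheritance between types) is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Simple:
    py_type: type      # e.g. int, float, bool or str


@dataclass(frozen=True)
class Record:
    fields: tuple      # tuple of (field name, type) pairs


@dataclass(frozen=True)
class Variant:
    options: tuple     # a value must conform to one of these types


@dataclass(frozen=True)
class ListOf:
    element: object    # type of every list element


@dataclass(frozen=True)
class Reference:
    target: str        # only "uri" is handled in this sketch


def conforms(value, t):
    """Check whether a plain Python value matches a type description."""
    if isinstance(t, Simple):
        # bool is a subclass of int in Python, so reject it for other types.
        if t.py_type is not bool and isinstance(value, bool):
            return False
        return isinstance(value, t.py_type)
    if isinstance(t, Record):
        return (isinstance(value, dict)
                and set(value) == {name for name, _ in t.fields}
                and all(conforms(value[name], ft) for name, ft in t.fields))
    if isinstance(t, Variant):
        return any(conforms(value, option) for option in t.options)
    if isinstance(t, ListOf):
        return isinstance(value, list) and all(conforms(v, t.element) for v in value)
    if isinstance(t, Reference):
        return isinstance(value, str) and value.startswith(("http://", "https://"))
    raise TypeError(f"unknown type description: {t!r}")


# Hypothetical record type linking a comment and keywords to a Web page.
Annotation = Record((
    ("page", Reference("uri")),
    ("comment", Simple(str)),
    ("keywords", ListOf(Simple(str))),
))
```

Checking values against a declared type before storing them is one way a form-based editor could reject malformed objects.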

Fig. 1 shows the definition of a simple data model within WebClass. Material is a class of

Using WebClass

Once the user has defined a data model, WebClass creates an empty database. To access the database, the user configures their Web browser to use WebClass as a proxy.

As users explore the Web, their browsers provide navigation information to servers through ‘Referer’ (sic) fields in the hyper-text transfer protocol (HTTP) request header. When a client requests a resource via a hyper-link, the Web browser informs the server which resource contains the link. Normally this information is used by
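A proxy sees the same ‘Referer’ field as the origin server, so it can record which page led to which. The sketch below shows one way this might be done; the class `PathMap` and its method `record` are hypothetical, and the two edge sets are named after the ‘refersTo’ and ‘referredFrom’ links used for traversal.

```python
from collections import defaultdict


class PathMap:
    """Hypothetical record of the links a user has followed."""

    def __init__(self):
        self.refers_to = defaultdict(set)      # source URI -> target URIs
        self.referred_from = defaultdict(set)  # target URI -> source URIs

    def record(self, url, headers):
        """Record one request; return True if a followed link was stored."""
        # HTTP header names are case-insensitive, so normalise before lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        referer = lowered.get("referer")
        if not referer:
            return False  # e.g. a typed URL or bookmark: no link was followed
        self.refers_to[referer].add(url)
        self.referred_from[url].add(referer)
        return True
```

Requests without a ‘Referer’ field are simply not recorded as edges in this sketch.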

Path visualisation

As users traverse the Web, WebClass maintains a map of their explorations. Graph traversal operators can be used to answer a variety of questions about explored regions of Web space. Typical questions might be:

  • Which documents are within a radius of r links from document A? This can be used to visualise the ‘neighbourhood’ of a document. A traversal starts at A, and propagates out by following ‘refersTo’ and ‘referredFrom’ links until it reaches r links from the origin.

  • Which subject is most
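The radius question can be answered with a breadth-first traversal. The following is a minimal sketch, assuming the explored links are held as two plain dictionaries of sets (one per link direction); the function name and this representation are assumptions, not the WebClass operators themselves.

```python
from collections import deque


def neighbourhood(start, radius, refers_to, referred_from):
    """Return {document: distance} for documents within `radius` links of `start`."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        if distance[doc] == radius:
            continue  # do not expand past the requested radius
        # Follow links in both directions, as the first question describes.
        neighbours = refers_to.get(doc, set()) | referred_from.get(doc, set())
        for nxt in sorted(neighbours):
            if nxt not in distance:
                distance[nxt] = distance[doc] + 1
                queue.append(nxt)
    return distance
```

Because the traversal is breadth-first, each document is reported with its shortest link distance from the origin, which could be used to lay out the map in concentric rings.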

Conclusion

For an individual user, WebClass provides a simple system for maintaining personalised conceptual data, and for dynamically merging this with original HTML code. For groups of users, it provides a way of sharing conceptual information about Web objects. By using a strongly-typed graph data model, WebClass encourages the construction of formalised conceptual data structures.

While we emphasise the local generation of data, this could be disseminated in a distributed fashion. For example, WebClass

References (10)

  • Greenhill, S., et al. (1998). Noetica: a tool for semantic data modelling. Information Processing and Management.
  • Bowman, C.M., et al. The Harvest information discovery and access system.
  • Buneman, P., et al. Adding structure to unstructured data.
  • Fielding, R., Gettys, J., Mogul, J., Frystyk, H., & Berners-Lee, T. (1997). Hypertext Transfer Protocol - HTTP/1.1....