doi:10.1016/S0306-4379(01)00047-3
Copyright © 2002 Published by Elsevier Science Ltd. All rights reserved.
Signature-based structures for objects with set-valued attributes
a Department of Informatics, Aristotle University, Thessaloniki 54006, Greece
b Department of Computer and Communication Engineering, School of Engineering, University of Thessaly, Argonafton & Filellinon, 38221 Volos, Greece
c Department of Informatics, University of Cyprus, Nicosia 1678, Cyprus
Received 28 August 2000;
revised 5 March 2001;
accepted 30 August 2001
Available online 17 October 2001.
Abstract
Aiming at the efficient retrieval of objects with set-valued attributes, we introduce three variations of a new method for answering subset and superset queries. Our approach combines the advantages of two access methods, linear hashing and tree-structured methods, each of which already underlies previously reported methods for this problem. We present analytical performance estimates for each method, followed by a thorough experimental comparison of all investigated structures; analytical and experimental results deviate by 10% on average. Finally, the results of this performance evaluation are presented and discussed; they show the superiority of the new methods, which reach an improvement of up to 85%.
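As an illustration of the kind of query the abstract refers to, the following is a minimal sketch of signature-based filtering for set containment. It is not the structure proposed in the article: it assumes plain superimposed coding with a sequential scan, and the names `element_signature`, `set_signature`, `may_contain`, `may_be_contained`, and `containing_sets`, as well as the parameter values `F = 64` and `M = 3`, are hypothetical choices made only for this example.

```python
import hashlib

F = 64  # signature length in bits (illustrative value, an assumption)
M = 3   # bit positions set per element (illustrative value, an assumption)


def element_signature(element: str, f: int = F, m: int = M) -> int:
    """Map one element to an f-bit signature with at most m bits set."""
    sig = 0
    for i in range(m):
        digest = hashlib.sha256(f"{i}:{element}".encode()).digest()
        pos = int.from_bytes(digest[:8], "big") % f
        sig |= 1 << pos
    return sig


def set_signature(elements, f: int = F, m: int = M) -> int:
    """Superimpose (bitwise OR) the signatures of all elements of a set."""
    sig = 0
    for e in elements:
        sig |= element_signature(e, f, m)
    return sig


def may_contain(target_sig: int, query_sig: int) -> bool:
    """True if the target set possibly contains every query element.

    A False answer is definite; a True answer may be a false drop.
    """
    return (target_sig & query_sig) == query_sig


def may_be_contained(target_sig: int, query_sig: int) -> bool:
    """True if the target set is possibly contained in the query set."""
    return (target_sig & query_sig) == target_sig


def containing_sets(objects, query):
    """Return the stored sets that contain the query set.

    The signature test is only a filter, so each candidate is verified
    against the actual set to discard false drops.
    """
    q = set(query)
    q_sig = set_signature(q)
    hits = []
    for obj in objects:
        if may_contain(set_signature(obj), q_sig) and q <= set(obj):
            hits.append(obj)
    return hits
```

In this sketch the signature comparison never rejects a true match, so the verification step only has to remove false drops. An index such as a signature tree or a hashed signature file would replace the sequential scan in `containing_sets` to avoid comparing against every stored signature.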
Fig. 1. An example of an S-tree with K=4 and k=2.
Fig. 2. An example of with F=16 and h=2.
Fig. 3. An example of structure with F=16 and h=2.
Fig. 4. An example of structure with F=16 and h=2.
Fig. 5. An example of structure with F=16 and h=2.
Fig. 6. Comparison of analytical estimates for the (left) and (right) methods as a function of the query weight.
Fig. 7. Comparison of analytical estimates for the method as a function of the query weight.
Fig. 8. Comparison of the proposed methods as a function of the weight of inserted signatures.
Fig. 9. Performance of the method: retrieval costs (left) and storage overhead (right).
Fig. 10. Comparison of the proposed methods for 512–120 signatures (left) and 512–154 signatures (right) in 2K pages, as a function of the query weight.
Fig. 11. Comparison of the proposed methods for 1024–256 signatures (left) and 1024–340 signatures (right) in 4K pages, as a function of the query weight.
Fig. 12. Time overhead of the proposed methods for 512–120 signatures (left) and 1024–256 signatures (right) in 2K and 4K pages, respectively, as a function of the query weight.
Fig. 13. Comparison of the proposed methods as a function of the number of inserted signatures for 512–120 signatures (left) and 1024–256 signatures (right).
Fig. 14. Comparison of the proposed methods as a function of the number of inserted signatures for 512–154 signatures (left) and 1024–340 signatures (right).
Fig. 15. Comparison of the proposed methods over a superset query for 512 signatures.
Fig. 16. Storage overhead (left) and percentage of created overflow pages (right) of the five methods for 512-bit signatures and a 2K page size.
Fig. 17. An example of with F=12 and h=1.
Fig. 18. The structure of Fig. 17, after the hash table has been expanded.
Table 1. Symbol table

Table 2. Parameters used in experiments and the values tested
