Skip to main content
Log in

Efficient mining of sequential patterns with time constraints by delimited pattern growth

  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

An active research topic in data mining is the discovery of sequential patterns, which finds all frequent subsequences in a sequence database. The generalized sequential pattern (GSP) algorithm was proposed to solve the mining of sequential patterns with time constraints, such as time gaps and sliding time windows. Recent studies indicate that the pattern-growth methodology could speed up sequence mining. However, the capabilities to mine sequential patterns with time constraints were previously available only within the Apriori framework. Therefore, we propose the DELISP (delimited sequential pattern) approach to provide the capabilities within the pattern-growth methodology. DELISP features in reducing the size of projected databases by bounded and windowed projection techniques. Bounded projection keeps only time-gap valid subsequences and windowed projection saves nonredundant subsequences satisfying the sliding time-window constraint. Furthermore, the delimited growth technique directly generates constraint-satisfactory patterns and speeds up the pattern growing process. The comprehensive experiments conducted show that DELISP has good scalability and outperforms the well-known GSP algorithm in the discovery of sequential patterns with time constraints.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

References

  1. Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceedings of the 11th international conference on data engineering. Taipei, Taiwan, pp 3–14

  2. Ayres J, Gehrke JE, Yiu T, et al (2002) Sequential pattern mining using bitmaps. In: Proceedings of the 8th ACM SIGKDD international conference on knowledge discovery and data mining. Alberta, Canada

  3. Bettini C, Wang XS, Jajodia S (1998) Mining temporal relationships with multiple granularities in time sequences. Data Eng Bull 21:32–38

    Google Scholar 

  4. Garofalakis MN, Rastogi R, Shim K (1999) SPIRIT: Sequential pattern mining with regular expression constraints. In: Proceedings of the 25th international conference on very large data bases. Edinburgh, Scotland, pp 223–234

  5. Guralnik V, Garg N, Karypis G (2001) Parallel tree projection algorithm for sequence mining. In: Proceedings of the 7th international Euro-par conference on parallel processing, pp 310–320

  6. Han J, Pei J, Mortazavi-Asl B, et al (2000) FreeSpan: Frequent pattern-projected sequential pattern mining. In: Proceedings of the 6th ACM SIGKDD international conference on knowledge discovery and data mining, pp 355–359

  7. Lin MY, Lee SY (1998) Incremental update on sequential patterns in large databases. In: Proceedings of 10th IEEE international conference on tools with artificial intelligence. Taipei, Taiwan, pp 24–31

  8. Lin MY, Lee SY (2002) Fast discovery of sequential patterns by memory indexing. In: Proceedings of the 4th international conference on data warehousing and knowledge discovery (DaWaK02). Aix-en-Provence, France, pp 150–160

  9. Mannila H, Toivonen H, Verkamo AI (1997) Discovery of frequent episodes in event sequences. Data Min Knowl Discov 1:259–289

    Article  Google Scholar 

  10. Masseglia F, Cathala F, Poncelet P (1998) The PSP approach for mining sequential patterns. In: Proceedings of the 2nd European symposium on principles of data mining and knowledge discovery, vol 1510. Nantes, France, pp 176–184

  11. Oates T, Schmill MD, Jensen D, et al (1997) A family of algorithms for finding temporal structure in data. In: Proceedings of the 6th international workshop on AI and statistics. Fort Lauderdale, Florida, pp 371–378

  12. Pei J, Han J (2002) Constrained frequent pattern mining: A pattern-growth view. SIGKDD Explor 4:31–39

    Article  Google Scholar 

  13. Pei J, Han J, Pinto H, et al (2001) PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceedings of 2001 international conference on data engineering, pp 215–224

  14. Pei J, Han J, Wang W (2002) Mining sequential patterns with constraints in large databases. In: Proceedings of the 11th international conference on information and knowledge management

  15. Pinto H, Han J, Pei J, et al (2001) Multi-dimensional sequential pattern mining. In: Proceedings of the 10th international conference on information and knowledge management, pp 81–88

  16. Roddick JF, Spiliopoulou M (2002) A survey of temporal knowledge discovery paradigms and methods. IEEE Trans Knowl Data Eng 14:750–767

    Article  Google Scholar 

  17. Rolland P (2001) FlExPat: Flexible extraction of sequential patterns. In: Proceedings of the IEEE international conference on data mining 2001, pp 481–488

  18. Shintani T, Kitsuregawa M (1998) Mining algorithms for sequential patterns in parallel: Hash based approach. In: Proceedings of the 2nd Pacific–Asia conference on knowledge discovery and data mining, pp 283–294

  19. Srikant R, Agrawal R (1996) Mining sequential patterns: Generalizations and performance improvements. In: Proceedings of the 5th international conference on extending database technology. Avignon, France, pp 3–17 (an extended version is the IBM Research Report RJ 9994)

  20. Tsoukatos I, Gunopulos D (2001) Efficient mining of spatiotemporal patterns. In: Proceedings of the 7th international symposium of advances in spatial and temporal databases, pp 425–442

  21. Wang K (1997) Discovering patterns from large and dynamic sequential data. J Intell Inf Syst 9:33–56

    Article  Google Scholar 

  22. Zaki MJ (2001) SPADE: An efficient algorithm for mining frequent sequences. Mach Learn J 42:31-60

    Article  Google Scholar 

  23. Zaki MZ (2000) Sequence mining in categorical domains: Incorporating constraints. In: Proceedings of the 9th international conference on information and knowledge management. Washington, DC, pp 422–429

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ming-Yen Lin.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Lin, MY., Lee, SY. Efficient mining of sequential patterns with time constraints by delimited pattern growth. Knowl Inf Syst 7, 499–514 (2005). https://doi.org/10.1007/s10115-004-0182-5

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-004-0182-5

Keywords

Navigation