DOI: 10.1145/3003733.3003801
research-article

Towards Automatic Structuring and Semantic Indexing of Legal Documents

Published: 10 November 2016

ABSTRACT

Over the last few years there has been a great increase in the number of freely available legal resources. Portals that allow users to search for legislation using keywords are now commonplace. However, in the vast majority of those portals, legal documents are not stored in a structured format with a rich set of metadata, but in a presentation-oriented manifestation, making it impossible for end users to query document semantics, such as date of enactment, date of repeal, or jurisdiction, or to reuse information and establish interconnections with similar repositories. In this paper, we present an approach for extracting a machine-readable semantic representation of legislation from unstructured document formats. Our method exploits common formats of legal documents to identify blocks of structural and semantic information and models them according to a popular legal meta-schema. Our proposed method is highly extensible and achieves high accuracy for a variety of legal and paralegal documents, especially legislation. Our evaluation results reveal that our methodology can be of great assistance for the automatic structuring and semantic indexing of legal resources.


  • Published in

    ACM Other conferences
    PCI '16: Proceedings of the 20th Pan-Hellenic Conference on Informatics
    November 2016
    449 pages
    ISBN: 9781450347891
    DOI: 10.1145/3003733

    Copyright © 2016 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 190 of 390 submissions, 49%
