1 Introduction

In general, better utilization of technology can improve the quality of life for many citizens of developing countries. However, one of the specific challenges facing scholars is accounting for the wide range of contexts in which regulations and policies appear (Bada and Madon 2006). Although several development strategies have been proposed, economic growth has remained the dominant perspective on development since the Second World War (Zheng 2009). Accordingly, the outcome of development is measured by gross national product or per capita income. The economic growth perspective and increasing technology usage have made financial information systems (FISs) pervasive, automating customer processes and data. FISs are often business-critical systems: if they fail to deliver their services as expected, serious problems and significant economic losses may result (Sommerville 2015). Such systems have become targets of security attacks because of the popularity of their online transactions, which require sharing financial data (Bada and Madon 2006).

In addition, poor infrastructure in less-developed countries gives attackers both the incentive and the ability to inflict significant harm on technology users (Ben-David et al. 2011). Emerging economies suffer proportionally more economic damage than developed ones from security attacks or cybercrimes (Ben-David et al. 2011). The costs of recovering from attacks form a much higher percentage of GDP for developing countries than for developed nations (Baker 2014). Research on this subject has encouragingly moved beyond the organizational level of analysis to explore much wider societal and technical issues. This has also made security a central concern in financial systems and a research hotspot (Cheng and Atlee 2007; Ben-David et al. 2011; Zafar et al. 2012; Osei-Bryson and Vogel 2014). Such systems, which are increasingly social too, should then be developed not only to address business goals or stakeholder needs but also to address security and privacy issues, because the boundary between sensitive and non-sensitive personal information is still blurring (Breaux 2014). The impact of announcements of cyberattacks on the stock market of publicly traded companies has been examined by Campbell et al. (2003). Moreover, security articles governing financial systems are difficult for users to interpret, and their ambiguity and complexity make it difficult for software engineers to use them as a guide for designing and implementing these systems (Breaux et al. 2006). Failing to comply with these regulations not only may lead to the disclosure of customers' confidential information, which could damage their reputations and finances, but also can cause tremendous economic loss and reputational damage for financial providers (Wu et al. 2012; Ebad et al. 2016). Thus, building a legally compliant system is an engineering problem and is recognized as a significant issue, especially in systems governed by law (Massey et al. 2010; Maxwell et al. 2012). This challenge comes from ambiguities and domain-specific definitions found in governmental rules (Breaux et al. 2006; Otto and Antón 2007).

Security regulations of Tadawul, the Saudi Arabian stock exchange, contain ambiguities that policy makers intend to be reinterpreted as business practices evolve and as the ability to comply with regulations changes over time. For instance, Article 7-2 states that “They (i.e., Tadawul’s personnel) should also take all reasonable precautions for properly switching off their computers”; the word “reasonable” is intentionally ambiguous, as it is unclear exactly which precautions are considered reasonable. This ambiguity comes from English words that cannot be removed from the legal requirements and that software engineers can map to different logical interpretations; it can only be interpreted in the context of organizational practices, goods, and services (Breaux 2009). Because financial data take many forms, such as stocks, mutual funds, commodities, and futures contracts, mining such data requires a combination of domain and technical knowledge plus an understanding of how the financial markets work and are manipulated (Boetticher 2006). Accordingly, there is a pressing need for methods to automatically formalize security texts, extract regulations, and enforce them throughout the supply chain. A well-known approach for such an analysis is the one proposed by Breaux and Antón (2008). To the best of our knowledge, most of the available works that formalize and extract rules and regulations use healthcare as a case study. In this study, we applied the Breaux and Antón approach to the security articles in the code of conduct of Tadawul. The approach is used here to extract formal descriptions of the Saudi rules that govern stakeholder actions from the regulations of Tadawul. In addition, because of economic and political complexities in Arab states, researchers cannot easily apply the same business models in these regions as those used in Western countries (Jakobsen 2013).

This paper is an early, novel attempt to apply concepts of the Breaux and Antón approach to the security requirements of non-healthcare regulations in an emerging economy, Saudi Arabia. This provides a conceptual lens for addressing the formalization process in the context of financial security requirements, which can contribute to improving security in such FISs. This validation is expected to assist system administrators and policy makers as well as requirements engineers. This article is structured as follows: Sect. 2 describes the related work. In Sect. 3, we present our methodology in detail. Section 4 presents the required background that forms the basis for the Breaux and Antón approach. In Sect. 5, we present and discuss the results of our implementation. Section 6 describes the readability analysis of Tadawul’s statements. We discuss the study’s limitations in Sect. 7. Finally, Sect. 8 summarizes the article and offers some directions for future work in this area.

2 Related Work

In this section, we describe previous work on formalizing regulations. Breaux et al. (2006) analyzed the U.S. Health Insurance Portability and Accountability Act (HIPAA) rules as a set of natural language statements classified as rights, constraints, or obligations; they later extended this work (Breaux and Antón 2008) by presenting a method to handle the cross-references appearing in the HIPAA text. Wu et al. (2012) proposed a framework to facilitate compliance analysis between healthcare systems and legal rules. Massey et al. (2010) discussed how requirements engineers can evaluate security software requirements for compliance with the relevant aspects of law in the form of a particular legal text. Specifically, they demonstrated their study of the iTrust requirements with respect to the HIPAA rules. May et al. (2006) transformed legal text into a format that uses the commands of a presented semantic structure to express policies. Once the legal rules had been expressed, they used the SPIN model checker to reason about the consistency of the generated rules, and they determined differences between the 2000 and 2003 versions of the HIPAA rules. To map legal requirements for sensitive information to a set of technical security requirements, Jensen et al. (2009) elicited security requirements from legislation applicable to the healthcare domain. These requirements are written to support reuse and provide traceability to the legislation from which they were derived. The authors used the European Data Protection Directive, and then listed and referred to the Norwegian legislation used in their work as an example of a national implementation of the Directive. Jorshari et al. (2011) presented a process that supports the extraction of requirements from relevant laws and legislation and their mapping to system security requirements. The approach follows the Hohfeld legal taxonomy (Hohfeld 1913) and natural language patterns (Breaux and Antón 2008) to model and analyze legal text, and Secure Tropos to extract and analyze security requirements (Mouratidis 2004); a smart card-based system was used as a case study to evaluate the usefulness of the proposed approach. Kargl et al. (2008) analyzed the security and privacy requirements of pervasive eHealth monitoring systems (PEMS) that use wireless sensor networks. Based on their experience developing their eHealth system, ReMoteCare, they devised a model of such systems, which they use to discuss threats and attacks, address security requirements, and give guidelines for security mechanisms. Ebad et al. (2016) applied the Breaux and Antón approach beyond U.S. healthcare by demonstrating its validity in an analysis of Saudi healthcare privacy regulations. Practical attempts, namely tools and techniques, at the natural-language processing of regulatory texts are considered by others (Brodie et al. 2006; Kiyavitskaya et al. 2007; Stamey and Rossi 2009).

In conclusion, formalizing and analyzing the security requirements of financial legal texts has attracted far less research attention than healthcare regulations or the development of practical tools. Most attempts have focused on healthcare regulations (Bada and Madon 2006; Breaux et al. 2006; May et al. 2006; Kargl et al. 2008; Massey et al. 2010; Wu et al. 2012; Ebad et al. 2016). Hence, there is a need to further validate the Breaux and Antón methodology, heuristics, and patterns within the context of financial regulations and aviation standards to determine their applicability beyond healthcare.

This paper applies the Breaux and Antón approach to formalize laws and regulations and to support the elicitation of security requirements of a financial system, demonstrating the applicability of the approach beyond healthcare. Without such validation, it is premature to automate the Breaux and Antón approach, which is currently performed manually. According to Beckers (2012), the U.S. has no central data protection law, but separate privacy laws such as those for finance, healthcare, and children. The same holds for Saudi Arabia: the Code of Conduct for the Personnel of Saudi Stock Exchange Company (Tadawul) covers financial information, and the Ethics of the Medical Profession covers medical information (Ebad et al. 2016). However, criminal violations of health regulations in the U.S. (i.e., HIPAA) can involve violations of other laws (Massey et al. 2010). To date, this does not happen in less-developed countries like Saudi Arabia. Although the adoption of these guidelines in less-developed countries is positive, because any kind of regulation is better than nothing, it is not enough unless the guidelines are implemented by the relevant software systems. Our study contributes to this direction.

3 Research Strategy

Case study is an empirical research strategy suitable for investigating contemporary phenomena that cannot be addressed through controlled experiments; it is commonly used in areas like political sciences, public administration, and social work (Yin 2013). Recently, the term “case study” has appeared in the titles of IT research papers (Runeson and Höst 2009). Our motivation for using case study methodology arises primarily from previous studies that have validated the Breaux and Antón approach using only healthcare documents. Moreover, we want to know how the security requirements of a financial system can be successfully formalized. Compared with other research strategies, e.g., surveys, histories, and analysis of archival financial records, the case study is more suitable for answering such a how question because it can examine how FISs could implement their legal regulations to achieve the necessary security. Herein, we apply the approach beyond healthcare by demonstrating its validity in the formalization of Saudi stock exchange security articles. Investigator triangulation was addressed by carefully documenting the study’s procedures while conducting the case study. In addition, the author is experienced in the analysis of such requirements in different contexts (Ebad et al. 2016). Contradictions, which limit reliability, and limitations, which limit external validity, are identified in Sect. 7, and are addressed and documented to the maximum extent possible. The units of analysis in our case study are the natural language characteristics, including words, phrases, sentences, passages, items, subsections, sections, and articles, in Tadawul’s code of conduct. We assume that the reader is familiar with the above characteristics of the case study strategy, such as units of analysis, triangulation, and analysis of archival records. Interested readers can consult Yin (2013) for further information.

3.1 Saudi Stock Exchange (Tadawul)

Our case study is from Saudi Arabia, which is the world’s largest oil producer and exporter and the largest holder of proven oil reserves. It has almost 20 percent of the world’s proven reserves and a leading role in OPEC, so its economy is considered the largest in the Middle Eastern region (Merdad 2012). Saudi Arabia is categorized as a high-income economy and is the only Arab country to be part of the G-20 major economies.

On the 19th of March 2007, the Council of Ministers in Saudi Arabia approved the formation of the Saudi Stock Exchange (Tadawul) Company. This was in accordance with Article 20 of the Capital Market Law, which establishes Tadawul as a joint stock company. Tadawul is the only stock exchange in Saudi Arabia and the largest in the Middle Eastern region. As of September 2, 2012, Tadawul listed 156 firms. In August 2010, the Capital Market Authority Council approved the code of conduct, which is consistent with best professional practices that can be harmonized with the distinctive characteristics of Tadawul in order to have a highly ethical and professional working atmosphere for its Personnel. Tadawul shall ensure that its Personnel are committed to abiding by the terms of the code of conduct to sustain the credibility, efficiency, and competence of its business and activities and to maintain its reputation in the capital market (Soldatkin and Astrasheuskaya 2011). Although the code of conduct consists of 16 articles, we focused in this research on the articles related to information security, namely Articles 6, 7, and 16. Because the main idea of Article 9 already exists in Article 6, we exclude it from our analysis. Tadawul’s code of conduct is available on the official website of Tadawul. The considered articles are reproduced in the Appendix.

4 Concepts of the Breaux and Antón Approach

The Breaux and Antón approach that we validate in this paper was conducted in the spirit of Grounded Theory (GT), in which a phenomenon is analyzed to develop an understanding of the present state of a specific subject of interest. GT is derived from data that have been gathered and analyzed systematically (Glaser and Strauss 1967). In particular, observations from a data set are relevant to that data set; the key points within a text of interest are marked with several codes. The codes are grouped into similar concepts to make them more workable. From these concepts, categories are formed, which are the basis for creating a theory. Therefore, our work herein is not based on a hypothesis or a distinct theory that we hope to accept or reject. The results of this kind of analysis are expected to benefit financial policy makers, requirements engineers, and information systems developers, especially in emerging economies, by providing more objective criteria for assessing Tadawul practices.

With the Breaux and Antón approach, the following three-step process should be repeated to extract rules:

  1. Identify a statement written in natural language that expresses rights, permissions, or obligations.

  2. Apply semantic parameterization to the statement to derive semantic models for the actors, actions, and objects of each statement.

  3. Derive rules with preconditions and effects built from the temporal constraints that are related to the semantic models.
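The first two steps can be illustrated with a minimal sketch. It assumes three hypothetical surface patterns ("shall have the right to", "shall/must not", "shall/must") and a simplified `Activity` record; these names and patterns are illustrative assumptions and are not the pattern catalogue of Breaux and Antón, whose process is performed manually by an analyst.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Activity:
    """Simplified semantic model of one statement (hypothetical structure)."""
    kind: str      # "right" or "obligation"
    negated: bool  # True for "shall not" / "must not" statements
    subject: str
    action: str
    obj: str


# Hypothetical patterns, tried in order so that the more specific forms win.
PATTERNS = [
    (re.compile(r"^(?P<subject>.+?) shall have the right to (?P<action>\w+) (?P<obj>.+?)\.?$"),
     "right", False),
    (re.compile(r"^(?P<subject>.+?) (?:shall|must) not (?P<action>\w+) (?P<obj>.+?)\.?$"),
     "obligation", True),
    (re.compile(r"^(?P<subject>.+?) (?:shall|must) (?P<action>\w+) (?P<obj>.+?)\.?$"),
     "obligation", False),
]


def parameterize(statement: str) -> Optional[Activity]:
    """Return an Activity if the statement matches a pattern, else None."""
    text = statement.strip()
    for pattern, kind, negated in PATTERNS:
        match = pattern.match(text)
        if match:
            return Activity(kind, negated, match["subject"], match["action"], match["obj"])
    return None


if __name__ == "__main__":
    print(parameterize("Personnel shall not disclose any confidential information."))
```

A statement that matches none of the patterns yields `None`, which mirrors the fact that not every sentence in a legal text expresses a right or an obligation.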

More details of the Breaux and Antón approach are not included here. We, therefore, assume that the reader is familiar with the basic concepts including: constraint, right, obligation, semantic parameterization, property types (subject, action, object, target, purpose), and rule tables (reference of rule, initial record of rule, extended record of rule), as well as the Grounded Theory (GT). Otherwise, interested readers can consult (Breaux and Antón 2008; Breaux 2009) for a comprehensive review of such details.

5 Analysis of Results

Applying the semantic parameterization process to the code of conduct of Tadawul yielded encodings for two rights, eight obligations, eighteen constraints, and seven rules. Some patterns were noticed that characterize rights, obligations, and constraints. They consist of a sequence of words and parts of speech such as verbs and adjectives. The patterns for encoding rights and obligations identify an actor, an action, and a relationship to other objects or activities. Furthermore, the patterns for encoding rules may contain constraints on personnel rights and obligations. Table 1 shows the two, eight, and ten natural language patterns that were identified to encode rights, obligations, and constraints, respectively, in the code of conduct.

Table 1 Patterns

The predicates produced by the semantic parameterization process distinguish properties, for example, the action, subject, and object of an activity. Rule tables are designed to parameterize the rules that allow or deny information access.

Based on patterns shown in Table 1, we extracted the following two rights, eight obligations, and eighteen constraints:

R1: Tadawul shall have the right to pursue any legal proceedings.

R2: Tadawul’s Compliance Committee shall be the authority with which the application of the terms of code of conduct.

O1: Personnel and Tadawul’s CEO are prohibited from practicing other work.

O2: Personnel are prohibited from offering consultation.

O3: Personnel shall not disclose any confidential information.

O4: Personnel shall take all necessary precautions to secure confidential information.

O5: Personnel shall comply with specific policies at the end of service.

O6: Personnel must not demand any information not-relating to their work.

O7: Tadawul shall maintain the confidentiality of the information.

O8: Tadawul’s Compliance Committee shall keep all disclosed information.

C1: Tadawul shall have the right to pursue any legal proceedings.

C2: Tadawul’s Compliance Committee shall be the authority with which the application of the terms of code of conduct.

C3: Work is outside Tadawul.

C4: Personnel should obtain Tadawul’s written approval.

C5: Personnel are prohibited from practicing work.

C6: Personnel are prohibited from offering consultation.

C7: Tadawul’s CEO is prohibited from practicing work.

C8: Personnel shall disclose confidential information.

C9: Disclosing information shall be officially.

C10: Personnel shall take all necessary precautions to secure confidential information.

C11: Personnel must refrain from disclosing confidential information.

C12: Personnel must stop using confidential information.

C13: Personnel must comply with the code of Law Practice.

C14: Personnel are at the end of service.

C15: Personnel must not demand any information.

C16: Information is relating to work.

C17: Tadawul shall maintain the confidentiality of the information.

C18: Tadawul’s Compliance Committee shall keep all disclosed information.

All of these rights, obligations, and constraints are transformed into the seven rules shown in Table 2. For further illustration, a fourth column, the statement level, is added; it merely rephrases the constraints using an if-else statement.

Table 2 List of rules
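To make the if-else reading of a rule concrete, the following minimal sketch evaluates one rule against a set of boolean facts. The way constraints C3 and C4 are combined here, and the fact names `work_is_outside_tadawul` and `has_written_approval`, are assumptions made for illustration only; they do not reproduce the actual contents of Table 2.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Facts = Dict[str, bool]


@dataclass(frozen=True)
class Rule:
    """A rule whose precondition selects between two outcomes (if-else form)."""
    name: str
    precondition: Callable[[Facts], bool]
    effect: str     # outcome when the precondition holds
    otherwise: str  # outcome in the else branch

    def evaluate(self, facts: Facts) -> str:
        if self.precondition(facts):
            return self.effect
        else:
            return self.otherwise


# Hypothetical reading: work outside Tadawul is prohibited
# unless written approval has been obtained (cf. C3 and C4).
outside_work = Rule(
    name="outside-work (illustrative)",
    precondition=lambda f: f["work_is_outside_tadawul"] and not f["has_written_approval"],
    effect="prohibited",
    otherwise="permitted",
)

if __name__ == "__main__":
    facts = {"work_is_outside_tadawul": True, "has_written_approval": False}
    print(outside_work.name, "->", outside_work.evaluate(facts))
```

Keeping the precondition as a plain function over named facts keeps each constraint visible, so an analyst could compare the encoded condition directly with the wording of the article it came from.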

For a single rule, there are one or more entities that could be queried. This entity is the value of the subject, target, object, or action. Therefore, we consider all of the expected values for the subject, target, object, and action.

Consider Rule 2, for instance, with the constraint C2, where C2 says “Tadawul’s Compliance Committee shall be the authority with which the application of the terms of code of conduct”. We can query two different entities, namely, the subject and object, as follows:

  • The Tadawul’s Compliance Committee (subject) can be queried regarding who might be the authority with which the application of the terms of code of conduct.

  • The terms of code of conduct (object) can be queried regarding what Tadawul’s Compliance Committee might be the authority for.
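The two queries above can be sketched as lookups over a rule record. The dictionary layout, the `query` helper, and the paraphrased action string are hypothetical; the sketch only shows how fixing one property of a record lets another property be retrieved.

```python
from typing import Dict, List

RuleRecord = Dict[str, str]

# Hypothetical record for Rule 2; the action text is a paraphrase.
RULE_2: RuleRecord = {
    "subject": "Tadawul's Compliance Committee",
    "action": "be the authority for applying",
    "object": "the terms of code of conduct",
}


def query(records: List[RuleRecord], ask: str, **known: str) -> List[str]:
    """Return the value of property `ask` from every record matching `known`."""
    return [
        record[ask]
        for record in records
        if ask in record and all(record.get(k) == v for k, v in known.items())
    ]


if __name__ == "__main__":
    # Who is the authority for the terms of the code of conduct?
    print(query([RULE_2], "subject", object="the terms of code of conduct"))
    # What is the Compliance Committee the authority for?
    print(query([RULE_2], "object", subject="Tadawul's Compliance Committee"))
```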

For every rule, we created three tables: (1) the reference of rule, which describes the rule and its reference; (2) the initial record of rule, which sorts the properties; and (3) the extended record of rule, which states the required query and its answer in the Value and Property columns, respectively. The goal of the tables is to expose details that will help software engineers design access control systems. Owing to space limitations, we merge the three tables of each rule into one.

Rule 1 (in Table 2), which represents right R1, is formalized through Table 3. Rule 2 (in Table 2), which represents right R2, is formalized through Table 4. Rules 3, 4, and 5 (all in Table 2), which represent the obligations O1, O2, and O3, are formalized through Tables 5, 6, and 7, respectively. Rule 6 (in Table 2), which represents obligations O4, O6, O7, and O8, is formalized through Table 8. Finally, Rule 7 (in Table 2), which represents obligation O5, is formalized through Table 9.

Table 3 Rule 1 tables
Table 4 Rule 2 tables
Table 5 Rule 3 tables
Table 6 Rule 4 tables
Table 7 Rule 5 tables
Table 8 Rule 6 tables
Table 9 Rule 7 tables

With the above rule tables, it is easier for software requirements engineers to address the ambiguity in each requirement of the Saudi stock exchange security articles. Unlike natural languages, these tables use first-order predicate logic (FOPL) as a formal method of interpretation, so that whether an expression is legal can be precisely determined. Another advantage relates to the ‘shall’ statements in the Tadawul text, for example in Articles 6-2, 7-1, 7-2, 7-5, and 16; such statements define what the system shall do (Sommerville 2015). In systems requirements engineering, the use of these statements is encouraged, as they express what is obligatory. By contrast, ‘shall not’ statements define system behavior that is unacceptable. These ‘shall not’ statements cannot be implemented directly but have to be decomposed into more specific ‘shall’ statements. For example, Article 7-1 states that ‘Personnel shall not disclose, announce or declare any such information to any other party’. This ‘shall not’ statement is decomposed into two ‘shall’ statements: ‘Personnel shall disclose confidential information’ and ‘Disclosing information shall be officially’. Table 7 shows this decomposition. Implementing the ‘shall’ and ‘shall not’ statements might shift professional and ethical responsibility onto machines or software rather than people.

With the above analysis, policy makers in Saudi Arabia can gain insight into the semantic relationships that create the meaning of their policy. Accordingly, they can recommend mechanisms to resolve conflicts and ambiguity, resulting in higher-quality policy statements that better conform to Tadawul’s requirements.

6 Readability Analysis

Tadawul’s personnel are becoming increasingly interested in knowing how to secure their information. Therefore, it is useful to assess security regulations in a way that tests their readability. Although Tadawul provides useful examples, the examples are not exhaustive, and it is up to the financial firm to make a subjective judgment with regard to readability. A more objective measure entails considering the reading skills of the target audience, as well as conducting a readability analysis of the statements to determine whether they are clear enough to be understood. The most widely used method is to employ a statistical, standardized readability metric that allows an objective evaluation. The Flesch Reading Ease Score (FRES) is a metric commonly used to evaluate both school texts and legal documents (Flesch 1949). It gives an approximate score for the difficulty of a text. The Flesch metrics have been commonly accepted benchmarks for decades. FRES rates texts on a 100-point scale, where a higher score indicates a simpler text. FRES computation depends on two sub-metrics: (a) the average sentence length and (b) the average number of syllables per word. Short words and sentences are easier to read and, therefore, produce a higher FRES. The Flesch Grade Level (FGL) determines the U.S. grade-school equivalency level of a text and is also based on these two sub-metrics. The metrics can be computed as follows:

$$ \text{FRES} = 206.835 - 84.6 \times \frac{\text{total syllables}}{\text{total words}} - 1.015 \times \frac{\text{total words}}{\text{total sentences}} \tag{1} $$
$$ \text{FGL} = 0.39 \times \frac{\text{total words}}{\text{total sentences}} + 11.8 \times \frac{\text{total syllables}}{\text{total words}} - 15.59 \tag{2} $$
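As an illustration of Eqs. (1) and (2), the following minimal sketch computes both scores from raw text. The syllable counter is a rough vowel-group heuristic and the sentence splitter is a simple punctuation rule; both are assumptions of this sketch and will not necessarily reproduce the counts used by Microsoft Word, so scores may differ from those reported in this paper.

```python
import re
from typing import Tuple


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> Tuple[int, int, int]:
    """Return (sentences, words, syllables) using simple splitting rules."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def fres(sentences: int, words: int, syllables: int) -> float:
    """Flesch Reading Ease Score, Eq. (1)."""
    return 206.835 - 84.6 * (syllables / words) - 1.015 * (words / sentences)


def fgl(sentences: int, words: int, syllables: int) -> float:
    """Flesch Grade Level, Eq. (2)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "Personnel shall not disclose any confidential information."
    stats = text_stats(sample)
    print(stats, round(fres(*stats), 1), round(fgl(*stats), 1))
```

Separating the counting step from the formulas means the heuristic counter can be replaced by a dictionary-based one without touching Eqs. (1) and (2).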

According to Flesch (1979), FRES scores can be interpreted as shown in Table 10:

Table 10 Interpretation of FRES scores

These tests are bundled with several word processing programs and services. We used Microsoft Office Word, a well-known word processor, to find the FRES and FGL scores. The FRES score was found to be 16.6, which is interpreted as ‘Very difficult to read and best understood by university graduates’. We ignored the FGL score because it is specific to the U.S. grade level. The resultant score differs from that of financial regulations in developed countries. For example, the average FRES and FGL of the financial policies of seven U.S. banks were 33.1 and 14.1, respectively, which were interpreted as ‘Difficult to read and best understood by university’ (Antón et al. 2004). In that study, the education level statistics for the adult U.S. population of Internet users were employed rather than those of the general population. In contrast, Tadawul’s readability score is low, despite the high usage and government subsidization of technology in Saudi Arabia (Ebad 2016). Figure 1 shows statistics indicating that 95% of Saudi organizations, including governmental establishments, corporations, and educational institutes, use computers (i.e., PCs, laptops, and tablets), and that 64% of them use e-services (e.g., e-government, e-learning, e-banking, e-payment, and online shopping) (Communications and IT Commission 2014). A potential explanation for this low score might relate not to the usage of technology but to the language of the regulations. These regulations were written in the native language of Tadawul’s personnel (i.e., Arabic), which is the official language of the Saudi Stock Exchange. The English version that we studied herein is the official translation of the Arabic text. Accordingly, if any Tadawul regulation is misunderstood, the Arabic text takes precedence. Unfortunately, we cannot calculate the above readability scores for Arabic text. To the best of our knowledge, this feature is not supported by the considered programs or tools. In general, shorter Tadawul regulations (e.g., Article 7-3-3) do not give additional information, but on average, they are no easier to read than their longer counterparts (e.g., Article 7-1).

Fig. 1 Some statistics about the current state of ICT sector in Saudi Arabia (Communications and IT Commission 2014)

7 Limitations and Contradictions

Transformation of legal text into logical elements is still a problem that must be addressed by researchers attempting to apply automated reasoning to data usage. Full automation in our context is not easy because logical consequence in FOPL is semi-decidable (Crick 2009). Full automation can only be applied to a decidable subset of FOPL using logic programming systems such as Prolog. The automation problem is a limitation that we noticed in the Breaux and Antón approach, which has not been validated mathematically using existing software tools. As we mentioned in Sect. 2, it is premature to automate the Breaux and Antón approach, which is currently performed manually, unless it can be applied beyond the healthcare context, as is done in this research. The other limitation comes from the use of English words that may be vague with respect to other words, for example the word “shall” (in Articles 6-2, 7-1, 7-2, 7-5, and 16) and the word “must” (in Articles 7-3-1, 7-3-2, and 7-4), as both words convey a sense of commitment. Any contradiction could limit the study’s reliability (Yin 2013). In particular, we only identify trivial contradictions by observing negation in semantic models, as in the case of R1, O2, and O4b. To reinforce our study’s reliability, investigator triangulation has been considered, as we mentioned in Sect. 3. We can also provide copies of the materials used in our study to other researchers interested in repeating it.
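To illustrate why restricting the logic matters, the following minimal sketch performs forward chaining over propositional Horn clauses, a fragment in which inference always terminates because only finitely many atoms can be derived. It is not the tooling envisaged by the approach, and the atom names are hypothetical encodings of constraints chosen only for this example.

```python
from typing import Iterable, List, Set, Tuple

# A Horn rule: if every atom in the body holds, the head holds.
HornRule = Tuple[Tuple[str, ...], str]


def forward_chain(facts: Iterable[str], rules: List[HornRule]) -> Set[str]:
    """Derive all consequences of `facts` under `rules`.

    Each pass either adds a new atom or stops, so the loop terminates
    after at most len(rules) + 1 passes.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and set(body) <= derived:
                derived.add(head)
                changed = True
    return derived


# Hypothetical atoms, for illustration only.
RULES: List[HornRule] = [
    (("work_outside_tadawul", "no_written_approval"), "work_prohibited"),
    (("work_prohibited",), "compliance_review_required"),
]

if __name__ == "__main__":
    print(sorted(forward_chain({"work_outside_tadawul", "no_written_approval"}, RULES)))
```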

8 Conclusion and Future Work

This paper has engaged with the debate on formalizing the security requirements of a financial information system in Saudi Arabia by applying concepts of the Breaux and Antón approach as a conceptual basis. The approach mainly depends on a semantic parameterization process that converts natural language descriptions into first-order predicate logic. To this end, we formalized the articles related to personnel information security from the code of conduct of the Saudi stock exchange (Tadawul) in Saudi Arabia, the largest oil producer worldwide. Security in Saudi Arabia is an issue because of the high-income Saudi economy. We also conducted a readability analysis of Tadawul’s regulations based on standard metrics. As a result of the case study, we extracted two rights, eight obligations, eighteen constraints, seven rules, and several rule tables. Additionally, most of Tadawul’s security regulations demand considerably more reading skill than the mere usage of ICT services does. Still, technology is relatively new and legislators are slow to agree on standards. To our knowledge, this work is an early, novel attempt to comprehensively formalize an entire text from financial regulations. The results of our formalization are expected to benefit regulators, legislators, policy makers, and information system administrators and developers by focusing their attention on those natural language policy semantics that are implementable in software systems. Moreover, the paper has reported that the Breaux and Antón approach is generalizable to different domains and countries. It is not simple for software engineers to ensure that their systems are compliant with all new legal requirements without stifling innovation. In future work, more effort is needed to address the automation process mentioned in the previous section using a hypothetical example of a legal document. This work, which is in progress, would be done by developing an automatic tool based on the Breaux and Antón approach.