Introduction

Adequately powered randomized controlled trials (RCTs) are often considered the gold standard for physician decision making, although there is some controversy on this point [1, 2]. RCTs can be difficult to conduct for hematopoietic stem cell transplantation (HSCT) because of the limited numbers of eligible patients with suitable donors [3, 4]. As an alternative to RCTs, registry data are used as a reliable source of information for evaluating HSCT outcomes, and high-impact clinical evidence has accumulated from observational studies [5, 6]. However, potential information, selection, or publication biases have also been reported in observational data analyses for HSCT outcome studies [3, 7, 8].

National and international registry data and the outcome registries

International, national, or regional registries collect donor [9] and recipient clinical data in collaboration with unrelated donor banks or cord blood banks. Some major registries have structured systems that facilitate transplant outcome research; these also analyze data within the data center, and their scientific activities have led to a number of publications. National or international registries also collaborate to produce scientific evidence with a larger impact. In addition to scientific reports, registries often publish activity reports to outline regional transplant activities [10, 11].

The Japanese transplant registry and transplant society have an advanced data accrual system and infrastructure, and have developed registry study procedures, particularly over the past 10 years, by learning from more experienced major international registries. Japanese pediatric and adult transplant outcome registries were started in 1984 and 1993, respectively; prior to 2006, four different societies, the Japan Society for Hematopoietic Cell Transplantation (JSHCT), the Japanese Society of Pediatric Hematology, the Japan Marrow Donor Program, and the Japan Cord Blood Bank Network, separately collected recipient and donor clinical information. In 2006, the JSHCT data center unified paper-based data collection [12]. The unified registry is currently managed by the Japanese Data Center for Hematopoietic Cell Transplantation (JDCHCT) under the Japanese transplant law, the “Act for Appropriate Provision of Hematopoietic Stem Cells to be Used in Transplantations,” which went into effect in 2014 [13]. The activity of the registry is financially supported by the government under this Act. Allogeneic and autologous stem cell transplant data from Japanese centers are collected by the JDCHCT in collaboration with the JSHCT using the TRUMP (Transplant Registry Unified Management Program) electronic data capture system, as explained by Atsuta et al. in this issue of the journal.

The JSHCT Working Groups were established in 2010 to facilitate the use of registry data for outcome studies, and the number of published transplant registry studies from Japan has increased since then. The JDCHCT manages data collected by the TRUMP system and provides the data for JSHCT Working Group studies. Working Group members are volunteer JSHCT members, generally with at least 3 years of membership. Members can submit study proposals to the JSHCT through the leadership of one of the twenty-three Working Groups, and as of September 2015 the proposals are reviewed by the Data Management Committee of the JSHCT and the JDCHCT. Once a proposal is approved, the data are generally analyzed by the Working Group investigators or by collaborating statisticians for each study; a manuscript can be submitted after following a procedure defined by the Data Management Committee [12, 14]. Besides the JSHCT Working Group structure, investigators outside the Working Groups can also submit study proposals to the JDCHCT, and they may be advised to collaborate with Working Group investigators depending on the subject of the proposal.

The Asia-Pacific Blood and Marrow Transplantation Group (APBMT) collects data yearly on transplant activity in participating countries or regions [15, 16], and has published the APBMT Activity Survey every year since 2007 [17]. Depending on the country, the APBMT obtains data for this activity survey from national registries, from persons in charge of data, or directly from transplant centers (Table 1). Efforts to establish an Asian international outcome registry are ongoing, and data collection by the APBMT currently uses simpler forms with fewer variables than the Japanese form, to reduce the effort required of participating parties with limited resources.

Table 1 Data submitting countries/regions for APBMT activity survey [17]

The Center for International Blood and Marrow Transplant Research (CIBMTR) is a registry in the United States that collects data from domestic and international transplant centers [5, 18, 19]. The CIBMTR also supports studies using pre-transplant research samples of related and unrelated recipient-donor pairs. CIBMTR data are utilized for multicenter prospective clinical trial planning by the Blood and Marrow Transplant Clinical Trials Network (BMT CTN); the long-term follow-up mechanism of the CIBMTR is also used for continued follow-up in prospective trials [20, 21]. In its progress report, the CIBMTR states that its dataset contains information on over 390,000 recipients and that data on 20,500 new transplants are collected annually. These data are currently collected using an electronic data capture application (FormsNet3) [22]. The CIBMTR has developed a well-structured research support system over its long history. A volunteer member can propose a study to the CIBMTR Working Committees; the process of study review and approval, and the support available to investigators, are described in detail on its web site. The research activity of the CIBMTR is financially supported by grants from the National Institutes of Health (NIH), by a Health Resources and Services Administration (HRSA) contract, and by other sources [22]. At the time of grant application, the amount of funding for a U24 grant to support the CIBMTR operations was expected to exceed $18 million over the 5-year project period, according to the NIH web site [23].

The European Society for Blood and Marrow Transplantation (EBMT) [24] collects data using a web-based data management system similar to that used by the CIBMTR. The EBMT structures Working Parties to perform registry studies, and study procedures are also explained on its web site. The EBMT issues annual reports of transplant activities focused on different aspects of transplantation, such as general trends, alternative donor transplants, or transplants for pediatric recipients [25–29].

The major international registries, the CIBMTR and the EBMT, assign statisticians to Working Committees/Working Parties or to individual studies in order to review and assure the quality of analyses before publication. Such a structured statistical support system is not yet available in Japan, due in part to the limited number of qualified staff at the Japanese data center (Table 2) [30]; in most cases, physician investigators in the JSHCT Working Groups need to arrange on their own for statisticians to supervise or perform the analysis.

Table 2 Number of staff at the CIBMTR and JDCHCT

These international registries further collaborate to analyze pooled data from multiple registries and to produce larger-scale studies. Collaborative activities by international registries are not limited to producing scientific publications; data are also shared among major international registries, or between national and international registries, to reduce the effort of duplicate data submission. The Worldwide Network for Blood and Marrow Transplantation (WBMT), a non-governmental organization, was created by international registries, and its activities include networking and facilitating transplant-related activities [9] worldwide. The WBMT has conducted several global surveys to investigate global and regional trends in HSCT in relation to socioeconomic status, team density, and other factors [31–33].

Management of data quality of registry studies

Quality control in transplant data registries

Poor research design, lack of a formal analysis plan, or inappropriate data editing can lessen the value of study outcomes, and many registries, including the JSHCT, review study proposals to determine whether they are scientifically sound and ethically compliant. Management of data from the time of collection through final analysis is another critically important element of the clinical study process; the accuracy and completeness of data elements must be confirmed to improve the quality of conclusions drawn from analyses.

Electronic data capture (EDC) is thought to improve data quality in clinical trial data management [34], and many transplant registries use it to accumulate data online, for instance through the TRUMP2 system of the JDCHCT, rather than on paper case report forms. One advantage of an EDC system is that simple entry errors, missing entries, or extremely unusual values can be detected by the system, provided that the checks have been sufficiently programmed. Researchers who handle insufficiently checked or unchecked data may encounter values that seem clinically impossible: for instance, a post-transplant relapse date or engraftment date earlier than the date of transplant, or an unusually small number of pre-freeze umbilical cord blood cells. Although the utility of such approaches depends on the thoroughness of the logical check programs, a pre-defined EDC check system can automatically check each data field, or perform more complex checks across related fields in several forms, in real time as data are entered at each transplant center.
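The kind of logical check described above can be illustrated with a minimal sketch in Python. It is not the TRUMP2 or FormsNet3 implementation: the record structure, the field names, and the function `edit_checks` are hypothetical and chosen only for illustration. The sketch flags a missing transplant date and any engraftment or relapse date that precedes the date of transplant.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class TransplantRecord:
    # Hypothetical field names; these are not actual registry variables.
    recipient_id: str
    transplant_date: Optional[date]
    engraftment_date: Optional[date] = None
    relapse_date: Optional[date] = None


def edit_checks(rec: TransplantRecord) -> List[str]:
    """Return query messages for one record; an empty list means no flags."""
    queries: List[str] = []
    if rec.transplant_date is None:
        # Single-field check: a required entry is missing.
        queries.append("transplant_date is missing")
        return queries  # the cross-field checks below need this date
    for name in ("engraftment_date", "relapse_date"):
        value = getattr(rec, name)
        # Cross-field check: post-transplant events cannot precede transplant.
        if value is not None and value < rec.transplant_date:
            queries.append(f"{name} precedes transplant_date")
    return queries
```

In an EDC setting, a function of this kind would be run at the time of data entry so that the transplant center can correct the value immediately; range checks for unusual values (such as cell counts) would follow the same pattern, with limits defined by the registry.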

Automatic edit check programs in EDC systems can improve data quality; however, such checks are not sufficient for deeper clinical aspects, and oversight of data quality by people (e.g., data managers) is usually necessary. In prospective clinical trials, data managers at the central data center generally review the data manually as well, and send queries to participating sites to clarify discrepancies and correct data errors. Identifying systemic problems early is an integral part of data quality control, and providing feedback to transplant centers is also an important measure for continual improvement of data quality. Analysis of the Japanese registry data is currently not performed in the data center, but is generally done by individual investigators; even when investigators notice potential errors in the fixed data, no query pathway is available, so the potential errors cannot be verified with the transplant centers. Solving these issues may further improve study quality and increase the number of usable collected variables.

Data audit of registry

Two previous Japanese ethical guidelines for epidemiological or clinical studies, the Ethical Guideline for Epidemiological Research (Public Notice of the Ministry of Education, Culture, Sports, Science and Technology and the Ministry of Health, Labour and Welfare No. 1 of 2007) and the Ethical Guideline for Clinical Research (Public Notice of the Ministry of Health, Labour and Welfare No. 415 of 2008), were recently updated and merged into a new guideline, the Ethical Guidelines for Medical and Health Research involving Human Subjects [35], which has been implemented with a guidance document by the Ministry of Health, Labour and Welfare and the Ministry of Education, Culture, Sports, Science and Technology. Under the new guideline, monitoring is required for invasive interventional studies (except those involving only minor invasion). Transplant registries only collect observed clinical information without intervention, despite the invasive nature of HSCT, so monitoring is not required under the new guideline; in fact, neither monitoring nor auditing is conducted by the Japanese transplant registry. However, this does not mean that quality control of registry data is not useful or necessary [36].

The CIBMTR runs an on-site data audit program for international and domestic centers [37, 38]. Patients to be audited are randomly selected in the EDC system. As a rule, 16 recipients at each center are audited at each on-site audit [39]. Auditors visit each center once within a 4-year audit cycle, for 3–5 days, and compare the database data with patient medical records; consent forms are also checked for patients in the research database sample repository. After the audit, corrections are reflected in the registry database at the data center. In addition to the auditors' feedback to data management staff at the center during a closing meeting on the final day of an audit, the audit report is later sent to the medical director and data management staff at the transplant center. The passing error rate for critical fields, which are considered important for outcome research [40], is less than or equal to 3 %; a corrective action plan must be submitted by centers that do not achieve this standard. The first audit program cycle for the transplant registry in the US at the National Marrow Donor Program (NMDP) began in 1998, and the critical field error rate has decreased since that time [41].
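The two quantitative elements of this audit procedure, random selection of 16 recipients per center and the 3 % passing threshold for critical fields, can be sketched as follows. This is only an illustration under stated assumptions, not the CIBMTR's actual procedure: the function names are hypothetical, the error rate is assumed to be the number of discrepant critical fields divided by the number of critical fields checked, and centers with fewer than 16 recipients are assumed to be audited in full.

```python
import random

AUDIT_SAMPLE_SIZE = 16      # recipients audited per on-site audit, per the text
PASSING_ERROR_RATE = 0.03   # critical field error rate must be <= 3 %


def select_audit_sample(recipient_ids, seed=None):
    """Randomly choose recipients to audit at one center.

    Assumption: a center with fewer than 16 recipients is audited in full.
    """
    ids = sorted(recipient_ids)
    rng = random.Random(seed)  # a seed makes the selection reproducible
    return rng.sample(ids, min(AUDIT_SAMPLE_SIZE, len(ids)))


def critical_field_error_rate(n_errors, n_fields_checked):
    """Assumed definition: discrepant critical fields / critical fields checked."""
    if n_fields_checked <= 0:
        raise ValueError("no critical fields were checked")
    return n_errors / n_fields_checked


def passes_audit(n_errors, n_fields_checked):
    """True if the center meets the passing standard; otherwise a corrective
    action plan would be required."""
    return critical_field_error_rate(n_errors, n_fields_checked) <= PASSING_ERROR_RATE
```

For example, `select_audit_sample(center_ids, seed=2015)` would return a reproducible list of at most 16 recipient identifiers from a hypothetical collection `center_ids`, and `passes_audit(n_errors, n_fields_checked)` would then be applied to the discrepancy counts recorded by the auditors.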

The audit program in the US serves to improve the registry’s data quality, thanks to the efforts of both the data center and transplant centers [42]. Although the TRUMP registry does not have to be monitored or audited under the new Japanese ethical guidelines, it remains important for the Japanese transplant society and data center to continually consider ways of monitoring and improving data quality in order to further enhance research quality. To launch a similar audit program in Japan, personnel and travel expenses and the training of auditors would be major new burdens for the data center; for transplant centers, the staff time required for audits may also become a burden. Introducing stringent requirements across all forms of data may exceed the operating capacity of some Japanese transplant centers. To date, more than 1200 variables are collected by the Japanese registry. Deciding which of the collected variables are most significant, and focusing on improving their quality, could be a first step in implementing quality control or quality assurance measures by the registry.