Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Energy landscape analysis and time-series clustering analysis of patient state multistability related to rheumatoid arthritis drug treatment: The KURAMA cohort study

  • Keiichi Yamamoto ,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    yamamoto-k@cc.osaka-dent.ac.jp

    Affiliation Division of Data Science, Center for Industrial Research and Innovation, Translational Research Institute for Medical Innovation, Osaka Dental University, Hirakata City, Osaka, Japan

  • Masahiko Sakaguchi,

    Roles Conceptualization, Formal analysis, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Engineering Informatics, Faculty of Information and Communication Engineering, Osaka Electro-Communication University, Neyagawa City, Osaka, Japan

  • Akira Onishi,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Validation, Writing – review & editing

    Affiliation Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Sakyo, Kyoto, Japan

  • Shinichiro Yokoyama,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – review & editing

    Affiliation Oracle, Tokyo, Japan

  • Yusuke Matsui,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – review & editing

    Affiliation Oracle, Tokyo, Japan

  • Wataru Yamamoto,

    Roles Conceptualization, Data curation, Investigation, Methodology, Validation, Writing – review & editing

    Affiliations Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Sakyo, Kyoto, Japan, Department of Health Information Management, Kurashiki Sweet Hospital, Nakasho, Kurashiki, Kurashiki City, Okayama Prefecture, Japan

  • Hideo Onizawa,

    Roles Data curation, Investigation, Validation, Writing – review & editing

    Affiliation Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Sakyo, Kyoto, Japan

  • Takayuki Fujii,

    Roles Data curation, Investigation, Validation, Writing – review & editing

    Affiliation Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Sakyo, Kyoto, Japan

  • Koichi Murata,

    Roles Data curation, Investigation, Validation, Writing – review & editing

    Affiliation Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Sakyo, Kyoto, Japan

  • Masao Tanaka,

    Roles Data curation, Investigation, Validation, Writing – review & editing

    Affiliation Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Sakyo, Kyoto, Japan

  • Motomu Hashimoto,

    Roles Conceptualization, Investigation, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation Department of Clinical Immunology, Osaka Metropolitan University Graduate School of Medicine, Osaka City, Japan

  • Shuichi Matsuda,

    Roles Conceptualization, Supervision, Writing – review & editing

    Affiliation Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Sakyo, Kyoto, Japan

  • Akio Morinobu

    Roles Conceptualization, Supervision, Writing – review & editing

    Affiliation Department of Advanced Medicine for Rheumatic Diseases, Kyoto University Graduate School of Medicine, Sakyo, Kyoto, Japan

Abstract

Rheumatoid arthritis causes joint inflammation due to immune abnormalities, resulting in joint pain and swelling. In recent years, there have been considerable advancements in the treatment of this disease. However, only approximately 60% of patients achieve remission. Patients with multifactorial diseases shift between states from day to day. Patients may remain in a good or poor state with few or no transitions, or they may switch between states frequently. The visualization of time-dependent state transitions, based on the evaluation axis of stable/unstable states, may provide useful information for achieving rheumatoid arthritis treatment goals. Energy landscape analysis can be used to quantitatively determine the stability/instability of each state in terms of energy. Time-series clustering is another method used to classify transitions into different groups to identify potential patterns within a time-series dataset. The objective of this study was to utilize energy landscape analysis and time-series clustering to evaluate multidimensional time-series data in terms of multistability. We profiled each patient’s state transitions during treatment using energy landscape analysis and time-series clustering. Energy landscape analysis divided state transitions into two patterns: “good stability leading to remission” and “poor stability leading to treatment dead-end.” The number of patients whose disease status improved increased markedly until approximately 6 months after treatment initiation and then plateaued after 1 year. Time-series clustering grouped patients into three clusters: “toward good stability,” “toward poor stability,” and “unstable.” Patients in the “unstable” cluster are considered to have clinical courses that are difficult to predict; therefore, these patients should be treated with more care. Early disease detection and treatment initiation are important. The evaluation of state multistability enables us to understand a patient’s current state in the context of overall state transitions related to rheumatoid arthritis drug treatment and to predict future state transitions.

Introduction

Rheumatoid arthritis (RA) causes joint inflammation due to immune abnormalities, resulting in joint pain and swelling. In recent years, there have been considerable advancements in the treatment of RA, partly due to the development of drugs such as methotrexate (MTX), biologic disease-modifying anti-rheumatic drugs (bDMARDs), and targeted synthetic DMARDs (tsDMARDs) such as Janus kinase (JAK) inhibitors; furthermore, the “treat-to-target (T2T) algorithm”, in which treatment is periodically adjusted to a target disease index, has led to improvements in RA treatment [15]. However, even with these approaches, only approximately 60% of patients achieve remission. Therefore, 10–20% of RA patients who are treatment-refractory have been identified as having difficult-to-treat (D2T) RA. In 2020, the European League Against Rheumatism (EULAR) published the D2T RA EULAR Definition. The development of appropriate treatments for refractory patients is urgently needed [68].

Due to advances in information technology, a variety of digitized data from daily practice, such as electronic medical records and various laboratory test values, can be collected. Therefore, expectations regarding “real-world evidence” are increasing [911]. However, the aim of many clinical studies is to consider the whole treatment as a “single intervention” and to determine the effectiveness and safety of that intervention while eliminating bias as much as possible. Multidimensional time-series data over the entire course of treatment are rarely analyzed in a temporal manner. In contrast, daily practice requires that the patient’s condition be observed over time and that the treatment method be selected in a timely manner to optimize efficacy and survival time. In other words, in daily practice, the whole treatment is not considered a “single intervention.” Instead, the optimal treatment is selected based on the constantly changing state of the patient. To obtain high-quality real-world evidence, it is necessary to develop a method for analyzing multidimensional time-series data in a time-dependent manner over the course of treatment and providing information that is more consistent with decision-making in daily practice [12, 13].

In multidimensional time-series data analysis in medicine, there are examples of applying dynamic treatment regimens (DTRs) and deep learning. However, establishing DTRs requires data on all treatments beginning at the initial visit in addition to an intermediate covariate history. The computational requirements for model building are high. Therefore, DTRs are often used in clinical trials or observational studies where multiple treatment regimens that are predicted in advance are compared in a time-dependent manner [1417]. Furthermore, the logic of deep learning is not very transparent. In many cases, the rationale is unclear, thus making it difficult to implement deep learning in daily practice [1824].

The Kyoto University Rheumatoid Arthritis Management Alliance (KURAMA) cohort database of the Rheumatology Center of Kyoto University Hospital includes data on initial consultation, follow-up, blood tests, etc., from all patients with RA visiting the center. The database has both clinical and research applications. It includes information on more than 3,000 patient registrations with more than 40,000 disease activity data points and has already been used extensively for the analysis of drug administration and disease activity [2528]. The KURAMA cohort database includes high-quality, multidimensional time-series data and is considered to be suitable for time-series data analysis.

Energy landscape analysis is used to estimate a model distribution of maximum likelihood from data, making it possible to visualize the ease with which transitions occur between high- and low-energy states and thus enabling intuitive interpretation [29, 30]. Patients with chronic or multifactorial diseases shift between states from day to day. These patients may remain in a good or poor state with little change, or their condition may vary frequently. Energy landscape analysis can quantitatively evaluate the stability/instability of each state as energy, which is not possible with the conventional method of simply clustering state variables. By assessing stable/unstable states in addition to conventional good/poor states, energy landscape analysis makes it possible to determine when interventions are effective and which states are difficult to treat. Energy landscape analysis is often used in areas such as protein folding and stability analysis [3133] and is considered useful for visualizing the state transitions of patients in real-world practice.

Time-series clustering is another method used to classify transitions into different groups to identify potential patterns within a time-series dataset [3439]. Dynamic time warping (DTW) is a type of time-series clustering that measures the distance and similarity between time-series data by finding the shortest path, which can be identified by summing the distance (i.e., the absolute value of the error) between the points of two time series. DTW can determine the similarity of time series even if the length and period of the time series are different [35, 40].

We hypothesized that using energy landscape analysis and time-series clustering with DTW to evaluate the multidimensional time-series data in the KURAMA cohort would enable the visualization of transitions between time-dependent states in patients with RA during drug treatment, thus providing useful information for achieving the goals of RA treatment.

The purpose of this study was to utilize energy landscape analysis and time-series clustering with DTW to evaluate multidimensional time-series data in the KURAMA cohort in terms of multistability, thereby facilitating the achievement of RA treatment goals.

Materials and methods

Study design and participants

This single-center, retrospective, observational study utilized the KURAMA cohort database (S1 Checklist). The study protocol adhered to the principles of the Declaration of Helsinki and was approved by the Kyoto University Graduate School of Medicine Medical Ethics Committee through a central collective review (R2820), and written informed consent for study participation was obtained from all patients. The study received approval on March 30, 2021, granting access to the data from that point onward.

All patients who presented to the Rheumatology Center of Kyoto University Hospital and who met the 1987 or 2010 RA classification criteria [41] were enrolled in the KURAMA cohort study, and clinical and functional data were recorded at baseline and at each visit during the study. The inclusion criteria were as follows: patients with RA enrolled between January 1, 2011, and December 31, 2018; no previous medication history at the first visit; and onset of clinical remission within 3 years or follow-up for up to 3 years.

Variables

We defined a model of patient state transitions in RA drug treatment based on T2T. The patient’s state shifts from time to time. Fig 1 shows the ease and direction of transition of the patient’s state based on high and low energy in RA practice.

thumbnail
Fig 1. Patient state transition model in RA drug treatment.

The higher the energy is, the more unstable the state and the easier it is to obtain a therapeutic effect. At initial consultation, many patients present in a poor state and with high disease activity (i.e., high energy). With treatment, they may transition to a good state or stabilize in a poor state. The physician observes the patient’s state and executes a treatment strategy to stabilize the patient in a good state or to prevent stabilization in a poor state below a certain energy threshold. The objective of this study was to visualize the state transitions of these patients as a population.

https://doi.org/10.1371/journal.pone.0302308.g001

When energy is low, the patient is stable in a good or poor state. On the other hand, when the energy is high, the patient is unstable and transitions easily to another state. The high-energy state is considered to indicate the period of effective treatment. Individual states were evaluated using a comprehensive disease activity assessment. “Good stability” was defined as meeting functional remission criteria based on the Health Assessment Questionnaire (HAQ), and “poor stability” was defined as falling below the energy threshold without functional remission (i.e., treatment dead-end) [42]. The term “treatment dead-end” indicates that the patient will be not in remission, regardless of all potential future treatment sequences. In this study, the treatment period was 3 years, and the state of each patient was evaluated up to 12 times. In RA practice, the physician collects information about the patient’s state through visual examination, palpation, blood tests, and imaging tests. Variables related to disease activity, bone destruction, and immunological response are used in comprehensive disease activity assessments in daily practice and were used to characterize the patient’s state in this study, including:

  • Rheumatoid factor (RF)
  • Erythrocyte sedimentation rate at 1 hour (ESR1h)
  • Patient’s visual analog scale (PtVAS)
  • Doctor’s visual analog scale (DrVAS)
  • State of bone destruction according to Steinbrocker’s staging classification (STAGE) [43]
  • Swollen joint count– 28 joints (SJC28)
  • Tender joint count– 28 joints (TJC28)

The time-series data that served as input values for the energy landscape analysis were binarized as high activity (i.e., nonremission) = 1 and low activity (i.e., remission) = -1. RF and ESR1h were binarized based on blood test reference values. PtVAS, DrVAS, SJC28, and TJC28 were binarized based on Boolean remission criteria [44], and STAGE was binarized based on Steinbrocker’s staging classification [43]. The remission criteria were as follows:

  • RF = 15 #Unit: IU/mL
  • ESR1h (male) = 10 #units: mm
  • ESR1h (female) = 20 #units: mm
  • PtVAS = 10 #100mm VAS
  • DrVAS = 10 #100mm VAS
  • STAGE = 3 #stage 1, 2 or stage 3, 4
  • SJC28 = 1
  • TJC28 = 1

Since there are two patterns per factor (i.e., +1/-1), analyzing seven factors generates 128 (i.e., 2^7) activity patterns. The physician selected the optimal treatment according to this information (i.e., state number) (Fig 2).

thumbnail
Fig 2. Treatment selection by the Rheumatologist based on state number.

Physicians select the optimal treatment according to seven test results (i.e., RF, ESR1h, PtVAS, DrVAS, STAGE, SJC28, and TJC28) that are used in comprehensive disease activity assessments in daily practice.

https://doi.org/10.1371/journal.pone.0302308.g002

Medications for RA were reviewed at least every 3 months until treatment goals were achieved according to T2T. The drug therapies were categorized as follows: “conventional synthetic DMARDs (csDMARDs) including MTX”; bDMARDs, including “cytotoxic T lymphocyte antigen 4-immunoglobulin G (CD80/86)”, “interleukin-6 (IL6) inhibitor,” “tumor necrosis factor (TNF) inhibitor”; and “tsDMARDs (JAK)”.

Analytical methods

Overview of the analytical methods.

Energy landscape analysis was used to assess and visualize state multistability as “energy.” Dynamic systems were formulated with a Boltzmann machine that used the Boltzmann distribution with energy to define the probability distribution. The Boltzmann machine reduced multidimensional data to a measurement considered to represent “energy.” An index of multistability was assigned to each state to identify whether the disease condition was improving or worsening. Furthermore, multidimensional time-series clustering of patients enabled us to classify and visualize the time-series transitions of unstable and stable states.

Energy landscape analysis and the Boltzmann machine.

For each patient (s), there are seven different factor values (i) observed at each of the 12 time points (t) every 3 months over 3 years. Let and T = {1,2,…,11,12}. Each factor xi(t) is binarized (-1 or +1) according to the clinical criteria for i∈Ω and tT.

We apply a Boltzmann machine [45] that extends the deterministic dynamics of the Hopfield model, a well-known model of associative memory, to stochastic dynamics.

The Boltzmann machine is defined as a multidimensional Boltzmann distribution. The definition of the distribution includes energy. A stochastic model on an undirected graph G (Ω, E) is defined, where E is the set of (i,j) links between nodes i,j∈Ω. where xi = −1 or +1, vector θ = {θi}, and matrix W = {wij} for i,j∈Ω. G (Ω, E) are assumed in the link structure of the Boltzmann machine, such that wij = wji and wii = 0. The third equation is called the energy function. The second equation is the normalizing constant of the first equation. According to the definition of the first equation, this Boltzmann machine is a mathematical model in which the energy function is smaller than the monotonically increasing nature of the exponential function, and the activity pattern appears with higher probability.

The parameters θ and W are trained to match the probability of occurrence of the actual observed data. For each person sS, the observation vector obtained at each time tT is assumed to be generated from an independent homoscedastic distribution. For the obtained dataset , the maximum likelihood method is applied to estimate the parameters s θ and W that satisfy the following equation:

The parameter that maximizes this log-likelihood function, log LD(θ,W), is obtained using the gradient ascent method as follows: where Eold is the expected value from the Boltzmann distribution using the previous parameter in the dataset, X(i) is the i-th element of X,i,j∈Ω, |S| is the number of elements in S, and ε = 0.2 is the learning rate. Up to 5,000,000 iterations were performed.

Disconnectivity graph.

First, a minimal activity pattern X = {xi|i∈Ω} is defined such that for any activity pattern Y with one different node in X, Φ(Y)≧Φ(X) holds for the given parameters s θ and W. Next, a path a with X and Y is defined. Let A(X,Y,a) = {Z|Z transform X one node at a time until it can be transformed into Y}. Note that this set exists for the number of paths a. Finally, the energy of the highest hill to be surmounted by path a is . The energy barrier of the transition from X to Y is defined as [29, 46, 47].

Time-series clustering.

We chose the variables yid(t) based on the activity pattern and chose the energy and HAQ for patient id and time t. Clustering was performed using the k-means method with distances based on the dynamic contraction method toward time-series vectors , where for id = 1,2,⋯,N, T is the time point, and N is the number of samples [48].

The number of clusters K was determined by the elbow method or silhouette analysis [49, 50]. The initial central value of cluster k was randomly chosen for . The following method of minimizing the within-cluster sum of squared errors of prediction (SSE) was used. Iterations by the k-means method were performed up to 50 times. The center value was updated by the mean vector in the cluster.

The distance function DTW: of two time-series vectors is defined as follows: for u<S and v<S, where |・| is the Euclidean distance. The DTW algorithm [51] for calculating the distance between two time series uses least-cost elasticity matching, which does not allow the time series to intersect.

Analysis environment and preprocessing

Oracle Autonomous Data Warehouse and Oracle Cloud Infrastructure Data Science were used as analysis environments. The Statistics and Machine Learning Toolbox and the Energy Landscape Analysis Toolbox v1.2 [29] of MATLAB R2022b were used for energy landscape analysis.

For missing value completion, STAGE, which comprises categorical data, was substituted for the before and after data. All other factors were numerical data, and linear interpolation of time-series data was used; for patients with fewer than 12 points, data were generated in the same way as for missing value imputation above. For example, for patients who entered remission or who withdrew from the study, data were generated in the same way as for the final point until the patients reached 3 years of remission. In the case of missing time-series points, categorical STAGE data were assigned before and after the values, and the numerical data were linearly interpolated.

Results

Participants and descriptive data

A flowchart of the subject selection, the 3-year trends for the seven factor values, and the distribution of the drug therapies administered are shown in Fig 3.

thumbnail
Fig 3. Subject selection flowchart, binarized averages of the seven factors, and distributions of administered drug therapies.

(A) The subject selection flowchart. (B) Binarized averages of the seven factors over 3 years. (C) 3-year trends in the distributions of administered drug therapies. The horizontal axis shows the number of points (0–11) at which the treatment effect was assessed (every 3 months). The 3-year average values of the seven factors were as follows: DrVAS = -0.11, Pt VAS = 0.49, TJC28 = -0.26, SJC28 = -0.22, STAGE = -0.17, RF = 0.51, and ESR1h = 0.03. Regarding the percentages of patients receiving each drug therapy, the 3-year averages were as follows: csDMARDs = 54%, CD80/86 = 10%, IL6 = 9%, TNFα = 26%, and JAK = 1%.

https://doi.org/10.1371/journal.pone.0302308.g003

From the 3,439 individuals enrolled in the KURAMA cohort from 2011 to 2018, we excluded 2,401 individuals who were duplicates or who lacked laboratory values or medication information within 3 years of their first visit. In addition, we excluded 441 patients with a history of past medication use at the time of their first visit. Thus, 597 patients were ultimately included in the analysis.

The baseline characteristics of the 597 subjects are shown in Table 1.

Results of the energy landscape analysis

The parameters of the Boltzmann machine were as follows:

θ = -0.4259–0.1940–0.2995–0.0313 0.7974–0.0784 0.6501

W = 0 0.1922 0.0028 0.5739 0.2281–0.0108 0.0189

0.1922 0 0.2048 0.5580 0.0284 0.1376–0.0332

0.0028 0.2048 0 0.1392 0.1422 0.0141 0.1811

0.5739 0.5580 0.1392 0 0.2834 0.1383 0.0793

0.2281 0.0284 0.1422 0.2834 0 0.0850–0.0103

-0.0108 0.1376 0.0141 0.1383 0.0850 0 0.2284

0.0189–0.0332 0.1811 0.0793–0.0103 0.2284 0

Energy landscape analysis (Fig 4, Table 2, and S1 Table) revealed that patient state could be divided into two fixed disease state patterns: “good stability leading to remission” (blue) and “poor stability (i.e., treatment dead-end) (red).

thumbnail
Table 2. Factor breakdown per state number (High activity: Nonremission = 1; low activity: Remission = -1), energy per state number, next transition state number, and minimal state number.

https://doi.org/10.1371/journal.pone.0302308.t002

Fig 4A shows the activity patterns. The numbers shown in the directed graph in Fig 4A are unique state numbers. The scale to the right indicates the high and low energy values of each state. The higher the energy value is, the more unstable the state is and the easier it is to transition to another state. On the other hand, the lower the energy value is, the more fixed the state is and the harder it is to transition to another state. In other words, higher energy values indicate that the patient is more responsive to drug treatment or that the rate of disease deterioration exceeds the effect of drug treatment. On the other hand, lower energy levels mean that the patient is more likely to remain in the current state and that the disease is less likely to improve or worsen.

Fig 4B shows whether each of the seven factors met the remission and nonremission criteria according to the state numbers that take the minimal energy of the patterns of good stability and poor stability. The factors shown in black met the remission criteria, while those in white did not. In the “good stability” pattern, all factors except RF and PtVAS met the remission criteria. On the other hand, in the “poor stability” pattern, none of the factors reached the remission criteria.

The patient’s state could switch between the two patterns as a result of drug treatment effects or other factors. Fig 4C shows the threshold value of the energy at which pattern transitions can occur (see: “Disconnectivity graph”).

Fig 4D shows the distributions of state numbers and energies. The horizontal axis is the energy value, and the vertical axis is the number of states. The line in red indicates the energy threshold (-1.48) at which the pattern can move between states. The threshold value was approximately -1.48. Approximately 85% (109/128) of the state numbers had energy values that could be transferred between patterns.

The state numbers were aggregated into four quadrants at the energy threshold, and the number of patients per quadrant was calculated (Fig 5).

thumbnail
Fig 5. Remission trends by number of people and HAQ score in the four quadrants.

https://doi.org/10.1371/journal.pone.0302308.g005

Fig 5A divides energy and pattern into four quadrants by threshold. The quadrant in which the energy is lower than the threshold and the activity pattern is “good stability” is G-L; the quadrant in which the energy is higher than the threshold and the activity pattern is “good stability” is G-H; the quadrant in which the energy is higher than the threshold and the activity pattern is “poor stability” is P-H; and the quadrant in which the energy is lower than the threshold and the activity pattern is “poor stability” is P-L.

Fig 5B shows the number of patients in each quadrant, with a marked increase in G-L and a decrease in P-L until approximately two points (i.e., 6 months), followed by a gradual increase or decrease. P-L decreased until approximately two points (i.e., 6 months), with no significant change thereafter. G-H decreased significantly throughout the entire period. G-H did not increase or decrease significantly across the entire treatment period.

Fig 5C shows the HAQ scores in each quadrant: G-L patients met the remission criteria for almost the entire period; the number of P-L patients who met the remission criteria decreased over time; the number of G-H and P-H patients who met the remission criteria changed minimally over time; and the number of P-H patients who met the criteria did not change over time.

In terms of overall treatment, the number of patients who responded to drug therapy and who demonstrated an improved disease status increased markedly until approximately 6 months after the start of treatment, after which the rate of increase gradually slowed. The rate of change plateaued after 1 year.

Results of time-series clustering

Individual patient state transitions are highly variable, making it difficult to identify patterns across the population (S1 Appendix). Time-series clustering was performed based on activity patterns, energy, and HAQ scores, and patient profiling was performed for each cluster. The number of clusters was set to “3” based on comprehensive analysis using the elbow method and silhouette technique [48, 49] (Fig 6 and Table 3).

thumbnail
Fig 6. Results of time-series clustering.

(A) Percent changes in the numbers of patients in the following three clusters: cluster 0, “toward good stability”; cluster 1, “toward poor stability”; and cluster 2, “unstable.” (B) HAQ scores. c Energy values.

https://doi.org/10.1371/journal.pone.0302308.g006

thumbnail
Table 3. Detailed characteristics of patients in the three clusters, based on data obtained at initial examination.

https://doi.org/10.1371/journal.pone.0302308.t003

Cluster 0 (toward good stability) (blue) was defined as patients with low energy, 80% of whom eventually assumed a good stability pattern and became stable. Cluster 1 (toward poor stability) (red) was defined as patients with low energy, almost all of whom were characterized by the “poor stability” pattern and had high HAQ scores. Cluster 2 (unstable) (green) consisted of patients with relatively high energy, almost all of whom demonstrated a “poor stability” pattern in the early stage of treatment. For the patients in Cluster 2 (unstable)”, the treatment response was not yet determined, and the majority of them met the remission criterion based on the HAQ score, although a high proportion of these patients exhibited a “poor stability” pattern in the early stage of treatment. Although the age at disease onset did not differ significantly between clusters, the age at initial diagnosis, duration of disease, and laboratory tests at initial diagnosis did differ, with the “poor stability” cluster (Cluster 1) showing a greater tendency for age and all laboratory values at initial examination than the other clusters.

The state transitions of individuals from each cluster are shown in the S2 Appendix.

Discussion

Key results

The whole treatment course of patients in the KURAMA cohort was analyzed in a time-dependent manner.

Energy landscape analysis revealed that state transitions were divided into two patterns: “good stability leading to remission” and “poor stability below the energy threshold without functional remission (i.e., treatment dead-end).” The energy threshold cutoff for switching between patterns was approximately -1.48. The states were aggregated into four quadrants based on this energy threshold, and the number of patients in each quadrant was calculated. The number of patients who demonstrated status improvement increased markedly until approximately 6 months after the start of treatment, after which the rate of increase gradually slowed. The rate of change plateaued after 1 year.

Patient profiling by time-series clustering showed that patients were classified into three clusters: “toward good stability,” “toward poor stability,” and “unstable.” Patients in the “unstable” cluster are considered to have clinical courses that are difficult to predict in RA practice; therefore, these patients should be treated with more care. Age at initial diagnosis, duration of disease, and laboratory values at initial diagnosis tended to differ by cluster, with the “toward poor stability” cluster exhibiting a higher age and higher laboratory values at initial examination than the other clusters. Compared to the “toward good stability” cluster, the “unstable” cluster had a higher age at initial examination and a longer duration of illness, but the “unstable” cluster tended to have better laboratory values at initial examination. The difference between these two clusters was marginal; however, there was a possibility that the initiation of effective treatment was delayed in the “unstable” cluster due to the relatively favorable initial conditions at the time of the first visit. Further research is warranted; however, the results of this study indicate the importance of early disease detection and treatment initiation, especially in “unstable” cases.

Energy landscape analysis was performed to visualize the state transitions of the patient population as a whole. However, this approach alone did not allow for the visualization of individual patient state transitions. We utilized time-series clustering to clarify the specific state transitions occurring within the given population. These methodologies complemented each other, revealing insights into both the collective state transitions of the population and the characteristic attributes of patients within that population. Traditionally, treatment decisions have been made based on past treatment histories and the latest evaluation of disease activity. Energy landscape analysis and time-series clustering analysis could yield insights regarding how a patient’s state might transition in the future and the optimal timing for effective treatment.

Limitations

This was a single-center observational study at Kyoto University Hospital that used registry data from 2011 to 2018. This study also did not evaluate the effectiveness of state transition visualization in daily RA drug therapy.

Interpretation

Currently, RA treatment practices are based on practice guidelines, and treatment approaches are determined based on evaluations of the efficacies of previously administered agents. Specific treatment strategies for individual patients are determined based on data obtained from comprehensive disease activity assessments and on the treatment history from the initial diagnosis to the present, and future state transitions rely on speculation based on physicians’ clinical experience. In our study, patient response to drug therapy and improved disease status significantly increased for approximately 6 months after the start of treatment and then plateaued after 1 year. Furthermore, our findings demonstrated that the patient population could be divided into three clusters: “toward good stability,” “toward poor stability,” and “unstable.” We quantitatively clarified the timing and duration of significant treatment effects through time-dependent analysis of the state of patients with RA, which is ever-changing due to drug therapy. In addition to the conventional assessment of overall disease activity, the evaluation of energy (i.e., state multistability) enables us to understand the relevance of the current state in the overall state transition related to RA drug treatment and to predict future state transitions. In the realm of artificial intelligence (AI) applications in healthcare, there is a fascinating parallel with the domain of chess, shogi, and go AI, where professionals occasionally find themselves defeated by seemingly unorthodox moves beyond human comprehension [52]. Similarly, in the context of real-world practice visualization derived from our methodologies, the recommended treatment may significantly deviate from established clinical guidelines. Despite low disease activity, AI could suggest early and intensive administration of biologic agents. In such cases, it becomes imperative to assume that the proposed therapeutic approach of AI holds merit and to therefore perform rigorous validation through randomized controlled trials (RCTs). Our research suggests the emergence of an innovative approach to the future landscape of medical research and practice, tentatively termed “AI-based RCTs,” to systematically investigate and corroborate the efficacy of AI-suggested clinical interventions. In the context of our study, for patients who are likely to experience treatment dead-end, it may be necessary to select a more effective treatment, such as a biologic, at an earlier stage. For patients whose condition improves, it may be possible to offer drug discontinuation or dose reduction. Clinical research involving “unstable” patients may allow for more effective clinical evaluation with smaller numbers of patients. Furthermore, state transition visualization may be a useful tool for facilitating communication between physicians and patients. Ultimately, by selecting treatments according to a patient’s cluster and their state transitions, it may be possible to adopt a personalized medicine approach targeting each patient’s state. This study suggested the possibility of optimizing the treatment plan based on the whole treatment course.

Generalizability

This was a single-center observational study. Kyoto University Hospital is a tertiary hospital and may accommodate patients with more severe illness than general hospitals and clinics; therefore, a multicenter study is needed. In addition, since this study was based on data collected in 2018, new drugs such as JAK inhibitors were likely to be used less frequently. Therefore, data from 2019 and beyond should be examined in future studies. Analyzing more recent data will enable the visualization of state transitions of RA drug therapy that more accurately reflect the reality of daily practice.

We believe that our proposed approach is useful as a visualization method for multidimensional time-series data in medicine and can be applied to diseases other than RA. In general, it is difficult to intuitively interpret multidimensional time-series data in medicine. We have made it possible to readily interpret state transitions in the drug treatment of patients with RA by consolidating the seven factors used in comprehensive disease activity assessment in daily practice into two dimensions, namely, state number and energy, and clustering along the time course. In many diseases, even the best treatment does not lead to a cure or remission; therefore, conservative treatment is used to prevent deterioration and maintain a good state. For multifactorial diseases such as diabetes and hypertension, target treatment values such as HbA1C and blood pressure levels are set by guidelines, and treatment strategies are chosen based on concepts equivalent to T2T in RA [53, 54]. Various factors, such as diet, lack of exercise, and stress, affect the condition of patients with multifactorial diseases. Nonetheless, advances in digital health technology have led to increased adoption of personal health records and health-related apps; therefore, the collection of frequently measured time-series sensor data from wearable devices is becoming increasingly feasible [5557]. We believe that our method may provide useful information for optimizing treatment plans to achieve behavioral change and personalized medicine by reducing the dimensionality of multiple factors and visualizing state transitions.

Considering the application of this research to personalized medicine in daily practice, it is necessary to develop an application that can calculate energy based on these seven factors and display its position in the state transition pattern in real time (Fig 7).

thumbnail
Fig 7. Examples of future real-world clinical implementations.

After the inspection is completed, the seven factors are entered into a tablet by the physician, and the state number, energy, and coordinate of the multistability score in the state transition are displayed. In this example, the energy is higher than the threshold value and is in a transitional state; therefore, the prospects for remission are considered adequate. However, the patient had a poor-stability cluster and is currently using csDMARDs. A change to a more effective drug may be considered.

https://doi.org/10.1371/journal.pone.0302308.g007

In addition, standardization of medical practice is necessary for high-quality multidimensional time-series data collection. Conventional medical practice involves interviewing each patient and performing necessary tests on a case-by-case basis. In the KURAMA cohort, all physicians used a common medical questionnaire to record patient states over time under certain standards and quality controls. It is necessary to standardize data and improve the medical system to enable the collection of high-quality time-series data that are useful for both medical care and analysis while balancing routine medical care and data analysis and eliminating unnecessary burdens as much as possible [58, 59].

Conclusions

This study suggested that evaluating state multistability and determining the patient’s state in daily practice may enable treatment plan optimization over the entire course of treatment. We believe that this study will contribute to the development of personalized medicine utilizing real-world data.

Supporting information

S1 Checklist. STROBE statement—checklist of items that should be included in reports of observational studies.

https://doi.org/10.1371/journal.pone.0302308.s001

(DOC)

S1 Table. List of the state transition probabilities of all 597 patients.

https://doi.org/10.1371/journal.pone.0302308.s002

(DOCX)

S1 Appendix. The state transitions of 10 randomly selected individuals.

The vertical axis indicates high and low energy. For the horizontal axis, the central vertical line divides the groups according to whether they are likely to transition to good stability or to persist in poor stability, according to the energy landscape analysis, and the solid red vertical line indicates the energy threshold (i.e., -1.48).

https://doi.org/10.1371/journal.pone.0302308.s003

(DOCX)

S2 Appendix. The state transitions of 10 randomly selected individuals from each cluster.

Transitions in the good stability (A), poor stability (B), and unstable (C) clusters. Individuals in cluster 1 (poor stability) tend to have low energy and to remain in the poor stability quadrant. On the other hand, those in cluster 2 (unstable) generally have higher energy and tend to move between the good stability and poor stability quadrants.

https://doi.org/10.1371/journal.pone.0302308.s004

(DOCX)

Acknowledgments

The authors would like to express their deepest gratitude to Dr. Satoshi Teramukai of Kyoto Prefectural University of Medicine, the staff of Wakayama Medical University, and Dr. Shunsuke Baba of Osaka Dental University for their helpful advice in conducting this study. We would like to extend our heartfelt gratitude to Haruo Horii, Naohiro Ito, and Masatoshi Fujii for their generous support for the KURAMA cohort.

References

  1. 1. Japan College of Rheumatology Guidelines for the Treatment of Rheumatoid. Minds 2020 [cited 7 February 2024]. Available from: https://minds.jcqhc.or.jp/docs/gl_pdf/G0001272/4/Rheumatoid_arthritis.pdf.
  2. 2. EULAR. EULAR recommendations: recommendations for management 2022 [cited 27 February 2024]. Available from: https://www.eular.org/recommendations-management.
  3. 3. American College of Rheumatology Guideline for the Treatment of Rheumatoid Arthritis. Rheumatoid arthritis guideline 2021 [cited 27 February 2024]. Available from: https://www.rheumatology.org/Portals/0/Files/2021-ACR-Guideline-for-Treatment-Rheumatoid-Arthritis-Early-View.pdf.
  4. 4. Smolen JS, Aletaha D, Bijlsma JW, Breedveld FC, Boumpas D, Burmester G, et al. Treating rheumatoid arthritis to target: recommendations of an international task force. Ann Rheum Dis. 2010;69: 631–637. pmid:20215140
  5. 5. Scott DL, Wolfe F, Huizinga TW. Rheumatoid arthritis. Lancet. 2010;376: 1094–1108. pmid:20870100
  6. 6. Nagy G, Roodenrijs NMT, Welsing PM, Kedves M, Hamar A, Van Der Goes MC, et al. EULAR definition of difficult-to-treat rheumatoid arthritis. Ann Rheum Dis. 2021;80: 31–35. pmid:33004335
  7. 7. Roodenrijs NMT, Hamar A, Kedves M, Nagy G, Van Laar JM, Van Der Heijde D, et al. Pharmacological and non-pharmacological therapeutic strategies in difficult-to-treat rheumatoid arthritis: a systematic literature review informing the EULAR recommendations for the management of difficult-to-treat rheumatoid arthritis. RMD Open. 2021;7: e001512. pmid:33419871
  8. 8. Watanabe R, Hashimoto M, Murata K, Murakami K, Tanaka M, Ohmura K, et al. Prevalence and predictive factors of difficult-to-treat rheumatoid arthritis: the KURAMA cohort. Immunol Med. 2022;45: 35–44. pmid:34033729
  9. 9. U.S. Food and Drug Administration (FDA). Framework for FDA’s real-world evidence program 2018 [cited 27 February 2024]. Available from: https://www.fda.gov/media/120060/download.
  10. 10. Maissenhaelter BE, Woolmore AL, Schlag PM. Real-world evidence research based on big data: motivation-challenges-success factors. Onkologe (Berl). 2018;24: 91–98. pmid:30464373
  11. 11. Schad F, Thronicke A. Real-world evidence-current developments and perspectives. Int J Environ Res Public Health. 2022;19: 10159. pmid:36011793
  12. 12. Liu S, See KC, Ngiam KY, Celi LA, Sun X, Feng M. Reinforcement learning for clinical decision support in critical care: comprehensive review. J Med Internet Res. 2020;22: e18477. pmid:32706670
  13. 13. Liu N, Liu Y, Logan B, Xu Z, Tang J, Wang Y. Learning the dynamic treatment regimes from medical registry data through deep Q-network. Sci Rep. 2019;9: 1495. pmid:30728403
  14. 14. Murphy SA. A generalization error for Q-learning. J Mach Learn Res. 2005;6: 1073–1097. pmid:16763665
  15. 15. Chakraborty B, Murphy S, Strecher V. Inference for non-regular parameters in optimal dynamic treatment regimes. Stat Methods Med Res. 2010;19: 317–343. pmid:19608604
  16. 16. Tang M, Wang L, Gorin MA, Taylor JMG. Step-adjusted tree-based reinforcement learning for evaluating nested dynamic treatment regimes using test-and-treat observational data. Stat Med. 2021;40: 6164–6177. pmid:34490942
  17. 17. Tao Y, Wang L, Almirall D. Tree-based reinforcement learning for estimating optimal dynamic treatment regimes. Ann Appl Stat. 2018;12: 1914–1938. pmid:30984321
  18. 18. Gunning D, Stefik M, Choi J, Miller T, Stumpf S, Yang GZ. XAI-explainable artificial intelligence. Sci Robot. 2019;4: eaay7120. pmid:33137719
  19. 19. Sheu RK, Pardeshi MS. A survey on medical explainable AI (XAI): recent progress, explainability approach, human interaction and scoring system. Sensors (Basel). 2022;22: 8068. pmid:36298417
  20. 20. Sharma M, Savage C, Nair M, Larsson I, Svedberg P, Nygren JM. Artificial intelligence applications in health care practice: scoping review. J Med Internet Res. 2022;24: e40238. pmid:36197712
  21. 21. Saria S, Butte A, Sheikh A. Better medicine through machine learning: what’s real, and what’s artificial? PLoS Med. 2018;15: e1002721. pmid:30596635
  22. 22. Ashrafian H, Darzi A. Transforming health policy through machine learning. PLoS Med. 2018;15: e1002692. pmid:30422977
  23. 23. Vayena E, Blasimme A, Cohen IG. Machine learning in medicine: addressing ethical challenges. PLoS Med. 2018;15: e1002689. pmid:30399149
  24. 24. Lambert SI, Madi M, Sopka S, Lenes A, Stange H, Buszello CP, et al. An integrative review on the acceptance of artificial intelligence among healthcare professionals in hospitals. NPJ Digit Med. 2023;6: 111. pmid:37301946
  25. 25. Umemoto A, Kuwada T, Murata K, Shiokawa M, Ota S, Murotani Y, et al. Identification of anti-citrullinated osteopontin antibodies and increased inflammatory response by enhancement of osteopontin binding to fibroblast-like synoviocytes in rheumatoid arthritis. Arthritis Res Ther. 2023;25: 25. pmid:36804906
  26. 26. Watanabe R, Kadoba K, Tamamoto A, Murata K, Murakami K, Onizawa H, et al. CD8+ regulatory T cell deficiency in elderly-onset rheumatoid arthritis. J Clin Med. 2023;12: 2342.
  27. 27. Katsushima M, Minamino H, Shirakashi M, Onishi A, Fujita Y, Yamamoto W, et al. High plasma homocysteine level is associated with increased prevalence of the non-remission state in rheumatoid arthritis: findings from the KURAMA cohort. Mod Rheumatol. 2023;33: 911–917. pmid:36069659
  28. 28. Terao C, Hashimoto M, Yamamoto K, Murakami K, Ohmura K, Nakashima R, et al. Three groups in the 28 joints for rheumatoid arthritis synovitis—analysis using more than 17,000 assessments in the KURAMA database. PLoS One. 2013;8: e59341. pmid:23555018
  29. 29. Ezaki T, Watanabe T, Ohzeki M, Masuda N. Energy landscape analysis of neuroimaging data. Philos Trans A Math Phys Eng Sci. 2017;375: 20160287. pmid:28507232
  30. 30. Burton HGA. Energy landscape of state-specific electronic structure theory. J Chem Theory Comput. 2022;18: 1512–1526. pmid:35179023
  31. 31. Frauenfelder H, Sligar SG, Wolynes PG. The energy landscapes and motions of proteins. Science. 1991;254: 1598–1603. pmid:1749933
  32. 32. Scheffer M, Carpenter SR, Lenton TM, Bascompte J, Brock W, Dakos V, et al. Anticipating critical transitions. Science. 2012;338: 344–348. pmid:23087241
  33. 33. Klepl D, He F, Wu M, Marco M, Blackburn DJ, Sarrigiannis PG. Characterising Alzheimer’s disease with EEG-based energy landscape analysis. IEEE J Biomed Health Inform. 2022;26: 992–1000. pmid:34406951
  34. 34. Aghabozorgi S, Shirkhorshidi AS, Wah TY. Time-series clustering–A decade review. Inf Syst. 2015;53: 16–38.
  35. 35. Stübinger J, Walter D. Using multi-dimensional dynamic time warping to identify time-varying lead-lag relationships. Sensors (Basel). 2022;22: 6884. pmid:36146233
  36. 36. Lindqvist J, Alfredsson L, Klareskog L, Lampa J, Westerlind H. Unmet needs in rheumatoid arthritis: a subgroup of patients with high levels of pain, fatigue, and psychosocial distress 3 years after diagnosis. ACR Open Rheumatol. 2022;4: 492–502.
  37. 37. Pettersson S, Demmelmaier I, Nordgren B, Dufour AB, Opava CH. Identification and prediction of fatigue trajectories in people with rheumatoid arthritis. ACR Open Rheumatol. 2022;4: 111–118. pmid:34758517
  38. 38. Van Der Elst K, Verschueren P, De Cock D, De Groef A, Stouten V, Pazmino S, et al. One in five patients with rapidly and persistently controlled early rheumatoid arthritis report poor well-being after 1 year of treatment. RMD Open. 2020;6: e001146. pmid:32371432
  39. 39. Lee YC, Frits ML, Iannaccone CK, Weinblatt ME, Shadick NA, Williams DA, et al. Subgrouping of patients with rheumatoid arthritis based on pain, fatigue, inflammation, and psychosocial factors. Arthritis Rheumatol. 2014;66: 2006–2014. pmid:24782222
  40. 40. Gold O, Sharir M. Dynamic time warping and geometric edit distance: breaking the quadratic barrier. ACM Trans Algorithms. 2018;14: 1–17.
  41. 41. Aletaha D, Neogi T, Silman AJ, Funovits J, Felson DT, Bingham CO, et al. 2010 rheumatoid arthritis classification criteria: an American college of rheumatology/European league against rheumatism collaborative initiative. Arthritis Rheum. 2010;62: 2569–2581. pmid:20872595
  42. 42. Mehdi F, Taylor WK, Jayakumar S, Marzyeh G. Medical Dead-ends and Learning to Identify High-risk States and Treatments. 2022; arXiv:2110.04186.
  43. 43. Steinbrocker O, Traeger CH, Batterman RC. Therapeutic criteria in rheumatoid arthritis. J Am Med Assoc. 1949;140: 659–662. pmid:18150288
  44. 44. Studenic P, Aletaha D, De Wit M, Stamm TA, Alasti F, Lacaille D, et al. American college of rheumatology/EULAR remission criteria for rheumatoid arthritis: 2022 revision. Ann Rheum Dis. 2023;82: 74–80. pmid:36280238
  45. 45. Ackley DH, Hinton GE, Sejnowski TJ. A learning algorithm for Boltzmann machines. Cogn Sci. 1985;9: 147–169.
  46. 46. Becker OM, Karplus M. The topology of multidimensional potential energy surfaces: theory and application to peptide structure and kinetics. J Chem Phys. 1997;106: 1495–1517.
  47. 47. Regonia PR, Takamura M, Nakano T, Ichikawa N, Fermin A, Okada G, et al. Modeling heterogeneous brain dynamics of depression and melancholia using energy landscape analysis. Front Psychiatry. 2021;12: 780997. pmid:34899435
  48. 48. Niennattrakul V, Ratanamahatana CA. On clustering multimedia time series data using K-means and dynamic time warping. In: 2007 international conference on multimedia and ubiquitous engineering (MUE’07). Seoul, Korea: IEEE; 2007. pp. 733–738.
  49. 49. Kaufman L, Rouseeuw PJ. Finding groups in data: an introduction to cluster analysis. Hoboken, NJ: John Wiley & Sons, Inc; 1990.
  50. 50. Han J, Kamber M. Data mining: concepts and techniques. Cambridge, UK: Morgan Kaufmann; 2022.
  51. 51. Bellman R, Kalaba R. On adaptive control processes. IRE Trans Autom Control. 1959;4: 1–9.
  52. 52. David S, Thomas H, Julian S, Demis H. AlphaZero: shedding new light on chess, shogi, and go. 2018 Dec 6 [cited 27 February 2024]. Available from: https://deepmind.google/discover/blog/alphazero-shedding-new-light-on-chess-shogi-and-go/.
  53. 53. American Diabetes Association. Standards of medical care in diabetes-2022 abridged for primary care providers. Clin Diabetes. 2022;40: 10–38. pmid:35221470
  54. 54. Nogami A, Kurita T, Kusano K, Goya M, Shoda M, Tada H, et al. JCS/JHRS 2021 guideline focused update on non-pharmacotherapy of cardiac arrhythmias. J Arrhythm. 2022;38: 1–30. pmid:35222748
  55. 55. Roehrs A, Da Costa CA, Righi RD, De Oliveira KS. Personal health records: a systematic literature review. J Med Internet Res. 2017;19: e13. pmid:28062391
  56. 56. Yamamoto K, Takahashi T, Urasaki M, Nagayasu Y, Shimamoto T, Tateyama Y, et al. Health observation app for COVID-19 symptom tracking integrated with personal health records: proof of concept and practical use study. JMIR Mhealth Uhealth. 2020;8: e19902. pmid:32568728
  57. 57. Clay I, Angelopoulos C, Bailey AL, Blocker A, Carini S, Carvajal R, et al. Sensor data integration: a new cross-industry collaboration to articulate value, define needs, and advance a framework for best practices. J Med Internet Res. 2021;23: e34493. pmid:34751656
  58. 58. Desai RJ, Matheny ME, Johnson K, Marsolo K, Curtis LH, Nelson JC, et al. Broadening the reach of the FDA sentinel system: a roadmap for integrating electronic health record data in a causal analysis framework. NPJ Digit Med. 2021;4: 170. pmid:34931012
  59. 59. Black AD, Car J, Pagliari C, Anandan C, Cresswell K, Bokun T, et al. The impact of eHealth on the quality and safety of health care: a systematic overview. PLoS Med. 2011;8: e1000387. pmid:21267058