Measurement Properties of 2 Novel PROs, the Pompe Disease Symptom Scale and Pompe Disease Impact Scale, in the COMET Study

Abstract
Background and Objectives The Pompe Disease Symptom Scale (PDSS) and Impact Scale (PDIS) were created to measure the severity of symptoms and functional limitations experienced by patients with late-onset Pompe disease (LOPD). The objectives of this analysis were to establish a scoring algorithm and to examine the reliability, validity, and responsiveness of the measures using data from the COMET clinical trial.
Methods The COMET trial was a randomized, double-blind study comparing the efficacy and safety of avalglucosidase alfa and alglucosidase alfa in patients with LOPD aged 16–78 years at baseline. Adult participants (18 years or older) completed the PDSS and PDIS daily for 14 days at baseline and for 2 weeks before quarterly clinic visits for 1 year after randomization using an electronic diary. Data were pooled across treatment groups for the current analyses. Factor analysis and inter-item correlations were used to derive a scoring algorithm. Test-retest and internal consistency analyses examined the reliability of the measures. Correlations with criterion measures were used to evaluate validity and sensitivity to change. Anchor and distribution-based analyses were conducted to estimate thresholds for meaningful change.
Results Five multi-item domain scores were derived from the PDSS (Shortness of Breath, Overall Fatigue, Fatigue/Pain, Upper Extremity Weakness, Pain) and 2 from the PDIS (Mood, Difficulty Performing Activities). Internal consistency (Cronbach α > 0.90) and test-retest reliability (intraclass correlation >0.60) of the scores were supported. Cross-sectional and longitudinal correlations with the criterion measures generally supported the validity of the scores (r > 0.40). Within-patient meaningful change estimates ranging from 1.0 to 1.5 points were generated for the PDSS and PDIS domain scores.
Discussion The PDSS and PDIS are reliable and valid measures of LOPD symptoms and functional impacts. The measures can be used to evaluate burden of LOPD and effects of treatments in clinical trials, observational research, and clinical practice.
Trial Registration Information ClinicalTrials.gov identifier: NCT02782741
Introduction
Late-onset Pompe disease (LOPD) is a rare disorder caused by a deficiency of acid alpha-glucosidase, leading to glycogen accumulation in lysosomes, especially within muscle cells.1,2 Although the various symptoms of LOPD (eg, muscle weakness and respiratory impairment) are described in the literature,2 the frequency and impact of these symptoms have not been well quantified. Therefore, Sanofi developed 2 patient-reported outcome (PRO) instruments: (1) a Pompe Disease Symptom Scale (PDSS), assessing breathing; feelings of tiredness, fatigue, or muscle weakness in different parts of the body; pain; and morning headache, and (2) a Pompe Disease Impact Scale (PDIS), assessing impacts of anxiety, feelings of worry and depression, and the ability and difficulties in performing certain activities of daily living (eg, walking, climbing stairs, rising from a sitting position, bending over, squatting down, exercising).
Qualitative development of the conceptual model has been reported in detail.3 In brief, findings from a detailed literature review were used to construct a preliminary conceptual model. Expert interviews were conducted with 3 LOPD-experienced clinicians, followed by 3 rounds of interviews with 13 participants with LOPD (4–5 participants/round), to discuss the initial model. Interview findings were used to identify novel and important key concepts and to finalize the conceptual model. The key areas in the conceptual model include signs and symptoms (respiratory, motor, and other) related to the disease process; direct impacts of symptoms on mobility and activity performance, eating, and activities of daily living; and general impacts on psychosocial experience and social and occupational participation.
The PDSS and PDIS were implemented as outcome measures in the Phase 3 COMET study (NCT02782741) comparing the efficacy and safety of avalglucosidase alfa with alglucosidase alfa during the study's double-blind phase. Treatment-naïve participants with a confirmed diagnosis of LOPD, aged 3 years and older, were enrolled into the COMET study.4 These tools are designed to measure changes in symptom severity and impacts of disease in patients with LOPD and also aid in the evaluation of the therapeutic efficacy of treatment throughout the LOPD disease spectrum in those 18 years and older.
We evaluated item performance, reliability, validity, and responsiveness of the PDSS and PDIS instruments using baseline and longitudinal data from patients with LOPD who were enrolled in the COMET study.
Methods
PDSS/PDIS Questionnaires
The PDSS and PDIS are self-administered questionnaires specifically designed to capture, respectively, symptoms and impacts of LOPD on individuals 18 years and older. Items and scales for the questionnaires are shown in eFigure 1 (links.lww.com/CPJ/A445).
The PDSS is a 12-item numerical scale measuring the following LOPD symptoms: Breathing difficulties (items 1 and 2), Tiredness (item 3), Fatigue (item 4), Muscle weakness (items 5–9), Muscle ache (item 10), Pain (item 11), and Morning headache (item 12). Patients rate their symptoms using an 11-point scale from ‘none’ (0) to ‘as bad as I can imagine’ (10). Higher scores indicate greater severity of symptoms.
The PDIS is a 15-item numerical scale measuring the following LOPD impacts: Anxiety (item 1), Worry (item 2), Depression (item 3), Ability to walk (items 4 and 5), Ability to climb stairs (items 6 and 7), Rising from a sitting position (items 8 and 9), Ability to bend over (items 10 and 11), Ability to squat (items 12 and 13), and Ability to exercise (items 14 and 15). Mood items (Anxiety, Worry, and Depression) are rated using an 11-point scale from ‘no impact’ (0) to ‘as bad as I can imagine’ (10). For difficulty performing activities, there are 6 pairs of questions. The first of each pair asks whether the activity was performed in the past 24 hours (items 4, 6, 8, 10, 12, and 14), with patients answering ‘no and not physically able’ (0), ‘no but physically able’ (1), or ‘yes’ (2). Patients answering ‘yes’ are asked the second question in the pair, which rates the degree of difficulty in performing the activity in the past 24 hours (items 5, 7, 9, 11, 13, and 15, respectively). The second question is answered on a 5-point scale: ‘not at all difficult’ (0), ‘a little difficult’ (1), ‘somewhat difficult’ (2), ‘very difficult’ (3), and ‘extremely difficult’ (4). Patients responding with ‘0’ or ‘1’ on the first question do not answer the second question. Higher scores indicate greater impact on items.
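The skip logic for the paired activity items can be sketched as follows. This is a minimal illustration only, not the published scoring algorithm (which is provided in eAppendix 2). The function name `difficulty_rating` and the use of `None` to represent a skipped or unanswered difficulty item are assumptions made for this example.

```python
from typing import Optional

# Response codes for the first question of each pair, as described in the text.
PERFORMED = {
    0: "no and not physically able",
    1: "no but physically able",
    2: "yes",
}

# Response codes for the second (difficulty) question of each pair.
DIFFICULTY = {
    0: "not at all difficult",
    1: "a little difficult",
    2: "somewhat difficult",
    3: "very difficult",
    4: "extremely difficult",
}


def difficulty_rating(performed: int, difficulty: Optional[int]) -> Optional[int]:
    """Return the difficulty rating to record for one activity pair.

    The difficulty question is only asked when the activity was performed
    (code 2). Representing a skipped item as None is an assumption of this
    sketch; the source defers scoring details to its appendix.
    """
    if performed not in PERFORMED:
        raise ValueError(f"unknown 'performed' code: {performed}")
    if performed != 2:
        return None  # second question is not administered
    if difficulty is None:
        return None  # asked but not answered
    if difficulty not in DIFFICULTY:
        raise ValueError(f"unknown difficulty code: {difficulty}")
    return difficulty
```

For example, `difficulty_rating(2, 3)` keeps the rating of 3 (‘very difficult’), whereas `difficulty_rating(1, 3)` returns `None` because the activity was not performed.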
The scoring approaches for the PDSS and PDIS are provided in full in eAppendix 1 and eAppendix 2 (links.lww.com/CPJ/A444), respectively.
Quantitative Analyses
During the COMET study,4 participants 18 years and older completed a 24-h recall e-diary at baseline and the data were averaged over 7 consecutive days (day −14 to day −8 and day −7 to day −1 before treatment). Thereafter, biweekly (14-day) scores were calculated as the average of daily responses from the protocol-specified visits (weeks 13, 25, 37, and 49 on treatment) and the 13 consecutive days before each visit. If the e-diary was unavailable (eg, device dysfunction or deficiency), a paper version was completed, and data were transferred to the database. If ≥ 4 daily scores of a 7-day period were missing, then the score was set to missing. Each COMET site explained the e-diary to participants. Alerts were sent to the participant’s phone during the study to remind them of important study tasks.
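The 7-day averaging and missing-data rule described above can be illustrated with a short sketch. The function name `weekly_score` and the use of `None` for a missing diary entry are hypothetical; only the rule itself (set the score to missing when 4 or more of the 7 daily scores are missing) comes from the text.

```python
from statistics import mean
from typing import Optional, Sequence


def weekly_score(daily: Sequence[Optional[float]]) -> Optional[float]:
    """Average one item's daily e-diary scores over a 7-day period.

    `daily` must hold exactly 7 entries, with None marking a missing day.
    If 4 or more daily scores are missing, the weekly score is missing.
    """
    if len(daily) != 7:
        raise ValueError("expected exactly 7 daily entries")
    observed = [score for score in daily if score is not None]
    n_missing = 7 - len(observed)
    if n_missing >= 4:
        return None
    return mean(observed)
```

With 3 missing days the score is still computed from the remaining 4 entries; with 4 missing days it is set to missing.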
Quantitative analyses were conducted on pooled data from the COMET study,4 that is, PDSS or PDIS data from the avalglucosidase alfa and alglucosidase alfa arms were pooled, to evaluate item performance and psychometric properties of both questionnaires. All analyses were performed on the modified intent-to-treat (mITT) population,4 defined as all randomized patients who received at least one partial or total infusion of the study drug.
Distributional properties were assessed on item and domain characteristics. Summary statistics and the percentage of scale ranges were calculated for each symptom and impact item in the scales. The proportions of participants scoring at the most severe end of the scale (floor effect) and the least severe end of the scale (ceiling effect) for each item were also calculated. Floor and ceiling effects were considered small if ≤15% of patients achieved the worst and best health state and serious if >15% of patients achieved these states.5
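A minimal sketch of the floor and ceiling calculation, using the definitions given above (floor = most severe end, ceiling = least severe end, serious if >15%). The function name, the returned dictionary layout, and the default 0 to 10 range are assumptions for illustration; for the 5-point difficulty items the `worst` argument would be 4.

```python
from typing import Dict, Sequence


def floor_ceiling_effects(
    scores: Sequence[float],
    best: float = 0,
    worst: float = 10,
    cutoff: float = 0.15,
) -> Dict[str, object]:
    """Proportion of respondents at each extreme of an item's range.

    Following the text, the floor is the most severe score and the
    ceiling is the least severe score. An effect is flagged as serious
    when the proportion is strictly greater than `cutoff`.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("no scores supplied")
    floor = sum(1 for s in scores if s == worst) / n
    ceiling = sum(1 for s in scores if s == best) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        "serious_floor": floor > cutoff,
        "serious_ceiling": ceiling > cutoff,
    }
```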
Exploratory factor analysis (EFA)6 was conducted for the PDSS and PDIS separately, using an unweighted least squares extraction method and promax rotation, which assumes that factors are correlated (oblique). The number of factors was determined using eigenvalues, scree plots, factor loadings, simple structure, and clinical judgment. A total summary scale or set of scales was proposed considering the factor loadings and clinical and conceptual considerations. Item-to-item correlations and item-to-total correlations (corrected for overlap) were computed using Spearman rank correlation.
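The item-to-total correlation "corrected for overlap" can be sketched as a Spearman correlation between one item and the sum of the remaining items, so that the item does not contribute to the total it is compared with. The helper names below are hypothetical, and the hand-written Spearman routine (Pearson correlation of average ranks) stands in for whatever statistical software the authors used.

```python
from math import sqrt
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def corrected_item_total(items: Sequence[Sequence[float]], index: int) -> float:
    """Correlate one item with the sum of all *other* items.

    `items` holds one sequence per item, aligned by respondent.
    """
    target = items[index]
    rest_total = [
        sum(col[r] for c, col in enumerate(items) if c != index)
        for r in range(len(target))
    ]
    return spearman(target, rest_total)
```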
Internal consistency was evaluated using Cronbach alpha coefficients, with >0.7 considered as supporting the reliability of internal consistency.7 Intraclass correlation coefficient (ICC) values of 0.50–0.90 are considered to represent moderate-to-good reliability and values >0.90 excellent reliability.8 Test-retest reliability, an assessment of repeatable reliability (i.e., stability of an instrument over time), was assessed using a two-way mixed consistency model for the calculation of the ICCs (2,1) between 2 time points. Two test-retest analyses were conducted, one during screening (days −14 to −8 for the test and days −7 to −1 for the retest) and one between baseline (days −7 to −1) for the test and week 49 for the retest. The latter analysis was only conducted for participants who reported no change according to Patient Global Impression of Change (PGIC)9 items.
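Both reliability statistics can be illustrated with a short sketch. Cronbach alpha is computed from the item variances and the variance of the summed score. The ICC shown is the single-measurement, consistency-type coefficient from a two-way layout, (MS_rows − MS_error) / (MS_rows + (k − 1) MS_error); treating this as the form intended by the text's "two-way mixed consistency model" is an assumption, as are the function names.

```python
from statistics import mean, variance
from typing import List, Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach alpha for k items (one sequence per item, aligned by respondent)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def icc_consistency(test: List[float], retest: List[float]) -> float:
    """Single-measurement, consistency-type ICC for two time points."""
    n, k = len(test), 2
    rows = list(zip(test, retest))           # one row per participant
    grand = mean(test + retest)
    ss_rows = k * sum((mean(r) - grand) ** 2 for r in rows)
    ss_cols = n * sum((mean(c) - grand) ** 2 for c in (test, retest))
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
```

Because the consistency form ignores systematic shifts between occasions, a retest that is uniformly 1 point higher than the test still yields an ICC of 1.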
Construct validity assessed the ability of the PDSS and PDIS to measure the core constructs (i.e., the intended aspects of the disease model), by using data generated with other instruments and/or clinical assessments; both convergent validity and divergent validity were assessed. Convergent validity assessed the strength of correlation between instruments measuring the same concept, and divergent validity assessed the lack of correlation between instruments measuring different or weakly related concepts. Spearman correlation coefficients were calculated for the relationships of the PDSS and PDIS weekly scores and their EFA-derived scales at baseline (day −7 to day −1) with 2 groups of criterion measures. For quality of life and disease-related symptoms, the PRO scales/items were the Short-Form 12 items (SF-12) Physical Component Summary (PCS) and Mental Component Summary (MCS) scales,10 the EuroQol-5 dimensions 5 levels (EQ-5D-5L) Pain and Mobility scales,11 and the Rasch-built Pompe-specific activity (R-PAct) scale.12 For clinical outcome measures, the assessments were forced vital capacity (FVC) % predicted in the upright position,13,14 6-minute walk test (6MWT) distance measured in meters,15–17 and the Quick Motor Function Test (QMFT).18
Sensitivity to change is the ability of an instrument to measure change in a state regardless of whether it is relevant or meaningful to the decision maker.19 This was assessed using an analysis of covariance adjusted for baseline scores to compare change from baseline in PDSS and PDIS at week 49 among those who improved in contrast with those who worsened or remained unchanged according to PGIC daily activities ratings at week 49. In addition, the magnitude of change in each change category using effect sizes (ESs, i.e., mean change from baseline divided by the SD at baseline) and standardized response means (SRMs, i.e., mean change from baseline divided by the SD of the change from baseline) was assessed. The responder groups as independent variables were defined as follows: ‘improved’: participants answering “a great deal better,” “moderately better,” or “somewhat better” and ‘worsened/no change’: participants answering “a great deal worse,” “moderately worse,” “somewhat worse,” or “no change.”
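The two magnitude-of-change indices defined above can be written directly from their definitions. This sketch covers only the ES and SRM (not the analysis of covariance); the function names are hypothetical, and the sample SD (n − 1 denominator) is assumed.

```python
from statistics import mean, stdev
from typing import Sequence


def effect_size(baseline: Sequence[float], followup: Sequence[float]) -> float:
    """ES: mean change from baseline divided by the SD at baseline."""
    change = [f - b for b, f in zip(baseline, followup)]
    return mean(change) / stdev(baseline)


def standardized_response_mean(
    baseline: Sequence[float], followup: Sequence[float]
) -> float:
    """SRM: mean change from baseline divided by the SD of the change."""
    change = [f - b for b, f in zip(baseline, followup)]
    return mean(change) / stdev(change)
```

Because higher PDSS and PDIS scores indicate greater severity, negative values of either index correspond to improvement.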
Within-patient meaningful change threshold. Both anchor and distribution-based methods were used to determine clinically meaningful improvement thresholds for the PDSS and PDIS. Anchor-based methods link PRO scores to known clinically relevant indicators or to the patient's determined rating of change. Anchors must correlate at least moderately (r > 0.30) with the PROs20 to be interpretable. The anchor-based approach evaluated meaningful change from baseline to week 49 using scores from PGIC items9 as anchors. These PGIC items consisted of Ability with Daily Activities (item 1), Disease-Related Symptoms (item 2), Ability to Breathe (item 3), and Mobility (item 4). Distribution-based methods, which allow change to be interpreted in the context of the variability of scores and reliability of the instrument, were also used to estimate the meaningful change threshold. Two distribution methods were used: one-half SD and standard error of measurement.
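The two distribution-based estimates can be sketched as follows. The standard error of measurement is computed with the conventional formula SD × √(1 − reliability); which SD (eg, baseline) and which reliability coefficient (eg, Cronbach alpha or ICC) the authors used is not stated here, so those inputs are left as assumptions of the caller.

```python
from math import sqrt


def half_sd(sd: float) -> float:
    """One-half standard deviation criterion."""
    return 0.5 * sd


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability), with reliability in [0, 1]."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1.0 - reliability)
```

For instance, a score with SD 2.0 and reliability 0.75 gives a half-SD estimate of 1.0 and an SEM of 1.0.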
Standard Protocol Approvals, Registrations, and Patient Consents
This study used data from the COMET (NCT02782741) clinical trial. The COMET study protocol was reviewed and approved by appropriate ethics committees and/or institutional review boards and conducted in accordance with the Declaration of Helsinki and the International Council for Harmonisation guidelines for Good Clinical Practice. Research ethics and informed consent have previously been reported for the COMET study.4
Data Availability
Qualified researchers may request access to participant-level data and related study documents. Participant-level data will be anonymized, and study documents will be redacted to protect the privacy of study participants. Further details on Sanofi's data sharing criteria, eligible studies, and process for requesting access can be found at vivli.org.
Results
Study Population
The mITT population included 100 participants from the COMET study, of whom 99 were 18 years or older and 1 was 16 years. The mean ± SD age of COMET participants was 47.6 ± 14.2 years (range, 16–78 years); 52 were male (52%). Most participants were White (94 [94%]). By ethnicity, 76 (76%) were not Hispanic or Latino, 15 (15%) were Hispanic or Latino, and 9 (9%) did not report their ethnicity. Baseline clinical and PRO measures are provided in eTable 1 (links.lww.com/CPJ/A445). At baseline, there were up to 72 participants with valid responses to individual PDSS/PDIS questions. The 1 participant younger than 18 years did not complete the PDSS and PDIS.
The PDSS and PDIS instrument completion rate was around 63% of expected daily entries.
Score Distribution
The PDSS and PDIS score distribution of the averaged weekly values for the week leading up to the baseline is provided in Table 1. Ceiling effects indicating the lowest symptom severity were most notable for 4 PDSS items: item 1 (Breathing), item 2 (Breathing while lying down), item 8 (Muscle weakness hand), and item 12 (Morning headache) (eTable 2, links.lww.com/CPJ/A445). No floor effects indicating the highest symptom severity were observed at baseline.
Table 1. Pompe Disease Symptom Scale and Pompe Disease Impact Scale at Baseline: Item Score Distribution
Baseline ceiling effects were observed for 2 PDIS items: item 1 (Anxiety) and item 3 (Depression). Floor effects indicating the highest severity were noted for 3 PDIS items: item 11 (Bend over difficulty), item 13 (Squat down difficulty), and item 15 (Exercise difficulty).
Item-to-Item Correlations
Correlations among PDSS item scores and among PDIS item scores were sufficiently high (r > 0.3) to enable identification of scales in both instruments (eFigure 2, links.lww.com/CPJ/A445).
For the PDSS, 2 pairs of items had strong correlations, suggesting possible redundancy: item 5 (Muscle weakness anywhere)/item 6 (Muscle weakness lower body; r = 0.90) and item 10 (Muscle ache)/item 11 (Pain; r = 0.91). Item 1 (Breathing) and item 2 (Breathing while lying down) were also highly correlated (r = 0.86), as were item 3 (Tiredness) and item 4 (Fatigue; r = 0.89). Many groups of items showed consistently high Spearman rank correlations (>0.5).
For the PDIS, correlations suggest 2 scales, which match its intended design. The first included items 1, 2, and 3 (Anxiety, Worry, and Depression, respectively), and their Spearman rank correlations varied between r = 0.78 and r = 0.93. The second included the difficulty items (items 5, 7, 9, 11, 13, and 15 [Walk difficulty, Climb difficulty, Rise difficulty, Bend over difficulty, Squat down difficulty, and Exercise difficulty, respectively]). Correlations varied between r = 0.46 and r = 0.80.
Exploratory Factor Analysis
Results of exploratory factor analysis for the PDSS items indicated that a four-factor solution seemed to be optimal. The 4 factors addressed the underlying concepts of Shortness of Breath, Overall Fatigue, Pain, and Upper Extremity Weakness Score. Item 12 (Morning headache) did not load onto any of the 4 factors (factor loading <0.40, Figure 1A). This item was, however, retained as a single-item scale, Morning Headache Score, given its stated relevance to patients during preliminary studies for the definition of the PDSS instrument.
Figure 1. (A) Pompe Disease Symptom Scale; (B) Pompe Disease Impact Scale.
Factor 1, labeled Overall Fatigue Score, accounted for most variability in the data. It was formed by 5 items: items 3 (Tiredness), 4 (Fatigue), 5 (Muscle weakness anywhere), 6 (Muscle weakness lower body), and 9 (Muscle weakness upper body). All standardized factor loadings were quite high (>0.8), except item 9, for which it was 0.54 (Figure 1A). The 2 items loading (>0.7) onto factor 2 relate to upper extremity weakness: items 7 (Muscle weakness arms) and 8 (Muscle weakness hand); this factor was labeled Upper Extremity Weakness Score. Two items related to pain (with loadings >0.8) were loaded on factor 3; this was labeled Pain Score. Finally, the 2 items relating to breathing, items 1 and 2 (Breathing and Breathing while lying down, respectively), with loadings >0.9, were loaded on factor 4, which was labeled Shortness of Breath Score. Given the moderate-to-strong correlations (r = 0.45 to 0.62) observed between factor 1 (Overall Fatigue), factor 2 (Upper Extremity Weakness Score), and factor 3 (Pain Score), we hypothesized that a second-order dimension labeled Fatigue/Pain Score, underlying the initial factors 1 to 3 and indirectly items 3 to 11, is justifiable.
For the PDIS, analysis defined a two-factor solution. All standardized factor loadings were quite high (>0.7, Figure 1B). Three items related to anxiety, worry, and depression were loaded on factor 1, which was labeled Mood Score. Items 5 (Walk difficulty), 7 (Climb difficulty), 9 (Rise difficulty), 11 (Bend over difficulty), and 13 (Squat down difficulty) were loaded on factor 2, which was labeled Difficulty Performing Activities Score. Item 15 (Exercise difficulty) was excluded from the analysis because few patients reported exercising, and therefore, the difficulty item was not completed.
In all cases, the items correlated strongly with their own scales and less with the other scales (eFigure 3, links.lww.com/CPJ/A445).
For both the PDSS and PDIS, scores were calculated as the average of underlying items. Therefore, items and scales shared the same range, 0 to 4 for Difficulty Performing Activities Score and 0 to 10 for all other scales. Higher scores indicate greater severity of symptoms or larger impact.
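A minimal sketch of the domain scoring, using the item-to-domain assignments reported in the factor analysis above. The dictionary and function names are hypothetical, and the sketch assumes complete item data; handling of missing items is defined in the eAppendices and is not reproduced here.

```python
from statistics import mean
from typing import Dict, List

# Item membership of each PDSS score, taken from the factor analysis results.
PDSS_DOMAINS: Dict[str, List[int]] = {
    "Shortness of Breath": [1, 2],
    "Overall Fatigue": [3, 4, 5, 6, 9],
    "Upper Extremity Weakness": [7, 8],
    "Pain": [10, 11],
    "Morning Headache": [12],                # single-item score
    "Fatigue/Pain": list(range(3, 12)),      # second-order score, items 3 to 11
}

# Item membership of each PDIS score (item 15 was excluded from the analysis).
PDIS_DOMAINS: Dict[str, List[int]] = {
    "Mood": [1, 2, 3],
    "Difficulty Performing Activities": [5, 7, 9, 11, 13],
}


def domain_scores(
    responses: Dict[int, float], domains: Dict[str, List[int]]
) -> Dict[str, float]:
    """Average the underlying items of each domain (complete data assumed)."""
    return {
        name: mean(responses[item] for item in item_numbers)
        for name, item_numbers in domains.items()
    }
```

Because each score is a plain average, it stays on the same range as its items: 0 to 10 for the PDSS and Mood scores, and 0 to 4 for the Difficulty Performing Activities Score.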
Reliability
All scales from the PDSS and PDIS showed excellent internal consistency with Cronbach alpha coefficient >0.90, ranging between 0.92 and 0.95 for the PDSS and 0.91 and 0.93 for the PDIS (Table 2).
Table 2. Internal Consistency Reliability
Test-retest reliability was also demonstrated (Table 3). Both PDSS and PDIS domains and total scales had adequate-to-very high ICC values. All ICC values were >0.7 at screening and ranged from 0.60 to 0.85 between baseline and week 49, indicating acceptable test-retest reliability for the scale scores in both instruments.
Table 3. Test-Retest Reliability
Construct Validity
Convergent validity was demonstrated by moderate-to-high correlations of PDSS and PDIS scale scores with concepts hypothesized to be similar from the other PRO and clinical measures and especially those with the most similar concepts (Figures 2, A and B, respectively). Discriminant validity was supported by the observation of lower (low-to-moderate) correlations with the other PRO measures, which captured less similar concepts.
Figure 2. (A) Pompe Disease Symptom Scale; (B) Pompe Disease Impact Scale. Data are (n) Spearman rank correlations. Dark green = strong correlations (r > 0.5); light green = moderate correlations (r = 0.3 to 0.5); yellow = weak correlations (r < 0.3). DPAS = Difficulty Performing Activities Score; EQ-5D-5L = EuroQol-5 dimensions 5 levels; FPS = Fatigue/Pain Score; MCS = Mental Component Summary; MHS = Morning Headache Score; MS = Mood Score; OFS = Overall Fatigue Score; PCS = Physical Component Summary; PS = Pain Score; R-PAct = Rasch-built Pompe-specific activity; SBS = Shortness of Breath Score; SF-12 = Short Form 12 Items; TSS = Total Symptom Score; UEWS = Upper Extremity Weakness Score.
Sensitivity to Change
Sensitivity to change for the PDSS was supported by logical differences in change from baseline to week 49 among PGIC change groups, with significant differences observed for Total Symptom Score (p = 0.031), Fatigue/Pain Score (p = 0.011), and Overall Fatigue Score (p = 0.014; eTable 2, links.lww.com/CPJ/A445). The magnitude of these change scores was in the region of moderate-to-high effect sizes for the improved category. The results were inconclusive for detecting worsening owing to the small magnitude of change from baseline and the smaller sample size.
Similarly, sensitivity to change for the PDIS instrument was demonstrated by the logical differences in change from baseline to week 49 among PGIC change groups and small (for worsened/no change group) to moderate (improved group) effect sizes (eTable 2, links.lww.com/CPJ/A445).
Meaningful Change Thresholds
Candidate within-patient meaningful threshold values were generally identified from the anchor-based analysis as the lowest median value for a PGIC improvement category that exceeded the distribution-based approach estimates and the 95% confidence interval (CI) for the stable (i.e., ‘no change’) anchor group. The correlations between changes from baseline to week 49 on the PDSS/PDIS scores and PGIC responses were used to evaluate the reliability of the meaningful change estimates. PGIC items, yielding correlations ≥0.30 with the PDSS/PDIS scores, were used as the primary anchors in estimating the thresholds. Estimates were rounded to a single decimal place.
PDSS
The Overall Fatigue, Fatigue/Pain, and Shortness of Breath Scores were correlated ≥0.30 with at least one PGIC item. The Pain, Upper Extremity Weakness, and Morning Headache Scores did not reach a correlation of 0.30 with any of the PGIC items; therefore, confidence in the meaningful change estimates is reduced for these domains (Figure 3). Because the correlations between the PDSS scores and PGIC item 3 (ability to breathe) were generally stronger than those for the other PGIC items, estimates based on this anchor were highlighted.
Figure 3. Dark green: strong correlations, light green: moderate correlations, light yellow: weak correlations. N = 50 unless indicated otherwise. *N = 46. PDIS = Pompe Disease Impact Scale; PDSS = Pompe Disease Symptom Scale; PGIC = Patient Global Impression of Change.
Table 4 includes the mean and median change score values for the PDSS scores across the PGIC response categories. Patients are categorized as having experienced “no change,” “improvement” (any response indicating improvement on the PGIC), or “large improvement” (moderate or large improvement selected on PGIC) based on their responses to each PGIC item.
Table 4. Clinically Meaningful Thresholds (Anchor-Based and Distribution Method): Large/Any Improvement Groups [Mean/Median] vs No Change Group [95% Confidence Intervals]
The smallest median change value on the Shortness of Breath domain that exceeded the distribution-based estimates and the lower bound of the correlated PGIC items was −1.47. For the ability to breathe PGIC anchor item, this value was −1.16. For the Fatigue/Pain Score, the mean and median values exceeded the lower bound of the 95% CI for the stable group on 3 of the 4 PGIC items. For the ability to breathe PGIC item, the median value for the “large improvement” group (−1.30) exceeded the lower bound of the 95% CI. A similar trend was observed for the Overall Fatigue Score. For each of these scores, a meaningful change threshold of −1.5 was selected because this value generally exceeded the lower bound of the 95% CIs, exceeded the distribution-based estimates, and was consistent with the changes observed for the “large improvement” group across the PGIC items. The other PDSS scores did not correlate with the PGIC anchors. Therefore, a meaningful change estimate was selected that exceeded the distribution-based methods and was consistent with the thresholds for the other scores. A threshold of −1.5 was selected for these scores as well.
PDIS
The Mood Score was correlated by ≥0.30 with the ability to breathe PGIC item while the Difficulty Performing Activities Score was correlated by ≥0.30 with the daily activities, disease-related symptoms, and mobility PGIC items (Figure 3). The PGIC items with stronger correlations with the PDIS scores were emphasized in estimating the meaningful change thresholds.
For the Mood Score, the median value of −1.61 from the “large improvement” group exceeded the lower bound of the 95% CI for the stable group on the ability to breathe PGIC item. This estimate also exceeded the distribution-based estimates and the lower bound of the 95% CI for the stable groups for 2 of the other 3 PGIC items. To be consistent with the thresholds for the PDSS, a threshold of −1.5 was selected for the Mood Score.
For the Difficulty Performing Activities Score, the scale range is smaller (0–4) than the other scores (0–10). Therefore, the meaningful change estimate is also smaller than for the other scores. The lower bound of the 95% CI for the stable group was approximately −0.70 for 3 of the 4 PGIC items. The smallest value that exceeded this was the median value of the “large improvement” group across the PGIC items with stronger correlations with the Difficulty Performing Activities Score; this estimate was approximately −1.0. This value also exceeded the distribution-based values and was selected as the meaningful change estimate for this score.
Discussion
Patients with LOPD experience a variety of symptoms and functional limitations that can substantially decrease quality of life. Reliable and valid measures of how patients with LOPD experience their condition are needed for use in clinical research. The items of the PDSS and PDIS were generated through rigorous qualitative research3 and are designed to quantify the patient's perspective on their symptoms and functioning. Data from the COMET clinical trial support the psychometric properties of these novel PRO measures.
The PDSS yields scores assessing key symptom domains of LOPD, including Shortness of Breath, Overall Fatigue, Pain, Upper Extremity Weakness, and Morning Headache. The Overall Fatigue domain includes aspects of both tiredness and muscle weakness in different areas of the body. Although these could be considered as separate concepts, factor analysis indicated that they were highly correlated and measured the same underlying experience. Therefore, they were combined into a single domain score. Because degeneration of muscle strength is a key clinical feature of LOPD, this specific domain score may be particularly relevant in future research with LOPD. Additional PDSS items evaluating specific dimensions of upper extremity weakness—weakness in the arms and hands—formed a separate domain, rather than loading with the more general muscle weakness item asking about “upper body” weakness. This separate domain may supplement the Overall Fatigue domain in future research because it measures a specific symptom that may be important in fully understanding the patient's experience of LOPD symptoms.
The Overall Fatigue and Pain domains were strongly correlated. Although they measure separate concepts, they also seemed to capture a common underlying symptom cluster. Therefore, a second-order factor was retained that encompassed these symptoms. This composite domain may provide an efficient method of indexing these different experiences using a single score in future clinical research. Morning headache was not closely associated with any of the other items, and a notable ceiling effect (ie, many patients reporting the lowest symptom severity) was observed for this item. Because the item captures a potentially important aspect of the patient's experience, it was retained as a stand-alone score that is not combined with any of the other items. However, it may be considered for deletion in future versions of the PDSS.
The PDIS includes 2 domain scores—a Mood Score and Difficulty Performing Activities Score. The Mood Score includes ratings on the 3 items measuring the severity of negative moods—anxiety, depression, and worry. The Difficulty Performing Activities Score addresses basic instrumental activities of daily living that can be affected by muscle weakness, fatigue, and shortness of breath. Difficulty with exercising was omitted from the Difficulty Performing Activities scale because it seemed to measure a concept that is much less commonly attempted or achievable by the target population. Nevertheless, examining it as a separate index may be useful in LOPD samples without profound physical limitations. The dichotomous items addressing whether the patient completed an activity are also not included in the Difficulty Performing Activities scale. Nevertheless, these items may provide useful information on their own, in that moving from not being able to complete an activity to being able to complete it may be an important treatment benefit. Alternatively, losing function and changing over time from being able to complete an activity to not being able to complete the activity may be an important marker of disease progression.
The reliability and validity of the PDSS and PDIS were supported. The correlations with the criterion measures indicated that the symptom and impact ratings were strongly associated with other PRO measures of symptoms, functioning, and health status. In addition, there were specific relationships with pulmonary, exercise capacity, and motor function measures that strongly supported the convergent and discriminant validity of the PDSS and PDIS: The Shortness of Breath Score of the PDSS was associated with the FVC pulmonary function measure, and the Difficulty Performing Activities Score was associated with the 6MWT and QMFT. These specific correlations differentiated the Shortness of Breath Score and Difficulty Performing Activities Score from the other domains, as expected. Similarly, although the Pain item of the EQ-5D-5L was correlated with several PDSS domains, it was most strongly related to the PDSS Pain Score, as expected. The pattern of change scores from baseline to week 49 was generally ordered correctly: patients who reported improvement on a global impression item also reported greater reductions in symptoms and increased ability to perform activities. Only the Mood Score was disordered. However, the anchor items did not assess mood (they covered symptoms and physical limitations), so this result is not surprising. In general, this pattern of convergent and divergent results strongly supports the construct validity of the PDSS and PDIS domain scores. The structural validity of the instruments would ideally be confirmed eventually on a different, larger sample.
Establishing a meaningful change threshold is an important step in developing a new measure for use in clinical research or practice. This allows each patient to be categorized as having experienced meaningful improvement or worsening in their symptoms or function. The analyses reported here followed the best practice guidelines of methodological experts and regulatory agencies. The estimates were generally consistent across the PDSS and PDIS domains. A decrease of 1 to 1.5 points or greater indicates a meaningful improvement on the PDSS and PDIS domains. These thresholds can be used in clinical research studies to identify responders to treatment. The relatively low correlations between the anchors and target measures increase the uncertainty around the meaningful change point estimates. When applying these thresholds, it may be useful to consider ranges of thresholds, and not just the single point estimates, to ensure that a robust change has occurred for individual patients.
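To illustrate how a range of thresholds, rather than a single point estimate, might be applied to individual patients, the following is a minimal sketch in Python. It is not the analysis code used in the study. The function names, the example scores, and the symmetric treatment of worsening (an increase of the same size) are assumptions made for illustration; only the 1.0 to 1.5 point range for improvement and the direction of scoring (a decrease indicates improvement) come from the text above.

```python
from typing import Dict, Sequence

# Lower and upper ends of the meaningful change range reported in the text.
DEFAULT_THRESHOLDS = (1.0, 1.5)


def classify_change(baseline: float, follow_up: float, threshold: float) -> str:
    """Classify one patient's domain score change against one threshold.

    A decrease of at least `threshold` points counts as improvement.
    Treating an equal-sized increase as worsening is an assumption here,
    not a result reported in the text.
    """
    change = follow_up - baseline
    if change <= -threshold:
        return "improved"
    if change >= threshold:
        return "worsened"
    return "no meaningful change"


def classify_across_thresholds(
    baseline: float,
    follow_up: float,
    thresholds: Sequence[float] = DEFAULT_THRESHOLDS,
) -> Dict[float, str]:
    """Return the classification at each threshold in the range."""
    return {t: classify_change(baseline, follow_up, t) for t in thresholds}


def is_robust_responder(
    baseline: float,
    follow_up: float,
    thresholds: Sequence[float] = DEFAULT_THRESHOLDS,
) -> bool:
    """True only if the patient counts as improved at every threshold."""
    return all(
        classify_change(baseline, follow_up, t) == "improved" for t in thresholds
    )


if __name__ == "__main__":
    # Hypothetical patient: domain score falls from 4.0 to 2.8 (a 1.2 point decrease).
    print(classify_across_thresholds(4.0, 2.8))
    print(is_robust_responder(4.0, 2.8))
```

In this sketch, a hypothetical patient whose score decreases by 1.2 points is a responder at the 1.0 point threshold but not at the 1.5 point threshold, which is the kind of borderline case where reporting results across the range is more informative than a single cut point.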
The PDSS and PDIS were developed using regulatory and expert best practice guidelines.21 This iterative process for PRO development included qualitative interviews, disease-specific questionnaire development, and quantitative psychometric analyses. The PDSS and PDIS are reliable and valid measures that can be used to evaluate important LOPD-specific symptoms and impacts on patients in observational research and clinical trials and to monitor disease progression in clinical practice.
Acknowledgment
The authors acknowledge administrative assistance from Jane M. Gilbert, BSc, CMPP, and Kim Coleman Healy, PhD, CMPP, of Envision Scientific Solutions, contracted by Sanofi for publication support services. The authors thank Leah Granby, Naveen Deenadayalu, Kristen Gyorda, Susan Macera, Megan Cirlincione, and Olivier Huynh-Ba for their contributions. The authors exerted sole scientific control and received no honoraria for participation. Coauthor Giulio Flore, MSc, died on November 4, 2022.
Study Funding
This study was supported by Sanofi.
Disclosure
M.M. Dimachkie has received consulting fees from Abata/Third Rock, Abcuro, Amazentis, ArgenX, Astellas, Catalyst, Cello, Covance/Labcorp, CSL-Behring, EcoR1, Janssen, Kezar, MDA, Medlink, Momenta, NuFactor, Octapharma, Priovant, RaPharma/UCB, Roivant Sciences Inc, Sanofi, Scholar Rock, Shire Takeda, Spark Therapeutics, TACT, UCB Biopharma, and UpToDate. M.M. Dimachkie has undertaken contracted research for or received unrestricted educational grants from Alexion, Alnylam Pharmaceuticals, Amicus, Biomarin, Bristol-Myers Squibb, Catalyst, Corbus, CSL-Behring, FDA/OOPD, GlaxoSmithKline, Genentech, Grifols, Kezar, Mitsubishi Tanabe Pharma, MDA, NIH, Novartis, Octapharma, Orphazyme, Ra Pharma/UCB, Sanofi, Sarepta Therapeutics, Shire Takeda, Spark Therapeutics, The Myositis Association, TMA, UCB Biopharma/RaPharma, and Viromed/Healixmith. P.S. Kishnani has received research/grant support from Sanofi and Amicus Therapeutics and has received consulting fees and honoraria from Sanofi, Amicus Therapeutics, Maze Therapeutics, JCR Pharmaceutical, and Asklepios Biopharmaceutical, Inc. (AskBio). P.S. Kishnani is a member of the Pompe and Gaucher Disease Registry Advisory Boards for Sanofi, Amicus Therapeutics, and Baebies. P.S. Kishnani has equity in Asklepios Biopharmaceutical, Inc. (AskBio), and Maze Therapeutics. C. Ivanescu is an employee of IQVIA. G. Flore was an employee of IQVIA at the time of study participation. C. Gwaltney has received consulting fees from Sanofi. N.A.M.E. van der Beek has received consulting fees and travel reimbursement from Sanofi and Amicus Therapeutics. A. Hamed is an employee of Sanofi. K. An Haack and her spouse are employees of Sanofi. L. Pollissard is an employee of Sanofi. E. Baranowski was an employee of Sanofi at the time of study participation. S.E. Sparks is an employee of Sanofi. P. DasMahapatra is an employee of Sanofi. 
Full disclosure form information provided by the authors is available with the full text of this article at Neurology.org/cp.
Footnotes
Funding information and disclosures are provided at the end of the article.
The Article Processing Charge was funded by Sanofi.
Submitted and externally peer reviewed. The handling editor was Associate Editor Belinda A. Savage-Edwards, MD, FAAN.
- Received January 19, 2023.
- Accepted June 9, 2023.
- Copyright © 2023 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the American Academy of Neurology.
This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND), which permits downloading and sharing the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.
References
- 1.
- Valle D,
- Antonarakis S,
- Ballabio A,
- Beaudet A,
- Mitchell G
- Reuser A,
- Hirschhorn R,
- Kroos M
- 2.
- 3.
- 4.
- 5.
- 6.
- Meyers L,
- Gamst G,
- Guarino A
- 7.
- Fayers PM,
- Machin D
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- Miller MR,
- Hankinson J,
- Brusasco V, et al.
- 14.
- Quanjer PH,
- Stanojevic S,
- Cole TJ, et al.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21. US Food and Drug Administration. Patient-focused drug development guidance public workshop. Methods to Identify What is Important to Patients & Select, Develop or Modify Fit-for-Purpose Clinical Outcomes Assessments. Discussion Document for Patient-Focused Drug Development Public Workshop on Guidance 3. Workshop October 15-16, 2018 [online]. Accessed November 10, 2022. fda.gov/downloads/Drugs/NewsEvents/UCM620708.pdf.