Biostatistics, Evidence, and Research Design - Program in Physical Therapy

We study measurement, design, and analysis as they pertain to rehabilitation. Rehabilitation is a complex, dynamic process with many interacting factors at physiological, psychological, and sociological levels. Furthermore, the tools to study rehabilitation are constantly evolving: scientists are collecting more and more data, and much of it is collected outside of traditional laboratory settings.

We collaborate with other research groups in the Program in Physical Therapy and the School of Medicine to ensure robust data collection and data management, to apply statistical and machine learning tools, and to promote open science practices.

Personnel

Keith Lohse, PhD, PStat – Faculty Investigator
Allison Miller, PhD, DPT – Postdoctoral Researcher
Chelsea Macpherson, PT, DPT, PhD, NCS – Postdoctoral Fellow

Research Themes and Projects

Ontology and Measurement in Neurological Recovery

Ontology refers to the set of concepts and categories in a subject area and the relationships between them. One such ontology is the International Classification of Functioning, Disability, and Health (ICF). Within the ICF, we need valid and reliable measures of “body structures” and “body functions”, “activities” at the level of the individual, and “participation” in a larger social context. We also need to understand how these constructs relate to each other, because those relationships are much more complicated than a simple linear story of structure -> function -> activity -> participation. Working with the International Stroke Genetics Consortium (https://www.strokegenetics.org/), we measure these behavioral phenotypes robustly so that we can better understand the genetics of recovery. To that end, we are seeking outcome measures that are accurate and efficient, that create minimal burden on participants, and that can be collected at scale (e.g., in the tens of thousands of participants needed for genetic studies).

Longitudinal and Time-Series Data

Rehabilitation is fundamentally about change within a person over time. Statistically speaking, however, these temporally dependent data violate the independence assumptions of many statistical tests and require specialized tools for analysis. Typically, people use the term “longitudinal data” to refer to a small number of data points (e.g., <10) that are collected over a long timescale (e.g., days or months apart). In contrast, people use the term “time-series data” to refer to large numbers of data points (e.g., hundreds to millions) that are sampled at a very high density (e.g., milliseconds or microseconds apart). Although these data types exist on a continuum, it is still useful to talk about them separately, because differences in the sampling rate and number of observations make them amenable to different types of analyses. For instance, time-series data can be transformed into the frequency domain with Fourier analysis, and future data can be predicted with various autoregressive models. In contrast, longitudinal data might be analyzed with linear or non-linear mixed-effects regression models, with the specifics of the model depending on the nature of the outcome (e.g., binary, ordinal, or interval/ratio) and the “shape” of the trajectory (e.g., linear, exponential, or sigmoidal). Although we do not create these mathematical tools, we apply them to rehabilitation problems and create instructional materials that help rehabilitation researchers use them.
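As a rough illustration of the two time-series techniques named above, the following Python sketch finds the dominant frequency of a signal with a naive discrete Fourier transform and fits a first-order autoregressive, AR(1), model by least squares. It is a minimal teaching example and not our analysis pipeline: the function names, the synthetic 5 Hz sine wave, and the 100 Hz sampling rate are all assumptions made for illustration. In practice one would normally use an optimized FFT routine and a dedicated time-series library.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Naive discrete Fourier transform (hypothetical helper).

    Returns the magnitude of each non-negative frequency bin, 0..n//2.
    """
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        magnitudes.append(abs(coeff))
    return magnitudes


def dominant_frequency(signal, sampling_rate_hz):
    """Frequency (Hz) of the largest bin, ignoring bin 0 (the signal mean)."""
    magnitudes = dft_magnitudes(signal)
    peak_bin = max(range(1, len(magnitudes)), key=lambda i: magnitudes[i])
    return peak_bin * sampling_rate_hz / len(signal)


def fit_ar1(signal):
    """Least-squares fit of x[t] = intercept + phi * x[t-1]."""
    prev, curr = signal[:-1], signal[1:]
    mean_prev = sum(prev) / len(prev)
    mean_curr = sum(curr) / len(curr)
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    intercept = mean_curr - phi * mean_prev
    return intercept, phi


def forecast_next(signal, intercept, phi):
    """One-step-ahead prediction from a fitted AR(1) model."""
    return intercept + phi * signal[-1]


# Assumed synthetic data: a 5 Hz sine wave sampled at 100 Hz for 2 seconds.
SAMPLING_RATE_HZ = 100
sine = [math.sin(2 * math.pi * 5 * t / SAMPLING_RATE_HZ) for t in range(200)]
peak_hz = dominant_frequency(sine, SAMPLING_RATE_HZ)

# Assumed synthetic data: a noise-free AR(1) series with known parameters.
series = [0.0]
for _ in range(49):
    series.append(0.5 + 0.8 * series[-1])
intercept, phi = fit_ar1(series)
next_value = forecast_next(series, intercept, phi)

print(f"dominant frequency: {peak_hz} Hz")
print(f"AR(1) intercept: {intercept:.3f}, phi: {phi:.3f}")
```

Because the AR(1) series here is generated without noise, the fit recovers the generating parameters almost exactly. Real rehabilitation data are noisy, so estimates would carry uncertainty, and higher-order models may be needed.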

Data Use, Re-Use, and Rehabilitation Informatics

As with many fields, rehabilitation has seen astronomical growth in the amount and complexity of the data we produce. For instance, physiological recordings from EEG or accelerometry recordings from inertial sensors are highly structured (e.g., voltages or accelerations in discrete intervals of time) and very densely sampled (e.g., 250-1,000 Hz for minutes or hours of recording). In contrast, electronic health records contain loosely structured data from millions of individuals, with complex data types that have unique relationships to each other and that may or may not be recorded over time. This means that researchers and their students are facing increasingly large and complex data sets. In our research group, we want to give researchers the tools and training to work with their own data effectively. More than any one project, we want to make sure that data are Findable, Accessible, Interoperable, and Reusable (FAIR) in rehabilitation science. To that end, we are part of the educational leadership team for the Reproducible Rehabilitation (“ReproRehab”) program funded by NCMRR (https://www.reprorehab.usc.edu/), and we collaborate with other researchers at WUSTL to harmonize and archive large research datasets.

Current Projects and Collaborations

NIH/NICHD R01 HD068290: Translation of In-Clinic Gains to Gains in Daily Life. PI: Catherine Lang.
NIH/NICHD-NCMRR R25 HD105583: Building a data science workforce to improve the reproducibility of rehabilitation research. PIs: Sook-Lei Liew & David Kennedy.
NIH/NIMH R01 MH123723: Variation in early motor function in autism, cerebellar injury, and normal twins. PIs: Catherine Lang, Catherine Limperopoulos, Natasha Marrus.
NIH/NCCIH R34 AT011015: Moving Mindfully: A MBSR-Centered Approach to Freezing in Parkinson Disease. PIs: Gammon Earhart & Kerri Rawson.
NIH/NIAMS R01 AR081881: Longitudinal biomechanics and patient-reported outcomes after periacetabular osteotomy for developmental dysplasia of the hip. PI: Michael Harris.