“Just because we can detect an effect does not mean it matters”
– Jordan Ellenberg, How Not to Be Wrong: The Power of Mathematical Thinking
In a recent post, I challenged you to become a Credible Hulk. You may be asking yourself how to go about doing that. As I said, the amount of information available can be intimidating. This blog is one of the thousands that you could spend your valuable time reading. Many blogs and articles make claims about the best way to improve your health, fix problems you are facing, and live your life in general. I have no intention of trying to control or shame you into specific decisions. Instead, I want to arm you with the tools and understanding of research so you can draw independent conclusions.
Given that Ph.D. programs and scores of textbooks are devoted to explaining how research studies are developed and dissected, I hardly expect a brief blog post to impart all there is to know on this subject. But there are a few key points to be aware of when reading an article that references research. For those interested in a deeper dive, you can check out part 1 and part 2 from my clinician-focused blog.
Type of Study
While there is an evidence pyramid, the research question has to fit the study. Sometimes a randomized controlled trial is not possible. Take determining the risk of smoking as an example. We cannot randomly assign a few hundred people to smoke one pack of cigarettes a day, assign another few hundred to a control group, and then see which group has a higher death rate in 20 years. Additionally, the longer the study, the more potential for external biases and influences – such as diet and exercise habits – to impact the results. But when we look at all the data from multiple studies, the picture is clear.
While still not controlled trials, we have many studies with large numbers of participants that consistently show more smoking increases the risk of lung cancer. If you stop smoking, switch from unfiltered to filtered cigarettes, or cut back from two packs a day to one, the risk of developing cancer plummets. Any way you look at the problem, smoking increases cancer risk. So, while some research questions will never pass a review board for a clinical trial, there are many ways to test a hypothesis.
A primary goal of research is to translate the results to the real world. To do that, the people included in the study need to represent the population you want to apply the results to. If you want to determine how effective strength training is for kids, looking at studies assessing middle-aged adults provides little value. This highlights the importance of knowing who was included in a study. These issues are very commonly seen in nutrition studies.
For example, several large prospective cohorts have concluded that eating red meat causes increased mortality risk.1-3 Unlike in the smoking studies, the results are not consistent. In addition, the meat-eating participants in these studies also had the following characteristics compared to the vegetarians:
- Decreased exercise frequency and duration
- Increased BMI
- Increased smoking frequency
- Increased prevalence of diabetes
- Increased total caloric intake
- Increased alcohol intake
This leads to a couple of questions. Is it possible that elevated BMI, lack of exercise, excessive calorie intake, or a combination of these caused the increased mortality rates and not the red meat? Could the red meat have caused the development of those conditions in the first place? It is possible; however, an association is not a cause-and-effect relationship.
I am not going to get into a nutrition debate in this post; I can save that for another day. Instead, I want to point out the importance of knowing who was studied and what was done. If a study says physical therapy is no more effective than a home exercise program, how do the researchers define physical therapy? Many studies use physical therapy as a catch-all term for generic, low-intensity programs designed to fit the average person rather than the individual coming to the clinic. This creates confusion for the reader. We must also be cautious when looking at an observational study – one that simply observes outcomes over time rather than specifically intervening.4
The law of small numbers
We also need to make sure the study has enough participants. This is referred to as the power of a study. An underpowered study produces highly variable results, making it difficult to draw conclusions.
Consider the following problem outlined in Thinking, Fast and Slow by Daniel Kahneman:
A recent study on the prevalence of kidney cancer in 3,141 counties across the US found that counties with the lowest rates are mostly rural, sparsely populated, and located in traditionally Republican states in the Midwest, the South, and the West.
Take a moment and consider the reasons. Now let’s look at the counties with the highest rates of kidney cancer:
They are mostly rural, sparsely populated, and located in traditionally Republican states in the Midwest, the South, and the West.
This is not a typo. The descriptions are exactly the same. Why? The answer lies in the law of small numbers. The key phrase is ‘sparsely populated.’ Small numbers result in high variability. The counties with the fewest residents show the greatest variability in rates, so they dominate both the best and worst extremes.
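The effect is easy to demonstrate with a quick simulation. The sketch below is a hypothetical setup (not data from the study Kahneman describes): every county gets the exact same underlying cancer rate, and only the population size varies.

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_RATE = 1e-4  # identical underlying rate in every county (assumed for illustration)

# 500 small counties (1,000 residents each) and 500 large ones (1,000,000 each)
small_pop, large_pop = 1_000, 1_000_000

# Observed rate per county: random case counts divided by population
small_rates = rng.binomial(small_pop, TRUE_RATE, size=500) / small_pop
large_rates = rng.binomial(large_pop, TRUE_RATE, size=500) / large_pop

print(f"small counties: rates from {small_rates.min():.5f} to {small_rates.max():.5f}")
print(f"large counties: rates from {large_rates.min():.5f} to {large_rates.max():.5f}")
```

Even though the true rate is identical everywhere, the small counties produce both the lowest and the highest observed rates purely by chance – exactly the pattern in the kidney cancer example.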
Blinding
Blinding is when the participants, researchers, or both are unaware of who is receiving the actual intervention and who is receiving the control. For example, if you are testing the effect of a medication, do the patients know whether they are receiving the actual pill or a sugar pill (the placebo)? A placebo is a procedure or event in a health care setting that a person believes is an intended treatment. The resulting placebo effect is when the person experiences the expected effects, even though they received a treatment that does not have a mechanism for a direct effect (unlike, say, an anti-inflammatory medicine). This means a sugar pill can cause pain relief if the patient believes they received pain medication.
Research consistently shows the placebo effect is very real and very strong. By properly blinding research participants, we can determine whether the placebo effect occurred. Without blinding, we don’t know if improvements happened because of the intended actions of the intervention – such as a drug reducing inflammation – because of the power of the mind, or because of an unknown outside influence (such as a good night’s sleep).
Participants are not the only ones who need to be blinded. Studies have demonstrated that experimenters or providers act differently when they believe the intervention is a sham (like a sugar pill) rather than the real deal.5 Subtle changes in body language, effort, communication, and facial expressions can cue the patient that they are not receiving the experimental treatment. This can blunt any potential improvement.
Another issue is both the researcher and participant may demonstrate a performance bias. This is when we try harder if we know our performance is being measured. A researcher may focus more when providing the actual treatment instead of the sham. A participant may try harder on a performance test if they know they received the experimental treatment. These details are rarely found in journal articles or popular media.
Another concern is controlling the variables. This was briefly touched on with the nutrition studies above. While a study may test how effective an exercise protocol is, are the diets and sleep patterns of the participants tracked? Both diet and sleep have huge impacts on exercise. Some studies control for them while others don’t. What about the experimenters themselves?
Take a study of joint manipulation as an example. How skilled is the clinician providing the treatment? Aside from differences in the delivery of the technique, what is the patient’s belief about joints being cracked? If they hate the sound of cracking knuckles or do not like being very up close and personal with someone, they may tense up or simply respond poorly to a manipulation. How about the state of mind and body of the patient? This can affect any type of experiment. If they had a rough night of sleep, fought with a significant other, were recently laid off at work, or they are an Orioles fan and the MLB season just started (a 60-game season is probably a relief to them), chances are they will be in a foul mood that may affect the results. We can’t control everything, but we should know the potential issues with studies.
Lastly, we have the art of determining “so what?” Unfortunately, it is easy to read a headline and move on. Headlines are often misleading or completely misrepresent what the study results actually say. The size of an effect is important to consider. Let’s go back to rehabilitation. If you have difficulty walking your dog because your knee pain is a 6 out of 10, will dropping to a 5.5 out of 10 make a difference? A research study may show a new treatment provides a significant improvement in pain, but most patients will roll their eyes at the word “significant.” In the study, “significant” usually means statistically significant – a statistical calculation of the probability that the observed effect occurred due to chance. I will not get into the details here. For those interested, you can read my detailed research post here.
What you care more about is something referred to as clinical significance. For pain, this is typically represented by a 1.5/10 change, meaning you need to experience a change of at least 1.5 points to notice a difference in your life. If you simply read a headline or a poorly written article, you will miss the effect size. This can cause you to weigh a finding more heavily than you should. Another issue is drawing conclusions from correlations.
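A toy calculation shows how the two notions come apart. All of the numbers below are made up for illustration: a large trial measures a 0.5-point pain improvement so precisely that it is overwhelmingly "statistically significant," yet the effect never approaches a 1.5-point clinical threshold.

```python
import math

# Hypothetical trial: a tiny effect measured very precisely
n = 2000          # participants per group (invented)
mean_diff = 0.5   # average pain improvement vs. control, 0-10 scale (invented)
sd = 2.0          # standard deviation of pain scores (invented)
MCID = 1.5        # minimal clinically important difference for pain

# Two-sample z-test, a reasonable approximation at this sample size
se = sd * math.sqrt(2 / n)                 # standard error of the difference
z = mean_diff / se
p_value = math.erfc(z / math.sqrt(2))      # two-sided p-value

print(f"p-value: {p_value:.2e}")                       # far below 0.05
print(f"clinically meaningful? {mean_diff >= MCID}")   # False
```

The p-value is minuscule, so a headline could honestly say "significant improvement" – but a half-point change on a 10-point pain scale is one you would likely never notice.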
Correlation does not equal causation
A correlation is simply an association between two variables. This relationship can be positive or negative. For example, studying is positively correlated with test scores: the more I study, the higher I score. While this correlation makes sense, many correlated variables are completely unrelated, yet the association remains. For example, from 1999 to 2009, the number of letters in the winning word of the Scripps National Spelling Bee correlates with the number of people killed by venomous spiders. I think it is safe to say one did not cause the other. You can find many other entertaining and clearly unrelated examples. Looking at two correlated variables – A and B – the relationship could be that A causes B, B causes A, or an unknown C causes both A and B.
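The “unknown C” case is worth seeing concretely. In this sketch (all variables and numbers invented), a hidden confounder C drives both A and B. A and B never influence each other, yet they end up strongly correlated:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical confounder C drives both A and B
C = rng.normal(size=n)           # e.g. overall health consciousness (invented)
A = 2 * C + rng.normal(size=n)   # e.g. exercise frequency, caused by C only
B = 3 * C + rng.normal(size=n)   # e.g. vegetable intake, caused by C only

r = np.corrcoef(A, B)[0, 1]
print(f"correlation between A and B: {r:.2f}")  # strong, despite no causal link
```

An analysis that only ever looks at A and B would find a robust association and might conclude that exercise causes vegetable eating (or vice versa), when in fact neither touches the other. This is precisely the worry with the red meat cohorts above.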
Many people state that “correlation does not equal causation.” As I have shown, this is true. Unfortunately, the phrase is often treated like the “no offense, but…” disclaimer: the issue is acknowledged, then plowed through anyway. Just as someone will proceed to say something offensive under the assumption that the disclaimer made it okay, correlations are still treated as causation, leading to a flawed understanding of research.
One last issue to cover is relative vs. absolute risk. A headline often grabs our attention by saying “4 times the risk.” Without knowing the initial risk, however, this carries little meaning. If the original risk of a treatment side effect is 1 in 10,000, a 4-fold increase is hardly cause for panic. The absolute risk is the number we are most concerned about. If, on the other hand, the absolute risk is 5% and there is a 4-fold increase, then there is certainly cause for concern. We need both values.
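The arithmetic is simple enough to spell out. Here is a minimal sketch using the two made-up baseline risks from the paragraph above, both under the same “4 times the risk” headline:

```python
def describe(baseline_risk, relative_risk):
    """Turn a relative-risk headline into the absolute numbers that matter."""
    new_risk = baseline_risk * relative_risk
    absolute_increase = new_risk - baseline_risk
    return new_risk, absolute_increase

# Rare side effect: 1-in-10,000 baseline -> still rare after a 4-fold increase
new, inc = describe(1 / 10_000, 4)
print(f"rare:   new risk {new:.4%}, absolute increase {inc:.4%}")

# Common side effect: 5% baseline -> a 4-fold increase is a real concern
new, inc = describe(0.05, 4)
print(f"common: new risk {new:.2%}, absolute increase {inc:.2%}")
```

The same “4x” headline hides a 0.03-percentage-point bump in one case and a 15-percentage-point jump in the other, which is why both values are needed.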
At the end of the day, the goal is to translate research into our day-to-day lives. Unfortunately, many people are not armed with the training needed to effectively understand research. If we simply rely on headlines and Facebook comments, we are likely to be led astray. It is challenging, but going to the source, or at least asking the right questions, will allow you to better understand research and how it applies to your life. To close, here is a summary of tips to consider when reading a research study:
- What question am I trying to answer?
- What background information do I need to understand the study?
- Who are the participants of the study?
- What type of study am I reading?
- What are the limitations of the study? What is missing?
- What follow-up research do I need to do before drawing conclusions? One study is never enough.
Reading and digesting research takes practice. I am still learning and refining my ability to read and translate research into daily practice. I encourage you to stick with it and remain curious.
“Scientific knowledge is a body of statements of varying degrees of certainty – some most unsure, some nearly sure, none absolutely certain.”
– Richard Feynman
- Zheng, Y., et al., Association of changes in red meat consumption with total and cause specific mortality among US women and men: two prospective cohort studies. BMJ, 2019. 365: p. l2110.
- Wang, X., et al., Red and processed meat consumption and mortality: dose-response meta-analysis of prospective cohort studies. Public Health Nutr, 2016. 19(5): p. 893-905.
- Pan, A., et al., Red meat consumption and mortality: results from 2 prospective cohort studies. Arch Intern Med, 2012. 172(7): p. 555-63.
- Schuemie, M.J., et al., Interpreting observational studies: why empirical calibration is needed to correct p-values. Stat Med, 2014. 33(2): p. 209-18.
- Blasini, M., et al., The Role of Patient-Practitioner Relationships in Placebo and Nocebo Phenomena. Int Rev Neurobiol, 2018. 139: p. 211-231.
ABOUT THE AUTHOR
Zach has numerous research publications in peer-reviewed rehabilitation and medical journals. He has developed and taught weekend continuing education courses in the areas of plan of care development, exercise prescription, pain science, and nutrition. He has presented full education sessions at the APTA NEXT conference and at the ACRM, PTAG, and FOTO annual conferences, as well as multiple platform sessions and posters at CSM.
Zach is an active member of the Orthopedic and Research sections of the American Physical Therapy Association and the Physical Therapy Association of Georgia. He currently serves on the APTA Science and Practice Affairs Committee and the PTAG Barney Poole Leadership Academy.