Causation in epidemiology: association and causation | Health Knowledge
The Bradford Hill criteria are widely used in epidemiology as a framework against which to assess whether an observed association is likely to be causal. One criterion, temporal relationship, refers to the necessity for a cause to precede its effect in time: Hill explained that for an exposure-disease relationship to be causal, the exposure must come before the onset of the disease.
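At its core, the temporality criterion is an ordering requirement between exposure and outcome. As a minimal sketch (the records and field names here are hypothetical, for illustration only):

```python
from datetime import date

# Hypothetical longitudinal records (field names are illustrative only).
records = [
    {"id": "A", "first_exposure": date(1995, 3, 1), "diagnosis": date(2010, 6, 15)},
    {"id": "B", "first_exposure": date(2012, 1, 9), "diagnosis": date(2011, 4, 2)},
]

def temporality_holds(rec):
    """A record can support causation only if exposure precedes the outcome."""
    return rec["first_exposure"] < rec["diagnosis"]

supporting = [r["id"] for r in records if temporality_holds(r)]
print(supporting)  # → ['A']
```

Real studies must, of course, contend with latency and measurement error, which is part of why this seemingly simple check is hard to satisfy for low-level, long-latency exposures.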
When ensuring temporality in the context of modern-day environmental exposures, it is important to consider that many of these involve low levels of exposure over extended time frames, and low incidence, micro-scale outcomes that occur following long latency periods. These factors make the prospect of designing a traditional epidemiologic study in which temporality is firmly established a costly, time consuming, and potentially unfeasible task.
However, improved chemical exposure monitoring and analytical capabilities, molecular epidemiology techniques, and advances in understanding disease progression allow for new and expanded ways to meet this criterion across a variety of study designs.
The use of biomarkers, state-of-the-art analytical testing at low limits of detection, and understanding of windows of toxicity and chromosome abnormalities in disease progression have increased our confidence in temporality as a useful criterion.
A modern example of expanded temporal analysis using data integration is illustrated by studies of low-dose exposures to arsenic through drinking water and food. Limited windows of exposure can be evaluated to determine effects of exposure during sensitive stages [ 29, 30 ]. By integrating new data and knowledge from these tools, temporal relationships can be considered even within cross-sectional or ecological studies that do not inherently establish temporality within the study design.
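Evaluating exposure during a sensitive stage reduces to an interval-overlap test between the exposure period and the window of susceptibility. A minimal sketch (all dates and the window boundaries are hypothetical):

```python
from datetime import date

def overlaps(window_start, window_end, exposure_start, exposure_end):
    """True if the exposure interval intersects the sensitive window."""
    return exposure_start <= window_end and exposure_end >= window_start

# Hypothetical prenatal sensitive window and one subject's exposure interval.
window = (date(2001, 1, 1), date(2001, 9, 30))
exposure = (date(2001, 8, 1), date(2003, 5, 1))

in_window = overlaps(*window, *exposure)
print(in_window)  # → True
```

In practice, the difficult part is not this check but reconstructing the exposure interval itself, which is where biomarkers and improved analytical detection limits come in.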
Today, our understanding of temporality includes a wider range of precisely defined exposure windows, some of which are more relevant to disease outcomes than previously thought. Through epigenetic mechanisms (i.e., heritable changes in gene expression that occur without changes to the DNA sequence itself), an exposure in one generation can influence disease risk in later generations. Such changes could be responsible for generational effects of exposure to the synthetic estrogen diethylstilbestrol (DES), which can lead to increased risk of breast cancer multiple generations removed from the initial exposure [ 32 ].
Analytical techniques are improving to detect these changes and to determine which epigenetic alterations may serve as indicators of disease potential and persistent biomarkers of a previous exposure [ 33 ]. Understanding the molecular-level changes that precede an observable outcome can help establish the temporal progression in a multigenerational causal story [ 34 ].
In traditional epidemiology, a monotonic biological gradient, wherein increased exposure results in increased incidence of disease, provides the clearest evidence of a causal relationship. However, Hill acknowledged that more complex dose-response relationships may exist, and modern studies have confirmed that a monotonic dose-response curve is an overly simplistic representation of most causal relationships.
In fact, most dose-response curves are non-linear and can even vary in shape from one study to the next depending on unique characteristics of the given population, exposure routes, and molecular endpoints assessed [ 36 ].
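As a concrete reference point, the classic monotonic sigmoid can be written with the Hill equation; varying the Hill coefficient alone changes the curve's shape, hinting at why fitted dose-response curves differ across populations and endpoints. The parameter values below are arbitrary illustrations, not data from any study:

```python
def hill_response(dose, emax=100.0, ed50=10.0, n=1.0):
    """Monotonic sigmoidal dose-response (Hill equation).
    emax: maximal response; ed50: dose giving half-maximal response;
    n: Hill coefficient controlling steepness of the curve."""
    return emax * dose**n / (ed50**n + dose**n)

doses = [1, 5, 10, 50, 100]
shallow = [round(hill_response(d, n=0.5), 1) for d in doses]
steep = [round(hill_response(d, n=3.0), 1) for d in doses]
# Both curves pass through the half-maximal response at ed50,
# but rise very differently across the dose range.
```

Both curves are monotonic; the non-monotonic shapes discussed below require a different functional form entirely.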
Furthermore, individual susceptibility and synergistic or antagonistic effects of cumulative exposures can make some biological gradients even more difficult to characterize.
An example of this effect can be seen in aryl hydrocarbon receptor (AhR)-based mechanisms. Integration of advanced statistical capabilities, data modeling techniques, and knowledge from increased understanding of biomolecular interactions has resulted in descriptions of more defined dose-response curves, capable of showing molecular effects at very low levels of exposure. Additionally, growing knowledge of genetic polymorphisms has illuminated the reasons behind individual variations in biological response to toxic insult and in the associated dose-response relationships [ 8 ].
It is now possible to observe threshold responses in the low-dose range, rather than assuming linearity for all substances. Furthermore, experimental support for a dose-response phenomenon referred to as hormesis has increased with improved molecular techniques. Hormesis is characterized by low-dose stimulation and high-dose inhibition [ 37 ]. The dose-response curve associated with this phenomenon is biphasic and, depending on the endpoint measured, is either J- or U-shaped [ 38 ].
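A J-shaped hormetic curve of this kind can be sketched as the sum of a low-dose stimulatory term that peaks and decays and a monotonic toxic term. All parameters below are hypothetical, chosen purely to produce the biphasic shape:

```python
import math

def hormetic_effect(dose, stim=1.0, scale=5.0, tox_slope=0.1):
    """Illustrative biphasic (hormetic) dose-response.
    Negative values: net stimulation below the baseline endpoint;
    positive values: net adverse effect at higher doses."""
    stimulation = stim * dose * math.exp(-dose / scale)  # peaks, then decays
    toxicity = tox_slope * dose                          # grows monotonically
    return toxicity - stimulation

low_dose = hormetic_effect(2)    # stimulation dominates: below baseline
high_dose = hormetic_effect(40)  # toxicity dominates: above baseline
```

The key qualitative feature, matching the description above, is that the response dips below baseline at doses well under the point where net harm appears.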
Hormesis has been observed in both toxicology and pharmacology, and the features of the observed dose-response are consistent and independent of the biological model, endpoint measured, chemical or physical stressor, and mechanism [ 37 ]. The most distinctive feature of hormesis is that it is repeatedly observed below the typical threshold dose [ 37 ].
Biological gradient is an example of how data integration can complicate causal inference. New tools and technical capabilities have allowed researchers to characterize a variety of low-level molecular endpoints that may not lead to disease or observable adverse outcomes on a larger scale.
For example, innate responses can repair, eliminate, or reverse molecular changes caused by low levels of exposure. Thus, molecular changes within the no-observable-adverse-effect level (NOAEL) may not contribute to disease and are more indicative of a threshold dose-response. Understanding the mechanisms at low-level exposures allows us to elucidate the shape of the dose-response curve. For example, the in vitro endpoints for asbestos toxicity include generation of oxidative stress, which results in genotoxicity and chromosome damage via DNA adduct formation [ 39 ].
However, damage at low levels, while measurable in vitro, is removed via cellular apoptosis, which represents an adaptive response and a threshold effect. Thus, responses at these low levels may not be indicative of disease, but rather adaptive responses indicating that a threshold must be overcome prior to disease initiation. Additionally, modern analytics have shown that epigenetic endpoints can occur in the low-dose range of environmental chemical exposures, though these measured changes may not lead to observable disease.
For example, Kim et al. reported epigenetic changes following low-dose developmental exposure to bisphenol A (BPA). These changes may provide insight regarding a mechanism of action for BPA during developmental exposure; however, further information regarding phenotypic changes is necessary to determine whether epigenetic changes at low-level exposures are significant indicators of a dose-disease response relationship.
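The adaptive-response reasoning above implies a threshold model: molecular insult below the cell's repair capacity leaves no persistent damage. A deliberately simple sketch (units and parameter values are hypothetical):

```python
def net_damage(dose, repair_capacity=5.0, potency=1.0):
    """Persistent damage remaining after adaptive responses (repair, apoptosis).
    Below the threshold dose (repair_capacity / potency) the insult is
    fully cleared, mimicking a no-observable-adverse-effect level."""
    insult = potency * dose
    return max(0.0, insult - repair_capacity)

below = net_damage(3.0)  # within adaptive capacity: no net damage (0.0)
above = net_damage(8.0)  # threshold exceeded: damage persists (3.0)
```

Under this toy model, measurable in vitro changes at doses below the threshold would not predict disease, consistent with the asbestos example above.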
Thus, biological gradient can be broadened to include molecular dose-response relationships, if the actual response occurs at a dose that is also associated with disease onset or progression.
Plausibility has historically been judged based on the presence of existing biological or social models that explain the association of interest. Today, tools such as high-throughput screening assays can be used to study a specific biologically plausible pathway and identify toxic agents that interfere with that pathway in defined ways.
The elucidation of biological pathways leading to liver toxicity has played a large role in advancing the interpretation of biological plausibility, and the integration of knowledge from various evidence streams has aided in those interpretations. The liver is typically the first organ with appreciable capacity for oxidative metabolism that an agent encounters after ingestion, and is therefore a key organ for studying potential toxicity [ 16 ]. Liver effects, demonstrated using techniques such as high-throughput in vitro and in silico cell manipulation, can be seen as a harbinger of further toxic endpoints that might occur with more refined, realistic exposures [ 41, 42 ].
Researchers can now predict plausible relationships using in vitro and in silico screening tools targeting defined disease mechanisms, which represents a potential paradigm shift in how scientists frame causal research questions and design studies.
Historically, causal inference was approached with the assumption of a single-factor direct relationship (i.e., one cause leading directly to one effect). However, researchers now understand that many disease outcomes are a result of the interplay and balance between multiple contributing and intermediary factors.
As such, demonstrating the biological plausibility of a causal relationship can be complex. However, improved statistical techniques can help researchers to understand complex disease progression from a molecular standpoint, where multiple risk factors, confounders, adaptive responses, and mediating mechanisms intersect [ 44–46 ].
Indeed, Hill identified histopathological evidence of bronchial epithelium changes and animal-based toxicity tests for the carcinogenicity of cigarette smoke as an example of a coherent story among several avenues of study design.
Today, coherence is another area in which molecular-based studies have been used to demonstrate a comprehensible story regarding various aspects of the exposure-to-disease paradigm. For example, lung tissue fiber analysis by scanning transmission electron microscopy (STEM) has expanded our knowledge of the internal biologically effective amphibole dose relating to altered structure and function of lung tissue, supporting the conclusion that amphibole asbestos fibers induce mesothelioma [ 48 ].
Alternatively, advanced mechanistic studies can elucidate an incoherent body of epidemiologic literature, thereby strengthening the causal inference in one direction or another. Consider, for example, the carcinogenicity of hexavalent chromium [Cr(VI)]. The body of epidemiologic literature regarding the carcinogenicity of Cr(VI) is limited and conflicting, particularly regarding ingestion exposures (e.g., through drinking water). However, a recent array of genomic, pharmacokinetic, and mechanistic research—including metabolism, bioavailability and kinetic studies, mutagenic mode of action studies, and gene expression profiling—demonstrates that ingested Cr(VI) does indeed have a carcinogenic profile [ 49, 50 ].
Yet in modern contexts, experimentation must consider that many diseases result from multifaceted exposures and follow complex progression pathways.
Cessation of exposure, as Hill described it, may not reverse or appreciably slow the progression of disease. In some cases, multiple risk factors, including diet, exercise, smoking, chemical exposures, and genetic predisposition, can contribute to disease onset and progression.
Thus, while the combination of these factors may culminate in disease, experimental manipulation of a single contributory factor may or may not result in observable decreases in disease incidence. Researchers using a data integration framework can now draw from toxicological findings for experimental insight into causality.
In vitro studies that test mechanistic pathways and demonstrate the biological role of an agent in disease progression may result in knowledge that can be used to predict potential human health outcomes in a much more time-efficient manner than human studies, particularly for adverse outcomes with a long latency period. The expanded understanding of temporality in light of data from varied evidence streams can also affect interpretation of the experiment criterion.
Individual exposures can cause epigenetic modifications to parental DNA that result in an observed effect in future offspring, even though there is no direct exposure to the offspring.
Experimental studies in animal models are often necessary to provide mechanistic support for an epidemiologic observation that involves complex temporality. For example, multiple animal studies provide support for the hypothesis that epigenetic changes induced by DES exposure in utero may be causative of transgenerational effects of DES exposure in females [ 32, 51–54 ].
Because epigenetic analyses in transgenerational human studies take decades and are riddled with potential confounders, reliance on animal models and advanced analytical techniques can help to support determination of a causal relationship.
Analogy has been interpreted to mean that when one causal agent is known, the standards of evidence are lowered for a second causal agent that is similar in some way [ 55 ].
Indeed, some might argue that enough knowledge exists and is accessible today to identify an analogy for every situation, especially if the researcher pulls that knowledge from multiple disciplines and across evidence streams.
Today, researchers have a wider range of tools by which to seek an analogy, including disease progression pattern, common risk factors and confounders, and biological mechanisms of action. Therefore, the modern value of analogy is not gained from confirming a causal inference, but rather from proposing and testing mechanistic hypotheses.
As an example, analogous mechanistic hypothesis testing has been conducted on carbon nanotubes (CNTs) using the extensive literature on the mechanistic toxicity of asbestos fibers. Models based on molecular structure and physical-chemical characteristics such as aspect ratio predict a mechanism of action similar to that of asbestos [ 57 ]. The physical morphology of CNTs appears similar to that of asbestos fibers; thus, respirable-sized fibers are expected to behave similarly in occupational settings and lead to similar lung translocation and deposition.
Additionally, asbestos fibers are known to cause inflammation and fibrosis of the lung pleura as a precursor to mesothelioma; these same outcomes have been demonstrated following CNT exposure [ 58, 59 ].
Further, CNTs of varying lengths have been found to stimulate the release of acute-phase cytokines from exposed human macrophages and mesothelial cells, demonstrating that CNT exposure results in a length-dependent pro-inflammatory response, similar to that of asbestos [ 60 ]. These findings enhance the asbestos analogy by confirming that CNTs may be capable of causing disease that begins with pleural inflammation—the same mechanism responsible for asbestos-related mesothelioma.
However, the results also demonstrate that not all CNTs have the same potential for carcinogenicity, implying that proactive design of engineered CNTs can limit the risks and allow for safe use of the compounds in a variety of applications—and that the analogy to asbestos should not be viewed in a way that limits continued research.
As the world of epidemiologic research has changed and expanded, our criteria for determining causal inference must similarly evolve to reflect the concepts of data integration. The Bradford Hill Criteria remain one of the most cited concepts in health research and are still upheld as valid tools for aiding causal inference [ 61 ].
However, the way each criterion should be applied, interpreted, and weighted in a data integration framework must be carefully measured against the varied and often novel types of data available in each unique situation. In some ways, data integration degrades the value and importance of certain criteria, as it offers alternative interpretations for each criterion that open the door to inductivism. In other words, in a data integration framework, researchers can interpret a criterion in whichever way fits the available data, as opposed to determining whether the data meet the criterion.
This type of application is dangerous as it bypasses the ultimate purpose of causal inference—determining whether the observed association is directionally causal or not. Nonetheless, data integration represents an opportunity to expand our abilities as researchers to think about causation. Herein, we have discussed how the data integration framework requires the compilation of more lines of evidence and more scrutiny for each of the criteria.
The examples above have demonstrated that data integration can enhance the application of the Bradford Hill Criteria in a causal analysis. Far from being outdated in a data integration framework, the criteria remain relevant: causal inference in the field of epidemiology is no longer informed solely by traditional epidemiologic studies, but rather by a complementary host of evolving research tools and scientific disciplines.
Although specific interpretations of each criterion have evolved over time, the concepts that underlie each criterion can be applied to a variety of methodologies to answer questions about causation. The Bradford Hill Criteria can aid researchers in connecting the dots within a body of literature, either to lead to suggestions of causal relationships or identification of what more research is needed to understand potential causality.
As ever, the criteria should not be used as a heuristic for assessing causation in a vacuum; rather they should be viewed as a list of possible considerations meant to generate thoughtful discourse among researchers from diverse scientific fields.
All authors read and approved the final manuscript.
Acknowledgements The authors would like to acknowledge Peter Ruestow and Brent Kerger for their contributions towards manuscript revisions. We would also like to thank the anonymous peer reviewers for their critical feedback. No outside funding was received for this project.
Compliance with ethical guidelines: Competing interests: the authors declare that they have no competing interests.

Hill's considerations are often summarized as a checklist, including:
- Specificity: there must be a one-to-one relationship between cause and outcome.
- Temporal sequence of association: exposure must precede outcome.
- Biological gradient: change in disease rates should follow from corresponding changes in exposure (dose-response).
- Biological plausibility: presence of a potential biological mechanism.
- Experiment: does the removal of the exposure alter the frequency of the outcome?

According to Rothman, while Hill did not propose these criteria as a checklist for evaluating whether a reported association might be interpreted as causal, they have been widely applied in this way. Rothman contends that the Bradford Hill criteria fail to deliver on the hope of clearly distinguishing causal from non-causal relations. For example, the first criterion, 'strength of association', does not take into account that not every component cause will have a strong association with the disease that it produces, and that strength of association depends on the prevalence of other factors.
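Rothman's point about component causes can be made concrete with a toy calculation. Suppose exposure X causes disease only together with a complementary factor Z, while an independent background mechanism operates in everyone; the prevalences below are hypothetical:

```python
def risk_ratio(p_cofactor, p_background=0.01):
    """Observed risk ratio for exposure X when X causes disease only in
    combination with cofactor Z (prevalence p_cofactor), alongside an
    independent background risk shared by exposed and unexposed groups."""
    risk_exposed = 1 - (1 - p_cofactor) * (1 - p_background)
    risk_unexposed = p_background
    return risk_exposed / risk_unexposed

weak = risk_ratio(0.005)   # Z rare: X appears weakly associated (~1.5)
strong = risk_ratio(0.5)   # Z common: X appears strongly associated (~50)
```

The same causal mechanism thus yields a very different strength of association depending on how common the complementary factor is, which is exactly why strength of association alone cannot settle causality.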
In terms of the third criterion, 'specificity', which suggests that a relationship is more likely to be causal if the exposure is related to a single outcome, Rothman argues that this criterion is misleading, as a cause may have many effects; smoking, for example, causes many diseases. The fifth criterion, 'biological gradient', suggests that evidence for a causal association is increased if a biological gradient or dose-response curve can be demonstrated. However, such relationships may result from confounding or other biases.
According to Rothman, the only criterion that is truly a causal criterion is 'temporality', that is, that the cause preceded the effect. Note that it may be difficult, however, to ascertain the time sequence for cause and effect.
The process of causal inference is complex, and arriving at a tentative inference of a causal or non-causal nature of an association is a subjective process. For a comprehensive discussion on causality refer to Rothman.