Posts Tagged ‘Bayesian’

FDA Finalizes Bayesian Guidance: Good News!

Tuesday, February 16th, 2010

The United States Food and Drug Administration has for several years been considering whether to allow the use of Bayesian statistics in the design and analysis of clinical trials.  Happily, in my opinion, the FDA recently finalized its decision to allow, and even encourage, the use of Bayesian statistics.  I believe this is a good thing both for the stated reason that Bayesian methods can enable faster and more efficient clinical trials, and because the Bayesian point of view is, to me, a more realistic and comprehensible approach to decision science in the clinical research context.  Although most of us are familiar with the traditional, frequentist approach to clinical trial data analysis, the subtleties of what frequentist methods really tell us are sometimes lost or misinterpreted, resulting in misleading or frankly incorrect conclusions about the results of clinical trials.  I believe that as the clinical research field adopts Bayesian methods, it will find that they align more closely with how we actually think about what we want to accomplish and that they better support the kinds of decisions we really want to make using clinical experiments.  Ultimately this may lead to better communication about trial results.

The FDA guidance document itself is fairly readable at 50 pages.  In addition to the official regulatory reasoning behind its creation, the guidance document provides a nice background on Bayesian analysis from the clinical research perspective.  The last part of the document describes, in straightforward language, the content the FDA would like to see in submissions and the conversations the Agency would like to have with sponsors before, during, and after trials designed and analyzed using Bayesian methods.

The document can be found on the FDA website.

Bayesian Speak, so to speak…

Friday, December 11th, 2009

What’s in a name…

Bayesian analysis is used in a wide variety of situations, from clinical trial design and analysis to spam filtering in email programs.  As with most disciplines, Bayesian analysis has developed its own terminology, shaped to capture the basic ideas inherent in this approach to statistical analysis.  Unfortunately, descriptions of the concepts represented by these terms are often impenetrable.  I have sought here to put these concepts in words that are a little more user-friendly.

I discussed the fundamental concepts of Bayesian analysis in an article that appeared here on October 26.  To recap, the ideas underlying Bayesian analysis are actually quite familiar and in some cases intuitive, perhaps more so than the concepts from classical (also known as “frequentist”) statistics that most of us were trained in.  Perhaps the most fundamental concept on which these two schools of analysis differ is the concept of probability.  While the frequentist views probability as the long-run fraction of trials with a particular outcome (say, the number of positive HIV blood tests out of 1000 people tested), the Bayesian views probability as a “degree of believability” of some parameter (I think there is a 7% chance of 100 HIV positive tests in this group of 1000).  This distinction is crucial insofar as it allows us to use our existing information to update our belief about the parameter’s real value as we go.

In equation form Bayes’ Rule looks like this:

P(X|E) = P(E|X)P(X)/P(E)

Sometimes I find it helpful to put equations in plain language to better understand them.  If Bayes’ Rule were put in plain language describing results from HIV blood testing, it might look like this:

  1. The probability that this person is HIV positive, given a positive blood test =
  2. (My belief about the proportion of the population that is HIV positive ×
  3. The probability of a positive blood test if the patient actually has HIV) ÷
  4. The overall probability of obtaining a positive blood test, whether a true or a false positive
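The plain-language version of Bayes’ Rule can be sketched in a few lines of code.  This is only an illustration: the numbers are the illustrative figures used later in this article (a high-incidence prior of 0.1, a true positive rate of 0.97, and a false positive rate of 0.001), not clinical data.

```python
def posterior_prob(prior, true_pos_rate, false_pos_rate):
    """P(HIV+ | positive test) via Bayes' Rule."""
    # Term 4: overall probability of a positive test, P(E),
    # summed over true positives and false positives
    p_test_pos = true_pos_rate * prior + false_pos_rate * (1 - prior)
    # Terms 1-3: posterior = likelihood * prior / evidence
    return true_pos_rate * prior / p_test_pos

# High-incidence area: prior belief p(HIV+) = 0.1
p = posterior_prob(prior=0.1, true_pos_rate=0.97, false_pos_rate=0.001)
print(round(p, 4))
```

Even before unpacking each term, the code makes the structure visible: the posterior is just the prior reweighted by how well each hypothesis explains the observed test result.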

(1)           Posterior Distribution

The posterior distribution is generally what is of interest in an analysis.  The value reported is the probability that the HIV status (parameter) is a specific value (positive or negative) given the observed test result (the current data).  In this way we can infer the value of the parameter, given the data we have and our prior beliefs about the value of the parameter.

(2)           Prior Distribution

The prior distribution represents in mathematical form the belief that one has about a parameter before collecting any data.  It may be that one is collecting data in an area with a low incidence of HIV; here the prior distribution might be p(HIV+) = 0.0001, with a corresponding p(HIV-) = 0.9999.  When collecting data in an area with a high incidence of HIV, one’s beliefs may lead to a prior distribution of this parameter in which p(HIV+) = 0.1 and p(HIV-) = 0.9.  These estimates can be based on previously collected data, other studies, or simply a hunch.

This last point is the hitch that many detractors of Bayesian analysis identify.  There is clearly a subjective component in selection of the prior distribution.  To alleviate this concern, one can use a non-informative prior or “flat prior”, in which the probability for each value of the parameter in question is equal.  Although you lose the ability to incorporate previously acquired knowledge into the analysis, other advantages of Bayesian analysis, such as the simplicity of interim analyses, are maintained.

(3)           Likelihood Function

The likelihood function represents the chance of observing the data given a specific value of the parameter.  In the example above, it is the probability that the test result will be positive if the individual is, in fact, HIV positive (the true positive rate).  This might be 0.97.  In an alternative example looking at the question of false positives, the likelihood would be the probability of a positive result in an HIV-negative person; this value might be quite low, such as 0.001.

(4)           Event Probability

This is simply the overall probability of seeing a positive HIV test, either from a person who is infected with HIV or one who is not.

Credible Interval

The analog of the frequentist confidence interval is the Bayesian credible interval.  I have included it here because confidence intervals are so commonly used.  A credible interval is derived from the posterior distribution: it is the range of parameter values containing a stated percentage (say 90%) of the posterior probability, and it expresses our confidence in the value of the parameter being estimated.  In fact, many of us speak of confidence intervals in terms that are appropriate only for the Bayesian credible interval: there is a 90% probability that the value of the parameter lies in the interval X to Y.
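Here is a small sketch of computing a credible interval, under assumptions of my own: a flat Beta(1, 1) prior on the HIV-positive proportion, hypothetical data of 100 positives out of 1000 tests (giving a Beta(101, 901) posterior), and a normal approximation to that posterior so the example needs only the standard library.

```python
from statistics import NormalDist

# Beta posterior parameters: flat Beta(1, 1) prior + 100 positives, 900 negatives
a, b = 1 + 100, 1 + 900
mean = a / (a + b)
var = a * b / ((a + b) ** 2 * (a + b + 1))

# Central 90% credible interval via a normal approximation to the Beta posterior
lo, hi = (NormalDist(mean, var ** 0.5).inv_cdf(q) for q in (0.05, 0.95))
print(f"90% credible interval: {lo:.3f} to {hi:.3f}")
```

The resulting statement is exactly the one people usually want to make: given this prior and these data, there is a 90% probability that the true proportion lies between the two endpoints.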

On to bigger things…

Combining these concepts yields a wide variety of potential trial designs.   One of the very cool things about Bayesian analysis is the ability to perform an analysis and then immediately use the resulting posterior distribution as the prior distribution in a subsequent analysis.  This enables interim analyses, adaptive randomization, and other design features that are difficult, if not impossible, in the frequentist realm.
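The posterior-becomes-prior idea can be sketched with a conjugate Beta-Binomial model, where the update is just arithmetic.  The three interim “looks” and their counts below are hypothetical, chosen only to show the mechanics.

```python
def update(a, b, positives, negatives):
    """Beta(a, b) prior + binomial data -> Beta posterior parameters."""
    return a + positives, b + negatives

a, b = 1, 1  # flat Beta(1, 1) prior before any data
# Three hypothetical interim analyses: each posterior is the next prior
for positives, negatives in [(12, 88), (9, 91), (14, 86)]:
    a, b = update(a, b, positives, negatives)
    print(f"posterior mean after this look: {a / (a + b):.3f}")
```

Each pass through the loop is a complete Bayesian analysis, and the running posterior can be inspected at any interim look without the multiple-comparison machinery a frequentist interim analysis would require.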

The models and computation that are required for many of these more useful designs are quite complicated.  But, in principle, all come back to the concepts above.