Written By: Tim Bushnell, Ph.D.
Isaac Newton was famous for saying “If I have seen further than others, it is by standing upon the shoulders of giants.” Implicit in that statement is that the information the giants provided was reproducible. In fact, reproducibility is central to the scientific method, and as far back as the 10th century the concept of reproducibility of data was being discussed by Ibn al-Haytham.
In 2011, Prinz et al. published an article describing a Bayer Healthcare case study on reproducibility, which found that only 25% of academic studies were reproducible. This was followed in 2012 by a report from Begley and Ellis indicating that only 11% of 53 landmark oncology studies could be replicated. So it seems that while we are trying to see farther, our lens may be out of focus.
Bruce Booth, writing for Forbes, published an article called “Scientific Reproducibility: Begley’s Six Rules,” in which he laid out six rules that should serve as a roadmap for evaluating scientific work, both published work and your own. These rules are:
- Were the studies blinded?
- Were all the results shown?
- Were the experiments repeated?
- Were the positive and negative controls shown?
- Were the reagents validated?
- Were the statistical tests appropriate?
While these rules are focused more on clinical trials, they are readily adapted to basic scientific inquiry. Thinking about these questions in the early stages of discovery and into pre-clinical studies should increase confidence in, and the reproducibility of, the later stages of the process.
Using Begley’s Rules
Reproducibility is a mindset. There is no single tweak that makes data reproducible. It is a matter of critically evaluating each process in the experiment and identifying areas that can be improved. It involves complete communication of the process. It involves relying on well-developed and documented standard operating procedures that everyone involved in the project is trained on.
Turning our attention back to Begley’s rules, how can these rules help you improve your research? They help provide a roadmap on how to design, validate, execute and report experimental data in a way that is more robust and reproducible.
1. Take the first rule, “Were the studies blinded?”
This is a critical component of clinical trials. In a blinded study, the subject does not know whether they are part of the control group or the experimental group. In a double-blinded study, the experimenter also does not know which group each subject belongs to.
This helps prevent experimenter bias from affecting the data. Blinding is not often used in the research setting, but with a little coordination within the laboratory it could be implemented there too.
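As a minimal sketch of what that coordination might look like, one lab member could replace the tube labels with random codes and hold the key until the analysis is finished. The function name `blind_samples` and the tube labels below are hypothetical, chosen only for illustration.

```python
import random


def blind_samples(sample_ids, seed=None):
    """Give every sample a random code.

    Returns (coded, key): `coded` is the list of codes the person running
    and analyzing the samples sees; `key` maps each code back to the
    original label and stays with a colleague until analysis is complete.
    """
    rng = random.Random(seed)
    codes = [f"S{n:03d}" for n in range(1, len(sample_ids) + 1)]
    rng.shuffle(codes)  # break any link between code order and group
    key = dict(zip(codes, sample_ids))
    return sorted(key), key


# Hypothetical tube labels, for illustration only.
tubes = ["control_1", "control_2", "control_3", "drug_1", "drug_2", "drug_3"]
coded, key = blind_samples(tubes)
print(coded)  # the analyst works from these codes alone
```

Leaving `seed` unset means the key cannot be regenerated by the analyst, so it has to be stored somewhere safe by whoever did the coding.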
2. Thinking about the second rule: “Were all the results shown?”
Flow cytometry is a data-rich technology and numbers are the name of the game. Experiments typically look at the change in the percentage of a population or the change in the expression pattern of a given protein.
For this reason, the results of an experiment can often be summarized and presented as a table or graph that provides statistical information about the experiment, which is used to support (or refute) the thesis of the argument. A histogram or bivariate plot showing the gating strategy is useful, but the meat of the argument will be in these summary figures, such as the one shown below.
Figure 1: Summary figure showing all the results of an experiment measuring the changes in CD4+ cells after drug treatment. All the data is shown with the mean and standard deviation indicated. The number of data points and the p-value between the two datasets are indicated.
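The numbers reported in a summary figure like this one (n, mean, standard deviation and a p-value) can be computed in a few lines. The sketch below uses made-up percentages, not the data behind Figure 1, and it uses an exact permutation test for the p-value only because that needs nothing beyond the Python standard library. The test used for the actual figure is not stated and may well differ.

```python
from itertools import combinations
from statistics import mean, stdev


def summarize(values):
    """Return the n, mean and sample standard deviation of one group."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(a) + list(b)
    total = sum(pooled)
    n_a, n_b = len(a), len(b)
    observed = abs(mean(a) - mean(b))
    extreme = 0
    count = 0
    # Try every way of relabeling the pooled values into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-9:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# Made-up % CD4+ values, for illustration only.
untreated = [42.1, 39.8, 44.5, 41.2, 40.7]
treated = [31.4, 29.9, 34.2, 30.8, 33.1]

print(summarize(untreated))
print(summarize(treated))
print(f"p = {permutation_p_value(untreated, treated):.4f}")
```

Reporting n, the spread and the test used alongside the p-value lets a reader judge the result rather than take it on trust.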
In addition to showing the data, thanks to the support of the Wallace H. Coulter Foundation and ISAC, there is a public database where flow cytometry data can be deposited. The Flow Repository allows researchers to upload their data for their published experiments. This allows for all researchers to review the data that the paper is based on, thus improving the ability of researchers to repeat and extend findings of interest.
3. In line with showing all the data is the third rule: “Were the experiments repeated?”
For any experiment, it is critical that the experiment is replicated. The number of replicates becomes the ‘n’ in any graph and helps show how robustly the experiment has been tested. From discovery-based work, the magnitude of the difference and the expected variance in the data can be estimated. These estimates, in turn, allow for a power calculation, which can help guide the researcher in determining the ‘n’.
The smaller the difference that the researchers wish to test, the more samples that they will need to run. The program Statmate is one useful tool for performing these calculations. Figure 2 shows how to determine the number of samples to run based on the Statmate output.
Figure 2: Statmate output used to determine the number of replicates needed.
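For readers without access to Statmate, the relationship between effect size and sample number can be sketched with the standard normal-approximation formula for comparing two group means, n per group = 2 × ((z for alpha + z for power) × SD / difference)². This is an approximation and not necessarily the calculation Statmate performs, so expect small differences from its output. The function name and the example numbers are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    delta: smallest difference in means worth detecting
    sd:    expected standard deviation within each group
    Uses the normal approximation, which runs slightly low for small n.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Illustrative values: SD of 4 percentage points, shrinking differences.
for delta in (10, 5, 2.5):
    print(delta, samples_per_group(delta, sd=4))
```

Halving the difference you want to detect roughly quadruples the number of samples needed, which is why the estimate of the expected difference matters so much.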
4. This leads to the fourth rule: “Were the positive and negative controls shown?”
In flow cytometry, the controls used to define the population of interest are very important to show. Because gating is a data reduction technique, incorrect gating can distort the data and the conclusions drawn from it.
Without showing and explaining the use of the controls, gating is more a subjective art than an objective evaluation. For those starting out in flow cytometry, using the data available in the Flow Repository along with the paper it came from is a good way to practice. The OMIPs (Optimized Multicolor Immunofluorescence Panels) are especially useful for this purpose.
5. Next is to examine the fifth rule: “Were the reagents validated?”
When thinking about flow cytometry, reagent validation is a critical step in the validation and optimization of any polychromatic panel. This is especially true of the antibodies used in experiments.
In Bradbury and Plückthun’s commentary in Nature, the authors estimate that about 50% of the money spent on antibodies is wasted because of poor antibody quality. Issues with antibodies can include cross-reactivity, lot-to-lot variability and even use of the wrong antibody for the application.
With the advent of recombinant antibodies, this should become less and less of an issue, but it will take time for these reagents to penetrate the market. At a minimum, every antibody that comes into the lab should be tested and titrated to ensure the reagent is working properly.
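One common way to read a titration is the stain index, (MFI of the positive population − MFI of the negative population) / (2 × SD of the negative population), choosing the amount of antibody that gives the best separation. The sketch below assumes that metric. The titration values are made up for illustration, and the amounts are in hypothetical µg per test.

```python
def stain_index(mfi_pos, mfi_neg, sd_neg):
    """Separation of positive from negative, scaled by the negative spread."""
    return (mfi_pos - mfi_neg) / (2 * sd_neg)


# Made-up titration: amount -> (MFI positive, MFI negative, SD negative).
titration = {
    2.0: (52000, 900, 420),
    1.0: (50500, 610, 300),
    0.5: (47000, 420, 210),
    0.25: (36000, 350, 190),
    0.125: (21000, 320, 185),
}

indices = {amount: stain_index(*vals) for amount, vals in titration.items()}
best = max(indices, key=indices.get)

for amount, si in indices.items():
    print(f"{amount:>6} ug/test  stain index {si:6.1f}")
print(f"Best separation at {best} ug/test")
```

Note that the highest amount of antibody is not the best choice in this example, because background on the negative population rises faster than the positive signal.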
Beyond the antibodies, any other reagent that is being used should be tested and validated. This includes the flow cytometer. While not a reagent per se, it is essential to gathering the data, and the results of the quality control performed on the system should be accessible to the investigator. In fact, investigators can build their own QC steps into their procedures to show that the instrument and assay are working.
6. The last of Begley’s rules is “Were the statistical tests appropriate?”
All researchers want their data to be statistically significant, because there is an inherent bias in what gets published. Dickersin et al. showed in their 1987 paper that papers containing statistically significant data were 3 times more likely to be published. Issues with HARKing (hypothesizing after the results are known) and p-hacking are troubling, but they can be reduced or avoided with a simple change of mindset in experimental design.
Before any experiments are performed, it is critical to consider the outcome and how one would validate the hypothesis being tested. Doing this at the beginning of the process, rather than towards the end, allows the researcher to define the statistical testing that will be used and the threshold for significance.
Any deviations from this plan need to be reported so that readers can understand and evaluate the statistical analysis. Defining the power of the experiment in advance also reduces the temptation to stop collecting data early, as soon as the results support your hypothesis. Finally, while outliers can be very interesting in their own right, define a rule for excluding them from the analysis before the data is collected, and report it.
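A plan like this can be written down in a form that is hard to quietly change later. The sketch below is one hypothetical way to do it: the `AnalysisPlan` fields, their values, and the choice of Tukey's 1.5 × IQR fences as the outlier rule are all assumptions for illustration, not a prescribed method. The measurements are made up.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)  # frozen: the plan cannot be edited after the fact
class AnalysisPlan:
    """Decisions recorded before any data is collected (illustrative)."""
    test: str = "two-sided t-test"
    alpha: float = 0.05
    n_per_group: int = 11
    iqr_multiplier: float = 1.5  # Tukey fence width, an assumed choice


def apply_outlier_rule(values, plan):
    """Split values into (kept, excluded) using the pre-declared IQR fences."""
    q1, _, q3 = quantiles(values, n=4)
    spread = plan.iqr_multiplier * (q3 - q1)
    low, high = q1 - spread, q3 + spread
    kept = [v for v in values if low <= v <= high]
    excluded = [v for v in values if not (low <= v <= high)]
    return kept, excluded


plan = AnalysisPlan()
# Made-up % CD4+ values, for illustration only.
measurements = [40.1, 41.3, 39.8, 42.0, 40.7, 41.1, 58.9]
kept, excluded = apply_outlier_rule(measurements, plan)
print(f"Excluded under the pre-declared rule: {excluded}")
```

Returning the excluded values, rather than silently dropping them, makes it easy to report exactly what was removed and why.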
In summary, Begley’s rules are a useful tool for evaluating the quality and reproducibility of data. They help you look for important issues in a report and in your own experiments. Couple them with best practices in flow cytometry, and you are well on your way to improving the rigor and reproducibility of your work.