How To Improve Reproducibility Through The Automated Analysis Of Flow Cytometry Data

Written by Ryan Brinkman, Ph.D.

Editor’s Note: Reproducibility continues to be a critical concern for all researchers. From the NIH’s focus on reproducibility in grant applications to reviewers’ renewed scrutiny of how data have been analyzed and presented, it is imperative that researchers keep up with best practices to clear these hurdles.

One area flow cytometry researchers should focus on is automated data analysis. Over the last five years, these programs and workflows have changed and improved dramatically. As Dr. Brinkman discusses below, the automated analysis of flow cytometry data is coming into its own.

Flow cytometry (FCM) datasets will soon be two orders of magnitude larger than any that exist today, and new instruments, both flow and mass cytometry, have increased the number of parameters measured for each single cell by 50% (to 30).

Even in a 14-dimensional dataset there are 16,384 possible cell populations of interest per sample (1). The value of the information contained within large and complex single-cell datasets can only be realized with approaches that effectively curate, integrate, analyze, interpret, and share these datasets.
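The figure of 16,384 follows from simple counting if each of the 14 markers is scored as either positive or negative: 2^14 = 16,384. That binary scoring is an assumption made here for illustration, not something stated above. A minimal Python sketch of the arithmetic, using hypothetical marker names on a small three-marker panel so the full list stays readable:

```python
from itertools import product


def count_phenotypes(n_markers, states_per_marker=2):
    """Number of marker-state combinations (assumes independent +/- scoring)."""
    return states_per_marker ** n_markers


def enumerate_phenotypes(markers, states=("+", "-")):
    """List every combination of marker states as a phenotype string."""
    return [
        "".join(marker + state for marker, state in zip(markers, combo))
        for combo in product(states, repeat=len(markers))
    ]


# 14 markers, each positive or negative.
print(count_phenotypes(14))  # 16384

# Hypothetical three-marker panel, small enough to enumerate in full.
small_panel = ["CD3", "CD4", "CD8"]
phenotypes = enumerate_phenotypes(small_panel)
print(len(phenotypes))
print(phenotypes)
```

The count doubles with every marker added, which is why reviewing each candidate population by hand quickly becomes impractical as panels grow.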

What Is Reproducibility And Automated Analysis?

Many steps in the analysis pipeline can benefit from automation, and automated approaches have been developed for them (Figure 1). A major bottleneck in the analysis of flow cytometry data, however, is the identification of cell populations.

Manual analytical techniques lack the capacity and rigour to bring out the full potential of signals latent in the data (1, 2), and their subjectivity has been identified as the primary source of variation between analytic results (3, 4).

Figure 1: Typical flow cytometry automated analysis workflows.

Analysis usually starts with several pre-processing steps (blue boxes) followed by identification of cell populations of interest (orange boxes) and visualization.
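The two stages described for Figure 1 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not one of the published methods: the events are synthetic, the arcsinh cofactor of 150 is an assumed value, and the population-identification step is a plain one-dimensional 2-means split on a single channel. All function names are hypothetical.

```python
import math
import random


def arcsinh_transform(values, cofactor=150.0):
    """Pre-processing: compress the wide dynamic range of intensity values."""
    return [math.asinh(v / cofactor) for v in values]


def two_means_cut(values, iterations=50):
    """Population identification: find a cut between 'negative' and
    'positive' events on one channel using 1-D 2-means."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        cut = (low + high) / 2
        negative = [v for v in values if v <= cut]
        positive = [v for v in values if v > cut]
        if not negative or not positive:
            break
        new_low = sum(negative) / len(negative)
        new_high = sum(positive) / len(positive)
        if (new_low, new_high) == (low, high):
            break  # centres stopped moving
        low, high = new_low, new_high
    return (low + high) / 2


# Synthetic single-channel data: 700 dim events and 300 bright events.
rng = random.Random(0)
raw = [rng.gauss(100, 50) for _ in range(700)]
raw += [rng.gauss(5000, 1000) for _ in range(300)]

transformed = arcsinh_transform(raw)
cut = two_means_cut(transformed)
positive_fraction = sum(v > cut for v in transformed) / len(transformed)
print(f"cut = {cut:.2f}, positive fraction = {positive_fraction:.3f}")
```

Because the cut is computed from the data rather than drawn by eye, rerunning the script on the same file yields the same gate every time, which is the reproducibility benefit at issue here.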

To address this problem, the computational cytometry community has developed a collection of widely used approaches for the high-throughput analysis of FCM and mass cytometry (CyTOF) data (5). These methods have been extensively evaluated against manual analysis through the Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) project (6, 7, 8) and have been found to meet, and in many cases exceed, the performance of manual analysis.
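One common way to score an automated result against manual gating is a weighted F-measure: each manually gated population is matched to the automated cluster that gives the best F1 score, and the scores are averaged in proportion to population size. The text above does not say which metric was used, so treat the choice of metric as an assumption. The labels below are invented for illustration.

```python
from collections import Counter


def f_measure(manual, automated):
    """Weighted F-measure of automated cluster labels against manual gates."""
    if len(manual) != len(automated):
        raise ValueError("label lists must have the same length")
    n_events = len(manual)
    manual_sizes = Counter(manual)
    cluster_sizes = Counter(automated)
    overlap = Counter(zip(manual, automated))

    score = 0.0
    for population, pop_size in manual_sizes.items():
        best_f1 = 0.0
        for cluster, cluster_size in cluster_sizes.items():
            shared = overlap[(population, cluster)]
            if shared == 0:
                continue
            precision = shared / cluster_size
            recall = shared / pop_size
            f1 = 2 * precision * recall / (precision + recall)
            best_f1 = max(best_f1, f1)
        # Larger manual populations contribute more to the final score.
        score += (pop_size / n_events) * best_f1
    return score


# Invented per-event labels: manual gate names vs. automated cluster ids.
manual_labels = ["T", "T", "T", "B", "B", "NK"]
cluster_labels = [0, 0, 1, 1, 1, 2]
print(round(f_measure(manual_labels, cluster_labels), 3))  # 0.833
```

A score of 1.0 means the automated clusters reproduce the manual gates exactly. Lower values show how far the two analyses diverge, although the metric alone cannot say which of them is closer to the underlying biology.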

Only by taking advantage of cutting-edge computational methods will we be able to realize the full potential of the datasets now being generated and keep up with the rapid pace of progress in our fields.

Further Reading (References hyperlinked above)…

  1. Aghaeepour N, Chattopadhyay PK, Ganesan A, O’Neill K, Zare H, Jalali A, … Brinkman RR. Early immunologic correlates of HIV protection can be identified from computational analysis of complex multivariate T-cell flow cytometry assays. Bioinformatics 2012. 28(7):1009-16.
  2. O’Neill K, Aghaeepour N, Špidlen J, Brinkman RR. Flow cytometry bioinformatics. PLoS Comput Biol 2013. 9(12):e1003365.
  3. Maecker H, Rinfret A, D’Souza P, Darden J, Roig E, Landry C, … Sekaly R. Standardization of cytokine flow cytometry assays. BMC Immunol 2005. 6:13.
  4. Qiu P, Simonds E, Bendall S, Gibbs KJ, Bruggner R, Linderman M, … Plevritis S. Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE. Nat Biotechnol 2011. 29:886-91.
  5. Kvistborg P, Gouttefangeas C, Aghaeepour N, Cazaly A, Chattopadhyay PK, Chan C, … Maurer D. Thinking Outside the Gate: Single-Cell Assessments in Multiple Dimensions. Immunity 2015. 42(4):591-92.
  6. Aghaeepour N, Chattopadhyay P, Chikina M, Dhaene T, Van Gassen S, Kursa M, … Brinkman RR. A benchmark for evaluation of algorithms for identification of cellular correlates of clinical outcomes. Cytometry 2016. 89(1):16-21.
  7. Aghaeepour N, Finak G, The FlowCAP Consortium, The DREAM Consortium, Hoos H, Mosmann T, … Scheuermann RH. Critical assessment of automated flow cytometry data analysis techniques. Nature Methods 2013. 10(3):228-238.
  8. Finak G, Langweiler M, Jaimes M, Malek M, Taghiyar J, Korin Y, … McCoy J. Standardizing Flow Cytometry Immunophenotyping Analysis from the Human ImmunoPhenotyping Consortium. Scientific Reports 2016. 6:20686.

As mentioned above, FCM datasets will soon be two orders of magnitude larger than those that exist today. Researchers must therefore keep up with best practices for data reproducibility, especially in the area of automated data analysis. Doing so will help the field of flow cytometry, and scientific research overall, maintain its integrity while continuing to advance rapidly.

To learn more about how to improve reproducibility through automated analysis, and to get access to all of our advanced materials including 20 training videos, presentations, workbooks, and private group membership, get on the Flow Cytometry Mastery Class wait list.

Ryan Brinkman, Ph.D.

Ryan is a distinguished scientist in the Terry Fox Laboratory at the British Columbia Cancer Agency, and professor of Medical Genetics at the University of British Columbia. His lab is focused on applying bioinformatics techniques to flow cytometry data and developing and maintaining MIFlowCyt and the FlowRepository.