How To Perform A Flow Cytometry t-Test

The ultimate goal of any experiment is to analyze data and determine whether it supports or disproves a given hypothesis. To do that, scientists turn to statistics.

Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation, and organization of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied.

One of the first important concepts to take from this definition is the idea of a population. An example population might be all the people in the world who have a specific disease.

It is prohibitively expensive and time-consuming to study all of these people, so the scientist must sample a subset of the population, such that this sample represents the whole population as well as possible. How big the population is and what fraction is sampled in the experiment both contribute to the power of the experiment, a topic for another day.

Figure 1: Relationship of population, sample size, and statistics.

This sample size, and how it is obtained, should be described before one begins any experiments, as getting the population sampling correct is a critical component of improving reproducibility. Consequences of poor sample design can be found throughout history, such as the issues surrounding the use of Thalidomide in pregnant women.

The second critical component is to identify the question(s) that the experiments are designed to test. This will lead the researcher to state the null hypothesis (H0), which is what statistics are designed to test.

An additional factor that should be addressed at the beginning of the experimental process is the significance level (α value) — the probability of rejecting the null hypothesis when it is actually true (a Type I statistical error).

At the conclusion of the experiments, we collect the data to generate a P value, which we compare to the α value.

If the P value is less than the α value, the null hypothesis is rejected, and the findings are considered statistically significant. On the other hand, if the P value is greater than or equal to the α value, the null hypothesis cannot be rejected.

Once the experiments are done and the primary analysis is completed, it is time for the secondary analysis.

There are a host of different tests available, depending on what comparisons are being made and the distribution of the data (i.e., normally distributed or not). There is an excellent resource at the website of GraphPad Software, makers of GraphPad Prism.

If we wish to compare either a single group to a theoretical hypothesis, or two different groups, and these groups are normally distributed, the test of choice is the Student’s t-Test, a method developed by William Gosset while working at Guinness Brewery.

With the t-Test, a t-statistic is calculated from the samples as an intermediate step on the way to calculating the P value. The P value is then compared to the threshold (the α value) to determine whether the difference is statistically significant.
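As a rough illustration of that intermediate step, the sketch below computes the classic pooled-variance Student's t-statistic and its degrees of freedom for two groups. The data values and the function name `student_t` are hypothetical, chosen only for illustration, and the sketch assumes the two groups have similar variances.

```python
import math
import statistics

def student_t(sample_a, sample_b):
    """Return (t, df) for an unpaired, pooled-variance Student's t-Test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance uses the n - 1 (sample) denominator.
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pool the two variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df

# Hypothetical percent-positive values from two groups of five samples.
control = [12.1, 11.4, 13.0, 12.6, 11.9]
treated = [14.2, 15.1, 13.8, 14.9, 15.4]

t_stat, df = student_t(treated, control)
print(f"t = {t_stat:.3f}, df = {df}")
```

The P value is then read from the t distribution with `df` degrees of freedom, which is the part a statistics package normally handles.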

Assumptions About the Data

The t-Test assumes that the data comes from a normal (Gaussian) distribution. That is to say, the data follows a bell-shaped curve.

Figure 2: A normal distribution.

Although the t-Test was originally developed for small samples, it is also robust to deviations from the normal distribution when sample sizes are larger.

If the data doesn’t follow a normal distribution, a non-parametric test, such as the Wilcoxon or Mann-Whitney test, is best. These tests rank the data and compare the ranks rather than the raw values, so they do not assume that the data is normally distributed.
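To make the ranking idea concrete, here is a minimal sketch of how the Mann-Whitney U statistic can be computed by hand. The function names are hypothetical, tied values receive the average of the ranks they span, and the sketch stops at the U statistic. Turning U into a P value is best left to a statistics package.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) computed from the rank sum of the pooled data."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b

# Hypothetical, clearly skewed data for illustration only.
u_a, u_b = mann_whitney_u([1.2, 1.5, 1.9, 40.0], [2.5, 3.1, 3.3, 95.0, 120.0])
print(f"U_a = {u_a}, U_b = {u_b}")
```

Because only the ranks enter the calculation, the extreme values in each group carry no more weight than any other observation.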

Performing a t-Test

The minimum information needed to perform a t-Test is the mean, standard deviation, and number of observations for each of the two groups, as shown below.

Figure 3: Calculating a t-Test in GraphPad Prism (ver. 7) with input values calculated elsewhere.

The data is collected elsewhere, and the mean, standard deviation, and N are entered into the software. For visualization, a bar graph showing the average and standard deviation is plotted.

Using the analysis feature in the software, the appropriate statistical parameters are chosen (unpaired t-Test, with the threshold set to 0.05, the α value described above). The Welch correction is applied because the two samples cannot be assumed to have equal variances, which matters most when the N’s differ, as they do here.

Prism generates a summary table and shows details in the red box. In this case, the experimental sample is statistically significantly different from the control, and we may reject the null hypothesis.
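For readers without Prism, the same kind of calculation can be sketched from summary statistics alone. The example below is an illustration under stated assumptions, not a reproduction of Prism's output: the means, standard deviations, and N's are hypothetical, the function names are made up for this sketch, and the tail area of the t distribution is obtained by simple numerical integration (Simpson's rule) rather than by a statistics library.

```python
import math

def welch_t_from_stats(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance t-Test."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t distribution, via Simpson's rule."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    limit = abs(t)
    h = limit / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(limit)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    upper = 0.5 - total * h / 3  # area beyond |t| on one side
    return upper if t >= 0 else 1 - upper

def two_tailed_p(t, df):
    """Two-tailed P value for a t-statistic."""
    return 2 * t_upper_tail(abs(t), df)

ALPHA = 0.05  # significance level chosen before the experiment

# Hypothetical summary statistics: experimental group, then control group.
t_stat, df = welch_t_from_stats(13.0, 3.0, 8, 10.0, 2.0, 10)
p_value = two_tailed_p(t_stat, df)
reject_null = p_value < ALPHA
print(f"t = {t_stat:.3f}, df = {df:.2f}, P = {p_value:.4f}, reject H0: {reject_null}")
```

Note that the Welch degrees of freedom are generally not a whole number, which is expected.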

Another way to perform this test is to enter the data into your preferred program and let the software do the work, as shown below for Prism.

Figure 4: Calculating a t-Test in GraphPad Prism (ver. 7) by entering the data.

This second plotting method has the advantage of letting the reader see all the data points in the analysis.

Final Tips for Performing a t-Test

There are a few variations of the t-Test, based on sample size and variance in the data. One can perform a one- or two-tailed t-Test. The decision to use one versus the other is related to the hypothesis.

If the expected difference is in one direction, the one-tailed t-Test is performed. If it is not known, or the expected difference could be an increase or a decrease, the two-tailed t-Test is performed.
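Because the t distribution is symmetric, the two choices are related in a simple way, which the sketch below illustrates. The function name and the example numbers are hypothetical. The direction must be fixed as part of the hypothesis before the data is examined.

```python
def one_tailed_from_two_tailed(p_two_tailed, t_stat, direction):
    """Convert a two-tailed P value to a one-tailed P value.

    direction is the difference predicted in advance:
    "greater" (t expected to be positive) or "less" (t expected to be negative).
    """
    if direction not in ("greater", "less"):
        raise ValueError("direction must be 'greater' or 'less'")
    half = p_two_tailed / 2
    observed_matches = t_stat > 0 if direction == "greater" else t_stat < 0
    # If the effect went the predicted way, only one tail counts;
    # otherwise the one-tailed P value is large (1 minus that tail).
    return half if observed_matches else 1 - half

# Hypothetical result: two-tailed P = 0.04 with a positive t-statistic.
print(one_tailed_from_two_tailed(0.04, 2.4, "greater"))
print(one_tailed_from_two_tailed(0.04, 2.4, "less"))
```

When the observed difference goes in the predicted direction, the one-tailed P value is half the two-tailed value. When it goes the other way, the one-tailed test gives no support for rejecting the null hypothesis.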

Figure 5: The null hypothesis for either a one-tailed (left) or two-tailed (right) t-Test.

In conclusion, to perform the t-Test, it is critical to start from the beginning of the experiment to establish several parameters, including the type of test, the null hypothesis, the assumptions about the data, the number of samples to be analyzed (Power of the experiment), and the threshold.

The experiments are performed, and only then, after the primary analysis is completed, is statistical testing performed.

Each software package has its specific methods of performing these tests, and we have shown you one (GraphPad Prism). It is recommended that you consult your local statistical community and see what they are using for their analysis.

By establishing the statistical plan at the beginning of the experiment, the planning for the rest of the experiment becomes easier. Likewise, one does not begin to chase a hypothesis with the data. Rather, the data stands alone to support or reject the hypothesis.

ABOUT TIM BUSHNELL, PHD

Tim Bushnell holds a PhD in Biology from the Rensselaer Polytechnic Institute. He is a co-founder of—and didactic mind behind—ExCyte, the world’s leading flow cytometry training company, which organization boasts a veritable library of in-the-lab resources on sequencing, microscopy, and related topics in the life sciences.
