How to Perform Doublet Discrimination In Flow Cytometry

What is doublet discrimination?

You are probably familiar with the term “doublet discrimination” (or “doublet exclusion”) and have likely incorporated this step into at least some, if not all, of your gating strategies.

Even though you may utilize this important gating strategy, you may not have had the chance to delve deeper to explore exactly what doublets are and why it’s critical to exclude them. This article aims to do just that.

The first step in understanding how a doublet exclusion gate works is to define what a doublet is. Most simply, a doublet is a single event that actually consists of 2 independent particles.

The cytometer classified these particles as a single event because they passed through the interrogation point very close to one another. In other words, the particles were so close together when they passed through this laser spot, that the instrument was incapable of distinguishing them as individual events or particles.

Given how good cytometers are at measuring single cells, how would a situation arise in which the instrument would not be able to do this? The answer to this question has a whole lot to do with cytometer electronics, and how they classify cells and other particles as events.

You may recall that an “event” is the fundamental unit of measurement in flow cytometry, and is defined by what we call a pulse. A pulse occurs as a cell passes through a laser beam spot, and this passage generates signal from the detectors. This signal is monitored and processed by the cytometer’s electronics, and is the origin of the “A”, “H”, and “W” (area, height, and width) pulse parameters we know and love (Figure 1).

In the absence of a cell in a laser beam, the output of the detectors is not 0 but rather a low and constant level — a background “hum,” if you will — and is interpreted by the cytometer electronics as what we call the baseline.

Figure 1. Anatomy of the voltage pulse.

In order for something to actually be considered an event, a few things have to happen. First, the detectors have to generate signal.

Usually, this occurs when a cell or other particle passes through the laser beam. However, signal can also come from less welcome sources, such as an extraordinarily high PMT voltage (which amplifies the baseline “hum” enough to generate events by itself) or errant laser light escaping into the detector.

Second, this signal needs to cross what we call the “threshold” in the “trigger” channel. The threshold is the fine line between being an event and not being an event. Any pulse that fails to cross this line in the trigger channel is not considered to be relevant, and is ignored by the electronics.

What happens at the threshold is similar to what happens when you blow bubbles with a wand, like we did as kids. If you blow only slightly on the wand, a bubble may pucker but not enough to actually escape as a discrete unit. However, if you blow strongly enough (i.e. the signal is intense enough), that bubble will “pass the threshold” and escape as a fully fledged bubble (an event).

Third, the signal needs to drop back down to baseline. It is based on this third criterion that a doublet can occur. If 2 cells pass through the trigger laser so close together that the pulse does not fall back to baseline between them, the cytometer assumes their 2 pulses actually belong to one particle and will classify them as such: one, single, large event (Figure 2).

Figure 2. Voltage pulse of a doublet event.
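
The three criteria above can be sketched in a few lines of code. This is only a toy model, not how any particular cytometer's electronics work: the Gaussian pulse shape, the sample counts, and the threshold and baseline values are all assumptions made for illustration. An event opens when the signal crosses the threshold and closes only when the signal falls back to (near) baseline.

```python
import math

def trace(centers, n_samples=400, sigma=10.0, peak=1000.0, baseline=5.0):
    """Simulated detector signal: a constant baseline plus one
    Gaussian-shaped pulse per cell (all values are hypothetical)."""
    return [baseline + sum(peak * math.exp(-0.5 * ((i - c) / sigma) ** 2)
                           for c in centers)
            for i in range(n_samples)]

def detect_events(samples, threshold, baseline_level):
    """Return (start, end) sample indices for each event.

    An event starts when the signal rises above `threshold` and ends
    only when the signal has dropped back to `baseline_level` or below.
    """
    events = []
    start = None
    for i, value in enumerate(samples):
        if start is None and value > threshold:
            start = i                      # signal crossed the threshold
        elif start is not None and value <= baseline_level:
            events.append((start, i))      # signal returned to baseline
            start = None
    if start is not None:                  # trace ended mid-event
        events.append((start, len(samples)))
    return events

# Two cells far apart: the signal returns to baseline between them.
separated = detect_events(trace([100, 300]), threshold=50.0, baseline_level=10.0)

# Two cells close together: the signal never returns to baseline between them.
close_together = detect_events(trace([100, 125]), threshold=50.0, baseline_level=10.0)

print("well separated cells ->", len(separated), "event(s)")
print("closely spaced cells ->", len(close_together), "event(s)")
```

In this sketch the well-separated cells are recorded as two events, while the closely spaced pair is recorded as one longer event, which is the doublet.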

Why perform doublet discrimination?

These doublets can have some negative effects on results and data. Most critically, they wreak havoc when sorting. Failure to include doublet exclusion in your gating strategy is a sure way to end up with poor purity.

When 2 events, one a target event for sorting and the other a non-target event, comprise a doublet, BOTH will be sorted and purity will suffer. For sorts that require extra stringency in purity, 2 individual doublet exclusion strategies can be used, which we will discuss shortly.

The importance of excluding doublets is certainly not restricted to sorting. When identifying subpopulations for analysis, the presence of doublets can impact population frequency, which can in turn impact how the data is interpreted. If a doublet consists of a CD4+ cell and a CD4- cell, the event they comprise will be classified as CD4+, skewing the CD4+ percentage.

Additionally, doublets can make for some very strange staining patterns. If a doublet consists of one CD4+ cell and one CD8+ cell, you may mistake this data artifact for a rare CD4+CD8+ population.

Finally, including a doublet exclusion plot can also give you a sense of both how sticky your sample is and the general quality of your sample preparation. For example, lots of doublets may indicate poor enzymatic digestion.

How to perform doublet discrimination

So, how do you actually identify and exclude doublets? The answer to this question can be gleaned from taking a deeper look at pulses. Let’s revisit what a doublet pulse looks like in comparison to that of a single particle (Figure 3).

Figure 3. Voltage pulses for single (left) and doublet (right) events.

Notice the differences between the doublet pulse and the single particle pulse: both the area and the width of the doublet pulse are larger than the single cell’s (because two cells spend longer passing through a laser beam than one cell) but the heights of the two pulses are very close, if not identical. We can take advantage of these observations to parse out which pulses belong to doublets and which belong to true single events in the data set.
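
To make the comparison concrete, here is a minimal sketch that computes height, area, and width for a simulated single-cell pulse and a simulated doublet pulse. The Gaussian pulse shape, the spacing between the two cells, and the threshold are assumptions chosen for illustration; real instruments compute these parameters in their own electronics or firmware.

```python
import math

def trace(centers, n_samples=300, sigma=10.0, peak=1000.0):
    """Baseline-subtracted signal: one Gaussian-shaped pulse per cell
    (shape and numbers are hypothetical)."""
    return [sum(peak * math.exp(-0.5 * ((i - c) / sigma) ** 2) for c in centers)
            for i in range(n_samples)]

def pulse_parameters(samples, threshold=50.0):
    """Height = maximum value, area = sum of values, width = number of
    samples, all taken over the samples above the threshold."""
    above = [s for s in samples if s > threshold]
    return {"H": max(above), "A": sum(above), "W": len(above)}

single = pulse_parameters(trace([100]))          # one cell
doublet = pulse_parameters(trace([100, 130]))    # two cells in quick succession

for name in ("H", "A", "W"):
    ratio = doublet[name] / single[name]
    print(f"{name}: single={single[name]:.0f}  doublet={doublet[name]:.0f}  "
          f"doublet/single={ratio:.2f}")
```

With these assumed numbers, the doublet's height is nearly the same as the single cell's, its area is roughly double, and its width is clearly larger, which is the pattern the gating plots exploit.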

There are a few things that we need to do to accomplish this. First, we have to choose a channel in which to compare area, height, and width measurements to each other. The only requirement for this channel is that it should be scaled linearly.

The magnitude of difference in any pulse parameter between a doublet and single event is not large, and the resolution of linear scale is necessary to be able to accurately identify doublets to exclude. This requirement precludes most fluorescent parameters, which are typically scaled logarithmically, leaving forward and side scatter (which are coincidentally also nice and bright signals) as the best choices.

One exception is when performing cell cycle analysis by DNA measurement. In this case, the DNA dye measurement will be scaled linearly, and this channel is often the best choice for a doublet exclusion.

Next, we need to set up plots to make the determination. This is done in a variety of ways, and the method that is chosen is often based on personal preference. The most typical plots are based on forward scatter, as the chart below indicates, but side scatter can also be a good choice.

X-axis        Y-axis
FSC-Area      FSC-Height
FSC-Area      FSC-Width
FSC-Height    FSC-Width

The final step is to identify doublets, draw a region around the single cells to exclude those doublets, and gate all subsequent analysis on this region. Depicted below are some typical plots and where doublets can be found (Figure 4).

One important tip: if you are using BD “digital” FACSDiva instrumentation, the pulse width parameter is not really measured, but is calculated from the pulse area. Therefore, in order to ensure an accurate doublet exclusion gate, be sure to calibrate the Area Scaling Factor associated with the doublet discrimination parameter if you intend to use the width pulse parameter for doublet exclusion.

Figure 4: Examples of different methods for excluding doublets
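
If you analyze exported data in code rather than in a gating program, the same idea can be sketched as follows. This is one possible approach, not a standard method: it assumes linear-scale FSC-Area and FSC-Height values, assumes that most events are single cells, and keeps events whose area-to-height ratio is close to the sample median. The event data, the 1.8x area factor for doublets, and the 20% tolerance are all hypothetical.

```python
import random
import statistics

def singlet_gate(events, tolerance=0.2):
    """Keep events whose FSC-A / FSC-H ratio is within `tolerance`
    (a fraction) of the median ratio for the whole sample."""
    ratios = [e["FSC-A"] / e["FSC-H"] for e in events]
    median_ratio = statistics.median(ratios)
    low = median_ratio * (1 - tolerance)
    high = median_ratio * (1 + tolerance)
    return [e for e, r in zip(events, ratios) if low <= r <= high]

def synthetic_events(n_singlets=950, n_doublets=50, seed=0):
    """Hypothetical linear-scale data in which a doublet has about
    1.8x the area of a single cell with the same height."""
    rng = random.Random(seed)
    events = []
    for kind, count, area_factor in (("singlet", n_singlets, 1.0),
                                     ("doublet", n_doublets, 1.8)):
        for _ in range(count):
            height = rng.uniform(40_000, 120_000)
            area = height * area_factor * rng.uniform(0.95, 1.05)
            events.append({"FSC-H": height, "FSC-A": area, "kind": kind})
    return events

events = synthetic_events()
singlets = singlet_gate(events)
print(f"kept {len(singlets)} of {len(events)} events")
```

A fixed tolerance around the median is a shortcut. On real data you would still want to look at the plot and adjust the region by eye, just as you would when drawing the gate manually.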

You may also be wondering whether implementing a tighter, more restrictive forward and side scatter gate may remove the need for a formal doublet exclusion gate. While some doublets can be identified on a forward and side scatter plot, not all can, especially when cells are irregularly shaped or the sample preparation is heterogeneous, so it’s not worth the risk.

One final point. Don’t necessarily conclude that the presence of doublets in a sample reflects poor sample quality. Doublets are inevitable; even the best cell preparations contain them. Cells arrive at the laser at random, so some will end up close enough to one another to produce a doublet, even in a suspension that consists entirely of single cells. The faster cells are pushed through the system and the denser the sample, the higher the frequency of doublets. So, don’t fret; just gate them out.
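
To get a rough feel for how event rate drives coincidence, here is a small sketch that treats cell arrivals as a Poisson process. That is a common simplifying assumption, not a description of any specific instrument, and the 10-microsecond window is a hypothetical stand-in for the time during which a second cell would be merged into the same pulse.

```python
import math

def coincidence_probability(event_rate_per_s, window_s):
    """Probability that at least one other cell arrives within
    `window_s` seconds of a given cell, assuming random, independent
    (Poisson) arrivals at `event_rate_per_s` cells per second."""
    return -math.expm1(-event_rate_per_s * window_s)

WINDOW_S = 10e-6  # hypothetical window, not an instrument specification

for rate in (1_000, 5_000, 20_000):
    p = coincidence_probability(rate, WINDOW_S)
    print(f"{rate:>6} events/s -> {100 * p:.1f}% chance of a coincident cell")
```

Under these assumptions, the chance of a coincident cell climbs from roughly 1% at 1,000 events per second to roughly 18% at 20,000 events per second, even though every cell in the suspension is a single cell.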

That’s about it! We hope you now have an appreciation of what doublets are, in terms of the instrumentation, and how and why to make sure they are excluded when analyzing a data set.

ABOUT TIM BUSHNELL, PHD

Tim Bushnell holds a PhD in Biology from Rensselaer Polytechnic Institute. He is a co-founder of, and the didactic mind behind, ExCyte, the world’s leading flow cytometry training company, an organization that offers a library of in-the-lab resources on sequencing, microscopy, and related topics in the life sciences.
