Data-dependent versus data-independent acquisition

What is the difference between data-dependent and data-independent acquisition?


Modern mass spectrometers are versatile instruments and, when combined with clever design, can acquire data in many different modes, which is always useful in proteomics experiments. The two most popular strategies are data-dependent acquisition (DDA) and data-independent acquisition (DIA). Many variations of these exist, often under different names that describe very similar things. Proteomics scientists also employ a third broad category, called targeted acquisition, but this will not be discussed here.

 
An illustration of data dependent and data independent modes of acquisition.


Data-dependent acquisition

In DDA, a high-resolution precursor ion scan (also known as a survey scan or just ‘MS1’) is performed first. This generates a list of MS1 masses, most of which correspond to intact peptide ions. The exception is an experiment with no protein digestion step, in which case the MS1 scan represents much more complex protein isotopic envelopes. In order to sequence and identify these peptides (or proteins), the top n most abundant precursor ions are sequentially isolated and fragmented. The “sequential” aspect is important here; it means the peptides are fragmented one by one. If a complex mixture is eluting from a chromatographic column, there is only so much time to “catch” the peptides before they disappear (elute off the column). In addition, when a few highly abundant peptides are present in the mixture, the instrument tends to keep sampling these and completely miss the rest. To offset this, a ‘dynamic exclusion’ time window is also set. During this window, any mass already fragmented will not be considered again.
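The top n selection with dynamic exclusion can be sketched in a few lines of Python. This is only an illustration of the idea, not how any vendor's instrument control software actually works; the function name, the 30 second exclusion window and the 0.01 m/z tolerance are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity) from one MS1 survey scan


def select_precursors(
    ms1_peaks: List[Peak],
    top_n: int,
    now: float,
    excluded_until: Dict[float, float],
    exclusion_s: float = 30.0,  # assumed dynamic exclusion window, in seconds
    tol: float = 0.01,          # assumed m/z matching tolerance
) -> List[float]:
    """Pick up to top_n most intense precursors that are not currently excluded."""
    selected: List[float] = []
    # Walk through the peaks from most to least intense.
    for mz, _intensity in sorted(ms1_peaks, key=lambda p: p[1], reverse=True):
        if len(selected) == top_n:
            break
        # Skip any mass that was fragmented recently (dynamic exclusion).
        recently_fragmented = any(
            abs(mz - ex_mz) <= tol and now < until
            for ex_mz, until in excluded_until.items()
        )
        if recently_fragmented:
            continue
        selected.append(mz)
        excluded_until[mz] = now + exclusion_s  # start the exclusion clock
    return selected
```

Calling this on consecutive survey scans with the same `excluded_until` dictionary shows the effect: an abundant precursor picked in one scan is skipped in the following scans until its window expires, which gives less intense ions a chance to be selected.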
The fragmentation process generates tandem mass spectra (product ion spectra, MS/MS or MS2), which can then be searched against a database containing all theoretical spectra generated from a set of proteins thought to be present in the sample. An alternative searching approach is to perform de novo sequencing, where peptide sequence is inferred directly from MS2 spectra.
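To give a feel for what a "theoretical spectrum" is, here is a minimal sketch that computes singly charged b and y fragment ion m/z values for a peptide sequence. It assumes unmodified residues, uses the commonly tabulated monoisotopic residue masses, and ignores everything a real search engine also has to handle (modifications, higher charge states, neutral losses, scoring). The function name is made up for the example.

```python
from typing import List, Tuple

# Monoisotopic residue masses in Da (standard tabulated values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O
PROTON = 1.007276   # mass of the charge-carrying proton


def fragment_mz(peptide: str) -> Tuple[List[float], List[float]]:
    """Return singly charged (b ions, y ions) for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    # b_i holds the first i residues; y_i holds the last i residues plus water.
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_mz("PEPTIDEK")
    print("b ions:", [round(m, 4) for m in b])
    print("y ions:", [round(m, 4) for m in y])
```

A database search, in essence, generates lists like these for every candidate peptide and asks which one best explains the peaks in the measured MS2 spectrum.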

Although DDA is extensively used in proteomics, it has some disadvantages. An obvious one is that peptides tend to co-elute from the chromatographic column and arrive at the mass spectrometer simultaneously and in very different concentrations. The data-dependent nature of the experiment (usually the top 5 to top 20 most intense ions are directed for fragmentation) limits the dynamic range of the method. Peptides present at high relative concentrations are preferentially sampled, whereas information about the proteins represented by low-concentration peptides in the mixture can be incomplete.

Additionally, the selection of precursor ions is somewhat stochastic. As a result, the overlap between peptide identifications can be poor, even for technical replicates of the same sample (below 75% has been reported; I know I need to add a reference here, and here is one from David Tabb).
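"Overlap" can be defined in more than one way, so it helps to be explicit. The sketch below assumes the simplest definition, shared identifications divided by all identifications seen in either run; other definitions (for example, dividing by the smaller run) give different numbers. The peptide sequences are just placeholders.

```python
from typing import Iterable


def identification_overlap(run_a: Iterable[str], run_b: Iterable[str]) -> float:
    """Fraction of peptides identified in both runs, out of all peptides seen."""
    a, b = set(run_a), set(run_b)
    union = a | b
    if not union:
        return 0.0  # nothing identified in either run
    return len(a & b) / len(union)


if __name__ == "__main__":
    replicate_1 = ["PEPTIDEK", "LVNELTEFAK", "AEFVEVTK"]
    replicate_2 = ["PEPTIDEK", "LVNELTEFAK", "YLYEIAR"]
    print(identification_overlap(replicate_1, replicate_2))
```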

Another problem, pertinent to all database searching approaches, is the availability and reliability of protein sequence information, which cannot always be guaranteed.

While it might be easy to criticise DDA, the fact remains it is still the workhorse of proteomics.

Data-independent acquisition

DIA is a technique that was popularised much later than DDA, and has gained community attention because it attempts to solve some of the problems inherent to DDA.
In this mode of acquisition, scientists abandon the idea of sequencing one peptide at a time. Instead, all peptides eluting into the instrument at any given time are fragmented together; this concept is known as multiplex fragmentation. The process happens in a serial manner: all precursor ions are measured in the survey scan, which is followed by all-ion fragmentation.

An implementation of this method on the Waters TOF instruments was termed MSE (I think E stands for elevated energy, or maybe everything fragmentation; if anyone reading knows the answer, do let me know!). In an example of great instrument design, Waters also added an additional mode of separation by means of ion mobility; it has been marketed as HDMSE (and for this one I do know what the acronym stands for: HD is high definition).

Another variation of DIA performed on orthogonal time-of-flight mass specs, and recently very popular in the field, is SWATH (this one stands for Sequential Window Acquisition of all THeoretical mass spectra). In SWATH, instead of fragmenting the entire mass range, smaller independent mass windows (e.g. 25 Da) are isolated and fragmented sequentially, until the complete desired mass range is covered. This happens on a chromatographic time scale, and there is a small mass overlap between each window. The overlap is necessary as it turns out it greatly helps with peptide identification later on.
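The window scheme is easy to picture with a small sketch. The 25 Da width comes from the example above; the 1 Da overlap and the 400 to 1200 m/z range are assumptions chosen purely for illustration, and the function name is made up.

```python
from typing import List, Tuple


def swath_windows(
    start: float, end: float, width: float = 25.0, overlap: float = 1.0
) -> List[Tuple[float, float]]:
    """Return (low, high) isolation windows covering start..end with overlap."""
    if width <= overlap:
        raise ValueError("window width must be larger than the overlap")
    windows: List[Tuple[float, float]] = []
    low = start
    while True:
        high = min(low + width, end)  # the last window is clipped to the range
        windows.append((low, high))
        if high >= end:
            break
        low += width - overlap        # the next window starts inside this one
    return windows


if __name__ == "__main__":
    for low, high in swath_windows(400.0, 1200.0):
        print(f"{low:7.1f} - {high:7.1f}")
```

The instrument steps through this list once per cycle, fragmenting everything inside each window in turn, and then starts again from the first window.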

DIA-type acquisition is also possible on Orbitrap instruments and has been shown to increase the throughput of analysis. I am going to have a much more extensive post about this, so that’s all I’m going to say here.

I have not yet mentioned any drawbacks of DIA. Does it mean there are none? Certainly not, but as the technique (both the experimental and the bioinformatics part) is still in active development, I think the jury is still out… One obvious point should be mentioned, though.

Because of the all-ion fragmentation, the tandem mass spectra generated in DIA experiments are chimeric (they contain a lot of MS/MS peaks from many precursor ions) and are therefore a lot more complex than typical DDA spectra. Increasing instrument resolution, as well as decreasing the number of peptides eluting at any one time within the same isolation window, can help. But ultimately it is a bioinformatics problem to solve.
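A toy example makes the point about isolation windows concrete. The sketch below simply counts which precursors fall into which window at one moment in time; the precursor m/z values and window boundaries are invented for illustration, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def co_isolated(
    precursor_mzs: Sequence[float], windows: Sequence[Tuple[float, float]]
) -> List[List[float]]:
    """Group co-eluting precursors by the isolation window they fall into."""
    # A precursor belongs to a window if low <= m/z < high.
    return [[mz for mz in precursor_mzs if low <= mz < high] for low, high in windows]


if __name__ == "__main__":
    eluting_now = [410.2, 415.7, 430.1, 449.9]  # made-up co-eluting precursors

    wide = co_isolated(eluting_now, [(400.0, 450.0)])
    narrow = co_isolated(eluting_now, [(400.0, 425.0), (425.0, 450.0)])

    print([len(group) for group in wide])    # precursors per MS2 spectrum, one wide window
    print([len(group) for group in narrow])  # precursors per MS2 spectrum, two narrower windows
```

Every precursor in a group contributes fragments to the same MS2 spectrum, so fewer precursors per window means a less chimeric spectrum, at the cost of more windows to cycle through.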
