Discovery, identification and quantification of scent mark components

The approaches for the discovery, identification and quantification of the volatile and non-volatile components of scent marks are multidisciplinary, but primarily involve mass spectrometry. Once a pheromonally active compound has been identified and characterised, synthetic versions can be produced and used in behavioural experiments to confirm their roles in the physiological and psychological responses seen in, for example, mice (Novotny 2003).

In mammals, pheromones are typically embedded in a complex biological matrix, such as urine, so the first step in the discovery of volatiles is to separate the volatiles of interest from the rest of the sample. Volatiles are extracted from a sample, or from the headspace above the sample in a sealed container, using a solvent or by solid phase microextraction (SPME) or stir bar sorptive extraction (SBSE), before being analysed by capillary gas chromatography coupled with mass spectrometry (GC-MS) (Robertson et al. 1993; Soini et al. 2005). Since pheromone production is known to be affected by hormones, the metabolic profiles of volatiles from animals in different behavioural or endocrine states can be compared by GC-MS, giving an insight into the compounds of interest in each state (Novotny 2003). This information can then guide more detailed analysis of the pheromones of interest and the design of the relevant behavioural experiments. This approach has allowed the identification and characterisation of the volatile signals confirmed to be responsible for puberty induction and delay, oestrus induction, attraction, aggression and dominance in mice (Jemiolo et al. 1985, 1986, 1991; Ma et al. 1999; Novotny et al. 1985, 1986, 1990, 1999).

The term ‘proteomics’ was coined by Wilkins et al. (1996), and advanced proteomic analysis has been instrumental in the identification, characterisation and quantification of the non-volatile protein components of scent marks, even in the absence of genomic data. The use of a wide range of mass spectrometric techniques, including matrix assisted laser desorption ionisation–time of flight mass spectrometry (MALDI-ToF-MS), electrospray ionisation–mass spectrometry (ESI-MS), liquid chromatography coupled with mass spectrometry (LC-MS) and liquid chromatography with tandem mass spectrometry (LC-MS/MS), has allowed proteins in scent marks to be identified and characterised in terms of their molecular weight, peptide mass fingerprints and amino acid sequences. Chromatographic techniques have enabled the isolation of proteins of interest from complex matrices, and recombinant versions of these proteins can be used in behavioural experiments to provide a further insight into their roles in communication. The remainder of this section focuses on the biochemical methods commonly used for the identification and characterisation of major urinary proteins (MUPs), which have proven to be of great interest in mouse olfactory communication.

1.6.1 Discovery of MUPs in a scent mark

A commonly used method for the discovery of MUPs in a scent mark is analysis of the sample by sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE). SDS, an anionic detergent, linearises and negatively charges proteins in the sample, allowing them to be separated on the polyacrylamide gel according to their molecular weight. Staining of the gel post-electrophoresis allows the protein content of the sample to be visualised by eye, and the presence of MUPs in the sample would usually result in one or more bands at approximately 20 kDa on the gel (the use of a molecular weight marker on the gel allows the location of ~20 kDa proteins to be determined). The MUP darcin, with a mass of 18,893 Da, is however seen at around 16 kDa on a gel: even after treatment with SDS, darcin does not completely linearise, indicating that protein folding and shape can also affect the migration of proteins during SDS-PAGE analysis (Armstrong et al. 2005; Phelan et al. 2014). The number of protein bands seen on a gel gives an indication of the protein complexity of a scent mark, whilst the density of each band gives an approximate indication of the abundance of that protein. In mouse urine (particularly male urine), the high concentration of a number of different MUPs gives a large, dense band at around 20 kDa, meaning that the number of MUPs and their relative abundances cannot be determined from this initial analysis. Methods such as native PAGE (non-denaturing, meaning proteins can be separated according to folding and shape as well as molecular weight) and 2D-PAGE (where proteins are first separated by charge before being separated again by their molecular weight) allow proteins such as MUPs to be visualised more effectively prior to further biochemical analysis.
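The comparison against a molecular weight marker can be made semi-quantitative. The sketch below assumes the common approximation that, within the resolving range of a gel, log10(molecular weight) varies roughly linearly with relative migration distance (Rf). The marker ladder and Rf values are hypothetical illustration values, not measurements from this work, and the function names are our own.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_mw_kda(rf, marker_rf, marker_kda):
    """Apparent molecular weight (kDa) of a band from its Rf,
    using a log10(MW) versus Rf calibration built from the marker lane."""
    log_mw = [math.log10(m) for m in marker_kda]
    slope, intercept = fit_line(marker_rf, log_mw)
    return 10 ** (slope * rf + intercept)

# Hypothetical marker ladder (kDa) and relative migration distances.
MARKER_KDA = [66.0, 45.0, 30.0, 20.0, 14.0]
MARKER_RF = [0.20, 0.32, 0.48, 0.63, 0.78]

if __name__ == "__main__":
    # Apparent mass of a hypothetical band that migrated to Rf = 0.70.
    print(round(estimate_mw_kda(0.70, MARKER_RF, MARKER_KDA), 1))
```

Such an estimate is an apparent molecular weight only. As the darcin example shows, a protein that does not fully linearise in SDS can migrate away from the position its true mass would predict.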

1.6.2 Identification of MUPs in a scent mark

Whilst SDS-PAGE analysis can give an approximation of the molecular weights of proteins in a scent mark, mass spectrometry is the primary tool for accurate identification of proteins. An accurate molecular weight for proteins in a sample can be determined via electrospray ionisation mass spectrometry (ESI-MS). Electrospray ionisation is a ‘soft’ ionisation technique, meaning very little fragmentation occurs during ionisation, which allows the formation of gas phase molecular ions. ESI can also produce multiply charged ions, which effectively extends the mass range of the analyser to accommodate the much higher molecular weights of proteins (Ho et al. 2003; Pitt 2009). Emitted ions are accelerated into the mass analyser and subjected to mass spectrometric analysis, where the multiply charged ions of each protein appear as a cluster of peaks, the charge state envelope (Ho et al. 2003). The number of charges on a protein molecule depends on its molecular weight and the number of accessible basic sites (Ho et al. 2003). The molecular weight of the protein(s) can be calculated from the observed protein envelope using software provided by the mass spectrometer manufacturer, which processes and transforms the raw mass spectrum to a true mass scale and determines the molecular weight (Armstrong et al. 2005). This is an important step in identifying MUPs in a scent mark, as calculated molecular weights can be compared and matched to the known MUP molecular weights to confirm their presence.
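The arithmetic behind transforming a multiply charged envelope to a true mass scale can be illustrated with a minimal sketch. It assumes positive-ion mode, that every charge is a proton, and that adjacent peaks in the envelope differ by exactly one charge. It is not the algorithm used by any manufacturer's software, and the charge states chosen for the example are hypothetical.

```python
PROTON = 1.007276  # mass of a proton in Da

def mz_for_charge(mass, z):
    """m/z of an [M + zH]z+ ion for a neutral mass in Da."""
    return (mass + z * PROTON) / z

def charge_from_adjacent(mz_high, mz_low):
    """Charge of the peak at mz_high, given that the neighbouring peak
    at mz_low carries exactly one more proton."""
    return round((mz_low - PROTON) / (mz_high - mz_low))

def deconvolute(peaks):
    """Mean neutral mass estimated from at least two m/z peaks
    belonging to consecutive charge states of one protein."""
    ordered = sorted(peaks, reverse=True)
    masses = []
    for mz_high, mz_low in zip(ordered, ordered[1:]):
        z = charge_from_adjacent(mz_high, mz_low)
        masses.append(z * (mz_high - PROTON))
    return sum(masses) / len(masses)

if __name__ == "__main__":
    # Simulated envelope for a protein of 18,893 Da (the darcin mass quoted
    # in the text), using hypothetical charge states 10+ to 15+.
    envelope = [mz_for_charge(18893.0, z) for z in range(10, 16)]
    print(round(deconvolute(envelope), 1))
```

With real spectra, peak picking, noise and overlapping envelopes from several MUPs make the problem considerably harder, which is why dedicated deconvolution software is normally used.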

To identify the protein bands observed in SDS-PAGE analysis, parts of the bands can be excised and digested with a protease to create peptide fragments suitable for mass spectrometric analysis. A protease hydrolyses the peptide bonds between amino acids, and the use of different specific proteases for proteolysis (such as Lys-C, which only hydrolyses at the carboxyl side of lysine residues, and Glu-C, which specifically hydrolyses at the carboxyl side of aspartic or glutamic acid residues) creates different peptide mass patterns. The peptides resulting from protein digestion can be subjected to mass spectrometric analysis to create what is known as a ‘peptide mass fingerprint’ (PMF), which can then be searched against a database containing known PMFs to help determine the identity of the protein (Perkins et al. 1999). MALDI-ToF-MS is a popular method for accurately measuring the m/z of each peptide and generating peptide mass fingerprints. Like ESI, MALDI is a soft ionisation technique, resulting in little or no undesired fragmentation of peptides, and it usually produces singly charged ions, generating relatively simple mass spectra suitable for searching against a peptide mass fingerprint database. Searching against a relevant PMF database involves matching the experimental peptide masses to a theoretical dataset generated with a matching protein digestion protocol, and a high score assigned by the database search engine indicates a strong match.
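The idea of a theoretical digest and a peptide mass fingerprint match can be sketched as follows. This is an illustration under stated assumptions: no missed cleavages, no modifications, monoisotopic masses, singly protonated peptides, and a raw count of matched masses standing in for the probability-based score of a real search engine. The example sequence and the function names are hypothetical.

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

def digest(sequence, cleave_after="K"):
    """Cleave on the carboxyl side of each residue in cleave_after.
    'K' mimics Lys-C; 'DE' mimics Glu-C as described in the text."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(MONO[r] for r in peptide) + WATER + PROTON

def count_matches(observed, sequence, cleave_after="K", tol_ppm=50.0):
    """Number of observed masses within tol_ppm of a theoretical peptide."""
    theoretical = [mh_plus(p) for p in digest(sequence, cleave_after)]
    return sum(
        any(abs(o - t) / t * 1e6 <= tol_ppm for t in theoretical)
        for o in observed
    )

if __name__ == "__main__":
    seq = "AEEKSVTGKLMEDR"  # hypothetical sequence, not a MUP
    print(digest(seq, "K"))   # Lys-C-like digest
    print(digest(seq, "DE"))  # Glu-C-like digest
```

Running both digests on the same sequence shows why the choice of protease matters: the two peptide sets, and therefore the two mass fingerprints, are entirely different.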

If no definitive identification can be made with a PMF, further peptide information can be obtained by analysing digested material using tandem mass spectrometry (MS/MS), which involves two stages of mass spectrometry. During the first stage, ionised intact peptides (often referred to as ‘precursor ions’) are measured before being isolated and fragmented. The fragmentation stage in MS/MS is most often via collision induced dissociation (CID), where the precursor ions collide with the molecules of an inert gas. The resulting fragments (known as the ‘product ions’) are measured in the second MS analyser, creating a peptide fragmentation pattern which can be searched against a database in a similar manner to a PMF search. The strength of a peptide match is based on how well the observed precursor ion and product ion masses agree with the theoretical masses derived from the peptide sequence (Perkins et al. 1999). Again, a high score assigned by the database search engine indicates a strong match.
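The theoretical product ion masses that a search compares against can be sketched as below. The example assumes that CID fragmentation produces mainly singly charged b ions (N-terminal fragments) and y ions (C-terminal fragments), which is a common simplification and not a statement about the instruments used here. The peptide and the function names are hypothetical.

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

def b_ions(peptide):
    """Singly charged b ions b1..b(n-1), built from the N-terminus."""
    total, ions = 0.0, []
    for residue in peptide[:-1]:
        total += MONO[residue]
        ions.append(total + PROTON)
    return ions

def y_ions(peptide):
    """Singly charged y ions y1..y(n-1), built from the C-terminus."""
    total, ions = 0.0, []
    for residue in reversed(peptide[1:]):
        total += MONO[residue]
        ions.append(total + WATER + PROTON)
    return ions

def matched_fragments(observed, peptide, tol_da=0.02):
    """Count observed product ions within tol_da of a theoretical b or y ion."""
    theoretical = b_ions(peptide) + y_ions(peptide)
    return sum(any(abs(o - t) <= tol_da for t in theoretical) for o in observed)

if __name__ == "__main__":
    peptide = "SVTGK"  # hypothetical peptide
    for i, (b, y) in enumerate(zip(b_ions(peptide), y_ions(peptide)), start=1):
        print(f"b{i} = {b:.4f}   y{i} = {y:.4f}")
```

A candidate peptide that explains many observed product ions, and whose intact mass also matches the precursor ion, is a stronger match than one that explains only a few.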

The identification of proteins significant in chemical signalling proves more difficult if there is little or no genomic data available for the species of interest, as there is limited database information against which to identify peptides and proteins. In this case, MS/MS fragmentation spectra can be sequenced de novo, either manually or using software, where the mass difference between consecutive product ions of the same series denotes the mass of an amino acid. An example of the de novo sequencing of a peptide from an MS/MS spectrum is shown in Figure 1.6. The assembled amino acid sequence for that peptide can then be searched using the basic local alignment search tool for protein sequences (BLASTp), which searches against sequences in a selected database with the aim of identifying similar sequences or sequence tags that may denote the family membership of the protein (Altschul et al. 1990).
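Reading a sequence from mass differences can be sketched in a few lines. The example assumes a clean ladder of consecutive, singly charged ions from one series (here y ions), with no missing fragments; real spectra are rarely this complete. Leucine and isoleucine have identical masses and cannot be distinguished this way, so "L" below stands for either. The ladder is built from a hypothetical peptide, and the function name is our own.

```python
# Monoisotopic residue masses in Da. "L" stands for Leu or Ile,
# which are isobaric and indistinguishable by mass difference.
RESIDUES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def read_ladder(ion_mzs, tol_da=0.01):
    """Assign a residue to each gap between consecutive ions of one series.
    Returns one letter per gap, in order of increasing m/z; '?' marks a
    gap that matches no single residue within tol_da."""
    ordered = sorted(ion_mzs)
    calls = []
    for low, high in zip(ordered, ordered[1:]):
        delta = high - low
        hits = [aa for aa, mass in RESIDUES.items() if abs(delta - mass) <= tol_da]
        calls.append(hits[0] if hits else "?")
    return "".join(calls)

if __name__ == "__main__":
    # Hypothetical y-ion ladder: y1 of a C-terminal lysine, then each
    # further ion adds the next residue towards the N-terminus.
    y1 = RESIDUES["K"] + 18.010565 + 1.007276
    ladder = [y1]
    for aa in "GTV":
        ladder.append(ladder[-1] + RESIDUES[aa])
    # For a y series the letters come out C-terminus to N-terminus,
    # so the tag is reversed to give the conventional N-to-C order.
    print(read_ladder(ladder)[::-1])
```

The tolerance matters: lysine and glutamine differ by only about 0.036 Da, so a low-resolution measurement would leave that assignment ambiguous. The resulting short tag is the kind of sequence that would then be submitted to a BLASTp search.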
