Pages 587-591
Q-RT-PCR: data analysis software for
measurement of gene expression by competitive RT-PCR
Peter A. Doris
2, Amanda Hayward-Lester and Jon K. Hays Sr
1Department of Cell Biology and Biochemistry and 1 Department of Computer Services, Texas Tech University Health Sciences Center, Lubbock, TX 79430, USA
Received on April 17, 1997, revised on June 23, 1997, accepted on July 1, 1997
Abstract
Motivation: We have developed software to assist in the computation of gene expression from data obtained in competitive reverse transcription-polymerase chain reaction (RT-PCR). This report describes the mathematical basis of competitive RT-PCR and discusses the criteria which must be met to permit accurate estimations of gene expression to be obtained using this technique.
Results: The software that has been developed assists in both the assessment of assay performance (specifically in estab- lishing the equality of amplification efficiency of the native and competitor templates) and in the routine analysis of data obtained in quantitation of gene expression by competitive RT-PCR. The software is a 100 kb module which functions as a Microsoft Excel add-in. It is compatible with both Windows and Mac versions of Excel 5 and Excel 7 on the Windows 95 platform, and employs the spreadsheet, statistical and graphing capabilities incorporated into Excel.
Availability: The software can be downloaded from http://www.grad.ttuhsc.edu/archive/. A brief summary in both HTML and Microsoft Word 6 format of the installation and use of the software is also located at this website.
Contact: E-mail: [email protected]
Introduction
Quantitation of gene expression by competitive reverse tran- scription-polymerase chain reaction (RT-PCR) has been de- veloped to the extent that it can now be considered, if appl ied rigorously, as a well-evaluated, robust, quantitative method capable of providing accurate and precise estimates of the abundance of specific transcripts in RNA preparations, even in very small sample sizes (Hayward-Lester et al., 1997).
However, application of the method in a manner that permits the accuracy and precision of the method to be fully acquired
2To whom correspondence should be addressed at: Institute tor Molecular Medicine, University of Texas Health Sciences Center. 2121 W. Holcombe Blvd, Houston, TX 77030, USA
requires a mathematical approach to assay system validation and to routine data analysis. This approach, while routine, is not simple and has often been overlooked. Therefore, we have developed a software tool which provides users of this technique with a convenient means for the validation of assay performance and for routine data handling.
System and methods
The development of this quantitative method has required an approach founded on mathematical descriptions of the PCR amplification of templates in reactions in which two tem- plates are being simultaneously amplified (Raeymaekers,
1993). Amplification of a single template can be described by the equation:
N = No{\ + E)" (1) where N is the final amount of reaction product, /Vn is the initial amount of DNA in the reaction, n is the number of cycles and E is the efficiency of the reaction. If variations in reaction efficiency of only 5% occur between two samples being compared in two different reaction tubes, changes in amount of product of >100% after 30 amplification cycles will result and make comparison of gene expression levels between the samples meaningless. Competitive RT-PCR is designed to overcome this limitation. The idea is to amplify not only the cDNA of interest in the PCR reaction, but also another cDNA template (competitor) which can be amplified by exactly the same primers. If the amount of the competitor cDNA in the starting sample is known or is always constant between samples, it can be used as a quantitative reference providing that it shares necessary similarities with the native template for which it is providing reference. The use of RNA competitors to generate native and competitor cDN As simul- taneously in the same reaction tube permits this technique to be used for accurately quantitating gene expression.
Equation (1) can then be extended to describe the ampli- fication of both templates. If the initial unknown amount of a gene U in a competitive RT-PCR reaction is Uo and that of its specific competitor RNA is Co. and these are subject to /;
reaction cycles in which the efficiency of amplification is £u
and Ec for the unknown and competitor, respectively, then from equation (1) (above) we can describe the amount of reaction products at the end of n cycles by:
£/„ = £ / „ • ( ! + £„)"
C, =
(1 +£,)"(2) (3) Making a ratio of equations (2) and (3), and taking the log- arithm, gives:
\og(U,,/Cn) = log[/,,-logCu + n • log[(l + £„)/(! + £<•)] (4) A basic assumption of competitive RT-PCR which we have recently proven for systems following our own design is that Eu and Ec remain equal to one another throughout the reaction. If true, equation (4) reduces to:
\og(UjCn) = \ogUa- logC,, (5) In calculating competitive RT-PCR reactions to obtain the unknown amount of a gene (t/0) present in a sample, a plot is made which relates \og(Un/Cn) to logCo, the known amount of starting competitor RNA. This allows \ogUo to be calculated. The estimated initial value for the amount of na- tive gene expressed in a sample (Uo) is the antilog of this value.
Equation (5) indicates that such a plot will form a straight line having a slope of -1 [or of 1 if \og(Cn/Un) is plotted]
because it is of the form y = ax ± b where the value of a must be 1 or -1 to follow the model.
The calculations to be performed in establishing a com- petitive RT-PCR system and demonstrating that it is in con- cordance with these mathematical constraints are complex.
However, once such a system has been established and ac- cordance with mathematical requirements has been demon- strated, the amount of unknown gene expressed in a sample can be estimated in single tube assays, removing the need for titrations, once the approximate abundance has been deter- mined. However, even with such a simplified system, esti- mates of the initial abundance of the target template require laborious mathematics in which error can easily be intro- duced.
Our own work in developing competitive RT-PCR systems has shown that the use of homologous competitors is essen- tial if the competitive system is to produce accurate estimates (Hayward-Lestere/a/., 1995, 1996). Homologous competi- tors, however, have the capability of producing heteroduplex products formed by the annealing of the competitor ampli- cons with highly homologous native amplicons. In our ex- perience, such heteroduplexes are usually not resolved by non-denaturing agarose gel electrophoresis. This may lead to the erroneous assumption that, since heteroduplexes are not visible by this technique, they are not present. The introduc- tion of a novel, inexpensive and highly accurate PCR product
analysis technique based on ion pair reversed-phase HPLC has resolved the important question of heteroduplex forma- tion and permits accurate resolution of both homoduplex and heteroduplex reaction products (Hayward-Lester et al., 1995, 1996). In light of the insight gained in these experi- ments, it seems safest to assume that heteroduplexes are al- ways present at the conclusion of competitive reactions un- less positive evidence to the contrary exists. Since the math- ematical method involves analysis of reaction product ratios and heteroduplex formation disturbs the final homoduplex product ratios, it is essential that heteroduplexes be identi- fied, accurately quantitated and their contribution to final product ratios be included in the analysis.
Implementation
The data analysis system we have devised was produced using the visual basic programming tools in Microsoft Excel.
The resulting program has been compiled as a Microsoft Excel add-in module. In this manner, the program runs with- in Excel and can make use of the spreadsheet capabilities, data analysis and graphic plotting tools in Excel. Because of the cross-platform functionality of Microsoft Excel, the add- in module functions identically in versions of Excel 5 for both Windows and Mac operating systems, and Excel 7 for the Windows 95 operating system.
The software performs two essential functions. One relates to the validation of a new quantification system, the other to routine data analysis from a validated system. As discussed earlier, competitive RT-PCR systems are founded on the as- sumption that the amplification efficiencies of the native and competitor templates in PCR are identical to each other throughout the reaction. This can be tested by performing a titration of various amounts of competitor against uniform amounts of native template. If Ec and Eu are not identical, the resulting titrations may vary from ideals of linearity and slopes of unity. Thus, one module of the software analyzes titration data and provides an opportunity to test whether a newly developed system is complying with mathematical constraints. Once such validation is performed, titration be- comes unnecessary for routine quantification. So long as the approximate abundance of a target template in a sample is known, single tube competitions with only one dose of com- petitor can be performed using the second module in the soft- ware designed for single tube analysis.
The software is dialog driven with data entry occurring in response to requests provided in the dialogs. The initial win- dow (Figure 1) permits the user to select between options which provide a means to: (i) input information for use either in titration or single tube estimations concerning reaction product (native and competitor) and competitor RNA tem- plate sizes; (ii) enter data describing the accumulated product amounts present at the end of the reaction for either single
s S t a r t i n g Uialinj
Select one of the foilowl ng:
(8: Set up template
C- Input data for a titration which has a template C Input data for a single tube which hasatemplate C Edit a sheet
[ Ok ~ | |
I' Cancel )
Fig. 1. After selecting Q-RT-PCR from the Excel 'Tools' menu, the initial dialog allows the user to select from a variety of set-up, input and editing options.
tube estimates or for titration-based estimates and the corre- sponding amount of competitor RNA present in the reaction;
and (iii) edit existing data.
If selection (i) above is made, dialogs follow which accu- mulate the information necessary to build a spreadsheet to provide a template for performing estimations in either titra- tion or single tube modes (Figure 2). This template can be used for all future estimations based on the same parameters.
Data entry is designed around the HPLC-based product analysis which we have developed. However, since the cal- culations are based on reaction product ratios, not the abso- lute value or the type of value (densitometry, absorbance or fluorescence units, for example), any numerical data accu- rately reflecting reaction product amounts can be used. Dia- logs permit the entry of data for native, competitor and het- eroduplex products because, in non-denaturing separation
systems, we believe that heteroduplexes must always be as- sumed to be present and the information contained in the het- eroduplex product is essential to accurate calculation since the only occasion when the heteroduplex will not disturb the reaction product ratio is when the native and competitor tran- scripts fortuitously occur in exactly the same abundance in the reaction tube (Figure 3).
For titration analysis, the plug-in module makes use of the data analysis toolpack available in a full Excel installation.
The presence of this toolpack is verified by the Q-RT-PCR add-in module. The data analysis toolpack provides the pro- gram with a number of statistical analysis tools. Those for regression analysis are employed by the Q-RT-PCR add-in module. In addition to performing the regression analysis and displaying the statistics associated with it in a new spreadsheet, the Q-RT-PCR module also plots the titration line and displays the slope and other properties of the best fit line. This facility provides an opportunity to assess whether a newly developed competitive system is behaving in ac- cordance with the requirements of equation (5) above, there- by permitting simplification of the system to a single tube assay.
Data entry in both titration and single tube assay systems provides an immediate visual feedback by requesting verifi- cation that the numbers entered are correct and as intended.
This reduces error and minimizes time spent locating and correcting erroneous data in completed spreadsheets. In single tube format, the estimation system completes a single row of the spreadsheet for each sample. The estimated initial copy number of the target is converted to molecules of RNA transcript from the information provided about the size of the competitor transcript in base pairs and from the amount of competitor transcript added to the reaction (derived from UV spectroscopy of the purified RNA transcript). Sample identification is requested and entered on the spreadsheet.
The final element of the initial dialog permits editing of spreadsheets to allow for data correction or revision. In the
N a t i u e Size s C o m p e t i t o r Size S C o m p e t i t o r Transcript j i i i
What is the size (bp) of gour native PCR product?
What is the size (bp) of your competitor PCR product
What is the length (bp) of gour competitor transcription template measured from the RNA polgmerase start site to the restriction digest site where RNA transcription ends?
Fig. 2. Information concerning the size of the PCR reaction products and the competitor RNA transcript is collected in response to dialogs.
MIMUI I'lil.lUULI U l L ' j u i l l l i l „-.- — . Pleaae enter the value for the cell in
ROW 1
column NATIVE PRODUCT ( a m units)
RT-PCR quantification of rat cyclophilin-like protein from kidney RNA. Analysis of reaction product amounts was by HPLC separation and UV detection.
Discussion
Please enter the value for the cell i n ROW 1
column COMPETITOR PRODUCT (area u n i t s )
' W H O PRODUCT l a t e a u n f-'lea'>" enter the value for the cell i n ROW 1
column HYBRID PRODUCT (area units)
IMUIISI LUMPUITOR ( f ( | iiue for tnecell in k'uw :
column AMOUNT COMPETITOR (fg)
Fig. 3. Reaction product accumulation (determined by HPLC as area units) is entered for each close level of competitor in response to dialogs (shown).
The principal advantage of the software is that it performs a complete data analysis of competitive RT-PCR reactions.
The mathematical routines involved are constant, so the user needs to know only basic numerical information concerning the size of the reaction products (so that reaction product ra- tios can be calculated after correction for molar absorbance, fluorescence or densitometric measurement), the amount of competitor added to each reaction, the size of the competitor transcript (so that the amount of competitor added in fg can be used to calculate the number of molecules of native tran- script present in the sample) and the relative amount of each reaction product formed. Cross-platform compatibility is another important attribute. We hope that this software will be useful to many colleagues engaged in gene quantitation by competitive RT-PCR. The software is being freely distrib- uted (by Internet) for non-profit use. It can be downloaded from http://www.grad.ttuhsc.edu/archive/. A brief summary in both HTML and Microsoft Word 6 format of the installa- tion and use of the software is also located at this website.
single tube assay mode, each row of data is recalculated im- mediately; in titration mode, the regression analysis is up- dated after all revisions have been made.
A sample titration analysis is included in Figure 4. The data used to create this figure are real data obtained in competitive
Acknowledgements
We are grateful to Chris Palmer for assistance in early devel- opment work. Research in this laboratory is supported in part by the NIH (DDK 45534).
..•445
!10880 138660 251040 24751j 416'
338670 161490 397750 169230 263970 190860 227320 221640 I 64130 134210 .4540
Workbook 1
i --'
186705 471000 2000 433764 2.60188 0.41631 3.30103 201843 488017 1333 446785 2,20863 0.34412 3 12483 220071 363510 1000 333435 1.4400 0.16134 3 357172 343028 625 314830 0.88002 -0.05508 2 70588 311718 234132 500 214755 0.68804 -0.18182 2.60887 488880 170750 250 184874 0.33879 -0.47007 2 30704 715512 123186 100 112001 0.15702 0.80157 2
0 0 - - •- | • | -•
: rut correct co AMOUNT finjlcorrtcntio nits (area unrts(fj) ( j r t j units)
2.84170 684.801 3402078
Jogritio IjogstjndsrJogunknovt amountunl amount unknown
j t (fg) molacules
Fig. 4. (A) Sample of data obtained during competitive RT-PCR after entry into a titration template. (B) Automated statistical analysis (performed by the macro) of the titration using data in (A). (C) Automated plot of titration (performed by the macro) using data in (A).
R e f e r e n c e s RNA processing and genomic polymorphisms using reversed-phase HPLC. In Ferre.F. (ed.). Gene Quantification. Birkhauser, Boston.
Hayward-Lester,A., Oefner,P.J., Sabatini.S. and Dons.P.A. (1995) Hayward-Lester.A., Oefner.P.J. and Doris.P.A. (1996) Rapid quanti- Accurate and absolute quantitative measurement of gene expression fication of gene expression by competitive RT-PCR and ion-pair by single tube RT-PCR and HPLC. Genome Res., 5, 494-499. reversed-phase HPLC. BioTechniques, 20, 250-257.
Hayward-Lester.A., Chilton.B.S., Underhill,P.A., Oefner,P.J. and Raeymaekers,L. (1993) Quantitative PCR: theoretical considerations Doris.P.A. (1997) Quantification of specific nucleic acids, regulated with practical implications. Anal. Biochem., 214. 582-585.