Fully Automated R-peak Detection Algorithm
(FLORA)
for Fetal Magnetoencephalographic Data
Katrin Sippela,b,∗, Julia Moserb, Franziska Schlegerb, Hubert Preisslb,c,
Wolfgang Rosenstiela, Martin Sp¨ulera
aEberhard-Karls-University of Tuebingen, Wilhelm-Schickard- Institute for Computer
Science - Computer Engineering Department, 72076 Tuebingen
bInstitute for Diabetes Research and Metabolic Diseases (IDM) of the Helmholtz Center
Munich at the University of Tuebingen, fMEG Center; German Centre for Diabetes Research (DZD), 72076 Tuebingen
cEberhard-Karls-University of Tuebingen, Interfaculty Centre for Pharmacogenomics and
Pharma Research, Department of Pharmacy and Biochemistry, 72076 Tuebingen
Abstract
Background and Objective: Fetal magnetoencephalography (fMEG) is a method for recording fetal brain signals, fetal and maternal heart activity
simultane-ously. The identification of the R-peaks of the heartbeats forms the basis for
later heart rate (HR) and heart rate variability (HRV) analysis. The current
procedure for the evaluation of fetal magnetocardiograms (fMCG) is either
semi-automated evaluation using template matching (SATM) or Hilbert
transforma-tion algorithm (HTA). However, none of the methods available at present works
reliable for all datasets.
Methods: Our aim was to develop a unitary, responsive and fully automated
R-peak detection algorithm (FLORA) that combines and enhances both of the
methods used up to now.
Results: The evaluation of all methods on 55 datasets verifies that FLORA
outperforms both of these methods as well as a combination of the two, which
applies in particular to data of fetuses at earlier gestational age.
Conclusion: The combined analysis shows that FLORA is capable of providing
∗Corresponding author
good, stable and reproducible results without manual intervention.
Keywords: magnetocardiography
1. Introduction
Fetal magnetoencephalography (fMEG) facilitates the investigation of
fe-tal brain and autonomic nervous system development [1, 2]. Like
magnetoen-cephalography in children and adults [3], fMEG is a noninvasive method. It is
primarily used to measure auditory or visually event related brain responses of
5
the fetus [4, 5, 6, 7, 8, 9] as well as spontaneous brain activity [10, 11]. With its
good spatio-temporal resolution, it also allows to monitor maternal (mMCG)
and fetal magnetocardiograms (fMCG) simultaneously to the recording of fetal
brain activity (see Fig. 2). This enables us to additionally evaluate maternal
and fetal heart rate (HR), different parameters of heart rate variability (HRV)
10
[12] and fetal behavioural states [13, 14], starting at 20 weeks of gestational age. Identification of the R-peaks [15] of the heart activity (see Fig. 3) forms the basis
of HR and HRV analysis. The challenge of a good R-peak detection algorithm is
to find as many R-peaks as possible and reduce the number of false negative (FN)
peaks without detecting false positive (FP) peaks such as movement artifacts.
15
Both a low number of FN and a low number of FP peaks should result in a
num-ber of peaks per minute in the range of the natural heart rate, and small and
well-distributed distances between consecutive R-peaks (RR-interval). While
too many FN peaks lead to a too low heart rate and huge RR-intervals, too
many FP peaks cause a too high heart rate estimation, too small RR-intervals
20
and a lower signal-to-noise ratio (SNR).
A combination of semi automated evaluation by experts using template
match-ing (SATM) and the automated Hilbert transformation approach (HTA) [16]
is the standard procedure for both maternal and fetal R-peak detection. Both
of these commonly used methods function for most maternal heart evaluations
25
since the representation of the maternal heart signal is quite strong and
Since the mMCG signal is between 10 and 100 times stronger than the fMCG
signal, the maternal heart signal must first be removed from the data before
fe-tal heart evaluation can be carried out. In addition, the strength of the detected
30
heart activity is highly dependent on the quality of the recording, gestational
age and position of the fetus. It is even more difficult to detect the fMCG signal
when it is superimposed by muscle artifacts or if the fetus moves during the
recording. For both HR and HRV analysis, the maternal/fetal heart activity
must be detected as comprehensively and as precisely as possible. For critical
35
and noisy dataset, the previously used methods tend to be either complete or
precise but not both.
SATM requires the manual selection of one heartbeat as a template (see section
2.2.1). This template is then used to locate and mark all similar heartbeats in
the dataset. One dataset contains the information of 156 sensors over the whole
40
recording time. Since template matching works for the data of each sensor
indi-vidually, it achieves high accuracy results for stable recordings. However, if the
mother or the fetus moves during the measurement, this method is at a
disad-vantage since the fetal heart signal then no longer fits the template. This could
lead to a long gap in R-peak detection of up to several minutes per dataset.
45
Nevertheless, semi-automated evaluation is generally much more flexible than
fully automated evaluation. Due to manual interventions it is possible to meet
the individual demands of a specific dataset (e.g. the deactivation of very noisy
sensors), making it difficult for fully automated algorithms to achieve
compara-ble results. By contrast, the HTA generates a single signal for all sensors which
50
is an advantage for non-stationary measurements (see section 2.2.2). However,
if the overall noise level is too high, movements of the fetus could cover the real
R-peak or could be misinterpreted as an additional peak, leading to a loss of
accuracy and quality.
A continuous and precise R-peak identification must therefore be performed
55
by a combination of both methods to ensure that the majority of datasets are
validly evaluated. Since this is a complex and very time-consuming procedure,
Figure 1: Sketch of the fMEG device
which combines the advantages of both previously used methods, automatically detects sensors with high physiological or non-physiological noise and is adaptive
60
to the real heart rate in each dataset. It is also aimed to eliminate erroneously
detected heart beats and fill in missing peaks automatically to evaluate fMEG
data rapidly and at the same time as effectively as manual guided evaluation.
This paper describes how FLORA functions and compares its performance not
only to the two established methods, but also to a combination of the two.
65
2. Methods
2.1. Fully automated R-peak detection algorithm (FLORA)
FLORA can be divided into four main steps: First, a noise analysis is
per-formed. Second, the heart frequency of the raw dataset is analyzed to identify
the individual characteristics of each recording. Third, R-peaks are detected
70
by combining three different approaches. Fourth, contingently existing gaps are
filled step by step. This procedure is the same for both maternal and fetal
R-peak detection. Each of these steps will be explained in more detail in the
Figure 2: Sketch of the FLORA algorithmic procedure
Figure 3: Shape of averaged fetal heart beats over one dataset. Time point 0 refers to the detected R-peaks (the moment of highest magnetic activity during one heartbeat).
2.1.1. Noisy sensor detection
75
It is possible that some of the 156 single sensors show some unspecific noise
caused by defect sensors, a tattoo or the retainer of the pregnant woman. Such sensors impede the evaluation of the dataset and should be removed before
Figure 4: Frequency between 1.5 and 3Hz of one dataset, after removal of the maternal heart interference. In this case fnat= 2.27 which matches RRmain= 0.44 and a mean HR of about
136 bpm.
data processing. Since these sensors show high signal variance and/or very low
correlation to the signal of their neighbor sensors they can be easily identified
80
and set to zero.
2.1.2. Frequency analysis
The aim of the frequency analysis is to identify the individual heart rate
characteristic of each dataset. Therefore, a fast Fourier spectral analysis is
performed of the whole dataset. The result of the spectral analysis is smoothed
85
by a moving average window of 20 datapoints. The frequency at the maximum
of this curve represents the main natural frequency of the HR fnat (see Fig. 4).
The window for maternal frequency analysis is 0.8-2.2Hz (corresponds to a HR of 48 - 132 bpm) and 1.5-3Hz (corresponding to a HR of 90 - 180 bpm) for
fetal. The resulting main RR-interval RRmain= 1/fnat is used as parameters 90
2.1.3. Peak detection
Hilbert transformation. The magnitude of the Hilbert transformation is
calcu-lated for all sensors and all sampling points (t), resulting in a vector (resultmht)
with dimension 1 × t, rendering this kind of search resistant to fetal movements.
95
R-peaks are identified by a top-down peak search that commences with the
highest maximum point in the transformed dataset and continues with the next
maximum with a minimum distance dmin to the recent one. This minimum
distance is defined by the result of the frequency analysis: dmin= RRmain∗ 0.7.
Factor 0.7 is chosen since it is assumed that the HR does not deviate more than
100
30% from the main RR-interval RRmain. Subsequently, the average over the
whole dataset at the time points of all detected heartbeats (Ph) is calculated
and k sensors with the highest absolute R-peak value are selected (k = 20 for
maternal, k = 5 for fetal MCG).
RMS analysis and template generation. Having taken the signal of the selected
105
k sensors, a matrix with dimension k × t, the root mean square (RMS) over the
k sensors at each sampling point t is calculated resulting in a vector (resultrms)
with dimension 1 × t. Peak detection is performed with the same parameters as in the previous step. The result is averaged over all peak times (Pv) to generate
one template for each of the k sensors. Each template has the length of the
110
average RR-interval, with 40% of the time before and 60% after the R-peak.
This procedure is adopted to ensure that the characteristics of the P, Q, S and
T wave are enclosed.
Template matching. After generating the template by the RMS analysis, the
cross-correlation between the template and the signal is calculated for each of
115
the k sensors individually. These k cross correlations are summed up to a general
template cross-correlations vector (resulttcc) with dimension 1 × t.
Combination of results. To combine the results of the previous three analysis
steps, the product of their resulting curves is calculated:
resultall = resultmht∗ resultrms∗ resulttcc 120
A final peak search is performed on resultallusing the minimum peak distance
dmin and a minimum peak height of 0, since the baseline of resultallis negative.
The thereby detected R-peaks (Pf ound) are used for further analysis.
Interpolation of missing peaks. By dividing all resulting RR-intervals by RRmain,
gaps in the R-peak detection can be easily identified. The number of missing
125
beats is subsequently calculated by dividing the size of the gap by the mean of
the HR before and after the gap. The gaps are initially filled by dividing the
length of the gap by the number of missing beats (m) + 1 and thereby creat-ing m artificial peak time points. Since such an artificial fillcreat-ing would have a
profound effect on measures like the HRV, the algorithm searches for additional
130
peaks of resultallin the time window of ± 20 ms around the artificial time point.
These artificial time points are then replaced by the the highest peak in this
time window. The final number of detected R-peaks in FLORA (N PF LORA) is
the sum of detected peaks (Pf ound) and interpolated peaks (Pint).
2.2. Comparison to standard methods
135
To evaluate the performance of FLORA, the algorithm was compared to
both previously used methods SATM and HTA as well as to the combination
of both methods.
2.2.1. Semi Automated Template Matching (SATM)
This section describes template matching as implemented in the Data
Edi-140
tor Software, which is the standard software provided by the fMEG hardware
manufacturer (VSM MedTech Ltd.). Prior to R-peak detection, the dataset is detrended and filtered with a bandpass filter from 1-35 Hz. The user is
requested to mark a regular heart beat manually which is subsequently used
as a template for template-matching R-peak detection. The resulting R-peaks
145
(PSAT M) are exported to a marker-file.
2.2.2. Hilbert Transformation Approach (HTA)
This approach by Wilson et al. [16] uses Hilbert transformation. Prior to
1-60 Hz. A Fast Fourier Transformation of each sensor is carried out to calculate
150
the sum of the power at each frequency afterwards. The Hilbert transformation
is calculated on the 10 sensors with the highest power sum. The algorithm works with an upper (210 bpm) and lower (100 bpm) limit for the fetal heart
rate (40 - 110 bpm for maternal heart rate). Threshold values are used to locate
the peaks in Hilbert-transformed data which indicate the heartbeats (PHT A). 155
2.2.3. Combination of SATM and HTA (COMB)
For standard analysis of our fMEG studies, always both methods are applied
after each other to get the best heart beat detection for the existing datasets.
The result in which more R-peaks are detected is usually chosen for further
evaluation. Using these standard methods, an additional visual check is always
160
necessary to ensure that a higher number of peaks is not caused by arbitrarily
added peaks. With regard to the 55 selected datasets in this evaluation, the
evaluation by an expert found SATM to be more precise for 21 datasets and
HTA better for the other 34 datasets. This combination is denoted as COMB
below.
165
2.3. Evaluation
Data was recorded by a 156 sensor SARA (SQUID Array for
Reproduc-tive Assessment, VSM MedTech Ltd., Port Coquitlam, Canada) system at the
University of Tuebingen with a sampling rate of 610 or 1220 Hz. In total, 55
datasets of singleton pregnant women were evaluated. The gestational age of
170
the fetuses ranged from 26 to 39 weeks (mean 30.89 ± 3.31). The length of the recordings varied between 6 and 27 minutes (mean 12.96 ± 5.73) and included
both spontaneous data and different stimulation paradigms. Prior to
process-ing, the raw dataset is filtered with a Butterworth band pass filter from 1 to
35 Hz. Four metrics were used to evaluate the detection methods which are
175
2.3.1. Number of detected peaks per minute (NP)
The number of R-peaks found in the dataset is the first indicator for the
quality of the algorithm. Although in general, less peaks are found in critical
180
datasets, particularly when using the HTA, it could also be that fetal
move-ments are misinterpreted as R-peak which in turn would lead to higher NP and
lower SNR (see 2.4 and 2.4.1). Nevertheless, the NP for fetal analysis should
produce results in the normal range of fetal heart rate (120 - 160 bpm).
185
2.3.2. Difference between RR measures (RR-DIFF)
A mean HR of healthy fetuses of between 120 and 160 bpm results in a
normal RR-interval of between 0.375 and 0.5 seconds. The histogram of a HR
recording should therefore resemble a Gaussian distribution of the RR-interval with a peak within that range. The more gaps there are in the analysis, the
190
higher the mean RR-interval will be, and the more additional noise peaks the
algorithm finds, the lower the mean RR-interval will be. Due to the fact that
the mean RR-distance is dependent on the behavioral state of the fetus and that
it could also change over gestational age, the main natural RR-interval RRmain
is calculated by extracting the main frequency fnat using a frequency analysis of 195
the data as described in Figure 4. The more continuously the R-peak detection
works, the lower the difference RR-DIFF between RRmain and the mean
RR-interval RRestof the estimated HR will be. This parameter compares how close
NP and RR are to the average heart rate/heart frequency of each dataset.
RRmain=
1 fnat
(1)
200
RRdif f = |RRmain− RRest| (2)
2.4. Percentage of normal to normal intervals (PNN)
The percentage of normal to normal (PNN) is the amount of normal
RR-intervals. ’Normal’ in this case is defined as all RRestvalues between RRmain/2
rate are generally used for HRV analysis, a high PNN is important for a valid
205
HRV result. While in the RR-DIFF value many smaller gaps do not weigh as
heavily as one very big gap, the opposite applies with regard to the PNN.
2.4.1. Signal to noise ratio (SNR)
Although the above-mentioned measurements give us an indication about
210
the reliability of the algorithms, they cannot guarantee that the R-peaks added by FLORA correspond to real R-peaks and are not just arbitrary added points.
Therefore, the SNR of the averaged RR-peaks (see Fig. 3) of the raw data
is calculated, where amppeak describes the height of the amplitude at the
R-peak which is then divided by the median of the amplitude over the whole
215
average sample. Precisely located R-peaks would result in a high SNR, whereas
arbitrary added R-peaks would result in a lower amplitude at the time of the
R-peaks amppeak, and in more noise over the whole averaged sample (higher
median(amp)) and therefore in a lower SNR.
SN R = amppeak
median(amp) (3)
2.4.2. Statistics
220
SATM, HTA and COMB were compaired against FLORA in all four metrics
by using a Wilcoxon Signed Rank Test. To decide whether a result is significant,
the significance levels were adjusted by Bonferroni correction to p = f rac0.053.
Due to extreme outliers in the results, it was decided to report [median ± standard deviation] instead of the mean.
225
3. Results
FLORA algorithm was analyzed for maternal and fetal R-peak detection. Maternal R-peak detection is easily performed and works very well for all three
algorithms. Hence, there were no obvious differences and the following results
relate to fetal R-peak detection only.
Figure 5: Number of R-peaks, identified per minute. ∗denotes significant differences in the Wilcoxon signed rank test (p < 0.05), Bonferrroni corrected for multiple comparison.
3.0.1. Number of detected peaks per minute (NP)
A comparison of the number of heart beats (NP) per minute as obtained
with the four different methods showed a significant difference between SATM [132.67 ± 36.86] and FLORA [139.17 ± 23.15] as well as between HTA [137.40 ±
41.05] and FLORA. No significant difference is observed between COMB [138.35
235
± 29.24] and FLORA (see Fig. 5).
3.0.2. Difference between RR measures (RR-DIFF)
The difference between RRmain and the mean RR-interval RRest of the
estimated HR also showed a significant difference between SATM [0.01 ± 1.57]
and FLORA [0.001 ± 0.72] as well as HTA [0.005 ± 1.96] and FLORA. Again,
240
no significant difference is found between COMB [0.004 ± 0.20] and FLORA
(see Fig. 6).
3.1. Percentage of normal to normal intervals (PNN)
The PNN showed significant differences between FLORA [100 ± 5.81] and
all other methods, SATM [98.48 ± 19.26], HTA [99.97 ± 23.16] and COMB
245
[99.85 ± 14.57] (see Fig. 7).
In Figure 8 the PNN with regard to gestation age (GA) is displayed for FLORA,
Figure 6: The difference between RRnat and the averaged RR-interval RRestas estimated
by the different R-peak detection methods. ∗denotes significant differences in the Wilcoxon
signed rank test (p < 0.05), Bonferrroni corrected for multiple comparison.
Figure 7: The percentage of normal RR-intervals. ∗denotes significant differences in the
Wilcoxon signed rank test (p < 0.05), Bonferrroni corrected for multiple comparison.
both, SATM and HTA result in lower PNN than FLORA. Figure 9 displays the
PNN with reference to the measurement duration. FLORA in contrast to the
250
other methods shows a high PNN also in short time measurements (see Fig. 9).
3.2. Signal to noise ratio (SNR)
The SNR of FLORA [20.38 ± 4.50] is significantly different to SNR of HTA
[17.83 ± 6.48]. No significant difference is shown between FLORA and SATM
[20.03 ± 6.62], or COMB [19.34 ± 5.43] (see Fig. 10).
Figure 8: The relation of the gestational age (GA) of the fetus and the amount of normal RR-intervals in percent (PNN).
Figure 9: The relation of the measurement duration and the amount of normal RR-intervals in percent (PNN).
4. Discussion and Conclusion
Since FLORA was developed to perform fully automated R-peak detection,
its performance was tested on 55 different real datasets and compared it with the
currently established procedures. Four different evaluation metrics were used to
compare performance between the different approaches. Our results show that
Figure 10: The SNR shows how clearly the R-peak contrasts from the rest of the averaged heart signal. ∗denotes significant differences in the Wilcoxon signed rank test (p < 0.05), Bonferrroni corrected for multiple comparison.
FLORA not only performs just as well as the established methods SATM and
HTA and their combination COMB but also that it has some advantages over
these. FLORA is significantly more accurate in number of detected peaks per
minute (NP) than SATM and HTA and the difference between RR measures
(RR-DIFF) of FLORA is also considerably lower. Even if there is no significant
265
difference in NP and RR-DIFF between FLORA and COMB, FLORA is con-vincing in percentage of normal to normal intervals (PNN) , where the results
are significantly different from those of the standard methods SATM and HTA
as well as to their combination COMB. Based on the results PNN it could be
shown that FLORA works reliable over the whole GA range and is independent
270
of the measurement duration. FLORA resulted in a significantly higher
sig-nal to noise ratio (SNR) than HTA, indicating that despite the higher NP the
peaks derive from heartbeats that were actually detected and not from randomly
added points. In sum, FLORA is user-independent which is very important for
reproducibility. By removing noisy sensors, estimating the physiological heart
275
rate of the subject, optimizing the peak search, filling gaps and modulating the interpolated peaks, FLORA generates high quality HR datasets that can be
easily post-processed. Due to its automatization, FLORA can also be used for
Acknowledgment
280
This work was partially supported by the Luminous project
(EU-H2020-FETOPEN GA No.686764), the Helmholtz Alliance ICEMED-Imaging and
Cur-ing Environmental Metabolic Diseases and German Research Foundation
(TR-SFB 654 Plasticity and Sleep) and a grant from German Research Foundation
(DFG - DR 807/1-1).
285
References
[1] H. Preissl, C. L. Lowery, H. Eswaran, Fetal magnetoencephalography:
cur-rent progress and trends, Experimental neurology 190 (2004) 28–36.
[2] H. Preissl, C. L. Lowery, H. Eswaran, Fetal magnetoencephalography:
viewing the developing brain in utero, International review of
neurobiol-290
ogy 68 (2005) 1–23.
[3] R. Hari, S. Baillet, G. Barnes, R. Burgess, N. Forss, J. Gross,
M. H¨am¨al¨ainen, O. Jensen, R. Kakigi, F. Maugui`ere, et al., Ifcn-endorsed
practical guidelines for clinical magnetoencephalography (meg), Clinical
Neurophysiology.
295
[4] K. Linder, F. Schleger, I. Kiefer-Schmidt, L. Fritsche, S. K¨ummel, M. Heni,
M. Weiss, H.-U. H¨aring, H. Preissl, A. Fritsche, Gestational Diabetes
Im-pairs Human Fetal Postprandial Brain Activity, The Journal of Clinical
Endocrinology & Metabolism 100 (11) (2015) 4029–4036. doi:10.1210/ jc.2015-2692.
300
[5] F. Schleger, K. Landerl, J. Muenssinger, R. Draganova, M. Reinl, I.
Kiefer-Schmidt, M. Weiss, A. Wacker-Gußmann, M. Huotilainen, H. Preissl,
Mag-netoencephalographic Signatures of Numerosity Discrimination in Fetuses
and Neonates, Developmental Neuropsychology 39 (4) (2014) 316–329.
doi:10.1080/87565641.2014.914212.
[6] J. Muenssinger, T. Matuz, F. Schleger, R. Draganova, M. Weiss, I. D.
Kiefer-Schmidt, A. Wacker-Gussmann, R. B. Govindan, C. L. Lowery,
H. Eswaran, et al., Sensitivity to auditory spectral width in the fetus and infant–an fmeg study, Frontiers in human neuroscience 7 (2013) 917.
[7] R. Draganova, H. Eswaran, P. Murphy, C. Lowery, H. Preissl, Serial
mag-310
netoencephalographic study of fetal and newborn auditory discriminative
evoked responses, Early human development 83 (3) (2007) 199–207.
[8] L. Moraru, R. Sameni, U. Schneider, J. Haueisen, E. Schleußner, D. Hoyer,
Validation of fetal auditory evoked cortical responses to enhance the
as-sessment of early brain development using fetal meg measurements,
Phys-315
iological measurement 32 (11) (2011) 1847.
[9] M. Chen, B. D. Van Veen, R. T. Wakai, Linear minimum mean-square error
filtering for evoked responses: Application to fetal meg, IEEE transactions
on biomedical engineering 53 (5) (2006) 959–963.
[10] N. Haddad, R. B. Govindan, S. Vairavan, E. Siegel, J. Temple, H. Preissl,
320
C. L. Lowery, H. Eswaran, Correlation between fetal brain activity patterns
and behavioral states: an exploratory fetal magnetoencephalography study,
Experimental neurology 228 (2) (2011) 200–205.
[11] D. F. Rose, H. Eswaran, Spontaneous neuronal activity in fetuses and new-borns, Experimental neurology 190 (2004) 37–43.
325
[12] E. Fehlert, K. Willmann, L. Fritsche, K. Linder, H. Mat-Husin, F. Schleger,
M. Weiss, I. Kiefer-Schmidt, S. Brucker, H.-U. H¨aring, H. Preissl,
A. Fritsche, Gestational diabetes alters the fetal heart rate variability
dur-ing an oral glucose tolerance test: a fetal magnetocardiography study,
BJOG: An International Journal of Obstetrics & Gynaecology 124 (12)
330
(2017) 1891–1898. doi:10.1111/1471-0528.14474.
[13] J. G. Nijhuis, H. F. Prechtl, C. B. Martin, R. S. Bots, Are there behavioural
[14] J. Br¨andle, H. Preissl, R. Draganova, E. Ortiz, K. O. Kagan, H. Abele,
S. Y. Brucker, I. Kiefer-Schmidt, Heart rate variability parameters and fetal
335
movement complement fetal behavioral states detection via magnetography to monitor neurovegetative development, Frontiers in Human Neuroscience
9 (2015) 147. doi:10.3389/fnhum.2015.00147.
[15] M. Gertsch, Das EKG: auf einen Blick und im Detail, Springer-Verlag,
2008.
340
[16] J. D. Wilson, R. B. Govindan, J. O. Hatton, C. L. Lowery, H. Preissl,
Inte-grated approach for fetal QRS detection, IEEE Transactions on Biomedical