Detection of Heart Beat Positions in ECG Recordings: A Lead-Dependent Algorithm

This paper proposes a computerized heartbeat detection method for single-channel electrocardiograms (ECGs). First, the well-known Pan-Tompkins technique was implemented; next, a channel-dependent version was developed by adjusting threshold values and reducing false QRS detections. The algorithms were tested with the MIT-BIH Arrhythmia Database (original algorithm) and with the St. Petersburg Database (modified version). When validating the performance of the original Pan-Tompkins algorithm, we achieved a sensitivity of Se = 99.81% with a positive predictivity of P+ = 99.85%. The F-Score was 0.9587, and the RMS RR Interval Error (RMSRRIE) was 4,480.46 ms. For the modified algorithm, the results gave average values of Se = 99.92%, P+ = 99.98%, and F-Score = 0.9718, and a mean RMSRRIE of 111.05 ms. In conclusion, the improved Pan-Tompkins algorithm provides higher sensitivity and positive predictivity and a higher F-Score, and it significantly reduces the temporal error when estimating the positions of QRS complexes. Thus, it could be used as a starting point to detect heartbeat positions in more sophisticated computerized detection systems.


INTRODUCTION
According to the World Health Organization, despite recent advances in diagnosis and early detection techniques, heart attack is still the main cause of death on a global basis (World Health Organization, 2011). From 2003 to 2013, the mortality rate due to heart conditions fell by 11.7% in the United States, although Cardiovascular Disease (CVD) still accounts for 30.8% of all deaths (Mozaffarian et al., 2016). Other developed countries show similar trends (Ocaña-Riola et al., 2014; Townsend et al., 2012; Nag and Glosh, 2013). Many of these deaths could be prevented through more precise diagnosis and more exhaustive monitoring of patients at risk. Early detection of cardiac conditions could help to establish a therapy that avoids potentially dangerous situations (Lloyd-Jones et al., 2006; Dansinger et al., 2005; Smith Jr et al., 2011), and contribute to reducing health care expenditure (Lim et al., 2007; Kahn et al., 2008).
The electrocardiogram (ECG) reflects the electrical activity of the heart. Each heart contraction spreads an electrical impulse, which is recorded by several electrodes placed on the skin. The heart is composed of muscle cells that generate this contraction and allow the transmission of the electrical impulse (Thaler, 2010). Each beat is composed of a series of waves caused by voltage variations in the cardiac cells. Among the most important are the Q, R and S waves, which form the QRS complex and correspond to ventricular depolarization.
The analysis of an ECG consists of detecting beat positions and delineating the waves that compose each beat. For this task, a deep knowledge of the features of these waves is useful. Each beat is divided into three stages: 1) atrial depolarization (P wave), 2) ventricular depolarization (QRS complex) and 3) ventricular repolarization (T wave). These three stages repeat continuously in the ECG signal, representing successive heartbeats over time.
The analysis of the ECG is a time-consuming task. The scientific literature contains a great number of works aimed at detecting and delineating cardiac waves. Several algorithms achieve beat detection rates greater than 99.5% (Afonso et al., 1999; Adnane et al., 2009). Recent technical advances have made it possible to achieve low execution times (even real-time operation) or to eliminate the need for specific hardware (Quero et al., 2005; Mazomenos et al., 2013). Most of these proposals are based on single-channel detection, in some cases applied independently to two channels of the ECG. They rely on varied techniques, such as filter banks (Afonso et al., 1999), Hidden Markov Models (Andreao et al., 2004), fusion of wave peaks with energy detectors (Johnson et al., 2015), independent component analysis (Kuzilek and Lhotska, 2013), wavelets (Martínez et al., 2004), or neural networks (Osowski and Linth, 2001).
One example is the well-known Pan-Tompkins algorithm, aimed at detecting beats in single-channel ECG recordings (Pan and Tompkins, 1985). It is based on the slope, amplitude and width of the signal, and includes two main stages: (1) passing the ECG through several filters to reduce noise by smoothing the signal, and magnifying both the slope and the width of QRS complexes; and (2) thresholding to detect the QRS complexes.
Most beat detection algorithms, including the Pan-Tompkins method, work on single-channel ECG recordings. However, on many occasions errors occur because of signal degradation or the presence of noise, leading to inaccurate beat positions, false positives, or missed detections. If all channels of a multi-lead ECG recording could be used for beat detection, the method could be more robust and efficient.
The main goal of this work is to develop a channel-dependent version of the Pan-Tompkins algorithm, adapted to all leads of a multi-channel ECG recording, in order to improve detection rates and to help specialists in the early detection of cardiac disease.
The rest of the paper is organized as follows. The next section presents the experimental setup, whereas Section III describes in depth the proposed algorithm and the validation procedure. Then, Section IV gives the results obtained with our method compared against some widely used alternatives. Finally, in Section V we present the conclusions and ideas for future work.
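The first stage can be illustrated with a minimal sketch. It is not the filter chain of the original paper: the band-pass filter is omitted, a simple central difference stands in for the derivative filter, and the 150 ms integration window is an assumed value. The function name `preprocess` is hypothetical.

```python
def preprocess(ecg, fs):
    """Derivative -> squaring -> moving-window integration.

    ecg: list of samples; fs: sampling frequency in Hz.
    Simplified sketch: the band-pass filtering step is omitted.
    """
    n = len(ecg)
    # Central difference: emphasises the steep slopes of the QRS complex.
    deriv = [0.0] * n
    for i in range(1, n - 1):
        deriv[i] = (ecg[i + 1] - ecg[i - 1]) * fs / 2.0
    # Squaring makes all samples non-negative and amplifies large slopes.
    squared = [d * d for d in deriv]
    # Moving-window integration over about 150 ms (assumed window length),
    # which widens each QRS complex into a single smooth hump.
    width = max(1, int(round(0.150 * fs)))
    integrated = []
    running = 0.0
    for i, value in enumerate(squared):
        running += value
        if i >= width:
            running -= squared[i - width]
        integrated.append(running / width)
    return integrated
```

The output has the same length as the input, and its peaks are the candidates passed to the thresholding stage.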

MATERIALS
To assess the validity of our algorithms to detect heart beats in ECG recordings, we have used two public databases, described in the following paragraphs.
The MIT-BIH Arrhythmia database (Moody and Mark, 2001; Goldberger et al., 2000) is a widely used database that contains 48 half-hour excerpts of two-channel ambulatory ECG recordings obtained by Boston's Beth Israel Hospital (now the Beth Israel Deaconess Medical Center). Records were digitized at 360 samples per second per channel. The database was manually annotated by at least two cardiologists working independently. It was used to test the single-channel implementation of the original Pan-Tompkins method (Pan and Tompkins, 1985).
The St. Petersburg database (Goldberger et al., 2000) contains 75 half-hour recordings extracted from 32 Holter records of patients undergoing tests for coronary artery disease. None of the patients had pacemakers; most had ventricular ectopic beats. Of the 32 patients, 17 were men and 15 were women, all aged between 18 and 80 years. All the recordings are annotated; each record is 30 minutes long and contains 12 leads sampled at 257 Hz. Records from patients with ischemia, coronary artery disease, conduction abnormalities, and arrhythmias were preferred for inclusion in the database. This database was used to measure the performance of the improved version of the Pan-Tompkins algorithm (Pan and Tompkins, 1985), because it contains multichannel ECG recordings.

METHODS
As stated before, we have implemented: 1) the original Pan-Tompkins algorithm (Pan and Tompkins, 1985), and 2) a channel-dependent version of that algorithm. In this way, an optimized single-channel detection method was devised so that its results can be used as inputs to a future multi-channel detection technique. Both methods are explained in this Section.

A. Pan-Tompkins Algorithm
A single-channel detection technique was programmed, taking the Pan-Tompkins algorithm (Pan and Tompkins, 1985) as its starting point. This method operates on single-channel ECG recordings and detects heartbeat positions. It is described in (Mondelo et al., 2017). Briefly, it is divided into two main stages:
1. Preprocessing: several filters are applied to the ECG signal to reduce noise by smoothing the signal. At the same time, both the slope and the width of the QRS complexes are magnified. In this way, a modified output signal is constructed.
2. Decision: signal peaks are labelled as QRS complexes after applying several thresholds to the output signal; specifically, the following threshold values were used:
$\mathrm{threshold}_1 = \mathrm{NPKI} + 0.25 \times (\mathrm{SPKI} - \mathrm{NPKI})$ (1)
$\mathrm{threshold}_2 = 0.5 \times \mathrm{threshold}_1$ (2)
where SPKI and NPKI are the running estimates of the signal and noise peak values, respectively. The value $\mathrm{threshold}_2$ is used only if $\mathrm{threshold}_1$ fails to find a QRS complex within a certain distance from the previously detected one.
Our implementation of the standard Pan-Tompkins algorithm was tested using the MIT-BIH Arrhythmia database, and it gave results similar to those reported in the literature.
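The decision stage can be sketched as follows. The thresholds follow equations (1) and (2); the update rule for the running estimates (weights 0.125 and 0.875) is the one from the classic Pan-Tompkins formulation and is an assumption here, since this paper does not state it. The function names are hypothetical, and the search-back step that uses the second threshold is not implemented.

```python
def thresholds(spki, npki):
    """Return (threshold_1, threshold_2) from the running peak estimates."""
    threshold_1 = npki + 0.25 * (spki - npki)   # equation (1)
    threshold_2 = 0.5 * threshold_1             # equation (2)
    return threshold_1, threshold_2


def classify_peaks(peaks, spki, npki):
    """Label each candidate peak amplitude as QRS (True) or noise (False).

    spki, npki: initial running estimates of signal and noise peak levels.
    Returns the labels and the final estimates.
    """
    labels = []
    for peak in peaks:
        threshold_1, _ = thresholds(spki, npki)
        is_qrs = peak > threshold_1
        if is_qrs:
            # Assumed update rule: exponential average of signal peaks.
            spki = 0.125 * peak + 0.875 * spki
        else:
            # Assumed update rule: exponential average of noise peaks.
            npki = 0.125 * peak + 0.875 * npki
        labels.append(is_qrs)
    return labels, spki, npki
```

Because both estimates are updated after every peak, the thresholds follow slow changes in signal amplitude and noise level along the recording.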

B. Channel-Dependent Algorithm
We have already mentioned that the Pan-Tompkins algorithm was developed to detect beat positions using only information from one single channel of the ECG. Due to differences across channels, using the same single-channel method for all channels in the first phase would discard information that could be useful to increase both the precision and the robustness of the second phase.
Thus, once our implementation of the Pan-Tompkins method had been validated, we devised an optimized single-channel detection method that extends the technique to multi-lead ECG signals by adapting the thresholds and filtering procedures to each channel. As far as we know, this approach cannot be found in the literature. The detection results obtained using this procedure were used as inputs to the second stage of our system.
To get better detection performance, we introduced two improvements in the Pan-Tompkins method: (1) adapting the values of $\mathrm{threshold}_1$ and $\mathrm{threshold}_2$ to each channel; and (2) filtering the sequences of detected QRS complexes, rejecting those whose distance to the previous one is lower than a certain value.
1. Threshold adjustment: in order to adapt the thresholds, we replaced the coefficient 0.25 in (1) by a lead-dependent value T. In this way, this threshold was calculated with the expression:
$\mathrm{threshold}_1 = \mathrm{NPKI} + T \times (\mathrm{SPKI} - \mathrm{NPKI})$
The expression for $\mathrm{threshold}_2$ in (2) remained as in the standard Pan-Tompkins algorithm. By adapting the thresholds, we were able to improve the sensitivity (Se) and the positive predictivity (P+), described later in this paper, of the beat sequences, especially on channels where the standard Pan-Tompkins algorithm failed to detect many QRS complexes. Using the training set of the St. Petersburg database, we obtained a set of values for T that optimize sensitivity and positive predictivity; they can be found in Table 1.

2. Reduction of false positives: the detection of QRS complexes is improved by using adapted thresholds. However, lowering the decision levels also causes some false positives to appear. This is easily solved by rejecting those beats whose distance to the previously detected one is lower than a certain value. This value was experimentally determined to be 40% of the median of the RR values (distances between two consecutive beats) of the channel.
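The two improvements described above can be sketched in a few lines. The per-lead coefficients in `LEAD_T` are hypothetical placeholders, not the values of Table 1, and the function names are illustrative. The sketch also assumes that the distance is measured to the last beat that was kept, which is one possible reading of "previously detected".

```python
from statistics import median

# Hypothetical per-lead coefficients T, for illustration only.
LEAD_T = {"I": 0.10, "II": 0.25, "V1": 0.15}


def lead_threshold_1(spki, npki, lead, table=LEAD_T, default=0.25):
    """Lead-dependent first threshold: NPKI + T * (SPKI - NPKI)."""
    t = table.get(lead, default)  # fall back to the standard coefficient
    return npki + t * (spki - npki)


def reject_close_beats(beats, fraction=0.4):
    """Drop beats closer to the last kept beat than fraction * median RR.

    beats: sorted beat positions (samples or milliseconds) of one channel.
    """
    if len(beats) < 3:
        return list(beats)
    rr = [b - a for a, b in zip(beats, beats[1:])]
    min_distance = fraction * median(rr)
    kept = [beats[0]]
    for beat in beats[1:]:
        if beat - kept[-1] >= min_distance:
            kept.append(beat)
    return kept
```

For example, with beats at 0, 250, 500, 560, 750 and 1000 the median RR is 250, so the minimum distance is 100 and the beat at 560 is discarded. Using the median rather than the mean keeps the limit stable when a few spurious short intervals are present.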
The effect of including these improvements in the Pan-Tompkins algorithm was two-fold: it produced a more homogeneous set of values for Se and P+ across channels, and it significantly reduced the temporal error when estimating the positions of QRS complexes.

C. Validation
As explained above, we developed two algorithms and validated them on two different databases. The MIT-BIH Arrhythmia Database was employed to validate our implementation of the Pan-Tompkins algorithm, and the St. Petersburg Database to validate the channel-dependent version. The performance of our algorithms was evaluated using the following indexes:
• Sensitivity (Se): proportion of real beats correctly identified as beats, $Se = TP/(TP + FN)$
• Positive predictivity (P+): proportion of detected beats that are real beats, $P^{+} = TP/(TP + FP)$
• F-Score: test accuracy
Here TP is the number of true positives (correctly detected beats), FN is the number of false negatives (undetected beats) and FP is the number of false positives (incorrectly detected beats). These indexes were obtained using the tool bxb, included in the PhysioToolkit library (Goldberger et al., 2000). This tool compares two annotation files (one from the database, one obtained by our method) and calculates the sensitivity, positive predictivity and RMS RR Interval Error.
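As a small illustration, the indexes can be computed from the beat counts as follows. The F-Score is taken here as the usual harmonic mean of Se and P+, which is an assumption, since the text does not give its formula. The function name is hypothetical and the sketch assumes non-zero denominators.

```python
def detection_metrics(tp, fn, fp):
    """Return (Se, P+, F-Score) from true positives, false negatives, false positives."""
    se = tp / (tp + fn)       # sensitivity
    ppv = tp / (tp + fp)      # positive predictivity
    # Assumed definition: harmonic mean of sensitivity and positive predictivity.
    f_score = 2 * se * ppv / (se + ppv)
    return se, ppv, f_score


# Illustrative counts only, not results from this paper.
se, ppv, f_score = detection_metrics(tp=90, fn=10, fp=10)
```

In the reported experiments these values come from bxb rather than from code like this; the sketch only makes the definitions concrete.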

RESULTS AND DISCUSSION
In this Section, we present the results obtained with both the original Pan-Tompkins single-channel algorithm, and with the modified version, proposed in this paper.

Original Pan-Tompkins algorithm
As stated before, the standard Pan-Tompkins algorithm was implemented, and single-channel detection was performed. This algorithm was validated with the MIT-BIH Arrhythmia database. As can be seen in Table 2, compared with other approaches found in the literature (Afonso et al., 1999; Martínez et al., 2004; Fard et al., 2008; Elgendi, 2013; Lee et al., 1996), our results are similar for P+ and slightly better for Se. The F-Score and RMSRRIE were also calculated, yielding values of 0.9983 and 1,261.19 ms, respectively. Figure 1 shows an example of the detection process over a single-channel ECG recording.

Channel-dependent algorithm
As explained above, we modified the standard Pan-Tompkins method to adapt it to the different leads of a multichannel ECG. This adaptation was tested using the St. Petersburg database, giving a significant improvement over the unmodified Pan-Tompkins method. The F-Score and RMSRRIE were calculated and compared with the corresponding values of the unmodified version. For the channel-dependent implementation, we obtained an average F-Score = 0.9718 and a mean RMSRRIE = 111.05 ms. Values for sensitivity and positive predictivity were Se = 99.92% and P+ = 99.98%.
These average values were calculated as the mean of the F-Scores and RMSRRIE values obtained for each individual channel. For example, for channel I, the F-Score for the original Pan-Tompkins method was 0.9017, increasing to 0.9446 in our version. The corresponding RMSRRIE values were 26,396.43 and 185.57, respectively. Similar values were obtained for each channel.
Table 2. Results of other works on the MIT-BIH Arrhythmia Database, compared with our Pan-Tompkins implementation
It can be seen that the improvement is remarkable, especially in some channels. For instance, on channel I of record I28, performance rose from Se = 0.28%, P+ = 44.44% to Se = 99.01%, P+ = 99.22% when using the modified Pan-Tompkins algorithm instead of the original one. Other channels where detection was significantly improved by our method were channel 6 of record I05 (Se = 19.85%, P+ = 99.32% to Se = 99.59%, P+ = 99.93%), channel 9 of record I13 (Se = 42.21%, P+ = 99.58% to Se = 100.00%, P+ = 100.00%), and channel 5 of record I18 (Se = 11.06%, P+ = 95.35% to Se = 98.57%, P+ = 98.12%). Besides, poor beat detection caused high values of RMS RR Interval Error with the unmodified Pan-Tompkins algorithm: on average, results improved from values around 4,500 ms to values around 110 ms.
An example of the detection process over two different leads of the ECG recording is presented in Figure 2.
Beat characterization is one of the most important tasks that clinicians must perform in order to detect cardiac abnormalities. Given the importance of this task, the detection and subsequent classification of beats are critical factors in diagnosing CVDs.
When evaluating techniques for beat detection in ECG signals, a database of ECG recordings in which the presence of beats is well established should be used. On many occasions, comparing the different methods is not an easy task, because researchers employ different databases to test their algorithms, and uniform conditions for collecting the ECG data are not always provided. Many research groups have developed algorithms to detect beats in ECG recordings and tested their methods on the MIT-BIH Arrhythmia Database. Tables 2 and 3 present the results obtained by applying our method to two public databases, the MIT-BIH Arrhythmia Database and the St. Petersburg Database. When comparing our results with others provided in the literature, we obtained very acceptable values for sensitivity and positive predictivity, as well as for the F-Score and the RMSRRIE. However, we are aware that the detection of beats in single-channel ECG recordings can be limited by several factors, such as the quality of the signal, the presence of noise, or the loss of parts of the signal.

CONCLUSIONS
During the last decades, many efforts have been devoted to the task of beat detection in ECG signals. There are different approaches that give good results, and many papers propose modifications of these approaches to improve the success rate. A large number of techniques are aimed at detecting QRS complexes and other heart waves in single-channel ECG recordings.
In this paper, making use of our implementation of the Pan-Tompkins method, we have developed a single-channel method by adapting thresholds and filtering procedures to each channel. We believe that our detection results can be used as part of a further algorithm to detect beats with a multi-channel detection method. However, single-channel detection methods can be of limited value, since different problems may arise when working with multi-lead devices. At present, works based on multi-channel detection are scarce, although combining the information present in the different channels to improve the results seems a good alternative. The use of adapted thresholds brings out the best in problematic channels, going, in some cases, from unusable results to values similar to the ones obtained from the best channels.
Future work will address the issue of implementing an algorithm that combines the detection information provided by each channel, in order to detect beats in multi-lead ECG signals.

Figure 1. Beat detection in a single-channel ECG recording

Figure 2. Beat detection in two different leads of a multi-channel ECG recording

Table 1. Values of T used in single-channel detection

Table 3. F-Score and RMS RR interval error obtained on the different leads of the St. Petersburg database