A common way to represent heartbeats for classification is by the coefficients of a linear combination of basis functions, such as Hermite functions. One advantage of this representation is that model selection criteria can be used to choose the optimal representation, a property that other heartbeat representation schemes lack. However, to date none of the authors who have used basis functions has studied the optimal model length (the number of functions in the linear combination). This length is usually chosen using ad hoc techniques, such as visual inspection of the reconstruction obtained for a few beats. This has led to choices as different as representing the QRS complex with as few as 3 or as many as 20 Hermite functions. This paper studies the optimal number of Hermite functions to use when representing the QRS complex. The Hermite characterization of the QRS complex was calculated using from 2 to 30 functions. To determine the optimal number of functions, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) were calculated for all the heartbeats in the MIT-BIH database, yielding the optimal model length for each QRS complex. The features of the Hermite characterization were studied using feature selection techniques. Data on how the chosen representation length affects computational resources are also presented. Using this information, we have developed a clustering algorithm based on mixture models that achieves misclassification rates of 0.96% and 0.36% on the MIT-BIH database and the AHA database, respectively.
Keywords: Heartbeat representation, Hermite functions, ECG, Clustering
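
As an illustration of the model-length selection described in the abstract, the following minimal sketch fits a single QRS window with 2 to 30 Hermite functions by least squares and picks the length that minimizes AIC and BIC. It is not the authors' implementation. The function names (`hermite_basis`, `select_order`), the fixed width parameter `sigma`, the 200 ms window sampled at 360 Hz, the synthetic beat, and the Gaussian-error forms of AIC and BIC (with only the coefficients counted as parameters) are all assumptions made for the example.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval

def hermite_basis(t, n_funcs, sigma):
    """Return a (len(t), n_funcs) matrix whose columns are Hermite functions."""
    u = t / sigma
    cols = []
    for n in range(n_funcs):
        coeffs = np.zeros(n + 1)
        coeffs[n] = 1.0  # selects the physicists' Hermite polynomial H_n
        norm = math.sqrt(sigma * (2.0 ** n) * math.factorial(n) * math.sqrt(math.pi))
        cols.append(np.exp(-u ** 2 / 2.0) * hermval(u, coeffs) / norm)
    return np.column_stack(cols)

def select_order(qrs, t, sigma, orders=range(2, 31)):
    """Fit each candidate length by least squares and score it with AIC and BIC."""
    m = len(qrs)
    scores = {}
    for k in orders:
        basis = hermite_basis(t, k, sigma)
        coef, *_ = np.linalg.lstsq(basis, qrs, rcond=None)
        rss = float(np.sum((qrs - basis @ coef) ** 2))
        fit = m * math.log(rss / m)  # Gaussian log-likelihood term, constants dropped
        # Assumption: only the k coefficients are counted as free parameters.
        scores[k] = (fit + 2 * k, fit + k * math.log(m), rss)
    best_aic = min(scores, key=lambda k: scores[k][0])
    best_bic = min(scores, key=lambda k: scores[k][1])
    return best_aic, best_bic, scores

# Synthetic "QRS": five Hermite functions plus noise (hypothetical data).
rng = np.random.default_rng(0)
t = np.arange(-36, 36) / 360.0  # 72 samples, i.e. 200 ms at 360 Hz
sigma = 0.015                   # assumed width parameter, in seconds
true_coef = np.array([1.0, -0.5, 0.8, 0.3, 0.6])
qrs = hermite_basis(t, 5, sigma) @ true_coef + rng.normal(0.0, 0.05, t.size)

best_aic, best_bic, scores = select_order(qrs, t, sigma)
print("optimal length by AIC:", best_aic, "by BIC:", best_bic)
```

Because BIC penalizes each additional function by ln(m) rather than 2, it never selects a longer model than AIC when the window has more than about 7 samples. In practice the width parameter would also be fitted per beat rather than fixed as it is here.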