3 Conclusion - A comparison of statistical machine learning methods in heartbeat detection and classification

Both the unblocked and the blocked direct Hessenberg reduction algorithms take $O(N^3)$ flops and $O(N^3/B)$ I/Os. For large matrices, performance on machines with multiple levels of memory can be improved by the two-step reduction, since almost all the operations of the first step are matrix-matrix operations. We show that reduction of a nonsymmetric matrix to banded Hessenberg form of bandwidth $t$ takes $O\left(N^3 / (\min\{t, \sqrt{M}\}\, B)\right)$ I/Os. We also show that the slab-based algorithm performs best when the slab width $k$ is chosen as $\min\{\sqrt{M}, t\}$. We further observe that, in the existing slab-based algorithms, some elementary matrix operations, such as matrix multiplication, must be handled I/O-efficiently to achieve optimal I/O performance.
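As a rough illustration of the bounds stated above, the following minimal sketch evaluates them numerically. It assumes the standard two-level I/O model, in which M is the internal memory size and B the block size, both counted in matrix elements. The function names are hypothetical, and all constant factors hidden by the O-notation are dropped, so the results are useful only for comparing orders of magnitude.

```python
import math


def direct_reduction_ios(n: int, b: int) -> float:
    """Order-of-magnitude I/O count N^3 / B for direct Hessenberg reduction.

    Constant factors are ignored (assumption: only the growth rate matters).
    """
    return n**3 / b


def slab_width(m: int, t: int) -> int:
    """Slab width k = min(sqrt(M), t), rounded down to an integer."""
    return min(math.isqrt(m), t)


def banded_reduction_ios(n: int, m: int, b: int, t: int) -> float:
    """Order-of-magnitude I/O count N^3 / (min(t, sqrt(M)) * B) for
    reduction to banded Hessenberg form of bandwidth t."""
    return n**3 / (min(t, math.sqrt(m)) * b)


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    n, m, b, t = 10_000, 2**20, 1024, 64
    print("slab width k:", slab_width(m, t))
    print("direct reduction I/Os (order):", direct_reduction_ios(n, b))
    print("banded reduction I/Os (order):", banded_reduction_ios(n, m, b, t))
```

With these formulas, the estimated saving of the banded reduction over the direct one is the factor min(t, √M). This is why increasing the bandwidth t beyond √M gives no further reduction in the I/O estimate.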

