Publications

Book
Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, “Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part VII,” Lecture Notes in Computer Science, 1, no. 12143: Springer International Publishing, pp. 775, June 2020.
Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, “Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part II,” Lecture Notes in Computer Science, 1, no. 12138: Springer International Publishing, pp. 697, June 2020.
Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, “Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part IV,” Lecture Notes in Computer Science, 1, no. 12140: Springer International Publishing, pp. 668, June 2020.
Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, “Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part VI,” Lecture Notes in Computer Science, 1, no. 12142: Springer International Publishing, pp. 667, June 2020.
Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, “Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part I,” Lecture Notes in Computer Science, 1, no. 12137: Springer International Publishing, pp. 707, June 2020.
Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, “Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part III,” Lecture Notes in Computer Science, 1, no. 12139: Springer International Publishing, pp. 648, June 2020.
Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, “Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part V,” Lecture Notes in Computer Science, 1, no. 12141: Springer International Publishing, pp. 618, June 2020.
Wyrzykowski, R., E. Deelman, J. Dongarra, and K. Karczewski, “Parallel Processing and Applied Mathematics: 13th International Conference, PPAM 2019, Bialystok, Poland, September 8–11, 2019, Revised Selected Papers, Part II,” Lecture Notes in Computer Science, no. 12044: Springer International Publishing, pp. 503, March 2020.
Wyrzykowski, R., E. Deelman, J. Dongarra, and K. Karczewski, “Parallel Processing and Applied Mathematics: 13th International Conference, PPAM 2019, Bialystok, Poland, September 8–11, 2019, Revised Selected Papers, Part I,” Lecture Notes in Computer Science, 1, no. 12043: Springer International Publishing, pp. 581, March 2020.
Book Chapter
Dongarra, J., M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, “Accelerating Numerical Dense Linear Algebra Calculations with GPUs,” Numerical Computations with GPUs: Springer International Publishing, pp. 3-28, 2014.
Abdelfattah, A., S. Tomov, and J. Dongarra, “Batch QR Factorization on GPUs: Design, Optimization, and Tuning,” Lecture Notes in Computer Science, vol. 13350, Cham, Springer International Publishing, June 2022.
Anzt, H., J. Dongarra, M. Gates, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, “Bringing High Performance Computing to Big Data Algorithms,” Handbook of Big Data Technologies: Springer, 2017.
Baboulin, M., J. Dongarra, A. Remy, S. Tomov, and I. Yamazaki, “Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures,” Lecture Notes in Computer Science, vol. 9573: Springer International Publishing, pp. 86-95, September 2015, 2016.
Parker, S., J. Mellor-Crummey, D. H. Ahn, H. Jagode, H. Brunst, S. Shende, A. D. Malony, D. DelSignore, R. Tschuter, R. Castain, et al., “Performance Analysis and Debugging Tools at Scale,” Exascale Scientific Applications: Scalability and Performance Portability: Chapman & Hall / CRC Press, pp. 17-50, November 2017.
Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Performance, Design, and Autotuning of Batched GEMM for GPUs,” High Performance Computing: 31st International Conference, ISC High Performance 2016, Frankfurt, Germany, June 19-23, 2016, Proceedings, no. 9697: Springer International Publishing, pp. 21–38, 2016.
Bosilca, G., A. Bouteiller, A. Danalis, T. Herault, J. Kurzak, P. Luszczek, S. Tomov, and J. Dongarra, “Scalable Dense Linear Algebra on Heterogeneous Hardware,” HPC: Transition Towards Exascale Processing, in the series Advances in Parallel Computing, 2013.
Conference Paper
Gates, M., H. Anzt, J. Kurzak, and J. Dongarra, “Accelerating Collaborative Filtering for Implicit Feedback Datasets using GPUs,” 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, IEEE, November 2015.
Yamazaki, I., T. Mary, J. Kurzak, S. Tomov, and J. Dongarra, “Access-averse Framework for Computing Low-rank Matrix Approximations,” First International Workshop on High Performance Big Graph Data Management, Analysis, and Mining, Washington, DC, October 2014.
Thiyagalingam, J., G. von Laszewski, J. Yin, M. Emani, J. Papay, G. Barrett, P. Luszczek, A. Tsaris, C. Kirkpatrick, F. Wang, et al., “AI Benchmarking for Science: Efforts from the MLCommons Science Working Group,” Lecture Notes in Computer Science, vol. 13387: Springer International Publishing, pp. 47-64, January 2023.
Yi, Q., K. Kennedy, H. You, K. Seymour, and J. Dongarra, “Automatic Blocking of QR and LU Factorizations for Locality,” 2nd ACM SIGPLAN Workshop on Memory System Performance (MSP 2004), Washington, DC, ACM, June 2004.
Mucci, P., J. Dongarra, R. Kufrin, S. Moore, F. Song, and F. Wolf, “Automating the Large-Scale Collection and Analysis of Performance,” 5th LCI International Conference on Linux Clusters: The HPC Revolution, Austin, Texas, May 2004.
Gates, M., J. Kurzak, P. Luszczek, Y. Pei, and J. Dongarra, “Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices,” Parallel and Distributed Processing Symposium Workshops (IPDPSW), Orlando, FL, IEEE, June 2017.
Kashi, A., P. Nayak, D. Kulkarni, A. Scheinberg, P. Lin, and H. Anzt, “Batched sparse iterative solvers on GPU for the collision operator for fusion plasma simulations,” 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Lyon, France, IEEE, July 2022.
Haidar, A., J. Kurzak, G. Pichon, and M. Faverge, “A Data Flow Divide and Conquer Algorithm for Multicore Architecture,” 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, IEEE, May 2015.
Beck, M., T. Moore, N. French, E. Kissel, and M. Swany, “Data Logistics: Toolkit and Applications,” 5th EAI International Conference on Smart Objects and Technologies for Social Good, Valencia, Spain, September 2019.
Yamazaki, I., J. Kurzak, P. Luszczek, and J. Dongarra, “Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime,” Workshop on Large-Scale Parallel Processing, IPDPS 2014, Phoenix, AZ, IEEE, May 2014.
Kabir, K., A. Haidar, S. Tomov, and J. Dongarra, “On the Design, Development, and Analysis of Optimized Matrix-Vector Multiplication Routines for Coprocessors,” ISC High Performance 2015, Frankfurt, Germany, July 2015.
Anzt, H., J. Dongarra, M. Kreutzer, G. Wellein, and M. Kohler, “Efficiency of General Krylov Methods on GPUs – An Experimental Study,” The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), Chicago, IL, IEEE, May 2016.
Solcà, R., A. Kozhevnikov, A. Haidar, S. Tomov, T. C. Schulthess, and J. Dongarra, “Efficient Implementation Of Quantum Materials Simulations On Distributed CPU-GPU Systems,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
Cao, Q., Y. Pei, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications,” Platform for Advanced Scientific Computing Conference (PASC20), Geneva, Switzerland, ACM, June 2020.
Cao, Q., R. Alomairy, Y. Pei, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “A Framework to Exploit Data Sparsity in Tile Low-Rank Cholesky Factorization,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), July 2022.
Newburn, C. J., G. Bansal, M. Wood, L. Crivelli, J. Planas, A. Duran, P. Souza, L. Borges, P. Luszczek, S. Tomov, et al., “Heterogeneous Streaming,” The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2016, Chicago, IL, IEEE, May 2016.
Beams, N., A. Abdelfattah, S. Tomov, J. Dongarra, T. Kolev, and Y. Dudouit, “High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs,” 2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA): IEEE, November 2020.
Abdelfattah, A., M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, et al., “High-Performance Tensor Contractions for GPUs,” International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016.
Haidar, A., P. Luszczek, J. Kurzak, and J. Dongarra, “An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware,” Supercomputing 2013, Denver, CO, November 2013.
Cao, Q., Y. Pei, K. Akbudak, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems,” 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021.
Kurzak, J., Y. Tsai, M. Gates, A. Abdelfattah, and J. Dongarra, “Massively Parallel Automated Software Tuning,” 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan, ACM Press, August 2019.
Tsai, Y-H. Mike, N. Beams, and H. Anzt, “Mixed Precision Algebraic Multigrid on GPUs,” Parallel Processing and Applied Mathematics (PPAM 2022), vol. 13826, Cham, Springer International Publishing, April 2023.
Yamazaki, I., S. Tomov, J. Kurzak, J. Dongarra, and J. Barlow, “Mixed-precision Block Gram Schmidt Orthogonalization,” 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Austin, TX, ACM, November 2015.
Yamazaki, I., J. Barlow, S. Tomov, J. Kurzak, and J. Dongarra, “Mixed-precision orthogonalization process Performance on multicore CPUs with GPUs,” 2015 SIAM Conference on Applied Linear Algebra, Atlanta, GA, SIAM, October 2015.
Haidar, A., K. Kabir, D. Fayad, S. Tomov, and J. Dongarra, “Out of Memory SVD Solver for Big Data,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Waltham, MA, IEEE, September 2017.
Kabir, K., A. Haidar, S. Tomov, and J. Dongarra, “Performance Analysis and Design of a Hessenberg Reduction using Stabilized Blocked Elementary Transformations for New Architectures,” The Spring Simulation Multi-Conference 2015 (SpringSim'15), Best Paper Award, Alexandria, VA, April 2015.
Kabir, K., A. Haidar, S. Tomov, and J. Dongarra, “Performance Analysis and Optimization of Two-Sided Factorization Algorithms for Heterogeneous Platform,” International Conference on Computational Science (ICCS 2015), Reykjavík, Iceland, June 2015.
Cao, Q., Y. Pei, T. Herault, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, “Performance Analysis of Tile Low-Rank Cholesky Factorization Using PaRSEC Instrumentation Tools,” Workshop on Programming and Performance Visualization Tools (ProTools 19) at SC19, Denver, CO, ACM, November 2019.
Mary, T., I. Yamazaki, J. Kurzak, P. Luszczek, S. Tomov, and J. Dongarra, “Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs,” The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, ACM, November 2015.
Dongarra, J., M. Gates, A. Haidar, Y. Jia, K. Kabir, P. Luszczek, and S. Tomov, “Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi,” PPAM 2013, Warsaw, Poland, September 2013.
