Publications

A

Abdelfattah, A., S. Tomov, and J. Dongarra, “Optimizing Batch HGEMM on Small Sizes Using Tensor Cores,” San Jose, CA, GPU Technology Conference (GTC), March 2019.

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Optimizing GPU Kernels for Irregular Batch Workloads: A Case Study for Cholesky Factorization,” IEEE High Performance Extreme Computing Conference (HPEC’18), Waltham, MA, IEEE, September 2018.

Abdelfattah, A., J. Dongarra, D. Keyes, and H. Ltaief, “Optimizing Memory-Bound Numerical Kernels on GPU Hardware Accelerators,” VECPAR 2012, Kobe, Japan, July 2012.

Alvaro, W., J. Kurzak, and J. Dongarra, “Optimizing Matrix Multiplication for a Short-Vector SIMD Architecture - CELL Processor,” Parallel Computing, vol. 35, pp. 138-150, 2009.

Angskun, T., G. Bosilca, B. Vander Zanden, and J. Dongarra, “Optimal Routing in Binomial Graph Networks,” The International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), Adelaide, Australia, IEEE Computer Society, December 2007.

Anzt, H., M. Kreutzer, E. Ponce, G. D. Peterson, G. Wellein, and J. Dongarra, “Optimization and Performance Evaluation of the IDR Iterative Krylov Solver on GPUs,” The International Journal of High Performance Computing Applications, vol. 32, no. 2, pp. 220–230, March 2018.

Aupy, G., A. Benoit, T. Herault, Y. Robert, and J. Dongarra, “Optimal Checkpointing Period: Time vs. Energy,” University of Tennessee Computer Science Technical Report (also LAWN 281), no. ut-eecs-13-718: University of Tennessee, October 2013.

B

Bak, S., C. Bertoni, S. Boehm, R. Budiardja, B. M. Chapman, J. Doerfert, M. Eisenbach, H. Finkel, O. Hernandez, J. Huber, et al., “OpenMP application experiences: Porting to accelerated nodes,” Parallel Computing, vol. 109, March 2022.

Benoit, A., A. Cavelan, Y. Robert, and H. Sun, “Optimal Resilience Patterns to Cope with Fail-stop and Silent Errors,” 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Chicago, IL, IEEE, May 2016.

Benoit, A., A. Cavelan, V. Le Fèvre, and Y. Robert, “Optimal Checkpointing Period with replicated execution on heterogeneous platforms,” 2017 Workshop on Fault-Tolerance for HPC at Extreme Scale, Washington, DC, IEEE Computer Society Press, June 2017.

Betancourt, F., K. Wong, E. Asemota, Q. Marshall, D. Nichols, and S. Tomov, “OpenDIEL: A Parallel Workflow Engine and Data Analytics Framework,” Practice and Experience in Advanced Research Computing (PEARC ’19), Chicago, IL, ACM, July 2019.

C

Chaarawi, M., E. Gabriel, R. Keller, R. L. Graham, G. Bosilca, and J. Dongarra, “OMPIO: A Modular Software Architecture for MPI I/O,” 18th EuroMPI, Santorini, Greece, Springer, pp. 81-89, September 2011.

Coulomb, K., A. Degomme, M. Faverge, and F. Trahay, “An open-source tool-chain for performance analysis,” Parallel Tools Workshop, Dresden, Germany, September 2011.

D

Dong, T., A. Haidar, S. Tomov, and J. Dongarra, “Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices,” International Conference on Computational Science (ICCS 2017), Zurich, Switzerland, Procedia Computer Science, June 2017.

Dongarra, J., S. Hammarling, N. J. Higham, S. Relton, and M. Zounon, “Optimized Batched Linear Algebra for Modern Architectures,” Euro-Par 2017, Santiago de Compostela, Spain, Springer, August 2017.

Dongarra, J., and A. Lastovetsky, “An Overview of Heterogeneous High Performance and Grid Computing,” Engineering the Grid (to appear): Nova Science Publishers, Inc., 2004.

Du, Y., G. Pallez, L. Marchal, and Y. Robert, “Optimal Checkpointing Strategies for Iterative Applications,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 3, pp. 507-522, March 2022.

Du, P., P. Luszczek, and J. Dongarra, “OpenCL Evaluation for Numerical Linear Algebra Library Development,” Symposium on Application Accelerators in High-Performance Computing (SAAHPC '10), Knoxville, TN, July 2010.

F

Fürlinger, K., and S. Moore, “OpenMP-centric Performance Analysis of Hybrid Applications,” Proc. 2008 IEEE International Conference on Cluster Computing (CLUSTER 2008), Tsukuba, Japan, January 2008.

H

Haidar, A., T. Dong, P. Luszczek, S. Tomov, and J. Dongarra, “Optimization for Performance and Energy for Batched Matrix Computations on GPUs,” 8th Workshop on General Purpose Processing Using GPUs (GPGPU 8), San Francisco, CA, ACM, February 2015.

Haidar, A., K. Kabir, D. Fayad, S. Tomov, and J. Dongarra, “Out of Memory SVD Solver for Big Data,” 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Waltham, MA, IEEE, September 2017.

Herault, T., Y. Robert, A. Bouteiller, D. Arnold, K. Ferreira, G. Bosilca, and J. Dongarra, “Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Best Paper Award, Vancouver, BC, Canada, IEEE, May 2018.

Hiroyasu, T., M. Miki, H. Shimosaka, and J. Dongarra, “Optimization Problem Solving System using Grid RPC,” 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, Tokyo, Japan, March 2003.

Hiroyasu, T., M. Miki, H. Shimosaka, Y. Tanimura, and J. Dongarra, “Optimization System Using Grid RPC,” Meeting of the Japan Society of Mechanical Engineers, Kyoto University, Kyoto, Japan, October 2002.

Hiroyasu, T., M. Miki, J. Sawada, and J. Dongarra, “Optimization of Injection Schedule of Diesel Engine Using GridRPC,” Information Processing Society of Japan Symposium Series, vol. 2003, no. 14, pp. 189-197, January 2003.

Hori, A., K. Yoshinaga, T. Herault, A. Bouteiller, G. Bosilca, and Y. Ishikawa, “Overhead of Using Spare Nodes,” The International Journal of High Performance Computing Applications, February 2020.

N

Nath, R., S. Tomov, T. Dong, and J. Dongarra, “Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs,” ACM/IEEE Conference on Supercomputing (SC’11), Seattle, WA, November 2011.

P

Plank, J., M. Beck, J. Dongarra, R. Wolski, and H. Casanova, “Optimizing Performance and Reliability in Distributed Computing Systems Through Wide Spectrum Storage,” Proceedings of the IPDPS 2003, NGS Workshop, Nice, France, pp. 209, January 2003.

S

Seymour, K., H. Nakada, S. Matsuoka, J. Dongarra, C. Lee, and H. Casanova, “Overview of GridRPC: A Remote Procedure Call API for Grid Computing,” Proceedings of the Third International Workshop on Grid Computing, pp. 274-278, January 2002.

Shimosaka, H., T. Hiroyasu, M. Miki, and J. Dongarra, “Optimization Problem Solving System Using GridRPC,” IEEE Transactions on Parallel and Distributed Systems (submitted), January 2005.

Steen, A. J. van der, and J. Dongarra, “Overview of High Performance Computers,” Handbook of Massive Data Sets: Kluwer Academic Publishers, pp. 791-852, January 2001.

T

Tomov, S., P. Luszczek, I. Yamazaki, J. Dongarra, H. Anzt, and W. Sawyer, “Optimizing Krylov Subspace Solvers on Graphics Processing Units,” Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, Phoenix, AZ, IEEE, May 2014.

W

White, J. B., and J. Dongarra, “Overlapping Computation and Communication for Advection on a Hybrid Parallel Computer,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.

Y

Yamazaki, I., S. Tomov, and J. Dongarra, “One-Sided Dense Matrix Factorizations on a Multicore with Multiple GPU Accelerators,” The International Conference on Computational Science (ICCS), June 2012.