Publications

Filters: First Letter Of Title is B

A

Abdelfattah, A., A. Haidar, S. Tomov, and J. Dongarra, “Batched One-Sided Factorizations of Tiny Matrices Using GPUs: Challenges and Countermeasures,” Journal of Computational Science, vol. 26, pp. 226–236, May 2018.

Angskun, T., G. Bosilca, and J. Dongarra, “Binomial Graph: A Scalable and Fault-Tolerant Logical Network Topology,” Proceedings of The Fifth International Symposium on Parallel and Distributed Processing and Applications (ISPA07), Niagara Falls, Canada, Springer, August 2007.

Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, “Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems,” ICCS 2012, Omaha, NE, June 2012.

Anzt, H., J. Dongarra, G. Flegar, and E. S. Quintana-Orti, “Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioner Generation on GPUs,” Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, New York, NY, USA, ACM, pp. 1–10, February 2017.

Anzt, H., S. Tomov, M. Gates, J. Dongarra, and V. Heuveline, Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems, no. UT-CS-11-689, December 2011.

Anzt, H., E. Chow, T. Huckle, and J. Dongarra, “Batched Generation of Incomplete Sparse Approximate Inverses on GPUs,” Proceedings of the 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pp. 49–56, November 2016.

Anzt, H., J. Dongarra, M. Gates, J. Kurzak, P. Luszczek, S. Tomov, and I. Yamazaki, “Bringing High Performance Computing to Big Data Algorithms,” Handbook of Big Data Technologies: Springer, 2017.

Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “A Block-Asynchronous Relaxation Method for Graphics Processing Units,” University of Tennessee Computer Science Technical Report, no. UT-CS-11-687 / LAWN 258, November 2011.

Anzt, H., E. Chow, and J. Dongarra, “On block-asynchronous execution on GPUs,” LAPACK Working Note, no. 291, November 2016.

Anzt, H., S. Tomov, J. Dongarra, and V. Heuveline, “A Block-Asynchronous Relaxation Method for Graphics Processing Units,” Journal of Parallel and Distributed Computing, vol. 73, issue 12, pp. 1613–1626, December 2013.

Asch, M., T. Moore, R. M. Badia, M. Beck, P. Beckman, T. Bidot, F. Bodin, F. Cappello, A. Choudhary, B. R. de Supinski, et al., “Big Data and Extreme-Scale Computing: Pathways to Convergence - Toward a Shaping Strategy for a Future Software and Data Ecosystem for Scientific Inquiry,” The International Journal of High Performance Computing Applications, vol. 32, issue 4, pp. 435–479, July 2018.

B

Blackford, S., J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux, L. Kaufman, A. Lumsdaine, A. Petitet, et al., “Basic Linear Algebra Subprograms (BLAS),” (an update), submitted to ACM TOMS, February 2001.

C

Caniou, Y., E. Caron, A K W. Chang, and Y. Robert, “Budget-Aware Scheduling Algorithms for Scientific Workflows with Stochastic Task Weights on Heterogeneous IaaS Cloud Platforms,” 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Vancouver, BC, Canada, IEEE, May 2018.

Caron, E., Y. Caniou, A K W. Chang, and Y. Robert, “Budget-aware scheduling algorithms for scientific workflows with stochastic task weights on IaaS Cloud platforms,” Concurrency and Computation: Practice and Experience, vol. 33, no. 17, pp. e6065, 2021.

D

Danalis, A., P. Luszczek, G. Marin, J. Vetter, and J. Dongarra, “BlackjackBench: Hardware Characterization with Portable Micro-Benchmarks and Automatic Statistical Analysis of Results,” IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 2011.

Danalis, A., P. Luszczek, G. Marin, J. Vetter, and J. Dongarra, “BlackjackBench: Portable Hardware Characterization with Automated Results Analysis,” The Computer Journal, March 2013.

Dongarra, J., I. Duff, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Hogg, P. Valero Lara, P. Luszczek, M. Zounon, et al., Batched BLAS (Basic Linear Algebra Subprograms) 2018 Specification, July 2018.

Dongarra, J., H. Meuer, H. D. Simon, and E. Strohmaier, “Biannual Top-500 Computer Lists Track Changing Environments for Scientific Computing,” SIAM News, vol. 34, no. 9, October 2002.

Dongarra, J., E. Jeannot, E. Saule, and Z. Shi, “Bi-objective Scheduling Algorithms for Optimizing Makespan and Reliability on Heterogeneous Systems,” 19th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA) (submitted), San Diego, CA, June 2007.

F

Fagg, G., and J. Dongarra, “Building and using a Fault Tolerant MPI implementation,” International Journal of High Performance Applications and Supercomputing (to appear), 2004.

Faverge, M., J. Langou, Y. Robert, and J. Dongarra, “Bidiagonalization and R-Bidiagonalization: Parallel Tiled Algorithms, Critical Paths and Distributed-Memory Implementation,” IEEE International Parallel and Distributed Processing Symposium (IPDPS), Orlando, FL, IEEE, May 2017.

G

Gamblin, T., P. Beckman, K. Keahey, K. Sato, M. Kondo, and G. Balazs, “BDEC2 Platform White Paper,” Innovative Computing Laboratory Technical Report, no. ICL-UT-19-11: University of Tennessee, September 2019.

Abdelfattah, A., S. Tomov, and J. Dongarra, “Batch QR Factorization on GPUs: Design, Optimization, and Tuning,” Lecture Notes in Computer Science, vol. 13350, Cham, Springer International Publishing, June 2022.

H

Haidar, A., P. Luszczek, S. Tomov, and J. Dongarra, “Batched Matrix Computations on Hardware Accelerators,” EuroMPI/Asia 2015 Workshop, Bordeaux, France, September 2015.

Haidar, A., A. Abdelfattah, S. Tomov, and J. Dongarra, “Batched Matrix Computations on Hardware Accelerators Based on GPUs,” 2015 SIAM Conference on Applied Linear Algebra (SIAM LA), Atlanta, GA, SIAM, October 2015.

Haidar, A., T. Dong, P. Luszczek, S. Tomov, and J. Dongarra, “Batched matrix computations on hardware accelerators based on GPUs,” International Journal of High Performance Computing Applications, February 2015.

K

Kashi, A., P. Nayak, D. Kulkarni, A. Scheinberg, P. Lin, and H. Anzt, “Batched sparse iterative solvers on GPU for the collision operator for fusion plasma simulations,” 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Lyon, France, IEEE, July 2022.

M

Marques, O., J. Demmel, and P. B. Vasconcelos, “Bidiagonal SVD Computation via an Associated Tridiagonal Eigenproblem,” LAPACK Working Note, no. LAWN 295, ICL-UT-18-02: University of Tennessee, April 2018.

McCraw, H., D. Terpstra, J. Dongarra, K. Davis, and R. Musselman, “Beyond the CPU: Hardware Performance Counter Monitoring on Blue Gene/Q,” International Supercomputing Conference 2013 (ISC'13), Leipzig, Germany, Springer, June 2013.

“BDEC Pathways to Convergence: Toward a Shaping Strategy for a Future Software and Data Ecosystem for Scientific Inquiry,” Innovative Computing Laboratory Technical Report, no. ICL-UT-17-08: University of Tennessee, November 2017.

N

Nath, R., S. Tomov, and J. Dongarra, “BLAS for GPUs,” Scientific Computing with Multicore and Accelerators, Boca Raton, Florida, CRC Press, 2010.

Y

YarKhan, A., and J. Dongarra, “Biological Sequence Alignment on the Computational Grid Using the GrADS Framework,” Future Generation Computing Systems, vol. 21, no. 6: Elsevier, pp. 980–986, June 2005.
