Activities

Employment (1)

Oak Ridge National Laboratory: Oak Ridge, Tennessee, US

2020-11-15 to present
Employment
Source: Self-asserted source
Pedro Valero-Lara

Works (47)

IRIS Reimagined: Advancements in Intelligent Runtime System for Task-Based Programming

2024 | Book chapter
Contributors: Narasinga Rao Miniskar; Seyong Lee; Beau Johnston; Aaron Young; Mohammad Alaul Haque Monil; Pedro Valero-Lara; Jeffrey S. Vetter
Source: Crossref

Clacc: OpenACC for C/C++ in Clang

The International Journal of High Performance Computing Applications
2024-06-14 | Journal article
Contributors: Joel E Denny; Seyong Lee; Pedro Valero-Lara; Marc Gonzalez-Tallada; Keita Teranishi; Jeffrey S Vetter
Source: Crossref

sKokkos: Enabling Kokkos with Transparent Device Selection on Heterogeneous Systems using OpenACC

2024-01-18 | Conference paper
Contributors: Pedro Valero-Lara; Seyong Lee; Joel Denny; Keita Teranishi; Jeffrey Vetter; Marc Gonzalez-Tallada
Source: Crossref

A Portable and Heterogeneous LU Factorization on IRIS

2023 | Book chapter
Contributors: Pedro Valero-Lara; Jungwon Kim; Jeffrey S. Vetter
Source: Crossref

Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation

2023 | Other
Contributors: Pedro Valero-Lara; Alexis Huante; Mustafa Al Lail; William F. Godoy; Keita Teranishi; Prasanna Balaprakash; Jeffrey S. Vetter
Source: Self-asserted source
Pedro Valero-Lara

Julia as a unifying end-to-end workflow language on the Frontier exascale system

2023-11-12 | Conference paper
Contributors: William F. Godoy; Pedro Valero-Lara; Caira Anderson; Katrina W. Lee; Ana Gainaru; Rafael Ferreira Da Silva; Jeffrey S. Vetter
Source: Crossref

MatRIS: Multi-level Math Library Abstraction for Heterogeneity and Performance Portability using IRIS Runtime

2023-11-12 | Conference paper
Contributors: Mohammad Alaul Haque Monil; Narasinga Rao Miniskar; Keita Teranishi; Jeffrey S. Vetter; Pedro Valero-Lara
Source: Crossref

Mixed-Precision S/DGEMM Using the TF32 and TF64 Frameworks on Low-Precision AI Tensor Cores

2023-11-12 | Conference paper
Contributors: Pedro Valero-Lara; Ian Jorquera; Frank Liu; Jeffrey Vetter
Source: Crossref

Moment Representation of Regularized Lattice Boltzmann Methods on NVIDIA and AMD GPUs

2023-11-12 | Conference paper
Contributors: Pedro Valero-Lara; Jeffrey Vetter; John Gounley; Amanda Randles
Source: Crossref

IRIS-DMEM: Efficient Memory Management for Heterogeneous Computing

2023 IEEE High Performance Extreme Computing Conference (HPEC)
2023-09-25 | Conference paper
Contributors: Narasinga Rao Miniskar; Mohammad Alaul Haque Monil; Pedro Valero-Lara; Frank Y. Liu; Jeffrey S. Vetter
Source: Self-asserted source
Pedro Valero-Lara

S4PST: Sustainability for Programming Systems and Tools: May Workshop Report

2023-08-24 | Report
Contributors: Hartwig Anzt; Pedro Valero-Lara; William Godoy; Keita Teranishi
Source: Crossref

Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation

2023-08-07 | Conference paper
Contributors: William Godoy; Pedro Valero-Lara; Keita Teranishi; Prasanna Balaprakash; Jeffrey Vetter
Source: Crossref

A MultiGPU Performance-Portable Solution for Array Programming Based on Kokkos

Proceedings of the 9th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming
2023-06-06 | Book
Contributors: Pedro Valero-Lara; Jeffrey S. Vetter
Source: Self-asserted source
Pedro Valero-Lara

Evaluating performance and portability of high-level programming models: Julia, Python/Numba, and Kokkos on exascale nodes

2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
2023-05 | Book
Contributors: William F. Godoy; Pedro Valero-Lara; T. Elise Dettling; Christian Trefftz; Ian Jorquera; Thomas Sheehy; Ross G. Miller; Marc Gonzalez-Tallada; Jeffrey S. Vetter; Valentin Churavy
Source: Self-asserted source
Pedro Valero-Lara

Tiling Framework for Heterogeneous Computing of Matrix based Tiled Algorithms

2023-02-25 | Conference paper
Contributors: Narasinga Rao Miniskar; Mohammad Alaul Haque Monil; Pedro Valero-Lara; Frank Liu; Jeffrey S. Vetter
Source: Crossref

OpenMP Target Task: Tasking and Target Offloading on Heterogeneous Systems

2022 | Book chapter
Contributors: Pedro Valero-Lara; Jungwon Kim; Oscar Hernandez; Jeffrey Vetter
Source: Crossref

IRIS-BLAS: Towards a Performance Portable and Heterogeneous BLAS Library

2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC)
2022-12 | Book
Contributors: Narasinga Rao Miniskar; Mohammad Alaul Haque Monil; Pedro Valero-Lara; Frank Liu; Jeffrey S. Vetter
Source: Self-asserted source
Pedro Valero-Lara

KokkACC: Enhancing Kokkos with OpenACC

2022 Workshop on Accelerator Programming Using Directives (WACCPD)
2022-11 | Book
Contributors: Pedro Valero-Lara; Seyong Lee; Marc Gonzalez-Tallada; Joel Denny; Jeffrey S. Vetter
Source: Self-asserted source
Pedro Valero-Lara

Towards Enhancing Coding Productivity for GPU Programming Using Static Graphs

Electronics
2022-04-20 | Journal article
Contributors: Leonel Toledo; Pedro Valero-Lara; Jeffrey S. Vetter; Antonio J. Peña
Source: Crossref

Propagation Pattern for Moment Representation of the Lattice Boltzmann Method

IEEE Transactions on Parallel and Distributed Systems
2022-03-01 | Journal article
Contributors: John Gounley; Madhurima Vardhan; Erik W. Draeger; Pedro Valero-Lara; Shirley V. Moore; Amanda Randles
Source: Crossref

sLASs: A fully automatic auto-tuned linear algebra library based on OpenMP extensions implemented in OmpSs (LASs Library)

Journal of Parallel and Distributed Computing
2020-04 | Journal article
Part of ISSN: 0743-7315
Contributors: Pedro Valero-Lara; Sandra Catalán; Xavier Martorell; Tetsuzo Usui; Jesús Labarta
Source: Self-asserted source
Pedro Valero-Lara

MPI+OpenMP tasking scalability for multi-morphology simulations of the human brain

Parallel Computing
2019-05 | Journal article
Contributors: Pedro Valero-Lara; Raül Sirvent; Antonio J. Peña; Jesús Labarta
Source: Crossref

NVIDIA GPUs Scalability to Solve Multiple (Batch) Tridiagonal Systems Implementation of cuThomasBatch

2018 | Book chapter
Contributors: Pedro Valero-Lara; Ivan Martínez-Pérez; Raül Sirvent; Xavier Martorell; Antonio J. Peña
Source: Crossref

cuThomasBatch and cuThomasVBatch, CUDA Routines to compute batch of tridiagonal systems on NVIDIA GPUs

Concurrency and Computation: Practice and Experience
2018-12-25 | Journal article
Contributors: Pedro Valero-Lara; Ivan Martínez-Pérez; Raül Sirvent; Xavier Martorell; Antonio J. Peña
Source: Crossref

Heterogeneous CPU+GPU approaches for mesh refinement over Lattice-Boltzmann simulations

Concurrency and Computation: Practice and Experience
2017 | Journal article
EID: 2-s2.0-84983499587
Contributors: Valero-Lara, P.; Jansson, J.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems

Procedia Computer Science
2017 | Journal article
Part of ISSN: 1877-0509
Contributors: Jack Dongarra; Sven Hammarling; Nicholas J. Higham; Samuel D. Relton; Pedro Valero-Lara; Mawussi Zounon
Source: Self-asserted source
Pedro Valero-Lara

Reducing memory requirements for large size LBM simulations on GPUs

Concurrency and Computation: Practice and Experience
2017-12-25 | Journal article
Contributors: Pedro Valero-Lara
Source: Crossref

Leveraging the performance of LBM-HPC for large sizes on GPUs using ghost cells

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
2016 | Book
EID: 2-s2.0-85007153898
Contributors: Valero-Lara, P.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Lightning talk: Creating a standardised set of batched BLAS routines

CEUR Workshop Proceedings
2016 | Conference paper
EID: 2-s2.0-84991107749
Contributors: Dongarra, J.; Hammarling, S.; Higham, N.J.; Relton, S.D.; Valero-Lara, P.; Zounon, M.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Many-task computing on many-core architectures

Scalable Computing
2016 | Journal article
EID: 2-s2.0-84963752839
Contributors: Valero-Lara, P.; Nookala, P.; Pelayo, F.L.; Jansson, J.; Dimitropoulos, S.; Raicu, I.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Multi-domain grid refinement for lattice-Boltzmann simulations on heterogeneous platforms

Proceedings - IEEE 18th International Conference on Computational Science and Engineering, CSE 2015
2016 | Conference paper
EID: 2-s2.0-84962920078
Contributors: Valero-Lara, P.; Jansson, J.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Multicore and manycore: Hybrid computing architectures and applications

Innovative Research and Applications in Next-Generation High Performance Computing
2016 | Book
EID: 2-s2.0-85014336927
Contributors: Valero-Lara, P.; Paz-Gallardo, A.; Foster, E.L.; Prieto-Matías, M.; Pinelli, A.; Jansson, J.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

A Non-uniform Staggered Cartesian Grid approach for Lattice-Boltzmann method

Procedia Computer Science
2015 | Conference paper
EID: 2-s2.0-84939212648
Contributors: Valero-Lara, P.; Jansson, J.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Accelerating fluid-solid simulations (Lattice-Boltzmann & Immersed-Boundary) on heterogeneous architectures

Journal of Computational Science
2015 | Journal article
EID: 2-s2.0-84941285531
Contributors: Valero-Lara, P.; Igual, F.D.; Prieto-Matías, M.; Pinelli, A.; Favier, J.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

LBM-HPC - An open-source tool for fluid simulations. Case study: Unified parallel C (UPC-PGAS)

Proceedings - IEEE International Conference on Cluster Computing, ICCC
2015 | Conference paper
EID: 2-s2.0-84959312261
Contributors: Valero-Lara, P.; Jansson, J.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Accelerating solid-fluid interaction using Lattice-Boltzmann and Immersed Boundary coupled simulations on heterogeneous platforms

Procedia Computer Science
2014 | Conference paper
EID: 2-s2.0-84902780607
Contributors: Valero-Lara, P.; Pinelli, A.; Prieto-Matias, M.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Accelerating solid–fluid interaction based on the immersed boundary method on multicore and GPU architectures

Journal of Supercomputing
2014 | Journal article
EID: 2-s2.0-84919877196
Contributors: Valero-Lara, P.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Fast finite difference Poisson solvers on heterogeneous architectures

Computer Physics Communications
2014 | Journal article
EID: 2-s2.0-84894640088
Contributors: Valero-Lara, P.; Pinelli, A.; Prieto-Matias, M.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

hLCS. A hybrid GPGPU approach for solving multiple short and unbalanced LCS problems

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
2014 | Book
EID: 2-s2.0-84904902359
Contributors: Valero-Lara, P.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Multi-GPU acceleration of DARTEL (early detection of Alzheimer)

2014 IEEE International Conference on Cluster Computing, CLUSTER 2014
2014 | Conference paper
EID: 2-s2.0-84917706060
Contributors: Valero-Lara, P.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

A GPU approach for accelerating 3D deformable registration (DARTEL) on brain biomedical images

ACM International Conference Proceeding Series
2013 | Conference paper
EID: 2-s2.0-84886304713
Contributors: Valero-Lara, P.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Analysis in performance and new model for multiple kernels executions on many-core architectures

Proceedings of the 12th IEEE International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2013
2013 | Conference paper
EID: 2-s2.0-84889026220
Contributors: Valero-Lara, P.; Pelayo, F.L.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Block tridiagonal solvers on heterogeneous architectures

Proceedings of the 2012 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2012
2012 | Conference paper
EID: 2-s2.0-84867275121
Contributors: Valero-Lara, P.; Pinelli, A.; Favier, J.; Matias, M.P.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Improving the performance for the range search on metric spaces using a multi-GPU platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
2012 | Book
EID: 2-s2.0-84866052888
Contributors: Uribe-Paredes, R.; Arias, E.; Sánchez, J.L.; Cazorla, D.; Valero-Lara, P.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

MRF satellite image classification on GPU

Proceedings of the International Conference on Parallel Processing Workshops
2012 | Conference paper
EID: 2-s2.0-84871150569
Contributors: Valero-Lara, P.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

A GPU-based implementation for range queries on spaghettis data structure

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
2011 | Book
EID: 2-s2.0-79960300063
Contributors: Uribe-Paredes, R.; Valero-Lara, P.; Arias, E.; Sánchez, J.L.; Cazorla, D.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Similarity search implementations for multi-core and many-core processors

Proceedings of the 2011 International Conference on High Performance Computing and Simulation, HPCS 2011
2011 | Conference paper
EID: 2-s2.0-80052975757
Contributors: Uribe-Paredes, R.; Valero-Lara, P.; Arias, E.; Sánchez, J.L.; Cazorla, D.
Source: Self-asserted source
Pedro Valero-Lara via Scopus - Elsevier

Peer review (4 reviews for 4 publications/grants)

Review activity for Computer physics communications. (1)
Review activity for Journal of parallel and distributed computing. (1)
Review activity for Parallel computing. (1)
Review activity for The journal of supercomputing. (1)