<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-04-13T18:30:15Z</responseDate><request verb="GetRecord" identifier="oai:www.recercat.cat:2117/116807" metadataPrefix="marc">https://recercat.cat/oai/request</request><GetRecord><record><header><identifier>oai:recercat.cat:2117/116807</identifier><datestamp>2026-01-14T06:22:55Z</datestamp><setSpec>com_2072_1033</setSpec><setSpec>col_2072_452950</setSpec></header><metadata><record xmlns="http://www.loc.gov/MARC21/slim" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
   <leader>00925njm 22002777a 4500</leader>
   <datafield ind2=" " ind1=" " tag="042">
      <subfield code="a">dc</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Garcia-Gasulla, Marta</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Houzeaux, Guillaume</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Ferrer, Roger</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Artigues, Antoni</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">López, Victor</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Labarta Mancho, Jesús José</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Vázquez, Mariano</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="260">
      <subfield code="c">2018</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">The main computing tasks of a finite element (FE) code for solving partial differential equations (PDEs) are the algebraic system assembly and the iterative solver. This work focuses on the first task, in the context of a hybrid MPI+X paradigm. Although we describe the algorithms in the FE context, a similar strategy can be straightforwardly applied to other discretization methods, like the finite volume method. The matrix assembly consists of a loop over the elements of the MPI partition to compute element matrices and right-hand sides and to assemble them into the local system of each MPI partition. In an MPI+X hybrid parallelism context, X has traditionally consisted of loop parallelism using OpenMP. Several strategies have been proposed in the literature to implement this loop parallelism, like coloring or substructuring techniques, to circumvent the race condition that appears when assembling the element system into the local system. The main drawback of the first technique is the decrease of the IPC (instructions per cycle) due to bad spatial locality. The second technique avoids this issue but requires extensive changes in the implementation, which can be cumbersome when several element loops must be treated. We propose an alternative, based on the task parallelism of the element loop using some extensions to the OpenMP programming model. The taskification of the assembly solves both aforementioned problems. In addition, dynamic load balance is applied using the DLB library, which is especially efficient in the presence of hybrid meshes, where the relative costs of the different elements are impossible to estimate a priori. This paper presents the proposed methodology, its implementation and its validation through the solution of large computational mechanics problems on up to 16k cores.</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">The use of a large part of a supercomputer, even more so under normal conditions of use, is never an innocuous exercise. The research leading to these results has received funding from: the European Union's Horizon 2020 Programme (2014–2020) and from the Brazilian Ministry of Science, Technology and Innovation through Rede Nacional de Pesquisa (RNP), HPC4E Project, grant agreement 689772; the Energy oriented Centre of Excellence (EoCoE), grant agreement number 676629, funded within the Horizon 2020 framework of the European Union; the Spanish Government (grant SEV2015-0493 of the Severo Ochoa Program); the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P); the Generalitat de Catalunya (contract 2014-SGR-1051); and the Intel-BSC Exascale Lab collaboration project. Comissió Interdepartamental de Recerca i Innovació Tecnológica (Interdepartmental Commission for Research and Technological Innovation)</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">Yes</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">Post-print (author's final draft)</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Àrees temàtiques de la UPC::Informàtica</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">OpenMP</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Finite element code (FE)</subfield>
   </datafield>
   <datafield ind2="0" ind1="0" tag="245">
      <subfield code="a">MPI+X: task-based parallelization and dynamic load balance of finite element assembly</subfield>
   </datafield>
</record></metadata></record></GetRecord></OAI-PMH>