... memory accesses must be
atomic and since memory accesses must be performed one after another. Therefore,
processors may have to wait for quite a long time before memory accesses
that they have ... (4).
Thus, both $P_1$ and $P_2$ may print the old value for $x_1$ and $x_2$, respectively.
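The effect can be illustrated with a small deterministic simulation. The following sketch is only an illustration under stated assumptions: it assumes that each processor first writes 1 to its own variable and then reads the variable of the other processor (both variables initially 0), and that the W → R relaxation is realized by a FIFO write buffer per processor. The names Processor and run_store_buffer_example are hypothetical and chosen for this sketch.

```python
from collections import deque


class Processor:
    """A processor with a FIFO write buffer in front of the shared memory."""

    def __init__(self, memory):
        self.memory = memory
        self.buffer = deque()  # pending (variable, value) writes

    def write(self, var, value):
        # The write is only buffered; other processors cannot see it yet.
        self.buffer.append((var, value))

    def read(self, var):
        # A processor sees its own buffered writes first (newest wins).
        for buffered_var, value in reversed(self.buffer):
            if buffered_var == var:
                return value
        return self.memory[var]

    def flush(self):
        # Drain in FIFO order, so the W -> W order of this processor is kept.
        while self.buffer:
            var, value = self.buffer.popleft()
            self.memory[var] = value


def run_store_buffer_example():
    memory = {"x1": 0, "x2": 0}
    p1, p2 = Processor(memory), Processor(memory)

    p1.write("x1", 1)           # P1: x1 = 1 (buffered)
    p2.write("x2", 1)           # P2: x2 = 1 (buffered)
    seen_by_p1 = p1.read("x2")  # P1 reads x2 before the write of P2 is visible
    seen_by_p2 = p2.read("x1")  # P2 reads x1 before the write of P1 is visible
    p1.flush()
    p2.flush()
    return seen_by_p1, seen_by_p2, dict(memory)
```

In this interleaving both reads return the old value 0, although both writes eventually reach the memory. Under sequential consistency this outcome would not be possible, since at least one of the writes would have to be completed before the later of the two reads.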
Partial store ordering (PSO) models relax both the W → W and the W → R
ordering required for seque...
... description of XY routing for two-dimensional meshes and E-cube
routing for hypercubes as typical examples for dimension-order routing algorithms.
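Both algorithms can be sketched in a few lines. The following functions are a minimal illustration, not a router implementation: xy_route assumes that mesh nodes are addressed by integer (x, y) coordinates and that a message first moves in the x-dimension and then in the y-dimension; ecube_route assumes that hypercube nodes are addressed by integers whose bits are the coordinates and that differing bits are corrected starting from the lowest bit position. The function names and the chosen bit order are assumptions of this sketch.

```python
def xy_route(src, dst):
    """Return the mesh nodes visited on the way from src to dst (inclusive)."""
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    # Phase 1: move along the x-dimension until the x-coordinate matches.
    step = 1 if dst_x > x else -1
    while x != dst_x:
        x += step
        path.append((x, y))
    # Phase 2: move along the y-dimension until the destination is reached.
    step = 1 if dst_y > y else -1
    while y != dst_y:
        y += step
        path.append((x, y))
    return path


def ecube_route(src, dst, dims):
    """Return the hypercube nodes visited on the way from src to dst."""
    node = src
    path = [node]
    for bit in range(dims):  # lowest dimension first (assumed order)
        if (node ^ dst) & (1 << bit):
            node ^= 1 << bit  # traverse the link of this dimension
            path.append(node)
    return path
```

For example, xy_route((0, 0), (2, 1)) visits (0, 0), (1, 0), (2, 0), (2, 1), and ecube_route(0b010, 0b111, 3) visits the nodes 010, 011, 111. In both cases the dimensions are traversed in a fixed order, which is the defining property of dimension-order routing.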
XY Routing for Two-Dimensional Meshes
For a two-dimensional ... $\{n_1, \ldots, n_k\}$ exists such that for $1 \le i < k$ each
message $N_i$ uses a link $n_i$ for transmission and waits for the release of link
$n_{i+1}$ which is currently used f...
... matrices [127].
The GA approach is provided as a library with interfaces for C, C++, and Fortran
for different parallel platforms. The GA approach is based on a global address space
in which global ... synchronization are performed by the runtime
system, and no low-level lock synchronization needs to be performed.
Chapel has been developed by Cray Inc. as a new parallel language for...
... team. This fork
operation is performed implicitly. The program code inside the parallel construct
is called a parallel region and is executed in parallel by all threads of the team.
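The fork-join behavior of a parallel region can be mimicked with ordinary threads. The following sketch is only an analogy, assuming that the passage describes an OpenMP-style parallel construct; it is written in Python and does not use any OpenMP interface. The helper parallel_region and its parameters are hypothetical names introduced for this illustration.

```python
import threading


def parallel_region(num_threads, body):
    """Run body(thread_id, num_threads) once per thread of a team."""
    results = [None] * num_threads

    def run(thread_id):
        # Every thread of the team executes the same region code.
        results[thread_id] = body(thread_id, num_threads)

    # Fork: the calling thread becomes thread 0, the others are created.
    workers = [threading.Thread(target=run, args=(tid,))
               for tid in range(1, num_threads)]
    for worker in workers:
        worker.start()
    run(0)
    # Join: the calling thread continues only after all threads are done.
    for worker in workers:
        worker.join()
    return results
```

A call such as parallel_region(4, lambda tid, n: tid * tid) lets four threads execute the same region code, each with its own thread number, and returns after all of them have finished.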
The parallel ... sophisticated implementations
may queue commands for execution by one of a set of threads. For multicore
processors, several threads are typically available for the execution of...
... coefficient matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ is
symmetric and positive definite, i.e., if $a_{ij} = a_{ji}$ and $x^T A x > 0$ for all
$x \in \mathbb{R}^n$ with $x \neq 0$. For a symmetric and positive definite
$n \times n$ matrix $A \in \mathbb{R}^{n \times n}$
there ... Factorization for Sparse Matrices 431
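The definition can be checked numerically for small matrices. The following sketch uses the standard fact that a symmetric matrix is positive definite exactly if the Cholesky factorization $A = L L^T$ with a lower triangular matrix $L$ and positive diagonal entries can be computed. It works on dense lists of lists and ignores sparsity; the function names and the tolerance tol are assumptions of this illustration.

```python
import math


def is_symmetric(a, tol=1e-12):
    """Check a_ij = a_ji for all entries below the diagonal."""
    n = len(a)
    return all(abs(a[i][j] - a[j][i]) <= tol
               for i in range(n) for j in range(i))


def cholesky(a):
    """Return lower triangular L with A = L L^T; raise ValueError otherwise."""
    if not is_symmetric(a):
        raise ValueError("matrix is not symmetric")
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry of column j.
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l
```

For the matrix with rows (4, 2) and (2, 3) the factor $L$ has the rows (2, 0) and (1, $\sqrt{2}$). For the symmetric matrix with rows (1, 2) and (2, 1) the computation stops with an error, because this matrix is not positive definite.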
identify its original position in the full matrix. Thus, a compressed storage scheme
for sparse matrices needs the space for the n...
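A common compressed scheme can be sketched as follows. The sketch assumes the compressed row storage format (often called CRS or CSR), which may differ from the scheme described in this section: the nonzero values are stored row by row, each together with its column index, and a row pointer array marks where each row starts. The function names to_csr and csr_get are hypothetical.

```python
def to_csr(dense):
    """Convert a dense matrix (list of rows) into three CSR arrays."""
    values, col_index, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)     # the nonzero element itself
                col_index.append(j)  # its column in the full matrix
        row_ptr.append(len(values))  # end of this row in values
    return values, col_index, row_ptr


def csr_get(values, col_index, row_ptr, i, j):
    """Return entry (i, j) of the full matrix from its CSR arrays."""
    for k in range(row_ptr[i], row_ptr[i + 1]):
        if col_index[k] == j:
            return values[k]
    return 0
```

For the 3 × 3 matrix with rows (4, 0, 0), (0, 3, 1), (0, 1, 2), to_csr returns the values [4, 3, 1, 1, 2], the column indices [0, 1, 2, 1, 2], and the row pointers [0, 1, 3, 5]. The column indices and row pointers together are the additional information that identifies the original position of each stored element.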