2026 Commits

Author SHA1 Message Date
Jakub Jelinek c891d8dc23 Update ChangeLog and version files for release 2023-07-27 08:13:36 +00:00
GCC Administrator e0f07bc97d Daily bump. 2023-06-29 00:21:39 +00:00
Thomas Schwinge 09124b7ed7 Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]
Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
>     $ uname -srvi
>     Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
>     $ grep '^model name' < /proc/cpuinfo | uniq -c
>          12 model name      : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
>     $ nvidia-smi -L
>     GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

>     $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
>     1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
>     1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
>     -j12 GCC_TEST_PARALLEL_SLOTS=12
>     2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
>     2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

    2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata 505216maxresident)k
    2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata 505216maxresident)k

	PR testsuite/66005
	gcc/
	* doc/install.texi: Document (optional) Perl usage for parallel
	testing of libgomp.
	libgomp/
	* testsuite/lib/libgomp.exp: 'flock' through stdout.
	* testsuite/flock: New.
	* configure.ac (FLOCK): Point to that if no 'flock' available, but
	'perl' is.
	* configure: Regenerate.

(cherry picked from commit 04abe1944d30eb18a2060cfcd9695d085f7b4752)
2023-06-28 13:38:04 +02:00
Thomas Schwinge 3840d5ccf7 Support parallel testing in libgomp, part II [PR66005]
..., and enable if 'flock' is available for serializing execution testing.

Regarding the default of 19 parallel slots, this turned out to be a local
minimum for wall time when testing this on:

    $ uname -srvi
    Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016 x86_64
    $ grep '^model name' < /proc/cpuinfo | uniq -c
         32 model name      : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

... in two configurations: case (a) standard configuration, no offloading
configured, case (b) offloading for GCN and nvptx configured but no devices
available.  For both cases, default plus '-m32' variant.

    $ \time make check-target-libgomp RUNTESTFLAGS="--target_board=unix\{,-m32\}"

Case (a), baseline:

    6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata 505044maxresident)k
    6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata 505172maxresident)k

This is what people have been complaining about, rightly so, in
<https://gcc.gnu.org/PR66005> "libgomp make check time is excessive" and
elsewhere.

Case (a), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=10
    3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata 505188maxresident)k
    -j15 GCC_TEST_PARALLEL_SLOTS=15
    3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata 505360maxresident)k
    -j17 GCC_TEST_PARALLEL_SLOTS=17
    3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata 505112maxresident)k
    -j18 GCC_TEST_PARALLEL_SLOTS=18
    3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata 505360maxresident)k
    -j19 GCC_TEST_PARALLEL_SLOTS=19
    3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata 505128maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20
    3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata 505100maxresident)k
    -j23 GCC_TEST_PARALLEL_SLOTS=23
    4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata 505200maxresident)k
    -j26 GCC_TEST_PARALLEL_SLOTS=26
    3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata 505160maxresident)k
    -j32 GCC_TEST_PARALLEL_SLOTS=32
    4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata 505160maxresident)k

Yay!
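
To make the "local minimum" claim easy to check, here is a small sketch that
converts the elapsed fields listed above into seconds and picks the fastest
run. The helper names (elapsed_to_seconds, best_slots) are hypothetical and
only illustrate the arithmetic; the numbers are copied from the case (a) runs.

```python
# Illustrative only: elapsed values copied from the case (a) runs above.
# The function and variable names are made up for this sketch.

def elapsed_to_seconds(elapsed: str) -> float:
    """Convert GNU time's elapsed field ('m:ss.ss' or 'h:mm:ss') to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

BASELINE = "47:32.28"  # first baseline run of case (a)

# GCC_TEST_PARALLEL_SLOTS value -> elapsed wall time
PARALLEL = {
    10: "6:43.82", 15: "5:56.04", 17: "5:27.86",
    18: "5:14.63", 19: "5:13.22", 20: "5:15.26",
    23: "5:29.12", 26: "5:24.70", 32: "5:16.11",
}

def best_slots(runs):
    """Return (slots, seconds) for the run with the smallest wall time."""
    slots = min(runs, key=lambda s: elapsed_to_seconds(runs[s]))
    return slots, elapsed_to_seconds(runs[slots])

slots, seconds = best_slots(PARALLEL)
speedup = elapsed_to_seconds(BASELINE) / seconds
print(f"best: {slots} slots, {seconds:.2f} s, speedup {speedup:.1f}x")
```

With these inputs the minimum is at 19 slots, roughly a ninefold reduction in
wall time relative to the baseline.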

Case (b), baseline; 2+ h:

    7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata 994264maxresident)k

Case (b), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=10
    7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata 994344maxresident)k
    -j15 GCC_TEST_PARALLEL_SLOTS=15
    8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata 994228maxresident)k
    -j17 GCC_TEST_PARALLEL_SLOTS=17
    8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata 994176maxresident)k
    -j18 GCC_TEST_PARALLEL_SLOTS=18
    8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata 994248maxresident)k
    -j19 GCC_TEST_PARALLEL_SLOTS=19
    9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata 994260maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20
    9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata 994284maxresident)k
    -j23 GCC_TEST_PARALLEL_SLOTS=23
    10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata 994208maxresident)k
    -j26 GCC_TEST_PARALLEL_SLOTS=26
    11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata 994256maxresident)k
    -j32 GCC_TEST_PARALLEL_SLOTS=32
    11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata 994240maxresident)k

On my Dell Precision 7530 laptop:

    $ uname -srvi
    Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
    $ grep '^model name' < /proc/cpuinfo | uniq -c
         12 model name      : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
    $ nvidia-smi -L
    GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)

... in two configurations: case (c) standard configuration, no offloading
configured, case (d) offloading for nvptx configured and device available.
For both cases, only default variant, no '-m32'.

    $ \time make check-target-libgomp

Case (c), baseline; roughly half of case (a) (just one variant):

    1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
    1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k

Case (c), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=2
    1143.83user 110.76system 10:20.46elapsed 202%CPU (0avgtext+0avgdata 505216maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=6
    1737.08user 143.94system 4:59.48elapsed 628%CPU (0avgtext+0avgdata 505200maxresident)k
    1730.31user 143.02system 4:58.75elapsed 627%CPU (0avgtext+0avgdata 505152maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=8
    2192.63user 169.34system 4:52.96elapsed 806%CPU (0avgtext+0avgdata 505216maxresident)k
    2219.04user 167.67system 4:53.19elapsed 814%CPU (0avgtext+0avgdata 505152maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=10
    2463.93user 184.98system 4:48.39elapsed 918%CPU (0avgtext+0avgdata 505200maxresident)k
    2455.62user 183.68system 4:47.40elapsed 918%CPU (0avgtext+0avgdata 505216maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=12
    2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
    2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe]
    2613.18user 199.51system 4:44.06elapsed 990%CPU (0avgtext+0avgdata 505216maxresident)k

Case (d), baseline (compared to case (b): only nvptx offloading compilation,
but also nvptx offloading execution); ~1 h:

    2841.93user 653.68system 1:02:26elapsed 93%CPU (0avgtext+0avgdata 909792maxresident)k
    2842.03user 654.39system 1:02:24elapsed 93%CPU (0avgtext+0avgdata 909880maxresident)k

Case (d), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=2
    2856.39user 606.87system 33:58.64elapsed 169%CPU (0avgtext+0avgdata 909948maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=6
    3444.90user 666.86system 18:37.57elapsed 367%CPU (0avgtext+0avgdata 909856maxresident)k
    3462.13user 667.13system 18:36.87elapsed 369%CPU (0avgtext+0avgdata 909872maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=8
    3929.74user 716.22system 18:02.36elapsed 429%CPU (0avgtext+0avgdata 909832maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=10
    4152.84user 736.16system 17:43.05elapsed 459%CPU (0avgtext+0avgdata 909872maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=12
    4209.60user 749.00system 17:35.20elapsed 469%CPU (0avgtext+0avgdata 909840maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe]
    4255.54user 756.78system 17:29.06elapsed 477%CPU (0avgtext+0avgdata 909868maxresident)k

Worth noting is that with nvptx offloading, there is one execution test case
that times out ('libgomp.fortran/reverse-offload-5.f90').  This effectively
stalls progress for almost 5 min: other execution test cases quickly queue up
on the lock for all parallel slots.  That's working as expected; just noting
this as it accordingly does skew the wall time numbers.
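
The queuing behavior described above follows from taking an exclusive lock
around each execution.  A minimal sketch of that idea, assuming a POSIX system;
this is not the Tcl code in 'libgomp.exp', and 'run_serialized' and 'lock_path'
are hypothetical names:

```python
import fcntl
import subprocess
import sys
import tempfile

def run_serialized(lock_path, argv):
    """Run argv while holding an exclusive advisory lock on lock_path.

    Other processes calling this with the same lock_path block in flock()
    until the current holder finishes, so executions happen one at a time
    even when many test slots compile in parallel.
    """
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # waits while another holder runs
        try:
            return subprocess.run(argv, capture_output=True, text=True)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)

# Usage: a trivial command stands in for a test executable.
with tempfile.NamedTemporaryFile() as lock:
    result = run_serialized(lock.name, [sys.executable, "-c", "print('ok')"])
```

A long-running or timing-out command holds the lock for its whole duration,
which is why one slow test delays every slot waiting to execute.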

	PR testsuite/66005
	libgomp/
	* configure.ac: Look for 'flock'.
	* testsuite/Makefile.am (gcc_test_parallel_slots): Enable parallel testing.
	* testsuite/config/default.exp: Don't 'load_lib "standard.exp"' here...
	* testsuite/lib/libgomp.exp: ... but here, instead.
	(libgomp_load): Override for parallel testing.
	* testsuite/libgomp-site-extra.exp.in (FLOCK): Set.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* testsuite/Makefile.in: Regenerate.

(cherry picked from commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba)
2023-06-28 13:38:04 +02:00
Rainer Orth 2aa6135efb Support parallel testing in libgomp, part I [PR66005]
..., while still hard-coding the number of parallel slots to one.

	PR testsuite/66005
	libgomp/
	* testsuite/Makefile.am (PWD_COMMAND): New variable.
	(%/site.exp): New target.
	(check_p_numbers0, check_p_numbers1, check_p_numbers2)
	(check_p_numbers3, check_p_numbers4, check_p_numbers5)
	(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
	(check_p_subdirs)
	(check_DEJAGNU_libgomp_targets): New variables.
	($(check_DEJAGNU_libgomp_targets)): New target.
	($(check_DEJAGNU_libgomp_targets)): New dependency.
	(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
	* testsuite/Makefile.in: Regenerate.
	* testsuite/lib/libgomp.exp: For parallel testing,
	'load_file ../libgomp-test-support.exp'.

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
(cherry picked from commit e797db5c744f7b4e110f23a495fca8e6b8aebe83)
2023-06-28 13:38:04 +02:00
Thomas Schwinge 4b9af57eae libgomp C++ testsuite: Use 'lang_include_flags' instead of 'libstdcxx_includes'
With nvptx offloading configured, and supported, and CUDA available:

    $ make check-target-libgomp RUNTESTFLAGS="--all c.exp=context-1.c c++.exp=context-1.c"
    [...]
    Running [...]/libgomp.oacc-c/c.exp ...
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
    PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
    UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
    Running [...]/libgomp.oacc-c++/c++.exp ...
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
    PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
    [...]

..., but for 'c++.exp=context-1.c' alone, we currently get all-UNSUPPORTED:

    $ make check-target-libgomp RUNTESTFLAGS="--all c++.exp=context-1.c"
    [...]
    Running [...]/libgomp.oacc-c++/c++.exp ...
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
    UNSUPPORTED: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/context-1.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2
    [...]

That is, if 'c.exp' executes first, it does successfully evaluate
'dg-require-effective-target openacc_cublas' -- and does cache this result (so
it isn't reevaluated for 'c++.exp').  However, for 'c++.exp' alone (that is,
without the 'c.exp' result cached), we run into:

    spawn -ignore SIGHUP [xgcc] [...] -x c++ openacc_cublas2311907.c [...]
    In file included from /usr/include/cuda_fp16.h:3673,
                     from /usr/include/cublas_api.h:75,
                     from /usr/include/cublas_v2.h:65,
                     from openacc_cublas2311907.c:3:
    /usr/include/cuda_fp16.hpp:67:10: fatal error: utility: No such file or directory

We're missing include paths to C++/libstdc++ build-tree headers.
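
The order dependence can be modeled with a toy cache.  This is only a sketch
under the assumption that the cached result is keyed by the check name and not
by the language; the function names are hypothetical and the probe is a
stand-in for the real compile test:

```python
_cache = {}

def probe_openacc_cublas(language):
    """Stand-in for the compile probe: per the log above, the C probe
    succeeds while the C++ probe fails on missing libstdc++ headers."""
    return language == "c"

def check_effective_target(name, language):
    """Evaluate a check once and reuse the cached result afterwards.

    The cache key omits the language, which is what makes order matter.
    """
    if name not in _cache:
        _cache[name] = probe_openacc_cublas(language)
    return _cache[name]

def run(languages):
    """Simulate one test run visiting the given .exp files in order."""
    _cache.clear()
    return [check_effective_target("openacc_cublas", lang) for lang in languages]

both = run(["c", "c++"])   # C result is cached and reused for C++
alone = run(["c++"])       # C++ probe runs itself and fails
```

Here 'both' reports the check as supported for both languages, while 'alone'
reports it as unsupported, matching the all-UNSUPPORTED output shown above.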

Fix this by using the mechanism introduced for Fortran in
r212268 (commit f707da16f7) re
"libgomp.fortran/fortran.exp - add -fintrinsic-modules-path ${blddir}".

	libgomp/
	* testsuite/libgomp.c++/c++.exp: Use 'lang_include_flags' instead
	of 'libstdcxx_includes'.
	* testsuite/libgomp.oacc-c++/c++.exp: Likewise.

(cherry picked from commit 1b93b9191d073bf9e867ab8bfc8e4b59ba5af1f3)
2023-06-28 13:38:04 +02:00
GCC Administrator c2bfd2c399 Daily bump. 2023-05-17 00:20:59 +00:00
Tobias Burnus 7fb7d49b3c LTO: Fix writing of toplevel asm with offloading [PR109816]
When offloading was enabled, top-level 'asm' statements were added to the
offloading section, confusing assemblers that did not support the syntax.
Additionally, with offloading and -flto, the top-level assembler code did not
end up in the host files.

As r14-321-g9a41d2cdbcd added top-level 'asm' to one libstdc++ header file,
the issue became more apparent, causing fails with nvptx for some
C++ testcases.

	PR libstdc++/109816

gcc/ChangeLog:

	* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by
	'!lto_stream_offload_p'.

libgomp/ChangeLog:

	* testsuite/libgomp.c++/target-map-class-1.C: New test.
	* testsuite/libgomp.c++/target-map-class-2.C: New test.

(cherry picked from commit a835f046cdf017b9e8ad5576df4f10daaf8420d0)
2023-05-16 08:51:14 +02:00
GCC Administrator 843854acb0 Daily bump. 2023-05-06 00:20:44 +00:00
Julian Brown a4cc474b15 OpenACC: Further attach/detach clause fixes for Fortran [PR109622]
This patch moves several tests introduced by the following patch:

  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616939.html
  commit r14-325-gcacf65d74463600815773255e8b82b4043432bd7

into the proper location for OpenACC testing (thanks to Thomas for
spotting my mistake!), and also fixes a few additional problems --
missing diagnostics for non-pointer attaches, and a case where a pointer
was incorrectly dereferenced. Tests are also adjusted for vector-length
warnings on nvidia accelerators.

2023-04-29  Julian Brown  <julian@codesourcery.com>

	PR fortran/109622

gcc/fortran/
	* openmp.cc (resolve_omp_clauses): Add diagnostic for
	non-pointer/non-allocatable attach/detach.
	* trans-openmp.cc (gfc_trans_omp_clauses): Remove dereference for
	pointer-to-scalar derived type component attach/detach.  Fix
	attach/detach handling for descriptors.

gcc/testsuite/
	* gfortran.dg/goacc/pr109622-5.f90: New test.
	* gfortran.dg/goacc/pr109622-6.f90: New test.

libgomp/
	* testsuite/libgomp.fortran/pr109622.f90: Move test...
	* testsuite/libgomp.oacc-fortran/pr109622.f90: ...to here. Ignore
	vector length warning.
	* testsuite/libgomp.fortran/pr109622-2.f90: Move test...
	* testsuite/libgomp.oacc-fortran/pr109622-2.f90: ...to here.  Add
	missing copyin/copyout variable. Ignore vector length warnings.
	* testsuite/libgomp.fortran/pr109622-3.f90: Move test...
	* testsuite/libgomp.oacc-fortran/pr109622-3.f90: ...to here.  Ignore
	vector length warnings.
	* testsuite/libgomp.oacc-fortran/pr109622-4.f90: New test.

(cherry picked from commit 0a26a42b237bada32165e61867a2bf4461c5fab2)
2023-05-05 13:14:24 +00:00
Julian Brown fa7c4ab365 OpenACC: Stand-alone attach/detach clause fixes for Fortran [PR109622]
This patch fixes several cases where multiple attach or detach mapping
nodes were being created for stand-alone attach or detach clauses
in Fortran.  After the introduction of stricter checking later during
compilation, these extra nodes could cause ICEs, as seen in the PR.

The patch also fixes cases that "happened to work" previously where
the user attaches/detaches a pointer to array using a descriptor, and
(I think!) the "_data" field has offset zero, hence the same address as
the descriptor as a whole.

2023-04-27  Julian Brown  <julian@codesourcery.com>

	PR fortran/109622

gcc/fortran/
	* trans-openmp.cc (gfc_trans_omp_clauses): Attach/detach clause fixes.

gcc/testsuite/
	* gfortran.dg/goacc/attach-descriptor.f90: Adjust expected output.

libgomp/
	* testsuite/libgomp.fortran/pr109622.f90: New test.
	* testsuite/libgomp.fortran/pr109622-2.f90: New test.
	* testsuite/libgomp.fortran/pr109622-3.f90: New test.

(cherry picked from commit cacf65d74463600815773255e8b82b4043432bd7)
2023-05-05 13:14:23 +00:00
Jakub Jelinek cc035c5d86 Update ChangeLog and version files for release 2023-04-26 07:10:03 +00:00
GCC Administrator 579cdc1e44 Daily bump. 2023-03-29 00:17:01 +00:00
Rainer Orth 8443f42f05 testsuite: Fix weak_undefined handling on Darwin
The patch that introduced the weak_undefined effective-target keyword
and corresponding dg-add-options support

commit 378ec7b87a
Author: Alexandre Oliva <oliva@adacore.com>
Date:   Thu Mar 23 00:45:05 2023 -0300

    [testsuite] test for weak_undefined support and add options

badly broke the affected tests on macOS like so:

ERROR: gcc.dg/addr_equal-1.c: unknown dg option: 89 for " dg-add-options 5 weak_undefined "
ERROR: gcc.dg/addr_equal-1.c: unknown dg option: 89 for " dg-add-options 5 weak_undefined "

add_options_for_weak_undefined tries to call a non-existent proc "89".
Even after fixing this by escaping the brackets, two tests still failed to
link since they lacked the corresponding calls to dg-add-options
weak_undefined.

Tested on x86_64-apple-darwin20.6.0 and i386-pc-solaris2.11.

2023-03-27  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	* lib/target-supports.exp (add_options_for_weak_undefined): Escape
	brackets.
	* gcc.dg/visibility-22.c: Add weak_undefined options.

	libgomp:
	* testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c: Add
	weak_undefined options.
2023-03-28 10:40:05 +02:00
GCC Administrator 13ec81eb4c Daily bump. 2023-03-25 00:16:51 +00:00
Tobias Burnus 243fa4883c libgomp.texi: Fix wording in GCN offload specifics
libgomp/
	* libgomp.texi (Offload-Target Specifics): Grammar fix.
2023-03-24 17:36:22 +01:00
Thomas Schwinge e8fec6998b Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate [PR104949]
Follow-up to commit 49d1a2f913
"OpenMP: Handle descriptors in target's firstprivate [PR104949]".

	PR fortran/104949
	libgomp/
	* target.c (gomp_map_vars_internal) <GOMP_MAP_FIRSTPRIVATE>: Add
	caveat/safeguard.
2023-03-24 17:14:54 +01:00
GCC Administrator c80654412b Daily bump. 2023-03-11 00:16:36 +00:00
Thomas Schwinge f8332e52a4 Use 'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]
Thereby considerably simplify the device plugins' 'GOMP_OFFLOAD_openacc_exec',
'GOMP_OFFLOAD_openacc_async_exec' functions: in terms of lines of code, but in
particular conceptually: no more device memory allocation, host to device data
copying, device memory deallocation -- 'GOMP_MAP_VARS_TARGET' does all that for
us.

This depends on commit 2b2340e236
"Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data",
where I said that "a use will emerge later", which is this one here.

	PR libgomp/90596
	libgomp/
	* target.c (gomp_map_vars_internal): Allow for
	'param_kind == GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_TARGET'.
	* oacc-parallel.c (GOACC_parallel_keyed): Pass
	'GOMP_MAP_VARS_TARGET' to 'goacc_map_vars'.
	* plugin/plugin-gcn.c (alloc_by_agent, gcn_exec)
	(GOMP_OFFLOAD_openacc_exec, GOMP_OFFLOAD_openacc_async_exec):
	Adjust, simplify.
	(gomp_offload_free): Remove.
	* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
	(GOMP_OFFLOAD_openacc_async_exec): Adjust, simplify.
	(cuda_free_argmem): Remove.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
	Adjust.
2023-03-10 18:05:27 +01:00
Thomas Schwinge 2b2340e236 Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data
This does *allow*, but under no circumstances is this currently going to be
used: all potentially applicable data is non-'ephemeral', and thus not
considered for 'gomp_coalesce_buf_add' for OpenACC 'async'.  (But a use will
emerge later.)

Follow-up to commit r12-2530-gd88a6951586c7229b25708f4486eaaf4bf4b5bbe
"Don't use libgomp 'cbuf' buffering with OpenACC 'async'", addressing this
TODO comment:

    TODO ... but we could allow CBUF usage for EPHEMERAL data?  (Open question:
    is it more performant to use libgomp CBUF buffering or individual device
    asyncronous copying?)

Ephemeral data is small, and therefore individual device asynchronous copying
does seem dubious -- in particular given that for all those, we'd individually
have to allocate and queue for deallocation a temporary buffer to capture the
ephemeral data.  Instead, just let the 'cbuf' *be* the temporary buffer.

	libgomp/
	* target.c (gomp_copy_host2dev, gomp_map_vars_internal): Allow
	libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral'
	data.
2023-03-10 16:19:53 +01:00
Thomas Schwinge 199867d07b Simplify OpenACC 'no_create' clause implementation
For 'OFFSET_INLINED', 'gomp_map_val' does the right thing, and we may then
simplify the device plugins accordingly.

This is a follow-up to
Subversion r279551 (Git commit a6163563f2)
"Add OpenACC 2.6's no_create",
Subversion r279622 (Git commit 5bcd470bf0)
"Use gomp_map_val for OpenACC host-to-device address translation".

	libgomp/
	* target.c (gomp_map_vars_internal): Use 'OFFSET_INLINED' for
	'GOMP_MAP_IF_PRESENT'.
	* plugin/plugin-gcn.c (gcn_exec, GOMP_OFFLOAD_openacc_exec)
	(GOMP_OFFLOAD_openacc_async_exec): Adjust.
	* plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec)
	(GOMP_OFFLOAD_openacc_async_exec): Likewise.
	* testsuite/libgomp.oacc-c-c++-common/no_create-1.c: Add 'async'
	testing.
	* testsuite/libgomp.oacc-c-c++-common/no_create-2.c: Likewise.
2023-03-10 15:48:43 +01:00
Thomas Schwinge b5037d4a07 OpenACC: Remove 'acc_async_test' -> skip shortcut in 'libgomp/oacc-async.c:goacc_wait'
We're not taking such a shortcut anywhere else, and (with future changes) it
has potential to confuse things if synchronization in a libgomp plugin happens
to have side effects even if an async queue currently is empty.

	libgomp/
	* oacc-async.c (goacc_wait): Remove 'acc_async_test' -> skip
	shortcut.
2023-03-10 15:37:47 +01:00
Thomas Schwinge 442d51a20e Document/verify another aspect of OpenACC 'async' semantics in 'libgomp.oacc-c-c++-common/data-3.c'
... that I almost broke with later implementation changes.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/data-3.c: Document/verify
	another aspect of OpenACC 'async' semantics.
2023-03-10 15:18:53 +01:00
Thomas Schwinge 649f1939ba Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' position
For an OpenACC compute construct, we've currently got:

  - [...]
  - acc_ev_enqueue_launch_start
  - launch kernel
  - free memory
  - acc_ev_free
  - acc_ev_enqueue_launch_end

This confused another thing that I'm working on, so I adjusted that to:

  - [...]
  - acc_ev_enqueue_launch_start
  - launch kernel
  - acc_ev_enqueue_launch_end
  - free memory
  - acc_ev_free

Correspondingly, verify 'acc_ev_alloc', 'acc_ev_free' in
'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'.

	libgomp/
	* plugin/plugin-gcn.c (gcn_exec): Fix 'acc_ev_enqueue_launch_end'
	position.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
	Verify 'acc_ev_alloc', 'acc_ev_free'.
2023-03-10 15:05:01 +01:00
GCC Administrator da2b9c6e31 Daily bump. 2023-03-10 00:17:15 +00:00
Hongyu Wang 288bc7b5d1 libgomp: Fix default value of GOMP_SPINCOUNT [PR 109062]
When OMP_WAIT_POLICY is not specified, the current implementation leaves the
ICV flag GOMP_ICV_WAIT_POLICY unset, so the global variable wait_policy
keeps its uninitialized value. Initialize it to -1 to make
GOMP_SPINCOUNT behavior consistent with its description.
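
A rough model of why the sentinel matters.  The spin-count constants follow the
GOMP_SPINCOUNT description in the libgomp manual as I read it, and the sketch
assumes the global has static storage, so "uninitialized" means zero; treat
both as assumptions, and the names as hypothetical:

```python
# Tri-state wait policy: -1 means OMP_WAIT_POLICY was not specified.
UNSET, PASSIVE, ACTIVE = -1, 0, 1

def default_spin_count(wait_policy):
    """Default used when GOMP_SPINCOUNT itself is not set (assumed values)."""
    if wait_policy == ACTIVE:
        return 30_000_000_000
    if wait_policy == PASSIVE:
        return 0
    return 300_000  # OMP_WAIT_POLICY not specified

before_fix = default_spin_count(0)       # zero-initialized global reads as PASSIVE
after_fix = default_spin_count(UNSET)    # explicit "unset" sentinel
```

Under these assumptions, a zero-initialized policy is indistinguishable from
PASSIVE and yields no spinning, whereas -1 selects the documented default for
the unspecified case.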

libgomp/ChangeLog:

	PR libgomp/109062
	* env.c (wait_policy): Initialize to -1.
	(initialize_icvs): Initialize icvs->wait_policy to -1.
	* testsuite/libgomp.c-c++-common/pr109062.c: New test.
2023-03-09 09:01:13 +08:00
GCC Administrator 6a87fdd3ed Daily bump. 2023-03-09 00:17:00 +00:00
Tobias Burnus 2e3dd14dd2 libgomp.texi: Mention GCN_STACK_SIZE in Offload-Target Specifics
libgomp/ChangeLog:

	* libgomp.texi (Offload-Target Specifics): Mention GCN_STACK_SIZE.
2023-03-08 14:55:49 +01:00
GCC Administrator 14db9ed505 Daily bump. 2023-03-03 00:16:38 +00:00
Kwok Cheung Yeung ce9cd7258d amdgcn: Enable SIMD vectorization of math functions
Calls to vectorized versions of routines in the math library will now
be inserted when vectorizing code containing supported math functions.

2023-03-02  Kwok Cheung Yeung  <kcy@codesourcery.com>
	    Paul-Antoine Arras  <pa@codesourcery.com>

	gcc/
	* builtins.cc (mathfn_built_in_explicit): New.
	* config/gcn/gcn.cc: Include case-cfn-macros.h.
	(mathfn_built_in_explicit): Add prototype.
	(gcn_vectorize_builtin_vectorized_function): New.
	(gcn_libc_has_function): New.
	(TARGET_LIBC_HAS_FUNCTION): Define.
	(TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Define.

	gcc/testsuite/
	* gcc.target/gcn/simd-math-1.c: New testcase.
	* gcc.target/gcn/simd-math-2.c: New testcase.

	libgomp/
	* testsuite/libgomp.c/simd-math-1.c: New testcase.
2023-03-02 20:56:53 +00:00
GCC Administrator c88a7c6348 Daily bump. 2023-03-02 00:17:28 +00:00
Tobias Burnus 96ff97ff65 OpenMP/Fortran: Fix handling of optional is_device_ptr + bind(C) [PR108546]
For is_device_ptr, optional checks should only be done before calling
libgomp; afterwards the pointers are NULL either because the argument is absent
or, by chance, because it is unallocated or unassociated (for
pointers/allocatables).

Additionally, it fixes an issue with explicit mapping for 'type(c_ptr)'.

	PR middle-end/108546

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_trans_omp_clauses): Fix mapping of
	type(C_ptr) variables.

gcc/ChangeLog:

	* omp-low.cc (lower_omp_target): Remove optional handling
	on the receiver side, i.e. inside target (data), for
	use_device_ptr.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/is_device_ptr-3.f90: New test.
	* testsuite/libgomp.fortran/use_device_ptr-optional-4.f90: New test.
2023-03-01 13:53:09 +01:00
GCC Administrator b6f98991b1 Daily bump. 2023-02-23 00:17:57 +00:00
Thomas Schwinge 320dc51c2d Add '-Wno-complain-wrong-lang', and use it in 'gcc/testsuite/lib/target-supports.exp:check_compile' and elsewhere
I noticed that GCC/Rust recently lost all LTO variants in torture testing:

     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O0  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O1  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2  (test for excess errors)
    -PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
    -PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O3 -g  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -Os  (test for excess errors)

Etc.

The reason is that when probing for availability of LTO, we run into:

    spawn [...]/build-gcc/gcc/testsuite/rust/../../gccrs -B[...]/build-gcc/gcc/testsuite/rust/../../ -fdiagnostics-plain-output -frust-incomplete-and-experimental-compiler-do-not-use -flto -c -o lto8274.o lto8274.c
    cc1: warning: command-line option '-frust-incomplete-and-experimental-compiler-do-not-use' is valid for Rust but not for C

For GCC/Rust testing, this flag is (as of recently) defaulted in
'gcc/testsuite/lib/rust.exp:rust_init':

    lappend ALWAYS_RUSTFLAGS "additional_flags=-frust-incomplete-and-experimental-compiler-do-not-use"

A few more "command-line option [...] is valid for [...] but not for [...]"
instances were found in the test suite logs, when more than one language is
involved.

With '-Wno-complain-wrong-lang' used in
'gcc/testsuite/lib/target-supports.exp:check_compile', we get back:

     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O0  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O1  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2  (test for excess errors)
    +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
    +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O3 -g  (test for excess errors)
     PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -Os  (test for excess errors)

Etc., and in total:

                    === rust Summary for unix ===

    # of expected passes            [-4990-]{+6718+}
    # of expected failures          [-39-]{+51+}

Anything that 'gcc/opts-global.cc:complain_wrong_lang' might do is cut
short by '-Wno-complain-wrong-lang', not just the one 'warning'
diagnostic.  This corresponds to what already exists via
'lang_hooks.complain_wrong_lang_p'.

The 'gcc/opts-common.cc:prune_options' changes follow the same rationale
as PR67640 "driver passes -fdiagnostics-color= always last": we need to
process '-Wno-complain-wrong-lang' early, so that it properly affects
other options appearing before it on the command line.

	gcc/
	* common.opt (-Wcomplain-wrong-lang): New.
	* doc/invoke.texi (-Wno-complain-wrong-lang): Document it.
	* opts-common.cc (prune_options): Handle it.
	* opts-global.cc (complain_wrong_lang): Use it.
	gcc/testsuite/
	* gcc.dg/Wcomplain-wrong-lang-1.c: New.
	* gcc.dg/Wcomplain-wrong-lang-2.c: Likewise.
	* gcc.dg/Wcomplain-wrong-lang-3.c: Likewise.
	* gcc.dg/Wcomplain-wrong-lang-4.c: Likewise.
	* gcc.dg/Wcomplain-wrong-lang-5.c: Likewise.
	* lib/target-supports.exp (check_compile): Use
	'-Wno-complain-wrong-lang'.
	* g++.dg/abi/empty12.C: Likewise.
	* g++.dg/abi/empty13.C: Likewise.
	* g++.dg/abi/empty14.C: Likewise.
	* g++.dg/abi/empty15.C: Likewise.
	* g++.dg/abi/empty16.C: Likewise.
	* g++.dg/abi/empty17.C: Likewise.
	* g++.dg/abi/empty18.C: Likewise.
	* g++.dg/abi/empty19.C: Likewise.
	* g++.dg/abi/empty22.C: Likewise.
	* g++.dg/abi/empty25.C: Likewise.
	* g++.dg/abi/empty26.C: Likewise.
	* gfortran.dg/bind-c-contiguous-1.f90: Likewise.
	* gfortran.dg/bind-c-contiguous-4.f90: Likewise.
	* gfortran.dg/bind-c-contiguous-5.f90: Likewise.
	libgomp/
	* testsuite/libgomp.fortran/alloc-10.f90: Use
	'-Wno-complain-wrong-lang'.
	* testsuite/libgomp.fortran/alloc-11.f90: Likewise.
	* testsuite/libgomp.fortran/alloc-7.f90: Likewise.
	* testsuite/libgomp.fortran/alloc-9.f90: Likewise.
	* testsuite/libgomp.fortran/allocate-1.f90: Likewise.
	* testsuite/libgomp.fortran/depend-4.f90: Likewise.
	* testsuite/libgomp.fortran/depend-5.f90: Likewise.
	* testsuite/libgomp.fortran/depend-6.f90: Likewise.
	* testsuite/libgomp.fortran/depend-7.f90: Likewise.
	* testsuite/libgomp.fortran/depend-inoutset-1.f90: Likewise.
	* testsuite/libgomp.fortran/examples-4/declare_target-1.f90:
	Likewise.
	* testsuite/libgomp.fortran/examples-4/declare_target-2.f90:
	Likewise.
	* testsuite/libgomp.fortran/order-reproducible-1.f90: Likewise.
	* testsuite/libgomp.fortran/order-reproducible-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise.
	* testsuite/libgomp.fortran/task-detach-6.f90: Remove left-over
	'dg-prune-output'.
2023-02-22 09:19:51 +01:00
GCC Administrator 88cc449525 Daily bump. 2023-02-17 00:17:49 +00:00
Jakub Jelinek 0b9bd33d69 libgomp: Fix up some typos in libgomp.texi
I decided to check for a repeated 'the the' in libgomp and noticed
several occurrences of the typo 'theads' (rather than 'threads')
in libgomp.texi.

2023-02-16  Jakub Jelinek  <jakub@redhat.com>

	* libgomp.texi: Fix typos - theads -> threads.
2023-02-16 12:15:03 +01:00
Jakub Jelinek 9d71955f38 libgomp: Fix comment typo
I saw
FAIL: libgomp.fortran/target-nowait-array-section.f90   -O  execution test
in my last x86_64-linux bootstrap.  From a quick skim, it might just be an
unreliable test, which assumes that asynchronous execution wouldn't produce
an ordered sequence; but can't that happen even with asynchronous execution?

That said, while skimming the test, I've noticed a comment typo and
this patch fixes that up.

2023-02-16  Jakub Jelinek  <jakub@redhat.com>

	* testsuite/libgomp.fortran/target-nowait-array-section.f90: Fix
	comment typo and improve its wording.
2023-02-16 12:10:19 +01:00
GCC Administrator 29a3539193 Daily bump. 2023-02-16 00:18:19 +00:00
Tobias Burnus edaf1d6078 libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET
libgomp/
	* target.c (gomp_target_rev): Dereference ptr
	to get device address.
	* testsuite/libgomp.fortran/reverse-offload-5.f90: Add test
	for unallocated allocatable.
2023-02-15 11:21:11 +01:00
Tobias Burnus c7a9655be6 libgomp: Fix 'target enter data' with always pointer
As GOMP_MAP_ALWAYS_POINTER operates on the previous map item, ensure that
with 'target enter data' both are passed together to gomp_map_vars_internal.
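
The pairing rule described above can be sketched as follows.  This is an
illustrative toy, not the code in target.c: the 'toy_*' names and the
two-value enum are hypothetical, and the real map kinds live in
gomp-constants.h.

```c
#include <assert.h>
#include <stddef.h>

/* Toy map kinds, for illustration only.  */
enum toy_kind { TOY_MAP_TO, TOY_MAP_ALWAYS_POINTER };

/* Split MAPNUM items into groups that would be mapped in one call.
   An item followed by an always-pointer item forms a group of two,
   because the always-pointer item operates on its predecessor.
   Stores each group's size in GROUP_SIZES and returns the group count.  */
static size_t
toy_group_items (const enum toy_kind *kinds, size_t mapnum,
                 size_t *group_sizes)
{
  size_t ngroups = 0;
  assert (mapnum == 0 || (kinds != NULL && group_sizes != NULL));
  for (size_t i = 0; i < mapnum; )
    {
      size_t count = 1;
      if (i + 1 < mapnum && kinds[i + 1] == TOY_MAP_ALWAYS_POINTER)
        count = 2;
      group_sizes[ngroups++] = count;
      i += count;
    }
  return ngroups;
}
```

For the kinds { TO, ALWAYS_POINTER, TO } this yields two groups, of sizes 2
and 1, so the always-pointer item is never handed over without the item it
refers to.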

libgomp/ChangeLog:

	* target.c (gomp_map_vars_internal): Add 'i > 0' before doing a
	kind check.
	(GOMP_target_enter_exit_data): If the next map item is
	GOMP_MAP_ALWAYS_POINTER map it together with the current item.
	* testsuite/libgomp.fortran/target-enter-data-3.f90: New test.
2023-02-15 11:18:31 +01:00
GCC Administrator e92e2c9671 Daily bump. 2023-02-10 00:17:42 +00:00
Tobias Burnus ac2949574d OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
This patch ensures that loop bounds depending on outer loop vars use the
proper TREE_VEC format. It additionally gives a sorry if such an outer
var has a non-one/non-minus-one increment as currently a count variable
is used in this case (see PR).

Finally, it avoids 'count' and just uses a local loop variable if the
step increment is +/-1.
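
As background for the 'count' variable mentioned above, here is a small
sketch in C of how a zero-based counter relates to the loop variable of a
Fortran-style DO loop, and of a triangular (non-rectangular) nest whose inner
bound depends on the outer variable.  The 'toy_*' helpers are hypothetical and
only model the arithmetic; they are not what trans-openmp.cc generates.

```c
#include <assert.h>

/* Trip count of 'do i = lb, ub, step' (inclusive upper bound), following
   the Fortran rule MAX (INT ((ub - lb + step) / step), 0).  */
static long
toy_trip_count (long lb, long ub, long step)
{
  assert (step != 0);
  long n = (ub - lb + step) / step;
  return n > 0 ? n : 0;
}

/* Value of the loop variable for a given zero-based counter.  With
   step == +1 or -1 the loop variable can be iterated directly, so no
   separate counter is needed.  */
static long
toy_loop_value (long lb, long step, long count)
{
  return lb + count * step;
}

/* Total iterations of the non-rectangular nest
     do i = 1, n
       do j = 1, i
   where the inner upper bound depends on the outer loop variable.  */
static long
toy_nonrect_total (long n)
{
  long total = 0;
  for (long i = 1; i <= n; i++)
    total += toy_trip_count (1, i, 1);
  return total;
}
```

For example, 'do i = 1, 10, 3' visits 1, 4, 7, 10, so the counter runs from 0
to 3 and 'toy_loop_value (1, 3, 3)' gives 10.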

	PR fortran/107424

gcc/fortran/ChangeLog:

	* trans-openmp.cc (struct dovar_init_d): Add 'sym' and
	'non_unit_incr' members.
	(gfc_nonrect_loop_expr): New.
	(gfc_trans_omp_do): Call it; use normal loop bounds
	for unit stride - and only create local loop var.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-3.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-4.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-5.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/privatization-1-compute-loop.f90: Update dg-note.
	* gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise.
2023-02-09 15:51:13 +01:00
GCC Administrator 8f3b85efbf Daily bump. 2023-02-08 00:17:03 +00:00
Thomas Schwinge 7ab75a6e6d Fix 'libgomp.fortran/reverse-offload-6.f90' nvptx offloading compilation
Fix-up for recent commit 0b1ce70a81
"libgomp: Fix reverse offload issues".

	libgomp/
	* testsuite/libgomp.fortran/reverse-offload-6.f90: Fix nvptx
	offloading compilation.
2023-02-07 23:44:33 +01:00
GCC Administrator 49e52115b0 Daily bump. 2023-02-04 00:16:24 +00:00
Tobias Burnus 0b1ce70a81 libgomp: Fix reverse offload issues
If there is nothing to map, skip the mapping and avoid attempting to
copy 0 bytes from addrs, sizes and kinds.

Additionally, an address that had not been allocated, such as a pointer set,
could be deallocated, leading to a free of the actual data.
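
A minimal sketch of the "nothing to map" guard, assuming a hypothetical
helper 'toy_copy_array' (the actual change is in 'gomp_target_rev', which
this does not reproduce): when the item count is zero, no buffer is allocated
and no zero-byte copy is attempted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy MAPNUM elements of ELEM_SIZE bytes from SRC into a fresh buffer.
   Returns NULL when there is nothing to map, or when allocation fails.
   The caller releases the result with free, which accepts NULL.  */
static void *
toy_copy_array (const void *src, size_t mapnum, size_t elem_size)
{
  if (mapnum == 0)
    return NULL;  /* Nothing to map: skip allocation and copy.  */
  assert (src != NULL && elem_size != 0);
  void *dst = malloc (mapnum * elem_size);
  if (dst != NULL)
    memcpy (dst, src, mapnum * elem_size);
  return dst;
}
```

Returning NULL for the empty case also keeps the cleanup path simple: only
memory that this helper actually allocated is ever passed to free.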

libgomp/
	* target.c (gomp_target_rev): Handle mapnum == 0 and avoid
	freeing not allocated memory.
	* testsuite/libgomp.fortran/reverse-offload-6.f90: New test.
2023-02-03 11:31:53 +01:00
Tobias Burnus f84fdb134d libgomp: enable reverse offload for AMDGCN
libgomp/ChangeLog:

	* libgomp.texi (5.0 Impl. Status, gcn specifics): Update for
	reverse offload.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Accept
	reverse-offload requirement.
2023-02-03 08:33:17 +01:00
GCC Administrator a37a0cb303 Daily bump. 2023-02-03 00:16:44 +00:00
Andrew Stubbs f6fff8a6fc amdgcn, libgomp: Manually allocated stacks
Switch from using stacks in the "private segment" to using a memory block
allocated on the host side.  The primary reason is to permit the reverse
offload implementation to access values located on the device stack, but
there may also be performance benefits, especially with repeated kernel
invocations.

This implementation unifies the stacks with the "team arena" optimization
feature, and now allows both to have run-time configurable sizes.
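
As an illustration of a run-time configurable size, the following sketch
reads a size from an environment variable and falls back to a default.  The
helper names and the fallback handling are assumptions; the plugin's actual
parsing of GCN_STACK_SIZE and GCN_TEAM_ARENA_SIZE, and its default values,
may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse TEXT as a positive decimal size.  Returns FALLBACK if TEXT is
   NULL, does not start with a digit, has trailing characters, is zero,
   or is out of range.  */
static size_t
toy_parse_size (const char *text, size_t fallback)
{
  if (text == NULL || *text < '0' || *text > '9')
    return fallback;
  char *end;
  errno = 0;
  unsigned long value = strtoul (text, &end, 10);
  if (errno != 0 || *end != '\0' || value == 0)
    return fallback;
  return (size_t) value;
}

/* Read a size from the environment variable NAME, or use FALLBACK when
   the variable is unset or malformed.  */
static size_t
toy_size_from_env (const char *name, size_t fallback)
{
  assert (name != NULL);
  return toy_parse_size (getenv (name), fallback);
}
```

A caller might write 'toy_size_from_env ("GCN_STACK_SIZE", some_default)',
where 'some_default' stands in for whatever default the runtime defines.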

A new ABI is needed, so all libraries must be rebuilt, and newlib must be
version 4.3.0.20230120 or newer.

gcc/ChangeLog:

	* config/gcn/gcn-run.cc: Include libgomp-gcn.h.
	(struct kernargs): Replace the common content with kernargs_abi.
	(struct heap): Delete.
	(main): Read GCN_STACK_SIZE envvar.
	Allocate space for the device stacks.
	Write the new kernargs fields.
	* config/gcn/gcn.cc (gcn_option_override): Remove stack_size_opt.
	(default_requested_args): Remove PRIVATE_SEGMENT_BUFFER_ARG and
	PRIVATE_SEGMENT_WAVE_OFFSET_ARG.
	(gcn_addr_space_convert): Mask the QUEUE_PTR_ARG content.
	(gcn_expand_prologue): Move the TARGET_PACKED_WORK_ITEMS to the top.
	Set up the stacks from the values in the kernargs, not private.
	(gcn_expand_builtin_1): Match the stack configuration in the prologue.
	(gcn_hsa_declare_function_name): Turn off the private segment.
	(gcn_conditional_register_usage): Ensure QUEUE_PTR is fixed.
	* config/gcn/gcn.h (FIXED_REGISTERS): Fix the QUEUE_PTR register.
	* config/gcn/gcn.opt (mstack-size): Change the description.

include/ChangeLog:

	* gomp-constants.h (GOMP_VERSION_GCN): Bump.

libgomp/ChangeLog:

	* config/gcn/libgomp-gcn.h (DEFAULT_GCN_STACK_SIZE): New define.
	(DEFAULT_TEAM_ARENA_SIZE): New define.
	(struct heap): Move to this file.
	(struct kernargs_abi): Likewise.
	* config/gcn/team.c (gomp_gcn_enter_kernel): Use team arena size from
	the kernargs.
	* libgomp.h: Include libgomp-gcn.h.
	(TEAM_ARENA_SIZE): Remove.
	(team_malloc): Update the error message.
	* plugin/plugin-gcn.c (struct kernargs): Move common content to
	struct kernargs_abi.
	(struct agent_info): Rename team arenas to ephemeral memories.
	(struct team_arena_list): Rename ....
	(struct ephemeral_memories_list): to this.
	(struct heap): Delete.
	(team_arena_size): New variable.
	(stack_size): New variable.
	(print_kernel_dispatch): Update debug messages.
	(init_environment_variables): Read GCN_TEAM_ARENA_SIZE.
	Read GCN_STACK_SIZE.
	(get_team_arena): Rename ...
	(configure_ephemeral_memories): ... to this, and set up stacks.
	(release_team_arena): Rename ...
	(release_ephemeral_memories): ... to this.
	(destroy_team_arenas): Rename ...
	(destroy_ephemeral_memories): ... to this.
	(create_kernel_dispatch): Add num_threads parameter.
	Adjust for kernargs_abi refactor and ephemeral memories.
	(release_kernel_dispatch): Adjust for ephemeral memories.
	(run_kernel): Pass thread-count to create_kernel_dispatch.
	(GOMP_OFFLOAD_init_device): Adjust for ephemeral memories.
	(GOMP_OFFLOAD_fini_device): Adjust for ephemeral memories.

gcc/testsuite/ChangeLog:

	* gcc.c-torture/execute/pr47237.c: Xfail on amdgcn.
	* gcc.dg/builtin-apply3.c: Xfail for amdgcn.
	* gcc.dg/builtin-apply4.c: Xfail for amdgcn.
	* gcc.dg/torture/stackalign/builtin-apply-3.c: Xfail for amdgcn.
	* gcc.dg/torture/stackalign/builtin-apply-4.c: Xfail for amdgcn.
2023-02-02 11:47:03 +00:00
Tobias Burnus 8da7476c5f libgomp.texi (OpenMP TR11 impl. status): Fix 'strict' item
Fix the 'strict' modifier status: it is already listed (as 'Y') for OpenMP
5.1 for num_tasks and grainsize; only 'strict' on num_threads is new with TR11.

libgomp/
	* libgomp.texi (OpenMP TR11): Fix item for 'strict' modifier.
2023-02-02 12:05:58 +01:00