Commit Graph

194472 Commits

Author SHA1 Message Date
Alexander Monakov 4a8aab9a23 .gitignore: do not ignore config.h
GCC does not support in-tree builds at the moment, so .gitignore
concealing artifacts of accidental in-tree ./configure run may cause
confusion. Un-ignore config.h, which is known to break the build.

ChangeLog:

	* .gitignore: Do not ignore config.h.
2022-07-19 17:07:04 +03:00
Marco Falke 20ab397224 libstdc++: Make __from_chars_alnum_to_val conversion explicit
The optimizations from commit r12-8175-ga54137c88061c7 introduced a
clang integer sanitizer error.

Fix this with an explicit static_cast, similar to the fix for PR 96766.
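
As a minimal sketch of the kind of change described here (this is not the libstdc++ code; is_decimal_digit is a hypothetical helper), the subtraction below is computed as int and then narrowed with an explicit static_cast.  Writing the cast out states that the wrap-around on narrowing is intended, so it is no longer an implicit conversion of the kind the sanitizer flags:

```cpp
// Hypothetical helper, not the libstdc++ __from_chars_alnum_to_val.
// c - '0' is evaluated as int after integer promotion.  Characters below
// '0' give a negative int, which wraps to a large value when narrowed to
// unsigned char, so a single comparison covers both bounds.
constexpr bool is_decimal_digit(char c)
{
  // Explicit narrowing: the int -> unsigned char conversion is deliberate.
  return static_cast<unsigned char>(c - '0') < 10;
}
```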

libstdc++-v3/ChangeLog:

	* include/std/charconv (__from_chars_alnum_to_val): Replace
	implicit conversion from int to unsigned char with explicit
	cast.
2022-07-19 14:56:42 +01:00
David Malcolm 2c044ff123 analyzer: fix taint handling of switch statements [PR106321]
PR analyzer/106321 reports false positives from
-Wanalyzer-tainted-array-index on switch statements, seen e.g.
in the Linux kernel in drivers/vfio/pci/vfio_pci_core.c, where
vfio_pci_core_ioctl has:

    |  744 |                 switch (info.index) {
    |      |                 ~~~~~~  ~~~~~~~~~~
    |      |                 |           |
    |      |                 |           (8) ...to here
    |      |                 (9) following ‘case 0 ... 5:’ branch...
    |......
    |  751 |                 case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR5_REGION_INDEX:
    |      |                 ~~~~
    |      |                 |
    |      |                 (10) ...to here

and then a false complaint about "use of attacker-controlled value
‘info.index’ in array lookup without upper-bounds checking", where
info.index has clearly had its bounds checked by the switch/case.

It turns out that when I rewrote switch handling for the analyzer in
r12-3101-g8ca7fa84a3af35, I removed notifications to state machines
about the constraints on cases.

This patch fixes that oversight by adding a new on_bounded_ranges vfunc
for region_model_context, called on switch statement edges, which calls
a new state_machine vfunc.  It implements it for the "taint" state
machine, so that it updates the "has bounds" flags at out-edges for
switch statements, based on whether the bounds from the edge appear to
actually constrain the switch index.
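
A minimal sketch of the pattern involved, assuming a hypothetical table and function (the kernel code uses the GNU case-range extension; plain case labels are used here for portability).  Every path that reaches the array access has had its index bounded by the case labels:

```cpp
// Hypothetical data, loosely modelled on the shape of the reported code.
static const int region_size[6] = {16, 32, 64, 128, 256, 512};

// Returns the table entry for index, or -1 when index is out of range.
// The case labels constrain index to 0..5 before the array lookup, which
// is the bounds information the taint state machine needs to see.
int lookup_region(unsigned index)
{
  switch (index) {
  case 0: case 1: case 2: case 3: case 4: case 5:
    return region_size[index];
  default:
    return -1;
  }
}
```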

gcc/analyzer/ChangeLog:
	PR analyzer/106321
	* constraint-manager.h (bounded_ranges::get_count): New.
	(bounded_ranges::get_range): New.
	* engine.cc (impl_region_model_context::on_bounded_ranges): New.
	* exploded-graph.h (impl_region_model_context::on_bounded_ranges):
	New decl.
	* region-model.cc (region_model::apply_constraints_for_gswitch):
	Potentially call ctxt->on_bounded_ranges.
	* region-model.h (region_model_context::on_bounded_ranges): New
	vfunc.
	(noop_region_model_context::on_bounded_ranges): New.
	(region_model_context_decorator::on_bounded_ranges): New.
	* sm-taint.cc: Include "analyzer/constraint-manager.h".
	(taint_state_machine::on_bounded_ranges): New.
	* sm.h (state_machine::on_bounded_ranges): New.

gcc/testsuite/ChangeLog:
	PR analyzer/106321
	* gcc.dg/analyzer/torture/taint-read-index-2.c: Add test coverage
	for switch statements.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-07-19 09:53:39 -04:00
David Malcolm 434d521d11 analyzer: log out-edge description in exploded_graph::process_node
I found this logging tweak very helpful when working on
PR analyzer/106284.

gcc/analyzer/ChangeLog:
	* engine.cc (exploded_graph::process_node): Show any description
	of the out-edge when logging it for consideration.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-07-19 09:53:39 -04:00
Martin Liska edf0c132b1 Remove trailing : for subheading.
gcc/ChangeLog:

	* doc/extend.texi: Remove trailing :.
2022-07-19 15:41:10 +02:00
Prathamesh Kulkarni 4c32313025 forwprop: Use lhs type instead of arg0 in folding VEC_PERM_EXPR.
gcc/ChangeLog:

	* tree-ssa-forwprop.cc (simplify_permutation): Use lhs type
	instead of TREE_TYPE (arg0) as result type in folding VEC_PERM_EXPR.
2022-07-19 17:45:41 +05:30
Sebastian Huber f082bc79c1 RTEMS: Remove HAVE_POLL for libstdc++
The poll() function is not always available in RTEMS.

libstdc++-v3/ChangeLog:

	* configure: Regenerate.
	* configure.ac (newlib, *-rtems*): Remove HAVE_POLL.
2022-07-19 14:10:26 +02:00
Richard Biener e4ff11a8f2 middle-end/106331 - fix mem attributes for string op arguments
get_memory_rtx tries hard to come up with a MEM_EXPR to record
in the memory attributes but in the last fallback fails to properly
account for an unknown offset and thus, as visible in this testcase,
set_mem_attributes computes an incorrect alignment.  The following
rectifies both parts.

	PR middle-end/106331
	* builtins.cc (get_memory_rtx): Compute alignment from
	the original address and set MEM_OFFSET to unknown when
	we create a MEM_EXPR from the base object of the address.

	* gfortran.dg/pr106331.f90: New testcase.
2022-07-19 11:16:34 +02:00
Richard Biener 0f129766fd lto/106334 - relax assert during WPA tree merging
The dwarf2out map of tree to symbol + offset is populated too early
when streaming in trees so that when WPA tree merging decides to
recycle them the mapping prevails and if we are unlucky the same
address is used for another tree with a symbol + offset DIE to
record.  The following mitigates the resulting ICE by relaxing the
assert, allowing re-use of a slot during WPA.  Delaying the register
would be better but it's already somewhat hairy and uglifying this
further doesn't look too important right now.

	PR lto/106334
	* dwarf2out.cc (dwarf2out_register_external_die): Allow
	map entry re-use during WPA.
2022-07-19 11:16:27 +02:00
Roger Sayle 40f6e59122 PR c/106264: Silence warnings from __builtin_modf et al.
This middle-end patch resolves PR c/106264 which is a spurious warning
regression caused by the tree-level expansion of modf, frexp and remquo
producing "expression has no-effect" when the built-in function's result
is ignored.  When these built-ins were first expanded at tree-level,
fold_builtin_n would blindly set TREE_NO_WARNING for all built-ins. Now
that we're more discerning, we should precisely call suppress_warning
selectively on those COMPOUND_EXPRs that need them.
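
For illustration, a small sketch of the kind of call affected, assuming a hypothetical wrapper named integral_part.  The return value of modf is discarded on purpose; only the value stored through the pointer is wanted, so a "no effect" warning on such a call would be spurious:

```cpp
#include <cmath>

// Hypothetical wrapper: returns the integral part of x.
double integral_part(double x)
{
  double ip;
  std::modf(x, &ip);  // fractional result deliberately ignored
  return ip;
}
```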

2022-07-19  Roger Sayle  <roger@nextmovesoftware.com>
	    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
	PR c/106264
	* builtins.cc (fold_builtin_frexp): Call suppress_warning on
	COMPOUND_EXPR to silence spurious warning if result isn't used.
	(fold_builtin_modf): Likewise.
	(do_mpfr_remquo): Likewise.

gcc/testsuite/ChangeLog
	PR c/106264
	* gcc.dg/pr106264.c: New test case.
2022-07-19 08:39:43 +01:00
Takayuki 'January June' Suwa 2180cdd8a0 xtensa: Correct the relative RTX cost that corresponds to the Move Immediate "MOVI" instruction
This patch corrects the overestimation of the relative cost of
'(set (reg) (const_int N))' where N fits into the instruction itself.
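
The ChangeLog entry for this commit describes "fits into the instruction itself" as fitting into a signed 12-bit immediate.  A minimal sketch of that range test, with fits_signed_12bit as a hypothetical name rather than the macro used in xtensa.cc:

```cpp
// Hypothetical helper: true when n is representable as a signed 12-bit
// immediate, i.e. lies in the range [-2048, 2047].
constexpr bool fits_signed_12bit(long n)
{
  return n >= -2048 && n <= 2047;
}
```

The constant 1024 (0x400) used in the example that follows is inside this range.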

In fact, such overestimation confuses the RTL loop invariant motion pass.
As a result, it has almost no negative impact on speed, but it adds
reg-reg move instructions and register allocation pressure, which
increases code size.

    /* example, optimized for size */
    extern int foo(void);
    extern int array[16];
    void test_0(void) {
      unsigned int i;
      for (i = 0; i < sizeof(array)/sizeof(*array); ++i)
        array[i] = 1024;
    }
    void test_1(void) {
      unsigned int i;
      for (i = 0; i < sizeof(array)/sizeof(*array); ++i)
        array[i] = array[i] ? 1024 : 0;
    }
    void test_2(void) {
      unsigned int i;
      for (i = 0; i < sizeof(array)/sizeof(*array); ++i)
        array[i] = foo() ? 0 : 1024;
    }

    ;; before
	.literal_position
	.literal .LC0, array
    test_0:
	l32r	a3, .LC0
	movi.n	a2, 0
	movi	a4, 0x400	// OK
    .L2:
	s32i.n	a4, a3, 0
	addi.n	a2, a2, 1
	addi.n	a3, a3, 4
	bnei	a2, 16, .L2
	ret.n
	.literal_position
	.literal .LC1, array
    test_1:
	l32r	a2, .LC1
	movi.n	a3, 0
	movi	a5, 0x400	// NG
    .L6:
	l32i.n	a4, a2, 0
	beqz.n	a4, .L5
	mov.n	a4, a5		// should be "movi a4, 0x400"
    .L5:
	s32i.n	a4, a2, 0
	addi.n	a3, a3, 1
	addi.n	a2, a2, 4
	bnei	a3, 16, .L6
	ret.n
	.literal_position
	.literal .LC2, array
    test_2:
	addi	sp, sp, -32
	s32i.n	a12, sp, 24
	l32r	a12, .LC2
	s32i.n	a13, sp, 20
	s32i.n	a14, sp, 16
	s32i.n	a15, sp, 12
	s32i.n	a0, sp, 28
	addi	a13, a12, 64
	movi.n	a15, 0		// NG
	movi	a14, 0x400	// and wastes callee-saved registers (only 4)
    .L11:
	call0	foo
	mov.n	a3, a14		// should be "movi a3, 0x400"
	movnez	a3, a15, a2
	s32i.n	a3, a12, 0
	addi.n	a12, a12, 4
	bne	a12, a13, .L11
	l32i.n	a0, sp, 28
	l32i.n	a12, sp, 24
	l32i.n	a13, sp, 20
	l32i.n	a14, sp, 16
	l32i.n	a15, sp, 12
	addi	sp, sp, 32
	ret.n

    ;; after
	.literal_position
	.literal .LC0, array
    test_0:
	l32r	a3, .LC0
	movi.n	a2, 0
	movi	a4, 0x400	// OK
    .L2:
	s32i.n	a4, a3, 0
	addi.n	a2, a2, 1
	addi.n	a3, a3, 4
	bnei	a2, 16, .L2
	ret.n
	.literal_position
	.literal .LC1, array
    test_1:
	l32r	a2, .LC1
	movi.n	a3, 0
    .L6:
	l32i.n	a4, a2, 0
	beqz.n	a4, .L5
	movi	a4, 0x400	// OK
    .L5:
	s32i.n	a4, a2, 0
	addi.n	a3, a3, 1
	addi.n	a2, a2, 4
	bnei	a3, 16, .L6
	ret.n
	.literal_position
	.literal .LC2, array
    test_2:
	addi	sp, sp, -16
	s32i.n	a12, sp, 8
	l32r	a12, .LC2
	s32i.n	a13, sp, 4
	s32i.n	a0, sp, 12
	addi	a13, a12, 64
    .L11:
	call0	foo
	movi.n	a3, 0		// OK
	movi	a4, 0x400	// and less register allocation pressure
	moveqz	a3, a4, a2
	s32i.n	a3, a12, 0
	addi.n	a12, a12, 4
	bne	a12, a13, .L11
	l32i.n	a0, sp, 12
	l32i.n	a12, sp, 8
	l32i.n	a13, sp, 4
	addi	sp, sp, 16
	ret.n

gcc/ChangeLog:

	* config/xtensa/xtensa.cc (xtensa_rtx_costs):
	Change the relative cost of '(set (reg) (const_int N))' where
	N fits into signed 12-bit from 4 to 0 if optimizing for size.
	And use the appropriate macro instead of the bare number 4.
2022-07-18 20:17:13 -07:00
GCC Administrator 79fb1124c8 Daily bump. 2022-07-19 00:16:32 +00:00
François Dumont 63d182fb86 libstdc++: Enhance branching in std::inplace_merge and std::stable_sort
When we manage to allocate a buffer of the expected size we can simplify the code to
perform the expected algorithm.

libstdc++-v3/ChangeLog:

	* include/bits/stl_algo.h
	(__merge_adaptive): Adapt to merge only when buffer is large enough.
	(__merge_adaptive_resize): New, adapt merge when buffer is too small.
	(__inplace_merge): Adapt, use latter.
	(__stable_sort_adaptive): Adapt to sort only when buffer is large enough.
	(__stable_sort_adaptive_resize): New, adapt sort when buffer is too small.
	(__stable_sort): Adapt, use latter.
2022-07-18 22:40:10 +02:00
Andrew MacLeod 5e47c9333d Check if transitives need to be registered.
Whenever a relation is added, register_transitive is always called.
If neither operand was in a relation before, or this is not a new
relation, then there is no need to register transitives.

	PR tree-optimization/106280
	* value-relation.cc (dom_oracle::register_relation): Register
	transitives only when it is possible for there to be one.
	(dom_oracle::set_one_relation): Return NULL if this is an
	existing relation.
2022-07-18 15:48:13 -04:00
Maciej W. Rozycki e9ee752bbe RISC-V/doc: Add index references for `mrelax' and `mriscv-attribute'
Add missing index references for the `-mrelax' and `-mriscv-attribute'
invocation options.

	gcc/
	* doc/invoke.texi (RISC-V Options): Add index references for
	`mrelax' and `mriscv-attribute'.
2022-07-18 16:47:21 +01:00
Maciej W. Rozycki fa16bb8ac0 RISC-V/doc: Correct the formatting of `-mstack-protector-guard-reg='
Add missing second space around the `-mstack-protector-guard-reg='
invocation option.

	gcc/
	* doc/invoke.texi (Option Summary): Add missing second space
	around `-mstack-protector-guard-reg='.
2022-07-18 16:47:20 +01:00
Maciej W. Rozycki 7df79970bf RISC-V/doc: Correct the name of `-mriscv-attribute'
Correct the name of the `-mriscv-attribute' invocation option, including
a typo in the negated form.

	gcc/
	* doc/invoke.texi (Option Summary): Fix `-mno-riscv-attribute'.
	(RISC-V Options): Likewise, and `-mriscv-attribute'.
2022-07-18 16:47:20 +01:00
Claudiu Zissulescu 7501eec65c arc: Add ARCHS release 310a tune variant.
Add mtune and mcpu options for ARCHS release 310a type CPU. The
mtune=release31a is designed to be used as an alternative to the
mcpu=hs4x_rel31 option.
ARCHS4x release 31a uses DSP instructions which are implemented a bit
differently than mpy9. Hence, use the safer mpy2 option.

gcc/
	* config/arc/arc-arch.h (arc_tune_attr): Add
	ARC_TUNE_ARCHS4X_REL31A variant.
	* config/arc/arc.cc (arc_override_options): Tune options for
	release 310a.
	(arc_sched_issue_rate): Use correct enum.
	(arc600_corereg_hazard): Textual change.
	(arc_hazard): Add release 310a tuning.
	* config/arc/arc.md (tune): Update and take into consideration new
	tune option.
	(tune_dspmpy): Likewise.
	(tune_store): New attribute.
	* config/arc/arc.opt (mtune): New tune option.
	* config/arc/arcHS4x.md (hs4x_brcc0, hs4x_brcc1): New cpu units.
	(hs4x_brcc_op): New instruction reservation.
	(hs4x_data_store_1_op): Likewise.
	* config/arc/arc-cpus.def (hs4x_rel31): New cpu variant.
	* config/arc/arc-tables.opt: Regenerate.
	* config/arc/t-multilib: Likewise.
	* doc/invoke.texi (ARC): Update mcpu and tune sections.

Signed-off-by: Claudiu Zissulescu <claziss@gmail.com>
2022-07-18 15:45:20 +03:00
Richard Biener 87f46a16ec Fix builtin vs non-builtin partition merge in loop distribution
When r7-6373-g40b6bff965d004 fixed a costing issue it failed to
make the logic symmetric which means that we now fuse
normal vs. builtin when the cost model says so but we don't fuse
builtin vs. normal.  The following fixes that, also allowing
the cost model to decide to fuse two builtin partitions as otherwise
an intermediate non-builtin can result in a partial merge as well.

	* tree-loop-distribution.cc (loop_distribution::distribute_loop):
	When computing cost-based merging do not disregard builtin
	classified partitions in some cases.

	* gcc.dg/tree-ssa/ldist-24.c: XFAIL.
	* gcc.dg/tree-ssa/ldist-36.c: Adjust expected outcome.
2022-07-18 14:42:51 +02:00
Claudiu Zissulescu c8697735ab libgcc/arc: Update udivmodsi4 and make the lib safe for rf16
The ARC soft udivmodsi4 algorithm, as well as the use of umodsi3
for reduced register set configurations, is wrong.

libgcc/
	* config/arc/lib2funcs.c (udivmodsi4): Update AND mask.
	* config/arc/lib1funcs.S (umodsi3): Don't use it for RF16
	configurations.
2022-07-18 15:00:53 +03:00
Richard Sandiford 7313381d2c arm: Replace arm_builtin_vectorized_function [PR106253]
This patch extends the fix for PR106253 to AArch32.  As with AArch64,
we were using ACLE intrinsics to vectorise scalar built-ins, even
though the two sometimes have different ECF_* flags.  (That in turn
is because the ACLE intrinsics should follow the instruction semantics
as closely as possible, whereas the scalar built-ins follow language
specs.)

The patch also removes the copysignf built-in, which only existed
for this purpose and wasn't a “real” arm_neon.h built-in.

Doing this also has the side-effect of enabling vectorisation of
rint and roundeven.  Logically that should be a separate patch,
but making it one would have meant adding a new int iterator
for the original set of instructions and then removing it again
when including new functions.

I've restricted the bswap tests to little-endian because we end
up with excessive spilling on big-endian.  E.g.:

        sub     sp, sp, #8
        vstr    d1, [sp]
        vldr    d16, [sp]
        vrev16.8        d16, d16
        vstr    d16, [sp]
        vldr    d0, [sp]
        add     sp, sp, #8
        @ sp needed
        bx      lr

Similarly, the copysign tests require little-endian because on
big-endian we unnecessarily load the constant from the constant pool:

        vldr.32 s15, .L3
        vdup.32 d0, d7[1]
        vbsl    d0, d2, d1
        bx      lr
.L3:
        .word   -2147483648

gcc/
	PR target/106253
	* config/arm/arm-builtins.cc (arm_builtin_vectorized_function):
	Delete.
	* config/arm/arm-protos.h (arm_builtin_vectorized_function): Delete.
	* config/arm/arm.cc (TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION):
	Delete.
	* config/arm/arm_neon_builtins.def (copysignf): Delete.
	* config/arm/iterators.md (nvrint_pattern): New attribute.
	* config/arm/neon.md (<NEON_VRINT:nvrint_pattern><VCVTF:mode>2):
	New pattern.
	(l<NEON_VCVT:nvrint_pattern><su_optab><VCVTF:mode><v_cmp_result>2):
	Likewise.
	(neon_copysignf<mode>): Rename to...
	(copysign<mode>3): ...this.

gcc/testsuite/
	PR target/106253
	* gcc.target/arm/vect_unary_1.c: New test.
	* gcc.target/arm/vect_binary_1.c: Likewise.
2022-07-18 12:57:10 +01:00
Claudiu Zissulescu 9c8349ee1a arc: Fix interrupt's epilogue.
The stack pointer adjustment in interrupt epilogue is happening after
restoring the ZOL registers which is wrong. Fixing this.

gcc/
	* config/arc/arc.cc (arc_expand_epilogue): Adjust the frame
	pointer first when in interrupts.

gcc/testsuite/
	* gcc.target/arc/interrupt-13.c: New file.

Signed-off-by: Claudiu Zissulescu <claziss@gmail.com>
2022-07-18 14:36:58 +03:00
Richard Biener ce92603fbe Improve common reduction vs builtin code generation in loop distribution
loop distribution currently cannot handle the situation when the
last partition is a builtin but there's a common reduction in all
partitions (like the final IV value).  The following lifts this
restriction by making the last non-builtin partition provide the
definitions for the loop-closed PHI nodes.  Since we have heuristics
in place to avoid code generating builtins last writing a testcase
is difficult (but I ran into a case with other pending patches that
made the heuristic ineffective).  What's remaining is the inability
to preserve common reductions when all partitions could be builtins
(in some cases final value replacement could come to the rescue here).

	* tree-loop-distribution.cc (copy_loop_before): Add
	the ability to replace the original LC PHI defs.
	(generate_loops_for_partition): Pass through a flag
	whether to redirect original LC PHI defs.
	(generate_code_for_partition): Likewise.
	(loop_distribution::distribute_loop): Compute the partition
	that should provide the LC PHI defs for common reductions
	and pass that down.
2022-07-18 13:19:22 +02:00
Richard Ball 06039e71f0 Replace manual swapping idiom with std::swap in aarch64.cc
gcc/config/aarch64/aarch64.cc has a few manual swapping idioms of the form:

x = in0, in0 = in1, in1 = x;

The preferred way is using the standard:

std::swap (in0, in1);

We should just fix these to use std::swap.
This will also allow us to eliminate the x temporary rtx.
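
A minimal sketch of the replacement, using plain ints in place of the rtx operands and a hypothetical function name (order_operands is not a function in aarch64.cc):

```cpp
#include <utility>

// Hypothetical stand-in for the operand-ordering step.
void order_operands(int &in0, int &in1, bool swap_needed)
{
  if (swap_needed)
    std::swap(in0, in1);  // replaces: x = in0, in0 = in1, in1 = x;
}
```

No temporary needs to be declared at the call site, which is what makes the x variable removable.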

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_evpc_trn): Use std::swap.
	(aarch64_evpc_uzp): Likewise.
	(aarch64_evpc_zip): Likewise.
2022-07-18 11:30:04 +01:00
Roger Sayle 2907bfc341 PR target/106231: Optimize (any_extend:DI (ctz:SI ...)) on x86_64.
This patch resolves PR target/106231 by providing insns that recognize
(zero_extend:DI (ctz:SI ...)) and (sign_extend:DI (ctz:SI ...)).  The
result of ctz:SI is always between 0 and 32 (or undefined), so
sign_extension is the same as zero_extension, and the result is already
extended in the destination register.
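
The equivalence can be checked with a small sketch.  It assumes the GCC/Clang builtin __builtin_ctz (undefined for a zero argument); the two function names are hypothetical.  Because the count is never negative, widening it as signed or as unsigned gives the same 64-bit value:

```cpp
#include <cstdint>

// Zero-extend the 32-bit trailing-zero count to 64 bits.
std::uint64_t ctz_zext(std::uint32_t x)
{
  return static_cast<std::uint64_t>(
      static_cast<std::uint32_t>(__builtin_ctz(x)));
}

// Sign-extend the 32-bit trailing-zero count to 64 bits.
std::int64_t ctz_sext(std::uint32_t x)
{
  return static_cast<std::int64_t>(
      static_cast<std::int32_t>(__builtin_ctz(x)));
}
```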

Things are a little complicated, because the existing implementation
of *ctzsi2 handles multiple cases, including false dependencies, which
we continue to support in this patch.

2022-07-18  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR target/106231
	* config/i386/i386.md (*ctzsidi2_<s>ext): New insn_and_split
	to recognize any_extend:DI of ctz:SI which is implicitly extended.
	(*ctzsidi2_<s>ext_falsedep): New define_insn to model a DImode
	extended ctz:SI that has preceding xor to break false dependency.

gcc/testsuite/ChangeLog
	PR target/106231
	* gcc.target/i386/pr106231-1.c: New test case.
	* gcc.target/i386/pr106231-2.c: New test case.
2022-07-18 07:44:38 +01:00
Roger Sayle 43c2505b31 Fix issue with x86_64_const_vector_operand predicate on x86.
This patch fixes (what I believe is) a latent bug in i386.md's
x86_64_const_vector_operand define_predicate.  According to the
documentation, when a predicate is called with rtx operand OP and
machine_mode operand MODE, we shouldn't assume that the
MODE is (or has been checked to be) GET_MODE (OP).

The failure mode is that recog can call x86_64_const_vector_operand
on an arbitrary CONST_VECTOR passing a MODE of V2QI_mode, but when
the CONST_VECTOR is in fact V1TImode, it's unsafe to directly call
ix86_convert_const_vector_to_integer, which assumes that the CONST_VECTOR
contains CONST_INTs when it actually contains CONST_WIDE_INTs.  The
checks in this define_predicate need to be testing OP's mode, and
ideally confirming that this matches the passed in/specified MODE.

This bug is currently latent, but adding an innocent/unrelated
define_insn, such as "(set (reg:CCC FLAGS_REG) (const_int 0))" to
i386.md can occasionally change the order in which genrecog generates
its tests, then ICEing during bootstrap due to V1TI CONST_VECTORs.

2022-07-18  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/predicates.md (x86_64_const_vector_operand):
	Check the operand's mode matches the specified mode argument.
2022-07-18 07:41:36 +01:00
Roger Sayle f9da2663f5 Add UNSPEC_MASKOP to kupck<mode> instructions in sse.md on x86.
This AVX512 specific patch to sse.md is split out from an earlier patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596199.html

The new splitters proposed in that patch interfere with AVX512's
kunpckdq instruction which is defined as identical RTL,
DW:DI = (HI:SI<<32)|zero_extend(LO:SI).  To distinguish these,
and avoid AVX512 mask registers accidentally being (ab)used by reload
to perform SImode scalar shifts, this patch adds the explicit
(unspec UNSPEC_MASKOP) to the unpack mask operations, which matches
what sse.md does for the other mask specific (logic) operations.

2022-07-18  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/sse.md (kunpckhi): Add UNSPEC_MASKOP unspec.
	(kunpcksi): Likewise, add UNSPEC_MASKOP unspec.
	(kunpckdi): Likewise, add UNSPEC_MASKOP unspec.
	(vec_pack_trunc_qi): Update to specify the now required
	UNSPEC_MASKOP unspec.
	(vec_pack_trunc_<mode>): Likewise.
2022-07-18 07:36:13 +01:00
GCC Administrator 6d7071776e Daily bump. 2022-07-18 00:16:24 +00:00
GCC Administrator 7bcd7f4735 Daily bump. 2022-07-17 00:16:23 +00:00
Ian Lance Taylor 2b5baaef0b go: fix f().x where f returns zero-sized type
Test case is https://go.dev/cl/417874.

Fixes golang/go#23870

	* go-gcc.cc (Gcc_backend::struct_field_expression): Handle a void
	expression, as for f().x where f returns a zero-sized type.
2022-07-16 16:30:34 -07:00
Takayuki 'January June' Suwa d6d8e6a7e1 xtensa: Optimize "bitwise AND with imm1" followed by "branch if (not) equal to imm2"
This patch enhances the effectiveness of the previously posted one:
"xtensa: Optimize bitwise AND operation with some specific forms of constants".

    /* example */
    extern void foo(int);
    void test(int a) {
      if ((a & (-1U << 8)) == (128 << 8))  /* 0 or one of "b4const" */
        foo(a);
    }

    ;; before
	.global	test
    test:
	movi	a3, -0x100
	movi.n	a4, 1
	and	a3, a2, a3
	slli	a4, a4, 15
	bne	a3, a4, .L3
	j.l	foo, a9
    .L1:
	ret.n

    ;; after
	.global test
    test:
	srli	a3, a2, 8
	bnei	a3, 128, .L1
	j.l	foo, a9
    .L1:
	ret.n
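
The "before" and "after" sequences rely on the identity sketched here.  The function names are hypothetical, and a is taken as unsigned 32-bit to match the conversion implied by -1U in the example: clearing the low 8 bits and comparing with 128 << 8 is the same test as shifting right by 8 and comparing with 128.

```cpp
#include <cstdint>

// Original form: mask off the low 8 bits, compare with 128 << 8.
constexpr bool masked_compare(std::uint32_t a)
{
  return (a & (~std::uint32_t{0} << 8)) == (std::uint32_t{128} << 8);
}

// Rewritten form: logical shift right by 8, compare with 128.
constexpr bool shifted_compare(std::uint32_t a)
{
  return (a >> 8) == 128;
}
```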

gcc/ChangeLog:

	* config/xtensa/xtensa.md
	(*masktrue_const_pow2_minus_one, *masktrue_const_negative_pow2,
	*masktrue_const_shifted_mask): If the immediate for bitwise AND is
	represented as '-(1 << N)', decrease the lower bound of N from 12
	to 1.  And the other immediate for conditional branch is now no
	longer limited to zero, but also one of some positive integers.
	Finally, remove the checks of some conditions, because the comparison
	expressions that don't satisfy such checks are determined as
	compile-time constants and thus will be optimized away before
	RTL expansion.
2022-07-16 00:27:42 -07:00
Takayuki 'January June' Suwa 1884f89782 xtensa: constantsynth: Try to find a shorter instruction
This patch allows the constant synthesis to choose shorter instruction
if possible.

    /* example */
    int test(void) {
      return 128 << 8;
    }

    ;; before
    test:
	movi	a2, 0x100
	addmi	a2, a2, 0x7f00
	ret.n

    ;; after
    test:
	movi.n	a2, 1
	slli	a2, a2, 15
	ret.n

When the Code Density Option is configured, the latter is one byte smaller
than the former.
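
As a quick arithmetic check that both sequences produce the same constant, here is a sketch with hypothetical function names that mirror the two instruction pairs:

```cpp
#include <cstdint>

// "before": movi a2, 0x100 ; addmi a2, a2, 0x7f00
constexpr std::int32_t synth_movi_addmi()
{
  return 0x100 + 0x7f00;
}

// "after": movi.n a2, 1 ; slli a2, a2, 15
constexpr std::int32_t synth_movin_slli()
{
  return std::int32_t{1} << 15;
}

static_assert(synth_movi_addmi() == (128 << 8), "add form yields 0x8000");
static_assert(synth_movin_slli() == (128 << 8), "shift form yields 0x8000");
```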

gcc/ChangeLog:

	* config/xtensa/xtensa.cc (xtensa_emit_constantsynth): Remove.
	(xtensa_constantsynth_2insn): Change to try all three synthetic
	methods and to use the one that fits the immediate value of
	the seed into a Narrow Move Immediate instruction "MOVI.N"
	when the Code Density Option is configured.
2022-07-16 00:27:42 -07:00
GCC Administrator bdc7b765f8 Daily bump. 2022-07-16 00:16:30 +00:00
H.J. Lu 2582080f19 x86: Disable sibcall if indirect_return attribute doesn't match
When shadow stack is enabled, function with indirect_return attribute
may return via indirect jump.  In this case, we need to disable sibcall
if caller doesn't have indirect_return attribute and indirect branch
tracking is enabled since compiler won't generate ENDBR when calling the
caller.

gcc/

	PR target/85620
	* config/i386/i386.cc (ix86_function_ok_for_sibcall): Return
	false if callee has indirect_return attribute and caller
	doesn't.

gcc/testsuite/

	PR target/85620
	* gcc.target/i386/pr85620-2.c: Updated.
	* gcc.target/i386/pr85620-5.c: New test.
	* gcc.target/i386/pr85620-6.c: Likewise.
	* gcc.target/i386/pr85620-7.c: Likewise.
2022-07-15 16:58:05 -07:00
Roger Sayle fd3d25d6df PR target/106273: Add earlyclobber to *andn<dwi>3_doubleword_bmi on x86_64.
This patch resolves PR target/106273 which is a wrong code regression
caused by the recent reorganization to split doubleword operations after
reload on x86.  For the failing test case, the constraints on the
andnti3_doubleword_bmi pattern allow reload to allocate the output and
operand in overlapping but non-identical registers, i.e.

(insn 45 44 66 2 (parallel [
            (set (reg/v:TI 5 di [orig:96 i ] [96])
                (and:TI (not:TI (reg:TI 39 r11 [orig:83 _2 ] [83]))
                    (reg/v:TI 4 si [orig:100 i ] [100])))
            (clobber (reg:CC 17 flags))
        ]) "pr106273.c":13:5 562 {*andnti3_doubleword_bmi}

where the output is in registers 5 and 6, and the second operand is
registers 4 and 5, which then leads to the incorrect split:

(insn 113 44 114 2 (parallel [
            (set (reg:DI 5 di [orig:96 i ] [96])
                (and:DI (not:DI (reg:DI 39 r11 [orig:83 _2 ] [83]))
                    (reg:DI 4 si [orig:100 i ] [100])))
            (clobber (reg:CC 17 flags))
        ]) "pr106273.c":13:5 566 {*andndi_1}

(insn 114 113 66 2 (parallel [
            (set (reg:DI 6 bp [ i+8 ])
                (and:DI (not:DI (reg:DI 40 r12 [ _2+8 ]))
                    (reg:DI 5 di [ i+8 ])))
            (clobber (reg:CC 17 flags))
        ]) "pr106273.c":13:5 566 {*andndi_1}

[Notice that reg:DI 5 is set in the first instruction, but assumed
to have its original value in the second].  My first thought was
that this could be fixed by swapping the order of the split instructions
(which works in this case), but in the general case, it's impossible
to handle (set (reg:TI x) (op (reg:TI x+1) (reg:TI x-1))).  Hence for
correctness this pattern needs an earlyclobber "=&r", but we can also
allow cases where the output is the same as one of the operands (using
constraint "0").  The other binary logic operations (AND, IOR, XOR)
are unaffected as they constrain the output to match the first
operand, but BMI's andn is a three-operand instruction which can
lead to the overlapping cases described above.
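
The failure can be reproduced with a toy model.  This is a sketch under stated assumptions, not GCC code: RegFile and andn_split are hypothetical, and a 128-bit value is modelled as two consecutive 64-bit registers, low half first.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using RegFile = std::array<std::uint64_t, 8>;

// Splits the 128-bit operation dst = ~a & b into two 64-bit operations,
// executed low half first, as in the split shown above.  If dst overlaps
// b by one register (dst == b + 1), the first store overwrites the high
// half of b before the second operation reads it.
void andn_split(RegFile &regs, std::size_t dst, std::size_t a, std::size_t b)
{
  regs[dst]     = ~regs[a]     & regs[b];
  regs[dst + 1] = ~regs[a + 1] & regs[b + 1];
}
```

With a in registers 0-1 (all zero), b in registers 4-5 holding 0x0F and 0xF0, and dst in registers 5-6, the high half comes out as 0x0F instead of 0xF0.  Placing dst in registers that do not overlap the inputs gives the expected result, which is the guarantee the earlyclobber constraint provides.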

2022-07-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR target/106273
	* config/i386/i386.md (*andn<dwi>3_doubleword_bmi): Update the
	constraints to reflect the output is earlyclobber, unless it is
	the same register (pair) as one of the operands.

gcc/testsuite/ChangeLog
	PR target/106273
	* gcc.target/i386/pr106273.c: New test case.
2022-07-15 22:48:56 +01:00
Steve Kargl 517fb1a781 Fortran: do not generate conflicting results under -ff2c [PR104313]
gcc/fortran/ChangeLog:

	PR fortran/104313
	* trans-decl.cc (gfc_generate_return): Do not generate conflicting
	fake results for functions with no result variable under -ff2c.

gcc/testsuite/ChangeLog:

	PR fortran/104313
	* gfortran.dg/pr104313.f: New test.
2022-07-15 22:08:24 +02:00
Marek Polacek 9a15d3beac c++: Add __reference_con{struc,ver}ts_from_temporary [PR104477]
This patch implements C++23 P2255R2, which adds two new type traits to
detect reference binding to a temporary.  They can be used to detect code
like

  std::tuple<const std::string&> t("meow");

which is incorrect because it always creates a dangling reference, because
the std::string temporary is created inside the selected constructor of
std::tuple, and not outside it.

There are two new compiler builtins, __reference_constructs_from_temporary
and __reference_converts_from_temporary.  The former is used to simulate
direct- and the latter copy-initialization context.  But I had a hard time
finding a test where there's actually a difference.  Under DR 2267, both
of these are invalid:

  struct A { } a;
  struct B { explicit B(const A&); };
  const B &b1{a};
  const B &b2(a);

so I had to peruse [over.match.ref], and eventually realized that the
difference can be seen here:

  struct G {
    operator int(); // #1
    explicit operator int&&(); // #2
  };

  int&& r1(G{}); // use #2 (no temporary)
  int&& r2 = G{}; // use #1 (a temporary is created to be bound to int&&)

The implementation itself was rather straightforward because we already
have the conv_binds_ref_to_prvalue function.  The main function here is
ref_xes_from_temporary.
I've changed the return type of ref_conv_binds_directly to tristate, because
previously the function didn't distinguish between an invalid conversion and
one that binds to a prvalue.  Since it no longer returns a bool, I removed
the _p suffix.

The patch also adds the relevant class and variable templates to <type_traits>.

	PR c++/104477

gcc/c-family/ChangeLog:

	* c-common.cc (c_common_reswords): Add
	__reference_constructs_from_temporary and
	__reference_converts_from_temporary.
	* c-common.h (enum rid): Add RID_REF_CONSTRUCTS_FROM_TEMPORARY and
	RID_REF_CONVERTS_FROM_TEMPORARY.

gcc/cp/ChangeLog:

	* call.cc (ref_conv_binds_directly_p): Rename to ...
	(ref_conv_binds_directly): ... this.  Add a new bool parameter.  Change
	the return type to tristate.
	* constraint.cc (diagnose_trait_expr): Handle
	CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	* cp-tree.h: Include "tristate.h".
	(enum cp_trait_kind): Add CPTK_REF_CONSTRUCTS_FROM_TEMPORARY
	and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	(ref_conv_binds_directly_p): Rename to ...
	(ref_conv_binds_directly): ... this.
	(ref_xes_from_temporary): Declare.
	* cxx-pretty-print.cc (pp_cxx_trait_expression): Handle
	CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	* method.cc (ref_xes_from_temporary): New.
	* parser.cc (cp_parser_primary_expression): Handle
	RID_REF_CONSTRUCTS_FROM_TEMPORARY and RID_REF_CONVERTS_FROM_TEMPORARY.
	(cp_parser_trait_expr): Likewise.
	(warn_for_range_copy): Adjust to call ref_conv_binds_directly.
	* semantics.cc (trait_expr_value): Handle
	CPTK_REF_CONSTRUCTS_FROM_TEMPORARY and CPTK_REF_CONVERTS_FROM_TEMPORARY.
	(finish_trait_expr): Likewise.

libstdc++-v3/ChangeLog:

	* include/std/type_traits (reference_constructs_from_temporary,
	reference_converts_from_temporary): New class templates.
	(reference_constructs_from_temporary_v,
	reference_converts_from_temporary_v): New variable templates.
	(__cpp_lib_reference_from_temporary): Define for C++23.
	* include/std/version (__cpp_lib_reference_from_temporary): Define for
	C++23.
	* testsuite/20_util/variable_templates_for_traits.cc: Test
	reference_constructs_from_temporary_v and
	reference_converts_from_temporary_v.
	* testsuite/20_util/reference_from_temporary/value.cc: New test.
	* testsuite/20_util/reference_from_temporary/value2.cc: New test.
	* testsuite/20_util/reference_from_temporary/version.cc: New test.

gcc/testsuite/ChangeLog:

	* g++.dg/ext/reference_constructs_from_temporary1.C: New test.
	* g++.dg/ext/reference_converts_from_temporary1.C: New test.
2022-07-15 11:30:38 -04:00
David Malcolm 0a8edfbd37 analyzer: fix taint false positive on optimized range checks [PR106284]
PR analyzer/106284 reports a false positive from
-Wanalyzer-tainted-array-index seen on the Linux kernel
with a version of my patches from:
  https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584372.html
in drivers/usb/class/usblp.c in function ‘usblp_set_protocol’ handling
usblp_ioctl on IOCNR_SET_PROTOCOL, which has:

  | 1337 |         if (protocol < USBLP_FIRST_PROTOCOL || protocol > USBLP_LAST_PROTOCOL)
  |      |            ~
  |      |            |
  |      |            (15) following ‘false’ branch...
  |......
  | 1341 |         if (usblp->intf->num_altsetting > 1) {
  |      |            ~~~~~~~~~~~~
  |      |            |     |
  |      |            |     (16) ...to here
  |      |            (17) following ‘true’ branch...
  | 1342 |                 alts = usblp->protocol[protocol].alt_setting;
  |      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  |      |                      |
  |      |                      (18) ...to here
  |      |                      (19) use of attacker-controlled value ‘arg’ in array lookup without bounds checking

where "arg" is "protocol" (albeit from the caller frame, the ioctl
callback), and is clearly checked at (15).

The root cause is that at -O1 and above fold-const's build_range_check
can optimize range checks
  (c>=low) && (c<=high)
into
  (c-low>=0) && (c-low<=high-low)
and thus into a single check:
  (unsigned)(c - low) <= (unsigned)(high-low).
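
The equivalence of the two forms can be checked with a small standalone
sketch.  The function names are made up for this illustration, and the sketch
assumes low <= high; it does the subtraction in unsigned arithmetic so that
the example itself has no signed overflow.

```cpp
// Original two-comparison form of the range check.
constexpr bool in_range_two_checks (int c, int low, int high)
{
  return c >= low && c <= high;
}

// Folded single-comparison form (assumes low <= high).  If c < low, the
// unsigned difference wraps around to a very large value and the test fails,
// so one comparison covers both bounds.
constexpr bool in_range_one_check (int c, int low, int high)
{
  return static_cast<unsigned> (c) - static_cast<unsigned> (low)
	 <= static_cast<unsigned> (high) - static_cast<unsigned> (low);
}
```

Because only one comparison survives folding, a checker that looks for a
separate lower-bound and upper-bound test sees neither of them.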

I initially attempted to fix this by detecting such conditions in
region_model::on_condition, and calling on_condition for both of the
implied conditions.  This turned out not to work since the current
sm_context framework doesn't support applying two conditions
simultaneously: it led to a transition from the old state to has_lb,
then a transition from the old state *again* to has_ub, thus leaving
the new state as has_ub, rather than the stop state.

Instead, this patch fixes things by special-casing it within
taint_state_machine::on_condition.

gcc/analyzer/ChangeLog:
	PR analyzer/106284
	* sm-taint.cc (taint_state_machine::on_condition): Handle range
	checks optimized by build_range_check.

gcc/testsuite/ChangeLog:
	PR analyzer/106284
	* gcc.dg/analyzer/torture/taint-read-index-2.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-07-15 11:28:34 -04:00
David Malcolm b1d07b50d4 analyzer: documentation nits relating to new fd warnings
gcc/ChangeLog:
	* doc/invoke.texi (Static Analyzer Options): Add the new fd
	warnings to the initial gccoptlist, and to the list of those
	disabled by -fanalyzer-checker=taint.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-07-15 11:28:34 -04:00
Ian Lance Taylor 5054bc001d go: fix f(g()) where g returns zero-sized type
Test case is https://go.dev/cl/417481.

Fixes golang/go#23868

	* go-gcc.cc (Gcc_backend::call_expression): Handle a void
	argument, as for f(g()) where g returns a zero-sized type.
2022-07-15 08:04:42 -07:00
Andrew Carlotti 91259dd850 aarch64: Remove qualifier_internal
This has been unused since 2014, so there's no reason to retain it.

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.cc
	(enum aarch64_type_qualifiers): Remove qualifier_internal.
	(aarch64_init_simd_builtin_functions): Remove qualifier_internal check.
2022-07-15 15:31:19 +01:00
Andrew Carlotti 5ba864c5d1 aarch64: Add V1DI mode
We already have a V1DF mode, so this makes the vector modes more consistent.

Additionally, this allows us to recognise uint64x1_t and int64x1_t types given
only the mode and type qualifiers (e.g. in aarch64_lookup_simd_builtin_type).

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.cc
	(v1di_UP): Add V1DI mode to _UP macros.
	* config/aarch64/aarch64-modes.def (VECTOR_MODE): Add V1DI mode.
	* config/aarch64/aarch64-simd-builtin-types.def: Use V1DI mode.
	* config/aarch64/aarch64-simd.md
	(vec_extractv2dfv1df): Replace with...
	(vec_extract<mode><V1half>): ...this.
	* config/aarch64/aarch64.cc
	(aarch64_classify_vector_mode): Add V1DI mode.
	* config/aarch64/iterators.md
	(VQ_2E, V1HALF, V1half): New.
	(nunits): Add V1DI mode.
2022-07-15 15:30:29 +01:00
Andrew Carlotti 23dd41c480 MAINTAINERS: Add myself to Write After Approval
ChangeLog:

	* MAINTAINERS: Add myself to Write After Approval.
2022-07-15 14:48:47 +01:00
Roger Sayle 2fd215b03e PR target/106278: Keep REG_EQUAL notes consistent during TImode STV on x86_64.
This patch resolves PR target/106278, a regression on x86_64 caused by my
recent TImode STV improvements.  Now that TImode STV can handle comparisons
such as "(set (reg:CC) (compare:CC (reg:TI) ...))" the convert_insn method
sensibly checks that the mode of the SET_DEST is TImode before setting
it to V1TImode (to avoid V1TImode appearing on the hard reg CC_FLAGS).

Hence the current code looks like:

      if (GET_MODE (dst) == TImode)
	{
	  tmp = find_reg_equal_equiv_note (insn);
	  if (tmp && GET_MODE (XEXP (tmp, 0)) == TImode)
	    PUT_MODE (XEXP (tmp, 0), V1TImode);
	  PUT_MODE (dst, V1TImode);
	  fix_debug_reg_uses (dst);
	}
      break;

which checks GET_MODE (dst) before calling PUT_MODE and, when a change
is made, also updates the REG_EQUAL note tmp if it exists.

The logical flaw (oversight) is that due to RTL sharing, the destination
of this set may already have been updated to V1TImode, as this chain is
being converted, but we still need to update any REG_EQUAL_NOTE that
still has TImode.  Hence the correct code is actually:

      if (GET_MODE (dst) == TImode)
	{
	  PUT_MODE (dst, V1TImode);
	  fix_debug_reg_uses (dst);
	}
      if (GET_MODE (dst) == V1TImode)
	{
	  tmp = find_reg_equal_equiv_note (insn);
	  if (tmp && GET_MODE (XEXP (tmp, 0)) == TImode)
	    PUT_MODE (XEXP (tmp, 0), V1TImode);
	}
      break;

While fixing this behavior, I noticed I had some indentation whitespace
issues and some vestigial dead code in this function/method that I've
taken the liberty of cleaning up (as obvious) in this patch.

2022-07-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR target/106278
	* config/i386/i386-features.cc (general_scalar_chain::convert_insn):
	Fix indentation whitespace.
	(timode_scalar_chain::fix_debug_reg_uses): Likewise.
	(timode_scalar_chain::convert_insn): Delete dead code.
	Update TImode REG_EQUAL_NOTE even if the SET_DEST is already V1TI.
	Fix indentation whitespace.
	(convertible_comparison_p): Likewise.
	(timode_scalar_to_vector_candidate_p): Likewise.

gcc/testsuite/ChangeLog
	* gcc.dg/pr106278.c: New test case.
2022-07-15 14:39:28 +01:00
Aldy Hernandez 3aab916f4f Use pp_vrange for ranges in dump_ssaname_info.
This changes the ad-hoc dumping of ranges in the gimple pretty printer
to use the pp_vrange utility function, which has the benefit of
handling all range types going forward and unifying the dumping code.

Instead of:
	# RANGE [0, 51] NONZERO 0x3f
	# RANGE ~[5, 10]

we would now get:

	# RANGE [irange] long unsigned int [0, 51] NONZERO 0x3f
	# RANGE [irange] int [-MIN, 4][11, MAX]

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-pretty-print.cc (dump_ssaname_info): Use pp_vrange.
2022-07-15 11:41:04 +02:00
Aldy Hernandez 64864aa9e6 Convert vrange dumping facilities to pretty_printer.
We need to dump global ranges from the gimple pretty printer code, but
all the vrange dumping facilities work with FILE handles.  This patch
converts all the dumping methods to work with pretty printers, and
provides a wrapper so the FILE * methods continue to work for
debugging.  I also cleaned up the code a bit.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* Makefile.in (OBJS): Add value-range-pretty-print.o.
	* pretty-print.h (pp_vrange): New.
	* value-range.cc (vrange::dump): Call pp version.
	(unsupported_range::dump): Move to its own file.
	(dump_bound_with_infinite_markers): Same.
	(irange::dump): Same.
	(irange::dump_bitmasks): Same.
	(vrange::debug): Remove.
	* value-range.h: Remove virtual designation for dump methods.
	Remove dump_bitmasks method.
	* value-range-pretty-print.cc: New file.
	* value-range-pretty-print.h: New file.
2022-07-15 11:41:03 +02:00
Aldy Hernandez 91a7f30662 Implement visitor pattern for vrange.
We frequently do operations on the various (upcoming) range types.
The cascading if/switch statements of is_a<> are getting annoying and
repetitive.

The classic visitor pattern provides a clean way to implement classes
handling various range types without the need for endless
conditionals.  It also helps us avoid polluting the vrange API with
functionality that should frankly live elsewhere.

In a follow-up patch I will add pretty printing facilities for vrange
and unify them with the dumping code.  This is a prime candidate for
the pattern, as the code isn't performance sensitive.  Other instances
(?? the dispatch code in range-ops ??) may still benefit from the hand
coded conditionals, since they elide vtables in favor of the
discriminator bit in vrange.
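
As a rough illustration of the pattern, the sketch below uses simplified,
hypothetical classes (range, int_range, unsupported, range_visitor,
range_printer, describe).  They are stand-ins chosen for this example and do
not reproduce the vrange, irange or vrange_visitor declarations.

```cpp
#include <string>

class int_range;
class unsupported;

// Hypothetical visitor interface: one visit overload per concrete range type.
class range_visitor
{
public:
  virtual ~range_visitor () = default;
  virtual void visit (const int_range &) = 0;
  virtual void visit (const unsupported &) = 0;
};

// Hypothetical base class; each subclass dispatches to the matching overload.
class range
{
public:
  virtual ~range () = default;
  virtual void accept (range_visitor &v) const = 0;
};

class int_range : public range
{
public:
  int_range (int lo, int hi) : m_lo (lo), m_hi (hi) {}
  void accept (range_visitor &v) const override { v.visit (*this); }
  int lo () const { return m_lo; }
  int hi () const { return m_hi; }
private:
  int m_lo, m_hi;
};

class unsupported : public range
{
public:
  void accept (range_visitor &v) const override { v.visit (*this); }
};

// A printer lives outside the range classes and needs no is_a<> cascade.
class range_printer : public range_visitor
{
public:
  void visit (const int_range &r) override
  {
    m_text = "[" + std::to_string (r.lo ()) + ", "
	     + std::to_string (r.hi ()) + "]";
  }
  void visit (const unsupported &) override { m_text = "[unsupported]"; }
  const std::string &text () const { return m_text; }
private:
  std::string m_text;
};

// Format any range through the visitor.
std::string describe (const range &r)
{
  range_printer p;
  r.accept (p);
  return p.text ();
}
```

Adding a new operation then means writing another visitor class rather than
adding another method to every range type.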

Tested on x86-64 Linux.

gcc/ChangeLog:

	* value-range.cc (irange::accept): New.
	(unsupported_range::accept): New.
	* value-range.h (class vrange_visitor): New.
	(class vrange): Add accept method.
	(class unsupported_range): Same.
	(class Value_Range): Same.
2022-07-15 11:41:03 +02:00
Jonathan Wakely f858fe7a8b libcpp: Improve encapsulation of label_text
This adjusts the API of label_text so that the data members are private
and cannot be modified by callers.  Add accessors for them instead, and
make the accessors const-correct.  Also rename moved_from () to the more
idiomatic release ().  Also remove the unused take_or_copy () member
function which has confusing ownership semantics.

gcc/analyzer/ChangeLog:

	* call-info.cc (call_info::print): Adjust to new label_text API.
	* checker-path.cc (checker_event::dump): Likewise.
	(region_creation_event::get_desc): Likewise.
	(state_change_event::get_desc): Likewise.
	(superedge_event::should_filter_p): Likewise.
	(start_cfg_edge_event::get_desc): Likewise.
	(call_event::get_desc): Likewise.
	(return_event::get_desc): Likewise.
	(warning_event::get_desc): Likewise.
	(checker_path::dump): Likewise.
	(checker_path::debug): Likewise.
	* diagnostic-manager.cc (diagnostic_manager::prune_for_sm_diagnostic):
	Likewise.
	(diagnostic_manager::prune_interproc_events): Likewise.
	* engine.cc (feasibility_state::maybe_update_for_edge):
	Likewise.
	* program-state.cc (sm_state_map::to_json): Likewise.
	* region-model-impl-calls.cc (region_model::impl_call_analyzer_describe): Likewise.
	(region_model::impl_call_analyzer_dump_capacity): Likewise.
	* region.cc (region::to_json): Likewise.
	* sm-malloc.cc (inform_nonnull_attribute): Likewise.
	* store.cc (binding_map::to_json): Likewise.
	(store::to_json): Likewise.
	* supergraph.cc (superedge::dump): Likewise.
	* svalue.cc (svalue::to_json): Likewise.

gcc/c-family/ChangeLog:

	* c-format.cc (class range_label_for_format_type_mismatch):
	Adjust to new label_text API.

gcc/ChangeLog:

	* diagnostic-format-json.cc (json_from_location_range): Adjust
	to new label_text API.
	* diagnostic-format-sarif.cc (sarif_builder::make_location_object):
	Likewise.
	* diagnostic-show-locus.cc (struct pod_label_text): Likewise.
	(layout::print_any_labels): Likewise.
	* tree-diagnostic-path.cc (class path_label): Likewise.
	(struct event_range): Likewise.
	(default_tree_diagnostic_path_printer): Likewise.
	(default_tree_make_json_for_path): Likewise.

libcpp/ChangeLog:

	* include/line-map.h (label_text::take_or_copy): Remove.
	(label_text::moved_from): Rename to release.
	(label_text::m_buffer, label_text::m_owned): Make private.
	(label_text::get, label_text::is_owned): New accessors.
2022-07-15 09:40:47 +01:00
konglin1 ae69e6f61b i386: Fix _mm_[u]comixx_{ss,sd} codegen and add PF result. [PR106113]
gcc/ChangeLog:

	PR target/106113
	* config/i386/i386-builtin.def (BDESC): Fix [u]comi{ss,sd}
	comparison, since the intrinsics changed over time.
	* config/i386/i386-expand.cc (ix86_ssecom_setcc):
	Add unordered check and mode for sse comi codegen.
	(ix86_expand_sse_comi): Add unordered check and check a different
	CCmode.
	(ix86_expand_sse_comi_round): Extract unordered check and mode part
	in ix86_ssecom_setcc.

gcc/testsuite/ChangeLog:

	PR target/106113
	* gcc.target/i386/avx-vcomisd-pr106113-2.c: New test.
	* gcc.target/i386/avx-vcomiss-pr106113-2.c: Ditto.
	* gcc.target/i386/avx-vucomisd-pr106113-2.c: Ditto.
	* gcc.target/i386/avx-vucomiss-pr106113-2.c: Ditto.
	* gcc.target/i386/sse-comiss-pr106113-1.c: Ditto.
	* gcc.target/i386/sse-comiss-pr106113-2.c: Ditto.
	* gcc.target/i386/sse-ucomiss-pr106113-1.c: Ditto.
	* gcc.target/i386/sse-ucomiss-pr106113-2.c: Ditto.
	* gcc.target/i386/sse2-comisd-pr106113-1.c: Ditto.
	* gcc.target/i386/sse2-comisd-pr106113-2.c: Ditto.
	* gcc.target/i386/sse2-ucomisd-pr106113-1.c: Ditto.
	* gcc.target/i386/sse2-ucomisd-pr106113-2.c: Ditto.
2022-07-15 10:29:37 +08:00
Prathamesh Kulkarni 4cbebddc2c [aarch64] Use op_mode instead of vmode in aarch64_vectorize_vec_perm_const.
gcc/ChangeLog:
	* config/aarch64/aarch64.cc (aarch64_vectorize_vec_perm_const): Use
	op_mode instead of vmode in calls to force_reg for op0 and op1.
2022-07-15 06:26:50 +05:30