The following avoids collecting all loop exit blocks into bitmaps
and computing the union of those up the loop tree possibly repeatedly.
Instead we make sure to do this only once for each loop with a
definition possibly requiring a LC phi node plus make sure to
leverage recorded exits to avoid the intermediate bitmap allocation.
* tree-ssa-loop-manip.cc (compute_live_loop_exits): Take
the def loop exit block bitmap as argument instead of
re-computing it here.
(add_exit_phis_var): Adjust.
(loop_name_cmp): New function.
(add_exit_phis): Sort variables to insert LC PHI nodes
after definition loop, for each definition loop compute
the exit block bitmap once.
(get_loops_exit): Remove.
(rewrite_into_loop_closed_ssa_1): Do not pre-record
all loop exit blocks into bitmaps. Record loop exits
if required.
gcc/ChangeLog:
* config/mips/mips.cc (mips_asan_shadow_offset): Reformat
to handle the N32 ABI.
* config/mips/mips.h (SUBTARGET_SHADOW_OFFSET): Remove
the macro, as it is not needed anymore.
As discussed on PR c++/53431, currently, "#pragma GCC diagnostic" does
not always take effect for diagnostics generated by libcpp. The reason
is that libcpp itself does not interpret this pragma and only sends it on
to the frontend, hence the pragma is only honored if the frontend
arranges for it. The C frontend does process the pragma immediately
(more or less) after seeing the token, so things work fine there. The PR
points out that it doesn't work for C++, because the C++ frontend
doesn't handle anything until it has read all the tokens from
libcpp. The underlying problem is not C++-specific, though, and for
instance, gcc -E has the same issue.
This commit fixes the PR by adding the concept of an early pragma handler that
can be registered by frontends, which gives them a chance to process
diagnostic pragmas from libcpp before it is too late for them to take
effect. The C++ and preprocess-only frontends are modified to use early
pragmas and correct the behavior.
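As a rough illustration of the kind of code affected (a sketch only, not
one of the new tests; NOT_DEFINED_ANYWHERE, BRANCH and undef_branch_taken
are made-up names), the -Wundef warning below is issued by libcpp while it
evaluates the #if, so the pragma can only silence it if the frontend has
processed the pragma by then:

```c
/* Compile with -Wundef.  All names here are hypothetical and exist
   only for this illustration.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wundef"
#if NOT_DEFINED_ANYWHERE   /* -Wundef is raised by libcpp, not the parser.  */
# define BRANCH 1
#else
# define BRANCH 0
#endif
#pragma GCC diagnostic pop

/* An undefined identifier evaluates to 0 in #if, so this returns 0.  */
int
undef_branch_taken (void)
{
  return BRANCH;
}
```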
gcc/c-family/ChangeLog:
PR preprocessor/53920
PR c++/53431
* c-common.cc (c_option_is_from_cpp_diagnostics): New function.
* c-common.h (c_option_is_from_cpp_diagnostics): Declare.
(c_pp_stream_token): Declare.
* c-ppoutput.cc (init_pp_output): Refactor logic about skipping
pragmas to...
(should_output_pragmas): ...here. New function.
(token_streamer::stream): Support handling early pragmas.
(do_line_change): Likewise.
(c_pp_stream_token): New function.
* c-pragma.cc (struct pragma_diagnostic_data): New helper class.
(pragma_diagnostic_lex_normal): New function. Moved logic for
interpreting GCC diagnostic pragmas here.
(pragma_diagnostic_lex_pp): New function for parsing diagnostic pragmas
directly from libcpp.
(handle_pragma_diagnostic): Refactor into helper function...
(handle_pragma_diagnostic_impl): ...here. New function.
(handle_pragma_diagnostic_early): New function.
(handle_pragma_diagnostic_early_pp): New function.
(struct pragma_ns_name): Renamed to...
(struct pragma_pp_data): ...this. Add new "early_handler" member.
(c_register_pragma_1): Support early pragmas in the preprocessor.
(c_register_pragma_with_early_handler): New function.
(c_register_pragma): Support the new early handlers in struct
internal_pragma_handler.
(c_register_pragma_with_data): Likewise.
(c_register_pragma_with_expansion): Likewise.
(c_register_pragma_with_expansion_and_data): Likewise.
(c_invoke_early_pragma_handler): New function.
(c_pp_invoke_early_pragma_handler): New function.
(init_pragma): Add early pragma support for diagnostic pragmas.
* c-pragma.h (struct internal_pragma_handler): Add new early handler
members.
(c_register_pragma_with_early_handler): Declare.
(c_invoke_early_pragma_handler): Declare.
(c_pp_invoke_early_pragma_handler): Declare.
gcc/cp/ChangeLog:
PR c++/53431
* parser.cc (cp_parser_pragma_kind): Move earlier in the file.
(cp_lexer_handle_early_pragma): New function.
(cp_lexer_new_main): Support parsing and handling early pragmas.
(c_parse_file): Adapt to changes in cp_lexer_new_main.
gcc/testsuite/ChangeLog:
PR preprocessor/53920
PR c++/53431
* c-c++-common/pragma-diag-11.c: New test.
* c-c++-common/pragma-diag-12.c: New test.
* c-c++-common/pragma-diag-13.c: New test.
The D front-end does not use exceptions, but it still requires RTTI for
some lowerings of convenience language features. Enforce it by
building with `-fno-exceptions'.
gcc/d/ChangeLog:
* Make-lang.in (NOEXCEPTION_DFLAGS): Define.
(ALL_DFLAGS): Add NOEXCEPTION_DFLAGS.
This patch adds a testcase for passing a closed fd to a function
that does not emit any warning.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fd-4.c: Add a new testcase to demonstrate
passing of a closed file descriptor to a function that does
not emit any warning.
Signed-off-by: Immad Mir <mirimmad@outlook.com>
This patch reorders the initialization of state m_invalid in sm-fd.cc
so that the order of initializers is the same as the ordering of the
fields in the class decl.
gcc/analyzer/ChangeLog:
PR analyzer/106184
* sm-fd.cc (fd_state_machine): Change ordering of initialization
of state m_invalid so that the order of initializers is the same as
the ordering of the fields in the class decl.
Signed-off-by: Immad Mir <mirimmad@outlook.com>
This patch saves the "close" event in use_after_close diagnostic
and shows it where possible.
gcc/analyzer/ChangeLog:
* sm-fd.cc (use_after_close): Save the "close" event and
show it where possible.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fd-4.c (test_3): Change the message note to
conform to the changes in analyzer/sm-fd.cc.
(test_4): Likewise.
Signed-off-by: Immad Mir <mirimmad@outlook.com>
Makefile cleanup; behaviour is unaffected.
gcc/ada/
* gcc-interface/Make-lang.in (ada/generated/gnatvsn.ads):
Simplify regular expression. The "interval expression",
i.e. \{8\} is part of the POSIX regular expressions, so it
should not be a problem for modern implementations of sed.
This avoids using a full access for constants internally generated from
assignments of aggregates with a Volatile_Full_Access type.
gcc/ada/
* gcc-interface/gigi.h (simple_constant_p): Declare.
* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Variable>: Strip
the qualifiers from the type of a simple constant.
(simple_constant_p): New predicate.
* gcc-interface/trans.cc (node_is_atomic): Return true for objects
with atomic type except for simple constants.
(node_is_volatile_full_access): Return false for simple constants
with VFA type.
gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Variable>: Create a
local constant holding the underlying GNAT type of the object. Do
not fiddle with the object size for an unconstrained array.
Fix detection of non-preelaborable constructs for checking SPARK
elaboration rules, which was tagging deferred constant declarations as
not preelaborable.
gcc/ada/
* sem_util.adb (Is_Non_Preelaborable_Construct): Fix for
deferred constants.
This patch corrects an error in the compiler whereby a buffer sizing
error fails to get raised when compiling a regular expression with an
insufficiently sized Pattern_Matcher, contrary to what the documentation
indicates.
This, in turn, could lead to indexing errors when attempting to call
Match with the malformed regex program buffer.
gcc/ada/
* libgnat/s-regpat.adb, libgnat/s-regpat.ads (Compile): Add a
new defaulted parameter Error_When_Too_Small to trigger an
error, if specified true, when Matcher is too small to hold the
compiled regex program.
This patch corrects an error in the compiler whereby a function call in
prefix notation within a class condition causes a spurious error
claiming the name in the call is a non-callable entity when there exists
a type extension in the same unit extended with a component featuring
the same name as the function in question.
gcc/ada/
* sem_ch4.adb (Analyze_Selected_Component): Add condition to
avoid interpreting derived type components as candidates for
selected components in preanalysis of inherited class
conditions.
This adds support in GNAT for ghost generic formal parameters, as
included in SPARK RM 6.9.
gcc/ada/
* ghost.adb (Check_Ghost_Context): Delay checking for generic
associations.
(Check_Ghost_Context_In_Generic_Association): Perform ghost
checking in analyzed generic associations.
(Check_Ghost_Formal_Procedure_Or_Package): Check SPARK RM
6.9(13-14) for formal procedures and packages.
(Check_Ghost_Formal_Variable): Check SPARK RM 6.9(13-14) for
variables.
* ghost.ads: Declarations for the above.
* sem_ch12.adb (Analyze_Associations): Apply delayed checking
for generic associations.
(Analyze_Formal_Object_Declaration): Same.
(Analyze_Formal_Subprogram_Declaration): Same.
(Instantiate_Formal_Package): Same.
(Instantiate_Formal_Subprogram): Same.
(Instantiate_Object): Same. Copy ghost aspect to newly declared
object for actual for IN formal object. Use new function
Get_Enclosing_Deep_Object to retrieve root object.
(Instantiate_Type): Copy ghost aspect to declared subtype for
actual for formal type.
* sem_prag.adb (Analyze_Pragma): Recognize new allowed
declarations.
* sem_util.adb (Copy_Ghost_Aspect): Copy the ghost aspect
between nodes.
(Get_Enclosing_Deep_Object): New function to return enclosing
deep object (or root for reachable part).
* sem_util.ads (Copy_Ghost_Aspect): Same.
(Get_Enclosing_Deep_Object): Same.
* libgnat/s-imageu.ads: Declare formal subprograms as ghost.
* libgnat/s-valuei.ads: Same.
* libgnat/s-valuti.ads: Same.
The compiler does not report an error on a type conversion to/from a
tagged type whose parent type is an interface type and there is no
relationship between the source and target types. This bug has been
dormant since January/2016.
This patch also improves the text of errors reported on interface type
conversions suggesting how to fix these errors.
gcc/ada/
* sem_res.adb (Resolve_Type_Conversion): Code cleanup since the
previous static check has been moved to Valid_Tagged_Conversion.
(Valid_Tagged_Conversion): Fix the code checking conversion
to/from interface types since it incorrectly returns True when the
parent type of the operand type (or the target type) is an
interface type; add missing static checks on interface type
conversions.
To accommodate cases where objects allocated on the secondary stack
needed a more constrained alignment than Standard'Maximum_Alignment,
the alignment of all allocations in the full runtime was forced to
Standard'Maximum_Alignment*2. This change removes this
workaround and correctly handles the over-alignment in all runtimes.
This change modifies the SS_Allocate procedure to accept a new Alignment
parameter and to dynamically realign the pointer returned by the memory
allocation (Allocate_* functions or dedicated stack allocations for
zfp/cert).
It also simplifies the 0-sized allocations by not allocating any memory
if pointer is already correctly aligned (already the case in cert and
zfp runtimes).
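The realignment step can be pictured with the following C sketch. It is
not the Ada runtime code: align_addr and padded_size are hypothetical
names, and the padding rule is just one way to guarantee that a
realigned pointer still fits, assuming power-of-two alignments.

```c
#include <stddef.h>
#include <stdint.h>

/* Round ADDR up to the next multiple of ALIGN.  ALIGN must be a power
   of two.  Hypothetical C counterpart of the idea behind Align_Addr.  */
uintptr_t
align_addr (uintptr_t addr, uintptr_t align)
{
  return (addr + align - 1) & ~(align - 1);
}

/* Bytes to request from an allocator that only guarantees BASE_ALIGN so
   that SIZE bytes remain available after realigning to ALIGN.  This
   padding rule is an assumption of the sketch, not taken from the
   patch.  Both alignments must be powers of two.  */
size_t
padded_size (size_t size, size_t align, size_t base_align)
{
  return align > base_align ? size + (align - base_align) : size;
}
```

An address that is already a multiple of ALIGN is returned unchanged, so
no padding is consumed in that case.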
gcc/ada/
* libgnat/s-secsta.ads (SS_Allocate): Add new Alignment
parameter.
(Memory_Alignment): Remove.
* libgnat/s-secsta.adb (Align_Addr): New.
(SS_Allocate): Add new Alignment parameter. Realign pointer if
needed. Don't allocate anything for 0-sized allocations.
* gcc-interface/utils2.cc (build_call_alloc_dealloc_proc): Add
allocated object's alignment as last parameter to allocation
invocation.
A cleanup opportunity spotted while working on improved detection of
uninitialised local scalar objects.
gcc/ada/
* libgnat/g-socket.adb (Get_Address_Info): Reduce scope of the
Found variable; avoid repeated assignment inside the loop.
Only smp runtimes are built for vxworks7*, even though the -smp suffix
is removed during install. Therefore, in general, the build macros for
the non-smp runtimes are superfluous except on the legacy ppc-vxworks6
target where both the smp and non-smp runtime are built. Lastly, an
error message is added if a runtime build is commanded that doesn't
exist, rather than letting the build mysteriously fail.
gcc/ada/
* Makefile.rtl [arm,aarch64 vxworks7]: Remove rtp and kernel
build macros and set an error variable if needed.
[x86,x86_vxworks7]: Likewise.
[ppc,ppc64]: Set an error variable if needed.
(rts-err): New phony Makefile target.
(setup-rts): Depend on rts-err.
The compiler aborts with an internal error in gigi, but the problem is an
itype incorrectly shared between several branches of an if_statement that
has been created for a Build-In-Place return.
Three branches of this if_statement contain an allocator statement and
the latter two have been obtained as the result of calling New_Copy_Tree
on the first; now the initialization expression of the first had also been
obtained as the result of calling New_Copy_Tree on the original tree, and
these chained calls to New_Copy_Tree run afoul of an issue with the copy
of itypes after the rewrite of an aggregate as an expression with actions.
Fixing this issue looks quite delicate, so this fixes the incorrect sharing
by replacing the chained calls to New_Copy_Tree with repeated calls on the
original expression, which is more elegant in any case.
gcc/ada/
* exp_ch3.adb (Make_Allocator_For_BIP_Return): New local function.
(Expand_N_Object_Declaration): Use it to build the three allocators
for a Build-In-Place return with an unconstrained type. Update the
head comment after other recent changes.
The powerpc e500 port has been LTS'd
gcc/ada/
* libgnat/system-vxworks7-e500-kernel.ads: Remove.
* libgnat/system-vxworks7-e500-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-e500-rtp.ads: Likewise.
This patch corrects an error in the compiler whereby no
Corresponding_Spec was set for emptied CUDA global subprograms - leading
to a malformed tree.
gcc/ada/
* gnat_cuda.adb (Empty_CUDA_Global_Subprogram): Set
Specification and Corresponding_Spec to match the original
Kernel_Body.
Respect a comment in sinfo.ads, which says: "Unchecked type conversion
nodes should be created by calling Tbuild.Unchecked_Convert_To, rather
than by directly calling Nmake.Make_Unchecked_Type_Conversion."
No test appears to be affected by this change, so this is just a
cleanup.
gcc/ada/
* exp_ch6.adb (Build_Static_Check_Helper_Call): Replace explicit
call to Make_Unchecked_Type_Conversion with a call to
Unchecked_Convert_To.
* tbuild.adb (Unchecked_Convert_To): Fix whitespace.
It comes from the Volatile_Full_Access (or Atomic) aspect: the aggregate is
effectively analyzed/resolved twice and this does not work. It is fixed by
calling Is_Full_Access_Aggregate before resolution.
gcc/ada/
* exp_aggr.adb (Expand_Record_Aggregate): Do not call
Is_Full_Access_Aggregate here.
* freeze.ads (Is_Full_Access_Aggregate): Delete.
* freeze.adb (Is_Full_Access_Aggregate): Move to...
(Freeze_Entity): Do not call Is_Full_Access_Aggregate here.
* sem_aggr.adb (Is_Full_Access_Aggregate): ...here.
(Resolve_Aggregate): Call Is_Full_Access_Aggregate here.
-fanalyzer handles -ftrivial-auto-var-init= by special-casing
IFN_DEFERRED_INIT to be a no-op, so that e.g.:
len_2 = .DEFERRED_INIT (4, 2, &"len"[0]);
is treated as a no-op, so that len_2 is still uninitialized after the
stmt.
PR analyzer/106204 reports that -fanalyzer gives false positives from
-Wanalyzer-use-of-uninitialized-value on locals that have their address
taken, due to e.g.:
_1 = .DEFERRED_INIT (4, 2, &"len"[0]);
len = _1;
where -fanalyzer leaves _1 uninitialized, and then complains about
the assignment to "len".
Fixed thusly by suppressing the warning when assigning from such SSA
names.
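A minimal sketch of the source shape involved, assuming a build with
-fanalyzer -ftrivial-auto-var-init=zero. The names get_len and test_1 are
made up for this illustration, and no claim is made that this exact
snippet reproduces the report:

```c
/* get_len and test_1 are hypothetical names for this sketch.  */
static int
get_len (unsigned *len)
{
  *len = 42;
  return 0;
}

int
test_1 (void)
{
  unsigned len;   /* Its address is taken below, so the automatic
                     initialization goes through an SSA temporary
                     before being stored to "len".  */
  int rc = get_len (&len);
  if (!rc)
    rc = (int) len;
  return rc;
}
```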
gcc/analyzer/ChangeLog:
PR analyzer/106204
* region-model.cc (within_short_circuited_stmt_p): Move extraction
of assign_stmt to caller.
(due_to_ifn_deferred_init_p): New.
(region_model::check_for_poison): Move extraction of assign_stmt
from within_short_circuited_stmt_p to here. Share logic with
call to due_to_ifn_deferred_init_p.
gcc/testsuite/ChangeLog:
PR analyzer/106204
* gcc.dg/analyzer/torture/uninit-pr106204.c: New test.
* gcc.dg/analyzer/uninit-pr106204.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This testcase demonstrates that my assumption that we would only be
interested in a class template lookup if the template-id is followed by ::
was wrong.
PR c++/106179
PR c++/106024
gcc/cp/ChangeLog:
* parser.cc (cp_parser_lookup_name): Remove :: requirement
for using unqualified lookup result.
gcc/testsuite/ChangeLog:
* g++.dg/template/operator16.C: New test.
Add a comment next to the getpid call to explain why the typecast is
needed.
for libstdc++-v3/ChangeLog
* testsuite/util/testsuite_fs.h (nonexistent_path): Explain
why we need the typecast.
The <https://gcc.gnu.org/pipermail/gcc/2022-May/238679.html> thread
seems to have concluded that -Wformat shouldn't warn about
printf((const char*) u8"test %d\n", 1);
saying "format string is not an array of type 'char'". This code
is not an aliasing violation, and there are no I/O functions for u8
strings, so the const char * cast is OK and shouldn't be disregarded.
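For illustration, the cast pattern looks like this in plain C. This is a
sketch only: format_u8 is a hypothetical helper, and the diagnostic in
question concerns C++, where u8 literals have a distinct character type;
in C up to C17 the literal is already an array of char.

```c
#include <stdio.h>

/* Format VALUE into BUF through a u8 literal cast to const char *.
   format_u8 is a made-up helper for this sketch.  */
int
format_u8 (char *buf, size_t size, int value)
{
  return snprintf (buf, size, (const char *) u8"test %d\n", value);
}
```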
PR c++/105626
gcc/c-family/ChangeLog:
* c-format.cc (check_format_arg): Don't emit -Wformat warnings with
u8 strings.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wformat-char8_t-1.C: New test.
Provide a relation oracle API which validates a relation between 2 ranges.
This allows relation queries that are symbolically true to be overridden
by range specific information, i.e. x == x is true symbolically, but for
floating point a NaN may invalidate this assumption.
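The NaN case is easy to see in a small C sketch (self_equal and
nan_is_self_equal are made-up names; this assumes IEEE semantics, i.e. no
-ffast-math):

```c
#include <math.h>

/* Symbolically x == x always holds; for floating point it does not
   hold when x is a NaN.  */
int
self_equal (double x)
{
  return x == x;
}

/* Returns 0 under IEEE comparison rules.  */
int
nan_is_self_equal (void)
{
  return self_equal (NAN);
}
```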
* value-relation.cc (relation_to_code): New vector.
(relation_oracle::validate_relation): New.
(set_relation): Allow ssa1 == ssa2 to be registered.
* value-relation.h (validate_relation): New prototype.
(query_relation): Make internal variant protected.
This patch extends the earlier and;cmp to not;test optimization to also
perform this transformation for TImode on TARGET_64BIT and DImode on -m32.
One motivation for this is that it's a step to fixing the current failure
of gcc.target/i386/pr65105-5.c on -m32.
A more direct benefit for x86_64 is that the following code:
int foo(__int128 x, __int128 y)
{
  return (x & y) == y;
}
improves with -O2 from 15 instructions:
movq %rdi, %r8
movq %rsi, %rax
movq %rax, %rdi
movq %r8, %rsi
movq %rdx, %r8
andq %rdx, %rsi
andq %rcx, %rdi
movq %rsi, %rax
movq %rdi, %rdx
xorq %r8, %rax
xorq %rcx, %rdx
orq %rdx, %rax
sete %al
movzbl %al, %eax
ret
to the slightly better 13 instructions:
movq %rdi, %r8
movq %rsi, %rax
movq %r8, %rsi
movq %rax, %rdi
notq %rsi
notq %rdi
andq %rdx, %rsi
andq %rcx, %rdi
movq %rsi, %rax
orq %rdi, %rax
sete %al
movzbl %al, %eax
ret
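The identity behind the transformation can be checked in portable C.
This sketch uses uint64_t rather than __int128 to stay portable, and the
function names are made up:

```c
#include <stdint.h>

/* Original form: and followed by a compare against y.  */
int
subset_and_cmp (uint64_t x, uint64_t y)
{
  return (x & y) == y;
}

/* Rewritten form: not/andn followed by a test against zero.  Every bit
   set in y must also be set in x, so ~x & y has no bits left.  */
int
subset_not_test (uint64_t x, uint64_t y)
{
  return (~x & y) == 0;
}
```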
2022-07-05 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.cc (ix86_rtx_costs) <COMPARE>: Provide costs
for double word comparisons and tests (comparisons against zero).
* config/i386/i386.md (*test<mode>_not_doubleword): Split DWI
and;cmp into andn;cmp $0 as a pre-reload splitter.
(*andn<dwi>3_doubleword_bmi): Use <dwi> instead of <mode> in name.
(*<any_or><dwi>3_doubleword): Likewise.
gcc/testsuite/ChangeLog
* gcc.target/i386/testnot-3.c: New test case.
This patch is a follow-up to Hongtao's fix for PR target/105854. That
fix is perfectly correct, but the thing that caught my eye was why
the compiler was generating a shift by zero at all. Digging deeper, it
turns out that we can easily optimize __builtin_ia32_palignr for
alignments of 0 and 64 respectively, which may be simplified to moves
of the highpart and lowpart respectively.
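Why those two shift counts degenerate into plain moves can be seen from
a portable model of the 64-bit operation. This is a sketch, not the
builtin: palignr64_model is a made-up name, the shift is given in bits,
and the operand naming (a in the upper half, b in the lower half of the
concatenation) is this sketch's own.

```c
#include <stdint.h>

/* Concatenate A:B into a 128-bit value, shift it right by SHIFT bits
   and keep the low 64 bits.  Hypothetical model for illustration.  */
uint64_t
palignr64_model (uint64_t a, uint64_t b, unsigned shift)
{
  if (shift == 0)
    return b;                   /* Nothing shifted: plain move of B.  */
  if (shift == 64)
    return a;                   /* B fully shifted out: plain move of A.  */
  if (shift < 64)
    return (a << (64 - shift)) | (b >> shift);
  if (shift < 128)
    return a >> (shift - 64);
  return 0;                     /* Everything shifted out.  */
}
```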
After adding optimizations to simplify the 64-bit DImode palignr, I
started to add the corresponding optimizations for vpalignr (i.e.
128-bit). The first oddity is that sse.md uses TImode and a special
SSESCALARMODE iterator, rather than V1TImode, and indeed the comment
above SSESCALARMODE hints that this should be "dropped in favor of
VIMAX_AVX2_AVX512BW". Hence this patch includes the migration of
<ssse3_avx2>_palignr<mode> to use VIMAX_AVX2_AVX512BW, basically
using V1TImode instead of TImode for 128-bit palignr.
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-,32},
with no new failures. Ok for mainline?
2022-07-05 Roger Sayle <roger@nextmovesoftware.com>
Hongtao Liu <hongtao.liu@intel.com>
gcc/ChangeLog
* config/i386/i386-builtin.def (__builtin_ia32_palignr128): Change
CODE_FOR_ssse3_palignrti to CODE_FOR_ssse3_palignrv1ti.
* config/i386/i386-expand.cc (expand_vec_perm_palignr): Use V1TImode
and gen_ssse3_palignrv1ti instead of TImode.
* config/i386/sse.md (SSESCALARMODE): Delete.
(define_mode_attr ssse3_avx2): Handle V1TImode instead of TImode.
(<ssse3_avx2>_palignr<mode>): Use VIMAX_AVX2_AVX512BW as a mode
iterator instead of SSESCALARMODE.
(ssse3_palignrdi): Optimize cases where operands[3] is 0 or 64,
using a single move instruction (if required).
gcc/testsuite/ChangeLog
* gcc.target/i386/ssse3-palignr-2.c: New test case.
This patch addresses PR rtl-optimization/96692 on x86_64, by providing
a set of combine splitters to convert the three operation ((A|B)^C)^D
into a two operation sequence using andn when either A or B is the same
register as C or D. This is essentially a reassociation problem that's
only a win if the target supports an and-not instruction (as with -mbmi).
Hence for the new test case:
int f(int a, int b, int c)
{
  return (a ^ b) ^ (a | c);
}
GCC on x86_64-pc-linux-gnu with -O2 -mbmi would previously generate:
xorl %edi, %esi
orl %edx, %edi
movl %esi, %eax
xorl %edi, %eax
ret
but with this patch now generates:
andn %edx, %edi, %eax
xorl %esi, %eax
ret
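The reassociation relies on a ^ (a | c) being equal to c & ~a bit for
bit, which matches the andn/xorl pair above. A small C sketch of both
forms (f_orig and f_andn are made-up names):

```c
/* Three operations, as in the new test case.  */
int
f_orig (int a, int b, int c)
{
  return (a ^ b) ^ (a | c);
}

/* Two operations: where a bit of a is set, a ^ (a | c) is 0; where it
   is clear, the result is the bit of c.  That is c & ~a.  */
int
f_andn (int a, int b, int c)
{
  return (c & ~a) ^ b;
}
```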
2022-07-05 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR rtl-optimization/96692
* config/i386/i386.md (define_split): Split ((A | B) ^ C) ^ D
as (X & ~Y) ^ Z on target BMI when either C or D is A or B.
gcc/testsuite/ChangeLog
PR rtl-optimization/96692
* gcc.target/i386/bmi-andn-4.c: New test case.
Like macro locations, we only need to emit ordinary location
information for locations emitted into the CMI. This adds a hash table
noting which ordinary lines are needed. These are then sorted and
(sufficiently) adjacent lines are coalesced to a single map. There is
a tradeoff here: allowing greater separation reduces the number of
line maps, but increases the number of locations. It appears allowing
2 or 3 intervening lines is the sweet spot, and this patch chooses 2.
Compiling a hello-world #including <iostream> in its GMF gives a
reduction in number of locations of 5 fold, but an increase in number
of maps about 4 fold. Examining one of the xtreme-header tests we
halve the number of locations and increase the number of maps by 9
fold.
Module interfaces that emit no entities (or macros, if a header-unit),
will now have no location tables.
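The sort-and-coalesce step can be sketched in C as follows. This is an
illustration under stated assumptions, not the module.cc implementation:
struct line_span, cmp_unsigned and coalesce_lines are hypothetical names,
and plain line numbers stand in for the real location data.

```c
#include <stddef.h>
#include <stdlib.h>

/* A run of source lines covered by one map (hypothetical type).  */
struct line_span
{
  unsigned first;
  unsigned last;
};

static int
cmp_unsigned (const void *a, const void *b)
{
  unsigned x = *(const unsigned *) a;
  unsigned y = *(const unsigned *) b;
  return (x > y) - (x < y);
}

/* Sort LINES and merge neighbours separated by at most MAX_GAP unused
   lines.  Writes the spans to OUT (room for N entries) and returns how
   many were produced.  */
size_t
coalesce_lines (unsigned *lines, size_t n, unsigned max_gap,
                struct line_span *out)
{
  size_t count = 0;
  if (n == 0)
    return 0;
  qsort (lines, n, sizeof *lines, cmp_unsigned);
  for (size_t i = 0; i < n; i++)
    {
      /* LINES is sorted, so the subtraction cannot wrap.  */
      if (count > 0 && lines[i] - out[count - 1].last <= max_gap + 1)
        out[count - 1].last = lines[i];
      else
        {
          out[count].first = lines[i];
          out[count].last = lines[i];
          count++;
        }
    }
  return count;
}
```

With a larger MAX_GAP fewer spans come out, but each span covers more
lines that were never needed, which is the tradeoff described above.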
gcc/cp/
* module.cc
(struct ord_loc_info, ord_loc_traits): New.
(ord_loc_table, ord_loc_remap): New globals.
(struct location_map_info): Delete.
(struct module_state_config): Rename ordinary_loc_align to
loc_range_bits.
(module_for_ordinary_loc): Adjust.
(module_state::note_location): Note ordinary locations,
return bool.
(module_state::write_location): Adjust ordinary location
streaming.
(module_state::read_location): Likewise.
(module_state::write_init_maps): Allocate ord_loc_table.
(module_state::write_prepare_maps): Reimplement ordinary
map preparation.
(module_state::read_prepare_maps): Adjust.
(module_state::write_ordinary_maps): Reimplement.
(module_state::write_macro_maps): Adjust.
(module_state::read_ordinary_maps): Reimplement.
(module_state::write_macros): Adjust.
(module_state::write_config): Adjust.
(module_state::read_config): Adjust.
(module_state::write_begin): Adjust.
(module_state::read_initial): Adjust.
gcc/testsuite/
* g++.dg/modules/loc-prune-1.C: Adjust.
* g++.dg/modules/loc-prune-4.C: New.
* g++.dg/modules/pr98718_a.C: Adjust.
* g++.dg/modules/pr98718_b.C: Adjust.
* g++.dg/modules/pr99072.H: Adjust.
This is another case like PR106182 where for the 2nd testcase in
the bug there are no removed or discovered loops but still changing
loop exits invalidates LC SSA and it is not enough to just scan for
uses in the blocks that changed loop depth. One might argue that
if we'd include former exit destinations we'd pick up the original
LC SSA use but for virtuals on block merging we'd have propagated
those out (while for regular uses we insert copies). CFG cleanup
can also be entered with loops needing fixup so any heuristics
based on loop structure are bound to fail.
PR tree-optimization/106198
* tree-cfgcleanup.cc (repair_loop_structures): Always do a
full LC SSA rewrite but only if any blocks changed loop
depth.
* gcc.dg/pr106198.c: New testcase.
The following removes the now unused per-loop path in LC SSA rewrite.
* tree-ssa-loop-manip.cc (find_uses_to_rename_def): Remove.
(find_uses_to_rename_in_loop): Likewise.
(rewrite_into_loop_closed_ssa_1): Remove loop parameter and
uses.
(rewrite_into_loop_closed_ssa): Adjust.
The code to remove LC PHI nodes in clean_up_loop_closed_phi does not handle
virtual operands because may_propagate_copy generally returns false
for them. The following copies the merge_blocks variant for
dealing with them.
This fixes a missed jump threading in gcc.dg/auto-init-uninit-4.c
which manifests in bogus uninit diagnostics.
PR tree-optimization/106186
* tree-ssa-propagate.cc (clean_up_loop_closed_phi):
Properly handle virtual PHI nodes.
The following properly handles aggregate returns of the const marked
STORE_LANES internal function to update virtual SSA form on-the-fly
rather than relying on a costly virtual SSA rewrite.
PR tree-optimization/106196
* tree-vect-stmts.cc (vect_finish_stmt_generation): Properly
handle aggregate returns of calls for VDEF updates.
* gcc.dg/torture/pr106196.c: New testcase.
The final loop IV use after the loop is not put in LC SSA
(and not simplified _2 = _3 - 0 stmts are inserted). In particular
since it splits the exit edge when there's a virtual PHI in the
destination it breaks virtual LC SSA form (but likely also
non-virtual).
The following properly inserts LC PHIs instead.
2022-07-04 Richard Biener <rguenther@suse.de>
* tree-vect-loop-manip.cc (vect_set_loop_condition_normal):
Maintain LC SSA.