Making GENERIC64 a special case was never correct; prior to the
generalization of ".arch .no*" to cover all ISA extensions other
processor families supporting long NOPs should have been covered as
well. When introducing ".arch .nonops" (among others) it wasn't
apparent that a hidden implication of .cpunop not being possible to
separately turn off existed here. Seeing that the two large case label
blocks in the 2nd switch() already had identical behavior, simply
collapse all of the (useful) case labels into a single "default" one.
Since we don't key the NOP selection to user-controlled properties, we
may not use i386 features; otherwise we would violate a possible .arch
directive restricting ISA to pre-386.
Except for the shared 1- and 2-byte cases, the LEA uses corrupt %rsi
(by zero-extending %esi to %rsi). Introduce separate 64-bit patterns
which keep %rsi intact.
What matters is what was in effect at the time the original directive
was issued. Later changes to global state (bitness or ISA) must not
affect what code is generated.
The recorded value, and not the global variable, will want using in
TC_FRAG_INIT(). The so far file scope variable therefore needs to become
external, to be accessible there.
This patch makes a cosmetic change to the reloc_weaksym.s
by making the bneid instruction all lower case like all of
the other instructions in the example.
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
This patch adds the R_MICROBLAZE_32_NONE relocation type.
This is a 32-bit reloc that stores the 32-bit pc relative
value in two words (with an imm instruction).
Add test case to gas test suite.
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
There is currently a bug in the bit masking for the barrel shift
instructions because the bit mask is not including all of the
register bits which must be zero. With this patch, the disassembler
can be sure that the 32-bit value is indeed a barrel shift instruction
and not a data value in memory.
This fix can be verified by assembling and disassembling the following:
.text
.long 0x65005f5f
With this patch, the bug is fixed, and the objdump will know that
0x65005f5f is not a barrel shift instruction.
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
According to the commit 51498ab9abc6, the q extension was no longer allowed
for rv32 since version 2.2. Therefore, make sure the version of q is larger
than 2.2, in case the new extension conflict breaks the toolchain regressions,
which built with the old -misa-spec.
gas/
* testsuite/gas/riscv/zfa-zvfh.d: Set q to v2.2.
* testsuite/gas/riscv/zfa.d: Likewise.
This patch adds new gas tests for the
microblaze bsefi and bsifi instructions.
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
Since RV32E and RV64E are now ratified, this commit prepares the ABI
support for LP64E (LP64 with reduced GPRs).
gas/ChangeLog:
* config/tc-riscv.c (riscv_set_abi_by_arch): Update the error
message. (md_parse_option): Accept "lp64e".
* doc/c-riscv.texi: Update the documentation to allow "lp64e".
* testsuite/gas/riscv/mabi-fail-rv32e-lp64f.l:
Change error message.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64d.l: Likewise.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64q.l: Likewise.
Since RV32E *and* RV64E are ratified, RV64E is no longer invalid.
This commit removes a restriction that prevents making base ISA with
reduced GPRs with XLEN > 32.
bfd/ChangeLog:
* elfxx-riscv.c (riscv_parse_check_conflicts): Remove RV64E
conflict since the ratified 'E' base ISAs include RV64E.
gas/ChangeLog:
* testsuite/gas/riscv/march-fail-base-02.d: Removed.
* testsuite/gas/riscv/march-fail-base-02.l: Removed.
This patches adds new bsefi and bsifi instructions.
BSEFI- The instruction shall extract a bit field from a
register and place it right-adjusted in the destination register.
The other bits in the destination register shall be set to zero.
BSIFI- The instruction shall insert a right-adjusted bit field
from a register at another position in the destination register.
The rest of the bits in the destination register shall be unchanged.
Further documentation of these instructions can be found here:
https://docs.xilinx.com/v/u/en-US/ug984-vivado-microblaze-ref
With version 6 of the patch, no new relocation types are added as
this was unnecessary for adding the bsefi and bsifi instructions.
FIXED: Segfault caused by incorrect termination of microblaze_opcodes.
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Ibai Erkiaga <ibai.erkiaga-elorza@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
For the instructions of R_LARCH_B16/B21, if the immediate overflow,
add a B instruction and R_LARCH_B26 relocation.
For example:
.L1
...
blt $t0, $t1, .L1
R_LARCH_B16
change to:
.L1
...
bge $t0, $t1, .L2
b .L1
R_LARCH_B26
.L2
Some older kernels cannot handle the newly generated R_LARCH_32/64_PCREL,
so the assembler generates R_LARCH_ADD32/64+R_LARCH_SUB32/64 by default,
and use the assembler option mthin-add-sub to generate R_LARCH_32/64_PCREL
as much as possible.
The Option of mthin-add-sub does not affect the generation of R_LARCH_32_PCREL
relocation in .eh_frame.
This patches adds new bsefi and bsifi instructions.
BSEFI- The instruction shall extract a bit field from a
register and place it right-adjusted in the destination register.
The other bits in the destination register shall be set to zero.
BSIFI- The instruction shall insert a right-adjusted bit field
from a register at another position in the destination register.
The rest of the bits in the destination register shall be unchanged.
Further documentation of these instructions can be found here:
https://docs.xilinx.com/v/u/en-US/ug984-vivado-microblaze-ref
This patch has been tested for years of AMD Xilinx Yocto
releases as part of the following patch set:
https://github.com/Xilinx/meta-xilinx/tree/master/meta-microblaze/recipes-devtools/binutils/binutils
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Ibai Erkiaga <ibai.erkiaga-elorza@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
The range check should be checking for the range
ffffffff80000000..7fffffff, not ffffffff70000000.
This patch has been tested for years of AMD Xilinx Yocto
releases as part of the following patch set:
https://github.com/Xilinx/meta-xilinx/tree/master/meta-microblaze/recipes-devtools/binutils/binutils
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
Updated show usage for MicroBlaze specific assembler options
to include new entries.
This patch has been tested for years of AMD Xilinx Yocto
releases as part of the following patch set:
https://github.com/Xilinx/meta-xilinx/tree/master/meta-microblaze/recipes-devtools/binutils/binutils
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
---
V1->V2:
- removed new options which were unnecessary
- added documentation for MicroBlaze specific options
Signed-off-by: Michael J. Eager <eager@eagercon.com>
AVX-* features / insns paralleling earlier introduced AVX512* ones can
be encoded more compactly when the respective feature was explicitly
enabled by the user.
Apparently from its introduction the variable was only ever written (the
only read is merely to determine whether to write it with another value).
(Since, due to the need to re-indent, the adjacent lines setting
cpu_arch_tune need touching anyway, switch to using PREOCESSOR_*
constants where applicable, to make more obvious what the resulting
state is going to be.)
These may not be set from a value derived from cpu_arch_flags: That
starts with (almost) all functionality enabled, while cpu_arch_isa_flags
is supposed to track features that were explicitly enabled (and perhaps
later disabled) by the user.
To avoid needing to do any such adjustment in two places (each),
introduce helper functions used by both command line handling and
directive processing.
Following the folding of some generic AVX/AVX2 templates with their
AVX512F counterpart ones, do this for FMA ones as well, requiring one
further adjustment to cpu_flags_match().
In anticipation of APX introduce logic to reduce the number of templates
we have now, allowing to limit some the number of ones we then need to
gain.
The fundamental requirements are that
- attributes be compatible, which specifically means VexW needs to be
the same in the templates (which often isn't the case, for VEX
encodings having far more WIG tha, EVEX ones),
- the EVEX form being AVX512F (with or without AVX512VL), not any of its
extensions (the same will then be required for APX - it'll need to be
APX_F).
Note that in check_register() there's now a redundant zmm check. Since
this logic will need revisiting for APX anyway, I'd like to keep it that
way for now. (Similarly a couple of if()-s which could be folded are
kept separate, to reduce code churn when adding APX support.)
SAE / embedded rounding are invalid when there's the memory operand, as
the bit encoding this specifies broadcast in that case.
Broadcast needs to be specified on the memory operand.
PR gas/30856
In 5cc007751cdb ("x86: further adjust extend-to-32bit-address
conditions") I neglected the case of PUSH, which is the only insn
allowing (proper) symbol addresses to be used as immediates (not
displacements, like CALL/JMP) in the absence of any register operands.
Since it defaults to 64-bit operand size, guessing an L suffix is wrong
there.
Add a macro pcaddi instruction to support "pcaddi rd, symbol".
pcaddi has a 20-bit signed immediate, it can address a +/- 2MB pc relative
address, and the address should be 4-byte aligned.
The AArch64 feature-flag code is currently limited to a maximum
of 64 features. This patch reworks it so that the limit can be
increased more easily. The basic idea is:
(1) Turn the ARM_FEATURE_FOO macros into an enum, with the enum
counting bit positions.
(2) Make the feature-list macros take an array index argument
(currently always 0). The macros then return the
aarch64_feature_set contents for that array index.
An N-element array would then be initialised as:
{ MACRO (0), ..., MACRO (N - 1) }
(3) Provide convenience macros for initialising an
aarch64_feature_set for:
- a single feature
- a list of individual features
- an architecture version
- an architecture version + a list of additional features
(2) and (3) use the preprocessor to generate static initialisers.
The main restriction was that uses of the same preprocessor macro
cannot be nested. So if a macro wants to do something for N individual
arguments, it needs to use a chain of N macros to do it. There then
needs to be a way of deriving N, as a preprocessor token suitable for
pasting.
The easiest way of doing that was to precede each list of features
by the number of features in the list. So an aarch64_feature_set
initialiser for three features A, B and C would be written:
AARCH64_FEATURES (3, A, B, C)
This scheme makes it difficult to keep AARCH64_FEATURE_CRYPTO as a
synonym for SHA2+AES, so the patch expands the former to the latter.
The new Synopsys ARCv3 ISA has a similar instruction format like
the old ARCv1 and ARCv2 ISA. Thus, the ARCv3 addition is using
whatever we have for old ARC processors plus some ARCv3 spcific mods.
To distinguish between various ARC variants, we introduced two new
configure defines named TARGET_ARCv3_32 and TARGET_ARCv3_64 which are
set when we choose either an ARC32 (ARCv3/32) ISA toolchain or an
ARC64 (ARCv3/64) ISA toolchain.
gas/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* gas/config/tc-arc.h: Selectively define default target macros.
* gas/configure.ac: Add ARC64 target.
* gas/configure.tgt: Likewise.
* gas/configure: Regenerate
* gas/config.in: Regenerate.
* gas/config/tc-arc.c (DEFAULT_ARCH): New macro.
(default_arch): New variable.
(md_pseudo_table): Add xword.
(md_shortopts): Only a few options are recognized by the new ARC64
assembler.
(md_longopts): Likewise.
(ARC_CPU_TYPE_A64x): New define.
(ARC_CPU_TYPE_A32x): Likewise.
(cpu_type): New arch field.
(selected_cpu): Update fields.
(arc_opcode_hash_entry_iterator_init): Formating.
(arc_opcode_hash_entry_iterator_next): Likewise.
(arc_select_cpu): Likewise.
(arc_option): Likewise.
(check_cpu_feature): Likewise.
(debug_exp): Recognize new expression operands.
(parse_reloc_symbol): Parse new signed/unsigend cases.
(parse_opcode_flags): Update for the case when the flags needs
insert/extract functions.
(find_opcode_match): Match new signed/unsigned 32-bit immediates.
(autodetect_attributes): PLT34 only available for ARC64.
(md_assemble): Extend match characters.
(declare_fp_set): New function.
(init_default_arch): Likewise.
(md_begin): Detect and initialize the correct CPU and coresponding
registers.
(md_pcrel_from_section): Add new relocs.
(arc_target_format): New function.
(md_apply_fix): Add new relocs.
(md_parse_option): Update options.
(arc_show_cpu_list): Update with ARC64 cpus.
(md_show_usage): Update messages.
(may_relax_expr): Add PLT34 case.
(assemble_insn): Update for ARC64.
(arc_make_nops): New function.
(arc_handle_align): Refurbish this function, use arc_make_nops.
(tc_arc_fix_adjustable): Update messages.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
The md_pre_output_hook creating fixup is asynchronous, causing relocs
may be out of order in .eh_frame. Define GAS_SORT_RELOCS so that reorder
relocs when write_relocs.
Reported-by: Rui Ueyama <rui314@gmail.com>
Now that CpuLM is used solely in cpu_arch_flags and cpu_arch[] while
Cpu64 is solely used in insn templates, they no longer need to be
treated different from other "ordinary" flags; the only "unusual" one
left if CpuNo64. Fold both, leaving just Cpu64.
A total four places exists where we set the two bits from flag_code, but
these values are never used. The two bits are evaluated only when coming
from insn templates.
Drop these assignments. Also make obvious that cpu_flags_check_cpu64()
is only ever used against insn templates.
While update_code_flag() checks for LM / i386, set_cpu_arch() so far
didn't, allowing e.g. 64-bit code to be emitted after ".arch generic32".
Oddly enough a few of our testcases actually exhibit bad behavior (and
hence need minor adjustments).
Recognize "/<number>" suffixes on both -march=+avx10.1 and the
corresponding .arch directive, setting an upper bound on the vector size
that insns may use. Such a restriction can be reset by setting a new base
architecture, by using a suffix-less form, by disabling AVX10, or by
enabling any other VEX/EVEX-based vector extension.
While for most insns we can suppress their use with too wide operands
via registers becoming unavailable (or in Intel syntax memory operand
size specifiers not being recognized), mask register insns have to have
their minimum required vector size specified in a new attribute. (Of
course this new attribute could also be used on other insns.)
Note that .insn continues to be permitted to emit EVEX{512,256} (and
VEX256 ones) encodings regardless of vector size restrictions in place.
Of course these can't be expressed using zmm (or ymm) operands then,
but need using the EVEX.512.* forms (broadcast forms may be usable right
now, but this may go away so shouldn't be relied upon). This is why no
assertions should be added to build_{e,}vex_prefix().
Since this is merely a re-branding of certain AVX512* features, there's
little code to be added.
The main aspect here are new testcases. In order to be able to re-use
some of the existing testcases, several of them need their start symbols
adjusted. Note that 256- and 128-bit tests want adding here, as these
need to work right away. Subsequently they'll gain vector length
constraints.
Since it was missing and is wanted here, also add an AVX512VL+VPOPCNTDQ
test.