gcc/gcc44-power7.patch

9514 lines
318 KiB
Diff
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

2009-03-09 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/vsx.md (vsx_store<mode>_update64): Use correct
registers for store with update.
(vsx_store<mode>_update32): Ditto.
(vsx_storedf_update<VSbit>): Ditto.
2009-03-06 Michael Meissner <meissner@linux.vnet.ibm.com>
* doc/invoke.texi (-mvsx-scalar-memory): New switch, to switch to
use VSX reg+reg addressing for all scalar double precision
floating point.
* config/rs6000/rs6000.opt (-vsx-scalar-memory): Ditto.
* configure.ac (gcc_cv_as_powerpc_mfpgpr): Set binutils version to
2.19.2.
(gcc_cv_as_powerpc_cmpb): Ditto.
(gcc_cv_as_powerpc_dfp): Ditto.
(gcc_cv_as_powerpc_vsx): Ditto.
(gcc_cv_as_powerpc_popcntd): Ditto.
* configure: Regenerate.
* config/rs6000/vector.md (VEC_int): New mode attribute for vector
conversions.
(VEC_INT): Ditto.
(ftrunc<mode>2): Make this a define_expand.
(float<VEC_int><mode>2): New vector conversion support to add VSX
32 bit int/32 bit floating point convert and 64 bit int/64 bit
floating point vector instructions.
(unsigned_float<VEC_int><mode>2): Ditto.
(fix_trunc<mode><VEC_int>2): Ditto.
(fixuns_trunc<mode><VEC_int>2): Ditto.
* config/rs6000/predicates.md (easy_fp_constant): 0.0 is an easy
constant under VSX.
(indexed_or_indirect_operand): Add VSX load/store with update
support.
* config/rs6000/rs6000.c (rs6000_debug_addr): New global for
-mdebug=addr.
(rs6000_init_hard_regno_mode_ok): Add -mvsx-scalar-memory
support.
(rs6000_override_options): Add -mdebug=addr support.
(rs6000_builtin_conversion): Add VSX same size conversions.
(rs6000_legitimize_address): Add -mdebug=addr support. Add
support for VSX load/store with update instructions.
(rs6000_legitimize_reload_address): Ditto.
(rs6000_legitimate_address): Ditto.
(rs6000_mode_dependent_address): Ditto.
(print_operand): Ditto.
(bdesc_1arg): Add builtins for conversion that calls either the
VSX or Altivec insn pattern.
(rs6000_common_init_builtins): Ditto.
* config/rs6000/vsx.md (VSX_I): Delete, no longer used.
(VSi): New mode attribute for conversions.
(VSI): Ditto.
(VSc): Ditto.
(vsx_mov<mode>): Add load/store with update support.
(vsx_load<mode>_update*): New insns for load/store with update
support.
(vsx_store<mode>_update*): Ditto.
(vsx_fmadd<mode>4): Generate correct code for V4SF.
(vsx_fmsub<mode>4): Ditto.
(vsx_fnmadd<mode>4_*): Ditto.
(vsx_fnmsub<mode>4_*): Ditto.
(vsx_float<VSi><mode>2): New insn for vector conversion.
(vsx_floatuns<VSi><mode>2): Ditto.
(vsx_fix_trunc<mode><VSi>2): Ditto.
(vsx_fixuns_trunc<mode><VSi>2): Ditto.
(vsx_xxmrghw): New insn for V4SF interleave.
(vsx_xxmrglw): Ditto.
* config/rs6000/rs6000.h (rs6000_debug_addr): -mdebug=addr
support.
(TARGET_DEBUG_ADDR): Ditto.
(rs6000_builtins): Add VSX instructions for eventual VSX
builtins.
* config/rs6000/altivec.md (altivec_vmrghsf): Don't do the altivec
instruction if VSX.
(altivec_vmrglsf): Ditto.
* config/rs6000/rs6000.md (movdf_hardfloat32): Add support for
using xxlxor to zero a floating register if VSX.
(movdf_hardfloat64_mfpgpr): Ditto.
(movdf_hardfloat64): Ditto.
2009-03-03 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/vsx.md (vsx_xxmrglw): Delete for now, use Altivec.
(vsx_xxmrghw): Ditto.
* config/rs6000/altivec.md (altivec_vmrghsf): Use this insn even
on VSX systems.
(altivec_vmrglsf): Ditto.
* config/rs6000/rs6000.h (ASM_CPU_NATIVE_SPEC): Use %(asm_default)
if we are running as a cross compiler.
* config/rs6000/vector.md (vec_interleave_highv4sf): Use correct
constants for the extraction.
(vec_interleave_lowv4sf): Ditto.
* config/rs6000/rs6000.md (floordf2): Fix typo, make this a
define_expand, not define_insn.
* config/rs6000/aix53.h (ASM_CPU_SPEC): If -mcpu=native, call
%:local_cpu_detect(asm) to get the appropriate assembler flags for
the machine.
* config/rs6000/aix61.h (ASM_CPU_SPEC): Ditto.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Ditto.
(ASM_CPU_NATIVE_SPEC): New spec to get asm options if
-mcpu=native.
(EXTRA_SPECS): Add ASM_CPU_NATIVE_SPEC.
* config/rs6000/driver-rs6000.c (asm_names): New static array to
give the appropriate asm switches if -mcpu=native.
(host_detect_local_cpu): Add support for "asm".
* config/rs6000/rs6000.c (processor_target_table): Don't turn on
-misel by default for power7.
2009-03-02 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_emit_swdivdf): Revert last
change, since we reverted the floating multiply/add changes.
* doc/md.texi (Machine Constraints): Update rs6000 constraints.
* config/rs6000/vector.md (neg<mode>2): Fix typo to enable
vectorized negation.
(ftrunc<mode>2): Move ftrunc expander here from altivec.md, and
add V2DF case.
(vec_interleave_highv4sf): Correct type to be V4SF, not V4SI.
(vec_extract_evenv2df): Add expander.
(vec_extract_oddv2df): Ditto.
* config/rs6000/vsx.md (vsx_ftrunc<mode>2): New VSX pattern for
truncate.
(vsx_ftruncdf2): Ditto.
(vsx_xxspltw): New instruction for word splat.
(vsx_xxmrglw): Whitespace changes. Fix typo from V4SI to v4SF.
(vsx_xxmrghw): Ditto.
* config/rs6000/altivec.md (altivec_vmrghsf): Whitespace changes.
(altivec_vmrglsf): Ditto.
(altivec_vspltsf): Disable if we have VSX.
(altivec_ftruncv4sf2): Move expander to vector.md, rename insn.
* config/rs6000/rs6000.md (ftruncdf2): Add expander for VSX.
* config/rs6000/rs6000.c (rs6000_init_hard_regno_mode_ok):
Reenable vectorizing V4SF under altivec.
(rs6000_hard_regno_mode_ok): Don't allow floating values in LR,
CTR, MQ. Also, VRSAVE/VSCR are both 32-bits.
(rs6000_init_hard_regno_mode_ok): Print some of the special
registers if -mdebug=reg.
* config/rs6000/rs6000.md (floating multiply/add insns): Go back
to the original semantics for multiply add/subtract, particularly
with -ffast-math.
* config/rs6000/vsx.md (floating multiply/add insns): Mirror the
rs6000 floating point multiply/add insns in VSX.
2009-03-01 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/vector.md (VEC_L): At TImode.
(VEC_M): Like VEC_L, except no TImode.
(VEC_base): Add TImode support.
(mov<mode>): Use VEC_M, not VEC_L. If there is no extra
optimization for the move, just generate the standard move.
(vector_store_<mode>): Ditto.
(vector_load_<mode>): Ditto.
(vec_init<mode>): Use vec_init_operand predicate.
* config/rs6000/predicates.md (vec_init_operand): New predicate.
* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Allow mode
in a VSX register if there is a move operation.
(rs6000_vector_reg_class): Add internal register number to the
debug output.
(rs6000_init_hard_regno_mode_ok): Reorganize so all of the code
for a given type is located together. If not -mvsx, make "ws"
constraint become NO_REGS, not FLOAT_REGS. Change -mdebug=reg
output.
(rs6000_expand_vector_init): Before calling gen_vsx_concat_v2df,
make sure the two float arguments are copied into registers.
(rs6000_legitimate_offset_address_p): If no vsx or altivec, don't
disallow offset addressing. Add V2DImode. If TImode is handled
by the vector unit, allow indexed addressing. Change default case
to be a fatal_insn instead of gcc_unreachable.
(rs6000_handle_altivec_attribute): Add support for vector double
if -mvsx.
(rs6000_register_move_cost): Add support for VSX_REGS. Know that
under VSX, you can move between float and altivec registers
cheaply.
(rs6000_emit_swdivdf): Change the pattern of the negate multiply
and subtract operation.
* config/rs6000/vsx.md (VSX_I): Add TImode.
(VSX_L): Add TImode.
(VSm): Ditto.
(VSs): Ditto.
(VSr): Ditto.
(UNSPEC_VSX_CONCAT_V2DF): New constant.
(vsx_fre<mode>2): Add reciprocal estimate.
(vsx_freDF2): Ditto.
(vsx_fnmadd<mode>4): Rework pattern so it matches the
canonicalization that the compiler does.
(vsx_fnmsub<mode>4): Ditto.
(vsx_fnmaddDF4): Ditto.
(vsx_fnmsubDF4): Ditto.
(vsx_vsel<mode>): Use vsx_register_operand, not register_operand.
(vsx_adddf3): Ditto.
(vsx_subdf3): Ditto.
(vsx_muldf3): Ditto.
(vsx_divdf3): Ditto.
(vsx_negdf3): Ditto.
(vsx_absdf2): Ditto.
(vsx_nabsdf2): Ditto.
(vsx_copysign<mode>3): Add copysign support.
(vsx_copysignDF3): Ditto.
(vsx_concat_v2df): Rewrite to use an UNSPEC.
(vsx_set_v2df): Use "ws" constraint for scalar float.
(vsx_splatv2df): Ditto.
* config/rs6000/rs6000.h (VECTOR_UNIT_NONE_P): New macro to say no
vector support.
(VECTOR_MEM_NONE_P): Ditto.
(VSX_MOVE_MODE): Add V2DImode, TImode.
* config/rs6000/altivec.md (VM): Add V2DI, TI.
(build_vector_mask_for_load): Fix thinko in VSX case.
* config/rs6000/rs6000.md (fmaddsf4_powerpc): Name previously
unnamed pattern. Fix insns so combine will generate the negative
multiply and subtract operations.
(fmaddsf4_power): Ditto.
(fmsubsf4_powerpc): Ditto.
(fmsubsf4_power): Ditto.
(fnmaddsf4_powerpc): Ditto.
(fnmaddsf4_power): Ditto.
(fnmsubsf4_powerpc): Ditto.
(fnmsubsf4_power): Ditto.
(fnsubsf4_powerpc2): Ditto.
(fnsubsf4_power2): Ditto.
(fmadddf4_powerpc): Ditto.
(fmadddf4_power): Ditto.
(fmsubdf4_powerpc): Ditto.
(fmsubdf4_power): Ditto.
(fnmadddf4_powerpc): Ditto.
(fnmadddf4_power): Ditto.
(fnmsubdf4_powerpc): Ditto.
(fnmsubdf4_power): Ditto.
(copysigndf3): If VSX, call the VSX copysign.
(fred): Split into an expander and insn. On insn, disable if
VSX.
(movdf_hardfloat32): Rework VSX support.
(movdf_hardfloat64_mfpgpr): Ditto.
(movdf_hardfloat64): Ditto.
(movti_ppc64): If vector unit is handling TImode, disable this
pattern.
2009-02-28 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/ppc-asm.h: If __VSX__ define the additional scalar
floating point registers that overlap with th Altivec registers.
2009-02-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/spe.md (spe_fixuns_truncdfsi2): Rename from
fixuns_truncdfsi2, and move fixuns_truncdfsi2 into rs6000.md.
* config/rs6000/ppc-asm.h: If __ALTIVEC__ is defined, define the
Altivec registers. If __VSX__ is defined, define the VSX
registers.
* config/rs6000/rs6000.opt (-mvsx-scalar-double): Make this on by
default.
(-misel): Make this a target mask instead of a variable.
* config/rs6000/rs6000.c (rs6000_isel): Delete global variable.
(rs6000_explicit_options): Delete isel field.
(POWERPC_MASKS): Add MASK_ISEL.
(processor_target_table): Add MASK_ISEL to 8540, 8548, e500mc, and
power7 processors.
(rs6000_override_options): Move -misel to a target mask.
(rs6000_handle_option): Ditto.
(rs6000_emit_int_cmove): Add support for 64-bit isel.
* config/rs6000/vsx.md (vsx_floatdidf2): New scalar floating point
pattern to support VSX conversion and rounding instructions.
(vsx_floatunsdidf2): Ditto.
(vsx_fix_trundfdi2): Ditto.
(vsx_fixuns_trundfdi2): Ditto.
(vsx_btrundf2): Ditto.
(vsx_floordf2): Ditto.
(vsx_ceildf2): Ditto.
* config/rs6000/rs6000.h (rs6000_isel): Delete global.
(TARGET_ISEL): Delete, since -misel is now a target mask.
(TARGET_ISEL64): New target option for -misel on 64-bit systems.
* config/rs6000/altivec.md (altivec_gtu<mode>): Use gtu, not geu.
* config/rs6000/rs6000.md (sel): New mode attribute for 64-bit
ISEL support.
(mov<mode>cc): Add 64-bit ISEL support.
(isel_signed_<mode>): Ditto.
(isel_unsigned_<mode>): Ditto.
(fixuns_truncdfsi2): Move expander here from spe.md.
(fixuns_truncdfdi2): New expander for unsigned floating point
conversion on power7.
(btruncdf2): Split into expander and insn. On the insn, disallow
on VSX, so the VSX instruction will be generated.
(ceildf2): Ditto.
(floordf2): Ditto.
(floatdidf2): Ditto.
(fix_truncdfdi2): Ditto.
(smindi3): Define if we have -misel on 64-bit systems.
(smaxdi3): Ditto.
(umindi3): Ditto.
(umaxdi3): Ditto.
* config/rs6000/e500.h (CHECK_E500_OPTIONS): Disable -mvsx on
E500.
2009-02-26 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/constraints.md ("wd" constraint): Change the
variable that holds the register class to use.
("wf" constraint): Ditto.
("ws" constraint): Ditto.
("wa" constraint): Ditto.
* config/rs6000/rs6000.opt (-mvsx-vector-memory): Make this on by
default.
(-mvsx-vector-float): Ditto.
* config/rs6000/rs6000.c (rs6000_vector_reg_class): New global to
hold the register classes for the vector modes.
(rs6000_vsx_v4sf_regclass): Delete, move into
rs6000_vector_reg_class.
(rs6000_vsx_v2df_regclass): Ditto.
(rs6000_vsx_df_regclass): Ditto.
(rs6000_vsx_reg_class): Rename from rs6000_vsx_any_regclass.
(rs6000_hard_regno_mode_ok): Rework VSX, Altivec registers.
(rs6000_init_hard_regno_mode_ok): Setup rs6000_vector_reg_class.
Drop rs6000_vsx_*_regclass. By default, use all 64 registers for
V4SF and V2DF. Use VSX_REG_CLASS_P macro instead of separate
tests. Update -mdebug=reg printout.
(rs6000_preferred_reload_class): If VSX, prefer FLOAT_REGS for
scalar floating point and ALTIVEC_REGS for the types that have
altivec instructions.
(rs6000_secondary_memory_needed): If VSX, we can copy between FPR
and Altivec registers without needed memory.
(rs6000_secondary_reload_class): Delete ATTRIBUTE_UNUSED from an
argument that is used. If VSX, we can copy between FPR and
Altivec registers directly.
* config/rs6000/rs6000.h (VSX_MOVE_MODE): Add in the Altivec
types.
(rs6000_vsx_v4sf_regclass): Delete.
(rs6000_vsx_v2df_regclass): Ditto.
(rs6000_vsx_df_regclass): Ditto.
(rs6000_vsx_reg_class): Rename from rs6000_vsx_any_reg_class.
(rs6000_vector_reg_class): New global to map machine mode to the
preferred register class to use for that mode.
(VSX_REG_CLASS_P): New macro to return true for all of the
register classes VSX items can be in.
2009-02-25 Michael Meissner <meissner@linux.vnet.ibm.com>
* doc/invoke.texi (-mvsx-vector-memory): Rename from
-mvsx-vector-move.
(-mvsx-vector-logical): Delete.
* config/rs6000/aix53.h (ASM_CPU_SPEC): Add power7 support.
* config/rs6000/aix61.h (ASM_CPU_SPEC): Ditto.
* config/rs6000/vector.md (all insns); Change from using
rs6000_vector_info to VECTOR_MEM_* or VECTOR_UNIT_* macros.
* config/rs6000/constraints.md ("wi" constraint): Delete.
("wl" constraint): Ditto.
("ws" constraint): Change to use rs6000_vsx_df_regclass.
* config/rs6000/rs6000.opt (-mvsx-vector-memory): Rename from
-mvsx-vector-move.
(-mvsx-vector-float): Make default 0, not 1.
(-mvsx-vector-double): Make default -1, not 1.
(-mvsx-vector-logical): Delete.
* config/rs6000/rs6000.c (rs6000_vector_info): Delete.
(rs6000_vector_unit): New global array to say what vector unit is
used for arithmetic instructions.
(rs6000_vector_move): New global array to say what vector unit is
used for memory references.
(rs6000_vsx_int_regclass): Delete.
(rs6000_vsx_logical_regclass): Delete.
(rs6000_hard_regno_nregs_internal): Switch from using
rs6000_vector_info to rs6000_vector_unit, rs6000_vector_move.
(rs6000_hard_regno_mode_ok): Ditto. Reformat code somewhat.
(rs6000_debug_vector_unit): New array to print vector unit
information if -mdebug=reg.
(rs6000_init_hard_regno_mode_ok): Rework to better describe VSX
and Altivec register sets.
(builtin_mask_for_load): Return 0 if -mvsx.
(rs6000_legitimize_reload_address): Allow AND in VSX addresses.
(rs6000_legitimate_address): Ditto.
(bdesc_3arg): Delete vselv2di builtin.
(rs6000_emit_minmax): Use rs6000_vector_unit instead of
rs6000_vector_info.
(rs6000_vector_mode_supported_p): Ditto.
* config/rs6000/vsx.md (all insns): Change from using
rs6000_vector_info to VECTOR_MEM_* and VECTOR_UNIT_* macros.
(VSr): Change to use "v" register class, not "wi".
(vsx_mov<mode>): Combine floating and integer. Allow prefered
register class, and then use ?wa for all VSX registers.
(vsx_fmadddf4): Use ws constraint, not f.
(vsx_fmsubdf4): Ditto.
(vsx_fnmadddf4): Ditto.
(vsx_fnmsubdf4): Ditto.
(vsx_and<mode>3): Use preferred register class, and then ?wa to
catch all VSX registers.
(vsx_ior<mode>3): Ditto.
(vsx_xor<mode>3): Ditto.
(vsx_one_cmpl<mode>2): Ditto.
(vsx_nor<mode>3): Ditto.
(vsx_andc<mode>3): Ditto.
* config/rs6000/rs6000.h (rs6000_vector_struct): Delete.
(rs6000_vector_info): Ditto.
(rs6000_vector_unit): New global array to say whether a machine
mode arithmetic is handled by a particular vector unit.
(rs6000_vector_mem): New global array to say which vector unit to
use for moves.
(VECTOR_UNIT_*): New macros to say which vector unit to use.
(VECTOR_MEM_*): Ditto.
(rs6000_vsx_int_regclass): Delete.
(rs6000_vsx_logical_regclass): Delete.
* config/rs6000/altivec.md (all insns): Change from using
rs6000_vector_info to VECTOR_MEM_* and VECTOR_UNIT_* macros.
(build_vector_mask_for_load): Disable if VSX.
* config/rs6000/rs6000.md (all DF insns): Change how the VSX
exclusion is done.
2009-02-24 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_debug_reg): New global.
(rs6000_debug_reg_print): New function to print register classes
for a given register range.
(rs6000_init_hard_regno_mode_ok): If -mdebug=reg, print out the
register class, call used, fixed information for most of the
registers. Print the vsx register class variables.
(rs6000_override_options): Add -mdebug=reg support.
* config/rs6000/rs6000.h (rs6000_debug_reg): New global.
(TARGET_DEBUG_REG): New target switch for -mdebug=reg.
2009-02-23 Michael Meissner <meissner@linux.vnet.ibm.com>
* reload.c (subst_reloads): Change gcc_assert into a fatal_insn.
* config/rs6000/vector.md (VEC_I): Reorder iterator.
(VEC_L): Ditto.
(VEC_C): New iterator field for vector comparisons.
(VEC_base): New mode attribute that mapes the vector type to the
base type.
(all insns): Switch to use rs6000_vector_info to determine whether
the insn is valid instead of using TARGET_VSX or TARGET_ALTIVEC.
(vcond<mode>): Move here from altivec, and add VSX support.
(vcondu<mode>): Ditto.
(vector_eq<mode>): New expander for vector comparisons.
(vector_gt<mode>): Ditto.
(vector_ge<mode>): Ditto.
(vector_gtu<mode>): Ditto.
(vector_geu<mode>): Ditto.
(vector_vsel<mode>): New expander for vector select.
(vec_init<mode>): Move expander from altivec.md and generalize for
VSX.
(vec_set<mode>): Ditto.
(vec_extract<mode>): Ditto.
(vec_interleave_highv4sf): Ditto.
(vec_interleave_lowv4sf): Ditto.
(vec_interleave_highv2df): New expander for VSX.
(vec_interleave_lowv2df): Ditto.
* config/rs6000/contraints.md (toplevel): Add comment on the
available constraint letters.
("w" constraint): Delete, in favor of using "w" as a two letter
constraint.
("wd" constraint): New VSX constraint.
("wf" constraint): Ditto.
("wi" constraint): Ditto.
("wl" constraint): Ditto.
("ws" constraint): Ditto.
("wa" constraint): Ditto.
* config/rs6000/predicates.md (indexed_or_indirect_operand):
Disable altivec support allowing AND of memory address if -mvsx.
* config/rs6000/rs6000.opt (-mvsx-vector-move): New switches to
allow finer control over whether VSX, Altivec, or the traditional
instructions are used.
(-mvsx-scalar-move): Ditto.
(-mvsx-vector-float): Ditto.
(-mvsx-vector-double): Ditto.
(-mvsx-vector-logical): Ditto.
(-mvsx-scalar-double): Ditto.
* config/rs6000/rs6000.c (rs6000_vector_info): New global to hold
various information about which vector instruction set to use, and
the alignment of data.
(rs6000_vsx_v4sf_regclass): New global to hold VSX register
class.
(rs6000_vsx_v2df_regclass): Ditto.
(rs6000_vsx_df_regclass): Ditto.
(rs6000_vsx_int_regclass): Ditto.
(rs6000_vsx_logical_regclass): Ditto.
(rs6000_vsx_any_regclass): Ditto.
(rs6000_hard_regno_nregs_internal): Rewrite to fine tune
VSX/Altivec register selection.
(rs6000_hard_regno_mode_ok): Ditto.
(rs6000_init_hard_regno_mode_ok): Set up the vector information
globals based on the -mvsx-* switches.
(rs6000_override_options): Add warnings for -mvsx and
-mlittle-endian or -mavoid-indexed-addresses.
(rs6000_builtin_vec_perm): Add V2DF/V2DI support.
(rs6000_expand_vector_init): Add V2DF support.
(rs6000_expand_vector_set): Ditto.
(rs6000_expand_vector_extract): Ditto.
(avoiding_indexed_address_p): Add VSX support.
(rs6000_legitimize_address): Ditto.
(rs6000_legitimize_reload_address): Ditto.
(rs6000_legitimite_address): Ditto.
(USE_ALTIVEC_FOR_ARG_P): Ditto.
(function_arg_boundary): Ditto.
(function_arg_advance): Ditto.
(function_arg): Ditto.
(get_vec_cmp_insn): Delete.
(rs6000_emit_vector_vsx): New function for VSX vector compare.
(rs6000_emit_vector_altivec): New function for Altivec vector
compare.
(get_vsel_insn): Delete.
(rs6000_emit_vector_select): Ditto.
(rs6000_override_options): If -mvsx, turn on -maltivec by
default.
(rs6000_builtin_vec_perm): Add support for V2DI, V2DF modes.
(bdesc_3arg): Add vector select and vector permute builtins for
V2DI and V2DF types. Switch to using the vector_* expander
instead of altivec_*.
(rs6000_init_builtins): Initialize new type nodes for VSX.
Initialize __vector double type. Initialize common builtins for
VSX.
(rs6000_emit_vector_compare): Add VSX support.
(rs6000_vector_mode_supported_p): If VSX, support V2DF.
* config/rs6000/vsx.md (VSX_I): New iterator for integer modes.
(VSX_L): Reorder iterator.
(lx_<mode>_vsx): Delete, no longer needed.
(stx_<mode>_vsx): Ditto.
(all insns): Change to use vsx_<name> instead of <name>_vsx for
consistancy with the other rs6000 md files. Change to use the new
"w" constraints for all insns. Change to use rs6000_vector_info
deciding whether to execute the instruction or not.
(vsx_mov<mode>): Rewrite constraints so GPR registers are not
chosen as reload targets. Split integer vector loads into a
separate insn, and favor the altivec register over the VSX fp
registers.
(vsx_fmadd<mode>4): Use <mode>, not <type>.
(vsx_fmsub<mode>4): Ditto.
(vsx_eq<mode>): New insns for V2DF/V4SF vector compare.
(vsx_gt<mode>): Ditto.
(vsx_ge<mode>): Ditto.
(vsx_vsel<mode>): New insns for VSX vector select.
(vsx_xxpermdi): New insn for DF permute.
(vsx_splatv2df): New insn for DF splat support.
(vsx_xxmrglw): New insns for DF interleave.
(vsx_xxmrghw): Ditto.
* config/rs6000/rs000.h (enum rs6000_vector): New enum to
describe which vector unit is being used.
(struct rs6000_vector_struct): New struct to describe the various
aspects about the current vector instruction set.
(rs6000_vector_info): New global to describe the current vector
instruction set.
(SLOW_UNALIGNED_ACCESS): If rs6000_vector_info has alignment
information for a type, use that.
(VSX_VECTOR_MOVE_MODE): New macro for all VSX vectors that are
supported by move instructions.
(VSX_MOVE_MODE): New macro for all VSX moves.
(enum rs6000_builtins): Add V2DI/V2DF vector select and permute
builtins.
(rs6000_builtin_type_index): Add new types for VSX vectors.
(rs6000_vsx_v4sf_regclass): New global to hold VSX register
class.
(rs6000_vsx_v2df_regclass): Ditto.
(rs6000_vsx_df_regclass): Ditto.
(rs6000_vsx_int_regclass): Ditto.
(rs6000_vsx_logical_regclass): Ditto.
(rs6000_vsx_any_regclass): Ditto.
* config/rs6000/altivec.md (UNSPEC_VCMP*): Delete unspec
constants no longer needed.
(UNSPEC_VSEL*): Ditto.
(altivec_lvx_<mode>): Delete, no longer needed.
(altivec_stvx_<mode>): Ditto.
(all insns): Rewrite to be consistant of altivec_<insn>. Switch
to use rs6000_vector_info to determine whether to issue to the
altivec form of the instructions.
(mov<mode>_altivec): Rewrite constraints so GPR registers are not
chosen as reload targets.
(altivec_eq<mode>): Rewrite vector conditionals, permute, select
to use iterators, and work with VSX.
(altivec_gt<mode>): Ditto.
(altivec_ge<mode>): Ditto.
(altivec_gtu<mode>): Ditto.
(altivec_geu<mode>): Ditto.
(altivec_vsel<mode>): Ditto.
(altivec_vperm_<mode>): Ditto.
(altivec_vcmp*): Rewrite to not use unspecs any more, and use mode
iterators, add VSX support.
(vcondv4si): Move to vector.md.
(vconduv4si): Ditto.
(vcondv8hi): Ditto.
(vconduv8hi): Ditto.
(vcondv16qi): Ditto.
(vconduv16qi): Ditto.
* config/rs6000/rs6000.md (negdf2_fpr): Add support for
-mvsx-scalar-double.
(absdf2_fpr): Ditto.
(nabsdf2_fpr): Ditto.
(adddf3_fpr): Ditto.
(subdf3_fpr): Ditto.
(muldf3_fpr): Ditto.
(divdf3_fpr): Ditto.
(DF multiply/add patterns): Ditto.
(sqrtdf2): Ditto.
(movdf_hardfloat32): Add VSX support.
(movdf_hardfloat64_mfpgpr): Ditto.
(movdf_hardfloat64): Ditto.
* doc/invoke.texi (-mvsx-*): Add new vsx switches.
2009-02-13 Michael Meissner <meissner@linux.vnet.ibm.com>
* config.in: Update two comments.
* config/rs6000/vector.md (VEC_L): Add V2DI type.
(move<mode>): Use VEC_L to get all vector types, and delete the
separate integer mode move definitions.
(vector_load_<mode>): Ditto.
(vector_store_<mode>): Ditto.
(vector move splitters): Move GPR register splitters here from
altivec.md.
* config/rs6000/constraints.md ("j"): Add "j" constraint to match
the mode's 0 value.
* config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Only
count the FPRs as being 128 bits if the mode is a VSX type.
(rs6000_hard_regno_mode_ok): Ditto.
(rs6000_emit_minmax): Use new VSX_MODE instead of separate tests.
* config/rs6000/vsx.md (VSX_L): Add V2DImode.
(VSm): Rename from VSX_mem, add modes for integer vectors. Change
all uses.
(VSs): Rename from VSX_op, add modes for integer vectors. Change
all uses.
(VSr): New mode address to give the register class.
(mov<mode>_vsx): Use VSr to get the register preferences. Add
explicit 0 option.
(scalar double precision patterns): Do not use v register
constraint right now.
(logical patterns): Use VSr mode attribute for register
preferences.
* config/rs6000/rs6000.h (VSX_SCALAR_MODE): New macro.
(VSX_MODE): Ditto.
* config/rs6000/altivec.md (VM): New mode iterator for memory
operations. Add V2DI mode.
(mov_altivec_<mode>): Disable if -mvsx for all modes, not just
V4SFmode.
(gpr move splitters): Move to vector.md.
(and<mode>3_altivec): Use VM mode iterator, not V.
(ior<mode>3_altivec): Ditto.
(xor<mode>3_altivec): Ditto.
(one_cmpl<mode>2_altivec): Ditto.
(nor<mode>3_altivec): Ditto.
(andc<mode>3_altivec): Ditto.
* config/rs6000/rs6000.md (movdf_hardfloat): Back out vsx changes.
(movdf_hardfloat64_vsx): Delete.
2009-02-12 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/vector.md: New file to abstract out the expanders
for vector operations from alitvec.md.
* config/rs6000/predicates.md (vsx_register_operand): New
predicate to match VSX registers.
(vfloat_operand): New predicate to match registers used for vector
floating point operations.
(vint_operand): New predicate to match registers used for vector
integer operations.
(vlogical_operand): New predicate to match registers used for
vector logical operations.
* config/rs6000/rs6000-protos.h (rs6000_hard_regno_nregs): Change
from a function to an array.
(rs6000_class_max_nregs): Add declaration.
* config/rs6000/t-rs6000 (MD_INCLUDES): Define to include all of
the .md files included by rs6000.md.
* config/rs6000/rs6000.c (rs6000_class_max_nregs): New global
array to pre-calculate CLASS_MAX_NREGS.
(rs6000_hard_regno_nregs): Change from a function to an array to
pre-calculate HARD_REGNO_NREGS.
(rs6000_hard_regno_nregs_internal): Rename from
rs6000_hard_regno_nregs and add VSX support.
(rs6000_hard_regno_mode_ok): Add VSX support, and switch to use
lookup table rs6000_hard_regno_nregs.
(rs6000_init_hard_regno_mode_ok): Add initialization of
rs6000_hard_regno_nregs, and rs6000_class_max_nregs global
arrays.
(rs6000_override_options): Add some warnings for things that are
incompatible with -mvsx.
(rs6000_legitimate_offset_address_p): Add V2DFmode.
(rs6000_conditional_register_usage): Enable altivec registers if
-mvsx.
(bdesc_2arg): Change the name of the nor pattern.
(altivec_expand_ld_builtin): Change the names of the load
patterns to be the generic vector loads.
(altivec_expand_st_builtin): Change the names of the store
patterns to be the generic vector stores.
(print_operand): Add 'x' to print out a VSX register properly.
(rs6000_emit_minmax): Directly emit the min/max patterns for VSX
and Altivec.
* config/rs6000/vsx.md: New file to add all of the VSX specific
instructions. Add support for load, store, move, add, subtract,
multiply, multiply/add, divide, negate, absolute value, maximum,
minimum, sqrt, and, or, xor, and complent, xor, one's complement,
and nor instructions.
* config/rs6000/rs6000.h (UNITS_PER_VSX_WORD): Define.
(VSX_REGNO_P): New macro for VSX registers.
(VFLOAT_REGNO): New macro for vector floating point registers.
(VINT_REGNO): New macro for vector integer registers.
(VLOGICAL_REGNO): New macro for vector logical registers.
(VSX_VECTOR_MODE): New macro for vector modes supported by VSX.
(HARD_REGNO_NREGS): Switch to using pre-computed table.
(CLASS_MAX_NREGS): Ditto.
* config/rs6000/altivec.md (altivec_lvx_<mode>): Delete, repalced
by expanders in vector.md.
(altivec_stvx_<mode>): Ditto.
(mov<mode>): Ditto.
(mov_altivec_<mode>): Rename from mov<mode>_internal, and prefer
using VSX if available.
(addv4sf3_altivec): Rename from standard name, and prefer using
VSX if available.
(subv4sf3_altivec): Ditto.
(mulv4sf3_altivec): Ditto.
(smaxv4sf3_altivec): Ditto.
(sminv4sf3_altivec): Ditto.
(and<mode>3_altivec): Ditto.
(ior<mode>3_altivec): Ditto.
(xor<mode>3_altivec): Ditto.
(one_cmpl<mode>2): Ditto.
(nor<mode>3_altivec): Ditto.
(andc<mode>3_altivec): Ditto.
(absv4sf2_altivec): Ditto.
(vcondv4sf): Move to vector.md.
* config/rs6000/rs6000.md (negdf2_fpr): Add !TARGET_VSX to prefer
the version in vsx.md if -mvsx is available.
(absdf2_fpr): Ditto.
(nabsdf2_fpr): Ditto.
(adddf3_fpr): Ditto.
(subdf3_fpr): Ditto.
(muldf3_fpr): Ditto.
(multiply/add patterns): Ditto.
(movdf_hardfloat64): Disable if -mvsx.
(movdf_hardfloat64_vsx): Clone from movdf_hardfloat64 and add VSX
support.
(vector.md): Include new .md file.
(vsx.md): Ditto.
2009-02-11 Michael Meissner <meissner@linux.vnet.ibm.com>
* doc/invoke.texi (-mvsx, -mno-vsx): Document new switches.
* config/rs6000/linux64.opt (-mprofile-kernel): Move to being a
variable to reduce the number of target flag bits.
* config/rs6000/sysv4.opt (-mbit-align): Ditto.
(-mbit-word): Ditto.
(-mregnames): Ditto.
* config/rs6000/rs6000.opt (-mupdate, -mno-update): Ditto.
(-mvsx): New switch, enable VSX support.
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Define
__VSX__ if the vector/scalar instruction set is available.
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Change
to allow -mprofile-kernel to be a variable.
* config/rs6000/rs6000.c (processor_target_table): Set -mvsx for
power7 cpus.
(POWERPC_MASKS): Add -mvsx.
* config/rs6000/rs6000.h (ADDITIONAL_REGISTER_NAMES): Add VSX
register names for the registers that overlap with the floating
point and altivec registers.
* config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS):
TARGET_NO_BITFIELD_WORD is now a variable, not a target mask.
2009-02-11 Pat Haugen <pthaugen@us.ibm.com>
Michael Meissner <meissner@linux.vnet.ibm.com>
* doc/invoke.texi (-mpopcntd, -mno-popcntd): Document new
switches.
* configure.ac (powerpc*-*-*): Test for the assembler having the
popcntd instruction.
* configure: Regenerate.
* config.in (HAVE_AS_POPCNTD): Add default value for configure
test.
* config/rs6000/power7.md: New file.
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Define
_ARCH_PWR7 if the popcntd instruction is available.
* config/rs6000/rs6000.opt (-mpopcntd): New switch to
enable/disable the use of the popcntd and popcntw instructions.
(-mfused-madd, -mno-fused-madd): Move to being a separate variable
because we are out of mask bits.
* config/rs6000/rs6000.c (power7_cost): Define.
(rs6000_override_options): Add Power7 support.
(rs6000_issue_rate): Ditto.
(insn_must_be_first_in_group): Ditto.
(insn_must_be_last_in_group): Ditto.
(rs6000_emit_popcount): Add support for using the popcntw and
popcntd instructions.
* config/rs6000/rs6000.h (ASM_CPU_POWER7_SPEC): Switch to using
popcntd as the test for a power7 assembler instead of vector
scalar instructions.
* (TARGET_POPCNTD): If assembler does not support the popcntd
instruction, disable using it.
(processor_type): Add Power7 entry.
* config/rs6000/rs6000.md (define_attr "cpu"): Add power7.
(power7.md): Include it.
(andi./andis./nor. patterns): Change insn type to fast_compare.
(popcntwsi2): Add popcntw support.
(popcntddi2): Add popcntd support.
testsuite/
2009-03-01 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/vsx-vector-1.c: Fix typos.
* gcc.target/powerpc/vsx-vector-2.c: Ditto.
* gcc.target/powerpc/vsx-vector-3.c: New file, test __vector
double.
* gcc.target/powerpc/vsx-vector-4.c: New file, test __vector float
uses VSX instructions if -mvsx.
* gcc.dg/vmx/vmx.exp (DEFAULT_VMXCLFAGS): Add -mno-vsx.
* lib/target-supports.exp (check_vsx_hw_available): New function
to test for VSX.
(check_vmx_hw_available): Add -mno-vsx to options.
(check_effective_target_powerpc_vsx_ok): New function to check if
the powerpc compiler can support VSX.
2009-02-27 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/vsx-vector-1.c: New file to test VSX code
generation.
* gcc.target/powerpc/vsx-vector-2.c: Ditto.
2009-02-01 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/popcount-2.c: New file for power7 support.
* gcc.target/powerpc/popcount-3.c: Ditto.
--- gcc/doc/invoke.texi (.../trunk) (revision 144557)
+++ gcc/doc/invoke.texi (.../branches/ibm/power7-meissner) (revision 144730)
@@ -712,7 +712,8 @@ See RS/6000 and PowerPC Options.
-maltivec -mno-altivec @gol
-mpowerpc-gpopt -mno-powerpc-gpopt @gol
-mpowerpc-gfxopt -mno-powerpc-gfxopt @gol
--mmfcrf -mno-mfcrf -mpopcntb -mno-popcntb -mfprnd -mno-fprnd @gol
+-mmfcrf -mno-mfcrf -mpopcntb -mno-popcntb -mpopcntd -mno-popcntd @gol
+-mfprnd -mno-fprnd @gol
-mcmpb -mno-cmpb -mmfpgpr -mno-mfpgpr -mhard-dfp -mno-hard-dfp @gol
-mnew-mnemonics -mold-mnemonics @gol
-mfull-toc -mminimal-toc -mno-fp-in-toc -mno-sum-in-toc @gol
@@ -726,7 +727,9 @@ See RS/6000 and PowerPC Options.
-mstrict-align -mno-strict-align -mrelocatable @gol
-mno-relocatable -mrelocatable-lib -mno-relocatable-lib @gol
-mtoc -mno-toc -mlittle -mlittle-endian -mbig -mbig-endian @gol
--mdynamic-no-pic -maltivec -mswdiv @gol
+-mdynamic-no-pic -maltivec -mswdiv -mvsx -mvsx-vector-memory @gol
+-mvsx-vector-float -mvsx-vector-double @gol
+-mvsx-scalar-double -mvsx-scalar-memory @gol
-mprioritize-restricted-insns=@var{priority} @gol
-msched-costly-dep=@var{dependence_type} @gol
-minsert-sched-nops=@var{scheme} @gol
@@ -13512,6 +13515,8 @@ These @samp{-m} options are defined for
@itemx -mno-mfcrf
@itemx -mpopcntb
@itemx -mno-popcntb
+@itemx -mpopcntd
+@itemx -mno-popcntd
@itemx -mfprnd
@itemx -mno-fprnd
@itemx -mcmpb
@@ -13536,6 +13541,8 @@ These @samp{-m} options are defined for
@opindex mno-mfcrf
@opindex mpopcntb
@opindex mno-popcntb
+@opindex mpopcntd
+@opindex mno-popcntd
@opindex mfprnd
@opindex mno-fprnd
@opindex mcmpb
@@ -13585,6 +13592,9 @@ The @option{-mpopcntb} option allows GCC
double precision FP reciprocal estimate instruction implemented on the
POWER5 processor and other processors that support the PowerPC V2.02
architecture.
+The @option{-mpopcntd} option allows GCC to generate the popcount
+instruction implemented on the POWER7 processor and other processors
+that support the PowerPC V2.06 architecture.
The @option{-mfprnd} option allows GCC to generate the FP round to
integer instructions implemented on the POWER5+ processor and other
processors that support the PowerPC V2.03 architecture.
@@ -13663,9 +13673,9 @@ The @option{-mcpu} options automatically
following options:
@gccoptlist{-maltivec -mfprnd -mhard-float -mmfcrf -mmultiple @gol
--mnew-mnemonics -mpopcntb -mpower -mpower2 -mpowerpc64 @gol
+-mnew-mnemonics -mpopcntb -mpopcntd -mpower -mpower2 -mpowerpc64 @gol
-mpowerpc-gpopt -mpowerpc-gfxopt -msingle-float -mdouble-float @gol
--msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr}
+-msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr -mvsx}
The particular options set for any particular CPU will vary between
compiler versions, depending on what setting seems to produce optimal
@@ -13766,6 +13776,44 @@ instructions.
This option has been deprecated. Use @option{-mspe} and
@option{-mno-spe} instead.
+@item -mvsx
+@itemx -mno-vsx
+@opindex mvsx
+@opindex mno-vsx
+Generate code that uses (does not use) vector/scalar (VSX)
+instructions, and also enable the use of built-in functions that allow
+more direct access to the VSX instruction set.
+
+@item -mvsx-vector-memory
+@itemx -mno-vsx-vector-memory
+If @option{-mvsx}, use VSX memory reference instructions for vectors
+instead of the Altivec instructions This option is a temporary switch
+to tune the compiler, and may not be supported in future versions.
+
+@item -mvsx-vector-float
+@itemx -mno-vsx-vector-float
+If @option{-mvsx}, use VSX arithmetic instructions for float vectors.
+This option is a temporary switch to tune the compiler, and may not be
+supported in future versions.
+
+@item -mvsx-vector-double
+@itemx -mno-vsx-vector-double
+If @option{-mvsx}, use VSX arithmetic instructions for double
+vectors. This option is a temporary switch to tune the
+compiler, and may not be supported in future versions.
+
+@item -mvsx-scalar-double
+@itemx -mno-vsx-scalar-double
+If @option{-mvsx}, use VSX arithmetic instructions for scalar double.
+This option is a temporary switch to tune the compiler, and may not be
+supported in future versions.
+
+@item -mvsx-scalar-memory
+@itemx -mno-vsx-scalar-memory
+If @option{-mvsx}, use VSX memory reference instructions for scalar
+double. This option is a temporary switch to tune the compiler, and
+may not be supported in future versions.
+
@item -mfloat-gprs=@var{yes/single/double/no}
@itemx -mfloat-gprs
@opindex mfloat-gprs
--- gcc/doc/md.texi (.../trunk) (revision 144557)
+++ gcc/doc/md.texi (.../branches/ibm/power7-meissner) (revision 144730)
@@ -1905,7 +1905,19 @@ Address base register
Floating point register
@item v
-Vector register
+Altivec vector register
+
+@item wd
+VSX vector register to hold vector double data
+
+@item wf
+VSX vector register to hold vector float data
+
+@item ws
+VSX vector register to hold scalar float data
+
+@item wa
+Any VSX register
@item h
@samp{MQ}, @samp{CTR}, or @samp{LINK} register
@@ -1991,6 +2003,9 @@ AND masks that can be performed by two r
@item W
Vector constant that does not require memory
+@item j
+Vector constant that is all zeros.
+
@end table
@item Intel 386---@file{config/i386/constraints.md}
--- gcc/reload.c (.../trunk) (revision 144557)
+++ gcc/reload.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -6255,8 +6255,14 @@ subst_reloads (rtx insn)
*r->where = reloadreg;
}
/* If reload got no reg and isn't optional, something's wrong. */
- else
- gcc_assert (rld[r->what].optional);
+ else if (!rld[r->what].optional)
+ {
+ char buffer[100];
+ sprintf (buffer,
+ "unable to find register for reload, replacement #%d",
+ i);
+ fatal_insn (buffer, insn);
+ }
}
}
--- gcc/testsuite/gcc.target/powerpc/popcount-3.c (.../trunk) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/popcount-3.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -0,0 +1,9 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -mcpu=power7 -m64" } */
+/* { dg-final { scan-assembler "popcntd" } } */
+
+long foo(int x)
+{
+ return __builtin_popcountl(x);
+}
--- gcc/testsuite/gcc.target/powerpc/vsx-vector-1.c (.../trunk) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vsx-vector-1.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -0,0 +1,74 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -ftree-vectorize -mcpu=power7 -m64" } */
+/* { dg-final { scan-assembler "xvadddp" } } */
+/* { dg-final { scan-assembler "xvsubdp" } } */
+/* { dg-final { scan-assembler "xvmuldp" } } */
+/* { dg-final { scan-assembler "xvdivdp" } } */
+/* { dg-final { scan-assembler "xvmadd" } } */
+/* { dg-final { scan-assembler "xvmsub" } } */
+
+#ifndef SIZE
+#define SIZE 1024
+#endif
+
+double a[SIZE] __attribute__((__aligned__(32)));
+double b[SIZE] __attribute__((__aligned__(32)));
+double c[SIZE] __attribute__((__aligned__(32)));
+double d[SIZE] __attribute__((__aligned__(32)));
+double e[SIZE] __attribute__((__aligned__(32)));
+
+void
+vector_add (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = b[i] + c[i];
+}
+
+void
+vector_subtract (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = b[i] - c[i];
+}
+
+void
+vector_multiply (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = b[i] * c[i];
+}
+
+void
+vector_multiply_add (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = (b[i] * c[i]) + d[i];
+}
+
+void
+vector_multiply_subtract (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = (b[i] * c[i]) - d[i];
+}
+
+void
+vector_divide (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = b[i] / c[i];
+}
--- gcc/testsuite/gcc.target/powerpc/vsx-vector-2.c (.../trunk) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vsx-vector-2.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -0,0 +1,74 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -ftree-vectorize -mcpu=power7 -m64" } */
+/* { dg-final { scan-assembler "xvaddsp" } } */
+/* { dg-final { scan-assembler "xvsubsp" } } */
+/* { dg-final { scan-assembler "xvmulsp" } } */
+/* { dg-final { scan-assembler "xvdivsp" } } */
+/* { dg-final { scan-assembler "xvmadd" } } */
+/* { dg-final { scan-assembler "xvmsub" } } */
+
+#ifndef SIZE
+#define SIZE 1024
+#endif
+
+float a[SIZE] __attribute__((__aligned__(32)));
+float b[SIZE] __attribute__((__aligned__(32)));
+float c[SIZE] __attribute__((__aligned__(32)));
+float d[SIZE] __attribute__((__aligned__(32)));
+float e[SIZE] __attribute__((__aligned__(32)));
+
+void
+vector_add (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = b[i] + c[i];
+}
+
+void
+vector_subtract (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = b[i] - c[i];
+}
+
+void
+vector_multiply (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = b[i] * c[i];
+}
+
+void
+vector_multiply_add (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = (b[i] * c[i]) + d[i];
+}
+
+void
+vector_multiply_subtract (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = (b[i] * c[i]) - d[i];
+}
+
+void
+vector_divide (void)
+{
+ int i;
+
+ for (i = 0; i < SIZE; i++)
+ a[i] = b[i] / c[i];
+}
--- gcc/testsuite/gcc.target/powerpc/vsx-vector-3.c (.../trunk) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vsx-vector-3.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -0,0 +1,48 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -ftree-vectorize -mcpu=power7 -m64" } */
+/* { dg-final { scan-assembler "xvadddp" } } */
+/* { dg-final { scan-assembler "xvsubdp" } } */
+/* { dg-final { scan-assembler "xvmuldp" } } */
+/* { dg-final { scan-assembler "xvdivdp" } } */
+/* { dg-final { scan-assembler "xvmadd" } } */
+/* { dg-final { scan-assembler "xvmsub" } } */
+
+__vector double a, b, c, d;
+
+void
+vector_add (void)
+{
+ a = b + c;
+}
+
+void
+vector_subtract (void)
+{
+ a = b - c;
+}
+
+void
+vector_multiply (void)
+{
+ a = b * c;
+}
+
+void
+vector_multiply_add (void)
+{
+ a = (b * c) + d;
+}
+
+void
+vector_multiply_subtract (void)
+{
+ a = (b * c) - d;
+}
+
+void
+vector_divide (void)
+{
+ a = b / c;
+}
--- gcc/testsuite/gcc.target/powerpc/popcount-2.c (.../trunk) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/popcount-2.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -0,0 +1,9 @@
+/* { dg-do compile { target { ilp32 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -mcpu=power7 -m32" } */
+/* { dg-final { scan-assembler "popcntw" } } */
+
+int foo(int x)
+{
+ return __builtin_popcount(x);
+}
--- gcc/testsuite/gcc.target/powerpc/vsx-vector-4.c (.../trunk) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vsx-vector-4.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -0,0 +1,48 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -ftree-vectorize -mcpu=power7 -m64" } */
+/* { dg-final { scan-assembler "xvaddsp" } } */
+/* { dg-final { scan-assembler "xvsubsp" } } */
+/* { dg-final { scan-assembler "xvmulsp" } } */
+/* { dg-final { scan-assembler "xvdivsp" } } */
+/* { dg-final { scan-assembler "xvmadd" } } */
+/* { dg-final { scan-assembler "xvmsub" } } */
+
+__vector float a, b, c, d;
+
+void
+vector_add (void)
+{
+ a = b + c;
+}
+
+void
+vector_subtract (void)
+{
+ a = b - c;
+}
+
+void
+vector_multiply (void)
+{
+ a = b * c;
+}
+
+void
+vector_multiply_add (void)
+{
+ a = (b * c) + d;
+}
+
+void
+vector_multiply_subtract (void)
+{
+ a = (b * c) - d;
+}
+
+void
+vector_divide (void)
+{
+ a = b / c;
+}
--- gcc/testsuite/gcc.dg/vmx/vmx.exp (.../trunk) (revision 144557)
+++ gcc/testsuite/gcc.dg/vmx/vmx.exp (.../branches/ibm/power7-meissner) (revision 144730)
@@ -31,7 +31,7 @@ if {![istarget powerpc*-*-*]
# nothing but extensions.
global DEFAULT_VMXCFLAGS
if ![info exists DEFAULT_VMXCFLAGS] then {
- set DEFAULT_VMXCFLAGS "-maltivec -mabi=altivec -std=gnu99"
+ set DEFAULT_VMXCFLAGS "-maltivec -mabi=altivec -std=gnu99 -mno-vsx"
}
# If the target system supports AltiVec instructions, the default action
--- gcc/testsuite/lib/target-supports.exp (.../trunk) (revision 144557)
+++ gcc/testsuite/lib/target-supports.exp (.../branches/ibm/power7-meissner) (revision 144730)
@@ -873,6 +873,32 @@ proc check_sse2_hw_available { } {
}]
}
+# Return 1 if the target supports executing VSX instructions, 0
+# otherwise. Cache the result.
+
+proc check_vsx_hw_available { } {
+ return [check_cached_effective_target vsx_hw_available {
+ # Some simulators are known to not support VSX instructions.
+ # For now, disable on Darwin
+ if { [istarget powerpc-*-eabi] || [istarget powerpc*-*-eabispe] || [istarget *-*-darwin*]} {
+ expr 0
+ } else {
+ set options "-mvsx"
+ check_runtime_nocache vsx_hw_available {
+ int main()
+ {
+ #ifdef __MACH__
+ asm volatile ("xxlor vs0,vs0,vs0");
+ #else
+ asm volatile ("xxlor 0,0,0");
+ #endif
+ return 0;
+ }
+ } $options
+ }
+ }]
+}
+
# Return 1 if the target supports executing AltiVec instructions, 0
# otherwise. Cache the result.
@@ -883,12 +909,13 @@ proc check_vmx_hw_available { } {
expr 0
} else {
# Most targets don't require special flags for this test case, but
- # Darwin does.
+ # Darwin does. Just to be sure, make sure VSX is not enabled for
+ # the altivec tests.
if { [istarget *-*-darwin*]
|| [istarget *-*-aix*] } {
- set options "-maltivec"
+ set options "-maltivec -mno-vsx"
} else {
- set options ""
+ set options "-mno-vsx"
}
check_runtime_nocache vmx_hw_available {
int main()
@@ -1519,6 +1546,33 @@ proc check_effective_target_powerpc_alti
}
}
+# Return 1 if this is a PowerPC target supporting -mvsx
+
+proc check_effective_target_powerpc_vsx_ok { } {
+ if { ([istarget powerpc*-*-*]
+ && ![istarget powerpc-*-linux*paired*])
+ || [istarget rs6000-*-*] } {
+ # AltiVec is not supported on AIX before 5.3.
+ if { [istarget powerpc*-*-aix4*]
+ || [istarget powerpc*-*-aix5.1*]
+ || [istarget powerpc*-*-aix5.2*] } {
+ return 0
+ }
+ return [check_no_compiler_messages powerpc_vsx_ok object {
+ int main (void) {
+#ifdef __MACH__
+ asm volatile ("xxlor vs0,vs0,vs0");
+#else
+ asm volatile ("xxlor 0,0,0");
+#endif
+ return 0;
+ }
+ } "-mvsx"]
+ } else {
+ return 0
+ }
+}
+
# Return 1 if this is a PowerPC target supporting -mcpu=cell.
proc check_effective_target_powerpc_ppu_ok { } {
--- gcc/config.in (.../trunk) (revision 144557)
+++ gcc/config.in (.../branches/ibm/power7-meissner) (revision 144730)
@@ -334,12 +334,18 @@
#endif
-/* Define if your assembler supports popcntb field. */
+/* Define if your assembler supports popcntb instruction. */
#ifndef USED_FOR_TARGET
#undef HAVE_AS_POPCNTB
#endif
+/* Define if your assembler supports popcntd instruction. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_POPCNTD
+#endif
+
+
/* Define if your assembler supports .register. */
#ifndef USED_FOR_TARGET
#undef HAVE_AS_REGISTER_PSEUDO_OP
--- gcc/configure.ac (.../trunk) (revision 144557)
+++ gcc/configure.ac (.../branches/ibm/power7-meissner) (revision 144730)
@@ -2,7 +2,7 @@
# Process this file with autoconf to generate a configuration script.
# Copyright 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006,
-# 2007, 2008 Free Software Foundation, Inc.
+# 2007, 2008, 2009 Free Software Foundation, Inc.
#This file is part of GCC.
@@ -3080,7 +3080,7 @@ foo: nop
esac
gcc_GAS_CHECK_FEATURE([move fp gpr support],
- gcc_cv_as_powerpc_mfpgpr, [9,99,0],,
+ gcc_cv_as_powerpc_mfpgpr, [2,19,2],,
[$conftest_s],,
[AC_DEFINE(HAVE_AS_MFPGPR, 1,
[Define if your assembler supports mffgpr and mftgpr.])])
@@ -3114,7 +3114,7 @@ LCF0:
esac
gcc_GAS_CHECK_FEATURE([compare bytes support],
- gcc_cv_as_powerpc_cmpb, [9,99,0], -a32,
+ gcc_cv_as_powerpc_cmpb, [2,19,2], -a32,
[$conftest_s],,
[AC_DEFINE(HAVE_AS_CMPB, 1,
[Define if your assembler supports cmpb.])])
@@ -3129,7 +3129,7 @@ LCF0:
esac
gcc_GAS_CHECK_FEATURE([decimal float support],
- gcc_cv_as_powerpc_dfp, [9,99,0], -a32,
+ gcc_cv_as_powerpc_dfp, [2,19,2], -a32,
[$conftest_s],,
[AC_DEFINE(HAVE_AS_DFP, 1,
[Define if your assembler supports DFP instructions.])])
@@ -3144,11 +3144,26 @@ LCF0:
esac
gcc_GAS_CHECK_FEATURE([vector-scalar support],
- gcc_cv_as_powerpc_vsx, [9,99,0], -a32,
+ gcc_cv_as_powerpc_vsx, [2,19,2], -a32,
[$conftest_s],,
[AC_DEFINE(HAVE_AS_VSX, 1,
[Define if your assembler supports VSX instructions.])])
+ case $target in
+ *-*-aix*) conftest_s=' .machine "pwr7"
+ .csect .text[[PR]]
+ popcntd 3,3';;
+ *) conftest_s=' .machine power7
+ .text
+ popcntd 3,3';;
+ esac
+
+ gcc_GAS_CHECK_FEATURE([popcntd support],
+ gcc_cv_as_powerpc_popcntd, [2,19,2], -a32,
+ [$conftest_s],,
+ [AC_DEFINE(HAVE_AS_POPCNTD, 1,
+ [Define if your assembler supports POPCNTD instructions.])])
+
gcc_GAS_CHECK_FEATURE([.gnu_attribute support],
gcc_cv_as_powerpc_gnu_attribute, [2,18,0],,
[.gnu_attribute 4,1],,
--- gcc/configure (.../trunk) (revision 144557)
+++ gcc/configure (.../branches/ibm/power7-meissner) (revision 144730)
@@ -23225,7 +23225,7 @@ if test "${gcc_cv_as_powerpc_mfpgpr+set}
else
gcc_cv_as_powerpc_mfpgpr=no
if test $in_tree_gas = yes; then
- if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0`
+ if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2`
then gcc_cv_as_powerpc_mfpgpr=yes
fi
elif test x$gcc_cv_as != x; then
@@ -23321,7 +23321,7 @@ if test "${gcc_cv_as_powerpc_cmpb+set}"
else
gcc_cv_as_powerpc_cmpb=no
if test $in_tree_gas = yes; then
- if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0`
+ if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2`
then gcc_cv_as_powerpc_cmpb=yes
fi
elif test x$gcc_cv_as != x; then
@@ -23367,7 +23367,7 @@ if test "${gcc_cv_as_powerpc_dfp+set}" =
else
gcc_cv_as_powerpc_dfp=no
if test $in_tree_gas = yes; then
- if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0`
+ if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2`
then gcc_cv_as_powerpc_dfp=yes
fi
elif test x$gcc_cv_as != x; then
@@ -23413,7 +23413,7 @@ if test "${gcc_cv_as_powerpc_vsx+set}" =
else
gcc_cv_as_powerpc_vsx=no
if test $in_tree_gas = yes; then
- if test $gcc_cv_gas_vers -ge `expr \( \( 9 \* 1000 \) + 99 \) \* 1000 + 0`
+ if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2`
then gcc_cv_as_powerpc_vsx=yes
fi
elif test x$gcc_cv_as != x; then
@@ -23443,6 +23443,52 @@ _ACEOF
fi
+ case $target in
+ *-*-aix*) conftest_s=' .machine "pwr7"
+ .csect .text[PR]
+ popcntd 3,3';;
+ *) conftest_s=' .machine power7
+ .text
+ popcntd 3,3';;
+ esac
+
+ echo "$as_me:$LINENO: checking assembler for popcntd support" >&5
+echo $ECHO_N "checking assembler for popcntd support... $ECHO_C" >&6
+if test "${gcc_cv_as_powerpc_popcntd+set}" = set; then
+ echo $ECHO_N "(cached) $ECHO_C" >&6
+else
+ gcc_cv_as_powerpc_popcntd=no
+ if test $in_tree_gas = yes; then
+ if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 19 \) \* 1000 + 2`
+ then gcc_cv_as_powerpc_popcntd=yes
+fi
+ elif test x$gcc_cv_as != x; then
+ echo "$conftest_s" > conftest.s
+ if { ac_try='$gcc_cv_as -a32 -o conftest.o conftest.s >&5'
+ { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5
+ (eval $ac_try) 2>&5
+ ac_status=$?
+ echo "$as_me:$LINENO: \$? = $ac_status" >&5
+ (exit $ac_status); }; }
+ then
+ gcc_cv_as_powerpc_popcntd=yes
+ else
+ echo "configure: failed program was" >&5
+ cat conftest.s >&5
+ fi
+ rm -f conftest.o conftest.s
+ fi
+fi
+echo "$as_me:$LINENO: result: $gcc_cv_as_powerpc_popcntd" >&5
+echo "${ECHO_T}$gcc_cv_as_powerpc_popcntd" >&6
+if test $gcc_cv_as_powerpc_popcntd = yes; then
+
+cat >>confdefs.h <<\_ACEOF
+#define HAVE_AS_POPCNTD 1
+_ACEOF
+
+fi
+
echo "$as_me:$LINENO: checking assembler for .gnu_attribute support" >&5
echo $ECHO_N "checking assembler for .gnu_attribute support... $ECHO_C" >&6
if test "${gcc_cv_as_powerpc_gnu_attribute+set}" = set; then
--- gcc/config/rs6000/aix53.h (.../trunk) (revision 144557)
+++ gcc/config/rs6000/aix53.h (.../branches/ibm/power7-meissner) (revision 144730)
@@ -57,20 +57,24 @@ do { \
#undef ASM_SPEC
#define ASM_SPEC "-u %{maix64:-a64 %{!mcpu*:-mppc64}} %(asm_cpu)"
-/* Common ASM definitions used by ASM_SPEC amongst the various targets
- for handling -mcpu=xxx switches. */
+/* Common ASM definitions used by ASM_SPEC amongst the various targets for
+ handling -mcpu=xxx switches. There is a parallel list in driver-rs6000.c to
+ provide the default assembler options if the user uses -mcpu=native, so if
+ you make changes here, make them there also. */
#undef ASM_CPU_SPEC
#define ASM_CPU_SPEC \
"%{!mcpu*: %{!maix64: \
%{mpowerpc64: -mppc64} \
%{maltivec: -m970} \
%{!maltivec: %{!mpower64: %(asm_default)}}}} \
+%{mcpu=native: %(asm_cpu_native)} \
%{mcpu=power3: -m620} \
%{mcpu=power4: -mpwr4} \
%{mcpu=power5: -mpwr5} \
%{mcpu=power5+: -mpwr5x} \
%{mcpu=power6: -mpwr6} \
%{mcpu=power6x: -mpwr6} \
+%{mcpu=power7: -mpwr7} \
%{mcpu=powerpc: -mppc} \
%{mcpu=rs64a: -mppc} \
%{mcpu=603: -m603} \
--- gcc/config/rs6000/vector.md (.../trunk) (revision 0)
+++ gcc/config/rs6000/vector.md (.../branches/ibm/power7-meissner) (revision 144730)
@@ -0,0 +1,518 @@
+;; Expander definitions for vector support between altivec & vsx. No
+;; instructions are in this file, this file provides the generic vector
+;; expander, and the actual vector instructions will be in altivec.md and
+;; vsx.md
+
+;; Copyright (C) 2009
+;; Free Software Foundation, Inc.
+;; Contributed by Michael Meissner <meissner@linux.vnet.ibm.com>
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3. If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+;; Vector int modes
+(define_mode_iterator VEC_I [V16QI V8HI V4SI])
+
+;; Vector float modes
+(define_mode_iterator VEC_F [V4SF V2DF])
+
+;; Vector logical modes
+(define_mode_iterator VEC_L [V16QI V8HI V4SI V2DI V4SF V2DF TI])
+
+;; Vector modes for moves. Don't do TImode here.
+(define_mode_iterator VEC_M [V16QI V8HI V4SI V2DI V4SF V2DF])
+
+;; Vector comparison modes
+(define_mode_iterator VEC_C [V16QI V8HI V4SI V4SF V2DF])
+
+;; Base type from vector mode
+(define_mode_attr VEC_base [(V16QI "QI")
+ (V8HI "HI")
+ (V4SI "SI")
+ (V2DI "DI")
+ (V4SF "SF")
+ (V2DF "DF")
+ (TI "TI")])
+
+;; Same size integer type for floating point data
+(define_mode_attr VEC_int [(V4SF "v4si")
+ (V2DF "v2di")])
+
+(define_mode_attr VEC_INT [(V4SF "V4SI")
+ (V2DF "V2DI")])
+
+;; Vector move instructions.
+(define_expand "mov<mode>"
+ [(set (match_operand:VEC_M 0 "nonimmediate_operand" "")
+ (match_operand:VEC_M 1 "any_operand" ""))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+ /* modes without special handling just generate the normal SET operation. */
+ if (<MODE>mode != TImode && <MODE>mode != V2DImode && <MODE>mode != V2DFmode)
+ {
+ rs6000_emit_move (operands[0], operands[1], <MODE>mode);
+ DONE;
+ }
+ else if (!vlogical_operand (operands[0], <MODE>mode)
+ && !vlogical_operand (operands[1], <MODE>mode))
+ operands[1] = force_reg (<MODE>mode, operands[1]);
+})
+
+;; Generic vector floating point load/store instructions. These will match
+;; insns defined in vsx.md or altivec.md depending on the switches.
+(define_expand "vector_load_<mode>"
+ [(set (match_operand:VEC_M 0 "vfloat_operand" "")
+ (match_operand:VEC_M 1 "memory_operand" ""))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "vector_store_<mode>"
+ [(set (match_operand:VEC_M 0 "memory_operand" "")
+ (match_operand:VEC_M 1 "vfloat_operand" ""))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+;; Splits if a GPR register was chosen for the move
+(define_split
+ [(set (match_operand:VEC_L 0 "nonimmediate_operand" "")
+ (match_operand:VEC_L 1 "input_operand" ""))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)
+ && reload_completed
+ && gpr_or_gpr_p (operands[0], operands[1])"
+ [(pc)]
+{
+ rs6000_split_multireg_move (operands[0], operands[1]);
+ DONE;
+})
+
+
+;; Generic floating point vector arithmetic support
+(define_expand "add<mode>3"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (plus:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
+ (match_operand:VEC_F 2 "vfloat_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "sub<mode>3"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (minus:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
+ (match_operand:VEC_F 2 "vfloat_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "mul<mode>3"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (mult:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
+ (match_operand:VEC_F 2 "vfloat_operand" "")))]
+ "(VECTOR_UNIT_VSX_P (<MODE>mode)
+ || (VECTOR_UNIT_ALTIVEC_P (<MODE>mode) && TARGET_FUSED_MADD))"
+ "
+{
+ if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode))
+ {
+ emit_insn (gen_altivec_mulv4sf3 (operands[0], operands[1], operands[2]));
+ DONE;
+ }
+}")
+
+(define_expand "div<mode>3"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (div:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
+ (match_operand:VEC_F 2 "vfloat_operand" "")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "neg<mode>2"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (neg:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "
+{
+ if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode))
+ {
+ emit_insn (gen_altivec_negv4sf2 (operands[0], operands[1]));
+ DONE;
+ }
+}")
+
+(define_expand "abs<mode>2"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (abs:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "
+{
+ if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode))
+ {
+ emit_insn (gen_altivec_absv4sf2 (operands[0], operands[1]));
+ DONE;
+ }
+}")
+
+(define_expand "smin<mode>3"
+ [(set (match_operand:VEC_F 0 "register_operand" "")
+ (smin:VEC_F (match_operand:VEC_F 1 "register_operand" "")
+ (match_operand:VEC_F 2 "register_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "smax<mode>3"
+ [(set (match_operand:VEC_F 0 "register_operand" "")
+ (smax:VEC_F (match_operand:VEC_F 1 "register_operand" "")
+ (match_operand:VEC_F 2 "register_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+
+(define_expand "sqrt<mode>2"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (sqrt:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "ftrunc<mode>2"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (fix:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+
+;; Vector comparisons
+(define_expand "vcond<mode>"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (if_then_else:VEC_F
+ (match_operator 3 "comparison_operator"
+ [(match_operand:VEC_F 4 "vfloat_operand" "")
+ (match_operand:VEC_F 5 "vfloat_operand" "")])
+ (match_operand:VEC_F 1 "vfloat_operand" "")
+ (match_operand:VEC_F 2 "vfloat_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "
+{
+ if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
+ operands[3], operands[4], operands[5]))
+ DONE;
+ else
+ FAIL;
+}")
+
+(define_expand "vcond<mode>"
+ [(set (match_operand:VEC_I 0 "vint_operand" "")
+ (if_then_else:VEC_I
+ (match_operator 3 "comparison_operator"
+ [(match_operand:VEC_I 4 "vint_operand" "")
+ (match_operand:VEC_I 5 "vint_operand" "")])
+ (match_operand:VEC_I 1 "vint_operand" "")
+ (match_operand:VEC_I 2 "vint_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+ "
+{
+ if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
+ operands[3], operands[4], operands[5]))
+ DONE;
+ else
+ FAIL;
+}")
+
+(define_expand "vcondu<mode>"
+ [(set (match_operand:VEC_I 0 "vint_operand" "=v")
+ (if_then_else:VEC_I
+ (match_operator 3 "comparison_operator"
+ [(match_operand:VEC_I 4 "vint_operand" "")
+ (match_operand:VEC_I 5 "vint_operand" "")])
+ (match_operand:VEC_I 1 "vint_operand" "")
+ (match_operand:VEC_I 2 "vint_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+ "
+{
+ if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
+ operands[3], operands[4], operands[5]))
+ DONE;
+ else
+ FAIL;
+}")
+
+(define_expand "vector_eq<mode>"
+ [(set (match_operand:VEC_C 0 "vlogical_operand" "")
+ (eq:VEC_C (match_operand:VEC_C 1 "vlogical_operand" "")
+ (match_operand:VEC_C 2 "vlogical_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "vector_gt<mode>"
+ [(set (match_operand:VEC_C 0 "vlogical_operand" "")
+ (gt:VEC_C (match_operand:VEC_C 1 "vlogical_operand" "")
+ (match_operand:VEC_C 2 "vlogical_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "vector_ge<mode>"
+ [(set (match_operand:VEC_C 0 "vlogical_operand" "")
+ (ge:VEC_C (match_operand:VEC_C 1 "vlogical_operand" "")
+ (match_operand:VEC_C 2 "vlogical_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "vector_gtu<mode>"
+ [(set (match_operand:VEC_I 0 "vint_operand" "")
+ (gtu:VEC_I (match_operand:VEC_I 1 "vint_operand" "")
+ (match_operand:VEC_I 2 "vint_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+ "")
+
+(define_expand "vector_geu<mode>"
+ [(set (match_operand:VEC_I 0 "vint_operand" "")
+ (geu:VEC_I (match_operand:VEC_I 1 "vint_operand" "")
+ (match_operand:VEC_I 2 "vint_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+ "")
+
+;; Note the arguments for __builtin_altivec_vsel are op2, op1, mask
+;; which is in the reverse order that we want
+(define_expand "vector_vsel<mode>"
+ [(match_operand:VEC_F 0 "vlogical_operand" "")
+ (match_operand:VEC_F 1 "vlogical_operand" "")
+ (match_operand:VEC_F 2 "vlogical_operand" "")
+ (match_operand:VEC_F 3 "vlogical_operand" "")]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "
+{
+ if (VECTOR_UNIT_VSX_P (<MODE>mode))
+ emit_insn (gen_vsx_vsel<mode> (operands[0], operands[3],
+ operands[2], operands[1]));
+ else
+ emit_insn (gen_altivec_vsel<mode> (operands[0], operands[3],
+ operands[2], operands[1]));
+ DONE;
+}")
+
+(define_expand "vector_vsel<mode>"
+ [(match_operand:VEC_I 0 "vlogical_operand" "")
+ (match_operand:VEC_I 1 "vlogical_operand" "")
+ (match_operand:VEC_I 2 "vlogical_operand" "")
+ (match_operand:VEC_I 3 "vlogical_operand" "")]
+ "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+ "
+{
+ emit_insn (gen_altivec_vsel<mode> (operands[0], operands[3],
+ operands[2], operands[1]));
+ DONE;
+}")
+
+
+;; Vector logical instructions
+(define_expand "xor<mode>3"
+ [(set (match_operand:VEC_L 0 "vlogical_operand" "")
+ (xor:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")
+ (match_operand:VEC_L 2 "vlogical_operand" "")))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "ior<mode>3"
+ [(set (match_operand:VEC_L 0 "vlogical_operand" "")
+ (ior:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")
+ (match_operand:VEC_L 2 "vlogical_operand" "")))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "and<mode>3"
+ [(set (match_operand:VEC_L 0 "vlogical_operand" "")
+ (and:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")
+ (match_operand:VEC_L 2 "vlogical_operand" "")))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "one_cmpl<mode>2"
+ [(set (match_operand:VEC_L 0 "vlogical_operand" "")
+ (not:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "nor<mode>3"
+ [(set (match_operand:VEC_L 0 "vlogical_operand" "")
+ (not:VEC_L (ior:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")
+ (match_operand:VEC_L 2 "vlogical_operand" ""))))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+(define_expand "andc<mode>3"
+ [(set (match_operand:VEC_L 0 "vlogical_operand" "")
+ (and:VEC_L (not:VEC_L (match_operand:VEC_L 2 "vlogical_operand" ""))
+ (match_operand:VEC_L 1 "vlogical_operand" "")))]
+ "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "")
+
+;; Same size conversions
+(define_expand "float<VEC_int><mode>2"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (float:VEC_F (match_operand:<VEC_INT> 1 "vint_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "
+{
+ if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode))
+ {
+ emit_insn (gen_altivec_vcfsx (operands[0], operands[1], const0_rtx));
+ DONE;
+ }
+}")
+
+(define_expand "unsigned_float<VEC_int><mode>2"
+ [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+ (unsigned_float:VEC_F (match_operand:<VEC_INT> 1 "vint_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "
+{
+ if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode))
+ {
+ emit_insn (gen_altivec_vcfux (operands[0], operands[1], const0_rtx));
+ DONE;
+ }
+}")
+
+(define_expand "fix_trunc<mode><VEC_int>2"
+ [(set (match_operand:<VEC_INT> 0 "vint_operand" "")
+ (fix:<VEC_INT> (match_operand:VEC_F 1 "vfloat_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "
+{
+ if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode))
+ {
+ emit_insn (gen_altivec_vctsxs (operands[0], operands[1], const0_rtx));
+ DONE;
+ }
+}")
+
+(define_expand "fixuns_trunc<mode><VEC_int>2"
+ [(set (match_operand:<VEC_INT> 0 "vint_operand" "")
+ (unsigned_fix:<VEC_INT> (match_operand:VEC_F 1 "vfloat_operand" "")))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+ "
+{
+ if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode))
+ {
+ emit_insn (gen_altivec_vctuxs (operands[0], operands[1], const0_rtx));
+ DONE;
+ }
+}")
+
+
+;; Vector initialization, set, extract
+(define_expand "vec_init<mode>"
+ [(match_operand:VEC_C 0 "vlogical_operand" "")
+ (match_operand:VEC_C 1 "vec_init_operand" "")]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+ rs6000_expand_vector_init (operands[0], operands[1]);
+ DONE;
+})
+
+(define_expand "vec_set<mode>"
+ [(match_operand:VEC_C 0 "vlogical_operand" "")
+ (match_operand:<VEC_base> 1 "register_operand" "")
+ (match_operand 2 "const_int_operand" "")]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+ rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2]));
+ DONE;
+})
+
+(define_expand "vec_extract<mode>"
+ [(match_operand:<VEC_base> 0 "register_operand" "")
+ (match_operand:VEC_C 1 "vlogical_operand" "")
+ (match_operand 2 "const_int_operand" "")]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+ rs6000_expand_vector_extract (operands[0], operands[1],
+ INTVAL (operands[2]));
+ DONE;
+})
+
+;; Interleave patterns
+(define_expand "vec_interleave_highv4sf"
+ [(set (match_operand:V4SF 0 "vfloat_operand" "")
+ (vec_merge:V4SF
+ (vec_select:V4SF (match_operand:V4SF 1 "vfloat_operand" "")
+ (parallel [(const_int 0)
+ (const_int 2)
+ (const_int 1)
+ (const_int 3)]))
+ (vec_select:V4SF (match_operand:V4SF 2 "vfloat_operand" "")
+ (parallel [(const_int 2)
+ (const_int 0)
+ (const_int 3)
+ (const_int 1)]))
+ (const_int 5)))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)"
+ "")
+
+(define_expand "vec_interleave_lowv4sf"
+ [(set (match_operand:V4SF 0 "vfloat_operand" "")
+ (vec_merge:V4SF
+ (vec_select:V4SF (match_operand:V4SF 1 "vfloat_operand" "")
+ (parallel [(const_int 2)
+ (const_int 0)
+ (const_int 3)
+ (const_int 1)]))
+ (vec_select:V4SF (match_operand:V4SF 2 "vfloat_operand" "")
+ (parallel [(const_int 0)
+ (const_int 2)
+ (const_int 1)
+ (const_int 3)]))
+ (const_int 5)))]
+ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)"
+ "")
+
+(define_expand "vec_interleave_highv2df"
+ [(set (match_operand:V2DF 0 "vfloat_operand" "")
+ (vec_concat:V2DF
+ (vec_select:DF (match_operand:V2DF 1 "vfloat_operand" "")
+ (parallel [(const_int 0)]))
+ (vec_select:DF (match_operand:V2DF 2 "vfloat_operand" "")
+ (parallel [(const_int 0)]))))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "")
+
+(define_expand "vec_interleave_lowv2df"
+ [(set (match_operand:V2DF 0 "vfloat_operand" "")
+ (vec_concat:V2DF
+ (vec_select:DF (match_operand:V2DF 1 "vfloat_operand" "")
+ (parallel [(const_int 1)]))
+ (vec_select:DF (match_operand:V2DF 2 "vfloat_operand" "")
+ (parallel [(const_int 1)]))))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "")
+
+;; For 2 element vectors, even/odd is the same as high/low
+(define_expand "vec_extract_evenv2df"
+ [(set (match_operand:V2DF 0 "vfloat_operand" "")
+ (vec_concat:V2DF
+ (vec_select:DF (match_operand:V2DF 1 "vfloat_operand" "")
+ (parallel [(const_int 0)]))
+ (vec_select:DF (match_operand:V2DF 2 "vfloat_operand" "")
+ (parallel [(const_int 0)]))))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "")
+
+(define_expand "vec_extract_oddv2df"
+ [(set (match_operand:V2DF 0 "vfloat_operand" "")
+ (vec_concat:V2DF
+ (vec_select:DF (match_operand:V2DF 1 "vfloat_operand" "")
+ (parallel [(const_int 1)]))
+ (vec_select:DF (match_operand:V2DF 2 "vfloat_operand" "")
+ (parallel [(const_int 1)]))))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "")
--- gcc/config/rs6000/spe.md (.../trunk) (revision 144557)
+++ gcc/config/rs6000/spe.md (.../branches/ibm/power7-meissner) (revision 144730)
@@ -99,7 +99,7 @@ (define_insn "*divsf3_gpr"
;; Floating point conversion instructions.
-(define_insn "fixuns_truncdfsi2"
+(define_insn "spe_fixuns_truncdfsi2"
[(set (match_operand:SI 0 "gpc_reg_operand" "=r")
(unsigned_fix:SI (match_operand:DF 1 "gpc_reg_operand" "r")))]
"TARGET_HARD_FLOAT && TARGET_E500_DOUBLE"
--- gcc/config/rs6000/constraints.md (.../trunk) (revision 144557)
+++ gcc/config/rs6000/constraints.md (.../branches/ibm/power7-meissner) (revision 144730)
@@ -17,6 +17,8 @@
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
+;; Available constraint letters: "e", "k", "u", "A", "B", "C", "D"
+
;; Register constraints
(define_register_constraint "f" "TARGET_HARD_FLOAT && TARGET_FPRS
@@ -50,6 +52,23 @@ (define_register_constraint "y" "CR_REGS
(define_register_constraint "z" "XER_REGS"
"@internal")
+;; Use w as a prefix to add VSX modes
+;; vector double (V2DF)
+(define_register_constraint "wd" "rs6000_vector_reg_class[V2DFmode]"
+ "@internal")
+
+;; vector float (V4SF)
+(define_register_constraint "wf" "rs6000_vector_reg_class[V4SFmode]"
+ "@internal")
+
+;; scalar double (DF)
+(define_register_constraint "ws" "rs6000_vector_reg_class[DFmode]"
+ "@internal")
+
+;; any VSX register
+(define_register_constraint "wa" "rs6000_vsx_reg_class"
+ "@internal")
+
;; Integer constraints
(define_constraint "I"
@@ -159,3 +178,7 @@ (define_constraint "t"
(define_constraint "W"
"vector constant that does not require memory"
(match_operand 0 "easy_vector_constant"))
+
+(define_constraint "j"
+ "Zero vector constant"
+ (match_test "(op == const0_rtx || op == CONST0_RTX (GET_MODE (op)))"))
--- gcc/config/rs6000/predicates.md (.../trunk) (revision 144557)
+++ gcc/config/rs6000/predicates.md (.../branches/ibm/power7-meissner) (revision 144730)
@@ -38,6 +38,37 @@ (define_predicate "altivec_register_oper
|| ALTIVEC_REGNO_P (REGNO (op))
|| REGNO (op) > LAST_VIRTUAL_REGISTER")))
+;; Return 1 if op is a VSX register.
+(define_predicate "vsx_register_operand"
+ (and (match_operand 0 "register_operand")
+ (match_test "GET_CODE (op) != REG
+ || VSX_REGNO_P (REGNO (op))
+ || REGNO (op) > LAST_VIRTUAL_REGISTER")))
+
+;; Return 1 if op is a vector register that operates on floating point vectors
+;; (either altivec or VSX).
+(define_predicate "vfloat_operand"
+ (and (match_operand 0 "register_operand")
+ (match_test "GET_CODE (op) != REG
+ || VFLOAT_REGNO_P (REGNO (op))
+ || REGNO (op) > LAST_VIRTUAL_REGISTER")))
+
+;; Return 1 if op is a vector register that operates on integer vectors
+;; (only altivec, VSX doesn't support integer vectors)
+(define_predicate "vint_operand"
+ (and (match_operand 0 "register_operand")
+ (match_test "GET_CODE (op) != REG
+ || VINT_REGNO_P (REGNO (op))
+ || REGNO (op) > LAST_VIRTUAL_REGISTER")))
+
+;; Return 1 if op is a vector register to do logical operations on (and, or,
+;; xor, etc.)
+(define_predicate "vlogical_operand"
+ (and (match_operand 0 "register_operand")
+ (match_test "GET_CODE (op) != REG
+ || VLOGICAL_REGNO_P (REGNO (op))
+ || REGNO (op) > LAST_VIRTUAL_REGISTER")))
+
;; Return 1 if op is XER register.
(define_predicate "xer_operand"
(and (match_code "reg")
@@ -234,6 +265,10 @@ (define_predicate "easy_fp_constant"
&& num_insns_constant_wide ((HOST_WIDE_INT) k[3]) == 1);
case DFmode:
+ /* The constant 0.f is easy under VSX. */
+ if (op == CONST0_RTX (DFmode) && VECTOR_UNIT_VSX_P (DFmode))
+ return 1;
+
/* Force constants to memory before reload to utilize
compress_float_constant.
Avoid this when flag_unsafe_math_optimizations is enabled
@@ -396,13 +431,16 @@ (define_predicate "indexed_or_indirect_o
(match_code "mem")
{
op = XEXP (op, 0);
- if (TARGET_ALTIVEC
- && ALTIVEC_VECTOR_MODE (mode)
+ if (VECTOR_MEM_ALTIVEC_P (mode)
&& GET_CODE (op) == AND
&& GET_CODE (XEXP (op, 1)) == CONST_INT
&& INTVAL (XEXP (op, 1)) == -16)
op = XEXP (op, 0);
+ else if (VECTOR_MEM_VSX_P (mode)
+ && GET_CODE (op) == PRE_MODIFY)
+ op = XEXP (op, 1);
+
return indexed_or_indirect_address (op, mode);
})
@@ -1336,3 +1374,19 @@ (define_predicate "stmw_operation"
return 1;
})
+
+;; Return true if the operand is a legitimate parallel for vec_init
+(define_predicate "vec_init_operand"
+ (match_code "parallel")
+{
+ /* Disallow V2DF mode with MEM's unless both are the same under VSX. */
+ if (mode == V2DFmode && VECTOR_UNIT_VSX_P (mode))
+ {
+ rtx op0 = XVECEXP (op, 0, 0);
+ rtx op1 = XVECEXP (op, 0, 1);
+ if ((MEM_P (op0) || MEM_P (op1)) && !rtx_equal_p (op0, op1))
+ return 0;
+ }
+
+ return 1;
+})
--- gcc/config/rs6000/ppc-asm.h (.../trunk) (revision 144557)
+++ gcc/config/rs6000/ppc-asm.h (.../branches/ibm/power7-meissner) (revision 144730)
@@ -63,7 +63,7 @@
#define f16 16
#define f17 17
#define f18 18
-#define f19 19
+#define f19 19
#define f20 20
#define f21 21
#define f22 22
@@ -77,6 +77,143 @@
#define f30 30
#define f31 31
+#ifdef __VSX__
+#define f32 32
+#define f33 33
+#define f34 34
+#define f35 35
+#define f36 36
+#define f37 37
+#define f38 38
+#define f39 39
+#define f40 40
+#define f41 41
+#define f42 42
+#define f43 43
+#define f44 44
+#define f45 45
+#define f46 46
+#define f47 47
+#define f48 48
+#define f49 49
+#define f50 30
+#define f51 51
+#define f52 52
+#define f53 53
+#define f54 54
+#define f55 55
+#define f56 56
+#define f57 57
+#define f58 58
+#define f59 59
+#define f60 60
+#define f61 61
+#define f62 62
+#define f63 63
+#endif
+
+#ifdef __ALTIVEC__
+#define v0 0
+#define v1 1
+#define v2 2
+#define v3 3
+#define v4 4
+#define v5 5
+#define v6 6
+#define v7 7
+#define v8 8
+#define v9 9
+#define v10 10
+#define v11 11
+#define v12 12
+#define v13 13
+#define v14 14
+#define v15 15
+#define v16 16
+#define v17 17
+#define v18 18
+#define v19 19
+#define v20 20
+#define v21 21
+#define v22 22
+#define v23 23
+#define v24 24
+#define v25 25
+#define v26 26
+#define v27 27
+#define v28 28
+#define v29 29
+#define v30 30
+#define v31 31
+#endif
+
+#ifdef __VSX__
+#define vs0 0
+#define vs1 1
+#define vs2 2
+#define vs3 3
+#define vs4 4
+#define vs5 5
+#define vs6 6
+#define vs7 7
+#define vs8 8
+#define vs9 9
+#define vs10 10
+#define vs11 11
+#define vs12 12
+#define vs13 13
+#define vs14 14
+#define vs15 15
+#define vs16 16
+#define vs17 17
+#define vs18 18
+#define vs19 19
+#define vs20 20
+#define vs21 21
+#define vs22 22
+#define vs23 23
+#define vs24 24
+#define vs25 25
+#define vs26 26
+#define vs27 27
+#define vs28 28
+#define vs29 29
+#define vs30 30
+#define vs31 31
+#define vs32 32
+#define vs33 33
+#define vs34 34
+#define vs35 35
+#define vs36 36
+#define vs37 37
+#define vs38 38
+#define vs39 39
+#define vs40 40
+#define vs41 41
+#define vs42 42
+#define vs43 43
+#define vs44 44
+#define vs45 45
+#define vs46 46
+#define vs47 47
+#define vs48 48
+#define vs49 49
+#define vs50 30
+#define vs51 51
+#define vs52 52
+#define vs53 53
+#define vs54 54
+#define vs55 55
+#define vs56 56
+#define vs57 57
+#define vs58 58
+#define vs59 59
+#define vs60 60
+#define vs61 61
+#define vs62 62
+#define vs63 63
+#endif
+
/*
* Macros to glue together two tokens.
*/
--- gcc/config/rs6000/linux64.opt (.../trunk) (revision 144557)
+++ gcc/config/rs6000/linux64.opt (.../branches/ibm/power7-meissner) (revision 144730)
@@ -20,5 +20,5 @@
; <http://www.gnu.org/licenses/>.
mprofile-kernel
-Target Report Mask(PROFILE_KERNEL)
+Target Report Var(TARGET_PROFILE_KERNEL)
Call mcount for profiling before a function prologue
--- gcc/config/rs6000/sysv4.opt (.../trunk) (revision 144557)
+++ gcc/config/rs6000/sysv4.opt (.../branches/ibm/power7-meissner) (revision 144730)
@@ -32,7 +32,7 @@ Target RejectNegative Joined
Specify bit size of immediate TLS offsets
mbit-align
-Target Report Mask(NO_BITFIELD_TYPE)
+Target Report Var(TARGET_NO_BITFIELD_TYPE)
Align to the base type of the bit-field
mstrict-align
@@ -87,11 +87,11 @@ Target Report Mask(EABI)
Use EABI
mbit-word
-Target Report Mask(NO_BITFIELD_WORD)
+Target Report Var(TARGET_NO_BITFIELD_WORD)
Allow bit-fields to cross word boundaries
mregnames
-Target Mask(REGNAMES)
+Target Var(TARGET_REGNAMES)
Use alternate register names
;; FIXME: Does nothing.
--- gcc/config/rs6000/rs6000-protos.h (.../trunk) (revision 144557)
+++ gcc/config/rs6000/rs6000-protos.h (.../branches/ibm/power7-meissner) (revision 144730)
@@ -64,9 +64,14 @@ extern int insvdi_rshift_rlwimi_p (rtx,
extern int registers_ok_for_quad_peep (rtx, rtx);
extern int mems_ok_for_quad_peep (rtx, rtx);
extern bool gpr_or_gpr_p (rtx, rtx);
+extern enum reg_class rs6000_preferred_reload_class(rtx, enum reg_class);
extern enum reg_class rs6000_secondary_reload_class (enum reg_class,
enum machine_mode, rtx);
-
+extern bool rs6000_secondary_memory_needed (enum reg_class, enum reg_class,
+ enum machine_mode);
+extern bool rs6000_cannot_change_mode_class (enum machine_mode,
+ enum machine_mode,
+ enum reg_class);
extern int paired_emit_vector_cond_expr (rtx, rtx, rtx,
rtx, rtx, rtx);
extern void paired_expand_vector_move (rtx operands[]);
@@ -170,7 +175,6 @@ extern int rs6000_register_move_cost (en
enum reg_class, enum reg_class);
extern int rs6000_memory_move_cost (enum machine_mode, enum reg_class, int);
extern bool rs6000_tls_referenced_p (rtx);
-extern int rs6000_hard_regno_nregs (int, enum machine_mode);
extern void rs6000_conditional_register_usage (void);
/* Declare functions in rs6000-c.c */
@@ -189,4 +193,6 @@ const char * rs6000_xcoff_strip_dollar (
void rs6000_final_prescan_insn (rtx, rtx *operand, int num_operands);
extern bool rs6000_hard_regno_mode_ok_p[][FIRST_PSEUDO_REGISTER];
+extern unsigned char rs6000_class_max_nregs[][LIM_REG_CLASSES];
+extern unsigned char rs6000_hard_regno_nregs[][FIRST_PSEUDO_REGISTER];
#endif /* rs6000-protos.h */
--- gcc/config/rs6000/t-rs6000 (.../trunk) (revision 144557)
+++ gcc/config/rs6000/t-rs6000 (.../branches/ibm/power7-meissner) (revision 144730)
@@ -16,3 +16,33 @@ rs6000-c.o: $(srcdir)/config/rs6000/rs60
# The rs6000 backend doesn't cause warnings in these files.
insn-conditions.o-warn =
+
+MD_INCLUDES = $(srcdir)/config/rs6000/rios1.md \
+ $(srcdir)/config/rs6000/rios2.md \
+ $(srcdir)/config/rs6000/rs64.md \
+ $(srcdir)/config/rs6000/mpc.md \
+ $(srcdir)/config/rs6000/40x.md \
+ $(srcdir)/config/rs6000/440.md \
+ $(srcdir)/config/rs6000/603.md \
+ $(srcdir)/config/rs6000/6xx.md \
+ $(srcdir)/config/rs6000/7xx.md \
+ $(srcdir)/config/rs6000/7450.md \
+ $(srcdir)/config/rs6000/8540.md \
+ $(srcdir)/config/rs6000/e300c2c3.md \
+ $(srcdir)/config/rs6000/e500mc.md \
+ $(srcdir)/config/rs6000/power4.md \
+ $(srcdir)/config/rs6000/power5.md \
+ $(srcdir)/config/rs6000/power6.md \
+ $(srcdir)/config/rs6000/power7.md \
+ $(srcdir)/config/rs6000/cell.md \
+ $(srcdir)/config/rs6000/xfpu.md \
+ $(srcdir)/config/rs6000/predicates.md \
+ $(srcdir)/config/rs6000/constraints.md \
+ $(srcdir)/config/rs6000/darwin.md \
+ $(srcdir)/config/rs6000/sync.md \
+ $(srcdir)/config/rs6000/vector.md \
+ $(srcdir)/config/rs6000/vsx.md \
+ $(srcdir)/config/rs6000/altivec.md \
+ $(srcdir)/config/rs6000/spe.md \
+ $(srcdir)/config/rs6000/dfp.md \
+ $(srcdir)/config/rs6000/paired.md
--- gcc/config/rs6000/power7.md (.../trunk) (revision 0)
+++ gcc/config/rs6000/power7.md (.../branches/ibm/power7-meissner) (revision 144730)
@@ -0,0 +1,320 @@
+;; Scheduling description for IBM POWER7 processor.
+;; Copyright (C) 2009 Free Software Foundation, Inc.
+;;
+;; Contributed by Pat Haugen (pthaugen@us.ibm.com).
+
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3. If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_automaton "power7iu,power7lsu,power7vsu,power7misc")
+
+(define_cpu_unit "iu1_power7,iu2_power7" "power7iu")
+(define_cpu_unit "lsu1_power7,lsu2_power7" "power7lsu")
+(define_cpu_unit "vsu1_power7,vsu2_power7" "power7vsu")
+(define_cpu_unit "bpu_power7,cru_power7" "power7misc")
+(define_cpu_unit "du1_power7,du2_power7,du3_power7,du4_power7,du5_power7"
+ "power7misc")
+
+
+(define_reservation "DU_power7"
+ "du1_power7|du2_power7|du3_power7|du4_power7")
+
+(define_reservation "DU2F_power7"
+ "du1_power7+du2_power7")
+
+(define_reservation "DU4_power7"
+ "du1_power7+du2_power7+du3_power7+du4_power7")
+
+(define_reservation "FXU_power7"
+ "iu1_power7|iu2_power7")
+
+(define_reservation "VSU_power7"
+ "vsu1_power7|vsu2_power7")
+
+(define_reservation "LSU_power7"
+ "lsu1_power7|lsu2_power7")
+
+
+; Dispatch slots are allocated in order conforming to program order.
+(absence_set "du1_power7" "du2_power7,du3_power7,du4_power7,du5_power7")
+(absence_set "du2_power7" "du3_power7,du4_power7,du5_power7")
+(absence_set "du3_power7" "du4_power7,du5_power7")
+(absence_set "du4_power7" "du5_power7")
+
+
+; LS Unit
+(define_insn_reservation "power7-load" 2
+ (and (eq_attr "type" "load")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,LSU_power7")
+
+(define_insn_reservation "power7-load-ext" 3
+ (and (eq_attr "type" "load_ext")
+ (eq_attr "cpu" "power7"))
+ "DU2F_power7,LSU_power7,FXU_power7")
+
+(define_insn_reservation "power7-load-update" 2
+ (and (eq_attr "type" "load_u")
+ (eq_attr "cpu" "power7"))
+ "DU2F_power7,LSU_power7+FXU_power7")
+
+(define_insn_reservation "power7-load-update-indexed" 3
+ (and (eq_attr "type" "load_ux")
+ (eq_attr "cpu" "power7"))
+ "DU4_power7,FXU_power7,LSU_power7+FXU_power7")
+
+(define_insn_reservation "power7-load-ext-update" 4
+ (and (eq_attr "type" "load_ext_u")
+ (eq_attr "cpu" "power7"))
+ "DU2F_power7,LSU_power7+FXU_power7,FXU_power7")
+
+(define_insn_reservation "power7-load-ext-update-indexed" 4
+ (and (eq_attr "type" "load_ext_ux")
+ (eq_attr "cpu" "power7"))
+ "DU4_power7,FXU_power7,LSU_power7+FXU_power7,FXU_power7")
+
+(define_insn_reservation "power7-fpload" 3
+ (and (eq_attr "type" "fpload")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,LSU_power7")
+
+(define_insn_reservation "power7-fpload-update" 3
+ (and (eq_attr "type" "fpload_u,fpload_ux")
+ (eq_attr "cpu" "power7"))
+ "DU2F_power7,LSU_power7+FXU_power7")
+
+(define_insn_reservation "power7-store" 6 ; store-forwarding latency
+ (and (eq_attr "type" "store")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,LSU_power7+FXU_power7")
+
+(define_insn_reservation "power7-store-update" 6
+ (and (eq_attr "type" "store_u")
+ (eq_attr "cpu" "power7"))
+ "DU2F_power7,LSU_power7+FXU_power7,FXU_power7")
+
+(define_insn_reservation "power7-store-update-indexed" 6
+ (and (eq_attr "type" "store_ux")
+ (eq_attr "cpu" "power7"))
+ "DU4_power7,LSU_power7+FXU_power7,FXU_power7")
+
+(define_insn_reservation "power7-fpstore" 6
+ (and (eq_attr "type" "fpstore")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,LSU_power7+VSU_power7")
+
+(define_insn_reservation "power7-fpstore-update" 6
+ (and (eq_attr "type" "fpstore_u,fpstore_ux")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,LSU_power7+VSU_power7+FXU_power7")
+
+(define_insn_reservation "power7-larx" 3
+ (and (eq_attr "type" "load_l")
+ (eq_attr "cpu" "power7"))
+ "DU4_power7,LSU_power7")
+
+(define_insn_reservation "power7-stcx" 10
+ (and (eq_attr "type" "store_c")
+ (eq_attr "cpu" "power7"))
+ "DU4_power7,LSU_power7")
+
+(define_insn_reservation "power7-vecload" 3
+ (and (eq_attr "type" "vecload")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,LSU_power7")
+
+(define_insn_reservation "power7-vecstore" 6
+ (and (eq_attr "type" "vecstore")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,LSU_power7+VSU_power7")
+
+(define_insn_reservation "power7-sync" 11
+ (and (eq_attr "type" "sync")
+ (eq_attr "cpu" "power7"))
+ "DU4_power7,LSU_power7")
+
+
+; FX Unit
+(define_insn_reservation "power7-integer" 1
+ (and (eq_attr "type" "integer,insert_word,insert_dword,shift,trap,\
+ var_shift_rotate,exts")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,FXU_power7")
+
+(define_insn_reservation "power7-cntlz" 2
+ (and (eq_attr "type" "cntlz")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,FXU_power7")
+
+(define_insn_reservation "power7-two" 2
+ (and (eq_attr "type" "two")
+ (eq_attr "cpu" "power7"))
+ "DU_power7+DU_power7,FXU_power7,FXU_power7")
+
+(define_insn_reservation "power7-three" 3
+ (and (eq_attr "type" "three")
+ (eq_attr "cpu" "power7"))
+ "DU_power7+DU_power7+DU_power7,FXU_power7,FXU_power7,FXU_power7")
+
+(define_insn_reservation "power7-cmp" 1
+ (and (eq_attr "type" "cmp,fast_compare")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,FXU_power7")
+
+(define_insn_reservation "power7-compare" 2
+ (and (eq_attr "type" "compare,delayed_compare,var_delayed_compare")
+ (eq_attr "cpu" "power7"))
+ "DU2F_power7,FXU_power7,FXU_power7")
+
+(define_bypass 3 "power7-cmp,power7-compare" "power7-crlogical,power7-delayedcr")
+
+(define_insn_reservation "power7-mul" 4
+ (and (eq_attr "type" "imul,imul2,imul3,lmul")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,FXU_power7")
+
+(define_insn_reservation "power7-mul-compare" 5
+ (and (eq_attr "type" "imul_compare,lmul_compare")
+ (eq_attr "cpu" "power7"))
+ "DU2F_power7,FXU_power7,nothing*3,FXU_power7")
+
+(define_insn_reservation "power7-idiv" 36
+ (and (eq_attr "type" "idiv")
+ (eq_attr "cpu" "power7"))
+ "DU2F_power7,iu1_power7*36|iu2_power7*36")
+
+(define_insn_reservation "power7-ldiv" 68
+ (and (eq_attr "type" "ldiv")
+ (eq_attr "cpu" "power7"))
+ "DU2F_power7,iu1_power7*68|iu2_power7*68")
+
+(define_insn_reservation "power7-isync" 1 ;
+ (and (eq_attr "type" "isync")
+ (eq_attr "cpu" "power7"))
+ "DU4_power7,FXU_power7")
+
+
+; CR Unit
+(define_insn_reservation "power7-mtjmpr" 4
+ (and (eq_attr "type" "mtjmpr")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,FXU_power7")
+
+(define_insn_reservation "power7-mfjmpr" 5
+ (and (eq_attr "type" "mfjmpr")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,cru_power7+FXU_power7")
+
+(define_insn_reservation "power7-crlogical" 3
+ (and (eq_attr "type" "cr_logical")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,cru_power7")
+
+(define_insn_reservation "power7-delayedcr" 3
+ (and (eq_attr "type" "delayed_cr")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,cru_power7")
+
+(define_insn_reservation "power7-mfcr" 6
+ (and (eq_attr "type" "mfcr")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,cru_power7")
+
+(define_insn_reservation "power7-mfcrf" 3
+ (and (eq_attr "type" "mfcrf")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,cru_power7")
+
+(define_insn_reservation "power7-mtcr" 3
+ (and (eq_attr "type" "mtcr")
+ (eq_attr "cpu" "power7"))
+ "DU4_power7,cru_power7+FXU_power7")
+
+
+; BR Unit
+; Branches take dispatch Slot 4. The presence_sets prevent other insn from
+; grabbing previous dispatch slots once this is assigned.
+(define_insn_reservation "power7-branch" 3
+ (and (eq_attr "type" "jmpreg,branch")
+ (eq_attr "cpu" "power7"))
+ "(du5_power7\
+ |du4_power7+du5_power7\
+ |du3_power7+du4_power7+du5_power7\
+ |du2_power7+du3_power7+du4_power7+du5_power7\
+ |du1_power7+du2_power7+du3_power7+du4_power7+du5_power7),bpu_power7")
+
+
+; VS Unit (includes FP/VSX/VMX/DFP)
+(define_insn_reservation "power7-fp" 6
+ (and (eq_attr "type" "fp,dmul")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,VSU_power7")
+
+(define_bypass 8 "power7-fp" "power7-branch")
+
+(define_insn_reservation "power7-fpcompare" 4
+ (and (eq_attr "type" "fpcompare")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,VSU_power7")
+
+(define_insn_reservation "power7-sdiv" 26
+ (and (eq_attr "type" "sdiv")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,VSU_power7")
+
+(define_insn_reservation "power7-ddiv" 32
+ (and (eq_attr "type" "ddiv")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,VSU_power7")
+
+(define_insn_reservation "power7-sqrt" 31
+ (and (eq_attr "type" "ssqrt")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,VSU_power7")
+
+(define_insn_reservation "power7-dsqrt" 43
+ (and (eq_attr "type" "dsqrt")
+ (eq_attr "cpu" "power7"))
+ "DU_power7,VSU_power7")
+
+(define_insn_reservation "power7-vecsimple" 2
+ (and (eq_attr "type" "vecsimple")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,VSU_power7")
+
+(define_insn_reservation "power7-veccmp" 7
+ (and (eq_attr "type" "veccmp")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,VSU_power7")
+
+(define_insn_reservation "power7-vecfloat" 7
+ (and (eq_attr "type" "vecfloat")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,VSU_power7")
+
+(define_bypass 6 "power7-vecfloat" "power7-vecfloat")
+
+(define_insn_reservation "power7-veccomplex" 7
+ (and (eq_attr "type" "veccomplex")
+ (eq_attr "cpu" "power7"))
+ "du1_power7,VSU_power7")
+
+(define_insn_reservation "power7-vecperm" 3
+ (and (eq_attr "type" "vecperm")
+ (eq_attr "cpu" "power7"))
+ "du2_power7,VSU_power7")
+
+
--- gcc/config/rs6000/rs6000-c.c (.../trunk) (revision 144557)
+++ gcc/config/rs6000/rs6000-c.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -265,6 +265,8 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
builtin_define ("_ARCH_PWR6X");
if (! TARGET_POWER && ! TARGET_POWER2 && ! TARGET_POWERPC)
builtin_define ("_ARCH_COM");
+ if (TARGET_POPCNTD)
+ builtin_define ("_ARCH_PWR7");
if (TARGET_ALTIVEC)
{
builtin_define ("__ALTIVEC__");
@@ -306,6 +308,8 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
/* Used by libstdc++. */
if (TARGET_NO_LWSYNC)
builtin_define ("__NO_LWSYNC__");
+ if (TARGET_VSX)
+ builtin_define ("__VSX__");
/* May be overridden by target configuration. */
RS6000_CPU_CPP_ENDIAN_BUILTINS();
--- gcc/config/rs6000/rs6000.opt (.../trunk) (revision 144557)
+++ gcc/config/rs6000/rs6000.opt (.../branches/ibm/power7-meissner) (revision 144730)
@@ -111,24 +111,44 @@ mhard-float
Target Report RejectNegative InverseMask(SOFT_FLOAT, HARD_FLOAT)
Use hardware floating point
-mno-update
-Target Report RejectNegative Mask(NO_UPDATE)
-Do not generate load/store with update instructions
+mpopcntd
+Target Report Mask(POPCNTD)
+Use PowerPC V2.06 popcntd instruction
+
+mvsx
+Target Report Mask(VSX)
+Use vector/scalar (VSX) instructions
+
+mvsx-vector-memory
+Target Report Var(TARGET_VSX_VECTOR_MEMORY) Init(-1)
+If -mvsx, use VSX vector load/store instructions instead of Altivec instructions
+
+mvsx-vector-float
+Target Report Var(TARGET_VSX_VECTOR_FLOAT) Init(-1)
+If -mvsx, use VSX arithmetic instructions for float vectors (on by default)
+
+mvsx-vector-double
+Target Report Var(TARGET_VSX_VECTOR_DOUBLE) Init(-1)
+If -mvsx, use VSX arithmetic instructions for double vectors (on by default)
+
+mvsx-scalar-double
+Target Report Var(TARGET_VSX_SCALAR_DOUBLE) Init(-1)
+If -mvsx, use VSX arithmetic instructions for scalar double (on by default)
+
+mvsx-scalar-memory
+Target Report Var(TARGET_VSX_SCALAR_MEMORY)
+If -mvsx, use VSX scalar memory reference instructions for scalar double (off by default)
mupdate
-Target Report RejectNegative InverseMask(NO_UPDATE, UPDATE)
+Target Report Var(TARGET_UPDATE) Init(1)
Generate load/store with update instructions
mavoid-indexed-addresses
Target Report Var(TARGET_AVOID_XFORM) Init(-1)
Avoid generation of indexed load/store instructions when possible
-mno-fused-madd
-Target Report RejectNegative Mask(NO_FUSED_MADD)
-Do not generate fused multiply/add instructions
-
mfused-madd
-Target Report RejectNegative InverseMask(NO_FUSED_MADD, FUSED_MADD)
+Target Report Var(TARGET_FUSED_MADD) Init(1)
Generate fused multiply/add instructions
msched-prolog
@@ -194,7 +214,7 @@ Target RejectNegative Joined
-mvrsave=yes/no Deprecated option. Use -mvrsave/-mno-vrsave instead
misel
-Target
+Target Report Mask(ISEL)
Generate isel instructions
misel=
--- gcc/config/rs6000/linux64.h (.../trunk) (revision 144557)
+++ gcc/config/rs6000/linux64.h (.../branches/ibm/power7-meissner) (revision 144730)
@@ -114,7 +114,7 @@ extern int dot_symbols;
error (INVALID_32BIT, "32"); \
if (TARGET_PROFILE_KERNEL) \
{ \
- target_flags &= ~MASK_PROFILE_KERNEL; \
+ SET_PROFILE_KERNEL (0); \
error (INVALID_32BIT, "profile-kernel"); \
} \
} \
--- gcc/config/rs6000/rs6000.c (.../trunk) (revision 144557)
+++ gcc/config/rs6000/rs6000.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -178,9 +178,6 @@ int rs6000_spe;
/* Nonzero if we want SPE ABI extensions. */
int rs6000_spe_abi;
-/* Nonzero to use isel instructions. */
-int rs6000_isel;
-
/* Nonzero if floating point operations are done in the GPRs. */
int rs6000_float_gprs = 0;
@@ -227,10 +224,18 @@ int dot_symbols;
const char *rs6000_debug_name;
int rs6000_debug_stack; /* debug stack applications */
int rs6000_debug_arg; /* debug argument handling */
+int rs6000_debug_reg; /* debug register classes */
+int rs6000_debug_addr; /* debug memory addressing */
/* Value is TRUE if register/mode pair is acceptable. */
bool rs6000_hard_regno_mode_ok_p[NUM_MACHINE_MODES][FIRST_PSEUDO_REGISTER];
+/* Maximum number of registers needed for a given register class and mode. */
+unsigned char rs6000_class_max_nregs[NUM_MACHINE_MODES][LIM_REG_CLASSES];
+
+/* How many registers are needed for a given register and mode. */
+unsigned char rs6000_hard_regno_nregs[NUM_MACHINE_MODES][FIRST_PSEUDO_REGISTER];
+
/* Built in types. */
tree rs6000_builtin_types[RS6000_BTI_MAX];
@@ -270,7 +275,6 @@ struct {
bool altivec_abi; /* True if -mabi=altivec/no-altivec used. */
bool spe; /* True if -mspe= was used. */
bool float_gprs; /* True if -mfloat-gprs= was used. */
- bool isel; /* True if -misel was used. */
bool long_double; /* True if -mlong-double- was used. */
bool ieee; /* True if -mabi=ieee/ibmlongdouble used. */
bool vrsave; /* True if -mvrsave was used. */
@@ -286,6 +290,18 @@ struct builtin_description
const char *const name;
const enum rs6000_builtins code;
};
+
+/* Describe the vector unit used for modes. */
+enum rs6000_vector rs6000_vector_unit[NUM_MACHINE_MODES];
+enum rs6000_vector rs6000_vector_mem[NUM_MACHINE_MODES];
+enum reg_class rs6000_vector_reg_class[NUM_MACHINE_MODES];
+
+/* Describe the alignment of a vector. */
+int rs6000_vector_align[NUM_MACHINE_MODES];
+
+/* Describe the register classes used by VSX instructions. */
+enum reg_class rs6000_vsx_reg_class = NO_REGS;
+
/* Target cpu costs. */
@@ -749,6 +765,25 @@ struct processor_costs power6_cost = {
16, /* prefetch streams */
};
+/* Instruction costs on POWER7 processors. */
+static const
+struct processor_costs power7_cost = {
+ COSTS_N_INSNS (2), /* mulsi */
+ COSTS_N_INSNS (2), /* mulsi_const */
+ COSTS_N_INSNS (2), /* mulsi_const9 */
+ COSTS_N_INSNS (2), /* muldi */
+ COSTS_N_INSNS (18), /* divsi */
+ COSTS_N_INSNS (34), /* divdi */
+ COSTS_N_INSNS (3), /* fp */
+ COSTS_N_INSNS (3), /* dmul */
+ COSTS_N_INSNS (13), /* sdiv */
+ COSTS_N_INSNS (16), /* ddiv */
+ 128, /* cache line size */
+ 32, /* l1 cache */
+ 256, /* l2 cache */
+ 12, /* prefetch streams */
+};
+
static bool rs6000_function_ok_for_sibcall (tree, tree);
static const char *rs6000_invalid_within_doloop (const_rtx);
@@ -963,12 +998,10 @@ static tree rs6000_gimplify_va_arg (tree
static bool rs6000_must_pass_in_stack (enum machine_mode, const_tree);
static bool rs6000_scalar_mode_supported_p (enum machine_mode);
static bool rs6000_vector_mode_supported_p (enum machine_mode);
-static int get_vec_cmp_insn (enum rtx_code, enum machine_mode,
- enum machine_mode);
+static rtx rs6000_emit_vector_compare_vsx (enum rtx_code, rtx, rtx, rtx);
+static rtx rs6000_emit_vector_compare_altivec (enum rtx_code, rtx, rtx, rtx);
static rtx rs6000_emit_vector_compare (enum rtx_code, rtx, rtx,
enum machine_mode);
-static int get_vsel_insn (enum machine_mode);
-static void rs6000_emit_vector_select (rtx, rtx, rtx, rtx);
static tree rs6000_stack_protect_fail (void);
const int INSN_NOT_AVAILABLE = -1;
@@ -1045,6 +1078,9 @@ static const char alt_reg_names[][8] =
#endif
#ifndef TARGET_PROFILE_KERNEL
#define TARGET_PROFILE_KERNEL 0
+#define SET_PROFILE_KERNEL(N)
+#else
+#define SET_PROFILE_KERNEL(N) TARGET_PROFILE_KERNEL = (N)
#endif
/* The VRSAVE bitmask puts bit %v0 as the most significant bit. */
@@ -1299,28 +1335,96 @@ static const char alt_reg_names[][8] =
struct gcc_target targetm = TARGET_INITIALIZER;
+/* Return number of consecutive hard regs needed starting at reg REGNO
+ to hold something of mode MODE.
+ This is ordinarily the length in words of a value of mode MODE
+ but can be less for certain modes in special long registers.
+
+ For the SPE, GPRs are 64 bits but only 32 bits are visible in
+ scalar instructions. The upper 32 bits are only available to the
+ SIMD instructions.
+
+ POWER and PowerPC GPRs hold 32 bits worth;
+ PowerPC64 GPRs and FPRs point register holds 64 bits worth. */
+
+static int
+rs6000_hard_regno_nregs_internal (int regno, enum machine_mode mode)
+{
+ unsigned HOST_WIDE_INT reg_size;
+
+ if (FP_REGNO_P (regno))
+ reg_size = (VECTOR_UNIT_VSX_P (mode)
+ ? UNITS_PER_VSX_WORD
+ : UNITS_PER_FP_WORD);
+
+ else if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode))
+ reg_size = UNITS_PER_SPE_WORD;
+
+ else if (ALTIVEC_REGNO_P (regno))
+ reg_size = UNITS_PER_ALTIVEC_WORD;
+
+ /* The value returned for SCmode in the E500 double case is 2 for
+ ABI compatibility; storing an SCmode value in a single register
+ would require function_arg and rs6000_spe_function_arg to handle
+ SCmode so as to pass the value correctly in a pair of
+ registers. */
+ else if (TARGET_E500_DOUBLE && FLOAT_MODE_P (mode) && mode != SCmode
+ && !DECIMAL_FLOAT_MODE_P (mode))
+ reg_size = UNITS_PER_FP_WORD;
+
+ else
+ reg_size = UNITS_PER_WORD;
+
+ return (GET_MODE_SIZE (mode) + reg_size - 1) / reg_size;
+}
/* Value is 1 if hard register REGNO can hold a value of machine-mode
MODE. */
static int
rs6000_hard_regno_mode_ok (int regno, enum machine_mode mode)
{
+ int last_regno = regno + rs6000_hard_regno_nregs[mode][regno] - 1;
+
+ /* VSX registers that overlap the FPR registers are larger than for non-VSX
+ implementations. Don't allow an item to be split between a FP register
+ and an Altivec register. */
+ if (VECTOR_UNIT_VSX_P (mode) || VECTOR_MEM_VSX_P (mode))
+ {
+ enum reg_class rclass = rs6000_vector_reg_class[mode];
+ if (FP_REGNO_P (regno))
+ return ((rclass == FLOAT_REGS || rclass == VSX_REGS)
+ && FP_REGNO_P (last_regno));
+
+ if (ALTIVEC_REGNO_P (regno))
+ return ((rclass == ALTIVEC_REGS || rclass == VSX_REGS)
+ && ALTIVEC_REGNO_P (last_regno));
+ }
+
/* The GPRs can hold any mode, but values bigger than one register
cannot go past R31. */
if (INT_REGNO_P (regno))
- return INT_REGNO_P (regno + HARD_REGNO_NREGS (regno, mode) - 1);
+ return INT_REGNO_P (last_regno);
- /* The float registers can only hold floating modes and DImode.
- This excludes the 32-bit decimal float mode for now. */
+ /* The float registers (except for VSX vector modes) can only hold floating
+ modes and DImode. This excludes the 32-bit decimal float mode for
+ now. */
if (FP_REGNO_P (regno))
- return
- ((SCALAR_FLOAT_MODE_P (mode)
- && (mode != TDmode || (regno % 2) == 0)
- && FP_REGNO_P (regno + HARD_REGNO_NREGS (regno, mode) - 1))
- || (GET_MODE_CLASS (mode) == MODE_INT
+ {
+ if (SCALAR_FLOAT_MODE_P (mode)
+ && (mode != TDmode || (regno % 2) == 0)
+ && FP_REGNO_P (last_regno))
+ return 1;
+
+ if (GET_MODE_CLASS (mode) == MODE_INT
&& GET_MODE_SIZE (mode) == UNITS_PER_FP_WORD)
- || (PAIRED_SIMD_REGNO_P (regno) && TARGET_PAIRED_FLOAT
- && PAIRED_VECTOR_MODE (mode)));
+ return 1;
+
+ if (PAIRED_SIMD_REGNO_P (regno) && TARGET_PAIRED_FLOAT
+ && PAIRED_VECTOR_MODE (mode))
+ return 1;
+
+ return 0;
+ }
/* The CR register can only hold CC modes. */
if (CR_REGNO_P (regno))
@@ -1331,28 +1435,312 @@ rs6000_hard_regno_mode_ok (int regno, en
/* AltiVec only in AldyVec registers. */
if (ALTIVEC_REGNO_P (regno))
- return ALTIVEC_VECTOR_MODE (mode);
+ return VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode);
/* ...but GPRs can hold SIMD data on the SPE in one register. */
if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode))
return 1;
- /* We cannot put TImode anywhere except general register and it must be
- able to fit within the register set. */
+ /* Don't allow anything but word sized integers (aka pointers) in CTR/LR. You
+ really don't want to spill your floating point values to those
+ registers. Also do it for the old MQ register in the power. */
+ if (regno == CTR_REGNO || regno == LR_REGNO || regno == MQ_REGNO)
+ return (GET_MODE_CLASS (mode) == MODE_INT
+ && GET_MODE_SIZE (mode) <= UNITS_PER_WORD);
+
+ /* The VRSAVE/VSCR registers are 32-bits (they are fixed, but add this for
+ completeness). */
+ if (regno == VRSAVE_REGNO || regno == VSCR_REGNO)
+ return (mode == SImode);
+
+ /* We cannot put TImode anywhere except general register and it must be able
+ to fit within the register set. In the future, allow TImode in the
+ Altivec or VSX registers. */
return GET_MODE_SIZE (mode) <= UNITS_PER_WORD;
}
-/* Initialize rs6000_hard_regno_mode_ok_p table. */
+/* Print interesting facts about registers. */
static void
-rs6000_init_hard_regno_mode_ok (void)
+rs6000_debug_reg_print (int first_regno, int last_regno, const char *reg_name)
{
int r, m;
+ for (r = first_regno; r <= last_regno; ++r)
+ {
+ const char *comma = "";
+ int len;
+
+ if (first_regno == last_regno)
+ fprintf (stderr, "%s:\t", reg_name);
+ else
+ fprintf (stderr, "%s%d:\t", reg_name, r - first_regno);
+
+ len = 8;
+ for (m = 0; m < NUM_MACHINE_MODES; ++m)
+ if (rs6000_hard_regno_mode_ok_p[m][r] && rs6000_hard_regno_nregs[m][r])
+ {
+ if (len > 70)
+ {
+ fprintf (stderr, ",\n\t");
+ len = 8;
+ comma = "";
+ }
+
+ if (rs6000_hard_regno_nregs[m][r] > 1)
+ len += fprintf (stderr, "%s%s/%d", comma, GET_MODE_NAME (m),
+ rs6000_hard_regno_nregs[m][r]);
+ else
+ len += fprintf (stderr, "%s%s", comma, GET_MODE_NAME (m));
+
+ comma = ", ";
+ }
+
+ if (call_used_regs[r])
+ {
+ if (len > 70)
+ {
+ fprintf (stderr, ",\n\t");
+ len = 8;
+ comma = "";
+ }
+
+ len += fprintf (stderr, "%s%s", comma, "call-used");
+ comma = ", ";
+ }
+
+ if (fixed_regs[r])
+ {
+ if (len > 70)
+ {
+ fprintf (stderr, ",\n\t");
+ len = 8;
+ comma = "";
+ }
+
+ len += fprintf (stderr, "%s%s", comma, "fixed");
+ comma = ", ";
+ }
+
+ if (len > 70)
+ {
+ fprintf (stderr, ",\n\t");
+ comma = "";
+ }
+
+ fprintf (stderr, "%sregno = %d\n", comma, r);
+ }
+}
+
+/* Map enum rs6000_vector to string. */
+static const char *
+rs6000_debug_vector_unit[] = {
+ "none",
+ "altivec",
+ "vsx",
+ "paired",
+ "spe",
+ "other"
+};
+
+/* Initialize the various global tables that are based on register size. */
+static void
+rs6000_init_hard_regno_mode_ok (void)
+{
+ int r, m, c;
+ enum reg_class vsx_rc = (TARGET_ALTIVEC ? VSX_REGS : FLOAT_REGS);
+ bool float_p = (TARGET_HARD_FLOAT && TARGET_FPRS);
+
+ /* Precalculate vector information, this must be set up before the
+ rs6000_hard_regno_nregs_internal below. */
+ for (m = 0; m < NUM_MACHINE_MODES; ++m)
+ {
+ rs6000_vector_unit[m] = rs6000_vector_mem[m] = VECTOR_NONE;
+ rs6000_vector_reg_class[m] = NO_REGS;
+ }
+
+ /* TODO, add TI/V2DI mode for moving data if Altivec or VSX. */
+
+ /* V2DF mode, VSX only. */
+ if (float_p && TARGET_VSX && TARGET_VSX_VECTOR_DOUBLE)
+ {
+ rs6000_vector_unit[V2DFmode] = VECTOR_VSX;
+ rs6000_vector_mem[V2DFmode] = VECTOR_VSX;
+ rs6000_vector_align[V2DFmode] = 64;
+ }
+
+ /* V4SF mode, either VSX or Altivec. */
+ if (float_p && TARGET_VSX && TARGET_VSX_VECTOR_FLOAT)
+ {
+ rs6000_vector_unit[V4SFmode] = VECTOR_VSX;
+ if (TARGET_VSX_VECTOR_MEMORY || !TARGET_ALTIVEC)
+ {
+ rs6000_vector_align[V4SFmode] = 32;
+ rs6000_vector_mem[V4SFmode] = VECTOR_VSX;
+ } else {
+ rs6000_vector_align[V4SFmode] = 128;
+ rs6000_vector_mem[V4SFmode] = VECTOR_ALTIVEC;
+ }
+ }
+ else if (float_p && TARGET_ALTIVEC)
+ {
+ rs6000_vector_unit[V4SFmode] = VECTOR_ALTIVEC;
+ rs6000_vector_mem[V4SFmode] = VECTOR_ALTIVEC;
+ rs6000_vector_align[V4SFmode] = 128;
+ }
+
+ /* V16QImode, V8HImode, V4SImode are Altivec only, but possibly do VSX loads
+ and stores. */
+ if (TARGET_ALTIVEC)
+ {
+ rs6000_vector_unit[V4SImode] = VECTOR_ALTIVEC;
+ rs6000_vector_unit[V8HImode] = VECTOR_ALTIVEC;
+ rs6000_vector_unit[V16QImode] = VECTOR_ALTIVEC;
+
+ rs6000_vector_reg_class[V16QImode] = ALTIVEC_REGS;
+ rs6000_vector_reg_class[V8HImode] = ALTIVEC_REGS;
+ rs6000_vector_reg_class[V4SImode] = ALTIVEC_REGS;
+
+ if (TARGET_VSX && TARGET_VSX_VECTOR_MEMORY)
+ {
+ rs6000_vector_mem[V4SImode] = VECTOR_VSX;
+ rs6000_vector_mem[V8HImode] = VECTOR_VSX;
+ rs6000_vector_mem[V16QImode] = VECTOR_VSX;
+ rs6000_vector_align[V4SImode] = 32;
+ rs6000_vector_align[V8HImode] = 32;
+ rs6000_vector_align[V16QImode] = 32;
+ }
+ else
+ {
+ rs6000_vector_mem[V4SImode] = VECTOR_ALTIVEC;
+ rs6000_vector_mem[V8HImode] = VECTOR_ALTIVEC;
+ rs6000_vector_mem[V16QImode] = VECTOR_ALTIVEC;
+ rs6000_vector_align[V4SImode] = 128;
+ rs6000_vector_align[V8HImode] = 128;
+ rs6000_vector_align[V16QImode] = 128;
+ }
+ }
+
+ /* DFmode, see if we want to use the VSX unit. */
+ if (float_p && TARGET_VSX && TARGET_VSX_SCALAR_DOUBLE)
+ {
+ rs6000_vector_unit[DFmode] = VECTOR_VSX;
+ rs6000_vector_align[DFmode] = 64;
+ rs6000_vector_mem[DFmode]
+ = (TARGET_VSX_SCALAR_MEMORY ? VECTOR_VSX : VECTOR_NONE);
+ }
+
+ /* TODO, add SPE and paired floating point vector support. */
+
+ /* Set the VSX register classes. */
+ rs6000_vector_reg_class[V4SFmode]
+ = ((VECTOR_UNIT_VSX_P (V4SFmode) && VECTOR_MEM_VSX_P (V4SFmode))
+ ? vsx_rc
+ : (VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)
+ ? ALTIVEC_REGS
+ : NO_REGS));
+
+ rs6000_vector_reg_class[V2DFmode]
+ = (VECTOR_UNIT_VSX_P (V2DFmode) ? vsx_rc : NO_REGS);
+
+ rs6000_vector_reg_class[DFmode]
+ = ((!float_p || !VECTOR_UNIT_VSX_P (DFmode))
+ ? NO_REGS
+ : ((TARGET_VSX_SCALAR_MEMORY)
+ ? vsx_rc
+ : FLOAT_REGS));
+
+ rs6000_vsx_reg_class = (float_p && TARGET_VSX) ? vsx_rc : NO_REGS;
+
+ /* Precalculate HARD_REGNO_NREGS. */
+ for (r = 0; r < FIRST_PSEUDO_REGISTER; ++r)
+ for (m = 0; m < NUM_MACHINE_MODES; ++m)
+ rs6000_hard_regno_nregs[m][r] = rs6000_hard_regno_nregs_internal (r, m);
+
+ /* Precalculate HARD_REGNO_MODE_OK. */
for (r = 0; r < FIRST_PSEUDO_REGISTER; ++r)
for (m = 0; m < NUM_MACHINE_MODES; ++m)
if (rs6000_hard_regno_mode_ok (r, m))
rs6000_hard_regno_mode_ok_p[m][r] = true;
+
+ /* Precalculate CLASSS_MAX_NREGS sizes. */
+ for (c = 0; c < LIM_REG_CLASSES; ++c)
+ {
+ int reg_size;
+
+ if (TARGET_VSX && VSX_REG_CLASS_P (c))
+ reg_size = UNITS_PER_VSX_WORD;
+
+ else if (c == ALTIVEC_REGS)
+ reg_size = UNITS_PER_ALTIVEC_WORD;
+
+ else if (c == FLOAT_REGS)
+ reg_size = UNITS_PER_FP_WORD;
+
+ else
+ reg_size = UNITS_PER_WORD;
+
+ for (m = 0; m < NUM_MACHINE_MODES; ++m)
+ rs6000_class_max_nregs[m][c]
+ = (GET_MODE_SIZE (m) + reg_size - 1) / reg_size;
+ }
+
+ if (TARGET_E500_DOUBLE)
+ rs6000_class_max_nregs[DFmode][GENERAL_REGS] = 1;
+
+ if (TARGET_DEBUG_REG)
+ {
+ const char *nl = (const char *)0;
+
+ fprintf (stderr, "Register information: (last virtual reg = %d)\n",
+ LAST_VIRTUAL_REGISTER);
+ rs6000_debug_reg_print (0, 31, "gr");
+ rs6000_debug_reg_print (32, 63, "fp");
+ rs6000_debug_reg_print (FIRST_ALTIVEC_REGNO,
+ LAST_ALTIVEC_REGNO,
+ "vs");
+ rs6000_debug_reg_print (LR_REGNO, LR_REGNO, "lr");
+ rs6000_debug_reg_print (CTR_REGNO, CTR_REGNO, "ctr");
+ rs6000_debug_reg_print (CR0_REGNO, CR7_REGNO, "cr");
+ rs6000_debug_reg_print (MQ_REGNO, MQ_REGNO, "mq");
+ rs6000_debug_reg_print (XER_REGNO, XER_REGNO, "xer");
+ rs6000_debug_reg_print (VRSAVE_REGNO, VRSAVE_REGNO, "vrsave");
+ rs6000_debug_reg_print (VSCR_REGNO, VSCR_REGNO, "vscr");
+ rs6000_debug_reg_print (SPE_ACC_REGNO, SPE_ACC_REGNO, "spe_a");
+ rs6000_debug_reg_print (SPEFSCR_REGNO, SPEFSCR_REGNO, "spe_f");
+
+ fprintf (stderr,
+ "\n"
+ "V16QI reg_class = %s\n"
+ "V8HI reg_class = %s\n"
+ "V4SI reg_class = %s\n"
+ "V2DI reg_class = %s\n"
+ "V4SF reg_class = %s\n"
+ "V2DF reg_class = %s\n"
+ "DF reg_class = %s\n"
+ "vsx reg_class = %s\n\n",
+ reg_class_names[rs6000_vector_reg_class[V16QImode]],
+ reg_class_names[rs6000_vector_reg_class[V8HImode]],
+ reg_class_names[rs6000_vector_reg_class[V4SImode]],
+ reg_class_names[rs6000_vector_reg_class[V2DImode]],
+ reg_class_names[rs6000_vector_reg_class[V4SFmode]],
+ reg_class_names[rs6000_vector_reg_class[V2DFmode]],
+ reg_class_names[rs6000_vector_reg_class[DFmode]],
+ reg_class_names[rs6000_vsx_reg_class]);
+
+ for (m = 0; m < NUM_MACHINE_MODES; ++m)
+ if (rs6000_vector_unit[m] || rs6000_vector_mem[m])
+ {
+ nl = "\n";
+ fprintf (stderr, "Vector mode: %-5s arithmetic: %-8s move: %-8s\n",
+ GET_MODE_NAME (m),
+ rs6000_debug_vector_unit[ rs6000_vector_unit[m] ],
+ rs6000_debug_vector_unit[ rs6000_vector_mem[m] ]);
+ }
+
+ if (nl)
+ fputs (nl, stderr);
+ }
}
#if TARGET_MACHO
@@ -1482,12 +1870,15 @@ rs6000_override_options (const char *def
{"801", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT},
{"821", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT},
{"823", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT},
- {"8540", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN},
+ {"8540", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN
+ | MASK_ISEL},
/* 8548 has a dummy entry for now. */
- {"8548", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN},
+ {"8548", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN
+ | MASK_ISEL},
{"e300c2", PROCESSOR_PPCE300C2, POWERPC_BASE_MASK | MASK_SOFT_FLOAT},
{"e300c3", PROCESSOR_PPCE300C3, POWERPC_BASE_MASK},
- {"e500mc", PROCESSOR_PPCE500MC, POWERPC_BASE_MASK | MASK_PPC_GFXOPT},
+ {"e500mc", PROCESSOR_PPCE500MC, POWERPC_BASE_MASK | MASK_PPC_GFXOPT
+ | MASK_ISEL},
{"860", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT},
{"970", PROCESSOR_POWER4,
POWERPC_7400_MASK | MASK_PPC_GPOPT | MASK_MFCRF | MASK_POWERPC64},
@@ -1520,9 +1911,10 @@ rs6000_override_options (const char *def
POWERPC_BASE_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_PPC_GFXOPT
| MASK_MFCRF | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP
| MASK_MFPGPR},
- {"power7", PROCESSOR_POWER5,
+ {"power7", PROCESSOR_POWER7,
POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
- | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP},
+ | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
+ | MASK_VSX}, /* Don't add MASK_ISEL by default */
{"powerpc", PROCESSOR_POWERPC, POWERPC_BASE_MASK},
{"powerpc64", PROCESSOR_POWERPC64,
POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_POWERPC64},
@@ -1549,7 +1941,8 @@ rs6000_override_options (const char *def
POWERPC_MASKS = (POWERPC_BASE_MASK | MASK_PPC_GPOPT | MASK_STRICT_ALIGN
| MASK_PPC_GFXOPT | MASK_POWERPC64 | MASK_ALTIVEC
| MASK_MFCRF | MASK_POPCNTB | MASK_FPRND | MASK_MULHW
- | MASK_DLMZB | MASK_CMPB | MASK_MFPGPR | MASK_DFP)
+ | MASK_DLMZB | MASK_CMPB | MASK_MFPGPR | MASK_DFP
+ | MASK_POPCNTD | MASK_VSX | MASK_ISEL)
};
set_masks = POWER_MASKS | POWERPC_MASKS | MASK_SOFT_FLOAT;
@@ -1594,10 +1987,6 @@ rs6000_override_options (const char *def
}
}
- if ((TARGET_E500 || rs6000_cpu == PROCESSOR_PPCE500MC)
- && !rs6000_explicit_options.isel)
- rs6000_isel = 1;
-
if (rs6000_cpu == PROCESSOR_PPCE300C2 || rs6000_cpu == PROCESSOR_PPCE300C3
|| rs6000_cpu == PROCESSOR_PPCE500MC)
{
@@ -1642,15 +2031,47 @@ rs6000_override_options (const char *def
}
}
+ /* Add some warnings for VSX. Enable -maltivec unless the user explicitly
+ used -mno-altivec */
+ if (TARGET_VSX)
+ {
+ const char *msg = NULL;
+ if (!TARGET_HARD_FLOAT || !TARGET_FPRS
+ || !TARGET_SINGLE_FLOAT || !TARGET_DOUBLE_FLOAT)
+ msg = "-mvsx requires hardware floating point";
+ else if (TARGET_PAIRED_FLOAT)
+ msg = "-mvsx and -mpaired are incompatible";
+ /* The hardware will allow VSX and little endian, but until we make sure
+ things like vector select, etc. work don't allow VSX on little endian
+ systems at this point. */
+ else if (!BYTES_BIG_ENDIAN)
+ msg = "-mvsx used with little endian code";
+ else if (TARGET_AVOID_XFORM > 0)
+ msg = "-mvsx needs indexed addressing";
+
+ if (msg)
+ {
+ warning (0, msg);
+ target_flags &= MASK_VSX;
+ }
+ else if (!TARGET_ALTIVEC && (target_flags_explicit & MASK_ALTIVEC) == 0)
+ target_flags |= MASK_ALTIVEC;
+ }
+
/* Set debug flags */
if (rs6000_debug_name)
{
if (! strcmp (rs6000_debug_name, "all"))
- rs6000_debug_stack = rs6000_debug_arg = 1;
+ rs6000_debug_stack = rs6000_debug_arg = rs6000_debug_reg
+ = rs6000_debug_addr = 1;
else if (! strcmp (rs6000_debug_name, "stack"))
rs6000_debug_stack = 1;
else if (! strcmp (rs6000_debug_name, "arg"))
rs6000_debug_arg = 1;
+ else if (! strcmp (rs6000_debug_name, "reg"))
+ rs6000_debug_reg = 1;
+ else if (! strcmp (rs6000_debug_name, "addr"))
+ rs6000_debug_addr = 1;
else
error ("unknown -mdebug-%s switch", rs6000_debug_name);
}
@@ -1741,8 +2162,8 @@ rs6000_override_options (const char *def
rs6000_spe = 0;
if (!rs6000_explicit_options.float_gprs)
rs6000_float_gprs = 0;
- if (!rs6000_explicit_options.isel)
- rs6000_isel = 0;
+ if (!(target_flags_explicit & MASK_ISEL))
+ target_flags &= ~MASK_ISEL;
}
/* Detect invalid option combinations with E500. */
@@ -1751,12 +2172,14 @@ rs6000_override_options (const char *def
rs6000_always_hint = (rs6000_cpu != PROCESSOR_POWER4
&& rs6000_cpu != PROCESSOR_POWER5
&& rs6000_cpu != PROCESSOR_POWER6
+ && rs6000_cpu != PROCESSOR_POWER7
&& rs6000_cpu != PROCESSOR_CELL);
rs6000_sched_groups = (rs6000_cpu == PROCESSOR_POWER4
|| rs6000_cpu == PROCESSOR_POWER5);
rs6000_align_branch_targets = (rs6000_cpu == PROCESSOR_POWER4
|| rs6000_cpu == PROCESSOR_POWER5
- || rs6000_cpu == PROCESSOR_POWER6);
+ || rs6000_cpu == PROCESSOR_POWER6
+ || rs6000_cpu == PROCESSOR_POWER7);
rs6000_sched_restricted_insns_priority
= (rs6000_sched_groups ? 1 : 0);
@@ -1951,6 +2374,10 @@ rs6000_override_options (const char *def
rs6000_cost = &power6_cost;
break;
+ case PROCESSOR_POWER7:
+ rs6000_cost = &power7_cost;
+ break;
+
default:
gcc_unreachable ();
}
@@ -2001,7 +2428,7 @@ rs6000_override_options (const char *def
static tree
rs6000_builtin_mask_for_load (void)
{
- if (TARGET_ALTIVEC)
+ if (TARGET_ALTIVEC && !TARGET_VSX)
return altivec_builtin_mask_for_load;
else
return 0;
@@ -2015,18 +2442,19 @@ rs6000_builtin_mask_for_load (void)
static tree
rs6000_builtin_conversion (enum tree_code code, tree type)
{
- if (!TARGET_ALTIVEC)
- return NULL_TREE;
-
switch (code)
{
case FIX_TRUNC_EXPR:
switch (TYPE_MODE (type))
{
case V4SImode:
+ if (VECTOR_UNIT_NONE_P (V4SImode) || VECTOR_UNIT_NONE_P (V4SFmode))
+ return NULL_TREE;
+
return TYPE_UNSIGNED (type)
- ? rs6000_builtin_decls[ALTIVEC_BUILTIN_VCTUXS]
- : rs6000_builtin_decls[ALTIVEC_BUILTIN_VCTSXS];
+ ? rs6000_builtin_decls[VECTOR_BUILTIN_FIXUNS_V4SF_V4SI]
+ : rs6000_builtin_decls[VECTOR_BUILTIN_FIX_V4SF_V4SI];
+
default:
return NULL_TREE;
}
@@ -2035,9 +2463,12 @@ rs6000_builtin_conversion (enum tree_cod
switch (TYPE_MODE (type))
{
case V4SImode:
+ if (VECTOR_UNIT_NONE_P (V4SImode) || VECTOR_UNIT_NONE_P (V4SFmode))
+ return NULL_TREE;
+
return TYPE_UNSIGNED (type)
- ? rs6000_builtin_decls[ALTIVEC_BUILTIN_VCFUX]
- : rs6000_builtin_decls[ALTIVEC_BUILTIN_VCFSX];
+ ? rs6000_builtin_decls[VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF]
+ : rs6000_builtin_decls[VECTOR_BUILTIN_FLOAT_V4SI_V4SF];
default:
return NULL_TREE;
}
@@ -2150,6 +2581,14 @@ rs6000_builtin_vec_perm (tree type, tree
d = rs6000_builtin_decls[ALTIVEC_BUILTIN_VPERM_4SF];
break;
+ case V2DFmode:
+ d = rs6000_builtin_decls[ALTIVEC_BUILTIN_VPERM_2DF];
+ break;
+
+ case V2DImode:
+ d = rs6000_builtin_decls[ALTIVEC_BUILTIN_VPERM_2DI];
+ break;
+
default:
return NULL_TREE;
}
@@ -2229,6 +2668,7 @@ static bool
rs6000_handle_option (size_t code, const char *arg, int value)
{
enum fpu_type_t fpu_type = FPU_NONE;
+ int isel;
switch (code)
{
@@ -2331,14 +2771,14 @@ rs6000_handle_option (size_t code, const
rs6000_parse_yes_no_option ("vrsave", arg, &(TARGET_ALTIVEC_VRSAVE));
break;
- case OPT_misel:
- rs6000_explicit_options.isel = true;
- rs6000_isel = value;
- break;
-
case OPT_misel_:
- rs6000_explicit_options.isel = true;
- rs6000_parse_yes_no_option ("isel", arg, &(rs6000_isel));
+ target_flags_explicit |= MASK_ISEL;
+ isel = 0;
+ rs6000_parse_yes_no_option ("isel", arg, &isel);
+ if (isel)
+ target_flags |= MASK_ISEL;
+ else
+ target_flags &= ~MASK_ISEL;
break;
case OPT_mspe:
@@ -2967,6 +3407,9 @@ output_vec_const_move (rtx *operands)
vec = operands[1];
mode = GET_MODE (dest);
+ if (TARGET_VSX && zero_constant (vec, mode))
+ return "xxlxor %x0,%x0,%x0";
+
if (TARGET_ALTIVEC)
{
rtx splat_vec;
@@ -3190,20 +3633,21 @@ rs6000_expand_vector_init (rtx target, r
if (n_var == 0)
{
rtx const_vec = gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0));
- if (mode != V4SFmode && all_const_zero)
+ bool int_vector_p = (GET_MODE_CLASS (mode) == MODE_VECTOR_INT);
+ if ((int_vector_p || TARGET_VSX) && all_const_zero)
{
/* Zero register. */
emit_insn (gen_rtx_SET (VOIDmode, target,
gen_rtx_XOR (mode, target, target)));
return;
}
- else if (mode != V4SFmode && easy_vector_constant (const_vec, mode))
+ else if (int_vector_p && easy_vector_constant (const_vec, mode))
{
/* Splat immediate. */
emit_insn (gen_rtx_SET (VOIDmode, target, const_vec));
return;
}
- else if (all_same)
+ else if (all_same && int_vector_p)
; /* Splat vector element. */
else
{
@@ -3213,6 +3657,18 @@ rs6000_expand_vector_init (rtx target, r
}
}
+ if (mode == V2DFmode)
+ {
+ gcc_assert (TARGET_VSX);
+ if (all_same)
+ emit_insn (gen_vsx_splatv2df (target, XVECEXP (vals, 0, 0)));
+ else
+ emit_insn (gen_vsx_concat_v2df (target,
+ copy_to_reg (XVECEXP (vals, 0, 0)),
+ copy_to_reg (XVECEXP (vals, 0, 1))));
+ return;
+ }
+
/* Store value to stack temp. Load vector element. Splat. */
if (all_same)
{
@@ -3272,6 +3728,13 @@ rs6000_expand_vector_set (rtx target, rt
int width = GET_MODE_SIZE (inner_mode);
int i;
+ if (mode == V2DFmode)
+ {
+ gcc_assert (TARGET_VSX);
+ emit_insn (gen_vsx_set_v2df (target, val, target, GEN_INT (elt)));
+ return;
+ }
+
/* Load single variable value. */
mem = assign_stack_temp (mode, GET_MODE_SIZE (inner_mode), 0);
emit_move_insn (adjust_address_nv (mem, inner_mode, 0), val);
@@ -3309,6 +3772,13 @@ rs6000_expand_vector_extract (rtx target
enum machine_mode inner_mode = GET_MODE_INNER (mode);
rtx mem, x;
+ if (mode == V2DFmode)
+ {
+ gcc_assert (TARGET_VSX);
+ emit_insn (gen_vsx_extract_v2df (target, vec, GEN_INT (elt)));
+ return;
+ }
+
/* Allocate mode-sized buffer. */
mem = assign_stack_temp (mode, GET_MODE_SIZE (mode), 0);
@@ -3627,9 +4097,13 @@ rs6000_legitimate_offset_address_p (enum
case V8HImode:
case V4SFmode:
case V4SImode:
- /* AltiVec vector modes. Only reg+reg addressing is valid and
+ case V2DFmode:
+ case V2DImode:
+ /* AltiVec/VSX vector modes. Only reg+reg addressing is valid and
constant offset zero should not occur due to canonicalization. */
- return false;
+ if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode))
+ return false;
+ break;
case V4HImode:
case V2SImode:
@@ -3646,6 +4120,11 @@ rs6000_legitimate_offset_address_p (enum
if (TARGET_E500_DOUBLE)
return SPE_CONST_OFFSET_OK (offset);
+ /* If we are using VSX scalar loads, restrict ourselves to reg+reg
+ addressing. */
+ if (VECTOR_MEM_VSX_P (DFmode))
+ return false;
+
case DDmode:
case DImode:
/* On e500v2, we may have:
@@ -3716,7 +4195,9 @@ avoiding_indexed_address_p (enum machine
{
/* Avoid indexed addressing for modes that have non-indexed
load/store instruction forms. */
- return TARGET_AVOID_XFORM && !ALTIVEC_VECTOR_MODE (mode);
+ return (TARGET_AVOID_XFORM
+ && (!TARGET_ALTIVEC || !ALTIVEC_VECTOR_MODE (mode))
+ && (!TARGET_VSX || !VSX_VECTOR_MODE (mode)));
}
inline bool
@@ -3808,25 +4289,30 @@ rtx
rs6000_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
enum machine_mode mode)
{
+ rtx ret = NULL_RTX;
+ rtx orig_x = x;
+
if (GET_CODE (x) == SYMBOL_REF)
{
enum tls_model model = SYMBOL_REF_TLS_MODEL (x);
if (model != 0)
- return rs6000_legitimize_tls_address (x, model);
+ ret = rs6000_legitimize_tls_address (x, model);
}
- if (GET_CODE (x) == PLUS
- && GET_CODE (XEXP (x, 0)) == REG
- && GET_CODE (XEXP (x, 1)) == CONST_INT
- && (unsigned HOST_WIDE_INT) (INTVAL (XEXP (x, 1)) + 0x8000) >= 0x10000
- && !((TARGET_POWERPC64
- && (mode == DImode || mode == TImode)
- && (INTVAL (XEXP (x, 1)) & 3) != 0)
- || SPE_VECTOR_MODE (mode)
- || ALTIVEC_VECTOR_MODE (mode)
- || (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
- || mode == DImode || mode == DDmode
- || mode == TDmode))))
+ else if (GET_CODE (x) == PLUS
+ && GET_CODE (XEXP (x, 0)) == REG
+ && GET_CODE (XEXP (x, 1)) == CONST_INT
+ && ((unsigned HOST_WIDE_INT) (INTVAL (XEXP (x, 1)) + 0x8000)
+ >= 0x10000)
+ && !((TARGET_POWERPC64
+ && (mode == DImode || mode == TImode)
+ && (INTVAL (XEXP (x, 1)) & 3) != 0)
+ || (TARGET_SPE && SPE_VECTOR_MODE (mode))
+ || (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode))
+ || (TARGET_VSX && VSX_VECTOR_MODE (mode))
+ || (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
+ || mode == DImode || mode == DDmode
+ || mode == TDmode))))
{
HOST_WIDE_INT high_int, low_int;
rtx sum;
@@ -3834,7 +4320,7 @@ rs6000_legitimize_address (rtx x, rtx ol
high_int = INTVAL (XEXP (x, 1)) - low_int;
sum = force_operand (gen_rtx_PLUS (Pmode, XEXP (x, 0),
GEN_INT (high_int)), 0);
- return gen_rtx_PLUS (Pmode, sum, GEN_INT (low_int));
+ ret = gen_rtx_PLUS (Pmode, sum, GEN_INT (low_int));
}
else if (GET_CODE (x) == PLUS
&& GET_CODE (XEXP (x, 0)) == REG
@@ -3850,32 +4336,29 @@ rs6000_legitimize_address (rtx x, rtx ol
&& mode != TFmode
&& mode != TDmode)
{
- return gen_rtx_PLUS (Pmode, XEXP (x, 0),
- force_reg (Pmode, force_operand (XEXP (x, 1), 0)));
+ ret = gen_rtx_PLUS (Pmode, XEXP (x, 0),
+ force_reg (Pmode, force_operand (XEXP (x, 1), 0)));
}
- else if (ALTIVEC_VECTOR_MODE (mode))
+ else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
{
- rtx reg;
-
/* Make sure both operands are registers. */
if (GET_CODE (x) == PLUS)
- return gen_rtx_PLUS (Pmode, force_reg (Pmode, XEXP (x, 0)),
- force_reg (Pmode, XEXP (x, 1)));
-
- reg = force_reg (Pmode, x);
- return reg;
+ ret = gen_rtx_PLUS (Pmode, force_reg (Pmode, XEXP (x, 0)),
+ force_reg (Pmode, XEXP (x, 1)));
+ else
+ ret = force_reg (Pmode, x);
}
- else if (SPE_VECTOR_MODE (mode)
+ else if ((TARGET_SPE && SPE_VECTOR_MODE (mode))
|| (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
|| mode == DDmode || mode == TDmode
|| mode == DImode)))
{
if (mode == DImode)
- return NULL_RTX;
- /* We accept [reg + reg] and [reg + OFFSET]. */
+ ret = NULL_RTX;
- if (GET_CODE (x) == PLUS)
- {
+ /* We accept [reg + reg] and [reg + OFFSET]. */
+ else if (GET_CODE (x) == PLUS)
+ {
rtx op1 = XEXP (x, 0);
rtx op2 = XEXP (x, 1);
rtx y;
@@ -3894,12 +4377,12 @@ rs6000_legitimize_address (rtx x, rtx ol
y = gen_rtx_PLUS (Pmode, op1, op2);
if ((GET_MODE_SIZE (mode) > 8 || mode == DDmode) && REG_P (op2))
- return force_reg (Pmode, y);
+ ret = force_reg (Pmode, y);
else
- return y;
+ ret = y;
}
-
- return force_reg (Pmode, x);
+ else
+ ret = force_reg (Pmode, x);
}
else if (TARGET_ELF
&& TARGET_32BIT
@@ -3915,7 +4398,7 @@ rs6000_legitimize_address (rtx x, rtx ol
{
rtx reg = gen_reg_rtx (Pmode);
emit_insn (gen_elf_high (reg, x));
- return gen_rtx_LO_SUM (Pmode, reg, x);
+ ret = gen_rtx_LO_SUM (Pmode, reg, x);
}
else if (TARGET_MACHO && TARGET_32BIT && TARGET_NO_TOC
&& ! flag_pic
@@ -3933,17 +4416,34 @@ rs6000_legitimize_address (rtx x, rtx ol
{
rtx reg = gen_reg_rtx (Pmode);
emit_insn (gen_macho_high (reg, x));
- return gen_rtx_LO_SUM (Pmode, reg, x);
+ ret = gen_rtx_LO_SUM (Pmode, reg, x);
}
else if (TARGET_TOC
&& GET_CODE (x) == SYMBOL_REF
&& constant_pool_expr_p (x)
&& ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), Pmode))
{
- return create_TOC_reference (x);
+ ret = create_TOC_reference (x);
}
else
- return NULL_RTX;
+ ret = NULL_RTX;
+
+ if (TARGET_DEBUG_ADDR)
+ {
+ fprintf (stderr,
+ "\nrs6000_legitimize_address: mode %s, original addr:\n",
+ GET_MODE_NAME (mode));
+ debug_rtx (orig_x);
+ if (ret)
+ {
+ fprintf (stderr, "New addr:\n");
+ debug_rtx (ret);
+ }
+ else
+ fprintf (stderr, "NULL returned\n");
+ }
+
+ return ret;
}
/* This is called from dwarf2out.c via TARGET_ASM_OUTPUT_DWARF_DTPREL.
@@ -4232,6 +4732,9 @@ rs6000_legitimize_reload_address (rtx x,
int opnum, int type,
int ind_levels ATTRIBUTE_UNUSED, int *win)
{
+ rtx orig_x = x;
+ rtx ret = NULL_RTX;
+
/* We must recognize output that we have already generated ourselves. */
if (GET_CODE (x) == PLUS
&& GET_CODE (XEXP (x, 0)) == PLUS
@@ -4243,17 +4746,17 @@ rs6000_legitimize_reload_address (rtx x,
BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
opnum, (enum reload_type)type);
*win = 1;
- return x;
+ ret = x;
}
#if TARGET_MACHO
- if (DEFAULT_ABI == ABI_DARWIN && flag_pic
- && GET_CODE (x) == LO_SUM
- && GET_CODE (XEXP (x, 0)) == PLUS
- && XEXP (XEXP (x, 0), 0) == pic_offset_table_rtx
- && GET_CODE (XEXP (XEXP (x, 0), 1)) == HIGH
- && XEXP (XEXP (XEXP (x, 0), 1), 0) == XEXP (x, 1)
- && machopic_operand_p (XEXP (x, 1)))
+ else if (DEFAULT_ABI == ABI_DARWIN && flag_pic
+ && GET_CODE (x) == LO_SUM
+ && GET_CODE (XEXP (x, 0)) == PLUS
+ && XEXP (XEXP (x, 0), 0) == pic_offset_table_rtx
+ && GET_CODE (XEXP (XEXP (x, 0), 1)) == HIGH
+ && XEXP (XEXP (XEXP (x, 0), 1), 0) == XEXP (x, 1)
+ && machopic_operand_p (XEXP (x, 1)))
{
/* Result of previous invocation of this function on Darwin
floating point constant. */
@@ -4261,40 +4764,42 @@ rs6000_legitimize_reload_address (rtx x,
BASE_REG_CLASS, Pmode, VOIDmode, 0, 0,
opnum, (enum reload_type)type);
*win = 1;
- return x;
+ ret = x;
}
#endif
/* Force ld/std non-word aligned offset into base register by wrapping
in offset 0. */
- if (GET_CODE (x) == PLUS
- && GET_CODE (XEXP (x, 0)) == REG
- && REGNO (XEXP (x, 0)) < 32
- && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode)
- && GET_CODE (XEXP (x, 1)) == CONST_INT
- && (INTVAL (XEXP (x, 1)) & 3) != 0
- && !ALTIVEC_VECTOR_MODE (mode)
- && GET_MODE_SIZE (mode) >= UNITS_PER_WORD
- && TARGET_POWERPC64)
+ else if (GET_CODE (x) == PLUS
+ && GET_CODE (XEXP (x, 0)) == REG
+ && REGNO (XEXP (x, 0)) < 32
+ && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode)
+ && GET_CODE (XEXP (x, 1)) == CONST_INT
+ && (INTVAL (XEXP (x, 1)) & 3) != 0
+ && !ALTIVEC_VECTOR_MODE (mode)
+ && !VSX_VECTOR_MODE (mode)
+ && GET_MODE_SIZE (mode) >= UNITS_PER_WORD
+ && TARGET_POWERPC64)
{
x = gen_rtx_PLUS (GET_MODE (x), x, GEN_INT (0));
push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
opnum, (enum reload_type) type);
*win = 1;
- return x;
+ ret = x;
}
- if (GET_CODE (x) == PLUS
- && GET_CODE (XEXP (x, 0)) == REG
- && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER
- && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode)
- && GET_CODE (XEXP (x, 1)) == CONST_INT
- && !SPE_VECTOR_MODE (mode)
- && !(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
- || mode == DDmode || mode == TDmode
- || mode == DImode))
- && !ALTIVEC_VECTOR_MODE (mode))
+ else if (GET_CODE (x) == PLUS
+ && GET_CODE (XEXP (x, 0)) == REG
+ && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER
+ && REG_MODE_OK_FOR_BASE_P (XEXP (x, 0), mode)
+ && GET_CODE (XEXP (x, 1)) == CONST_INT
+ && !SPE_VECTOR_MODE (mode)
+ && !(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
+ || mode == DDmode || mode == TDmode
+ || mode == DImode))
+ && !ALTIVEC_VECTOR_MODE (mode)
+ && !VSX_VECTOR_MODE (mode))
{
HOST_WIDE_INT val = INTVAL (XEXP (x, 1));
HOST_WIDE_INT low = ((val & 0xffff) ^ 0x8000) - 0x8000;
@@ -4305,42 +4810,44 @@ rs6000_legitimize_reload_address (rtx x,
if (high + low != val)
{
*win = 0;
- return x;
+ ret = x;
}
+ else
+ {
+ /* Reload the high part into a base reg; leave the low part
+ in the mem directly. */
- /* Reload the high part into a base reg; leave the low part
- in the mem directly. */
-
- x = gen_rtx_PLUS (GET_MODE (x),
- gen_rtx_PLUS (GET_MODE (x), XEXP (x, 0),
- GEN_INT (high)),
- GEN_INT (low));
+ x = gen_rtx_PLUS (GET_MODE (x),
+ gen_rtx_PLUS (GET_MODE (x), XEXP (x, 0),
+ GEN_INT (high)),
+ GEN_INT (low));
- push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
- BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
- opnum, (enum reload_type)type);
- *win = 1;
- return x;
+ push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
+ BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
+ opnum, (enum reload_type)type);
+ *win = 1;
+ return x;
+ }
}
- if (GET_CODE (x) == SYMBOL_REF
- && !ALTIVEC_VECTOR_MODE (mode)
- && !SPE_VECTOR_MODE (mode)
+ else if (GET_CODE (x) == SYMBOL_REF
+ && VECTOR_MEM_NONE_P (mode)
+ && (!TARGET_SPE || !SPE_VECTOR_MODE (mode))
#if TARGET_MACHO
- && DEFAULT_ABI == ABI_DARWIN
- && (flag_pic || MACHO_DYNAMIC_NO_PIC_P)
+ && DEFAULT_ABI == ABI_DARWIN
+ && (flag_pic || MACHO_DYNAMIC_NO_PIC_P)
#else
- && DEFAULT_ABI == ABI_V4
- && !flag_pic
+ && DEFAULT_ABI == ABI_V4
+ && !flag_pic
#endif
- /* Don't do this for TFmode or TDmode, since the result isn't offsettable.
- The same goes for DImode without 64-bit gprs and DFmode and DDmode
- without fprs. */
- && mode != TFmode
- && mode != TDmode
- && (mode != DImode || TARGET_POWERPC64)
- && ((mode != DFmode && mode != DDmode) || TARGET_POWERPC64
- || (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)))
+ /* Don't do this for TFmode or TDmode, since the result isn't
+ offsettable. The same goes for DImode without 64-bit gprs and
+ DFmode and DDmode without fprs. */
+ && mode != TFmode
+ && mode != TDmode
+ && (mode != DImode || TARGET_POWERPC64)
+ && ((mode != DFmode && mode != DDmode) || TARGET_POWERPC64
+ || (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)))
{
#if TARGET_MACHO
if (flag_pic)
@@ -4359,37 +4866,61 @@ rs6000_legitimize_reload_address (rtx x,
BASE_REG_CLASS, Pmode, VOIDmode, 0, 0,
opnum, (enum reload_type)type);
*win = 1;
- return x;
+ ret = x;
}
/* Reload an offset address wrapped by an AND that represents the
masking of the lower bits. Strip the outer AND and let reload
convert the offset address into an indirect address. */
- if (TARGET_ALTIVEC
- && ALTIVEC_VECTOR_MODE (mode)
- && GET_CODE (x) == AND
- && GET_CODE (XEXP (x, 0)) == PLUS
- && GET_CODE (XEXP (XEXP (x, 0), 0)) == REG
- && GET_CODE (XEXP (XEXP (x, 0), 1)) == CONST_INT
- && GET_CODE (XEXP (x, 1)) == CONST_INT
- && INTVAL (XEXP (x, 1)) == -16)
+ else if (VECTOR_MEM_ALTIVEC_P (mode)
+ && GET_CODE (x) == AND
+ && GET_CODE (XEXP (x, 0)) == PLUS
+ && GET_CODE (XEXP (XEXP (x, 0), 0)) == REG
+ && GET_CODE (XEXP (XEXP (x, 0), 1)) == CONST_INT
+ && GET_CODE (XEXP (x, 1)) == CONST_INT
+ && INTVAL (XEXP (x, 1)) == -16)
{
x = XEXP (x, 0);
*win = 1;
- return x;
+ ret = x;
}
- if (TARGET_TOC
- && GET_CODE (x) == SYMBOL_REF
- && constant_pool_expr_p (x)
- && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode))
+ else if (TARGET_TOC
+ && GET_CODE (x) == SYMBOL_REF
+ && constant_pool_expr_p (x)
+ && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (x), mode))
{
x = create_TOC_reference (x);
*win = 1;
- return x;
+ ret = x;
+ }
+
+ else
+ {
+ *win = 0;
+ ret = x;
}
- *win = 0;
- return x;
+
+ if (TARGET_DEBUG_ADDR)
+ {
+ fprintf (stderr,
+ "\nrs6000_legitimize_reload_address: mode = %s, opnum = %d, "
+ "type = %d, ind_levels = %d, win = %d, original addr:\n",
+ GET_MODE_NAME (mode), opnum, type, ind_levels, *win);
+ debug_rtx (orig_x);
+
+ if (orig_x == ret)
+ fprintf (stderr, "Same address returned\n");
+ else if (!ret)
+ fprintf (stderr, "NULL returned\n");
+ else
+ {
+ fprintf (stderr, "New address:\n");
+ debug_rtx (ret);
+ }
+ }
+
+ return ret;
}
/* GO_IF_LEGITIMATE_ADDRESS recognizes an RTL expression
@@ -4412,77 +4943,101 @@ rs6000_legitimize_reload_address (rtx x,
int
rs6000_legitimate_address (enum machine_mode mode, rtx x, int reg_ok_strict)
{
+ int ret;
+ rtx orig_x = x;
+
/* If this is an unaligned stvx/ldvx type address, discard the outer AND. */
if (TARGET_ALTIVEC
- && ALTIVEC_VECTOR_MODE (mode)
+ && (ALTIVEC_VECTOR_MODE (mode) || VSX_VECTOR_MODE (mode))
&& GET_CODE (x) == AND
&& GET_CODE (XEXP (x, 1)) == CONST_INT
&& INTVAL (XEXP (x, 1)) == -16)
x = XEXP (x, 0);
if (RS6000_SYMBOL_REF_TLS_P (x))
- return 0;
- if (legitimate_indirect_address_p (x, reg_ok_strict))
- return 1;
- if ((GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC)
- && !ALTIVEC_VECTOR_MODE (mode)
- && !SPE_VECTOR_MODE (mode)
- && mode != TFmode
- && mode != TDmode
- /* Restrict addressing for DI because of our SUBREG hackery. */
- && !(TARGET_E500_DOUBLE
- && (mode == DFmode || mode == DDmode || mode == DImode))
- && TARGET_UPDATE
- && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
- return 1;
- if (legitimate_small_data_p (mode, x))
- return 1;
- if (legitimate_constant_pool_address_p (x))
- return 1;
+ ret = 0;
+ else if (legitimate_indirect_address_p (x, reg_ok_strict))
+ ret = 1;
+ else if ((GET_CODE (x) == PRE_INC || GET_CODE (x) == PRE_DEC)
+ && !VECTOR_MEM_ALTIVEC_P (mode)
+ /* && !VECTOR_MEM_VSX_P (mode) */
+ && (TARGET_SPE && !SPE_VECTOR_MODE (mode))
+ && mode != TFmode
+ && mode != TDmode
+ /* Restrict addressing for DI because of our SUBREG hackery. */
+ && !(TARGET_E500_DOUBLE
+ && (mode == DFmode || mode == DDmode || mode == DImode))
+ && TARGET_UPDATE
+ && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
+ ret = 1;
+ else if (legitimate_small_data_p (mode, x))
+ ret = 1;
+ else if (legitimate_constant_pool_address_p (x))
+ ret = 1;
/* If not REG_OK_STRICT (before reload) let pass any stack offset. */
- if (! reg_ok_strict
- && GET_CODE (x) == PLUS
- && GET_CODE (XEXP (x, 0)) == REG
- && (XEXP (x, 0) == virtual_stack_vars_rtx
- || XEXP (x, 0) == arg_pointer_rtx)
- && GET_CODE (XEXP (x, 1)) == CONST_INT)
- return 1;
- if (rs6000_legitimate_offset_address_p (mode, x, reg_ok_strict))
- return 1;
- if (mode != TImode
- && mode != TFmode
- && mode != TDmode
- && ((TARGET_HARD_FLOAT && TARGET_FPRS)
- || TARGET_POWERPC64
- || (mode != DFmode && mode != DDmode)
- || (TARGET_E500_DOUBLE && mode != DDmode))
- && (TARGET_POWERPC64 || mode != DImode)
- && !avoiding_indexed_address_p (mode)
- && legitimate_indexed_address_p (x, reg_ok_strict))
- return 1;
- if (GET_CODE (x) == PRE_MODIFY
- && mode != TImode
- && mode != TFmode
- && mode != TDmode
- && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)
- || TARGET_POWERPC64
- || ((mode != DFmode && mode != DDmode) || TARGET_E500_DOUBLE))
- && (TARGET_POWERPC64 || mode != DImode)
- && !ALTIVEC_VECTOR_MODE (mode)
- && !SPE_VECTOR_MODE (mode)
- /* Restrict addressing for DI because of our SUBREG hackery. */
- && !(TARGET_E500_DOUBLE
- && (mode == DFmode || mode == DDmode || mode == DImode))
- && TARGET_UPDATE
- && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict)
- && (rs6000_legitimate_offset_address_p (mode, XEXP (x, 1), reg_ok_strict)
- || (!avoiding_indexed_address_p (mode)
- && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
- && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
- return 1;
- if (legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
- return 1;
- return 0;
+ else if (! reg_ok_strict
+ && GET_CODE (x) == PLUS
+ && GET_CODE (XEXP (x, 0)) == REG
+ && (XEXP (x, 0) == virtual_stack_vars_rtx
+ || XEXP (x, 0) == arg_pointer_rtx)
+ && GET_CODE (XEXP (x, 1)) == CONST_INT)
+ ret = 1;
+ else if (rs6000_legitimate_offset_address_p (mode, x, reg_ok_strict))
+ ret = 1;
+ else if (mode != TImode
+ && mode != TFmode
+ && mode != TDmode
+ && ((TARGET_HARD_FLOAT && TARGET_FPRS)
+ || TARGET_POWERPC64
+ || (mode != DFmode && mode != DDmode)
+ || (TARGET_E500_DOUBLE && mode != DDmode))
+ && (TARGET_POWERPC64 || mode != DImode)
+ && !avoiding_indexed_address_p (mode)
+ && legitimate_indexed_address_p (x, reg_ok_strict))
+ ret = 1;
+ else if (GET_CODE (x) == PRE_MODIFY
+ && VECTOR_MEM_VSX_P (mode)
+ && TARGET_UPDATE
+ && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)
+ && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
+ ret = 1;
+ else if (GET_CODE (x) == PRE_MODIFY
+ && mode != TImode
+ && mode != TFmode
+ && mode != TDmode
+ && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)
+ || TARGET_POWERPC64
+ || ((mode != DFmode && mode != DDmode) || TARGET_E500_DOUBLE))
+ && (TARGET_POWERPC64 || mode != DImode)
+ && !VECTOR_MEM_ALTIVEC_P (mode)
+ && (!TARGET_SPE || !SPE_VECTOR_MODE (mode))
+ /* Restrict addressing for DI because of our SUBREG hackery. */
+ && !(TARGET_E500_DOUBLE
+ && (mode == DFmode || mode == DDmode || mode == DImode))
+ && TARGET_UPDATE
+ && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict)
+ && (rs6000_legitimate_offset_address_p (mode, XEXP (x, 1), reg_ok_strict)
+ || (!avoiding_indexed_address_p (mode)
+ && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
+ && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
+ ret = 1;
+ else if (legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
+ ret = 1;
+ else
+ ret = 0;
+
+ if (TARGET_DEBUG_ADDR)
+ {
+ fprintf (stderr,
+ "\nrs6000_legitimate_address: return = %d, mode = %s, "
+ "strict = %d\n",
+ ret,
+ GET_MODE_NAME (mode),
+ reg_ok_strict);
+ debug_rtx (orig_x);
+ }
+
+ return ret;
}
/* Go to LABEL if ADDR (a legitimate address expression)
@@ -4499,28 +5054,40 @@ rs6000_legitimate_address (enum machine_
bool
rs6000_mode_dependent_address (rtx addr)
{
+ bool ret = false;
+
switch (GET_CODE (addr))
{
case PLUS:
if (GET_CODE (XEXP (addr, 1)) == CONST_INT)
{
unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
- return val + 12 + 0x8000 >= 0x10000;
+ ret = (val + 12 + 0x8000 >= 0x10000);
}
break;
case LO_SUM:
- return true;
+ ret = true;
+ break;
/* Auto-increment cases are now treated generically in recog.c. */
case PRE_MODIFY:
- return TARGET_UPDATE;
+ ret = (TARGET_UPDATE != 0);
+ break;
default:
break;
}
- return false;
+ if (TARGET_DEBUG_ADDR)
+ {
+ fprintf (stderr,
+ "\nrs6000_mode_dependent_address: ret = %d\n",
+ (int)ret);
+ debug_rtx (addr);
+ }
+
+ return ret;
}
/* Implement FIND_BASE_TERM. */
@@ -4571,43 +5138,6 @@ rs6000_offsettable_memref_p (rtx op)
return rs6000_legitimate_offset_address_p (GET_MODE (op), XEXP (op, 0), 1);
}
-/* Return number of consecutive hard regs needed starting at reg REGNO
- to hold something of mode MODE.
- This is ordinarily the length in words of a value of mode MODE
- but can be less for certain modes in special long registers.
-
- For the SPE, GPRs are 64 bits but only 32 bits are visible in
- scalar instructions. The upper 32 bits are only available to the
- SIMD instructions.
-
- POWER and PowerPC GPRs hold 32 bits worth;
- PowerPC64 GPRs and FPRs point register holds 64 bits worth. */
-
-int
-rs6000_hard_regno_nregs (int regno, enum machine_mode mode)
-{
- if (FP_REGNO_P (regno))
- return (GET_MODE_SIZE (mode) + UNITS_PER_FP_WORD - 1) / UNITS_PER_FP_WORD;
-
- if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode))
- return (GET_MODE_SIZE (mode) + UNITS_PER_SPE_WORD - 1) / UNITS_PER_SPE_WORD;
-
- if (ALTIVEC_REGNO_P (regno))
- return
- (GET_MODE_SIZE (mode) + UNITS_PER_ALTIVEC_WORD - 1) / UNITS_PER_ALTIVEC_WORD;
-
- /* The value returned for SCmode in the E500 double case is 2 for
- ABI compatibility; storing an SCmode value in a single register
- would require function_arg and rs6000_spe_function_arg to handle
- SCmode so as to pass the value correctly in a pair of
- registers. */
- if (TARGET_E500_DOUBLE && FLOAT_MODE_P (mode) && mode != SCmode
- && !DECIMAL_FLOAT_MODE_P (mode))
- return (GET_MODE_SIZE (mode) + UNITS_PER_FP_WORD - 1) / UNITS_PER_FP_WORD;
-
- return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
-}
-
/* Change register usage conditional on target flags. */
void
rs6000_conditional_register_usage (void)
@@ -4672,14 +5202,14 @@ rs6000_conditional_register_usage (void)
= call_really_used_regs[14] = 1;
}
- if (!TARGET_ALTIVEC)
+ if (!TARGET_ALTIVEC && !TARGET_VSX)
{
for (i = FIRST_ALTIVEC_REGNO; i <= LAST_ALTIVEC_REGNO; ++i)
fixed_regs[i] = call_used_regs[i] = call_really_used_regs[i] = 1;
call_really_used_regs[VRSAVE_REGNO] = 1;
}
- if (TARGET_ALTIVEC)
+ if (TARGET_ALTIVEC || TARGET_VSX)
global_regs[VSCR_REGNO] = 1;
if (TARGET_ALTIVEC_ABI)
@@ -5101,6 +5631,8 @@ rs6000_emit_move (rtx dest, rtx source,
case V2SFmode:
case V2SImode:
case V1DImode:
+ case V2DFmode:
+ case V2DImode:
if (CONSTANT_P (operands[1])
&& !easy_vector_constant (operands[1], mode))
operands[1] = force_const_mem (mode, operands[1]);
@@ -5270,6 +5802,9 @@ rs6000_emit_move (rtx dest, rtx source,
break;
case TImode:
+ if (VECTOR_MEM_ALTIVEC_OR_VSX_P (TImode))
+ break;
+
rs6000_eliminate_indexed_memrefs (operands);
if (TARGET_POWER)
@@ -5285,7 +5820,7 @@ rs6000_emit_move (rtx dest, rtx source,
break;
default:
- gcc_unreachable ();
+ fatal_insn ("bad move", gen_rtx_SET (VOIDmode, dest, source));
}
/* Above, we may have called force_const_mem which may have returned
@@ -5305,10 +5840,10 @@ rs6000_emit_move (rtx dest, rtx source,
&& TARGET_HARD_FLOAT && TARGET_FPRS)
/* Nonzero if we can use an AltiVec register to pass this arg. */
-#define USE_ALTIVEC_FOR_ARG_P(CUM,MODE,TYPE,NAMED) \
- (ALTIVEC_VECTOR_MODE (MODE) \
- && (CUM)->vregno <= ALTIVEC_ARG_MAX_REG \
- && TARGET_ALTIVEC_ABI \
+#define USE_ALTIVEC_FOR_ARG_P(CUM,MODE,TYPE,NAMED) \
+ ((ALTIVEC_VECTOR_MODE (MODE) || VSX_VECTOR_MODE (MODE)) \
+ && (CUM)->vregno <= ALTIVEC_ARG_MAX_REG \
+ && TARGET_ALTIVEC_ABI \
&& (NAMED))
/* Return a nonzero value to say to return the function value in
@@ -5549,7 +6084,7 @@ function_arg_boundary (enum machine_mode
&& int_size_in_bytes (type) >= 8
&& int_size_in_bytes (type) < 16))
return 64;
- else if (ALTIVEC_VECTOR_MODE (mode)
+ else if ((ALTIVEC_VECTOR_MODE (mode) || VSX_VECTOR_MODE (mode))
|| (type && TREE_CODE (type) == VECTOR_TYPE
&& int_size_in_bytes (type) >= 16))
return 128;
@@ -5694,7 +6229,7 @@ function_arg_advance (CUMULATIVE_ARGS *c
cum->nargs_prototype--;
if (TARGET_ALTIVEC_ABI
- && (ALTIVEC_VECTOR_MODE (mode)
+ && ((ALTIVEC_VECTOR_MODE (mode) || VSX_VECTOR_MODE (mode))
|| (type && TREE_CODE (type) == VECTOR_TYPE
&& int_size_in_bytes (type) == 16)))
{
@@ -6288,7 +6823,7 @@ function_arg (CUMULATIVE_ARGS *cum, enum
else
return gen_rtx_REG (mode, cum->vregno);
else if (TARGET_ALTIVEC_ABI
- && (ALTIVEC_VECTOR_MODE (mode)
+ && ((ALTIVEC_VECTOR_MODE (mode) || VSX_VECTOR_MODE (mode))
|| (type && TREE_CODE (type) == VECTOR_TYPE
&& int_size_in_bytes (type) == 16)))
{
@@ -7212,10 +7747,13 @@ static const struct builtin_description
{ MASK_ALTIVEC, CODE_FOR_altivec_vperm_v4si, "__builtin_altivec_vperm_4si", ALTIVEC_BUILTIN_VPERM_4SI },
{ MASK_ALTIVEC, CODE_FOR_altivec_vperm_v8hi, "__builtin_altivec_vperm_8hi", ALTIVEC_BUILTIN_VPERM_8HI },
{ MASK_ALTIVEC, CODE_FOR_altivec_vperm_v16qi, "__builtin_altivec_vperm_16qi", ALTIVEC_BUILTIN_VPERM_16QI },
- { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v4sf, "__builtin_altivec_vsel_4sf", ALTIVEC_BUILTIN_VSEL_4SF },
- { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v4si, "__builtin_altivec_vsel_4si", ALTIVEC_BUILTIN_VSEL_4SI },
- { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v8hi, "__builtin_altivec_vsel_8hi", ALTIVEC_BUILTIN_VSEL_8HI },
- { MASK_ALTIVEC, CODE_FOR_altivec_vsel_v16qi, "__builtin_altivec_vsel_16qi", ALTIVEC_BUILTIN_VSEL_16QI },
+ { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v2df, "__builtin_altivec_vperm_2df", ALTIVEC_BUILTIN_VPERM_2DF },
+ { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v2di, "__builtin_altivec_vperm_2di", ALTIVEC_BUILTIN_VPERM_2DI },
+ { MASK_ALTIVEC, CODE_FOR_vector_vselv4sf, "__builtin_altivec_vsel_4sf", ALTIVEC_BUILTIN_VSEL_4SF },
+ { MASK_ALTIVEC, CODE_FOR_vector_vselv4si, "__builtin_altivec_vsel_4si", ALTIVEC_BUILTIN_VSEL_4SI },
+ { MASK_ALTIVEC, CODE_FOR_vector_vselv8hi, "__builtin_altivec_vsel_8hi", ALTIVEC_BUILTIN_VSEL_8HI },
+ { MASK_ALTIVEC, CODE_FOR_vector_vselv16qi, "__builtin_altivec_vsel_16qi", ALTIVEC_BUILTIN_VSEL_16QI },
+ { MASK_ALTIVEC, CODE_FOR_vector_vselv2df, "__builtin_altivec_vsel_2df", ALTIVEC_BUILTIN_VSEL_2DF },
{ MASK_ALTIVEC, CODE_FOR_altivec_vsldoi_v16qi, "__builtin_altivec_vsldoi_16qi", ALTIVEC_BUILTIN_VSLDOI_16QI },
{ MASK_ALTIVEC, CODE_FOR_altivec_vsldoi_v8hi, "__builtin_altivec_vsldoi_8hi", ALTIVEC_BUILTIN_VSLDOI_8HI },
{ MASK_ALTIVEC, CODE_FOR_altivec_vsldoi_v4si, "__builtin_altivec_vsldoi_4si", ALTIVEC_BUILTIN_VSLDOI_4SI },
@@ -7289,18 +7827,18 @@ static struct builtin_description bdesc_
{ MASK_ALTIVEC, CODE_FOR_altivec_vcfux, "__builtin_altivec_vcfux", ALTIVEC_BUILTIN_VCFUX },
{ MASK_ALTIVEC, CODE_FOR_altivec_vcfsx, "__builtin_altivec_vcfsx", ALTIVEC_BUILTIN_VCFSX },
{ MASK_ALTIVEC, CODE_FOR_altivec_vcmpbfp, "__builtin_altivec_vcmpbfp", ALTIVEC_BUILTIN_VCMPBFP },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpequb, "__builtin_altivec_vcmpequb", ALTIVEC_BUILTIN_VCMPEQUB },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpequh, "__builtin_altivec_vcmpequh", ALTIVEC_BUILTIN_VCMPEQUH },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpequw, "__builtin_altivec_vcmpequw", ALTIVEC_BUILTIN_VCMPEQUW },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpeqfp, "__builtin_altivec_vcmpeqfp", ALTIVEC_BUILTIN_VCMPEQFP },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgefp, "__builtin_altivec_vcmpgefp", ALTIVEC_BUILTIN_VCMPGEFP },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtub, "__builtin_altivec_vcmpgtub", ALTIVEC_BUILTIN_VCMPGTUB },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtsb, "__builtin_altivec_vcmpgtsb", ALTIVEC_BUILTIN_VCMPGTSB },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtuh, "__builtin_altivec_vcmpgtuh", ALTIVEC_BUILTIN_VCMPGTUH },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtsh, "__builtin_altivec_vcmpgtsh", ALTIVEC_BUILTIN_VCMPGTSH },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtuw, "__builtin_altivec_vcmpgtuw", ALTIVEC_BUILTIN_VCMPGTUW },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtsw, "__builtin_altivec_vcmpgtsw", ALTIVEC_BUILTIN_VCMPGTSW },
- { MASK_ALTIVEC, CODE_FOR_altivec_vcmpgtfp, "__builtin_altivec_vcmpgtfp", ALTIVEC_BUILTIN_VCMPGTFP },
+ { MASK_ALTIVEC, CODE_FOR_vector_eqv16qi, "__builtin_altivec_vcmpequb", ALTIVEC_BUILTIN_VCMPEQUB },
+ { MASK_ALTIVEC, CODE_FOR_vector_eqv8hi, "__builtin_altivec_vcmpequh", ALTIVEC_BUILTIN_VCMPEQUH },
+ { MASK_ALTIVEC, CODE_FOR_vector_eqv4si, "__builtin_altivec_vcmpequw", ALTIVEC_BUILTIN_VCMPEQUW },
+ { MASK_ALTIVEC, CODE_FOR_vector_eqv4sf, "__builtin_altivec_vcmpeqfp", ALTIVEC_BUILTIN_VCMPEQFP },
+ { MASK_ALTIVEC, CODE_FOR_vector_gev4sf, "__builtin_altivec_vcmpgefp", ALTIVEC_BUILTIN_VCMPGEFP },
+ { MASK_ALTIVEC, CODE_FOR_vector_gtuv16qi, "__builtin_altivec_vcmpgtub", ALTIVEC_BUILTIN_VCMPGTUB },
+ { MASK_ALTIVEC, CODE_FOR_vector_gtuv8hi, "__builtin_altivec_vcmpgtsb", ALTIVEC_BUILTIN_VCMPGTSB },
+ { MASK_ALTIVEC, CODE_FOR_vector_gtuv4si, "__builtin_altivec_vcmpgtuh", ALTIVEC_BUILTIN_VCMPGTUH },
+ { MASK_ALTIVEC, CODE_FOR_vector_gtv16qi, "__builtin_altivec_vcmpgtsh", ALTIVEC_BUILTIN_VCMPGTSH },
+ { MASK_ALTIVEC, CODE_FOR_vector_gtv8hi, "__builtin_altivec_vcmpgtuw", ALTIVEC_BUILTIN_VCMPGTUW },
+ { MASK_ALTIVEC, CODE_FOR_vector_gtv4si, "__builtin_altivec_vcmpgtsw", ALTIVEC_BUILTIN_VCMPGTSW },
+ { MASK_ALTIVEC, CODE_FOR_vector_gtv4sf, "__builtin_altivec_vcmpgtfp", ALTIVEC_BUILTIN_VCMPGTFP },
{ MASK_ALTIVEC, CODE_FOR_altivec_vctsxs, "__builtin_altivec_vctsxs", ALTIVEC_BUILTIN_VCTSXS },
{ MASK_ALTIVEC, CODE_FOR_altivec_vctuxs, "__builtin_altivec_vctuxs", ALTIVEC_BUILTIN_VCTUXS },
{ MASK_ALTIVEC, CODE_FOR_umaxv16qi3, "__builtin_altivec_vmaxub", ALTIVEC_BUILTIN_VMAXUB },
@@ -7331,7 +7869,7 @@ static struct builtin_description bdesc_
{ MASK_ALTIVEC, CODE_FOR_altivec_vmulosb, "__builtin_altivec_vmulosb", ALTIVEC_BUILTIN_VMULOSB },
{ MASK_ALTIVEC, CODE_FOR_altivec_vmulouh, "__builtin_altivec_vmulouh", ALTIVEC_BUILTIN_VMULOUH },
{ MASK_ALTIVEC, CODE_FOR_altivec_vmulosh, "__builtin_altivec_vmulosh", ALTIVEC_BUILTIN_VMULOSH },
- { MASK_ALTIVEC, CODE_FOR_altivec_norv4si3, "__builtin_altivec_vnor", ALTIVEC_BUILTIN_VNOR },
+ { MASK_ALTIVEC, CODE_FOR_norv4si3, "__builtin_altivec_vnor", ALTIVEC_BUILTIN_VNOR },
{ MASK_ALTIVEC, CODE_FOR_iorv4si3, "__builtin_altivec_vor", ALTIVEC_BUILTIN_VOR },
{ MASK_ALTIVEC, CODE_FOR_altivec_vpkuhum, "__builtin_altivec_vpkuhum", ALTIVEC_BUILTIN_VPKUHUM },
{ MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum, "__builtin_altivec_vpkuwum", ALTIVEC_BUILTIN_VPKUWUM },
@@ -7796,6 +8334,11 @@ static struct builtin_description bdesc_
{ MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vupklsh", ALTIVEC_BUILTIN_VEC_VUPKLSH },
{ MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_vupklsb", ALTIVEC_BUILTIN_VEC_VUPKLSB },
+ { MASK_ALTIVEC|MASK_VSX, CODE_FOR_floatv4siv4sf2, "__builtin_vec_float_sisf", VECTOR_BUILTIN_FLOAT_V4SI_V4SF },
+ { MASK_ALTIVEC|MASK_VSX, CODE_FOR_unsigned_floatv4siv4sf2, "__builtin_vec_uns_float_sisf", VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF },
+ { MASK_ALTIVEC|MASK_VSX, CODE_FOR_fix_truncv4sfv4si2, "__builtin_vec_fix_sfsi", VECTOR_BUILTIN_FIX_V4SF_V4SI },
+ { MASK_ALTIVEC|MASK_VSX, CODE_FOR_fixuns_truncv4sfv4si2, "__builtin_vec_fixuns_sfsi", VECTOR_BUILTIN_FIXUNS_V4SF_V4SI },
+
/* The SPE unary builtins must start with SPE_BUILTIN_EVABS and
end with SPE_BUILTIN_EVSUBFUSIAAW. */
{ 0, CODE_FOR_spe_evabs, "__builtin_spe_evabs", SPE_BUILTIN_EVABS },
@@ -8352,16 +8895,16 @@ altivec_expand_ld_builtin (tree exp, rtx
switch (fcode)
{
case ALTIVEC_BUILTIN_LD_INTERNAL_16qi:
- icode = CODE_FOR_altivec_lvx_v16qi;
+ icode = CODE_FOR_vector_load_v16qi;
break;
case ALTIVEC_BUILTIN_LD_INTERNAL_8hi:
- icode = CODE_FOR_altivec_lvx_v8hi;
+ icode = CODE_FOR_vector_load_v8hi;
break;
case ALTIVEC_BUILTIN_LD_INTERNAL_4si:
- icode = CODE_FOR_altivec_lvx_v4si;
+ icode = CODE_FOR_vector_load_v4si;
break;
case ALTIVEC_BUILTIN_LD_INTERNAL_4sf:
- icode = CODE_FOR_altivec_lvx_v4sf;
+ icode = CODE_FOR_vector_load_v4sf;
break;
default:
*expandedp = false;
@@ -8405,16 +8948,16 @@ altivec_expand_st_builtin (tree exp, rtx
switch (fcode)
{
case ALTIVEC_BUILTIN_ST_INTERNAL_16qi:
- icode = CODE_FOR_altivec_stvx_v16qi;
+ icode = CODE_FOR_vector_store_v16qi;
break;
case ALTIVEC_BUILTIN_ST_INTERNAL_8hi:
- icode = CODE_FOR_altivec_stvx_v8hi;
+ icode = CODE_FOR_vector_store_v8hi;
break;
case ALTIVEC_BUILTIN_ST_INTERNAL_4si:
- icode = CODE_FOR_altivec_stvx_v4si;
+ icode = CODE_FOR_vector_store_v4si;
break;
case ALTIVEC_BUILTIN_ST_INTERNAL_4sf:
- icode = CODE_FOR_altivec_stvx_v4sf;
+ icode = CODE_FOR_vector_store_v4sf;
break;
default:
*expandedp = false;
@@ -9372,6 +9915,8 @@ rs6000_init_builtins (void)
{
V2SI_type_node = build_vector_type (intSI_type_node, 2);
V2SF_type_node = build_vector_type (float_type_node, 2);
+ V2DI_type_node = build_vector_type (intDI_type_node, 2);
+ V2DF_type_node = build_vector_type (double_type_node, 2);
V4HI_type_node = build_vector_type (intHI_type_node, 4);
V4SI_type_node = build_vector_type (intSI_type_node, 4);
V4SF_type_node = build_vector_type (float_type_node, 4);
@@ -9404,7 +9949,10 @@ rs6000_init_builtins (void)
uintHI_type_internal_node = unsigned_intHI_type_node;
intSI_type_internal_node = intSI_type_node;
uintSI_type_internal_node = unsigned_intSI_type_node;
+ intDI_type_internal_node = intDI_type_node;
+ uintDI_type_internal_node = unsigned_intDI_type_node;
float_type_internal_node = float_type_node;
+ double_type_internal_node = float_type_node;
void_type_internal_node = void_type_node;
(*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
@@ -9462,13 +10010,18 @@ rs6000_init_builtins (void)
get_identifier ("__vector __pixel"),
pixel_V8HI_type_node));
+ if (TARGET_VSX)
+ (*lang_hooks.decls.pushdecl) (build_decl (TYPE_DECL,
+ get_identifier ("__vector double"),
+ V2DF_type_node));
+
if (TARGET_PAIRED_FLOAT)
paired_init_builtins ();
if (TARGET_SPE)
spe_init_builtins ();
if (TARGET_ALTIVEC)
altivec_init_builtins ();
- if (TARGET_ALTIVEC || TARGET_SPE || TARGET_PAIRED_FLOAT)
+ if (TARGET_ALTIVEC || TARGET_SPE || TARGET_PAIRED_FLOAT || TARGET_VSX)
rs6000_common_init_builtins ();
if (TARGET_PPC_GFXOPT)
{
@@ -10407,6 +10960,26 @@ rs6000_common_init_builtins (void)
tree int_ftype_v8hi_v8hi
= build_function_type_list (integer_type_node,
V8HI_type_node, V8HI_type_node, NULL_TREE);
+ tree v2df_ftype_v2df_v2df_v2df
+ = build_function_type_list (V2DF_type_node,
+ V2DF_type_node, V2DF_type_node,
+ V2DF_type_node, NULL_TREE);
+ tree v2di_ftype_v2di_v2di_v2di
+ = build_function_type_list (V2DI_type_node,
+ V2DI_type_node, V2DI_type_node,
+ V2DI_type_node, NULL_TREE);
+ tree v2df_ftype_v2df_v2df_v16qi
+ = build_function_type_list (V2DF_type_node,
+ V2DF_type_node, V2DF_type_node,
+ V16QI_type_node, NULL_TREE);
+ tree v2di_ftype_v2di_v2di_v16qi
+ = build_function_type_list (V2DI_type_node,
+ V2DI_type_node, V2DI_type_node,
+ V16QI_type_node, NULL_TREE);
+ tree v4sf_ftype_v4si
+ = build_function_type_list (V4SF_type_node, V4SI_type_node, NULL_TREE);
+ tree v4si_ftype_v4sf
+ = build_function_type_list (V4SI_type_node, V4SF_type_node, NULL_TREE);
/* Add the simple ternary operators. */
d = bdesc_3arg;
@@ -10443,6 +11016,12 @@ rs6000_common_init_builtins (void)
case VOIDmode:
type = opaque_ftype_opaque_opaque_opaque;
break;
+ case V2DImode:
+ type = v2di_ftype_v2di_v2di_v2di;
+ break;
+ case V2DFmode:
+ type = v2df_ftype_v2df_v2df_v2df;
+ break;
case V4SImode:
type = v4si_ftype_v4si_v4si_v4si;
break;
@@ -10466,6 +11045,12 @@ rs6000_common_init_builtins (void)
{
switch (mode0)
{
+ case V2DImode:
+ type = v2di_ftype_v2di_v2di_v16qi;
+ break;
+ case V2DFmode:
+ type = v2df_ftype_v2df_v2df_v16qi;
+ break;
case V4SImode:
type = v4si_ftype_v4si_v4si_v16qi;
break;
@@ -10721,6 +11306,10 @@ rs6000_common_init_builtins (void)
type = v2si_ftype_v2sf;
else if (mode0 == V2SImode && mode1 == QImode)
type = v2si_ftype_char;
+ else if (mode0 == V4SImode && mode1 == V4SFmode)
+ type = v4si_ftype_v4sf;
+ else if (mode0 == V4SFmode && mode1 == V4SImode)
+ type = v4sf_ftype_v4si;
else
gcc_unreachable ();
@@ -11601,13 +12190,101 @@ rs6000_instantiate_decls (void)
instantiate_decl_rtl (cfun->machine->sdmode_stack_slot);
}
+/* Given an rtx X being reloaded into a reg required to be
+ in class CLASS, return the class of reg to actually use.
+ In general this is just CLASS; but on some machines
+ in some cases it is preferable to use a more restrictive class.
+
+ On the RS/6000, we have to return NO_REGS when we want to reload a
+ floating-point CONST_DOUBLE to force it to be copied to memory.
+
+ We also don't want to reload integer values into floating-point
+ registers if we can at all help it. In fact, this can
+ cause reload to die, if it tries to generate a reload of CTR
+ into a FP register and discovers it doesn't have the memory location
+ required.
+
+ ??? Would it be a good idea to have reload do the converse, that is
+ try to reload floating modes into FP registers if possible?
+ */
+
+enum reg_class
+rs6000_preferred_reload_class (rtx x, enum reg_class rclass)
+{
+ enum machine_mode mode = GET_MODE (x);
+
+ if (TARGET_VSX && VSX_VECTOR_MODE (mode) && x == CONST0_RTX (mode)
+ && VSX_REG_CLASS_P (rclass))
+ return rclass;
+
+ if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode) && rclass == ALTIVEC_REGS
+ && easy_vector_constant (x, mode))
+ return rclass;
+
+ if (CONSTANT_P (x) && reg_classes_intersect_p (rclass, FLOAT_REGS))
+ return NO_REGS;
+
+ if (GET_MODE_CLASS (mode) == MODE_INT && rclass == NON_SPECIAL_REGS)
+ return GENERAL_REGS;
+
+ /* For VSX, prefer the traditional registers. */
+ if (rclass == VSX_REGS)
+ {
+ if (mode == DFmode)
+ return FLOAT_REGS;
+
+ if (ALTIVEC_VECTOR_MODE (mode))
+ return ALTIVEC_REGS;
+ }
+
+ return rclass;
+}
+
+/* If we are copying between FP or AltiVec registers and anything else, we need
+ a memory location. The exception is when we are targeting ppc64 and the
+ move to/from fpr to gpr instructions are available. Also, under VSX, you
+ can copy vector registers from the FP register set to the Altivec register
+ set and vice versa. */
+
+bool
+rs6000_secondary_memory_needed (enum reg_class class1,
+ enum reg_class class2,
+ enum machine_mode mode)
+{
+ if (class1 == class2)
+ return false;
+
+ if (TARGET_VSX && VSX_MOVE_MODE (mode) && VSX_REG_CLASS_P (class1)
+ && VSX_REG_CLASS_P (class2))
+ return false;
+
+ if (class1 == FLOAT_REGS
+ && (!TARGET_MFPGPR || !TARGET_POWERPC64
+ || ((mode != DFmode)
+ && (mode != DDmode)
+ && (mode != DImode))))
+ return true;
+
+ if (class2 == FLOAT_REGS
+ && (!TARGET_MFPGPR || !TARGET_POWERPC64
+ || ((mode != DFmode)
+ && (mode != DDmode)
+ && (mode != DImode))))
+ return true;
+
+ if (class1 == ALTIVEC_REGS || class2 == ALTIVEC_REGS)
+ return true;
+
+ return false;
+}
+
/* Return the register class of a scratch register needed to copy IN into
or out of a register in RCLASS in MODE. If it can be done directly,
NO_REGS is returned. */
enum reg_class
rs6000_secondary_reload_class (enum reg_class rclass,
- enum machine_mode mode ATTRIBUTE_UNUSED,
+ enum machine_mode mode,
rtx in)
{
int regno;
@@ -11663,6 +12340,13 @@ rs6000_secondary_reload_class (enum reg_
&& (rclass == FLOAT_REGS || rclass == NON_SPECIAL_REGS))
return (mode != SDmode) ? NO_REGS : GENERAL_REGS;
+ /* Memory, and FP/altivec registers can go into fp/altivec registers under
+ VSX. */
+ if (TARGET_VSX
+ && (regno == -1 || VSX_REGNO_P (regno))
+ && VSX_REG_CLASS_P (rclass))
+ return NO_REGS;
+
/* Memory, and AltiVec registers can go into AltiVec registers. */
if ((regno == -1 || ALTIVEC_REGNO_P (regno))
&& rclass == ALTIVEC_REGS)
@@ -11676,6 +12360,28 @@ rs6000_secondary_reload_class (enum reg_
/* Otherwise, we need GENERAL_REGS. */
return GENERAL_REGS;
}
+
+/* Return nonzero if for CLASS a mode change from FROM to TO is invalid. */
+
+bool
+rs6000_cannot_change_mode_class (enum machine_mode from,
+ enum machine_mode to,
+ enum reg_class rclass)
+{
+ return (GET_MODE_SIZE (from) != GET_MODE_SIZE (to)
+ ? ((GET_MODE_SIZE (from) < 8 || GET_MODE_SIZE (to) < 8
+ || TARGET_IEEEQUAD)
+ && reg_classes_intersect_p (FLOAT_REGS, rclass))
+ : (((TARGET_E500_DOUBLE
+ && ((((to) == DFmode) + ((from) == DFmode)) == 1
+ || (((to) == TFmode) + ((from) == TFmode)) == 1
+ || (((to) == DDmode) + ((from) == DDmode)) == 1
+ || (((to) == TDmode) + ((from) == TDmode)) == 1
+ || (((to) == DImode) + ((from) == DImode)) == 1))
+ || (TARGET_SPE
+ && (SPE_VECTOR_MODE (from) + SPE_VECTOR_MODE (to)) == 1))
+ && reg_classes_intersect_p (GENERAL_REGS, rclass)));
+}
/* Given a comparison operation, return the bit number in CCR to test. We
know this is a valid comparison.
@@ -12406,6 +13112,26 @@ print_operand (FILE *file, rtx x, int co
fprintf (file, "%d", i + 1);
return;
+ case 'x':
+ /* X is a FPR or Altivec register used in a VSX context. */
+ if (GET_CODE (x) != REG || !VSX_REGNO_P (REGNO (x)))
+ output_operand_lossage ("invalid %%x value");
+ else
+ {
+ int reg = REGNO (x);
+ int vsx_reg = (FP_REGNO_P (reg)
+ ? reg - 32
+ : reg - FIRST_ALTIVEC_REGNO + 32);
+
+#ifdef TARGET_REGNAMES
+ if (TARGET_REGNAMES)
+ fprintf (file, "%%vs%d", vsx_reg);
+ else
+#endif
+ fprintf (file, "%d", vsx_reg);
+ }
+ return;
+
case 'X':
if (GET_CODE (x) == MEM
&& (legitimate_indexed_address_p (XEXP (x, 0), 0)
@@ -12518,13 +13244,16 @@ print_operand (FILE *file, rtx x, int co
/* Fall through. Must be [reg+reg]. */
}
- if (TARGET_ALTIVEC
+ if (VECTOR_MEM_ALTIVEC_P (GET_MODE (x))
&& GET_CODE (tmp) == AND
&& GET_CODE (XEXP (tmp, 1)) == CONST_INT
&& INTVAL (XEXP (tmp, 1)) == -16)
tmp = XEXP (tmp, 0);
+ else if (VECTOR_MEM_VSX_P (GET_MODE (x))
+ && GET_CODE (tmp) == PRE_MODIFY)
+ tmp = XEXP (tmp, 1);
if (GET_CODE (tmp) == REG)
- fprintf (file, "0,%s", reg_names[REGNO (tmp)]);
+ fprintf (file, "%s,%s", reg_names[0], reg_names[REGNO (tmp)]);
else
{
if (!GET_CODE (tmp) == PLUS
@@ -13296,55 +14025,62 @@ output_e500_flip_gt_bit (rtx dst, rtx sr
return string;
}
-/* Return insn index for the vector compare instruction for given CODE,
- and DEST_MODE, OP_MODE. Return INSN_NOT_AVAILABLE if valid insn is
- not available. */
+/* Return insn for VSX comparisons. */
-static int
-get_vec_cmp_insn (enum rtx_code code,
- enum machine_mode dest_mode,
- enum machine_mode op_mode)
+static rtx
+rs6000_emit_vector_compare_vsx (enum rtx_code code,
+ rtx mask,
+ rtx op0,
+ rtx op1)
{
- if (!TARGET_ALTIVEC)
- return INSN_NOT_AVAILABLE;
-
switch (code)
{
+ default:
+ break;
+
case EQ:
- if (dest_mode == V16QImode && op_mode == V16QImode)
- return UNSPEC_VCMPEQUB;
- if (dest_mode == V8HImode && op_mode == V8HImode)
- return UNSPEC_VCMPEQUH;
- if (dest_mode == V4SImode && op_mode == V4SImode)
- return UNSPEC_VCMPEQUW;
- if (dest_mode == V4SImode && op_mode == V4SFmode)
- return UNSPEC_VCMPEQFP;
+ case GT:
+ case GE:
+ emit_insn (gen_rtx_SET (VOIDmode,
+ mask,
+ gen_rtx_fmt_ee (code, GET_MODE (mask),
+ op0,
+ op1)));
+ return mask;
+ }
+
+ return NULL_RTX;
+}
+
+/* Return insn for Altivec comparisons. */
+
+static rtx
+rs6000_emit_vector_compare_altivec (enum rtx_code code,
+ rtx mask,
+ rtx op0,
+ rtx op1)
+{
+ switch (code)
+ {
+ default:
break;
+
case GE:
- if (dest_mode == V4SImode && op_mode == V4SFmode)
- return UNSPEC_VCMPGEFP;
+ if (GET_MODE (mask) != V4SFmode)
+ return NULL_RTX;
+ /* fall through */
+ case EQ:
case GT:
- if (dest_mode == V16QImode && op_mode == V16QImode)
- return UNSPEC_VCMPGTSB;
- if (dest_mode == V8HImode && op_mode == V8HImode)
- return UNSPEC_VCMPGTSH;
- if (dest_mode == V4SImode && op_mode == V4SImode)
- return UNSPEC_VCMPGTSW;
- if (dest_mode == V4SImode && op_mode == V4SFmode)
- return UNSPEC_VCMPGTFP;
- break;
case GTU:
- if (dest_mode == V16QImode && op_mode == V16QImode)
- return UNSPEC_VCMPGTUB;
- if (dest_mode == V8HImode && op_mode == V8HImode)
- return UNSPEC_VCMPGTUH;
- if (dest_mode == V4SImode && op_mode == V4SImode)
- return UNSPEC_VCMPGTUW;
- break;
- default:
- break;
+ emit_insn (gen_rtx_SET (VOIDmode,
+ mask,
+ gen_rtx_fmt_ee (code, GET_MODE (mask),
+ op0,
+ op1)));
+ return mask;
}
- return INSN_NOT_AVAILABLE;
+
+ return NULL_RTX;
}
/* Emit vector compare for operands OP0 and OP1 using code RCODE.
@@ -13355,129 +14091,111 @@ rs6000_emit_vector_compare (enum rtx_cod
rtx op0, rtx op1,
enum machine_mode dmode)
{
- int vec_cmp_insn;
rtx mask;
- enum machine_mode dest_mode;
- enum machine_mode op_mode = GET_MODE (op1);
+ bool swap_operands = false;
+ bool try_again = false;
- gcc_assert (TARGET_ALTIVEC);
+ gcc_assert (TARGET_ALTIVEC || TARGET_VSX);
gcc_assert (GET_MODE (op0) == GET_MODE (op1));
- /* Floating point vector compare instructions uses destination V4SImode.
- Move destination to appropriate mode later. */
- if (dmode == V4SFmode)
- dest_mode = V4SImode;
- else
- dest_mode = dmode;
+ mask = gen_reg_rtx (dmode);
- mask = gen_reg_rtx (dest_mode);
- vec_cmp_insn = get_vec_cmp_insn (rcode, dest_mode, op_mode);
+ /* Try for VSX before Altivec. */
+ if (TARGET_VSX && VSX_VECTOR_MODE (dmode))
+ {
+ rtx vsx = rs6000_emit_vector_compare_vsx (rcode, mask, op0, op1);
+ if (vsx)
+ return vsx;
+ }
+ else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (dmode))
+ {
+ rtx av = rs6000_emit_vector_compare_altivec (rcode, mask, op0, op1);
+ if (av)
+ return av;
+ }
- if (vec_cmp_insn == INSN_NOT_AVAILABLE)
+ switch (rcode)
{
- bool swap_operands = false;
- bool try_again = false;
- switch (rcode)
- {
- case LT:
- rcode = GT;
- swap_operands = true;
- try_again = true;
- break;
- case LTU:
- rcode = GTU;
- swap_operands = true;
- try_again = true;
- break;
- case NE:
- case UNLE:
- case UNLT:
- case UNGE:
- case UNGT:
- /* Invert condition and try again.
- e.g., A != B becomes ~(A==B). */
- {
- enum rtx_code rev_code;
- enum insn_code nor_code;
- rtx eq_rtx;
-
- rev_code = reverse_condition_maybe_unordered (rcode);
- eq_rtx = rs6000_emit_vector_compare (rev_code, op0, op1,
- dest_mode);
-
- nor_code = optab_handler (one_cmpl_optab, (int)dest_mode)->insn_code;
- gcc_assert (nor_code != CODE_FOR_nothing);
- emit_insn (GEN_FCN (nor_code) (mask, eq_rtx));
+ case LT:
+ rcode = GT;
+ swap_operands = true;
+ try_again = true;
+ break;
+ case LTU:
+ rcode = GTU;
+ swap_operands = true;
+ try_again = true;
+ break;
+ case NE:
+ case UNLE:
+ case UNLT:
+ case UNGE:
+ case UNGT:
+ /* Invert condition and try again.
+ e.g., A != B becomes ~(A==B). */
+ {
+ enum rtx_code rev_code;
+ enum insn_code nor_code;
+ rtx eq_rtx;
+
+ rev_code = reverse_condition_maybe_unordered (rcode);
+ eq_rtx = rs6000_emit_vector_compare (rev_code, op0, op1, dmode);
+
+ nor_code = optab_handler (one_cmpl_optab, (int)dmode)->insn_code;
+ gcc_assert (nor_code != CODE_FOR_nothing);
+ emit_insn (GEN_FCN (nor_code) (mask, eq_rtx));
+ return mask;
+ }
+ break;
+ case GE:
+ case GEU:
+ case LE:
+ case LEU:
+ /* Try GT/GTU/LT/LTU OR EQ */
+ {
+ rtx c_rtx, eq_rtx;
+ enum insn_code ior_code;
+ enum rtx_code new_code;
- if (dmode != dest_mode)
- {
- rtx temp = gen_reg_rtx (dest_mode);
- convert_move (temp, mask, 0);
- return temp;
- }
- return mask;
- }
- break;
- case GE:
- case GEU:
- case LE:
- case LEU:
- /* Try GT/GTU/LT/LTU OR EQ */
+ switch (rcode)
{
- rtx c_rtx, eq_rtx;
- enum insn_code ior_code;
- enum rtx_code new_code;
-
- switch (rcode)
- {
- case GE:
- new_code = GT;
- break;
-
- case GEU:
- new_code = GTU;
- break;
+ case GE:
+ new_code = GT;
+ break;
- case LE:
- new_code = LT;
- break;
+ case GEU:
+ new_code = GTU;
+ break;
- case LEU:
- new_code = LTU;
- break;
+ case LE:
+ new_code = LT;
+ break;
- default:
- gcc_unreachable ();
- }
+ case LEU:
+ new_code = LTU;
+ break;
- c_rtx = rs6000_emit_vector_compare (new_code,
- op0, op1, dest_mode);
- eq_rtx = rs6000_emit_vector_compare (EQ, op0, op1,
- dest_mode);
-
- ior_code = optab_handler (ior_optab, (int)dest_mode)->insn_code;
- gcc_assert (ior_code != CODE_FOR_nothing);
- emit_insn (GEN_FCN (ior_code) (mask, c_rtx, eq_rtx));
- if (dmode != dest_mode)
- {
- rtx temp = gen_reg_rtx (dest_mode);
- convert_move (temp, mask, 0);
- return temp;
- }
- return mask;
+ default:
+ gcc_unreachable ();
}
- break;
- default:
- gcc_unreachable ();
- }
- if (try_again)
- {
- vec_cmp_insn = get_vec_cmp_insn (rcode, dest_mode, op_mode);
- /* You only get two chances. */
- gcc_assert (vec_cmp_insn != INSN_NOT_AVAILABLE);
- }
+ c_rtx = rs6000_emit_vector_compare (new_code,
+ op0, op1, dmode);
+ eq_rtx = rs6000_emit_vector_compare (EQ, op0, op1,
+ dmode);
+
+ ior_code = optab_handler (ior_optab, (int)dmode)->insn_code;
+ gcc_assert (ior_code != CODE_FOR_nothing);
+ emit_insn (GEN_FCN (ior_code) (mask, c_rtx, eq_rtx));
+ return mask;
+ }
+ break;
+ default:
+ gcc_unreachable ();
+ }
+ if (try_again)
+ {
if (swap_operands)
{
rtx tmp;
@@ -13485,69 +14203,23 @@ rs6000_emit_vector_compare (enum rtx_cod
op0 = op1;
op1 = tmp;
}
- }
-
- emit_insn (gen_rtx_SET (VOIDmode, mask,
- gen_rtx_UNSPEC (dest_mode,
- gen_rtvec (2, op0, op1),
- vec_cmp_insn)));
- if (dmode != dest_mode)
- {
- rtx temp = gen_reg_rtx (dest_mode);
- convert_move (temp, mask, 0);
- return temp;
- }
- return mask;
-}
-
-/* Return vector select instruction for MODE. Return INSN_NOT_AVAILABLE, if
- valid insn doesn exist for given mode. */
-static int
-get_vsel_insn (enum machine_mode mode)
-{
- switch (mode)
- {
- case V4SImode:
- return UNSPEC_VSEL4SI;
- break;
- case V4SFmode:
- return UNSPEC_VSEL4SF;
- break;
- case V8HImode:
- return UNSPEC_VSEL8HI;
- break;
- case V16QImode:
- return UNSPEC_VSEL16QI;
- break;
- default:
- return INSN_NOT_AVAILABLE;
- break;
+ if (TARGET_VSX && VSX_VECTOR_MODE (dmode))
+ {
+ rtx vsx = rs6000_emit_vector_compare_vsx (rcode, mask, op0, op1);
+ if (vsx)
+ return vsx;
+ }
+ else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (dmode))
+ {
+ rtx av = rs6000_emit_vector_compare_altivec (rcode, mask, op0, op1);
+ if (av)
+ return av;
+ }
}
- return INSN_NOT_AVAILABLE;
-}
-
-/* Emit vector select insn where DEST is destination using
- operands OP1, OP2 and MASK. */
-
-static void
-rs6000_emit_vector_select (rtx dest, rtx op1, rtx op2, rtx mask)
-{
- rtx t, temp;
- enum machine_mode dest_mode = GET_MODE (dest);
- int vsel_insn_index = get_vsel_insn (GET_MODE (dest));
- temp = gen_reg_rtx (dest_mode);
-
- /* For each vector element, select op1 when mask is 1 otherwise
- select op2. */
- t = gen_rtx_SET (VOIDmode, temp,
- gen_rtx_UNSPEC (dest_mode,
- gen_rtvec (3, op2, op1, mask),
- vsel_insn_index));
- emit_insn (t);
- emit_move_insn (dest, temp);
- return;
+ /* You only get two chances. */
+ gcc_unreachable ();
}
/* Emit vector conditional expression.
@@ -13562,15 +14234,29 @@ rs6000_emit_vector_cond_expr (rtx dest,
enum rtx_code rcode = GET_CODE (cond);
rtx mask;
- if (!TARGET_ALTIVEC)
+ if (!TARGET_ALTIVEC && !TARGET_VSX)
return 0;
/* Get the vector mask for the given relational operations. */
mask = rs6000_emit_vector_compare (rcode, cc_op0, cc_op1, dest_mode);
- rs6000_emit_vector_select (dest, op1, op2, mask);
+ if (!mask)
+ return 0;
- return 1;
+ if ((TARGET_VSX && VSX_VECTOR_MOVE_MODE (dest_mode))
+ || (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (dest_mode)))
+ {
+ rtx cond2 = gen_rtx_fmt_ee (NE, VOIDmode, mask, const0_rtx);
+ emit_insn (gen_rtx_SET (VOIDmode,
+ dest,
+ gen_rtx_IF_THEN_ELSE (dest_mode,
+ cond2,
+ op1,
+ op2)));
+ return 1;
+ }
+
+ return 0;
}
/* Emit a conditional move: move TRUE_COND to DEST if OP of the
@@ -13766,8 +14452,8 @@ rs6000_emit_int_cmove (rtx dest, rtx op,
{
rtx condition_rtx, cr;
- /* All isel implementations thus far are 32-bits. */
- if (GET_MODE (rs6000_compare_op0) != SImode)
+ if (GET_MODE (rs6000_compare_op0) != SImode
+ && (!TARGET_POWERPC64 || GET_MODE (rs6000_compare_op0) != DImode))
return 0;
/* We still have to do the compare, because isel doesn't do a
@@ -13776,12 +14462,24 @@ rs6000_emit_int_cmove (rtx dest, rtx op,
condition_rtx = rs6000_generate_compare (GET_CODE (op));
cr = XEXP (condition_rtx, 0);
- if (GET_MODE (cr) == CCmode)
- emit_insn (gen_isel_signed (dest, condition_rtx,
- true_cond, false_cond, cr));
+ if (GET_MODE (rs6000_compare_op0) == SImode)
+ {
+ if (GET_MODE (cr) == CCmode)
+ emit_insn (gen_isel_signed_si (dest, condition_rtx,
+ true_cond, false_cond, cr));
+ else
+ emit_insn (gen_isel_unsigned_si (dest, condition_rtx,
+ true_cond, false_cond, cr));
+ }
else
- emit_insn (gen_isel_unsigned (dest, condition_rtx,
- true_cond, false_cond, cr));
+ {
+ if (GET_MODE (cr) == CCmode)
+ emit_insn (gen_isel_signed_di (dest, condition_rtx,
+ true_cond, false_cond, cr));
+ else
+ emit_insn (gen_isel_unsigned_di (dest, condition_rtx,
+ true_cond, false_cond, cr));
+ }
return 1;
}
@@ -13808,6 +14506,15 @@ rs6000_emit_minmax (rtx dest, enum rtx_c
enum rtx_code c;
rtx target;
+ /* VSX/altivec have direct min/max insns. */
+ if ((code == SMAX || code == SMIN) && VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode))
+ {
+ emit_insn (gen_rtx_SET (VOIDmode,
+ dest,
+ gen_rtx_fmt_ee (code, mode, op0, op1)));
+ return;
+ }
+
if (code == SMAX || code == SMIN)
c = GE;
else
@@ -15785,6 +16492,7 @@ emit_frame_save (rtx frame_reg, rtx fram
/* Some cases that need register indexed addressing. */
if ((TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))
+ || (TARGET_VSX && VSX_VECTOR_MODE (mode))
|| (TARGET_E500_DOUBLE && mode == DFmode)
|| (TARGET_SPE_ABI
&& SPE_VECTOR_MODE (mode)
@@ -19320,6 +20028,7 @@ rs6000_issue_rate (void)
case CPU_POWER4:
case CPU_POWER5:
case CPU_POWER6:
+ case CPU_POWER7:
return 5;
default:
return 1;
@@ -19921,6 +20630,41 @@ insn_must_be_first_in_group (rtx insn)
break;
}
break;
+ case PROCESSOR_POWER7:
+ type = get_attr_type (insn);
+
+ switch (type)
+ {
+ case TYPE_CR_LOGICAL:
+ case TYPE_MFCR:
+ case TYPE_MFCRF:
+ case TYPE_MTCR:
+ case TYPE_IDIV:
+ case TYPE_LDIV:
+ case TYPE_COMPARE:
+ case TYPE_DELAYED_COMPARE:
+ case TYPE_VAR_DELAYED_COMPARE:
+ case TYPE_ISYNC:
+ case TYPE_LOAD_L:
+ case TYPE_STORE_C:
+ case TYPE_LOAD_U:
+ case TYPE_LOAD_UX:
+ case TYPE_LOAD_EXT:
+ case TYPE_LOAD_EXT_U:
+ case TYPE_LOAD_EXT_UX:
+ case TYPE_STORE_U:
+ case TYPE_STORE_UX:
+ case TYPE_FPLOAD_U:
+ case TYPE_FPLOAD_UX:
+ case TYPE_FPSTORE_U:
+ case TYPE_FPSTORE_UX:
+ case TYPE_MFJMPR:
+ case TYPE_MTJMPR:
+ return true;
+ default:
+ break;
+ }
+ break;
default:
break;
}
@@ -19982,6 +20726,23 @@ insn_must_be_last_in_group (rtx insn)
break;
}
break;
+ case PROCESSOR_POWER7:
+ type = get_attr_type (insn);
+
+ switch (type)
+ {
+ case TYPE_ISYNC:
+ case TYPE_SYNC:
+ case TYPE_LOAD_L:
+ case TYPE_STORE_C:
+ case TYPE_LOAD_EXT_U:
+ case TYPE_LOAD_EXT_UX:
+ case TYPE_STORE_UX:
+ return true;
+ default:
+ break;
+ }
+ break;
default:
break;
}
@@ -20554,8 +21315,8 @@ rs6000_handle_altivec_attribute (tree *n
else if (type == long_long_unsigned_type_node
|| type == long_long_integer_type_node)
error ("use of %<long long%> in AltiVec types is invalid");
- else if (type == double_type_node)
- error ("use of %<double%> in AltiVec types is invalid");
+ else if (type == double_type_node && !TARGET_VSX)
+ error ("use of %<double%> in AltiVec types is invalid without -mvsx");
else if (type == long_double_type_node)
error ("use of %<long double%> in AltiVec types is invalid");
else if (type == boolean_type_node)
@@ -20581,6 +21342,7 @@ rs6000_handle_altivec_attribute (tree *n
result = (unsigned_p ? unsigned_V16QI_type_node : V16QI_type_node);
break;
case SFmode: result = V4SF_type_node; break;
+ case DFmode: result = V2DF_type_node; break;
/* If the user says 'vector int bool', we may be handed the 'bool'
attribute _before_ the 'vector' attribute, and so select the
proper type in the 'b' case below. */
@@ -22116,7 +22878,7 @@ rs6000_register_move_cost (enum machine_
if (! reg_classes_intersect_p (to, GENERAL_REGS))
from = to;
- if (from == FLOAT_REGS || from == ALTIVEC_REGS)
+ if (from == FLOAT_REGS || from == ALTIVEC_REGS || from == VSX_REGS)
return (rs6000_memory_move_cost (mode, from, 0)
+ rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
@@ -22136,6 +22898,12 @@ rs6000_register_move_cost (enum machine_
return 2 * hard_regno_nregs[0][mode];
}
+ /* If we have VSX, we can easily move between FPR or Altivec registers. */
+ else if (TARGET_VSX
+ && ((from == VSX_REGS || from == FLOAT_REGS || from == ALTIVEC_REGS)
+ || (to == VSX_REGS || to == FLOAT_REGS || to == ALTIVEC_REGS)))
+ return 2;
+
/* Moving between two similar registers is just one instruction. */
else if (reg_classes_intersect_p (to, from))
return (mode == TFmode || mode == TDmode) ? 4 : 2;
@@ -22376,8 +23144,8 @@ rs6000_emit_swrsqrtsf (rtx dst, rtx src)
emit_label (XEXP (label, 0));
}
-/* Emit popcount intrinsic on TARGET_POPCNTB targets. DST is the
- target, and SRC is the argument operand. */
+/* Emit popcount intrinsic on TARGET_POPCNTB (Power5) and TARGET_POPCNTD
+ (Power7) targets. DST is the target, and SRC is the argument operand. */
void
rs6000_emit_popcount (rtx dst, rtx src)
@@ -22385,6 +23153,16 @@ rs6000_emit_popcount (rtx dst, rtx src)
enum machine_mode mode = GET_MODE (dst);
rtx tmp1, tmp2;
+ /* Use the PPC ISA 2.06 popcnt{w,d} instruction if we can. */
+ if (TARGET_POPCNTD)
+ {
+ if (mode == SImode)
+ emit_insn (gen_popcntwsi2 (dst, src));
+ else
+ emit_insn (gen_popcntddi2 (dst, src));
+ return;
+ }
+
tmp1 = gen_reg_rtx (mode);
if (mode == SImode)
@@ -22797,7 +23575,7 @@ rs6000_vector_mode_supported_p (enum mac
if (TARGET_SPE && SPE_VECTOR_MODE (mode))
return true;
- else if (TARGET_ALTIVEC && ALTIVEC_VECTOR_MODE (mode))
+ else if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode))
return true;
else
--- gcc/config/rs6000/vsx.md (.../trunk) (revision 0)
+++ gcc/config/rs6000/vsx.md (.../branches/ibm/power7-meissner) (revision 144730)
@@ -0,0 +1,864 @@
+;; VSX patterns.
+;; Copyright (C) 2009
+;; Free Software Foundation, Inc.
+;; Contributed by Michael Meissner <meissner@linux.vnet.ibm.com>
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3. If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Iterator for vector floating point types supported by VSX
+(define_mode_iterator VSX_F [V4SF V2DF])
+
+;; Iterator for logical types supported by VSX
+(define_mode_iterator VSX_L [V16QI V8HI V4SI V2DI V4SF V2DF TI])
+
+;; Map into the appropriate load/store name based on the type
+(define_mode_attr VSm [(V16QI "vw4")
+ (V8HI "vw4")
+ (V4SI "vw4")
+ (V4SF "vw4")
+ (V2DF "vd2")
+ (V2DI "vd2")
+ (TI "vw4")])
+
+;; Map into the appropriate suffix based on the type
+(define_mode_attr VSs [(V16QI "sp")
+ (V8HI "sp")
+ (V4SI "sp")
+ (V4SF "sp")
+ (V2DF "dp")
+ (V2DI "dp")
+ (TI "sp")])
+
+;; Map into the register class used
+(define_mode_attr VSr [(V16QI "v")
+ (V8HI "v")
+ (V4SI "v")
+ (V4SF "wf")
+ (V2DI "wd")
+ (V2DF "wd")
+ (TI "wd")])
+
+;; Same size integer type for floating point data
+(define_mode_attr VSi [(V4SF "v4si")
+ (V2DF "v2di")])
+
+(define_mode_attr VSI [(V4SF "V4SI")
+ (V2DF "V2DI")])
+
+;; Word size for same size conversion
+(define_mode_attr VSc [(V4SF "w")
+ (V2DF "d")])
+
+;; Bitsize for DF load with update
+(define_mode_attr VSbit [(SI "32")
+ (DI "64")])
+
+(define_constants
+ [(UNSPEC_VSX_CONCAT_V2DF 500)])
+
+;; VSX moves
+(define_insn "*vsx_mov<mode>"
+ [(set (match_operand:VSX_L 0 "nonimmediate_operand" "=Z,<VSr>,<VSr>,?Z,?wa,?wa,*o,*r,*r,<VSr>,?wa,v")
+ (match_operand:VSX_L 1 "input_operand" "<VSr>,Z,<VSr>,wa,Z,wa,r,o,r,j,j,W"))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)
+ && (register_operand (operands[0], <MODE>mode)
+ || register_operand (operands[1], <MODE>mode))"
+{
+ switch (which_alternative)
+ {
+ case 0:
+ case 3:
+ return "stx<VSm>%U0x %x1,%y0";
+
+ case 1:
+ case 4:
+ return "lx<VSm>%U0x %x0,%y1";
+
+ case 2:
+ case 5:
+ return "xvmov<VSs> %x0,%x1";
+
+ case 6:
+ case 7:
+ case 8:
+ return "#";
+
+ case 9:
+ case 10:
+ return "xxlxor %x0,%x0,%x0";
+
+ case 11:
+ return output_vec_const_move (operands);
+
+ default:
+ gcc_unreachable ();
+ }
+}
+ [(set_attr "type" "vecstore,vecload,vecsimple,vecstore,vecload,vecsimple,store,load,*,vecsimple,vecsimple,*")])
+
+;; Load/store with update
+;; Define insns that do load or store with update. Because VSX only has
+;; reg+reg addressing, pre-decrement or pre-inrement is unlikely to be
+;; generated.
+;;
+;; In all these cases, we use operands 0 and 1 for the register being
+;; incremented because those are the operands that local-alloc will
+;; tie and these are the pair most likely to be tieable (and the ones
+;; that will benefit the most).
+
+(define_insn "*vsx_load<mode>_update64"
+ [(set (match_operand:VSX_L 3 "vsx_register_operand" "=<VSr>,?wa")
+ (mem:VSX_L (plus:DI (match_operand:DI 1 "gpc_reg_operand" "0,0")
+ (match_operand:DI 2 "gpc_reg_operand" "r,r"))))
+ (set (match_operand:DI 0 "gpc_reg_operand" "=b,b")
+ (plus:DI (match_dup 1)
+ (match_dup 2)))]
+ "TARGET_64BIT && TARGET_UPDATE && VECTOR_MEM_VSX_P (<MODE>mode)"
+ "lx<VSm>ux %x3,%0,%2"
+ [(set_attr "type" "vecload")])
+
+(define_insn "*vsx_load<mode>_update32"
+ [(set (match_operand:VSX_L 3 "vsx_register_operand" "=<VSr>,?wa")
+ (mem:VSX_L (plus:SI (match_operand:SI 1 "gpc_reg_operand" "0,0")
+ (match_operand:SI 2 "gpc_reg_operand" "r,r"))))
+ (set (match_operand:SI 0 "gpc_reg_operand" "=b,b")
+ (plus:SI (match_dup 1)
+ (match_dup 2)))]
+ "TARGET_32BIT && TARGET_UPDATE && VECTOR_MEM_VSX_P (<MODE>mode)"
+ "lx<VSm>ux %x3,%0,%2"
+ [(set_attr "type" "vecload")])
+
+(define_insn "*vsx_store<mode>_update64"
+ [(set (mem:VSX_L (plus:DI (match_operand:DI 1 "gpc_reg_operand" "0,0")
+ (match_operand:DI 2 "gpc_reg_operand" "r,r")))
+ (match_operand:VSX_L 3 "gpc_reg_operand" "<VSr>,?wa"))
+ (set (match_operand:DI 0 "gpc_reg_operand" "=b,b")
+ (plus:DI (match_dup 1)
+ (match_dup 2)))]
+ "TARGET_64BIT && TARGET_UPDATE && VECTOR_MEM_VSX_P (<MODE>mode)"
+ "stx<VSm>ux %x3,%0,%2"
+ [(set_attr "type" "vecstore")])
+
+(define_insn "*vsx_store<mode>_update32"
+ [(set (mem:VSX_L (plus:SI (match_operand:SI 1 "gpc_reg_operand" "0,0")
+ (match_operand:SI 2 "gpc_reg_operand" "r,r")))
+ (match_operand:VSX_L 3 "gpc_reg_operand" "<VSr>,?wa"))
+ (set (match_operand:SI 0 "gpc_reg_operand" "=b,b")
+ (plus:SI (match_dup 1)
+ (match_dup 2)))]
+ "TARGET_32BIT && TARGET_UPDATE && VECTOR_MEM_VSX_P (<MODE>mode)"
+ "stx<VSm>ux %x3,%0,%2"
+ [(set_attr "type" "vecstore")])
+
+(define_insn "*vsx_loaddf_update<VSbit>"
+ [(set (match_operand:DF 3 "vsx_register_operand" "=ws,?wa")
+ (mem:DF (plus:P (match_operand:P 1 "gpc_reg_operand" "0,0")
+ (match_operand:P 2 "gpc_reg_operand" "r,r"))))
+ (set (match_operand:P 0 "gpc_reg_operand" "=b,b")
+ (plus:P (match_dup 1)
+ (match_dup 2)))]
+ "TARGET_<VSbit>BIT && TARGET_UPDATE && VECTOR_MEM_VSX_P (DFmode)"
+ "lxsdux %x3,%0,%2"
+ [(set_attr "type" "vecload")])
+
+(define_insn "*vsx_storedf_update<VSbit>"
+ [(set (mem:DF (plus:P (match_operand:P 1 "gpc_reg_operand" "0,0")
+ (match_operand:P 2 "gpc_reg_operand" "r,r")))
+ (match_operand:DF 3 "gpc_reg_operand" "ws,?wa"))
+ (set (match_operand:P 0 "gpc_reg_operand" "=b,b")
+ (plus:P (match_dup 1)
+ (match_dup 2)))]
+ "TARGET_<VSbit>BIT && TARGET_UPDATE && VECTOR_MEM_VSX_P (DFmode)"
+ "stxsdux %x3,%0,%2"
+ [(set_attr "type" "vecstore")])
+
+;; We may need to have a varient on the pattern for use in the prologue
+;; that doesn't depend on TARGET_UPDATE.
+
+
+;; VSX vector floating point arithmetic instructions
+(define_insn "*vsx_add<mode>3"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (plus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvadd<VSs> %x0,%x1,%x2"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_sub<mode>3"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (minus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvsub<VSs> %x0,%x1,%x2"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_mul<mode>3"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (mult:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvmul<VSs> %x0,%x1,%x2"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_div<mode>3"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (div:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvdiv<VSs> %x0,%x1,%x2"
+ [(set_attr "type" "vecfdiv")])
+
+(define_insn "*vsx_fre<mode>2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (unspec:VSX_F [(match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")]
+ UNSPEC_FRES))]
+ "flag_finite_math_only && VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvre<VSs> %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_neg<mode>2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (neg:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvneg<VSs> %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_abs<mode>2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (abs:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvabs<VSs> %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_nabs<mode>2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (neg:VSX_F
+ (abs:VSX_F
+ (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>"))))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvnabs<VSs> %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_smax<mode>3"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (smax:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvmax<VSs> %x0,%x1,%x2"
+ [(set_attr "type" "veccmp")])
+
+(define_insn "*vsx_smin<mode>3"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (smin:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvmin<VSs> %x0,%x1,%x2"
+ [(set_attr "type" "veccmp")])
+
+(define_insn "*vsx_sqrt<mode>2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (sqrt:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvsqrt<VSs> %x0,%x1"
+ [(set_attr "type" "vecfdiv")])
+
+;; Fused vector multiply/add instructions
+(define_insn "*vsx_fmadd<mode>4"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>")
+ (plus:VSX_F
+ (mult:VSX_F
+ (match_operand:VSX_F 1 "vsx_register_operand" "%<VSr>,<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,0"))
+ (match_operand:VSX_F 3 "gpc_reg_operand" "0,<VSr>")))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD"
+ "@
+ xvmadda<VSs> %x0,%x1,%x2
+ xvmaddm<VSs> %x0,%x1,%x3"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_fmsub<mode>4"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>")
+ (minus:VSX_F
+ (mult:VSX_F
+ (match_operand:VSX_F 1 "vsx_register_operand" "%<VSr>,<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,0"))
+ (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>")))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD"
+ "@
+ xvmsuba<VSs> %x0,%x1,%x2
+ xvmsubm<VSs> %x0,%x1,%x3"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_fnmadd<mode>4_1"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>")
+ (neg:VSX_F
+ (plus:VSX_F
+ (mult:VSX_F
+ (match_operand:VSX_F 1 "vsx_register_operand" "%<VSr>,<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,0"))
+ (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>"))))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD
+ && HONOR_SIGNED_ZEROS (DFmode)"
+ "@
+ xvnmadda<VSs> %x0,%x1,%x2
+ xvnmaddm<VSs> %x0,%x1,%x3"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_fnmadd<mode>4_2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>")
+ (minus:VSX_F
+ (mult:VSX_F
+ (neg:VSX_F
+ (match_operand:VSX_F 1 "gpc_reg_operand" "%<VSr>,<VSr>"))
+ (match_operand:VSX_F 2 "gpc_reg_operand" "<VSr>,0"))
+ (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>")))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD
+ && !HONOR_SIGNED_ZEROS (DFmode)"
+ "@
+ xvnmadda<VSs> %x0,%x1,%x2
+ xvnmaddm<VSs> %x0,%x1,%x3"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_fnmsub<mode>4_1"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>")
+ (neg:VSX_F
+ (minus:VSX_F
+ (mult:VSX_F
+ (match_operand:VSX_F 1 "vsx_register_operand" "%<VSr>,<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,0"))
+ (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>"))))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD
+ && HONOR_SIGNED_ZEROS (DFmode)"
+ "@
+ xvnmsuba<VSs> %x0,%x1,%x2
+ xvnmsubm<VSs> %x0,%x1,%x3"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_fnmsub<mode>4_2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>,<VSr>")
+ (minus:VSX_F
+ (match_operand:VSX_F 3 "vsx_register_operand" "0,<VSr>")
+ (mult:VSX_F
+ (match_operand:VSX_F 1 "vsx_register_operand" "%<VSr>,<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>,0"))))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD
+ && !HONOR_SIGNED_ZEROS (DFmode)"
+ "@
+ xvnmsuba<VSs> %x0,%x1,%x2
+ xvnmsubm<VSs> %x0,%x1,%x3"
+ [(set_attr "type" "vecfloat")])
+
+;; Vector conditional expressions
+(define_insn "*vsx_eq<mode>"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (eq:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvcmpeq<VSs> %x0,%x1,%x2"
+ [(set_attr "type" "veccmp")])
+
+(define_insn "*vsx_gt<mode>"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (gt:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvcmpgt<VSs> %x0,%x1,%x2"
+ [(set_attr "type" "veccmp")])
+
+(define_insn "*vsx_ge<mode>"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (ge:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvcmpge<VSs> %x0,%x1,%x2"
+ [(set_attr "type" "veccmp")])
+
+(define_insn "vsx_vsel<mode>"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (if_then_else:VSX_F (ne (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")
+ (const_int 0))
+ (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")
+ (match_operand:VSX_F 3 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xxsel %x0,%x3,%x2,%x1"
+ [(set_attr "type" "vecperm")])
+
+;; Copy sign
+(define_insn "*vsx_copysign<mode>3"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (if_then_else:VSX_F
+ (ge:VSX_F (match_operand:VSX_F 2 "vsx_register_operand" "<VSr>")
+ (const_int 0))
+ (abs:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>"))
+ (neg:VSX_F (abs:VSX_F (match_dup 1)))))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvcpsgn<VSs> %x0,%x2,%x1"
+ [(set_attr "type" "vecsimple")])
+
+(define_insn "*vsx_ftrunc<mode>2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (fix:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvr<VSs>piz %x0,%x1"
+ [(set_attr "type" "vecperm")])
+
+(define_insn "*vsx_float<VSi><mode>2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (float:VSX_F (match_operand:<VSI> 1 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvcvsx<VSc><VSs> %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_floatuns<VSi><mode>2"
+ [(set (match_operand:VSX_F 0 "vsx_register_operand" "=<VSr>")
+ (unsigned_float:VSX_F (match_operand:<VSI> 1 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvcvux<VSc><VSs> %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_fix_trunc<mode><VSi>2"
+ [(set (match_operand:<VSI> 0 "vsx_register_operand" "=<VSr>")
+ (fix:<VSI> (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvcv<VSs>sx<VSc>s %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+(define_insn "*vsx_fixuns_trunc<mode><VSi>2"
+ [(set (match_operand:<VSI> 0 "vsx_register_operand" "=<VSr>")
+ (unsigned_fix:<VSI> (match_operand:VSX_F 1 "vsx_register_operand" "<VSr>")))]
+ "VECTOR_UNIT_VSX_P (<MODE>mode)"
+ "xvcv<VSs>ux<VSc>s %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+
+;; VSX scalar double precision floating point operations
+(define_insn"*vsx_adddf3"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (plus:DF (match_operand:DF 1 "vsx_register_operand" "ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsadddp %x0,%x1,%x2"
+ [(set_attr "type" "fp")
+ (set_attr "fp_type" "fp_addsub_d")])
+
+(define_insn"*vsx_subdf3"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (minus:DF (match_operand:DF 1 "vsx_register_operand" "ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xssubdp %x0,%x1,%x2"
+ [(set_attr "type" "fp")
+ (set_attr "fp_type" "fp_addsub_d")])
+
+(define_insn"*vsx_muldf3"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (mult:DF (match_operand:DF 1 "vsx_register_operand" "ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsmuldp %x0,%x1,%x2"
+ [(set_attr "type" "dmul")
+ (set_attr "fp_type" "fp_mul_d")])
+
+(define_insn"*vsx_divdf3"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (div:DF (match_operand:DF 1 "vsx_register_operand" "ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsdivdp %x0,%x1,%x2"
+ [(set_attr "type" "ddiv")])
+
+(define_insn "*vsx_fredf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (unspec:DF [(match_operand:DF 1 "vsx_register_operand" "ws")]
+ UNSPEC_FRES))]
+ "flag_finite_math_only && VECTOR_UNIT_VSX_P (DFmode)"
+ "xsredp %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_sqrtdf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (sqrt:DF (match_operand:DF 1 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xssqrtdp %x0,%x1"
+ [(set_attr "type" "dsqrt")])
+
+(define_insn"*vsx_negdf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (neg:DF (match_operand:DF 1 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsnegdp %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn"vsx_absdf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (abs:DF (match_operand:DF 1 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsabsdp %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn"*vsx_nabsdf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (neg:DF (abs:DF (match_operand:DF 1 "vsx_register_operand" "ws"))))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsnabsdp %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_smaxdf3"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (smax:DF (match_operand:DF 1 "vsx_register_operand" "ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsmaxdp %x0,%x1,%x2"
+ [(set_attr "type" "fp")])
+
+
+(define_insn "*vsx_smindf3"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (smin:DF (match_operand:DF 1 "vsx_register_operand" "ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsmindp %x0,%x1,%x2"
+ [(set_attr "type" "fp")])
+
+;; Fused vector multiply/add instructions
+(define_insn "*vsx_fmadddf4"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws")
+ (plus:DF (mult:DF (match_operand:DF 1 "vsx_register_operand" "%ws,ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws,0"))
+ (match_operand:DF 3 "gpc_reg_operand" "0,ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD"
+ "@
+ xsmaddadp %x0,%x1,%x2
+ xsmaddmdp %x0,%x1,%x3"
+ [(set_attr "type" "dmul")
+ (set_attr "fp_type" "fp_maddsub_d")])
+
+(define_insn "*vsx_fmsubdf4"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws")
+ (minus:DF (mult:DF (match_operand:DF 1 "vsx_register_operand" "%ws,ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws,0"))
+ (match_operand:DF 3 "vsx_register_operand" "0,ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD"
+ "@
+ xsmsubadp %x0,%x1,%x2
+ xsmsubmdp %x0,%x1,%x3"
+ [(set_attr "type" "dmul")
+ (set_attr "fp_type" "fp_maddsub_d")])
+
+(define_insn "*vsx_fnmadddf4_1"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws")
+ (neg:DF
+ (plus:DF (mult:DF (match_operand:DF 1 "vsx_register_operand" "%ws,ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws,0"))
+ (match_operand:DF 3 "vsx_register_operand" "0,ws"))))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD
+ && HONOR_SIGNED_ZEROS (DFmode)"
+ "@
+ xsnmaddadp %x0,%x1,%x2
+ xsnmaddmdp %x0,%x1,%x3"
+ [(set_attr "type" "dmul")
+ (set_attr "fp_type" "fp_maddsub_d")])
+
+(define_insn "*vsx_fnmadddf4_2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws")
+ (minus:DF (mult:DF (neg:DF
+ (match_operand:DF 1 "gpc_reg_operand" "%ws,ws"))
+ (match_operand:DF 2 "gpc_reg_operand" "ws,0"))
+ (match_operand:DF 3 "vsx_register_operand" "0,ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD
+ && !HONOR_SIGNED_ZEROS (DFmode)"
+ "@
+ xsnmaddadp %x0,%x1,%x2
+ xsnmaddmdp %x0,%x1,%x3"
+ [(set_attr "type" "dmul")
+ (set_attr "fp_type" "fp_maddsub_d")])
+
+(define_insn "*vsx_fnmsubdf4_1"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws")
+ (neg:DF
+ (minus:DF
+ (mult:DF (match_operand:DF 1 "vsx_register_operand" "%ws,ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws,0"))
+ (match_operand:DF 3 "vsx_register_operand" "0,ws"))))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD
+ && HONOR_SIGNED_ZEROS (DFmode)"
+ "@
+ xsnmsubadp %x0,%x1,%x2
+ xsnmsubmdp %x0,%x1,%x3"
+ [(set_attr "type" "dmul")
+ (set_attr "fp_type" "fp_maddsub_d")])
+
+(define_insn "*vsx_fnmsubdf4_2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws,ws")
+ (minus:DF
+ (match_operand:DF 3 "vsx_register_operand" "0,ws")
+ (mult:DF (match_operand:DF 1 "vsx_register_operand" "%ws,ws")
+ (match_operand:DF 2 "vsx_register_operand" "ws,0"))))]
+ "VECTOR_UNIT_VSX_P (DFmode) && TARGET_FUSED_MADD
+ && !HONOR_SIGNED_ZEROS (DFmode)"
+ "@
+ xsnmsubadp %x0,%x1,%x2
+ xsnmsubmdp %x0,%x1,%x3"
+ [(set_attr "type" "dmul")
+ (set_attr "fp_type" "fp_maddsub_d")])
+
+;; For the conversions, limit the register class for the integer value to be
+;; the fprs. For the unsigned tests, there isn't a generic double -> unsigned
+;; conversion in rs6000.md so don't test VECTOR_UNIT_VSX_P, just test against
+;; VSX.
+
+(define_insn "*vsx_floatdidf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (float:DF (match_operand:DI 1 "vsx_register_operand" "!f#r")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xscvsxddp %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_floatunsdidf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (unsigned_float:DF (match_operand:DI 1 "vsx_register_operand" "!f#r")))]
+ "TARGET_HARD_FLOAT && TARGET_VSX"
+ "xscvuxddp %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_fix_truncdfdi2"
+ [(set (match_operand:DI 0 "vsx_register_operand" "=!f#r")
+ (fix:DI (match_operand:DF 1 "vsx_register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xscvdpsxds %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_fixuns_truncdfdi2"
+ [(set (match_operand:DI 0 "vsx_register_operand" "=!f#r")
+ (unsigned_fix:DI (match_operand:DF 1 "vsx_register_operand" "ws")))]
+ "TARGET_HARD_FLOAT && TARGET_VSX"
+ "xscvdpuxds %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_btruncdf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (unspec:DF [(match_operand:DF 1 "vsx_register_operand" "ws")]
+ UNSPEC_FRIZ))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsrdpiz %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_floordf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (unspec:DF [(match_operand:DF 1 "vsx_register_operand" "ws")]
+ UNSPEC_FRIM))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsrdpim %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_ceildf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (unspec:DF [(match_operand:DF 1 "vsx_register_operand" "ws")]
+ UNSPEC_FRIP))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsrdpip %x0,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "vsx_copysigndf3"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (if_then_else:DF (ge:DF (match_operand:DF 2 "vsx_register_operand" "ws")
+ (const_int 0))
+ (abs:DF (match_operand:DF 1 "vsx_register_operand" "ws"))
+ (neg:DF (abs:DF (match_dup 1)))))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xscpsgndp %x0,%x2,%x1"
+ [(set_attr "type" "fp")])
+
+(define_insn "*vsx_ftruncdf2"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (fix:DF (match_operand:DF 1 "register_operand" "ws")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "xsrdppiz %x0,%x1"
+ [(set_attr "type" "vecfloat")])
+
+
+;; Logical and permute operations
+(define_insn "*vsx_and<mode>3"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+ (and:VSX_L
+ (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")
+ (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxland %x0,%x1,%x2"
+ [(set_attr "type" "vecsimple")])
+
+(define_insn "*vsx_ior<mode>3"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+ (ior:VSX_L (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")
+ (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxlor %x0,%x1,%x2"
+ [(set_attr "type" "vecsimple")])
+
+(define_insn "*vsx_xor<mode>3"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+ (xor:VSX_L
+ (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")
+ (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxlxor %x0,%x1,%x2"
+ [(set_attr "type" "vecsimple")])
+
+(define_insn "*vsx_one_cmpl<mode>2"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+ (not:VSX_L
+ (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxlnor %x0,%x1,%x1"
+ [(set_attr "type" "vecsimple")])
+
+(define_insn "*vsx_nor<mode>3"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+ (not:VSX_L
+ (ior:VSX_L
+ (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")
+ (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa"))))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxlnor %x0,%x1,%x2"
+ [(set_attr "type" "vecsimple")])
+
+(define_insn "*vsx_andc<mode>3"
+ [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
+ (and:VSX_L
+ (not:VSX_L
+ (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa"))
+ (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")))]
+ "VECTOR_MEM_VSX_P (<MODE>mode)"
+ "xxlandc %x0,%x1,%x2"
+ [(set_attr "type" "vecsimple")])
+
+
+;; Permute operations
+
+(define_insn "vsx_concat_v2df"
+ [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,?wa")
+ (unspec:V2DF
+ [(match_operand:DF 1 "vsx_register_operand" "f,wa")
+ (match_operand:DF 2 "vsx_register_operand" "f,wa")]
+ UNSPEC_VSX_CONCAT_V2DF))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "xxpermdi %x0,%x1,%x2,0"
+ [(set_attr "type" "vecperm")])
+
+;; Set a double into one element
+(define_insn "vsx_set_v2df"
+ [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd")
+ (vec_merge:V2DF
+ (match_operand:V2DF 1 "vsx_register_operand" "wd")
+ (vec_duplicate:V2DF (match_operand:DF 2 "vsx_register_operand" "ws"))
+ (match_operand:QI 3 "u5bit_cint_operand" "i")))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+ operands[3] = GEN_INT (INTVAL (operands[3]) & 1);
+ return \"xxpermdi %x0,%x1,%x2,%3\";
+}
+ [(set_attr "type" "vecperm")])
+
+;; Extract a DF element from V2DF
+(define_insn "vsx_extract_v2df"
+ [(set (match_operand:DF 0 "vsx_register_operand" "=ws")
+ (vec_select:DF (match_operand:V2DF 1 "vsx_register_operand" "wd")
+ (parallel
+ [(match_operand:QI 2 "u5bit_cint_operand" "i")])))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+ operands[3] = GEN_INT (INTVAL (operands[2]) & 1);
+ return \"xxpermdi %x0,%x1,%x1,%3\";
+}
+ [(set_attr "type" "vecperm")])
+
+;; General V2DF permute
+(define_insn "vsx_xxpermdi"
+ [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd")
+ (vec_concat:V2DF
+ (vec_select:DF (match_operand:V2DF 1 "vsx_register_operand" "wd")
+ (parallel
+ [(match_operand:QI 2 "u5bit_cint_operand" "i")]))
+ (vec_select:DF (match_operand:V2DF 3 "vsx_register_operand" "wd")
+ (parallel
+ [(match_operand:QI 4 "u5bit_cint_operand" "i")]))))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+ operands[5] = GEN_INT (((INTVAL (operands[2]) & 1) << 1)
+ | (INTVAL (operands[4]) & 1));
+ return \"xxpermdi %x0,%x1,%x3,%5\";
+}
+ [(set_attr "type" "vecperm")])
+
+;; V2DF splat
+(define_insn "vsx_splatv2df"
+ [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,wd")
+ (vec_duplicate:V2DF
+ (match_operand:DF 1 "input_operand" "ws,Z")))]
+ "VECTOR_UNIT_VSX_P (V2DFmode)"
+ "@
+ xxpermdi %x0,%x1,%x1,0
+ lxvdsx %x0,%y1"
+ [(set_attr "type" "vecperm,vecload")])
+
+;; V4SF splat
+(define_insn "*vsx_xxspltw"
+ [(set (match_operand:V4SF 0 "vsx_register_operand" "=wf")
+ (vec_duplicate:V4SF
+ (vec_select:SF (match_operand:V4SF 1 "vsx_register_operand" "wf")
+ (parallel
+ [(match_operand:QI 2 "u5bit_cint_operand" "i")]))))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+ "xxspltw %x0,%x1,%2"
+ [(set_attr "type" "vecperm")])
+
+;; V4SF interleave
+(define_insn "*vsx_xxmrghw"
+ [(set (match_operand:V4SF 0 "register_operand" "=v")
+ (vec_merge:V4SF (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v")
+ (parallel [(const_int 0)
+ (const_int 2)
+ (const_int 1)
+ (const_int 3)]))
+ (vec_select:V4SF (match_operand:V4SF 2 "register_operand" "v")
+ (parallel [(const_int 2)
+ (const_int 0)
+ (const_int 3)
+ (const_int 1)]))
+ (const_int 5)))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+ "xxmrghw %x0,%x1,%x2"
+ [(set_attr "type" "vecperm")])
+
+(define_insn "*vsx_xxmrglw"
+ [(set (match_operand:V4SF 0 "register_operand" "=v")
+ (vec_merge:V4SF
+ (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v")
+ (parallel [(const_int 2)
+ (const_int 0)
+ (const_int 3)
+ (const_int 1)]))
+ (vec_select:V4SF (match_operand:V4SF 2 "register_operand" "v")
+ (parallel [(const_int 0)
+ (const_int 2)
+ (const_int 1)
+ (const_int 3)]))
+ (const_int 5)))]
+ "VECTOR_UNIT_VSX_P (V4SFmode)"
+ "xxmrglw %x0,%x1,%x2"
+ [(set_attr "type" "vecperm")])
--- gcc/config/rs6000/rs6000.h (.../trunk) (revision 144557)
+++ gcc/config/rs6000/rs6000.h (.../branches/ibm/power7-meissner) (revision 144730)
@@ -1,6 +1,6 @@
/* Definitions of target machine for GNU compiler, for IBM RS/6000.
Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
- 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008
+ 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009
Free Software Foundation, Inc.
Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu)
@@ -72,14 +72,16 @@
#define ASM_CPU_POWER6_SPEC "-mpower4 -maltivec"
#endif
-#ifdef HAVE_AS_VSX
+#ifdef HAVE_AS_POPCNTD
#define ASM_CPU_POWER7_SPEC "-mpower7"
#else
#define ASM_CPU_POWER7_SPEC "-mpower4 -maltivec"
#endif
-/* Common ASM definitions used by ASM_SPEC among the various targets
- for handling -mcpu=xxx switches. */
+/* Common ASM definitions used by ASM_SPEC among the various targets for
+ handling -mcpu=xxx switches. There is a parallel list in driver-rs6000.c to
+ provide the default assembler options if the user uses -mcpu=native, so if
+ you make changes here, make them also there. */
#define ASM_CPU_SPEC \
"%{!mcpu*: \
%{mpower: %{!mpower2: -mpwr}} \
@@ -88,6 +90,7 @@
%{!mpowerpc64*: %{mpowerpc*: -mppc}} \
%{mno-power: %{!mpowerpc*: -mcom}} \
%{!mno-power: %{!mpower*: %(asm_default)}}} \
+%{mcpu=native: %(asm_cpu_native)} \
%{mcpu=common: -mcom} \
%{mcpu=cell: -mcell} \
%{mcpu=power: -mpwr} \
@@ -163,6 +166,7 @@
#define EXTRA_SPECS \
{ "cpp_default", CPP_DEFAULT_SPEC }, \
{ "asm_cpu", ASM_CPU_SPEC }, \
+ { "asm_cpu_native", ASM_CPU_NATIVE_SPEC }, \
{ "asm_default", ASM_DEFAULT_SPEC }, \
{ "cc1_cpu", CC1_CPU_SPEC }, \
{ "asm_cpu_power5", ASM_CPU_POWER5_SPEC }, \
@@ -179,6 +183,10 @@ extern const char *host_detect_local_cpu
#define EXTRA_SPEC_FUNCTIONS \
{ "local_cpu_detect", host_detect_local_cpu },
#define HAVE_LOCAL_CPU_DETECT
+#define ASM_CPU_NATIVE_SPEC "%:local_cpu_detect(asm)"
+
+#else
+#define ASM_CPU_NATIVE_SPEC "%(asm_default)"
#endif
#ifndef CC1_CPU_SPEC
@@ -240,6 +248,14 @@ extern const char *host_detect_local_cpu
#define TARGET_DFP 0
#endif
+/* Define TARGET_POPCNTD if the target assembler does not support the
+ popcount word and double word instructions. */
+
+#ifndef HAVE_AS_POPCNTD
+#undef TARGET_POPCNTD
+#define TARGET_POPCNTD 0
+#endif
+
#ifndef TARGET_SECURE_PLT
#define TARGET_SECURE_PLT 0
#endif
@@ -295,6 +311,7 @@ enum processor_type
PROCESSOR_POWER4,
PROCESSOR_POWER5,
PROCESSOR_POWER6,
+ PROCESSOR_POWER7,
PROCESSOR_CELL
};
@@ -388,9 +405,13 @@ extern struct rs6000_cpu_select rs6000_s
extern const char *rs6000_debug_name; /* Name for -mdebug-xxxx option */
extern int rs6000_debug_stack; /* debug stack applications */
extern int rs6000_debug_arg; /* debug argument handling */
+extern int rs6000_debug_reg; /* debug register handling */
+extern int rs6000_debug_addr; /* debug memory addressing */
#define TARGET_DEBUG_STACK rs6000_debug_stack
#define TARGET_DEBUG_ARG rs6000_debug_arg
+#define TARGET_DEBUG_REG rs6000_debug_reg
+#define TARGET_DEBUG_ADDR rs6000_debug_addr
extern const char *rs6000_traceback_name; /* Type of traceback table. */
@@ -401,13 +422,65 @@ extern int rs6000_ieeequad;
extern int rs6000_altivec_abi;
extern int rs6000_spe_abi;
extern int rs6000_spe;
-extern int rs6000_isel;
extern int rs6000_float_gprs;
extern int rs6000_alignment_flags;
extern const char *rs6000_sched_insert_nops_str;
extern enum rs6000_nop_insertion rs6000_sched_insert_nops;
extern int rs6000_xilinx_fpu;
+/* Describe which vector unit to use for a given machine mode. */
+enum rs6000_vector {
+ VECTOR_NONE, /* Type is not a vector or not supported */
+ VECTOR_ALTIVEC, /* Use altivec for vector processing */
+ VECTOR_VSX, /* Use VSX for vector processing */
+ VECTOR_PAIRED, /* Use paired floating point for vectors */
+ VECTOR_SPE, /* Use SPE for vector processing */
+ VECTOR_OTHER /* Some other vector unit */
+};
+
+extern enum rs6000_vector rs6000_vector_unit[];
+
+#define VECTOR_UNIT_NONE_P(MODE) \
+ (rs6000_vector_unit[(MODE)] == VECTOR_NONE)
+
+#define VECTOR_UNIT_VSX_P(MODE) \
+ (rs6000_vector_unit[(MODE)] == VECTOR_VSX)
+
+#define VECTOR_UNIT_ALTIVEC_P(MODE) \
+ (rs6000_vector_unit[(MODE)] == VECTOR_ALTIVEC)
+
+#define VECTOR_UNIT_ALTIVEC_OR_VSX_P(MODE) \
+ (rs6000_vector_unit[(MODE)] == VECTOR_ALTIVEC \
+ || rs6000_vector_unit[(MODE)] == VECTOR_VSX)
+
+/* Describe whether to use VSX loads or Altivec loads. For now, just use the
+ same unit as the vector unit we are using, but we may want to migrate to
+ using VSX style loads even for types handled by altivec. */
+extern enum rs6000_vector rs6000_vector_mem[];
+
+#define VECTOR_MEM_NONE_P(MODE) \
+ (rs6000_vector_mem[(MODE)] == VECTOR_NONE)
+
+#define VECTOR_MEM_VSX_P(MODE) \
+ (rs6000_vector_mem[(MODE)] == VECTOR_VSX)
+
+#define VECTOR_MEM_ALTIVEC_P(MODE) \
+ (rs6000_vector_mem[(MODE)] == VECTOR_ALTIVEC)
+
+#define VECTOR_MEM_ALTIVEC_OR_VSX_P(MODE) \
+ (rs6000_vector_mem[(MODE)] == VECTOR_ALTIVEC \
+ || rs6000_vector_mem[(MODE)] == VECTOR_VSX)
+
+/* Return the alignment of a given vector type, which is set based on the
+ vector unit use. VSX for instance can load 32 or 64 bit aligned words
+ without problems, while Altivec requires 128-bit aligned vectors. */
+extern int rs6000_vector_align[];
+
+#define VECTOR_ALIGN(MODE) \
+ ((rs6000_vector_align[(MODE)] != 0) \
+ ? rs6000_vector_align[(MODE)] \
+ : (int)GET_MODE_BITSIZE ((MODE)))
+
/* Alignment options for fields in structures for sub-targets following
AIX-like ABI.
ALIGN_POWER word-aligns FP doubles (default AIX ABI).
@@ -432,7 +505,7 @@ extern int rs6000_xilinx_fpu;
#define TARGET_SPE_ABI 0
#define TARGET_SPE 0
#define TARGET_E500 0
-#define TARGET_ISEL rs6000_isel
+#define TARGET_ISEL64 (TARGET_ISEL && TARGET_POWERPC64)
#define TARGET_FPRS 1
#define TARGET_E500_SINGLE 0
#define TARGET_E500_DOUBLE 0
@@ -530,6 +603,7 @@ extern int rs6000_xilinx_fpu;
#endif
#define UNITS_PER_FP_WORD 8
#define UNITS_PER_ALTIVEC_WORD 16
+#define UNITS_PER_VSX_WORD 16
#define UNITS_PER_SPE_WORD 8
#define UNITS_PER_PAIRED_WORD 8
@@ -600,8 +674,9 @@ extern int rs6000_xilinx_fpu;
#define PARM_BOUNDARY (TARGET_32BIT ? 32 : 64)
/* Boundary (in *bits*) on which stack pointer should be aligned. */
-#define STACK_BOUNDARY \
- ((TARGET_32BIT && !TARGET_ALTIVEC && !TARGET_ALTIVEC_ABI) ? 64 : 128)
+#define STACK_BOUNDARY \
+ ((TARGET_32BIT && !TARGET_ALTIVEC && !TARGET_ALTIVEC_ABI && !TARGET_VSX) \
+ ? 64 : 128)
/* Allocation boundary (in *bits*) for the code of a function. */
#define FUNCTION_BOUNDARY 32
@@ -613,10 +688,11 @@ extern int rs6000_xilinx_fpu;
local store. TYPE is the data type, and ALIGN is the alignment
that the object would ordinarily have. */
#define LOCAL_ALIGNMENT(TYPE, ALIGN) \
- ((TARGET_ALTIVEC && TREE_CODE (TYPE) == VECTOR_TYPE) ? 128 : \
+ (((TARGET_ALTIVEC || TARGET_VSX) \
+ && TREE_CODE (TYPE) == VECTOR_TYPE) ? 128 : \
(TARGET_E500_DOUBLE \
- && TYPE_MODE (TYPE) == DFmode) ? 64 : \
- ((TARGET_SPE && TREE_CODE (TYPE) == VECTOR_TYPE \
+ && TYPE_MODE (TYPE) == DFmode) ? 64 : \
+ ((TARGET_SPE && TREE_CODE (TYPE) == VECTOR_TYPE \
&& SPE_VECTOR_MODE (TYPE_MODE (TYPE))) || (TARGET_PAIRED_FLOAT \
&& TREE_CODE (TYPE) == VECTOR_TYPE \
&& PAIRED_VECTOR_MODE (TYPE_MODE (TYPE)))) ? 64 : ALIGN)
@@ -674,15 +750,17 @@ extern int rs6000_xilinx_fpu;
/* Define this macro to be the value 1 if unaligned accesses have a cost
many times greater than aligned accesses, for example if they are
emulated in a trap handler. */
-/* Altivec vector memory instructions simply ignore the low bits; SPE
- vector memory instructions trap on unaligned accesses. */
+/* Altivec vector memory instructions simply ignore the low bits; SPE vector
+ memory instructions trap on unaligned accesses; VSX memory instructions are
+ aligned to 4 or 8 bytes. */
#define SLOW_UNALIGNED_ACCESS(MODE, ALIGN) \
(STRICT_ALIGNMENT \
|| (((MODE) == SFmode || (MODE) == DFmode || (MODE) == TFmode \
|| (MODE) == SDmode || (MODE) == DDmode || (MODE) == TDmode \
|| (MODE) == DImode) \
&& (ALIGN) < 32) \
- || (VECTOR_MODE_P ((MODE)) && (ALIGN) < GET_MODE_BITSIZE ((MODE))))
+ || (VECTOR_MODE_P ((MODE)) && (((int)(ALIGN)) < VECTOR_ALIGN (MODE))))
+
/* Standard register usage. */
@@ -909,16 +987,60 @@ extern int rs6000_xilinx_fpu;
/* True if register is an AltiVec register. */
#define ALTIVEC_REGNO_P(N) ((N) >= FIRST_ALTIVEC_REGNO && (N) <= LAST_ALTIVEC_REGNO)
+/* True if register is a VSX register. */
+#define VSX_REGNO_P(N) (FP_REGNO_P (N) || ALTIVEC_REGNO_P (N))
+
+/* Alternate name for any vector register supporting floating point, no matter
+ which instruction set(s) are available. */
+#define VFLOAT_REGNO_P(N) \
+ (ALTIVEC_REGNO_P (N) || (TARGET_VSX && FP_REGNO_P (N)))
+
+/* Alternate name for any vector register supporting integer, no matter which
+ instruction set(s) are available. */
+#define VINT_REGNO_P(N) ALTIVEC_REGNO_P (N)
+
+/* Alternate name for any vector register supporting logical operations, no
+ matter which instruction set(s) are available. */
+#define VLOGICAL_REGNO_P(N) VFLOAT_REGNO_P (N)
+
/* Return number of consecutive hard regs needed starting at reg REGNO
to hold something of mode MODE. */
-#define HARD_REGNO_NREGS(REGNO, MODE) rs6000_hard_regno_nregs ((REGNO), (MODE))
+#define HARD_REGNO_NREGS(REGNO, MODE) rs6000_hard_regno_nregs[(MODE)][(REGNO)]
#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) \
((TARGET_32BIT && TARGET_POWERPC64 \
&& (GET_MODE_SIZE (MODE) > 4) \
&& INT_REGNO_P (REGNO)) ? 1 : 0)
+#define VSX_VECTOR_MODE(MODE) \
+ ((MODE) == V4SFmode \
+ || (MODE) == V2DFmode) \
+
+#define VSX_VECTOR_MOVE_MODE(MODE) \
+ ((MODE) == V16QImode \
+ || (MODE) == V8HImode \
+ || (MODE) == V4SImode \
+ || (MODE) == V2DImode \
+ || (MODE) == V4SFmode \
+ || (MODE) == V2DFmode) \
+
+#define VSX_SCALAR_MODE(MODE) \
+ ((MODE) == DFmode)
+
+#define VSX_MODE(MODE) \
+ (VSX_VECTOR_MODE (MODE) \
+ || VSX_SCALAR_MODE (MODE))
+
+#define VSX_MOVE_MODE(MODE) \
+ (VSX_VECTOR_MOVE_MODE (MODE) \
+ || VSX_SCALAR_MODE(MODE) \
+ || (MODE) == V16QImode \
+ || (MODE) == V8HImode \
+ || (MODE) == V4SImode \
+ || (MODE) == V2DImode \
+ || (MODE) == TImode)
+
#define ALTIVEC_VECTOR_MODE(MODE) \
((MODE) == V16QImode \
|| (MODE) == V8HImode \
@@ -934,10 +1056,12 @@ extern int rs6000_xilinx_fpu;
#define PAIRED_VECTOR_MODE(MODE) \
((MODE) == V2SFmode)
-#define UNITS_PER_SIMD_WORD(MODE) \
- (TARGET_ALTIVEC ? UNITS_PER_ALTIVEC_WORD \
- : (TARGET_SPE ? UNITS_PER_SPE_WORD : (TARGET_PAIRED_FLOAT ? \
- UNITS_PER_PAIRED_WORD : UNITS_PER_WORD)))
+#define UNITS_PER_SIMD_WORD(MODE) \
+ (TARGET_VSX ? UNITS_PER_VSX_WORD \
+ : (TARGET_ALTIVEC ? UNITS_PER_ALTIVEC_WORD \
+ : (TARGET_SPE ? UNITS_PER_SPE_WORD \
+ : (TARGET_PAIRED_FLOAT ? UNITS_PER_PAIRED_WORD \
+ : UNITS_PER_WORD))))
/* Value is TRUE if hard register REGNO can hold a value of
machine-mode MODE. */
@@ -965,6 +1089,10 @@ extern int rs6000_xilinx_fpu;
? ALTIVEC_VECTOR_MODE (MODE2) \
: ALTIVEC_VECTOR_MODE (MODE2) \
? ALTIVEC_VECTOR_MODE (MODE1) \
+ : VSX_VECTOR_MODE (MODE1) \
+ ? VSX_VECTOR_MODE (MODE2) \
+ : VSX_VECTOR_MODE (MODE2) \
+ ? VSX_VECTOR_MODE (MODE1) \
: 1)
/* Post-reload, we can't use any new AltiVec registers, as we already
@@ -1056,9 +1184,10 @@ extern int rs6000_xilinx_fpu;
For any two classes, it is very desirable that there be another
class that represents their union. */
-/* The RS/6000 has three types of registers, fixed-point, floating-point,
- and condition registers, plus three special registers, MQ, CTR, and the
- link register. AltiVec adds a vector register class.
+/* The RS/6000 has three types of registers, fixed-point, floating-point, and
+ condition registers, plus three special registers, MQ, CTR, and the link
+ register. AltiVec adds a vector register class. VSX registers overlap the
+ FPR registers and the Altivec registers.
However, r0 is special in that it cannot be used as a base register.
So make a class for registers valid as base registers.
@@ -1073,6 +1202,7 @@ enum reg_class
GENERAL_REGS,
FLOAT_REGS,
ALTIVEC_REGS,
+ VSX_REGS,
VRSAVE_REGS,
VSCR_REGS,
SPE_ACC_REGS,
@@ -1103,6 +1233,7 @@ enum reg_class
"GENERAL_REGS", \
"FLOAT_REGS", \
"ALTIVEC_REGS", \
+ "VSX_REGS", \
"VRSAVE_REGS", \
"VSCR_REGS", \
"SPE_ACC_REGS", \
@@ -1132,6 +1263,7 @@ enum reg_class
{ 0xffffffff, 0x00000000, 0x00000008, 0x00020000 }, /* GENERAL_REGS */ \
{ 0x00000000, 0xffffffff, 0x00000000, 0x00000000 }, /* FLOAT_REGS */ \
{ 0x00000000, 0x00000000, 0xffffe000, 0x00001fff }, /* ALTIVEC_REGS */ \
+ { 0x00000000, 0xffffffff, 0xffffe000, 0x00001fff }, /* VSX_REGS */ \
{ 0x00000000, 0x00000000, 0x00000000, 0x00002000 }, /* VRSAVE_REGS */ \
{ 0x00000000, 0x00000000, 0x00000000, 0x00004000 }, /* VSCR_REGS */ \
{ 0x00000000, 0x00000000, 0x00000000, 0x00008000 }, /* SPE_ACC_REGS */ \
@@ -1179,8 +1311,8 @@ enum reg_class
: (REGNO) == CR0_REGNO ? CR0_REGS \
: CR_REGNO_P (REGNO) ? CR_REGS \
: (REGNO) == MQ_REGNO ? MQ_REGS \
- : (REGNO) == LR_REGNO ? LINK_REGS \
- : (REGNO) == CTR_REGNO ? CTR_REGS \
+ : (REGNO) == LR_REGNO ? LINK_REGS \
+ : (REGNO) == CTR_REGNO ? CTR_REGS \
: (REGNO) == ARG_POINTER_REGNUM ? BASE_REGS \
: (REGNO) == XER_REGNO ? XER_REGS \
: (REGNO) == VRSAVE_REGNO ? VRSAVE_REGS \
@@ -1190,10 +1322,18 @@ enum reg_class
: (REGNO) == FRAME_POINTER_REGNUM ? BASE_REGS \
: NO_REGS)
+/* VSX register classes. */
+extern enum reg_class rs6000_vector_reg_class[];
+extern enum reg_class rs6000_vsx_reg_class;
+
/* The class value for index registers, and the one for base regs. */
#define INDEX_REG_CLASS GENERAL_REGS
#define BASE_REG_CLASS BASE_REGS
+/* Return whether a given register class can hold VSX objects. */
+#define VSX_REG_CLASS_P(CLASS) \
+ ((CLASS) == VSX_REGS || (CLASS) == FLOAT_REGS || (CLASS) == ALTIVEC_REGS)
+
/* Given an rtx X being reloaded into a reg required to be
in class CLASS, return the class of reg to actually use.
In general this is just CLASS; but on some machines
@@ -1213,13 +1353,7 @@ enum reg_class
*/
#define PREFERRED_RELOAD_CLASS(X,CLASS) \
- ((CONSTANT_P (X) \
- && reg_classes_intersect_p ((CLASS), FLOAT_REGS)) \
- ? NO_REGS \
- : (GET_MODE_CLASS (GET_MODE (X)) == MODE_INT \
- && (CLASS) == NON_SPECIAL_REGS) \
- ? GENERAL_REGS \
- : (CLASS))
+ rs6000_preferred_reload_class (X, CLASS)
/* Return the register class of a scratch register needed to copy IN into
or out of a register in CLASS in MODE. If it can be done directly,
@@ -1234,18 +1368,7 @@ enum reg_class
are available.*/
#define SECONDARY_MEMORY_NEEDED(CLASS1,CLASS2,MODE) \
- ((CLASS1) != (CLASS2) && (((CLASS1) == FLOAT_REGS \
- && (!TARGET_MFPGPR || !TARGET_POWERPC64 \
- || ((MODE != DFmode) \
- && (MODE != DDmode) \
- && (MODE != DImode)))) \
- || ((CLASS2) == FLOAT_REGS \
- && (!TARGET_MFPGPR || !TARGET_POWERPC64 \
- || ((MODE != DFmode) \
- && (MODE != DDmode) \
- && (MODE != DImode)))) \
- || (CLASS1) == ALTIVEC_REGS \
- || (CLASS2) == ALTIVEC_REGS))
+ rs6000_secondary_memory_needed (CLASS1, CLASS2, MODE)
/* For cpus that cannot load/store SDmode values from the 64-bit
FP registers without using a full 64-bit load/store, we need
@@ -1257,32 +1380,15 @@ enum reg_class
/* Return the maximum number of consecutive registers
needed to represent mode MODE in a register of class CLASS.
- On RS/6000, this is the size of MODE in words,
- except in the FP regs, where a single reg is enough for two words. */
-#define CLASS_MAX_NREGS(CLASS, MODE) \
- (((CLASS) == FLOAT_REGS) \
- ? ((GET_MODE_SIZE (MODE) + UNITS_PER_FP_WORD - 1) / UNITS_PER_FP_WORD) \
- : (TARGET_E500_DOUBLE && (CLASS) == GENERAL_REGS \
- && (MODE) == DFmode) \
- ? 1 \
- : ((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD))
+ On RS/6000, this is the size of MODE in words, except in the FP regs, where
+ a single reg is enough for two words, unless we have VSX, where the FP
+ registers can hold 128 bits. */
+#define CLASS_MAX_NREGS(CLASS, MODE) rs6000_class_max_nregs[(MODE)][(CLASS)]
/* Return nonzero if for CLASS a mode change from FROM to TO is invalid. */
#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
- (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \
- ? ((GET_MODE_SIZE (FROM) < 8 || GET_MODE_SIZE (TO) < 8 \
- || TARGET_IEEEQUAD) \
- && reg_classes_intersect_p (FLOAT_REGS, CLASS)) \
- : (((TARGET_E500_DOUBLE \
- && ((((TO) == DFmode) + ((FROM) == DFmode)) == 1 \
- || (((TO) == TFmode) + ((FROM) == TFmode)) == 1 \
- || (((TO) == DDmode) + ((FROM) == DDmode)) == 1 \
- || (((TO) == TDmode) + ((FROM) == TDmode)) == 1 \
- || (((TO) == DImode) + ((FROM) == DImode)) == 1)) \
- || (TARGET_SPE \
- && (SPE_VECTOR_MODE (FROM) + SPE_VECTOR_MODE (TO)) == 1)) \
- && reg_classes_intersect_p (GENERAL_REGS, CLASS)))
+ rs6000_cannot_change_mode_class (FROM, TO, CLASS)
/* Stack layout; function entry, exit and calling. */
@@ -1343,8 +1449,8 @@ extern enum rs6000_abi rs6000_current_ab
#define STARTING_FRAME_OFFSET \
(FRAME_GROWS_DOWNWARD \
? 0 \
- : (RS6000_ALIGN (crtl->outgoing_args_size, \
- TARGET_ALTIVEC ? 16 : 8) \
+ : (RS6000_ALIGN (crtl->outgoing_args_size, \
+ (TARGET_ALTIVEC || TARGET_VSX) ? 16 : 8) \
+ RS6000_SAVE_AREA))
/* Offset from the stack pointer register to an item dynamically
@@ -1354,8 +1460,8 @@ extern enum rs6000_abi rs6000_current_ab
length of the outgoing arguments. The default is correct for most
machines. See `function.c' for details. */
#define STACK_DYNAMIC_OFFSET(FUNDECL) \
- (RS6000_ALIGN (crtl->outgoing_args_size, \
- TARGET_ALTIVEC ? 16 : 8) \
+ (RS6000_ALIGN (crtl->outgoing_args_size, \
+ (TARGET_ALTIVEC || TARGET_VSX) ? 16 : 8) \
+ (STACK_POINTER_OFFSET))
/* If we generate an insn to push BYTES bytes,
@@ -1605,7 +1711,7 @@ typedef struct rs6000_args
#define EPILOGUE_USES(REGNO) \
((reload_completed && (REGNO) == LR_REGNO) \
|| (TARGET_ALTIVEC && (REGNO) == VRSAVE_REGNO) \
- || (crtl->calls_eh_return \
+ || (crtl->calls_eh_return \
&& TARGET_AIX \
&& (REGNO) == 2))
@@ -2316,7 +2422,24 @@ extern char rs6000_reg_names[][8]; /* re
/* no additional names for: mq, lr, ctr, ap */ \
{"cr0", 68}, {"cr1", 69}, {"cr2", 70}, {"cr3", 71}, \
{"cr4", 72}, {"cr5", 73}, {"cr6", 74}, {"cr7", 75}, \
- {"cc", 68}, {"sp", 1}, {"toc", 2} }
+ {"cc", 68}, {"sp", 1}, {"toc", 2}, \
+ /* VSX registers overlaid on top of FR, Altivec registers */ \
+ {"vs0", 32}, {"vs1", 33}, {"vs2", 34}, {"vs3", 35}, \
+ {"vs4", 36}, {"vs5", 37}, {"vs6", 38}, {"vs7", 39}, \
+ {"vs8", 40}, {"vs9", 41}, {"vs10", 42}, {"vs11", 43}, \
+ {"vs12", 44}, {"vs13", 45}, {"vs14", 46}, {"vs15", 47}, \
+ {"vs16", 48}, {"vs17", 49}, {"vs18", 50}, {"vs19", 51}, \
+ {"vs20", 52}, {"vs21", 53}, {"vs22", 54}, {"vs23", 55}, \
+ {"vs24", 56}, {"vs25", 57}, {"vs26", 58}, {"vs27", 59}, \
+ {"vs28", 60}, {"vs29", 61}, {"vs30", 62}, {"vs31", 63}, \
+ {"vs32", 77}, {"vs33", 78}, {"vs34", 79}, {"vs35", 80}, \
+ {"vs36", 81}, {"vs37", 82}, {"vs38", 83}, {"vs39", 84}, \
+ {"vs40", 85}, {"vs41", 86}, {"vs42", 87}, {"vs43", 88}, \
+ {"vs44", 89}, {"vs45", 90}, {"vs46", 91}, {"vs47", 92}, \
+ {"vs48", 93}, {"vs49", 94}, {"vs50", 95}, {"vs51", 96}, \
+ {"vs52", 97}, {"vs53", 98}, {"vs54", 99}, {"vs55", 100}, \
+ {"vs56", 101},{"vs57", 102},{"vs58", 103},{"vs59", 104}, \
+ {"vs60", 105},{"vs61", 106},{"vs62", 107},{"vs63", 108} }
/* Text to write out after a CALL that may be replaced by glue code by
the loader. This depends on the AIX version. */
@@ -2480,10 +2603,14 @@ enum rs6000_builtins
ALTIVEC_BUILTIN_VSEL_4SF,
ALTIVEC_BUILTIN_VSEL_8HI,
ALTIVEC_BUILTIN_VSEL_16QI,
+ ALTIVEC_BUILTIN_VSEL_2DF, /* needed for VSX */
+ ALTIVEC_BUILTIN_VSEL_2DI, /* needed for VSX */
ALTIVEC_BUILTIN_VPERM_4SI,
ALTIVEC_BUILTIN_VPERM_4SF,
ALTIVEC_BUILTIN_VPERM_8HI,
ALTIVEC_BUILTIN_VPERM_16QI,
+ ALTIVEC_BUILTIN_VPERM_2DF, /* needed for VSX */
+ ALTIVEC_BUILTIN_VPERM_2DI, /* needed for VSX */
ALTIVEC_BUILTIN_VPKUHUM,
ALTIVEC_BUILTIN_VPKUWUM,
ALTIVEC_BUILTIN_VPKPX,
@@ -3110,6 +3237,163 @@ enum rs6000_builtins
RS6000_BUILTIN_RECIPF,
RS6000_BUILTIN_RSQRTF,
+ /* VSX builtins. */
+ VSX_BUILTIN_LXSDUX,
+ VSX_BUILTIN_LXSDX,
+ VSX_BUILTIN_LXVD2UX,
+ VSX_BUILTIN_LXVD2X,
+ VSX_BUILTIN_LXVDSX,
+ VSX_BUILTIN_LXVW4UX,
+ VSX_BUILTIN_LXVW4X,
+ VSX_BUILTIN_STXSDUX,
+ VSX_BUILTIN_STXSDX,
+ VSX_BUILTIN_STXVD2UX,
+ VSX_BUILTIN_STXVD2X,
+ VSX_BUILTIN_STXVW4UX,
+ VSX_BUILTIN_STXVW4X,
+ VSX_BUILTIN_XSABSDP,
+ VSX_BUILTIN_XSADDDP,
+ VSX_BUILTIN_XSCMPODP,
+ VSX_BUILTIN_XSCMPUDP,
+ VSX_BUILTIN_XSCPSGNDP,
+ VSX_BUILTIN_XSCVDPSP,
+ VSX_BUILTIN_XSCVDPSXDS,
+ VSX_BUILTIN_XSCVDPSXWS,
+ VSX_BUILTIN_XSCVDPUXDS,
+ VSX_BUILTIN_XSCVDPUXWS,
+ VSX_BUILTIN_XSCVSPDP,
+ VSX_BUILTIN_XSCVSXDDP,
+ VSX_BUILTIN_XSCVUXDDP,
+ VSX_BUILTIN_XSDIVDP,
+ VSX_BUILTIN_XSMADDADP,
+ VSX_BUILTIN_XSMADDMDP,
+ VSX_BUILTIN_XSMAXDP,
+ VSX_BUILTIN_XSMINDP,
+ VSX_BUILTIN_XSMOVDP,
+ VSX_BUILTIN_XSMSUBADP,
+ VSX_BUILTIN_XSMSUBMDP,
+ VSX_BUILTIN_XSMULDP,
+ VSX_BUILTIN_XSNABSDP,
+ VSX_BUILTIN_XSNEGDP,
+ VSX_BUILTIN_XSNMADDADP,
+ VSX_BUILTIN_XSNMADDMDP,
+ VSX_BUILTIN_XSNMSUBADP,
+ VSX_BUILTIN_XSNMSUBMDP,
+ VSX_BUILTIN_XSRDPI,
+ VSX_BUILTIN_XSRDPIC,
+ VSX_BUILTIN_XSRDPIM,
+ VSX_BUILTIN_XSRDPIP,
+ VSX_BUILTIN_XSRDPIZ,
+ VSX_BUILTIN_XSREDP,
+ VSX_BUILTIN_XSRSQRTEDP,
+ VSX_BUILTIN_XSSQRTDP,
+ VSX_BUILTIN_XSSUBDP,
+ VSX_BUILTIN_XSTDIVDP,
+ VSX_BUILTIN_XSTSQRTDP,
+ VSX_BUILTIN_XVABSDP,
+ VSX_BUILTIN_XVABSSP,
+ VSX_BUILTIN_XVADDDP,
+ VSX_BUILTIN_XVADDSP,
+ VSX_BUILTIN_XVCMPEQDP,
+ VSX_BUILTIN_XVCMPEQSP,
+ VSX_BUILTIN_XVCMPGEDP,
+ VSX_BUILTIN_XVCMPGESP,
+ VSX_BUILTIN_XVCMPGTDP,
+ VSX_BUILTIN_XVCMPGTSP,
+ VSX_BUILTIN_XVCPSGNDP,
+ VSX_BUILTIN_XVCPSGNSP,
+ VSX_BUILTIN_XVCVDPSP,
+ VSX_BUILTIN_XVCVDPSXDS,
+ VSX_BUILTIN_XVCVDPSXWS,
+ VSX_BUILTIN_XVCVDPUXDS,
+ VSX_BUILTIN_XVCVDPUXWS,
+ VSX_BUILTIN_XVCVSPDP,
+ VSX_BUILTIN_XVCVSPSXDS,
+ VSX_BUILTIN_XVCVSPSXWS,
+ VSX_BUILTIN_XVCVSPUXDS,
+ VSX_BUILTIN_XVCVSPUXWS,
+ VSX_BUILTIN_XVCVSXDDP,
+ VSX_BUILTIN_XVCVSXDSP,
+ VSX_BUILTIN_XVCVSXWDP,
+ VSX_BUILTIN_XVCVSXWSP,
+ VSX_BUILTIN_XVCVUXDDP,
+ VSX_BUILTIN_XVCVUXDSP,
+ VSX_BUILTIN_XVCVUXWDP,
+ VSX_BUILTIN_XVCVUXWSP,
+ VSX_BUILTIN_XVDIVDP,
+ VSX_BUILTIN_XVDIVSP,
+ VSX_BUILTIN_XVMADDADP,
+ VSX_BUILTIN_XVMADDASP,
+ VSX_BUILTIN_XVMADDMDP,
+ VSX_BUILTIN_XVMADDMSP,
+ VSX_BUILTIN_XVMAXDP,
+ VSX_BUILTIN_XVMAXSP,
+ VSX_BUILTIN_XVMINDP,
+ VSX_BUILTIN_XVMINSP,
+ VSX_BUILTIN_XVMOVDP,
+ VSX_BUILTIN_XVMOVSP,
+ VSX_BUILTIN_XVMSUBADP,
+ VSX_BUILTIN_XVMSUBASP,
+ VSX_BUILTIN_XVMSUBMDP,
+ VSX_BUILTIN_XVMSUBMSP,
+ VSX_BUILTIN_XVMULDP,
+ VSX_BUILTIN_XVMULSP,
+ VSX_BUILTIN_XVNABSDP,
+ VSX_BUILTIN_XVNABSSP,
+ VSX_BUILTIN_XVNEGDP,
+ VSX_BUILTIN_XVNEGSP,
+ VSX_BUILTIN_XVNMADDADP,
+ VSX_BUILTIN_XVNMADDASP,
+ VSX_BUILTIN_XVNMADDMDP,
+ VSX_BUILTIN_XVNMADDMSP,
+ VSX_BUILTIN_XVNMSUBADP,
+ VSX_BUILTIN_XVNMSUBASP,
+ VSX_BUILTIN_XVNMSUBMDP,
+ VSX_BUILTIN_XVNMSUBMSP,
+ VSX_BUILTIN_XVRDPI,
+ VSX_BUILTIN_XVRDPIC,
+ VSX_BUILTIN_XVRDPIM,
+ VSX_BUILTIN_XVRDPIP,
+ VSX_BUILTIN_XVRDPIZ,
+ VSX_BUILTIN_XVREDP,
+ VSX_BUILTIN_XVRESP,
+ VSX_BUILTIN_XVRSPI,
+ VSX_BUILTIN_XVRSPIC,
+ VSX_BUILTIN_XVRSPIM,
+ VSX_BUILTIN_XVRSPIP,
+ VSX_BUILTIN_XVRSPIZ,
+ VSX_BUILTIN_XVRSQRTEDP,
+ VSX_BUILTIN_XVRSQRTESP,
+ VSX_BUILTIN_XVSQRTDP,
+ VSX_BUILTIN_XVSQRTSP,
+ VSX_BUILTIN_XVSUBDP,
+ VSX_BUILTIN_XVSUBSP,
+ VSX_BUILTIN_XVTDIVDP,
+ VSX_BUILTIN_XVTDIVSP,
+ VSX_BUILTIN_XVTSQRTDP,
+ VSX_BUILTIN_XVTSQRTSP,
+ VSX_BUILTIN_XXLAND,
+ VSX_BUILTIN_XXLANDC,
+ VSX_BUILTIN_XXLNOR,
+ VSX_BUILTIN_XXLOR,
+ VSX_BUILTIN_XXLXOR,
+ VSX_BUILTIN_XXMRGHD,
+ VSX_BUILTIN_XXMRGHW,
+ VSX_BUILTIN_XXMRGLD,
+ VSX_BUILTIN_XXMRGLW,
+ VSX_BUILTIN_XXPERMDI,
+ VSX_BUILTIN_XXSEL,
+ VSX_BUILTIN_XXSLDWI,
+ VSX_BUILTIN_XXSPLTD,
+ VSX_BUILTIN_XXSPLTW,
+ VSX_BUILTIN_XXSWAPD,
+
+ /* Combine VSX/Altivec builtins. */
+ VECTOR_BUILTIN_FLOAT_V4SI_V4SF,
+ VECTOR_BUILTIN_UNSFLOAT_V4SI_V4SF,
+ VECTOR_BUILTIN_FIX_V4SF_V4SI,
+ VECTOR_BUILTIN_FIXUNS_V4SF_V4SI,
+
RS6000_BUILTIN_COUNT
};
@@ -3123,6 +3407,8 @@ enum rs6000_builtin_type_index
RS6000_BTI_V16QI,
RS6000_BTI_V2SI,
RS6000_BTI_V2SF,
+ RS6000_BTI_V2DI,
+ RS6000_BTI_V2DF,
RS6000_BTI_V4HI,
RS6000_BTI_V4SI,
RS6000_BTI_V4SF,
@@ -3146,7 +3432,10 @@ enum rs6000_builtin_type_index
RS6000_BTI_UINTHI, /* unsigned_intHI_type_node */
RS6000_BTI_INTSI, /* intSI_type_node */
RS6000_BTI_UINTSI, /* unsigned_intSI_type_node */
+ RS6000_BTI_INTDI, /* intDI_type_node */
+ RS6000_BTI_UINTDI, /* unsigned_intDI_type_node */
RS6000_BTI_float, /* float_type_node */
+ RS6000_BTI_double, /* double_type_node */
RS6000_BTI_void, /* void_type_node */
RS6000_BTI_MAX
};
@@ -3157,6 +3446,8 @@ enum rs6000_builtin_type_index
#define opaque_p_V2SI_type_node (rs6000_builtin_types[RS6000_BTI_opaque_p_V2SI])
#define opaque_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_opaque_V4SI])
#define V16QI_type_node (rs6000_builtin_types[RS6000_BTI_V16QI])
+#define V2DI_type_node (rs6000_builtin_types[RS6000_BTI_V2DI])
+#define V2DF_type_node (rs6000_builtin_types[RS6000_BTI_V2DF])
#define V2SI_type_node (rs6000_builtin_types[RS6000_BTI_V2SI])
#define V2SF_type_node (rs6000_builtin_types[RS6000_BTI_V2SF])
#define V4HI_type_node (rs6000_builtin_types[RS6000_BTI_V4HI])
@@ -3183,7 +3474,10 @@ enum rs6000_builtin_type_index
#define uintHI_type_internal_node (rs6000_builtin_types[RS6000_BTI_UINTHI])
#define intSI_type_internal_node (rs6000_builtin_types[RS6000_BTI_INTSI])
#define uintSI_type_internal_node (rs6000_builtin_types[RS6000_BTI_UINTSI])
+#define intDI_type_internal_node (rs6000_builtin_types[RS6000_BTI_INTDI])
+#define uintDI_type_internal_node (rs6000_builtin_types[RS6000_BTI_UINTDI])
#define float_type_internal_node (rs6000_builtin_types[RS6000_BTI_float])
+#define double_type_internal_node (rs6000_builtin_types[RS6000_BTI_double])
#define void_type_internal_node (rs6000_builtin_types[RS6000_BTI_void])
extern GTY(()) tree rs6000_builtin_types[RS6000_BTI_MAX];
--- gcc/config/rs6000/altivec.md (.../trunk) (revision 144557)
+++ gcc/config/rs6000/altivec.md (.../branches/ibm/power7-meissner) (revision 144730)
@@ -21,18 +21,7 @@
(define_constants
[(UNSPEC_VCMPBFP 50)
- (UNSPEC_VCMPEQUB 51)
- (UNSPEC_VCMPEQUH 52)
- (UNSPEC_VCMPEQUW 53)
- (UNSPEC_VCMPEQFP 54)
- (UNSPEC_VCMPGEFP 55)
- (UNSPEC_VCMPGTUB 56)
- (UNSPEC_VCMPGTSB 57)
- (UNSPEC_VCMPGTUH 58)
- (UNSPEC_VCMPGTSH 59)
- (UNSPEC_VCMPGTUW 60)
- (UNSPEC_VCMPGTSW 61)
- (UNSPEC_VCMPGTFP 62)
+ ;; 51-62 deleted
(UNSPEC_VMSUMU 65)
(UNSPEC_VMSUMM 66)
(UNSPEC_VMSUMSHM 68)
@@ -87,10 +76,7 @@ (define_constants
(UNSPEC_VEXPTEFP 156)
(UNSPEC_VRSQRTEFP 157)
(UNSPEC_VREFP 158)
- (UNSPEC_VSEL4SI 159)
- (UNSPEC_VSEL4SF 160)
- (UNSPEC_VSEL8HI 161)
- (UNSPEC_VSEL16QI 162)
+ ;; 159-162 deleted
(UNSPEC_VLSDOI 163)
(UNSPEC_VUPKHSB 167)
(UNSPEC_VUPKHPX 168)
@@ -125,11 +111,11 @@ (define_constants
(UNSPEC_INTERHI_V4SI 228)
(UNSPEC_INTERHI_V8HI 229)
(UNSPEC_INTERHI_V16QI 230)
- (UNSPEC_INTERHI_V4SF 231)
+ ;; delete 231
(UNSPEC_INTERLO_V4SI 232)
(UNSPEC_INTERLO_V8HI 233)
(UNSPEC_INTERLO_V16QI 234)
- (UNSPEC_INTERLO_V4SF 235)
+ ;; delete 235
(UNSPEC_LVLX 236)
(UNSPEC_LVLXL 237)
(UNSPEC_LVRX 238)
@@ -176,39 +162,17 @@ (define_mode_iterator VIshort [V8HI V16Q
(define_mode_iterator VF [V4SF])
;; Vec modes, pity mode iterators are not composable
(define_mode_iterator V [V4SI V8HI V16QI V4SF])
+;; Vec modes for move/logical/permute ops, include vector types for move not
+;; otherwise handled by altivec (v2df, v2di, ti)
+(define_mode_iterator VM [V4SI V8HI V16QI V4SF V2DF V2DI TI])
(define_mode_attr VI_char [(V4SI "w") (V8HI "h") (V16QI "b")])
-;; Generic LVX load instruction.
-(define_insn "altivec_lvx_<mode>"
- [(set (match_operand:V 0 "altivec_register_operand" "=v")
- (match_operand:V 1 "memory_operand" "Z"))]
- "TARGET_ALTIVEC"
- "lvx %0,%y1"
- [(set_attr "type" "vecload")])
-
-;; Generic STVX store instruction.
-(define_insn "altivec_stvx_<mode>"
- [(set (match_operand:V 0 "memory_operand" "=Z")
- (match_operand:V 1 "altivec_register_operand" "v"))]
- "TARGET_ALTIVEC"
- "stvx %1,%y0"
- [(set_attr "type" "vecstore")])
-
;; Vector move instructions.
-(define_expand "mov<mode>"
- [(set (match_operand:V 0 "nonimmediate_operand" "")
- (match_operand:V 1 "any_operand" ""))]
- "TARGET_ALTIVEC"
-{
- rs6000_emit_move (operands[0], operands[1], <MODE>mode);
- DONE;
-})
-
-(define_insn "*mov<mode>_internal"
- [(set (match_operand:V 0 "nonimmediate_operand" "=Z,v,v,o,r,r,v")
- (match_operand:V 1 "input_operand" "v,Z,v,r,o,r,W"))]
- "TARGET_ALTIVEC
+(define_insn "*altivec_mov<mode>"
+ [(set (match_operand:V 0 "nonimmediate_operand" "=Z,v,v,*o,*r,*r,v,v")
+ (match_operand:V 1 "input_operand" "v,Z,v,r,o,r,j,W"))]
+ "VECTOR_MEM_ALTIVEC_P (<MODE>mode)
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
{
@@ -220,52 +184,17 @@ (define_insn "*mov<mode>_internal"
case 3: return "#";
case 4: return "#";
case 5: return "#";
- case 6: return output_vec_const_move (operands);
+ case 6: return "vxor %0,%0,%0";
+ case 7: return output_vec_const_move (operands);
default: gcc_unreachable ();
}
}
- [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,*")])
-
-(define_split
- [(set (match_operand:V4SI 0 "nonimmediate_operand" "")
- (match_operand:V4SI 1 "input_operand" ""))]
- "TARGET_ALTIVEC && reload_completed
- && gpr_or_gpr_p (operands[0], operands[1])"
- [(pc)]
-{
- rs6000_split_multireg_move (operands[0], operands[1]); DONE;
-})
-
-(define_split
- [(set (match_operand:V8HI 0 "nonimmediate_operand" "")
- (match_operand:V8HI 1 "input_operand" ""))]
- "TARGET_ALTIVEC && reload_completed
- && gpr_or_gpr_p (operands[0], operands[1])"
- [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
+ [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*")])
(define_split
- [(set (match_operand:V16QI 0 "nonimmediate_operand" "")
- (match_operand:V16QI 1 "input_operand" ""))]
- "TARGET_ALTIVEC && reload_completed
- && gpr_or_gpr_p (operands[0], operands[1])"
- [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
-
-(define_split
- [(set (match_operand:V4SF 0 "nonimmediate_operand" "")
- (match_operand:V4SF 1 "input_operand" ""))]
- "TARGET_ALTIVEC && reload_completed
- && gpr_or_gpr_p (operands[0], operands[1])"
- [(pc)]
-{
- rs6000_split_multireg_move (operands[0], operands[1]); DONE;
-})
-
-(define_split
- [(set (match_operand:V 0 "altivec_register_operand" "")
- (match_operand:V 1 "easy_vector_constant_add_self" ""))]
- "TARGET_ALTIVEC && reload_completed"
+ [(set (match_operand:VM 0 "altivec_register_operand" "")
+ (match_operand:VM 1 "easy_vector_constant_add_self" ""))]
+ "VECTOR_UNIT_ALTIVEC_P (<MODE>mode) && reload_completed"
[(set (match_dup 0) (match_dup 3))
(set (match_dup 0) (match_dup 4))]
{
@@ -346,11 +275,11 @@ (define_insn "add<mode>3"
"vaddu<VI_char>m %0,%1,%2"
[(set_attr "type" "vecsimple")])
-(define_insn "addv4sf3"
+(define_insn "*altivec_addv4sf3"
[(set (match_operand:V4SF 0 "register_operand" "=v")
(plus:V4SF (match_operand:V4SF 1 "register_operand" "v")
(match_operand:V4SF 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vaddfp %0,%1,%2"
[(set_attr "type" "vecfloat")])
@@ -392,11 +321,11 @@ (define_insn "sub<mode>3"
"vsubu<VI_char>m %0,%1,%2"
[(set_attr "type" "vecsimple")])
-(define_insn "subv4sf3"
+(define_insn "*altivec_subv4sf3"
[(set (match_operand:V4SF 0 "register_operand" "=v")
(minus:V4SF (match_operand:V4SF 1 "register_operand" "v")
(match_operand:V4SF 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vsubfp %0,%1,%2"
[(set_attr "type" "vecfloat")])
@@ -457,131 +386,81 @@ (define_insn "altivec_vcmpbfp"
"vcmpbfp %0,%1,%2"
[(set_attr "type" "veccmp")])
-(define_insn "altivec_vcmpequb"
- [(set (match_operand:V16QI 0 "register_operand" "=v")
- (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
- (match_operand:V16QI 2 "register_operand" "v")]
- UNSPEC_VCMPEQUB))]
+(define_insn "*altivec_eq<mode>"
+ [(set (match_operand:VI 0 "altivec_register_operand" "=v")
+ (eq:VI (match_operand:VI 1 "altivec_register_operand" "v")
+ (match_operand:VI 2 "altivec_register_operand" "v")))]
"TARGET_ALTIVEC"
- "vcmpequb %0,%1,%2"
- [(set_attr "type" "vecsimple")])
+ "vcmpequ<VI_char> %0,%1,%2"
+ [(set_attr "type" "veccmp")])
-(define_insn "altivec_vcmpequh"
- [(set (match_operand:V8HI 0 "register_operand" "=v")
- (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
- (match_operand:V8HI 2 "register_operand" "v")]
- UNSPEC_VCMPEQUH))]
+(define_insn "*altivec_gt<mode>"
+ [(set (match_operand:VI 0 "altivec_register_operand" "=v")
+ (gt:VI (match_operand:VI 1 "altivec_register_operand" "v")
+ (match_operand:VI 2 "altivec_register_operand" "v")))]
"TARGET_ALTIVEC"
- "vcmpequh %0,%1,%2"
- [(set_attr "type" "vecsimple")])
+ "vcmpgts<VI_char> %0,%1,%2"
+ [(set_attr "type" "veccmp")])
-(define_insn "altivec_vcmpequw"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
- (match_operand:V4SI 2 "register_operand" "v")]
- UNSPEC_VCMPEQUW))]
+(define_insn "*altivec_gtu<mode>"
+ [(set (match_operand:VI 0 "altivec_register_operand" "=v")
+ (gtu:VI (match_operand:VI 1 "altivec_register_operand" "v")
+ (match_operand:VI 2 "altivec_register_operand" "v")))]
"TARGET_ALTIVEC"
- "vcmpequw %0,%1,%2"
- [(set_attr "type" "vecsimple")])
+ "vcmpgtu<VI_char> %0,%1,%2"
+ [(set_attr "type" "veccmp")])
-(define_insn "altivec_vcmpeqfp"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "v")
- (match_operand:V4SF 2 "register_operand" "v")]
- UNSPEC_VCMPEQFP))]
- "TARGET_ALTIVEC"
+(define_insn "*altivec_eqv4sf"
+ [(set (match_operand:V4SF 0 "altivec_register_operand" "=v")
+ (eq:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v")
+ (match_operand:V4SF 2 "altivec_register_operand" "v")))]
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vcmpeqfp %0,%1,%2"
[(set_attr "type" "veccmp")])
-(define_insn "altivec_vcmpgefp"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "v")
- (match_operand:V4SF 2 "register_operand" "v")]
- UNSPEC_VCMPGEFP))]
- "TARGET_ALTIVEC"
- "vcmpgefp %0,%1,%2"
+(define_insn "*altivec_gtv4sf"
+ [(set (match_operand:V4SF 0 "altivec_register_operand" "=v")
+ (gt:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v")
+ (match_operand:V4SF 2 "altivec_register_operand" "v")))]
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
+ "vcmpgtfp %0,%1,%2"
[(set_attr "type" "veccmp")])
-(define_insn "altivec_vcmpgtub"
- [(set (match_operand:V16QI 0 "register_operand" "=v")
- (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
- (match_operand:V16QI 2 "register_operand" "v")]
- UNSPEC_VCMPGTUB))]
- "TARGET_ALTIVEC"
- "vcmpgtub %0,%1,%2"
- [(set_attr "type" "vecsimple")])
-
-(define_insn "altivec_vcmpgtsb"
- [(set (match_operand:V16QI 0 "register_operand" "=v")
- (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
- (match_operand:V16QI 2 "register_operand" "v")]
- UNSPEC_VCMPGTSB))]
- "TARGET_ALTIVEC"
- "vcmpgtsb %0,%1,%2"
- [(set_attr "type" "vecsimple")])
-
-(define_insn "altivec_vcmpgtuh"
- [(set (match_operand:V8HI 0 "register_operand" "=v")
- (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
- (match_operand:V8HI 2 "register_operand" "v")]
- UNSPEC_VCMPGTUH))]
- "TARGET_ALTIVEC"
- "vcmpgtuh %0,%1,%2"
- [(set_attr "type" "vecsimple")])
-
-(define_insn "altivec_vcmpgtsh"
- [(set (match_operand:V8HI 0 "register_operand" "=v")
- (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
- (match_operand:V8HI 2 "register_operand" "v")]
- UNSPEC_VCMPGTSH))]
- "TARGET_ALTIVEC"
- "vcmpgtsh %0,%1,%2"
- [(set_attr "type" "vecsimple")])
-
-(define_insn "altivec_vcmpgtuw"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
- (match_operand:V4SI 2 "register_operand" "v")]
- UNSPEC_VCMPGTUW))]
- "TARGET_ALTIVEC"
- "vcmpgtuw %0,%1,%2"
- [(set_attr "type" "vecsimple")])
-
-(define_insn "altivec_vcmpgtsw"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
- (match_operand:V4SI 2 "register_operand" "v")]
- UNSPEC_VCMPGTSW))]
- "TARGET_ALTIVEC"
- "vcmpgtsw %0,%1,%2"
- [(set_attr "type" "vecsimple")])
-
-(define_insn "altivec_vcmpgtfp"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (unspec:V4SI [(match_operand:V4SF 1 "register_operand" "v")
- (match_operand:V4SF 2 "register_operand" "v")]
- UNSPEC_VCMPGTFP))]
- "TARGET_ALTIVEC"
- "vcmpgtfp %0,%1,%2"
+(define_insn "*altivec_gev4sf"
+ [(set (match_operand:V4SF 0 "altivec_register_operand" "=v")
+ (ge:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v")
+ (match_operand:V4SF 2 "altivec_register_operand" "v")))]
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
+ "vcmpgefp %0,%1,%2"
[(set_attr "type" "veccmp")])
+(define_insn "altivec_vsel<mode>"
+ [(set (match_operand:VM 0 "altivec_register_operand" "=v")
+ (if_then_else:VM (ne (match_operand:VM 1 "altivec_register_operand" "v")
+ (const_int 0))
+ (match_operand:VM 2 "altivec_register_operand" "v")
+ (match_operand:VM 3 "altivec_register_operand" "v")))]
+ "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+ "vsel %0,%3,%2,%1"
+ [(set_attr "type" "vecperm")])
+
;; Fused multiply add
(define_insn "altivec_vmaddfp"
[(set (match_operand:V4SF 0 "register_operand" "=v")
(plus:V4SF (mult:V4SF (match_operand:V4SF 1 "register_operand" "v")
(match_operand:V4SF 2 "register_operand" "v"))
(match_operand:V4SF 3 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vmaddfp %0,%1,%2,%3"
[(set_attr "type" "vecfloat")])
;; We do multiply as a fused multiply-add with an add of a -0.0 vector.
-(define_expand "mulv4sf3"
+(define_expand "altivec_mulv4sf3"
[(use (match_operand:V4SF 0 "register_operand" ""))
(use (match_operand:V4SF 1 "register_operand" ""))
(use (match_operand:V4SF 2 "register_operand" ""))]
- "TARGET_ALTIVEC && TARGET_FUSED_MADD"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode) && TARGET_FUSED_MADD"
"
{
rtx neg0;
@@ -684,7 +563,7 @@ (define_insn "altivec_vnmsubfp"
(neg:V4SF (minus:V4SF (mult:V4SF (match_operand:V4SF 1 "register_operand" "v")
(match_operand:V4SF 2 "register_operand" "v"))
(match_operand:V4SF 3 "register_operand" "v"))))]
- "TARGET_ALTIVEC"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vnmsubfp %0,%1,%2,%3"
[(set_attr "type" "vecfloat")])
@@ -758,11 +637,11 @@ (define_insn "smax<mode>3"
"vmaxs<VI_char> %0,%1,%2"
[(set_attr "type" "vecsimple")])
-(define_insn "smaxv4sf3"
+(define_insn "*altivec_smaxv4sf3"
[(set (match_operand:V4SF 0 "register_operand" "=v")
(smax:V4SF (match_operand:V4SF 1 "register_operand" "v")
(match_operand:V4SF 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vmaxfp %0,%1,%2"
[(set_attr "type" "veccmp")])
@@ -782,11 +661,11 @@ (define_insn "smin<mode>3"
"vmins<VI_char> %0,%1,%2"
[(set_attr "type" "vecsimple")])
-(define_insn "sminv4sf3"
+(define_insn "*altivec_sminv4sf3"
[(set (match_operand:V4SF 0 "register_operand" "=v")
(smin:V4SF (match_operand:V4SF 1 "register_operand" "v")
(match_operand:V4SF 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vminfp %0,%1,%2"
[(set_attr "type" "veccmp")])
@@ -905,7 +784,7 @@ (define_insn "altivec_vmrghw"
"vmrghw %0,%1,%2"
[(set_attr "type" "vecperm")])
-(define_insn "altivec_vmrghsf"
+(define_insn "*altivec_vmrghsf"
[(set (match_operand:V4SF 0 "register_operand" "=v")
(vec_merge:V4SF (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v")
(parallel [(const_int 0)
@@ -918,7 +797,7 @@ (define_insn "altivec_vmrghsf"
(const_int 3)
(const_int 1)]))
(const_int 5)))]
- "TARGET_ALTIVEC"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vmrghw %0,%1,%2"
[(set_attr "type" "vecperm")])
@@ -990,35 +869,37 @@ (define_insn "altivec_vmrglh"
(define_insn "altivec_vmrglw"
[(set (match_operand:V4SI 0 "register_operand" "=v")
- (vec_merge:V4SI (vec_select:V4SI (match_operand:V4SI 1 "register_operand" "v")
- (parallel [(const_int 2)
- (const_int 0)
- (const_int 3)
- (const_int 1)]))
- (vec_select:V4SI (match_operand:V4SI 2 "register_operand" "v")
- (parallel [(const_int 0)
- (const_int 2)
- (const_int 1)
- (const_int 3)]))
- (const_int 5)))]
+ (vec_merge:V4SI
+ (vec_select:V4SI (match_operand:V4SI 1 "register_operand" "v")
+ (parallel [(const_int 2)
+ (const_int 0)
+ (const_int 3)
+ (const_int 1)]))
+ (vec_select:V4SI (match_operand:V4SI 2 "register_operand" "v")
+ (parallel [(const_int 0)
+ (const_int 2)
+ (const_int 1)
+ (const_int 3)]))
+ (const_int 5)))]
"TARGET_ALTIVEC"
"vmrglw %0,%1,%2"
[(set_attr "type" "vecperm")])
-(define_insn "altivec_vmrglsf"
+(define_insn "*altivec_vmrglsf"
[(set (match_operand:V4SF 0 "register_operand" "=v")
- (vec_merge:V4SF (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v")
- (parallel [(const_int 2)
- (const_int 0)
- (const_int 3)
- (const_int 1)]))
- (vec_select:V4SF (match_operand:V4SF 2 "register_operand" "v")
- (parallel [(const_int 0)
- (const_int 2)
- (const_int 1)
- (const_int 3)]))
- (const_int 5)))]
- "TARGET_ALTIVEC"
+ (vec_merge:V4SF
+ (vec_select:V4SF (match_operand:V4SF 1 "register_operand" "v")
+ (parallel [(const_int 2)
+ (const_int 0)
+ (const_int 3)
+ (const_int 1)]))
+ (vec_select:V4SF (match_operand:V4SF 2 "register_operand" "v")
+ (parallel [(const_int 0)
+ (const_int 2)
+ (const_int 1)
+ (const_int 3)]))
+ (const_int 5)))]
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vmrglw %0,%1,%2"
[(set_attr "type" "vecperm")])
@@ -1095,68 +976,53 @@ (define_insn "altivec_vmulosh"
[(set_attr "type" "veccomplex")])
-;; logical ops
+;; logical ops. Have the logical ops follow the memory ops in
+;; terms of whether to prefer VSX or Altivec
-(define_insn "and<mode>3"
- [(set (match_operand:VI 0 "register_operand" "=v")
- (and:VI (match_operand:VI 1 "register_operand" "v")
- (match_operand:VI 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+(define_insn "*altivec_and<mode>3"
+ [(set (match_operand:VM 0 "register_operand" "=v")
+ (and:VM (match_operand:VM 1 "register_operand" "v")
+ (match_operand:VM 2 "register_operand" "v")))]
+ "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
"vand %0,%1,%2"
[(set_attr "type" "vecsimple")])
-(define_insn "ior<mode>3"
- [(set (match_operand:VI 0 "register_operand" "=v")
- (ior:VI (match_operand:VI 1 "register_operand" "v")
- (match_operand:VI 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+(define_insn "*altivec_ior<mode>3"
+ [(set (match_operand:VM 0 "register_operand" "=v")
+ (ior:VM (match_operand:VM 1 "register_operand" "v")
+ (match_operand:VM 2 "register_operand" "v")))]
+ "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
"vor %0,%1,%2"
[(set_attr "type" "vecsimple")])
-(define_insn "xor<mode>3"
- [(set (match_operand:VI 0 "register_operand" "=v")
- (xor:VI (match_operand:VI 1 "register_operand" "v")
- (match_operand:VI 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+(define_insn "*altivec_xor<mode>3"
+ [(set (match_operand:VM 0 "register_operand" "=v")
+ (xor:VM (match_operand:VM 1 "register_operand" "v")
+ (match_operand:VM 2 "register_operand" "v")))]
+ "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
"vxor %0,%1,%2"
[(set_attr "type" "vecsimple")])
-(define_insn "xorv4sf3"
- [(set (match_operand:V4SF 0 "register_operand" "=v")
- (xor:V4SF (match_operand:V4SF 1 "register_operand" "v")
- (match_operand:V4SF 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
- "vxor %0,%1,%2"
- [(set_attr "type" "vecsimple")])
-
-(define_insn "one_cmpl<mode>2"
- [(set (match_operand:VI 0 "register_operand" "=v")
- (not:VI (match_operand:VI 1 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+(define_insn "*altivec_one_cmpl<mode>2"
+ [(set (match_operand:VM 0 "register_operand" "=v")
+ (not:VM (match_operand:VM 1 "register_operand" "v")))]
+ "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
"vnor %0,%1,%1"
[(set_attr "type" "vecsimple")])
-(define_insn "altivec_nor<mode>3"
- [(set (match_operand:VI 0 "register_operand" "=v")
- (not:VI (ior:VI (match_operand:VI 1 "register_operand" "v")
- (match_operand:VI 2 "register_operand" "v"))))]
- "TARGET_ALTIVEC"
+(define_insn "*altivec_nor<mode>3"
+ [(set (match_operand:VM 0 "register_operand" "=v")
+ (not:VM (ior:VM (match_operand:VM 1 "register_operand" "v")
+ (match_operand:VM 2 "register_operand" "v"))))]
+ "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
"vnor %0,%1,%2"
[(set_attr "type" "vecsimple")])
-(define_insn "andc<mode>3"
- [(set (match_operand:VI 0 "register_operand" "=v")
- (and:VI (not:VI (match_operand:VI 2 "register_operand" "v"))
- (match_operand:VI 1 "register_operand" "v")))]
- "TARGET_ALTIVEC"
- "vandc %0,%1,%2"
- [(set_attr "type" "vecsimple")])
-
-(define_insn "*andc3_v4sf"
- [(set (match_operand:V4SF 0 "register_operand" "=v")
- (and:V4SF (not:V4SF (match_operand:V4SF 2 "register_operand" "v"))
- (match_operand:V4SF 1 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+(define_insn "*altivec_andc<mode>3"
+ [(set (match_operand:VM 0 "register_operand" "=v")
+ (and:VM (not:VM (match_operand:VM 2 "register_operand" "v"))
+ (match_operand:VM 1 "register_operand" "v")))]
+ "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
"vandc %0,%1,%2"
[(set_attr "type" "vecsimple")])
@@ -1392,7 +1258,7 @@ (define_insn "*altivec_vspltsf"
(vec_select:SF (match_operand:V4SF 1 "register_operand" "v")
(parallel
[(match_operand:QI 2 "u5bit_cint_operand" "i")]))))]
- "TARGET_ALTIVEC"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vspltw %0,%1,%2"
[(set_attr "type" "vecperm")])
@@ -1404,19 +1270,19 @@ (define_insn "altivec_vspltis<VI_char>"
"vspltis<VI_char> %0,%1"
[(set_attr "type" "vecperm")])
-(define_insn "ftruncv4sf2"
+(define_insn "*altivec_ftruncv4sf2"
[(set (match_operand:V4SF 0 "register_operand" "=v")
(fix:V4SF (match_operand:V4SF 1 "register_operand" "v")))]
- "TARGET_ALTIVEC"
+ "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
"vrfiz %0,%1"
[(set_attr "type" "vecfloat")])
(define_insn "altivec_vperm_<mode>"
- [(set (match_operand:V 0 "register_operand" "=v")
- (unspec:V [(match_operand:V 1 "register_operand" "v")
- (match_operand:V 2 "register_operand" "v")
- (match_operand:V16QI 3 "register_operand" "v")]
- UNSPEC_VPERM))]
+ [(set (match_operand:VM 0 "register_operand" "=v")
+ (unspec:VM [(match_operand:VM 1 "register_operand" "v")
+ (match_operand:VM 2 "register_operand" "v")
+ (match_operand:V16QI 3 "register_operand" "v")]
+ UNSPEC_VPERM))]
"TARGET_ALTIVEC"
"vperm %0,%1,%2,%3"
[(set_attr "type" "vecperm")])
@@ -1515,180 +1381,6 @@ (define_insn "altivec_vrefp"
"vrefp %0,%1"
[(set_attr "type" "vecfloat")])
-(define_expand "vcondv4si"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (if_then_else:V4SI
- (match_operator 3 "comparison_operator"
- [(match_operand:V4SI 4 "register_operand" "v")
- (match_operand:V4SI 5 "register_operand" "v")])
- (match_operand:V4SI 1 "register_operand" "v")
- (match_operand:V4SI 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
- "
-{
- if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
- operands[3], operands[4], operands[5]))
- DONE;
- else
- FAIL;
-}
- ")
-
-(define_expand "vconduv4si"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (if_then_else:V4SI
- (match_operator 3 "comparison_operator"
- [(match_operand:V4SI 4 "register_operand" "v")
- (match_operand:V4SI 5 "register_operand" "v")])
- (match_operand:V4SI 1 "register_operand" "v")
- (match_operand:V4SI 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
- "
-{
- if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
- operands[3], operands[4], operands[5]))
- DONE;
- else
- FAIL;
-}
- ")
-
-(define_expand "vcondv4sf"
- [(set (match_operand:V4SF 0 "register_operand" "=v")
- (if_then_else:V4SF
- (match_operator 3 "comparison_operator"
- [(match_operand:V4SF 4 "register_operand" "v")
- (match_operand:V4SF 5 "register_operand" "v")])
- (match_operand:V4SF 1 "register_operand" "v")
- (match_operand:V4SF 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
- "
-{
- if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
- operands[3], operands[4], operands[5]))
- DONE;
- else
- FAIL;
-}
- ")
-
-(define_expand "vcondv8hi"
- [(set (match_operand:V8HI 0 "register_operand" "=v")
- (if_then_else:V8HI
- (match_operator 3 "comparison_operator"
- [(match_operand:V8HI 4 "register_operand" "v")
- (match_operand:V8HI 5 "register_operand" "v")])
- (match_operand:V8HI 1 "register_operand" "v")
- (match_operand:V8HI 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
- "
-{
- if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
- operands[3], operands[4], operands[5]))
- DONE;
- else
- FAIL;
-}
- ")
-
-(define_expand "vconduv8hi"
- [(set (match_operand:V8HI 0 "register_operand" "=v")
- (if_then_else:V8HI
- (match_operator 3 "comparison_operator"
- [(match_operand:V8HI 4 "register_operand" "v")
- (match_operand:V8HI 5 "register_operand" "v")])
- (match_operand:V8HI 1 "register_operand" "v")
- (match_operand:V8HI 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
- "
-{
- if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
- operands[3], operands[4], operands[5]))
- DONE;
- else
- FAIL;
-}
- ")
-
-(define_expand "vcondv16qi"
- [(set (match_operand:V16QI 0 "register_operand" "=v")
- (if_then_else:V16QI
- (match_operator 3 "comparison_operator"
- [(match_operand:V16QI 4 "register_operand" "v")
- (match_operand:V16QI 5 "register_operand" "v")])
- (match_operand:V16QI 1 "register_operand" "v")
- (match_operand:V16QI 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
- "
-{
- if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
- operands[3], operands[4], operands[5]))
- DONE;
- else
- FAIL;
-}
- ")
-
-(define_expand "vconduv16qi"
- [(set (match_operand:V16QI 0 "register_operand" "=v")
- (if_then_else:V16QI
- (match_operator 3 "comparison_operator"
- [(match_operand:V16QI 4 "register_operand" "v")
- (match_operand:V16QI 5 "register_operand" "v")])
- (match_operand:V16QI 1 "register_operand" "v")
- (match_operand:V16QI 2 "register_operand" "v")))]
- "TARGET_ALTIVEC"
- "
-{
- if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
- operands[3], operands[4], operands[5]))
- DONE;
- else
- FAIL;
-}
- ")
-
-
-(define_insn "altivec_vsel_v4si"
- [(set (match_operand:V4SI 0 "register_operand" "=v")
- (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
- (match_operand:V4SI 2 "register_operand" "v")
- (match_operand:V4SI 3 "register_operand" "v")]
- UNSPEC_VSEL4SI))]
- "TARGET_ALTIVEC"
- "vsel %0,%1,%2,%3"
- [(set_attr "type" "vecperm")])
-
-(define_insn "altivec_vsel_v4sf"
- [(set (match_operand:V4SF 0 "register_operand" "=v")
- (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")
- (match_operand:V4SF 2 "register_operand" "v")
- (match_operand:V4SI 3 "register_operand" "v")]
- UNSPEC_VSEL4SF))]
- "TARGET_ALTIVEC"
- "vsel %0,%1,%2,%3"
- [(set_attr "type" "vecperm")])
-
-(define_insn "altivec_vsel_v8hi"
- [(set (match_operand:V8HI 0 "register_operand" "=v")
- (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
- (match_operand:V8HI 2 "register_operand" "v")
- (match_operand:V8HI 3 "register_operand" "v")]
- UNSPEC_VSEL8HI))]
- "TARGET_ALTIVEC"
- "vsel %0,%1,%2,%3"
- [(set_attr "type" "vecperm")])
-
-(define_insn "altivec_vsel_v16qi"
- [(set (match_operand:V16QI 0 "register_operand" "=v")
- (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
- (match_operand:V16QI 2 "register_operand" "v")
- (match_operand:V16QI 3 "register_operand" "v")]
- UNSPEC_VSEL16QI))]
- "TARGET_ALTIVEC"
- "vsel %0,%1,%2,%3"
- [(set_attr "type" "vecperm")])
-
(define_insn "altivec_vsldoi_<mode>"
[(set (match_operand:V 0 "register_operand" "=v")
(unspec:V [(match_operand:V 1 "register_operand" "v")
@@ -1878,6 +1570,14 @@ (define_expand "build_vector_mask_for_lo
gcc_assert (GET_CODE (operands[1]) == MEM);
addr = XEXP (operands[1], 0);
+ if (VECTOR_MEM_VSX_P (GET_MODE (operands[1])))
+ {
+ /* VSX doesn't and off the bottom address bits, and memory
+ operations are aligned to the natural data type. */
+ emit_insn (gen_rtx_SET (VOIDmode, operands[0], operands[1]));
+ DONE;
+ }
+
temp = gen_reg_rtx (GET_MODE (addr));
emit_insn (gen_rtx_SET (VOIDmode, temp,
gen_rtx_NEG (GET_MODE (addr), addr)));
@@ -1959,95 +1659,6 @@ (define_insn "*altivec_stvesfx"
"stvewx %1,%y0"
[(set_attr "type" "vecstore")])
-(define_expand "vec_init<mode>"
- [(match_operand:V 0 "register_operand" "")
- (match_operand 1 "" "")]
- "TARGET_ALTIVEC"
-{
- rs6000_expand_vector_init (operands[0], operands[1]);
- DONE;
-})
-
-(define_expand "vec_setv4si"
- [(match_operand:V4SI 0 "register_operand" "")
- (match_operand:SI 1 "register_operand" "")
- (match_operand 2 "const_int_operand" "")]
- "TARGET_ALTIVEC"
-{
- rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2]));
- DONE;
-})
-
-(define_expand "vec_setv8hi"
- [(match_operand:V8HI 0 "register_operand" "")
- (match_operand:HI 1 "register_operand" "")
- (match_operand 2 "const_int_operand" "")]
- "TARGET_ALTIVEC"
-{
- rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2]));
- DONE;
-})
-
-(define_expand "vec_setv16qi"
- [(match_operand:V16QI 0 "register_operand" "")
- (match_operand:QI 1 "register_operand" "")
- (match_operand 2 "const_int_operand" "")]
- "TARGET_ALTIVEC"
-{
- rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2]));
- DONE;
-})
-
-(define_expand "vec_setv4sf"
- [(match_operand:V4SF 0 "register_operand" "")
- (match_operand:SF 1 "register_operand" "")
- (match_operand 2 "const_int_operand" "")]
- "TARGET_ALTIVEC"
-{
- rs6000_expand_vector_set (operands[0], operands[1], INTVAL (operands[2]));
- DONE;
-})
-
-(define_expand "vec_extractv4si"
- [(match_operand:SI 0 "register_operand" "")
- (match_operand:V4SI 1 "register_operand" "")
- (match_operand 2 "const_int_operand" "")]
- "TARGET_ALTIVEC"
-{
- rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2]));
- DONE;
-})
-
-(define_expand "vec_extractv8hi"
- [(match_operand:HI 0 "register_operand" "")
- (match_operand:V8HI 1 "register_operand" "")
- (match_operand 2 "const_int_operand" "")]
- "TARGET_ALTIVEC"
-{
- rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2]));
- DONE;
-})
-
-(define_expand "vec_extractv16qi"
- [(match_operand:QI 0 "register_operand" "")
- (match_operand:V16QI 1 "register_operand" "")
- (match_operand 2 "const_int_operand" "")]
- "TARGET_ALTIVEC"
-{
- rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2]));
- DONE;
-})
-
-(define_expand "vec_extractv4sf"
- [(match_operand:SF 0 "register_operand" "")
- (match_operand:V4SF 1 "register_operand" "")
- (match_operand 2 "const_int_operand" "")]
- "TARGET_ALTIVEC"
-{
- rs6000_expand_vector_extract (operands[0], operands[1], INTVAL (operands[2]));
- DONE;
-})
-
;; Generate
;; vspltis? SCRATCH0,0
;; vsubu?m SCRATCH2,SCRATCH1,%1
@@ -2069,7 +1680,7 @@ (define_expand "abs<mode>2"
;; vspltisw SCRATCH1,-1
;; vslw SCRATCH2,SCRATCH1,SCRATCH1
;; vandc %0,%1,SCRATCH2
-(define_expand "absv4sf2"
+(define_expand "altivec_absv4sf2"
[(set (match_dup 2)
(vec_duplicate:V4SI (const_int -1)))
(set (match_dup 3)
@@ -2132,7 +1743,7 @@ (define_expand "vec_shl_<mode>"
DONE;
}")
-;; Vector shift left in bits. Currently supported ony for shift
+;; Vector shift right in bits. Currently supported ony for shift
;; amounts that can be expressed as byte shifts (divisible by 8).
;; General shift amounts can be supported using vsro + vsr. We're
;; not expecting to see these yet (the vectorizer currently
@@ -2665,7 +2276,7 @@ (define_expand "vec_pack_trunc_v4si"
DONE;
}")
-(define_expand "negv4sf2"
+(define_expand "altivec_negv4sf2"
[(use (match_operand:V4SF 0 "register_operand" ""))
(use (match_operand:V4SF 1 "register_operand" ""))]
"TARGET_ALTIVEC"
@@ -2994,29 +2605,6 @@ (define_expand "vec_extract_oddv16qi"
emit_insn (gen_vpkuhum_nomode (operands[0], operands[1], operands[2]));
DONE;
}")
-(define_expand "vec_interleave_highv4sf"
- [(set (match_operand:V4SF 0 "register_operand" "")
- (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "")
- (match_operand:V4SF 2 "register_operand" "")]
- UNSPEC_INTERHI_V4SF))]
- "TARGET_ALTIVEC"
- "
-{
- emit_insn (gen_altivec_vmrghsf (operands[0], operands[1], operands[2]));
- DONE;
-}")
-
-(define_expand "vec_interleave_lowv4sf"
- [(set (match_operand:V4SF 0 "register_operand" "")
- (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "")
- (match_operand:V4SF 2 "register_operand" "")]
- UNSPEC_INTERLO_V4SF))]
- "TARGET_ALTIVEC"
- "
-{
- emit_insn (gen_altivec_vmrglsf (operands[0], operands[1], operands[2]));
- DONE;
-}")
(define_expand "vec_interleave_high<mode>"
[(set (match_operand:VI 0 "register_operand" "")
--- gcc/config/rs6000/aix61.h (.../trunk) (revision 144557)
+++ gcc/config/rs6000/aix61.h (.../branches/ibm/power7-meissner) (revision 144730)
@@ -57,20 +57,24 @@ do { \
#undef ASM_SPEC
#define ASM_SPEC "-u %{maix64:-a64 %{!mcpu*:-mppc64}} %(asm_cpu)"
-/* Common ASM definitions used by ASM_SPEC amongst the various targets
- for handling -mcpu=xxx switches. */
+/* Common ASM definitions used by ASM_SPEC amongst the various targets for
+ handling -mcpu=xxx switches. There is a parallel list in driver-rs6000.c to
+ provide the default assembler options if the user uses -mcpu=native, so if
+ you make changes here, make them there also. */
#undef ASM_CPU_SPEC
#define ASM_CPU_SPEC \
"%{!mcpu*: %{!maix64: \
%{mpowerpc64: -mppc64} \
%{maltivec: -m970} \
%{!maltivec: %{!mpower64: %(asm_default)}}}} \
+%{mcpu=native: %(asm_cpu_native)} \
%{mcpu=power3: -m620} \
%{mcpu=power4: -mpwr4} \
%{mcpu=power5: -mpwr5} \
%{mcpu=power5+: -mpwr5x} \
%{mcpu=power6: -mpwr6} \
%{mcpu=power6x: -mpwr6} \
+%{mcpu=power7: -mpwr7} \
%{mcpu=powerpc: -mppc} \
%{mcpu=rs64a: -mppc} \
%{mcpu=603: -m603} \
--- gcc/config/rs6000/rs6000.md (.../trunk) (revision 144557)
+++ gcc/config/rs6000/rs6000.md (.../branches/ibm/power7-meissner) (revision 144730)
@@ -138,7 +138,7 @@ (define_attr "length" ""
;; Processor type -- this attribute must exactly match the processor_type
;; enumeration in rs6000.h.
-(define_attr "cpu" "rios1,rios2,rs64a,mpccore,ppc403,ppc405,ppc440,ppc601,ppc603,ppc604,ppc604e,ppc620,ppc630,ppc750,ppc7400,ppc7450,ppc8540,ppce300c2,ppce300c3,ppce500mc,power4,power5,power6,cell"
+(define_attr "cpu" "rios1,rios2,rs64a,mpccore,ppc403,ppc405,ppc440,ppc601,ppc603,ppc604,ppc604e,ppc620,ppc630,ppc750,ppc7400,ppc7450,ppc8540,ppce300c2,ppce300c3,ppce500mc,power4,power5,power6,power7,cell"
(const (symbol_ref "rs6000_cpu_attr")))
@@ -167,6 +167,7 @@ (define_attr "cell_micro" "not,condition
(include "power4.md")
(include "power5.md")
(include "power6.md")
+(include "power7.md")
(include "cell.md")
(include "xfpu.md")
@@ -218,6 +219,9 @@ (define_mode_attr wd [(QI "b") (HI "h")
; DImode bits
(define_mode_attr dbits [(QI "56") (HI "48") (SI "32")])
+;; ISEL/ISEL64 target selection
+(define_mode_attr sel [(SI "") (DI "64")])
+
;; Start with fixed-point load and store insns. Here we put only the more
;; complex forms. Basic data transfer is done later.
@@ -520,7 +524,7 @@ (define_insn ""
"@
{andil.|andi.} %2,%1,0xff
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -546,7 +550,7 @@ (define_insn ""
"@
{andil.|andi.} %0,%1,0xff
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -687,7 +691,7 @@ (define_insn ""
"@
{andil.|andi.} %2,%1,0xff
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -713,7 +717,7 @@ (define_insn ""
"@
{andil.|andi.} %0,%1,0xff
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -856,7 +860,7 @@ (define_insn ""
"@
{andil.|andi.} %2,%1,0xffff
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -882,7 +886,7 @@ (define_insn ""
"@
{andil.|andi.} %0,%1,0xffff
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -1670,7 +1674,7 @@ (define_insn ""
"@
nor. %2,%1,%1
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -1696,7 +1700,7 @@ (define_insn ""
"@
nor. %0,%1,%1
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -2221,10 +2225,22 @@ (define_insn "popcntb<mode>2"
"TARGET_POPCNTB"
"popcntb %0,%1")
+(define_insn "popcntwsi2"
+ [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+ (popcount:SI (match_operand:SI 1 "gpc_reg_operand" "r")))]
+ "TARGET_POPCNTD"
+ "popcntw %0,%1")
+
+(define_insn "popcntddi2"
+ [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+ (popcount:DI (match_operand:DI 1 "gpc_reg_operand" "r")))]
+ "TARGET_POPCNTD && TARGET_POWERPC64"
+ "popcntd %0,%1")
+
(define_expand "popcount<mode>2"
[(set (match_operand:GPR 0 "gpc_reg_operand" "")
(popcount:GPR (match_operand:GPR 1 "gpc_reg_operand" "")))]
- "TARGET_POPCNTB"
+ "TARGET_POPCNTB || TARGET_POPCNTD"
{
rs6000_emit_popcount (operands[0], operands[1]);
DONE;
@@ -2852,7 +2868,7 @@ (define_insn "andsi3_mc"
{rlinm|rlwinm} %0,%1,0,%m2,%M2
{andil.|andi.} %0,%1,%b2
{andiu.|andis.} %0,%1,%u2"
- [(set_attr "type" "*,*,compare,compare")])
+ [(set_attr "type" "*,*,fast_compare,fast_compare")])
(define_insn "andsi3_nomc"
[(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
@@ -2895,7 +2911,8 @@ (define_insn "*andsi3_internal2_mc"
#
#
#"
- [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare")
+ [(set_attr "type" "fast_compare,fast_compare,fast_compare,delayed_compare,\
+ compare,compare,compare,compare")
(set_attr "length" "4,4,4,4,8,8,8,8")])
(define_insn "*andsi3_internal3_mc"
@@ -2915,7 +2932,8 @@ (define_insn "*andsi3_internal3_mc"
#
#
#"
- [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare")
+ [(set_attr "type" "compare,fast_compare,fast_compare,delayed_compare,compare,\
+ compare,compare,compare")
(set_attr "length" "8,4,4,4,8,8,8,8")])
(define_split
@@ -2974,7 +2992,8 @@ (define_insn "*andsi3_internal4"
#
#
#"
- [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare")
+ [(set_attr "type" "fast_compare,fast_compare,fast_compare,delayed_compare,\
+ compare,compare,compare,compare")
(set_attr "length" "4,4,4,4,8,8,8,8")])
(define_insn "*andsi3_internal5_mc"
@@ -2996,7 +3015,8 @@ (define_insn "*andsi3_internal5_mc"
#
#
#"
- [(set_attr "type" "compare,compare,compare,delayed_compare,compare,compare,compare,compare")
+ [(set_attr "type" "compare,fast_compare,fast_compare,delayed_compare,compare,\
+ compare,compare,compare")
(set_attr "length" "8,4,4,4,8,8,8,8")])
(define_split
@@ -3127,7 +3147,7 @@ (define_insn "*boolsi3_internal2"
"@
%q4. %3,%1,%2
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -3156,7 +3176,7 @@ (define_insn "*boolsi3_internal3"
"@
%q4. %0,%1,%2
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -3281,7 +3301,7 @@ (define_insn "*boolccsi3_internal2"
"@
%q4. %3,%1,%2
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -3310,7 +3330,7 @@ (define_insn "*boolccsi3_internal3"
"@
%q4. %0,%1,%2
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -5303,7 +5323,7 @@ (define_insn "fres"
"fres %0,%1"
[(set_attr "type" "fp")])
-(define_insn ""
+(define_insn "*fmaddsf4_powerpc"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5314,7 +5334,7 @@ (define_insn ""
[(set_attr "type" "fp")
(set_attr "fp_type" "fp_maddsub_s")])
-(define_insn ""
+(define_insn "*fmaddsf4_power"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5323,7 +5343,7 @@ (define_insn ""
"{fma|fmadd} %0,%1,%2,%3"
[(set_attr "type" "dmul")])
-(define_insn ""
+(define_insn "*fmsubsf4_powerpc"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5334,7 +5354,7 @@ (define_insn ""
[(set_attr "type" "fp")
(set_attr "fp_type" "fp_maddsub_s")])
-(define_insn ""
+(define_insn "*fmsubsf4_power"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5343,7 +5363,7 @@ (define_insn ""
"{fms|fmsub} %0,%1,%2,%3"
[(set_attr "type" "dmul")])
-(define_insn ""
+(define_insn "*fnmaddsf4_powerpc_1"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(neg:SF (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5354,7 +5374,7 @@ (define_insn ""
[(set_attr "type" "fp")
(set_attr "fp_type" "fp_maddsub_s")])
-(define_insn ""
+(define_insn "*fnmaddsf4_powerpc_2"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f"))
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5365,7 +5385,7 @@ (define_insn ""
[(set_attr "type" "fp")
(set_attr "fp_type" "fp_maddsub_s")])
-(define_insn ""
+(define_insn "*fnmaddsf4_power_1"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(neg:SF (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5374,7 +5394,7 @@ (define_insn ""
"{fnma|fnmadd} %0,%1,%2,%3"
[(set_attr "type" "dmul")])
-(define_insn ""
+(define_insn "*fnmaddsf4_power_2"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f"))
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5384,7 +5404,7 @@ (define_insn ""
"{fnma|fnmadd} %0,%1,%2,%3"
[(set_attr "type" "dmul")])
-(define_insn ""
+(define_insn "*fnmsubsf4_powerpc_1"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(neg:SF (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5395,7 +5415,7 @@ (define_insn ""
[(set_attr "type" "fp")
(set_attr "fp_type" "fp_maddsub_s")])
-(define_insn ""
+(define_insn "*fnmsubsf4_powerpc_2"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(minus:SF (match_operand:SF 3 "gpc_reg_operand" "f")
(mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
@@ -5406,7 +5426,7 @@ (define_insn ""
[(set_attr "type" "fp")
(set_attr "fp_type" "fp_maddsub_s")])
-(define_insn ""
+(define_insn "*fnmsubsf4_power_1"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(neg:SF (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
(match_operand:SF 2 "gpc_reg_operand" "f"))
@@ -5415,7 +5435,7 @@ (define_insn ""
"{fnms|fnmsub} %0,%1,%2,%3"
[(set_attr "type" "dmul")])
-(define_insn ""
+(define_insn "*fnmsubsf4_power_2"
[(set (match_operand:SF 0 "gpc_reg_operand" "=f")
(minus:SF (match_operand:SF 3 "gpc_reg_operand" "f")
(mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
@@ -5496,9 +5516,18 @@ (define_expand "copysigndf3"
(match_dup 5))
(match_dup 3)
(match_dup 4)))]
- "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
- && !HONOR_NANS (DFmode) && !HONOR_SIGNED_ZEROS (DFmode)"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && ((TARGET_PPC_GFXOPT
+ && !HONOR_NANS (DFmode)
+ && !HONOR_SIGNED_ZEROS (DFmode))
+ || VECTOR_UNIT_VSX_P (DFmode))"
{
+ if (VECTOR_UNIT_VSX_P (DFmode))
+ {
+ emit_insn (gen_vsx_copysigndf3 (operands[0], operands[1],
+ operands[2]));
+ DONE;
+ }
operands[3] = gen_reg_rtx (DFmode);
operands[4] = gen_reg_rtx (DFmode);
operands[5] = CONST0_RTX (DFmode);
@@ -5542,12 +5571,12 @@ (define_split
DONE;
}")
-(define_expand "movsicc"
- [(set (match_operand:SI 0 "gpc_reg_operand" "")
- (if_then_else:SI (match_operand 1 "comparison_operator" "")
- (match_operand:SI 2 "gpc_reg_operand" "")
- (match_operand:SI 3 "gpc_reg_operand" "")))]
- "TARGET_ISEL"
+(define_expand "mov<mode>cc"
+ [(set (match_operand:GPR 0 "gpc_reg_operand" "")
+ (if_then_else:GPR (match_operand 1 "comparison_operator" "")
+ (match_operand:GPR 2 "gpc_reg_operand" "")
+ (match_operand:GPR 3 "gpc_reg_operand" "")))]
+ "TARGET_ISEL<sel>"
"
{
if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
@@ -5564,28 +5593,28 @@ (define_expand "movsicc"
;; leave out the mode in operand 4 and use one pattern, but reload can
;; change the mode underneath our feet and then gets confused trying
;; to reload the value.
-(define_insn "isel_signed"
- [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
- (if_then_else:SI
+(define_insn "isel_signed_<mode>"
+ [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+ (if_then_else:GPR
(match_operator 1 "comparison_operator"
[(match_operand:CC 4 "cc_reg_operand" "y")
(const_int 0)])
- (match_operand:SI 2 "gpc_reg_operand" "b")
- (match_operand:SI 3 "gpc_reg_operand" "b")))]
- "TARGET_ISEL"
+ (match_operand:GPR 2 "gpc_reg_operand" "b")
+ (match_operand:GPR 3 "gpc_reg_operand" "b")))]
+ "TARGET_ISEL<sel>"
"*
{ return output_isel (operands); }"
[(set_attr "length" "4")])
-(define_insn "isel_unsigned"
- [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
- (if_then_else:SI
+(define_insn "isel_unsigned_<mode>"
+ [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+ (if_then_else:GPR
(match_operator 1 "comparison_operator"
[(match_operand:CCUNS 4 "cc_reg_operand" "y")
(const_int 0)])
- (match_operand:SI 2 "gpc_reg_operand" "b")
- (match_operand:SI 3 "gpc_reg_operand" "b")))]
- "TARGET_ISEL"
+ (match_operand:GPR 2 "gpc_reg_operand" "b")
+ (match_operand:GPR 3 "gpc_reg_operand" "b")))]
+ "TARGET_ISEL<sel>"
"*
{ return output_isel (operands); }"
[(set_attr "length" "4")])
@@ -5633,7 +5662,8 @@ (define_expand "negdf2"
(define_insn "*negdf2_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(neg:DF (match_operand:DF 1 "gpc_reg_operand" "f")))]
- "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"fneg %0,%1"
[(set_attr "type" "fp")])
@@ -5646,14 +5676,16 @@ (define_expand "absdf2"
(define_insn "*absdf2_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(abs:DF (match_operand:DF 1 "gpc_reg_operand" "f")))]
- "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"fabs %0,%1"
[(set_attr "type" "fp")])
(define_insn "*nabsdf2_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(neg:DF (abs:DF (match_operand:DF 1 "gpc_reg_operand" "f"))))]
- "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"fnabs %0,%1"
[(set_attr "type" "fp")])
@@ -5668,7 +5700,8 @@ (define_insn "*adddf3_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(plus:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
(match_operand:DF 2 "gpc_reg_operand" "f")))]
- "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"{fa|fadd} %0,%1,%2"
[(set_attr "type" "fp")
(set_attr "fp_type" "fp_addsub_d")])
@@ -5684,7 +5717,8 @@ (define_insn "*subdf3_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(minus:DF (match_operand:DF 1 "gpc_reg_operand" "f")
(match_operand:DF 2 "gpc_reg_operand" "f")))]
- "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"{fs|fsub} %0,%1,%2"
[(set_attr "type" "fp")
(set_attr "fp_type" "fp_addsub_d")])
@@ -5700,7 +5734,8 @@ (define_insn "*muldf3_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
(match_operand:DF 2 "gpc_reg_operand" "f")))]
- "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"{fm|fmul} %0,%1,%2"
[(set_attr "type" "dmul")
(set_attr "fp_type" "fp_mul_d")])
@@ -5718,7 +5753,8 @@ (define_insn "*divdf3_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(div:DF (match_operand:DF 1 "gpc_reg_operand" "f")
(match_operand:DF 2 "gpc_reg_operand" "f")))]
- "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && !TARGET_SIMPLE_FPU"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && !TARGET_SIMPLE_FPU
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"{fd|fdiv} %0,%1,%2"
[(set_attr "type" "ddiv")])
@@ -5734,73 +5770,81 @@ (define_expand "recipdf3"
DONE;
})
-(define_insn "fred"
+(define_expand "fred"
+ [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
+ (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRES))]
+ "(TARGET_POPCNTB || VECTOR_UNIT_VSX_P (DFmode)) && flag_finite_math_only"
+ "")
+
+(define_insn "*fred_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRES))]
- "TARGET_POPCNTB && flag_finite_math_only"
+ "TARGET_POPCNTB && flag_finite_math_only && !VECTOR_UNIT_VSX_P (DFmode)"
"fre %0,%1"
[(set_attr "type" "fp")])
-(define_insn ""
+(define_insn "*fmadddf4_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
(match_operand:DF 2 "gpc_reg_operand" "f"))
(match_operand:DF 3 "gpc_reg_operand" "f")))]
- "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT
+ && VECTOR_UNIT_NONE_P (DFmode)"
"{fma|fmadd} %0,%1,%2,%3"
[(set_attr "type" "dmul")
(set_attr "fp_type" "fp_maddsub_d")])
-(define_insn ""
+(define_insn "*fmsubdf4_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
(match_operand:DF 2 "gpc_reg_operand" "f"))
(match_operand:DF 3 "gpc_reg_operand" "f")))]
- "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT"
+ "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT
+ && VECTOR_UNIT_NONE_P (DFmode)"
"{fms|fmsub} %0,%1,%2,%3"
[(set_attr "type" "dmul")
(set_attr "fp_type" "fp_maddsub_d")])
-(define_insn ""
+(define_insn "*fnmadddf4_fpr_1"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(neg:DF (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
(match_operand:DF 2 "gpc_reg_operand" "f"))
(match_operand:DF 3 "gpc_reg_operand" "f"))))]
"TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT
- && HONOR_SIGNED_ZEROS (DFmode)"
+ && HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)"
"{fnma|fnmadd} %0,%1,%2,%3"
[(set_attr "type" "dmul")
(set_attr "fp_type" "fp_maddsub_d")])
-(define_insn ""
+(define_insn "*fnmadddf4_fpr_2"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(minus:DF (mult:DF (neg:DF (match_operand:DF 1 "gpc_reg_operand" "f"))
(match_operand:DF 2 "gpc_reg_operand" "f"))
(match_operand:DF 3 "gpc_reg_operand" "f")))]
"TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT
- && ! HONOR_SIGNED_ZEROS (DFmode)"
+ && ! HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)"
"{fnma|fnmadd} %0,%1,%2,%3"
[(set_attr "type" "dmul")
(set_attr "fp_type" "fp_maddsub_d")])
-(define_insn ""
+(define_insn "*fnmsubdf4_fpr_1"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(neg:DF (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
(match_operand:DF 2 "gpc_reg_operand" "f"))
(match_operand:DF 3 "gpc_reg_operand" "f"))))]
"TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT
- && HONOR_SIGNED_ZEROS (DFmode)"
+ && HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)"
"{fnms|fnmsub} %0,%1,%2,%3"
[(set_attr "type" "dmul")
(set_attr "fp_type" "fp_maddsub_d")])
-(define_insn ""
+(define_insn "*fnmsubdf4_fpr_2"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(minus:DF (match_operand:DF 3 "gpc_reg_operand" "f")
(mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
(match_operand:DF 2 "gpc_reg_operand" "f"))))]
"TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT
- && ! HONOR_SIGNED_ZEROS (DFmode)"
+ && ! HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)"
"{fnms|fnmsub} %0,%1,%2,%3"
[(set_attr "type" "dmul")
(set_attr "fp_type" "fp_maddsub_d")])
@@ -5809,7 +5853,8 @@ (define_insn "sqrtdf2"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(sqrt:DF (match_operand:DF 1 "gpc_reg_operand" "f")))]
"(TARGET_PPC_GPOPT || TARGET_POWER2) && TARGET_HARD_FLOAT && TARGET_FPRS
- && TARGET_DOUBLE_FLOAT"
+ && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"fsqrt %0,%1"
[(set_attr "type" "dsqrt")])
@@ -5898,6 +5943,18 @@ (define_expand "fix_truncsfsi2"
"TARGET_HARD_FLOAT && !TARGET_FPRS && TARGET_SINGLE_FLOAT"
"")
+(define_expand "fixuns_truncdfsi2"
+ [(set (match_operand:SI 0 "gpc_reg_operand" "")
+ (unsigned_fix:SI (match_operand:DF 1 "gpc_reg_operand" "")))]
+ "TARGET_HARD_FLOAT && TARGET_E500_DOUBLE"
+ "")
+
+(define_expand "fixuns_truncdfdi2"
+ [(set (match_operand:DI 0 "register_operand" "")
+ (unsigned_fix:DI (match_operand:DF 1 "register_operand" "")))]
+ "TARGET_HARD_FLOAT && TARGET_VSX"
+ "")
+
; For each of these conversions, there is a define_expand, a define_insn
; with a '#' template, and a define_split (with C code). The idea is
; to allow constant folding with the template of the define_insn,
@@ -6139,10 +6196,17 @@ (define_insn "fctiwz"
"{fcirz|fctiwz} %0,%1"
[(set_attr "type" "fp")])
-(define_insn "btruncdf2"
+(define_expand "btruncdf2"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIZ))]
"TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "")
+
+(define_insn "*btruncdf2_fprs"
+ [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
+ (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIZ))]
+ "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"friz %0,%1"
[(set_attr "type" "fp")])
@@ -6153,10 +6217,17 @@ (define_insn "btruncsf2"
"friz %0,%1"
[(set_attr "type" "fp")])
-(define_insn "ceildf2"
+(define_expand "ceildf2"
+ [(set (match_operand:DF 0 "gpc_reg_operand" "")
+ (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "")] UNSPEC_FRIP))]
+ "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "")
+
+(define_insn "*ceildf2_fprs"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIP))]
- "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"frip %0,%1"
[(set_attr "type" "fp")])
@@ -6167,10 +6238,17 @@ (define_insn "ceilsf2"
"frip %0,%1"
[(set_attr "type" "fp")])
-(define_insn "floordf2"
+(define_expand "floordf2"
+ [(set (match_operand:DF 0 "gpc_reg_operand" "")
+ (unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "")] UNSPEC_FRIM))]
+ "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "")
+
+(define_insn "*floordf2_fprs"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIM))]
- "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+ "TARGET_FPRND && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"frim %0,%1"
[(set_attr "type" "fp")])
@@ -6181,6 +6259,7 @@ (define_insn "floorsf2"
"frim %0,%1"
[(set_attr "type" "fp")])
+;; No VSX equivalent to frin
(define_insn "rounddf2"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(unspec:DF [(match_operand:DF 1 "gpc_reg_operand" "f")] UNSPEC_FRIN))]
@@ -6195,6 +6274,12 @@ (define_insn "roundsf2"
"frin %0,%1"
[(set_attr "type" "fp")])
+(define_expand "ftruncdf2"
+ [(set (match_operand:DF 0 "gpc_reg_operand" "")
+ (fix:DF (match_operand:DF 1 "gpc_reg_operand" "")))]
+ "VECTOR_UNIT_VSX_P (DFmode)"
+ "")
+
; An UNSPEC is used so we don't have to support SImode in FP registers.
(define_insn "stfiwx"
[(set (match_operand:SI 0 "memory_operand" "=Z")
@@ -6210,17 +6295,40 @@ (define_expand "floatsisf2"
"TARGET_HARD_FLOAT && !TARGET_FPRS"
"")
-(define_insn "floatdidf2"
+(define_expand "floatdidf2"
+ [(set (match_operand:DF 0 "gpc_reg_operand" "")
+ (float:DF (match_operand:DI 1 "gpc_reg_operand" "")))]
+ "(TARGET_POWERPC64 || TARGET_XILINX_FPU || VECTOR_UNIT_VSX_P (DFmode))
+ && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS"
+ "")
+
+(define_insn "*floatdidf2_fpr"
[(set (match_operand:DF 0 "gpc_reg_operand" "=f")
(float:DF (match_operand:DI 1 "gpc_reg_operand" "!f#r")))]
- "(TARGET_POWERPC64 || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS"
+ "(TARGET_POWERPC64 || TARGET_XILINX_FPU)
+ && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
+ && !VECTOR_UNIT_VSX_P (DFmode)"
"fcfid %0,%1"
[(set_attr "type" "fp")])
-(define_insn "fix_truncdfdi2"
+(define_expand "floatunsdidf2"
+ [(set (match_operand:DF 0 "gpc_reg_operand" "")
+ (unsigned_float:DF (match_operand:DI 1 "gpc_reg_operand" "")))]
+ "TARGET_VSX"
+ "")
+
+(define_expand "fix_truncdfdi2"
+ [(set (match_operand:DI 0 "gpc_reg_operand" "")
+ (fix:DI (match_operand:DF 1 "gpc_reg_operand" "")))]
+ "(TARGET_POWERPC64 || TARGET_XILINX_FPU || VECTOR_UNIT_VSX_P (DFmode))
+ && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS"
+ "")
+
+(define_insn "*fix_truncdfdi2_fpr"
[(set (match_operand:DI 0 "gpc_reg_operand" "=!f#r")
(fix:DI (match_operand:DF 1 "gpc_reg_operand" "f")))]
- "(TARGET_POWERPC64 || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS"
+ "(TARGET_POWERPC64 || TARGET_XILINX_FPU) && TARGET_HARD_FLOAT
+ && TARGET_DOUBLE_FLOAT && TARGET_FPRS && !VECTOR_UNIT_VSX_P (DFmode)"
"fctidz %0,%1"
[(set_attr "type" "fp")])
@@ -7609,7 +7717,7 @@ (define_insn "anddi3_mc"
andi. %0,%1,%b2
andis. %0,%1,%u2
#"
- [(set_attr "type" "*,*,*,compare,compare,*")
+ [(set_attr "type" "*,*,*,fast_compare,fast_compare,*")
(set_attr "length" "4,4,4,4,4,8")])
(define_insn "anddi3_nomc"
@@ -7667,7 +7775,9 @@ (define_insn "*anddi3_internal2_mc"
#
#
#"
- [(set_attr "type" "compare,compare,delayed_compare,compare,compare,compare,compare,compare,compare,compare,compare,compare")
+ [(set_attr "type" "fast_compare,compare,delayed_compare,fast_compare,\
+ fast_compare,compare,compare,compare,compare,compare,\
+ compare,compare")
(set_attr "length" "4,4,4,4,4,8,8,8,8,8,8,12")])
(define_split
@@ -7718,7 +7828,9 @@ (define_insn "*anddi3_internal3_mc"
#
#
#"
- [(set_attr "type" "compare,compare,delayed_compare,compare,compare,compare,compare,compare,compare,compare,compare,compare")
+ [(set_attr "type" "fast_compare,compare,delayed_compare,fast_compare,\
+ fast_compare,compare,compare,compare,compare,compare,\
+ compare,compare")
(set_attr "length" "4,4,4,4,4,8,8,8,8,8,8,12")])
(define_split
@@ -7858,7 +7970,7 @@ (define_insn "*booldi3_internal2"
"@
%q4. %3,%1,%2
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -7887,7 +7999,7 @@ (define_insn "*booldi3_internal3"
"@
%q4. %0,%1,%2
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -7958,7 +8070,7 @@ (define_insn "*boolcdi3_internal2"
"@
%q4. %3,%2,%1
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -7987,7 +8099,7 @@ (define_insn "*boolcdi3_internal3"
"@
%q4. %0,%2,%1
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -8024,7 +8136,7 @@ (define_insn "*boolccdi3_internal2"
"@
%q4. %3,%1,%2
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -8053,7 +8165,7 @@ (define_insn "*boolccdi3_internal3"
"@
%q4. %0,%1,%2
#"
- [(set_attr "type" "compare")
+ [(set_attr "type" "fast_compare,compare")
(set_attr "length" "4,8")])
(define_split
@@ -8070,6 +8182,51 @@ (define_split
(compare:CC (match_dup 0)
(const_int 0)))]
"")
+
+(define_expand "smindi3"
+ [(match_operand:DI 0 "gpc_reg_operand" "")
+ (match_operand:DI 1 "gpc_reg_operand" "")
+ (match_operand:DI 2 "gpc_reg_operand" "")]
+ "TARGET_ISEL64"
+ "
+{
+ rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]);
+ DONE;
+}")
+
+(define_expand "smaxdi3"
+ [(match_operand:DI 0 "gpc_reg_operand" "")
+ (match_operand:DI 1 "gpc_reg_operand" "")
+ (match_operand:DI 2 "gpc_reg_operand" "")]
+ "TARGET_ISEL64"
+ "
+{
+ rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]);
+ DONE;
+}")
+
+(define_expand "umindi3"
+ [(match_operand:DI 0 "gpc_reg_operand" "")
+ (match_operand:DI 1 "gpc_reg_operand" "")
+ (match_operand:DI 2 "gpc_reg_operand" "")]
+ "TARGET_ISEL64"
+ "
+{
+ rs6000_emit_minmax (operands[0], UMIN, operands[1], operands[2]);
+ DONE;
+}")
+
+(define_expand "umaxdi3"
+ [(match_operand:DI 0 "gpc_reg_operand" "")
+ (match_operand:DI 1 "gpc_reg_operand" "")
+ (match_operand:DI 2 "gpc_reg_operand" "")]
+ "TARGET_ISEL64"
+ "
+{
+ rs6000_emit_minmax (operands[0], UMAX, operands[1], operands[2]);
+ DONE;
+}")
+
;; Now define ways of moving data around.
@@ -8473,8 +8630,8 @@ (define_split
;; The "??" is a kludge until we can figure out a more reasonable way
;; of handling these non-offsettable values.
(define_insn "*movdf_hardfloat32"
- [(set (match_operand:DF 0 "nonimmediate_operand" "=!r,??r,m,f,f,m,!r,!r,!r")
- (match_operand:DF 1 "input_operand" "r,m,r,f,m,f,G,H,F"))]
+ [(set (match_operand:DF 0 "nonimmediate_operand" "=!r, ??r, m, ws, ?wa, ws, ?wa, Z, ?Z, f, f, m, wa, !r, !r, !r")
+ (match_operand:DF 1 "input_operand" "r, m, r, ws, wa, Z, Z, ws, wa, f, m, f, j, G, H, F"))]
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
&& (gpc_reg_operand (operands[0], DFmode)
|| gpc_reg_operand (operands[1], DFmode))"
@@ -8553,19 +8710,30 @@ (define_insn "*movdf_hardfloat32"
return \"\";
}
case 3:
- return \"fmr %0,%1\";
case 4:
- return \"lfd%U1%X1 %0,%1\";
+ return \"xscpsgndp %x0,%x1,%x1\";
case 5:
- return \"stfd%U0%X0 %1,%0\";
case 6:
+ return \"lxsd%U1x %x0,%y1\";
case 7:
case 8:
+ return \"stxsd%U0x %x1,%y0\";
+ case 9:
+ return \"fmr %0,%1\";
+ case 10:
+ return \"lfd%U1%X1 %0,%1\";
+ case 11:
+ return \"stfd%U0%X0 %1,%0\";
+ case 12:
+ return \"xxlxor %x0,%x0,%x0\";
+ case 13:
+ case 14:
+ case 15:
return \"#\";
}
}"
- [(set_attr "type" "two,load,store,fp,fpload,fpstore,*,*,*")
- (set_attr "length" "8,16,16,4,4,4,8,12,16")])
+ [(set_attr "type" "two, load, store, fp, fp, fpload, fpload, fpstore, fpstore, fp, fpload, fpstore, vecsimple, *, *, *")
+ (set_attr "length" "8, 16, 16, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 8, 12, 16")])
(define_insn "*movdf_softfloat32"
[(set (match_operand:DF 0 "nonimmediate_operand" "=r,r,m,r,r,r")
@@ -8613,19 +8781,26 @@ (define_insn "*movdf_softfloat32"
; ld/std require word-aligned displacements -> 'Y' constraint.
; List Y->r and r->Y before r->r for reload.
(define_insn "*movdf_hardfloat64_mfpgpr"
- [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,f,f,m,*c*l,!r,*h,!r,!r,!r,r,f")
- (match_operand:DF 1 "input_operand" "r,Y,r,f,m,f,r,h,0,G,H,F,f,r"))]
+ [(set (match_operand:DF 0 "nonimmediate_operand" "=Y, r, !r, ws, ?wa, ws, ?wa, Z, ?Z, f, f, m, wa, *c*l, !r, *h, !r, !r, !r, r, f")
+ (match_operand:DF 1 "input_operand" "r, Y, r, ws, ?wa, Z, Z, ws, wa, f, m, f, j, r, h, 0, G, H, F, f, r"))]
"TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS
- && TARGET_DOUBLE_FLOAT
+ && TARGET_DOUBLE_FLOAT
&& (gpc_reg_operand (operands[0], DFmode)
|| gpc_reg_operand (operands[1], DFmode))"
"@
std%U0%X0 %1,%0
ld%U1%X1 %0,%1
mr %0,%1
+ xscpsgndp %x0,%x1,%x1
+ xscpsgndp %x0,%x1,%x1
+ lxsd%U1x %x0,%y1
+ lxsd%U1x %x0,%y1
+ stxsd%U0x %x1,%y0
+ stxsd%U0x %x1,%y0
fmr %0,%1
lfd%U1%X1 %0,%1
stfd%U0%X0 %1,%0
+ xxlxor %x0,%x0,%x0
mt%0 %1
mf%1 %0
{cror 0,0,0|nop}
@@ -8634,33 +8809,40 @@ (define_insn "*movdf_hardfloat64_mfpgpr"
#
mftgpr %0,%1
mffgpr %0,%1"
- [(set_attr "type" "store,load,*,fp,fpload,fpstore,mtjmpr,mfjmpr,*,*,*,*,mftgpr,mffgpr")
- (set_attr "length" "4,4,4,4,4,4,4,4,4,8,12,16,4,4")])
+ [(set_attr "type" "store, load, *, fp, fp, fpload, fpload, fpstore, fpstore, fp, fpload, fpstore, vecsimple, mtjmpr, mfjmpr, *, *, *, *, mftgpr, mffgpr")
+ (set_attr "length" "4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 8, 12, 16, 4, 4")])
; ld/std require word-aligned displacements -> 'Y' constraint.
; List Y->r and r->Y before r->r for reload.
(define_insn "*movdf_hardfloat64"
- [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,f,f,m,*c*l,!r,*h,!r,!r,!r")
- (match_operand:DF 1 "input_operand" "r,Y,r,f,m,f,r,h,0,G,H,F"))]
+ [(set (match_operand:DF 0 "nonimmediate_operand" "=Y, r, !r, ws, ?wa, ws, ?wa, Z, ?Z, f, f, m, wa, *c*l, !r, *h, !r, !r, !r")
+ (match_operand:DF 1 "input_operand" "r, Y, r, ws, wa, Z, Z, ws, wa, f, m, f, j, r, h, 0, G, H, F"))]
"TARGET_POWERPC64 && !TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS
- && TARGET_DOUBLE_FLOAT
+ && TARGET_DOUBLE_FLOAT
&& (gpc_reg_operand (operands[0], DFmode)
|| gpc_reg_operand (operands[1], DFmode))"
"@
std%U0%X0 %1,%0
ld%U1%X1 %0,%1
mr %0,%1
+ xscpsgndp %x0,%x1,%x1
+ xscpsgndp %x0,%x1,%x1
+ lxsd%U1x %x0,%y1
+ lxsd%U1x %x0,%y1
+ stxsd%U0x %x1,%y0
+ stxsd%U0x %x1,%y0
fmr %0,%1
lfd%U1%X1 %0,%1
stfd%U0%X0 %1,%0
+ xxlxor %x0,%x0,%x0
mt%0 %1
mf%1 %0
{cror 0,0,0|nop}
#
#
#"
- [(set_attr "type" "store,load,*,fp,fpload,fpstore,mtjmpr,mfjmpr,*,*,*,*")
- (set_attr "length" "4,4,4,4,4,4,4,4,4,8,12,16")])
+ [(set_attr "type" "store, load, *, fp, fp, fpload, fpload, fpstore, fpstore, fp, fpload, fpstore, vecsimple, mtjmpr, mfjmpr, *, *, *, *")
+ (set_attr "length" " 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 8, 12, 16")])
(define_insn "*movdf_softfloat64"
[(set (match_operand:DF 0 "nonimmediate_operand" "=r,Y,r,cl,r,r,r,r,*h")
@@ -9237,15 +9419,16 @@ (define_insn "*movti_string"
(define_insn "*movti_ppc64"
[(set (match_operand:TI 0 "nonimmediate_operand" "=r,o<>,r")
(match_operand:TI 1 "input_operand" "r,r,m"))]
- "TARGET_POWERPC64 && (gpc_reg_operand (operands[0], TImode)
- || gpc_reg_operand (operands[1], TImode))"
+ "(TARGET_POWERPC64 && (gpc_reg_operand (operands[0], TImode)
+ || gpc_reg_operand (operands[1], TImode)))
+ && VECTOR_MEM_NONE_P (TImode)"
"#"
[(set_attr "type" "*,load,store")])
(define_split
[(set (match_operand:TI 0 "gpc_reg_operand" "")
(match_operand:TI 1 "const_double_operand" ""))]
- "TARGET_POWERPC64"
+ "TARGET_POWERPC64 && VECTOR_MEM_NONE_P (TImode)"
[(set (match_dup 2) (match_dup 4))
(set (match_dup 3) (match_dup 5))]
"
@@ -9271,7 +9454,7 @@ (define_split
(define_split
[(set (match_operand:TI 0 "nonimmediate_operand" "")
(match_operand:TI 1 "input_operand" ""))]
- "reload_completed
+ "reload_completed && VECTOR_MEM_NONE_P (TImode)
&& gpr_or_gpr_p (operands[0], operands[1])"
[(pc)]
{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
@@ -14891,6 +15074,8 @@ (define_insn "prefetch"
(include "sync.md")
+(include "vector.md")
+(include "vsx.md")
(include "altivec.md")
(include "spe.md")
(include "dfp.md")
--- gcc/config/rs6000/e500.h (.../trunk) (revision 144557)
+++ gcc/config/rs6000/e500.h (.../branches/ibm/power7-meissner) (revision 144730)
@@ -37,6 +37,8 @@
{ \
if (TARGET_ALTIVEC) \
error ("AltiVec and E500 instructions cannot coexist"); \
+ if (TARGET_VSX) \
+ error ("VSX and E500 instructions cannot coexist"); \
if (TARGET_64BIT) \
error ("64-bit E500 not supported"); \
if (TARGET_HARD_FLOAT && TARGET_FPRS) \
--- gcc/config/rs6000/driver-rs6000.c (.../trunk) (revision 144557)
+++ gcc/config/rs6000/driver-rs6000.c (.../branches/ibm/power7-meissner) (revision 144730)
@@ -343,11 +343,115 @@ detect_processor_aix (void)
#endif /* _AIX */
+/*
+ * Array to map -mcpu=native names to the switches passed to the assembler.
+ * This list mirrors the specs in ASM_CPU_SPEC, and any changes made here
+ * should be made there as well.
+ */
+
+struct asm_name {
+ const char *cpu;
+ const char *asm_sw;
+};
+
+static const
+struct asm_name asm_names[] = {
+#if defined (_AIX)
+ { "power3", "-m620" },
+ { "power4", "-mpwr4" },
+ { "power5", "-mpwr5" },
+ { "power5+", "-mpwr5x" },
+ { "power6", "-mpwr6" },
+ { "power6x", "-mpwr6" },
+ { "power7", "-mpwr7" },
+ { "powerpc", "-mppc" },
+ { "rs64a", "-mppc" },
+ { "603", "-m603" },
+ { "603e", "-m603" },
+ { "604", "-m604" },
+ { "604e", "-m604" },
+ { "620", "-m620" },
+ { "630", "-m620" },
+ { "970", "-m970" },
+ { "G5", "-m970" },
+ { NULL, "\
+%{!maix64: \
+%{mpowerpc64: -mppc64} \
+%{maltivec: -m970} \
+%{!maltivec: %{!mpower64: %(asm_default)}}}" },
+
+#else
+ { "common", "-mcom" },
+ { "cell", "-mcell" },
+ { "power", "-mpwr" },
+ { "power2", "-mpwrx" },
+ { "power3", "-mppc64" },
+ { "power4", "-mpower4" },
+ { "power5", "%(asm_cpu_power5)" },
+ { "power5+", "%(asm_cpu_power5)" },
+ { "power6", "%(asm_cpu_power6) -maltivec" },
+ { "power6x", "%(asm_cpu_power6) -maltivec" },
+ { "power7", "%(asm_cpu_power7)" },
+ { "powerpc", "-mppc" },
+ { "rios", "-mpwr" },
+ { "rios1", "-mpwr" },
+ { "rios2", "-mpwrx" },
+ { "rsc", "-mpwr" },
+ { "rsc1", "-mpwr" },
+ { "rs64a", "-mppc64" },
+ { "401", "-mppc" },
+ { "403", "-m403" },
+ { "405", "-m405" },
+ { "405fp", "-m405" },
+ { "440", "-m440" },
+ { "440fp", "-m440" },
+ { "464", "-m440" },
+ { "464fp", "-m440" },
+ { "505", "-mppc" },
+ { "601", "-m601" },
+ { "602", "-mppc" },
+ { "603", "-mppc" },
+ { "603e", "-mppc" },
+ { "ec603e", "-mppc" },
+ { "604", "-mppc" },
+ { "604e", "-mppc" },
+ { "620", "-mppc64" },
+ { "630", "-mppc64" },
+ { "740", "-mppc" },
+ { "750", "-mppc" },
+ { "G3", "-mppc" },
+ { "7400", "-mppc -maltivec" },
+ { "7450", "-mppc -maltivec" },
+ { "G4", "-mppc -maltivec" },
+ { "801", "-mppc" },
+ { "821", "-mppc" },
+ { "823", "-mppc" },
+ { "860", "-mppc" },
+ { "970", "-mpower4 -maltivec" },
+ { "G5", "-mpower4 -maltivec" },
+ { "8540", "-me500" },
+ { "8548", "-me500" },
+ { "e300c2", "-me300" },
+ { "e300c3", "-me300" },
+ { "e500mc", "-me500mc" },
+ { NULL, "\
+%{mpower: %{!mpower2: -mpwr}} \
+%{mpower2: -mpwrx} \
+%{mpowerpc64*: -mppc64} \
+%{!mpowerpc64*: %{mpowerpc*: -mppc}} \
+%{mno-power: %{!mpowerpc*: -mcom}} \
+%{!mno-power: %{!mpower*: %(asm_default)}}" },
+#endif
+};
+
/* This will be called by the spec parser in gcc.c when it sees
a %:local_cpu_detect(args) construct. Currently it will be called
with either "arch" or "tune" as argument depending on if -march=native
or -mtune=native is to be substituted.
+ Additionally it will be called with "asm" to select the appropriate flags
+ for the assembler.
+
It returns a string containing new command line parameters to be
put at the place of the above two options, depending on what CPU
this is executed.
@@ -361,29 +465,35 @@ const char
const char *cache = "";
const char *options = "";
bool arch;
+ bool assembler;
+ size_t i;
if (argc < 1)
return NULL;
arch = strcmp (argv[0], "cpu") == 0;
- if (!arch && strcmp (argv[0], "tune"))
+ assembler = (!arch && strcmp (argv[0], "asm") == 0);
+ if (!arch && !assembler && strcmp (argv[0], "tune"))
return NULL;
+ if (! assembler)
+ {
#if defined (_AIX)
- cache = detect_caches_aix ();
+ cache = detect_caches_aix ();
#elif defined (__APPLE__)
- cache = detect_caches_darwin ();
+ cache = detect_caches_darwin ();
#elif defined (__FreeBSD__)
- cache = detect_caches_freebsd ();
- /* FreeBSD PPC does not provide any cache information yet. */
- cache = "";
+ cache = detect_caches_freebsd ();
+ /* FreeBSD PPC does not provide any cache information yet. */
+ cache = "";
#elif defined (__linux__)
- cache = detect_caches_linux ();
- /* PPC Linux does not provide any cache information yet. */
- cache = "";
+ cache = detect_caches_linux ();
+ /* PPC Linux does not provide any cache information yet. */
+ cache = "";
#else
- cache = "";
+ cache = "";
#endif
+ }
#if defined (_AIX)
cpu = detect_processor_aix ();
@@ -397,6 +507,17 @@ const char
cpu = "powerpc";
#endif
+ if (assembler)
+ {
+ for (i = 0; i < sizeof (asm_names) / sizeof (asm_names[0]); i++)
+ {
+ if (!asm_names[i].cpu || !strcmp (asm_names[i].cpu, cpu))
+ return asm_names[i].asm_sw;
+ }
+
+ return NULL;
+ }
+
return concat (cache, "-m", argv[0], "=", cpu, " ", options, NULL);
}
--- gcc/config/rs6000/sysv4.h (.../trunk) (revision 144557)
+++ gcc/config/rs6000/sysv4.h (.../branches/ibm/power7-meissner) (revision 144730)
@@ -119,9 +119,9 @@ do { \
else if (!strcmp (rs6000_abi_name, "i960-old")) \
{ \
rs6000_current_abi = ABI_V4; \
- target_flags |= (MASK_LITTLE_ENDIAN | MASK_EABI \
- | MASK_NO_BITFIELD_WORD); \
+ target_flags |= (MASK_LITTLE_ENDIAN | MASK_EABI); \
target_flags &= ~MASK_STRICT_ALIGN; \
+ TARGET_NO_BITFIELD_WORD = 1; \
} \
else \
{ \