Compare commits


61 Commits
f20 ... master

Author SHA1 Message Date
Jakub Martisko 709845b89e atlas.spec: pass RPM_LD_FLAGS to linker
resolves: 1547515
2018-04-11 12:46:08 +02:00
Jakub Martisko 7268b7aa10 atlas.spec: fix the release number 2018-03-05 12:29:10 +01:00
Jakub Martisko 63d75114da atlas.spec: add gcc to buildrequires 2018-03-05 12:20:52 +01:00
Fedora Release Engineering 9561fa26c1 - Rebuilt for https://fedoraproject.org/wiki/Fedora_28_Mass_Rebuild
Signed-off-by: Fedora Release Engineering <releng@fedoraproject.org>
2018-02-07 03:12:34 +00:00
Björn Esser 1fb89d180f Rebuilt for GCC8 2018-01-30 14:28:02 +01:00
Jakub Martisko f128eda788 - Rebase to 3.10.3 2017-08-17 11:00:17 +02:00
Tom Callaway 17da750ef6 rebuild against fixed lapack 2017-08-16 16:32:58 -04:00
Jakub Martisko 4d0e90e529 Rebuild for updated lapack 2017-08-10 11:42:49 +02:00
Fedora Release Engineering 202c1937a9 - Rebuilt for https://fedoraproject.org/wiki/Fedora_27_Binutils_Mass_Rebuild 2017-08-02 17:55:57 +00:00
Fedora Release Engineering 68ac25ef42 - Rebuilt for https://fedoraproject.org/wiki/Fedora_27_Mass_Rebuild 2017-07-26 03:28:52 +00:00
Fedora Release Engineering 298208165a - Rebuilt for https://fedoraproject.org/wiki/Fedora_26_Mass_Rebuild 2017-02-10 06:29:56 +00:00
Björn Esser 1a2f01f0f0 Rebuilt for GCC-7 2017-01-28 10:24:04 +01:00
Orion Poplawski 35e303b80a Limit instruction set on x86_64 (bug #1405397) 2016-12-16 16:29:33 -07:00
Orion Poplawski 6af83c0231 Correct Make.inc adjustments that were going awry to fix FTBFS (BZ#1402627). 2016-12-14 17:07:55 -07:00
Dennis Gilmore 8888d79619 - Rebuilt for https://fedoraproject.org/wiki/Fedora_24_Mass_Rebuild 2016-02-03 16:39:37 +00:00
Zbigniew Jędrzejewski-Szmek d03b304c5a Rebuild for updated lapack
https://bugzilla.redhat.com/show_bug.cgi?id=1286349
2015-11-28 21:42:05 -05:00
Than Ngo 76d4dba820 cleanup the patch 2015-11-26 17:32:38 +01:00
Than Ngo 13262e006a backport upstream patch for power8 support 2015-11-26 17:13:02 +01:00
Than Ngo f086598d07 add correct assembler option for ppc64 2015-11-13 16:01:16 +01:00
Than Ngo 5b8ceef1ab add correct machine type for ppc64 -> fix build failure on ppc64 2015-11-04 18:21:57 +01:00
Susi Lehtola c7f12adc64 Drop bundled(lapack). 2015-10-28 17:25:10 +01:00
Than Ngo 91230ab335 fix ppc64le patch, which causes build failure on ppc64le 2015-07-09 10:46:33 +02:00
Dennis Gilmore cddcd13655 - Rebuilt for https://fedoraproject.org/wiki/Fedora_23_Mass_Rebuild 2015-06-17 01:16:05 +00:00
Marcin Juszkiewicz 6a17475b1c Refreshed AArch64 patch 2015-06-10 20:19:20 +02:00
Dan Horák 049545a9bc - drop upstreamed s390 patch 2015-06-05 11:51:27 +02:00
Frantisek Kluknavsky 9a6d77d550 include all single-threaded wrapper libraries in -static subpackage
- bz#1222079
2015-05-20 14:07:34 +02:00
Orion Poplawski da91cc64e6 Fixup sources 2015-05-14 11:39:07 -06:00
Orion Poplawski 2940efa59b Update to 3.10.2 (bug #1118596)
- Autodetect arm arch
- Add arch_option for ppc64le
2015-05-14 11:32:35 -06:00
Frantisek Kluknavsky c5bccf55f6 lapack bundled again, mark this. 2015-03-05 15:49:29 +01:00
Susi Lehtola bf44b5d377 Drop unnecessary devel package requirements on lapack-devel. 2015-02-08 05:43:42 +01:00
Susi Lehtola b21a4fe1c8 Static link to system LAPACK as in earlier releases of Fedora. 2015-02-07 07:39:47 +01:00
Frantisek Kluknavsky 3f7c964eea updated chkconfig and dependencies of atlas-devel after unbundling 2015-01-28 17:26:24 +01:00
Frantisek Kluknavsky 01531f84c8 unbundled lapack (only a few modified routines shipped with atlas sources are supposed to stay) 2015-01-23 23:23:49 +01:00
Jaromir Capik f7d47dc9a3 patching for Power8 to pass performance tunings and tests on P8 builders 2014-10-30 18:29:59 +01:00
Orion Poplawski 9371c38444 Drop %defattr() and BuildRoot 2014-10-24 13:31:00 -06:00
Orion Poplawski 8a3c0ce3cf Fix alternatives install 2014-10-24 13:29:11 -06:00
Frantisek Kluknavsky 5a588758d4 added pkgconfig file 2014-10-23 18:54:39 +02:00
Frantisek Kluknavsky 46bc45dbe7 added pkgconfig file 2014-10-23 18:49:45 +02:00
Peter Robinson 932ed464ac - Rebuilt for https://fedoraproject.org/wiki/Fedora_21_22_Mass_Rebuild 2014-08-15 21:41:23 +00:00
Dennis Gilmore 6d51f79182 - Rebuilt for https://fedoraproject.org/wiki/Fedora_21_Mass_Rebuild 2014-06-06 20:58:42 -05:00
Peter Robinson 93e9bf9d18 - Don't fail build on make check on aarch64 due to issues with tests
- Unbreak AArch64 build.
- ARMv8 is different from ARMv7 so should not be treated as such. Otherwise
  atlas tries to do some crazy ARMv764 build and fail.
2014-02-24 09:16:09 +00:00
Frantisek Kluknavsky ecd0c2adff updated lapack to 3.5.0 2013-11-20 12:31:28 +01:00
Frantisek Kluknavsky 100afcd2b2 updated subpackage description 2013-11-13 16:45:39 +01:00
Frantisek Kluknavsky 2c71620acc patch for aarch64 from https://bugzilla.redhat.com/attachment.cgi?id=755555 2013-11-05 14:07:26 +01:00
Frantisek Kluknavsky e381725c32 patch for aarch64 from https://bugzilla.redhat.com/attachment.cgi?id=755555 2013-11-05 13:59:24 +01:00
Frantisek Kluknavsky 6c8df60df6 Provides: bundled(lapack) 2013-10-16 14:09:14 +02:00
Frantisek Kluknavsky ce70abdb41 make check on arm enabled - seems to work 2013-10-10 11:38:15 +02:00
Orion Poplawski ff52573f7d Drop debug for now 2013-10-02 21:04:23 -06:00
Orion Poplawski 885b47ab37 - Add -DATL_ARM_HARDFP=1 for arm build
- Rework how arm flags are set
- Add some diag output for still failing arm test
2013-10-02 20:40:35 -06:00
Frantisek Kluknavsky 6dbfdbf430 disable tests on arm to allow update for x86 2013-09-30 13:52:10 +02:00
Frantisek Kluknavsky 64a1bfa23d disable tests on arm to allow update for x86 2013-09-30 12:41:46 +02:00
Frantisek Kluknavsky 7fe7cb921f disable affinity to prevent crash on systems with fewer cpus 2013-09-24 16:53:36 +02:00
Frantisek Kluknavsky 48e39087ac disable affinity to prevent crash on systems with fewer cpus 2013-09-24 16:33:16 +02:00
Orion Poplawski f8e736600e Add %check section 2013-09-23 14:05:44 -06:00
Frantisek Kluknavsky 381f2c21f2 gitignore updated 2013-09-23 09:23:41 +02:00
Frantisek Kluknavsky 225ecb7519 add misssing source file 2013-09-20 17:07:11 +02:00
Frantisek Kluknavsky 84e3e0e998 modified arm archdef uploaded as source 2013-09-20 16:56:08 +02:00
Frantisek Kluknavsky 3ba016b393 bogus in changelog after merge 2013-09-20 16:46:33 +02:00
Frantisek Kluknavsky a0b20b5850 bogus datum in changelog 2013-09-20 16:30:36 +02:00
Frantisek Kluknavsky caba23768a lapack source was uploaded to git instead of lookaside cache by mistake 2013-09-20 16:28:32 +02:00
Frantisek Kluknavsky 62670b4e6a Rebase to 3.10.1
- Dropped x86_64-SSE2, ix86-SSE1, ix86-3DNow, z10, z196 (uncompilable).
- Modified incompatible patches.
- Added armv7neon support, modified archdef from softfp abi to hard abi.
- Modified Make.lib to include build-id, soname, versioned library name and symlinks.
- Now builds monolithic libsatlas (serial) and libtatlas (threaded)
  libraries with lapack and blas included.
- Lapack source tarball needed instead of static library.
- Disabled cpu throttling detection again (sorry, could not work on atlas
  otherwise, feel free to enable yet again - atlas-throttling.patch).
- Removed mentions of "Fedora" to promote redistribution.
- Modified parts of atlas.spec sometimes left in place, work still in progress,
  cleanup needed.
2013-09-20 16:01:10 +02:00
19 changed files with 1715 additions and 778 deletions

9
.gitignore vendored

@@ -1,3 +1,12 @@
atlas3.8.3.tar.bz2
PPRO32.tgz
K7323DNow.tgz
/atlas3.10.0.tar.bz2
/atlas3.10.1.tar.bz2
/IBMz932.tar.bz2
/IBMz964.tar.bz2
/POWER332.tar.bz2
/ARMv732NEON.tar.bz2
/lapack-3.5.0.tgz
/atlas3.10.2.tar.bz2
/POWER864LEVSXp4.tar.bz2


@@ -1,68 +0,0 @@
Notes on the Fedora version of ATLAS
by Quentin Spencer
updated: October 4, 2005
updated by Deji Akingunola
October 15, 2008
updated by Deji Akingunola
June 15, 2011
Because ATLAS relies on compile-time optimizations to obtain improved
performance over BLAS and LAPACK, the resulting binaries are closely
tied to the hardware on which they are compiled, and can likely result
in very poor performance on other hardware. For this reason,
including a package like ATLAS in Fedora requires some compromises.
Firstly, a binary ATLAS package must perform reasonably well on the
entire range of hardware on which it could potentially be installed.
Optimizing ATLAS for the most modern hardware can result in
significant performance penalties for users using the same package on
older hardware. Second, building from the same source package must
result in identical binaries for any computer of a particular
architecture. This is because the binaries installed on a user's
computer are built on a computer in the Fedora Extras build system,
which will have hardware different from the end user's hardware, and
quite possibly different from other available hardware in the build
system.
As of version 3.8.2 (in Fedora), ATLAS builds use architectural defaults, which
are partial results of past searches when the compiler and architecture
are known, to discover the appropriate kernels used to build all the required
libraries. They make install time quicker and also ensure that good results are
obtained, since they typically represent several searches and/or user
intervention into the usual search so that maximum performance is found.
The result is a set of libraries that will not
necessarily achieve optimal performance on any given hardware but
should still offer significant performance gains over the reference
BLAS and LAPACK libraries on most hardware. The binary package
includes the atlas libraries as well as binary-compatible blas and
lapack libraries that should work as a drop-in replacement for the
standard ones (they are installed in /usr/lib{64}/atlas* rather than
/usr/lib{64}).
For 32bit x86 systems, the default atlas package was built using Pentium Pro
architectural defaults using just x87 math optimization. In addition to the
base 32bit build, 4 ATLAS subpackages are built for 3Dnow, SSE, SSE2, and SSE3
ix86 extensions, using architectural defaults obtained from Athlon K7, PIII,
Pentium 4 with SSE2 extension and PENTIUM 4 with SSE3 extensions respectively.
On 64bit x86 systems the default atlas package was built with SSE2 optimization using the architectural default made for AMD's HAMMER processor, and an additional
SSE3-enabled subpackage was also built using the architectural default made for AMD's HAMMER processor.
This packaging allows multiple installation of different atlas sub-packages
at the same time. The alternatives system (read 'man alternatives' for usage)
is used in the -devel subpackages to select the appropriate location for the
architectural dependent header files.
This package is designed to build RPMs that are identical regardless
of where they are compiled and that provide reasonable performance on
a wide range of hardware. For users who want optimal performance on
particular hardware, custom RPMs can be built from the source package
by setting the RPM macro "enable_native_atlas" to a value of 1. This
can be done from the command line as in the following example:
rpmbuild -D "enable_native_atlas 1" --rebuild atlas-3.8.3-1.src.rpm
This will cause the ATLAS build to use the architectural default most
appropriate for the system on which the package is to be built.

47
README.dist Normal file

@@ -0,0 +1,47 @@
Notes on the packaged version of ATLAS
by Quentin Spencer
updated: October 4, 2005
updated by Deji Akingunola
October 15, 2008
updated by Deji Akingunola
June 15, 2011
updated by Frantisek Kluknavsky
Nov 20, 2012
Because ATLAS relies on compile-time optimizations to obtain improved
performance over BLAS and LAPACK, the resulting binaries are closely
tied to the hardware on which they are compiled, and can likely result
in very poor performance on other hardware. For this reason,
including a package like ATLAS in Fedora requires some compromises.
Optimizing ATLAS for the most modern hardware can result in
significant performance penalties for users using the same package on
older hardware. A binary ATLAS package must perform reasonably well on the
entire range of hardware on which it could potentially be installed.
The result is a set of libraries that will not
necessarily achieve optimal performance on any given hardware but
should still offer significant performance gains over the reference
BLAS and LAPACK libraries on most hardware.
In addition to the base 32bit build, subpackages are built for SSE, SSE2,
and SSE3 ix86 extensions.
On 64bit x86 systems the default atlas package was built with SSE3
optimization.
This packaging allows multiple installation of different atlas sub-packages
at the same time. The alternatives system (read 'man alternatives' for usage)
is used in the -devel subpackages to select the appropriate location for the
architectural dependent header files.
For users who want optimal performance on
particular hardware, custom RPMs can be built from the source package
by setting the RPM macro "enable_native_atlas" to a value of 1. This
can be done from the command line as in the following example:
rpmbuild -D "enable_native_atlas 1" --rebuild atlas-3.8.3-1.src.rpm

219
atlas-aarch64port.patch Normal file

@@ -0,0 +1,219 @@
Author: Mark Salter <msalter@redhat.com>
Index: ATLAS/CONFIG/include/atlconf.h
===================================================================
--- ATLAS.orig/CONFIG/include/atlconf.h
+++ ATLAS/CONFIG/include/atlconf.h
@@ -16,9 +16,9 @@ enum OSTYPE {OSOther=0, OSLinux, OSSunOS
((OS_) == OSWin64) )
enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS,
- AFARM, AFS390};
+ AFARM, AFS390, AFAARCH64};
-#define NMACH 52
+#define NMACH 53
static char *machnam[NMACH] =
{"UNKNOWN", "POWER3", "POWER4", "POWER5", "PPCG4", "PPCG5",
"POWER6", "POWER7", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
@@ -29,7 +29,7 @@ static char *machnam[NMACH] =
"Efficeon", "K7", "HAMMER", "AMD64K10h", "AMDLLANO", "AMDDOZER","AMDDRIVER",
"UNKNOWNx86", "IA64Itan", "IA64Itan2",
"USI", "USII", "USIII", "USIV", "UST1", "UST2", "UnknownUS",
- "MIPSR1xK", "MIPSICE9", "ARMv7"};
+ "MIPSR1xK", "MIPSICE9", "ARMv7", "AARCH64"};
enum MACHTYPE {MACHOther, IbmPwr3, IbmPwr4, IbmPwr5, PPCG4, PPCG5,
IbmPwr6, IbmPwr7, Pwre6500,
IbmZ9, IbmZ10, IbmZ196, /* s390(x) in Linux */
@@ -42,7 +42,8 @@ enum MACHTYPE {MACHOther, IbmPwr3, IbmPw
SunUSI, SunUSII, SunUSIII, SunUSIV, SunUST1, SunUST2, SunUSX,
MIPSR1xK, /* includes R10K, R12K, R14K, R16K */
MIPSICE9, /* SiCortex ICE9 -- like MIPS5K */
- ARMv7 /* includes Cortex A8, A9 */
+ ARMv7, /* includes Cortex A8, A9 */
+ AARCH64
};
#define MachIsX86(mach_) \
( (mach_) >= x86x87 && (mach_) <= x86X )
@@ -63,6 +64,8 @@ enum MACHTYPE {MACHOther, IbmPwr3, IbmPw
( (mach_) == ARMv7 )
#define MachIsS390(mach_) \
( (mach_) >= IbmZ9 && (mach_) <= IbmZ196 )
+#define MachIsAARCH64(mach_) \
+ ( (mach_) == AARCH64 )
static char *f2c_namestr[5] = {"UNKNOWN","Add_", "Add__", "NoChange", "UpCase"};
@@ -84,13 +87,13 @@ enum ISAEXT
{ISA_None=0, ISA_VSX, ISA_AV, ISA_AVXMAC, ISA_AVXFMA4, ISA_AVX,
ISA_SSE3, ISA_SSE2, ISA_SSE1, ISA_3DNow, ISA_NEON};
-#define NASMD 9
+#define NASMD 10
enum ASMDIA
{ASM_None=0, gas_x86_32, gas_x86_64, gas_sparc, gas_ppc, gas_parisc,
- gas_mips, gas_arm, gas_s390};
+ gas_mips, gas_arm, gas_s390, gas_aarch64};
static char *ASMNAM[NASMD] =
{"", "GAS_x8632", "GAS_x8664", "GAS_SPARC", "GAS_PPC", "GAS_PARISC",
- "GAS_MIPS", "GAS_ARM", "GAS_S390"};
+ "GAS_MIPS", "GAS_ARM", "GAS_S390", "GAS_AARCH64"};
/*
* Used for archinfo probes (can pack in bitfield)
Index: ATLAS/CONFIG/src/Makefile
===================================================================
--- ATLAS.orig/CONFIG/src/Makefile
+++ ATLAS/CONFIG/src/Makefile
@@ -260,6 +260,11 @@ IRun_BINDP :
redir=config0.out
- cat config0.out
+IRun_GAS_AARCH64 :
+ $(CC) $(CCFLAGS) -o xprobe_gas_aarch64 $(SRCdir)/backend/probe_this_asm.c $(SRCdir)/backend/probe_gas_aarch64.S
+ $(MAKE) $(atlrun) atldir=$(mydir) exe=xprobe_gas_aarch64 args="$(args)" \
+ redir=config0.out
+ - cat config0.out
IRun_GAS_S390 :
$(CC) $(CCFLAGS) -o xprobe_gas_s390 $(SRCdir)/backend/probe_this_asm.c $(SRCdir)/backend/probe_gas_s390.S
$(MAKE) $(atlrun) atldir=$(mydir) exe=xprobe_gas_s390 args="$(args)" \
Index: ATLAS/CONFIG/src/SpewMakeInc.c
===================================================================
--- ATLAS.orig/CONFIG/src/SpewMakeInc.c
+++ ATLAS/CONFIG/src/SpewMakeInc.c
@@ -391,6 +391,8 @@ char *GetPtrbitsFlag(enum OSTYPE OS, enu
if (MachIsIA64(arch))
return(sp);
+ if (MachIsAARCH64(arch))
+ return(sp);
if (MachIsMIPS(arch))
return((ptrbits == 64) ? "-mabi=64" : "-mabi=n32");
if (MachIsS390(arch))
Index: ATLAS/CONFIG/src/atlcomp.txt
===================================================================
--- ATLAS.orig/CONFIG/src/atlcomp.txt
+++ ATLAS/CONFIG/src/atlcomp.txt
@@ -267,6 +267,17 @@ MACH=ARMv7 OS=ALL LVL=1000 COMPS=dmc,dkc
MACH=ARMv7 OS=ALL LVL=1000 COMPS=f77
'gfortran' '-mcpu=cortex-a8 -mfpu=vfpv3 -mfloat-abi=softfp -O'
#
+# AArch64 defaults
+#
+MACH=AARCH64 OS=ALL LVL=1000 COMPS=xcc
+ 'gcc' '-O2'
+MACH=AARCH64 OS=ALL LVL=1000 COMPS=smc,skc,gcc,icc
+ 'gcc' '-O2'
+MACH=AARCH64 OS=ALL LVL=1000 COMPS=dmc,dkc
+ 'gcc' '-O2'
+MACH=AARCH64 OS=ALL LVL=1000 COMPS=f77
+ 'gfortran' '-O'
+#
# Generic defaults
#
MACH=ALL OS=ALL LVL=5 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
Index: ATLAS/CONFIG/src/atlconf_misc.c
===================================================================
--- ATLAS.orig/CONFIG/src/atlconf_misc.c
+++ ATLAS/CONFIG/src/atlconf_misc.c
@@ -563,6 +563,7 @@ enum ARCHFAM ProbeArchFam(char *targ)
else if (strstr(res, "ia64")) fam = AFIA64;
else if (strstr(res, "mips")) fam = AFMIPS;
else if (strstr(res, "arm")) fam = AFARM;
+ else if (strstr(res, "aarch64")) fam = AFAARCH64;
else if (strstr(res, "s390")) fam = AFS390;
else if ( strstr(res, "i686") || strstr(res, "i586") ||
strstr(res, "i486") || strstr(res, "i386") ||
@@ -588,6 +589,7 @@ enum ARCHFAM ProbeArchFam(char *targ)
strstr(res, "x86_64") ) fam = AFX86;
else if (strstr(res, "mips")) fam = AFMIPS;
else if (strstr(res, "arm")) fam = AFARM;
+ else if (strstr(res, "aarch64")) fam = AFAARCH64;
else if (strstr(res, "s390")) fam = AFS390;
free(res);
}
Index: ATLAS/CONFIG/src/backend/Make.ext
===================================================================
--- ATLAS.orig/CONFIG/src/backend/Make.ext
+++ ATLAS/CONFIG/src/backend/Make.ext
@@ -57,6 +57,8 @@ probe_gas_arm.S : $(basf)
$(extC) -b $(basf) -o probe_gas_arm.S rout=probe_gas_arm.S
probe_gas_s390.S : $(basf)
$(extC) -b $(basf) -o probe_gas_s390.S rout=probe_gas_s390.S
+probe_gas_aarch64.S : $(basf)
+ $(extC) -b $(basf) -o probe_gas_aarch64.S rout=probe_gas_aarch64.S
probe_AVXMAC.S : $(basf)
$(extC) -b $(basf) -o probe_AVXMAC.S rout=probe_AVXMAC.S
probe_AVXFMA4.S : $(basf)
Index: ATLAS/CONFIG/src/backend/archinfo_linux.c
===================================================================
--- ATLAS.orig/CONFIG/src/backend/archinfo_linux.c
+++ ATLAS/CONFIG/src/backend/archinfo_linux.c
@@ -267,6 +267,14 @@ enum MACHTYPE ProbeArch()
free(res);
}
break;
+ case AFAARCH64:
+ res = atlsys_1L(NULL, "fgrep 'Processor' /proc/cpuinfo", 0, 0);
+ if (res)
+ {
+ if (strstr(res, "AArch64")) mach = AARCH64;
+ free(res);
+ }
+ break;
default:
#if 0
if (!CmndOneLine(NULL, "fgrep 'cpu family' /proc/cpuinfo", res))
Index: ATLAS/CONFIG/src/backend/probe_gas_aarch64.S
===================================================================
--- /dev/null
+++ ATLAS/CONFIG/src/backend/probe_gas_aarch64.S
@@ -0,0 +1,14 @@
+#define ATL_GAS_AARCH64
+#include "atlas_asm.h"
+#
+# Linux AArch64 assembler for:
+# int asm_probe(int i)
+# RETURNS: i*3
+#
+.text
+.globl ATL_asmdecor(asm_probe)
+.type ATL_asmdecor(asm_probe), %function
+ATL_asmdecor(asm_probe):
+ add w0, w0, w0, LSL #1
+ ret
+.size ATL_asmdecor(asm_probe),.-ATL_asmdecor(asm_probe)
Index: ATLAS/CONFIG/src/probe_comp.c
===================================================================
--- ATLAS.orig/CONFIG/src/probe_comp.c
+++ ATLAS/CONFIG/src/probe_comp.c
@@ -582,7 +582,7 @@ char *GetPtrbitsFlag(enum OSTYPE OS, enu
char *sp = "";
int i, j, k;
- if (MachIsIA64(arch))
+ if (MachIsIA64(arch) || MachIsAARCH64(arch))
return(sp);
if (MachIsMIPS(arch))
return((ptrbits == 64) ? "-mabi=64" : "-mabi=n32");
Index: ATLAS/include/atlas_genparse.h
===================================================================
--- ATLAS.orig/include/atlas_genparse.h
+++ ATLAS/include/atlas_genparse.h
@@ -6,13 +6,13 @@
#include <assert.h>
#include <string.h>
#include <ctype.h>
-#define NASMD 9
+#define NASMD 10
enum ASMDIA
{ASM_None=0, gas_x86_32, gas_x86_64, gas_sparc, gas_ppc, gas_parisc,
- gas_mips, gas_arm, gas_s390};
+ gas_mips, gas_arm, gas_s390, gas_aarch64};
static char *ASMNAM[NASMD] =
{"", "GAS_x8632", "GAS_x8664", "GAS_SPARC", "GAS_PPC", "GAS_PARISC",
- "GAS_MIPS", "GAS_ARM", "GAS_S390"};
+ "GAS_MIPS", "GAS_ARM", "GAS_S390", "GAS_AARCH64"};
/*
* Basic data structure for forming queues with some minimal info
*/

17
atlas-affinity.patch Normal file

@@ -0,0 +1,17 @@
diff -up wrk/src/threads/ATL_thread_start.c.wrk wrk/src/threads/ATL_thread_start.c
--- wrk/src/threads/ATL_thread_start.c.wrk 2013-09-23 13:46:51.881085276 +0200
+++ wrk/src/threads/ATL_thread_start.c 2013-09-24 16:13:59.021065418 +0200
@@ -101,9 +101,10 @@ int ATL_thread_start(ATL_thread_t *thr,
ATL_assert(!pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED));
pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM); /* no chk, OK to fail */
#ifdef ATL_PAFF_SETAFFNP
- CPU_ZERO(&cpuset);
- CPU_SET(affID, &cpuset);
- ATL_assert(!pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset));
+ //affinity crashes a machine with fewer processors than the builder
+ //CPU_ZERO(&cpuset);
+ //CPU_SET(affID, &cpuset);
+ //ATL_assert(!pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset));
#elif defined(ATL_PAFF_SETPROCNP)
ATL_assert(!pthread_attr_setprocessor_np(&attr, (pthread_spu_t)affID,
PTHREAD_BIND_FORCED_NP));
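
Note on the affinity patch: the patch disables thread pinning entirely, because, per its own comment, pinning crashes on a machine with fewer processors than the builder. One alternative, which is not what the patch does, would be to fold the affinity ID onto the CPUs present at run time. The sketch below shows only that mapping. `wrap_affinity_id` is a hypothetical name, and the CPU count is assumed to come from the caller, for example from `sysconf(_SC_NPROCESSORS_ONLN)` on Linux.

```c
#include <assert.h>

/*
 * Hypothetical helper (not ATLAS code): map an affinity ID that may have
 * been chosen for a machine with more CPUs onto the CPUs that exist at
 * run time. Returns -1 when the inputs are unusable, which the caller
 * can treat as "do not pin this thread".
 */
int wrap_affinity_id(int aff_id, long ncpu)
{
    if (ncpu <= 0 || aff_id < 0)
        return -1;
    return (int)(aff_id % ncpu);   /* always in the range [0, ncpu) */
}
```

Wrapping keeps some locality benefit on large machines but can stack several threads on one CPU of a small machine, which may be why simply dropping the pinning was preferred for a distribution package.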

14
atlas-genparse.patch Normal file

@@ -0,0 +1,14 @@
diff --git a/include/atlas_genparse.h b/include/atlas_genparse.h
index 909a38e..1e6d153 100644
--- a/include/atlas_genparse.h
+++ b/include/atlas_genparse.h
@@ -163,7 +163,8 @@ static int GetDoubleArr(char *str, int N, double *d)
if (!str)
break;
str++;
- assert(sscanf(str, "%le", d+i) == 1);
+ if (sscanf(str, "%le", d+i) != 1)
+ break;
i++;
}
return(i);
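
Note on the genparse patch: the change moves the `sscanf` call out of `assert()`. The commit does not state its motivation. Two plausible reasons are that an expression inside `assert()` is not evaluated at all when the code is built with `-DNDEBUG`, and that malformed input should end parsing rather than abort the program. A minimal sketch of the resulting pattern follows. `parse_double_list` is a hypothetical helper with a comma-separated input format chosen for illustration, not ATLAS's `GetDoubleArr`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not ATLAS code): parse up to n comma-separated
 * doubles from str into out and return how many were parsed. Parsing
 * stops quietly at the first token that is not a number.
 */
int parse_double_list(const char *str, int n, double *out)
{
    int i = 0;

    /* assert() is still fine for a side-effect-free programmer check. */
    assert(out != NULL);

    while (str && i < n) {
        /* The call with the side effect stays outside assert(), so it
         * still runs when the code is compiled with -DNDEBUG. */
        if (sscanf(str, "%le", &out[i]) != 1)
            break;
        i++;
        str = strchr(str, ',');   /* find the next separator, if any */
        if (str)
            str++;                /* step past the comma */
    }
    return i;
}
```

Callers can compare the return value with the expected count to decide whether a short read is an error.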


@@ -1,15 +1,16 @@
diff -up ATLAS/CONFIG/src/SpewMakeInc.c.melf ATLAS/CONFIG/src/SpewMakeInc.c
--- ATLAS/CONFIG/src/SpewMakeInc.c.melf 2011-05-14 11:33:24.000000000 -0600
+++ ATLAS/CONFIG/src/SpewMakeInc.c 2012-08-09 10:52:28.051926489 -0600
@@ -665,9 +665,9 @@ main(int nargs, char **args)
if (MachIsX86(mach))
{
if (ptrbits == 32)
- fprintf(fpout, " -melf_i386");
+ fprintf(fpout, " -Wl,-melf_i386");
else if (ptrbits == 64)
- fprintf(fpout, " -melf_x86_64");
+ fprintf(fpout, " -Wl,-melf_x86_64");
if (OS == OSFreeBSD)
fprintf(fpout, "_fbsd");
}
diff --git a/CONFIG/src/SpewMakeInc.c b/CONFIG/src/SpewMakeInc.c
index eed259e..65d68a1 100644
--- a/CONFIG/src/SpewMakeInc.c
+++ b/CONFIG/src/SpewMakeInc.c
@@ -764,9 +764,9 @@ int main(int nargs, char **args)
else
{
if (ptrbits == 32)
- fprintf(fpout, " -melf_i386");
+ fprintf(fpout, " -Wl,-melf_i386");
else if (ptrbits == 64)
- fprintf(fpout, " -melf_x86_64");
+ fprintf(fpout, " -Wl,-melf_x86_64");
if (OS == OSFreeBSD)
fprintf(fpout, "_fbsd");
}
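
Note on the -melf patch: `-melf_i386` and `-melf_x86_64` are emulation options of the linker. When linking goes through the gcc driver they need the `-Wl,` prefix so that the driver forwards them to ld instead of interpreting them itself. A small sketch of building the flag in both forms follows. `format_emulation_flag` and its parameters are hypothetical and only mirror the branches visible in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of ATLAS): build the x86 emulation flag.
 * through_driver != 0 yields the "-Wl," form that the gcc driver forwards
 * to ld; 0 yields the bare ld option.
 * Returns the length written, or -1 on unsupported ptrbits or truncation.
 */
int format_emulation_flag(char *buf, size_t size, int ptrbits,
                          int through_driver, int is_freebsd)
{
    const char *emul;
    int n;

    assert(buf != NULL);
    if (ptrbits == 32)
        emul = "elf_i386";
    else if (ptrbits == 64)
        emul = "elf_x86_64";
    else
        return -1;

    n = snprintf(buf, size, "%s-m%s%s",
                 through_driver ? "-Wl," : "", emul,
                 is_freebsd ? "_fbsd" : "");
    if (n < 0 || (size_t)n >= size)
        return -1;                /* output error or buffer too small */
    return n;
}
```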


@@ -0,0 +1,32 @@
Subject: atlas new archdef for ppc64le
From: Michel Normand <normand@linux.vnet.ibm.com>
Date: Sun, 13 Jun 2014 18:02:47 +0200
Need to define different archdef names
for ppc64 (that is Big Endian) and ppc64le (that is Little Endian).
This is already done upstream in atlas 3.11.30 with issue
https://sourceforge.net/p/math-atlas/patches/66/
Required at least as long as I need the bypass of
atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
---
CONFIG/src/SpewMakeInc.c | 4 ++++
1 file changed, 4 insertions(+)
Index: ATLAS/CONFIG/src/SpewMakeInc.c
===================================================================
--- ATLAS.orig/CONFIG/src/SpewMakeInc.c
+++ ATLAS/CONFIG/src/SpewMakeInc.c
@@ -542,6 +542,10 @@ int main(int nargs, char **args)
fprintf(fpout, "# -------------------------------------------------\n");
fprintf(fpout, " ARCH = %s", machnam[mach]);
fprintf(fpout, "%d", ptrbits);
+ /* for ppc64le archi add 'LE' characters */
+ #if defined(__powerpc64__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
+ fprintf(fpout, "%s", "LE");
+ #endif
if (ISAX)
fprintf(fpout, "%s", ISAXNAM[ISAX]);
if (!USEIEEE)
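
Note on the ppc64le archdef name: the patch appends "LE" to the ARCH string at compile time, using the compiler's predefined `__powerpc64__` and `__BYTE_ORDER__` macros. This matches the `POWER864LEVSXp4` archdef tarball listed in .gitignore above. A standalone sketch of the same naming scheme follows. `format_arch_name` is a hypothetical helper, and the ISA-extension suffix that SpewMakeInc.c appends afterwards is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not ATLAS code): write "<mach><ptrbits>" into buf,
 * followed by "LE" when compiled for little-endian 64-bit PowerPC.
 * Returns what snprintf returns: the length of the full name.
 */
int format_arch_name(char *buf, size_t size, const char *mach, int ptrbits)
{
    const char *suffix = "";

    assert(buf != NULL && mach != NULL);
#if defined(__powerpc64__) && defined(__BYTE_ORDER__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    suffix = "LE";   /* only ppc64le builds get the suffix */
#endif
    return snprintf(buf, size, "%s%d%s", mach, ptrbits, suffix);
}
```

Because the decision is made by the preprocessor, the suffix reflects the target the configure program was compiled for, not a property probed at run time.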


@@ -1,279 +0,0 @@
---
CONFIG/include/atlconf.h | 18 +++++++-----
CONFIG/src/Makefile | 5 +++
CONFIG/src/SpewMakeInc.c | 5 +++
CONFIG/src/atlcomp.txt | 50 ++++++++++++++++++++++++++++++++++++
CONFIG/src/atlconf_misc.c | 2 +
CONFIG/src/backend/Make.ext | 2 +
CONFIG/src/backend/archinfo_linux.c | 12 ++++++++
CONFIG/src/backend/probe_gas_s390.S | 13 +++++++++
CONFIG/src/probe_comp.c | 2 +
include/atlas_prefetch.h | 6 ++++
10 files changed, 108 insertions(+), 7 deletions(-)
Index: b/CONFIG/include/atlconf.h
===================================================================
--- a/CONFIG/include/atlconf.h
+++ b/CONFIG/include/atlconf.h
@@ -14,9 +14,9 @@ enum OSTYPE {OSOther=0, OSLinux, OSSunOS
OSWin9x, OSWinNT, OSHPUX, OSFreeBSD, OSOSX};
#define OSIsWin(OS_) (((OS_) == OSWinNT) || ((OS_) == OSWin9x))
-enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS};
+enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS, AFS390};
-#define NMACH 37
+#define NMACH 42
static char *machnam[NMACH] =
{"UNKNOWN", "POWER3", "POWER4", "POWER5", "PPCG4", "PPCG5",
"POWER6", "POWER7",
@@ -25,7 +25,8 @@ static char *machnam[NMACH] =
"Efficeon", "K7", "HAMMER", "AMD64K10h", "UNKNOWNx86",
"IA64Itan", "IA64Itan2",
"USI", "USII", "USIII", "USIV", "UST2", "UnknownUS",
- "MIPSR1xK", "MIPSICE9"};
+ "MIPSR1xK", "MIPSICE9",
+ "IBMz900", "IBMz990", "IBMz9", "IBMz10", "IBMz196" };
enum MACHTYPE {MACHOther, IbmPwr3, IbmPwr4, IbmPwr5, PPCG4, PPCG5,
IbmPwr6, IbmPwr7,
IntP5, IntP5MMX, IntPPRO, IntPII, IntPIII, IntPM, IntCoreS,
@@ -34,7 +35,8 @@ enum MACHTYPE {MACHOther, IbmPwr3, IbmPw
IA64Itan, IA64Itan2,
SunUSI, SunUSII, SunUSIII, SunUSIV, SunUST2, SunUSX,
MIPSR1xK, /* includes R10K, R12K, R14K, R16K */
- MIPSICE9 /* SiCortex ICE9 -- like MIPS5K */
+ MIPSICE9, /* SiCortex ICE9 -- like MIPS5K */
+ IBMz900, IBMz990, IBMz9, IBMz10, IBMz196 /* s390(x) in Linux */
};
#define MachIsX86(mach_) \
( (mach_) >= IntP5 && (mach_) <= x86X )
@@ -51,6 +53,8 @@ enum MACHTYPE {MACHOther, IbmPwr3, IbmPw
#endif
#define MachIsPPC(mach_) \
( (mach_) >= PPCG4 && (mach_) <= PPCG5 )
+#define MachIsS390(mach_) \
+ ( (mach_) >= IBMz900 && (mach_) <= IBMz196 )
static char *f2c_namestr[5] = {"UNKNOWN","Add_", "Add__", "NoChange", "UpCase"};
static char *f2c_intstr[5] =
@@ -68,13 +72,13 @@ static char *ISAXNAM[NISA] =
{"", "AltiVec", "SSE3", "SSE2", "SSE1", "3DNow"};
enum ISAEXT {ISA_None=0, ISA_AV, ISA_SSE3, ISA_SSE2, ISA_SSE1, ISA_3DNow};
-#define NASMD 7
+#define NASMD 8
enum ASMDIA
{ASM_None=0, gas_x86_32, gas_x86_64, gas_sparc, gas_ppc, gas_parisc,
- gas_mips};
+ gas_mips, gas_s390};
static char *ASMNAM[NASMD] =
{"", "GAS_x8632", "GAS_x8664", "GAS_SPARC", "GAS_PPC", "GAS_PARISC",
- "GAS_MIPS"};
+ "GAS_MIPS", "GAS_S390"};
/*
Index: b/CONFIG/src/Makefile
===================================================================
--- a/CONFIG/src/Makefile
+++ b/CONFIG/src/Makefile
@@ -177,6 +177,11 @@ IRun_GAS_x8632 :
$(MAKE) $(atlrun) atldir=$(mydir) exe=xprobe_gas_x8632 args="$(args)" \
redir=config0.out
- cat config0.out
+IRun_GAS_S390 :
+ $(CC) $(CCFLAGS) -o xprobe_gas_s390 $(SRCdir)/backend/probe_this_asm.c $(SRCdir)/backend/probe_gas_s390.S
+ $(MAKE) $(atlrun) atldir=$(mydir) exe=xprobe_gas_s390 args="$(args)" \
+ redir=config0.out
+ - cat config0.out
IRunC2C :
- rm -f config0.out xc2c c2cslave.o
Index: b/CONFIG/src/SpewMakeInc.c
===================================================================
--- a/CONFIG/src/SpewMakeInc.c
+++ b/CONFIG/src/SpewMakeInc.c
@@ -342,6 +342,9 @@ char *GetPtrbitsFlag(enum OSTYPE OS, enu
return(sp);
if (MachIsMIPS(arch))
return((ptrbits == 64) ? "-mabi=64" : "-mabi=n32");
+ if (MachIsS390(arch))
+ return((ptrbits == 64) ? "-m64" : "-m31");
+
if (!CompIsGcc(comp))
{
/*
@@ -671,6 +674,8 @@ main(int nargs, char **args)
if (OS == OSFreeBSD)
fprintf(fpout, "_fbsd");
}
+ if (MachIsS390(mach))
+ fprintf(fpout, ptrbits == 32 ? "-m31" : "-m64");
fprintf(fpout, "\n F77SYSLIB = %s\n", f77lib ? f77lib : "");
fprintf(fpout, " BC = $(ICC)\n");
fprintf(fpout, " NCFLAGS = $(ICCFLAGS)\n");
Index: b/CONFIG/src/atlcomp.txt
===================================================================
--- a/CONFIG/src/atlcomp.txt
+++ b/CONFIG/src/atlcomp.txt
@@ -164,6 +164,56 @@ MACH=ALL OS=WinNT LVL=0 COMPS=f77
MACH=P4,PM OS=WinNT LVL=0 COMPS=icc,dmc,smc,dkc,skc,xcc
'icl' '-QxN -O3 -Qprec -fp:extended -fp:except -nologo -Oy'
#
+# IBM System z or zEnterprise
+#
+
+# z900 or z800
+MACH=IBMz900 OS=ALL LVL=1000 COMPS=f77
+ 'gfortran' '-march=z900 -O3 -funroll-loops'
+MACH=IBMz900 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
+ 'gcc' '-march=z900 -O3 -funroll-loops'
+
+# z990 or z890
+MACH=IBMz990 OS=ALL LVL=1000 COMPS=f77
+ 'gfortran' '-march=z990 -O3 -funroll-loops'
+MACH=IBMz990 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
+ 'gcc' '-march=z990 -O3 -funroll-loops'
+
+# z9-EC z9-BC or z9-109
+MACH=IBMz9 OS=ALL LVL=1000 COMPS=f77
+ 'gfortran' '-march=z9-109 -O3 -funroll-loops'
+MACH=IBMz9 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
+ 'gcc' '-march=z9-109 -O3 -funroll-loops'
+
+# on z10 and z196 gcc emits prefetches which disturb cache size
+# detection and optimization. Therefore, we use fno-prefetch-loop-arrays
+# z10
+MACH=IBMz10 OS=ALL LVL=1000 COMPS=f77
+ 'gfortran' '-march=z10 -O3 -funroll-loops -fno-prefetch-loop-arrays'
+MACH=IBMz10 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
+ 'gcc' '-march=z10 -O3 -funroll-loops -fno-prefetch-loop-arrays'
+
+# z196. we also try to fallback to z10 and z9 for older compilers
+MACH=IBMz196 OS=ALL LVL=1000 COMPS=f77
+ 'gfortran' '-march=z196 -O3 -funroll-loops -fno-prefetch-loop-arrays'
+MACH=IBMz196 OS=ALL LVL=800 COMPS=f77
+ 'gfortran' '-march=z10 -O3 -funroll-loops -fno-prefetch-loop-arrays'
+MACH=IBMz196 OS=ALL LVL=600 COMPS=f77
+ 'gfortran' '-march=z9-109 -O3 -funroll-loops'
+MACH=IBMz196 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
+ 'gcc' '-march=z196 -O3 -funroll-loops -fno-prefetch-loop-arrays'
+MACH=IBMz196 OS=ALL LVL=800 COMPS=smc,dmc,skc,dkc,icc,xcc
+ 'gcc' '-march=z10 -O3 -funroll-loops -fno-prefetch-loop-arrays'
+MACH=IBMz196 OS=ALL LVL=600 COMPS=smc,dmc,skc,dkc,icc,xcc
+ 'gcc' '-march=z9-109 -O3 -funroll-loops'
+
+# ALL march options failed, go back to conservative defaults
+MACH=IBMz900,IBMz990,IBMz9,IBMz10,IBMz196 OS=ALL LVL=500 COMPS=f77
+ 'gfortran' '-O3 -funroll-loops'
+MACH=IBMz900,IBMz990,IBMz9,IBMz10,IBMz196 OS=ALL LVL=500 COMPS=smc,dmc,skc,dkc,icc,xcc
+ 'gcc' '-O3 -funroll-loops'
+
+#
# Generic defaults
#
MACH=ALL OS=ALL LVL=5 COMPS=icc,smc,dmc,skc,dkc,xcc
Index: b/CONFIG/src/atlconf_misc.c
===================================================================
--- a/CONFIG/src/atlconf_misc.c
+++ b/CONFIG/src/atlconf_misc.c
@@ -480,6 +480,7 @@ enum ARCHFAM ProbeArchFam(char *targ)
else if (strstr(res, "alpha")) fam = AFALPHA;
else if (strstr(res, "ia64")) fam = AFIA64;
else if (strstr(res, "mips")) fam = AFMIPS;
+ else if (strstr(res, "s390")) fam = AFS390;
else if ( strstr(res, "i686") || strstr(res, "i586") ||
strstr(res, "i486") || strstr(res, "i386") ||
strstr(res, "x86") || strstr(res, "x86_64") ) fam = AFX86;
@@ -501,6 +502,7 @@ enum ARCHFAM ProbeArchFam(char *targ)
strstr(res, "i486") || strstr(res, "i386") ||
strstr(res, "x86_64") ) fam = AFX86;
else if (strstr(res, "mips")) fam = AFMIPS;
+ else if (strstr(res, "s390")) fam = AFS390;
}
}
return(fam);
Index: b/CONFIG/src/backend/Make.ext
===================================================================
--- a/CONFIG/src/backend/Make.ext
+++ b/CONFIG/src/backend/Make.ext
@@ -43,6 +43,8 @@ probe_gas_parisc.S : $(basf)
$(extC) -b $(basf) -o probe_gas_parisc.S rout=probe_gas_parisc.S
probe_gas_mips.S : $(basf)
$(extC) -b $(basf) -o probe_gas_mips.S rout=probe_gas_mips.S
+probe_gas_s390.S : $(basf)
+ $(extC) -b $(basf) -o probe_gas_s390.S rout=probe_gas_s390.S
probe_SSE3.S : $(basf)
$(extC) -b $(basf) -o probe_SSE3.S rout=probe_SSE3.S
probe_SSE2.S : $(basf)
Index: b/CONFIG/src/backend/archinfo_linux.c
===================================================================
--- a/CONFIG/src/backend/archinfo_linux.c
+++ b/CONFIG/src/backend/archinfo_linux.c
@@ -193,6 +193,18 @@ enum MACHTYPE ProbeArch()
}
#endif
break;
+ case AFS390:
+ if ( !CmndOneLine(NULL, "cat /proc/cpuinfo | fgrep \"processor \"", res) )
+ {
+ if (strstr(res, "2064") || strstr(res, "2066")) mach = IBMz900;
+ else if (strstr(res, "2084") || strstr(res, "2086")) mach = IBMz990;
+ else if (strstr(res, "2094") || strstr(res, "2096")) mach = IBMz9;
+ else if (strstr(res, "2097") || strstr(res, "2098")) mach = IBMz10;
+ /* we consider anything else to be a z196 or later */
+ else mach = IBMz196;
+ }
+ break;
+
default:
#if 0
if (!CmndOneLine(NULL, "fgrep 'cpu family' /proc/cpuinfo", res))
Index: b/CONFIG/src/backend/probe_gas_s390.S
===================================================================
--- /dev/null
+++ b/CONFIG/src/backend/probe_gas_s390.S
@@ -0,0 +1,13 @@
+#define ATL_GAS_PPC
+#include "atlas_asm.h"
+/*
+ * Linux S390 assembler for:
+ * int asm_probe(int i)
+ * RETURNS: i*3
+ */
+.globl ATL_asmdecor(asm_probe)
+ATL_asmdecor(asm_probe):
+ lr r3,r2
+ ar r2,r3
+ ar r2,r3
+ br r14
Index: b/CONFIG/src/probe_comp.c
===================================================================
--- a/CONFIG/src/probe_comp.c
+++ b/CONFIG/src/probe_comp.c
@@ -509,6 +509,8 @@ char *GetPtrbitsFlag(enum OSTYPE OS, enu
return(sp);
if (MachIsMIPS(arch))
return((ptrbits == 64) ? "-mabi=64" : "-mabi=n32");
+ if (MachIsS390(arch))
+ return((ptrbits == 64) ? "-m64" : "-m31");
if (!CompIsGcc(comp))
{
/*
Index: b/include/atlas_prefetch.h
===================================================================
--- a/include/atlas_prefetch.h
+++ b/include/atlas_prefetch.h
@@ -149,6 +149,12 @@
#define ATL_GOT_L1PREFETCH
#define ATL_L1LS 32
#define ATL_L2LS 64
+#elif defined(ATL_ARCH_IBMz196) || defined(ATL_ARCH_IBMz10)
+ #define ATL_pfl1R(mem) __builtin_prefetch(mem, 0, 3)
+ #define ATL_pfl1W(mem) __builtin_prefetch(mem, 1, 3)
+ #define ATL_GOT_L1PREFETCH
+ #define ATL_L1LS 256
+ #define ATL_L2LS 256
#elif defined(__GNUC__) /* last ditch, use gcc predefined func */
#define ATL_pfl1R(mem) __builtin_prefetch(mem, 0, 3)
#define ATL_pfl1W(mem) __builtin_prefetch(mem, 1, 3)
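
Note on the AFS390 case in ProbeArch() above: it matches the machine-type numbers with strstr() against the whole "processor" line, so digits in another field of that line could in principle match as well. A minimal sketch of a stricter lookup follows. It assumes the line contains a "machine = NNNN" field, and the names zmach and classify_s390 are hypothetical, not part of ATLAS.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum for this sketch; the patch itself uses ATLAS's MACHTYPE. */
enum zmach { Z_UNKNOWN = 0, Z_900, Z_990, Z_9, Z_10, Z_196_OR_LATER };

/*
 * Map one "processor" line from /proc/cpuinfo to a machine generation.
 * Assumption: the line carries a "machine = NNNN" field. Only that field
 * is compared, so digits elsewhere in the line cannot match by accident.
 */
static enum zmach classify_s390(const char *line)
{
    static const char key[] = "machine = ";
    const char *p;

    assert(line != NULL);
    p = strstr(line, key);
    if (!p)
        return Z_UNKNOWN;
    p += sizeof(key) - 1;   /* skip past the key to the 4-digit type */

    if (!strncmp(p, "2064", 4) || !strncmp(p, "2066", 4)) return Z_900;
    if (!strncmp(p, "2084", 4) || !strncmp(p, "2086", 4)) return Z_990;
    if (!strncmp(p, "2094", 4) || !strncmp(p, "2096", 4)) return Z_9;
    if (!strncmp(p, "2097", 4) || !strncmp(p, "2098", 4)) return Z_10;
    /* Same policy as the patch: anything else is treated as z196 or later. */
    return Z_196_OR_LATER;
}
```

Unlike the patch, this sketch reports Z_UNKNOWN when the field is missing instead of defaulting to the newest machine; which behaviour is preferable depends on how the caller handles an unknown machine.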

@ -0,0 +1,40 @@
From 3119c671c566761a79ac98405cb619892acde3e8 Mon Sep 17 00:00:00 2001
From: Lukas Slebodnik <lslebodn@redhat.com>
Date: Fri, 20 Sep 2013 09:26:58 +0200
Subject: [PATCH] atlas-shared_libraries
---
ATLAS/makes/Make.lib | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/ATLAS/makes/Make.lib b/ATLAS/makes/Make.lib
index ab1eb9963d36678972a0a410905169aaa563dc64..27c6e316b442e09b0f46afac7940aaa11e25e45c 100644
--- a/ATLAS/makes/Make.lib
+++ b/ATLAS/makes/Make.lib
@@ -4,6 +4,8 @@ mySRCdir = $(SRCdir)/lib
#
# override with libatlas.so only when atlas is built to one lib
#
+so_ver_major=3
+so_ver = $(so_ver_major).10
DYNlibs = liblapack.so libf77blas.so libcblas.so libatlas.so
PTDYNlibs = liblapack.so libptf77blas.so libptcblas.so libatlas.so
CDYNlibs = liblapack.so libcblas.so libatlas.so
@@ -116,9 +116,12 @@ LDTRY:
-rpath-link $(LIBINSTdir) \
--whole-archive $(libas) --no-whole-archive $(LIBS)
GCCTRY:
- $(GOODGCC) -shared -o $(outso) \
- -Wl,"-rpath-link $(LIBINSTdir)" \
+ $(GOODGCC) -shared -o $(outso).$(so_ver) \
+ \
+ -Wl,-soname,"$(outso).$(so_ver_major)" \
-Wl,--whole-archive $(libas) -Wl,--no-whole-archive $(LIBS)
+ ln -s $(outso).$(so_ver) $(outso).$(so_ver_major)
+ ln -s $(outso).$(so_ver) $(outso)
GCCTRY_norp:
$(GOODGCC) -shared -o $(outso) \
-Wl,--whole-archive $(libas) -Wl,--no-whole-archive $(LIBS)
--
1.8.3.1

atlas-throttling.patch (new file, 12 lines)

@ -0,0 +1,12 @@
diff -up ATLAS/CONFIG/src/config.c.zaloha ATLAS/CONFIG/src/config.c
--- ATLAS/CONFIG/src/config.c.zaloha 2012-10-25 11:29:02.495425989 +0200
+++ ATLAS/CONFIG/src/config.c 2012-10-25 11:42:10.218216957 +0200
@@ -711,6 +711,8 @@ int ProbePtrbits(int verb, char *targarg
int ProbeCPUThrottle(int verb, char *targarg, enum OSTYPE OS, enum ASMDIA asmb)
{
+ return 0; /* impossible to turn off cpu throttling => ignore */
+ /* this undermines performance of compiled library */
int i, iret;
char *ln;
i = strlen(targarg) + 22 + 12;

@ -0,0 +1,17 @@
diff -up wrk/makes/Make.lib.wrk wrk/makes/Make.lib
--- wrk/makes/Make.lib.wrk 2015-01-23 21:14:46.465494411 +0100
+++ wrk/makes/Make.lib 2015-01-23 22:48:39.632479588 +0100
@@ -185,11 +185,11 @@ TRYALL :
#
fat_ptshared : # threaded target
$(MAKE) TRYALL outso=libtatlas.so \
- libas="libptlapack.a libptf77blas.a libptcblas.a libatlas.a" \
+ libas="libptlapack.a libptf77blas.a libptcblas.a libatlas.a $(SLAPACKlib)" \
LIBINSTdir="$(LIBINSTdir)"
fat_shared : # serial target
$(MAKE) TRYALL outso=libsatlas.so \
- libas="liblapack.a libf77blas.a libcblas.a libatlas.a" \
+ libas="liblapack.a libf77blas.a libcblas.a libatlas.a $(SLAPACKlib)" \
LIBINSTdir="$(LIBINSTdir)"
#
# Builds shared lib, not include fortran codes from LAPACK

@ -0,0 +1,131 @@
From: Michel Normand <normand@linux.vnet.ibm.com>
Subject: atlas.3.10.2 add power8 cpu
Date: Thu, 18 Sep 2014 15:13:24 +0200
atlas.3.10.2 add Power8 cpu
tracked upstream by issue 67
https://sourceforge.net/p/math-atlas/patches/67/
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
---
CONFIG/ARCHS/Make.ext | 7 +++++++
CONFIG/include/atlconf.h | 6 +++---
CONFIG/src/atlcomp.txt | 6 ++++++
CONFIG/src/backend/archinfo_aix.c | 2 ++
CONFIG/src/backend/archinfo_linux.c | 1 +
include/atlas_pca.h | 2 +-
6 files changed, 20 insertions(+), 4 deletions(-)
Index: ATLAS/CONFIG/ARCHS/Make.ext
===================================================================
--- ATLAS.orig/CONFIG/ARCHS/Make.ext
+++ ATLAS/CONFIG/ARCHS/Make.ext
@@ -33,6 +33,7 @@ files = AMD64K10h32SSE3.tar.bz2 AMD64K10
MIPSR1xK64.tar.bz2 Makefile P432SSE2.tar.bz2 P4E32SSE3.tar.bz2 \
P4E64SSE3.tar.bz2 PIII32SSE1.tar.bz2 POWER432.tar.bz2 \
POWER464.tar.bz2 POWER564.tar.bz2 POWER764VSX.tar.bz2 \
+ POWER864VSX.tar.bz2 \
PPCG432AltiVec.tar.bz2 PPCG532AltiVec.tar.bz2 PPCG564AltiVec.tar.bz2 \
PPRO32.tar.bz2 USIII32.tar.bz2 USIII64.tar.bz2 USIV32.tar.bz2 \
USIV64.tar.bz2 UST232.tar.bz2 UST264.tar.bz2 atlas_test1.1.3.tar.bz2 \
@@ -308,6 +309,12 @@ POWER764VSX.tar.bz2 : $(basdr)/POWER764V
/tmp/POWER764VSX.tar POWER764VSX
bzip2 /tmp/POWER764VSX.tar
mv /tmp/POWER764VSX.tar.bz2 ./.
+POWER864VSX.tar.bz2 : $(basdr)/POWER864VSX
+ - rm -f /tmp/POWER864VSX.tar /tmp/POWER864VSX.tar.bz2
+ cd $(basdr) ; tar --dereference --exclude 'CVS' -c -f \
+ /tmp/POWER864VSX.tar POWER864VSX
+ bzip2 /tmp/POWER864VSX.tar
+ mv /tmp/POWER864VSX.tar.bz2 ./.
IBMz1032.tar.bz2 : $(basdr)/IBMz1032
- rm -f /tmp/IBMz1032.tar /tmp/IBMz1032.tar.bz2
cd $(basdr) ; tar --dereference --exclude 'CVS' -c -f \
Index: ATLAS/CONFIG/include/atlconf.h
===================================================================
--- ATLAS.orig/CONFIG/include/atlconf.h
+++ ATLAS/CONFIG/include/atlconf.h
@@ -18,10 +18,10 @@ enum OSTYPE {OSOther=0, OSLinux, OSSunOS
enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS,
AFARM, AFS390};
-#define NMACH 52
+#define NMACH 53
static char *machnam[NMACH] =
{"UNKNOWN", "POWER3", "POWER4", "POWER5", "PPCG4", "PPCG5",
- "POWER6", "POWER7", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
+ "POWER6", "POWER7", "POWER8", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
"x86x87", "x86SSE1", "x86SSE2", "x86SSE3",
"P5", "P5MMX", "PPRO", "PII", "PIII", "PM", "CoreSolo",
"CoreDuo", "Core2Solo", "Core2", "Corei1", "Corei2", "Corei3",
@@ -31,7 +31,7 @@ static char *machnam[NMACH] =
"USI", "USII", "USIII", "USIV", "UST1", "UST2", "UnknownUS",
"MIPSR1xK", "MIPSICE9", "ARMv7"};
enum MACHTYPE {MACHOther, IbmPwr3, IbmPwr4, IbmPwr5, PPCG4, PPCG5,
- IbmPwr6, IbmPwr7, Pwre6500,
+ IbmPwr6, IbmPwr7, IbmPwr8, Pwre6500,
IbmZ9, IbmZ10, IbmZ196, /* s390(x) in Linux */
x86x87, x86SSE1, x86SSE2, x86SSE3, /* generic targets */
IntP5, IntP5MMX, IntPPRO, IntPII, IntPIII, IntPM, IntCoreS,
Index: ATLAS/CONFIG/src/atlcomp.txt
===================================================================
--- ATLAS.orig/CONFIG/src/atlcomp.txt
+++ ATLAS/CONFIG/src/atlcomp.txt
@@ -190,6 +190,10 @@ MACH=PPCG5 OS=ALL LVL=1000 COMPS=dmc,icc
'gcc' '-mpowerpc64 -maltivec -mabi=altivec -mcpu=970 -mtune=970 -O2'
MACH=PPCG5 OS=ALL LVL=1000 COMPS=skc
'gcc' '-mpowerpc64 -maltivec -mabi=altivec -mcpu=970 -mtune=970 -O2 -mvrsave'
+MACH=POWER8 OS=ALL LVL=1010 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
+ 'gcc' '-O2 -mvsx -mcpu=power8 -mtune=power8 -m64 -mvrsave -funroll-all-loops'
+MACH=POWER8 OS=ALL LVL=1010 COMPS=f77
+ 'gfortran' '-O2 -mvsx -mcpu=power8 -mtune=power8 -m64 -mvrsave -funroll-all-loops'
MACH=POWER7 OS=ALL LVL=1010 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
'gcc' '-O2 -mvsx -mcpu=power7 -mtune=power7 -m64 -mvrsave -funroll-all-loops'
MACH=POWER7 OS=ALL LVL=1010 COMPS=f77
@@ -210,6 +214,8 @@ MACH=POWER4 OS=ALL LVL=1010 COMPS=icc,dm
'gcc' '-mcpu=power4 -mtune=power4 -O3 -fno-schedule-insns -fno-rerun-loop-opt'
MACH=POWER4 OS=ALL LVL=1010 COMPS=f77
'xlf' '-qtune=pwr4 -qarch=pwr4 -O3 -qmaxmem=-1 -qfloat=hsflt'
+MACH=POWER8 OS=ALL LVL=1010 COMPS=f77
+ 'xlf' '-qtune=pwr8 -qarch=pwr8 -O3 -qmaxmem=-1 -qfloat=hsflt'
#
# IBM System z or zEnterprise.
# These compiler flags given by IBM; -O3 -funroll-loops are chosen because
Index: ATLAS/CONFIG/src/backend/archinfo_linux.c
===================================================================
--- ATLAS.orig/CONFIG/src/backend/archinfo_linux.c
+++ ATLAS/CONFIG/src/backend/archinfo_linux.c
@@ -77,6 +77,7 @@ enum MACHTYPE ProbeArch()
else if (strstr(res, "7455")) mach = PPCG4;
else if (strstr(res, "PPC970FX")) mach = PPCG5;
else if (strstr(res, "PPC970MP")) mach = PPCG5;
+ else if (strstr(res, "POWER8")) mach = IbmPwr8;
else if (strstr(res, "POWER7")) mach = IbmPwr7;
else if (strstr(res, "POWER6")) mach = IbmPwr6;
else if (strstr(res, "POWER5")) mach = IbmPwr5;
Index: ATLAS/include/atlas_pca.h
===================================================================
--- ATLAS.orig/include/atlas_pca.h
+++ ATLAS/include/atlas_pca.h
@@ -26,7 +26,7 @@
#endif
#elif defined(ATL_ARCH_POWER3) || defined(ATL_ARCH_POWER4) || \
defined(ATL_ARCH_POWER5) || defined(ATL_ARCH_POWER6) || \
- defined(ATL_ARCH_POWER7)
+ defined(ATL_ARCH_POWER7) || defined(ATL_ARCH_POWER8)
#ifdef __GNUC__
#define ATL_membarrier __asm__ __volatile__ ("dcs")
/* #define ATL_USEPCA 1 */
Index: ATLAS/CONFIG/src/backend/archinfo_aix.c
===================================================================
--- ATLAS.orig/CONFIG/src/backend/archinfo_aix.c
+++ ATLAS/CONFIG/src/backend/archinfo_aix.c
@@ -67,6 +67,8 @@ enum MACHTYPE ProbeArch()
{
if (strstr(res, "PowerPC_POWER5"))
mach = IbmPwr5;
+ else if (strstr(res, "PowerPC_POWER8"))
+ mach = IbmPwr8;
else if (strstr(res, "PowerPC_POWER7"))
mach = IbmPwr7;
else if (strstr(res, "PowerPC_POWER6"))
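
Note on the atlconf.h hunk above: adding POWER8 means touching NMACH, machnam[] and enum MACHTYPE together, and nothing in the patch checks that the three stay aligned. One possible guard is a compile-time check, sketched below on shortened tables. The names machtype, MACH_COUNT and mach_name are hypothetical, and the sketch assumes a C11 compiler for _Static_assert.

```c
#include <assert.h>
#include <string.h>

/* Shortened, hypothetical stand-ins for ATLAS's MACHTYPE and machnam[]. */
enum machtype { MACHOther, IbmPwr6, IbmPwr7, IbmPwr8, Pwre6500, MACH_COUNT };

static const char *const machnam[] =
    {"UNKNOWN", "POWER6", "POWER7", "POWER8", "POWERe6500"};

/* Fails at compile time if a name is added without an enum entry or
 * the other way round. This replaces a hand-maintained NMACH count. */
_Static_assert(sizeof(machnam) / sizeof(machnam[0]) == MACH_COUNT,
               "machnam[] and enum machtype are out of sync");

/* Look up the printable name for a machine type. */
static const char *mach_name(enum machtype m)
{
    assert((int)m >= 0 && (int)m < MACH_COUNT);
    return machnam[m];
}
```

The check only compares lengths; it would not catch two entries swapped within either table, so ordering still has to be reviewed by eye.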

@ -0,0 +1,220 @@
From: Michel Normand <normand@linux.vnet.ibm.com>
Subject: atlas.3.10.2 ppc64le abiv2 patch
Date: Mon, 28 Jul 2014 04:29:05 -0400
atlas.3.10.2 abiv2 step2 complete the changes already present in atlas 3.10.2
* still some files with opd ABI V1 to be disabled for ABI V2
tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
atlas.3.10.2 ppc64le abiv2 step3
* change offsets of parameters read from stack to avoid some segfaults.
(values changes 120 => 104 and 128 => 112 identified by gdb investigation)
Despite this step3 patch, two problems remain on the ppc64le architecture:
* TODO: segfaults still appear on the console during build/check.
This is not critical (without make check), and the RPMs are still generated on Fedora.
Not yet investigated because of the problem tracked by issue 950:
https://sourceforge.net/p/math-atlas/support-requests/950/
* TODO: make check fails because xsslvtst fails to execute.
This is related to vector assembly code that assumes a big-endian environment,
as written in ATL_cmm4x4x128_av.c and ATL_smm4x4x128_av.c.
Supporting little-endian would need significant work, as per the
endianness comments on all PowerPC vector instructions in:
https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/FBFA164F824370F987256D6A006F424D/$file/vector_simd_pem.ppc.2005AUG23.pdf
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
---
tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c | 7 +++++++
tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c | 7 +++++++
tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c | 9 ++++++++-
tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c | 20 ++++++++++++++++++--
tune/blas/gemm/CASES/ATL_smm4x4x128_av.c | 23 ++++++++++++++++++++++-
5 files changed, 62 insertions(+), 4 deletions(-)
Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
+++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
@@ -268,7 +268,7 @@ Mjoin(.,ATL_USERMM):
.globl Mjoin(_,ATL_USERMM)
Mjoin(_,ATL_USERMM):
#else
- #if defined(ATL_USE64BITS)
+ #if defined(ATL_USE64BITS) && _CALL_ELF != 2
/*
* Official Program Descripter section, seg fault w/o it on Linux/PPC64
*/
@@ -324,8 +324,15 @@ ATL_USERMM:
#endif
#ifdef ATL_USE64BITS
+#if _CALL_ELF == 2
+/* ABIv2 */
+ ld pC0, 104(r1)
+ ld incCn, 112(r1)
+#else
+/* ABIv1 */
ld pC0, 120(r1)
ld incCn, 128(r1)
+#endif
#elif defined(ATL_AS_OSX_PPC) || defined(ATL_AS_AIX_PPC)
lwz pC0, 68(r1)
lwz incCn, 72(r1)
Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
+++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
@@ -170,13 +170,21 @@ void ATL_USERMM(const int M, const int N
const TYPE beta, TYPE *C, const int ldc)
(r10) 8(r1)
*******************************************************************************
-64 bit ABIs:
+64 bit ABIv1s:
r3 r4 r5 r6/f1
void ATL_USERMM(const int M, const int N, const int K, const TYPE alpha,
r7 r8 r9 r10
const TYPE *A, const int lda, const TYPE *B, const int ldb,
f2 120(r1) 128(r1)
const TYPE beta, TYPE *C, const int ldc)
+
+64 bit ABIv2s:
+ r3 r4 r5 r6/f1
+void ATL_USERMM(const int M, const int N, const int K, const TYPE alpha,
+ r7 r8 r9 r10
+ const TYPE *A, const int lda, const TYPE *B, const int ldb,
+ f2 104(r1) 112(r1)
+ const TYPE beta, TYPE *C, const int ldc)
#endif
#ifdef ATL_AS_AIX_PPC
.csect .text[PR]
@@ -202,7 +210,7 @@ Mjoin(.,ATL_USERMM):
.globl Mjoin(_,ATL_USERMM)
Mjoin(_,ATL_USERMM):
#else
- #if defined(ATL_USE64BITS)
+ #if defined(ATL_USE64BITS) && _CALL_ELF != 2
/*
* Official Program Descripter section, seg fault w/o it on Linux/PPC64
*/
@@ -257,9 +265,17 @@ ATL_USERMM:
#endif
#endif
+
#if defined (ATL_USE64BITS)
+#if _CALL_ELF == 2
+/* ABIv2 */
+ ld pC0, 104(r1)
+ ld incCn, 112(r1)
+#else
+/* ABIv1 */
ld pC0, 120(r1)
ld incCn, 128(r1)
+#endif
#elif defined(ATL_AS_OSX_PPC) || defined(ATL_AS_AIX_PPC)
lwz pC0, 68(r1)
lwz incCn, 72(r1)
Index: ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
+++ ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
@@ -196,7 +196,7 @@ void ATL_USERMM(const int M, const int N
.globl Mjoin(_,ATL_USERMM)
Mjoin(_,ATL_USERMM):
#else
- #if defined(ATL_USE64BITS)
+ #if defined(ATL_USE64BITS) && _CALL_ELF != 2
/*
* Official Program Descripter section, seg fault w/o it on Linux/PPC64
*/
@@ -221,8 +221,15 @@ ATL_USERMM:
* kernel instead
*/
#if defined (ATL_USE64BITS)
+#if _CALL_ELF == 2
+/* ABIv2 */
+ ld r10, 104(r1)
+ ld r5, 112(r1)
+#else
+/* ABIv1 */
ld r10, 120(r1)
ld r5, 128(r1)
+#endif
#elif defined(ATL_AS_OSX_PPC)
lwz r10, 60(r1)
lwz r5, 64(r1)
@@ -285,8 +292,15 @@ ATL_USERMM:
eqv r0, r0, r0 /* all 1s */
ATL_WriteVRSAVE(r0) /* signal we use all vector regs */
#if defined (ATL_USE64BITS)
+#if _CALL_ELF == 2
+ /* ABIv2 */
+ ld pC0, FSIZE+104(r1)
+ ld ldc, FSIZE+112(r1)
+#else
+ /* ABIv1 */
ld pC0, FSIZE+120(r1)
ld ldc, FSIZE+128(r1)
+#endif
#elif defined(ATL_AS_OSX_PPC)
lwz pC0, FSIZE+60(r1)
lwz ldc, FSIZE+64(r1)
@@ -4258,8 +4272,15 @@ UNALIGNED_C:
eqv r0, r0, r0 /* all 1s */
ATL_WriteVRSAVE(r0) /* signal we use all vector regs */
#if defined (ATL_USE64BITS)
+#if _CALL_ELF == 2
+ /* ABIv2 */
+ ld pC0, FSIZE+104(r1)
+ ld ldc, FSIZE+112(r1)
+#else
+ /* ABIv1 */
ld pC0, FSIZE+120(r1)
ld ldc, FSIZE+128(r1)
+#endif
#elif defined(ATL_AS_OSX_PPC)
lwz pC0, FSIZE+60(r1)
lwz ldc, FSIZE+64(r1)
Index: ATLAS/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
+++ ATLAS/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
@@ -258,8 +258,15 @@ ATL_USERMM:
eqv r0, r0, r0 /* all 1s */
ATL_WriteVRSAVE(r0) /* signal we use all vector regs */
#if defined (ATL_USE64BITS)
+#if _CALL_ELF == 2
+/* ABIv2 */
+ ld pC0, FSIZE+104(r1)
+ ld ldc, FSIZE+112(r1)
+#else
+/* ABIv1 */
ld pC0, FSIZE+120(r1)
ld ldc, FSIZE+128(r1)
+#endif
#elif defined(ATL_AS_OSX_PPC)
lwz pC0, FSIZE+60(r1)
lwz ldc, FSIZE+64(r1)
Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
+++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
@@ -405,8 +405,15 @@ Mjoin(_,ATL_USERMM):
*/
#ifdef ATL_GAS_LINUX_PPC
#ifdef ATL_USE64BITS
+ #if _CALL_ELF == 2
+ /* ABIv2 */
+ ld pC0, 104(r1)
+ ld incCn, 112(r1)
+ #else
+ /* ABIv1 */
ld pC0, 120(r1)
ld incCn, 128(r1)
+ #endif
#else
lwz incCn, FSIZE+8(r1)
#endif
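
Note on the 120 => 104 and 128 => 112 offsets above: the patch description says they were found with gdb. They are also consistent with the stack-frame layout of the two 64-bit PowerPC ELF ABIs, where the fixed frame header is 48 bytes under ABIv1 and 32 bytes under ABIv2, and each argument of ATL_USERMM occupies one 8-byte slot in the parameter save area. The sketch below only reproduces that arithmetic; the function names are hypothetical and it is an interpretation, not something stated in the patch.

```c
#include <assert.h>

enum { PPC64_SLOT_BYTES = 8 };   /* one doubleword per argument slot */

/* Size of the fixed stack-frame header that precedes the parameter
 * save area: 48 bytes for ELF ABIv1, 32 bytes for ELF ABIv2. */
static int ppc64_header_bytes(int elf_abi)
{
    assert(elf_abi == 1 || elf_abi == 2);
    return (elf_abi == 2) ? 32 : 48;
}

/* Offset from r1 (at function entry) of the argument in the given
 * zero-based slot of the caller's parameter save area. */
static int ppc64_param_offset(int elf_abi, int slot)
{
    assert(slot >= 0);
    return ppc64_header_bytes(elf_abi) + slot * PPC64_SLOT_BYTES;
}
```

Counting the ATL_USERMM arguments from zero (M, N, K, alpha, A, lda, B, ldb, beta, C, ldc), C is slot 9 and ldc is slot 10, which gives 120 and 128 for ABIv1 and 104 and 112 for ABIv2, the same values the hunks use.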

@ -0,0 +1,151 @@
From: Michel Normand <normand@linux.vnet.ibm.com>
Subject: atlas.3.10.2 ppc64le do not use files with lvx
Date: Tue, 12 Aug 2014 16:07:06 +0200
ppc64le do not use files with lvx
This is a temporary patch as long as the related files
are not ported yet to ppc64 little-endian.
Warning: patch to be applied only for ppc64le architecture
and will also need atlas-new_archdef_for_ppc64le.patch
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
---
tune/blas/gemm/CASES/ccases.flg | 6 +-----
tune/blas/gemm/CASES/dcases.flg | 8 +-------
tune/blas/gemm/CASES/dcases.vnb | 4 ----
tune/blas/gemm/CASES/scases.flg | 9 +--------
tune/blas/gemm/CASES/scases.vnb | 3 ---
tune/blas/gemm/CASES/zcases.flg | 8 +-------
6 files changed, 4 insertions(+), 34 deletions(-)
Index: ATLAS/tune/blas/gemm/CASES/ccases.flg
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/ccases.flg
+++ ATLAS/tune/blas/gemm/CASES/ccases.flg
@@ -1,5 +1,5 @@
<ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
-24
+22
304 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
gcc
-mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O
@@ -48,13 +48,9 @@ gcc
328 480 8 8 2 1 1 8 8 2 ATL_mm8x8x2.c "R. Clint Whaley" \
gcc
-fomit-frame-pointer -O2 -fno-tree-loop-optimize
-329 192 4 4 4 1 16 4 4 4 ATL_cmm4x4x128_av.c "R. Clint Whaley" \
-gcc
--x assembler-with-cpp
331 192 4 4 1 1 1 4 4 1 ATL_smm4x4xURx_mips.c "R. Clint Whaley" \
gcc
-x assembler-with-cpp -mips4
-332 192 8 2 4 1 0 8 2 4 ATL_smm8x2x4_av.c "IBM"
333 448 4 4 2 1 1 4 4 2 ATL_smm4x4x2pf_arm.c "R. Clint Whaley" \
gcc
-x assembler-with-cpp -mfpu=vfpv3
Index: ATLAS/tune/blas/gemm/CASES/scases.flg
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/scases.flg
+++ ATLAS/tune/blas/gemm/CASES/scases.flg
@@ -1,5 +1,5 @@
<ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
-25
+22
304 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
gcc
-mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O
@@ -48,16 +48,9 @@ gcc
328 480 8 8 2 1 1 8 8 2 ATL_mm8x8x2.c "R. Clint Whaley" \
gcc
-fomit-frame-pointer -O2 -fno-tree-loop-optimize
-329 192 4 4 4 1 16 4 4 4 ATL_smm4x4x128_av.c "R. Clint Whaley" \
-gcc
--x assembler-with-cpp
-330 200 92 92 92 1 16 92 92 92 ATL_smm4x4x128_av.c "R. Clint Whaley" \
-gcc
--x assembler-with-cpp
331 192 4 4 1 1 1 4 4 1 ATL_smm4x4xURx_mips.c "R. Clint Whaley" \
gcc
-x assembler-with-cpp -mips4
-332 192 8 2 4 1 0 8 2 4 ATL_smm8x2x4_av.c "IBM"
333 448 4 4 2 1 1 4 4 2 ATL_smm4x4x2pf_arm.c "R. Clint Whaley" \
gcc
-x assembler-with-cpp -mfpu=vfpv3
Index: ATLAS/tune/blas/gemm/CASES/scases.vnb
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/scases.vnb
+++ ATLAS/tune/blas/gemm/CASES/scases.vnb
@@ -31,9 +31,6 @@
# Defaults: TA='t', TB='n', SSE=0, X87=0, LDBOT=1, RTKU=0, AOUTER=0,
# KBMAX=KU, KBMIN=KU, BETAN1=0, RTMN=1
#
-ID=1 ROUT='ATL_smm4x4x128_av.c' AUTH='R. Clint Whaley' MU=4 NU=4 KU=4 \
- LDKB=1 LDBOT=1 KBMIN=4 KBMAX=128 ASM=GAS_PPC \
- COMP='gcc' FLAGS='-x assembler-with-cpp'
ID=2 ROUT='ATL_smm4x4x16_av.c' AUTH='R. Clint Whaley' MU=4 NU=4 KU=16 \
LDKB=1 LDBOT=0 KBMIN=16 KBMAX=2048 ASM=GAS_SPARC \
COMP='gcc' FLAGS='-x assembler-with-cpp'
Index: ATLAS/tune/blas/gemm/CASES/dcases.flg
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/dcases.flg
+++ ATLAS/tune/blas/gemm/CASES/dcases.flg
@@ -1,5 +1,5 @@
<ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
-32
+30
306 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
gcc
-mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O -fno-schedule-insns -fno-schedule-insns2
@@ -79,12 +79,6 @@ gcc
336 192 4 4 1 1 1 4 4 1 ATL_dmm4x4xURx_mips.c "R. Clint Whaley" \
gcc
-x assembler-with-cpp -mips4
-337 192 4 4 1 1 16 4 4 1 ATL_dmm4x4x80_ppc.c "Whaley & Castaldo" \
-gcc
--x assembler-with-cpp
-338 192 8 4 2 1 0 8 4 2 ATL_dmm8x4x2_vsx.c "IBM" \
-gcc
--O3 -mvsx
339 448 4 4 2 1 1 4 4 2 ATL_dmm4x4x2pf_arm.c "R. Clint Whaley" \
gcc
-x assembler-with-cpp -mfpu=vfpv3
Index: ATLAS/tune/blas/gemm/CASES/dcases.vnb
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/dcases.vnb
+++ ATLAS/tune/blas/gemm/CASES/dcases.vnb
@@ -53,10 +53,6 @@ ID=6 ROUT='ATL_dmm4x1x90_x87.c' AUTH='R
ID=7 ROUT='ATL_dmm8x1x120_sse2.c' AUTH='R. Clint Whaley' \
MU=8 NU=1 KU=1 KBMAX=512 ASM=GAS_x8664 BETAN1=1 \
COMP='gcc' FLAGS='-m64 -x assembler-with-cpp'
-ID=70 ROUT='ATL_dmm4x4x80_ppc.c' AUTH='R. Clint Whaley' TA='T', TB='N' \
- MU=4 NU=4 KU=1 KBMIN=1 KBMAX=80 ASM=GAS_PPC BETAN1=0 LDBOT=0 \
- LDAB=0 LDISKB=1 RTN=1 RTM=1 RTK=0 \
- COMP='gcc' FLAGS='-x assembler-with-cpp'
ID=80 ROUT='ATL_dmm4x4x16r8_US.c' AUTH='R. Clint Whaley' TA='T', TB='N' \
MU=4 NU=4 KU=24 KBMIN=24 KBMAX=512 ASM=GAS_SPARC BETAN1=0 \
LDAB=0 RTK=1 RTN=1 RTM=1 LDBOT=0 LDISKB=1 LDAB=1 \
Index: ATLAS/tune/blas/gemm/CASES/zcases.flg
===================================================================
--- ATLAS.orig/tune/blas/gemm/CASES/zcases.flg
+++ ATLAS/tune/blas/gemm/CASES/zcases.flg
@@ -1,5 +1,5 @@
<ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
-31
+29
306 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
gcc
-mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O -fno-schedule-insns -fno-schedule-insns2
@@ -76,12 +76,6 @@ gcc
336 192 4 4 1 1 1 4 4 1 ATL_dmm4x4xURx_mips.c "R. Clint Whaley" \
gcc
-x assembler-with-cpp -mips4
-337 192 4 4 1 1 16 4 4 1 ATL_dmm4x4x80_ppc.c "Whaley & Castaldo" \
-gcc
--x assembler-with-cpp
-338 192 8 4 2 1 0 8 4 2 ATL_dmm8x4x2_vsx.c "IBM" \
-gcc
--O3 -mvsx
339 448 4 4 2 1 1 4 4 2 ATL_dmm4x4x2pf_arm.c "R. Clint Whaley" \
gcc
-x assembler-with-cpp -mfpu=vfpv3

atlas.spec (1116 changed lines)

File diff suppressed because it is too large.

@ -0,0 +1,38 @@
diff -up ATLAS/include/atlas_genparse.h.than ATLAS/include/atlas_genparse.h
--- ATLAS/include/atlas_genparse.h.than 2015-11-26 10:53:55.056586198 -0500
+++ ATLAS/include/atlas_genparse.h 2015-11-26 10:56:00.168537914 -0500
@@ -149,13 +149,24 @@ static int asmNames2bitfield(char *str)
}
/* procedure 7 */
-static int GetDoubleArr(char *str, int N, double *d)
+static int GetDoubleArr(char *callerstr, int N, double *d)
/*
* Reads in a list with form "%le,%le...,%le"; N-length d recieves doubles.
* RETURNS: the number of doubles found, or N, whichever is less
*/
{
- int i=1;
+ int i;
+ char *dupstr = DupString(callerstr);
+ char *str = dupstr;
+ /* strip the string to end on first white space */
+ for (i=0; dupstr[i]; i++)
+ {
+ if (isspace(dupstr[i])) {
+ dupstr[i] = '\0';
+ break;
+ }
+ }
+ i = 1;
assert(sscanf(str, "%le", d) == 1);
while (i < N)
{
@@ -167,6 +178,7 @@ static int GetDoubleArr(char *str, int N
break;
i++;
}
+ free(dupstr);
return(i);
}
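
Note on the GetDoubleArr() change above: the patch copies the caller's string, cuts the copy at the first whitespace character, parses the comma-separated doubles from the copy, and frees it. A standalone sketch of the same idea follows. The name parse_double_list is hypothetical, malloc/memcpy stand in for ATLAS's DupString(), and, unlike the original, the sketch returns 0 instead of asserting when no number can be read.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse up to n comma-separated doubles from the first
 * whitespace-delimited token of src into out[].
 * Returns the number of values stored (0 if none could be read).
 */
static int parse_double_list(const char *src, int n, double *out)
{
    size_t len = 0;
    char *copy, *p;
    int count = 0;

    assert(src != NULL && out != NULL);

    /* Length of the leading token: stop at the first whitespace. */
    while (src[len] && !isspace((unsigned char)src[len]))
        len++;

    /* Work on a private copy so the caller's string is never modified. */
    copy = malloc(len + 1);
    if (!copy)
        return 0;
    memcpy(copy, src, len);
    copy[len] = '\0';

    p = copy;
    while (count < n && sscanf(p, "%lf", &out[count]) == 1)
    {
        count++;
        p = strchr(p, ',');   /* advance to the next list element */
        if (!p)
            break;
        p++;
    }
    free(copy);
    return count;
}
```

Casting to unsigned char before isspace() avoids undefined behaviour for bytes with the high bit set, which the patched code does not guard against.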

@ -0,0 +1,26 @@
From: Michel Normand <normand@linux.vnet.ibm.com>
Subject: initialize malloc memory.invtrsm.wms.oct23
Date: Mon, 14 Apr 2014 17:18:53 +0200
References: http://sourceforge.net/p/math-atlas/mailman/message/32471499/
initialize malloc memory invtrsm.c
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
---
ATLAS/tune/blas/level3/invtrsm.c | 1 +
1 file changed, 1 insertion(+)
Index: ATLAS/tune/blas/level3/invtrsm.c
===================================================================
--- ATLAS.orig/tune/blas/level3/invtrsm.c
+++ ATLAS/tune/blas/level3/invtrsm.c
@@ -525,6 +525,7 @@ static double RunTiming
a = A = malloc(i * ATL_MulBySize(incA));
if (A)
{
+ memset(A,0,i*ATL_MulBySize(incA)); /* wms (!!) malloc call above returns non-initialized memory. */
if (Uplo == TestGE)
for (i=0; i < k; i++)
Mjoin(PATL,gegen)(N, N, A+i*incA, lda, N+lda);
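
Note on the memset() added above: an alternative that reaches the same state in one call is calloc(), which returns zero-filled memory. The sketch below is only an illustration with a hypothetical helper name; it assumes the buffer holds doubles and that all-bits-zero represents 0.0, which holds for IEEE 754 but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Allocate count doubles with every byte set to zero.
 * calloc() also checks the count * size multiplication for overflow,
 * which a separate malloc() + memset() pair does not.
 * Returns NULL on failure; the caller frees the result with free().
 */
static double *alloc_zeroed_doubles(size_t count)
{
    return calloc(count, sizeof(double));
}
```

The patch keeps malloc() and adds memset(), which leaves the existing allocation line untouched; the calloc() form is shown only for comparison.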

sources (26 changed lines)

@ -1,9 +1,17 @@
-1bb3abde499b492b4be1f1a0759fbfa2 atlas3.8.4.tar.bz2
-9ddf8c76e5e9781c542b712f704460e1 IBMz1032.tgz
-ee4cbc1f15cb4cd5f5266969a4bc62a7 IBMz1064.tgz
-edd3cb5602c6282e4a30691e728bd064 IBMz19632.tgz
-21f630520058859ad0b8b798bd17dc5a IBMz19664.tgz
-3f174cdcb4c964843f27dbfc4ad4b1c8 K7323DNow.tgz
-676548252837b1e458181111443f340f PPRO32.tgz
-ebb4732aff468bbc223e7f734252173b USII32.tgz
-31f8ae7583d290e5414a1a61ff6e7e39 USII64.tgz
+SHA512 (atlas3.10.3.tar.bz2) = bf17306f09f2aa973cb776e2c9eacfb2409ad4d95d19802e1c4e0597d0a099fccdb5eaafe273c2682a41e41a3c6fabc8bbba4ce03180cffea40ede5df1d1f56e
+SHA512 (IBMz1032.tgz) = f745187d75073de461d6948489dad3abea9a67ad10ec63e021160d3f61ae5be36e94768faa0e7e6e3158b1401bf954eae1e7e6416857b652415030836c6aba3d
+SHA512 (IBMz1064.tgz) = 14fbc584a8711a0292c8be0dce962bd7ac12347b2d203c2a7b0cc66ea68ac57d5b88afc6778df39efea43077fcc70c6c63db365b5b4badb879ab6900b5296094
+SHA512 (IBMz1264.tar.bz2) = 54bab951a818feb08fe5c671213db80d17bfefe75a3993d80655161219f018e87125c4ccc09c701cde45fd672a9856f4fff557ffc378c5b2fe7e9c6ebc3bd1de
+SHA512 (IBMz19632.tgz) = aa10213265866b3176efe1d9d204da170844573f7ab26a36551a174eab3951ccd5f54a5149f1351affc38c510162cce9e10eb2a830af32992cb3febe9e1ecafe
+SHA512 (IBMz19664.tgz) = 5837d5dfd04c31c304e1f454d0148bd412ec8853c50a7c3dcee9a61529bd04c30d68a0c7aae2bfa2c393fde3582fe36f98e6f5891b271b19562491298ba600b2
+SHA512 (IBMz932.tar.bz2) = 8f71140d1b30d00ed44faea71e42ab3ff24917a62670f7becdb0d861bf4e7c3c972f9601d161439a518dcc87405c74af31cdd4e2996999a5da8452cc8d2a52df
+SHA512 (IBMz964.tar.bz2) = b7356e5b47615c64c9b2dd6a497f071e39d4d90f6dd42478fea1d7597cf21ea08123c480fde002aba181a2ad0eeb21acb61469c7e4b2e8961e4d72e5f86e14cc
+SHA512 (PPRO32.tgz) = a30069e79f95a36b2c7125e7861218e9612bd92913db929ea98800201e7ca7d55c9a1480063c7d5a4c50fcb2b271907ce43cd9b229c694a5ee3b56561a7820e6
+SHA512 (USII32.tgz) = e9d3b1f5ccd38fc364666205e33f7a927e96c3cebc35d9692cafe3b536697224f20702641f875421b200ff78774831fd5790174ef55c899e0cdb905e3ac2371f
+SHA512 (USII64.tgz) = 5bd654f8b45306a18f3ad2b593ba23012909ba5ad91614de5024b80998bde832d0ddc84d2c0c0e75dd28915f3c07ec40ac9351213b24e54028fbad4d385ebcc6
+SHA512 (POWER332.tar.bz2) = 95a7281dbb7a2d0897a58599577afdedba66e6e5edb73223efdeecd93b6706031139b9b58b14345449dccbf1abfa8275bc261f826c692282d14dc30320728c75
+SHA512 (ARMv732NEON.tar.bz2) = 92acbdd8f7aebd841a10a13df85baa00c518dae388e1ee8dd4bc35fc461d732d2df2cfeae0a3614cea251b80a9ee6a5b49ad71ab8b36b98b70bda6d1c215c78d
+SHA512 (ARMa732.tar) = 47d6564b5a439bc3778ccc79242220b236c7dc8d36e12ce6850c7e9a02e2379618322c003ba4490573c40b78227c2c3925222da4f4e5f87aab48eae192b45bb9
+SHA512 (ARMa732.tar.bz2) = 8b83b59a32f18d2cd432c205efd4358b0000ce1685799f2f38a60532bc925e9cd871371d2dfd226ab8e30e830bf608f022d63bcd26f26f9fe74acab067bd4d4f
+SHA512 (POWER864LEVSXp4.tar.bz2) = e2fa637061a4a4806bc091009c37ccd719c4c4051baf36ed451917e255375881fa168caa5ca296ae9c89bb28523d9015fda42a5dbc51aef4c66efbf6efd966d2
+SHA512 (K7323DNow.tgz) = e1d5e4208ce454b5f5daa68663d2dd28a2bd3cc97496e4e1515df880b9ccd00bcc75bd820402c3b2bf8409f98500e43f2481fbf5dd480f7d0ba60fe2f82a1ac1