Compare commits
62 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 709845b89e | |
| | 7268b7aa10 | |
| | 63d75114da | |
| | 9561fa26c1 | |
| | 1fb89d180f | |
| | f128eda788 | |
| | 17da750ef6 | |
| | 4d0e90e529 | |
| | 202c1937a9 | |
| | 68ac25ef42 | |
| | 298208165a | |
| | 1a2f01f0f0 | |
| | 35e303b80a | |
| | 6af83c0231 | |
| | 8888d79619 | |
| | d03b304c5a | |
| | 76d4dba820 | |
| | 13262e006a | |
| | f086598d07 | |
| | 5b8ceef1ab | |
| | c7f12adc64 | |
| | 91230ab335 | |
| | cddcd13655 | |
| | 6a17475b1c | |
| | 049545a9bc | |
| | 9a6d77d550 | |
| | da91cc64e6 | |
| | 2940efa59b | |
| | c5bccf55f6 | |
| | bf44b5d377 | |
| | b21a4fe1c8 | |
| | 3f7c964eea | |
| | 01531f84c8 | |
| | f7d47dc9a3 | |
| | 9371c38444 | |
| | 8a3c0ce3cf | |
| | 5a588758d4 | |
| | 46bc45dbe7 | |
| | 932ed464ac | |
| | 6d51f79182 | |
| | 93e9bf9d18 | |
| | ecd0c2adff | |
| | 100afcd2b2 | |
| | 2c71620acc | |
| | e381725c32 | |
| | 6c8df60df6 | |
| | ce70abdb41 | |
| | ff52573f7d | |
| | 885b47ab37 | |
| | 6dbfdbf430 | |
| | 64a1bfa23d | |
| | 7fe7cb921f | |
| | 48e39087ac | |
| | f8e736600e | |
| | 381f2c21f2 | |
| | 225ecb7519 | |
| | 84e3e0e998 | |
| | 3ba016b393 | |
| | a0b20b5850 | |
| | caba23768a | |
| | 62670b4e6a | |
| | f974d7b834 | |
9
.gitignore
vendored
@@ -1,3 +1,12 @@
atlas3.8.3.tar.bz2
PPRO32.tgz
K7323DNow.tgz
/atlas3.10.0.tar.bz2
/atlas3.10.1.tar.bz2
/IBMz932.tar.bz2
/IBMz964.tar.bz2
/POWER332.tar.bz2
/ARMv732NEON.tar.bz2
/lapack-3.5.0.tgz
/atlas3.10.2.tar.bz2
/POWER864LEVSXp4.tar.bz2
@@ -1,68 +0,0 @@
Notes on the Fedora version of ATLAS

by Quentin Spencer
updated: October 4, 2005

updated by Deji Akingunola
October 15, 2008

updated by Deji Akingunola
June 15, 2011

Because ATLAS relies on compile-time optimizations to obtain improved
performance over BLAS and LAPACK, the resulting binaries are closely
tied to the hardware on which they are compiled, and can likely result
in very poor performance on other hardware. For this reason,
including a package like ATLAS in Fedora requires some compromises.
Firstly, a binary ATLAS package must perform reasonably well on the
entire range of hardware on which it could potentially be installed.
Optimizing ATLAS for the most modern hardware can result in
significant performance penalties for users using the same package on
older hardware. Second, building from the same source package must
result in identical binaries for any computer of a particular
architecture. This is because the binaries installed on a user's
computer are built on a computer in the Fedora Extras build system,
which will have hardware different from the end user's hardware, and
quite possibly different from other available hardware in the build
system.

As of version 3.8.2 (in Fedora), ATLAS builds uses achitectural defaults,
are partial results of past searches when the compiler and architecture
are known, to discover the appropriate kernels used to build all the required
libraries. They make install time quicker and also ensure that good results are
obtained, since they typically represent several searches and/or user
intervention into the usual search so that maximum performance is found.
The result is a set of libraries that will not
necessarily achieve optimal performance on any given hardware but
should still offer significant performance gains over the reference
BLAS and LAPACK libraries on most hardware. The binary package
includes the atlas libraries as well as binary-compatible blas and
lapack libraries that should work as a drop-in replacement for the
standard ones (they are installed in /usr/lib{64}/atlas* rather than
/usr/lib{64}).

For 32bit x86 systems, the default atlas package on was built using Pentium Pro
architectural defaults using just x87 math optimization. In addition to the
base 32bit build, 4 ATLAS subpackages are built for 3Dnow, SSE, SSE2, and SSE3
ix86 extensions, using architectural defaults obtained from Athlon K7, PIII,
Pentium 4 with SSE2 extension and PENTIUM 4 with SSE3 extensions respectively.

On 64bit x86 systems the default atlas package on was built with SSE2 optimization using architetural default made for AMD's HAMMER processor, and an additional
SSE3-enabled subpackage was built also using architetural default made for AMD's HAMMER processor.

This packaging allows multiple installation of different atlas sub-packages
at the same time. The alternatives system (read 'man alternatives' for usage)
is used in the -devel subpackages to select the appropriate location for the
architectural dependent header files.

This package is designed to build RPMs that are identical regardless
of where they are compiled and that provide reasonable performance on
a wide range of hardware. For users who want optimal performance on
particular hardware, custom RPMs can be built from the source package
by setting the RPM macro "enable_native_atlas" to a value of 1. This
can be done from the command line as in the following example:

rpmbuild -D "enable_native_atlas 1" --rebuild atlas-3.8.3-1.src.rpm

This will cause the ATLAS build to use the achitectural default most
appropriate for the system on which the package is to be built.
47
README.dist
Normal file
@@ -0,0 +1,47 @@
Notes on the packaged version of ATLAS

by Quentin Spencer
updated: October 4, 2005

updated by Deji Akingunola
October 15, 2008

updated by Deji Akingunola
June 15, 2011

updated by Frantisek Kluknavsky
Nov 20, 2012

Because ATLAS relies on compile-time optimizations to obtain improved
performance over BLAS and LAPACK, the resulting binaries are closely
tied to the hardware on which they are compiled, and can likely result
in very poor performance on other hardware. For this reason,
including a package like ATLAS in Fedora requires some compromises.
Optimizing ATLAS for the most modern hardware can result in
significant performance penalties for users using the same package on
older hardware. A binary ATLAS package must perform reasonably well on the
entire range of hardware on which it could potentially be installed.

The result is a set of libraries that will not
necessarily achieve optimal performance on any given hardware but
should still offer significant performance gains over the reference
BLAS and LAPACK libraries on most hardware.

In addition to the base 32bit build, subpackages are built for SSE, SSE2,
and SSE3 ix86 extensions.

On 64bit x86 systems the default atlas package was built with SSE3
optimization.

This packaging allows multiple installation of different atlas sub-packages
at the same time. The alternatives system (read 'man alternatives' for usage)
is used in the -devel subpackages to select the appropriate location for the
architectural dependent header files.

For users who want optimal performance on
particular hardware, custom RPMs can be built from the source package
by setting the RPM macro "enable_native_atlas" to a value of 1. This
can be done from the command line as in the following example:

rpmbuild -D "enable_native_atlas 1" --rebuild atlas-3.8.3-1.src.rpm
219
atlas-aarch64port.patch
Normal file
@@ -0,0 +1,219 @@
|
||||
Author: Mark Salter <msalter@redhat.com>
|
||||
|
||||
Index: ATLAS/CONFIG/include/atlconf.h
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/include/atlconf.h
|
||||
+++ ATLAS/CONFIG/include/atlconf.h
|
||||
@@ -16,9 +16,9 @@ enum OSTYPE {OSOther=0, OSLinux, OSSunOS
|
||||
((OS_) == OSWin64) )
|
||||
|
||||
enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS,
|
||||
- AFARM, AFS390};
|
||||
+ AFARM, AFS390, AFAARCH64};
|
||||
|
||||
-#define NMACH 52
|
||||
+#define NMACH 53
|
||||
static char *machnam[NMACH] =
|
||||
{"UNKNOWN", "POWER3", "POWER4", "POWER5", "PPCG4", "PPCG5",
|
||||
"POWER6", "POWER7", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
|
||||
@@ -29,7 +29,7 @@ static char *machnam[NMACH] =
|
||||
"Efficeon", "K7", "HAMMER", "AMD64K10h", "AMDLLANO", "AMDDOZER","AMDDRIVER",
|
||||
"UNKNOWNx86", "IA64Itan", "IA64Itan2",
|
||||
"USI", "USII", "USIII", "USIV", "UST1", "UST2", "UnknownUS",
|
||||
- "MIPSR1xK", "MIPSICE9", "ARMv7"};
|
||||
+ "MIPSR1xK", "MIPSICE9", "ARMv7", "AARCH64"};
|
||||
enum MACHTYPE {MACHOther, IbmPwr3, IbmPwr4, IbmPwr5, PPCG4, PPCG5,
|
||||
IbmPwr6, IbmPwr7, Pwre6500,
|
||||
IbmZ9, IbmZ10, IbmZ196, /* s390(x) in Linux */
|
||||
@@ -42,7 +42,8 @@ enum MACHTYPE {MACHOther, IbmPwr3, IbmPw
|
||||
SunUSI, SunUSII, SunUSIII, SunUSIV, SunUST1, SunUST2, SunUSX,
|
||||
MIPSR1xK, /* includes R10K, R12K, R14K, R16K */
|
||||
MIPSICE9, /* SiCortex ICE9 -- like MIPS5K */
|
||||
- ARMv7 /* includes Cortex A8, A9 */
|
||||
+ ARMv7, /* includes Cortex A8, A9 */
|
||||
+ AARCH64
|
||||
};
|
||||
#define MachIsX86(mach_) \
|
||||
( (mach_) >= x86x87 && (mach_) <= x86X )
|
||||
@@ -63,6 +64,8 @@ enum MACHTYPE {MACHOther, IbmPwr3, IbmPw
|
||||
( (mach_) == ARMv7 )
|
||||
#define MachIsS390(mach_) \
|
||||
( (mach_) >= IbmZ9 && (mach_) <= IbmZ196 )
|
||||
+#define MachIsAARCH64(mach_) \
|
||||
+ ( (mach_) == AARCH64 )
|
||||
|
||||
|
||||
static char *f2c_namestr[5] = {"UNKNOWN","Add_", "Add__", "NoChange", "UpCase"};
|
||||
@@ -84,13 +87,13 @@ enum ISAEXT
|
||||
{ISA_None=0, ISA_VSX, ISA_AV, ISA_AVXMAC, ISA_AVXFMA4, ISA_AVX,
|
||||
ISA_SSE3, ISA_SSE2, ISA_SSE1, ISA_3DNow, ISA_NEON};
|
||||
|
||||
-#define NASMD 9
|
||||
+#define NASMD 10
|
||||
enum ASMDIA
|
||||
{ASM_None=0, gas_x86_32, gas_x86_64, gas_sparc, gas_ppc, gas_parisc,
|
||||
- gas_mips, gas_arm, gas_s390};
|
||||
+ gas_mips, gas_arm, gas_s390, gas_aarch64};
|
||||
static char *ASMNAM[NASMD] =
|
||||
{"", "GAS_x8632", "GAS_x8664", "GAS_SPARC", "GAS_PPC", "GAS_PARISC",
|
||||
- "GAS_MIPS", "GAS_ARM", "GAS_S390"};
|
||||
+ "GAS_MIPS", "GAS_ARM", "GAS_S390", "GAS_AARCH64"};
|
||||
|
||||
/*
|
||||
* Used for archinfo probes (can pack in bitfield)
|
||||
Index: ATLAS/CONFIG/src/Makefile
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/Makefile
|
||||
+++ ATLAS/CONFIG/src/Makefile
|
||||
@@ -260,6 +260,11 @@ IRun_BINDP :
|
||||
redir=config0.out
|
||||
- cat config0.out
|
||||
|
||||
+IRun_GAS_AARCH64 :
|
||||
+ $(CC) $(CCFLAGS) -o xprobe_gas_aarch64 $(SRCdir)/backend/probe_this_asm.c $(SRCdir)/backend/probe_gas_aarch64.S
|
||||
+ $(MAKE) $(atlrun) atldir=$(mydir) exe=xprobe_gas_aarch64 args="$(args)" \
|
||||
+ redir=config0.out
|
||||
+ - cat config0.out
|
||||
IRun_GAS_S390 :
|
||||
$(CC) $(CCFLAGS) -o xprobe_gas_s390 $(SRCdir)/backend/probe_this_asm.c $(SRCdir)/backend/probe_gas_s390.S
|
||||
$(MAKE) $(atlrun) atldir=$(mydir) exe=xprobe_gas_s390 args="$(args)" \
|
||||
Index: ATLAS/CONFIG/src/SpewMakeInc.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/SpewMakeInc.c
|
||||
+++ ATLAS/CONFIG/src/SpewMakeInc.c
|
||||
@@ -391,6 +391,8 @@ char *GetPtrbitsFlag(enum OSTYPE OS, enu
|
||||
|
||||
if (MachIsIA64(arch))
|
||||
return(sp);
|
||||
+ if (MachIsAARCH64(arch))
|
||||
+ return(sp);
|
||||
if (MachIsMIPS(arch))
|
||||
return((ptrbits == 64) ? "-mabi=64" : "-mabi=n32");
|
||||
if (MachIsS390(arch))
|
||||
Index: ATLAS/CONFIG/src/atlcomp.txt
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/atlcomp.txt
|
||||
+++ ATLAS/CONFIG/src/atlcomp.txt
|
||||
@@ -267,6 +267,17 @@ MACH=ARMv7 OS=ALL LVL=1000 COMPS=dmc,dkc
|
||||
MACH=ARMv7 OS=ALL LVL=1000 COMPS=f77
|
||||
'gfortran' '-mcpu=cortex-a8 -mfpu=vfpv3 -mfloat-abi=softfp -O'
|
||||
#
|
||||
+# AArch64 defaults
|
||||
+#
|
||||
+MACH=AARCH64 OS=ALL LVL=1000 COMPS=xcc
|
||||
+ 'gcc' '-O2'
|
||||
+MACH=AARCH64 OS=ALL LVL=1000 COMPS=smc,skc,gcc,icc
|
||||
+ 'gcc' '-O2'
|
||||
+MACH=AARCH64 OS=ALL LVL=1000 COMPS=dmc,dkc
|
||||
+ 'gcc' '-O2'
|
||||
+MACH=AARCH64 OS=ALL LVL=1000 COMPS=f77
|
||||
+ 'gfortran' '-O'
|
||||
+#
|
||||
# Generic defaults
|
||||
#
|
||||
MACH=ALL OS=ALL LVL=5 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
|
||||
Index: ATLAS/CONFIG/src/atlconf_misc.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/atlconf_misc.c
|
||||
+++ ATLAS/CONFIG/src/atlconf_misc.c
|
||||
@@ -563,6 +563,7 @@ enum ARCHFAM ProbeArchFam(char *targ)
|
||||
else if (strstr(res, "ia64")) fam = AFIA64;
|
||||
else if (strstr(res, "mips")) fam = AFMIPS;
|
||||
else if (strstr(res, "arm")) fam = AFARM;
|
||||
+ else if (strstr(res, "aarch64")) fam = AFAARCH64;
|
||||
else if (strstr(res, "s390")) fam = AFS390;
|
||||
else if ( strstr(res, "i686") || strstr(res, "i586") ||
|
||||
strstr(res, "i486") || strstr(res, "i386") ||
|
||||
@@ -588,6 +589,7 @@ enum ARCHFAM ProbeArchFam(char *targ)
|
||||
strstr(res, "x86_64") ) fam = AFX86;
|
||||
else if (strstr(res, "mips")) fam = AFMIPS;
|
||||
else if (strstr(res, "arm")) fam = AFARM;
|
||||
+ else if (strstr(res, "aarch64")) fam = AFAARCH64;
|
||||
else if (strstr(res, "s390")) fam = AFS390;
|
||||
free(res);
|
||||
}
|
||||
Index: ATLAS/CONFIG/src/backend/Make.ext
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/backend/Make.ext
|
||||
+++ ATLAS/CONFIG/src/backend/Make.ext
|
||||
@@ -57,6 +57,8 @@ probe_gas_arm.S : $(basf)
|
||||
$(extC) -b $(basf) -o probe_gas_arm.S rout=probe_gas_arm.S
|
||||
probe_gas_s390.S : $(basf)
|
||||
$(extC) -b $(basf) -o probe_gas_s390.S rout=probe_gas_s390.S
|
||||
+probe_gas_aarch64.S : $(basf)
|
||||
+ $(extC) -b $(basf) -o probe_gas_aarch64.S rout=probe_gas_aarch64.S
|
||||
probe_AVXMAC.S : $(basf)
|
||||
$(extC) -b $(basf) -o probe_AVXMAC.S rout=probe_AVXMAC.S
|
||||
probe_AVXFMA4.S : $(basf)
|
||||
Index: ATLAS/CONFIG/src/backend/archinfo_linux.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/backend/archinfo_linux.c
|
||||
+++ ATLAS/CONFIG/src/backend/archinfo_linux.c
|
||||
@@ -267,6 +267,14 @@ enum MACHTYPE ProbeArch()
|
||||
free(res);
|
||||
}
|
||||
break;
|
||||
+ case AFAARCH64:
|
||||
+ res = atlsys_1L(NULL, "fgrep 'Processor' /proc/cpuinfo", 0, 0);
|
||||
+ if (res)
|
||||
+ {
|
||||
+ if (strstr(res, "AArch64")) mach = AARCH64;
|
||||
+ free(res);
|
||||
+ }
|
||||
+ break;
|
||||
default:
|
||||
#if 0
|
||||
if (!CmndOneLine(NULL, "fgrep 'cpu family' /proc/cpuinfo", res))
|
||||
Index: ATLAS/CONFIG/src/backend/probe_gas_aarch64.S
|
||||
===================================================================
|
||||
--- /dev/null
|
||||
+++ ATLAS/CONFIG/src/backend/probe_gas_aarch64.S
|
||||
@@ -0,0 +1,14 @@
|
||||
+#define ATL_GAS_AARCH64
|
||||
+#include "atlas_asm.h"
|
||||
+#
|
||||
+# Linux AArch64 assembler for:
|
||||
+# int asm_probe(int i)
|
||||
+# RETURNS: i*3
|
||||
+#
|
||||
+.text
|
||||
+.globl ATL_asmdecor(asm_probe)
|
||||
+.type ATL_asmdecor(asm_probe), %function
|
||||
+ATL_asmdecor(asm_probe):
|
||||
+ add w0, w0, w0, LSL #1
|
||||
+ ret
|
||||
+.size ATL_asmdecor(asm_probe),.-ATL_asmdecor(asm_probe)
|
||||
Index: ATLAS/CONFIG/src/probe_comp.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/probe_comp.c
|
||||
+++ ATLAS/CONFIG/src/probe_comp.c
|
||||
@@ -582,7 +582,7 @@ char *GetPtrbitsFlag(enum OSTYPE OS, enu
|
||||
char *sp = "";
|
||||
int i, j, k;
|
||||
|
||||
- if (MachIsIA64(arch))
|
||||
+ if (MachIsIA64(arch) || MachIsAARCH64(arch))
|
||||
return(sp);
|
||||
if (MachIsMIPS(arch))
|
||||
return((ptrbits == 64) ? "-mabi=64" : "-mabi=n32");
|
||||
Index: ATLAS/include/atlas_genparse.h
|
||||
===================================================================
|
||||
--- ATLAS.orig/include/atlas_genparse.h
|
||||
+++ ATLAS/include/atlas_genparse.h
|
||||
@@ -6,13 +6,13 @@
|
||||
#include <assert.h>
|
||||
#include <string.h>
|
||||
#include <ctype.h>
|
||||
-#define NASMD 9
|
||||
+#define NASMD 10
|
||||
enum ASMDIA
|
||||
{ASM_None=0, gas_x86_32, gas_x86_64, gas_sparc, gas_ppc, gas_parisc,
|
||||
- gas_mips, gas_arm, gas_s390};
|
||||
+ gas_mips, gas_arm, gas_s390, gas_aarch64};
|
||||
static char *ASMNAM[NASMD] =
|
||||
{"", "GAS_x8632", "GAS_x8664", "GAS_SPARC", "GAS_PPC", "GAS_PARISC",
|
||||
- "GAS_MIPS", "GAS_ARM", "GAS_S390"};
|
||||
+ "GAS_MIPS", "GAS_ARM", "GAS_S390", "GAS_AARCH64"};
|
||||
/*
|
||||
* Basic data structure for forming queues with some minimal info
|
||||
*/
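The probe added in probe_gas_aarch64.S in the patch above is documented as returning i*3, and its single `add w0, w0, w0, LSL #1` instruction computes that as i + (i << 1). The following is a minimal C sketch of the same arithmetic, not ATLAS code: `asm_probe_model` and `probe_returns_triple` are hypothetical names used only for illustration, and how ATLAS's own probe_this_asm.c validates the result is not shown in this diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of the AArch64 probe body:
 *   add w0, w0, w0, LSL #1   =>   w0 = w0 + (w0 << 1) = 3 * w0
 * Unsigned arithmetic mirrors 32-bit register wraparound and avoids
 * signed-overflow undefined behaviour in C. */
static int asm_probe_model(int i)
{
    uint32_t u = (uint32_t)i;
    return (int)(u + (u << 1));
}

/* Hypothetical check: returns 1 if probe(i) == 3*i for a few small inputs,
 * 0 otherwise. */
static int probe_returns_triple(int (*probe)(int))
{
    static const int samples[] = {0, 1, 7, -5, 1000};
    size_t k;

    for (k = 0; k < sizeof samples / sizeof samples[0]; k++)
        if (probe(samples[k]) != 3 * samples[k])
            return 0;
    return 1;
}
```

On a real AArch64 build the function pointer passed to `probe_returns_triple` would be the assembled `asm_probe` symbol instead of the C model.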
|
17
atlas-affinity.patch
Normal file
@@ -0,0 +1,17 @@
diff -up wrk/src/threads/ATL_thread_start.c.wrk wrk/src/threads/ATL_thread_start.c
--- wrk/src/threads/ATL_thread_start.c.wrk 2013-09-23 13:46:51.881085276 +0200
+++ wrk/src/threads/ATL_thread_start.c 2013-09-24 16:13:59.021065418 +0200
@@ -101,9 +101,10 @@ int ATL_thread_start(ATL_thread_t *thr,
ATL_assert(!pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED));
pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM); /* no chk, OK to fail */
#ifdef ATL_PAFF_SETAFFNP
- CPU_ZERO(&cpuset);
- CPU_SET(affID, &cpuset);
- ATL_assert(!pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset));
+ //affinity crashes a machine with fewer processors than the builder
+ //CPU_ZERO(&cpuset);
+ //CPU_SET(affID, &cpuset);
+ //ATL_assert(!pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset));
#elif defined(ATL_PAFF_SETPROCNP)
ATL_assert(!pthread_attr_setprocessor_np(&attr, (pthread_spu_t)affID,
PTHREAD_BIND_FORCED_NP));
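The patch above disables pinning outright because, per its own comment, affinity crashes a machine with fewer processors than the builder. One possible alternative, shown here only as a sketch and not as what the package does, is to compare the requested CPU index against the number of CPUs online at run time and call `pthread_attr_setaffinity_np` only when the index is in range. The names `online_cpu_count` and `affinity_id_usable` are hypothetical. The sketch also assumes online CPUs are numbered contiguously from 0, which is not guaranteed, and relies on `_SC_NPROCESSORS_ONLN`, a common extension rather than a POSIX requirement.

```c
#include <unistd.h>

/* Number of CPUs currently online; falls back to 1 if the query fails. */
static long online_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n > 0) ? n : 1;
}

/* Hypothetical guard: returns 1 if affID names a CPU index below ncpu,
 * 0 if pinning should be skipped. */
static int affinity_id_usable(int affID, long ncpu)
{
    return ncpu > 0 && affID >= 0 && (long)affID < ncpu;
}
```

A caller would evaluate `affinity_id_usable(affID, online_cpu_count())` before building the CPU set, and start the thread unpinned when it returns 0.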
|
14
atlas-genparse.patch
Normal file
@@ -0,0 +1,14 @@
diff --git a/include/atlas_genparse.h b/include/atlas_genparse.h
index 909a38e..1e6d153 100644
--- a/include/atlas_genparse.h
+++ b/include/atlas_genparse.h
@@ -163,7 +163,8 @@ static int GetDoubleArr(char *str, int N, double *d)
if (!str)
break;
str++;
- assert(sscanf(str, "%le", d+i) == 1);
+ if (sscanf(str, "%le", d+i) != 1)
+ break;
i++;
}
return(i);
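The change above moves the `sscanf` call out of `assert`. Two consequences follow from standard C regardless of the patch author's motivation, which this diff does not record: malformed input now ends the loop and returns the count parsed so far instead of aborting, and the parse still runs when `NDEBUG` is defined, since `assert` does not evaluate its argument in that case. The sketch below illustrates the same pattern in a standalone parser. `parse_double_list` is a hypothetical function, not the ATLAS `GetDoubleArr`, whose full body is not visible here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parser: reads up to N comma-separated doubles from str
 * into d and returns how many were parsed. A field that fails to parse
 * stops the loop instead of aborting the program. */
static int parse_double_list(const char *str, int N, double *d)
{
    int i = 0;

    while (str && i < N) {
        /* The parse is an ordinary statement, so it still runs when
         * NDEBUG is defined; inside assert() it would be compiled out. */
        if (sscanf(str, "%lf", d + i) != 1)
            break;
        i++;
        str = strchr(str, ',');   /* advance to the next field, if any */
        if (str)
            str++;
    }
    return i;
}
```

Callers can then compare the returned count with the number of values they expected and report a short read in whatever way suits them.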
|
@@ -1,15 +1,16 @@
diff -up ATLAS/CONFIG/src/SpewMakeInc.c.melf ATLAS/CONFIG/src/SpewMakeInc.c
--- ATLAS/CONFIG/src/SpewMakeInc.c.melf 2011-05-14 11:33:24.000000000 -0600
+++ ATLAS/CONFIG/src/SpewMakeInc.c 2012-08-09 10:52:28.051926489 -0600
@@ -665,9 +665,9 @@ main(int nargs, char **args)
if (MachIsX86(mach))
{
if (ptrbits == 32)
- fprintf(fpout, " -melf_i386");
+ fprintf(fpout, " -Wl,-melf_i386");
else if (ptrbits == 64)
- fprintf(fpout, " -melf_x86_64");
+ fprintf(fpout, " -Wl,-melf_x86_64");
if (OS == OSFreeBSD)
fprintf(fpout, "_fbsd");
}
diff --git a/CONFIG/src/SpewMakeInc.c b/CONFIG/src/SpewMakeInc.c
index eed259e..65d68a1 100644
--- a/CONFIG/src/SpewMakeInc.c
+++ b/CONFIG/src/SpewMakeInc.c
@@ -764,9 +764,9 @@ int main(int nargs, char **args)
else
{
if (ptrbits == 32)
- fprintf(fpout, " -melf_i386");
+ fprintf(fpout, " -Wl,-melf_i386");
else if (ptrbits == 64)
- fprintf(fpout, " -melf_x86_64");
+ fprintf(fpout, " -Wl,-melf_x86_64");
if (OS == OSFreeBSD)
fprintf(fpout, "_fbsd");
}
32
atlas-new_archdef_for_ppc64le.patch
Normal file
@@ -0,0 +1,32 @@
Subject: atlas new archdef for ppc64le
From: Michel Normand <normand@linux.vnet.ibm.com>
Date: Sun, 13 Jun 2014 18:02:47 +0200

Need to define different archdef names
for ppc64 (that is Big Endian) and ppc64le (that is Little Endian).
This is already done upstream in atlas 3.11.30 with issue
https://sourceforge.net/p/math-atlas/patches/66/

Required at least as long as I need the bypass of
atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch

Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
---
CONFIG/src/SpewMakeInc.c | 4 ++++
1 file changed, 4 insertions(+)

Index: ATLAS/CONFIG/src/SpewMakeInc.c
===================================================================
--- ATLAS.orig/CONFIG/src/SpewMakeInc.c
+++ ATLAS/CONFIG/src/SpewMakeInc.c
@@ -542,6 +542,10 @@ int main(int nargs, char **args)
fprintf(fpout, "# -------------------------------------------------\n");
fprintf(fpout, " ARCH = %s", machnam[mach]);
fprintf(fpout, "%d", ptrbits);
+ /* for ppc64le archi add 'LE' characters */
+ #if defined(__powerpc64__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
+ fprintf(fpout, "%s", "LE");
+ #endif
if (ISAX)
fprintf(fpout, "%s", ISAXNAM[ISAX]);
if (!USEIEEE)
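The patch above appends "LE" to the generated ARCH name when the compiler targets little-endian 64-bit PowerPC, so the name is built as machine name, pointer width, then the optional suffix. The sketch below separates the compile-time test from the string formatting so that the naming rule can be exercised on any host. `built_for_ppc64le` and `format_arch_name` are hypothetical helpers and not part of SpewMakeInc.c; the predefined macros are the ones the patch itself uses, which GCC provides.

```c
#include <stdio.h>

/* Returns 1 when compiled for little-endian 64-bit PowerPC, using the
 * same predefined macros as the patch's #if; 0 on every other target. */
static int built_for_ppc64le(void)
{
#if defined(__powerpc64__) && defined(__BYTE_ORDER__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: writes "<mach><ptrbits>" plus "LE" when
 * little_endian_ppc64 is non-zero; returns snprintf's result. */
static int format_arch_name(char *buf, size_t len, const char *mach,
                            int ptrbits, int little_endian_ppc64)
{
    return snprintf(buf, len, "%s%d%s", mach, ptrbits,
                    little_endian_ppc64 ? "LE" : "");
}
```

With "POWER8", 64 and the flag set, the helper produces "POWER864LE". The `/POWER864LEVSXp4.tar.bz2` entry in the .gitignore diff appears consistent with this naming followed by an ISA extension name, although that link is an inference from the file name only.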
|
@@ -1,279 +0,0 @@
|
||||
---
|
||||
CONFIG/include/atlconf.h | 18 +++++++-----
|
||||
CONFIG/src/Makefile | 5 +++
|
||||
CONFIG/src/SpewMakeInc.c | 5 +++
|
||||
CONFIG/src/atlcomp.txt | 50 ++++++++++++++++++++++++++++++++++++
|
||||
CONFIG/src/atlconf_misc.c | 2 +
|
||||
CONFIG/src/backend/Make.ext | 2 +
|
||||
CONFIG/src/backend/archinfo_linux.c | 12 ++++++++
|
||||
CONFIG/src/backend/probe_gas_s390.S | 13 +++++++++
|
||||
CONFIG/src/probe_comp.c | 2 +
|
||||
include/atlas_prefetch.h | 6 ++++
|
||||
10 files changed, 108 insertions(+), 7 deletions(-)
|
||||
|
||||
Index: b/CONFIG/include/atlconf.h
|
||||
===================================================================
|
||||
--- a/CONFIG/include/atlconf.h
|
||||
+++ b/CONFIG/include/atlconf.h
|
||||
@@ -14,9 +14,9 @@ enum OSTYPE {OSOther=0, OSLinux, OSSunOS
|
||||
OSWin9x, OSWinNT, OSHPUX, OSFreeBSD, OSOSX};
|
||||
#define OSIsWin(OS_) (((OS_) == OSWinNT) || ((OS_) == OSWin9x))
|
||||
|
||||
-enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS};
|
||||
+enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS, AFS390};
|
||||
|
||||
-#define NMACH 37
|
||||
+#define NMACH 42
|
||||
static char *machnam[NMACH] =
|
||||
{"UNKNOWN", "POWER3", "POWER4", "POWER5", "PPCG4", "PPCG5",
|
||||
"POWER6", "POWER7",
|
||||
@@ -25,7 +25,8 @@ static char *machnam[NMACH] =
|
||||
"Efficeon", "K7", "HAMMER", "AMD64K10h", "UNKNOWNx86",
|
||||
"IA64Itan", "IA64Itan2",
|
||||
"USI", "USII", "USIII", "USIV", "UST2", "UnknownUS",
|
||||
- "MIPSR1xK", "MIPSICE9"};
|
||||
+ "MIPSR1xK", "MIPSICE9",
|
||||
+ "IBMz900", "IBMz990", "IBMz9", "IBMz10", "IBMz196" };
|
||||
enum MACHTYPE {MACHOther, IbmPwr3, IbmPwr4, IbmPwr5, PPCG4, PPCG5,
|
||||
IbmPwr6, IbmPwr7,
|
||||
IntP5, IntP5MMX, IntPPRO, IntPII, IntPIII, IntPM, IntCoreS,
|
||||
@@ -34,7 +35,8 @@ enum MACHTYPE {MACHOther, IbmPwr3, IbmPw
|
||||
IA64Itan, IA64Itan2,
|
||||
SunUSI, SunUSII, SunUSIII, SunUSIV, SunUST2, SunUSX,
|
||||
MIPSR1xK, /* includes R10K, R12K, R14K, R16K */
|
||||
- MIPSICE9 /* SiCortex ICE9 -- like MIPS5K */
|
||||
+ MIPSICE9, /* SiCortex ICE9 -- like MIPS5K */
|
||||
+ IBMz900, IBMz990, IBMz9, IBMz10, IBMz196 /* s390(x) in Linux */
|
||||
};
|
||||
#define MachIsX86(mach_) \
|
||||
( (mach_) >= IntP5 && (mach_) <= x86X )
|
||||
@@ -51,6 +53,8 @@ enum MACHTYPE {MACHOther, IbmPwr3, IbmPw
|
||||
#endif
|
||||
#define MachIsPPC(mach_) \
|
||||
( (mach_) >= PPCG4 && (mach_) <= PPCG5 )
|
||||
+#define MachIsS390(mach_) \
|
||||
+ ( (mach_) >= IBMz900 && (mach_) <= IBMz196 )
|
||||
|
||||
static char *f2c_namestr[5] = {"UNKNOWN","Add_", "Add__", "NoChange", "UpCase"};
|
||||
static char *f2c_intstr[5] =
|
||||
@@ -68,13 +72,13 @@ static char *ISAXNAM[NISA] =
|
||||
{"", "AltiVec", "SSE3", "SSE2", "SSE1", "3DNow"};
|
||||
enum ISAEXT {ISA_None=0, ISA_AV, ISA_SSE3, ISA_SSE2, ISA_SSE1, ISA_3DNow};
|
||||
|
||||
-#define NASMD 7
|
||||
+#define NASMD 8
|
||||
enum ASMDIA
|
||||
{ASM_None=0, gas_x86_32, gas_x86_64, gas_sparc, gas_ppc, gas_parisc,
|
||||
- gas_mips};
|
||||
+ gas_mips, gas_s390};
|
||||
static char *ASMNAM[NASMD] =
|
||||
{"", "GAS_x8632", "GAS_x8664", "GAS_SPARC", "GAS_PPC", "GAS_PARISC",
|
||||
- "GAS_MIPS"};
|
||||
+ "GAS_MIPS", "GAS_S390"};
|
||||
|
||||
|
||||
/*
|
||||
Index: b/CONFIG/src/Makefile
|
||||
===================================================================
|
||||
--- a/CONFIG/src/Makefile
|
||||
+++ b/CONFIG/src/Makefile
|
||||
@@ -177,6 +177,11 @@ IRun_GAS_x8632 :
|
||||
$(MAKE) $(atlrun) atldir=$(mydir) exe=xprobe_gas_x8632 args="$(args)" \
|
||||
redir=config0.out
|
||||
- cat config0.out
|
||||
+IRun_GAS_S390 :
|
||||
+ $(CC) $(CCFLAGS) -o xprobe_gas_s390 $(SRCdir)/backend/probe_this_asm.c $(SRCdir)/backend/probe_gas_s390.S
|
||||
+ $(MAKE) $(atlrun) atldir=$(mydir) exe=xprobe_gas_s390 args="$(args)" \
|
||||
+ redir=config0.out
|
||||
+ - cat config0.out
|
||||
|
||||
IRunC2C :
|
||||
- rm -f config0.out xc2c c2cslave.o
|
||||
Index: b/CONFIG/src/SpewMakeInc.c
|
||||
===================================================================
|
||||
--- a/CONFIG/src/SpewMakeInc.c
|
||||
+++ b/CONFIG/src/SpewMakeInc.c
|
||||
@@ -342,6 +342,9 @@ char *GetPtrbitsFlag(enum OSTYPE OS, enu
|
||||
return(sp);
|
||||
if (MachIsMIPS(arch))
|
||||
return((ptrbits == 64) ? "-mabi=64" : "-mabi=n32");
|
||||
+ if (MachIsS390(arch))
|
||||
+ return((ptrbits == 64) ? "-m64" : "-m31");
|
||||
+
|
||||
if (!CompIsGcc(comp))
|
||||
{
|
||||
/*
|
||||
@@ -671,6 +674,8 @@ main(int nargs, char **args)
|
||||
if (OS == OSFreeBSD)
|
||||
fprintf(fpout, "_fbsd");
|
||||
}
|
||||
+ if (MachIsS390(mach))
|
||||
+ fprintf(fpout, ptrbits == 32 ? "-m31" : "-m64");
|
||||
fprintf(fpout, "\n F77SYSLIB = %s\n", f77lib ? f77lib : "");
|
||||
fprintf(fpout, " BC = $(ICC)\n");
|
||||
fprintf(fpout, " NCFLAGS = $(ICCFLAGS)\n");
|
||||
Index: b/CONFIG/src/atlcomp.txt
|
||||
===================================================================
|
||||
--- a/CONFIG/src/atlcomp.txt
|
||||
+++ b/CONFIG/src/atlcomp.txt
|
||||
@@ -164,6 +164,56 @@ MACH=ALL OS=WinNT LVL=0 COMPS=f77
|
||||
MACH=P4,PM OS=WinNT LVL=0 COMPS=icc,dmc,smc,dkc,skc,xcc
|
||||
'icl' '-QxN -O3 -Qprec -fp:extended -fp:except -nologo -Oy'
|
||||
#
|
||||
+# IBM System z or zEnterprise
|
||||
+#
|
||||
+
|
||||
+# z900 or z800
|
||||
+MACH=IBMz900 OS=ALL LVL=1000 COMPS=f77
|
||||
+ 'gfortran' '-march=z900 -O3 -funroll-loops'
|
||||
+MACH=IBMz900 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
|
||||
+ 'gcc' '-march=z900 -O3 -funroll-loops'
|
||||
+
|
||||
+# z990 or z890
|
||||
+MACH=IBMz990 OS=ALL LVL=1000 COMPS=f77
|
||||
+ 'gfortran' '-march=z990 -O3 -funroll-loops'
|
||||
+MACH=IBMz990 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
|
||||
+ 'gcc' '-march=z990 -O3 -funroll-loops'
|
||||
+
|
||||
+# z9-EC z9-BC or z9-109
|
||||
+MACH=IBMz9 OS=ALL LVL=1000 COMPS=f77
|
||||
+ 'gfortran' '-march=z9-109 -O3 -funroll-loops'
|
||||
+MACH=IBMz9 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
|
||||
+ 'gcc' '-march=z9-109 -O3 -funroll-loops'
|
||||
+
|
||||
+# on z10 and z196 gcc emits prefetches which disturb cache size
|
||||
+# detection and optimization. Therefore, we use fno-prefetch-loop-arrays
|
||||
+# z10
|
||||
+MACH=IBMz10 OS=ALL LVL=1000 COMPS=f77
|
||||
+ 'gfortran' '-march=z10 -O3 -funroll-loops -fno-prefetch-loop-arrays'
|
||||
+MACH=IBMz10 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
|
||||
+ 'gcc' '-march=z10 -O3 -funroll-loops -fno-prefetch-loop-arrays'
|
||||
+
|
||||
+# z196. we also try to fallback to z10 and z9 for older compilers
|
||||
+MACH=IBMz196 OS=ALL LVL=1000 COMPS=f77
|
||||
+ 'gfortran' '-march=z196 -O3 -funroll-loops -fno-prefetch-loop-arrays'
|
||||
+MACH=IBMz196 OS=ALL LVL=800 COMPS=f77
|
||||
+ 'gfortran' '-march=z10 -O3 -funroll-loops -fno-prefetch-loop-arrays'
|
||||
+MACH=IBMz196 OS=ALL LVL=600 COMPS=f77
|
||||
+ 'gfortran' '-march=z9-109 -O3 -funroll-loops'
|
||||
+MACH=IBMz196 OS=ALL LVL=1000 COMPS=smc,dmc,skc,dkc,icc,xcc
|
||||
+ 'gcc' '-march=z196 -O3 -funroll-loops -fno-prefetch-loop-arrays'
|
||||
+MACH=IBMz196 OS=ALL LVL=800 COMPS=smc,dmc,skc,dkc,icc,xcc
|
||||
+ 'gcc' '-march=z10 -O3 -funroll-loops -fno-prefetch-loop-arrays'
|
||||
+MACH=IBMz196 OS=ALL LVL=600 COMPS=smc,dmc,skc,dkc,icc,xcc
|
||||
+ 'gcc' '-march=z9-109 -O3 -funroll-loops'
|
||||
+
|
||||
+# ALL march options failed, go back to conservative defaults
|
||||
+MACH=IBMz900,IBMz990,IBMz9,IBMz10,IBMz196 OS=ALL LVL=500 COMPS=f77
|
||||
+ 'gfortran' '-O3 -funroll-loops'
|
||||
+MACH=IBMz900,IBMz990,IBMz9,IBMz10,IBMz196 OS=ALL LVL=500 COMPS=smc,dmc,skc,dkc,icc,xcc
|
||||
+ 'gcc' '-O3 -funroll-loops'
|
||||
+
|
||||
+#
|
||||
# Generic defaults
|
||||
#
|
||||
MACH=ALL OS=ALL LVL=5 COMPS=icc,smc,dmc,skc,dkc,xcc
|
||||
Index: b/CONFIG/src/atlconf_misc.c
|
||||
===================================================================
|
||||
--- a/CONFIG/src/atlconf_misc.c
|
||||
+++ b/CONFIG/src/atlconf_misc.c
|
||||
@@ -480,6 +480,7 @@ enum ARCHFAM ProbeArchFam(char *targ)
|
||||
else if (strstr(res, "alpha")) fam = AFALPHA;
|
||||
else if (strstr(res, "ia64")) fam = AFIA64;
|
||||
else if (strstr(res, "mips")) fam = AFMIPS;
|
||||
+ else if (strstr(res, "s390")) fam = AFS390;
|
||||
else if ( strstr(res, "i686") || strstr(res, "i586") ||
|
||||
strstr(res, "i486") || strstr(res, "i386") ||
|
||||
strstr(res, "x86") || strstr(res, "x86_64") ) fam = AFX86;
|
||||
@@ -501,6 +502,7 @@ enum ARCHFAM ProbeArchFam(char *targ)
|
||||
strstr(res, "i486") || strstr(res, "i386") ||
|
||||
strstr(res, "x86_64") ) fam = AFX86;
|
||||
else if (strstr(res, "mips")) fam = AFMIPS;
|
||||
+ else if (strstr(res, "s390")) fam = AFS390;
|
||||
}
|
||||
}
|
||||
return(fam);
|
||||
Index: b/CONFIG/src/backend/Make.ext
|
||||
===================================================================
|
||||
--- a/CONFIG/src/backend/Make.ext
|
||||
+++ b/CONFIG/src/backend/Make.ext
|
||||
@@ -43,6 +43,8 @@ probe_gas_parisc.S : $(basf)
|
||||
$(extC) -b $(basf) -o probe_gas_parisc.S rout=probe_gas_parisc.S
|
||||
probe_gas_mips.S : $(basf)
|
||||
$(extC) -b $(basf) -o probe_gas_mips.S rout=probe_gas_mips.S
|
||||
+probe_gas_s390.S : $(basf)
|
||||
+ $(extC) -b $(basf) -o probe_gas_s390.S rout=probe_gas_s390.S
|
||||
probe_SSE3.S : $(basf)
|
||||
$(extC) -b $(basf) -o probe_SSE3.S rout=probe_SSE3.S
|
||||
probe_SSE2.S : $(basf)
|
||||
Index: b/CONFIG/src/backend/archinfo_linux.c
|
||||
===================================================================
|
||||
--- a/CONFIG/src/backend/archinfo_linux.c
|
||||
+++ b/CONFIG/src/backend/archinfo_linux.c
|
||||
@@ -193,6 +193,18 @@ enum MACHTYPE ProbeArch()
|
||||
}
|
||||
#endif
|
||||
break;
|
||||
+ case AFS390:
|
||||
+ if ( !CmndOneLine(NULL, "cat /proc/cpuinfo | fgrep \"processor \"", res) )
|
||||
+ {
|
||||
+ if (strstr(res, "2064") || strstr(res, "2066")) mach = IBMz900;
|
||||
+ else if (strstr(res, "2084") || strstr(res, "2086")) mach = IBMz990;
|
||||
+ else if (strstr(res, "2094") || strstr(res, "2096")) mach = IBMz9;
|
||||
+ else if (strstr(res, "2097") || strstr(res, "2098")) mach = IBMz10;
|
||||
+ /* we consider anything else to be a z196 or later */
|
||||
+ else mach = IBMz196;
|
||||
+ }
|
||||
+ break;
|
||||
+
|
||||
default:
|
||||
#if 0
|
||||
if (!CmndOneLine(NULL, "fgrep 'cpu family' /proc/cpuinfo", res))
|
||||
Index: b/CONFIG/src/backend/probe_gas_s390.S
|
||||
===================================================================
|
||||
--- /dev/null
|
||||
+++ b/CONFIG/src/backend/probe_gas_s390.S
|
||||
@@ -0,0 +1,13 @@
|
||||
+#define ATL_GAS_PPC
|
||||
+#include "atlas_asm.h"
|
||||
+/*
|
||||
+ * Linux S390 assembler for:
|
||||
+ * int asm_probe(int i)
|
||||
+ * RETURNS: i*3
|
||||
+ */
|
||||
+.globl ATL_asmdecor(asm_probe)
|
||||
+ATL_asmdecor(asm_probe):
|
||||
+ lr r3,r2
|
||||
+ ar r2,r3
|
||||
+ ar r2,r3
|
||||
+ br r14
|
||||
Index: b/CONFIG/src/probe_comp.c
|
||||
===================================================================
|
||||
--- a/CONFIG/src/probe_comp.c
|
||||
+++ b/CONFIG/src/probe_comp.c
|
||||
@@ -509,6 +509,8 @@ char *GetPtrbitsFlag(enum OSTYPE OS, enu
|
||||
return(sp);
|
||||
if (MachIsMIPS(arch))
|
||||
return((ptrbits == 64) ? "-mabi=64" : "-mabi=n32");
|
||||
+ if (MachIsS390(arch))
|
||||
+ return((ptrbits == 64) ? "-m64" : "-m31");
|
||||
if (!CompIsGcc(comp))
|
||||
{
|
||||
/*
|
||||
Index: b/include/atlas_prefetch.h
|
||||
===================================================================
|
||||
--- a/include/atlas_prefetch.h
|
||||
+++ b/include/atlas_prefetch.h
|
||||
@@ -149,6 +149,12 @@
|
||||
#define ATL_GOT_L1PREFETCH
|
||||
#define ATL_L1LS 32
|
||||
#define ATL_L2LS 64
|
||||
+#elif defined(ATL_ARCH_IBMz196) || defined(ATL_ARCH_IBMz10)
|
||||
+ #define ATL_pfl1R(mem) __builtin_prefetch(mem, 0, 3)
|
||||
+ #define ATL_pfl1W(mem) __builtin_prefetch(mem, 1, 3)
|
||||
+ #define ATL_GOT_L1PREFETCH
|
||||
+ #define ATL_L1LS 256
|
||||
+ #define ATL_L2LS 256
|
||||
#elif defined(__GNUC__) /* last ditch, use gcc predefined func */
|
||||
#define ATL_pfl1R(mem) __builtin_prefetch(mem, 0, 3)
|
||||
#define ATL_pfl1W(mem) __builtin_prefetch(mem, 1, 3)
|
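Note on the s390 probe in archinfo_linux.c above: the `case AFS390` branch maps the machine-type number found in /proc/cpuinfo to a machine enum with a chain of strstr() calls, and falls back to z196 for anything it does not recognize. The same mapping can be sketched as a lookup table. This is only an illustration under stated assumptions: `enum zmach`, `ztab`, and `classify_s390` are hypothetical names that do not exist in ATLAS; only the machine-type numbers and the z196 fallback are taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the IBMz* members of ATLAS's enum MACHTYPE. */
enum zmach { Z_UNKNOWN = 0, Z_900, Z_990, Z_9, Z_10, Z_196 };

/* Machine-type numbers exactly as they appear in the patch's strstr() chain. */
static const struct { const char *type; enum zmach mach; } ztab[] = {
    {"2064", Z_900}, {"2066", Z_900},
    {"2084", Z_990}, {"2086", Z_990},
    {"2094", Z_9},   {"2096", Z_9},
    {"2097", Z_10},  {"2098", Z_10},
};

/* Classify one "processor" line from /proc/cpuinfo by substring match. */
static enum zmach classify_s390(const char *cpuinfo_line)
{
    size_t i;

    if (cpuinfo_line == NULL)
        return Z_UNKNOWN;          /* nothing to inspect */
    for (i = 0; i < sizeof ztab / sizeof ztab[0]; i++)
        if (strstr(cpuinfo_line, ztab[i].type) != NULL)
            return ztab[i].mach;
    return Z_196;                  /* the patch treats anything else as z196 or later */
}
```

As in the patch, this is a plain substring match, so a type number that happens to appear elsewhere on the line (for example inside an identification field) would also match.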
40
atlas-shared_libraries.patch
Normal file
@ -0,0 +1,40 @@
From 3119c671c566761a79ac98405cb619892acde3e8 Mon Sep 17 00:00:00 2001
From: Lukas Slebodnik <lslebodn@redhat.com>
Date: Fri, 20 Sep 2013 09:26:58 +0200
Subject: [PATCH] atlas-shared_libraries

---
 ATLAS/makes/Make.lib | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/ATLAS/makes/Make.lib b/ATLAS/makes/Make.lib
index ab1eb9963d36678972a0a410905169aaa563dc64..27c6e316b442e09b0f46afac7940aaa11e25e45c 100644
--- a/ATLAS/makes/Make.lib
+++ b/ATLAS/makes/Make.lib
@@ -4,6 +4,8 @@ mySRCdir = $(SRCdir)/lib
 #
 # override with libatlas.so only when atlas is built to one lib
 #
+so_ver_major=3
+so_ver = $(so_ver_major).10
 DYNlibs = liblapack.so libf77blas.so libcblas.so libatlas.so
 PTDYNlibs = liblapack.so libptf77blas.so libptcblas.so libatlas.so
 CDYNlibs = liblapack.so libcblas.so libatlas.so
@@ -116,9 +116,12 @@ LDTRY:
 -rpath-link $(LIBINSTdir) \
 --whole-archive $(libas) --no-whole-archive $(LIBS)
 GCCTRY:
- $(GOODGCC) -shared -o $(outso) \
- -Wl,"-rpath-link $(LIBINSTdir)" \
+ $(GOODGCC) -shared -o $(outso).$(so_ver) \
+ \
+ -Wl,-soname,"$(outso).$(so_ver_major)" \
 -Wl,--whole-archive $(libas) -Wl,--no-whole-archive $(LIBS)
+ ln -s $(outso).$(so_ver) $(outso).$(so_ver_major)
+ ln -s $(outso).$(so_ver) $(outso)
 GCCTRY_norp:
 $(GOODGCC) -shared -o $(outso) \
 -Wl,--whole-archive $(libas) -Wl,--no-whole-archive $(LIBS)
--
1.8.3.1
|
12
atlas-throttling.patch
Normal file
@ -0,0 +1,12 @@
diff -up ATLAS/CONFIG/src/config.c.zaloha ATLAS/CONFIG/src/config.c
--- ATLAS/CONFIG/src/config.c.zaloha 2012-10-25 11:29:02.495425989 +0200
+++ ATLAS/CONFIG/src/config.c 2012-10-25 11:42:10.218216957 +0200
@@ -711,6 +711,8 @@ int ProbePtrbits(int verb, char *targarg

 int ProbeCPUThrottle(int verb, char *targarg, enum OSTYPE OS, enum ASMDIA asmb)
 {
+ return 0; /* impossible to turn off cpu throttling => ignore */
+ /* this undermines performance of compiled library */
 int i, iret;
 char *ln;
 i = strlen(targarg) + 22 + 12;
|
17
atlas.3.10.1-unbundle.patch
Normal file
@ -0,0 +1,17 @@
diff -up wrk/makes/Make.lib.wrk wrk/makes/Make.lib
--- wrk/makes/Make.lib.wrk 2015-01-23 21:14:46.465494411 +0100
+++ wrk/makes/Make.lib 2015-01-23 22:48:39.632479588 +0100
@@ -185,11 +185,11 @@ TRYALL :
 #
 fat_ptshared : # threaded target
 $(MAKE) TRYALL outso=libtatlas.so \
- libas="libptlapack.a libptf77blas.a libptcblas.a libatlas.a" \
+ libas="libptlapack.a libptf77blas.a libptcblas.a libatlas.a $(SLAPACKlib)" \
 LIBINSTdir="$(LIBINSTdir)"
 fat_shared : # serial target
 $(MAKE) TRYALL outso=libsatlas.so \
- libas="liblapack.a libf77blas.a libcblas.a libatlas.a" \
+ libas="liblapack.a libf77blas.a libcblas.a libatlas.a $(SLAPACKlib)" \
 LIBINSTdir="$(LIBINSTdir)"
 #
 # Builds shared lib, not include fortran codes from LAPACK
|
131
atlas.3.10.2-add_power8_cpu.patch
Normal file
@ -0,0 +1,131 @@
|
||||
From: Michel Normand <normand@linux.vnet.ibm.com>
|
||||
Subject: atlas.3.10.2 add power8 cpu
|
||||
Date: Thu, 18 Sep 2014 15:13:24 +0200
|
||||
|
||||
atlas.3.10.2 add Power8 cpu
|
||||
tracked upstream by issue 67
|
||||
https://sourceforge.net/p/math-atlas/patches/67/
|
||||
|
||||
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
|
||||
---
|
||||
CONFIG/ARCHS/Make.ext | 7 +++++++
|
||||
CONFIG/include/atlconf.h | 6 +++---
|
||||
CONFIG/src/atlcomp.txt | 6 ++++++
|
||||
CONFIG/src/backend/archinfo_aix.c | 2 ++
|
||||
CONFIG/src/backend/archinfo_linux.c | 1 +
|
||||
include/atlas_pca.h | 2 +-
|
||||
6 files changed, 20 insertions(+), 4 deletions(-)
|
||||
|
||||
Index: ATLAS/CONFIG/ARCHS/Make.ext
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/ARCHS/Make.ext
|
||||
+++ ATLAS/CONFIG/ARCHS/Make.ext
|
||||
@@ -33,6 +33,7 @@ files = AMD64K10h32SSE3.tar.bz2 AMD64K10
|
||||
MIPSR1xK64.tar.bz2 Makefile P432SSE2.tar.bz2 P4E32SSE3.tar.bz2 \
|
||||
P4E64SSE3.tar.bz2 PIII32SSE1.tar.bz2 POWER432.tar.bz2 \
|
||||
POWER464.tar.bz2 POWER564.tar.bz2 POWER764VSX.tar.bz2 \
|
||||
+ POWER864VSX.tar.bz2 \
|
||||
PPCG432AltiVec.tar.bz2 PPCG532AltiVec.tar.bz2 PPCG564AltiVec.tar.bz2 \
|
||||
PPRO32.tar.bz2 USIII32.tar.bz2 USIII64.tar.bz2 USIV32.tar.bz2 \
|
||||
USIV64.tar.bz2 UST232.tar.bz2 UST264.tar.bz2 atlas_test1.1.3.tar.bz2 \
|
||||
@@ -308,6 +309,12 @@ POWER764VSX.tar.bz2 : $(basdr)/POWER764V
|
||||
/tmp/POWER764VSX.tar POWER764VSX
|
||||
bzip2 /tmp/POWER764VSX.tar
|
||||
mv /tmp/POWER764VSX.tar.bz2 ./.
|
||||
+POWER864VSX.tar.bz2 : $(basdr)/POWER864VSX
|
||||
+ - rm -f /tmp/POWER864VSX.tar /tmp/POWER864VSX.tar.bz2
|
||||
+ cd $(basdr) ; tar --dereference --exclude 'CVS' -c -f \
|
||||
+ /tmp/POWER864VSX.tar POWER864VSX
|
||||
+ bzip2 /tmp/POWER864VSX.tar
|
||||
+ mv /tmp/POWER864VSX.tar.bz2 ./.
|
||||
IBMz1032.tar.bz2 : $(basdr)/IBMz1032
|
||||
- rm -f /tmp/IBMz1032.tar /tmp/IBMz1032.tar.bz2
|
||||
cd $(basdr) ; tar --dereference --exclude 'CVS' -c -f \
|
||||
Index: ATLAS/CONFIG/include/atlconf.h
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/include/atlconf.h
|
||||
+++ ATLAS/CONFIG/include/atlconf.h
|
||||
@@ -18,10 +18,10 @@ enum OSTYPE {OSOther=0, OSLinux, OSSunOS
|
||||
enum ARCHFAM {AFOther=0, AFPPC, AFSPARC, AFALPHA, AFX86, AFIA64, AFMIPS,
|
||||
AFARM, AFS390};
|
||||
|
||||
-#define NMACH 52
|
||||
+#define NMACH 53
|
||||
static char *machnam[NMACH] =
|
||||
{"UNKNOWN", "POWER3", "POWER4", "POWER5", "PPCG4", "PPCG5",
|
||||
- "POWER6", "POWER7", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
|
||||
+ "POWER6", "POWER7", "POWER8", "POWERe6500", "IBMz9", "IBMz10", "IBMz196",
|
||||
"x86x87", "x86SSE1", "x86SSE2", "x86SSE3",
|
||||
"P5", "P5MMX", "PPRO", "PII", "PIII", "PM", "CoreSolo",
|
||||
"CoreDuo", "Core2Solo", "Core2", "Corei1", "Corei2", "Corei3",
|
||||
@@ -31,7 +31,7 @@ static char *machnam[NMACH] =
|
||||
"USI", "USII", "USIII", "USIV", "UST1", "UST2", "UnknownUS",
|
||||
"MIPSR1xK", "MIPSICE9", "ARMv7"};
|
||||
enum MACHTYPE {MACHOther, IbmPwr3, IbmPwr4, IbmPwr5, PPCG4, PPCG5,
|
||||
- IbmPwr6, IbmPwr7, Pwre6500,
|
||||
+ IbmPwr6, IbmPwr7, IbmPwr8, Pwre6500,
|
||||
IbmZ9, IbmZ10, IbmZ196, /* s390(x) in Linux */
|
||||
x86x87, x86SSE1, x86SSE2, x86SSE3, /* generic targets */
|
||||
IntP5, IntP5MMX, IntPPRO, IntPII, IntPIII, IntPM, IntCoreS,
|
||||
Index: ATLAS/CONFIG/src/atlcomp.txt
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/atlcomp.txt
|
||||
+++ ATLAS/CONFIG/src/atlcomp.txt
|
||||
@@ -190,6 +190,10 @@ MACH=PPCG5 OS=ALL LVL=1000 COMPS=dmc,icc
|
||||
'gcc' '-mpowerpc64 -maltivec -mabi=altivec -mcpu=970 -mtune=970 -O2'
|
||||
MACH=PPCG5 OS=ALL LVL=1000 COMPS=skc
|
||||
'gcc' '-mpowerpc64 -maltivec -mabi=altivec -mcpu=970 -mtune=970 -O2 -mvrsave'
|
||||
+MACH=POWER8 OS=ALL LVL=1010 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
|
||||
+ 'gcc' '-O2 -mvsx -mcpu=power8 -mtune=power8 -m64 -mvrsave -funroll-all-loops'
|
||||
+MACH=POWER8 OS=ALL LVL=1010 COMPS=f77
|
||||
+ 'gfortran' '-O2 -mvsx -mcpu=power8 -mtune=power8 -m64 -mvrsave -funroll-all-loops'
|
||||
MACH=POWER7 OS=ALL LVL=1010 COMPS=icc,smc,dmc,skc,dkc,xcc,gcc
|
||||
'gcc' '-O2 -mvsx -mcpu=power7 -mtune=power7 -m64 -mvrsave -funroll-all-loops'
|
||||
MACH=POWER7 OS=ALL LVL=1010 COMPS=f77
|
||||
@@ -210,6 +214,8 @@ MACH=POWER4 OS=ALL LVL=1010 COMPS=icc,dm
|
||||
'gcc' '-mcpu=power4 -mtune=power4 -O3 -fno-schedule-insns -fno-rerun-loop-opt'
|
||||
MACH=POWER4 OS=ALL LVL=1010 COMPS=f77
|
||||
'xlf' '-qtune=pwr4 -qarch=pwr4 -O3 -qmaxmem=-1 -qfloat=hsflt'
|
||||
+MACH=POWER8 OS=ALL LVL=1010 COMPS=f77
|
||||
+ 'xlf' '-qtune=pwr8 -qarch=pwr8 -O3 -qmaxmem=-1 -qfloat=hsflt'
|
||||
#
|
||||
# IBM System z or zEnterprise.
|
||||
# These compiler flags given by IBM; -O3 -funroll-loops are chosen because
|
||||
Index: ATLAS/CONFIG/src/backend/archinfo_linux.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/backend/archinfo_linux.c
|
||||
+++ ATLAS/CONFIG/src/backend/archinfo_linux.c
|
||||
@@ -77,6 +77,7 @@ enum MACHTYPE ProbeArch()
|
||||
else if (strstr(res, "7455")) mach = PPCG4;
|
||||
else if (strstr(res, "PPC970FX")) mach = PPCG5;
|
||||
else if (strstr(res, "PPC970MP")) mach = PPCG5;
|
||||
+ else if (strstr(res, "POWER8")) mach = IbmPwr8;
|
||||
else if (strstr(res, "POWER7")) mach = IbmPwr7;
|
||||
else if (strstr(res, "POWER6")) mach = IbmPwr6;
|
||||
else if (strstr(res, "POWER5")) mach = IbmPwr5;
|
||||
Index: ATLAS/include/atlas_pca.h
|
||||
===================================================================
|
||||
--- ATLAS.orig/include/atlas_pca.h
|
||||
+++ ATLAS/include/atlas_pca.h
|
||||
@@ -26,7 +26,7 @@
|
||||
#endif
|
||||
#elif defined(ATL_ARCH_POWER3) || defined(ATL_ARCH_POWER4) || \
|
||||
defined(ATL_ARCH_POWER5) || defined(ATL_ARCH_POWER6) || \
|
||||
- defined(ATL_ARCH_POWER7)
|
||||
+ defined(ATL_ARCH_POWER7) || defined(ATL_ARCH_POWER8)
|
||||
#ifdef __GNUC__
|
||||
#define ATL_membarrier __asm__ __volatile__ ("dcs")
|
||||
/* #define ATL_USEPCA 1 */
|
||||
Index: ATLAS/CONFIG/src/backend/archinfo_aix.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/CONFIG/src/backend/archinfo_aix.c
|
||||
+++ ATLAS/CONFIG/src/backend/archinfo_aix.c
|
||||
@@ -67,6 +67,8 @@ enum MACHTYPE ProbeArch()
|
||||
{
|
||||
if (strstr(res, "PowerPC_POWER5"))
|
||||
mach = IbmPwr5;
|
||||
+ else if (strstr(res, "PowerPC_POWER8"))
|
||||
+ mach = IbmPwr8;
|
||||
else if (strstr(res, "PowerPC_POWER7"))
|
||||
mach = IbmPwr7;
|
||||
else if (strstr(res, "PowerPC_POWER6"))
|
220
atlas.3.10.2-ppc64le_abiv2.patch
Normal file
@ -0,0 +1,220 @@
|
||||
From: Michel Normand <normand@linux.vnet.ibm.com>
|
||||
Subject: atlas.3.10.2 ppc64le abiv2 patch
|
||||
Date: Mon, 28 Jul 2014 04:29:05 -0400
|
||||
|
||||
atlas.3.10.2 abiv2 step2 complete the changes already present in atlas 3.10.2
|
||||
* still some files with opd ABI V1 to be disabled for ABI V2
|
||||
tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
|
||||
tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
|
||||
tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
|
||||
|
||||
atlas.3.10.2 ppc64le abiv2 step3
|
||||
* change offsets of parameters read from stack to avoid some segfaults.
|
||||
(values changes 120 => 104 and 128 => 112 identified by gdb investigation)
|
||||
|
||||
Despite this step3 patch there are two Remaining problems for ppc64le archi:
|
||||
* TODO: still have seg-faults in console during build/check
|
||||
but is not critical (without make check) and rpm are generated on fedora.
|
||||
unable to investigate because of problem tracked by issue 950
|
||||
https://sourceforge.net/p/math-atlas/support-requests/950/
|
||||
|
||||
* TODO: make check failure because xsslvtst execution failure
|
||||
related to vector assembly code that assumes big-endian env
|
||||
as written in ATL_cmm4x4x128_av.c and ATL_smm4x4x128_av.c.
|
||||
Would need significant work to support little-endian as per
|
||||
endianess comments of all PowerPC vector instructions in:
|
||||
https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/FBFA164F824370F987256D6A006F424D/$file/vector_simd_pem.ppc.2005AUG23.pdf
|
||||
|
||||
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
|
||||
---
|
||||
tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c | 7 +++++++
|
||||
tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c | 7 +++++++
|
||||
tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c | 9 ++++++++-
|
||||
tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c | 20 ++++++++++++++++++--
|
||||
tune/blas/gemm/CASES/ATL_smm4x4x128_av.c | 23 ++++++++++++++++++++++-
|
||||
5 files changed, 62 insertions(+), 4 deletions(-)
|
||||
|
||||
Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
|
||||
+++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x32_ppc.c
|
||||
@@ -268,7 +268,7 @@ Mjoin(.,ATL_USERMM):
|
||||
.globl Mjoin(_,ATL_USERMM)
|
||||
Mjoin(_,ATL_USERMM):
|
||||
#else
|
||||
- #if defined(ATL_USE64BITS)
|
||||
+ #if defined(ATL_USE64BITS) && _CALL_ELF != 2
|
||||
/*
|
||||
* Official Program Descripter section, seg fault w/o it on Linux/PPC64
|
||||
*/
|
||||
@@ -324,8 +324,15 @@ ATL_USERMM:
|
||||
#endif
|
||||
|
||||
#ifdef ATL_USE64BITS
|
||||
+#if _CALL_ELF == 2
|
||||
+/* ABIv2 */
|
||||
+ ld pC0, 104(r1)
|
||||
+ ld incCn, 112(r1)
|
||||
+#else
|
||||
+/* ABIv1 */
|
||||
ld pC0, 120(r1)
|
||||
ld incCn, 128(r1)
|
||||
+#endif
|
||||
#elif defined(ATL_AS_OSX_PPC) || defined(ATL_AS_AIX_PPC)
|
||||
lwz pC0, 68(r1)
|
||||
lwz incCn, 72(r1)
|
||||
Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
|
||||
+++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x80_ppc.c
|
||||
@@ -170,13 +170,21 @@ void ATL_USERMM(const int M, const int N
|
||||
const TYPE beta, TYPE *C, const int ldc)
|
||||
(r10) 8(r1)
|
||||
*******************************************************************************
|
||||
-64 bit ABIs:
|
||||
+64 bit ABIv1s:
|
||||
r3 r4 r5 r6/f1
|
||||
void ATL_USERMM(const int M, const int N, const int K, const TYPE alpha,
|
||||
r7 r8 r9 r10
|
||||
const TYPE *A, const int lda, const TYPE *B, const int ldb,
|
||||
f2 120(r1) 128(r1)
|
||||
const TYPE beta, TYPE *C, const int ldc)
|
||||
+
|
||||
+64 bit ABIv2s:
|
||||
+ r3 r4 r5 r6/f1
|
||||
+void ATL_USERMM(const int M, const int N, const int K, const TYPE alpha,
|
||||
+ r7 r8 r9 r10
|
||||
+ const TYPE *A, const int lda, const TYPE *B, const int ldb,
|
||||
+ f2 104(r1) 112(r1)
|
||||
+ const TYPE beta, TYPE *C, const int ldc)
|
||||
#endif
|
||||
#ifdef ATL_AS_AIX_PPC
|
||||
.csect .text[PR]
|
||||
@@ -202,7 +210,7 @@ Mjoin(.,ATL_USERMM):
|
||||
.globl Mjoin(_,ATL_USERMM)
|
||||
Mjoin(_,ATL_USERMM):
|
||||
#else
|
||||
- #if defined(ATL_USE64BITS)
|
||||
+ #if defined(ATL_USE64BITS) && _CALL_ELF != 2
|
||||
/*
|
||||
* Official Program Descripter section, seg fault w/o it on Linux/PPC64
|
||||
*/
|
||||
@@ -257,9 +265,17 @@ ATL_USERMM:
|
||||
#endif
|
||||
#endif
|
||||
|
||||
+
|
||||
#if defined (ATL_USE64BITS)
|
||||
+#if _CALL_ELF == 2
|
||||
+/* ABIv2 */
|
||||
+ ld pC0, 104(r1)
|
||||
+ ld incCn, 112(r1)
|
||||
+#else
|
||||
+/* ABIv1 */
|
||||
ld pC0, 120(r1)
|
||||
ld incCn, 128(r1)
|
||||
+#endif
|
||||
#elif defined(ATL_AS_OSX_PPC) || defined(ATL_AS_AIX_PPC)
|
||||
lwz pC0, 68(r1)
|
||||
lwz incCn, 72(r1)
|
||||
Index: ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
|
||||
+++ ATLAS/tune/blas/gemm/CASES/ATL_smm4x4x128_av.c
|
||||
@@ -196,7 +196,7 @@ void ATL_USERMM(const int M, const int N
|
||||
.globl Mjoin(_,ATL_USERMM)
|
||||
Mjoin(_,ATL_USERMM):
|
||||
#else
|
||||
- #if defined(ATL_USE64BITS)
|
||||
+ #if defined(ATL_USE64BITS) && _CALL_ELF != 2
|
||||
/*
|
||||
* Official Program Descripter section, seg fault w/o it on Linux/PPC64
|
||||
*/
|
||||
@@ -221,8 +221,15 @@ ATL_USERMM:
|
||||
* kernel instead
|
||||
*/
|
||||
#if defined (ATL_USE64BITS)
|
||||
+#if _CALL_ELF == 2
|
||||
+/* ABIv2 */
|
||||
+ ld r10, 104(r1)
|
||||
+ ld r5, 112(r1)
|
||||
+#else
|
||||
+/* ABIv1 */
|
||||
ld r10, 120(r1)
|
||||
ld r5, 128(r1)
|
||||
+#endif
|
||||
#elif defined(ATL_AS_OSX_PPC)
|
||||
lwz r10, 60(r1)
|
||||
lwz r5, 64(r1)
|
||||
@@ -285,8 +292,15 @@ ATL_USERMM:
|
||||
eqv r0, r0, r0 /* all 1s */
|
||||
ATL_WriteVRSAVE(r0) /* signal we use all vector regs */
|
||||
#if defined (ATL_USE64BITS)
|
||||
+#if _CALL_ELF == 2
|
||||
+ /* ABIv2 */
|
||||
+ ld pC0, FSIZE+104(r1)
|
||||
+ ld ldc, FSIZE+112(r1)
|
||||
+#else
|
||||
+ /* ABIv1 */
|
||||
ld pC0, FSIZE+120(r1)
|
||||
ld ldc, FSIZE+128(r1)
|
||||
+#endif
|
||||
#elif defined(ATL_AS_OSX_PPC)
|
||||
lwz pC0, FSIZE+60(r1)
|
||||
lwz ldc, FSIZE+64(r1)
|
||||
@@ -4258,8 +4272,15 @@ UNALIGNED_C:
|
||||
eqv r0, r0, r0 /* all 1s */
|
||||
ATL_WriteVRSAVE(r0) /* signal we use all vector regs */
|
||||
#if defined (ATL_USE64BITS)
|
||||
+#if _CALL_ELF == 2
|
||||
+ /* ABIv2 */
|
||||
+ ld pC0, FSIZE+104(r1)
|
||||
+ ld ldc, FSIZE+112(r1)
|
||||
+#else
|
||||
+ /* ABIv1 */
|
||||
ld pC0, FSIZE+120(r1)
|
||||
ld ldc, FSIZE+128(r1)
|
||||
+#endif
|
||||
#elif defined(ATL_AS_OSX_PPC)
|
||||
lwz pC0, FSIZE+60(r1)
|
||||
lwz ldc, FSIZE+64(r1)
|
||||
Index: ATLAS/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
|
||||
+++ ATLAS/tune/blas/gemm/CASES/ATL_cmm4x4x128_av.c
|
||||
@@ -258,8 +258,15 @@ ATL_USERMM:
|
||||
eqv r0, r0, r0 /* all 1s */
|
||||
ATL_WriteVRSAVE(r0) /* signal we use all vector regs */
|
||||
#if defined (ATL_USE64BITS)
|
||||
+#if _CALL_ELF == 2
|
||||
+/* ABIv2 */
|
||||
+ ld pC0, FSIZE+104(r1)
|
||||
+ ld ldc, FSIZE+112(r1)
|
||||
+#else
|
||||
+/* ABIv1 */
|
||||
ld pC0, FSIZE+120(r1)
|
||||
ld ldc, FSIZE+128(r1)
|
||||
+#endif
|
||||
#elif defined(ATL_AS_OSX_PPC)
|
||||
lwz pC0, FSIZE+60(r1)
|
||||
lwz ldc, FSIZE+64(r1)
|
||||
Index: ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
|
||||
+++ ATLAS/tune/blas/gemm/CASES/ATL_dmm4x4x2pf_av.c
|
||||
@@ -405,8 +405,15 @@ Mjoin(_,ATL_USERMM):
|
||||
*/
|
||||
#ifdef ATL_GAS_LINUX_PPC
|
||||
#ifdef ATL_USE64BITS
|
||||
+ #if _CALL_ELF == 2
|
||||
+ /* ABIv2 */
|
||||
+ ld pC0, 104(r1)
|
||||
+ ld incCn, 112(r1)
|
||||
+ #else
|
||||
+ /* ABIv1 */
|
||||
ld pC0, 120(r1)
|
||||
ld incCn, 128(r1)
|
||||
+ #endif
|
||||
#else
|
||||
lwz incCn, FSIZE+8(r1)
|
||||
#endif
|
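Note on the ppc64le ABIv2 patch above: its description says the new offsets (120 => 104 and 128 => 112) were identified by gdb investigation. They also agree with a simple calculation, assuming the usual frame layouts: the 64-bit ELFv1 ABI has a 48-byte stack frame header before the parameter save area, ELFv2 has a 32-byte header, and every argument (including floating-point ones) occupies one 8-byte doubleword slot. In the ATL_USERMM argument list (M, N, K, alpha, A, lda, B, ldb, beta, C, ldc), C is argument 9 and ldc is argument 10, counting from zero. The helper below is a hypothetical sketch of that arithmetic, not code from ATLAS.

```c
#include <stddef.h>

/* Assumed frame-header sizes (bytes) and slot width for 64-bit PowerPC ELF. */
enum { ELFV1_HEADER = 48, ELFV2_HEADER = 32, DWORD = 8 };

/*
 * Offset from r1 (at function entry) of the parameter-save-area slot for
 * the argument at zero-based position `index`.
 */
static size_t param_offset(size_t header, size_t index)
{
    return header + index * DWORD;
}
```

With these assumptions, `param_offset(ELFV1_HEADER, 9)` and `param_offset(ELFV1_HEADER, 10)` give 120 and 128, and the ELFv2 equivalents give 104 and 112, matching the `ld pC0` / `ld incCn` operands in both branches of the `_CALL_ELF` conditional.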
151
atlas.3.10.2-ppc64le_do_not_use_files_with_lvx.patch
Normal file
@ -0,0 +1,151 @@
|
||||
From: Michel Normand <normand@linux.vnet.ibm.com>
|
||||
Subject: atlas.3.10.2 ppc64le do not use files with lvx
|
||||
Date: Tue, 12 Aug 2014 16:07:06 +0200
|
||||
|
||||
ppc64le do not use files with lvx
|
||||
This is a temporary patch as long as the related files
|
||||
are not ported yet to ppc64 little-endian.
|
||||
|
||||
Warning: patch to be applied only for ppc64le architecture
|
||||
and will also need atlas-new_archdef_for_ppc64le.patch
|
||||
|
||||
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
|
||||
---
|
||||
tune/blas/gemm/CASES/ccases.flg | 6 +-----
|
||||
tune/blas/gemm/CASES/dcases.flg | 8 +-------
|
||||
tune/blas/gemm/CASES/dcases.vnb | 4 ----
|
||||
tune/blas/gemm/CASES/scases.flg | 9 +--------
|
||||
tune/blas/gemm/CASES/scases.vnb | 3 ---
|
||||
tune/blas/gemm/CASES/zcases.flg | 8 +-------
|
||||
6 files changed, 4 insertions(+), 34 deletions(-)
|
||||
|
||||
Index: ATLAS/tune/blas/gemm/CASES/ccases.flg
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/ccases.flg
|
||||
+++ ATLAS/tune/blas/gemm/CASES/ccases.flg
|
||||
@@ -1,5 +1,5 @@
|
||||
<ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
|
||||
-24
|
||||
+22
|
||||
304 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O
|
||||
@@ -48,13 +48,9 @@ gcc
|
||||
328 480 8 8 2 1 1 8 8 2 ATL_mm8x8x2.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-fomit-frame-pointer -O2 -fno-tree-loop-optimize
|
||||
-329 192 4 4 4 1 16 4 4 4 ATL_cmm4x4x128_av.c "R. Clint Whaley" \
|
||||
-gcc
|
||||
--x assembler-with-cpp
|
||||
331 192 4 4 1 1 1 4 4 1 ATL_smm4x4xURx_mips.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-x assembler-with-cpp -mips4
|
||||
-332 192 8 2 4 1 0 8 2 4 ATL_smm8x2x4_av.c "IBM"
|
||||
333 448 4 4 2 1 1 4 4 2 ATL_smm4x4x2pf_arm.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-x assembler-with-cpp -mfpu=vfpv3
|
||||
Index: ATLAS/tune/blas/gemm/CASES/scases.flg
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/scases.flg
|
||||
+++ ATLAS/tune/blas/gemm/CASES/scases.flg
|
||||
@@ -1,5 +1,5 @@
|
||||
<ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
|
||||
-25
|
||||
+22
|
||||
304 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O
|
||||
@@ -48,16 +48,9 @@ gcc
|
||||
328 480 8 8 2 1 1 8 8 2 ATL_mm8x8x2.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-fomit-frame-pointer -O2 -fno-tree-loop-optimize
|
||||
-329 192 4 4 4 1 16 4 4 4 ATL_smm4x4x128_av.c "R. Clint Whaley" \
|
||||
-gcc
|
||||
--x assembler-with-cpp
|
||||
-330 200 92 92 92 1 16 92 92 92 ATL_smm4x4x128_av.c "R. Clint Whaley" \
|
||||
-gcc
|
||||
--x assembler-with-cpp
|
||||
331 192 4 4 1 1 1 4 4 1 ATL_smm4x4xURx_mips.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-x assembler-with-cpp -mips4
|
||||
-332 192 8 2 4 1 0 8 2 4 ATL_smm8x2x4_av.c "IBM"
|
||||
333 448 4 4 2 1 1 4 4 2 ATL_smm4x4x2pf_arm.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-x assembler-with-cpp -mfpu=vfpv3
|
||||
Index: ATLAS/tune/blas/gemm/CASES/scases.vnb
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/scases.vnb
|
||||
+++ ATLAS/tune/blas/gemm/CASES/scases.vnb
|
||||
@@ -31,9 +31,6 @@
|
||||
# Defaults: TA='t', TB='n', SSE=0, X87=0, LDBOT=1, RTKU=0, AOUTER=0,
|
||||
# KBMAX=KU, KBMIN=KU, BETAN1=0, RTMN=1
|
||||
#
|
||||
-ID=1 ROUT='ATL_smm4x4x128_av.c' AUTH='R. Clint Whaley' MU=4 NU=4 KU=4 \
|
||||
- LDKB=1 LDBOT=1 KBMIN=4 KBMAX=128 ASM=GAS_PPC \
|
||||
- COMP='gcc' FLAGS='-x assembler-with-cpp'
|
||||
ID=2 ROUT='ATL_smm4x4x16_av.c' AUTH='R. Clint Whaley' MU=4 NU=4 KU=16 \
|
||||
LDKB=1 LDBOT=0 KBMIN=16 KBMAX=2048 ASM=GAS_SPARC \
|
||||
COMP='gcc' FLAGS='-x assembler-with-cpp'
|
||||
Index: ATLAS/tune/blas/gemm/CASES/dcases.flg
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/dcases.flg
|
||||
+++ ATLAS/tune/blas/gemm/CASES/dcases.flg
|
||||
@@ -1,5 +1,5 @@
|
||||
<ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
|
||||
-32
|
||||
+30
|
||||
306 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O -fno-schedule-insns -fno-schedule-insns2
|
||||
@@ -79,12 +79,6 @@ gcc
|
||||
336 192 4 4 1 1 1 4 4 1 ATL_dmm4x4xURx_mips.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-x assembler-with-cpp -mips4
|
||||
-337 192 4 4 1 1 16 4 4 1 ATL_dmm4x4x80_ppc.c "Whaley & Castaldo" \
|
||||
-gcc
|
||||
--x assembler-with-cpp
|
||||
-338 192 8 4 2 1 0 8 4 2 ATL_dmm8x4x2_vsx.c "IBM" \
|
||||
-gcc
|
||||
--O3 -mvsx
|
||||
339 448 4 4 2 1 1 4 4 2 ATL_dmm4x4x2pf_arm.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-x assembler-with-cpp -mfpu=vfpv3
|
||||
Index: ATLAS/tune/blas/gemm/CASES/dcases.vnb
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/dcases.vnb
|
||||
+++ ATLAS/tune/blas/gemm/CASES/dcases.vnb
|
||||
@@ -53,10 +53,6 @@ ID=6 ROUT='ATL_dmm4x1x90_x87.c' AUTH='R
|
||||
ID=7 ROUT='ATL_dmm8x1x120_sse2.c' AUTH='R. Clint Whaley' \
|
||||
MU=8 NU=1 KU=1 KBMAX=512 ASM=GAS_x8664 BETAN1=1 \
|
||||
COMP='gcc' FLAGS='-m64 -x assembler-with-cpp'
|
||||
-ID=70 ROUT='ATL_dmm4x4x80_ppc.c' AUTH='R. Clint Whaley' TA='T', TB='N' \
|
||||
- MU=4 NU=4 KU=1 KBMIN=1 KBMAX=80 ASM=GAS_PPC BETAN1=0 LDBOT=0 \
|
||||
- LDAB=0 LDISKB=1 RTN=1 RTM=1 RTK=0 \
|
||||
- COMP='gcc' FLAGS='-x assembler-with-cpp'
|
||||
ID=80 ROUT='ATL_dmm4x4x16r8_US.c' AUTH='R. Clint Whaley' TA='T', TB='N' \
|
||||
MU=4 NU=4 KU=24 KBMIN=24 KBMAX=512 ASM=GAS_SPARC BETAN1=0 \
|
||||
LDAB=0 RTK=1 RTN=1 RTM=1 LDBOT=0 LDISKB=1 LDAB=1 \
|
||||
Index: ATLAS/tune/blas/gemm/CASES/zcases.flg
|
||||
===================================================================
|
||||
--- ATLAS.orig/tune/blas/gemm/CASES/zcases.flg
|
||||
+++ ATLAS/tune/blas/gemm/CASES/zcases.flg
|
||||
@@ -1,5 +1,5 @@
|
||||
<ID> <flag> <mb> <nb> <kb> <muladd> <lat> <mu> <nu> <ku> <rout> "<Contributer>"
|
||||
-31
|
||||
+29
|
||||
306 192 4 3 8 0 4 4 3 8 ATL_mm4x3x8p.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-mcpu=ultrasparc -mtune=ultrasparc -fomit-frame-pointer -O -fno-schedule-insns -fno-schedule-insns2
|
||||
@@ -76,12 +76,6 @@ gcc
|
||||
336 192 4 4 1 1 1 4 4 1 ATL_dmm4x4xURx_mips.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-x assembler-with-cpp -mips4
|
||||
-337 192 4 4 1 1 16 4 4 1 ATL_dmm4x4x80_ppc.c "Whaley & Castaldo" \
|
||||
-gcc
|
||||
--x assembler-with-cpp
|
||||
-338 192 8 4 2 1 0 8 4 2 ATL_dmm8x4x2_vsx.c "IBM" \
|
||||
-gcc
|
||||
--O3 -mvsx
|
||||
339 448 4 4 2 1 1 4 4 2 ATL_dmm4x4x2pf_arm.c "R. Clint Whaley" \
|
||||
gcc
|
||||
-x assembler-with-cpp -mfpu=vfpv3
|
1119
atlas.spec
File diff suppressed because it is too large
38
getdoublearr.stripwhite.patch
Normal file
@ -0,0 +1,38 @@
diff -up ATLAS/include/atlas_genparse.h.than ATLAS/include/atlas_genparse.h
--- ATLAS/include/atlas_genparse.h.than 2015-11-26 10:53:55.056586198 -0500
+++ ATLAS/include/atlas_genparse.h 2015-11-26 10:56:00.168537914 -0500
@@ -149,13 +149,24 @@ static int asmNames2bitfield(char *str)
 }

 /* procedure 7 */
-static int GetDoubleArr(char *str, int N, double *d)
+static int GetDoubleArr(char *callerstr, int N, double *d)
 /*
 * Reads in a list with form "%le,%le...,%le"; N-length d recieves doubles.
 * RETURNS: the number of doubles found, or N, whichever is less
 */
 {
- int i=1;
+ int i;
+ char *dupstr = DupString(callerstr);
+ char *str = dupstr;
+ /* strip the string to end on first white space */
+ for (i=0; dupstr[i]; i++)
+ {
+ if (isspace(dupstr[i])) {
+ dupstr[i] = '\0';
+ break;
+ }
+ }
+ i = 1;
 assert(sscanf(str, "%le", d) == 1);
 while (i < N)
 {
@@ -167,6 +178,7 @@ static int GetDoubleArr(char *str, int N
 break;
 i++;
 }
+ free(dupstr);
 return(i);
 }
|
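Note on the GetDoubleArr patch above: the change copies the caller's string, truncates the copy at the first whitespace character, parses the comma-separated doubles from the copy, and frees it before returning. A standalone sketch of the same idea follows. `parse_double_list` is a hypothetical name, and unlike the patched function it returns 0 instead of asserting when no number can be read, and it uses malloc/memcpy rather than ATLAS's DupString helper.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse up to n doubles from a list of the form "%le,%le,...,%le",
 * ignoring everything from the first whitespace character onward.
 * Returns the number of doubles stored in d.
 */
static int parse_double_list(const char *callerstr, int n, double *d)
{
    size_t len = strlen(callerstr);
    char *dup = malloc(len + 1);
    char *p;
    int i = 0;

    if (dup == NULL)
        return 0;
    memcpy(dup, callerstr, len + 1);

    /* Cut the private copy at the first whitespace; the caller's string is untouched. */
    for (p = dup; *p; p++) {
        if (isspace((unsigned char)*p)) {
            *p = '\0';
            break;
        }
    }

    p = dup;
    while (i < n && sscanf(p, "%le", &d[i]) == 1) {
        i++;
        p = strchr(p, ',');   /* advance to the next list element, if any */
        if (p == NULL)
            break;
        p++;
    }
    free(dup);
    return i;
}
```

Working on a copy is what lets the function terminate the string early without modifying the caller's buffer, which is also why the patch adds the matching free() before the return.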
26
initialize_malloc_memory.invtrsm.wms.oct23.patch
Normal file
@ -0,0 +1,26 @@
From: Michel Normand <normand@linux.vnet.ibm.com>
Subject: initialize malloc memory.invtrsm.wms.oct23
Date: Mon, 14 Apr 2014 17:18:53 +0200
References: http://sourceforge.net/p/math-atlas/mailman/message/32471499/

initialize malloc memory invtrsm.c


Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Signed-off-by: Michel Normand <normand@linux.vnet.ibm.com>
---
 ATLAS/tune/blas/level3/invtrsm.c | 1 +
 1 file changed, 1 insertion(+)

Index: ATLAS/tune/blas/level3/invtrsm.c
===================================================================
--- ATLAS.orig/tune/blas/level3/invtrsm.c
+++ ATLAS/tune/blas/level3/invtrsm.c
@@ -525,6 +525,7 @@ static double RunTiming
 a = A = malloc(i * ATL_MulBySize(incA));
 if (A)
 {
+ memset(A,0,i*ATL_MulBySize(incA)); /* wms (!!) malloc call above returns non-initialized memory. */
 if (Uplo == TestGE)
 for (i=0; i < k; i++)
 Mjoin(PATL,gegen)(N, N, A+i*incA, lda, N+lda);
|
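Note on the invtrsm.c patch above: malloc() returns memory with indeterminate contents, so the patch clears the buffer with memset() before the timing code fills parts of it. The pattern can be sketched in isolation as below; `alloc_zeroed` is a hypothetical helper, not an ATLAS function. The standard calloc() would be another way to get zeroed memory, though the patch itself keeps malloc() and adds memset().

```c
#include <stdlib.h>
#include <string.h>

/* Allocate nbytes and clear them; returns NULL if the allocation fails. */
static void *alloc_zeroed(size_t nbytes)
{
    void *p = malloc(nbytes);

    if (p != NULL)
        memset(p, 0, nbytes);   /* malloc does not initialize the block */
    return p;
}
```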
26
sources
@ -1,9 +1,17 @@
|
||||
1bb3abde499b492b4be1f1a0759fbfa2 atlas3.8.4.tar.bz2
|
||||
9ddf8c76e5e9781c542b712f704460e1 IBMz1032.tgz
|
||||
ee4cbc1f15cb4cd5f5266969a4bc62a7 IBMz1064.tgz
|
||||
edd3cb5602c6282e4a30691e728bd064 IBMz19632.tgz
|
||||
21f630520058859ad0b8b798bd17dc5a IBMz19664.tgz
|
||||
3f174cdcb4c964843f27dbfc4ad4b1c8 K7323DNow.tgz
|
||||
676548252837b1e458181111443f340f PPRO32.tgz
|
||||
ebb4732aff468bbc223e7f734252173b USII32.tgz
|
||||
31f8ae7583d290e5414a1a61ff6e7e39 USII64.tgz
|
||||
SHA512 (atlas3.10.3.tar.bz2) = bf17306f09f2aa973cb776e2c9eacfb2409ad4d95d19802e1c4e0597d0a099fccdb5eaafe273c2682a41e41a3c6fabc8bbba4ce03180cffea40ede5df1d1f56e
|
||||
SHA512 (IBMz1032.tgz) = f745187d75073de461d6948489dad3abea9a67ad10ec63e021160d3f61ae5be36e94768faa0e7e6e3158b1401bf954eae1e7e6416857b652415030836c6aba3d
|
||||
SHA512 (IBMz1064.tgz) = 14fbc584a8711a0292c8be0dce962bd7ac12347b2d203c2a7b0cc66ea68ac57d5b88afc6778df39efea43077fcc70c6c63db365b5b4badb879ab6900b5296094
|
||||
SHA512 (IBMz1264.tar.bz2) = 54bab951a818feb08fe5c671213db80d17bfefe75a3993d80655161219f018e87125c4ccc09c701cde45fd672a9856f4fff557ffc378c5b2fe7e9c6ebc3bd1de
|
||||
SHA512 (IBMz19632.tgz) = aa10213265866b3176efe1d9d204da170844573f7ab26a36551a174eab3951ccd5f54a5149f1351affc38c510162cce9e10eb2a830af32992cb3febe9e1ecafe
|
||||
SHA512 (IBMz19664.tgz) = 5837d5dfd04c31c304e1f454d0148bd412ec8853c50a7c3dcee9a61529bd04c30d68a0c7aae2bfa2c393fde3582fe36f98e6f5891b271b19562491298ba600b2
|
||||
SHA512 (IBMz932.tar.bz2) = 8f71140d1b30d00ed44faea71e42ab3ff24917a62670f7becdb0d861bf4e7c3c972f9601d161439a518dcc87405c74af31cdd4e2996999a5da8452cc8d2a52df
|
||||
SHA512 (IBMz964.tar.bz2) = b7356e5b47615c64c9b2dd6a497f071e39d4d90f6dd42478fea1d7597cf21ea08123c480fde002aba181a2ad0eeb21acb61469c7e4b2e8961e4d72e5f86e14cc
|
||||
SHA512 (PPRO32.tgz) = a30069e79f95a36b2c7125e7861218e9612bd92913db929ea98800201e7ca7d55c9a1480063c7d5a4c50fcb2b271907ce43cd9b229c694a5ee3b56561a7820e6
|
||||
SHA512 (USII32.tgz) = e9d3b1f5ccd38fc364666205e33f7a927e96c3cebc35d9692cafe3b536697224f20702641f875421b200ff78774831fd5790174ef55c899e0cdb905e3ac2371f
|
||||
SHA512 (USII64.tgz) = 5bd654f8b45306a18f3ad2b593ba23012909ba5ad91614de5024b80998bde832d0ddc84d2c0c0e75dd28915f3c07ec40ac9351213b24e54028fbad4d385ebcc6
|
||||
SHA512 (POWER332.tar.bz2) = 95a7281dbb7a2d0897a58599577afdedba66e6e5edb73223efdeecd93b6706031139b9b58b14345449dccbf1abfa8275bc261f826c692282d14dc30320728c75
|
||||
SHA512 (ARMv732NEON.tar.bz2) = 92acbdd8f7aebd841a10a13df85baa00c518dae388e1ee8dd4bc35fc461d732d2df2cfeae0a3614cea251b80a9ee6a5b49ad71ab8b36b98b70bda6d1c215c78d
|
||||
SHA512 (ARMa732.tar) = 47d6564b5a439bc3778ccc79242220b236c7dc8d36e12ce6850c7e9a02e2379618322c003ba4490573c40b78227c2c3925222da4f4e5f87aab48eae192b45bb9
|
||||
SHA512 (ARMa732.tar.bz2) = 8b83b59a32f18d2cd432c205efd4358b0000ce1685799f2f38a60532bc925e9cd871371d2dfd226ab8e30e830bf608f022d63bcd26f26f9fe74acab067bd4d4f
|
||||
SHA512 (POWER864LEVSXp4.tar.bz2) = e2fa637061a4a4806bc091009c37ccd719c4c4051baf36ed451917e255375881fa168caa5ca296ae9c89bb28523d9015fda42a5dbc51aef4c66efbf6efd966d2
|
||||
SHA512 (K7323DNow.tgz) = e1d5e4208ce454b5f5daa68663d2dd28a2bd3cc97496e4e1515df880b9ccd00bcc75bd820402c3b2bf8409f98500e43f2481fbf5dd480f7d0ba60fe2f82a1ac1
|
||||
|