diff --git a/.gitignore b/.gitignore
index a5036d0..4c1feaa 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,5 +4,5 @@ Reference.pdf
 Getting_Started.pdf
 /tbb30_20110419oss_src.tgz
 /Design_Patterns.pdf
-/CHANGES.txt
 tbb*/
+/tbb40_20110809oss_src.tgz
diff --git a/CHANGES.txt b/CHANGES.txt
new file mode 100644
index 0000000..cb9c0c8
--- /dev/null
+++ b/CHANGES.txt
@@ -0,0 +1,1122 @@
+TBB 4.0 commercial-aligned release
+
+Changes (w.r.t. TBB 3.0 Update 8 commercial-aligned release):
+
+Improvements:
+- concurrent_priority_queue is now a supported feature. Capacity control methods were removed.
+- Flow graph is now a supported feature of TBB, rather than being a community preview feature.
+- A new memory backend has been implemented in the TBB allocator. It can return small memory blocks to the OS and thus later reuse that memory for large object allocation.
+- Improved partitioning algorithms for parallel_for and parallel_reduce to better handle load imbalance.
+- The convex_hull example has been refactored to produce reproducible performance results.
+- The Major Interface version has changed from 5 to 6. Deprecated interfaces might be removed in future releases.
+
+Community Preview Features:
+- Added Community Preview Feature: serial subset of TBB for modeling a sequential execution of a parallel algorithm. This release introduces serial parallel_for.
+- Added Community Preview Feature: or_node (accepts multiple inputs, forwarding each input separately to successors), split_node (accepts tuples, and forwards each element to a corresponding successor), and multioutput_function_node (accepts one input, and passes the input and a tuple of output ports to the function body to support outputs to multiple successors).
+- Added Community Preview Feature: Scalable Memory Pools (more control on memory source, grouping, collective deallocation).
+
+------------------------------------------------------------------------
+TBB 3.0 Update 8 commercial-aligned release
+
+Changes (w.r.t. TBB 3.0 Update 7 commercial-aligned release):
+
+- Task priorities have become an official feature of TBB,
+    not community preview as before.
+- Atomics API extended, and implementation refactored.
+- Added task::set_parent() method.
+- Added concurrent_unordered_set container.
+
+Open-source contributions integrated:
+
+- PowerPC support by Raf Schietekat.
+- Fix of potential task pool overrun and other improvements
+    in the task scheduler by Raf Schietekat.
+- Fix in parallel_for_each to work with std::set in Visual* C++ 2010.
+
+Community Preview Features:
+
+- Graph community preview feature was renamed to flow graph.
+    Multiple improvements in the implementation.
+    Binpack example of the feature was added.
+- A number of improvements to concurrent_priority_queue.
+    Shortpath example was added for the feature.
+- TBB runtime loader functionality was added (Windows*-only).
+    This allows exact versions of the TBB library to be selected at run time,
+    while also setting directories for the library search.
+- parallel_deterministic_reduce template function was added.
+
+------------------------------------------------------------------------
+TBB 3.0 Update 7 commercial-aligned release
+
+Changes (w.r.t. TBB 3.0 Update 6 commercial-aligned release):
+
+- Added implementation of the platform isolation layer based on
+    GCC atomic built-ins; it is supposed to work on any platform
+    where GCC has these built-ins.
+
+Community Preview Features:
+
+- Graph's dining_philosophers example added
+- A number of improvements to graph and concurrent_priority_queue
+
+
+------------------------------------------------------------------------
+TBB 3.0 Update 6 commercial-aligned release
+
+Changes (w.r.t. TBB 3.0 Update 5 commercial-aligned release):
+
+- Added Community Preview feature: task and task group priority, and
+    Fractal example demonstrating it.
+- parallel_pipeline optimized for data items of small and large sizes.
+- Graph's join_node is now parametrized with a tuple of up to 10 types.
+- Improved performance of concurrent_priority_queue.
+
+Open-source contributions integrated:
+
+- Initial NetBSD support by Aleksej Saushev.
+
+Bugs fixed:
+
+- Failure to locate Cilk runtime library to enable Cilk/TBB interop.
+- Data race that could result in concurrent_unordered_map structure
+    corruption after call to clear() method.
+- Crash caused by invoking Cilk/TBB interop after one of the libraries
+    is unloaded.
+- Stack corruption caused by PIC version of 64-bit CAS compiled by Intel
+    compiler on Linux.
+- Inconsistency of exception propagation mode possible when application
+    built with Microsoft* Visual Studio* 2008 or earlier uses TBB built
+    with Microsoft* Visual Studio* 2010.
+- Affinitizing master thread to a subset of available CPUs after TBB
+    scheduler was initialized tied all worker threads to the same CPUs.
+- Method is_stolen_task() always returned 'false' for affinitized tasks.
+- write_once_node and overwrite_node did not immediately send buffered
+    items to successors
+
+------------------------------------------------------------------------
+TBB 3.0 Update 5 commercial-aligned release
+
+Changes (w.r.t. TBB 3.0 Update 4 commercial-aligned release):
+
+- Added Community Preview feature: graph.
+- Added automatic propagation of master thread FPU settings to
+    TBB worker threads.
+- Added a public function to perform a sequentially consistent full
+    memory fence: tbb::atomic_fence() in tbb/atomic.h.
+
+Bugs fixed:
+
+- Data race that could result in scheduler data structures corruption
+    when using fire-and-forget tasks.
+- Potential referencing of destroyed concurrent_hash_map element after
+    using erase(accessor&A) method with A acquired as const_accessor.
+- Fixed a correctness bug in the convex hull example.
+
+Open-source contributions integrated:
+
+- Patch for calls to internal::atomic_do_once() by Andrey Semashev.
+
+------------------------------------------------------------------------
+TBB 3.0 Update 4 commercial-aligned release
+
+Changes (w.r.t. TBB 3.0 Update 3 commercial-aligned release):
+
+- Added Community Preview feature: concurrent_priority_queue.
+- Fixed library loading to avoid possibility for remote code execution,
+    see http://www.microsoft.com/technet/security/advisory/2269637.mspx.
+- Added support of more than 64 cores for appropriate Microsoft*
+    Windows* versions. For more details, see
+    http://msdn.microsoft.com/en-us/library/dd405503.aspx.
+- Default number of worker threads is adjusted in accordance with
+    process affinity mask.
+
+Bugs fixed:
+
+- Calls of scalable_* functions from inside the allocator library
+    caused issues if the functions were overridden by another module.
+- A crash occurred if methods run() and wait() were called concurrently
+    for an empty tbb::task_group (1736).
+- The tachyon example exhibited build problems associated with
+    bug 554339 on Microsoft* Visual Studio* 2010. Project files were
+    modified as a partial workaround to overcome the problem. See
+    http://connect.microsoft.com/VisualStudio/feedback/details/554339.
+
+------------------------------------------------------------------------
+TBB 3.0 Update 3 commercial-aligned release
+
+Changes (w.r.t. TBB 3.0 Update 2 commercial-aligned release):
+
+- cache_aligned_allocator class reworked to use scalable_aligned_malloc.
+- Improved performance of count() and equal_range() methods
+    in concurrent_unordered_map.
+- Improved implementation of 64-bit atomic loads and stores on 32-bit
+    platforms, including compilation with VC 7.1.
+- Added implementation of atomic operations on top of OSAtomic API
+    provided by Mac OS* X.
+- Removed gratuitous try/catch blocks surrounding thread function calls
+    in tbb_thread.
+- Xcode* projects were added for sudoku and game_of_life examples.
+- Xcode* projects were updated to work without TBB framework.
+
+Bugs fixed:
+
+- Fixed a data race in task scheduler destruction that on rare occasion
+    could result in memory corruption.
+- Fixed idle spinning in thread bound filters in tbb::pipeline (1670).
+
+Open-source contributions integrated:
+
+- MinGW-64 basic support by brsomoza (partially).
+- Patch for atomic.h by Andrey Semashev.
+- Support for AIX & GCC on PowerPC by Giannis Papadopoulos.
+- Various improvements by Raf Schietekat.
+
+------------------------------------------------------------------------
+TBB 3.0 Update 2 commercial-aligned release
+
+Changes (w.r.t. TBB 3.0 Update 1 commercial-aligned release):
+
+- Destructor of tbb::task_group class throws missing_wait exception
+    if there are tasks running when it is invoked.
+- Cilk-TBB interop layer added to protect TBB TLS in case of
+    "Cilk-TBB-Cilk nesting" usage model.
+- Compilation fix for dependent template names in concurrent_queue.
+- Memory allocator code refactored to ease development and maintenance.
+
+Bugs fixed:
+
+- Improved interoperability with other Intel software tools on Linux in
+    case of dynamic replacement of memory allocator (1700)
+- Fixed install issues that prevented installation on
+    Mac OS* X 10.6.4 (1711).
+
+------------------------------------------------------------------------
+TBB 3.0 Update 1 commercial-aligned release
+
+Changes (w.r.t. TBB 3.0 commercial-aligned release):
+
+- Decreased memory fragmentation by allocations bigger than 8K.
+- Lazily allocate worker threads, to avoid creating unnecessary stacks.
+
+Bugs fixed:
+
+- TBB allocator used much more memory than malloc (1703) - see above.
+- Deadlocks happened in some specific initialization scenarios
+    of the TBB allocator (1701, 1704).
+- Regression in enumerable_thread_specific: excessive requirements
+    for object constructors.
+- A bug in construction of parallel_pipeline filters when body instance
+    was a temporary object.
+- Incorrect usage of memory fences on PowerPC and XBOX360 platforms.
+- A subtle issue in task group context binding that could result
+    in cancelation signal being missed by nested task groups.
+- Incorrect construction of concurrent_unordered_map if specified
+    number of buckets is not power of two.
+- Broken count() and equal_range() of concurrent_unordered_map.
+- Return type of postfix form of operator++ for hash map's iterators.
+
+------------------------------------------------------------------------
+TBB 3.0 commercial-aligned release
+
+Changes (w.r.t. TBB 2.2 Update 3 commercial-aligned release):
+
+- All open-source-release changes down to TBB 2.2 U3 below
+    were incorporated into this release.
+
+------------------------------------------------------------------------
+20100406 open-source release
+
+Changes (w.r.t. 20100310 open-source release):
+
+- Added support for Microsoft* Visual Studio* 2010, including binaries.
+- Added a PDF file with recommended Design Patterns for TBB.
+- Added parallel_pipeline function and companion classes and functions
+    that provide a strongly typed lambda-friendly pipeline interface.
+- Reworked enumerable_thread_specific to use a custom implementation of
+    hash map that is more efficient for ETS usage models.
+- Added example for class task_group; see examples/task_group/sudoku.
+- Removed two examples, as they were long outdated and superseded:
+    pipeline/text_filter (use pipeline/square);
+    parallel_while/parallel_preorder (use parallel_do/parallel_preorder).
+- PDF documentation updated.
+- Other fixes and changes in code, tests, and examples.
+
+Bugs fixed:
+
+- Eliminated build errors with MinGW32.
+- Fixed post-build step and other issues in VS projects for examples.
+- Fixed discrepancy between scalable_realloc and scalable_msize that
+    caused crashes with malloc replacement on Windows.
+
+------------------------------------------------------------------------
+20100310 open-source release
+
+Changes (w.r.t. TBB 2.2 Update 3 commercial-aligned release):
+
+- Version macros changed in anticipation of a future release.
+- Directory structure aligned with Intel(R) C++ Compiler;
+    now TBB binaries reside in //[bin|lib]
+    (in TBB 2.x, it was [bin|lib]//).
+- Visual Studio projects changed for examples: instead of separate set
+    of files for each VS version, now there is single 'msvs' directory
+    that contains workspaces for MS C++ compiler (_cl.sln) and
+    Intel C++ compiler (_icl.sln). Works with VS 2005 and above.
+- The name versioning scheme for backward compatibility was improved;
+    now compatibility-breaking changes are done in a separate namespace.
+- Added concurrent_unordered_map implementation based on a prototype
+    developed in Microsoft for a future version of PPL.
+- Added PPL-compatible writer-preference RW lock (reader_writer_lock).
+- Added TBB_IMPLEMENT_CPP0X macro to control injection of C++0x names
+    implemented in TBB into namespace std.
+- Added almost-C++0x-compatible std::condition_variable, plus a bunch
+    of other C++0x classes required by condition_variable.
+- With TBB_IMPLEMENT_CPP0X, tbb_thread can be also used as std::thread.
+- task.cpp was split into several translation units to structure
+    TBB scheduler sources layout. Static data layout and library
+    initialization logic were also updated.
+- TBB scheduler reworked to prevent master threads from stealing
+    work belonging to other masters.
+- Class task was extended with enqueue() method, and slightly changed
+    semantics of methods spawn() and destroy(). For exact semantics,
+    refer to TBB Reference manual.
+- task_group_context now allows for destruction by non-owner threads.
+- Added TBB_USE_EXCEPTIONS macro to control use of exceptions in TBB
+    headers. It turns off (i.e. sets to 0) automatically if specified
+    compiler options disable exception handling.
+- TBB is enabled to run on top of Microsoft's Concurrency Runtime
+    on Windows* 7 (via our worker dispatcher known as RML).
+- Removed old unused busy-waiting code in concurrent_queue.
+- Described the advanced build & test options in src/index.html.
+- Warning level for GCC raised with -Wextra and a few other options.
+- Multiple fixes and improvements in code, tests, examples, and docs.
+
+Open-source contributions integrated:
+
+- Xbox support by Roman Lut (Deep Shadows), though further changes are
+    required to make it work; e.g. post-2.1 entry points are missing.
+- "Eventcount" by Dmitry Vyukov evolved into concurrent_monitor,
+    an internal class used in the implementation of concurrent_queue.
+
+------------------------------------------------------------------------
+TBB 2.2 Update 3 commercial-aligned release
+
+Changes (w.r.t. TBB 2.2 Update 2 commercial-aligned release):
+
+- PDF documentation updated.
+
+Bugs fixed:
+
+- concurrent_hash_map compatibility issue exposed on Linux in case
+    two versions of the container were used by different modules.
+- enforce 16 byte stack alignment for consistency with GCC; required
+    to work correctly with 128-bit variables processed by SSE.
+- construct() methods of allocator classes now use global operator new.
+
+------------------------------------------------------------------------
+TBB 2.2 Update 2 commercial-aligned release
+
+Changes (w.r.t. TBB 2.2 Update 1 commercial-aligned release):
+
+- parallel_invoke and parallel_for_each now take function objects
+    by const reference, not by value.
+- Building TBB with /MT is supported, to avoid dependency on particular
+    versions of Visual C++* runtime DLLs. TBB DLLs built with /MT
+    are located in vc_mt directory.
+- Class critical_section introduced.
+- Improvements in exception support: new exception classes introduced,
+    all exceptions are thrown via an out-of-line internal method.
+- Improvements and fixes in the TBB allocator and malloc replacement,
+    including robust memory identification, and more reliable dynamic
+    function substitution on Windows*.
+- Method swap() added to class tbb_thread.
+- Methods rehash() and bucket_count() added to concurrent_hash_map.
+- Added support for Visual Studio* 2010 Beta2. No special binaries
+    provided, but CRT-independent DLLs (vc_mt) should work.
+- Other fixes and improvements in code, tests, examples, and docs.
+
+Open-source contributions integrated:
+
+- The fix to build 32-bit TBB on Mac OS* X 10.6.
+- GCC-based port for SPARC Solaris by Michailo Matijkiw, with use of
+    earlier work by Raf Schietekat.
+
+Bugs fixed:
+
+- 159 - TBB build for PowerPC* running Mac OS* X.
+- 160 - IBM* Java segfault if used with TBB allocator.
+- crash in concurrent_queue (1616).
+
+------------------------------------------------------------------------
+TBB 2.2 Update 1 commercial-aligned release
+
+Changes (w.r.t. TBB 2.2 commercial-aligned release):
+
+- Incorporates all changes from open-source releases below.
+- Documentation was updated.
+- TBB scheduler auto-initialization now covers all possible use cases.
+- concurrent_queue: made argument types of sizeof used in paddings
+    consistent with those actually used.
+- Memory allocator was improved: supported corner case of user's malloc
+    calling scalable_malloc (non-Windows), corrected processing of
+    memory allocation requests during tbb memory allocator startup
+    (Linux).
+- Windows malloc replacement has got better support for static objects.
+- In pipeline setups that do not allow actual parallelism, execution
+    by a single thread is guaranteed, idle spinning eliminated, and
+    performance improved.
+- RML refactoring and clean-up.
+- New constructor for concurrent_hash_map allows reserving space for
+    a number of items.
+- Operator delete() added to the TBB exception classes.
+- Lambda support was improved in parallel_reduce.
+- gcc 4.3 warnings were fixed for concurrent_queue.
+- Fixed possible initialization deadlock in modules using TBB entities
+    during construction of global static objects.
+- Copy constructor in concurrent_hash_map was fixed.
+- Fixed a couple of rare crashes in the scheduler possible before
+    in very specific use cases.
+- Fixed a rare crash in the TBB allocator running out of memory.
+- New tests were implemented, including test_lambda.cpp that checks
+    support for lambda expressions.
+- A few other small changes in code, tests, and documentation.
+
+------------------------------------------------------------------------
+20090809 open-source release
+
+Changes (w.r.t. TBB 2.2 commercial-aligned release):
+
+- Fixed known exception safety issues in concurrent_vector.
+- Better concurrency of simultaneous grow requests in concurrent_vector.
+- TBB allocator further improves performance of large object allocation.
+- Problem with source of text relocations was fixed on Linux
+- Fixed bugs related to malloc replacement under Windows
+- A few other small changes in code and documentation.
+
+------------------------------------------------------------------------
+TBB 2.2 commercial-aligned release
+
+Changes (w.r.t. TBB 2.1 U4 commercial-aligned release):
+
+- Incorporates all changes from open-source releases below.
+- Architecture folders renamed from em64t to intel64 and from itanium
+    to ia64.
+- Major Interface version changed from 3 to 4. Deprecated interfaces
+    might be removed in future releases.
+- Parallel algorithms that use partitioners have switched to use
+    the auto_partitioner by default.
+- Improved memory allocator performance for allocations bigger than 8K.
+- Added new thread-bound filters functionality for pipeline.
+- New implementation of concurrent_hash_map that improves performance
+    significantly.
+- A few other small changes in code and documentation.
+
+------------------------------------------------------------------------
+20090511 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Basic support for MinGW32 development kit.
+- Added tbb::zero_allocator class that initializes memory with zeros.
+    It can be used as an adaptor to any STL-compatible allocator class.
+- Added tbb::parallel_for_each template function as alias to parallel_do.
+- Added more overloads for tbb::parallel_for.
+- Added support for exact exception propagation (can only be used with
+    compilers that support C++0x std::exception_ptr).
+- tbb::atomic template class can be used with enumerations.
+- mutex, recursive_mutex, spin_mutex, spin_rw_mutex classes extended
+    with explicit lock/unlock methods.
+- Fixed size() and grow_to_at_least() methods of tbb::concurrent_vector
+    to provide space allocation guarantees. More methods added for
+    compatibility with std::vector, including some from C++0x.
+- Preview of a lambda-friendly interface for low-level use of tasks.
+- scalable_msize function added to the scalable allocator (Windows only).
+- Rationalized internal auxiliary functions for spin-waiting and backoff.
+- Several tests underwent decent refactoring.
+
+Changes affecting backward compatibility:
+
+- Improvements in concurrent_queue, including limited API changes.
+    The previous version is deprecated; its functionality is accessible
+    via methods of the new tbb::concurrent_bounded_queue class.
+- grow* and push_back methods of concurrent_vector changed to return
+    iterators; old semantics is deprecated.
+
+------------------------------------------------------------------------
+TBB 2.1 Update 4 commercial-aligned release
+
+Changes (w.r.t. TBB 2.1 U3 commercial-aligned release):
+
+- Added tests for aligned memory allocations and malloc replacement.
+- Several improvements for better bundling with Intel(R) C++ Compiler.
+- A few other small changes in code and documentation.
+
+Bugs fixed:
+
+- 150 - request to build TBB examples with debug info in release mode.
+- backward compatibility issue with concurrent_queue on Windows.
+- dependency on VS 2005 SP1 runtime libraries removed.
+- compilation of GUI examples under Xcode* 3.1 (1577).
+- On Windows, TBB allocator classes can be instantiated with const types
+    for compatibility with MS implementation of STL containers (1566).
+
+------------------------------------------------------------------------
+20090313 open-source release
+
+Changes (w.r.t. 20081109 open-source release):
+
+- Includes all changes introduced in TBB 2.1 Update 2 & Update 3
+    commercial-aligned releases (see below for details).
+- Added tbb::parallel_invoke template function. It runs up to 10
+    user-defined functions in parallel and waits for them to complete.
+- Added a special library providing ability to replace the standard
+    memory allocation routines in Microsoft* C/C++ RTL (malloc/free,
+    global new/delete, etc.) with the TBB memory allocator.
+    Usage details are described in include/tbb/tbbmalloc_proxy.h file.
+- Task scheduler switched to use new implementation of its core
+    functionality (deque based task pool, new structure of arena slots).
+- Preview of Microsoft* Visual Studio* 2005 project files for
+    building the library is available in build/vsproject folder.
+- Added tests for aligned memory allocations and malloc replacement.
+- Added parallel_for/game_of_life.net example (for Windows only)
+    showing TBB usage in a .NET application.
+- A number of other fixes and improvements to code, tests, makefiles,
+    examples and documents.
+
+Bugs fixed:
+
+- The same list as in TBB 2.1 Update 4 right above.
+
+------------------------------------------------------------------------
+TBB 2.1 Update 3 commercial-aligned release
+
+Changes (w.r.t. TBB 2.1 U2 commercial-aligned release):
+
+- Added support for aligned allocations to the TBB memory allocator.
+- Added a special library to use with LD_PRELOAD on Linux* in order to
+    replace the standard memory allocation routines in C/C++ with the
+    TBB memory allocator.
+- Added null_mutex and null_rw_mutex: no-op classes interface-compliant
+    to other TBB mutexes.
+- Improved performance of parallel_sort, to close most of the serial gap
+    with std::sort, and beat it on 2 and more cores.
+- A few other small changes.
+
+Bugs fixed:
+
+- the problem where parallel_for hung after exception throw
+    if affinity_partitioner was used (1556).
+- get rid of VS warnings about mbstowcs deprecation (1560),
+    as well as some other warnings.
+- operator== for concurrent_vector::iterator fixed to work correctly
+    with different vector instances.
+
+------------------------------------------------------------------------
+TBB 2.1 Update 2 commercial-aligned release
+
+Changes (w.r.t. TBB 2.1 U1 commercial-aligned release):
+
+- Incorporates all open-source-release changes down to TBB 2.1 U1,
+    except for:
+    - 20081019 addition of enumerable_thread_specific;
+- Warning level for Microsoft* Visual C++* compiler raised to /W4 /Wp64;
+    warnings found on this level were cleaned or suppressed.
+- Added TBB_runtime_interface_version API function.
+- Added new example: pipeline/square.
+- Added exception handling and cancellation support
+    for parallel_do and pipeline.
+- Added copy constructor and [begin,end) constructor to concurrent_queue.
+- Added some support for beta version of Intel(R) Parallel Amplifier.
+- Added scripts to set environment for cross-compilation of 32-bit
+    applications on 64-bit Linux with Intel(R) C++ Compiler.
+- Fixed semantics of concurrent_vector::clear() to not deallocate
+    internal arrays. Fixed compact() to perform such deallocation later.
+- Fixed the issue with atomic<T> when T is incomplete type.
+- Improved support for PowerPC* Macintosh*, including the fix
+    for a bug in masked compare-and-swap reported by a customer.
+- As usual, a number of other improvements everywhere.
+
+------------------------------------------------------------------------
+20081109 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Added new serial out of order filter for tbb::pipeline.
+- Fixed the issue with atomic::operator= reported at the forum.
+- Fixed the issue with using tbb::task::self() in task destructor
+    reported at the forum.
+- A number of other improvements to code, tests, makefiles, examples
+    and documents.
+
+Open-source contributions integrated:
+- Changes in the memory allocator were partially integrated.
+
+------------------------------------------------------------------------
+20081019 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Introduced enumerable_thread_specific. This new class provides a
+    wrapper around native thread local storage as well as iterators and
+    ranges for accessing the thread local copies (1533).
+- Improved support for Intel(R) Threading Analysis Tools
+    on Intel(R) 64 architecture.
+- Dependency from Microsoft* CRT was integrated to the libraries using
+    manifests, to avoid issues if called from code that uses different
+    version of Visual C++* runtime than the library.
+- Introduced new defines TBB_USE_ASSERT, TBB_USE_DEBUG,
+    TBB_USE_PERFORMANCE_WARNINGS, TBB_USE_THREADING_TOOLS.
+- A number of other improvements to code, tests, makefiles, examples
+    and documents.
+
+Open-source contributions integrated:
+
+- linker optimization: /incremental:no .
+
+------------------------------------------------------------------------
+20080925 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Same fix for a memory leak in the memory allocator as in TBB 2.1 U1.
+- Improved support for lambda functions.
+- Fixed more concurrent_queue issues reported at the forum.
+- A number of other improvements to code, tests, makefiles, examples
+    and documents.
+
+------------------------------------------------------------------------
+TBB 2.1 Update 1 commercial-aligned release
+
+Changes (w.r.t. TBB 2.1 commercial-aligned release):
+
+- Fixed small memory leak in the memory allocator.
+- Incorporates all open-source-release changes since TBB 2.1, except for:
+    - 20080825 changes for parallel_do;
+
+------------------------------------------------------------------------
+20080825 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Added exception handling and cancellation support for parallel_do.
+- Added default HashCompare template argument for concurrent_hash_map.
+- Fixed concurrent_queue.clear() issues due to incorrect assumption
+    about clear() being private method.
+- Added the possibility to use TBB in applications that change
+    default calling conventions (Windows* only).
+- Many improvements to code, tests, examples, makefiles and documents.
+
+Bugs fixed:
+
+- 120, 130 - memset declaration missed in concurrent_hash_map.h
+
+------------------------------------------------------------------------
+20080724 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Inline assembly for atomic operations improved for gcc 4.3
+- A few more improvements to the code.
+
+------------------------------------------------------------------------
+20080709 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- operator=() was added to the tbb_thread class according to
+    the current working draft for std::thread.
+- Recognizing SPARC* in makefiles for Linux* and Sun Solaris*.
+
+Bugs fixed:
+
+- 127 - concurrent_hash_map::range fixed to split correctly.
+
+Open-source contributions integrated:
+
+- fix_set_midpoint.diff by jyasskin
+- SPARC* support in makefiles by Raf Schietekat
+
+------------------------------------------------------------------------
+20080622 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Fixed a hang that rarely happened on Linux
+    during deinitialization of the TBB scheduler.
+- Improved support for Intel(R) Thread Checker.
+- A few more improvements to the code.
+
+------------------------------------------------------------------------
+TBB 2.1 commercial-aligned release
+
+Changes (w.r.t. TBB 2.0 U3 commercial-aligned release):
+
+- All open-source-release changes down to, and including, TBB 2.0 below,
+    were incorporated into this release.
+
+------------------------------------------------------------------------
+20080605 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Explicit control of exported symbols by version scripts added on Linux.
+- Interfaces polished for exception handling & algorithm cancellation.
+- Cache behavior improvements in the scalable allocator.
+- Improvements in text_filter, polygon_overlay, and other examples.
+- A lot of other stability improvements in code, tests, and makefiles.
+- First release where binary packages include headers/docs/examples, so
+    binary packages are now self-sufficient for using TBB.
+
+Open-source contributions integrated:
+
+- atomics patch (partially).
+- tick_count warning patch.
+
+Bugs fixed:
+
+- 118 - fix for boost compatibility.
+- 123 - fix for tbb_machine.h.
+
+------------------------------------------------------------------------
+20080512 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Fixed a problem with backward binary compatibility
+    of debug Linux builds.
+- Sun* Studio* support added.
+- soname support added on Linux via linker script. To restore backward
+    binary compatibility, *.so -> *.so.2 softlinks should be created.
+- concurrent_hash_map improvements - added a few new forms of insert()
+    method and fixed precondition and guarantees of erase() methods.
+    Added runtime warning reporting about bad hash function used for
+    the container. Various improvements for performance and concurrency.
+- Cancellation mechanism reworked so that it does not hurt scalability.
+- Algorithm parallel_do reworked. Requirement for Body::argument_type
+    definition removed, and work item argument type can be arbitrarily
+    cv-qualified.
+- polygon_overlay example added.
+- A few more improvements to code, tests, examples and Makefiles.
+
+Open-source contributions integrated:
+
+- Soname support patch for Bugzilla #112.
+
+Bugs fixed:
+
+- 112 - fix for soname support.
+
+------------------------------------------------------------------------
+TBB 2.0 U3 commercial-aligned release (package 017, April 20, 2008)
+
+Corresponds to commercial 019 (for Linux*, 020; for Mac OS* X, 018)
+packages.
+
+Changes (w.r.t. TBB 2.0 U2 commercial-aligned release):
+
+- Does not contain open-source-release changes below; this release is
+    only a minor update of TBB 2.0 U2.
+- Removed spin-waiting in pipeline and concurrent_queue.
+- A few more small bug fixes from open-source releases below.
+
+------------------------------------------------------------------------
+20080408 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- count_strings example reworked: new word generator implemented, hash
+    function replaced, and tbb_allocator is used with std::string class.
+- Static methods of spin_rw_mutex were replaced by normal member
+    functions, and the class name was versioned.
+- tacheon example was renamed to tachyon.
+- Improved support for Intel(R) Thread Checker.
+- A few more minor improvements.
+
+Open-source contributions integrated:
+
+- Two sets of Sun patches for IA Solaris support.
+
+------------------------------------------------------------------------
+20080402 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Exception handling and cancellation support for tasks and algorithms
+  fully enabled.
+- Exception safety guaranties defined and fixed for all concurrent
+  containers.
+- User-defined memory allocator support added to all concurrent
+  containers.
+- Performance improvement of concurrent_hash_map, spin_rw_mutex.
+- Critical fix for a rare race condition during scheduler
+  initialization/de-initialization.
+- New methods added for concurrent containers to be closer to STL,
+  as well as automatic filters removal from pipeline
+  and __TBB_AtomicAND function.
+- The volatile keyword dropped from where it is not really needed.
+- A few more minor improvements.
+
+------------------------------------------------------------------------
+20080319 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Support for gcc version 4.3 was added.
+- tbb_thread class, near compatible with std::thread expected in C++0x,
+  was added.
+
+Bugs fixed:
+
+- 116 - fix for compilation issues with gcc version 4.2.1.
+- 120 - fix for compilation issues with gcc version 4.3.
+
+------------------------------------------------------------------------
+20080311 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- An enumerator added for pipeline filter types (serial vs. parallel).
+- New task_scheduler_observer class introduced, to observe when
+  threads start and finish interacting with the TBB task scheduler.
+- task_scheduler_init reverted to not use internal versioned class;
+  binary compatibility guaranteed with stable releases only.
+- Various improvements to code, tests, examples and Makefiles.
+
+------------------------------------------------------------------------
+20080304 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Task-to-thread affinity support, previously kept under a macro,
+  now fully legalized.
+- Work-in-progress on cache_aligned_allocator improvements.
+- Pipeline really supports parallel input stage; it's no more serialized.
+- Various improvements to code, tests, examples and Makefiles.
+
+Bugs fixed:
+
+- 119 - fix for scalable_malloc sometimes failing to return a big block.
+- TR575 - fixed a deadlock occurring on Windows in startup/shutdown
+  under some conditions.
+
+------------------------------------------------------------------------
+20080226 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Introduced tbb_allocator to select between standard allocator and
+  tbb::scalable_allocator when available.
+- Removed spin-waiting in pipeline and concurrent_queue.
+- Improved performance of concurrent_hash_map by using tbb_allocator.
+- Improved support for Intel(R) Thread Checker.
+- Various improvements to code, tests, examples and Makefiles.
+
+------------------------------------------------------------------------
+TBB 2.0 U2 commercial-aligned release (package 017, February 14, 2008)
+
+Corresponds to commercial 017 (for Linux*, 018; for Mac OS* X, 016)
+packages.
+
+Changes (w.r.t. TBB 2.0 U1 commercial-aligned release):
+
+- Does not contain open-source-release changes below; this release is
+  only a minor update of TBB 2.0 U1.
+- Add support for Microsoft* Visual Studio* 2008, including binary
+  libraries and VS2008 projects for examples.
+- Use SwitchToThread() not Sleep() to yield threads on Windows*.
+- Enhancements to Doxygen-readable comments in source code.
+- A few more small bug fixes from open-source releases below.
+
+Bugs fixed:
+
+- TR569 - Memory leak in concurrent_queue.
+
+------------------------------------------------------------------------
+20080207 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Improvements and minor fixes in VS2008 projects for examples.
+- Improvements in code for gating worker threads that wait for work,
+  previously consolidated under #if IMPROVED_GATING, now legalized.
+- Cosmetic changes in code, examples, tests.
+
+Bugs fixed:
+
+- 113 - Iterators and ranges should be convertible to their const
+  counterparts.
+- TR569 - Memory leak in concurrent_queue.
+
+------------------------------------------------------------------------
+20080122 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Updated examples/parallel_for/seismic to improve the visuals and to
+  use the affinity_partitioner (20071127 and forward) for better
+  performance.
+- Minor improvements to unittests and performance tests.
+
+------------------------------------------------------------------------
+20080115 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Cleanup, simplifications and enhancements to the Makefiles for
+  building the libraries (see build/index.html for high-level
+  changes) and the examples.
+- Use SwitchToThread() not Sleep() to yield threads on Windows*.
+- Engineering work-in-progress on exception safety/support.
+- Engineering work-in-progress on affinity_partitioner for
+  parallel_reduce.
+- Engineering work-in-progress on improved gating for worker threads
+  (idle workers now block in the OS instead of spinning).
+- Enhancements to Doxygen-readable comments in source code.
+
+Bugs fixed:
+
+- 102 - Support for parallel build with gmake -j
+- 114 - /Wp64 build warning on Windows*.
+
+------------------------------------------------------------------------
+20071218 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Full support for Microsoft* Visual Studio* 2008 in open-source.
+  Binaries for vc9/ will be available in future stable releases.
+- New recursive_mutex class.
+- Full support for 32-bit PowerMac including export files for builds.
+- Improvements to parallel_do.
+
+------------------------------------------------------------------------
+20071206 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Support for Microsoft* Visual Studio* 2008 in building libraries
+  from source as well as in vc9/ projects for examples.
+- Small fixes to the affinity_partitioner first introduced in 20071127.
+- Small fixes to the thread-stack size hook first introduced in 20071127.
+- Engineering work in progress on concurrent_vector.
+- Engineering work in progress on exception behavior.
+- Unittest improvements.
+
+------------------------------------------------------------------------
+20071127 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- Task-to-thread affinity support (affinity partitioner) first appears.
+- More work on concurrent_vector.
+- New parallel_do algorithm (function-style version of parallel while)
+  and parallel_do/parallel_preorder example.
+- New task_scheduler_init() hooks for getting default_num_threads() and
+  for setting thread stack size.
+- Support for weak memory consistency models in the code base.
+- Futex usage in the task scheduler (Linux).
+- Started adding 32-bit PowerMac support.
+- Intel(R) 9.1 compilers are now the base supported Intel(R) compiler
+  version.
+- TBB libraries added to link line automatically on Microsoft Windows*
+  systems via #pragma comment linker directives.
+
+Open-source contributions integrated:
+
+- FreeBSD platform support patches.
+- AIX weak memory model patch.
+
+Bugs fixed:
+
+- 108 - Removed broken affinity.h reference.
+- 101 - Does not build on Debian Lenny (replaced arch with uname -m).
+
+------------------------------------------------------------------------
+20071030 open-source release
+
+Changes (w.r.t. previous open-source release):
+
+- More work on concurrent_vector.
+- Better support for building with -Wall -Werror (or not) as desired.
+- A few fixes to eliminate extraneous warnings.
+- Begin introduction of versioning hooks so that the internal/API
+  version is tracked via TBB_INTERFACE_VERSION. The newest binary
+  libraries should always work with previously-compiled code when-
+  ever possible.
+- Engineering work in progress on using futex inside the mutexes (Linux).
+- Engineering work in progress on exception behavior.
+- Engineering work in progress on a new parallel_do algorithm.
+- Unittest improvements.
+
+------------------------------------------------------------------------
+20070927 open-source release
+
+Changes (w.r.t. TBB 2.0 U1 commercial-aligned release):
+
+- Minor update to TBB 2.0 U1 below.
+- Begin introduction of new concurrent_vector interfaces not released
+  with TBB 2.0 U1.
+
+------------------------------------------------------------------------
+TBB 2.0 U1 commercial-aligned release (package 014, October 1, 2007)
+
+Corresponds to commercial 014 (for Linux*, 016) packages.
+
+Changes (w.r.t. TBB 2.0 commercial-aligned release):
+
+- All open-source-release changes down to, and including, TBB 2.0 below,
+  were incorporated into this release.
+- Made a number of changes to the officially supported OS list:
+  Added Linux* OSs:
+    Asianux* 3, Debian* 4.0, Fedora Core* 6, Fedora* 7,
+    Turbo Linux* 11, Ubuntu* 7.04;
+  Dropped Linux* OSs:
+    Asianux* 2, Fedora Core* 4, Haansoft* Linux 2006 Server,
+    Mandriva/Mandrake* 10.1, Miracle Linux* 4.0,
+    Red Flag* DC Server 5.0;
+  Only Mac OS* X 10.4.9 (and forward) and Xcode* tool suite 2.4.1 (and
+  forward) are now supported.
+- Commercial installers on Linux* fixed to recommend the correct
+  binaries to use in more cases, with less unnecessary warnings.
+- Changes to eliminate spurious build warnings.
+
+Open-source contributions integrated:
+
+- Two small header guard macro patches; it also fixed bug #94.
+- New blocked_range3d class.
+
+Bugs fixed:
+
+- 93 - Removed misleading comments in task.h.
+- 94 - See above.
+
+------------------------------------------------------------------------
+20070815 open-source release
+
+Changes:
+
+- Changes to eliminate spurious build warnings.
+- Engineering work in progress on concurrent_vector allocator behavior.
+- Added hooks to use the Intel(R) compiler code coverage tools.
+
+Open-source contributions integrated:
+
+- Mac OS* X build warning patch.
+
+Bugs fixed:
+
+- 88 - Fixed TBB compilation errors if both VS2005 and Windows SDK are
+  installed.
+
+------------------------------------------------------------------------
+20070719 open-source release
+
+Changes:
+
+- Minor update to TBB 2.0 commercial-aligned release below.
+- Changes to eliminate spurious build warnings.
+
+------------------------------------------------------------------------
+TBB 2.0 commercial-aligned release (package 010, July 19, 2007)
+
+Corresponds to commercial 010 (for Linux*, 012) packages.
+
+- TBB open-source debut release.
+
+------------------------------------------------------------------------
+TBB 1.1 commercial release (April 10, 2007)
+
+Changes (w.r.t. TBB 1.0 commercial release):
+
+- auto_partitioner which offered an automatic alternative to specifying
+  a grain size parameter to estimate the best granularity for tasks.
+- The release was added to the Intel(R) C++ Compiler 10.0 Pro.
+
+------------------------------------------------------------------------
+TBB 1.0 Update 2 commercial release
+
+Changes (w.r.t. TBB 1.0 Update 1 commercial release):
+
+- Mac OS* X 64-bit support added.
+- Source packages for commercial releases introduced.
+
+------------------------------------------------------------------------
+TBB 1.0 Update 1 commercial-aligned release
+
+Changes (w.r.t. TBB 1.0 commercial release):
+
+- Fix for critical package issue on Mac OS* X.
+
+------------------------------------------------------------------------
+TBB 1.0 commercial release (August 29, 2006)
+
+Changes (w.r.t. TBB 1.0 beta commercial release):
+
+- New namespace (and compatibility headers for old namespace).
+  Namespaces are tbb and tbb::internal and all classes are in the
+  underscore_style not the WindowsStyle.
+- New class: scalable_allocator (and cache_aligned_allocator using that
+  if it exists).
+- Added parallel_for/tacheon example.
+- Removed C-style casts from headers for better C++ compliance.
+- Bug fixes.
+- Documentation improvements.
+- Improved performance of the concurrent_hash_map class.
+- Upgraded parallel_sort() to support STL-style random-access iterators
+  instead of just pointers.
+- The Windows vs7_1 directories renamed to vs7.1 in examples.
+- New class: spin version of reader-writer lock.
+- Added push_back() interface to concurrent_vector().
+
+------------------------------------------------------------------------
+TBB 1.0 beta commercial release
+
+Initial release.
+
+Features / APIs:
+
+- Concurrent containers: ConcurrentHashTable, ConcurrentVector,
+  ConcurrentQueue.
+- Parallel algorithms: ParallelFor, ParallelReduce, ParallelScan,
+  ParallelWhile, Pipeline, ParallelSort.
+- Support: AlignedSpace, BlockedRange (i.e., 1D), BlockedRange2D
+- Task scheduler with multi-master support.
+- Atomics: read, write, fetch-and-store, fetch-and-add, compare-and-swap.
+- Locks: spin, reader-writer, queuing, OS-wrapper.
+- Memory allocation: STL-style memory allocator that avoids false
+  sharing.
+- Timers.
+
+Tools Support:
+- Thread Checker 3.0.
+- Thread Profiler 3.0.
+
+Documentation:
+- First Use Documents: README.txt, INSTALL.txt, Release_Notes.txt,
+  Doc_Index.html, Getting_Started.pdf, Tutorial.pdf, Reference.pdf.
+- Class hierarchy HTML pages (Doxygen).
+- Tree of index.html pages for navigating the installed package, esp.
+  for the examples.
+
+Examples:
+- One for each of these TBB features: ConcurrentHashTable, ParallelFor,
+  ParallelReduce, ParallelWhile, Pipeline, Task.
+- Live copies of examples from Getting_Started.pdf.
+- TestAll example that exercises every class and header in the package
+  (i.e., a "liveness test").
+- Compilers: see Release_Notes.txt.
+- APIs: OpenMP, WinThreads, Pthreads.
+
+Packaging:
+- Package for Windows installs IA-32 and EM64T bits.
+- Package for Linux installs IA-32, EM64T and IPF bits.
+- Package for Mac OS* X installs IA-32 bits.
+- All packages support Intel(R) software setup assistant (ISSA) and
+  install-time FLEXlm license checking.
+- ISSA support allows license file to be specified directly in case of
+  no Internet connection or problems with IRC or serial #s.
+- Linux installer allows root or non-root, RPM or non-RPM installs.
+- FLEXlm license servers (for those who need floating/counted licenses)
+  are provided separately on Intel(R) Premier.
+
+------------------------------------------------------------------------
+* Other names and brands may be claimed as the property of others.
diff --git a/sources b/sources
index c13b95d..6787447 100644
--- a/sources
+++ b/sources
@@ -1,6 +1,5 @@
-c7d1712d9e3ff76d9037aff5fcd67a17 tbb30_20110419oss_src.tgz
-3378f28664adc82d20bccbe35ea4fbfb Design_Patterns.pdf
-2d56ac8475941c81f004359ab75e4120 Getting_Started.pdf
-bb609bdd5dcb4b0a8bc0709e3bef622e Reference.pdf
-30239687f75afd96ae1ad62e380633b9 Tutorial.pdf
-62d5169d83f256b502c6a7d0729a68e9 CHANGES.txt
+c3c66663c10261ff03d1b071ab74e659 tbb40_20110809oss_src.tgz
+683109a2b732ecd56185d9019667718f Design_Patterns.pdf
+907eed2e81e0d29a93848a26e0fbfa5d Getting_Started.pdf
+131f0f2ae4311794dfa37b7a9172c54e Reference.pdf
+74fca4778a2c624631c157b07beab7ec Tutorial.pdf
diff --git a/tbb-3.0-mfence.patch b/tbb-4.0-mfence.patch
similarity index 54%
rename from tbb-3.0-mfence.patch
rename to tbb-4.0-mfence.patch
index b209ee7..09a89fc 100644
--- a/tbb-3.0-mfence.patch
+++ b/tbb-4.0-mfence.patch
@@ -1,8 +1,8 @@
-diff -up tbb30_20110419oss/include/tbb/machine/linux_ia32.h\~ tbb30_20110419oss/include/tbb/machine/linux_ia32.h
---- tbb30_20110419oss/include/tbb/machine/linux_ia32.h~ 2011-04-19 13:48:59.000000000 +0200
-+++ tbb30_20110419oss/include/tbb/machine/linux_ia32.h 2011-07-26 16:09:19.986615628 +0200
-@@ -40,7 +40,14 @@
- #define __TBB_control_consistency_helper()
+diff -up tbb40_20110809oss/include/tbb/machine/linux_ia32.h\~ tbb40_20110809oss/include/tbb/machine/linux_ia32.h
+--- tbb40_20110809oss/include/tbb/machine/linux_ia32.h~ 2011-08-24 15:51:56.000000000 +0200
++++ tbb40_20110809oss/include/tbb/machine/linux_ia32.h 2011-10-18 15:04:01.994271994 +0200
+@@ -42,7 +42,14 @@
+ #define __TBB_control_consistency_helper() __TBB_compiler_fence()
  #define __TBB_acquire_consistency_helper() __TBB_compiler_fence()
  #define __TBB_release_consistency_helper() __TBB_compiler_fence()
 -#define __TBB_full_memory_fence() __asm__ __volatile__("mfence": : :"memory")
@@ -18,4 +18,4 @@ diff -up tbb30_20110419oss/include/tbb/machine/linux_ia32.h\~ tbb30_20110419oss/
 
  #if __TBB_ICC_ASM_VOLATILE_BROKEN
  #define __TBB_VOLATILE
-Diff finished. Tue Jul 26 16:09:26 2011
+Diff finished. Tue Oct 18 15:04:09 2011
diff --git a/tbb.spec b/tbb.spec
index 857d1fe..5f6ec2a 100644
--- a/tbb.spec
+++ b/tbb.spec
@@ -1,5 +1,5 @@
-%define releasedate 20110419
-%define major 3
+%define releasedate 20110809
+%define major 4
 %define minor 0
 %define dotver %{major}.%{minor}
 %define sourcebasename tbb%{major}%{minor}_%{releasedate}oss
@@ -12,7 +12,7 @@ Release: 1.%{releasedate}%{?dist}
 License: GPLv2 with exceptions
 Group: Development/Tools
 URL: http://threadingbuildingblocks.org/
-Source0: http://threadingbuildingblocks.org/uploads/76/169/3.0%%20update%%206/tbb30_20110419oss_src.tgz
+Source0: http://threadingbuildingblocks.org/uploads/77/175/4.0/tbb40_20110809oss_src.tgz
 
 # Upstream regularly replaces the "Latest" documentation with what's
 # actually Latest at that point. These sources may no longer match
@@ -30,7 +30,7 @@ Source4: %{docurl}/%{source_4}
 Source5: %{docurl}/%{source_5}
 
 Patch1: tbb-3.0-cxxflags.patch
-Patch2: tbb-3.0-mfence.patch
+Patch2: tbb-4.0-mfence.patch
 BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
 BuildRequires: libstdc++-devel
 # We need "arch" and "hostname" binaries:
@@ -122,6 +122,11 @@ rm -rf ${RPM_BUILD_ROOT}
 %doc %{source_5}
 
 %changelog
+* Tue Oct 18 2011 Petr Machata - 4.0-1.20110809
+- Rebase to 4.0
+  - Port the mfence patch
+  - Refresh the documentation bundle
+
 * Tue Jul 26 2011 Petr Machata - 3.0-1.20110419
 - Rebase to 3.0-r6
   - Port both patches