The Barcelona (aka "K10") microarchitecture is the latest design from AMD for both the server and desktop markets. The Phenom is the quad-core desktop variant, the Athlon X2 series includes the dual-core variant, and the 23xx and 83xx Opterons are the quad-core server variant.
The key changes over the previous line are covered in brief here and in greater detail here. Most of the interesting features require the use of an upgraded CPU socket denoted by a "+" (e.g. Socket AM2+ or Socket F+), though the CPU will work in non-plus sockets on current motherboards. Some of the "plus socket" features are:
• Separated voltage planes allow the CPU to have a different voltage/frequency for each core and the northbridge.
• HyperTransport 3.0, allowing greater bus bandwidth, along with support for DDR2-1066 memory.
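To put the DDR2-1066 figure in perspective, here is a rough back-of-the-envelope sketch of theoretical peak memory bandwidth. It assumes the nominal 1066 MT/s transfer rate and a 64-bit channel, and the dual-channel case is an assumption about a typical configuration; the function name is made up for illustration, and real sustained bandwidth will be lower.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s: float, bus_width_bits: int, channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


# Assumed figures: DDR2-1066 at a nominal 1066 MT/s on a 64-bit channel.
single = peak_bandwidth_gb_s(1066, 64)
dual = peak_bandwidth_gb_s(1066, 64, channels=2)  # assumption: two channels populated
print(f"single channel: {single:.2f} GB/s, dual channel: {dual:.2f} GB/s")
```

These are upper bounds only; they say nothing about latency or how much of that bandwidth a given application can actually use.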
In addition, Barcelona introduces a shared L3 cache, which should have a major impact on HPC applications.
One major issue, however, is an L3 TLB bug present in the first generation of this architecture. The problem can be worked around by disabling part of the L3 TLB system in the BIOS or via software (with a 10% performance penalty), or by using a Linux patch that routes around the problem with limited slowdown (though the patch is not intended for production use). See the Phenom Wikipedia article for details.
In short, while Intel retains the upper hand in horsepower now, the AMD Barcelona design seems to sport many of the features predicted for future system design.
More information:
Wikipedia's Barcelona article covers the architecture in depth.
Anandtech benchmarking puts the chip through its paces.
To find a Barcelona-based chip, see Wikipedia:
• Phenom quad-cores
• Barcelona-based dual-core Athlons (scroll to "Phenom based")
• Barcelona-based quad-core Opterons (23xx and 83xx)
This presentation explains why hybrid MPI/OpenMP programming is needed.
It includes examples and strategies, and discusses when hybrid mode performs better.
http://www.nersc.gov/nusers/services/tr ... hybrid.ppt
Also, here is a list of papers focusing on performance in hybrid MPI/OpenMP applications.
1. Felix Wolf, et al., "Automatic performance analysis of hybrid MPI/OpenMP applications", Journal of Systems Architecture 2003.
2. Laksono Adhianto, et al., "Performance Modeling of Communication and Computation in Hybrid MPI/OpenMP applications", ICPADS 2006.
3. Edmond Chow, et al., "Assessing Performance of Hybrid MPI/OpenMP Programs on SMP Clusters"
The Parallel Workloads Archive is an archive of real workload submission and execution data from real clusters from 1993 to the present. It supports a unified log file format (the Standard Workload Format, SWF) so that data from diverse sources can be analyzed together. With over 3 million jobs from 21 clusters, there's plenty of information to be had. It's maintained by Dror Feitelson of the Hebrew University of Jerusalem.
You can either visit the original Parallel Workloads Archive or our mirror of the site.
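For anyone who wants to poke at the logs, here is a minimal sketch of reading SWF data in Python. It assumes the commonly documented layout: header/comment lines start with ";", each job is one line of 18 whitespace-separated numeric fields, and -1 means "unknown". The class and function names are made up for illustration, only a handful of fields are kept, and the sample lines are synthetic, not taken from any real log.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class SwfJob:
    job_id: int
    submit_time: float      # seconds since the start of the log
    wait_time: float        # seconds spent queued
    run_time: float         # seconds of wall-clock execution
    allocated_procs: int
    status: int
    user_id: int


def parse_swf(lines: Iterable[str]) -> Iterator[SwfJob]:
    """Yield one SwfJob per data line, skipping comments and short lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # header comment or blank line
        f = line.split()
        if len(f) < 18:
            continue  # malformed record; a stricter parser might raise instead
        yield SwfJob(
            job_id=int(f[0]),
            submit_time=float(f[1]),
            wait_time=float(f[2]),
            run_time=float(f[3]),
            allocated_procs=int(f[4]),
            status=int(f[10]),
            user_id=int(f[11]),
        )


# Synthetic sample data, for illustration only.
SAMPLE = """\
; Synthetic sample, not from a real cluster
1 0 10 100 4 -1 -1 4 200 -1 1 1 1 -1 1 -1 -1 -1
2 30 0 50 8 -1 -1 8 60 -1 1 2 1 -1 1 -1 -1 -1
"""

jobs = list(parse_swf(SAMPLE.splitlines()))
# Processor-seconds consumed, ignoring jobs with unknown (-1) values.
core_seconds = sum(j.run_time * j.allocated_procs
                   for j in jobs if j.run_time >= 0 and j.allocated_procs >= 0)
print(len(jobs), "jobs,", core_seconds, "processor-seconds")
```

To use it on a real log, pass an open file object to parse_swf instead of the sample lines; check the header comments of each log first, since individual logs document their own quirks there.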