The new Macintosh Centris 650 and Quadra 800
computers introduced today feature a newly designed memory
controller which supports interleaved memory. The following
article explains how interleaved memory works on these machines
and how to configure the machines for maximum performance.
Interleaved Memory on the Centris 650 and Quadra 800
The main memory subsystem of the Macintosh Centris
650 and Quadra 800 computers makes use of a memory access
technique called "interleaved memory". This memory
organization reduces the time the 68040 processor needs to
access DRAM. The following description illustrates
how this memory organization works and why it results in reduced
memory access time.
Non-interleaved Memory System
In a non-interleaved memory system, all of the
first bank of memory, bank 0, is addressed before the first
long word of the second bank of memory, bank 1; all of bank
1 is addressed before the first long word of bank 2; and so
on. Figure 1 shows this organization for two banks of N long
words. (A long word is 4 bytes, or 32 bits, and is the natural
unit of memory for the 68040.)
     Bank 0                  Bank 1
-----------------       -----------------
|       0       |       |       N       |
-----------------       -----------------
|       1       |       |      N+1      |
-----------------       -----------------
|       2       |       |      N+2      |
-----------------       -----------------
~               ~       ~               ~
-----------------       -----------------
|      N-2      |       |     2N-2      |
-----------------       -----------------
|      N-1      |       |     2N-1      |
-----------------       -----------------
        ^                       ^
        |                       |
        -------------------------
                    |
                    v
            -----------------
            |    Buffer     |
            -----------------
                    ^
                    |
                    v  System Data Bus
-----------------------------------------------------
Figure 1. Non-interleaved Memory Organization
The 68040 performs burst accesses (a single
bus transaction that reads or writes 16 bytes in 4 adjacent
long words) to move data between its caches and memory. All
16 bytes come from one bank of DRAM in a non-interleaved memory
system, so the time required to complete the transfer depends
directly on the access time of the DRAM. Figure 2 shows an
example of such a burst access. The time needed to access
the 2nd, 3rd, and 4th long words is shorter because a feature
of the DRAMs called "page-mode access" is used.
__ __ __ __ __ __ __ __ __ __
Clock __| |__| |__| |__| |__| |__| |__| |__| |__| |__|
______________________________________________________
DRAM Accesses | 1st long word | 2nd lwd | 3rd lwd | 4th lwd |
------------------------------------------------------
Figure 2. Non-interleaved Burst Access Timing
Interleaved Memory System
In an interleaved memory system, there are still
two physical banks of DRAM, but logically the system sees
one bank of memory that is twice as large. In the interleaved
bank, the first long word of bank 0 is followed by the first
long word of bank 1, which is followed by the second long
word of bank 0, which is followed by the second long word
of bank 1, and so on. Figure 3 shows this organization for
two physical banks of N long words. All even long words of
the logical bank are located in physical bank 0 and all odd
long words are located in physical bank 1.
     Bank 0                  Bank 1
-----------------       -----------------
|       0       |       |       1       |
-----------------       -----------------
|       2       |       |       3       |
-----------------       -----------------
|       4       |       |       5       |
-----------------       -----------------
~               ~       ~               ~
-----------------       -----------------
|     2N-4      |       |     2N-3      |
-----------------       -----------------
|     2N-2      |       |     2N-1      |
-----------------       -----------------
        ^                       ^
        |                       |
        v                       v
-----------------       -----------------
|    Buffer     |       |    Buffer     |
-----------------       -----------------
        ^                       ^
        |                       |
        v    System Data Bus    v
-----------------------------------------------------
Figure 3. Interleaved Memory Organization
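The two organizations in Figures 1 and 3 differ only in how a logical long-word number is split into a bank and a position within that bank. The following Python sketch illustrates the two mappings. The function names are made up for illustration, and the burst is simplified to start at the first long word of its 16-byte line; it is not a description of the actual memory controller logic.

```python
def non_interleaved(long_word, n):
    """Map a logical long-word number to (bank, row) when each bank of
    n long words is filled completely before the next one starts."""
    return long_word // n, long_word % n


def interleaved(long_word):
    """Map a logical long-word number to (bank, row) for two-way
    interleaving: even long words in bank 0, odd long words in bank 1."""
    return long_word % 2, long_word // 2


def burst_banks(byte_address, mapping):
    """Banks touched by the four long words of one 16-byte burst.
    Simplification: the burst is taken in ascending order from the
    start of its aligned 16-byte line."""
    first = (byte_address // 16) * 4  # long-word number at the line start
    return [mapping(first + i)[0] for i in range(4)]


N = 1024 * 1024  # a 4 MByte bank holds 1M long words

print(burst_banks(0x40, lambda w: non_interleaved(w, N)))  # [0, 0, 0, 0]
print(burst_banks(0x40, interleaved))                      # [0, 1, 0, 1]
```

In the non-interleaved case all four long words of the burst land in one bank; in the interleaved case they alternate between the two banks, which is what allows the accesses to overlap.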
The interleaved memory configuration is designed
to speed up 68040 burst accesses by as much as 30%. (The actual
improvement depends on the system clock speed and the DRAM
access time.) Since the four long words of a burst access
are spread across two physical banks of DRAM, the individual
accesses can be overlapped to hide part, or all, of the DRAM
access time delay, as shown below in Figure 4.
__ __ __ __ __ __ __ __ __ __
Clock __| |__| |__| |__| |__| |__| |__| |__| |__| |__|
_______________________________
| 1st long word | 3rd lwd |
-------------------------------
DRAM Accesses ______________________________
| 2nd long word | 4th lwd |
-------------------------------
Figure 4. Interleaved Burst Access Timing
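The overlap shown in Figures 2 and 4 can be approximated with a small timing model. The sketch below is an illustration only: the model itself (both banks start together, and one long word is transferred per bus clock) and the clock counts used are assumptions chosen for the example, not measured or documented timings of the Centris 650 or Quadra 800.

```python
def non_interleaved_burst(first, page):
    """Clocks for 4 long words from one bank: one full DRAM access
    followed by three shorter page-mode accesses."""
    return first + 3 * page


def interleaved_burst(first, page, transfer=1):
    """Clocks for 4 long words that alternate between two banks.
    Assumed model: both banks start their first access together, and
    each long word needs `transfer` clocks on the data bus."""
    # Clock at which each bank has the data ready for words 1..4.
    ready = [first, first, first + page, first + page]
    done = 0
    for r in ready:
        # A word completes once its data is ready and the previous
        # word's bus transfer has finished.
        done = max(r, done + transfer)
    return done


# Hypothetical clock counts, picked only to show the effect.
FIRST_ACCESS, PAGE_ACCESS = 4, 2
print(non_interleaved_burst(FIRST_ACCESS, PAGE_ACCESS))  # 10
print(interleaved_burst(FIRST_ACCESS, PAGE_ACCESS))      # 7
```

With these made-up numbers the burst shrinks from 10 clocks to 7. As the text notes, the real improvement depends on the system clock speed and the DRAM access time.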
Centris 650 / Quadra 800 Memory Organization
Physically, the DRAM in a Centris 650 or Quadra
800 system is organized as 10 banks of memory, where each
bank is 32 bits wide and 4 or 16 MBytes deep. Logically, the
DRAM is organized as 5 pairs of banks, any of which may or
may not be interleaved. At system boot time, each pair of
DRAM banks is examined; if they are the same size (4 or 16
MBytes) the interleaved memory configuration for that bank
pair will be enabled. Otherwise, the bank pair will be left
in the non-interleaved configuration. The memory controller
in the C650/Q800 is capable of operating with some bank pairs
in the interleaved configuration and some bank pairs in the
non-interleaved configuration. The type of memory access which
is performed is determined dynamically at the start of each
cycle based on the value of an "interleave configuration
register" within the memory controller. ROM accesses
cannot be interleaved since there is only a single bank of
ROM.
The C650/Q800 motherboard contains 4 or 8 MB
of DRAM and 4 DRAM SIMM sockets. Systems which contain 8 MB
on the motherboard (all Q800s and some C650s) already interleave
the two 4 MB banks soldered to the motherboard. Systems which
have only 4 MB soldered on the motherboard cannot interleave
the single soldered 4 MB DRAM bank, although DRAM on SIMMs
can still be interleaved.
Each DRAM SIMM can contain either one or two
banks of DRAM. The C650 and Q800 use 72-pin DRAM SIMMs, which
have a 32-bit data path, so a memory upgrade can be
performed with a single SIMM. Single-sided SIMMs contain
one DRAM bank; double-sided SIMMs contain two DRAM banks.
A double-sided SIMM cannot contain an interleaved bank pair
since there are not enough pins on the SIMM to accommodate
the two 32-bit data buses required for interleaved memory.
Interleaving of SIMM memory can therefore only be done between a pair of SIMMs.
The motherboard contains banks 0 & 1, SIMM
slot 1 contains banks 2 & 3 (remember, SIMMs can be double-sided
and contain 2 banks of DRAM), slot 2 contains banks 4 &
5, slot 3 contains banks 6 & 7, and slot 4 contains banks
8 & 9. SIMM slot pairs 1-2 and 3-4 are interleaved together
whenever the corresponding banks in the two slots are the same
size. For example, if
4 MB SIMMs are placed in both SIMM slots 1 and 2, then that
memory will be interleaved (banks 2 & 4). If a double-sided
8 MB SIMM (i.e., a SIMM with two 4 MB banks on it) is placed
in slot 1, and a single-sided 4 MB SIMM is placed in slot
2, then two of the banks will be interleaved (banks 2 &
4) and one bank will not be interleaved (bank 3).
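The boot-time rule and the slot-to-bank numbering described above can be sketched in Python as follows. This is an illustration, not Apple's ROM code: the function names are hypothetical, and of the SIMM bank pairs only (2, 4) is stated explicitly in the text; the pairs (3, 5), (6, 8), and (7, 9) are assumed by analogy.

```python
# Bank pairs checked at boot. (0, 1) is the soldered motherboard DRAM;
# (2, 4) is given in the text, the remaining SIMM pairs are assumed.
BANK_PAIRS = [(0, 1), (2, 4), (3, 5), (6, 8), (7, 9)]


def banks_from_hardware(soldered_mb, simms):
    """Return {bank: size_mb} for a machine with 4 or 8 MB soldered DRAM.
    `simms` maps a slot number (1-4) to a list of bank sizes in MB,
    one entry per populated side of the SIMM."""
    banks = {0: 4}
    if soldered_mb == 8:
        banks[1] = 4
    for slot, sides in simms.items():
        for side, size in enumerate(sides):
            banks[2 * slot + side] = size  # slot 1 -> banks 2 and 3, etc.
    return banks


def interleave_plan(banks):
    """Split installed banks into interleaved pairs and single banks."""
    paired, single = [], []
    for a, b in BANK_PAIRS:
        size_a, size_b = banks.get(a, 0), banks.get(b, 0)
        if size_a and size_a == size_b:
            paired.append((a, b))  # same size: pair is interleaved
        else:
            single.extend(x for x in (a, b) if banks.get(x, 0))
    return paired, single


# Second example from the text: a double-sided 8 MB SIMM in slot 1 and a
# single-sided 4 MB SIMM in slot 2, here on a board with 8 MB soldered.
banks = banks_from_hardware(8, {1: [4, 4], 2: [4]})
print(interleave_plan(banks))  # ([(0, 1), (2, 4)], [3])
```

The result matches the example in the text: banks 2 and 4 are interleaved, and bank 3 is left non-interleaved because its partner bank is empty.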
In short, to get the most out of memory interleaving,
perform memory upgrades with a pair of SIMMs of the same
size. A single SIMM can be used for memory expansion, but
a portion of memory will then be non-interleaved. The system
configures everything automatically at boot, regardless
of what memory is installed. However, physically configuring
DRAM in identically sized bank pairs gives the fastest overall
memory access, and therefore the highest performance.
The actual performance difference between an interleaved and a
non-interleaved memory system varies from application to
application.
- Dale Adams
Apple Computer
These questions and answers appeared in the
Apple Information Alley
Q. Is there any information available on the
performance gains or losses when using interleaved instead
of non-interleaved memory on PCI-based Power Macintosh computers?
Specifically how much of a speed advantage would
be lost if you use one 16 MB SIMM rather than two 8 MB SIMMs
in an interleaved arrangement?
A. For increased performance it is better to
configure a PCI-based Power Macintosh computer for memory
interleaving rather than installing memory in a non-interleaved
configuration. This means that you will get better performance
if you configure your system with two 16 MB DIMMs rather than
one 32 MB DIMM. This applies to all other combinations of
same-sized DIMMs.
The actual performance will vary from computer
to computer. In general, a Power Macintosh with a PowerPC
604 microprocessor, such as the Power Macintosh 8500 or 9500
series computer, gets anywhere from a 5% to 15% boost in performance.
The average increase is about 8%.
On a Power Macintosh with a PowerPC 601 microprocessor, such
as the Power Macintosh 7500 series, you may see only a slight
performance gain from memory interleaving compared with
non-interleaved DIMMs. Some third-party benchmarking
applications may report exaggerated performance differences
between interleaved and non-interleaved computers.
Q. How do I populate DIMMs in my PCI-based Power
Macintosh computer to maximize performance using memory interleaving?
If I have an odd number of DIMMs, where should I place the
odd DIMM to get the best performance from memory interleaving?
A. Interleaving is accomplished by 'pairing'
two DIMMs in corresponding slots. That is, one DIMM in A1,
and another DIMM in B1 will set the machine up to use memory
interleaving.
If you have an odd number of DIMMs, the matched
pairs will run interleaved and the odd DIMM will
run non-interleaved. For the interleaving to be most effective,
the DIMMs in a pair must be the same size and speed (they should
usually also be from the same manufacturer, but this is not
required). As for memory addressing, A1/B1 holds the lowest
addresses and A6/B6 the highest.
In relation to performance, it really does not
matter where the DIMMs are placed. The software is intelligent
enough to figure out which banks are being used, and is able
to "stitch" the memory together as required.
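The pairing rule in this answer can be sketched in Python as below. The sketch is an illustration under stated assumptions: the function name and data layout are hypothetical, the default of six slot pairs follows the A1/B1 to A6/B6 range mentioned above, and treating a mismatched pair as non-interleaved is an assumption, since the answer only says that matching size and speed makes interleaving most effective.

```python
def dimm_plan(dimms, pairs=6):
    """Split installed DIMMs into interleaved pairs and single DIMMs.
    `dimms` maps a slot name such as 'A1' to (size_mb, speed_ns)."""
    interleaved, single = [], []
    for n in range(1, pairs + 1):
        slot_a, slot_b = f"A{n}", f"B{n}"
        a, b = dimms.get(slot_a), dimms.get(slot_b)
        if a and b and a == b:
            # Same size and speed in corresponding slots: a matched pair.
            interleaved.append((slot_a, slot_b))
        else:
            # Assumption: an unmatched or lone DIMM runs non-interleaved.
            single.extend(s for s, d in ((slot_a, a), (slot_b, b)) if d)
    return interleaved, single


# Three DIMMs: a matched 16 MB pair in A1/B1 and one extra DIMM in A2.
installed = {"A1": (16, 70), "B1": (16, 70), "A2": (32, 70)}
print(dimm_plan(installed))  # ([('A1', 'B1')], ['A2'])
```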
Note: Memory interleaving is only available
in the Power Macintosh 7500, 8500, and 9500 series computers.
The Power Macintosh 7200 uses a different memory controller
which does not support interleaving.