Article 169484 of comp.os.vms:

Dave Cherkus wrote:
>
> : Does anyone know about release dates and the like for the 21264 and the
> : PC264 motherboard that will go with it?
>
> Microprocessor Reports estimates volume production in the fourth quarter
> of 1997.
>
> : Someone seems to be very secretive about something.

These are some notes that I wrote after the DEC presentation at the last
Microprocessor Forum, where the 21264 was introduced.

Norm

================================================================================

                           The 21264 vs the 21164

o. Transistor count is up from 9.7 million to 15.2 million.

o. The 21264 has four integer and two floating point units vs the 21164's
   two integer and two floating point units.

o. The 21264 can execute instructions out-of-order, whereas the 21164 has to
   execute instructions in-order.

o. The cache structure has been changed:

        21164:  L1 cache:  8K inst,  8K data;  L2 cache: 96K inst/data
        21264:  L1 cache: 64K inst, 64K data;  no on-chip L2 cache

   The 21164's L1 cache took only one clock cycle to access, but a miss,
   which was frequent, cost six clock cycles. The 21264's L1 cache takes
   two clock cycles per access, but because of the 21264's out-of-order
   execution, this costs only 4% in performance while greatly reducing
   cache misses.

o. Cache hint instructions have been added to the 21264, which a compiler
   can use to pre-load instructions/data into cache.

o. The branch prediction mechanism has been improved.

o. A floating point square root instruction has been added to the 21264.

o. The pin bandwidth, which is the bandwidth from the chip to the off-chip
   cache and from the chip to main memory, has been dramatically improved.

o. Instructions to improve multimedia performance have been added. These
   take about 0.5% of the chip area.

Overview
--------

o. Estimated performance of a 500 MHz 21264 is 30+ SPECint95 and 50+
   SPECfp95. This compares to 15 SPECint95 and 21 SPECfp95 for a 500 MHz
   21164.
   - These numbers are based on running unoptimized 21164 code, i.e., just
     taking the 21164 code as it exists and moving it to the 21264.
     Recompiled code should perform better.

   - The CMOS-6 process (a 0.35 micron process), currently used for the
     21164 and planned as the initial process for the 21264, should go to
     at least 600 MHz.

o. The 21264 excels on a new performance comparison called STREAM
   (McCalpin), which is a memory-to-memory copy:

        21264        500 MHz    ~1600 MB/sec
        21164        500 MHz     ~300 MB/sec
        PA8000       180 MHz     ~300 MB/sec
        P6           200 MHz     ~200 MB/sec
        UltraSPARC   200 MHz     ~250 MB/sec
        R10000       200 MHz     ~200 MB/sec

   - The increase in pin bandwidth should result in improvements in real
     world application performance far greater than indicated by the
     benchmarks. Benchmarks tend to run in on-chip caches, while real world
     applications almost always overflow the caches and thus are often
     limited by the rate at which data and instructions can be moved from
     off-chip cache or main memory into the processor.

o. Samples are to be shipped in Q1 97, volume shipments in H2 97. First
   systems will probably ship Q4 97.

================================================================================
===  Norm Donovan                                      Phone: (408)970-5678  ===
===  Director IT                                                             ===
===  Siliconix/TEMIC Inc                                                     ===
===  2201 Laurelwood Dr, Santa Clara CA, USA 95054                           ===
===  Norm.Donovan@TEMIC.Com                                                  ===
================================================================================
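
For anyone who wants the ratios implied by the figures in these notes, here
is a minimal Python sketch. The inputs are only the estimates quoted above.
Treating "30+" and "50+" as exactly 30 and 50 is an assumption that, if
anything, understates the 21264. The variable names are illustrative and not
from the presentation.

```python
# Estimated SPEC95 results at 500 MHz, as quoted in the notes.
# "30+" and "50+" are taken as 30 and 50 (an assumption; a lower bound).
spec_21264 = {"SPECint95": 30.0, "SPECfp95": 50.0}
spec_21164 = {"SPECint95": 15.0, "SPECfp95": 21.0}

# Approximate STREAM copy bandwidth in MB/sec, as quoted in the notes.
stream_21264 = 1600.0
stream_21164 = 300.0

# Ratio of 21264 to 21164 for each figure.
int_ratio = spec_21264["SPECint95"] / spec_21164["SPECint95"]
fp_ratio = spec_21264["SPECfp95"] / spec_21164["SPECfp95"]
stream_ratio = stream_21264 / stream_21164

print(f"SPECint95 ratio: {int_ratio:.2f}x")
print(f"SPECfp95 ratio:  {fp_ratio:.2f}x")
print(f"STREAM ratio:    {stream_ratio:.2f}x")
```

On these inputs the SPEC ratios come out at roughly 2.0x and 2.4x, while the
STREAM ratio is roughly 5.3x, which is the gap the pin bandwidth remark above
is pointing at.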