 
 

Merced Facts and Speculations

by Alexei Pylkin
Supercomputer Software Department RAS
 
  "This truly uncompromising article in-depth analyzes main features and real novelties of forthcoming processor. It gives a chance to you, learning a processor, to have your own opinion on it." 
Oleg Yu. Repin, maintainer of the VLSI Microprocessors 



 
 
Introduction
  Merced is the code name of Intel's general-purpose 64-bit microprocessor, which is currently under development. It is scheduled for production in mid-2000 and is to be fabricated on a 0.18 micron process. Intel Corporation expects to begin sample production in 1999. 

The processor is named after the city of Merced, located near San Jose, Calif., USA. 

Merced should be the first member of the new IA-64 family. IA-64 stands for Intel 64-bit Architecture. IA-64 implements the EPIC (Explicitly Parallel Instruction Computing) concept. EPIC was jointly defined by HP and Intel, who claim it is a fundamental architecture technology, analogous to CISC and RISC. IA-64 includes a new 64-bit instruction set, also jointly developed by HP and Intel. In official HP-Intel announcements the new instruction set is also called the 64-bit Instruction Set Architecture (64-bit ISA). In addition, Merced should provide full compatibility with Intel's x86 family. Intel officials often use IA-32 (Intel 32-bit Architecture) instead of x86. 

Two IA-64 processors are known to be under development today: 

  • the already mentioned Merced
  • McKinley, expected in production in 2001. It is being developed mainly by HP
Recently Intel announced two more IA-64 processors: Madison should appear in 2002, followed by Deerfield. 
Chronology of events
  Hewlett-Packard and Intel announced their joint research-and-development project in June of 1994. Aimed at providing advanced technologies for end-of-the-decade workstation, server and enterprise-computing products, the two companies' efforts include development of the 64-bit instruction set and compiler optimization. 

Two years later, in 1996, HP produced its first 64-bit general-purpose processor, the PA-8000. It was the first member of the new PA-RISC 2.0 family. It is natural to assume that PA-RISC 2.0 is a result of the joint R&D project on the 64-bit instruction set, all the more so because the PA-8000 implements two of the key IA-64 features - predication and speculation. But there is no official information confirming this assumption. 

On October 9, 1997 Intel Corporation announced ([1]) that 

  • the first member of its new family of 64-bit microprocessors, code named Merced, is scheduled for production in 1999. The processor will be produced on the basis of Intel's 0.18 micron process technology, which is currently under development.
  • the processor is intended for the server and workstation market segment
  • Merced processors will run all the software that currently operates on Intel x86 processor-based machines
  • Intel has a complete IA-64 compatible software development environment running, and key independent software vendors (ISVs) are using it to develop operating systems and enterprise-level applications.
On Oct. 14, 1997, at the Microprocessor Forum in San Jose, Calif., Intel and Hewlett-Packard made their first public disclosure of the IA-64 basics in a joint presentation by Intel Fellow and Director of Microprocessor Architecture John Crawford and Hewlett-Packard's Manager and Lead Architect Jerry Huck. Bill Worley and Rajiv Gupta of HP and Don Alpert and Hans Mulder of Intel were recognized as the key people in the EPIC research team. Transcripts of the speeches may be found on Intel's Web site ([3]). Slides of the HP/Intel IA-64 presentation at the Microprocessor Forum are located on HP's Web site ([3a]). In addition, Intel Corporation issued a few press releases on the topic ([4], [5]). 

The same day at the Microprocessor Forum there was also a presentation by Joel Birnbaum, Director of Hewlett-Packard Laboratories and Senior Vice President of Research and Development. He gave a short retrospective of the architecture work at HP from the early '80s until the decision to form the HP-Intel IA-64 alliance in 1994. According to Birnbaum, the result of HP Labs research, known internally as Wide-Word and then as SP-PA (Super-Parallel Processor Architecture), served as the starting point for the Intel alliance; Bill Worley of HP Labs headed both the Precision Architecture and the Wide-Word efforts. Birnbaum said Wide-Word included such features as static parallelism, speculation, predication, and mechanisms that let the number and speed of functional units scale. He also explained the decision to form an alliance with Intel; the explanation is too long to reproduce here. Rajiv Gupta was mentioned as HP Labs' technical lead in HP's collaboration with Intel. 

On May 29, 1998 Intel Corporation announced ([2]) a change in the production schedule of the Merced processor. According to the announcement, planned volume production is moved from 1999 to mid-2000. Intel Corporation expects to begin sample production in 1999. The announcement contains no information on Merced's architecture or process technology. 

The Microprocessor Forum was held on October 12-15, 1998. In the presentation "IA-64 Processors: Features and Futures" Intel's Stephen Smith provided some insights into the IA-64 products and their features.

EPIC, IA-64, Merced
  According to HP and Intel, the EPIC concept keeps all the advantages of VLIW and none of its disadvantages. John Crawford ([3]) revealed the following EPIC features: 
  • A large number of registers.
  • Ability to scale to a large number of functional units. HP and Intel officials call this feature an "inherently scalable instruction set".
  • Instruction-level parallelism made explicit in the machine code. Dependencies between instructions are found and handled by the compiler, not by the processor.
  • Predication. Instructions from different branches of a conditional statement are tagged with so-called predicate registers and executed simultaneously.
  • Speculative loading. Data from slow memory is loaded in advance.
These EPIC features are explained in more detail below. 

HP and Intel officials claim EPIC to be the next-generation concept. They contrast EPIC with the CISC and RISC architectures. In their opinion ([4]), traditional microprocessor architectures have fundamental attributes that limit performance. But some RISC processor makers don't share such a pessimistic opinion ([6]). By the way, in the 1980s, when the RISC concept appeared, there were many assertions that "CISC ran out of gas" and that CISC has fundamental attributes that limit performance. Yet processors recognized as CISC are still widely used (e.g. the Intel x86 family), and their performance is still increasing. 

In fact, all these abbreviations - CISC, RISC, VLIW - denote idealized concepts only. Real microprocessors are difficult to classify. Present-day microprocessors counted as RISC differ greatly from the first processors of the RISC architecture. The same is true of CISC. The best processors implement many successful ideas regardless of the concepts they came from. 
 

IA-64 features
  IA-64 registers: 
  • 128 64-bit general purpose registers
  • 128 80-bit floating-point registers
  • 64 1-bit predicate registers
As noted above, John Crawford counts a large number of registers among the fundamental EPIC features. Indeed, 128 is a big number compared with the 8 general-purpose registers of the x86 family. But the MIPS R10000, for example, has 64 integer and 64 floating-point 64-bit registers. 

IA-64 instruction format: 

  • op-code
  • predicate register (6 bits)
  • source register 1 (7 bits)
  • source register 2 (7 bits)
  • destination register (7 bits)
  • special fields for integer and floating-point arithmetic
  • misc.
IA-64 instructions are packed by the compiler into 128-bit bundles. Each bundle contains three IA-64 instructions along with a template. The template indicates dependencies between instructions - whether the instructions in the bundle can be executed in parallel or whether one or more must be executed serially due to operand dependencies. The template also indicates whether the bundle can be executed in parallel with the neighboring bundles. 
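The bundle layout described above can be sketched as bit packing. This is a minimal illustration, not the official encoding: the article gives only the 128-bit bundle size, the 6-bit predicate field and the 7-bit register fields. The 41-bit slot width, the 5-bit template width, the 14-bit "opcode and misc." remainder, the field order and all function names are assumptions made for this example.

```python
# Illustrative sketch only. SLOT_BITS, TEMPLATE_BITS and the bit positions
# of the fields are assumptions, not taken from the article.
SLOT_BITS = 41
TEMPLATE_BITS = 5


def encode_instruction(opcode_and_misc, pred, src1, src2, dest):
    """Pack one instruction into a 41-bit integer."""
    assert 0 <= pred < 64                      # 6-bit predicate register number
    for reg in (src1, src2, dest):
        assert 0 <= reg < 128                  # 7-bit register number
    assert 0 <= opcode_and_misc < (1 << 14)    # remaining 41 - 27 = 14 bits
    return (pred
            | dest << 6
            | src1 << 13
            | src2 << 20
            | opcode_and_misc << 27)


def pack_bundle(template, slots):
    """Combine a template and three instruction slots into one 128-bit value."""
    assert 0 <= template < (1 << TEMPLATE_BITS)
    assert len(slots) == 3
    bundle = template
    for i, slot in enumerate(slots):
        assert 0 <= slot < (1 << SLOT_BITS)
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    mask = (1 << SLOT_BITS) - 1
    slots = [(bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & mask
             for i in range(3)]
    return template, slots
```

With these widths, 5 + 3 x 41 = 128 bits, which is why three instructions and a template fit exactly into one bundle.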

Let's enumerate all combinations of three instructions in a bundle: 

    i1  ||  i2  ||  i3    - all instructions executed in parallel 
    i1  &  i2  ||  i3   - first  i1,  then i2 and i3 executed in parallel 
    i1  ||  i2  &  i3   - i1 and  i2 executed in parallel, then i3 
    i1  &  i2  &  i3  -  i1, i2, i3 executed serially 
A single bundle containing three instructions corresponds to a set of three functional units. IA-64 processors may contain different numbers of such sets and still run the same code. Indeed, assume an IA-64 processor has N sets of three functional units each; then, using the template information on dependencies between bundles, it can chain bundles into an instruction word N x 3 instructions (N bundles) long. This is how HP and Intel chose to provide scalability in IA-64. The concept is certainly elegant. Unfortunately, IA-64 is not absolutely perfect. 
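The four cases can be reproduced with a small sketch. It assumes the template boils down to two yes/no marks, one between i1 and i2 and one between i2 and i3, saying whether the later instruction must wait; the function name and this representation are hypothetical and are not the real template encoding.

```python
from itertools import product


def execution_groups(instrs, serial_after):
    """Split instructions into groups that may execute in parallel.

    serial_after[i] is True when instrs[i + 1] must wait for instrs[i]
    (written '&' in the list above) and False when they may run together
    (written '||').
    """
    groups = [[instrs[0]]]
    for name, serial in zip(instrs[1:], serial_after):
        if serial:
            groups.append([name])      # start a new, later group
        else:
            groups[-1].append(name)    # join the current parallel group
    return groups


# Enumerate all four combinations for one three-instruction bundle.
for marks in product([False, True], repeat=2):
    print(marks, execution_groups(["i1", "i2", "i3"], marks))
```

Two marks give 2 x 2 = 4 combinations, matching the four rows listed above.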
  • In his BYTE article ([7]) Tom R. Halfhill writes: "Successive generations of IA-64 processors will run older IA-64 software, but the software might not run at top speed until it's recompiled."
  • Jerry Huck notes that although IA-64 can scale to any number (N x 3) of functional units, the number of registers is fixed.
  • Jerry Huck also mentioned ([3]) that IA-64 code would be larger than RISC code. Three IA-64 instructions occupy 128 bits, while a RISC instruction is usually 32 bits long - four instructions per 128 bits.
In addition, there is some confusion here. In the second half of February 1998, at the Intel Developer Forum, principal engineer Carole Dulong said that in an architecture such as Merced the proportion of integer, floating-point, specialized and load-store units would be determined by the mix of these operations in the expected code streams. But on 14 Oct 1997 at the Microprocessor Forum ([3]) HP and Intel officials revealed that IA-64 family microprocessors would contain N replicated sets of three functional units, and it is natural to assume that such a set must contain one integer, one floating-point and one load/store unit. The two statements conflict. 
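The scaling idea, and Halfhill's caveat about recompilation, can be illustrated with a toy issue model. It is a sketch under strong assumptions: every bundle is taken to be internally parallel and to need one cycle, and the compiler's inter-bundle information is reduced to a hypothetical flag `parallel_with_next` per bundle. None of these names come from HP or Intel.

```python
def cycles_needed(parallel_with_next, n_sets):
    """Count issue cycles for a machine with n_sets sets of functional units.

    parallel_with_next[i] is True when bundle i may issue together with
    bundle i + 1. The machine chains at most n_sets consecutive bundles
    into one issue word per cycle.
    """
    total = len(parallel_with_next)
    cycles = 0
    i = 0
    while i < total:
        issued = 1
        # Keep chaining while hardware width and the compiler's marks allow.
        while (issued < n_sets
               and i + issued < total
               and parallel_with_next[i + issued - 1]):
            issued += 1
        i += issued
        cycles += 1
    return cycles


# Six bundles; the compiler found at most four independent bundles in a row.
flags = [True, True, True, False, True, False]
for n in (1, 2, 4, 8):
    print(n, "sets:", cycles_needed(flags, n), "cycles")
```

In this toy model the same binary runs on every width, but going from four to eight sets gains nothing, because the parallelism was fixed when the code was compiled.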

By the way, EPIC bears a striking resemblance to the VelociTI architecture of Texas Instruments' TMS320C6x. A good example is the TMS320C6201 DSP. The processor contains 32 general-purpose registers - not a small number. It has 8 functional units - a large number even compared with up-to-date superscalar processors. TMS320C6201 instructions are packed by the compiler into instruction words containing 8 instructions along with a template. The template indicates dependencies between instructions - explicit parallelism. Each instruction has a conditional field - predication. 

The IA-64 family is not the only upcoming VLIW-like design of a general-purpose CPU. For example, the E2k (Elbrus-2000) processor has been under development since 1992 at Elbrus, Russia. Elbrus's Chief Technology Officer, Associate Member of the Russian Academy of Science, Professor Boris Babaian says the processor will be two times faster than Merced's successor, McKinley. It is estimated that E2k running at 1.2GHz will deliver 135 SPECint95 and 350 SPECfp95. 

There are more examples: 

Moreover, it is becoming common today to implement VLIW in DSPs and media processors. 
Predication
  Predication is a method of handling conditional branches. The main idea is that the compiler schedules both possible paths of the branch to be executed on the processor simultaneously. This is feasible because EPIC processors are expected to have many functional units. 

When an IA-64 compiler finds a branch statement in the source code, it marks all the instructions that represent each path of the branch with a unique identifier called a predicate. Each instruction has a predicate field for this purpose. When the CPU encounters a predicated branch at run time, it begins executing the code along both destinations of the branch, but it does not store the results while the values of the predicate registers are undefined. After the condition is evaluated, the processor stores a 1 in the predicate register that corresponds to the "true" destination and a 0 in the other. Before storing the results, the CPU checks each instruction's predicate register. If the register contains a 1, the instruction is valid, so the CPU retires the instruction and stores the result. If the register contains a 0, the instruction is invalid, so the CPU discards the result. 
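The mechanism can be mimicked in software. The sketch below is only a model of the description above, applied to the statement "if a > b then x = a - b else x = b - a"; the register names `p1`, `p2` and `x` and the function name are hypothetical.

```python
def run_predicated(a, b):
    """Model predicated execution of: if a > b: x = a - b else: x = b - a."""
    regs = {"x": None}

    # Both paths are executed; each result is tagged with its predicate
    # register and held back until the predicates are known.
    pending = [
        ("p1", "x", a - b),   # instruction from the "then" path
        ("p2", "x", b - a),   # instruction from the "else" path
    ]

    # The compare sets two complementary predicate registers.
    pred = {"p1": int(a > b)}
    pred["p2"] = 1 - pred["p1"]

    # Retire results whose predicate is 1, discard the others.
    for p, dest, value in pending:
        if pred[p] == 1:
            regs[dest] = value
    return regs["x"]


print(run_predicated(7, 3), run_predicated(3, 7))
```

No branch is taken in the modelled code: the work of the unused path is simply thrown away, which pays off only when spare functional units are available.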

The ARM architecture from Advanced RISC Machines Ltd. (Cambridge, UK) has included a form of predication since its inception in the 1980s. By the way, Intel Corporation holds a license from Advanced RISC Machines to produce, sell and enhance the StrongARM microprocessor family (developed by Digital Equipment Corporation, which had licensed the ARM architecture). All instructions of the already mentioned TMS320 DSPs include conditional fields. Some HP PA-RISC instructions are predicated. 

When describing predication, HP and Intel representatives often mention the conference paper "A Comparison of Full and Partial Predicated Execution Support for ILP Processors" by Scott A. Mahlke, Richard E. Hank, James E. McCormick, David I. August, and Wen-mei W. Hwu of the IMPACT Research Group at the University of Illinois at Urbana-Champaign. The paper was published in the Proceedings of the 22nd International Symposium on Computer Architecture, Santa Margherita Ligure, Italy, June 1995. Some of its authors are now employed by HP. The study measured how effective predication is at increasing performance, using a hypothetical eight-wide machine. The authors found that, on average, about half of the branches could be predicated. 

Unfortunately, HP and Intel have not revealed how IA-64 processors will handle the other half of the conditional branches. 

Existing RISC processors use branch prediction and speculative execution along with predication. Their predictions are correct rather often - in 95% of cases. 
 

Speculative loading
  Processors sometimes sit idle while waiting for a load from relatively slow memory to complete. The speculative loading mechanism aims to reduce this idle time. 

Using this mechanism, the compiler can place a load as early as possible in the code, so that when an instruction needs the data from memory, the processor does not have to wait. Such relocated loads are called speculative loads and are marked in a special way. The compiler inserts a speculative load check instruction right before the instruction that uses the speculatively loaded data. If the load causes an exception, the exception is recognized only in the load's original "home block", when the processor encounters the speculative load check instruction. If, for example, the compiler hoists a load out of a branch that is never executed, the exception is ignored. 
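A software model may make the deferred exception easier to follow. This is a minimal sketch, not the IA-64 mechanism itself: memory is a Python dictionary, a missing key stands for a faulting address, and the names `DeferredFault`, `speculative_load` and `check` are invented for the example.

```python
class DeferredFault:
    """Token that stands in for an exception postponed by a speculative load."""

    def __init__(self, address):
        self.address = address


def speculative_load(memory, address):
    """Load early; on a bad address return a token instead of raising."""
    if address not in memory:
        return DeferredFault(address)
    return memory[address]


def check(value):
    """Speculative load check, placed in the load's original home block."""
    if isinstance(value, DeferredFault):
        raise MemoryError(f"fault at address {value.address:#x}")
    return value


def example(memory, pointer, take_branch):
    value = speculative_load(memory, pointer)   # hoisted above the branch
    if take_branch:
        return check(value) + 1                 # the fault surfaces only here
    return 0                                    # path not taken: fault ignored


memory = {0x10: 41}
print(example(memory, 0x10, True))    # valid address, data is used
print(example(memory, 0x99, False))   # bad address, but the data is never used
```

Calling `example(memory, 0x99, True)` raises `MemoryError` at the check, which models the exception being recognized in the home block rather than at the hoisted load.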

Usually processor designs implement L1, L2 and L3 caches to break dependence on memory latency. HP PA-8500, for example, has 1.5 Mbytes of L1 single-cycle cache on-chip. 

The instruction sets of both Sun UltraSPARC (SPARC version 9) and IBM POWER3 include prefetch instructions that explicitly tell the CPU to preload certain data and instructions into its L1 caches. HP PA-8xxx processors also implement speculative fetching of data. These prefetch instructions resemble the speculative loads described above, don't they? 

Other Merced-related facts 
 

Estimated performance 
 

  According to Intel Corporation's press release, Merced will provide industry-leading performance. More exact official estimates have not been announced yet. But Intel has since announced the 32-bit Foster (x86 architecture), which will equal Merced in floating-point performance. And even Merced's successor, McKinley, will be slower than Foster in 32-bit integer calculations. So Intel itself has said that Merced will not be a performance champion. 

MicroDesign Resources' analyst team expects Merced to operate at around 800 MHz and to deliver 45 SPECint95 and 70 SPECfp95. In x86 mode, Merced could match the performance of a 500MHz Pentium. Performance results for the 450MHz Pentium II are 17.2 SPECint95 and 12.9 SPECfp95. So Merced would run x86 code about 3-5 times slower than native code. 

The Alpha 21264 at 500MHz already delivers 27.7 SPECint95 and 58.7 SPECfp95. It is possible to run x86 code on Alpha using the FX!32 binary translator; performance drops by a factor of about 3 on average. 

By the way, in 1997 Intel Corporation bought several licenses for the Digital Alpha processor from Digital Equipment Corporation. Intel had to buy them to settle a lawsuit over the alleged illegal use of Digital Alpha technology in its products. Digital Alpha know-how has probably had a great influence on the forthcoming Merced design. 

D.H. Brown analyst Tony Iams says that the performance estimates he has seen show that UltraSPARC will still have the advantage over Merced in floating-point performance. Iams says that UltraSPARC and Merced are expected to be equal in integer performance. 
The estimated performance of UltraSPARC-III is about 35 SPECint95 and 60 SPECfp95 at 600 MHz. 

In general, Digital Alpha 21264, Sun UltraSPARC-III and IBM POWER3 are regarded as Merced's future competitors. 
But POWER3 and Alpha 21264 are already in production, and production of UltraSPARC-III is expected in 1999, while the first Merced is scheduled for production in 2000. 
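The slowdown estimate can be checked with a few lines of arithmetic. The figures are the ones quoted above; using the 450MHz Pentium II results as a stand-in for x86-mode performance, and scaling them linearly to 500 MHz, are assumptions made only for this sketch.

```python
# Figures quoted in the text (the Merced numbers are analyst estimates).
merced_native = {"SPECint95": 45.0, "SPECfp95": 70.0}
pentium2_450 = {"SPECint95": 17.2, "SPECfp95": 12.9}


def slowdown(native, x86_mode):
    """Ratio of native performance to x86-mode performance per benchmark."""
    return {name: native[name] / x86_mode[name] for name in native}


# Case 1: take the 450MHz results as they are.
ratios = slowdown(merced_native, pentium2_450)

# Case 2 (assumption): scale the results linearly with clock to 500 MHz.
scaled_500 = {name: score * 500 / 450 for name, score in pentium2_450.items()}
ratios_scaled = slowdown(merced_native, scaled_500)

for label, result in (("450MHz", ratios), ("500MHz scaled", ratios_scaled)):
    print(label, {name: round(value, 1) for name, value in result.items()})
```

Under these assumptions the ratios fall roughly between 2.4 and 5.4, with the integer ratio at the low end and the floating-point ratio at the high end, which is consistent with the rough 3-5 figure.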

Price
  It is estimated that Merced chips will sell for about $5,000 each.
64-bit
  In 2000 Merced would be the first 64-bit microprocessor developed by Intel. The very first 64-bit general-purpose microprocessor was the MIPS R4000, produced in 1992. Today MIPS processors are widely used in supercomputers, servers, workstations and even game consoles (e.g. Nintendo-64). Other 64-bit general-purpose microprocessors have also been in wide use for several years: Digital Alpha (1992), PowerPC-620 (1994), Sun UltraSPARC (1995), HP PA-RISC 2.0 (1996). 

Moreover UltraSPARC also contains a number of 128-bit registers. 

Operating frequency
  Linley Gwennap in [16] assumes the first Merced chip will operate at a frequency of about 800 MHz. The Digital Alpha 21164 operates at frequencies of up to 600 MHz; the 600 MHz Alpha 21164 has been in volume production since 1997. In October 1996 Exponential Technology's 750 MHz PowerPC was demonstrated. In February 1998 IBM Corporation demonstrated an experimental PowerPC operating at 1GHz.
Technology
  The processor will be produced on a 0.18 micron process, which is also currently under development at Intel Corporation. Shrinking the feature size reduces power dissipation, raises operating frequency and increases integration density. Higher integration density makes it possible to place more functional units, registers and cache on a processor. Currently all of the 64-bit microprocessors mentioned above are produced on 0.35 and 0.25 micron processes. Intel Corporation uses a 0.25 micron process to produce its 32-bit x86 processors. 

The first Merced will be a cartridge-style module, including a CPU, L2 cache and bus interface, said Merced director of marketing Ronald Curry. The cartridge will employ a newly defined system bus, using concepts from the Pentium-II bus. 
 

Compatibility
  Before the official Intel announcement in 1997 it was expected that the jointly developed HP-Intel architecture would provide source compatibility with the x86 and PA-RISC families. But it has now been disclosed that Merced, which implements this architecture, will run only software that currently operates on the x86 family. 

The EPIC and CISC concepts are opposites. While EPIC implies explicit parallelism (the compiler parallelizes and schedules the code), CISC implies implicit parallelism (on-chip parallelization and scheduling). Yet the two concepts are to be combined in the Merced design. That is strange. 

A Microprocessor Report article ([9]) analyzes an Intel patent application titled "Method and Apparatus for Transitioning Between Instruction Sets in a Processor". The application describes a processor, assumed to be Merced, that executes both x86 instructions and a second "64-bit instruction set," assumed to be IA-64. The Intel document describes a processor that can support operating systems and applications that use either or both instruction sets. The patent application includes a description of several instructions used to switch modes and share data between the two instruction sets. Linley Gwennap, the author of the article, writes: "In some places, the document gives the impression that Intel will treat IA-64 as simply a 64-bit extension to x86, much as when the 386 pioneered new 32-bit modes." 

In brief, the details of x86 compatibility are not clear. Only one thing is certain: Intel officials say Merced will be able to run x86 code.

Operating systems supporting Merced
  Sun Microsystems and Intel announced on 16 Dec 1997 that Sun will develop a version of its Solaris (UNIX dialect) operating system for Intel's Merced. Sun and Intel also announced a licensing agreement whereby the Merced-optimized Solaris will be licensed to other hardware companies. Vendors including Fujitsu, NCR, Siemens Nixdorf and Toshiba said they would use Solaris on their Intel-based products. The current version of Solaris is 7 (2.7), which is a fully 64-bit system. Merced should be supported starting with version 8 (2.8). 

Digital Equipment Corporation and Sequent are porting Digital UNIX to Merced. Tandem, Compaq and Sequent announced they would ship their Merced-based systems with Digital UNIX. Digital UNIX was the first 64-bit member of the UNIX family on the market; it became a 64-bit system in 1993. 

Hewlett-Packard is preparing its HP-UX (UNIX dialect) for Merced. The current version of HP-UX is 11.0, the first version with full 64-bit support. As noted above, HP is a co-developer of EPIC. HP licenses HP-UX to Hitachi, NEC and Stratus. 

Microsoft Corporation announced that the forthcoming Windows NT 5.0 would have a 64-bit variant for Merced. Unfortunately, Microsoft has no experience in 64-bit software development. By the way, Microsoft developed its first 32-bit system only 8 years after Intel's first 32-bit processor, the i386, had appeared. 

Silicon Graphics Inc. has reached agreement with Intel to port IRIX (UNIX dialect) to Merced. 

Novell, Inc. unveiled its plans to develop a new network operating system code-named Modesto. Novell says Modesto will take advantage of Intel's IA-64 processors while preserving backward compatibility with NetWare 5. 
 

Merced compilers
  Compilers for Merced are being developed by Intel, Hewlett-Packard, Microsoft, Metaware Inc. (Santa Cruz, Calif.) and Edinburgh Portable Compilers Ltd. (Edinburgh, UK). People from the Pentium Compiler Group plan to provide EPIC/IA-64 support for GCC.

On 9 Oct. 1997 it was announced that Intel had a complete IA-64 compatible software development environment running, and that key independent software vendors (ISVs) were using it to develop operating systems and enterprise-level applications. At the Intel Developer Forum (Sept. 15-17, 1998) Merced emulator software was demonstrated. 

HP has released the Trimaran System, an integrated compilation and performance monitoring infrastructure for research in instruction-level parallelism. Trimaran is a collaboration of three research groups: 

 
Summary
  EPIC has the same principal feature as VLIW - the compiler, not the processor, parallelizes the instruction stream. This approach has the following advantages: 
  • the processor architecture becomes simpler; more registers and functional units can be placed on an EPIC processor instead of hard-wired parallelizing logic
  • the processor does not waste time analyzing the instruction stream
  • the processor can analyze only a limited window of the program, while the compiler is able to analyze the whole program
  • if a program is to be executed repeatedly, it is advantageous to parallelize it once at compile time rather than at every run.
The disadvantages: 
  • A compiler analyzes the program only statically. The program is scheduled once and for all, but its execution path can change heavily with even a small alteration of the input data.
  • The complexity of the compiler increases drastically. Therefore the number of compiler errors and the compilation time increase badly too.
  • An even more complicated debugging tool is required to deal with aggressively optimized and parallelized machine code.
  • Merced's performance depends heavily on the quality of the compiler. IA-64 compilers are currently in development, and their quality is not known.
Developing a decent parallelizing compiler for IA-64 is probably a harder problem than designing Merced itself. Perhaps the only example of a successful commercial parallelizing (ILP) compiler today is TI's compiler for the TMS320 family of signal processors. That compiler has been in development for quite a long time. 

According to HP and Intel's announcements, architectural simplicity is one of EPIC's advantages. But IA-64 will support the complex instruction set of the x86 family. 

In x86 mode, an 800-MHz Merced is expected to match the performance of a hypothetical 500-MHz Pentium. Old software for the x86 family will therefore not use Merced efficiently. Running DOS or Windows on Merced is too expensive. Intel Corporation aims Merced at enterprise servers and high-end workstations. Processors of the x86 family have never been used for these purposes, so it is not clear why Merced is to support x86. 

Perhaps increasing the number of functional units is not as difficult for a RISC processor, and not as easy for EPIC, as the EPIC/IA-64 developers assume, especially as processors recognized as RISC already use many of the features to be implemented in the upcoming Merced. As already mentioned, classifying processors as RISC, CISC or VLIW is a kind of fiction. Up-to-date processors implement successful ideas from all of these concepts. 

A Microprocessor Report article dated January 26, 1998 suggests that many EPIC features can be added to existing RISC instruction sets using extension words; a retrofitted processor could execute current RISC binaries, but on programs compiled to take advantage of the new EPIC features, the processor could be as fast as or faster than IA-64 chips. 

HP and Intel have repeatedly claimed that Merced will be an implementation of the revolutionary EPIC concept. But some existing processors already show the major EPIC traits, e.g. TI's TMS320C6201 DSP (1997). 

Nevertheless, Merced is a very interesting experiment in VLIW design. It will certainly have a hard but interesting destiny. That is why HP and Intel are playing safe even as they call on the whole computer industry to make the transition to Merced. Just a few facts: Intel plans to continue its 32-bit x86 processor family. In addition, Intel bought several licenses for the famous Digital Alpha RISC processor from Digital Equipment Corporation. Hewlett-Packard, the co-developer of EPIC, is continuing to develop new members of the PA-RISC family. The PA-8500 is expected to appear in systems in the second half of 1998. It will be followed by the PA-8600, 8700, 8800 and 8900! 
 

Sources and Links
 
[1] New 64-Bit Processor Will Extend the Intel Architecture http://www.intel.com/pressroom/archive/releases/sp100997.HTM 

[2] Intel Notifies Customers Of Change In Merced Processor Schedule http://www.intel.com/pressroom/archive/releases/sp052998.htm 

[3] John Crawford, Intel, and Jerry Huck, HP: Motivations and Design Approach for the IA-64 64-Bit Instruction Set Architecture http://www.intel.com/pressroom/archive/speeches/mpf1097c.htm 

[3a] Slides of HP/Intel IA-64 presentation at Microprocessor Forum http://www.hp.com/esy/technology/ia_64/products/slides/index.htm 

[3b] 1997 Microprocessor Forum http://www.chipanalyst.com/q/@3720331wxyxzk/events/mpf/highlights.html 

[4] The Next Generation of Microprocessor Architecture: A 64-bit Instruction Set Architecture (ISA) Based on EPIC Technology http://www.intel.com/pressroom/archive/backgrnd/sp101497.HTM 

[5] HP and Intel Unveil Breakthrough EPIC Technology at Microprocessor Forum http://www.intel.com/pressroom/archive/releases/sp101497.HTM 

[6] Solaris on Merced: What's in it for Sun? by Robert McMillan, SunWorld, January 1998 http://www.sun.com/sunworldonline/swol-01-1998/swol-01-ia64.html 

[7] Beyond Pentium-II by Tom R. Halfhill, BYTE, December 1997 http://www.byte.com/art/9712/sec5/art1.htm 
 

Information on Merced from MicroDesign Resources http://www.chipanalyst.com/q/mpr/merced/ 
[8] IA-64 and Merced--What and Why by Peter Christy, MPR 12/30/96 http://www.chipanalyst.com/q/mpr/merced/1017vp.html 

[9] First Merced Patent Surfaces by Linley Gwennap, MPR 3/31/97 http://www.chipanalyst.com/q/mpr/merced/merced.html 

[10] Intel, HP Make EPIC Disclosure by Linley Gwennap, MicroDesign Resources http://www.chipanalyst.com/q/mpr/merced/v11_14.html 

[11] Intel's Merced and IA-64: Technology and Market Forecast by Linley Gwennap, MicroDesign Resources http://www.mdronline.com/q/tech_lib/IA64/index.html 
 

Other Links 
[12] IA-64 Overview from HP http://www.hp.com/esy/technology/ia_64/overview/index.html 

[13] IA-64 News from HP http://www.hp.com/esy/technology/ia_64/news/ 

[14] VLSI Microprocessors by Oleg Yu. Repin http://www.microprocessor.sscc.ru 

[15] The Russians Are Coming by Keith Diefendorff, Microprocessor Report 02/15/1999. Short version of the article is available at http://www.elbrus.ru/press/mprep-p1.html

[16] EPIC historical precedents by Mark Smotherman http://www.cs.clemson.edu/~mark/epic.html 

[17] Texas Instruments' Digital Signal Processing Solutions http://www.ti.com/sc/docs/dsps/products.htm 

[18] The VLIW project at IBM Research http://www.research.ibm.com/vliw/proj.html 

[19] The Word on VLIW by Dick Pountain, BYTE, April 1996 http://www.byte.com/art/9604/sec8/art3.htm 

[20] VLIW Questions by Peter Wayner, BYTE, November 1994 http://www.byte.com/art/9411/sec12/art1.htm 

[21] What is VLIW? BYTE, November 1994 http://www.byte.com/art/9411/sec12/art2.htm 

[22] Free On-Line Dictionary of Computing http://wombat.doc.ic.ac.uk/foldoc/index.html 
 

Glossary 
 

CISC - acronym from Complex Instruction Set Computer 
 

 
    CISC design aims at convenience for the programmer or compiler rather than at maximum performance. Each CISC instruction can perform several low-level operations such as memory access, arithmetic operations or address calculations. CISC design supports high-level languages by providing "high-level" instructions such as procedure call and return, loop instructions such as "decrement and branch if non-zero", and complex addressing modes that allow data structure and array accesses to be compiled into single instructions. CISC instructions are often implemented as microcode stored in the processor's ROM. The main CISC drawbacks are significant design complexity and low performance. 

    Examples of CISC processors are the Motorola 680x0 family and the Intel x86 family (IA-32). Both 680x0 and x86 are still popular. 

    The term CISC was coined in contrast to RISC. 
     

RISC - acronym from Reduced Instruction Set Computer
 
    The main features of the RISC concept are: 
    • identical instruction length
    • uniform instruction encoding (usually: op-code, destination register, two source registers)
    • only registers can be used as instruction operands
    • many general-purpose registers, which can be used by any instruction in any context
    • each instruction performs only a simple operation
    • pipelines
    • at least one instruction completes per cycle
    • simple addressing modes
    Examples of processors recognized as RISC: MIPS, SPARC, PowerPC, Digital Alpha, HP PA-RISC, Intel 960, AMD 29000. 

    The RISC concept gives the compiler more opportunities for optimization. RISC microprocessors now prevail. Their field of use is very wide - from microcontrollers to supercomputers. It is RISC microprocessors that achieve the highest levels of performance; today's industry performance leader is Digital Alpha. There are several standards for RISC architectures, often called Open Architectures. Among them are MIPS (current version is IV, R10000), SPARC (current version is 9, UltraSPARC) and PowerPC.

VLIW - acronym from Very Long Instruction Word
 
    VLIW describes a processor instruction set that implements horizontal microcode. Several (4 - 8) primitive instructions are packed by the compiler into a Very Long Instruction Word. This word corresponds to a set of functional units. VLIW may be classified as a static superscalar architecture. It is static in the sense that parallelism in the code is found by the compiler, not by the processor. As stated in Microprocessor Report (2/14/94): 
      The objective of VLIW is to eliminate the complicated instruction scheduling and parallel dispatch that occurs in most modern microprocessors. In theory, a VLIW processor should be faster and less expensive than a comparable RISC chip.
    VLIW is suited to exploiting instruction-level parallelism (ILP) in programs. Parallelism is explicit in VLIW machine code. 

    VLIW processors are not widely used. The most famous VLIW machine was built by Multiflow Computer, Inc., a company that is now defunct. Hewlett-Packard has many former Multiflow engineers on board. In Russia the VLIW-based Elbrus-3 supercomputer is well known. Perhaps the best contemporary example of a VLIW processor is TI's TMS320C6x DSP family. The VLIW effort at the IBM T.J. Watson Research Center started in 1986. 
     

 
About author
Alexei Pylkin obtained a Bachelor of Science degree in Mathematics from the State Technical University of Novosibirsk, Russia in 1997. Despite his young age, he has a perfect record of successful projects, mainly in the fields of parallel computing and supercomputing. In 1997 he developed a new programming language for representing parallel algorithms and implemented a compiler for it. 
Alexei Pylkin is currently working toward his master's thesis at the Supercomputer Software Department RAS. The goal of his master's project is to develop a full-featured software tool for automated composition of parallel algorithms. Alexei sees his project as a way to broaden the applicability of high-performance computing methods in research and industry. 
Alexei Pylkin can be reached by e-mail at pylkin@ssd.sscc.ru
The document was last updated on 15 Mar 1999.
Copyright (C) 1998, 1999 Alexei Pylkin, Supercomputer Software Department RAS