
International Journal of Emerging Technology and Advanced Engineering

Website: www.ijetae.com (ISSN 2250-2459,ISO 9001:2008 Certified Journal, Volume 3, Issue 5, May 2013)


Review Paper on Crusoe Processor

Anuradha G. Deshmukh¹, Dr. Ulahas B. Shinde²

¹Student of electronics department, S.P.W.E.C, Aurangabad

Abstract: Mobile computing has been the buzzword for quite a long time. Computing devices such as laptops and desktop computers are very important in day-to-day life, and each requires a microprocessor, which is the heart of the device. Desktop processors face very different demands, so they use a large number of transistors; those transistors consume more power, and the extra power heats the processor. Such a processor is called a power-hungry processor. A hot processor requires a cooling fan, which makes the hardware noisier and bulkier. A mobile processor must therefore strike a proper 'performance-power' balance to ensure commercial success. Crusoe is a microprocessor designed especially for the mobile computing market. The Crusoe technology reviewed in this paper is fundamentally software based: the power savings come from replacing large numbers of transistors with software, and the operating frequency is increased to the range of 1 GHz to 1.7 GHz. The design combines Code Morphing Software (CMS) with a Very Long Instruction Word (VLIW) hardware core.

Keywords: CMS - Code Morphing Software, OS-independent software that acts as an interpreter on top of the processor, optimizing instructions and making calculated decisions to enhance the processor's performance,

Interpreter - module that interprets x86 instructions,

Rollback - when a fault is encountered, the state of the working registers is rolled back to the state held in the shadow registers,

Translator - module that recompiles the x86 instructions into optimized VLIW instructions,

VLIW - Very Long Instruction Word CPU that executes up to four operations in each clock cycle.

I. INTRODUCTION

Several problems are faced in designing a processor: energy efficiency, compatibility, and performance. Energy efficiency is very important in mobile devices, where the focus is on battery life versus performance. Ref.[2] To address this problem, the Crusoe processor is designed with far fewer transistors, and with fewer transistors the heat dissipation is also lower. Ref.[1] Transmeta's Crusoe x86 processor is capable of running at peak performance with low power requirements. This allows for a dense arrangement of CPUs with minimal cooling requirements, while achieving competitive performance per watt.

Ref.[3] High-performance processors have traditionally relied mainly on clock frequency and superscalar instruction issue to boost performance. As both techniques continue to be pushed, however, they yield diminishing gains. Transmeta Corp. (reported from San Mateo, Calif.) planned to formally unveil on Jan. 6 its intention to aim the x86-compatible Crusoe processor at the embedded market, where its relatively small size and low power consumption give it an edge as it competes against Intel Corp. for design wins and profitability. Ref.[4] The Crusoe Smart Embedded (SE) processors are versions of Transmeta's existing Crusoe 5500 and 5800 microprocessors that have undergone a 24-hour burn-in testing process that rates them for 10 years of operation at temperatures up to 100°C. Transmeta is also guaranteeing five years of availability and support. Ref.[4] The company's embedded effort started when Matt Perry, who managed the Maverick MP3 chip group at Cirrus Logic Inc., joined Transmeta as president in April 2002. "It's been under the covers since Matt arrived. He had more of an embedded systems background," said Tom Lee, director for embedded business development at Transmeta (Santa Clara, Calif.).

The SE chips are sampling now at speeds of 667 MHz, 800 MHz and 933 MHz, at a cost of $50 each in quantities of 1,000. The 667-MHz Crusoe SE is essentially the current 5500 processor, and the 800-MHz and 933-MHz SEs are the 5800 parts. The chips come in standard and low-power versions. The standard 667-MHz CPU consumes 6.1 W, while the low-power version consumes 5.1 W. The company is currently developing a version of the Crusoe's internal software to boost real-time performance; those parts will ship later this year. Transmeta would not reveal target latency figures for those processors, but Lee said he does not expect they will meet so-called hard real-time requirements that demand latency as low as 20 milliseconds.

In this paper we sketch the structure of the VLIW core and the CMS, and discuss the challenge of minimizing power consumption while balancing battery life against performance.

II. THE CRUSOE TECHNOLOGY

A. Very long instruction word


The microprocessor consists of hardware, the VLIW core, and software, the CMS. The CMS provides compatibility with the x86 architecture on top of a VLIW engine with two integer units, a floating-point unit, a memory unit, and a branch unit. A Crusoe long instruction word is called a molecule; it is 64 or 128 bits long and contains up to four RISC-like instructions, called atoms. All atoms within a molecule are executed in parallel, and the molecule format directly determines how atoms are routed to functional units; this greatly simplifies the decode and dispatch hardware.

Figure 1. A molecule can contain up to four atoms, which are executed in parallel.
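
The way atoms are grouped into molecules can be illustrated with a minimal sketch. It is not Transmeta's actual encoding or scheduler: the `Atom` type, the unit names, the register names and the greedy in-order policy are all assumptions made for illustration. Only the limit of four atoms per molecule and the five functional units come from the text.

```python
from dataclasses import dataclass

# Assumed unit inventory: two integer ALUs plus one floating-point,
# one memory and one branch unit (the names are illustrative).
UNITS = {"alu": 2, "fpu": 1, "mem": 1, "br": 1}
MAX_ATOMS = 4  # a molecule holds at most four atoms


@dataclass
class Atom:
    op: str
    unit: str          # kind of functional unit the atom needs
    dst: str = ""      # register written ("" if none)
    srcs: tuple = ()   # registers read


def pack(atoms):
    """Greedy, in-order packing of atoms into molecules."""
    molecules, current, used, written = [], [], {}, set()
    for a in atoms:
        full = len(current) == MAX_ATOMS
        no_unit = used.get(a.unit, 0) >= UNITS[a.unit]
        # Atoms in one molecule run in parallel, so an atom may not read or
        # overwrite a register written earlier in the same molecule.
        depends = any(s in written for s in a.srcs) or (
            bool(a.dst) and a.dst in written
        )
        if current and (full or no_unit or depends):
            molecules.append(current)
            current, used, written = [], {}, set()
        current.append(a)
        used[a.unit] = used.get(a.unit, 0) + 1
        if a.dst:
            written.add(a.dst)
    if current:
        molecules.append(current)
    return molecules


atoms = [
    Atom("ld", "mem", "r30", ("esp",)),
    Atom("add", "alu", "eax", ("eax", "r30")),  # needs r30: starts molecule 2
    Atom("ld", "mem", "r31", ("esp",)),
    Atom("sub", "alu", "ecx", ("ecx",)),
]
sizes = [len(m) for m in pack(atoms)]  # [1, 3]
```

In this example the first load sits alone in its molecule because the following add depends on its result; the remaining three atoms are independent and share the second molecule.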


Figure 2 Conventional superscalar out-of-order CPUs use hardware to create and dispatch micro-ops that can execute in parallel.

The integer register file has 64 registers, %r0 through %r63, some of which are allocated to hold x86 state while others hold state internal to the system or serve as temporary registers. Suppose we want to write a simple statement in the C language:

p = q*r - s/t + u*v - w%x

Figure 3 Execution tree

Ref.[5] The aim of parallel processing is to execute multiple operations simultaneously on independent hardware units to reduce the overall execution time of a program. Superscalar processors take one approach to this, finding the parallelism in hardware at run time, while VLIW processors take another, encoding it explicitly in the instruction word.
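
The parallelism in the statement p = q*r - s/t + u*v - w%x can be made concrete with a small sketch that assigns each operation to the earliest step at which its operands are ready. The temporary names t1 to t6 and the `schedule` helper are assumptions for illustration, and the sketch ignores how many functional units are actually available.

```python
# Each operation: name -> (operator, left operand, right operand).
# Operands are input variables or the names of earlier operations.
# The grouping follows C's left-to-right evaluation of + and -.
OPS = {
    "t1": ("*", "q", "r"),
    "t2": ("/", "s", "t"),
    "t3": ("*", "u", "v"),
    "t4": ("%", "w", "x"),
    "t5": ("-", "t1", "t2"),
    "t6": ("+", "t5", "t3"),
    "p": ("-", "t6", "t4"),
}


def schedule(ops):
    """Return a list of steps; operations in one step are independent."""
    level = {}
    for name, (_, a, b) in ops.items():  # ops are listed in dependency order
        # Input variables are not in `level`, so they count as ready (0).
        level[name] = 1 + max(level.get(a, 0), level.get(b, 0))
    steps = [[] for _ in range(max(level.values()))]
    for name, lv in level.items():
        steps[lv - 1].append(name)
    return steps


steps = schedule(OPS)
# [['t1', 't2', 't3', 't4'], ['t5'], ['t6'], ['p']]
```

The seven operations fit into four steps instead of seven, because the two multiplications, the division and the remainder do not depend on one another.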

B. The Code Morphing Software


The CMS packs either two or four atoms into a VLIW molecule (Figure 4), which is then executed in the pipeline. Ref.[7] The Transmeta Crusoe processor TM5800 is a VLIW processor; each instruction (called a molecule) issues two or four RISC operations to a subset of five functional units: a floating-point unit, two ALUs, a branch unit, and a memory unit.

Figure 4. CMS representation

C. The Interpreter

The Code Morphing software contains an Interpreter module that interprets x86 instructions one at a time, much like a traditional microprocessor. The Interpreter keeps infrequently executed code from being needlessly optimized, and it gathers run-time statistical information about the x86 instructions it sees in order to determine whether optimizations are necessary.
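
The filtering role of the Interpreter can be sketched with a simple execution counter. This is only an illustration under stated assumptions: the `Interpreter` class, the per-address counting and the threshold value are hypothetical, and the text does not say what statistics the real CMS gathers or when it decides to optimize.

```python
from collections import Counter

HOT_THRESHOLD = 3  # assumed value, chosen only for this example


class Interpreter:
    def __init__(self):
        self.counts = Counter()  # executions seen per x86 code address

    def execute(self, address):
        """Interpret the code at `address`; report whether it is now hot."""
        self.counts[address] += 1
        return self.counts[address] >= HOT_THRESHOLD


interp = Interpreter()
seen = [interp.execute(0x1000) for _ in range(4)]
# [False, False, True, True]
```

Code that runs only once or twice never crosses the threshold, so it is interpreted and never handed on for optimization.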

Figure 5. Interpreter

D. The Translator

Upon detecting critical, frequently used x86 instruction sequences, the Code Morphing software invokes a Translator module that recompiles the x86 instructions into optimized VLIW instructions, called "Translations". The native translations reduce the number of instructions executed and result in better performance. Further efficiency comes from saving the translations in memory that is inaccessible to normal x86 code. This special memory area is named the "Translation Cache"; it allows the Code Morphing software to re-use translations and eliminate redundancies. Upon encountering previously translated x86 instruction sequences, the Code Morphing software skips the translation process and executes the cached translation directly out of the Translation Cache. The Code Morphing software matches repeated executions with entries in the Translation Cache, and the optimized translation is executed at full speed with minimum overhead. The initial cost of the translation is amortized over repeated executions.
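
The re-use of translations can be sketched as a lookup table keyed by the address of the x86 code. The `TranslationCache` class, its `translate` callback and the string used as a stand-in for translated code are assumptions for illustration; the real cache lives in memory hidden from x86 code and is managed by the CMS itself.

```python
class TranslationCache:
    def __init__(self, translate):
        self.translate = translate   # hypothetical translator callback
        self.cache = {}              # x86 address -> translated code
        self.translations_made = 0

    def lookup(self, address, x86_code):
        if address not in self.cache:      # miss: pay the translation cost once
            self.cache[address] = self.translate(x86_code)
            self.translations_made += 1
        return self.cache[address]         # hit: reuse the stored translation


tc = TranslationCache(lambda code: f"vliw<{code}>")
for _ in range(1000):
    native = tc.lookup(0x1000, "addl %eax,(%esp)")
```

After the loop, `tc.translations_made` is 1: the sequence was translated on the first execution and the other 999 executions re-used the cached result, which is how the initial cost is amortized.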

Figure 6. The translator

During execution, CMS guarantees correct operation through careful scheduling, eliminating hazards by inserting no-ops where necessary to stall execution.

Figure 7. Translation Cache


However, in the case of memory-mapped I/O, CMS cannot distinguish at translation time whether an instruction is a memory access or an I/O access; it has to wait until run time. When handling I/O within a program, CMS keeps the instructions in the order in which they appear in the original code, since I/O operations cannot be rolled back. The CMS resides in a ROM and is the first program to start executing when the processor boots. It translates an entire group of x86 instructions at once, creating an optimized translation (whereas a superscalar x86 processor translates single instructions in isolation). Moreover, while a traditional x86 processor translates each instruction every time it is executed, on a Crusoe instructions are translated once and the resulting translation is saved in a translation cache, making use of the locality of reference of code. The next time the (already translated) x86 code is executed, the system skips the translation step and directly executes the existing optimized translation. Crusoe hardware, in comparison with other x86 processors, can achieve excellent performance in dynamic translation because it has been specifically designed with dynamic translation in mind.

E. Advantages of code morphing technique

Code morphing software gives the Crusoe processor several advantages over traditional processors. With conventional microprocessor designs approaching 40 million transistors, managing heat and power consumption is now one of the industry's biggest challenges. Every on-off switching of a transistor requires a bit of energy; the CMS avoids much of this by replacing logic transistors with software, so heat generation is much lower.

III. CRUSOE PROCESSOR ARCHITECTURE


The Crusoe microprocessor is available in the market in the following versions: TM3120, TM3200, TM5400 and TM5600. The basic architecture of all these models is the same except for minor changes, since the various models were introduced for different segments of the mobile computing market.

Figure 8. Architecture of Crusoe processor Model TM5800 CPU Core


The Crusoe processor includes an 8-way set-associative Level 1 (L1) instruction cache and a 16-way set-associative L1 data cache. It also includes an integrated Level 2 (L2) write-back cache for improved effective memory bandwidth and enhanced performance. This cache architecture assures maximum internal memory bandwidth for performance-intensive mobile applications, while maintaining the same low-power implementation that provides a superior performance-to-power-consumption ratio relative to previous x86 implementations. Apart from having execution hardware for logical, arithmetic, shift, and floating-point instructions, as in conventional processors, the Crusoe differs markedly from traditional x86 designs. To ease the translation process from x86 to the core VLIW instruction set, the hardware generates the same condition codes as conventional x86 processors and operates on the same 80-bit floating-point numbers.

A. Integrated DDR SDRAM Memory Controller

The DDR SDRAM interface is the highest-performance memory interface available on the Crusoe. The DDR SDRAM controller supports only Double Data Rate (DDR) SDRAM and transfers data at a rate that is twice the clock frequency of the interface. This feature is absent in the model TM3200. The DDR SDRAM controller supports up to four banks, the equivalent of two Dual In-line Memory Modules (DIMMs), of DDR SDRAM using a 64-bit wide interface.
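
The effect of transferring data at twice the interface clock can be shown with a short peak-bandwidth calculation. The 133 MHz clock used below is an assumed example value, not a figure from the text, and the helper name is hypothetical; only the 64-bit width and the factor of two come from the description above.

```python
def peak_bandwidth_bytes_per_s(clock_hz, bus_bits=64, transfers_per_clock=2):
    """Theoretical peak transfer rate of a memory interface."""
    return clock_hz * transfers_per_clock * bus_bits // 8


CLOCK_HZ = 133_000_000  # assumed interface clock, for illustration only

ddr = peak_bandwidth_bytes_per_s(CLOCK_HZ)                         # two transfers per clock
sdr = peak_bandwidth_bytes_per_s(CLOCK_HZ, transfers_per_clock=1)  # one transfer per clock
```

At the assumed clock this gives 2,128,000,000 bytes/s for DDR against 1,064,000,000 bytes/s for a single-data-rate interface of the same width. These are theoretical peaks; sustained bandwidth is lower.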

B. Integrated SDR SDRAM Memory Controller

The SDR SDRAM memory controller supports up to four banks, equivalent to two Small Outline Dual In-line Memory Modules (SO-DIMMs), of Single Data Rate (SDR) SDRAM that can be configured as 64-bit or 72-bit SO-DIMMs. These SO-DIMMs can be populated with 64M-bit, 128M-bit or 256M-bit devices. All SO-DIMMs must use SDRAMs of the same frequency, but there are no restrictions on mixing different SO-DIMM configurations across the SO-DIMM slots. The frequency setting for the SDR SDRAM interface is initialized during the power-on boot sequence.

IV. WORKING OF CRUSOE PROCESSOR

VLIW is a technique that combines multiple standard instructions into one long instruction word. This word contains instructions that can be executed at the same time on separate chips or on different parts of the same chip. It provides explicit parallelism, i.e. executing more than one basic (primitive) instruction at a time. With VLIW, the compiler, not the chip, determines which instructions can be run concurrently.

This is an advantage because the compiler knows more about the program than the chip does by the time the code gets to the chip. Trace scheduling is an important technique in VLIW processing: the compiler processes the code, determines which path is the most frequently traveled, optimizes that path, and then rejoins it with the other basic blocks using split and rejoin blocks. The goal of VLIW is to eliminate complicated instruction scheduling in hardware.

A. Making a Translation

Code morphing system translates a chunk of x86 code into equivalent code for the Crusoe processor’s VLIW engine. Assume that the filtering and path selection algorithms have chosen the following four x86 instructions, (A) through (D), for translation.

A. addl %eax,(%esp) // load data from stack, add to %eax
B. addl %ebx,(%esp) // ditto, for %ebx
C. movl %esi,(%ebp) // load %esi from memory
D. subl %ecx,5 // subtract 5 from %ecx register

In the first pass, the translator decodes the x86 instructions and translates them into a simple sequence of atoms. The CMS then packs up to four atoms into one molecule, which is why execution takes very little time.
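
The first pass over instructions (A) through (D) can be sketched as follows. This is an illustration, not the CMS decoder: the `first_pass` function, the atom mnemonics (`ld`, `add`, `sub`) and the choice of %r30 and %r31 as temporaries are assumptions. As in the listing above, the first operand is treated as the destination.

```python
import re


def first_pass(x86_lines, first_temp=30):
    """Expand 'op dst,src' lines, as written in the text, into atoms."""
    atoms, temp = [], first_temp
    for line in x86_lines:
        op, operands = line.split(None, 1)
        dst, src = [s.strip() for s in operands.split(",")]
        if op.endswith("l"):           # drop the 32-bit size suffix: addl -> add
            op = op[:-1]
        mem = re.fullmatch(r"\((%\w+)\)", src)
        if op == "mov" and mem:        # plain load from memory
            atoms.append(f"ld {dst},[{mem.group(1)}]")
        elif mem:                      # ALU operation with a memory operand
            t = f"%r{temp}"            # hypothetical temporary register
            temp += 1
            atoms.append(f"ld {t},[{mem.group(1)}]")
            atoms.append(f"{op} {dst},{dst},{t}")
        else:                          # register or immediate operand
            atoms.append(f"{op} {dst},{dst},{src}")
    return atoms


x86 = [
    "addl %eax,(%esp)",
    "addl %ebx,(%esp)",
    "movl %esi,(%ebp)",
    "subl %ecx,5",
]
atoms = first_pass(x86)
```

Under these assumptions the four x86 instructions expand into six atoms, because each add with a memory operand becomes a separate load followed by a register-to-register add.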

According to the information on the Crusoe processor published in international journals, the processor operates at 900 MHz, and Crusoe is a 128-bit processor.


The first generation of the new chip (TM8600) was manufactured using a TSMC 130 nm process and produced at speeds up to 1.1 GHz. The second generation (TM8800 and TM8820) was manufactured using a Fujitsu 90 nm process and produced at speeds ranging from 1 GHz to 1.7 GHz. The hardware portion of the new chip was increased from 128 bits to 256 bits, enabling it to handle eight instructions per clock, twice the previous number.

REFERENCES

[1] Dean Gaudet, "Dense Computing with Transmeta's Crusoe", Transmeta, Inc., 2001 IEEE International Conference on Cluster Computing (CLUSTER'01), 0-7695-1116-3/02 $17.00 © 2002 IEEE.

[2] Linley Gwennap, Transmeta Rewrites the Rules, Linux Journal, April 2002, 11-20.

[3] I. Kadayif, M. Kandemir, I. Kolcu, "Exploiting Processor Workload Heterogeneity for Reducing Energy Consumption in Chip Multiprocessors", Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE'04), 1530-1591/04 $20.00 © 2004 IEEE.

[4] Florin Danu, "The Crusoe Processor", Villanova University, 800 Lancaster Ave., Villanova, PA 19086, USA. Transmeta Corporation, The Versatility of Crusoe.

[5] T. M. Conte, "Superscalar and VLIW Processors", in Handbook of Parallel and Distributed Computing, McGraw-Hill, New York, 1995.

[6] Alexander Wolfe, "The Software Side of Crusoe", Embedded Systems Programming Site, 2000.

[7] James C. Dehnert, Brian K. Grant, John P. Banning, Richard Johnson, Thomas Kistler, Alexander Klaiber, Jim Mattson, "The
