
Czech Technical University in Prague

Faculty of Electrical Engineering

Department of Computer Science and Engineering

Master's Thesis

Design of an Extensible Processor

Bc. Michal Prokš

Supervisor: Dr. Ing. Martin Novotný

Study Programme: Electrical Engineering and Information Technology

Field of Study: Computer Science and Engineering

May 11, 2012




I would like to thank my girlfriend, the supervisor of this work and all the folks at the Department of Digital Design at FIT CTU for their wonderful support during my studies and while working on this thesis.

Declaration

I hereby declare that I have completed this thesis independently and that I have listed all the literature and publications used.

I have no objection to usage of this work in compliance with §60 of Act No. 121/2000 Coll. (copyright law), and with the rights connected with the copyright act including the changes in the act.

In Prague on May 11, 2012

Abstract

This work presents the design and implementation of a processor based on a reduced MIPS32 architecture on an FPGA. The instruction set of this processor can be extended by custom coprocessors. The processor implements only the part of the MIPS32 instruction set necessary for this work.

Abstrakt

This work describes the design and implementation of a processor based on a reduced MIPS32 instruction set on an FPGA. The instruction set of this processor is extensible by means of user coprocessors. The processor implements only the subset of MIPS32 instructions needed for this work.

Contents

1 Introduction
1.1 Field Programmable Gate Array
1.2 Structure of This Work
2 Analysis
2.1 Interface Requirements
2.2 Existing Coprocessor Interfaces

Chapter 1 Introduction

The mainstream central processing units of contemporary computers are designed with the common case in mind. The main goal of microprocessor design is usually to maximize the performance of the most common software. Desktop, server and embedded software heavily utilize integer arithmetic where the integers fit in processor registers[2].

In the last two decades some desktop and server software also began to use floating point computation. CPU (Central Processing Unit) vendors responded to this demand by introducing optional coprocessors designed for hardware acceleration of floating point operations[6].

At first these floating point coprocessors were located on a separate chip and in a separate package. The main processor was designed so that it could perform its functions with or without the coprocessor. Floating point coprocessors usually had their own instructions interleaved with the processor's instruction stream. The processor was aware of the coprocessor instructions and passed them to the coprocessor for execution. When the coprocessor was not present in the system, the processor would usually raise an exception while decoding the coprocessor instruction. This allowed the operating system to handle these exceptions and emulate coprocessor instructions in software.

As the demand for floating point calculations rose, the floating point coprocessors began to be integrated on the same chip as the processor itself. The integration has gone so far that on most architectures the floating point coprocessor became just a specialized execution unit within the instruction pipeline[8].

While floating point computation is the most popular example of coprocessor acceleration, it is hardly the only one. There are many more areas where the core algorithms can achieve significantly higher performance by utilizing specialized hardware. Good examples are CRC (Cyclic Redundancy Checksum) computations, cryptographic algorithms or more complex mathematical operations not usually implemented in floating point execution units.

From the programmer's point of view it would be good to have all the possible (or at least all the useful) operations implemented in the CPU. However, the execution units required for these special operations would consume chip area that could otherwise be used to accelerate the more common operations, which would result in a higher overall performance gain. The design of more complex CPUs would also raise their prices, and most users would have to pay for the design of specialized hardware which they are never going to use. This is the reason why the more specialized operations are usually implemented on separate coprocessors.

The use of the coprocessor also promotes modular design of the whole hardware system. With coprocessors the CPU instruction set can be extended without any modification in the design of the processor itself. This means that simple, yet powerful, coprocessors can be designed relatively fast as a response to market demand.

Coprocessors are particularly popular in the area of processors implemented on FPGAs (Field Programmable Gate Arrays). A processor vendor provides a relatively complex CPU suitable for implementation on an FPGA. This CPU usually comes with a complete software build and debug environment. The user can then easily extend the computation capabilities of the whole system by supplementing the processor with their own logic.

The goal of this work is to design a simple processor with a coprocessor interface. The coprocessor interface will be integrated into the processor pipeline in a way that will allow the coprocessor to read and write processor registers and influence the processor's program flow.

The result of this work will be used for further research in the area of algorithm acceleration using custom coprocessors. These coprocessors could be generated from a simple algorithmic description that could be either written by the user or automatically extracted from the software source code.

Since the ultimate goal of this work is algorithm acceleration, the coprocessor interface must be implemented in a way that allows coprocessor instructions to be executed as fast as possible. This may require some trade-offs between design simplicity and execution latency.

1.1 Field Programmable Gate Array

The FPGA is an integrated circuit used for implementation of custom logic. Most of the FPGA area is dedicated to small logic cells that can be programmed to implement a simple logic function. Each of these cells can implement a function of a few logic gates. These cells are connected via a programmable routing matrix to form more complex logic functions.

The FPGAs of the last few generations are large enough to implement a full processor. The largest contemporary FPGAs can fit up to several hundred simple processors.

The FPGA can be fully or partially reprogrammed in a matter of seconds or even milliseconds. This makes it an ideal platform for implementation of custom application accelerators.
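The idea of a programmable logic cell can be illustrated with a small software model. The sketch below assumes that a cell behaves like a look-up table (a common way to build such cells, though the text above does not specify the cell structure); the names `make_lut`, `xor2`, `and2` and `half_adder` are hypothetical and chosen only for this illustration.

```python
# Illustrative model only: a logic cell treated as a k-input look-up table.
# This is an assumption for the example, not a description of any real device.

def make_lut(truth_table):
    """Return a function that computes the logic function in truth_table.

    truth_table[i] is the output bit for the input combination whose bits,
    read as a binary number (first input is most significant), equal i.
    """
    size = len(truth_table)
    if size == 0 or size & (size - 1):
        raise ValueError("truth table length must be a power of two")
    k = size.bit_length() - 1  # number of inputs

    def lut(*inputs):
        if len(inputs) != k:
            raise ValueError(f"expected {k} inputs")
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return truth_table[index]

    return lut


# "Programming" two cells: a 2-input XOR and a 2-input AND.
xor2 = make_lut([0, 1, 1, 0])
and2 = make_lut([0, 0, 0, 1])


def half_adder(a, b):
    """Connecting two cells yields a larger function: (sum, carry)."""
    return xor2(a, b), and2(a, b)
```

Reprogramming the device corresponds here to calling `make_lut` with a different truth table; the wiring in `half_adder` stands in for the routing matrix.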

1.2 Structure of This Work

Chapter 2 briefly describes implementations of a coprocessor interface in two existing FPGA soft core processors. This chapter then analyses possible features of a new coprocessor interface.

Chapter 3 documents results of the implementation of the processor and its interfaces.

Chapter 4 describes all the tests that were used for verification of the implemented processor and coprocessors.

Chapter 5 sums up all the implementation and verification results.

Chapter 2 Analysis

This chapter gives an overview of the design of a coprocessor interface.

2.1 Interface Requirements

The coprocessor interface has the following requirements:

• The coprocessor must be able to read processor registers.

• The coprocessor must be able to write processor registers.

• The coprocessor must be able to stall the processor pipeline.

• The processor must be able to flush in-flight coprocessor instructions.

• The coprocessor should be able to force a processor jump.

• The coprocessor instructions must be executed as part of the program's instruction stream.
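The requirements above can be read as a small contract between processor and coprocessor. The following is a minimal behavioural sketch of one way such a contract might look, not the interface designed in this work; `CoprocessorResponse`, `IncrementCoprocessor` and all field and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CoprocessorResponse:
    """What a coprocessor may ask of the processor in one clock cycle."""
    stall: bool = False                # hold the processor pipeline
    write_reg: Optional[int] = None    # register index to write, if any
    write_value: int = 0               # value for write_reg
    jump_target: Optional[int] = None  # force a jump, if set


class IncrementCoprocessor:
    """Toy coprocessor: rd = rs + 1, delivered one cycle after issue."""

    def __init__(self):
        self.pending = None  # (rd, value) of the in-flight instruction

    def issue(self, rd, rs_value):
        # The instruction arrives from the normal instruction stream,
        # together with the value read from a processor register.
        self.pending = (rd, (rs_value + 1) & 0xFFFFFFFF)
        return CoprocessorResponse(stall=True)

    def tick(self):
        # One clock cycle later the result is written back.
        if self.pending is None:
            return CoprocessorResponse()
        rd, value = self.pending
        self.pending = None
        return CoprocessorResponse(write_reg=rd, write_value=value)

    def flush(self):
        # The processor cancels in-flight work, e.g. after a taken branch.
        self.pending = None
```

In this model register reads are passed in by the processor at `issue`, while writes, stalls and jumps are requested through the returned `CoprocessorResponse`; `flush` is the only call that discards state.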

2.2 Existing Coprocessor Interfaces

There are multiple processors with coprocessor support. Most of the time these processors use a proprietary coprocessor interface with closed documentation. Both Xilinx and Altera provide their own soft core processors with coprocessor support.

2.2.1 Microblaze Coprocessor Interface

Microblaze is a soft core embedded microprocessor provided by Xilinx. The license for this processor is part of the EDK (Xilinx Embedded Development Kit) license.

The processor communicates with the coprocessor(s) via FSL (Fast Simplex Link). Each FSL is a one-way queue of 33-bit words: 32 bits are usually used for data and a single bit is reserved for control.

There are special instructions to read data from or write data to an FSL. There are several variants of each of these instructions. For example, a write can be blocking or non-blocking, which changes the behaviour when the FSL FIFO memory is full. With the non-blocking version an error state is set and the processor immediately continues with the next instruction. With the blocking variant the processor stalls its pipeline until the data can be written to the FIFO.

This type of interface allows asynchronous execution of coprocessor operations. The processor can queue operation requests and then continue executing other code while the coprocessor produces results.
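The difference between the blocking and non-blocking write can be shown with a toy software model of a one-way FIFO. This is only a sketch of the behaviour described above, not the Microblaze instruction set or the FSL signal protocol; the class `SimplexLink`, its methods and the `drain` callback are hypothetical.

```python
from collections import deque


class SimplexLink:
    """Toy one-way FIFO of 33-bit words: 32 data bits plus 1 control bit."""

    def __init__(self, depth):
        self.depth = depth
        self.fifo = deque()

    def full(self):
        return len(self.fifo) >= self.depth

    def _push(self, data, control):
        self.fifo.append(((control & 1) << 32) | (data & 0xFFFFFFFF))

    def write_nonblocking(self, data, control=0):
        """Never waits; returns False (the error state) if the FIFO is full."""
        if self.full():
            return False
        self._push(data, control)
        return True

    def write_blocking(self, data, control=0, drain=None):
        """Wait until there is room; return the number of stall cycles.

        drain is a hypothetical callback standing in for the coprocessor
        consuming one word per cycle.
        """
        stalls = 0
        while self.full():
            if drain is None:
                raise RuntimeError("FIFO full and nothing drains it")
            drain(self)
            stalls += 1
        self._push(data, control)
        return stalls

    def read(self):
        """Return (data, control) of the oldest word."""
        word = self.fifo.popleft()
        return word & 0xFFFFFFFF, word >> 32
```

With a FIFO of depth 1, a second `write_nonblocking` call reports failure immediately, whereas `write_blocking` waits until the consumer has removed a word and reports how many cycles it stalled.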

A disadvantage of this interface is that passing data to the coprocessor is relatively slow. If the coprocessor is used, for example, for an operation that takes two 32-bit input operands and produces one 32-bit result, the operation takes at least three clock cycles: the first two cycles are used to write both operands to the FSL and the third one is used to read the result back.

This interface is best suited for complex operations with long execution times.
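The cycle count quoted above follows from simple arithmetic, which the short sketch below spells out. It assumes one 32-bit word is transferred per clock cycle and that any computation time is either hidden behind the transfers or given explicitly; the function name and parameters are hypothetical.

```python
def fsl_min_cycles(operands, results, compute_cycles=0):
    """Lower bound on clock cycles for one coprocessor operation over FSL.

    Assumes one 32-bit word is transferred per cycle and that
    compute_cycles counts only latency not hidden by the transfers.
    """
    return operands + compute_cycles + results


# The case from the text: two input operands, one result.
three_cycle_case = fsl_min_cycles(operands=2, results=1)
```

The transfer overhead is fixed per operation, so its relative cost shrinks as `compute_cycles` grows, which is why the interface favours long-running operations.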

2.2.2 Nios II Coprocessor Interface

Nios II is the second generation of the soft core embedded microprocessor provided by Altera.

This processor allows insertion of custom execution units directly into its instruction pipeline. These units bypass the standard ALU (Arithmetic and Logic Unit) and execute user defined operations.

The execution unit interface is specified at several levels. At the simplest level, the execution unit is provided with the values of two general purpose registers specified in the instruction and outputs one 32-bit result, which is written back to the register file.

There are more complex versions of the interface that allow the execution unit to implement multi-cycle operations, decoding of the requested operation and even an internal register file.

The execution unit is integrated into one stage of the normal processor pipeline. This means that the multi-cycle operations stall the whole instruction pipeline during the execution.

The instructions are still decoded by the processor, and the execution unit is provided only with register values, register addresses and operation codes that were extracted from the instruction word by the processor's instruction decoder. This means that the custom unit cannot use a custom instruction format that would suit its needs.
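The in-pipeline approach can be contrasted with the queue-based one using a small behavioural model. This is a sketch of the general idea only and does not reproduce the Nios II custom instruction interface; `CustomUnit`, `run` and the `(rd, ra, rb)` instruction tuples are hypothetical.

```python
class CustomUnit:
    """Toy custom execution unit: receives two register values, returns one."""

    def __init__(self, func, latency=1):
        self.func = func        # function of the two operand values
        self.latency = latency  # cycles the unit occupies its pipeline stage

    def execute(self, a, b):
        return self.func(a, b) & 0xFFFFFFFF, self.latency


def run(program, regs, unit):
    """Execute (rd, ra, rb) custom instructions; return total cycles.

    The processor, not the unit, picks the registers out of the
    instruction; the unit only ever sees the operand values.
    """
    cycles = 0
    for rd, ra, rb in program:
        result, latency = unit.execute(regs[ra], regs[rb])
        regs[rd] = result
        cycles += latency  # a multi-cycle operation stalls everything behind it
    return cycles
```

A single-cycle unit costs one cycle per instruction in this model, with no separate transfers for operands or results; a unit with `latency=3` holds up the whole instruction stream for three cycles.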

2.3 Coprocessor Interface

Since the main goal of this work is to design a processor that can be extended by various coprocessors, the coprocessor interface is a major issue.

2.3.1 Hardware Interface

The first thing that needs to be discussed is the hardware interface between the processor and the coprocessor. From this point of view, there are only two main choices: the coprocessor can either be connected to the main system bus or it can be connected to the processor via some dedicated bus.

System Bus

The main advantage of this solution is its simplicity. A coprocessor connected to the main system bus doesn't really differ from any other processor peripheral. Another advantage is that a system bus connection gives the coprocessor easy access to system memory and other peripheral devices.
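A coprocessor attached to the system bus is driven with ordinary loads and stores, just like any other peripheral. The sketch below models this under stated assumptions: the register layout, the base address `0x80000000` and the names `MulCoprocessor` and `SystemBus` are all hypothetical and serve only as an illustration.

```python
class MulCoprocessor:
    """Toy bus peripheral with three hypothetical 32-bit registers."""
    OPERAND_A, OPERAND_B, RESULT = 0x0, 0x4, 0x8  # register offsets

    def __init__(self):
        self.a = 0
        self.b = 0

    def write(self, offset, value):
        if offset == self.OPERAND_A:
            self.a = value & 0xFFFFFFFF
        elif offset == self.OPERAND_B:
            self.b = value & 0xFFFFFFFF
        else:
            raise ValueError("register is not writable")

    def read(self, offset):
        if offset == self.RESULT:
            return (self.a * self.b) & 0xFFFFFFFF
        raise ValueError("register is not readable")


class SystemBus:
    """Routes loads and stores to whichever device owns the address."""

    def __init__(self):
        self.devices = []  # list of (base, size, device)

    def attach(self, base, size, device):
        self.devices.append((base, size, device))

    def _find(self, address):
        for base, size, device in self.devices:
            if base <= address < base + size:
                return device, address - base
        raise ValueError("bus error: unmapped address")

    def store(self, address, value):
        device, offset = self._find(address)
        device.write(offset, value)

    def load(self, address):
        device, offset = self._find(address)
        return device.read(offset)
```

One operation takes two stores and one load in this model, and no new instructions are needed in the processor; the coprocessor is indistinguishable from a memory-mapped peripheral.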
