
Researchers Submit Patent Application, "Low Power and High Performance Physical Register Free List Implementation for Microprocessors", for Approval

January 29, 2014



By a News Reporter-Staff News Editor at Electronics Newsweekly -- From Washington, D.C., VerticalNews journalists report that a patent application by the inventors Vats, Suparn (Fremont, CA); Mylius, John H. (Gilroy, CA); Radhakrishnan, Abhijit (Santa Clara, CA), filed on July 3, 2012, was made available online on January 16, 2014.

No assignee has been named for this patent application.

News editors obtained the following quote from the background information supplied by the inventors: "This invention relates to microprocessors, and more particularly, to efficiently reducing the latency and power of register renaming.

"Microprocessors typically include overlapping pipeline stages and out-of-order execution of instructions. Additionally, microprocessors may support simultaneous multi-threading to increase throughput. Microprocessor throughput may be measured by the useful execution of a number of instructions per thread for each stage of a pipeline. These techniques take advantage of instruction level parallelism (ILP) and may increase the throughput. However, these techniques generally add more hardware and more depth to a pipeline. In addition, control dependencies and data dependencies associated with such techniques may reduce a maximum throughput of the microprocessor.

"Speculative execution of instructions is used to perform parallel execution of instructions despite control dependencies in the source code. In a software application, straight line code is a group of instructions without branches, loops, or tests that may be sequentially executed, although implemented hardware may perform out-of-order processing of instructions. Straight line code may also be referred to as a basic block of instructions. In straight line code, read after write (RAW), write after read (WAR) or write after write (WAW) dependencies may be encountered. Register renaming may be used to allow parallel execution of instructions despite the WAR and WAW dependencies. The execution techniques used to increase throughput may utilize a relatively large number of non-architectural registers which may be referred to as 'physical registers'.
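As an illustration of the renaming described in the background (a minimal sketch, not taken from the application), the Python below gives every destination a fresh physical register while sources read the current mapping. The function name `rename`, the register counts, the identity initial mapping, and the three-instruction program are all assumptions made for the example; retirement and free list exhaustion are not modeled.

```python
from collections import deque


def rename(instructions, num_arch_regs=4, num_phys_regs=16):
    """Rename (dest, src1, src2) architectural registers to physical ones."""
    # Assumed initial mapping: architectural register i -> physical register i.
    rename_map = {ar: ar for ar in range(num_arch_regs)}
    # Remaining physical registers start out free.
    free_list = deque(range(num_arch_regs, num_phys_regs))
    renamed = []
    for dest, src1, src2 in instructions:
        # Sources read the current mapping, which preserves true RAW dependencies.
        p_src1, p_src2 = rename_map[src1], rename_map[src2]
        # A fresh physical register per destination removes WAR and WAW conflicts.
        p_dest = free_list.popleft()
        rename_map[dest] = p_dest
        renamed.append((p_dest, p_src1, p_src2))
    return renamed


program = [
    (1, 2, 3),  # r1 = r2 op r3
    (2, 1, 3),  # r2 = r1 op r3 (RAW on r1, WAR on r2)
    (1, 0, 3),  # r1 = r0 op r3 (WAW on r1 with the first instruction)
]
renamed_program = rename(program)
```

In this sketch the three destinations land in three different physical registers, so only the read-after-write link between the first and second instructions still constrains their order.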

"Physical registers are typically used to store the state of intermediate results from instruction execution after eliminating false write after read (WAR) dependencies and re-ordering write after write (WAW) dependencies in the pipeline. A free list is used to keep track of which physical registers are not currently in use. These particular free physical registers are available for use by incoming instructions. As the number of physical registers increases, the number of storage elements used for the free list and for identifying recently retired physical register identifiers increases. Therefore, on-die real estate, clock signal loading, signal cross-capacitance, and as a result, power may increase for the maintenance of these physical registers.

"In view of the above, methods and mechanisms for reducing the latency and power of register renaming are desired."

As a supplement to the background information on this patent application, VerticalNews correspondents also obtained the inventors' summary information for this patent application: "Systems and methods for reducing the latency and power of register renaming are contemplated. In various embodiments, a processor includes a register rename unit that receives decoded instructions. The decoded instructions include one or more destination architectural registers (ARs) for renaming. The processor may also include a free list, storing availability information corresponding to multiple physical registers (PRs) used for register renaming. In some embodiments, the free list may comprise multiple banks. The register rename unit additionally receives one or more returning PR IDs. A returning PR ID is a PR ID that is available again for assignment to a destination AR but is not yet indicated in the free list as being available.

"Control logic, which may be within the register rename unit, may determine that the multiple banks within the free list are unbalanced with available PR IDs. In response to this determination, the register rename unit may assign one or more returning PR IDs to the received one or more destination ARs before assigning any PR IDs from any bank of the multiple banks of available PR IDs. In various embodiments, selected banks within the multiple banks may not currently store availability information for the one or more of the assigned returning PR IDs. Therefore, the unbalanced banks may return to being balanced.
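The priority order in the paragraph above can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed design: the summary does not define "unbalanced", so the sketch assumes banks are unbalanced when their free counts differ by more than a hypothetical `imbalance_threshold`, and it assumes leftover destinations are served from the fullest bank. The function name `assign_destinations` and the use of Python sets for banks are likewise invented for the example.

```python
def assign_destinations(num_dests, banks, returning, imbalance_threshold=1):
    """Pick PR IDs for `num_dests` destination ARs.

    banks: list of sets of available PR IDs, one set per bank.
    returning: list of PR IDs freed but not yet recorded in any bank.
    """
    counts = [len(bank) for bank in banks]
    # Assumed definition of "unbalanced": occupancies differ by more than the threshold.
    unbalanced = max(counts) - min(counts) > imbalance_threshold
    assigned = []
    if unbalanced:
        # Hand out returning PR IDs first, leaving every bank untouched.
        while returning and len(assigned) < num_dests:
            assigned.append(returning.pop(0))
    # Assumed fallback: serve remaining destinations from the fullest bank.
    while len(assigned) < num_dests:
        fullest = max(banks, key=len)
        if not fullest:
            break  # nothing left to assign; real hardware would stall renaming
        pr_id = min(fullest)
        fullest.remove(pr_id)
        assigned.append(pr_id)
    return assigned


# Unbalanced case: bank 0 holds three free PR IDs, bank 1 holds one.
banks_unbalanced = [{0, 4, 8}, {1}]
picked_unbalanced = assign_destinations(2, banks_unbalanced, [5, 7])

# Balanced case: the returning PR ID is left alone and a bank is used.
banks_balanced = [{0, 4}, {1, 5}]
leftover = [7]
picked_balanced = assign_destinations(1, banks_balanced, leftover)
```

In the unbalanced case the two returning PR IDs are assigned and neither bank is read; in the balanced case the returning PR ID stays pending and the assignment comes from a bank.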

"In various embodiments, each of the banks includes a single bit width decoded vector. Each bit indicates whether a given PR ID of the multiple PR IDs is available for renaming. The decoded vector may appreciably reduce a number of storage elements, an amount of clock loading, an amount of wire routing capacitance, and thereby an amount of power used for the free list. In various other embodiments, the register rename unit stalls the update of the free list with returning PR IDs in order to help regain balance among the banks. In yet other embodiments, the register rename unit stalls the update with returning PR IDs for banks that do not have the lowest number of available PR IDs. In contrast, the banks within the free list with the lowest number of available PR IDs may be updated with associated returning PR IDs.
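A one-bit-per-register bank and the selective update stall described above might look like the following Python sketch. It is illustrative only: the class name `BankedFreeList`, the mapping of a PR ID to bank `pr_id % num_banks` and bit `pr_id // num_banks`, and the sample register numbers are assumptions, since the summary does not say how PR IDs are distributed across banks.

```python
class BankedFreeList:
    """Free list split into banks; each bank is a one-bit-per-register vector."""

    def __init__(self, num_banks, free_ids=()):
        self.num_banks = num_banks
        self.bits = [0] * num_banks  # one integer used as a bit vector per bank
        for pr_id in free_ids:
            self._set(pr_id)

    def _locate(self, pr_id):
        # Assumed mapping of a PR ID to (bank index, bit position).
        return pr_id % self.num_banks, pr_id // self.num_banks

    def _set(self, pr_id):
        bank, bit = self._locate(pr_id)
        self.bits[bank] |= 1 << bit

    def is_free(self, pr_id):
        bank, bit = self._locate(pr_id)
        return bool((self.bits[bank] >> bit) & 1)

    def counts(self):
        # Number of set bits (free registers) in each bank.
        return [bin(vector).count("1") for vector in self.bits]

    def update(self, returning):
        """Record returning PR IDs only in the emptiest bank(s).

        Returns the PR IDs whose update was stalled.
        """
        counts = self.counts()  # snapshot taken before any bit is written
        lowest = min(counts)
        stalled = []
        for pr_id in returning:
            bank, _ = self._locate(pr_id)
            if counts[bank] == lowest:
                self._set(pr_id)
            else:
                stalled.append(pr_id)
        return stalled


# PR IDs 0, 2, 4 are free in bank 0; only PR ID 1 is free in bank 1.
free_list = BankedFreeList(num_banks=2, free_ids=[0, 2, 4, 1])
stalled = free_list.update([6, 3, 5])
```

Here PR ID 6 is held back because its bank already has the most free registers, while PR IDs 3 and 5 are written to bank 1, leaving both banks with three free entries.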

"These and other embodiments will be further appreciated upon reference to the following description and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

"FIG. 1 is a generalized block diagram of one embodiment of a computer system.

"FIG. 2 is a generalized block diagram of one embodiment of a processor core that performs superscalar, out-of-order execution with zero-cycle load operations.

"FIG. 3 is a generalized flow diagram of one embodiment of a method for creating zero-cycle load operations.

"FIG. 4 is a generalized flow diagram of one embodiment of a method for processing zero-cycle load operations.

"FIG. 5 is a generalized flow diagram of one embodiment of a method for committing instructions that include zero-cycle load operations.

"While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. As used throughout this application, the word 'may' is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words 'include,' 'including,' and 'includes' mean including, but not limited to.

"Various units, circuits, or other components may be described as 'configured to' perform a task or tasks. In such contexts, 'configured to' is a broad recitation of structure generally meaning 'having circuitry that' performs the task or tasks during operation. As such, the unit/circuit/component can be configured to perform the task even when the unit/circuit/component is not currently on. In general, the circuitry that forms the structure corresponding to 'configured to' may include hardware circuits. Similarly, various units/circuits/components may be described as performing a task or tasks, for convenience in the description. Such descriptions should be interpreted as including the phrase 'configured to.' Reciting a unit/circuit/component that is configured to perform one or more tasks is expressly intended not to invoke 35 U.S.C. § 112, paragraph six, interpretation for that unit/circuit/component."

For additional information on this patent application, see: Vats, Suparn; Mylius, John H.; Radhakrishnan, Abhijit. Low Power and High Performance Physical Register Free List Implementation for Microprocessors. Filed July 3, 2012 and posted January 16, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=392&p=8&f=G&l=50&d=PG01&S1=20140109.PD.&OS=PD/20140109&RS=PD/20140109

Keywords for this news article include: Patents, Electronics, Microprocessors.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC





Source: Electronics Newsweekly

