CN101963897B - Apparatus and method for dual data path processing - Google Patents
- Publication number: CN101963897B (application CN201010276291.9A)
- Authority
- CN
- China
- Prior art keywords
- instruction
- instructions
- data
- control
- configurable
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/3012—Organisation of register space, e.g. banked or distributed register file
- G06F9/3013—Organisation of register space, e.g. banked or distributed register file according to data content, e.g. floating-point registers, address registers
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30181—Instruction operation extension or modification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3853—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution of compound instructions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
- G06F9/3891—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute organised in groups of units sharing resources, e.g. clusters
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3893—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
- G06F9/3895—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros
- G06F9/3897—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros with adaptable data path
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Executing Machine-Instructions (AREA)
- Advance Control (AREA)
- Devices For Executing Special Programs (AREA)
- Hardware Redundancy (AREA)
Abstract
A computer processor having control and data processing capabilities includes a decode unit for decoding instructions. A data processing facility includes a first data execution path and a second data execution path. The first data execution path includes fixed operators; the second data execution path includes at least configurable operators having a plurality of predefined configurations, at least some of which are selectable by means of an opcode portion of a data processing instruction. The decode unit is operable to detect whether a data processing instruction defines a fixed data processing operation or a configurable data processing operation, and causes the computer system to supply data for processing to the first data execution path when a fixed data processing instruction is detected, and to the configurable data execution path when a configurable data processing instruction is detected.
Description
This application is a divisional application of the Chinese patent application of the same title, application number 200580010665.X (PCT/GB2005/001073), filed on March 22, 2005.
Technical Field
The invention relates to a computer processor, a method of operating the computer processor, and a computer program product comprising an instruction set for a computer.
Background
To increase the speed of computer processors, prior art architectures have used dual execution paths for executing instructions. Dual execution path processors can operate according to the single instruction multiple data (SIMD) principle, exploiting parallelism of operations to increase processor speed.
However, despite the use of dual execution paths and SIMD processing, there is a continuing need to increase processor speed. A typical dual execution path processor uses two substantially similar paths, so that each path handles both control code and datapath code. While known processors support a combination of 32-bit standard encoding and 16-bit "dense" encoding, such schemes suffer from several drawbacks, including a lack of semantic content in the few bits available in a 16-bit format.
Moreover, conventional general-purpose digital signal processors are not well matched to application-specific algorithms for many purposes, including specialized operations such as convolution, fast Fourier transforms, Trellis/Viterbi encoding, correlation, finite impulse response filtering, and others.
Summary of the Invention
In one embodiment according to the invention, there is provided a computer processor having control and data processing capabilities. The computer processor comprises: a decode unit for decoding instructions; and a data processing facility comprising a first data execution path and a second data execution path, the first data execution path including fixed operators and the second data execution path including at least configurable operators, the configurable operators having a plurality of predefined configurations, at least some of which are selectable by means of an opcode portion of a data processing instruction. The decode unit is operable to detect whether a data processing instruction defines a fixed data processing operation or a configurable data processing operation, and causes the computer system to supply data for processing to the first data execution path when a fixed data processing instruction is detected, and to the configurable data execution path when a configurable data processing instruction is detected.
In a further related embodiment, the decode unit is capable of decoding a stream of instruction packets from memory, each packet comprising a plurality of instructions. The decode unit may also be operable to detect whether an instruction packet contains a data processing instruction. The configurable operators are configurable at the level of multi-bit values, including multi-bit values of four or more bits, or at the level of words. A plurality of fixed operators of the first data execution path may be arranged to perform multiple fixed operations in independent lanes, according to the single instruction multiple data principle. Likewise, a plurality of configurable operators of the second data execution path may be arranged to perform multiple operations in different lanes, according to the single instruction multiple data principle.
In a further related embodiment, the configurable operators of the second execution path may be arranged to receive configuration information that determines the nature of the operations performed. This information may be received from a field of the instruction defining the configurable data processing operation. The configurable operators of the second execution path may also be arranged to receive configuration information that includes information controlling the associated interconnectivity. The computer processor may further include a control map associated with the configurable operators of the second data execution path, the control map being operable to receive at least one configuration bit from a configurable data processing instruction and, in response, to supply configuration information to the configurable operators. The configuration information may determine the nature of the operations performed by the configurable operators, and may control interconnectivity between two or more of the configurable operators.
In a further related embodiment, the configurable operators of the second execution path may be arranged to receive configuration information, which determines the nature of the operations to be performed or controls interconnectivity, from a source other than the configurable data processing instruction. At least one configurable operator of the second data execution path may be capable of executing a data processing instruction with an execution depth greater than two computations before returning a result to a result store. The computer processor may include switching means for receiving data processing operands from a configurable data processing instruction and switching them as appropriate for supply to one or more of the configurable operators. The computer processor may also include switching means for receiving results from one or more of the configurable operators and switching them as appropriate for supply to one or more of a result store and a feedback loop. The computer processor may also include a plurality of control maps for mapping configuration bits received from a configurable data processing instruction to configuration information for supply to the configurable operators of the second data execution path. Likewise, the computer processor may include switching means for receiving configuration information from a control map and switching it as appropriate for supply to the configurable operators of the second data execution path. The configurable operators may be selected from one or more of: multiply-accumulate operators; arithmetic operators; state operators; and cross-lane permuters. Likewise, the computer processor may include operators and an instruction set capable of performing one or more operations selected from: fast Fourier transforms; inverse fast Fourier transforms; Viterbi encoding/decoding; Turbo encoding/decoding; finite impulse response calculations; and any other correlation or convolution.
In another embodiment according to the invention, there is provided a method of operating a computer processor having control and data processing capabilities, the computer processor comprising a first data execution path and a second data execution path, the first data execution path including fixed operators and the second data execution path including configurable operators, the configurable operators having a plurality of predefined configurations, at least some of which are selectable by means of an opcode portion of a data processing instruction. The method comprises: decoding a plurality of instructions to detect whether at least one data processing instruction of the plurality defines a fixed data processing operation or a configurable data processing operation; causing the computer processor to supply data for processing to the first data execution path when a fixed data processing instruction is detected, and to the configurable data execution path when a configurable data processing instruction is detected; and outputting a result.
In another embodiment according to the invention, there is provided a computer program product comprising program code means for causing a computer processor to perform the steps set out below, the computer processor comprising a first data execution path and a second data execution path, the first data execution path including fixed operators and the second data execution path including configurable operators, the configurable operators having a plurality of predefined configurations, at least some of which are selectable by means of an opcode portion of a data processing instruction. The steps are: decoding a plurality of instructions to detect whether at least one data processing instruction of the plurality defines a fixed data processing operation or a configurable data processing operation; causing the computer processor to supply data for processing to the first data execution path when a fixed data processing instruction is detected, and to the configurable data execution path when a configurable data processing instruction is detected; and outputting a result.
In another embodiment according to the invention, there is provided a data processing instruction set comprising a first plurality of instructions and a second plurality of instructions, the first plurality of instructions having a field indicating a fixed type of data processing operation, and the second plurality of instructions having a field indicating a configurable type of data processing operation.
In another embodiment according to the invention, there is provided a computer processor comprising a data execution path of configurable operators, wherein the configurable operators comprise a plurality of predefined groups of operator configurations, each group comprising operators from an independent operator class. The operator classes may include one or more of: multiply-accumulate operators; arithmetic operators; state operators; and permuters. Connections between operators selected from within each predefined group of operator configurations may be configurable by an opcode portion within an instruction executed by the computer processor. Likewise, connections between operators selected from more than one predefined group of operator configurations may be configurable by an opcode portion within an instruction executed by the computer processor.
The invention provides a computer processor comprising: a decode unit for decoding instruction packets from memory, each instruction packet comprising a plurality of instructions; and a processing channel comprising a plurality of functional units and operable to perform control processing operations; wherein the decode unit is operable to receive instruction packets having a bit length of 64 bits, and to use identification bits in an instruction packet to detect whether the packet defines three control instructions each having a bit length of 21 bits; and wherein, when the decode unit detects that the instruction packet contains three such control instructions, the control instructions are supplied to the processing channel for execution in the order in which they appear in the instruction packet.
The invention also provides a method of operating a computer processor that comprises a processing channel having a plurality of functional units and capable of performing control processing operations, the method comprising: (a) receiving a sequence of instruction packets from memory, each instruction packet comprising a plurality of instructions defining operations; and (b) decoding each instruction packet in turn by using identification bits in the packet to determine whether it defines three control instructions each having a bit length of 21 bits, wherein, when the decode unit detects that the instruction packet contains three such control instructions, the control instructions are supplied to the processing channel for execution in the order in which they appear in the instruction packet.
Other advantages and novel features of the invention will be set forth in part in the description that follows, and in part will become apparent to those skilled in the art on examination of the following description and the accompanying drawings, or may be learned by practice of the invention.
Brief Description of the Drawings
For a better understanding of the invention, and to show how it may be carried into effect, reference will now be made, by way of example only, to the accompanying drawings, in which:
FIG. 1 is a block diagram of an asymmetric dual execution path computer processor according to an embodiment of the invention;
FIG. 2 shows exemplary classes of instructions for the processor of FIG. 1, according to an embodiment of the invention; and
FIG. 3 is a schematic diagram showing the components of a configurable deep execution unit according to an embodiment of the invention.
Detailed Description
FIG. 1 is a block diagram of an asymmetric dual-path computer processor according to an embodiment of the invention. The processor of FIG. 1 divides the processing of a single instruction stream 100 between two different hardware execution paths: a control execution path 102 for processing control code, and a data execution path 103 for processing data code. The data widths, operators, and other characteristics of the two execution paths 102, 103 differ according to the different characteristics of control code and datapath code. Typically, control code favors fewer, narrower registers, is difficult to parallelize, is typically (but not exclusively) written in C or another high-level language, and its code density is generally more important than its speed. By contrast, datapath code typically favors a large file of wide registers, is highly parallelizable, is written in assembly language, and its performance is more important than its code density. In the processor of FIG. 1, the two different execution paths 102 and 103 are dedicated to handling the two different types of code, each side having its own architectural register file (control register file 104 and data register file 105 respectively), which differ in register width and number. The control registers are narrower in bits (32 bits in one example), while the data registers are wider (64 bits in one example). The processor is therefore asymmetric: its two execution paths perform different specialized functions and have different bit widths.
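In the processor of FIG. 1, instruction stream 100 consists of a sequence of instruction packets. Each instruction packet supplied is decoded by instruction decode unit 101, which separates control instructions from data instructions, as described further below. Control execution path 102 handles control-flow operations for the instruction stream and manages the machine's state registers, using a branch unit 106, an execution unit 107, and a load/store unit 108, which in this embodiment is shared with data execution path 103. Only the control side of the processor needs to be visible to a compiler (such as a compiler for C, C++, or Java, or for another high-level language). Within the control side, branch unit 106 and execution unit 107 operate in accordance with conventional processor design known to those of ordinary skill in the art.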
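Data execution path 103 uses SIMD (single instruction multiple data) parallelism in both the fixed execution unit 109 and the configurable deep execution unit 110. As described further below, configurable deep execution unit 110 provides depth of processing, in addition to the width used by conventional SIMD processors, in order to increase the work done per instruction.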
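If the decoded instruction defines a control instruction, it is applied to the appropriate functional unit on the control execution path of the machine (for example branch unit 106, execution unit 107, or load/store unit 108). If the decoded instruction defines an instruction with a fixed or configurable data processing operation, it is supplied to the data processing execution path. Within the data instruction portion of the instruction packet, designated bits indicate whether the instruction is a fixed or a configurable data processing instruction, and in the case of a configurable instruction, further designated bits define configuration information. Depending on the subtype of the decoded data processing instruction, data is supplied to either the fixed or the configurable execution sub-path of the machine's data processing path.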
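The routing rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says that designated bits mark a data instruction as fixed or configurable and that further bits carry configuration information, but it does not give their positions, so the flag bit (33) and the four configuration bits (32..29) used here are hypothetical.

```python
from enum import Enum


class Unit(Enum):
    CONTROL = "control execution path 102"
    FIXED = "fixed execution unit 109"
    CONFIGURABLE = "configurable deep execution unit 110"


# Hypothetical layout of a 34-bit data instruction (not taken from the patent):
# bit 33 flags a configurable operation, bits 32..29 carry configuration bits.
CONFIGURABLE_FLAG_BIT = 33
CONFIG_SHIFT = 29
CONFIG_MASK = 0xF


def route(kind: str, word: int):
    """Return (unit, config_bits) for one decoded instruction.

    kind is "control" or "data"; config_bits is None unless the
    instruction is a configurable data processing instruction.
    """
    if kind == "control":
        return Unit.CONTROL, None
    if (word >> CONFIGURABLE_FLAG_BIT) & 1:
        return Unit.CONFIGURABLE, (word >> CONFIG_SHIFT) & CONFIG_MASK
    return Unit.FIXED, None
```

A data instruction with the hypothetical flag bit clear goes to the fixed unit; with it set, the configuration bits travel with it to the configurable unit.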
Here, "configurable" denotes the ability to select an operator configuration from a plurality of predefined ("pseudo-static") operator configurations. A pseudo-static configuration of an operator is effective to cause the operator (i) to perform a certain type of operation, (ii) to be interconnected with associated elements in a certain way, or (iii) a combination of (i) and (ii). In practice, a selected pseudo-static configuration may determine the behavior and interconnectivity of many operator elements at once. It can also control the switching configuration associated with the data path. In a preferred embodiment, at least some of the plurality of pseudo-static operator configurations are selectable by an opcode portion of a data processing instruction, as described further below. Also according to the embodiments herein, a "configurable instruction" allows customized operations to be performed at the level of multi-bit values, for example multi-bit values of four or more bits, or at the level of words.
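One way to picture a pseudo-static configuration is as a row in a small lookup table: a few configuration bits select a predefined combination of operator behavior and interconnection. The Python sketch below is a toy model under that assumption only. The table contents, the two-operator structure, and the names `CONTROL_MAP`, `Config`, and `execute` are hypothetical and do not describe the hardware of the patent.

```python
import operator
from typing import Callable, NamedTuple


class Config(NamedTuple):
    first: Callable[[int, int], int]   # operation of the first operator
    second: Callable[[int, int], int]  # operation of the second operator
    chained: bool                      # True: first operator's output feeds the second


# Hypothetical control map: 2 configuration bits -> predefined configuration.
CONTROL_MAP = {
    0b00: Config(operator.mul, operator.add, True),   # (a * b) + c
    0b01: Config(operator.add, operator.add, True),   # (a + b) + c
    0b10: Config(operator.sub, operator.mul, True),   # (a - b) * c
    0b11: Config(operator.add, operator.sub, False),  # a + b and a - b, side by side
}


def execute(config_bits: int, a: int, b: int, c: int) -> tuple:
    """Run the two operators as wired by the selected configuration."""
    cfg = CONTROL_MAP[config_bits]
    x = cfg.first(a, b)
    if cfg.chained:
        return (cfg.second(x, c),)
    return (x, cfg.second(a, b))
```

The point of the sketch is that one selection fixes both what each operator does and how the operators are connected, so a single instruction can describe a multi-step computation.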
It should be noted that control and data processing instructions, which are executed on their respective sides of the machine, may both define memory accesses (load/store) and basic arithmetic operations. Inputs/operands for control operations may be supplied to and from the control register file 104, while data/operands for data processing operations are supplied to and from the data register file 105.
According to embodiments of the invention, at least one input of each data processing operation may be a vector. In this respect, the configurable operators and/or switching circuitry of the configurable data path can be regarded as configurable to perform vector operations, by virtue of the nature of the operations performed and/or the interconnectivity between them. For example, a 64-bit vector input to a data processing operation may comprise four 16-bit scalar operands. Here, a "vector" is a collection of scalar operands. Vector arithmetic may be performed on a plurality of scalar operands, and may include steering, movement, and permutation of scalar elements. Not all operands of a vector operation need be vectors; for example, a vector operation may have a scalar and at least one vector as inputs, and may output a result that is either a scalar or a vector.
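The example of a 64-bit vector holding four 16-bit scalar operands can be made concrete with a short Python sketch. The lane ordering (lane 0 in the least significant bits) and the wrap-around arithmetic are assumptions made for illustration; the helper names are hypothetical.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def unpack(vec: int) -> list:
    """Split a 64-bit value into four 16-bit lanes, lane 0 = least significant."""
    return [(vec >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def pack(lanes: list) -> int:
    """Combine four lane values into one 64-bit value, truncating each to 16 bits."""
    vec = 0
    for i, value in enumerate(lanes):
        vec |= (value & LANE_MASK) << (i * LANE_BITS)
    return vec


def vector_add(a: int, b: int) -> int:
    """Vector + vector: lane-wise addition, wrapping modulo 2**16 in each lane."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])


def vector_scale(a: int, s: int) -> int:
    """Vector * scalar: one scalar operand applied to every lane."""
    return pack([x * s for x in unpack(a)])


def permute(a: int, order: list) -> int:
    """Reorder lanes; result lane i takes source lane order[i]."""
    lanes = unpack(a)
    return pack([lanes[i] for i in order])
```

`vector_scale` shows the mixed case mentioned above, where one input is a scalar and the other a vector, and `permute` shows a pure data-movement operation on the scalar elements.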
Here, "control instructions" include instructions dedicated to program flow, branching, and address generation, but not data processing. "Data processing instructions" include instructions for logical operations, or for arithmetic operations for which at least one input is a vector. Data processing instructions may operate on multiple data items, for example in SIMD processing, or in processing wide, short vectors of data elements. The essential functions of control instructions and data processing instructions as just described do not overlap; what the two types of code have in common is that both have logical and scalar arithmetic capabilities.
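FIG. 2 shows three types of instruction packet for the processor of FIG. 1. Each type of instruction packet is 64 bits long. Instruction packet 211 is a 3-scalar type, for dense control code, and contains three 21-bit control instructions (c21). Instruction packets 212 and 213 are LIW (long instruction word) types, for parallel execution of datapath code. In this example, each of instruction packets 212 and 213 contains two instructions, but a different number may be included if required. Instruction packet 212 contains a 34-bit data instruction (d34) and a 28-bit memory instruction (m28), and is used to execute data-side arithmetic (the d34 instruction) in parallel with a data-side load/store operation (the m28 instruction). Memory-class instructions (m28) can read from, or write to, either the control side or the data side of the processor, using an address from the control side. Instruction packet 213 contains a 34-bit data instruction (d34) and a 21-bit control instruction (c21), and is used to execute data-side arithmetic (the d34 instruction) in parallel with a control-side operation (the c21 instruction), such as control-side arithmetic, a branch, or a load/store operation.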
The instruction decode unit 101 of the embodiment of FIG. 1 uses the initial identification bits of each instruction packet, or some other designated identification bits at predetermined bit positions, to determine which type of packet is being decoded. For example, as shown in FIG. 2, an initial bit "1" signifies that an instruction packet is of the scalar control instruction type, with three control instructions; while initial bits "01" and "00" signify instruction packets of types 212 and 213, with a data and a memory instruction in packet 212 or a data and a control instruction in packet 213. Having decoded the initial bits of each instruction packet, the decode unit 101 of FIG. 1 passes the instructions of each packet to the control execution path 102 or the data execution path 103, as appropriate to the type of the instruction packet.
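A minimal sketch of this packet-type decision is given below in Python. It assumes that the "initial" identification bits are the most significant bits of the 64-bit packet, and that "01" and "00" correspond to packet types 212 and 213 in the order in which they are listed; both are assumptions of the example, and the function name `packet_type` is hypothetical.

```python
def packet_type(packet):
    """Classify a 64-bit instruction packet by its leading identification bits.

    Assumptions for this sketch: the identification bits sit at the most
    significant end of the word; "1" marks type 211, "01" type 212 and
    "00" type 213.
    """
    if not 0 <= packet < (1 << 64):
        raise ValueError("packet must fit in 64 bits")
    if packet >> 63 == 0b1:
        return "211"  # three 21-bit control instructions
    if packet >> 62 == 0b01:
        return "212"  # data instruction plus memory instruction
    return "213"      # data instruction plus control instruction
```

With one identification bit, three c21 fields fill the word exactly (1 + 3 × 21 = 64), and with two identification bits the d34 and m28 fields do the same (2 + 34 + 28 = 64).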
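In order to execute the instruction packets of FIG. 2, the instruction decode unit 101 of the processor of the embodiment of FIG. 1 fetches program packets sequentially from memory, and the program packets are executed sequentially. Within an instruction packet, the instructions of packet 211 are executed sequentially: the 21-bit control instruction at the least significant end of the 64-bit word is executed first, then the next 21-bit control instruction, and then the 21-bit control instruction at the most significant end. Within instruction packets 212 and 213, the instructions can be executed simultaneously (although this need not necessarily be the case in embodiments according to the invention). Thus, in the program order of the processor of the embodiment of FIG. 1, program packets are executed sequentially; but instructions within a packet can be executed either sequentially (for packet type 211) or simultaneously (for packet types 212 and 213). Below, instruction packets of types 212 and 213 are abbreviated as MD-packets and CD-packets respectively (containing one memory and one data instruction, and one control and one data instruction, respectively).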
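The execution order within a 3-scalar packet can be sketched as follows. The example assumes that the three 21-bit control instructions occupy bits 0 to 62 contiguously, with a single identification bit at bit 63; the exact field positions are not stated in the text, and the helper names are hypothetical.

```python
C21_MASK = (1 << 21) - 1  # one 21-bit control instruction


def make_3scalar_packet(first, second, third):
    """Build a 3-scalar packet; `first` is placed at the least significant end.

    Assumed layout for this sketch: bit 63 is the identification bit "1",
    bits 0-20, 21-41 and 42-62 hold the three control instructions.
    """
    fields = (first, second, third)
    packet = 1 << 63
    for i, f in enumerate(fields):
        packet |= (f & C21_MASK) << (21 * i)
    return packet


def control_instructions_in_order(packet):
    """Return the three c21 fields in execution order (least significant first)."""
    return [(packet >> (21 * i)) & C21_MASK for i in range(3)]
```

Iterating over the returned list and executing each entry in turn reproduces the sequential order described for packet type 211.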
By using 21-bit control instructions, the embodiment of FIG. 1 overcomes a number of disadvantages found in processors having instructions of other lengths, and in particular in processors that support a combination of 32-bit standard encoding for data instructions and 16-bit "dense" encoding for control code. In such dual 16/32-bit processors, redundancy arises from the use of dual encodings for each instruction, or from the use of two separate decoders with a means of switching between encoding schemes by branch, fetch address, or other means. According to embodiments of the invention, this redundancy is eliminated by using a single 21-bit length for all control instructions. Furthermore, the use of 21-bit control instructions eliminates disadvantages arising from insufficient semantic content in a 16-bit "dense" encoding scheme. Because of the insufficient semantic content, processors using a 16-bit scheme typically require some mix of design compromises, such as: use of two-operand destructive operations, with corresponding code bloat for copies; use of windowed access to a subset of the register file, with code bloat for spill/fill or window pointer manipulation; or frequent reversion to the 32-bit format, because not all operations can be expressed in the very few opcode bits available in a 16-bit format. In embodiments of the invention, these disadvantages are alleviated by the use of 21-bit control instructions.
According to embodiments of the invention, a large number of instructions may be used. For example, instruction signatures may be any of the following, where C-format, M-format and D-format signify the control, memory access and data formats respectively:
Also, according to an embodiment of the invention, C-format instructions all provide SISD (single instruction, single data) operation, while M-format and D-format instructions provide either SISD or SIMD operation. For example, control instructions may provide general arithmetic, comparison and logical instructions; control flow instructions; memory load and store instructions; and others. Data instructions may provide general arithmetic, shift, logical and comparison instructions; shuffle, sort, byte extend and permute instructions; linear feedback shift register instructions; and, via the configurable deep execution unit 110 (described below), user-defined instructions. Memory instructions may provide memory loads and stores; copying of selected data registers to control registers; copying of broadcast control registers to data registers; and immediate-to-register instructions.
According to an embodiment of the invention, the processor of FIG. 1 features a first, fixed data execution path and a second, configurable data execution path. The first data path has a fixed SIMD execution unit split into lanes, in a similar fashion to conventional SIMD processing designs. The second data path has a configurable deep execution unit 110. "Deep execution" refers to the ability of a processor to perform multiple consecutive operations on the data provided by a single issued instruction before returning a result to the register file. One example of deep execution is the conventional MAC (multiply-and-accumulate) operation, which performs two operations (a multiply and an add) on data from a single instruction, and therefore has a depth of order two. Deep execution may also be characterized by the number of operand inputs being equal to the number of result outputs; or, equivalently, the valency-in equals the valency-out. Thus, for example, a conventional two-operand addition with one result is preferably not an example of deep execution, because the number of operands is not equal to the number of results; whereas convolution, fast Fourier transform, Trellis/Viterbi encoding, correlators, finite impulse response filters and other signal processing algorithms are examples of deep execution. Dedicated digital signal processing (DSP) algorithms typically perform deep execution at the bit level and in a memory-mapped fashion. Conventional register-mapped general-purpose DSPs, however, do not perform deep execution; they execute instructions whose depth is at most of order two, as in the MAC operation. By contrast, the processor of FIG. 1 provides a register-mapped general-purpose processor capable of deep execution of dynamically configurable word-level instructions with a depth of order greater than two. In the processor of FIG. 1, the nature of a deep execution instruction (the graph of the mathematical function to be performed) can be adjusted or customized by configuration information in the instruction itself. In a preferred embodiment, the instruction format includes bit positions allocated to the configuration information. To provide this capability, the deep execution unit 110 has configurable execution resources, meaning that operator modes, interconnections and constants can be uploaded to suit each application. Deep execution adds a depth dimension to the parallelism of execution that is orthogonal to the width dimension offered by the earlier concepts of SIMD and LIW processing; it therefore represents an additional dimension along which the work-per-instruction of the target processor can be increased.
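The notion of depth can be illustrated with a small dataflow model, sketched below in Python. A "graph" maps each node to an operation and its two inputs, and the depth of a result is the length of the longest chain of operations between an operand and that result. The graph encoding, the function `run_graph` and both example graphs are hypothetical; they are not taken from the description and do not model the actual hardware.

```python
import operator

OPS = {"mul": operator.mul, "add": operator.add, "sub": operator.sub}


def run_graph(graph, outputs, operands):
    """Evaluate a dataflow graph and report its depth.

    graph:    node name -> (op name, input name, input name)
    outputs:  names of the nodes whose values are returned
    operands: name -> value for operands and uploaded constants (depth 0)
    """
    values = dict(operands)
    depths = {name: 0 for name in operands}

    def visit(name):
        if name not in values:
            op, a, b = graph[name]
            visit(a)
            visit(b)
            values[name] = OPS[op](values[a], values[b])
            depths[name] = 1 + max(depths[a], depths[b])
        return values[name]

    results = [visit(name) for name in outputs]
    return results, max(depths[name] for name in outputs)


# Multiply-and-accumulate: a multiply feeding an add, so depth two.
MAC = {"prod": ("mul", "x", "y"), "acc_out": ("add", "acc", "prod")}

# Hypothetical scaled butterfly: operands a and b in, two results out,
# with constants w and k standing in for uploaded configuration constants.
BUTTERFLY = {
    "t": ("mul", "w", "b"),
    "s": ("add", "a", "t"),
    "d": ("sub", "a", "t"),
    "out0": ("mul", "s", "k"),
    "out1": ("mul", "d", "k"),
}
```

In this model, `run_graph(MAC, ["acc_out"], {"x": 3, "y": 4, "acc": 10})` evaluates a chain of two operations, whereas the butterfly graph takes two operands, produces two results and chains three operations, which is the kind of shape the text associates with deep execution.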
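FIG. 3 shows components of a configurable deep execution unit 310 according to an embodiment of the invention. As shown in FIG. 1, the configurable deep execution unit 110 is part of the data execution path 103, and can therefore be directed by the data-side instructions of the MD and CD instruction packets 212 and 213 of FIG. 2. In FIG. 3, an instruction 314 and operands 315 are supplied to the deep execution unit 310 from the instruction decode unit 101 and the data register file 105 of FIG. 1. A multi-bit configuration code in the decoded instruction 314 is used to access a control map 316, which expands the multi-bit code into a relatively complex set of configuration signals for configuring the operators of the deep execution unit. The control map 316 may, for example, be implemented as a look-up table, in which the different possible multi-bit codes of an instruction are mapped to different possible operator configurations of the deep execution unit. Based on the result of consulting the look-up table of the control map 316, a crossbar interconnect 317 configures a set of operators 318-321 in whatever arrangement is necessary to execute the operator configuration indicated by the multi-bit instruction code. The operators may include, for example, a multiply operator 318, arithmetic logic unit (ALU) operators 319, state operators 320, or cross-lane permuters 321. In one embodiment, the deep execution unit contains fifteen operators: one multiply operator 318, eight ALU operators 319, four state operators 320, and two cross-lane permuters 321; although other numbers of operators are possible. The operands 315 supplied to the deep execution unit may be, for example, 16-bit operands; these are supplied to a second crossbar interconnect 322, which can supply the operands to the appropriate operators 318-321. The second crossbar interconnect 322 also receives feedback 324 of intermediate results from the operators 318-321, which can in turn be supplied by the second crossbar interconnect 322 to the appropriate operators 318-321. A third crossbar interconnect 323 multiplexes the results from the operators 318-321 and outputs a final result 325. Various control signals can be used to configure the operators; for example, the control map 316 of the embodiment of FIG. 3 need not be implemented as a single look-up table, but may be implemented as a series of two or more cascaded look-up tables. An entry in the first look-up table can point from a given multi-bit instruction code to a second look-up table, thereby reducing the amount of storage required in each look-up table for complex operator configurations. For example, the first look-up table may be organized as a library of configuration categories, so that multiple multi-bit instruction codes are grouped together in the first look-up table, with each group pointing to a subsequent look-up table that provides the specific configuration for each multi-bit code of that group.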
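The cascaded look-up arrangement can be sketched in Python as two levels of dictionaries: a first table that groups multi-bit codes into configuration categories, and one second-level table per category that yields the specific operator configuration. All codes, category names, operator labels and routing pairs below are invented for the illustration; the description does not give concrete table contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorConfig:
    operators: tuple  # labels of the operators that take part
    routing: tuple    # (source, destination) pairs for the interconnect


# First-level table: multi-bit code -> configuration category (hypothetical).
GROUP_OF_CODE = {0b000: "mac", 0b001: "mac", 0b100: "butterfly"}

# Second-level tables: one per category, giving the specific configuration.
GROUP_TABLES = {
    "mac": {
        0b000: OperatorConfig(("mul0", "alu0"), (("mul0", "alu0"),)),
        0b001: OperatorConfig(
            ("mul0", "alu0", "alu1"),
            (("mul0", "alu0"), ("alu0", "alu1")),
        ),
    },
    "butterfly": {
        0b100: OperatorConfig(
            ("mul0", "alu0", "alu1"),
            (("mul0", "alu0"), ("mul0", "alu1")),
        ),
    },
}


def control_map(code):
    """Expand a multi-bit configuration code via two cascaded look-ups."""
    group = GROUP_OF_CODE[code]       # first look-up: which category
    return GROUP_TABLES[group][code]  # second look-up: specific configuration
```

An unknown code raises `KeyError` in this sketch; how the hardware treats an unassigned code is not covered by the text.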
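According to the embodiment of FIG. 3, the operators are advantageously pre-configured into various operator classes. In practice, this is achieved by a strategic level of hardwiring. An advantage of this approach is that fewer predefined configurations need to be stored, and the control circuitry can be simpler. For example, operators 318 are pre-configured to be in the class of multiply operators; operators 319 are pre-configured as ALU operators; operators 320 are pre-configured as state operators; and operators 321 are pre-configured as cross-lane permuters; other pre-configured classes are also possible. However, even though the classes of operators are pre-configured, there is run-time flexibility for instructions to arrange at least the following, in the final arrangement of the particular configuration used to implement a given algorithm: (i) the connectivity of the operators within each class; (ii) the connectivity with operators from other classes; (iii) the connectivity of any relevant switching means.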
It will be appreciated by the skilled person that, while the foregoing has described what is considered to be the best mode and, where appropriate, other modes of performing the invention, the invention should not be limited to the specific apparatus configurations or method steps disclosed in this description of the preferred embodiment. Those skilled in the art will also recognize that the invention has a broad range of applications, and that the embodiments admit of a wide range of different implementations and modifications without departing from the inventive concept. In particular, the exemplary bit widths mentioned herein are not intended to be limiting, nor is the arbitrary choice of bit widths referred to as half-words, words, longs, etc.
Claims (2)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/813,433 US8484441B2 (en) | 2004-03-31 | 2004-03-31 | Apparatus and method for separate asymmetric control processing and data path processing in a configurable dual path processor that supports instructions having different bit widths |
| US10/813433 | 2004-03-31 |
Related Parent Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN200580010665XA Division CN1989485B (en) | 2004-03-31 | 2005-03-22 | Apparatus and method for dual data path processing |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN101963897A (en) | 2011-02-02 |
| CN101963897B (en) | 2014-03-12 |
Family
ID=34962960
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201010276291.9A Expired - Fee Related CN101963897B (en) | 2004-03-31 | 2005-03-22 | Apparatus and method for dual data path processing |
| CN200580010665XA Expired - Fee Related CN1989485B (en) | 2004-03-31 | 2005-03-22 | Apparatus and method for dual data path processing |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN200580010665XA Expired - Fee Related CN1989485B (en) | 2004-03-31 | 2005-03-22 | Apparatus and method for dual data path processing |
Country Status (8)
| Country | Link |
|---|---|
| US (1) | US8484441B2 (en) |
| EP (1) | EP1735699B1 (en) |
| JP (1) | JP5382635B2 (en) |
| KR (1) | KR20070037568A (en) |
| CN (2) | CN101963897B (en) |
| CA (1) | CA2560093A1 (en) |
| TW (1) | TWI362617B (en) |
| WO (1) | WO2005096142A2 (en) |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9047094B2 (en) | 2004-03-31 | 2015-06-02 | Icera Inc. | Apparatus and method for separate asymmetric control processing and data path processing in a dual path processor |
| US7676646B2 (en) * | 2005-03-02 | 2010-03-09 | Cisco Technology, Inc. | Packet processor with wide register set architecture |
| US7529909B2 (en) * | 2006-12-28 | 2009-05-05 | Microsoft Corporation | Security verified reconfiguration of execution datapath in extensible microcomputer |
| US8755515B1 (en) | 2008-09-29 | 2014-06-17 | Wai Wu | Parallel signal processing system and method |
| KR101893796B1 (en) | 2012-08-16 | 2018-10-04 | 삼성전자주식회사 | Method and apparatus for dynamic data format |
| CN111158756B (en) * | 2019-12-31 | 2021-06-29 | 百度在线网络技术(北京)有限公司 | Method and apparatus for processing information |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5600801A (en) * | 1993-07-15 | 1997-02-04 | Dell Usa, L.P. | Multiple function interface device for option card |
Family Cites Families (49)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4228498A (en) * | 1977-10-12 | 1980-10-14 | Dialog Systems, Inc. | Multibus processor for increasing execution speed using a pipeline effect |
| US5136697A (en) * | 1989-06-06 | 1992-08-04 | Advanced Micro Devices, Inc. | System for reducing delay for execution subsequent to correctly predicted branch instruction using fetch information stored with each block of instructions in cache |
| DE69031257T2 (en) * | 1989-09-21 | 1998-02-12 | Texas Instruments Inc | Integrated circuit with an embedded digital signal processor |
| JPH05324430A (en) | 1992-05-26 | 1993-12-07 | Toshiba Corp | Data processor |
| US5423051A (en) | 1992-09-24 | 1995-06-06 | International Business Machines Corporation | Execution unit with an integrated vector operation capability |
| US5600810A (en) | 1994-12-09 | 1997-02-04 | Mitsubishi Electric Information Technology Center America, Inc. | Scaleable very long instruction word processor with parallelism matching |
| US6052773A (en) * | 1995-02-10 | 2000-04-18 | Massachusetts Institute Of Technology | DPGA-coupled microprocessors |
| US5737631A (en) * | 1995-04-05 | 1998-04-07 | Xilinx Inc | Reprogrammable instruction set accelerator |
| JP2931890B2 (en) | 1995-07-12 | 1999-08-09 | 三菱電機株式会社 | Data processing device |
| JP3658072B2 (en) | 1996-02-07 | 2005-06-08 | 株式会社ルネサステクノロジ | Data processing apparatus and data processing method |
| JPH09265397A (en) | 1996-03-29 | 1997-10-07 | Hitachi Ltd | VLIW instruction processor |
| GB2311882B (en) | 1996-04-04 | 2000-08-09 | Videologic Ltd | A data processing management system |
| US5956518A (en) | 1996-04-11 | 1999-09-21 | Massachusetts Institute Of Technology | Intermediate-grain reconfigurable processing device |
| DE19634031A1 (en) | 1996-08-23 | 1998-02-26 | Siemens Ag | Processor with pipelining structure |
| US6006321A (en) | 1997-06-13 | 1999-12-21 | Malleable Technologies, Inc. | Programmable logic datapath that may be used in a field programmable device |
| US5922065A (en) * | 1997-10-13 | 1999-07-13 | Institute For The Development Of Emerging Architectures, L.L.C. | Processor utilizing a template field for encoding instruction sequences in a wide-word format |
| JP3451921B2 (en) | 1998-03-30 | 2003-09-29 | 松下電器産業株式会社 | Processor |
| EP0953898A3 (en) | 1998-04-28 | 2003-03-26 | Matsushita Electric Industrial Co., Ltd. | A processor for executing Instructions from memory according to a program counter, and a compiler, an assembler, a linker and a debugger for such a processor |
| US6226735B1 (en) | 1998-05-08 | 2001-05-01 | Broadcom | Method and apparatus for configuring arbitrary sized data paths comprising multiple context processing elements |
| US6292845B1 (en) | 1998-08-26 | 2001-09-18 | Infineon Technologies North America Corp. | Processing unit having independent execution units for parallel execution of instructions of different category with instructions having specific bits indicating instruction size and category respectively |
| DE19843640A1 (en) | 1998-09-23 | 2000-03-30 | Siemens Ag | Procedure for configuring a configurable hardware block |
| US6553414B1 (en) | 1998-10-02 | 2003-04-22 | Canon Kabushiki Kaisha | System used in plural information processing devices for commonly using peripheral device in network |
| WO2000049496A1 (en) | 1999-02-15 | 2000-08-24 | Koninklijke Philips Electronics N.V. | Data processor with a configurable functional unit and method using such a data processor |
| EP1050810A1 (en) | 1999-05-03 | 2000-11-08 | STMicroelectronics SA | A computer system comprising multiple functional units |
| GB2352066B (en) | 1999-07-14 | 2003-11-05 | Element 14 Ltd | An instruction set for a computer |
| US6526430B1 (en) | 1999-10-04 | 2003-02-25 | Texas Instruments Incorporated | Reconfigurable SIMD coprocessor architecture for sum of absolute differences and symmetric filtering (scalable MAC engine for image processing) |
| US7039790B1 (en) | 1999-11-15 | 2006-05-02 | Texas Instruments Incorporated | Very long instruction word microprocessor with execution packet spanning two or more fetch packets with pre-dispatch instruction selection from two latches according to instruction bit |
| EP1102163A3 (en) | 1999-11-15 | 2005-06-29 | Texas Instruments Incorporated | Microprocessor with improved instruction set architecture |
| US6255849B1 (en) | 2000-02-04 | 2001-07-03 | Xilinx, Inc. | On-chip self-modification for PLDs |
| TW516320B (en) | 2000-02-22 | 2003-01-01 | Intervideo Inc | Implementation of quantization for SIMD architecture |
| JP2001306321A (en) | 2000-04-19 | 2001-11-02 | Matsushita Electric Ind Co Ltd | Processor |
| US7120781B1 (en) | 2000-06-30 | 2006-10-10 | Intel Corporation | General purpose register file architecture for aligned simd |
| JP2004512716A (en) * | 2000-10-02 | 2004-04-22 | アルテラ・コーポレイション | Programmable logic integrated circuit device including dedicated processor device |
| US20020174266A1 (en) * | 2001-05-18 | 2002-11-21 | Krishna Palem | Parameterized application programming interface for reconfigurable computing systems |
| JP2003005958A (en) | 2001-06-25 | 2003-01-10 | Pacific Design Kk | Data processor and method for controlling the same |
| JP2003099397A (en) | 2001-09-21 | 2003-04-04 | Pacific Design Kk | Data processing system |
| US6798239B2 (en) * | 2001-09-28 | 2004-09-28 | Xilinx, Inc. | Programmable gate array having interconnecting logic to support embedded fixed logic circuitry |
| JP3785343B2 (en) | 2001-10-02 | 2006-06-14 | 日本電信電話株式会社 | Client server system and data communication method in client server system |
| JP3779602B2 (en) | 2001-11-28 | 2006-05-31 | 松下電器産業株式会社 | SIMD operation method and SIMD operation device |
| KR100464406B1 (en) | 2002-02-08 | 2005-01-03 | 삼성전자주식회사 | Apparatus and method for dispatching very long instruction word with variable length |
| US7159099B2 (en) | 2002-06-28 | 2007-01-02 | Motorola, Inc. | Streaming vector processor with reconfigurable interconnection switch |
| JP3982353B2 (en) | 2002-07-12 | 2007-09-26 | 日本電気株式会社 | Fault tolerant computer apparatus, resynchronization method and resynchronization program |
| US7024543B2 (en) | 2002-09-13 | 2006-04-04 | Arm Limited | Synchronising pipelines in a data processing apparatus |
| TW569138B (en) | 2002-09-19 | 2004-01-01 | Faraday Tech Corp | A method for improving instruction selection efficiency in a DSP/RISC compiler |
| US7464254B2 (en) | 2003-01-09 | 2008-12-09 | Cisco Technology, Inc. | Programmable processor apparatus integrating dedicated search registers and dedicated state machine registers with associated execution hardware to support rapid application of rulesets to data |
| JP2004309570A (en) | 2003-04-02 | 2004-11-04 | Seiko Epson Corp | Optical communication module, optical communication device, and method of manufacturing the same |
| US7496776B2 (en) | 2003-08-21 | 2009-02-24 | International Business Machines Corporation | Power throttling method and apparatus |
| US7176713B2 (en) * | 2004-01-05 | 2007-02-13 | Viciciv Technology | Integrated circuits with RAM and ROM fabrication options |
| US7949856B2 (en) | 2004-03-31 | 2011-05-24 | Icera Inc. | Method and apparatus for separate control processing and data path processing in a dual path processor with a shared load/store unit |
- 2004-03-31: US application US10/813,433, publication US8484441B2 (not active, Expired - Lifetime)
- 2005-03-22: EP application EP05729261.7A, publication EP1735699B1 (not active, Expired - Lifetime)
- 2005-03-22: JP application JP2007505615A, publication JP5382635B2 (not active, Expired - Lifetime)
- 2005-03-22: CN application CN201010276291.9A, publication CN101963897B (not active, Expired - Fee Related)
- 2005-03-22: CA application CA002560093A, publication CA2560093A1 (not active, Abandoned)
- 2005-03-22: CN application CN200580010665XA, publication CN1989485B (not active, Expired - Fee Related)
- 2005-03-22: WO application PCT/GB2005/001073, publication WO2005096142A2 (not active, Ceased)
- 2005-03-22: KR application KR1020067020243A, publication KR20070037568A (not active, Ceased)
- 2005-03-24: TW application TW094109120A, publication TWI362617B (not active, IP Right Cessation)
Also Published As
| Publication number | Publication date |
|---|---|
| CA2560093A1 (en) | 2005-10-13 |
| CN101963897A (en) | 2011-02-02 |
| US20050223197A1 (en) | 2005-10-06 |
| TWI362617B (en) | 2012-04-21 |
| JP2007531135A (en) | 2007-11-01 |
| CN1989485A (en) | 2007-06-27 |
| US8484441B2 (en) | 2013-07-09 |
| WO2005096142A3 (en) | 2006-06-08 |
| CN1989485B (en) | 2011-08-03 |
| JP5382635B2 (en) | 2014-01-08 |
| WO2005096142A2 (en) | 2005-10-13 |
| EP1735699A2 (en) | 2006-12-27 |
| EP1735699B1 (en) | 2017-11-22 |
| TW200540713A (en) | 2005-12-16 |
| KR20070037568A (en) | 2007-04-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN100583027C (en) | | Apparatus and method for asymmetric dual path processing |
| US8484442B2 (en) | | Apparatus and method for control processing in dual path processor |
| US9965275B2 (en) | | Element size increasing instruction |
| JP2002333978A (en) | | Vliw type processor |
| CN101963897B (en) | | Apparatus and method for dual data path processing |
| WO2006136764A1 (en) | | A data processing apparatus and method for accelerating execution of subgraphs |
| KR20070022239A (en) | | Apparatus and method for asymmetric dual path processing |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| ASS | Succession or assignment of patent right |
Owner name: HUIDA TECHNOLOGY ENGLAND CO., LTD. Free format text: FORMER OWNER: ICERA INC. Effective date: 20130116 |
|
| C41 | Transfer of patent application or patent right or utility model | ||
| TA01 | Transfer of patent application right |
Effective date of registration: 20130116 Address after: London, England Applicant after: ICERA Inc. Address before: Bristol Applicant before: Icera Inc. Effective date of registration: 20130116 Address after: Bristol Applicant after: Icera Inc. Address before: Bristol Applicant before: Icera Inc. |
|
| GR01 | Patent grant | ||
| GR01 | Patent grant | ||
| CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20140312 |
|
| CF01 | Termination of patent right due to non-payment of annual fee |