CN1908927A - Reconfigurable integrated circuit device
- Publication number: CN1908927A (also listed as CNA2006100083495A and CN200610008349A)
- Authority: CN (China)
- Legal status: Granted (the listed legal status is an assumption and is not a legal conclusion)
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
Description
Technical field
The present invention relates to a reconfigurable integrated circuit device and, more particularly, to a novel configuration of an internal memory, mounted in the reconfigurable integrated circuit device, for transferring data to and from an external memory.
Background art
A reconfigurable integrated circuit device includes a plurality of processor elements and a network interconnecting them. A sequencer supplies configuration data to the processor elements and the network in response to an external or internal event, and an arbitrary operation state (operation circuit) is constructed from the processor elements and the network according to that configuration data. A conventional programmable microprocessor sequentially reads instructions stored in memory and processes them in order. Because the number of instructions one processor can execute simultaneously is limited, the processing capacity of such a microprocessor is limited as well.
In recently proposed reconfigurable integrated circuit devices, by contrast, several kinds of processor elements are installed in advance, such as ALUs with adder, multiplier, and comparator functions, delay circuits, and counters, together with a network connecting them. The processor elements and the network are reconfigured into a desired configuration according to configuration data from a state transition control unit that includes a sequencer, and a predetermined operation is executed in that operation state. When data processing in one operation state is completed, another operation state is constructed from other configuration data, and different data processing is executed in that state.
Dynamically constructing different operation states in this way improves the capacity to process large volumes of data and raises overall processing efficiency. Such a reconfigurable integrated circuit device is disclosed, for example, in Japanese Patent Application Laid-Open No. 2001-312481.
Summary of the invention
In a conventional reconfigurable integrated circuit device, an array of processor elements is surrounded by switches that connect the processors, and a state transition control unit supplies configuration data to the processor elements and the switch group to set an arbitrary operation state. Data is input to the processor element group from an external memory, the processor element group set to the operation state performs predetermined data processing on the input data, and the resulting data is output.
In such a device, the data required for processing is read from the external memory in a batch and stored in an internal memory, and the processor element group and switch group, set to a given operation state, then process all of the data that was read.
A reconfigurable integrated circuit device, however, executes different applications using a predetermined number of dynamically configured processor elements. Each processor element therefore needs to write the required amount of data to the external memory, or read it from the external memory, at the timing it requires. In the prior art, data is transferred over a data path formed by the switch group that connects the processor elements, and data can be exchanged with the external memory only at predetermined timings.
Furthermore, a predetermined number of internal memories, which store data read from the external memory or data to be written to it, are installed for the processor elements. Because the operation state configured by the user is variable, it is difficult to estimate how many internal memories are needed and what input/output characteristics they require. The configuration and operation of the internal memory in a reconfigurable integrated circuit device therefore require a high degree of flexibility.
In view of the above, an object of the present invention is to provide a reconfigurable integrated circuit device that allows highly flexible configuration and operation of its internal memory.
To achieve this object, a first aspect of the present invention is a reconfigurable integrated circuit device that is dynamically constructed into an arbitrary operation state based on configuration data. The device comprises: a plurality of clusters, the clusters including a plurality of arithmetic processor elements that each have a computing unit, a memory processor element that has a memory and transfers data to and from an external memory, and an inter-processor-element switch group for connecting the arithmetic processor elements and the memory processor element in an arbitrary state; an inter-cluster switch group for constructing data paths between the clusters in an arbitrary state; and an external memory bus for transferring data between the memory processor elements and the external memory. The arithmetic processor elements, the memory processor elements, the inter-processor-element switch groups, and the inter-cluster switch group are dynamically changed based on the configuration data. The device further comprises a direct memory access control unit that, in response to access requests from the memory processor elements of the plurality of clusters, transfers data between the memory processor elements and the external memory by direct memory access.
According to the first aspect, a memory processor element installed in a cluster can exchange data with the external memory by direct memory access over the external memory bus, which is separate from the inter-cluster switch group, and the reconfigured operation can be performed on data in the external memory at a timing suited to the reconfigured operation state.
In the first aspect of the present invention, the cluster preferably further includes a configuration data memory for storing the configuration data, and a sequencer that, in response to end signals from the arithmetic processor elements and the memory processor element, causes the configuration data memory to output the configuration data for constructing the next operation state.
In the first aspect, the reconfigurable integrated circuit device preferably further includes a data flow control unit that is shared by the plurality of memory processor elements, accepts direct memory access requests from them, and issues synchronized direct memory access requests to the direct memory access control unit that serves the plurality of memory processor elements. With this data flow control unit, access requests from the plurality of memory processor elements can be executed synchronously.
In the first aspect, the memory processor element further includes an inner interface to the internal bus connected to the inter-processor-element switch group, and an outer interface to the external memory bus. While the memory processor element accesses the external memory by direct memory access via the outer interface, the arithmetic processor elements access the memory processor element via the inner interface. This allows seamless data transfer between the external memory and the arithmetic processor elements.
Also preferably in the first aspect, the memory processor element accepts data transfer with the arithmetic processor elements while it transfers data to and from the external memory by direct memory access. When the direct memory access transfer cannot keep pace with the transfer to and from the arithmetic processor elements, the memory processor element asserts a stall signal to halt the operation of the arithmetic processor elements, and it negates the stall signal once the transfer has caught up. Thus, when seamless data transfer between the external memory and the arithmetic processor elements is not possible, the arithmetic processor elements are halted and malfunction is avoided.
To achieve the object, a second aspect of the present invention is a reconfigurable integrated circuit device that is dynamically configured into a predetermined operation state based on configuration data. The device comprises: a plurality of clusters, the clusters including arithmetic processor elements that have a computing unit, a memory processor element that has a memory and transfers data to and from an external memory, and an inter-processor-element switch group for connecting the arithmetic processor elements and the memory processor element in an arbitrary state; an inter-cluster switch group for constructing data paths between the clusters in an arbitrary state; and an external memory bus for transferring data between the memory processor elements and the external memory. The arithmetic processor elements, the memory processor elements, the inter-processor-element switch groups, and the inter-cluster switch group are dynamically changed based on the configuration data. The device further comprises a direct memory access control unit that, in response to access requests from the memory processor elements of the plurality of clusters, transfers data between the memory processor elements and the external memory by direct memory access. The memory processor element includes a first and a second memory bank; while one of the two banks transfers data to and from the external memory by direct memory access, the other transfers data to and from the arithmetic processor elements.
According to the second aspect, seamless data transfer between the external memory and the arithmetic processor elements can be performed at an arbitrary timing over the external memory bus, which is separate from the inter-cluster switch group.
According to the present invention, the memory processor element installed in each cluster allows data to be transferred by direct memory access to the external memory independently of the data paths between clusters. This increases the flexibility of data transfer to the memory processor elements in the reconfigurable integrated circuit device and allows the transfer to be carried out efficiently.
Brief description of the drawings
FIG. 1 is a block diagram of one cluster that forms part of the reconfigurable integrated circuit device according to the present embodiment;
FIG. 2 is a diagram showing a configuration example of the PE network unit according to the present embodiment;
FIG. 3 is a diagram showing an example of a circuit configured according to configuration data in the PE network unit according to the present embodiment;
FIG. 4 is a diagram showing an example of a circuit configured according to configuration data in the PE network unit according to the present embodiment;
FIG. 5 is a block diagram of the reconfigurable integrated circuit device according to the present embodiment;
FIG. 6 is a block diagram of an example of the memory processor element according to the present embodiment;
FIGS. 7A-7C are diagrams showing the switching operation of the two memory banks in the memory processor element according to the present embodiment;
FIGS. 8A-8C are diagrams showing the switching operation of the two memory banks in the memory processor element according to the present embodiment;
FIGS. 9A-9C are diagrams showing the switching operation of the two memory banks in the memory processor element according to the present embodiment;
FIGS. 10A-10C are diagrams showing the switching operation of the two memory banks in the memory processor element according to the present embodiment;
FIGS. 11A-11C are diagrams showing the switching operation of the two memory banks in the memory processor element according to the present embodiment;
FIG. 12 is a block diagram of the control unit of the memory processor element according to the present embodiment;
FIG. 13 is a state transition diagram of the control unit of the memory processor element according to the present embodiment;
FIGS. 14A-14B are diagrams showing the flag change control of the access end register;
FIGS. 15A-15B are diagrams showing the outer interface in the memory PE; and
FIG. 16 is a diagram showing the outer interface in the memory PE.
Detailed description of the embodiments
Embodiments of the present invention will now be described with reference to the drawings. The technical scope of the present invention, however, is not limited to these embodiments but extends to the matters stated in the claims and their equivalents.
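FIG. 1 is a block diagram of one cluster that forms part of the reconfigurable integrated circuit device according to the present embodiment. The cluster 10 includes a sequencer SEQ that performs state management, a configuration data memory 14 that stores configuration data CD, and a processor element network unit 16 that is configured into an arbitrary circuit configuration according to the configuration data CD. The configuration data CD is loaded into the configuration data memory 14 from a configuration data loading unit (not shown).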
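The processor element network unit 16 includes a plurality of processor elements (hereinafter often called PEs) PE0-PE5; an inter-PE switch group 20, which is a set of selectors that connect the PEs; and an input port unit 22 and an output port unit 24, which are the interfaces for data transfer with other clusters. The input port unit 22 and the output port unit 24 are connected to the inter-cluster switch group 30. In the example of FIG. 1, the processor elements PE0-PE3 are arithmetic PEs, each containing an ALU, an adder, and a comparator. The processor element PE4 is another kind of PE, such as a delay circuit or a counter, and the processor element PE5 is a memory PE containing a RAM.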
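Configuration data CD0-CD5 is supplied from the configuration data memory 14 to the processor elements PE0-PE5 and stored in registers (not shown) in those PEs. The circuit in each PE is dynamically configured based on the configuration data CD0-CD5 set in these registers. Likewise, configuration data CD is supplied from the configuration data memory 14 to the inter-PE switch group 20; based on it, the required switch connections are set up and the data paths between the PEs are dynamically configured. The inter-cluster switch group 30 is also dynamically configured based on configuration data CD, which configures the data paths between the clusters.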
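The memory processor element PE5 in the cluster can transfer data to and from each of PE0-PE4 via the inter-PE switch group 20, and for this purpose it is connected to the internal bus I-BUS. The memory processor element PE5 can also transfer data directly to and from the external memory E-MEM via the external buses E-BUS1 and E-BUS2. This memory access is performed directly, under the control of the direct memory access control unit DMAC, over a bus separate from the inter-cluster switch group 30. The memory processor element PE5 can therefore exchange data directly with the external memory E-MEM at a timing that is independent of the operation of the data paths between clusters.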
End signals CS0-CS5 are output from the processor elements PE0-PE5, respectively, and the switching signal generation unit 12 outputs a switching signal SW1 based on these end signals. In response to the switching signal SW1, the sequencer SEQ outputs a new address Add and a switching signal SW2 to the configuration data memory 14. In response, new configuration data is output and the circuit configuration of the PE network unit 16 is reconfigured.
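The handshake between the end signals, the sequencer, and the configuration data memory can be sketched in software. The following is a minimal behavioral model, not the circuit of the embodiment. Two points are assumptions made only for illustration: that SW1 is asserted once every PE has raised its end signal, and that the next state is simply the next address in the configuration data memory. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSequencer:
    """Toy model of the SEQ / configuration data memory handshake."""

    config_memory: list                 # config_memory[addr] -> configuration data CD
    num_pes: int = 6                    # PE0-PE5 as in FIG. 1
    address: int = 0                    # address Add currently applied to the memory
    end_signals: list = field(init=False)

    def __post_init__(self):
        self.end_signals = [False] * self.num_pes

    def raise_end_signal(self, pe_index):
        """A PE reports completion (one of CS0-CS5)."""
        self.end_signals[pe_index] = True

    def switching_signal(self):
        """SW1. Assumption: asserted once every PE has raised its end signal."""
        return all(self.end_signals)

    def step(self):
        """Return the configuration data in force after this step."""
        if self.switching_signal():
            # Assumption: the next operation state is the next address, wrapping.
            self.address = (self.address + 1) % len(self.config_memory)
            self.end_signals = [False] * self.num_pes
        return self.config_memory[self.address]
```

In this model the configuration stays unchanged until the last outstanding end signal arrives, after which `step()` returns the configuration data of the next operation state.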
FIG. 2 is a diagram showing a configuration example of the PE network unit according to the present embodiment. The arithmetic processor elements PE0-PE3, the memory processor element PE5, and the other processor element PE4 can be connected to one another via selectors 41, each of which is one switch of the inter-PE switch group 20. In this configuration, each of the processor elements PE0-PE5 can be set to an arbitrary configuration based on the configuration data CD0-CD5, and the selectors 41 of the inter-PE switch group 20 can likewise be set to an arbitrary configuration based on the configuration data CD.
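As illustrated at the lower right of FIG. 2, the selector 41 includes a register 42 that stores configuration data CD, a selector circuit 43 that selects an input according to the data in the register 42, and a flip-flop 44 that latches the output of the selector circuit 43 in synchronization with the clock CK.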
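The three parts of the selector 41 map naturally onto a small software model. The sketch below is illustrative only: it treats the configuration data as a plain input index, which is an assumption about the encoding, and the class and method names are hypothetical.

```python
class Selector:
    """Toy model of selector 41: register 42, selector circuit 43, flip-flop 44."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.config = 0        # register 42: index of the selected input
        self.output = None     # flip-flop 44: value latched on the last clock edge

    def load_config(self, select_index):
        """Store configuration data CD (assumed here to be an input index)."""
        if not 0 <= select_index < self.num_inputs:
            raise ValueError("select index out of range")
        self.config = select_index

    def clock(self, inputs):
        """One edge of CK: latch the input chosen by the register and return it."""
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        self.output = inputs[self.config]   # selector circuit 43 feeding flip-flop 44
        return self.output
```

Loading new configuration data does not change the latched output by itself; the new selection takes effect on the next call to `clock()`, which mirrors the clocked flip-flop at the selector output.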
FIGS. 3 and 4 are diagrams showing examples of circuits configured according to configuration data in the PE network unit of the present embodiment. In FIGS. 3 and 4, the arithmetic processor elements PE0-PE3 and PE6, whose arithmetic circuits can be dynamically configured, are connected by the inter-PE switch group 20 and configured as a dedicated arithmetic circuit that executes a predetermined operation at high speed. The processor element PE6 is not shown in FIGS. 1 and 2.
FIG. 3 shows an example in which a dedicated arithmetic circuit is configured to evaluate the following expression on input data a, b, c, d, e, and f.
(a+b)+(c-d)+(e+f)
In this configuration example, the processor element PE0 is configured as a circuit computing A=a+b, PE1 as a circuit computing B=c-d, PE2 as a circuit computing C=e+f, PE3 as a circuit computing D=A+B, and PE6 as a circuit computing E=D+C. Each of the data items a to f is supplied from the memory processor element and from external clusters (not shown), and the output of the processor element PE6 is output as the operation result E to the memory processor element and to external clusters.
The processor elements PE0, PE1, and PE2 operate in parallel, the processor element PE3 computes D=A+B from their results, and finally the processor element PE6 computes E=D+C. Configuring a dedicated arithmetic circuit in this way enables parallel operation and improves processing efficiency.
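The FIG. 3 circuit can be written down as a small dataflow description and evaluated level by level. This is a sketch under stated assumptions: the dictionary format standing in for the configuration data and the `evaluate` helper are hypothetical, chosen only to show which PEs can operate in the same step.

```python
import operator

# Hypothetical stand-in for the FIG. 3 configuration data: each PE gets an
# operation and two sources (an external input name or another PE's output).
FIG3_CONFIG = {
    "PE0": (operator.add, "a", "b"),      # A = a + b
    "PE1": (operator.sub, "c", "d"),      # B = c - d
    "PE2": (operator.add, "e", "f"),      # C = e + f
    "PE3": (operator.add, "PE0", "PE1"),  # D = A + B
    "PE6": (operator.add, "PE3", "PE2"),  # E = D + C
}


def evaluate(config, inputs, output_pe):
    """Evaluate the configured circuit and return (result, stages).

    `stages` lists, per step, the PEs whose operands were all available,
    i.e. the PEs that could operate in parallel in that step.
    """
    values = dict(inputs)
    pending = dict(config)
    stages = []
    while pending:
        ready = [pe for pe, (_, x, y) in pending.items()
                 if x in values and y in values]
        if not ready:
            raise ValueError("configuration has unresolved sources")
        # Every ready PE reads only values produced in earlier steps.
        results = {pe: pending[pe][0](values[pending[pe][1]],
                                      values[pending[pe][2]])
                   for pe in ready}
        values.update(results)
        for pe in ready:
            del pending[pe]
        stages.append(sorted(ready))
    return values[output_pe], stages
```

Calling `evaluate(FIG3_CONFIG, {"a": 1, "b": 2, "c": 7, "d": 3, "e": 4, "f": 5}, "PE6")` yields the result 16 and three stages: PE0, PE1, and PE2 together, then PE3, then PE6, matching the order described above.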
Each arithmetic processor element has a built-in ALU, adder, multiplier, and comparator, and can be reconfigured into an arbitrary arithmetic circuit based on the configuration data CD. Configuring the elements as shown in FIG. 3 yields a dedicated arithmetic circuit for the expression above, in which several operations execute in parallel, so that operation efficiency is improved.
FIG. 4 shows an example in which a dedicated arithmetic circuit is configured to compute (a+b)*(c-d) on input data a to d. The processor element PE0 is configured as a circuit computing A=a+b, PE1 as a circuit computing B=c-d, and PE3 as a circuit computing C=A*B, and the operation result C is output to the memory processor element or to an external cluster. Here too, the processor elements PE0 and PE1 operate in parallel, and the processor element PE3 computes C=A*B from their results A and B. Configuring a dedicated arithmetic circuit thus improves the efficiency of this operation, including when it is applied to a large amount of data.
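To illustrate how such a circuit handles a stream of data, the FIG. 4 circuit can be modeled clock by clock. The sketch assumes one register between the PE0/PE1 stage and the PE3 stage, which is a simplification chosen for illustration rather than a detail given in the text; the function name is hypothetical.

```python
def run_fig4_pipeline(samples):
    """Clock-by-clock model of (a+b)*(c-d) with one register per stage output."""
    reg_a = reg_b = None      # outputs of PE0 and PE1 latched in the previous cycle
    results = []
    # One extra idle cycle lets the last sample drain out of the pipeline.
    for sample in list(samples) + [None]:
        # Stage 2 (PE3): multiply the values latched in the previous cycle.
        if reg_a is not None:
            results.append(reg_a * reg_b)
        # Stage 1 (PE0 and PE1 in parallel): latch new sums and differences.
        if sample is not None:
            a, b, c, d = sample
            reg_a, reg_b = a + b, c - d
        else:
            reg_a = reg_b = None
    return results
```

After the first cycle fills the pipeline, one result is produced per cycle while PE0 and PE1 already work on the next input, which is the throughput benefit the paragraph above refers to for large amounts of data.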
FIG. 5 is a block diagram of the reconfigurable integrated circuit device according to the present embodiment. In FIG. 5, a plurality of clusters CLS0-CLS3 are installed, and the inter-cluster switch group 30 that connects them is arranged between the clusters. By configuring the inter-cluster switch group 30 according to the configuration data CD, an arbitrary arithmetic circuit combining several clusters can be dynamically configured.
In the example of FIG. 5, a memory processor element PE-RAM is installed in each of the clusters CLS0-CLS3. Depending on the case, a cluster may contain several memory processor elements or none. These memory processor elements are connected via the external bus E-BUS1 to the direct memory access control unit DMAC, and they transfer data to and from the external memory E-MEM by direct memory access via the DMAC. A DDR-SDRAM (double data rate synchronous DRAM), for example, is used as the external memory E-MEM as an example of a high-speed memory. In addition, one data flow control unit 40 is installed in common for the plurality of memory processor elements PE-RAM. Each memory processor element issues an access request DR0-DR3; in response, the data flow control unit 40 sends an access command to the control unit DMAC, and data is transferred by DMA to or from the memory processor element that issued the access request.
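The data flow control unit 40 accepts access requests from the plurality of memory processor elements and has the DMA data transfers between those memory processor elements and the external memory executed synchronously. In other words, based on the access command ACMD from the data flow control unit 40, the access control unit DMAC executes the DMA data transfers with the plurality of memory processor elements synchronously in round-robin fashion.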
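Round-robin service of several pending requests can be sketched as follows. The text states only that the DMAC serves the memory processor elements in round-robin fashion; the burst size per turn, the request format, and the function name are assumptions made for illustration.

```python
from collections import deque


def round_robin_dma(requests, burst_words=4):
    """Return the order in which bursts are served.

    requests:    dict mapping a memory PE name to the number of words it has
                 requested, e.g. {"PE-RAM0": 8, "PE-RAM1": 4}.
    burst_words: words moved per turn (hypothetical parameter).
    """
    if burst_words <= 0:
        raise ValueError("burst_words must be positive")
    queue = deque((pe, words) for pe, words in requests.items() if words > 0)
    schedule = []
    while queue:
        pe, remaining = queue.popleft()
        moved = min(burst_words, remaining)
        schedule.append((pe, moved))
        if remaining > moved:
            # Unfinished requests rejoin the back of the queue.
            queue.append((pe, remaining - moved))
    return schedule
```

With requests of 8, 4, and 6 words and a burst of 4, the schedule interleaves the three memory PEs instead of finishing one request before starting the next, so all of them make progress together.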
In this way, the memory processor element in a cluster transfers by DMA, from the external memory E-MEM, the data to be processed by the arithmetic circuit configured from the arithmetic processor elements in the cluster, and transfers the processed data by DMA to the external memory E-MEM. These DMA transfers are carried out directly over the external buses E-BUS1 and E-BUS2, which are independent of the inter-cluster switch group 30 that connects the clusters. Thus, even though the connection structure of the inter-cluster switch group 30 changes dynamically, each memory processor element can exchange data with the external memory at the timing it requires, over a path independent of the inter-cluster switch group 30, and optimal data transfer can be achieved for a dynamically configured cluster or group of clusters.
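FIG. 6 is a block diagram of an example of the memory processor element according to the present embodiment. To achieve seamless data transfer between the external memory and the arithmetic processor elements in the cluster, the memory processor element includes a first memory bank BNK0 and a second memory bank BNK1, an inner interface 50 between these memory banks and the inter-PE switch group 20, and an outer interface 52 between these memory banks and the external bus E-BUS1. The memory banks BNK0 and BNK1 each include four 16-bit-wide RAMs. The inner interface 50 is connected to the internal bus I-BUS, which is connected to the inter-PE switch group 20, and is dynamically configured into different input/output bus interface structures based on the configuration data CD. The outer interface 52 is connected to the external bus E-BUS1 and is likewise dynamically configured into different input/output bus interface structures based on the configuration data CD. The input/output bus interface structures that can be configured are described in detail later.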
Of the first memory bank BNK0 and the second memory bank BNK1, while one bank transfers data to and from the internal arithmetic processor elements PE/ALU, the other transfers data to and from the external memory E-MEM, and the two banks can alternate between these roles. For this purpose, selectors SEL are installed between the memory banks BNK0 and BNK1 and the inner interface 50 and outer interface 52, and these selectors SEL are set according to the configuration data CD. The first and second memory banks can thus be connected alternately to the inner and outer interfaces. The signal lines between the interfaces 50 and 52 and each of the memory banks BNK0 and BNK1 include 16-bit data lines, address lines, and all other necessary control lines.
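The memory processor element internally includes a memory control unit 54, which controls memory bank switching and DMA requests, and an arithmetic control unit 56, which controls execution of operations by the internal arithmetic processor elements PE/ALU. The memory control unit 54 monitors the state of the memory banks and performs bank switching control, issues DMA requests, and asserts and negates the stall signal STR that halts the operation of the arithmetic processor elements, thereby achieving seamless data transfer between the external memory and the internal arithmetic processor elements. In response to the stall signal STR, the arithmetic control unit 56 controls the start and stop of the arithmetic processor elements.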
FIGS. 7A-7C and 8A-8C are diagrams showing the switching operation of the two memory banks in the memory processor element of the present embodiment. They show, inside the memory processor element PE/RAM, the two memory banks BNK0 and BNK1 and the access end register END-REG, which the memory control unit 54 (see FIG. 6) uses to control bank switching. There are two access end registers END-REG, each holding a flag that indicates the access state of the first or second memory bank. For example, when a memory access ends and an end signal is received, the flag is set to the access end state "1", and when the memory bank becomes ready for access, the flag is set to the ready state "0". By monitoring these two register values, the memory control unit 54 (see FIG. 6) controls switching of the two memory banks BNK0 and BNK1.
Operation after initial start-up is now described with reference to FIG. 6, FIGS. 7A-7C, and FIGS. 8A-8C. At start-up, after reset is released, the sequencer SEQ outputs the address corresponding to initial start-up, the configuration data for initial start-up is output from the configuration data memory 14 (FIG. 6), and the processor elements PE and the inter-PE switch group 20 in the cluster are set to the initial circuit configuration. This initial start-up sets initial values in the access end register END-REG, as shown in FIG. 7A. In this example, the register for the first memory bank BNK0 is in the ready state (flag "0") and the register for the second memory bank BNK1 is in the access end state (flag "1"). The initial start-up also configures the selectors SEL so that the first memory bank BNK0 is connected to the outer interface 52 and the second memory bank BNK1 is connected to the inner interface 50.
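After initial start-up, the memory control unit 54 refers to the access end register and outputs an access request DMAR for the external memory. As described above, the access request DMAR is sent via the data flow control unit 40 (FIG. 5) to the direct memory access control unit DMAC, and direct data transfer between the external memory E-MEM and the first memory bank BNK0 begins. Specifically, data read from the external memory E-MEM is transferred directly over the external bus and written into the first memory bank BNK0. As noted above, the access requests DMAR at initial start-up are output from a plurality of memory processor elements, so the data transfers by the several direct memory accesses are executed synchronously.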
Then, as shown in FIG. 7B, when the data transfer from the external memory E-MEM to the first memory bank BNK0 ends, the DMA control unit DMAC sends an access end signal END1, and in response the bit of the access end register END-REG corresponding to the first memory bank changes to the access end state (flag "1"). When both registers have thus entered the access end state (flag "1"), the memory control unit 54 issues a state end signal CS, which causes the sequencer SEQ to output the next address Add and the configuration data memory 14 to output new configuration data CD, so that the first memory bank BNK0 and the second memory bank BNK1 are switched. That is, the second memory bank BNK1 is connected to the outer interface 52 and the first memory bank BNK0 is connected to the inner interface 50.
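Then, as shown in FIG. 7C, once the two memory banks have been switched, the memory control unit 54 clears the access end register END-REG, setting both memory banks to the ready state (flag "0"). In response to this state, the memory control unit 54 outputs an access request DMAR for the external memory, and based on that request the DMA control unit DMAC controls the data transfer between the external memory E-MEM and the second memory bank BNK1. Unlike at initial start-up, the access request DMAR in this case is issued at the timing at which the memory processor element needs the access, so the data transfer is performed as needed. At the same time, the memory control unit 54 outputs a signal ALU-EN indicating that the internal arithmetic processor elements may run. In response, the arithmetic control unit 56 outputs an operation start signal ALU-ST to the internal arithmetic processor elements PE/ALU, which starts their operation processing. The internal arithmetic processor elements PE/ALU then access the first memory bank BNK0, read data, and perform operation processing on the data read.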
Then, as shown in FIG. 8A, when the data transfer between the second memory bank BNK1 and the external memory E-MEM ends, the access end register END-REG is set to the access end state (flag "1") in response to the access end signal END1. Normally, direct memory access to the external memory uses a wide data bus and is therefore fast, so it ends before the data transfer with the internal arithmetic processor elements does.
As shown in FIG. 8B, when the access from the internal arithmetic processor elements PE/ALU also ends, the other flag of the access end register END-REG is set to the access end state (flag "1") by the access end signal END2. In response, the memory control unit 54 outputs the state end signal CS, and the connections of the first memory bank BNK0 and the second memory bank BNK1 to the inner and outer interfaces are swapped according to the configuration data CD output from the configuration data memory 14.
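As shown in FIG. 8C, the memory control unit 54 again outputs the direct memory access request DMAR, starting the data transfer between the first memory bank BNK0 and the external memory E-MEM, while the arithmetic control unit 56 outputs the operation start signal ALU-ST, starting access from the internal arithmetic processor elements PE/ALU to the second memory bank BNK1.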
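As described above, by alternately switching the first and second memory banks, the memory control unit 54 achieves seamless data transfer from the external memory E-MEM to the internal arithmetic processor elements. Specifically, because direct memory access to the external memory is faster than access by the internal arithmetic processor elements, the arithmetic processor elements can read and process data without interruption.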
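The bank-switching sequence of FIGS. 7A-7C and 8A-8C can be sketched as a small software model. This is only an illustrative sketch, not the circuit itself: the class name `MemoryPE`, its methods, and the log strings are hypothetical, and the model assumes one END-REG flag per bank, with a switch taking place as soon as both flags read "1".

```python
from dataclasses import dataclass, field


@dataclass
class MemoryPE:
    """Toy model of one memory PE with two banks (BNK0 and BNK1)."""
    outer: int = 0  # index of the bank wired to the outer interface 52 (DMA side)
    # One END-REG flag per bank: 0 = ready, 1 = access ended.
    # Initial values follow FIG. 7A: BNK0 ready, BNK1 ended.
    end_reg: list = field(default_factory=lambda: [0, 1])
    log: list = field(default_factory=list)

    @property
    def inner(self) -> int:
        """Index of the bank wired to the inner interface 50 (arithmetic PE side)."""
        return 1 - self.outer

    def end1(self) -> None:
        """END1: the DMA transfer on the outer-side bank has finished."""
        self.end_reg[self.outer] = 1
        self._switch_if_both_ended()

    def end2(self) -> None:
        """END2: the arithmetic PE access on the inner-side bank has finished."""
        self.end_reg[self.inner] = 1
        self._switch_if_both_ended()

    def _switch_if_both_ended(self) -> None:
        # Both flags "1" stands for the state end signal CS: swap the banks,
        # clear END-REG, then start the next DMA transfer and ALU run.
        if self.end_reg == [1, 1]:
            self.outer = 1 - self.outer
            self.end_reg = [0, 0]
            self.log.append(f"DMAR->BNK{self.outer}, ALU-ST->BNK{self.inner}")


pe = MemoryPE()
pe.end1()  # FIG. 7B: first DMA fill of BNK0 ends, banks swap (FIG. 7C)
pe.end1()  # FIG. 8A: DMA on BNK1 ends first
pe.end2()  # FIG. 8B: arithmetic PE finishes BNK0, banks swap again (FIG. 8C)
print(pe.log)
```

In this model the order of `end1()` and `end2()` does not matter for correctness; it only determines which side waits for the other.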
FIGS. 9A-9C are diagrams describing the switching operation of the two memory banks in the memory processor element according to the present embodiment. The control performed when seamless data transfer breaks down is described here. Because direct data transfer with the external memory is fast, one memory bank normally finishes its transfer with the external memory before the other memory bank finishes its transfer with the internal arithmetic PEs. The banks are switched when the transfer with the internal arithmetic PEs completes, which makes seamless data transfer between the external memory and the internal arithmetic PEs possible. In some cases, however, the data transfer with the internal arithmetic PEs completes first for some reason.
As shown in FIG. 9A, if the data transfer from the first memory bank BNK0 to the internal arithmetic PEs ends first, the access end register END-REG is set to the access end state (flag "1") by the end signal END2. In response, the memory control unit 54 asserts a stall signal STR to the arithmetic control unit 56, and the arithmetic PE array temporarily halts its pipeline processing. In other words, when data cannot be read from the memory PE, the pipeline of the arithmetic PE array cannot proceed, and without the stall the arithmetic processing would fail.
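As shown in FIG. 9B, when the data transfer of the second memory bank BNK1 completes, the access end register END-REG is set to the access end state by the end signal END1. The memory control unit 54 then outputs the state end signal CS, and the memory banks are switched according to the configuration data CD. Then, as shown in FIG. 9C, the memory control unit 54 outputs the access request DMAR so that the first memory bank BNK0 starts data transfer with the external memory, negates the stall signal STR, and restarts the operation of the internal arithmetic PE array, whereupon the second memory bank BNK1 starts data transfer with the internal arithmetic PEs.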
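In this way, when a dedicated arithmetic circuit has been configured and data is being processed in a pipeline, the memory control unit 54 monitors the access states of the two memory banks, and when seamless data transfer is not possible it asserts the stall signal STR to halt the pipeline processing of the internal arithmetic PEs. This avoids failures in the pipeline processing. When seamless transfer becomes possible again, the memory control unit 54 negates the stall signal STR and pipeline processing resumes.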
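The stall rule of FIGS. 9A-9C reduces to a simple condition: STR is asserted while the arithmetic-side access has ended but the DMA-side transfer has not. The sketch below is a hedged illustration of that rule only; the function names `stall_signal` and `run` and the event strings are hypothetical, and timing inside a clock cycle is ignored.

```python
def stall_signal(outer_done: bool, inner_done: bool) -> bool:
    """STR is asserted when the arithmetic PEs have finished their bank
    (END2 seen) while the DMA transfer on the other bank is still running."""
    return inner_done and not outer_done


def run(events):
    """Replay a sequence of "END1"/"END2" events and report, after each one,
    whether the banks are switched and whether STR is asserted."""
    outer_done = inner_done = False
    trace = []
    for event in events:
        if event == "END1":
            outer_done = True
        elif event == "END2":
            inner_done = True
        else:
            raise ValueError(f"unknown event: {event}")
        if outer_done and inner_done:
            # Both accesses ended: switch banks, clear the flags, release STR.
            outer_done = inner_done = False
            trace.append((event, "switch", False))
        else:
            trace.append((event, "wait", stall_signal(outer_done, inner_done)))
    return trace


print(run(["END1", "END2"]))  # normal case: the DMA side finishes first
print(run(["END2", "END1"]))  # FIG. 9 case: the arithmetic side finishes first
```

Only the second sequence produces a step with STR asserted, which corresponds to the temporary pipeline halt described above.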
FIGS. 10A-10C and FIGS. 11A-11C are diagrams describing the switching operation of the two memory banks in the memory processor element for the case in which data is transferred from the internal arithmetic PEs to the external memory E-MEM via the memory PE.
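In FIG. 10A, the arithmetic PE writes data to the first memory bank BNK0. In FIG. 10B, when the write completes, both flags of the access end register END-REG are in the access end state (flag "1"). In response, the memory control unit 54 outputs the state end signal CS and the two memory banks are switched based on the configuration data CD. As shown in FIG. 10C, the first memory bank BNK0 starts direct data transfer with the external memory in response to the access request DMAR, and data writing from the arithmetic PE to the second memory bank BNK1 starts in response to the operation start signal ALU-ST sent to the arithmetic PE.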
Then, as shown in FIG. 11A, the data transfer of the first memory bank BNK0 completes first, and the data write from the arithmetic PE ends as shown in FIG. 11B. The memory control unit 54 then switches the two memory banks, and the data transfers of the swapped memory banks each start as shown in FIG. 11C.
As described above, data transfer from the arithmetic PEs to the external memory is likewise executed seamlessly via the memory PE. If seamless data transfer becomes impossible partway through, the stall signal STR is asserted and the arithmetic PE array halts its pipeline processing, resuming when data transfer becomes possible again.
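The write direction of FIGS. 10A-11C can be illustrated at the data level. The following sketch is an assumption-laden simplification: the function name `stream_out` is hypothetical, the two concurrent activities (the arithmetic PE filling one bank, the DMA draining the other) are modelled as sequential steps inside one loop iteration, and stalls are not modelled.

```python
def stream_out(blocks):
    """Pass result blocks from the arithmetic PE to external memory through
    two alternating banks and return the external memory contents."""
    banks = [None, None]  # BNK0 and BNK1; None means the bank holds no data
    inner = 0             # bank currently written by the arithmetic PE
    external = []         # stands in for the external memory E-MEM
    for block in blocks:
        banks[inner] = list(block)  # the arithmetic PE fills the inner-side bank
        outer = 1 - inner
        if banks[outer] is not None:
            # Meanwhile the DMA drains the bank filled in the previous round.
            external.extend(banks[outer])
            banks[outer] = None
        inner = outer  # both sides done: swap the roles of the banks
    # After the last swap, the most recently filled bank is on the outer side.
    last_filled = 1 - inner
    if banks[last_filled] is not None:
        external.extend(banks[last_filled])
    return external


print(stream_out([[1, 2], [3, 4], [5]]))
```

The point of the model is that the blocks reach external memory in production order even though each block is drained one round after it is written.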
FIG. 12 is a block diagram describing the control units of the memory processor element according to the present embodiment, and FIG. 13 is a state transition diagram of those control units. In the example of FIG. 12, the memory unit 60 in a cluster has a plurality of memory processor elements RAM-PE0 to PEn, and an array of arithmetic processor elements (PE/ALU array) is configured for each of the memory processor elements RAM-PE0 to PEn. Each memory PE includes a bank switching control unit 541 and a DMA transfer execution judgment unit 542, which form part of the memory control unit 54, and an ALU operation execution judgment unit 561, which forms part of the arithmetic control unit 56. The memory PEs share an ALU operation control unit 562, which belongs to the arithmetic control unit 56, and a DMA transfer control unit 543 is provided as part of the memory control unit 54. The first memory bank BNK0 and the second memory bank BNK1 in each memory PE are configured to transfer data alternately with the access control unit DMAC via the external bus and alternately with the arithmetic processor element array (PE/ALU array) via the inter-PE switch group PE-SW in the cluster.
The control flow will now be described with reference to the state transition diagram in FIG. 13. As described above, the memory processor element RAM-PE first starts up and is configured into the required circuit configuration based on the configuration data CD (C10). At start-up, the access end register END-REG is set to its initial flag values, and these flags place the memory banks in their initial state (C12).
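During operation after the memory processor element RAM-PE has started, the bank switching control unit 541 controls switching of the memory banks according to the state of the access end register END-REG (both flags "1") (C12) and switches the banks (C14). When the memory banks are switched, the circuit configuration of the arithmetic PEs may be changed accordingly (C12, C14).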
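When the memory banks have been switched, the DMA transfer execution judgment unit 542 judges whether data transfer with the external memory is possible, and if it is, outputs the DMA transfer enable signal DMA-EN to the DMA transfer control unit 543 located outside the memory PE (C16). Whether data transfer is possible depends on the state of the access end register END-REG, which indicates the state of the memory banks. The DMA transfer control unit 543 then outputs an access request to the access control unit DMAC (C18) via the data flow control unit 40 (not shown here; see FIG. 5), and the data transfer is executed (C20). When the data transfer with the external memory ends, the DMA transfer control unit 543 receives the data transfer end signal END1, and the data transfer end signal END10 is sent to the bank switching control unit 541. The bank switching control described above is then performed according to the state of the access end register END-REG (C12).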
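Meanwhile, when the memory banks have been switched, the ALU operation execution judgment unit 561 monitors the state of the memory banks based on the access end register END-REG and judges whether access from the arithmetic PEs is possible, that is, whether the arithmetic PEs can execute their processing (C22). If so, the ALU operation execution judgment unit 561 outputs the operation execution enable signal ALU-EN.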
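Only when it has received the operation execution enable signal ALU-EN from all of the memory processor elements RAM-PE0 to PEn does the ALU operation control unit 562 output the operation start signal ALU-ST to all of the arithmetic PE arrays in the cluster (C24), causing them to execute their processing synchronously (C26). In other words, the arithmetic PE arrays in a cluster must run their pipelines in synchronization while transferring data to and from the memory PEs, so a single ALU operation control unit 562 is provided as a component common to the memory PEs, and it outputs the operation start signal ALU-ST to the arithmetic PE arrays only when ALU-EN has been received from every memory PE. The ALU operation execution judgment unit 561 also monitors the state of the memory banks, and if data transfer cannot proceed seamlessly it asserts the stall signal STR, described above, to halt the pipeline processing of the arithmetic PE array.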
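The gating performed by the shared ALU operation control unit 562 amounts to a logical AND over the ALU-EN signals of all memory PEs. A minimal sketch follows; the function names are hypothetical, and the rule that a memory PE raises ALU-EN when its inner-side END-REG flag is "0" (ready) is an assumption drawn from the flag values described for FIG. 7C, not a statement of the actual judgment logic.

```python
def alu_en(inner_bank_flag: int) -> bool:
    """Assumed per-PE judgment (unit 561): the arithmetic PEs may run when the
    inner-side bank is in the ready state (END-REG flag 0)."""
    return inner_bank_flag == 0


def alu_st(enables) -> bool:
    """Shared control (unit 562): issue ALU-ST only when every memory PE in the
    cluster has raised ALU-EN. An empty cluster never starts."""
    enables = list(enables)
    return bool(enables) and all(enables)


# Three memory PEs, RAM-PE0 to RAM-PE2; the second one is not ready yet.
inner_flags = [0, 1, 0]
print(alu_st(alu_en(flag) for flag in inner_flags))
```

Keeping the AND in one shared unit is what lets all arithmetic PE arrays in the cluster start on the same signal rather than each starting when its own memory PE happens to be ready.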
When the arithmetic processing completes, access to the memory bank on the arithmetic PE side ends. The end signal END2 is then received from the arithmetic PE, and the ALU operation execution judgment unit 561 negates the operation execution enable signal ALU-EN. The end signal END2 changes the flag state of the access end register END-REG, and switching of the memory banks or a change of the arithmetic PE configuration is controlled and executed accordingly (C12, C14).
In FIG. 13, the state transitions inside the dotted line are those of the memory PE. The states of the DMA transfer control unit 543 and the direct memory access control unit DMAC are shown to its left, and the states of the ALU operation control unit 562 and the arithmetic PE array to its right.
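In FIGS. 12 and 13, the DMA transfer control unit 543 outputs a DMA request based on the DMA transfer enable signal DMA-EN output by the DMA transfer execution judgment unit 542. The DMA transfer control unit 543 may, however, also check the state of the channels accepted by the direct memory access control unit DMAC to judge whether the DMA transfer can be executed, that is, whether the timing is appropriate, and output the DMA request only if it is. In this way, when the number of channels in use by the DMAC exceeds a predetermined number and the timing is not suitable for sending a DMA request, the request can be withheld until the number of channels falls to the predetermined number or below, so that the DMA transfer is delayed. The DMA transfer enable signal DMA-EN is generated according to the state of the access end register END-REG, so this control for delaying the DMA transfer timing is important.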
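The request throttling described above can be sketched as a small queue in front of the DMAC. This is a hypothetical model: the class `DmaTransferControl`, its method names, and the first-in first-out ordering of held requests are all assumptions. It follows the text literally in that a request is held only while the channel count exceeds the limit, so the count may briefly reach the limit plus one.

```python
from collections import deque


class DmaTransferControl:
    """Toy model of unit 543: forwards DMA requests to the DMAC unless the
    number of busy channels exceeds a predetermined limit."""

    def __init__(self, channel_limit: int):
        self.channel_limit = channel_limit
        self.busy_channels = 0
        self.pending = deque()  # memory PEs whose DMA-EN is waiting

    def dma_en(self, pe_id):
        """A memory PE raised DMA-EN. Returns the requests issued right now."""
        self.pending.append(pe_id)
        return self._issue()

    def end1(self):
        """The DMAC reported END1 for one transfer. Returns newly issued requests."""
        if self.busy_channels > 0:
            self.busy_channels -= 1
        return self._issue()

    def _issue(self):
        issued = []
        # Hold requests while the busy count exceeds the limit; issue once it
        # is at or below the limit.
        while self.pending and self.busy_channels <= self.channel_limit:
            issued.append(self.pending.popleft())
            self.busy_channels += 1
        return issued


ctrl = DmaTransferControl(channel_limit=1)
print(ctrl.dma_en("RAM-PE0"), ctrl.dma_en("RAM-PE1"), ctrl.dma_en("RAM-PE2"))
print(ctrl.end1())
```

A stricter design would hold requests as soon as the count reaches the limit; which threshold applies in the actual device is not stated in the text.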
In FIG. 13, when the operation of the arithmetic processor element array ends (C26), new configuration data is output from the sequencer and the configuration of the arithmetic PEs is changed (C12). The configuration data is switched whenever necessary.
FIGS. 14A-14B are diagrams describing the flag change control of the access end register. FIG. 14A shows the flag change control when the memory bank BNK0/1 is connected to the inner side (the arithmetic PE array side). The access address Add is supplied from the arithmetic PE array side to the memory bank BNK, and the corresponding access is executed. The access address Add is also supplied to a comparator 70 in the memory control unit 54. The end address E-Add, the last address to be accessed, is preset in the comparator 70 when the circuit is configured based on the configuration data. Each time the address valid signal Valid, which accompanies the access address and indicates whether that address is valid, becomes valid, the comparator 70 compares the access address Add with the end address E-Add, and if they match it changes the flag of the access end register END-REG to "1".
As an alternative control method, the flag of the access end register END-REG may be changed to the end state "1" in response to the end signal END2 from the arithmetic PE array. In either case, the flag of the access end register END-REG is set to the ready state "0" when the inner and outer memory banks are switched.
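FIG. 14B shows the flag change control when the memory bank BNK0/1 is connected to the outer side (the external memory E-MEM side). In this case the access address Add is supplied from the access control unit DMAC. In response to the end signal END1 from the access control unit DMAC, the memory control unit 54 changes the flag of the access end register END-REG to the end state "1", and when the inner and outer sides of the memory banks are switched, the memory control unit 54 sets the flag to the ready state "0" in response to the switching end signal END-SW.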
In addition, a reset clears the end state of the access end register END-REG and sets it to the ready state.
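The flag control of FIGS. 14A-14B can be summarized in a few lines. The sketch below is illustrative only: the class `EndFlag` and its method names are hypothetical, and it models a single END-REG flag with both set mechanisms (comparator match and explicit end signal) and both clear mechanisms (bank switch and reset).

```python
class EndFlag:
    """Toy model of one END-REG flag and the comparator 70 that sets it."""

    def __init__(self, end_address: int):
        self.end_address = end_address  # E-Add, preset at configuration time
        self.flag = 0                   # 0 = ready, 1 = access ended

    def on_access(self, address: int, valid: bool) -> int:
        """Comparator path (FIG. 14A): set the flag when a valid access
        hits the preset end address. Invalid addresses are ignored."""
        if valid and address == self.end_address:
            self.flag = 1
        return self.flag

    def on_end_signal(self) -> None:
        """Alternative path: END2 from the arithmetic PE array, or END1 from
        the DMAC on the outer side (FIG. 14B)."""
        self.flag = 1

    def on_bank_switch(self) -> None:
        """END-SW: the inner and outer banks were swapped, back to ready."""
        self.flag = 0

    def reset(self) -> None:
        """A reset also returns the flag to the ready state."""
        self.flag = 0


reg = EndFlag(end_address=0x3FF)
for addr in (0x3FD, 0x3FE, 0x3FF):
    reg.on_access(addr, valid=True)
print(reg.flag)
```

The comparator path lets the memory PE detect the end of an access burst on its own, without relying on the accessing side to send a separate end signal.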
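FIGS. 15A-15B and FIG. 16 are diagrams describing the outer interface of the memory PE. The outer interface 52 is connected to the external bus E-BUS1 and is dynamically configured into different input/output bus interface structures based on the configuration data CD. The external bus E-BUS1 used for direct memory access normally has a wide bus width. For example, when the external memory E-MEM is a 32-bit DDR-SDRAM, data is output twice per clock cycle, so the bus width of the external bus E-BUS1 is 64 bits. In this case, the circuit of the outer interface 52 is configured so that 64-bit data is input in parallel to, or output in parallel from, the four 16-bit RAMs in the memory bank BNK.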
FIG. 15A shows the outer interface when the bus width of the external bus E-BUS1 is 64 bits. As described above, 64-bit data is input in parallel to, or output in parallel from, the four 16-bit RAMs.
FIG. 15B shows the case where the bus width is 32 bits. The interface is configured so that 32-bit data is input in parallel to, or output in parallel from, two groups of RAMs, each group consisting of two 16-bit RAMs. Within each group, the 16-bit data is input to and output from the two RAMs serially.
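FIG. 16 shows the case where the bus width is 16 bits. The interface is configured so that 16-bit data is input serially to, or output serially from, the four 16-bit RAMs. The configuration of the interface 52 in FIG. 16 is the same as that of the inner interface. That is, the inner interface takes the configuration shown in FIG. 16 because the internal bus on the arithmetic PE array side is narrow, namely 16 bits. The inner interface 50 is therefore configured so that 16-bit data is input serially to, or output serially from, the four 16-bit RAMs.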
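In this way, the interfaces 50 and 52 in the memory PE are configured, based on the configuration data CD, to match the configuration of the buses to which they are connected.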
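The three interface configurations (64, 32 and 16 bits) differ only in how many 16-bit lanes are written in parallel and how many RAMs each lane serves in turn. The sketch below illustrates that idea for the write direction. The function name `distribute` is hypothetical, and the lane numbering (least significant 16 bits as lane 0) and the order in which a lane rotates through its RAMs are assumptions, since the text does not specify them.

```python
def distribute(words, bus_width: int):
    """Write a sequence of bus words into four 16-bit RAMs.

    bus_width 64: four lanes, one RAM each (fully parallel, FIG. 15A)
    bus_width 32: two lanes, each serving a group of two RAMs in turn (FIG. 15B)
    bus_width 16: one lane, serving all four RAMs in turn (FIG. 16)
    """
    if bus_width not in (16, 32, 64):
        raise ValueError("bus_width must be 16, 32 or 64")
    lanes = bus_width // 16      # 16-bit pieces delivered per bus cycle
    rams_per_lane = 4 // lanes   # RAMs that one lane rotates through
    rams = [[] for _ in range(4)]
    for cycle, word in enumerate(words):
        for lane in range(lanes):
            piece = (word >> (16 * lane)) & 0xFFFF
            ram = lane * rams_per_lane + cycle % rams_per_lane
            rams[ram].append(piece)
    return rams


print(distribute([0x4444333322221111], 64))
print(distribute([0x22221111, 0x44443333], 32))
print(distribute([1, 2, 3, 4, 5], 16))
```

With a 32-bit DDR-SDRAM delivering two 32-bit transfers per clock, the 64-bit case fills all four RAMs in one cycle, whereas the 16-bit case needs four cycles to do the same.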
As described above, according to the present embodiment, a plurality of clusters, each including a plurality of arithmetic PEs and memory PEs, are arranged in an integrated circuit device whose circuit configuration can be changed dynamically. The clusters are interconnected by a switch group whose connection state is changed dynamically, and the memory PEs in each cluster are connected to the external memory independently of this inter-cluster switch group. The memory PEs can perform DMA transfers with the external memory. The memory PEs also have, for example, a double-buffer configuration, so that data can be transferred seamlessly between the external memory and the arithmetic PEs. If the data transfer breaks down, the pipeline operation of the arithmetic PE array is temporarily halted.
This application is based on and claims the benefit of priority from prior Japanese Patent Application No. 2005-224208, filed on August 2, 2005, the entire contents of which are incorporated herein by reference.
Claims (16)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2005224208A JP4536618B2 (en) | 2005-08-02 | 2005-08-02 | Reconfigurable integrated circuit device |
| JP2005224208 | 2005-08-02 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1908927A (en) | 2007-02-07 |
| CN100414535C (en) | 2008-08-27 |
Family
ID=37700038
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNB2006100083495A, Expired - Fee Related, CN100414535C (en) | 2005-08-02 | 2006-02-17 | Reconfigurable integrated circuit device |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20070033369A1 (en) |
| JP (1) | JP4536618B2 (en) |
| CN (1) | CN100414535C (en) |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101620588B (en) * | 2008-07-03 | 2011-01-19 | 中国人民解放军信息工程大学 | A Connection and Management Method of Reconfigurable Components in High Performance Computer |
| CN101727434B (en) * | 2008-10-20 | 2012-06-13 | 北京大学深圳研究生院 | Integrated circuit structure special for specific application algorithm |
| WO2017177928A1 (en) * | 2016-04-12 | 2017-10-19 | Huawei Technologies Co., Ltd. | Scalable autonomic message-transport with synchronization |
| US10185606B2 (en) | 2016-04-12 | 2019-01-22 | Futurewei Technologies, Inc. | Scalable autonomic message-transport with synchronization |
| US10289598B2 (en) | 2016-04-12 | 2019-05-14 | Futurewei Technologies, Inc. | Non-blocking network |
Families Citing this family (58)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP4201816B2 (en) * | 2004-07-30 | 2008-12-24 | 富士通株式会社 | Reconfigurable circuit and control method of reconfigurable circuit |
| US7861060B1 (en) * | 2005-12-15 | 2010-12-28 | Nvidia Corporation | Parallel data processing systems and methods using cooperative thread arrays and thread identifier values to determine processing behavior |
| JP4653697B2 (en) * | 2006-05-29 | 2011-03-16 | 株式会社日立製作所 | Power management method |
| US7680988B1 (en) * | 2006-10-30 | 2010-03-16 | Nvidia Corporation | Single interconnect providing read and write access to a memory shared by concurrent threads |
| US8108625B1 (en) | 2006-10-30 | 2012-01-31 | Nvidia Corporation | Shared memory with parallel access and access conflict resolution mechanism |
| US8176265B2 (en) | 2006-10-30 | 2012-05-08 | Nvidia Corporation | Shared single-access memory with management of multiple parallel requests |
| US7962702B1 (en) * | 2007-07-09 | 2011-06-14 | Rockwell Collins, Inc. | Multiple independent levels of security (MILS) certifiable RAM paging system |
| JP5260068B2 (en) * | 2008-01-31 | 2013-08-14 | 古野電気株式会社 | Detection device and detection method |
| US8103853B2 (en) * | 2008-03-05 | 2012-01-24 | The Boeing Company | Intelligent fabric system on a chip |
| JP5431003B2 (en) * | 2009-04-03 | 2014-03-05 | スパンション エルエルシー | Reconfigurable circuit and reconfigurable circuit system |
| US9361960B2 (en) * | 2009-09-16 | 2016-06-07 | Rambus Inc. | Configurable memory banks of a memory device |
| JP5711889B2 (en) * | 2010-01-27 | 2015-05-07 | スパンション エルエルシー | Reconfigurable circuit and semiconductor integrated circuit |
| KR101076869B1 (en) * | 2010-03-16 | 2011-10-25 | 광운대학교 산학협력단 | Memory centric communication apparatus in coarse grained reconfigurable array |
| JP5678782B2 (en) * | 2011-04-07 | 2015-03-04 | 富士通セミコンダクター株式会社 | Reconfigurable integrated circuit device |
| US9130596B2 (en) * | 2011-06-29 | 2015-09-08 | Seagate Technology Llc | Multiuse data channel |
| US10157060B2 (en) | 2011-12-29 | 2018-12-18 | Intel Corporation | Method, device and system for control signaling in a data path module of a data stream processing engine |
| JP5927012B2 (en) * | 2012-04-11 | 2016-05-25 | 太陽誘電株式会社 | Reconfigurable semiconductor device |
| US10331583B2 (en) | 2013-09-26 | 2019-06-25 | Intel Corporation | Executing distributed memory operations using processing elements connected by distributed channels |
| US10078606B2 (en) * | 2015-11-30 | 2018-09-18 | Knuedge, Inc. | DMA engine for transferring data in a network-on-a-chip processor |
| US10203911B2 (en) * | 2016-05-18 | 2019-02-12 | Friday Harbor Llc | Content addressable memory (CAM) implemented tuple spaces |
| CN113660439A (en) * | 2016-12-27 | 2021-11-16 | 株式会社半导体能源研究所 | Imaging device and electronic apparatus |
| US10474375B2 (en) | 2016-12-30 | 2019-11-12 | Intel Corporation | Runtime address disambiguation in acceleration hardware |
| US10416999B2 (en) | 2016-12-30 | 2019-09-17 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator |
| US10572376B2 (en) | 2016-12-30 | 2020-02-25 | Intel Corporation | Memory ordering in acceleration hardware |
| US10558575B2 (en) * | 2016-12-30 | 2020-02-11 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator |
| US10515046B2 (en) | 2017-07-01 | 2019-12-24 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator |
| US10387319B2 (en) | 2017-07-01 | 2019-08-20 | Intel Corporation | Processors, methods, and systems for a configurable spatial accelerator with memory system performance, power reduction, and atomics support features |
| US10445234B2 (en) | 2017-07-01 | 2019-10-15 | Intel Corporation | Processors, methods, and systems for a configurable spatial accelerator with transactional and replay features |
| US10515049B1 (en) | 2017-07-01 | 2019-12-24 | Intel Corporation | Memory circuits and methods for distributed memory hazard detection and error recovery |
| US10445451B2 (en) | 2017-07-01 | 2019-10-15 | Intel Corporation | Processors, methods, and systems for a configurable spatial accelerator with performance, correctness, and power reduction features |
| US10467183B2 (en) | 2017-07-01 | 2019-11-05 | Intel Corporation | Processors and methods for pipelined runtime services in a spatial array |
| US10469397B2 (en) | 2017-07-01 | 2019-11-05 | Intel Corporation | Processors and methods with configurable network-based dataflow operator circuits |
| US10496574B2 (en) | 2017-09-28 | 2019-12-03 | Intel Corporation | Processors, methods, and systems for a memory fence in a configurable spatial accelerator |
| US11086816B2 (en) | 2017-09-28 | 2021-08-10 | Intel Corporation | Processors, methods, and systems for debugging a configurable spatial accelerator |
| US10445098B2 (en) | 2017-09-30 | 2019-10-15 | Intel Corporation | Processors and methods for privileged configuration in a spatial array |
| US10380063B2 (en) | 2017-09-30 | 2019-08-13 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator having a sequencer dataflow operator |
| US10565134B2 (en) | 2017-12-30 | 2020-02-18 | Intel Corporation | Apparatus, methods, and systems for multicast in a configurable spatial accelerator |
| US10445250B2 (en) | 2017-12-30 | 2019-10-15 | Intel Corporation | Apparatus, methods, and systems with a configurable spatial accelerator |
| US10564980B2 (en) | 2018-04-03 | 2020-02-18 | Intel Corporation | Apparatus, methods, and systems for conditional queues in a configurable spatial accelerator |
| US11307873B2 (en) | 2018-04-03 | 2022-04-19 | Intel Corporation | Apparatus, methods, and systems for unstructured data flow in a configurable spatial accelerator with predicate propagation and merging |
| US10853073B2 (en) | 2018-06-30 | 2020-12-01 | Intel Corporation | Apparatuses, methods, and systems for conditional operations in a configurable spatial accelerator |
| US10891240B2 (en) * | 2018-06-30 | 2021-01-12 | Intel Corporation | Apparatus, methods, and systems for low latency communication in a configurable spatial accelerator |
| US11200186B2 (en) | 2018-06-30 | 2021-12-14 | Intel Corporation | Apparatuses, methods, and systems for operations in a configurable spatial accelerator |
| US10459866B1 (en) | 2018-06-30 | 2019-10-29 | Intel Corporation | Apparatuses, methods, and systems for integrated control and data processing in a configurable spatial accelerator |
| US10678724B1 (en) | 2018-12-29 | 2020-06-09 | Intel Corporation | Apparatuses, methods, and systems for in-network storage in a configurable spatial accelerator |
| EP3938921A4 (en) * | 2019-03-11 | 2022-12-14 | Untether AI Corporation | Computational memory |
| US12124530B2 (en) * | 2019-03-11 | 2024-10-22 | Untether Ai Corporation | Computational memory |
| US10817291B2 (en) | 2019-03-30 | 2020-10-27 | Intel Corporation | Apparatuses, methods, and systems for swizzle operations in a configurable spatial accelerator |
| US10965536B2 (en) | 2019-03-30 | 2021-03-30 | Intel Corporation | Methods and apparatus to insert buffers in a dataflow graph |
| US10915471B2 (en) | 2019-03-30 | 2021-02-09 | Intel Corporation | Apparatuses, methods, and systems for memory interface circuit allocation in a configurable spatial accelerator |
| US11029927B2 (en) | 2019-03-30 | 2021-06-08 | Intel Corporation | Methods and apparatus to detect and annotate backedges in a dataflow graph |
| US11037050B2 (en) | 2019-06-29 | 2021-06-15 | Intel Corporation | Apparatuses, methods, and systems for memory interface circuit arbitration in a configurable spatial accelerator |
| US11342944B2 (en) | 2019-09-23 | 2022-05-24 | Untether Ai Corporation | Computational memory with zero disable and error detection |
| US11907713B2 (en) | 2019-12-28 | 2024-02-20 | Intel Corporation | Apparatuses, methods, and systems for fused operations using sign modification in a processing element of a configurable spatial accelerator |
| US11468002B2 (en) * | 2020-02-28 | 2022-10-11 | Untether Ai Corporation | Computational memory with cooperation among rows of processing elements and memory thereof |
| US12086080B2 (en) | 2020-09-26 | 2024-09-10 | Intel Corporation | Apparatuses, methods, and systems for a configurable accelerator having dataflow execution circuits |
| CN112967172B (en) * | 2021-02-26 | 2024-09-17 | 成都商汤科技有限公司 | Data processing device, method, computer equipment and storage medium |
| US20250245181A1 (en) * | 2024-01-30 | 2025-07-31 | Google Llc | System and Methods for Multi-Pod Inter-Chip Interconnect |
Family Cites Families (24)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPS608970A (en) * | 1983-06-29 | 1985-01-17 | Fuji Electric Co Ltd | Multi-controller system |
| JPS60186151A (en) * | 1984-03-05 | 1985-09-21 | Matsushita Electric Ind Co Ltd | Data communicating method between processors |
| CA2129882A1 (en) * | 1993-08-12 | 1995-02-13 | Soheil Shams | Dynamically reconfigurable interprocessor communication network for simd multiprocessors and apparatus implementing same |
| US5842034A (en) * | 1996-12-20 | 1998-11-24 | Raytheon Company | Two dimensional crossbar mesh for multi-processor interconnect |
| US5978379A (en) * | 1997-01-23 | 1999-11-02 | Gadzoox Networks, Inc. | Fiber channel learning bridge, learning half bridge, and protocol |
| US6366999B1 (en) * | 1998-01-28 | 2002-04-02 | Bops, Inc. | Methods and apparatus to support conditional execution in a VLIW-based array processor with subword execution |
| US6041400A (en) * | 1998-10-26 | 2000-03-21 | Sony Corporation | Distributed extensible processing architecture for digital signal processing applications |
| JP3674515B2 (en) * | 2000-02-25 | 2005-07-20 | 日本電気株式会社 | Array type processor |
| US7006521B2 (en) * | 2000-11-15 | 2006-02-28 | Texas Instruments Inc. | External bus arbitration technique for multicore DSP device |
| US7233998B2 (en) * | 2001-03-22 | 2007-06-19 | Sony Computer Entertainment Inc. | Computer architecture and software cells for broadband networks |
| US7093104B2 (en) * | 2001-03-22 | 2006-08-15 | Sony Computer Entertainment Inc. | Processing modules for computer architecture for broadband networks |
| US6526491B2 (en) * | 2001-03-22 | 2003-02-25 | Sony Corporation Entertainment Inc. | Memory protection system and method for computer architecture for broadband networks |
| US6809734B2 (en) * | 2001-03-22 | 2004-10-26 | Sony Computer Entertainment Inc. | Resource dedication system and method for a computer architecture for broadband networks |
| US7516334B2 (en) * | 2001-03-22 | 2009-04-07 | Sony Computer Entertainment Inc. | Power management for processing modules |
| US6826662B2 (en) * | 2001-03-22 | 2004-11-30 | Sony Computer Entertainment Inc. | System and method for data synchronization for a computer architecture for broadband networks |
| US7231500B2 (en) * | 2001-03-22 | 2007-06-12 | Sony Computer Entertainment Inc. | External data interface in a computer architecture for broadband networks |
| US7152151B2 (en) * | 2002-07-18 | 2006-12-19 | Ge Fanuc Embedded Systems, Inc. | Signal processing resource for selective series processing of data in transit on communications paths in multi-processor arrangements |
| US20020184291A1 (en) * | 2001-05-31 | 2002-12-05 | Hogenauer Eugene B. | Method and system for scheduling in an adaptable computing engine |
| US20040022094A1 (en) * | 2002-02-25 | 2004-02-05 | Sivakumar Radhakrishnan | Cache usage for concurrent multiple streams |
| US7124211B2 (en) * | 2002-10-23 | 2006-10-17 | Src Computers, Inc. | System and method for explicit communication of messages between processes running on different nodes in a clustered multiprocessor system |
| US7093079B2 (en) * | 2002-12-17 | 2006-08-15 | Intel Corporation | Snoop filter bypass |
| JP4423953B2 (en) * | 2003-07-09 | 2010-03-03 | 株式会社日立製作所 | Semiconductor integrated circuit |
| JP4359490B2 (en) * | 2003-11-28 | 2009-11-04 | アイピーフレックス株式会社 | Data transmission method |
| US20080162877A1 (en) * | 2005-02-24 | 2008-07-03 | Erik Richter Altman | Non-Homogeneous Multi-Processor System With Shared Memory |
- 2005-08-02: JP application JP2005224208A, patent JP4536618B2 (not active, Expired - Fee Related)
- 2006-01-27: US application US11/340,871, publication US20070033369A1 (not active, Abandoned)
- 2006-02-17: CN application CNB2006100083495A, patent CN100414535C (not active, Expired - Fee Related)
Also Published As
| Publication number | Publication date |
|---|---|
| CN100414535C (en) | 2008-08-27 |
| JP4536618B2 (en) | 2010-09-01 |
| US20070033369A1 (en) | 2007-02-08 |
| JP2007041781A (en) | 2007-02-15 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | ASS | Succession or assignment of patent right | Owner name: FUJITSU MICROELECTRONICS CO., LTD. Free format text: FORMER OWNER: FUJITSU LIMITED. Effective date: 20081024 |
| | C41 | Transfer of patent application or patent right or utility model | |
| | TR01 | Transfer of patent right | Effective date of registration: 20081024. Address after: Tokyo, Japan. Patentee after: Fujitsu Microelectronics Ltd. Address before: Kanagawa. Patentee before: Fujitsu Ltd. |
| | C56 | Change in the name or address of the patentee | Owner name: FUJITSU SEMICONDUCTORS CO., LTD. Free format text: FORMER NAME: FUJITSU MICROELECTRON CO., LTD. |
| | CP03 | Change of name, title or address | Address after: Kanagawa. Patentee after: Fujitsu Semiconductor Co., Ltd. Address before: Tokyo, Japan. Patentee before: Fujitsu Microelectronics Ltd. |
| | ASS | Succession or assignment of patent right | Owner name: SPANSION LLC N. D. GES D. STAATES. Free format text: FORMER OWNER: FUJITSU SEMICONDUCTOR CO., LTD. Effective date: 20140102 |
| | C41 | Transfer of patent application or patent right or utility model | |
| | TR01 | Transfer of patent right | Effective date of registration: 20140102. Address after: California, USA. Patentee after: Spansion LLC N. D. Ges D. Staates. Address before: Kanagawa. Patentee before: Fujitsu Semiconductor Co., Ltd. |
| | C41 | Transfer of patent application or patent right or utility model | |
| | TR01 | Transfer of patent right | Effective date of registration: 20160408. Address after: California, USA. Patentee after: Cypress Semiconductor Corp. Address before: California, USA. Patentee before: Spansion LLC N. D. Ges D. Staates |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20080827. Termination date: 20170217 |
| | CF01 | Termination of patent right due to non-payment of annual fee | |