TW448376B - Circuit and method cryptographic multiplication for computing the RSA and ECC algorithms - Google Patents
Circuit and method cryptographic multiplication for computing the RSA and ECC algorithms Download PDFInfo
- Publication number
- TW448376B TW448376B TW88121181A TW88121181A TW448376B TW 448376 B TW448376 B TW 448376B TW 88121181 A TW88121181 A TW 88121181A TW 88121181 A TW88121181 A TW 88121181A TW 448376 B TW448376 B TW 448376B
- Authority
- TW
- Taiwan
- Prior art keywords
- value
- input
- output
- adder
- signal
- Prior art date
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/60—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
- G06F7/72—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
- G06F7/728—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic using Montgomery reduction
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/60—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
- G06F7/72—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
- G06F7/724—Finite field arithmetic
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Complex Calculations (AREA)
Abstract
Description
448376 五、發明說明(1) 本應用已於1998年12月18日,以專利應用字號 09/215,934建檔於美利堅:合眾國。 發明背景448376 V. Description of the invention (1) This application has been filed in the United States of America: Patent United States on December 18, 1998 under the patent application number 09 / 215,934. Background of the invention
本發明總的與公開金鑰保密技術有關,特別是與產生公 開金墙密碼的積體電路有關D 萊衛斯特-莎彌爾-亞道曼(RSA)及橢圓曲線密碼(ECC)是 公開金鑰密碼演算法,其可為往來於電子裝置間之數位資 料’提供出高度的安全保護RSA與£:(:(: '(Fp)模數學演算 法的計算,可用硬體的乘法器來執行,而ECC(多項式基之 F 2M)的多項式數學演算法則可用不同的硬體乘法器來執行 。此二用以計算RSA與ECC演算法之硬體乘法器,其内部可 使管線化的架構以因應此二演算法會使用到的大量的平行 計算。管線化結構之乘法器所提供出之低功率功能,是許 多應用所需要的。 以硬體來實現RSA與ECC演算法的計算,並不容易。所以 ’最適系統應用之密碼型式決定出最適計算RSA或ECC演算 法之硬體乘法器架構。隨著對更快速的密碼運算以及更高 效鮮需求的增加,我們需要一個更好的硬體模乘法器架構 ’以確保其在提高速度與效能的同時,仍維持高水準之保 密度。 於是’以積體電路來實現該提供密碼之乘算系統使其具 雨效能、低成本以及低功率,會是一個很好的方式。若乘 算系統是計算RSA與ECC演算法之乘算系統,則更佳。 圖示之簡要描述The present invention is generally related to the public key security technology, and particularly to the integrated circuit that generates the public golden wall cipher. Key cryptographic algorithm, which can provide a high degree of security for digital data between electronic devices. RSA and £: (: (: '(Fp) modulus mathematical calculations can be calculated using hardware multipliers. Implementation, and ECC (polynomial basis F 2M) polynomial mathematical algorithms can be executed with different hardware multipliers. The two hardware multipliers used to calculate RSA and ECC algorithms can be pipelined internally. In order to respond to the large number of parallel calculations that the two algorithms will use. The low-power function provided by the multiplier of the pipeline structure is required for many applications. The hardware is used to implement the calculation of the RSA and ECC algorithms, and Not easy. So 'the type of cryptography that best fits the system determines the hardware multiplier architecture that is optimal for computing RSA or ECC algorithms. With the increase in demand for faster cryptographic operations and more efficient freshness, we need a more Hardware modular multiplier architecture 'to ensure that it improves speed and efficiency while still maintaining a high level of density. So' integrated circuits are used to implement the cryptographic multiplication system to make it rainy and cost effective. And low power will be a good way. If the multiplication system is a multiplication system that calculates RSA and ECC algorithms, it is even better. Brief description of the diagram
448376 五、發明說明(2) 圖1之方塊圖是其内所具之RSA算術處理器與ECC算術處 理器乃分離之積體電路型密碼系統的具體實施方式; 圖2之方塊圖是其内僅具單一處理器來執行RS a與ECC資 料密碼計算之積體電路型密碼系統的具體實施方式; 圖3之電路圖是圖2單一處理器之一部份的具體實施例; 圖4之電路圖是圖2單一處理器之一部份的另一具體實施 例; 圖5之電路圖是用以計算ECC演算法(多項式基之F2M)之 乘法器其一部份之電路圖; 圖δ之方塊圖,是用以計算RSA或ECC演算法之ΙχΝ乘法器 ♦ 9 圖7是圖6乘法器中C暫存器用於單循環乘法運算時,其 組成單元之電路圖;以及 圖8是圖6乘法器中C暫存器用於雙循環乘法運算時,其 組成單元之電路圖。 較佳具體實施例之詳細說明 總的來說’本發明之積體型密碼電路可提供出支援衛斯 特-ί多彌爾-亞道曼(RSA)與橢圓曲線密碼(ECC)演算法之密 碼功能。該可產生密碼之積體電路已應用於電子商務,呼 叫器’行動電路,智慧卡以及智慧卡終端。像個人健康記 錄’財務記錄,指紋以及視網膜跡等資料,都可以利用整 數模乘算’模多項式乘算,加法’減法以及指數運算等功 能來予以加密^該積體型密碼電路提供出一種硬體結構, 可有效率地計算RSA與ECC演算法-448376 V. Description of the invention (2) The block diagram of FIG. 1 is a specific implementation of the integrated circuit-type cryptographic system in which the RSA arithmetic processor and the ECC arithmetic processor are separate; the block diagram of FIG. 2 is within A specific implementation of the integrated circuit type cryptographic system with only a single processor to perform RS a and ECC data cryptographic calculations; the circuit diagram of FIG. 3 is a specific embodiment of a part of the single processor of FIG. 2; the circuit diagram of FIG. 4 is Figure 2 shows another embodiment of a part of a single processor; the circuit diagram of Figure 5 is a circuit diagram of a part of a multiplier used to calculate an ECC algorithm (polynomial basis F2M); the block diagram of Figure δ is I × N multiplier used to calculate RSA or ECC algorithm ♦ 9 Figure 7 is a circuit diagram of the constituent units when the C register in the multiplier of FIG. 6 is used for single-cycle multiplication; and FIG. 8 is the C register in the multiplier of FIG. 6 When the register is used for double-cycle multiplication, the circuit diagram of its constituent units. Detailed description of the preferred embodiments In general, the integrated cryptographic circuit of the present invention can provide cryptographic codes that support Wester-Domir-Adolman (RSA) and Elliptic Curve Cryptography (ECC) algorithms. Features. The integrated circuit capable of generating passwords has been applied to e-commerce, caller 'mobile circuits, smart cards and smart card terminals. Information such as personal health records, financial records, fingerprints, and retinal traces can be encrypted using integer modular multiplication, modular polynomial multiplication, addition, subtraction, and exponential operations. The integrated cryptographic circuit provides a hardware Structure to efficiently calculate RSA and ECC algorithms-
4483 7 6 五、發明說明(3) 圖1之方塊圖是積體式密碼系統1〇之具體實施例,其具 有一RSA算術處理器18以及一分離的ECC算術處理器20。此 單晶片型式之密碼系統10被規劃為運作於數據通信網路中 ’其使用RSA或ECC演算法來執行密碼功能。密碼系統]〇包 含一主端介面方塊12 ’此介面之輸入連接至一介面匯流排 。密碼系統1 0即是透過該資料匯流排,與外部的其他電子 裝置C未顯示)進行資料信號的傳送與接收。舉個例子,密 碼系統1 0外部之電子裝置,如微處理器,隨機存取記憶體 (RAM) ’唯讀記憶體(R0M),記憶體存取控制器(MAC),記 憶體安全管理單元(SMMU)以及通用非同步接收/傳送 (UART)方塊’均是將資料提至主端介面方塊〗2之終端機上 或是對主端介面方塊1 2終端機上之資料進行處理。圓中並 未將接於密碼系統1 〇外部之方塊顯示出來。 密碼系統1 0另包含暫存記憶體1 4,此記憶體之輸入連接 至主端介面方塊12。暫存記憶體14接收可啟動密碼系統1〇 執行公開金鑰密碼功能之資料。所以,記憶體1 4中儲存著 支援該RSA模指數運算以及該橢圓曲線點乘算之資料,此 二種運算分別地由RSA算術處理器1 8以及ECC算術處理器20 來執行。 具體地說’若是要執行RSA模指數運算,那麼記憶體1 4 所儲存的資料就包括,像是模值N、運算元A與B的值,以 及部份乘積值◊若是要執行橢圓曲線點乘算,那麼記憶體 1 4所儲存的資料就包括,像是既約多項式、奇質數域之值 、產生器點之ECC系統參數、橢圓曲線參數、點縮放器,4483 7 6 V. Description of the invention (3) The block diagram of FIG. 1 is a specific embodiment of the integrated cryptosystem 10, which has an RSA arithmetic processor 18 and a separate ECC arithmetic processor 20. This single-chip cryptographic system 10 is planned to operate in a data communication network. It uses RSA or ECC algorithms to perform cryptographic functions. Cryptographic system] 〇 Contains a main interface interface box 12 ′ The input of this interface is connected to an interface bus. The cryptographic system 10 transmits and receives data signals through this data bus with other external electronic devices (not shown). For example, cryptographic system 10 external electronic devices, such as microprocessors, random access memory (RAM) 'read only memory (R0M), memory access controller (MAC), memory security management unit (SMMU) and Universal Asynchronous Receive / Transmit (UART) blocks are both to upload data to the terminal on the master interface block 2 or process the data on the terminal 12 on the master interface block. The squares connected to the outside of the password system 10 are not shown in the circle. The password system 10 also contains temporary storage memory 14, and the input of this memory is connected to the main interface interface block 12. The temporary storage memory 14 receives data that can activate the cryptographic system 10 to perform the public key cryptographic function. Therefore, the memory 14 stores data supporting the RSA modulus exponent operation and the elliptic curve point multiplication operation. These two operations are performed by the RSA arithmetic processor 18 and the ECC arithmetic processor 20, respectively. Specifically, if the RSA modulus exponent operation is to be performed, the data stored in the memory 1 4 includes, for example, the modulus value N, the values of the operands A and B, and the partial product value. If an elliptic curve point is to be performed Multiplication, then the data stored in memory 14 includes, for example, reduced polynomials, values of odd prime numbers, ECC system parameters of generator points, elliptic curve parameters, point scalers,
4483 76 五、發明說明(4) 以及臨時的數值。 典型地,記憶體14的儲存容量可儲存的RSA與ECC之金鑰 大小比,約為四比一 *比方說,如果記憶體可儲存〗〇24位 元大小之RS A金鑰,那麼相同的容量可儲存大約256位元大 小之ECC金鑰。所以,較之於使用ECC演算法,在使用RSA 演算法時’記憶體1 4可提供出較高的安全等級。使用記憶 體14來儲存RS A算術處理器18與ECC算術處理器20二者所需 之資料’可使密碼系統1 〇的矽面積及整體成本縮小與降低 〇 相似類型之軟體指令亦可用來計算該ECC與RSA演算法。 舉個例子,該RSA演算法在計算指數功能時,使用的是二 進位平方-及-相乘副程式,而該ECC演算法在計算點相乘 時,使周的則是雙-及〜加副程式。是故,相似的副程式用 於支援RSA或ECC演算法之數學運算。在許多個別的演算法 之間,也可發現相類似之物,譬如,RSA之整數模-N,以 及ECC之多項式基中模相乘。 在運作時,儲存在記憶體14中之資料值會轉移至RS a算 術處理器18或ECC算術處理器20中。控制電路16所提供之 控制信號管理著記憶體14、RSA算術處理器18、ECC算術處 理器20以及主端介面方塊12,這四者間資料的移轉。另外 ,在處理資料時,控制方塊1 6所產生之控制信號也控制著 RSA算術處理器丨8以及ECC算術處理器2〇所提供之數學計算 k另個角度看’從控制方塊1 β所出之控制信號可致能該 用以計算RSA演算法之RSA算術處理器1 8,亦可致能該用以4483 76 V. Description of the invention (4) and temporary values. Typically, the ratio of the RSA and ECC key sizes that can be stored in the storage capacity of the memory 14 is about four to one * For example, if the memory can store a RS A key of 24 bits in size, the same The capacity can store approximately 256-bit ECC keys. Therefore, compared with the use of the ECC algorithm, the 'memory 14' can provide a higher level of security when using the RSA algorithm. Use the memory 14 to store the data required by both the RS A arithmetic processor 18 and the ECC arithmetic processor 20 'to reduce and reduce the silicon area and overall cost of the cryptographic system 10. Similar types of software instructions can also be used to calculate The ECC and RSA algorithms. For example, the RSA algorithm uses the binary square-and-multiplication subroutine when calculating the exponential function, and the ECC algorithm uses the double- and ~ plus when calculating the point multiplication. Subroutine. Therefore, similar subroutines are used for mathematical operations that support RSA or ECC algorithms. Similar things can be found among many individual algorithms, such as the integer modulus -N of RSA and the modulus multiplication in the polynomial basis of ECC. In operation, the data values stored in the memory 14 are transferred to the RS a arithmetic processor 18 or the ECC arithmetic processor 20. The control signals provided by the control circuit 16 manage the data transfer among the memory 14, the RSA arithmetic processor 18, the ECC arithmetic processor 20, and the master-side interface block 12. In addition, when processing data, the control signals generated by the control block 16 also control the mathematical calculations provided by the RSA arithmetic processor 8 and the ECC arithmetic processor 20. From another angle, 'from the control block 1 β The control signal can enable the RSA arithmetic processor 18 for calculating the RSA algorithm, or enable the RSA arithmetic processor
448376 五、發明說明(5) 計算ECC演算法之ECC算術處理器20 » RSA與ECC演算法間存 在之相似之物降低了控制電路1 6所產生之控制信號的數量 圖2之方塊圖是積體式之密碼系統24之另一具體實施例 ’其僅具單一個算術處理器22來計算RS A與ECC演算法。應 注意的疋圖令相同的參考數字代表的是相同的元件。此密 碼系統24之主端介面方塊1 2透過介面匯流排’而與其他的 電子裝置(未顯示)連接。資料信號透過主端介面方塊12, 移轉至儲存資料用之暫存記憶體1 4。控制信號1 6提供控制 信號至算術處理器22以管理暫存記憶體14中資料的移轉, 以及控制算術處理器22所提供出的功能。該INI'/POLY信號 就是這樣的一個由控制電路1 6所產生出來的控制信號,此 INT/P0LY信號可選擇(或致能)算術處理器22執行RSA演算 法還是ECC演算法之數學運算。所以,算術處理器22乃是 根據RSA模指數或ECC橢圓曲線點乘算,來提供密碼功能。 圖3是*圖2中該單一算術處理器22之一部份的具體實施電 路圖。算術處理器22執行運算元Λ與B之相乘,然後將乘積 值即P^C P⑴’ Pi+2與Ρ;+3 ’提供至模縮減器6 0。運算元A 與B可以是數字資料,也可以是用美國資訊交換標準碼 (ASCII)轉換成序數之一般文字串,或是其他轉換過之 集組。 算術處理器22之模縮減器60,包含一X列,Y行之加法器 p列,其中X與Y是整數。該加法器陣列之較佳具體實施方 ",應為十六行,+六列。不過,應該注意的是,本發明448376 V. Description of the invention (5) ECC arithmetic processor for calculating ECC algorithm 20 »The similarity between RSA and ECC algorithm reduces the number of control signals generated by the control circuit 1 6 The block diagram in Figure 2 is the product Another specific embodiment of the cryptographic system 24 of the type 'it has only a single arithmetic processor 22 to calculate RS A and ECC algorithms. It should be noted that the same reference numerals represent the same components. The main interface block 12 of the password system 24 is connected to other electronic devices (not shown) through an interface bus'. The data signal is transferred to the temporary storage memory 14 for storing data through the main interface interface box 12. The control signal 16 provides a control signal to the arithmetic processor 22 to manage the transfer of data in the temporary storage memory 14 and control the functions provided by the arithmetic processor 22. The INI '/ POLY signal is such a control signal generated by the control circuit 16. The INT / P0LY signal can select (or enable) the arithmetic processor 22 to perform the RSA algorithm or the mathematical operation of the ECC algorithm. Therefore, the arithmetic processor 22 provides the cryptographic function based on the RSA modulus index or the ECC elliptic curve point multiplication. FIG. 3 is a detailed implementation circuit diagram of a part of the single arithmetic processor 22 in FIG. 2. The arithmetic processor 22 performs a multiplication of the operands Λ and B, and then supplies the product values P ^ C P⑴ 'Pi + 2 and P; +3' to the modulo reducer 60. Operands A and B can be numeric data, or ordinary text strings converted into ordinal numbers using the American Standard Code for Information Interchange (ASCII), or other converted sets. The modulo reducer 60 of the arithmetic processor 22 includes an X column and an Y row adder p column, where X and Y are integers. A preferred embodiment of the adder array "quotes should be sixteen rows and six columns. It should be noted, however, that the present invention
第9頁 4483 76 五、發明說明(6) 並不限制加法器陣列一定要是十六行,十六列’或是行列 數要相同°為簡明故,我們將模縮減器60簡化為一四乘四 之加法器陣列,外帶一些相關的邏輯閘。 加法器90,92,94與96組成模縮減器60之加法器陣列之 ,加法器100,i 〇2,1〇4與106組成模縮減器60之加法 器陣列之行X,,加法器n〇,112,114與116組成模縮減器 60之加法器陣列之行I,加法器120,1 22 , 1 24與126則組 成模縮減器60之加法器陣列之行&。加法器9〇至“,1〇〇 至106: 11〇至1丨6以及120至126中的每一個均具第一與第 二資料輸入,一進位輸入(CI),一進位輸出(c〇),以及一 加總輸出(S)。 打X。之加法器90,92,94與96的第一輸入,各自地連接 至輸入終端機80,82,84與86。兩輸入及閘89,91,93與 7 = 入彼此相接’共同地連接至閂鎖128之。輸出^ J , 9 3與9 5的輸出則分別地連接至加法器9 0,9 2 透過及二f·二輸入。$外’加法器90之進位輸出(C0), 彳# +連接至加法器92之進位輸入(CI),加法器92 出’透過及間92A連接至加法器94之進位輸入, 口 :二之進位輸出,則透過及間94A連接至加法器96之 法器96之進位輸出,透過及問96Α連接至閃 ^之資料輸入。問鎖152的輪出則連接至加法器90的進 位輸入= 阻:Π,譬如,及開90Α,92Α,94Α與96Α也稱之為 ^電路。在該與所有的阻隔電路均有連接之選擇(或致Page 9 4483 76 V. Description of the invention (6) There is no restriction on the adder array must be sixteen rows, sixteen columns' or the number of rows and columns must be the same. For simplicity, we will simplify the modulo reducer 60 to one or four times. Four adder array with some related logic gates. Adders 90, 92, 94, and 96 form the adder array of the modular reducer 60, and adders 100, 100, 104, and 106 form the adder array of the modular reducer 60, X, and adder n 〇, 112, 114, and 116 form the adder array row of the modular reducer 60, and the adders 120, 1 22, 1 24, and 126 form the adder array row of the modular reducer 60 &. Adders 90 to ", 100 to 106: 110 to 1 6 and 120 to 126 each have a first and a second data input, a carry input (CI), a carry output (c. ), And a total output (S). Hit X. The first inputs of the adders 90, 92, 94, and 96 are connected to the input terminals 80, 82, 84, and 86, respectively. Two inputs and gates 89, 91, 93 and 7 = inputs are connected to each other 'and are connected to latch 128 in common. Outputs ^ J, 9 3 and 9 5 are connected to adders 9 0, 9 2 through and two f · 2 inputs respectively. . $ 外 'carry output (C0) of adder 90, 彳 # + connected to carry input (CI) of adder 92, adder 92 out' is connected to carry input of adder 94 through and between 92A, port: two The carry output is connected to the carry output of the adder 96 through the 94A and the data input of the flash ^ through the 96A. The turn-out of the lock 152 is connected to the carry input of the adder 90 = Impedance: Π, for example, and 90A, 92A, 94A, and 96A are also called ^ circuits. There is a choice to connect to all blocking circuits (or cause
1麵 第10胃 4483 7 6 五、發明說明(7) 能)信號是邏輯1時,該進位信號可通過該阻隔電路。另— 方面’在該選擇(或致能)信號是邏輯〇時,該進位信號可 則會被阻隔而無法通過該阻隔電路。 行乂丨之加法器1〇〇,102,104與106的第一輸入,各自地 連接至行XDi加法器90,92,94與96的輸出。兩輸入及開 99 ’101 ’103與1〇5的第一輸入彼此相接,共同地連接至 閂鎖132之Q輸出。及閘99,i〇1,1 〇3與1〇5的輸出則分別 地連接至加法器1〇〇,1〇2,1〇4與1〇6的第二輸入。另外, 加法器1 0 0之進位輸出’透過及閘丨〇 〇 A連接至加法器丨〇 2之 進位輸入,加法器102之進位輸出,透過及閘1〇2A連接至 加法器1 0 4之進位輸入;加法器I 〇 4之進位輸出,則是透過 及閘1 04A連接至加法器1〇6之進位輸入。加法器〗〇6之進位 輸出’透過及閘1 〇 6 A連接至閂鎖1 5 6之資料輸入。閂鎖1 5 β 的輸出則連接至加法器丨0 〇的進位輸入。 行Χ2之加法器1 1 〇,1 1 2,1 1 4與1 1 6的第一輸入,各自地 連接至行X〗之加法器100,102,1〇4與1〇6的輸出。兩輸入 及閘109 ’ 111,113與115的第一輸入彼此相接,共同地連 接至閂鎖136之Q輸出。及閘109,Π1,η3與115的輸出則 分別地連接至加法器U0,112,114與116的第二輸入。另 ^,加法器11 〇之進位輸出,透過了及閘丨} 〇Α連接至加法 器Π2之進位輪入;加法器112之進位輸出,透過了及閘 112Α連接至加法器114之進位輸入;加法器114之進位 :則是透過及閘114Α連接至加法器116之進位輸入。加法 器11 6之進位輸出,透過了及閘丨丨6Α連接至閂鎖1 之資料 448376 五、發明說明(8) 輸入。閂鎖1 6 0的輸出則連接至加法器11 〇的進位輸入 行X3之加法器120,122,124與126的第一輸入,各自地 連接至行X2之加法器110 ’112,114與116的輸出。兩輸入 及閘119,121 ’123與125的第一輸入彼此相接’共同地連 接至閂鎖140之Q輸出。及閘119,121,123與125的輸出則 分別地連接至加法器1 2 0,1 2 2,1 2 4與1 2 6的第二輸入。另 外’加法器120之進位輸出,透過了及閘120A連接至加法 器122之進位輸入;加法器122之進位輸出,透過了及閘 1 22A連接至加法器124之進位輸入;加法器124之進位輸出 ’則是透過及閘1 24A連接至加法器1 26之進位輸入。加法 器126之進位輸出,透過了及閘126A連接至閂鎖162之資料 輸入°閂鎖162的輸出則連接至加法器120的進位輸入。加 法器120,122,124與126的輸出S,分別地連接至輸出終 端機 164,166,168與170。及閘90A-96A,100Α-1〇6Α, 110A-116A ’以及120A-126A,在算術處理器22正在計算整 數-模-N乘法時’是致能的;而在該算術處理器正在計算 以多項式為基礎之模乘法時,則是禁能的。換言之,當正 在計算該以多項式為基礎之模乘法演算法時,加法器9〇_ 96 ’100-106 ,以及12〇_126的進位輸出信號, 均無法傳送出去。該等及閘的參考號碼均被多加了一個 "A"字’這是為了表示每一個加法器(譬如加法器9〇 )均有 一個相應的及閘(即9〇A),可影響加法器之進位輸出是否 可傳送至相鄰加法器的進位輸入(通過或阻絕)。 另外’及閘8 9,1 〇1,1 1 3與1 2 5的第二輸入彼此相接,1 side 10th stomach 4483 7 6 V. Description of the invention (7) Yes) When the signal is logic 1, the carry signal can pass through the blocking circuit. On the other hand, when the selection (or enable) signal is logic 0, the carry signal may be blocked from passing through the blocking circuit. The first inputs of the adders 100, 102, 104, and 106 of the lines 乂 are connected to the outputs of the line XDi adders 90, 92, 94, and 96, respectively. The two inputs and the first inputs of the switches 99'101'103 and 105 are connected to each other and are commonly connected to the Q output of the latch 132. The outputs of gates 99, 101, 103 and 105 are connected to the second inputs of adders 100, 102, 104 and 106, respectively. In addition, the carry output of the adder 100 is connected to the carry input of the adder 丨 00A through the AND gate, and the carry output of the adder 102 is connected to the adder 104 through the AND gate 102A. Carry input; the carry output of adder I 04 is connected to the carry input of adder 106 through the AND gate 104A. Adder 〖〇6 Carry Output ′ is connected to the data input of the latch 156 through the gate 1 06 A. The output of latch 1 5 β is connected to the carry input of adder 丨 0 〇. The first inputs of the adders 1 1 0, 1 1 2, 1 1 4 and 1 1 6 of the row X2 are respectively connected to the outputs of the adders 100, 102, 104 and 106 of the row X 2. The two inputs and the first inputs of the gates 109 '111, 113, and 115 are connected to each other and are commonly connected to the Q output of the latch 136. The outputs of the AND gates 109, Π1, η3 and 115 are connected to the second inputs of the adders U0, 112, 114 and 116, respectively. In addition, the carry output of the adder 11 〇 passes through the AND gate 丨} 〇Α is connected to the carry round of the adder Π2; the carry output of the adder 112 is connected to the carry input of the adder 114 through the AND gate 112A; Carry of adder 114: It is connected to the carry input of adder 116 through AND gate 114A. The carry output of the adder 11 6 is connected to the data of the latch 1 through the gate 丨 丨 6Α 448376 V. Description of the invention (8) Input. The output of latch 1 60 is connected to the first inputs of adders 120, 122, 124 and 126 of carry input line X3 of adder 11 0, respectively to the adders 110'112, 114, and 116 of line X2. Output. The two inputs and the first inputs of the gates 119, 121 ', 123 and 125 are connected to each other' and are commonly connected to the Q output of the latch 140. The outputs of the AND gates 119, 121, 123 and 125 are connected to the second inputs of the adders 1 2 0, 1 2 2, 1 2 4 and 1 2 6 respectively. In addition, the carry output of the adder 120 is connected to the carry input of the adder 122 through the AND gate 120A; the carry output of the adder 122 is connected to the carry input of the adder 124 through the AND gate 1 22A; the carry of the adder 124 Output 'is the carry input connected to adder 1 26 through gate 1 24A. The carry output of adder 126 is passed through the data input of latch 162 and gate 126A. The output of latch 162 is connected to the carry input of adder 120. The outputs S of the adders 120, 122, 124, and 126 are connected to the output terminals 164, 166, 168, and 170, respectively. And gates 90A-96A, 100A-106A, 110A-116A, and 120A-126A are enabled when the arithmetic processor 22 is calculating integer-modular-N multiplication; and the arithmetic processor is calculating Polynomial-based modular multiplication is disabled. In other words, when the polynomial-based modulo multiplication algorithm is being calculated, the carry output signals of the adder 90_96'100-106 and the 12_126 cannot be transmitted. The reference number of the sum gate is added with an "A". This is to indicate that each adder (such as the adder 90) has a corresponding sum gate (that is, 90A), which can affect the addition. Whether the carry output of the adder can be passed to the carry input (pass or block) of an adjacent adder. In addition, the second inputs of the gates 9, 9, 10, 1, 13 and 1 2 5 are connected to each other,
第12頁 448376 五、發明說明(9) 共同地連接至輸入終端機81。及閘91,1〇3與115的第二輸 入彼此相接’共同地連接至輸入終端機83以及閂鎖158之 輸入。及閘93與105的第二輸入彼此相接,共同地連接至 輸入終端機85以及閂鎖154之輸入。及閘95的第二輸入則 是連接至輸入終端機87以及閂鎖150之輸入。及閘99,111 與123的第二輸入彼此相接,共同地連接至閂鎖丨58之輸出 。及閘109與121的第二輸入彼此相接,共同地連接至輸入 終端機85以及閂鎖154之輸出。及閘1 19的第二輸入則是連 接至閂鎖158之輸出。 閂鎖128 ’132,136與140,每一個均具一設定輸入(S) ,一重置輸入(R) ’以及一輸出(Q)。在信號T是高邏輯值 ,以致輸出Q之傳號與輸入S之信號值相同時,閂鎖128, 132,136與140是致能的。當信號T由高邏輯值轉移至低邏 輯值之時,輸出Q之信號是被閂鎖住的。輸入r之信號可以 重置輸出Q之信號。該等128,132,136與140之重置輸入R 彼此相接,共同地連接至終端機79。終端機79則連接以接 收重置信號R之用。兩輸入及閘130的輸出,連接至閂鎖 128之設定輪入。及閘130的第一輸入則連接至加法器90之 第一輸入。兩輸入及閘134的輸出,連接至閂鎖132之設定 輸入。及閘1 34的第一輸入則連接至加法器1 02之第一輸入 。兩輸入及閘138的輸出,連接至閂鎖136之設定輸入。及 閘1 38的第一輸入則連接至加法器1 1 4之第一輸入。兩輸入 及閘142的輸出,連接至閂鎖140之設定輸入。及閛142的 第一輸入則連接至加法器126之第一輸入。該等及閘130 ’Page 12 448376 V. Description of the invention (9) It is connected to the input terminal 81 in common. The second inputs of the gates 91, 103 and 115 are connected to each other 'and are commonly connected to the input terminal 83 and the input of the latch 158. The second inputs of the gates 93 and 105 are connected to each other and are commonly connected to the inputs of the input terminal 85 and the latch 154. The second input of the AND gate 95 is the input connected to the input terminal 87 and the latch 150. The second inputs of the AND gates 99, 111 and 123 are connected to each other and are commonly connected to the output of the latch 58. The second inputs of the gates 109 and 121 are connected to each other and are commonly connected to the input terminal 85 and the output of the latch 154. The second input of the AND gate 1 19 is the output connected to the latch 158. The latches 128'132, 136, and 140 each have a set input (S), a reset input (R) ', and an output (Q). When the signal T has a high logic value such that the signal number of the output Q is the same as the signal value of the input S, the latches 128, 132, 136, and 140 are enabled. When the signal T transitions from a high logic value to a low logic value, the signal of the output Q is latched. The input r signal can reset the output Q signal. The reset inputs R of 128, 132, 136, and 140 are connected to each other and are commonly connected to the terminal 79. The terminal 79 is connected to receive the reset signal R. The two inputs and the output of the gate 130 are connected to the setting wheel of the latch 128. The first input of the AND gate 130 is connected to the first input of the adder 90. The two inputs and the output of the gate 134 are connected to the setting input of the latch 132. The first input of the AND gate 1 34 is connected to the first input of the adder 102. The two inputs and the output of the gate 138 are connected to the set inputs of the latch 136. The first input of the AND gate 1 38 is connected to the first input of the adder 1 1 4. The two inputs and the output of the gate 142 are connected to the set input of the latch 140. The first input of sum 142 is connected to the first input of adder 126.于 和 门 130 ’
第13頁 448376 五、發明說明(ίο) 134,138與142之第二輸入彼此相接,共同地連接至終端 機78。終端機78則連接以接收信號T之用。 大的運算元,像是譬如,兩個1024位元之運算元的相乘 ,是透過一乘法器(未顯示),以管線化的技術來相乘通過 或旋轉。典型地,對於較大之運算元A與B,會先將之分到 為較小的,稱之為數元之群組,譬如,數元Α〇-以及一h 。該管線化之乘法器其尺寸恰適可做這些數元之相乘。譬 如,該等數元Αβ-ΑΝ以及心-81、.是16位元之二進位數,則該乘 法器亦是1 6位元之乘法器(但本發明並不以此為限)。 通式來看,整數-模-Ν蒙哥馬利乘法的形式為:. (A*R mod N)(B*R mod Ν)+ΜΝ 其中= A是該第一運算元且為一整數; B是該第二運算元且為一整數; N是一奇數值; mod N是(A*B*R)/N的餘數值,可定義出有限域中元件的 數目; R是大於該N值之2整數乘冪值;以及 \是一縮減值’此值應使(A*R mod N)(B*R moci N)+\ *N可為R所整除且不會損失有效位元。 運作時,模縮減器60在接收了(a*R mod N)與(B*R mod N )的乘積後,產生整數-模-N蒙哥馬利乘法用之縮減的部 份乘積輸出。為簡明故’我們以四位元數字為例來説明整 數-模-N蒙哥馬利乘法。參考圖3,輸入終端機8〇,82,84Page 13 448376 V. Description of Invention (ίο) The second inputs of 134, 138 and 142 are connected to each other and are connected to the terminal 78 in common. The terminal 78 is connected for receiving the signal T. Large operands, such as, for example, the multiplication of two 1024-bit operands, are multiplied by a multiplier (not shown) and pipelined to pass or rotate. Typically, for larger operands A and B, they will be divided into smaller groups, which are called groups of numbers, for example, numbers A0- and -h. The pipelined multiplier is sized to multiply these numbers. For example, if the numbers Αβ-ΑΝ and Xin-81,. Are double digits of 16 bits, the multiplier is also a 16-bit multiplier (but the invention is not limited to this). In general terms, the integer-modulus-N Montgomery multiplication is of the form: (A * R mod N) (B * R mod Ν) + ΜΝ where = A is the first operand and is an integer; B is the The second operand is an integer; N is an odd value; mod N is the remainder of (A * B * R) / N, which can define the number of components in the finite domain; R is 2 integers greater than the value of N Power value; and \ is a reduced value '. This value should be such that (A * R mod N) (B * R moci N) + \ N is divisible by R without losing significant bits. In operation, after receiving the product of (a * R mod N) and (B * R mod N), the modulo reducer 60 generates a reduced partial product output for integer-modulus-N Montgomery multiplication. For the sake of simplicity 'we will use four-digit numbers as an example to illustrate the integer-modulus-N Montgomery multiplication. Referring to FIG. 3, input terminals 80, 82, 84
第14頁 448376 五、發明說明(11) 與86接收運算元,像是譬如, 的各自項:pi+D,P1+1 , p1+2與!^另休〇與心相乘後之乘積 ^與⑺則接收^”,^^卜卜輸入終端機81 ’83 模縮減器60產生一縮減的模1相2乘此即、’整數N之值。 端機164-170。 、 項’並輪出至輸出終 2 :減器』。施行該福斯特1哥馬利縮減演算法。在該 : = 利縮減演算法中,某™特定位元之邏輯值 描1、拉 疋否要對齊加入加總值。當特定位元具邏輯 二時’該模縮減器60的架構可讓“值對齊且將之加入該 力^值。H由料及加上該Μ,該\ 132’136及140中決定並儲存。換言之’當產減乘 積項於輸出終端機1 64- 1 70處之縮減處理期間,該^之值 被決定出來,而非在數元Aq與^相乘之前決定的。 舉個例子,當A1Q = 9,U6 與N1Q = 13 時,(A*R mod N)項 之值為0001(以2為基底)。而當bi() = 1i,、=16與1 = 13時, (B*R m〇d N)項之值則為0111。注意,我們會將運算元A〇 與BG先乘以R,以簡化硬體模減化問題。當運算元(A*R mod N)與(B*R mod N)相乘後,乘積項pii3,Pi9,Pii,Pi。 成為 01 1 1。 k u μ 初始時,終端機79之重置信號會導致閂鎖丨28,132, 136與140之Q輸出成為邏輯零。及閘13〇的一輸入接收該具 邏輯值1之乘積項PitQ ’另一輸入則接收信號T。及閘13〇會 產生邏輯1值之輸出,使閂鎖1 2 8被設定,亦即,閂鎖1 2 8 的Q輸出會是邏輯值1 ^必須注意的是,在運算元Ae與心,Page 14 448376 V. Description of the invention (11) and 86 receive operands, such as, for example, the respective terms of: pi + D, P1 + 1, p1 + 2, and! ^ Also, the product of multiplication with heart ^ Receiving with the rule ^ ", ^^ input into the terminal 81 '83 The modulo reducer 60 generates a reduced modulo 1 phase 2 multiplied by this, that is, the value of 'integer N. Terminals 164-170., Term' and round Out to the end of output 2: Subtractor ”. The Foster 1 Gammaley reduction algorithm is implemented. In this: = reduction algorithm, the logical value of a certain bit is described in 1. Total value. When a specific bit has a logical two, the architecture of the modulo reducer 60 allows the value to be aligned and added to the force ^ value. H is determined by adding the M, the 132, 136, and 140 and stored. In other words, when the production-subtraction product term is in the reduction process at the output terminal 1 64- 1 70, the value of ^ is determined instead of being determined before multiplying the numbers Aq and ^. For example, when A1Q = 9, U6 and N1Q = 13, the value of the (A * R mod N) term is 0001 (based on 2). When bi () = 1i, = 16 and 1 = 13, the value of the (B * R m〇d N) term is 0111. Note that we will multiply the operands A0 and BG by R first to simplify the hardware modulation problem. When the operands (A * R mod N) and (B * R mod N) are multiplied, the product terms pii3, Pi9, Pii, Pi. Becomes 01 1 1. k u μ Initially, the reset signal of terminal 79 will cause the Q outputs of latches 28, 132, 136 and 140 to become logic zero. One input of the AND gate 130 receives the product term PitQ 'with a logical value of 1 and the other input receives the signal T. And the gate 13〇 will produce a logic 1 value output, so that the latch 1 2 8 is set, that is, the Q output of the latch 1 2 8 will be a logic value 1 ^ It must be noted that in the operand Ae and the heart,
第15頁 4483 7 6 五、發明說明(12) 即運算元A〇與心之較低序數元相乘在一起的運算過程中, 該信號T是邏輯值1。另應注意的是,閂鎖128之輸出q為邏 輯1值,會導致及閘89,91,93與95被致能,使值Ni+。,N… ,Niu與Ni+3可分別地傳送至加法器90,92,94與96的第二 輸入。是故’Xc行之加法器所產生出的值是,n , Nif2,Ni+3 值與相應的Ρίί(1,Pi + 1,Pi+2 與Pi+3 值的和。 加法器90的第一與第二輸入均為邏輯值1,所以輸出5將 會是邏輯值0 ;而加法器90會於輸出C0上產生之進位信號 。加法器92之第一輸入所接枚的是邏輯1值,第二輸入所 接收的則是邏輯0值,另外,輸入C I上有代表進位之邏輯j 值信號。加法器92之輸出S就成為邏輯值〇,輸出c〇上之進 位信號就為邏輯值1。 加法器94之第一輸入端所接收到的是邏輯1,第二輸入 端所接收到的則是由及閘93而來之邏輯1 ,另外其進位信 號因及閘92A而致能。加法器94的輸出S其邏輯值為i ;進 位輸出C0上之進位輸出信號則亦為邏輯值1。相類似地, 加法器96之第一輸入端所接收到的是邏輯〇,第二輸入端 所接收到的則是由及閘95而來之邏輯1,另外其進位信號 因及閘94A而致能。加法器94的輸出s上之輸出信號的邏輯 值為0 ;而進位輪出c〇上之進位輸出信號則為邏輯值1。根 據福斯特-蒙哥馬利縮減演算法,若該特定位元位置具邏 輯值1 ’亦即輪入終端機8〇上之最小有效位元位置那麼 B使N值排直線,而加入到p值中。 再一次地,根據福斯特—蒙哥馬利縮減演算法,該L行Page 15 4483 7 6 V. Description of the invention (12) That is, during the operation of multiplying the operand A0 and the lower ordinal number of the heart, the signal T is a logical value 1. It should also be noted that the output q of the latch 128 is a logic 1 value, which will cause the AND gates 89, 91, 93, and 95 to be enabled, making the value Ni +. , N ..., Niu and Ni + 3 can be sent to the second inputs of adders 90, 92, 94 and 96, respectively. Therefore, the value produced by the adder of the line "Xc" is the sum of the values of n, Nif2, Ni + 3 and the corresponding values of Pl (1, Pi + 1, Pi + 2, and Pi + 3. The 90th Both the first and second inputs are logic values 1, so output 5 will be a logic value 0; and adder 90 will generate a carry signal on output C0. The first input of adder 92 is a logic 1 value The second input receives a logic 0 value. In addition, there is a logical j value signal representing the carry on the input CI. The output S of the adder 92 becomes a logical value 0, and the carry signal on the output c0 is a logical value. 1. The first input of the adder 94 receives a logic 1 and the second input receives a logic 1 from the AND gate 93, and its carry signal is enabled by the AND gate 92A. The output S of the adder 94 has a logical value i; the carry output signal on the carry output C0 is also a logical value 1. Similarly, the first input of the adder 96 receives logic 0 and the second input The terminal receives a logic 1 from the AND gate 95, and its carry signal is enabled by the AND gate 94A. The output of the adder 94 The logical value of the output signal on s is 0; and the carry output signal on the carry wheel out c0 is a logical value 1. According to the Foster-Montgomery reduction algorithm, if the specific bit position has a logical value of 1 'also That is, the position of the least significant bit on the terminal 80 is rotated so that B makes the N value lined up and added to the p value. Again, according to the Foster-Montgomery reduction algorithm, the L line
第16頁 4483 76 五、發明說明(13) 之加法器所產生之資料其所具有的值’將視特殊資料位元 位置上之資料而定。在此例子中,該特殊資料位元位置是 為加法器92的輸出S。必須注意的是’及閘1 34的輸入所接 收到的是,來自於加法器92的輸出S之邏輯〇信號。閂鎖 132不會被設定,所以其輸出Q依然為邏輯值〇。及閘99, 101,103與105就會分別地在加法器100,102,104,106 的第二輸入端上,留下邏輯值0。加法器1〇〇之第一與第二 輸入均為邏輯值〇,所以其所產生之輸出s為邏輯值〇。相 類似地’加法器102之第一與第二輸入均為邏輯值〇,所以 其所產生之輸出S為邏輯值0。加法器1 04的第一輸入則為 邏輯值1 ’第二輸入為邏輯值0 ’所以其所產生之輸出S為 邏輯值1。加法器106之第一與第二輸入均為邏輯值〇,所 以其所產生之輸出S為邏輯值0。是故,行Xii加法器1〇6 ,104,102與100,各自所產生出之值,排列起來是為: 0100 ° X2行之加法器所產生出之資料所具有的值,也是視特殊 資料位元位置上之資料而定。在此例子中,該特殊資料位 元乃是加法器1 04的輸出邏輯值。必須注意的是,及閘1 38 的輸入所接收到的是’來自於加法器1〇4之輸出S之邏輯1 信號。及閘1 3 8的輸出值為邏輯1,會使閂鎖1 3 6被設定, 所以其輸出Q就成為邏輯值1。及閘1〇9,111,113與115, 均被閂鎖1 3 6產生出之邏輯值1所致能。所以,加法器1 〇 〇 ’ 1 0 2 ’ 1 〇 4與丨〇 6輪出端之資料分別地就會轉移至加法器 110,112,114與116的第二輸入。加法器11〇之第一與第Page 16 4483 76 V. The value of the data generated by the adder of the description of the invention (13) 'will depend on the data in the special data bit position. In this example, the special data bit position is the output S of the adder 92. It must be noted that what is received by the input of the gate 134 is a logic 0 signal from the output S of the adder 92. The latch 132 will not be set, so its output Q remains at a logic value of zero. The AND gates 99, 101, 103, and 105 will leave logical values of 0 on the second inputs of the adders 100, 102, 104, and 106, respectively. The first and second inputs of the adder 100 are both logic values 0, so the output s produced by the adder 100 is logic value 0. Similarly, the first and second inputs of the 'adder 102 are both logic values 0, so the output S produced by it is a logic value 0. The first input of the adder 104 is a logical value 1 'and the second input is a logical value 0', so the output S produced by the adder 104 is a logical value 1. The first and second inputs of the adder 106 are both logic values 0, so the output S produced by it is a logic value 0. Therefore, the values generated by the row Xii adders 106, 104, 102, and 100 are arranged as follows: 0100 ° The value of the data generated by the adder of row X2 has the values, which are also treated as special data. Depending on the position of the bit. In this example, the special data bit is the output logic value of the adder 104. It must be noted that what is received by the input of gate 1 38 is a logic 1 signal from the output S of the adder 104. The output value of AND gate 1 3 8 is logic 1, which causes latch 1 3 6 to be set, so its output Q becomes logic value 1. The gates 109, 111, 113, and 115 are all enabled by the logic value 1 generated by the latch 136. Therefore, the data of the adders 100 ′, 102 ′, 104, and 106 are transferred to the second inputs of the adders 110, 112, 114, and 116, respectively. The first and the first of the adder 11
第17頁 4483 7 6 五、發明說明(14) ' 二輸入均為邏輯值〇,所以其所產生之輸出S為邏輯值〇。 相類似地’加法器112之第一與第二輸入均為邏輯值0,所 以其所產生之輸出S為邏輯值〇。加法器114之第一與第二 輸入均為邏輯值1,所以其所產生之輸出S為邏輯值〇,輸 出C0之進位輸出信號則為邏輯1值。加法器116的第一與第 二輸入均為邏輯值〇且從及閘Π4Α處也傳來一邏輯1值至其 進位輸入端。加法器116的輸出s,產生出邏輯1值。是故 ,行乂2之加法器116,114,112與110,各自所產生出之值 ,排列起來是為:1 〇 〇 〇。 X3行之加法器120,122,124與126所產生出之資料所具 有的值’也是視該特殊資料位元位置上之資料而定。在此 例子中’該特殊資料位元乃是加法器1丨6的輸出邏輯值。 及問142的輸入所接收到的是,來自於加法器116輸出δ之 邏輯1信號。及閘142具有來自於加法器116之邏輯值1,以 及#號T之邏輯1值’因此閂鎖140被設定。閂鎖140的Q輸 出所具有的邏輯1值,致能了及閛H9,121,123與125。 加法器1 1 0 ’ 1 1 2,1 1 4與1 1 6輸出端之資料分別地就會轉移 至加法器120,122 ’124與126的第一輪入。加法器120之 第一與第二輸入均為邏輯值〇 ’所以其所產生之輸出S為邏 輯值0。相類似地’加法器丨22之第一與第二輸入均為邏輯 值0 ’所以其所產生之輪出S為邏輯值〇。加法器124之第一 與第二輸入也是均為邏輯值〇,所以其所產生之輸出S為邏 輯值0。加法器126的第一與第二輪入均為邏輯值!,所以 於輸出端S上產生出邏輯〇值’在進位輸出端上產生出邏輯Page 17 4483 7 6 V. Description of the invention (14) 'Both inputs are logical values 0, so the output S produced by them is a logical value 0. Similarly, the first and second inputs of the 'adder 112 are both logic 0, so the output S produced by it is a logic 0. The first and second inputs of the adder 114 are both logic values 1, so the output S produced by it is a logic value 0, and the carry output signal of output C0 is a logic 1 value. The first and second inputs of the adder 116 are both logic values 0 and a logic 1 value is also transmitted from the sum gate 4A to its carry input terminal. The output s of the adder 116 produces a logical one. For this reason, the adders 116, 114, 112, and 110 of row 2 each produce values that are arranged as follows: 1 00. The value of the data generated by the adders 120, 122, 124, and 126 in line X3 'also depends on the data at the particular data bit position. In this example, 'the special data bit is the output logic value of the adder 1 6. What is received at the input of question 142 is a logic 1 signal from the output δ of the adder 116. The AND gate 142 has a logical value 1 from the adder 116, and a logical 1 value of the # sign T so that the latch 140 is set. The logic 1 value of the Q output of the latch 140 enables 閛 H9, 121, 123, and 125. The data at the outputs of the adders 1 1 0 ′ 1 1 2, 1 1 4 and 1 1 6 are transferred to the first rounds of the adders 120, 122 ′ 124 and 126, respectively. The first and second inputs of the adder 120 are both logical values 0 ', so the output S produced by the adder 120 is a logical value zero. Similarly, the first and second inputs of the 'adder 22' are both logic 0's, so the round-out S produced by it is a logic 0. The first and second inputs of the adder 124 are also logic values 0, so the output S produced by the adder 124 is logic value 0. The first and second rounds of adder 126 are logical values! , So a logic 0 value is generated on the output terminal S ’, and a logic value is generated on the carry output terminal.
第18頁 五、發明說明(15) 1值之進位輸出信號。是故,行乂3之加法器126,124,122 與120,於輸出終端機1 64-170上,產生出各自的值:〇〇〇〇 〇 在進行導致該AG與Bo之第一部份乘積成為零之縮減過程 期間’閂鎖128,132,136與140被設定成内含有用於\的 值1101,此\將用於隨後的管線化乘法運算中。隨著該第 一部份乘積之縮減成為零,該信號T會從邏輯1轉變為邏輯 0,並且將\之值存入閂鎖128,132,136與140之中。該 儲存之\值、N之下一個數元、以及數元b;-b63與^—心3之乘 積,都會被模縮減器60取用來完成該多項式之乘算。 圖4是屬圖2之單一算術處理器22 —部份之乘數結構171 另一具體實施的電路圖。乘數結構1 7 1執行整數—模—n乘法 以及模多項式基乘法之數學運算。為簡明故,乘數結構 1 71被簡化成一四乘四之加法器陣列。雖然我們將乘數結 構1 7 1描述為一列數與行數相同之加法器陣列,但本發明 並不以此為限。 乘數結構171具有XQ行之加法器90,92 ,94與96 ;X行之 加法器100 ’102 ’104與106 ;X2行之加法器iiQ,112, 114與116以及X3 4亍之加法|§120 ’122 ’124盘126。另外, 儲存有進位輸出信號之閂鎖1 52,1 56,ι60與162,該進位 輸出信號將使用於用以產生下一個部份乘積之計算整數_ 模-N乘法中。閂鎖150,154與158儲存運算元6之^料位 元’閂鎖226 ’ 228與23 0儲存有N值的資料位元,此N值將 用於產生下一個部份乘積。Page 18 V. Description of the invention (15) Carry output signal of 1 value. Therefore, the adders 126, 124, 122, and 120 of line 3, on the output terminals 1 64-170, produce their respective values: 〇〇〇〇〇 in the first part of the AG and Bo During the reduction process when the product becomes zero, the latches 128, 132, 136, and 140 are set to contain the value 1101 for \, which will be used in subsequent pipelined multiplication operations. As the product of the first part is reduced to zero, the signal T changes from a logical one to a logical zero, and the value of \ is stored in latches 128, 132, 136, and 140. The stored value, the number below N, and the product of number b; -b63 and ^ -heart 3 are all taken by the modular reducer 60 to complete the multiplication of the polynomial. FIG. 4 is a circuit diagram of another specific implementation of the multiplier structure 171 which is a part of the single arithmetic processor 22 of FIG. 2. The multiplier structure 1 7 1 performs mathematical operations on integer-modular-n multiplication and modular polynomial base multiplication. For brevity, the multiplier structure 1 71 is reduced to a four-by-four adder array. Although we described the multiplier structure 1 7 1 as an adder array with the same number of columns and rows, the present invention is not limited to this. Multiplier structure 171 has XQ line adders 90, 92, 94, and 96; X line adders 100'102, 104, and 106; X2 line adders iiQ, 112, 114, and 116, and X3 43 adder | § 120 '122' 124 plate 126. In addition, latches 1, 52, 1, 56, 60, and 162 are stored for the carry output signal. The carry output signal will be used in the calculation of the integer-modulus-N multiplication to generate the next partial product. Latches 150, 154, and 158 store the ^ material bit of operand 6 ' Latch 226 ' 228 and 23 0 store data bits of N values, and this N value will be used to generate the next partial product.
448376 五、發明說明(16) 乘數結構171中之多工器,每一個均具有四個輸入,— 個輸出以及兩個選擇輸入。多工器172-178 ,182-188, 192-198 ’以及202-208的輸出均連接至加法器的苐一輸入 ’但必須注意的是’該等多工器的輸出也是可以連接至該 等加法器的第二輸入端的。多工器的第一與第二選擇輪入 上之信號’可選出多工器四個輸入信號中的一個,轉移至 多工器的輸出。多工器172 -178之輸出信號分別地轉移至 加法器90-96之第一輸入。多工器182-188之輸出信號分別 地轉移至加法器100-106之第一輸入。多工器192-1 98之輸 出信號分別地轉移至加法器1 1 0 -1 1 6之第一輸入。多工器 202-208之輸出信號分別地轉移至加法器120-126之第一輸 入0 另外,多工器1 72-1 78之第一選擇輸入共同地彼此相接 來接收信號。多工器1 72- 1 78之第二選擇輸入亦彼此 相接,共同地連接至閂鎖122的輸出。多工器1 82-1 88之第 一選擇輸入共同地彼此相接來接收信號。多工器 182-188之第二選擇輸入亦彼此相接,共同地連接至閂鎖 216的輸出。多工器1 92-1 98之第一選擇輪入共同地彼此相 接來接收信號Α(ίίΛ2)。多工器1 92- 1 98之第二選擇輸入亦彼 此相接,共同地連接至閂鎖220的輸出。多工器202-208之 第一選擇輸入共同地彼此相接來接收信號Α( &Α3)。多工器 2 02-208之第二選擇輸入亦彼此相接,共同地連接至閂鎖 224的輸出。 多工器 172-178,182-188,192-198,以及2 02-208 的第448376 V. Description of the invention (16) Each of the multiplexers in the multiplier structure 171 has four inputs, one output and two selection inputs. Multiplexers 172-178, 182-188, 192-198 'and the outputs of 202-208 are connected to the first input of the adder', but it must be noted that 'the outputs of these multiplexers can also be connected to such The second input of the adder. The signal on the first and second selection wheels of the multiplexer can select one of the four input signals of the multiplexer and transfer it to the output of the multiplexer. The output signals of the multiplexers 172-178 are respectively transferred to the first inputs of the adders 90-96. The output signals of the multiplexers 182-188 are respectively transferred to the first inputs of the adders 100-106. The output signals of the multiplexers 192-1 98 are respectively transferred to the first inputs of the adders 1 1 0 -1 16. The output signals of the multiplexers 202-208 are respectively transferred to the first input 0 of the adders 120-126. In addition, the first selection inputs of the multiplexers 1 72-1 78 are commonly connected to each other to receive signals. The second selection inputs of the multiplexers 1 72- 1 78 are also connected to each other and are commonly connected to the output of the latch 122. The first selection inputs of the multiplexers 1 82-1 88 are commonly connected to each other to receive signals. The second selection inputs of the multiplexers 182-188 are also connected to each other and are commonly connected to the output of the latch 216. The first selections of the multiplexers 1 92-1 98 are connected in common to each other to receive the signal A (ίΛ2). The second selection inputs of the multiplexers 1 92- 1 98 are also connected to each other and are commonly connected to the output of the latch 220. The first selection inputs of the multiplexers 202-208 are commonly connected to each other to receive the signal A (& A3). The second selection inputs of the multiplexer 2 02-208 are also connected to each other and are commonly connected to the output of the latch 224. Multiplexers 172-178, 182-188, 192-198, and No. 2 02-208
第20頁 448376 五、發明說明(π) 一輸入均共同地連接以接收邏輯零值。多工器172,174, 176與178的第二輸入分別地接收Β(位元。)’ Β(位元〇 ’ Β(位元2)與6(位 元3)值。多工器172,174,176與178的第三輸入分別地接收 Ν(位元〇) ’ Ν(位元】)’ Ν(位元2)與Ν(位元3)值。多工器172 ’174 ’17 1 7 8的第四輪入分別地接收Ν與Β之和的各分值。所以,每 一個多工器的第四輸入所接收的均是其第二與第三輸入值 的加總邏輯值。 當多工器之第一與第二選擇輸入端所接收到的邏輯值當 多工器之第一與第二選擇輸入端所接收到的邏輯值為00時 ,則多工器 172-178,182-188,192-198 以及 202-208,各 自將其第一輸入端上之信號轉移至輸出。當第一與第二選 擇輸入端所接收到的邏輯值為01時,則多工器1 72-1 78, 182-188,192-198以及202-208,各自將其第二輸入端上 之信號轉移至輸出。當第一與第二選擇輸入端所接收到的 邏輯值為10時,則多工器172-1 78,182-188,19 2-198以 及202-208,各自將其第三輸入端上之信號轉移至輸出。 當第一與第二選擇輸入端所接收到的邏輯值為11時,則多 工器 172-178,182-188,192-198 以及 202 - 208,各自將其 第四輸入端上之信號轉移至輸出。 當信號Τ從邏輯1轉移至邏輯0時,閂鎖21 2,21 6,220與 224會各自將邏輯電路210,214,218與222所傳來之資料 信號給鎖住。邏輯電路2 1 0所產生的資料信號是信號Α(^0) 與Βα⑽的乘積’與Ρ(〇)互斥或運算後的值,其中Ρ(〇)是 先前部份乘積值的最小有效位元。邏輯電路214所產生的Page 20 448376 V. Description of the Invention (π) One input is connected in common to receive a logic zero value. The second inputs of the multiplexers 172, 174, 176, and 178 respectively receive the values of B (bit.) 'Β (bit 0' B (bit 2) and 6 (bit 3). The multiplexer 172, The third inputs of 174, 176, and 178 respectively receive N (bit 0) 'Ν (bit)' N (bit 2) and N (bit 3) values. Multiplexer 172 '174 '17 1 The fourth round of 7 8 receives the scores of the sum of N and B respectively. Therefore, the fourth input of each multiplexer receives the logical sum of its second and third input values. When the logic values received by the first and second selection inputs of the multiplexer are 00, the multiplexers 172-178, 182-188, 192-198 and 202-208 respectively transfer the signals on their first inputs to the outputs. When the logic value received by the first and second selection inputs is 01, the multiplexer 1 72-1 78, 182-188, 192-198, and 202-208 each transfer the signal on its second input to the output. When the logic value received by the first and second selection inputs is 10, Multiplexer 172-1 78 182-188, 19 2-198, and 202-208, each of which transfers the signal on its third input to the output. When the logic value received by the first and second selection inputs is 11, the multiplexer 172-178, 182-188, 192-198, and 202-208, each of which transfers the signal on its fourth input to the output. When the signal T is transferred from logic 1 to logic 0, latch 21 2, 21 6, 220 and 224 lock the data signals from logic circuits 210, 214, 218, and 222, respectively. The data signals generated by logic circuit 2 10 are the product of signals A (^ 0) and Bα⑽, and P ( 〇) The value of the exclusive OR operation, where P (〇) is the least significant bit of the previous partial product value.
4 483 76 五、發明說明(18) 負料^5號疋6號A(位元與B(位元〇的乘積,與加法器μ之輸出 信號互斥或運算後的值。邏輯電路218所產生的資料信號 是k號A(位元n與Bf位元〇的乘積’與加法器1 之輸出信號互斥 或運算後的值。邏輯電路222所產生的資料信號是信號^位 元3)與8(位元(〇的乘積’與加法器116之輸出信號互斥或運算& 的值。 及閘90A-96A位於行\加法器的進位路徑鏈中。所以, 及閘90A-96A有能力控制行Xq之進位鏈上信號的傳遞。相 類似地,及閘100A-106A是位於行X]加法器的進位路徑鏈 中’有能力控制行X!之進位鏈上信號的傳遞β及開丨丨^_ 1 1 6 Α是位於行X2加法器的進位路徑鏈中,有能力控制行& 之進位鏈上信號的傳遞。及閘12〇A —126A是位於行"·χ加法2 器的進位路徑鏈中,有能力控制行心之進位鏈上信^的傳 遞。當乘數結構171正在計算整數_模-n乘法時’及閘9〇八_ 96Α ’ 100Α-106Α,110Α_116Α 以及12〇Α—126α 當乘數結㈣i正在計算模多項式基乘法時,及閉9〇α_96α ’ 100Α-106Α ’ 110Α-116Α 以及120Α-126Α 是禁能的。換言 之,乘法結構171之進位鏈路徑,只有在計算整數_模1乘 法時才是暢通的,而路徑上的進位信號才能傳遞至相鄰之 加法器單元。 產生數7CAG與匕之部份乘積的乘積處理過程,會導致輸 出端164-170之邏輯值缩減。是故,數元、乘上數元心所得 到之部份乘積,所有的數元均為邏輯〇值。另外,閂鎖丨28 ,132 ’136與140也已在Α〇與心相乘的期間被適當地設定, 4 483 "^ 五、發明說明(19) 儲存著、的值。在接下來的相乘運算期間,該已儲存之、 值,連同相應的N〗-N63,數元B]_B63,以及數元,均會 被乘法結構171取用來作整數-模-Ν乘法之數學運算之用。 參考著圖4,以下例子是模多項式乘積之計算過程。多 項式乘法之蒙哥馬利縮減演算法,採取下列的形式: (A * R m 〇 d N ) ( Β * R m 〇 d Ν ) + \ 木 Ν 其中: A是該第一運算元且為一多項式; B是該第·一運算元且為一多項式; N是一既約多項式; mod N是(A*B*R)/N的餘數值,可定義出有限域中元件的 數目; R是大於該N值之2整數乘冪值;以及 \是一縮減值’此值應使(A*R mod N)(B*R mod N)+\ *N可為R所整除且不會損失有效位元。 舉例’當我們以2為基底’假設a = 5 (基底十)或為X2 +1 ( 多項式形式)’ R= 1 6 (基底十)或為X4 (多項式形式),以及 N=ll(基底十)或為X3 + X + l(多項式形式),那麼該(A*R mod N)項成為:(XHX4)mod N,其結果應等於χ+i或01 }。另外 ,當B = 4(基底十),R = 16(基底十),以及N=U(基底十)時 ’該(B*R mod N)成為(X6)m〇d N,其結果應等於χ2 + ι(多項 式形式)或是值1 0 1。注意,我們會將運算元心與β。先乘以r ’以簡化硬體模減化問題。當運算元(A*R mod Ν)與(B*R m〇d N)相乘後,乘積項Pi+3,Pu2,ρ;“,p“fl成為m'/。乘數4 483 76 V. Explanation of the invention (18) Negative material ^ 5 疋 6A (the product of bit A and bit B (bit 0, the value of the output signal of the adder μ is mutually exclusive or calculated. Logic circuit 218 The generated data signal is k number A (the product of bit n and Bf bit 0 'and the output signal of adder 1 are mutually exclusive or calculated. The data signal generated by logic circuit 222 is signal ^ bit 3) And 8 (bit (product of 0) and the output signal of the adder 116 are mutually exclusive OR operation & value. And gates 90A-96A are located in the carry path chain of the row \ adder. Therefore, and gates 90A-96A have The ability to control the transmission of signals on the carry chain of row Xq. Similarly, the gates 100A-106A are located in the carry path chain of the adder's' ability to control the transmission of signals on the carry chain of row X!丨 丨 ^ _ 1 1 6 A is located in the carry path chain of the row X2 adder, and has the ability to control the transmission of signals on the carry chain of the row & AND gate 12〇A-126A is located in the row " χ addition 2 In the carry path chain of the router, it has the ability to control the transmission of the letter ^ on the carry chain of the Xinxin. When the multiplier structure 171 is calculating the integer_modulus-n multiplication Sum gates 108 and 96A '100A-106A, 110A_116A, and 12〇A-126α When the multiplier result ㈣i is calculating a modular polynomial basis multiplication, and the closing 90 ° _96α' 100Α-106Α '110Α-116Α and 120Α-126Α are Disabled. In other words, the carry chain path of the multiplication structure 171 is unblocked only when the integer modulo 1 multiplication is calculated, and the carry signal on the path can be transmitted to the adjacent adder unit. The number 7CAG and The product processing of partial products will cause the logical value of the output terminals 164-170 to shrink. Therefore, all the numbers obtained by multiplying the digits and multiplying the digit centers are logical zero values. In addition, Latches 28, 132, 136, and 140 have also been set appropriately during the period of multiplication of Ao and the heart, 4 483 " ^ V. Description of the invention (19) The value of is stored. In the following phase During the multiplication operation, the stored value, together with the corresponding N〗 -N63, the number B] _B63, and the number will be used by the multiplication structure 171 for mathematical operations of integer-modulus-N multiplication. With reference to Figure 4, the following example is a calculation of the product of modular polynomials Process. The Montgomery reduction algorithm for polynomial multiplication takes the following form: (A * R m 〇d N) (Β * R m 〇d Ν) + \ 木 Ν where: A is the first operand and is a polynomial B is the first operator and is a polynomial; N is a reduced polynomial; mod N is the remainder of (A * B * R) / N, which can define the number of components in the finite field; R is greater than The value of N that is an integer power of 2; and \ is a reduced value '; this value should make (A * R mod N) (B * R mod N) + \ N be divisible by R without losing significant bits yuan. For example, when we use 2 as the basis, suppose a = 5 (basis ten) or X2 +1 (polynomial form) 'R = 1 6 (basis ten) or X4 (polynomial form), and N = ll (basis ten) ) Or X3 + X + l (polynomial form), then the (A * R mod N) term becomes: (XHX4) mod N, and the result should be equal to χ + i or 01}. In addition, when B = 4 (base ten), R = 16 (base ten), and N = U (base ten), 'this (B * R mod N) becomes (X6) mOd N, and the result should equal χ2 + ι (polynomial form) or the value 1 0 1. Note that we will associate the operator center with β. First multiply by r 'to simplify the hardware modulation problem. When the operands (A * R mod Ν) and (B * R m〇d N) are multiplied, the product term Pi + 3, Pu2, ρ; ", p" fl becomes m '/. multiplier
第23頁 ^48376 五、發明說明(20) 結構 171 將[(A*R mod N)*(B*R mod N)]mod N 的乘積值縮 減R倍,產生了值0111或是多項式形式之(XHX + l)。 初始時,終端機79之重置信號會導致閂鎖128,132, 136與140之Q輸出成為邏輯零。及閘130的一輸入接收該具 邏輯值1之乘積項Pi+Q,另一輸入則接收信號T。及閘130會 產生邏輯1值之輸出,使閂鎖1 2 8被設定,亦即,閂鎖1 2 8 的Q輸出會是邏輯值1。必須注意的是,在運算元心與匕, 即在運算元AG與BQ之較低段數元相乘的運算過程中,該信 號T是邏輯值1。另應注意的是,閂鎖128之輸出Q為邏輯1 值,會導致及閘89,91,93與95被致能,使值N1+Q,N1+1,Page 23 ^ 48376 V. Description of the invention (20) Structure 171 Reduces the product value of [(A * R mod N) * (B * R mod N)] mod N by a factor of R, resulting in the value 0111 or a polynomial form (XHX + l). Initially, the reset signal of terminal 79 will cause the Q outputs of latches 128, 132, 136, and 140 to become logic zero. One input of the AND gate 130 receives the product term Pi + Q having a logical value of 1, and the other input receives a signal T. The AND gate 130 will produce a logic 1 value output, so that the latch 1 2 8 is set, that is, the Q output of the latch 1 2 8 will be a logic value 1. It must be noted that the signal T is a logical value 1 during the operation of the core of the operator and the dagger, that is, the multiplication of the lower-order elements of the operators AG and BQ. It should also be noted that the output Q of the latch 128 is a logic 1 value, which will cause the AND gates 89, 91, 93, and 95 to be enabled, making the values N1 + Q, N1 + 1,
Ni+2與\+3可分別地傳送至加法器90,92,94與96的第二輸 入。是故,XQ行之加法器所產生出的值是:,N ,,N i 十u 1+1 i +2 ’ Nit3值與相應的ρί+ϋ,pi+1,p]+2與pi+3值的和。 加法1§90的第一與第一輸入均為邏輯值1,所以輸出§將 會是邏輯值0 ;而加法器90會於輸出C0上產生之進位信號 加法器92之第一輸入所接收到的是邏輯1值,第二輸入 所接收到的亦是邏輯1值’另外.輸入C I上進位信號位為 邏輯值0(及閘90A阻擋了加法器90所產生的進位作缺,植 遞至加法器9…加法器92之輸出S就成:邏進輯= C 0上之進位彳§號就為邏輯值1 ^必須注意的是,閘9 2 a阻擋 了加法器92所產生的進位信號,傳遞至加法器94。 " 加法器94之第一輸入端所接收到的是邏輯1,第二輸入 端所接收到的則是由及閘93而來之邏輯〇,另外其進^信 號為邏輯0〇加法器94的輸出S其邏輯值為i ;而進位輸^Ni + 2 and \ +3 can be passed to the second inputs of adders 90, 92, 94 and 96, respectively. Therefore, the value produced by the adder of XQ line is :, N ,, N i ten u 1 + 1 i +2 'Nit3 value and corresponding ρί + ϋ, pi + 1, p] +2 and pi + Sum of 3 values. The first and first inputs of §90 of addition 1 are both logical value 1, so the output § will be a logical value of 0; and the carry signal generated by adder 90 on output C0 is received by the first input of adder 92 It is a logic 1 value, and the second input also receives a logic 1 value. In addition, the carry signal bit on the input CI is a logic value 0 (and the gate 90A prevents the carry generated by the adder 90 from being missing, and is transferred to The output S of the adder 9 ... the adder 92 becomes: logical progression = carry on C 0 彳 § number is a logical value 1 ^ It must be noted that the gate 9 2 a blocks the carry signal generated by the adder 92 , Pass to the adder 94. " The first input of the adder 94 receives a logic 1 and the second input receives a logic 0 from the AND gate 93, and its input signal The output S of the logic 0 adder 94 has a logic value i; and the carry input ^
48376 五、發明說明(21) C 0上之進位輸出信说則為邏輯值〇。相類似地,加法器9 6 之第一輸入端所接收到的是邏輯1,第二輸入端所接收到 的則是由及閘9 5而來之邏輯1,另外其進位信號為邏輯值 0。加法器96的輸出S上之輸出信號其邏輯值為〇 ;而進位 輸出C0上之進位輸出信號則為邏輯值1。根據福斯特-蒙哥 馬利縮減演算法’若該特定位元位置具邏輯值1,亦即輸 入終端機80上之最小有效位元位置,那麼會使ν值排直線 ’而加入到ρ值中。 根據福斯特-蒙哥馬利縮減演算法,該χ]行之加法器所 產生之 >'料其所具有的值’將視特殊資料位元位置上之資 料而定。在此例子中’該特殊資料位元位置是為加法器92 的輸出S。必須注意的是,及閘134的輸入所接收到的是, 來自於加法器92的輸出s之邏輯〇信號。閂鎖132不會被設 疋,所以其輸出Q依然為邏輯值〇。及閘9 9 , i 〇丨,i 3與 105就會分別地在加法器1〇〇,1〇2,1〇4,ι〇6的第二輸入 端上,留下邏輯值〇。加法器1〇〇之第一與第二輸入均為邏 輯值0 ’所以其所產生之輸出S為邏輯值〇。相類似地,加 法器102之第一與第二輸入均為邏輯值〇,所以其所產生之 輸出S為邏輯值〇。加法器1〇4的第—輸入則為邏輯值1,第 二輪入為邏輯值〇,所以其所產生之輸出S為邏輯值1。加 法器106之弟一與第二輪入均為邏輯值〇,所以其所產生之 輸出S為邏輯值〇。是故,行Χι之加法器1〇6,1〇4,ι〇2與 1〇〇,各自所產生出之值,排列起來是為:〇1 〇(}。 X2行之加法器所產生出之資料所具有的值,也是視特殊48376 V. Description of the invention (21) The carry output signal on C 0 is said to be a logical value 0. Similarly, the first input of the adder 9 6 receives a logic 1 and the second input receives a logic 1 from the AND gate 9 5, and its carry signal is a logic value 0. . The output signal on the output S of the adder 96 has a logic value of 0; and the carry output signal on the carry output C0 has a logic value of 1. According to the Foster-Montgomery reduction algorithm ', if the specific bit position has a logical value of 1, that is, the least significant bit position on the terminal 80 is input, then the value of ν will be lined up and added to the value of ρ. According to the Foster-Montgomery reduction algorithm, the > 'expected value' generated by the adder of the χ] line will depend on the data at the particular data bit position. In this example, 'the special data bit position is the output S of the adder 92. It must be noted that what is received by the input of the AND gate 134 is a logic 0 signal from the output s of the adder 92. The latch 132 is not set, so its output Q remains at a logic value of zero. And the gates 9 9, i 〇 丨, i 3 and 105 will respectively leave the logical value 0 on the second input terminals of the adders 100, 102, 104, and 06. The first and second inputs of the adder 100 are both logical values 0 ', so the output S produced by the adder 100 is a logical value zero. Similarly, the first and second inputs of the adder 102 are both logic values 0, so the output S produced by the adder 102 is logic value 0. The first input of the adder 104 is a logical value 1, and the second round input is a logical value 0, so the output S produced by the adder is a logical value 1. Both the first and second rounds of the adder 106 are logical values 0, so the output S produced by it is a logical value 0. Therefore, the adders 10, 10, 10, and 100, which are in the X-line, are each generated by the following values: 〇 1 〇 (}. The value of the data also depends on the special
第25頁 4483 76_ 五、發明說明(22) ' ' ---- 資料位元位置上之資料而定。在此例子中,該特殊資料位 元位置乃是加法器1 04之輸出S。必須注意的是,及開丨38 的輸入所接收到的是’來自於加法器丨04之輸出s之邏輯丄 信號。及閘138的輸出值為邏輯】,會使閂鎖136被設定, 所以其輸出Q就成為邏輯值1。及閘109 ’1Π ,113與115, 均被閂鎖1 3 6產生出之邏輯值1所致能。所以,加法器1 〇 〇 ,102,104與106輸出端之資料分別地就會轉移至加法器 110 ’112,114與116的第二輸入。加法器11〇之第—與第 二輸入均為邏輯值〇,所以其所產生之輸出S為邏輯值〇。 相類似地’加法器11 2之第一與第二輸入均為邏輯值〇,所 以其所產生之輸出S為邏輯值〇。加法器114之第一與第二 輸入均為邏輯值1 ’所以其所產生之輸出S為邏輯值〇,其 輸出CO之進位輸出信號則為邏輯1值。該進位輸出信號的 ,輯1值被及閘U 4所阻擋,無法傳遞至加法器丨丨6。加法 器116的第一輸入為邏輯值〇 ’第二輸入則為邏輯1值且該 進位輸入為邏輯值〇。加法器116的輸出δ處,產生出邏輯j 值。是故’行X2之加法器U 6,1 14,11 2與11 〇,各自所產 生出之值,排列起來是為:1 〇 〇 〇。 行之加法器120,122,124與126所產生出之資料所具 有的值’也是視該特殊資料位元位置上之資料而定。在此 例子中’該特殊資料位元位置乃是加法器丨丨6的輸出s ^及 閉1 42的輸入所接收到的是,來自於加法器丨丨6輸出s之邏 輯1信號。及閘1 42具有來自於加法器11 6之邏輯值1,以及 信號T之邏輯1值’因此閂鎖1 4 0被設定。閂鎖1 4 0的Q輸出Page 25 4483 76_ 5. Description of the invention (22) '' ---- It depends on the data in the data bit position. In this example, the special data bit position is the output S of the adder 104. It must be noted that what is received by the input of ON 38 is a logic 信号 signal from the output s of the adder 04. The output value of the AND gate 138 is logic [1], which causes the latch 136 to be set, so its output Q becomes a logic value of 1. The sum gates 109'1Π, 113, and 115 are all enabled by the logic value 1 generated by the latch 1 3 6. Therefore, the data at the outputs of the adders 100, 102, 104, and 106 are transferred to the second inputs of the adders 110'112, 114, and 116, respectively. The first and second inputs of the adder 11 are both logical values 0, so the output S produced by the adder 11 is a logical value 0. Similarly, the first and second inputs of the 'adder 112' are both logical values 0, so the output S produced by them is a logical value 0. The first and second inputs of the adder 114 are both logic 1's, so the output S produced by it is a logic 0, and the carry output signal of its output CO is a logic 1. The value of 1 of the carry output signal is blocked by the AND gate U 4 and cannot be passed to the adder 6. The first input of the adder 116 is a logic value 0 'and the second input is a logic 1 value and the carry input is a logic value 0. At the output δ of the adder 116, a logical j value is generated. That is, the values generated by the adders U 6, 1, 14, 11 2 and 11 ′ of the row X2 are respectively arranged as follows: 1 000. The value of the data generated by the adders 120, 122, 124, and 126 is also determined by the data at the particular data bit position. In this example, the position of the special data bit is the output s ^ of the adder 丨 6 and the input of the closed 142 receives the logic 1 signal from the output s of the adder 丨 6. The AND gate 1 42 has a logical value 1 from the adder 116 and a logical 1 value of the signal T ', so that the latch 1 40 is set. Q output of latch 1 4 0
第26頁 4483 76 五、發明說明(23) 所具有的邏輯1值,致能了及閘119,121,123與125。加 法器11 0,11 2,114與1 1 6輸出端之資料分別地就會轉移至 加法器120 ’ 122,124與126的第一輸入。加法器12〇之苐 一與第二輸入均為邏輯值〇 ’所以其所產生之輸出S為邏輯 值0。相類似地,加法器122之第一與第二輸入均為邏輯值 0,所以其所產生之輸出S為邏輯值〇。加法器124之第一與 第二輸入也是均為邏輯值0,所以其所產生之輸出S為邏輯 值0。加法器126的第一與第二輸入均為邏輯值1,所以於 輸出端S上產生出邏輯〇值’在進位輸出端上產生出邏輯i 值之進位輸出信號。及閘1 2 6 A阻擋了該進位信號傳遞至問 鎖162 »是故’行χ3之加法器126,124,122與120,於輸 出終端機164-170上,產生出各自的值:〇〇〇〇 π 在縮減處理的第一乘算週期期間’數元、與&之部份乘 積的第一N位元,會被縮減成零。閂鎖Kg ,132,136與 140被設定成内含用於\的值U 〇1,此\將用於隨後的用 以決定運算元A與B乘積之管線化乘法運算中。隨著該第一 部份乘積之縮減成為零,該信號T會從邏輯〗轉變為邏輯〇 ,並且將\之值存入閂鎖128,132,136與140之中。該儲 存之 \ 值;N(it3) ’ N(if2) ’ 與之〇〇〇〇 值;以及p(〖3), P(i+2) , P(m)與P(i + 0)之0 0 0 0值,都會被乘數結構j 71取用來完 成忒多項式之縮減處理。在完成第二乘算週期完成之後’ 輸出終端機170,168,166與164上之信號就為〇111,譬如 ’是多項式形式(XHX + 1 )的值a 簡要地參考圓4 t (A*R mod N)與(B*R mod N)之模多項Page 26 4483 76 V. Explanation of the invention (23) The logic 1 value has enabled the gates 119, 121, 123 and 125. The data at the outputs of the adders 11 0, 11 2, 114 and 1 16 will be transferred to the first inputs of the adders 120 '122, 124, and 126, respectively. The first and second inputs of the adder 120 are both logical values 0 ', so the output S produced by the adder 12 is a logical value 0. Similarly, the first and second inputs of the adder 122 are both logic 0, so the output S produced by the adder 122 is logic 0. The first and second inputs of the adder 124 are also logic 0, so the output S produced by the adder 124 is logic 0. The first and second inputs of the adder 126 both have a logic value of 1, so a logic 0 value is generated at the output terminal S, and a carry output signal of a logic i value is generated at the carry output terminal. And the gate 1 2 6 A prevented the carry signal from being transmitted to the interlock 162 »So the adders 126, 124, 122, and 120 of the row χ3 generate the respective values on the output terminals 164-170: 〇〇 〇〇π During the first multiplication cycle of the reduction process, the first N bits of the product of the 'number' and the part of & will be reduced to zero. The latches Kg, 132, 136, and 140 are set to contain the value U 〇1, which will be used in subsequent pipelined multiplication operations to determine the product of operands A and B. As the product of the first part is reduced to zero, the signal T will change from logical to logical 0, and the value of \ will be stored in latches 128, 132, 136, and 140. The value of the store; the value of N (it3) 'N (if2)' and its value of 00; and the value of p (〖3), P (i + 2), P (m) and P (i + 0) The value 0 0 0 0 will be taken by the multiplier structure j 71 to complete the reduction of the unitary polynomial. After the completion of the second multiplication cycle, the signals on the output terminals 170, 168, 166, and 164 are 〇111, such as' is the value a of the polynomial form (XHX + 1). Refer briefly to the circle 4 t (A * R mod N) and (B * R mod N)
Μ: 第27頁 44837^ 五、發明說明(24) 式乘算所產生出之二進位乘積,與圖3電路所產生出之相 同。在汁异模多項式基之乘算時’及閘9〇j\-g6A ,i〇〇a-106Α ’ 110Α-116Α與120Α-126Α是不致能的。所以,加法器 90-96,加法器 100~106,加法器 11〇-116,加法器 12〇_126 無法將進位信號傳遞至相鄰的加法器單元。這些及閘的禁 能’將使得每一個C I終端機所接收到的是邏輯零值。 在該第一乘算週期的期間,該縮減處理會於輸出終端機 170,168,166與164上,產生出運算元\與8。之第一部份 乘積值0 0 0 0。另外’在該第一部份乘積之產生期間,閂鎖 224 ’220 ’216與212會被设定且將\的值iioi鎖住,此值 將用於隨後的管線化乘算中。在該第二乘算週期的期間, 輸出終端機170 ’168 ’166與164上的信號分別為二進位值 0111(或是多項式形式(ΧΗΧ+1)的值)。 必須注意的是,乘數結構1 71架構可決定出\的值,並 將之铺存在閃鎖212,216,220與224。換言之,在運算元 八〇與3£!相乘之前’ \值尚未計算出來,\值是在決定運算 元AQ與B〇乘積的週期期間,被決定出且閂鎖住的。在用以 決定運算元A與B全部乘積之管線化處理中,運算其他數元 的相乘期間’是會使用到該閂鎖的\值的。 圖5是用以計算模多項式基乘算之乘法器232的一部份電 路圖。簡短地參考圖4 ’在乘數結構1 71用作計算模多項式 基乘算時,及閘9(^-96八,10(^-10614,110六-116八與120八- 1 2 6 A是不致能的。所以,加法器單元並無法接收到從該相 鄰加法器單元之進位輸出(C0)終端機而來之進位輪入信號Μ: Page 27 44837 ^ V. Explanation of the invention (24) The binary product produced by the type multiplication is the same as that produced by the circuit of FIG. When multiplying the basis of a polynomial of different modes, the sum of 90j \ -g6A, i00a-106A, 110A-116A, and 120A-126A is disabled. Therefore, the adders 90-96, the adders 100-106, the adders 110-116, and the adders 12〇_126 cannot pass the carry signals to the adjacent adder units. These AND gate disables' will cause each CI terminal to receive a logic zero value. During the first multiplication cycle, the reduction processing will generate operands \ and 8 on the output terminals 170, 168, 166, and 164. The first part of the product is 0 0 0 0. In addition, during the generation of the product of the first part, the latches 224 ′ 220 ′ 216 and 212 are set and the value iioi is locked. This value will be used in the subsequent pipeline multiplication. During the second multiplication period, the signals on the output terminals 170'168'166 and 164 are binary values of 0111 (or values in polynomial form (χΗχ + 1)). It must be noted that the multiplier structure 1 71 architecture can determine the value of \ and store it in the flash locks 212, 216, 220, and 224. In other words, the value of \ before multiplication of the operands 80 and 3 £! Has not been calculated, and the value is determined and latched during the period in which the product of the operands AQ and B0 is determined. In the pipeline processing for determining the total product of the operands A and B, the multiplication period of the other operands' will use the value of the latch. Fig. 5 is a part of a circuit diagram of a multiplier 232 for calculating a basis polynomial multiplication. Briefly refer to FIG. 4 'When the multiplier structure 1 71 is used to calculate the modulus polynomial basis multiplication, and the gate 9 (^ -96 eight, 10 (^ -10614, 110 six -116 eight and 120 eight-1 2 6 A It is disabled. Therefore, the adder unit cannot receive the carry-in signal from the carry output (C0) terminal of the adjacent adder unit.
第28頁 448376 五、發明說明(25) 。於是’所有的加法器90-96 ’ 100-106,110-116,與 120-126的加法器單元,均可以圖5中所示之半加器單元取 代之。該等用作為半加器單元之互斥或閘其參考數字的後 面,均加有字母” H"。Page 28 448376 V. Description of the Invention (25). Thus, all of the adder units 90-96, 100-106, 110-116, and 120-126 can be replaced by the half adder unit shown in FIG. The letters “H " are added to the reference numbers of the mutex or gate used as the half adder unit.
對mod N)=XHX4 mod N,(B*R mod N) = (X6) mod N ’A = (X2+1),B = (X2),R = (X4)之例子而言,(A 來 R mod N)與 (B*R mod N)之多項式乘算在其第一乘算週期期間,會於 輸出終端機170,168,166與164上,產生出數值〇〇〇〇。所 以,該第一部份乘積會被縮減成零,而該\值會被判定為 1101並將數元分別地儲存在閂鎖224,220,216與212中。 該儲存之\值將用於稍後的用以產生運算元A與b之全部乘 積之乘算週期中。在該第二乘算週期的期間輸出終端機 170,168,166與164上之信號為二進位值〇ln,譬如,多 項式形式之(χ2 + χ+ι )。 ,二用以計算乘算或模多項式基乘算之歸 =—:方塊圖;其中M是乘法器單元之數目。乘法器 元;厂暫運算元…暫存器242,-用以儲存運算 …,以及-用以儲存心;=二乘”之。暫存器 出重置線,但在進入第一乘算暫。雖然圖中並未示 246清為零。必須注竟的θ ,也。3之刖,需先將c暫存器 模-Ν乘算時,Ν暫存;2心二'法器240計算的是整數_ 法器240計算的是模多項式某的疋/數之二進位值,而乘 是既約多項式之二進位值,Ν暫存器248儲存的 值圖6中之暫存器242-248具Μ位For the example of mod N) = XHX4 mod N, (B * R mod N) = (X6) mod N 'A = (X2 + 1), B = (X2), R = (X4), (A to The polynomial multiplication of R mod N) and (B * R mod N) during the first multiplication period will produce a value of 0.0000 on the output terminals 170, 168, 166, and 164. Therefore, the product of the first part is reduced to zero, and the \ value is determined to be 1101 and the numbers are stored in the latches 224, 220, 216, and 212, respectively. This stored \ value will be used in a later multiplication cycle to generate the total product of operands A and b. During this second multiplication period, the signals on the output terminals 170, 168, 166, and 164 are binary values of 0ln, for example, in the polynomial form (χ2 + χ + ι). Second, used to calculate the multiplication or modular polynomial basis multiplication = —: block diagram; where M is the number of multiplier units. Multiplier element; factory temporary operand ... temporary register 242,-used to store the operation ..., and-used to store the heart; = two times ". The register is out of the reset line, but before entering the first multiplication temporary .Although 246 is not shown in the figure, it must be cleared to zero. Θ must be noted, too. For 3, you need to first multiply the c temporary register module -N, and N is temporarily stored; Is the integer _ generator 240 calculates the binary value of a unitary / number of a modulus polynomial, and the multiplication is the binary value of the reduced polynomial, the value stored in the N register 248. The register 242 in FIG. 6 248 M
4483 764483 76
在=較佳具體實施例中,B暫存器242是一可將儲存於其 中之貧料,向左或向右移位之位移暫存器。舉例,卷乘法 器240計算的是整數-模-N乘算時,B暫存器242會將&料向 右移位,亦即,B暫存器242會先從其最低有效位元開始, 一步步地將位元資料轉移至多工器25〇。另一方面當乘 法器240計算的是模多項式基乘算時,B暫存器242會將資 料向左移位,亦即,Β暫存器242會先從其最高有效位元開 始,—步步地將位元資料轉移至多工器25〇。用於閂鎖 存器242中數值之時鐘信號、a暫存器244、c暫存器24β、 以及N暫存器248,並未顯示於圖6中。同樣地,連接至每 一個暫存器之輸出入,以便資料可進出於暫存器之匯流 線,也並未顯示於圖6中。 乘法器240根據〖NT/POLY輸入端上之信號,來決定是要 執行整數-模-N乘算,還是要執行模多項式基乘算。該 INT/POLY輸入,連接至乘法器25〇之選擇輸入端、B暫存器 242、以及C暫存器246加法器單元之輸入(看圖7與8中之 入INT/POLY)。所以,當!NT/P0LY輸入端上之信號致令乘别 法器240執行模多項式基乘算時,B暫存器242執行的是資 料向左的移位,從其最高有效資料位元位置中的資料開始 ’透過乘法器250,將資料送至C暫存器246之輸入。當乘 法器240執行整數-模-N乘算時,B暫存器242執行的是資料 向右的移位,從其最低有效資料位元位置t的資料間始’, 透過乘法器250 ’將資料送至C暫存器246之輸入。 °In the preferred embodiment, the B register 242 is a shift register that can shift the lean material stored therein to the left or right. For example, when the volume multiplier 240 calculates an integer-modular-N multiplication, the B register 242 shifts the & material to the right, that is, the B register 242 starts with its least significant bit first Step by step, the bit data is transferred to the multiplexer 25. On the other hand, when the multiplier 240 calculates modulo polynomial basis multiplication, the B register 242 shifts the data to the left, that is, the B register 242 starts with its most significant bit first. The bit data is transferred to the multiplexer 25 in steps. The clock signals for the values in the latch register 242, the a register 244, the c register 24β, and the N register 248 are not shown in FIG. Similarly, the inputs and outputs of each register are connected so that data can flow into and out of the register's bus line, which is not shown in Figure 6. The multiplier 240 determines whether to perform an integer-modular-N multiplication or a modular polynomial base multiplication based on the signal at the NT / POLY input. The INT / POLY input is connected to the selection input of the multiplier 25, the input of the B register 242, and the adder unit of the C register 246 (see the INT / POLY in Figures 7 and 8). So when! The signal at the NT / P0LY input causes the multiplier 240 to perform a modular polynomial basis multiplication. The B register 242 performs a leftward shift of the data, starting with the data in its most significant data bit position. 'Through the multiplier 250, the data is sent to the input of the C register 246. When the multiplier 240 performs an integer-modulo-N multiplication, the B register 242 performs a right shift of the data, starting from the data between its least significant data bit position t, and through the multiplier 250 ', The data is input to the C register 246. °
第30頁 448376 五 '發明說明(27) 。圓7是於單一乘算週期運作,兩於乘法器24〇(圖6冗暫存 ^246中之單元270之電路圖。雖然圖7之乘法器24〇採用的 疋漣波進位乘法器,但須了解的是,乘法器24〇可以實施 為進位儲蓄乘法器。是故,C暫存器246的單元(:(η_υ,C(n_2) ,二,與G合併單元2 70於計算模多項式基以及整數_模, 乘算中。單元270的I NT/POLY輸入如果是邏輯零,則單元 27〇會執行模多項式基的運算。單元27〇中之閂鎖262,鎖 住的是第"i"個位元,儲存的是⑽㊉ CARRY I)值,其中與义值乃分別地儲存在a暫存器 2一44,B暫存器242與N暫存器248的特定位元位置(標示為位 凡位置1 )中。CH[GH是儲存在c暫存器246中數值之最高有效 位兀。則是儲存在鄰接c暫存器246中第"i1,個位元之 暫存器單元中之先前部份乘積值。 另方面,當乘法器240被選作用來執行整數—模J乘算 計算時,單元270中之閂鎖262,鎖住的是(Α.η.φ' ⑽㊉CARRYIN㈣㊉值 ' 其中',β與 \乃是分別地儲存在Α暫存器244,Β暫存器242 與N暫存器 248的特定位元位置(標示為位元位置丨)令的值。匕⑽是儲 存在C暫存器246中數值之最低有效位元。CARRYIN, ”是從 該鄰近C暫存器246令第"〗"個位元之加法器單元所傳來的 進位信號^ CARRY INu_2)是從C暫存器246中第,,丨”個位元移 亡兩個單元之加法器單元所傳來的進位信號^ 則是儲 :在鄰接C暫存器246中第"i"個位元之閂鎖中之先 乘積值。Page 30 448376 V. Description of the invention (27). Circle 7 operates in a single multiplication cycle. It is two circuit diagrams of unit 270 in multiplier 24o (Figure 6 redundantly stores ^ 246). Although the multiplier carry multiplier used in multiplier 24o of FIG. 7 must be It is understood that the multiplier 24 can be implemented as a carry-save multiplier. Therefore, the unit of the C register 246 (: (η_υ, C (n_2)), two, and the unit 27 combined with G are used to calculate the modulus polynomial basis and Integer_modulus, multiplying. If the I NT / POLY input of unit 270 is a logical zero, unit 27 will perform the operation of the base of the modular polynomial. The latch 262 in unit 27 will lock the "quoti" The single bit stores the value of ⑽㊉ CARRY I), where the positive and negative values are respectively stored in the specific bit positions of the a register 2 to 44, the B register 242 and the N register 248 (labeled as Bit 1 in position 1). CH [GH is the most significant bit of the value stored in the c register 246. It is the " i1, one-bit register unit stored in the adjacent c register 246 On the other hand, when multiplier 240 is selected to perform an integer-modulus J multiplication calculation, latch 262 in unit 270 What is locked is (Α.η.φ '⑽㊉CARRYIN㈣㊉value' wherein ', β and \ are stored in specific bit positions of A register 244, B register 242, and N register 248 (labeled Is the value of the bit position 丨). The dagger is the least significant bit of the value stored in the C register 246. CARRYIN, "" is the "quote" bit from the neighboring C register 246 The carry signal from the adder unit of the adder unit ^ CARRY INu_2) is the carry signal from the adder unit of the first, second, and third units of the C register 246, which is stored: The value of the first product in the latch of the " i " bit in the adjacent C register 246.
448376 五、發明說明(28) ---- 操作時,整數_模-N乘算之運算元A與運算元B的整數相 乘,會於多個乘算週期中完成。β暫存器242中之資料會以 一個乘算週期移出一個的方式,移至c暫存器246中。所以 ,(:暫存器246執行運算元A與β的相乘,並將乘積值縮減ν 的整數倍’產生出(A*B*R-1 mod Ν)值。所以,在第一乘算 週期中,運算元β的最低有效資料位元會先通過乘法器以〇 ,而後移往C暫存器246 □在下一個乘算週期中,Β暫存器 2-42之向右移出操作,將使得其内之下一最低有效資料位 το先通過乘法器250,而後移至c暫存器246。該相乘過程 會一直持續,直到β暫存器242中儲存的運算元8數值,全 都以一個乘算週期移動一個資料位元的方式,經過乘法器 250移至C暫存器246,而C暫存器24 6產生出乘積 mod Ν)為止。 必須注意的是,具(A;icR m〇d N)形式之運算元A與具(bm mod N)形式之運算元B的相乘,會產生縮減形式為(a*b*r mod N)之乘積。換言之,該乘積被縮減了 R倍。舉例, (A*R mod N)項的值1011〇儲存在A暫存器244 ,(B*R m〇d n )項的值1〇1〇1储存在3暫存器242 ^項的值111〇1則儲存在 Ν、暫存器248。初始時’ c暫存器246會被清為零使先前部 伤乘積之值成為零。在此例文中,乘法SMO所產生 的(An*R mod N)乘積值為(1〇〇1)。 詳.. .田地·•兒月我們乃是將儲存在A暫存器2 4 4中之值與該 k B #存器2 4 2而來之最低有效資料位元相乘,以得到該第 一部份乘積。所以’ A暫存器244中的值(1〇11〇)會與B(〇)_448376 V. Description of the invention (28) ---- During operation, multiplying the integer of operand A and operand B of integer_modulus-N multiplication will be completed in multiple multiplication cycles. The data in the β register 242 will be moved to the c register 246 in a manner that one is multiplied by one. Therefore, (: register 246 performs a multiplication of operands A and β, and reduces the product value by an integer multiple of ν 'to produce a value of (A * B * R-1 mod Ν). Therefore, in the first multiplication In the cycle, the least significant data bit of the operand β will first pass through the multiplier by 0, and then move to the C register 246. □ In the next multiplication cycle, the B register 2-42 will be shifted out to the right. So that the next least significant data bit το passes through the multiplier 250 first, and then moves to the c register 246. This multiplication process will continue until the value of operand 8 stored in the β register 242 is all The method of shifting one data bit by one multiplication period is moved to the C register 246 through the multiplier 250, and the C register 246 produces a product mod N). It must be noted that the multiplication of operand A in the form (A; icR m〇d N) and operand B in the form (bm mod N) results in a reduced form of (a * b * r mod N) Of the product. In other words, the product is reduced by R times. For example, the value of the (A * R mod N) item 1011 is stored in the A register 244, and the value of the (B * R m〇dn) item 1010 is stored in the 3 register 242. The value of the item 111 〇1 is stored in N, temporary register 248. Initially, the 'c register 246 is cleared to zero so that the value of the previous injury product becomes zero. In this example, the product value of (An * R mod N) produced by the multiplication SMO is (1001). Details ... field · month We multiply the value stored in the A register 2 4 4 by the least significant data bit from the k B # register 2 4 2 to get the first Partial product. So the value in the A register 244 (1010) will be the same as B (〇) _
第32頁 /ΐ 483 76 五、發明說明(29) 亦即B的最低有效位元(1 〇 1 〇丄)相乘。 (1) 10110 < ==儲存在A暫存器244中之值 (2) xl01〇i < = = β(〇),b之最低有效位元 (3) 10110 < ==第一位元乘積 根據福斯特-象哥馬利縮減演算法,該部份乘積之特定 位元位置中之邏輯值資料,決定了 N值應否對齊該部份乘 積並與之相加,以縮減該部份乘積值於m〇d N。若該特定 位元位置中之資料為邏輯〇值,則N值不會與該部份乘積相 加。反之’若該特定位元位置中之資料為邏輯1值,則N值 與該部份乘積相加。在此例中,該特定位元位置是該第一 位元乘積之最低有效位元(1 〇 Π ^)。此位置中的資料是邏 輯值零’於是’N值不會加入該第一位元乘積(3)。 第二位元的乘算,則是儲存在A暫存器244中之值與B暫 存器242中之下一個最低有效位元的相乘。所以,是A暫存 器244中的值(10110)乘以B(l),此βο)即為b的下一最低 有效資料位元-邏輯值0(101^1)。 (1)10110 〈==儲存在Α暫存器244中之值 (4) xl01公1 0= B(l),B之下一個最低有效位元 (5) 00000 < ==第二位元乘算結果 該第二位位元乘積(5)會與該先前已儲存之結果(3)相 加,產生出該第二部份乘積(6) ^ (5) 00 00 0 < ==第二位元乘算結果 (3) + 10110 < ==第一部份乘積 (6) 10110 <二=第二部份乘積Page 32 / ΐ 483 76 V. Description of the invention (29) That is, the least significant bit (100) of B is multiplied. (1) 10110 < == value stored in register A 244 (2) xl01〇i < = = β (〇), the least significant bit of b (3) 10110 < == first bit The meta product is based on the Foster-Like Gomery reduction algorithm. The logical value data in the specific bit position of the partial product determines whether the N value should be aligned with the partial product and added to it to reduce the Partial product values are in mOd N. If the data in that particular bit position is a logical zero, then the value of N will not be added to the product of that part. Conversely, if the data in the specific bit position is a logical 1 value, the value of N is added to the product of the part. In this example, the specific bit position is the least significant bit (10 Π ^) of the first bit product. The data in this position is a logical value of zero, so the value of N is not added to the first bit product (3). The multiplication of the second bit is the multiplication of the value stored in the A register 244 and the next least significant bit in the B register 242. Therefore, the value (10110) in the A register 244 is multiplied by B (l), and β?) Is the next least significant data bit of b-a logical value of 0 (101 ^ 1). (1) 10110 <== value stored in A register 244 (4) x l01 male 1 0 = B (l), the least significant bit below B (5) 00000 < == second bit Multiplication result The second bit product (5) is added to the previously stored result (3) to produce the second partial product (6) ^ (5) 00 00 0 < == 第Two-bit multiplication result (3) + 10110 < == product of the first part (6) 10110 < two = product of the second part
第33頁 448376Page 448 376
五、發明說明(30) 根據福斯特-蒙哥馬利縮減演算法,該第二部份乘積之 特定位元位置中之邏輯值,決定了該第二部份乘積應否 減。在此例中,該特定位元位置是該最低有效資料位' 左一位置(10110)。該第二資料位元具邏輯1值,於是,/ 值要對齊並加入該第二部份乘積。換言之,藉由將該1^值 對齊該特定位元位置,與第二部份乘積相加’來縮 二部份乘積。 $ (6) 1011 0 < ==第二部份乘積 (7) + 11101 < ==對齊的 N 值 C8) 1010000< ==已縮減之第二部份乘積 第三位兀的乘算,則是儲存在A暫存器244中之值與趴㈧ 相乘;β( 2)即為B暫存器242從右數過來之第三位元位置中 之資料位元值(1 〇i〇i)。 (1) 1 0 1 1 0 < ==儲存在A暫存器244中之值 (9) xl〇l〇l < = = β(2),下一個最低有效位元 (10) 10110 < ==第三位元乘算結果 接著’该第二位元乘積(丨〇 )應與先前之結果(8 )相加, 以提供出第三部份乘積(Π)。 (8 ) 1 0 1 0 000 < ==先前結果 (10) +I0I1Q < ==第三位元乘算結果 (11) 10101000 < ==第三部份乘積 忒第二部份乘積之特定位元位置中之邏輯值決定了該 第二部伤乘積應否縮減。在此例中,該特定位元位置是 (1 0 1 0 1Ρ 0 )從右數過來之第三位元位置。該第三資料位元V. Explanation of the invention (30) According to the Foster-Montgomery reduction algorithm, the logical value in the specific bit position of the product of the second part determines whether the product of the second part should be reduced. In this example, the specific bit position is the leftmost position of the least significant data bit (10110). The second data bit has a logical 1 value, so the / value is aligned and added to the second partial product. In other words, the partial product is reduced by aligning the 1 ^ value with the specific bit position and adding it to the second partial product '. $ (6) 1011 0 < == product of the second part (7) + 11101 < == aligned N value C8) 1010000 < == product of the reduced third part of the product of the second part, Then, the value stored in the A register 244 is multiplied by the party; β (2) is the data bit value (1 〇i〇) in the third bit position of the B register 242 from the right. i). (1) 1 0 1 1 0 < == value stored in A register 244 (9) xl101l < = = β (2), next least significant bit (10) 10110 < == Third-bit multiplication result Then 'the second-bit product (丨 〇) should be added to the previous result (8) to provide the third partial product (Π). (8) 1 0 1 0 000 < == previous result (10) + I0I1Q < == third bit multiplication result (11) 10101000 < == product of the third part 忒A logical value in a particular bit position determines whether the second injury product should be reduced. In this example, the specific bit position is the third bit position (1 0 1 0 1P 0) from the right. The third data bit
4483 7 e 五、發明說明(31) -- 具邏輯值〇,於是’該第三部份乘積不需加上該N值。 第四位元的乘算,則是儲存在A暫存器244中之值 相乘,· B(3)即為β暫存器242從右數過來之第四位元^立 之資料位元值(1Ρ1 01)。 (I) < ==儲存在Α暫存器244中之值 (12) xlpl01 < = = β(3),下一個最低有效位元 (13) 〇〇〇〇〇 < ==第四位元乘算結果 接著,該第四位元乘積應與第三部份乘積(u)相加,以 提供出第四部份乘積(1 4 )。 (II) 10101000 <二=第三部份乘積 (13) + Q 〇 〇 〇 〇 < ==第四位元乘算結果 (14) 10101000 < ==第四部份乘積 該第四部份乘積之特定位元位置中之邏輯值,決定了該 第四部份乘積應否縮減。在此例中,該特定位元位置是 (1 0 1 〇1 0 0 0 )從右數過來之第四位元位置。該第四資料位元 具邏輯值1,於是,需將該N值對齊並加入該第四部份乘積 〇 (14) 1 0 1 0 1 0 0 0 < ==第四部份乘積 (15) +11101 < ==對齊的 N 值 (16) 110010000 < ==已縮減之第四部份乘積 第五位元的乘异’則是儲存在A暫存器244中之值與β(4) 相乘;Β(4)即為β暫存器242從右數過來之第四位元位置中 之資料位元值Q0101)。 (1) 丨❶11^儲存在Α暫存器244中之值4483 7 e V. Description of the invention (31)-It has a logical value of 0, so ‘the product of the third part does not need to add the value of N. The multiplication of the fourth bit is the value stored in the A register 244, and B (3) is the fourth bit of the data from the β register 242 that counts from the right. Value (1P1 01). (I) < == Value stored in A register 244 (12) xlpl01 < = = β (3), next least significant bit (13) 〇〇〇〇〇 < == 4th Bit multiplication result Next, the fourth bit product should be added to the third partial product (u) to provide a fourth partial product (1 4). (II) 10101000 < 2 = product of the third part (13) + Q 〇〇〇〇〇 < == result of the fourth bit multiplication (14) 10101000 < == product of the fourth part The logical value in the specific bit position of the product of the shares determines whether the fourth product should be reduced. In this example, the specific bit position is the (10 1 0 1 0 0) fourth bit position from the right. The fourth data bit has a logical value of 1, so the N value needs to be aligned and added to the fourth partial product. (14) 1 0 1 0 1 0 0 0 < == Fourth partial product (15 ) +11101 < == aligned N value (16) 110010000 < == multiplication of the fifth part of the reduced fourth product multiplied by the fifth bit 'is the value stored in the A register 244 and β ( 4) Multiplication; B (4) is the data bit value Q0101 in the fourth bit position of the β register 242 from the right. (1) 丨 ❶11 ^ Value stored in A register 244
第35頁 ^483 7 6Page 35 ^ 483 7 6
(17) xi〇101 0= B(4),下一個最低有效位元 (18) 10110 〇=第五位元乘積結果 接著,該第五位元乘積應與該已縮減之第四部份乘積 (16)相加’以提供出第五部份乘積(19)。 (16) 110010000 < ==已縮減之第四部份乘積 (18) + 10110 〈==第五位元乘算結果 ( 1 9 ) 1 0 1 1 1 1 0 0 0 0 < ==第五部份乘積 再一次地,該第五部份乘積中某特定位元之邏輯值,決 定了該第五部份乘積應否縮減。在此例中,該特定位元位 置是值(1011110000)從右數過來之第五位元位置。該第五 資料位元具邏輯值1 ’於是,需將該N值對齊並加入該第五 部份乘積。 (19) 1011110000 〇=第五部份乘積 (20) + 11101 < ==正綠對齊之N值 (21) 10011000000 < ==已縮減之第五部份乘積 (A*R mod N)與(B*R mod N) ’ 即(10110)與(ι〇ι〇1)的乘 積,大於該N值。若最後之已縮減部份乘積值大於N,則將 該最後部份乘積值減去N值。換言之,即是將n值(1 1 1 〇 1 ) 對齊該已縮減之部份乘積( 1 00 1 1 0 0 0000)後減去。必須注 意的是,該1 XN乘法器240已用於計算該mod N)之 最後乘積,且算出是1001。 在執行運算元A與B的相乘之前’該福斯特-蒙哥馬利縮 減演算法中之\值是尚未計算出來的;但可從前例看出, \值是在縮減數元Ao與BG的乘積時’決定出來的。應注意(17) xi〇101 0 = B (4), the next least significant bit (18) 10110 〇 = the result of the fifth bit product Next, the product of the fifth bit should be the product of the reduced fourth part (16) Add 'to provide the fifth partial product (19). (16) 110010000 < == product of reduced fourth part (18) + 10110 <== result of the fifth bit multiplication (1 9) 1 0 1 1 1 1 0 0 0 0 < == Five-Part Product Again, the logical value of a particular bit in the fifth-part product determines whether the fifth-part product should shrink. In this example, the specific bit position is the fifth bit position from the right of the value (1011110000). The fifth data bit has a logical value of 1 '. Therefore, the N value needs to be aligned and added to the fifth partial product. (19) 1011110000 〇 = product of the fifth part (20) + 11101 < == positive green aligned N value (21) 10011000000 < == product of the reduced fifth part (A * R mod N) and (B * R mod N) 'is the product of (10110) and (ι〇ι〇1), which is greater than the value of N. If the final reduced product value is greater than N, the final product value is subtracted from the N value. In other words, the value of n (1 1 1 〇 1) is aligned with the product of the reduced part (1 00 1 1 0 0 0000) and then subtracted. It must be noted that the 1 XN multiplier 240 has been used to calculate the final product of the mod N), and the calculation is 1001. Before performing the multiplication of operands A and B, the \ value in the Foster-Montgomery reduction algorithm has not yet been calculated; but as can be seen from the previous example, the \ value is the product of the reduced elements Ao and BG It's decided. Should pay attention
第36頁 a 76 五、發明說明(33) 的是,Ν值是奇數的,亦即,Ν之最低有效位元的邏輯值為 1。所以,若Ν所加上的加總值,其特定位元之值為邏輯i ’則mod N)較低位元的值就會是零。另一角度來 看,該福斯特-蒙哥馬利縮減演算法在產生縮減K倍之乘積 時,會造成該最低有效位元的值為邏輯〇。 參考圖6與7,可產生出支援ECC(多項式基的則是F2M)之 (A*B) mod N的乘積,其中a與B代表橢圓曲線座標之有限 域單元,N則是既約或基底多項式。產生出乘積所需要的 乘算週期數’有一部份乃是取決於儲存在B暫存器242中之 位兀數而元定。資料乃是一次一個地,從β暫存器242中移 位到C暫存器246中。所以,運算元Α與6的相乘,在(Α*Β) Ν值時’將該相乘後的值縮減Ν的整數倍,這些動作都 是在C暫存器246中執行。因為在乘法器240計算模多項式 基乘算時,進位信號是無法在加法器單元間傳遞的,所以 我們可以將模多項式基的乘算設計為:Α暫存器2 4 4之最高 有效資料位元與B暫存器242中之最高有效資料位元的相乘 °此方式免除了須將運算元做成蒙哥馬利格式(即,A —AR mod N)之必要。B暫存器242從最高有效資料元開始,經過 乘法器250,將資料位元一個個地移位至◦暫存器246中。 將B暫存器242中之最高有效資料位元(即B(4)值)乘上a 暫存器244令之值,即產生出第一部份乘積。所以,舉例 ’將B(4)-即111〇1(多項式形式為X4)之二進位值1,乘上a 暫存器244中之二進位值10110(多項式形式gXHXHX)。該 既約多項式N所具之值為100ι01 (多項式形式為χΗχΗ1)。Page 36 a 76 5. Invention Description (33) is that the value of N is odd, that is, the logical value of the least significant bit of N is 1. Therefore, if the total value added by N has a value of a specific bit of logic i ', then the value of the lower bit of mod N) will be zero. From another perspective, when the Foster-Montgomery reduction algorithm generates a product of K times reduction, the value of the least significant bit will be logic 0. Referring to Figures 6 and 7, the product of (A * B) mod N supporting ECC (polynomial basis is F2M) can be generated, where a and B represent the finite field units of the elliptic curve coordinates, and N is the approximation or basis Polynomial. The number of multiplication cycles' required to produce a product is determined in part by the number of bits stored in the B register 242. The data is shifted from the β register 242 to the C register 246 one at a time. Therefore, the multiplication of the operands A and 6 reduces the multiplied value by an integer multiple of N when the value of (A * B) N is obtained. These operations are performed in the C register 246. Because the carry signal cannot be transferred between the adder units when the multiplier 240 calculates the modulo polynomial basis multiplication, we can design the multiplication of the modulo polynomial basis as: the most significant data bit of the A register 2 4 4 Multiplication of the most significant data bit in the B register 242 ° This method eliminates the need to make the operand into Montgomery format (ie, A — AR mod N). The B register 242 starts from the most significant data element and passes the multiplier 250 to shift the data bits one by one into the ◦ register 246. Multiplying the most significant data bit (ie, the value of B (4)) in the B register 242 by the value of the a register 244, produces the first partial product. So, for example, ‘multiply B (4) —that is, the value of 11010 (the polynomial form is X4) by 1 into the binary value of 10110 (the polynomial form gXHXHX). The reduced polynomial N has a value of 100ι01 (the polynomial form is χΗχΗ1).
第37頁 4483 7f 五、發明說明(34) Π) 10110 < ==儲存在A暫存器244中之值 (2) Χ1Π01 < = = β(4),最高有效位元 (3) 10110 < ==第一部份乘積結果 將該第-部份乘積值加上先前部份乘積所得出之加總值 ’會由於-開始時C暫存器246被清為零,該先前部份^積 所具之值為零,而仍為第一部份乘積值1〇11〇。在下一個 乘算週期中’Β暫存H 242中之資料會向左移& ’使得 存器242之下一最高資料位元經過乘法器25〇,轉移至c 存器246中。C暫存器246會將儲存在A暫存器244令之 以該下一最高資料位元。即為1〇11〇(多項式形式為 X )乘以β ( 3 )-即m 0 1 (多項式形式為X3)之二進位值}。 (1) 1 011 〇 < ==儲存在A暫存器244中之值 (4) XUl〇l < = = B(3),下一個最高有效位元 (5) 1 0Π0 < ==第二位元乘算結果 該第二位位元乘積(5 )會與該先前已儲存之結果相加, 產生出該第二部份乘積(6)。 ° C 3) 10110 < ==;第一部份乘積 (5) + 1QJJ 0 < ==第二位元乘算結果 (6) 111010 < =二第二部份乘積 測出特定位元位置中之邏輯值’以決定該部份乘積應否 縮減。若該特定位元具邏輯值1,那麼就將該Ν值對齊^兮 部份乘積之特定位元位置,然後相加。在此例_,該二 位元乃是該第二部份乘積之最高有效資料位元。此處:$ 特定位元之值為邏輯值l(ill〇l〇)❶是故,須將 4483 7 6 五、發明說明(35) N )對齊於該最高有效資料位元後減去。Page 37 4483 7f V. Description of the invention (34) Π) 10110 < == value stored in A register 244 (2) χ1Π01 < = = β (4), most significant bit (3) 10110 < == The product of the first part of the product is added to the value of the product of the first part and the sum of the product of the previous part. 'The C register 246 is cleared to zero at the beginning of the previous part. The value of the ^ product is zero, and it is still the product value of the first part 10101. In the next multiplication cycle, the data in the 'B temporary storage H 242 will be shifted to the left &' so that the highest data bit under the memory 242 passes through the multiplier 25 and is transferred to the c memory 246. The C register 246 stores the order in the A register 244 to the next highest data bit. That is 1011 (polynomial form X) multiplied by β (3)-that is, the binary value of m 0 1 (polynomial form X3)}. (1) 1 011 〇 < == value stored in A register 244 (4) XUl01 < = = B (3), next most significant bit (5) 1 0Π0 < == Second bit multiplication result The second bit product (5) is added to the previously stored result to produce the second partial product (6). ° C 3) 10110 <==; product of the first part (5) + 1QJJ 0 < == result of the second bit multiplication (6) 111010 < = two product of the second part to measure a specific bit The logical value in the position 'determines whether the product of the part should be reduced. If the specific bit has a logical value of 1, then the N value is aligned with the specific bit position of the partial product and then added. In this example, the two bits are the most significant data bits of the product of the second part. Here: The value of the specified bit is the logical value l (ill〇l〇). Therefore, 4483 7 6 V. Invention Description (35) N) must be aligned with the most significant data bit and subtracted.
應注意的是,在計算模多項式基之乘算時,乘法器240 是不傳遞進位信號的’所以’ ”相加”或”相減”動作僅是對 兩數值執行互斥或邏輯運算而已。另應注意的是,在與N 相加之後,該第二部份乘積之最高有效位元會縮減成為零 〇 (6) 111 〇 〇=第二部份乘積 (7) -10-01 01 < ==對齊的NUHXHX3)值 (8) 011111 〇=已縮減之第二部份乘積 (Χ7 + ΧΗΧ5 + ΧΗΧ3) 第三位元的乘算’則是儲存在Α暫存器244中之值與β(2) 的相乘,Β(2)即為Β暫存器242從左數過來之第三位元位置 中之資料位元值(11丄〇1)。 (1) m 1 〇 < ==儲存在Α暫存器244中之值 (9) Xlli〇l < = = b(2),下一最高有效位元 (10) 10110 < ==第三位元乘算結果 接著’該第三位元乘積(1 〇)應與先前之結果(即該已縮 減之第二部份乘積(8 ))相加,以提供出第三部份乘積(11 ) 〇 (8) 011111 0=已縮減之第二部份乘積 (10) + 10110 < =二第三位元乘算(XHXHX3) (Π) 0 1 0 1 0 0 0 〇=第三部份乘積(X7 + X5) 該第三部份乘積之特定位元位置中之邏輯值,決定了該 第三部份乘積應否縮減。在此例中,該特定位元位置是It should be noted that the multiplier 240 does not pass the carry signal when calculating the multiplication of the modulo polynomial basis. Therefore, the "addition" or "subtraction" action is only a mutual exclusion or logical operation on the two values. It should also be noted that after adding to N, the most significant bit of the product of the second part is reduced to zero. 0 (6) 111 〇〇 = product of the second part (7) -10-01 01 < == aligned NUHXHX3) value (8) 011111 〇 = reduced second part product (χ7 + χ + χ5 + χΗχ3) The multiplication of the third bit is the value stored in the A register 244 and By multiplying β (2), B (2) is the data bit value (11 丄 01) in the third bit position of the B register 242 from the left. (1) m 1 〇 < == value stored in A register 244 (9) Xlli〇l < = = b (2), next most significant bit (10) 10110 < == No. The result of the three-bit multiplication followed by 'the third-bit product (10) should be added to the previous result (that is, the reduced second-part product (8)) to provide a third-part product ( 11) 〇 (8) 011111 0 = the product of the reduced second part (10) + 10110 < = the second and third digit multiplication (XHXHX3) (Π) 0 1 0 1 0 0 0 〇 = the third part Partial product (X7 + X5) The logical value in the specific bit position of the product of the third part determines whether the product of the third part should be reduced. In this example, the specific bit position is
第39頁 綱376 五、發明說明(36) (〇_! ο 1 ο ο 〇)從左數過來之第二位元位置。該第二資料位元 具邏輯值1 ’於是,須將該N值(X2*N)對齊於該第三部份乘 積後減去。 ( 1 1 ) 0 1 0 1 000 < ==第三部份乘積(χ7 + χ5) (12) - 100101 < ==對齊的N(XUX4 + X2)值 ( 1 3 ) 000 1 1 0 1〈==縮減之第三部份乘積(χ5 + χ4 + χ2) 第四位元的乘算’則是儲存在Α暫存器244中之值與B(l) 的相乘;B(l)即為Β暫存器242從左數過來之第四位元位置 中之資料位元值(111公1)。 (1) 101 10 < ==儲存在Α暫存器244中之值 (14) X111公1 < = = B(l),下一最高有效位元 (15) 〇〇〇〇〇 < ==第四位元乘算結果 接著,該第四位元乘算結果(丨5 )應與該已縮減之第三部 份乘積(1 3 )相加,以提供出第四部份乘積(丨6 )。 (13) 0001101 < ==已縮減之第三部份乘積 (15) + 〇〇〇〇〇 < ==第四位元乘算結果 (16) 00011010 < ==第四部份乘積(χΗΧ4 + Χ2) 該第四部份乘積之特定位元位置中之邏輯值,決定了該 第四部份乘積應否縮減。在此例中,該特定位元位置是 (0 1 1 0 1 0 )從左數過來之第三位元位置。該第三資料位元 為邏輯值0,於是,該第四部份乘積不須加上該Ν值。 第五位元的乘算’則是儲存在Α暫存器244中之值與8(〇) 的相乘;B(l)即為B暫存器242從左數過來之第五位元位置 中之資料位元值(11 1 〇1)。Page 39 Outline 376 V. Description of the invention (36) (〇_! Ο 1 ο ο 〇) The position of the second digit from the left. The second data bit has a logical value of 1 ', so the N value (X2 * N) must be aligned with the product of the third part and subtracted. (1 1) 0 1 0 1 000 < == product of the third part (χ7 + χ5) (12)-100101 < == aligned N (XUX4 + X2) value (1 3) 000 1 1 0 1 <== The product of the reduced third part (χ5 + χ4 + χ2) The multiplication of the fourth bit is the multiplication of the value stored in the A register 244 by B (l); B (l) That is the data bit value (111 male 1) in the fourth bit position of the B register 242 from the left. (1) 101 10 < == value stored in A register 244 (14) X111 male 1 < = = B (l), next most significant bit (15) 〇〇〇〇〇 < == Fourth bit multiplication result Next, the fourth bit multiplication result (丨 5) should be added to the reduced third part product (1 3) to provide the fourth part product (丨 6). (13) 0001101 < == product of the reduced third part (15) + 〇〇〇〇〇〇 < == result of the fourth bit multiplication (16) 00011010 < == product of the fourth part ( χΗχ4 + χ2) The logical value in the specific bit position of the product of the fourth part determines whether the product of the fourth part should be reduced. In this example, the specific bit position is (0 1 1 0 1 0) the third bit position from the left. The third data bit is a logical value 0, so the fourth partial product need not be added to the N value. The multiplication of the fifth bit is the multiplication of the value stored in the A register 244 and 8 (〇); B (l) is the fifth bit position of the B register 242 counting from the left The data bit value in (11 1 〇1).
第40頁 4483 76 五、發明說明(37) (1) 101 10〈二=儲存在A暫存器244中之值 (17) xlll〇i < = = β(〇) ’下一最高有效位元 (18) 101 10 < ==第五位元乘算結果 接著’該第五位元乘算結果(18)應與該已縮減之第四部 份乘積C1 6 )相加,以提供出第五部份乘積(1 9 )。 ( 1 6 ) 0 0 0 1 1 0 1 0 〇=已縮減之第四部份乘積 (18) + 10110 < ==第五位元乘算結果 ( 1 9 ) 0 0 0 1 0 0 0 10 < ==第五部份乘積(ΧΗΧ) 該第五部份乘積之特疋位元位置中之邏輯值,決定了該 第五部份乘積應否縮減。在此例中,該特定位元位置是 (G00100010)從左數過來之第四位元位置。該第四資料位 元具邏輯值1,於是,須將該Ν值對齊於該第五部份乘積後 減去。 (19) 0 0 0 1 0 0 0 1 0 < =二第五部份乘積 (20) - 1 〇 0 1 0 1 〈==正確對齊的Ν值 (21) 000000111 < ==縮減之第五部份乘積(χ2 + χ + 1) 該乘算過程將持續下去’直到Β暫存器242中所儲存的值 均以一個乘算週期移一個資料位元的方式,經過乘法器 2 50轉移至C暫存器246且C暫存器246產生出乘積(A*B mod N)為止。邊值為二進位多項式形式為χΗχ2 + χ1)之(A mod N)項,與該值為二進位111〇1(多項式形式為χ4 + χ3 + χ2 + 1)之(B mod Ν)項相乘,產生出〇〇〇〇〇〇111(多項式形式為 XHX + 1)。 圖8之電路圖,乃是另一可釺對兩週期乘算操作,用作Page 40 4483 76 V. Description of the invention (37) (1) 101 10 <2 = value stored in A register 244 (17) xlll〇i < = = β (〇) 'Next most significant bit (18) 101 10 < == fifth bit multiplication result followed by 'the fifth bit multiplication result (18) should be added to the reduced fourth part product C1 6) to provide the The fifth part of the product (1 9). (1 6) 0 0 0 1 1 0 1 0 〇 = the product of the reduced fourth part (18) + 10110 < == the fifth multiplication result (1 9) 0 0 0 1 0 0 0 10 < == Fifth Part Product (XYZ) The logical value in the special bit position of the fifth part product determines whether the fifth part product should be reduced. In this example, the specific bit position is (G00100010) the fourth bit position from the left. The fourth data bit has a logical value of 1, so the N value must be aligned with the product of the fifth part and subtracted. (19) 0 0 0 1 0 0 0 1 0 < = product of the second and fifth parts (20)-1 〇0 1 0 1 <== correctly aligned N value (21) 000000111 < == reduced first Five-part product (χ2 + χ + 1) The multiplication process will continue 'until the values stored in the B register 242 are shifted by one data bit in a multiplication cycle, and transferred by the multiplier 2 50 Until the C register 246 and the C register 246 produce a product (A * B mod N). The boundary value is a (A mod N) term with a binary polynomial form of χΗχ2 + χ1), which is multiplied by a (B mod Ν) term with a binary value of 111〇1 (polynomial form of χ4 + χ3 + χ2 + 1) , Which yields 100,000 (the polynomial form is XHX + 1). The circuit diagram of Figure 8 is another example of a two-cycle multiplication operation.
第41頁 448376 五、發明說明(38) 為乘法器24〇(圖6)之C暫存器246中所有位元位置之單元。 參考圖δ ’單元28〇(圖8)描述出C暫存器246之單元C(n_n, c(n ” ’…’與C0的邏輯情況。乘法器240之INT/POLY輸入端 兔'為邏輯0 ’則是選擇乘法器240用作計算模多項式基之乘 算。參考圖8 ’單元280中之閂鎖,鎖住的是(ΑθΒίΦΝ, ㊉值’其中\ Κ與义值分別地儲存在a暫存器244 ’B暫存器242與N暫存器248的特定位元位置(標示為位元 位置丨)中° CKIGH是C暫存器246中最高有效位元之數值。 C(1-u則是毗鄰C暫存器246第"i,1個位元之加法器單元,其 内所存之先前部份乘積值。 了另:方面,當乘法器24〇(圖6)被選作用來執行整數-模 N乘算6十算時,單元280鎖住的是(AJBi ®CARRYIN0(i_1}㊉ )值其中乂與A乃是分別地儲存在a暫存琴244盘β暫 f器Μ的特定位元(標示為位元。中的值。c_N〇㈣ 疋從該毗鄰C暫存器246第,| Γ個單元之加法器單元所傳來 的進位^號。Cn-υ則是該她鄰於C暫存器246第"i"個單元 之加法器單元,其内所儲之先前部份乘積值。 =果C暫存器246(圖6)所閂鎖的最低有效資料位元(LSB) 二二輯Γ ’那麼,就須進入第二乘算週期以決定C肩 值及對所產生出來之部份乘積進行縮減。 存入特信號若為邏輯1值,則執行之。Νί乃是儲存在N暫 存:248特疋位元位置(標示為位元位置。中之數值、所以 週週期先計算出W之部份積,然後第二乘算 遇期再針對該已計算出來之部份積進行縮減。有一個回授Page 41 448376 V. Description of the invention (38) is a unit of all bit positions in the C register 246 of the multiplier 24 (Fig. 6). With reference to the figure δ 'unit 28 (Figure 8) describes the logical situation of the units C (n_n, c (n "' ... 'and C0) of the C register 246. The INT / POLY input terminal of the multiplier 240 is logic 0 'is used to select the multiplier 240 for the multiplication of the modulus polynomial basis. Referring to FIG. 8' the latch in unit 280 locks (ΑθΒίΦΝ, ㊉ 值 'where \ κ and the meaning are stored in a respectively In register 244 'B register 242 and N register 248 in a specific bit position (labeled as bit position 丨) ° CKIGH is the value of the most significant bit in C register 246. C (1- u is the 1-bit adder unit adjacent to the C register 246, and the previous partial product value stored in it. In addition: On the other hand, when the multiplier 24o (Figure 6) is selected to function To perform an integer-modular N multiplication of sixty, the unit 280 locks the value of (AJBi ®CARRYIN0 (i_1) ㊉) where 乂 and A are stored separately in a temporary storage piano 244 disk β temporary f M The specific bit (labeled as the bit. The value in .c_N〇㈣ 疋 is the carry ^ from the adjacent C register 246, the adder unit of the Γ units. Cn-υ is the She is next to C The adder unit of the " i " unit of the device 246 stores the previous partial product value. = The least significant data bit (LSB) latched by the C register 246 (Figure 6) 22 Then, you must enter the second multiplication cycle to determine the C-shoulder value and reduce the partial product generated. If the special signal is a logical 1 value, it is executed. Νί is stored in N Temporary storage: 248 special bit positions (labeled as bit positions.), So the weekly period first calculates the partial product of W, and then the second multiplication period is performed on the calculated partial product. Reduction. There is a feedback
第42頁 148376Page 148 376
五、發明說明(39) 路徑可將C;值提供至乘法器282 , s成,导士 ,可在該第二乘算週期细„ ,担^ 月’月間 &供Ni值經由乘法器284而 至全加器286之輸入。平$而 一 卞。,大約需要百分之5〇之笛 —乘算週期時間來產生該縮減乘積(A*B*Ri m〇d Ν)。 此刻應已對本發明實施為積體電路之密碼乘算系統,其 所達到之高效能,低成本以低功率有所激賞。該硬體的^ 法器因可同時地支援RSA與ECC演算法之兩運算元之相乘運 算,而具高效能。該乘算系統可修改成適合大運算分+ _ π 7U < 運 异’以及修改成執行計算時所需之時鐘週期較先前技 系統為少。 之V. Description of the invention (39) The path can provide the value of C; to the multiplier 282, s, the guide can be detailed in the second multiplication cycle, ^ month's month & for the Ni value through the multiplier 284 And to the input of the full adder 286. It is equivalent to 卞. It takes about 50% of the flute—multiplying the cycle time to produce the reduced product (A * B * Ri m0d Ν). It should be at this moment. For the cryptographic multiplication system implemented as an integrated circuit of the present invention, the high efficiency, low cost, and low power achieved are highly appreciated. The hardware ^ means can simultaneously support the two operands of the RSA and ECC algorithms The multiplication operation has high performance. The multiplication system can be modified to suit large calculation points + _ π 7U < operation difference 'and modified to perform calculations with fewer clock cycles than the prior art system.
第43頁Page 43
Claims (1)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US21593498A | 1998-12-18 | 1998-12-18 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| TW448376B true TW448376B (en) | 2001-08-01 |
Family
ID=22804995
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW88121181A TW448376B (en) | 1998-12-18 | 1999-12-03 | Circuit and method cryptographic multiplication for computing the RSA and ECC algorithms |
Country Status (3)
| Country | Link |
|---|---|
| AU (1) | AU3286399A (en) |
| TW (1) | TW448376B (en) |
| WO (1) | WO2000038047A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TWI462010B (en) * | 2008-01-15 | 2014-11-21 | Inside Secure | Cryptographic method and system using a representation change of a point on an elliptic curve |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| DE10107376A1 (en) * | 2001-02-16 | 2002-08-29 | Infineon Technologies Ag | Method and device for modular multiplication and arithmetic unit for modular multiplication |
| US7711763B2 (en) * | 2001-02-21 | 2010-05-04 | Mips Technologies, Inc. | Microprocessor instructions for performing polynomial arithmetic operations |
| ATE316668T1 (en) * | 2001-12-14 | 2006-02-15 | Koninkl Philips Electronics Nv | ASSEMBLY BELT CORE IN A MONTGOMERY MULTIPLER |
| US20040252829A1 (en) * | 2003-04-25 | 2004-12-16 | Hee-Kwan Son | Montgomery modular multiplier and method thereof using carry save addition |
| US7769797B2 (en) | 2004-01-20 | 2010-08-03 | Samsung Electronics Co., Ltd. | Apparatus and method of multiplication using a plurality of identical partial multiplication modules |
| CN104503730A (en) * | 2014-10-24 | 2015-04-08 | 山东华芯半导体有限公司 | Instruction-based large-number point addition and point multiplication operation circuit and realization method |
| CN106961282A (en) * | 2017-03-31 | 2017-07-18 | 山东超越数控电子有限公司 | A kind of Hardware Implementation of the High Speed I BM algorithm structures based on binary BCH code |
| CN113031911B (en) * | 2019-12-24 | 2024-09-17 | 上海寒武纪信息科技有限公司 | Multiplier, data processing method, device and chip |
| CN121036982B (en) * | 2025-10-30 | 2026-03-17 | 上海芯联芯智能科技有限公司 | Encryption and decryption device, encryption method, decryption method and related products |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| DE19644688B4 (en) * | 1996-10-28 | 2005-06-16 | Systemonic Ag | Circuit arrangement of a digital multiplier module for processing binary numbers and elements made of GF (2m) |
| GB9707861D0 (en) * | 1997-04-18 | 1997-06-04 | Certicom Corp | Arithmetic processor |
-
1999
- 1999-02-05 WO PCT/US1999/002451 patent/WO2000038047A1/en not_active Ceased
- 1999-02-05 AU AU32863/99A patent/AU3286399A/en not_active Abandoned
- 1999-12-03 TW TW88121181A patent/TW448376B/en active
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TWI462010B (en) * | 2008-01-15 | 2014-11-21 | Inside Secure | Cryptographic method and system using a representation change of a point on an elliptic curve |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2000038047A1 (en) | 2000-06-29 |
| AU3286399A (en) | 2000-07-12 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Koziel et al. | SIKE’d up: Fast hardware architectures for supersingular isogeny key encapsulation | |
| Satoh et al. | A scalable dual-field elliptic curve cryptographic processor | |
| Hasan et al. | A modified Massey-Omura parallel multiplier for a class of finite fields | |
| Reyhani-Masoleh et al. | Low complexity word-level sequential normal basis multipliers | |
| EP0531158B1 (en) | Method of and apparatus for encryption and decryption of communication data | |
| US6182104B1 (en) | Circuit and method of modulo multiplication | |
| CN100527072C (en) | Device and method for carrying out montgomery mode multiply | |
| US6356636B1 (en) | Circuit and method for fast modular multiplication | |
| US6671709B2 (en) | Multiplier cell and method of computing | |
| US5210710A (en) | Modulo arithmetic processor chip | |
| US4891781A (en) | Modulo arithmetic processor chip | |
| Xie et al. | Efficient implementation of finite field arithmetic for binary ring-LWE post-quantum cryptography through a novel lookup-table-like method | |
| Tian et al. | High-speed FPGA implementation of SIKE based on an ultra-low-latency modular multiplier | |
| TW448376B (en) | Circuit and method cryptographic multiplication for computing the RSA and ECC algorithms | |
| Großschädl | A bit-serial unified multiplier architecture for finite fields GF (p) and GF (2m) | |
| Koziel et al. | SIKE'd Up: Fast and Secure Hardware Architectures for Supersingular Isogeny Key Encapsulation | |
| JP4783382B2 (en) | Montgomery method multiplication remainder calculator | |
| TW200403584A (en) | Apparatus and method for calculating a result of a modular multiplication | |
| JPH05324277A (en) | Cryptographic communication method | |
| Lee | Low-latency bit-parallel systolic multiplier for irreducible x m+ x n+ 1 with gcd (m, n)= 1 | |
| JP4177526B2 (en) | Multiplication residue calculation method and multiplication residue circuit | |
| Lu et al. | A programmable VLSI architecture for computing multiplication and polynomial evaluation modulo a positive integer | |
| Kameyama et al. | Design of an RSA Encryption Processor Based on Signed‐Digit Multivalued Arithmetic Circuits | |
| US7403965B2 (en) | Encryption/decryption system for calculating effective lower bits of a parameter for Montgomery modular multiplication | |
| TW202321900A (en) | Modular multiplication circuit and corresponding modular multiplication method |