JPH0213999A

JPH0213999A - Voice signal detection method and apparatus

Info

Publication number: JPH0213999A
Application number: JP1112386A
Authority: JP
Inventors: Pierre A Laurent; ピエール　アンドレ　ロウラン
Original assignee: Thomson CSF SA
Current assignee: Thales SA
Priority date: 1988-05-04
Filing date: 1989-05-02
Publication date: 1990-01-18
Also published as: EP0341128A1; US4982341A; ATE83578T1; FR2631147A1; ES2036813T3; FR2631147B1; GR3007361T3; EP0341128B1; CA1312357C; DE68903872D1; DE68903872T2

Abstract

PURPOSE: To correctly detect a speech signal in a signal buried in noise, by deciding that speech is present in the noise-affected signal when the maximum energy of the filtered digital signal, or the energy ratio, exceeds its respective maximum threshold. CONSTITUTION: A filter 44 and an energy calculation device 45 provide a parameter Xph, and an energy calculation device 46 provides a parameter X. The parameters X and Xph are applied to the first and second operand inputs of a divider 47, respectively, to calculate a parameter R. When the maximum energy of the digital signal passed through filters 43 and 44, or the energy ratio R, exceeds its respective maximum threshold, a speech signal is judged to be present in the noise-affected signal. Consequently, a speech signal buried in noise can be detected correctly.

Description

[Background of the invention]

The present invention relates to a method and apparatus for detecting speech signals, which can be used in particular for alternating radio transmission on board vehicles.

[Description of the prior art]

Most prior-art voice activity detectors cannot operate correctly unless the signal-to-noise ratio is sufficiently high, at least about 20 dB. Moreover, this corresponds to working conditions in a quiet, office-type environment.

In contrast, on board a vehicle, speech/noise discrimination has to cope with very low signal-to-noise ratios, typically below 10 dB. Under certain conditions (for example at high engine speed in a vehicle with average soundproofing), the noise level may even exceed the signal level.

Finally, the level and type of the noise to be discriminated vary with conditions specific to the vehicle (for example its degree of soundproofing), and also as a function of the route taken. A particularly unfavorable case is that of urban routes, where the noise to be considered is generally high in level, non-stationary and usually highly variable.

An embodiment of a voice activity detector designed to operate in noisy environments is known from patent application No. 7974227 of September 28, 1979, filed on behalf of the applicant. However, except for voiced sounds, this detector cannot be used to optimize speech/noise discrimination: the decision is made solely by comparing the speech signal with a threshold voltage value, and this variable is tied automatically to the maximum amplitude of the speech signal, without taking the actual noise level into account. The resulting performance is not sufficient for correct operation in highly disturbed environments, where the speech signal is buried in noise.

[Summary of the invention]

The aim of the invention is to overcome the above-mentioned drawbacks. To this end, the subject of the invention is a method for detecting a speech signal in a signal buried in noise. The method consists of the following steps.

- Cutting the signal into several frames.
- Sampling each frame to obtain a digital signal containing a determined number n of samples.
- Preamplifying the digital signal to obtain a preamplified digital signal.
- Filtering the preamplified digital signal with a high-pass digital filter to obtain a filtered digital signal.
- Measuring, in each frame, the maximum energy of the samples of the preamplified signal and the maximum energy of the samples of the filtered digital signal.
- Taking the energy ratio between the maximum energy of the samples of the filtered digital signal and the maximum energy of the samples of the preamplified digital signal.
- Calculating, between two limits, long-term average values of the energy of the samples of the filtered signal and of the energy ratio.
- Calculating four thresholds from the long-term average values, two of which are maximum thresholds forming the lower limits of the speech state for the filtered signal and the energy ratio respectively, and two of which are minimum thresholds forming the upper limits of the noise state for the filtered signal and the energy ratio respectively, and comparing the maximum energy of the filtered signal and the energy ratio with these thresholds.
- Deciding that a speech signal is present in the noise-affected signal when the maximum energy of the filtered digital signal, or the energy ratio, is greater than its respective maximum threshold.
- Deciding that a speech signal is absent from the noise-affected signal when the maximum energy of the filtered digital signal, or the energy ratio R, is smaller than its respective minimum threshold.

A further subject of the invention is a device for carrying out the above method.

[Example]

The method according to the invention, shown in FIGS. 1 to 4, is described for a practical embodiment in which the noisy signal is cut into frames of about 20 milliseconds, each sampled at a rate of 160 samples per frame to give signal samples S. As shown in steps 1 to 5 of FIG. 1, the digital signal S to be processed is first preamplified in step 1 to give signal samples S(n), then filtered in step 2 by a high-pass filter with a cutoff frequency FC = 1200 Hz to give signal samples Sph(n). In the following steps 3 and 4, these parameters are calculated:

$$X = \max_{n} S(n), \qquad X_{ph} = \max_{n} S_{ph}(n)$$

where n is an integer between 1 and 160. These calculations consist of finding, in each sequence of samples S(n) and Sph(n), the sample with the maximum amplitude or maximum energy.

Step 5 consists of calculating the ratio R = Xph/X of the two parameters Xph and X calculated in steps 3 and 4.
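Steps 1 to 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source gives the pre-filter H(z) = 1 − 0.86·z⁻¹ (the form of a first-order pre-emphasis filter) and a cutoff of about 1200 Hz, but not the design of the high-pass filter, so a first-order RC-style filter is assumed here. "Energy" is taken as the squared sample value, filter state is reset at each frame for simplicity, and all function names are hypothetical. The 8000 Hz rate follows from 160 samples per 20 ms frame.

```python
import math

FRAME_LEN = 160        # samples per frame, from the text
SAMPLE_RATE = 8000.0   # Hz, implied by 160 samples per 20 ms frame


def preemphasize(frame, coeff=0.86):
    """Step 1: apply H(z) = 1 - coeff * z^-1 to one frame."""
    out, prev = [], 0.0
    for s in frame:
        out.append(s - coeff * prev)
        prev = s
    return out


def highpass(frame, cutoff_hz=1200.0, rate=SAMPLE_RATE):
    """Step 2: first-order RC-style high-pass.

    The filter order and design are assumptions; the source only
    states the cutoff frequency.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / rate)
    out, prev_in, prev_out = [], 0.0, 0.0
    for s in frame:
        prev_out = alpha * (prev_out + s - prev_in)
        prev_in = s
        out.append(prev_out)
    return out


def frame_parameters(frame):
    """Steps 3 to 5: return (X, Xph, R) for one frame."""
    s = preemphasize(frame)
    sph = highpass(s)
    x = max(v * v for v in s)          # peak energy as squared amplitude (assumption)
    xph = max(v * v for v in sph)
    r = xph / x if x > 0.0 else 0.0    # guard against an all-zero frame
    return x, xph, r
```

With this sketch, a frame dominated by high frequencies yields a larger ratio R than a frame dominated by low-frequency noise, which is the property the later threshold tests rely on.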

The following steps 6 to 11 consist of calculating the parameters X1 and R1 according to the following relations.

If Xph is greater than the value of X1 calculated for the preceding frame in FIG. 1, written X1,old:

$$X_1 = X_{ph}$$

otherwise:

$$X_1 = T_X \cdot X_{1,old} + (1 - T_X) \cdot X_{ph}$$

If R is greater than the ratio R1 calculated for the preceding frame in FIG. 1, written R1,old:

$$R_1 = R$$

otherwise:

$$R_1 = T_r \cdot R_{1,old} + (1 - T_r) \cdot R$$

This means that the values of X1 and R1 may grow instantaneously from one frame to the next, whereas they decrease more slowly, with time constants equal to TX and Tr respectively. In the proposed embodiment of the invention, the value of these time constants is fixed at 0.75, which corresponds to about 70 milliseconds.

The following steps 12 to 29, shown in FIGS. 2 and 3, consist of determining four detection thresholds using the long-term average values of the parameters Xph and R.
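The update of steps 6 to 11 is a peak follower with instantaneous rise and geometric decay. The sketch below illustrates it and also checks the quoted equivalence between the coefficient 0.75 and roughly 70 ms, assuming 20 ms frames and taking the "time constant" to mean the time for the decay to reach 1/e. Function names are hypothetical.

```python
import math

FRAME_MS = 20.0  # approximate frame duration from the text


def track_peak(current, previous, decay=0.75):
    """Steps 6 to 11: follow rises at once, let falls decay geometrically."""
    if current > previous:
        return current
    return decay * previous + (1.0 - decay) * current


def equivalent_time_ms(coeff, frame_ms=FRAME_MS):
    """Time for a recursion y[n] = coeff * y[n-1] to decay to 1/e.

    After k frames the value is coeff**k, so k = -1 / ln(coeff).
    """
    return -frame_ms / math.log(coeff)


# A coefficient of 0.75 per 20 ms frame gives a little under 70 ms.
print(round(equivalent_time_ms(0.75), 1))
```

The same function applies to X1 (with TX) and to R1 (with Tr), since both use the value 0.75 in the proposed embodiment.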

The latter are first limited, in step 12, between fixed maximum and minimum values, so as to prevent excessive variations of the thresholds. The limits of variation of Xph and R are called Xph,inf, Xph,sup, R,inf and R,sup respectively.

Steps 13 to 22 calculate two parameters X2 and R2 that satisfy the following relations:

$$X_2 = \max(\min(X_{ph}, X_{ph,sup}), X_{ph,inf})$$

$$R_2 = \max(\min(R, R_{sup}), R_{inf})$$

The long-term average values of the parameters Xph and R are written Xmoy and Rmoy respectively, and are calculated in steps 23 to 28 by applying the following relations.
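The limiting of steps 13 to 22 is a plain clamp. A minimal sketch follows; the function name is hypothetical, and the numeric bounds in the usage lines are placeholders, since the source gives no values for the limits.

```python
def limit(value, lower, upper):
    """Steps 13 to 22: MAX(MIN(value, upper), lower)."""
    return max(min(value, upper), lower)


# Placeholder bounds, for illustration only.
x2 = limit(5.0, lower=1.0, upper=3.0)   # above the upper limit, clipped to 3.0
r2 = limit(0.2, lower=0.5, upper=2.0)   # below the lower limit, raised to 0.5
```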

If X2 is greater than the parameter Xmoy calculated for the immediately preceding frame in FIG. 3, written Xmoy,old:

$$X_{moy} = T_m \cdot X_{moy,old} + (1 - T_m) \cdot X_2$$

otherwise:

$$X_{moy} = T_d \cdot X_{moy,old} + (1 - T_d) \cdot X_2$$

If R2 is greater than the parameter Rmoy calculated for the immediately preceding frame in FIG. 3, written Rmoy,old:

$$R_{moy} = T_m \cdot R_{moy,old} + (1 - T_m) \cdot R_2$$

otherwise:

$$R_{moy} = T_d \cdot R_{moy,old} + (1 - T_d) \cdot R_2$$

In these relations, the rise time constant Tm provides for a slow exponential rise, whereas the fall time constant Td provides for a fast exponential fall, so that the average under consideration returns quickly to the level corresponding to the noise. In the proposed embodiment of the invention, these time constants are fixed at 0.95, i.e. about 400 ms, for the rise, and at 0.2, i.e. about 13 ms, for the fall. Finally, the four threshold values are calculated in step 29, using the values of Xmoy and Rmoy defined above, by the following relations.
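The long-term average of steps 23 to 28 is an asymmetric first-order smoother: slow to rise, fast to fall. A minimal sketch, with a hypothetical function name and the coefficient values quoted in the text as defaults:

```python
def update_average(limited, previous, t_rise=0.95, t_fall=0.2):
    """Steps 23 to 28: slow rise (Tm), fast fall (Td) toward the noise level."""
    coeff = t_rise if limited > previous else t_fall
    return coeff * previous + (1.0 - coeff) * limited


# A burst well above the average moves it only slightly upward...
after_burst = update_average(10.0, 1.0)   # 0.95 * 1 + 0.05 * 10
# ...while a drop pulls the average most of the way down in one frame.
after_drop = update_average(1.0, 10.0)    # 0.2 * 10 + 0.8 * 1
```

With 20 ms frames, the 1/e time of such a recursion is −20 ms / ln(T), which gives about 390 ms for T = 0.95 and about 12 ms for T = 0.2, in line with the rounded figures quoted in the text.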

$$SX_{1,\mathrm{speech}} = a \cdot X_{moy} + X_{ph,inf}$$

$$SX_{1,\mathrm{noise}} = b \cdot X_{moy} + X_{ph,inf}$$

$$SR_{1,\mathrm{speech}} = a \cdot R_{moy} + R_{inf}$$

$$SR_{1,\mathrm{noise}} = b \cdot R_{moy} + R_{inf}$$

In the example proposed by the invention, the multiplying coefficients a and b are fixed at 1.8 and 1.25. Note that if one of the parameters Xph or R is smaller than the corresponding lower limit, the relevant decision is taken automatically.
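The four relations of step 29 translate directly into code. In this sketch the function name and dictionary keys are hypothetical; the defaults a = 1.8 and b = 1.25 are the values stated in claim 8, and the lower limits are passed in because the source gives no numeric values for them.

```python
def detection_thresholds(x_avg, r_avg, x_inf, r_inf, a=1.8, b=1.25):
    """Step 29: derive the four adaptive thresholds from the long-term averages."""
    return {
        "sx_speech": a * x_avg + x_inf,
        "sx_noise": b * x_avg + x_inf,
        "sr_speech": a * r_avg + r_inf,
        "sr_noise": b * r_avg + r_inf,
    }


# Placeholder inputs, for illustration only.
th = detection_thresholds(x_avg=10.0, r_avg=0.5, x_inf=1.0, r_inf=0.1)
```

Because a is larger than b, each speech threshold sits above the matching noise threshold, which leaves a hysteresis band between entering and leaving the speech state.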

The device that carries out steps 1 to 5 of the method and calculates the energy ratio is shown in FIG. 5.

This device has a first filter 43 with the transfer function $H(z) = 1 - 0.86\,z^{-1}$, which performs the preamplification of the signal shown in step 1. The output of this filter is connected, first, to a second high-pass filter 44 with a cutoff frequency of about 1200 Hz and, second, to an energy calculation device 46.

The second high-pass filter 44 is in turn coupled at its output to an energy calculation device 45 similar to energy calculation device 46. Filter 44 and energy calculation device 45 provide the parameter Xph by executing steps 2 and 3 of the method, and energy calculation device 46 provides the parameter X. The parameters X and Xph are applied to the first and second operand inputs of a divider 47, respectively, which calculates the parameter R in accordance with step 5.

An embodiment of the energy calculation devices 45 and 46 is shown in FIG. 6. This circuit has a comparator circuit 48 coupled to a register 49 through a bypass circuit 50. Comparator circuit 48 has two inputs. The first input receives the signal samples S(n) supplied by digital filter 43, or the signal samples Sph(n) supplied by digital filter 44. The second input is connected to the output of register 49. Bypass circuit 50 is controlled by comparator circuit 48 and routes the signal sample S(n) or Sph(n) to the input of register 49 when that sample is greater than the content of register 49. Otherwise, register 49 remains looped back on itself and keeps its content.
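The comparator-and-register arrangement of FIG. 6 behaves like a running maximum over the frame. A minimal sketch, assuming the register is cleared to zero at the start of each frame and that the incoming values are non-negative energies (neither detail is stated in the source); the function name is hypothetical.

```python
def peak_register(samples):
    """Mimic FIG. 6: the register is overwritten only by a larger sample."""
    register = 0.0                 # assumed reset value at the start of a frame
    for sample in samples:
        if sample > register:      # comparator 48 drives bypass circuit 50
            register = sample      # the sample is routed into register 49
        # otherwise register 49 stays looped on itself and keeps its content
    return register
```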

One embodiment of a device for carrying out steps 6 to 11 is shown in FIG. 7. This device has a comparator circuit 51 coupled to an accumulator circuit 52 through a bypass circuit 53. A multiplier circuit 54 is connected by its first operand input to the first input of comparator circuit 51, and receives on its second operand input the parameter 1−TX or 1−Tr appearing in steps 8 and 11 of the method. A second multiplier circuit 55 is connected by its first operand input to the output of accumulator 52, and receives on its second operand input the parameter TX or Tr appearing in steps 8 and 11 of the method. The outputs of multiplier circuits 54 and 55 are connected to the first and second operand inputs of an adder circuit 56, whose output is connected to the first input of bypass circuit 53. The output of accumulator 52 is also connected to the second operand input of comparator circuit 51.

In accordance with steps 6 to 11, the parameter Xph or R is applied to the first input of comparator circuit 51 and compared with the content X1,old or R1,old of accumulator circuit 52. If, in step 6 or step 9, the parameter Xph or R is greater than the content X1,old or R1,old of accumulator circuit 52, bypass circuit 53 updates the accumulator with the parameter Xph or R, in accordance with steps 7 and 10. Otherwise, bypass circuit 53 switches the output of adder circuit 56 to the input of accumulator circuit 52, so that the accumulator is updated with the parameter X1 or R1 defined by the relations given above for steps 8 and 11. In these relations, the product (1−TX)·Xph or (1−Tr)·R is obtained by multiplier circuit 54, and the product TX·X1,old or Tr·R1,old by multiplier circuit 55. The sum of these products is formed by adder circuit 56.

Steps 12 to 22 of the method, shown in FIG. 2, are carried out by threshold amplifiers (not shown) whose characteristics are shown in FIGS. 8(A) and 8(B). These threshold amplifiers make it possible to disregard excessive values of the parameters X1 and R1. With these characteristics, each parameter X1 or R1 is limited between two values, Xph,inf and Xph,sup, or R,inf and R,sup. Between these thresholds, the characteristics produce the parameters X2 and R2 as linear functions of X1 and R1. For values of X1 and R1 outside these thresholds, the amplitudes of X2 and R2 are limited.

One embodiment of a device for calculating the average values Xmoy or Rmoy, as described by steps 23 to 28 of the method, is shown in FIG. 9. This device comprises a subtractor circuit 57, a multiplier circuit 58, an adder circuit 59 and a register 60, connected in series in that order. Subtractor circuit 57 has a first operand input, to which the parameter X2 or R2 is applied, and a second operand input connected to register 60. The device also has a comparator circuit 61 with two inputs, connected respectively to the two inputs of subtractor circuit 57. The output of comparator circuit 61 is connected to the control input of a bypass circuit 62. This bypass circuit has two inputs, to which the time constants Tm and Td are applied. The output of bypass circuit 62 is connected to the first operand input of multiplier circuit 58, whose second operand input is connected to the output of subtractor circuit 57. The output of multiplier circuit 58 is in turn connected to the first operand input of adder circuit 59, whose second operand input is connected to the first operand input of subtractor circuit 57.

This averaging device performs the operations of the method shown in steps 23 to 28. In step 23 or step 26, the parameter X2 or R2 is applied to the first comparison input of comparator circuit 61 and compared with the content Xmoy,old or Rmoy,old of register 60. If it is greater than the content of register 60, comparator 61 commands bypass circuit 62 to apply the time constant Tm to the first operand input of multiplier circuit 58. On its second operand input, multiplier circuit 58 receives the result of the subtraction between the content Xmoy,old or Rmoy,old of register 60 and the value of the parameter X2 or R2. The result of the multiplication performed by multiplier circuit 58, Tm·(Xmoy,old − X2) or Tm·(Rmoy,old − R2), is applied to the first operand input of adder circuit 59 and added to the parameter X2 or R2 applied to its second operand input. The result of this addition is transferred into register 60.

If, however, in step 23 or 26, the value of the parameter X2 or R2 is not greater than the value Xmoy,old or Rmoy,old held in register 60, comparator circuit 61 commands bypass circuit 62 to apply the time constant Td to the first operand input of multiplier circuit 58. Under these conditions, the calculation proceeds as described above, with the time constant Tm replaced by Td, in accordance with the relations shown in steps 25 and 28 of the method.
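The datapath of FIG. 9 computes the update in a factored form: a subtraction, one multiplication, then an addition. As a quick check, written here for X2 and Xmoy (the same holds for R2 and Rmoy, and with Td in place of Tm), expanding that form recovers the recurrence of steps 23 to 28:

```latex
T_m \, (X_{moy,old} - X_2) + X_2
  = T_m \, X_{moy,old} - T_m \, X_2 + X_2
  = T_m \, X_{moy,old} + (1 - T_m) \, X_2
```

The factored form needs one multiplication per update rather than two, which matches the single multiplier circuit 58 in the described device.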

The speech and noise thresholds (SX1 "speech" and SX1 "noise", SR1 "speech" and SR1 "noise") are calculated, according to the relations established in step 29 of the method, by the circuits shown in FIGS. 10(A) and 10(B). The SX1 "speech" or SR1 "speech" threshold is calculated by a multiplier circuit 63 connected to an adder circuit 64. Multiplier circuit 63 receives on its first operand input the value Xmoy or Rmoy supplied by register 60 of FIG. 9, and has a second operand input to which the parameter a is applied. The result of the multiplication is applied to the first operand input of adder circuit 64, where it is added to the lower limit Xph,inf or R,inf applied to its second operand input. The output of adder circuit 64 supplies the SX1 "speech" or SR1 "speech" threshold.

Similarly, the SX1 "noise" threshold, the SR1 "noise" threshold, or both, are calculated by a multiplier circuit 65 and an adder circuit 66. The first operand input of multiplier circuit 65 receives the parameters Xmoy and Rmoy supplied by register 60 of FIG. 9. Multiplier circuit 65 also has a second operand input to which the parameter b is applied. Its output is connected to the first operand input of adder 66, whose second operand input receives the value of the lower limit Xph,inf. The output of adder circuit 66 supplies the thresholds SX1 "noise" and SR1 "noise".

These thresholds allow X1 and R1 to be compared in accordance with steps 30 to 40 of the method and the graphs shown in FIGS. 11(A) and 11(B). The corresponding comparison device is shown in FIG. 12. This circuit has a set of four comparator circuits, numbered 67 to 70, each coupled to one of the four inputs of a speech/noise discriminator 71. Comparator circuit 67 compares the parameter X1 with the SX1 "speech" threshold, comparator 68 compares X1 with the SX1 "noise" threshold, comparator 69 compares the parameter R1 with the SR1 "speech" threshold, and comparator 70 compares R1 with the SR1 "noise" threshold.

The speech/noise discriminator 71 produces the voice activity signal DAV according to the state diagram shown in FIG. 13. This state diagram has two stable states, DAV0 and DAV1, and unstable states denoted by the labels L1 to R4. The stable state DAV0 is the "noise" state, in which the voice activity detector is placed when there is no speech signal.

The stable state DAV1 is the state in which the voice activity detector is placed when the signal applied to its input contains a speech signal. When the detector is in the "noise" state DAV0, it goes to the speech state DAV1, passing through the unstable state L1, only if one of the two parameters X1 and R1 is greater than the corresponding speech threshold SX1 "speech" or SR1 "speech".

Otherwise, that is, if the parameter X1 is smaller than the SX1 "speech" threshold and the parameter R1 is smaller than the SR1 "speech" threshold, the noise decision is maintained.

In contrast, when the voice activity detector is in the speech state DAV1, it can go to the noise state DAV0 only if the parameters X1 and R1 are below the corresponding noise thresholds, that is, if X1 is smaller than the SX1 "noise" threshold and R1 is smaller than the SR1 "noise" threshold. Under these conditions, the detector passes through the unstable state L2. This algorithm for the state changes of the signal DAV is represented in steps 30 to 39 of FIG. 4. After each change in the state of the signal DAV, and after the initialization phase represented by step 40, the method returns to the operation of step 6 of FIG. 1.
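The two-state decision can be sketched as a small transition function. This is an illustration rather than the patented circuit: the function name and dictionary keys are hypothetical, the unstable states are not modelled, and the return to the noise state is assumed to require both parameters to be below their noise thresholds, as the "that is" clause of the text states.

```python
def next_state(state, x1, r1, th):
    """One decision of the speech/noise discriminator (FIG. 13).

    state is "DAV0" (noise) or "DAV1" (speech); th maps hypothetical
    threshold names to their current values.
    """
    if state == "DAV0":
        # Either parameter above its speech threshold switches to speech.
        if x1 > th["sx_speech"] or r1 > th["sr_speech"]:
            return "DAV1"
        return "DAV0"
    # In the speech state, both parameters must drop below the noise thresholds.
    if x1 < th["sx_noise"] and r1 < th["sr_noise"]:
        return "DAV0"
    return "DAV1"


# Placeholder thresholds, for illustration only.
th = {"sx_speech": 19.0, "sx_noise": 13.5, "sr_speech": 1.0, "sr_noise": 0.7}
```

With these placeholder values, an energy of 15 lies between the noise and speech thresholds: it does not trigger speech from DAV0, and it does not release speech from DAV1, which is the hysteresis the two threshold families provide.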

However, as shown in steps 41 and 42 of the diagram of FIG. 4, the change to the noise state DAV0 takes effect only at the end of a period measured by a timing counter (not shown). Whenever the "speech" state DAV1 is decided, this counter is loaded with its maximum count value in steps 35 and 39, and its content is decremented by one unit whenever a DAV0 decision occurs in step 36. This avoids going systematically into the "noise" state during gaps in the speaker's speech, and avoids cutting off the end of a word whose energy is low.
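The timing counter acts as a hangover on the speech decision. A minimal sketch under stated assumptions: the class name and the maximum count are hypothetical (the source gives no value), and the output is taken to fall back to DAV0 once the counter reaches zero.

```python
class Hangover:
    """Delay the switch to the noise state by a fixed number of frames."""

    def __init__(self, max_count=10):   # hypothetical value, not from the source
        self.max_count = max_count
        self.count = 0
        self.output = "DAV0"

    def update(self, decision):
        """Feed one raw per-frame decision and return the held output."""
        if decision == "DAV1":
            self.count = self.max_count  # reload on every speech decision
            self.output = "DAV1"
        else:
            if self.count > 0:
                self.count -= 1          # count down on each noise decision
            if self.count == 0:
                self.output = "DAV0"     # noise takes effect only at expiry
        return self.output


hold = Hangover(max_count=3)
outputs = [hold.update(d) for d in ["DAV1", "DAV0", "DAV0", "DAV0", "DAV0"]]
```

In this run the output stays at DAV1 for two frames after the last speech decision and drops to DAV0 on the third noise frame, so short pauses shorter than the counter length do not interrupt the speech state.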

Embodiments of the method according to the invention are not limited to the devices just described. It is quite clear that the method can be implemented equally well by a structure made up of computing means with a recorded microprogram, held for example in a read-only memory (ROM).

[Brief description of the drawings]

FIGS. 1, 2, 3 and 4 are flowcharts illustrating the different steps of the method implemented by the invention.
FIG. 5 shows a device for calculating the energy ratio, implementing steps 1 to 5 of the method according to the invention.
FIG. 6 shows an embodiment of a device for calculating the value of the sample with the maximum energy in one frame of the filtered or preamplified signal of FIG. 5.
FIG. 7 shows an embodiment of a device for carrying out steps 6 to 11 of FIG. 1.
FIGS. 8(A) and 8(B) are two graphs showing the method used to determine the thresholds, as represented in steps 12 to 22 of FIG. 2.
FIG. 9 shows an embodiment of a device for calculating the average values Xmoy and Rmoy described in steps 12 to 22 of FIG. 2.
FIGS. 10(A) and 10(B) show two circuits for calculating the thresholds according to the invention.
FIGS. 11(A) and 11(B) are two graphs illustrating the mode of comparison with adaptive thresholds according to the invention.
FIG. 12 shows an embodiment of a comparison device for carrying out steps 30 to 40 of FIG. 4.
FIG. 13 is a state diagram showing the decision algorithm used to determine whether or not a speech signal is present in the signal.

Claims

[Claims]

(1) A method for detecting a speech signal in a signal buried in noise, comprising the following steps:
(a) cutting the signal into several frames;
(b) sampling each frame to obtain a digital signal containing a determined number n of samples;
(c) preamplifying the digital signal to obtain a preamplified digital signal;
(d) filtering the preamplified digital signal with a high-pass digital filter to obtain a filtered digital signal;
(e) measuring, in each frame, the maximum energy of the samples of the preamplified signal and the maximum energy of the samples of the filtered digital signal;
(f) taking the energy ratio between the maximum energy of the samples of the filtered digital signal and the maximum energy of the samples of the preamplified digital signal;
(g) calculating, between two limits, long-term average values of the energy of the samples of the filtered signal and of the energy ratio;
(h) calculating four thresholds from the long-term average values, two of which are maximum thresholds forming the lower limits of the speech state for the filtered signal and the energy ratio respectively, and two of which are minimum thresholds forming the upper limits of the noise state for the filtered signal and the energy ratio respectively, and comparing the maximum energy of the filtered signal and the energy ratio with these thresholds;
(i) deciding that a speech signal is present in the noise-affected signal when the maximum energy of the filtered digital signal, or the energy ratio, is greater than its respective maximum threshold;
(j) and deciding that a speech signal is absent from the noise-affected signal when the maximum energy of the filtered digital signal, or the energy ratio R, is smaller than its respective minimum threshold.

2. The method of claim 1, wherein the digital signal is preamplified by a high-pass digital filter whose z-transform transfer function is $H(z) = 1 - 0.86\,z^{-1}$.

3. The method of claim 2, wherein the high-pass digital filter has a cutoff frequency of about 1200 Hz.

4. The method of claim 3, wherein the measurement of maximum energy in each frame occurs for the sample of maximum amplitude.

(5) The method of claim 4, wherein the long-term average value Xmoy of the maximum energy of the filtered signal is calculated in each current frame by applying a recurrence relation of the following form:

if the parameter X2 is greater than the parameter Xmoy,old,

$$X_{moy} = T_m \cdot X_{moy,old} + (1 - T_m) \cdot X_2$$

or, if the parameter X2 is smaller than the parameter Xmoy,old,

$$X_{moy} = T_d \cdot X_{moy,old} + (1 - T_d) \cdot X_2$$

where X2 is equal to the value Xph of the maximum-energy sample in each frame, bounded between two thresholds Xph,sup and Xph,inf, Xmoy,old is the long-term average value calculated in the immediately preceding frame, Tm and Td are time constants, and Tm is a larger time constant than Td.

(6) The long-term average value Rmoy of the energy ratio is calculated in each current frame by applying a recurrence relation of the following form:

if the parameter R2 is greater than the parameter Rmoy,old,

$$R_{moy} = T_m \cdot R_{moy,old} + (1 - T_m) \cdot R_2$$

or, if the parameter R2 is smaller than the parameter Rmoy,old,

$$R_{moy} = T_d \cdot R_{moy,old} + (1 - T_d) \cdot R_2$$

where Rmoy,old is the long-term average energy ratio calculated in the preceding frame.

(7) The method of claim 6, wherein the four thresholds are calculated by applying the following relations:

$$SX_{1,\mathrm{speech}} = a \cdot X_{moy} + X_{ph,inf}$$

$$SX_{1,\mathrm{noise}} = b \cdot X_{moy} + X_{ph,inf}$$

$$SR_{1,\mathrm{speech}} = a \cdot R_{moy} + R_{inf}$$

$$SR_{1,\mathrm{noise}} = b \cdot R_{moy} + R_{inf}$$

where a and b are constants.

(8) The method according to claim 7, wherein a = 1.8 and b = 1.25.

(9) Apparatus for detecting a speech signal in a signal corrupted by noise, comprising the following means:
(a) first means for calculating, in each frame, the ratio between the maximum energy of the preamplified signal and the maximum energy of the filtered digital signal;
(b) second means for calculating long-term average values of the maximum energy of the filtered signal and of the energy ratio;
(c) third means, coupled to the second means, for calculating maximum and minimum adaptive thresholds for the filtered digital signal and the energy ratio;
(d) determining means, coupled to the third means, for determining whether a speech signal is present in or absent from the digital signal.

10. The apparatus of claim 9, wherein each of said first, second, third, and fourth means is formed by microprogrammed computing means.

11. The apparatus of claim 10, wherein said microprogrammed calculation means are formed by a signal processor.