CN110472251A

CN110472251A - Translation model training method, sentence translation method, equipment and storage medium

Info

Publication number: CN110472251A
Application number: CN201810445783.2A
Authority: CN
Inventors: 程勇; 涂兆鹏; 孟凡东; 翟俊杰; 刘洋
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2018-05-10
Filing date: 2018-05-10
Publication date: 2019-11-19
Anticipated expiration: 2038-05-10
Also published as: JP7179273B2; EP3792789A4; EP3792789A1; US20200364412A1; WO2019214365A1; CN110472251B; JP2021515322A; US11900069B2

Abstract

The present application discloses a method for training a translation model, including: obtaining a training sample set that includes a plurality of training samples; determining a perturbed sample set corresponding to each training sample in the training sample set, where the perturbed sample set includes at least one perturbed sample and the semantic similarity between each perturbed sample and its corresponding training sample is higher than a first preset value; and training an initial translation model using the plurality of training samples and the perturbed sample set corresponding to each training sample, to obtain a target translation model. Because the solution provided by the embodiments of the present application introduces perturbed samples during model training, the robustness and translation quality of machine translation can be improved.

Description

Translation model training method, sentence translation method, equipment and storage medium

Technical Field

The present application relates to the field of computer technology, and in particular to a translation model training method, a sentence translation method, a computer device, a terminal device, and a computer-readable storage medium.

Background

With the development of artificial intelligence, machine translation has come into wide use. Applications such as simultaneous interpretation and chat content translation rely on machine translation to convert input in one language into output in another.

Neural machine translation is a machine translation approach based entirely on neural networks. It has reached a good level of translation quality on many language pairs and is widely used in machine translation products. However, because a neural machine translation model is a single complete neural network, the global nature of its modeling makes each output on the target side depend on every word of the source-side input, so the model is overly sensitive to small perturbations in the input. For example, in Chinese-to-English translation, when the user inputs "他们不怕困难做出围棋AI", the machine translation model gives the English translation "They are not afraid of difficulties to make Go AI". When the user inputs the similar sentence "他们不畏困难做出围棋AI", however, the output becomes "They are not afraid to make Go AI". The user only replaced one word with a synonym ("不畏" for "不怕", both meaning "not afraid of"), yet the translation result changed drastically.

It can be seen that the stability, that is, the robustness, of current neural machine translation is relatively poor.

Summary of the Invention

The embodiments of the present application provide a translation model training method that can improve the robustness and translation quality of machine translation. The embodiments of the present application also provide a corresponding sentence translation method, computer device, terminal device, and computer-readable storage medium.

A first aspect of the present application provides a translation model training method, including:

obtaining a training sample set, the training sample set including a plurality of training samples;

determining a perturbed sample set corresponding to each training sample in the training sample set, the perturbed sample set including at least one perturbed sample, where the semantic similarity between the perturbed sample and the corresponding training sample is higher than a first preset value; and

training an initial translation model using the plurality of training samples and the perturbed sample set corresponding to each training sample, to obtain a target translation model.

A second aspect of the present application provides a sentence translation method, including:

receiving a first sentence to be translated, expressed in a first language;

translating the first sentence to be translated using a target translation model to obtain a translation result sentence expressed in a second language, where the target translation model is trained using a plurality of training samples and a perturbed sample set corresponding to each of the plurality of training samples, the perturbed sample set includes at least one perturbed sample, and the semantic similarity between the perturbed sample and the corresponding training sample is higher than a first preset value; and

outputting the translation result sentence expressed in the second language.

A third aspect of the present application provides a translation model training apparatus, including:

an acquisition unit, configured to obtain a training sample set, the training sample set including a plurality of training samples;

a determination unit, configured to determine a perturbed sample set corresponding to each training sample in the training sample set obtained by the acquisition unit, the perturbed sample set including at least one perturbed sample, where the semantic similarity between the perturbed sample and the corresponding training sample is higher than a first preset value; and

a model training unit, configured to train an initial translation model using the plurality of training samples obtained by the acquisition unit and the perturbed sample set corresponding to each training sample determined by the determination unit, to obtain a target translation model.

A fourth aspect of the present application provides a sentence translation apparatus, including:

a receiving unit, configured to receive a first sentence to be translated, expressed in a first language;

a translation unit, configured to translate the first sentence to be translated received by the receiving unit using a target translation model, to obtain a translation result sentence expressed in a second language, where the target translation model is trained using a plurality of training samples and a perturbed sample set corresponding to each of the plurality of training samples, the perturbed sample set includes at least one perturbed sample, and the semantic similarity between the perturbed sample and the corresponding training sample is higher than a first preset value; and

an output unit, configured to output the translation result sentence expressed in the second language produced by the translation unit.

A fifth aspect of the present application provides a computer device, the computer device including an input/output (I/O) interface, a processor, and a memory, where program instructions are stored in the memory;

the processor is configured to execute the program instructions stored in the memory to perform the method described in the first aspect.

A sixth aspect of the present application provides a terminal device, the terminal device including an input/output (I/O) interface, a processor, and a memory, where program instructions are stored in the memory;

the processor is configured to execute the program instructions stored in the memory to perform the method described in the second aspect.

A seventh aspect of the present application provides a computer-readable storage medium including instructions which, when run on a computer device, cause the computer device to perform the method described in the first aspect or the method described in the second aspect.

Yet another aspect of the present application provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the method described in the first aspect or the second aspect.

The embodiments of the present application use perturbed samples during translation model training. The semantic similarity between a perturbed sample and its training sample is higher than the first preset value, that is, the two are semantically very close, so the trained target translation model can still translate correctly when it receives a noisy sentence. This improves both the robustness and the translation quality of machine translation.

Description of Drawings

FIG. 1 is a schematic diagram of an embodiment of a translation model training system in the embodiments of the present application;

FIG. 2 is a schematic diagram of an embodiment of the translation model training method in the embodiments of the present application;

FIG. 3 is a schematic diagram of the architecture of the initial translation model in the embodiments of the present application;

FIG. 4 is a schematic diagram of an embodiment of the sentence translation method in the embodiments of the present application;

FIG. 5 is a schematic diagram of an application scenario of sentence translation in the embodiments of the present application;

FIG. 6 is a schematic diagram of another application scenario of sentence translation in the embodiments of the present application;

FIG. 7 is a schematic diagram of another application scenario of sentence translation in the embodiments of the present application;

FIG. 8 is a schematic diagram of another application scenario of sentence translation in the embodiments of the present application;

FIG. 9 is a schematic diagram of an embodiment of the translation model training apparatus in the embodiments of the present application;

FIG. 10 is a schematic diagram of an embodiment of the sentence translation apparatus in the embodiments of the present application;

FIG. 11 is a schematic diagram of an embodiment of the computer device in the embodiments of the present application;

FIG. 12 is a schematic diagram of an embodiment of the terminal device in the embodiments of the present application.

Detailed Description

The embodiments of the present application are described below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. A person of ordinary skill in the art will appreciate that, as technology develops and new scenarios emerge, the technical solutions provided in the embodiments of the present application are equally applicable to similar technical problems.

The embodiments of the present application provide a translation model training method that can improve the robustness and translation quality of machine translation. The embodiments of the present application also provide a corresponding sentence translation method, computer device, terminal device, and computer-readable storage medium. Each is described in detail below.

With the development of artificial intelligence, the accuracy of machine translation keeps increasing, which greatly benefits users. Machine translation is used, for example, in simultaneous interpretation and text translation. Machine translation is usually model-based: a translation model is trained in advance, and the trained model receives a sentence in one language and converts it into output in another language. Current neural machine translation is a machine translation model based entirely on neural networks. Its translation accuracy is high, but its resistance to noise is poor: a slight perturbation in the input sentence can make the output sentence inaccurate. The embodiments of the present application therefore provide a translation model training method in which various perturbed samples are introduced alongside the training samples during training, so that the trained translation model can still translate correctly when it receives a perturbed sentence.

It should be noted that, in the embodiments of the present application, perturbation includes noise.

The translation model training process in the embodiments of the present application is described below with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of an embodiment of a translation model training system in the embodiments of the present application.

As shown in FIG. 1, an embodiment of the translation model training system includes a computer device 10 and a database 20, and training samples are stored in the database 20.

The computer device 10 obtains a training sample set from the database 20 and then uses the training sample set to train a translation model, obtaining a target translation model.

The model training process can be understood with reference to the embodiment of the translation model training method shown in FIG. 2.

As shown in FIG. 2, an embodiment of the translation model training method provided by the embodiments of the present application includes:

101. Obtain a training sample set, where the training sample set includes a plurality of training samples.

In the embodiments of the present application, the training samples in the training sample set are samples without perturbation.

102. Determine a perturbed sample set corresponding to each training sample in the training sample set, where the perturbed sample set includes at least one perturbed sample, and the semantic similarity between the perturbed sample and the corresponding training sample is higher than a first preset value.

In the embodiments of the present application, a perturbed sample is a sample that contains perturbation information or noise but whose semantics remain essentially the same as those of the training sample. The perturbation information may be a word with the same meaning but a different form, or any other word that does not significantly change the semantics of the sentence.

The first preset value in the embodiments of the present application may be a specific value, such as 90% or 95%. These values are only examples and do not limit the first preset value, which can be set as required.
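As an illustration of how a first preset value might be applied, the following minimal sketch keeps only candidate sentences whose similarity to the training sample exceeds a threshold. The embodiments do not specify how semantic similarity is measured; cosine similarity between sentence vectors, the function name `filter_by_similarity`, and the toy vectors are all assumptions made for illustration.

```python
import numpy as np

def filter_by_similarity(x_vec, candidates, threshold=0.9):
    """Keep the names of candidates whose cosine similarity to x_vec exceeds threshold.

    candidates is a list of (name, vector) pairs. Cosine similarity is an
    assumed stand-in for the semantic similarity measure.
    """
    kept = []
    for name, vec in candidates:
        sim = float(np.dot(x_vec, vec) / (np.linalg.norm(x_vec) * np.linalg.norm(vec)))
        if sim > threshold:
            kept.append(name)
    return kept

# Hypothetical sentence vectors: one close to the training sample, one far from it
x_vec = np.array([1.0, 0.0])
candidates = [("close", np.array([0.99, 0.05])), ("far", np.array([0.0, 1.0]))]
kept = filter_by_similarity(x_vec, candidates, threshold=0.9)
```

With a threshold of 0.9, only the candidate that points in nearly the same direction as the training sample vector is retained.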

The relationship between a training sample and a perturbed sample can be understood from the following example:

Training sample: "他们不怕困难做出围棋AI".

Perturbed sample: "他们不畏困难做出围棋AI".

As this example shows, the training sample and the perturbed sample are semantically very close; the only difference is that the original word "不怕" has been replaced with a different word, "不畏" (both mean "not afraid of").

103. Train an initial translation model using the plurality of training samples and the perturbed sample set corresponding to each training sample, to obtain a target translation model.

During model training, each training sample is used together with its corresponding perturbed samples.

Optionally, in another embodiment of the translation model training method provided by the embodiments of the present application,

each training sample is a training sample pair, and the training sample pair includes a training input sample and a training output sample;

correspondingly, determining the perturbed sample set corresponding to each training sample may include:

determining a perturbed input sample set corresponding to each training input sample, and a perturbed output sample corresponding to the perturbed input sample set, where the perturbed input sample set includes at least one perturbed input sample, and the perturbed output sample is the same as the training output sample;

correspondingly, training the initial translation model using the plurality of training samples and the perturbed sample set corresponding to each training sample, to obtain the target translation model, may include:

training the initial translation model using a plurality of training sample pairs, the perturbed input sample set corresponding to each training input sample, and the perturbed output sample corresponding to the perturbed input sample set, to obtain the target translation model.

In the embodiments of the present application, the training input sample is in a first language and the training output sample is in a second language. The first language and the second language are different. In the embodiments of the present application, Chinese is used as the example first language and English as the example second language, but Chinese and English should not be understood as limiting the translation model. The translation model in the embodiments of the present application is applicable to translation between any two different languages: as long as training samples in the two languages are used during training, translation between them can be achieved.

In the embodiments of the present application, each training input sample may have multiple perturbed input samples, but the perturbed output sample corresponding to each perturbed input sample is the same as the training output sample.

The correspondence among training input samples, training output samples, perturbed input samples, and perturbed output samples can be understood with reference to Table 1.

Table 1

As Table 1 shows, when the training input sample is x, the training output sample is y. There are multiple perturbed input samples corresponding to x, namely x′_1, x′_2, x′_3, and so on, and the perturbed output sample corresponding to each of them is y. This ensures that the trained target translation model outputs the translation result y whether the input is x or x′_1, x′_2, or x′_3, which further guarantees the robustness and translation quality of the target translation model.

Table 1 is only an example; a training input sample may have fewer or more perturbed input samples than are listed in Table 1.
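The correspondence described for Table 1 can be sketched in a few lines of Python. This is only an illustration: the function names `build_perturbed_pairs` and `toy_perturb` and the one-entry synonym table are hypothetical, and any perturbation strategy could be plugged in instead.

```python
from typing import Callable, List, Tuple

def build_perturbed_pairs(
    pairs: List[Tuple[str, str]],
    perturb: Callable[[str], List[str]],
) -> List[Tuple[str, str]]:
    """Return (x', y) pairs: every perturbed input x' keeps the original output y."""
    perturbed = []
    for x, y in pairs:
        for x_prime in perturb(x):
            perturbed.append((x_prime, y))
    return perturbed

def toy_perturb(sentence: str) -> List[str]:
    # Hypothetical rule for illustration: swap one word for a near-synonym
    synonyms = {"不怕": ["不畏"]}
    out = []
    for word, substitutes in synonyms.items():
        if word in sentence:
            out.extend(sentence.replace(word, s, 1) for s in substitutes)
    return out

pairs = [("他们不怕困难做出围棋AI", "They are not afraid of difficulties to make Go AI")]
perturbed_pairs = build_perturbed_pairs(pairs, toy_perturb)
```

Both the original pair (x, y) and each derived pair (x′, y) would then be fed to training, so that the model is pushed toward the same output for all of them.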

Perturbed input samples have been introduced above; how they are generated is described below.

One way of generating perturbed input samples is as follows.

Determining the perturbed input sample set corresponding to each training input sample in the training sample set may include:

determining a first word in each training input sample, the first word being a word to be replaced; and

replacing the first word with each of at least one second word, to obtain the perturbed input sample set, where the semantic similarity between the second word and the first word is higher than a second preset value.

In this embodiment, perturbed sentences are generated at the word level. Given an input sentence, the first words to be modified are sampled and their positions are determined; the first words at those positions are then replaced with second words from the vocabulary.

The vocabulary contains many words; the selection of the second word can be understood with reference to the following formula.

Here E[x_i] is the word vector of the first word x_i, and cos(E[x_i], E[x]) measures the similarity between the first word x_i and a second word x. Because word vectors capture the semantic information of words, this replacement method can replace the first word x_i in the current sentence with a second word x that carries similar semantic information.
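The selection formula itself is not reproduced in this text. One form consistent with the definitions given here, assumed for illustration, is a softmax over cosine similarities across the vocabulary V_x excluding x_i:

$$P(x \mid x_i) = \frac{\exp\{\cos(E[x_i], E[x])\}}{\sum_{\tilde{x} \in V_x \setminus x_i} \exp\{\cos(E[x_i], E[\tilde{x}])\}}$$

The following minimal sketch samples replacement words under that assumed distribution. The function names, the two-dimensional toy embeddings, and the choice of position to replace are all hypothetical.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def replacement_distribution(word, vocab, emb):
    """Assumed softmax over cos(E[x_i], E[x]) for every vocabulary word x other than x_i."""
    candidates = [w for w in vocab if w != word]
    sims = np.array([cosine(emb[word], emb[w]) for w in candidates])
    probs = np.exp(sims) / np.exp(sims).sum()
    return candidates, probs

def perturb_sentence(tokens, positions, vocab, emb, rng):
    """Replace the first words at the given positions with sampled second words."""
    out = list(tokens)
    for i in positions:
        candidates, probs = replacement_distribution(tokens[i], vocab, emb)
        out[i] = candidates[rng.choice(len(candidates), p=probs)]
    return out

# Toy embeddings (hypothetical values): the first two words are near-synonyms
emb = {
    "不怕": np.array([1.0, 0.1]),
    "不畏": np.array([0.9, 0.2]),
    "围棋": np.array([-0.2, 1.0]),
}
vocab = list(emb)
rng = np.random.default_rng(0)
candidates, probs = replacement_distribution("不怕", vocab, emb)
new_tokens = perturb_sentence(["他们", "不怕", "困难"], [1], vocab, emb, rng)
```

Under this distribution, words whose vectors point in a similar direction to E[x_i] are more likely to be chosen as the second word, which matches the stated intent of substituting semantically close words.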

Another way of generating perturbed input samples is as follows:

determining the word vector of each word in each training input sample; and

superimposing a different Gaussian noise vector on the word vector of each word each time, to obtain the perturbed sample set.

In this embodiment, perturbed sentences are generated at the feature level. Given a sentence, the vector of each word in the sentence is obtained, and Gaussian noise is then added to the word vector of each word to simulate possible kinds of perturbation. This can be understood with reference to the following formula:

$$E[x'_i] = E[x_i] + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \delta^2 I)$$

In this formula, E[x_i] denotes the word vector of the word x_i, E[x′_i] is the word vector after Gaussian noise is added, the vector ε is sampled from Gaussian noise with variance δ², and δ is a hyperparameter.
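The feature-level perturbation E[x′_i] = E[x_i] + ε can be sketched with NumPy as follows. The function names, the array shapes, the value of δ, and the number of perturbed copies are illustrative assumptions.

```python
import numpy as np

def add_gaussian_noise(embeddings, delta, rng):
    """Return E[x'_i] = E[x_i] + eps with eps ~ N(0, delta^2 I), drawn independently per word."""
    eps = rng.normal(loc=0.0, scale=delta, size=embeddings.shape)
    return embeddings + eps

def perturbed_set(embeddings, delta, k, seed=0):
    """Build k perturbed copies of a sentence, each with a different noise draw."""
    rng = np.random.default_rng(seed)
    return [add_gaussian_noise(embeddings, delta, rng) for _ in range(k)]

# Hypothetical sentence of 5 words with 8-dimensional word vectors
sentence_emb = np.zeros((5, 8))
samples = perturbed_set(sentence_emb, delta=0.01, k=3)
```

Note that `scale` in `rng.normal` is the standard deviation, so passing δ gives noise with variance δ². A small δ keeps the perturbed vectors close to the originals, which is what keeps the semantics nearly unchanged.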

This technical solution is general: any strategy for adding perturbations to the input can be freely defined.

The generation of perturbed input samples has been described above; the architecture of the translation model in the embodiments of the present application is described below.

As shown in FIG. 3, the initial translation model provided by the embodiments of the present application includes an encoder, a classifier, and a decoder.

The encoder is configured to receive a training input sample and the corresponding perturbed input sample, and to output a first intermediate representation result and a second intermediate representation result. The first intermediate representation result is the intermediate representation of the training input sample, and the second intermediate representation result is the intermediate representation of the corresponding perturbed input sample.

The classifier is configured to distinguish the first intermediate representation result from the second intermediate representation result.

The decoder is configured to output the training output sample from the first intermediate representation result, and to output the same training output sample from the second intermediate representation result.

The model objective function of the initial translation model includes a classification objective function related to the classifier and the encoder, and a training objective function and a perturbation objective function related to the encoder and the decoder;

where the classification objective function includes the training input sample, the corresponding perturbed input sample, the parameters of the encoder, and the parameters of the classifier;

the training objective function includes the training input sample, the training output sample, the parameters of the encoder, and the parameters of the decoder; and

the perturbation objective function includes the perturbed input sample, the training output sample, the parameters of the encoder, and the parameters of the decoder.

In the embodiments of the present application, the training input sample is denoted x, the corresponding perturbed input sample x′, and both the training output sample and the perturbed output sample y. The first intermediate representation result is denoted H_x and the second H_x′. The classification objective function is denoted L_inv(x, x′), the training objective function L_true(x, y), and the perturbation objective function L_noisy(x′, y).

The initial translation model in the embodiments of the present application may be a neural machine translation model.

The goal of training the initial translation model is to make its translation behavior essentially the same for x and x′. The encoder converts the first-language sentence x into H_x, and the decoder takes H_x as input and outputs the target-language sentence y. The training goal of the embodiments of the present application is to train a perturbation-invariant encoder and decoder.

Because x′ is a small change to x, the two carry similar semantic information. Given an input pair (x, x′), the training goals during translation model training are: (1) the encoded representation H_x should be as close as possible to H_x′; and (2) given H_x′, the decoder should output the same y. The embodiments of the present application therefore introduce two training objectives to strengthen the robustness of the encoder and the decoder:

L_inv(x, x′) is introduced to encourage the encoder to output similar representations for x and x′, yielding a perturbation-invariant encoder; this objective is achieved through adversarial learning.

L_noisy(x′, y) is introduced to guide the decoder to produce the target-language sentence y for an input x′ that contains a perturbation.

These two newly introduced training objectives make the neural machine translation model robust, protecting it from drastic changes in the output space caused by small perturbations in the input. In addition, the training objective L_true(x, y) on the original data x and y is included so that translation quality is improved along with the robustness of the neural machine translation model.

Therefore, the model objective function of the initial translation model is:

where θ_enc denotes the parameters of the encoder, θ_dec the parameters of the decoder, and θ_dis the parameters of the classifier. α and β control the balance between the original translation task and the stability of the machine translation model.
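The objective formula itself is not reproduced in this text. One form consistent with the three objective functions and the weights α and β described here is sketched below. The assignment of α to L_inv and β to L_noisy, and the use of S for the set of training pairs and N(x) for the set of perturbed inputs of x (symbols that also appear in the adversarial learning objective), are assumptions rather than the confirmed formula of this application.

$$J(\theta) = \sum_{(x, y) \in S} \Big[ L_{true}(x, y; \theta_{enc}, \theta_{dec}) + \alpha \sum_{x' \in N(x)} L_{inv}(x, x'; \theta_{enc}, \theta_{dis}) + \beta \sum_{x' \in N(x)} L_{noisy}(x', y; \theta_{enc}, \theta_{dec}) \Big]$$

Under this reading, setting α = β = 0 recovers ordinary training on the original data alone, while larger values place more weight on stability under perturbation.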

The goal of the perturbation-invariant encoder is that, when a correct sentence x and its corresponding perturbed sentence x′ are input, the representations the encoder produces for the two sentences are indistinguishable, which directly helps the decoder generate robust output. In the embodiments of the present application, the encoder can serve as a generator G, which defines the process of generating the sequence of hidden representations H_x. A classifier D is also introduced to distinguish the representation H_x of the original input from the representation H_x′ of the perturbed input. The generator G aims to produce similar representations for x and x′ so that the classifier D cannot tell them apart, whereas the classifier D tries its best to distinguish them.

形式上，对抗性学习目标定义为：Formally, the adversarial learning objective is defined as:

$$L_{inv}(x, x'; \theta_{enc}, \theta_{dis}) = \mathbb{E}_{x \sim S}\big[-\log D(G(x))\big] + \mathbb{E}_{x' \sim N(x)}\big[-\log\big(1 - D(G(x'))\big)\big]$$

给定一个输入，分类器会输出一个分类值，其目标是最大化正确语句x的分类值，同时最小化扰动语句x′的分类值。Given an input, a classifier outputs a classification value whose goal is to maximize the classification value of the correct sentence x while minimizing the classification value of the perturbed sentence x′.
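As a minimal sketch of how the adversarial objective above could be estimated from a batch, the following Python function averages the two expectation terms over classifier outputs. The function name `inv_loss` and the sample values are illustrative assumptions, not part of the application.

```python
import math

def inv_loss(d_clean, d_perturbed):
    """Batch estimate of L_inv from classifier outputs in the open interval (0, 1).

    d_clean:     values D(G(x)) for original sentences x
    d_perturbed: values D(G(x')) for perturbed sentences x'
    """
    clean_term = sum(-math.log(d) for d in d_clean) / len(d_clean)
    noisy_term = sum(-math.log(1.0 - d) for d in d_perturbed) / len(d_perturbed)
    return clean_term + noisy_term

# Hypothetical classifier outputs: one classifier separates the inputs well,
# the other outputs 0.5 everywhere and so cannot tell them apart.
confident = inv_loss([0.9, 0.95], [0.1, 0.05])
confused = inv_loss([0.5, 0.5], [0.5, 0.5])
```

A classifier that separates the two kinds of input scores a lower loss than one that outputs 0.5 for everything; the latter case is the one the encoder tries to bring about.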

Stochastic gradient descent is used to optimize the model objective function J(θ). Forward propagation uses, in addition to a batch of data containing x and y, a batch of data containing x′ and y. The value of J(θ) is computed from these two batches, and the gradients of J(θ) with respect to the model parameters are then computed and used to update the model parameters. Because the goal of L_inv is to maximize the classification value of the correct sentence x while minimizing that of the perturbed sentence x′, the gradient of L_inv with respect to the parameter set θ_enc is multiplied by -1, while the other gradients are propagated normally. In this way, the values of θ_enc, θ_dec and θ_dis in the initial training model can be computed, and a target translation model with anti-noise capability is trained.

That is, in the embodiments of the present application, training the initial translation model using the plurality of training sample pairs, the perturbed input sample set corresponding to each training input sample, and the perturbed output sample corresponding to the perturbed output sample set, to obtain the target translation model, includes:

将每个训练输入样本、对应的扰动输入样本以及对应的训练输出样本输入所述模型目标函数；inputting each training input sample, corresponding perturbed input sample, and corresponding training output sample into said model objective function;

optimizing the model objective function by gradient descent to determine the values of the parameters of the encoder, the parameters of the decoder and the parameters of the classifier, wherein the gradient of the classification objective function with respect to the parameters of the encoder is multiplied by -1.
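The sign flip on the encoder gradient can be illustrated with a deliberately tiny model. The sketch below assumes a scalar encoder G(x) = w·x and a scalar classifier D(h) = sigmoid(v·h); these toy forms, the function names and the numeric values are all assumptions made for illustration and are not the networks of the application.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def inv_loss(w, v, x, x_pert):
    """L_inv for a toy encoder G(x) = w * x and classifier D(h) = sigmoid(v * h)."""
    return -math.log(sigmoid(v * w * x)) - math.log(1.0 - sigmoid(v * w * x_pert))

def inv_grads(w, v, x, x_pert):
    """Analytic gradients of L_inv with respect to w (encoder) and v (classifier)."""
    # With a = v*w*x and a' = v*w*x': d(-log sigmoid(a))/da = sigmoid(a) - 1,
    # and d(-log(1 - sigmoid(a')))/da' = sigmoid(a').
    da = sigmoid(v * w * x) - 1.0
    da_pert = sigmoid(v * w * x_pert)
    grad_w = da * v * x + da_pert * v * x_pert
    grad_v = da * w * x + da_pert * w * x_pert
    return grad_w, grad_v

def adversarial_step(w, v, x, x_pert, lr=0.1):
    """One SGD step: the classifier descends L_inv, the encoder gradient is multiplied by -1."""
    grad_w, grad_v = inv_grads(w, v, x, x_pert)
    new_v = v - lr * grad_v            # classifier: gradient propagated normally
    new_w = w - lr * (-1.0) * grad_w   # encoder: gradient multiplied by -1
    return new_w, new_v

# Hypothetical starting values for the toy parameters and inputs.
w0, v0, x, x_pert = 1.0, 0.5, 1.0, 0.8
w1, v1 = adversarial_step(w0, v0, x, x_pert)
```

In this toy setting the classifier update lowers L_inv while the encoder update raises it, which is the opposing behavior the reversed gradient is meant to produce. In a full model the encoder would also receive the ordinary gradients of the other objectives.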

以上介绍了目标翻译模型的训练过程，下面介绍使用该目标翻译模型进行语句翻译的过程。The above describes the training process of the target translation model, and the following describes the process of using the target translation model for sentence translation.

如图4所示，本申请实施例提供的语句翻译的方法的一实施例包括：As shown in Figure 4, an embodiment of the method for sentence translation provided by the embodiment of the present application includes:

201、接收以第一语言表达的第一待翻译语句。201. Receive a first sentence to be translated expressed in a first language.

本申请实施例中，第一语言可以是目标翻译模型所支持的任意一种类型的语言。In this embodiment of the present application, the first language may be any type of language supported by the target translation model.

202、使用目标翻译模型对所述第一待翻译语句进行翻译，以得到用第二语言表达的翻译结果语句，其中所述目标翻译模型为使用多个训练样本和所述多个训练样本中每个训练样本各自对应的扰动样本集合训练得到的，所述扰动样本集合包括至少一个扰动样本，所述扰动样本与对应训练样本的语义相似度高于第一预设值。202. Translate the first sentence to be translated by using a target translation model to obtain a translation result sentence expressed in a second language, wherein the target translation model uses multiple training samples and each of the multiple training samples Each of the training samples corresponds to a set of perturbed samples, the set of perturbed samples includes at least one perturbed sample, and the semantic similarity between the perturbed sample and the corresponding training sample is higher than the first preset value.

关于目标翻译模型可以参阅前述模型训练过程的实施例进行理解，本处不做过多赘述。The target translation model can be understood by referring to the above-mentioned embodiment of the model training process, and details are not repeated here.

203、输出所述用第二语言表达的翻译结果语句。203. Output the translation result sentence expressed in the second language.

第二语言是与第一语言不同的语言，例如：第一语言为中文，第二语言为英文。The second language is a language different from the first language, for example: the first language is Chinese and the second language is English.

本申请实施例中，因为目标翻译模型具备抗噪能力，所以接收到带有噪声的语句时，也可以正确进行翻译。从而提高了机器翻译的鲁棒性，以及翻译质量。In the embodiment of the present application, because the target translation model has anti-noise capability, it can also translate correctly when receiving a sentence with noise. Thereby improving the robustness of machine translation, as well as the translation quality.

可选地，本申请实施例提供的语句翻译的方法的另一实施例中，还可以包括：Optionally, in another embodiment of the sentence translation method provided in the embodiment of the present application, it may also include:

receiving a second sentence to be translated expressed in the first language, the second sentence to be translated being a perturbed sentence of the first sentence to be translated, and the similarity between the second sentence to be translated and the first sentence to be translated being higher than the first preset value;

使用目标翻译模型对所述第二待翻译语句进行翻译，以得到与所述第一待翻译语句对应的所述翻译结果语句；Translating the second sentence to be translated by using a target translation model to obtain the translation result sentence corresponding to the first sentence to be translated;

outputting the translation result sentence.

本申请实施例中，第一待翻译语句不限于是上述示例中的训练输入样本，可以是上述扰动输入样本中的一个。In this embodiment of the present application, the first sentence to be translated is not limited to the training input sample in the above example, and may be one of the above disturbance input samples.

关于本申请实施例中的语句翻译方案可以参阅下述两个场景示例进行理解。The sentence translation solution in the embodiment of the present application can be understood by referring to the following two scenario examples.

图5中的(A)-(C)为本申请实施例在社交应用中的文本翻译的一场景示例图。(A)-(C) in FIG. 5 is an example diagram of a scene of text translation in a social application according to an embodiment of the present application.

As shown in (A) of FIG. 5, to translate the sentence "他们不怕困难做出围棋AI" in a social application into English, the user long-presses the text, and the page shown in (B) of FIG. 5 appears, containing function boxes such as "Copy", "Forward", "Delete" and "Translate to English". (B) of FIG. 5 is only an example: "Translate to English" could instead be "Translate", followed by a drop-down box for selecting the corresponding translation language. After the user taps "Translate to English" on the page shown in (B) of FIG. 5, the translation result "They are not afraid of difficulties to make Go AI" shown in (C) of FIG. 5 appears.

(A)-(C) in FIG. 6 are diagrams of another example scenario of text translation in a social application according to an embodiment of the present application.

As shown in (A) of FIG. 6, to translate the sentence "他们不畏困难做出围棋AI" in a social application into English, the user long-presses the text, and the page shown in (B) of FIG. 6 appears, containing function boxes such as "Copy", "Forward", "Delete" and "Translate to English". After the user taps "Translate to English" on the page shown in (B) of FIG. 6, the translation result "They are not afraid of difficulties to make Go AI" shown in (C) of FIG. 6 appears.

Comparing the processes and results of (A)-(C) in FIG. 5 and (A)-(C) in FIG. 6 shows that, although the sentence to be translated in (A) of FIG. 5 is "他们不怕困难做出围棋AI" and the sentence to be translated in (A) of FIG. 6 is "他们不畏困难做出围棋AI", these two semantically similar sentences yield the same translation result, "They are not afraid of difficulties to make Go AI", in (C) of FIG. 5 and (C) of FIG. 6. The sentence translation solution provided by the embodiments of the present application is therefore more robust and gives better translation quality.
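The comparison made between FIG. 5 and FIG. 6 amounts to checking that a sentence and its perturbed variant receive the same translation. A minimal sketch of such a check follows. The helper `translations_agree` and the lookup-table stand-in `stub_translate` are hypothetical; the stand-in simply replays the two inputs and the result reported for the figures, and is not a trained model.

```python
from typing import Callable

def translations_agree(translate: Callable[[str], str], sentence: str, perturbed: str) -> bool:
    """Return True when a sentence and its perturbed variant get the same translation."""
    return translate(sentence) == translate(perturbed)

# Stand-in for a trained target translation model: a lookup table holding only
# the two example sentences and the result reported for FIG. 5 and FIG. 6.
_EXAMPLE_OUTPUTS = {
    "他们不怕困难做出围棋AI": "They are not afraid of difficulties to make Go AI",
    "他们不畏困难做出围棋AI": "They are not afraid of difficulties to make Go AI",
}

def stub_translate(sentence: str) -> str:
    return _EXAMPLE_OUTPUTS[sentence]

agree = translations_agree(stub_translate, "他们不怕困难做出围棋AI", "他们不畏困难做出围棋AI")
```

With a real model, `translate` would wrap the model's inference call, and the same check could be run over many sentence pairs.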

图7为本申请实施例的语句翻译在同声传译场景的一应用示意图。FIG. 7 is a schematic diagram of the application of sentence translation in the simultaneous interpretation scene according to the embodiment of the present application.

As shown in FIG. 7, in a simultaneous interpretation scenario, the speaker says "他们不怕困难做出围棋AI" in Chinese, and listeners on the English channel hear "They are not afraid of difficulties to make Go AI".

图8为本申请实施例的语句翻译在同声传译场景的另一应用示意图。FIG. 8 is a schematic diagram of another application of sentence translation in the simultaneous interpretation scene according to the embodiment of the present application.

As shown in FIG. 8, in a simultaneous interpretation scenario, the speaker says "他们不畏困难做出围棋AI" in Chinese, and listeners on the English channel hear "They are not afraid of difficulties to make Go AI".

由图7和图8的示例对比可见，对于语义相似的输入，翻译的结果是相同的，可见，本申请实施例提供的语句翻译方案的鲁棒性更好，翻译质量更好。From the comparison of the examples in Fig. 7 and Fig. 8, it can be seen that for inputs with similar semantics, the translation results are the same. It can be seen that the sentence translation solution provided by the embodiment of the present application has better robustness and better translation quality.

It should be noted that the above two application scenarios are only examples. The solutions of the embodiments of the present application can be used in a variety of translation scenarios, and the form of the terminal device involved is not limited to those shown in FIG. 5 to FIG. 8.

The above embodiments describe the training process of the target translation model and the process of translating sentences with the target translation model. The translation model training apparatus, the sentence translation apparatus, the computer device and the terminal device of the embodiments of the present application are described below with reference to the accompanying drawings.

如图9所示，本申请实施例提供的翻译模型训练的装置30的一实施例包括：As shown in FIG. 9, an embodiment of the translation model training device 30 provided in the embodiment of the present application includes:

获取单元301，用于获取训练样本集合，所述训练样本集合中包括多个训练样本；An acquisition unit 301, configured to acquire a training sample set, which includes a plurality of training samples;

a determining unit 302, configured to determine a perturbed sample set corresponding to each training sample in the training sample set acquired by the acquisition unit 301, the perturbed sample set including at least one perturbed sample, and the semantic similarity between the perturbed sample and the corresponding training sample being higher than a first preset value;

a model training unit 303, configured to train an initial translation model using the plurality of training samples obtained by the acquisition unit 301 and the perturbed sample set, determined by the determining unit 302, corresponding to each training sample, to obtain a target translation model.
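The condition applied by the determining unit 302, that a perturbed sample must have a semantic similarity to its training sample above a first preset value, can be sketched as a simple filter. The sketch assumes that similarity is measured as cosine similarity between sentence vectors; that measure, the function names and the toy 2-d vectors are assumptions for illustration only.

```python
import math
from typing import Dict, List, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def perturbed_sample_set(sample: str, candidates: List[str],
                         vectors: Dict[str, Sequence[float]],
                         first_preset_value: float) -> List[str]:
    """Keep the candidates whose similarity to the training sample exceeds the threshold."""
    return [c for c in candidates
            if cosine(vectors[sample], vectors[c]) > first_preset_value]

# Hypothetical 2-d sentence vectors, for illustration only.
vectors = {"x": [1.0, 0.0], "x_close": [0.9, 0.1], "x_far": [0.0, 1.0]}
kept = perturbed_sample_set("x", ["x_close", "x_far"], vectors, first_preset_value=0.8)
```

Only the candidate whose vector lies close to that of the training sample passes the threshold; the unrelated candidate is dropped from the perturbed sample set.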

Optionally, the determining unit 302 is configured to, when each training sample is a training sample pair including a training input sample and a training output sample, determine a perturbed input sample set corresponding to each training input sample and a perturbed output sample corresponding to the perturbed output sample set, the perturbed input sample set including at least one perturbed input sample, and the perturbed output sample being the same as the training output sample;

the model training unit 303 is configured to train the initial translation model using the plurality of training sample pairs, the perturbed input sample set corresponding to each training input sample, and the perturbed output sample corresponding to the perturbed output sample set, to obtain the target translation model.

可选地，所述确定单元302用于：Optionally, the determining unit 302 is configured to:

可选地，所述初始翻译模型包括编码器、分类器和解码器；Optionally, the initial translation model includes an encoder, a classifier and a decoder;

所述编码器用于接收所述训练输入样本和对应的扰动输入样本，并输出第一中间表示结果和第二中间表示结果，所述第一中间表示结果为所述训练输入样本的中间表示结果，所述第二中间表示结果为所述对应的扰动输入样本的中间表示结果；The encoder is configured to receive the training input sample and the corresponding disturbance input sample, and output a first intermediate representation result and a second intermediate representation result, the first intermediate representation result being an intermediate representation result of the training input sample, The second intermediate representation result is an intermediate representation result of the corresponding perturbed input sample;

所述分类器用于区分所述第一中间表示结果和所述第二中间表示结果；said classifier for distinguishing said first intermediate representation result from said second intermediate representation result;

所述解码器用于根据第一中间表示结果输出所述训练输出样本，根据所述第二中间表示结果输出所述训练输出样本。The decoder is configured to output the training output samples according to the first intermediate representation result, and output the training output samples according to the second intermediate representation result.

可选地，所述初始翻译模型的模型目标函数包括与所述分类器和所述编码器相关的分类目标函数、与所述编码器和所述解码器相关的训练目标函数和扰动目标函数；Optionally, the model objective function of the initial translation model includes a classification objective function related to the classifier and the encoder, a training objective function and a perturbation objective function related to the encoder and the decoder;

可选地，模型训练单元303用于：Optionally, the model training unit 303 is used for:

本申请实施例提供的翻译模型训练的装置30可以参阅上述方法实施例部分的相应内容进行理解，本处不再重复赘述。The translation model training device 30 provided in the embodiment of the present application can be understood by referring to the corresponding content in the above method embodiment, and will not be repeated here.

如图10所示，本申请实施例提供的语句翻译的装置的一实施例包括：As shown in Figure 10, an embodiment of the sentence translation device provided by the embodiment of the present application includes:

接收单元401，用于接收以第一语言表达的第一待翻译语句；a receiving unit 401, configured to receive a first sentence to be translated expressed in a first language;

a translation unit 402, configured to translate, using a target translation model, the first sentence to be translated received by the receiving unit 401, to obtain a translation result sentence expressed in a second language, wherein the target translation model is trained using a plurality of training samples and a perturbed sample set corresponding to each of the plurality of training samples, the perturbed sample set includes at least one perturbed sample, and the semantic similarity between the perturbed sample and the corresponding training sample is higher than a first preset value;

输出单元403，用于输出所述翻译单元402翻译出的用第二语言表达的翻译结果语句。The output unit 403 is configured to output the translation result sentence expressed in the second language translated by the translation unit 402 .

Optionally, the receiving unit 401 is further configured to receive a second sentence to be translated expressed in the first language, the second sentence to be translated being a perturbed sentence of the first sentence to be translated, and the similarity between the second sentence to be translated and the first sentence to be translated being higher than the first preset value;

所述翻译单元402，还用于使用目标翻译模型对所述第二待翻译语句进行翻译，以得到与所述第一待翻译语句对应的所述翻译结果语句；The translation unit 402 is further configured to use a target translation model to translate the second sentence to be translated to obtain the translation result sentence corresponding to the first sentence to be translated;

所述输出单元403，还用于输出所述翻译结果语句。The output unit 403 is further configured to output the translation result sentence.

以上语句翻译的装置40可以参阅方法实施例部分的相应内容进行理解，本处不再重复赘述。The device 40 for sentence translation above can be understood by referring to the corresponding content in the method embodiment, and will not be repeated here.

图11是本申请实施例提供的计算机设备50的结构示意图。所述计算机设备50包括处理器510、存储器540和输入输出(I/O)接口530，存储器540可以包括只读存储器和随机存取存储器，并向处理器510提供操作指令和数据。存储器540的一部分还可以包括非易失性随机存取存储器(NVRAM)。FIG. 11 is a schematic structural diagram of a computer device 50 provided by an embodiment of the present application. The computer device 50 includes a processor 510 , a memory 540 and an input/output (I/O) interface 530 . The memory 540 may include a read-only memory and a random access memory, and provides operation instructions and data to the processor 510 . A portion of memory 540 may also include non-volatile random access memory (NVRAM).

在一些实施方式中，存储器540存储了如下的元素，可执行模块或者数据结构，或者他们的子集，或者他们的扩展集:In some embodiments, the memory 540 stores the following elements, executable modules or data structures, or their subsets, or their extensions:

在本申请实施例中，在地面标志确定的过程中，通过调用存储器540存储的操作指令(该操作指令可存储在操作系统中)，In the embodiment of the present application, during the process of determining the ground marker, by calling the operation instruction stored in the memory 540 (the operation instruction can be stored in the operating system),

The processor 510 controls the operation of the computer device 50 and may also be referred to as a CPU (Central Processing Unit). The memory 540 may include a read-only memory and a random access memory, and provides instructions and data to the processor 510. A portion of the memory 540 may also include a non-volatile random access memory (NVRAM). In a specific application, the components of the computer device 50 are coupled together through a bus system 520, which may include a power bus, a control bus, a status signal bus and the like in addition to a data bus. For clarity of illustration, however, the various buses are all labeled as the bus system 520 in the figure.

The methods disclosed in the foregoing embodiments of the present application may be applied to, or implemented by, the processor 510. The processor 510 may be an integrated circuit chip with signal processing capability. In implementation, each step of the above methods may be completed by an integrated logic circuit of hardware in the processor 510 or by instructions in the form of software. The processor 510 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 540; the processor 510 reads the information in the memory 540 and completes the steps of the above methods in combination with its hardware.

可选地，处理器510用于：Optionally, the processor 510 is used for:

在所述每个训练样本为一个训练样本对，所述训练样本对包括训练输入样本和训练输出样本时，确定每个训练输入样本各自对应的扰动输入样本集合，以及所述扰动输出样本集合对应的扰动输出样本，所述扰动输入样本集合包括至少一个扰动输入样本，所述扰动输出样本与所述训练输出样本相同；When each of the training samples is a pair of training samples, and the pair of training samples includes training input samples and training output samples, determine the set of disturbance input samples corresponding to each training input sample, and the corresponding set of disturbance output samples The perturbed output samples of the perturbed input samples set include at least one perturbed input sample, the perturbed output samples are the same as the training output samples;

可选地，处理器510用于：Optionally, the processor 510 is used for:

上对计算机设备50的描述可以参阅图1至图3部分的描述进行理解，本处不再重复赘述。The above description of the computer device 50 can be understood by referring to the descriptions in FIG. 1 to FIG. 3 , and will not be repeated here.

When the sentence translation process above is performed by a terminal device, the terminal device may be any terminal device such as a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of Sales) terminal, or an in-vehicle computer. The following takes a mobile phone as an example:

图12示出的是与本发明实施例提供的终端设备相关的手机的部分结构的框图。参考图12，手机包括：射频(Radio Frequency，RF)电路1110、存储器1120、输入单元1130、显示单元1140、传感器1150、音频电路1160、无线保真(wireless fidelity，WiFi)模块1170、处理器1180、以及摄像头1190等部件。本领域技术人员可以理解，图12中示出的手机结构并不构成对手机的限定，可以包括比图示更多或更少的部件，或者组合某些部件，或者不同的部件布置。Fig. 12 shows a block diagram of a partial structure of a mobile phone related to the terminal device provided by the embodiment of the present invention. Referring to FIG. 12 , the mobile phone includes: a radio frequency (Radio Frequency, RF) circuit 1110, a memory 1120, an input unit 1130, a display unit 1140, a sensor 1150, an audio circuit 1160, a wireless fidelity (wireless fidelity, WiFi) module 1170, and a processor 1180 , and camera 1190 and other components. Those skilled in the art can understand that the structure of the mobile phone shown in FIG. 12 does not constitute a limitation to the mobile phone, and may include more or less components than shown in the figure, or combine some components, or arrange different components.

下面结合图12对手机的各个构成部件进行具体的介绍：The following is a specific introduction to each component of the mobile phone in conjunction with Figure 12:

The RF circuit 1110, which is a transceiver, can be used to receive and send signals while information is being sent and received or during a call. In particular, downlink information received from a base station is passed to the processor 1180 for processing, and uplink data is sent to the base station. Generally, the RF circuit 1110 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. The RF circuit 1110 can also communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.

The memory 1120 can be used to store software programs and modules; the processor 1180 executes the various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1120. The memory 1120 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the mobile phone (such as audio data or a phone book). In addition, the memory 1120 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.

The input unit 1130 can be used to receive a sentence to be translated, a translation instruction and the like input by the user. Specifically, the input unit 1130 may include a touch panel 1131 and other input devices 1132. The touch panel 1131, also called a touch screen, can collect touch operations performed by the user on or near it (for example, operations performed on or near the touch panel 1131 with a finger, a stylus or any other suitable object or accessory) and drive the corresponding connected apparatus according to a preset program. Optionally, the touch panel 1131 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, and sends them to the processor 1180; it can also receive and execute commands sent by the processor 1180. The touch panel 1131 may be implemented as a resistive, capacitive, infrared, surface acoustic wave or other type of panel. Besides the touch panel 1131, the input unit 1130 may include other input devices 1132, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick.

The display unit 1140 can be used to display the translation result. The display unit 1140 may include a display panel 1141, which may optionally be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, the touch panel 1131 may cover the display panel 1141. When the touch panel 1131 detects a touch operation on or near it, it transmits the operation to the processor 1180 to determine the type of the touch event, and the processor 1180 then provides a corresponding visual output on the display panel 1141 according to the type of the touch event. Although in FIG. 12 the touch panel 1131 and the display panel 1141 are two independent components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 1131 and the display panel 1141 may be integrated to implement the input and output functions of the mobile phone.

The mobile phone may also include at least one sensor 1150, such as a light sensor, a motion sensor or another sensor. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 1141 according to the ambient light level, and the proximity sensor can turn off the display panel 1141 and/or the backlight when the mobile phone is moved to the ear. As one type of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity. It can be used in applications that recognize the attitude of the mobile phone (such as switching between portrait and landscape, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer and tap detection). Other sensors that the mobile phone may also be equipped with, such as a gyroscope, a barometer, a hygrometer, a thermometer and an infrared sensor, are not described further here.

The audio circuit 1160, the speaker 1161, and the microphone 1162 may provide an audio interface between the user and the mobile phone. The audio circuit 1160 may convert received audio data into an electrical signal and transmit it to the speaker 1161, which converts it into a sound signal for output. Conversely, the microphone 1162 converts collected sound signals into electrical signals, which the audio circuit 1160 receives and converts into audio data. The audio data is then output to the processor 1180 for processing and sent through the RF circuit 1110 to, for example, another mobile phone, or output to the memory 1120 for further processing.

WiFi is a short-range wireless transmission technology. Through the WiFi module 1170, the mobile phone can help the user send and receive e-mail, browse web pages, access streaming media, and so on, providing the user with wireless broadband Internet access. Although FIG. 12 shows the WiFi module 1170, it can be understood that the module is not an essential component of the mobile phone and may be omitted as needed without changing the essence of the invention.

The processor 1180 is the control center of the mobile phone. It connects the various parts of the entire mobile phone through various interfaces and lines, and performs the functions of the mobile phone and processes data by running or executing software programs and/or modules stored in the memory 1120 and by calling data stored in the memory 1120, thereby monitoring the mobile phone as a whole. Optionally, the processor 1180 may include one or more processing units. Preferably, the processor 1180 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor need not be integrated into the processor 1180.

The camera 1190 is used to capture images.

The mobile phone further includes a power supply (such as a battery) that supplies power to each component. Preferably, the power supply may be logically connected to the processor 1180 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.

Although not shown, the mobile phone may further include a camera, a Bluetooth module, and the like, which are not described further here.

In this embodiment of the present invention, the processor 1180 included in the terminal further has the following control functions:

Optionally, the processor may further:

output the translation result sentence.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product.

The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, wireless, or microwave). The computer-readable storage medium may be any available medium that a computer can store, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

Those of ordinary skill in the art can understand that all or some of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a ROM, a RAM, a magnetic disk, an optical disc, or the like.

The translation model training method, the sentence translation method, the apparatus, and the device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims

1. A method for training a translation model, characterized by comprising:

obtaining a training sample set, the training sample set including a plurality of training samples;

determining a disturbance sample set corresponding to each training sample in the training sample set, the disturbance sample set including at least one disturbance sample, wherein the semantic similarity between the disturbance sample and the corresponding training sample is higher than a first preset value; and

training an initial translation model using the plurality of training samples and the disturbance sample set corresponding to each training sample, to obtain a target translation model.

2. The method according to claim 1, characterized in that each training sample is a training sample pair, the training sample pair including a training input sample and a training output sample;

correspondingly, the determining a disturbance sample set corresponding to each training sample comprises:

determining a disturbance input sample set corresponding to each training input sample, and a disturbance output sample corresponding to the disturbance output sample set, wherein the disturbance input sample set includes at least one disturbance input sample, and the disturbance output sample is identical to the training output sample;

correspondingly, the training an initial translation model using the plurality of training samples and the disturbance sample set corresponding to each training sample, to obtain a target translation model, comprises:

training the initial translation model using the plurality of training sample pairs, the disturbance input sample set corresponding to each training input sample, and the disturbance output sample corresponding to the disturbance output sample set, to obtain the target translation model.

3. The method according to claim 2, characterized in that the determining a disturbance input sample set corresponding to each training input sample in the training sample set comprises:

determining a first word in each training input sample, the first word being a word to be replaced; and

replacing the first word with each of at least one second word to obtain the disturbance input sample set, wherein the semantic similarity between the second word and the first word is higher than a second preset value.
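
The word-replacement step of claim 3 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: cosine similarity over word vectors stands in for "semantic similarity", `TOY_EMBEDDINGS` is a hand-made table, and `lexical_disturbances`, `position`, and `threshold` (playing the role of the second preset value) are hypothetical names that do not appear in the application.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def lexical_disturbances(tokens, position, embeddings, threshold):
    """Replace the word at `position` (the "first word") with every vocabulary
    word (a "second word") whose similarity to it exceeds `threshold`."""
    first_word = tokens[position]
    disturbed = []
    for candidate, vector in embeddings.items():
        if candidate == first_word:
            continue
        if cosine(embeddings[first_word], vector) > threshold:
            disturbed.append(tokens[:position] + [candidate] + tokens[position + 1:])
    return disturbed

# Hypothetical toy vocabulary; real word vectors would come from a trained model.
TOY_EMBEDDINGS = {
    "quick": [0.9, 0.1, 0.0],
    "fast": [0.85, 0.15, 0.05],
    "rapid": [0.8, 0.2, 0.0],
    "slow": [-0.9, 0.1, 0.0],
}

sentence = ["a", "quick", "reply"]
disturbed_inputs = lexical_disturbances(sentence, 1, TOY_EMBEDDINGS, threshold=0.95)
```

With this toy table, "fast" and "rapid" pass the threshold and "slow" does not, so the disturbance input sample set contains two sentences. How the first word is selected is left open here; the sketch simply takes its position as an argument.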

4. The method according to claim 2, characterized in that the determining a disturbance input sample set corresponding to each training input sample in the training sample set comprises:

determining a word vector of each word in each training input sample; and

superimposing a different Gaussian noise vector on the word vector of each word each time, to obtain the disturbance sample set.
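
The feature-level disturbance of claim 4 can be sketched as follows. This is an illustrative sketch only: the function name `gaussian_disturbances`, the zero-mean noise, the standard deviation `sigma`, and the fixed `seed` are assumptions and are not specified in the claim.

```python
import random

def gaussian_disturbances(word_vectors, num_samples, sigma, seed=0):
    """Return `num_samples` disturbed copies of a sentence's word vectors.

    Each copy adds freshly drawn zero-mean Gaussian noise to every component
    of every word vector, so each disturbed sample uses different noise.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        noisy = [[x + rng.gauss(0.0, sigma) for x in vector] for vector in word_vectors]
        samples.append(noisy)
    return samples

# Hypothetical 2-dimensional word vectors for a three-word input sentence.
vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
noisy_samples = gaussian_disturbances(vectors, num_samples=2, sigma=0.01)
```

A small `sigma` keeps each disturbed sample close to the original in vector space, which is one way to respect the requirement that a disturbance sample remain semantically similar to its training sample.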

5. The method according to any one of claims 2-4, characterized in that the initial translation model includes an encoder, a classifier, and a decoder;

the encoder is configured to receive the training input sample and the corresponding disturbance input sample, and to output a first intermediate representation result and a second intermediate representation result, wherein the first intermediate representation result is the intermediate representation result of the training input sample, and the second intermediate representation result is the intermediate representation result of the corresponding disturbance input sample;

the classifier is configured to distinguish the first intermediate representation result from the second intermediate representation result; and

the decoder is configured to output the training output sample according to the first intermediate representation result, and to output the training output sample according to the second intermediate representation result.

6. The method according to claim 5, characterized in that the model objective function of the initial translation model includes a classification objective function related to the classifier and the encoder, a training objective function related to the encoder and the decoder, and a disturbance objective function;

wherein the classification objective function includes the training input sample, the corresponding disturbance input sample, the parameters of the encoder, and the parameters of the classifier;

the training objective function includes the training input sample, the training output sample, the parameters of the encoder, and the parameters of the decoder; and

the disturbance objective function includes the disturbance input sample, the training output sample, the parameters of the encoder, and the parameters of the decoder.
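
One way to write down a model objective with the three components named in claim 6 is sketched below. The notation is assumed for illustration and is not taken from the claim: $x$ is a training input sample, $y$ the training output sample, $x'$ a disturbance input sample drawn from the disturbance input sample set $N(x)$, and $\theta_{enc}$, $\theta_{dec}$, $\theta_{dis}$ the parameters of the encoder, decoder, and classifier. The weights $\alpha$ and $\beta$ and the summation form are likewise assumptions.

```latex
J(\theta_{enc}, \theta_{dec}, \theta_{dis}) = \sum_{(x, y)} \Big[
    L_{true}(x, y;\ \theta_{enc}, \theta_{dec})
    + \alpha \sum_{x' \in N(x)} L_{inv}(x, x';\ \theta_{enc}, \theta_{dis})
    + \beta \sum_{x' \in N(x)} L_{noisy}(x', y;\ \theta_{enc}, \theta_{dec})
\Big]
```

Here $L_{true}$ plays the role of the training objective function, $L_{inv}$ the classification objective function, and $L_{noisy}$ the disturbance objective function; each term depends on exactly the samples and parameters that claim 6 lists for it.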

7. The method according to claim 6, characterized in that the training the initial translation model using the plurality of training sample pairs, the disturbance input sample set corresponding to each training input sample, and the disturbance output sample corresponding to the disturbance output sample set, to obtain the target translation model, comprises:

inputting each training input sample, the corresponding disturbance input sample, and the corresponding training output sample into the model objective function; and

optimizing the model objective function by gradient descent to determine the values of the parameters of the encoder, the values of the parameters of the decoder, and the values of the parameters of the classifier, wherein the gradient of the classification objective function with respect to the parameters of the encoder is multiplied by -1.
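
The effect of multiplying the encoder gradient of the classification objective by -1 can be seen in a toy sketch. The setup below is an assumption made purely for illustration: the "encoder" and "classifier" are each reduced to a single scalar parameter, the classification objective is a binary cross-entropy, and only that one objective is shown (the training and disturbance objectives, whose gradients are not reversed, are omitted). All function and variable names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classification_loss_and_grads(theta_enc, w_cls, x, label):
    """Binary cross-entropy of a scalar encoder/classifier pair.

    Returns (loss, d loss / d theta_enc, d loss / d w_cls).
    """
    h = theta_enc * x           # toy encoder: intermediate representation
    p = sigmoid(w_cls * h)      # toy classifier: P(representation is from a training sample)
    loss = -(label * math.log(p) + (1 - label) * math.log(1 - p))
    d_logit = p - label         # derivative of the loss with respect to the logit
    return loss, d_logit * w_cls * x, d_logit * h

def reversed_gradient_step(theta_enc, w_cls, x, label, lr):
    """One gradient-descent step in which the encoder gradient is multiplied by -1."""
    _, grad_enc, grad_cls = classification_loss_and_grads(theta_enc, w_cls, x, label)
    new_w_cls = w_cls - lr * grad_cls                # classifier: ordinary descent
    new_theta_enc = theta_enc - lr * (-1.0) * grad_enc  # encoder: reversed gradient
    return new_theta_enc, new_w_cls

theta, w = 0.5, 0.8
base_loss, _, _ = classification_loss_and_grads(theta, w, x=1.0, label=1)
new_theta, new_w = reversed_gradient_step(theta, w, x=1.0, label=1, lr=0.1)
loss_after_cls_update, _, _ = classification_loss_and_grads(theta, new_w, 1.0, 1)
loss_after_enc_update, _, _ = classification_loss_and_grads(new_theta, w, 1.0, 1)
```

In this toy case the classifier update lowers the classification loss while the encoder update raises it. That is the intended adversarial effect: the classifier learns to tell the two intermediate representations apart, and the encoder is pushed toward representations the classifier cannot distinguish.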

8. A sentence translation method, characterized by comprising:

receiving a first sentence to be translated, expressed in a first language;

translating the first sentence to be translated using a target translation model, to obtain a translation result sentence expressed in a second language, wherein the target translation model is obtained by training with a plurality of training samples and a disturbance sample set corresponding to each of the plurality of training samples, the disturbance sample set includes at least one disturbance sample, and the semantic similarity between the disturbance sample and the corresponding training sample is higher than a first preset value; and

outputting the translation result sentence expressed in the second language.

9. The method according to claim 8, characterized in that the method further comprises:

receiving a second sentence to be translated, expressed in the first language, wherein the second sentence to be translated is a disturbance sentence of the first sentence to be translated, and the similarity between the second sentence to be translated and the first sentence to be translated is higher than the first preset value;

translating the second sentence to be translated using the target translation model, to obtain the translation result sentence corresponding to the first sentence to be translated; and

outputting the translation result sentence.

10. An apparatus for training a translation model, characterized by comprising:

an acquiring unit, configured to obtain a training sample set, the training sample set including a plurality of training samples;

a determining unit, configured to determine a disturbance sample set corresponding to each training sample in the training sample set obtained by the acquiring unit, the disturbance sample set including at least one disturbance sample, wherein the semantic similarity between the disturbance sample and the corresponding training sample is higher than a first preset value; and

a model training unit, configured to train an initial translation model using the plurality of training samples obtained by the acquiring unit and the disturbance sample set corresponding to each training sample determined by the determining unit, to obtain a target translation model.

11. The apparatus according to claim 10, characterized in that

the determining unit is configured to, when each training sample is a training sample pair including a training input sample and a training output sample, determine a disturbance input sample set corresponding to each training input sample, and a disturbance output sample corresponding to the disturbance output sample set, wherein the disturbance input sample set includes at least one disturbance input sample, and the disturbance output sample is identical to the training output sample; and

the model training unit is configured to train the initial translation model using the plurality of training sample pairs, the disturbance input sample set corresponding to each training input sample, and the disturbance output sample corresponding to the disturbance output sample set, to obtain the target translation model.

12. A sentence translation apparatus, characterized by comprising:

a receiving unit, configured to receive a first sentence to be translated, expressed in a first language;

a translation unit, configured to translate, using a target translation model, the first sentence to be translated received by the receiving unit, to obtain a translation result sentence expressed in a second language, wherein the target translation model is obtained by training with a plurality of training samples and a disturbance sample set corresponding to each of the plurality of training samples, the disturbance sample set includes at least one disturbance sample, and the semantic similarity between the disturbance sample and the corresponding training sample is higher than a first preset value; and

an output unit, configured to output the translation result sentence, expressed in the second language, produced by the translation unit.

13. A computer device, characterized in that the computer device comprises an input/output (I/O) interface, a processor, and a memory, the memory storing program instructions;

the processor is configured to execute the program instructions stored in the memory, to perform the method according to claim 1.

14. A terminal device, characterized in that the terminal device comprises an input/output (I/O) interface, a processor, and a memory, the memory storing program instructions;

the processor is configured to execute the program instructions stored in the memory, to perform the method according to claim 8 or 9.

15. A computer-readable storage medium including instructions, characterized in that, when the instructions are run on a computer device, the computer device is caused to perform the method according to any one of claims 1-7 or the method according to claim 8 or 9.