CN102231109A - Traceless manageable automatic source code instrumentation method - Google Patents
- Publication number
- CN102231109A
- Authority
- CN
- China
- Prior art keywords
- instrumentation
- file
- type
- node
- syntax tree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Stored Programmes (AREA)
Abstract
A traceless, manageable method for automatic source code instrumentation, comprising the following steps. 40: start by opening a project; 41: define a file filter that matches the projects to be instrumented and keeps the matching ones; 42: apply the file filter to the source files to be instrumented; 43: select the specific application type for automatic instrumentation and define the code to be inserted for that type; 44: use syntax tree structure matching to locate the instrumentation points for the selected application type, insert the code at those positions, and generate a new source file; 45: compile the new source file into a new executable bytecode file and save it; 46: generate the executable file and finish. The main features of the method are visible instrumentation code, centralized management of inserted code, a traceless instrumentation process, automatic location of instrumentation points, extensible automatic instrumentation, and efficient automatic instrumentation.
Description
Technical field
The invention relates to dynamic analysis of computer programs, and in particular to a traceless, manageable source code instrumentation method. The method comprises five parts: visualization of instrumentation points, management of instrumentation points, location of instrumentation points, an automatic instrumentation framework, and performance optimization of automatic instrumentation.
Background
Program analysis typically uses static and dynamic analysis to analyze program behavior automatically and thereby improve software quality. Dynamic program analysis often relies on instrumentation to collect a program's runtime behavior; some behavior that depends on the runtime environment can only be collected through instrumentation and cannot be analyzed statically. During software development, code reviewers use source code instrumentation to review code after the coding phase is complete; reviewers usually have permission to read the source code but are not in a position to modify it. Analyzing the program's runtime behavior exposes errors in the code as early as possible, which improves software quality. Source code instrumentation can make full use of program semantics and display the instrumentation code visibly, without increasing the logical complexity of the code.
Program instrumentation inserts probes into a program while preserving the original logic of the program under test. As the probes execute, they emit characteristic data about the run; analyzing this data yields the program's control flow and data flow information, and from that dynamic information such as logic coverage, which serves the purpose of testing. Depending on when the probes are inserted, instrumentation is divided into object code instrumentation and source code instrumentation.
Existing instrumentation methods fall into three main categories: assertion mechanisms, bytecode instrumentation, and aspect-oriented instrumentation. Assertion mechanisms add instrumentation code directly to the source file, which reduces readability. Bytecode instrumentation modifies the bytecode file directly; the source form of the inserted bytecode cannot be displayed, and the correctness of the insertion cannot be guaranteed. Aspect-oriented instrumentation adds horizontal aspect relationships on top of the program's vertical inheritance relationships, which increases the program's logical complexity.
Current instrumentation techniques mainly have problems with the visualization of instrumentation points and their code, the management of inserted code, the automatic location of instrumentation points, and the low performance of automatic instrumentation.
Summary of the invention
In view of the above problems, the invention provides a source code instrumentation method that supports both manual instrumentation and batch automatic instrumentation. Its main features are visible instrumentation code, centralized management of inserted code, a traceless instrumentation process, automatic location of instrumentation points, extensible automatic instrumentation, and efficient automatic instrumentation.
The invention is realized through the following technical solution:
A traceless, manageable method for automatic source code instrumentation, comprising the following steps:
Step 40: Start by opening a project;
Step 41: Define a file filter that matches the projects to be instrumented and keeps the matching ones;
Step 42: Apply the file filter to the source files to be instrumented;
Step 43: Select the specific application type for automatic instrumentation and define the code to be inserted for that type;
Step 44: Use syntax tree structure matching to locate the instrumentation points for the selected application type, insert the code at those positions, and generate a new source file;
Step 45: Compile the new source file into a new executable bytecode file and save it;
Step 46: Generate the executable file and finish.
The automatic instrumentation method provides a meta-instrumentation type reuse framework, comprising a meta-instrumentation operation combiner and a meta-instrumentation type pool, wherein:
The meta-instrumentation type pool contains multiple meta-instrumentation types and supports adding new ones;
The meta-instrumentation operation combiner combines multiple meta-instrumentation types so that several types of instrumentation are applied to the syntax tree in a single traversal;
The meta-instrumentation types support instrumenting different types of instrumentation points, and each meta-instrumentation type handles only one type.
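The pool and combiner described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses Python's standard `ast` module in place of a Java syntax tree, and the names `MetaType`, `Combiner`, and `POOL`, as well as the probe strings, are hypothetical.

```python
import ast


class MetaType:
    """One meta-instrumentation type; it handles exactly one kind of AST node."""

    def __init__(self, name, node_class, snippet):
        self.name = name
        self.node_class = node_class  # the single node kind this type matches
        self.snippet = snippet        # probe source text (hypothetical example)

    def matches(self, node):
        return isinstance(node, self.node_class)

    def probe(self):
        # Parse the snippet on every call so each insertion gets fresh nodes.
        return ast.parse(self.snippet).body


class Combiner(ast.NodeTransformer):
    """Applies all selected meta types during one traversal of the tree."""

    def __init__(self, meta_types):
        self.meta_types = list(meta_types)

    def generic_visit(self, node):
        node = super().generic_visit(node)  # visit the children first
        for meta in self.meta_types:
            if meta.matches(node):
                node.body = meta.probe() + node.body
        return node


# The pool is a plain registry; adding a new type is one more entry.
POOL = {
    "method": MetaType("method", ast.FunctionDef, "print('enter method')"),
    "if": MetaType("if", ast.If, "print('if branch taken')"),
}
POOL["while"] = MetaType("while", ast.While, "print('loop iteration')")

source = "def f(x):\n    if x:\n        return 1\n    return 0\n"
combiner = Combiner([POOL["method"], POOL["if"]])
new_tree = ast.fix_missing_locations(combiner.visit(ast.parse(source)))
new_source = ast.unparse(new_tree)  # requires Python 3.9 or later
print(new_source)
```

Only the types passed to the combiner are applied, so the `while` entry stays in the pool unused here. Each meta type only describes what to match and what to insert; the combiner is the one component that walks and edits the tree.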
In step 41, matching the projects to be instrumented includes matching the packages, files, and methods in the project.
In step 42, filtering the source files comprises the following steps:
Step 60: Start. Source file filtering is applied to all source files in a workspace;
Step 61: Determine whether the workspace contains another project. If not, jump to step 64; if so, go to step 62;
Step 62: Match the project name. If it matches, go to step 63; otherwise return to step 61 and check whether there is a next project;
Step 63: The project name matches, so the project needs to be instrumented. Add it to the set of projects to instrument, then return to step 61;
Step 64: Filter packages. Determine whether the projects in the project set contain another package that has not yet been filtered (a Java project contains multiple packages, which are filtered one at a time; "package" here is the Java language concept). If not, jump to step 67 to start filtering files; if so, go to step 65;
Step 65: Match the package name. If it matches, go to step 66; otherwise return to step 64 and check whether there is a next package;
Step 66: The package name matches, so the package needs to be instrumented. Add it to the set of packages to instrument, then return to step 64;
Step 67: Determine whether the packages in the package set contain another source file. If not, jump to step 6b; if so, go to step 68;
Step 68: Match the source file name. If it matches, go to step 69 to match methods; otherwise return to step 67;
Step 69: Match methods. If a method matches, go to step 6a; otherwise return to step 67;
Step 6a: Add the matching source file to the set of matched files;
Step 6b: Once all project names, package names, file names, and method names in the project have been matched, file filtering is complete.
In step 62, matching the project name means matching it against the project-name regular expression;
In step 65, matching the package name means matching it against the package-name regular expression;
In step 66, matching the package name means matching it against the package-name regular expression;
In step 68, matching the source file name means matching it against the source-file-name regular expression;
In step 69, matching methods means matching the names of the methods in the source file against the method-name regular expression.
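The filtering steps 60 to 6b can be sketched as a four-level regular-expression filter. This is an illustrative sketch only: the workspace is modeled as a nested dictionary rather than a real project tree, the function name `filter_workspace` and all sample names are hypothetical, and the use of full-string matching is an assumption, since the text does not say whether a partial match is enough.

```python
import re


def filter_workspace(workspace, project_re, package_re, file_re, method_re):
    """workspace: {project: {package: {file name: [method names]}}}."""
    # Steps 61-63: keep the projects whose names match.
    projects = [p for p in workspace if re.fullmatch(project_re, p)]
    # Steps 64-66: keep the matching packages of the kept projects.
    packages = [(p, k) for p in projects for k in workspace[p]
                if re.fullmatch(package_re, k)]
    matched = []
    # Steps 67-6a: keep files whose name matches and which contain
    # at least one method whose name matches.
    for project, package in packages:
        for file_name, methods in workspace[project][package].items():
            if not re.fullmatch(file_re, file_name):
                continue
            if any(re.fullmatch(method_re, m) for m in methods):
                matched.append((project, package, file_name))
    return matched  # step 6b: filtering is complete


# Hypothetical workspace used only for this example.
workspace = {
    "shop": {
        "com.shop.order": {
            "OrderService.java": ["placeOrder", "cancel"],
            "OrderDto.java": ["getId"],
        },
        "com.shop.util": {"Strings.java": ["placeHolder"]},
    },
    "legacy": {"old.pkg": {"Old.java": ["placeOrder"]}},
}

selected = filter_workspace(
    workspace,
    project_re=r"shop",
    package_re=r"com\.shop\.order(\..*)?",
    file_re=r".*Service\.java",
    method_re=r"place.*",
)
print(selected)
```

Filtering from the outermost level inward means that a rejected project or package is never opened, which is where the reduction in files to compile comes from.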
In step 44, instrumentation points are located by syntax tree matching, which uses the Visitor pattern. Matching the syntax tree and inserting code comprises the following steps:
Step 70: Start. Source files are instrumented one file at a time, according to the selected instrumentation types. The instrumentation types are method, If branch, Switch branch, While branch, Do-While branch, and For branch;
Step 71: Check whether there is a next source file. If not, go to step 7h and finish; if so, go to step 72;
Step 72: Parse the source file into a syntax tree, then visit each node using the Visitor pattern. The operations for the individual instrumentation types under the Visitor pattern are as follows:
Step 73: Determine whether the syntax tree has a next node;
Step 74: Determine whether the selected instrumentation types include the method type. If they do and the node is a method body node, go to step 75; otherwise jump to step 76;
Step 75: Insert the method-type instrumentation code, defined in advance by the analyst, at the corresponding node of the abstract syntax tree;
Step 76: Determine whether the selected instrumentation types include the If branch type. If they do and the node is an If branch node, go to step 77; otherwise jump to step 78;
Step 77: Insert the If-branch instrumentation code, defined in advance by the analyst, at the corresponding node of the abstract syntax tree;
Step 78: Determine whether the selected instrumentation types include the Switch branch type. If they do and the node is a Switch branch node, go to step 79; otherwise jump to step 7a;
Step 79: Insert the Switch-branch instrumentation code, defined in advance by the analyst, at the corresponding node of the abstract syntax tree;
Step 7a: Determine whether the selected instrumentation types include the While branch type. If they do and the node is a While branch node, go to step 7b; otherwise jump to step 7c;
Step 7b: Insert the While-branch instrumentation code, defined in advance by the analyst, at the corresponding node of the abstract syntax tree;
Step 7c: Determine whether the selected instrumentation types include the Do-While branch type. If they do and the node is a Do-While branch node, go to step 7d; otherwise jump to step 7e;
Step 7d: Insert the Do-While-branch instrumentation code, defined in advance by the analyst, at the corresponding node of the abstract syntax tree;
Step 7e: Determine whether the selected instrumentation types include the For branch type. If they do and the node is a For branch node, go to step 7f; otherwise return to step 73;
Step 7f: Insert the For-branch instrumentation code, defined in advance by the analyst, at the corresponding node of the abstract syntax tree;
Step 7g: After instrumenting the syntax tree, convert it back into a source code file, then return to step 71 to instrument the next source file;
Step 7h: All files have been instrumented, and the process ends.
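The per-node dispatch in steps 73 to 7g maps naturally onto a visitor with one method per node kind. The sketch below is an assumption-laden illustration rather than the patent's code: it uses Python's `ast.NodeTransformer` instead of a Java syntax tree visitor, it covers only the method, If, While, and For types (Switch and Do-While are omitted because this sketch has no equivalent for them), and the class name `BranchInstrumenter`, the `_hits` list, and the probe strings are hypothetical.

```python
import ast


class BranchInstrumenter(ast.NodeTransformer):
    """Inserts an analyst-defined probe at each node of a selected type."""

    def __init__(self, selected, probes):
        self.selected = set(selected)  # instrumentation types chosen in step 43
        self.probes = probes           # type name -> probe source text

    def _instrument(self, kind, node):
        self.generic_visit(node)       # handle nested constructs first
        if kind in self.selected:
            node.body = ast.parse(self.probes[kind]).body + node.body
        return node

    def visit_FunctionDef(self, node):  # steps 74-75
        return self._instrument("method", node)

    def visit_If(self, node):           # steps 76-77
        return self._instrument("if", node)

    def visit_While(self, node):        # steps 7a-7b
        return self._instrument("while", node)

    def visit_For(self, node):          # steps 7e-7f
        return self._instrument("for", node)


# Each probe records which construct was entered (hypothetical probe code).
PROBES = {kind: f"_hits.append({kind!r})" for kind in ("method", "if", "while", "for")}

source = """
def total(values):
    s = 0
    for v in values:
        if v > 0:
            s += v
    return s
"""

tree = BranchInstrumenter({"method", "if", "for"}, PROBES).visit(ast.parse(source))
ast.fix_missing_locations(tree)
instrumented_source = ast.unparse(tree)  # step 7g: tree back to source text

# Run the instrumented source and collect the probe data.
namespace = {"_hits": []}
exec(compile(instrumented_source, "<instrumented>", "exec"), namespace)
result = namespace["total"]([1, -2, 3])
hits = namespace["_hits"]
print(instrumented_source)
print(result, hits)
```

The probes leave the computed result unchanged while recording one entry per method call, loop iteration, and taken If branch, which is the kind of control flow data the background section describes.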
The automatic instrumentation scheme of the invention targets a specific application type and uses structure matching on the abstract syntax tree to locate instrumentation points automatically; this automatic location is the core of automatic instrumentation. Once a point is located, the syntax tree of the code fragment is added at the corresponding position in the syntax tree of the original file. The resulting syntax tree is converted back into source code, and the new file is compiled into a bytecode file that can be run directly.
Automatic instrumentation proceeds one file at a time. The original file and the instrumentation code are first parsed into syntax trees. The syntax tree of each file is then traversed to check for structures that match the instrumentation type. When a matching node is found, the syntax tree of the corresponding instrumentation code is merged into the syntax tree of the original file.
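The merge of the two syntax trees can be shown in isolation. The following is a small sketch using Python's `ast` module as a stand-in for a Java parser; the function `pay` and the probe text are hypothetical examples, and splicing at the start of the method body is just one possible choice of insertion position.

```python
import ast

original_source = "def pay(amount):\n    return amount * 2\n"
snippet_source = "print('pay called with', amount)"

# Parse both the original file and the instrumentation code into trees.
original_tree = ast.parse(original_source)
snippet_tree = ast.parse(snippet_source)

# Find the first node whose structure matches the "method" type.
for node in ast.walk(original_tree):
    if isinstance(node, ast.FunctionDef):
        # Merge: splice the snippet's statements in front of the method body.
        node.body[0:0] = snippet_tree.body
        break

ast.fix_missing_locations(original_tree)
merged_source = ast.unparse(original_tree)  # requires Python 3.9 or later
print(merged_source)
```

Because the edit happens on the tree and the result is written out as new source text, `original_source` itself is never changed.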
Syntax tree structure matching uses the Visitor design pattern to traverse the whole syntax tree node by node, which is how the tree structure is matched.
To make automatic instrumentation more efficient, the invention provides a file filtering method and a meta-instrumentation type reuse framework.
File filtering restricts the files that need to be compiled and so reduces their number. Program analysis usually needs to cover only part of the code in a workspace (or project), and the analyst already understands how the workspace (or project) is structured; this knowledge is used to optimize automatic instrumentation. The analyst defines regular expressions to filter files, and only the matching projects, packages, and files are kept. This greatly reduces the number of files to compile.
Furthermore, if each meta-instrumentation type had to traverse the syntax tree separately, instrumenting with several types would be expensive. Most of the cost lies in compiling the original file: building the syntax tree takes considerable time. The meta-instrumentation type reuse framework addresses this problem. It consists mainly of a meta-instrumentation type pool, which holds multiple meta-instrumentation types, and a meta-instrumentation operation combiner. Each meta-instrumentation type targets one specific type and defines the structure match for its instrumentation points, but it does not operate on the syntax tree directly; it must be used together with the combiner. The combiner traverses the file's syntax tree and modifies it according to the structures defined by the meta-instrumentation types. The pool supports defining and adding new instrumentation types, which extends automatic instrumentation to different application types.
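The cost argument can be made concrete by counting node visits. The sketch below is illustrative only: it uses Python's `ast.NodeVisitor` as a stand-in for a Java syntax tree visitor, and the class `CountingPass` and the sample source are hypothetical. It compares one parse and traversal per instrumentation type against a single combined traversal.

```python
import ast


class CountingPass(ast.NodeVisitor):
    """Visits every node, counting visits and matches for the given node kinds."""

    def __init__(self, kinds):
        self.kinds = tuple(kinds)
        self.visits = 0
        self.matches = 0

    def generic_visit(self, node):
        self.visits += 1
        if isinstance(node, self.kinds):
            self.matches += 1
        super().generic_visit(node)  # continue into the children


source = (
    "def f(xs):\n"
    "    for x in xs:\n"
    "        if x:\n"
    "            while x:\n"
    "                x -= 1\n"
)
kinds = [ast.FunctionDef, ast.If, ast.While, ast.For]

# Naive approach: parse and traverse once per instrumentation type.
separate_visits = 0
separate_matches = 0
for kind in kinds:
    single = CountingPass([kind])
    single.visit(ast.parse(source))
    separate_visits += single.visits
    separate_matches += single.matches

# Combined approach: one parse and one traversal for all types.
combined = CountingPass(kinds)
combined.visit(ast.parse(source))

print(separate_visits, combined.visits, combined.matches)
```

Both approaches find the same instrumentation points, but the separate passes visit every node once per type, so the work grows with the number of types while the combined pass stays at one traversal.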
The invention uses markers to store instrumentation code and to distinguish it from the original code, which makes the instrumentation code easier to see. A marker displays its instrumentation code at the instrumentation position without interfering with editing of the original file. All markers can be managed centrally, which makes the instrumentation code easier to manage. The invention also uses syntax tree structure matching to locate instrumentation points automatically, which is what makes automatic instrumentation possible. Automatic instrumentation adds the instrumentation code to the abstract syntax tree at the located positions and then converts the new syntax tree into a source file. To make automatic instrumentation more efficient, the invention also provides file filtering and the meta-instrumentation type reuse framework. All of these operations run in the background and do not modify the user's original files, so the whole process is traceless.
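One way to picture markers is as records kept apart from the source text. The sketch below is a guess at a possible data model, not the patent's design: the `Marker` fields, the `MarkerRegistry` class, and the text-level rendering are all hypothetical, and a real tool would attach markers inside an editor rather than to plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    file: str  # source file the marker is attached to
    line: int  # 1-based line in front of which the probe is shown and inserted
    code: str  # instrumentation code carried by the marker


class MarkerRegistry:
    """Central store for all markers; the source files themselves stay unchanged."""

    def __init__(self):
        self._markers = []

    def add(self, marker):
        self._markers.append(marker)

    def for_file(self, file):
        return sorted((m for m in self._markers if m.file == file),
                      key=lambda m: m.line)

    def remove_file(self, file):
        self._markers = [m for m in self._markers if m.file != file]

    def render(self, file, original_text):
        """Return a new text with the probes inserted; the input is not modified."""
        lines = original_text.splitlines()
        # Insert from the bottom up so earlier line numbers stay valid.
        for m in reversed(self.for_file(file)):
            target = lines[m.line - 1]
            indent = target[: len(target) - len(target.lstrip())]
            lines.insert(m.line - 1, indent + m.code)
        return "\n".join(lines) + "\n"


original = "def f(x):\n    return x + 1\n"
registry = MarkerRegistry()
registry.add(Marker("demo.py", 2, "print('f called')"))
instrumented = registry.render("demo.py", original)
print(instrumented)
```

Keeping the markers in one registry is what allows them to be listed, edited, or removed in one place, while the rendered text is produced separately from the file the user is editing.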
Brief description of the drawings
Figure 1 is the complete flowchart of automatic instrumentation.
Figure 2 is a diagram of the meta-instrumentation type reuse framework for automatic instrumentation.
Figure 3 is the flowchart of file filtering in automatic instrumentation.
Figure 4 is the flowchart of syntax tree matching and code insertion in automatic instrumentation.
Detailed description
The technical solution of the invention is described in further detail below with reference to the drawings.
A traceless, manageable method for automatic source code instrumentation. The method targets a specific application type and uses structure matching on the abstract syntax tree to locate instrumentation points automatically; this automatic location is the core of automatic instrumentation. Once a point is located, the syntax tree of the code fragment is added at the corresponding position in the syntax tree of the original file; the resulting syntax tree is converted into source code and compiled into bytecode, which can be run directly. The complete flowchart of automatic instrumentation is shown in Figure 1 and includes the following steps:
Step 40: The automatic instrumentation process starts. A project is opened in preparation for analyzing it through instrumentation;
Step 41: Define a file filter. The filter uses regular expressions to match the projects (and their packages, files, and methods) to be instrumented; whatever matches is kept;
Step 42: Apply the filter defined above to the files to be instrumented. Drawing on the analyst's existing knowledge of the project files improves the performance of automatic instrumentation;
Step 43: Select the required instrumentation types and define the code to be inserted for each type;
Step 44: Use syntax tree structure matching to locate the instrumentation points for each instrumentation type, and insert the code at those positions, producing a new file;
Step 45: Compile the new files into new executable bytecode files and save them;
Step 46: The executable files are produced. The process never modifies the original files, so it is traceless.
To improve instrumentation efficiency, the automatic instrumentation scheme provides a file filtering scheme and a meta-instrumentation type reuse framework. The framework is shown in Figure 2 and consists of two parts: the meta-instrumentation operation combiner and the meta-instrumentation type pool. The meta-instrumentation types support instrumenting different types of instrumentation points, and each meta-instrumentation type handles only one type. The pool contains multiple meta-instrumentation types and supports adding new ones. The combiner combines multiple meta-instrumentation types so that several types of instrumentation are applied to the syntax tree in a single traversal.
File filtering reduces the number of compiled files by shrinking the set of files to instrument. Not every file needs instrumentation, and compiling irrelevant files and traversing their syntax trees wastes a great deal of time. File filtering greatly reduces the size of the instrumentation file set. The flowchart of file filtering for automatic instrumentation is shown in Figure 3 and includes the following steps:
Step 60: File filtering for automatic instrumentation is applied to all files in a workspace;
Step 61: Check whether the workspace contains another project. If not, jump to step 64 to start filtering packages; if so, go to the next step;
Step 62: Match the project name against the project-name regular expression. If it matches, go to the next step; otherwise return to step 61 and check whether there is a next project;
Step 63: The project name matches the regular expression, so the project needs to be instrumented. Add it to the set of projects to instrument, then return to step 61;
Step 64: Check whether the projects in the project set contain another package. If not, jump to step 67 to start filtering files; if so, go to the next step;
Step 65: Match the package name against the package-name regular expression. If it matches, go to the next step; otherwise return to step 64 and check whether there is a next package;
Step 66: The package name matches the regular expression, so the package needs to be instrumented. Add it to the set of packages to instrument, then return to step 64;
Step 67: Check whether the packages in the package set contain another file. If not, jump to step 6b; if so, go to the next step;
Step 68: Match the file name against the file-name regular expression. If it matches, go to the next step to match method names; otherwise return to step 67;
Step 69: Check whether the file contains a method whose name matches the method-name regular expression. If so, go to the next step; otherwise return to step 67;
Step 6a: Add the matching file to the set of matched files;
Step 6b: Once all project names, package names, file names, and method names in the project have been matched, file filtering is complete.
Instrumentation points are located by syntax tree matching, which uses the Visitor design pattern. The flowchart of syntax tree matching and code insertion is shown in Figure 4 and includes the following steps:
Step 70: Files are instrumented one at a time, according to the selected instrumentation types. The flow below performs syntax tree structure matching and instrumentation for the method, If branch, Switch branch, While branch, Do-While branch, and For branch types;
Step 71: Check whether there is a next file. If not, finish; otherwise go to the next step;
Step 72: Parse the file into a syntax tree for use in locating instrumentation points, then visit each node using the Visitor pattern. The following steps describe the operations for the individual instrumentation types under the Visitor pattern;
Step 73: Determine whether the syntax tree has a next node;
Step 74: Determine whether the selected instrumentation types include the method type. If they do and the node is a method body node, go to the next step; otherwise jump to step 76;
Step 75: Insert the method-type instrumentation code, defined in advance by the analyst, into the abstract syntax tree;
Step 76: Determine whether the selected instrumentation types include the If branch type. If they do and the node is an If branch node, go to the next step; otherwise jump to step 78;
Step 77: Insert the If-branch instrumentation code, defined in advance by the analyst, into the abstract syntax tree;
Step 78: Determine whether the selected instrumentation types include the Switch branch type. If they do and the node is a Switch branch node, go to the next step; otherwise jump to step 7a;
Step 79: Insert the Switch-branch instrumentation code, defined in advance by the analyst, into the abstract syntax tree;
Step 7a: Check whether the selected instrumentation types include the While branch type; if they do and the current node is a While branch node, go to the next step, otherwise go to step 7c;
Step 7b: Insert the While-branch-type instrumentation code defined in advance by the analyst into the abstract syntax tree;
Step 7c: Check whether the selected instrumentation types include the Do-While branch type; if they do and the current node is a Do-While branch node, go to the next step, otherwise go to step 7e;
Step 7d: Insert the Do-While-branch-type instrumentation code defined in advance by the analyst into the abstract syntax tree;
Step 7e: Check whether the selected instrumentation types include the For branch type; if they do and the current node is a For branch node, go to the next step, otherwise skip step 7f;
Step 7f: Insert the For-branch-type instrumentation code defined in advance by the analyst into the abstract syntax tree;
Step 7g: Once the syntax tree has been instrumented, convert it back into a source code file, then proceed to the next step to instrument the next file;
Step 7h: When all files have been instrumented, the whole process ends;
Finally, the newly generated files are compiled into bytecode files that contain both the original code and the instrumentation code; running these files yields the execution results of the instrumented program.
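The traversal in steps 73 to 7h can be illustrated with a minimal sketch. It is not the patented implementation: it uses Python's standard `ast` module as a stand-in for a Java syntax tree, and the names `NODE_TYPES`, `Instrumenter`, `instrument`, and `_probe` are hypothetical. Python has no switch or do-while statement, so only the method, IF, While, and For types are shown. `ast.unparse` requires Python 3.9 or later.

```python
import ast

# Hypothetical mapping from instrumentation type to the AST node classes it targets.
NODE_TYPES = {
    "method": (ast.FunctionDef,),
    "if": (ast.If,),
    "while": (ast.While,),
    "for": (ast.For,),
}


class Instrumenter(ast.NodeTransformer):
    """Inserts a probe call at the start of every node whose type is selected."""

    def __init__(self, selected):
        self.selected = set(selected)

    def generic_visit(self, node):
        # Recurse first, so nested constructs are handled in the same traversal.
        super().generic_visit(node)
        for kind in self.selected:
            if isinstance(node, NODE_TYPES[kind]):
                # The probe statement plays the role of the analyst-defined code.
                probe = ast.parse(f"_probe({kind!r}, {node.lineno})").body[0]
                node.body.insert(0, probe)
        return node


def instrument(source, selected):
    """Parse the source, instrument the tree, and convert it back to source text."""
    tree = Instrumenter(selected).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


ORIGINAL = """
def classify(n):
    if n < 0:
        return "negative"
    count = 0
    while n > 0:
        n -= 1
        count += 1
    return count
"""

hits = []
new_source = instrument(ORIGINAL, ["method", "if", "while"])
# Compile and run the new source; the original text is left untouched.
namespace = {"_probe": lambda kind, line: hits.append(kind)}
exec(compile(new_source, "<instrumented>", "exec"), namespace)
result = namespace["classify"](2)
```

Because every selected type is checked at each node, all instrumentation types are applied in one pass over the tree, and the original source string is never modified.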
The technical solution of the present invention uses markers to edit and save instrumentation code. The instrumentation markers distinguish the instrumentation code from the original code and display the instrumentation code at the instrumentation position, which effectively improves program readability. The markers do not interfere with editing the source program, and centralized management of instrumentation points makes the markers easier to use. The invention improves positioning accuracy through syntax tree structure matching: the original source file is parsed into a syntax tree, structure matching is performed on it, the syntax tree is modified, and finally the syntax tree is converted back into a source code file. In addition, to improve matching efficiency, the solution proposes a file filtering method and a meta-instrumentation type reuse framework. The file filtering method effectively reduces the number of files that must be compiled and traversed, and the meta-instrumentation type reuse framework allows multiple instrumentation types to be applied in a single syntax tree traversal.
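As a rough illustration of the file filtering idea, the sketch below keeps only the paths that match an include pattern and no exclude pattern, so that only those files would later be parsed and traversed. The text shown here does not specify the filter syntax; the glob patterns, the function name `filter_files`, and the sample paths are all assumptions made for the example.

```python
from fnmatch import fnmatchcase


def filter_files(paths, include, exclude=()):
    """Return the paths that match any include pattern and no exclude pattern."""
    kept = []
    for path in paths:
        included = any(fnmatchcase(path, pattern) for pattern in include)
        excluded = any(fnmatchcase(path, pattern) for pattern in exclude)
        if included and not excluded:
            kept.append(path)
    return kept


# Hypothetical project layout used only for this example.
project_files = [
    "src/app/Main.java",
    "src/app/Util.java",
    "test/app/MainTest.java",
    "README.md",
]

kept = filter_files(project_files, include=["src/*.java"], exclude=["*Util.java"])
```

Here only `src/app/Main.java` survives the filter, so three of the four files would never be parsed.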
Finally, it should be noted that the above steps only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail through the above steps, those skilled in the art should understand that its specific techniques may still be modified, or some of them replaced by equivalents, and that any such modification or replacement that does not depart from the spirit of the technical solution falls within the scope of protection claimed by the present invention.
Claims (6)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2011101985825A CN102231109A (en) | 2011-07-15 | 2011-07-15 | Traceless manageable automatic source code instrumentation method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN102231109A (en) | 2011-11-02 |
Family
ID=44843676
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2011101985825A Pending CN102231109A (en) | 2011-07-15 | 2011-07-15 | Traceless manageable automatic source code instrumentation method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN102231109A (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7240335B2 (en) * | 1996-08-27 | 2007-07-03 | Compuware Corporation | Byte code instrumentation |
| CN101706750A (en) * | 2009-11-16 | 2010-05-12 | 西安邮电学院 | Detective pole acquiring method based on embedded type simulator |
| CN101833500A (en) * | 2010-04-07 | 2010-09-15 | 南京航空航天大学 | Embedded software intelligent testing method based on Agent |
Non-Patent Citations (1)
| Title |
|---|
| CHEN HUAJIE et al.: "An Instrumentation Tool for Program Dynamic Analysis in Java", 2011 FIFTH INTERNATIONAL CONFERENCE ON SECURE SOFTWARE INTEGRATION AND RELIABILITY IMPROVEMENT-COMPANION * |
Cited By (18)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103049504A (en) * | 2012-12-11 | 2013-04-17 | 南京大学 | Semi-automatic instrumentation method based on source code inquiring |
| WO2014134990A1 (en) * | 2013-03-08 | 2014-09-12 | Tencent Technology (Shenzhen) Company Limited | Method, device and computer-readable storage medium for closure testing |
| US9507693B2 (en) | 2013-03-08 | 2016-11-29 | Tencent Technology (Shenzhen) Company Limited | Method, device and computer-readable storage medium for closure testing |
| CN104142819A (en) * | 2013-07-10 | 2014-11-12 | 腾讯科技(深圳)有限公司 | File processing method and device |
| CN104142819B (en) * | 2013-07-10 | 2016-08-24 | 腾讯科技(深圳)有限公司 | A kind of document handling method and device |
| CN103488460A (en) * | 2013-09-04 | 2014-01-01 | 用友软件股份有限公司 | System and method for automatically marking source code |
| CN103488460B (en) * | 2013-09-04 | 2015-12-02 | 用友网络科技股份有限公司 | The system and method for automatic mark source code |
| CN106529224A (en) * | 2016-10-27 | 2017-03-22 | 南京大学 | Binary obfuscation method based on ROP (Return Oriented Programming) attack feature |
| CN106610898A (en) * | 2016-12-28 | 2017-05-03 | 南京大学 | JPF-based Java code SSA single path generation method |
| CN106649118A (en) * | 2016-12-28 | 2017-05-10 | 南京大学 | Generating method of SSA single path of Java code based on AST |
| CN106610898B (en) * | 2016-12-28 | 2019-01-04 | 南京大学 | A kind of generation method of the Java code SSA single path based on JPF |
| CN106649118B (en) * | 2016-12-28 | 2019-02-19 | 南京大学 | A method for generating SSA single path of Java code based on AST |
| CN106874058A (en) * | 2016-12-29 | 2017-06-20 | 中国航天系统科学与工程研究院 | A kind of program automatically instrument method based on source code |
| CN110442346A (en) * | 2019-07-08 | 2019-11-12 | 中国科学院计算技术研究所 | Rule augmentation method for compiler code detection |
| CN110471670A (en) * | 2019-08-20 | 2019-11-19 | 杭州和利时自动化有限公司 | A kind of compiler, Compilation Method and method for tracing and DCS controller |
| CN112905443A (en) * | 2019-12-04 | 2021-06-04 | 阿里巴巴集团控股有限公司 | Test case generation method, device and storage medium |
| CN111158667A (en) * | 2020-01-02 | 2020-05-15 | 广州虎牙科技有限公司 | Code injection method and device, electronic equipment and storage medium |
| CN115687133A (en) * | 2022-11-08 | 2023-02-03 | 平安壹钱包电子商务有限公司 | Serialization realization method and device, storage medium and electronic equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C02 | Deemed withdrawal of patent application after publication (patent law 2001) | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20111102 |