Background Art
With the arrival of the on-chip multi-core processor era, how to make application programs achieve good performance on multi-core architectures has become a research focus. There are two main ways to speed up the execution of an application program. In the first, the programmer explicitly uses parallel programming techniques to develop the application; this approach has not been widely adopted because it is constrained by the limitations of existing language models and by the complexity of the problems being solved. The second relies on a parallelizing compiler to parallelize a serial program automatically or semi-automatically.
Parallelization of serial programs is a hot topic in parallel processing research and one of the problems that the high-performance computing field needs to solve. Parallelization is divided into two modes: fully automatic parallelization and interactive parallelization. In recent years, progress has been made in both the theory and the practice of parallelization, and several fully automatic parallelization systems have appeared, such as Polaris from UIUC (University of Illinois at Urbana-Champaign), SUIF from Stanford, and AFT from Fudan University. These systems use techniques such as interprocedural analysis, symbolic data dependence analysis, array privatization, reduction recognition, recognition of complex-form induction variables, and run-time analysis to discover the parallelizable parts of a program and generate parallel code automatically, without human intervention, and they have a certain parallelization capability. However, the parallelization algorithms used in fully automatic systems cannot handle complex applications effectively, nor can they effectively handle the dependences among the smaller programs produced by splitting while preserving the semantics of the original serial program, so the results of compilation in practice are unsatisfactory. Such systems are mainly used for scientific computing and are not suited to broader domains such as desktop applications, multimedia, and servers. Interactive parallelization systems, by contrast, help the user carry out the parallelization work by presenting useful information about the program, including dependence analysis results, program call graphs, and performance prediction results. Examples include CAPTools from the University of Greenwich, the Fortran D system from Rice University, and Forge90 from Applied Parallel Research. These interactive systems apply automatic parallelization techniques as far as possible while allowing a person to inspect and modify the parallelization results during the process, using human ability to improve the outcome, which partly compensates for the shortcomings of fully automatic systems. However, these interactive systems generally do not adopt the latest automatic parallelization techniques, and their interaction methods and the closeness of their cooperation with the user leave much room for improvement. As a result, their parallelization results remain unsatisfactory on existing multi-core architectures.
Summary of the invention
The technical problem to be solved by the present invention is to provide an interactive parallelization compiling system and a compiling method thereof that offer friendly means of interaction, allow the user and the compiler to cooperate closely, and incorporate the latest advanced parallelization techniques, so that serial applications achieve good performance on multi-core architectures.
To solve this technical problem, the present invention provides an interactive parallelization compiling system which, based on a compiler and through interaction with the user, compiles a serial program into a parallel program that can be processed in parallel by the multi-core processor of a computer, the compiler being an Eclipse compiler. The interactive parallelization compiling system further comprises an interactive parallelization plug-in and an interactive parallelization engine. The interactive parallelization plug-in interacts with the user to obtain interactive information. The interactive parallelization engine receives the interactive information, performs automatic program analysis, and determines the computation structure characteristics of the serial program. Using the interactive information that the plug-in obtains from the user, the engine then parallelizes each kind of computation as follows: for linear task computation, it builds a cost model from the interactive information and uses a method based on graph model theory to partition and schedule the program graph model, so as to exploit coarse-grained task-level parallelism; for recursive task computation, it uses a divide-and-conquer algorithm and transforms the program with a parallel library; for regular data computation, it builds a performance model from the interactive information and transforms the program with an affine partitioning algorithm to obtain pipeline-level parallelism; for special computation, it exploits parallelism with a cost-driven speculative multithreading method that is guided by user experience information and based on sampling and feedback. The engine passes the parallelization result to the interactive parallelization plug-in, and the Eclipse compiler displays the parallelization result.
The present invention also provides an interactive parallelization compiling method which, based on a compiler and through interaction with the user, compiles a serial program into a parallel program that can be processed in parallel by the multi-core processor of a computer, the compiler being an Eclipse compiler. The method comprises two stages. In the preparation stage, the interactive parallelization plug-in interacts with the user to obtain interactive information, and the interactive parallelization engine receives the interactive information, performs automatic program analysis, and determines the computation structure characteristics of the serial program. In the parallelization stage, the interactive parallelization plug-in interacts with the user to obtain interactive information, and each kind of computation is parallelized as follows: for linear task computation, a cost model is built from the interactive information and a method based on graph model theory is used to partition and schedule the program graph model, so as to exploit coarse-grained task-level parallelism; for recursive task computation, a divide-and-conquer algorithm is used and the program is transformed with a parallel library; for regular data computation, a performance model is built from the interactive information and the program is transformed with an affine partitioning algorithm to obtain pipeline-level parallelism; for special computation, parallelism is exploited with a cost-driven speculative multithreading method that is guided by user experience information and based on sampling and feedback. The parallelization result is passed to the interactive parallelization plug-in, and the Eclipse compiler displays the parallelization result.
The interactive parallelization compiling system and compiling method provided by the present invention combine the information obtained by automatic analysis with the information supplied by the user through interaction to form the computation structure characteristics of the program. For different kinds of computation, different parallelization methods are used to carry out a source-to-source parallelizing transformation. The present invention has the following advantages:
(1) The interactive functions are implemented as a plug-in of the Eclipse compiler, which provides a friendly visual environment and powerful user interaction functions, with good portability and extensibility;
(2) The source-to-source transformation is general-purpose and can achieve good parallelization results on different target machine architectures;
(3) The latest automatic parallelization analysis techniques are combined with interaction, so that program information and user knowledge are extracted well and the system has a strong program comprehension capability;
(4) Several different types of computation are supported, each with a corresponding parallelization scheme; parallelism at multiple granularities is supported, as is speculative multithreaded parallelization.
Detailed Description of the Embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described below with reference to the drawings and embodiments.
Fig. 1 is a structural block diagram of the interactive parallelization compiling system of the present invention. Referring to Fig. 1, the interactive parallelization compiling system provided by the invention comprises a computer having a multi-core processor, and the serial program is stored in the computer. The system comprises an Eclipse compiler 1, an interactive parallelization plug-in 2, and an interactive parallelization engine 3, all running in the computer environment of the multi-core processor. The Eclipse compiler 1 serves as the underlying framework, and the interactive parallelization plug-in 2 and the interactive parallelization engine 3 cooperate to compile the serial program stored in the computer into a parallel program that can be processed in parallel by the multi-core processor. The Eclipse compiler of the present invention is a compiler based on the Eclipse environment.
The interactive parallelization plug-in 2 is an interactive view plug-in built on the Eclipse compiler 1. It is developed with the plug-in environment of the Eclipse compiler as its base platform and provides a friendly interactive visual environment and user interaction functions. By interacting with the user, it obtains interactive information such as relevant performance parameters and the user's experience and knowledge, and passes this information to the interactive parallelization engine 3 to guide the parallelization work. The user selects a program block in the editor of the Eclipse compiler and then chooses a parallelization function from the menu of the interactive parallelization plug-in 2, which triggers the interactive parallelization engine 3 to perform the parallelization. The interactive parallelization engine 3 receives the interactive information obtained by the interactive parallelization plug-in 2, performs the source-to-source parallelizing transformation, and passes the transformation result back to the interactive parallelization plug-in 2. The interactive parallelization plug-in 2 receives the parallelization result, rewrites the source program, and displays the result to the user in the editor of the Eclipse compiler; it can also receive further feedback from the user and carry out further parallelization as required.
The interactive parallelization engine 3 comprises an interactive module 31, an automatic parallelization module 32, a program transformation module 33, and a parallel code generation module 34. The interactive parallelization plug-in 2 transfers the serial program information and the interactive information to the interactive parallelization engine 3 via the interactive module 31, and the interactive parallelization engine 3 parallelizes the serial program on the basis of the interactive information it has obtained.
The interactive module 31 is the interface module between the interactive parallelization engine 3 and the interactive parallelization plug-in 2. Through this module, the interactive parallelization engine 3 obtains the interactive information and passes the program analysis and parallelization results to the interactive parallelization plug-in 2.
The automatic parallelization module 32 receives the serial program and the interactive information and performs program analysis. It first screens out linear task computation according to the interactive information obtained. For computation that is not linear task computation, it then performs intraprocedural data-flow analysis and interprocedural analysis to screen out recursive task computation. For non-task computation, it finally performs data dependence analysis to distinguish regular data-flow computation from special computation.
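The screening order performed by the automatic parallelization module 32 can be sketched as a simple decision cascade. The following Python fragment is only an illustrative sketch: the names `ComputationKind`, `RegionFacts`, and `classify`, and the three Boolean fields standing in for the analysis results, are assumptions introduced here and do not appear in the original description.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ComputationKind(Enum):
    LINEAR_TASK = auto()
    RECURSIVE_TASK = auto()
    REGULAR_DATA = auto()
    SPECIAL = auto()


@dataclass
class RegionFacts:
    """Hypothetical summary of what is known about one program region."""
    user_marked_linear_tasks: bool   # taken from the interactive information
    has_recursive_calls: bool        # from data-flow and interprocedural analysis
    has_regular_dependences: bool    # from data dependence analysis


def classify(facts: RegionFacts) -> ComputationKind:
    """Apply the three screening steps in the order given in the text."""
    if facts.user_marked_linear_tasks:
        return ComputationKind.LINEAR_TASK
    if facts.has_recursive_calls:
        return ComputationKind.RECURSIVE_TASK
    if facts.has_regular_dependences:
        return ComputationKind.REGULAR_DATA
    # Anything that passes none of the screens is treated as special computation.
    return ComputationKind.SPECIAL
```

Because the checks are ordered, a region that the user marks as linear tasks is classified as such even if it also contains recursion, which mirrors the order of the screening steps in the description.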
The program transformation module 33 receives the interactive information and the analysis results of the automatic parallelization module 32, interprets the serial program, and selects a suitable parallelization strategy for each kind of computation to carry out the parallel transformation. The program transformation module 33 provides a corresponding parallelization scheme for each of the four types of computation. For linear task computation, a cost model is built from the interactive information, and a method based on graph model theory is used to partition and schedule the program graph model and exploit coarse-grained task-level parallelism. For recursive task computation, a parallel divide-and-conquer algorithm is used, and the program is transformed with a parallel library. For regular data computation, a performance model is built from the interactive information, and the program is transformed with an affine partitioning algorithm to obtain pipeline-level parallelism. For special computation such as SPEC CPU2000 Compress, the computation is parallelized with a cost-driven speculative multithreading method that is guided by user experience information and based on sampling and feedback.
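As an illustration of the linear-task branch, the sketch below schedules a small task graph onto a fixed number of cores using per-task costs. It is a generic greedy list scheduler, not the partitioning and scheduling method of the invention, whose details the description does not give. The function name `schedule`, the cost dictionary (standing in for a cost model built from user-supplied parameters), and the example task names are all assumptions; communication costs between cores are ignored.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def schedule(costs, deps, cores):
    """Greedy list scheduling of a task graph.

    costs: {task: estimated cost}   (assumed to come from the cost model)
    deps:  {task: set of tasks that must finish first}
    Returns ({task: (core, start_time)}, makespan).
    """
    graph = {task: deps.get(task, set()) for task in costs}
    order = TopologicalSorter(graph).static_order()  # predecessors come first
    core_free = [0.0] * cores   # time at which each core becomes idle
    finish = {}
    placement = {}
    for task in order:
        # A task is ready once all of its predecessors have finished.
        ready = max((finish[p] for p in graph[task]), default=0.0)
        # Pick the core on which the task can start earliest.
        core = min(range(cores), key=lambda c: max(core_free[c], ready))
        start = max(core_free[core], ready)
        finish[task] = start + costs[task]
        core_free[core] = finish[task]
        placement[task] = (core, start)
    return placement, max(finish.values(), default=0.0)


# Hypothetical decoder pipeline: one decode step feeds two independent filters.
costs = {"decode": 2.0, "filter_a": 3.0, "filter_b": 3.0, "merge": 1.0}
deps = {"filter_a": {"decode"}, "filter_b": {"decode"},
        "merge": {"filter_a", "filter_b"}}
placement, makespan = schedule(costs, deps, cores=2)
```

With two cores the two filters are placed on different cores and overlap in time; with one core the same graph degenerates to the serial order. In an interactive setting, the costs would be revised from user feedback and the schedule recomputed.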
The parallel code generation module 34 applies the source code changes required by the parallelizing transformations produced by the program transformation module 33 and generates the parallelized code. For a serial program written in C, it generates parallelized code that combines the serial program code with OpenMP directive statements.
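The kind of output described here, serial C code combined with OpenMP directives, can be illustrated with a very small source-to-source rewriter. This is a sketch under strong assumptions: the helper `annotate_loops` is hypothetical, it works on line numbers rather than on a parsed program, and it assumes that an earlier analysis has already shown the marked loops to be free of loop-carried dependences.

```python
def annotate_loops(c_source, loop_lines, directive="#pragma omp parallel for"):
    """Insert an OpenMP directive before each marked loop line.

    c_source:   C source text
    loop_lines: 1-based line numbers of loops judged safe to parallelize
                (assumed to be supplied by a prior dependence analysis)
    """
    out = []
    for number, line in enumerate(c_source.splitlines(), start=1):
        if number in loop_lines:
            # Reuse the loop's own indentation for the directive.
            indent = line[:len(line) - len(line.lstrip())]
            out.append(indent + directive)
        out.append(line)
    return "\n".join(out)


serial = (
    "void scale(double *a, int n) {\n"
    "    for (int i = 0; i < n; i++)\n"
    "        a[i] *= 2.0;\n"
    "}"
)
parallel = annotate_loops(serial, {2})
print(parallel)
```

The original statements are left untouched and only directive lines are added, so the generated file still compiles as ordinary serial C when OpenMP support is disabled.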
The whole interactive parallelization engine 3 works in a continuous cycle: it interacts with the user to obtain the user's relevant knowledge and experience, and carries out further parallelization as required until the parallelization result meets the user's performance requirements. It therefore has active interaction capability and strong parallelization capability.
The interactive parallelization plug-in 2 also provides a performance parameter configuration page. The user defines performance parameters on this page, and they are exchanged with the interactive parallelization engine 3, so that the parallelization work is guided dynamically by the cost model. Similarly to the performance parameter configuration page, the interactive parallelization plug-in 2 also provides a user knowledge window, in which the user enters knowledge about how the program runs in order to guide the speculative multithreaded parallelization. In addition, the interactive parallelization plug-in 2 provides a visual browsing environment, including a visual performance analyzer for inspecting the parallelization results and a multi-graph browser for viewing program information graphs. After the interactive parallelization engine 3 completes the parallelization work, the interactive parallelization plug-in 2 is responsible for writing out the parallelized code, modifying the source file, and displaying the parallelization result in real time in the editor of the Eclipse compiler.
The present invention proposes a computation-type-driven parallelization algorithm and, with this algorithm as its theoretical basis, applies different parallelization strategies to different types of computation. The computations supported by the invention include: 1. computation composed of linear tasks, such as multimedia encoding and decoding applications and the MDG program (a molecular dynamics simulation program, part of the Perfect Benchmarks); 2. computation composed of recursive tasks, such as applications that solve problems by divide and conquer; 3. computation composed of regular data, such as some scientific computing applications; 4. special computation, such as SPEC CPU2000 Compress.
The core algorithm of the present invention is divided into two stages: a preparation stage and a parallelization stage. The preparation stage is mainly used to obtain information about the serial program. First, linear task computation is screened out interactively, according to the information provided by the user. Then automatic program analysis, comprising intraprocedural data-flow analysis and interprocedural analysis, is performed to screen out recursive task computation. Finally, data dependence analysis is used to distinguish regular data-flow computation from special computation. Once the computation structure characteristics of the serial application have been determined, the parallelization stage begins. The parallelization stage applies different parallelization strategies to computations with different characteristics. For linear task computation, a cost model is built through interaction with the user and, according to the cost model, a method based on graph model theory is used to partition and schedule the program graph model and exploit coarse-grained task-level parallelism. The parallelization result is displayed to the user, and the parallelization process is carried out iteratively until the user is satisfied with the result. For recursive task computation, a parallel divide-and-conquer algorithm is used, the program is transformed with a parallel library, and the parallelization result is displayed. For regular data computation, performance evaluation information is obtained through user interaction, the program is transformed with an affine partitioning algorithm according to the performance model, and the parallelization result is displayed. For special computation, user experience information is obtained through interaction with the user to guide speculative execution, and the special computation is parallelized with a cost-driven speculative multithreading method based on sampling and feedback. The parallelization performance values are then obtained by simulated execution in a run-time environment that supports speculative parallelism, and the performance results are displayed to the user. The parallelization process is carried out iteratively until the user is satisfied with the result.
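For the recursive-task branch, the general shape of a divide-and-conquer computation rewritten against a parallel library can be sketched as follows. The description does not name the parallel library, so Python's standard `concurrent.futures` module is used purely as a stand-in; the functions `dc_sum` and `parallel_sum`, the cutoff of 1024 elements, and the depth limit are assumptions. In CPython, threads do not speed up CPU-bound code, so the fragment illustrates the structure of the transformation rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def dc_sum(values, pool, depth):
    """Divide-and-conquer sum; the left half is handed to the pool."""
    if depth == 0 or len(values) <= 1024:
        return sum(values)                      # serial base case
    mid = len(values) // 2
    left = pool.submit(dc_sum, values[:mid], pool, depth - 1)
    right = dc_sum(values[mid:], pool, depth - 1)   # current thread keeps working
    return left.result() + right


def parallel_sum(values, depth=2):
    # At most 2**depth - 1 tasks are submitted, so a pool of 2**depth workers
    # guarantees that a task waiting on its child never starves that child.
    with ThreadPoolExecutor(max_workers=2 ** depth) as pool:
        return dc_sum(values, pool, depth)


total = parallel_sum(list(range(10_000)))
```

Limiting the recursion depth at which new tasks are spawned keeps the task granularity coarse, which is the usual way to control the overhead of such a transformation.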
The core algorithm of the present invention, namely selecting a suitable parallelization strategy according to the computation characteristics of the serial program and using it to perform a source-to-source transformation, is described by the following pseudocode:
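The pseudocode itself is not reproduced in this text. Purely as a reading aid, the two stages described above can be sketched in Python as follows. This is a reconstruction from the prose and not the original pseudocode: the function `run_core_algorithm`, its callback parameters, and the round limit are all hypothetical, and for simplicity the feedback loop is applied to every computation type, whereas the description states it explicitly only for linear task computation and special computation.

```python
def run_core_algorithm(region, classify, strategies, get_hints, user_accepts,
                       max_rounds=5):
    """Two-stage driver: classify the region, then transform it iteratively.

    classify(region)          -> computation type (preparation stage)
    strategies[type]          -> transform(region, hints) for that type
    get_hints(type, previous) -> interactive information for the next round
    user_accepts(result)      -> True once the user is satisfied
    """
    # Preparation stage: determine the computation structure characteristics.
    kind = classify(region)
    transform = strategies[kind]

    # Parallelization stage: transform, show the result, repeat on feedback.
    result = None
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        hints = get_hints(kind, result)
        result = transform(region, hints)
        if user_accepts(result):
            break
    return kind, result, rounds


# Hypothetical usage: the "user" asks for more tasks until there are four.
asked = []

def get_hints(kind, previous):
    asked.append(previous)
    return {"tasks": 2 * len(asked)}

strategies = {"linear_task": lambda region, hints: (region, hints["tasks"])}
outcome = run_core_algorithm("decoder", lambda r: "linear_task", strategies,
                             get_hints, lambda res: res[1] >= 4)
```

In this usage the first round proposes two tasks and is rejected, and the second round proposes four and is accepted, so the driver stops after two rounds.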
The interactive parallelization compiling method of the present invention uses the interactive parallelization compiling system to implement the core algorithm of the invention and comprises a preparation stage and a parallelization stage. In the preparation stage, the interactive parallelization plug-in interacts with the user to obtain interactive information and transfers it to the interactive parallelization engine. The interactive parallelization engine receives the interactive information and the serial program information. The automatic parallelization module first screens out linear task computation according to the interactive information, then performs intraprocedural data-flow analysis and interprocedural analysis to screen out recursive task computation, and finally uses data dependence analysis to distinguish regular data-flow computation from special computation. In the parallelization stage, for linear task computation, the program transformation module of the interactive parallelization engine builds a cost model through interaction with the user and, according to the cost model, uses a method based on graph model theory to partition and schedule the program graph model and exploit coarse-grained task-level parallelism; the Eclipse compiler displays the parallelization result, and the parallelization process is carried out iteratively until the user is satisfied with the result. For recursive task computation, the program transformation module uses a parallel divide-and-conquer algorithm and transforms the program with a parallel library, and the Eclipse compiler displays the parallelization result. For regular data computation, the program transformation module obtains performance evaluation information through user interaction and transforms the program with an affine partitioning algorithm according to the performance model, and the Eclipse compiler displays the parallelization result. For special computation, the program transformation module obtains user experience information through interaction with the user to guide speculative execution and parallelizes the special computation with a cost-driven speculative multithreading method based on sampling and feedback; the Eclipse compiler displays the parallelization result, and the parallelization process is carried out iteratively until the user is satisfied with the result.