Etino is a static analysis tool that automatically schedules computations on heterogeneous architectures. Given a target architecture and a description of its processing devices, Etino runs a series of training steps, using Simulated Annealing, to find optimal parameters for a cost model that adequately represents that architecture. This training is performed only once per architecture.
Once these parameters have been computed, Etino can take as input any C program, together with information about which code regions can be offloaded to other devices (such as GPUs or FPGAs). It then determines where each region of code should execute in order to maximize utility. Utility is a general notion: it could mean minimum runtime or lowest energy consumption, for example. Etino then transforms the source code: it clones the functions of the original program, so that each version meant for a specific device can be compiled to run there, and it rewrites the call sites of all relevant functions to invoke the version that realizes the computed optimal schedule. To learn more about how Etino works, you can read our paper (Static Placement of Computation on Heterogeneous Devices, OOPSLA '17), or watch its presentation. If you want to know more about the implementation of Etino, download our artifact.
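The cloning and call-site rewriting described above can be pictured with a small hand-written sketch. The function names (`scale`, `scale_cpu`, `scale_accel`, `process`) and the suffix convention are hypothetical, not Etino's actual output; both clones share the same body here so that the example runs on any host.

```c
#include <assert.h>
#include <stddef.h>

/* Original function, as written by the programmer. */
void scale(float *v, size_t n, float k) {
    for (size_t i = 0; i < n; i++) v[i] *= k;
}

/* Hypothetical clone meant to be compiled for the host CPU. */
void scale_cpu(float *v, size_t n, float k) {
    for (size_t i = 0; i < n; i++) v[i] *= k;
}

/* Hypothetical clone meant to be compiled for an accelerator.
 * A real tool would add device-specific annotations here. */
void scale_accel(float *v, size_t n, float k) {
    for (size_t i = 0; i < n; i++) v[i] *= k;
}

/* Before the transformation this call site read: scale(data, n, 2.0f);
 * Assuming the static schedule placed this region on the accelerator,
 * the call site is rewritten to invoke that clone directly. */
void process(float *data, size_t n) {
    assert(data != NULL || n == 0);
    scale_accel(data, n, 2.0f);
}
```

Because the choice is made statically, the rewritten call site carries no runtime dispatch: each call simply names the clone selected for it.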
The interface below lets you try Etino online. Enter a valid C program in the upper textbox and, in the lower textbox, information about any loops that can be offloaded to an accelerator device. Then choose one of the three architectures for which we have already trained Etino's cost models. Etino outputs the transformed source file, containing the appropriate version of each function. It also attempts to annotate parallel loops with OpenMP pragmas automatically, for GPU offloading. Finally, the interface shows a file that breaks down the scheduling decisions made by Etino, based on the parameters for that architecture.
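As an illustration of what an OpenMP annotation for GPU offloading can look like, here is a hand-written sketch. The `saxpy` function and the exact clauses are assumptions for this example; the pragmas Etino actually emits may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(float a, const float *x, float *y, int n) {
    assert(x != NULL && y != NULL && n >= 0);
    /* Standard OpenMP offloading directive: copy x to the device, copy y
     * in both directions, and spread the iterations over device threads. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

A compiler without OpenMP support enabled ignores the pragma and runs the loop sequentially on the host, so the annotated file remains a valid C program either way.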