Automatic program synthesis for software diversification

Contact: Benoit Baudry (benoit.baudry@inria.fr)
Keywords: software diversity, program transformation, approximate computation, program synthesis

Description

Program synthesis [5] and software diversification [2] are two very active research areas and this Master internship aims at exploring how the former can serve the purpose of the latter.
Program synthesis [5] consists in automating the task of discovering programs that realize user intent. There exist many techniques to achieve this challenging goal, which vary according to three main dimensions: how to express user intent, how to express the program, how to search the space of different programs. In this work, we focus input/output et trace-based descriptions of user intent and the Java language to express programs. We shall explore different techniques for searching the space of programs.
The automatic diversification of programs can serve different purposes: fault-tolerance through N-version design, security through randomization or correctness / quality trade-offs through the synthesis of variants that enhance certain quality factors such as energy of memory. The latter type of of diversification proceeds as follows [8, 9]: given a program and its test suite, transform the program, check if the test suite still passes, if yes evaluate the performance of the variant. We have developed such a procedure to synthesize sosie programs in Java, i.e., program variants that pass the same test suite as the original [1].
We have observed that when running a test suite against a program under test, some methods executed thousands of times by hundreds of test cases. Focusing on these highly tested methods, it is possible to collect very rich input/output and trace-based micro specifications.
This work will investigate how the collection of rich specifications on some program regions can be exploited to automatically synthesize new versions of these regions. The production of completely new, synthetic code for highly tested code regions would provide a radical diversification technique, which remains in the original program’s behavioral envelop. We shall investigate the synthesis of loop-free programs [7, 6], as well a more local techniques to replace some specific instructions [3] or method sequences [4].
The applicant should have previous experience in software engineering and program analysis, as well as a taste for program transformation and experimental science.

References

[1] B. Baudry, S. Allier, and M. Monperrus. Tailored source code transformations to synthesize computationally diverse program variants. In Proc. of ISSTA, pages 149–159, CA, USA, 2014.
[2] B. Baudry and M. Monperrus. The multiple facets of software diversity: Recent developments in year 2000 and beyond. ACM Computing Survey, 2015.
[3] J. Galenson, P. Reames, R. Bodik, B. Hartmann, and K. Sen. Codehint: Dynamic and interactive synthesis of code snippets. In Proc. of ICSE, pages 653–663, 2014.
[4] A. Goffi, A. Gorla, A. Mattavelli, M. Pezzè, and P. Tonella. Search-based synthesis of equivalent method sequences. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 366–376. ACM, 2014.
[5] S. Gulwani. Dimensions in program synthesis. In Proceedings of the 12th international ACM SIGPLAN symposium on Principles and practice of declarative programming, pages 13–24. ACM, 2010.
[6] S. Gulwani, S. Jha, A. Tiwari, and R. Venkatesan. Synthesis of loop-free programs. In ACM SIGPLAN Notices, volume 46, pages 62–73. ACM, 2011.
[7] S. Jha, S. Gulwani, S. Seshia, A. Tiwari, et al. Oracle-guided component-based program synthesis. In Proc. of the Int. Conf on Software Engineering (ICSE), volume 1, pages 215–224. IEEE, 2010.
[8] M. Rinard, C. Cadar, and H. H. Nguyen. Exploring the acceptability envelope. In Proc. of the Conf. on Object-oriented programming, Systems, Languages, and Applications (OOPSLA), pages 21–30. ACM, 2005.
[9] E. Schulte, Z. P. Fry, E. Fast, W. Weimer, and S. Forrest. Software mutational robustness. Genetic Programming and Evolvable Machines, pages 1–32, 2013.