Hampering massive attack reuse through automatic synthesis of artificial software diversity

PhD proposal

Software reuse is essential to build the complex applications that pervade our daily lives. Yet massive reuse has a darker side: it creates a monoculture of software applications, in which millions of clone programs
around the world can be hacked in the same way. For example, WordPress web sites were the
targets of an aggressive hacking campaign in France in the aftermath of the Charlie Hebdo attacks.
Two factors favored this massive attack: (i) WordPress is the dominant technology for building web
sites (forming an applicative monoculture) and (ii) WordPress developers introduced some rigidity in
the code (e.g., “hard-coded” naming conventions), which eased the reconnaissance phase of these
attacks. The term “software monoculture” was coined more than a decade ago to highlight
the risks of relying on a handful of operating systems and databases [8]. Nowadays, a new form of software monoculture is emerging: applicative monoculture.

In this PhD, we aim to explore novel program analysis and transformation techniques to automatically
diversify applicative monocultures. The goal is to synthesize large quantities of variants that
relax rigidities and reduce the risks of massive hacks. In particular, we will focus on two specific types of code rigidities: (i) constant strings and numerical values, which can leak information about the location of sensitive resources or about sensitive data; and (ii) over-specifications in the control flow (e.g., choices of certain execution paths that can be modified to make the execution less predictable). The work will consist of three main phases: (1) locate code rigidities through a combination of program analyses (static, dynamic, data-flow); (2) transform the program at those locations to generate multiple variants; and (3) evaluate the impact of the diverse variants on protection. This evaluation will be performed on widely used libraries (e.g., Guava) and end-user applications (e.g., Firefox).
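
To make the first two phases concrete, here is a minimal sketch in Python. It covers only the first type of rigidity (constant strings and numerical values) and only the static part of the analysis. The sample program, the names `locate` and `make_variant`, and the random-token renaming strategy are all assumptions made for illustration; they are not the techniques the PhD will necessarily develop.

```python
import ast
import random
import string

# Hypothetical program under diversification: one hard-coded path, one number.
SOURCE = '''
ADMIN_PATH = "/wp-admin"
MAX_RETRIES = 3

def login_url(host):
    return host + ADMIN_PATH
'''


class LiteralLocator(ast.NodeVisitor):
    """Phase 1 (static only): record every string or numeric literal."""

    def __init__(self):
        self.found = []

    def visit_Constant(self, node):
        value = node.value
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, (str, int, float)) and not isinstance(value, bool):
            self.found.append((node.lineno, value))


class StringDiversifier(ast.NodeTransformer):
    """Phase 2: replace each distinct string literal with a random token.

    Naive on purpose: it rewrites every string literal, docstrings included.
    """

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.mapping = {}

    def visit_Constant(self, node):
        if isinstance(node.value, str):
            if node.value not in self.mapping:
                token = "".join(
                    self.rng.choice(string.ascii_lowercase) for _ in range(8)
                )
                self.mapping[node.value] = "/" + token
            node.value = self.mapping[node.value]
        return node


def locate(source):
    """Return (line, value) pairs for the literals found in `source`."""
    locator = LiteralLocator()
    locator.visit(ast.parse(source))
    return locator.found


def make_variant(source, seed):
    """Return the source of one variant; different seeds give different variants."""
    tree = StringDiversifier(seed).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


if __name__ == "__main__":
    print(locate(SOURCE))
    print(make_variant(SOURCE, seed=1))
```

Each seed yields a variant in which the hard-coded path differs while `login_url` still behaves consistently with the renamed constant. Checking that such variants remain functionally acceptable and actually hinder reconnaissance is the object of the third phase and is not sketched here.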

This PhD will contribute to the field of software diversity. Our recent survey [3] shows that most automatic software diversification occurs at the operating system (OS) level. The seminal work of Cohen [4] and Forrest et al. [5] established the foundations of OS diversity for security and protection purposes. We aim to carry this line of work over to the application level in order to tackle applicative monoculture. Today, few transformations exist that diversify application code. They include Rinard et al.’s loop perforation [7], Schulte et al.’s observations about mutational robustness [6], and our work on multi-tier diversification of web applications [2].
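
As an illustration of what an application-level transformation can look like, the sketch below shows the general idea behind loop perforation: a loop is rewritten to execute only a subset of its iterations, which yields a variant that behaves slightly differently while staying within an acceptable accuracy bound. The functions `mean` and `perforated_mean` and the fixed-stride policy are assumptions chosen for this example and are not taken from [7].

```python
def mean(values):
    """Reference computation: visit every element."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def perforated_mean(values, stride=2):
    """Perforated variant: visit only every `stride`-th element."""
    total = 0.0
    count = 0
    for i in range(0, len(values), stride):
        total += values[i]
        count += 1
    return total / count


if __name__ == "__main__":
    data = list(range(1000))
    print(mean(data), perforated_mean(data, stride=2))
```

With `stride=1` the perforated variant is equivalent to the original loop; larger strides trade accuracy for fewer iterations.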

The PhD candidate should have a strong knowledge of object-oriented programming, static analysis
and software testing. An interest in code protection and fault tolerance is highly recommended. The
work will have a strong empirical component and the candidate should be prepared to develop all the
necessary software to run experiments on real-world programs. The candidate must also be prepared
to read and write scientific papers.

The PhD candidate will be part of the DiverSE team at INRIA in Rennes, France [1]. The team is international: it gathers 20 PhD students and 5 postdocs and has strong connections with international labs and companies.


[1] DiverSE: diversity-centric software engineering. Home page.
[2] S. Allier, O. Barais, B. Baudry, J. Bourcier, E. Daubert, F. Fleurey, M. Monperrus, H. Song, and M. Tricoire. Multi-tier diversification in web-based software applications. IEEE Software, 32(1):83–90, 2015.
[3] B. Baudry and M. Monperrus. The Multiple Facets of Software Diversity: Recent Developments in Year 2000 and Beyond. ACM Computing Surveys, (Accepted for publication), 2015.
[4] F. B. Cohen. Operating system protection through program evolution. Computers & Security, 12(6):565–584, 1993.
[5] S. Forrest, A. Somayaji, and D. Ackley. Building diverse computer systems. In Proc. of HotOS, pages 67–72, 1997.
[6] E. Schulte, Z. Fry, E. Fast, W. Weimer, and S. Forrest. Software mutational robustness. Genetic Programming and Evolvable Machines, pages 1–32, 2013.
[7] S. Sidiroglou-Douskos, S. Misailovic, H. Hoffmann, and M. Rinard. Managing performance vs. accuracy trade-offs with loop perforation. In Proc. of ESEC/FSE, pages 124–134, 2011.
[8] M. Stamp. Risks of monoculture. Comm. of the ACM, 47(3):120, 2004.