Research topics for interns, Masters or thesis

Software hardening


Fixing yanked releases in npm
Debloating Rust programs
Adversarial rebuild
Air-gapped software builds
Software integrity with runtime SBOMs
Embedding the software supply chain at runtime with Java classloaders
Ultra small code with GraalVM and debloating
Full-stack debloating for a video conferencing system
Reproducible builds for Maven
Leveraging the diversity of bundlers for debloating JavaScript applications
Automatic specialization of the Java Runtime (JRE)
Systematic decompilation in the CI to mitigate supply chain attacks
Detecting superfluous conflicts in Java projects
API specialization in Kotlin

Software testing


Automatic test generation for Rust
Forging test results to tamper with open-source projects
Test Generation for Ethereum Clients Using Production Data
Code Coverage in Production
Live analysis of WebAssembly in the browser
NumPy: boosting the test suite of the Python numerical analysis package
Automatic synthesis of Java Mock objects based on Production observations
Effectiveness of diverse coverage tools for Java
Amplifying Kotlin library test suites with client usages
From JSON to Java records

Software diversification


Diverse-double compilation for JIT compilers
Neural diversification
Using generative AI to improve software substitutability
Build Integrity with N-Version Continuous Integration
Diverse execution environments with infrastructure as code
Github copilot for automatic diversity
Diversifying a npm registry
Polymorphing GraphQL queries
Diverse Multi-compilation for Trusting trust
Java – Kotlin translation to diversify bytecode
Automatic generation of 1 Million libc
Automatic synthesis of diverse replacements for Java expressions
Superdiversifying SHA256
Automatic diversification of Kafka

Off the beaten track

The software supply chain of creative coding
Code by singing in eso-lang
Github repositories with literary references
The anatomy of the most Enterprise email client
Easter egg VM flag
Web stalker: deconstructing modern browser technology (remix)
Paint Splatters & Perl Programs (remix)

Software hardening

Fixing yanked releases in npm

Package managers such as npm, rubygems, or cargo support ‘yanking’ a specific release of a package. This can be for security or legal reasons, or even as a form of protest [3]. However, when a release is yanked, all projects that depend on it fail to build, which can have catastrophic consequences when the release is widely used [3]. A recent study shows that 9.6% of the packages in Cargo have at least one yanked release [1]. In this project, we analyze the top 10000 npm packages by downloads to determine the number of yanked releases in the npm ecosystem. We also analyze how dependent projects fix their builds when these releases are yanked.

Debloating Rust programs

The Rust and Cargo ecosystem is growing in low-level programming, thanks to Rust’s memory safety and comprehensive compiler. Rust is now a language of choice for developing kernel features or embedded systems. These applications have drastic constraints on size, so it is important that Rust programs are as small as possible.
In this work, we develop novel techniques to reduce bloat in Rust programs, starting from debloating unnecessary crates at build time.

Adversarial rebuild

Reproducible builds are an essential concept to ensure the integrity of the software supply chain [1]. Yet, setting up an automated pipeline for reproducible builds is extremely challenging because of the numerous platform specificities and sources of randomness that can occur in a build. Lamb and Zacchiroli introduced the concept of ‘adversarial rebuild’, which aims at assessing whether a build is actually reproducible. This concept has been implemented for Debian with the reprotest tool, which builds the same source code twice in different environments and then checks the binaries produced by each build for differences.
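
The build-twice-and-compare idea can be sketched as follows. This is a minimal illustration, not the way reprotest is implemented: it assumes the two builds have already been run and that their outputs sit in two directories, and the function names `digest_tree` and `diff_builds` are hypothetical.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_builds(first: Path, second: Path) -> list[str]:
    """Return the relative paths that differ or exist in only one build."""
    a, b = digest_tree(first), digest_tree(second)
    # A path missing on one side yields None from .get() and counts as a difference.
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result means the two output trees are bit-for-bit identical. Any listed path points at an artifact that is sensitive to the environmental change between the two builds.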

The objective of this work is to determine which environmental changes most strongly perturb a reproducible build. We will collect a set of projects that have set up a reproducible build pipeline. Then, we will explore diverse environmental changes and study their effect on the reproducibility of the build.

Air-gapped software builds
Supervisors: Benoit Baudry, Martin Monperrus, KTH Royal Institute of Technology

Air-gapped software development is done by the military and in similar highly sensitive environments. Modern software builds typically require Internet connectivity, and a typical build involves thousands of network requests. How can these opposing requirements be reconciled? In this thesis, you will design, implement and evaluate an infrastructure for air-gapped software builds.

Software integrity with runtime SBOMs
Supervisors: Benoit Baudry, Martin Monperrus, KTH Royal Institute of Technology

A software bill of materials (SBOM) is an inventory of the software components that are reused in an application, e.g., third-party libraries. With the growing awareness about the risk of software supply chain attacks, several standards have emerged to compute the static SBOM of an application. This is essential to identify the presence of risky components in the supply chain. Yet, malicious components can be introduced through the compilation and deployment phases. In this project, we investigate the feasibility of collecting the runtime SBOM of an application to mitigate this risk. The student will experiment with and contribute to jbom [2] to provide a sound technique to detect discrepancies between the static and the dynamic SBOM.
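
As a rough illustration of what a discrepancy between a static and a runtime SBOM looks like, the sketch below compares two sets of components. It does not reflect jbom's data model or output format: the `Component` type, the `sbom_discrepancies` function and the idea of identifying a component by name and version only are simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One SBOM entry, identified here by name and version only (an assumption)."""
    name: str
    version: str


def sbom_discrepancies(static: set[Component], runtime: set[Component]) -> dict[str, set[Component]]:
    """Split the differences between a static and a runtime SBOM into two groups."""
    return {
        # Observed at runtime but absent from the static inventory: candidates for inspection.
        "runtime_only": runtime - static,
        # Declared statically but never observed at runtime: possibly unused components.
        "static_only": static - runtime,
    }
```

The `runtime_only` group is the one that matters for integrity, since it contains code that runs without having been declared in the build.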

Embedding the software supply chain at runtime with Java classloaders
Supervisors: Benoit Baudry, Martin Monperrus, KTH Royal Institute of Technology

In Java, class loading refers to retrieving the binary form of a class or interface and constructing, from that binary form, a class object to represent the class or interface [1]. Today, different subclasses of the `ClassLoader` may implement different loading policies [2]. For example, a class loader may cache the binary representation of a class, prefetch it based on expected usage, or load a group of related classes together. These activities may not be completely transparent to a running application. In this context, determining the third-party suppliers of classes loaded at runtime allows for controlling and hardening the software supply chain of third-party components used during program execution. Monitoring the origins of the “actually” executed code is a critical task for building more reliable and secure systems. The student will design and implement a novel software tool to build a representation of the software supply chain at runtime.

Ultra small code with GraalVM and debloating
GraalVM compiles Java code to native, boosting deployment and runtime performance. Meanwhile, code debloating [2] removes unnecessary code from applications, reducing code size and attack surface. Both techniques are actively researched in the Java ecosystem [2,3]. In this work, we will use both techniques in conjunction to take code reduction one step further. We will experiment with debloating before, as well as after, the GraalVM compilation to understand where the largest code size savings can be achieved. Quarkus [4] might be used to reduce code size one step further.

Full-stack debloating for a video conferencing system
Software bloat is data and code that accumulates over time and yet is not necessary for an application to behave correctly. Several techniques have been proposed in recent years to detect and remove bloat. These techniques complement each other since they analyze bloat at different levels of the software stack (libraries, containers, kernel, etc.). Yet, no previous work has studied the combined effect of these techniques.
For this thesis, you will apply different debloating techniques such as DepClean [1], docker-slim [2] and unikernels [3]. You will measure the effects of each technique and their combination on the Jitsi video conferencing system.

Reproducible builds for Maven
Supervisors: Benoit Baudry, Martin Monperrus, KTH Royal Institute of Technology

Build reproducibility is an essential property for secure software supply chains [1]. There is ongoing effort in some Linux distributions, in particular Debian, to ensure reproducible builds [2]. In the Java world, there is little work on this topic and no clear understanding of the problem. You will design, perform and analyze an experiment to assess the status quo of reproducible builds in Java, and develop a tool to improve build reproducibility.

Leveraging the diversity of bundlers for debloating JavaScript applications

JavaScript is the most used programming language for the development of web applications. As a web application grows, so does its bundle size, primarily due to its third-party dependencies [1,2]. A bundler is a tool that transforms all the JavaScript code and its dependencies into a new output file with everything merged (including other files such as HTML, CSS, and PNG). There are many production-ready JavaScript bundlers (e.g., Webpack, Rollup, Browserify, esbuild, and Parcel). They can perform optimizations on the bundle, such as tree shaking, scope hoisting, bundle splitting, and minification [4]. However, the size reduction achieved by a bundler is limited by its own code minimization technique [3]. The student will perform an experimental study to leverage the diversity of JavaScript bundlers in order to reduce the original code size of applications while keeping the functionality required to pass all test cases in their test suites.

  • [1] Slimming JavaScript Applications: An Approach for Removing Unused Functions From JavaScript libraries (JSS), 2019
  • [2] Evolving JavaScript Code to Reduce Load Time (TSE), 2021
  • [3] Stubbifier: Debloating Dynamic Server-Side JavaScript Applications (ArXiv), 2021
  • [4] https://webpack.js.org/guides/tree-shaking/

Automatic specialization of the Java Runtime (JRE)

The Java Runtime Environment (JRE) is a great, general-purpose execution engine, which provides the standard Java libraries.
Because it is general purpose, it offers far more functionality than a single Java application running in the JRE needs.
You will design and experiment with a system that automatically specializes the JRE for a specific Java application, using jcov [1] to identify the parts that are necessary and the parts that can be removed. This topic contributes to hardening the software supply chain through debloating [2] and specialization of the software stack [3].

Systematic decompilation in the CI to mitigate supply chain attacks
Supervisors: Benoit Baudry, Martin Monperrus, KTH Royal Institute of Technology

Supply chain attacks [1] represent a growing threat to software systems, as illustrated by the SolarWinds attack in late 2020 [2].
One such attack consists in tampering with the code at one point in the automatic build pipeline, in order to inject malicious code into the binary.
In this work, we investigate the systematic disassembly of binaries [3] at the end of the build pipeline to detect the injection of malicious code.

API specialization in Kotlin
Supervisors: Benoit Baudry, Cesar Soto-Valero, KTH Royal Institute of Technology

Software applications rely on numerous third-party APIs to reuse existing features (e.g., data processing, security, network, etc.).
Yet, applications use only a small part of the APIs.
The unused parts represent unnecessary risks for the security and reliability of the application.

In this project, we investigate API specialization to mitigate these risks [1].
This technique first determines the legitimate usages of an API, to build a sense of self [3] for the application's API usage.
Then, the specialization consists in building a proxy that blocks all other API usages at runtime.
This project focuses on specialization for Kotlin APIs [2].
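
The second step, a proxy that only lets legitimate usages through, can be illustrated with a small language-agnostic sketch, written here in Python and not in Kotlin. The allow-list is assumed to come from the first step. `SpecializedApi` and `FileApi` are hypothetical names, and `FileApi` merely stands in for a third-party API.

```python
class FileApi:
    """Hypothetical third-party API with more features than the application needs."""

    def read(self, name):
        return f"contents of {name}"

    def delete(self, name):
        return f"deleted {name}"


class SpecializedApi:
    """Proxy that forwards only allow-listed method calls to the wrapped API."""

    def __init__(self, target, allowed):
        self._target = target
        self._allowed = frozenset(allowed)

    def __getattr__(self, name):
        # Only invoked for names not found on the proxy itself, i.e. API members.
        if name not in self._allowed:
            raise PermissionError(f"API usage blocked: {name}")
        return getattr(self._target, name)
```

With `api = SpecializedApi(FileApi(), allowed={"read"})`, a call to `api.read("a.txt")` is forwarded to the wrapped object, whereas `api.delete("a.txt")` raises `PermissionError` before reaching the underlying API.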

https://github.com/topics/fake

Software testing

Automatic test generation for Rust

The popularity of the Rust programming language is constantly growing in various sectors, from embedded systems to creative coding. Meanwhile, there is little support for automatic test generation in Rust. In this work, we evaluate the robustness of state-of-the-art solutions [1,2]. Then, we develop novel techniques to automatically enhance the test suites of Rust programs with variant test cases written in the idiomatic style of Rust automated tests [4].

Forging test results to tamper with open-source projects

The large open source software supply chains of many applications have turned open source repositories into targets of choice for the introduction of malicious code [1]. As mature open source projects use continuous integration, stealthy code tampering should also ensure that the test suite passes. While a modification of the test suite might appear as a red flag to the open source community, another solution consists in forging the test results [2]. For example, a change in the continuous integration pipeline can turn some failing test cases into passing ones.
In this work, we investigate different strategies to forge test suite results in order to mask ill-intended changes in the source code.

  • [1] On Omitting Commits and Committing Omissions: Preventing Git Metadata Tampering That (Re)introduces Software Vulnerabilities
    https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/torres-arias
  • [2] in-toto: Providing farm-to-table guarantees for bits and bytes
    https://www.usenix.org/system/files/sec19-torres-arias.pdf

Test Generation for Ethereum Clients Using Production Data
Supervisors: Martin Monperrus, Benoit Baudry

Description: Unit testing is one of the essential ways to improve the quality of software. It is also helpful for correctness checking when there are different implementations of the same software specification. Take Ethereum clients as an example: thousands of common tests [1] are provided for all Ethereum client projects. Though these tests already cover various cases, some corner cases that occur in production are missing from the test suite [2]. In this thesis project, you will design, implement and evaluate a prototype that collects production data and generates new valuable test cases for Ethereum clients.

Code Coverage in Production
Supervisors: Martin Monperrus, Benoit Baudry

Description: Code coverage usually refers to the code exercised by tests. Production code coverage is the coverage achieved by real interactions made by users in production. Obtaining and analysing production code coverage makes it possible to identify useless code as well as relevant test data and values. It enables testers and developers to better align the test intentions with what matters for users. The student will compare and analyze techniques for automatically collecting code coverage in production for Java software.

Live analysis of WebAssembly in the browser
Supervisors: Benoit Baudry, Javier Cabrera-Arteaga, KTH Royal Institute of Technology

WebAssembly is rapidly conquering the world of web technology [1].
Its safe and compact binary format provides great support to consolidate existing applications and to boost the migration of legacy apps to the browser [3].

In this project, we will investigate which WebAssembly binaries arrive in web browsers.
The project includes the development of efficient technology to collect wasm files live in the browser.
The second part consists in analyzing the live coverage of these files, as well as their purpose.

NumPy: boosting the test suite of the Python numerical analysis package

NumPy is a fundamental package for scientific computing with Python, as well as an excellent illustration of state-of-the-art software engineering [1]. For example, the NumPy community uses four different continuous integration systems [2]. Its crucial importance for science calls for a rock-solid test suite, in order to ensure the validity and reproducibility of scientific experiments.
You will dive deep into the test suite of NumPy and aim at making it stronger through a systematic assessment of the test cases. You will investigate the presence of pseudo-tested methods [3] and contribute test improvement to NumPy’s test suite.
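
To make the notion concrete: a method is commonly called pseudo-tested when the test suite executes it, yet no test fails when its body is replaced by a trivial one. The sketch below illustrates this on a toy function. It is not taken from NumPy or from any existing tool, and all names (`clip`, `weak_test`, `strong_test`, `is_pseudo_tested`, `variants`) are hypothetical.

```python
def clip(value, low, high):
    """Function under test: clamp value into the interval [low, high]."""
    return max(low, min(value, high))


def weak_test(fn):
    # Executes fn, so fn counts as covered, but never checks the result.
    fn(5, 0, 10)


def strong_test(fn):
    # Checks the returned values, so a stripped body is detected.
    assert fn(15, 0, 10) == 10
    assert fn(-3, 0, 10) == 0


# Extreme transformations: the whole body is replaced by a constant return value.
variants = [lambda value, low, high: 0, lambda value, low, high: None]


def is_pseudo_tested(tests, extreme_variants):
    """True if every body-stripped variant passes every test."""
    for variant in extreme_variants:
        for test in tests:
            try:
                test(variant)
            except AssertionError:
                return False  # at least one test notices the transformation
    return True
```

With only `weak_test` in the suite, `clip` is covered but pseudo-tested. Adding `strong_test` makes the stripped variants fail, which is the kind of test improvement the project aims to contribute.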

Automatic synthesis of Java Mock objects based on Production observations

Mock objects are highly valuable to create predictable test environments, which speed-up test execution and limit flaky tests. Yet, the development of relevant mock objects is challenging, since there is currently no support to determine the validity or value of manually selected values for mocks.
You will design a system that observes an application in production in order to collect real program state values that will then be turned into mock objects. This system will leverage efficient observability technology [3] in order to contribute to the state of the art of automated test generation [1,2].
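
The observe-then-mock idea can be sketched in a few lines. The project targets Java; the sketch below uses Python and its standard `unittest.mock` module only to illustrate the principle. `Recorder`, `mock_from_observations` and `PriceService` are hypothetical names, and arguments are assumed to be hashable.

```python
from unittest.mock import Mock


class PriceService:
    """Hypothetical collaborator whose behavior is observed in production."""

    def price(self, item):
        return {"apple": 3, "pear": 5}[item]


class Recorder:
    """Wraps a collaborator and logs (method, args, result) for each call."""

    def __init__(self, target):
        self._target = target
        self.observations = []

    def __getattr__(self, name):
        method = getattr(self._target, name)

        def wrapper(*args):
            result = method(*args)
            self.observations.append((name, args, result))
            return result

        return wrapper


def mock_from_observations(observations):
    """Build a Mock that replays the recorded result for each recorded call."""
    table = {}
    for name, args, result in observations:
        table.setdefault(name, {})[args] = result
    mock = Mock()
    for name, by_args in table.items():
        # Default argument binds the per-method table; unseen arguments raise KeyError.
        getattr(mock, name).side_effect = lambda *args, _by_args=by_args: _by_args[args]
    return mock
```

After wrapping the real service in a `Recorder` and exercising it, `mock_from_observations(recorder.observations)` yields a mock that returns the observed values for the observed arguments, without contacting the real service.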

Effectiveness of diverse coverage tools for Java

Code coverage is a key metric to assess test suite quality as well as to perform dynamic analyses [1]. Yet, there exists a variety of coverage tools, each with its strengths and quirks [1,2].
You will design and perform a systematic analysis of the main coverage tools for a specific programming language, e.g. Java [2,3,4], in order to determine which is the most appropriate combination of tools for the most accurate measurement of full coverage.

From JSON to Java records
Pankti records program states in production in order to generate differential unit tests that can improve the original test suite of an application [1]. Currently, the states are serialized in JSON, and the generated test then includes instructions to deserialize the objects. In this thesis, you will investigate how to generate Java records [2] as part of the test harness. This will make test cases more readable, since they will not be overloaded with deserialization instructions. Java records were introduced as a preview feature in Java 14 and finalized in Java 16. They aim to simplify the way we create POJOs (Plain Old Java Objects).

Amplifying Kotlin library test suites with client usages

Third-party libraries are at the core of the software supply chain [1]. Their test suites are essential to ensure the quality of this infrastructure.
One solution to consolidate these test suites consists in carving additional test cases by running the clients of these libraries [2].
You will design, implement and evaluate a test carving tool for Java libraries [3].

Software diversification

Diverse-double compilation for JIT compilers

Just-in-Time (JIT) compilation plays a crucial role in optimizing the performance of modern software programs. However, JIT compilers are also targets for trusting trust attacks. This thesis aims to investigate the benefits of a diverse-double compilation (DDC) approach to mitigate those attacks. You will design, implement and evaluate DDC for a Java JIT compiler.

Neural diversification
Supervisors: Javier Cabrera, Benoit Baudry, Martin Monperrus

Automatic code generation is boosted by generative AI and large language models [1]. These new abilities are used daily, letting software developers focus on the design and creative parts of development. In this work, we are interested in the ability of these models to generate multiple variants of the same functionality. The goal is to revisit program synthesis for automatic software diversification [2], through the lens of generative AI.

Using generative AI to improve software substitutability

Supervisors: Javier Ron, Benoit Baudry, Martin Monperrus

Software substitutability is a property which measures how readily a software component can be replaced by a different but equivalent component [1]. In software supply chains, it is critical for faulty or vulnerable components to be replaced as quickly as possible. However, software substitutes might not be immediately available. Generative AI tools like ChatGPT may be used to efficiently produce software substitutes in diverse programming languages/paradigms [2]. In this work, we assess the feasibility of using generative AI tools to enhance the substitutability of components in software supply chains.

Build Integrity with N-Version Continuous Integration

Some software projects with strong reliability and security constraints build their product with more than one build pipeline. This is also an approach to address the challenge of trusting trust [1]. For example, the NumPy open source project for scientific computing uses four continuous integration systems [2]. Following an attack against its Orion product, the SolarWinds company started using diverse build systems [3]. In this work, the student will experiment with integrating diversity in existing build pipelines. For example, the student will investigate duplicating a Travis CI pipeline with GitHub Actions and assess the impact of this diversity of build technology.
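
One way to exploit N build pipelines for integrity is to compare the digests of the artifacts they produce and flag any pipeline that disagrees with the majority. The sketch below is a minimal illustration under a strong assumption: the build is reproducible, so honest pipelines produce identical artifacts. The function name `check_agreement` and the pipeline names in the usage note are hypothetical.

```python
from collections import Counter


def check_agreement(digests: dict[str, str]) -> tuple[str, list[str]]:
    """Given pipeline name -> artifact digest, return the majority digest
    and the names of the pipelines that disagree with it."""
    if not digests:
        raise ValueError("no pipeline results")
    majority, votes = Counter(digests.values()).most_common(1)[0]
    # Require a strict majority; otherwise no digest can be trusted over another.
    if votes * 2 <= len(digests):
        raise ValueError("no strict majority among pipelines")
    dissenters = sorted(name for name, d in digests.items() if d != majority)
    return majority, dissenters
```

For instance, with three pipelines where two report the same digest, the third pipeline is returned as a dissenter and its artifact deserves inspection. With only two pipelines that disagree, the function raises an error, since the vote cannot tell which one was tampered with.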

Diverse execution environments with infrastructure as code

Infrastructure as code is about provisioning execution resources through executable configuration files [1]. In this context, the execution of a program provisions a whole environment to execute an application. A variation of the same program will provision a different environment to run the same application. In this project, the student will explore transformations for infrastructure as code with the intention of creating a moving target at the environment level [2]. We consider using Modus to define the infrastructure [3].

Github copilot for automatic diversity

GitHub Copilot, a.k.a. an AI pair programmer, generates suggestions for lines of code or entire functions [1]. It is based on an immense set of code written by human developers in order to synthesize new code in a new context. In this work, we wish to experiment with these techniques in order to replace existing code snippets written by developers with synthetic ones. The objective is to generate program variants that are semantically similar but whose executions differ.

Diversifying a npm registry

Supervisors: Benoit Baudry, Martin Monperrus, KTH Royal Institute of Technology

Dependency confusion is a growing threat to the software supply chain [1]. This attack consists in uploading malicious packages to public repositories, which will eventually be packaged in applications through dependency resolution mechanisms. In this work, we will explore the automatic randomization of instructions [3] in private npm registries to mitigate dependency confusion [2]. The student will deploy a local npm registry and an instruction randomization scheme, along with the adaptation of the JavaScript engine to correctly execute the randomized packages.

Polymorphing GraphQL queries

Supervisors: Benoit Baudry, Martin Monperrus, KTH Royal Institute of Technology

GraphQL is increasingly adopted for web APIs [1], making it a good target for exploits [2]. In this work, we investigate polymorphing to harden GraphQL APIs [3]. The student will develop a randomization scheme for the API and the corresponding adaptation of the client queries in order to build an effective protection against injection attacks.

Diverse Multi-compilation for Trusting trust

Supervisors: Benoit Baudry, Martin Monperrus, KTH Royal Institute of Technology

The problem of deceptive compilers introducing malicious code is relevant and hard [1,2]. One solution is to use multiple diverse compilers to mitigate the problem [3]. For instance, one can compile a C program with both GCC and Clang. You will design, implement and evaluate a multi-compiler scheme for C.

Automatic generation of 1 Million libc

libc is at the core of most software stacks, but it is fragile and prone to critical vulnerabilities [1]. In this work, we explore a combination of techniques to generate large amounts of diverse implementations of libc [2]. The student will combine the abundant combinations of flags of C compilers [3] with state-of-the-art code transformation and obfuscation techniques [4] to generate many libc variants.

Java – Kotlin translation to diversify bytecode

Supervisors: Benoit Baudry, Martin Monperrus, KTH Royal Institute of Technology

The transition from Java to Kotlin is a timely and hard problem [1].
In this work, we explore the natural diversity of translation strategies from Java to Kotlin [2], as well as the diversity of compilation options of kotlinc [3] and javac [4]. The goal is to assess the ability of these strategies to generate diverse versions of Java bytecode for the same piece of source code.

Superdiversifying SHA256

Software diversity increases the robustness of software systems [1]. Through various transformations and randomization, it is possible to automatically generate variants of a program. These variants should have minimal impact on convenience, usability, and efficiency. Meanwhile, the variants should not all be susceptible to the same bug or vulnerability.
In this project, we explore the large-scale diversification of SHA256 [2]. This hash function, a member of the SHA-2 family, is essential for cryptography, and hence a critical feature for security. The student will investigate superdiversification [3] and the composition of multiple diversification techniques, in order to synthesize large amounts of variants for an implementation of SHA256.

Automatic diversification of Kafka

Automatic software diversity consists in generating multiple variants of an application, which provide the same functionality, with diverse implementations.
The goal is to minimize the risks of having a single point of failure.
In this project, we aim at automatically synthesizing diverse variants of applications that stream data with Kafka [1]. Diversification will be on Kafka itself, e.g., build the application with different versions of Kafka. We will also leverage the natural emergence of the Kafka compatible streaming library, Redpanda [2].

  • [1] Kafka
  • [2] redpanda
  • [3] The multiple facets of software diversity: Recent developments in year 2000 and beyond

Off the beaten track

The software supply chain of creative coding

Artists use advanced software technology to produce, distribute and generate artworks. Such software technology includes libraries for sound synthesis [1], visual art [2,3], augmented reality [4], as well as platforms to distribute artworks [5,6]. In this work, we dive deep into this software ecosystem to draw a systematic landscape of the software supply chain [7] for creative coding.

Code by singing for eso-lang

The progress of voice recognition and speech-to-text technology is fabulous. It opens the way towards coding by voice, a very promising advance to open the world of programming to a wider population [1].
In this thesis, we will explore the possibilities of writing code by singing. This master thesis at the intersection of software technology, signal processing and rickrolling will be disseminated as part of a growing eso-lang [2].

Github repositories with literary references

GitHub repositories are rich sources of code, documentation and discussions. They also contain amazing resources such as images, sound snippets, texts or references. A recent study has analyzed the presence of links to academic papers in GitHub repositories [1]. This study reveals the critical importance of linking code, data and publications to improve replication in computational science. In this work, we wish to explore literary references in GitHub. For example, references to Bob Dylan cited in C code or novel quotes in comments, perl -le'$_=`perldoc -T perlfaq4`,s/^.*N;(.*?)E.*$/$1/s,print'.

The study seeks to unveil the deep connection of Github with culture and society and to analyze the role of literature on software development.

Anatomy of Outlook mail

Every day we use extraordinary software objects. Examples of such objects include the Android mobile systems that run on billions of devices, the domain name system that runs the web, or the Outlook email client that lets millions of workers communicate efficiently. These objects are extraordinary in several respects: they are large, they are composed of hundreds of diverse software parts, they evolve fast, and they exist in many versions that are tailored to various needs. The massive presence of such objects, as well as the very large dimensions that characterize them, are intriguing for software developers and for users. One approach to unveil the extraordinary nature of these objects consists in breaking them down into all their components, which amounts to an anatomical analysis of the object [1,2].

In this work, we aim at building a fine-grained anatomy of an extraordinary, extremely popular software object: the Outlook email client.

Easter egg VM flag

Easter eggs, sometimes called the final frontier of software development [10]. (Except that of course you can’t have a final frontier, because there’d be nothing for it to be a frontier to, but as frontiers go, it’s pretty penultimate . . .) [269696]. And against the wash of continuous integration a commit hangs, bloated and poetic, one single, cool contribution, gleaming like the madness of gods. Nearly unreal. Reality is not digital, an on-off state, but analog. Easter eggs are for lovers and for the mind. Not enterprise, nor a resurrection, they cherish enchantment and freedom. In the quest for technology and Mastery, you will add an extra mile to the frontier with a new Easter flag for an extraordinary virtual machine [42].

[42] java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version
[10] Curated list of all the easter eggs and hidden jokes in Python.
[269696] Moving Pictures. T. Pratchett. 1990, on Monday afternoon, just before tea.

Web stalker: deconstructing modern browser technology (remix)

In 1998, Simon Pope, Colin Green and Matthew Fuller designed the Web Stalker, an alternative web browser that displays the structure of web pages instead of their content [1]. The work was driven by a strong desire to understand what happens beyond the screen and to let web users experience this understanding. Twenty years later, the web has spread massively into all aspects of our lives and the complexity of web browsers has exploded.
This project is about rethinking a web stalker in the era of modern web browsers, going from the design of a solution that leverages the architecture of these browsers [2] to the implementation of an artistic representation of web page content based on Electron [3].

Paint Splatters & Perl Programs (remix)

In 2019, Colin McMillen and Tim Toady ran an experiment to answer one question: is it possible to smear paint on the wall without creating valid Perl? This is an essential question at the forefront of the art/computing frontier.
In this project, we will reproduce McMillen’s experiment [1], starting with the curated dataset provided by the authors [2]. We will then elaborate on the findings with original splatters and an exploration of Perl’s diverse ecosystem [3].