Semester Project and Thesis

The SPRING lab offers project opportunities for BSc, MSc, and PhD students. We encourage interested students to read the Thesis & Project Guidelines from the MLO lab, which explain what students can expect from us and what we expect from students.

Last Update: 15th November 2024

How to apply

Please apply via the Google form (login may be required). You will need to specify which project(s) you are interested in, why you are interested, and whether you have any relevant experience in this area.

External students, i.e., students from neither EPFL nor ETHZ, should get in touch with the supervisors of the project(s) via email.

Applications are processed in two rounds. For each round, we collect applications before the deadline. Then, we will get back to selected applicants during the corresponding “First Contact From Supervisors” period. If we do not get back to you during the indicated period, it means that we probably do not have space anymore.

We will mark a project as taken once it has been assigned. We strongly recommend that you apply as soon as possible for best consideration, since we expect most projects to be taken after the first round. However, we will leave the form open after the second round and consider all applications if there are still available projects at that time.

Early deadline: 4th December 2024

First Contact From Supervisors: 5th December 2024 - 16th December 2024

Late deadline: 1st February 2025

First Contact From Supervisors: 2nd February 2025 - 16th February 2025

Notes:

Note that projects can be updated or added throughout the application period. We recommend that you check this page regularly for updates.

If you encounter any technical issue, please get in touch with Saiid El Hajj Chehade.

Projects on System Security

SYSTEM1: Constant-Time Fully Homomorphic Encryption/Decryption/Key Generation Taken

Fully Homomorphic Encryption (FHE) allows for arbitrary computations to be performed on encrypted data, and is a very useful building block for more complex privacy-preserving protocols. FHE schemes are now fast enough to be deployed in practice, and are poised to be more widely deployed than ever in 2024 – 2026 thanks to the emergence of FHE hardware accelerators, various standardization efforts, and user-friendly compilers. FHE schemes are implemented in several libraries, but most libraries do not offer constant-time key-generation, encryption, and decryption operations, which can lead to devastating key-recovery attacks in practice [1-3].

Ensuring that these algorithms are actually constant-time is nigh impossible when using a high-level language (e.g., C, C++, Rust) and relies heavily on trusting the compiler and platform, which is an extremely brittle guarantee in practice. In this project, you will implement FHE key generation / encryption / decryption using the low-level Jasmin language, which automatically proves secret-independence of implementations using the EasyCrypt framework. A Jasmin implementation of ML-KEM (a key encapsulation mechanism based on similar cryptographic assumptions as FHE schemes) will provide a starting point for your implementation, but you will need to implement additional FHE-specific operations (e.g., arithmetic over polynomial rings, discrete Gaussian sampling, number-theoretic transforms). Ideally, your implementation will be compatible with an existing FHE library, and will thus be a major step forward for more secure deployments of FHE in the real world.
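As an illustration of one of the FHE-specific operations mentioned above (arithmetic over polynomial rings), here is a minimal Python sketch of schoolbook multiplication in Z_q[X]/(X^n + 1). The choice of ring, the toy parameters (n = 4, q = 17), and the function name are assumptions made for this example only. Python offers no constant-time guarantee, so this sketch only shows the arithmetic; the project itself targets Jasmin, and practical implementations would typically replace the quadratic loop with a number-theoretic transform.

```python
def negacyclic_mul(a, b, q):
    """Multiply a and b in Z_q[X]/(X^n + 1).

    Coefficients are listed in ascending degree order; n = len(a) = len(b).
    Toy schoolbook version: NOT constant-time, for illustration only.
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("operands must have the same length")
    result = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            term = a[i] * b[j]
            if k >= n:
                # X^n = -1 in this ring, so wrapped-around terms change sign.
                result[k - n] = (result[k - n] - term) % q
            else:
                result[k] = (result[k] + term) % q
    return result


# Hypothetical toy parameters: n = 4, q = 17.
x = [0, 1, 0, 0]        # the polynomial X
x_cubed = [0, 0, 0, 1]  # the polynomial X^3
product = negacyclic_mul(x, x_cubed, 17)  # X^4 = -1, i.e. 16 mod 17
```

Here `product` is the constant polynomial -1 reduced modulo 17, which shows the sign flip that distinguishes this ring from ordinary cyclic convolution.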

Requirements

Applying to this project

This research project/master’s project (PDM) is aimed at one MSc student. The student will work with Christian Knabenhans.

[1] F. Aydin, E. Karabulut, S. Potluri, E. Alkim, and A. Aysu, “RevEAL: Single-Trace Side-Channel Leakage of the SEAL Homomorphic Encryption Library”. 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), IEEE, Mar. 2022
[2] Wei Cheng, Jean-Luc Danger, Sylvain Guilley, Fan Huang, Amina Bel Korchi, et al., “Cache-Timing Attack on the SEAL Homomorphic Encryption Library”. 11th International Workshop on Security Proofs for Embedded Systems (PROOFS 2022), Sep. 2022
[3] F. Aydin and A. Aysu, “Leaking secrets in homomorphic encryption with side-channel attacks”. J. Cryptogr. Eng., Jan. 2024



Projects on Cryptography

CRYPTO1: Bringing the Fischlin transform to the real world

A well-known technique to convert an interactive proof to a non-interactive proof is the Fiat-Shamir transformation, which guarantees security in the random oracle model. An alternative transformation to achieve non-interactivity is the Fischlin transform [Fis05], which presents a number of advantages over the Fiat-Shamir transform (in particular, a straight-line, i.e., non-rewinding, extractor) [DV14, ABGR12, RT24]. The concrete runtime and implementation costs of the Fischlin transform are, however, still not well understood, although there has been some very recent progress in this direction. The goal of this project is to (i) derive guidelines to securely instantiate the Fischlin transform for real-world use cases, (ii) implement and optimize the Fischlin transform on top of the arkworks library, and (iii) compare these concrete instantiations with Fiat-Shamir implementations in terms of concrete efficiency, parameter sets, and theoretical security guarantees.
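For readers new to the baseline, the Fiat-Shamir transformation can be sketched on a Schnorr proof of knowledge of a discrete logarithm: the verifier's random challenge is replaced by a hash of the transcript. The snippet below is a minimal Python sketch, assuming a toy and completely insecure group (p = 23, q = 11, g = 2) and SHA-256 as a stand-in for the random oracle; all names and parameters are hypothetical. It illustrates Fiat-Shamir only, not the Fischlin transform, and it is unrelated to the arkworks API.

```python
import hashlib
import secrets

# Toy Schnorr group (assumed, insecure parameters): G has order Q in Z_P^*.
P, Q, G = 23, 11, 2


def challenge(*values):
    """Hash the transcript to a challenge in Z_Q (stand-in for the random oracle)."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x):
    """Non-interactive proof of knowledge of x such that y = G^x mod P."""
    y = pow(G, x, P)
    r = secrets.randbelow(Q)         # prover's ephemeral randomness
    commitment = pow(G, r, P)        # first message of the interactive protocol
    c = challenge(G, y, commitment)  # challenge derived by hashing, not by a verifier
    response = (r + c * x) % Q
    return y, (commitment, response)


def verify(y, proof):
    """Recompute the challenge and check G^response == commitment * y^c mod P."""
    commitment, response = proof
    c = challenge(G, y, commitment)
    return pow(G, response, P) == (commitment * pow(y, c, P)) % P


y, proof = prove(7)
accepted = verify(y, proof)
```

Extracting the witness from such a proof in a security argument requires rewinding the prover, which is the limitation that a straight-line extractor avoids.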

Requirements

Applying to this project

This research project is aimed at one MSc student. The student will work with Christian Knabenhans.

[Fis05] M. Fischlin, “Communication-Efficient Non-interactive Proofs of Knowledge with Online Extractors”, CRYPTO 2005
[DV14] Ö. Dagdelen and D. Venturi, “A Second Look at Fischlin’s Transformation”, AFRICACRYPT 2014
[ABGR12] P. Ananth, R. Bhaskar, V. Goyal, and V. Rao, “On the (In)security of Fischlin’s Paradigm”, Theory of Cryptography 2012
[CL24] Y.-H. Chen and Y. Lindell, “Optimizing and Implementing Fischlin’s Transform for UC-Secure Zero-Knowledge”, ePrint 2024.
[RT24] L. Rotem and S. Tessaro, “Straight-Line Knowledge Extraction for Multi-Round Protocols”, ePrint 2024.



Projects on Machine Learning

ML1: Harms of online Ads Taken

Google’s Topics API was hailed as the solution to privacy in online advertising: instead of the actual URLs users visit, it uses topics (cars, sports, wellness, ...) to select the personalized advertisements shown in the dedicated space on web pages. In this project, you will show that this mechanism gives a false sense of security and that it unfortunately does not remove harms for users, who can still be harmed or discriminated against.

This project includes a creativity/small research part and coding.

Requirements

Applying to this project

This research project is aimed at one BSc/MSc student. The student will work with Mathilde Raynal.

[1] https://brave.com/blog/mozilla-ppa/
[2] https://arxiv.org/pdf/1408.6491

ML2: Evading Content Moderation in X Taken

Content moderation is necessary to protect users from exposure to toxic (e.g., racist) content and maintain engagement. In this project, the student will implement an attack we developed on the X platform (formerly Twitter). The goal of the project is not to enable malicious users to post toxic content but to find and inform researchers about vulnerabilities.

Requirements

Applying to this project

This research project is aimed at one MSc student. The student will work with Mathilde Raynal.

[1] https://github.com/QData/TextAttack
[2] https://perspectiveapi.com/

ML3: Exploring trade-offs between privacy and multi-purpose functionality in data releases Taken

There is huge demand from governments and businesses to make data available to support innovation and competition. By definition, the best way to release the data so as to enable a broad range of use cases would be to release the raw data itself. However, raw data, even when personal identifiers are removed, is vulnerable to re-identification attacks. Alternative privacy enhancing releases such as synthetic data, aggregate statistics, and machine learning models have emerged. In particular, synthetic data promises to enable general-purpose uses of the data.

The goal of this project is to explore trade-offs between privacy and multi-purpose functionality in data releases. The student will select a couple of use cases for a synthetic data release and compute the privacy-utility trade-off for general-purpose synthetic data, traditional de-identification techniques such as k-anonymization, and targeted releases such as machine learning models and aggregate statistics. The project will involve (1) evaluating privacy using state-of-the-art attacks, e.g., [1,2], (2) implementing or re-using differentially private algorithms tailored to each data release, and (3) computing privacy-utility trade-off curves for every use case.
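To make one of the baselines concrete, the following minimal Python sketch computes the k-anonymity level of a table: the size of the smallest group of records that share the same quasi-identifier values. The records, the column names, and the function name are hypothetical and chosen only for illustration; the sketch does not cover the attacks or differentially private algorithms that the project would evaluate.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    if not records:
        raise ValueError("cannot compute k-anonymity of an empty table")
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Hypothetical, already generalized records (age ranges, truncated ZIP codes).
records = [
    {"age": "30-39", "zip": "101*", "diagnosis": "flu"},
    {"age": "30-39", "zip": "101*", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "102*", "diagnosis": "flu"},
    {"age": "40-49", "zip": "102*", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "102*", "diagnosis": "flu"},
]

k = k_anonymity(records, ["age", "zip"])
```

With `age` and `zip` as quasi-identifiers, the smallest group in this toy table contains two records, so the table is 2-anonymous. Coarser generalization raises k but reduces the utility of the release, which is one point on the kind of trade-off curve described above.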

Requirements

Applying to this project

This research project/master’s project (PDM) is aimed at one MSc student. The student will work with Ana-Maria Cretu.

[1] Houssiau, F., Jordon, J., Cohen, S. N., Daniel, O., Elliott, A., Geddes, J., … & Szpruch, L. (2022). Tapas: a toolbox for adversarial privacy auditing of synthetic data. arXiv preprint arXiv:2211.06550.
[2] Stevanoski, B., Cretu, A. M., & de Montjoye, Y. A. (2024). QueryCheetah: Fast Automated Discovery of Attribute Inference Attacks Against Query-Based Systems. arXiv preprint arXiv:2409.01992.

ML4: Reconstruction attacks against perceptual hashing algorithms

Perceptual hashing algorithms are widely used to detect edited copies of targeted content, such as child sexual abuse media (CSAM) or non-consensually shared intimate images, in social media platforms. A perceptual hashing algorithm maps an image to a fixed-size vector representation, which captures the main features of the image and is called a perceptual hash. Perceptual hashes are different from cryptographic hashes in that they are robust to small transformations applied to the image, such as grayscaling and resizing. Perceptual hashes are believed to be privacy-preserving because of the signal loss with respect to the original image, as they are typically very low dimensional and consist of bits.
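As a minimal illustration of the idea, the Python sketch below implements a simple average hash: each bit records whether a pixel is brighter than the image mean. This is one well-known, simple perceptual hash, used here as an assumption for illustration; it is not necessarily one of the algorithms studied in this project. The tiny 4x4 image is hypothetical, and real pipelines would first grayscale and resize the input.

```python
def average_hash(pixels):
    """Map a grayscale image (list of rows of intensities) to a tuple of bits."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    # One bit per pixel: 1 if brighter than the mean intensity, else 0.
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(h1, h2):
    """Number of positions at which two equal-length hashes differ."""
    return sum(b1 != b2 for b1, b2 in zip(h1, h2))


# Hypothetical 4x4 grayscale image (already resized), intensities in 0..255.
image = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 190, 35, 215],
    [18, 205, 28, 225],
]
brighter = [[v + 20 for v in row] for row in image]   # small edit: uniform brightening
inverted = [[255 - v for v in row] for row in image]  # large edit: inverted colors

distance_small_edit = hamming_distance(average_hash(image), average_hash(brighter))
distance_large_edit = hamming_distance(average_hash(image), average_hash(inverted))
```

The uniform brightening leaves every bit unchanged, while the inversion flips all 16 bits, which shows both the robustness to small transformations and how little of the original image a 16-bit hash retains.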

The goal of this project is to design and evaluate reconstruction attacks against perceptual hashes, whose goal is to recover a version of the original image given the hash, and to explore different adversary assumptions. The starting point will be to replicate existing works such as [1] or [2]. Then, the student will explore more advanced reconstruction techniques through the use of diffusion-based approaches.

Requirements

Applying to this project

This research project/master’s project (PDM) is aimed at one MSc student. The student will work with Ana-Maria Cretu.

[1] Hawkes, S. et al. Perceptual Hash Inversion Attacks on Image-Based Sexual Abuse Removal Tools. https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10762793
[2] Madden, J. et al. Robustness of Practical Perceptual Hashing Algorithms to Hash-Evasion and Hash-Inversion Attacks. https://arxiv.org/pdf/2406.00918



Projects on Network Security

NET1: Creation of a Network Capture Tool Taken

Being anonymous when browsing a website is essential to preserve freedom of speech and information. To achieve this, one can use VPNs, or employ anonymous networks. However, the metadata of exchanged packets remains exposed. In website fingerprinting attacks [1], a passive attacker uses this metadata to predict, with machine learning techniques, which website a user has accessed. These attacks may allow governments or Internet Service Providers (ISPs) to monitor communications, thereby threatening user privacy.
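To give a sense of the metadata involved, the Python sketch below summarizes a captured trace into a few simple features of the kind a classifier could consume. The trace format (timestamp, direction, size), the feature set, and the sample values are assumptions for illustration; they are not taken from [1] or from any particular capture tool.

```python
def trace_features(packets):
    """Summarize a trace of (timestamp_seconds, direction, size_bytes) tuples.

    direction is +1 for outgoing and -1 for incoming packets (assumed convention).
    """
    if not packets:
        raise ValueError("empty trace")
    outgoing = [size for _, direction, size in packets if direction > 0]
    incoming = [size for _, direction, size in packets if direction < 0]
    timestamps = [t for t, _, _ in packets]
    return {
        "n_outgoing": len(outgoing),
        "n_incoming": len(incoming),
        "bytes_outgoing": sum(outgoing),
        "bytes_incoming": sum(incoming),
        "duration": max(timestamps) - min(timestamps),
    }


# Hypothetical trace: no payload is inspected, only timing, direction, and size.
trace = [(0.0, 1, 100), (0.5, -1, 1500), (1.25, -1, 1500), (2.0, 1, 60)]
features = trace_features(trace)
```

None of these features requires decrypting the traffic, which is why encryption alone does not prevent this kind of analysis.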

The goal of this project is to design and implement a generic and easy-to-use network capture tool. The tool will be designed to be highly configurable and extensible, making it suitable for a wide range of use cases (e.g., captures of VPN, Tor, or Nym traffic).

Requirements

Applying to this project

This research project is aimed at one MSc student. The student will work with Eric Jollès.

[1] S. Siby, L. Barman, C. Wood, M. Fayed, N. Sullivan, and C. Troncoso, “Evaluating practical QUIC website fingerprinting protections for the masses”
Code: https://github.com/spring-epfl/quic-wf-defenses/tree/main



Projects on Web Security

WEB1: Measuring Tracker Response to Browser History Taken

Measuring advertisement and tracking services (ATS) in the wild is an essential step in understanding challenges to web privacy and developing useful web privacy-preserving technologies (web-PETS). Automating web measurement is a necessary step to keep up with the expanding size of the web [1].

However, web trackers are getting smarter and more capable of detecting crawlers to block them or hide their privacy-invasive activity from them [2]. To make crawlers more human and measurements more representative of what real users observe, web privacy researchers have been studying various factors that reduce the distance between automated crawlers and normal users [3].

In this project, you will investigate the impact of equipping crawlers with human-like browser histories (through cookies), and measure the difference in ATS activity compared to traditional crawlers.

This project involves implementing a flexible framework to equip crawlers with configurable browsing histories, designing the experimental setup, and running a preliminary measurement experiment.

Note: This tool is potentially useful in a parallel project offered by SPRING (ML1). Students undertaking both projects will have the opportunity to collaborate for a larger contribution.

Requirements

Applying to this project

This research project is aimed at one MSc student. The student will work with Saiid El Hajj Chehade.

[1] https://dl.acm.org/doi/abs/10.1145/2976749.2978313
[2] https://medium.com/@datajournal/avoid-detection-with-puppeteer-stealth-febc3d70f319
[3] https://dl.acm.org/doi/abs/10.1145/3366423.3380104

WEB2: Understanding Privacy Implications of Chrome Extension Updates Taken

Many browsers offer users the ability to install extensions to customize their experience, such as ad blockers, dark mode readers, and coupon finders. On Chromium-based browsers (Google Chrome, Opera, Brave, …), users can choose from the Chrome Web Store, a service operated directly by Google, without having to log in.

A large body of work has investigated extension fingerprinting, asking whether trackers installed on multiple websites can learn which set of extensions a user has installed and use it as the user’s transparent identifier to track their browsing patterns and build an advertising profile [1].

Another service that receives frequent requests regarding the user’s extension set is the CRX Chrome extension update service. Every time a browser wants to check for an extension update, it sends requests to update.googleapis.com. A study looking into backend requests from browsers found that such requests often contain persistent identifiers binding the extension update requests [2]. Additionally, users of Brave, a privacy-focused browser, have raised concerns about the Chrome Web Store being able to detect their extensions even if they are installed from a local file [3]. If a single domain can map these update requests to consistent user profiles, it can track users across all Chromium browsers.

In this work, we want to characterize the risk that the CRX update API poses to users and which threat models could compromise their privacy and anonymity. This work will involve a deep dive into the API’s documentation, a small-scale experimental setup to capture such requests and analyze various Chromium-based browsers, and a proposal for a client-side mitigation, if any.

Requirements

Applying to this project

This research project is aimed at one BSc/MSc student. The student will work with Saiid El Hajj Chehade.

[1] https://par.nsf.gov/servlets/purl/10167717
[2] https://ieeexplore.ieee.org/abstract/document/9374407
[3] https://community.brave.com/t/google-keeps-track-of-installed-extensions/506735