# Open Thesis Topics

Our research group offers a variety of projects (Bachelor theses, Practicals, Master theses) on the following topics:

- Causality and causal inference
- Machine learning and causal modeling in cognitive neuroscience
- Brain-Computer Interfaces (BCIs) for communication and rehabilitation

Below is a list of open topics. If you are interested in a particular topic, please send an email to the contact person listed underneath the project description. If you would like to suggest a topic of your own, please contact moritz.grosse-wentrup@univie.ac.at.

### Projects for P1 and/or P2

**State-action pair extraction from training in RL test environment (IBM drone environment, OpenAI gym, etc.) (P1)**

One of our goals is to find human-intelligible explanations for the actions of an RL agent. To do so, we aim to find low-dimensional and interpretable representations of the states in which the RL agent takes its actions. To apply certain transformations to the states (we hypothesize that such transformations render states interpretable), we require state-action pairs encountered in the learning process. In this project, you are asked to train an RL agent in a test environment such as OpenAI gym, or any other suitable environment such as a car simulator, and to build a pipeline that provides a dataset of state-action pairs. Furthermore, developing a map from the action space to an interpretable action space is highly desirable. Depending on the size of the original action space and the codomain of such a map, it can be derived manually, e.g. by assigning labels like ‘right turn’ to certain degrees of steering wheel rotation.
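As a rough illustration of what such a pipeline could look like, the sketch below logs state-action pairs and attaches a manually derived action label. Everything in it is an assumption made for illustration: `ToyLaneEnv` is a hypothetical stand-in for a real simulator, `noisy_centering_policy` is a placeholder for a trained agent, and the 5-degree dead zone in `label_action` is an arbitrary threshold.

```python
import random


def label_action(steering_deg, dead_zone=5.0):
    """Map a continuous steering angle (degrees) to a coarse, readable label.

    The dead zone is an arbitrary, illustrative threshold.
    """
    if steering_deg > dead_zone:
        return "right turn"
    if steering_deg < -dead_zone:
        return "left turn"
    return "straight"


class ToyLaneEnv:
    """Hypothetical simulator: the state is the lateral offset from the lane
    centre, the action is a steering angle in degrees."""

    def reset(self, rng):
        self.offset = rng.uniform(-1.0, 1.0)
        return self.offset

    def step(self, steering_deg):
        self.offset += 0.01 * steering_deg
        done = abs(self.offset) > 2.0  # the car has left the road
        return self.offset, done


def noisy_centering_policy(state, rng):
    """Placeholder for a trained agent: steer back to the centre, plus noise."""
    return -20.0 * state + rng.gauss(0.0, 3.0)


def collect_pairs(env, policy, episodes, max_steps, seed=0):
    """Roll out the policy and record every (state, action, label) triple."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(episodes):
        state = env.reset(rng)
        for _ in range(max_steps):
            action = policy(state, rng)
            dataset.append(
                {"state": state, "action": action, "label": label_action(action)}
            )
            state, done = env.step(action)
            if done:
                break
    return dataset


pairs = collect_pairs(ToyLaneEnv(), noisy_centering_policy, episodes=5, max_steps=50)
```

In an actual project, the toy environment and policy would be replaced by the chosen simulator and the trained agent, while the logging loop and the labeling function could stay largely the same.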

Contact person: Christoph Luther

**Benchmarking Causal Structure Learning for inference of conditional independencies (P1 or P2)**

Causal graphs, typically directed acyclic graphs, are a great way to visualize dependencies between the random variables of a probability distribution. They come with their own representation of statistical independence, called d-separation. Under the two standard assumptions that the given distribution satisfies the Markov assumption and is faithful w.r.t. the graph, d-separation and conditional independence are equivalent. Since there is a plethora of algorithms to efficiently infer such graphs from data, and conditional independence testing directly from data is a challenging problem, this project asks whether the procedure of first estimating such a graph from data and then (efficiently) reading off d-separations can replace explicit conditional independence tests. For the project, you are asked to acquaint yourself with causal structure learning algorithms, apply (already implemented) methods to data, and read off d-separations (conditional independences). You shall then compare the results to those of standard conditional independence tests (such as partial correlation tests for Gaussian data) and evaluate which approach is more efficient as well as more accurate. In order to have a ground truth to compare to, you would mostly rely on synthetic data sampled from a known distribution.
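The two routes being compared can be sketched on a toy example. The code below is only an illustration under stated assumptions: the graph is given rather than learned, the data come from a hand-picked linear Gaussian chain X → Z → Y with coefficients of 0.8, and the helper names `d_separated` and `partial_corr_test` are hypothetical. It reads off a d-separation using the standard moralized-ancestral-graph criterion and runs a partial correlation test with a Fisher z-transform on the same variables.

```python
import math
import random
from statistics import NormalDist


def d_separated(parents, x, y, given):
    """Return True if x and y are d-separated by `given` in the DAG.

    `parents` maps each node to the set of its parents.
    """
    # 1. Collect the ancestral set of x, y and the conditioning set.
    relevant, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(parents.get(node, ()))
    # 2. Moralize: link each child to its parents and "marry" co-parents.
    adj = {node: set() for node in relevant}
    for child in relevant:
        ps = sorted(parents.get(child, ()))
        for i, p in enumerate(ps):
            adj[child].add(p)
            adj[p].add(child)
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # 3. d-separated iff every path from x to y hits the conditioning set.
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        for nb in adj[node] - set(given) - seen:
            seen.add(nb)
            stack.append(nb)
    return True


def pearson(a, b):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr_test(x, y, z):
    """Partial correlation of x and y given one variable z, with a two-sided
    Fisher-z p-value (valid for Gaussian data)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    r = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    stat = math.sqrt(len(x) - 1 - 3) * math.atanh(r)  # one conditioning variable
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return r, p_value


# Synthetic ground truth: the chain X -> Z -> Y (coefficients are arbitrary).
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
z = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [0.8 * zi + rng.gauss(0, 1) for zi in z]
chain = {"X": set(), "Z": {"X"}, "Y": {"Z"}}

graph_says_independent = d_separated(chain, "X", "Y", {"Z"})
r_partial, p_value = partial_corr_test(x, y, z)
```

A benchmark along these lines would replace the hand-written graph with one estimated by a structure learning algorithm and repeat the comparison over many graphs, sample sizes, and conditioning sets.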

*Depending on the extent of the benchmark and the coverage of theory, the project can be either P1 or P2.*

Contact person: Christoph Luther

**Causal Structure Learning with Deep RL on calcium imaging data of C. elegans (P1)**

Machine learning and reinforcement learning (RL) have had a substantial impact on research. In this project, we will build on the paper by Kim & Shlizerman (2020) and aim to infer connectivity structure from data using their Deep RL framework, making use of calcium imaging data from the worm C. elegans.

Reference: Kim, J., & Shlizerman, E. (2020). Deep Reinforcement Learning for Neural Control. <http://arxiv.org/abs/2006.07352>

Contact person: Sadiq Adedayo

**Implementations of existing Causal Structure Learning algorithms (P1 or P2)**

Existing causal structure learning (CSL) algorithms range from constraint-based to score-based methods. In this project, you will implement one or more selected algorithms. Implementations already exist in other programming languages, e.g. R or MATLAB. Your task will be to translate them to Python or Julia and prepare them for use as a toolbox for benchmarking these algorithms on synthetic and experimental datasets. You may also propose improvements to the efficiency of these algorithms.

Contact person: Sadiq Adedayo

**Causal structure learning on NEST data (P1/P2) and creating tailor-made independence tests for spiking neuronal data (Bachelor thesis)**

To what extent can we infer cause-effect relations between features in a time-series dataset? In this project, you will apply a well-known causal structure learning algorithm to simulated neuronal data. You will then compare your results with the true neuronal network and evaluate the performance of the algorithm, identifying scenarios in which its performance is inadequate. Those who wish to go a step further can try to enhance the performance of the algorithm by creating novel independence tests that are better suited to the 'firing' nature of neuronal data.

Contact person: Akshey Kumar

**Auditory cortex clustering**

This project is in collaboration with the Brain and Language Lab (Narly Golestani) at the Vienna Cognitive Science Hub. The aim is to test different clustering algorithms on a dataset of structural magnetic resonance imaging data of the auditory cortex, focusing on Heschl's gyrus (a main “language area” in the brain). The first goal is to cluster a large group of participants in a data-driven manner and find an optimal clustering that is relatively consistent across clustering algorithms and can deal with missing data. Once this is done, the clustering can be used to predict demographic and language aptitude information.

Contact person: Jozsef Arato

**Eye-movement similarity**

The goal is to participate in the development of a Python package for eye-movement data analysis (fixations, scanpaths). The project is based at the Vienna Cognitive Science Hub and involves working on data from art history, psychology, and open eye-movement datasets. The aims are to add new functionality to the package (e.g. improved time-series analysis), make it more user-friendly, test the algorithms on different datasets to arrive at the best default settings, and work toward an eventual publication as open-source software.

Contact person: Jozsef Arato

### Projects for P2

**Online EEG artifact detection using Riemannian methods (P2)**

Brain signals, measured with EEG, are usually accompanied by signals from non-neuronal sources, so-called artifacts. To decode brain signals only, you must detect and remove artifact-contaminated signals from your EEG data. For this project, we will use Riemannian geometry-based methods as models for decoding and artifact removal. Your task will be to use and adapt existing Riemannian models and apply them to an online scenario, meaning you will analyze the EEG data during an ongoing recording session. The EEG data analysis and online decoding will be conducted using Python and OpenViBE (a software platform for online data analysis).
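To give a flavour of the Riemannian approach, the sketch below flags an epoch as an artifact when its spatial covariance matrix lies far from a reference covariance in the affine-invariant Riemannian metric. It is a simplified illustration and not the models used in the project: the data are synthetic, the identity reference covariance and the threshold of 1.0 are arbitrary assumptions, and the function names are hypothetical.

```python
import numpy as np


def riemann_distance(A, B):
    """Affine-invariant Riemannian distance between two symmetric
    positive-definite matrices A and B."""
    w, V = np.linalg.eigh(A)
    A_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    # Eigenvalues of A^{-1/2} B A^{-1/2} coincide with those of A^{-1} B.
    lam = np.linalg.eigvalsh(A_inv_sqrt @ B @ A_inv_sqrt)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))


def is_artifact(epoch, reference_cov, threshold):
    """Flag one epoch (channels x samples) whose covariance is far from the
    reference covariance."""
    return riemann_distance(reference_cov, np.cov(epoch)) > threshold


rng = np.random.default_rng(0)
reference_cov = np.eye(4)              # assumed estimate from clean calibration data
clean = rng.standard_normal((4, 500))  # synthetic stand-in for a clean EEG epoch
blink = clean.copy()
blink[0] *= 10.0                       # crude stand-in for a high-amplitude artifact

clean_flag = is_artifact(clean, reference_cov, threshold=1.0)
blink_flag = is_artifact(blink, reference_cov, threshold=1.0)
```

In an online setting, the same check would be applied to each incoming epoch as it arrives, with the reference covariance and threshold estimated from calibration data rather than fixed by hand.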

Contact person: Philipp Raggam

**Hamiltonian Monte Carlo sampling with discrete latent variables (P2)**

Markov Chain Monte Carlo methods are fundamental to Bayesian statistics, since they are the standard way to draw inferences from posterior distributions. Hamiltonian Monte Carlo is the state-of-the-art sampler, but it assumes certain differentiability properties, so what happens when a model contains discrete latent variables? The idea of this project is to study solutions to this problem and implement them.
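One commonly used workaround, shown here only as an example of the kind of solution the project might examine, is to sum the discrete variable out of the joint density so that the remaining log density is differentiable in the continuous parameters. The sketch below does this for a two-component Gaussian mixture with unit variances; the model, the numerical values, and the function names are illustrative assumptions.

```python
import math


def log_normal(x, mean, sd=1.0):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - 0.5 * ((x - mean) / sd) ** 2


def log_joint(x, z, mu, weights):
    """log p(x, z | mu): the discrete latent z selects the mixture component."""
    return math.log(weights[z]) + log_normal(x, mu[z])


def log_marginal(x, mu, weights):
    """log p(x | mu) = log sum_z p(x, z | mu), computed with log-sum-exp.

    This function is smooth in mu, so a gradient-based sampler can use it.
    """
    terms = [log_joint(x, z, mu, weights) for z in range(len(weights))]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def responsibility(x, mu, weights):
    """Posterior p(z | x, mu), recovered after the continuous part is sampled."""
    lm = log_marginal(x, mu, weights)
    return [math.exp(log_joint(x, z, mu, weights) - lm) for z in range(len(weights))]


# Illustrative values: an observation closer to the second component.
resp = responsibility(1.5, mu=[-2.0, 2.0], weights=[0.3, 0.7])
```

Marginalization is only feasible when the discrete variable takes a small number of values, which is one reason alternative approaches are worth studying.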

Contact person: Mauricio Gonzalez Soto

**Unsupervised feature extraction for single cell recordings (P2)**

Multi-electrode arrays (MEAs) can record neural data on the level of individual neurons. There are three standard sets of features used when working with this kind of data, namely the originally recorded local field potentials, raw threshold crossings, and so-called spike-sorted data. In this project, you will implement an unsupervised neural network method (autoencoder) to automatically learn a lower-dimensional feature representation from this data, and compare it with the three standard methods on a given supervised learning task. (Also possible as Bachelor thesis)

Contact person: Anja Meunier

**Learning physical laws from videos using deep auto-encoders (P2) + the novel AbCNet algorithm (Bachelor/Master thesis)**

Are there succinct underlying laws that govern the dynamics of seemingly complex datasets? Can the physics of a given system (e.g. coupled oscillators, a gas) be learned from a video in an unsupervised manner? Here, you will learn low-dimensional representations of data that are compatible with its dynamics and physics. You will then compare the learned representations to the 'true' fundamental variables of the system (e.g. position and velocity of the oscillators, pressure, temperature). The representation learning will be achieved using deep neural networks in a variety of architectures.

Contact person: Akshey Kumar

### Projects for Bachelor & Master theses

**Learning physical laws from videos using deep auto-encoders (P2) + the novel AbCNet algorithm (Bachelor/Master thesis)**

Are there succinct underlying laws that govern the dynamics of seemingly complex datasets? Can the physics of a given system (e.g. coupled oscillators, a gas) be learned from a video in an unsupervised manner? Here, you will learn low-dimensional representations of data that are compatible with its dynamics and physics. You will then compare the learned representations to the 'true' fundamental variables of the system (e.g. position and velocity of the oscillators, pressure, temperature). The representation learning will be achieved using deep neural networks in a variety of architectures.

Contact person: Akshey Kumar

**Causal structure learning on NEST data (P1/P2) and creating tailor-made independence tests for spiking neuronal data (Bachelor thesis)**

To what extent can we infer cause-effect relations between features in a time-series dataset? In this project, you will apply a well-known causal structure learning algorithm to simulated neuronal data. You will then compare your results with the true neuronal network and evaluate the performance of the algorithm, identifying scenarios in which its performance is inadequate. Those who wish to go a step further can try to enhance the performance of the algorithm by creating novel independence tests that are better suited to the 'firing' nature of neuronal data.

Contact person: Akshey Kumar

**Topological data analysis for C. elegans data (Master thesis)**

Topological Data Analysis (TDA) is about applying techniques from geometry and topology to datasets in order to understand their intrinsic properties, such as the existence of “holes” or connected components. In this project, the student will analyze certain TDA techniques and apply them to a dataset of neural activity.
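The simplest of these properties, the number of connected components, can be illustrated with a few lines of code. The sketch below is a toy example and not a technique prescribed for the thesis: it links any two points closer than a chosen scale `eps` and counts the resulting components with a union-find structure. The point cloud and the scales are made up for illustration.

```python
import math


def count_components(points, eps):
    """Count connected components of the graph that links every pair of
    points at Euclidean distance <= eps."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)  # merge the two components
    return len({find(i) for i in range(len(points))})


# Two well-separated groups of points (made-up data).
cloud = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
components_by_scale = {eps: count_components(cloud, eps) for eps in (0.5, 1.5, 20.0)}
```

Tracking how this count changes as `eps` grows is the idea behind persistence in dimension zero; detecting “holes” requires higher-dimensional analogues of the same construction.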

Contact persons: Mauricio Gonzalez Soto, Akshey Kumar