Have you ever managed a large dataset? This project provides an opportunity to handle a dataset with over 8,000 downloads each month. You will reorganize the dataset by task, document it thoroughly, and create a user-friendly interface and leaderboard. The project also involves working with HPC clusters, Hugging Face libraries, and GitHub Pages for documentation. Basic Python skills and familiarity with Linux commands are required.
In this project you will implement a job generator process. The inputs are a JSON configuration file and a job generation rate (jobs per unit time). The configuration file contains metadata about different deep learning jobs, such as the path to the executable file and the required arguments. Your task is to design and implement a generator that works as follows: randomly select a job from the JSON file; use the metadata of the selected job to prepare a YAML/batch script from a provided script template; and submit the prepared script to another process using an RPC protocol. The rate at which jobs are sampled and submitted should equal the given generation rate. You will also need to implement the RPC protocol, a description of which will be provided. You may choose any programming language, but Python is recommended. You will be given supplementary code with helper functions, the RPC protocol description, and a description of the script and configuration files.
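To make the generator loop concrete, here is a minimal Python sketch of the select, render, and submit steps. Everything specific in it is an assumption for illustration only: the JSON fields (`name`, `executable`, `args`), the script template, the file paths, and the `submit` callable, which merely stands in for the RPC call. The real schema, template, and protocol are the ones supplied with the project.

```python
import json
import random
import time
from string import Template

# Hypothetical job metadata; the real JSON schema is provided with the project.
CONFIG_JSON = """
{"jobs": [
  {"name": "resnet", "executable": "/opt/jobs/train_resnet.py", "args": ["--epochs", "5"]},
  {"name": "bert", "executable": "/opt/jobs/train_bert.py", "args": ["--batch-size", "32"]}
]}
"""

# Hypothetical batch-script template; the real template is provided with the project.
SCRIPT_TEMPLATE = Template("#!/bin/bash\n# job: $name\npython $executable $args\n")


def render_script(job):
    """Fill the template with the metadata of one job."""
    return SCRIPT_TEMPLATE.substitute(
        name=job["name"],
        executable=job["executable"],
        args=" ".join(job["args"]),
    )


def generate_jobs(jobs, rate, submit, count, rng=random, sleep=time.sleep):
    """Sample and submit `count` jobs at `rate` jobs per second.

    `submit` is a placeholder for the RPC call to the receiving process.
    """
    interval = 1.0 / rate
    next_time = time.monotonic()
    for _ in range(count):
        job = rng.choice(jobs)          # uniform random selection
        submit(render_script(job))      # hand the script to the other process
        # Schedule against a running deadline so that time spent rendering
        # and submitting does not lower the effective rate.
        next_time += interval
        delay = next_time - time.monotonic()
        if delay > 0:
            sleep(delay)


if __name__ == "__main__":
    jobs = json.loads(CONFIG_JSON)["jobs"]
    generate_jobs(jobs, rate=5, submit=print, count=3)
```

This sketch spaces submissions at fixed intervals. If the generation rate is meant as an average, drawing the delays from `random.expovariate(rate)` would give Poisson arrivals instead; which interpretation applies is not stated here.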
In this project we are developing a resource manager framework. Part of this framework is a set of monitoring API functions that gather hardware metrics upon invocation. You will be given a code template, and your task is to fill in the code for some of these API functions. Supplementary programs and helper functions will be provided so that you can test your implementation, along with documentation for the required API functions, including their input and output parameters. This project requires C and Python programming as well as basic operating systems knowledge.
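As an illustration of what an on-demand hardware metric function might look like, the following Python sketch reads memory statistics from `/proc/meminfo` on Linux. The function names and the returned fields are hypothetical and are not the project's API, which is defined by the provided template and documentation.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    metrics = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            metrics[key.strip()] = int(fields[0])
    return metrics


def summarize_memory(values):
    """Derive a small set of memory metrics from parsed meminfo values."""
    total = values["MemTotal"]
    available = values["MemAvailable"]
    return {
        "total_kb": total,
        "available_kb": available,
        "used_fraction": (total - available) / total,
    }


def get_memory_metrics(path="/proc/meminfo"):
    """Gather memory metrics at the moment of invocation (Linux only)."""
    with open(path) as f:
        return summarize_memory(parse_meminfo(f.read()))


if __name__ == "__main__":
    print(get_memory_metrics())
```

Keeping the parsing separate from the file access makes the logic testable with a sample string, without depending on the machine it runs on.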
Mining sensor data for anomaly detection (with industrial partner)
In collaboration with an industrial partner, we have access to sensor data collected from various machines. We are looking for a curious student to examine the dataset from multiple angles and see whether data mining / data science techniques can identify any patterns. This is an ideal project if you want to learn about data mining / data science techniques.
Research packages for the formal specification and verification of process compositions
For our research we have implemented, and use, a number of Java packages that allow us to specify, unfold, and verify process compositions such as business process models and service compositions. These packages require some work, including new functionality, replacing old dependencies, adding different output formats, replacing log functionality, refactoring to use certain programming patterns, and more. In this project, we would like a number of students to improve, refactor, and extend these packages. The project is available for up to 5 students, who will work on separate sub-projects such as:
- Adding rich Event Log generation from random executions of annotated Petri net models.
- Separating embedded data annotations and allowing execution of Petri nets using data.
- Adding functionality for colored Petri nets.
- Implementing improved Prime Event Structures (PES) representations of processes and unfolding (i.e., creation of PES) from Petri nets.
- Replacing old dependencies and refactoring.
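
To give a feel for the first sub-project, the sketch below plays the token game on a tiny Petri net and records the labels of the fired transitions as one trace. It is written in Python for brevity, although the research packages themselves are in Java. The net, its labels, and all function names are hypothetical, and the sketch assumes that every arc has weight 1 and that each place appears at most once among a transition's inputs. A rich event log would additionally carry timestamps and data attributes, which are omitted here.

```python
import random

# Hypothetical net: each transition has an activity label, input places, and output places.
TRANSITIONS = {
    "t1": {"label": "receive order", "inputs": ["start"], "outputs": ["p1"]},
    "t2": {"label": "ship order", "inputs": ["p1"], "outputs": ["end"]},
    "t3": {"label": "reject order", "inputs": ["p1"], "outputs": ["end"]},
}


def enabled(marking, transitions):
    """Return the names of transitions whose input places all hold a token."""
    return [
        name
        for name, t in transitions.items()
        if all(marking.get(p, 0) > 0 for p in t["inputs"])
    ]


def fire(marking, transition):
    """Return the marking reached by firing one enabled transition."""
    new_marking = dict(marking)
    for p in transition["inputs"]:
        new_marking[p] -= 1
    for p in transition["outputs"]:
        new_marking[p] = new_marking.get(p, 0) + 1
    return new_marking


def random_trace(initial_marking, transitions, rng, max_steps=100):
    """Fire randomly chosen enabled transitions until none is enabled."""
    marking = dict(initial_marking)
    trace = []
    for _ in range(max_steps):
        choices = enabled(marking, transitions)
        if not choices:
            break
        name = rng.choice(choices)
        marking = fire(marking, transitions[name])
        trace.append(transitions[name]["label"])
    return trace


if __name__ == "__main__":
    rng = random.Random()
    # An event log is a collection of traces, one per simulated case.
    log = [random_trace({"start": 1}, TRANSITIONS, rng) for _ in range(3)]
    print(log)
```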