Dominik Probst

Dominik Probst, M. Sc.

Department of Computer Science
Chair of Computer Science 6 (Data Management)

Room: 08.157
Martensstraße 3
91058 Erlangen

Scheduled Office Hours

The following slots are the times when he is most likely to be in the office. Because this may change depending on other appointments, consultation hours should always be arranged in advance:

Every week Mo, Tu, We, 09:00 - 15:00, Room 08.157, Martensstraße 3

Since January 2020, Dominik Probst has been a member of the research staff at our chair.

From October 2014 to September 2019 he was a tutor for “Konzeptionelle Modellierung” (eight semesters) and for our part of the “Fertigungstechnisches Praktikum” (summer semesters 2016 and 2017).

(Virtual) consultation hours can currently be arranged by e-mail.

 

  • Generation of Symbol Tables for String Compression with Frequent-Substring Trees

    (Own Funds)

    Term: since 19/09/2022

    With the ongoing rise in global data volumes, database compression is becoming increasingly relevant. While the compression of numeric data types has been extensively researched, the compression of strings has only recently received renewed scientific attention.

    A promising approach to string compression is the use of symbol tables, where recurring substrings within a database are substituted with short codes. A corresponding table enables the smooth reconstruction of the original data. This method is distinguished by short compression and decompression times, although the compression rate heavily depends on the quality of the symbol table.

    The research project FST focuses on the creation of optimized symbol tables to maximize the compression rate. For this purpose, the eponymous Frequent-Substring Trees are constructed: trie-like data structures that map all potential table entries and use metadata to identify the optimal ones.

    The primary objective of the research project is to increase the compression rate of string compression methods without significantly affecting the compression and decompression times.
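
The idea of building a symbol table from a trie of candidate substrings can be illustrated with a minimal Python sketch. It is not the FST implementation: the node layout, the occurrence count as the only metadata, the gain estimate `count * (len - 1)`, the greedy longest-match encoder, and all names (`build_trie`, `build_symbol_table`, `compress`, `decompress`, `max_len`, `table_size`) are assumptions chosen for illustration.

```python
class Node:
    """Trie node; the path from the root spells one candidate substring."""

    def __init__(self):
        self.children = {}
        self.count = 0  # metadata (assumed): occurrences of this substring


def build_trie(strings, max_len=4):
    """Insert every substring of up to max_len characters and count it."""
    root = Node()
    for s in strings:
        for start in range(len(s)):
            node = root
            for ch in s[start:start + max_len]:
                node = node.children.setdefault(ch, Node())
                node.count += 1
    return root


def candidates(node, prefix=""):
    """Yield (substring, count) for every node below the given one."""
    for ch, child in node.children.items():
        sub = prefix + ch
        yield sub, child.count
        yield from candidates(child, sub)


def build_symbol_table(strings, max_len=4, table_size=8):
    """Pick the entries with the highest naive gain estimate.

    Assumed gain: replacing a substring of length n by a single code
    saves n - 1 symbols per occurrence (overlaps are ignored).
    """
    root = build_trie(strings, max_len)
    scored = [
        (count * (len(sub) - 1), sub)
        for sub, count in candidates(root)
        if len(sub) > 1
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sub for _, sub in scored[:table_size]]


def compress(s, table):
    """Greedy longest-match encoding: ints are codes, strs are literals."""
    by_length = sorted(range(len(table)), key=lambda i: -len(table[i]))
    tokens, i = [], 0
    while i < len(s):
        for code in by_length:
            if s.startswith(table[code], i):
                tokens.append(code)
                i += len(table[code])
                break
        else:
            tokens.append(s[i])  # no table entry matches here
            i += 1
    return tokens


def decompress(tokens, table):
    """Look up each code in the table and keep literals unchanged."""
    return "".join(table[t] if isinstance(t, int) else t for t in tokens)


data = ["http://example.org/a", "http://example.org/b", "http://example.com/"]
table = build_symbol_table(data)
for s in data:
    assert decompress(compress(s, table), table) == s
```

Decompression is a plain table lookup, which is why symbol-table methods decode quickly. How well the data compresses depends on which entries the selection step picks, and that is the part a more refined use of the tree's metadata would aim to improve.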

  • Architecture of Non-Multiple Autoencoders for Non-Lossy Information Agglomeration (working title, preliminary)

    (Own Funds)

    Term: 02/01/2020 - 19/09/2022

    The compression of data has long played a decisive role in data management. Compressed data can be stored permanently in less space and sent over the network more efficiently. With ever-increasing volumes of data, the importance of good compression methods continues to grow.

    Within the scope of project Anania (Architecture of Non-Multiple Autoencoders for Non-Lossy Information Agglomeration), we are investigating to what extent classical compression methods in relational databases can be supplemented and improved using methods from machine learning.

    The project focuses on autoencoders that can recognize semantic connections in relations when applied tuple-wise and thus promise further improvement in the compression of relational data. Combinations of autoencoders and classical compression methods are also a possible focus of the project.

    Side note: The name of the project "Anania" was chosen in reference to the butterfly "Anania funebris". In its stylized form, an autoencoder strongly resembles the silhouette of a butterfly with outstretched wings, which made the choice of this acronym seem fitting.