Completed PhDs at the Chair Media Informatics
» 2018 — Stefanie Müller: Systematization and identification of interferences and their origins in historical video documents using the example of digitized video stocks of Saxonian local television stations
» 2016 — Thomas Wilhelm-Stein: Teaching Information Retrieval - Supporting the Acquisition of Practical Knowledge About Information Retrieval Components Using Real-World Experiments and Game Mechanics
2019: Stefan Kahl t.b.c. Download
Stefan Kahl: Identifying Birds by Sound: Large-scale Acoustic Event Recognition for Avian Activity Monitoring
Birds are omnipresent and often reveal their presence through their vocalizations. They respond to environmental changes over many spatial scales and are thus ideal indicator species to monitor ecosystem health across various lifeforms. Automated observation of avian vocal activity and species diversity can be a transformative tool for ornithologists, conservation biologists, and bird watchers to assist in long-term monitoring of critical environmental niches. Digital sound transformation is commonly used when studying bird sounds. Since the inception of the sound spectrograph, spectrograms have played a significant role in avian research. We can assume that visual representations of bird sounds contain valuable information on species identity, rendering spectrograms a particularly suitable representation. Deep artificial neural networks have surpassed traditional classifiers in the field of visual recognition and acoustic event classification. Still, deep neural networks require expert knowledge to design, train, and test powerful models. With this constraint and the requirements of future applications in mind, an extensive research platform for automated avian activity monitoring was developed: BirdNET. An unprecedented amount of training, validation, and test data was used to assess the overall system performance on more than 3,900 hours of field recordings covering 987 classes and almost 300 hours of fully annotated soundscapes containing almost 80,000 vocalizations. The resulting benchmark system yields state-of-the-art scores across various acoustic domains and was used to develop expert tools and public demonstrators that can help to advance the democratization of scientific progress and future conservation efforts.
2018: Robert Herms Download
Effective Speech Features for Cognitive Load Assessment: Classification and Regression
The automatic recognition of cognitive load is a vital step towards the development of adaptive systems that are capable of providing the user with dynamic support in order to maintain the load experienced within an optimal range for maximum productivity. Speech contains a multitude of information and has been identified to be a potential modality to measure the user’s cognitive load.
The focus of this thesis is on the effectiveness of speech features for automatic cognitive load assessment, with particular attention being paid to new perspectives of this research area. A new cognitive load database, called CoLoSS, is introduced containing speech recordings of users who performed a learning task. This data collection contrasts with existing cognitive load databases since learning tasks have not yet been employed and it provides continuous numerical labels in addition to the discrete load levels considered until now. The CoLoSS corpus, together with the CLSE database in which two variants of the Stroop test and a reading span task are employed, forms the basis for the evaluations.
Various acoustic features from different categories including prosody, voice quality, and spectrum are investigated in terms of their relevance. Moreover, Teager energy parameters, which have proven highly successful in stress detection, are introduced for cognitive load assessment and it is demonstrated how automatic speech recognition technology can be used to extract potential indicators of the user’s cognitive load. As a further contribution, three hand-crafted feature sets are proposed.
The suitability of the extracted features is systematically evaluated by recognition experiments with speaker-independent systems designed for three-class classification (low, medium, and high cognitive load). Various configurations in terms of combinations of features, filters for feature selection, feature normalisation methods, and model parameters are tested. To prove the generalisation ability of the proposed feature sets, cross-corpus experiments are carried out. Additionally, a novel approach to speech-based cognitive load modelling is introduced, whereby the load is represented as a continuous quantity and its prediction can thus be regarded as a regression problem. The evaluation of regression algorithms on the CoLoSS corpus reveals the advantages of using automatic feature subset selection.
2018: Stefanie Müller Download — German
Systematization and identification of interferences and their origins in historical video documents using the example of digitized video stocks of Saxonian local television stations
Due to the imminent disappearance of magnetic tape-based video systems like VHS and SVHS, local Saxon television stations are forced to digitize large numbers of analog video tapes. The captured scenes provide an insight into the era of German reunification and they document challenges which East German citizens had to face in the transfer from a centrally planned economy to a free market system. Often, this mass digitization serves as a general ‘backup’ without any optimization procedures. Qualitative deficiencies of the material must then be reviewed manually in a time-consuming process after the digitization.
An overview of high- and low-quality sequences can help to speed up this practice. So far, no methods exist for an automated quality analysis of digitized archives transferred from analog sources. This dissertation presents an approach for classification and automated detection of visual distortions in digitized analog video material. Possible sources of interference and visual distortion were identified and systematized through extensive research of available literature and material analysis. The vague, partially overlapping and even contradictory terminology of over 800 terms was classified and redefined. The findings were then supplemented by tables and figures to create a new overview of the discussed phenomena, which was not available before.
Furthermore, a proposal for a suitable method for their automated detection and visualization is evaluated. The presented method was implemented with the help of a deep convolutional neural network and includes the generation of datasets, the training of the system and the classification of the previously systematized phenomena. Additionally, a user-friendly visualization is proposed, which allows users to distinguish between useful and unusable video material and to recognize semantic structures without in-depth knowledge of technical or visual disturbances or artificial neural networks. The presented method is resource-efficient and can be adapted to other video formats and expanded to large datasets.
2015: Markus Rickert Download — German
Content-based Analysis and Segmentation of narrative, audiovisual Media
Audiovisual media, especially movies and TV shows, developed within the last hundred years into major mass media. Today, large stocks of audiovisual media are managed in databases and media libraries. The content is provided to professional users as well as private consumers. A particular challenge lies in the indexing, searching and description of multimedia assets.
The segmentation of audiovisual media as a branch of video analysis forms the basis for various applications in multimedia information retrieval, content browsing and video summarization. In particular, the segmentation into semantically meaningful scenes or sequences is difficult. It requires a special understanding of cinematic style elements that were used to support the narration during the creative process of film production.
This work examines the cinematic style elements and how they can be used in the context of algorithmic methods for analysis. For this purpose, an analysis framework was developed as well as a method for sequence-segmentation of films and videos. It can be shown that semantic relationships can be found in narrative audiovisual media, which lead to an appropriate sequence segmentation, by using a multi-stage analysis process, based on visual MPEG-7 descriptors.
2015: Anke Schneider Download — German
Color Influence Factors for the Emotional Impact of a Picture and Their Relevance for The Retrieval of Tourism Pictures
The use of pictures in a variety of areas has increased tremendously in recent years (Wedel & Pieters, 2008), as they stimulate a person’s imagination and help to create first experiences and emotions. Furthermore, the rapid developments in multimedia have led to an escalation of the number of digitally stored pictures and photographs. Consequently, finding the ‘best picture’ for a convincing advertising campaign has become increasingly difficult due to the abundance of available pictures. To further complicate this search process, many pictures related to a specific topic are very similar with regard to their content. However, their low-level features, such as hue, saturation, and luminance, might differ considerably. Therefore, this work focusses on the influence of emotional characteristics on the image retrieval process. This includes the study of emotions caused by the color properties of a picture, as well as the evaluation of the results of emotional image retrieval processes. Results of different experiments show that a picture’s luminance and color have the power to influence emotion. The subsequent evaluation of the results shows an improvement of emotional image retrieval processes. Consequently, one can conclude that considering emotions for ranking positively affects the quality of image retrieval results.
2016: Thomas Wilhelm-Stein Download — German
Teaching Information Retrieval - Supporting the Acquisition of Practical Knowledge About Information Retrieval Components Using Real-World Experiments and Game Mechanics
Information retrieval has achieved great significance in the form of search engines for the Internet. Retrieval systems are used in a variety of research scenarios, including corporate support databases, but also for the organization of personal emails.
A current challenge is to determine and predict the performance of individual components of these retrieval systems, in particular the complex interactions between them. Professionals are needed for the implementation and configuration of retrieval systems and retrieval components. By using the web-based learning application Xtrieval Web Lab, students can gain practical knowledge about the information retrieval process by arranging retrieval components in a retrieval system and evaluating them without using a programming language. Game mechanics guide the students in their discovery process, motivate them and prevent information overload by partitioning the learning content.
2015: Albrecht Kurze Download — German
Modeling the QoS-QoE-relationship for mobile services and the empirical evaluation in an emulated testbed
The thesis is centered around the relationship of Quality of Service (QoS) and Quality of Experience (QoE) for mobile Internet services. While QoS covers the technical view on the telecommunications network characterized by performance-related parameter values (e.g. throughput and latency), QoE refers to the assessment of the user experience (e.g. satisfaction and acceptability) in the use of the services. Theoretical analysis in the thesis reveals QoS and QoE to be highly complex and related concepts. Integrating both concepts requires a multidisciplinary or interdisciplinary approach between engineering and human sciences to consider both the technological aspects of the network and the human user. The designed multilayered model appropriately integrates the technical network view as well as the user's perspective by considering all relevant factors of influence and all internal relationships between QoS and QoE. The conducted extensive psychophysical laboratory experiment with real users, devices and services quantifies the relationship between specific QoS values and specific QoE values. A testbed developed for network emulation allows combining typical mobile network situations with typical usage situations in a controlled and focused manner. The three elaborated principles to test for relevance, suitability and efficiency take into account the special features of the test setup and test design. Test results gained from more than 200 volunteers confirm the predicted QoS-QoE-characteristics of the six tested mobile services to be either elastic or non-elastic. From the desired degree of user satisfaction, it is possible to infer the necessary values of the QoS network parameters, which results in a QoS-QoE-corridor between lower and upper threshold values. Findings prove that QoS-independent factors, e.g. the type of presentation of the stimuli in the app on the user’s device, can be as relevant for QoE as the evaluated QoS network parameters themselves.
2014: Marc Ritter Download — German
Optimization of Algorithms for Video Analysis: A Framework to Fit the Demands of Local Television Stations
The data collections of local television stations often consist of tens of thousands of video tapes. Modern methods are needed to exploit the content of such archives. While the retrieval of objects plays a fundamental role, essential requirements include low false-detection rates and high detection rates in order to prevent the corruption of the search index. However, a sufficient number of objects need to be found to make assumptions about the content explored.
This work focuses on the adjustment and optimization of existing detection techniques. To this end, the author develops a holistic framework that directly reflects the high demands of video analysis with the aim of facilitating the development of image processing algorithms, the visualization of intermediate results, and their evaluation and optimization. The effectiveness of the system is demonstrated on the structural decomposition of video footage and on content-based detection of faces and pedestrians.
2013: Arne Berger Download — German
Prototypes in Interaction Design — A Classification of the Dimensions of Sketched Artefacts for Optimizing the Cooperation of Design and Computer Science
Which material manifests a house? A sketch of the house? A car? A model of the car? Answering those questions is relatively simple because architecture and product design cultivate a rich and tangible tradition of prototyping and an adequate design theory. Which material manifests an interactive system? Is it the glass of the touch screen or is it the color of the buttons? Interaction design is an emerging discipline, and its accompanying design theory is at an even earlier stage.
The PhD thesis contemplates questions of materiality in interaction design. What are interactive prototypes and how can they be sufficiently described? Which properties are inscribed and interpreted by designers, engineers, users, the environment? How can this knowledge be utilized for a meaningful transdisciplinary collaboration and equal participation in design processes?
A variety of empirical findings will be presented, based on a meta-theoretical framework that integrates the somatic marker hypothesis from neuroscience and actor-network theory from the philosophy of technology.
2012: Jens Kürsten Download — English
A Generic Approach to Component-Level Evaluation in Information Retrieval
Research in information retrieval deals with the theories and models that constitute the foundations for any kind of service that provides access or pointers to particular elements of a collection of documents in response to a submitted information need. The specific field of information retrieval evaluation is concerned with the critical assessment of the quality of search systems. Empirical evaluation based on the Cranfield paradigm using a specific collection of test queries in combination with relevance assessments in a laboratory environment is the classic approach to compare the impact of retrieval systems and their underlying models on retrieval effectiveness.
In the past two decades, international campaigns like the Text Retrieval Conference have led to huge advances in the design of experimental information retrieval evaluations. But in general the focus of this system-driven paradigm remained on the comparison of system results, i.e. retrieval systems are treated as black boxes, an approach to evaluation that has drawn criticism. Recent works on this subject have proposed the study of the system configurations and their individual components. This thesis proposes a generic approach to the evaluation of retrieval systems at the component-level.
The focus of the thesis at hand is on the key components that are needed to address typical ad-hoc search tasks, like finding books on a particular topic in a large set of library records. A central approach in this work is the further development of the Xtrieval framework by the integration of widely-used IR toolkits in order to eliminate the limitations of individual tools. Strong empirical results at international campaigns that provided various types of evaluation tasks confirm both the validity of this approach and the flexibility of the Xtrieval framework.
Modern information retrieval systems contain various components that are important for solving particular subtasks of the retrieval process. This thesis illustrates the detailed analysis of important system components needed to address ad-hoc retrieval tasks. Here, the design and implementation of the Xtrieval framework offer a variety of approaches for flexible system configurations. Xtrieval has been designed as an open system and allows the integration of further components and tools as well as addressing search tasks other than ad-hoc retrieval. This approach ensures that it is possible to conduct automated component-level evaluation of retrieval approaches.
Both the scale and impact of these possibilities for the evaluation of retrieval systems are demonstrated by the design of an empirical experiment that covers more than 13,000 individual system configurations. This experimental set-up is tested on four test collections for ad-hoc search. The results of this experiment are manifold. For instance, particular implementations of ranking models fail systematically on all tested collections. The exploratory analysis of the ranking models empirically confirms the relationships between different implementations of models that share theoretical foundations. The obtained results also suggest that the impact on retrieval effectiveness of most instances of IR system components depends on the test collections that are being used for evaluation. Due to the scale of the designed component-level evaluation experiment, not all possible interactions of the system components under examination could be analysed in this work. For this reason the resulting data set will be made publicly available to the entire research community.