
Professorship Artificial Intelligence

Research Seminar

Research seminars are presently cancelled.

The seminar is intended for interested students in their Master's studies or the final stages of their Bachelor's studies. However, other interested listeners are cordially welcome too! In the seminar, either the staff of the professorship Artificial Intelligence or students will present current research topics. The presentations are normally held in English. This term, the seminar will be held mainly on Mondays and Thursdays. The precise schedule is presented in the following table.

Information for Bachelor and Master students

Seminar presentations of each major can be held within the framework of this event. This mainly includes the "Hauptseminar" in the major Bachelor-IF/AIF and the "Forschungsseminar" in all Master majors of Informatik as well as SEKO Bachelor and Master. Attendees acquire research-oriented knowledge by themselves and present it during a talk. The research topics will typically be from the fields of Artificial Intelligence, Neurocomputing, Deep Reinforcement Learning, Neurocognition, Neurorobotics and intelligent agents in virtual reality. Interested students should write an email to Prof. Hamker; the talk itself will be scheduled after an individual consultation.

Recent Events

Learning Shape Based Features for Robust Neural Representation

Ranjitha Subramaniam

Thu, 25. 6. 2020

Convolutional Neural Networks (CNNs) have gained tremendous significance over the years with state-of-the-art results in many computer vision tasks like object recognition, object detection, semantic segmentation, etc. Such high performance of CNNs is commonly attributed to the fact that they learn increasingly complex features in deeper layers, and this behavior is analogous to how humans perceive objects. Nevertheless, recent studies revealed that there are considerable differences between human visual perception and the perception of objects by CNNs. One such substantial distinction is that humans predominantly rely on robust shape features to recognize objects, while CNNs are highly biased towards local texture cues. The perceptual differences between CNNs and humans can be reduced by improving the shape bias of CNNs. Recent work from Geirhos et al. showed that augmenting natural images with various styles from paintings makes their texture cues unpredictable and forces the networks to learn more robust features. A CNN trained on such stylized images exhibits a stronger shape bias than a standard network trained on natural images. Besides the enhanced shape bias, such a network also demonstrates improved robustness against common image corruptions such as noise, blur, etc. The improved shape bias of the network is hypothesized to be the reason behind its high corruption robustness. To improve the shape bias of CNNs, this thesis introduces a technique that employs edge maps with explicit shape details. Moreover, the possible texture bias of the network is reduced by a technique called style randomization, which randomizes the statistics of activation maps in feature space. On evaluation, the proposed network shows a higher shape bias.
However, this shape-biased network performs poorly on corrupted images, and its results are no better than those of a standard texture-biased CNN. Hence, a systematic study is carried out to analyze the different characteristics of an image that could influence corruption robustness. These characteristics include the existence of natural image properties, explicit shape details from edge maps and the stylized texture details. While stylization and certain preserved statistics of natural images play a role in improving corruption robustness, no clear correlation is observed between the shape bias of a CNN and its corruption robustness. This study reveals that the strong data augmentation resulting from the stylization of natural images helped to improve the corruption robustness of stylized networks, while their improvement in shape bias emerged only as a byproduct. A further study is conducted to understand the adaptability of a network pretrained on natural images to data from different distributions. It is observed that the network shows improved performance on the target data when only the affine parameters of its normalization layers are fine-tuned. This indicates that a network trained using natural images also encodes robust representations, but these representations are not leveraged in its affine layers.
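The style randomization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `randomize_channel_style`, the flat-list representation of one activation channel, and the sampling ranges for the new mean and standard deviation are all assumptions made for illustration.

```python
import random
import statistics


def randomize_channel_style(channel, rng, eps=1e-5):
    """Replace the mean and standard deviation of one activation channel.

    `channel` is a flat list of activations. The sampling ranges below are
    illustrative assumptions, not values taken from the thesis.
    """
    mean = statistics.fmean(channel)
    std = statistics.pstdev(channel)
    # Normalize to zero mean and (approximately) unit standard deviation.
    normalized = [(x - mean) / (std + eps) for x in channel]
    # Draw new channel statistics at random and apply them.
    new_mean = rng.uniform(-1.0, 1.0)
    new_std = rng.uniform(0.5, 2.0)
    return [new_std * x + new_mean for x in normalized], new_mean, new_std


rng = random.Random(0)
activations = [0.2, 1.5, -0.7, 3.1, 0.9, -1.2]
stylized, target_mean, target_std = randomize_channel_style(activations, rng)
```

The spatial pattern of the channel (which positions are relatively large or small) is preserved, while its first- and second-order statistics, which are commonly associated with texture or style, are replaced by random values.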

Vision-based Traffic Sign Detection and Classification using Machine Learning

Ahmed Mohamed

Thu, 11. 6. 2020

Traffic sign detection and classification is a key topic in modern autonomous driving systems. Machine learning methods, especially convolutional neural networks, have achieved significant improvements in computer vision tasks, including traffic sign classification. Traffic sign detection using CNN-based object detectors requires a substantial amount of computational resources and specialised hardware to achieve real-time performance. This thesis presents a shallow CNN that solves the traffic sign classification problem with results comparable to state-of-the-art algorithms, and it presents a CNN-based object detector based on YOLOv3 that achieves real-time performance by removing convolutional layers to reduce computation and by re-organising the network structure. Model compression techniques are also applied to the resulting detection and classification models to reduce their size and computation. The final detection model yielded nearly the same results as the original YOLOv3 model while achieving a significant reduction in size and real-time performance at 30 FPS on a general-purpose CPU.

Less Deep Neural Networks for Object Localization

Natasha Szejer

Thu, 30. 4. 2020

Object localization is an important task to improve the performance of computer vision methods, with applications in e.g. robotics or object detection. Different approaches for this task exist. Inspired by neuroscience, a model of the visual cortex system has been presented by Beuth and Hamker (2017). This model uses an attention mechanism originating in the Frontal Eye Field (FEF) and Prefrontal Cortex (PFC) to control a Higher Visual Area (HVA). Because the HVA in turn controls the FEF, the process is recurrent, and it has been shown to be promising for object localization. On the other hand, deep learning architectures have recently evolved that can be trained end-to-end on a specific task and enable optimal specialization to a given dataset. In this research seminar, we investigate several state-of-the-art deep learning architectures for a comparison to Beuth and Hamker (2017). Inspired by the architectures presented, we will propose a concept with a structure similar to that of the model of Beuth and Hamker (2017) for a fair comparison. The findings of this preliminary investigation shall be used as a foundation for future evaluation in the scope of a Master's thesis.

A neuro-computational model of dopamine dependent interval timing using recurrent neural networks

Sebastian Adams

Mon, 24. 2. 2020, Room 131

The reward prediction error (RPE) is an important signal that tells an organism whether something is better or worse than expected. One brain region in particular is associated with this signal: the ventral tegmental area (VTA). It shows high activity when something is better than expected, normal activity when an expected reward occurs, and is inactive at exactly the moment when an expected reward is omitted. The VTA has far-reaching connections and relays this signal, for example, to the prefrontal cortex. The fact that the VTA is inactive when a reward is omitted is due to a temporally precise expectation. A main part of this talk will deal with the underlying processes of temporal coding in the brain. In contrast to the RPE just mentioned, the question of how the brain encodes time is highly controversial. One problem is that there is no process in the brain that can be described without time. Because of this problem, there is a wide range of possible approaches. One interesting and still relatively recent approach is based on reservoir computing. It rests on the assumption that time is an intrinsic component of neural networks that can be read out from the activity of many neurons (so-called population clocks). This approach is embedded into the model of Vitay and Hamker (2014) of the afferent connections to the VTA and subsequently discussed.
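The three activity regimes of the reward prediction error described in the abstract (better than expected, as expected, expected reward omitted) can be reproduced with a simple delta rule. This is only a minimal sketch under stated assumptions: it ignores the timing aspect that is the focus of the talk, it is not the model of Vitay and Hamker (2014), and the function names and the learning rate are hypothetical.

```python
def reward_prediction_error(reward, expected):
    """RPE: positive if the outcome is better than expected, negative if worse."""
    return reward - expected


def learn_expectation(rewards, learning_rate=0.5):
    """Update a scalar reward expectation with a simple delta rule.

    Returns the list of RPEs (one per trial) and the final expectation.
    The learning rate is an illustrative assumption.
    """
    expected = 0.0
    errors = []
    for reward in rewards:
        delta = reward_prediction_error(reward, expected)
        errors.append(delta)
        expected += learning_rate * delta
    return errors, expected


# Ten rewarded trials followed by one trial in which the reward is omitted.
errors, expected = learn_expectation([1.0] * 10 + [0.0])
```

On the first trial the reward is unexpected and the error is large and positive; after repeated rewards the error approaches zero; on the final trial, where the learned reward is omitted, the error becomes strongly negative.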

Dense Descriptor Generation for Robotic Manipulation

Vishal Mhasawade

Mon, 17. 2. 2020, Room 131

Robotic systems, whether mobile robots or manipulators, are on the verge of becoming commonplace. One of the most important reasons for this is the intelligence being imparted to such systems. Visual input is arguably one of the most sought-after ways to make this intelligence more efficient. This raises the question: what is the best visual representation of a robot's world for manipulation? For a robot to manipulate any object (rigid or non-rigid) in front of it, the structure of the object plays an important role. In addition, the visual representation the robot sees has to be applicable to a wide variety of tasks. This work implements Dense Object Nets as a way of creating a visual representation of the robot's world, called dense descriptors. We are trying to learn a visual representation of the object that (i) is task-independent and can be used as a basic building block for various manipulation tasks, (ii) is generally applicable to rigid and non-rigid objects, and (iii) is learned with complete self-supervision. In short, by creating the dense descriptor representation, we are trying to answer the question: what is the current visual state of the robot's world? The ultimate milestone would be to enable the robot to perceive the world as humans do, and Dense Object Nets can be seen as a small step towards it.

Development of a GUI for the Neural Simulator ANNarchy

Feng Zhou

Mon, 17. 2. 2020, Room 131

Graphical user interfaces (GUIs) are meant to make software easier to use. In the case of neural simulators, this mainly concerns the definition of neural networks. In my Bachelor's thesis, I developed a GUI for the neural simulator ANNarchy. With its help, users should be able to model neural networks and run simulations. In my talk, I will explain my ideas and the steps taken in developing the GUI and demonstrate the resulting GUI. Afterwards, open issues will be discussed and various ideas for future improvements will be shown.

Design and Modelling of Soft Pneumatic Actuator to Support Human's Forearm Motion

Hirenkumar Gadhiya

Thu, 13. 2. 2020, Room 131

Wearable robots offer pertinence and usability beyond many other types of technological interface and can serve various applications such as personal entertainment, customized health, and fitness. Wearable robots improve the wearer's ability to interact physically with the human and external environment and improve rehabilitation efficiency. Due to their stiffness and wearable property, one can believe in wearable robots. The first important step in designing any exosuit is modelling it, and modelling soft robots is not an easy task. The aim of this research project is therefore to design and model an exosuit that supports human forearm movement. Online programming of the behavior of a soft textile inflatable actuator is proposed in this research. In addition, forward and inverse kinematics are solved to obtain the desired tip position and base frame, respectively. The structure of the exosuit is also discussed; it consists of one pneumatic actuator to control flexion and extension movements, an ESP-32 development board, and a proportional valve developed for precise control of the actuator. At the end of the work, comparisons are made to show that the data obtained from the theoretical modelling match the data obtained from simulation.

Effects of Distractors on Supervised and Unsupervised Learning Methods

Michael Göthel

Mon, 3. 2. 2020, Room 204

This thesis investigated the effects of a disturbing factor, referred to here as a distractor, on the learning of two networks: a Deep Convolutional Neural Network (DCNN) and a network that learns without supervision. The DCNN is the LeNet-5 presented by LeCun et al. (1998); the unsupervised network is the one previously presented by Kolankeh (2018). The classifications of the networks were examined by means of Layer-Wise Relevance Propagation (LRP), and the activities of the neurons themselves were examined as well. Earlier results, for example those of Lapuschkin et al. (2019), which showed a strong susceptibility of a DCNN to such a distractor, could be reproduced. However, the same properties were also found for the unsupervised network. Various results, obtained with and without the help of the classifier, provided indications that the unsupervised network is influenced by the distractor in a similar way.
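Layer-Wise Relevance Propagation (LRP), the analysis method named in the abstract, redistributes the relevance of a network output backwards onto the inputs. The following is a minimal sketch of the commonly used epsilon rule for a single bias-free linear layer. It is an illustration only, not the analysis pipeline of the thesis; the function name `lrp_linear` and the example weights are hypothetical.

```python
def lrp_linear(activations, weights, relevance_out, eps=1e-9):
    """Redistribute output relevance onto the inputs of one bias-free linear layer.

    weights[i][j] connects input i to output j (LRP epsilon rule).
    """
    num_in, num_out = len(activations), len(relevance_out)
    # Pre-activations z_j = sum_i a_i * w_ij.
    z = [sum(activations[i] * weights[i][j] for i in range(num_in))
         for j in range(num_out)]
    relevance_in = []
    for i in range(num_in):
        r = 0.0
        for j in range(num_out):
            # A small stabilizer with the sign of z_j avoids division by zero.
            stabilizer = eps if z[j] >= 0 else -eps
            r += activations[i] * weights[i][j] / (z[j] + stabilizer) * relevance_out[j]
        relevance_in.append(r)
    return relevance_in


activations = [1.0, 2.0, 0.5]
weights = [[0.5, -1.0], [0.25, 1.0], [2.0, 0.0]]
# All relevance is placed on output 0, e.g. the predicted class.
relevance_in = lrp_linear(activations, weights, relevance_out=[1.0, 0.0])
```

Each input receives relevance in proportion to its contribution to the output, and the total relevance is approximately conserved. An input that carries a distractor would show up with a disproportionately large share.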

Federated Pruning of Semantic Segmentation Networks Based on Temporal Stability

Yasin Baysidi

Mon, 27. 1. 2020, Room 204

Deep Convolutional Neural Networks (DCNNs) are widely used in autonomous driving applications for perceiving the environment based on camera inputs. These DCNNs are used particularly for semantic segmentation and object detection. However, the number of trainable parameters in such networks is high and can be reduced without decreasing the overall performance of the network. On the other hand, the performance of these networks is always assessed with conventional assessment methods, such as mean Intersection over Union (mIoU) or loss, and not with respect to other requirements such as stability, robustness, etc. Based on that, we propose a novel temporal stability evaluation metric and also study the impact of removing parts of the trained network that tend to be unstable after training. This master thesis consists of two parts: 1) a novel method to define the temporal stability of semantic segmentation methods with sequential unlabeled data, named Temporal Coherence, and 2) a novel pruning method that reduces the complexity of the networks with the aim of enhancing temporal stability, named Stability based Federated Pruning. In the course of our experiments, two semantic segmentation networks, Fully Convolutional Networks FCN8-VGG16 and Full Resolution Residual Network (FRRN), are trained on two data sets, Cityscapes [9] and a Volkswagen Group internal data set. Afterwards, they are pruned with two state-of-the-art pruning methods along with our proposed method, and evaluated on Intersection over Union as a supervised and our Temporal Coherence as an unsupervised evaluation metric. The experiments show that the overall performance (mIoU) and the Temporal Coherence of the networks improved after pruning up to more than 40 percent of the network parameters. Furthermore, we have shown that our pruning metric produces competitive results compared to the other state-of-the-art pruning methods in all the experiments, and outperforms them in some cases.
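The conventional metric named in the abstract, mean Intersection over Union (mIoU), can be sketched as follows. This is a minimal illustration on flat label lists, assuming that classes absent from both prediction and ground truth are skipped; the function name and the example labels are hypothetical. The Temporal Coherence metric of the thesis is not reproduced here because the abstract does not define it.

```python
def mean_iou(predicted, target, num_classes):
    """Mean Intersection over Union over flat lists of class labels.

    Classes that occur in neither the prediction nor the ground truth
    are skipped (an assumption; conventions differ between benchmarks).
    """
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, target)
                           if p == cls and t == cls)
        union = sum(1 for p, t in zip(predicted, target)
                    if p == cls or t == cls)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Six hypothetical pixels with three classes.
predicted = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(predicted, target, num_classes=3)
```

In this example the per-class IoU values are 1/3, 2/3 and 1/2, so the mean is 0.5.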

Cooperative Machine Learning for Autonomous Vehicles

Sebastian Neubert

Thu, 23. 1. 2020, Room 368

Automated driving has been around for more than half a century, and the approaches vary noticeably across the car industry. While some manufacturers and research institutions rely on a combination of multiple sensors like Lidar, Radar, Sonar, GPS and Camera, Elon Musk, the CEO of Tesla, is convinced that fully autonomous driving can be achieved by primarily solving vision, inspired by what human beings use to make driving decisions, i.e. vision in the first place. Current state-of-the-art approaches for object detection are entirely based on machine learning techniques. These involve training very complex models on huge amounts of data centralized in large-scale datacenters. Because data in modern applications like autonomous driving and edge computing is usually generated in a decentralized way, it is reasonable to also train the machine learning models in a decentralized manner. In this thesis we examine a distributed learning approach called Federated Learning (FL) by applying it to several scenarios with MNIST as the dataset. In these settings, multiple clients are each led to personalize on a specific digit, and their models are then aggregated into an average model. We have made an in-depth analysis of how this algorithm performs in these scenarios. Additionally, we propose several ways of improving the accuracy to up to 97 % on the test set, and we consider that the principles of FL are not limited to neural-network-based learning algorithms but can, for instance, also be applied to SVMs.
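The aggregation step described in the abstract, in which client models are combined into an average model, can be sketched as a weighted parameter average. This is a minimal illustration, not the code of the thesis: weighting by local dataset size is an assumption borrowed from the common federated averaging scheme, and the function name, parameter vectors and client sizes are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    num_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(num_params)
    ]


# Three hypothetical clients, each holding a two-parameter model.
client_weights = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
client_sizes = [100, 100, 200]
global_weights = federated_average(client_weights, client_sizes)
```

Only the model parameters leave the clients; the raw training data stays local, which is the property that makes the approach attractive for decentralized data.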

A neuro-computational approach for attention-guided vergence control using the iCub

Torsten Follak

Mon, 13. 1. 2020, Room 204

In this thesis, a combined model for attention-guided vergence control is presented. The model combines two prior models. The first is a model for vergence control by Gibaldi et al. (2017), which implements a biologically inspired approach to robotic vergence control. The second is a model for object localization by Beuth (2017), which is inspired by the attention mechanisms in the human brain. Connecting these two models should lead to a new model with an attention guidance mechanism for vergence control. This thesis first presents the underlying models. It then shows the adaptations necessary for fusing the models. Finally, the performance of the new model is tested in different settings.

Real time head pose estimation received from 3D sensor using neural networks

Dhanakumaresh Subramani

Thu, 9. 1. 2020, Room 368

Human-machine non-verbal communication can be inferred from human head pose tracking. Therefore, human head pose estimation is crucial in person-specific services such as automotive safety. Bayesian filters such as the Kalman filter are among the efficient visual object tracking algorithms. The popularity of the Kalman filter is due to its inherent feedback loop, which can predict the forthcoming measurements. Nevertheless, it cannot be used widely because of its complex design, and it has to be tailored very specifically to the task. Recent studies on Recurrent Neural Networks (RNNs) show that they could be an ideal replacement for Bayesian filters, as the temporal information in an RNN has a significant influence in the field of visual object tracking. The feedback loop in an RNN allows the temporal state information of entire event sequences to be stored. Additionally, an RNN can perform the functionalities of a CNN (Convolutional Neural Network) and a Kalman filter. Moreover, notable improvements in CNN architectures have been studied; for example, learning multiple related tasks (human head pose estimation, facial landmark localization and visibility estimation) improves the accuracy of the main task. Hence, in this thesis, a recurrent multi-tasking network is designed, which can estimate the head orientation along with facial landmark localization.
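The feedback loop of the Kalman filter mentioned in the abstract alternates between predicting the next state and correcting that prediction with a new measurement. The following is a minimal sketch of a scalar filter with a constant-state model. It is not the tracker used in the thesis; the function name, the noise variances and the example measurements are assumptions chosen for illustration.

```python
def kalman_step(estimate, variance, measurement, process_var, measurement_var):
    """One predict/update cycle of a scalar Kalman filter (constant-state model)."""
    # Predict: the state is assumed constant, so only the uncertainty grows.
    predicted_estimate = estimate
    predicted_variance = variance + process_var
    # Update: blend prediction and measurement using the Kalman gain.
    gain = predicted_variance / (predicted_variance + measurement_var)
    new_estimate = predicted_estimate + gain * (measurement - predicted_estimate)
    new_variance = (1.0 - gain) * predicted_variance
    return new_estimate, new_variance


# Hypothetical noisy readings of a head yaw angle in degrees.
measurements = [10.4, 9.7, 10.2, 9.9, 10.1]
estimate, variance = 0.0, 100.0  # vague initial guess
for z in measurements:
    estimate, variance = kalman_step(
        estimate, variance, z, process_var=0.01, measurement_var=0.25
    )
```

The noise variances and the state model have to be chosen by hand for each task, which illustrates the task-specific design effort the abstract refers to.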
