The seminar is intended for interested students in their Master's studies or in the final stages of their Bachelor's studies. Other interested listeners are cordially welcome, too! In the seminar, either staff of the Professorship of Artificial Intelligence or students present current research topics. The presentations are normally held in English. The precise schedule is given in the following table.
Information for Bachelor and Master students
Seminar presentations of each major can be held within the framework of this event. This mainly includes the "Hauptseminar" in the Bachelor-IF/AIF major and the "Forschungsseminar" in all Master majors of Informatik as well as SEKO Bachelor and Master. Attendees acquire research-oriented knowledge on their own and present it in a talk. Candidates are expected to have sufficient background knowledge, typically acquired by attending the lectures Neurocomputing (formerly Machine Learning) or Neurokognition (I+II). The research topics will typically come from the fields of Artificial Intelligence, Neurocomputing, Deep Reinforcement Learning, Neurokognition, Neurorobotics and intelligent agents in virtual reality. Interested students should write an email to Prof. Hamker; the talk itself will be scheduled after an individual consultation.
Geometric Deep Learning: Part II - Graphs, Manifolds, and Spatial-based Approaches
Payam Atoofi
Wed, 13. 1. 2021, https://us02web.zoom.us/j/88027471757
In the previous presentation, the principle of spectral-based convolution on non-Euclidean structured data was discussed. Although a brief introduction to spatial-based approaches was provided, they are worth discussing in more detail, as they have been shown to be less computationally expensive than their spectral counterparts. Moreover, the intuition and simplicity that spatial approaches offer have helped these methods gain prominence in recent years. However, regardless of the method, challenges remain in the field of Geometric Deep Learning, e.g. training Graph Neural Networks (GNNs) with Stochastic Gradient Descent (SGD) using mini-batches. Such issues, which can for instance be observed in node-wise classification tasks, call for a solution when sampling nodes or subgraphs is required. Although a generalized framework for non-Euclidean domains is, as of these presentations, still out of reach, the analogy and even the problem setting can be brought closer together for graphs and manifolds. Spatial-based methods on manifolds follow the same principle as on graphs: a diffusion operator aggregates local information. In this second installment of presentations on GDL, some of the most renowned successes of these models on real-world problems, such as antibiotic discovery, predicting a molecule's aroma, and fake news detection, will be discussed, hopefully as an incentive for further research in this field.
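The idea of aggregating local information with a diffusion operator can be illustrated with a minimal sketch. It is not taken from the talk: the function name `mean_aggregate`, the toy path graph, and the choice of a row-normalised adjacency matrix with self-loops as the diffusion operator are assumptions made purely for illustration.

```python
import numpy as np


def mean_aggregate(adjacency: np.ndarray, features: np.ndarray) -> np.ndarray:
    """One spatial aggregation step: every node averages itself and its neighbours."""
    # Add self-loops so that a node's own features take part in the average.
    a_hat = adjacency + np.eye(adjacency.shape[0])
    # Row-normalise: D^-1 (A + I) acts as a simple random-walk diffusion operator.
    degree = a_hat.sum(axis=1, keepdims=True)
    return (a_hat @ features) / degree


# Hypothetical toy graph: a path 0 - 1 - 2 with one scalar feature per node.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
features = np.array([[1.0], [2.0], [6.0]])
aggregated = mean_aggregate(adjacency, features)
print(aggregated)
```

In a trainable spatial GNN layer, this aggregation would typically be followed by a learned linear map and a nonlinearity; both are omitted here to keep the sketch small.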
Exploration and Assessment of Methods for Anomaly Detection in Time Series Signals
Tue, 5. 1. 2021, https://us02web.zoom.us/j/84988730469
This thesis is devoted to the problem of anomaly detection in time series data. Important applications of time series anomaly detection include healthcare, fraud detection, and system failure detection. The thesis examines sensor signals used in the automotive industry. Three common types of signals were selected for the study because they differ in their properties and characteristics. In addition to the signals themselves, various representations of the signals were investigated. Detecting anomalies in time series has become a pressing problem because car manufacturers keep improving cars for the convenience of drivers: the amount of data received from sensors is increasing and, consequently, there is a strong need for automatic systems that check signals for anomalies. The purpose of this work is to explore existing methods for detecting anomalies in time series and to assess their performance. Unsupervised machine learning algorithms such as LOF, COF, CBLOF, KNN, HBOS, and OCSVM were tested and assessed. The algorithms that showed the best results were implemented in a desktop application for use in offline mode. For online mode, MA (Moving Average) and LSTM, a type of artificial recurrent neural network, were tested. As a result of the research, a desktop application was developed that detects anomalies in time series both in real time and in offline mode. The thesis also offers methods for checking the properties of signals. These methods are able to detect an anomalous signal even in cases where the other algorithms cannot. However, using these methods requires minimal knowledge of the signal. Almost all of the offered methods can be applied in both application modes.
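To make the online Moving Average (MA) idea concrete, here is a minimal sketch. The abstract does not state how the thesis turns the moving average into a decision, so the trailing window, the 3-sigma style threshold, and the names `moving_average_anomalies`, `window`, `threshold` and `min_std` are all assumptions.

```python
import numpy as np


def moving_average_anomalies(signal, window=5, threshold=3.0, min_std=1e-3):
    """Flag samples that deviate strongly from the trailing moving average.

    Assumed rule (not from the thesis): a sample is anomalous if it lies more
    than `threshold` standard deviations away from the mean of the previous
    `window` samples. `min_std` guards against a zero deviation on flat signals.
    """
    signal = np.asarray(signal, dtype=float)
    flags = np.zeros(signal.shape[0], dtype=bool)
    for t in range(window, signal.shape[0]):
        history = signal[t - window:t]          # only past samples: usable online
        deviation = abs(signal[t] - history.mean())
        flags[t] = deviation > threshold * max(history.std(), min_std)
    return flags


# Hypothetical sensor trace: constant level with a single spike at index 10.
trace = [1.0] * 10 + [10.0] + [1.0] * 10
flags = moving_average_anomalies(trace)
print(np.flatnonzero(flags))
```

Because only past samples enter the window, the same function can be called sample by sample on a live stream.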
A neuro-computational perspective on habit formation: from rats to Tourette patients
Wed, 16. 12. 2020, https://us02web.zoom.us/j/84719165578
Why do we do what we do? Actions can be directed either by their consequences (goal-directed behavior) or by a stimulus-response relationship (habitual behavior). In this talk I will first introduce you to reward devaluation: an experimental paradigm used to measure the development of habitual behavior. Then I will introduce a neuro-computational model of multiple cortico-basal ganglia-thalamo-cortical loops which we have used to simulate two recent devaluation experiments: one with rats running in a T-maze and one that involves Tourette patients. Our model not only reproduces behavioral and neural data in both cases but also predicts that cortico-thalamic shortcuts that bypass some of the loops are critical for the development of habits.
Geometric Deep Learning: Part I - Graph Neural Networks
Wed, 2. 12. 2020, https://us02web.zoom.us/j/88027471757
Much of the success of deep learning methods, particularly classical CNNs, is owed to the availability of the data on which these algorithms are built. However, despite the enormous amount of data available from social networks, chemistry and biochemistry in pharmaceutical research, 3D computer vision, etc., the methods that were previously successful on domains such as images and audio signals could not be immediately adapted to these new domains. The main reason is the inherent structure of these data, which is non-Euclidean, as opposed to Euclidean structured data with a grid-like underlying structure. The field of Geometric Deep Learning (GDL) is dedicated to addressing the issues that arise with non-Euclidean structured data. Thanks to the extensive research and efforts of the GDL community, this field is no longer underdeveloped territory. Some of the prominent directions in this field are applying embedding techniques to graphs, defining a mathematically sound convolution operation on graphs inspired by Graph Signal Processing (GSP), defining spatial-based convolutions, e.g. the message passing approach, and developing algorithms based on graph isomorphism. The challenges and shortcomings of many of these methods are now better understood. One of the immediate results sparked a debate on whether going deep in Graph Neural Networks is necessarily an advantage. The question of a generalized framework that can easily be adapted to different tasks, e.g. node-wise classification or graph classification, has not yet been answered and remains application dependent. In this presentation, these ideas are explored and discussed to see how some of them developed into the state of the art.
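The GSP-inspired convolution mentioned above can be sketched as filtering a graph signal in the eigenbasis of the graph Laplacian. This is a generic textbook construction, not necessarily the formulation used in the talk; the symmetric normalised Laplacian, the function names, the 4-node cycle graph and the exponential low-pass response are assumptions.

```python
import numpy as np


def normalized_laplacian(adjacency):
    """Symmetric normalised Laplacian L = I - D^-1/2 A D^-1/2."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = adjacency.sum(axis=1)
    d_inv_sqrt = np.zeros_like(degree)
    connected = degree > 0
    d_inv_sqrt[connected] = degree[connected] ** -0.5
    scaled = d_inv_sqrt[:, None] * adjacency * d_inv_sqrt[None, :]
    return np.eye(adjacency.shape[0]) - scaled


def spectral_filter(adjacency, signal, response):
    """Filter a graph signal by reweighting its graph Fourier coefficients."""
    eigenvalues, eigenvectors = np.linalg.eigh(normalized_laplacian(adjacency))
    spectrum = eigenvectors.T @ signal                 # graph Fourier transform
    return eigenvectors @ (response(eigenvalues) * spectrum)  # inverse transform


# Hypothetical example: a 4-node cycle carrying its highest-frequency signal.
cycle = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
signal = np.array([1.0, -1.0, 1.0, -1.0])
smoothed = spectral_filter(cycle, signal, lambda lam: np.exp(-lam))
print(smoothed)
```

The eigendecomposition costs cubic time in the number of nodes, which is one reason spatial approaches are often described as computationally cheaper.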
Stopping planned actions - a neurocomputational basal ganglia model
Wed, 25. 11. 2020, https://us02web.zoom.us/j/88027471757
In this presentation, we will first take a short tour through the field of computational modeling of the basal ganglia. Some interesting modeling approaches in the research fields of reinforcement learning, working memory and Parkinson's Disease will be presented and thereby, we will look at the general structure and some of the most popular assumed functions of the basal ganglia. After this more general part of the presentation, we will present our current work about simulating a stop-signal-task with our basal ganglia model. In such stop-signal-tasks, one must suddenly cancel a planned action due to an occurring stop cue. The recently proposed 'pause-then-cancel' model suggests that the subthalamic nucleus (STN) of the basal ganglia provides a rapid stimulus-unspecific 'pause' signal. This signal is followed by a stop-cue-specific 'cancel' signal from recently defined striatum-projecting neurons of the external globus pallidus (GPe) of the basal ganglia, the so-called Arkypallidal neurons. The purpose of our stop-signal-task simulations is to better understand the underlying neuronal processes of this 'pause-then-cancel' theory and the relative contribution of Arkypallidal and STN neurons to stopping. After an extensive review of the structure and connectivity of the GPe, we have completely revised the GPe of our basal ganglia model to include not only the Arkypallidal neurons but also recently described cortex-projecting GPe neurons. We replicate neuronal and behavioral findings of stop-signal-tasks and demonstrate that besides STN and Arkypallidal neurons also cortex-projecting GPe neurons are required for successful stopping. Further, we predict the effects of lesions on stopping performance. 
Our model simulations provide an explanation for some surprising or non-intuitive findings, such as stronger projections of Arkypallidal neurons on the indirect than on the direct pathway of the basal ganglia and the fact that Arkypallidal neurons become active during both stopping of actions and movement initiation.
Deep Learning for 3D Shapes Represented by Meshes
Tue, 17. 11. 2020, https://us02web.zoom.us/j/88027471757
After the groundbreaking success of CNNs on images, where models could outperform humans on a variety of tasks, researchers have tried to adapt convolution to different domains, e.g. graphs and meshes. These data, however, pose challenges to the classical CNN approach since they do not have a grid-like underlying structure. To tackle this issue, a range of methods has been developed, from converting the data representation (or reducing its dimensionality) to a domain where classical CNNs can be used, to defining convolution operations specifically for such domains, where the filters are able to capture the information of non-Euclidean structured data. Non-Euclidean structured data such as mesh representations of 3D objects, whether scanned or generated, are used in many models for tasks ranging from classification to semantic segmentation. MeshCNN, a network with convolution and pooling operations on the edges of meshes, has shown promising results in both classification and semantic segmentation tasks. MeshNet, on the other hand, has been proposed to classify meshes using the meshes' faces as the unit of input. We have proposed a MeshNet-based architecture for a semantic segmentation task. To direct the model's learning beyond a point-wise segmentation, a weighted loss function has also been introduced that puts more emphasis on faces with larger areas; as a result, an improved IoU of the area has been achieved. The effect of zero vs. random replication padding has also been investigated. The model has been tested on the COSEG dataset, which contains samples of chairs, vases, and aliens. The results show that the model performs segmentation on par with MeshCNN on two out of three separate categories. Our proposed architecture also performs almost equally well with all three categories combined. The model outperforms PointNet on the COSEG dataset.
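One way to put more emphasis on larger faces is to weight each face's cross-entropy term by its share of the total mesh area. The abstract does not give the exact weighting, so the normalisation by total area, the function name `area_weighted_cross_entropy` and the toy numbers below are assumptions.

```python
import numpy as np


def area_weighted_cross_entropy(probabilities, labels, face_areas):
    """Cross-entropy over mesh faces, weighted by relative face area.

    probabilities: (n_faces, n_classes) predicted class probabilities per face
    labels:        (n_faces,) integer ground-truth class per face
    face_areas:    (n_faces,) positive face areas
    """
    probabilities = np.asarray(probabilities, dtype=float)
    labels = np.asarray(labels)
    face_areas = np.asarray(face_areas, dtype=float)
    weights = face_areas / face_areas.sum()            # assumed normalisation
    picked = probabilities[np.arange(labels.shape[0]), labels]
    return float(-(weights * np.log(picked + 1e-12)).sum())


# Hypothetical two-face mesh: face 0 is predicted well, face 1 poorly.
probabilities = [[0.9, 0.1], [0.4, 0.6]]
labels = [0, 0]
loss_when_bad_face_is_large = area_weighted_cross_entropy(probabilities, labels, [1.0, 3.0])
loss_when_bad_face_is_small = area_weighted_cross_entropy(probabilities, labels, [3.0, 1.0])
print(loss_when_bad_face_is_large, loss_when_bad_face_is_small)
```

With equal areas the expression reduces to the ordinary mean cross-entropy, so the weighting only changes the objective on meshes with uneven face sizes.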
Introducing infrared target simulation based on cGAN derived models
Wed, 11. 11. 2020, https://us02web.zoom.us/j/88027471757
After Ian Goodfellow introduced the GAN (Generative Adversarial Network) architecture, Yann LeCun called it the coolest thing since sliced bread. Interest in GANs has since grown explosively in all domains. The most important application of GANs is to generate natural-looking images in an unsupervised manner. This can solve the problem that not enough images are available for supervised learning. GANs can also serve as powerful frameworks for unsupervised learning, as many computer scientists and AI engineers have already explored. This seminar will start with the definition of GANs and how they generally work. The main part covers one particular type of GAN that involves the conditional generation of outputs, called conditional GAN (cGAN). This part will derive further models and show how they help to generate more images in diverse classes, for example for infrared target simulation based on cGAN.
Deep Hebbian Neural Networks, a less artificial approach
Wed, 4. 11. 2020, https://us02web.zoom.us/j/88027471757
In recent years, deep neural networks using supervised learning have gained a lot of interest for modeling brain function. However, their network connectivity and learning have been questioned for their biological plausibility. In addition, the growing number of shortcomings discovered in comparison to the brain has raised the call for a less artificial intelligence. We present how three core principles from computational neuroscience can be combined into a model of the first visual cortical areas. These are: 1) Hebbian and anti-Hebbian plasticity with homeostatic regulations, to learn the synaptic strengths of excitatory and inhibitory neurons, causing independent neuronal responses, a key aspect of neuronal coding. 2) Intrinsic plasticity to control the operating point of the neurons, enabling all neurons to participate equally in the encoding and causing an informative code in deeper network layers. 3) Experience-dependent structural plasticity to modify connections during network training, making it possible to observe the anatomical footprint of learning and to overcome biased initial definitions. We implemented the core circuit of the pathway from LGN to V2, which consists of nine neuronal layers with excitatory and inhibitory neurons and their recurrent connectivity. We demonstrate three important exemplary aspects, highlighting the value of this model class. 1) The general ability to perform invariant object recognition on MNIST and COIL-100, with results competitive with other unsupervised approaches. 2) The model develops realistic V2 receptive fields, from which we can derive predictions for differences in the sensitivity of the layer and neuron type to naturalistic textures, extending experimental observations. 3) The distribution of synaptic weights with respect to the response correlations of the connected neurons. We extend the common view of inhibitory connectivity as unspecific by showing its specific connection structure, which is difficult to observe experimentally.
We link our findings to the specific role of inhibitory plasticity. Finally, we give an outlook on the next challenges and achievements, highlighting deep biological neural networks as a promising research field to overcome limitations of state-of-the-art deep neural networks and to allow detailed insights into brain function.
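As a rough illustration of Hebbian learning with a stabilising term, the sketch below uses Oja's rule, a textbook rule in which a decay term keeps the weight norm bounded. It is a stand-in chosen for brevity and is not the plasticity rule of the presented model; the synthetic 2D data, the learning rate and all names are assumptions.

```python
import numpy as np


def oja_update(weights, x, learning_rate=0.005):
    """One step of Oja's rule: Hebbian growth plus a decay that bounds the norm."""
    y = weights @ x                                   # postsynaptic activity
    return weights + learning_rate * y * (x - y * weights)


# Hypothetical inputs: strong variance along `direction`, weak isotropic noise.
rng = np.random.default_rng(0)
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)
samples = (2.0 * rng.normal(size=(5000, 1)) * direction
           + 0.3 * rng.normal(size=(5000, 2)))

weights = np.array([0.1, 0.0])
for x in samples:
    weights = oja_update(weights, x)

# The weight vector should align (up to sign) with the dominant input direction.
alignment = abs(weights @ direction)
print(alignment, np.linalg.norm(weights))
```

A purely Hebbian update without the decay term would let the weights grow without bound, which is one motivation for homeostatic regulation.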
Towards general and robust industrial bin-picking: A Deep Reinforcement Learning based approach using simulation to reality transfer
Fri, 2. 10. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz
The ability to reliably grasp and lift arbitrary objects is a fundamental skill that is required for various industrial use cases. To this end, learning-based approaches can generate grasp poses from single sensor observations without requiring explicit object pose information. This thesis aims at learning the skill of grasping arbitrary items from cluttered heaps in simulation. For this purpose, a multi-stage approach is presented in which the graspable areas of the heap are extracted in an initial filtering stage. Next, a hierarchical learning approach is used to select from these candidates and define the final grasp pose using two different policies. By including information about the gripper dimensions in the observation, it was possible to train the policies on a distribution of parallel grippers. The models could then be used in a real scenario with unseen objects and a new gripper configuration.
Improving Robustness of Object Detection Models using Compression
Thu, 24. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz
Deep neural networks are often not robust to semantically irrelevant changes in the input. In this work we address the robustness of state-of-the-art deep convolutional neural networks (CNNs) against commonly occurring distortions in the input, such as photometric changes or noise. These changes in the input are often accounted for during training in the form of data augmentation. We make three major contributions: First, we propose a new pruning method called augmented-based filter pruning that can be used to prune the filters that react dramatically to small changes of the input. Second, we propose robustness training, which consists of a new regularization loss called feature-map augmentation (FMA) loss that can be used during finetuning to make a model robust to several distortions in the input. Finally, we propose the combination of pruning and robustness training, which results in a single model that is robust to the augmentation types used during pruning and finetuning. We use our strategy to improve the robustness of an existing state-of-the-art object detection network. In the course of our experiments, we trained the Faster R-CNN object detection network on the KITTI Vision Benchmark Suite dataset. Afterwards, we applied robustness training to the augmented baseline for different augmentation types. For the combination of pruning and robustness training, we pruned 200 filters in each pruning step and finetuned the updated model using the FMA loss and the original loss. In total, we pruned 40% of the filters of our baseline. The experiments show that the accuracy of the network improved on both the augmented and the clean test set.
Voice authentication and Voice-to-Text tool based on Deep Learning in the area of plant maintenance - Industry 4.0
Tue, 22. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz
As part of the development of future Industry 4.0 solutions at Robert Bosch GmbH, we investigate several speaker recognition and speech recognition models to be used in a plant maintenance use case. Our speech application should first authenticate a user through a voice sample. If the authentication is successful, speech commands from the user are transcribed into text, which allows the user to conduct plant maintenance operations with the assistance of a chatbot. Speaker and speech recognition systems are tested on speech samples with background noise since the speech pipeline is planned to be implemented in a production plant. We evaluate two speaker recognition and three speech recognition models on an internal noisy dataset consisting of 12,000 audio files. On the speaker recognition task, Microsoft Azure's Speaker Recognition API achieves an accuracy of 99.7% while an implementation based on the ResNet-34 architecture achieves an accuracy of 93.9%. The results show that speaker recognition models can handle background noise fairly well. On the speech recognition task, we test the models without prior training. Microsoft Azure's Speech-to-Text achieves a word error rate (WER) of 6.9%, while Mozilla DeepSpeech and the Python SpeechRecognition library (with a PocketSphinx model) struggle considerably with background noise, achieving WERs of 52.5% and 67.1%, respectively. We compare the speech models on several other metrics such as latency, costs, and data security to give an overview of speech models that could be deployed in a production plant.
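For readers unfamiliar with the WER figures quoted above: the word error rate is the word-level edit distance (substitutions, deletions, insertions) between the recognised text and the reference transcript, divided by the number of reference words. The sketch below is a generic implementation, not the evaluation code used in the work; the function name and the example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edit distance between the processed reference prefix
    # and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref_words)


# Hypothetical maintenance command: one dropped word and one misheard word.
example_wer = word_error_rate("open the valve on line two",
                              "open valve on line too")
print(f"{example_wer:.3f}")
```

Because insertions count as errors, the WER can exceed 100% when the hypothesis is much longer than the reference.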
Recognition of structural road surface conditions to support Highly Automated Driving
Thu, 10. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz
Road surface detection is important for the safety of the vehicle as well as for passenger comfort. In this master thesis, traditional computer vision feature extraction techniques and CNN techniques are used to classify different road surfaces into ten classes using fixed-size image patches. The road features are extracted using texture-based feature extractors, shape-based feature extractors and a combination of both. The four computer vision models used are GLCM, LBP, HoG and HoG with LBP. The same dataset is also used to train and test a shallow CNN model and a CNN model in which the image is pre-processed with a Gabor filter. Ten-fold cross validation is used to evaluate the individual models. The behaviour of the individual models on shuffled and unshuffled road sequences is noted. It is observed that although the CNN performs better in terms of accuracy, its inference time is the longest compared to the traditional computer vision methods used. Tests with a reduced number of classes achieved better accuracy than the tests with ten classes.
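Of the texture descriptors listed above, LBP (Local Binary Patterns) is the simplest to sketch: each interior pixel is encoded by comparing it with its eight neighbours, and the histogram of the resulting codes serves as the feature vector of a patch. The version below is the basic 3x3 variant; the thesis may use a different radius, neighbour count or uniform-pattern mapping, and the function name and bit ordering are assumptions.

```python
import numpy as np


def lbp_histogram(patch):
    """Normalised histogram of basic 3x3 LBP codes for a grayscale patch."""
    patch = np.asarray(patch, dtype=float)
    rows, cols = patch.shape
    center = patch[1:-1, 1:-1]
    # Eight neighbours, clockwise from the top-left; bit k is set when the
    # k-th neighbour is at least as bright as the centre pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = patch[1 + dy:rows - 1 + dy, 1 + dx:cols - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    histogram = np.bincount(codes.ravel(), minlength=256).astype(float)
    return histogram / histogram.sum()


# Hypothetical 8x8 patch with random texture (at least 3x3 pixels are required).
rng = np.random.default_rng(0)
texture_features = lbp_histogram(rng.integers(0, 256, size=(8, 8)))
print(texture_features.shape)
```

The 256-bin histogram can then be fed to any conventional classifier, which is what keeps inference cheap compared to a CNN.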
Estimating a Robot's Self-Rotation Based on Gyro Sensor and Camera Data
Tue, 8. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/lor-hy4-uug
This Bachelor thesis deals with robot orientation, specifically with rotation. It investigates to what extent visual information (modelled on the human field of view via a front camera) is suitable for detecting a rotation. This raises the question of whether this method can serve as a feedback signal for an orientation model. Furthermore, tracking methods for verifying the computed rotations were tested and evaluated.
Supervised learning in Spiking Neural Networks
Thu, 3. 9. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz
In recent years, artificial neural networks (ANNs) have been discussed more and more in the field of machine learning. However, ANNs are fundamentally different from the brain, as they are not biologically plausible, especially in the way information propagates. This led to a new type of neural network: Spiking Neural Networks (SNNs). The concepts of SNNs are inspired by biological neuronal mechanisms that can efficiently process discrete spatio-temporal events (spikes). Because a spike is non-differentiable, it is difficult to use conventional optimization for training SNNs in a supervised fashion. In my presentation, I will introduce the training approaches from Lee, J. H. et al. (2016), Lee, C. et al. (2019), and Zenke, F. et al. (2017) as possible solutions for supervised learning in SNNs.
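The difficulty with the non-differentiable spike can be seen in a minimal sketch of a leaky integrate-and-fire (LIF) neuron. The forward pass uses a hard threshold whose derivative is zero almost everywhere; one common workaround is to substitute a smooth surrogate derivative in the backward pass. The fast-sigmoid surrogate, the parameter values and all names below are assumptions for illustration and are not claimed to match any of the cited approaches.

```python
import numpy as np


def simulate_lif(input_current, decay=0.9, threshold=1.0):
    """Discrete-time LIF neuron: leak, integrate, spike at threshold, reset."""
    membrane = 0.0
    spikes = []
    for current in input_current:
        membrane = decay * membrane + current
        fired = membrane >= threshold        # hard, non-differentiable step
        spikes.append(float(fired))
        if fired:
            membrane = 0.0                   # reset after a spike
    return np.array(spikes)


def surrogate_gradient(membrane, threshold=1.0, slope=10.0):
    """Smooth stand-in for d(spike)/d(membrane): derivative of a fast sigmoid."""
    membrane = np.asarray(membrane, dtype=float)
    return 1.0 / (1.0 + slope * np.abs(membrane - threshold)) ** 2


# Hypothetical input: two bursts of sub-threshold current.
spikes = simulate_lif([0.6, 0.6, 0.0, 0.6, 0.6])
print(spikes)
print(surrogate_gradient([0.5, 1.0, 1.5]))
```

The surrogate peaks at the threshold and decays on both sides, so neurons close to firing receive the largest learning signal.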
Vehicle Brake and Indicator Light Status Recognition Using Spatio-temporal Neural Networks
Giridhar Vadakkumuri Parameswaran
Thu, 6. 8. 2020, https://webroom.hrz.tu-chemnitz.de/gl/jul-2tw-4nz
Detecting the intent of the ado vehicle is a pivotal task in realising self-driving capabilities. Recognising the status of the brake and turn indicator lights of the ado vehicle helps the ego vehicle to plan its trajectory. Drawing inspiration from the success of deep learning in video classification and action recognition tasks, the thesis investigates end-to-end deep learning techniques to solve the problem of taillight status recognition from visual data. We investigate the suitability of various deep learning models such as CNN, LSTM, ConvLSTM, 3D convolutional networks, spatial attention networks and their combinations for this task. These models are trained and benchmarked on a public dataset from UC Merced. All of these models work on a sequence of images of the ado vehicle and predict its brake and turn indicator light status. Our best method outperforms the state of the art in terms of accuracy on the RGB-UC Merced Vehicle Rear Signal Dataset, demonstrating the effectiveness of attention models and temporal networks for vehicle taillight recognition. We also compile and present two large datasets, the Bosch Boxy taillight dataset and the IAV taillight dataset, which can be used by other researchers for solving this task.