Current Trends in Learning from Data Streams
Abstract: Learning from data streams is a hot topic in machine learning and data mining. In this talk, we present three different problems and discuss streaming techniques to solve them. The first problem is the application of data stream techniques to telecommunications fraud detection. We propose an algorithm for the interconnected by-pass fraud problem. This real-world problem requires processing high-speed telecommunications data and raising fraud alarms in real time. For the second problem, we present an architecture to explain black-box models for predictive maintenance. The explanations are oriented toward equipment anomalies. For the third problem, we present one of the first algorithms for online hyper-parameter tuning for streaming data. The Self hyper-Parameter Tuning (SPT) algorithm is an optimization algorithm for online hyper-parameter tuning from non-stationary data streams. SPT works as a wrapper over any streaming algorithm and can be used for classification, regression, and recommendation.
Bio: João Gama is an Associate Professor at the Faculty of Economy, University of Porto. He is a researcher and vice-director of LIAAD, a group belonging to INESC TEC. He received his PhD from the University of Porto in 2000. He is a Senior Member of the IEEE. He has worked on several national and European projects on Incremental and Adaptive Learning Systems, Ubiquitous Knowledge Discovery, and Learning from Massive and Structured Data. He served as Co-Program Chair of ECML’2005, DS’2009, ADMA’2009, IDA’2011, and ECML/PKDD’2015. He served as track chair on Data Streams with ACM SAC from 2007 to 2016. He organized a series of Workshops on Knowledge Discovery from Data Streams with ECML/PKDD, and on Knowledge Discovery from Sensor Data with ACM SIGKDD. He is the author of several books on Data Mining (in Portuguese) and of a monograph on Knowledge Discovery from Data Streams. He has authored more than 250 peer-reviewed papers in areas related to machine learning, data mining, and data streams. He is a member of the editorial boards of the international journals ML, DMKD, TKDE, IDA, NGC, and KAIS. He has (co-)supervised more than 12 PhD students and 50 MSc students.
Abstract: A central challenge to contemporary AI is to integrate learning and reasoning. The integration of learning and reasoning has been studied for decades in the fields of statistical relational artificial intelligence (StarAI) and probabilistic programming. StarAI has focussed on unifying logic and probability, the two key frameworks for reasoning, and has extended these probabilistic logics with machine learning principles. I will argue that StarAI and probabilistic logics form an ideal basis for developing neuro-symbolic artificial intelligence techniques. Thus neuro-symbolic computation = StarAI + Neural Networks. Many parallels between these two fields will be drawn and illustrated using deep probabilistic logic programming languages such as DeepProbLog and DeepStochLog.
Bio: Luc De Raedt is a full professor at the Department of Computer Science, KU Leuven, and director of Leuven.AI, the newly founded KU Leuven Institute for AI. He is a guest professor at Örebro University in the Wallenberg AI, Autonomous Systems and Software Program. He received his PhD in Computer Science from KU Leuven (1991), and was full professor (C4) and Chair of Machine Learning at the Albert-Ludwigs-University Freiburg, Germany (1999-2006). His research interests are in Artificial Intelligence, Machine Learning and Data Mining, as well as their applications. He is well known for his contributions in the areas of learning and reasoning, in particular for his work on probabilistic and inductive programming. He co-chaired important conferences such as ECMLPKDD 2001 and ICML 2005 (the European and International Conferences on Machine Learning), ECAI 2012 and IJCAI 2022. He is on the editorial boards of Artificial Intelligence, Machine Learning and the Journal of Machine Learning Research. He is a EurAI and AAAI fellow and an IJCAI Trustee, and received an ERC Advanced Grant in 2015.
Abstract: The ability to find and interpret cross-document relations is crucial in many fields of human activity, from social media to collaborative writing. While natural language processing has made tremendous progress in extracting information from single texts, a general NLP framework for modelling interconnected texts, including their versions and related documents, is missing. The talk reports on our ongoing efforts to establish such a framework. We address several challenges related to this. First, NLP has an acute need for diverse data to model cross-document tasks. We discuss our new, ethically sound data acquisition strategies and present unique cross-document datasets, along with a generic data model that can capture text structure and cross-document relations in heterogeneous documents. Second, we report on a study that instantiates our framework in the domain of scientific peer reviews. Third, to model cross-document relations, we need to make transformers aware of the structural relations within and across documents, yet it is unclear how much structure they already encode. To this end, we present preliminary insights from probing long-document transformers for structure. Our results pave the way for moving NLP towards a more human-like interpretation of text in the context of other texts.
Bio: Iryna Gurevych (PhD 2003, U. Duisburg-Essen, Germany) is a professor of Computer Science and director of the Ubiquitous Knowledge Processing (UKP) Lab at the Technical University (TU) of Darmstadt in Germany. Her main research interests are in machine learning for large-scale language understanding and text semantics. Iryna’s work has received numerous awards, including the ACL Fellow award in 2020 and the first Hessian LOEWE Distinguished Chair award (2.5 million euros) in 2021. Iryna is co-director of the NLP program within ELLIS, a network of excellence in machine learning. She is currently vice-president of the Association for Computational Linguistics. In 2022, she was awarded an ERC Advanced Grant.