SV6083
A Survey of Data-Efficient Graph Learning
12 min. talk | August 6th at 15:00 | Session: DM: Mining graphs (2/3)
[+] More
[-] Less
Graph-structured data, prevalent in domains ranging from social networks to biochemical analysis, serve as the foundation for diverse real-world systems. While graph neural networks demonstrate proficiency in modeling this type of data, their success is often reliant on significant amounts of labeled data, posing a challenge in practical scenarios with limited annotation resources. To tackle this problem, tremendous efforts have been devoted to enhancing graph machine learning performance under low-resource settings by exploring various approaches to minimal supervision. In this paper, we introduce a novel concept of Data-Efficient Graph Learning (DEGL) as a research frontier, and present the first survey that summarizes the current progress of DEGL. We initiate by highlighting the challenges inherent in training models with large labeled data, paving the way for our exploration into DEGL. Next, we systematically review recent advances on this topic from several key aspects, including self-supervised graph learning, semi-supervised graph learning, and few-shot graph learning. Also, we state promising directions for future research, contributing to the evolution of graph machine learning.
SV7348
Budget Feasible Mechanisms: A Survey
12 min. talk | August 8th at 15:00 | Session: GTEP: Game Theory and Economic Paradigms
[+] More
[-] Less
In recent decades, the design of budget feasible mechanisms for a wide range of procurement auction settings has received significant attention in the Artificial Intelligence (AI) community.
These procurement auction settings have practical applications in various domains such as federated learning, crowdsensing, edge computing, and resource allocation. In a basic procurement auction setting of these domains, a buyer with a limited budget is tasked with procuring items (\eg, goods or services) from strategic sellers, who have private information on the true costs of their items and incentives to misrepresent their items’ true costs.
The primary goal of budget feasible mechanisms is to elicit the true costs from sellers and determine items to procure from sellers to maximize the buyer valuation function for the items and ensure that the total payment to the sellers is no more than the budget. In this survey, we provide a comprehensive overview of key procurement auction settings and results of budget feasible mechanisms. We provide several promising future research directions.
SV7427
On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models
12 min. talk | August 6th at 15:00 | Session: NLP: Natural Language Processing (1/3)
[+] More
[-] Less
Big models have achieved revolutionary breakthroughs in the field of AI, but they also pose potential ethical and societal risks to humans. Addressing such problems, alignment technologies were introduced to make these models conform to human preferences and values. Despite the considerable advancements in the past year, various challenges lie in establishing the optimal alignment strategy, such as data cost and scalable oversight, and how to align remains an open question. In this survey paper, we comprehensively investigate value alignment approaches. We first unpack the historical context of alignment tracing back to the 1920s (where it comes from), then delve into the mathematical essence of alignment (what it is), shedding light on the inherent challenges. Following this foundation, we provide a detailed examination of existing alignment methods, which fall into three categories: RL-based Alignment, SFT-based Alignment, and Inference-Time Alignment, and demonstrate their intrinsic connections, strengths, and limitations, helping readers better understand this research area. In addition, two emerging topics, alignment goal and multimodal alignment, are also discussed as novel frontiers in the field. Looking forward, we discuss potential alignment paradigms and how they could handle remaining challenges, prospecting where future alignment will go.
SV7549
A Survey on Efficient Federated Learning Methods for Foundation Model Training
12 min. talk | August 7th at 10:00 | Session: ML: Federated learning
[+] More
[-] Less
Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients. However, new approaches to FL often discuss their contributions involving small deep-learning models only and focus on training full models on clients. In the wake of Foundation Models (FM), the reality is different for many deep learning applications. Typically, FMs have already been pre-trained across a wide variety of tasks and can be fine-tuned to specific downstream tasks over significantly smaller datasets than required for full model training. However, access to such datasets is often challenging. By its design, FL can help to open data silos. With this survey, we introduce a novel taxonomy focused on computational and communication efficiency, the vital elements to make use of FMs in FL systems. We discuss the benefits and drawbacks of parameter-efficient fine-tuning (PEFT) for FL applications, elaborate on the readiness of FL frameworks to work with FMs and provide future research opportunities on how to evaluate generative models in FL as well as the interplay of privacy and PEFT.
SV7830
A Survey of Graph Meets Large Language Model: Progress and Future Directions
12 min. talk | August 6th at 11:30 | Session: DM: Mining graphs (1/3)
[+] More
[-] Less
Graph plays a significant role in representing and analyzing complex relationships in real-world applications such as citation networks, social networks, and biological data. Recently, Large Language Models (LLMs), which have achieved tremendous success in various domains, have also been leveraged in graph-related tasks to surpass traditional Graph Neural Networks (GNNs) based methods and yield state-of-the-art performance. In this survey, we first present a comprehensive review and analysis of existing methods that integrate LLMs with graphs. First of all, we propose a new taxonomy, which organizes existing methods into three categories based on the role (i.e., enhancer, predictor, and alignment component) played by LLMs in graph-related tasks. Then we systematically survey the representative methods along the three categories of the taxonomy. Finally, we discuss the remaining limitations of existing studies and highlight promising avenues for future research. The relevant papers are summarized and will be consistently updated at: https://github.com/yhLeeee/Awesome-LLMs-in-Graph-tasks.
SV7832
X-former Elucidator: Reviving Efficient Attention for Long Context Language Modeling
12 min. talk | August 7th at 15:00 | Session: NLP: Natural Language Processing (2/3)
[+] More
[-] Less
Transformer-based LLMs are becoming increasingly important in various AI applications. However, apart from the success of LLMs, the explosive demand of long context handling capabilities is a key and in-time problem for both academia and industry.
Due to the limitations from the quadratic complexity of the attention mechanism, long context scenarios require much more resources for LLM development and deployment, bringing huge challenges to the underlying AI infrastructure.
Meanwhile, we observe that there is a trend of reviving previous efficient attention mechanisms to latest LLMs.
However, it still remains an open question about how to select from these diverse approaches in practice.
In this paper, we answer this question from several aspects.
First, we revisit these latest long-context LLM innovations and discuss their relationship with prior approaches with a novel and comprehensive taxonomy.
Next, we conduct a thorough evaluation over various types of workloads considering both efficiency and effectiveness.
Finally, we provide an in-depth analysis, summarize our key findings, and offer insightful suggestions on the trade-offs of designing and deploying efficient attention mechanisms for Transformer-based LLMs.
SV7836
A Survey on Cross-Domain Sequential Recommendation
12 min. talk | August 7th at 15:00 | Session: DM: Data Mining (1/2)
[+] More
[-] Less
Cross-domain sequential recommendation (CDSR) shifts the modeling of user preferences from flat to stereoscopic by integrating and learning interaction information from multiple domains at different granularities (ranging from inter-sequence to intra-sequence and from single-domain to cross-domain). In this survey, we initially define the CDSR problem using a four-dimensional tensor and then analyze its multi-type input representations under multidirectional dimensionality reductions. Following that, we provide a systematic overview from both macro and micro views. From a macro view, we abstract the multi-level fusion structures of various models across domains and discuss their bridges for fusion. From a micro view, focusing on the existing models, we specifically discuss the basic technologies and then explain the auxiliary learning technologies. Finally, we exhibit the available public datasets and the representative experimental results as well as provide some insights into future directions for research in CDSR.
SV7862
Large Language Models for Time Series: A Survey
12 min. talk | August 6th at 15:00 | Session: ML: Machine Learning (1/6)
[+] More
[-] Less
Large Language Models (LLMs) have seen significant use in domains such as natural language processing and computer vision. Going beyond text, image and graphics, LLMs present a significant potential for analysis of time series data, benefiting domains such as climate, IoT, healthcare, traffic, audio and finance. This survey paper provides an in-depth exploration and a detailed taxonomy of the various methodologies employed to harness the power of LLMs for time series analysis. We address the inherent challenge of bridging the gap between LLMs’ original text data training and the numerical nature of time series data, and explore strategies for transferring and distilling knowledge from LLMs to numerical time series analysis. We detail various methodologies, including (1) direct prompting of LLMs, (2) time series quantization, (3) aligning techniques, (4) utilization of the vision modality as a bridging mechanism, and (5) the combination of LLMs with tools. Additionally, this survey offers a comprehensive overview of the existing multimodal time series and text datasets in diverse domains, and discusses the challenges and future opportunities of this emerging field.
SV7865
Intelligent Agents for Auction-based Federated Learning: A Survey
12 min. talk | August 7th at 10:00 | Session: ML: Federated learning
[+] More
[-] Less
Auction-based federated learning (AFL) is an important emerging category of FL incentive mechanism design, due to its ability to fairly and efficiently motivate high-quality data owners to join data consumers’ (i.e., servers’) FL training tasks. To enhance the efficiency in AFL decision support for stakeholders (i.e., data consumers, data owners, and the auctioneer), intelligent agent-based techniques have emerged. However, due to the highly interdisciplinary nature of this field and the lack of a comprehensive survey providing an accessible perspective, it is a challenge for researchers to enter and contribute to this field. This paper bridges this important gap by providing a first-of-its-kind survey on the Intelligent Agents for AFL (IA-AFL) literature. We propose a unique multi-tiered taxonomy that organises existing IA-AFL works according to 1) the stakeholders served, 2) the auction mechanism adopted, and 3) the goals of the agents, to provide readers with a multi-perspective view into this field. In addition, we analyse the limitations of existing approaches, summarise the commonly adopted performance evaluation metrics, and discuss promising future directions leading towards effective and efficient stakeholder-oriented decision support in IA-AFL ecosystems.
SV7900
A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents
12 min. talk | August 8th at 11:30 | Session: ROB: Robotics (2/2)
[+] More
[-] Less
The burgeoning fields of robot learning and embodied AI have triggered an increasing demand for large quantities of data. However, collecting sufficient unbiased data from the target domain remains a challenge due to costly data collection processes and stringent safety requirements. Consequently, researchers often resort to data from easily accessible source domains, such as simulation and laboratory environments, for cost-effective data acquisition and rapid model iteration. Nevertheless, the environments and embodiments of these source domains can be quite different from their target domain counterparts, underscoring the need for effective cross-domain policy transfer approaches. In this paper, we conduct a systematic review of existing cross-domain policy transfer methods. Through a nuanced categorization of domain gaps, we encapsulate the overarching insights and design considerations of each problem setting. We also provide a high-level discussion about the key methodologies used in cross-domain policy transfer problems. Lastly, we summarize the open challenges that lie beyond the capabilities of current paradigms and discuss potential future directions in this field.
SV7903
Guide to Numerical Experiments on Elections in Computational Social Choice
12 min. talk | August 7th at 15:00 | Session: GTEP: Computational social choice (2/2)
[+] More
[-] Less
We analyze how numerical experiments regarding elections were conducted within computational social choice literature (focusing on papers published in the IJCAI, AAAI, and AAMAS conferences). We analyze the sizes of the studied elections and the methods of generating preference data, thereby making previously hidden standards and practices explicit. In particular, we survey a number of statistical cultures for generating elections and their commonly used parameters.
SV7910
Large Language Model Based Multi-agents: A Survey of Progress and Challenges
12 min. talk | August 6th at 15:00 | Session: NLP: Natural Language Processing (1/3)
[+] More
[-] Less
Large Language Models (LLMs) have achieved remarkable success across a wide array of tasks. Due to their notable capabilities in planning and reasoning, LLMs have been utilized as autonomous agents for the automatic execution of various tasks. Recently, LLM-based agent systems have rapidly evolved from single-agent planning or decision-making to operating as multi-agent systems, enhancing their ability in complex problem-solving and world simulation. To offer an overview of this dynamic field, we present this survey to offer an in-depth discussion on the essential aspects and challenges of LLM-based multi-agent (LLM-MA) systems. Our objective is to provide readers with an in-depth understanding of these key points: the domains and settings where LLM-MA systems operate or simulate; the profiling and communication methods of these agents; and the means by which these agents develop their skills. For those interested in delving into this field, we also summarize the commonly used datasets or benchmarks. To keep researchers updated on the latest studies, we maintain an open-source GitHub repository (github.com/taichengguo/LLM_MultiAgents_Survey_Papers), dedicated to outlining the research of LLM-MA research.
SV7914
Continual Learning with Pre-Trained Models: A Survey
12 min. talk | August 9th at 11:30 | Session: ML: Classification
[+] More
[-] Less
Nowadays, real-world applications often face streaming data, which requires the learning system to absorb new knowledge as data evolves. Continual Learning (CL) aims to achieve this goal and meanwhile overcome the catastrophic forgetting of former knowledge when learning new ones. Typical CL methods build the model from scratch to grow with incoming data. However, the advent of the pre-trained model (PTM) era has sparked immense research interest, particularly in leveraging PTMs’ robust representational capabilities. This paper presents a comprehensive survey of the latest advancements in PTM-based CL. We categorize existing methodologies into three distinct groups, providing a comparative analysis of their similarities, differences, and respective advantages and disadvantages. Additionally, we offer an empirical study contrasting various state-of-the-art methods to highlight concerns regarding fairness in comparisons. The source code to reproduce these evaluations is available at: https://github.com/sun-hailong/LAMDA-PILOT
SV7931
Safety of Multimodal Large Language Models on Images and Text
12 min. talk | August 7th at 10:00 | Session: ETF: Safety and robustness
[+] More
[-] Less
Attracted by the impressive power of Multimodal Large Language Models (MLLMs), the public is increasingly utilizing them to improve the efficiency of daily work. Nonetheless, the vulnerabilities of MLLMs to unsafe instructions bring huge safety risks when these models are deployed in real-world scenarios. In this paper, we systematically survey current efforts on the evaluation, attack, and defense of MLLMs’ safety on images and text. We begin with introducing the overview of MLLMs on images and text and understanding of safety, which helps researchers know the detailed scope of our survey. Then, we review the evaluation datasets and metrics for measuring the safety of MLLMs. Next, we comprehensively present attack and defense techniques related to MLLMs’ safety. Finally, we analyze several unsolved issues and discuss promising research directions. The relevant papers are collected at "https://github.com/isXinLiu/Awesome-MLLM-Safety".
SV7937
A Survey on Rank Aggregation
12 min. talk | August 8th at 15:00 | Session: DM: Data Mining (2/2)
[+] More
[-] Less
Rank aggregation (RA), the technique of combining multiple basic rankings into a consensus one, plays an important role in social choices, bioinformatics, information retrieval, metasearch, and recommendation systems. Although recent years have witnessed remarkable progress in RA, the absence of a systematic overview motivates us to conduct a comprehensive survey including both classic algorithms and the latest advances in RA study. Specifically, we first discuss the challenges of RA research, then present a systematic review with a fine-grained taxonomy to introduce representative algorithms in unsupervised RA, supervised RA, as well as the previously overlooked semi-supervised RA. Within each category, we not only summarize the common ideas of similar methods, but also discuss their strengths and weaknesses. Particularly, to investigate the performance difference of different types of RA methods, we conduct the largest scale of comparative evaluation to date of 27 RA methods on 7 public datasets from person re-identification, recommendation systems, bioinformatics and social choices. Finally, we raise two open questions in the current RA research and make our comments about future trends in the context of the latest research progress.
SV7942
More is Better: Deep Domain Adaptation with Multiple Sources
12 min. talk | August 7th at 15:00 | Session: ML: Machine Learning (3/6)
[+] More
[-] Less
In many practical applications, it is often difficult and expensive to obtain large-scale labeled data to train state-of-the-art deep neural networks. Therefore, transferring the learned knowledge from a separate, labeled source domain to an unlabeled or sparsely labeled target domain becomes an appealing alternative. However, direct transfer often results in significant performance decay due to domain shift. Domain adaptation (DA) aims to address this problem by aligning the distributions between the source and target domains. Multi-source domain adaptation (MDA) is a powerful and practical extension in which the labeled data may be collected from multiple sources with different distributions. In this survey, we first define various MDA strategies. Then we systematically summarize and compare modern MDA methods in the deep learning era from different perspectives, followed by commonly used datasets and a brief benchmark. Finally, we discuss future research directions for MDA that are worth investigating.
SV7944
Supervised Algorithmic Fairness in Distribution Shifts: A Survey
12 min. talk | August 7th at 10:00 | Session: ETF: Safety and robustness
[+] More
[-] Less
Supervised fairness-aware machine learning under distribution shifts is an emerging field that addresses the challenge of maintaining equitable and unbiased predictions when faced with changes in data distributions from source to target domains.
In real-world applications, machine learning models are often trained on a specific dataset but deployed in environments where the data distribution may shift over time due to various factors. This shift can lead to unfair predictions, disproportionately affecting certain groups characterized by sensitive attributes, such as race and gender. In this survey, we provide a summary of various types of distribution shifts and comprehensively investigate existing methods based on these shifts, highlighting six commonly used approaches in the literature. Additionally, this survey lists publicly available datasets and evaluation metrics for empirical studies. We further explore the interconnection with related research fields, discuss the significant challenges, and identify potential directions for future studies.
SV7984
Label Leakage in Vertical Federated Learning: A Survey
12 min. talk | August 7th at 10:00 | Session: ML: Federated learning
[+] More
[-] Less
Vertical federated learning (VFL) is a distributed machine learning paradigm that collaboratively trains models using passive parties with features and an active party with additional labels. While VFL offers privacy preservation through data localization, the threat of label leakage remains a significant challenge. Label leakage occurs due to label inference attacks, where passive parties attempt to infer labels for their privacy and commercial value. Extensive research has been conducted on this specific VFL attack, but a comprehensive summary is still lacking. To bridge this gap, our paper aims to survey the existing label inference attacks and defenses. We propose two new taxonomies for both label inference attacks and defenses, respectively. Beyond summarizing the current state of research, we highlight techniques that we believe hold potential and could significantly influence future studies. Moreover, experimental benchmark datasets and evaluation metrics are summarized to provide a guideline for subsequent work.
SV7993
A Survey of Constraint Formulations in Safe Reinforcement Learning
12 min. talk | August 7th at 15:00 | Session: ML: Machine Learning (5/6)
[+] More
[-] Less
Safety is critical when applying reinforcement learning (RL) to real-world problems. As a result, safe RL has emerged as a fundamental and powerful paradigm for optimizing an agent’s policy while incorporating notions of safety. A prevalent safe RL approach is based on a constrained criterion, which seeks to maximize the expected cumulative reward subject to specific safety constraints. Despite recent effort to enhance safety in RL, a systematic understanding of the field remains difficult. This challenge stems from the diversity of constraint representations and little exploration of their interrelations. To bridge this knowledge gap, we present a comprehensive review of representative constraint formulations, along with a curated selection of algorithms designed specifically for each formulation. In addition, we elucidate the theoretical underpinnings that reveal the mathematical mutual relations among common problem formulations. We conclude with a discussion of the current state and future directions of safe reinforcement learning research
SV8001
A Systematic Survey on Federated Semi-supervised Learning
12 min. talk | August 7th at 15:00 | Session: ML: Machine Learning (4/6)
[+] More
[-] Less
Federated learning (FL) revolutionizes distributed machine learning by enabling devices to collaboratively learn a model while maintaining data privacy. However, FL usually faces a critical challenge with limited labeled data, making semi-supervised learning (SSL) crucial for utilizing abundant unlabeled data. The integration of SSL within the federated framework gives rise to federated semi-supervised learning (FSSL), a novel approach that exploits unlabeled data across devices without compromising privacy. This paper systematically explores FSSL, shedding light on its four basic problem settings that commonly appear in real-world scenarios. By examining the unique challenges, generic solutions, and representative methods tailored for each setting of FSSL, we aim to provide a cohesive overview of the current state of the art and pave the way for future research directions in this promising field.
SV8003
A Comprehensive Survey and Taxonomy on Point Cloud Registration Based on Deep Learning
12 min. talk | August 8th at 15:00 | Session: CV: Computer Vision (2/2)
[+] More
[-] Less
Point cloud registration (PCR) involves determining a rigid transformation that aligns one point cloud to another. Despite the plethora of outstanding deep learning (DL)-based registration methods proposed, comprehensive and systematic studies on DL-based PCR techniques are still lacking. In this paper, we present a comprehensive survey and taxonomy of recently proposed PCR methods. Firstly, we conduct a taxonomy of commonly utilized datasets and evaluation metrics. Secondly, we classify the existing research into two main categories: supervised and unsupervised registration, providing insights into the core concepts of various influential PCR models. Finally, we highlight open challenges and potential directions for future research. A curated collection of valuable resources is made available at https://github.com/yxzhang15/PCR.
SV8007
Lifted Planning: Recent Advances in Planning Using First-Order Representations
12 min. talk | August 9th at 11:30 | Session: KRR: Reasoning about actions
[+] More
[-] Less
Lifted planning is usually defined as planning directly over a first-order representation. From the mid-1990s until the late 2010s, lifted planning was sidelined, as most of the state-of-the-art planners first ground the task and then solve it using a propositional representation. Moreover, it was unclear whether lifted planners could scale. But as planning problems become harder, they also become infeasible to ground. Recently, lifted planners came back into play, aiming at problems where grounding is a bottleneck. In this work, we survey recent advances in lifted planning. The main techniques rely either on state-space search or logic satisfiability. For lifted search-based planners, we show the direct connections to other areas of computer science, such as constraint satisfaction problems and databases. For lifted planners based on satisfiability, the advances in modeling are crucial to their scalability. We briefly describe the main planners available in the literature and their techniques.
SV8015
Knowledge Distillation in Federated Learning: A Practical Guide
12 min. talk | August 7th at 10:00 | Session: ML: Federated learning
[+] More
[-] Less
Federated Learning (FL) enables the training of Deep Learning models without centrally collecting possibly sensitive raw data. The most used algorithms for FL are parameter-averaging based schemes (e.g., Federated Averaging) that, however, have well known limits, i.e., model homogeneity, high communication cost, poor performance in presence of heterogeneous data distributions. Federated adaptations of regular Knowledge Distillation (KD) can solve or mitigate the weaknesses of parameter-averaging FL algorithms while possibly introducing other trade-offs. In this article, we originally present a focused review of the state-of-the-art KD-based algorithms specifically tailored for FL, by providing both a novel classification of the existing approaches and a detailed technical description of their pros, cons, and tradeoffs.
SV8026
Learning Structural Causal Models through Deep Generative Models: Methods, Guarantees, and Challenges
12 min. talk | August 8th at 15:00 | Session: ML: Machine Learning (6/6)
[+] More
[-] Less
This paper provides a comprehensive review of deep structural causal models (DSCMs), particularly focusing on their ability to answer counterfactual queries using observational data within known causal structures. It delves into the characteristics of DSCMs by analyzing the hypotheses, guarantees, and applications inherent to the underlying deep learning components and structural causal models, fostering a finer understanding of their capabilities and limitations in addressing different counterfactual queries. Furthermore, it highlights the challenges and open questions in the field of deep structural causal modeling. It sets the stages for researchers to identify future work directions and for practitioners to get an overview in order to find out the most appropriate methods for their needs.
SV8027
Building Expressive and Tractable Probabilistic Generative Models: A Review
12 min. talk | August 8th at 15:00 | Session: UAI: Uncertainty in AI
[+] More
[-] Less
We present a comprehensive survey of the advancements and techniques in the field of tractable probabilistic generative modeling, primarily focusing on Probabilistic Circuits (PCs). We provide a unified perspective on the inherent trade-offs between expressivity and tractability, highlighting the design principles and algorithmic extensions that have enabled building expressive and efficient PCs, and provide a taxonomy of the field. We also discuss recent efforts to build deep and hybrid PCs by fusing notions from deep neural models, and outline the challenges and open questions that can guide future research in this evolving field.
SV8040
Recent Advances in Predictive Modeling with Electronic Health Records
12 min. talk | August 9th at 11:30 | Session: MTA: Health and medicine
[+] More
[-] Less
The development of electronic health records (EHR) systems has enabled the collection of a vast amount of digitized patient data. However, utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics. With the advancements in machine learning techniques, deep learning has demonstrated its superiority in various applications, including healthcare. This survey systematically reviews recent advances in deep learning-based predictive models using EHR data. Specifically, we introduce the background of EHR data and provide a mathematical definition of the predictive modeling task. We then categorize and summarize predictive deep models from multiple perspectives. Furthermore, we present benchmarks and toolkits relevant to predictive modeling in healthcare. Finally, we conclude this survey by discussing open challenges and suggesting promising directions for future research.
SV8041
A Survey on Plan Optimization
12 min. talk | August 9th at 11:30 | Session: PS: Planning and Scheduling (2/2)
[+] More
[-] Less
Automated Planning deals with finding a sequence of actions that solves a given (planning) problem. The cost of the solution is a direct consequence of these actions, for example its number or their accumulated costs. Thus, in most applications, cheaper plans are preferred. Yet, finding an optimal solution is more challenging than finding some solution. So, many planning algorithms find some solution and then post-process, i.e., optimize it — a technique called plan optimization. Over the years many different approaches were developed, not all for the same kind of plans, and not all optimize the same metric. In this comprehensive survey, we give an overview of the existing plan optimization goals, their computational complexity (if known), and existing techniques for such optimizations.
SV8046
Recurrent Concept Drifts on Data Streams
12 min. talk | August 6th at 15:00 | Session: ML: Machine Learning (1/6)
[+] More
[-] Less
In an era where machine learning permeates every facet of human existence, and data evolves incessantly, the application of machine learning models transcends mere data processing. It involves navigating constant changes exemplified by the phenomenon of concept drift, which often affects model performance.
These drifts can be recurrent due to the cyclic nature of the underlying data generation processes, which could be influenced by recurrent phenomena such as weather and time of the day.
Stream Learning on data streams with recurrent concept drifts attempts to learn from such streams of data.
The survey underscores the significance of the field and its practical applications, delving into nuanced definitions of machine learning for data streams afflicted by recurrent concept drifts. It explores diverse methodological approaches, elucidating their key design components. Additionally, it examines various evaluation techniques, benchmark datasets, and available software tailored for simulating and analysing data streams with recurrent concept drifts. Concluding, the survey offers insights into potential avenues for future research in the field.
SV8052
Trends, Applications, and Challenges in Human Attention Modelling
12 min. talk | August 7th at 15:00 | Session: HAI: Humans and AI
[+] More
[-] Less
Human attention modelling has proven, in recent years, to be particularly useful not only for understanding the cognitive processes underlying visual exploration, but also for providing support to artificial intelligence models that aim to solve problems in various domains, including image and video processing, vision-and-language applications, and language modelling. This survey offers a reasoned overview of recent efforts to integrate human attention mechanisms into contemporary deep learning models and discusses future research directions and challenges.
For a comprehensive overview of the ongoing research, refer to our dedicated repository available at https://github.com/aimagelab/awesome-human-visual-attention.
SV8053
AI-Enhanced Virtual Reality in Medicine: A Comprehensive Survey
12 min. talk | August 6th at 11:30 | Session: MTA: Multidisciplinary Topics and Applications (1/2)
[+] More
[-] Less
With the rapid advance of computer graphics and artificial intelligence technologies, the ways we interact with the world have undergone a transformative shift. Virtual Reality (VR) technology, aided by artificial intelligence (AI), has emerged as a dominant interaction media in multiple application areas, thanks to its advantage of providing users with immersive experiences. Among those applications, medicine is considered one of the most promising areas. In this paper, we present a comprehensive examination of the burgeoning field of AI-enhanced VR applications in medical care and services. By introducing a systematic taxonomy, we meticulously classify the pertinent techniques and applications into three well-defined categories based on different phases of medical diagnosis and treatment: Visualization Enhancement, VR-related Medical Data Processing, and VR-assisted Intervention. This categorization enables a structured exploration of the diverse roles that AI-powered VR plays in the medical domain, providing a framework for a more comprehensive understanding and evaluation of these technologies.nTo our best knowledge, this work is the first systematic survey of AI-powered VR systems in medical settings, laying a foundation for future research in this interdisciplinary domain.
SV8056
Graph Neural Networks for Brain Graph Learning: A Survey
12 min. talk | August 7th at 11:30 | Session: DM: Applications
[+] More
[-] Less
Exploring the complex structure of the human brain is crucial for understanding its functionality and diagnosing brain disorders. Thanks to advancements in neuroimaging technology, a novel approach has emerged that involves modeling the human brain as a graph-structured pattern, with different brain regions represented as nodes and the functional relationships among these regions as edges. Moreover, graph neural networks (GNNs) have demonstrated a significant advantage in mining graph-structured data. Developing GNNs to learn brain graph representations for brain disorder analysis has recently gained increasing attention. However, there is a lack of systematic survey work summarizing current research methods in this domain. In this paper, we aim to bridge this gap by reviewing brain graph learning works that utilize GNNs. We first introduce the process of brain graph modeling based on common neuroimaging data. Subsequently, we systematically categorize current works based on the type of brain graph generated and the targeted research problems. To make this research accessible to a broader range of interested researchers, we provide an overview of representative methods and commonly used datasets, along with their implementation sources. Finally, we present our insights on future research directions. The repository of this survey is available at https://github.com/XuexiongLuoMQ/Awesome-Brain-Graph-Learning-with-GNNs.
SV8076
A Survey on Model-Free Goal Recognition
12 min. talk | August 7th at 15:00 | Session: PS: Planning and Scheduling (1/2)
[+] More
[-] Less
Goal Recognition is the task of inferring an agent’s intentions from a set of observations.
Existing recognition approaches have made considerable advances in domains such as human-robot interaction, intelligent tutoring systems, and surveillance. However, most approaches rely on explicit domain knowledge, often defined by a domain expert. Much recent research focus on mitigating the need for a domain expert while maintaining the ability to perform quality recognition, leading researchers to explore Model-Free Goal Recognition approaches. We comprehensively survey Model-Free Goal Recognition, and provide a perspective on the state-of-the-art approaches and their applications, showing recent advances. We categorize different approaches, introducing a taxonomy with a focus on their characteristics, strengths, weaknesses, and suitability for different scenarios. We compare the advances each approach made to the state-of-the-art and provide a direction for future research in Model-Free Goal Recognition.
SV8079
Strategic Aspects of Stable Matching Markets: A Survey
12 min. talk | August 8th at 15:00 | Session: GTEP: Game Theory and Economic Paradigms
[+] More
[-] Less
Matching markets consist of two disjoint sets of agents, where each agent has a preference list over agents on the other side. The primary objective is to find a stable matching between the agents such that no unmatched pair of agents prefer each other to their matched partners. The incompatibility between stability and strategy-proofness in this domain gives rise to a variety of strategic behavior of agents, which in turn may influence the resulting matching. In this paper, we discuss fundamental properties of stable matchings, review essential structural observations, survey key results in manipulation algorithms and their game-theoretical aspects, and more importantly, highlight a series of open research questions.
SV8081
A Survey of Robotic Language Grounding: Tradeoffs between Symbols and Embeddings
12 min. talk | August 8th at 11:30 | Session: ROB: Robotics (2/2)
[+] More
[-] Less
With large language models, robots can understand language more flexibly and more capable than ever before. This survey reviews and situates recent literature into a spectrum with two poles: 1) mapping between language and some manually defined formal representation of meaning, and 2) mapping between language and high-dimensional vector spaces that translate directly to low-level robot policy. Using a formal representation allows the meaning of the language to be precisely represented, limits the size of the learning problem, and leads to a framework for interpretability and formal safety guarantees. Methods that embed language and perceptual data into high-dimensional spaces avoid this manually specified symbolic structure and thus have the potential to be more general when fed enough data but require more data and computing to train. We discuss the benefits and tradeoffs of each approach and finish by providing directions for future work that achieves the best of both worlds.
SV8084
Robust Counterfactual Explanations in Machine Learning: A Survey
12 min. talk | August 6th at 15:00 | Session: ETF: AI Ethics, Trust, Fairness (1/2)
[+] More
[-] Less
Counterfactual explanations (CEs) are advocated as being ideally suited to providing algorithmic recourse for subjects affected by the predictions of machine learning models. While CEs can be beneficial to affected individuals, recent work has exposed severe issues related to the robustness of state-of-the-art methods for obtaining CEs. Since a lack of robustness may compromise the validity of CEs, techniques to mitigate this risk are in order. In this survey, we review works in the rapidly growing area of robust CEs and perform an in-depth analysis of the forms of robustness they consider. We also discuss existing solutions and their limitations, providing a solid foundation for future developments.
SV8085
Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models
12 min. talk | August 9th at 11:30 | Session: NLP: Language models
[+] More
[-] Less
Recently, large language models (LLMs) have shown remarkable capabilities including understanding context, engaging in logical reasoning, and generating responses. However, this is achieved at the expense of stringent computational and memory requirements, hindering their ability to effectively support long input sequences. This survey provides an inclusive review of the recent techniques and methods devised to extend the sequence length in LLMs, thereby enhancing their capacity for long-context understanding. In particular, we review and categorize a wide range of techniques including architectural modifications, such as modified positional encoding and altered attention mechanisms, which are designed to enhance the processing of longer sequences while avoiding a proportional increase in computational cost. The diverse methodologies investigated in this study can be leveraged across different phases of LLMs, i.e., training, fine-tuning and inference. This enables LLMs to efficiently process extended sequences. The limitations of the current methodologies is discussed in the last section along with the suggestions for future research directions, underscoring the importance of sequence length in the continued advancement of LLMs.
SV8086
Policy Space Response Oracles: A Survey
12 min. talk | August 7th at 15:00 | Session: MAS: Multi-agent learning
[+] More
[-] Less
Game theory provides a mathematical way to study the interaction between multiple decision makers. However, classical game-theoretic analysis is limited in scalability due to the large number of strategies, precluding direct application to more complex scenarios. This survey provides a comprehensive overview of a framework for large games, known as Policy Space Response Oracles (PSRO), which holds promise to improve scalability by focusing attention on sufficient subsets of strategies. We first motivate PSRO and provide historical context. We then focus on the strategy exploration problem for PSRO: the challenge of assembling effective subsets of strategies that still represent the original game well with minimum computational cost. We survey current research directions for enhancing the efficiency of PSRO, and explore the applications of PSRO across various domains. We conclude by discussing open questions and future research.
SV8098
Automated Essay Scoring: Recent Successes and Future Directions
12 min. talk | August 9th at 11:30 | Session: NLP: Natural Language Processing (3/3)
[+] More
[-] Less
Automated essay scoring (AES), the task of automatically assigning a score to an essay that summarizes its quality, is a challenging task that remains largely unsolved despite more than 50 years of research. This survey paper discusses the milestones in AES research and reflects on future directions.
SV8100
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
12 min. talk | August 6th at 15:00 | Session: NLP: Natural Language Processing (1/3)
[+] More
[-] Less
Despite the impressive performance of LLMs, their widespread adoption faces challenges due to substantial computational and memory requirements during inference. Recent advancements in model compression and system-level optimization methods aim to enhance LLM inference. This survey offers an overview of these methods, emphasizing recent developments. Through experiments on LLaMA(/2)-7B, we evaluate various compression techniques, providing practical insights for efficient LLM deployment in a unified setting. The empirical analysis on LLaMA(/2)-7B highlights the effectiveness of these methods. Drawing from survey insights, we identify current limitations and discuss potential future directions to improve LLM inference efficiency. We release the codebase to reproduce the results presented in this paper at https://github.com/nyunAI/Faster-LLM-Survey
SV8103
A Comprehensive Survey on Graph Reduction: Sparsification, Coarsening, and Condensation
12 min. talk | August 9th at 11:30 | Session: DM: Mining graphs (3/3)
[+] More
[-] Less
Many real-world datasets can be naturally represented as graphs, spanning a wide range of domains. However, the increasing complexity and size of graph datasets present significant challenges for analysis and computation. In response, graph reduction techniques have gained prominence for simplifying large graphs while preserving essential properties. In this survey, we aim to provide a comprehensive understanding of graph reduction methods, including graph sparsification, graph coarsening, and graph condensation. Specifically, we establish a unified definition for these methods and introduce a hierarchical taxonomy to categorize the challenges they address. Our survey then systematically reviews the technical details of these methods and emphasizes their practical applications across diverse scenarios. Furthermore, we outline critical research directions to ensure the continued effectiveness of graph reduction techniques.
SV8104
Medical Neural Architecture Search: Survey and Taxonomy
12 min. talk | August 8th at 15:00 | Session: ML: Machine Learning (6/6)
[+] More
[-] Less
This paper presents a comprehensive survey of Medical Neural Architecture Search (MedNAS), a burgeoning field at the confluence of deep learning and medical imaging. With the increasing prevalence of FDA-approved medical deep learning models, MedNAS emerges as a key area in leveraging computational innovations for healthcare advancements. Our survey examines the paradigm shift introduced by Neural Architecture Search (NAS), which automates neural network design, replacing traditional, manual designs. We explore the unique search spaces tailored for medical tasks on different types of data from images to EEG, the methodologies of MedNAS, and their impact on medical applications.
SV8106
Empowering Time Series Analysis with Large Language Models: A Survey
12 min. talk | August 7th at 15:00 | Session: ML: Machine Learning (5/6)
[+] More
[-] Less
Recently, remarkable progress has been made over large language models (LLMs), demonstrating their unprecedented capability in varieties of natural language tasks. However, completely training a large general-purpose model from the scratch is challenging for time series analysis, due to the large volumes and varieties of time series data, as well as the non-stationarity that leads to concept drift impeding continuous model adaptation and re-training. Recent advances have shown that pre-trained LLMs can be exploited to capture complex dependencies in time series data and facilitate various applications. In this survey, we provide a systematic overview of existing methods that leverage LLMs for time series analysis. Specifically, we first state the challenges and motivations of applying language models in the context of time series as well as brief preliminaries of LLMs. Next, we summarize the general pipeline for LLM-based time series analysis, categorize existing methods into different groups (\textit{i.e.}, direct query, tokenization, prompt design, fine-tune, and model integration), and highlight the key ideas within each group. We also discuss the applications of LLMs for both general and spatial-temporal time series data, tailored to specific domains. Finally, we thoroughly discuss future research opportunities to empower time series analysis with LLMs.
SV8112
A Survey on Extractive Knowledge Graph Summarization: Applications, Approaches, Evaluation, and Future Directions
12 min. talk | August 8th at 15:00 | Session: DM: Data Mining (2/2)
[+] More
[-] Less
With the continuous growth of large Knowledge Graphs (KGs), extractive KG summarization becomes a trending task. Aiming at distilling a compact subgraph with condensed information, it facilitates various downstream KG-based tasks. In this survey paper, we are among the first to provide a systematic overview of its applications and define a taxonomy for existing methods from its interdisciplinary studies. Future directions are also laid out based on our extensive and comparative review.
SV8121
A Survey on Neural Question Generation: Methods, Applications, and Prospects
12 min. talk | August 7th at 15:00 | Session: NLP: Natural Language Processing (2/3)
[+] More
[-] Less
In this survey, we present a detailed examination of the advancements in Neural Question Generation (NQG), a field leveraging neural network techniques to generate relevant questions from diverse inputs like knowledge bases, texts, and images. The survey begins with an overview of NQG’s background, encompassing the task’s problem formulation, prevalent benchmark datasets, established evaluation metrics, and notable applications. It then methodically classifies NQG approaches into three predominant categories: structured NQG, which utilizes organized data sources, unstructured NQG, focusing on more loosely structured inputs like texts or visual content, and hybrid NQG, drawing on diverse input modalities. This classification is followed by an in-depth analysis of the distinct neural network models tailored for each category, discussing their inherent strengths and potential limitations. The survey culminates with a forward-looking perspective on the trajectory of NQG, identifying emergent research trends and prospective developmental paths. Accompanying this survey is a curated collection of related research papers, datasets, and codes, all of which are available on GitHub. This provides an extensive reference for those delving into NQG.
SV8123
A Survey of Multimodal Sarcasm Detection
12 min. talk | August 7th at 15:00 | Session: ML: Machine Learning (5/6)
[+] More
[-] Less
Sarcasm is a rhetorical device that is used to convey the opposite of the literal meaning of an utterance. Sarcasm is widely used on social media and other forms of computer-mediated communication motivating the use of computational models to identify it automatically. While the clear majority of approaches to sarcasm detection have been carried out on text only, sarcasm detection often requires additional information present in tonality, facial expression, and contextual images. This has led to the introduction of multimodal models, opening the possibility to detect sarcasm in multiple modalities such as audio, images, text, and video. In this paper, we present the first comprehensive survey on multimodal sarcasm detection – henceforth MSD – to date. We survey papers published between 2018 and 2023 on the topic, and discuss the models and datasets used for this task. We also present future research directions in MSD.
SV8127
Social Learning through Interactions with Other Agents: A Survey
12 min. talk | August 6th at 15:00 | Session: ML: Machine Learning (2/6)
[+] More
[-] Less
Social learning plays an important role in the development of human intelligence. As children, we imitate our parent’s speech patterns until we are able to produce sounds; we learn from them praising us and scolding us, and as adults, we learn by working with others. In this work, we survey the degree to which this developmental paradigm — social learning — has been mirrored in machine learning. In particular, since learning socially requires interacting with others, we are interested in how embodied agents can and have utilised these techniques. This is especially in light of the degree to which recent advances in natural language processing (NLP) enable us to perform new forms of social learning. We look at how behaviour cloning and next-token prediction mirror human imitation, how learning from human feedback mirrors human education, and how we can go further to enable fully communicative agents that learn from each other.
We find that while individual social learning techniques have been used successfully, there has been little unifying work showing how to bring them together into socially embodied agents.
SV8132
Recent Advances in End-to-End Simultaneous Speech Translation
12 min. talk | August 8th at 10:00 | Session: NLP: Speech
[+] More
[-] Less
Simultaneous speech translation (SimulST) is a demanding task that involves generating translations in real-time while continuously processing speech input. This paper offers a comprehensive overview of the recent developments in SimulST research, focusing on four major challenges. Firstly, the complexities associated with processing lengthy and continuous speech streams pose significant hurdles. Secondly, satisfying real-time requirements presents inherent difficulties due to the need for immediate translation output. Thirdly, striking a balance between translation quality and latency constraints remains a critical challenge. Finally, the scarcity of annotated data adds another layer of complexity to the task. Through our exploration of these challenges and the proposed solutions, we aim to provide valuable insights into the current landscape of SimulST research and suggest promising directions for future exploration.
SV8145
A Survey on Network Alignment: Approaches, Applications and Future Directions
12 min. talk | August 6th at 15:00 | Session: DM: Mining graphs (2/3)
[+] More
[-] Less
Network alignment, the task of mapping corresponding nodes across networks, is attracting more attention for cross-network analysis in diverse domains, including social, biological, and co-authorship networks. Although a variety of methods have been proposed, we lack a holistic understanding of the approaches and applications. Our survey aims to bridge this gap by first proposing a taxonomy of network alignment, characterizing existing approaches, and then systematically summarizing and reviewing their performance and highlighting their scopes for future development. Finally, we discuss some important applications and give directions for future research within this domain.