List of abstracts

Richard Combes (Centrale/Supelec)

Title: Distributed Multi-Armed Bandits without Collision Sensing

Abstract: In this talk we consider the distributed multi-armed bandit problem where M learners must select the best M arms out of a set of K arms in a distributed fashion, while avoiding collisions, that is choosing the same arm. We present several approaches depending on the type of collision sensing available to each players, and then focus on the case where no collision sensing is available. We propose a novel algorithm for this scenario, which circumvents two problems shared by all state-of-the-art algorithms: it does not need as an input a lower bound on the minimal expected reward of an arm, and its performance does not scale inversely proportionally to the minimal expected reward. We prove a theoretical regret upper bound to justify these claims. We complement our theoretical results with numerical experiments, showing that the proposed algorithm outperforms state-of-the-art in practice as well. This talk is based on joint work with Huang Wei and Cindy Trinh.

Short bio: Richard Combes is currently an associate professor in Supelec. He received the Engineering Degree from Telecom Paristech (2008), the Master Degree in Mathematics from university of Paris VII (2009) and the Ph.D. degree in Mathematics from university of Paris VI (2013). He was a visiting scientist at INRIA (2012) and a post-doc in KTH (2013). He received the best paper award at SIGMETRICS 2019 and CNSM 2011. His current research interests are machine learning, networks and probability.

-----------------------------------------------------------------

Junghyun Lee (KAIST)

Title: Nearly Optimal Latent State Decoding in Block MDPs

Abstract: We investigate the problems of model estimation and reward-free learning in episodic Block MDPs. In these MDPs, the decision maker has access to rich observations or contexts generated from a small number of latent states. We are first interested in estimating the latent state decoding function (the mapping from the observations to latent states) based on data generated under a fixed behavior policy. We derive an information-theoretical lower bound on the error rate for estimating this function and present an algorithm approaching this fundamental limit. In turn, our algorithm also provides estimates of all the components of the MDP. We then study the problem of learning near-optimal policies in the reward-free framework. Based on our efficient model estimation algorithm, we show that we can infer a policy converging (as the number of collected samples grows large) to the optimal policy at the best possible rate. Interestingly, our analysis provides necessary and sufficient conditions under which exploiting the block structure yields improvements in the sample complexity for identifying near-optimal policies. When these conditions are met, the sample complexity in the minimax reward-free setting is improved by a multiplicative factor n, where n is the number of possible contexts.

Short bio: Junghyun Lee is a 2nd-year PhD student at the Kim Jaechul Graduate School of AI at KAIST, where he is jointly advised by Prof. Se-Young Yun and Prof. Chulhee Yun. His research focuses on mathematical and theoretical AI, with a recent emphasis on achieving tight statistical guarantees—such as regret and sample complexity—for interactive machine learning, encompassing sequential decision-making (bandits, online learning), reinforcement learning (MDPs), and active learning. He is also passionate about exploring fundamental concepts in statistics and probability theory, algorithmic fairness, and deep learning theory from both optimization and statistical perspectives. Essentially, he is intrigued by any machine learning challenge that requires rigorous mathematical analysis. His research has been published in international AI conferences, including AISTATS, NeurIPS, ICLR, and AAAI. He received the Best Student Paper Award at OPODIS 2023. Junghyun holds an MSc from the same school and a BSc in Mathematical Sciences and Computer Science (double major) from KAIST.

-----------------------------------------------------------------

Rajesh Sundaresan (Indian Institute of Science)

Title: Reputation-based information design for increasing prosocial behaviour

Abstract: We will discuss an information design problem that uses (1) norms and conventions in a society or network and (2) individuals' desires to have good reputation (in keeping with these norms and conventions), to increase the quantum of prosocial actions. A function of each individual's action is made public and considerations of the resulting reputation improves prosocial actions. The planner designs the function to make public and agents respond with their actions. We will study the equilibria that ensue and will highlight some interesting properties. The problem came out of a field experiment conducted in about 20,000 households in Kerala on how effective feedback can be in reducing residential energy consumption. The talk will be on joint work with Alexandre Reiffers-Masson.

Short bio: Rajesh Sundaresan is a Professor of Electrical Communication Engineering, an Associate Faculty at the Robert Bosch Centre for Cyber-Physical Systems, and the current Dean of the Division of Electrical, Electronics, and Computer Sciences, at the Indian Institute of Science. His research interests include decision theory, communication, computation, and control over networks, cyber-social systems, and data-driven decision frameworks for public health responses.

-----------------------------------------------------------------

Ruo-Chun Tzeng and Po-An Wang (KTH)

Title: Faster Algorithms for Combinatorial Semi-Bandits

Abstract: This talk is based on two papers concerned with combinatorial semi-bandits and addressing the challenge of achieving both statistical and computational efficiency. We focus on two popular learning tasks: best arm identification and regret minimization.

In the first part of the talk, corresponding to the paper "Closing the Computational-Statistical Gap in Best Arm Identification for Combinatorial Semi-Bandits", we aim to identify the best action with a probability of at least 1-delta while minimizing the number of samples used. Several statistically optimal methods exist, but none are computationally efficient. This inefficiency stems fronm the complex computation of the most confusing parameter in the sample complexity lower bound. We propose a two-player game algorithm to approximate this parameter. The resulting first-order sampling rule (P-FWS) runs in polynomial time and achieves statistical optimality.

The second part of the talk, "Matroid Semi-Bandits in Sublinear Time", focuses on regret minimization in bandit within a matroid structure. Current state-of-the-art algorithms, such as CUCB and KL-OSM, require time complexity at least linear in the number K of arms. By dynamically maintaining an approximate maximum-weight base over inner-product weights, we introduce the first algorithm with per-round time complexity that is sublinear in K, while asymptotically matching the gap-dependent lower bound achieved by CUCB.

Short bios: Ruo-Chun is a KTH doctoral student, working with Prof. Aristides Gionis on signed graph mining and with Prof. Alexandre Proutiere on graph bandits. For the former, she is working with explict pattern mining. For the latter, she is focusing on the computational aspect of combiantorial bandits. Po-An is a fourth-year PhD student at KTH Royal Institute of Technology, working under the guidance of Prof. Alexandre Proutiere. In June 2023, he embarked on an enriching internship journey with CyberAgent in Japan. Prior to his academic pursuit at KTH, he gained valuable research experience as an assistant in Chi-Jen Lu’s lab at Academia Sinica.

-----------------------------------------------------------------

Cesar A. Uribe (Rice University)

Title: On Graphs with Finite-Time Consensus

Abstract: In this talk, we present sequences of graphs satisfying the finite-time consensus property (i.e., iterating through such a finite sequence is equivalent to performing global or exact averaging) and their use in Gradient Tracking. We provide an explicit weight matrix representation of the studied sequences and prove its finite-time consensus property. Moreover, we incorporate the studied finite-time consensus topologies into Gradient Tracking and present a new algorithmic scheme called Gradient Tracking for Finite-Time Consensus Topologies (GT-FT). We analyze the new scheme for nonconvex problems with stochastic gradient estimates. Our analysis shows that the convergence rate of GT-FT does not depend on the heterogeneity of the agents' functions or the connectivity of any individual graph in the topology sequence. Furthermore, owing to the sparsity of the graphs, GT-FT requires lower communication costs than Gradient Tracking using the static counterpart of the topology sequence.

Short bio: Cesar A. Uribe is the Louis Owen Assistant Professor at the Department of Electrical and Computer Engineering at Rice University. He received the M.Sc. degrees in systems and control from the Delft University of Technology in The Netherlands and in applied mathematics from the University of Illinois at Urbana-Champaign in 2013 and 2016, respectively. He also received the Ph.D. degree in electrical and computer engineering at the University of Illinois at Urbana-Champaign in 2018. He was a Postdoctoral Associate in the Laboratory for Information and Decision Systems-LIDS at the Massachusetts Institute of Technology-MIT until 2020. He is the recipient of the Ralph E. Powe Junior Faculty Enhancement Award, 100k Strong in the Americas Innovation Award, and the Google Research Scholar Award. He held a visiting professor position at the Moscow Institute of Physics and Technology until 2022. His research interests include distributed learning and optimization, decentralized control, algorithm analysis, and computational optimal transport.

-----------------------------------------------------------------

Adam Wierman (Caltech)

Title: MARL and Machine-Learned Advice in Strategic Settings

Abstract: This talk will discuss recent results about learning in strategic settings. We will focus on the classical setting of two-player stochastic games and study the convergence of multi-agent reinforcement learning, connecting smoothed fictious play with modern algorithms like DQN. Then, we will show how to balance exploitative play (using learning tools) with robust play (using game-theoretic tools) in strategic settings by building on the emerging literature of learning-augmented algorithms.

Short bio: Adam Wierman is the Carl F Braun Professor in the Department of Computing and Mathematical Sciences at Caltech. He received his Ph.D., M.Sc., and B.Sc. in Computer Science from Carnegie Mellon University. Adam’s research strives to make the networked systems that govern our world sustainable and resilient. He is best known for his work spearheading the design of algorithms for sustainable data centers, which as seen significant industry adoption (e.g. through the startup Verrus), and his work on heavy-tails, including co-authoring a book on “The Fundamentals of Heavy Tails”. He is a recipient of multiple awards, including the ACM Sigmetrics Rising Star award, the ACM Sigmetrics Test of Time award, the IEEE INFOCOM Test of Time award, the IEEE Communications Society William R. Bennett Prize, the Caltech IDEA Advocate award, multiple teaching awards, and is a co-author of papers that have received “best paper” awards at a wide variety of conferences across computer science, power engineering, and operations research.

-----------------------------------------------------------------

Kuang Xu (Stanford)

Title: Experimenting under Stochastic Congestion

Abstract: We study randomized experiments in a service system when stochastic congestion can arise from temporarily limited supply and/or demand. Such congestion gives rise to cross-unit interference between the waiting customers, and analytic strategies that do not account for this interference may be biased. In current practice, one of the most widely used ways to address stochastic congestion is to use switchback experiments that alternatively turn a target intervention on and off for the whole system. We find, however, that under a queueing model for stochastic congestion, the standard way of analyzing switchbacks is inefficient, and that estimators that leverage the queueing model can be materially more accurate. We also consider a new class of experimental design, which can be used to estimate a policy gradient of the dynamic system using only unit-level randomization, thus alleviating key practical challenges that arise in running a switchback. Preprint: https://arxiv.org/abs/2302.12093.

Short bio: Kuang Xu (Chinese: 许匡) is a Tenured Associate Professor at Stanford Graduate School of Business. He is an expert in Operations Research, AI and Data Science innovation, supply chains and logistics, and data-driven decision-making. He is a Co-Creator of AI and Data Science Strategy, the first Stanford course focusing on the strategy, management and entrepreneurship of AI and Data Science. He is a Co-Director of the Stanford GSB Value Chain Innovation Initiative.

Kuang’s research focuses on decision-making under uncertainty, leveraging tools from operations research, statistics and machine learning. His work has been published in leading academic journals including Operations Research and Management Science, and has received a number of prestigious awards, including the George E. Nicholson Prize from the Institute for Operations Research and the Management Sciences (INFORMS), the Best Paper Award as well as Outstanding Student Paper Award from the Association for Computing Machinery (ACM), Special Interest Group on Measurement and Evaluation (SIGMETRICS), and an ACM SIGMETRICS Rising Star Research Award. He serves as an Associate Editor for Management Science and Operations Research in Data Science and Stochastic Modeling. His research and writing have been featured in a variety of media outlets including the NPR, PBS, NBC and USA Today.

Kuang advises companies and investment funds on how to build core AI and Data Science capabilities and strategic moats. He has served as the Chief Data Science Advisor for Shipt Inc., Senior Advisor to Uber Inc., and scientific advisors to a number of startups as well as venture capital and private equity funds.

Kuang received his Ph.D. degree in Electrical Engineering and Computer Science from MIT (2019), and the Bachelor of Science degree from the University of Illinois at Urbana-Champaign (2009). He is a native of Suzhou, China.

-----------------------------------------------------------------

Bert Zwart (Eindhoven University of Technology)

Title: Optimization under rare events: scaling laws for solutions of chance-constrained programs

Abstract: Inspired by the need to develop a fundamental framework for Probabilistic Dimensioning of Frequency Restoration Reserves in the European Power Grid we investigate a class of chance-constraint programs in which profit needs to be maximized while enforcing that a given adverse event (such as a blackout) remains rare.

Using techniques from large deviations and extreme-value theory, we show how the optimal value scales as the prescribed bound on the violation probability becomes small and how convex programs emerge in the limit. We also use our results to show that the popular CVAR approximation is asymptotically optimal under light-tail assumptions but is sub-optimal in a heavy-tailed setting. Then, we apply point process techniques and random set theory to study the suboptimality gap in the Monte Carlo-based scenario approach.

Short bio: Bert Zwart is a professor at Eindhoven University of Technology (TU/e). Bert’s research expertise is in applied probability and stochastic networks, in particular rare event analysis and simulation, scaling limits, scheduling under uncertainty, dynamic pricing and applications in communication and energy networks. The outcome of Bert’s research makes it possible to analyze complex systems and procedures, in which traditional methods of analysis are unusable, due to the number of factors and variables involved.

The scope of his activities extends to mathematical models for computer and communication networks, call centers and production processes, which are all frequently highly dimensional and, therefore, difficult to analyze. His research aims to reduce complexity of such models by focusing on their macroscopic behavior. In addition, a new application area for stochastic operations research is emerging - energy networks, especially electricity grids - which brings new and urgent research questions.