Video recordings of the plenary sessions are available here.
Andrew Barron is the 2024 Shannon Lecturer. The Shannon Lecture takes place at 8:30am on Thursday, July 11 in Ballrooms II and III.
The ISIT 2024 plenary speakers are listed below in order of presentation (click on the links for abstracts and speaker bios):
- Rebecca Willett (8:30am Monday, July 8, Ballrooms II and III)
- Gregory Wornell (8:30am Tuesday, July 9, Ballrooms II and III)
- Venkatesan Guruswami (8:30am Wednesday, July 10, Ballrooms II and III)
- Emina Soljanin (8:30am Friday, July 12, Ballrooms II and III)
Shannon Lecture
Information Theory and High-Dimensional Bayes Computation by Andrew Barron (Yale University, USA)
Abstract: Information theory provides foundations and links among the problems of model discovery, prediction, compression, estimation, and communication of data sequences. Various procedures are available to tackle such problems. Among these, Bayes procedures are not only average-case optimal; they also provide favorable individual-case performance. Importantly for engineering and scientific practice, a number of Bayesian modeling developments are associated with computationally effective methods for sequence prediction, compression, and channel decoding. Laplace's approximation of Bayes factors, the use of Jeffreys' prior, their relationship to stochastic complexity, to minimax redundancy, and to minimax regret, the index of resolvability, the average-case optimality of Bayes predictive distributions for relative entropy loss, and the information-theoretic determination of minimax statistical risk provide some starting points which we may discuss at the overlap of Bayes theory and information theory.
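For orientation, one classical asymptotic ties several of these quantities together. It is a standard result (stated informally, under regularity conditions on a smooth k-parameter family), included here only as background and not as a statement of the lecture's content.

```latex
% Shtarkov (minimax) regret of a smooth k-parameter family {p_\theta} on
% sequences x^n, stated informally under standard regularity conditions:
\[
  \min_{q}\,\max_{x^n}\Bigl[\log\frac{1}{q(x^n)}
      \;-\;\min_{\theta}\log\frac{1}{p_\theta(x^n)}\Bigr]
  \;=\;
  \frac{k}{2}\,\log\frac{n}{2\pi}
  \;+\;\log\int_{\Theta}\sqrt{\det I(\theta)}\,d\theta
  \;+\;o(1).
\]
% Here I(\theta) is the Fisher information; the right-hand side is the
% stochastic-complexity term, and Jeffreys' prior
% w(\theta) \propto \sqrt{\det I(\theta)} is asymptotically the maximin prior
% in the corresponding redundancy game.
```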
Models for sequences of discrete outcomes and models for continuous-parameter function estimation provide natural playgrounds. For discrete data models, Laplace's rule of succession, the Krichevsky-Trofimov rule, the Shtarkov minimax regret rule, online learning with log-loss, the Context Tree Weighting algorithm of Willems et al., and capacity-achieving LDPC codes with Bayesian belief propagation/message passing are among the important developments we may discuss. Colleagues are exploring the impact of some of these models considerably beyond their originally intended context.
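To make the first two of these rules concrete, here is a minimal sketch (an illustration only, not code from the talk) that computes the sequential predictive probabilities of Laplace's rule of succession and the Krichevsky-Trofimov rule on a binary string and accumulates the resulting code length under log-loss.

```python
import math

def sequential_code_length(bits, rule="kt"):
    """Total code length, in bits, of a binary string under a sequential
    add-constant predictor: Laplace (add 1) or Krichevsky-Trofimov (add 1/2)."""
    alpha = 0.5 if rule == "kt" else 1.0
    counts = [0, 0]                      # symbol counts seen so far
    total = 0.0
    for b in bits:
        p = (counts[b] + alpha) / (sum(counts) + 2 * alpha)  # predictive prob.
        total += -math.log2(p)           # log-loss incurred on this symbol
        counts[b] += 1
    return total

x = [0, 1, 1, 0, 1, 1, 1, 0, 1, 1]
print("Laplace:", round(sequential_code_length(x, "laplace"), 3), "bits")
print("KT     :", round(sequential_code_length(x, "kt"), 3), "bits")
```

Comparing this accumulated code length with the best code length in hindsight gives the regret quantity discussed above.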
Particular attention will be given to continuous data models. We start with the Bayesian interpretation of Gauss's development of least squares and the Bayesian and information-theoretic implications of its extensions to recursive least squares, linear predictive coding, Kalman filtering, and online learning with squared error loss. As with certain discrete models, these continuous models permit explicit determination of procedures that are Bayes optimal and nearly pointwise regret optimal for arbitrary sequences. For log-concave distributions, the information-theoretic characterization of rapid mixing, initiated by Bakry and Émery and carried forward by various prominent scholars, brings many other Bayesian prediction and estimation problems into the computationally feasible playground, even in high dimensions. We may discuss various such problems. These include location estimation and linear regression problems with log-concave error distributions, for which the uniform prior is provably minimax for cumulative Kullback loss and minimax for data compression given initial data. Also included are Cover's universal portfolios, which are log-concave integrations that remain computable even with a large number of stocks. For Gaussian channel communication via superposition codes (also called sparse regression codes), adaptive successive decoders and approximate message passing algorithms for approximate computation of Bayes optimal decoders are provably computationally feasible and capacity achieving.
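As one concrete example of a log-concave integration from this list, the following sketch evaluates Cover's universal portfolio for two assets by averaging the wealths of all constantly rebalanced portfolios under a uniform prior. The price relatives are hypothetical and the grid-based integration is only for illustration.

```python
import numpy as np

def universal_portfolio_wealth(price_relatives, grid=1001):
    """Two-asset universal portfolio: wealth of the uniform mixture over all
    constantly rebalanced portfolios b in [0, 1] (Cover, 1991)."""
    bs = np.linspace(0.0, 1.0, grid)       # fraction of wealth in asset 1
    wealth = np.ones_like(bs)              # running wealth of each CRP
    for x1, x2 in price_relatives:         # per-period gross returns
        wealth *= bs * x1 + (1.0 - bs) * x2
    return wealth.mean()                   # approximates the integral over b

# hypothetical per-period price relatives for two assets
returns = [(1.02, 0.99), (0.97, 1.03), (1.01, 1.00), (0.95, 1.06)]
print("universal portfolio wealth:", universal_portfolio_wealth(returns))
```

The log of the accumulated wealth is a sum of concave functions of the portfolio vector, so the integrand is log-concave, which is what keeps the computation feasible when the number of stocks is large.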
However, the lack of provably effective optimization or sampling methods plagues the important classes of high-dimensional nonlinear function modeling problems, including modern artificial neural networks trained via deep learning. These network models can be proven to be information-theoretically, statistically, and approximation-theoretically accurate even in high-dimensional settings for suitable classes of functions. These artificial neural network models have multimodal posterior distributions. Nevertheless, we show, in joint work with Curtis McDonald, how to overcome the computation-theoretic challenge by introducing certain auxiliary parameters for which the conditional distribution of the network parameters given the data and the auxiliary parameters is always log-concave. Importantly, when the network parameter dimension exceeds the sample size to the 1.5 power, we show that the distribution of the auxiliary parameters also becomes log-concave. Accordingly, we can first sample the auxiliary parameters and then conditionally sample the network parameters to efficiently produce Bayes optimal Monte Carlo neural net estimates, appealing to the above-mentioned information-theoretic results. These provide the first demonstration of computational learnability of accurate statistical estimates for such neural networks, in particular for the class of functions with bounded variation with respect to the neural network class.
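Schematically, and only as a paraphrase of the paragraph above (the precise conditions are in the McDonald-Barron work), the two-stage sampler rests on the usual mixture decomposition of the posterior:

```latex
% Two-stage sampling schematic (paraphrase of the abstract, not a precise
% statement of the results):
\[
  p(w \mid \mathrm{data})
  \;=\;
  \int p(w \mid \xi, \mathrm{data})\, p(\xi \mid \mathrm{data})\, d\xi ,
\]
% where w denotes the network parameters and \xi the auxiliary parameters.
% The construction makes p(w | \xi, data) log-concave in w for every \xi,
% and, once the parameter dimension exceeds (sample size)^{3/2},
% p(\xi | data) is log-concave as well, so both stages can be handled by
% standard log-concave samplers and averaged into a Monte Carlo estimate.
```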
Biography: Andrew R. Barron, Professor of Statistics and Data Science at Yale University, has made outstanding contributions at the overlap of Information Theory with Probability and Statistics. Prior to joining Yale University in 1992, Barron was a faculty member in Statistics and Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. Barron received his MS and PhD degrees from Stanford University in Electrical Engineering in 1985 under the direction of Tom Cover and a Bachelor's degree in the fields of Mathematical Science and Electrical Engineering from Rice University in 1981. Barron is a Fellow of the IEEE, a Medallion Prize winner of the Institute of Mathematical Statistics, and a winner, along with Bertrand Clarke, of the IEEE Thompson Prize. Andrew Barron has served as Secretary of the Board of Governors of the IEEE Information Theory Society and several terms as an elected member of this Board. He has been an associate editor of the IEEE Transactions on Information Theory and the Annals of Statistics. Barron has served on and subsequently chaired the Thomas M. Cover Dissertation Prize Committee. At Yale University, Barron regularly teaches courses in Information Theory, Theory of Statistics, High-Dimensional Function Estimation, and Artificial Neural Networks. Barron has served terms as department chair, director of graduate studies, director of undergraduate studies in Statistics, director of undergraduate studies in Applied Mathematics, and courtesy appointee as Professor of Electrical Engineering. Barron has proudly mentored 20 PhD students. Often working with these students and other colleagues, Barron is known for several specific research accomplishments: in particular, for generalizing the AEP to continuous-valued ergodic processes, for proving an information-theoretic Central Limit Theorem, for determining information-theoretic aspects of portfolio estimation, for formulating the index of resolvability and providing an associated characterization of the performance of Minimum Description Length estimators, for determining the asymptotics of universal data compression in parametric families, for characterizing the concentration of Bayesian posteriors in the vicinity of parameters in the information support of the prior, for an information-theoretic determination of the minimax rates of function estimation, for providing an information-theoretic characterization of statistical efficiency, for providing an early unifying view of statistical learning networks, for developing approximation and estimation bounds for artificial neural networks and recent extensions to deep learning, for advancing greedy algorithms for training neural networks, for information-theoretic aggregation of least squares regressions, and for formulating and proving capacity-achieving sparse regression codes for Gaussian noise communication channels. Barron maintains homes in New Haven, Connecticut, and in Osijek, Croatia, with his wife Lidija. Barron is also a distinguished FAI free flight model glider competitor in the F1A class, as a five-time U.S. National Champion, a four-time U.S. National Team Member at World Championships (most recently in 2023), a two-time America's Cup Champion, and a co-manager and co-owner with family members of Barron Field, LLC.
Plenary Talks
Learning Low-rank Functions With Neural Networks by Rebecca Willett (University of Chicago, USA)
Abstract: Neural networks are increasingly prevalent and transformative across domains. Understanding how these networks operate in settings where mistakes can be costly (such as transportation, finance, healthcare, and law) is essential to uncovering potential failure modes. Many of these networks operate in the “overparameterized regime,” in which there are far more parameters than training samples, allowing the training data to be fit perfectly. What does this imply about the predictions the network will make on new samples? That is, if we train a neural network to interpolate training samples, what can we say about the interpolant, and how does this depend on the network architecture? In this talk, I will describe insights into the role of network depth using the notion of representation costs – i.e., how much it “costs” for a neural network to represent various functions. Understanding representation costs helps reveal the role of network depth in machine learning and the types of functions learned, relating them to Barron and mixed variation function spaces, such as single- and multi-index models.
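For readers unfamiliar with the term, one common formalization of representation cost in this literature (a generic definition, not necessarily the exact one used in the talk) is the minimum squared weight norm needed to realize a function with a given depth:

```latex
% A generic representation-cost definition for a depth-L architecture
% (details vary across papers in this literature):
\[
  R_L(f) \;=\; \inf\bigl\{\, \|\theta\|_2^2 \;:\; f_\theta = f,\;
      f_\theta \text{ a depth-}L \text{ network, width unrestricted} \,\bigr\}.
\]
% Regularized (or implicitly regularized) training tends to select
% interpolants of small representation cost, so characterizing R_L indicates
% which functions a given depth favors, for example single-index models
% f(x) = g(w^\top x) and multi-index models f(x) = g(Wx) with low-rank W,
% the "low-rank functions" of the title.
```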
Biography: Rebecca Willett is a Professor of Statistics and Computer Science and the Director of AI in the Data Science Institute at the University of Chicago, and she holds a courtesy appointment at the Toyota Technological Institute at Chicago. Her research is focused on the mathematical foundations of machine learning, scientific machine learning, and signal processing. Prof. Willett is the Deputy Director for Research at the NSF-Simons Foundation National Institute for Theory and Mathematics in Biology and a member of the NSF Institute for the Foundations of Data Science Executive Committee. She is the Faculty Director of the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship at the University of Chicago and helps direct the Air Force Research Lab University Center of Excellence on Machine Learning. Willett received the National Science Foundation CAREER Award in 2007, was a member of the DARPA Computer Science Study Group, received an Air Force Office of Scientific Research Young Investigator Program award in 2010, was named a Fellow of the Society for Industrial and Applied Mathematics in 2021, and was named a Fellow of the IEEE in 2022. Prof. Willett completed her PhD in Electrical and Computer Engineering at Rice University in 2005 and was an Assistant and then tenured Associate Professor of Electrical and Computer Engineering at Duke University from 2005 to 2013. She was an Associate Professor of Electrical and Computer Engineering, Harvey D. Spangler Faculty Scholar, and Fellow of the Wisconsin Institutes for Discovery at the University of Wisconsin-Madison from 2013 to 2018. She serves on the advisory boards of the US National Science Foundation's Institute for Mathematical and Statistical Innovation, the US National Science Foundation's Institute for the Foundations of Machine Learning, and the MATH+ Berlin Mathematics Research Center, as well as on committees of the National Academies of Sciences, Engineering, and Medicine.
Will We Ever Learn? A Sensor's Lament, and other Stories by Gregory Wornell (Massachusetts Institute of Technology, USA)
Abstract: Over many decades, information-theoretic analysis has proven extraordinarily useful in reimagining system architecture in diverse applications. Indeed, such analysis clarifies where information is and is not needed, and quantifies the impact of design constraints. Among other examples, this talk will focus on problems of acquisition and digital conversion of sensor data, which straddle the analog/digital interface. The lack of adaptability at this interface often necessitates considerable overprovisioning in contemporary systems and leads to a significant bottleneck in the information pipeline. Highlighting efforts within and beyond the community, this talk will discuss some of what information theory reveals about what might be possible in addressing these challenges, and about the prospects of learning at the edge.
Biography: Gregory W. Wornell received his Ph.D. from the Massachusetts Institute of Technology (MIT) in electrical engineering and computer science in 1991. Since then he has been on the faculty at MIT, where he is the Sumitomo Professor of Engineering in the department of Electrical Engineering and Computer Science (EECS). At MIT he leads the Signals, Information, and Algorithms Laboratory, and is affiliated with the Research Laboratory of Electronics (RLE), and the Computer Science and Artificial Intelligence Laboratory (CSAIL). He has been involved in the Information Theory and Signal Processing societies in a variety of capacities, and maintains a number of industrial relationships and activities. Among awards for his research and teaching is the 2019 IEEE Leon K. Kirchmayer Graduate Teaching Award.
A few options go a long way: List decoding and applications by Venkatesan Guruswami (University of California, Berkeley, USA)
Abstract: List decoding allows the error-correction procedure to output a small list of candidate codewords, and the decoding is deemed successful if the list includes the original uncorrupted codeword. List decoding has enjoyed a number of influential consequences. It allows bridging between the Shannon and Hamming worlds and achieving "capacity" even in worst-case error models. It serves as a versatile subroutine in varied error-correction scenarios not directly tied to list decoding. It boasts a diverse array of "extraneous" applications in computational complexity, combinatorics, cryptography, and quantum computing. And it has infused several novel algebraic, probabilistic, combinatorial, and algorithmic techniques and challenges into coding theory.
This talk will provide a glimpse of several facets of list decoding, its origins, evolution, constructions, connections, and applications.
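As a toy illustration of the basic definition (not drawn from the talk), the sketch below list-decodes a received word by returning every codeword of a small, made-up code within a chosen Hamming radius; decoding is deemed successful whenever the transmitted codeword appears in the returned list.

```python
from itertools import product

def hamming(u, v):
    """Hamming distance between two equal-length binary tuples."""
    return sum(a != b for a, b in zip(u, v))

def list_decode(received, codewords, radius):
    """Return every codeword within the given Hamming radius of `received`."""
    return [c for c in codewords if hamming(received, c) <= radius]

# Toy code: each of 3 message bits is repeated twice (length-6 codewords).
messages = list(product([0, 1], repeat=3))
code = [tuple(bit for bit in m for _ in range(2)) for m in messages]

received = (0, 1, 1, 1, 0, 0)             # a corrupted word
print(list_decode(received, code, radius=2))
```

On this received word, two codewords are equally close, so a unique decoder must give up, while the list decoder simply reports both candidates.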
Biography: Venkatesan Guruswami received his Bachelor's degree in Computer Science from the Indian Institute of Technology at Madras in 1997 and his Ph.D. in Computer Science from the Massachusetts Institute of Technology in 2001. He is currently a Chancellor’s Professor in the Electrical Engineering and Computer Science Department at the University of California, Berkeley, and a senior scientist at the Simons Institute for the Theory of Computing. He was a Miller Research Fellow at UC Berkeley and held faculty positions at the University of Washington and Carnegie Mellon University prior to his current position. His research interests span many topics such as coding and information theory, approximate optimization, computational complexity, pseudo-randomness, and related mathematics. Prof. Guruswami has served the theoretical computer science community in several leadership roles. He is the current Editor-in-Chief of the Journal of the ACM, and was previously Editor-in-Chief of the ACM Transactions on Computation Theory. He has served as the president of the Computational Complexity Foundation and on the editorial boards of JACM, the SIAM Journal on Computing and the IEEE Transactions on Information Theory. He has been program committee chair for the conferences CCC (2012), FOCS (2015), ISIT (2018, co-chair), FSTTCS (2022), and ITCS (2024). Prof. Guruswami is a recipient of a Guggenheim Fellowship, a Simons Investigator award, the Presburger Award, Packard and Sloan Fellowships, the ACM Doctoral Dissertation Award, an IEEE Information Theory Society Paper Award and a Distinguished Alumnus Award from IIT Madras. He was an invited speaker at the 2010 International Congress of Mathematicians. Prof. Guruswami is a fellow of the ACM, IEEE, and AMS.
Codes: (Always) at Your Service by Emina Soljanin (Rutgers University, USA)
Abstract: Error control coding is essential in many scientific disciplines and nearly all telecommunication systems. Proposals for new codes and new roles of codes in communications and computing systems continue to appear. Each new proposal initially faces (justified) skepticism and pushback by practitioners until discarded or adopted as a necessary evil. Coding performance metrics have become hard to define and even harder to evaluate. The first part of this talk considers the service rate region of a code, a new performance metric of a distributed system that stores data redundantly using the code. It measures the storage system's ability to serve multiple users requesting different data objects. The second part of the talk asks if there is a coding gain in adding redundancy to distributed computing and how we can evaluate and achieve it.
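To illustrate the kind of question this metric asks, here is a minimal sketch with a toy system, hypothetical recovery sets, and unit server capacities (not an example from the talk): three servers store a, b, and a+b, so object a can be recovered from server 1 or from servers {2, 3}, and object b from server 2 or from servers {1, 3}; a demand vector is servable if the requests can be split across recovery sets without exceeding any server's capacity.

```python
import numpy as np
from scipy.optimize import linprog

def servable(lam_a, lam_b, mu=(1.0, 1.0, 1.0)):
    """Feasibility check for the toy 3-server system storing [a | b | a+b].
    Routing variables x = (x1, x2, y1, y2): rates of object-a requests sent to
    recovery sets {1} and {2,3}, and of object-b requests sent to {2} and {1,3}."""
    A_ub = np.array([[1, 0, 0, 1],    # load on server 1: x1 + y2 <= mu1
                     [0, 1, 1, 0],    # load on server 2: x2 + y1 <= mu2
                     [0, 1, 0, 1]])   # load on server 3: x2 + y2 <= mu3
    A_eq = np.array([[1, 1, 0, 0],    # all of lam_a must be routed
                     [0, 0, 1, 1]])   # all of lam_b must be routed
    res = linprog(c=np.zeros(4), A_ub=A_ub, b_ub=mu,
                  A_eq=A_eq, b_eq=[lam_a, lam_b], bounds=(0, None))
    return res.success

print(servable(1.5, 0.5))   # True: demand for a exceeds one server's capacity,
                            # but the coded recovery set {2,3} absorbs the excess
print(servable(1.5, 1.5))   # False: this demand vector cannot be served
```

The set of all demand vectors for which this routing problem is feasible is the service rate region of the code.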
Biography: Emina Soljanin is a Distinguished Professor of Electrical and Computer Engineering at Rutgers University. Before moving to Rutgers in January 2016, she was a (Distinguished) Member of Technical Staff for 21 years in Bell Labs Math Research. She received her Ph.D. and M.Sc. from Texas A&M University and her B.S. from the University of Sarajevo, all in Electrical Engineering. Prof. Soljanin's research interests and expertise are broad. She has participated in numerous research and business projects. These projects include designing the first distance-enhancing codes implemented in commercial magnetic storage devices, the first forward error correction for Bell Labs optical transmission devices, color space quantization for image processing, link error prediction methods for Hybrid ARQ wireless standards, network and rateless coding, and network data security and user anonymity. Her most recent activities are in distributed computing systems and quantum information science. Prof. Soljanin has served as an Associate Editor for Coding Techniques for the IEEE Transactions on Information Theory and has had various roles on other journal editorial boards, in workshop organization, and on conference program committees. She is an IEEE Fellow, an outstanding alumnus of the Texas A&M School of Engineering, the 2011 Padovani Lecturer, a 2016/17 Distinguished Lecturer, and the 2019 IEEE Information Theory Society President. Prof. Soljanin's favorite recognition is the 2023 Aaron D. Wyner Distinguished Service Award.