ARC
ARC-types, our seminar series
season 2019/2020
Talks suspended until further notice due to the Covid pandemic.
Hagit Attiya (Technion - Israel Institute of Technology)
Fences and RMRs Required for Synchronization
Date: Mon Feb 24 @ 11:00 am
Place: Aula Seminari, 3rd floor, via Salaria 113 (main CS building)
Compiler optimizations that execute memory accesses out of (program) order often lead to incorrect execution of concurrent programs. These re-orderings are prohibited by inserting costly fence (memory barrier) instructions. The inherent Fence Complexity is a good estimate of an algorithm's time complexity, as is its RMR complexity: the number of Remote Memory References the algorithm must issue. When write instructions are executed in order, as in the Total Store Order (TSO) model, it is possible to implement a lock (and other objects) using only one RAW fence and an optimal O(n log n) RMR complexity. However, when store instructions may be re-ordered, as in the Partial Store Order (PSO) model, we prove that there is an inherent tradeoff between fence and RMR complexities. The proof relies on an interesting encoding argument.
Gabor Lugosi (Pompeu Fabra University)
Network archeology: on revealing the past of random trees
Date: Tue Feb 18 @ 3:30 pm
Place: Aula Seminari, 3rd floor, via Salaria 113 (main CS building)
Networks are often naturally modeled by random processes in which nodes of the network are added one-by-one, according to some random rule. Uniform and preferential attachment trees are among the simplest examples of such dynamically growing networks. The statistical problems we address in this talk regard discovering the past of the network when a present-day snapshot is observed. Such problems are sometimes termed "network archeology". We present a few results that show that, even in gigantic networks, a lot of information is preserved from the very early days.
Gabor Lugosi is an ICREA research professor at the Department of Economics, Pompeu Fabra University, Barcelona. He graduated in electrical engineering at the Technical University of Budapest in 1987, and received his Ph.D. from the Hungarian Academy of Sciences in 1991. His main research interests include the theory of machine learning, combinatorial statistics, inequalities in probability, random graphs and random structures, and information theory.
Andreas Krause (ETH Zurich)
Safe and Efficient Exploration in Reinforcement Learning
Date: Wed Feb 5 @ 3:30 pm
Place: Aula Seminari, 3rd floor, via Salaria 113 (main CS building)
At the heart of Reinforcement Learning lies the challenge of trading off exploration -- collecting data for identifying better models -- against exploitation -- using the current estimates to make decisions. In simulated environments (e.g., games), exploration is primarily a computational concern. In real-world settings, exploration is costly and potentially dangerous, as it requires experimenting with actions that have unknown consequences. In this talk, I will present our work towards rigorously reasoning about safety of exploration in reinforcement learning. I will discuss a model-free approach, where we seek to optimize an unknown reward function subject to unknown constraints. Both reward and constraints are revealed through noisy experiments, and safety requires that no infeasible action is chosen at any point. I will also discuss model-based approaches, where we learn about system dynamics through exploration, yet need to verify safety of the estimated policy. Our approaches use Bayesian inference over the objective, constraints and dynamics, and -- under some regularity conditions -- are guaranteed to be both safe and complete, i.e., converge to a natural notion of reachable optimum. I will also present recent results harnessing the model uncertainty for improving efficiency of exploration, and show experiments on safely and efficiently tuning cyber-physical systems in a data-driven manner.
Andreas Krause is a Professor of Computer Science at ETH Zurich, where he leads the Learning & Adaptive Systems Group. He also serves as Academic Co-Director of the Swiss Data Science Center. Before that he was an Assistant Professor of Computer Science at Caltech. He received his Ph.D. in Computer Science from Carnegie Mellon University (2008) and his Diplom in Computer Science and Mathematics from the Technical University of Munich, Germany (2004). He is a Microsoft Research Faculty Fellow and a Kavli Frontiers Fellow of the US National Academy of Sciences. He received ERC Starting Investigator and ERC Consolidator grants, the Deutscher Mustererkennungspreis, an NSF CAREER award, the Okawa Foundation Research Grant recognizing top young researchers in telecommunications as well as the ETH Golden Owl teaching award. His research on machine learning and adaptive systems has received awards at several premier conferences and journals, including the ACM SIGKDD Test of Time award 2019. Andreas Krause served as Program Co-Chair for ICML 2018, and is regularly serving as Area Chair or Senior Program Committee member for ICML, NeurIPS, AAAI and IJCAI, and as Action Editor for the Journal of Machine Learning Research.
Charles Elkan (University of California, San Diego)
Learning to rank results optimally in search and recommendation engines
Date: Fri Jan 17 @ 11:00 am
Place: Aula Seminari, via Salaria 113, last floor
Consider the scenario where an algorithm is given a context, and then it must select a slate of relevant results to display. As four special cases, the context may be a search query, a slot for an advertisement, a social media user, or an opportunity to show recommendations. We want to compare many alternative ranking functions that select results in different ways. However, A/B testing with traffic from real users is expensive. This research provides a method to use traffic that was exposed to a past ranking function to obtain an unbiased estimate of the utility of a hypothetical new ranking function. The method is a purely offline computation, and relies on assumptions that are quite reasonable. We show further how to design a ranking function that is the best possible, given the same assumptions. Learning optimal rankings for search results given queries is a special case. Experimental findings on data logged by a real-world e-commerce web site are positive.
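The abstract does not spell out the estimator, but a standard way to reuse logged traffic in this fashion is inverse propensity scoring (IPS); the sketch below is illustrative only (the data layout and function names are hypothetical, not the talk's actual method), showing how a deterministic new policy can be evaluated offline against logs collected under a randomized old one:

```python
def ips_estimate(logs, new_policy):
    """Unbiased offline estimate of new_policy's expected utility from traffic
    logged under an old randomized policy (inverse propensity scoring).

    logs: list of (context, shown_result, logging_propensity, utility) tuples,
          where logging_propensity is the probability that the old policy
          showed shown_result in that context.
    new_policy: maps a context to the result it would show (deterministic here).
    """
    total = 0.0
    for context, shown, propensity, utility in logs:
        if new_policy(context) == shown:      # importance weight is 1/p ...
            total += utility / propensity     # ... when the policies agree
        # contexts where the policies disagree contribute 0
    return total / len(logs)

# Toy check: old policy showed "a" or "b" uniformly at random (propensity 0.5).
logs = [
    ("q1", "a", 0.5, 1.0),  # "a" was useful for query q1
    ("q1", "b", 0.5, 0.0),
    ("q2", "a", 0.5, 0.0),
    ("q2", "b", 0.5, 1.0),  # "b" was useful for query q2
]
always_a = lambda ctx: "a"
oracle = lambda ctx: "a" if ctx == "q1" else "b"
print(ips_estimate(logs, always_a))  # 0.5: "a" helps only on q1
print(ips_estimate(logs, oracle))    # 1.0: picks the useful result everywhere
```

The estimate is unbiased only when every result the new policy might show had nonzero propensity under the old one, which is one of the "quite reasonable assumptions" such methods rely on.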
Charles Elkan is an adjunct professor of computer science at the University of California, San Diego (UCSD). His full-time position is as managing director and global head of machine learning at Goldman Sachs. From 2014 to 2018 he was the first Amazon Fellow, leading a team of over 30 scientists and engineers in Seattle, Palo Alto, and New York doing research and development in applied machine learning for both e-commerce and cloud computing. Before joining Amazon, Dr. Elkan was a tenured full professor of computer science at UCSD. His Ph.D. is from Cornell and his undergraduate degree from Cambridge, and he has held visiting positions at Harvard and MIT. Dr. Elkan's students have gone on to faculty positions at universities including Columbia, Carnegie Mellon, the University of Washington, and Stanford, and to leading roles in industry.
Silvio Lattanzi (Google Zurich)
Online Scheduling via Learned Weights
Date: Wed Dec 18 @ 3:00 pm (CANCELLED)
Place: viale Regina Elena 295/B, building F, last floor
Online algorithms are a hallmark of worst case optimization under uncertainty. On the other hand, in practice, the input is often far from worst case, and has some predictable characteristics. A recent line of work has shown how to use machine learned predictions to circumvent strong lower bounds on competitive ratios in classic online problems such as ski rental and caching. We study how predictive techniques can be used to break through worst case barriers in online scheduling.
The makespan minimization problem with restricted assignments is a classic problem in online scheduling theory. Worst case analysis of this problem gives Ω(log m) lower bounds on the competitive ratio in the online setting. We identify a robust quantity that can be predicted and then used to guide online algorithms to achieve better performance. Our predictions are compact in size, having dimension linear in the number of machines, and can be learned using standard off-the-shelf methods. The performance guarantees of our algorithms depend on the accuracy of the predictions: given predictions with error η, we show how to construct O(log η)-competitive fractional assignments.
We then give an online algorithm that rounds any fractional assignment into an integral schedule. Our algorithm is O((log log m)^3)-competitive and we give a nearly matching Ω(log log m) lower bound for online rounding algorithms. Altogether, we give algorithms that, equipped with predictions with error η, achieve O(log η (log log m)^3) competitive ratios, breaking the Ω(log m) lower bound even for moderately accurate predictions.
Joint work with Thomas Lavastida, Benjamin Moseley and Sergei Vassilvitskii
Silvio has been a Research Scientist at Google Zurich since April 2017. Before that, he was part of the NY Algorithms group at Google New York (2011-2017). He received his PhD from Sapienza University of Rome under the supervision of Alessandro Panconesi.
His research interests are in the areas of algorithms, machine learning and information retrieval.
Mohsen Ghaffari (ETH Zurich)
Network Decomposition and Derandomization for Distributed Algorithms
Date: Thu November 28 @ 3:30 pm
Place: viale Regina Elena 295/B, building F, last floor
We present an efficient deterministic distributed algorithm for network decomposition and explain how this implies a general distributed derandomization theorem, proving that P-LOCAL = P-RLOCAL. That is, informally, any efficient (i.e., polylogarithmic-time) randomized distributed algorithm for any locally checkable graph problem can be derandomized to an efficient deterministic distributed algorithm. This leads to near-exponential improvements in deterministic and, perhaps surprisingly, randomized distributed algorithms for numerous graph problems, as well as some improvements for massively parallel algorithms (i.e., MapReduce). These results settle several decades-old and central open problems in distributed graph algorithms, including Linial's well-known MIS question [FOCS'87], which had been called the most outstanding problem in the area. This is based on joint work with my student Vaclav Rozhon.
Mohsen Ghaffari is an assistant professor of computer science at ETH Zurich. He joined ETH after receiving a PhD from MIT. Mohsen has a broad range of interests in theoretical computer science, with a focus on distributed algorithms, parallel algorithms, and network algorithms. His work has been recognized with several prestigious awards, including MIT's 2017 Sprowls award for the best thesis in computer science, ACM's 2017 Principles of Distributed Computing dissertation award, an honorable mention for ACM's 2017 doctoral dissertation award, seven best paper or best student paper awards at top theory and distributed computing conferences, a 2019 ERC Starting Grant, and a 2019 Google Faculty Research Award.
Luca Trevisan (Bocconi University)
Max Cut is Approximated by Subexponential-Size Linear Programs
Date: Mon December 2 @ 3:30 pm
Place: viale Regina Elena 295/B, building F, last floor
We show that, for every epsilon>0, there is a linear programming relaxation of the Max Cut problem with at most exp(n^epsilon) variables and constraints that achieves approximation ratio at least 1/2 + delta, for some delta(epsilon)>0. Specifically, this is achieved by linear programming relaxations in the Sherali-Adams hierarchy. Previously, the only sub-exponential time algorithms known to approximate Max Cut with an approximation ratio better than 1/2 were based on semidefinite programming or spectral methods. We also provide subexponential time approximation algorithms based on linear programming for Khot's Unique Games problem, which have a qualitatively similar performance to previously known subexponential time algorithms based on spectral methods and on semidefinite programming. Joint work with Sam Hopkins and Tselil Schramm.
Luca Trevisan is a professor of Computer Science at Bocconi University. Luca studied at the Sapienza University of Rome, was a post-doc at MIT and at DIMACS, and was on the faculty of Columbia University, U.C. Berkeley, and Stanford before returning to Berkeley in 2014 and, at long last, moving back to Italy in 2019. Luca's research is focused on computational complexity, on analysis of algorithms, and on problems at the intersection of pure mathematics and theoretical computer science. Luca received the STOC'97 Danny Lewin (best student paper) award, the 2000 Oberwolfach Prize, and the 2000 Sloan Fellowship. He was an invited speaker at the 2006 International Congress of Mathematicians. He is a recipient of a 2019 ERC Advanced Grant.
Raffaele Giancarlo (Dipartimento di Matematica ed Informatica, University of Palermo)
Algorithms, Theoretical Computer Science and Epigenomics: Mining Mathematical Laws for Predicting Chromatin Organization and Nucleosome Occupancy in Eukaryotic Genomes
Date: Wed October 30, 3:00 pm
Place: viale Regina Elena 295/B, building F, last floor
Epigenomics is the study of modifications to the genetic material of a cell that do not depend on changes in the DNA sequence; these modifications instead involve specific proteins around which DNA wraps. The end result is that epigenomic changes play a fundamental role in the proper working of each cell in Eukaryotic organisms. A particularly important part of Epigenomics concentrates on the study of chromatin, that is, a fiber composed of a DNA-protein complex that is highly characteristic of Eukaryotes. Understanding how chromatin is assembled and how it changes is fundamental for Biology. Starting from an open problem regarding nucleosomes and the so-called 30 nm fiber, posed by R. Kornberg nearly 30 years ago, we show how the use of simple ideas from Theoretical Computer Science can, surprisingly, help in making progress towards a mathematical characterization of nucleosome positioning in Eukaryotes, based only on knowledge of the DNA sequence. Such a characterization was conjectured not to exist. As a corollary, we obtain an algorithm for nucleosome occupancy prediction that is optimal in time and guarantees a two-orders-of-magnitude improvement in time over Machine Learning counterparts. In addition, statistically sound data structures, i.e., Epigenomic Dictionaries, are also introduced in order to effectively mine epigenomic datasets for the discovery of further regularities. Again, those data structures are much more sensitive than the Machine Learning counterparts, and they can profitably accommodate information from different sources.
Joint Work with: Simona Ester Rombo, Dipartimento di Matematica ed Informatica, University of Palermo Italy; Filippo Utro, Computational Biology Center, IBM T.J. Watson Research, Yorktown Heights, NY, USA.
Raffaele Giancarlo is Full Professor of Computer Science, Faculty of Science, Universita di Palermo. He obtained a Ph.D. in Computer Science from Columbia University in 1990, defending one of the first Ph.D. theses on Algorithms and Computational Biology. He was awarded several fellowships, among which the AT&T Bell Labs Post-Doctoral Fellowship in Mathematical and Information Sciences and a CNRS Visiting Scientist Grant. He has been an Invited Keynote Speaker at several conferences and summer schools, including the SIAM International Conference in Applied Mathematics, and has held visiting scientist positions at many research labs and universities in both the USA and Europe, such as AT&T Shannon Laboratories, the Max Planck Institute for Molecular Genetics - Bioinformatics Section, Berlin, and Computer Science, Haifa University. Moreover, he has served either as Chairman or as a Member of the Scientific Committees of several conferences directly related to the theme of the project, such as Workshop on Algorithms in Bioinformatics, Combinatorial Pattern Matching, and String Processing and Information Retrieval. He is currently on the Editorial Boards of Theoretical Computer Science, Algorithms for Molecular Biology, BMC Bioinformatics and BMC Research Notes. He has been the principal investigator of several Italian Ministry of Education research projects in Bioinformatics and of one CNRS Grant. Moreover, he has been a reviewer for most of the best established journals and conferences in Computational Biology and Theoretical Computer Science, as well as for several National Granting Agencies, including the US NSF and the Royal Society. He is also regularly consulted by universities, nationally and internationally, to assess faculty promotions.
As for involvement in National and Local Higher Education structures, he has been President of the Computer Science Curricula, University of Palermo, and a Member of the Italian Computer Science Curricula Commission of the Italian Association of Computer Science Researchers (GRIN). He is currently on the Board of Directors of the CINI Consortium, which represents all of the academic competences in Computer Science and Engineering present in Italy. Scientific results: he is a specialist in the design and analysis of combinatorial algorithms, with particular emphasis on string algorithms, ranging from Bioinformatics to Data Compression, with great interest and many contributions in Data Structures and Automata Theory, with applications to Bioinformatics. His scientific production consists of more than 100 papers in established journals and conferences. Moreover, he is coauthor of many patents, granted by the US Patent Office, in information retrieval.
Other speakers for the 2019/2020 season will include:
Luca Trevisan (Bocconi University)
Silvio Lattanzi (Google Zurich)
Andreas Krause (ETH Zurich)
Nelly Litvak (University of Twente)
Claire Mathieu (CNRS)
PAST SEASON (2018-2019)
Friday July 5, 2019, 11am, room G50, 3rd floor, building G (viale Regina Elena 295/B)
Pierre Fraigniaud, CNRS and University Paris Diderot, France
A Topological Perspective on Distributed Network Computing
This talk will survey the tight connections between distributed computing and combinatorial topology. Indeed, combinatorial topology has been identified as a universal framework in which many different models for distributed computing can be expressed, and using combinatorial topology has been shown to be quite useful for analyzing the complexity of problems in these models. The talk will also present some recent results about the application of combinatorial topology in the framework of distributed network computing.
Pierre Fraigniaud received his PhD degree in Computer Science from ENS Lyon in 1990. He joined CNRS in 1991. He was director of the Institut de Recherche en Informatique Fondamentale (IRIF) located at Université de Paris. His main research interest is parallel and distributed computing, and specifically the design and analysis of distributed algorithms and data structures for networks. He is a member of the Editorial Boards of Distributed Computing (DC) and Theory of Computing Systems (TOCS). He was Program Committee Chair of IPDPS 2017 (track Algorithms), ICALP 2014 (track Foundations of Networked Computation), PODC 2011, DISC 2005, SPAA 2001, and EuroPar 1996. In 2012, he received the Silver Medal from CNRS and, in 2014, the Prize for Innovation in Distributed Computing.
Tuesday June 25, 2019, 3pm, room G50, 3rd floor, building G (viale Regina Elena 295/B)
Ola Svensson, EPFL, Lausanne, Switzerland
Better Guarantees for k-Means Clustering
For quite some time, the best approximation algorithm for k-means was a simple local search heuristic yielding an approximation guarantee of 9, a ratio that is known to be tight with respect to such methods.
In this talk, we present an improved guarantee of 6.357 via a new primal-dual approach that allows us to exploit the geometric structure of k-means.
Finally, we also consider the semi-supervised setting. Given an oracle that returns whether two given points belong to the same cluster in a fixed optimal clustering, we show that few queries are sufficient to recover a clustering that is arbitrarily close to the optimal one.
This is based on joint works with Sara Ahmadian, Buddhima Gamlath, Ashkan Norouzi-Fard, and Justin Ward.
Ola Svensson is an Associate Professor at the School of Computer and Communication Sciences at EPFL, Switzerland. He is interested in theoretical aspects of computer science with an emphasis on the approximability of NP-hard optimization problems. His work has received several recognitions including the 2019 Michael and Sheila Held Prize by the National Academy of Sciences and best paper awards at FOCS and STOC.
Wednesday May 22, 2019, 12:00 noon, 2nd floor, building F (viale Regina Elena 295/B)
Benny Chor, School of Computer Science, Tel Aviv University
Identifying Relatives (Almost) Privately
The privacy of personal genomic data, and how much access to it is released to commercial and non-commercial entities, is an important concern. In this work, we study a specific aspect of this problem: how to determine familial relations among a large group of individuals, while keeping genomic DNA data private (or almost private). There are several commercial companies that provide such a service, albeit in a completely non-private manner: the individual customers supply their entire genomic DNA information to these companies. We describe the moderator model, where customers keep their DNA information locally, and a central moderator matches pairs of persons who are likely to be related. These persons then engage in a secure two-party computation, which checks more accurately for familial relations, and does not leak any additional information. The entire protocol leaks only a bounded amount of information to the moderator or to any other party.
Joint work with Orit Moskovich and Benny Pinkas
Benny Chor did his B.Sc. and M.Sc. studies in mathematics at the Hebrew University of Jerusalem, from 1978 to 1981. He did his Ph.D. at MIT, working under the supervision of Ron Rivest on public key crypto systems, from 1981 to 1985. His Ph.D. thesis, “Two Issues in Public Key Cryptography”, was selected as an ACM Distinguished Dissertation. After completing his Ph.D., he was a Bantrell post doctorate fellow at MIT, and then a post doctorate associate at Harvard University.
Upon returning to Israel, in 1987, he served as a faculty member in the Faculty of Computer Science at the Technion. In 2001, he joined the School of Computer Science, Tel-Aviv University, where he is now a full Professor. In October 2018 he was appointed head of the school. He has supervised (at the Technion and at Tel Aviv University) 8 Ph.D. students and 23 M.Sc. students.
He is currently interested in computational biology, in privacy, randomness, and in educational aspects of computer science, from primary school to university level. Together with Dr. Amir Rubinstein, he has recently completed the creation of an online course, in Hebrew, titled “First Steps in Computer Science and Python Programming”.
Thursday May 9, 2019, 12:00 noon, room G50, building G (viale Regina Elena 295/B)
Christian Scheideler (Paderborn University)
Distributed algorithms for hybrid networks
I will introduce a new communication model, called hybrid network, in which the nodes have the choice between two communication modes: a local mode that allows them to exchange messages with nearby nodes, and a global mode where communication between any pair of nodes is possible, but the amount of communication is limited. This can be motivated, for instance, by wireless networks in which we combine direct device-to-device communication with communication via the cellular infrastructure. I will show how to quickly build up a low-diameter, low-degree network of global edges (i.e., connections established via the global communication mode) on top of any network of local edges (i.e., connections given by the local communication mode) so that various problems such as computing minimum spanning trees and shortest paths can be solved much more efficiently than by just using the local edges.
This is joint work with various colleagues including John Augustine, Fabian Kuhn, and Mohsen Ghaffari.
Christian Scheideler received his M.S. and Ph.D. degrees in computer science from Paderborn University, Germany, in 1993 and 1996. He was a postdoc at the Weizmann Institute and at Paderborn University, where he also finished his Habilitation, an Assistant Professor at Johns Hopkins University, USA, and an Associate Professor at the Technical University of Munich, Germany, before becoming a full professor at Paderborn University in 2009. Among his various activities, he is currently the general chair of the ACM SPAA conference, an associate editor of the Journal of the ACM, an associate editor of the Journal of Computer and System Sciences, and the PC chair of ALGOSENSORS 2019.
Christian Scheideler has published more than 100 papers in refereed conferences and journals and has served on more than 50 conference committees. He has also been the PC chair and local arrangements chair of various conferences in the past. His main focus is on distributed algorithms and data structures, network theory (in particular, overlay networks, wireless networks, and hybrid networks), and the design of algorithms and architectures for robust and secure distributed systems as well as programmable matter.
Thursday April 18, 2019, 12:00 noon, room G50, building G (viale Regina Elena 295/B)
Devdatt Dubhashi (Chalmers University of Technology)
Stochastic Incremental Algorithms for Optimal Transport with SON Regularizer
Optimal Transport (OT) is a classic area in probability and statistics concerned with transferring mass from one probability distribution to another. Recently, OT has been used very successfully for domain adaptation in many applications in computer vision, texture analysis, tomographic reconstruction and clustering. We introduce a new regularizer for OT that is tailored to better preserve the class structure, and we give the first theoretical guarantees for an OT scheme that respects class structure. We give an accelerated proximal-projection scheme for this formulation, with the proximal operator in closed form, yielding a highly scalable algorithm for computing optimal transport plans. Our experiments show that the new regularizer preserves class structure better and is more robust compared to previous regularizers.
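As background for the unregularized problem (not the SON-regularized method of the talk), optimal transport between two sets of equally many unit-mass points on the line has a classical closed form: under the cost |x - y|, the optimal plan matches the points in sorted order. A minimal sketch:

```python
def ot_1d(xs, ys):
    """Optimal transport between equal numbers of unit-mass points on the line.
    For the cost |x - y| (and any convex cost), the monotone matching obtained
    by sorting both sides is optimal; return the plan and its total cost."""
    plan = list(zip(sorted(xs), sorted(ys)))
    return plan, sum(abs(x - y) for x, y in plan)

plan, total = ot_1d([0.0, 2.0, 5.0], [1.0, 1.5, 6.0])
print(plan)   # [(0.0, 1.0), (2.0, 1.5), (5.0, 6.0)]
print(total)  # 2.5
```

In higher dimensions, or with unequal weights, the plan instead comes from a linear program (or a regularized solver), which is where scalability concerns like those addressed in the talk arise.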
Devdatt Dubhashi is Professor in the Department of Computer Science and Engineering at Chalmers University, leading the algorithms and machine learning group. He is also Chief Scientist at Machine Intelligence Sweden AB, a startup that bridges from research to innovation in AI-powered technologies. He received his B.Tech. in Computer Science and Engineering from the Indian Institute of Technology (IIT) Delhi and his master's and Ph.D. in Computer Science from Cornell University, USA. He has held positions at the Max Planck Institute for Computer Science, Saarbruecken, Germany, and at BRICS (Basic Research in Computer Science), a center of the Danish National Science Foundation at the University of Aarhus, Denmark. He led the project "Data Driven Secure Business Intelligence" funded by SSF and was an expert consultant for the OECD on Data Driven Innovation. He was a Vinnova and Marie Curie Fellow. He was an invited speaker in 2017 at the SICS Data Science Day, at Vehicle Electronics and Connected Services, and at the European Meeting of Statisticians.
Wednesday April 10, 2019, 11am, building F, 2nd floor (viale Regina Elena 295/B)
Christian Sohler (TU Dortmund)
Strong coresets for k-median and subspace clustering: Goodbye dimension
I will start my talk with an introduction to the area of coresets for clustering problems, give basic definitions and explain how coresets can be used to obtain distributed and streaming algorithms. During my talk, I will mostly focus on a commonly used coreset definition that has been introduced by Har-Peled and Mazumdar: A coreset is a weighted set S of points such that for all sets C of k centers we have (1-eps) cost(P,C) <= cost(S,C) <= (1+eps) cost(P,C), where cost(P,C) is the sum of distances of the points in P to their closest center in C. Then I will present a new algorithm to compute coresets that have a similar for-all guarantee as in the above definition and that consist of a number of points that is independent of the size of the input point set P and the dimension of the input space d.
(Joint work with David Woodruff.)
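The for-all guarantee in the definition above can be checked directly on toy instances. The sketch below is illustrative only (a real coreset construction would not enumerate center sets); it verifies the definition for 1-D k-median over a finite pool of candidate centers:

```python
from itertools import combinations

def cost(points, centers, weights=None):
    """Sum of (weighted) distances from each point to its nearest center."""
    if weights is None:
        weights = [1.0] * len(points)
    return sum(w * min(abs(p - c) for c in centers)
               for p, w in zip(points, weights))

def is_coreset(P, S, S_weights, candidate_centers, k, eps):
    """Brute-force check of the for-all guarantee over a finite candidate pool:
    (1-eps) * cost(P, C) <= cost(S, C) <= (1+eps) * cost(P, C)
    for every set C of k candidate centers."""
    for C in combinations(candidate_centers, k):
        cP, cS = cost(P, C), cost(S, C, S_weights)
        if not ((1 - eps) * cP <= cS <= (1 + eps) * cP):
            return False
    return True

# Toy example: collapse each tight cluster of P onto one weighted representative
# placed at the cluster mean; the weight carries the cluster's total mass.
P = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
S = [0.1, 10.1]
S_weights = [3.0, 3.0]
print(is_coreset(P, S, S_weights, [-5, 0, 5, 10, 15], k=2, eps=0.2))  # True
```

The point of the coreset results in the talk is that such a small weighted S exists with a size independent of |P| and of the dimension, and with the guarantee holding for all center sets, not just a finite pool as in this toy check.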
Christian Sohler received his PhD from the University of Paderborn, Germany. After positions as assistant professor in Paderborn and associate professor in Bonn and Dortmund, he was promoted to full professor at TU Dortmund. Currently, he is a visiting professor at Google Zurich. His research interests include the development and analysis of algorithms and data structures for processing very large data sets. His focus lies on the design of randomized algorithms for the analysis of very large graphs, streaming algorithms and clustering algorithms.
Friday March 8, 2019, 11am, room G50, 3rd floor, building G (viale Regina Elena 295/B)
Danupon Nanongkai (KTH Royal Institute of Technology)
Dynamic graph algorithms and complexity (a survey)
In this talk I will attempt to answer the following questions I have been asked quite often: What are the current challenges in dynamic graph algorithms? What are good starting points for people who want to try working in this field? The talk will focus on challenges for basic graph problems (e.g. connectivity, shortest paths, maximum matching), and will survey some existing upper and lower bound results and techniques.
Danupon Nanongkai is currently an associate professor in the Theoretical Computer Science group at KTH Royal Institute of Technology, Sweden. He received a Ph.D. in Algorithms, Combinatorics, and Optimization (ACO) from Georgia Tech in 2011 and a docent (aka habilitation) in Computer Science from KTH in 2017. He was awarded the Principles of Distributed Computing Doctoral Dissertation Award in 2013 and an ERC Starting Grant in 2016. His research interest is generally on graph algorithms and complexity, where the current focus is on distributed and dynamic graph algorithms.
Wednesday February 20, 2019, 4pm, building F (viale Regina Elena 295/B).
Artur Czumaj (University of Warwick)
Round Compression for Parallel Matching Algorithms
For over a decade now we have been witnessing the success of massive parallel computation (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. One of the reasons for their success is the fact that these frameworks are able to accurately capture the nature of large-scale computation. In particular, compared to the classic distributed algorithms or PRAM models, these frameworks allow for much more local computation. The fundamental question that arises in this context, though, is: can we leverage this additional power to obtain even faster parallel algorithms?
A prominent example here is the fundamental graph problem of finding a maximum matching. It is well known that in the PRAM model one can compute a 2-approximate maximum matching in O(log n) rounds. However, the exact complexity of this problem in the MPC framework is still far from understood. Lattanzi et al. showed that if each machine has n^{1+Ω(1)} memory, this problem can also be solved 2-approximately in a constant number of rounds. These techniques, as well as the approaches developed in follow-up work, seem, though, to get stuck in a fundamental way at roughly O(log n) rounds once we enter the near-linear memory regime. It is thus entirely possible that in this regime, which captures in particular the case of sparse graph computations, the best MPC round complexity matches what one can already get in the PRAM model, without the need to take advantage of the extra local computation power.
In this talk, we finally show how to refute that perplexing possibility. That is, we break the above O(log n) round complexity bound even in the case of slightly sublinear memory per machine.
This is a joint work with Jakub Łącki, Aleksander Mądry, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski, which appeared at STOC'2018.
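As background for the abstract above: the 2-approximation baseline it mentions rests on a classical fact, namely that any maximal matching has at least half the size of a maximum matching. The following minimal sequential sketch illustrates that baseline only; it is not the parallel MPC algorithm of the talk, and the example graph is ours.

```python
# A maximal matching (no edge can be added) is always a 2-approximation
# to a maximum matching: each chosen edge can "block" at most two edges
# of an optimal matching. This sequential greedy sketch illustrates the
# baseline notion; the MPC algorithm in the talk is far more involved.

def greedy_maximal_matching(edges):
    """Return a maximal matching of an undirected graph given as an edge list."""
    matched = set()   # vertices already covered by the matching
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.add(u)
            matched.add(v)
    return matching

# Example: the path 0-1-2-3. Greedy picks (0,1) and (2,3),
# which happens to be optimal here.
print(greedy_maximal_matching([(0, 1), (1, 2), (2, 3)]))  # → [(0, 1), (2, 3)]
```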
Artur Czumaj is a Professor of Computer Science and Director of the Centre for Discrete Mathematics and its Applications (DIMAP) at the University of Warwick, UK. He received his PhD and Habilitation degree from the University of Paderborn in Germany. His main research interest is in the design of randomized algorithms and their probabilistic analysis, with applications to property testing and sublinear algorithms, optimization algorithms, parallel and distributed computing, string matching, and algorithmic game theory.
Tuesday January 22, 2019, 4pm, room G50 (viale Regina Elena 295/B).
Nikhil Bansal (Eindhoven University of Technology)
Problems and progress in algorithmic discrepancy
The probabilistic method is a powerful tool in combinatorics and computer science. However in many settings, methods from combinatorial discrepancy can give much stronger results than those achievable by randomized rounding and other probabilistic methods. In this talk, I will give a gentle introduction to discrepancy and will survey some of the recent techniques that have been developed in the area.
Nikhil Bansal is a researcher at CWI, and a professor in the Department of Mathematics and Computer Science at Eindhoven University of Technology. He received his Bachelor's degree in Computer Science from IIT Mumbai (1999) and obtained his PhD from Carnegie Mellon University in 2003. He worked at the IBM T.J. Watson Research Center until 2011, where he also managed the Algorithms group. He is interested in theoretical computer science with a focus on the design and analysis of algorithms. He has received several best paper awards, including one at the FOCS 2011 conference, and is the recipient of the NWO Vidi, TOP and VICI grants, and an ERC consolidator grant.
Thursday December 20, 2018, 4pm, room G50 (viale Regina Elena 295/B).
Silvio Lattanzi (Google Research Europe)
Online Principal Component Analysis and Column Subset Selection
We propose online algorithms for Column Subset Selection (CSS) and Principal Component Analysis (PCA), two methods that are widely employed for data analysis, summarization, and visualization. Given a data matrix A that is revealed one column at a time, the online CSS problem asks to keep a small set of columns, S, that best approximates the space spanned by the columns of A. As each column arrives, the algorithm must irrevocably decide whether to add it to S, or to ignore it. In the online PCA problem, the goal is to output a projection of each column to a low dimensional subspace. In other words, the algorithm must provide an embedding for each column as it arrives, but explicitly maintaining the low dimensional target space is not required.
While both of these problems have been studied, only additive approximations were known prior to our work. The core of our approach is an adaptive sampling technique that gives a simple, practical, and efficient algorithm for both of these problems. We prove that by judiciously sampling columns with probability proportional to their residual norm, we can select at most Õ(k) columns and achieve an O(log n) approximation as compared to the best rank-k subspace.
In order to reduce the approximation ratio we show how to combine our adaptive sampling procedure with previous methods by Boutsidis et al. and Frieze et al. to obtain (1 + ε) approximations for both the online PCA and online CSS problems. This combination is a result of a careful analysis that allows us to run both algorithms “in series” when processing each column of A, thereby converting the additive approximation obtained previously to tight multiplicative guarantees.
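To make the sampling rule in the abstract concrete, here is a toy sketch of residual-norm-based column selection. It is our illustrative rendition, not the paper's algorithm: the function name, the squared-norm acceptance probability, and the normalization are all assumptions chosen for readability, and the loop processes a stored list of columns rather than a true online stream.

```python
import numpy as np

def online_css_sketch(columns, k, rng=None):
    """Toy sketch of column subset selection via residual-norm sampling.

    Each arriving column is kept with probability proportional to the
    norm of its residual after projecting onto the span of the columns
    selected so far. Constants and normalization are illustrative only.
    """
    rng = rng or np.random.default_rng(0)
    S = []  # indices of selected columns
    for i, a in enumerate(columns):
        if S:
            B = np.column_stack([columns[j] for j in S])
            # Residual of a after least-squares projection onto span(S).
            coeffs, *_ = np.linalg.lstsq(B, a, rcond=None)
            residual = a - B @ coeffs
        else:
            residual = a  # nothing selected yet: the whole column is residual
        # Acceptance probability grows with the residual's share of the
        # column's energy; a redundant column is almost never kept.
        p = min(1.0, np.linalg.norm(residual) ** 2
                     / (np.linalg.norm(a) ** 2 + 1e-12))
        if len(S) < k and rng.random() < p:
            S.append(i)  # irrevocable decision: column i joins S
    return S

# Mutually orthogonal columns all have full residual norm, so the first
# k of them are kept with probability (essentially) one.
cols = [np.array([1.0, 0.0, 0.0]),
        np.array([0.0, 1.0, 0.0]),
        np.array([0.0, 0.0, 1.0])]
print(online_css_sketch(cols, 2))  # → [0, 1]
```

A column that lies in the span of the already selected ones has residual norm near zero and is ignored, which is the intuition behind the Õ(k) bound on the number of selected columns.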
Silvio has been a Research Scientist at Google Research Europe since April 2017. Before that, he was part of the NY Algorithms group at Google New York (2011-2017). He received his PhD from Sapienza University of Rome under the supervision of Alessandro Panconesi. His research interests are in the areas of algorithms, machine learning, and information retrieval.