Automata, Languages, and Programming

The document is the proceedings of the 40th International Colloquium on Automata, Languages, and Programming (ICALP 2013), held in Riga, Latvia from July 8-12, 2013. It includes details about the conference organization, program committees, accepted papers, and awards given for outstanding contributions. The event featured various tracks focusing on different aspects of theoretical computer science and included workshops prior to the main conference.


Fedor V. Fomin
Rūsiņš Freivalds
Marta Kwiatkowska
David Peleg (Eds.)
ARCoSS
LNCS 7966

Automata, Languages,
and Programming
40th International Colloquium, ICALP 2013
Riga, Latvia, July 2013
Proceedings, Part II

Lecture Notes in Computer Science 7966
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, UK
Takeo Kanade, USA
Josef Kittler, UK
Jon M. Kleinberg, USA
Alfred Kobsa, USA
Friedemann Mattern, Switzerland
John C. Mitchell, USA
Moni Naor, Israel
Oscar Nierstrasz, Switzerland
C. Pandu Rangan, India
Bernhard Steffen, Germany
Madhu Sudan, USA
Demetri Terzopoulos, USA
Doug Tygar, USA
Gerhard Weikum, Germany

Advanced Research in Computing and Software Science


Subline of Lecture Notes in Computer Science

Subline Series Editors


Giorgio Ausiello, University of Rome ‘La Sapienza’, Italy
Vladimiro Sassone, University of Southampton, UK

Subline Advisory Board


Susanne Albers, University of Freiburg, Germany
Benjamin C. Pierce, University of Pennsylvania, USA
Bernhard Steffen, University of Dortmund, Germany
Madhu Sudan, Microsoft Research, Cambridge, MA, USA
Deng Xiaotie, City University of Hong Kong
Jeannette M. Wing, Microsoft Research, Redmond, WA, USA
Fedor V. Fomin
Rūsiņš Freivalds
Marta Kwiatkowska
David Peleg (Eds.)

Automata, Languages,
and Programming
40th International Colloquium, ICALP 2013
Riga, Latvia, July 8-12, 2013
Proceedings, Part II

Volume Editors
Fedor V. Fomin
University of Bergen, Department of Informatics
Postboks 7803, 5020 Bergen, Norway
E-mail: [email protected]
Rūsiņš Freivalds
University of Latvia, Faculty of Computing
Raina bulv. 19, 1586 Riga, Latvia
E-mail: [email protected]
Marta Kwiatkowska
University of Oxford, Department of Computer Science
Wolfson Building, Parks Road, Oxford OX1 3QD, UK
E-mail: [email protected]
David Peleg
Weizmann Institute of Science, Faculty of Mathematics and Computer Science
POB 26, 76100 Rehovot, Israel
E-mail: [email protected]

ISSN 0302-9743 e-ISSN 1611-3349


ISBN 978-3-642-39211-5 e-ISBN 978-3-642-39212-2
DOI 10.1007/978-3-642-39212-2
Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2013941217

CR Subject Classification (1998): F.2, F.1, C.2, H.3-4, G.2, I.2, I.3.5, E.1

LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues


© Springer-Verlag Berlin Heidelberg 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and
executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication
or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location,
in its current version, and permission for use must always be obtained from Springer. Permissions for use
may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution
under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication,
neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or
omissions that may be made. The publisher makes no warranty, express or implied, with respect to the
material contained herein.
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface

ICALP, the International Colloquium on Automata, Languages and Programming, is arguably the best-known series of scientific conferences on theoretical computer science in Europe. The first ICALP was held in Paris, France, during July 3–7, 1972, with 51 talks. The same year EATCS, the European Association for Theoretical Computer Science, was established. Since then ICALP has been the flagship conference of EATCS.
ICALP 2013 was the 40th conference in this series (there was no ICALP in 1973). For the first time, ICALP entered the territory of the former Soviet Union: it was held in Riga, Latvia, during July 8–12, 2013, at the University of Latvia. This year the program of ICALP was organized in three tracks: Track A (Algorithms, Complexity and Games), Track B (Logic, Semantics, Automata and Theory of Programming), and Track C (Foundations of Networked Computation: Models, Algorithms and Information Management).
In response to the Call for Papers, 436 papers were submitted; 14 papers were
later withdrawn. The three Program Committees worked hard to select 71 papers
for Track A (out of 249 papers submitted), 33 papers for Track B (out of 113
papers), and 20 papers for Track C (out of 60 papers). The average acceptance
rate was 29%. The selection was based on originality, quality, and relevance
to theoretical computer science. The quality of the submitted papers was very high indeed. The Program Committees acknowledge that many rejected papers deserved publication, but it was regrettably impossible to extend the conference beyond five days.
The best paper awards were given to Mark Bun and Justin Thaler for their
paper “Dual Lower Bounds for Approximate Degree and Markov-Bernstein In-
equalities” (in Track A), to John Fearnley and Marcin Jurdziński for the paper
“Reachability in Two-Clock Timed Automata is PSPACE-complete” (in Track
B), and to Dariusz Dereniowski, Yann Disser, Adrian Kosowski, Dominik Pająk,
and Przemysław Uznański for the paper “Fast Collaborative Graph Exploration”
(in Track C). The best student paper awards were given to Radu Curticapean
for the paper “Counting matchings of size k is #W[1]-hard” (in Track A) and to
Nicolas Basset for the paper “A maximal entropy stochastic process for a timed
automaton” (in Track B).
ICALP 2013 contained a special EATCS lecture on the occasion of the 40th
ICALP given by:
– Jon Kleinberg, Cornell University
Invited talks were delivered by:
– Susanne Albers, Humboldt University
– Orna Kupferman, Hebrew University
– Dániel Marx, Hungarian Academy of Sciences
– Paul Spirakis, University of Patras
– Peter Widmayer, ETH Zürich
The main conference was preceded by a series of workshops on Sunday, July 7, 2013 (i.e., one day before ICALP 2013). The workshops were:
– Workshop on Automata, Logic, Formal languages, and Algebra (ALFA 2013)
– International Workshop on Approximation, Parameterized and Exact Algorithms (APEX 2013)
– Quantitative Models: Expressiveness and Analysis
– Foundations of Network Science (FONES)
– Learning Theory and Complexity
– 7th Workshop on Membrane Computing and Biologically Inspired Process Calculi (MeCBIC 2013)
– Workshop on Quantum and Classical Complexity
We sincerely thank our sponsors, members of the committees, referees, and the
many colleagues who anonymously spent much time and effort to make ICALP
2013 happen.

May 2013

Fedor V. Fomin
Rūsiņš Freivalds
Marta Kwiatkowska
David Peleg
Organization

Program Committee
Track A
Andris Ambainis University of Latvia, Latvia
Edith Elkind Nanyang Technological University, Singapore
Leah Epstein University of Haifa, Israel
Rolf Fagerberg University of Southern Denmark, Denmark
Fedor Fomin University of Bergen, Norway (Chair)
Pierre Fraigniaud CNRS and University Paris Diderot, France
Fabrizio Grandoni Dalle Molle Institute, Switzerland
Joachim Gudmundsson University of Sydney, Australia
Kazuo Iwama Kyoto University, Japan
Valentine Kabanets Simon Fraser University, Canada
Stavros Kolliopoulos National and Kapodistrian University of Athens, Greece
Daniel Král’ University of Warwick, UK
Daniel Lokshtanov University of California, San Diego, USA
Konstantin Makarychev Microsoft Research, Redmond, USA
Peter Bro Miltersen Aarhus University, Denmark
Ilan Newman University of Haifa, Israel
Konstantinos Panagiotou Ludwig Maximilians University, Munich, Germany
Alexander Razborov University of Chicago, USA
Saket Saurabh The Institute of Mathematical Sciences, India
David Steurer Microsoft Research, New England, USA
Kunal Talwar Microsoft Research, Silicon Valley, USA
Dimitrios Thilikos National and Kapodistrian University of Athens, Greece
Virginia Vassilevska Williams University of California, Berkeley, and Stanford, USA
Gerhard Woeginger Eindhoven University of Technology, The Netherlands
Track B
Christel Baier TU Dresden, Germany
Chiara Bodei University of Pisa, Italy
Mikolaj Bojańczyk University of Warsaw, Poland
Patricia Bouyer-Decitre CNRS/ENS Cachan, France
Vassilis Christophides University of Crete, Greece
Yuxin Deng Shanghai Jiao-Tong University, China
Marcelo Fiore University of Cambridge, UK
Patrice Godefroid Microsoft Research, Redmond, USA
Andy Gordon MSR Cambridge and University of Edinburgh, UK
Alexey Gotsman Madrid Institute for Advanced Studies (IMDEA), Spain
Masami Hagiya University of Tokyo, Japan
Michael Huth Imperial College London, UK
Stephan Kreutzer TU Berlin, Germany
Antonín Kučera Masaryk University, Brno, Czech Republic
Viktor Kuncak EPFL, Lausanne, Switzerland
Marta Kwiatkowska University of Oxford, UK (Chair)
Leonid Libkin University of Edinburgh, UK
Rupak Majumdar Max Planck Institute, Kaiserslautern, Germany
Jerzy Marcinkowski University of Wroclaw, Poland
Annabelle McIver Macquarie University, Australia
Catuscia Palamidessi INRIA Saclay and LIX, Ecole Polytechnique, Paris, France
Frank Pfenning Carnegie Mellon University, USA
André Platzer Carnegie Mellon University, USA
Jean-François Raskin UL Brussels, Belgium
Jan Rutten CWI and Radboud University Nijmegen, The Netherlands
Peter Selinger Dalhousie University, Canada
Andreas Winter University of Bristol, UK

Track C
James Aspnes Yale University, USA
Ioannis Caragiannis University of Patras, Greece
Xavier Defago JAIST, Japan
Josep Diaz UPC, Barcelona, Spain
Stefan Dobrev Slovak Academy of Sciences, Bratislava, Slovak Republic
Michele Flammini University of L’Aquila, Italy
Leszek Gąsieniec University of Liverpool, UK
Cyril Gavoille University of Bordeaux, France
David Kempe University of Southern California, USA
Valerie King University of Victoria, Canada
Amos Korman CNRS, Paris, France
Miroslaw Kutylowski Wroclaw University of Technology, Poland
Dahlia Malkhi Microsoft Research, Silicon Valley, USA
Luca Moscardelli University of Chieti, Pescara, Italy
Thomas Moscibroda Microsoft Research Asia and Tsinghua University, China
Marina Papatriantafilou Chalmers University of Technology, Goteborg, Sweden
David Peleg Weizmann Institute of Science, Israel (Chair)
Yvonne Anne Pignolet ETH Zurich, Switzerland
Sergio Rajsbaum UNAM, Mexico
Liam Roditty Bar-Ilan University, Israel
José Rolim University of Geneva, Switzerland
Christian Scheideler Technical University of Munich, Germany
Jennifer Welch Texas A&M University, USA

Organizing Committee
(all from University of Latvia, Latvia)

Andris Ambainis
Kaspars Balodis
Juris Borzovs (Organizing Chair)
Rūsiņš Freivalds (Conference Chair)
Marats Golovkins
Nikolay Nahimov
Jeļena Poļakova
Alexander Rivosh
Agnis Škuškovniks (Organizing Deputy Chair)
Juris Smotrovs
Abuzer Yakaryılmaz

Sponsoring Institutions
QuBalt
University of Latvia

Additional Reviewers
Aaronson, Scott Arvind, V. Barman, Siddharth
Aceto, Luca Askalidis, Georgios Barto, Libor
Adamaszek, Anna Atserias, Albert Belovs, Aleksandrs
Afshani, Peyman Aumüller, Martin Bendlin, Rikke
Agrawal, Manindra Avigdor-Elgrabli, Noa Benoit, Anne
Ahn, Kook Jin Avis, David Benzaken, Veronique
Aichholzer, Oswin Badanidiyuru, Ashwinkumar Berman, Itay
Albers, Susanne Bertrand, Nathalie
Alur, Rajeev Balmau, Oana Bianchi, Giuseppe
Alvarez, Carme Bampis, Evripidis Biedl, Therese
Amano, Kazuyuki Bansal, Nikhil Bilò, Davide
Andoni, Alexandr Barcelo, Pablo Bilò, Vittorio
Björklund, Andreas Chen, Ning Ďuriš, Pavol
Björklund, Henrik Chen, Taolue Dutta, Kunal
Blais, Eric Chen, Xin Dvořák, Zdeněk
Bläser, Markus Chester, Sean Dziembowski, Stefan
Blum, William Chistikov, Dmitry Ebtekar, Aram
Bodirsky, Manuel Chitnis, Rajesh Eidenbenz, Raphael
Bodlaender, Hans L. Chmelík, Martin Eikel, Martina
Bohy, Aaron Chrobak, Marek Eisenbrand, Friedrich
Boker, Udi Cicalese, Ferdinando Elbassioni, Khaled
Bollig, Benedikt Clark, Alex Elkin, Michael
Bonnet, François Conchinha, Bruno Emek, Yuval
Bonsangue, Marcello Cormode, Graham Ene, Alina
Bortolussi, Luca Corradini, Andrea Englert, Matthias
Boularias, Abdeslam Crescenzi, Pierluigi Eppstein, David
Bourhis, Pierre Currie, James Erlebach, Thomas
Boyar, Joan Cygan, Marek Escoffier, Bruno
Boyle, Elette Czerwiński, Wojciech Esmaeilsabzali, Shahram
Brandes, Philipp Czumaj, Artur Faenza, Yuri
Brandstadt, Andreas Dal Lago, Ugo Fanelli, Angelo
Braverman, Mark Datta, Samir Faust, Sebastian
Braverman, Vladimir Daum, Marcus Favrholdt, Lene
Brazdil, Tomas Davies, Rowan Fehnker, Ansgar
Bringmann, Karl Dawar, Anuj Feige, Uri
Brodal, Gerth Stølting de Gouw, Stijn Feldman, Moran
Brody, Joshua de Groote, Philippe Feng, Yuan
Bulatov, Andrei de Haan, Robert Fenner, Stephen
Byrka, Jarek de Wolf, Ronald Feret, Jérôme
Cabello, Sergio Dell, Holger Ferrari, Gianluigi
Cachin, Christian Deshpande, Amit Fertin, Guillaume
Carbone, Marco Devanur, Nikhil Fijalkow, Nathanaël
Carton, Olivier Devroye, Luc Filmus, Yuval
Cerny, Pavol Díaz-Báñez, José-Miguel Fineman, Jeremy
Cervesato, Iliano Dinitz, Michael Fischer, Eldar
Chakrabarty, Deeparnab Dittmann, Christoph Fischlin, Marc
Chakraborty, Sourav Doerr, Benjamin Fisher, Jasmin
Chaloulos, Konstantinos Doerr, Carola Floderus, Peter
Chan, Siu On Dorrigiv, Reza Fountoulakis, Nikolaos
Charatonik, Witold Dósa, György Frandsen, Gudmund
Chase, Melissa Doty, David Frati, Fabrizio
Chattopadhyay, Arkadev Doyen, Laurent Friedrich, Tobias
Chávez, Edgar Dregi, Markus Frieze, Alan
Chawla, Sanjay Drucker, Andrew Friggstad, Zachary
Chechik, Shiri Dräger, Klaus Fu, Hongfei
Chen, Jie Ducas, Léo Függer, Matthias
Chen, Kevin Dunkelman, Orr Fujioka, Kaoru
Funke, Stefan Hahn, Ernst Moritz Jansen, Klaus
Gagie, Travis Hähnle, Nicolai Jarry, Aubin
Gairing, Martin Hajiaghayi, Mohammadtaghi Jeż, Artur
Galletta, Letterio Jeż, Łukasz
Gamarnik, David Halldórsson, Magnús M. Jonsson, Bengt
Ganesh, Vijay Jordán, Tibor
Ganian, Robert Hansen, Kristoffer Arnsfelt Jurdziński, Tomasz
Garg, Jugal Jürjens, Jan
Gärtner, Bernd Hanzlik, Lucjan Kaibel, Volker
Gasarch, William Harsha, Prahladh Kamiński, Marcin
Gaspers, Serge Hassin, Refael Kammer, Frank
Gauwin, Olivier Hasuo, Ichiro Kanellopoulos, Panagiotis
Gavinsky, Dmitry Hayman, Jonathan
Gay, Simon He, Chaodong Kannan, Sampath
Geeraerts, Gilles He, Shan Kaplan, Haim
Gemulla, Rainer Heggernes, Pınar Kapralov, Michael
Georgiadis, Giorgos Heindel, Tobias Karakostas, George
Ghorbal, Khalil Hellwig, Matthias Karanikolas, Nikos
Giannopoulos, Panos Hirai, Yoichi Karavelas, Menelaos I.
Gimbert, Hugo Hitchcock, John M. Karhumäki, Juhani
Giotis, Ioannis Hliněný, Petr Kärkkäinen, Juha
Gmyr, Robert Hoeksma, Ruben Kartzow, Alexander
Goasdoué, François Höfner, Peter Kawahara, Jun
Gogacz, Tomasz Hon, Wing-Kai Kayal, Neeraj
Golas, Ulrike Horiyama, Takashi Keller, Nathan
Goldberg, Andrew Huang, Chien-Chung Keller, Orgad
Goldhirsh, Yonatan Huber, Anna Kellerer, Hans
Göller, Stefan Hüllmann, Martina Kemper, Stephanie
Golovach, Petr Hur, Chung-Kil Kerenidis, Iordanis
Goncharov, Sergey Ibsen-Jensen, Rasmus Khot, Subhash
Gopalan, Parikshit Ilcinkas, David Kiayias, Aggelos
Gorbunov, Sergey Im, Sungjin Kiefer, Stefan
Gorry, Thomas Imai, Katsunobu Kik, Marcin
Gottlieb, Lee-Ad Imreh, Csanád Kim, Eun Jung
Gourvès, Laurent Indyk, Piotr Kissinger, Alexander
Goyal, Navin Ishii, Toshimasa Klauck, Hartmut
Graça, Daniel Ito, Hiro Klein, Karsten
Grenet, Bruno Ito, Tsuyoshi Kliemann, Lasse
Guo, Alan Itoh, Toshiya Klíma, Ondřej
Gupta, Anupam Iván, Szabolcs Klin, Bartek
Gupta, Sushmita Iwamoto, Chuzo Klonowski, Marek
Gurvich, Vladimir Jager, Tibor Kluczniak, Kamil
Gutwenger, Carsten Jain, Rahul Kniesburges, Sebastian
Habib, Michel Jančar, Petr Kobayashi, Koji
Hadzilacos, Vassos Jansen, Bart Kobayashi, Yusuke
Koenemann, Jochen Lawson, Mark V. Mertzios, George B.
Kolay, Sudeshna Le Gall, François Messner, Jochen
Kolesnikov, Vladimir Lerays, Virginie Mestre, Julian
Kollias, Kostas Leroux, Jérôme Meyer auf der Heide, Friedhelm
Komm, Dennis Leucci, Stefano
Komusiewicz, Christian Levin, Asaf Mezzetti, Gianluca
Konečný, Filip Lewenstein, Moshe Micciancio, Daniele
Konrad, Christian Lewi, Kevin Michaliszyn, Jakub
Kopczynski, Eryk Li, Jian Misra, Neeldhara
Kopelowitz, Tsvi Li, Shi Misra, Pranabendu
Kopparty, Swastik Liang, Guanfeng Mitsch, Stefan
Kortsarz, Guy Lin, Anthony Widjaja Mittal, Shashi
Kötzing, Timo Lingas, Andrzej Miyazaki, Shuichi
Koucký, Michal Linji, Yang Mokhov, Andrey
Koutavas, Vasileios Loff, Bruno Møller, Anders
Koutsopoulos, Andreas Lohrey, Markus Monaco, Gianpiero
Kowalik, Łukasz Loos, Sarah Montanari, Alessandro
Koza, Michal López-Ortiz, Alejandro Montgomery, Hart
Kozen, Dexter Lösch, Steffen Morgan, Carroll
Královič, Rastislav Lotker, Zvi Morin, Pat
Královič, Richard Lu, Pinyan Moseley, Ben
Kratsch, Stefan Lücke, Dominik Movahedi, Mahnush
Krčál, Jan Lundell, Eva-Marta Moysoglou, Yannis
Křetínský, Jan Maffray, Frédéric Mozes, Shay
Krishnaswamy, Ravishankar Mahajan, Meena Mucha, Marcin
Makarychev, Yury Müller, Tobias
Kristensen, Lars Malgouyres, Rémy Mulmuley, Ketan
Kritikos, Kyriakos Mantaci, Roberto Muscholl, Anca
Krivine, Jean Manthey, Bodo Myers, Robert
Krokhin, Andrei Märcker, Steffen Myers, Steven
Krumke, Sven Markey, Nicolas Nagarajan, Viswanath
Krysta, Piotr Martin, Russell Nanongkai, Danupon
Krzywiecki, Łukasz Martins, João G. Naor, Moni
Kunc, Michal Marx, Dániel Narayan, Arjun
Kuperberg, Denis Mathieson, Luke Navarra, Alfredo
Kupferman, Orna Matsuda, Takahiro Navarro, Gonzalo
Kurz, Alexander Matulef, Kevin Nebel, Markus
Kutten, Shay Mauro, Jacopo Nederlof, Jesper
Kutzkov, Konstantin Mazumdar, Arya Nehama, Ilan
Kyropoulou, Maria McAuley, Julian Nelson, Jelani
Lachish, Oded McGregor, Andrew Ngo, Hung
Lanese, Ivan McKenzie, Pierre Nguyen, Huy
Lauks-Dutka, Anna Meinicke, Larissa Nguyen, Phong
Laurent, Dominique Meister, Daniel Nicholson, Pat
Lavi, Ron Mereacre, Alexandru Nishimura, Harumichi
Nonner, Tim Pilipczuk, Marcin Rote, Günter
Nordström, Jakob Pilipczuk, Michal Roth, Aaron
Novotný, Petr Pinkas, Benny Rouselakis, Yannis
Nowotka, Dirk Piskac, Ruzica Russo, Claudio
Nutov, Zeev Polák, Libor Rusu, Irena
O’Donnell, Ryan Policriti, Alberto Sadakane, Kunihiko
Obdrzalek, Jan Porat, Ely Saei, Reza
Ogierman, Adrian Pottier, François Saha, Barna
Okamoto, Yoshio Pouly, Amaury Sakai, Yoshifumi
Okawa, Satoshi Prabhakar, Pavithra Sakurada, Hideki
Oliehoek, Frans Prabhakaran, Manoj M. Salem, Iosif
Ollinger, Nicolas Pratikakis, Polyvios Salinger, Alejandro
Ölveczky, Peter Pratt-Hartmann, Ian Sanders, Peter
Onak, Krzysztof Price, Eric Sankowski, Piotr
Ono, Hirotaka Puglisi, Simon Sankur, Ocan
Ostrovsky, Rafail Quaas, Karin Santhanam, Rahul
Otachi, Yota Rabehaja, Tahiry Saptharishi, Ramprasad
Ott, Sebastian Rabinovich, Roman Saraf, Shubhangi
Oualhadj, Youssouf Rabinovich, Yuri Sassolas, Mathieu
Oveis Gharan, Shayan Räcke, Harald Satti, Srinivasa Rao
Paes Leme, Renato Radzik, Tomasz Sau, Ignasi
Pagh, Rasmus Raghunathan, Ananth Sauerwald, Thomas
Paluch, Katarzyna Rajaraman, Rajmohan Sawa, Zdeněk
Pandey, Omkant Raman, Venkatesh Sağlam, Mert
Pandurangan, Gopal Ramanujan, M.S. Schäfer, Andreas
Pandya, Paritosh Ranzato, Francesco Schindelhauer, Christian
Panigrahi, Debmalya Raptopoulos, Christoforos Schmid, Stefan
Pankratov, Denis Schneider, Johannes
Panolan, Fahad Ravi, R. Schnoebelen, Philippe
Paparas, Dimitris Rawitz, Dror Schröder, Matthias
Pardubská, Dana Raz, Ran Schubert, Aleksy
Pascual, Fanny Razenshteyn, Ilya Schumacher, André
Pasechnik, Dmitrii Regev, Oded Schwartz, Roy
Passmore, Grant Řehák, Vojtěch Schweitzer, Pascal
Paul, Christophe Rémy, Didier Schwentick, Thomas
Paulusma, Daniël Restivo, Antonio Scott, Elizabeth
Peikert, Chris Rettinger, Robert Sebő, András
Perdrix, Simon Reutenauer, Christophe Sedgewick, Bob
Petig, Thomas Reyzin, Lev Segev, Danny
Pferschy, Ulrich Richerby, David Seki, Shinnosuke
Phillips, Jeff Rigo, Michel Sénizergues, Géraud
Picaronny, Claudine Röglin, Heiko Sereni, Jean-Sébastien
Pierrakos, George Ron, Dana Serna, Maria
Pietrzak, Krzysztof Rosén, Adi Seshadhri, C.
Piliouras, Georgios Rotbart, Noy Seto, Kazuhisa
Severini, Simone Tanaka, Keisuke Vijayaraghavan, Aravindan
Sgall, Jiří Tancer, Martin
Shapira, Asaf Tang, Bangsheng Villanger, Yngve
Shavit, Nir Tassa, Tamir Visconti, Ivan
Sherstov, Alexander Telikepalli, Kavitha Vishnoi, Nisheeth
Shi, Yaoyun Tentes, Aris Vondrák, Jan
Shpilka, Amir Terauchi, Tachio Vrgoč, Domagoj
Siddharthan, Rahul Tessaro, Stefano Vrt’o, Imrich
Sidiropoulos, Anastasios Thachuk, Chris Walukiewicz, Igor
Silva, Alexandra Thaler, Justin Wan, Andrew
Silveira, Rodrigo Thapper, Johan Wang, Haitao
Silvestri, Riccardo Tirthapura, Srikanta Wang, Yajun
Simmons, Robert Todinca, Ioan Watanabe, Osamu
Simon, Hans Toninho, Bernardo Watrous, John
Singh, Mohit Tonoyan, Tigran Watson, Thomas
Skrzypczak, Michal Torenvliet, Leen Weimann, Oren
Sloane, Tony Toruńczyk, Szymon Weinstein, Omri
Sly, Allan Torán, Jacobo Wieczorek, Piotr
Soltanolkotabi, Mahdi Triandopoulos, Nikos Wiese, Andreas
Soto, José A. Tudor, Valentin Wiesner, Karoline
Špalek, Robert Tůma, Vojtěch Williams, Ryan
Spirakis, Paul Tzameret, Iddo Wirth, Tony
Srba, Jiří Tzevelekos, Nikos Wojtczak, Dominik
Srinathan, Kannan Ueckerdt, Torsten Wollan, Paul
Srinivasan, Srikanth Uehara, Ryuhei Wong, Prudence W.H.
Stapleton, Gem Ueno, Kenya Wood, David
Staton, Sam Valencia, Frank Wootters, Mary
Stauffer, Alexandre Valiant, Gregory Worrell, James
Stenman, Jari van Dam, Wim Wulff-Nilsen, Christian
Storandt, Sabine van Den Heuvel, Jan Xiao, David
Strauss, Martin van Leeuwen, Erik Jan Yamamoto, Mitsuharu
Strichman, Ofer van Melkebeek, Dieter Yaroslavtsev, Grigory
Strothmann, Thim van Rooij, Johan M.M. Yehudayoff, Amir
Subramani, K. van Stee, Rob Yoshida, Yuichi
Suri, Subhash Varacca, Daniele Zadimoghaddam, Morteza
Sutner, Klaus Variyam, Vinod
Svensson, Ola Vatshelle, Martin Zawadzki, Erik
Svitkina, Zoya Veanes, Margus Zetzsche, Georg
Szegedy, Mario Végh, László Zhang, Qin
Sznajder, Nathalie Vereshchagin, Nikolay Zikas, Vassilis
Talmage, Edward Vergnaud, Damien Zimmermann, Martin
Tamir, Tami Verschae, José Živný, Stanislav
Tan, Li-Yang Viderman, Michael Zwick, Uri
Tanabe, Yoshinori Vidick, Thomas
Table of Contents – Part II

EATCS Lecture
Algorithms, Networks, and Social Phenomena . . . . . . . . . . . . . . . . . . . . . . . 1
Jon Kleinberg

Invited Talks
Recent Advances for a Classical Scheduling Problem . . . . . . . . . . . . . . . . . . 4
Susanne Albers

Formalizing and Reasoning about Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . 15


Shaull Almagor, Udi Boker, and Orna Kupferman

The Square Root Phenomenon in Planar Graphs . . . . . . . . . . . . . . . . . . . . . 28


Dániel Marx

A Guided Tour in Random Intersection Graphs . . . . . . . . . . . . . . . . . . . . . . 29


Paul G. Spirakis, Sotiris Nikoletseas, and Christoforos Raptopoulos

To Be Uncertain Is Uncomfortable, But to Be Certain Is Ridiculous . . . . 36


Peter Widmayer

Track B – Logic, Semantics, Automata and Theory


of Programming
Decision Problems for Additive Regular Functions . . . . . . . . . . . . . . . . . . . . 37
Rajeev Alur and Mukund Raghothaman

Beyond Differential Privacy: Composition Theorems and Relational


Logic for f -divergences between Probabilistic Programs . . . . . . . . . . . . . . . 49
Gilles Barthe and Federico Olmedo

A Maximal Entropy Stochastic Process for a Timed Automaton . . . . . . . 61


Nicolas Basset

Complexity of Two-Variable Logic on Finite Trees . . . . . . . . . . . . . . . . . . . . 74


Saguy Benaim, Michael Benedikt, Witold Charatonik,
Emanuel Kieroński, Rastislav Lenhardt, Filip Mazowiecki, and
James Worrell

Nondeterminism in the Presence of a Diverse or Unknown Future . . . . . . 89


Udi Boker, Denis Kuperberg, Orna Kupferman, and
Michal Skrzypczak
XVI Table of Contents – Part II

Coalgebraic Announcement Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101


Facundo Carreiro, Daniel Gorı́n, and Lutz Schröder

Self-shuffling Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113


Émilie Charlier, Teturo Kamae, Svetlana Puzynina, and
Luca Q. Zamboni

Block-Sorted Quantified Conjunctive Queries . . . . . . . . . . . . . . . . . . . . . . . . 125


Hubie Chen and Dániel Marx

From Security Protocols to Pushdown Automata . . . . . . . . . . . . . . . . . . . . . 137


Rémy Chrétien, Véronique Cortier, and Stéphanie Delaune

Efficient Separability of Regular Languages by Subsequences and


Suffixes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Wojciech Czerwiński, Wim Martens, and Tomáš Masopust

On the Complexity of Verifying Regular Properties on Flat Counter


Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
Stéphane Demri, Amit Kumar Dhar, and Arnaud Sangnier

Multiparty Compatibility in Communicating Automata:


Characterisation and Synthesis of Global Session Types . . . . . . . . . . . . . . . 174
Pierre-Malo Deniélou and Nobuko Yoshida

Component Reconfiguration in the Presence of Conflicts . . . . . . . . . . . . . . 187


Roberto Di Cosmo, Jacopo Mauro, Stefano Zacchiroli, and
Gianluigi Zavattaro

Stochastic Context-Free Grammars, Regular Languages, and Newton’s


Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Kousha Etessami, Alistair Stewart, and Mihalis Yannakakis

Reachability in Two-Clock Timed Automata Is PSPACE-Complete . . . . . 212


John Fearnley and Marcin Jurdziński

Ramsey Goes Visibly Pushdown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224


Oliver Friedmann, Felix Klaedtke, and Martin Lange

Checking Equality and Regularity for Normed BPA with Silent


Moves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
Yuxi Fu

FO Model Checking of Interval Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250


Robert Ganian, Petr Hliněný, Daniel Král’, Jan Obdržálek,
Jarett Schwartz, and Jakub Teska

Strategy Composition in Compositional Games . . . . . . . . . . . . . . . . . . . . . . 263


Marcus Gelderie
Table of Contents – Part II XVII

Asynchronous Games over Tree Architectures . . . . . . . . . . . . . . . . . . . . . . . . 275


Blaise Genest, Hugo Gimbert, Anca Muscholl, and Igor Walukiewicz

Querying the Guarded Fragment with Transitivity . . . . . . . . . . . . . . . . . . . 287


Georg Gottlob, Andreas Pieris, and Lidia Tendera

Contractive Signatures with Recursive Types, Type Parameters, and


Abstract Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Hyeonseung Im, Keiko Nakata, and Sungwoo Park

Algebras, Automata and Logic for Languages of Labeled Birooted


Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
David Janin

One-Variable Word Equations in Linear Time . . . . . . . . . . . . . . . . . . . . . . . 324


Artur Jeż

The IO and OI Hierarchies Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336


Gregory M. Kobele and Sylvain Salvati

Evolving Graph-Structures and Their Implicit Computational


Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
Daniel Leivant and Jean-Yves Marion

Rational Subsets and Submonoids of Wreath Products . . . . . . . . . . . . . . . . 361


Markus Lohrey, Benjamin Steinberg, and Georg Zetzsche
Fair Subtyping for Open Session Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
Luca Padovani
Coeffects: Unified Static Analysis of Context-Dependence . . . . . . . . . . . . . 385
Tomas Petricek, Dominic Orchard, and Alan Mycroft

Proof Systems for Retracts in Simply Typed Lambda Calculus . . . . . . . . . 398


Colin Stirling

Presburger Arithmetic, Rational Generating Functions, and


Quasi-Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
Kevin Woods

Revisiting the Equivalence Problem for Finite Multitape Automata . . . . . 422


James Worrell

Silent Transitions in Automata with Storage . . . . . . . . . . . . . . . . . . . . . . . . . 434


Georg Zetzsche

Track C – Foundations of Networked Computation


New Online Algorithms for Story Scheduling in Web Advertising . . . . . . . 446
Susanne Albers and Achim Passen
XVIII Table of Contents – Part II

Sketching for Big Data Recommender Systems Using Fast


Pseudo-random Fingerprints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Yoram Bachrach and Ely Porat

Physarum Can Compute Shortest Paths: Convergence Proofs and


Complexity Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
Luca Becchetti, Vincenzo Bonifaci, Michael Dirnberger,
Andreas Karrenbauer, and Kurt Mehlhorn

On Revenue Maximization for Agents with Costly Information


Acquisition: Extended Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
L. Elisa Celis, Dimitrios C. Gklezakos, and Anna R. Karlin

Price of Stability in Polynomial Congestion Games . . . . . . . . . . . . . . . . . . . 496


George Christodoulou and Martin Gairing

Localization for a System of Colliding Robots . . . . . . . . . . . . . . . . . . . . . . . . 508


Jurek Czyzowicz, Evangelos Kranakis, and Eduardo Pacheco

Fast Collaborative Graph Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520


Dariusz Dereniowski, Yann Disser, Adrian Kosowski,
Dominik Pająk, and Przemysław Uznański

Deterministic Polynomial Approach in the Plane . . . . . . . . . . . . . . . . . . . . . 533


Yoann Dieudonné and Andrzej Pelc

Outsourced Pattern Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545


Sebastian Faust, Carmit Hazay, and Daniele Venturi

Learning a Ring Cheaply and Fast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557


Emanuele G. Fusco, Andrzej Pelc, and Rossella Petreschi

Competitive Auctions for Markets with Positive Externalities . . . . . . . . . . 569


Nick Gravin and Pinyan Lu

Efficient Computation of Balanced Structures . . . . . . . . . . . . . . . . . . . . . . . . 581


David G. Harris, Ehab Morsy, Gopal Pandurangan,
Peter Robinson, and Aravind Srinivasan

A Refined Complexity Analysis of Degree Anonymization in Graphs . . . . 594


Sepp Hartung, André Nichterlein, Rolf Niedermeier, and
Ondřej Suchý

Sublinear-Time Maintenance of Breadth-First Spanning Tree


in Partially Dynamic Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607
Monika Henzinger, Sebastian Krinninger, and Danupon Nanongkai

Locally Stable Marriage with Strict Preferences . . . . . . . . . . . . . . . . . . . . . . 620


Martin Hoefer and Lisa Wagner

Distributed Deterministic Broadcasting in Wireless Networks of Weak


Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
Tomasz Jurdzinski, Dariusz R. Kowalski, and Grzegorz Stachowiak

Secure Equality and Greater-Than Tests with Sublinear Online


Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645
Helger Lipmaa and Tomas Toft

Temporal Network Optimization Subject to Connectivity Constraints . . . 657


George B. Mertzios, Othon Michail, Ioannis Chatzigiannakis, and
Paul G. Spirakis

Strong Bounds for Evolution in Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . 669


George B. Mertzios and Paul G. Spirakis

Fast Distributed Coloring Algorithms for Triangle-Free Graphs . . . . . . . . 681


Seth Pettie and Hsin-Hao Su

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 695


Table of Contents – Part I

Track A – Algorithms, Complexity and Games


Exact Weight Subgraphs and the k-Sum Conjecture . . . . . . . . . . . . . . . . . . . . 1
Amir Abboud and Kevin Lewi

Minimizing Maximum (Weighted) Flow-Time on Related and Unrelated


Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
S. Anand, Karl Bringmann, Tobias Friedrich, Naveen Garg, and
Amit Kumar

Tight Lower Bound for Linear Sketches of Moments . . . . . . . . . . . . . . . . . . 25


Alexandr Andoni, Huy L. Nguyễn, Yury Polyanskiy, and Yihong Wu

Optimal Partitioning for Dual Pivot Quicksort (Extended Abstract) . . . . 33


Martin Aumüller and Martin Dietzfelbinger

Space–Time Tradeoffs for Subset Sum: An Improved Worst Case


Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Per Austrin, Petteri Kaski, Mikko Koivisto, and Jussi Määttä

On the Extension Complexity of Combinatorial Polytopes . . . . . . . . . . . . . 57


David Avis and Hans Raj Tiwary

Algorithms for Hub Label Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69


Maxim Babenko, Andrew V. Goldberg, Anupam Gupta, and
Viswanath Nagarajan

Improved Approximation Algorithms for (Budgeted) Node-Weighted


Steiner Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
MohammadHossein Bateni, MohammadTaghi Hajiaghayi, and
Vahid Liaghat

Search-Space Size in Contraction Hierarchies . . . . . . . . . . . . . . . . . . . . . . . . 93


Reinhard Bauer, Tobias Columbus, Ignaz Rutter, and
Dorothea Wagner

Time-Efficient Quantum Walks for 3-Distinctness . . . . . . . . . . . . . . . . . . . . 105


Aleksandrs Belovs, Andrew M. Childs, Stacey Jeffery,
Robin Kothari, and Frédéric Magniez

An Algebraic Characterization of Testable Boolean CSPs . . . . . . . . . . . . . 123


Arnab Bhattacharyya and Yuichi Yoshida

Approximation Algorithms for the Joint Replenishment Problem


with Deadlines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Marcin Bienkowski, Jaroslaw Byrka, Marek Chrobak, Neil Dobbs,
Tomasz Nowicki, Maxim Sviridenko, Grzegorz Świrszcz, and
Neal E. Young

Sparse Suffix Tree Construction in Small Space . . . . . . . . . . . . . . . . . . . . . . 148


Philip Bille, Johannes Fischer, Inge Li Gørtz, Tsvi Kopelowitz,
Benjamin Sach, and Hjalte Wedel Vildhøj

Tree Compression with Top Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160


Philip Bille, Inge Li Gørtz, Gad M. Landau, and Oren Weimann

Noncommutativity Makes Determinants Hard . . . . . . . . . . . . . . . . . . . . . . . 172


Markus Bläser

Optimal Orthogonal Graph Drawing with Convex Bend Costs . . . . . . . . . 184


Thomas Bläsius, Ignaz Rutter, and Dorothea Wagner

Deterministic Single Exponential Time Algorithms for Connectivity


Problems Parameterized by Treewidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Hans L. Bodlaender, Marek Cygan, Stefan Kratsch, and
Jesper Nederlof

On the Complexity of Higher Order Abstract Voronoi Diagrams . . . . . . . 208


Cecilia Bohler, Panagiotis Cheilaris, Rolf Klein, Chih-Hung Liu,
Evanthia Papadopoulou, and Maksym Zavershynskyi

A Pseudo-Polynomial Algorithm for Mean Payoff Stochastic Games


with Perfect Information and a Few Random Positions . . . . . . . . . . . . . . . . 220
Endre Boros, Khaled Elbassioni, Vladimir Gurvich, and
Kazuhisa Makino

Direct Product via Round-Preserving Compression . . . . . . . . . . . . . . . . . . . 232


Mark Braverman, Anup Rao, Omri Weinstein, and Amir Yehudayoff

How Hard Is Counting Triangles in the Streaming Model? . . . . . . . . . . . . . 244


Vladimir Braverman, Rafail Ostrovsky, and Dan Vilenchik

Online Checkpointing with Improved Worst-Case Guarantees . . . . . . . . . . 255


Karl Bringmann, Benjamin Doerr, Adrian Neumann, and
Jakub Sliacan

Exact and Efficient Generation of Geometric Random Variates and


Random Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Karl Bringmann and Tobias Friedrich

Finding Short Paths on Polytopes by the Shadow Vertex Algorithm . . . . 279


Tobias Brunsch and Heiko Röglin

On Randomized Online Labeling with Polynomially Many Labels . . . . . . 291


Jan Bulánek, Michal Koucký, and Michael Saks

Dual Lower Bounds for Approximate Degree and Markov-Bernstein


Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Mark Bun and Justin Thaler

New Doubling Spanners: Better and Simpler . . . . . . . . . . . . . . . . . . . . . . . . . 315


T.-H. Hubert Chan, Mingfei Li, Li Ning, and Shay Solomon

Maximum Edge-Disjoint Paths in k-Sums of Graphs . . . . . . . . . . . . . . . . . . . 328


Chandra Chekuri, Guyslain Naves, and F. Bruce Shepherd

On Integrality Ratios for Asymmetric TSP in the Sherali-Adams


Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
Joseph Cheriyan, Zhihan Gao, Konstantinos Georgiou, and
Sahil Singla

Counting Matchings of Size k Is #W[1]-Hard . . . . . . . . . . . . . . . . . . . . . . . . 352


Radu Curticapean

Faster Exponential-Time Algorithms in Graphs of Bounded Average


Degree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Marek Cygan and Marcin Pilipczuk

A Robust Khintchine Inequality, and Algorithms for Computing


Optimal Constants in Fourier Analysis and High-Dimensional
Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
Anindya De, Ilias Diakonikolas, and Rocco Servedio

Combining Binary Search Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388


Erik D. Demaine, John Iacono, Stefan Langerman, and Özgür Özkan

The Two-Handed Tile Assembly Model Is Not Intrinsically Universal . . . 400


Erik D. Demaine, Matthew J. Patitz, Trent A. Rogers,
Robert T. Schweller, Scott M. Summers, and Damien Woods

Clustering in the Boolean Hypercube in a List Decoding Regime . . . . . . . 413


Irit Dinur and Elazar Goldenberg

A Combinatorial Polynomial Algorithm for the Linear Arrow-Debreu


Market . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
Ran Duan and Kurt Mehlhorn

Towards an Understanding of Polynomial Calculus: New


Separations and Lower Bounds (Extended Abstract) . . . . . . . . . . . . . . . . . . 437
Yuval Filmus, Massimo Lauria, Mladen Mikša,
Jakob Nordström, and Marc Vinyals

On the Power of Deterministic Mechanisms for Facility Location


Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
Dimitris Fotakis and Christos Tzamos

ℓ2/ℓ2-Foreach Sparse Recovery with Low Risk . . . . . . . . . . . . . . . . . . . . . . . 461


Anna C. Gilbert, Hung Q. Ngo, Ely Porat, Atri Rudra, and
Martin J. Strauss

Autoreducibility of Complete Sets for Log-Space and Polynomial-Time


Reductions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
Christian Glaßer, Dung T. Nguyen, Christian Reitwießner,
Alan L. Selman, and Maximilian Witek

An Incremental Polynomial Time Algorithm to Enumerate All Minimal


Edge Dominating Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
Petr A. Golovach, Pinar Heggernes, Dieter Kratsch, and
Yngve Villanger

Deciding the Winner of an Arbitrary Finite Poset Game Is


PSPACE-Complete . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
Daniel Grier

Dynamic Compressed Strings with Random Access . . . . . . . . . . . . . . . . . . . 504


Roberto Grossi, Rajeev Raman, Satti Srinivasa Rao, and
Rossano Venturini

The Complexity of Planar Boolean #CSP with Complex Weights . . . . . . 516


Heng Guo and Tyson Williams

Arthur-Merlin Streaming Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528


Tom Gur and Ran Raz

Local Correctability of Expander Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540


Brett Hemenway, Rafail Ostrovsky, and Mary Wootters

On the Complexity of Broadcast Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552


Martin Hirt and Pavel Raykov

On Model-Based RIP-1 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564


Piotr Indyk and Ilya Razenshteyn

Robust Pseudorandom Generators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576


Yuval Ishai, Eyal Kushilevitz, Xin Li, Rafail Ostrovsky,
Manoj Prabhakaran, Amit Sahai, and David Zuckerman

A Robust AFPTAS for Online Bin Packing with Polynomial


Migration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589
Klaus Jansen and Kim-Manuel Klein

Small Stretch Pairwise Spanners . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601


Telikepalli Kavitha and Nithin M. Varma

Linear Kernels and Single-Exponential Algorithms via Protrusion


Decompositions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
Eun Jung Kim, Alexander Langer, Christophe Paul, Felix Reidl,
Peter Rossmanith, Ignasi Sau, and Somnath Sikdar

The Power of Linear Programming for Finite-Valued CSPs:


A Constructive Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
Vladimir Kolmogorov

Approximating Semi-matchings in Streaming and in Two-Party


Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
Christian Konrad and Adi Rosén

Full-Fledged Real-Time Indexing for Constant Size Alphabets . . . . . . . . . 650


Gregory Kucherov and Yakov Nekrich

Arithmetic Circuit Lower Bounds via MaxRank . . . . . . . . . . . . . . . . . . . . . . 661


Mrinal Kumar, Gaurav Maheshwari, and Jayalal Sarma M.N.

Model Checking Lower Bounds for Simple Graphs . . . . . . . . . . . . . . . . . . . . 673


Michael Lampis

The Complexity of Proving That a Graph Is Ramsey . . . . . . . . . . . . . . . . . 684


Massimo Lauria, Pavel Pudlák, Vojtěch Rödl, and Neil Thapen

An Improved Lower Bound for the Randomized Decision Tree


Complexity of Recursive Majority . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 696
Nikos Leonardos

A Quasi-Polynomial Time Partition Oracle for Graphs


with an Excluded Minor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 709
Reut Levi and Dana Ron

Fixed-Parameter Algorithms for Minimum Cost Edge-Connectivity


Augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 721
Dániel Marx and László A. Végh

Graph Reconstruction via Distance Oracles . . . . . . . . . . . . . . . . . . . . . . . . . . 733


Claire Mathieu and Hang Zhou

Dual Techniques for Scheduling on a Machine with Varying Speed . . . . . . 745


Nicole Megow and José Verschae

Improved Space Bounds for Strongly Competitive Randomized Paging


Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
Gabriel Moruz and Andrei Negoescu

No-Wait Flowshop Scheduling Is as Hard as Asymmetric Traveling


Salesman Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 769
Marcin Mucha and Maxim Sviridenko

A Composition Theorem for the Fourier Entropy-Influence


Conjecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 780
Ryan O’Donnell and Li-Yang Tan

Large Neighborhood Local Search for the Maximum Set Packing


Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 792
Maxim Sviridenko and Justin Ward

The Complexity of Three-Element Min-Sol and Conservative


Min-Cost-Hom . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 804
Hannes Uppman

The Complexity of Infinitely Repeated Alternating Move Games . . . . . . . 816


Yaron Velner

Approximating the Diameter of Planar Graphs in Near Linear Time . . . . 828


Oren Weimann and Raphael Yuster

Testing Linear-Invariant Function Isomorphism . . . . . . . . . . . . . . . . . . . . . . 840


Karl Wimmer and Yuichi Yoshida

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 851


Algorithms, Networks, and Social Phenomena

Jon Kleinberg

Cornell University
Ithaca NY USA
http://www.cs.cornell.edu/home/kleinber/

Abstract. We consider the development of computational models for systems involving social networks and large human audiences. In particular, we focus on the spread of information and behavior through such systems, and the ways in which these processes are affected by the underlying network structure.

Keywords: social networks, random graphs, contagion.

Overview

A major development over the past two decades has been the way in which networked computation has brought together people and information at a global scale. In addition to its societal consequences, this move toward massive connectivity has led to a range of new challenges for the field of computing; many of these challenges are based directly on the need for new models of computation.

We focus here on some of the modeling issues that arise in the design of computing systems involving large human audiences — these include social networking and social media sites such as Facebook, Google Plus, Twitter, and YouTube, sites supporting commerce and economic exchange such as Amazon and eBay, and sites for organizing the collective creation of knowledge such as Wikipedia. The interactions on these sites are extensively mediated by algorithms, and in thinking about the design issues that come into play, we need to think in particular about the feedback loops created by interactions among the large groups of people that populate these systems — in the ways they respond to incentives [21,23,27], form social networks [9,12,20] and share information [25].

Within this broad space of questions, we consider models for the spread of information and behavior through large social and economic networks — it has become clear that this type of person-to-person transmission is a basic “transport mechanism” for such networks [14]. Among the issues informing this investigation are recent theoretical models of such processes [4,6,8,11,13,19,26,30], as well as incentive mechanisms for propagating information [1,2,5,15,24], techniques for reconstructing the trajectory of information spreading through a network given incomplete observations [7,10,18], and empirical results indicating the importance of network structure [3,16,17,22] — and in particular network neighborhood structure [28,29] — for understanding the ways in which information will propagate at a local level.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 1–3, 2013.

© Springer-Verlag Berlin Heidelberg 2013

References

1. Arcaute, E., Kirsch, A., Kumar, R., Liben-Nowell, D., Vassilvitskii, S.: On thresh-
old behavior in query incentive networks. In: Proc. 8th ACM Conference on Elec-
tronic Commerce, pp. 66–74 (2007)
2. Babaioff, M., Dobzinski, S., Oren, S., Zohar, A.: On bitcoin and red balloons. In:
Proc. ACM Conference on Electronic Commerce, pp. 56–73 (2012)
3. Backstrom, L., Huttenlocher, D., Kleinberg, J., Lan, X.: Group formation in large
social networks: Membership, growth, and evolution. In: Proc. 12th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining (2006)
4. Blume, L., Easley, D., Kleinberg, J., Kleinberg, R., Tardos, É.: Which networks
are least susceptible to cascading failures? In: Proc. 52nd IEEE Symposium on
Foundations of Computer Science (2011)
5. Cebrián, M., Coviello, L., Vattani, A., Voulgaris, P.: Finding red balloons with split
contracts: robustness to individuals’ selfishness. In: Proc. 44th ACM Symposium
on Theory of Computing, pp. 775–788 (2012)
6. Centola, D., Macy, M.: Complex contagions and the weakness of long ties. American
Journal of Sociology 113, 702–734 (2007)
7. Chierichetti, F., Kleinberg, J.M., Liben-Nowell, D.: Reconstructing patterns of information diffusion from incomplete observations. In: Proc. 24th Advances in Neural Information Processing Systems, pp. 792–800 (2011)
8. Dodds, P., Watts, D.: Universal behavior in a generalized model of contagion.
Physical Review Letters 92, 218701 (2004)
9. Easley, D., Kleinberg, J.: Networks, Crowds, and Markets: Reasoning about a
Highly Connected World. Cambridge University Press (2010)
10. Golub, B., Jackson, M.O.: Using selection bias to explain the observed structure
of internet diffusions. Proc. Natl. Acad. Sci. USA 107(24), 10833–10836 (2010)
11. Granovetter, M.: Threshold models of collective behavior. American Journal of
Sociology 83, 1420–1443 (1978)
12. Jackson, M.O.: Social and Economic Networks. Princeton University Press (2008)
13. Kempe, D., Kleinberg, J., Tardos, É.: Maximizing the spread of influence in a social
network. In: Proc. 9th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, pp. 137–146 (2003)
14. Kleinberg, J.: Cascading behavior in networks: Algorithmic and economic issues.
In: Nisan, N., Roughgarden, T., Tardos, É., Vazirani, V. (eds.) Algorithmic Game
Theory, pp. 613–632. Cambridge University Press (2007)
15. Kleinberg, J., Raghavan, P.: Query incentive networks. In: Proc. 46th IEEE Sym-
posium on Foundations of Computer Science, pp. 132–141 (2005)
16. Kossinets, G., Watts, D.: Empirical analysis of an evolving social network. Science
311, 88–90 (2006)
17. Leskovec, J., Adamic, L., Huberman, B.: The dynamics of viral marketing. ACM
Transactions on the Web 1(1) (May 2007)
18. Liben-Nowell, D., Kleinberg, J.: Tracing information flow on a global scale using
Internet chain-letter data. Proc. Natl. Acad. Sci. USA 105(12), 4633–4638 (2008)
19. Mossel, E., Roch, S.: On the submodularity of influence in social networks. In:
Proc. 39th ACM Symposium on Theory of Computing (2007)
20. Newman, M.E.J.: Networks: An Introduction. Oxford University Press (2010)
21. Nisan, N., Roughgarden, T., Tardos, É., Vazirani, V.: Algorithmic Game Theory.
Cambridge University Press (2007)

22. Onnela, J.P., Saramaki, J., Hyvonen, J., Szabo, G., Lazer, D., Kaski, K., Kertesz,
J., Barabasi, A.L.: Structure and tie strengths in mobile communication networks.
Proc. Natl. Acad. Sci. USA 104, 7332–7336 (2007)
23. Papadimitriou, C.H.: Algorithms, games, and the internet. In: Proc. 33rd ACM
Symposium on Theory of Computing, pp. 749–753 (2001)
24. Pickard, G., Pan, W., Rahwan, I., Cebrian, M., Crane, R., Madan, A., Pentland,
A.: Time-critical social mobilization. Science 334(6055), 509–512 (2011)
25. Rogers, E.: Diffusion of Innovations, 4th edn. Free Press (1995)
26. Schelling, T.: Micromotives and Macrobehavior. Norton (1978)
27. Shoham, Y., Leyton-Brown, K.: Multiagent Systems: Algorithmic, Game-
Theoretic, and Logical Foundations. Cambridge University Press (2009)
28. Ugander, J., Backstrom, L., Kleinberg, J.: Subgraph frequencies: Mapping the em-
pirical and extremal geography of large graph collections. In: Proc. 22nd Interna-
tional World Wide Web Conference (2013)
29. Ugander, J., Backstrom, L., Marlow, C., Kleinberg, J.: Structural diversity in social
contagion. Proc. Natl. Acad. Sci. USA 109(16), 5962–5966 (2012)
30. Watts, D.J.: A simple model of global cascades on random networks. Proc. Natl.
Acad. Sci. USA 99(9), 5766–5771 (2002)
Recent Advances for a Classical Scheduling
Problem

Susanne Albers

Department of Computer Science, Humboldt-Universität zu Berlin


[email protected]

Abstract. We revisit classical online makespan minimization, which has been studied since the 1960s. In this problem a sequence of jobs has to be scheduled on m identical machines so as to minimize the makespan of the constructed schedule. Recent research has focused on settings in which an online algorithm is given extra information or power while processing a job sequence. In this paper we review the various models of resource augmentation and survey important results.

1 Introduction

Makespan minimization on parallel machines is a fundamental and extensively studied scheduling problem with a considerable body of literature published over the last forty years. Consider a sequence σ = J1, . . . , Jn of jobs that have to be scheduled non-preemptively on m identical parallel machines. Each job Jt is specified by an individual processing time pt, 1 ≤ t ≤ n. The goal is to minimize the makespan, i.e. the maximum completion time of any job in the constructed schedule. In the offline variant of the problem the entire job sequence σ is known in advance. In the online variant the jobs arrive one by one as elements of a list. Whenever a job Jt arrives, its processing time pt is known. The job has to be scheduled immediately on one of the machines without knowledge of any future jobs Jt′, with t′ > t.
Already in 1966 Graham [25] presented the elegant List algorithm. This strategy, which can be used for the online setting, assigns each job of σ to a machine currently having the smallest load. Graham proved that List is (2 − 1/m)-competitive. An online algorithm A is called c-competitive if, for every job sequence, the makespan of A’s schedule is at most c times the makespan of an optimal schedule [35]. In 1987 Hochbaum and Shmoys [26] devised a famous polynomial time approximation scheme for the offline problem, which is NP-hard [23].
Over the past 20 years research on makespan minimization has focused on the
online problem. Deterministic algorithms achieving a competitive ratio smaller
than 2 − 1/m were developed in [2,12,21,22,28]. The best deterministic strategy
currently known has a competitiveness of 1.9201 [21]. Lower bounds on the
performance of deterministic algorithms were presented in [2,12,13,20,24,31,32].
The strongest result implies that no deterministic online strategy can achieve a


competitive ratio smaller than 1.88 [31]. Hence the remaining gap between the
known upper and lower bounds is quite small.
Very few results have been developed for randomized online algorithms. For
m = 2 machines, Bartal et al. [12] presented an algorithm that attains an optimal
competitive ratio of 4/3. Currently, no randomized algorithm is known whose
competitiveness is provably below the deterministic lower bound of 1.88, for all
values of m. A lower bound of e/(e − 1) ≈ 1.581 on the competitive ratio of
any randomized online strategy, for general m, was given in [14,34]. The ratio
of e/(e − 1) is also the best performance guarantee that can be achieved by
deterministic online algorithms if job preemption is allowed [15].
Recent research on makespan minimization has investigated scenarios where the online constraint is relaxed. More precisely, an online algorithm is given additional information or extra power in processing a job sequence σ. The study of such settings is motivated by the fact that the competitiveness of deterministic online strategies is relatively high, compared to List’s initial performance guarantee of 2 − 1/m. Furthermore, with respect to the foundations of online algorithms, it is interesting to gain insight into the value of various forms of resource augmentation. Generally, in the area of scheduling the standard type of resource augmentation is extra speed, i.e. an online algorithm is given faster machines than an offline algorithm that constructs optimal schedules. We refer the reader to [6,27,30] and references therein for a selection of work in this direction. However, for online makespan minimization, faster processors do not give particularly interesting results. Obviously, the decrease in the algorithms’ competitive ratios is inversely proportional to the increase in speed.
For online makespan minimization the following scientifically more challenging types of resource augmentation have been explored. The problem scenarios are generally well motivated from a practical point of view.
– Known total processing time: Consider a setting in which an online algorithm knows the sum p1 + · · · + pn of the job processing times of σ. The access to such a piece of information can be justified as follows. In a parallel server system there usually exist fairly accurate estimates on the workload that arrives over a given time horizon. Furthermore, in a shop floor a scheduler typically accepts orders (tasks) of a targeted volume for a given time period, say a day or a week.
– Availability of a reordering buffer: In this setting an online algorithm has a
buffer of limited size that may be used to partially reorder the job sequence.
Whenever a job arrives, it is inserted into the buffer; then one job of the
buffer is removed and assigned in the current schedule.
– Job migration: Assume that at any time an online algorithm may perform
reassignments, i.e. jobs already scheduled on machines may be removed and
transferred to other machines. Job migration is a well-known and widely
used technique to balance load in parallel and distributed systems.
In this paper we survey the results known for these relaxed online scenarios. It
turns out that usually significantly improved competitive ratios can be achieved.
Unless otherwise stated, all algorithms considered in this paper are deterministic.

Throughout this paper let M1, . . . , Mm denote the m machines. Moreover, at any time the load of a machine is the sum of the processing times of the jobs currently assigned to that machine.

2 Known Total Processing Time

In this section we consider the scenario that an online algorithm knows the sum S = p1 + · · · + pn of the job processing times, for the incoming sequence σ. The problem was first studied by Kellerer et al. [29] who concentrated on m = 2 machines and gave an algorithm that achieves an optimal competitive ratio of 4/3. The setting with a general number m of machines was investigated in [5,10,16]. Angelelli et al. [10] gave a strategy that attains a competitiveness of (1 + √6)/2 ≈ 1.725. The best algorithm currently known was developed by Cheng et al. [16] and is 1.6-competitive. Both the algorithms by Angelelli et al. and Cheng et al. work with job classes, i.e. jobs are classified according to their processing times. For each class, specific scheduling rules apply. The algorithm by Cheng et al. [16] is quite involved, as we shall see below. A simple algorithm not resorting to job classes was presented by Albers and Hellwig [5]. However, the algorithm is only 1.75-competitive and hence does not achieve the best possible competitive ratio. We proceed to describe the 1.6-competitive algorithm by Cheng et al. [16], which we call ALG(P).
Description of ALG(P): The job assignment rules essentially work with small, medium and large jobs. In order to keep track of the machines containing these jobs, a slightly more refined classification is needed. W.l.o.g. assume that p1 + · · · + pn = m so that 1 is a lower bound on the optimum makespan. A job Jt, 1 ≤ t ≤ n, is
• tiny if pt ∈ (0, 0.3] • little if pt ∈ (0.3, 0.6] • medium if pt ∈ (0.6, 0.8],
• big if pt ∈ (0.8, 0.9] • very big if pt > 0.9.

Tiny and little jobs are also called small. Big and very big jobs are large. At any given time let ℓj denote the load of Mj, 1 ≤ j ≤ m. Machine Mj is
• empty if ℓj = 0 • little if ℓj ∈ (0.3, 0.6] • small if ℓj ∈ (0, 0.6].
A machine Mj is called medium if it only contains one medium job. Finally, Mj is said to be nearly full if it contains a large job as well as small jobs and ℓj ≤ 1.1.
ALG(P) works in two phases.
Phase 1: The first phase proceeds as long as (1) there are empty machines and
(2) twice the total number of empty and medium machines is greater than the
number of little machines. Throughout the phase ALG(P) maintains a lower
bound L on the optimum makespan. Initially, L := 1. During the phase, for each
new job Jt , the algorithm sets L := max{L, pt }. Then the job is scheduled as
follows.

– Jt is very big. Assign Jt to an Mj with the largest load j ≤ 0.6.


– Jt is big. If there are more than m/2 empty machines, assign Jt to an empty
machine. Otherwise use the scheduling rule for very big jobs.
– Jt is medium. Assign Jt to an empty machine.
– J_t is small. Execute the first possible assignment rule: (a) If there exists a
small machine M_j such that ℓ_j + p_t ≤ 0.6, assign J_t to it. (b) If there is
a nearly full machine M_j such that ℓ_j + p_t ≤ 1.6L, assign J_t to it. (c) If
there exist more than m/2 empty machines, then let ℓ* = 0.9; otherwise let
ℓ* = 0.8. If there exist machines M_j such that ℓ_j > ℓ* and ℓ_j + p_t ≤ 1.6L,
then, among these machines, assign J_t to one having the largest load. (d)
Assign J_t to an empty machine.

Phase 2: If at the end of Phase 1 there are no empty machines, then in Phase 2
jobs are generally scheduled according to a Best Fit strategy. More specifically, a
job J_t is assigned to a machine M_j having the largest load ℓ_j such that ℓ_j + p_t ≤
1.6L. Here L is updated as L := max{L, p_t, 2p*}, where p* is the processing
time of the (m + 1)-st largest job seen so far. If at the end of Phase 1 there exist
empty machines, then the job assignment is more involved. First ALG(P) creates
batches of three machines. Each batch consists of two little machines as well as
one medium or one empty machine. Each batch either receives only small jobs
or only medium and large jobs. At any time there exists only one open batch to
receive small jobs. Similarly, there exists one open batch to receive medium and
large jobs. While batches are open or can be opened, jobs are either scheduled on
an empty machine or using the Best Fit policy. Once the batches are exhausted,
jobs are assigned using Best Fit to the remaining machines. We refer the reader
to [16] for an exact definition of the scheduling rules.
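To make the assignment rules concrete, Phase 1 can be sketched in a few lines. This is a rough sketch, not the exact algorithm: Phase 2 is omitted, and wherever the rules leave the machine choice open we pick the lowest admissible index, which is our assumption rather than part of ALG(P).

```python
def alg_p_phase1(jobs, m):
    # Sketch of Phase 1 of ALG(P). Job sizes are assumed normalized so that
    # sum(jobs) == m, hence the lower bound L starts at 1.
    mach = [[] for _ in range(m)]                      # job sizes per machine
    L = 1.0                                            # lower bound on OPT

    def load(j):
        return sum(mach[j])

    def cls(p):                                        # job classes
        if p <= 0.3: return "tiny"
        if p <= 0.6: return "little"
        if p <= 0.8: return "medium"
        if p <= 0.9: return "big"
        return "very big"

    def nearly_full(j):
        kinds = {cls(q) for q in mach[j]}
        return (kinds & {"big", "very big"} and kinds & {"tiny", "little"}
                and load(j) <= 1.1)

    for t, p in enumerate(jobs):
        empties = [j for j in range(m) if not mach[j]]
        mediums = [j for j in range(m) if len(mach[j]) == 1
                   and cls(mach[j][0]) == "medium"]
        littles = [j for j in range(m) if 0.3 < load(j) <= 0.6]
        # Phase 1 continues only while (1) and (2) hold.
        if not empties or 2 * (len(empties) + len(mediums)) <= len(littles):
            return mach, jobs[t:]                      # rest goes to Phase 2
        L = max(L, p)
        c = cls(p)
        if c == "medium":
            mach[empties[0]].append(p)
        elif c in ("big", "very big"):
            if c == "big" and len(empties) > m / 2:
                mach[empties[0]].append(p)
            else:                                      # largest load <= 0.6
                j = max((k for k in range(m) if load(k) <= 0.6), key=load)
                mach[j].append(p)
        else:                                          # small job: rules (a)-(d)
            a = [j for j in range(m) if 0 < load(j) <= 0.6 and load(j) + p <= 0.6]
            b = [j for j in range(m) if nearly_full(j) and load(j) + p <= 1.6 * L]
            thr = 0.9 if len(empties) > m / 2 else 0.8
            c3 = [j for j in range(m) if load(j) > thr and load(j) + p <= 1.6 * L]
            if a:    mach[a[0]].append(p)
            elif b:  mach[b[0]].append(p)
            elif c3: mach[max(c3, key=load)].append(p)
            else:    mach[empties[0]].append(p)
    return mach, []
```

For instance, with m = 4 and eight jobs of size 0.5, rule (d) fills one empty machine per job until condition (2) fails, after three placements leave three little machines and one empty machine.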
Theorem 1. [16] ALG(P) is 1.6-competitive, for general m.
Lower bounds on the best possible competitive ratio of deterministic strategies
were given in [5,10,16]. Cheng et al. [16] showed a lower bound of 1.5. Angelelli
et al. [10] gave an improved bound of 1.565, as m → ∞. The best lower bound
currently known was presented in [5].
Theorem 2. [5] Let A be a deterministic online algorithm that knows the total
processing time of σ. If A is c-competitive, then c ≥ 1.585, as m → ∞.
Hence the gap between the best known upper and lower bounds is very small.
Nonetheless, an interesting open problem is to determine the exact competitive-
ness that can be achieved in the setting where the sum of the job processing
times is known.
Further results have been developed for the special case of m = 2 machines.
Two papers by Angelelli et al. [7,8] assume that an online algorithm additionally
knows an upper bound on the maximum job processing time. A setting with
m = 2 uniform machines is addressed in [9].
Azar and Regev [11] studied a related problem. Here an online algorithm even
knows the value of the optimum makespan, for the incoming job sequence. In a
scheduling environment it is probably unrealistic that the value of an optimal
8 S. Albers

solution is known. However, the proposed setting represents an interesting bin


packing problem, which Azar and Regev [11] coined bin stretching. Now ma-
chines correspond to bins and jobs correspond to items. Consider a sequence σ
of items that can be feasibly packed into m unit-size bins. The goal of an online
algorithm is to pack σ into m bins so as to stretch the size of the bins as little
as possible. Azar and Regev [11] gave a 1.625-competitive algorithm, but the
algorithm ALG(P) is also 1.6-competitive for bin stretching.

Corollary 1. [16] ALG(P) is 1.6-competitive for bin stretching.


Azar and Regev [11] showed a lower bound.

Theorem 3. [11] Let A be a deterministic online algorithm for bin stretching.


If A is c-competitive, then c ≥ 4/3.

An open problem is to tighten the gap between the upper and the lower bounds.
Bin stretching with m = 2 bins was addressed by Epstein [19].

3 Reordering Buffer

Makespan minimization with a reordering buffer was proposed by Kellerer et


al. [29]. The papers by Kellerer et al. [29] and Zhang [36] focus on m = 2
machines and present algorithms that achieve an optimal competitive ratio of
4/3. The algorithms use a buffer of size 2. In fact the strategies work with a
buffer of size 1 because in [29,36] a slightly different policy is used when a new
job arrives. Upon the arrival of a job, either this job or the one residing in the
buffer may be assigned to the machines. In the latter case, the new job is inserted
into the buffer.
The problem with a general number m of machines was investigated by Englert
et al. [18]. They developed various algorithms that use a buffer of size O(m).
All the algorithms consist of two phases, an iteration phase and a final phase.
Let k denote the size of the buffer. In the iteration phase the first k − 1 jobs of
σ are inserted into the buffer. Subsequently, while jobs arrive, the incoming job
is placed in the buffer. Then a job with the smallest processing time is removed
from the buffer and assigned in the schedule. In the final phase, the remaining
k − 1 jobs in the buffer are scheduled.
In the following we describe the algorithm by Englert et al. [18] that achieves
the smallest competitiveness. We refer to this strategy as ALG(B). For the def-
inition of the algorithm and its competitiveness we introduce a function fm (α).
For any machine number m ≥ 2 and real-valued α > 1, let

f_m(α) = (α − 1)·(H_{m−1} − H_{⌈(1−1/α)m⌉−1}) + ⌈(1 − 1/α)m⌉·α/m.    (1)

Here H_k = ∑_{i=1}^{k} 1/i denotes the k-th Harmonic number, for any integer k ≥ 1.
We set H_0 = 0. For any fixed m ≥ 2, let α_m be the value satisfying f_m(α) = 1.
The paper by Albers and Hellwig [3] formally shows that αm is well-defined.
Algorithm ALG(B) is exactly α_m-competitive. The sequence (α_m)_{m≥2} is non-decreasing
with α_2 = 4/3 and lim_{m→∞} α_m = W_{−1}(−1/e²)/(1 + W_{−1}(−1/e²)) ≈
1.4659. Here W_{−1} denotes the lower branch of the Lambert W function.
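The values α_m can be checked numerically by bisection, since f_m is increasing in α. A minimal sketch; the ceiling placements in f_m are our reading of the flattened formula (1) and should be treated as an assumption:

```python
from math import ceil

def harmonic(k):
    # H_k = sum_{i=1}^{k} 1/i, with H_0 = 0
    return sum(1.0 / i for i in range(1, k + 1))

def f(m, a):
    # f_m(alpha) from equation (1); ceiling placement is our reconstruction
    r = ceil((1 - 1 / a) * m)
    return (a - 1) * (harmonic(m - 1) - harmonic(r - 1)) + r * a / m

def alpha(m, eps=1e-12):
    # f_m is increasing in alpha on (1, 2], so bisect for f_m(alpha) = 1
    lo, hi = 1.0 + 1e-9, 2.0
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if f(m, mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For m = 2 the procedure recovers α_2 = 4/3, and for large m the values approach the Lambert W limit ≈ 1.4659 stated above.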
Description of ALG(B): The algorithm works with a buffer of size k = 3m.
During the iteration phase ALG(B) maintains a load profile on the m machines
M_1, . . . , M_m. Let

β(j) = (α_m − 1)·m/(m − j)  if j ≤ m/α_m,  and  β(j) = α_m  otherwise.

Consider any step during the iteration phase. As mentioned above, the algorithm
removes a job with the smallest processing time from the buffer. Let p denote
the respective processing time. Furthermore, let L be the total load on the m
machines prior to the assignment. The job is scheduled on a machine Mj with a
load of at most
β(j)(L/m + p) − p.
Englert et al. [18] prove that such a machine always exists. In the final phase
ALG(B) first constructs a virtual schedule on m empty machines M′_1, . . . , M′_m.
More specifically, the k − 1 jobs from the buffer are considered in non-increasing
order of processing time. Each job is assigned to a machine of M′_1, . . . , M′_m with
the smallest current load. The process stops when the makespan of the virtual
schedule is at least three times the processing time of the last job assigned.
This last job is removed again from the virtual schedule. Then the machines
M′_1, . . . , M′_m are renumbered in order of non-increasing load. The jobs residing
on M′_j are assigned to M_j in the real schedule, 1 ≤ j ≤ m. In a last step each of
the remaining jobs is placed on a least loaded machine in the current schedule.
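The iteration phase can be sketched with the buffer realized as a min-heap. The closed form used for β is our reconstruction and should be treated as an assumption, and the fallback branch is our addition (Englert et al. prove that a profile-respecting machine always exists, so with the exact α_m the fallback is never needed):

```python
import heapq

def beta(j, m, am):
    # Load profile; the closed form is our reconstruction. It is continuous:
    # at j = m/am the first case equals am.
    return (am - 1) * m / (m - j) if j <= m / am else am

def iteration_phase(jobs, m, am, k=None):
    # Buffer of size k = 3m kept as a min-heap; once the buffer holds k jobs,
    # a smallest buffered job is removed and scheduled.
    k = k or 3 * m
    buf, load = [], [0.0] * m
    for p in jobs:
        heapq.heappush(buf, p)
        if len(buf) < k:
            continue
        q = heapq.heappop(buf)                  # smallest buffered job
        total = sum(load)                       # total load L before assigning
        for j in range(1, m + 1):               # find M_j respecting the profile
            if load[j - 1] <= beta(j, m, am) * (total / m + q) - q:
                load[j - 1] += q
                break
        else:                                   # our fallback; with the exact
            load[load.index(min(load))] += q    # alpha_m it is never needed
    return load, sorted(buf)                    # machine loads, buffered jobs
```

The final phase (virtual schedule plus renumbering) is omitted here; the sketch only shows how the profile keeps some machines lightly loaded for it.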
Theorem 4. [18] ALG(B) is αm -competitive, for any m ≥ 2.
Englert et al. [18] prove that the competitiveness of αm is best possible using a
buffer whose size does not depend on the job sequence σ.
Theorem 5. [18] No deterministic online algorithm can achieve a competitive
ratio smaller than αm , for any m ≥ 2, with a buffer whose size does not depend
on σ.
Englert et al. [18] also present algorithms that use a smaller buffer. In particular
they give a (1 + αm /2)-competitive algorithm that works with a buffer of size
m + 1. Moreover, they analyze an extended List algorithm that always assigns
jobs to a least loaded machine. The strategy attains a competitiveness of 2 −
1/(m − k + 1) with a buffer of size k ∈ [1, (m + 1)/2].
The paper by Englert et al. [18] also considers online makespan minimization
on related machines and gives a (2 + ε)-competitive algorithm, for any ε > 0,
that uses a buffer of size m. Dósa and Epstein [17] studied online makespan
minimization on identical machines assuming that job preemption is allowed
and showed that the competitiveness is 4/3.
4 Job Migration
To the best of our knowledge makespan minimization with job migration was
first addressed by Aggarwal et al. [1]. However the authors consider an offline
setting. An algorithm is given a schedule, in which all jobs are already assigned,
and a budget. The algorithm may perform job migrations up to the given budget.
Aggarwal et al. [1] design strategies that perform well with respect to the best
possible solution that can be constructed with the budget. In this article we are
interested in online makespan minimization with job migration. Two models of
migration have been investigated. (1) An online algorithm may migrate a certain
volume of jobs. (2) An online algorithm may migrate a limited number of jobs.
In the next sections we address these two settings.

4.1 Migrating a Bounded Job Volume


Sanders et al. [33] studied the scenario in which, upon the arrival of a job Jt ,
jobs of total processing time βpt may be migrated. Here β is called the migration
factor. Sanders et al. [33] devised various algorithms. A strength of the strategies
is that they are robust, i.e. after the arrival of each job the makespan of the
constructed schedule is at most c times the optimum makespan, for the prefix
of the job sequence seen so far. As usual c denotes the targeted competitive
ratio. On the other hand, over time, the algorithms migrate a total job volume
of β·∑_{t=1}^{n} p_t, which depends on the input sequence σ.
In the remainder of this section we present some of the results by Sanders et
al. [33] in more detail. The authors first show a simple strategy that is (3/2 −
1/(2m))-competitive; the migration factor is upper bounded by 2. We describe
an elegant algorithm, which we call ALG(M1), that is 3/2-competitive using a
migration factor of 4/3.
Description of ALG(M1): Upon the arrival of a new job Jt , the algorithm
checks m+1 options and chooses the one that minimizes the resulting makespan,
where ties are broken in favor of Option 0. Option 0 : Assign Jt to a least loaded
machine. Option j (1 ≤ j ≤ m): Consider machine Mj . Ignore the job with the
largest processing time on Mj and take the remaining jobs on this machine in
order of non-increasing processing time. Remove a job unless the total processing
volume of removed jobs would exceed 4/3 · p_t. Schedule J_t on M_j. Assign the
removed jobs repeatedly to a least loaded machine.
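The m + 1 options of ALG(M1) can be sketched directly. Tie-breaking among equally loaded machines (lowest index) is our choice; ties between options favor Option 0, as stated above:

```python
def alg_m1(jobs, m, beta=4/3):
    # Sketch of ALG(M1): on each arrival, evaluate Option 0 and Options 1..m,
    # keep the configuration with the smallest makespan (ties favor Option 0).
    machines = [[] for _ in range(m)]

    def makespan(cfg):
        return max(sum(x) for x in cfg)

    for p in jobs:
        # Option 0: new job on a least loaded machine, no migration.
        best = [list(x) for x in machines]
        j0 = min(range(m), key=lambda j: sum(best[j]))
        best[j0].append(p)
        # Option j: migrate up to beta*p volume away from M_j, put p there.
        for j in range(m):
            src = sorted(machines[j], reverse=True)
            stay, removed, vol = src[:1], [], 0.0
            for q in src[1:]:                  # never migrate the largest job
                if vol + q <= beta * p:
                    removed.append(q)
                    vol += q
                else:
                    stay.append(q)
            cfg = [list(x) for x in machines]
            cfg[j] = stay + [p]
            for q in removed:                  # reassign migrated jobs greedily
                k = min(range(m), key=lambda i: sum(cfg[i]))
                cfg[k].append(q)
            if makespan(cfg) < makespan(best):
                best = cfg
        machines = best
    return machines
```

On the small instance [0.4, 0.4, 0.4, 1.2] with m = 2, the sketch produces makespan 1.6, within the 3/2 guarantee of the optimum makespan 1.2.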

Theorem 6. [33] ALG(M1) is 3/2-competitive, where the migration factor is


upper bounded by 4/3.

Sanders et al. [33] show that a migration factor of 4/3 is essentially best pos-
sible for 3/2-competitive algorithms. More specifically, they consider schedules
S whose makespan is at most 3/2 times the optimum makespan for the set of
scheduled jobs and that additionally satisfy the following property: A removal
of the largest job from each machine yields a schedule S′ whose makespan is
upper bounded by the optimum makespan for the set of jobs sequenced in S.
Sanders et al. [33] show that there exists a schedule S such that, for any 0 < ε <
4/21, upon the arrival of a new job, a migration factor of 4/3 − ε is necessary to
achieve a competitiveness of 3/2.
Sanders et al. [33] also present a more involved algorithm that is 4/3-competitive
using a migration factor of 5/2. Moreover, they construct a sophisticated
online approximation scheme, where the migration factor depends exponentially
on 1/ε.
Theorem 7. [33] There exists a (1 + ε)-competitive algorithm with a migration
factor of β(ε) ∈ 2^{O((1/ε)·log²(1/ε))}. The running time needed to update the schedule
in response to the arrival of a new job is constant.
Finally, Sanders et al. [33] show that no constant migration factor is sufficient
to maintain truly optimal schedules.
Theorem 8. [33] Any online algorithm that maintains optimal solutions uses a
migration factor of Ω(m).

4.2 Migrating a Limited Number of Jobs


Albers and Hellwig [3] investigated the setting where an online algorithm may
perform a limited number of job reassignments. These job migrations may be
executed at any time while a job sequence σ is processed but, quite intuitively,
the best option is to perform migrations at the end of σ. The paper [3] presents
algorithms that use a total of O(m) migrations, independently of σ. The best
algorithm is αm -competitive, for any m ≥ 2, where αm is defined as in Sec-
tion 3. Recall that αm is the value satisfying fm (α) = 1, where fm (α) is the
function specified in (1). Again, α_2 = 4/3 and lim_{m→∞} α_m =
W_{−1}(−1/e²)/(1 + W_{−1}(−1/e²)) ≈ 1.4659. The algorithm uses at most (⌈(2 −
α_m)/(α_m − 1)²⌉ + 4)·m job migrations. For m ≥ 11, this expression is at most
7m. For smaller machine numbers it is 8m to 10m. In the next paragraphs we
describe the algorithm, which we call ALG(M2).
Description of ALG(M2): The algorithm operates in two phases, a job arrival
phase and a job migration phase. In the job arrival phase all jobs of σ are assigned
one by one to the machines. In this phase no job migrations are performed. Once
σ is scheduled, the job migration phase starts. First the algorithm removes some
jobs from the machines. Then these jobs are reassigned to other machines.
Job arrival phase: Let time t be the point in time when J_t has to be scheduled,
1 ≤ t ≤ n. ALG(M2) classifies jobs into small and large. To this end it maintains
a lower bound L_t on the optimum makespan. Let p_t^{2m+1} denote the processing
time of the (2m + 1)-st largest job in J_1, . . . , J_t, provided that 2m + 1 ≤ t. If
2m + 1 > t, let p_t^{2m+1} = 0. Obviously, when t jobs have arrived, the optimum
makespan cannot be smaller than the average load (1/m)·∑_{i=1}^{t} p_i on the m
machines. Furthermore, it cannot be smaller than 3p_t^{2m+1}, since some machine
must schedule at least three of the 2m + 1 largest jobs. Hence let

L_t = max{(1/m)·∑_{i=1}^{t} p_i, 3p_t^{2m+1}}.
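The lower bound L_t can be maintained online in O(log m) time per job with a min-heap holding the 2m + 1 largest processing times seen so far; a minimal sketch:

```python
import heapq

def lower_bounds(jobs, m):
    # Returns L_t = max(average load, 3 * (2m+1)-st largest job) after each
    # arrival; `big` is a min-heap of the 2m+1 largest jobs seen so far.
    big, total, out = [], 0.0, []
    for p in jobs:
        total += p
        heapq.heappush(big, p)
        if len(big) > 2 * m + 1:
            heapq.heappop(big)                 # drop the smallest of the heap
        p2m1 = big[0] if len(big) == 2 * m + 1 else 0.0
        out.append(max(total / m, 3 * p2m1))
    return out
```

With m = 2 and jobs 4, 1, 1, 1, 1, the second term only kicks in once 2m + 1 = 5 jobs have arrived.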
A job Ji , with i ≤ t, is small at time t if pi ≤ (αm − 1)Lt ; otherwise it is large at


time t. Finally, let p∗t be the total processing time of the jobs of J1 , . . . , Jt that
are large at time t. Define L∗t = p∗t /m.
ALG(M2) maintains a load profile on the m machines M1 , . . . , Mm as far
as small jobs are concerned. The profile is identical to the one maintained by
ALG(B) in Section 3, except that here we restrict ourselves to small jobs. Again,
let

β(j) = (α_m − 1)·m/(m − j)  if j ≤ m/α_m,  and  β(j) = α_m  otherwise.
For any machine M_j, 1 ≤ j ≤ m, let ℓ_s(j, t) be the load on M_j caused by the
jobs that are small at time t, prior to the assignment of J_t.
For t = 1, . . . , n, each J_t is scheduled as follows. If J_t is small at time t, then
it is scheduled on a machine M_j with ℓ_s(j, t) ≤ β(j)·L*_t. Albers and Hellwig [3] show
that such a machine always exists. If Jt is large at time t, then it is assigned to
a machine having the smallest load among all machines. At the end of the job
arrival phase let L = Ln and L∗ = L∗n .
Job migration phase: The phase consists of a job removal step, followed by a job
reassignment step. In the removal step ALG(M2) maintains a set R of removed
jobs. While there exists a machine Mj whose load exceeds
max{β(j)L∗ , (αm − 1)L}, ALG(M2) removes a job with the largest processing
time currently residing on Mj and adds the job to R. In the job reassignment
step the jobs of R are considered in non-increasing order of processing time.
First ALG(M2) constructs sets P1 , . . . , Pm consisting of at most two jobs each.
For j = 1, . . . , m, the j-th largest job of R is inserted into Pj provided that the
job is large at time n. Furthermore, for j = m + 1, . . . , 2m, the j-th largest job
of R is inserted into P_{2m+1−j}, provided that the job is large at time n and twice
its processing time is greater than the processing time of the job already
contained in P_{2m+1−j}. These sets P_1, . . . , P_m are renumbered in order of non-increasing
total processing time. Then, for j = 1, . . . , m, set Pj is assigned to a least loaded
machine. Finally the jobs of R \ (P_1 ∪ · · · ∪ P_m) are scheduled one by one on a
least loaded machine.
Theorem 9. [3] ALG(M2) is α_m-competitive and uses at most (⌈(2 − α_m)/(α_m −
1)²⌉ + 4)·m job migrations.
In fact Albers and Hellwig [3] prove that at most ⌈(2 − α_m)/(α_m − 1)²⌉ + 4 jobs
are removed from each machine in the removal step. Furthermore, Albers and
Hellwig show that the competitiveness of αm is best possible using a number of
migrations that does not depend on σ.
Theorem 10. [3] Let m ≥ 2. No deterministic online algorithm can achieve a
competitive ratio smaller than αm if o(n) job migrations are allowed.
Finally, Albers and Hellwig [3] give a family of algorithms that uses fewer mi-
grations and, in particular, trades performance for migrations. The family is
c-competitive, for any 5/3 ≤ c ≤ 2. For c = 5/3, the strategy uses at most 4m
job migrations. For c = 1.75, at most 2.5m migrations are required.
5 Conclusion

In this paper we have reviewed various models of resource augmentation for


online makespan minimization. The performance guarantees are tight for the
scenarios with a reordering buffer and job migration. Open problems remain in
the setting where the total job processing time is known and in the bin stretching
problem. The paper by Kellerer et al. [29] also proposed another augmented
setting in which an online algorithm is allowed to construct several candidate
schedules in parallel, the best of which is finally chosen. Kellerer et al. [29] present
a 4/3-competitive algorithm for m = 2 machines. For a general number m of
machines the problem is explored in [4].

References
1. Aggarwal, G., Motwani, R., Zhu, A.: The load rebalancing problem. Journal of
Algorithms 60(1), 42–59 (2006)
2. Albers, S.: Better bounds for online scheduling. SIAM Journal on Computing 29,
459–473 (1999)
3. Albers, S., Hellwig, M.: On the value of job migration in online makespan minimiza-
tion. In: Epstein, L., Ferragina, P. (eds.) ESA 2012. LNCS, vol. 7501, pp. 84–95.
Springer, Heidelberg (2012)
4. Albers, S., Hellwig, M.: Online makespan minimization with parallel schedules,
arXiv:1304.5625 (2013)
5. Albers, S., Hellwig, M.: Semi-online scheduling revisited. Theoretical Computer
Science 443, 1–9 (2012)
6. Anand, S., Garg, N., Kumar, A.: Resource augmentation for weighted flow-time ex-
plained by dual fitting. In: Proc. 23rd Annual ACM-SIAM Symposium on Discrete
Algorithms, pp. 1228–1241 (2012)
7. Angelelli, E., Speranza, M.G., Tuza, Z.: Semi-on-line scheduling on two parallel
processors with an upper bound on the items. Algorithmica 37, 243–262 (2003)
8. Angelelli, E., Speranza, M.G., Tuza, Z.: New bounds and algorithms for on-line
scheduling: two identical processors, known sum and upper bound on the tasks.
Discrete Mathematics & Theoretical Computer Science 8, 1–16 (2006)
9. Angelelli, E., Speranza, M.G., Tuza, Z.: Semi-online scheduling on two uniform
processors. Theoretical Computer Science 393, 211–219 (2008)
10. Angelelli, E., Nagy, A.B., Speranza, M.G., Tuza, Z.: The on-line multiprocessor
scheduling problem with known sum of the tasks. Journal of Scheduling 7, 421–428
(2004)
11. Azar, Y., Regev, O.: On-line bin-stretching. Theoretical Computer Science 268,
17–41 (2001)
12. Bartal, Y., Fiat, A., Karloff, H., Vohra, R.: New algorithms for an ancient schedul-
ing problem. Journal of Computer and System Sciences 51, 359–366 (1995)
13. Bartal, Y., Karloff, H., Rabani, Y.: A better lower bound for on-line scheduling.
Information Processing Letters 50, 113–116 (1994)
14. Chen, B., van Vliet, A., Woeginger, G.J.: A lower bound for randomized on-line
scheduling algorithms. Information Processing Letters 51, 219–222 (1994)
15. Chen, B., van Vliet, A., Woeginger, G.J.: An optimal algorithm for preemptive
online scheduling. Operations Research Letters 18, 127–131 (1995)
16. Cheng, T.C.E., Kellerer, H., Kotov, V.: Semi-on-line multiprocessor scheduling
with given total processing time. Theoretical Computer Science 337, 134–146
(2005)
17. Dósa, G., Epstein, L.: Preemptive online scheduling with reordering. In: Fiat, A.,
Sanders, P. (eds.) ESA 2009. LNCS, vol. 5757, pp. 456–467. Springer, Heidelberg
(2009)
18. Englert, M., Özmen, D., Westermann, M.: The power of reordering for online min-
imum makespan scheduling. In: Proc. 49th Annual IEEE Symposium on Founda-
tions of Computer Science, pp. 603–612 (2008)
19. Epstein, L.: Bin stretching revisited. Acta Informatica 39(2), 97–117 (2003)
20. Faigle, U., Kern, W., Turan, G.: On the performance of on-line algorithms for
partition problems. Acta Cybernetica 9, 107–119 (1989)
21. Fleischer, R., Wahl, M.: Online scheduling revisited. Journal of Scheduling 3, 343–
353 (2000)
22. Galambos, G., Woeginger, G.: An on-line scheduling heuristic with better worst
case ratio than Graham’s list scheduling. SIAM Journal on Computing 22, 349–355
(1993)
23. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory
of NP-Completeness. W.H. Freeman and Company, New York (1979)
24. Gormley, T., Reingold, N., Torng, E., Westbrook, J.: Generating adversaries for
request-answer games. In: Proc. 11th ACM-SIAM Symposium on Discrete Algo-
rithms, pp. 564–565 (2000)
25. Graham, R.L.: Bounds for certain multi-processing anomalies. Bell System Tech-
nical Journal 45, 1563–1581 (1966)
26. Hochbaum, D.S., Shmoys, D.B.: Using dual approximation algorithms for scheduling
problems: theoretical and practical results. Journal of the ACM 34, 144–162
(1987)
27. Kalyanasundaram, B., Pruhs, K.: Speed is as powerful as clairvoyance. Journal of
the ACM 47, 617–643 (2000)
28. Karger, D.R., Phillips, S.J., Torng, E.: A better algorithm for an ancient scheduling
problem. Journal of Algorithms 20, 400–430 (1996)
29. Kellerer, H., Kotov, V., Speranza, M.G., Tuza, Z.: Semi on-line algorithms for the
partition problem. Operations Research Letters 21, 235–242 (1997)
30. Pruhs, K., Sgall, J., Torng, E.: Online scheduling. In: Leung, J. (ed.) Handbook
of Scheduling: Algorithms, Models, and Performance Analysis, ch. 15. CRC Press
(2004)
31. Rudin III, J.F.: Improved bounds for the on-line scheduling problem. Ph.D. Thesis.
The University of Texas at Dallas (May 2001)
32. Rudin III, J.F., Chandrasekaran, R.: Improved bounds for the online scheduling
problem. SIAM Journal on Computing 32, 717–735 (2003)
33. Sanders, P., Sivadasan, N., Skutella, M.: Online scheduling with bounded migra-
tion. Mathematics of Operations Research 34(2), 481–498 (2009)
34. Sgall, J.: A lower bound for randomized on-line multiprocessor scheduling. Infor-
mation Processing Letters 63, 51–55 (1997)
35. Sleator, D.D., Tarjan, R.E.: Amortized efficiency of list update and paging rules.
Communications of the ACM 28, 202–208 (1985)
36. Zhang, G.: A simple semi on-line algorithm for P 2//Cmax with a buffer. Informa-
tion Processing Letters 61, 145–148 (1997)
Formalizing and Reasoning about Quality

Shaull Almagor1, Udi Boker2 , and Orna Kupferman1


1 The Hebrew University, Jerusalem, Israel
2 IST Austria, Klosterneuburg, Austria

Abstract. Traditional formal methods are based on a Boolean satisfaction no-


tion: a reactive system satisfies, or not, a given specification. We generalize for-
mal methods to also address the quality of systems. As an adequate specification
formalism we introduce the linear temporal logic LTL[F]. The satisfaction value
of an LTL[F] formula is a number between 0 and 1, describing the quality of
the satisfaction. The logic generalizes traditional LTL by augmenting it with a
(parameterized) set F of arbitrary functions over the interval [0, 1]. For exam-
ple, F may contain the maximum or minimum between the satisfaction values of
subformulas, their product, and their average.
The classical decision problems in formal methods, such as satisfiability, model
checking, and synthesis, are generalized to search and optimization problems
in the quantitative setting. For example, model checking asks for the quality in
which a specification is satisfied, and synthesis returns a system satisfying the
specification with the highest quality. Reasoning about quality gives rise to other
natural questions, like the distance between specifications. We formalize these
basic questions and study them for LTL[F]. By extending the automata-theoretic
approach for LTL to a setting that takes quality into account, we are able to
solve the above problems and show that reasoning about LTL[F] has roughly the
same complexity as reasoning about traditional LTL.

1 Introduction
One of the main obstacles to the development of complex computerized systems lies in
ensuring their correctness. Efforts in this direction include temporal-logic model check-
ing – given a mathematical model of the system and a temporal-logic formula that
specifies a desired behavior of the system, decide whether the model satisfies the for-
mula, and synthesis – given a temporal-logic formula that specifies a desired behavior,
generate a system that satisfies the specification with respect to all environments [6].
Correctness is Boolean: a system can either satisfy its specification or not satisfy
it. The richness of today’s systems, however, justifies specification formalisms that are
multi-valued. The multi-valued setting arises directly in systems in which components
are multi-valued (c.f., probabilistic and weighted systems) and arises indirectly in ap-
plications where multi values are used in order to model missing, hidden, or varying
information (c.f., abstraction, query checking, and inconsistent viewpoints). As we elab-
orate below, the multi-valued setting has been an active area of research in recent years.

This work was supported in part by the Austrian Science Fund NFN RiSE (Rigorous Systems
Engineering), by the ERC Advanced Grant QUAREM (Quantitative Reactive Modeling), and
the ERC Grant QUALITY. The full version is available at the authors’ URLs.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 15–27, 2013.
c Springer-Verlag Berlin Heidelberg 2013
16 S. Almagor, U. Boker, and O. Kupferman

No attempts, however, have been made to augment temporal logics with a quantitative
layer that would enable the specification of the relative merits of different aspects of
the specification and would enable to formalize the quality of a reactive system. Given
the growing role that temporal logic plays in planning and robotics, and the criticality
of quality in these applications [16], such an augmentation is of great importance also
beyond the use of temporal logic in system design and verification.
In this paper we suggest a framework for formalizing and reasoning about quality.
Our working assumption is that satisfying a specification is not a yes/no matter. Differ-
ent ways of satisfying a specification should induce different levels of quality, which
should be reflected in the output of the verification procedure. Consider for example the
specification G(req → Fgrant ). There should be a difference between a computation
that satisfies it with grants generated soon after requests, one that satisfies it with long
waits, one that satisfies it with several grants given to a single request, one that satisfies
it vacuously (with no requests), and so on. Moreover, we may want to associate dif-
ferent levels of importance to different components of a specification, to express their
mutual influence on the quality, and to formalize the fact that we have different levels
of confidence about some of them.
Quality is a rather subjective issue. Technically, we can talk about the quality of sat-
isfaction of specifications since there are different ways to satisfy specifications. We
introduce and study the linear temporal logic LTL[F ], which extends LTL with an ar-
bitrary set F of functions over [0, 1]. Using the functions in F , a specifier can formally
and easily prioritize the different ways of satisfaction. The logic LTL[F] is really a
family of logics, each parameterized by a set F ⊆ {f : [0, 1]^k → [0, 1] | k ∈ ℕ} of functions
(of arbitrary arity) over [0, 1]. For example, F may contain the min {x, y}, max {x, y},
and 1 − x functions, which are the standard quantitative analogues of the ∧, ∨, and ¬
operators. As we discuss below, such extensions to LTL have already been studied in
the context of quantitative verification [15]. The novelty of LTL[F ], beyond its use in
the specification of quality, is the ability to manipulate values by arbitrary functions. For
example, F may contain the quantitative operator ▽_λ, for λ ∈ [0, 1], that tunes down
the quality of a sub-specification. Formally, the quality of the satisfaction of the
specification ▽_λ ϕ is the multiplication of the quality of the satisfaction of ϕ by λ. Another
useful operator is the weighted-average function ⊕λ . There, the quality described by
the formula ϕ ⊕λ ψ is the weighted (according to λ) average between the quality of ϕ
and that of ψ. This enables the quality of the system to be an interpolation of different
aspects of it. As an example, consider the formula G(req → (grant ⊕_{3/4} Xgrant)). The
formula specifies the fact that we want requests to be granted immediately and the grant
to hold for two transactions. When this always holds, the satisfaction value is 1. We are
quite okay with grants that are given immediately and last for only one transaction, in
which case the satisfaction value is 3/4, and less content when grants arrive with a delay,
in which case the satisfaction value is 1/4.
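To make the semantics concrete, here is a toy evaluator for this example. The tuple encoding of formulas is our own, and the treatment of finite words (G as a minimum over positions, stopping one step early so that X stays in range) is our approximation of the paper's infinite-word semantics:

```python
def val(f, w, t):
    # Quantitative semantics sketch: w is a finite word, a list of dicts
    # mapping atomic propositions to values in [0, 1].
    op = f[0]
    if op == "ap":                             # atomic proposition
        return float(w[t][f[1]])
    if op == "X":                              # next
        return val(f[1], w, t + 1)
    if op == "imp":                            # x -> y becomes max(1 - x, y)
        return max(1 - val(f[1], w, t), val(f[2], w, t))
    if op == "avg":                            # weighted average, (+)_lambda
        lam = f[1]
        return lam * val(f[2], w, t) + (1 - lam) * val(f[3], w, t)
    if op == "G":                              # always: minimum over positions
        # stop one position early so the X operator stays within the word
        return min(val(f[1], w, u) for u in range(t, len(w) - 1))
    raise ValueError("unknown operator " + str(op))
```

On a word where the request is granted immediately for two steps the value is 1; with a one-step grant it is 3/4; with a delayed grant it is 1/4, matching the discussion above.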
An LTL[F ] formula maps computations to a value in [0, 1]. We accordingly gen-
eralize classical decision problems, such as model checking, satisfiability, synthesis,
and equivalence, to their quantitative analogues, which are search or optimization
problems. For example, the equivalence problem between two LTL[F ] formulas ϕ1 and
ϕ2 seeks the supremum of the difference in the satisfaction values of ϕ1 and ϕ2 over all
computations. Of special interest is the extension of the synthesis problem. In conven-
tional synthesis algorithms we are given a specification to a reactive system, typically
by means of an LTL formula, and we transform it into a system that is guaranteed to
satisfy the specification with respect to all environments [23]. Little attention has been
paid to the quality of the systems that are automatically synthesized.¹ Current efforts
to address the quality challenge are based on enriching the game that corresponds to
synthesis to a weighted one [2,5]. Using LTL[F ], we are able to embody quality within
the specification, which is very convenient.
In the Boolean setting, the automata-theoretic approach has proven to be very use-
ful in reasoning about LTL specifications. The approach is based on translating LTL
formulas to nondeterministic Büchi automata on infinite words [25]. In the quantitative
approach, it seems natural to translate formulas to weighted automata [21]. However,
these extensively-studied models are complicated and many problems become undecid-
able for them [1,17]. We show that we can use the approach taken in [15], bound the
number of possible satisfaction values of LTL[F ] formulas, and use this bound in or-
der to translate LTL[F ] formulas to Boolean automata. From a technical point of view,
the big challenge in our setting is to maintain the simplicity and the complexity of the
algorithms for LTL, even though the number of possible values is exponential. We do
so by restricting attention to feasible combinations of values assigned to the different
subformulas of the specification. Essentially, our translation extends the construction of
[25] by associating states of the automaton with functions that map each subformula
to a satisfaction value. Using the automata-theoretic approach, we solve the basic prob-
lems for LTL[F ] within the same complexity classes as the corresponding problems in
the Boolean setting (as long as the functions in F are computable within these com-
plexity classes; otherwise, they become the computational bottleneck). Our approach
thus enjoys the fact that traditional automata-based algorithms are amenable to well-known
optimizations and symbolic implementations. It can also be easily implemented
in existing tools.
Recall that our main contribution is the ability to address the issue of quality within
the specification formalism. While we describe it with respect to Boolean systems, we
show in Section 5 that our contribution can be generalized to reason about weighted
systems, where the values of atomic propositions are taken from [0, 1]. We also extend
LTL[F] to the branching temporal logic CTL∗[F], which is the analogous extension of
CTL∗, and show that we can still solve decision and search problems. Finally, we define
a fragment, LTL▽, of LTL[F] for which the number of different satisfaction values is
linear in the length of the formula, leading to even simpler algorithms.

Related Work. In recent years, the quantitative setting has been an active area of re-
search, providing many works on quantitative logics and automata [9,10,12,18].
Conceptually, our work aims at formalizing quality, having a different focus from
each of the other works. Technically, the main difference between our setting and most
¹ Note that we do not refer here to the challenge of generating optimal (say, in terms of state
space) systems, but rather to quality measures that refer to how the specification is satisfied.
18 S. Almagor, U. Boker, and O. Kupferman

of the other approaches is the source of quantitativeness: There, it stems from the nature
of the system, whereas in our setting it stems from the richness of the new functional
operators. For example, in multi-valued systems, the values of atomic propositions are
taken from a finite domain [4,18]. In fuzzy temporal logic [22], the atomic propositions
take values in [0, 1]. Probabilistic temporal logic is interpreted over Markov decision
processes [8,20], and in the context of real-valued signals [11], quantitativeness stems
from both time intervals and predicates over the value of atomic propositions.
Closer to our approach is [7], where CTL is augmented with discounting and
weighted-average operators. Thus, a formula has a rich satisfaction value, even on
Boolean systems. The motivation in [7] is to suggest a logic whose semantics is not
too sensitive to small perturbations in the model. Accordingly, formulas are evaluated
on weighted systems (as we do in Section 5) or on Markov chains. We, on the other
hand, aim at specifying quality of on-going behaviors. Hence, we work with the much
stronger LTL and CTL∗ logics, and we augment them by arbitrary functions over [0, 1].
A different approach, orthogonal to ours, is to stay with Boolean satisfaction values,
while handling quantitative properties of the system, in particular ones that are based
on unbounded accumulation [3]. The main challenge in these works is the border of
decidability, whereas our technical challenge is to keep the simplicity of the algorithms
known for LTL in spite of the exponential number of satisfaction values. Nonetheless,
an interesting future research direction is to combine the two approaches.

2 Formalizing Quality

2.1 The Temporal Logic LTL[F ]

The linear temporal logic LTL[F ] generalizes LTL by replacing the Boolean operators
of LTL with arbitrary functions over [0, 1]. The logic is actually a family of logics, each
parameterized by a set F of functions.
Syntax. Let AP be a set of Boolean atomic propositions, and let F ⊆ {f : [0, 1]^k →
[0, 1] | k ∈ ℕ} be a set of functions over [0, 1]. Note that the functions in F may have
different arities. An LTL[F] formula is one of the following:

– True, False, or p, for p ∈ AP .


– f (ϕ1 , ..., ϕk ), Xϕ1 , or ϕ1 Uϕ2 , for LTL[F ] formulas ϕ1 , . . . , ϕk and a function
f ∈ F.

Semantics. The semantics of LTL[F] formulas is defined with respect to (finite or infinite)
computations over AP. We use (2^AP)^∞ to denote (2^AP)^* ∪ (2^AP)^ω. A computation
is a word π = π0, π1, ... ∈ (2^AP)^∞. We use π^i to denote the suffix πi, πi+1, ....
The semantics maps a computation π and an LTL[F] formula ϕ to the satisfaction value
of ϕ in π, denoted [[π, ϕ]]. The satisfaction value is defined inductively as described in
Table 1 below.²
² The observant reader may be concerned by our use of max and min where sup and inf are in
order. In Lemma 1 we prove that there are only finitely many satisfaction values for a formula
ϕ, thus the semantics is well defined.

Table 1. The semantics of LTL[F]

Formula                  Satisfaction value
[[π, True]]              1
[[π, False]]             0
[[π, p]]                 1 if p ∈ π0, and 0 if p ∉ π0
[[π, f(ϕ1, ..., ϕk)]]    f([[π, ϕ1]], ..., [[π, ϕk]])
[[π, Xϕ1]]               [[π^1, ϕ1]]
[[π, ϕ1 Uϕ2]]            max_{0≤i<|π|} { min{ [[π^i, ϕ2]], min_{0≤j<i} [[π^j, ϕ1]] } }

It is not hard to prove, by induction on the structure of the formula, that for every
computation π and formula ϕ, it holds that [[π, ϕ]] ∈ [0, 1]. We use the usual Fϕ1 =
TrueUϕ1 and Gϕ1 = ¬(TrueU(¬ϕ1 )) abbreviations.
The logic LTL coincides with the logic LTL[F ] for F that corresponds to the usual
Boolean operators. For simplicity, we use the common such functions as abbreviation,
as described below. In addition, we introduce notations for some useful functions. Let
x, y ∈ [0, 1]. Then,
• ¬x = 1 − x    • x ∨ y = max{x, y}    • x ∧ y = min{x, y}
• ▽λ x = λ · x    • x ⊕λ y = λ · x + (1 − λ) · y
To see that LTL indeed coincides with LTL[F ] for F = {¬, ∨, ∧}, note that for this F ,
all formulas are mapped to {0, 1} in a way that agrees with the semantics of LTL.
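To make the semantics concrete, the following sketch evaluates [[π, ϕ]] on a finite computation by direct recursion over Table 1. The tuple encoding of formulas and all helper names are our own illustration, not part of the paper.

```python
# Direct recursive evaluation of [[pi, phi]] on a finite computation,
# following Table 1. Formula encoding (ours):
#   ("true",), ("false",), ("ap", p),
#   ("fn", f, phi1, ..., phik), ("X", phi1), ("U", phi1, phi2)
def sat_value(pi, phi):
    """pi is a non-empty sequence of sets of atomic propositions."""
    kind = phi[0]
    if kind == "true":
        return 1.0
    if kind == "false":
        return 0.0
    if kind == "ap":
        return 1.0 if phi[1] in pi[0] else 0.0
    if kind == "fn":
        f = phi[1]
        return f(*(sat_value(pi, sub) for sub in phi[2:]))
    if kind == "X":
        return sat_value(pi[1:], phi[1])  # assumes the suffix is non-empty
    if kind == "U":
        phi1, phi2 = phi[1], phi[2]
        # max over positions i of min{phi2 at i, phi1 at all j < i}
        return max(
            min([sat_value(pi[i:], phi2)] +
                [sat_value(pi[j:], phi1) for j in range(i)])
            for i in range(len(pi)))
    raise ValueError("unknown formula kind: %r" % kind)

# the usual connectives and the weighted average as functions in F
NOT = lambda x: 1.0 - x
AVG = lambda x, y: (x + y) / 2.0
```

For instance, on a computation with π0 = {p}, the formula p ⊕ q (encoded as `("fn", AVG, ("ap", "p"), ("ap", "q"))`) evaluates to 1/2, as the semantics prescribes.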

Kripke Structures and Transducers. For a Kripke structure K and an LTL[F ] formula
ϕ, we have that [[K, ϕ]] = min{[[π, ϕ]] : π is a computation of K}. That is, the value is
induced by the path that admits the lowest satisfaction value.³
In the setting of open systems, the set of atomic propositions is partitioned into sets I
and O of input and output signals. An (I, O)-transducer then models the computations
generated (deterministically) by the system when it interacts with an environment that
generates finite or infinite sequences of input signals.
Example 1. Consider a scheduler that receives requests and generates grants. Consider
the LTL[F] formula G(req → F(grant ⊕1/2 Xgrant)) ∧ ¬(▽3/4 G¬req). The satisfaction
value of the formula is 1 if every request is eventually granted, and the grant lasts for
two consecutive steps. If a grant holds only for a single step, then the satisfaction value
is reduced to 1/2. In addition, if there are no requests, then the satisfaction value is at
most 1/4. This shows how we can embed vacuity tests in the formula.

2.2 The Basic Questions


In the Boolean setting, an LTL formula maps computations to {True, False}. In the
quantitative setting, an LTL[F ] formula maps computations to [0, 1]. Classical deci-
sion problems, such as model checking, satisfiability, synthesis, and equivalence, are
accordingly generalized to their quantitative analogues, which are search or optimiza-
tion problems. Below we specify the basic questions with respect to LTL[F ]. While the
³ Since a Kripke structure may have infinitely many computations, here too we should have
a priori used inf, and the use of min is justified by Lemma 1.

definition here focuses on LTL[F ], the questions can be asked with respect to arbitrary
quantitative specification formalism, with the expected adjustments.

– The satisfiability problem gets as input an LTL[F] formula ϕ and returns
max{[[π, ϕ]] : π is a computation}. Dually, the validity problem returns, given an
LTL[F] formula ϕ, the value min{[[π, ϕ]] : π is a computation}.⁴
– The implication problem gets as input two LTL[F ] formulas ϕ1 and ϕ2 and returns
max {[[π, ϕ1 ]] − [[π, ϕ2 ]] : π is a computation}. The symmetric version of implica-
tion, namely the equivalence problem, gets as input two LTL[F ] formulas ϕ1 and
ϕ2 and returns max {|[[π, ϕ1 ]] − [[π, ϕ2 ]]| : π is a computation}.
– The model-checking problem is extended from the Boolean setting to find, given a
system K and an LTL[F ] formula ϕ, the satisfaction value [[K, ϕ]].
– The realizability problem gets as input an LTL[F] formula ϕ over I ∪ O, for sets I and O
of input and output signals, and returns max{[[T , ϕ]] : T is an (I, O)-transducer}.
The synthesis problem is then to find a transducer that attains this value.

Decision Problems. The above questions are search and optimization problems. It is
sometimes interesting to consider the decision problems they induce, when referring
to a threshold. For example, the model-checking decision-problem is to decide, given
a system K, a formula ϕ, and a threshold t, whether [[K, ϕ]] ≥ t. For some problems,
there are natural thresholds to consider. For example, in the implication problem, asking
whether max {[[π, ϕ1 ]] − [[π, ϕ2 ]] : π is a computation} ≥ 0 amounts to asking whether
for all computations π, we have that [[π, ϕ1 ]] ≥ [[π, ϕ2 ]], which indeed captures implica-
tion.

2.3 Properties of LTL[F ]

Bounding the Number of Satisfaction Values. For an LTL[F] formula ϕ, let V(ϕ) =
{[[π, ϕ]] : π ∈ (2^AP)^∞}. That is, V(ϕ) is the set of possible satisfaction values of ϕ in
arbitrary computations. We first show that this set is finite for all LTL[F ] formulas.
Lemma 1. For every LTL[F] formula ϕ, we have that |V(ϕ)| ≤ 2^|ϕ|.
The good news that follows from Lemma 1 is that every LTL[F ] formula has only
finitely many possible satisfaction values. This enabled us to replace the sup and inf op-
erators in the semantics by max and min. It also implies that we can point to witnesses
that exhibit the satisfaction values. However, Lemma 1 only gives an exponential bound
to the number of satisfaction values. We now show that this exponential bound is tight.
Example 2. Consider the logic LTL[{⊕}], augmenting LTL with the average function,
where for every x, y ∈ [0, 1] we have that x ⊕ y = (1/2)x + (1/2)y. Let n ∈ ℕ and consider
the formula ϕn = p1 ⊕ (p2 ⊕ (p3 ⊕ (p4 ⊕ ... pn))...). The length of ϕn is in O(n) and
the nesting depth of ⊕ operators in it is n. For every computation π it holds that
[[π, ϕn]] = (1/2)[[π, p1]] + (1/4)[[π, p2]] + ... + (1/2^(n−1))[[π, pn−1]] + (1/2^(n−1))[[π, pn]].
⁴ Lemma 1 guarantees that max and min (rather than sup and inf) are defined.

Hence, every assignment π0 ⊆ {p1 , ..., pn−1 } to the first position in π induces a dif-
ferent satisfaction value for [[π, ϕn ]], implying that there are 2n−1 different satisfaction
values for ϕn .
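The family ϕn can be checked numerically. Since ϕn contains no temporal operators, its value depends only on the assignment to the first position; the helper below (our own illustration) folds the nested average from the inside out.

```python
from itertools import product

# phi_n = p1 (+) (p2 (+) (... (+) pn)); the value is determined by the
# 0/1 assignment to p1..pn at the first position of the computation.
def phi_n_value(assignment):
    value = float(assignment[-1])
    for v in reversed(assignment[:-1]):   # fold the nesting right to left
        value = (v + value) / 2.0
    return value

n = 5
values = {phi_n_value(bits) for bits in product((0, 1), repeat=n)}
# at least 2^(n-1) distinct satisfaction values, matching the lower bound
assert len(values) >= 2 ** (n - 1)
```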

A Boolean Look at LTL[F]. LTL[F] provides means to generalize LTL to a quantitative
setting. Yet, one may consider a Boolean logic defined by LTL[F] formulas
and predicates, for example, formulas of the form ϕ1 ≥ ϕ2 or ϕ1 ≥ v, for
LTL[F] formulas ϕ1 and ϕ2, and a value v ∈ [0, 1]. It is then natural to compare the
expressiveness and succinctness of such a logic with respect to LTL.
One may observe that the role the functions in F play in LTL[F ] is propositional, in
the sense that the functions do not introduce new temporal operators. We formalize this
intuition in the full version, showing that for every LTL[F ] formula ϕ and predicate
P ⊆ [0, 1], there exists an LTL formula Bool (ϕ, P ) equivalent to the assertion ϕ ∈ P .
Formally, we have the following.
Theorem 1. For every LTL[F ] formula ϕ and predicate P ⊆ [0, 1], there exists an
LTL formula Bool (ϕ, P ), of length at most exponential in ϕ, such that for every com-
putation π, it holds that [[π, ϕ]] ∈ P iff π |= Bool (ϕ, P ).
The translation described in the proof of Theorem 1 may involve an exponential blow-
up. We indeed conjecture that this blowup is unavoidable, implying that LTL[F ], when
used as a Boolean formalism, is exponentially more succinct than LTL. Since very little
is known about lower bounds for propositional formulas, we leave it as a conjecture. We
demonstrate the succinctness with the following example.
Example 3. For k ≥ 1, let ⊕1/k be the k-ary average operator. Consider the logic
LTL[{⊕1/k}], for an even integer k, and consider the formula ϕk = ⊕1/k(p1, ..., pk),
for the atomic propositions p1, ..., pk.
For every computation π, it holds that [[π, ϕk]] = |{i : pi ∈ π0}| / k. Hence, [[π, ϕk]] = 1/2
iff exactly half of the atomic propositions p1, ..., pk hold in π0. We conjecture that
the LTL formula Bool(ϕk, 1/2) must be exponential in k. Intuitively, the formula has to
refer to every subset of size k/2. A naive implementation of this involves C(k, k/2) clauses,
which is exponential in k. The question whether this can be done with a formula that is
polynomial in k is a long-standing open problem.
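The clause count in the naive encoding is the central binomial coefficient C(k, k/2), which the Python standard library can evaluate directly (a small illustration, not part of the paper):

```python
from math import comb

# one clause per subset of {p1,...,pk} of size k/2
def naive_clause_count(k):
    return comb(k, k // 2)

assert naive_clause_count(10) == 252
# exponential growth: the central binomial coefficient exceeds 2^k / (k + 1),
# since it is the largest of the k+1 binomial terms summing to 2^k
assert all(naive_clause_count(k) > 2 ** k // (k + 1) for k in range(2, 20, 2))
```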

3 Translating LTL[F] to Automata


The automata-theoretic approach uses the theory of automata as a unifying paradigm
for system specification, verification, and synthesis [24,26]. In this section we describe
an automata-theoretic framework for reasoning about LTL[F ] specifications. In order
to explain our framework, let us recall first the translation of LTL formulas to nonde-
terministic generalized Büchi automata (NGBW), as introduced in [25]. We start with
the definition of NGBWs. An NGBW is A = ⟨Σ, Q, Q0, δ, α⟩, where Σ is the input
alphabet, Q is a finite set of states, Q0 ⊆ Q is a set of initial states, δ : Q × Σ → 2^Q
is a transition function, and α ⊆ 2^Q is a set of sets of accepting states. The number of
sets in α is the index of A. A run r = r0, r1, ... of A on a word w = w1 · w2 · · · ∈ Σ^ω
is an infinite sequence of states such that r0 ∈ Q0 , and for every i ≥ 0, we have that
ri+1 ∈ δ(ri , wi+1 ). We denote by inf(r) the set of states that r visits infinitely often,

that is inf(r) = {q : ri = q for infinitely many i ∈ }. The run r is accepting if
it visits all the sets in α infinitely often. Formally, for every set F ∈ α we have that
inf(r) ∩ F ≠ ∅. An automaton accepts a word if it has an accepting run on it. The
language of an automaton A, denoted L(A), is the set of words that A accepts.
In the Vardi-Wolper translation of LTL formulas to NGBWs [25], each state of the
automaton is associated with a set of formulas, and the NGBW accepts a computation
from a state q iff the computation satisfies exactly all the formulas associated with q. The
state space of the NGBW contains only states associated with maximal and consistent
sets of formulas, the transitions are defined so that requirements imposed by temporal
formulas are satisfied, and the acceptance condition is used in order to guarantee that
requirements that involve the satisfaction of eventualities are not delayed forever.
In our construction here, each state of the NGBW assigns a satisfaction value to
every subformula. Consistency then assures that the satisfaction values agree with the
functions in F . Similar adjustments are made to the transitions and the acceptance con-
dition. The construction translates an LTL[F ] formula ϕ to an NGBW, while setting its
initial states according to a required predicate P ⊆ [0, 1]. We then have that for every
computation π ∈ (2^AP)^ω, the resulting NGBW accepts π iff [[π, ϕ]] ∈ P.
We note that a similar approach is taken in [15], where LTL formulas are interpreted
over quantitative systems. The important difference is that the values in our construction
arise from the formula and the functions it involves, whereas in [15] they are induced
by the values of the atomic propositions.
Theorem 2. Let ϕ be an LTL[F] formula and P ⊆ [0, 1] be a predicate. There exists
an NGBW Aϕ,P such that for every computation π ∈ (2^AP)^ω, it holds that [[π, ϕ]] ∈ P
iff Aϕ,P accepts π. Furthermore, Aϕ,P has at most 2^(|ϕ|²) states and index at most |ϕ|.
Proof. We define Aϕ,P = ⟨2^AP, Q, Q0, δ, α⟩ as follows. Let cl(ϕ) be the set of ϕ’s
subformulas. Let Cϕ be the collection of functions g : cl(ϕ) → [0, 1] such that for
all ψ ∈ cl(ϕ), we have that g(ψ) ∈ V (ψ). For a function g ∈ Cϕ , we say that g is
consistent if for every ψ ∈ cl(ϕ), the following holds.
– If ψ = True, then g(ψ) = 1, and if ψ = False then g(ψ) = 0.
– If ψ = p ∈ AP , then g(ψ) ∈ {0, 1}.
– If ψ = f (ψ1 , . . . , ψk ), then g(ψ) = f (g(ψ1 ), . . . , g(ψk )).
The state space Q of Aϕ,P is the set of all consistent functions in Cϕ . Then, Q0 =
{g ∈ Q : g(ϕ) ∈ P } contains all states in which the value assigned to ϕ is in P .
We now define the transition function δ. For functions g, g′ and a letter σ ∈ Σ, we
have that g′ ∈ δ(g, σ) iff the following hold.
– σ = {p ∈ AP : g(p) = 1}.
– For all Xψ1 ∈ cl(ϕ) we have g(Xψ1) = g′(ψ1).
– For all ψ1 Uψ2 ∈ cl(ϕ) we have g(ψ1 Uψ2) =
max{g(ψ2), min{g(ψ1), g′(ψ1 Uψ2)}}.
Finally, every formula ψ1 Uψ2 contributes to α the set Fψ1 Uψ2 =
{g : g(ψ2 ) = g(ψ1 Uψ2 )}.
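A toy instance of the state space in this proof can be enumerated directly. The sketch below (our own hypothetical example, not from the paper) takes ϕ = p ⊕1/2 q, so cl(ϕ) = {p, q, ϕ}, with V(p) = V(q) = {0, 1} and V(ϕ) = {0, 1/2, 1}, and keeps exactly the consistent functions.

```python
from itertools import product

# Toy Theorem 2 state space for the (hypothetical) formula phi = p (+)_{1/2} q.
avg = lambda x, y: (x + y) / 2.0
subformulas = ("p", "q", "phi")
candidate_values = {"p": (0.0, 1.0), "q": (0.0, 1.0), "phi": (0.0, 0.5, 1.0)}

def consistent(g):
    # atoms already range over {0, 1}; the only function rule here is for phi
    return g["phi"] == avg(g["p"], g["q"])

states = [g for g in (dict(zip(subformulas, vs))
                      for vs in product(*(candidate_values[s] for s in subformulas)))
          if consistent(g)]
assert len(states) == 4   # one consistent function per assignment to {p, q}

# initial states for the predicate P = [1/2, 1]
initial = [g for g in states if g["phi"] >= 0.5]
assert len(initial) == 3
```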

Remark 1. The construction described in the proof of Theorem 2 is such that selecting
the set of initial states allows us to specify any (propositional) condition regarding
the sub-formulas of ϕ. A simple extension of this idea allows us to consider a set of
formulas Φ = {ϕ1, ..., ϕm} and a predicate P ⊆ [0, 1]^m, and to construct an NGBW
that accepts a computation π iff ⟨[[π, ϕ1]], ..., [[π, ϕm]]⟩ ∈ P. Indeed, the state space
of the product consists of functions that map all the formulas in Φ to their satisfaction
values, and we only have to choose as the initial states those functions g for which
⟨g(ϕ1), ..., g(ϕm)⟩ ∈ P. As we shall see in Section 4, this allows us to use the automata-theoretic
approach also in order to examine relations between the satisfaction values of
different formulas.

4 Solving the Basic Questions for LTL[F]

In this section we solve the basic questions defined in Section 2.2. We show that they
all can be solved for LTL[F ] with roughly the same complexity as for LTL. When
we analyze complexity, we assume that the functions in F can be computed in a com-
plexity that is subsumed by the complexity of the problem for LTL (PSPACE, except
for 2EXPTIME for realizability), which is very reasonable. Otherwise, computing the
functions becomes the computational bottleneck. A related technical observation is that,
assuming the functions in F can be calculated in PSPACE, we can also enumerate in
PSPACE the set V (ϕ) of the possible satisfaction values of an LTL[F ] formula ϕ.
The questions in the quantitative setting are basically search problems, asking for the
best or worst value. Since every LTL[F ] formula may only have exponentially many
satisfaction values, one can reduce a search problem to a set of decision problems with
respect to specific thresholds, remaining in PSPACE. Combining this with the construc-
tion of NGBWs described in Theorem 2 is the key to our algorithms.
We can now describe the algorithms in detail.
Satisfiability and Validity. We start with satisfiability and solve the decision version
of the problem: given ϕ and a threshold v, decide whether there exists a computation
π such that [[π, ϕ]] ≥ v. The latter can be solved by checking the nonemptiness of the
NGBW Aϕ,P with P = [v, 1]. Since the NGBW can be constructed on-the-fly, this can
be done in PSPACE in the size of |ϕ|. The search version can be solved in PSPACE by
iterating over the set of relevant thresholds.
We proceed to validity. It is not hard to see that for all ϕ and v, we have that
∀π, [[π, ϕ]] ≥ v iff ¬(∃π, [[π, ϕ]] < v). The latter can be solved by checking, in PSPACE,
the nonemptiness of the NGBW Aϕ,P with P = [0, v). Since PSPACE is closed under
complementation, we are done. In both cases, the nonemptiness algorithm can return
the witness to the nonemptiness.
Implication and Equivalence. In the Boolean setting, implication can be reduced
to validity, which is in turn reduced to satisfiability. Doing the same here is more
sophisticated, but possible: we add to F the average and negation operators. It is
not hard to verify that for every computation π, it holds that [[π, ϕ1 ⊕1/2 ¬ϕ2]] =
(1/2)·([[π, ϕ1]] − [[π, ϕ2]]) + 1/2. In particular, max{[[π, ϕ1]] − [[π, ϕ2]] : π is a computation} =
2 · max{[[π, ϕ1 ⊕1/2 ¬ϕ2]] : π is a computation} − 1. Thus, the problem reduces to the
satisfiability of ϕ1 ⊕1/2 ¬ϕ2, which is solvable in PSPACE. Note that, alternatively, one
can proceed as suggested in Remark 1 and reason about the composition of the NGBWs
for ϕ1 and ϕ2. The solution to the equivalence problem is similar, by checking both
directions of the implication.
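The identity underlying this reduction can be sanity-checked numerically (helper names are ours):

```python
# Check: [[pi, phi1 (+)_{1/2} ¬phi2]] = ([[pi, phi1]] - [[pi, phi2]]) / 2 + 1/2
def oplus_half(x, y):
    return 0.5 * x + 0.5 * y

def neg(x):
    return 1.0 - x

for a in (0.0, 0.25, 0.5, 0.75, 1.0):    # candidate values of [[pi, phi1]]
    for b in (0.0, 0.3, 0.5, 1.0):       # candidate values of [[pi, phi2]]
        assert abs(oplus_half(a, neg(b)) - ((a - b) / 2.0 + 0.5)) < 1e-9
```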
Model Checking. The complement of the problem, namely whether there exists a computation
π of K such that [[π, ϕ]] < v, can be solved by taking the product of the NGBW
Aϕ,[0,v) from Theorem 2 with the system K and checking for emptiness on-the-fly. As
in the Boolean case, this can be done in PSPACE. Moreover, in case the product is
not empty, the algorithm returns a witness: a computation of K that satisfies ϕ with a
low quality. We note that in the case of a single computation, motivated by multi-valued
monitoring [11], one can label the computation in a bottom-up manner, as in CTL model
checking, and the problem can be solved in polynomial time.
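The bottom-up labeling mentioned above can be sketched as follows: compute [[π^i, ψ]] for every position i and subformula ψ, processing positions right to left, so each value is a table lookup rather than a recursive recomputation over suffixes. For U we use the expansion max{ψ2@i, min{ψ1@i, (ψ1 Uψ2)@(i+1)}}, which matches the transition rule in the proof of Theorem 2. The encoding and names are our own illustration.

```python
def label_computation(pi, subformulas):
    """Label a finite computation bottom-up: val[i][psi] = [[pi^i, psi]].
    `subformulas` must be ordered so that subformulas precede superformulas.
    Formula encoding (ours): ("ap", p), ("fn", f, ...), ("X", .), ("U", ., .)."""
    n = len(pi)
    val = [dict() for _ in range(n)]
    for i in range(n - 1, -1, -1):          # positions, right to left
        for psi in subformulas:
            kind = psi[0]
            if kind == "ap":
                v = 1.0 if psi[1] in pi[i] else 0.0
            elif kind == "fn":
                v = psi[1](*(val[i][sub] for sub in psi[2:]))
            elif kind == "X":
                v = val[i + 1][psi[1]] if i + 1 < n else 0.0  # finite-suffix convention
            elif kind == "U":
                now = val[i][psi[2]]
                later = min(val[i][psi[1]], val[i + 1][psi]) if i + 1 < n else 0.0
                v = max(now, later)
            else:
                raise ValueError(kind)
            val[i][psi] = v
    return val
```

Each table entry is computed once, so the total work is polynomial in |π| · |ϕ|, in line with the polynomial-time claim for a single computation.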
Realizability and Synthesis. Several algorithms are suggested in the literature for solving
the LTL realizability problem [23]. Since they are all based on a translation of specifications
to automata, we can adopt them. Here we describe an adaptation of the Safraless
algorithm of [19] and its extension to NGBWs. Given ϕ and v, the algorithm starts by
constructing the NGBW Aϕ,[0,v) and dualizing it to a universal generalized co-Büchi
automaton (UGCW) Ãϕ,[0,v) . Since dualization amounts to complementation, Ãϕ,[0,v)
accepts exactly all computations π with [[π, ϕ]] ≥ v. Being universal, we can expand
Ãϕ,[0,v) to a universal tree automaton U that accepts a tree with directions in 2I and la-
bels in 2O if all its branches, which correspond to input sequences, are labeled by output
sequences such that the composition of the input and output sequences is a computa-
tion accepted by Ãϕ,[0,v) . Realizability then amounts to checking the nonemptiness of
U and synthesis to finding a witness to its nonemptiness. Since ϕ only has an exponen-
tial number of satisfaction values, we can solve the realizability and synthesis search
problems by repeating this procedure for all relevant values. Since the size of Aϕ,[0,v)
is single-exponential in |ϕ|, the complexity is the same as in the Boolean case, namely
2EXPTIME-complete.

5 Beyond LTL[F]

The logic LTL[F ] that we introduce and study here is a first step in our effort to in-
troduce reasoning about quality to formal methods. Future work includes stronger for-
malisms and algorithms. We distinguish between extensions that stay in the area of
LTL[F ] and ones that jump to the (possibly undecidable) world of infinitely many satis-
faction values. In the latter, we include efforts to extend LTL[F ] by temporal operators
in which the future is discounted, and efforts to combine LTL[F ] with other qualitative
aspects of systems [3]. In this section we describe two extensions of the first class: an
extension of LTL[F ] to weighted systems and to a branching-time temporal logic. We
also describe a computationally simple fragment of LTL[F ].

Weighted Systems. A weighted Kripke structure is a tuple K = ⟨AP, S, I, ρ, L⟩, where
AP, S, I, and ρ are as in Boolean Kripke structures, and L : S → [0, 1]^AP maps each
state to a weighted assignment to the atomic propositions. Thus, the value L(s)(p) of an
atomic proposition p ∈ AP in a state s ∈ S is a value in [0, 1]. The semantics of LTL[F]
with respect to a weighted computation coincides with the one for non-weighted systems,
except that for an atomic proposition p, we have that [[π, p]] = L(π0)(p).
It is not hard to extend the construction of Aϕ,P, as described in the proof of Theorem 2,
to an alphabet W^AP, where W is a set of possible values for the atomic propositions.
Indeed, we only have to adjust the transitions so that there is a transition from
state g with letter σ ∈ W^AP only if g agrees with σ on the values of the atomic propositions.
Hence, in settings where the values for the atomic propositions are known, and
in particular model checking, the solutions to the basic questions are similar to the ones
described for LTL[F] with Boolean atomic propositions.
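The change to the semantics is local: only the atomic case differs from the Boolean evaluator. A minimal sketch, with a weighted computation represented as a sequence of dicts mapping each atomic proposition to [0, 1] (the encoding is ours):

```python
# Weighted semantics: [[pi, p]] = L(pi_0)(p) in [0, 1]; everything else
# is unchanged relative to the Boolean evaluator.
def sat_value_weighted(pi, phi):
    kind = phi[0]
    if kind == "true":
        return 1.0
    if kind == "false":
        return 0.0
    if kind == "ap":
        return pi[0][phi[1]]        # was: 1.0 if phi[1] in pi[0] else 0.0
    if kind == "fn":
        return phi[1](*(sat_value_weighted(pi, sub) for sub in phi[2:]))
    if kind == "X":
        return sat_value_weighted(pi[1:], phi[1])
    if kind == "U":
        return max(
            min([sat_value_weighted(pi[i:], phi[2])] +
                [sat_value_weighted(pi[j:], phi[1]) for j in range(i)])
            for i in range(len(pi)))
    raise ValueError("unknown formula kind: %r" % kind)
```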

Formalizing Quality with Branching Temporal Logics. Formulas of LTL[F ] specify on-
going behaviors of linear computations. A Kripke structure is not linear, and the way
we interpret LTL[F ] formulas with respect to it is universal. In branching temporal
logic one can add universal and existential quantifiers to the syntax of the logic, and
specifications can refer to the branching nature of the system [13].
The branching temporal logic CTL∗[F] extends LTL[F] by the path quantifiers E
and A. Formulas of the form Eϕ and Aϕ are referred to as state formulas and they are
interpreted over states s in the structure with the semantics [[s, Eϕ]] = max{[[π, ϕ]] :
π starts in s} and [[s, Aϕ]] = min{[[π, ϕ]] : π starts in s}.
In [14], the authors describe a general technique for extending the scope of LTL
model-checking algorithms to CTL∗. The idea is to repeatedly consider an innermost
state subformula, view it as an (existentially or universally quantified) LTL formula,
apply LTL model checking in order to evaluate it in all states, and add a fresh atomic
proposition that replaces this subformula and holds in exactly these states that satisfy
it. This idea, together with our ability to model check systems with weighted atomic
propositions, can be used also for model checking CTL∗[F].
More challenging is the handling of the other basic problems. There, the solution involves
a translation of CTL∗[F] formulas to tree automata. Since the automata-theoretic
approach for CTL∗ has the Vardi-Wolper construction at its heart, this is possible.

The Fragment LTL▽ of LTL[F]. In the proof of Lemma 1, we have seen that a formula
may take exponentially many satisfaction values. The proof crucially relies on the fact
that the value of a function is a function of all its inputs. However, in the case of unary
functions, or indeed functions that do not take many possible values, this bound can be
lowered. Such an interesting fragment is the logic LTL▽ = LTL[{▽λ, ▲λ}λ∈[0,1] ∪
{∨, ¬}], with the functions ▽λ(x) = λ · x and ▲λ(x) = λ · x + (1 − λ)/2.
This fragment is interesting in two aspects. First, computationally, an LTL▽ formula
has only polynomially many satisfaction values. Moreover, for a predicate of the form
P = [v, 1] (resp. P = (v, 1]), the LTL formula Bool(ϕ, P) can be shown to be of
linear length in |ϕ|. This implies that solving threshold problems for LTL▽ formulas
can be done with tools that work with LTL with no additional complexity. Second,
philosophically, an interesting question that arises when formalizing quality regards
how the lack of quality in a component should be viewed. With quality between 0 and 1,
we have that 1 stands for “good”, 0 for “bad”, and 1/2 for “not good and not bad”. While
the ▽λ operator enables us to reduce the quality towards “badness”, the ▲λ operator
enables us to do so towards “ambivalence”.
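The two unary operators are easy to illustrate; the ASCII names `tri_down` (for ▽λ) and `tri_mid` (for ▲λ) are our own:

```python
# tri_down scales a value toward "bad" (0);
# tri_mid scales it toward "ambivalent" (1/2).
def tri_down(lam):
    return lambda x: lam * x                        # = lambda * x

def tri_mid(lam):
    return lambda x: lam * x + (1.0 - lam) / 2.0    # = lambda * x + (1 - lambda)/2

assert tri_down(0.5)(1.0) == 0.5 and tri_down(0.5)(0.0) == 0.0
assert tri_mid(0.5)(1.0) == 0.75 and tri_mid(0.5)(0.0) == 0.25
# applied to a Boolean value in {0, 1}, each unary operator contributes at
# most two new values, which is why the value set stays small in this fragment
```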

References

1. Almagor, S., Boker, U., Kupferman, O.: What’s decidable about weighted automata? In: Bul-
tan, T., Hsiung, P.-A. (eds.) ATVA 2011. LNCS, vol. 6996, pp. 482–491. Springer, Heidelberg
(2011)
2. Bloem, R., Chatterjee, K., Henzinger, T.A., Jobstmann, B.: Better Quality in Synthesis
through Quantitative Objectives. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS,
vol. 5643, pp. 140–156. Springer, Heidelberg (2009)
3. Boker, U., Chatterjee, K., Henzinger, T.A., Kupferman, O.: Temporal specifications with
accumulative values. In: 26th LICS, pp. 43–52 (2011)
4. Bruns, G., Godefroid, P.: Model checking with multi-valued logics. In: Dı́az, J., Karhumäki,
J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142, pp. 281–293. Springer,
Heidelberg (2004)
5. Černý, P., Chatterjee, K., Henzinger, T.A., Radhakrishna, A., Singh, R.: Quantitative Synthe-
sis for Concurrent Programs. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 243–259. Springer, Heidelberg (2011)
6. Clarke, E., Henzinger, T.A., Veith, H.: Handbook of Model Checking. Elsevier (2013)
7. de Alfaro, L., Faella, M., Henzinger, T.A., Majumdar, R., Stoelinga, M.: Model checking
discounted temporal properties. TCS 345(1), 139–170 (2005)
8. Desharnais, J., Gupta, V., Jagadeesan, R., Panangaden, P.: Metrics for labelled Markov
processes. TCS 318(3), 323–354 (2004)
9. Droste, M., Kuich, W., Rahonis, G.: Multi-valued MSO logics over words and trees. Funda-
menta Informaticae 84(3-4), 305–327 (2008)
10. Droste, M., Rahonis, G.: Weighted automata and weighted logics with discounting.
TCS 410(37), 3481–3494 (2009)
11. Donzé, A., Maler, O., Bartocci, E., Nickovic, D., Grosu, R., Smolka, S.: On Temporal Logic
and Signal Processing. In: Chakraborty, S., Mukund, M. (eds.) ATVA 2012. LNCS, vol. 7561,
pp. 92–106. Springer, Heidelberg (2012)
12. Droste, M., Kuich, W., Vogler, H. (eds.): Handbook of Weighted Automata. Springer (2009)
13. Emerson, E.A., Halpern, J.Y.: Sometimes and not never revisited: On branching versus linear
time. Journal of the ACM 33(1), 151–178 (1986)
14. Emerson, E.A., Lei, C.L.: Modalities for model checking: Branching time logic strikes back.
In: Proc. 12th POPL, pp. 84–96 (1985)
15. Faella, M., Legay, A., Stoelinga, M.: Model Checking Quantitative Linear Time Logic.
TCS 220(3), 61–77 (2008)
16. Kress-Gazit, H., Fainekos, G.E., Pappas, G.J.: Temporal-Logic-Based Reactive Mission and
Motion Planning. IEEE Trans. on Robotics 25(6), 1370–1381 (2009)
17. Krob, D.: The equality problem for rational series with multiplicities in the tropical semiring
is undecidable. International Journal of Algebra and Computation 4(3), 405–425 (1994)
18. Kupferman, O., Lustig, Y.: Lattice automata. In: Cook, B., Podelski, A. (eds.) VMCAI 2007.
LNCS, vol. 4349, pp. 199–213. Springer, Heidelberg (2007)
19. Kupferman, O., Vardi, M.Y.: Safraless decision procedures. In: Proc. 46th FOCS, pp. 531–540
(2005)
20. Kwiatkowska, M.Z.: Quantitative verification: models, techniques and tools. In: FSE,
pp. 449–458 (2007)
21. Mohri, M.: Finite-state transducers in language and speech processing. Computational Lin-
guistics 23(2), 269–311 (1997)
22. Moon, S., Lee, K.H., Lee, D.: Fuzzy branching temporal logic. IEEE Transactions on Sys-
tems, Man, and Cybernetics, Part B 34(2), 1045–1055 (2004)
23. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Proc. 16th POPL, pp. 179–190
(1989)
24. Thomas, W.: Automata on infinite objects. In: Handbook of Theoretical Computer Science,
pp. 133–191 (1990)
25. Vardi, M.Y., Wolper, P.: An automata-theoretic approach to automatic program verification.
In: Proc. 1st LICS, pp. 332–344 (1986)
26. Vardi, M.Y., Wolper, P.: Reasoning about infinite computations. I&C 115(1), 1–37 (1994)
The Square Root Phenomenon in Planar Graphs

Dániel Marx

Computer and Automation Research Institute,
Hungarian Academy of Sciences (MTA SZTAKI)
Budapest, Hungary
[email protected]

Abstract. Most of the classical NP-hard problems remain NP-hard
when restricted to planar graphs, and only exponential-time algorithms
are known for the exact solution of these planar problems. However, in
many cases, the exponential-time algorithms on planar graphs are significantly
faster than the algorithms for general graphs: for example,
3-Coloring can be solved in time 2^{O(√n)} in an n-vertex planar graph,
whereas only 2^{O(n)}-time algorithms are known for general graphs. For
various planar problems, we often see a square root appearing in the running
time of the best algorithms, e.g., the running time is often of the
form 2^{O(√n)}, n^{O(√k)}, or 2^{O(√k)} · n. By now, we have a good understanding
of why this square root appears. On the algorithmic side, most of
these algorithms rely on the notion of treewidth and its relation to grid
minors in planar graphs (but sometimes this connection is not obvious
and takes some work to exploit). On the lower bound side, under a complexity
assumption called Exponential Time Hypothesis (ETH), we can
show that these algorithms are essentially best possible, and therefore
the square root has to appear in the running time.


Research supported by the European Research Council (ERC) grant 280152.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, p. 28, 2013.
© Springer-Verlag Berlin Heidelberg 2013
A Guided Tour in Random Intersection Graphs

Paul G. Spirakis¹,², Sotiris Nikoletseas¹, and Christoforos Raptopoulos¹

¹ Computer Technology Institute and Press “Diophantus” and University of Patras, Greece
² Computer Science Department, University of Liverpool, United Kingdom
{spirakis,nikole}@cti.gr, [email protected]

1 Introduction and Motivation

Random graphs, introduced by P. Erdős and A. Rényi in 1959, still attract a


huge amount of research in the communities of Theoretical Computer Science,
Algorithms, Graph Theory, Discrete Mathematics and Statistical Physics. This
continuing interest is due to the fact that, besides their mathematical beauty,
such graphs are very important, since they can model interactions and faults in
networks and also serve as typical inputs for an average case analysis of algo-
rithms. The modeling effort concerning random graphs has produced a plethora
of random graph models; some of them have quite elaborate definitions and are
quite general, in the sense that they can simulate many other known distribu-
tions on graphs by carefully tuning their parameters.
In this tour, we consider a simple, yet general family of models, namely Ran-
dom Intersection Graphs (RIGs). In such models there is a universe M of labels
and each one of n vertices selects a random subset of M. Two vertices are
connected if and only if their corresponding subsets of labels intersect.
Random intersection graphs may model several real-life applications quite ac-
curately. In fact, there are practical situations where each communication agent
(e.g. a wireless node) gets access only to some ports (statistically) out of a
possible set of communication ports. When another agent also selects a commu-
nication port, then a communication link is implicitly established and this gives
rise to communication graphs that look like random intersection graphs. Furthermore, random intersection graphs are relevant to social networking and capture it quite nicely. Indeed, a social network is a structure made of nodes tied
by one or more specific types of interdependency, such as values, visions, finan-
cial exchange, friends, conflicts, web links etc. Other applications may include
oblivious resource sharing in a distributed setting, interactions of mobile agents
traversing the web, social networking etc. Even epidemiological phenomena (like
spread of disease between individuals with common characteristics in a popula-
tion) tend to be more accurately captured by this “proximity-sensitive” family
of random graphs.

This research was partially supported by the EU IP Project MULTIPLEX contract
number 317532.

1.1 A More Formal, First Acquaintance with RIGs

Random intersection graphs were introduced by M. Karoński, E.R. Sheinerman


and K.B. Singer-Cohen [7] and K.B. Singer-Cohen [18]. The formal definition of
the model is given below:

Definition 1 (Uniform Random Intersection Graph - Gn,m,p [7, 18]).
Consider a universe M = {1, 2, . . . , m} of labels and a set of n vertices V.
Assign independently to each vertex v ∈ V a subset Sv of M, choosing each
element i ∈ M independently with probability p, and draw an edge between two
vertices v ≠ u if and only if Sv ∩ Su ≠ ∅. The resulting graph is an instance
Gn,m,p of the uniform random intersection graphs model.

In this model we also denote by Li the set of vertices that have chosen label
i ∈ M . Given Gn,m,p , we will refer to {Li , i ∈ M} as its label representation.
It is often convenient to view the label representation as a bipartite graph with
vertex set V ∪ M and edge set {(v, i) : i ∈ Sv } = {(v, i) : v ∈ Li }. We refer to
this graph as the bipartite random graph Bn,m,p associated to Gn,m,p . Notice that
the associated bipartite graph is uniquely defined by the label representation.
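To make the definition concrete, the model can be sampled in a few lines. The following Python sketch (function names and parameters are ours, purely for illustration) draws an instance of Gn,m,p and computes both its edge set and its label representation {Li}:

```python
import random

def sample_rig(n, m, p, rng):
    """Sample a uniform random intersection graph G_{n,m,p}.

    Returns (label_sets, edges): label_sets[v] is S_v, and edges is the
    set of pairs {u, v} with S_u ∩ S_v nonempty.
    """
    label_sets = [{i for i in range(m) if rng.random() < p} for _ in range(n)]
    edges = {frozenset((u, v))
             for u in range(n) for v in range(u + 1, n)
             if label_sets[u] & label_sets[v]}
    return label_sets, edges

def label_representation(label_sets, m):
    """L_i = set of vertices that chose label i (the label representation)."""
    return {i: {v for v, s in enumerate(label_sets) if i in s} for i in range(m)}

rng = random.Random(0)
S, E = sample_rig(n=10, m=4, p=0.3, rng=rng)
L = label_representation(S, m=4)
```

Since all vertices of Li pairwise share label i, every Li induces a clique in the resulting graph; this fact is used repeatedly below.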
It follows from the definition of the model that the (unconditioned) probability
that a specific edge exists is 1 − (1 − p²)^m. Therefore, if mp² goes to infinity
with n, then this probability goes to 1. We can thus restrict the range of the
parameters to the “interesting” range where mp² = O(1) (i.e. the range of values
for which the unconditioned probability that an edge exists does not go to 1).
Furthermore, as is usual in the literature, we assume that the number of labels
is some power of the number of vertices, i.e. m = nα , for some α > 0.
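The closed form 1 − (1 − p²)^m follows because two fixed vertices share a given label with probability p², independently over the m labels. A quick Monte Carlo check (illustrative parameters of our choosing):

```python
import random

def edge_probability(m, p):
    # P(S_u ∩ S_v ≠ ∅) for two fixed vertices: each of the m labels is
    # chosen by both independently with probability p * p.
    return 1 - (1 - p * p) ** m

def estimate_edge_probability(m, p, trials, rng):
    hits = 0
    for _ in range(trials):
        su = {i for i in range(m) if rng.random() < p}
        sv = {i for i in range(m) if rng.random() < p}
        hits += bool(su & sv)
    return hits / trials

rng = random.Random(42)
exact = edge_probability(m=20, p=0.1)                    # ≈ 0.182
approx = estimate_edge_probability(20, 0.1, 200_000, rng)
```

With 200,000 trials the empirical frequency agrees with the closed form to well within one percentage point.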
It is worth mentioning that the edges in Gn,m,p are not independent. In par-
ticular, there is a strictly positive dependence between the existence of two edges
that share an endpoint (i.e. Pr(∃{u, v} | ∃{u, w}) > Pr(∃{u, v})). This dependence
is stronger when the universe M contains few labels, while it seems to fade
away as the number of labels increases. In fact, by using a coupling technique,
the authors in [4] prove the equivalence (measured in terms of total variation dis-
tance) of uniform random intersection graphs and Erdős-Rényi random graphs,
when m = nα , α > 6. This bound on the number of labels was improved in [16],
by showing equivalence of sharp threshold functions among the two models for
α ≥ 3. These results show that random intersection graphs are quite general and
that known techniques for random graphs can be used in the analysis of uniform
random intersection graphs with a large number of labels.
The similarity between uniform random intersection graphs and Erdős-Rényi
random graphs vanishes as the number of labels m decreases below the number
of vertices n (i.e. m = nα , for α ≤ 1). This dichotomy was initially pointed
out in [18], through the investigation of connectivity of Gn,m,p. In particular,
it was proved that the connectivity threshold for α > 1 is √(ln n/(nm)), but it is
ln n/m (i.e. quite larger) for α ≤ 1. Therefore, the mean number of edges just above
connectivity is approximately ½ n ln n in the first case (which is equal to the mean
number of edges just above the connectivity threshold for Erdős-Rényi random
graphs), but it is larger by at least a factor of ln n in the second case. Other


dichotomy results of similar flavor were pointed out in the investigation of the
(unconditioned) vertex degree distribution by D. Stark [19], through the analysis
of a suitable generating function, and in the investigation of the distribution of
the number of isolated vertices by Y. Shang [17].
In this invited talk, we will focus on research related to both combinatorial
and algorithmic properties of uniform random intersection graphs, but also of
other, related models that are included in the family of random intersection
graphs. In particular, we note that by selecting the label set of each vertex
using a different distribution, we get random intersection graphs models whose
statistical behavior can vary considerably from that of Gn,m,p . Two of these
models are the following: (a) In the General Random Intersection Graphs
Model Gn,m,p⃗ [11], where p⃗ = [p1, p2, . . . , pm], the label set Sv of a vertex v
is formed by choosing independently each label i with probability pi. (b) In the
Regular Random Intersection Graphs Model Gn,m,λ [6], where λ ∈ N,
the label set of each vertex is chosen independently, uniformly at random from the
set of all subsets of M of cardinality λ.
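The two variants differ from Gn,m,p only in how the label set Sv is drawn, which a short sketch makes explicit (illustrative code, not taken from the cited papers):

```python
import random

def sample_general_labels(p_vec, rng):
    # General model G_{n,m,p⃗}: label i is chosen with its own probability p_i.
    return {i for i, pi in enumerate(p_vec) if rng.random() < pi}

def sample_regular_labels(m, lam, rng):
    # Regular model G_{n,m,λ}: S_v is a uniformly random λ-subset of the labels.
    return set(rng.sample(range(m), lam))

rng = random.Random(0)
n, m, lam = 8, 5, 2
general_sets = [sample_general_labels([0.1, 0.5, 0.9, 0.2, 0.4], rng) for _ in range(n)]
regular_sets = [sample_regular_labels(m, lam, rng) for _ in range(n)]
```

Setting pi = p for all i in the general model recovers Gn,m,p.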

2 Selected Combinatorial Problems

Below we provide a brief presentation of the main results on the topic obtained
by our team. We also give a general description of the techniques used; some
of these techniques highlight and take advantage of the intricacies and special
structure of random intersection graphs, while others are adapted from the field
of Erdős-Rényi random graphs.

2.1 Independent Sets

The problem of the existence and efficient construction of large independent sets
in general random intersection graphs is considered in [11]. Concerning existence,
exact formulae are derived for the expectation and variance of the number of
independent sets of any size, by using a vertex contraction technique. This tech-
nique involves the characterization of the statistical behavior of an independent
set of any size and highlights an asymmetry in the edge appearance rule of ran-
dom intersection graphs. In particular, it is shown that the probability that any
fixed label i is chosen by some vertex of a set S of k vertices with no edges between
them is exactly kpi/(1 + (k − 1)pi). On the other hand, there is no closed formula
for the respective probability when there is at least one edge between the k vertices (or even when the
set S is complete)! The special structure of random intersection graphs is also
used in the design of efficient algorithms for constructing quite large independent
sets in uniform random intersection graphs. By analysis, it is proved that the
approximation guarantees of algorithms using the label representation of ran-
dom intersection graphs are superior to that of well known greedy algorithms
for independent sets when applied to instances of Gn,m,p .
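The contraction formula can be rederived directly: a set S of size k is independent iff every label is chosen by at most one vertex of S, and the labels are independent, so one can condition label by label. The probability that label i is chosen by exactly one vertex, given that it is chosen by at most one, is kpi(1 − pi)^{k−1} / ((1 − pi)^k + kpi(1 − pi)^{k−1}), which simplifies to kpi/(1 + (k − 1)pi). A sketch checking this algebra with exact rational arithmetic (our own illustration, not the paper's proof):

```python
from fractions import Fraction

def label_in_independent_set_prob(k, p):
    """P(label i chosen by some vertex of S | S independent), |S| = k.

    S is independent iff every label is chosen by at most one vertex of S;
    labels are independent, so we may condition label by label.
    """
    p = Fraction(p)
    at_most_one = (1 - p) ** k + k * p * (1 - p) ** (k - 1)
    exactly_one = k * p * (1 - p) ** (k - 1)
    return exactly_one / at_most_one

def closed_form(k, p):
    p = Fraction(p)
    return k * p / (1 + (k - 1) * p)

# Exact rational arithmetic: the two expressions agree for every k, p tried.
for k in range(1, 8):
    for p in (Fraction(1, 10), Fraction(1, 3), Fraction(9, 10)):
        assert label_in_independent_set_prob(k, p) == closed_form(k, p)
```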
2.2 Hamilton Cycles


In [15], the authors investigate the existence and efficient construction of Hamil-
ton cycles in uniform random intersection graphs. In particular, for the case
m = nα , α > 1 the authors first prove a general result that allows one to apply
(with the same probability of success) any algorithm that finds a Hamilton cycle
with high probability in a Gn,M random graph (i.e. a graph chosen equiprobably
from the space of all graphs with M edges). The proof is done by using a simple coupling argument. A more complex coupling was given in [3], resulting in
a more accurate characterization of the threshold function for Hamiltonicity in
Gn,m,p for the whole range of values of α. From an algorithmic perspective, the
authors in [15] provide an expected polynomial time algorithm for the case where
m = O(n/ln n) and p is constant. For the more general case where m = o(n/ln n),
they propose a label exposure greedy algorithm that succeeds in finding a Hamil-
ton cycle in Gn,m,p with high probability, even when the probability of label
selection is just above the connectivity threshold.

2.3 Coloring
In [9], the problem of coloring the vertices of Gn,m,p is investigated (see also [1]).
For the case where the number of labels is less than the number of vertices
and mp ≥ ln² n (i.e. a factor ln n above the connectivity threshold of uniform
random intersection graphs), a polynomial time algorithm is proposed for finding
a proper coloring of Gn,m,p. The algorithm is greedy-like and it is proved that it
takes O(n² ln² n/(mp)) time, while using Θ(nmp²/ln n) different colors. Furthermore, by
using a one sided coupling to the regular random intersection graphs model
Gn,m,λ with λ ∼ mp, and using an upper bound on its independence number
from [13], it is shown that the number of colors used by the proposed algorithm
is optimal up to constant factors.
To complement this result, the authors in [9] prove that when mp < β ln n,
for some small constant β, only np colors are needed in order to color n − o(n)
vertices of Gn,m,p whp. This means that even for quite dense instances, using
the same number of colors as those needed to properly color the clique induced
by any label suffices to color almost all of the vertices of Gn,m,p . For the proof,
the authors explore a combination of ideas from [5] and [8]. In particular, a
martingale {Xt}t≥0 is defined, so that Xn equals the size of a maximum subset of
vertices that can be properly colored using a predefined number of colors k.
Then, by providing an appropriate lower bound on the probability that there is
a sufficiently large subset of vertices that can be split in k independent sets of
roughly the same size, and then using Azuma’s Inequality for martingales, the
authors provide a lower bound on E[Xn ] and also show that the actual value
Xn is highly concentrated around its mean value.
Finally, due to the similarities that the Gn,m,p model has to the process of
generating random hypergraphs, [9] includes a comparison of the problem of
finding a proper coloring for Gn,m,p to that of coloring hypergraphs so that no
edge is monochromatic. In contrast to the first problem, it is proved that only
two colors suffice for the second problem. Furthermore, by using the method of
conditional expectations (see [14]) an algorithm can be derived that finds the
desired coloring in polynomial time.
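The method of conditional expectations mentioned above can be sketched generically (this is the standard technique, not necessarily the paper's exact algorithm): under a uniformly random 2-coloring the expected number of monochromatic edges is Σe 2^{1−|e|}, and coloring the vertices one at a time so that this conditional expectation never increases ends with at most that many monochromatic edges, hence with none whenever the initial expectation is below 1.

```python
def expected_mono(edges, color):
    """E[# monochromatic edges] when unassigned vertices get fair random colors."""
    total = 0.0
    for e in edges:
        assigned = {color[v] for v in e if v in color}
        if len(assigned) == 2:
            continue                      # already bichromatic: never monochromatic
        unassigned = sum(1 for v in e if v not in color)
        if assigned:                      # all assigned vertices share one color
            total += 0.5 ** unassigned
        else:                             # no vertex of e assigned yet
            total += 2 * 0.5 ** len(e)
    return total

def two_color(vertices, edges):
    """Greedy derandomization: pick each vertex's color to minimize E[mono]."""
    color = {}
    for v in vertices:
        color[v] = min((0, 1), key=lambda c: expected_mono(edges, {**color, v: c}))
    return color

edges = [frozenset(e) for e in [(0, 1, 2), (1, 2, 3), (2, 3, 4)]]
col = two_color(range(5), edges)
# Initial expectation = 3 · 2^(1-3) = 0.75 < 1, so no edge ends up monochromatic.
```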

2.4 Maximum Cliques


In [12], the authors consider maximum cliques in the uniform random intersection
graphs model Gn,m,p . It is proved that, when the number of labels is not too
large, we can use the label choices of the vertices to find a maximum clique in
polynomial time (in the number of labels m and vertices n of the graph). Most
of the analytical work in the paper is devoted to proving the Single Label Clique
Theorem. Its proof includes a coupling to a graph model where edges appear
independently and in which we can bound the size of the maximum clique by
well known probabilistic techniques. The theorem states that when the number
of labels is less than the number of vertices, any large enough clique in a random
instance of Gn,m,p is formed by a single label. This statement may seem obvious
when p is small, but it is hard to imagine that it still holds for all “interesting”
values for p. Indeed, when p = o(1/√(nm)), by slightly modifying an argument
of [1], one can see that Gn,m,p almost surely has no cycle of size k ≥ 3 whose edges
are formed by k distinct labels (alternatively, the intersection graph produced
by reversing the roles of labels and vertices is a tree). On the other hand, for
larger p a random instance of Gn,m,p is far from perfect¹ and the techniques
of [1] do not apply. By using the Single Label Clique Theorem, a tight bound
on the clique number of Gn,m,p is proved in the case where m = nα, α < 1. A
lower bound in the special case where mp² is constant was given in [18]. We
considerably broaden this range of values to also include vanishing values of
mp² and provide an asymptotically tight upper bound.
Finally, as yet another consequence of the Single Label Clique Theorem, the
authors in [12] prove that the problem of inferring the complete information
of label choices for each vertex from the resulting random intersection graph is
solvable whp; namely, the maximum likelihood estimation method will provide
a unique solution (up to permutations of the labels).² In particular, given values
m, n and p, such that m = nα , 0 < α < 1, and given a random instance of the
Gn,m,p model, the label choices for each vertex are uniquely defined.
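Since each label i induces a clique on Li, the quantity maxi |Li| is always a lower bound on the clique number, and the Single Label Clique Theorem says that for m = nα, α < 1, large cliques are in fact single-label cliques. The following sketch (our illustration; brute force is only viable for tiny n) compares the two quantities on a small sampled instance:

```python
import random
from itertools import combinations

def sample_rig(n, m, p, rng):
    sets = [{i for i in range(m) if rng.random() < p} for _ in range(n)]
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if sets[u] & sets[v]:
            adj[u].add(v)
            adj[v].add(u)
    return sets, adj

def clique_number(adj, n):
    # Brute force over vertex subsets, largest first: fine for very small n.
    for k in range(n, 0, -1):
        for cand in combinations(range(n), k):
            if all(v in adj[u] for u, v in combinations(cand, 2)):
                return k
    return 0

rng = random.Random(3)
sets, adj = sample_rig(n=10, m=3, p=0.4, rng=rng)
single_label = max(len({v for v in range(10) if i in sets[v]}) for i in range(3))
omega = clique_number(adj, 10)
assert single_label <= omega   # every L_i is a clique, so max |L_i| ≤ ω
```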

¹ A perfect graph is a graph in which the chromatic number of every induced subgraph
equals the size of the largest clique of that subgraph. Consequently, the clique number
of a perfect graph is equal to its chromatic number.
² More precisely, if B is the set of different label choices that can give rise to a graph
G, then the problem of inferring the complete information of label choices from G is
solvable if there is some B∗ ∈ B such that Pr(B∗ | G) > Pr(B | G) for all B ∈ B with B ≠ B∗.

2.5 Expansion and Random Walks

The edge expansion and the cover time of uniform random intersection graphs are
investigated in [10]. In particular, by using first moment arguments, the authors
first prove that Gn,m,p is an expander whp when the number of labels is less than
the number of vertices, even when p is just above the connectivity threshold (i.e.
p = (1 + o(1))τc , where τc is the connectivity threshold). Second, the authors
show that random walks on the vertices of random intersection graphs are whp
rapidly mixing (in particular, the mixing time is logarithmic on n). The proof is
based on upper bounding the second eigenvalue of the random walk on Gn,m,p
through coupling of the original Markov Chain describing the random walk to
another Markov Chain on an associated random bipartite graph whose conduc-
tance properties are appropriate. Finally, the authors prove that the cover time
of the random walk on Gn,m,p , when m = nα , α < 1 and p is at least 5 times
the connectivity threshold is Θ(n log n), which is optimal up to a constant. The
proof is based on a general theorem of Cooper and Frieze [2]; the authors prove
that the degree and spectrum requirements of the theorem hold whp in the case
of uniform random intersection graphs. The authors also claim that their proof
carries over to the case of smaller values for p, although the technical difficulty
of proving the degree requirements of the theorem of [2] increases.

3 Epilogue
We discussed here recent progress on the Random Intersection Graphs (RIGs)
Model. The topic is still new and many more properties await discovery,
especially for the General (non-Uniform) version of RIGs. Such graphs (and
other new graph classes) are motivated by modern technology, and thus, some
combinatorial results and algorithmic properties may become useful in order to
understand and exploit emerging networks nowadays.

References
1. Behrisch, M., Taraz, A., Ueckerdt, M.: Coloring random intersection graphs and
complex networks. SIAM J. Discrete Math. 23, 288–299 (2008)
2. Cooper, C., Frieze, A.: The cover time of sparse random graphs. Random Structures
and Algorithms 30(1-2), 1–16 (2007)
3. Efthymiou, C., Spirakis, P.G.: Sharp thresholds for Hamiltonicity in random inter-
section graphs. Theor. Comput. Sci. 411(40-42), 3714–3730 (2010)
4. Fill, J.A., Sheinerman, E.R., Singer-Cohen, K.B.: Random intersection graphs
when m = ω(n): an equivalence theorem relating the evolution of the G(n, m, p)
and G(n, p) models. Random Struct. Algorithms 16(2), 156–176 (2000)
5. Frieze, A.: On the Independence Number of Random Graphs. Disc. Math. 81,
171–175 (1990)
6. Godehardt, E., Jaworski, J.: Two models of Random Intersection Graphs for Classification.
In: Opitz, O., Schwaiger, M. (eds.) Studies in Classification, Data Analysis
and Knowledge Organisation, pp. 67–82. Springer, Heidelberg (2002)
7. Karoński, M., Sheinerman, E.R., Singer-Cohen, K.B.: On Random Intersection
Graphs: The Subgraph Problem. Combinatorics, Probability and Computing Jour-
nal 8, 131–159 (1999)
8. Łuczak, T.: The chromatic number of random graphs. Combinatorica 11(1), 45–54
(1991)
9. Nikoletseas, S., Raptopoulos, C., Spirakis, P.G.: Coloring Non-sparse Random


Intersection Graphs. In: Královič, R., Niwiński, D. (eds.) MFCS 2009. LNCS,
vol. 5734, pp. 600–611. Springer, Heidelberg (2009)
10. Nikoletseas, S., Raptopoulos, C., Spirakis, P.G.: Expander properties and the cover
time of random intersection graphs. Theor. Comput. Sci. 410(50), 5261–5272 (2009)
11. Nikoletseas, S., Raptopoulos, C., Spirakis, P.G.: Large independent sets in general
random intersection graphs. Theor. Comput. Sci. 406, 215–224 (2008)
12. Nikoletseas, S., Raptopoulos, C., Spirakis, P.G.: Maximum Cliques in Graphs with
Small Intersection Number and Random Intersection Graphs. In: Rovan, B., Sas-
sone, V., Widmayer, P. (eds.) MFCS 2012. LNCS, vol. 7464, pp. 728–739. Springer,
Heidelberg (2012)
13. Nikoletseas, S., Raptopoulos, C., Spirakis, P.G.: On the Independence Num-
ber and Hamiltonicity of Uniform Random Intersection Graphs. Theor. Comput.
Sci. 412(48), 6750–6760 (2011)
14. Molloy, M., Reed, B.: Graph Colouring and the Probabilistic Method. Springer
(2002)
15. Raptopoulos, C., Spirakis, P.G.: Simple and Efficient Greedy Algorithms for Hamil-
ton Cycles in Random Intersection Graphs. In: Deng, X., Du, D.-Z. (eds.) ISAAC
2005. LNCS, vol. 3827, pp. 493–504. Springer, Heidelberg (2005)
16. Rybarczyk, K.: Equivalence of a random intersection graph and G(n, p). Random
Structures and Algorithms 38(1-2), 205–234 (2011)
17. Shang, Y.: On the Isolated Vertices and Connectivity in Random Intersection
Graphs. International Journal of Combinatorics 2011, Article ID 872703 (2011),
doi:10.1155/2011/872703
18. Singer-Cohen, K.B.: Random Intersection Graphs. PhD thesis, Johns Hopkins
University (1995)
19. Stark, D.: The vertex degree distribution of random intersection graphs. Random
Structures & Algorithms 24(3), 249–258 (2004)
To Be Uncertain Is Uncomfortable,
But to Be Certain Is Ridiculous

Peter Widmayer

Institute of Theoretical Computer Science, ETH Zürich, Switzerland


[email protected]

Traditionally, combinatorial optimization postulates that an input instance is


given with absolute precision and certainty, and it aims at finding an optimum
solution for the given instance. In contrast, real world input data are often
uncertain, noisy, inaccurate. As a consequence, an optimum solution for a real
world instance may not be meaningful or desired. While this unfortunate gap
between theory and reality has been recognized for quite some time, it is far from
understood, let alone resolved. We advocate devoting more attention to it, in
order to develop algorithms that find meaningful solutions for uncertain inputs.
We propose an approach towards this goal, and we show that this approach on
the one hand creates a wealth of algorithmic problems, while on the other hand
it appears to lead to good real world solutions.
This talk is about joint work with Joachim Buhmann, Matúš Mihalák, and
Rastislav Šrámek.


Chinese proverb, sometimes also attributed to Goethe.

Decision Problems for Additive Regular Functions

Rajeev Alur and Mukund Raghothaman

University of Pennsylvania
{alur,rmukund}@cis.upenn.edu

Abstract. Additive Cost Register Automata (ACRA) map strings to


integers using a finite set of registers that are updated using assignments
of the form “x := y +c” at every step. The corresponding class of additive
regular functions has multiple equivalent characterizations, appealing clo-
sure properties, and a decidable equivalence problem. In this paper, we
solve two decision problems for this model. First, we define the register
complexity of an additive regular function to be the minimum number of
registers that an ACRA needs to compute it. We characterize the register
complexity by a necessary and sufficient condition regarding the largest
subset of registers whose values can be made far apart from one another.
We then use this condition to design a pspace algorithm to compute the
register complexity of a given ACRA, and establish a matching lower
bound. Our results also lead to a machine-independent characterization
of the register complexity of additive regular functions. Second, we con-
sider two-player games over ACRAs, where the objective of one of the
players is to reach a target set while minimizing the cost. We show the
corresponding decision problem to be exptime-complete when the costs
are non-negative integers, but undecidable when the costs are integers.

1 Introduction
Consider the following scenario: a customer frequents a coffee shop, and each time
purchases a cup of coffee costing $2. At any time, he may fill out a survey, for which
the store offers to give him a discount of $1 for each of his purchases that month
(including for purchases already made). We model this by the machine M1 shown
in figure 1. There are two states qS and q¬S , indicating whether the customer has
filled out the survey during the current month. There are three events to which
the machine responds: C indicates the purchase of a cup of coffee, S indicates the
completion of a survey, and # indicates the end of a month. The registers x and y
track how much money the customer owes the establishment: in the state q¬S , the
amount in x assumes that he will not fill out a survey that month, and the amount in
y assumes that he will fill out a survey before the end of the month. At any time the
customer wishes to settle his account, the machine outputs the amount of money
owed, which is always the value in the register x.

The full version of this paper is available on the arXiv (arXiv:1304.7029). This
research was partially supported by the NSF Expeditions in Computing award
1138996.

[Figure 1: a two-state ACRA. States q¬S (initial) and qS, both with output x.
Transitions: in q¬S, C updates x := x + 2 and y := y + 1; S moves to qS with x := y;
# stays in q¬S with y := x. In qS, C updates x := x + 1; S is a self-loop;
# returns to q¬S with y := x.]

Fig. 1. ACRA M1 models a customer in a coffee shop. It implements a function f1 :
{C, S, #}∗ → Z mapping the purchase history of the customer to the amount he owes
the store.

The automaton M1 has a finite state space, and a finite set of integer-valued
registers. On each transition, each register u is updated by an expression of the
form “u := v + c”, for some register v and constant c ∈ Z. Which register will
eventually contribute to the output is determined by the state after reading the
entire input, and so the cost of an event depends not only on the past, but also on
the future. Indeed, it can be shown that these machines are closed under regular
lookahead, i.e. the register updates can be conditioned on regular properties of
an as-yet-unseen suffix, for no gain in expressivity. The important limitation is
that the register updates are test-free, and cannot examine the register contents.
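The machine M1 can be simulated directly. The sketch below encodes our reading of the transitions in figure 1 (registers not mentioned on a transition are assumed to keep their values):

```python
def run_m1(word):
    """Simulate the coffee-shop ACRA M1 on a string over {C, S, #}.

    Registers: x = amount owed assuming no survey this month,
               y = amount owed assuming a survey before month's end.
    """
    state, x, y = "n", 0, 0                   # "n" = q_notS, "s" = q_S
    for a in word:
        if state == "n":
            if a == "C":
                x, y = x + 2, y + 1           # coffee: $2, or $1 if a survey comes
            elif a == "S":
                state, x = "s", y             # survey filled: discount applies
            elif a == "#":
                y = x                         # month ends without a survey
        else:  # state == "s": survey already filled this month
            if a == "C":
                x = x + 1                     # discounted coffee
            elif a == "#":
                state, y = "n", x             # new month begins
            # "S" is a self-loop with no register update
    return x                                  # output: current amount owed

assert run_m1("CC") == 4      # two coffees, no survey
assert run_m1("CCS") == 2     # survey: $1 discount per purchase this month
assert run_m1("CC#C") == 6    # the discount opportunity resets each month
```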
The motivation behind the model is generalizing the idea of regular languages
to quantitative properties of strings. A language L ⊆ Σ ∗ is regular when it
is accepted by a DFA. Regular languages are a robust class, permitting mul-
tiple equivalent representations such as regular expressions and as formulas in
monadic second-order logic. Recently in [2], we proposed the model of regular
functions: they are the MSO-definable transductions from strings to expression
trees over some pre-defined grammar. The class of functions thus defined de-
pends on the grammar allowed; the simplest is when the underlying domain is
the set of integers Z, and expressions involve constants and binary addition, and
we call the resulting class additive regular functions. Additive regular functions
have appealing closure properties, such as closure under linear combination, in-
put reversal, and regular lookahead, and several analysis problems are efficiently
decidable – such as containment, shortest paths and equivalence checking. The
machine M1 is an example of an Additive Cost Register Automaton (ACRA),
and this class defines exactly the additive regular functions.
Observe that the machine M1 has two registers, and it is not immediately clear
how (if it is even possible) to reduce this number. This is the first question that
this paper settles: Given an ACRA M, how do we determine the minimum number
of registers needed by any ACRA to compute the function ⟦M⟧ it defines?
We describe a property called register separation, and show that any equivalent
ACRA needs at least k registers iff the registers of M are k-separable. It turns
out that the registers of M1 are 2-separable, and hence two registers are neces-
sary. We then go on to show that determining k-separability is pspace-complete.
Determining the register complexity is the natural analogue of the state mini-
mization problem for DFAs [6].
The techniques used to analyse the register complexity allow us to state a re-
sult similar to the pumping lemma for regular languages: The register complexity
of f is at least k iff for some m, we have strings σ0, . . . , σm, τ1, . . . , τm, suffixes
w1, . . . , wk, k distinct coefficient vectors c1, . . . , ck ∈ Zm, and values d1, . . . , dk ∈ Z
so that for all vectors x ∈ Nm, f(σ0 τ1^{x1} σ1 τ2^{x2} · · · τm^{xm} σm wi) = Σj cij xj + di. Thus,
depending on the suffix wi, at least one of the cycles τ1, . . . , τm contributes
differently to the final cost.
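For M1 the separation is easy to see concretely: after reading C^k in state q¬S the registers hold x = 2k and y = k, so they can be driven arbitrarily far apart, and each is observable under a suitable suffix: the empty suffix outputs x, while the suffix S outputs the old y. A small sketch of this on M1 (using our reading of figure 1):

```python
def m1_output(word):
    # Minimal simulation of M1 (see figure 1): returns the value output
    # after reading `word`, starting at state q_notS with x = y = 0.
    state, x, y = "n", 0, 0
    for a in word:
        if state == "n":
            if a == "C":
                x, y = x + 2, y + 1
            elif a == "S":
                state, x = "s", y
            elif a == "#":
                y = x
        else:
            if a == "C":
                x += 1
            elif a == "#":
                state, y = "n", x
    return x

# The registers are "2-separable": the outputs under the two suffixes differ
# by k after the pump C^k, so no single-register ACRA can match both.
for k in (1, 5, 50):
    assert m1_output("C" * k) == 2 * k        # suffix ε observes x
    assert m1_output("C" * k + "S") == k      # suffix S observes y
```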
Finally, we consider ACRAs with turn-based alternation. These are games
where several objective functions are simultaneously computed, but only one of
these objectives will eventually contribute to the output, based on the actions
of both the system and its environment. Alternating ACRAs are thus related
to multi-objective games and Pareto optimization [12], but are a distinct model
because each run evaluates to a single value. We study the reachability prob-
lem in ACRA games: Given a budget k, is there a strategy for the system to
reach an accepting state with cost at most k? We show that this problem is
exptime-complete when the incremental costs assume values from N, and unde-
cidable when the incremental costs are integer-valued.

Related Work. The traditional model of string-to-number transducers has


been (non-deterministic) weighted automata (WA). Additive regular functions
are equivalent to unambiguous weighted automata, and are therefore strictly
sandwiched between weighted automata and deterministic WAs in expressive-
ness. Deterministic WAs are ACRAs with one register, and algorithms exist to
compute the state complexity and for minimization [10]. Mohri [11] presents a
comprehensive survey of the field. Recent work on the quantitative analysis of
programs [5] also uses weighted automata, but does not deal with minimization
or with notions of regularity. Data languages [7] are concerned with strings over
a (possibly infinite) data domain D. Recent models [3] have obtained Myhill-
Nerode characterizations, and hence minimization algorithms, but the models
are intended as acceptors, and not for computing more general functions. Turn-
based weighted games [9] are ACRA games with a single register, and in this
special setting, it is possible to solve non-negative optimal reachability in polyno-
mial time. Of the techniques used in the paper, difference bound invariants are
a standard tool. However when we need them, in section 3, we have to deal with
disjunctions of such constraints, and show termination of invariant strengthen-
ing – to the best of our knowledge, the relevant problems have not been solved
before.

Outline of the Paper. We define the automaton model in section 2. In section


3, we introduce the notion of separability, and establish its connection to regis-
ter complexity. In section 4, we show that determining the register complexity
is pspace-complete. Finally, in section 5, we study ACRA reachability games –
in particular, we show that ACRA(Z) games are undecidable, and that ACRA(N)
reachability games are exptime-complete.

2 Additive Regular Functions


We will use additive cost register automata as the working definition of additive
regular functions, i.e. a function¹ f : Σ∗ → Z⊥ is regular iff it is implemented
by an ACRA. An ACRA is a deterministic finite state machine, supplemented
by a finite number of integer-valued registers. Each transition specifies, for each
register u, a test-free update of the form “u := v + c”, for some register v, and
constant c ∈ Z. Accepting states are labelled with output expressions of the form
“v + c”.
Definition 1. An ACRA is a tuple M = (Q, Σ, V, δ, μ, q0 , F, ν), where Q is a
finite non-empty set of states, Σ is a finite input alphabet, V is a finite set of
registers, δ : Q × Σ → Q is the state transition function, μ : Q × Σ × V → V × Z
is the register update function, q0 ∈ Q is the start state, F ⊆ Q is the set of
accepting states, and ν : F → V × Z is the output function.
The configuration of the machine is a pair γ = (q, val), where q is the current state, and val : V → Z maps each register to its value. Define (q, val) →a (q′, val′) iff δ(q, a) = q′ and for each u ∈ V, if μ(q, a, u) = (v, c), then val′(u) = val(v) + c. The machine M implements a function ⟦M⟧ : Σ∗ → Z⊥ defined as follows. For each σ ∈ Σ∗, let (q0, val0) →σ (qf, valf), where val0(v) = 0 for all v. If qf ∈ F and ν(qf) = (v, c), then ⟦M⟧(σ) = valf(v) + c. Otherwise ⟦M⟧(σ) = ⊥. We will write val(u, σ) for the value of a register u after the machine has processed the string σ starting from the initial configuration. In the rest of this section, we summarize some known results about ACRAs [2]:

Equivalent Characterizations. Additive regular functions are equivalent to unambiguous weighted automata [11] over the tropical semiring. These are nondeterministic machines with a single counter. Each transition increments the
counter by an integer c, and accepting states have output increments, also inte-
gers. The unambiguous restriction requires that there be a single accepting path
for each string in the domain, thus the “min” operation of the tropical semiring
is unused. Recently, streaming tree transducers [1] have been proposed as the
regular model for string-to-tree transducers – ACRAs are equivalent in expres-
siveness to MSO-definable string-to-term transducers with binary addition as
the base grammar.

Closure Properties. What makes additive² regular functions interesting to study is their robustness to various manipulations:
¹ By convention, we represent a partial function f : A → B as a total function f : A → B⊥, where B⊥ = B ∪ {⊥}, and ⊥ ∉ B is the "undefined" value.
² We will often drop the adjective "additive", and refer simply to regular functions.
Decision Problems for Additive Regular Functions 41

[Transition diagrams omitted: (a) M2, with states q0 (start) and q1 and registers x, y; (b) M3, with registers x, y, z, and test-free updates such as x := y + 1, x := z + 1, and x := x + 1 on its a- and b-transitions.]

Fig. 2. ACRAs M2 and M3 operate over the input alphabet Σ = {a, b}. Both implement the function defined as f2(ε) = 0, and for all σ, f2(σa) = |σa|a, and f2(σb) = |σb|b. Here |σ|a is the number of occurrences of the symbol a in the string σ.
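For concreteness, the semantics of Definition 1 can be sketched as a small interpreter; the encoding of M2 below is ours, reconstructed from the caption rather than taken from the figure:

```python
# A sketch of an ACRA interpreter following Definition 1. The concrete
# encoding of M2 is ours (reconstructed from the caption of Fig. 2).

class ACRA:
    def __init__(self, regs, delta, mu, q0, final, nu):
        self.regs, self.delta, self.mu = regs, delta, mu
        self.q0, self.final, self.nu = q0, final, nu

    def run(self, word):
        """Compute [[M]](word), or None for the undefined value."""
        q, val = self.q0, {v: 0 for v in self.regs}
        for a in word:
            # simultaneous test-free updates: u := val(v) + c where mu(q,a,u) = (v,c)
            val = {u: val[v] + c for u, (v, c) in self.mu[(q, a)].items()}
            q = self.delta[(q, a)]
        if q not in self.final:
            return None
        v, c = self.nu[q]
        return val[v] + c

# M2: x counts a's, y counts b's; the state remembers the last symbol read,
# and the output expression of the current state selects the right register.
delta = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
upd = lambda da, db: {"x": ("x", da), "y": ("y", db)}
mu = {(q, "a"): upd(1, 0) for q in ("q0", "q1")}
mu.update({(q, "b"): upd(0, 1) for q in ("q0", "q1")})
M2 = ACRA({"x", "y"}, delta, mu, "q0", {"q0", "q1"},
          {"q0": ("x", 0), "q1": ("y", 0)})

assert M2.run("") == 0 and M2.run("abba") == 2 and M2.run("abb") == 2
```

Note that any state-and-register encoding computing f2 suffices here; the sketch is not the specific machine drawn in the figure.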

1. For all c ∈ Z, if f1 and f2 are regular functions, then so are f1 + f2 and cf1 .
2. If f is a regular function, then frev defined as frev (σ) = f (σ rev ) is also
regular.
3. If f1 and f2 are regular functions, and L is a regular language, then the func-
tion f defined as f (σ) = if σ ∈ L, then f1 (σ) , else f2 (σ) is also regular.
4. ACRAs are closed under regular lookahead, i.e. even if the machine were al-
lowed to make decisions based on a regular property of the suffix rather than
simply the next input symbol, there would be no increase in expressiveness.

Analysis Problems. Given ACRAs M1 and M2, equivalence-checking and the min-cost problem (computing min_{σ∈Σ∗} ⟦M⟧(σ)) can be solved in polynomial time. It then follows that containment (for all σ, ⟦M1⟧(σ) ≤ ⟦M2⟧(σ)) also has a polynomial time algorithm.

3 Characterizing the Register Complexity

The register complexity of an additive regular function f is the minimum number of registers an ACRA needs to compute it. For example, the register complexity of both ⟦M1⟧ in figure 1 and ⟦M2⟧ in figure 2a is 2. Computing the register complexity is the first problem we solve, and is the subject of this section and the next.
Definition 2. Let f : Σ ∗ → Z⊥ be a regular function. The register complexity
of f is the smallest number k such that there is an ACRA M implementing f
with only k registers.
Informally, the registers of M are separable in some state q if their values can
be pushed far apart. For example, consider the registers x and y of M1 in the
state q0. For any constant c, there is a string σ = C^c leading to q0 so that
|val (x, σ) − val (y, σ)| ≥ c. In formalizing this idea, we need to distinguish regis-
ters that are live in a given state, i.e. those that can potentially contribute to the
output. For example, M1 could be augmented with a third register z tracking
the length of the string processed. However, the value of z would be irrelevant
to the computation of f1. Informally, a register v is live³ in a state q if for some
suffix σ ∈ Σ ∗ , on processing σ starting from q, the initial value of v is what
influences the final output.
Definition 3. Let M = (Q, Σ, V, δ, μ, q0, F, ν) be an ACRA. The registers of M are k-separable if there is some state q, and a subset U ⊆ V so that
1. |U | = k, all registers v ∈ U are live in q, and
2. for all c ∈ Z, there is a string σ, such that δ (q0 , σ) = q and for all distinct
u, v ∈ U , |val (u, σ) − val (v, σ)| ≥ c.
The registers of a machine M are not k-separable if at every state q, and sub-
set U of k live registers, there is a constant c such that for all strings σ to q,
|val (u, σ) − val (v, σ)| < c, for some distinct u, v ∈ U . Note that the specific
registers which are close may depend on σ. For example, in the machine M3
from figure 2b, if a string σ ends with an a, then x and y will have the same
value, while if the last symbol was a b, then x and z are guaranteed to be equal.

Theorem 1. Let f : Σ∗ → Z⊥ be a function defined by an ACRA M. Then the register complexity of f is at least k iff the registers of M are k-separable.

We now sketch the proofs for each direction.

k-separability Implies a Lower Bound on the Register Complexity

Consider the machine M1 from figure 1. Here k = 2, and the registers x and y are separated in the state q¬S. Let σ1 = ε, i.e. the empty string, and σ2 = S – these are suffixes which, when starting from q¬S, "extract" the values currently in x and y respectively.
Now suppose an equivalent counter-example machine M′ is proposed with only one register v. At each state q′ of M′, observe the "effect" of processing suffixes σ1 and σ2. Each of these can be summarized by an expression of the form v + c_{q′,i} for i ∈ {1, 2}, where v is the current value of the register and c_{q′,i} ∈ Z. Thus, the outputs differ by no more than |(v + c_{q′,1}) − (v + c_{q′,2})| ≤ |c_{q′,1}| + |c_{q′,2}|. Fix n = max_{q′} (|c_{q′,1}| + |c_{q′,2}|), and observe that for all σ, |⟦M′⟧(σσ1) − ⟦M′⟧(σσ2)| ≤ n. However, for σ = C^{n+1}, we know that |f1(σσ1) − f1(σσ2)| > n, so M′ cannot be equivalent to M1. This argument can be generalized to obtain:
Lemma 1. Let M be an ACRA whose registers are k-separable. Then the reg-
ister complexity of the implemented function f is at least k.
³ Live registers are formally defined in the full version of this paper.

Non-separability Permits Register Elimination

Consider an ACRA M whose registers are not k-separable. We can then state an invariant at each state q: there is a constant cq such that for every subset U ⊆ V of live registers with |U| = k, and for every string σ with δ(q0, σ) = q, there must exist distinct u, v ∈ U with |val(u, σ) − val(v, σ)| < cq. For example, with 3 registers x, y, z, this invariant would be ∃c, |x − y| < c ∨ |y − z| < c ∨ |z − x| < c. We will construct a machine M′, where each state q′ = (q, C, v) has 3 components: the first component q is the state of the original machine, and C identifies some term (not necessarily unique) in the disjunction which is currently satisfied. Now for example, if we know that |x − y| < c, then it suffices to explicitly maintain the value of only one register, and the (bounded) difference can be stored in the state – this is the third component v.
Since we need to track these register differences during the execution, the invariants must be inductive: if Dq and Dq′ are the invariants at states q and q′ respectively, and q →a q′ is a transition in the machine, then it must be the case that Dq =⇒ wp(Dq′, q, a). Here wp refers to the standard notion of the weakest precondition from program analysis: wp(Dq′, q, a) is exactly that set of register valuations val so that (q, val) →a (q′, val′) for some Dq′-satisfying valuation val′.
The standard technique to make a collection of invariants inductive is strengthening: if Dq does not imply wp(Dq′, q, a), then Dq is replaced with Dq ∧ wp(Dq′, q, a), and this process is repeated at every pair of states until a fixpoint is reached. This procedure is seeded with the invariants asserting non-separability. However, before the result of this back-propagation can be used in our arguments, we must prove that the method terminates – this is the main technical problem solved in this section.
We now sketch a proof of this termination claim for a simpler class of invariants. Consider the class of difference-bound constraints – assertions of the form C = ⋀_{u,v∈V} auv < u − v < buv, where for each u, v, either auv, buv ∈ Z or auv, buv ∈ {−∞, ∞}. When in closed form⁴, C induces an equivalence relation ≡C over the registers: u ≡C v iff auv, buv ∈ Z. Let C and C′ be some pair of constraints such that C does not imply C′. Then the assertion C ∧ C′ is strictly stronger than C. Either C ∧ C′ relates a strictly larger set of variables – ≡C ⊊ ≡C∧C′ – or (if ≡C = ≡C∧C′) for some pair of registers u, v, the bounds a′uv < u − v < b′uv imposed by C ∧ C′ are a strict subset of the bounds auv < u − v < buv imposed by C. Observe that the first type of strengthening can happen at most |V|² times, while the second type of strengthening can happen only after auv, buv ∈ Z are established for a pair of registers u, v, and can then happen at most buv − auv times. Thus the process of repeated invariant strengthening must terminate. This argument can be generalized to disjunctions of difference-bound constraints, and we conclude:

Lemma 2. Consider an ACRA M whose registers are not k-separable. Then, we can effectively construct an equivalent machine M′ with only k − 1 registers.
⁴ For all u, v ∈ V, auv = −bvu, and for all u, v, w ∈ V, auv + avw ≤ auw.

4 Computing the Register Complexity

4.1 Computing the Register Complexity Is in pspace

We reduce the problem of determining the register complexity of ⟦M⟧ to one of determining reachability in a directed "register separation" graph with O(|Q| · 2^{|V|²}) nodes. The presence of an edge in this graph can be determined in polynomial space, and thus we have a pspace algorithm to determine the register complexity. Otherwise, if polynomial time algorithms are used for graph reachability and 1-counter 0-reachability, the procedure runs in time O(c³ |Q|⁴ 2^{4|V|²}), where c is the largest constant in the machine.
We first generalize the idea of register separation to that of separation relations: an arbitrary relation R ⊆ V × V separates a state q if for every c ∈ Z, there is a string σ so that δ(q0, σ) = q, and whenever u R v, |val(u, σ) − val(v, σ)| ≥ c. Thus, the registers of M are k-separable iff for some state q and some subset U of live registers at q, |U| = k and {(u, v) | u, v ∈ U, u ≠ v} separates q.
Consider a string τ ∈ Σ∗ such that for some q, δ(q, τ) = q. Assume also that:

1. For every register u in the domain or range of R, μ(q, τ, u) = (u, cu), for some cu ∈ Z, and
2. for some pair of registers x, y, μ(q, τ, x) = (x, c) and μ(q, τ, y) = (y, c′) for distinct c, c′.

Thus, every pair of registers that is already separated is preserved during the cycle, and some new pair of registers is incremented differently. We call such strings τ "separation cycles" at q. They allow us to make conclusions of the form: if R separates q, then R ∪ {(x, y)} also separates q.
Now consider a string σ ∈ Σ∗ such that for some q, q′, δ(q, σ) = q′. Pick arbitrary relations R, R′, and assume that whenever u′ R′ v′, and μ(q, σ, u′) = (u, cu′), μ(q, σ, v′) = (v, cv′), we have u R v. We can then conclude that if R separates q, then R′ separates q′. We call such strings σ "renaming edges" from (q, R) to (q′, R′).
We then show that if R separates q and R is non-empty, then there is a separation cycle-renaming edge sequence to (q, R) from some strictly smaller separation (q′, R′). Thus, separation at each node can be demonstrated by a sequence of separation cycles with renaming edges in between, and thus we reduce the problem to that of determining reachability in an exponentially large register separation graph. Finally, we show that each type of edge can be determined in pspace.

Theorem 2. Given an ACRA M and a number k, there is a pspace procedure to determine whether its register complexity is at least k.

4.2 Pumping Lemma for ACRAs

The following theorem is the interpretation of a path through the register separation graph. Given a regular function f of register complexity at least k, it guarantees the existence of m cycles τ1, . . . , τm, serially connected by strings σ0, . . . , σm, so that based on one of k suffixes w1, . . . , wk, the cost paid on one of the cycles must differ. These cycles are actually the separation cycles discussed earlier, and the intermediate strings σi correspond to the renaming edges. Consider for example the function f2 from figure 2, and let σ0 = ε, τ1 = aab, and σ1 = ε. We can increase the difference between the registers x and y to arbitrary amounts by pumping the cycle τ1. Now if the suffixes are w1 = a, and w2 = b, then the choice of suffix determines the "cost" paid on each iteration of the cycle.
Theorem 3. A regular function f : Σ∗ → Z⊥ has register complexity at least k iff there exist strings σ0, . . . , σm, τ1, . . . , τm, and suffixes w1, . . . , wk, and k distinct coefficient vectors c1, . . . , ck ∈ Z^m, and values d1, . . . , dk ∈ Z so that for all x1, . . . , xm ∈ N,

f(σ0 τ1^{x1} σ1 τ2^{x2} · · · σm wi) = Σ_j c_{ij} x_j + d_i.

4.3 Computing the Register Complexity Is pspace-hard


We reduce the DFA intersection non-emptiness checking problem [8] to the prob-
lem of computing the register complexity. Let A = (Q, Σ, δ, q0 , {qf }) be a DFA.
Consider a single-state ACRA M with input alphabet Σ. For each state q ∈ Q,
M maintains a register vq . On reading a symbol a ∈ Σ, M updates vq := vδ(q,a) ,
for each q. Observe that this is simulating the DFA in reverse: if we start with
a special tagged value in vqf , then after processing σ, that tag is in the register
vq0 iff σ rev is accepted by A. Also observe that doing this in parallel for all the
DFAs no longer requires an exponential product construction, but only as many
registers as a linear function of the input size. We use this idea to construct in
polynomial time an ACRA M whose registers are (k + 2)-separable iff there is
a string σ ∈ Σ ∗ which is simultaneously accepted by all the DFAs. Therefore:

Theorem 4. Given an ACRA M and a number k, deciding whether the register complexity of ⟦M⟧ is at least k is pspace-hard.
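The reverse simulation underlying this reduction can be sketched directly; the DFA below and all names are illustrative assumptions, not taken from the paper:

```python
# Reverse simulation of a DFA with one register per DFA state (a sketch).
# Registers are updated as v_q := v_{delta(q, a)}; seeding v_{qf} with a
# tagged value, after processing sigma the tag sits in v_{q0} iff
# reverse(sigma) is accepted by the DFA.
def reverse_simulate(dfa_delta, q0, qf, states, word):
    val = {q: (1 if q == qf else 0) for q in states}  # tagged value in v_{qf}
    for a in word:
        val = {q: val[dfa_delta[(q, a)]] for q in states}  # v_q := v_{delta(q,a)}
    return val[q0] == 1

# Illustrative DFA over {a, b} accepting words with an even number of a's
# (this language is closed under reversal, which keeps the example simple).
delta = {("e", "a"): "o", ("e", "b"): "e",
         ("o", "a"): "e", ("o", "b"): "o"}
assert reverse_simulate(delta, "e", "e", {"e", "o"}, "abab")      # two a's
assert not reverse_simulate(delta, "e", "e", {"e", "o"}, "ab")    # one a
```

Running one such register family per DFA in a single machine is what avoids the exponential product construction mentioned above.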

5 Games over ACRAs


We now study games played over ACRAs. We extend the model of ACRAs to
allow alternation – in each state, a particular input symbol may be associated
with multiple transitions. The system picks the input symbol to process, while
the environment picks the specific transition associated with this input sym-
bol. Accepting states are associated with output functions, and the system may
choose to end the game in any accepting state. Given a budget k, we wish to
decide whether the system has a winning strategy with worst-case cost no more
than k. We show that ACRA games are undecidable when the incremental costs
are integer-valued, and exptime-complete when the incremental costs are from
D = N.

Definition 4. An ACRA (D) reachability game G is a tuple (Q, Σ, V, δ, μ, q0, F, ν), where Q, Σ, and V are finite sets of states, input symbols and registers respectively, δ ⊆ Q × Σ × Q is the transition relation, μ : δ × V → V × D is the register update function, q0 ∈ Q is the start state, F ⊆ Q is the set of accepting states, and ν : F → V × D is the output function.
The game configuration is a tuple γ = (q, val), where q ∈ Q is the current
state, and val : V → D is the current register valuation. A run π is a (possibly
infinite) sequence of game configurations (q1 , val1 ) →a1 (q2 , val2 ) →a2 · · · with
the property that
1. the transition qi →ai qi+1 ∈ δ for each i, and
2. vali+1 (u) = vali (v) + c, where μ (qi →ai qi+1 , u) = (v, c), for each register
u and for each transition i.

A strategy is a function θ : Q∗ × Q → Σ that maps a finite history q1 q2 . . . qn to the next symbol θ(q1 q2 . . . qn).

Definition 5. A run π is consistent with a strategy θ if for each i, θ(q1 q2 . . . qi) = ai. θ is winning from a configuration (q, val) with a budget of k ∈ D if for every consistent run π starting from (q1, val1) = (q, val), for some i, qi ∈ F and ν(qi, vali) ≤ k.

For greater readability, we write tuples (q, a, q′) ∈ δ as q →a q′. If q ∈ F, and val is a register valuation, we write ν(q, val) for the result val(v) + c, where ν(q) = (v, c). When we omit the starting configuration for winning strategies, it is understood to mean the initial configuration (q0, val0) of the ACRA.

5.1 ACRA (N) Reachability Games Can Be Solved in exptime


Consider the simpler class of (unweighted) graph reachability games. These are
played over a structure Gf = (Q, Σ, δ, q0 , F ), where Q is the finite state space,
and Σ is the input alphabet. δ ⊆ Q×Σ ×Q is the state transition relation, q0 ∈ Q
is the start state, and F ⊆ Q is the set of accepting states. If the input symbol
a ∈ Σ is played in a state q, then the play may adversarially proceed to any
state q  so that (q, a, q  ) ∈ δ. The system can force a win if every run compatible
with some strategy θf : Q∗ × Q → Σ eventually reaches a state qf ∈ F . Such
games can be solved by a recursive back-propagation
 algorithm
 – corresponding
to model checking the formula μX · F ∨ ⋁_{a∈Σ} [a]X – in time O(|Q| |Σ|). In
such games, whenever there is a winning strategy, there is a memoryless winning
strategy θsmall which guarantees that no state is visited twice.
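The recursive back-propagation can be sketched as a simple fixpoint computation (the encoding and the example game are ours):

```python
# A sketch (names and example ours) of the back-propagation fixpoint: a state
# is winning if it is accepting, or some symbol a can be played such that
# every a-successor is already winning (the adversary resolves nondeterminism).
def winning_states(states, sigma, delta, final):
    # delta maps (q, a) to the set of possible successor states;
    # a symbol with no entry is not playable in that state.
    win = set(final)
    changed = True
    while changed:
        changed = False
        for q in states:
            if q in win:
                continue
            for a in sigma:
                succs = delta.get((q, a), set())
                if succs and succs <= win:   # the [a]X modality of the formula
                    win.add(q)
                    changed = True
                    break
    return win

states = {1, 2, 3, 4}
delta = {(1, "a"): {2, 3}, (1, "b"): {4}, (2, "a"): {4}, (3, "a"): {1, 3}}
assert winning_states(states, "ab", delta, {4}) == {1, 2, 4}
```

In the example, state 1 wins by playing b (all b-successors are accepting), while state 3 does not, since its only move may loop back to 3 itself.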
From every ACRA (N) reachability game G = (Q, Σ, V, δ, μ, q0 , F, ν), we can
project out an unweighted graph reachability game Gf = (Q, Σ, δ, q0 , F ). Also,
Gf has a winning strategy iff for some k ∈ N, G has a k-winning strategy.
Consider the cost of θsmall (computed for Gf ) when used with G. Since no
run ever visits the same state twice, θsmall is c0 |Q|-winning, where c0 is the
largest constant appearing in G. We have thus established an upper-bound on
the optimal reachability strategy, if it exists.

Given an upper-bound k ∈ N, we would like to determine whether a winning strategy θ exists within this budget. Because the register increments are non-negative, once a register v achieves a value larger than k, it cannot contribute
to the final output on any suffix σ permitted by the winning strategy. We thus
convert G into an unweighted graph reachability game Gf^k, where the value of each
register is explicitly tracked in the state, as long as it is in the set {0, 1, . . . , k}.
This game can be solved for the optimal reachability strategy, and so we have:

Theorem 5. The optimal strategy θ for an ACRA (N) reachability game G can be computed in time O(|Q| · |Σ| · 2^{|V| log(c0 |Q|)}), where c0 is the largest constant appearing in the description of G.

Note that the optimal strategy in ACRA (N) games need not be memoryless: the
strategy may visit a state again with a different register valuation. However, the
strategy θ constructed in the proof of the above theorem is memoryless given
the pair (q, val) of the current state and register valuation.

5.2 Hardness of Solving ACRA (D) Reachability Games

We reduce the halting problem for two-counter machines to the problem of solv-
ing an ACRA (Z) reachability game. Informally, we construct a game GM given a
two-counter machine M so that the player has a 0-winning strategy through GM
iff M halts. This strategy encodes the execution of M , and the adversary verifies
that the run is valid. A similar idea is used to show that deciding ACRA (N)
reachability games is exptime-hard. The reduction in that case proceeds from
the halting problem for linearly bounded alternating Turing machines [4]. Given
such a machine M , we construct in polynomial time a game gadget GM where
the only strategy is to encode the runs of the Turing machine.

Theorem 6. Determining whether there is a winning strategy with budget k in an ACRA (N) reachability game is exptime-hard.

Theorem 7. Determining whether there is a winning strategy with budget k in an ACRA (Z) reachability game is undecidable.

6 Conclusion

In this paper, we studied two decision problems for additive regular functions:
determining the register complexity, and alternating reachability in ACRAs. The
register complexity of an additive regular function f is the smallest number k
so there is some ACRA implementing f with only k registers. We developed an
abstract characterization of the register complexity as separability and showed
that computing it is pspace-complete. We then studied the reachability prob-
lem in alternating ACRAs, and showed that it is undecidable for ACRA (Z) and
exptime-complete for ACRA (N) games. Future work includes proving similar
characterizations and providing algorithms for register minimization in more
general models such as streaming string transducers. String concatenation does not form a commutative monoid, and the present paper is restricted to unary
operators (increment by constant), and so the technique does not immediately
carry over. Another interesting question is to find a machine-independent char-
acterization of regular functions f : Σ ∗ → Z⊥ . A third direction of work would
be extending these ideas to trees and studying their connection to alternating
ACRAs.

References
1. Alur, R., D’Antoni, L.: Streaming tree transducers. In: Czumaj, A., Mehlhorn, K.,
Pitts, A., Wattenhofer, R. (eds.) ICALP 2012, Part II. LNCS, vol. 7392, pp. 42–53.
Springer, Heidelberg (2012)
2. Alur, R., D’Antoni, L., Deshmukh, J.V., Raghothaman, M., Yuan, Y.: Regular
functions and cost register automata. To Appear in the 28th Annual Symposium
on Logic in Computer Science (2013), Full version available at
http://www.cis.upenn.edu/~alur/rca12.pdf
3. Bojanczyk, M., Klin, B., Lasota, S.: Automata with group actions. In: 26th Annual
Symposium on Logic in Computer Science, pp. 355–364 (2011)
4. Chandra, A., Kozen, D., Stockmeyer, L.: Alternation. Journal of the ACM 28(1),
114–133 (1981)
5. Chatterjee, K., Doyen, L., Henzinger, T.A.: Quantitative Languages. In: Kaminski,
M., Martini, S. (eds.) CSL 2008. LNCS, vol. 5213, pp. 385–400. Springer, Heidelberg
(2008)
6. Hopcroft, J., Motwani, R., Ullman, J.: Introduction to Automata Theory, Lan-
guages, and Computation, 3rd edn. Prentice Hall (2006)
7. Kaminski, M., Francez, N.: Finite-memory automata. Theoretical Computer Sci-
ence 134(2), 329–363 (1994)
8. Kozen, D.: Lower bounds for natural proof systems. In: 18th Annual Symposium
on Foundations of Computer Science, pp. 254–266 (October 31-November 2, 1977)
9. Markey, N.: Weighted automata: Model checking and games. Lecture Notes (2008), http://www.lsv.ens-cachan.fr/~markey/Teaching/MPRI/2008-2009/MPRI-2.8b-4.pdf
10. Mohri, M.: Minimization algorithms for sequential transducers. Theoretical Com-
puter Science 234, 177–201 (2000)
11. Mohri, M.: Weighted automata algorithms. In: Droste, M., Kuich, W., Vogler, H.
(eds.) Handbook of Weighted Automata. Monographs in Theoretical Computer
Science, pp. 213–254. Springer (2009)
12. Papadimitriou, C., Yannakakis, M.: Multiobjective query optimization. In: Pro-
ceedings of the 20th Symposium on Principles of Database Systems, PODS 2001,
pp. 52–59. ACM (2001)
Beyond Differential Privacy: Composition Theorems
and Relational Logic for f -divergences
between Probabilistic Programs

Gilles Barthe and Federico Olmedo

IMDEA Software Institute, Madrid, Spain


{Gilles.Barthe,Federico.Olmedo}@imdea.org

Abstract. f-divergences form a class of measures of distance between probability distributions; they are widely used in areas such as information theory and
signal processing. In this paper, we unveil a new connection between f -diver-
gences and differential privacy, a confidentiality policy that provides strong pri-
vacy guarantees for private data-mining; specifically, we observe that the notion
of α-distance used to characterize approximate differential privacy is an instance
of the family of f -divergences. Building on this observation, we generalize to ar-
bitrary f -divergences the sequential composition theorem of differential privacy.
Then, we propose a relational program logic to prove upper bounds for the f -
divergence between two probabilistic programs. Our results allow us to revisit
the foundations of differential privacy under a new light, and to pave the way for
applications that use different instances of f -divergences.

1 Introduction
Differential privacy [12] is a policy that provides strong privacy guarantees in private data analysis: informally, a randomized computation over a database D is differentially private if the private data of individuals contributing to D is protected against arbitrary adversaries with query access to D. Formally, let ε ≥ 0 and 0 ≤ δ ≤ 1: a randomized algorithm c is (ε, δ)-differentially private if its output distributions for any two neighbouring inputs x and y are (e^ε, δ)-close, i.e. for every event E:

Pr_{c(x)}[E] ≤ e^ε · Pr_{c(y)}[E] + δ

where Pr_{c(x)}[E] denotes the probability of event E in the distribution obtained by running c on input x. One key property of differential privacy is the existence of sequential and parallel composition theorems, which allow building differentially private computations from smaller blocks. In this paper, we focus on the first theorem, which states that the sequential composition of an (ε1, δ1)-differentially private algorithm with an (ε2, δ2)-differentially private one yields an (ε1 + ε2, δ1 + δ2)-differentially private algorithm.
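For discrete output distributions, the closeness condition above can be checked pointwise: the quantity Σ_a max(0, μ1(a) − α·μ2(a)), the α-distance discussed later in this paper, is at most δ with α = e^ε exactly when the inequality holds for every event E. A small Python sketch (the toy distributions and names are ours):

```python
# Pointwise alpha-distance between two discrete distributions (a sketch;
# names are ours): Delta_alpha(mu1, mu2) = sum_a max(0, mu1(a) - alpha*mu2(a)).
# With alpha = e^eps, this is at most delta iff Pr[mu1 in E] <= alpha*Pr[mu2 in E] + delta
# holds for every event E.
def alpha_distance(m1, m2, alpha):
    keys = set(m1) | set(m2)
    return sum(max(0.0, m1.get(a, 0.0) - alpha * m2.get(a, 0.0)) for a in keys)

# toy randomized-response outputs on two neighbouring inputs; here e^eps = 3
mu_x = {0: 0.75, 1: 0.25}
mu_y = {0: 0.25, 1: 0.75}
assert alpha_distance(mu_x, mu_y, 3.0) == 0.0   # (3, 0)-close, i.e. eps = ln 3
assert alpha_distance(mu_x, mu_y, 1.0) == 0.5   # alpha = 1 recovers statistical distance here
```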

f-divergences [2,10] are convex functions that can be used to measure the distance between two distributions. The class of f-divergences includes many well-known notions of distance, such as statistical distance, Kullback-Leibler divergence (relative entropy), or Hellinger distance. Over the years, f-divergences have found multiple applications in information theory, signal processing, pattern recognition, machine learning, and security. The practical motivation for this work is a recent application of f-divergences to cryptography: in [24], Steinberger uses Hellinger distance to improve the security analysis of key-alternating ciphers, a family of encryption schemes that encompasses the Advanced Encryption Standard AES.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 49–60, 2013.
© Springer-Verlag Berlin Heidelberg 2013

Deductive Verification of Differentially Private Computations. In [6], we develop an approximate probabilistic Hoare logic, called apRHL, for reasoning about differential
privacy of randomized computations. The logic manipulates judgments of the form:

c1 ∼α,δ c2 : Ψ ⇒ Φ

where c1 and c2 are probabilistic imperative programs, α ≥ 1, 0 ≤ δ ≤ 1 and Ψ and Φ are relations over states. As for its predecessor pRHL [5], the notion of valid judgment rests on a lifting operator that turns a relation R over states into a relation ∼α,δ R over distributions of states: formally, the judgment above is valid iff for every pair of memories m1 and m2, m1 Ψ m2 implies (⟦c1⟧ m1) ∼α,δ Φ (⟦c2⟧ m2). The definition of the lifting operator originates from probabilistic process algebra [15], and has close connections with flow networks and the Kantorovich metric [11].
apRHL judgments characterize differential privacy, in the sense that c is (ε, δ)-differentially private iff the apRHL judgment c ∼e^ε,δ c : Ψ ⇒ ≡ is valid, where Ψ is a logical characterization of adjacency – for instance, two lists of the same length are adjacent if they differ in a single element.

Problem Statement and Contributions. The goal of this paper is to lay the theoreti-
cal foundations for tool-supported reasoning about f -divergences between probabilistic
computations. To achieve this goal, we start from [6] and take the following steps:
1. as a preliminary observation, we prove that the notion of α-distance used to char-
acterize differential privacy is in fact an f -divergence;
2. we define a notion of composability of f -divergences and generalize the sequential
composition theorem of differential privacy to composable divergences;
3. we generalize the notion of lifting used in apRHL to composable f -divergences;
4. we define f pRHL, a probabilistic relational Hoare logic for f -divergences, and
prove its soundness.

Related Work. The problem of computing the distance between two probabilistic com-
putations has been addressed in different areas of computer science, including machine
learning, stochastic systems, and security. We briefly point to some recent develop-
ments.
Methods for computing the distance between probabilistic automata have been stud-
ied by Cortes and co-authors [8,9]; their work, which is motivated by machine-learning
applications, considers the Kullback-Leibler divergence as well as the Lp distance.
Approximate bisimulation for probabilistic automata has been studied, among oth-
ers, by Segala and Turrini [23] and by Tracol, Desharnais and Zhioua [25]. The sur-
vey [1] provides a more extensive account of the field.
Composition Theorems and Relational Logic for f -divergences 51

In the field of security, approximate probabilistic bisimulation is closely connected


to quantitative information flow of probabilistic computations, which has been stud-
ied e.g. by Di Pierro, Hankin and Wiklicky [20]. More recently, the connections be-
tween quantitative information flow and differential privacy have been explored e.g.
by Barthe and Köpf [4], and by Alvim, Andrés, Chatzikokolakis and Palamidessi [3].
Moreover, several language-based methods have been developed for guaranteeing dif-
ferential privacy; these methods are based on runtime verification, such as PINQ [17]
or Airavat [22], type systems [21,14], or deductive verification [7]. We refer to [19] for
a survey of programming languages methods for differential privacy.

2 Mathematical Preliminaries

In this section we review the representation of distributions used in our development and recall the definition of f-divergences.

2.1 Probability Distributions

Throughout the presentation we consider distributions and sub-distributions over discrete sets only. A probability distribution (resp. sub-distribution) over a set A is an object μ : A → [0, 1] such that Σ_{a∈A} μ(a) = 1 (resp. Σ_{a∈A} μ(a) ≤ 1). We let D(A) (resp. D≤1(A)) be the set of distributions (resp. sub-distributions) over A.
Distributions are closed under convex combinations: given distributions (μi)_{i∈N} in D(A) and weights (wi)_{i∈N} such that Σ_{i∈N} wi = 1 and wi ≥ 0 for all i ∈ N, the convex combination Σ_{i∈N} wi μi is also a distribution over A. Thus, given μ ∈ D(A) and M : A → D(B), we define the distribution bind μ M over B as Σ_{a∈A} μ(a) M(a). Likewise, sub-distributions are closed under convex combinations.
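For finitely supported distributions, bind can be written out directly (a sketch; names are ours):

```python
# bind for finitely supported distributions (a sketch):
# (bind mu M)(b) = sum_{a in A} mu(a) * M(a)(b)
def bind(mu, M):
    out = {}
    for a, p in mu.items():
        for b, q in M(a).items():
            out[b] = out.get(b, 0.0) + p * q
    return out

# a fair coin followed by a second step whose bias depends on the first outcome
coin = {"h": 0.5, "t": 0.5}
step = lambda a: {"win": 0.9, "lose": 0.1} if a == "h" else {"win": 0.2, "lose": 0.8}
d = bind(coin, step)
assert abs(d["win"] - 0.55) < 1e-9 and abs(sum(d.values()) - 1.0) < 1e-9
```

The result is again a distribution because bind is exactly the convex combination of the distributions M(a) with weights μ(a).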

2.2 f -divergences

Let F be the set of non-negative convex functions f : R₀⁺ → R₀⁺ such that f is continuous at 0 and f(1) = 0. Then each function in F induces a notion of distance between probability distributions as follows:
Definition 1 (f-divergence). Given f ∈ F, the f-divergence Δf(μ1, μ2) between two distributions μ1 and μ2 in D(A) is defined as:

    Δf(μ1, μ2) =def Σ_{a∈A} μ2(a) · f(μ1(a)/μ2(a))

The definition adopts the following conventions, which are used consistently throughout the paper:

    0 · f(0/0) = 0    and    0 · f(t/0) = t · lim_{x→0⁺} x · f(1/x)   if t > 0

Moreover, if Δf(μ1, μ2) ≤ δ we say that μ1 and μ2 are (f, δ)-close.


52 G. Barthe and F. Olmedo

f-divergence           f                           Simplified form
Statistical distance   SD(t) = ½ |t − 1|           ½ Σ_{a∈A} |μ1(a) − μ2(a)|
Kullback-Leibler¹      KL(t) = t ln(t) − t + 1     Σ_{a∈A} μ1(a) ln(μ1(a)/μ2(a))
Hellinger distance     HD(t) = ½ (√t − 1)²         ½ Σ_{a∈A} (√μ1(a) − √μ2(a))²

Fig. 1. Examples of f-divergences
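As a concrete illustration (our sketch, not part of the paper), Δf over finite distributions can be implemented directly, with the three generators of Figure 1; the zero-denominator conventions follow the definition above, with the limit lim_{x→0⁺} x·f(1/x) supplied explicitly per generator:

```python
import math

def f_divergence(f, limit_coeff, mu1, mu2):
    """Delta_f(mu1, mu2) = sum_a mu2(a) * f(mu1(a)/mu2(a)), with the
    conventions 0*f(0/0) = 0 and 0*f(t/0) = t * lim_{x->0+} x*f(1/x)
    for t > 0 (the limit is passed in as `limit_coeff`)."""
    total = 0.0
    for a in set(mu1) | set(mu2):
        p, q = mu1.get(a, 0.0), mu2.get(a, 0.0)
        if q > 0:
            total += q * f(p / q)
        elif p > 0:                       # q = 0 and p > 0
            total += p * limit_coeff
    return total

SD = lambda t: 0.5 * abs(t - 1)                            # limit coefficient 1/2
KL = lambda t: t * math.log(t) - t + 1 if t > 0 else 1.0   # limit is +infinity
HD = lambda t: 0.5 * (math.sqrt(t) - 1) ** 2               # limit coefficient 1/2

mu1 = {'a': 0.5, 'b': 0.5}
mu2 = {'a': 0.9, 'b': 0.1}
sd = f_divergence(SD, 0.5, mu1, mu2)   # matches 1/2 * sum_a |mu1(a) - mu2(a)|
```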

When defining f -divergences one usually allows f to take positive as well as negative
values in R. For technical reasons, however, we consider only non-negative functions.
We now show that we can adopt this restriction without loss of generality.
Proposition 1. Let F′ be defined as F, except that we allow f ∈ F′ to take negative values. Then for every f ∈ F′ there exists g ∈ F, given by g(t) = f(t) − f′₋(1)(t − 1), such that Δf = Δg. (Here f′₋ denotes the left derivative of f, whose existence is guaranteed by the convexity of f.)
The class of f-divergences includes several popular instances, among them statistical distance, relative entropy (also known as Kullback-Leibler divergence), and Hellinger distance. In Figure 1 we summarize the convex function used to define each of them, together with a simplified form that is useful for computing the divergence. (For functions that take negative values, we first apply the transformation of Proposition 1, so as to remain consistent with our definition of f-divergences.)
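A quick numerical sanity check of Proposition 1 (our illustration): take the possibly-negative generator f(t) = t ln t, whose left derivative at 1 equals 1, so g(t) = t ln t − (t − 1) is the non-negative KL generator of Figure 1. The two induce the same divergence because Σ_a μ2(a)(μ1(a)/μ2(a) − 1) = Σ_a (μ1(a) − μ2(a)) = 0.

```python
import math

def divergence(f, mu1, mu2):
    # Assumes mu2 has full support, to keep the check short.
    return sum(mu2[a] * f(mu1[a] / mu2[a]) for a in mu2)

f = lambda t: t * math.log(t)             # may be negative (an element of F')
g = lambda t: t * math.log(t) - (t - 1)   # g(t) = f(t) - f'_-(1)*(t - 1), non-negative

mu1 = {'x': 0.2, 'y': 0.3, 'z': 0.5}
mu2 = {'x': 0.4, 'y': 0.4, 'z': 0.2}
assert abs(divergence(f, mu1, mu2) - divergence(g, mu1, mu2)) < 1e-12
```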
In general, Δf does not define a metric. The symmetry axiom might be violated and
the triangle inequality holds only if f equals a non-negative multiple of the statistical
distance. The identity of indiscernibles does not hold in general, either.

3 A Sequential Composition Theorem for f -divergences


In this section we show that the notion of α-distance used to capture differential privacy
is an f -divergence. Then we define the composition of f -divergences and show that the
sequential composition theorem of differential privacy generalizes to this setting.

3.1 An f -divergence for Approximate Differential Privacy


In [6] we introduced the concept of α-distance to succinctly capture the notion of differentially private computations. Given α ≥ 1, the α-distance between distributions μ1 and μ2 in D(A) is defined as

    Δα(μ1, μ2) =def max_{E⊆A} dα(μ1(E), μ2(E))

¹ Rigorously speaking, the function used for defining the Kullback-Leibler divergence should be given by f(t) = t ln(t) − t + 1 if t > 0 and f(t) = 1 if t = 0, to guarantee its continuity at 0.

where dα(a, b) =def max{a − αb, 0}. (This definition slightly departs from that of [6], in the sense that we consider an asymmetric version of the α-distance; the original, symmetric version corresponds to taking dα(a, b) =def max{a − αb, b − αa, 0}.) We can now recast the definition of differential privacy in terms of the α-distance and say that a randomized computation c is (ε, δ)-differentially private iff Δ_{e^ε}(c(x), c(y)) ≤ δ for any two adjacent inputs x and y.
Our composition result for f-divergences builds on the observation that the α-distance is an instance of the class of f-divergences.
Proposition 2. For every α ≥ 1, the α-distance Δα(μ1, μ2) coincides with the f-divergence Δ_{ADα}(μ1, μ2) associated to the function ADα(t) =def max{t − α, 0}.
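Proposition 2 can be checked by brute force on small finite sets (our sketch): enumerate all events E ⊆ A for the max-based definition and compare with the f-divergence form for ADα(t) = max{t − α, 0}.

```python
from itertools import chain, combinations

def alpha_distance_events(mu1, mu2, alpha):
    """max over all events E of max(mu1(E) - alpha*mu2(E), 0)."""
    elems = list(mu1)
    events = chain.from_iterable(combinations(elems, r) for r in range(len(elems) + 1))
    return max(max(sum(mu1[a] for a in E) - alpha * sum(mu2[a] for a in E), 0.0)
               for E in events)

def alpha_distance_fdiv(mu1, mu2, alpha):
    """f-divergence form: sum_a mu2(a) * max(mu1(a)/mu2(a) - alpha, 0).
    Assumes full support of mu2 so the zero-denominator convention is not needed."""
    return sum(q * max(mu1[a] / q - alpha, 0.0) for a, q in mu2.items() if q > 0)

mu1 = {'a': 0.6, 'b': 0.3, 'c': 0.1}
mu2 = {'a': 0.2, 'b': 0.3, 'c': 0.5}
assert abs(alpha_distance_events(mu1, mu2, 1.5) -
           alpha_distance_fdiv(mu1, mu2, 1.5)) < 1e-12
```

For α = 1 the α-distance collapses to Σ_a max(μ1(a) − μ2(a), 0), which for two distributions equals the statistical distance.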

3.2 Composition
One key property of f -divergences is a monotonicity result referred to as the data pro-
cessing inequality [18]. In our setting, it is captured by the following proposition:
Proposition 3. Let μ1 ,μ2 ∈ D(A), M : A → D(B) and f ∈ F . Then
Δf (bind μ1 M, bind μ2 M ) ≤ Δf (μ1 , μ2 )
In comparison, the sequential composition theorem for differential privacy [16] is captured by the following theorem.
Theorem 1. Let μ1, μ2 ∈ D(A), M1, M2 : A → D(B) and α, α′ ≥ 1. Then

    Δ_{αα′}(bind μ1 M1, bind μ2 M2) ≤ Δα(μ1, μ2) + max_a Δ_{α′}(M1(a), M2(a))

Note that the data processing inequality for α-distance corresponds to the composition
theorem for the degenerate case where M1 and M2 are equal. The goal of this paragraph
is to generalize the sequential composition theorem to f -divergences. To this end, we
first define a notion of composability between f -divergences.
Definition 2 (f-divergence composability). Let f1, f2, f3 ∈ F. We say that (f1, f2) is f3-composable iff for all μ1, μ2 ∈ D(A) and M1, M2 : A → D(B), there exists μ3 ∈ D(A) such that

    Δ_{f3}(bind μ1 M1, bind μ2 M2) ≤ Δ_{f1}(μ1, μ2) + Σ_{a∈A} μ3(a) Δ_{f2}(M1(a), M2(a))

Our notion of composability is connected to the notion of additive information measures from [13, Ch. 5]. To justify the connection, we first present an adaptation of their definition to our setting.
Definition 3 (f-divergence additivity). Let f1, f2, f3 ∈ F. We say that (f1, f2) is f3-additive iff for all distributions μ1, μ2 ∈ D(A) and μ′1, μ′2 ∈ D(B),

    Δ_{f3}(μ1 × μ′1, μ2 × μ′2) ≤ Δ_{f1}(μ1, μ2) + Δ_{f2}(μ′1, μ′2)

Here, μ × μ′ denotes the product distribution of μ and μ′, i.e. (μ × μ′)(a, b) =def μ(a) μ′(b).
It is easily seen that composability entails additivity.

Proposition 4. Let f1, f2, f3 ∈ F such that (f1, f2) is f3-composable. Then (f1, f2) is f3-additive.

The f-divergences from Figure 1 behave well under composition. The statistical distance, the Hellinger distance and the Kullback-Leibler divergence are each composable with respect to themselves. Moreover, the divergences ADα are composable.

Proposition 5
• (SD, SD) is SD-composable;
• (KL, KL) is KL-composable;
• (HD, HD) is HD-composable;
• (ADα1 , ADα2 ) is ADα1 α2 -composable for every α1 , α2 ≥ 1.

The sequential composition theorem of differential privacy extends naturally to the class
of composable divergences.
Theorem 2. Let f1, f2, f3 ∈ F. If (f1, f2) is f3-composable, then for all μ1, μ2 ∈ D(A) and all M1, M2 : A → D(B),

    Δ_{f3}(bind μ1 M1, bind μ2 M2) ≤ Δ_{f1}(μ1, μ2) + max_a Δ_{f2}(M1(a), M2(a))

Theorem 2 will be the cornerstone for deriving the sequential composition rule of fpRHL. (As an intermediate step, we first show that the composition result extends to relation liftings.)
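As an illustration (our example) of Theorem 2 instantiated with the fourth item of Proposition 5, sequentially composing two differentially private steps multiplies the α's, i.e. adds the ε's, while the bounds add:

```python
def bind(mu, M):
    """bind of a finite distribution with a kernel, as in Section 2.1."""
    out = {}
    for a, p in mu.items():
        for b, q in M(a).items():
            out[b] = out.get(b, 0.0) + p * q
    return out

def ad(mu1, mu2, alpha):
    """alpha-distance, i.e. the AD_alpha f-divergence (full-support mu2 assumed)."""
    return sum(q * max(mu1.get(a, 0.0) / q - alpha, 0.0) for a, q in mu2.items())

mu1, mu2 = {0: 0.7, 1: 0.3}, {0: 0.4, 1: 0.6}
M1 = lambda a: {a: 0.8, 1 - a: 0.2}
M2 = lambda a: {a: 0.6, 1 - a: 0.4}
a1, a2 = 2.0, 1.5

# Theorem 2 with (AD_a1, AD_a2) being AD_{a1*a2}-composable:
lhs = ad(bind(mu1, M1), bind(mu2, M2), a1 * a2)
rhs = ad(mu1, mu2, a1) + max(ad(M1(a), M2(a), a2) for a in (0, 1))
assert lhs <= rhs + 1e-12
```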

4 Lifting
The definition of a valid apRHL judgment rests on the notion of lifting. As a last step before defining our relational logic, we extend the notion of lifting to f-divergences. One key difference between our definition and that of [6] is that the former uses two witnesses, rather than one. In the remainder, we let supp(μ) denote the set of elements a ∈ A such that μ(a) > 0. Moreover, given μ ∈ D(A × B), we define π1(μ) and π2(μ) by the clauses π1(μ)(a) = Σ_{b∈B} μ(a, b) and π2(μ)(b) = Σ_{a∈A} μ(a, b).
Definition 4 (Lifting). Let f ∈ F and δ ∈ R₀⁺. The (f, δ)-lifting ∼^{f,δ}_R of a relation R ⊆ A × B is defined as follows: given μ1 ∈ D(A) and μ2 ∈ D(B), μ1 ∼^{f,δ}_R μ2 iff there exist μL, μR ∈ D(A × B) such that: i) supp(μL) ⊆ R; ii) supp(μR) ⊆ R; iii) π1(μL) = μ1; iv) π2(μR) = μ2; and v) Δf(μL, μR) ≤ δ. The distributions μL and μR are called the left and right witnesses for the lifting, respectively.
A pleasing consequence of our definition is that the witnesses relating two distributions are themselves distributions, rather than sub-distributions; this is in contrast with our earlier definition from [6], where witnesses for the equality relation are necessarily sub-distributions. Moreover, our definition is logically equivalent to the original one from [15], provided δ = 0 and f satisfies the identity of indiscernibles. In the case of statistical distance and α-distance, our definition also has a precise mathematical relationship with (an asymmetric variant of) the lifting used in [6].

Proposition 6. Let α ≥ 1, μ1 ∈ D(A) and μ2 ∈ D(B). If μ1 ∼^{ADα,δ}_R μ2 then there exists a sub-distribution μ ∈ D≤1(A × B) such that: i) supp(μ) ⊆ R; ii) π1(μ) ≤ μ1; iii) π2(μ) ≤ μ2; and iv) Δα(μ1, π1(μ)) ≤ δ, where ≤ denotes the natural pointwise order on the space of sub-distributions, i.e. μ ≤ μ′ iff μ(a) ≤ μ′(a) for all a.
We briefly review some key properties of liftings. The first result characterizes liftings
over equivalence relations, and will be used to show that f -divergences can be charac-
terized by our logic.
Proposition 7 (Lifting of equivalence relations). Let R be an equivalence relation over A and let μ1, μ2 ∈ D(A). Then,

    μ1 ∼^{f,δ}_R μ2 ⟺ Δf(μ1/R, μ2/R) ≤ δ,

where μ/R is the distribution over the quotient set A/R defined by (μ/R)([a]) =def μ([a]). In particular, if R is the equality relation ≡, we have

    μ1 ∼^{f,δ}_≡ μ2 ⟺ Δf(μ1, μ2) ≤ δ
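For equivalence relations, the quotient construction of Proposition 7 is easy to compute (our sketch); here R relates elements of equal parity, and quotienting can only shrink the statistical distance, an instance of the data processing inequality:

```python
def quotient(mu, cls):
    """Push mu forward to the quotient set: (mu/R)([a]) = mu([a]),
    where cls maps an element to (a representative of) its class."""
    out = {}
    for a, p in mu.items():
        c = cls(a)
        out[c] = out.get(c, 0.0) + p
    return out

def sd(mu1, mu2):
    """Statistical distance between two finite distributions."""
    keys = set(mu1) | set(mu2)
    return 0.5 * sum(abs(mu1.get(k, 0.0) - mu2.get(k, 0.0)) for k in keys)

mu1 = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}
mu2 = {0: 0.3, 1: 0.2, 2: 0.1, 3: 0.4}
parity = lambda a: a % 2   # R relates elements with the same parity

# By Proposition 7, mu1 ~^{SD,delta}_R mu2 holds exactly when
# SD(mu1/R, mu2/R) <= delta.
d_quot = sd(quotient(mu1, parity), quotient(mu2, parity))
assert d_quot <= sd(mu1, mu2) + 1e-12
```

In this example the two quotients coincide, so the lifting holds even for δ = 0 although SD(μ1, μ2) = 0.4.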

Our next result allows deriving probability claims from lifting judgments. Given
R ⊆ A × B we say that the subsets A0 ⊆ A and B0 ⊆ B are R-equivalent, and
write A0 =R B0 , iff for every a ∈ A and b ∈ B, a R b implies a ∈ A0 ⇐⇒ b ∈ B0 .
Proposition 8 (Fundamental property of lifting). Let μ1 ∈ D(A), μ2 ∈ D(B), and R ⊆ A × B. Then, for any two events A0 ⊆ A and B0 ⊆ B,

    μ1 ∼^{f,δ}_R μ2 ∧ A0 =R B0 ⟹ μ2(B0) · f(μ1(A0)/μ2(B0)) ≤ δ
Our final result generalizes the sequential composition theorem from the previous sec-
tion to arbitrary liftings.
Proposition 9 (Lifting composition). Let f1, f2, f3 ∈ F such that (f1, f2) is f3-composable. Moreover let μ1 ∈ D(A), μ2 ∈ D(B), M1 : A → D(A′) and M2 : B → D(B′). If μ1 ∼^{f1,δ1}_{R1} μ2 and M1(a) ∼^{f2,δ2}_{R2} M2(b) for all a and b such that a R1 b, then

    (bind μ1 M1) ∼^{f3,δ1+δ2}_{R2} (bind μ2 M2)

5 A Relational Logic for f -divergences


Building on the results of the previous section, we define a relational logic, called fpRHL, for proving upper bounds on the f-divergence between probabilistic computations written in a simple imperative language.

5.1 Programming Language


We consider programs written in a probabilistic imperative language pWHILE. The syntax of the language is defined inductively as follows:

C ::= skip                       nop
    | V ← E                      deterministic assignment
    | V ←$ DE                    random assignment
    | if E then C else C         conditional
    | while E do C               while loop
    | C; C                       sequence
Here V is a set of variables, E is a set of deterministic expressions, and DE is a set of expressions denoting distributions from which values are sampled in random assignments. Program states, or memories, are mappings from variables to values. More precisely, memories map a variable v of type T to a value in its interpretation ⟦T⟧. We use M to denote the set of memories. Programs are interpreted as functions from initial memories to sub-distributions over memories. The semantics, given in Figure 2, is based on two evaluation functions ⟦·⟧_E and ⟦·⟧_DE for expressions and distribution expressions; these functions respectively map memories to values and memories to sub-distributions over values. Moreover, the definition uses the operator unit, which maps every a ∈ A to the unique distribution over A assigning probability 1 to a and probability 0 to every other element of A, and the null sub-distribution μ0, which assigns probability 0 to all elements of A. Note that the semantics of a program is a map from memories to sub-distributions over memories; sub-distributions, rather than distributions, are used to model probabilistic non-termination. However, for the sake of simplicity, in the current development of the logic we only consider programs that terminate with probability 1 on all inputs, and leave the general case for future work.

⟦skip⟧ m = unit m
⟦c; c′⟧ m = bind (⟦c⟧ m) ⟦c′⟧
⟦x ← e⟧ m = unit (m{⟦e⟧_E m / x})
⟦x ←$ μ⟧ m = bind (⟦μ⟧_DE m) (λv. unit (m{v/x}))
⟦if e then c1 else c2⟧ m = if (⟦e⟧_E m = true) then (⟦c1⟧ m) else (⟦c2⟧ m)
⟦while e do c⟧ m = λf. sup_{n∈N} (⟦[while e do c]_n⟧ m f)
where   ⟦[while e do c]_0⟧ m = if (⟦e⟧_E m = true) then μ0 else (unit m)
        [while e do c]_{n+1} = if e then (c; [while e do c]_n) else skip

Fig. 2. Semantics of programs
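For terminating programs, the denotational semantics of Figure 2 can be prototyped directly (our illustrative sketch; the combinator names are ours). Memories are immutable tuples of variable bindings, and the meaning of a command is a function from a memory to a finite distribution over memories:

```python
def unit(m):
    return {m: 1.0}

def bind(dist, k):
    out = {}
    for m, p in dist.items():
        for m2, q in k(m).items():
            out[m2] = out.get(m2, 0.0) + p * q
    return out

def upd(m, x, v):
    """Memory update m{v/x}; memories are sorted tuples of (var, value)."""
    d = dict(m); d[x] = v
    return tuple(sorted(d.items()))

# Each combinator mirrors one clause of Figure 2; loops are unrolled, so
# we assume (here: guarantee, via a cap) termination.
def skip(m): return unit(m)
def assign(x, e):  return lambda m: unit(upd(m, x, e(dict(m))))
def rand(x, mu):   return lambda m: {upd(m, x, v): p for v, p in mu(dict(m)).items()}
def seq(c1, c2):   return lambda m: bind(c1(m), c2)
def cond(b, c1, c2): return lambda m: c1(m) if b(dict(m)) else c2(m)
def while_(b, c):
    def run(m):
        return bind(c(m), run) if b(dict(m)) else unit(m)
    return run

# Example: flip a fair coin until it lands 0, counting the flips (capped at 3).
prog = seq(
    assign('n', lambda m: 0),
    seq(rand('coin', lambda m: {0: 0.5, 1: 0.5}),
        while_(lambda m: m['coin'] == 1 and m['n'] < 3,
               seq(assign('n', lambda m: m['n'] + 1),
                   rand('coin', lambda m: {0: 0.5, 1: 0.5})))))

m0 = (('coin', 0), ('n', 0))
out = prog(m0)   # a distribution over final memories; total mass 1
```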

5.2 Judgments

fpRHL judgments are of the form c1 ∼f,δ c2 : Ψ ⇒ Φ, where c1 and c2 are programs, Ψ and Φ are relational assertions, f ∈ F and δ ∈ R₀⁺. Relational assertions are first-order formulae over generalized expressions, i.e. expressions in which variables are tagged with ⟨1⟩ or ⟨2⟩. Relational assertions are interpreted as formulae over pairs of memories, and the tag on a variable indicates whether its interpretation should be taken in the first or the second memory. For instance, the relational assertion x⟨1⟩ = x⟨2⟩ states that the values of x coincide in the first and second memories. More generally, we use ≡ to denote the relational assertion stating that the values of all variables coincide in the first and second memories.
An f pRHL judgment is valid iff for every pair of memories related by the pre-
condition Ψ , the corresponding pair of output distributions is related by the (f, δ)-lifting
of the post-condition Φ.
Definition 5 (Validity in fpRHL). A judgment c1 ∼f,δ c2 : Ψ ⇒ Φ is valid, written ⊨ c1 ∼f,δ c2 : Ψ ⇒ Φ, iff

    ∀m1, m2 • m1 Ψ m2 ⟹ (⟦c1⟧ m1) ∼^{f,δ}_Φ (⟦c2⟧ m2)

fpRHL judgments provide a characterization of f-divergence. Concretely, judgments with the identity relation as post-condition can be used to derive (f, δ)-closeness results.

Proposition 10. If ⊨ c1 ∼f,δ c2 : Ψ ⇒ ≡, then for all memories m1, m2,

    m1 Ψ m2 ⟹ Δf(⟦c1⟧ m1, ⟦c2⟧ m2) ≤ δ

Moreover, fpRHL characterizes continuity properties of probabilistic programs. We assume a continuity model in which programs are executed on random inputs, i.e. distributions of initial memories, and we use f-divergences as metrics to compare program inputs and outputs.

Proposition 11. Let f1, f2, f3 ∈ F such that (f1, f2) is f3-composable. If ⊨ c1 ∼f2,δ2 c2 : ≡ ⇒ ≡, then for any two distributions of initial memories μ1 and μ2,

    Δ_{f1}(μ1, μ2) ≤ δ1 ⟹ Δ_{f3}(bind μ1 ⟦c1⟧, bind μ2 ⟦c2⟧) ≤ δ1 + δ2

Finally, we can use judgments with arbitrary post-conditions to relate the probabilities of individual events in two programs. This is used, e.g., in the context of game-based cryptographic proofs.

Proposition 12. If ⊨ c1 ∼f,δ c2 : Ψ ⇒ Φ, then for all memories m1, m2 and events E1, E2,

    m1 Ψ m2 ∧ E1 =Φ E2 ⟹ (⟦c2⟧ m2)(E2) · f((⟦c1⟧ m1)(E1) / (⟦c2⟧ m2)(E2)) ≤ δ

5.3 Proof System


Figure 3 presents a set of core rules for reasoning about the validity of an fpRHL judgment. All the rules are transpositions of rules from apRHL [6]. However, fpRHL rules do not directly generalize their apRHL counterparts: both logics admit symmetric and asymmetric versions, but apRHL and fpRHL are opposite variants; fpRHL is asymmetric whereas apRHL is symmetric. We refer to Section 5.4 for a discussion of the symmetric version of fpRHL.

[assn]
  ∀m1, m2 • m1 Ψ m2 ⟹ (m1{⟦e1⟧ m1/x1}) Φ (m2{⟦e2⟧ m2/x2})
  ─────────────────────────────────────────────────────────
  x1 ← e1 ∼f,0 x2 ← e2 : Ψ ⇒ Φ

[rand]
  ∀m1, m2 • m1 Ψ m2 ⟹ Δf(⟦μ1⟧_DE m1, ⟦μ2⟧_DE m2) ≤ δ
  ────────────────────────────────────────────────────
  x1 ←$ μ1 ∼f,δ x2 ←$ μ2 : Ψ ⇒ x1⟨1⟩ = x2⟨2⟩

[cond]
  Ψ ⟹ b⟨1⟩ ≡ b′⟨2⟩    c1 ∼f,δ c′1 : Ψ ∧ b⟨1⟩ ⇒ Φ    c2 ∼f,δ c′2 : Ψ ∧ ¬b⟨1⟩ ⇒ Φ
  ───────────────────────────────────────────────────────────────────────────────
  if b then c1 else c2 ∼f,δ if b′ then c′1 else c′2 : Ψ ⇒ Φ

[while]
  (f1, …, fn) composable and monotonic    Θ =def b⟨1⟩ ≡ b′⟨2⟩    Ψ ∧ e⟨1⟩ ≤ 0 ⟹ ¬b⟨1⟩
  c ∼f1,δ c′ : Ψ ∧ b⟨1⟩ ∧ b′⟨2⟩ ∧ e⟨1⟩ = k ⇒ Ψ ∧ Θ ∧ e⟨1⟩ < k
  ──────────────────────────────────────────────────────────────────
  while b do c ∼fn,nδ while b′ do c′ : Ψ ∧ Θ ∧ e⟨1⟩ ≤ n ⇒ Ψ ∧ ¬b⟨1⟩ ∧ ¬b′⟨2⟩

[skip]
  ─────────────────────────
  skip ∼f,0 skip : Ψ ⇒ Ψ

[seq]
  (f1, f2) is f3-composable    c1 ∼f1,δ1 c2 : Ψ ⇒ Φ′    c′1 ∼f2,δ2 c′2 : Φ′ ⇒ Φ
  ──────────────────────────────────────────────────────────────────────────────
  c1; c′1 ∼f3,δ1+δ2 c2; c′2 : Ψ ⇒ Φ

[case]
  c1 ∼f,δ c2 : Ψ ∧ Θ ⇒ Φ    c1 ∼f,δ c2 : Ψ ∧ ¬Θ ⇒ Φ
  ───────────────────────────────────────────────────
  c1 ∼f,δ c2 : Ψ ⇒ Φ

[weak]
  c1 ∼f′,δ′ c2 : Ψ′ ⇒ Φ′    Ψ ⟹ Ψ′    Φ′ ⟹ Φ    f ≤ f′    δ′ ≤ δ
  ─────────────────────────────────────────────────────────────────
  c1 ∼f,δ c2 : Ψ ⇒ Φ

Fig. 3. Core proof rules

We briefly describe the main rules, and refer the reader to [6] for a longer description of each of them. Rule [seq] relates two sequential compositions and is a direct consequence of lifting composition (see Proposition 9). Rule [while] relates two loops that terminate in lockstep. The bound depends on the maximal number of iterations of the loops; we assume given a loop variant e that decreases at each iteration and is initially bounded above by some constant n. We briefly explain the side conditions: (f1, …, fn) is composable iff (fi, f1) is fi+1-composable for every 1 ≤ i < n, and (f1, …, fn) is monotonic iff fi ≤ fi+1 for 1 ≤ i < n. Note that the rule is given for n ≥ 2; specialized rules exist for n = 0 and n = 1. The rule readily specializes to reasoning about (ε, δ)-differential privacy by taking fi = AD_{α^i}, where α = e^ε.
If an fpRHL judgment is derivable using the rules of Figure 3, then it is valid. Formally,

Proposition 13 (Soundness). If ⊢ c1 ∼f,δ c2 : Ψ ⇒ Φ then ⊨ c1 ∼f,δ c2 : Ψ ⇒ Φ.

5.4 Symmetric Logic

One can also define a symmetric version of the logic by adding, as an additional clause in the definition of the lifting relation, that Δf(μR, μL) ≤ δ. An instance of this logic is the symmetric apRHL logic from [6]. All rules remain unchanged, except for the random sampling rule, which now requires the additional inequality to be checked in its premise.

6 Conclusion
This paper makes two contributions: first, it unveils a connection between differential privacy and f-divergences; second, it lays the foundations for reasoning about f-divergences between randomized computations. As future work, we intend to implement support for fpRHL in EasyCrypt [4] and to formalize the results from [24]. We also intend to investigate the connection between our notion of lifting and flow networks.

Acknowledgments. This work was partially funded by the European Projects FP7-
256980 NESSoS and FP7-229599 AMAROUT, Spanish project TIN2009-14599
DESAFIOS 10 and Madrid Regional project S2009TIC-1465 PROMETIDOS.

References
1. Abate, A.: Approximation metrics based on probabilistic bisimulations for general state-space Markov processes: a survey. Electronic Notes in Theoretical Computer Science (2012) (in print)
2. Ali, S.M., Silvey, S.D.: A general class of coefficients of divergence of one distribution from
another. Journal of the Royal Statistical Society. Series B (Methodological) 28(1), 131–142
(1966)
3. Alvim, M.S., Andrés, M.E., Chatzikokolakis, K., Palamidessi, C.: On the relation between
differential privacy and Quantitative Information Flow. In: Aceto, L., Henzinger, M., Sgall,
J. (eds.) ICALP 2011, Part II. LNCS, vol. 6756, pp. 60–76. Springer, Heidelberg (2011)
4. Barthe, G., Grégoire, B., Heraud, S., Béguelin, S.Z.: Computer-aided security proofs for the
working cryptographer. In: Rogaway, P. (ed.) CRYPTO 2011. LNCS, vol. 6841, pp. 71–90.
Springer, Heidelberg (2011)
5. Barthe, G., Grégoire, B., Zanella-Béguelin, S.: Formal certification of code-based crypto-
graphic proofs. In: 36th ACM SIGPLAN-SIGACT Symposium on Principles of Program-
ming Languages, POPL 2009, pp. 90–101. ACM, New York (2009)
6. Barthe, G., Köpf, B., Olmedo, F., Zanella-Béguelin, S.: Probabilistic relational reasoning for
differential privacy. In: 39th ACM SIGPLAN-SIGACT Symposium on Principles of Pro-
gramming Languages, POPL 2012, pp. 97–110. ACM, New York (2012)
7. Chaudhuri, S., Gulwani, S., Lublinerman, R., Navidpour, S.: Proving programs robust. In: 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering and 13th European Software Engineering Conference, ESEC/FSE 2011, pp. 102–112. ACM, New York (2011)
8. Cortes, C., Mohri, M., Rastogi, A.: Lp distance and equivalence of probabilistic automata.
Int. J. Found. Comput. Sci. 18(4), 761–779 (2007)
9. Cortes, C., Mohri, M., Rastogi, A., Riley, M.: On the computation of the relative entropy of
probabilistic automata. Int. J. Found. Comput. Sci. 19(1), 219–242 (2008)
10. Csiszár, I.: Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Publications of the Mathematical Institute of the Hungarian Academy of Science 8, 85–108 (1963)

11. Deng, Y., Du, W.: Logical, metric, and algorithmic characterisations of probabilistic bisimu-
lation. Tech. Rep. CMU-CS-11-110, Carnegie Mellon University (March 2011)
12. Dwork, C.: Differential privacy. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.)
ICALP 2006. LNCS, vol. 4052, pp. 1–12. Springer, Heidelberg (2006)
13. Ebanks, B., Sahoo, P., Sander, W.: Characterizations of Information Measures. World Scien-
tific (1998)
14. Gaboardi, M., Haeberlen, A., Hsu, J., Narayan, A., Pierce, B.C.: Linear dependent types for
differential privacy. In: 40th ACM SIGPLAN–SIGACT Symposium on Principles of Pro-
gramming Languages, POPL 2013, pp. 357–370. ACM, New York (2013)
15. Jonsson, B., Yi, W., Larsen, K.G.: Probabilistic extensions of process algebras. In: Bergstra,
J., Ponse, A., Smolka, S. (eds.) Handbook of Process Algebra, pp. 685–710. Elsevier, Ams-
terdam (2001)
16. McSherry, F.: Privacy integrated queries: an extensible platform for privacy-preserving data
analysis. Commun. ACM 53(9), 89–97 (2010)
17. McSherry, F.D.: Privacy integrated queries: an extensible platform for privacy-preserving
data analysis. In: 35th SIGMOD International Conference on Management of Data, SIG-
MOD 2009, pp. 19–30. ACM, New York (2009)
18. Pardo, M., Vajda, I.: About distances of discrete distributions satisfying the data processing
theorem of information theory. IEEE Transactions on Information Theory 43(4), 1288–1293
(1997)
19. Pierce, B.C.: Differential privacy in the programming languages community. Invited Tutorial
at DIMACS Workshop on Recent Work on Differential Privacy Across Computer Science
(2012)
20. Di Pierro, A., Hankin, C., Wiklicky, H.: Measuring the confinement of probabilistic systems.
Theor. Comput. Sci. 340(1), 3–56 (2005)
21. Reed, J., Pierce, B.C.: Distance makes the types grow stronger: a calculus for differential pri-
vacy. In: 15th ACM SIGPLAN International Conference on Functional programming, ICFP
2010, pp. 157–168. ACM, New York (2010)
22. Roy, I., Setty, S.T.V., Kilzer, A., Shmatikov, V., Witchel, E.: Airavat: security and privacy for
MapReduce. In: 7th USENIX Conference on Networked Systems Design and Implementa-
tion, NSDI 2010, pp. 297–312. USENIX Association, Berkeley (2010)
23. Segala, R., Turrini, A.: Approximated computationally bounded simulation relations for
probabilistic automata. In: 20th IEEE Computer Security Foundations Symposium, CSF
2007, pp. 140–156. IEEE Computer Society (2007)
24. Steinberger, J.: Improved security bounds for key-alternating ciphers via Hellinger distance. Cryptology ePrint Archive, Report 2012/481 (2012), http://eprint.iacr.org/
25. Tracol, M., Desharnais, J., Zhioua, A.: Computing distances between probabilistic automata.
In: Proceedings of QAPL. EPTCS, vol. 57, pp. 148–162 (2011)
A Maximal Entropy Stochastic Process for a Timed Automaton⋆
Nicolas Basset1,2
1
LIGM, University Paris-Est Marne-la-Vallée and CNRS, France
2
LIAFA, University Paris Diderot and CNRS, France
[email protected]

Abstract. Several ways of assigning probabilities to runs of timed automata (TA) have been proposed recently. When only the TA is given, a relevant question is to design a probability distribution which represents the runs of the TA as faithfully as possible. This question does not seem to have been studied yet. We give an answer to it using a maximal entropy approach. We introduce our variant of stochastic model, the stochastic process over runs, which permits simulating random runs of any given length with a linear number of atomic operations. We adapt the notion of Shannon (continuous) entropy to such processes. Our main contribution is an explicit formula defining a process Y* which maximizes the entropy. This formula is an adaptation of the so-called Shannon-Parry measure to the timed automata setting. The process Y* has the nice property of being ergodic. As a consequence it satisfies the asymptotic equipartition property, and thus random sampling w.r.t. Y* is quasi-uniform.

1 Introduction

Timed automata (TA) were introduced in the early 90's by Alur and Dill [4] and have since been extensively studied to model and verify the behaviours of real-time systems. In this context of verification, several probabilistic settings have been added to TA (see references below). There are several reasons to add probabilities: they permit (i) reflecting more faithfully physical systems that behave randomly, (ii) reducing the size of the model by pruning behaviours of null probability [8], and (iii) resolving nondeterminism when dealing with parallel composition [15,16].
In most previous works on the subject (see e.g. [10,2,11,15]), probability distributions on continuous and discrete transitions are given at the same time as the timed setting. In these works, the choice of the probability functions is left to the designer of the model. Instead, she or he may want to provide only the TA and ask the following question: what is the "best" choice of the probability functions for the given TA? Such a "best" choice must transform the

⋆ An extended version of the present paper containing detailed proofs and examples is available on-line: http://hal.archives-ouvertes.fr/hal-00808909.
⋆⋆ The support of Agence Nationale de la Recherche under the project EQINOCS (ANR-11-BS02-004) is gratefully acknowledged.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 61–73, 2013.

c Springer-Verlag Berlin Heidelberg 2013
62 N. Basset

TA into a random generator of runs that is as unbiased as possible, i.e. it should generate runs as uniformly as possible, so as to cover with high probability the maximum of behaviours of the modeled system. More precisely, the probability that a generated run falls in a given set should be proportional to the size (volume) of this set (see [16] for the same requirement in the context of job-shop scheduling). We formalize this question and propose an answer based on the notion of entropy of TA introduced in [6].
The theory developed by Shannon [21] and his followers permits solving the analogous problem of quasi-uniform path generation in a finite graph. This problem can be formulated as follows: given a finite graph G, how can one find a stationary Markov chain on G which generates the paths of G in the most uniform manner? The answer is in two steps (see Chapter 1.8 of [19] and also Section 13.3 of [18]): (i) there exists a stationary Markov chain on G with maximal entropy, the so-called Shannon-Parry Markov chain; (ii) this stationary Markov chain generates paths quasi-uniformly.
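For finite graphs, the Shannon-Parry construction is explicit: if A is the adjacency matrix with Perron eigenvalue ρ and right eigenvector v, the maximal entropy Markov chain has transition probabilities P(i → j) = A_ij · v_j / (ρ · v_i). A small Python sketch of this construction (ours, using plain power iteration instead of a linear-algebra library):

```python
def perron(A, iters=2000):
    """Power iteration: dominant eigenvalue and right eigenvector of a
    non-negative, strongly connected adjacency matrix A."""
    n = len(A)
    v = [1.0] * n
    rho = 1.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = max(w)
        v = [x / rho for x in w]   # keep the max entry at 1
    return rho, v

def shannon_parry(A):
    """Transition matrix of the maximal-entropy Markov chain on the graph A:
    P[i][j] = A[i][j] * v[j] / (rho * v[i])."""
    rho, v = perron(A)
    n = len(A)
    return [[A[i][j] * v[j] / (rho * v[i]) for j in range(n)] for i in range(n)]

# Graph where node 0 has two successors and node 1 has one (golden-ratio graph).
A = [[1, 1],
     [1, 0]]
P = shannon_parry(A)
# Each row of P sums to 1, since v is an eigenvector: sum_j A_ij v_j = rho v_i.
```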
In this article we lift this theory to the timed automata setting. We work with timed region graphs, which are to timed automata what finite directed graphs are to finite-state automata, i.e. automata without labels on edges and without initial and final states. We define stochastic processes over runs of timed region graphs (SPOR) and their (continuous) entropy. This generalization of Markov chains to TA has its own interest; to our knowledge it is the first one that provides a continuous probability distribution on starting states. Such a SPOR permits generating random runs step by step. As a main result we describe a maximal entropy SPOR which is stationary and ergodic and which generalizes the Shannon-Parry Markov chain to TA (Theorem 1). The concepts of maximal entropy, stationarity and ergodicity are interesting in themselves; here we use them as the key hypotheses to ensure quasi-uniform sampling (Theorem 2). More precisely, the result we prove is a variant of the so-called Shannon-McMillan-Breiman theorem, also known as the asymptotic equipartition property (AEP).
Potential Applications. There are two kinds of probabilistic model checking: (i) almost-sure model checking, which aims to decide whether a model satisfies a formula with probability one (e.g. [13,3]); (ii) quantitative (probabilistic) model checking (e.g. [11,15]), which aims to compare the probability that a formula is satisfied with some given threshold, or to estimate this probability directly.
A first expected application of our results would be "proportional" model checking. The inputs of the problem are a timed region graph G, a formula ϕ, and a threshold θ ∈ [0, 1]. The question is whether the proportion of runs of G satisfying ϕ is greater than θ. A recipe to address this problem would be as follows: (i) take as probabilistic model M the timed region graph G together with the maximal entropy SPOR Y* defined in our main theorem; (ii) run a quantitative (probabilistic) model checking algorithm on the inputs M, ϕ, θ (the output of the algorithm is yes or no according to whether M satisfies ϕ with probability greater than θ); (iii) return the same output for the proportional model checking problem.
A Maximal Entropy Stochastic Process for a Timed Automaton 63

A random simulation using a number of operations linear in the length of the run can be achieved with our probabilistic setting. It would be interesting to incorporate the simulation of our maximal entropy process into statistical model checking algorithms; indeed, random simulation is at the heart of this kind of quantitative model checking (see [15] and references therein).
The concepts handled in this article, such as stationary stochastic processes and their entropy, the AEP, etc., come from information and coding theory (see [14]). Our work can serve as a basis for the probabilistic counterpart of the timed channel coding theory we proposed in [5]. Another application in the same flavour would be a compression method for timed words accepted by a given deterministic TA.
Related Work. As mentioned above, this work generalizes the Shannon-Parry theory to the TA setting. To our knowledge, this is the first time a maximal entropy approach has been used in the context of quantitative analysis of real-time systems.
Our model of stochastic real-time systems can be related to numerous previous works. Almost-sure model checking for probabilistic real-time systems based on generalized semi-Markov processes (GSMPs) was presented in [3] at the same time as the timed automata theory, and by the same authors. This work was followed by [2,10], which address the problem of quantitative model checking for GSMPs under restricted hypotheses. GSMPs differ from TA in several respects; roughly, they behave as follows: in each location, clocks decrease until one clock reaches zero; at this moment an action corresponding to this clock is fired, and the other clocks are either reset, left unchanged, or simply cancelled. Our probability setting is more inspired by [8,13,15], where probability densities are added directly on the TA. Here we add the new feature of an initial probability density function on states.
In [15], a probability distribution on the runs of a network of priced timed automata is implicitly defined by a race between the components, each of them having its own probability. This allows simulating random runs in a nondeterministic structure without state-space explosion. There is no reason that the resulting probability approximates uniformity, and thus this approach is not directly comparable to our objective.
Our techniques are based on the pioneering articles [6,7] on the entropy of regular timed languages. In the latter article and in [5], an interpretation of the entropy of a timed language as an information measure of the language was given.

2 Stochastic Processes on Timed Region Graphs

2.1 Timed Graphs and Their Runs

In this section we define timed region graphs, which are the underlying structure of timed automata [4]. For technical reasons we consider only timed region graphs with bounded clocks; we justify this assumption in Section 3.1.
64 N. Basset

Timed Region Graphs. Let X be a finite set of variables called clocks. Clocks
have non-negative values bounded by a constant M . A rectangular constraint
has the form x ∼ c where ∼∈ {≤, <, =, >, ≥}, x ∈ X, c ∈ N. A diagonal
constraint has the form x − y ∼ c where x, y ∈ X. A guard is a finite conjunction
of rectangular constraints. A zone is a set of clock vectors x ∈ [0, M ]X satisfying
a finite conjunction of rectangular and diagonal constraints. A region is a zone
which is minimal for inclusion (e.g. the set of points (x1 , x2 , x3 , x4 ) which satisfy
the constraints 0 = x2 < x3 − 4 = x4 − 3 < x1 − 2 < 1). Regions of [0, 1]² are
depicted in Fig. 1.
As we work by analogy with finite graphs, we introduce timed region graphs
which are roughly timed automata without labels on transitions and without
initial and final states. Moreover we consider a state space decomposed in regions.
Such a decomposition in regions is quite standard for timed automata and does
not affect their behaviours (see e.g. [11,6]).
A timed region graph is a tuple (X, Q, S, Δ) such that
– X is a finite set of clocks.
– Q is a finite set of locations.
– S is the set of states which are couples of a location and a clock vector
(S ⊆ Q × [0, M ]X ). It admits a region decomposition S = ∪q∈Q {q} × rq
where for each q ∈ Q, rq is a region.
– Δ is a finite set of transitions. Any transition δ ∈ Δ goes from a starting
location δ⁻ ∈ Q to an ending location δ⁺ ∈ Q; it has a set r(δ) of clocks to
reset when firing δ and a fleshy guard g(δ) that must be satisfied to fire it. Moreover, the
set of clock vectors that satisfy g(δ) is projected onto the region r_{δ⁺} when the
clocks in r(δ) are reset.

Runs of the Timed Region Graph. A timed transition is an element (t, δ)
of A =def [0, M] × Δ. The time delay t represents the time before firing the
transition δ.
Given a state s = (q, x) ∈ S (i.e. x ∈ r_q) and a timed transition α = (t, δ) ∈ A,
the successor of s by α is denoted by s ⊙ α and defined as follows. Let x′ be the
clock vector obtained from x + (t, . . . , t) by resetting the clocks in r(δ) (x′_i = 0 if
i ∈ r(δ), x′_i = x_i + t otherwise). If δ⁻ = q and x + (t, . . . , t) satisfies the guard
g(δ), then x′ ∈ r_{δ⁺} and s ⊙ α = (δ⁺, x′); else s ⊙ α = ⊥. Here and in the rest of
the paper ⊥ represents every undefined state.
We extend the successor action ⊙ to words of timed transitions by induction:
s ⊙ ε = s and s ⊙ (αα′) = (s ⊙ α) ⊙ α′ for all s ∈ S, α ∈ A, α′ ∈ A*.
A run of the timed region graph G is a word s_0 α_0 · · · s_n α_n ∈ (S × A)^{n+1} such
that s_{i+1} = s_i ⊙ α_i ≠ ⊥ for all i ∈ {0, . . . , n − 1} and s_n ⊙ α_n ≠ ⊥; its reduced
version is [s_0, α_0 . . . α_n] ∈ S × A^{n+1} (for all i > 0 the state s_i is determined by
the preceding states and timed transitions and is thus redundant information).
In the following we will use the extended and reduced versions of runs interchangeably.
We denote by R_n the set of runs of length n (n ≥ 1).
Example 1. Let G_ex1 be the timed region graph depicted in Fig. 1, with r_p and
r_q the regions described by the constraints 0 = y < x < 1 and 0 = x < y < 1

[Figure 1: on one side the automaton G_ex1 with two locations p and q and four
transitions: δ1, a self-loop on p with guard 0 < x < 1 and reset {y}; δ2, from p
to q with guard 0 < y < 1 and reset {x}; δ3, from q to p with guard 0 < x < 1
and reset {y}; δ4, a self-loop on q with guard 0 < y < 1 and reset {x}. On the
other side, the state space in [0, 1]² with the regions r_p and r_q in gray.]

Fig. 1. The running example. Right: G_ex1; left: its state space (in gray).

respectively. The successor action is defined by [p, (x, 0)] ⊙ (t, δ1) = [p, (x + t, 0)]
if x + t < 1 and [p, (x, 0)] ⊙ (t, δ2) = [q, (0, t)] if t < 1; symmetrically,
[q, (0, y)] ⊙ (t, δ3) = [p, (t, 0)] if t < 1 and [q, (0, y)] ⊙ (t, δ4) = [q, (0, y + t)]
if y + t < 1. An example of run of G_ex1 is
(p, (0.5, 0)) (0.4, δ1) (p, (0.9, 0)) (0.8, δ2) (q, (0, 0.8)) (0.1, δ3) (p, (0.1, 0)).
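This successor action is simple enough to execute directly. The sketch below is an illustrative reimplementation (the function name and encoding are ad hoc, not from the paper); it checks only the guards of the four transitions and replays the run above.

```python
# States of G_ex1 are pairs (location, (x, y)); None plays the role of ⊥.
def succ(state, t, delta):
    if state is None:
        return None
    loc, (x, y) = state
    x, y = x + t, y + t                            # let the delay t elapse
    if loc == "p" and delta == 1 and 0 < x < 1:    # guard 0 < x < 1, reset {y}
        return ("p", (x, 0.0))
    if loc == "p" and delta == 2 and 0 < y < 1:    # guard 0 < y < 1, reset {x}
        return ("q", (0.0, y))
    if loc == "q" and delta == 3 and 0 < x < 1:    # guard 0 < x < 1, reset {y}
        return ("p", (x, 0.0))
    if loc == "q" and delta == 4 and 0 < y < 1:    # guard 0 < y < 1, reset {x}
        return ("q", (0.0, y))
    return None                                    # undefined successor ⊥

s = ("p", (0.5, 0.0))
for t, delta in [(0.4, 1), (0.8, 2), (0.1, 3)]:
    s = succ(s, t, delta)
print(s)  # ('p', (0.1, 0.0)), the final state of the run in Example 1
```

Any delay violating a guard yields None, matching the convention that s ⊙ α = ⊥ for undefined successors.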

Integrating over States and Runs; Volume of Runs. It is well known (see
[4]) that a region is uniquely described by the integer parts of clocks and by an
order on their fractional parts, e.g. in the region rex given by the constraints
0 = x2 < x3 − 4 = x4 − 3 < x1 − 2 < 1, the integer parts are ⌊x1⌋ = 2, ⌊x2⌋ = 0,
⌊x3⌋ = 4, ⌊x4⌋ = 3 and the fractional parts are ordered as follows: 0 = {x2} <
{x3} = {x4} < {x1} < 1. We denote by γ1 < γ2 < · · · < γd the fractional
parts of the clocks of a region r_q that differ from 0 (d is called the dimension of the
region). In our example the dimension of r_ex is 2 and (γ1, γ2) = (x3 − 4, x1 − 2).
We denote by Γ_q the simplex Γ_q = {γ ∈ R^d | 0 < γ1 < γ2 < · · · < γd < 1}.
The mapping φ_{r_q} : x ↦ γ is a natural bijection from the d-dimensional region
r_q ⊂ R^{|X|} to Γ_q ⊂ R^d. In the example the pre-image of a vector (γ1, γ2) is
(γ2 + 2, 0, γ1 + 4, γ1 + 3).
Example 2 (Continuing Example 1). The region r_p = {(x, y) | 0 = y < x < 1} is
1-dimensional, φ_{r_p}(x, y) = x and φ_{r_p}^{-1}(γ) = (γ, 0).
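For concreteness, here is a hand-written sketch (not from the paper) of φ and its inverse for the 2-dimensional region r_ex above; exact rationals are used to avoid floating-point noise.

```python
from fractions import Fraction as F

# Region r_ex: 0 = x2 < x3 - 4 = x4 - 3 < x1 - 2 < 1 (integer parts 2, 0, 4, 3).
def phi(x):
    """Map a clock vector of r_ex to its simplex coordinates (g1, g2)."""
    x1, x2, x3, x4 = x
    return (x3 - 4, x1 - 2)        # ordered nonzero fractional parts {x3} = {x4} < {x1}

def phi_inv(g):
    """Rebuild the clock vector from (g1, g2) with 0 < g1 < g2 < 1."""
    g1, g2 = g
    return (g2 + 2, 0, g1 + 4, g1 + 3)

g = (F(3, 10), F(7, 10))
assert phi(phi_inv(g)) == g                      # round trip on the simplex
x = (F(27, 10), 0, F(43, 10), F(33, 10))
assert phi_inv(phi(x)) == x                      # round trip on the region
```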

Now, we introduce simplified notation for sums of integrals over states, transitions
and runs. We define the integral of an integrable¹ function f : S → R (over
states):

∫_S f(s) ds = Σ_{q∈Q} ∫_{Γ_q} f(q, φ_{r_q}^{-1}(γ)) dγ,

where ∫ · dγ is the usual integral (w.r.t. the Lebesgue measure). We define the
integral of an integrable function f : A → R (over timed transitions):

∫_A f(α) dα = Σ_{δ∈Δ} ∫_{[0,M]} f(t, δ) dt

¹ A function f : S → R is integrable if for each q ∈ Q the function γ ↦ f(q, φ_{r_q}^{-1}(γ))
is Lebesgue integrable. A function f : A → R is integrable if for each δ ∈ Δ the
function t ↦ f(t, δ) is Lebesgue integrable.

and the integral of an integrable function f : R_n → R (over runs), with the
convention that f[s, α] = 0 if s ⊙ α = ⊥:

∫_{R_n} f[s, α] d[s, α] = ∫_S ∫_A · · · ∫_A f[s, α] dα_1 . . . dα_n ds

To summarize, we take finite sums over the finite discrete sets Q, Δ and take integrals
over dense sets Γ_q, [0, M]. More precisely, all the integrals we define have their
corresponding measures², which are products of counting measures on the discrete
sets Δ, Q and the Lebesgue measure over subsets of R^m for some m ≥ 0 (e.g. Γ_q,
[0, M]). We denote by B(S) (resp. B(A)) the set of measurable subsets of S
(resp. A).
The volume of the set of n-length runs is defined by:

Vol(R_n) = ∫_{R_n} 1 d[s, α] = ∫_S ∫_{A^n} 1_{s⊙α≠⊥} dα ds

Remark 1. The use of the reduced version of runs is crucial when dealing with
integrals (and densities in the following). Indeed the following integral on the
extended version of runs is always null, since the variables are linked (s_{i+1} = s_i ⊙ α_i
for i = 0..n − 2):

∫_S ∫_A · · · ∫_S ∫_A 1_{s_0 α_0 ··· s_{n−1} α_{n−1} ∈ R_n} ds_0 dα_0 . . . ds_{n−1} dα_{n−1} = 0.
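To make the definition of Vol(R_n) concrete on the running example of Fig. 1 (anticipating its transition structure, made explicit in Section 3.4), one can iterate the map f ↦ ∫_A f(· ⊙ α) dα on a grid; by the symmetry of G_ex1 both locations carry the same function. The numerical sketch below is our own illustration, not part of the paper; one can check by hand that Vol(R_1) = 3 and Vol(R_2) = 13/3 for this example.

```python
import numpy as np

N = 2000
h = 1.0 / N
x = (np.arange(N) + 0.5) * h          # midpoint grid on (0, 1)

def step(f):
    # f_{k+1}(x) = int_x^1 f_k(x') dx' + int_0^1 f_k(t) dt: the delay integrals of
    # the two outgoing transitions of G_ex1 (identical in both locations by symmetry).
    tail = h * np.cumsum(f[::-1])[::-1]
    return tail + h * f.sum()

f = np.ones(N)                        # f_0 = 1, so f_n(s) = volume of enabled n-step suffixes
vols = []
for _ in range(2):
    f = step(f)
    vols.append(2 * h * f.sum())      # Vol(R_n) = integral of f_n over both 1-dim regions

print(vols)  # close to [3.0, 4.333...]
```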

2.2 SPOR on Timed Region Graphs

Let (Ω, F, P) be a probability space. A stochastic process over runs (SPOR) of a
timed region graph G is a sequence of random variables (Y_n)_{n∈N} = (S_n, A_n)_{n∈N}
such that:

C.1) For all n ∈ N, Sn : (Ω, F , P ) → (S, B(S)) and An : (Ω, F , P ) → (A, B(A)).
C.2) The initial state S_0 has a probability density function (PDF) p_0 : S → R_+,
i.e. for every S ∈ B(S), P(S_0 ∈ S) = ∫_{s∈S} p_0(s) ds (in particular
∫_S p_0(s) ds = 1).
C.3) The probability of each timed transition only depends on the current state:
for every n ∈ N, A ∈ B(A), for almost every³ s ∈ S and y_0 · · · y_{n−1} ∈ (S × A)^n,

P(A_n ∈ A | S_n = s, Y_{n−1} = y_{n−1}, . . . , Y_0 = y_0) = P(A_n ∈ A | S_n = s);

moreover this probability is given by a conditional PDF p(·|s) : A → R_+
such that P(A_n ∈ A | S_n = s) = ∫_{α∈A} p(α|s) dα and p(α|s) = 0 if s ⊙ α = ⊥
(in particular ∫_A p(α|s) dα = 1).
C.4) States are updated deterministically knowing the previous state and tran-
sition: Sn+1 = Sn  An .
² We refer the reader to [12] for an introduction to measure and probability theory.
³ A property prop (like "f is positive", "well defined", . . . ) on a set B holds almost
everywhere when the set where it is false has measure (volume) 0: ∫_B 1_{¬prop(b)} db = 0.
A Maximal Entropy Stochastic Process for a Timed Automaton 67

For all n ≥ 1, Y_0 · · · Y_{n−1} has a PDF p_n : R_n → R_+, i.e. for every R ∈ B(R_n),
P(Y_0 · · · Y_{n−1} ∈ R) = ∫_{R_n} p_n[s, α] 1_{[s,α]∈R} d[s, α]. This PDF can be defined with
the following chain rule:

p_n[s_0, α] = p_0(s_0) p(α_0|s_0) p(α_1|s_1) . . . p(α_{n−1}|s_{n−1})

where for each j = 1..n − 1 the state updates are defined by s_j = s_{j−1} ⊙ α_{j−1}.
The SPOR (Y_n)_{n∈N} is called stationary whenever for all i, n ∈ N, Y_i · · · Y_{i+n−1}
has the same PDF as Y_0 · · · Y_{n−1}, which is p_n.

Simulation According to a SPOR. Given a SPOR Y, a run (s_0, α) ∈ R_n can
be chosen randomly w.r.t. Y with a linear number of the following operations:
random pick according to p_0 or p(·|s), and computation of a successor. Indeed it
suffices to pick s_0 according to p_0 and, for i = 0..n − 1, to pick α_i according to
p(·|s_i) and to make the update s_{i+1} = s_i ⊙ α_i.

2.3 Entropy
In this sub-section, we define entropy for timed region graphs and SPOR. The
first one is inspired by [6] and the second one by [21].

Entropy of a Timed Region Graph


Proposition-Definition 1. Given a timed region graph G, the following limit
exists and defines the entropy of G:

H(G) = lim_{n→∞} (1/n) log2(Vol(R_n)).

When H(G) > −∞, the timed region graph is thick: the volume behaves w.r.t. n
like an exponent, Vol(R_n) ≈ 2^{nH}. When H(G) = −∞, the timed region graph
is thin: the volume decays faster than any exponent, ∀ρ > 0, Vol(R_n) ≪ ρ^n.

Entropy of a SPOR

Proposition-Definition 2. If Y is a stationary SPOR, then

−(1/n) ∫_{R_n} p_n[s, α] log2 p_n[s, α] d[s, α] →_{n→∞} −∫_S p_0(s) ∫_A p(α|s) log2 p(α|s) dα ds.

This limit is called the entropy of Y, denoted by H(Y).

Proposition 1. Let G be a timed region graph and Y be a stationary SPOR on
G. Then the entropy of Y is upper bounded by that of G: H(Y) ≤ H(G).

The main contribution of this article is the construction of a stationary SPOR
for which the equality holds, i.e. a timed analogue of the Shannon-Parry Markov
chain [21,20].

3 Maximal Entropy SPOR and Quasi Uniform Sampling

In this section G is a timed region graph satisfying the technical conditions below
(Section 3.1). We present a stationary SPOR Y* for which the upper bound on
entropy is reached: H(Y*) = H(G) (Theorem 1). Another key property of this
SPOR is ergodicity, which we define now.

Ergodicity. Given a set of infinite runs R ⊆ (S × A)^ω and i, j ∈ N, we denote
by R_i^{i+j} ⊆ (S × A)^{j+1} the set of finite runs (s_i, α_i) · · · (s_{i+j}, α_{i+j}) that can occur
between indices i and i + j in an infinite run (s_k, α_k)_{k∈N} of R. Let Y be a stationary
SPOR; then the sequence P(Y_0 · · · Y_n ∈ R_0^n) decreases and converges to a value
called the probability of R, denoted by P(R) = lim_{n→∞} P(Y_0 · · · Y_n ∈ R_0^n).
The set R is shift invariant if for every i, n ∈ N: R_i^{i+n} = R_0^n. A stochastic process
is ergodic whenever it is stationary and every shift invariant set has probability 0
or 1. A definition of ergodicity for general probability measures can be found in [12].

3.1 Technical Assumptions


In this section we explain and justify several technical assumptions on the timed
region graph G we make in the following.
Bounded Delays. If the delays were not bounded, the sets of runs R_n would
have infinite volumes and thus quasi uniform random generation could not be
achieved.
Fleshy Transitions. We consider timed region graphs whose transitions are
fleshy [6]: there are no constraints of the form x = c in their guards. Non-fleshy
transitions yield a null volume and are thus useless. Deleting them reduces the size
of the timed region graph considered and ensures that every path has a positive
volume (see [6,9] for more justifications and details).
Strong Connectivity of the Set of Locations. We will consider only timed
region graphs which are strongly connected, i.e. locations are pairwise reachable.
This condition (usual in the discrete case we generalize) is not restrictive since
the set of locations can be decomposed into strongly connected components and
then a maximal entropy SPOR can be designed for each component.
Thickness. In the maximal entropy approach we adopt, we need the entropy
to be finite: H(G) > −∞. This is why we restrict our attention to thick timed
region graphs. The dichotomy between thin and thick timed region graphs was
characterized precisely in [9], where it appears that thin timed region graphs are
degenerate. The key characterization of thickness is the existence of a forgetful
cycle [9]. When the locations are strongly connected, the existence of such a forgetful
cycle ensures that the state space S is strongly connected, i.e. for all s, s′ ∈ S there
exists α ∈ A* such that s ⊙ α = s′.
Weak Progress Cycle Condition. In [6] the following assumption (known as
the progress cycle condition) was made: for some positive integer constant D, on
each path of D consecutive transitions, all the clocks are reset at least once.

Here we use a weaker condition: for a positive integer constant D, a timed
region graph satisfies the D weak progress condition (D-WPC) if on each path of
D consecutive transitions at most one clock is not reset during the entire path.
The timed region graph of Fig. 1 does not satisfy the progress cycle condition
(e.g. x is not reset along δ1) but satisfies the 1-WPC.

3.2 Main Theorems

Theorem 1. There exist a positive real ρ and two functions v, w : S → R,
positive almost everywhere, such that the following equations define the PDFs of
an ergodic SPOR Y* with maximal entropy: H(Y*) = H(G).

p*_0(s) = w(s)v(s);   p*(α|s) = v(s ⊙ α) / (ρ v(s)).   (1)

Objects ρ, v, w are spectral attributes of an operator Ψ defined in the next
section.
An ergodic SPOR satisfies an asymptotic equipartition property (AEP) (see
[14] for the classical AEP and [1] which deals with the case of not necessarily
Markovian stochastic processes with density). Here we give our own AEP. It strongly
relies on the pointwise ergodic theorem (see [12]) and on the Markovian property
satisfied by every SPOR (conditions C.3 and C.4).
Theorem 2 (AEP for SPOR). If Y is an ergodic SPOR then

P[{s_0 α_0 s_1 α_1 · · · | −(1/n) log2 p_n[s_0, α_0 · · · α_n] →_{n→+∞} H(Y)}] = 1.

This theorem applied to the maximal entropy SPOR Y* means that long runs
have a high probability to have a quasi uniform density:

p*_n[s_0, α_0 · · · α_n] ≈ 2^{−nH(Y*)} ≈ 1/Vol(R_n)   (since H(Y*) = H(G)).

3.3 Operator Ψ and Its Spectral Attributes ρ, v, w

The maximal entropy SPOR is a lifting to the timed setting of the Shannon-Parry
Markov chain of a finite strongly connected graph. The definition of this
chain is based on the Perron-Frobenius theory applied to the adjacency matrix
M of the graph. This theory ensures that there exist both a positive eigenvector
v of M for the spectral radius⁴ ρ (i.e. Mv = ρv) and a positive eigenvector w
of the transposed matrix Mᵀ for ρ (i.e. Mᵀw = ρw). The initial probability
distribution on the states Q of the Markov chain is given by p_i = v_i w_i for
i ∈ Q and the transition probability matrix P is given by P_ij = v_j M_ij /(ρ v_i)
for i, j ∈ Q. The timed analogue of M is the operator Ψ introduced in [6]. To
define ρ, v and w, we will use the theory of positive linear operators (see e.g. [17])
instead of the Perron-Frobenius theory used in the discrete case.

⁴ Recall from linear algebra (resp. spectral theory) that the spectrum of a matrix (resp.
of an operator) Ψ is the set {λ ∈ C s.t. Ψ − λ Id is not invertible}. The spectral radius
ρ of Ψ is the radius of the smallest disc centered in 0 which contains all the spectrum.
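The discrete construction just recalled is easy to carry out numerically. The sketch below (using an arbitrary 3-node graph chosen for illustration, not taken from the paper) computes the Shannon-Parry chain and checks its defining properties:

```python
import numpy as np

# Adjacency matrix of a small strongly connected graph (illustrative choice).
M = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)

vals, vecs = np.linalg.eig(M)
rho = vals.real.max()                              # spectral radius (Perron eigenvalue)
v = np.abs(vecs[:, vals.real.argmax()].real)       # right eigenvector: M v = rho v
vals_t, vecs_t = np.linalg.eig(M.T)
w = np.abs(vecs_t[:, vals_t.real.argmax()].real)   # left eigenvector: M^T w = rho w
w /= w @ v                                         # normalize so that <w, v> = 1

p = v * w                                          # initial distribution p_i = v_i w_i
P = M * v[None, :] / (rho * v[:, None])            # P_ij = v_j M_ij / (rho v_i)

assert np.allclose(P.sum(axis=1), 1.0)             # P is a stochastic matrix
assert np.allclose(p @ P, p)                       # p is stationary for P
print(rho)  # 2.0 for this particular graph
```

The two assertions hold for any strongly connected adjacency matrix, which is exactly the content of the Perron-Frobenius construction.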
The operator Ψ of a timed region graph is defined by:

∀f ∈ L²(S), ∀s ∈ S, Ψf(s) = ∫_A f(s ⊙ α) dα (with f(⊥) = 0),   (2)

where L²(S) is the Hilbert space of square integrable functions from S to R, with
the scalar product ⟨f, g⟩ = ∫_S f(s)g(s) ds and associated norm ||f||₂ = √⟨f, f⟩.
Proposition 2. The operator Ψ defined in (2) is a positive continuous linear
operator on L2 (S).
The real ρ used in (1) is the spectral radius of Ψ .
Theorem 3 (adapted from [6] to L2 (S)). The spectral radius ρ is a positive
eigenvalue (i.e. ρ > 0 and ∃v ∈ L2 (S) s.t. Ψ v = ρv) and H(G) = log2 (ρ).
The adjoint operator Ψ* (acting also on L²(S)) is the analogue of Mᵀ. It is
formally defined by the equation:
∀f, g ∈ L²(S), ⟨Ψf, g⟩ = ⟨f, Ψ*g⟩.   (3)
A more effective characterization of some power of Ψ*, and then of its eigenfunctions,
is ensured by Proposition 3 below. The following theorem defines the v, w used
in the definition of the maximal entropy SPOR (1).
Theorem 4. There exists a unique eigenfunction (up to a scalar constant) v of
Ψ (resp. w of Ψ ∗ ) for the eigenvalue ρ which is positive almost everywhere. Any
non-negative eigenfunction of Ψ (resp. Ψ ∗ ) is collinear to v (resp. w).
Eigenfunctions v and w are chosen such that ⟨w, v⟩ = 1. The operators Ψ and Ψ* are
easier to describe when raised to some power greater than D, the constant of the
weak progress cycle condition.
Proposition 3. For every n ≥ D there exists a function k_n ∈ L²(S × S) such
that: Ψ^n(f)(s) = ∫_S k_n(s, s′) f(s′) ds′ and Ψ*^n(f)(s) = ∫_S k_n(s′, s) f(s′) ds′.
It is worth mentioning that for any n ≥ D, the objects ρ, v (resp. w) are solutions
of the eigenvalue problem ∫_S k_n(s, s′) v(s′) ds′ = ρ^n v(s) with v non-negative (resp.
∫_S k_n(s′, s) w(s′) ds′ = ρ^n w(s) with w non-negative); uniqueness of v (resp. w) up to
a scalar constant is ensured by Theorem 4. Further computability issues for ρ, v
and w are discussed in the conclusion.

Sketch of Proof of Theorem 4. The proof of Theorem 4 is based on Theorem
11.1, condition e), of [17], which is a generalization of the Perron-Frobenius theorem
to positive linear operators. The main hypothesis to prove is the irreducibility of
Ψ, whose analogue in the discrete case is the irreducibility of the adjacency matrix
M of a finite graph, i.e. for all states i, j there exists n ≥ 1 such that (M^n)_ij > 0
(this is equivalent to the strong connectivity of the graph). The following key
lemma is a sufficient condition for the irreducibility of Ψ. It is based on the
strong connectivity of the state space S, ensured both by strong connectivity of
the set of locations and by thickness (see Section 3.1).

Lemma 1. For every q, q′ ∈ Q, there exists n ≥ D such that k_n (defined in
Proposition 3) is positive almost everywhere on ({q} × r_q) × ({q′} × r_{q′}).

3.4 Running Example Completed

Example 3. Let us make (2) explicit on our running example:

Ψf(p, (x, 0)) = ∫₀¹ f(p, (x + t, 0)) 1_{0<x≤x+t<1} dt + ∫₀¹ f(q, (0, t)) 1_{0≤t<1} dt
Ψf(q, (0, y)) = ∫₀¹ f(p, (t, 0)) 1_{0≤t<1} dt + ∫₀¹ f(q, (0, y + t)) 1_{0<y≤y+t<1} dt

Integrals from left to right correspond to transitions δ1, δ2 for the first line and
to δ3, δ4 for the second line.
We introduce the notations v_p(γ) = v(p, (γ, 0)) and v_q(γ) = v(q, (0, γ)). With
these notations the eigenvalue equation ρv = Ψv gives:

ρ v_p(γ) = ∫_γ¹ v_p(γ′) dγ′ + ∫₀¹ v_q(γ′) dγ′;   ρ v_q(γ) = ∫₀¹ v_p(γ′) dγ′ + ∫_γ¹ v_q(γ′) dγ′.

Similarly the eigenfunction w satisfies:

ρ w_p(γ) = ∫₀^γ w_p(γ′) dγ′ + ∫₀¹ w_q(γ′) dγ′;   ρ w_q(γ) = ∫₀¹ w_p(γ′) dγ′ + ∫₀^γ w_q(γ′) dγ′.

After some calculus we obtain that ρ = 1/ln(2); v_p(γ) = v_q(γ) = C 2^{−γ}; w_p(γ) =
w_q(γ) = C′ 2^{γ}, with C and C′ two positive constants.
Finally the maximal entropy SPOR for G_ex1 is given by:

p*_0(p, (γ, 0)) = p*_0(q, (0, γ)) = 1/2 for γ ∈ (0, 1);
p*(t, δ1 | p, (γ, 0)) = p*(t, δ4 | q, (0, γ)) = 2^{−t}/ρ for γ ∈ (0, 1), t ∈ (0, 1);
p*(t, δ2 | p, (γ, 0)) = p*(t, δ3 | q, (0, γ)) = 2^{γ−t}/ρ for γ ∈ (0, 1), t ∈ (0, 1).
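These densities make the simulation procedure of Section 2.2 completely explicit for G_ex1. The sketch below (our own illustration, not from the paper) samples a long run by inverse-CDF sampling of the truncated exponential densities, and uses the run to estimate the entropy along the lines of Theorem 2; the estimate should approach H(G) = log2 ρ = −log2(ln 2) ≈ 0.529. The branch probability of the self-loop, 1 − 2^{γ−1}, is obtained by integrating 2^{−t}/ρ over (0, 1−γ).

```python
import numpy as np

rng = np.random.default_rng(0)
RHO = 1.0 / np.log(2.0)               # spectral radius; H(G) = log2(RHO)

def sample_trunc(a):
    """Inverse-CDF sample of the density proportional to 2^{-t} on (0, a)."""
    u = rng.uniform()
    return -np.log2(1.0 - u * (1.0 - 2.0 ** (-a)))

n = 50_000
loc, gamma = rng.integers(2), rng.uniform()   # p*_0: each location w.p. 1/2, gamma uniform
log2_dens = 0.0
for _ in range(n):
    if rng.uniform() < 1.0 - 2.0 ** (gamma - 1.0):
        t = sample_trunc(1.0 - gamma)         # self-loop (d1 or d4): density 2^{-t}/rho
        log2_dens += np.log2(2.0 ** (-t) / RHO)
        gamma += t                            # stay; the non-reset clock ages
    else:
        t = sample_trunc(1.0)                 # switch (d2 or d3): density 2^{gamma-t}/rho
        log2_dens += np.log2(2.0 ** (gamma - t) / RHO)
        loc, gamma = 1 - loc, t               # the other clock was reset
    assert 0.0 <= gamma < 1.0                 # the state stays in its region

print(-log2_dens / n)  # empirical entropy rate, close to 0.529
```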

4 Conclusion and Perspectives


In this article, we have proved the existence of an ergodic stochastic process
over runs of a timed region graph G with maximal entropy, provided G has finite
entropy (H > −∞) and satisfies the D weak progress condition.
The next question is how simulation can be achieved in practice.
Symbolic computation of ρ and v has been proposed in [6] for subclasses of
deterministic TA. In the same article, an iterative procedure is also given to
estimate the entropy H = log2(ρ). We think that approximations of ρ, v and w
using an iterative procedure on Ψ and Ψ* would give a SPOR with entropy as
close to the maximum as we want. A challenging task for us is to determine an
upper bound on the convergence rate of such an iterative procedure.
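As a rough illustration of such an iterative procedure (our own sketch, not the method of [6]), power iteration on a grid discretization of Ψ for the running example G_ex1 already recovers ρ and v; by the symmetry of the example a single component f(γ) suffices.

```python
import numpy as np

N = 1000
h = 1.0 / N
grid = (np.arange(N) + 0.5) * h

def apply_psi(f):
    # For G_ex1 (both locations symmetric): Psi f(g) = int_g^1 f + int_0^1 f
    return h * np.cumsum(f[::-1])[::-1] + h * f.sum()

f = np.ones(N)
for _ in range(100):
    f = apply_psi(f)
    rho = f.max()
    f = f / rho                      # sup-norm normalization; rho -> spectral radius

print(rho)                           # close to 1/ln 2 = 1.4427...
# The normalized iterate approaches the eigenfunction v(g) proportional to 2^{-g}.
v = 2.0 ** (-grid)
assert np.allclose(f, v / v.max(), atol=1e-2)
```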
Connection with information theory is clear if we consider as in [5], a timed
regular language as a source of timed words. A SPOR is in this approach a
stochastic source of timed words. It would be very interesting to lift compression
methods (see [19,14]) from untimed to timed setting.

Acknowledgements. I thank Eugene Asarin, Aldric Degorre and Dominique
Perrin for sharing motivating discussions.

References
1. Algoet, P.H., Cover, T.M.: A sandwich proof of the Shannon-McMillan-Breiman
theorem. The Annals of Probability 16(2), 899–909 (1988)
2. Alur, R., Bernadsky, M.: Bounded model checking for GSMP models of stochas-
tic real-time systems. In: Hespanha, J.P., Tiwari, A. (eds.) HSCC 2006. LNCS,
vol. 3927, pp. 19–33. Springer, Heidelberg (2006)
3. Alur, R., Courcoubetis, C., Dill, D.L.: Model-checking for probabilistic real-time
systems. In: Leach Albert, J., Monien, B., Rodrı́guez-Artalejo, M. (eds.) ICALP
1991. LNCS, vol. 510, Springer, Heidelberg (1991)
4. Alur, R., Dill, D.L.: A theory of timed automata. Theoretical Computer Sci-
ence 126, 183–235 (1994)
5. Asarin, E., Basset, N., Béal, M.-P., Degorre, A., Perrin, D.: Toward a timed theory
of channel coding. In: Jurdziński, M., Ničković, D. (eds.) FORMATS 2012. LNCS,
vol. 7595, pp. 27–42. Springer, Heidelberg (2012)
6. Asarin, E., Degorre, A.: Volume and entropy of regular timed languages: Analytic
approach. In: Ouaknine, J., Vaandrager, F.W. (eds.) FORMATS 2009. LNCS,
vol. 5813, pp. 13–27. Springer, Heidelberg (2009)
7. Asarin, E., Degorre, A.: Volume and entropy of regular timed languages: Discretization
approach. In: Bravetti, M., Zavattaro, G. (eds.) CONCUR 2009. LNCS,
vol. 5710, pp. 69–83. Springer, Heidelberg (2009)
8. Baier, C., Bertrand, N., Bouyer, P., Brihaye, T., Größer, M.: Probabilistic and topo-
logical semantics for timed automata. In: Arvind, V., Prasad, S. (eds.) FSTTCS
2007. LNCS, vol. 4855, pp. 179–191. Springer, Heidelberg (2007)
9. Basset, N., Asarin, E.: Thin and thick timed regular languages. In: Fahrenberg,
U., Tripakis, S. (eds.) FORMATS 2011. LNCS, vol. 6919, pp. 113–128. Springer,
Heidelberg (2011)
10. Bernadsky, M., Alur, R.: Symbolic analysis for GSMP models with one state-
ful clock. In: Bemporad, A., Bicchi, A., Buttazzo, G. (eds.) HSCC 2007. LNCS,
vol. 4416, pp. 90–103. Springer, Heidelberg (2007)
11. Bertrand, N., Bouyer, P., Brihaye, T., Markey, N.: Quantitative model-checking
of one-clock timed automata under probabilistic semantics. In: QEST, pp. 55–64.
IEEE Computer Society (2008)
12. Billingsley, P.: Probability and measure, vol. 939. Wiley (2012)
13. Bouyer, P., Brihaye, T., Jurdziński, M., Menet, Q.: Almost-sure model-checking of
reactive timed automata. In: QEST 2012, pp. 138–147 (2012)
14. Cover, T.M., Thomas, J.A.: Elements of information theory, 2nd edn. Wiley (2006)
15. David, A., Larsen, K.G., Legay, A., Mikučionis, M., Poulsen, D.B., van Vliet, J.,
Wang, Z.: Statistical model checking for networks of priced timed automata. In:
Fahrenberg, U., Tripakis, S. (eds.) FORMATS 2011. LNCS, vol. 6919, pp. 80–96.
Springer, Heidelberg (2011)
16. Kempf, J.-F., Bozga, M., Maler, O.: As soon as probable: Optimal scheduling under
stochastic uncertainty. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013 (ETAPS
2013). LNCS, vol. 7795, pp. 385–400. Springer, Heidelberg (2013)

17. Krasnosel’skij, M.A., Lifshits, E.A., Sobolev, A.V.: Positive Linear Systems: the
Method of Positive Operators. Heldermann Verlag, Berlin (1989)
18. Lind, D., Marcus, B.: An Introduction to Symbolic Dynamics and Coding. Cam-
bridge University Press (1995)
19. Lothaire, M.: Applied Combinatorics on Words (Encyclopedia of Mathematics and
its Applications). Cambridge University Press, New York (2005)
20. Parry, W.: Intrinsic Markov chains. Transactions of the American Mathematical
Society, 55–66 (1964)
21. Shannon, C.E.: A mathematical theory of communication. Bell Sys. Tech. J. 27,
379–423, 623–656 (1948)
Complexity of Two-Variable Logic
on Finite Trees

Saguy Benaim, Michael Benedikt¹, Witold Charatonik², Emanuel Kieroński²,
Rastislav Lenhardt¹, Filip Mazowiecki³, and James Worrell¹
¹ University of Oxford
² University of Wroclaw
³ University of Warsaw

Abstract. Verification of properties expressed in the two-variable fragment
of first-order logic FO2 has been investigated in a number of
contexts. The satisfiability problem for FO2 over arbitrary structures
is known to be NEXPTIME-complete, with satisfiable formulas having
exponential-sized models. Over words, where FO2 is known to have
the same expressiveness as unary temporal logic, satisfiability is again
NEXPTIME-complete. Over finite labelled ordered trees FO2 has the
same expressiveness as navigational XPath, a popular query language
for XML documents. Prior work on XPath and FO2 gives a 2EXPTIME
bound for satisfiability of FO2 over trees. This work contains a comprehensive
analysis of the complexity of FO2 on trees, and on the size and
depth of models. We show that the exact complexity varies according
to the vocabulary used, the presence or absence of a schema, and the
encoding of labels on trees. We also look at a natural restriction of FO2 ,
its guarded version, GF2 . Our results depend on an analysis of types in
models of FO2 formulas, including techniques for controlling the number
of distinct subtrees, the depth, and the size of a witness to satisfiability
for FO2 sentences over finite trees.

1 Introduction

The complexity of verifying properties over a class of structures depends on both


the specification language and the type of structure. Stockmeyer [Sto74] showed
that full first-order logic (FO) has non-elementary complexity even when applied
to very restricted structures, such as words. The two-variable fragment, FO2 , is
known to have better complexity. Grädel et al. [GKV97] showed that satisfiability
over arbitrary relational vocabularies is NEXPTIME-complete, with satisfiable
sentences having exponential-sized models. Over words, Etessami et al. showed
that satisfiability remains NEXPTIME-complete, with satisfiable formulas again

Supported by EPSRC grant EP/G004021/1.
Supported by Polish NCN grant number DEC-2011/03/B/ST6/00346.
Supported by Polish Ministry of Science and Higher Education grant N N206 371339.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 74–88, 2013.
© Springer-Verlag Berlin Heidelberg 2013

having exponential-sized models [EVW02]. Moreover the complexity bounds over


words extend to bounds on a host of related verification problems [BLW12].
The NEXPTIME-completeness of FO2 over both general structures and words
raises the question of the impact of structural restrictions on the complexity of
FO2. Surprisingly the complexity of satisfiability for FO2 over finite trees has not
been investigated in detail. Marx and De Rijke [MdR04] showed that
FO2 over trees corresponds precisely to the navigational core of the XML query
language XPath. From the work of Marx [Mar04] it follows that the satisfiability
problem for XPath is complete for EXPTIME. Given that the translation from
FO2 to XPath in [MdR04] is exponential, this gives a 2EXPTIME bound on
satisfiability for FO2 over trees.
In this work we will consider the satisfiability problem for FO2 over finite
trees, and the corresponding question of the size and depth needed for witness
models. In particular, we will consider:
– satisfiability in the presence of all navigational predicates: predicates for
the parent/child relation, its transitive closure the descendant relation, the
left/right sibling relation and its transitive closure;
– the impact on the complexity of limiting sentences to make use of predicates
in a particular subset;
– satisfiability over general unranked trees, and satisfiability in the presence
of a schema;
– satisfiability over trees where node labels are denoted with explicit unary
labels versus the case where node labels are boolean combinations over a
propositional alphabet;
– satisfiability of the full logic, versus the restriction to the case where
quantification of variables must be guarded.
We will show that each of these variations affects the complexity of the problem.
In the process, we will show that the tree case differs in a number of important
ways from that of words. First, satisfiability is EXPSPACE-complete, unlike for
general structures or words. Secondly, the basic technique for analyzing FO2 on
words [EVW02]—bounds on the number of quantifier-rank types that occur in a
structure—is not useful for getting tight complexity bounds for FO2 over trees.
Instead we will use a combination of methods, including reductions to XPath,
bounds on the number of subformula types, and a quotient construction that
is based not only on types, but on a set of distinguished witness nodes. These
techniques allow us to distinguish situations where satisfiable FO2 -formulas have
models of (reasonably) small depth, and situations where they have models of
small size.
Related Work. Two-variable logic on data trees, in which nodes are associated
with values in an infinite set of data, has been studied by Bojańczyk et
al. [BMSS09]. There the main result is decidability over the signature with data
equality, the child relation, and the right sibling relation. Figueira [Fig12] considers
two-variable logic with two successor relations, respectively corresponding
to two different linear orders. This is quite different from considering the two

successor relations derived from a tree order. Figueira’s results were generalized
in a recent work [CW13] where two-variable logic over structures that contain
two forests of finite trees, even in presence of additional binary predicates and
counting quantifiers, is proved to be decidable in NEXPTIME. However, these
results are not comparable with ours because the logic in [CW13] is restricted
to ranked trees, it cannot express the sibling relation and does not allow using
the transitive descendant relation. The two-variable logic over two transitive
relations is shown undecidable in [Kie05]. On the other hand the two-variable
fragment with one transitive relation has been recently shown by Szwast and
Tendera to be decidable [ST13]. The complexity of two-variable logic over ordi-
nary trees is explicitly studied in [BK08]. Our results here show that the proof
of the satisfiability result there (claiming NEXPTIME for full two-variable logic)
is incorrect.
Organization. Section 2 gives preliminaries. Section 3 gives precise bounds
for the satisfiability of full FO2 on trees. Section 4 considers the case where the
child predicate is absent, while Section 5 considers the case where the descendant
predicate is absent. Section 6 considers restricting the logic to its guarded version.
Section 7 gives conclusions.

2 Logics and Models

A tree is a finite directed acyclic graph whose edge relation is denoted ↓. We


denote the transitive closure of the edge relation by ↓+ and assume that ↓+ is
a partial order with only one minimum element—the root of the tree. We read
u ↓ v as ‘v is a child of u’ and u ↓+ v as ‘v is a descendant of u’. We assume also a
sibling relation → whose transitive closure →+ restricts to a linear order on the
children of each node. We read u → v as ‘v is the right sibling of u’. Given a tree
t and vertex v, SubTree(t, v) denotes the subtree of t rooted at v. We consider
trees equipped with a family of unary predicates P1 , P2 , . . . on their vertices. We
further say that a tree satisfies the unary alphabet restriction (UAR) if exactly
one predicate Pi holds of each vertex.
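To fix intuitions, the navigational relations on a small labelled tree can be materialized explicitly. The following sketch (with ad hoc names, not taken from the paper) builds ↓, ↓+, → and →+ for a UAR-labelled tree and evaluates one FO2-style property on it.

```python
# A small unranked tree: node -> ordered list of children; labels satisfy the UAR.
children = {0: [1, 2, 3], 1: [4, 5], 2: [], 3: [], 4: [], 5: []}
label = {0: "P1", 1: "P2", 2: "P1", 3: "P2", 4: "P1", 5: "P1"}

child = {(u, v) for u, cs in children.items() for v in cs}          # u ↓ v
descendant = set(child)                                             # will hold u ↓+ v
while True:                                                         # transitive closure
    new = {(u, w) for (u, v) in descendant for (v2, w) in child if v == v2}
    if new <= descendant:
        break
    descendant |= new

sibling = {(cs[i], cs[i + 1]) for cs in children.values()           # u → v
           for i in range(len(cs) - 1)}
sibling_plus = {(cs[i], cs[j]) for cs in children.values()          # u →+ v
                for i in range(len(cs)) for j in range(i + 1, len(cs))}

# FO2 sentence ∀x (P2(x) ⇒ ∃y (x ↓+ y ∧ P1(y))): every P2-node has a P1-descendant.
holds = all(any((x, y) in descendant and label[y] == "P1" for y in label)
            for x in label if label[x] == "P2")
print(holds)  # False: node 3 carries P2 and is a leaf
```

Note the sentence reuses only the two variables x and y, as required for FO2.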
We consider first-order logic with equality over signatures of the form τun ∪τbin ,
where τun consists of unary predicates and τbin ⊆ {↓, ↓+ , →, →+ } is a subset of
the navigational relations. Over such signatures we consider the two-variable sub-
set of first-order logic FO2 and its guarded fragment GF2 (cf. [AvBN98]). The
former comprises those formulas built using only variables x and y, while in the
latter quantifiers are additionally relativised to atoms, i.e., universally quantified
formulas have the form ∀y (α(x, y) ⇒ ϕ(x, y)) and existentially quantified for-
mulas have the form ∃y (α(x, y) ∧ ϕ(x, y)), where α(x, y) is atomic. Atom α(x, y)
is called a guard. Equalities x = x or x = y are also allowed as guards.
We write FO2 [τbin ] or GF2 [τbin ] to indicate that the only binary symbols that
are allowed in formulas are those from τbin and that formulas are interpreted over
trees. Although we allow equality in our upper bounds, it will not play any role
in the lower bounds. When interpreting formulas of the logic in a tree, we will
Complexity of Two-Variable Logic on Finite Trees 77

always assume the navigational relations are given their natural interpretation:
↓ as the child relation, ↓+ the descendant relation, and so forth.
We consider also satisfiability for FO2 over k-ranked trees, that is, trees where
nodes have at most k children. Note that for k-ranked trees it is natural to
consider signatures that include the relation ↓i , connecting a node to its ith
child for each i ≤ k, either in place of or in addition to the predicates above.
However we will not consider a separate signature for ranked trees, since it is
easy to derive tight bounds for ranked trees for such signatures based on the
techniques introduced here.
A ranked tree schema consists of a bottom-up tree automaton on trees of some
rank k [Tho97]. A tree automaton takes trees labeled from a finite set Σ. We will
thus identify the symbols in Σ with the predicates Pi , so that all trees satisfying the schema will satisfy the UAR.
We consider the following problems:
– Given an FO2 sentence ϕ, determine if there is some tree (resp. k-ranked,
UAR tree) that satisfies it.
– Given an FO2 sentence ϕ and a schema S, determine whether ϕ is satisfied by
some tree satisfying S. We consider the combined complexity in the formula
and schema.
Some of our results will go through XPath, a common language used for querying
XML documents viewed as trees. The navigational core of XPath is a modal
language, analogous to unary temporal logic on trees, denoted NavXP. NavXP
is built on binary modalities, referred to as axis relations. We will focus on the
following axes: self, parent, child, descendant, descendant-or-self, ancestor-or-self,
next-sibling, following-sibling, preceding-sibling, previous-sibling. In a tree t, we
associate each axis a with a set R^t_a of pairs of nodes. R^t_child denotes the set of pairs of nodes (x, y) in t where y is a child of x, and similarly for the other axes (see [Mar04]).
NavXP consists of path expressions, which denote binary relations between
nodes in a tree, and filters, denoting unary relations. Below we give the syn-
tax (from [BK08]), using p to range over path expressions and q over filters.
L ranges over symbols for each labelling of a node (i.e. for general trees, boolean
combinations of predicates P1 , P2 , . . ., for UAR trees a single predicate).

p ::= step | p/p | p ∪ p        step ::= axis | step[q]
q ::= p | lab() = L | q ∧ q | q ∨ q | ¬q

where the axis relations are those given above.


The semantics of NavXP path expressions relative to a tree t is given by:
1. [[axis]] = R^t_axis
2. [[step[q]]] = {(n, n′) ∈ [[step]] : n′ ∈ [[q]]}
3. [[p1/p2]] = {(n, n′) : ∃w (n, w) ∈ [[p1]] ∧ (w, n′) ∈ [[p2]]}
4. [[p1 ∪ p2]] = [[p1]] ∪ [[p2]]
For filters we have:
1. [[lab() = L]] = {n : n has label L}
2. [[p]] = {n : ∃n′ (n, n′) ∈ [[p]]}
3. [[q1 ∧ q2]] = [[q1]] ∩ [[q2]]
4. [[¬q]] = {n : n ∉ [[q]]}
A NavXP filter is said to hold of a tree t if it holds of the root under the above semantics.
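The set-based semantics above translates directly into a naive bottom-up evaluator. The following Python sketch (our own illustration; all names hypothetical, and the axes restricted to self, child, parent, and descendant for brevity) computes [[p]] as a set of node pairs and [[q]] as a set of nodes:

```python
class Node:
    """Tree node with a label; parent pointers are set when children attach."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for c in self.children:
            c.parent = self

def nodes(t):
    """All nodes of the subtree rooted at t, in document order."""
    yield t
    for c in t.children:
        yield from nodes(c)

def axis_pairs(t, axis):
    """R^t_axis for a few representative axes."""
    ns = list(nodes(t))
    if axis == "self":
        return {(n, n) for n in ns}
    if axis == "child":
        return {(n, c) for n in ns for c in n.children}
    if axis == "parent":
        return {(c, n) for n in ns for c in n.children}
    if axis == "descendant":  # transitive closure of the child axis
        return {(n, d) for n in ns for d in nodes(n) if d is not n}
    raise ValueError(axis)

# Path expressions: ("axis", a), ("step", p, q), ("seq", p1, p2), ("union", p1, p2).
# Filters: ("lab", L), ("path", p), ("and", q1, q2), ("or", q1, q2), ("not", q).

def eval_path(t, p):
    tag = p[0]
    if tag == "axis":
        return axis_pairs(t, p[1])
    if tag == "step":   # [[step[q]]] = {(n, n') in [[step]] : n' in [[q]]}
        rel, flt = eval_path(t, p[1]), eval_filter(t, p[2])
        return {(n, m) for (n, m) in rel if m in flt}
    if tag == "seq":    # [[p1/p2]]: relational composition through a witness w
        r1, r2 = eval_path(t, p[1]), eval_path(t, p[2])
        return {(n, m) for (n, w) in r1 for (w2, m) in r2 if w is w2}
    if tag == "union":
        return eval_path(t, p[1]) | eval_path(t, p[2])
    raise ValueError(tag)

def eval_filter(t, q):
    tag = q[0]
    if tag == "lab":
        return {n for n in nodes(t) if n.label == q[1]}
    if tag == "path":   # existential projection onto the first component
        return {n for (n, _) in eval_path(t, q[1])}
    if tag == "and":
        return eval_filter(t, q[1]) & eval_filter(t, q[2])
    if tag == "or":
        return eval_filter(t, q[1]) | eval_filter(t, q[2])
    if tag == "not":
        return set(nodes(t)) - eval_filter(t, q[1])
    raise ValueError(tag)
```

For example, the filter ("path", ("step", ("axis", "descendant"), ("lab", "b"))) holds of a tree iff its root has a descendant labelled b, matching the definition of filter satisfaction above.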
78 S. Benaim et al.

As mentioned earlier, expressive equivalence of FO2 and NavXP on trees, extending the translation to Unary Temporal Logic in the word case, is known:
Proposition 1 ([MdR04]). There is an exponential translation from FO2 [↓+ ]
to NavXP with only the descendant and ancestor axes and from FO2 [↓, ↓+ , →, →+ ]
to NavXP with all axes.
From the fact that NavXP has an exponential time satisfiability problem [Mar04]
and the above proposition, we get the following (implicit in [MdR04]):
Corollary 1. The satisfiability problem for FO2 [↓, ↓+ , →, →+ ] is in 2EXPTIME.

3 Satisfiability for Full FO2 on Trees

Subformula Types and Exponential Depth Bounds. In the analysis of satisfiability of FO2 for words of Etessami et al. [EVW02], a NEXPTIME bound
is achieved by showing that any sentence with a finite model has a model of
at most exponential size. The small-model property follows, roughly speaking,
from the fact that any model realizes only exponentially many “quantifier-rank
types”—maximal consistent sets of formulas of a given quantifier rank—and the
fact that two nodes with the same quantifier-rank type can be identified.
In the case of trees, this approach breaks down in several places. It is easy to
see that one cannot always obtain an exponential-sized model, since a sentence
can enforce binary branching and exponential depth. Because there are doubly-
exponentially many non-isomorphic small-depth subtrees, there can be doubly-
exponentially many quantifier-rank types realized even along a single path in
a tree: so quantifier-rank types cannot be used even to show an exponential
depth bound. We thus use subformula types of a given FO2 -formula ϕ (for short,
ϕ-types), which are maximal consistent collections of subformulas of ϕ with
one free variable. The ϕ-type of a node n in a tree, Tpϕ (n), is defined as the
set of subformulas of ϕ it satisfies. The number of ϕ-types is only exponential
in |ϕ|, but subformula types are more delicate than quantifier-rank types. E.g.
nodes with the same ϕ-type cannot always be identified without changing the
truth of ϕ. Most of the upper bounds will be concerned with handling this issue,
by adding additional conditions on nodes to be identified, and/or preserving
additional parts of the tree.
Upper Bounds for FO2 on Trees. We exhibit the issues arising and techniques
used to solve them by giving an upper bound for the full logic, which improves
on the 2EXPTIME bound one obtains via translation to NavXP.
Theorem 1. The satisfiability problem for FO2 [↓, ↓+ , →, →+ ] is in EXPSPACE.
The key to the proof is to show an “exponential-depth property”:
Lemma 1. Every satisfiable FO2 [↓, ↓+ , →, →+ ] sentence ϕ has a model t whose
depth is bounded by 2^poly(|ϕ|). The same bound holds for satisfiability with respect to UAR trees or ranked schemas. The outdegree of nodes can also be bounded by 2^poly(|ϕ|).

We sketch the argument for the depth bound, leaving the similar proof for the
branching bound to the full version. Given a tree t and nodes n0 and n1 in t with
n1 not an ancestor of n0 , the overwrite of n0 by n1 in t is the tree t(n1 → n0 )
formed by replacing the subtree of n0 with the subtree of n1 in t. Let F be
the binary relation relating a node m in t to its copies in t(n1 → n0 ): n1 and
its descendants have a single copy if n1 is a descendant of n0 , and two copies
otherwise; nodes in SubTree(t, n0 ) that are not in SubTree(t, n1 ) have no copies,
and other nodes have a single copy. In the case that n1 is a descendant of n0 , F is
a partial function. We say an equivalence relation ≡ on nodes of a tree t is globally
ϕ-preserving if for any equivalent nodes n0 , n1 in t with n0 ∈ SubTree(t, n1 ), the
ϕ-type of a node n in t is the same as the ϕ-type of nodes in F (n) within
t(n1 → n0 ). We say it is pathwise ϕ-preserving if this holds for any node n0 , n1
in t with n1 a descendant of n0 . The path-index of an equivalence relation on t
is the maximum of the number of equivalence classes represented on any path,
while the index is the total number of classes.
We cannot always overwrite a node with another having the same ϕ-type, but
by adding additional information, we can get a pathwise ϕ-preserving relation
with small path-index. For a node n, let DescTypes(n) be the set of ϕ-types
of descendants of n, and AncTypes(n) the set of ϕ-types of ancestors of n. Let
IncompTypes(n) be the ϕ-types of nodes n′ that are neither descendants nor
ancestors of n. Say n0 ≡Full n1 if they agree on their ϕ-type, the set DescTypes,
and the set IncompTypes.
Lemma 2. The relation ≡Full is pathwise ϕ-preserving, and its path index is
bounded by 2^poly(|ϕ|). Thus, there is a polynomial P such that for any tree t satisfying ϕ and root-to-leaf path p of length at least 2^P(|ϕ|), there are two nodes
n0 , n1 on p such that t(n1 → n0 ) still satisfies ϕ.
Given Lemma 2, Lemma 1 follows by contracting all paths exceeding a given
length until the depth of the tree is exponential in |ϕ|. In fact (e.g., for ranked
trees) the equivalence classes of ≡Full can be used as the state set of a tree
automaton A and then it can be arranged that A reaches the same state on n0
as on n1 . The path index property implies that the automaton goes through only
exponentially many states on any path of a tree. By taking the product of this
automaton with a ranked schema, the corresponding depth bound relative to a
schema follows.
We give a simple argument for the path index bound in Lemma 2. First, note
that the total number of ϕ-types is exponential in |ϕ|. Now the sets DescTypes(n)
either become smaller or stay the same as n varies down a path, and hence can
only change exponentially often. Similarly the sets IncompTypes(n) grow bigger
or stay the same, and thus can change only exponentially often. In intervals
along a path where both of these sets are stable, the number of possibilities for
the ϕ-type of a node is exponential. This gives the path index bound.
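The count just given can be written out as a short calculation (our notation; the bound T is introduced only for this sketch):

```latex
% Let T be the number of phi-types, so T <= 2^{c|phi|} for some constant c.
% Along a root-to-leaf path p_1, p_2, ..., the two auxiliary sets are monotone:
\[
  \mathrm{DescTypes}(p_1) \supseteq \mathrm{DescTypes}(p_2) \supseteq \cdots,
  \qquad
  \mathrm{IncompTypes}(p_1) \subseteq \mathrm{IncompTypes}(p_2) \subseteq \cdots
\]
% Each chain changes value at most T times, so the path splits into at most
% 2T+1 maximal intervals on which both sets are constant. Within one interval
% an equivalence class of \equiv_{Full} is determined by the phi-type alone:
\[
  \text{path index of } \equiv_{Full} \;\le\; (2T+1)\cdot T
  \;=\; 2^{\mathrm{poly}(|\varphi|)}.
\]
```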
Theorem 1 follows from combining Lemma 1 with the following result on
satisfiability of NavXP:
Theorem 2. The satisfiability of a NavXP filter ϕ over trees of bounded depth b
is in PSPACE (in b and |ϕ|).

This result is a variant of a result from [BFG08] that finite satisfiability for
the fragment of NavXP which contains only axis relations child, parent, next-
sibling, preceding-sibling, previous-sibling and following-sibling is in PSPACE.
Given Theorem 2 we complete the proof of Theorem 1 by translating an FO2 sen-
tence ϕ into an NavXP filter ϕ with an exponential blow-up, using Proposition
1. By Lemma 1, the depth of a witness structure is bounded by an exponential
in |ϕ|, and the EXPSPACE result follows.
Lower Bound. We now show a matching lower bound for the satisfiability
problem.
Theorem 3. The satisfiability problem for FO2 on trees is EXPSPACE-hard,
with hardness holding even when formulas are restricted to be in GF2 [↓+ ].
This is proved by coding the acceptance problem for an alternating exponential
time machine. A tree node can be associated with an n-bit address, a path
corresponds to one thread of the alternating computation, and the tree structure
is used to code alternation. The equality and successor relations between the
addresses associated to nodes x and y can be coded in GF2 [↓+ ] using a standard
argument—see [Kie02] for details, where it was shown that a restricted variant
of the two-variable guarded fragment with some unary predicates and a single
binary predicate that is interpreted as a transitive relation is EXPSPACE-hard.
It is not hard to see that the proof presented there works fine (actually, it is even
more natural) if we restrict the class of admissible structures to (finite) trees.

4 Satisfiability without Child

Unary Alphabet Restriction, Polynomial Alternation Bounds, and Polynomial Depth Bounds. The previous section showed EXPSPACE-
completeness for satisfiability of FO2 [↓+ ]. However the EXPSPACE-hardness ar-
gument for ↓+ makes use of multiple predicates holding at a given node, to code
the address of a tape cell of an alternating exponential-time Turing Machine. It
thus does not apply to satisfiability over UAR trees (as defined in Section 2) or
to satisfiability with respect to a schema, since both cases restrict to a single
alphabet symbol per node. In both cases we show that the complexity of satis-
fiability can be lowered to NEXPTIME, using distinct techniques for the case of
ranked and unranked trees.
We start by noting that one always has at least NEXPTIME-hardness, even
with UAR.
Theorem 4. The satisfiability of FO2 [↓+ ] over UAR trees is NEXPTIME-hard, and
similarly with respect to a ranked schema.
The proof is a variation of the argument for NEXPTIME-hardness for words
[EVW02], but this time using the frontier of a shallow but wide tree to code the
tiling of an exponential grid.
We will prove a matching NEXPTIME upper bound for UAR trees and for
satisfiability with respect to a ranked schema. To do this, we extend an idea

introduced in the thesis of Weis [Wei11], working in the context of FO2 [<] on
UAR words: polynomial bounds on the number of times a formula changes its
truth value while keeping the same symbol along a given path.
The following is a generalization of Lemma 2.1.10 of Weis [Wei11]. Consider
an FO2 [↓+ ] formula ψ(x), a tree t satisfying the UAR, and fix a root-to-leaf path
p = p1 . . . pmax(p) in t. Given a label a, define an a-interval in p to be a set of
the form {i : m1 ≤ i < m2 ; t, pi |= a(x)} for some m1 , m2 .
Lemma 3. For every FO2 [↓+ ] formula ψ(x), UAR tree t, and root-to-leaf path
p = p1 . . . pmax(p) in t, the set {i : t, pi |= ψ(x) ∧ a(x)} can be partitioned into at most |ψ|^2 a-intervals.
From Lemma 3, we will show that FO2 [↓+ ] sentences that are satisfiable over
UAR trees always have polynomial-depth witnesses:
Lemma 4. If an FO2 [↓+ ] formula ϕ is satisfied over a UAR tree, then it is
satisfied by a model of depth bounded by a polynomial in |ϕ|.
Let us prove this fact. Suppose that ϕ is satisfied over a UAR tree t. On each path
p, for each letter b, let a (b, ϕ)-interval be a maximal b-interval on which every one-variable subformula of ϕ has constant truth value. By the lemma above, the total number of such intervals is polynomially bounded. We let W contain the endpoints of each (b, ϕ)-interval for all symbols b. We note the following crucial property of W:
for every node m in p which is not in W , there is a node in W with the same ϕ-type
as m that is strictly above m, and also one strictly below m.

[Figure: a path p before and after promotion — the off-path subtrees T0, T1, T2 of the removed nodes r0, r1 are reattached beneath the kept nodes w0, w1]

Fig. 1. Tree Promotion

The idea is now to remove all those points on path p that are not in W . This
must be done in a slightly unusual way, by “promoting” subtrees that are off the
path. For every child c of a removed node r that does not lie on path p we attach
SubTree(t, c) to the closest node of W above r (see Figure 1). Let t′ denote the tree obtained as a result of this surgery.
Let f be the partial function taking a node in t that is not removed to its image in t′. We claim that t′ still satisfies ϕ, and more generally that for any subformula ρ(x) of ϕ and node m of t, we have t, m |= ρ iff t′, f(m) |= ρ. This is proved by induction on ρ, with the base cases and the cases for boolean operators being
induction on ρ, with the base cases and the cases for boolean operators being

straightforward. For an existential formula ∃yβ(x, y), we give just the “only if”
direction, which is via case analysis on the position of a witness node w such
that t, m, w |= β.
If w is in t′ then t′, m, w |= β by the induction hypothesis and the fact that w is an ancestor (or descendant) of m in t′ if and only if it is an ancestor (or descendant) of m in t.
If w is not in t′, then it must be that w lies on the path p and is not one of the protected witnesses in W. But then w has both an ancestor w′ and a descendant w′′ in W that satisfy all the same one-variable subformulas as w does in t, with both w′ and w′′ preserved in the tree t′. If m and w′ are distinct then t′, m, w′ |= β by the induction hypothesis and the fact that m and w′ have the same ancestor/descendant relationship in t′ as do m and w in t. If m is identical to w′ then t′, m, w′′ |= β by similar reasoning. In any case we deduce that t′, m |= ∃yβ.
Since this process reduces the length of the chosen path p and does not increase the length of any other path, it is clear that iterating it yields a tree of polynomial depth.
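The promotion surgery on a single path can be sketched as follows (our own illustration; names hypothetical, with the protected set W passed in as `keep`):

```python
class Node:
    """Minimal tree node; `children` kept in left-to-right order."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for c in self.children:
            c.parent = self

def promote(path, keep):
    """Contract `path` (a top-down list of nodes whose top node is kept) to
    its nodes in `keep`; off-path subtrees of removed nodes are reattached to
    the closest kept node above them, as in Fig. 1."""
    # First detach every path edge, so each path node now lists only its
    # off-path children.
    for i in range(1, len(path)):
        path[i - 1].children.remove(path[i])
    last_kept = path[0]
    for node in path[1:]:
        if node in keep:
            last_kept.children.append(node)   # contracted path edge
            node.parent = last_kept
            last_kept = node
        else:
            for c in node.children:           # promote off-path subtrees
                last_kept.children.append(c)
                c.parent = last_kept
```

Since a removed node's remaining children are exactly its off-path subtrees, each such subtree ends up under the closest surviving node of `keep` above it, which is the operation described in the text.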
Note that we can guess a tree as above in NEXPTIME, and hence we have the
following bound:
Theorem 5. Satisfiability for FO2 [↓+ ] formulas over UAR unranked trees is in
NEXPTIME, and hence is NEXPTIME-complete.

Bounds on Subtrees and Satisfiability of FO2 [↓+ ] with Respect to a Ranked Schema. The collapse argument above relied heavily on the fact that
trees were unranked, since over a fixed rank we could not apply “pathwise col-
lapse”. Indeed, we can show that over ranked trees, an FO2 [↓+ ] formula satisfiable
over UAR trees need not have a witness of polynomial depth:
Theorem 6. There are FO2 [↓+ ] formulas ϕn of size O(n) that are satisfiable
over UAR binary trees, where the minimum depth of satisfying UAR binary trees
grows as 2^n.
Nevertheless, we can still obtain an NEXPTIME bound for UAR trees of a given
rank, and even for satisfiability with respect to a ranked schema.
Theorem 7. The satisfiability problem for FO2 [↓+ ] over ranked schemas is in
NEXPTIME, and is thus NEXPTIME-complete.
We give the argument only for satisfiability with respect to rank-k UAR trees,
leaving the extension to schemas for the full paper. The idea will be to create a
model with only an exponential number of distinct subtrees, which can be rep-
resented by an exponential-sized DAG. We do this by creating an equivalence
relation that is globally ϕ-preserving (not just pathwise) and which has expo-
nential index (not just path index). We will then collapse equivalent nodes, as in
Lemma 2. There are several distinctions from that lemma: to identify nodes that
are not necessarily comparable, we cannot afford to abstract a node by the set
of all the types realized below it, since within the tree as a whole there can be
doubly-exponentially many such sets. Instead we will make use of some “global

information” about the tree, in the form of a set of “protected witnesses”, which
we denote W .
By Lemma 1 we know that a satisfiable FO2 [↓+ ] formula ϕ has a model t of
depth at most exponential in |ϕ|. Fix such a t. For each ϕ-type τ, let wτ be a node
of t with maximal depth satisfying τ . We include all wτ and all of their ancestors
in a set W , and call these basic global witnesses. For any m that is an ancestor
or equal to a basic global witness wτ , and any subformula ρ(x) = ∃yβ(x, y)
of ϕ, if there is w incomparable (by the descendant relation) to m such that
t, m, w |= β we add one such w to W , along with all its ancestors – these are
the incomparable global witnesses.
We need one more definition. Given a node m in a tree, for every ϕ-type τ realized by some ancestor m′ of m, for every subformula ∃yβ(x, y) of τ, if there is a descendant w of m such that t, m′, w |= β(x, y), choose one such witness w and let SelectedDescTypes(m) include the ϕ-type of that witness. Note that the same witness will suffice for every ancestor m′ realizing τ, and since
SelectedDescTypes(m) will be of polynomial size.
Now we transform t to t′ such that t′ |= ϕ and t′ has only exponentially many different subtrees. We make use of a well-founded linear order ≺ on trees with a given rank and label alphabet, such that: 1. SubTree(t, n′) ≺ SubTree(t, n) implies n′ is not an ancestor of n; 2. for every tree C with a distinguished leaf, for trees t1, t2 with t1 ≺ t2, we have C[t1] ≺ C[t2], where C[ti] is the tree obtained by replacing the distinguished leaf of C with ti. There are many such orderings, e.g. using standard string encodings of a tree.
For any model t, if there are two nodes n, n′ in t such that 1. n, n′ ∉ W; 2. Tpϕ(n) = Tpϕ(n′); 3. AncTypes(n) = AncTypes(n′); 4. SelectedDescTypes(n) = SelectedDescTypes(n′); 5. SubTree(t, n′) ≺ SubTree(t, n) (which implies that n′ cannot be an ancestor of n), then let t′ = Update(t) be obtained by choosing such n and n′ and replacing the subtree rooted at n by the subtree rooted at n′.
Let T1 be the nodes in t that were not in SubTree(t, n), and for any node m ∈ T1 let f(m) denote the same node considered within t′. Let T2 denote the nodes in t′ that are images of a node in SubTree(t, n′). For each m ∈ T2, let f⁻¹(m) denote the node in SubTree(t, n′) from which it derives.
We claim the following:
Lemma 5. For all m ∈ T1 the ϕ-type of m in t is the same as the ϕ-type of f(m) in t′. Moreover, for every node m in T2, the ϕ-type of m in t′ is the same as that of f⁻¹(m) in t.
Applying the lemma above to the root of t, which is necessarily in T1 , it follows
that the truth of the sentence ϕ is preserved by this operation.
We now iterate the procedure ti+1 := Update(ti ), until no more updates are
possible. This procedure terminates, because the tree decreases in the order ≺
at every step. We can thus represent the tree as an exponential-sized DAG, with
one node for each subtree.
Thus we have shown that any satisfiable formula has an exponential-size DAG
that unfolds into a model of the formula. Given such a DAG, we can check

whether an FO2 formula holds in polynomial time in the size of the DAG. This
gives a NEXPTIME algorithm for checking satisfiability.
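The compression step — one DAG node per distinct subtree — is ordinary hash-consing. A minimal sketch (our own illustration; a tree is represented as a (label, list-of-subtrees) pair, which is an assumption of this sketch, not the paper's encoding):

```python
def to_dag(tree):
    """Share identical subtrees bottom-up. Returns (root_id, table), where
    table maps each id to (label, tuple_of_child_ids). Two identical subtrees
    receive the same id, so the table is a DAG with one node per distinct
    subtree."""
    table, memo = {}, {}

    def go(node):
        label, children = node
        key = (label, tuple(go(c) for c in children))
        if key not in memo:
            memo[key] = len(table)
            table[memo[key]] = key
        return memo[key]

    return go(tree), table
```

On a full binary tree of depth n the table has only n + 1 entries, illustrating how a tree of doubly-exponential size with exponentially many distinct subtrees fits in an exponential-sized DAG.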

5 Satisfiability without Descendant


Recall that even on words with only the successor relation, the satisfiability prob-
lem for two-variable logic is NEXPTIME-hard [EVW02]. From this it is easy to
see that the satisfiability for FO2 [↓] is NEXPTIME-hard, on ranked and unranked
trees.
Theorem 8. The satisfiability problem for FO2 [↓] is NEXPTIME-hard, even
with the UAR.
We now present a matching upper bound, which holds even in the presence of
sibling relations, i.e., for FO2 [↓, →, →+ ]. The result is surprising, in that it is
easy to write satisfiable FO2 [↓] sentences ϕn of polynomial size whose smallest
tree model is of depth exponential in n, and whose size is doubly exponential.
Indeed, such formulas can be obtained as a variation of the proof of Theorem 8,
by coding a complete binary tree whose nodes are associated with n-bit numbers,
increasing the number by 1 as we move from parent to either child.
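A hedged sketch of such a family (our own reconstruction, not necessarily the ϕ_n of the proof): take unary predicates B_1, …, B_n encoding an n-bit counter that is zero at the root and incremented along ↓, using the rule that bit i flips exactly when all lower-order bits are 1 (⊕ abbreviates the corresponding boolean combination of ↔ and ¬):

```latex
\varphi_n \;=\;
  \forall x\,\bigl(\neg\exists y\,(y \downarrow x) \rightarrow
     \textstyle\bigwedge_{i=1}^{n} \neg B_i(x)\bigr)
  \;\wedge\;
  \forall x\,\forall y\,\Bigl(x \downarrow y \rightarrow
     \textstyle\bigwedge_{i=1}^{n}\bigl(B_i(y) \leftrightarrow
       \bigl(B_i(x) \oplus \textstyle\bigwedge_{j<i} B_j(x)\bigr)\bigr)\Bigr)
  \;\wedge\;
  \forall x\,\bigl(\textstyle\bigvee_{i=1}^{n} \neg B_i(x) \rightarrow
     \exists y\,(x \downarrow y)\bigr)
```

Every model contains a path whose counter climbs from 0 to 2^n − 1, forcing exponential depth; the doubly-exponential size additionally requires forcing binary branching at every node.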
Theorem 9. The satisfiability problem for FO2 [↓, →, →+ ], and the satisfiability
problem with respect to a ranked schema, are in NEXPTIME, and hence are
NEXPTIME-complete.
We sketch the idea for satisfiability, which iteratively quotients the structure by
an equivalence relation, while preserving certain global witnesses, along the lines
of Theorem 7. By Lemma 1 we know that a satisfiable FO2 [↓, →, →+ ] formula ϕ
has a model t of depth at most exponential in |ϕ|, where the outdegree of nodes
is bounded by an exponential.
For each ϕ-type that is satisfied in t, choose a witness and include it along
with all its ancestors in a set W – that is, we include the “basic witnesses” as in
Theorem 7. We also include all children of each basic witness – call these “child
witnesses”.
Thus the size of the set of “protected witnesses” W is again at most ex-
ponential. Now we transform t to t such that t |= ϕ and at the same time
t has only exponentially many different subtrees. Our update procedure looks
for nodes n, n in t such that 1. n, n ∈ W ; 2. SubTree(t, n ) ≺ SubTree(t, n),
where ≺ is an appropriate ordering (as in Theorem 7); 3. Tpϕ (n) = Tpϕ (n ) and
Tpϕ (parent(n)) = Tpϕ (parent(n )). We then obtain t = Update(t) by choosing
such n and n and replacing SubTree(t, n) by SubTree(t, n ).
The theorem is proved by showing that this update operation preserves ϕ. Iter-
ating it until no two nodes can be found produces a tree that can be represented
as an exponential-size DAG.

6 Restricting the Logic


For FO2 [↓+ ] over trees the complexity drop from EXPSPACE to NEXPTIME, re-
sulting from restricting the class of models to those satisfying UAR, is slightly

less spectacular than in the case of words, where an analogous restriction de-
creases the complexity from NEXPTIME to NP. However, to obtain the NEXPTIME lower bound, we need to speak about pairs x, y of elements in free position, i.e., elements such that y is neither an ancestor nor a descendant of x. Thus it is
natural to look at the situation where quantification is restricted to only pairs of
elements that are connected by binary relations. To capture the former kind of
scenario we consider the restriction of FO2 to the two-variable guarded fragment,
GF2 , in which all quantifiers have to be relativised by binary predicates. It is
easy to see that GF2 on trees still embeds NavXP, while still being exponentially
more succinct. We are able to show a PSPACE bound on satisfiability of GF2 [↓+ ]
for UAR trees. The following observation is crucial.
Lemma 6. Let ϕ be a GF2 [↓+ ] formula and let t be a UAR tree satisfying ϕ.
Then, there exists a tree t′, obtained by removing some subtrees from t, still satisfying ϕ, such that the degree of nodes in t′ is bounded polynomially in |ϕ|
and the depth of t.
For the proof assume w.l.o.g. that ϕ is written in negation normal form, i.e.,
negations occur only in front of atomic formulas. For every subformula of ϕ of
the form ∃xψ(x) which is satisfied in t choose a single node satisfying ψ and
mark it together with all its ancestors. Analogously for formulas ∃yψ(y). For
every formula ∃y(x↓+ y ∧ ψ(x, y)) belonging to the ϕ-type of the root of t choose
a witness and mark it, together with all its ancestors. Then remove all subtrees
rooted at unmarked successors of the root. Note that the obtained structure still
satisfies ϕ. Analogously as with the root proceed with all marked elements, e.g.,
in a depth-first manner. Let t′ be the tree obtained after the final step of the above process. Note that the number of successors of a node in t′ at depth l is bounded by (l + 1) · |ϕ|. This justifies the bound from the statement of the
lemma.
Theorem 10. The satisfiability problem for GF2 [↓+ ] over finite UAR trees is
PSPACE-complete.
We propose an alternating procedure solving the problem. Note that by combin-
ing Lemma 4 and Lemma 6 we may restrict our attention to trees whose depth
and degree are polynomially bounded in the size of the input formula ϕ. First,
our procedure guesses labels and ϕ-types of the root and its children, and checks
if the guessed information is consistent. Then it universally chooses one of the
children, guesses labels and ϕ-types of its children, and proceeds analogously.
In this way, the procedure builds a single path of the tree, together with the
immediate successors of all its nodes. This is sufficient to determine if a model
satisfies ϕ, as ϕ is guarded and cannot speak about pairs of elements not belong-
ing to a common path. Note that our procedure works in alternating polynomial
time, and thus can be also implemented in PSPACE. The matching lower bound
can be shown by reduction from the QBF problem. The crux is enforcing a full
binary tree of depth n with internal nodes at depth i coding truth values of the
i-th propositional variable from the QBF formula. In GF2 [↓+ ] we can measure
the depth of a node in a tree, and thus we can determine the identity of the

variable encoded at a given node. Then evaluating a formula at a leaf node, we


can reconstruct the valuation stored on the path leading to it from the root.
Augmenting GF2 [↓+ ] with any of the remaining binary navigational predicates
leads to an EXPSPACE lower bound over UAR trees.
Theorem 11. The satisfiability problem over finite UAR trees for each of the
logics GF2 [↓, ↓+ ], GF2 [↓+ , →], GF2 [↓+ , →+ ] is EXPSPACE-hard.
In the proof we can follow the construction from [Kie02] showing EXPSPACE-
hardness of GF2 [↓+ ]. Combinations of unary predicates holding in a single node
in that proof can be now simulated by means of additional binary predicates. In
the case of GF2 [↓, ↓+ ] we code them using auxiliary children of a node. In the
case of GF2 [↓+ , →] and GF2 [↓+ , →+ ] we employ siblings for this task. Details
will be given in the full version of the paper.
This completes the picture for the case of signatures containing ↓+ . (Recall
Theorem 3, which shows that without the UAR restriction already GF2 [↓+ ] is
EXPSPACE-hard.) We now consider the case of signatures containing ↓ but not
containing ↓+ .

Theorem 12. The satisfiability problem for GF2 [↓, →] over finite trees is in
EXPTIME. The satisfiability problem for GF2 [↓] is EXPTIME-hard, even under
UAR assumption.

For the upper bound we propose an alternating procedure working in polynomial space. The procedure looks for a model of an input formula ϕ of depth and degree
exponentially bounded in the size of ϕ, as guaranteed by Lemma 1. Again we
assume that ϕ is in negation normal form. At each moment during its execution
the procedure stores information about a node of the tree and polynomially
many of its children. For each node this information consists of its label, ϕ-type,
and, for every subformula of ϕ of the form ∃xψ(x) or ∃yψ(y), a note whether
ψ is satisfied at some point in the subtree rooted at this node. The procedure
first guesses the information about the root. For every formula ∃y(x↓y ∧ ψ(x, y))
belonging to its ϕ-type it guesses a witness. Additionally, for every formula
∃xψ(x) or ∃yψ(y) for which it is declared that ψ will hold below, a witness satisfying ψ, or declaring that ψ will hold below it, is guessed. (Note that in total at most polynomially many witnesses are required.) Also the order in
which all those witnesses appear on the list of the children of the root is guessed.
Now the procedure universally chooses a pair of consecutive witnesses n1 , n2 and,
starting from n1 , tries to reach n2 by a →-chain of elements. At each of the nodes
it checks if the guessed information is consistent with the information about its
neighbours, and additionally it makes a universal choice between continuing
the horizontal path to n2 or going down the tree (in a way similar to the one
described for the root).
The lower bound in the above theorem can be shown by an encoding of an
alternating Turing machine working in polynomial space.
Equipping the logic with →+ allows us to lift the lower bound, even assuming
UAR:

Theorem 13. The satisfiability problem for GF2 [↓, →+ ] over finite UAR trees
is NEXPTIME-hard.
The proof of this theorem relies on the fact that without the UAR assumption
FO2 is NEXPTIME-hard even if only unary relations are allowed in the signature
[EVW02]. This can be simulated in our scenario: we use the children of the root
to encode the elements in a model of such a unary formula. Then the relation →+
may be used as a guard, allowing us to refer to any pair of these. The combination
of unary predicates holding at a given position can be simulated by means of
the ↓-successors.
Recall that an upper bound matching the lower bound from Theorem 13 holds
for FO2 [↓, →, →+ ] even without the UAR assumption (Theorem 9).

7 Conclusions and Acknowledgements


The main result of the paper is that the satisfiability problem for FO2 over finite
trees, with four navigational predicates: ↓, ↓+ , →, →+ , is EXPSPACE-complete.
We also consider an additional semantic restriction that at a single node precisely
one unary predicate holds (UAR). Under UAR the full logic remains EXPSPACE-
complete, but for some of its weakened variants this assumption makes a difference.
Namely, FO2 [↓+ ] becomes NEXPTIME-complete and GF2 [↓+ ] is even PSPACE-
complete under UAR, even though both logics are still EXPSPACE-complete
without UAR.
We go on to establish the precise complexity bounds for all logics GF2 [τbin ] and
FO2 [τbin ], with τbin ⊆ {↓, ↓+ , →, →+ } containing at least ↓ or ↓+ , for arbitrary
finite trees or under UAR. These bounds can be inferred from the results below:
Each of the 48 possible variants lies between two logics, one for which we have a
lower bound and one for which we have a matching upper bound.
Lower Bounds:
– PSPACE: GF2 [↓+ ] with UAR (Thm. 10)
– EXPTIME: GF2 [↓] with UAR (Thm. 12)
– NEXPTIME: FO2 [↓+ ] with UAR (Thm. 4), FO2 [↓] with UAR (Thm. 8),
GF2 [↓, →+ ] with UAR (Thm. 13)
– EXPSPACE: GF2 [↓+ ] (Thm. 3), GF2 [↓, ↓+ ], GF2 [↓+ , →], GF2 [↓+ , →+ ], all
with UAR (Thm. 11)

Upper Bounds:
– PSPACE: GF2 [↓+ ] with UAR (Thm. 10)
– EXPTIME: GF2 [↓, →] (Thm. 12)
– NEXPTIME: FO2 [↓+ ] with UAR (Thm. 5), FO2 [↓, →, →+ ] (Thm. 9)
– EXPSPACE: FO2 [↓, ↓+ , →, →+ ](Thm. 1)
We also obtain some results concerning satisfiability over ranked trees and satis-
fiability in the presence of schemas.
One direction of future research is to extend the analysis to infinite trees. It
seems that the complexity results we have obtained here can be transferred to
this case without major difficulties.
88 S. Benaim et al.

Acknowledgements. This paper is a merger of two independently developed
works, [BBLW13] and [CKM13]. We thank the anonymous reviewers of ICALP
for many helpful remarks on both works.

References
[AvBN98] Andréka, H., van Benthem, J., Németi, I.: Modal languages and bounded
fragments of predicate logic. J. Phil. Logic 27, 217–274 (1998)
[BBLW13] Benaim, S., Benedikt, M., Lenhardt, R., Worrell, J.: Controlling the depth,
size, and number of subtrees in two variable logic over trees. CoRR
abs/1304.6925 (2013)
[BFG08] Benedikt, M., Fan, W., Geerts, F.: XPath satisfiability in the presence of
DTDs. J. ACM 55(2), 8:1–8:79 (2008)
[BK08] Benedikt, M., Koch, C.: XPath Leashed. ACM Comput. Surv. 41(1), 3:1–
3:54 (2008)
[BLW12] Benedikt, M., Lenhardt, R., Worrell, J.: Verification of two-variable logic
revisited. In: QEST, pp. 114–123. IEEE (2012)
[BMSS09] Bojańczyk, M., Muscholl, A., Schwentick, T., Segoufin, L.: Two-variable
logic on data trees and XML reasoning. J. ACM 56(3) (2009)
[CKM13] Charatonik, W., Kieroński, E., Mazowiecki, F.: Satisfiability of the two-
variable fragment of first-order logic over trees. CoRR abs/1304.7204 (2013)
[CW13] Charatonik, W., Witkowski, P.: Two-variable logic with counting and trees.
In: LICS. IEEE (to appear, 2013)
[EVW02] Etessami, K., Vardi, M.Y., Wilke, T.: First-order logic with two variables
and unary temporal logic. Inf. Comput. 179(2), 279–295 (2002)
[Fig12] Figueira, D.: Satisfiability for two-variable logic with two successor relations
on finite linear orders. CoRR abs/1204.2495 (2012)
[GKV97] Grädel, E., Kolaitis, P.G., Vardi, M.Y.: On the decision problem for two-
variable first-order logic. Bull. Symb. Logic 3(1), 53–69 (1997)
[Kie02] Kieroński, E.: EXPSPACE-complete variant of guarded fragment with tran-
sitivity. In: Alt, H., Ferreira, A. (eds.) STACS 2002. LNCS, vol. 2285,
pp. 608–619. Springer, Heidelberg (2002)
[Kie05] Kieroński, E.: Results on the guarded fragment with equivalence or transi-
tive relations. In: Ong, L. (ed.) CSL 2005. LNCS, vol. 3634, pp. 309–324.
Springer, Heidelberg (2005)
[Mar04] Marx, M.: XPath with conditional axis relations. In: Bertino, E.,
Christodoulakis, S., Plexousakis, D., Christophides, V., Koubarakis, M.,
Böhm, K. (eds.) EDBT 2004. LNCS, vol. 2992, pp. 477–494. Springer,
Heidelberg (2004)
[MdR04] Marx, M., de Rijke, M.: Semantic characterization of navigational
XPath. In: TDM. CTIT Workshop Proceedings Series, pp. 73–79 (2004)
[ST13] Szwast, W., Tendera, L.: FO2 with one transitive relation is decidable. In:
STACS. LIPIcs, vol. 20, pp. 317–328, Schloss Dagstuhl - Leibniz-Zentrum
fuer Informatik (2013)
[Sto74] Stockmeyer, L.J.: The Complexity of Decision Problems in Automata The-
ory and Logic. PhD thesis, Massachusetts Institute of Technology (1974)
[Tho97] Thomas, W.: Languages, automata, and logic. In: Rozenberg, G., Salomaa,
A. (eds.) Handbook of Formal Languages. Springer (1997)
[Wei11] Weis, P.: Expressiveness and Succinctness of First-Order Logic on Finite
Words. PhD thesis, University of Massachusetts (2011)
Nondeterminism in the Presence
of a Diverse or Unknown Future

Udi Boker¹, Denis Kuperberg², Orna Kupferman², and Michał Skrzypczak³

¹ IST Austria, Klosterneuburg, Austria
² The Hebrew University, Jerusalem, Israel
³ University of Warsaw, Poland

Abstract. Choices made by nondeterministic word automata depend on both the
past (the prefix of the word read so far) and the future (the suffix yet to be read).
In several applications, most notably synthesis, the future is diverse or unknown,
leading to algorithms that are based on deterministic automata. Hoping to retain
some of the advantages of nondeterministic automata, researchers have studied
restricted classes of nondeterministic automata. Three such classes are nondeter-
ministic automata that are good for trees (GFT; i.e., ones that can be expanded
to tree automata accepting the derived tree languages, thus whose choices should
satisfy diverse futures), good for games (GFG; i.e., ones whose choices depend
only on the past), and determinizable by pruning (DBP; i.e., ones that embody
equivalent deterministic automata). The theoretical properties and relative merits
of the different classes are still open, and it is unclear whether they really differ
from deterministic automata. In particular, while DBP ⊆ GFG ⊆ GFT, it is not
known whether every GFT automaton is GFG and whether every GFG automa-
ton is DBP. Also open is the possible succinctness of GFG and GFT automata
compared to deterministic automata. We study these problems for ω-regular au-
tomata with all common acceptance conditions. We show that GFT=GFG⊃DBP,
and describe a determinization construction for GFG automata.

1 Introduction
Nondeterminism is very significant in word automata: it allows for exponential suc-
cinctness [14] and in some cases, such as Büchi automata, it also increases the expres-
sive power [9]. In the automata-theoretic approach to formal verification, temporal logic
formulas are translated to nondeterministic word automata [16]. In some applications,
such as model checking, algorithms can proceed on the nondeterministic automaton,
whereas in other applications, such as synthesis and control, they cannot. There, the
advantages of nondeterminism are lost, and the algorithms involve a complicated deter-
minization construction [15] or acrobatics for circumventing determinization [8].
To see the inherent difficulty of using nondeterminism in synthesis, let us review the
current approach for solving the synthesis problem, going through games [4].

This work was supported in part by the Polish Ministry of Science grant no. N206 567840,
Poland’s NCN grant no. DEC-2012/05/N/ST6/03254, Austrian Science Fund NFN RiSE (Rig-
orous Systems Engineering), ERC Advanced Grant QUAREM (Quantitative Reactive Model-
ing), and ERC Grant QUALITY. The full version is available at the authors’ URLs.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 89–100, 2013.
© Springer-Verlag Berlin Heidelberg 2013

Let L be a language of infinite words over an alphabet 2I∪O , where I and O are sets of
input and output signals, respectively. The synthesis problem for L is to build a reactive
system that outputs signals from 2O upon receiving input signals from 2I , such that the
generated sequence (an infinite word over the alphabet 2I∪O ) is in L [12]. The prob-
lem is solved by taking a deterministic automaton D for L and conducting a two-player
game on top of it. The players, “system” and “environment”, generate words over 2I∪O ,
where in each turn the environment first chooses the 2I component of the next letter,
the system responds with the 2O component, and D moves to the successor state. The
goal of the system is to generate an accepting run of D no matter which sequence of
input assignments is generated by the environment. The system has a winning strategy
iff the language L can be synthesized.
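The game-based approach can be illustrated concretely on the simplest case, a safety condition, where the system wins iff it can keep the run of D inside a set of safe states forever. The sketch below computes the system's winning region by the standard greatest-fixpoint (attractor-style) iteration; the toy automaton, its state names, and the alphabets are illustrative assumptions, not taken from the paper.

```python
# Sketch: solving a safety game on a deterministic automaton D.
# The toy automaton and all names are illustrative assumptions.

def winning_region(inputs, outputs, delta, safe):
    """Greatest fixpoint: states from which the system can keep the
    run inside `safe` forever, no matter which inputs arrive."""
    W = set(safe)
    changed = True
    while changed:
        changed = False
        for q in list(W):
            # The environment moves first with an input letter; the
            # system must have some output response staying inside W.
            # (Missing transitions come back as None, i.e. losing.)
            if not all(any(delta.get((q, (i, o))) in W for o in outputs)
                       for i in inputs):
                W.discard(q)
                changed = True
    return W

# Toy instance: the system must echo the input bit to stay safe.
inputs, outputs = {0, 1}, {0, 1}
delta = {("ok", (i, o)): ("ok" if i == o else "bad")
         for i in inputs for o in outputs}
delta.update({("bad", (i, o)): "bad" for i in inputs for o in outputs})

W = winning_region(inputs, outputs, delta, safe={"ok"})
print("ok" in W)   # the echoing strategy keeps the run safe forever
```

Here the echoing strategy witnesses that the initial state is winning; dropping the system's ability to echo (e.g., fixing its output) would empty the region.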
Now, if one tries to replace D with a nondeterministic automaton A for L, the system
should also choose a transition to proceed with. Then, it might be that L is synthesizable
and still the system has no winning strategy, as each nondeterministic choice of A
may "cover" only a strict subset of the possible futures.
Some nondeterministic automata are, however, good for games: in these automata
it is possible to resolve the nondeterminism in a way that only depends on the past
and still accepts all the words in the language. This notion, of good for games (GFG)
automata was first introduced in [5].¹ Formally, a nondeterministic automaton over the
alphabet Σ is GFG if there is a strategy that maps each word x ∈ Σ ∗ to the transition
to be taken after x is read. Note that a state q of the automaton may be reachable via
different words, and the strategy may suggest different transitions from q after different
words are read. Still, the strategy depends only on the past, meaning on the word read
so far. Obviously, there exist GFG automata: deterministic ones, or nondeterministic
ones that are determinizable by pruning (DBP); that is, ones that just add transitions on
top of a deterministic automaton. In fact, these are the only examples known so far of
GFG automata.² A natural question is whether all GFG automata are DBP.
More generally, a central question is what role nondeterminism can play in automata
used for games, or abstractly put, in cases that the future is unknown. Specifically, can
such nondeterminism add expressive power? Can it contribute to succinctness? Is it
“real” or must it embody a deterministic choice?
Before addressing these questions, one should consider their tight connection to non-
determinism in tree automata for derived languages [7]: A nondeterministic word au-
tomaton A with language L is good for trees (GFT) if, when expanding its transition
function to get a symmetric tree automaton, it recognizes the derived language, denoted
der(L), of L; that is, all trees all of whose branches are in L [7]. Tree automata for de-
rived languages were used for solving the synthesis problem [12] and are used when
translating branching temporal logics such as CTL to tree automata [3]. Analogously
to GFG automata, the problem in using nondeterminism in GFT automata stems from
the need to satisfy different futures (the different branches in the tree). For example,

¹ GFGness is also used in [2] in the framework of cost functions under the name
"history-determinism".
² As explained in [5], the fact that the GFG automata constructed there are DBP does not
contradict their usefulness in practice, as their transition relation is simpler than that of the
embodied deterministic automaton and it can be defined symbolically.

solving the synthesis problem, the branches of the tree correspond to the possible input
sequences, and when the automaton makes a guess, the guess has to be successful for
all input sequences. The main difference between GFG and GFT is that the former can
only use the past, whereas the latter can possibly take advantage of the future, except
that the future is diverse.
A principal question is whether GFG and GFT automata are the same, meaning
whether nondeterminism can take some advantage of a diverse future, or is it the same
as only considering the past.
It is not difficult to answer all the above questions for safety languages; that is, when
the language L = L(A) ⊆ Σ ω is such that all the words in L can be arranged in one
tree. Then, a memoryless accepting run of A (that is, its expansion to a symmetric tree
automaton for der(L)) on this tree induces a deterministic automaton embodied in A,
meaning that A is DBP. Moving to general ω-regular languages, the first question, con-
cerning expressiveness of deterministic versus GFT automata, was answered in [7] with
respect to Büchi automata, and in [11] with respect to all levels of the Mostowski hier-
archy. It is shown in these works that if der(L) can be recognized by a nondeterministic
Büchi tree automaton, then L can be recognized by a deterministic Büchi word automa-
ton, and similarly for parity conditions of a particular index. Thus, nondeterminism in
the presence of unknown or diverse future does not add expressive power. The other
questions, however, are open since the 90s.
In this paper we examine these questions further for automata with all common ac-
ceptance conditions. We first show that a Muller automaton is GFG iff it is GFT. As the
Muller condition can describe all the common acceptance conditions (Büchi, co-Büchi,
parity, Streett, and Rabin), the result extends to all of them. Intuitively, a GFT
automaton A (or, equivalently, a nondeterministic tree automaton for a derived language) is
limited in using information about the future, as different branches of the tree challenge
it with different futures. Formally, we prove that A is GFG by using determinacy of a
well-chosen game. The same game allows us to show that there is a deterministic au-
tomaton for L(A) with the same acceptance condition as A. This also simplifies the
result of [11] and generalizes it to Muller conditions. Indeed, the proof in [11] is based
on intricate arguments that heavily rely on the structure of parity condition.
Can GFG automata take some advantage of nondeterminism or do they simply hide
determinism? We show the existence of GFG Büchi and co-Büchi automata that use
the past in order to make decisions, and thus cannot have a memoryless strategy. Note
that we use the basic acceptance conditions for these counter examples, thus the result
carries over to all common acceptance conditions. This is different from known results on
GFG automata over finite words or weak GFG automata, where GFG automata are DBP
[7,10]. This result is quite surprising, as strategies in parity games are memoryless. We
further build a GFG automaton that cannot be pruned into a deterministic automaton
even with a finite unbounded look-ahead, meaning that even an unbounded yet finite
view of the future cannot compensate for the lack of memory.
Regarding succinctness, the currently known upper bound for the state blowup in-
volved in determinizing a GFG parity automaton is exponential [7], with no nontrivial
lower bound. We provide some insights on GFG automata, showing that in some cases
their determinization is efficient. We show that if A and B are GFG Rabin automata that

recognize a language L and its complement, then there is a deterministic Rabin au-
tomaton for L of size |A × B|. Thus, in the context of GFG automata, determinization
is essentially the same problem as complementation. Moreover, our construction shows
that determinization cannot induce an exponential blowup both for an automaton and its
complement. This is in contrast with standard nondeterminism, even over finite words.
For example, both the language Lk = (a + b)∗ a(a + b)k and its complement admit non-
deterministic automata that are linear in k, while the deterministic ones are exponential
in k.
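The exponential gap claimed for Lk can be checked mechanically. The sketch below builds the obvious (k+2)-state NFA for Lk over finite words ("guess the a that is k+1 letters from the end") and runs the subset construction; the number of reachable DFA states is 2^(k+1). The encoding is an illustrative assumption, not taken from the paper.

```python
# Sketch: nondeterminism is exponentially succinct for
# L_k = (a+b)* a (a+b)^k over finite words. The NFA below has k+2
# states; the subset construction yields 2^(k+1) reachable DFA states.

def nfa_lk(k):
    """NFA for L_k: state 0 loops and guesses the marked 'a';
    states 1..k+1 count the remaining letters; k+1 is accepting."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, k + 1):
        for c in "ab":
            delta[(i, c)] = {i + 1}
    return delta

def reachable_subsets(delta, initial=frozenset({0})):
    """Subset construction: collect every reachable set of NFA states."""
    seen, frontier = {initial}, [initial]
    while frontier:
        S = frontier.pop()
        for c in "ab":
            T = frozenset(q2 for q in S for q2 in delta.get((q, c), ()))
            if T not in seen:
                seen.add(T)
                frontier.append(T)
    return seen

for k in range(1, 6):
    print(k, len(reachable_subsets(nfa_lk(k))))   # 2 ** (k + 1)
```

Intuitively, each reachable subset records exactly which of the last k+1 letters were a, so all 2^(k+1) combinations occur.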
Due to lack of space, some proofs are omitted, or shortened, and can be found in the
full version.

2 Preliminaries
2.1 Trees and Labeled Trees
We consider trees over a set D of directions. A tree T is a prefix-closed subset of
D∗; we refer to D∗ itself as the complete D-tree. The elements of T are called nodes, and ε is
the root of T. For a node u ∈ D∗ and d ∈ D, the node ud is the child of u with direction
d. A path of T is a set π ⊆ T , such that ε ∈ π and for all u ∈ π, there is a unique d ∈ D
with ud ∈ π. Note that each path π corresponds to an infinite word in Dω .
For an alphabet Σ, a Σ-labeled D-tree is a D-tree in which each edge is labeled by a
letter from Σ. We choose to label edges instead of nodes in order to be able to compose
a set of words into a single tree, even when the set contains words that do not agree on
their first letter. Formally, a Σ-labeled D-tree is a pair ⟨T, t⟩, where T ⊆ D∗ is a D-tree
and t : T \ {ε} → Σ labels each edge (or, equivalently, its target node) by a letter in Σ.
Let TD,Σ be the set of Σ-labeled D-trees (not necessarily complete). We say that a word
w ∈ Σω is a branch of a tree ⟨T, t⟩ ∈ TD,Σ if there is a path π = {ε, u1, u2, . . .} ⊆ T
such that w = t(π) = t(u1)t(u2) · · · . We use branches(T, t) to denote the set of
branches of ⟨T, t⟩. Note that branches(T, t) is a subset of Σω.

2.2 Automata
Automata on words. An automaton on infinite words is a tuple A = ⟨Σ, Q, q0, Δ, α⟩,
where Σ is the input alphabet, Q is a finite set of states, q0 ∈ Q is the (for simplicity,
single) initial state, Δ ⊆ Q × Σ × Q is a transition relation, with ⟨q, a, q′⟩ ∈ Δ meaning
that the automaton in state q, reading a, can move to state q′, and α is an acceptance
condition. Here we will use Büchi, co-Büchi, parity, Rabin, Streett and Muller automata.
In a Büchi (resp. co-Büchi) condition, α ⊆ Q is a set of accepting (resp. rejecting)
states. In a parity condition of index [i, j], the acceptance condition α : Q → [i, j] is a
function mapping each state to its priority (we use [i, j] to denote the set
{i, i + 1, . . . , j}). In a Rabin (resp. Streett) condition, α ⊆ 2^Q × 2^Q is a set of pairs of
sets of states, and in a Muller condition, α ⊆ 2^Q is a set of sets of states.
Since the transition relation may specify many possible transitions for each state and
letter, the automaton A may be nondeterministic. If Δ is such that for every q ∈ Q and
a ∈ Σ, there is a single state q′ ∈ Q such that ⟨q, a, q′⟩ ∈ Δ, then A is a deterministic
automaton.

Given an input word w = a0 · a1 · · · in Σω, a run of A on w is a function r : N → Q
where r(0) = q0 and, for every i ≥ 0, we have ⟨r(i), ai, r(i + 1)⟩ ∈ Δ; i.e., the run
starts in the initial state and obeys the transition relation. For a run r, let inf(r) denote
the set of states that r visits infinitely often. That is, inf(r) = {q ∈ Q : for infinitely
many i ≥ 0, we have r(i) = q}. The run r is accepting iff
– inf(r) ∩ α ≠ ∅, for a Büchi condition.
– inf(r) ∩ α = ∅, for a co-Büchi condition.
– max{α(q) : q ∈ inf(r)} is even, for a parity condition.
– there exists ⟨E, F⟩ ∈ α such that inf(r) ∩ E = ∅ and inf(r) ∩ F ≠ ∅, for a Rabin
condition.
– for all ⟨E, F⟩ ∈ α, we have inf(r) ∩ E ≠ ∅ or inf(r) ∩ F = ∅, for a Streett
condition.
– inf(r) ∈ α, for a Muller condition.
Note that Büchi and co-Büchi are dual, as well as Rabin and Streett. Parity and Muller
are self-dual. Also note that Büchi and co-Büchi are a special case of parity, which is a
special case of Rabin and Streett, which in turn are special cases of the Muller condition.
An automaton A accepts an input word w iff there exists an accepting run of A on w.
The language of A, denoted L(A), is the set of all words in Σ ω that A accepts.
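The acceptance conditions above can be phrased directly as predicates on inf(r), the set of states a run visits infinitely often; a minimal sketch (the set names and toy values are illustrative):

```python
# Sketch: the acceptance conditions as predicates on inf(r).

def buchi(inf, alpha):    return bool(inf & alpha)          # hit alpha i.o.
def co_buchi(inf, alpha): return not (inf & alpha)          # avoid alpha eventually
def parity(inf, alpha):   return max(alpha[q] for q in inf) % 2 == 0
def rabin(inf, pairs):    return any(not (inf & E) and (inf & F)
                                     for E, F in pairs)
def streett(inf, pairs):  return all((inf & E) or not (inf & F)
                                     for E, F in pairs)
def muller(inf, alpha):   return inf in alpha               # exact match

inf = frozenset({1, 2})
print(buchi(inf, {2}), co_buchi(inf, {3}))   # both hold for this run
print(parity(inf, {1: 1, 2: 2}))             # max priority 2 is even
print(rabin(inf, [({3}, {2})]))              # avoids E={3}, hits F={2}
print(muller(inf, {frozenset({1, 2})}))      # inf(r) itself is listed
```

The duality noted in the text is visible here: negating `buchi` gives `co_buchi`, and `streett` is the pointwise negation-dual of `rabin`.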

Automata on Trees. An automaton on Σ-labeled D-trees is a tuple A =
⟨Σ, D, Q, q0, Δ, α⟩, where Σ, Q, q0, and α are as in automata on words, and Δ ⊆
Q × (Σ × Q)^D. Recall that we label the edges of the input trees. Accordingly,
⟨q, (ad, qd)d∈D⟩ ∈ Δ means that the automaton in state q, reading for each d ∈ D the
letter ad in direction d, can send a copy in state qd to the child in direction d. If for all
q ∈ Q and (ad)d∈D ∈ Σ^D there is a single tuple (qd)d∈D such that
⟨q, (ad, qd)d∈D⟩ ∈ Δ, then A is deterministic.
A run of A on a Σ-labeled tree ⟨T, t⟩ is a function r : T → Q such that r(ε) = q0
and, for all u ∈ T, we have ⟨r(u), (t(ud), r(ud))d∈D⟩ ∈ Δ. If for some directions
d the nodes ud are not in T, we assume that the requirement on them is satisfied. A
run r on a tree T is accepting if the acceptance condition of the automaton is satisfied
on all infinite paths of ⟨T, r⟩. For instance, when A is a Büchi automaton, the run r
is accepting if on all infinite paths in T it visits α infinitely often. As in automata on
words, a tree ⟨T, t⟩ is accepted by A if there exists an accepting run of A on ⟨T, t⟩, and
the language of A, denoted L(A), is the set of all trees in TD,Σ that A accepts.
We use three letter acronyms in {D, N} × {F, B, C, P, R, S, M} × {W, T} to denote
classes of automata, with the first letter indicating whether this is a deterministic or
nondeterministic automaton, the second whether it is an automaton on finite words or
a Büchi / co-Büchi / parity / Rabin / Streett / Muller automaton, and the third whether
it runs on words or trees. For example, a DBW is a deterministic Büchi automaton on
infinite words.

2.3 Between Deterministic and Nondeterministic Automata


Let L ⊆ Σ ω be a language of infinite words. We define the derived language of L,
denoted der(L), as the set of Σ-labeled D-trees all of whose branches are in L. Note
that the definition has D as a parameter.

Since membership of a tree T, t in der(L) only depends on branches(T, t), we


do not lose generality if we consider, in the context of derivable languages, trees in a
normal form in which D = Σ and labels agree with directions. We note that examining
trees for which |D| < |Σ| introduces an extra assumption on the set of possible futures,
of which a nondeterministic automaton may take advantage.
Formally, we say that a Σ-labeled D-tree T, t is in a normal form if Σ = D,
and for all ua ∈ Σ + , we have t(ua) = a. Clearly, each Σ-labeled D-tree T, t has
a unique Σ-labeled Σ-tree T  , t  in a normal form such that branches(T, t) =
branches(T  , t ). Working with trees in a normal form enables us to identify the
domain T with its labeling t. Thus, from now on we refer to a Σ-tree T , with the
understanding that we talk about the unique Σ-labeled Σ-tree in normal form that
has T as its underlying Σ-tree. For a Σ-tree T, the branch associated with a path
{ε, d1, d1d2, d1d2d3, . . .} is the infinite word d1d2d3 · · · . The tree automata we consider
also have D = Σ (and we omit D from the specification of the automaton).
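As an illustration of the normal form, the sketch below converts a finite edge-labeled tree, represented as a prefix-closed set of direction strings plus a labeling map, into the set of its label strings: directions are renamed to labels while the branch set is preserved. The representation is an assumption made for illustration only.

```python
# Sketch: converting a Sigma-labeled D-tree (edge-labeled) into the
# normal form where directions coincide with labels.

def normal_form(T, t):
    """T: prefix-closed set of direction strings ('' is the root);
    t: maps each non-root node to its incoming edge label.
    Returns the prefix-closed set of label strings, i.e. the
    normal-form tree with the same branches."""
    return {"".join(t[u[:i + 1]] for i in range(len(u))) for u in T}

# Two nodes in different directions but with equal labels get merged.
T = {"", "0", "1", "10"}
t = {"0": "a", "1": "a", "10": "b"}
print(sorted(normal_form(T, t)))   # ['', 'a', 'ab']
```

Note how the siblings in directions 0 and 1, both labeled a, collapse into the single node "a", exactly as the identification of T with its labeling t suggests.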
Consider a nondeterministic word automaton A = ⟨Σ, Q, q0, Δ, α⟩. Let At be the
expansion of A to a tree automaton. Recall that we restrict attention to automata with
D = Σ. That is, At = ⟨Σ, Q, q0, Δt, α⟩ is such that for every ⟨q, (ad, qd)d∈Σ⟩ ∈
Q × (Σ × Q)^Σ, we have that ⟨q, (ad, qd)d∈Σ⟩ ∈ Δt iff for all d ∈ Σ, the transition
⟨q, ad, qd⟩ is in Δ. We say that A is good for trees (GFT, for short) if L(At) =
der(L(A)).
It is easy to see that when A is deterministic, then A is GFT. Indeed, At only accepts
trees in der(L(A)), so L(At) ⊆ der(L(A)). Conversely, since each prefix of a word in
Σω corresponds to a single prefix of a run of A, we can compose the accepting runs of
A on the words in L(A) into an accepting run of At on every tree in der(L(A)).
General nondeterministic automata are not GFT. For example, let A =
⟨{a, b}, {q0, q1}, q0, {⟨q0, a, q0⟩, ⟨q0, b, q0⟩, ⟨q0, a, q1⟩, ⟨q1, a, q1⟩}, {q1}⟩ be the
canonical NBW recognizing L = (a + b)∗aω. Then, At cannot accept the tree T = a∗ ∪ a∗ba∗.
Indeed, At has to move to q1 at some point on the aω branch, but it then fails to accept
other branches from that point, as there is no transition leaving q1 labeled with b. In fact
no NBT can recognize der(L(A)) [13].
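On words, by contrast, the nondeterminism of this A is harmless: acceptance of an ultimately periodic word u · vω can be decided by the standard lasso test on the product of the state space with the positions of v. A minimal sketch, with the transition relation transcribed from the example above:

```python
# Sketch: Buchi membership of u . v^omega for the canonical NBW
# recognizing (a+b)* a^omega, via the standard lasso test.

DELTA = {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"},
         ("q1", "a"): {"q1"}}
INIT, ACC = "q0", {"q1"}

def step(states, word):
    """Subset simulation over a finite word."""
    for c in word:
        states = {q2 for q in states for q2 in DELTA.get((q, c), ())}
    return states

def accepts(u, v):
    """u.v^omega is accepted iff some (f, i), f accepting, is reachable
    from (q, 0) with q reachable on u, and lies on a nonempty cycle of
    the product graph over Q x positions-of-v."""
    n = len(v)
    succ = lambda q, i: {(q2, (i + 1) % n) for q2 in DELTA.get((q, v[i]), ())}
    def reach(src):
        seen, todo = set(src), list(src)
        while todo:
            for nxt in succ(*todo.pop()):
                if nxt not in seen:
                    seen.add(nxt); todo.append(nxt)
        return seen
    start = {(q, 0) for q in step({INIT}, u)}
    R = reach(start)
    return any((f, i) in R and (f, i) in reach(succ(f, i))
               for f in ACC for i in range(n))

print(accepts("b", "a"))    # b a^omega is eventually all a's: accepted
print(accepts("", "ab"))    # (ab)^omega has infinitely many b's: rejected
```

On a single word the automaton may guess the point after which only a's appear; it is exactly this guess that cannot be made consistently across all branches of the tree T.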
A nondeterministic word automaton A = ⟨Σ, Q, q0, Δ, α⟩ is good for games (GFG,
for short) if there is a strategy σ : Σ∗ → Q such that the following hold: (1) The strategy
σ is compatible with Δ; that is, for all (u, a) ∈ Σ∗ × Σ, we have ⟨σ(u), a, σ(ua)⟩ ∈ Δ.
(2) The restriction imposed by σ does not exclude words from L(A); that is, for all
u = u0 · u1 · u2 · · · ∈ L(A), the sequence σ(ε), σ(u0), σ(u0u1), σ(u0u1u2), . . .
satisfies the acceptance condition α.
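A strategy in this sense can be exercised on finite prefixes: the sketch below follows a candidate σ along a word, checking clause (1), compatibility with Δ, at every step. We reuse the transitions of the canonical NBW for (a + b)∗aω defined above; the always-stay-in-q0 strategy is an illustrative choice that satisfies clause (1) but, matching the discussion, not clause (2), since it never reaches the accepting state.

```python
# Sketch: running a past-based strategy sigma and checking it is
# compatible with the transition relation (a set of triples).

def guided_run(delta, q0, sigma, word):
    """Follow sigma : Sigma* -> Q along `word`, verifying each step."""
    run, q = [q0], q0
    for i, a in enumerate(word):
        q_next = sigma(word[:i + 1])
        assert (q, a, q_next) in delta, "sigma incompatible with delta"
        run.append(q_next)
        q = q_next
    return run

# Transitions of the canonical NBW for (a+b)* a^omega.
delta = {("q0", "a", "q0"), ("q0", "a", "q1"),
         ("q0", "b", "q0"), ("q1", "a", "q1")}

sigma = lambda prefix: "q0"          # a legal, but non-accepting, choice
print(guided_run(delta, "q0", sigma, "aba"))   # ['q0', 'q0', 'q0', 'q0']
```

No past-based strategy can do better here: committing to q1 after any finite prefix is refuted by a later b, which is one way to see that this automaton is not GFG.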
Finally, A is determinizable by pruning (DBP, for short) if it can be determinized to
an equivalent automaton by removing some of its transitions.
A DBP automaton is obviously also GFG, using the strategy that follows the
unpruned transitions. A GFG automaton is also GFT, as the latter can resolve its
nondeterminism using the strategy that witnesses the GFGness.
Proposition 1. If an automaton A is DBP, then it is GFG. If A is GFG then A is GFT.
Let A = ⟨Σ, Q, q0, Δ, α⟩ be a tree automaton. The word automaton associated with A
is Aw = ⟨Σ, Q, q0, Δw, α⟩, where Δw is such that ⟨q, a, q′⟩ ∈ Δw iff Δ has a transition
from q in which q′ is sent to some direction along an edge labeled a. Formally, there is a
transition ⟨q, (ad, qd)d∈Σ⟩ ∈ Δ with (ad, qd) = (a, q′) for some d ∈ Σ. It is easy to see
that Aw accepts exactly all infinite words that appear as a branch of some tree accepted
by A. Note that if L(A) = der(L), then L(Aw) = L and L((Aw)t) = der(L), so Aw
is GFT.

3 From GFT to GFG

In this section we prove that if an NMW is GFT then it is also GFG. In addition,
we show that GFG automata admit finite memory strategies and we study connections
with [11].
The crucial tool in the proof is the following infinite-duration perfect-information
game between two players, ∃ and ∀. Let A = ⟨D, QA, qIA, ΔA, αA⟩ be an arbitrary
NMW. Let D = ⟨D, QD, qID, ΔD, αD⟩ be a DSW recognizing L(A). The arena of
the game G(A) is QA × QD and its initial position ⟨q0, p0⟩ is the pair of initial states
(qIA, qID). In the i-th round of a play, ∀ chooses a letter di ∈ D and ∃ chooses a state
qi+1 such that ⟨qi, di, qi+1⟩ ∈ ΔA. The successive position is (qi+1, pi+1), where pi+1
is the unique state of D such that ⟨pi, di, pi+1⟩ ∈ ΔD.
An infinite play Π = (q0, p0, d0), (q1, p1, d1), . . . is won by ∃ if either the run
ΠA := (qi)i∈N is accepting or the run ΠD := (pi)i∈N is rejecting. Note that since D
recognizes L(A), the run ΠD is rejecting iff the word (di)i∈N does not belong to
L(A).
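The arena of G(A) is a plain product construction; the sketch below enumerates its moves, with A's transition relation resolved by ∃ and D moving deterministically. Only the transition structure matters for building moves; the concrete automata below (the canonical NBW from Section 2.3 for A, a one-state transition table standing in for D) are illustrative assumptions with acceptance conditions omitted.

```python
# Sketch: moves of the game G(A) from a given product position.
# For-all player picks a letter d; the exists player then picks among
# A's d-successors, while D moves deterministically.

def game_moves(delta_A, delta_D, pos, letters):
    """Map each letter to the set of successor positions (q', p')."""
    q, p = pos
    return {d: {(q2, delta_D[(p, d)]) for q2 in delta_A.get((q, d), ())}
            for d in letters}

delta_A = {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"},
           ("q1", "a"): {"q1"}}
delta_D = {("p0", "a"): "p0", ("p0", "b"): "p0"}

moves = game_moves(delta_A, delta_D, ("q0", "p0"), "ab")
print(sorted(moves["a"]))   # [('q0', 'p0'), ('q1', 'p0')]
```

The branching visible on letter a is exactly the nondeterminism that ∃ must resolve using the past alone.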
Since the game is ω-regular, it admits finite-memory winning strategies. The winning
condition for ∃ in G(A) is the disjunction of αA with the Rabin condition that is dual to
αD . In particular, when A is an NRW, then the winning condition is a Rabin condition,
thus if ∃ has a winning strategy in G(A), she also has a memoryless one.
Obviously, a strategy for ∃ is a strategy for resolving the nondeterminism in A.
Hence, we have the following.
Lemma 1. If ∃ has a winning strategy in G(A) then A is GFG. Additionally, there
exists a finite-memory strategy σ witnessing its GFGness. If A is an NRW (NMW), then
σ is at most exponential (resp. doubly exponential) in the size of A.

Lemma 2. If ∀ has a winning strategy in G(A), then A is not GFT.

Proof. Let σ∀ : (QA)+ → D be a winning strategy of ∀ in G(A). Thus, for a sequence
q of states, namely the history of the game so far, the strategy σ∀ assigns the letter
to be played by ∀. Note that for some sequences q ∈ (QA)+, the value σ∀(q) is set
arbitrarily, as there is no play corresponding to such a sequence (e.g., if q0 ≠ qIA).
Let u ∈ D∗ be a word and q = (q0 , . . . , qj−1 ) ∈ (QA )+ be a sequence of states. We
say that q forces u if (q, u) is a prefix of a play in G(A) in which ∀ plays according to
the strategy σ∀ . Formally, q forces u if the following hold: (1) |u| = |q| = j > 0, (2)
q0 = qIA , (3) for every i < j − 1, the tuple (qi , u(i), qi+1 ) is a transition of A, and (4)
for every i < j, the letter u(i) equals σ∀(q0 . . . qi).
Let T ⊆ D∗ be the set of words u ∈ D∗ such that there is a sequence q ∈ (QA)+
that forces u. Note that T is prefix-closed, so it is a D-branching tree. We first show
that T ∈ der(L(A)). Consider an infinite path π of T. Let π = {ε, u1, u2, . . .} and let
(q i)i>0 be sequences q i ∈ (QA)+ such that q i forces ui. Note that |q i| = |ui| = i.
Since there are finitely many states in A, there exists a subsequence of (q i )i>0 that is
pointwise convergent to a limit ρ ∈ (QA )ω . For instance, this sequence can be built by
iteratively choosing states that appear in infinitely many of the q i . For a finite or infinite
sequence of states q and an index j ∈ N, let q |j be the prefix of q of length j. It follows
that for every j ∈ N there exists i > 0 such that ρ|j = (q i )|j .
Let Π be the play that is the outcome of ∀ playing σ∀ and ∃ playing successive states
of ρ. By the above, for every j ∈ N, we have ⟨ρ(j), π(j), ρ(j + 1)⟩ ∈ ΔA. Therefore,
the play Π is well defined, the word played in Π is π, and ΠA = ρ. Since σ∀ is a
winning strategy, ΠD is accepting, and therefore π ∈ L(A). Since we showed the above
for all paths π of T, we conclude that T ∈ der(L(A)).
Assume now, by way of contradiction, that A is GFT. Thus, L(At ) = der(L(A)).
Since T ∈ der(L(A)), there is an accepting run ρt of At on T . Let Π be the infinite
play of G(A) that is the outcome of ∀ playing σ∀ and ∃ playing transitions of ρt : if ∀
played u ∈ D+ , then ∃ plays ρt (u). Since ρt is accepting, ΠA is also accepting, and ∃
wins Π, contradicting the fact that ∀ plays his winning strategy. ⊓⊔

Observe that the arena of the game G(A) is finite and the winning condition for ∃ is ω-
regular. Thus, the game is determined (see [1,4]) and one of the conditions in Lemma 1
or 2 holds. Hence A is either GFG or not GFT, and we can conclude with the following:

Theorem 1. If an NMW is GFT then it is GFG. Moreover, there exists a finite-memory


strategy σ witnessing its GFGness. If A is an NRW (NMW), then σ is at most exponen-
tial (resp. doubly exponential) in the size of A.

The following observation can be seen as an extension of [11] from parity conditions to
general Muller acceptance conditions. The only difference here is that we work with Σ-
labeled D-trees with |D| ≥ |Σ|, while [11] was working on binary trees with arbitrary
alphabets. Again, we believe that these differences in the formalisms do not reflect
essential behaviors of automata on infinite trees, since a simple encoding always allows
one to go from one formalism to another. Notice that the proof in [11] relies crucially on
the structure of parity conditions and does not seem to generalize to arbitrary Muller
conditions. In the following statement we use γ to denote an acceptance condition, e.g.
a parity [i, j] condition, a Rabin condition with k pairs, a Muller condition with k sets,
etc.

Corollary 1. Consider an ω-regular word language L. If der(L) can be recognized by


a nondeterministic γ tree automaton, then L can be recognized by a deterministic γ
word automaton.

Proof. The word automaton Aw = ⟨D, Q, qI, Δ, α⟩ associated with A is GFT, so by
Theorem 1 it is GFG, and the GFGness is witnessed by a strategy σ using a finite
memory structure M with an initial state m0 ∈ M. That is to say,
σ : Q × M × D → Q × M can be used to guide choices in Aw, ensuring that all words
in L are accepted. Therefore, we can build the required deterministic automaton D with
states Q × M, where the transition function maps a state ⟨q, m⟩ and letter d ∈ D to
the state σ(q, m, d). The acceptance condition of D is identical to α, and needs only

to consider the Q-components of the states (the M component does not play a role for
acceptance), and is thus of type γ. Since an accepting run of D induces an accepting
run of Aw , we have L(D) ⊆ L. Conversely, if π is a word in L, the unique run of D
on π corresponds to the execution of the GFG strategy σ in Aw , and it thus accepting.
Hence, L(D) = L, for the deterministic γ automaton D. &
%
Observe that since deterministic automata are clearly GFT, the other direction of Corol-
lary 1 is trivial.
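The product construction in the proof can be sketched directly: hard-wire the finite-memory strategy σ into a transition function over Q × M. Everything concrete below (the toy strategy, the state and memory sets) is an illustrative assumption, not taken from the paper.

```python
# Sketch: determinizing by hard-wiring a finite-memory GFG strategy
# sigma : Q x M x D -> Q x M into the transition function over Q x M.

def determinize_with_memory(sigma, q0, m0, letters, states, memories):
    """Build the deterministic transition table over Q x M."""
    delta = {((q, m), d): sigma(q, m, d)
             for q in states for m in memories for d in letters}
    return (q0, m0), delta

# Toy strategy: move to a state named after the letter, and remember
# the last letter in the memory component.
sigma = lambda q, m, d: (f"q_{d}", d)

init, delta = determinize_with_memory(sigma, "q0", "-", "ab",
                                      {"q0", "q_a", "q_b"}, {"-", "a", "b"})
state = init
for d in "abba":
    state = delta[(state, d)]
print(state)   # ('q_a', 'a')
```

As in the proof, acceptance would be read off the Q-component only; the M-component merely carries the strategy's bookkeeping.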

4 From GFG to Deterministic Automata


In this section we study determinization of GFG automata. As discussed in Section 2,
every DBP automaton (that is, a nondeterministic automaton that is determinizable by
pruning) is GFG. The first question we consider is whether the converse is also valid,
that is, whether every GFG automaton is DBP. We show that, surprisingly, not all GFG
Büchi and co-Büchi automata are DBP. Note that since these counterexamples use basic
acceptance conditions, the result follows for all common acceptance conditions. This gives
rise to a second question, of the blowup involved in determinizing GFG NRWs. We
describe a determinization construction that generates a DRW whose size is bounded
by the product of the input GFG NRW and a GFG NRW for its complement.

4.1 GFG Büchi and Co-Büchi Automata Are not DBP


There is a strong intuition that a GFG NBW can be determinized by pruning: by definition, the choices of a GFG NBW are independent of the future. Accordingly, the question about GFG NBWs being DBP amounts to asking whether the choices are independent of the past. Since Büchi games are memoryless determined, it is tempting to believe that the answer is positive. A positive answer is also supported by the fact that GFG NBWs that recognize safety languages are DBP. In addition, all GFG NBWs studied so far, and in particular those constructed in [5], are DBP. Yet, as we show in this section, GFG NBWs are not DBP.
We start with a simple meta-NBW that operates over infinite words composed of two finite words (tokens), x and y. Afterwards, we instantiate it into a concrete GFG NBW. Moreover, we show that even more flexible notions of DBP, such as ones that allow the “deterministic” automata a finite look-ahead, are not sufficient.

A Meta Example. The meta-NBW M, described in Figure 1, accepts exactly all words
that contain infinitely many ‘xx’s or ‘yy’s. That is, L(M) = [(x + y)∗ (xx + yy)]ω .
It is not hard to see that M is GFG by using the following strategy in its single non-
deterministic state q0 : “if the last token was x then go to q1 else go to q2 ”. On the other
hand, determinizing q0 to always choose q1 loses y ω and always choosing q2 loses xω .
Hence, M is not DBP.
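The point can be checked mechanically. The sketch below uses our own reconstruction of the transitions of M (the figure is not fully recoverable, so the transition table is an assumption consistent with L(M) and with the strategy stated above): q1/q2 remember whether the last token was x or y, and p/g are the accepting states reached on a repeated token.

```python
# Our reconstruction of the meta-NBW M at the token level; only q0 is
# nondeterministic.  Tokens are the abstract letters 'x' and 'y'.

DELTA = {   # deterministic part of the transition relation
    ('q1', 'x'): 'p', ('q1', 'y'): 'q0',
    ('q2', 'x'): 'q0', ('q2', 'y'): 'g',
    ('p', 'x'): 'q1', ('p', 'y'): 'q2',
    ('g', 'x'): 'q1', ('g', 'y'): 'q2',
}

def run(tokens, choice_at_q0):
    """Simulate M on a finite token prefix.  choice_at_q0 resolves the
    only nondeterminism (q0 may move to q1 or q2 on either token) and
    we count visits to the accepting states p and g."""
    state, accepting_visits = 'q0', 0
    for t in tokens:
        state = choice_at_q0(t) if state == 'q0' else DELTA[(state, t)]
        if state in ('p', 'g'):
            accepting_visits += 1
    return accepting_visits

gfg = lambda t: 'q1' if t == 'x' else 'q2'   # the GFG strategy of the text
```

With the GFG strategy, both xω and yω visit accepting states again and again, whereas pruning q0 to always choose q1 never accepts on y-tokens, and pruning to q2 never accepts on x-tokens, exactly as claimed.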

A Concrete Example. Using the above meta-NBW with x = aaa and y = aba provides
the NBW A, described in Figure 2, whose language is L = [(aaa + aba)∗ (aaa aaa +
aba aba)]ω . Essentially, this follows from the simple observations that A has an infinite run on a word w iff w ∈ (aaa + aba)ω , and that after a prefix whose length is divisible by 3, a run of A can only be in q0 , p, or g.
Fig. 1. A meta GFG NBW M that is not DBP (state diagram omitted)

Fig. 2. A GFG NBW A that is not DBP (state diagram omitted)

A Co-Büchi Example. In order to show that these counterexamples are not specific to the Büchi condition, we give another example of a GFG automaton that is not DBP, using the co-Büchi condition. For simplicity, the acceptance is now specified via the transitions
instead of the states. Dashed transitions are co-Büchi, i.e. accepting runs must take them
only finitely often. (It is not hard to build a counter-example with co-Büchi condition
on states from this automaton.)

Fig. 3. A co-Büchi automaton that recognizes the language (aa + ab)∗ [aω + (ab)ω ]. It is GFG but not DBP. Note that unlike the Büchi counterexample, one good choice is enough for getting an accepting run. (State diagram omitted.)

Theorem 2. GFG NPWs are not DBP, even for Büchi and co-Büchi conditions.
Proof. We prove that the NBW A from Figure 2 is GFG and is not DBP. First, the only
nondeterminism of A is in q0 . The following strategy, applied in q0 , witnesses that A is
GFG: “if the last three letters were ‘aaa’ then go to q1 else go to q2 ”. Now, to see that A
is not DBP, recall that the only nondeterminism of A is in q0 . Therefore, there are two possible prunings to consider: the DBW A′ in which δ(q0 , a) = q1 and the DBW A′′ in which δ(q0 , a) = q2 . With the former, (aba)ω ∈ L(A) \ L(A′ ), and with the latter, (aaa)ω ∈ L(A) \ L(A′′ ). ⊓⊔
While the GFG NBW A used in the proof of Theorem 2 is not DBP, it can be deter-
minized by merging the states q1 and q2 , to which q0 goes nondeterministically, and
then pruning. Furthermore, A is “almost deterministic”, in the sense that a look-ahead
of one letter into the future is sufficient for resolving its nondeterminism. One may won-
der whether GFG NBWs are determinizable with more flexible definitions of pruning.
We answer this in the negative, describing (in the full version) a GFG NBW in which
merging the target states of the nondeterminism cannot help, and no finite look-ahead
suffices for resolving the nondeterminism.
Theorem 3. There are GFG NBWs that cannot be pruned into deterministic automata
with unbounded yet finite look-ahead, or by merging concurrent target states.

4.2 A Determinization Construction


In this section, we show that determinization in the context of GFG automata cannot induce an exponential blowup for both a language and its complement. This strongly suggests that determinization is simpler in the GFG setting. Indeed,
for general nondeterministic automata, the blowup can occur on both sides. For exam-
ple, consider the family of languages of finite words Lk = (a + b)∗ a(a + b)k . While
for all k ≥ 1, both Lk and its complement have nondeterministic automata with O(k)
states, a deterministic automaton for Lk must have at least 2k states.
We now assume that we have a Rabin GFG (NRW-GFG) automaton for L, and an
NRW-GFG for the complement comp(L) of L. We show the following.
Theorem 4. If A is an NRW-GFG for L with n states, and B is an NRW-GFG for
comp(L) with m states, then we can build a DRW for L with nm states.
Proof. Let A = ⟨Σ, Q, q0 , ΔA , α⟩ be an NRW-GFG for L, and B = ⟨Σ, P, p0 , ΔB , β⟩ be an NRW-GFG for comp(L). We construct a Rabin game G between two players, ∃ and ∀, as follows. The arena of G is the product A × B. Formally, the positions of the game are pairs of states (q, p) ∈ Q × P , and there is an edge (q, p) →a (q′ , p′ ) in G if (q, a, q′ ) ∈ ΔA and (p, a, p′ ) ∈ ΔB . The initial position of the game is (q0 , p0 ). A turn from position (q, p) is played as follows: first, ∀ chooses a letter a in Σ. Then, ∃ chooses an edge (q, p) →a (q′ , p′ ). The game then continues from (q′ , p′ ).
Thus, the outcome of a play is an infinite sequence π = (q0 , p0 ), (q1 , p1 ), (q2 , p2 ), . . .
of positions. Note that π combines the run πA = q0 , q1 , q2 , . . . of A and the run πB =
p0 , p1 , p2 , . . . of B.
The winning condition for ∃ is that either πA satisfies α or πB satisfies β. These
objectives can be easily specified by a Rabin winning condition. It is easy to see that ∃
has a winning strategy: it suffices to play in both automata according to their respective
GFG strategies. By definition of GFG automata, if ∀ generates a word u in L, the run
πA is accepting in A and thus satisfies α. Likewise, if u ∈ comp(L), then the run πB is
accepting in B and thus satisfies β. Since every word is either in L or in comp(L), the
winning condition for ∃ is always satisfied.
It is known that Rabin games admit memoryless strategies [6]. Hence, ∃ actually has
a memoryless winning strategy in G. Such a strategy maps each position (q, p) ∈ Q × P
and letter a ∈ Σ to a destination (q′ , p′ ). Hence, by keeping only edges used by the
memoryless strategy, we can prune the nondeterministic product automaton A × B into a deterministic automaton that accepts all words in Σ ω . Moreover, by simply forgetting
the acceptance condition of B, and keeping only the one from A, we get a DRW D
recognizing L. Notice that if A was for instance Büchi, or parity with index [i, j], the
automaton D has the same acceptance condition. The number of states of D is |P × Q|.
⊓⊔
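The arena of the game G built in the proof of Theorem 4 can be sketched in a few lines; the encoding below (dicts mapping a state–letter pair to a successor set) is ours, not the paper's:

```python
# A sketch of the arena of the Rabin game G from the proof of
# Theorem 4: positions are pairs (q, p), and for each letter chosen by
# the universal player, the existential player picks a joint transition
# that is legal in both A and B.

from itertools import product

def product_arena(states_a, delta_a, states_b, delta_b, alphabet):
    """delta_a, delta_b: dicts mapping (state, letter) to a set of successors."""
    positions = set(product(states_a, states_b))
    moves = {}
    for (q, p) in positions:
        for a in alphabet:
            moves[((q, p), a)] = {
                (q2, p2)
                for q2 in delta_a.get((q, a), set())
                for p2 in delta_b.get((p, a), set())
            }
    return positions, moves
```

A memoryless winning strategy for ∃ then selects exactly one element of each `moves` set, which is precisely the pruning of the product into a deterministic automaton described above.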

References
1. Büchi, J.R., Landweber, L.H.: Solving Sequential Conditions by Finite State Strategies. CSD
TR (1967)
2. Colcombet, T.: The theory of stabilisation monoids and regular cost functions. In: Albers, S.,
Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds.) ICALP 2009, Part
II. LNCS, vol. 5556, pp. 139–150. Springer, Heidelberg (2009)
3. Emerson, E.A., Sistla, A.P.: Deciding branching time logic. In: Proc. 16th ACM Symp. on
Theory of Computing, pp. 14–24 (1984)
4. Grädel, E., Thomas, W., Wilke, T. (eds.): Automata, Logics, and Infinite Games. LNCS,
vol. 2500. Springer, Heidelberg (2002)
5. Henzinger, T.A., Piterman, N.: Solving games without determinization. In: Ésik, Z. (ed.) CSL
2006. LNCS, vol. 4207, pp. 395–410. Springer, Heidelberg (2006)
6. Klarlund, N.: Progress measures, immediate determinacy, and a subset construction for tree
automata. Ann. Pure Appl. Logic 69(2-3), 243–268 (1994)
7. Kupferman, O., Safra, S., Vardi, M.Y.: Relating word and tree automata. Ann. Pure Appl.
Logic 138(1-3), 126–146 (2006)
8. Kupferman, O., Vardi, M.Y.: Safraless decision procedures. In: Proc. 46th IEEE Symp. on
Foundations of Computer Science, pp. 531–540 (2005)
9. Landweber, L.H.: Decision problems for ω–automata. Mathematical Systems Theory 3,
376–384 (1969)
10. Morgenstern, G.: Expressiveness results at the bottom of the ω-regular hierarchy. M.Sc. The-
sis, The Hebrew University (2003)
11. Niwinski, D., Walukiewicz, I.: Relating hierarchies of word and tree automata. In: Meinel,
C., Morvan, M. (eds.) STACS 1998. LNCS, vol. 1373, Springer, Heidelberg (1998)
12. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Proc. 16th ACM Symp. on
Principles of Programming Languages, pp. 179–190 (1989)
13. Rabin, M.O.: Weakly definable relations and special automata. In: Proc. Symp. Math. Logic
and Foundations of Set Theory, pp. 1–23. North-Holland (1970)
14. Rabin, M.O., Scott, D.: Finite automata and their decision problems. IBM Journal of Re-
search and Development 3, 115–125 (1959)
15. Safra, S.: On the complexity of ω-automata. In: Proc. 29th IEEE Symp. on Foundations of
Computer Science, pp. 319–327 (1988)
16. Vardi, M.Y., Wolper, P.: Reasoning about infinite computations. Information and Computa-
tion 115(1), 1–37 (1994)
Coalgebraic Announcement Logics

Facundo Carreiro1 , Daniel Gorín2 , and Lutz Schröder2

1 Institute for Logic, Language and Computation, Universiteit van Amsterdam
2 Department of Computer Science, Universität Erlangen-Nürnberg

Abstract. In epistemic logic, dynamic operators describe the evolution of the knowledge of participating agents through communication, one of
the most basic forms of communication being public announcement. Se-
mantically, dynamic operators correspond to transformations of the un-
derlying model. While metatheoretic results on dynamic epistemic logic
so far are largely limited to the setting of Kripke models, there is evident
interest in extending its scope to non-relational modalities capturing,
e.g., uncertainty or collaboration. We develop a generic framework for
non-relational dynamic logic by adding dynamic operators to coalgebraic
logic. We discuss a range of examples and establish basic results including
bisimulation invariance, complexity, and a small model property.

1 Introduction
Dynamic epistemic logics [5] are tools for reasoning about knowledge and belief
of agents in a setting where interaction is of crucial interest. These logics extend
epistemic logic (EL) [11] with dynamic operators, used to denote knowledge-
changing actions. The most common of these is public announcement, first in-
troduced in [17], which supports formulas of the form ⟨φ⟩ψ stating that after
publicly (and faithfully) announcing that a certain fact φ holds (such as ‘agent
b does not know that agent a knows p’), ψ will hold (e.g. ‘agent b knows p’).
EL and its extension with public announcements (PAL) are typically inter-
preted on epistemic models, i.e. Kripke models where each accessibility relation
is an equivalence; the points of the model represent epistemic alternatives. Eval-
uating a formula ⟨φ⟩ψ at a point c of an epistemic model I (notation c ⊨I ⟨φ⟩ψ) amounts to verifying that the announcement is faithful (i.e., c ⊨I φ) and that ψ holds at c after removing from I all epistemic alternatives where φ does not hold (notation c ⊨I|φ ψ). The term ‘dynamic’ refers precisely to the fact that
models are changed during evaluation in this way.
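This model-changing semantics is short enough to write out. The following sketch is a minimal PAL evaluator over finite Kripke models; the formula encoding (nested tuples) and all names are ours, and `K` plays the role of the epistemic box:

```python
# A minimal Public Announcement Logic evaluator on finite Kripke models.
# Formulas: ('p', name), ('not', f), ('and', f, g), ('K', f), and
# ('ann', f, g) encoding <f>g: "after announcing f, g holds".

def holds(w, f, worlds, R, V):
    kind = f[0]
    if kind == 'p':
        return w in V[f[1]]
    if kind == 'not':
        return not holds(w, f[1], worlds, R, V)
    if kind == 'and':
        return (holds(w, f[1], worlds, R, V)
                and holds(w, f[2], worlds, R, V))
    if kind == 'K':       # box: f[1] holds at every accessible world
        return all(holds(v, f[1], worlds, R, V)
                   for v in worlds if (w, v) in R)
    if kind == 'ann':     # <phi>psi: phi holds here, and psi holds after
        if not holds(w, f[1], worlds, R, V):   # restricting to phi-worlds
            return False
        kept = {v for v in worlds if holds(v, f[1], worlds, R, V)}
        kept_R = {(a, b) for (a, b) in R if a in kept and b in kept}
        return holds(w, f[2], kept, kept_R, V)
    raise ValueError(kind)
```

On a two-world model where p holds only at world 0 and both worlds see each other, Kp fails at 0 before the announcement, but ⟨p⟩Kp holds: announcing p removes the p-less alternative.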
Dynamic operators are of independent interest outside an epistemic setting.
E.g., they occur as soon as one tries to express resiliency-related properties in
verification (cf. van Benthem’s sabotage logic [3] for an example); and they can
turn a logic-based database query language into one supporting hypothetical
queries (as in “return the aggregated sales we would have if we assumed that
December sales corresponded to March”).
Moreover, dynamic effects need not be restricted to a relational setting as
found in Kripke models. E.g., the notion of announcing that a formula ψ holds

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 101–112, 2013.

© Springer-Verlag Berlin Heidelberg 2013
has a natural analogue in probabilistic modal logic [12,7,10] where announcements have the effect of conditioning the current distribution, as discussed along
with other examples of dynamic actions in non-relational settings by Baltag [1].
Of course, dynamic operators are subject to the usual tension between expres-
sive power and computational complexity. The extension of robustly decidable
modal languages with dynamic operators can quickly lead to undecidability (see
e.g. [13,18,9]; also, the ↓-binder of hybrid logic can be seen as an example of a
dynamic operator, and in general leads to undecidability). On the other hand,
PAL is well-behaved: it is as expressive as EL but exponentially more succinct
with the same complexity (PSPACE-complete in the multi-agent case [14,8]).
In this paper we study announcement operators in a broad sense for modal
logics beyond Kripke semantics. To deal with these at the appropriate level of
generality, we work in the setting of coalgebraic modal logic [15], which uniformly
covers a broad range of modal operators including, e.g., probabilistic, graded,
and game-theoretic modalities (Sect. 2). A coalgebraic announcement can then
be seen as the global application of a certain form of local transformation (con-
trasting with the global transformations considered in [1]).
A pervasive principle that transpires is that adding announcement operators
preserves invariance under bisimulation and hence does not add fundamentally
new expressive power; it may however be necessary to add new static modalities
in order to eliminate announcements. We deal with generic announcement op-
erators at increasing levels of generality, starting with a very well-behaved class
of strong announcements that allow for a straightforward translation into the
modal base language without requiring additional modalities. These constitute a
particular type of deterministic update on models (i.e. certain transformation of
the behaviour functor); which are also extended to account for announcements
that are enriched with effects, such as non-determinism or uncertainty. As a
unifying notion, we arrive at backwards transformations (Sect. 4) which act on
predicates on the behaviour functor rather than the behaviour functor itself. We
refer to the overall framework as coalgebraic announcement logic (CAL).
Besides bisimulation invariance, our technical results on these logics include:
i) an equivalent translation of CAL into the modal base language, usually in-
ducing an exponential blowup; ii) satisfiability preserving polynomial reductions
to the base language for strong announcements, thus enabling the transfer of
upper complexity bounds; and iii) a constructive filtration argument that yields
a small model property independently of the presence of a master modality. The
latter contribution appears to be a novel observation even for the static case,
and substantially clarifies the original coalgebraic filtration construction [19].

2 Preliminaries

The framework of coalgebraic modal logics uniformly deals with a broad range
of modal operators and a variety of different structures. This is achieved by
recognizing the latter as instances of the concept of coalgebra. Given a functor
T : Set → Set, a T -coalgebra is a pair ⟨X, γ⟩ consisting of a non-empty set of states X and a transition map γ : X → T X. We often identify ⟨X, γ⟩ with γ. For x ∈ X, we shall refer to γ(x) as the T -description of x.
Example 1. Many structures that are well-known from theoretical computer
science or from modal logic admit a natural presentation as coalgebras.
(i) Kripke frames are coalgebras for the covariant powerset functor P. The map
γR : X → PX encodes a Kripke frame ⟨W, R⟩ with γR (x) := {w | xRw}.
(ii) A Kripke model ⟨W, R, V ⟩ with V : P → PW , for a set P of propositions, corresponds to the K-coalgebra ⟨W, γ⟩ where KW := PW × PP. The structure is recovered with γ(x) := (γR (x), V ′ (x)) where V ′ (x) = {p ∈ P | x ∈ V (p)}.
(iii) The neighbourhood functor is N := P̆ ◦ P̆, where P̆ is the contravariant
powerset functor.1 N -coalgebras are the neighbourhood frames of modal
logic [22], used in dynamic logics for reasoning about evidence and belief [6].
(iv) Let M be the subfunctor of N given by MX := {S ∈ N X |
S is upwards closed}. M-coalgebras are monotone neighbourhood frames.
(v) The discrete distribution functor Dω maps X to the set of discrete prob-
ability distributions over X. Dω -coalgebras are Markov chains. The sub-
probability functor Sω is similar but requires only that the measure of the
whole space is at most 1 (instead of equal to 1).
(vi) Similarly, the finite multiset functor Bω maps X to the set of functions
μ : X → N with finite support. Coalgebras for Bω are multigraphs, i.e.
N-weighted transition systems.
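Two of these encodings are easy to write out concretely; the data below is our own illustrative example, not from the paper:

```python
# Structures from Example 1 presented as coalgebras X -> T X.

# (i) A Kripke frame (W, R) as a P-coalgebra gamma_R : W -> P(W),
#     gamma_R(x) = {w | x R w}.
W = {1, 2, 3}
R = {(1, 2), (1, 3), (2, 2)}
gamma_R = {x: frozenset(w for (v, w) in R if v == x) for x in W}

# (v) A Markov chain as a D_omega-coalgebra: each state is mapped to a
#     discrete probability distribution over states.
markov = {
    'rain': {'rain': 0.6, 'sun': 0.4},
    'sun':  {'rain': 0.2, 'sun': 0.8},
}
```

The transition map is just a function from states to T-descriptions: successor sets in the relational case, distributions in the probabilistic one.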
Coalgebras for a functor T form a category CoAlgT where morphisms f : γ → σ
between γ : X → T X and σ : Y → T Y are maps f : X → Y with σ ◦ f = T f ◦ γ.
For x ∈ X and y ∈ Y , we write (x, γ) ∼ (y, σ), read “x and y are behaviourally
equivalent ”, if there exists a coalgebra ξ : Z → T Z with morphisms f : γ → ξ
and g : σ → ξ such that f (x) = g(y). Functors are assumed wlog. to preserve
injective maps [2] and to be non-trivial, in the sense that T X = ∅ implies X = ∅.
The syntax of coalgebraic modal logics is parametrized by a modal similarity
type Λ. The language CML(Λ) is then given by the grammar

φ ::= ⊥ | φ → φ | ♥k (φ1 , . . . , φk )    (1)

where ♥k ∈ Λ is a modal operator of arity k ≥ 0. We shall use the usual Boolean abbreviations ∧, ∨, etc. when convenient. Each modality ♥k ∈ Λ is interpreted as a k-ary predicate lifting ⟦♥k ⟧, i.e., a natural transformation ⟦♥k ⟧ : P̆ k →̇ P̆ ◦ T op . The extension ⟦φ⟧γ of φ in a coalgebra γ is given by ⟦⊥⟧γ = ∅, ⟦φ → ψ⟧γ = (X \ ⟦φ⟧γ ) ∪ ⟦ψ⟧γ , and ⟦♥k (φ1 , . . . , φk )⟧γ = {x | γ(x) ∈ ⟦♥k ⟧X (⟦φ1 ⟧γ , . . . , ⟦φk ⟧γ )}. For the sake of readability, we sometimes pretend that all modal operators are unary.

Example 2. Some predicate liftings for the functors of Example 1 are


(i) For P we get the usual diamond with ⟦3⟧X (A) := {t ∈ PX | A ∩ t ≠ ∅}. The box is defined as ⟦2⟧X (A) := {t ∈ PX | t ⊆ A}.
1
Formally, P̆ : Setop → Set with P̆X = 2X and, for f : X → Y , P̆f (A) = f −1 [A].
(ii) The diamond and box for K are obtained analogously. Propositions correspond to nullary liftings ⟦p⟧X := {(s, C) ∈ KX | p ∈ C} for every p ∈ P.
(iii) For N and M, we have ⟦2⟧X (A) := {s ∈ N X | A ∈ s}.
(iv) For Dω (and Sω ), the modalities Lp of probabilistic modal logic correspond to the liftings ⟦Lp ⟧X (A) := {μ ∈ Dω X | μ(A) ≥ p} for p ∈ Q ∩ [0, 1].
(v) The counting modalities 3k of graded modal logic are given as predicate liftings for Bω with ⟦3k ⟧X (A) := {μ ∈ Bω X | μ(A) ≥ k} for k ∈ N.
It is well-known that CML(Λ) is invariant under behavioural equivalence; i.e., if (x, γ) ∼ (y, σ) then x ∈ ⟦φ⟧γ iff y ∈ ⟦φ⟧σ , for all φ ∈ CML(Λ).
An operator ♥ ∈ Λ is monotone if A ⊆ B ⊆ X implies ⟦♥⟧X A ⊆ ⟦♥⟧X B. For example, all operators of Example 2 are monotone except the one for N . We say that Λ is separating [16] if t ∈ T X is uniquely determined by {(♥, A) ∈ Λ × P̆X | t ∈ ⟦♥⟧X (A)}.

3 Strong Coalgebraic Announcements


Announcements (and, more generally, dynamic operators) are accounted for at
the syntactic level by extending CML(Λ) with a set Π of dynamic modalities
which we call the dynamic similarity type (as opposed to static modalities of the
static similarity type Λ). The syntax of CAL(Π, Λ) is obtained by extending the
grammar in (1) with the clause Δφ1 φ2 for Δ ∈ Π.
At this point, one may informally read Δψ φ as “after announcing ψ, φ holds”;
more generally, Δψ will represent an update operation on the model that is
parameterized by a formula ψ. The update affects every state of the model but
does so in a local way. That is, for each state x of γ, Δψ updates γ(x) in a way
that depends only on γ(x) and ψγ .

Definition 3. An update is a natural transformation τ : T → ˙ (P̆  T ), where


P̆  T is the Set-functor defined by (P̆  T )X := (T X)P̆X and, for f : X → Y ,
by (P̆  T )f := λh : (T X)P̆X .(T f ◦ h ◦ P̆f ).

Intuitively, every component τX takes as input the extension of a formula and the
T -description of an element and returns an updated T -description. Naturality
says that T f (τX (t, P̆f A)) = τY (T f (t), A), for f : X → Y , t ∈ T X and A ⊆
Y . We interpret Δ ∈ Π as an update ⟦Δ⟧, and extend the semantics with the clause ⟦Δψ φ⟧γ = ⟦φ⟧⟦Δ⟧X (⟦ψ⟧γ )◦γ — i.e., Δψ applies local changes to the entire coalgebra γ. We often identify Δ and ⟦Δ⟧. The basic example is public announcement logic over unrestricted frames (as considered in [14]; we call this standard PAL although it is not interpreted over epistemic models), for which we take ⟦Δ⟧(S)(A) = A ∩ S and then rewrite an announcement ⟨ψ⟩φ to ψ ∧ Δψ φ — this induces essentially the standard semantics, since restricting all successors to satisfy ψ is modally indistinguishable from restricting the whole model to ψ.
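The basic PAL update is a one-liner when the P-coalgebra is represented concretely (the encoding below is ours): Δ(S)(A) = A ∩ S cuts each successor set down to the announced extension, applied pointwise across the whole coalgebra.

```python
# Pointwise application of the standard PAL update Delta(S)(A) = A ∩ S
# to a P-coalgebra represented as a dict from states to successor sets.

def pal_update(gamma, A):
    """gamma: dict mapping each state to its set of successors;
    A: the extension of the announced formula."""
    return {x: succs & A for x, succs in gamma.items()}
```

Evaluating ⟨ψ⟩φ then amounts to checking ψ locally and evaluating φ over `pal_update(gamma, extension_of_psi)`, mirroring the rewrite ψ ∧ Δψ φ.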

Example 4. On relational models, the update !! : P →̇ (P̆  P) defined as !!X (S)(A) := if A ≠ ∅ then S ∩ A else S gives the total announcements
of [23]. That is, the announcement need not be truthful. If we think of A as the
extension of a formula φ then this transformation removes the successors not
satisfying φ. If an impossible formula is announced, it is ignored.
Example 5. For the functor Dω we can define an update τ : Dω → ˙ (P̆  Dω )
that has the effect of conditioning all probabilities to a given formula as
τX (μ)(A) := λx.if μ(A) > 0 then μ(x | A) else μ(x). Again, this update
simply ignores the announcement of impossible events (i.e. those with probabil-
ity 0). We also write this update as μA := τ (μ)(A).
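On finite discrete distributions this conditioning update can be sketched directly (the dict-based encoding is ours): mass is renormalised to the announced event, and zero-probability announcements leave the distribution unchanged.

```python
# The conditioning update of Example 5 on a finite discrete
# distribution mu (a dict from outcomes to probabilities):
# tau(mu)(A)(x) = mu(x | A) if mu(A) > 0, else mu(x).

def condition(mu, A):
    mass = sum(p for x, p in mu.items() if x in A)
    if mass == 0:                       # impossible announcement: ignored
        return dict(mu)
    return {x: (p / mass if x in A else 0.0) for x, p in mu.items()}
```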
It is clear that there is a resemblance between Examples 4 and 5: both updates
give rise to dynamic operators that restrict the successors of a node to the points
that satisfy certain formula. This connection can be made more precise, which
will allow us to discuss this type of announcements in a uniform way.
Definition 6. An update τ is called a strong announcement on Λ if
(a) the partial application τX (−, A) : T X → T X factors through the inclusion iA : T A → T X, for every A ⊆ X (intuitively, τX (−, A) : T X → T A); and
(b) τX (s, A) ∈ ⟦♥⟧X (C) iff s ∈ ⟦♥⟧X (C), for all s ∈ T X, C ⊆ A ⊆ X, ♥ ∈ Λ.
Condition (a) intuitively says that when ψ is announced, the resulting model
should be based on the states satisfying ψ, while (b) ensures that all states
satisfying ψ are retained. (Note that (b) is purely local and hence does not imply
that whenever φ → ψ is valid, then Δψ ♥φ ↔ ♥φ is valid; this fails already in
standard PAL.) In most cases, condition (b) is sufficient for naturality (so, for
instance, we are exempt from proving it in the examples below).
Proposition 7. A set-indexed family of maps τX : T X → (2X → T X) satisfy-
ing condition (b) of Definition 6 for a separating set Λ of predicate liftings is a
natural transformation T →˙ (P̆  T ); i.e. it is an update.
Example 8. (i) In slight modification of Example 4, putting !(S)(A) :=
A ∩ S defines a strong announcement on {3} (but not on {2}); it induces
standard PAL. For the differences between ! and !! see [14,23].
(ii) Putting τ (μ)(A) := λx. if x ∈ A then μ(x) else 0 for μ ∈ Bω X defines
a strong announcement on Λ = {30 , 31 , . . . }. The case for the subdistri-
bution functor Sω is similar.
(iii) For the neighbourhood functor N , putting τ (t)(A) := t ∩ PA defines a
strong announcement on Λ = {2}. The same definition (sic!) works for the
monotone neighbourhood functor M and Λ = {3}.
(iv) Probabilistic conditioning (cf. Example 5) is not a strong announcement.
These examples show that strong announcements occur in varying settings. For
monotone logics, they are actually uniquely determined.
Theorem 9. Let Λ consist of monotone operators. If τ is a strong announce-
ment on Λ, then we have an adjunction T iA ⊣ τX (−, A) where the ordering on T X is given by s ≤ t ⇐⇒ ∀♥ ∈ Λ, A ⊆ X. s ∈ ⟦♥⟧(A) ⇒ t ∈ ⟦♥⟧(A). In
particular, τ is uniquely determined.
This applies to all the updates of Example 8 except the one for N (since the
predicate lifting involved is not monotone). In PAL, the announcement operator
can be removed by means of well-known reduction laws [5,14], and hence does
not add expressive power. This generalizes to strong announcements:

Theorem 10. Let Δ be a strong announcement on Λ, and let ♥ ∈ Λ; then


Δψ ♥φ ≡ ♥(ψ ∧ Δψ φ).

Remark 11. Theorem 10 can be used on duals of strong announcements, yield-


ing, e.g., that in standard PAL, !ψ 2φ ≡!ψ ¬3¬φ ≡ ¬!ψ 3¬φ ≡ 2(ψ →!ψ φ).

Corollary 12. Let Π be a set of strong announcements on Λ. For every φ ∈


CAL(Π, Λ) there is φ∗ ∈ CML(Λ) such that φ ≡ φ∗ . Hence, CAL(Π, Λ) is
invariant under behavioural equivalence.

In general, the translation induced by Theorem 10 (and commutation of an-


nouncements with Boolean operators) induces an exponential blowup. In Sec-
tion 5 we will look at this in more detail, and show that one can obtain a
satisfiability-preserving polynomial translation in many cases.
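The translation behind Corollary 12 can be sketched as a rewriter that pushes announcements inward; the tuple-based formula encoding and all names below are ours. We assume atomic propositions are among the static modalities preserved by the update (condition (b)), so they pass through unchanged; note how the announced formula ψ is duplicated at every modality, which is the source of the exponential blowup.

```python
# Rewriting CAL into CML via the reduction law of Theorem 10:
#   Delta_psi <>phi  ==  <>(psi and Delta_psi phi)
# plus commutation with the Boolean connectives.

def eliminate(f):
    kind = f[0]
    if kind in ('p', 'bot'):
        return f
    if kind == 'not':
        return ('not', eliminate(f[1]))
    if kind == 'and':
        return ('and', eliminate(f[1]), eliminate(f[2]))
    if kind == 'mod':                   # a (unary) static modality
        return ('mod', eliminate(f[1]))
    psi, phi = f[1], f[2]               # kind == 'ann': Delta_psi phi
    if phi[0] == 'mod':                 # the reduction law: psi duplicated
        return ('mod', ('and', eliminate(psi),
                        eliminate(('ann', psi, phi[1]))))
    if phi[0] == 'and':                 # announcements commute with Booleans
        return ('and', eliminate(('ann', psi, phi[1])),
                eliminate(('ann', psi, phi[2])))
    if phi[0] == 'not':
        return ('not', eliminate(('ann', psi, phi[1])))
    if phi[0] == 'ann':                 # nested announcement: inner first
        return eliminate(('ann', psi, eliminate(phi)))
    return eliminate(phi)               # atoms are untouched (condition (b))
```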

4 Imperfect Announcements and Other Effects


The updates introduced in the previous section are all of a deterministic nature,
in the sense that points of a coalgebra are updated in a unique way. While this is
a sensible condition for many applications, one can also think of updates where
the outcome is, e.g. non-deterministic or governed by a probability distribution
(what are usually called effects in the context of programming languages).
Let us consider non-determinism for a moment. It seems reasonable to extend
the (deterministic) updates of Definition 3 to natural transformations of the form
T → ˙ (P̆  PT ) which return a set of possible T -descriptions to choose from. The
question is now how to interpret a formula such as Δψ φ in this setting. Notice
that there are at least two sensible readings: we could declare Δψ φ true at x in γ
if φ is true at x in all possible transformations of γ(x) (demonic interpretation);
or, alternatively, if φ is true at x in at least one of them (angelic interpretation).
This example shows that such a notion of non-deterministic update by itself does
not suffice; we will see that the missing behaviour can be specified by means of
predicate liftings.
In order to suitably generalize the deterministic updates of the previous sec-
tion we involve a different notion of transformation that acts directly on the
involved predicates. As such, unsurprisingly, it is contravariant, generalizing as
it does the preimage under an update.

Definition 13. A regenerator is a natural transformation ρ : P̆ × P̆T →̇ P̆T .

The arguments of a regenerator should be thought of as the extension of the


announced formula ψ and a predicate on T X of the form ⟦♥⟧X (⟦φ⟧γψ ), where γψ
denotes an updated version of γ; the regenerator transforms this back into a
predicate on T X as seen from the original γ. We can now define the coalgebraic logic of announcements with effects CAL◦ (Π, Λ), which syntactically coincides with CAL(Π, Λ). In CAL◦ (Π, Λ), each Δ ∈ Π is interpreted by a regenerator ⟦Δ⟧◦ . The semantics of formulas requires not only a T -coalgebra ⟨X, γ⟩ but also a map ρ : 2T X → 2T X (the global regenerator ) that keeps track of the updates applied so far. The extension ⟦·⟧◦ρ,γ of formulas of CAL◦ (Π, Λ) is defined as usual for Boolean connectives and by

⟦Δψ φ⟧◦ρ,γ = ⟦φ⟧◦⟦Δ⟧◦X (⟦ψ⟧◦ρ,γ ,−)◦ρ,γ    and    ⟦♥φ⟧◦ρ,γ = (P̆γ ◦ ρ ◦ ⟦♥⟧X )⟦φ⟧◦ρ,γ .

When no ambiguity arises, we may write ⟦·⟧ instead of ⟦·⟧◦ . We will also use ⟦φ⟧γ instead of ⟦φ⟧ιX ,γ where ι : P̆T →̇ P̆T is the identity.
The connection between regenerators and “updates with effects” as in the
non-deterministic update discussed above can now be made precise. The crucial
observation is that any natural transformation τ : T →̇ (P̆  F T ) equipped with a predicate lifting (for F ) λ : P̆ →̇ P̆F induces the regenerator (for T ) ρτ,λ : P̆ × P̆T →̇ P̆T defined by ρτ,λX (A, S) := P̆(τX (−)(A))[λT X (S)]. In fact, CAL(Λ, Π) is just CAL◦ (Λ, Π) with F = Id and λ = id .
Example 14. The non-deterministic announcements discussed above corre-
spond to taking F = P; the angelic interpretation is induced by λaX (t) := {s | t ∩ s ≠ ∅} and the demonic one by λdX (t) := {s | t ⊆ s} (i.e. 3 and 2 from
Example 2.i). Examples of other updates for various choices of T and τ are:
(i) Lossy announcements: take T = P and τX (S, A) := {S ∩ A, S}; this models
an announcement that can fail (leaving the set of successors unchanged).
If Δ◦ is based on λd , then Δψ φ means that φ has to hold regardless of
whether the announcement of ψ succeeds or not. The angelic case is dual.
(ii) Controlled sabotage: again for T = P, but define τX (S, A) := {S \ A, S}. If
we think of A as a delicate area of a network, this transformation models
links that may fail every time we want to go through them.
(iii) Unstable (pseudo-)Markov chains: let T = Sω and, for each ε ∈ Q ∩ [0, 1],
define a non-deterministic update τ εX (μ, A) = {μ̃p | 0 ≤ p ≤ ε, μ̃p ∈ Sω X}
where μ̃p (x) := if x ∈ A then μ(x) + p else μ(x). This update non-
deterministically augments the probability of each a ∈ A by at most ε.
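The lossy announcement of item (i), together with the demonic/angelic readings, fits in a few lines (the encoding of successor sets and the predicate interface are ours):

```python
# The lossy announcement tau_X(S, A) = {S ∩ A, S} of Example 14(i):
# the update either cuts the successor set down to A or fails and
# leaves it unchanged; the two readings quantify over the outcomes.

def lossy_outcomes(S, A):
    return [S & A, S]

def demonic(S, A, pred):        # pred must hold under every possible outcome
    return all(pred(t) for t in lossy_outcomes(S, A))

def angelic(S, A, pred):        # pred holds under at least one outcome
    return any(pred(t) for t in lossy_outcomes(S, A))
```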
Example 15. Taking F = Dω we get a probability distribution over the out-
comes of an update. For p ∈ Q ∩ [0, 1] we can define λpX (A) := {μ | μ(A) ≥ p},
obtaining dynamic operators Δp such that Δpψ φ is true if the probability of the
effect of announcing ψ (in some unspecified way) making φ true is greater than
p. Note that the underlying coalgebra need not be probabilistic: in this example
the coalgebra type T is arbitrary and F only plays a role in the liftings.
Remark 16. One is tempted to think of non-deterministic or probabilistic up-
dates as changing the coalgebra γ, non-deterministically or randomly, to a fixed
γ  . Although this is not accurate in that the choice is made again every time the
evaluation encounters a static modality, it becomes formally correct by restrict-
ing to tree-shaped coalgebras, i.e. those where the underlying Kripke frame is a
tree, which, in the light of Theorem 17 below, is without loss of generality since
every coalgebra is behaviourally equivalent to a tree-shaped one [20]. One still
needs to keep in mind, however, that the choice is made per state, e.g. a lossy
announcement may succeed in some states and fail in others.

We now show that even in the presence of effects, dynamic modalities can be
rewritten in terms of static modalities (albeit not necessarily of the base logic),
and hence coalgebraic announcement logic in the more general sense remains in-
variant under behavioural equivalence. The crucial observation is that composing
a predicate lifting and a regenerator yields a predicate lifting of a higher arity.
That is, given λ : P̆ n → ˙ P̆T and ρ : P̆ × P̆T → ˙ P̆T , we have that the composite
λX (A, B1 . . . Bn ) := ρX (A, λX (B1 . . . Bn )) is a predicate lifting λ : P̆ n+1 →
˙ P̆T .
Given a static modality ♥ and a dynamic modality Δ, we introduce a static
modality (Δ·♥) interpreted by the composite of Δ and ♥ in this sense; one
easily shows that

Δψ ♥(φ1 , . . . , φn ) ≡ (Δ·♥) (ψ, Δψ φ1 , . . . , Δψ φn ).

Iterating this, we obtain a static modality m for each string m = Δ_1 · · · Δ_n ♥; we denote by CL_Π(Λ) the similarity type extending Λ by these modalities. We say that Λ is closed for Π if for every m ∈ CL_Π(Λ), m(a_1, . . . , a_n) (for propositional variables a_i) can be expressed as a polynomial-sized (in n) formula that is a propositional combination of formulae ♥φ, where φ is a propositional combination of the a_i. Note that when Π consists of strong announcements on Λ, then Λ is closed for Π.

Theorem 17. For all φ ∈ CAL◦ (Π, Λ) there is φ∗ ∈ CML(CLΠ (Λ)) s.t. φ ≡ φ∗ .
Hence, CAL◦ (Π, Λ) is invariant under behavioural equivalence.

5 Decidability and Complexity

We have shown in the previous sections that coalgebraic announcement logic
can be reduced to basic coalgebraic modal logic, albeit incurring an exponential
blow-up. For PAL, it is known that this blow-up is unavoidable [14,8] and yet
its computational complexity is the same as that of the base logic. We now show
that under mild assumptions, the complexity result generalizes to the coalgebraic
setting. We consider two standard decision problems: i) the satisfiability problem (SAT) “given a formula φ, decide if there is X, γ such that ⟦φ⟧_γ ≠ ∅”; and ii) the constrained satisfiability problem (CSAT) “given two formulas ψ and φ, decide if there is X, γ such that ⟦φ⟧_γ ≠ ∅ and ⟦ψ⟧_γ = X”, in which case we say that φ is satisfiable with respect to ψ. In the terminology of description logic, CSAT corresponds to reasoning with a general TBox (given by ψ). SAT is a special case of CSAT but tends to have lower complexity.
Our results have different levels of generality. First we prove a small model
property which holds unconditionally. For the case that the static similarity
type Λ is closed for Π, we moreover provide a polynomial reduction of CSAT
Coalgebraic Announcement Logics 109

to the base logic, which allows inheriting the complexity of CSAT for the latter,
typically EXPTIME. For a polynomial reduction of SAT to the base logic, we
need to assume that Λ contains a master modality, which then again allows
inheriting the complexity, typically PSPACE. We illustrate these methods for
the logic of probabilistic conditioning.

5.1 Constructive Filtrations and the Small Model Property


For Σ a set of formulas, a Σ-filtered model is understood as one whose states
are subsets of Σ and such that each state satisfies all the Σ-formulas it contains.
Hence, a Σ-filtered model has at most 2^{|Σ|} states. We shall prove that every satisfiable formula φ of CAL◦(Λ, Π) is satisfied in some Σ-filtered model with |Σ| = O(|φ|). In fact, we show how to derive a Σ-filtered model from any model for φ, rather simplifying the construction given for CML(Λ) in [19].
In what follows, let Σ be a fixed set of CAL◦ (Λ, Π)-formulas, closed under
subformulas and negation (identifying ¬¬φ with φ as usual). Let H_Σ ⊆ 2^Σ be the set of all maximal satisfiable subsets of Σ. For a given coalgebra γ : X → T X, let f_Σ : X → H_Σ be the mapping f_Σ(x) = {φ ∈ Σ | x ∈ ⟦φ⟧_γ}. A coalgebra
γΣ : HΣ → T HΣ is said to be a Σ-filtration of γ whenever for all x ∈ X, there
exists y ∈ X such that fΣ (x) = fΣ (y) and γΣ (fΣ (x)) = T fΣ (γ(y)).
Intuitively, we take the quotient of X by satisfaction of formulas in Σ and
allow any of the members of the equivalence class to be the representative (each
choice of representatives induces a potentially different filtration).2
To state the filtration theorem we need to relate global regenerators based
on a coalgebra γ with global regenerators based on a Σ-filtration γΣ . So we say
that two maps ρ_X : 2^{TX} → 2^{TX} and ρ_{H_Σ} : 2^{TH_Σ} → 2^{TH_Σ} are f_Σ-synchronized if ρ_X ◦ P̆T f_Σ = P̆T f_Σ ◦ ρ_{H_Σ} (the naturality diagram for f_Σ, if ρ were a natural transformation). By induction over φ, one shows
Theorem 18. Let γ_Σ be a Σ-filtration of X, γ. For all φ ∈ Σ, x ∈ X and every pair of f_Σ-synchronized ρ_X and ρ_{H_Σ}, we have x ∈ ⟦φ⟧_{ρ_X, γ} iff f_Σ(x) ∈ ⟦φ⟧_{ρ_{H_Σ}, γ_Σ}.

Observing that id_{2^{TX}} and id_{2^{TH_Σ}} are f_Σ-synchronized, we obtain


Corollary 19. CAL◦ (Λ, Π) has the small (exponential) model property.
It is easy to exploit the small model property to give an upper bound of NEXPTIME for CSAT under mild additional conditions; in view of the results in the next section, we refrain from spelling out the details.

5.2 Polynomial Satisfiability-Preserving Translations


From Theorem 17 we already know that when Λ is closed for Π, then every
formula in CAL◦ (Λ, Π) is equivalent to one of CML(Λ), perhaps exponentially
² The simpler definition γ_Σ(f_Σ(x)) = T f_Σ(γ(x)) is not well-defined when f_Σ is not injective. Also, f_Σ may not be a coalgebra morphism, even with Σ = CAL◦(Λ, Π).

larger. This implies that the complexity of the decision problems for CAL◦ (Λ, Π)
is at most one exponential higher than for CML(Λ). But one can do better. The
main observation is that, although the translated formula φ∗ may be of size
exponential in |φ|, it contains only polynomially many different subformulas.
Using essentially the same argument as in [14, Lemma 9], one can prove:
Theorem 20. Let Λ be closed for Π. Then CSAT for CAL◦ (Λ, Π) has the same
complexity as for CML(Λ).
The proof is by introducing propositional variables as abbreviations for sub-
formulas, using the constraint. To deal with satisfiability in the absence of a
constraint, we need a master modality to make abbreviations work up to the
modal depth of the target formula. Coalgebraically, a master modality for Λ is a static modality □ such that □⊤ and □φ → (♥ψ ↔ ♥(φ ∧ ψ)), for all ♥ ∈ Λ and φ, ψ ∈ CML(Λ), are valid. In the presence of a master modality one can give better bounds for SAT than those from Theorem 20.
Theorem 21. Let Λ be closed for Π, and contain a master modality. Then the
complexity of SAT for CAL◦ (Λ, Π) is the same as for CML(Λ).
Interestingly, master modalities abound: if T preserves inverse images then the predicate lifting □_X(A) := T A induces a master modality. Preserving inverse images is weaker than the frequent assumption of preservation of weak pullbacks. E.g., in graded modal logic □₁ := ¬◇₁¬ is a master modality, and in probabilistic modal logic L₁ is a master modality. Having observed that Λ is closed for strong
announcements on Λ, we note explicitly
Theorem 22. If Π consists of strong announcements on Λ, then CSAT for
CAL(Λ, Π) has the same complexity as for CML(Λ); the same holds for SAT if
Λ contains a master modality.
In particular, we regain the known complexity of standard PAL, and we obtain,
as new results, PSPACE and EXPTIME as the complexity of SAT and CSAT,
respectively, for graded modal logic with the strong announcement operator
(Example 8), as well as, e.g., NP as the complexity of neighbourhood logic and
monotone modal logic with strong announcement.

5.3 Case Study: Conditionings in Probabilistic Logic


We now turn our attention to a logic where a master modality is available but the
announcements that we are interested in are not strong: the logic of probabilistic
conditioning (cf. Examples 5 and 8), the latter denoted Δ, with L1 being the
master modality.
First, we observe that the static similarity type Λ = {L_p | p ∈ Q ∩ [0, 1]} likely fails to be closed for {Δ}. To see this, we first move to an extended modal language with linear inequalities over probabilities of formulas φ, the latter denoted ℓ(φ) for ‘likelihood’ (i.e. essentially the probabilistic part of the logic introduced in [7]). In this notation, we have

Δ_ψ L_p φ ≡ (ℓ(ψ) = 0 → ℓ(Δ_ψ φ) ≥ 0) ∧ ℓ(ψ ∧ Δ_ψ φ) ≥ p · ℓ(ψ)   (2)

where the first conjunct takes care of the exceptional case of impossible an-
nouncements. It seems unlikely that one could express the right-hand-side of (2)
with a finite formula using only the operators L_p. However, we can extend Λ to a closed similarity type. A very conservative solution is to let L_p(φ | ψ) be a binary modal operator abbreviating ℓ(φ ∧ ψ) ≥ p · ℓ(ψ); then

Δ_ψ L_p(φ | χ) ≡ (L₁(¬ψ | ⊤) → L_p(Δ_ψ φ | χ)) ∧ L_p(Δ_ψ φ | χ ∧ ψ),

i.e. the L_p(− | −) are closed for {Δ}. More generally, one may verify that the full language of linear inequalities (with n-ary modal operators ∑_{i=1}^{n} a_i ℓ(−_i) ≥ b for all n ≥ 0 and a_1, . . . , a_n, b ∈ Q) is closed. SAT for the modal logic of
linear inequalities over probabilities is known to be in PSPACE [7], hence the
complexity of SAT for the above logics of probabilistic conditioning is PSPACE.
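The conditional operator L_p(φ | ψ) is designed to avoid dividing by ℓ(ψ): whenever ℓ(ψ) > 0, the inequality ℓ(φ ∧ ψ) ≥ p · ℓ(ψ) coincides with the conditional probability of φ given ψ being at least p. A minimal numeric sketch of this equivalence (the distribution and events below are hypothetical, not from the paper):

```python
from fractions import Fraction as F

# A finite probability distribution over four states (hypothetical example).
mu = {0: F(1, 2), 1: F(1, 4), 2: F(1, 8), 3: F(1, 8)}

def ell(event):
    """l(event): the probability mass of a set of states."""
    return sum(mu[s] for s in event)

def Lp_cond(p, phi, psi):
    """L_p(phi | psi), read as the inequality l(phi and psi) >= p * l(psi)."""
    return ell(phi & psi) >= p * ell(psi)

phi, psi = {0, 2}, {0, 1, 2}
# When l(psi) > 0 this is exactly "the conditional probability of phi given
# psi is at least p" -- no division occurs, so l(psi) = 0 is unproblematic.
assert Lp_cond(F(2, 3), phi, psi) == (ell(phi & psi) / ell(psi) >= F(2, 3))
print(ell(phi & psi) / ell(psi))  # prints 5/7
```

Working with exact rationals (`fractions.Fraction`) keeps the comparison free of floating-point artifacts.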

6 Conclusions

We have introduced the framework of coalgebraic announcement logics and seen
that it transfers to a setting of richer structures and general effects many nice
properties enjoyed by (relational) public announcement logics. Our work fits in
the spirit of [1], which also studies dynamic epistemic operators in a coalgebraic setting, although with a rather different perspective and a completely different technical machinery. That framework gains much generality from defining updates as natural transformations in CoAlg_T instead of Set. Giving up locality
in this way (for now updates can look at the whole coalgebra structure) has
its consequences: one loses small (or even finite) model properties, general de-
cidability results, etc. The framework of [4] avoids explicit updates of models
but otherwise has a comparable level of generality, with similar advantages and
drawbacks. We expect that all coalgebraic announcement logics can be shown
to be expressible in those frameworks.
It is only a slight simplification to claim that all coalgebraic results are compo-
sitional. One can reduce the study of composite functors to that of multi-sorted
functors and almost all coalgebraic results extend straightforwardly from the single-sorted setting to the multi-sorted one at the expense of no more than additional
indexes in the notation [21]. Applying this mechanism to the case of coalge-
braic announcement logics requires some concentration (e.g. one needs to realize
that regenerators apply to multi-sorted predicates) but does not pose any es-
sential problems. Effectively, this means that all our results — invariance under
behavioural equivalence, complexity analysis and the small model property —
carry over to (complex) composite settings, such as a probabilistic logic about
beliefs in a multi-agent system with group announcement operators that com-
municate facts only to selected agents by probabilistic conditioning.

References
1. Baltag, A.: A coalgebraic semantics for epistemic programs. In: Coalgebraic Meth-
ods in Computer Science. ENTCS, vol. 82, pp. 17–38. Elsevier (2003)
2. Barr, M.: Terminal coalgebras in well-founded set theory. Theoret. Comput.
Sci. 114, 299–315 (1993)
3. van Benthem, J.: An essay on sabotage and obstruction. In: Hutter, D., Stephan,
W. (eds.) Mechanizing Mathematical Reasoning. LNCS (LNAI), vol. 2605,
pp. 268–276. Springer, Heidelberg (2005)
4. Cı̂rstea, C., Sadrzadeh, M.: Coalgebraic epistemic update without change of model.
In: Mossakowski, T., Montanari, U., Haveraaen, M. (eds.) CALCO 2007. LNCS,
vol. 4624, pp. 158–172. Springer, Heidelberg (2007)
5. van Ditmarsch, H., van der Hoek, W., Kooi, B.: Dynamic epistemic logics. Springer
(2007)
6. Duque, D.F., van Benthem, J., Pacuit, E.: Evidence logic: a new look at neighbor-
hood structures. In: Advances in Modal Logics. College Publications (2012)
7. Fagin, R., Halpern, J.Y.: Reasoning about knowledge and probability. J. ACM 41,
340–367 (1994)
8. French, T., van der Hoek, W., Iliev, P., Kooi, B.: Succinctness of epistemic lan-
guages. In: Int. Joint Conf. on Artif. Int., pp. 881–886 (2011)
9. French, T., van Ditmarsch, H.: Undecidability for arbitrary public announcement
logic. In: Advances in Modal Logics, pp. 23–42. College Publications (2008)
10. Heifetz, A., Mongin, P.: Probabilistic logic for type spaces. Games and Economic
Behavior 35, 31–53 (2001)
11. Hintikka, J.: Knowledge and belief. Cornell University Press (1962)
12. Larsen, K., Skou, A.: Bisimulation through probabilistic testing. Inf. Comput. 94,
1–28 (1991)
13. Löding, C., Rohde, P.: Model checking and satisfiability for sabotage modal logic.
In: Pandya, P.K., Radhakrishnan, J. (eds.) FSTTCS 2003. LNCS, vol. 2914, pp.
302–313. Springer, Heidelberg (2003)
14. Lutz, C.: Complexity and succinctness of public announcement logic. In: Joint
Conference on Autonomous Agents and Multi-Agent Systems, pp. 137–143 (2006)
15. Pattinson, D.: Coalgebraic modal logic: Soundness, completeness and decidability
of local consequence. Theoret. Comput. Sci. 309, 177–193 (2003)
16. Pattinson, D.: Expressive logics for coalgebras via terminal sequence induction. Notre Dame J. Formal Logic 45, 19–33 (2004)
17. Plaza, J.A.: Logics of public communications. In: International Symposium on
Methodologies for Intelligent Systems, pp. 201–216 (1989)
18. Rohde, P.: Moving in a crumbling network: The balanced case. In: Marcinkowski,
J., Tarlecki, A. (eds.) CSL 2004. LNCS, vol. 3210, pp. 310–324. Springer, Heidelberg
(2004)
19. Schröder, L.: A finite model construction for coalgebraic modal logic. J. Log. Al-
gebr. Prog. 73, 97–110 (2007)
20. Schröder, L., Pattinson, D.: Coalgebraic correspondence theory. In: Ong, L. (ed.)
FOSSACS 2010. LNCS, vol. 6014, pp. 328–342. Springer, Heidelberg (2010)
21. Schröder, L., Pattinson, D.: Modular algorithms for heterogeneous modal logics
via multi-sorted coalgebra. Math. Struct. Comput. Sci. 21(2), 235–266 (2011)
22. Segerberg, K.: An essay in classical modal logic. No. 1 in Filosofiska studier utgivna
av Filosofiska föreningen och Filosofiska institutionen vid Uppsala univ. (1971)
23. Steiner, D., Studer, T.: Total public announcements. In: Artemov, S., Nerode, A.
(eds.) LFCS 2007. LNCS, vol. 4514, pp. 498–511. Springer, Heidelberg (2007)
Self-shuffling Words

Émilie Charlier¹, Teturo Kamae², Svetlana Puzynina³,⁵, and Luca Q. Zamboni⁴,⁵
¹ Département de Mathématique, Université de Liège, Belgium
[email protected]
² Advanced Mathematical Institute, Osaka City University, Japan
[email protected]
³ Sobolev Institute of Mathematics, Novosibirsk, Russia
[email protected]
⁴ Institut Camille Jordan, Université Lyon 1, France
[email protected]
⁵ FUNDIM, University of Turku, Finland

Abstract. In this paper we introduce and study a new property of in-


finite words which is invariant under the action of a morphism: We say
an infinite word x ∈ AN , defined over a finite alphabet A, is self-shuffling
if x admits factorizations x = ∏_{i=1}^{∞} U_i V_i = ∏_{i=1}^{∞} U_i = ∏_{i=1}^{∞} V_i with
Ui , Vi ∈ A+ . In other words, there exists a shuffle of x with itself which
reproduces x. The morphic image of any self-shuffling word is again self-
shuffling. We prove that many important and well studied words are
self-shuffling: This includes the Thue-Morse word and all Sturmian words
(except those of the form aC where a ∈ {0, 1} and C is a characteristic
Sturmian word). We further establish a number of necessary conditions
for a word to be self-shuffling, and show that certain other important
words (including the paper-folding word and infinite Lyndon words) are
not self-shuffling. In addition to its morphic invariance, which can be used
to show that one word is not the morphic image of another, this new no-
tion has other unexpected applications: For instance, as a consequence of
our characterization of self-shuffling Sturmian words, we recover a num-
ber theoretic result, originally due to Yasutomi, which characterizes pure
morphic Sturmian words in the orbit of the characteristic.

1 Introduction

Let A be a finite non-empty set. We denote by A∗ the set of all finite words
u = x1 x2 . . . xn with xi ∈ A. The quantity n is called the length of u and is
denoted |u|. For a letter a ∈ A, by |u|a we denote the number of occurrences of

The first and fourth authors are supported in part by FiDiPro grant of the
Academy of Finland. The third author is supported in part by the Academy
of Finland under grant 251371, by Russian Foundation of Basic Research (grant
12-01-00448), and by RF President grant MK-4075.2012.1. Preliminary version:
http://arxiv.org/abs/1302.3844.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 113–124, 2013.

c Springer-Verlag Berlin Heidelberg 2013

a in u. The empty word, denoted ε, is the unique element in A∗ with |ε| = 0.


We set A+ = A − {ε}. We denote by AN the set of all one-sided infinite words
x = x0 x1 x2 . . . with xi ∈ A.
Given k finite or infinite words x(1) , x(2) , . . . , x(k) ∈ A∗ ∪ AN we denote by

S (x(1) , x(2) , . . . , x(k) ) ⊂ A∗ ∪ AN

the collection of all words z for which there exists a factorization

z = ∏_{i=0}^{∞} U_i^{(1)} U_i^{(2)} · · · U_i^{(k)}

with each U_i^{(j)} ∈ A∗ and with x^{(j)} = ∏_{i=0}^{∞} U_i^{(j)} for 1 ≤ j ≤ k. Intuitively, z may be obtained as a shuffle of the words x^{(1)}, x^{(2)}, . . . , x^{(k)}. In case
x(1) , x(2) , . . . , x(k) ∈ A∗ , each of the above products can be taken to be finite.
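For finite words u, v, membership z ∈ S(u, v) can be decided by a simple dynamic program over pairs of prefixes; a small sketch (an illustration, not part of the paper's development):

```python
def in_shuffle(z, u, v):
    """Decide z in S(u, v): can z be written as an interleaving of u and v?"""
    if len(z) != len(u) + len(v):
        return False
    # reach holds pairs (i, j) such that z[:i+j] is a shuffle of u[:i] and v[:j].
    reach = {(0, 0)}
    for c in z:
        reach = {(i + 1, j) for i, j in reach if i < len(u) and u[i] == c} | \
                {(i, j + 1) for i, j in reach if j < len(v) and v[j] == c}
    return (len(u), len(v)) in reach

assert in_shuffle("0011", "01", "01")
assert not in_shuffle("0110", "01", "01")
```

The hard problem mentioned next — deciding whether some y with x ∈ S(y, y) exists, with y not given — does not reduce to this test.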
Finite word shuffles were extensively studied in [5]. Given x ∈ A∗ , it is gen-
erally a difficult problem to determine whether there exists y ∈ A∗ such that
x ∈ S (y, y) (see Open Problem 4 in [5]). However, in the context of infinite
words, this question is essentially trivial: In fact, it is readily verified that if
x ∈ AN is such that each a ∈ A occurring in x occurs an infinite number of times
in x, then there exist infinitely many y ∈ AN with x ∈ S (y, y). Instead, in the
framework of infinite words, a far more delicate question is the following:

Question 1. Given x ∈ A^N, does there exist an integer k ≥ 2 such that x ∈ S(x, x, . . . , x) (with k occurrences of x)?

If such a k exists, we say x is k-self-shuffling.


Given x = x0 x1 x2 . . . ∈ AN and an infinite subset N = {N0 < N1 < N2 <
. . .} ⊆ N, we put x[N ] = xN0 xN1 xN2 . . . ∈ AN . Alternatively,

Definition 1. For x ∈ A^N and k = 2, 3, . . ., we say x is k-self-shuffling if there exists a k-element partition N = ⋃_{i=1}^{k} N_i with x[N_i] = x for each i = 1, . . . , k.

In case k = 2, we say simply x is self-shuffling. We note that if x is k-self-


shuffling, then x is ℓ-self-shuffling for each ℓ ≥ k but not conversely (see §2),
whence each self-shuffling word is k-self-shuffling for all k ≥ 2. In this paper
we are primarily interested in self-shuffling words, however, many of the results
presented here extend to general k. Thus x ∈ AN is self-shuffling if and only if x
admits factorizations

x = ∏_{i=1}^{∞} U_i V_i = ∏_{i=1}^{∞} U_i = ∏_{i=1}^{∞} V_i

with Ui , Vi ∈ A+ .
The property of being self-shuffling is an intrinsic property of the word (and
not of the associated language) and seems largely independent of its complexity

(examples exist from the lowest to the highest possible complexity). The simplest
class of self-shuffling words consists of all (purely) periodic words x = uω . It is
clear that if x is self-shuffling, then every letter a ∈ A occurring in x must occur
an infinite number of times. Thus for instance, the ultimately periodic word 01ω
is not self-shuffling. As we shall see, many well-known words which are of interest
in both combinatorics on words and symbolic dynamics, are self-shuffling. This
includes for instance the famous Thue-Morse word

T = 0110100110010110100101100110100110010110 . . .

whose origins go back to the beginning of the last century with the works of the
Norwegian mathematician Axel Thue [9]. The nth entry tn of T is defined as the
sum modulo 2 of the digits in the binary expansion of n. While the Thue-Morse
word appears naturally in many different areas of mathematics (from discrete
mathematics to number theory to differential geometry-see [1] or [2]), proving
that Thue-Morse is self-shuffling is somewhat more involved than expected.
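The digit-sum definition of t_n is straightforward to compute, and the prefix displayed above can be checked directly; a quick sketch:

```python
def thue_morse(n):
    """n-th entry of the Thue-Morse word: parity of the binary digit sum of n."""
    return bin(n).count("1") % 2

prefix = "".join(str(thue_morse(n)) for n in range(40))
assert prefix == "0110100110010110100101100110100110010110"  # prefix quoted above
```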
Sturmian words constitute another important class of aperiodic self-shuffling
words. Sturmian words are infinite words over a binary alphabet having exactly
n+ 1 factors of length n for each n ≥ 0 [7]. Their origin can be traced back to the
astronomer J. Bernoulli III in 1772. They arise naturally in many different areas
of mathematics including combinatorics, algebra, number theory, ergodic theory,
dynamical systems and differential equations. Sturmian words are also of great
importance in theoretical physics and in theoretical computer science and are
used in computer graphics as digital approximation of straight lines. We show
that all Sturmian words are self-shuffling except those of the form aC where
a ∈ {0, 1} and C is a characteristic Sturmian word. Thus for every irrational
number α, all (uncountably many) Sturmian words of slope α are self-shuffling
except for two. Our proof relies on a geometric characterization of Sturmian
words via irrational rotations on the circle.
So while there are many natural examples of aperiodic self-shuffling words,
the property of being self-shuffling is nevertheless quite restrictive. We obtain
a number of necessary (and in some cases sufficient) conditions for a word to
be self-shuffling. For instance, if a word x is self-shuffling, then x begins in only
finitely many Abelian border-free words. As an application of this we show that
the well-known paper folding word is not self-shuffling. Infinite Lyndon words
(i.e., infinite words which are lexicographically smaller than each of its suffixes)
are also shown not to be self-shuffling.
One important feature of self-shuffling words stems from its invariance under
the action of a morphism: The morphic image of a self-shuffling word is again self-
shuffling. In some instances this provides a useful tool for showing that one word
is not the morphic image of another. So for instance, the paper folding word is not
the morphic image of any self-shuffling word. However this application requires
knowing a priori whether a given word is or is not self-shuffling. In general,
to show that a word is self-shuffling, one must actually exhibit a shuffle. Self-
shuffling words have other unexpected applications particularly in the study of
fixed points of substitutions. For instance, as an almost immediate consequence

of our characterization of self-shuffling Sturmian words, we recover a result, first
proved by Yasutomi via number theoretic methods, which characterizes pure
morphic Sturmian words in the orbit of the characteristic.

2 Examples and Non-examples

In this section we list some examples and non-examples of self-shuffling words.


As usual in combinatorics on words, we follow notation from [7].
Fibonacci Word: The Fibonacci infinite word

x = 0100101001001010010100 . . .

is defined as the fixed point of the morphism ϕ given by 0 ↦ 01, 1 ↦ 0. It
is readily verified that ϕ²(a) = ϕ(a)a for each a ∈ {0, 1}. Whence, writing
x = x0 x1 x2 . . . with each xi ∈ {0, 1} we obtain

x = x_0 x_1 x_2 . . . = ϕ(x_0)ϕ(x_1)ϕ(x_2) . . . = ϕ²(x_0)ϕ²(x_1)ϕ²(x_2) . . . = ϕ(x_0)x_0 ϕ(x_1)x_1 ϕ(x_2)x_2 . . .

which shows that x is self-shuffling. In contrast, the word y = 0x is not self-
shuffling. The word y starts with infinitely many prefixes of the form 0B1 with
B a palindrome. It follows that 0B1 is Abelian border-free (i.e., no proper suffix
of 0B1 is Abelian equivalent to a proper prefix of 0B1). By Proposition 3 the
word y is not self-shuffling.
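The identity ϕ²(a) = ϕ(a)a makes the Fibonacci self-shuffle fully explicit (U_i = ϕ(x_i), V_i = x_i), and it can be checked on a long finite prefix:

```python
PHI = {"0": "01", "1": "0"}

def phi(w):
    """The Fibonacci morphism: 0 -> 01, 1 -> 0, applied letter by letter."""
    return "".join(PHI[c] for c in w)

x = "0"
for _ in range(20):   # a long prefix of the Fibonacci word (17711 letters)
    x = phi(x)

# U_i = phi(x_i), V_i = x_i: since phi^2(a) = phi(a)a, the product of the
# U_i V_i is x itself, while the U_i alone give phi(x) = x and the V_i give x.
n = 1000
shuffled = "".join(phi(c) + c for c in x[:n])
assert x.startswith(shuffled)
assert x.startswith("".join(phi(c) for c in x[:n]))
```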

Paper-folding Word: The paper-folding word

x = 00100110001101100010 . . .

is a Toeplitz word generated by the pattern u = 0?1? (see, e.g., [4]). It is readily
verified that x begins in arbitrarily long Abelian border-free words and hence by
Proposition 3 is not self-shuffling. More precisely, the prefixes u_j of x of length n_j = 2^j − 1 are Abelian border-free. Indeed, it is verified that for each k < n_j, we have |pref_k(u_j)|_0 > k/2 while |suff_k(u_j)|_0 ≤ k/2. Here pref_k(u) (resp., suff_k(u)) denotes the prefix (resp., suffix) of length k of a word u.
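Both the word and the prefix/suffix counting property can be checked computationally; here the paper-folding word is generated via the standard description "write n = m·2^k with m odd; the letter is 0 iff m ≡ 1 (mod 4)", which is equivalent to the Toeplitz construction:

```python
def paperfold(n):
    """n-th letter (n >= 1): write n = m * 2**k with m odd; 0 iff m % 4 == 1."""
    while n % 2 == 0:
        n //= 2
    return 0 if n % 4 == 1 else 1

x = [paperfold(n) for n in range(1, 2**12)]
assert "".join(map(str, x[:20])) == "00100110001101100010"  # the prefix above

# Prefixes u_j of length 2^j - 1 are Abelian border-free: every prefix of u_j
# has a strict majority of 0s, while every suffix has at most half 0s.
for j in range(2, 12):
    u = x[: 2**j - 1]
    zeros = [0]
    for b in u:
        zeros.append(zeros[-1] + (b == 0))
    for k in range(1, len(u)):
        assert zeros[k] > k / 2                        # prefix side
        assert zeros[-1] - zeros[len(u) - k] <= k / 2  # suffix side
```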

A 3-Self-shuffling Word Which Is Not Self-shuffling: Let y denote the
fixed point of the morphism σ : 0 ↦ 0001 and 1 ↦ 0101, and put
x = 0^{−2} y = 01000100010101000100010001010100010001000101010001010100 . . . ,

where the notation w = v^{−k} u means that u = v^k w. Then for each prefix u_j of x of length 4^j − 2, the longest Abelian border of u_j of length less than or equal to (4^j − 2)/2 has length 2. Hence x is not self-shuffling (see Proposition 3). The 3-shuffle is given by the following:

U_0 = 0100, U_1 = 01, . . . , U_{4i+2} = ε, U_{4i+3} = σ^{i+1}(0100),
U_{4i+4} = σ(0), U_{4i+5} = (σ(0))^{−1} σ^{i+1}(01),
V_0 = 0100, V_1 = 01, . . . , V_{4i+2} = (σ(0))^{−1} σ^{i+1}(0), V_{4i+3} = σ(0),
V_{4i+4} = (σ(0))^{−1} σ^{i+1}(01) σ(0), V_{4i+5} = ε,
W_0 = 01, W_1 = (σ(0))^2, . . . , W_{4i+2} = ε, W_{4i+3} = (σ(0))^{−1} σ^{i+1}(01),
W_{4i+4} = ε, W_{4i+5} = σ^{i+2}(0) σ(0).

It is then verified that

x = ∏_{i=0}^{∞} U_i V_i W_i = ∏_{i=0}^{∞} U_i = ∏_{i=0}^{∞} V_i = ∏_{i=0}^{∞} W_i,

from which it follows that x is 3-self-shuffling.

A Recurrent Binary Self-shuffling Word with Full Complexity: For each
positive integer n, let zn denote the concatenation of all words of length n in
increasing lexicographic order. For example, z2 = 00011011. For i ≥ 0 put

v_i = z_n if i = n·2^{n−1} for some n, and v_i = 0^i 1^i otherwise, and define

x = ∏_{i=0}^{∞} X_i = 01 · 01 · 0011 · 0^3 01 1^3 · 0^4 01 0^2 1^2 01 1^4 · · · ,

where X_0 = X_1 = 01, X_2 = 0011, and for i ≥ 3, X_i = 0^i y_{i−2} 1^i, where y_{i−2} = y_{i−3} v_{i−2} y_{i−3} and y_0 = ε. We note that x is recurrent (i.e., each prefix occurs twice) and has full complexity (since it contains z_n as a factor for every n).
To show that the word x is self-shuffling, we first show that Xi+1 ∈ S (Xi , Xi ).
Take N_i = {0, . . . , i − 1, i + 1, . . . , 2^i − i, 2^i − i + v_{i−1}|_1, 2^{i+1} − i − 1}, where u|_1 denotes the positions j of a word u in which the j-th letter u_j of u is equal to 1. Then it is straightforward to see that X_i = X_{i+1}[N_i] = X_{i+1}[{1, . . . , 2^{i+1}}\N_i].
The self-shuffle of x is built in a natural way by concatenating shuffles of the X_i, starting with U_0 = V_0 = 01, so that X_0 . . . X_{i+1} ∈ S(X_0 . . . X_i, X_0 . . . X_i).
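As a sanity check of this construction (coded as I read the definitions above; the shuffle test is a generic dynamic program, not from the paper), one can generate z_n, v_i, y_i and X_i and confirm both the displayed prefix and the claim X_{i+1} ∈ S(X_i, X_i) for small i:

```python
from itertools import product

def in_shuffle(z, u, v):
    """Generic shuffle-membership test z in S(u, v), by dynamic programming."""
    if len(z) != len(u) + len(v):
        return False
    reach = {(0, 0)}
    for c in z:
        reach = {(i + 1, j) for i, j in reach if i < len(u) and u[i] == c} | \
                {(i, j + 1) for i, j in reach if j < len(v) and v[j] == c}
    return (len(u), len(v)) in reach

def z_word(n):
    """Concatenation of all binary words of length n in lexicographic order."""
    return "".join("".join(w) for w in product("01", repeat=n))

def v_word(i):
    for n in range(1, i + 1):
        if n * 2 ** (n - 1) == i:
            return z_word(n)
    return "0" * i + "1" * i

assert z_word(2) == "00011011"

X, y = ["01", "01", "0011"], ""          # y holds y_{i-3}; y_0 = empty word
for i in range(3, 7):
    y = y + v_word(i - 2) + y            # y_{i-2} = y_{i-3} v_{i-2} y_{i-3}
    X.append("0" * i + y + "1" * i)      # X_i = 0^i y_{i-2} 1^i

assert "".join(X[:4]) == "0101001100001111"  # matches the displayed prefix
for i in range(1, 6):
    assert in_shuffle(X[i + 1], X[i], X[i])
```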

3 General Properties

In this section we develop several fundamental properties of self-shuffling words.


The next two propositions show the invariance of self-shuffling words with respect
to the action of a morphism:

Proposition 1. Let A and B be finite non-empty sets and τ : A → B∗ a morphism. If x ∈ A^N is self-shuffling, then so is τ(x) ∈ B^N.

Proof. If x ∈ S(x, x), then we can write x = ∏_{i=1}^{∞} U_i V_i = ∏_{i=1}^{∞} U_i = ∏_{i=1}^{∞} V_i. Whence τ(x) = ∏_{i=1}^{∞} τ(U_i V_i) = ∏_{i=1}^{∞} τ(U_i)τ(V_i) = ∏_{i=1}^{∞} τ(U_i) = ∏_{i=1}^{∞} τ(V_i), as required.
Proposition 2. Let τ : A → A∗ be a morphism, and x ∈ AN be a fixed point
of τ.
1. Let u be a prefix of x and k be a positive integer such that τ k (a) begins in u
for each a ∈ A. Then if x is self-shuffling, then so is u−1 x.
2. Let u ∈ A∗ , and let k be a positive integer such that τ k (a) ends in u for each
a ∈ A. Then if x is self-shuffling, then so is ux.
Proof. We prove only item (1) since the proof of (2) is essentially identical. Suppose x = ∏_{i=1}^{∞} U_i V_i = ∏_{i=1}^{∞} U_i = ∏_{i=1}^{∞} V_i. Then by assumption, for each i ≥ 1 we can write τ^k(U_i) = uU_i′ and τ^k(V_i) = uV_i′ for some U_i′, V_i′ ∈ A∗. Put X_i = U_i′u and Y_i = V_i′u. Then since

x = τ^k(x) = ∏_{i=1}^{∞} τ^k(U_i V_i) = ∏_{i=1}^{∞} τ^k(U_i)τ^k(V_i) = ∏_{i=1}^{∞} τ^k(U_i) = ∏_{i=1}^{∞} τ^k(V_i),

we deduce that

u^{−1}x = ∏_{i=1}^{∞} X_i Y_i = ∏_{i=1}^{∞} X_i = ∏_{i=1}^{∞} Y_i.

Corollary 1. Let τ : A → A∗ be a primitive morphism, and a ∈ A. Suppose
τ (b) begins (respectively ends) in a for each letter b ∈ A. Suppose further that the
fixed point τ ∞ (a) is self-shuffling. Then every right shift (respectively left shift)
of τ ∞ (a) is self-shuffling.
Remark 1. Since the Fibonacci word is self-shuffling and is fixed by the primitive
morphism 0 ↦ 01, 1 ↦ 0, it follows from Corollary 1 that every tail of the
Fibonacci word is self-shuffling.
There are a number of necessary conditions that a self-shuffling word must sat-
isfy, which may be used to deduce that a given word is not self shuffling. For
instance:
Proposition 3. If x ∈ AN is self-shuffling, then for each positive integer N
there exists a positive integer M such that every prefix u of x with |u| ≥ M has
an Abelian border v with |u|/2 ≥ |v| ≥ N. In particular, x must begin in only a
finite number of Abelian border-free words.
Proof. Suppose to the contrary that there exist factorizations x = ∏_{i=0}^{∞} U_i V_i = ∏_{i=0}^{∞} U_i = ∏_{i=0}^{∞} V_i with U_i, V_i ∈ A⁺, and there exists N such that for every M there exists a prefix u of x with |u| ≥ M which has no Abelian borders of length between N and |u|/2. Take M = |∏_{i=0}^{N} U_i V_i| and a prefix u satisfying these conditions. Then there exist non-empty proper prefixes U′ and V′ of u such that u ∈ S(U′, V′) with |U′|, |V′| > N. Writing u = U′U′′, it follows that U′′ and V′ are Abelian equivalent. This contradicts that u has no Abelian borders of length between N and |u|/2.
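Proposition 3's condition is easy to test on concrete prefixes: an Abelian border of length ℓ is a pair (pref_ℓ, suff_ℓ) with equal Parikh vectors. A small checker (illustrative, not from the paper):

```python
from collections import Counter

def abelian_border_lengths(u):
    """Lengths l <= |u|/2 with pref_l(u) Abelian equivalent to suff_l(u)."""
    return [l for l in range(1, len(u) // 2 + 1)
            if Counter(u[:l]) == Counter(u[-l:])]

# The periodic word (01)^w is self-shuffling, and its prefixes have
# Abelian borders of every even length:
assert abelian_border_lengths("01" * 8) == [2, 4, 6, 8]
# A prefix of the paper-folding word, in contrast, is Abelian border-free:
assert abelian_border_lengths("0010011") == []
```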

An extension of this argument gives both a necessary and sufficient condition
for self-shuffling in terms of Abelian borders (which is however difficult to check
in practice). For u ∈ A∗ let Ψ (u) denote the Parikh vector of u, i. e., Ψ (u) =
(|u|a )a∈A .

Definition 2. Given x ∈ AN , we define a directed graph Gx = (Vx , Ex ) with
vertex set

V_x = {(n, m) ∈ N² | Ψ(pref_n x) + Ψ(pref_m x) = Ψ(pref_{n+m} x)}

and the edge set

E_x = {((n, m), (n′, m′)) ∈ V_x × V_x | n′ = n + 1 and m′ = m, or m′ = m + 1 and n′ = n}.
We say that G_x connects 0 to ∞ if there exists an infinite path (n_j, m_j)_{j≥0} in G_x such that (n_0, m_0) = (0, 0) and n_j, m_j → ∞ as j → ∞.

Theorem 1. A word x ∈ AN is self-shuffling if and only if the graph Gx connects
0 to ∞.

The theorem gives a constructive necessary and sufficient condition for self-
shuffling since a path to infinity defines a self-shuffle.
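On a finite prefix the reachable part of G_x can be explored directly; a sketch (a generic search, not from the paper) using the self-shuffling periodic word (01)^ω as an example:

```python
from collections import Counter

def gx_reaches(x, depth):
    """Vertices of G_x (as in Theorem 1) reachable from (0, 0) with n + m <= depth.
    A pair (n, m) is a vertex when Psi(pref_n x) + Psi(pref_m x) = Psi(pref_{n+m} x);
    edges increase n or m by one."""
    psi = lambda k: Counter(x[:k])
    seen, stack = {(0, 0)}, [(0, 0)]
    while stack:
        n, m = stack.pop()
        for n2, m2 in ((n + 1, m), (n, m + 1)):
            if n2 + m2 <= depth and (n2, m2) not in seen \
                    and psi(n2) + psi(m2) == psi(n2 + m2):
                seen.add((n2, m2))
                stack.append((n2, m2))
    return seen

# On (01)^w the graph connects 0 outward: both coordinates can grow together.
assert (20, 20) in gx_reaches("01" * 64, 60)
# Pairs of odd coordinates violate the Parikh condition and are never vertices.
assert (1, 1) not in gx_reaches("01" * 64, 60)
```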
As we shall now see, lexicographically extremal words are never self-shuffling.
Let (A, ≤) be a finite linearly ordered set. Then ≤ induces the lexicographic
ordering ≤lex on A+ and AN defined as follows: If u, v ∈ A+ (or AN ) we write
u ≤lex v if either u = v or if u is lexicographically smaller than v. In the latter
case we write u <lex v.
Let x ∈ AN . A factor u of x is called minimal (in x) if u ≤lex v for all factors v
of x with |v| = |u|. An infinite word y in the shift orbit closure Sx of x is called
Lyndon (in Sx ) if every prefix of y is minimal in x. The proof of the following
result is omitted for space considerations:

Theorem 2. Let (A, ≤) be a linearly ordered finite set and let x ∈ AN . Let
y, z ∈ Sx with y Lyndon and aperiodic. Then for each w ∈ S (y, z), we have
w <lex z. In particular, taking z = y we deduce that y is not self-shuffling.

Let A be a finite non-empty set. We say x ∈ AN is extremal if there exists a
linear ordering ≤ on A with respect to which x is Lyndon. As an immediate
consequence of Theorem 2 we obtain:

Corollary 2. Let A be a finite non-empty set and x ∈ A^N be an aperiodic extremal word. Then x is not self-shuffling.

Remark 2. Let x = 11010011001011010010110 . . . denote the first shift of the
Thue-Morse infinite word. It is easily checked that x is extremal and hence is
not self-shuffling; yet it can be verified that x begins in only a finite number of
Abelian border-free words.
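The extremality claim in Remark 2 can be probed computationally: under the order 1 < 0, every prefix of the shifted word should be minimal, i.e. lexicographically maximal under the usual order 0 < 1 among factors of the same length. A quick check on short lengths (the window is long enough that, by uniform recurrence of T, all short factors occur in it):

```python
def thue_morse_prefix(n):
    """First n letters of the Thue-Morse word (binary digit-sum parity)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))

t = thue_morse_prefix(4096)
x = t[1:]   # the first shift of the Thue-Morse word, 1101001100...

# Extremal w.r.t. 1 < 0: each prefix of x is lexicographically maximal
# under 0 < 1 among all factors of t of the same length.
for k in range(1, 12):
    assert x[:k] == max(t[i:i + k] for i in range(len(t) - k))
```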

4 The Thue-Morse Word Is Self-shuffling


Theorem 3. The Thue-Morse word T = 011010011001 . . . fixed by the morphism τ mapping 0 ↦ 01 and 1 ↦ 10 is self-shuffling.

Proof. For u ∈ {0, 1}∗ we denote by ū the word obtained from u by exchanging
0s and 1s. Let σ : {1, 2, 3, 4} → {1, 2, 3, 4}∗ be the morphism defined by

σ(1) = 12, σ(2) = 31, σ(3) = 34, σ(4) = 13.

Set u = 01101 and v = 001; note that uv is a prefix of T. Also define morphisms
g, h : {1, 2, 3, 4} → {0, 1}∗ by

g(1) = vū, g(2) = v̄ū, g(3) = v̄u, g(4) = vu

and
h(1) = uv, h(2) = ūv̄, h(3) = ūv̄, h(4) = uv.
We will make use of the following lemmas:

Lemma 1. g(σ(a)) ∈ S (g(a), h(a)) for each a ∈ {1, 2, 3, 4}. In particular
ug(σ(1)) ∈ S (ug(1), h(1)).

Proof. For a = 1 we note that

g(σ(1)) = g(12) = vūv̄ ū = 0011001011010010.


Factoring 0011001011010010 = 0 · 011 · 0 · 010 · 11 · 01 · 0010 we obtain

g(σ(1)) ∈ S (00110010, 01101001) = S (vū, uv) = S (g(1), h(1)).

Similarly, for a = 2 we have

g(σ(2)) = g(31) = v̄uvū = 1100110100110010.


Factoring 1100110100110010 = 1 · 100 · 1 · 1 · 010 · 0110 · 010 we obtain

g(σ(2)) ∈ S (11010010, 10010110) = S (v̄ ū, ūv̄) = S (g(2), h(2)).

Exchanging 0s and 1s in the previous two shuffles yields

g(σ(3)) = g(34) = v̄uvu ∈ S (v̄u, ūv̄) = S (g(3), h(3))

and
g(σ(4)) = g(13) = vūv̄u ∈ S (vu, uv) = S (g(4), h(4)).
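The explicit factorizations above can be checked mechanically. The following Python sketch (an illustration of ours, not part of the original text) decides membership in the shuffle product S(u, v) by dynamic programming and verifies g(σ(a)) ∈ S(g(a), h(a)) for each a ∈ {1, 2, 3, 4}; the names is_shuffle, g, h, and sigma are our own.

```python
def is_shuffle(w, u, v):
    # dp[i][j] is True iff w[:i+j] is an interleaving of u[:i] and v[:j]
    if len(w) != len(u) + len(v):
        return False
    dp = [[False] * (len(v) + 1) for _ in range(len(u) + 1)]
    dp[0][0] = True
    for i in range(len(u) + 1):
        for j in range(len(v) + 1):
            if i and dp[i - 1][j] and u[i - 1] == w[i + j - 1]:
                dp[i][j] = True
            if j and dp[i][j - 1] and v[j - 1] == w[i + j - 1]:
                dp[i][j] = True
    return dp[len(u)][len(v)]

bar = lambda s: s.translate(str.maketrans("01", "10"))  # exchange 0s and 1s
u, v = "01101", "001"
g = {1: v + bar(u), 2: bar(v) + bar(u), 3: bar(v) + u, 4: v + u}
h = {1: u + v, 2: bar(u) + bar(v), 3: bar(u) + bar(v), 4: u + v}
sigma = {1: [1, 2], 2: [3, 1], 3: [3, 4], 4: [1, 3]}

for a in (1, 2, 3, 4):
    # g(sigma(a)) must be a shuffle of g(a) and h(a), as in Lemma 1
    assert is_shuffle("".join(g[b] for b in sigma[a]), g[a], h[a])
```

The standard O(|u||v|) shuffle-membership recurrence suffices here, since all the words involved have length 8 or 16.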

It is readily verified that

Lemma 2. h(σ(a)) = τ (h(a)) for each a ∈ {1, 2, 3, 4}.

Let w = w0 w1 w2 w3 . . . with wi ∈ {1, 2, 3, 4} denote the fixed point of σ beginning


in 1. As a consequence of the previous lemma we deduce that
Self-shuffling Words 121

Lemma 3. T = h(w).

Proof. In fact τ (h(w)) = h(σ(w)) = h(w) from which it follows that h(w) is one
of the two fixed points of τ. Since h(w) begins in h(1) which in turn begins in
0, it follows that T = h(w).

Lemma 4. T = ug(w).

Proof. It is readily verified that:

ug(1) = h(1)ū

ūg(2) = h(2)ū
ūg(3) = h(3)u
ug(4) = h(4)u.
Moreover, each occurrence of g(1) and g(4) in ug(w) is preceded by u while
each occurrence of g(2) and g(3) in ug(w) is preceded by ū. It follows that
ug(w) = h(w) which by the preceding lemma equals T.

Set
A0 = ug(σ(w0 )) and Ai = g(σ(wi )), for i ≥ 1
B0 = ug(w0 ) and Bi = g(wi ), for i ≥ 1
and
Ci = h(wi ) for i ≥ 0.
It follows from Lemma 3 and Lemma 4 that

T = A0 A1 A2 · · · = B0 B1 B2 · · · = C0 C1 C2 · · ·

and it follows from Lemma 1 that Ai ∈ S (Bi , Ci ) for each i ≥ 0. Hence T ∈


S (T, T) as required.
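On finite prefixes, Lemmas 3 and 4 can be observed directly. The sketch below (ours, purely for illustration) iterates σ and τ and checks that h(w) and ug(w) agree with the Thue-Morse word on the computed prefixes.

```python
bar = lambda s: s.translate(str.maketrans("01", "10"))
u, v = "01101", "001"
g = {1: v + bar(u), 2: bar(v) + bar(u), 3: bar(v) + u, 4: v + u}
h = {1: u + v, 2: bar(u) + bar(v), 3: bar(u) + bar(v), 4: u + v}
sigma = {1: [1, 2], 2: [3, 1], 3: [3, 4], 4: [1, 3]}

# prefix of the fixed point w of sigma beginning in 1
# (valid because sigma(1) itself begins in 1)
w = [1]
for _ in range(5):
    w = [b for a in w for b in sigma[a]]

# prefix of the Thue-Morse word, fixed by tau: 0 -> 01, 1 -> 10
t = "0"
for _ in range(9):
    t = "".join("01" if c == "0" else "10" for c in t)

hw = "".join(h[a] for a in w)        # h(w) on 32 symbols: 256 letters
ugw = u + "".join(g[a] for a in w)   # u g(w): 261 letters
assert t[:len(hw)] == hw             # Lemma 3: T = h(w)
assert t[:len(ugw)] == ugw           # Lemma 4: T = u g(w)
```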

5 Self-shuffling Sturmian Words


In this section we characterize self-shuffling Sturmian words. Sturmian words
admit various types of characterizations of geometric and combinatorial nature,
e.g., they can be defined via balance, complexity, morphisms, etc. (see Chapter
2 in [7]). In [8], Morse and Hedlund showed that each Sturmian word may be
realized geometrically by an irrational rotation on the circle. More precisely,
every Sturmian word x is obtained by coding the symbolic orbit of a point ρ(x)
on the circle (of circumference one) under a rotation by an irrational angle α
where the circle is partitioned into two complementary intervals, one of length α
(labeled 1) and the other of length 1 − α (labeled 0) (see Fig. 1). And conversely
each such coding gives rise to a Sturmian word. The irrational α is called the slope

and the point ρ(x) is called the intercept of the Sturmian word x. A Sturmian
word x of slope α with ρ(x) = α is called a characteristic Sturmian word. It is
well known that every prefix u of a characteristic Sturmian word is left special,
i.e., both 0u and 1u are factors of x [7]. Thus if x is a characteristic Sturmian
word of slope α, then both 0x and 1x are Sturmian words of slope α and ρ(0x) =
ρ(1x) = 0. The fact that ρ is not one-to-one stems from the ambiguity of the
coding of the boundary points 0 and 1 − α.

[Figure: a circle of circumference one, partitioned at the points 0 and 1 − α into an interval of length α labeled 1 and an interval of length 1 − α labeled 0, with the point ρ(x) marked.]
Fig. 1. Geometric picture of a Sturmian word of slope α
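As an illustration of this rotation coding (our sketch, using one of the standard conventions for which half-open interval receives which label), the following Python fragment generates a Sturmian word from a slope α and intercept ρ; with α = (3 − √5)/2 and ρ = α it reproduces a prefix of the Fibonacci word, the characteristic Sturmian word of that slope.

```python
import math

def sturmian(alpha, rho, n):
    # Code the orbit rho, rho + alpha, rho + 2*alpha, ... on the circle R/Z:
    # emit 1 when the point lies in the interval [1 - alpha, 1) of length alpha,
    # and 0 when it lies in [0, 1 - alpha).  Floating point suffices for short
    # prefixes whose orbit points stay away from the interval endpoints.
    return "".join("1" if (rho + k * alpha) % 1.0 >= 1 - alpha else "0"
                   for k in range(n))

alpha = (3 - math.sqrt(5)) / 2   # slope of the Fibonacci word
# intercept rho = alpha gives the characteristic word of that slope
assert sturmian(alpha, alpha, 20) == "01001010010010100101"
```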

Theorem 4. Let S, M and L be Sturmian words of the same slope α, 0 < α < 1,
satisfying S ≤lex M ≤lex L. Then M ∈ S (S, L) if and only if the following
conditions hold: If ρ(M ) = ρ(S) (respectively, ρ(M ) = ρ(L)), then ρ(L) = 0
(respectively ρ(S) = 0).

In particular (taking S = M = L), we obtain

Corollary 3. A Sturmian word x ∈ {0, 1}N is self-shuffling if and only if ρ(x) =


0, or equivalently, x is not of the form aC where a ∈ {0, 1} and C is a charac-
teristic Sturmian word.

Our proof explicitly describes an algorithm for shuffling S and L so as to pro-


duce M. It is formulated in terms of the circle rotation description of Sturmian
words. Geometrically speaking, points ρ(S) and ρ(L) will take turns following
the trajectory of ρ(M ) so that the respective codings agree; as one follows the
other waits its turn (remains neutral). The algorithm specifies this following rule
depending on the relative positions of the trajectories of all three points and is
broken down into several cases. The proof can be summarized by the directed
graph in Fig. 2 in which each state n corresponds to “case n” in the proof.
We let s, m, and ℓ denote the current tail of the words S, M , and L. They
are initialized as
s := S, ℓ := L, and m := M.
While m is always a tail of M , the letters s and ℓ may be tails of S or L,
depending on which is the current lexicographically largest (the choice of the
letters s, m, and ℓ is intended to refer to small, medium, and large, respectively).
Each directed edge

[Figure: a directed graph on the states 1.1, 1.2, 2.1, 2.2, 3.1, 3.2, 4, 5, 6.1, and 6.2, one per case of the proof.]
Fig. 2. Graphical depiction of the proof of Theorem 4

corresponds to a precise set of instructions which specify which of s or ℓ is
neutral, which of s or ℓ follows m and for how long, and in the end a possible
relabeling of the variables s and ℓ. In each case the outcome leads to a new case
in which there is a switch in the follower. In other words, if there is an edge from
case i to case j in the graph, then either the instructions for case i and case j
specify different followers (as is the case for cases 1.1 and 2.1), in which case the
passage from i to j leaves the labeling of s and ℓ unchanged, or the instructions
for case i and case j specify the same follower (as is the case for cases 1.2 and
1.1), in which case the passage from i to j exchanges the labeling of s and ℓ.
The proof of Theorem 4 amounts to showing that for each state n in the graph,
the specified instructions will take n to an adjacent state in the graph.
As an almost immediate application of Corollary 3 we recover the following
result originally proved by Yasutomi in [10] and later reproved by Berthé, Ei,
Ito and Rao in [3] and independently by Fagnot in [6]. We say an infinite word is
pure morphic if it is a fixed point of some morphism different from the identity.

Theorem 5 (Yasutomi [10]). Let x ∈ {0, 1}N be a characteristic Sturmian


word. If y is a pure morphic word in the orbit of x, then y ∈ {x, 0x, 1x, 01x, 10x}.

Proof. We begin with some preliminary observations. Let Ω(x) denote the set of
all left and right infinite words y such that F (x) = F (y) where F (x) and F (y)
denote the set of all factors of x and y respectively. If y ∈ Ω(x) is a right infinite
word, and 0y, 1y ∈ Ω(x), then y = x. This is because every prefix of y is a left
special factor and hence also a prefix of the characteristic word x. Similarly if y
is a left infinite word and y0, y1 ∈ Ω(x), then y is equal to the reversal of x. If
τ is a morphism fixing some point y ∈ Ω(x), then τ (z) ∈ Ω(x) for all z ∈ Ω(x).
Suppose to the contrary that τ ≠ id is a morphism fixing a proper tail y of x.
Then y is self-shuffling by Corollary 3. Put x = uy with u ∈ {0, 1}+. Using the
characterization of Sturmian morphisms (see Theorem 2.3.7 & Lemma 2.3.13
in [7]) we deduce that τ must be primitive. Thus we can assume that |τ (a)| >
1 for each a ∈ {0, 1}. If τ (0) and τ (1) end in distinct letters, then as both
0τ (x), 1τ (x) ∈ Ω(x), it follows that τ (x) = x. Since also τ (y) = y and |τ (u)| >
|u|, it follows that y is a proper tail of itself, a contradiction since x is aperiodic.
Thus τ (0) and τ (1) must end in the same letter. Whence by Corollary 1 it follows

that every left extension of y is self-shuffling, which is again a contradiction since


0x and 1x are not self-shuffling.
Next suppose τ ≠ id is a morphism fixing a point y = uabx ∈ Ω(x) where u ∈
{0, 1}+ and {a, b} = {0, 1}. Again we can suppose τ is primitive and |τ (0)| > 1
and |τ (1)| > 1. If τ (0) and τ (1) begin in distinct letters, then τ (x̃)0, τ (x̃)1 ∈ Ω(x)
where x̃ denotes the reverse of x. Thus τ (x̃) = x̃. Thus for each prefix v of abx
we have τ (x̃v) = x̃τ (v) whence τ (v) is also a prefix of abx. Hence τ (abx) = abx.
As before this implies that abx is a proper tail of itself which is a contradiction.
Thus τ (0) and τ (1) begin in the same letter. Whence by Corollary 1 it follows
that every tail of y is self-shuffling, which is again a contradiction since 0x and
1x are not self-shuffling.

Remark 3. In the case of the Fibonacci infinite word x, each of


{x, 0x, 1x, 01x, 10x} is pure morphic. For a general characteristic word x, since
every point in the orbit of x except for 0x and 1x is self-shuffling, it follows that
if τ is a morphism fixing x (respectively 01x or 10x), then τ (0) and τ (1) must
end (respectively begin) in distinct letters.

References
1. Allouche, J.-P., Shallit, J.: The ubiquitous Prouhet-Thue-Morse sequence. In: Ding, C., Helleseth, T., Niederreiter, H. (eds.) Proceedings of Sequences and Their Applications, SETA 1998, pp. 1–16. Springer (1999)
2. Allouche, J.-P., Shallit, J.: Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press (2003)
3. Berthé, V., Ei, H., Ito, S., Rao, H.: On substitution invariant Sturmian words: an application of Rauzy fractals. Theor. Inform. Appl. 41, 329–349 (2007)
4. Cassaigne, J., Karhumäki, J.: Toeplitz Words, Generalized Periodicity and Periodically Iterated Morphisms. European J. Combin. 18, 497–510 (1997)
5. Henshall, D., Rampersad, N., Shallit, J.: Shuffling and unshuffling. Bull. EATCS 107, 131–142 (2012)
6. Fagnot, I.: A little more about morphic Sturmian words. Theor. Inform. Appl. 40, 511–518 (2006)
7. Lothaire, M.: Algebraic Combinatorics on Words. Encyclopedia of Mathematics and its Applications, vol. 90. Cambridge University Press (2002)
8. Morse, M., Hedlund, G.A.: Symbolic dynamics II: Sturmian sequences. Amer. J. Math. 62, 1–42 (1940)
9. Thue, A.: Über unendliche Zeichenreihen. Norske Vid. Selsk. Skr. I Math-Nat. Kl. 7, 1–22 (1906)
10. Yasutomi, S.-I.: On Sturmian sequences which are invariant under some substitutions. In: Kanemitsu, S., et al. (eds.) Number Theory and Its Applications, Proceedings of the Conference held at the RIMS, Kyoto, Japan, November 10–14, 1997, pp. 347–373. Kluwer Acad. Publ., Dordrecht (1999)
Block-Sorted Quantified Conjunctive Queries

Hubie Chen¹,⋆ and Dániel Marx²,⋆⋆

¹ Universidad del País Vasco and IKERBASQUE, E-20018 San Sebastián, Spain
² Computer and Automation Research Institute, Hungarian Academy of Sciences
(MTA SZTAKI), Budapest, Hungary

Abstract. We study the complexity of model checking in quantified


conjunctive logic, that is, the fragment of first-order logic where both
quantifiers may be used, but conjunction is the only permitted connec-
tive. In particular, we study block-sorted queries, which we define to be
prenex sentences in multi-sorted relational first-order logic where two
variables having the same sort must appear in the same quantifier block.
We establish a complexity classification theorem that describes precisely
the sets of block-sorted queries of bounded arity on which model check-
ing is fixed-parameter tractable. This theorem strictly generalizes, for
the first time, the corresponding classification for existential conjunc-
tive logic (which is known and due to Grohe) to a logic in which both
quantifiers are present.

1 Introduction

Model checking, the problem of deciding if a logical sentence holds on a struc-


ture, is a fundamental computational task that appears in many guises through-
out computer science. Witness its appearance in areas such as computational
logic, verification, artificial intelligence, constraint satisfaction, and computa-
tional complexity. The case where one wishes to evaluate a first-order sentence
on a finite structure is a problem of principal interest in database theory and is
the topic of this article. This problem is well-known to be quite intractable in
general: it is PSPACE-complete.
As has been articulated in the literature [7], the typical situation in the
database setting is the posing of a relatively short query to relatively large
database, or in logical parlance, the evaluation of a short formula on a large
relational structure. It has consequently been argued that, in measuring the
time complexity of this task, one could reasonably allow a slow (that is, possibly non-polynomial-time) computable preprocessing of the formula, so long as the desired evaluation can be performed in polynomial time following this preprocessing. Relaxing polynomial-time computation so that an arbitrary dependence in a parameter is tolerated yields, in essence, the notion of fixed-parameter tractability. This notion of tractability is the base of parameterized complexity theory, which provides a taxonomy for reasoning about and classifying problems where each instance has an associated parameter. We follow this paradigm, and focus the discussion on this form of tractability.

⋆ Research supported by the Spanish Project FORMALISM (TIN2007-66523), by the Basque Government Project S-PE12UN050(SAI12/219), and by the University of the Basque Country under grant UFI11/45.
⋆⋆ Research supported by the European Research Council (ERC) grant “PARAMTIGHT: Parameterized complexity and the search for tight complexity results,” reference 280152.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 125–136, 2013.
© Springer-Verlag Berlin Heidelberg 2013
First-order model checking is intractable even if one restricts the connectives
and quantifiers permitted; for instance, model checking of existential conjunctive
queries, by which we mean sentences formed using atoms, conjunction (∧), and
existential quantification (∃), is well-known to be intractable (it is NP-complete).
Thus, a typical way to gain insight into which sentences exhibit tractable behav-
ior is to consider model checking relative to a set Φ of sentences. In the context
of existential conjunctive logic, there is a mature understanding of sentence sets.
It was proved by Grohe [6] that when Φ is a set of existential conjunctive queries
having bounded arity, model checking on Φ is fixed-parameter tractable if there is
a constant k ≥ 1 such that each sentence in Φ is logically equivalent to one whose
treewidth is bounded above by k, and is intractable otherwise (under a standard
assumption from parameterized complexity). The treewidth of a conjunctive sen-
tence (in prenex form) is measured here via the graph on the sentence’s variables
wherein two variables are adjacent if they co-occur in an atom.
An important precursor to Grohe’s theorem was the complexity classification
of graph sets for existential conjunctive logic. Grohe, Schwentick, and Segoufin [7]
defined model checking relative to a graph set G as the problem of deciding, given
a structure and an existential conjunctive query whose graph is in G, whether
or not the query is true on the structure; they showed that the problem is fixed-
parameter tractable when G has bounded treewidth, and intractable otherwise.
In this paper, we restrict our attention to queries of bounded arity (the case of
unbounded arity leads to a different theory, where complexity may depend on
the choice of representation of relations [3,8]). For bounded-arity structures, this
result is coarser than Grohe’s theorem, as it can be taken as a classification of
sentence sets Φ that obey the closure property that if a sentence is in Φ, then
all sentences having the same graph are also in Φ; in contrast, Grohe’s theorem
classifies arbitrary sentence sets.
This graph classification was recently generalized to quantified conjunctive
logic, wherein both quantifiers (∀, ∃) are permitted in addition to conjunction
(∧). Define a prefixed graph to be a quantifier prefix Q1 v1 . . . Qn vn paired with a
graph on the variables {v1 , . . . , vn }; each quantified conjunctive query in prenex
form can naturally be mapped to a prefixed graph, by simply taking the quan-
tifier prefix of the query along with the graph of the quantifier-free, conjunctive
portion of the query. Chen and Dalmau [2] defined a width measure for prefixed
graphs, which generalizes treewidth, and proved that model checking on a set of
prefixed graphs is fixed-parameter tractable if the set has bounded width, and
intractable otherwise. This result generalizes the graph classification by Grohe,
Schwentick, and Segoufin, and provides a unified view of this classification as

well as earlier complexity results [5] on quantified conjunctive logic. Note, how-
ever, that the present result is incomparable to Grohe’s result: Grohe’s result is
on arbitrary sentence sets in a less expressive logic, while the result of Chen and
Dalmau considers sentences in more expressive logic, but considers them from
the coarser graph-based viewpoint, that is, it classifies sentence sets obeying the
(analog of the) described closure property.
In this article, we present a veritable generalization of Grohe’s theorem in
quantified conjunctive logic. In the bounded-arity case, our theorem naturally
unifies together both Grohe’s theorem and the classification of prefixed graphs
in quantified conjunctive logic. The sentences studied by our theorem are of
the following type. Define a block-sorted query to be a quantified conjunctive
sentence in multi-sorted, relational first-order logic where two variables having
the same sort must occur in the same quantifier block. This class of sentences
includes each sentence having a sort for each quantifier block. As an example,
consider the sentence

∃x1 , x2 ∀y1 , y2 , y3 ∃z1 , z2


R(x1 , y1 ) ∧ R(x2 , y3 ) ∧ S(x2 , y2 , y3 , z1 ) ∧ S(x1 , y1 , y2 , z2 ) ∧ T (x1 , x2 , y2 ),

where the variables xi have the same sort e, the variables yi have the same sort
u, and the variables zi have the same sort e′ ; the arities of the relation symbols
R, S, and T are eu, euue′ , and eeu, respectively. The definitions impose that a
structure B on which such a sentence can be evaluated needs to provide a domain
Bs (which is a set) for each sort; quantifying a variable of sort s is performed
over the domain Bs . (See the next section for the precise formalization that is
studied.)
Our main theorem is the classification of block-sorted queries. We show how
to computably derive from each query a second logically equivalent query, and
demonstrate that, for a bounded-arity set of block-sorted queries, model checking
is fixed-parameter tractable if the width of the derived queries is bounded (with
respect to the mentioned width measure [2]), and is intractable otherwise. This
studied class of queries encompasses existential conjunctive queries, which can
be viewed as block-sorted queries in which there is one existential quantifier
block, and all variables have the same sort. Observe that, given any sentence in
quantified conjunctive logic (either one-sorted or multi-sorted) and any structure
on which the sentence is to be evaluated, one can view the sentence as a block-
sorted query. (This is done as follows: for each sort s that appears in more than
one quantifier block, introduce a new sort sb for each block b where it appears;
correspondingly, introduce new relation symbols.) Our theorem can thus be read
as providing a general tractability result which is applicable to all of quantified
conjunctive logic, and a matching intractability result that proves optimality of
this tractability result for the class of block-sorted queries.
Our theorem is the first generalization of Grohe’s theorem to a logic where both
quantifiers are present. The previous work suggests that we should proceed the fol-
lowing way: take the width measure of Chen and Dalmau [2], and apply it to some
analog of the logically equivalent core of Grohe [6]. However, the execution of these

ideas are not at all obvious and we have to overcome a number of technical barri-
ers. For instance, Grohe’s theorem statement (in the formulation given here) makes
reference to logical equivalence. While there is a classical and simple characteriza-
tion of logical equivalence in existential conjunctive logic [1], logical equivalence
for first-order logic is of course well-known to be an undecidable property; logical
equivalence for quantified conjunctive logic is now known (in the one-sorted case)
to be decidable [4], but is perhaps still not well-understood (for instance, its ex-
act complexity is quite open). Despite this situation, we succeed in identifying, for
each block-sorted sentence, a logically equivalent sentence whose width character-
izes the original sentence’s complexity, obtaining a statement parallel to that of
Grohe’s theorem; the definition of this equivalent sentence is a primary contribu-
tion of this article. In carrying out this identification, we present a notion of core
for block-sorted sentences and develop its basic theory; the core of an existential
conjunctive sentence (an established notion) is, intuitively, a minimal equivalent
sentence, and Grohe’s theorem can be stated in terms of the treewidth of the cores
of a sentence set. Another technical contribution of the article is to develop a graph-
theoretic understanding of variable interactions (see Section 4), which understand-
ing is sufficiently strong so as to allow for the delicate embedding of hard sentences
from the previous work [2] into the sentences under consideration, to obtain the in-
tractability result. Overall, we believe that the notions, concepts, and techniques
that we introduce in this article will play a basic role in the investigation of model
checking in logics that are more expressive than the one considered here.

2 Preliminaries
2.1 Terminology and Setup
We will work with the following formalization of multi-sorted relational first-
order logic. A signature is a pair (σ, S) where S is a set of sorts and σ is a set of
relation symbols; each relation symbol R ∈ σ has associated with it an element
of S ∗ , called the arity of R and denoted ar(R). In formulas over signature (σ, S),
each variable v has associated with it a sort s(v) from S; we use atom to refer
to an atomic formula R(v1 , . . . , vk ) where R ∈ σ and s(v1 ) . . . s(vk ) = ar(R). A
structure B on signature (σ, S) consists of an S-sorted family {Bs | s ∈ S} of
sets called the universe of B, and, for each symbol R ∈ σ, an interpretation
RB ⊆ Bar(R) . Here, for a word w = w1 . . . wk ∈ S ∗ , we use Bw to denote the
product Bw1 × · · · × Bwk . We say that two structures are similar if they are
defined on the same signature. Let B and C be two similar structures defined
on the same signature (σ, S). We say that B is a substructure of C if for each
s ∈ S, it holds that Bs ⊆ Cs , and for each R ∈ σ, it holds that RB ⊆ RC . We
say that B is an induced substructure of C if, in addition, for each R ∈ σ one
has that RB = RC ∩ Bar(R) .
A quantified conjunctive query is a sentence built from atoms, conjunction,
existential quantification, and universal quantification. It is well-known that such
sentences can be efficiently translated into prenex normal form, that is, of the
form Q1 v1 . . . Qn vn φ where each Qi is a quantifier and where φ is a conjunction

of atoms. For such a sentence, it is well-known that the conjunction φ can be en-
coded as a structure A where As contains the variables of sort s that appear in φ
and, for each relation symbol R, the relation RA consists of all tuples (v1 , . . . , vk )
such that R(v1 , . . . , vk ) appears in φ. In the other direction, any structure A can
be viewed as encoding the conjunction ∧(v1 ,...,vk )∈RA R(v1 , . . . , vk ). We will typ-
ically denote a quantified conjunctive query Q1 v1 . . . Qn vn φ as a pair (P, A)
consisting of the quantifier prefix P = Q1 v1 . . . Qn vn and a structure A that
encodes the quantifier-free part φ. Note that when discussing the evaluation of
a sentence (P, A) on a structure, we can and often will assume that all variables
appearing in P are elements of A.
We define a block-sorted query to be a quantified conjunctive query in prenex
normal form where for all variables v, v′ , if s(v) = s(v′ ) then v, v′ occur in the
same quantifier block. By a quantifier block, we mean a subsequence Qi vi . . . Qj vj
of the quantifier prefix (with i ≤ j) having maximal length such that Qi = · · · =
Qj . We number the quantifier blocks from left to right (that is, the outermost
quantifier block is considered the first). For each sort s having a variable that
appears in such a query, either all variables of sort s are universal, in which case
we call s a universal sort or a ∀-sort, or all variables of sort s are existential, in
which case we call s an existential sort or an ∃-sort.

2.2 Conventions

In general, when A is a structure with universe {As | s ∈ S}, we assume that


the sets As are pairwise disjoint, and use A to denote ∪s∈S As . Correspondingly,
we assume that in forming formulas over signature (σ, S), the sets of permitted
variables for different sorts are pairwise disjoint. Relative to a quantified con-
junctive query (P, A), we use A∃ to denote the set {a ∈ A | s(a) is an ∃-sort};
likewise, we use A∀ to denote the set {a ∈ A | s(a) is an ∀-sort}. In dealing with
sets such as these, for a variable v we use a subscript < v to restrict to variables
coming before v in the quantifier prefix P ; for instance, we will use A∀,<v to
denote the set of all universally quantified variables that occur before v. When
discussing a function from a set whose elements are sorted to another such set,
we assume tacitly that the function preserves sort, that is, for each sort s, each
element of sort s in the first set is mapped to an element of sort s in the second
set.
Let (P, A) be a block-sorted query and let B be a structure similar to A; we
say that a homomorphism φ : A → B is universal-injective if φ is injective on
A∀ .

2.3 Basic Facts

Intuitively, evaluating the query (P, A) on the structure B can be interpreted as
a game with two players, “universal” and “existential.” In the order given by the
prefix P , the two players assign values to the variables; existential and universal
set the values of the existential and the universal variables, respectively. The

aim of existential is to ensure that the resulting assignment satisfies the formula,
that is, gives a homomorphism from A to B, while universal tries to prevent this.
The query (P, A) is true on B if existential has a winning strategy. We formalize
this intuition by the following definition:
Definition 1. Let (P, A) be a quantified conjunctive query, and let B be a
structure similar to A. An existential strategy for (P, A) on B is a set of
mappings (fx : (A∀,<x → B) → Bs(x) )x∈A∃ such that the following holds:
for any h : A∀ → B, a homomorphism from A to B is given by the map
(f, h) : A → B defined by (f, h)(x) = fx (h ↾ A∀,<x ) for each existential variable
x, and (f, h)(y) = h(y) for each universal variable y.
Proposition 2. Let (P, A) be a quantified conjunctive query, and let B be a
structure similar to A. Then B |= (P, A) if and only if there is an existential
strategy.
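This game semantics suggests a direct (exponential-time) evaluation procedure: branch existentially over the domain at each ∃-variable and universally at each ∀-variable. The Python sketch below is our own illustration of this idea (the names holds, domains, and rels are hypothetical), run on a query of the same shape as the example from the introduction.

```python
def holds(prefix, atoms, domains, rels, env=None):
    # prefix: list of (quantifier, variable, sort) with quantifier 'A' or 'E';
    # atoms: list of (relation name, tuple of variables);
    # domains: sort -> list of values; rels: relation name -> set of tuples.
    env = dict(env or {})
    if not prefix:
        # quantifier-free part: the assignment must satisfy every atom
        return all(tuple(env[x] for x in xs) in rels[R] for R, xs in atoms)
    (q, var, sort), rest = prefix[0], prefix[1:]
    branch = any if q == "E" else all
    return branch(holds(rest, atoms, domains, rels, {**env, var: b})
                  for b in domains[sort])

# the query  forall y1, y2  exists x :  R1(x, y1) and R2(x, y2)
prefix = [("A", "y1", "u"), ("A", "y2", "u"), ("E", "x", "e")]
atoms = [("R1", ("x", "y1")), ("R2", ("x", "y2"))]
domains = {"u": [0, 1], "e": ["a", "b"]}
rels = {"R1": {("a", 0), ("b", 1)},
        "R2": {("a", 0), ("a", 1), ("b", 0), ("b", 1)}}
assert holds(prefix, atoms, domains, rels)
```

Winning strategies for the existential player correspond exactly to the successful choices made inside the `any` branches.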
The transitivity of homomorphisms allows us to quickly deduce consequences of
the existence of a homomorphism A → B. For example, we know that there is
also a homomorphism A′ → B whenever there is a homomorphism A′ → A;
and there is also a homomorphism A → B′ whenever there is a homomorphism
B → B′ . These quick observations are very useful in the study of the homomor-
phism problem, where they allow us to restrict our attention to specific type of
structures. In our setting, however, the quantified nature of the problem makes
such consequences less obvious. In the following, we find analogs of these ob-
servations in our setting, that is, assuming that B |= (PA , A) holds, we explore
under what conditions the structure B or the query (PA , A) can be replaced to
obtain another true statement.
First we give a sufficient condition under which the query can be replaced.
Let us say that two similar block-sorted queries (PA , A) and (PC , C) having the
same number of quantifier blocks are mutually respecting if for each sort s and
for each i ≥ 1, it holds that s is used in the ith quantifier block of PA if and
only if it is used in the ith quantifier block of PC .
Proposition 3. Let (PA , A) and (PC , C) be similar block-sorted queries that
are mutually respecting. Suppose that i : A → C is a universal-injective homo-
morphism. Then it holds that (PC , C) entails (PA , A).
The following proposition gives a sufficient condition for replacing the structure
B on which the query is evaluated:
Proposition 4. Let σ be a signature, let (P, A) be a block-sorted query over
σ, and let B, B′ be structures over σ. Suppose that B |= (P, A) and that there
exists a homomorphism g : B → B′ that is universal-surjective in the sense that
g(B∀ ) = B′∀ . Then, it holds that B′ |= (P, A).
Note that this proposition can be viewed as a variant of the known fact that, in
standard (one-sorted) first-order logic, if a quantified conjunctive query Φ holds
on a structure B and B admits a surjective homomorphism to B′ , then Φ also
holds on B′ (see for example [4, Lemma 1]).

3 The Selfish Core


Let (P, C) be a block-sorted query on signature (σ, S). When A is similar to C,
we say that A is an ∃-substructure of C if A is a substructure of C; for each
∀-sort u it holds that Au = Cu ; and, for each ∃-sort e it holds that Ae ⊆ Ce . We
say that A is a proper ∃-substructure of C if, in addition, there exists an ∃-sort
e such that the containment Ae ⊆ Ce is proper.
We say that a block-sorted query (P, C) is selfish if C |= (P, C). We say that
a block-sorted query (P, C) is a selfish core if it is selfish and for any proper
∃-substructure A of C, either (P, A) is not selfish or the queries (P, A) and
(P, C) are not logically equivalent.
We give characterizations of the notion of selfish core in the following propo-
sition; afterwards, we show that each block-sorted query has (in a sense made
precise) a selfish core. Let us say that an endomorphism h : C → C of a struc-
ture C is proper if its image is proper, that is, if there exists a sort s ∈ S such
that h(Cs ) ⊊ Cs .

Proposition 5. Let (P, C) be a selfish block-sorted query. The following are


equivalent.
1. (P, C) is a selfish core.
2. There does not exist a proper endomorphism of C that fixes each universal
variable.
3. There does not exist a proper endomorphism of C that, for each universal
sort u, is injective on Cu .

Define a selfish core of a block-sorted query (P, A) to be a block-sorted query


that is a selfish core and that is logically equivalent to (P, A). We now show that
each block-sorted query has a selfish core which is computable (from the query).

Definition 6. Let (P, A) be a block-sorted query; we define the block-sorted


query (P ∗ , A∗ ) the following way.
– For each ∀-sort u, define A∗u = Au .
– For each ∃-sort e, define A∗e = {xg | x ∈ Ae , g : A∀,<x → A∀,<x }.
– P ∗ is obtained from P by replacing each quantification ∃x with ∃xg1 . . . ∃xgm
where g1 , . . . , gm is a list of all the mappings from A∀,<x to A∀,<x .

– RA∗ = {(g′ (a1 ), . . . , g′ (ak )) | (a1 , . . . , ak ) ∈ RA , g : A∀ → A∀ },
where g′ is the extension of g that maps a value x ∈ A∃ to xg|A∀,<x .

Example 7. Consider the query (P, A) = ∀y1 , y2 ∃x : R1 (x, y1 ) ∧ R2 (x, y2 ). For


i, j ∈ {1, 2}, let gij be the mapping defined by gij (y1 ) = yi and gij (y2 ) = yj .
Then the query (P ∗ , A∗ ) can be defined as

∀y1 , y2 ∃xg11 , xg12 , xg21 , xg22 :


[R1 (xg11 , y1 ) ∧ R2 (xg11 , y1 ) ∧ R1 (xg12 , y1 ) ∧ R2 (xg12 , y2 )
∧ R1 (xg21 , y2 ) ∧ R2 (xg21 , y1 ) ∧ R1 (xg22 , y2 ) ∧ R2 (xg22 , y2 )].
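The relations of A∗ in Example 7 can be enumerated mechanically. The sketch below is our illustration (the helper name lift is hypothetical): it applies the extension g′ of every map g : {y1, y2} → {y1, y2} to the tuples of A, writing x11, . . . , x22 for the copies xg11 , . . . , xg22 .

```python
from itertools import product

universal = ("y1", "y2")
R1 = [("x", "y1")]
R2 = [("x", "y2")]

def lift(R):
    # all tuples (g'(a1), ..., g'(ak)) where g ranges over maps on {y1, y2};
    # g' sends the existential variable x to the copy x_g and acts as g on
    # the universal variables.
    out = set()
    for img in product(universal, repeat=len(universal)):
        g = dict(zip(universal, img))
        copy = "x" + "".join(w[-1] for w in img)  # e.g. "x12" for g12
        for t in R:
            out.add(tuple(copy if a == "x" else g[a] for a in t))
    return out

# matches the atoms displayed in Example 7
assert lift(R1) == {("x11", "y1"), ("x12", "y1"), ("x21", "y2"), ("x22", "y2")}
assert lift(R2) == {("x11", "y1"), ("x12", "y2"), ("x21", "y1"), ("x22", "y2")}
```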

Proposition 8. Let (P, A) be a block-sorted query. The following statements


concerning (P ∗ , A∗ ) hold.
1. (P ∗ , A∗ ) and (P, A) are logically equivalent.
2. (P ∗ , A∗ ) is selfish.
3. The structure A∗ contains an induced substructure C such that (P ∗ , C) is a
selfish core of (P, A); moreover, (P ∗ , C) is computable from (P, A).

4 Strong and Weak Elements


Throughout this section, we assume that (P, A) is a block-sorted query; the
definitions and claims are all relative to this query. We use GA to denote the
Gaifman graph of the structure A, that is, the graph with vertex set A and
containing an edge {a, a′ } if and only if a and a′ are distinct and co-occur in a
tuple of a relation of A. Relative to (P, A), when i is the number of a quantifier
block, we will use notation such as A≥i to denote the set of variables occurring
in block i or later, and define for example A<i analogously.
Definition 9. A level i component is a maximal connected set of ∃-variables in
GA [A≥i ].
Definition 10. Let x ∈ A∃ be an ∃-variable in the ith quantifier block.
– For j ≤ i, use CA (x, j) to denote the level j component containing x.
– Define NA (x, j), the neighborhood of CA (x, j), to be the set of all universal
variables in A<j adjacent
 to CA (x, j) in GA .
– Define UA (x) to be ⋃j≤i NA (x, j).
In other words, a universal variable y on level j is in UA (x) if and only if y can
be reached from x on a path in GA such that all the vertices of the path other
than y are existential variables on levels greater than j. We remark that the
definition of CA (x, j), as well as that of the other sets, depends on A as well as
the quantifier prefix P ; however, this prefix will be clear from context, and we
omit it from the notation.
Definition 11. We say that an ∃-variable xg ∈ A∗∃ is degenerate if g is non-
injective on UA (x).
Definition 12. An ∃-variable x ∈ A∃ is weak if there exists a
universal-injective homomorphism ψ : A → A∗ where ψ(x) is a degenerate
element of A∗ ; the ∃-variable x is strong otherwise.
Example 13. Consider the following query (P, A):

∀y1 , y2 , y3 ∃x1 , x2 , x3 , x4 , x5
R1 (x1 , y1 ) ∧ R2 (x2 , y2 ) ∧ R3 (x1 , x2 ) ∧ R1 (x3 , y3 ) ∧ R1 (x5 , y3 )
∧ R2 (x4 , y3 ) ∧ R3 (x3 , x4 ) ∧ R3 (x5 , x4 ).
If g is the mapping with g(y1 ) = g(y2 ) = g(y3 ) = y3 , then there is a homomor-
phism ψ from A to A∗ that is identity on y1 , y2 , y3 , x1 , x2 and ψ(x3 ) = ψ(x5 ) =
xg1 and ψ(x4 ) = xg2 . Hence x3 , x4 , x5 are weak elements.
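The path characterization of UA (x) stated after Definition 10 can be checked on this example. The sketch below (our own encoding of the structure; with only two quantifier blocks, every existential level exceeds every universal level, so the level test on inner path vertices is trivially satisfied) computes UA (x) by a search through existential vertices of the Gaifman graph. It confirms that UA (x1 ) = {y1 , y2 }, on which the mapping g above is non-injective (this is why the element x1 is sent to a degenerate copy), whereas UA (x3 ) = {y3 }.

```python
from collections import deque

# Gaifman graph of the structure A of Example 13; y1..y3 are universal
# (block 1), x1..x5 existential (block 2).
existential = {"x1", "x2", "x3", "x4", "x5"}
edges = [("x1", "y1"), ("x2", "y2"), ("x1", "x2"),
         ("x3", "y3"), ("x5", "y3"), ("x4", "y3"),
         ("x3", "x4"), ("x5", "x4")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def U(x):
    """y in U(x) iff y is reachable from x along a path all of whose
    vertices, except y itself, are existential variables (with two
    blocks, every existential level is greater than every universal
    level, so no further level test is needed here)."""
    reached, seen, queue = set(), {x}, deque([x])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w in existential:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
            else:
                reached.add(w)          # a universal neighbour of the component
    return reached

assert U("x1") == {"y1", "y2"}   # g(y1) = g(y2) = y3 is non-injective here
assert U("x3") == U("x4") == U("x5") == {"y3"}
```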
Block-Sorted Quantified Conjunctive Queries 133

Definition 14. Define the strong substructure of A to be the substructure of A induced by the union of A∀ with the strong elements of A.

The main result of this section shows that removing the weak elements does not change the meaning of the sentence. In the proof of the classification theorem, this will allow us to use the width of the strong substructure as the classification criterion.
Theorem 15. Let S be the strong substructure of A. The queries (P, S) and
(P, A) are logically equivalent.
We conclude this section with a simple lemma that will be of help in estab-
lishing the complexity hardness result.

Lemma 16. Suppose that φ is a universal-injective endomorphism of A and that x ∈ A∃ is a strong variable. Then φ(x) is strong as well.

Proof. Assume for contradiction that φ(x) is not strong: there is a universal-
injective homomorphism ψ : A → A∗ where ψ(φ(x)) is a degenerate element.
Now ψ(φ) is a universal-injective homomorphism A → A∗ that maps x to a
degenerate element of A∗ , contradicting the assumption that x is strong. ⊓⊔

5 Classification Theorem

When Φ is a set of (possibly multi-sorted) first-order sentences, define Φ-MC to be the model checking problem of deciding, given a sentence φ ∈ Φ and a finite
structure B over the same signature, whether or not B |= φ. We study this prob-
lem using parameterized complexity; we use the terminology and conventions for
parameterized complexity defined in [2], and take φ to be the parameter of an
instance (φ, B).
As defined in [2], a prefixed graph consists of a quantifier prefix P paired with
an undirected graph whose vertices are the variables appearing in P . In [2], a
width measure is defined that associates a natural number with each prefixed
graph; we refer the reader to that article for the precise definition. As we use
both the algorithmic and hardness results of [2] as black box, the exact definition
does not matter for the purposes of this paper. Fix a computable mapping M
that, given a block-sorted query φ, computes a selfish core (P, A) of φ, and then
computes the strong substructure S of (P, A), and outputs (P, S).

Theorem 17. Let Φ be a set of block-sorted queries of bounded arity. If the set of prefixed graphs {(P, GS ) | (P, S) ∈ M (Φ)} has bounded width, then the
problem Φ-MC is in FPT; otherwise, the problem Φ-MC is not in FPT, unless
W[1] ⊆ nuFPT.

The remainder of this section is devoted to the proof of Theorem 17.


The positive FPT result is obtained as follows. Given an instance (φ, B) of
the problem Φ-MC, the algorithm is to evaluate B |= M (φ) using the algorithm
of [2]; this evaluation can be performed in polynomial time given M (φ), and
since the computation of M (φ) depends only on the parameter of the instance
(φ, B), the whole computation is in FPT.
We now give the hardness result. For a block-sorted query (P, S) over signature
(σ, S), we define the relativization (P, S)rel of (P, S) in the following way. Denote
P by Q1 v1 . . . Qn vn , and let θ be the conjunction of atoms corresponding to S.
Define (P, S)rel to be the one-sorted sentence Q1 v1 ∈ Wv1 . . . Qn vn ∈ Wvn θ over
signature σ ∪ {Wv1 , . . . , Wvn } where each Wvi is a fresh unary relation symbol
and the arity of a symbol R ∈ σ is the length of ar(σ,S) (R). Here, ∃v ∈ W ψ is
syntactic shorthand for ∃v(W (v) ∧ ψ); and, ∀v ∈ W ψ is syntactic shorthand for
∀v(W (v) → ψ). Assuming that the set of prefixed graphs given in the theorem
statement has unbounded width, the hardness result of [2, Section 6] implies that
Φrel = {(M (φ))rel | φ ∈ Φ} is W[1]-hard or coW[1]-hard under nuFPT reductions.
It thus suffices to give an nuFPT reduction from Φrel -MC to Φ-MC, which we
now do. Let ((P, S)rel , B) be an instance of Φrel -MC, and let φ ∈ Φ be such that
(P, S) = M (φ); let (P, A) denote the selfish core of φ computed by M (φ) (note
that S is a substructure of A).
We will work with the structure A∗ . Let Aid denote the subuniverse of A∗
containing all universal variables of A∗ and each existential variable of A∗ of
the form aid , where id is the identity mapping. (We use id generically to denote
the identity mapping, but note that this is defined on A∀,<a for an existential
variable a.) Observe that Aid induces in A∗ a copy of the structure A. With
this correspondence, let S id denote the union of A∀ and the strong elements of
Aid , and let Sid denote the induced substructure of A∗ on S id . Let D denote the
subuniverse of A∗ containing all degenerate elements of A∗ . We will sometimes
drop the id superscript when it is clear from context.
Define a structure B′ over signature (σ, S) as follows. The universe is denoted by {B′s | s ∈ S} and is defined by B′s = {(a, b) ∈ (Aid s ∪ Ds ) × (B ∪ {⊥}) | (a ∈ Ssid → b ∈ UaB ) and (a ∉ Ssid → b = ⊥)}. Here, B denotes the universe of the one-sorted structure B. Now, for each R ∈ σ, define RB′ to be the relation {((a1 , b1 ), . . . , (ak , bk )) ∈ B′ar(R) | (a1 , . . . , ak ) ∈ RAid and ((a1 , . . . , ak ) ∈ RSid → (b1 , . . . , bk ) ∈ RB )}. We will use πi to denote the mapping that projects a tuple onto the ith coordinate.
We claim that B |= (P, S)rel if and only if B′ |= (P, A).
We first prove the backwards direction. We will use the following lemma.

Lemma 18. Suppose that h : A → (Aid ∪ D) is a homomorphism from A to A∗ that is identity on A∀ . Then h(S) = S id .

Proof. Let h0 be the homomorphism A → A∗ that is identity on universals and maps each existential x to xid . As A is selfish and (P, A) and (P, A∗ ) are logically equivalent by Proposition 8(1), we have that A |= (P, A∗ ), implying that there is a homomorphism h∗ from A∗ to A that is identity on the universals.
We claim that h∗ is injective on Aid . Indeed, otherwise h∗ (h0 ) is noninjective
(as Aid is the image of h0 ), hence it is a proper endomorphism of A that is
identity on the universals. By Proposition 5, this contradicts the assumption
that A is a selfish core.
Next we claim that h∗ maps S id to S. Otherwise, the endomorphism h∗ (h0 ) of A maps a strong element to a weak element, contradicting Lemma 16. Together
with the fact that h∗ is injective on Aid , it follows that h∗ maps Aid \ S id to A \ S.
The homomorphism h cannot map a strong element x ∈ S to D by definition
of strong elements. If h maps a strong element x ∈ S to Aid \ S id , then (as shown
in the previous paragraph) endomorphism h∗ (h) of A maps x to A \ S, that
is, to a weak element. As h∗ (h) is an endomorphism fixing the universals, this
contradicts Lemma 16. Thus we have proved that h maps every strong element
to S id . ⊓⊔
Let (f′x )x∈A∃ be an existential strategy witnessing B′ |= (P, A). Let x ∈ S be an existential variable of (P, S)rel , and let s be the sort of x. For any mapping h : A∀,<x → B, define h′ : A∀,<x → B′ by h′ (y) = (y, h(y)). Observe that under any such mapping h, we have π1 ({f′x′ (h′ ) | x′ ∈ Ss }) = Ss , since we can extend h′ to a mapping h′′ : A∀ → B′ and then the homomorphism (f′ , h′′ ) given by Definition 1 is from A to Aid ∪ D, and Lemma 18 can be applied.

We can thus define a strategy (fx ) for (P, S)rel on B as follows: for an existential variable x ∈ S∃ and a map h : A∀,<x → B, define fx (h) = b if and only if (x, b) is in {f′x′ (h′ ) | x′ ∈ Ss }. This mapping is well-defined by the observation of the previous paragraph, and for any h : A∀ → B obeying h(a) ∈ UaB , we obtain from the definition of B′ that the homomorphism (f, h) given by Definition 1 is from S to B with (f, h)(x) ∈ UxB for each ∃-variable x.
We now prove the forwards direction. We will make use of the following lemma.
Lemma 19. There is an existential strategy (f′x ) for (P, A) on the substructure of A∗ induced by (Aid ∪ D) where f′x (h) is a degenerate element in D whenever h : A∀,<x → A∀,<x is not injective on U (x).
Let (ft ) witness B |= (P, S)rel . We will define a strategy (Fx ) to witness B′ |= (P, A).
For each partial map H : A∀ → B′ and subset Y ⊆ A∀ containing the domain of H, fix e(H, Y ) to be an extension of H defined on Y such that
– if for some universal sort s it holds that H1 |As = H2 |As , then e(H1 , Y )|As =
e(H2 , Y )|As ; and,
– if for some universal sort s the map π1 (H) is injective on As , then π1 (e(H, Y ))
is as well.
It is straightforward to verify that such a mapping e exists; note that when using
this mapping, Y will be of the form A∀,<x for an existential variable x.
We define the strategy (Fx ) as follows. Let x be an existential variable of (P, A), and let H : A∀,<x → B′ be a map. Define H[x] as e(H|U (x), A∀,<x ). Set Fx (H) to be the pair (c1 , c2 ) where
– c1 = f′x (π1 (H[x])), where (f′x ) is the strategy from Lemma 19; and,
– c2 is ⊥ if c1 is not in S, and otherwise is equal to fc1 ((H[x])(A∀,<x )). Note that (H[x])(A∀,<x ) is a set of pairs that should be viewed as a function when passing it to fc1 .
136 H. Chen and D. Marx

Observe that if c1 is not degenerate, then by the just-given lemma, the mapping
π1 (H[x]) is injective on U (x); it follows that π1 (H[x]) is injective on A∀,<x
by the definition of H[x] and the second condition in the definition of e. This
implies that (H[x])(A∀,<x ) is the graph of a mapping defined on A∀,<x , and c2
as described above is well-defined.
By the definition of c1 , we have that (Fx ) has the property that for any
H : A∀ → B′ , it holds that π1 (F, H) is a homomorphism from A to A∗ . It
remains to verify that if (a1 , . . . , ak ) ∈ RA , then the image ((t1 , b1 ), . . . , (tk , bk ))
of (a1 , . . . , ak ) under (F, H) has the property that (t1 , . . . , tk ) ∈ RSid implies
(b1 , . . . , bk ) ∈ RB . For each existential variable x occurring in (a1 , . . . , ak ), ob-
serve that for any universal variable y coming before it in the quantifier prefix,
one has y ∈ U (x) and thus H(y) = (H[x])(y). It thus suffices to show that if x and x′ are existential variables in this tuple where x occurs before x′ , H : A∀,<x → B′ and H′ : A∀,<x′ → B′ are mappings where H′ extends H, then (H′ [x′ ])(A∀,<x′ ) extends (H[x])(A∀,<x ). It suffices to show that H′ [x′ ] and H[x] agree on A∀,<x . It follows by definition of U that U (x)|A∀,<x = U (x′ )|A∀,<x . Thus, for an ∀-sort s occurring before x, we have (H′ |U (x′ ))|As = (H|U (x))|As . By the first condition in the definition of e, it holds that e(H′ |U (x′ ), A∀,<x′ )|As = e(H|U (x), A∀,<x )|As , from which we obtain the desired agreement.

References
1. Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in
relational data bases. In: Proceedings of STOC 1977, pp. 77–90 (1977)
2. Chen, H., Dalmau, V.: Decomposing quantified conjunctive (or disjunctive) formu-
las. In: LICS (2012)
3. Chen, H., Grohe, M.: Constraint satisfaction with succinctly specified relations.
Journal of Computer and System Sciences 76(8), 847–860 (2010)
4. Chen, H., Madelaine, F., Martin, B.: Quantified constraints and containment prob-
lems. In: Twenty-Third Annual IEEE Symposium on Logic in Computer Science,
LICS (2008)
5. Gottlob, G., Greco, G., Scarcello, F.: The complexity of quantified constraint satis-
faction problems under structural restrictions. In: IJCAI 2005 (2005)
6. Grohe, M.: The complexity of homomorphism and constraint satisfaction problems
seen from the other side. Journal of the ACM 54(1) (2007)
7. Grohe, M., Schwentick, T., Segoufin, L.: When is the evaluation of conjunctive
queries tractable? In: STOC 2001 (2001)
8. Marx, D.: Tractable hypergraph properties for constraint satisfaction and conjunc-
tive queries. In: Proceedings of the 42nd ACM Symposium on Theory of Computing,
pp. 735–744 (2010)
From Security Protocols to Pushdown Automata

Rémy Chrétien1,2 , Véronique Cortier1 , and Stéphanie Delaune2


1 LORIA, CNRS, France
2 LSV, ENS Cachan & CNRS & INRIA Saclay Île-de-France

Abstract. Formal methods have been very successful in analyzing security pro-
tocols for reachability properties such as secrecy or authentication. In contrast,
there are very few results for equivalence-based properties, crucial for studying
e.g. privacy-like properties such as anonymity or vote secrecy.
We study the problem of checking equivalence of security protocols for an
unbounded number of sessions. Since replication leads very quickly to unde-
cidability (even in the simple case of secrecy), we focus on a limited fragment
of protocols (standard primitives but pairs, one variable per protocol’s rules)
for which the secrecy preservation problem is known to be decidable. Surpris-
ingly, this fragment turns out to be undecidable for equivalence. Then, restrict-
ing our attention to deterministic protocols, we propose the first decidability
result for checking equivalence of protocols for an unbounded number of ses-
sions. This result is obtained through a characterization of equivalence of pro-
tocols in terms of equality of languages of (generalized, real-time) deterministic
pushdown automata.

1 Introduction
Formal methods have been successfully applied for rigorously analyzing security pro-
tocols. In particular, many algorithms and tools (see [13,4,9,2,11] to cite a few) have
been designed to automatically find flaws in protocols or prove security. Most of these
results focus on reachability properties such as authentication or secrecy: for any execu-
tion of the protocol, an attacker should never learn a secret (secrecy property) or make
Alice think she’s talking to Bob while Bob did not engage in a conversation with her (authentication property). However, privacy properties such as vote secrecy, anonymity, or
untraceability cannot be expressed as such. They are instead defined as indistinguisha-
bility properties in [1,6]. For example, Alice’s identity remains private if an attacker
cannot distinguish a session where Alice is talking from a session where Bob is talking.
Studying indistinguishability properties for security protocols amounts to checking a behavioral equivalence between processes. Processes represent protocols and are
specified in some process algebras such as CSP or the pi-calculus, except that mes-
sages are no longer atomic actions but terms, in order to faithfully represent crypto-
graphic messages. Of course, considering terms instead of atomic actions considerably

Full version available at http://hal.inria.fr/hal-00817230. The research lead-
ing to these results has received funding from the European Research Council under the Eu-
ropean Union’s Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement n◦
258865, project ProSecure, and the ANR project JCJC VIP no 11 JS02 006 01.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 137–149, 2013.
© Springer-Verlag Berlin Heidelberg 2013
increases the difficulty of checking equivalence. As a matter of fact, there are just a few
results for checking equivalence of processes that manipulate terms.

– Based on a procedure developed by M. Baudet [3], it has been shown that trace
equivalence is decidable for deterministic processes with no else branches, and
for a family of equational theories that captures most standard primitives [10]. A
simplified proof of [3] has been proposed by Y. Chevalier and M. Rusinowitch [8].
– A. Tiu and J. Dawson [17] have designed and implemented a procedure for open
bisimulation, a notion of equivalence stronger than the standard notion of trace
equivalence. This procedure only works for a limited class of processes.
– V. Cheval et al. [7] have proposed and implemented a procedure for trace equiva-
lence, and for a quite general class of processes. They consider non deterministic
processes that use standard primitives, and that may involve else branches.

However, these decidability results analyse equivalence for a bounded number of ses-
sions only, that is assuming that protocols are executed a limited number of times. This
is of course a strong limitation. Even if no flaw is found when a protocol is executed n
times, there is absolutely no guarantee that the protocol remains secure when it is exe-
cuted n+ 1 times. And actually, the existing tools for a bounded number of sessions can
only analyse protocols for a very limited number of sessions, typically 2 or 3. Another
approach consists in implementing a procedure that is not guaranteed to terminate. This
is in particular the case of ProVerif [4], a well-established tool for checking security of
protocols. ProVerif is able to check equivalence although it does not always succeed [5].
Of course, ProVerif does not correspond to any decidability result.

Our Contribution. We study the decidability of equivalence of security protocols for an unbounded number of sessions. Even in the case of reachability properties such as secrecy, the problem is undecidable in general. We therefore focus on a class of protocols
for which secrecy is decidable [9]. This class typically assumes that each protocol rule
manipulates at most one variable. Surprisingly, even a fragment of this class (with only
symmetric encryption) turns out to be undecidable for equivalence properties. We con-
sequently further assume our protocols to be deterministic (that is, given an input, there
is at most one possible output). We show that equivalence is decidable for an unbounded
number of sessions and for protocols with standard primitives but pairs. Interestingly,
we show that checking for equivalence of protocols actually amounts to checking
equality of languages of deterministic pushdown automata. The decidability of equality
of languages of deterministic pushdown automata is a difficult problem, shown to be decidable at ICALP in 1997 [14]. We actually characterize equivalence of protocols in
terms of equivalence of deterministic generalized real-time pushdown automata, that is
deterministic pushdown automata with no epsilon-transition but such that the automata
may unstack several symbols at a time. More precisely, we show how to associate to a process P an automaton AP such that two processes are equivalent if, and only if, their corresponding automata yield the same language and, reciprocally, we show how to associate to an automaton A a process PA such that two automata yield the same language
if, and only if, their corresponding processes are equivalent, that is:
P ≈ Q ⇔ L(AP ) = L(AQ ), and L(A) = L(B) ⇔ PA ≈ PB .
Therefore, checking for equivalence of protocols is as difficult as checking equivalence of deterministic generalized real-time pushdown automata.

2 Model for Security Protocols


Security protocols are modeled through a process algebra that manipulates terms.

2.1 Syntax
Term algebra. As usual, messages are represented by terms. More specifically, we con-
sider a sorted signature with six sorts rand, key, msg, SimKey, PrivKey and PubKey
that represent respectively random numbers, keys, messages, symmetric keys, private
keys and public keys. We assume that msg subsumes the five other sorts, key subsumes
SimKey, PrivKey and PubKey. We consider six function symbols senc and sdec, aenc
and adec, sign and check that represent symmetric, asymmetric encryption and decryp-
tion as well as signatures. Since we are interested in the analysis of indistinguishability
properties, we consider randomized primitives:
senc : msg × SimKey × rand → msg sdec : msg × SimKey → msg
aenc : msg × PubKey × rand → msg adec : msg × PrivKey → msg
sign : msg × PrivKey × rand → msg check : msg × PubKey → msg
We further assume an infinite set Σ0 of constant symbols of sort key or msg, an infinite
set Ch of constant symbols of sort channel, two infinite sets of variables X , W, and
an infinite set N = Npub ⊎ Nprv of names of sort rand: Npub represents the random
numbers drawn by the attacker while Nprv represents the random numbers drawn by the
protocol’s participants. As usual, terms are defined as names, variables, and function
symbols applied to other terms. We denote by T (F , N , X ) the set of terms built on
function symbols in F , names in N , and variables in X . We simply write T (F , N )
when X = ∅. We consider three particular signatures:
Σpub = {senc, sdec, aenc, adec, sign, check, start}
Σ + = Σpub ∪ Σ0 Σ = {senc, aenc, sign, start} ∪ Σ0
where start ∉ Σ0 is a constant symbol of sort msg. Σpub represents the functions/data
available to the attacker, Σ + is the most general signature, while Σ models actual
messages (with no failed computation). We add a bijection between elements of sort
PrivKey and PubKey. If k is a constant of sort PrivKey, k−1 will denotes its image
by this function, called inverse. We will write the inverse function the same, so that
(k−1 )−1 = k. To keep homogeneous notations, we will extend this function to sym-
metric keys: if k is of sort SimKey, then k−1 = k. The relation between encryption and
decryption is represented through the following rewriting rules, yielding a convergent
rewrite system:
sdec(senc(x, y, z), y) → x adec(aenc(x, y, z), y −1 ) → x
check(sign(x, y, z), y −1 ) → x

These rules model the fact that the decryption of a ciphertext will return the associated plaintext when the right key is used to perform decryption. We denote by t↓ the normal
form of a term t ∈ T (Σ + , N , X ).
Example 1. The term m = senc(s, k, r) represents an encryption of the constant s with the key k using the random r ∈ N , whereas t = sdec(m, k) models the application of the decryption algorithm on m using k. We have that t↓ = s.
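Since the rewrite system is convergent, normal forms can be computed by innermost rewriting. The Python sketch below is a minimal illustration (terms are nested tuples, atoms are strings, and the pairing k ↔ "k^-1" for key inverses is a toy naming convention of this sketch, not the paper's); it reproduces the computation t↓ = s of Example 1.

```python
def inv(k):
    """Toy key inverse: "k" <-> "k^-1" (a naming convention of this sketch)."""
    return k[:-3] if k.endswith("^-1") else k + "^-1"

def normalize(t):
    """Normal form t↓ for the convergent rewrite system
         sdec(senc(x, y, z), y)     -> x
         adec(aenc(x, y, z), y^-1)  -> x
         check(sign(x, y, z), y^-1) -> x
    Terms are nested tuples ("senc", x, y, z), ...; atoms are strings."""
    if not isinstance(t, tuple):
        return t
    t = (t[0],) + tuple(normalize(a) for a in t[1:])   # innermost first
    destructor_of = {"sdec": "senc", "adec": "aenc", "check": "sign"}
    if t[0] in destructor_of and isinstance(t[1], tuple) \
            and t[1][0] == destructor_of[t[0]]:
        plaintext, key = t[1][1], t[1][2]
        needed = key if t[0] == "sdec" else inv(key)
        if t[2] == needed:
            return plaintext
    return t

# Example 1: m = senc(s, k, r), t = sdec(m, k), and t↓ = s.
m = ("senc", "s", "k", "r")
assert normalize(("sdec", m, "k")) == "s"
# Decryption with the wrong key does not reduce: the failed computation
# stays visible as a term outside T(Σ, N).
assert normalize(("sdec", m, "k2")) == ("sdec", m, "k2")
# Signature verification: check(sign(x, y, z), y^-1) -> x.
assert normalize(("check", ("sign", "v", "k", "r"), "k^-1")) == "v"
```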
An attacker may build his own messages by applying functions to terms he already
knows. Formally, a computation done by the attacker is modeled by a recipe. i.e. a term
in T (Σpub , Npub , W). The variables in W intuitively refer to variables used to store
messages learnt by the attacker.

Process algebra. The intended behavior of a protocol can be modeled by a process defined by the following grammar where u ∈ T (Σ, N , X ), n ∈ N , and c ∈ Ch:
P, Q := 0 | in(c, u).P | out(c, u).P | (P | Q) | !P | new n.P
The process “in(c, u).P ” expects a message m of the form u on channel c and then
behaves like P θ where θ is a substitution such that m = uθ. The process “out(c, u).P ”
emits u on channel c, and then behaves like P . The variables that occur in u will be
instantiated when the evaluation will take place. The process P | Q runs P and Q
in parallel. The process !P executes P some arbitrary number of times. The process
new n.P invents a new name n and continues as P .
Sometimes, we will omit the null process. We write fv (P ) for the set of free variables
that occur in P , i.e. the set of variables that are not in the scope of an input. A protocol
is a ground process, i.e. a process P such that fv (P ) = ∅.
Example 2. For the sake of illustration, we consider a naive protocol, where A sends a
value v (e.g. a vote) to B, encrypted by a short-term key exchanged through a server.
1. A → S : senc(kAB , kAS , rA )
2. S → B : senc(kAB , kBS , rS )
3. A → B : senc(v, kAB , r)
The agent A sends a symmetric key kAB encrypted with the key kAS (using a fresh
random number rA ). The server answers to this request by decrypting this message and
encrypting it with kBS . The agent A can now send his vote v encrypted with kAB .
The role of A is modeled by a process PA (v) while the role of S is modeled by PS .
The role of B (which does not output anything) is omitted for concision.
PA (v) def= ! in(cA , start).new rA .out(cA , senc(kAB , kAS , rA )) (1)
| ! in(c′A , start).new r.out(c′A , senc(v, kAB , r)) (2)
PS def= ! in(cS , senc(x, kAS , z)).new rS .out(cS , senc(x, kBS , rS )) (3)
| ! in(c′S , senc(x, kAS , z)).new rS .out(c′S , senc(x, kCS , rS )) (4)
where cA , c′A , cS , c′S are constants of sort channel, kAB , kAS , kBS , and kCS are (private) constants in Σ0 of sort SimKey, whereas rA , rS , r are names of sort rand, and x
(resp. z) is a variable of sort msg (resp. rand).
Intuitively, PA (v) sends kAB encrypted by kAS to the server (branch 1), and then
her vote encrypted by kAB (branch 2). The process PS models the server, answering
both requests from A to B (branch 3), as well as requests from A to C (branch 4).
More generally the server answers requests from any agent to any agent but only two
cases are considered here, again for concision. The whole protocol is given by P (v),
where PA (v) and PS evolve in parallel and additionally, the secret key kCS is sent in
clear, to model the fact that the attacker may learn keys of some corrupted agents:
P (v) def= PA (v) | PS | ! in(c, start).out(c, kCS )

2.2 Semantics

A configuration of a protocol is a pair (P; σ) where:

– P is a multiset of processes. We often write P ∪ P, or P | P, instead of {P } ∪ P.


– σ = {w1 ▹ m1 , . . . , wn ▹ mn } is a frame, i.e. a substitution where w1 , . . . , wn are
variables in W, and m1 , . . . , mn are terms in T (Σ, N ). Those terms represent the
messages that are known by the attacker.
The operational semantics of protocols is defined by the relation −α→ over configurations. For sake of simplicity, we often write P instead of (P ; ∅).
(in(c, u).P ∪ P; σ) −−in(c,R)−−→ (P θ ∪ P; σ)
where R is a recipe such that Rσ↓ ∈ T (Σ, N ) and Rσ↓ = uθ for some θ
(out(c, u).P ∪ P; σ) −−out(c,wi+1 )−−→ (P ∪ P; σ ∪ {wi+1 ▹ u})
where i is the number of elements in σ
(!P ∪ P; σ) −τ→ (P ∪ !P ∪ P; σ)
(new n.P ∪ P; σ) −τ→ (P {n′ /n} ∪ P; σ) where n′ is a fresh name in Nprv
A process may input any term that an attacker can build (rule IN). The process out(c, u).P outputs u (which is stored in the attacker’s knowledge) and then behaves like P . The two remaining rules are unobservable (τ action) from the point of view of the attacker. The relation −w→ between configurations (where w is a sequence of actions) is defined in the usual way. Given a sequence of observable actions w, we write K ==w=⇒ K′ when there exists w′ such that K −w′−→ K′ and w is obtained from w′ by erasing all occurrences of τ . For every configuration K, we define its set of traces as follows:
follows:
trace(K) = {(tr, σ) | K ==tr=⇒ (P; σ) for some configuration (P; σ)}.

Example 3. Going back to the protocol introduced in Example 2, consider the following
scenario: (i) the corrupted agent C discloses his secret key kCS ; (ii) the agent A initiates
a session with B, and for this she sends a request to the server S; (iii) the attacker
intercepts this message and sends it to S as a request coming from A to establish a key
with C. Instead of answering to this request with senc(kAB , kBS , rS ), the server sends
senc(kAB , kCS , rS ), and the attacker will learn kAB . More formally, we have that:
def in(c,start).out(c,w1 ).in(cA ,start).out(cA ,w2 ).in(c ,w2 ).out(c ,w3 ).
K0 = (P (v); ∅) ==============================
S S
==========⇒ (P (v); σ)
where σ = {w1 kCS , w2 senc(kAB , kAS , rA ), w3 senc(kAB , kCS , rS )}, and rA , rS
are (fresh) names in Nprv . In this execution trace, first the key kCS is sent after having
called the corresponding process. Then, branches (1) and (4) of P (v) are triggered.
2.3 Trace Equivalence


Intuitively, two processes are equivalent if they cannot be distinguished by any attacker.
Trace equivalence can be used to formalise many interesting security properties, in
particular privacy-type properties, such as those studied for instance in [1,6]. We first
introduce a notion of intruder’s knowledge well-suited to cryptographic primitives for
which the success of decrypting or checking a signature is visible.
Definition 1. Two frames σ1 and σ2 are statically equivalent, σ1 ∼ σ2 , when we have
that dom(σ1 ) = dom(σ2 ), and:
– for any recipe R, Rσ1 ↓ ∈ T (Σ, N ) if, and only if, Rσ2 ↓ ∈ T (Σ, N ); and
– for all recipes R1 and R2 such that R1 σ1 ↓, R2 σ1 ↓ ∈ T (Σ, N ), we have that
R1 σ1 ↓ = R2 σ1 ↓ if, and only if, R1 σ2 ↓ = R2 σ2 ↓.
Intuitively, two frames are equivalent if an attacker cannot see the difference between
the two situations they represent: if some computation fails in σ1 it should fail in σ2 as
well, and σ1 and σ2 should satisfy the same equalities.
Example 4. Assume some agent publishes her vote encrypted. The possible values for
the votes are typically public. Therefore the question is not whether an attacker may
know the value of the vote (that he knows anyway) but instead, whether he may distinguish between two executions where A votes differently. Consider the two frames:
σi def= {w4 ▹ v0 , w5 ▹ v1 , w6 ▹ senc(vi , kAB , r)} with i ∈ {0, 1}
where v0 , v1 ∈ Σ0 , and r ∈ Nprv . We have that σ0 ∼ σ1 . Intuitively, there is no test that
allows the attacker to distinguish the two frames since the key kAB is not available. In
this scenario, the vote vi remains private. Now, consider the frames σ′i = σ ∪ σi with i ∈ {0, 1} and σ as defined in Example 3. We have that σ′0 ≁ σ′1 . Indeed, consider the recipes R1 = sdec(w6 , sdec(w3 , w1 )) and R2 = w4 . We have that R1 σ′0 ↓ = R2 σ′0 ↓ = v0 , whereas R1 σ′1 ↓ = v1 and R2 σ′1 ↓ = v0 . Intuitively, an attacker can learn kAB and then compare the encrypted vote to the values v0 and v1 .
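The distinguishing test of this example can be replayed concretely. The sketch below (our own tuple encoding of terms, symmetric encryption only; the handle names and frame contents follow Examples 3 and 4) evaluates both recipes in both frames: the equality of the two recipes holds when A voted v0 and fails when A voted v1, so the frames are not statically equivalent.

```python
def normalize(t):
    """t↓ for the single symmetric rule sdec(senc(x, y, z), y) -> x;
    terms are nested tuples and atoms are strings."""
    if not isinstance(t, tuple):
        return t
    t = (t[0],) + tuple(normalize(a) for a in t[1:])
    if t[0] == "sdec" and isinstance(t[1], tuple) \
            and t[1][0] == "senc" and t[1][2] == t[2]:
        return t[1][1]
    return t

def evaluate(recipe, frame):
    """Apply a recipe built from handles w1, w2, ... to a frame."""
    if isinstance(recipe, str):
        return frame[recipe]
    return normalize((recipe[0],)
                     + tuple(evaluate(a, frame) for a in recipe[1:]))

def frame(i):
    """The frame of Example 4: sigma of Example 3 extended with the two
    vote values and the encrypted vote of A, who voted v_i."""
    return {"w1": "kCS",
            "w2": ("senc", "kAB", "kAS", "rA"),
            "w3": ("senc", "kAB", "kCS", "rS"),
            "w4": "v0",
            "w5": "v1",
            "w6": ("senc", f"v{i}", "kAB", "r")}

R1 = ("sdec", "w6", ("sdec", "w3", "w1"))  # recover kAB from w3, decrypt the vote
R2 = "w4"

assert evaluate(R1, frame(0)) == evaluate(R2, frame(0)) == "v0"
assert evaluate(R1, frame(1)) == "v1"
assert evaluate(R2, frame(1)) == "v0"   # so the test R1 = R2 fails in this frame
```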
Intuitively, two processes are trace equivalent if, however they behave, the resulting
sequences of messages observed by the attacker are in static equivalence.
Definition 2. Let P and Q be two protocols. We have that P ⊑ Q if for every (tr, σ) ∈ trace(P ), there exists (tr′ , σ′ ) ∈ trace(Q) such that tr = tr′ and σ ∼ σ′ . They are trace equivalent, written P ≈ Q, if P ⊑ Q and Q ⊑ P .
Example 5. Continuing Example 2, our naive protocol is secure if the vote of A remains
private. This is typically expressed by P (v0 ) | Q ≈ P (v1 ) | Q. An attacker should not
distinguish between two instances of the protocol where A votes two different values.
The purpose of Q is to disclose the two values v0 and v1 .
Q ≝ ! in(c0 , start).out(c0 , v0 ) | ! in(c1 , start).out(c1 , v1 )
However, our protocol is insecure. As seen in Example 3, an attacker may learn kAB ,
and therefore distinguish between the two processes described above. Formally, we
have that P (v0 ) | Q ≉ P (v1 ) | Q. This is reflected by the trace tr′ described below:

tr′ ≝ tr.in(c0 , start).out(c0 , w4 ).in(c1 , start).out(c1 , w5 ).in(cA , start).out(cA , w6 ).
From Security Protocols to Pushdown Automata 143

We have that (tr′, σ0′ ) ∈ trace(K0 ) with K0 = (P (v0 ) | Q; ∅) and σ0′ as defined in
Example 4. Since each channel is used by only one branch, there is only one possible
execution of P (v1 ) | Q (up to a bijective renaming of the private names of sort rand)
matching the labels in tr′, and the corresponding execution will allow us to reach the
frame σ1′ as described in Example 4. We have already seen that static equivalence
does not hold, i.e. σ0′ ≁ σ1′.

3 Ping-Pong Protocols
We aim at providing a decidability result for the problem of trace equivalence between
protocols in presence of replication. However, it is well-known that replication leads to
undecidability even for the simple case of reachability properties. Thus, we consider a
class of protocols, called Cpp , for which (in a slightly different setting) reachability has
already been proved decidable [9].

3.1 Class Cpp


We basically consider ping-pong protocols (an output is computed using only the mes-
sage previously received in input), and we assume a kind of determinism. Moreover,
we restrict the terms that are manipulated throughout the protocols: only one unknown
message (modelled by the use of a variable of sort msg) can be received at each step.
We fix a variable x ∈ X of sort msg. An input term u (resp. output term v) is a term
defined by the grammars given below:
u := x | s | f(u, k, z) v := x | s | f(v, k, r)
where s, k ∈ Σ0 ∪ {start}, z ∈ X , f ∈ {senc, aenc, sign} and r ∈ N . Moreover, we
assume that each variable (resp. name) occurs at most once in u (resp. v).

Definition 3. Cpp is the class of protocols of the form:

P = |_{i=1}^{n} |_{j=1}^{p_i} ! in(ci , u_j^i ).new r1 . . . . .new r_{k_j^i} . out(ci , v_j^i )   such that:

1. for all i ∈ {1, . . . , n} and j ∈ {1, . . . , p_i }, k_j^i ∈ N, u_j^i is an input term, and v_j^i is
an output term whose names are included in {r1 , . . . , r_{k_j^i} };
2. for all i ∈ {1, . . . , n} and j1 , j2 ∈ {1, . . . , p_i }, if j1 ≠ j2 then, for any renaming
of variables, u_{j1}^i and u_{j2}^i are not unifiable¹.

Note that the purpose of item 2 is to restrict the class of protocols to those that have
a deterministic behavior (a particular input action can only be accepted by one branch
of the protocol). This is a natural restriction since most protocols are indeed
deterministic: an agent should usually know exactly what to do once he has received a
message. Actually, the main limitations of the class Cpp are stated in item 1: we consider
a restricted signature (e.g. no pair, no hash function), and names can only be used to
produce randomized ciphertexts/signatures.
¹ i.e. there does not exist θ such that u_{j1}^i θ = u_{j2}^i θ.
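Because input terms are linear (each variable occurs at most once), the unifiability check of item 2 needs no substitution bookkeeping and reduces to a structural compatibility test. The sketch below is illustrative, not the paper's formal procedure; terms are encoded as atoms (str), variables ('var', name), or tuples (f, u, k, z), with keys k constants as in the grammar.

```python
def is_var(t):
    return isinstance(t, tuple) and t[0] == 'var'

def unifiable(t1, t2):
    """Unifiability of two linear input terms (grammar u := x | s | f(u, k, z)).
    Since no variable occurs twice, a variable unifies with any term."""
    if is_var(t1) or is_var(t2):
        return True
    if isinstance(t1, str) or isinstance(t2, str):
        return t1 == t2                    # constants must coincide
    f1, u1, k1, _z1 = t1                   # the last argument z is always a variable
    f2, u2, k2, _z2 = t2
    return f1 == f2 and k1 == k2 and unifiable(u1, u2)
```

For instance, senc(x, kAS, z) and senc(start, kAS, z′) are unifiable, while terms under distinct keys are not, so branches guarded by them are deterministic.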
144 R. Chrétien, V. Cortier, and S. Delaune

Example 6. The protocols described in Example 5 are in Cpp . For instance, we can
check that senc(x, kAS , z) is an input term whereas senc(x, kBS , rS ) is an output term.
Moreover, the determinism condition (item 2) is clearly satisfied: each branch of the
protocol P (v0 ) | Q (resp. P (v1 ) | Q) uses a different channel.

Our main contribution is a decision procedure for trace equivalence of processes in Cpp .
Details of the procedure are provided in Section 4.

Theorem 1. Let P and Q be two protocols in Cpp . The problem whether P and Q are
trace equivalent, i.e. P ≈ Q, is decidable.

3.2 Undecidability Results

The class Cpp is somewhat limited but, surprisingly, extending Cpp to non-deterministic
processes immediately yields undecidability of trace equivalence. More precisely, trace
inclusion of processes in Cpp is already undecidable.

Theorem 2. Let P and Q be two protocols in Cpp . The problem whether P is trace
included in Q, i.e. P ⊑ Q, is undecidable.

This result is shown by encoding the Post Correspondence Problem (PCP). Alterna-
tively, it results from the reduction result established in Section 5 and the undecidability
result established in [12]. Undecidability of trace inclusion actually implies undecidability
of trace equivalence as soon as processes are non-deterministic. Indeed, consider
the choice operator + whose (standard) semantics is given by the following rules:
({P + Q} ∪ P; σ) −τ→ (P ∪ P; σ)        ({P + Q} ∪ P; σ) −τ→ (Q ∪ P; σ)

Corollary 1. Let P , Q1 , and Q2 be three protocols in Cpp . The problem whether P is
equivalent to Q1 + Q2 , i.e. P ≈ Q1 + Q2 , is undecidable.

Indeed, consider P and Q1 , for which trace inclusion encodes PCP, and let Q2 = P .
Trivially, P ⊑ Q1 + Q2 . Thus P ≈ Q1 + Q2 if, and only if, Q1 + Q2 ⊑ P , i.e. if, and
only if, Q1 ⊑ P , hence the undecidability result.

4 From Trace Equivalence to Language Equivalence

This section is devoted to a sketch of the proof of Theorem 1. Deciding trace equivalence
is done in two main steps. First, we show how to reduce the trace equivalence problem
between protocols in Cpp , to the problem of deciding trace equivalence (still between
protocols in Cpp ) when the attacker acts as a forwarder.
Then, we encode the problem of deciding trace equivalence for forwarding attackers
into the problem of language equivalence for real-time generalized pushdown determin-
istic automata (GPDA).

4.1 Generalized Pushdown Automata


GPDA differ from deterministic pushdown automata (DPA) as they can unstack several
symbols at a time. We consider real-time GPDA with final-state acceptance.

Definition 4. A real-time GPDA is a 7-tuple A = (Q, Π, Γ, q0 , ω, Qf , δ) where Q is
the finite set of states, q0 ∈ Q is the initial state, Qf ⊆ Q is the set of accepting
states, Π is the finite input-alphabet, Γ is the finite stack-alphabet, ω is the initial stack
symbol, and δ : (Q × Π × Γ0 ) → Q × Γ0 is the partial transition function such that:
– Γ0 is a finite subset of Γ ∗ ; and
– for any (q, a, x) ∈ dom(δ) and any strict suffix y of x, we have that (q, a, y) ∉ dom(δ).

Let q, q′ ∈ Q, w, w′, γ ∈ Γ ∗ , m ∈ Π ∗ , a ∈ Π; we write (qwγ, am) ⊢A (q′ww′, m)
if (q′, w′) = δ(q, a, γ). The relation ⊢∗A is the reflexive and transitive closure of ⊢A .
For every qw, q′w′ in QΓ ∗ and m ∈ Π ∗ , we write qw −m→A q′w′ if, and only if,
(qw, m) ⊢∗A (q′w′, ε). For the sake of clarity, a transition from q to q′ reading a, popping γ
from the stack and pushing w′ will be denoted by q −a;γ/w′→ q′.
Let A be a GPDA. The language recognized by A is defined by:
L(A) = {m ∈ Π ∗ | q0 ω −m→A qf w for some qf ∈ Qf and w ∈ Γ ∗ }.
A real-time GPDA can easily be converted into a DPA by adding new states and ε-
transitions. Thus, the problem of language equivalence for two real-time GPDA A1
and A2 , i.e. deciding whether L(A1 ) = L(A2 ), is decidable [15].
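To make Definition 4 concrete, here is an illustrative simulator for real-time GPDA with final-state acceptance (our own sketch, not from the paper). The toy automaton recognizes {aⁿ c bⁿ | n ≥ 1} and exploits the generalized (multi-symbol) pop: the last b pops the two symbols 'ZA' in one step. The determinism condition of Definition 4 guarantees at most one transition applies to any configuration.

```python
def run_gpda(delta, q0, omega, finals, word):
    """Simulate a real-time GPDA. delta maps (state, letter, gamma) to
    (state, pushed word); the stack is a string whose rightmost symbol is the top."""
    q, stack = q0, omega
    for a in word:
        applicable = [(g, t) for (p, b, g), t in delta.items()
                      if p == q and b == a and stack.endswith(g)]
        if not applicable:
            return False                   # no transition: the run is stuck
        gamma, (q, push) = applicable[0]   # determinism: at most one entry applies
        stack = stack[:len(stack) - len(gamma)] + push
    return q in finals

# A GPDA for {a^n c b^n | n >= 1} over stack alphabet {Z, A}, initial stack 'Z'.
delta = {('q0', 'a', ''):   ('q0', 'A'),   # push one A per a
         ('q0', 'c', ''):   ('q1', ''),    # switch to popping mode
         ('q1', 'b', 'AA'): ('q1', 'A'),   # pop an A while more remain
         ('q1', 'b', 'ZA'): ('q2', 'Z')}   # generalized pop of the last A
```

Note that the two b-transitions respect Definition 4: neither 'AA' nor 'ZA' is a strict suffix of the other, and the single-symbol suffix 'A' is not in the domain.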

4.2 Getting Rid of the Attacker


We define the actions of a forwarder by modifying our semantics. We restrict the recipes
R, R1 , and R2 that are used in the IN rule and in static equivalence (Definition 1) to be
either the public constant start or a variable in W. This leads us to consider a new
relation ⇒fwd between configurations, and a new notion of static equivalence ∼fwd . We
denote by ≈fwd the trace equivalence relation induced by this new semantics.

Example 7. The trace exhibited in Example 3 is still a valid one according to the for-
warder semantics, and the frames σ0 and σ1 described in Example 4 are in equivalence
according to ∼fwd . Actually, we have that P (v0 ) | Q ≈fwd P (v1 ) | Q. Indeed, the fact
that a forwarder simply acts as a relay prevents him from mounting the aforementioned attack.

As shown above, the forwarder semantics is very restrictive: a forwarder cannot rely
on his deduction capabilities to mount an attack. To counterbalance the effects of this
semantics, the key idea consists in modifying the protocols under study by adding new
rules that encrypt/sign and decrypt/check messages on demand for the forwarder.
Formally, we define a transformation Tfwd that associates to a pair of protocols in Cpp
a finite set of pairs of protocols (still in Cpp ), and we show the following result:

Proposition 1. Let P and Q be two protocols in Cpp . We have that:

P ≈ Q if, and only if, P ′ ≈fwd Q′ for some (P ′, Q′) ∈ Tfwd (P, Q).

Roughly, the transformation Tfwd consists in first guessing, among the keys of the
protocol P and the keys of the protocol Q, those that are deducible by the attacker,
as well as a bijection α between these two sets. We can show that such a bijection
necessarily exists when P ≈ Q. Then, to compensate the fact that the attacker is a
simple forwarder, we give him access to oracles for any deducible key k, adding the
corresponding branches in the processes, i.e. in case k is of sort SimKey, we add

! in(c_k^senc , x).new r.out(c_k^senc , senc(x, k, r)) | ! in(c_k^sdec , senc(x, k, z)).out(c_k^sdec , x)

To maintain the equivalence, we do a similar transformation in both P and Q, relying
on the bijection α. We ensure that the set of deducible keys has been correctly guessed
by adding some extra processes. Then the main step of the proof consists in showing
that the forwarder now has the same power as a full attacker, although he cannot reuse
the same randomness in two distinct encryptions/signatures, as a real attacker could.

4.3 Encoding a Protocol into a Real-Time GPDA

For any process P ∈ Cpp , we can show that it is possible to define a polynomial-sized
real-time GPDA AP such that trace equivalence against a forwarder of two processes
coincides with language equivalence of the two corresponding automata.

Theorem 3. Let P and Q in Cpp , we have that: P ≈fwd Q ⇐⇒ L(AP ) = L(AQ ).

The idea is that the automaton AP associated to a protocol P recognizes the words (a
sequence of channels) that correspond to a possible execution in P . The stack of AP
is used to store a (partial) representation of the last outputted term. This requires
converting a term into a word, and we use the following representation:
s̄ = s for any constant s ∈ Σ0 ∪ {start}; and f(v, k, r) is encoded as v̄.k otherwise.
Note that, even if our signature is infinite, we show that only a finite number of constants
of sort msg and a finite number of constants of sort channel need to be considered
(namely those that occur in the protocols under study). Thus, the stack-alphabet and the
input-alphabet of the automaton are both finite.
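As an illustration, the word representation above can be sketched as a small recursive function (our own encoding of terms as atoms or ('senc', v, k, r) tuples; a sketch, not the paper's formal definition). A constant maps to itself, and each encryption layer contributes its key, while the function symbol and the randomness are forgotten.

```python
def encode(term):
    """Flatten a term into its stack word: a constant maps to itself, and
    f(v, k, r) maps to the word for v followed by the key k."""
    if isinstance(term, str):
        return [term]
    _f, v, k, _r = term
    return encode(v) + [k]
```

So a pile of two encryptions of start under k1 then k2 becomes the word start.k1.k2, which is exactly the shape stored on the stack of AP.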
To construct the automaton associated to a process P ∈ Cpp , we need to construct
an automaton that recognizes any execution of P and the corresponding valid tests. For
the sake of illustration, we present only the automaton (depicted below) that recognizes
tests of the form w = w′ such that the corresponding term is actually a constant.
Intuitively, the basic building blocks (e.g. q0 with the transitions from q0 to itself)
mimic an execution of P where each input is fed with the last outputted term. Then, to
recognize the tests of the form w = w′ that are true in such an execution, it is sufficient
to memorize the constant si that is associated with w (adding a new state qi ), and to see
whether it is possible to reach a state where the stack contains si again.
Capturing tests that lead to non-constant symbols (i.e. terms of the form senc(u, k, r))
is more tricky for several reasons. First, it is not possible anymore to memorize the re-
sulting term in a state of the automaton. Second, names of sort rand play a role in such
a test, while they are forgotten in our encoding. We therefore have to, first, characterize
more precisely trace equivalence and secondly, construct more complex automata that
use some special track symbols to encode when randomized ciphertexts may be reused.

[Figure: the automaton recognizing constant tests — states q0 , q1 , . . . , qk , qf ; self-loops
labelled ci ; u_j^i /v_j^i on q0 and on each qi ; transitions labelled const; ωs_i /ω from q0
to the state qi memorizing the constant si , and from qi to the accepting state qf .]

5 From Language Equivalence to Trace Equivalence


We have just seen how to encode equivalence of processes in Cpp into real-time GPDA.
The equivalence of processes in Cpp is actually equivalent to language equivalence of
real-time GPDA. Indeed, we can conversely encode any real-time GPDA into a process
in Cpp , preserving equivalence. The transformation works as follows.
Given a word u = α1 . . . . .αp , for the sake of concision, the expression x.u will denote
either the term senc(. . . senc(x, α1 , z1 ) . . . , αp , zp ) when it occurs as an input term,
or senc(. . . senc(x, α1 , r1 ) . . . , αp , rp ) when it occurs as an output term. Then, given
an automaton A = (Q, Π, Γ, q0 , ω, Qf , δ), the corresponding process PA is defined as
follows:

PA ≝ ! in(c0 , start).new r.out(c0 , senc(ω, q0 , r))
   | ! in(ca , senc(x.u, q, z)).new r̃.out(ca , senc(x.v, q′, r))
   | ! in(cf , senc(x, qf , z)).out(cf , start)
   | PA′

where a ranges over Π, q over Q, u over words in Γ ∗ such that (q, a, u) ∈ dom(δ),
qf over Qf , and (q′, v) = δ(q, a, u).
Intuitively, the stack of the automaton A is encoded as a pile of encryptions (where
each key encodes a symbol of the stack). Then, upon receiving a stack s encrypted by q
on channel ca , the process PA mimics the transition of A at state q and stack s upon
reading a. The resulting stack is sent encrypted by the resulting state. This polynomial
encoding (with some additional technical details hidden in PA′ ) preserves equivalence.
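For illustration, the expansion of the shorthand x.u into a pile of encryptions can be made mechanical (our own sketch, with explicit randomness names; not part of the paper's formal development). The stack word, read bottom-up, becomes nested randomized ciphertexts:

```python
def stack_as_term(x, u, rands):
    """Expand x.u for u = a1...ap into senc(...senc(x, a1, r1)..., ap, rp)."""
    term = x
    for a, r in zip(u, rands):
        term = ('senc', term, a, r)   # each stack symbol becomes one encryption layer
    return term
```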

Proposition 2. Let A and B be two real-time GPDA: L(A) ⊆ L(B) ⇐⇒ PA ⊑ PB .

Therefore, checking equivalence of protocols is as difficult as checking equivalence
of real-time generalized pushdown deterministic automata. It follows that the exact
complexity of checking equivalence of protocols is unknown. The only known upper bound
is that equivalence is at most primitive recursive. This bound comes from the algorithm
proposed by C. Stirling for equivalence of DPA [16] (ICALP 2002). Whether equivalence
of DPA (or even real-time GPDA) is, e.g., at least NP-hard is unknown.

6 Conclusion
We have shown a first decidability result for equivalence of security protocols for an un-
bounded number of sessions by reducing it to the equality of languages of deterministic
pushdown automata. We further show that deciding equivalence of security protocols is
actually at least as hard as deciding equality of languages of deterministic, generalized,
real-time pushdown automata.
Our class of security protocols handles only randomized primitives, namely symmetric/
asymmetric encryption and signatures. Our decidability result could be extended to
handle deterministic primitives instead of randomized ones (though the reverse
encoding, from real-time GPDAs to processes with deterministic encryption, may not
hold anymore). Due to the use of pushdown automata, extending our decidability result
to protocols with pairs is not straightforward. A direction is to use pushdown automata
whose stacks are terms.
G. Sénizergues is currently implementing his procedure for pushdown automata [14].
As soon as the tool is available, we plan to implement our translation, yielding a
tool for automatically checking equivalence of security protocols for an unbounded
number of sessions.

References
1. Arapinis, M., Chothia, T., Ritter, E., Ryan, M.: Analysing unlinkability and anonymity using
the applied pi calculus. In: 23rd Computer Security Foundations Symposium (CSF 2010),
pp. 107–121. IEEE Computer Society Press (2010)
2. Basin, D., Mödersheim, S., Viganò, L.: A symbolic model checker for security protocols.
International Journal of Information Security 4(3), 181–208 (2005)
3. Baudet, M.: Deciding security of protocols against off-line guessing attacks. In: 12th ACM
Conference on Computer and Communications Security (CCS 2005). ACM Press (2005)
4. Blanchet, B.: An efficient cryptographic protocol verifier based on prolog rules. In: 14th
Computer Security Foundations Workshop (CSFW 2001). IEEE Computer Society Press
(2001)
5. Blanchet, B., Abadi, M., Fournet, C.: Automated Verification of Selected Equivalences for
Security Protocols. In: 20th Symposium on Logic in Computer Science (2005)
6. Bruso, M., Chatzikokolakis, K., den Hartog, J.: Formal verification of privacy for RFID sys-
tems. In: 23rd Computer Security Foundations Symposium, CSF 2010 (2010)
7. Cheval, V., Comon-Lundh, H., Delaune, S.: Trace equivalence decision: Negative tests and
non-determinism. In: 18th ACM Conference on Computer and Communications Security
(CCS 2011). ACM Press (2011)
8. Chevalier, Y., Rusinowitch, M.: Decidability of equivalence of symbolic derivations. J. Au-
tom. Reasoning 48(2), 263–292 (2012)
9. Comon-Lundh, H., Cortier, V.: New decidability results for fragments of first-order logic
and application to cryptographic protocols. In: Nieuwenhuis, R. (ed.) RTA 2003. LNCS,
vol. 2706, pp. 148–164. Springer, Heidelberg (2003)
10. Cortier, V., Delaune, S.: A method for proving observational equivalence. In: 22nd IEEE
Computer Security Foundations Symposium (CSF 2009). IEEE Computer Society Press
(2009)
11. Cremers, C.: Unbounded verification, falsification, and characterization of security protocols
by pattern refinement. In: 15th ACM Conference on Computer and Communications Security
(CCS 2008). ACM (2008)

12. Friedman, E.P.: The inclusion problem for simple languages. Theor. Comput. Sci. 1(4),
297–316 (1976)
13. Rusinowitch, M., Turuani, M.: Protocol Insecurity with Finite Number of Sessions and Com-
posed Keys is NP-complete. Theoretical Computer Science 299, 451–475 (2003)
14. Sénizergues, G.: The equivalence problem for deterministic pushdown automata is decidable.
In: Degano, P., Gorrieri, R., Marchetti-Spaccamela, A. (eds.) ICALP 1997. LNCS, vol. 1256,
pp. 671–681. Springer, Heidelberg (1997)
15. Sénizergues, G.: L(A)=L(B)? Decidability results from complete formal systems. Theor.
Comput. Sci. 251(1-2), 1–166 (2001)
16. Stirling, C.: Deciding DPDA equivalence is primitive recursive. In: Widmayer, P., Triguero,
F., Morales, R., Hennessy, M., Eidenbenz, S., Conejo, R. (eds.) ICALP 2002. LNCS,
vol. 2380, pp. 821–832. Springer, Heidelberg (2002)
17. Tiu, A., Dawson, J.E.: Automating open bisimulation checking for the SPI calculus. In: 23rd
IEEE Computer Security Foundations Symposium (CSF 2010), pp. 307–321 (2010)
Efficient Separability of Regular Languages
by Subsequences and Suffixes

Wojciech Czerwiński, Wim Martens, and Tomáš Masopust

Institute for Computer Science, University of Bayreuth


[email protected], [email protected], [email protected]

Abstract. When can two regular word languages K and L be separated
by a simple language? We investigate this question and consider separation
by piecewise- and suffix-testable languages and variants thereof.
We give characterizations of when two languages can be separated and
present an overview of when these problems can be decided in polynomial
time if K and L are given by nondeterministic automata.

1 Introduction
In this paper we are motivated by scenarios in which we want to describe some-
thing complex by means of a simple language. The technical core of our scenarios
consists of separation problems, which are usually of the following form:
Given are two languages K and L. Does there exist a language S, coming
from a family F of simple languages, such that S contains everything
from K and nothing from L?
The family F of simple languages could be, for example, languages definable in
FO, piecewise testable languages, or languages definable with small automata.
Our work is specifically motivated by two seemingly orthogonal problems
coming from practice: (a) increasing the user-friendliness of XML Schema and
(b) efficient approximate query answering. We explain these next.
Our first motivation comes from simplifying XML Schema. XML Schema is
currently the only industrially accepted and widely supported schema language
for XML. Historically, it is designed to alleviate the limited expressiveness of
Document Type Definition (DTD) [6], thereby making DTDs obsolete. Unfor-
tunately, XML Schema’s extra expressiveness comes at the cost of simplicity.
Its code is designed to be machine-readable rather than human-readable and
its logical core, based on complex types, does not seem well-understood by users
[16]. One reason may be that the specification of XML Schema’s core [8] consists
of over 100 pages of intricate text. The BonXai schema language [16,17] is an
attempt to overcome these issues and to combine the simplicity of DTDs with
the expressiveness of XML Schema. It has exactly the same expressive power as
XML Schema, is designed to be human-readable, and avoids the use of complex
types. Therefore, it aims at simplifying the development or analysis of XSDs.
In its core, a BonXai schema is a set of rules L1 → R1 , . . . , Ln → Rn in which
all Li and Ri are regular expressions. An unranked tree t (basically, an XML

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 150–161, 2013.
© Springer-Verlag Berlin Heidelberg 2013

document) is in the language of the schema if, for every node u, the word formed
by the labels of u’s children is in the language Rk , where k is the largest num-
ber such that the word of ancestors of u is in Lk . This semantical definition is
designed to ensure full back-and-forth compatibility with XML Schema [16].
When translating an XML Schema Definition (XSD) into an equivalent BonXai
schema, the regular expressions Li are obtained from a finite automaton that is
embedded in the XSD. Since the current state-of-the-art in translating automata
to expressions does not yet generate human-readable results, we are investigating
simpler classes of expressions which we expect to suffice in practice. Practical and
theoretical studies show evidence that regular expressions of the form Σ ∗ w (with
w ∈ Σ + ) and Σ ∗ a1 Σ ∗ · · · Σ ∗ an (with a1 , . . . , an ∈ Σ) and variations thereof seem
to be quite well-suited [9,13,18]. We study these kinds of expressions in this paper.
Our second motivation comes from efficient approximate query answering.
Efficiently evaluating regular expressions is relevant in a very wide array of fields.
We choose one: in graph databases and in the context of the SPARQL language
[5,10,14,19] for querying RDF data. Typically, regular expressions are used in
this context to match paths between nodes in a huge graph. In fact, the data can
be so huge that exact evaluation of a regular expression r over the graph (which
can lead to a product construction between an automaton for the expression and
the graph [14,19]) may not be feasible within reasonable time. Therefore, as a
compromise to exact evaluation, one could imagine that we try to rewrite the
regular expression r as an expression that we can evaluate much more efficiently
and is close enough to r. Concretely, we could specify two expressions rpos (resp.,
rneg ) that define the language we want to (resp., do not want to) match in our
answer and ask whether there exists a simple query (e.g., defining a piecewise
testable language) that satisfies these constraints. Notice that the scenario of
approximating an expression r in this way is very general and not even limited
to databases. (Also, we can take rneg to be the complement of rpos .)
At first sight, these two motivating scenarios may seem to be fundamentally
different. In the first, we want to compute an exact simple description of a
complex object and in the second one we want to compute an approximate simple
query that can be evaluated more efficiently. However, both scenarios boil down
to the same underlying question of language separation. Our contributions are:
(1) We formally define separation problems that closely correspond to the mo-
tivating scenarios. Query approximation will be abstracted as separation and
schema simplification as layer-separation (Section 2.1).
(2) We prove the equivalence of separability of languages K and L by boolean combinations
of simple languages, layer-separability, and the existence of an infinite
sequence of words that goes back and forth between K and L. This characteri-
zation shows how the exact and approximate scenario are related and does not
require K and L to be regular (Sec. 3). Our characterization generalizes a result
by Stern [23] that says that a regular language L is piecewise testable iff every
increasing infinite sequence of words (w.r.t. subsequence ordering) alternates
finitely many times between L and its complement.

(3) In Section 4 we prove a decomposition characterization for separability of


regular languages by piecewise testable languages and we give an algorithm that
decides separability. The decomposition characterization is in the spirit of an
algebraic result by Almeida [1]. It is possible to prove our characterization using
Almeida’s result but we provide a self-contained, elementary proof which can be
understood without a background in algebra. We then use this characterization
to distill a polynomial time decision procedure for separability of languages of
NFAs (or regular expressions) by piecewise testable languages. The state-of-
the-art algorithm for separability by piecewise testable languages ([2,4]) runs in
time O(poly(|Q|) · 2^|Σ|) when given DFAs for the regular languages, where |Q|
is the number of states in the DFAs and |Σ| is the alphabet size. Our algorithm
runs in time O(poly(|Q| + |Σ|)) even for NFAs. Notice that |Σ| can be large
(several hundreds and more) in the scenarios that motivate us, so we believe the
improvement with respect to the alphabet to be relevant in practice.
(4) Whereas Section 4 focuses exclusively on separation by piecewise testable lan-
guages, we broaden our scope in Section 5. Let’s say that a subsequence language
is a language of the form Σ ∗ a1 Σ ∗ · · · Σ ∗ an Σ ∗ (with all ai ∈ Σ). Similarly, a suffix
language is of the form Σ ∗ a1 · · · an . We present an overview of the complexities
of deciding whether regular languages can be separated by subsequence lan-
guages, suffix languages, finite unions thereof, or boolean combinations thereof.
We prove all cases to be in polynomial time, except separability by a single
subsequence language which is NP-complete. By combining this with the results
from Section 3 we also have that layer-separability is in polynomial time for all
languages we consider.
We now discuss further related work. There is a large body of related work that
has not been mentioned yet. Piecewise testable languages are defined and studied
by Simon [20,21], who showed that a regular language is piecewise testable iff
its syntactic monoid is J-trivial and iff both the minimal DFA for the language
and the minimal DFA for the reversal are partially ordered. Stern [24] suggested
an O(n5 ) algorithm in the size of a DFA to decide whether a regular language
is piecewise testable. This was improved to quadratic time by Trahtman [25].
(Actually, from our proof, it now follows that this question can be decided in
polynomial time if an NFA and its complement NFA are given.)
Almeida [2] established a connection between a number of separation problems
and properties of families of monoids called pseudovarieties. Almeida shows,
e.g., that deciding whether two given regular languages can be separated by a
language with its syntactic monoid lying in pseudovariety V is algorithmically
equivalent to computing two-pointlike sets for a monoid in pseudovariety V. It
is then shown by Almeida et al. [3] how to compute these two-pointlike sets in
the pseudovariety J corresponding to piecewise testable languages. Henckell et
al. [11] and Steinberg [22] show that the two-pointlike sets can be computed
for pseudovarieties corresponding to languages definable in first order logic and
languages of dot depth at most one, respectively. By Almeida’s result [2] this
implies that the separation problem is also decidable for these classes.

2 Preliminaries and Definitions


For a finite set S, we denote its cardinality by |S|. By Σ we always denote an
alphabet, that is, a finite set of symbols. A (Σ-)word w is a finite sequence of
symbols a1 · · · an , where n ≥ 0 and ai ∈ Σ for all i = 1, . . . , n. The length of
w, denoted by |w|, is n and the alphabet of w, denoted by Alph(w), is the set
{a1 , . . . , an } of symbols occurring in w. The empty word is denoted by ε. The set
of all Σ-words is denoted by Σ ∗ . A language is a set of words. For v = a1 · · · an
and w ∈ Σ ∗ a1 Σ ∗ · · · Σ ∗ an Σ ∗ , we say that v is a subsequence of w, denoted by
v ≼ w.
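For illustration, the subsequence test (equivalently, membership in Σ∗a1Σ∗ · · · Σ∗anΣ∗) admits a simple one-pass implementation; the sketch below is our own, not part of the paper.

```python
def is_subsequence(v, w):
    """v is a subsequence of w iff the letters of v occur in w in order."""
    it = iter(w)
    return all(a in it for a in v)   # 'a in it' consumes w's iterator up to a
```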
A (nondeterministic) finite automaton or NFA A is a tuple (Q, Σ, δ, q0 , F ),
where Q is a finite set of states, δ : Q × Σ → 2^Q is the transition function, q0 ∈ Q
is the initial state, and F ⊆ Q is the set of accepting states. We sometimes denote
that q2 ∈ δ(q1 , a) as q1 −a→ q2 ∈ δ to emphasize that A, being in state q1 , can go
to state q2 reading an a ∈ Σ. A run of A on word w = a1 · · · an is a sequence of
states q0 · · · qn where, for each i = 1, . . . , n, we have qi−1 −ai→ qi ∈ δ. The run is
accepting if qn ∈ F . Word w is accepted by A if there is an accepting run of A on
w. The language of A, denoted by L(A), is the set of all words accepted by A.
By δ ∗ we denote the extension of δ to words, that is, δ ∗ (q, w) is the set of states
that can be reached from q by reading w. The size |A| = |Q| + ∑q,a |δ(q, a)| of A
is the total number of transitions and states. An NFA is deterministic (a DFA)
when every δ(q, a) consists of at most one element.
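The extension δ∗ can be computed on the fly by tracking the set of reachable states letter by letter; the following sketch, with a small NFA for words over {a, b} containing the factor ab, is our own illustration.

```python
def accepts(delta, q0, finals, word):
    """NFA membership: maintain the set delta*(q0, prefix) while reading the word.
    delta maps (state, letter) to a set of successor states."""
    states = {q0}
    for a in word:
        states = {q2 for q1 in states for q2 in delta.get((q1, a), ())}
    return bool(states & finals)

# NFA guessing the position of the factor 'ab': p is initial, f is accepting.
delta = {('p', 'a'): {'p', 'q'}, ('p', 'b'): {'p'},
         ('q', 'b'): {'f'},
         ('f', 'a'): {'f'}, ('f', 'b'): {'f'}}
```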
The regular expressions (RE) over Σ are defined as follows: ε and every Σ-
symbol is a regular expression; whenever r and s are regular expressions, then so
are (r · s), (r + s), and (s)∗ . In addition, we allow ∅ as a regular expression, but
we assume that ∅ does not occur in any other regular expression. For readabil-
ity, we usually omit concatenation operators and parentheses in examples. We
sometimes abbreviate an n-fold concatenation of r by rn . The language defined
by an RE r is denoted by L(r) and is defined as usual. Often we simply write r
instead of L(r). Whenever we say that expressions or automata are equivalent,
we mean that they define the same language. The size |r| of r is the total number
of occurrences of alphabet symbols, epsilons, and operators in r, i.e., the number
of nodes in its parse tree. A regular expression is union-free if it does not contain
the operator +. A language is union-free if it is defined by a union-free regular
expression.
A quasi-order is a reflexive and transitive relation. For a quasi-order ⊑,
the (upward) ⊑-closure of a language L is the set closure⊑ (L) = {w | v ⊑
w for some v ∈ L}. We denote the ⊑-closure of a word w as closure⊑ (w) instead
of closure⊑ ({w}). Language L is (upward) ⊑-closed if L = closure⊑ (L).
A quasi-order ⊑ on a set X is a well-quasi-ordering (a WQO ) if for every
infinite sequence x1 , x2 , . . . of elements of X there exist indices i < j such that
xi ⊑ xj . It is known that every WQO is also well-founded, that is, there exist
no infinite descending sequences x1 ⊐ x2 ⊐ · · · , where x ⊐ y means y ⊑ x and x ⋢ y.
Higman’s Lemma [12] (which we use multiple times) states that, for every alphabet
Σ, the subsequence relation ≼ is a WQO on Σ ∗ . Notice that, as a corollary
to Higman’s Lemma, every ≼-closed language is a finite union of languages

of the form Σ ∗ a1 Σ ∗ . . . Σ ∗ an Σ ∗ which means that it is also regular, see also [7].
A language is piecewise testable if it is a finite boolean combination of ⊑-closed
languages (equivalently, a finite boolean combination of languages of the form
Σ∗a1Σ∗ · · · Σ∗anΣ∗). In this paper, all boolean combinations are finite.

2.1 Separability of Languages

A language S separates language K from L if S contains K and does not intersect
L. We say that S separates K and L if it either separates K from L or L from
K. Let F be a family of languages. Languages K and L are separable by F if
there exists a language S in F that separates K and L. Languages K and L are
layer-separable by F if there exists a finite sequence of languages S1 , . . . , Sm in
F such that
1. for all 1 ≤ i ≤ m, language Si \ (S1 ∪ · · · ∪ Si−1) intersects at most one of K and L;
2. K or L (possibly both) is included in S1 ∪ · · · ∪ Sm.

Notice that separability always implies layer-separability. However, the opposite
implication does not hold, as we demonstrate next.

Example 1. Let F = {a^n a∗ | n ≥ 0} be a family of ⊑-closed languages over
Σ = {a}, K = {a, a^3}, and L = {a^2, a^4}. We first show that languages K and
L are not separable by F. Indeed, assume that S ∈ F separates K and L. If K
is included in S, then aa∗ ⊆ S, hence L and S are not disjoint. Conversely, if
L ⊆ S, then a^2 a∗ ⊆ S and therefore S and K are not disjoint. This contradicts
that S separates K and L. Now we show that the languages are layer-separable
by F. Consider languages S1 = a^4 a∗, S2 = a^3 a∗, S3 = a^2 a∗, and S4 = aa∗.
Then both K and L are included in S4, and S1 intersects only L, S2 \ S1 = a^3
intersects only K, S3 \ (S1 ∪ S2) = a^2 intersects only L, and S4 \ (S1 ∪ S2 ∪ S3) = a
intersects only K; see Fig. 1.
Example 1 illustrates some intuition behind layered separability. Our motivation
for layered separability comes from the BonXai schema language which is discussed
in the introduction. We need to solve layer-separability if we want to decide whether
an XML Schema has an equivalent BonXai schema with simple regular expressions
(defining languages in F).

Fig. 1. An example of a layer-separation

Layered separability implies that languages are, in a sense, separable by languages
from F in a priority-based system: if we consider the ordered sequence of languages
S1, S2, S3, S4 then, in order to classify a word w ∈ K ∪ L in either K or L, we have
to match it against the Si in increasing order of the index i. If we know the lowest
index j for which w ∈ Sj, we know whether w ∈ K or w ∈ L.
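The priority-based classification can be made concrete on Example 1. The encoding below is our own illustration: each layer is a membership predicate over the unary alphabet {a}, tried in increasing index order.

```python
# Layers S1..S4 of Example 1 over Σ = {a}, in priority order:
# S1 = a^4 a*, S2 = a^3 a*, S3 = a^2 a*, S4 = a a*.
layers = [
    (lambda w: len(w) >= 4, "L"),  # S1 intersects only L
    (lambda w: len(w) >= 3, "K"),  # S2 \ S1 = {a^3} intersects only K
    (lambda w: len(w) >= 2, "L"),  # S3 \ (S1 ∪ S2) = {a^2} intersects only L
    (lambda w: len(w) >= 1, "K"),  # S4 \ (S1 ∪ S2 ∪ S3) = {a} intersects only K
]

def classify(word):
    """Classify a word of K ∪ L by the lowest-index layer containing it."""
    for member, side in layers:
        if member(word):
            return side
    raise ValueError("word matched no layer")

assert [classify(w) for w in ("a", "aa", "aaa", "aaaa")] == ["K", "L", "K", "L"]
```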

We now define a tool (similar to and slightly more general than the alternating
towers of Stern [23]) that allows us to determine when languages are not
separable. For languages K and L and a quasi-order ≼, we say that a sequence
w1, . . . , wk of words is a ≼-zigzag between K and L if w1 ∈ K ∪ L and, for all
i = 1, . . . , k − 1:
(1) wi ≼ wi+1; (2) wi ∈ K implies wi+1 ∈ L; and (3) wi ∈ L implies wi+1 ∈ K.
We say that k is the length of the ≼-zigzag. We similarly define an infinite
sequence of words to be an infinite ≼-zigzag between K and L. If the languages
K and L are clear from the context then we sometimes omit them and refer
to the sequence as an (infinite) ≼-zigzag. If we consider the subsequence order
⊑, then we simply write a zigzag instead of a ⊑-zigzag. Notice that we do not
require K and L to be disjoint. If there is a w ∈ K ∩ L then there clearly exists
an infinite zigzag: w, w, w, . . .
Example 2. In order to illustrate infinite zigzags consider the languages K =
{a(ab)^{2k} c(ac)^{2ℓ} | k, ℓ ≥ 0} and L = {b(ab)^{2k+1} c(ac)^{2ℓ+1} | k, ℓ ≥ 0}. Then the
following infinite sequence is an infinite zigzag between K and L:

    wi = b(ab)^i c(ac)^i  if i is odd,
    wi = a(ab)^i c(ac)^i  if i is even.

Indeed w1 ∈ L, words from the sequence alternately belong to K and L, and for
all i ≥ 1 we have wi ⊑ wi+1. ⊓⊔

3 A Characterization of Separability
The aim of this section is to prove the following theorem. It extends a result by
Stern that characterizes piecewise testable languages [23]. In particular, it also
applies to non-regular languages and does not require K to be the complement
of L.
Theorem 3. For languages K and L and a WQO ≼ on words, the following
are equivalent.
(1) K and L are separable by a boolean combination of ≼-closed languages.
(2) K and L are layer-separable by ≼-closed languages.
(3) There does not exist an infinite ≼-zigzag between K and L.

Some of the equivalences in the theorem still hold when the assumptions are
weakened. For example, the equivalence between (1) and (2) does not require ≼
to be a WQO.
Since the subsequence order ⊑ is a WQO on words, we know from Theorem 3
that languages are separable by piecewise testable languages if and only if they
are layer-separable by ⊑-closed languages. Actually, since ⊑ is a WQO (and
therefore only has finitely many minimal elements within a language), the latter
is equivalent to being layer-separable by languages of the form Σ∗a1Σ∗ · · · Σ∗anΣ∗.

In Example 1 we illustrated two languages K and L that are layer-separable
by ⊑-closed languages. Notice that K and L can also be separated by a boolean
combination of the languages a∗a^1, a∗a^2, a∗a^3, and a∗a^4 from F, as
K ⊆ ((a∗a^1 \ a∗a^2) ∪ (a∗a^3 \ a∗a^4)) and L ∩ ((a∗a^1 \ a∗a^2) ∪ (a∗a^3 \ a∗a^4)) = ∅.
We now give an overview of the proof of Theorem 3. The next lemma proves
the equivalence between (1) and (2), but is slightly more general. In particular,
it does not rely on a WQO.
Lemma 4. Let F be a family of languages closed under intersection and con-
taining Σ ∗ . Then languages K and L are separable by a finite boolean combina-
tion of languages from F if and only if K and L are layer-separable by F .
The proof is constructive. The only if direction is the more complex one and
shows how to exploit the implicit negation in the first condition in the definition
of layer-separability in order to simulate separation by boolean combinations.
Notice that the families of ≼-closed languages in Theorem 3 always contain Σ∗
and are closed under intersection.
The following lemma shows that the implication (2) ⇒ (3) in Theorem 3 does
not require ≼ to be a well-quasi-ordering.

Lemma 5. Let ≼ be a quasi-order on words and assume that languages K and
L are layer-separable by ≼-closed languages. Then there is no infinite ≼-zigzag
between K and L.
To prove that (3) implies (2), we need the following technical lemma, in which
we require ≼ to be a WQO. In the proof of the lemma, we argue how we can see
≼-zigzags in a tree each of whose nodes is labeled by a word. If a node labeled
w1 is the parent of a node labeled w2, then we have w1 ≼ w2. Intuitively, every
path in the tree structure corresponds to a ≼-zigzag. We need the fact that ≼
is a WQO in order to show that we can assume that every node in this tree
structure has a finite number of children. We then apply König's lemma to show
that arbitrarily long ≼-zigzags imply the existence of an infinite ≼-zigzag. The
lemma then follows by contraposition.

Lemma 6. Let ≼ be a WQO on words. If there is no infinite ≼-zigzag between
languages K and L, then there exists a constant k ∈ N such that no ≼-zigzag
between K and L is longer than k.
If there is no infinite ≼-zigzag, then we can put a bound on the maximal length
of zigzags by Lemma 6. This bound actually has a close correspondence to the
number of "layers" we need to separate K and L.

Lemma 7. Let ≼ be a WQO on words and assume that there is no infinite
≼-zigzag between languages K and L. Then the languages K and L are layer-
separable by ≼-closed languages.

4 Testing Separability by Piecewise Testable Languages


Whereas Section 3 proves a result for general WQOs, we focus in this section
exclusively on the ordering ⊑ of subsequences. Therefore, if we say zigzag in this
section, we always mean ⊑-zigzag. We show here how to decide the existence
of an infinite zigzag between two regular word languages, given by their regular
expressions or NFAs, in polynomial time. According to Theorem 3, this is equiv-
alent to deciding if the two languages can be separated by a piecewise testable
language.
To this end, we first prove a decomposition result that is reminiscent of a
result of Almeida ([1], Theorem 4.1 in [3]). We show that, if there is an infinite
zigzag between regular languages, then there is an infinite zigzag of a special
form in which every word can be decomposed in some synchronized manner. We
can find these special forms of zigzags in polynomial time in the NFAs for the
languages. The main features are that our algorithm runs exponentially faster
in the alphabet size than the current state-of-the-art [4] and that our algorithm
and its proof of correctness do not require knowledge of the algebraic perspective
on regular languages.
A regular language is a cycle language if it is of the form u(v)∗ w, where u, v, w
are words and (Alph(u) ∪ Alph(w)) ⊆ Alph(v). We say that v is a cycle of the
language and that Alph(v) is its cycle alphabet. Regular languages LA and LB
are synchronized in one step if they are of one of the following forms:
– LA = LB = {w}, that is, they are the same singleton word, or
– LA and LB are cycle languages with equal cycle alphabets.
We say that regular languages LA and LB are synchronized if they are of the form
LA = D1A D2A . . . DkA and LB = D1B D2B . . . DkB where, for all 1 ≤ i ≤ k, languages
DiA and DiB are synchronized in one step. So, languages are synchronized if they
can be decomposed into (equally many) components that can be synchronized
in one step. Notice that synchronized languages are always non-empty.
Example 8. Languages LA = a(ba)∗aab ca bb(bc)∗ and LB = b(aab)∗ba ca cc(cbc)∗b
are synchronized. Indeed, LA = D1A D2A D3A and LB = D1B D2B D3B for
D1A = a(ba)∗aab, D2A = ca, D3A = bb(bc)∗ and
D1B = b(aab)∗ba, D2B = ca, and D3B = cc(cbc)∗b.
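The one-step synchronization conditions can be checked mechanically. Below is a small sketch (our own encoding: a component is either a word, for a singleton language, or a triple (u, v, w) standing for u(v)∗w), replayed on Example 8:

```python
def is_cycle_language(u, v, w):
    # cycle-language condition: Alph(u) ∪ Alph(w) ⊆ Alph(v)
    return set(u) | set(w) <= set(v)

def one_step_synchronized(dA, dB):
    if isinstance(dA, str) and isinstance(dB, str):
        return dA == dB                        # same singleton word
    if isinstance(dA, tuple) and isinstance(dB, tuple):
        return (is_cycle_language(*dA) and is_cycle_language(*dB)
                and set(dA[1]) == set(dB[1]))  # equal cycle alphabets
    return False

# Example 8, component by component:
A = [("a", "ba", "aab"), "ca", ("bb", "bc", "")]
B = [("b", "aab", "ba"), "ca", ("cc", "cbc", "b")]
assert all(one_step_synchronized(x, y) for x, y in zip(A, B))
```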
The next lemma shows that, in order to search for infinite zigzags, it suffices
to search for synchronized sublanguages. The proof goes through a sequence of
lemmas that gradually shows how the sublanguages of LA and LB can be made
more and more specific.
Lemma 9 (Synchronization / Decomposition). There is an infinite zigzag
between regular languages LA and LB if and only if there exist synchronized
languages K A ⊆ LA and K B ⊆ LB .
We now use this result to obtain a polynomial-time algorithm solving our prob-
lem. The first step is to define what it means for NFAs to contain synchronized
sublanguages.
For an NFA A over an alphabet Σ, two states p, q, and a word w ∈ Σ∗, we
write p −w→ q if q ∈ δ∗(p, w) or, in other words, the automaton can go from state
p to state q by reading w. For Σ0 ⊆ Σ, states p and q are Σ0-connected in A if
there exists a word uvw ∈ Σ0∗ such that:

Fig. 2. Synchronization of automata A and B

1. Alph(v) = Σ0 and
2. there is a state m such that p −u→ m, m −v→ m, and m −w→ q.
Consider two NFAs A = (QA, Σ, δA, q0A, FA) and B = (QB, Σ, δB, q0B, FB). Let
(qA, qB) and (q̄A, q̄B) be in QA × QB. We say that (qA, qB) and (q̄A, q̄B) are
synchronizable in one step if one of the following situations occurs:
– there exists a symbol a in Σ such that qA −a→ q̄A and qB −a→ q̄B,
– there exists an alphabet Σ0 ⊆ Σ such that qA and q̄A are Σ0-connected in
A and qB and q̄B are Σ0-connected in B.
We say that automata A and B are synchronizable if there exists a sequence of
pairs (q_0^A, q_0^B), . . . , (q_k^A, q_k^B) ∈ QA × QB such that:
1. for all 0 ≤ i < k, (q_i^A, q_i^B) and (q_{i+1}^A, q_{i+1}^B) are synchronizable in one step;
2. states q_0^A and q_0^B are initial states of A and B, respectively; and
3. states q_k^A and q_k^B are accepting states of A and B, respectively.
Notice that if the automata A and B are synchronizable, then the languages
L(A) and L(B) are not necessarily synchronized themselves; only some of their
sublanguages are.
Lemma 10 (Synchronizability of automata). For two NFAs A and B, the
following conditions are equivalent.
1. Automata A and B are synchronizable.
2. There exist synchronized languages K A ⊆ L(A) and K B ⊆ L(B).
The intuition behind Lemma 10 is depicted in Figure 2. The idea is that there is
a sequence (q0A, q0B), . . . , (qkA, qkB) that witnesses that A and B are synchronizable.
The pairs of paths drawn in the same line style depict parts of the automata
that are synchronizable in one step. In particular, the dotted path from q1A to
qjA is labeled by the same word as the one from q1B to qjB. The other two paths
contain at least one loop.
The following theorem states that synchronizability in automata captures
exactly the existence of infinite zigzags between their languages. The theorem
statement uses Theorem 3 for the connection between infinite zigzags and sepa-
rability.

Theorem 11. Let A and B be two NFAs. Then the languages L(A) and L(B)
are separable by a piecewise testable language if and only if the automata A and
B are not synchronizable.
We can now show how the algorithm from [4] can be improved to test in poly-
nomial time whether two given NFAs are synchronizable or not. Our algorithm
computes quadruples of states that are synchronizable in one step and links such
quadruples together so that they form a pair of paths as illustrated in Figure 2.
Theorem 12. Given two NFAs A and B, it is possible to test in polynomial
time whether L(A) and L(B) can be separated by a piecewise testable language.
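A brute-force rendition of the synchronizability test behind Theorems 11 and 12 can be sketched as follows. The encoding and all names are our own; this version enumerates every candidate cycle alphabet, so it is exponential in |Σ|, whereas the algorithm of Theorem 12 is polynomial.

```python
# An NFA is (delta, initial, finals) with delta a set of triples (p, a, q).
from itertools import combinations

def reachable_pairs(delta, allowed):
    """All (p, q) with a possibly empty path p -> q using letters in `allowed`."""
    states = {s for (p, a, q) in delta for s in (p, q)}
    adj = {s: set() for s in states}
    for (p, a, q) in delta:
        if a in allowed:
            adj[p].add(q)
    pairs = set()
    for s in states:
        seen, stack = {s}, [s]
        while stack:
            for v in adj[stack.pop()]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        pairs |= {(s, t) for t in seen}
    return pairs

def sigma0_connected(delta, sigma0):
    """Pairs (p, q) linked through a state m carrying a cycle of alphabet exactly sigma0."""
    rp = reachable_pairs(delta, sigma0)
    same_scc = lambda x, y: (x, y) in rp and (y, x) in rp
    hubs = {m for m in {s for (p, a, q) in delta for s in (p, q)}
            if {a for (p, a, q) in delta
                if a in sigma0 and same_scc(m, p) and same_scc(m, q)} == sigma0}
    return {(p, q) for (p, m) in rp if m in hubs
                   for (m2, q) in rp if m2 == m}

def synchronizable(nfa_a, nfa_b, sigma):
    (dA, iA, fA), (dB, iB, fB) = nfa_a, nfa_b
    alphabets = [set(c) for r in range(1, len(sigma) + 1)
                 for c in combinations(sorted(sigma), r)]
    conn = [(sigma0_connected(dA, s0), sigma0_connected(dB, s0))
            for s0 in alphabets]

    def one_step(pA, pB):
        for (p, a, q) in dA:          # same single letter in both automata ...
            for (p2, a2, q2) in dB:
                if p == pA and p2 == pB and a == a2:
                    yield (q, q2)
        for cA, cB in conn:           # ... or Sigma_0-connected in both
            for (x, qA) in cA:
                for (y, qB) in cB:
                    if x == pA and y == pB:
                        yield (qA, qB)

    seen, stack = {(iA, iB)}, [(iA, iB)]
    while stack:
        cur = stack.pop()
        for nxt in one_step(*cur):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return any(qa in fA and qb in fB for (qa, qb) in seen)

# L(A1) = {a} and L(B1) = {b} are separable; aa* and aaa* intersect, hence not.
A1, B1 = ({(0, "a", 1)}, 0, {1}), ({(0, "b", 1)}, 0, {1})
A2 = ({(0, "a", 1), (1, "a", 1)}, 0, {1})
B2 = ({(0, "a", 1), (1, "a", 2), (2, "a", 2)}, 0, {2})
assert not synchronizable(A1, B1, {"a", "b"})
assert synchronizable(A2, B2, {"a"})
```

By Theorem 11, `synchronizable` returning False means the two languages can be separated by a piecewise testable language.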

5 Asymmetric Separation and Suffix Order


We present a bigger picture on efficient separations that are relevant to the
scenarios that motivate us. For example, we consider what happens when we
restrict the allowed boolean combinations of languages. Technically, this means
that separation is no longer symmetric. Orthogonally, we also consider the suffix
order ⊑s between words, in which v ⊑s w if and only if v is a (not necessarily
strict) suffix of w. An important technical difference with the rest of the paper
is that the suffix order is not a WQO. Indeed, the suffix order ⊑s has an infinite
antichain, e.g., a, ab, abb, abbb, . . . The results we present here for suffix order
hold true for prefix order as well.
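That the suffix order is not a WQO can be sanity-checked on the antichain above (a small check of our own):

```python
def is_suffix(v, w):
    """v ⊑s w: v is a (not necessarily strict) suffix of w."""
    return w.endswith(v)

chain = ["a" + "b" * i for i in range(6)]  # a, ab, abb, abbb, ...
# No word in the chain is a suffix of a different one: an antichain.
assert not any(is_suffix(u, v) for u in chain for v in chain if u != v)
```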
Let F be a family of languages. Language K is separable from a language L
by F if there exists a language S in F that separates K from L, i.e., contains
K and does not intersect L. Thus, if F is closed under complement, then K is
separable from L implies L is separable from K. The separation problem by F
asks, given an NFA for K and an NFA for L, whether K is separable from L by
F.
We consider separation by families of languages F(O, C), where O ("order")
specifies the ordering relation and C ("combinations") specifies how we are al-
lowed to combine (upward) O-closed languages. Concretely, O is either the sub-
sequence order ⊑ or the suffix order ⊑s. We allow C to be one of single, unions,
or bc (boolean combinations), meaning that each language in F(O, C) is either
the O-closure of a single word, a finite union of the O-closures of single words, or
a finite boolean combination of the O-closures of single words. Thus, F(⊑, bc) is
the family of piecewise testable languages and F(⊑s, bc) is the family of suffix-
testable languages. With this convention in mind, the main result of this section
is to provide a complete complexity overview of the six possible cases of sep-
aration by F(O, C). The case F(⊑, bc) has already been proved in Section 4.

Theorem 13. For O ∈ {⊑, ⊑s} and C being one of single, unions, or boolean
combinations, the complexity of the separation problem by F(O, C) is as indicated
in Table 1.

Since the separation problem for prefix order is basically the same as the separation
problem for suffix order and has the same complexity, we do not list it separately
in the table.

Table 1. The complexity of deciding separability for regular languages K and L

F(O, C)            single         unions    bc (boolean combinations)
⊑ (subsequence)    NP-complete    PTIME     PTIME
⊑s (suffix)        PTIME          PTIME     PTIME

6 Conclusions and Further Questions

Subsequence and suffix languages seem to be promising for obtaining "simple"
separations of regular languages, since we can often efficiently decide if two given
regular languages are separable (Table 1). Layer-separability is even in PTIME
in all cases, since it has the same complexity as separability by boolean combinations.
Looking back at our motivating scenarios, the obvious next questions are: if a
separation exists, can we efficiently compute one? How large is it?
If we look at the broader picture, we are interested in how characterizations of
separability can be used in a wider context than regular languages and subse-
quence ordering. Another concrete question is whether we can decide in polyno-
mial time if a given NFA defines a piecewise testable language. Furthermore, we
are also interested in efficient separation results by combinations of languages of
the form Σ∗w1Σ∗ · · · Σ∗wn or variants thereof.
We discovered that Theorem 12 and a characterization similar to Theorem 11
have also been obtained in [26], which was submitted to arXiv three weeks after
the ICALP deadline.

Acknowledgments. We thank Jean-Éric Pin and Marc Zeitoun for patiently
answering our questions about the algebraic perspective on this problem. We are
grateful to Mikołaj Bojańczyk, who pointed out the connection between layered
separability and boolean combinations. We also thank Piotr Hofman for pleasant
and insightful discussions about our proofs during his visit to Bayreuth. This
work was supported by DFG grant MA 4938/2-1.

References

1. Almeida, J.: Implicit operations on finite J-trivial semigroups and a conjecture of
I. Simon. Journal of Pure and Applied Algebra 69, 205–218 (1990)
2. Almeida, J.: Some algorithmic problems for pseudovarieties. Publicationes Mathe-
maticae Debrecen 54, 531–552 (1999)
3. Almeida, J., Costa, J.C., Zeitoun, M.: Pointlike sets with respect to R and J.
Journal of Pure and Applied Algebra 212(3), 486–499 (2008)
4. Almeida, J., Zeitoun, M.: The pseudovariety J is hyperdecidable. RAIRO Informa-
tique Théorique et Applications 31(5), 457–482 (1997)
5. Arenas, M., Conca, S., Pérez, J.: Counting beyond a yottabyte, or how SPARQL
1.1 property paths will prevent the adoption of the standard. In: World Wide Web
Conference, pp. 629–638 (2012)
6. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F.: Extensi-
ble Markup Language XML 1.0, 5th edn. Tech. report, W3C Recommendation
(November 2008), http://www.w3.org/TR/2008/REC-xml-20081126/
7. Ehrenfeucht, A., Haussler, D., Rozenberg, G.: On regularity of context-free lan-
guages. Theoretical Computer Science 27(3), 311–332 (1983)
8. Gao, S., Sperberg-McQueen, C.M., Thompson, H.S., Mendelsohn, N., Beech, D.,
Maloney, M.: W3C XML Schema Definition Language (XSD) 1.1 part 1. Tech.
report, W3C (2009), http://www.w3.org/TR/2009/CR-xmlschema11-1-20090430/
9. Gelade, W., Neven, F.: Succinctness of pattern-based schema languages for XML.
Journal of Computer and System Sciences 77(3), 505–519 (2011)
10. Harris, S., Seaborne, A.: SPARQL 1.1 query language. Tech. report, W3C (2010)
11. Henckell, K., Rhodes, J., Steinberg, B.: Aperiodic pointlikes and beyond. Interna-
tional Journal of Algebra and Computation 20(2), 287–305 (2010)
12. Higman, G.: Ordering by divisibility in abstract algebras. Proceedings of the Lon-
don Mathematical Society s3-2(1), 326–336 (1952)
13. Kasneci, G., Schwentick, T.: The complexity of reasoning about pattern-based
XML schemas. In: Principles of Database Systems, pp. 155–164 (2007)
14. Losemann, K., Martens, W.: The complexity of evaluating path expressions in
SPARQL. In: Principles of Database Systems, pp. 101–112 (2012)
15. Maier, D.: The complexity of some problems on subsequences and supersequences.
Journal of the ACM 25(2), 322–336 (1978)
16. Martens, W., Neven, F., Niewerth, M., Schwentick, T.: Developing and analyzing
XSDs through BonXai. Proc. of the VLDB Endowment 5(12), 1994–1997 (2012)
17. Martens, W., Neven, F., Niewerth, M., Schwentick, T.: BonXai: Combining the
simplicity of DTD with the expressiveness of XML Schema (manuscript 2013)
18. Martens, W., Neven, F., Schwentick, T., Bex, G.J.: Expressiveness and complexity
of XML Schema. ACM Trans. on Database Systems 31(3), 770–813 (2006)
19. Pérez, J., Arenas, M., Gutierrez, C.: nSPARQL: A navigational language for RDF.
Journal of Web Semantics 8(4), 255–270 (2010)
20. Simon, I.: Hierarchies of Events with Dot-Depth One. PhD thesis, Dep. of Applied
Analysis and Computer Science, University of Waterloo, Canada (1972)
21. Simon, I.: Piecewise testable events. In: Brakhage, H. (ed.) GI Conference on Au-
tomata Theory and Formal Languages. LNCS, vol. 33, pp. 214–222. Springer, Hei-
delberg (1975)
22. Steinberg, B.: A delay theorem for pointlikes. Semigroup Forum 63, 281–304 (2001)
23. Stern, J.: Characterizations of some classes of regular events. Theoretical Computer
Science 35, 17–42 (1985)
24. Stern, J.: Complexity of some problems from the theory of automata. Information
and Control 66(3), 163–176 (1985)
25. Trahtman, A.N.: Piecewise and local threshold testability of DFA. In: Freivalds,
R. (ed.) FCT 2001. LNCS, vol. 2138, pp. 347–358. Springer, Heidelberg (2001)
26. van Rooijen, L., Zeitoun, M.: The separation problem for regular languages by
piecewise testable languages (March 8, 2013), http://arxiv.org/abs/1303.2143
On the Complexity of Verifying Regular
Properties on Flat Counter Systems

Stéphane Demri^{2,3}, Amit Kumar Dhar^1, and Arnaud Sangnier^1

^1 LIAFA, Univ Paris Diderot, Sorbonne Paris Cité, CNRS, France
^2 New York University, USA
^3 LSV, CNRS, France

Abstract. Among the approximation methods for the verification of
counter systems, one of them consists in model-checking their flat unfold-
ings. Unfortunately, the complexity characterization of model-checking
problems for such operational models is not always well studied except
for reachability queries or for Past LTL. In this paper, we characterize
the complexity of model-checking problems on flat counter systems for
the specification languages including first-order logic, linear mu-calculus,
infinite automata, and related formalisms. Our results span different
complexity classes (mainly from PTime to PSpace) and they apply to
languages in which arithmetical constraints on counter values are sys-
tematically allowed. As far as the proof techniques are concerned, we
provide a uniform approach that focuses on the main issues.

1 Introduction

Flat Counter Systems. Counter systems, finite-state automata equipped with
program variables (counters) interpreted over non-negative integers, are known
to be ubiquitous in formal verification. Since counter systems can actually sim-
ulate Turing machines [17], it is undecidable to check the existence of a run
satisfying a given (reachability, temporal, etc.) property. However it is possi-
ble to approximate the behavior of counter systems by looking at a subclass of
witness runs for which an analysis is feasible. A standard method consists in con-
sidering a finite union of path schemas for abstracting the whole bunch of runs,
as done in [14]. More precisely, given a finite set of transitions Δ, a path schema
is an ω-regular expression over Δ of the form L = p1 (l1 )∗ · · · pk−1 (lk−1 )∗ pk (lk )ω
where both pi ’s and li ’s are paths in the control graph and moreover, the li ’s
are loops. A path schema defines a set of infinite runs that respect a sequence of
transitions that belongs to L. We write Runs(c0 , L) to denote such a set of runs
starting at the initial configuration c0 whereas Reach(c0 , L) denotes the set of
configurations occurring in the runs of Runs(c0 , L). A counter system is flattable
whenever the set of configurations reachable from c0 is equal to Reach(c0, L) for
some finite union of path schemas L.

Work partially supported by the EU Seventh Framework Programme under grant
agreement No. PIOF-GA-2011-301166 (DATAVERIF).
A version with proofs is available as [5].
F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 162–173, 2013.
© Springer-Verlag Berlin Heidelberg 2013

Similarly, a flat counter system, a system
in which each control state belongs to at most one simple loop, verifies that the
set of runs from c0 is equal to Runs(c0 , L) for some finite union of path schemas
L. Obviously, flat counter systems are flattable. Moreover, reachability sets of
flattable counter systems are known to be Presburger-definable, see e.g. [1,3,7].
That is why, verification of flat counter systems belongs to the core of methods
for model-checking arbitrary counter systems and it is desirable to character-
ize the computational complexity of model checking problems on this kind of
systems (see e.g. results about loops in [2]). Decidability results for verifying
safety and reachability properties on flat counter systems have been obtained
in [3,7,2]. For the verification of temporal properties, it is much more difficult to
get sharp complexity characterization. For instance, it is known that verifying
flat counter systems with CTL enriched with arithmetical constraints is decid-
able [6] whereas it is only NP-complete with Past LTL [4] (NP-completeness
already holds with flat Kripke structures [10]).
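The shape of a path schema p1 (l1)∗ · · · pk (lk)^ω can be illustrated with ordinary regular expressions over transition names. This is our own simplification (segments as strings of one-letter transition names; the paper works with actual transitions, guards and updates): a finite prefix of a run respecting the schema matches p1 (l1)∗ · · · pk (lk)∗.

```python
import re

def schema_prefix_regex(segments):
    """segments = [(p1, l1), ..., (pk, lk)]; prefixes match p1 (l1)* ... pk (lk)*."""
    parts = [re.escape(p) + "(?:%s)*" % re.escape(l) for p, l in segments]
    return re.compile("".join(parts) + "$")

schema = schema_prefix_regex([("a", "b"), ("c", "d")])  # a b* c d^ω
assert schema.match("abbbcdd")
assert not schema.match("acbd")
```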

Our Motivations. Our objectives are to provide a thorough classification of
model-checking problems on flat counter systems when linear-time properties
are considered. So far complexity is known with Past LTL [4] but even the de-
cidability status with linear μ-calculus is unknown. Herein, we wish to consider
several formalisms specifying linear-time properties (FO, linear μ-calculus, in-
finite automata) and to determine the complexity of model-checking problems
on flat counter systems. Note that FO is as expressive as Past LTL but much
more concise whereas linear μ-calculus is strictly more expressive than Past LTL,
which motivates the choice for these formalisms dealing with linear properties.

Our Contributions. We characterize the computational complexity of model-
checking problems on flat counter systems for several prominent linear-time
specification languages whose alphabets are related to atomic propositions but
also to linear constraints on counter values. We obtain the following results:

– The problem of model-checking first-order formulae on flat counter
systems is PSpace-complete (Theorem 9). Note that model-checking
classical first-order formulae over arbitrary Kripke structures is already known
to be non-elementary. However the flatness assumption allows to drop the
complexity to PSpace even though linear constraints on counter values are
used in the specification language.
– Model-checking linear μ-calculus formulae on flat counter systems
is PSpace-complete (Theorem 14). Not only is linear μ-calculus known
to be more expressive than first-order logic (or than Past LTL), but the
decidability status of the problem on flat counter systems was also open [6].
So, we establish decidability and provide a complexity characterization.
– Model-checking Büchi automata over flat counter systems is NP-
complete (Theorem 12).
– Global model-checking is possible for all the above mentioned for-
malisms (Corollary 16).

2 Preliminaries
2.1 Counter Systems
Counter constraints are defined below as a subclass of Presburger formulae whose
free variables are understood as counters. Such constraints are used to define
guards in counter systems but also to define arithmetical constraints in temporal
formulae. Let C = {x1 , x2 , . . .} be a countably infinite set of counters (variables
interpreted over non-negative integers) and AT = {p1, p2, . . .} be a countably
infinite set of propositional variables (abstract properties about program points).
We write Cn to denote the restriction of C to {x1, x2, . . . , xn}. The set of guards
g using the counters from Cn, written G(Cn), is made of Boolean combinations
of atomic guards of the form a1 · x1 + · · · + an · xn ∼ b where the ai's are in Z, b ∈ N
and ∼ ∈ {=, ≤, ≥, <, >}. For g ∈ G(Cn) and a vector v ∈ N^n, we say that v
satisfies g, written v |= g, if the formula obtained by replacing each xi by v[i]
holds. For n ≥ 1, a counter system of dimension n (shortly a counter system)
S is a tuple ⟨Q, Cn, Δ, l⟩ where: Q is a finite set of control states, l : Q → 2^AT
is a labeling function, and Δ ⊆ Q × G(Cn) × Z^n × Q is a finite set of transitions
labeled by guards and updates. As usual, to a counter system S = ⟨Q, Cn, Δ, l⟩,
we associate a labeled transition system TS(S) = ⟨C, →⟩ where C = Q × N^n is
the set of configurations and → ⊆ C × Δ × C is the transition relation defined
by: ⟨⟨q, v⟩, δ, ⟨q′, v′⟩⟩ ∈ → (also written ⟨q, v⟩ −δ→ ⟨q′, v′⟩) iff δ = ⟨q, g, u, q′⟩ ∈ Δ,
v |= g and v′ = v + u. Note that in such a transition system, the counter values
are non-negative since C = Q × N^n.
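The transition relation just defined can be sketched as a one-step successor function. This is our own encoding (guards as Python predicates over the counter vector, updates as integer tuples):

```python
def successors(config, transitions):
    """Configurations reachable in one step: the guard must hold and the
    updated counter values must stay in N^n."""
    q, v = config
    out = []
    for (p, guard, update, q2) in transitions:
        if p == q and guard(v):
            v2 = tuple(x + u for x, u in zip(v, update))
            if all(x >= 0 for x in v2):  # counters are non-negative
                out.append((q2, v2))
    return out

# Two counters; a self-loop on state "q" incrementing x1 while x1 < 3.
delta = [("q", lambda v: v[0] < 3, (1, 0), "q")]
assert successors(("q", (0, 0)), delta) == [("q", (1, 0))]
assert successors(("q", (3, 0)), delta) == []
```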
Given an initial configuration c0 ∈ Q × N^n, a run ρ starting from c0 in S
is an infinite path in the associated transition system TS(S), denoted as
ρ := c0 −δ0→ c1 −δ1→ · · · −δm−1→ cm −δm→ · · · where ci ∈ Q × N^n and δi ∈ Δ for all i ∈ N. We

say that a counter system is flat if every node in the underlying graph belongs
to at most one simple cycle (a cycle being simple if no edge is repeated twice
in it) [3,14,4]. We denote by CFS the class of flat counter systems. A Kripke
structure S can be seen as a counter system without counters and is denoted
by ⟨Q, Δ, l⟩ where Δ ⊆ Q × Q and l : Q → 2^AT. Standard notions on counter
systems, such as configuration, run or flatness, naturally apply to Kripke structures.
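Flatness of a control graph can be tested directly. The sketch below is our own: a system is flat exactly when, inside each strongly connected component, every state has at most one outgoing transition staying in that component (two such transitions would put the state on two distinct simple cycles).

```python
def is_flat(states, transitions):
    """transitions: list of (source, target) control-graph edges."""
    # Reachability via a simple fixed point (fine for small systems).
    reach = {s: {s} for s in states}
    changed = True
    while changed:
        changed = False
        for (p, q) in transitions:
            new = reach[q] - reach[p]
            if new:
                reach[p] |= new
                changed = True
    same_scc = lambda p, q: q in reach[p] and p in reach[q]
    for s in states:
        internal_out = [(p, q) for (p, q) in transitions
                        if p == s and same_scc(s, q)]
        if len(internal_out) > 1:
            return False  # two simple cycles pass through s
    return True

# A flat system: one loop on state 1; a non-flat one: two loops through state 0.
assert is_flat([0, 1], [(0, 1), (1, 1)])
assert not is_flat([0, 1], [(0, 0), (0, 1), (1, 0)])
```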

2.2 Model-Checking Problem


We now define our main model-checking problem on flat counter systems param-
eterized by a specification language L. First, we need to introduce the notion
of constrained alphabet whose letters should be understood as Boolean combi-
nations of atomic formulae (details follow). A constrained alphabet is a triple of
the form ⟨at, agn, Σ⟩ where at is a finite subset of AT, agn is a finite subset of
atomic guards from G(Cn) and Σ is a subset of 2^{at∪agn}. The size of a constrained
alphabet is given by size(⟨at, agn, Σ⟩) = card(at) + card(agn) + card(Σ) where
card(X) denotes the cardinality of the set X. Of course, any standard alphabet
(finite set of letters) can be easily viewed as a constrained alphabet (by ignoring
the structure of letters). Given an infinite run ρ := ⟨q0, v0⟩ → ⟨q1, v1⟩ · · · from
a counter system with n counters and an ω-word w = a0 a1 · · · ∈ Σ^ω over a
constrained alphabet, we say that ρ satisfies w, written ρ |= w, whenever for
every i ≥ 0 we have: p ∈ l(qi) for every p ∈ (ai ∩ at) and p ∉ l(qi) for every
p ∈ (at \ ai); and vi |= g for every g ∈ (ai ∩ agn) and vi ̸|= g for every g ∈ (agn \ ai).
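The satisfaction check ρ |= w can be sketched on finite prefixes. This encoding is our own: a letter is a pair (props, guards) standing for ai ∩ at and ai ∩ agn, and everything in at or agn outside the letter must be false at that position.

```python
def prefix_satisfies(run, word, labeling, at, guards):
    """run: list of (state, counter vector); word: list of (props, guard set);
    guards are Python predicates over counter vectors."""
    for (q, v), (props, gds) in zip(run, word):
        if labeling[q] & at != props:           # p ∈ l(q) iff p ∈ a_i ∩ at
            return False
        if {g for g in guards if g(v)} != gds:  # v |= g iff g ∈ a_i ∩ ag_n
            return False
    return True

positive = lambda v: v[0] > 0
at = {"p"}
run = [("q0", (0,)), ("q1", (1,))]
word = [(set(), set()), ({"p"}, {positive})]
assert prefix_satisfies(run, word, {"q0": set(), "q1": {"p"}}, at, {positive})
```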
A specification language L over a constrained alphabet ⟨at, agn, Σ⟩ is a set
of specifications A, each of them defining a set L(A) of ω-words over Σ. We will
also sometimes consider specification languages over (unconstrained) standard
finite alphabets (as usually defined). We now define the model-checking problem
over flat counter systems with specification language L (written MC(L, CFS)):
it takes as input a flat counter system S, a configuration c and a specification A
from L and asks whether there is a run ρ starting at c and w ∈ Σ^ω in L(A) such
that ρ |= w. We write ρ |= A whenever there is w ∈ L(A) such that ρ |= w.

2.3 A Bunch of Specification Languages


Infinite Automata. Now let us define the specification languages BA and ABA,
respectively with nondeterministic Büchi automata and with alternating Büchi
automata. We consider here transitions labeled by Boolean combinations of
atoms from at ∪ ag_n. A specification A in ABA is a structure of the form
⟨Q, E, q0, F⟩ where E is a finite subset of Q × B(at ∪ ag_n) × B⁺(Q) and B⁺(Q)
denotes the set of positive Boolean combinations built over Q. Specification A is
a concise representation for the alternating Büchi automaton B_A = ⟨Q, δ, q0, F⟩
where δ : Q × 2^(at ∪ ag_n) → B⁺(Q) and δ(q, a) =_def ⋁_{⟨q,ψ,ψ′⟩ ∈ E, a |= ψ} ψ′. We say
that A is over the constrained alphabet ⟨at, ag_n, Σ⟩ whenever, for all edges
⟨q, ψ, ψ′⟩ ∈ E, ψ holds at most for letters from Σ (i.e. the transition relation
of B_A belongs to Q × Σ → B⁺(Q)). We have then L(A) = L(B_A) with the usual
acceptance criterion for alternating Büchi automata. The specification language
BA is defined in a similar way using Büchi automata. Hence the transition relation
E of A = ⟨Q, E, q0, F⟩ in BA is included in Q × B(at ∪ ag_n) × Q and the
transition relation of the Büchi automaton B_A is then included in Q × 2^(at ∪ ag_n) × Q.

Linear-Time Temporal Logics. Below, we present briefly three logical languages


that are tailored to specify runs of counter systems, namely ETL (see e.g. [25,19]),
Past LTL (see e.g. [21]) and linear μ-calculus (or μTL), see e.g. [23]. A specification
in one of these logical specification languages is just a formula. The
differences with their standard versions, in which models are ω-sequences of
propositional valuations, are listed below: models are infinite runs of counter
systems; atomic formulae are either propositional variables in AT or atomic
guards; given an infinite run ρ := ⟨q0, v0⟩ → ⟨q1, v1⟩ → ⋯, we will have, by definition,
ρ, i |= p iff p ∈ l(qi) and ρ, i |= g iff vi |= g. The temporal operators, fixed point
operators and automata-based operators are then interpreted as usual. A formula φ
built over the propositional variables in at and the atomic guards in ag_n defines
a language L(φ) over ⟨at, ag_n, Σ⟩ with Σ = 2^(at ∪ ag_n). There is no need to recall
here the syntax and semantics of ETL, Past LTL and linear μ-calculus since with
their standard definitions and with the above-mentioned differences, their vari-
ants for counter systems are defined unambiguously (see a lengthy presentation
166 S. Demri, A.K. Dhar, and A. Sangnier

of Past LTL for counter systems in [4]). However, we may recall a few definitions
on-the-fly if needed. Herein the size of formulae is understood as the number of
subformulae.

Example. In the adjoining figure, we present a flat counter system with two counters
and with a labeling function l such that l(q3) = {p, q} and l(q5) = {p}. We would
like to characterize the set of configurations c with control state q1 such that
there is some infinite run from c for which, after some position i, all future
positions j with i ≡ j (mod 2) satisfy that p holds and the first counter is equal to the
second counter.
[Figure: a flat counter system with control states q1, ..., q5; transitions carry guards (the trivial guard, g(x1, x2) or g′(x1, x2)) and counter updates such as (0, −2), (−3, 0), (1, 0) and (0, 1).]

This can be specified in linear μ-calculus using as atomic formulae either propositional
variables or atomic guards. The corresponding formula in linear μ-calculus is:
μz1.(X(νz2.(p ∧ (x1 − x2 = 0) ∧ XXz2)) ∨ Xz1). Clearly, such a position i occurs in any run
after reaching the control state q3 with the same value for both counters. Hence, the
configurations ⟨q1, v⟩ satisfying these properties have counter values v ∈ N² verifying
the Presburger formula below:

∃y (((x1 = 3y + x2) ∧ (∀y′ g(x2 + y′, x2 + y′) ∧ g′(x2 + y′, x2 + y′ + 1))) ∨
    ((x2 = 2y + x1) ∧ (∀y′ g(x1 + y′, x1 + y′) ∧ g′(x1 + y′, x1 + y′ + 1))))


In the paper, we shall establish how to compute systematically such formulae
(even without universal quantifications) for different specification languages.

3 Constrained Path Schemas


In [4] we introduced minimal path schemas for flat counter systems. Now, we
introduce constrained path schemas that are more abstract than path schemas.
A constrained path schema cps is a pair ⟨p1(l1)* ⋯ pk−1(lk−1)* pk(lk)^ω, φ(x1,
..., xk−1)⟩ where the first component is an ω-regular expression over a constrained
alphabet ⟨at, ag_n, Σ⟩ with the pi's and li's in Σ*, and φ(x1, ..., xk−1) ∈ G(C_{k−1}).
Each constrained path schema defines a language L(cps) ⊆ Σ^ω given by L(cps) =_def
{p1(l1)^{n1} ⋯ pk−1(lk−1)^{n_{k−1}} pk(lk)^ω : φ(n1, ..., n_{k−1}) holds true}. The size of
cps, written size(cps), is equal to 2k + len(p1 l1 ⋯ pk−1 lk−1 pk lk) + size(φ(x1, ...,
xk−1)). Observe that, in general, constrained path schemas are defined over a
constrained alphabet, and so are the associated specifications unless stated
otherwise.
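For intuition, the language of a constrained path schema can be sketched concretely. The encoding below (letters as one-character strings, the Presburger guard as an arbitrary Python predicate) is our own illustration, not the paper's:

```python
def cps_word_prefix(p, l, counts, unfold_last=3):
    """Finite prefix p1 l1^n1 ... p_{k-1} l_{k-1}^{n_{k-1}} pk lk^unfold_last
    of the ultimately periodic word described by a path schema, where p and l
    are lists of segments (each a list of letters)."""
    k = len(p)
    w = []
    for i in range(k - 1):
        w += p[i] + l[i] * counts[i]
    w += p[k - 1] + l[k - 1] * unfold_last   # truncated unfolding of (lk)^omega
    return w

def in_cps_language(phi, counts):
    """A word p1 l1^n1 ... pk lk^omega belongs to L(cps) iff the loop counts
    satisfy the guard phi(n1, ..., n_{k-1})."""
    return phi(*counts)

# Toy schema with k = 2: p1 (l1)* p2 (l2)^omega and guard x1 >= 2.
p = [["a"], ["b"]]
l = [["a"], ["c"]]
phi = lambda n1: n1 >= 2
print(in_cps_language(phi, (3,)))       # True: 3 >= 2
print(cps_word_prefix(p, l, (3,)))      # ['a', 'a', 'a', 'a', 'b', 'c', 'c', 'c']
```

Note that membership in L(cps) only constrains the loop counts through the guard, exactly as in the definition above.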
Let us consider below the three decision problems on constrained path schemas
that are useful in the rest of the paper. The consistency problem checks whether
L(cps) is non-empty; it amounts to verifying the satisfiability status of the second
component. Let us recall the result below.
Theorem 1. [20] There are polynomials pol1(·), pol2(·) and pol3(·) such
that for every guard g, say in G(C_n), of size N, we have (I) there exist B ⊆
[0, 2^{pol1(N)}]^n and P1, ..., Pα ∈ [0, 2^{pol1(N)}]^n with α ≤ 2^{pol2(N)} such that for every
y ∈ N^n, y |= g iff there are b ∈ B and a ∈ N^α such that y = b + a[1]P1 +
⋯ + a[α]Pα; (II) if g is satisfiable, then there is y ∈ [0, 2^{pol3(N)}]^n such that y |= g.
Consequently, the consistency problem is NP-complete (the hardness being obtained
by reducing from SAT). The intersection non-emptiness problem, clearly related
to the model-checking problem, takes as input a constrained path schema
cps and a specification A ∈ L and asks whether L(cps) ∩ L(A) ≠ ∅. Typically,
for several specification languages L, we establish the existence of a computable
map fL (at most exponential) such that, whenever L(cps) ∩ L(A) ≠ ∅,
there is p1(l1)^{n1} ⋯ pk−1(lk−1)^{n_{k−1}} pk(lk)^ω belonging to the intersection for
which each ni is bounded by fL(A, cps). This motivates the introduction of the
membership problem for L, which takes as input a constrained path schema cps,
a specification A ∈ L and n1, ..., nk−1 ∈ N, and checks whether p1(l1)^{n1} ⋯
pk−1(lk−1)^{n_{k−1}} pk(lk)^ω ∈ L(A). Here the ni's are understood to be encoded in
binary and we do not require them to satisfy the constraint of the path schema.
Since constrained path schemas are abstractions of path schemas used in [4],
from this work we can show that runs from flat counter systems can be repre-
sented by a finite set of constrained path schemas as stated below.
Theorem 2. Let at be a finite set of atomic propositions, ag_n be a finite set of
atomic guards from G(C_n), S be a flat counter system whose atomic propositions
and atomic guards are from at ∪ ag_n, and c0 = ⟨q0, v0⟩ be an initial configuration.
One can construct in exponential time a set X of constrained path schemas
such that: (I) Each constrained path schema cps in X has an alphabet of the form
⟨at, ag_n, Σ⟩ (Σ may vary) and cps is of polynomial size. (II) Checking whether a
constrained path schema belongs to X can be done in polynomial time. (III) For
every run ρ from c0, there is a constrained path schema cps in X and w ∈ L(cps)
such that ρ |= w. (IV) For every constrained path schema cps in X and for every
w ∈ L(cps), there is a run ρ from c0 such that ρ |= w.
In order to take advantage of Theorem 2 for the verification of flat counter systems,
we need to introduce an additional property: L has the nice subalphabet
property iff for all specifications A ∈ L over ⟨at, ag_n, Σ⟩ and for all constrained
alphabets ⟨at, ag_n, Σ′⟩, one can build a specification A′ over ⟨at, ag_n, Σ′⟩ in polynomial
time in the sizes of A and ⟨at, ag_n, Σ′⟩ such that L(A) ∩ (Σ′)^ω = L(A′).
We need this property to build, from A and a constrained path schema over
⟨at, ag_n, Σ′⟩, the specification A′. This property will also be used to transform a
specification over ⟨at, ag_n, Σ⟩ into a specification over the finite alphabet Σ′.
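For BA, the nice subalphabet property is particularly transparent: dropping every transition labelled outside Σ′ restricts the language to (Σ′)^ω. A minimal sketch, with our own dictionary encoding of the transition relation (not the paper's notation):

```python
def restrict_to_subalphabet(delta, sigma_prime):
    """Given a Büchi automaton transition relation delta mapping
    (state, letter) to a set of successors, keep only the transitions
    labelled by letters of sigma_prime; with unchanged initial and
    accepting states, the new automaton recognises L(A) ∩ (Σ')^ω."""
    return {(s, a): ts for (s, a), ts in delta.items() if a in sigma_prime}

delta = {(0, "a"): {0}, (0, "b"): {1}, (1, "a"): {1}}
print(restrict_to_subalphabet(delta, {"a"}))  # {(0, 'a'): {0}, (1, 'a'): {1}}
```

The construction is clearly polynomial (in fact linear) in the size of the automaton, as the property requires.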
Lemma 3. BA, ABA, μTL, ETL, Past LTL have the nice subalphabet property.
The abstract Algorithm 1, which takes as input S, a configuration c0 and A ∈ L,
solves MC(L, CF S) by performing the following steps: (1) Guess cps
over ⟨at, ag_n, Σ′⟩ in X; (2) Build A′ such that L(A) ∩ (Σ′)^ω = L(A′); (3) Return
whether L(cps) ∩ L(A′) ≠ ∅. Thanks to Theorem 2, the first guess can be performed
in polynomial time and, with the nice subalphabet property, we can build A′ in
polynomial time too. This allows us to conclude the following lemma, which is a
consequence of the correctness of the above algorithm (see [5]).
Lemma 4. If L has the nice subalphabet property and its intersection non-emptiness
problem is in NP [resp. PSpace], then MC(L, CF S) is in NP [resp. PSpace].
We know that the membership problem for Past LTL is in PTime and the inter-
section non-emptiness problem is in NP (as a consequence of [4, Theorem 3]). By
Lemma 4, we are able to conclude the main result from [4]: MC(PastLTL, CF S)
is in NP. This is not surprising at all since in this paper we present a general
method for different specification languages that rests on Theorem 2 (a conse-
quence of technical developments from [4]).

4 Taming First-Order Logic and Flat Counter Systems

In this section, we consider first-order logic as a specification language. By


Kamp's Theorem, first-order logic has the same expressive power as Past LTL
and hence model-checking first-order logic over flat counter systems is decidable
too [4]. However, this does not provide us with an optimal upper bound for the
model-checking problem. In fact, it is known that the satisfiability problem for
first-order logic formulae is non-elementary and consequently the translation into
Past LTL leads to a significant blow-up in the size of the formula.

4.1 First-Order Logic in a Nutshell

For defining first-order logic formulae, we consider a countably infinite set of


variables Z and a finite (unconstrained) alphabet Σ. The syntax of first-order
logic over atomic propositions, FO_Σ, is then given by the following grammar:
φ ::= a(z) | S(z, z′) | z < z′ | z = z′ | ¬φ | φ ∧ φ | ∃z φ(z) where a ∈ Σ and
z, z′ ∈ Z. For a formula φ, we will denote by free(φ) its set of free variables, defined
as usual. A formula with no free variable is called a sentence. As usual,
we define the quantifier height qh(φ) of a formula φ as the maximum nesting
depth of the operator ∃ in φ. Models for FO_Σ are ω-words over the alphabet
Σ and variables are interpreted by positions in the word. A position assignment
is a partial function f : Z → N. Given a model w ∈ Σ^ω, an FO_Σ formula φ and
a position assignment f such that f(z) ∈ N for every variable z ∈ free(φ), the
satisfaction relation |=_f is defined as usual. Given an FO_Σ sentence φ, we write
w |= φ when w |=_f φ for an arbitrary position assignment f. The language of
ω-words w over Σ associated to a sentence φ is then L(φ) = {w ∈ Σ^ω | w |= φ}.
For n ∈ N, we define the equivalence relation ≈_n between ω-words over Σ as:
w ≈_n w′ when for every sentence φ with qh(φ) ≤ n, w |= φ iff w′ |= φ.
FO on CS. FO formulae interpreted over infinite runs of counter systems are
defined as FO formulae over a finite alphabet, except that atomic formulae of the
form a(z) are replaced by atomic formulae of the form p(z) or g(z), where p is
an atomic proposition and g is an atomic guard from G(C_n). Hence, a formula φ built
over atomic formulae from a finite set at of atomic propositions and from a finite
set ag_n of atomic guards from G(C_n) defines a specification for the constrained
alphabet ⟨at, ag_n, 2^(at ∪ ag_n)⟩. Note that the alphabet can be of exponential size in
the size of φ and that p(z) actually corresponds to the disjunction ⋁_{p ∈ a} a(z)
over the letters a containing p.
Lemma 5. FO has the nice subalphabet property.
We have taken time to properly define first-order logic for counter systems (whose
models are runs of counter systems, see also Section 2.2) but below, we will
mainly operate with FOΣ over a standard (unconstrained) alphabet. Let us
state our first result about FOΣ which allows us to bound the number of times
each loop is taken in a constrained path schema in order to satisfy a formula.
We provide a stuttering theorem for FO_Σ formulae, in the spirit of what is done in [4]
for Past LTL and in [12] for LTL. The lengthy proof of Theorem 6 uses Ehrenfeucht-Fraïssé games (see [5]).
Theorem 6 (Stuttering Theorem). Let w = w1 s^M w2 and w′ = w1 s^{M+1} w2 ∈ Σ^ω
be such that N ≥ 1, M > 2^{N+1} and s ∈ Σ⁺. Then w ≈_N w′.
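A finite-word analogue of the stuttering bound can be checked empirically. The sketch below is only an illustration (it does not reproduce the Ehrenfeucht-Fraïssé argument); the property and encoding are our own choices:

```python
def every_a_followed_by_b(word):
    # A property of quantifier height 2:  forall z. a(z) -> exists z'. z < z' and b(z')
    return all("b" in word[i + 1:] for i, c in enumerate(word) if c == "a")

N = 2
M = 2 ** (N + 1) + 1            # strictly above the bound of Theorem 6
w1, s, w2 = "a", "ab", "b"
w      = w1 + s * M + w2        # block s repeated M times
w_plus = w1 + s * (M + 1) + w2  # one extra repetition
print(every_a_followed_by_b(w) == every_a_followed_by_b(w_plus))  # True
```

Once the loop is pumped beyond 2^{N+1}, adding one more copy of s cannot change the truth of any sentence of quantifier height at most N; the script only illustrates this on a single property.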

4.2 Model-Checking Flat Counter Systems with FO


Let us characterize the complexity of MC(FO, CF S). First, we will state the
complexity of the intersection non-emptiness problem. Given a constrained path
schema cps and a FO sentence ψ, Theorem 1 provides two polynomials pol1 and
pol2 to represent succinctly the solutions of the guard in cps. Theorem 6 allows
us to bound the number of times loops are visited. Consequently, we can compute
a value fFO (ψ, cps) exponential in the size of ψ and cps, as explained earlier,
which allows us to find a witness for the intersection non-emptiness problem
where each loop is taken a number of times smaller than fFO (ψ, cps).
Lemma 7. Let cps be a constrained path schema and ψ be an FO_Σ sentence.
Then L(cps) ∩ L(ψ) is non-empty iff there is an ω-word in L(cps) ∩ L(ψ) in
which each loop is taken at most 2^{(qh(ψ)+2)+pol1(size(cps))+pol2(size(cps))} times.
Hence fFO(ψ, cps) has the value 2^{(qh(ψ)+2)+(pol1+pol2)(size(cps))}. Furthermore,
checking whether L(cps) ∩ L(ψ) is non-empty amounts to guessing some n ∈
[0, 2^{(qh(ψ)+2)+pol1(size(cps))+pol2(size(cps))}]^{k−1} and verifying whether w = p1(l1)^{n[1]}
⋯ pk−1(lk−1)^{n[k−1]} pk(lk)^ω ∈ L(cps) ∩ L(ψ). Checking if w ∈ L(cps) can be
done in polynomial time in (qh(ψ) + 2) + pol1(size(cps)) + pol2(size(cps)) (and
therefore in polynomial time in size(ψ) + size(cps)) since this amounts to verifying
whether n |= φ. Checking whether w ∈ L(ψ) can be done in exponential
space in size(ψ) + size(cps) by using [15, Proposition 4.2]. Hence, this leads to a
nondeterministic exponential space decision procedure for the intersection non-
emptiness problem but it is possible to get down to nondeterministic polynomial
space using the succinct representation of constrained path schema as stated by


Lemma 8 below for which the lower bound is deduced by the fact that model-
checking ultimately periodic words with first-order logic is PSpace-hard [15].
Lemma 8. The membership problem with FO_Σ is PSpace-complete.
Note that the membership problem for FO is stated for an unconstrained alphabet
but, due to the nice subalphabet property of FO, the same holds for constrained alphabets
since, given an FO formula over ⟨at, ag_n, Σ⟩, we can build in polynomial time an
FO formula over ⟨at, ag_n, Σ′⟩ from which we can also build in polynomial time
a formula of FO_{Σ′} (where Σ′ is for instance the alphabet labelling a constrained
path schema). We can now state the main results concerning FO.
Theorem 9. (I) The intersection non-emptiness problem with FO is PSpace-
complete. (II) MC(FO, CF S) is PSpace-complete. (III) Model-checking flat
Kripke structures with FO is PSpace-complete.
Proof. (I) is a consequence of Lemma 7 and Lemma 8. We obtain (II) from (I)
by applying Lemma 4 and Lemma 5. (III) is obtained by observing that flat
Kripke structures form a subclass of flat counter systems. To obtain the lower
bound, we use that model-checking ultimately periodic words with first-order
logic is PSpace-hard [15]. ⊓⊔

5 Taming Linear μ-calculus and Other Languages


We now consider several specification languages defining ω-regular properties
on atomic propositions and arithmetical constraints. First, we deal with BA by
establishing Theorem 10 and then deduce results for ABA, ETL and μTL.
Theorem 10. Let B = ⟨Q, Σ, q0, Δ, F⟩ be a Büchi automaton (with standard
definition) and cps = ⟨p1(l1)* ⋯ pk−1(lk−1)* pk(lk)^ω, φ(x1, ..., xk−1)⟩ be a constrained
path schema over Σ. We have L(cps) ∩ L(B) ≠ ∅ iff there exists y ∈
[0, 2^{pol1(size(cps))} + 2·card(Q)^k × 2^{pol1(size(cps))+pol2(size(cps))}]^{k−1} such that p1(l1)^{y[1]}
⋯ pk−1(lk−1)^{y[k−1]} pk(lk)^ω ∈ L(B) ∩ L(cps) (pol1 and pol2 are from Theorem 1).
Theorem 10 can be viewed as a pumping lemma involving an automaton and
semilinear sets. Thanks to it, we obtain an exponential bound for the map fBA,
namely fBA(B, cps) = 2^{pol1(size(cps))} + 2·card(Q)^{size(cps)} × 2^{pol1(size(cps))+pol2(size(cps))}.
So checking L(cps) ∩ L(B) ≠ ∅ amounts to guessing some n ∈ [0, fBA(B, cps)]^{k−1}
and verifying whether the word w = p1(l1)^{n[1]} ⋯ pk−1(lk−1)^{n[k−1]} pk(lk)^ω ∈ L(cps) ∩ L(B).
Checking whether w ∈ L(cps) can be done in polynomial time in size(B) + size(cps) since this
amounts to checking n |= φ. Checking whether w ∈ L(B) can also be done in polynomial
time by using the results from [15]. Indeed, w can be encoded in polynomial
time as a pair of straight-line programs and by [15, Corollary 5.4] membership can then
be decided in polynomial time. So the membership problem for Büchi automata
is in PTime. Using that BA has the nice subalphabet property and that we
can create a polynomial-size Büchi automaton from a given BA specification and
cps, we get the following result.
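The membership check of an ultimately periodic word in a Büchi automaton can also be sketched directly (without the succinct straight-line-program encoding of [15], hence exponential in the binary encoding of the loop counts). The automaton encoding below is our own illustration:

```python
def run_word(delta, states, word):
    """Successor set after nondeterministically reading a finite word."""
    cur = set(states)
    for a in word:
        cur = {t for s in cur for t in delta.get((s, a), ())}
    return cur

def nba_accepts_upword(Q, delta, q0, F, u, v):
    """Does the Büchi automaton accept u v^omega?  Look for a state q
    reachable after u v^j that comes back to itself by reading v^i (i >= 1)
    while visiting an accepting state along the way."""
    reach, seen = run_word(delta, {q0}, u), set()
    while reach - seen:                    # close the reachable set under v
        seen |= reach
        reach = seen | run_word(delta, seen, v)
    for q in seen:
        frontier = {(q, q in F)}           # (state, accepting-state-seen) pairs
        for _ in range(2 * len(Q)):        # at most 2|Q| distinct pairs
            nxt = {(t, flag or t in F)
                   for s, flag in frontier
                   for t in run_word(delta, {s}, v)}
            if (q, True) in nxt:
                return True
            frontier |= nxt
    return False

# Automaton accepting the words with infinitely many b's.
delta = {(0, "a"): {0}, (0, "b"): {1}, (1, "a"): {0}, (1, "b"): {1}}
print(nba_accepts_upword({0, 1}, delta, 0, {1}, "a", "ab"))  # True
print(nba_accepts_upword({0, 1}, delta, 0, {1}, "b", "a"))   # False
```

The point of [15] is precisely to avoid unfolding the loops: with counts in binary, the unfolded word is exponentially long, which is why the succinct encoding is needed for the PTime bound.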
Lemma 11. The intersection non-emptiness problem with BA is NP-complete.


Now, by Lemma 3, Lemma 4 and Lemma 11, we get the result below, for which
the lower bound is obtained by an easy reduction from SAT.
Theorem 12. MC(BA, CF S) is NP-complete.
We are now ready to deal with ABA, ETL and linear μ-calculus. A language
L has the nice BA property iff, for every specification A from L, we can build a
Büchi automaton B_A such that L(A) = L(B_A), each state of B_A is of polynomial
size, it can be checked whether a state is initial [resp. accepting] in polynomial space,
and the transition relation can be decided in polynomial space too. So, given a
language L having the nice BA property, a constrained path schema cps and
a specification A ∈ L, if L(cps) ∩ L(A) is non-empty, then there is an ω-word
in L(cps) ∩ L(A) such that each loop is taken at most a number of times
bounded by fBA(B_A, cps). So fL(A, cps) is obviously bounded by fBA(B_A, cps).
Hence, checking whether L(cps) ∩ L(A) is non-empty amounts to guessing some n ∈
[0, fL(A, cps)]^{k−1} and checking whether w = p1(l1)^{n[1]} ⋯ pk−1(lk−1)^{n[k−1]} pk(lk)^ω ∈
L(cps) ∩ L(A). Checking whether w ∈ L(cps) can be done in polynomial time
in size(A) + size(cps) since this amounts to checking n |= φ. Checking whether
w ∈ L(A) can be done in nondeterministic polynomial space by reading w while
guessing an accepting run of B_A. Actually, one guesses a state q of B_A, checks
whether the prefix p1(l1)^{n[1]} ⋯ pk−1(lk−1)^{n[k−1]} pk can reach it, and then checks
nonemptiness between (lk)^ω and the Büchi automaton B_A^q in which q is the initial
state. Again, this can be done in nondeterministic polynomial space
thanks to the nice BA property. We obtain the lemma below.
Lemma 13. The membership problem and the intersection non-emptiness problem
for any L having the nice BA property are in PSpace.
Let us recall consequences of results from the literature. ETL has the nice BA
property by [24], linear μ-calculus has the nice BA property by [23] and ABA
has the nice BA property by [18]. Note that the results for ETL and ABA can
be also obtained thanks to translations into linear μ-calculus. By Lemma 13,
Lemma 4 and the above-mentioned results, we obtain the following results.
Theorem 14. MC(ABA, CF S), MC(ETL, CF S) and MC(μTL, CF S) are in
PSpace.
Note that for obtaining the PSpace upper bound, we use the same procedure for
all the logics. Using that the emptiness problem for finite alternating automata
over a single letter alphabet is PSpace-hard [8], we are also able to get lower
bounds.
Theorem 15. (I) The intersection non-emptiness problem for ABA [resp. μTL]
is PSpace-hard. (II) MC(ABA, CF S) and MC(μTL, CF S) are PSpace-hard.
According to the proof of Theorem 15 (see [5]), PSpace-hardness already holds
for a fixed Kripke structure that is actually a simple path schema. Hence, for linear
μ-calculus, there is a complexity gap between model-checking unconstrained
path schemas with two loops (in UP ∩ co-UP [9]) and model-checking unconstrained
path schemas (Kripke structures) made of a single loop, which is in
contrast to Past LTL, for which model-checking unconstrained path schemas
with a bounded number of loops is in PTime [4, Theorem 9].
As an additional corollary, we can solve the global model-checking problem
with existential Presburger formulae. The global model-checking consists in char-
acterizing the set of initial configurations from which there exists a run satisfying
a given specification. We knew that Presburger formulae exist for global model-
checking [6] for Past LTL (and therefore for FO) but we can conclude that they
are structurally simple, and we provide an alternative proof. Moreover, the question
had remained open for μTL since the decidability status of MC(μTL, CF S) is
only resolved in the present work.

Corollary 16. Let L be a specification language among FO, BA, ABA, ETL or
μTL. Given a flat counter system S, a control state q and a specification A in
L, one can effectively build an existential Presburger formula φ(z1, ..., zn) such
that for all v ∈ N^n, v |= φ iff there is a run ρ starting at ⟨q, v⟩ verifying ρ |= A.

6 Conclusion

We characterized the complexity of MC(L, CF S) for prominent linear-time spec-


ification languages L whose letters are made of atomic propositions and linear
constraints. We proved the PSpace-completeness of the problem with linear μ-
calculus (decidability was open), for alternating Büchi automata and also for
FO. When specifications are expressed with Büchi automata, the problem is
shown NP-complete. Global model-checking is also possible on flat counter systems
with such specification languages. Even though the core of our work relies
on small solutions of quantifier-free Presburger formulae, stuttering properties,
an automata-based approach and on-the-fly algorithms, our approach is designed to
be generic. Not only does this witness the robustness of our method, but our complexity
characterization also justifies further why verification of flat counter systems
can be at the core of methods for model-checking counter systems. Our main
results are in the table below, with useful comparisons ('Ult. periodic KS' stands
for ultimately periodic Kripke structures, namely a path followed by a loop).

         | Flat counter systems | Kripke struct. | Flat Kripke struct. | Ult. periodic KS
μTL      | PSpace-C (Thm. 14)   | PSpace-C [23]  | PSpace-C (Thm. 14)  | in UP∩co-UP [16]
ABA      | PSpace-C (Thm. 14)   | PSpace-C       | PSpace-C (Thm. 14)  | in PTime (see e.g. [11, p. 3])
ETL      | in PSpace (Thm. 14)  | PSpace-C [21]  | in PSpace [21]      | in PTime (see e.g. [19,11])
BA       | NP-C (Thm. 12)       | in PTime       | in PTime            | in PTime
FO       | PSpace-C (Thm. 9)    | Non-el. [22]   | PSpace-C (Thm. 9)   | PSpace-C [15]
Past LTL | NP-C [4]             | PSpace-C [21]  | NP-C [10,4]         | PTime [13]
References
1. Boigelot, B.: Symbolic methods for exploring infinite state spaces. PhD thesis,
Université de Liège (1998)
2. Bozga, M., Iosif, R., Konečný, F.: Fast acceleration of ultimately periodic relations.
In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 227–242.
Springer, Heidelberg (2010)
3. Comon, H., Jurski, Y.: Multiple counter automata, safety analysis and PA. In: Vardi,
M.Y. (ed.) CAV 1998. LNCS, vol. 1427, pp. 268–279. Springer, Heidelberg (1998)
4. Demri, S., Dhar, A.K., Sangnier, A.: Taming Past LTL and Flat Counter Sys-
tems. In: Gramlich, B., Miller, D., Sattler, U. (eds.) IJCAR 2012. LNCS (LNAI),
vol. 7364, pp. 179–193. Springer, Heidelberg (2012)
5. Demri, S., Dhar, A.K., Sangnier, A.: On the complexity of verifying regular prop-
erties on flat counter systems (2013), http://arxiv.org/abs/1304.6301
6. Demri, S., Finkel, A., Goranko, V., van Drimmelen, G.: Model-checking CTL∗ over
flat Presburger counter systems. JANCL 20(4), 313–344 (2010)
7. Finkel, A., Leroux, J.: How to compose Presburger-accelerations: Applications to
broadcast protocols. In: Agrawal, M., Seth, A.K. (eds.) FSTTCS 2002. LNCS,
vol. 2556, pp. 145–156. Springer, Heidelberg (2002)
8. Jančar, P., Sawa, Z.: A note on emptiness for alternating finite automata with a
one-letter alphabet. IPL 104(5), 164–167 (2007)
9. Jurdziński, M.: Deciding the winner in parity games is in UP ∩ co-UP. IPL 68(3),
119–124 (1998)
10. Kuhtz, L., Finkbeiner, B.: Weak Kripke structures and LTL. In: Katoen, J.-P.,
König, B. (eds.) CONCUR 2011. LNCS, vol. 6901, pp. 419–433. Springer, Heidel-
berg (2011)
11. Kupferman, O., Vardi, M.: Weak alternating automata are not that weak. ACM
Transactions on Computational Logic 2(3), 408–429 (2001)
12. Kučera, A., Strejček, J.: The stuttering principle revisited. Acta Informatica 41(7-
8), 415–434 (2005)
13. Laroussinie, F., Markey, N., Schnoebelen, P.: Temporal logic with forgettable past.
In: LICS 2002, pp. 383–392. IEEE (2002)
14. Leroux, J., Sutre, G.: Flat counter systems are everywhere! In: Peled, D.A., Tsay,
Y.-K. (eds.) ATVA 2005. LNCS, vol. 3707, pp. 489–503. Springer, Heidelberg (2005)
15. Markey, N., Schnoebelen, P.: Model checking a path. In: Amadio, R.M., Lugiez, D.
(eds.) CONCUR 2003. LNCS, vol. 2761, pp. 251–265. Springer, Heidelberg (2003)
16. Markey, N., Schnoebelen, P.: Mu-calculus path checking. IPL 97(6) (2006)
17. Minsky, M.: Computation, Finite and Infinite Machines. Prentice Hall (1967)
18. Miyano, S., Hayashi, T.: Alternating finite automata on ω-words. Theor. Comput.
Sci. 32, 321–330 (1984)
19. Piterman, N.: Extending temporal logic with ω-automata. Master’s thesis, The
Weizmann Institute of Science (2000)
20. Pottier, L.: Minimal Solutions of Linear Diophantine Systems: Bounds and Algo-
rithms. In: Book, R.V. (ed.) RTA 1991. LNCS, vol. 488, pp. 162–173. Springer,
Heidelberg (1991)
21. Sistla, A., Clarke, E.: The complexity of propositional linear temporal logic.
JACM 32(3), 733–749 (1985)
22. Stockmeyer, L.J.: The complexity of decision problems in automata and logic. PhD
thesis, MIT (1974)
23. Vardi, M.: A temporal fixpoint calculus. In: POPL 1988, pp. 250–259. ACM (1988)
24. Vardi, M., Wolper, P.: Reasoning about infinite computations. I&C 115 (1994)
25. Wolper, P.: Temporal logic can be more expressive. I&C 56, 72–99 (1983)
Multiparty Compatibility in Communicating Automata:
Characterisation and Synthesis of Global Session Types

Pierre-Malo Deniélou and Nobuko Yoshida


1 Royal Holloway, University of London
2 Imperial College London

Abstract. Multiparty session types are a type system that can ensure the safety
and liveness of distributed peers via the global specification of their interac-
tions. To construct a global specification from a set of distributed uncontrolled
behaviours, this paper explores the problem of fully characterising multiparty
session types in terms of communicating automata. We equip global and local
session types with labelled transition systems (LTSs) that faithfully represent
asynchronous communications through unbounded buffered channels. Using the
equivalence between the two LTSs, we identify a class of communicating au-
tomata that exactly correspond to the projected local types. We exhibit an algo-
rithm to synthesise a global type from a collection of communicating automata.
The key property of our findings is the notion of multiparty compatibility which
non-trivially extends the duality condition for binary session types.

1 Introduction
Over the last decade, session types [12,18] have been studied as data types or functional
types for communications and distributed systems. A recent discovery by [4,20], which
establishes a Curry-Howard isomorphism between binary session types and linear
logic, confirms that session types and the notion of duality between type constructs have
canonical meanings. Multiparty session types [2,13] were proposed as a major general-
isation of binary session types. They can enforce communication safety and deadlock-
freedom for more than two peers thanks to a choreographic specification (called global
type) of the interaction. Global types are projected to end-point types (local types),
against which processes can be statically type-checked and verified to behave correctly.
The motivation of this paper comes from our practical experiences that, in many
situations, even where we start from the end-point projections of a choreography, we
need to reconstruct a global type from distributed specifications. End-point specifica-
tions are usually available, either through inference from the control flow, or through
existing service interfaces, and always in forms akin to individual communicating finite
state machines. If one knows the precise conditions under which a global type can be
constructed (i.e. the conditions of synthesis), not only the global safety property which
multiparty session types ensure is guaranteed, but also the generated global type can
be used as a refinement and be integrated within the distributed system development
life-cycle (see [17]). This paper attempts to give the synthesis condition as a sound
and complete characterisation of multiparty session types with respect to Communi-
cating Finite State Machines (CFSMs) [3]. CFSMs have been a well-studied formal-
ism for analysing distributed safety properties and are widely present in industry tools.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 174–186, 2013.
© Springer-Verlag Berlin Heidelberg 2013

They can be seen as generalised end-point specifications, and are therefore an excellent target
for a common comparison ground and for synthesis. As explained below, to identify a
complete set of CFSMs for synthesis, we first need to answer a question – what is the
canonical duality notion in multiparty session types?
Characterisation of Binary Session Types as Communicating Automata. The sub-
class which fully characterises binary session types was actually proposed by Gouda,
Manning and Yu in 1984 [11] in a pure communicating automata context. Consider a
simple business protocol between a Buyer and a Seller from the Buyer’s viewpoint:
Buyer sends the title of a book, Seller answers with a quote. If Buyer is satisfied by the
quote, then he sends his address and Seller sends back the delivery date; otherwise it
retries the same conversation. This can be described by the following session type:
μt. !title; ?quote; !{ ok : !addrs; ?date; end, retry : t }        (1.1)
where the operator ! title denotes an output of the title, whereas ?quote denotes an in-
put of a quote. The output choice features the two options ok and retry and ; denotes
sequencing. end represents the termination of the session, and μ t is recursion.
The simplicity and tractability of binary sessions come from the notion of duality in
interactions [10]. The interaction pattern of the Seller is fully given as the dual of the
type in (1.1) (exchanging output ! and input ? in the original type). When composing
two parties, we only have to check they have mutually dual types, and the resulting
communication is guaranteed to be deadlock-free. Essentially the same characterisation
is given in communicating automata. Buyer and Seller’s session types are represented
by the following two machines.
[Figure: Buyer and Seller as communicating machines. Buyer: !title, then ?quote, then a choice between !ok (followed by !addrs and ?date) and !retry back to the initial state. Seller is its dual: ?title, then !quote, then a choice between ?ok (followed by ?addrs and !date) and ?retry back to the initial state.]

We can observe that these CFSMs satisfy three conditions. First, the communications
are deterministic: messages that are part of the same choice, ok and retry here, are dis-
tinct. Secondly, there is no mixed state (each state has either only sending actions or
only receiving actions). Third, these two machines have compatible traces (i.e. dual):
the Seller machine can be defined by exchanging sending to receiving actions and
vice versa. Breaking one of these conditions allows deadlock situations and breaking
one of the first two conditions makes the compatibility checking undecidable [11, 19].

Multiparty Compatibility. This notion of duality is no longer effective in multiparty communications, where the whole conversation cannot be reconstructed from only a single behaviour. To bypass the gap between binary and multiparty, we take the synthesis approach, that is, to find conditions which allow a global choreography to be built from the local machine behaviour. Instead of directly trying to decide whether the communications of a system will satisfy safety (which is undecidable in the general case), inferring a global type guarantees safety as a direct consequence.

[Figure (Commit): machine A loops sending AB!act then AC!commit, or sends AB!quit followed by AC!finish to terminate; machine B loops receiving AB?act then sending BC!sig, or receives AB?quit and sends BC!save; machine C loops receiving BC?sig then AC?commit, or receives BC?save followed by AC?finish.]
176 P.-M. Deniélou and N. Yoshida

We give a simple example above to illustrate the problem. The Commit protocol in-
volves three machines: Alice A, Bob B and Carol C. A orders B to act or quit. If act is
sent, B sends a signal to C, and A sends a commitment to C and continues. Otherwise B
informs C to save the data and A gives the final notification to C to terminate the protocol.
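To make the protocol concrete, the sketch below encodes the three Commit machines with a hypothetical dictionary encoding (state to list of ((channel, op, message), next-state) pairs) and replays the quit branch; the run helper and the numeric state names are our own illustration, not part of the formal development:

```python
# The three Commit machines (hypothetical encoding; state numbers are ours).
A = {0: [(("AB", "!", "act"), 1), (("AB", "!", "quit"), 2)],
     1: [(("AC", "!", "commit"), 0)],
     2: [(("AC", "!", "finish"), 3)]}
B = {0: [(("AB", "?", "act"), 1), (("AB", "?", "quit"), 2)],
     1: [(("BC", "!", "sig"), 0)],
     2: [(("BC", "!", "save"), 3)]}
C = {0: [(("BC", "?", "sig"), 1), (("BC", "?", "save"), 2)],
     1: [(("AC", "?", "commit"), 0)],
     2: [(("AC", "?", "finish"), 3)]}

def run(machines, steps):
    """Replay a fixed schedule of transitions, checking each one is enabled."""
    states = {p: 0 for p in machines}
    buffers = {}
    for p, action, nxt in steps:
        assert (action, nxt) in machines[p][states[p]], "transition not available"
        ch, op, msg = action
        if op == "!":
            buffers.setdefault(ch, []).append(msg)
        else:
            assert buffers[ch].pop(0) == msg, "wrong message at head of buffer"
        states[p] = nxt
    return states, buffers

# The quit branch: A orders quit, B tells C to save, A notifies C to finish.
quit_branch = [("A", ("AB", "!", "quit"), 2), ("B", ("AB", "?", "quit"), 2),
               ("B", ("BC", "!", "save"), 3), ("C", ("BC", "?", "save"), 2),
               ("A", ("AC", "!", "finish"), 3), ("C", ("AC", "?", "finish"), 3)]
states, buffers = run({"A": A, "B": B, "C": C}, quit_branch)
print(states, all(not w for w in buffers.values()))
```

The replay ends with every machine in its final state and all buffers empty.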
This paper presents a decidable notion of multiparty compatibility as a generalisation of the duality of binary sessions, which in turn characterises a synthesis condition.
The idea is to check the duality between each automaton and the rest, up to the inter-
nal communications (1-bounded executions in the terminology of CFSMs, see § 2) that
the other machines will independently perform. For example, in the Commit protocol, to check the compatibility of the trace AB!quit · AC!finish in A, we observe the dual trace AB?quit · AC?finish from B and C, after executing the internal communications BC!save · BC?save between B and C. If this extended duality is valid for all the machines from
any 1-bounded reachable state, then they satisfy multiparty compatibility and can build
a well-formed global choreography.
Contributions and Outline. Section 3 defines new labelled transition systems for
global and local types that represent the abstract observable behaviour of typed pro-
cesses. We prove that a global type behaves exactly as its projected local types, and
the same result between a single local type and its CFSMs interpretation. These corre-
spondences are the key to prove the main theorems. Section 4 defines multiparty com-
patibility, studies its safety and liveness properties, gives an algorithm for the synthesis
of global types from CFSMs, and proves the soundness and completeness results be-
tween global types and CFSMs. Section 5 discusses related work and concludes. The
full proofs and applications of this work can be found in [17].

2 Communicating Finite State Machines

This section starts with some preliminary notations (following [6]). ε is the empty
word. A is a finite alphabet and A∗ is the set of all finite words over A. |x| is the length
of a word x and x.y or xy the concatenation of two words x and y. Let P be a set of
participants fixed throughout the paper: P ⊆ {A, B, C, . . ., p, q, . . . }.

Definition 2.1 (CFSM). A communicating finite state machine is a finite transition system given by a 5-tuple M = (Q, C, q0, A, δ) where (1) Q is a finite set of states; (2) C = {pq ∈ P² | p ≠ q} is a set of channels; (3) q0 ∈ Q is an initial state; (4) A is a finite alphabet of messages; and (5) δ ⊆ Q × (C × {!, ?} × A) × Q is a finite set of transitions.

In transitions, pq!a denotes the sending action of a from process p to process q, and pq?a denotes the receiving action of a from p by q. ℓ, ℓ′ range over actions and we define the subject of an action ℓ as the principal in charge of it: subj(pq!a) = subj(qp?a) = p.
A state q ∈ Q whose outgoing transitions are all labelled with sending (resp. receiving) actions is called a sending (resp. receiving) state. A state q ∈ Q which does not have any outgoing transition is called final. If q has both sending and receiving outgoing transitions, q is called mixed. We say q is directed if it contains only sending (resp. receiving) actions to (resp. from) the same (identical) participant. A path in M is a finite sequence q0, . . . , qn (n ≥ 1) such that (qi, ℓ, qi+1) ∈ δ (0 ≤ i ≤ n − 1), and we write q →ℓ q′ if (q, ℓ, q′) ∈ δ. M is connected if, for every state q ≠ q0, there is a path from q0 to q. Hereafter we assume each CFSM is connected.
A CFSM M = (Q, C, q0, A, δ) is deterministic if for all states q ∈ Q and all actions ℓ, (q, ℓ, q′), (q, ℓ, q″) ∈ δ imply q′ = q″.¹
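These state classifications and the determinism condition are directly computable from δ. A minimal sketch under a hypothetical tuple encoding of transitions (state names and the encoding are ours), using machine A of the Commit protocol from § 1:

```python
from dataclasses import dataclass

# Hypothetical encoding of Definition 2.1: a transition is a triple
# (q, (channel, op, msg), q') with op in {"!", "?"}.

@dataclass
class CFSM:
    states: set
    q0: object
    delta: set

    def outgoing(self, q):
        return [(act, q2) for (q1, act, q2) in self.delta if q1 == q]

    def is_sending(self, q):
        outs = self.outgoing(q)
        return bool(outs) and all(op == "!" for (_, op, _), _ in outs)

    def is_receiving(self, q):
        outs = self.outgoing(q)
        return bool(outs) and all(op == "?" for (_, op, _), _ in outs)

    def is_final(self, q):
        return not self.outgoing(q)

    def has_mixed_states(self):
        return any({op for (_, op, _), _ in self.outgoing(q)} == {"!", "?"}
                   for q in self.states)

    def is_directed(self, q):
        # all outgoing actions use one identical channel, hence one partner
        return len({ch for (ch, _, _), _ in self.outgoing(q)}) <= 1

    def is_deterministic(self):
        # no state fires the same action towards two different successors
        pairs = [(q, act) for (q, act, _) in self.delta]
        return len(pairs) == len(set(pairs))

# Machine A of the Commit protocol from the introduction.
A = CFSM(states={0, 1, 2, 3}, q0=0,
         delta={(0, ("AB", "!", "act"), 1), (1, ("AC", "!", "commit"), 0),
                (0, ("AB", "!", "quit"), 2), (2, ("AC", "!", "finish"), 3)})
print(A.is_deterministic(), A.has_mixed_states(), A.is_final(3))
```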

Definition 2.2 (CS). A (communicating) system S is a tuple S = (Mp)p∈P of CFSMs such that Mp = (Qp, C, q0p, A, δp).
For Mp = (Qp, C, q0p, A, δp), we define a configuration of S = (Mp)p∈P to be a tuple s = (q⃗; w⃗) where q⃗ = (qp)p∈P with qp ∈ Qp, and where w⃗ = (wpq)p≠q∈P with wpq ∈ A∗. The element q⃗ is called a control state and qp ∈ Qp is the local state of machine Mp.
Definition 2.3 (reachable state). Let S be a communicating system. A configuration s′ = (q⃗′; w⃗′) is reachable from another configuration s = (q⃗; w⃗) by the firing of the transition t, written s → s′ or s →t s′, if there exists a ∈ A such that either: (1) t = (qp, pq!a, q′p) ∈ δp and (a) q′p′ = qp′ for all p′ ≠ p; and (b) w′pq = wpq·a and w′p′q′ = wp′q′ for all p′q′ ≠ pq; or (2) t = (qq, pq?a, q′q) ∈ δq and (a) q′p′ = qp′ for all p′ ≠ q; and (b) wpq = a·w′pq and w′p′q′ = wp′q′ for all p′q′ ≠ pq.
The condition (1-b) puts the content a into the channel pq, while (2-b) gets the content a from the channel pq. The reflexive and transitive closure of → is →∗. For a transition t = (s, ℓ, s′), we refer to ℓ by act(t). We write s1 →t1 s2 · · · →tm sm+1 and use ϕ to denote the sequence t1 · · · tm. We extend act to these sequences: act(t1 · · · tn) = act(t1) · · · act(tn).
The initial configuration of a system is s0 = (q⃗0; ε⃗) with q⃗0 = (q0p)p∈P. A final configuration of the system is sf = (q⃗; ε⃗) with all qp ∈ q⃗ final. A configuration s is reachable if s0 →∗ s, and we define the reachable set of S as RS(S) = {s | s0 →∗ s}. We define the traces of a system S to be Tr(S) = {act(ϕ) | ∃s ∈ RS(S), s0 →ϕ s}.
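The two firing rules amount to appending to, or consuming from, a FIFO buffer. A sketch of Definition 2.3 under a hypothetical (states, buffers) encoding of configurations (the encoding and the names are ours):

```python
# A configuration is a pair (states, buffers): states maps each participant to
# its local state; buffers maps each channel name "pq" to a tuple of messages.

def fire(config, p, action, target):
    """Fire transition (q_p, action, target) of machine p; return the
    successor configuration, or None if a reception is not enabled."""
    states, buffers = config
    channel, op, msg = action
    new_states = dict(states); new_states[p] = target
    new_buffers = dict(buffers)
    if op == "!":                                  # rule (1): enqueue on pq
        new_buffers[channel] = buffers.get(channel, ()) + (msg,)
    else:                                          # rule (2): dequeue from pq
        queue = buffers.get(channel, ())
        if not queue or queue[0] != msg:
            return None
        new_buffers[channel] = queue[1:]
    return (new_states, new_buffers)

s0 = ({"A": 0, "B": 0}, {})
s1 = fire(s0, "A", ("AB", "!", "act"), 1)    # A sends act on channel AB
s2 = fire(s1, "B", ("AB", "?", "act"), 1)    # B consumes it
print(s1[1]["AB"], s2[1]["AB"])
```

A receive on an empty or mismatching buffer is simply not enabled, matching clause (2-b).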
We now define several properties about communicating systems and their configura-
tions. These properties will be used in § 4 to characterise the systems that correspond to
multiparty session types. Let S be a communicating system, t one of its transitions and s = (q⃗; w⃗) one of its configurations. The following definitions of configuration properties follow [6, Definition 12].
1. s is stable if all its buffers are empty, i.e., w⃗ = ε⃗.
2. s is a deadlock configuration if s is not final, w⃗ = ε⃗ and each qp is a receiving state, i.e. all machines are blocked, waiting for messages.
3. s is an orphan message configuration if all qp ∈ q⃗ are final but w⃗ ≠ ε⃗, i.e. there is at least one orphan message in a buffer.
4. s is an unspecified reception configuration if there exists q ∈ P such that qq is a receiving state and (qq, pq?a, q′q) ∈ δq implies that |wpq| > 0 and wpq ∉ aA∗, i.e. qq is prevented from receiving any message from buffer pq.
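Each of the four properties above is a simple predicate on a configuration. A sketch under a hypothetical (states, buffers) encoding, parameterised by oracles receiving, final and expects that classify local states (all of these names are ours, not the paper's):

```python
# states maps participants to local states; buffers maps channel names "pq"
# to tuples of messages. `receiving(p, q)` / `final(p, q)` classify machine
# p's local state q; `expects(p, q, channel)` lists its receivable labels.

def is_stable(buffers):
    return all(len(w) == 0 for w in buffers.values())

def is_deadlock(states, buffers, receiving, final):
    not_all_final = not all(final(p, q) for p, q in states.items())
    return (not_all_final and is_stable(buffers)
            and all(receiving(p, q) for p, q in states.items()))

def has_orphan_messages(states, buffers, final):
    return (all(final(p, q) for p, q in states.items())
            and not is_stable(buffers))

def is_unspecified_reception(states, buffers, receiving, expects):
    # some receiving machine p faces a buffer head it can never consume
    for p, q in states.items():
        if receiving(p, q):
            for channel, w in buffers.items():
                if channel.endswith(p) and w and w[0] not in expects(p, q, channel):
                    return True
    return False

recv_all = lambda p, q: True
never_final = lambda p, q: False
print(is_deadlock({"A": 0, "B": 0}, {"AB": ()}, recv_all, never_final))
```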
A sequence of transitions is said to be k-bounded if no channel of any intermediate configuration si contains more than k messages. We define the k-reachability set of S to be the largest subset RSk(S) of RS(S) within which each configuration s can be reached by a k-bounded execution from s0. Note that, given a communicating system S, for every integer k, the set RSk(S) is finite and computable. We say that a trace ϕ is n-bound, written bound(ϕ) = n, if the number of send actions in ϕ never exceeds the number of receive actions by more than n. We then define the equivalences: (1) S ≈ S′ iff ∀ϕ. ϕ ∈ Tr(S) ⇔ ϕ ∈ Tr(S′); and (2) S ≈n S′ iff ∀ϕ. bound(ϕ) ≤ n ⇒ (ϕ ∈ Tr(S) ⇔ ϕ ∈ Tr(S′)).

¹ “Deterministic” often means that the same channel should carry a unique value, i.e. if (q, c!a, q′) ∈ δ and (q, c!a′, q″) ∈ δ then a = a′ and q′ = q″. Here we follow a different definition [6] in order to represent branching type constructs.
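Since RSk(S) is finite, it can be enumerated by a breadth-first search that refuses to fill any channel beyond k messages. A sketch under a hypothetical encoding where a system maps each participant to a pair of its initial state and transition set (all names are ours):

```python
from collections import deque

# Transitions are (q, (channel, op, msg), q') as in Definition 2.1.

def reachable_k(system, k):
    init = (tuple(sorted((p, q0) for p, (q0, _) in system.items())), ())
    seen, todo = {init}, deque([init])
    while todo:
        states, buffers = todo.popleft()
        state_map, buf_map = dict(states), dict(buffers)
        for p, (_, delta) in system.items():
            for (q, (ch, op, msg), q2) in delta:
                if state_map[p] != q:
                    continue
                queue = buf_map.get(ch, ())
                if op == "!":
                    if len(queue) >= k:   # k-bounded: never overfill a channel
                        continue
                    new_queue = queue + (msg,)
                elif queue and queue[0] == msg:
                    new_queue = queue[1:]
                else:
                    continue              # reception not enabled
                new_states = dict(state_map); new_states[p] = q2
                new_bufs = dict(buf_map); new_bufs[ch] = new_queue
                s = (tuple(sorted(new_states.items())),
                     tuple(sorted(new_bufs.items())))
                if s not in seen:
                    seen.add(s)
                    todo.append(s)
    return seen

system = {"A": (0, {(0, ("AB", "!", "a"), 1)}),
          "B": (0, {(0, ("AB", "?", "a"), 1)})}
print(len(reachable_k(system, 1)))   # 3 configurations: initial, sent, received
```

Setting k = 0 forbids all sends, leaving only the initial configuration.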
The following key properties will be examined throughout the paper as properties that multiparty session types can enforce. They are undecidable for general CFSMs.
Definition 2.4 (safety and liveness). (1) A communicating system S is deadlock-free (resp. orphan message-free, reception error-free) if for all s ∈ RS(S), s is not a deadlock (resp. orphan message, unspecified reception) configuration. (2) S satisfies the liveness property if for all s ∈ RS(S), there exists s →∗ s′ such that s′ is final.

3 Global and Local Types: The LTSs and Translations


This section presents multiparty session types, our main object of study. For the syntax
of types, we follow [2] which is the most widely used syntax in the literature. We intro-
duce two labelled transition systems, for local types and for global types, and show the
equivalence between local types and communicating automata.
Syntax. A global type, written G, G′, . . . , describes the whole conversation scenario of a multiparty session as a type signature, and a local type, written T, T′, . . . , type-abstracts sessions from each end-point's view. p, q, · · · ∈ P denote participants (see § 2 for conventions). The syntax of types is given as:
G ::= p → p′ : {aj.Gj}j∈J | μt.G | t | end
T ::= p?{ai.Ti}i∈I | p!{ai.Ti}i∈I | μt.T | t | end
aj ∈ A corresponds to the usual message label in session type theory. We omit the mention of the carried types from the syntax in this paper, as we are not directly concerned with typing processes. Global branching type p → p′ : {aj.Gj}j∈J states that participant p can send a message with one of the aj labels to participant p′ and that the interactions described in Gj follow. We require p ≠ p′ to prevent self-sent messages and ai ≠ ak for all i ≠ k ∈ J. Recursive type μt.G is for recursive protocols, assuming that type variables (t, t′, . . . ) are guarded in the standard way, i.e. they only occur under branchings. Type end represents session termination (often omitted). p ∈ G means that p appears in G.
Concerning local types, the branching type p?{ai.Ti}i∈I specifies the reception of a message from p with a label among the ai. The selection type p!{ai.Ti}i∈I is its dual. The remaining type constructors are the same as for global types. When branching is a singleton, we write p → p′ : a.G′ for global types, and p!a.T or p?a.T for local types.
Projection. The relation between global and local types is formalised by projection. Instead of the restricted original projection [2], we use the extension with the merging operator & from [7]: it allows each branch of the global type to actually contain different interaction patterns. The projection of G onto p (written G ↾ p) is defined as:

(p → p′ : {aj.Gj}j∈J) ↾ q = p′!{aj.(Gj ↾ q)}j∈J if q = p;  p?{aj.(Gj ↾ q)}j∈J if q = p′;  &j∈J (Gj ↾ q) otherwise
(μt.G) ↾ p = μt.(G ↾ p) if G ↾ p ≠ t, and end otherwise;   t ↾ p = t;   end ↾ p = end

The mergeability relation # is the smallest congruence relation over local types such that:

if ∀i ∈ (K ∩ J). Ti # T′i and ∀k ∈ (K \ J), ∀j ∈ (J \ K). ak ≠ aj, then p?{ak.Tk}k∈K # p?{aj.T′j}j∈J

When T1 # T2 holds, we define the operation & as a partial commutative operator over two types such that T & T = T for all types and that:

p?{ak.Tk}k∈K & p?{aj.T′j}j∈J = p?({ak.(Tk & T′k)}k∈K∩J ∪ {ak.Tk}k∈K\J ∪ {aj.T′j}j∈J\K)

and homomorphic for other types (i.e. C[T1] & C[T2] = C[T1 & T2] where C is a context for local types). We say that G is well-formed if for all p ∈ P, G ↾ p is defined.
Example 3.1 (Commit). The global type for the commit protocol in § 1 is:
μ t.A → B : {act. B → C : {sig. A → C : commit.t }, quit.B → C : {save.A → C : finish.end}}
Then C’s local type is: μ t.B?{sig.A?{commit.t}, save.A?{finish.end}}.
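Projection and the merge operator can be implemented directly on the type syntax. A sketch under a hypothetical tuple encoding of types (all constructor names are ours), implementing the merge only for receive branchings, which is the only case needed here; running it on the Commit global type reproduces C's local type from Example 3.1:

```python
from functools import reduce

# Hypothetical encoding of types as nested tuples:
#   global: ("msg", p, q, {label: G}), ("mu", t, G), ("var", t), ("end",)
#   local:  ("send"/"recv", partner, {label: T}), plus "mu"/"var"/"end".

def merge(T1, T2):
    """The operator &, here only on equal types and receive branchings."""
    if T1 == T2:
        return T1
    if T1[0] == T2[0] == "recv" and T1[1] == T2[1]:
        labels = dict(T1[2])
        for a, T in T2[2].items():
            labels[a] = merge(labels[a], T) if a in labels else T
        return ("recv", T1[1], labels)
    raise ValueError("not mergeable")

def project(G, r):
    """Projection of global type G onto participant r."""
    if G[0] in ("end", "var"):
        return G
    if G[0] == "mu":
        _, t, body = G
        Tp = project(body, r)
        return ("end",) if Tp == ("var", t) else ("mu", t, Tp)
    _, p, q, branches = G
    conts = {a: project(Gj, r) for a, Gj in branches.items()}
    if r == p:
        return ("send", q, conts)
    if r == q:
        return ("recv", p, conts)
    return reduce(merge, conts.values())   # r not involved: merge the branches

commit = ("mu", "t", ("msg", "A", "B", {
    "act":  ("msg", "B", "C", {"sig":  ("msg", "A", "C", {"commit": ("var", "t")})}),
    "quit": ("msg", "B", "C", {"save": ("msg", "A", "C", {"finish": ("end",)})})}))

# C's local type: mu t. B?{sig. A?{commit. t}, save. A?{finish. end}}
print(project(commit, "C"))
```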
We now present labelled transition relations (LTS) for global and local types and their
sound and complete correspondence.
LTS over Global Types. We first designate the observables (ℓ, ℓ′, . . . ). We choose here to follow the definition of actions for CFSMs, where a label ℓ denotes the sending or the reception of a message of label a from p to p′: ℓ ::= pp′!a | pp′?a
In order to define an LTS for global types, we need to represent intermediate states in the execution. For this reason, we introduce in the grammar of G the construct p ⇝ p′ : j {ai.Gi}i∈I to represent the fact that aj has been sent but not yet received.

Definition 3.1 (LTS over global types). The relation G →ℓ G′ is defined as follows (subj(ℓ) is defined in § 2):

[GR1] p → p′ : {ai.Gi}i∈I →pp′!aj p ⇝ p′ : j {ai.Gi}i∈I  (j ∈ I)
[GR2] p ⇝ p′ : j {ai.Gi}i∈I →pp′?aj Gj
[GR3] μt.G →ℓ G′ if G[μt.G/t] →ℓ G′
[GR4] p → q : {ai.Gi}i∈I →ℓ p → q : {ai.G′i}i∈I if Gi →ℓ G′i for all i ∈ I and p, q ∉ subj(ℓ)
[GR5] p ⇝ q : j {ai.Gi}i∈I →ℓ p ⇝ q : j {ai.G′i}i∈I if Gj →ℓ G′j, q ∉ subj(ℓ) and G′i = Gi for all i ∈ I \ j
[GR1] represents the emission of a message while [GR2] describes the reception of
a message. [GR3] governs recursive types. [GR4,5] define the asynchronous seman-
tics of global types, where the syntactic order of messages is enforced only for the
participants that are involved. For example, when the participants of two consecutive
communications are disjoint, as in: G1 = A → B : a.C → D : b.end, we can observe the
emission (and possibly the reception) of b before the interactions of a (by [GR4]).
A more interesting example is: G2 = A → B : a. A → C : b. end. We write ℓ1 = AB!a, ℓ2 = AB?a, ℓ3 = AC!b and ℓ4 = AC?b. The LTS allows the following three sequences:

G2 →ℓ1 A ⇝ B : a. A → C : b. end →ℓ2 A → C : b. end →ℓ3 A ⇝ C : b. end →ℓ4 end
G2 →ℓ1 A ⇝ B : a. A → C : b. end →ℓ3 A ⇝ B : a. A ⇝ C : b. end →ℓ2 A ⇝ C : b. end →ℓ4 end
G2 →ℓ1 A ⇝ B : a. A → C : b. end →ℓ3 A ⇝ B : a. A ⇝ C : b. end →ℓ4 A ⇝ B : a. end →ℓ2 end

The last sequence is the most interesting: the sender A has to follow the syntactic order but the receiver C can get the message b before B receives a. These constraints are enforced by the conditions p, q ∉ subj(ℓ) and q ∉ subj(ℓ) in rules [GR4,5].
LTS over Local Types. We define the LTS over local types. This is done in two steps, following the model of CFSMs, where the semantics is given first for individual automata and then extended to communicating systems. We use the same labels (ℓ, ℓ′, . . . ) as the ones for CFSMs.

Definition 3.2 (LTS over local types). The relation T →ℓ T′, for the local type of role p, is defined as:

[LR1] q!{ai.Ti}i∈I →pq!aj Tj  (j ∈ I)
[LR2] q?{ai.Ti}i∈I →qp?aj Tj  (j ∈ I)
[LR3] μt.T →ℓ T′ if T[μt.T/t] →ℓ T′

The semantics of a local type follows the intuition that every action of the local type
should obey the syntactic order. We define the LTS for collections of local types.
Definition 3.3 (LTS over collections of local types). A configuration s = (T⃗; w⃗) of a system of local types {Tp}p∈P is a pair with T⃗ = (Tp)p∈P and w⃗ = (wpq)p≠q∈P, with wpq ∈ A∗. We then define the transition system for configurations. For a configuration sT = (T⃗; w⃗), the visible transitions sT →ℓ s′T = (T⃗′; w⃗′) are defined as follows: (1) ℓ = pq!a, Tp →pq!a T′p and (a) T′p′ = Tp′ for all p′ ≠ p; and (b) w′pq = wpq·a and w′p′q′ = wp′q′ for all p′q′ ≠ pq; or (2) ℓ = pq?a, Tq →pq?a T′q and (a) T′p′ = Tp′ for all p′ ≠ q; and (b) wpq = a·w′pq and w′p′q′ = wp′q′ for all p′q′ ≠ pq.
The semantics of local types is therefore defined over configurations, following the
definition of the semantics of CFSMs. wpq represents the FIFO queue at channel pq.
We write Tr(G) to denote the set of the visible traces that can be obtained by reducing G; similarly for Tr(T) and Tr(S). We extend the trace equivalences ≈ and ≈n of § 2 to global types and configurations of local types.
We now state the soundness and completeness of projection w.r.t. the LTSs.

Theorem 3.1 (soundness and completeness).² Let G be a global type with participants P and let T⃗ = {G ↾ p}p∈P be the local types projected from G. Then G ≈ (T⃗; ε⃗).

Local Types and CFSMs. Next we show how to algorithmically go from local types to CFSMs and back while preserving the trace semantics. We start by translating local types into CFSMs.

Definition 3.4 (translation from local types to CFSMs). Write T′ ∈ T if T′ occurs in T. Let T0 be the local type of participant p projected from G. The automaton corresponding to T0 is A(T0) = (Q, C, q0, A, δ) where: (1) Q = {T′ | T′ ∈ T0, T′ ≠ t, T′ ≠ μt.T}; (2) q0 = T′0 with T0 = μt⃗.T′0 and T′0 ∈ Q; (3) C = {pq | p, q ∈ G}; (4) A is the set of {a ∈ G}; and (5) δ is defined as:
2 The local type abstracts the behaviour of multiparty typed processes as proved in the subject
reduction theorem in [13]. Hence this theorem implies that processes typed by global type G
by the typing system in [2, 13] follow the LTS of G.

(T, (pp1 !a j ), T j ) ∈ δ T j = t
If T = p1 !{a j .T j } j∈J ∈ Q, then
(T, (pp1 !a j ), T 1 ) ∈ δ T j = t, μ t"t.T 1 ∈ T0 , T 1 ∈ Q

(T, (p1 p?a j ), T j ) ∈ δ T j = t
If T = p1 ?{a j .T j } j∈J ∈ Q, then
(T, (p1 p?a j ), T 1 ) ∈ δ T j = t, μ t"t.T 1 ∈ T0 , T 1 ∈ Q
The definition says that the set of states Q are the suboccurrences of branching or se-
lection or end in the local type; the initial state q0 is the occurrence of (the recursion
body of) T0 ; the channels and alphabets correspond to those in T0 ; and the transition is
defined from the state T to its body T j with the action pp1 !a j for the output and pp1 ?a j
for the input. If T j is a recursive type variable t, it points the state of the body of the
corresponding recursive type. As an example, see C’s local type in Example 3.1 and its
corresponding automaton in § 1.
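The construction of Definition 3.4 can be read as a worklist algorithm whose states are sub-terms of the local type. A sketch for local types with a single top-level μt, as produced in Example 3.1, under a hypothetical tuple encoding (branch lists are tuples of pairs so that sub-terms are hashable; all names are ours):

```python
# Local types: ("mu", t, T) | ("send"/"recv", partner, ((label, T), ...))
# | ("var", t) | ("end",).

def to_cfsm(T0, me):
    """Build (initial state, states, transitions) for participant `me`."""
    body = T0[2] if T0[0] == "mu" else T0
    var = T0[1] if T0[0] == "mu" else None
    delta, states, todo = set(), set(), [body]
    while todo:
        T = todo.pop()
        if T in states:
            continue
        states.add(T)
        if T[0] == "end":
            continue
        kind, partner, branches = T
        for a, Tj in branches:
            # a branch ending in the recursion variable loops back to the body
            target = body if Tj == ("var", var) else Tj
            act = ((me + partner, "!", a) if kind == "send"
                   else (partner + me, "?", a))
            delta.add((T, act, target))
            todo.append(target)
    return body, states, delta

# C's local type from Example 3.1.
TC = ("mu", "t",
      ("recv", "B", (("sig",  ("recv", "A", (("commit", ("var", "t")),))),
                     ("save", ("recv", "A", (("finish", ("end",)),))))))
q0, states, delta = to_cfsm(TC, "C")
print(len(states), len(delta))   # 4 states, 4 transitions
```

The resulting machine is deterministic, directed and has no mixed states, as Proposition 3.1 below asserts.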

Proposition 3.1 (local types to CFSMs). Assume Tp is a local type. Then A(Tp ) is
deterministic, directed and has no mixed states.

We say that a CFSM is basic if it is deterministic, directed and has no mixed states. Any
basic CFSM can be translated into a local type.

Definition 3.5 (translation from a basic CFSM to a local type). From a basic Mp = (Q, C, q0, A, δ), we define the translation T(Mp) such that T(Mp) = Tε(q0), where Tq̃(q) is defined as:
(1) Tq̃(q) = μtq. p′!{aj.T°q̃·q(qj)}j∈J if (q, pp′!aj, qj) ∈ δ;
(2) Tq̃(q) = μtq. p′?{aj.T°q̃·q(qj)}j∈J if (q, p′p?aj, qj) ∈ δ;
(3) T°q̃(q) = Tε(q) = end if q is final;
(4) T°q̃(q) = tqk if (q, ℓ, qk) ∈ δ and qk ∈ q̃; and
(5) T°q̃(q) = Tq̃(q) otherwise.
Finally, we replace μt.T by T if t does not occur in T.

In Tq̃, q̃ records the visited states; (1,2) translate the receiving and sending states to branching and selection types, respectively; (3) translates the final state to end; and (4) is the case of a recursion: since qk was already visited, the state is replaced by the type variable tqk.
The following proposition states that these translations preserve the semantics.

Proposition 3.2 (translations between CFSMs and local types). If a CFSM M is basic, then M ≈ T(M). If T is a local type, then T ≈ A(T).

4 Completeness and Synthesis

This section studies the synthesis and sound and complete characterisation of multi-
party session types as communicating automata. A first idea would be to restrict basic
CFSMs to the natural generalisation of half-duplex systems [6, § 4.1.1], in which each
pair of machines linked by two channels, one in each direction, communicates in a
half-duplex way. In this class, the safety properties of Definition 2.4 are however unde-
cidable [6, Theorem 36]. We therefore need a stronger (and decidable) property to force
basic CFSMs to behave as if they were the result of a projection from global types.

Multiparty Compatibility. In the two-machine case, there exists a sound and complete condition called compatibility [11]. Let us define the isomorphism Φ : (C × {!, ?} × A)∗ → (C × {!, ?} × A)∗ such that Φ(j?a) = j!a, Φ(j!a) = j?a, Φ(ε) = ε and Φ(t1 · · · tn) = Φ(t1) · · · Φ(tn). Φ exchanges a sending action with the corresponding receiving one and vice versa. The compatibility of two machines can then be defined as Tr(M1) = Φ(Tr(M2)) (i.e. the traces of M1 are exactly the duals of the traces of M2).
The idea of the extension to the multiparty case comes from the observation that, from the viewpoint of participant p, all the other machines (Mq)q∈P\p together should behave as one CFSM which offers the compatible traces Φ(Tr(Mp)), up to internal synchronisations (i.e. 1-bounded executions). Below we define a way to group CFSMs.
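Φ and the two-machine compatibility condition can be sketched on finite trace sets, with actions encoded as hypothetical (channel, op, label) triples (the encoding and the Buyer/Seller channel names are ours):

```python
def phi(trace):
    """The isomorphism Phi: flip every send into the matching receive."""
    return tuple((ch, "?" if op == "!" else "!", a) for ch, op, a in trace)

def compatible(traces1, traces2):
    """Tr(M1) = Phi(Tr(M2)), on explicitly given finite trace sets."""
    return {phi(t) for t in traces2} == set(traces1)

# Two prefixes of the Buyer/Seller conversation (channels "BS" and "SB").
t_buyer  = ((("BS", "!", "title"),),
            (("BS", "!", "title"), ("SB", "?", "quote")))
t_seller = ((("BS", "?", "title"),),
            (("BS", "?", "title"), ("SB", "!", "quote")))
print(compatible(t_buyer, t_seller))   # True
```

For infinite trace sets (recursion), the check has to work on the machines themselves rather than on enumerated traces.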

Definition 4.1 (Definition 37, [6]). Let Mi = (Qi, Ci, q0i, Ai, δi). The associated CFSM of a system S = (M1, . . . , Mn) is M = (Q, C, q0, Σ, δ) such that: Q = Q1 × Q2 × · · · × Qn, q0 = (q01, . . . , q0n) and δ is the smallest relation for which: if (qi, ℓ, q′i) ∈ δi (1 ≤ i ≤ n), then ((q1, . . . , qi, . . . , qn), ℓ, (q1, . . . , q′i, . . . , qn)) ∈ δ.

We now define a notion of compatibility extended to more than two CFSMs. We say that ϕ is an alternation if ϕ alternates sending actions with their corresponding receive actions (i.e. the action pq!a is immediately followed by pq?a).

Definition 4.2 (multiparty compatible system). A system S = (M1, . . . , Mn) (n ≥ 2) is multiparty compatible if, for any 1-bounded reachable stable state s ∈ RS1(S), for any sending action ℓ and for at least one receiving action ℓ from s in Mi, there exists a sequence of transitions ϕ·t from s in the CFSM corresponding to S−i = (M1, . . . , Mi−1, Mi+1, . . . , Mn) where ϕ is either empty or an alternation, ℓ = Φ(act(t)) and i ∉ act(ϕ) (i.e. ϕ does not contain actions to or from machine i).

The above definition states that, for each Mi, the rest of the machines S−i can produce the compatible (dual) actions by executing alternations within S−i. From Mi's viewpoint, these intermediate alternations can be seen as non-observable internal actions.

Example 4.1 (multiparty compatibility). As an example, we can test the multiparty compatibility property on the Commit example in § 1. We only detail here how to check the compatibility from the point of view of A. To check the compatibility of the actions act(t1 · t2) = AB!quit · AC!finish, the only possible action is Φ(act(t1)) = AB?quit from B; then a 1-bounded execution is BC!save · BC?save, and Φ(act(t2)) = AC?finish from C. To check the compatibility of the actions act(t3 · t4) = AB!act · AC!commit, Φ(act(t3)) = AB?act from B, the 1-bounded execution is BC!sig · BC?sig, and Φ(act(t4)) = AC?commit from C.

Remark 4.1. In Definition 4.2, we check the compatibility from any 1-bounded reachable stable state, in order to cover the case where one branch is selected by different senders. Consider the following machines:

[Figure: machine A receives either BA?a followed by CA?c, or BA?b followed by CA?d; machine B sends BA!a or BA!b; machine C sends CA!c or CA!d; machine A′ first receives BA?a or BA?b and then, in either case, receives CA?c or CA?d.]

In A, B and C, each action in each machine has its dual, but they do not satisfy multiparty compatibility. For example, if BA!a · BA?a is executed, CA!d does not have a dual action (hence the safety properties are not satisfied). On the other hand, the machines A′, B and C satisfy multiparty compatibility.

Theorem 4.1. Assume S = (Mp )p∈P is basic and multiparty compatible. Then S satisfies
the three safety properties in Definition 2.4. Further, if there exists at least one Mq which
includes a final state, then S satisfies the liveness property.

Proposition 4.1. If all the CFSMs Mp (p ∈ P) are basic, there is an algorithm to check
whether (Mp )p∈P is multiparty compatible.

The proof of Theorem 4.1 is non-trivial, using a detailed analysis of causal relations.
The proof of Proposition 4.1 comes from the finiteness of RS1 (S). See [17] for details.

Synthesis. Below we state the lemma which is crucial for the proofs of synthesis and completeness. The lemma comes from the intuition that the transitions of multiparty compatible systems are always permutations of 1-bounded executions, as is the case in multiparty session types. See [17] for the proof.

Lemma 4.1 (1-buffer equivalence). If S1 and S2 are two basic and multiparty compatible communicating systems such that S1 ≈1 S2, then S1 ≈ S2.

Theorem 4.2 (synthesis). Suppose S is a basic and multiparty compatible system. Then there is an algorithm which successfully builds a well-formed G such that S ≈ G if such a G exists, and otherwise terminates.

Proof. We assume S = (Mp)p∈P. The algorithm starts from the initial states of all machines (q0p1, . . . , q0pn). We take a pair of initial states consisting of a sending state q0p and a receiving state q0q from p to q. We note that, by directedness, if there is more than one such pair, the participants in any two pairs are disjoint, and by [GR4] in Definition 3.1 the order does not matter. We apply the algorithm with the invariant that all buffers are empty, repeatedly picking one pair of a sending state qp and a receiving state qq. We define G(q1, . . . , qn) (where qp, qq ∈ {q1, . . . , qn}) as follows:

– if (q1, . . . , qn) has already been examined and all participants have been involved since then (or the ones that have not are in their final state), we set G(q1, . . . , qn) to be tq1,...,qn. Otherwise, we select a sender/receiver pair from two participants that have not been involved (and are not final) and go to the next step;
– otherwise, in qp, from machine p, we know that all the transitions are sending actions towards q (by directedness), i.e. of the form (qp, pq!ai, qi) ∈ δp for i ∈ I.
• we check that machine q is in a receiving state qq such that (qq, pq?aj, q′j) ∈ δq with j ∈ J and I ⊆ J.
• we set G(q1, . . . , qn) = μtq1,...,qn. p → q : {ai.G(q1, . . . , qp ← qi, . . . , qq ← q′i, . . . , qn)}i∈I (we replace qp and qq by qi and q′i, respectively) and continue by recursive calls.
• if all sending states in q1, . . . , qn become final, then we set G(q1, . . . , qn) = end.
– we erase unnecessary μt if t ∉ G.

Since the algorithm only explores 1-bounded executions, the reconstructed G satisfies G ≈1 S. By Theorem 3.1, we know that G ≈ ({G ↾ p}p∈P; ε⃗). Hence, by Proposition 3.2, we have G ≈ S′ where S′ is the communicating system translated from the projected local types {G ↾ p}p∈P of G. By Lemma 4.1, S ≈ S′ and therefore S ≈ G.

The algorithm can generate the global type in Example 3.1 from the CFSMs in § 1 and the global type B → A : {a. C → A : {c. end, d. end}, b. C → A : {c. end, d. end}} from A′, B and C in Remark 4.1. Note that B → A : {a. C → A : {c. end}, b. C → A : {d. end}}, generated from A, B and C in Remark 4.1, is not projectable, hence not well-formed.
By Theorems 3.1 and 4.1, and Proposition 3.2, we can now conclude:

Theorem 4.3 (soundness and completeness). Suppose S is basic and multiparty com-
patible. Then there exists G such that S ≈ G. Conversely, if G is well-formed, then there
exists a basic and multiparty compatible system S such that S ≈ G.

5 Conclusion and Related Work


This paper investigated the sound and complete characterisation of multiparty session
types into CFSMs and developed a decidable synthesis algorithm from basic CFSMs.
The main tool we used is a new extension to multiparty interactions of the duality
condition for binary session types, called multiparty compatibility. The basic condition
(coming from binary session types) and the multiparty compatibility property together form a necessary and sufficient condition for obtaining safe global types. Our aim is to offer a duality
notion which would be applicable to extend other theoretical foundations such as the
Curry-Howard correspondence with linear logics [4,20] to multiparty communications.
Basic multiparty compatible CFSMs also define one of the few non-trivial decidable subclasses of CFSMs which satisfy deadlock-freedom. The methods proposed here are applicable to a wide range of applications based on choreography protocol models and, more widely, finite state machines. Multiparty compatibility is applicable for extending the synthesis algorithm to build more expressive graph-based global types (general global types [8]) which feature fork and join primitives [9].
Our previous work [8] presented the first translation from global and local types into
CFSMs. It only analysed the properties of the automata resulting from such a transla-
tion. The complete characterisation of global types independently from the projected
local types was left open, as was synthesis. This present paper closes this open prob-
lem. A large number of papers on the synthesis of CFSMs can be found in the literature; see [16] for a summary of recent results. The main distinction from CFSM synthesis is, apart from the formal setting (i.e. types), the kind of target specifications to be generated (global types in our case). Not only is our synthesis concerned with trace properties (languages), like the standard synthesis of CFSMs (the problem of the closed synthesis of CFSMs is usually defined as the construction, from a regular language L, of a machine satisfying certain conditions related to buffer boundedness, deadlock-freedom and word swapping), but we also generate concrete syntax or choreography descriptions as types of programs or software. Hence they are directly applicable to programming languages and can be straightforwardly integrated into existing frameworks based on session types.

Within the context of multiparty session types, [15] first studied the reconstruction of a global type from its projected local types up to asynchronous subtyping, and [14] recently offered a typing system to synthesise global types from local types. Our synthesis based on CFSMs is more general since CFSMs do not depend on the syntax. For example, [14, 15] cannot treat the synthesis for A′, B and C in Remark 4.1. These works also do not study completeness (i.e. they build a global type from a set of projected local types (up to subtyping), but do not investigate necessary and sufficient conditions to build a well-formed global type). A difficulty of the completeness result is that it is generally unknown whether the global type constructed by the synthesis can simulate executions with arbitrary buffer bounds, since the synthesis only directly looks at 1-bounded executions. In this paper, we proved Lemma 4.1 and bridged this gap towards the complete characterisation. Recent works [1, 5] focus on proving the semantic correspondence between global and local descriptions (see [8] for a more detailed comparison), but no synthesis algorithm is studied.

Acknowledgement. The work has been partially sponsored by the Ocean Observato-
ries Initiative and EPSRC EP/K011715/1, EP/K034413/1 and EP/G015635/1.

References
1. Basu, S., Bultan, T., Ouederni, M.: Deciding choreography realizability. In: POPL 2012, pp.
191–202. ACM (2012)
2. Bettini, L., Coppo, M., D’Antoni, L., De Luca, M., Dezani-Ciancaglini, M., Yoshida, N.:
Global progress in dynamically interleaved multiparty sessions. In: van Breugel, F., Chechik,
M. (eds.) CONCUR 2008. LNCS, vol. 5201, pp. 418–433. Springer, Heidelberg (2008)
3. Brand, D., Zafiropulo, P.: On communicating finite-state machines. J. ACM 30, 323–342
(1983)
4. Caires, L., Pfenning, F.: Session types as intuitionistic linear propositions. In: Gastin, P.,
Laroussinie, F. (eds.) CONCUR 2010. LNCS, vol. 6269, pp. 222–236. Springer, Heidelberg
(2010)
5. Castagna, G., Dezani-Ciancaglini, M., Padovani, L.: On global types and multi-party
sessions. LMCS 8(1) (2012)
6. Cécé, G., Finkel, A.: Verification of programs with half-duplex communication. Inf.
Comput. 202(2), 166–190 (2005)
7. Deniélou, P.-M., Yoshida, N.: Dynamic multirole session types. In: POPL 2011, pp.
435–446. ACM (2011); full version and prototype at http://www.doc.ic.ac.uk/~pmalo/dynamic
8. Deniélou, P.-M., Yoshida, N.: Multiparty session types meet communicating automata. In:
Seidl, H. (ed.) ESOP 2012. LNCS, vol. 7211, pp. 194–213. Springer, Heidelberg (2012)
9. http://arxiv.org/abs/1304.1902
10. Girard, J.-Y.: Linear logic. TCS 50 (1987)
11. Gouda, M., Manning, E., Yu, Y.: On the progress of communication between two finite state
machines. Information and Control 63, 200–216 (1984)
12. Honda, K., Vasconcelos, V.T., Kubo, M.: Language primitives and type discipline for
structured communication-based programming. In: Hankin, C. (ed.) ESOP 1998. LNCS,
vol. 1381, pp. 122–138. Springer, Heidelberg (1998)
13. Honda, K., Yoshida, N., Carbone, M.: Multiparty Asynchronous Session Types. In: POPL
2008, pp. 273–284. ACM (2008)
186 P.-M. Deniélou and N. Yoshida

14. Lange, J., Tuosto, E.: Synthesising choreographies from local session types. In: Koutny, M.,
Ulidowski, I. (eds.) CONCUR 2012. LNCS, vol. 7454, pp. 225–239. Springer, Heidelberg
(2012)
15. Mostrous, D., Yoshida, N., Honda, K.: Global principal typing in partially commutative
asynchronous sessions. In: Castagna, G. (ed.) ESOP 2009. LNCS, vol. 5502, pp. 316–332.
Springer, Heidelberg (2009)
16. Muscholl, A.: Analysis of communicating automata. In: Dediu, A.-H., Fernau, H.,
Martín-Vide, C. (eds.) LATA 2010. LNCS, vol. 6031, pp. 50–57. Springer, Heidelberg (2010)
17. Technical Report DTR13-5, Department of Computing, Imperial College London (2013)
18. Takeuchi, K., Honda, K., Kubo, M.: An interaction-based language and its typing system.
In: Halatsis, C., Philokyprou, G., Maritsas, D., Theodoridis, S. (eds.) PARLE 1994. LNCS,
vol. 817, pp. 398–413. Springer, Heidelberg (1994)
19. Villard, J.: Heaps and Hops. PhD thesis, ENS Cachan (2011)
20. Wadler, P.: Propositions as Sessions. In: ICFP 2012, pp. 273–286. ACM (2012)
Component Reconfiguration in the Presence
of Conflicts*

Roberto Di Cosmo1, Jacopo Mauro2, Stefano Zacchiroli1, and Gianluigi Zavattaro2

1 Univ Paris Diderot, Sorbonne Paris Cité, PPS, UMR 7126, CNRS, F-75205 Paris, France
[email protected], [email protected]
2 Focus Team, Univ of Bologna/INRIA, Mura A. Zamboni 7, Bologna, Italy
{jmauro,zavattar}@cs.unibo.it

Abstract. Components are traditionally modeled as black-boxes equipped with
interfaces that indicate provided/required ports and, often, also conflicts with
other components that cannot coexist with them. In modern tools for automatic
system management, components become grey-boxes that expose relevant internal
states and the actions that can be performed on the components to change such
states during the deployment and reconfiguration phases. However, state-of-the-art
tools in this field do not support a systematic management of conflicts. In
this paper we investigate the impact of conflicts by precisely characterizing the
increase in complexity of the reconfiguration problem.

1 Introduction

Modern software systems are increasingly based on interconnected software components
(e.g. packages or services) deployed on clusters of heterogeneous machines that
can be created, connected and reconfigured on-the-fly. Traditional component models
represent components as black-boxes with interfaces indicating their provide and
require ports. In many cases conflicts are also considered, in order to deal with frequent
situations in which components cannot be co-installed.

In software systems where components are frequently reconfigured (e.g. "cloud"
based applications that elastically react to client demands), more expressive component
models are considered: a component becomes a grey-box exposing relevant internal
states and the actions that can be performed on the component to change its state during
deployment and reconfiguration. For instance, in the popular system configuration tool
Puppet [10] or the novel deployment management system Engage [8], components can be
in the absent, present, running or stopped states, and the actions install, uninstall, start,
stop and restart can be executed upon them. Rather expressive dependencies among
components can be declared. The aim of these tools is to allow the system administrator
to declaratively express the desired component configuration and to automatically execute
a correct sequence of low-level actions that brings the current configuration to a new one
satisfying the administrator's requests while respecting dependencies. We call
reconfigurability the problem of checking the existence of such a sequence of low-level actions.
* Work partially supported by the Aeolus project, ANR-2010-SEGI-013-01, and performed at
IRILL, center for Free Software Research and Innovation in Paris, France, www.irill.org

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 187–198, 2013.
c Springer-Verlag Berlin Heidelberg 2013

Despite the importance of conflicts in many component models (see e.g. package-based
software distributions used for Free and Open Source Software (FOSS) [5], the
Eclipse plugin model [3], or the OSGi component framework [12]), state-of-the-art
management systems like the above do not take conflicts into account. This is likely
ascribable to the increased complexity of the reconfigurability problem in the presence
of conflicts. In this paper we precisely characterize this increase in complexity.

In a related paper [6] we proposed the Aeolus component model that, despite
its simplicity, is expressive enough to capture the main features of tools like Puppet
and Engage. We proved that the reconfigurability problem is polynomial-time solvable
for Aeolus−, the fragment without numerical constraints. In this paper we consider
Aeolus core, the extension of this fragment with conflicts, and we prove that even if
the reconfigurability problem remains decidable, it turns out to be ExpSpace-hard.
We consider this result a fundamental step towards the realization of tools that
manage conflicts systematically. In fact, we shed some light on the specific sources of
the increase in complexity of the reconfigurability problem.
The technical contribution of the paper and its structure are as follows. In Section 2
we formalize the reconfigurability problem in the presence of conflicts. In Section 3
we prove its decidability by resorting to the theory of Well-Structured Transition
Systems [2,7]. We consider this decidability result interesting also from a foundational
viewpoint: although our component model has many commonalities with concurrency
models like Petri nets, in our case the addition of conflicts (corresponding to inhibitor
arcs in Petri nets) does not make the analysis of reachability problems undecidable.
The close relationship between our model and Petri nets is used in Section 4, where
we prove the ExpSpace-hardness of the reconfigurability problem by reduction
from the coverability problem in Petri nets. In Section 5 we discuss related work and
report concluding remarks. Missing proofs are available in [4].

2 The Aeolus core Model

The Aeolus core model represents relevant internal states of components by means
of a finite state automaton (see Fig. 1): depending on its state, a component activates
provide and require functionalities (called ports) and may get in conflict with ports provided
by other components (in Fig. 1 active ports are black while inactive ones are grey). Each port
is identified by an interface name. Bindings can be established between provide and
require ports with the same interface. Fig. 1 shows the graphical representation of a
typical deployment of the popular WordPress blog. According to the Debian package
metadata, WordPress requires a Web server providing httpd in order to be installed, and
an active MySQL database server in order to be in production. The chosen Web server
is Apache2, which is broken into various packages (e.g. apache2, apache2-bin) that
shall be installed simultaneously. Notice that Apache2 is not co-installable with other
Web servers, such as lighttpd.

We now move to the formal definition of Aeolus core. We assume given a set I of
interface names.
Definition 1 (Component type). The set Γ of component types of the Aeolus core
model, ranged over by T, T1, T2, ..., contains 4-tuples ⟨Q, q0, T, D⟩ where:

– Q is a finite set of states containing the initial state q0;
– T ⊆ Q × Q is the set of transitions;
– D is a function from Q to triples ⟨P, R, C⟩ of sets of interface names (i.e. P, R, C ⊆ I),
indicating the provide, require, and conflict ports that each state activates. We assume
that the initial state q0 has no requirements and conflicts (i.e. D(q0) = ⟨P, ∅, ∅⟩).

Fig. 1. Typical WordPress/Apache/MySQL deployment, modeled in Aeolus core

We now define configurations, which describe systems composed of components and their
bindings. Each component has a unique identifier taken from the set Z. A configuration,
ranged over by C1, C2, ..., is given by a set of component types, a set of components
in some state, and a set of bindings.

Definition 2 (Configuration). A configuration C is a 4-tuple ⟨U, Z, S, B⟩ where:

– U ⊆ Γ is the finite universe of the available component types;
– Z ⊆ Z is the set of the currently deployed components;
– S is the component state description, i.e. a function that associates to each component
in Z a pair ⟨T, q⟩ where T ∈ U is a component type ⟨Q, q0, T, D⟩ and q ∈ Q is the
current component state;
– B ⊆ I × Z × Z is the set of bindings, namely triples composed of an interface,
the component that requires that interface, and the component that provides it; we
assume that the two components are different.
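As an illustration, Definitions 1 and 2 can be rendered as plain data structures. The following sketch is ours, not part of the formal model; names such as ComponentType and Configuration, and the radically simplified WordPress/Apache2 types, are illustrative assumptions only:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ComponentType:
    states: frozenset            # Q
    init: str                    # q0
    trans: frozenset             # subset of Q x Q
    # D maps each state q to (provides, requires, conflicts): three sets of interfaces.
    D: dict = field(compare=False, default_factory=dict)

@dataclass
class Configuration:
    universe: set                # U: available component types
    components: set              # Z: deployed component identifiers
    state: dict                  # S: component id -> (ComponentType, current state)
    bindings: set                # B: triples (interface, requirer id, provider id)

# A toy rendering of the Fig. 1 scenario, collapsed to a single require port.
wordpress = ComponentType(
    states=frozenset({"uninstalled", "installed"}),
    init="uninstalled",
    trans=frozenset({("uninstalled", "installed"), ("installed", "uninstalled")}),
    D={"uninstalled": (set(), set(), set()),
       "installed": (set(), {"httpd"}, set())},
)
apache2 = ComponentType(
    states=frozenset({"absent", "running"}),
    init="absent",
    trans=frozenset({("absent", "running"), ("running", "absent")}),
    D={"absent": (set(), set(), set()),
       "running": ({"httpd"}, set(), set())},
)
conf = Configuration(
    universe={wordpress, apache2},
    components={"z1", "z2"},
    state={"z1": (wordpress, "installed"), "z2": (apache2, "running")},
    bindings={("httpd", "z1", "z2")},
)
```

Note that the binding ⟨httpd, z1, z2⟩ connects the require port of z1 to the provide port of z2, as prescribed by the last item of Definition 2.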
Configurations are equivalent if they have the same instances up to instance renaming.

Definition 3 (Configuration equivalence). Two configurations ⟨U, Z, S, B⟩ and
⟨U, Z′, S′, B′⟩ are equivalent (⟨U, Z, S, B⟩ ≡ ⟨U, Z′, S′, B′⟩) iff there exists a bijective
function ρ from Z to Z′ s.t.
– S(z) = S′(ρ(z)) for every z ∈ Z;
– ⟨r, z1, z2⟩ ∈ B iff ⟨r, ρ(z1), ρ(z2)⟩ ∈ B′.
Notation. We write C[z] as a lookup operation that retrieves the pair ⟨T, q⟩ = S(z), where
C = ⟨U, Z, S, B⟩. On such a pair we then use the postfix projection operators .type and .state
to retrieve T and q, respectively. Similarly, given a component type ⟨Q, q0, T, D⟩, we use
projections to decompose it: .states, .init, and .trans return the first three elements; .P(q), .R(q),
and .C(q) return the three elements of the D(q) tuple. Moreover, we use .prov (resp. .req) to
denote the union of all the provide ports (resp. require ports) of the states in Q. When there is no
ambiguity we take the liberty of applying the component type projections to ⟨T, q⟩ pairs. Example:
C[z].R(q) stands for the require ports of component z in configuration C when it is in state q.
We can now formalize the notion of configuration correctness.

Definition 4 (Correctness). Let us consider the configuration C = ⟨U, Z, S, B⟩.

We write C |=req (z, r) to indicate that the require port of component z, with interface
r, is bound to an active port providing r, i.e. there exists a component z′ ∈ Z \ {z} such
that ⟨r, z, z′⟩ ∈ B, C[z′] = ⟨T′, q′⟩ and r ∈ T′.P(q′). Similarly, for conflicts, we write
C |=cnf (z, c) to indicate that the conflict port c of component z is satisfied because
no other component has an active port providing c, i.e. for every z′ ∈ Z \ {z} with
C[z′] = ⟨T′, q′⟩ we have that c ∉ T′.P(q′).

The configuration C is correct if for every component z ∈ Z with S(z) = ⟨T, q⟩ we
have that C |=req (z, r) for every r ∈ T.R(q) and C |=cnf (z, c) for every c ∈ T.C(q).
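Definition 4 can be read as an executable check. The following sketch is our own rendering, not the paper's: component types are encoded as plain dictionaries mapping a state q to its triple ⟨P, R, C⟩, and the tiny two-component example is invented for illustration:

```python
def provides(state_of, z):
    """Active provide ports of component z: the P in D(q) = (P, R, C)."""
    T, q = state_of[z]
    return T[q][0]

def is_correct(state_of, bindings):
    """Executable reading of Definition 4.

    state_of: component id -> (T, q), where T is a dict q -> (P, R, C);
    bindings: set of (interface, requirer, provider) triples.
    """
    for z, (T, q) in state_of.items():
        P, R, C = T[q]
        # C |=req (z, r): each active require port must be bound
        # to another component that actively provides r.
        for r in R:
            if not any(i == r and z1 == z and r in provides(state_of, z2)
                       for (i, z1, z2) in bindings):
                return False
        # C |=cnf (z, c): no other component may actively provide c.
        for c in C:
            if any(c in provides(state_of, z2) for z2 in state_of if z2 != z):
                return False
    return True

# Tiny example: component a requires "db", which b provides when "on".
Tb = {"on": ({"db"}, set(), set()), "off": (set(), set(), set())}
Ta = {"run": (set(), {"db"}, set())}
state_of = {"a": (Ta, "run"), "b": (Tb, "on")}
assert is_correct(state_of, {("db", "a", "b")})   # bound and provided
assert not is_correct(state_of, set())            # unbound require port
```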

Configurations evolve at the granularity of actions.


Definition 5 (Actions). The set A contains the following actions:

– stateChange(⟨z1, q1, q′1⟩, . . . , ⟨zn, qn, q′n⟩) where zi ∈ Z and ∀i ≠ j . zi ≠ zj;
– bind(r, z1, z2) where z1, z2 ∈ Z and r ∈ I;
– unbind(r, z1, z2) where z1, z2 ∈ Z and r ∈ I;
– newRsrc(z : T) where z ∈ Z and T ∈ U is the component type of z;
– delRsrc(z) where z ∈ Z.

Notice that we consider a set of state changes in order to deal with simultaneous instal-
lations like the one needed for Apache2 and Apache2-bin in Fig. 1. The execution of
actions is formalized as configuration transitions.
Definition 6 (Reconfigurations). Reconfigurations are denoted by transitions C −α→ C′,
meaning that the execution of α ∈ A on the configuration C produces a new
configuration C′. The transitions from a configuration C = ⟨U, Z, S, B⟩ are defined as follows:

– C −stateChange(⟨z1,q1,q′1⟩,...,⟨zn,qn,q′n⟩)→ ⟨U, Z, S′, B⟩
  if ∀i . C[zi].state = qi and ∀i . (qi, q′i) ∈ C[zi].trans,
  where S′(z′) = ⟨C[zi].type, q′i⟩ if z′ = zi for some i, and S′(z′) = C[z′] otherwise;
– C −bind(r,z1,z2)→ ⟨U, Z, S, B ∪ {⟨r, z1, z2⟩}⟩ if ⟨r, z1, z2⟩ ∉ B and r ∈ C[z1].req ∩ C[z2].prov;
– C −unbind(r,z1,z2)→ ⟨U, Z, S, B \ {⟨r, z1, z2⟩}⟩ if ⟨r, z1, z2⟩ ∈ B;
– C −newRsrc(z:T)→ ⟨U, Z ∪ {z}, S′, B⟩ if z ∉ Z and T ∈ U,
  where S′(z) = ⟨T, T.init⟩ and S′(z′) = C[z′] for every z′ ≠ z;
– C −delRsrc(z)→ ⟨U, Z \ {z}, S′, B′⟩,
  where S′(z) = ⊥, S′(z′) = C[z′] for every z′ ≠ z, and B′ = {⟨r, z1, z2⟩ ∈ B | z ∉ {z1, z2}}.
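Operationally, each rule rewrites one field of the configuration tuple. A sketch of the two simplest rules (ours, not the paper's; configurations are encoded as dictionaries mirroring ⟨U, Z, S, B⟩, and .req/.prov are computed as the unions over all states, as in the Notation paragraph):

```python
def bind(conf, r, z1, z2):
    """bind(r, z1, z2): add a binding if it is absent and r is a require
    interface of z1 and a provide interface of z2.

    conf = {"S": id -> (T, q), "B": set of (r, z1, z2)}; T maps q -> (P, R, C).
    """
    T1, _ = conf["S"][z1]
    T2, _ = conf["S"][z2]
    req1 = set().union(*(prc[1] for prc in T1.values()))    # C[z1].req
    prov2 = set().union(*(prc[0] for prc in T2.values()))   # C[z2].prov
    assert (r, z1, z2) not in conf["B"] and r in req1 & prov2
    conf["B"] = conf["B"] | {(r, z1, z2)}
    return conf

def unbind(conf, r, z1, z2):
    """unbind(r, z1, z2): remove an existing binding."""
    assert (r, z1, z2) in conf["B"]
    conf["B"] = conf["B"] - {(r, z1, z2)}
    return conf

# Example: a requires "db" in state "run"; b provides "db" in state "on".
Ta = {"run": (set(), {"db"}, set())}
Tb = {"on": ({"db"}, set(), set())}
conf = {"S": {"a": (Ta, "run"), "b": (Tb, "on")}, "B": set()}
bind(conf, "db", "a", "b")
unbind(conf, "db", "a", "b")
```

The remaining rules (stateChange, newRsrc, delRsrc) would analogously rewrite the S and Z fields, with delRsrc also discarding all bindings that mention the deleted component.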

We can now define a reconfiguration run as the effect of the execution of a sequence of
actions (atomic or multiple state changes).

Definition 7 (Reconfiguration Run). A reconfiguration run is a sequence of
reconfigurations C0 −α1→ C1 −α2→ · · · −αm→ Cm such that Ci is correct, for every 0 ≤ i ≤ m.

As an example, a reconfiguration run to reach the scenario depicted in Fig. 1, starting
from a configuration where only apache2 and mysql are running and apache2-bin is
installed, is the one involving in sequence the creation of wordpress, the bindings of
wordpress with mysql and apache2, and finally the installation of wordpress.
We now have all the ingredients to define the reconfigurability problem: given a
universe of component types and an initial configuration, we want to know whether
there exists a reconfiguration run leading to a configuration that includes at least one
component of a given type T in a given state q.

Definition 8 (Reconfigurability Problem). The reconfigurability problem has as input
a universe U of component types, an initial configuration C, a component type T, and
a state q. It returns as output true if there exists a reconfiguration run C −α1→ C1 −α2→
· · · −αm→ Cm such that Cm[z] = ⟨T, q⟩ for some component z in Cm; otherwise, it returns false.

The restriction to only one component in a given state is not limiting: we can encode
any given combination of component types and states by adding dummy provide ports
enabled only by the final states of interest, and a target dummy component with
requirements on all such provide ports.

3 Reconfigurability is Decidable in Aeolus core

We demonstrate the decidability of the reconfigurability problem by resorting to the
theory of Well-Structured Transition Systems (WSTS) [2,7].
A reflexive and transitive relation is called a quasi-ordering. A well-quasi-ordering
(wqo) is a quasi-ordering (X, ≤) such that, for every infinite sequence x1, x2, x3, ...,
there exist i < j with xi ≤ xj. Given a quasi-order ≤ over X, an upward-closed set is a
subset I ⊆ X such that the following holds: ∀x, y ∈ X : (x ∈ I ∧ x ≤ y) ⇒ y ∈ I. Given
x ∈ X, its upward closure is ↑x = {y ∈ X | x ≤ y}. This notion can be extended to sets
in the obvious way: given a set Y ⊆ X we define its upward closure as ↑Y = ⋃_{y∈Y} ↑y.
A finite basis of an upward-closed set I is a finite set B such that I = ⋃_{x∈B} ↑x.

Definition 9. A WSTS is a transition system (S, →, ⪯) where ⪯ is a wqo on S which
is compatible with →, i.e., for every s1 ⪯ s′1 such that s1 → s2, there exists s′1 →∗ s′2
such that s2 ⪯ s′2 (→∗ is the reflexive and transitive closure of →). Given a state s ∈ S,
Pred(s) is the set {s′ ∈ S | s′ → s} of immediate predecessors of s. Pred is extended
to sets in the obvious way: Pred(S) = ⋃_{s∈S} Pred(s). A WSTS has effective pred-basis if
there exists an algorithm that, given s ∈ S, returns a finite basis of ↑Pred(↑s).

The following proposition is a special case of Proposition 3.5 in [7].


Proposition 1. Let (S, →, ⪯) be a finitely branching WSTS with decidable ⪯ and
effective pred-basis. Let I be any upward-closed subset of S and let Pred∗(I) be the set
{s′ ∈ S | ∃s ∈ I . s′ →∗ s} of predecessors of states in I. A finite basis of Pred∗(I) is computable.

In the remainder of the section we assume a given universe U of component types,
so we can consider that the sets of possible component types T and of possible internal
states q are both finite. We will resort to the theory of WSTS by considering an abstract
model of configurations in which bindings are not taken into account.
Definition 10 (Abstract Configuration). An abstract configuration B is a finite
multiset of pairs ⟨T, q⟩ where T is a component type and q is a corresponding state. We
use Conf to denote the set of abstract configurations.

A concretization of an abstract configuration is simply a correct configuration that, for
every component-type and state pair ⟨T, q⟩, has as many instances of component T in
state q as there are pairs ⟨T, q⟩ in the abstract configuration.

Definition 11 (Concretization). Given an abstract configuration B we say that a
correct configuration C = ⟨U, Z, S, B⟩ is a concretization of B if there exists a bijection
f from the multiset B to Z s.t. ∀⟨T, q⟩ ∈ B we have that S(f(⟨T, q⟩)) = ⟨T, q⟩. We
denote with γ(B) the set of concretizations of B. We say that an abstract configuration
B is correct if it has at least one concretization (formally, γ(B) ≠ ∅).
An interesting property of an abstract configuration is that from one of its concretizations
it is possible to reach, via bind and unbind actions, all the other concretizations up
to instance renaming. This is because it is always possible to switch a binding from
one provide port to another by first adding a binding to the new port and then removing
the old binding.

Property 1. Given an abstract configuration B and configurations C1, C2 ∈ γ(B), there
exists a sequence α1, . . . , αn of bind and unbind actions s.t. C1 −α1→ · · · −αn→ C ≡ C2.
We now move to the definition of our quasi-ordering on abstract configurations. In order
to be compatible with the notion of correctness we cannot adopt the usual multiset
inclusion ordering. In fact, the addition of one component to a correct configuration
could introduce a conflict. If the type-state pair of the added component was absent in
the configuration, the conflict might be with a component of a different type-state. If the
type-state pair was present in a single copy, the conflict might be with that component
if the considered type-state pair activates one provide and one conflict port on the same
interface. This sort of self-conflict is revealed when there are at least two instances,
as one component cannot be in conflict with itself. If the type-state pair was already
present in at least two copies, no new conflicts can be added otherwise such conflicts
were already present in the configuration (thus contradicting its correctness).
In the light of the above observation, we define an ordering on configurations that
corresponds to the product of three orderings: the identity on the set of type-state pairs
that are absent, the identity on the pairs that occur in exactly one instance, and multiset
inclusion for the projections on the remaining type-state pairs.
Definition 12 (≤). Given a pair ⟨T, q⟩ and an abstract configuration B, let
#B(⟨T, q⟩) be the number of occurrences in B of the pair ⟨T, q⟩. Given two
abstract configurations B1, B2 we write B1 ≤ B2 if for every component type T
and state q we have that #B1(⟨T, q⟩) = #B2(⟨T, q⟩) when #B1(⟨T, q⟩) ∈ {0, 1} or
#B2(⟨T, q⟩) ∈ {0, 1}, and #B1(⟨T, q⟩) ≤ #B2(⟨T, q⟩) otherwise.
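Read operationally, Definition 12 compares multiplicities pointwise, with the twist that multiplicities 0 and 1 must match exactly. A small sketch of the comparison (ours, using Counter as a stand-in for multisets of type-state pairs):

```python
from collections import Counter

def leq(b1, b2):
    """Definition 12: b1 <= b2 over abstract configurations.

    b1, b2: multisets of (component type, state) pairs, as Counters.
    Multiplicities 0 and 1 must match exactly; multiplicities >= 2 may grow.
    """
    for pair in set(b1) | set(b2):
        n1, n2 = b1[pair], b2[pair]
        if n1 <= 1 or n2 <= 1:
            if n1 != n2:
                return False
        elif n1 > n2:
            return False
    return True

b = Counter({("T", "q"): 2})
assert leq(b, Counter({("T", "q"): 5}))               # 2 <= 5, both at least 2
assert not leq(Counter({("T", "q"): 1}), b)           # a count of 1 must match exactly
assert not leq(Counter(), Counter({("T", "q"): 1}))   # an absent pair must stay absent
```

The two failing cases mirror the discussion above: adding an instance to an absent or singleton type-state pair could introduce a conflict, so the ordering forbids exactly those additions.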

As discussed above, this ordering is compatible with correctness.

Property 2. If an abstract configuration B is correct, then all the abstract configurations B′
such that B ≤ B′ are also correct.

Another interesting property of the ≤ quasi-ordering is that from a concretization of
an abstract configuration it is always possible to reach, via a reconfiguration run, a
concretization of a smaller abstract configuration. It suffices to first add to the starting
configuration the bindings that are present in the final configuration. Then the extra
components present in the starting configuration can be deleted, because they are not
needed to guarantee correctness (they are instances of components that remain available
in at least two copies). Finally, the remaining extra bindings can be removed.

Property 3. Given two abstract configurations B1, B2 s.t. B1 ≤ B2, C1 ∈ γ(B1), and
C2 ∈ γ(B2), there exists a reconfiguration run C2 −α1→ · · · −αn→ C ≡ C1.

We have that ≤ is a wqo on Conf because, as we consider finitely many component
type-state pairs, the three distinct orderings that compose ≤ are themselves wqo.

Lemma 1. ≤ is a wqo over Conf.

We now define a transition system on abstract configurations and prove that it is a WSTS
with respect to the ordering defined above.

Definition 13 (Abstract reconfigurations). We write B −→ B′ if there exists C −α→ C′
for some C ∈ γ(B) and C′ ∈ γ(B′).

By Property 3 and Lemma 1 we have the following.

Lemma 2. The transition system (Conf, −→, ≤) is a WSTS.

The following lemma is rather technical; it will be used to prove that (Conf, −→, ≤)
has effective pred-basis. Intuitively, it allows us to consider, in the computation of
the predecessors, only finitely many different state change actions.

Lemma 3. Let k be the number of distinct component type-state pairs. If B1 −→ B2
then there exists B′1 −→ B′2 such that B′1 ≤ B1, B′2 ≤ B2 and |B′2| ≤ 3k + 2k².

Proof. If |B2| ≤ 3k + 2k² the thesis trivially holds. Consider now |B2| > 3k + 2k² and
a transition C1 −α→ C2 such that C1 ∈ γ(B1) and C2 ∈ γ(B2). Since |B2| > 3k there are
three components z1, z2 and z3 having the same component type and internal state. We
consider two subcases.

Case 1. z1, z2 and z3 do not perform a state change in the action α. W.l.o.g. we can
assume that z3 does not appear in α (this is not restrictive because at most two components
that do not perform a state change can occur in an action). We can now consider the
configuration C′1 obtained from C1 by removing z3 (if there are bindings connected to
provide ports of z3, these can be rebound to ports of z1 or z2). Consider now C′1 −α→ C′2 and
the corresponding abstract configurations B′1 and B′2. It is easy to see that B′1 −→ B′2,
B′1 ≤ B1, B′2 ≤ B2 and |B′2| < |B2|. If |B′2| ≤ 3k + 2k² the thesis is proved, otherwise
we repeat this deletion of components.

Case 2. There are no three components of the same type-state that do not perform a
state change. Since |B2| > 2k² + 2 we have that α is a state change involving strictly
more than 2k² components. This ensures the existence of three components z′1, z′2 and
z′3 of the same type that perform the same state change from q to q′. As in the previous
case we consider the configuration C′1 obtained from C1 by removing z′3, and α′ the state
change similar to α but without the state change of z′3. Consider now C′1 −α′→ C′2 and the
corresponding abstract configurations B′1 and B′2. As above, B′1 ≤ B1, B′2 ≤ B2 and
|B′2| < |B2|. If |B′2| ≤ 3k + 2k² the thesis is proved, otherwise we repeat the deletion
of components. ⊓⊔
We are now in place to prove that (Conf, −→, ≤) has effective pred-basis.

Lemma 4. The transition system (Conf, −→, ≤) has effective pred-basis.

Proof. We first observe that, given an abstract configuration, the set of its concretizations
up to configuration equivalence is finite, and that, given a configuration C, the set of
preceding configurations C′ such that C′ −α→ C is also finite (and effectively computable).
Consider now an abstract configuration B, and let k be the number of distinct component
type-state pairs as in Lemma 3. We show how to compute a finite basis for ↑Pred(↑B).
First of all we consider the configuration B itself if |B| > 3k + 2k², and the (finite) set of
configurations B′ such that B ≤ B′ and |B′| ≤ 3k + 2k² otherwise.
Then we consider the (finite) set of concretizations of all such abstract configurations.
Finally, we compute the (finite) set of the preceding configurations of all such
concretizations. The set of abstract configurations corresponding to the latter is a finite basis
for ↑Pred(↑B), as a consequence of Lemma 3. ⊓⊔
We are finally ready to prove our decidability result.

Theorem 1. The reconfigurability problem in Aeolus core is decidable.

Proof. Let k be the number of distinct component type-state pairs according to the
considered universe of component types. We first observe that if there exists a correct
configuration containing a component of type T in state q, then it is possible to obtain,
via some binding, unbinding, and delete actions, another correct configuration with k or
fewer components. Hence, given a component type T and a state q, the number of target
configurations that need to be considered is finite. Moreover, given a configuration C′ ∈
γ(B′), there exists a reconfiguration run from C ∈ γ(B) to C′ iff B ∈ Pred∗(↑B′).

To solve the reconfigurability problem it is therefore possible to consider only the
(finite set of) abstractions of the target configurations. For each of them, say B′, by
Proposition 1, Lemma 2 and Lemma 4 we know that a finite basis for Pred∗(↑B′) can
be computed. It is then sufficient to check whether at least one of the abstract configurations
in such a basis is ≤ w.r.t. the abstraction of the initial configuration. ⊓⊔

4 Reconfigurability is ExpSpace-hard in Aeolus core


We prove that the reconfigurability problem in Aeolus core is ExpSpace-hard by
reduction from the coverability problem in Petri nets, a problem which is indeed known
to be ExpSpace-complete [11,13]. We start with some background on Petri nets.

Fig. 2. Example of the component type transformation η(T)

A Petri net is a tuple N = (P, T, m0), where P and T are finite sets of places and
transitions, respectively. A finite multiset over the set P of places is called a marking,
and m0 is the initial marking. Given a marking m and a place p, we say that the place
p contains a number of tokens equal to the number of instances of p in m. A transition
t ∈ T is a pair of markings denoted with •t and t•. A transition t can fire in the marking m
if •t ⊆ m (where ⊆ is multiset inclusion); upon transition firing the new marking of the
net becomes n = (m ∖ •t) ⊎ t• (where ∖ and ⊎ are the difference and union operators for
multisets, respectively). This is written as m ⇒ n. We use ⇒∗ to denote the reflexive and
transitive closure of ⇒. We say that m′ is reachable from m if m ⇒∗ m′. The coverability
problem for a marking m consists of checking whether m0 ⇒∗ m′ for some m ⊆ m′.
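Firing and coverability translate directly into code over multisets. A minimal sketch (ours) with Counter as the multiset type; the two-place net at the end is an invented example, not taken from the paper:

```python
from collections import Counter

def can_fire(m, t):
    """A transition t = (pre, post) can fire in marking m iff pre is a sub-multiset of m."""
    pre, _ = t
    return all(m[p] >= n for p, n in pre.items())

def fire(m, t):
    """New marking n = (m \\ pre) joined with post (multiset difference and union)."""
    pre, post = t
    n = Counter(m)
    n.subtract(pre)
    n.update(post)
    return +n  # unary + drops zero and negative entries

def covers(m, target):
    """m covers target iff target is a sub-multiset of m."""
    return all(m[p] >= n for p, n in target.items())

# Illustrative net: one transition consuming a token from p, producing two in q.
m0 = Counter({"p": 1})
t = (Counter({"p": 1}), Counter({"q": 2}))
m1 = fire(m0, t)
assert can_fire(m0, t) and not can_fire(m1, t)
assert covers(m1, Counter({"q": 1}))
```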
We now discuss how to encode Petri nets into Aeolus core component types. Before
entering into the details we observe that, given a component type T, it is always possible
to modify it in such a way that its instances are persistent and unique. The uniqueness
constraint can be enforced by letting all the states of the component type provide
a new port with which they are also in conflict. To avoid component deletion it is
sufficient to impose a reciprocal dependence with a new type of component: once this
dependence is established, the components cannot be deleted without violating it. In Fig. 2 we
show an example of how a component type having two states can be modified in order
to reach our goal. A new auxiliary initial state q′0 is created. The new port e ensures
that the instances of type T in a state different from q′0 are unique. The require port f,
provided by a new component type Taux, forbids the deletion of instances of type T
if they are not in state q′0. We assume that the ports e and f are fresh. We can therefore
consider w.l.o.g. components that, when deployed, are unique and persistent. Given a
component type T we denote this component type transformation with η(T).
We now describe how to encode a Petri net in the Aeolus core model. We use
three types of components: one modeling the tokens, one for the transitions, and one
defining a counter. The components for the transitions and the counter are unique and
persistent, while those for the tokens cannot be unique because the number of tokens in
a Petri net can be unbounded. The simplest component is the one used to model a token
in a given place. Intuitively, one token in a place is encoded as one instance of a
corresponding component type in an on state. More than one of these components can be
deployed simultaneously, representing multiple tokens in a place. In Fig. 3a
we represent the component type for the tokens in the place p of the Petri net. The
initial state is the off state. A token can be created following a protocol consisting
of requiring the port ap and then providing the port bp to signal the change of status.
Similarly, a token can be deleted by requiring the port cp and then providing the port dp.

Fig. 3. Token and counter component types: (a) token in place p; (b) i-th bit of the counter

Even if multiple instances of the token component can be deployed simultaneously, the
conflict ports a p and c p guarantee that only one at a time can initiate the protocol to
change its state. We denote with token(p) the component type representing the tokens
in the place p.
In order to model the transitions with component types without an exponential
blow-up in the size of the encoding, we need a mechanism to count up to a fixed
number. Indeed, a transition can consume and produce up to a given number of tokens.
To count up to n we will use components C_1, . . . , C_⌈log(n)⌉; every C_i will represent
the i-th least significant bit of the binary representation of the counter which, for
our purposes, needs to support only the increment and reset operations. In Fig. 3b we
depict one of the bits implementing the counter. The initial state is 0. The bit is reset
by providing the reset counter_i port, and incremented by providing the up counter_i
port. If the bit is in state 1, the increment triggers the increment of the
next bit, except for the component representing the most significant bit, which never
needs to do that. We transform all the component types representing the counter using
the η transformation to ensure uniqueness and persistence of their instances. The instance
of η(C_i) can be used to count how many tokens are consumed or produced, by checking
whether the right number has been reached via the ports counter_i(1) and counter_i(0).
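As an illustration only (the Python below is ours, not part of the formal encoding), the cascading increment implemented by the bit components can be sketched as follows; the class and method names are hypothetical:

```python
class BitCounter:
    """Toy model of the bit components C_1..C_k: each entry of `bits`
    stands for one component in state 0 or 1.  Incrementing a bit in
    state 1 resets it and carries the increment to the next, more
    significant bit, mirroring the up/up' port chain of Fig. 3b."""

    def __init__(self, k):
        self.bits = [0] * k          # bits[0] is the least significant bit

    def reset(self):
        # the reset counter_i ports: every bit returns to state 0
        self.bits = [0] * len(self.bits)

    def up(self, i=0):
        # the up counter_i port: flip bit i; a bit in state 1 triggers the
        # increment of bit i+1 (the most significant bit never does)
        if self.bits[i] == 0:
            self.bits[i] = 1
        elif i + 1 < len(self.bits):
            self.bits[i] = 0
            self.up(i + 1)
        else:
            self.bits[i] = 0         # wrap-around at the top bit

    def value(self):
        # the number exposed through the counter_i(0)/counter_i(1) ports
        return sum(b << i for i, b in enumerate(self.bits))

c = BitCounter(3)
for _ in range(5):
    c.up()
print(c.value())  # 5
```

Five increments of a three-bit counter leave the bits at 101, i.e. the value 5, which is how a transition component can check that the required number of tokens has been processed.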
A transition can be represented by a single component interacting with token and
counter components. The state changes of the transition component can be intuitively
divided into phases. In each phase a fixed number of tokens from a given place
is consumed or produced. The counter is first reset by providing the reset counter_i and
requiring the reset′ counter_i ports for all the counter bits. Then a cycle starts that increments
the counter, providing and requiring the ports up counter_1 and up′ counter_1, and consuming
or producing a token. The production of a token in place p is obtained by providing
and requiring the ports a_p and b_p, while its consumption by providing and requiring the ports
c_p and d_p. The phase ends when all the bits of the counter represent in binary the right
number of tokens that need to be consumed or produced; if instead at least one bit is
wrong, the cycle restarts. In Fig. 4 we depict the phase of a consumption of n tokens.
Starting from the initial state of the component representing the transition, the consumption
phases need to be performed first. When the final token has been produced
Component Reconfiguration in the Presence of Conflicts 197

[Figure]

Fig. 4. Consumption phase of n tokens from place p for a transition t (k = ⌈log(n)⌉ and h_i is the
i-th least significant bit of the binary representation of n)

the transition component can restart from the initial state. Given a transition t, we
denote by transition(t) the component type described above.
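Abstracting the port handshakes into plain operations, the control flow of one consumption phase can be sketched as follows (a sketch of ours; the names and data representation are hypothetical):

```python
def consume_phase(tokens_on, n, k):
    """One consumption phase (cf. Fig. 4): reset the k-bit counter, then
    repeatedly turn a token component off (ports c_p, d_p) and increment
    the counter (ports up counter_1, up' counter_1) until the counter
    bits spell out n in binary."""
    counter = 0                      # reset counter_1 .. counter_k
    while counter != n:              # some bit differs from h_i: cycle again
        if not tokens_on:
            # a needed token is absent: the run can only proceed partially
            raise RuntimeError("phase stuck: no token to consume")
        tokens_on.pop()              # one token component goes on -> off
        counter = (counter + 1) % (2 ** k)
    return tokens_on                 # phase done: the counter equals n

left = consume_phase([1, 2, 3, 4, 5], 3, 2)
print(len(left))  # 2 tokens remain after consuming 3 of the 5
```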
Definition 14 (Petri net encoding in Aeolus core). Given a Petri net N = (P, T, m0), if
n is the largest number of tokens that can be consumed or produced by a transition in
T, the encoding of N in Aeolus core is the set of component types ΓN = {token(p) | p ∈
P} ∪ {η(C_i) | i ∈ [1..⌈log(n)⌉]} ∪ {η(transition(t)) | t ∈ T}.
An important property of the previous encoding is that it is polynomial w.r.t. the size
of the Petri net. This is due to the fact that the counter and place components have a
constant number of states and ports, while the transition components have a number of
states that grows linearly with the number of places involved in a transition.
The proof that the reconfiguration problem for Aeolus core is ExpSpace-hard then
follows from the following correspondence between a Petri net N and its set of component
types ΓN: every computation in N can be faithfully reproduced by a corresponding
reconfiguration run on the component types ΓN; conversely, every reconfiguration run on ΓN
corresponds to a computation in N, except that components of kind
token(p) may be deleted (because η is not applied to those components) and components
transition(t) may execute the consumption of tokens only partially (because, e.g.,
some token needed by the transition is absent). In both cases, the effect is to reach a
configuration in which some of the tokens were lost during the reconfiguration run, but
this is not problematic as we deal with coverability. In fact, if a configuration is reached
with at least some tokens, then the corresponding Petri net is also able to reach a
marking with at least those tokens (possibly more).
Theorem 2. The reconfiguration problem for Aeolus core is ExpSpace-hard.

5 Related Work and Conclusions


Engage [8] is very close in purpose to Aeolus: it provides a declarative language to
define resource configurations and a deployment engine. However, it lacks conflicts.

This might make a huge computational difference, as it is precisely the introduction
of conflicts that makes reconfigurability ExpSpace-hard in Aeolus core (the problem is
polynomial in Aeolus− [6]). ConfSolve [9] is a DSL used to specify system configurations
with constraints suitable for modern Constraint Satisfaction Problem solvers.
ConfSolve allocates virtual machines to physical ones considering constraints on CPU,
RAM, etc.; this differs from reconfigurability in Aeolus. Package-based software
management [1,5] is a degenerate case of Aeolus reconfigurability: package managers are
used to compute a new configuration, but they use simple heuristics to reach it, ignoring
transitive inconsistencies met during deployment.
In this work we have studied the impact of adding conflicts to a realistic component
model on the complexity of reconfigurability: the problem remains decidable—while
in other models, like Petri nets, the addition of tests-for-absence makes the model
Turing powerful—but becomes ExpSpace-hard.
We consider our decidability and hardness proofs useful for at least two future in-
tertwined research directions. On the one hand, we plan to extend existing tools with
techniques inspired by our decidability proof in order to also deal with conflicts and
produce a reconfiguration run. On the other hand, the hardness proof sheds some light
on the specific combination of component model features that make the reconfigurabil-
ity problem ExpSpace-hard. We plan to investigate realistic restrictions on the Aeolus
component model for which efficient reconfigurability algorithms could be devised.

References
1. Abate, P., Di Cosmo, R., Treinen, R., Zacchiroli, S.: Dependency solving: a separate concern
in component evolution management. J. Syst. Software 85, 2228–2240 (2012)
2. Abdulla, P.A., Cerans, K., Jonsson, B., Tsay, Y.K.: General decidability theorems for infinite-
state systems. In: LICS, pp. 313–321. IEEE (1996)
3. Clayberg, E., Rubel, D.: Eclipse Plug-ins, 3rd edn. Addison-Wesley (2008)
4. Di Cosmo, R., Mauro, J., Zacchiroli, S., Zavattaro, G.: Component reconfiguration in the
presence of conflicts. Tech. rep. Aeolus Project (2013),
http://hal.archives-ouvertes.fr/hal-00816468
5. Di Cosmo, R., Trezentos, P., Zacchiroli, S.: Package upgrades in FOSS distributions: Details
and challenges. In: HotSWup 2008 (2008)
6. Di Cosmo, R., Zacchiroli, S., Zavattaro, G.: Towards a formal component model for the
cloud. In: Eleftherakis, G., Hinchey, M., Holcombe, M. (eds.) SEFM 2012. LNCS, vol. 7504,
pp. 156–171. Springer, Heidelberg (2012)
7. Finkel, A., Schnoebelen, P.: Well-structured transition systems everywhere! Theoretical
Computer Science 256, 63–92 (2001)
8. Fischer, J., Majumdar, R., Esmaeilsabzali, S.: Engage: a deployment management system. In:
PLDI 2012: Programming Language Design and Implementation, pp. 263–274. ACM (2012)
9. Hewson, J.A., Anderson, P., Gordon, A.D.: A declarative approach to automated configura-
tion. In: LISA 2012: Large Installation System Administration Conference, pp. 51–66 (2012)
10. Kanies, L.: Puppet: Next-generation configuration management. The USENIX Magazine 31(1), 19–25 (2006)
11. Lipton, R.J.: The Reachability Problem Requires Exponential Space. Research report 62,
Department of Computer Science, Yale University (1976)
12. OSGi Alliance: OSGi Service Platform, Release 3. IOS Press, Inc. (2003)
13. Rackoff, C.: The covering and boundedness problems for vector addition systems. Theoret.
Comp. Sci. 6, 223–231 (1978)
Stochastic Context-Free Grammars, Regular
Languages, and Newton’s Method

Kousha Etessami1 , Alistair Stewart1 , and Mihalis Yannakakis2


1
School of Informatics, University of Edinburgh
[email protected], [email protected]
2
Department of Computer Science, Columbia University
[email protected]

Abstract. We study the problem of computing the probability that a


given stochastic context-free grammar (SCFG), G, generates a string in
a given regular language L(D) (given by a DFA, D). This basic problem
has a number of applications in statistical natural language processing,
and it is also a key necessary step towards quantitative ω-regular model
checking of stochastic context-free processes (equivalently, 1-exit recur-
sive Markov chains, or stateless probabilistic pushdown processes).
We show that the probability that G generates a string in L(D) can
be computed to within arbitrary desired precision in polynomial time
(in the standard Turing model of computation), under a rather mild as-
sumption about the SCFG, G, and with no extra assumption about D.
We show that this assumption is satisfied for SCFGs whose rule probabilities are learned via the well-known inside-outside (EM) algorithm
for maximum-likelihood estimation (a standard method for constructing
SCFGs in statistical NLP and biological sequence analysis). Thus, for
these SCFGs the algorithm always runs in P-time.

1 Introduction
Stochastic (or Probabilistic) Context-Free Grammars (SCFG) are context-free
grammars where the rules (productions) have associated probabilities. They are
a central stochastic model, widely used in natural language processing [14], with
applications also in biology (e.g. [2, 13]). A SCFG G generates a language L(G)
(like an ordinary CFG) and assigns a probability to every string in the language.
SCFGs have been extensively studied since the 1970’s. A number of important
problems on SCFGs can be viewed as instances of the following regular pattern
matching problem for different regular languages:
Given a SCFG G and a regular language L, given e.g., by a deterministic
finite automaton (DFA) D, compute the probability PG (L) that G generates a
string in L, i.e. compute the sum of the probabilities of all the strings in L.
A simple example is when L = Σ ∗ , the set of all strings over the terminal
alphabet Σ of the SCFG G. Then this problem simply asks to compute the

The full version of this paper is available at arxiv.org/abs/1302.6411. Research
partially supported by the Royal Society and by NSF Grant CCF-1017955.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 199–211, 2013.

© Springer-Verlag Berlin Heidelberg 2013

probability PG (L(G)) of the language L(G) generated by the grammar G. Alternatively, if we view the SCFG as a stochastic process that starts from the
start nonterminal, repeatedly applies the probabilistic rules to replace (say, left-
most) nonterminals, and terminates when a string of terminals is reached, then
PG (L(G)) is simply the probability that this process terminates. Another simple
example is when L is a singleton, L = {w}, for some string w; in this case the
problem corresponds to the basic parsing question of computing the probability
that a given string w is generated by the SCFG G. Another basic well-studied
problem is the computation of prefix probabilities: given a SCFG G and a string
w, compute the probability that G generates a string with prefix w [12, 21].
This is useful in online processing in speech recognition [12] and corresponds to
the case L = wΣ ∗ . A more complex problem is the computation of infix prob-
abilities [1, 18], where we wish to compute the probability that G generates a
string that contains a given string w as a substring, which corresponds to the
language L = Σ ∗ wΣ ∗ . In general, even when rule probabilities of the SCFG G
are rational, the probabilities we wish to compute can be irrational. Thus the
typical aim for “computing” them is to approximate them to desired precision.
Stochastic context-free grammars are closely related to 1-exit recursive Markov
chains (1-RMC) [9], and to stateless probabilistic pushdown automata (also called
pBPA) [5]; these are two equivalent models for a subclass of probabilistic pro-
grams with recursive procedures. The above regular pattern matching problem
for SCFGs is equivalent to the problem of computing the probability that a
computation of a given 1-RMC (or pBPA) terminates and satisfies a given reg-
ular property. In other words, it corresponds to the quantitative model checking
problem for 1-RMCs with respect to regular finite string properties.
We first review some prior related work, and then describe our results.

Previous Work. As mentioned above, there has been, on the one hand, sub-
stantial work in the NLP literature on different cases of the problem for various
regular languages L, and on the other hand, there has been work in the verifi-
cation and algorithms literature on the analysis and model checking of recursive
Markov chains and probabilistic pushdown automata. Nevertheless, even the
simple special case of L = Σ ∗ , the question of whether it is possible to compute
(approximately) in polynomial time the desired probability for a given SCFG
G (i.e. the probability PG (L(G)) of L(G)) was open until very recently. In [7]
we showed that PG (L(G)) can be computed to arbitrary precision in polynomial
time in the size of the input SCFG G and the number of bits of precision. From
a SCFG G, one can construct a multivariate system of equations x = PG (x),
where x is a vector of variables and PG is a vector of polynomials with positive
coefficients which sum to (at most) 1. Such a system is called a probabilistic poly-
nomial system (PPS), and it always has a non-negative solution that is smallest
in every coordinate, called the least fixed point (LFP). A particular coordinate
of the LFP of the system x = PG (x) is the desired probability PG (L(G)). To
compute PG (L(G)), we used a variant of Newton’s method on x = PG (x), with
suitable rounding after each step to control the bit-size of numbers, and showed
that it converges in P-time to the LFP [7]. Building on this, we also showed that

the probability PG ({w}) of string w under SCFG G can also be computed to


any precision in P-time in the size of G, w and the number of bits of precision.
The use of Newton’s method was proposed originally in [9] for computing
termination probabilities for (multi-exit) RMC’s, which requires the solution of
equations from a more general class of polynomial systems x = P (x), called
monotone polynomial systems (MPS), where the polynomials of P have positive
coefficients, but their sum is not restricted to ≤ 1. An arbitrary MPS may not
have any non-negative solution, but if it does then it has a LFP, and a version
of Newton provably converges to the LFP [9]. There are now implementations of
variants of Newton’s method in several tools [22, 16] and experiments show that
they perform well on many instances. The rate of convergence of Newton for
general MPSs was studied in detail in [4], and was further studied most recently
in [20] (see below). In certain cases, Newton converges fast, but in general there
are exponential bad examples. Furthermore, there are negative results indicating
it is very unlikely that any non-trivial approximation of termination probabilities
of multi-exit RMCs, and the LFP of MPSs, can be done in P-time (see [9]).
The model checking problem for RMCs (equivalently pPDAs) and ω-regular
properties was studied in [5, 10]. This is of course a more general problem than
the problem for SCFGs (which correspond to 1-RMCs) and regular languages
(the finite string case of ω-regular languages). It was shown in [10] that in the
case of 1-RMCs, the qualitative problem of determining whether the probability
that a run satisfies the property is 0 or 1 can be solved in P-time in the size of
the 1-RMC, but for the quantitative problem of approximating the probability,
the algorithm runs in PSPACE, and no better complexity bound was known.
The particular cases of computing prefix and infix probabilities for a SCFG
have been studied in the NLP literature, but no polynomial time algorithm for
general SCFGs is known. Jelinek and Lafferty gave an algorithm for grammars
in Chomsky Normal Form (CNF) [12]. Note that a general SCFG G may not
have any equivalent CNF grammar with rational rule probabilities, thus one can
only hope for an “approximately equivalent” CNF grammar; constructing such
a grammar in the case of stochastic grammars G is non-trivial, at least as dif-
ficult as computing the probability of L(G), and the first P-time algorithm was
given in [7]. Another algorithm for prefix probabilities by Stolcke [21] applies
to general SCFGs, but in the presence of unary and ε-rules, the algorithm does
not run in polynomial time. The problem of computing infix probabilities was
studied in [1, 16, 18], and in particular [16, 18] cast it in the general regular lan-
guage framework, and studied the general problem of computing the probability
PG (L(D)) of the language L(D) of a DFA D under a SCFG G. From G and
D they construct a product weighted context-free grammar (WCFG) G′: a CFG
with (positive) weights on the rules, which may not be probabilities; in particular
the weights on the rules of a nonterminal may sum to more than 1. The
desired probability PG (L(D)) is the weight of L(G′). As in the case of SCFGs,
this weight is given by the LFP of a monotone system of equations y = PG′ (y);
however, unlike the case of SCFGs, the system now is not a probabilistic system
(thus our result of [7] does not apply). Nederhof and Satta then solve the system

using the decomposed Newton method from [9] and Broyden’s (quasi-Newton)
method, and present experimental results for infix probability computations.
Most recently, in [20], we have obtained worst-case upper bounds on (rounded
and exact) Newton’s method applied to arbitrary MPSs, x = P (x), as a function
of the input encoding size |P | and log(1/ε), to converge to within additive error
ε > 0 of the LFP solution q∗. However, our bounds in [20], even when 0 < q∗ ≤
1, are exponential in the depth of (not necessarily critical) strongly connected
components of x = P (x), and furthermore they also depend linearly on log(1/q∗_min),
where q∗_min = min_i q∗_i, which can be as small as ≈ 1/2^{2^{|P|}}. As we describe next, we do far better
in this paper for the MPSs that arise from the “product” of a SCFG and a DFA.

Our Results. We study the general problem of computing the probability


PG (L(D)) that a given SCFG G generates a string in the language L(D) of
a given DFA D. We show that, under a certain mild assumption on G, this
probability can be computed to any desired precision in time polynomial in the
encoding sizes of G and D and the number of bits of precision.
We now sketch briefly the approach and state the assumption on G. First we
construct from G and D the product weighted CFG G′ = G ⊗ D as in [16] and
construct the corresponding MPS y = PG′ (y), whose LFP contains the desired
probability PG (L(D)) as one of its components. The system is monotone but not
probabilistic. We eliminate (in P-time) those variables that have value 0 in the
LFP, and apply Newton, with suitable rounding in every step. The heart of the
analysis shows there is a tight algebraic correspondence between the behavior of
Newton’s method on this MPS and its behavior on the probabilistic polynomial
system (PPS) x = PG (x) of G. In particular, this correspondence shows that,
with exact arithmetic, the two computations converge at the same rate. By
exploiting this, and by extending recent results we established for PPSs, we
obtain the conditional polynomial upper bound. Specifically, call a PPS x = P (x)
critical if the spectral radius of the Jacobian of P (x), evaluated at the LFP q∗,
is equal to 1 (it is always ≤ 1). We can form a dependency graph between the
variables of a PPS, and decompose the variables and the system into strongly
connected components (SCCs); an SCC is called critical if the induced subsystem
on that SCC is critical. The critical depth of a PPS is the maximum number of
critical SCCs on any path of the DAG of SCCs (i.e. the max nesting depth of
critical SCCs). We show that if the PPS of the given SCFG G has bounded (or
even logarithmic) critical depth, then we can compute PG (L(D)) (for any DFA
D) in polynomial time in the size of G, D and the number of bits of precision.
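The critical depth is a purely graph-theoretic quantity once the critical SCCs have been identified; it can be computed by a longest-path sweep over the DAG of SCCs, as in this sketch (the representation is ours):

```python
from functools import lru_cache

def critical_depth(dag, critical):
    """Maximum number of critical SCCs on any path of the DAG of SCCs.
    `dag` maps each SCC id to the list of its successor SCCs, and
    `critical` maps each SCC id to a boolean flag."""

    @lru_cache(maxsize=None)
    def depth(v):
        # longest chain of critical SCCs on paths starting at v
        best = max((depth(w) for w in dag[v]), default=0)
        return best + (1 if critical[v] else 0)

    return max((depth(v) for v in dag), default=0)

# Diamond-shaped DAG: the path 0 -> 2 -> 3 nests three critical SCCs
dag = {0: [1, 2], 1: [3], 2: [3], 3: []}
critical = {0: True, 1: False, 2: True, 3: True}
print(critical_depth(dag, critical))  # 3
```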
Furthermore, we show this condition is satisfied by a broad class of SCFGs
used in applications. Specifically, a standard way the probabilities of rules of a
SCFG are set is by using the EM (inside-outside) algorithm. We show that the
SCFGs constructed in this way are guaranteed to be noncritical (i.e., have critical
depth 0). So for these SCFGs, and any DFA, the algorithm runs in P-time.
Proofs are in the full version [8].

2 Definitions and Background

A weighted context-free grammar (WCFG), G = (V, Σ, R, p), has a finite set V
of nonterminals, a finite set Σ of terminals (alphabet symbols), and a finite list
of rules, R ⊂ V × (V ∪ Σ)∗, where each rule r ∈ R is a pair (A, γ), which we
usually denote by A → γ, where A ∈ V and γ ∈ (V ∪ Σ)∗. Finally, p : R → R+
maps each rule r ∈ R to a positive weight, p(r) > 0. We often denote a rule
r = (A → γ) together with its weight by writing A →^{p(r)} γ. We will sometimes
also specify a specific non-terminal S ∈ V as the starting symbol.
Note that we allow γ ∈ (V ∪ Σ)∗ to possibly be the empty string, denoted
by ε. A rule of the form A → ε is called an ε-rule. For a rule r = (A → γ), we
let left(r) := A and right(r) := γ. We let R_A = {r ∈ R | left(r) = A}.
For A ∈ V, let p(A) = Σ_{r∈R_A} p(r). A WCFG, G, is called a stochastic or
probabilistic context-free grammar (SCFG or PCFG; we shall use SCFG) if
p(A) ≤ 1 for all A ∈ V. An SCFG is called proper if p(A) = 1 for all A ∈ V.
We will say that a WCFG, G = (V, Σ, R, p), is in Simple Normal Form (SNF)
if every nonterminal A ∈ V belongs to one of the following three types:
1. type L: every rule r ∈ R_A has the form A →^{p(r)} B.
2. type Q: there is a single rule in R_A: A →^1 BC, for some B, C ∈ V.
3. type T: there is a single rule in R_A: either A →^1 ε, or A →^1 a for some a ∈ Σ.
For a WCFG, G, strings α, β ∈ (V ∪ Σ)∗, and π = r1 . . . rk ∈ R∗, we write α ⇒^π β
if the leftmost derivation starting from α, and applying the sequence π of rules,
derives β. We let p(α ⇒^π β) = Π_{i=1}^{k} p(r_i) if α ⇒^π β, and p(α ⇒^π β) = 0 otherwise.
If A ⇒^π w for A ∈ V and w ∈ Σ∗, we say that π is a complete derivation from A
and its yield is y(π) = w. There is a natural one-to-one correspondence between
the complete derivations of w starting at A and the parse trees of w rooted at
A, and this correspondence preserves weights.
For a WCFG, G = (V, Σ, R, p), nonterminal A ∈ V, and terminal string
w ∈ Σ∗, we let p^{G,w}_A = Σ_{π : y(π)=w} p(A ⇒^π w). For a general WCFG, p^{G,w}_A need
not be a finite value (it may be +∞, since the sum may not converge). Note
however that if G is an SCFG, then p^{G,w}_A defines the probability that, starting
at nonterminal A, G generates w, and thus it is clearly finite.
The termination probability (termination weight) of an SCFG (WCFG), G,
starting at nonterminal A, denoted q^G_A, is defined by q^G_A = Σ_{w∈Σ∗} p^{G,w}_A. Again,
for an arbitrary WCFG, q^G_A need not be a finite number. A WCFG G is called
convergent if q^G_A is finite for all A ∈ V. We will only encounter convergent
WCFGs in this paper, so when we say WCFG we mean convergent WCFG,
unless otherwise specified. If G is an SCFG, then q^G_A is just the total probability
with which the derivation process starting at A eventually generates a finite
string and (thus) stops, so SCFGs are clearly convergent.
An SCFG, G, is called consistent starting at A if q^G_A = 1, and G is called
consistent if it is consistent starting at every nonterminal. Note that even if an
SCFG, G, is proper, this does not necessarily imply that G is consistent. For an

SCFG, G, we can decide whether q^G_A = 1 in P-time ([9]). The same decision
problem is PosSLP-hard for convergent WCFGs ([9]).
For any WCFG, G = (V, Σ, R, p), with n = |V|, assume the nonterminals
in V are indexed as A1, . . . , An. We define the following monotone polynomial
system of equations (MPS) associated with G, denoted x = PG (x).
Here x = (x1, . . . , xn) denotes an n-vector of variables. Likewise PG (x) =
(PG (x)1, . . . , PG (x)n) denotes an n-vector of multivariate polynomials over the
variables x = (x1, . . . , xn). For a vector κ = (κ1, κ2, . . . , κn) ∈ N^n, we use the
notation x^κ to denote the monomial x1^{κ1} x2^{κ2} · · · xn^{κn}. For a non-terminal Ai ∈ V,
and a string α ∈ (V ∪ Σ)∗, let κi(α) ∈ N denote the number of occurrences of
Ai in the string α. We define κ(α) ∈ N^n to be κ(α) = (κ1(α), κ2(α), . . . , κn(α)).
In the MPS x = PG (x), corresponding to each nonterminal Ai ∈ V, there
will be one variable xi and one equation, namely xi = PG (x)i, where: PG (x)i ≡
Σ_{r=(Ai→α)∈R_{Ai}} p(r) x^{κ(α)}. If there are no rules associated with Ai, i.e., if R_{Ai} = ∅,
then by default we define PG (x)i ≡ 0. Note that if r ∈ R_{Ai} is a terminal rule,
i.e., κ(r) = (0, . . . , 0), then p(r) is one of the constant terms of PG (x)i.
Note: Throughout this paper, for any n-vector z, whose i’th coordinate zi “cor-
responds” to nonterminal Ai , we often find it convenient to use zAi to refer to
zi . So, e.g., we alternatively use xAi and PG (x)Ai , instead of xi and PG (x)i .
Note that if G is a SCFG, then in x = PG (x), by definition, the sum of the
monomial coefficients and constant terms of each polynomial PG (x)i is at most
1, because Σ_{r∈R_{Ai}} p(r) ≤ 1 for every Ai ∈ V. An MPS that satisfies this extra
condition is called a probabilistic polynomial system of equations (PPS).
Consider any MPS, x = P (x), with n variables, x = (x1, . . . , xn). Let R≥0
denote the non-negative real numbers. Then P (x) defines a monotone operator
on the non-negative orthant R^n_{≥0}. In general, an MPS need not have any real-valued
solution: consider x = x + 1. However, by monotonicity of P (x), if there
exists a ∈ R^n_{≥0} such that a = P (a), then there is a least fixed point (LFP) solution
q∗ ∈ R^n_{≥0} such that q∗ = P (q∗), and such that q∗ ≤ a for all solutions a ∈ R^n_{≥0}.
Proposition 1. (cf. [9] or see [17]) For any SCFG (or convergent WCFG), G,
with n nonterminals A1, . . . , An, the LFP solution of x = PG (x) is the n-vector
q^G = (q^G_{A1}, . . . , q^G_{An}) of termination probabilities (termination weights) of G.
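To make Proposition 1 concrete: for the SCFG with a single nonterminal X and rules X → XX (probability 2/3) and X → a (probability 1/3), the associated PPS is the one-variable system x = (2/3)x² + 1/3, whose fixed points are 1/2 and 1; the LFP 1/2 is the termination probability. Naive Kleene iteration x ← PG(x) from 0 (a sketch of ours, far slower than the Newton-based methods discussed in this paper) converges to it monotonically:

```python
def kleene_lfp(P, n, iters=10000):
    """Naive Kleene iteration x <- P(x) starting from the zero vector;
    for a monotone system with an LFP it converges monotonically to it."""
    x = [0.0] * n
    for _ in range(iters):
        x = P(x)
    return x

# PPS of the SCFG  X -> X X (2/3) | a (1/3):  x = (2/3) x^2 + 1/3.
# Its fixed points are 1/2 and 1; the LFP 1/2 is the termination
# probability q_X.
P = lambda x: [(2/3) * x[0] ** 2 + 1/3]
q = kleene_lfp(P, 1)
print(round(q[0], 4))  # 0.5
```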
For computation purposes, we assume that the input probabilities (weights)
associated with rules of input SCFGs or WCFGs are positive rationals encoded
by giving their numerator and denominator in binary. We use |G| to denote the
encoding size (i.e., number of bits) of an input WCFG G.
Given any WCFG (SCFG) G = (V, Σ, R, p) we can compute in linear time
an SNF form WCFG (resp. SCFG) G′ = (V′, Σ, R′, p′) of size |G′| = O(|G|) with
V′ ⊇ V such that q^{G′,w}_A = q^{G,w}_A for all A ∈ V, w ∈ Σ∗ (cf. [9] and Proposition
2.1 of [7]). Thus, for the problems studied in this paper, we may assume wlog
that a given input WCFG or SCFG is in SNF form.
A DFA, D = (Q, Σ, Δ, s0 , F ), has states Q, alphabet Σ, transition function
Δ : Q × Σ → Q, start state s0 ∈ Q and final states F ⊆ Q. We extend Δ to
strings: Δ∗ : Q × Σ ∗ → Q is defined by induction on the length |w| ≥ 0 of

w ∈ Σ∗: for s ∈ Q, Δ∗(s, ε) := s. Inductively, if w = aw′, with a ∈ Σ, then
Δ∗(s, w) := Δ∗(Δ(s, a), w′). We define L(D) = {w ∈ Σ∗ | Δ∗(s0, w) ∈ F}.
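The inductive definition of Δ∗ unrolls into a left-to-right scan over w; a small sketch with a toy DFA (all names are ours):

```python
def delta_star(delta, s, w):
    """Extended transition function: Delta*(s, eps) = s and
    Delta*(s, a w') = Delta*(Delta(s, a), w')."""
    for a in w:                    # the induction, unrolled
        s = delta[(s, a)]
    return s

def in_language(delta, s0, F, w):
    return delta_star(delta, s0, w) in F

# Toy DFA over {a, b} accepting the strings with an even number of b's
delta = {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 1, (1, 'b'): 0}
print(in_language(delta, 0, {0}, 'abba'))  # True
```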
Given a WCFG G and a DFA D over the same terminal alphabet, for any
nonterminal A of G, we define q^{G,D}_A = Σ_{w∈L(D)} q^{G,w}_A. If G is a SCFG, q^{G,D}_A
simply denotes the probability that G, starting at A, generates a string in L(D).
Our goal is to compute q^{G,D}_A, given SCFG G and DFA D. In general, q^{G,D}_A may
be irrational, even when all rule probabilities of G are rational values. So one
natural goal is to approximate q^{G,D}_A with desired precision. More precisely, the
approximation problem is: given an SCFG, G, a nonterminal A, a DFA, D, over
the same terminal alphabet Σ, and a rational error threshold δ > 0, output a
rational value v ∈ [0, 1] such that |v − q^{G,D}_A| < δ. We would like to do this as
efficiently as possible as a function of the input size: |G|, |D|, and log(1/δ).
To compute q^{G,D}_A, it will be useful to define a WCFG obtained as the product of
a SCFG and a DFA. We assume, wlog, that the input SCFG is in SNF form. The
product (or intersection) of a SCFG G = (V, Σ, R, p) in SNF form, and DFA,
D = (Q, Σ, Δ, s0, F), is defined to be a new WCFG, G ⊗ D = (V′, Σ, R′, p′),
where the set of nonterminals is V′ = Q × V × Q. Assuming n = |V| and d = |Q|,
then |V′| = d²n. The rules R′ and rule probabilities p′ of the product G ⊗ D are
defined as follows (recall G is assumed to be in SNF):
– Rules of form L: For every rule of the form (A →^p B) ∈ R, and every pair of
states s, t ∈ Q, there is a rule (sAt) →^p (sBt) in R′.
– Rules of form Q: for every rule (A →^1 BC) ∈ R, and for all states s, t, u ∈ Q,
there is a rule (sAu) →^1 (sBt)(tCu) in R′.
– Rules of form T: for every rule (A →^1 a) ∈ R, where a ∈ Σ, and for every
state s ∈ Q, if Δ(s, a) = t, then there is a rule (sAt) →^1 a in R′.
For every rule (A →^1 ε) ∈ R, and every s ∈ Q, there is a rule (sAs) →^1 ε in R′.
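The three rule schemas translate mechanically into code. The sketch below uses a hypothetical tuple encoding of SNF rules (our own, not from the paper): a rule is (A, rhs, p) with rhs one of ('L', B), ('Q', B, C), ('T', a), or ('E',) for an ε-rule:

```python
def product(rules, states, delta):
    """Build the rules of the product WCFG from the rules of an SNF
    SCFG and the transition function of a DFA, following the three
    schemas for rules of forms L, Q and T."""
    out = []
    for A, rhs, p in rules:
        if rhs[0] == 'L':                      # A ->(p) B
            out += [((s, A, t), ('L', (s, rhs[1], t)), p)
                    for s in states for t in states]
        elif rhs[0] == 'Q':                    # A ->(1) B C
            out += [((s, A, u), ('Q', (s, rhs[1], t), (t, rhs[2], u)), p)
                    for s in states for t in states for u in states]
        elif rhs[0] == 'T':                    # A ->(1) a, a terminal
            out += [((s, A, delta[(s, rhs[1])]), ('T', rhs[1]), p)
                    for s in states]
        else:                                  # A ->(1) epsilon
            out += [((s, A, s), ('E',), p) for s in states]
    return out
```

Note that each type-Q rule yields d³ product rules, while the nonterminal set grows to the d²n triples stated above.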
Associated with the WCFG, G ⊗ D, is the MPS y = PG⊗D (y), where y is now
a d²n-vector of variables, where n = |V| and d = |Q|. The LFP solution of this
MPS captures the probabilities q^{G,D}_A in the following sense:
Proposition 2. (cf. [18], or [10] for a variant of this) For any SCFG, G =
(V, Σ, R, p), and DFA, D = (Q, Σ, Δ, s0, F), the LFP solution q^{G⊗D} of the MPS
y = PG⊗D (y) satisfies 0 ≤ q^{G⊗D} ≤ 1. Furthermore, for any A ∈ V and s, t ∈ Q,
q^{G⊗D}_{(sAt)} = Σ_{w : Δ∗(s,w)=t} q^{G,w}_A. Thus, for every A ∈ V, q^{G,D}_A = Σ_{t∈F} q^{G⊗D}_{(s0At)}.

Newton’s Method (NM). For an MPS (or PPS), x = P (x), in n variables,
let B(x) := P′(x) denote the Jacobian matrix of P (x). In other words, B(x)
is an n × n matrix such that B(x)_{i,j} = ∂P (x)_i / ∂x_j. For a vector z ∈ R^n, assuming
that the matrix (I − B(z)) is non-singular, we define a single iteration of Newton’s
method (NM) for x = P (x) on z via the following operator:

N (z) := z + (I − B(z))^{−1} (P (z) − z)        (1)

Using Newton iteration, starting at the n-vector x^{(0)} := 0, yields the following
iteration: x^{(k+1)} := N (x^{(k)}), for k = 0, 1, 2, . . ..
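For illustration, iteration (1) can be transcribed directly with NumPy (a sketch of ours, assuming a small dense system in which I − B(x^(k)) stays non-singular along the iteration; this is the exact-arithmetic iteration, not the rounded variant of [7]):

```python
import numpy as np

def newton_lfp(P, jac, n, steps=50):
    """Newton iteration for x = P(x): starting from the zero vector,
    repeatedly apply N(z) = z + (I - B(z))^{-1} (P(z) - z), where B is
    the Jacobian of P (cf. equation (1))."""
    x = np.zeros(n)
    for _ in range(steps):
        B = jac(x)
        x = x + np.linalg.solve(np.eye(n) - B, P(x) - x)
    return x

# One-variable PPS x = (2/3) x^2 + 1/3, with LFP 1/2; B(x) = [[4x/3]].
P = lambda x: np.array([(2/3) * x[0] ** 2 + 1/3])
jac = lambda x: np.array([[(4/3) * x[0]]])
print(newton_lfp(P, jac, 1)[0])  # converges to 0.5
```

On this non-critical example the convergence is quadratic: the iterates 0, 1/3, 7/15, . . . reach machine precision within a handful of steps.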

For every MPS, we can detect in P-time all the variables xj such that q∗_j = 0
[9]. We can then remove these variables and their corresponding equations xj =
P (x)j, and substitute their values on the right-hand sides of the remaining equations.
This yields a new MPS, with LFP q′ > 0, which corresponds to the non-zero
coordinates of q∗. It was shown in [9] that one can always apply a decomposed
Newton’s method to this MPS, to converge monotonically to the LFP solution.

Proposition 3. (cf. Theorem 6.1 of [9] and Theorem 4.1 of [4]) Let x = P (x)
be a MPS, with LFP q ∗ > 0. Then starting at x(0) := 0, the Newton itera-
tions x(k+1) := N (x(k) ) are well defined and monotonically converge to q ∗ , i.e.
limk→∞ x(k) = q ∗ , and x(k+1) ≥ x(k) ≥ 0 for all k ≥ 0.

Unfortunately, it was shown in [9] that obtaining any non-trivial additive ap-
proximation to the LFP solution of a general MPS, even one whose LFP is
0 < q ∗ ≤ 1, is PosSLP-hard, so we can not compute the termination weights of
general WCFGs in P-time (nor even in NP), without a major breakthrough in
the complexity of numerical computation. (See [9] for more information.)
Fortunately, for the class of PPSs we can do a lot better. First, we can also identify
in P-time all the variables xj such that q∗_j = 1 [9] and remove them from
the system. We showed recently in [7] that by then applying a suitably rounded-down
variant of Newton’s method to the resulting PPS, we can approximate q∗
within additive error 2^{−j} in time polynomial in the size of the PPS and j.

3 Balance, Collapse, and Newton's Method

For an SCFG, G = (V, Σ, R, p), and a DFA, D = (Q, Σ, Δ, s_0, F), we want to relate the behavior of Newton's method on the MPS associated with the WCFG, G ⊗ D, to that of the PPS associated with the SCFG G. We shall show that there is indeed a tight correspondence, regardless of what the DFA D is. This holds even when G itself is a convergent WCFG, and thus x = P_G(x) is an MPS. We need an abstract algebraic way to express this correspondence. A key notion will be balance, and the collapse operator defined on balanced vectors and matrices.
Consider the LFP q^G of x = P_G(x), and the LFP q^{G⊗D} of y = P_{G⊗D}(y). By Propositions 1 and 2, for any A ∈ V, q^G_A = Σ_{w∈Σ^*} q^{G,w}_A is the probability (weight) that G, starting at A, generates any finite string. Likewise, q^{G⊗D}_{(sAt)} = Σ_{{w | Δ^*(s,w)=t}} q^{G,w}_A is the probability (weight) that, starting at A, G generates a finite string w such that Δ^*(s, w) = t. Thus, for any A ∈ V and s ∈ Q, q^G_A = Σ_{t∈Q} q^{G⊗D}_{(sAt)}.

It turns out that analogous relationships hold between many other vectors associated with G and G ⊗ D, including between the Newton iterates obtained by applying Newton's method to their respective PPS (or MPS) and the product MPS. Furthermore, associated relationships also hold between the Jacobian matrices B_G(x) and B_{G⊗D}(y) of P_G(x) and P_{G⊗D}(y), respectively.
SCFG, Regular Languages, and Newton's Method 207

Let n = |V| and let d = |Q|. A vector y ∈ R^{d²n}, whose coordinates are indexed by triples (sAt) ∈ Q × V × Q, is called balanced if for any non-terminal A, and any pair of states s, s′ ∈ Q, Σ_{t∈Q} y_{(sAt)} = Σ_{t∈Q} y_{(s′At)}. In other words, y is balanced if the value of the sum Σ_{t∈Q} y_{(sAt)} is independent of the state s. As already observed, q^{G⊗D} ∈ R^{d²n}_{≥0} is balanced. Let B ⊆ R^{d²n} denote the set of balanced vectors. Let us define the collapse mapping C : B → R^n. For any A ∈ V, C(y)_A := Σ_t y_{(sAt)}. Note: C(y) is well-defined, because for y ∈ B, and any A ∈ V, the sum Σ_t y_{(sAt)} is by definition independent of the state s.
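A minimal sketch of the balance check and the collapse map; the dimensions and the particular vector y below are illustrative assumptions, not data from the paper:

```python
# Sketch of the balance condition and the collapse map C for a vector y
# indexed by triples (s, A, t) in Q x V x Q.  The specific numbers are
# illustrative assumptions.

Q = [0, 1]          # DFA states (d = 2)
V = ['A']           # non-terminals (n = 1)

# y is balanced iff sum_t y[(s, X, t)] does not depend on s.
y = {(0, 'A', 0): 0.2, (0, 'A', 1): 0.5,
     (1, 'A', 0): 0.6, (1, 'A', 1): 0.1}   # both row sums are 0.7

def is_balanced(y, eps=1e-12):
    for X in V:
        sums = [sum(y[(s, X, t)] for t in Q) for s in Q]
        if any(abs(v - sums[0]) > eps for v in sums):
            return False
    return True

def collapse(y):
    # C(y)_A := sum_t y[(s, A, t)], for an arbitrary s (well-defined
    # precisely because y is balanced).
    s = Q[0]
    return {X: sum(y[(s, X, t)] for t in Q) for X in V}

assert is_balanced(y)
c = collapse(y)
print(c)   # {'A': 0.7}
```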
We next extend the definition of balance to matrices. A matrix M ∈ R^{d²n×d²n} is called balanced if, for any non-terminals B, C ∈ V, any states s, u ∈ Q, and any pair of states v, v′ ∈ Q, Σ_t M_{(sBt),(uCv)} = Σ_t M_{(sBt),(uCv′)}, and for any s, v ∈ Q and s′, v′ ∈ Q, Σ_{t,u} M_{(sBt),(uCv)} = Σ_{t,u} M_{(s′Bt),(uCv′)}. Let B_× ⊆ R^{d²n×d²n} denote the set of balanced matrices. We extend the collapse map C to matrices: C : B_× → R^{n×n} is defined as follows. For any M ∈ B_× and any B, C ∈ V, C(M)_{BC} := Σ_{t,u} M_{(sBt),(uCv)}. Note, again, that C(M) is well-defined.

We denote the Newton operator N applied to a vector x′ ∈ R^n for the PPS x = P_G(x) associated with G by N_G(x′). Likewise, we denote the Newton operator applied to a vector y′ ∈ R^{d²n} for the MPS y = P_{G⊗D}(y) associated with G ⊗ D by N_{G⊗D}(y′). For a real square matrix M, let ρ(M) denote the spectral radius of M. The main result of this section is the following:
Theorem 1. Let x = P_G(x) be any PPS (or MPS), with n variables, associated with an SCFG (or WCFG) G, and let y = P_{G⊗D}(y) be the corresponding product MPS, for any DFA D with d states. For any balanced vector y ∈ B ⊆ R^{d²n}, with y ≥ 0, ρ(B_{G⊗D}(y)) = ρ(B_G(C(y))). Furthermore, if ρ(B_{G⊗D}(y)) < 1, then N_{G⊗D}(y) is defined and balanced, N_G(C(y)) is defined, and C(N_{G⊗D}(y)) = N_G(C(y)). Thus, N_{G⊗D} preserves balance, and the collapse map C "commutes" with N over non-negative balanced vectors, irrespective of what the DFA D is.
We prove this in [8] via a series of lemmas that reveal many algebraic/analytic properties of balance, collapse, and Newton's method. Key is:

Lemma 1. Let B_{≥0} = B ∩ R^{d²n}_{≥0} and B_{×,≥0} = B_× ∩ R^{d²n×d²n}_{≥0}. We have q^{G⊗D} ∈ B_{≥0} and C(q^{G⊗D}) = q^G, and:
(i) If y ∈ B_{≥0} ⊆ R^{d²n}_{≥0}, then B_{G⊗D}(y) ∈ B_{×,≥0}, and C(B_{G⊗D}(y)) = B_G(C(y)).
(ii) If y ∈ B_{≥0}, then P_{G⊗D}(y) ∈ B_{≥0}, and C(P_{G⊗D}(y)) = P_G(C(y)).
(iii) If y ∈ B_{≥0} and ρ(B_G(C(y))) < 1, then I − B_{G⊗D}(y) is non-singular, (I − B_{G⊗D}(y))^{−1} ∈ B_{×,≥0}, and C((I − B_{G⊗D}(y))^{−1}) = (I − B_G(C(y)))^{−1}.
(iv) If y ∈ B_{≥0} and ρ(B_G(C(y))) < 1, then N_{G⊗D}(y) ∈ B and C(N_{G⊗D}(y)) = N_G(C(y)).
An easy consequence of Thm. 1 (and Prop. 3) is that if we use NM with exact arithmetic on the PPS or MPS, x = P_G(x), and on the product MPS, y = P_{G⊗D}(y), they converge at the same rate:

Corollary 1. For any PPS or MPS, x = P_G(x), with LFP q^G > 0, and corresponding product MPS, y = P_{G⊗D}(y), if we use Newton's method with exact arithmetic, starting at x^{(0)} := 0 and y^{(0)} := 0, then all the Newton iterates x^{(k)} and y^{(k)} are well-defined, and for all k: x^{(k)} = C(y^{(k)}).
4 Rounded Newton on PPSs and Product MPSs

To work in the Turing model of computation (as opposed to the unit-cost RAM model) we have to consider rounding between iterations of NM, as in [7].
Definition 1 (Rounded-down Newton's method (R-NM), with parameter h). Given an MPS, x = P(x), with LFP q^* > 0, in R-NM with integer rounding parameter h > 0 we compute a sequence of iteration vectors x^{[k]}. Starting with x^{[0]} := 0, for all k ≥ 0 we compute x^{[k+1]} as follows:
1. Compute x^{{k+1}} := N_P(x^{[k]}), where N_P(x) is the Newton operator defined in (1).
2. For each coordinate i = 1, . . . , n, set x^{[k+1]}_i to be equal to the maximum multiple of 2^{−h} which is ≤ max(x^{{k+1}}_i, 0). (In other words, round down x^{{k+1}} to the nearest multiple of 2^{−h}, while ensuring the result is non-negative.)
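The rounding step can be sketched with exact rational arithmetic, as suits the Turing-model analysis; the helper name `round_down` and the sample inputs are our own:

```python
from fractions import Fraction
import math

# Sketch of the rounding step of R-NM: round each coordinate down to the
# nearest multiple of 2^{-h}, clamped below at 0.  Exact rationals stand
# in for the fixed-point arithmetic of the Turing-model analysis.

def round_down(x, h):
    # largest multiple of 2^{-h} that is <= max(x, 0)
    v = max(Fraction(x), Fraction(0))
    return Fraction(math.floor(v * 2**h), 2**h)

print(round_down(Fraction(5, 7), 3))   # 5/8, since 5/8 <= 5/7 < 6/8
print(round_down(-0.25, 3))            # 0, clamped at the lower bound
```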

Rounding can cause the iterates x^{[k]} to become unbalanced, but we can handle this. For any PPS, x = P(x), with Jacobian matrix B(x) and LFP q^*, we have ρ(B(q^*)) ≤ 1 ([9, 7]). If ρ(B(q^*)) < 1, we call the PPS non-critical. Otherwise, if ρ(B(q^*)) = 1, we call the PPS critical. For SCFGs whose PPS x = P_G(x) is non-critical, we get good bounds, even though the R-NM iterates can become unbalanced:
Theorem 2. For any ε > 0 and any SCFG, G, if the PPS x = P_G(x) has LFP 0 < q^G ≤ 1 and ρ(B_G(q^G)) < 1, then if we use R-NM with parameter h + 2 to approximate the LFP solution of the MPS y = P_{G⊗D}(y), then ‖q^{G⊗D} − y^{[h+1]}‖_∞ ≤ ε, where h := 14|G| + 3 + log(1/ε) + log d.

Thus we can compute the probability q^{G,D}_A = Σ_{t∈F} q^{G⊗D}_{(s_0 A t)} within additive error δ > 0 in time polynomial in the input size |G|, |D| and log(1/δ), in the standard Turing model of computation.
We in fact obtain a much more general result. For any SCFG, G, and corresponding PPS, x = P_G(x), with LFP q^* > 0, the dependency graph, H_G = (V, E), has the variables (or the nonterminals of G) as nodes, and has the following edges: (x_i, x_j) ∈ E iff x_j appears in some monomial of P_G(x)_i with a positive coefficient. We can decompose the dependency graph H_G into its SCCs, and form the DAG of SCCs. For each SCC, S, suppose its corresponding equations are x_S = P_G(x_S, x_{D(S)})_S, where D(S) is the set of variables x_j ∉ S such that there is a path in H_G from some variable x_i ∈ S to x_j. We call an SCC, S, of H_G a critical SCC if the PPS x_S = P_G(x_S, q^G_{D(S)})_S is critical. In other words, the SCC S is critical if, when we plug the LFP values q^G into the variables in lower SCCs, D(S), the resulting PPS is critical. We note that an arbitrary PPS, x = P_G(x), is non-critical if and only if it has no critical SCC. We define the critical depth, c(G), of x = P_G(x) as follows: it is the maximum length, k, of any sequence S_1, S_2, . . . , S_k of SCCs of H_G such that for all i ∈ {1, . . . , k − 1}, S_{i+1} ⊆ D(S_i), and furthermore, such that for all j ∈ {1, . . . , k}, S_j is critical. Let us call a critical SCC, S, of H_G a bottom-critical SCC if D(S) does not contain any critical SCCs. Using earlier results ([9, 3]), we can compute in P-time the critical SCCs of a PPS, and its critical depth (see [8]).
PPSs with nested critical SCCs are hard to analyze directly. It turns out we can circumvent this by "tweaking" the probabilities in the SCFG G to obtain an SCFG G′ with no critical SCCs, and showing that the "tweaks" are small enough so that they do not change the probabilities of interest by much. Concretely:
Theorem 3. For any ε > 0, and for any SCFG, G, in SNF form, with q^G > 0 and critical depth c(G), consider the new SCFG, G′, obtained from G by the following process: for each bottom-critical SCC, S, of x = P_G(x), find any rule r = A → B of G with probability p, such that A and B are both in S (since G is in SNF, such a rule must exist in every critical SCC). Reduce the probability p, by setting it to p′ = p(1 − 2^{−(14|G|+3)·2^{c(G)}} · ε^{2^{c(G)}}). Do this for all bottom-critical SCCs. This defines G′, which is non-critical. Using G′ instead of G, if we apply R-NM with parameter h + 2 to approximate the LFP q^{G′⊗D} of the MPS y = P_{G′⊗D}(y), then ‖q^{G⊗D} − y^{[h+1]}‖_∞ ≤ ε, where h := log d + (3 · 2^{c(G)} + 1)(log(1/ε) + 14|G| + 3).

Thus we can compute q^{G,D}_A = Σ_{t∈F} q^{G⊗D}_{(s_0 A t)} within additive error δ > 0 in time polynomial in |G|, |D|, log(1/δ), and 2^{c(G)}, in the Turing model of computation.
The proof is very involved, and is in [8]. There, we also give a family of SCFGs,
and a 3-state DFA that checks the infix probability of string aa, and we explain
why these examples indicate it will likely be difficult to overcome the exponential
dependence on the critical-depth c(G) in the above bounds.

5 Non-criticality of SCFGs Obtained by EM

In doing parameter estimation for SCFGs, in either the supervised or unsupervised (EM) settings (see, e.g., [17]), we are given a CFG, H, with start nonterminal S, and we wish to extend it to an SCFG, G, by giving probabilities to the rules of H. We also have some probability distribution, P(π), over the complete derivations, π, of H that start at the start non-terminal S. (In the unsupervised case, we begin with an SCFG, and the distribution P arises from the prior rule probabilities and from the training corpus of strings.) We then assign each rule of H a (new) probability as follows to obtain (or update) G:

p(A → γ) := ( Σ_π P(π) C(A → γ, π) ) / ( Σ_π P(π) C(A, π) )    (2)

where C(r, π) is the number of times the rule r is used in the complete derivation π, and C(A, π) = Σ_{r∈R_A} C(r, π). Equation (2) only makes sense when the sums Σ_π P(π) C(A, π) are finite and nonzero, which we assume; we also assume that every non-terminal and rule of H appears in some complete derivation π with P(π) > 0.
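Update (2) can be sketched as follows; the tiny weighted "corpus" of derivations is an illustrative assumption of ours:

```python
# Sketch of the rule-probability update (2): each rule A -> gamma gets
# its expected usage count, normalised over all rules with left-hand
# side A.  The weighted derivations below are an illustrative assumption.

from collections import defaultdict

# Each derivation pi is (P(pi), {rule: count C(r, pi)}).
derivations = [
    (0.5, {('S', 'a S'): 2, ('S', 'b'): 1}),
    (0.5, {('S', 'b'): 1}),
]

num = defaultdict(float)   # sum_pi P(pi) C(A -> gamma, pi)
den = defaultdict(float)   # sum_pi P(pi) C(A, pi)
for prob, counts in derivations:
    for (lhs, rhs), cnt in counts.items():
        num[(lhs, rhs)] += prob * cnt
        den[lhs] += prob * cnt

p = {rule: num[rule] / den[rule[0]] for rule in num}
print(p)   # {('S', 'a S'): 0.5, ('S', 'b'): 0.5}
```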

Proposition 4. If we use parameter estimation to obtain the SCFG G using equation (2), under the stated assumptions, then G is consistent¹, i.e., q^G = 1, and furthermore the PPS x = P_G(x) is non-critical, i.e., ρ(B_G(1)) < 1.

¹ Consistency of the obtained SCFGs is well-known; see, e.g., [15, 17] and references therein; also [19] has results related to Prop. 4 for restricted grammars.
It follows from Prop. 4 and Thm. 2 that for SCFGs obtained by parameter estimation and EM, we can compute the probability q^{G,D}_A of generating a string in L(D) to within any desired precision in P-time, for any DFA D.
References
[1] Corazza, A., De Mori, R., Gretter, D., Satta, G.: Computation of probabilities for
an island-driven parser. IEEE Trans. PAMI 13(9), 936–950 (1991)
[2] Durbin, R., Eddy, S.R., Krogh, A., Mitchison, G.: Biological Sequence Analysis:
Probabilistic models of Proteins and Nucleic Acids. Cambridge U. Press (1999)
[3] Esparza, J., Gaiser, A., Kiefer, S.: Computing least fixed points of probabilistic
systems of polynomials. In: Proc. 27th STACS, pp. 359–370 (2010)
[4] Esparza, J., Kiefer, S., Luttenberger, M.: Computing the least fixed point of pos-
itive polynomial systems. SIAM J. on Computing 39(6), 2282–2355 (2010)
[5] Esparza, J., Kučera, A., Mayr, R.: Model checking probabilistic pushdown au-
tomata. Logical Methods in Computer Science 2(1), 1–31 (2006)
[6] Etessami, K., Stewart, A., Yannakakis, M.: Polynomial time algorithms for branch-
ing Markov decision processes and probabilistic min(max) polynomial Bellman
equations. In: Czumaj, A., Mehlhorn, K., Pitts, A., Wattenhofer, R. (eds.) ICALP
2012, Part I. LNCS, vol. 7391, pp. 314–326. Springer, Heidelberg (2012); See full
version at ArXiv:1202.4798
[7] Etessami, K., Stewart, A., Yannakakis, M.: Polynomial-time algorithms for multi-type branching processes and stochastic context-free grammars. In: Proc. 44th ACM STOC (2012); full version available at ArXiv:1201.2374
[8] Etessami, K., Stewart, A., Yannakakis, M.: Stochastic context-free grammars, regular languages, and Newton's method. Full preprint of this paper: ArXiv:1302.6411 (2013)
[9] Etessami, K., Yannakakis, M.: Recursive Markov chains, stochastic grammars, and
monotone systems of nonlinear equations. Journal of the ACM 56(1) (2009)
[10] Etessami, K., Yannakakis, M.: Model checking of recursive probabilistic systems.
ACM Trans. Comput. Log. 13(2), 12 (2012)
[11] Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge U. Press (1985)
[12] Jelinek, F., Lafferty, J.D.: Computation of the probability of initial substring gen-
eration by stochastic context-free grammars. Computational Linguistics 17(3),
315–323 (1991)
[13] Knudsen, B., Hein, J.: Pfold: RNA secondary structure prediction using stochastic
context-free grammars. Nucleic Acids Res 31, 3423–3428 (2003)
[14] Manning, C., Schütze, H.: Foundations of Statistical Natural Language Processing.
MIT Press (1999)
[15] Nederhof, M.-J., Satta, G.: Estimation of consistent probabilistic context-free
grammars. In: HLT-NAACL (2006)
[16] Nederhof, M.-J., Satta, G.: Computing partition functions of PCFGs. Research
on Language and Computation 6(2), 139–162 (2008)
[17] Nederhof, M.-J., Satta, G.: Probabilistic parsing. New Developments in Formal
Languages and Applications 113, 229–258 (2008)
[18] Nederhof, M.-J., Satta, G.: Computation of infix probabilities for probabilistic
context-free grammars. In: EMNLP, pp. 1213–1221 (2011)
[19] Sánchez, J., Benedí, J.-M.: Consistency of stochastic context-free grammars from
probabilistic estimation based on growth transformations. IEEE Trans. Pattern
Anal. Mach. Intell. 19(9), 1052–1055 (1997)
[20] Stewart, A., Etessami, K., Yannakakis, M.: Upper bounds for Newton's method on monotone polynomial systems, and P-time model checking of probabilistic one-counter automata. ArXiv:1302.3741 (2013); conference version to appear in CAV 2013
[21] Stolcke, A.: An efficient probabilistic context-free parsing algorithm that computes
prefix probabilities. Computational Linguistics 21(2), 167–201 (1995)
[22] Wojtczak, D., Etessami, K.: Premo: an analyzer for probabilistic recursive models.
In: Grumberg, O., Huth, M. (eds.) TACAS 2007. LNCS, vol. 4424, pp. 66–71.
Springer, Heidelberg (2007)
Reachability in Two-Clock Timed Automata Is PSPACE-Complete

John Fearnley¹ and Marcin Jurdziński²

¹ Department of Computer Science, University of Liverpool, UK
² Department of Computer Science, University of Warwick, UK
Abstract. Haase, Ouaknine, and Worrell have shown that reachability in two-clock timed automata is log-space equivalent to reachability in bounded one-counter automata. We show that reachability in bounded one-counter automata is PSPACE-complete.
1 Introduction
Timed automata [1] are a successful and widely used formalism for the analysis and verification of real-time systems. A timed automaton is a non-deterministic finite automaton that is equipped with a number of real-valued clocks, which allow the automaton to measure the passage of time.
Perhaps the most fundamental problem for timed automata is the reachability
problem: given an initial state, can we perform a sequence of transitions in
order to reach a specified target state? In their foundational paper on timed
automata [1], Alur and Dill showed that this problem is PSPACE-complete. To
show hardness for PSPACE, their proof starts with a linear bounded automaton
(LBA), which is a non-deterministic Turing machine with a fixed tape length n.
They produced a timed automaton with 2n + 1 clocks, and showed that the
timed automaton can reach a specified state if and only if the LBA halts.
However, the work of Alur and Dill did not address the case where the num-
ber of clocks is small. This was rectified by Courcoubetis and Yannakakis [3],
who showed that reachability in timed automata with only three clocks is still
PSPACE-complete. Their proof cleverly encodes the tape of an LBA in a single
clock, and then uses the two additional clocks to perform all necessary oper-
ations on the encoded tape. In contrast to this, Laroussinie et al. have shown
that reachability in one-clock timed automata is complete for NLOGSPACE, and
therefore no more difficult than computing reachability in directed graphs [6].
The complexity of reachability in two-clock timed automata has been left
open. So far, the best lower bound was given by Laroussinie et al., who gave
a proof that the problem is NP-hard via a very natural reduction from subset-
sum [6]. Moreover, the problem lies in PSPACE, because reachability in two-clock

A full version of this paper is available at http://arxiv.org/abs/1302.3109. This
work was supported by EPSRC grants EP/H046623/1 Synthesis and Verification in
Markov Game Structures and EP/D063191/1 The Centre for Discrete Mathematics
and its Applications (DIMAP).

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 212–223, 2013.
© Springer-Verlag Berlin Heidelberg 2013
timed automata is no harder than reachability in three-clock timed automata. However, the PSPACE-hardness proof of Courcoubetis and Yannakakis seems to fundamentally require three clocks, and does not naturally extend to the two-clock case. Naves [7] has shown that several extensions to two-clock timed automata lead to PSPACE-completeness, but his work does not advance upon the NP-hardness lower bound for unextended two-clock timed automata.
In a recent paper, Haase et al. have shown a link between reachability in timed
automata and reachability in bounded counter automata [5]. A bounded counter
automaton is a non-deterministic finite automaton equipped with a set of coun-
ters, and the transitions of the automaton may add or subtract arbitrary integer
constants to the counters. The state space of each counter is bounded by some
natural number b, so the counter may only take values in the range [0, b]. More-
over, transitions may only be taken if they do not increase or decrease a counter
beyond the allowable bounds. This gives these seemingly simple automata a
surprising amount of power, because the bounds can be used to implement inequality tests against the counters.
Haase et al. show that reachability in two-clock timed automata is log-space
equivalent to reachability in bounded one-counter automata. Reachability in
bounded one-counter automata has also been studied in the context of one-clock
timed automata with energy constraints [2], where it was shown that the problem
lies in PSPACE and is NP-hard. It has also been shown that the reachability
problem for unbounded one-counter automata is NP-complete [4], but the NP
membership proof does not seem to generalise to bounded one-counter automata.

Our Contribution. We show that satisfiability for quantified boolean formulas can be reduced, in polynomial time, to reachability in bounded one-counter automata. Hence, we show that reachability in bounded one-counter automata is PSPACE-complete, and therefore we resolve the complexity of reachability in two-clock timed automata. Our reduction uses two intermediate steps: subset-sum games and safe counter-stack automata.
Counter automata are naturally suited for solving subset-sum problems, so our
reduction starts with a quantified version of subset-sum, which we call subset-
sum games. One interpretation of satisfiability for quantified boolean formulas
is to view the problem as a game between an existential player and a universal
player. The players take turns to set their propositions to true or false, and the
existential player wins if and only if the boolean formula is satisfied. Subset-
sum games follow the same pattern, but apply it to subset-sum: the two players
alternate in choosing numbers from sets, and the existential player wins if and
only if the chosen numbers sum to a given target. Previous work by Travers can
be applied to show that subset-sum games are PSPACE-complete [8].
We reduce subset-sum games to reachability in bounded one-counter automata. However, we will not do this directly. Instead, we introduce safe counter-
stack automata, which are able to store multiple counters, but have a stack-like
restriction on how these counters may be accessed. These automata are a con-
venient intermediate step, because having access to multiple counters makes it
easier for us to implement subset-sum games. Moreover, the stack-based restrictions mean that it is relatively straightforward to show that reachability in safe counter-stack automata is reducible, in polynomial time, to reachability in bounded one-counter automata, which completes our result.
2 Subset-Sum Games
A subset-sum game is played between an existential player and a universal
player. The game is specified by a pair (ψ, T ), where T ∈ N, and ψ is a list:
∀ {A1 , B1 } ∃ {E1 , F1 } . . . ∀ {An , Bn } ∃ {En , Fn },
where Ai , Bi , Ei , and Fi , are all natural numbers.
The game is played in rounds. In the first round, the universal player chooses
an element from {A1 , B1 }, and the existential player responds by choosing an
element from {E1 , F1 }. In the second round, the universal player chooses an ele-
ment from {A2 , B2 }, and the existential player responds by choosing an element
from {E2 , F2 }. This pattern repeats for rounds 3 through n. Thus, at the end
of the game, the players will have constructed a sequence of numbers, and the
existential player wins if and only if the sum of these numbers is T .
Formally, the set of plays of the game is the set

P = Π_{1≤j≤n} ( {Aj, Bj} × {Ej, Fj} ).

A play P ∈ P is winning for the existential player if and only if Σ P = T, i.e., the entries of P sum to T.
A strategy for the existential player is a list of functions s = (s1, s2, . . . , sn), where each function si dictates how the existential player should play in the i-th round of the game. Thus, each function si is of the form

si : Π_{1≤j≤i} {Aj, Bj} → {Ei, Fi}.

This means that the function si maps the first i moves of the universal player to a decision for the existential player in the i-th round.
A play P conforms to a strategy s if the decisions made by the existential player in P always agree with s. More formally, if P = p1 p2 . . . p2n is a play, and s = (s1, s2, . . . , sn) is a strategy, then P conforms to s if and only if we have si(p1, p3, . . . , p2i−1) = p2i for all i. Let Plays(s) denote the set of plays that conform to s. A strategy s is winning if every play P ∈ Plays(s) is winning for the existential player. The subset-sum game problem is to decide, for a given SSG instance (ψ, T), whether the existential player has a winning strategy for (ψ, T).
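A subset-sum game can be decided directly by an alternating recursion (exponential time; this sketch is only meant to make the winning condition concrete). The two-round instance below is our own illustration:

```python
# Sketch of deciding a subset-sum game by alternation: the universal
# player picks from {A_i, B_i}, the existential player replies from
# {E_i, F_i}, and the existential player wins iff the total equals T.
# The two-round instance below is an illustrative assumption.

def exists_win(rounds, total, target):
    if not rounds:
        return total == target
    (a, b), (e, f) = rounds[0]
    rest = rounds[1:]
    # For every universal choice there must be some existential reply.
    return all(
        any(exists_win(rest, total + u + x, target) for x in (e, f))
        for u in (a, b)
    )

rounds = [((1, 2), (3, 4)), ((0, 2), (1, 3))]
print(exists_win(rounds, 0, 8))   # True: the existential player can force 8
print(exists_win(rounds, 0, 7))   # False: target 7 cannot be forced
```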
The SSG problem clearly lies in PSPACE, because it can be solved on a
polynomial time alternating Turing machine. A quantified version of subset-sum
has been shown to be PSPACE-hard, via a reduction from quantified boolean
formulas [8]. Since SSGs are essentially a quantified version of subset-sum, the
proof of PSPACE-hardness easily carries over.
Lemma 1. The subset-sum game problem is PSPACE-complete.
Reachability in Two-Clock Timed Automata Is PSPACE-Complete 215

3 Bounded One-Counter Automata

A bounded one-counter automaton has a single counter that can store values between 0 and some bound b ∈ N. The automaton may add or subtract values from the counter, so long as the bounds of 0 and b are not overstepped. This can be used to test inequalities against the counter. For example, let n ∈ N be a number, and suppose that we want to test whether the counter is smaller than or equal to n. We first attempt to add b − n to the counter; then, if that works, we subtract b − n from the counter. This creates a sequence of two transitions which can be taken if and only if the counter is smaller than or equal to n. A similar construction can be given for greater-than tests. For the sake of convenience, we will include explicit inequality testing in our formal definition, with the understanding that this is not actually necessary.
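The two-transition inequality test can be sketched as follows; the function names and the bound b = 10 are illustrative assumptions:

```python
# Sketch of the "<= n" test in a bounded one-counter automaton with
# bound b: try to add b - n, then subtract b - n.  The first addition
# is possible iff c + (b - n) <= b, i.e. iff c <= n.

def step(c, p, b):
    # Apply counter update p; return None if a bound is overstepped.
    c2 = c + p
    return c2 if 0 <= c2 <= b else None

def leq_test(c, n, b):
    mid = step(c, b - n, b)
    return mid is not None and step(mid, -(b - n), b) == c

b, n = 10, 4
print([c for c in range(b + 1) if leq_test(c, n, b)])   # [0, 1, 2, 3, 4]
```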
We now give a formal definition. For two integers a, b ∈ Z we define [a, b] = {n ∈ Z : a ≤ n ≤ b} to be the subset of integers between a and b. A bounded one-counter automaton is defined by a tuple (L, b, Δ, l0), where L is a finite set of locations, b ∈ N is a global counter bound, Δ specifies the set of transitions, and l0 ∈ L is the initial location. Each transition in Δ has the form (l, p, g1, g2, l′), where l and l′ are locations, p ∈ [−b, b] specifies how the counter should be modified, and g1, g2 ∈ [0, b] give lower and upper guards for the counter.

Each state of the automaton consists of a location l ∈ L along with a counter value c. Thus, we define the set of states S to be L × [0, b]. A transition exists between a state (l, c) ∈ S and a state (l′, c′) ∈ S if there is a transition (l, p, g1, g2, l′) ∈ Δ, where g1 ≤ c ≤ g2 and c′ = c + p.
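Since the state space L × [0, b] is finite, reachability can be decided by explicit search over it; the following breadth-first sketch and its toy automaton are our own illustration (exponential in the bit-length of b, unlike the results of this paper):

```python
# Sketch of reachability in a bounded one-counter automaton
# (L, b, Delta, l0) by breadth-first search over the state space
# L x [0, b]; the automaton below is an illustrative assumption.
from collections import deque

b = 5
# Transitions (l, p, g1, g2, l'): from l with g1 <= c <= g2,
# move to l' with counter c + p (which must stay in [0, b]).
delta = [('init', 2, 0, b, 'init'),    # keep adding 2
         ('init', -1, 3, b, 'goal')]   # once c >= 3, subtract 1 to reach goal

def reachable(l0, target, delta, b):
    seen = {(l0, 0)}
    queue = deque(seen)
    while queue:
        l, c = queue.popleft()
        if l == target:
            return True
        for (src, p, g1, g2, dst) in delta:
            if src == l and g1 <= c <= g2 and 0 <= c + p <= b:
                if (dst, c + p) not in seen:
                    seen.add((dst, c + p))
                    queue.append((dst, c + p))
    return False

print(reachable('init', 'goal', delta, b))   # True
```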
The reachability problem for bounded one-counter automata is: starting at the state (l0, 0), can the automaton reach a specified target location lt? It has been shown that the reachability problem for bounded one-counter automata is equivalent to the reachability problem for two-clock timed automata.

Theorem 2 ([5]). Reachability in bounded one-counter automata is log-space equivalent to reachability in two-clock timed automata.
4 Counter-Stack Automata
Outline. In this section we ask: can we use a bounded one-counter automaton to store multiple counters? The answer is yes, but doing so forces some interesting restrictions on the way in which the counters are accessed. By the end of this section, we will have formalised these restrictions as counter-stack automata.
Suppose that we have a bounded one-counter automaton with counter c and bound b = 15. Hence, the width of the counter is 4 bits. Now suppose that we wish to store two 2-bit counters c1 and c2 in c. We can do this as follows:

    c = [ 1 0 | 0 1 ]
          c2    c1
We allocate the top two bits of c to store c2 , and the bottom two bits to store c1 .
We can easily write to both counters: if we want to increment c2 then we add 4
to c, and if we want to increment c1 then we add 1 to c.
However, if we want to test equality, then things become more interesting.
It is easy to test equality against c2 : if we want to test whether c2 = 2, then
we test whether 8 ≤ c ≤ 11 holds. But, we cannot easily test whether c1 = 2
because we would have to test whether c is 2, 6, 10, or 14, and this list grows
exponentially as the counters get wider. However, if we know that c2 = 1, then
we only need to test whether c = 6. Thus, we arrive at the following guiding
principle: if you want to test equality against ci , then you must know the values
of cj for all j > i. Counter-stack automata are a formalisation of this principle.

Counter-Stack Automata. A counter-stack automaton has a set of k distinct counters, which are referred to as c1 through ck. For our initial definitions, we will allow the counters to take all values from N, but we will later refine this by defining safe counter-stack automata. The defining feature of a counter-stack automaton is that the counters are arranged in a stack-like fashion:
– All counters may be increased at any time.
– ci may only be tested for equality if the values of ci+1 through ck are known.
– ci may only be reset if the values of ci through ck are known.
When the automaton increases a counter, it adds a specified number n ∈ N to
that counter. The automaton has the ability to perform equality tests against
a counter, but the stack-based restrictions must be respected. An example of a
valid equality test is ck = 3 ∧ ck−1 = 10, because ck−1 = 10 only needs to be
tested in the case where ck = 3 is known to hold. Conversely, the test ck−1 = 10
by itself is invalid, because it places no restrictions on the value of ck .
The automaton may also reset a counter, but the stack-based restrictions
apply. Counter ci may only be reset by a transition, if that transition tests
equality against the values of ci through ck . For example, ck−1 may only be
reset if the transition is guarded by a test of the form ck−1 = n1 ∧ ck−2 = n2 .

Formal Definition. A counter-stack automaton is a tuple (L, C, Δ, l0), where L is a finite set of locations, C = [1, k] is a set of counter indexes, l0 ∈ L is an initial location, and Δ specifies the transition relation. Each transition in Δ has the form (l, E, I, R, l′) where:
– l, l′ ∈ L is a pair of locations.
– E is a partial function from C to N which specifies the equality tests. If E(i) is defined for some i, then E(j) must be defined for all j ∈ C with j > i.
– I ∈ N^k specifies how the counters must be increased.
– R ⊆ C specifies the set of counters that must be reset. It is required that E(r) is defined for every r ∈ R.
Each state of the automaton is a location annotated with values for each of the k counters. That is, the state space of the automaton is L × N^k. A state (l, c1, c2, . . . , ck) can transition to a state (l′, c′1, c′2, . . . , c′k) if and only if there exists a transition (l, E, I, R, l′) ∈ Δ, where the following conditions hold:
– For every i for which E(i) is defined, we must have ci = E(i).
– For every i ∈ R, we must have c′i = 0.
– For every i ∉ R, we must have c′i = ci + Ii.

A run is a sequence of states s0, s1, . . . , sn, where each si can transition to si+1. A counter-stack automaton is b-safe, for some b ∈ N, if it is impossible for the automaton to increase a counter beyond b. Formally, this condition requires that, for every state (l, c1, c2, . . . , ck) that can be reached by a run from (l0, 0, 0, . . . , 0), we have ci ≤ b for all i. We say that a counter-stack automaton is safe if it is b-safe for some b ∈ N.

The reachability problem for safe counter-stack automata is a promise problem. The input to the problem is a triple (S, b, t), where S is a counter-stack automaton, b ∈ N is a bound, and t is a target location in S. If S is b-safe, then the algorithm must decide whether there is a run from (l0, 0, 0, . . . , 0) to (t, 0, 0, . . . , 0). If S is not b-safe, then the algorithm can output anything.

Simulation by a Bounded One-Counter Automaton. A safe counter-stack automaton is designed to be simulated by a bounded one-counter automaton. To do this, we follow the construction outlined at the start of this section: we split the bits of the counter c into k chunks, where each chunk represents one of the counters ci. Note that the safety assumption is crucial, because otherwise incrementing ci may overflow the allotted space, and inadvertently modify the value of ci+1.

Lemma 3. Reachability in safe counter-stack automata is polynomial-time reducible to reachability in bounded one-counter automata.

5 Outline of the Construction

Our goal is to show that reachability in safe counter-stack automata is PSPACE-hard. To do this, we will show that subset-sum games can be solved by safe counter-stack automata. In this section, we give an overview of our construction using the following two-round subset-sum game:

( ∀ {A1, B1} ∃ {E1, F1} ∀ {A2, B2} ∃ {E2, F2}, T ).

For brevity, we will refer to this instance as (ψ, T) for the rest of this section. The construction is split into two parts: the play gadget and the reset gadget.
The Play Gadget. The play gadget is shown in Figure 1. The construction uses nine counters. The locations are represented by circles and the transitions are represented by edges. The annotations on the transitions describe the increments, resets, and equality tests: the notation ci + n indicates that n is added to counter i, the notation R(ci) indicates that counter i is reset to 0, and the notation ci = n indicates that the transition may only be taken when ci = n is satisfied.
[Figure: the play gadget. The locations u1, e1, u2, e2, w1, and w2 are connected in sequence. From u1 to e1 there are two transitions, labelled c1 + 1, c9 + A1 and c2 + 1, c9 + B1; from e1 to u2, two transitions labelled c3 + 1, c9 + E1 and c4 + 1, c9 + F1; from u2 to e2, two transitions labelled c5 + 1, c9 + A2 and c6 + 1, c9 + B2; from e2 to w1, two transitions labelled c7 + 1, c9 + E2 and c8 + 1, c9 + F2. The transition from w1 to w2 tests c9 = T and performs R(c9).]

Fig. 1. The play gadget

This gadget allows the automaton to implement a play of the SSG. The loca-
tions u1 and u2 allow the automaton to choose the first and second moves of the
universal player, while the locations e1 and e2 allow the automaton to choose
the first and second moves for the existential player. As the play is constructed,
a running total is stored in c9 , which is the top counter on the stack. The final
transition between w1 and w2 checks whether the existential player wins the
play, and then resets c9 . Thus, the set of runs between u1 and w2 corresponds
precisely to the set of plays won by the existential player in the SSG.
In addition to this, each outgoing transition from ui or ei comes equipped with
its own counter. This counter is incremented if and only if the corresponding
edge is used during the play, and this allows us to check precisely which play
was chosen. These counters will be used by the reset gadget. The idea behind our
construction is to force the automaton to pass through the play gadget multiple
times. Each time we pass through the play gadget, we will check a different play,
and our goal is to check a set of plays that verify whether the existential player
has a winning strategy for the SSG.

Which Plays Should Be Checked? In our example, we must check four plays. The format of these plays is shown in Table 1.

Table 1. The set of plays that the automaton will check

Play u1 e1 u2 e2
1 A1 E1 or F1 A2 E2 or F2
2 A1 Unchanged B2 E2 or F2
3 B1 E1 or F1 A2 E2 or F2
4 B1 Unchanged B2 E2 or F2

The table shows four different plays, which cover every possible strategy choice
of the universal player. Clearly, if the existential player does have a winning
strategy, then that strategy should be able to win against all strategy choices of
the universal player. The plays are given in a very particular order: the first two
plays contain A1 , while the second two plays contain B1 . Moreover, we always
check A2 before moving on to B2 .
We want to force the decisions made at e1 and e2 to form a coherent strategy
for the existential player. In this game, a strategy for the existential player is
a pair s = (s1 , s2 ), where si describes the move that should be made at ei . It
is critical to note that s1 only knows whether A1 or B1 was chosen at u1 . This
Reachability in Two-Clock Timed Automata Is PSPACE-Complete 219

[Figure: the reset gadget. From w2 there is a transition to r′2. The two transitions from r′2 to r2 test c7 = 1, c8 = 0 and c7 = 0, c8 = 1, respectively, and both reset c7 and c8 via R(c7, c8). From r2, a transition to u1 tests c5 = 1, c6 = 0, and a transition to r′1 tests c5 = 1, c6 = 1 and resets c5 and c6. The two transitions from r′1 to r1 test c3 = 2, c4 = 0 and c3 = 0, c4 = 2, respectively, and both reset c3 and c4 via R(c3, c4). From r1, a transition to u1 tests c1 = 2, c2 = 0, and a transition to t tests c1 = 2, c2 = 2 and resets c1 and c2.]

Fig. 2. The reset gadget

restriction is shown in the table: the automaton may choose freely between E1
and F1 in the first play. However, in the second play, the automaton must make
the same choice as it did in the first play. The same relationship holds between
the third and fourth plays. These restrictions ensure that the plays shown in
Table 1 are a description of a strategy for the existential player.

The Reset Gadget. The reset gadget, shown in Figure 2, enforces the con-
straints shown in Table 1. The locations w2 and u1 represent the same locations
as they did in Figure 1. To simplify the diagram, we have only included non-
trivial equality tests. Whenever we omit a required equality test, it should be
assumed that the counter is 0. For example, the outgoing transitions from r2
implicitly include the requirement that c7 , c8 , and c9 are all 0.
We consider the following reachability problem: can (t, 0, 0, . . . , 0) be reached
from (u1 , 0, 0, . . . , 0)? The structure of the reset gadget places restrictions on the
runs that reach t. All such runs pass through the reset gadget exactly four times,
and the following table describes each pass:
Pass Path
1 w2 → r′2 → r2 → u1
2 w2 → r′2 → r2 → r′1 → r1 → u1
3 w2 → r′2 → r2 → u1
4 w2 → r′2 → r2 → r′1 → r1 → t
To see why these paths must be taken, observe that, for every i ∈ {1, 3, 5, 7},
each pass through the play gadget increments either ci or ci+1 , but not both.
So, the first time that we arrive at r2 , we must take the transition directly to u1 ,
because the guard on the transition to r′1 cannot possibly be satisfied after a
single pass through the play gadget. When we arrive at r2 on the second pass,
we are forced to take the transition to r′1 , because we cannot have c5 = 1 and
c6 = 0 after two passes through the play gadget. This transition resets both c5
and c6 , so the pattern can repeat again on the third and fourth visits to r2 . The
location r1 behaves in the same way as r2 , but the equality tests are scaled up,
because r1 is only visited on every second pass through the reset gadget.
We can now see that all strategies of the universal player must be considered.
The transition between r2 and u1 forces the play gadget to increment c5 , and
therefore the first and third plays must include A2 . Similarly, the transition be-
tween r2 and r′1 forces the second and fourth plays to include B2 . Meanwhile, the
transition between r1 and u1 forces the first and second plays to include A1 , and
the transition between r1 and t forces the third and fourth plays to include B1 .
Thus, we select the universal player strategies exactly as Table 1 prescribes.
The transitions between r′1 and r1 check that the existential player is playing
a coherent strategy. When the automaton arrives at r′1 during the second pass, it
verifies that either E1 was included in the first and second plays, or that F1 was
included in the first and second plays. If this is not the case, then the automaton
gets stuck. The counters c3 and c4 are reset when moving to r1 , which allows
the same check to occur during the fourth pass. For the sake of completeness, we
have included the transitions between r′2 and r2 , which perform the same check
for E2 and F2 . However, since the existential player is allowed to change this
decision on every pass, the automaton can never get stuck at r′2 .
The end result is that location t can be reached if and only if the existential
player has a winning strategy for (ψ, T ). As we will show in the next section,
the construction extends to arbitrarily large SSGs, which then leads to a proof
that reachability in counter-stack automata is PSPACE-hard. Note that this
construction is safe: c9 is clearly bounded by the maximum value that can be
achieved by a play of the SSG, and the reset gadget ensures that no other counter
may exceed 4. Thus, we will have completed our proof of PSPACE-hardness for
bounded one-counter automata and two-clock timed automata.

6 Formal Definition and Proof

Sequential Strategies for SSGs. We start by formalising the ideas behind
Table 1. Recall that the table gives a strategy for the existential player in the
form of a list of plays. Moreover, the table gives a very specific ordering in which
these plays must appear. We now formalise this ordering.
We start by dividing the integers in the interval [1, 2^n] into i-blocks. The
1-blocks partition the interval into two equally sized blocks. The first 1-block
consists of the range [1, 2^(n−1)], and the second 1-block consists of the range
[2^(n−1) + 1, 2^n]. There are four 2-blocks, which partition the 1-blocks into two
equally sized sub-ranges. This pattern continues until we reach the n-blocks.
Formally, for each i ∈ {1, 2, . . . , n}, there are 2^i distinct i-blocks. The set of
i-blocks can be generated by considering the intervals [k + 1, k + 2^(n−i)] for the
first 2^i numbers k ≥ 0 that satisfy k mod 2^(n−i) = 0. An i-block is even if k is an
even multiple of 2^(n−i), and it is odd if k is an odd multiple of 2^(n−i).
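The i-block partition just described can be sketched in a few lines (our own illustration, not from the paper):

```python
# For each i there are 2**i blocks of length 2**(n-i) partitioning
# [1, 2**n]; the block starting after k = j * 2**(n-i) is even or odd
# according to the parity of j.
def i_blocks(n, i):
    size = 2 ** (n - i)
    blocks = []
    for j in range(2 ** i):
        k = j * size
        parity = "even" if j % 2 == 0 else "odd"
        blocks.append((range(k + 1, k + size + 1), parity))
    return blocks
```

For n = 2 this reproduces the pattern of Table 1: the two 1-blocks are {1, 2} (even) and {3, 4} (odd), and the four 2-blocks alternate even, odd, even, odd.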
The ordering of the plays in Table 1 can be described using blocks. There
are four 2-blocks, and A2 appears only in even 2-blocks, while B2 only appears

in odd 2-blocks. Similarly, A1 only appears in the even 1-block, while B1 only
appears in the odd 1-block. The restrictions on the existential player can also be
described using blocks: the existential player’s strategy may not change between
Ei and Fi during an i-block. We generalise this idea in the following definition.

Definition 4 (Sequential strategy). A sequential strategy for the existential
player in (ψ, T) is a list of 2^n plays S = P1 , P2 , . . . , P2^n , where for every i-block
L we have:
– If L is an even i-block, then Pj must contain Ai for all j ∈ L.
– If L is an odd i-block, then Pj must contain Bi for all j ∈ L.
– We either have Ei ∈ Pj for all j ∈ L, or we have Fi ∈ Pj for all j ∈ L.

We say that S is winning for the existential player if Σ Pj = T for every Pj ∈ S, i.e., if every play in the list sums to the target.
Since a sequential strategy is simply a list of all plays that conform to a strategy,
we have the following lemma.

Lemma 5. The existential player has a winning strategy if and only if the ex-
istential player has a sequential winning strategy.
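A direct checker for the conditions of Definition 4 can be sketched as follows (our own illustration; plays are represented as sets of labels such as "A1" or "E2"):

```python
# Verify the i-block conditions of a sequential strategy: even blocks
# carry Ai, odd blocks carry Bi, and the Ei/Fi choice is constant
# within each i-block.
def is_sequential_strategy(plays, n):
    assert len(plays) == 2 ** n
    for i in range(1, n + 1):
        size = 2 ** (n - i)
        for j in range(2 ** i):
            block = plays[j * size:(j + 1) * size]
            univ = f"A{i}" if j % 2 == 0 else f"B{i}"
            if not all(univ in p for p in block):
                return False
            if not (all(f"E{i}" in p for p in block)
                    or all(f"F{i}" in p for p in block)):
                return False
    return True
```

The four plays of Table 1 (with one concrete E/F choice filled in) pass this check, while changing the existential choice within a 1-block fails it.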

The Base Automaton. We describe the construction in two steps. Recall,
from Figures 1 and 2, that the top counter is used by the play gadget to store
the value of the play, and to test whether the play is winning. We begin by
constructing a version of the automaton that omits the top counter. That is,
if ck is the top counter, we modify the play gadget by removing all increases
to ck , and the equality test for ck between w1 and w2 . We call this the base
automaton. Later, we will add the constraints for ck back in, to construct the
full automaton.
We now give a formal definition of the base automaton. The location and
counter names are consistent with Figures 1 and 2. For each natural number n,
we define a counter-stack automaton An . The automaton has the following set
of locations:
– locations ui and ei for each i ∈ [1, n],
– locations w1 and w2 ,
– reset locations ri and r′i for each i ∈ [1, n], and
– the goal location t.
The automaton uses k = 4n + 1 counters. The top counter ck is reserved for
the full automaton, and will not be used in this construction. We will identify
counters 1 to 4n using the following shorthands. For each i ∈ [1, n] we define
ai = c4(i−1)+1 , bi = c4(i−1)+2 , ei = c4(i−1)+3 , and fi = c4(i−1)+4 . For example,
in Figure 1, we have a1 = c1 and a2 = c5 , and these are precisely the counters
associated with A1 and A2 , respectively. The same relationship holds between
b1 and B1 , between b2 and B2 , and so on.
The transitions of the automaton are defined as follows. Whenever we omit
a required equality test against a counter ci , it should be assumed that the
transition includes the test ci = 0.

– Each location ui has two transitions to ei : a transition that adds 1 to ai ,
and a transition that adds 1 to bi .
– We define un+1 to be a shorthand for w1 . Each location ei has two transitions
to ui+1 : a transition that adds 1 to ei , and a transition that adds 1 to fi .
– Location w1 has a transition to w2 , and w2 has a transition to r′n . These
transitions do not increase any counter, and do not test any equalities.
– Each location r′i has two outgoing transitions to ri . Firstly, there is a tran-
sition that tests ei = 2^(n−i) and fi = 0, and then resets ei and fi . Secondly,
there is a transition that tests ei = 0 and fi = 2^(n−i), and then resets both ei
and fi .
– We define r′0 to be shorthand for location t. Each location ri has two outgoing
transitions. Firstly, there is a transition to u1 that tests ai = 2^(n−i) and bi = 0.
Secondly, there is a transition to r′i−1 that tests ai = 2^(n−i) and bi = 2^(n−i),
and then resets both ai and bi .

Runs in the Base Automaton. We now describe the set of runs that are
possible in the base automaton. We decompose every run of the automaton into
segments, such that each segment contains a single pass through the play gadget.
More formally, we decompose a run R into segments R1 , R2 , . . . , where each segment
Ri starts at u1 , and ends at the next visit to u1 . We say that a run gets stuck
if the run does not end at (t, 0, 0, . . . , 0), and if the final state of the run has
no outgoing transitions. We say that a run R gets stuck during an i-block L
if there exists a j ∈ L such that Rj gets stuck. The following lemma gives a
characterisation of the runs in An .
Lemma 6. A run R in An does not get stuck if and only if, for every i-block
L, all of the following hold.
– If L is an even i-block, then Rj must increment ai for every j ∈ L.
– If L is an odd i-block, then Rj must increment bi for every j ∈ L.
– Either Rj increments ei for every j ∈ L, or Rj increments fi for every
j ∈ L.
We say that a run is successful if it eventually reaches (t, 0, 0, . . . , 0). By defi-
nition, a run is successful if and only if it never gets stuck. Also, the transition
from r1 to t ensures that every successful run must have exactly 2^n segments.
With these facts in mind, if we compare Lemma 6 with Definition 4, then we
can see that the set of successful runs in An corresponds exactly to the set of
sequential strategies for the existential player in the SSG.
Since we eventually want to implement An as a safe one-counter automaton,
it is important to prove that An is safe. We do this in the following lemma.
Lemma 7. Along every run of An we have that counters ai and bi never exceed
2^(n−i+1), and counters ei and fi never exceed 2^(n−i).

The Full Automaton. Let (ψ, T) be an SSG instance, where ψ is:

∀ {A1 , B1 } ∃ {E1 , F1 } . . . ∀ {An , Bn } ∃ {En , Fn }.



We will construct a counter-stack automaton Aψ from An . Recall that the top
counter ck is unused in An . We modify the transitions of An as follows. Let δ
be a transition. If δ increments ai then it also adds Ai to ck , if δ increments bi
then it also adds Bi to ck , if δ increments ei then it also adds Ei to ck , and if δ
increments fi then it also adds Fi to ck . We also modify the transition between
w1 and w2 , so that it checks whether ck = T , and resets ck .
Since we only add extra constraints to An , the set of successful runs in Aψ
is contained in the set of successful runs of An . Recall that the set of successful
runs in An encodes the set of sequential strategies for the existential player in
(ψ, T ). In Aψ , we simply check whether each play in the sequential strategy is
winning for the existential player. Thus, we have shown the following lemma.
Lemma 8. The set of successful runs in Aψ corresponds precisely to the set of
winning sequential strategies for the existential player in (ψ, T ).
We also have that Aψ is safe. Bounds for counters c1 through ck−1 are shown in
Lemma 7, and counter ck may never exceed Σ{Ai , Bi , Ei , Fi : 1 ≤ i ≤ n}. This
completes the reduction from subset-sum games to safe counter-stack automata,
and gives us our main result.
Theorem 9. Reachability in safe counter-stack automata is PSPACE-complete.
Corollary 10.
– Reachability in bounded one-counter automata is PSPACE-complete.
– Reachability in two-clock timed automata is PSPACE-complete.

Ramsey Goes Visibly Pushdown

Oliver Friedmann1 , Felix Klaedtke2 , and Martin Lange3


1 LMU Munich   2 ETH Zurich   3 University of Kassel

Abstract. Checking whether one formal language is included in another
is vital to many verification tasks. In this paper, we provide solutions for
checking the inclusion of the languages given by visibly pushdown au-
tomata over both finite and infinite words. Visibly pushdown automata
are a richer automaton model than the classical finite-state automata,
which allows one, e.g., to reason about the nesting of procedure calls in
the executions of recursive imperative programs. The highlight of our
solutions is that they do not comprise automata constructions for deter-
minization and complementation. Instead, our solutions are more direct
and generalize the so-called Ramsey-based inclusion-checking algorithms,
which apply to classical finite-state automata and proved effective there,
to visibly pushdown automata. We also experimentally evaluate our al-
gorithms thereby demonstrating the virtues of avoiding determinization
and complementation constructions.

1 Introduction
Various verification tasks can be stated more or less directly as inclusion problems
of formal languages or comprise inclusion problems as subtasks. For example, the
model-checking problem of non-terminating finite-state systems with respect to
trace properties boils down to the question whether the inclusion L(A) ⊆ L(B)
for two Büchi automata A and B holds, where A describes the traces of the sys-
tem and B the property [22]. Another application of checking language inclusion
for Büchi automata appears in size-change termination analysis [13,19]. Inclusion
problems are in general difficult. For Büchi automata, inclusion is PSPACE-complete.
From the closure properties of the class of ω-regular languages, i.e., those lan-
guages that are recognizable by Büchi automata, it is obvious that questions like
the one above for model checking non-terminating finite-state systems can be ef-
fectively reduced to an emptiness question, namely, L(A) ∩ L(C) = ∅, where C is
a Büchi automaton that accepts the complement of B. Building a Büchi automa-
ton for the intersection of the languages and checking its emptiness is fairly easy:
the automaton accepting the intersection can be quadratically bigger, the empti-
ness problem is NLOGSPACE-complete, and it admits efficient implementations,
e.g., by a nested depth-first search. However, complementing Büchi automata is

Extended abstract. Omitted details can be found in the full version [15], which is
available from the authors’ web pages.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 224–237, 2013.

© Springer-Verlag Berlin Heidelberg 2013
Ramsey Goes Visibly Pushdown 225

challenging. One intuitive reason for this is that not every Büchi automaton has
an equivalent deterministic counterpart. Switching to a richer acceptance condi-
tion like the parity condition so that determinization would be possible is currently
not an option in practice. The known determinization constructions for richer ac-
ceptance conditions are intricate, although complementation would then be easy
by dualizing the acceptance condition. A lower bound on the complementation
problem with respect to the automaton size is 2^Ω(n log n). Known constructions for
complementing Büchi automata that match this lower bound are also intricate. As
a matter of fact, all attempts so far that explicitly construct the automaton C from
B scale poorly. Often, the implementations produce automata for the complement
language that are huge, or they even fail to produce an output at all in reasonable
time and space if the input automaton has more than 20 states, see, e.g., [5, 21].
Other approaches for checking the inclusion of the languages given by Büchi
automata or solving the closely related but simpler universality problem for
Büchi automata have recently gained considerable attention [1,2,8–10,13,14,19].
In the worst case, these algorithms have exponential running times, which are
often worse than the 2^Ω(n log n) lower bound on complementing Büchi automata.
However, experimental results, in particular, the ones for the so-called Ramsey-
based algorithms show that the performance of these algorithms is superior. The
name Ramsey-based stems from the fact that their correctness is established by
relying on Ramsey’s theorem [20].1
The Ramsey-based algorithms for checking universality L(B) = Σ^ω iteratively
build a set of finite graphs starting from a finite base set and closing it off under
a composition operation. These graphs capture B’s essential behavior on finite
words. The language of B is not universal iff this set contains graphs with certain
properties that witness the existence of an infinite word that is not accepted by B.
First, there must be a graph that is idempotent with respect to the composition
operation. This corresponds to the fact that all the runs of B on the finite
words described by the graph loop. We must also require that no accepting
state occurs on these loops. Second, there must be another graph for the runs
on a finite word that reach that loop. To check the inclusion L(A) ⊆ L(B)
the graphs are annotated with additional information about runs of A on finite
words. Here, in case L(A) ⊈ L(B), the constructed set of graphs contains
graphs that witness the existence of at least one infinite word that is accepted
by A but all runs of B on that word are rejecting. The Ramsey-based approach
generalizes to parity automata [16]. The parity condition is useful in modeling
reactive systems in which certain modules are supposed to terminate and others
are not supposed to terminate. Also, certain Boolean combinations of Büchi
(non-termination) and co-Büchi (termination) conditions can easily be expressed
as a parity condition. Although parity automata can be translated into Büchi
automata, it algorithmically pays off to handle parity automata directly [16].

1 Büchi’s original complementation construction, which also relies on Ramsey’s the-
orem, shares similarities with these algorithms. However, there is significantly less
overhead when checking universality and inclusion directly and additional heuristics
and optimizations are applicable [1, 5].
226 O. Friedmann, F. Klaedtke, and M. Lange

In this paper, we extend the Ramsey-based analysis to visibly pushdown au-
tomata (VPAs) [4]. This automaton model restricts nondeterministic pushdown
automata in the way that the input symbols determine when the pushdown au-
tomaton pushes or pops symbols from its stack. In particular, the stack heights
are identical at the same positions in every run of any VPA on a given input.
It is because of this syntactic restriction that the class of visibly pushdown lan-
guages retains many closure properties like intersection and complementation.
VPAs allow one to describe program behavior in more detail than finite-state
automata. They can account for the nesting of procedures in executions of recur-
sive imperative programs. Non-regular properties like “an acquired lock must be
released within the same procedure” are expressible by VPAs. Model checking
of recursive state machines [3] and Boolean programs, which are widely used
as abstractions in software model checking, can be carried out in this refined
setting by using VPAs for representing the behavior of the programs and the
properties. Similar to the automata-theoretic approach to model checking finite-
state systems, checking the inclusion of the languages of VPAs is vital here.
This time, the respective decision problem is even EXPTIME-complete. Other
applications for checking language inclusion of VPAs when reasoning about re-
cursive imperative programs also appear in conformance checking [11] and in the
counterexample-guided-abstraction-refinement loop [17].
A generalization of the Ramsey-based approach to VPAs is not straightforward
since the graphs that capture the essential behavior of an automaton must also
account for the stack content in the runs. Moreover, to guarantee termination of
the process that generates these graphs, an automaton’s behavior of all runs must
be captured within finitely many such graphs. In fact, when considering pushdown
automata in general such a generalization is not possible since the universality prob-
lem for pushdown automata is undecidable. We circumvent this problem by only
considering graphs that differ in their stack height by at most one, and by refining
the composition of such graphs in comparison to the unrestricted way that graphs
can be composed in the Ramsey-based approach to finite automata. Then the com-
position operation only needs to account for the top stack symbols in all the runs
described by the graphs, which yields a finite set of graphs in the end.
The main contribution of this paper is the generalization of the Ramsey-based
approach for checking universality and language inclusion for VPAs over infinite
inputs, where the automata’s acceptance condition is stated as a parity condi-
tion. This approach avoids determinization and complementation constructions.
The respective problems where the VPAs operate over finite inputs are special
cases thereof. We also experimentally evaluate the performance of our algorithms
showing that the Ramsey-based inclusion checking is more efficient than methods
that are based on determinization and complementation.
The remainder of this paper is organized as follows. In Sect. 2, we recall the
framework of VPAs. In Sect. 3, we provide a Ramsey-based universality check
for VPAs. Note that universality is a special case of language inclusion. We treat
universality in detail to convey the fundamental ideas first. In Sect. 4, we extend
this to a Ramsey-based inclusion check for VPAs. In Sect. 5, we report on the
experimental evaluation of our algorithms. In Sect. 6, we draw conclusions.

[Figure: the nested word a d b a c d d c, drawn with arcs connecting matching call and return positions.]

Fig. 1. Nested word w = adbacddc with Σint = {a}, Σcall = {b, c}, and Σret = {d}. Its
pending positions are 1 and 7 with w1 = d and w7 = c. The call position 2 with w2 = b
matches with the return position 6 with w6 = d. The positions 4 and 5 also match.

2 Preliminaries
Words. The set of finite words over the alphabet Σ is Σ ∗ and the set of infinite
words over Σ is Σ ω . Let Σ + := Σ ∗ \ {ε}, where ε is the empty word. The length
of a word w is written as |w|, where |w| = ω when w is an infinite word. For a
word w, wi denotes the letter at position i < |w| in w. That is, w = w0 w1 . . . if
w is infinite and w = w0 w1 . . . wn−1 if w is finite and |w| = n. With inf(w) we
denote the set of letters of Σ that occur infinitely often in w ∈ Σ ω .
Nested words [4] are linear sequences equipped with a hierarchical structure,
which is imposed by partitioning an alphabet Σ into the pairwise disjoint sets
Σint , Σcall , and Σret . For a finite or infinite word w over Σ, we say that the
position i ∈ N with i < |w| is an internal position if wi ∈ Σint . It is a call
position if wi ∈ Σcall and it is a return position if wi ∈ Σret . When attaching
an opening bracket ⟨ to every call position and a closing bracket ⟩ to the return
positions in a word w, we group the word w into subwords. This grouping can
be nested. However, not every bracket at a position in w needs to have a match-
ing bracket. The call and return positions in a nested word without matching
brackets are called pending. To emphasize this hierarchical structure imposed by
the brackets ⟨ and ⟩, we also refer to the words in Σ∗ ∪ Σω as nested words. See
Fig. 1 for illustration.
To ease the exposition, we restrict ourselves in the following to nested words
without pending positions. Our results extend to nested words with pending
positions; see [15]. For & ∈ {∗, ω}, NW^&(Σ) denotes the set of words in Σ^& with
no pending positions. These words are also called well-matched.
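Well-matchedness is just balanced-bracket counting over the visibly pushdown alphabet; a minimal sketch (our own illustration; kind maps each letter to its partition class):

```python
# A word is well-matched iff no return occurs at nesting depth 0
# (pending return) and the final depth is 0 (no pending calls).
def well_matched(word, kind):
    depth = 0
    for a in word:
        if kind[a] == "call":
            depth += 1
        elif kind[a] == "ret":
            if depth == 0:
                return False      # pending return
            depth -= 1
    return depth == 0             # no pending calls
```

With the alphabet of Fig. 1 (a internal, b and c calls, d a return), the word of the figure is rejected because of its pending positions, while e.g. bcdd is well-matched.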

Automata. A visibly pushdown automaton [4], VPA for short, is a tuple A =
(Q, Γ, Σ, δ, qI , Ω), where Q is a finite set of states, Γ is a finite set of stack
symbols, Σ = Σint ∪Σcall ∪Σret is the input alphabet, δ consists of three transition
functions δint : Q × Σint → 2Q , δcall : Q × Σcall → 2Q×Γ , and δret : Q × Γ × Σret →
2Q , qI ∈ Q is the initial state, and Ω : Q → N is the priority function. Since
we restrict ourselves here to well-matched words, we do not need to consider a
bottom stack symbol ⊥. We write Ω(Q) to denote the set of all priorities used
in A, i.e. Ω(Q) := {Ω(q) | q ∈ Q}. The size of A is |Q| and its index is |Ω(Q)|.
A run of A on w ∈ Σ ω is a word (q0 , γ0 )(q1 , γ1 ) . . . ∈ (Q×Γ ∗ )ω with (q0 , γ0 ) =
(qI , ε) and for each i ∈ N, the following conditions hold:

1. If wi ∈ Σint then qi+1 ∈ δint (qi , wi ) and γi+1 = γi .


2. If wi ∈ Σcall then (qi+1 , B) ∈ δcall (qi , wi ) and γi+1 = Bγi , for some B ∈ Γ .
3. If wi ∈ Σret and γi = Bu with B ∈ Γ and u ∈ Γ ∗ then qi+1 ∈ δret (qi , B, wi )
and γi+1 = u.
The run is accepting if max{Ω(q) | q ∈ inf(q0 q1 . . . )} is even. Runs of A on finite
words are defined as expected. In particular, a run on a finite word is accepting
if the last state in the run has an even priority. For & ∈ {∗, ω}, we define

L^&(A) := { w ∈ NW^&(Σ) | there is an accepting run of A on w }.
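The three run conditions can be sketched as a small simulator for finite nested words (our own illustration; the dictionary shapes mirroring δint, δcall, and δret, and the example automaton in the usage note, are invented):

```python
# Track all reachable configurations (state, stack) of a VPA on a finite
# word. Transition maps: delta_int[(q, a)] -> set of q',
# delta_call[(q, a)] -> set of (q', B), delta_ret[(q, B, a)] -> set of q'.
def runs(word, kind, delta_int, delta_call, delta_ret, q0):
    configs = {(q0, ())}          # stack as a tuple, top element first
    for a in word:
        nxt = set()
        for q, stack in configs:
            if kind[a] == "int":
                nxt |= {(p, stack) for p in delta_int.get((q, a), ())}
            elif kind[a] == "call":
                nxt |= {(p, (B,) + stack)
                        for p, B in delta_call.get((q, a), ())}
            elif stack:           # return: pop the top stack symbol
                B, rest = stack[0], stack[1:]
                nxt |= {(p, rest) for p in delta_ret.get((q, B, a), ())}
        configs = nxt
    return configs
```

On a tiny automaton where a call c pushes B from state 0 and the return r pops it moving to state 1, the word cr ends in state 1 with an empty stack, as the visibly pushdown discipline guarantees for every run on a well-matched word.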

Priority and Reward Ordering. For an arbitrary set S, we always assume that †
is a distinct element not occurring in S. We write S† for S ∪ {†}. We use † to
explicitly speak about partial functions into S, i.e., † denotes undefinedness.
We define the following two orders on N† . The priority ordering is denoted ⊑
and is the standard order of type ω + 1. Thus, we have 0 ⊏ 1 ⊏ 2 ⊏ · · · ⊏ †. The
reward ordering ≼ is defined by † ≺ · · · ≺ 5 ≺ 3 ≺ 1 ≺ 0 ≺ 2 ≺ 4 ≺ · · · . Note
that † is maximal for ⊑ but minimal for ≼. For a finite nonempty set S ⊆ N† ,
max⊑ S and max≼ S denote the maxima with respect to the priority ordering ⊑
and the reward ordering ≼, respectively. Furthermore, we write c & c′ for max⊑ {c, c′}.
The reward ordering reflects the intuition of how valuable a priority of a VPA’s
state is for acceptance: even priorities are better than odd ones, and the bigger
an even one is the better, while small odd priorities are better than bigger ones
because it is easier to subsume them in a run with an even priority elsewhere.
The element † stands for the non-existence of a run.

3 Universality Checking
Throughout this section, we fix a VPA A = (Q, Γ, Σ, δ, qI , Ω). We describe an
algorithm that determines whether Lω (A) = NW ω (Σ), i.e., whether A accepts
all well-matched infinite nested words over Σ. An extension of the algorithm to
account for non-well-matched nested words and a universality check for VPAs
over finite nested words is given in [15]. Moreover, in [15], we present a comple-
mentation construction for VPAs based on determinization and compare it to
the presented algorithm.
Central to the algorithm are so-called transition profiles. They capture A’s
essential behavior on finite words.
Definition 1. There are three kinds of transition profiles, TP for short. The first one is an int-TP, which is a function of type Q × Q → Ω(Q)†. We associate with a symbol a ∈ Σint the int-TP fa. It is defined as

fa(q, q′) := Ω(q′) if q′ ∈ δint(q, a), and fa(q, q′) := † otherwise.
Ramsey Goes Visibly Pushdown 229

A call-TP is a function of type Q × Γ × Q → Ω(Q)†. With a symbol a ∈ Σcall we associate the call-TP fa. It is defined as

fa(q, B, q′) := Ω(q′) if (q′, B) ∈ δcall(q, a), and fa(q, B, q′) := † otherwise.

Finally, a ret-TP is a function of type Q × Γ × Q → Ω(Q)†. With a symbol a ∈ Σret we associate the ret-TP fa. It is defined as

fa(q, B, q′) := Ω(q′) if q′ ∈ δret(q, B, a), and fa(q, B, q′) := † otherwise.

A TP of the form fa for an a ∈ Σ is also called atomic. For τ ∈ {int, call, ret}, we define the set of atomic TPs as Tτ := {fa | a ∈ Στ}.
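As a concrete illustration, an atomic int-TP can be encoded as a finite map over state pairs. The encoding and the two-state toy automaton below are our own, with † again modeled as None:

```python
# Hypothetical dict-based encoding of an int-TP (Def. 1): the profile fa
# maps each state pair (q, q2) to Omega(q2) when q2 is an a-successor of q,
# and to None (our stand-in for the undefined value) otherwise.

def atomic_int_tp(states, delta_int, omega, a):
    return {
        (q, q2): (omega[q2] if q2 in delta_int.get((q, a), set()) else None)
        for q in states
        for q2 in states
    }

# Toy automaton (ours, not the paper's example).
states = {"p", "q"}
omega = {"p": 0, "q": 1}                  # priority function Omega
delta_int = {("p", "a"): {"p", "q"}}      # from p, reading a, go to p or q
fa = atomic_int_tp(states, delta_int, omega, "a")
```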
The above TPs describe A’s behavior when A reads a single letter. In the fol-
lowing, we define how TPs can be composed to describe A’s behavior on words
of finite length. The composition, written f ◦ g, can only be applied to TPs of
certain kinds. This ensures that the resulting TP describes the behavior on a
word w such that, after reading w, A’s stack height has changed by at most one.
Definition 2. Let f and g be TPs. There are six different kinds of compositions, depending on the kinds of f and g, which we define in the following. If f and g are both int-TPs, we define

(f ◦ g)(q, q′) := max⪯ { f(q, q′′) ⊔ g(q′′, q′) | q′′ ∈ Q }.

If f is an int-TP and g is either a call-TP or a ret-TP, we define

(f ◦ g)(q, B, q′) := max⪯ { f(q, q′′) ⊔ g(q′′, B, q′) | q′′ ∈ Q } and
(g ◦ f)(q, B, q′) := max⪯ { g(q, B, q′′) ⊔ f(q′′, q′) | q′′ ∈ Q }.

If f is a call-TP and g a ret-TP, we define

(f ◦ g)(q, q′) := max⪯ { f(q, B, q′′) ⊔ g(q′′, B, q′) | q′′ ∈ Q and B ∈ Γ }.

Intuitively, the composition of two TPs f and g is obtained by following any edge through f from some state q to an intermediate state q′′, and then following any edge through g from q′′ to some state q′. The value of this path is the maximum, with respect to the priority ordering ⊑, of the two values encountered in f and g. One then takes the maximum over all such path values with respect to the reward ordering ⪯ and obtains a weighted path from q to q′ in the composition.
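For int-TPs this two-stage maximum is easy to model executably. The sketch below is our own encoding, not the paper's: a TP is a dict over state pairs, the undefined value is None, and the two orderings are realized by helper functions:

```python
def priority_max(c, d):
    """Inner join: maximum w.r.t. the priority ordering; None (for the
    undefined value) absorbs everything."""
    return None if c is None or d is None else max(c, d)

def reward_key(c):
    """Key realizing the reward ordering: undefined, then odd descending,
    then even ascending."""
    if c is None:
        return (0, 0)
    return (1, -c) if c % 2 == 1 else (2, c)

def compose_int(f, g, states):
    """(f o g)(q, q2): reward-maximum over all intermediate states m of
    the priority-maximum of f(q, m) and g(m, q2)."""
    return {
        (q, q2): max((priority_max(f[(q, m)], g[(m, q2)]) for m in states),
                     key=reward_key)
        for q in states
        for q2 in states
    }

states = {"p", "q"}
f = {("p", "p"): 1, ("p", "q"): 0, ("q", "p"): None, ("q", "q"): 2}
g = {("p", "p"): None, ("p", "q"): 2, ("q", "p"): 3, ("q", "q"): 0}
h = compose_int(f, g, states)
```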
We associate finite words with TPs as follows. With a letter a ∈ Σ we associate
the TP fa as done in Def. 1. If the words u, v ∈ Σ + are associated with the TPs
f and g, respectively, we associate the word uv with the TP f ◦ g, provided that
f ◦ g is defined. A word cannot be associated with two distinct TPs. This follows
from the following lemma, which is easy to prove.
230 O. Friedmann, F. Klaedtke, and M. Lange

Fig. 2. VPA (left) and the TPs (right) from Example 4

 
Lemma 3. Let f, g, h, and k be TPs. If (h ◦ f) ◦ (g ◦ k) and h ◦ ((f ◦ g) ◦ k) are both defined, then (h ◦ f) ◦ (g ◦ k) = h ◦ ((f ◦ g) ◦ k).

If the word u ∈ Σ+ is associated with the TP f, we write fu for f. Note that two distinct words can be associated with the same TP, i.e., it can be the case that fu = fv for u, v ∈ Σ+ with u ≠ v. Intuitively, if this is the case then A's behavior on u is identical to A's behavior on v.
The following example illustrates TPs and their composition.

Example 4. Consider the VPA on the left in Fig. 2 with the states q0, q1, q2, and q3. The states' priorities are the same as their indices. We assume that Σint = {a}, Σcall = {b}, and Σret = {c}. The stack alphabet is Γ = {X, Y}. Fig. 2 also depicts the TPs fa, fb, fc and their compositions fa ◦ fb = fab and fb ◦ fc = fbc. The VPA's states are in-ports and out-ports of a TP. Assume that f is a call-TP. An in-port q is connected with an out-port q′ if f(q, B, q′) ≠ †, for some B ∈ Γ. Moreover, this connection of the two ports is labeled with the stack symbol B and the priority. The number at a connection between an in-port and an out-port specifies its priority. For example, the connection in the TP fa from the in-port q0 to the out-port q0 has priority 0 since fa(q0, q0) = 0. Since fa is an int-TP, connections are not labeled with stack symbols.
In a composition f ◦ g, we plug f's out-ports and g's in-ports together. The priority from an in-port of f ◦ g to an out-port of f ◦ g is the maximum, with respect to the priority ordering ⊑, of the priorities of the two connections in f and g. However, if f is a call-TP and g a ret-TP, we are only allowed to connect the ports in f ◦ g if the stack symbols of the connections in f and g match. Finally, since there can be more than one connection between ports in f ◦ g, we take the maximum with respect to the reward ordering ⪯.

We extend the composition operation ◦ to sets of TPs in the natural way, i.e., we define F ◦ G := {f ◦ g | f ∈ F and g ∈ G for which f ◦ g is defined}.

Definition 5. Define T as the least solution to the equation

T = Tint ∪ (Tcall ◦ Tret) ∪ (Tcall ◦ T ◦ Tret) ∪ (T ◦ T).


Note that the operations ◦ and ∪ are monotonic, and the underlying lattice of
the powerset of all int-TPs is finite. Thus, the least solution always exists and
can be found using fixpoint iteration in a finite number of steps.
The following lemma is helpful in proving that the elements of T can be used
to characterize (non-)universality of A.
Lemma 6. For every TP f , we have f ∈ T only if there is a well-matched
w ∈ Σ + with f = fw .
We need the following notions to characterize universality in terms of the exis-
tence of TPs with certain properties.
Definition 7. Let f be an int-TP.
(i) f is idempotent if f ◦ f = f. Note that only an int-TP can be idempotent.
(ii) For q ∈ Q, we write f(q) for the set of all q′ ∈ Q that are connected to q in this TP, i.e., f(q) := {q′ ∈ Q | f(q, q′) ≠ †}. Moreover, for Q′ ⊆ Q, we define f(Q′) := ⋃q∈Q′ f(q).
(iii) f is bad for the set Q′ ⊆ Q if f(q, q) is either † or odd, for every q ∈ f(Q′).
A good TP is a TP that is not bad. Note that any TP is bad for ∅. In the following, we consider bad TPs only in the context of idempotent TPs.
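Definition 7 translates directly into code. The dict encoding below, with the undefined value as None, is our own sketch:

```python
def image(f, Qp):
    """f(Q') from Def. 7(ii): all states connected to some state of Q'
    through the TP f."""
    return {q2 for (q, q2), c in f.items() if q in Qp and c is not None}

def is_bad(f, Qp):
    """Def. 7(iii): f is bad for Q' if f(q, q) is undefined or odd
    for every q in f(Q')."""
    return all(f.get((q, q)) is None or f[(q, q)] % 2 == 1
               for q in image(f, Qp))
```

Note that `is_bad` is vacuously true when the image is empty, matching the remark that any TP is bad for the empty set.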
Example 8. Reconsider the VPA from Example 4 and its TPs. It is easy to see that the TP g := fa ◦ fa is idempotent. Since g(q2, q2) = 2, g is good for any Q′ ⊆ {q0, q1, q2, q3} with q2 ∈ Q′. The intuition is that there is at least one run on (aa)ω that starts in q2 and loops infinitely often through q2. Moreover, on this run, 2 is the highest priority that occurs infinitely often. So, if there is a prefix v ∈ Σ+ with a run that starts in the initial state and ends in q2, then v(aa)ω is accepted by the VPA. The TP g is bad for {q1, q3}, since g(q1, q1) = † and g(q3, q3) = 3. So, if there is a prefix v ∈ Σ+ for which all runs that start in the initial state end in q1 or q3, then v(aa)ω is not accepted by the VPA. Another TP that is idempotent is g′ := fb ◦ (fb ◦ fc) ◦ fc. Here, we have that g′(q1, q1) = 2 and g′(q, q′) = †, for all q, q′ ∈ {q0, q1, q2, q3} with (q, q′) ≠ (q1, q1). Thus, g′ is bad for every Q′ ⊆ Q with q1 ∉ Q′.
The following theorem characterizes universality of the VPA A in terms of the TPs that are contained in the least solution of the equation from Def. 5.
Theorem 9. Lω(A) ≠ NWω(Σ) iff there are TPs f, g ∈ T such that g is idempotent and bad for f(qI).
Thm. 9 can be used to decide universality for VPAs with respect to the set of
well-matched infinite words. The resulting algorithm, which we name UNIV, is
depicted in Fig. 3. It computes T by least-fixpoint iteration and checks at each
stage whether two TPs exist that witness non-universality according to Thm. 9.
The variable T stores the generated TPs and the variable N stores the newly
generated TPs in an iteration. UNIV terminates if no new TPs are generated in
an iteration. Termination is guaranteed since there are only finitely many TPs.
For returning a witness of the VPA’s non-universality, we assume that we have
a word associated with a TP at hand. UNIV’s asymptotic time complexity is as
follows, where we assume that we use hash tables to represent T and N .
1 N ← Tint ∪ (Tcall ◦ Tret)
2 T ← N
3 while N ≠ ∅ do
4   forall (fu, fv) ∈ (N × T) ∪ (T × N) do
5     if fv is idempotent and fv is bad for fu(qI) then
6       return "universality does not hold, witnessed by uvω"
7   N ← ((N ◦ T) ∪ (T ◦ N) ∪ (Tcall ◦ N ◦ Tret)) \ T
8   T ← T ∪ N
9 return "universality holds"
Fig. 3. Universality check UNIV for VPAs with respect to well-matched words
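The least-fixpoint loop of UNIV (lines 1-3 and 7-8 of Fig. 3) has a generic worklist shape that can be sketched independently of the TP machinery. The function below is our own illustration: TPs are abstracted to opaque hashable values, `compose(f, g)` returns f ◦ g or None when undefined, `wrap(f)` plays the role of Tcall ◦ {f} ◦ Tret, and the badness check of lines 4 to 6 is omitted:

```python
def closure(atoms_int, atoms_call_ret, compose, wrap):
    """Least set T closed under composition and wrapping, mirroring the
    initialization N <- Tint with Tcall o Tret and the iterated update
    N <- (N o T, T o N, Tcall o N o Tret) minus T of Fig. 3."""
    T = set(atoms_int) | set(atoms_call_ret)
    N = set(T)
    while N:                        # terminates: only finitely many values
        new = set()
        for f in N:
            for g in T:
                for h in (compose(f, g), compose(g, f)):
                    if h is not None:
                        new.add(h)
            new |= set(wrap(f))
        N = new - T                 # keep only genuinely new elements
        T |= N
    return T

# Toy instantiation (ours): "TPs" are integers, composition is capped
# addition, and wrapping contributes nothing.
T = closure({1}, {2},
            lambda f, g: f + g if f + g <= 10 else None,
            lambda f: set())
```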

Theorem 10. Assume that the given VPA A has n ≥ 1 states, index k ≥ 2, and m = max{1, |Σ|, |Γ|}, where Σ is the VPA's input alphabet and Γ its stack alphabet. The running time of the algorithm UNIV is in m^3 · 2^{O(n^2 · log k)}.
There are various ways to tune UNIV. For instance, we can store the TPs in a single hash table and store pointers to the newly generated TPs. Furthermore, we can store pointers to idempotent TPs. Another optimization concerns the badness check in lines 4 to 6. Observe that it is sufficient to know the sets fu(qI), for fu ∈ T, i.e., the sets Q′ ⊆ Q for which all runs for some well-matched word end in a state in Q′. We can maintain a set R to store this information. We initialize R with the singleton set {(ε, {qI})}. We update it after line 8 in each iteration by assigning to it the set R ∪ {(uv, fv(Q′)) | (u, Q′) ∈ R and fv ∈ T}. After this update, we can optimize R by removing an element (u, Q′) from it if there is another element (u′, Q′′) in R with Q′′ ⊆ Q′. These optimizations do not improve UNIV's worst-case complexity, but they are of great practical value.
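The reachability set R can be maintained as sketched below. The encoding is our own (words as strings, TPs as dicts with None for the undefined value), and `update_R` mirrors one round of the assignment described above:

```python
def tp_image(f, Qp):
    """States reachable from Q' through the TP f."""
    return frozenset(q2 for (q, q2), c in f.items()
                     if q in Qp and c is not None)

def update_R(R, T):
    """One update round: R joined with {(uv, f_v(Q')) for (u, Q') in R
    and f_v in T}. Here T maps words v to their TPs f_v."""
    return R | {(u + v, tp_image(f, Qp))
                for (u, Qp) in R for v, f in T.items()}

R = {("", frozenset({"p"}))}                  # initially (epsilon, {qI})
T = {"a": {("p", "q"): 0, ("q", "p"): 1}}     # toy data, ours
R1 = update_R(R, T)
```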

4 Inclusion Checking

In this section, we describe how to check language inclusion for VPAs. For the
sake of simplicity, we assume a single VPA and check for inclusion of the lan-
guages that are defined by two states qI1 and qI2 . It should be clear that it is
always possible to reduce the case for two VPAs to this one by forming the dis-
joint union of the two VPAs. Thus, for i ∈ {1, 2}, let Ai = (Q, Γ, Σ, δ, qIi , Ω) be
the respective VPA. We describe how to check whether Lω (A1 ) ⊆ Lω (A2 ) holds.
Transition profiles for inclusion checking extend those for universality checking.
A tagged transition profile (TTP) of the int-type is an element of

(Q × Ω(Q) × Q) × (Q × Q → Ω(Q)†).

We write it as f^(p,c,p′) instead of (p, c, p′, f) in order to emphasize the fact that the TP f is extended with a tuple of states and priorities. A call-TTP is of type

(Q × Γ × Ω(Q) × Q) × (Q × Γ × Q → Ω(Q)†)
and a ret-TTP is of type

(Q × Ω(Q) × Γ × Q) × (Q × Γ × Q → Ω(Q)†).

Accordingly, they are written f^(p,B,c,p′) and f^(p,c,B,p′), respectively.
The intuition of an int-TTP f^(p,c,p′) is as follows. The TP f describes the essential information of all runs of the VPA A2 on a well-matched word u ∈ Σ+. The attached information (p, c, p′) describes the existence of some run of the VPA A1 on u. This run starts in state p, ends in state p′, and the maximal priority occurring on it is c. The intuition behind a call-TTP or a ret-TTP is similar. The symbol B in the annotation is the topmost stack symbol that is pushed or popped in the annotated run of A1 for the pending position in the word u.
For a ∈ Σ, we now associate a set Fa of TTPs of the appropriate type. Recall that fa stands for the TP associated with the letter a as defined in Def. 1.
– If a ∈ Σint, let Fa := { fa^(p,Ω(p′),p′) | p, p′ ∈ Q and p′ ∈ δint(p, a) }.
– If a ∈ Σcall, let Fa := { fa^(p,B,Ω(p′),p′) | p, p′ ∈ Q, B ∈ Γ, and (p′, B) ∈ δcall(p, a) }.
– If a ∈ Σret, let Fa := { fa^(p,Ω(p′),B,p′) | p, p′ ∈ Q, B ∈ Γ, and p′ ∈ δret(p, B, a) }.
As with TPs, the composition of TTPs is only allowed in certain cases. They are the same as for TPs, e.g., the composition of a call-TTP with an int-TTP results in a call-TTP, and with a ret-TTP it results in an int-TTP. However, the composition of TTPs is not a monoid operation but behaves like the composition of morphisms in a category in which the states in Q, respectively pairs of states and stack symbols in Γ, act as objects. A TTP f^(p,c,p′), for instance, can be seen as a morphism from p to p′, and it can therefore only be composed with a morphism from p′ to anything else.
The composition of two TTPs extends the composition of the underlying TPs by explaining how the tag of the resulting TTP is obtained. For int-TTPs f^(p,c,p′) and g^(p′,c′,p′′), we define

f^(p,c,p′) ◦ g^(p′,c′,p′′) := (f ◦ g)^(p, c⊔c′, p′′).

Composing an int-TTP f^(p,c,p′) and a call-TTP g^(q,B,c′,q′) yields call-TTPs:

f^(p,c,p′) ◦ g^(q,B,c′,q′) := (f ◦ g)^(p, B, c⊔c′, q′)  if p′ = q,
g^(q,B,c′,q′) ◦ f^(p,c,p′) := (g ◦ f)^(q, B, c′⊔c, p′)  if q′ = p.

The two possible compositions of an int-TTP with a ret-TTP are defined in exactly the same way. Finally, the composition of a call-TTP f^(p,B,c,p′) and a ret-TTP g^(p′,c′,B,p′′) is defined as

f^(p,B,c,p′) ◦ g^(p′,c′,B,p′′) := (f ◦ g)^(p, c⊔c′, p′′).

Note that the stack symbol B is the same in both annotations. As for sets of TPs, we extend the composition of TTPs to sets.
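How the tags combine can be illustrated in isolation from the underlying TPs. The tuple encoding below is our own sketch: an int-tag is a triple (p, c, p′), composition requires the morphism-style side condition that the endpoints match, and priorities join via the priority-maximum:

```python
def priority_max(c, d):
    """Join of two priorities; None models the undefined value."""
    return None if c is None or d is None else max(c, d)

def compose_int_tags(t1, t2):
    """Compose tags (p, c, p1) and (q, c1, q1): defined only when p1 = q;
    the maximal priorities of the two run segments join via the
    priority-maximum, and the endpoints are chained."""
    (p, c, p1), (q, c1, q1) = t1, t2
    if p1 != q:
        return None
    return (p, priority_max(c, c1), q1)
```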
Similar to Def. 5, we define a set T̂ to be the least solution to the equation

T̂ = T̂int ∪ T̂call ◦ T̂ret ∪ T̂call ◦ T̂ ◦ T̂ret ∪ T̂ ◦ T̂ ,


where T̂τ := ⋃{Fa | a ∈ Στ}, for τ ∈ {int, call, ret}. This allows us to characterize language inclusion between two VPAs in terms of the existence of certain TTPs.

Theorem 11. Lω(A1) ⊈ Lω(A2) iff there are TTPs f^(qI1,c,p) and g^(p,d,p) in T̂ fulfilling the following properties:
(1) The priority d is even.
(2) The TP g is idempotent and bad for f(qI2).
Thm. 11 yields an algorithm INCL to check Lω (A1 ) ⊆ Lω (A2 ), for given VPAs
A1 and A2 . It is along the same lines as the algorithm UNIV and we omit it. The
essential difference lies in the sets T̂int , T̂call , and T̂ret , which contain TTPs instead
of TPs, and the refined way in which they are being composed. Each iteration
now searches for two TTPs that witness the existence of some word of the form
uv ω that is accepted by A1 but not accepted by A2 . Similar optimizations that
we sketch for UNIV at the end of Sect. 3 also apply to INCL.
For the complexity analysis of the algorithm INCL below, we do not assume
that the VPAs A1 and A2 necessarily share the state set, the priority function,
the stack alphabet, and the transition functions as assumed at the beginning of
this subsection. Only the input alphabet Σ is the same for A1 and A2 .
Theorem 12. Assume that for i ∈ {1, 2}, the number of states of the VPA Ai is ni ≥ 1, ki ≥ 2 its index, and mi = max{1, |Σ|, |Γi|}, where Σ is the VPA's input alphabet and Γi its stack alphabet. The running time of the algorithm INCL is in n1^4 · k1^2 · m1 · m2^3 · 2^{O(n2^2 · log k2)}.
5 Evaluation
Our prototype tool FADecider implements the presented algorithms in the pro-
gramming language OCaml.2 To evaluate the tool’s performance we carried out
the following experiments for which we used a 64-bit Linux machine with 4 GB
of main memory and two dual-core Xeon 5110 CPUs, each with 1.6 GHz. Our
benchmark suite consists of VPAs from [11], which are extracted from real-world
recursive imperative programs. Tab. 1 describes the instances, each consisting
of two VPAs A and B, in more detail. Tab. 2 shows FADecider’s running times
for the inclusion checks L∗ (A) ⊆ L∗ (B) and Lω (A) ⊆ Lω (B). For comparison,
we used the OpenNWA library [12]. The inclusion check there is implemented
by a reduction to an emptiness check via a complementation construction. Note
that OpenNWA does not support infinite nested words at all and has no direct
support for only considering well-matched nested words. We used therefore Open-
NWA to perform the language-inclusion checks with respect to all finite nested
words.
FADecider outperforms OpenNWA on these examples. Profiling the inclu-
sion check based on the OpenNWA library yields that complementation requires
about 90% of the overall running time. FADecider spends about 90% of its time
2
The tool (version 0.4) is publicly available at www2.tcs.ifi.lmu.de/fadecider.
Table 1. Statistics on the input instances. The first row lists the number of states of the
VPAs from an input instance and their alphabet sizes. The number of stack symbols of
a VPA and its index are not listed, since in these examples the VPA’s stack symbol set
equals its state set and states are either accepting or non-accepting. The second row lists
whether the inclusions L∗ (A) ⊆ L∗(B) and Lω (A) ⊆ Lω (B) of the respective VPAs hold.

                                  ex       ex-§2.5   gzip      gzip-fix   png2ico
size A / size B / alphabet size   9/5/4    10/5/5    51/71/4   51/73/4    22/26/5
language relation                 ⊆ / ⊆    ⊆ / ⊆     ⊆ / ?     ⊆ / ⊆      ⊆ / ⊆

Table 2. Experimental results for the language-inclusion checks. The row “FADecider”
lists the running times for the tool FADecider for checking L∗ (A) ⊆ L∗ (B) and
Lω (A) ⊆ Lω (B). The row “#TTPs” lists the number of encountered TTPs. The sym-
bol ‡ indicates that FADecider ran out of time (2 hours). The row “OpenNWA” lists
the running times for the implementation based on the OpenNWA library for checking
inclusion on finite words and the VPA’s size obtained by complementing B.

             ex              ex-§2.5         gzip       gzip-fix       png2ico
FADecider    0.00s / 0.00s   0.00s / 0.00s   36s / ‡    42s / 294s     0.10s / 0.11s
#TTPs        6 / 6           18 / 19         694 / ‡    518 / 1,117    586 / 609
OpenNWA      0.16s / 27      0.04s / 11      49s / 27   1,104s / 176   74.70s / 543

on composing TPs and about 5% on checking equality of TPs. The experiments


also show that FADecider’s performance on inclusion checks for infinite words
can be worse than for finite words. Note that checking inclusion for infinite-word
languages is more expensive than for finite-word languages, since, in addition to
reachability, one needs to account for loops.

6 Conclusion
Checking universality and language inclusion for automata by avoiding deter-
minization and complementation has recently attracted a lot of attention, see,
e.g., [1, 9, 10, 13, 16]. We have shown that Ramsey-based methods for Büchi au-
tomata generalize to the richer automaton model of VPAs with a parity accep-
tance condition. Another competitive approach based on antichains has recently
also been extended to VPAs, however, only to VPAs over finite words [6]. It
remains to be seen if optimizations for the Ramsey-based algorithms for Büchi
automata [1] extend, with similar speed-ups, to this richer setting. Another di-
rection of future work is to investigate Ramsey-based approaches for automaton
models that extend VPAs like multi-stack VPAs [18].

Acknowledgments. We are grateful to Evan Driscoll for providing us with VPAs.
References

1. Abdulla, P.A., Chen, Y.-F., Clemente, L., Holı́k, L., Hong, C.-D., Mayr, R., Vo-
jnar, T.: Advanced Ramsey-based Büchi automata inclusion testing. In: Katoen,
J.-P., König, B. (eds.) CONCUR 2011. LNCS, vol. 6901, pp. 187–202. Springer,
Heidelberg (2011)
2. Abdulla, P.A., Chen, Y.-F., Holı́k, L., Mayr, R., Vojnar, T.: When simulation meets
antichains. In: Esparza, J., Majumdar, R. (eds.) TACAS 2010. LNCS, vol. 6015,
pp. 158–174. Springer, Heidelberg (2010)
3. Alur, R., Benedikt, M., Etessami, K., Godefroid, P., Reps, T.W., Yannakakis, M.:
Analysis of recursive state machines. ACM Trans. Progr. Lang. Syst. 27(4), 786–818
(2005)
4. Alur, R., Madhusudan, P.: Adding nesting structure to words. J. ACM 56(3), 1–43
(2009)
5. Breuers, S., Löding, C., Olschewski, J.: Improved Ramsey-based Büchi comple-
mentation. In: Birkedal, L. (ed.) FOSSACS 2012. LNCS, vol. 7213, pp. 150–164.
Springer, Heidelberg (2012)
6. Bruyère, V., Ducobu, M., Gauwin, O.: Visibly pushdown automata: Universality
and inclusion via antichains. In: Dediu, A.-H., Martı́n-Vide, C., Truthe, B. (eds.)
LATA 2013. LNCS, vol. 7810, pp. 190–201. Springer, Heidelberg (2013)
7. Büchi, J.R.: On a decision method in restricted second order arithmetic. In: Proc.
of the 1960 Internat. Congr. on Logic, Method, and Philosophy of Science, pp. 1–11
(1960)
8. Dax, C., Hofmann, M., Lange, M.: A proof system for the linear time μ-calculus.
In: Arun-Kumar, S., Garg, N. (eds.) FSTTCS 2006. LNCS, vol. 4337, pp. 273–284.
Springer, Heidelberg (2006)
9. De Wulf, M., Doyen, L., Henzinger, T.A., Raskin, J.-F.: Antichains: A new algo-
rithm for checking universality of finite automata. In: Ball, T., Jones, R.B. (eds.)
CAV 2006. LNCS, vol. 4144, pp. 17–30. Springer, Heidelberg (2006)
10. Doyen, L., Raskin, J.-F.: Antichains for the automata-based approach to model-
checking. Log. Methods Comput. Sci. 5(1) (2009)
11. Driscoll, E., Burton, A., Reps, T.: Checking conformance of a producer and a
consumer. In: ESEC/FSE 2011, pp. 113–123.
12. Driscoll, E., Thakur, A., Reps, T.: OpenNWA: A nested-word automaton library.
In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 665–671.
Springer, Heidelberg (2012)
13. Fogarty, S., Vardi, M.Y.: Büchi complementation and size-change termination. In:
Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505, pp. 16–30.
Springer, Heidelberg (2009)
14. Fogarty, S., Vardi, M.Y.: Efficient Büchi universality checking. In: Esparza, J., Ma-
jumdar, R. (eds.) TACAS 2010. LNCS, vol. 6015, pp. 205–220. Springer, Heidelberg
(2010)
15. Friedmann, O., Klaedtke, F., Lange, M.: Ramsey goes visibly pushdown (2012)
(Manuscript); Available at authors’ web pages
16. Friedmann, O., Lange, M.: Ramsey-based analysis of parity automata. In: Flana-
gan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 64–78. Springer,
Heidelberg (2012)
17. Heizmann, M., Hoenicke, J., Podelski, A.: Nested interpolants. In: POPL 2010, pp.
471–482 (2010)
18. La Torre, S., Madhusudan, P., Parlato, G.: A robust class of context-sensitive
languages. In: LICS 2007, pp. 161–170 (2007)
19. Lee, C.S., Jones, N.D., Ben-Amram, A.M.: The size-change principle for program
termination. In: POPL 2001, pp. 81–92 (2001)
20. Ramsey, F.P.: On a problem of formal logic. Proc. London Math. Soc. 30, 264–286
(1928)
21. Tsai, M.-H., Fogarty, S., Vardi, M.Y., Tsay, Y.-K.: State of Büchi complementation.
In: Domaratzki, M., Salomaa, K. (eds.) CIAA 2010. LNCS, vol. 6482, pp. 261–271.
Springer, Heidelberg (2011)
22. Vardi, M.Y., Wolper, P.: An automata-theoretic approach to automatic program
verification (preliminary report). In: LICS 1986, pp. 332–344 (1986)
Checking Equality and Regularity
for Normed BPA with Silent Moves

Yuxi Fu

BASICS, Department of Computer Science, Shanghai Jiao Tong University


MOE-MS Key Laboratory for Intelligent Computing and Intelligent Systems

Abstract. The decidability of weak bisimilarity on normed BPA is a


long standing open problem. It is proved in this paper that branching
bisimilarity, a standard refinement of weak bisimilarity, is decidable for
normed BPA and that the associated regularity problem is also decidable.

1 Introduction

In [BBK87] Baeten, Bergstra and Klop proved a surprising result that strong
bisimilarity between context free grammars without empty production is decid-
able. The decidability is in sharp contrast to the well known fact that language
equivalence between these grammars is undecidable. After [BBK87] decidability
and complexity issues of equivalence checking of infinite systems à la process
algebra have been intensively investigated. As regards BPA, Hüttel and Stir-
ling [HS91] improved Baeten, Bergstra and Klop’s proof by a more straight-
forward one using tableau system. Hüttel [Hüt92] then repeated the tableau
construction for branching bisimilarity on totally normed BPA processes. Later
Hirshfeld [Hir96] applied the tableau method to the weak bisimilarity on the
totally normed BPA. An affirmative answer to the decidability of the strong
bisimilarity on general BPA is given by Christensen, Hüttel and Stirling by ap-
plying the technique of bisimulation base [CHS92].
The complexity aspect of BPA has also been investigated over the years. Bal-
cazar, Gabarro and Santha [BGS92] pointed out that strong bisimilarity is P-
hard. Huynh and Tian [HT94] showed that the problem is in Σ2p , the second level
of the polynomial hierarchy. Hirshfeld, Jerrum and Moller [HJM96] completed
the picture by offering a remarkable polynomial algorithm for the strong bisimi-
larity of normed BPA. For the general BPA, Burkart, Caucal and Steffen [BCS95]
showed that the strong bisimilarity problem is elementary. They claimed that
their algorithm can be optimized to get a 2-EXPTIME upper bound. A further
elaboration of the 2-EXPTIME upper bound is given in [Jan12] with the intro-
duction of infinite regular words. The current known best lower bound of the
problem, EXPTIME, is obtained by Kiefer [Kie13], improving both the PSPACE
lower bound result and its proof of Srba [Srb02]. Much less is known about the
weak bisimilarity on BPA. Střı́brná’s PSPACE lower bound [Stř98] is subsumed

The full paper can be found at http://basics.sjtu.edu.cn/~yuxi/.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 238–249, 2013.

© Springer-Verlag Berlin Heidelberg 2013
by both the result of Srba [Srb02] and that of Mayr [May03], all of which are
subsumed by Kiefer’s recent result. A slight modification of Mayr’s proof shows
that the EXPTIME lower bound holds for the branching bisimilarity as well.
It is generally believed that weak bisimilarity, as well as branching bisimilarity,
on BPA is decidable. There has been however a lack of technique to resolve
the difficulties caused by silent transitions. This paper aims to advance our
understanding of the decidability problems of BPA in the presence of silent
transitions. The main contributions of the paper are as follows:
– We introduce branching norm, which is the least number of nontrivial actions
a process has to do to become an empty process. With the help of this concept
one can carry out a much finer analysis on silent actions than one would have
using weak norm. Branching norm turns out to be crucial in our approach.
– We reveal that in normed BPA the length of a state preserving silent tran-
sition sequence can be effectively bounded. As a consequence we show that
branching bisimilarity on normed BPA processes can be approximated by a
sequence of finite branching bisimulations.
– We establish the decidability of branching bisimilarity on normed BPA by
constructing a sound and complete tableau system for the equivalence.
– We demonstrate how to derive the decidability of the associated regularity
problem from the decidability of the branching bisimilarity of normed BPA.
The result of this paper is significantly stronger than previous decidability results
on the branching bisimilarity of totally normed BPA [Hüt92, CHT95]. It is easy to derive an effective size bound for totally normed BPA since a totally normed BPA process with k variable occurrences has a norm of at least k. For the same reason the right cancellation property holds. Hence the decidability. The totality
condition makes the branching bisimilarity a lot more like strong bisimilarity.

2 Branching Bisimilarity for BPA


A basic process algebra (BPA for short) Γ is a triple (V, A, Δ) where V = {X1, . . . , Xn} is a finite set of variables, A = {a1, . . . , am} ∪ {τ} is a finite set of actions ranged over by ℓ, and Δ is a finite set of transition rules. The special symbol τ denotes a silent action. A BPA process defined in Γ is an element of the set V∗ of finite strings of elements of V. The set V will be ranged over by capital letters and V∗ by lower case Greek letters. The empty string is denoted by ε. We will use = for the grammar equality on V∗. A transition rule is of the form X −ℓ→ α, where ℓ ranges over A. The transitional semantics is closed under composition in the sense that Xγ −ℓ→ αγ for all γ whenever X −ℓ→ α. We shall assume that every variable of a BPA is defined by at least one transition rule and that every action in A appears in some transition rule. Accordingly we sometimes refer to a BPA by its set of transition rules. We write −→ for −τ→, and =⇒ for the reflexive and transitive closure of −→. The set A∗ will be ranged over by ℓ∗. If ℓ∗ = ℓ1 . . . ℓk for some k ≥ 0, then α −ℓ∗→ α′ stands for α −ℓ1→ α1 · · · −ℓk−1→ αk−1 −ℓk→ α′ for some α1, . . . , αk−1. We say that α′ is a descendant of α if α −ℓ∗→ α′ for some ℓ∗.
A BPA process α is normed if there are some actions ℓ1, . . . , ℓj such that α −ℓ1→ · · · −ℓj→ ε. A process is unnormed if it is not normed. The norm of a BPA process α, denoted by ∥α∥, is the least k such that α −ℓ1→ · · · −ℓk→ ε for some ℓ1, . . . , ℓk. A normed BPA, or nBPA, is one in which every variable is normed. For each given BPA Δ, we introduce the following notations:
– mΔ is the number of transition rules; and nΔ is the number of variables.
– rΔ is max{ |γ| : X −λ→ γ ∈ Δ }, where |γ| denotes the length of γ.
– ∥Δ∥ is max{ ∥Xi∥ : 1 ≤ i ≤ nΔ and Xi is normed }.
Each of mΔ, nΔ, rΔ and ∥Δ∥ can be effectively calculated from Δ.
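Norms of variables can be computed by a standard fixpoint iteration, since the norm of a right-hand side is one plus the sum of the norms of its variables. The sketch below is our own, with unnormed variables receiving norm infinity and variables restricted to single characters for simplicity:

```python
import math

def norms(rules):
    """rules maps each variable (a single character here) to the list of
    right-hand sides (strings over the variables) of its transition rules;
    action labels are irrelevant for the norm. We iterate
    norm(X) = 1 + min over rules X -> g of sum of norms of g
    until nothing decreases any more."""
    n = {X: math.inf for X in rules}
    changed = True
    while changed:
        changed = False
        for X, rhss in rules.items():
            best = min(1 + sum(n[Y] for Y in g) for g in rhss)
            if best < n[X]:
                n[X] = best
                changed = True
    return n

# Toy grammar (ours): X -> YY, Y -> empty, Z -> Z (so Z is unnormed).
result = norms({"X": ["YY"], "Y": [""], "Z": ["Z"]})
```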

2.1 Branching Bisimilarity


The idea of the branching bisimilarity of van Glabbeek and Weijland [vGW89] is that not all silent actions can be ignored. What can be ignored are those that do not change system states irreversibly. For BPA we need to impose an additional condition to guarantee congruence. In what follows xRy stands for (x, y) ∈ R.

Definition 1. A symmetric relation R on BPA processes is a branching bisimulation if the following statements are valid whenever αRβ:
1. If α −ℓ→ α′ then one of the following statements is valid:
   (i) ℓ = τ and α′Rβ.
   (ii) β =⇒ β′′Rα for some β′′ such that β′′ −ℓ→ β′Rα′ for some β′.
2. If α = ε then β =⇒ ε.
The branching bisimilarity ≃ is the largest branching bisimulation.

The branching bisimilarity ≃ satisfies the standard property of observational equivalence stated in the next lemma [vGW89].

Lemma 1. Suppose α0 −τ→ α1 −τ→ . . . −τ→ αk ≃ α0. Then α0 ≃ α1 ≃ . . . ≃ αk.

Using Lemma 1 it is easy to show that ≃ is a congruence and that whenever β ≃ α −ℓ→ α′ is simulated by β −τ→ β1 −τ→ β2 . . . −τ→ βk −ℓ→ β′ such that βk ≃ α and β′ ≃ α′, then β ≃ β1 ≃ . . . ≃ βk.
Having defined an equality for BPA, we can formally draw a line between the silent actions that change the capacity of systems and those that do not. We say that a silent action α −τ→ α′ is state preserving if α ≃ α′; it is a change-of-state if α ≄ α′. We will write α → α′ if α −τ→ α′ is state preserving and α −ι→ α′ if it is a change-of-state. The reflexive and transitive closure of → is denoted by →∗. Since both external actions and change-of-state silent actions must be explicitly bisimulated, we let j range over the set (A \ {τ}) ∪ {ι}. So α −j→ α′ means either α −a→ α′ for some a ≠ τ or α −ι→ α′.
Let's see an example.

Example 1. The BPA Γ1 is defined by the following transition rules:

A −a→ A, A −τ→ ε, B −b→ B, B −τ→ ε, C −a→ C, C −b→ C, C −τ→ ε.

Clearly AC ≃ BC, although A ≄ B. In this example all variables are normed.
2.2 Bisimulation Base


An axiom system B is a finite set of equalities on nBPA processes. An element
α = β of B is called an axiom. Write B α = β if the equality α = β can be
derived from the axioms of B by repetitive use of any of the three equivalence
rules and two congruence rules. For our purpose the most useful axiom systems
are those that generate branching bisimulations. These are bisimulation bases
originally due to Caucal. The following definition is Hüttel’s adaptation to the
branching scenario [Hüt92].
Definition 2. A finite axiom system B is a bisimulation base if the following bisimulation base property holds for every axiom (α0, β0) of B:

1. If β0 −τ→ β1 −τ→ . . . −τ→ βn −ℓ→ β′ then there are α1, . . . , αn, α′ such that B ⊢ β1 = α1, . . . , B ⊢ βn = αn, B ⊢ β′ = α′ and the following hold:
   (i) For each i with 0 ≤ i < n, either αi = αi+1, or αi −τ→ αi+1, or there are αi^1, . . . , αi^{ki} such that αi −τ→ αi^1 −τ→ . . . −τ→ αi^{ki} −τ→ αi+1 and B ⊢ βi = αi^1, . . . , B ⊢ βi = αi^{ki}.
   (ii) Either ℓ = τ and αn = α′, or αn −ℓ→ α′, or there are αn^1, . . . , αn^{kn} such that αn −τ→ αn^1 −τ→ . . . −τ→ αn^{kn} −ℓ→ α′ and B ⊢ βn = αn^1, . . . , B ⊢ βn = αn^{kn}.
2. If β0 = ε then either α0 = ε or α0 −τ→ α1 −τ→ . . . −τ→ αk −τ→ ε for some α1, . . . , αk with k ≥ 0 such that B ⊢ α1 = ε, . . . , B ⊢ αk = ε.
3. The conditions symmetric to 1 and 2.
The next lemma justifies the above definition [Hüt92].

Lemma 2. If B is a bisimulation base then B⊢ = {(α, β) | B ⊢ α = β} ⊆ ≃.

Proof. If B ⊢ α = β, then an inductive argument shows that there exist γ1δ1λ1, γ2δ2λ2, . . . , γkδkλk and δ′1, . . . , δ′k for some k ≥ 1 such that α = γ1δ1λ1, γkδ′kλk = β and γiδ′iλi = γi+1δi+1λi+1 for each i < k, every step from γiδiλi to γiδ′iλi being an instance of an axiom of B applied under the congruence rules. The transitive closure makes it easy to see that B⊢ satisfies the bisimulation base property. Consequently it is a branching bisimulation. ⊓⊔
3 Approximation of Branching Bisimilarity

To look at the algebraic properties of the branching bisimilarity ≃ more closely, we introduce a notion of normedness appropriate for the equivalence.

Definition 3. The branching norm of an nBPA process α is the least number k such that α →∗ −j1→ α1 →∗ −j2→ . . . αk−1 →∗ −jk→ αk →∗ ε for some j1, . . . , jk and α1, . . . , αk. The branching norm of α is denoted by ‖α‖b.

For example the branching norm of B defined by {B −a→ B, B −τ→ ε} is 1. It is easy to prove that if α ≃ β then ‖α‖b = ‖β‖b and that if ‖α‖b = 0 then α ≃ ε. It follows that ‖α′‖b = ‖α‖b whenever α →∗ α′. Also notice that ‖α‖b ≤ ‖α‖.
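On a finite transition system the branching norm is a 0/1-weighted shortest-path computation: state preserving silent moves cost 0, external actions and change-of-state silent moves cost 1. The sketch below is an illustration, not the paper's algorithm; it hardcodes the bisimilarity classes of the Example 1 fragment (which in general must be computed first) and runs Dijkstra's algorithm towards the empty process, named "eps" here.

```python
from heapq import heappush, heappop

# finite fragment of Example 1; "eps" is the empty process
trans = {
    "A":   [("a", "A"), ("tau", "eps")],
    "B":   [("b", "B"), ("tau", "eps")],
    "C":   [("a", "C"), ("b", "C"), ("tau", "eps")],
    "AC":  [("a", "AC"), ("tau", "C")],
    "BC":  [("b", "BC"), ("tau", "C")],
    "eps": [],
}

# assumed branching-bisimilarity classes of these states (computed separately);
# a tau-move is state preserving iff source and target lie in the same class
cls = {"A": 0, "B": 1, "C": 2, "AC": 2, "BC": 2, "eps": 3}

def branching_norm(s):
    # 0/1-weighted shortest path to eps: state-preserving tau costs 0,
    # external actions and change-of-state tau moves cost 1
    dist, heap = {s: 0}, [(0, s)]
    while heap:
        d, t = heappop(heap)
        if t == "eps":
            return d
        if d > dist[t]:
            continue
        for a, u in trans[t]:
            w = 0 if a == "tau" and cls[u] == cls[t] else 1
            if dist.get(u, float("inf")) > d + w:
                dist[u] = d + w
                heappush(heap, (d + w, u))
    return None  # unnormed: eps is unreachable
```

For instance the norm of B comes out as 1, matching the text, and AC also has norm 1 because its silent move to C is state preserving.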
An important property of branching norm is stated next.
242 Y. Fu
Lemma 3. Suppose α is normed. Then α ≃ δα if and only if ‖α‖b = ‖δα‖b.

Proof. If ‖α‖b = ‖δα‖b then every silent action sequence from δα to α must contain only state preserving silent transitions according to Lemma 1. Moreover there must exist such a silent action path, for otherwise ‖α‖b < ‖δα‖b. ⊓⊔
It does not follow from α ≃ δα that δ ≃ ε. A counterexample is given by the BPA defined in Example 1. One has AC ≃ C ≃ BC. But clearly ε ≄ A ≄ B ≄ ε. To deal with situations like this we need the notion of relative norm.

Definition 4. The relative norm ‖α‖b^σ of α with respect to σ is the least k such that ασ →∗ −j1→ α1σ . . . αk−1σ →∗ −jk→ αkσ →∗ σ for some j1, . . . , jk, α1, . . . , αk.

Obviously 0 ≤ ‖α‖b^σ ≤ ‖α‖b. Returning to the BPA Γ1 defined in Example 1, we see that ‖A‖b^B = 1 and ‖A‖b^C = 0. Using the notion of relative norm we may introduce the following terminology:
– A transition Xσ −ℓ→ ησ is norm consistent if either ‖ησ‖b = ‖Xσ‖b and ℓ = τ, or ‖ησ‖b = ‖Xσ‖b − 1 and ℓ ≠ τ ∨ ℓ = ι.
– If Xσ −ℓ→ ησ is norm consistent with ‖Xσ‖b > 0, then it is norm splitting if at least two variables in η have (smaller) nonzero relative norms in ησ.

For an nBPA Δ no silent transition sequence contains more than ‖Δ‖b norm splitting transitions, where ‖Δ‖b is max{‖Xi‖b | 1 ≤ i ≤ nΔ and Xi is normed}.
The crucial property of relative norm is described in the following lemma.

Lemma 4. Let α, β, δ, γ be normed with ‖α‖b^γ = ‖β‖b^δ. If αγ ≃ βδ then γ ≃ δ.

Proof. Suppose ‖α‖b^γ = ‖β‖b^δ. Now ‖α‖b^γ + ‖γ‖b = ‖αγ‖b = ‖βδ‖b = ‖β‖b^δ + ‖δ‖b. Therefore ‖γ‖b = ‖δ‖b. A norm consistent action sequence αγ →∗ −j1→ . . . →∗ −jk→ →∗ γ must be matched up by βδ →∗ −j1→ . . . →∗ −jk→ β′δ for some β′. Clearly ‖β′δ‖b = ‖γ‖b = ‖δ‖b. It follows from Lemma 3 that δ ≃ β′δ ≃ γ. ⊓⊔
Lemma 4 describes a weak form of left cancellation property. The general left cancellation property fails. Fortunately there is a nice property of nBPA that allows us to control the size of the common suffix of a pair of bisimilar processes.
Definition 5. A process α is irredundant over γ if ‖α‖b^γ > 0. It is redundant over γ if ‖α‖b^γ = 0. A process α is head irredundant if either α = ε or α = Xα′ for some X, α′ such that α ≄ α′. It is head redundant otherwise. We write Hirred(α) to indicate that α is head irredundant. A process α is completely irredundant if every suffix of α is head irredundant. We write Cirred(α) to mean that α is completely irredundant.

If α is normed, then α is irredundant over γ if and only if αγ ≄ γ. In nBPA a redundant process consists solely of redundant variables.
Lemma 5. Suppose X1, . . . , Xk, σ are normed. Then X1 . . . Xk is redundant over σ if and only if Xi is redundant over σ for every Xi ∈ {X1, . . . , Xk}.

Proof. Suppose X1, . . . , Xk, σ are normed and X1 . . . Xk is redundant over σ. Then X1 . . . Xkσ =⇒ X2 . . . Xkσ =⇒ . . . =⇒ Xkσ =⇒ σ ≃ X1 . . . Xkσ. It follows from Lemma 1 that X1 . . . Xkσ ≃ X2 . . . Xkσ ≃ . . . ≃ Xkσ ≃ σ. We are done by using the congruence property. ⊓⊔
For each σ, let the redundant set Rσ of σ be {X | Xσ ≃ σ}. Let V(α) be the set of variables appearing in α. We have two useful corollaries.

Corollary 1. Suppose α, σ are normed. Then ασ ≃ σ if and only if V(α) ⊆ Rσ.

Corollary 2. Suppose α, β, σ0, σ1 are defined in an nBPA and Rσ0 = Rσ1. Then ασ0 ≃ βσ0 if and only if ασ1 ≃ βσ1.

Proof. Suppose Rσ0 = Rσ1. Let S be {(ασ0, βσ0) | ασ1 ≃ βσ1}. It is not difficult to see that S ∪ ≃ is a branching bisimulation. ⊓⊔
We now take a look at the state preserving transitions of nBPA processes. We are particularly interested in knowing whether the quotient set {θ | α →∗ V θ}/≃ of equivalence classes is finite for every nBPA process α and every variable V. It turns out that all such sets are finite, with an effective size bound.

Lemma 6. For each nBPA process α = Xω, there is an effective bound Hα, uniformly computable from α, satisfying the following: if α →∗ V θ then α →∗ V η for some η such that θ ≃ η and the length of α →∗ V η is no more than Hα.

Proof. The basic idea is to show that in an effectively bounded number of steps α can reach, via norm consistent and norm splitting silent transitions, terms V θ with all possible variables V and all possible relative norms of V. We then apply Lemma 4. The bound Hα is computed from |α| and the transition system. ⊓⊔
Under the assumption γ ≃ βγ we can repeat the proof of Lemma 6 for βγ in a way that γ is not affected. Hence the next corollary.

Corollary 3. Suppose α, βγ are nBPA processes and γ ≃ βγ. If βγ ≃ α −j→ α′, then there is a transition sequence βγ →∗ β″γ −j→ β′γ with its length bounded by Hβ such that β″γ ≃ α and β′γ ≃ α′.
We are now in a position to prove the following.

Proposition 1. The relation ≄ on nBPA processes is semi-decidable.

Proof. We define ≃k, the branching bisimilarity up to depth k, by exploiting Corollary 3. The inductive definition is as follows:

– α ≃0 β for all α, β.
– α ≃i+1 β if the following condition and its symmetric version hold: if α ≃i β −ℓ→ β′ then one of the following statements is valid:
  (i) ℓ = τ and α ≃i β′.
  (ii) α =⇒ α″ ≃i β for some α″ such that α″ −ℓ→ α′ ≃i β′ for some α′ and the length of α =⇒ α″ is bounded by Hα.

Each ≃k is decidable. Using Corollary 3 one easily sees that ≃ ⊆ ⋂k∈ω ≃k. The proof of the converse inclusion is standard. Hence α ≄ β if and only if the pair is told apart by some decidable ≃k, which can be searched for effectively. ⊓⊔
4 Equality Checking

A straightforward approach to proving an equality between two processes is to construct a finite bisimulation tree for the equality. A tree of this kind has been called a tableau system [HS91, Hüt92]. To apply this approach we need to make sure that the following properties are satisfied: (i) every tableau for an equality α = β is finite; (ii) the set of tableaux for an equality α = β is finite. We can achieve (i) by using Corollary 2 and Corollary 3. This is because if σ is long enough then according to Corollary 2 it can be decomposed into some σ0σ1σ2 such that Rσ1σ2 = Rσ2. Then λσ0σ1σ2 ≃ γσ0σ1σ2 can be simplified to λσ0σ2 ≃ γσ0σ2. The equivalence provides a method to control the size of the labels of a tableau. Now (ii) is a consequence of (i), Corollary 3 and König's lemma.

The building blocks for tableaux are matches. Suppose α0α ≄ α and β0β ≄ β. A match for the equality α0α = β0β over (α, β) is a finite symmetric relation {γiα = λiβ | 1 ≤ i ≤ k} containing only those equalities accounted for in the following condition: for each transition α0α −ℓ→ α′α, one of the following holds:

– ℓ = τ and the equality α′α = β0β is in {γiα = λiβ | 1 ≤ i ≤ k};
– there is a sequence β0β −τ→ β1β −τ→ . . . −τ→ βnβ −ℓ→ β′β, for n < Hβ0, such that {α0α = β1β, . . . , α0α = βnβ, α′α = β′β} ⊆ {γiα = λiβ | 1 ≤ i ≤ k}.

If α0σ ≄ σ ≄ β0σ, a match for α0σ = β0σ over (σ, σ) is said to be a match for α0σ = β0σ over σ. The computable bound Hβ0, given by Corollary 3, guarantees that the number of matches for α0α = β0β is effectively bounded.
Suppose α0, β0 are nBPA processes. A tableau for α0 = β0 is a tree with each of its nodes labeled by an equality between nBPA processes. The root is labeled by α0 = β0. We shall distinguish between global tableaux and local tableaux. The global tableau is the overall tableau whose root is labeled by the goal α0 = β0. It is constructed from the rules given in Fig. 1. The Decmp rule decomposes a goal into several subgoals. We shall find it useful to use SDecmp, which is a stronger version of Decmp. The side condition of SDecmp ensures that it is unnecessary to apply it consecutively. When applying the Decmp rule we assume that an equality γσ = σ, respectively σ = γσ, is always decomposed into the subgoals σ = σ and {V σ = σ}V∈V(γ). Accordingly γ = ε, respectively ε = γ, is decomposed into the subgoals ε = ε and {V = ε}V∈V(γ).
SubstL and SubstR allow one to create a common suffix for the two processes in an equality. ContrL and ContrR are used to remove a redundant variable inside a process. In the side conditions of these two rules, α0, β0 are the processes appearing in the root of the global tableau. ContrC deletes redundant variables from the common suffix of a node label whenever the size of the common suffix
Decmp: from γα = λβ derive the subgoals α = β, {Uα = α}U∈V(γ) and {V β = β}V∈V(λ), provided |γ| + |λ| > 0, ∀U ∈ V(γ). U =⇒ ε, and ∀V ∈ V(λ). V =⇒ ε.

SDecmp: from γα = λβ derive α = β, {Uα = α}U∈V(γ) and {V β = β}V∈V(λ), provided |γ| + |λ| > 0, Hirred(α), Hirred(β), ∀U ∈ V(γ). U =⇒ ε, and ∀V ∈ V(λ). V =⇒ ε.

Match: from γα = λβ derive α1α = β1β, . . . , αkα = βkβ, provided γα ≄ α, λβ ≄ β, and {αiα = βiβ | 1 ≤ i ≤ k} is a match for γα = λβ over (α, β).

SubstL: from γα = λβ derive γδβ = λβ, provided α = δβ is the residual.

SubstR: from γα = λβ derive γα = λδα, provided δα = β is the residual.

ContrL: from γZδ = λ derive γδ = λ and Zδ = δ, provided Hirred(δ), Z =⇒ ε and |γZδ| > max{|α0|, |β0|}Δ.

ContrR: from γ = λZδ derive γ = λδ and Zδ = δ, provided Hirred(δ), Z =⇒ ε and |λZδ| > max{|α0|, |β0|}Δ.

ContrC: from γσ′σ0σ1 = λσ′σ0σ1 derive γσ′σ1 = λσ′σ1 and {V σ1 = σ1}V∈V(σ0), provided |σ′σ0σ1| > 2nΔ, |σ0| > 0, Hirred(σ1), and ∀V ∈ V(σ0). V =⇒ ε.

Fig. 1. Rules for Global Tableaux
is over the limit. Notice that all the side conditions on the rules are semi-decidable due to the semi-decidability of ≄. So we can effectively enumerate tableaux.

In what follows a node Zη = Wκ to which the Match rule is applied with the condition Zη ≄ η ∧ Wκ ≄ κ is called an M-node. A node of the form Zσ = σ with σ being head irredundant is called a V-node. We now describe how a global tableau for α0 = β0 is constructed. Assuming α0 = γXα1 and β0 = λY β1 such that Xα1 ≄ α1 and Y β1 ≄ β1, we apply the instance of the SDecmp rule that derives from γXα1 = λY β1 the subgoals Xα1 = Y β1, {UXα1 = Xα1}U∈V(γ) and {V Y β1 = Y β1}V∈V(λ).

By definition Xα1 = Y β1 is an M-node and {UXα1 = Xα1}U∈V(γ) ∪ {V Y β1 = Y β1}V∈V(λ) is a set of V-nodes. These nodes are the roots of new subtableaux. Starting from Xα1 = Y β1 we apply the Match rule under the condition that neither α1 nor β1 is affected. The application of the Match rule is repeated to grow the subtableau rooted at Xα1 = Y β1. The construction of the tree is done in a breadth first fashion, so the tree grows level by level. At some stage we apply the Decmp rule to all the current leaves. This particular application of Decmp must meet the following conditions: (i) both α1 and β1 must be kept intact in all the current leaves; (ii) either α1 or β1 is exposed in at least one current leaf. Choose a leaf labeled by either α1 = δ1β1 for some δ1 or by δ1α1 = β1 for some δ1 and call it the residual node or R-node. Suppose the residual node is α1 = δ1β1. All the other current leaves, the non-residual nodes, must be labeled by an equality of the form γ1α1 = λ1β1. A non-residual node with label γ1α1 = λ1β1 is then attached with a single child labeled by γ1δ1β1 = λ1β1. This is an application of
Localization: from γσ′σ0σ1 = λσ′σ0σ1 derive {Xiσ1 = σ1}i∈I, γσ′σ1 = λσ′σ1 and {Xiσ0σ1 = σ0σ1}i∈I, provided |γ| > 0 and |λ| > 0; |σ′σ0σ1| > 2nΔ, 2nΔ ≥ |σ1| > 0 and |σ0| > 0; Cirred(σ′σ0σ1) and Cirred(σ′σ1); γσ′σ0σ1 ≄ σ′σ0σ1, γσ′σ1 ≄ σ′σ1; λσ′σ0σ1 ≄ σ′σ0σ1, λσ′σ1 ≄ σ′σ1; I ∩ J = ∅, I ∪ J = {1, . . . , nΔ}; ∀j ∈ J. Xjσ0σ1 ≄ σ0σ1 and Xjσ1 ≄ σ1; and Xi =⇒ ε for all i ∈ I.

Fig. 2. Rule for Local Tableaux
the SubstL rule. Now we can recursively apply the global tableau construction to γ1δ1β1 = λ1β1 to produce a new subtableau. The treatment of a V-node child, say UXα1 = Xα1, is similar. We keep applying the Match rule over α1 as long as the side condition is met. At a certain stage we apply the Decmp rule to all the leaves. The application should meet the following conditions: (i) no occurrence of α1 is affected; (ii) there is an application of Decmp that derives from γ1α1 = λ1α1 the subgoals α1 = α1, {V α1 = α1}V∈V(γ1) and {V α1 = α1}V∈V(λ1). We then recursively apply the tableau construction to create new subtableaux.

In the above construction the R-node α1 = δ1β1 can be the root of a new subtableau, which might contain another R-node. In fact a chain of R-nodes is possible. ContrL/ContrR is used to control the size of R-nodes.

After an application of the SubstL/SubstR rule we may get a C-node α′σ′σ0σ1 = β′σ′σ0σ1 if the ContrC rule is applicable. Once a C-node appears, we immediately apply the ContrC rule to reduce the size of its common suffix. Intuitively we should apply the ContrC rule sufficiently often so that the common suffix becomes completely irredundant. Eventually either the length of the common suffix has become no more than 2nΔ, in which case we continue to build up the global tableau, or the Localization rule as defined in Fig. 2 is applicable, in which case we get an L-node.The soundness of the Localization rule is guaranteed by Corollary 2.

Suppose the Localization rule is applied to an L-node α′σ′σ0σ1 = β′σ′σ0σ1, deriving {Xiσ1 = σ1}i∈I, α′σ′σ1 = β′σ′σ1 and {Xiσ0σ1 = σ0σ1}i∈I. The node α′σ′σ1 = β′σ′σ1 is a new L-node. We call {Xi | i ∈ I} the R-set of the new L-node. If the size of the common suffix of α′σ′σ1 = β′σ′σ1 is still larger than 2nΔ, we continue to apply the Localization rule. Otherwise we get an L-root, which is the root of a local tableau. Now suppose α′σ′σ1 = β′σ′σ1 is an L-root. The construction of the local tableau should stick to two principles described as follows. (I) Locality. No application of Decmp, SDecmp, SubstL, SubstR and ContrC should ever affect σ′σ1 or any suffix of σ′σ1. Notice that applications of SubstL or SubstR can never affect σ′σ1 or any suffix of σ′σ1. (II) Consistency.
Suppose γα = λβ is a node to which the Match rule is applied using a match over (α, β). Then either σ′σ1 is a suffix of both α and β, or α = β = σ″σ1 for some σ″ satisfying the following: (i) σ″ is a proper suffix of σ′; (ii) γ = UZ and λ = Z such that Zσ″ is a suffix of σ′; and (iii) the match is over σ″σ1. The locality and consistency conditions basically say that choices made in the construction of the local tableau should not contradict the fact that σ′σ1 is completely irredundant.
The construction of a path in a tableau ends with a leaf. A successful leaf is either a node labeled by ς = ς for some ς, or a node labeled by ε = V (V = ε) with V ≃ ε, or a node that has the same label as one of its ancestors. An unsuccessful leaf is produced if the node is either labeled by ε = V (V = ε) with V ≄ ε, or labeled by some ς = ς′ with distinct ς, ς′ such that no rule is applicable to ς = ς′. A local tableau has additionally two new kinds of successful/unsuccessful leaves: (i) an L-root is a successful leaf if it shares the same label with one of its ancestors that is also an L-root; (ii) suppose α′σ′σ0σ1 = β′σ′σ0σ1 is an L-node and its child α′σ′σ1 = β′σ′σ1 is an L-root. In the local tableau rooted at α′σ′σ1 = β′σ′σ1, a node of the form Zσ1 = σ1 is deemed a leaf. It is a successful leaf if Z is in the R-set of the L-root; it is an unsuccessful leaf otherwise.
Tableau constructions always terminate. In fact we have the following.

Lemma 7. The size of every tableau for an equality is effectively bounded. The number of tableaux for an equality is effectively bounded.

A tableau is successful if all of its leaves are successful. Successful tableaux generate bisimulation bases.

Proposition 2. Suppose Xα, Y β are nBPA processes. Then Xα ≃ Y β if and only if there is a successful tableau for Xα = Y β.
Proof. If Xα ≃ Y β we can easily construct a tableau using the bisimulation property, Corollary 2 and Corollary 3. Conversely suppose there is a successful tableau T for Xα = Y β. Let A = Ab ∪ Az ∪ Al. The set Ab of basic axioms is given by {γ = λ | γ = λ is a label of a node in T}. The set Az is defined by

Az = { V σ = θσ, θσ = σ | V σ = σ is in Ab, and θ lies on a chosen shortest silent path V =⇒ θ =⇒ ε from V to ε }.

Suppose γσ′σ1 = λσ′σ1 is an L-root and γσ′σ0σ1 = λσ′σ0σ1 is its parent. A node ησ′σ1 = κσ′σ1 in the local tableau rooted at γσ′σ1 = λσ′σ1 must be lifted to ησ′σ0σ1 = κσ′σ0σ1 in order to show that γσ′σ0σ1 = λσ′σ0σ1 satisfies the bisimulation base property. Since local tableaux may be nested, the node might have several lifted versions. The set Al is defined to be the collection of all such lifted pairs. We can prove by induction on the nodes of the tableau, starting with the leaves, that A is a bisimulation base. Hence Xα ≃ Y β by Lemma 2. ⊓⊔
Our main result follows from Proposition 1, Lemma 7 and Proposition 2.
Theorem 1. The branching bisimilarity on nBPA processes is decidable.
5 Regularity Checking

The regularity problem asks if a process is bisimilar to a finite state process. For the strong regularity problem of nBPA, Kučera [Kuč96] showed that it is decidable in polynomial time. Srba [Srb02] observed that it is actually NL-complete. The decidability of the strong regularity problem for general BPA was proved by Burkart, Caucal and Steffen [BCS95, BCS96]. It was shown to be PSPACE-hard by Srba [Srb02]. The decidability of almost all weak regularity problems of process rewriting systems [May00] is unknown. The only exception is Jančar and Esparza's undecidability result for the weak regularity problem of Petri nets and their extensions [JE96]. Srba [Srb03] proved that weak regularity is both NP-hard and co-NP-hard for nBPA. Using a result by Srba [Srb03], Mayr proved that the weak regularity problem of nBPA is EXPTIME-hard [May03].
The present paper improves our understanding of the issue with the following.

Theorem 2. The regularity problem of ≃ on nBPA is decidable.

Proof. One proves by a combinatorial argument that, in the transition tree of an infinite state BPA process, (i) a path V0σ0 −→∗ V1σ1 −→∗ V2σ2 . . . −→∗ Vmσm exists such that (ii) |σ0| < |σ1| < |σ2| < . . . < |σm| and (iii) ‖V0σ0‖b < ‖V1σ1‖b < ‖V2σ2‖b < . . . < ‖Vmσm‖b. We can choose m large enough such that 0 ≤ i < j ≤ m for some i, j satisfying Vi = Vj and Rσi = Rσj. Let σj = σσi for some σ. Clearly ‖σi‖b < ‖σj‖b. Using Corollary 2 one can prove by induction that σ^pσi ≄ σ^qσi whenever p ≠ q, so the process has infinitely many pairwise non-bisimilar states. It is semi-decidable to find a path (i) with properties (ii, iii). The converse implication is proved by a tree construction using Theorem 1. ⊓⊔
6 Remark

For parallel processes (BPP/PN) with silent actions, the only known decidability result on equivalence checking is due to Czerwiński, Hofman and Lasota [CHL11]. This paper provides the analogous decidability result for sequential processes (BPA/PDA) with silent actions. For further research one could try to apply the technique developed in this paper to general BPA and to normed PDA.
Acknowledgement. I am indebted to He, Huang, Long, Shen, Tao, Yang, Yin and the anonymous referees. The support from NSFC (60873034, 61033002, ANR 61261130589) and STCSM (11XD1402800) is gratefully acknowledged.
References

[BBK87] Baeten, J., Bergstra, J., Klop, J.: Decidability of bisimulation equivalence
for processes generating context-free languages. In: de Bakker, J.W., Nijman, A.J., Treleaven, P.C. (eds.) PARLE 1987. LNCS, vol. 259, pp. 94–113.
Springer, Heidelberg (1987)
[BCS95] Burkart, O., Caucal, D., Steffen, B.: An elementary bisimulation decision
procedure for arbitrary context free processes. In: Hájek, P., Wiedermann, J.
(eds.) MFCS 1995. LNCS, vol. 969, pp. 423–433. Springer, Heidelberg (1995)
Checking Equality and Regularity for Normed BPA with Silent Moves 249

[BCS96] Burkart, O., Caucal, D., Steffen, B.: Bisimulation collapse and the process
taxonomy. In: Sassone, V., Montanari, U. (eds.) CONCUR 1996. LNCS,
vol. 1119, pp. 247–262. Springer, Heidelberg (1996)
[BGS92] Balcazar, J., Gabarro, J., Santha, M.: Deciding bisimilarity is p-complete.
Formal Aspects of Computing 4, 638–648 (1992)
[CHL11] Czerwiński, W., Hofman, P., Lasota, S.: Decidability of branching bisimula-
tion on normed commutative context-free processes. In: Katoen, J.-P., König,
B. (eds.) CONCUR 2011. LNCS, vol. 6901, pp. 528–542. Springer, Heidelberg
(2011)
[CHS92] Christensen, S., Hüttel, H., Stirling, C.: Bisimulation equivalence is decidable
for all context-free processes. In: Cleaveland, W.R. (ed.) CONCUR 1992.
LNCS, vol. 630, pp. 138–147. Springer, Heidelberg (1992)
[CHT95] Caucal, D., Huynh, D., Tian, L.: Deciding branching bisimilarity of normed context-free processes is in Σ2^p. Information and Computation 118, 306–315 (1995)
[Hir96] Hirshfeld, Y.: Bisimulation trees and the decidability of weak bisimulations.
Electronic Notes in Theoretical Computer Science 5, 2–13 (1996)
[HJM96] Hirshfeld, Y., Jerrum, M., Moller, F.: A polynomial algorithm for decid-
ing bisimilarity of normed context free processes. Theoretical Computer Sci-
ence 158(1-2), 143–159 (1996)
[HS91] Hüttel, H., Stirling, C.: Actions speak louder than words: Proving bisimilarity
for context-free processes. In: LICS 1991, pp. 376–386 (1991)
[HT94] Huynh, T., Tian, L.: Deciding bisimilarity of normed context free processes is in Σ2^p. Theoretical Computer Science 123, 83–197 (1994)
[Hüt92] Hüttel, H.: Silence is golden: Branching bisimilarity is decidable for context
free processes. In: Larsen, K.G., Skou, A. (eds.) CAV 1991. LNCS, vol. 575,
pp. 2–12. Springer, Heidelberg (1992)
[Jan12] Jančar, P.: Bisimilarity on basic process algebra is in 2-exptime (2012)
[JE96] Jančar, P., Esparza, J.: Deciding finiteness of petri nets up to bisimulation.
In: Meyer auf der Heide, F., Monien, B. (eds.) ICALP 1996. LNCS, vol. 1099,
pp. 478–489. Springer, Heidelberg (1996)
[Kie13] Kiefer, S.: BPA bisimilarity is exptime-hard. Information Processing Let-
ters 113, 101–106 (2013)
[Kuč96] Kučera, A.: Regularity is decidable for normed BPA and normed BPP pro-
cesses in polynomial time. In: Král, J., Bartosek, M., Jeffery, K. (eds.) SOF-
SEM 1996. LNCS, vol. 1175, pp. 377–384. Springer, Heidelberg (1996)
[May00] Mayr, R.: Process rewrite systems. Information and Computation 156, 264–
286 (2000)
[May03] Mayr, R.: Weak bisimilarity and regularity of BPA is exptime-hard. In: EX-
PRESS 2003 (2003)
[Srb02] Srba, J.: Strong bisimilarity and regularity of basic process algebra is pspace-
hard. In: Widmayer, P., Triguero, F., Morales, R., Hennessy, M., Eidenbenz,
S., Conejo, R. (eds.) ICALP 2002. LNCS, vol. 2380, pp. 716–727. Springer,
Heidelberg (2002)
[Srb03] Srba, J.: Complexity of weak bisimilarity and regularity for BPA and BPP.
Mathematical Structures in Computer Science 13, 567–587 (2003)
[Stř98] Stříbrná, J.: Hardness results for weak bisimilarity of simple process algebras. Electronic Notes in Theoretical Computer Science 18, 179–190 (1998)
[vGW89] van Glabbeek, R., Weijland, W.: Branching time and abstraction in bisimula-
tion semantics. In: Information Processing 1989, pp. 613–618. North-Holland
(1989)
FO Model Checking of Interval Graphs

Robert Ganian1, Petr Hliněný2, Daniel Král'3, Jan Obdržálek2, Jarett Schwartz4, and Jakub Teska5
1 Vienna University of Technology, Austria, [email protected]
2 Masaryk University, Brno, Czech Republic, {hlineny,obdrzalek}@fi.muni.cz
3 University of Warwick, Coventry, United Kingdom, [email protected]
4 UC Berkeley, Berkeley, United States, [email protected]
5 University of West Bohemia, Pilsen, Czech Republic, [email protected]

Abstract. We study the computational complexity of the FO model checking problem on interval graphs, i.e., intersection graphs of intervals on the real line. The main positive result is that this problem can be solved in time O(n log n) for n-vertex interval graphs with representations containing only intervals with lengths from a prescribed finite set. We complement this result by showing that the same is not true if the lengths are restricted to any set that is dense in some open subset, e.g., in the set (1, 1 + ε).
Keywords: FO model checking, parameterized complexity, interval graph, clique-width.

1 Introduction

Results on the existence of an efficient algorithm for a class of problems have recently attracted a significant amount of attention. Such results are now referred to as algorithmic meta-theorems; see the recent survey [15]. The most prominent example is a theorem of Courcelle [1] asserting that every MSO property can be model checked in linear time on the class of graphs with bounded tree-width. Another example is a theorem of Courcelle, Makowsky and Rotics [2] asserting that the same conclusion holds for graphs with bounded clique-width when quantification is restricted to vertices and their subsets.
A full version of this contribution, which contains all proofs, can be downloaded
from http://arxiv.org/abs/1302.6043. All the authors except for Jarett Schwartz acknowledge support of the Czech Science Foundation under grant P202/11/0196. Robert Ganian also acknowledges support by the ERC grant (COMPLEX REASON 239962) held by Stefan Szeider. Jarett Schwartz acknowledges support of the Fulbright and NSF Fellowships.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 250–262, 2013.

c Springer-Verlag Berlin Heidelberg 2013
FO Model Checking 251

In this paper, we focus on more restricted graph properties, specifically those


expressible in the first order logic. Clearly, every such property can be tested
in polynomial time if we allow the degree of the polynomial to depend on the
property of interest. But can these properties be tested in so-called fixed parameter tractable (FPT [6]) time, i.e., in polynomial time where the degree of the
polynomial does not depend on the considered property? The first result in this
direction could be that of Seese [22]: Every FO property can be tested in linear
time on graphs with bounded maximum degree. A breakthrough result of Frick
and Grohe [11] asserts that every FO property can be tested in almost linear
time on classes of graphs with locally bounded tree-width. Here, an almost linear
algorithm stands for an algorithm running in time O(n1+ε ) for every ε > 0. A
generalization to graph classes locally excluding a minor (with worse running
time) was later obtained by Dawar, Grohe and Kreutzer [4].
Research in this direction so far culminated in establishing that every FO
property can be tested in almost linear time on classes of graphs with locally
bounded expansion, as shown (independently) by Dawar and Kreutzer [5] (also
see [13] for the complete proof), and by Dvořák, Král’ and Thomas [7]. The
concept of graph classes with bounded expansion has recently been introduced
by Nešetřil and Ossona de Mendez [18,19,20]; examples of such graph classes
include classes of graphs with bounded maximum degree or proper minor-closed
classes of graphs. A holy grail of this area is establishing the fixed parameter
tractability of testing FO properties on nowhere-dense classes of graphs.
In this work, we investigate whether structural properties which do not yield
(locally) bounded width parameters could lead to similar results. Specifically, we
study the intersection graphs of intervals on the real line, which are also called
interval graphs. When we restrict to unit interval graphs, i.e., intersection graphs
of intervals with unit lengths, one can easily deduce the existence of a linear time
algorithm for testing FO properties from Gaifman’s theorem, using the result of
Courcelle et al. [2] and that of Lozin [17] asserting that every proper hereditary
subclass of unit interval graphs, in particular, the class of unit interval graphs
with bounded radius, has bounded clique-width. This observation is a starting
point for our research presented in this paper.
Let us now give a definition. For a set L of reals, an interval graph is called
an L-interval graph if it is an intersection graph of intervals with lengths from
L. For example, unit interval graphs are {1}-interval graphs. If L is a finite set
of rationals, then any L-interval graph with bounded radius has bounded clique-
width (see Section 4 for further details). So, FO properties of such graphs can be
tested in the fixed parameter way. However, if L is not a set of rationals, there
exist L-interval graphs with bounded radius and unbounded clique-width, and
so the easy argument above does not apply.
Our main algorithmic result says that every FO property can be tested in time
O(n log n) for L-interval graphs when L is any finite set of reals. To prove this
result, we employ a well-known relation of FO properties to Ehrenfeucht-Fraïssé
games. Specifically, we show using the notion of game trees, which we introduce,
that there exists an algorithm transforming an input L-interval graph to another
252 R. Ganian et al.

L-interval graph that has bounded maximum degree and that satisfies the same
properties expressible by FO sentences with bounded quantifier rank. We remark
that encoding a game associated with a model checking problem by a tree, which
describes the course of the game, was also applied in designing fast algorithms
for MSO model checking [14,16].
On the negative side, we show that if L is an infinite set that is dense in
some open set, then L-interval graphs can be used to model arbitrary graphs.
Specifically, we show that L-interval graphs for such sets L allow polynomially
bounded FO interpretations of all graphs. Consequently, testing FO properties
for L-interval graphs for such sets L is W[2]-hard (see Corollary 2). In addition,
we show that unit interval graphs allow polynomially bounded MSO interpreta-
tions of all graphs.
The property of being W[2]-hard comes from the theory of parameterized
complexity [6], and it is equivalent to saying that the considered problem is
at least as hard as the d-dominating set problem, asking for the existence of a
dominating set of fixed parameter size d in a graph. It is known that, unless the
Exponential time hypothesis fails, W[2]-hard problems cannot have polynomial
algorithms with the degree a constant independent of the parameter (of the
considered FO property in our case).
In Section 2, we introduce the notation and the computational model used
in the paper. In the following section, we present an O(n log n) algorithm for
deciding FO properties of L-interval graphs for finite sets L. In Section 4, we
present proofs of the facts mentioned above on the clique-width of L-interval
graphs with bounded radius. Finally, we establish FO interpretability of graphs
in L-interval graphs for sets L which are dense in an open set in Section 5.

2 Notation

An interval graph is a graph G such that every vertex v of G can be associated
with an interval J(v) = [ℓ(v), r(v)) such that two vertices v and v′ of G are
adjacent if and only if J(v) and J(v′) intersect (it can be easily shown that the
considered class of graphs remains the same regardless of whether we consider
open, half-open or closed intervals in the definition). We refer to such an assignment
of intervals to the vertices of G as a representation of G. The point ℓ(v) is
the left end point of the interval J(v) and r(v) is its right end point.
If L is a set of reals and r(v) − ℓ(v) ∈ L for every vertex v, we say that G is
an L-interval graph and we say that the representation is an L-representation
of G. For example, if L = {1}, we speak about unit interval graphs. Finally, if
r(v) − ℓ(v) ∈ L and 0 ≤ ℓ(v) ≤ r(v) ≤ d for some real d, i.e., all intervals are
subintervals of [0, d), we speak about (L, d)-interval graphs. Note that if G is an
interval graph of radius k, then G is also an (L, (2k + 1) max L)-interval graph
(we use max L and min L to denote the maximum and the minimum elements,
respectively, of the set L).
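To make these definitions concrete, the following small Python sketch (our illustration, not part of the paper) computes the edge set of the graph represented by half-open intervals [ℓ(v), r(v)):

```python
def intersects(a, b):
    # Half-open intervals [l1, r1) and [l2, r2) intersect
    # iff each one starts before the other one ends.
    (l1, r1), (l2, r2) = a, b
    return l1 < r2 and l2 < r1

def interval_graph_edges(rep):
    # rep maps a vertex to its interval (l, r); the result is the
    # edge set of the represented interval graph.
    vs = sorted(rep)
    return {(u, v) for i, u in enumerate(vs) for v in vs[i + 1:]
            if intersects(rep[u], rep[v])}

# A {1}-representation (all lengths equal 1), i.e. a unit interval graph:
rep = {"a": (0.0, 1.0), "b": (0.5, 1.5), "c": (2.0, 3.0)}
```

Here a and b are adjacent while c is isolated; since all interval lengths lie in L = {1}, this is a {1}-representation.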
We now introduce two technical definitions related to manipulating intervals
and their lengths. These definitions are needed in the next section. If L is a set
of reals, then L^(k) is the set of all integer linear combinations of numbers from
L with the sum of the absolute values of their coefficients bounded by k. For
instance, L^(0) = {0} and L^(1) = L ∪ (−L) ∪ {0}. The L-distance of two intervals [a, b)
and [c, d) is the smallest k such that c − a ∈ L^(k). If no such k exists, then the
L-distance of the two intervals is defined to be ∞.
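For a finite L, the sets L^(k) can be generated iteratively, which also yields L-distances; the following sketch (our reading of the definition, with signed integer coefficients) closes {0} under adding or subtracting one element of L per round:

```python
def L_power(L, k):
    # L^(k): integer linear combinations of elements of L whose
    # coefficients have absolute values summing to at most k,
    # built by k rounds of adding or subtracting a single element.
    cur = {0.0}
    for _ in range(k):
        cur |= {x + s * a for x in cur for a in L for s in (1, -1)}
    return cur

def L_distance(L, a, c, k_max=20):
    # Smallest k with c - a in L^(k); None plays the role of infinity
    # (searched only up to the cutoff k_max in this toy version).
    for k in range(k_max + 1):
        if c - a in L_power(L, k):
            return k
    return None
```

Exact float membership tests are only safe for well-behaved lengths such as integers; this is an illustration, not the paper's algorithm.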
Since we do not restrict our attention to L-interval graphs where L is a set
of rationals, we should specify the computational model considered. We use
the standard RAM model with infinite arithmetic precision and unit cost of all
arithmetic operation, but we refrain from trying to exploit the power of this com-
putational model by encoding any data in the numbers we store. In particular,
we only store the end points of the intervals in the considered representations of
graphs in numerical variables with infinite precision.

2.1 Clique-Width
We now briefly present the notion of clique-width introduced in [3]. Our results
on interval graphs related to this notion are given in Section 4. A k-labeled
graph is a graph with vertices that are assigned integers (called labels) from 1
to k. The clique-width of a graph G equals the minimum k such that G can
be obtained from single vertex graphs with label 1 using the following four
operations: relabeling all vertices with label i to j, adding all edges between the
vertices with label i and the vertices with label j (i and j can be the same),
creating a vertex labeled with 1, and taking a disjoint union of graphs obtained
using these operations.
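The four operations can be phrased as a tiny algebra on labeled graphs; the following sketch (our illustration) builds a path on three vertices, witnessing that paths have clique-width at most 3. For brevity, `vertex` creates a vertex with an arbitrary label, shorthand for creating it with label 1 and relabeling:

```python
class LabeledGraph:
    # A k-labeled graph: labels maps each vertex to its label,
    # edges is a set of two-element frozensets.
    def __init__(self, labels, edges=()):
        self.labels, self.edges = dict(labels), set(edges)

def vertex(v, label):                      # single labeled vertex
    return LabeledGraph({v: label})

def union(g, h):                           # disjoint union
    return LabeledGraph({**g.labels, **h.labels}, g.edges | h.edges)

def relabel(g, i, j):                      # relabel all i to j
    return LabeledGraph({v: (j if l == i else l)
                         for v, l in g.labels.items()}, g.edges)

def add_edges(g, i, j):                    # join labels i and j
    new = {frozenset((u, v)) for u in g.labels for v in g.labels
           if u != v and g.labels[u] == i and g.labels[v] == j}
    return LabeledGraph(g.labels, g.edges | new)

# The path a - b - c, built with three labels:
p = add_edges(union(vertex("a", 1), vertex("b", 2)), 1, 2)
p = add_edges(union(p, vertex("c", 3)), 2, 3)
```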

2.2 First Order Properties


In this subsection, we introduce concepts from logic and model theory which we
use. A first order (FO) sentence is a formula with no free variables with the usual
logical connectives and quantification allowed only over variables for elements
(vertices in the case of graphs). A monadic second order (MSO) sentence is a
formula with no free variables with the usual logical connectives where, unlike
in FO sentences, quantification over subsets of elements is allowed. An FO prop-
erty is a property expressible by an FO sentence; similarly, an MSO property
is a property expressible by an MSO sentence. Finally, the quantifier rank of a
formula is the maximum number of nested quantifiers in it.
FO sentences are closely related to the so-called Ehrenfeucht-Fraïssé games.
The d-round Ehrenfeucht-Fraïssé game is played on two relational structures R1
and R2 by two players referred to as the spoiler and the duplicator. In each of
the d rounds, the spoiler chooses an element in one of the structures and the
duplicator chooses an element in the other. Let xi be the element of R1 chosen
in the i-th round and yi be the element of R2. We say that the duplicator wins
the game if the duplicator has a strategy guaranteeing that the substructure of
R1 induced by the elements x1, . . . , xd is isomorphic to the substructure of
R2 induced by the elements y1, . . . , yd, with the isomorphism mapping each
xi to yi.

The following theorem [8,10] relates this notion to FO sentences of quantifier
rank at most d.

Theorem 1. Let d be an integer. The following statements are equivalent for
any two structures R and R′:
– The structures R and R′ satisfy the same FO sentences of quantifier rank at
most d.
– The duplicator wins the d-round Ehrenfeucht-Fraïssé game for R and R′.

We describe possible courses of the d-round Ehrenfeucht-Fraïssé game on a single
relational structure by a rooted tree, which we call a d-EF-tree. All the leaves
of a d-EF-tree are at depth d and each of them is associated with a relational
structure with elements labeled from 1 to d. The d-EF-tree T for the game
played on a relational structure R is obtained as follows. The number of children
of every internal node of T is the number of elements of R and the edges leaving
the internal node to its children are associated in a one-to-one way with the
elements of R. So, every path from the root of T to a leaf u of T yields a sequence
x1, . . . , xd of the elements of R (those associated with the edges of that path)
and the substructure of R induced by x1, . . . , xd with xi labeled with i is the
one associated with the leaf u.
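As an illustration (ours, not from the paper), the d-EF-tree of a graph can be built recursively; a leaf records which chosen vertices coincide and which are adjacent, which determines the labeled induced substructure up to isomorphism:

```python
def leaf_type(adj, seq):
    # The labeled substructure induced by seq = (x_1, ..., x_d):
    # which chosen vertices are equal and which are adjacent.
    n = len(seq)
    eq = tuple(seq[i] == seq[j] for i in range(n) for j in range(i + 1, n))
    ed = tuple(seq[j] in adj[seq[i]] for i in range(n) for j in range(i + 1, n))
    return (eq, ed)

def ef_tree(adj, d, prefix=()):
    # Internal nodes branch on every vertex of the graph; leaves sit
    # at depth d and carry the labeled induced substructure.
    if len(prefix) == d:
        return leaf_type(adj, prefix)
    return {v: ef_tree(adj, d, prefix + (v,)) for v in adj}

# The 2-EF-tree of a single edge x - y:
adj = {"x": {"y"}, "y": {"x"}}
t = ef_tree(adj, 2)
```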
A mapping f from a d-EF-tree T to another d-EF-tree T′ is an EF-homomorphism
if the following three conditions hold:
1. if u is a parent of v in T , then f (u) is a parent of f (v) in T′,
2. if u is a leaf of T , then f (u) is a leaf of T′, and
3. the structures associated with u and f (u) are isomorphic through the bijection
given by the labelings.
Two trees T and T′ are EF-equivalent if there exist an EF-homomorphism from
T to T′ and an EF-homomorphism from T′ to T.
Let T and T′ be the d-EF-trees for the game played on relational structures
R and R′, respectively. Suppose that T and T′ are EF-equivalent and let f1 and
f2 be the EF-homomorphisms from T to T′ and from T′ to T, respectively. We claim
that the duplicator wins the d-round Ehrenfeucht-Fraïssé game for R and R′.
Let us describe a possible strategy for the duplicator. We restrict to the case
when the spoiler chooses an element xi of R in the i-th step, assuming elements
x1, . . . , xi−1 in R and elements y1, . . . , yi−1 in R′ have been chosen in the previous
rounds. Let u0, . . . , ui be the path in T corresponding to x1, . . . , xi. The duplicator
chooses the element yi in R′ that corresponds to the edge f1(ui−1)f1(ui)
in T′. It can be verified that if the duplicator follows this strategy, then the
substructures of R and R′ induced by x1, . . . , xd and y1, . . . , yd, respectively, are
isomorphic.
Let us summarize our findings from the previous paragraph.

Theorem 2. Let d be an integer. If the d-EF-trees for the game played on two
relational structures R and R′ are EF-equivalent, then the duplicator wins the
d-round Ehrenfeucht-Fraïssé game for R and R′.

The converse implication, i.e., that if the duplicator can win the d-round
Ehrenfeucht-Fraïssé game for R and R′, then the d-EF-trees for the game played
on relational structures R and R′ are EF-equivalent, is also true, but we omit
further details as we only need the implication given in Theorem 2 in our
considerations.
We finish this section with some observations on minimal d-EF-trees in
EF-equivalence classes. Let T be a d-EF-tree. Suppose that an internal node at
level d − 1 has two children, which are leaves, associated with the same labeled
structure. Observe that deleting one of them yields an EF-equivalent d-EF-tree.
Suppose that we have deleted all such leaves and an internal node at level d − 2
has two children with their subtrees isomorphic (in the usual sense). Again,
deleting one of them (together with its subtree) yields an EF-equivalent
d-EF-tree. So, if T′ is a minimal subtree of T that is EF-equivalent to T and K is
the number of non-isomorphic d-labeled structures, then the degree of nodes at
depth d − 1 does not exceed K, that of nodes at depth d − 2 does not exceed 2^K,
that of nodes at depth d − 3 does not exceed 2^(2^K), etc. We conclude that the
size of a minimal subtree of T that is EF-equivalent to T is bounded by a
function of d and the type of relational structures considered only.
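The pruning argument translates directly into code; this sketch (ours) deletes, at every level, all but one child among children with isomorphic pruned subtrees, where isomorphism is approximated by equality of a canonical form (adequate once duplicates below have been removed):

```python
def canon(node):
    # Canonical hashable form of a pruned subtree; leaves are
    # assumed hashable already.
    if not isinstance(node, dict):
        return node
    return frozenset(canon(c) for c in node.values())

def prune(node):
    # Keep one representative child per canonical subtree, bottom-up.
    # The result is EF-equivalent to the input, and its size is
    # bounded in terms of d and the number of leaf types alone.
    if not isinstance(node, dict):
        return node
    kept, seen = {}, set()
    for v, child in node.items():
        c = prune(child)
        if canon(c) not in seen:
            seen.add(canon(c))
            kept[v] = c
    return kept

# Two children with EF-equivalent subtrees collapse to one:
t = {"x": {"a": "L1", "b": "L1"}, "y": {"a": "L1", "c": "L1"}}
```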

3 FO Model Checking
Using Theorems 1 and 2, we prove the following kernelization result for L-interval
graphs.

Theorem 3. For every finite subset L of reals and every d, there exists an
integer K0 and an algorithm A with the following properties. The input of A is
an L-representation of an n-vertex L-interval graph G and A outputs in time
O(n log n) an L-representation of an induced subgraph G′ of G such that
– every unit interval contains at most K0 left end points of the intervals
corresponding to vertices of G′, and
– G and G′ satisfy the same FO sentences with quantifier rank at most d.

Proof. We first focus on proving the existence of the number K0 and the subgraph
G′ and we postpone the algorithmic considerations to the end of the proof.
As the first step, we show that we can assume that all the left end points
are distinct. Choose δ to be the minimum distance between distinct end points
of intervals in the representation. Suppose that the intervals are sorted by their
left end points (resolving ties arbitrarily). Shifting the i-th interval by iδ/2n, for
i = 1, . . . , n, to the right does not change the graph represented by the intervals
and all the end points become distinct.
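The shifting step can be sketched as follows (our code). Each interval is translated as a whole, so all lengths, and hence the L-representation property and the represented graph, are preserved:

```python
def make_endpoints_distinct(intervals):
    # Sort by left end point and shift the i-th interval by i*delta/(2n),
    # where delta is the minimum gap between distinct end points.
    # All shifts stay below delta, so the represented graph is
    # unchanged, and equal end points of different intervals receive
    # different shifts, so all end points become pairwise distinct.
    pts = sorted({p for iv in intervals for p in iv})
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    delta = min(gaps) if gaps else 1.0
    n = len(intervals)
    return [(l + i * delta / (2 * n), r + i * delta / (2 * n))
            for i, (l, r) in enumerate(sorted(intervals), start=1)]

ivs = make_endpoints_distinct([(0.0, 1.0), (0.0, 1.0), (1.0, 2.0)])
```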
Choose ε to be the minimum positive element of L^(2^(d+1)). Fix any real a and
let I be the set of all intervals [x, x + ε) such that x − a ∈ L^(2^(d+1)). By the choice
of ε, the intervals of I are disjoint. In addition, the set I is finite (since L is
finite). Let W be the set of vertices w of G such that ℓ(w) lies in an interval
from I, and for such a vertex w, let i(w) be the left end point of that interval
from I. Define a linear order on W such that w ≤ w′ for w and w′ from W iff
ℓ(w) − i(w) ≤ ℓ(w′) − i(w′) and resolve the cases of equality for distinct vertices
w and w′ arbitrarily.
We view W as a linearly ordered set with elements associated with intervals
from I (specifically the interval with the left end point at i(w)) as well as asso-
ciated with the lengths of their corresponding intervals in the representation of
G. Let us establish the following claim.

Claim. There exists a number K depending only on |I| and d such that if W
contains more than K elements associated with the interval [a, a + ε), then there
exists an element w ∈ W associated with [a, a + ε) such that the d-EF-trees for
the game played on W and W \ {w} are EF-equivalent.

Indeed, let T be the d-EF-tree for the game played on W and let T′ be a minimal
subtree of T that is EF-equivalent to T. Recall that the size of T′ does not
exceed a number K depending only on |I| and d. If W contains more than K
elements associated with [a, a + ε), then one of them is not associated with edges
that are present in T′. We set w to be this element. This finishes the proof of
the claim.
Since the d-EF-trees for the game played on W and W \ {w} are EF-equivalent,
the duplicator wins the d-round Ehrenfeucht-Fraïssé game for W and W \ {w}
by Theorems 1 and 2.
We describe a strategy for the duplicator to win the d-round Ehrenfeucht-Fraïssé
game for the graphs G and G \ w. During the game, some intervals from
I will be marked as altered. At the beginning, the only altered interval is the
interval [a, a + ε).
The duplicator strategy in the i-th round of the game is the following.

– If the spoiler chooses a vertex w′ with ℓ(w′) in an interval of I at L-distance
at most 2^(d+1−i) from an altered interval, then the duplicator follows its winning
strategy for the d-round Ehrenfeucht-Fraïssé game for W and W \ {w},
which gives the vertex to choose in the other graph. In addition, the duplicator
marks the interval of I that contains ℓ(w′) as altered. Note that the
interval of the chosen vertex has its left end point in the same interval of I
as w′.
– Otherwise, the duplicator chooses the same vertex in the other graph. No
new intervals are marked as altered.

It remains to argue that the subgraphs of G and G \ w obtained in this way
are isomorphic. Let w1, . . . , wd be the chosen vertices of G and w′1, . . . , w′d the
chosen vertices of G \ w. For brevity, let us refer to vertices corresponding to the
intervals with left end points in the altered intervals as altered vertices. If wi
is not altered, then w′i = wi. If wi is altered, then ℓ(wi) and ℓ(w′i) are in the
same interval J ∈ I and the only intervals that might intersect the intervals
corresponding to wi and w′i differently are those with left end points in the
intervals of I at L-distance at most two from J. However, if some chosen vertices
have their left end points in such intervals, then these intervals must also be

altered and these chosen vertices are altered. Since we have followed a winning
strategy for the duplicator for W and W \ {w} when choosing altered vertices,
the subgraphs of G and G \ w induced by the altered vertices are isomorphic. We
conclude that the subgraphs of G and G \ w induced by the vertices w1, . . . , wd
and w′1, . . . , w′d, respectively, are isomorphic. So, the duplicator wins the game.
Let us summarize our findings. If an interval of length ε contains more than
K left end points of intervals in the given L-representation of G, then one of
the vertices corresponding to these intervals can be removed from G without
changing the set of FO sentences with quantifier rank at most d that are satisfied by G.
So, the statement of the theorem is true with K0 set to K⌈ε^(−1)⌉.
It remains to consider the algorithmic aspects of the theorem. The values of
ε and K0 are determined by L and d. The algorithm sorts the left end points
of all the intervals (this requires O(n log n) time) and for each of these points
computes the distance to the left end of the interval that is K positions later
in the obtained order. If all these distances are at least ε, then every interval of
length at most ε contains at most K0 left end points of the intervals and the
representation is of the desired form.
Otherwise, we choose the smallest of these distances and consider the corre-
sponding interval [a, b), b − a < ε, containing at least K0 left end points of the
intervals from the representation. By the choice of this interval, any interval of
length b − a at L-distance at most 2d+1 from [a, b] contains at most K0 + 1 left
end points of the intervals from the representation. So, the size of the d-EF-tree
for the game played on the vertices v with (v) in such intervals is bounded by a
function of K0 , d and |L|. Since this quantity is independent of the input graph,
we can identify in constant time a vertex w with ℓ(w) ∈ [a, b) whose removal
from G does not change the set of FO sentences with quantifier rank d satisfied
by G.
We then update the order of the left end points and the at most K0 computed
distances affected by removing w, and iterate the whole process. Since at each
step we alter at most K0 distances, using a heap to store the computed distances
and choose the smallest of them requires O(log n) time per vertex removal. So,
the running time of the algorithm is bounded by O(n log n). ⊓⊔
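The removal loop at the end of the proof can be sketched as follows (a simplified, quadratic toy version; the paper obtains O(n log n) by keeping the K-step distances in a heap and recomputing only the at most K0 distances affected by each removal). The parameter `pick` stands in for the constant-time d-EF-tree computation that selects a removable vertex:

```python
def kernelize(lefts, K, eps, pick):
    # lefts: the left end points of the representation.  While some
    # point and the point K positions later are closer than eps,
    # delete the point chosen by pick(points, window_start, K).
    pts = sorted(lefts)
    removed = []
    while True:
        bad = next((i for i in range(len(pts) - K)
                    if pts[i + K] - pts[i] < eps), None)
        if bad is None:
            return pts, removed
        removed.append(pts.pop(pick(pts, bad, K)))

# Placeholder pick: always drop the middle point of the dense window.
middle = lambda pts, i, K: i + K // 2
kept, gone = kernelize([0.0, 0.01, 0.02, 1.0, 2.0], 2, 0.5, middle)
```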

It is possible to think of several strategies to efficiently decide FO properties of
L-interval graphs given Theorem 3. We present one of them. Fix an FO sentence
Φ with quantifier rank d and apply the algorithm from Theorem 3 to get an
L-interval graph and a representation of this graph such that every unit interval
contains at most K left end points of the intervals of the representation. After
this preprocessing step, every vertex of the new graph has at most K(|L| +
1) max L neighbors. In particular, the maximum degree of the new graph is
bounded. The result of Seese [22] asserts that every FO property can be decided
in linear time for graphs with bounded maximum degree, and so we conclude:

Theorem 4. For every finite subset L of reals and every FO sentence Φ, there
exists an algorithm running in time O(n log n) that decides whether an input
n-vertex L-interval graph G given by its L-representation satisfies Φ.

4 Clique-Width of Interval Graphs

Unit interval graphs can have unbounded clique-width [12], but Lozin [17] noted
that every proper hereditary subclass of unit interval graphs has bounded clique-
width. In particular, the class of ({1}, d)-interval graphs has bounded clique-
width for every d > 0. Using Gaifman’s theorem, it follows that testing FO
properties of unit interval graphs can be performed in linear time if the input
graph is given by its {1}-representation with the left end points of the intervals
sorted. We provide an easy extension of this, and outline how it can be used to
prove the special case of our main result for FO model checking when L is a
finite set of rational numbers (the proof of the lemma is omitted due to space
constraints).

Lemma 1. Let L be a finite set of positive rational numbers. For any d > 0,
the class of (L, d)-interval graphs has bounded clique-width.

From Lemma 1 and Gaifman’s theorem, one can approach the FO model checking
problem on L-interval graphs with L containing rational numbers only as fol-
lows. L-interval graphs with radius d are (L, (2d + 1) max L)-interval graphs. By
Gaifman’s theorem, every FO model checking instance can be reduced to model
checking of basic local FO sentences, i.e., to FO model checking on L-interval
graphs with bounded radius. Since such graphs have bounded clique-width, the
latter can be solved in linear time by [2]. Combining this with the covering
technique from [11], which can be adapted to run in linear time in the case of
L-interval graphs, we obtain the following.

Corollary 1. Let L be a finite set of positive rational numbers. The FO model
checking problem can be solved in linear time on the class of L-interval graphs
if the input graph is given by its L-representation with the left end points of the
intervals sorted.

However, Corollary 1 is just a fortunate special case, since aside from rational
lengths one can prove the following.
 
Lemma 2. For any irrational q > 0 there is d such that the class of ({1, q}, d)-interval
graphs has unbounded clique-width.

Proof. This proof is in a sense complementary to that of Lemma 1. We may
assume q > 1 (otherwise, we rescale and consider the set {1, 1/q}). So, fix L =
{1, q}, d = q + 3 and an integer n (to be specified later).
Our task is to construct a ({1, q}, d)-interval representation of a graph G
with large clique-width. Since q is irrational, for every ℓ we can find n such that
L^(n) ∩ [0, d − q) contains more than ℓ points. We actually construct an arbitrarily
long sequence P = (a1, a2, . . . , an) of such points as follows: a1 = 0, a2 = 1, and
for i > 2 set

– ai = ai−1 + 1, provided that |ai−2 − ai−1| = 1 and ai−1 < d − 2,
– ai = ai−1 − 1, provided that ai−2 − ai−1 = q, and
– ai = ai−1 − q otherwise (we call this ai a q-element of P ).

Informally, we are “folding” a long sequence with differences from L^(n) into a
bounded length interval, avoiding as many collisions of points as possible.
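The sequence P can be generated exactly by tracking each a_i as an integer pair (p, m) standing for p + m·q, which avoids floating-point equality tests when checking the three rules (our illustration; q = √2 below is an arbitrary irrational):

```python
def folding_sequence(q, n):
    # a_1 = 0, a_2 = 1; then move by +1, -1 or -q according to the
    # three rules, with d = q + 3.  Each point is stored as an exact
    # pair (p, m) meaning p + m*q, so the tests |a_{i-2} - a_{i-1}| = 1
    # and a_{i-2} - a_{i-1} = q are exact integer comparisons.
    d = q + 3
    val = lambda pm: pm[0] + pm[1] * q
    a = [(0, 0), (1, 0)]
    while len(a) < n:
        (p2, m2), (p1, m1) = a[-2], a[-1]
        diff = (p2 - p1, m2 - m1)          # a_{i-2} - a_{i-1}
        if diff in ((1, 0), (-1, 0)) and val(a[-1]) < d - 2:
            a.append((p1 + 1, m1))
        elif diff == (0, 1):               # previous step was -q
            a.append((p1 - 1, m1))
        else:                              # a q-element of P
            a.append((p1, m1 - 1))
    return [val(x) for x in a]

seq = folding_sequence(2 ** 0.5, 8)
```

Note that occasional repeated values do occur (here seq[4] and seq[6] coincide), consistent with the remark that collisions are merely avoided as much as possible.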
Let δ > 0 be such that nδ is smaller than the smallest number in L^(n) ∩ (0, d − q).
Let us introduce the following shorthand notation: if J is an interval and r
a real, then J + r is the interval J shifted by r to the right. Similarly, if I is
a set of intervals, then I + r is the set of the intervals from I shifted by r to
the right. We define sets of intervals U1 := {[iδ, 1 + iδ) : i = 0, . . . , n − 1} and
Uq := {[iδ, q + iδ) : i = 0, . . . , n − 1}. For further reference we say that intervals
[iδ, 1 + iδ) or [iδ, q + iδ) are at level i.
For i = 1, . . . , n, we set Wi = Uq +ai if ai is a q-element of P , and Wi = U1 +ai
otherwise. Then every interval of Wi is a subinterval of [0, d). Let G be a graph
on n2 vertices represented by the union of the interval sets W1 ∪ W2 ∪ · · · ∪ Wn .
Let Wi , i = 1, . . . , n, be the vertices represented by Wi . We claim that the
clique-width of G exceeds any fixed number k ∈ N when n is sufficiently large.
Assume, for a contradiction, that the clique-width of G is at most k. We can
view the construction of G as a binary tree and conclude that a k-labelled subgraph
G1 of G with n^2/3 ≤ |V(G1)| ≤ 2n^2/3 appeared during the construction of G.
However, this implies that the vertices of G1 have at most k different neighborhoods
in G \ V(G1). We will show that this is not possible (assuming that n is large).
For 2 ≤ i ≤ n, vertices x ∈ Wi−1 and y ∈ Wi are mates if they are represented
by copies of the same-level intervals from U1 or Uq above. Our first observation is
that, up to symmetry between i−1 and i, 0 ≤ |Wi−1 ∩V (G1 )|−|Wi ∩V (G1 )| ≤ k.
Suppose not. Then there exist k + 1 vertices in Wi−1 ∩ V (G1 ) whose mates are
in Wi \ V (G1 ), and thus certify pairwise distinct neighborhoods of the former
ones in G \ V (G1 ).
A set Wi is crossing G1 if ∅ ≠ Wi ∩ V(G1) ≠ Wi. The arguments given in
the previous paragraph and n^2/3 ≤ |V(G1)| ≤ 2n^2/3 imply that for any m, if n is
large, there exist sets Wi0, Wi0+1, . . . , Wi0+m in G all crossing G1. So, we can
select an arbitrarily large index set I ⊆ {i0, . . . , i0 + m − 1}, |I| = ℓ, such that
for each i ∈ I the element ai+1 is to the right of ai, and that all intervals in
∪i∈I Wi share a common point. In particular, ai is not a q-element and so both
Wi and Wi+1 are shifted copies of U1. Let i1, . . . , iℓ be the elements of I ordered
according to the (strictly) increasing values of ai, i.e., ai1 < · · · < aiℓ.
Finally, for any j, j′ ∈ {1, . . . , ℓ} such that j′ > j + 1, we see that each vertex of
Wij ∩ V(G1) cannot have the same neighborhood as any vertex of Wij′ ∩ V(G1):
this is witnessed by the non-empty set Wij+1+1 \ V(G1) (represented to the right
of the intervals from Wij while intersecting every interval from Wij′). Therefore,
the vertices of G1 have at least ℓ/2 > k distinct neighborhoods in G \ V(G1),
which contradicts the fact that the clique-width of G is at most k. ⊓⊔

5 Graph Interpretation in Interval Graphs


A useful tool when solving the model checking problem on a class of structures
is the ability to “efficiently translate” an instance of the problem to a different
class of structures, for which we already may have an efficient model checking
algorithm. To this end we introduce simple FO graph interpretation, which is an
instance of the general concept of interpretability of logic theories [21] restricted
to simple graphs with vertices represented by singletons.
An FO graph interpretation is a pair I = (ν, μ) of FO formulae (with 1 and 2
free variables respectively) where μ is symmetric, i.e., G |= μ(x, y) ↔ μ(y, x) in
every graph G. If G is a graph, then I(G) is the graph defined as follows:
– The vertex set of I(G) is the set of all vertices v of G such that G |= ν(v),
and
– the edge set of I(G) is the set of all the pairs {u, v} of vertices of G such
that G |= ν(u) ∧ ν(v) ∧ μ(u, v).
We say that a class C1 of graphs has an FO interpretation in a class C2 if there
exists an FO graph interpretation I such that every graph from C1 is isomorphic
to I(G) for some G ∈ C2 .
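Applying an interpretation I = (ν, μ) is mechanical once the two formulas are given as predicates; this sketch (ours, with a toy interpretation: complementation restricted to non-isolated vertices) illustrates the definition:

```python
def interpret(vertices, edges, nu, mu):
    # I(G): keep the vertices satisfying nu; join u and v iff both
    # satisfy nu and the symmetric formula mu(u, v) holds.
    vs = {v for v in vertices if nu(v)}
    es = {frozenset((u, v)) for u in vs for v in vs
          if u != v and mu(u, v)}
    return vs, es

verts = {1, 2, 3, 4}
edges = {frozenset(e) for e in [(1, 2), (2, 3)]}
nu = lambda v: any(frozenset((v, u)) in edges for u in verts if u != v)
mu = lambda u, v: frozenset((u, v)) not in edges
vs, es = interpret(verts, edges, nu, mu)
```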
A proof of the next lemma is omitted due to space constraints.

Lemma 3. If L is a subset of non-negative reals that is dense in some non-empty
open set, then there exists a polynomially bounded simple FO interpretation
of the class of all graphs in the class of L-interval graphs.

Since many FO properties are W[2]-hard for general graphs, we can immediately
conclude the following.

Corollary 2. If L is a subset of non-negative reals that is dense in some non-empty
open set, then FO model checking is W[2]-hard on L-interval graphs.

We now turn our attention to interpretation in unit interval graphs. The price
we pay for restricting to a smaller class of interval graphs is the strength of
the interpretation language used, namely that of MSO logic. At this point we
remark that there exist two commonly used MSO frameworks for graphs; the
MSO1 language which is allowed to quantify over vertices and vertex sets only,
and MSO2 which is in addition allowed to quantify over edges and edge sets. We
stay with the former weaker one in this paper.
An MSO1 graph interpretation is defined analogously to the FO graph interpretation
above, with the formulas μ and ν being MSO1 formulas (we omit a proof of the
following lemma due to space limitations).

Lemma 4. There is a polynomially bounded simple MSO1 interpretation of the
class of all graphs into the class of unit interval graphs.

Again, we can immediately conclude the following.

Corollary 3. MSO1 model checking is W[2]-hard on unit interval graphs.

This corollary is rather tight since the aforementioned result of Lozin [17] claims
that every proper hereditary subclass of unit interval graphs has bounded clique-
width, and hence MSO1 model checking on this class is in linear time [2].
Lastly, we remark that Fellows et al. [9] have shown that testing FO properties
on unit two-interval graphs (i.e., such that each vertex corresponds to a pair of
intervals, each on a distinct line) is W[1]-hard.

References
1. Courcelle, B.: The monadic second order logic of graphs I: Recognizable sets of
finite graphs. Inform. and Comput. 85, 12–75 (1990)
2. Courcelle, B., Makowsky, J.A., Rotics, U.: Linear time solvable optimization prob-
lems on graphs of bounded clique-width. Theory Comput. Syst. 33, 125–150 (2000)
3. Courcelle, B., Olariu, S.: Upper bounds to the clique width of graphs. Discrete
Appl. Math. 101, 77–114 (2000)
4. Dawar, A., Grohe, M., Kreutzer, S.: Locally excluding a minor. In: LICS 2007,
pp. 270–279. IEEE Computer Society (2007)
5. Dawar, A., Kreutzer, S.: Parameterized complexity of first-order logic. ECCC
TR09-131 (2009)
6. Downey, R., Fellows, M.: Parameterized complexity. Monographs in Computer Sci-
ence. Springer (1999)
7. Dvořák, Z., Král’, D., Thomas, R.: Deciding first-order properties for sparse graphs.
In: FOCS 2010, pp. 133–142. IEEE Computer Society (2010)
8. Ehrenfeucht, A.: An application of games to the completeness problem for formal-
ized theories. Fund. Math. 49, 129–141 (1961)
9. Fellows, M., Hermelin, D., Rosamond, F., Vialette, S.: On the parameterized com-
plexity of multiple-interval graph problems. Theoret. Comput. Sci. 410, 53–61
(2009)
10. Fraïssé, R.: Sur quelques classifications des systèmes de relations. Université
d’Alger, Publications Scientifiques, Série A 1, 35–182 (1954)
11. Frick, M., Grohe, M.: Deciding first-order properties of locally tree-decomposable
structures. J. ACM 48, 1184–1206 (2001)
12. Golumbic, M., Rotics, U.: On the clique-width of some perfect graph classes. Int.
J. Found. Comput. Sci. 11, 423–443 (2000)
13. Grohe, M., Kreutzer, S.: Methods for algorithmic meta theorems. In: Model Theo-
retic Methods in Finite Combinatorics Contemporary Mathematics, pp. 181–206.
AMS (2011)
14. Kneis, J., Langer, A., Rossmanith, P.: Courcelle’s theorem — a game-theoretic
approach. Discrete Optimization 8(4), 568–594 (2011)
15. Kreutzer, S.: Algorithmic meta-theorems. ECCC TR09-147 (2009)
16. Langer, A., Reidl, F., Rossmanith, P., Sikdar, S.: Evaluation of an mso-solver. In:
ALENEX 2012, pp. 55–63. SIAM / Omnipress (2012)
17. Lozin, V.: From tree-width to clique-width: Excluding a unit interval graph. In:
Hong, S.-H., Nagamochi, H., Fukunaga, T. (eds.) ISAAC 2008. LNCS, vol. 5369,
pp. 871–882. Springer, Heidelberg (2008)
18. Nešetřil, J., Ossona de Mendez, P.: Grad and classes with bounded expansion I.
Decompositions. European J. Combin. 29, 760–776 (2008)

19. Nešetřil, J., Ossona de Mendez, P.: Grad and classes with bounded expansion II.
Algorithmic aspects. European J. Combin. 29, 777–791 (2008)
20. Nešetřil, J., Ossona de Mendez, P.: Grad and classes with bounded expansion III.
Restricted graph homomorphism dualities. European J. Combin. 29, 1012–1024
(2008)
21. Rabin, M.O.: A simple method for undecidability proofs and some applications. In:
Logic, Methodology and Philosophy of Sciences, vol. 1, pp. 58–68. North-Holland
(1964)
22. Seese, D.: Linear time computable problems and first-order descriptions. Math.
Structures Comput. Sci. 6, 505–526 (1996)
Strategy Composition in Compositional Games

Marcus Gelderie

RWTH Aachen, Lehrstuhl für Informatik 7,


Logic and Theory of Discrete Systems,
D-52056 Aachen
[email protected]

Abstract. When studying games played on finite arenas, the arena is
given explicitly, hiding the underlying structure of the arena. We study
games where the global arena is a product of several smaller, constituent
arenas. We investigate how these “global games” can be solved by playing
“component games” on the constituent arenas. To this end, we introduce
two kinds of products of arenas. Moreover, we define a suitable notion
of strategy composition and show how, for the first notion of product,
winning strategies in reachability games can be composed from winning
strategies in games on the constituent arenas. For the second kind of
product, the complexity of solving the global game shows that a general
composition theorem is equivalent to proving Pspace = Exptime.

1 Introduction
Infinite games with ω-regular winning conditions have been studied extensively
over the past decades [1–5]. This research has been most successful in establishing
results about solving ω-regular games on an “abstract” arena. A fundamental
open problem, which is of intrinsic interest in the area of automated synthesis,
is to exploit the compositional structure of an arena to derive a compositional
representation of a winning strategy. For instance, if an arena is viewed as a
product of several smaller transition systems, is it possible to lift this structure
to strategies in games on this arena?
The classical results on ω-regular games depend on the representation of a
winning strategy by an automaton. None of these results allows to transfer a
given composition of an arena into a composition of automata in such a way
that a winning strategy is implemented. Since there is no lack of methods for
composing automata (for example, the cascade product), it rather seems that
automata are too “coarse” a tool to capture this compositional structure.
We study the compositional nature of winning strategies in games played
on products of arenas. Products of arenas can be defined in a variety of ways
(see e.g. [6]). As a first step towards a compositional approach to synthesis, we
restrict ourselves to two notions, parallel and synchronized product. Our notion
of strategy composition relies on a Turing machine based model for strategy

Supported by DFG research training group 1298, “Algorithmic Synthesis of Reactive
and Discrete-Continuous Systems” (AlgoSyn).

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 263–274, 2013.
© Springer-Verlag Berlin Heidelberg 2013
264 M. Gelderie

representation, called a strategy machine. Using this model, we show how winning
strategies in reachability games can be composed from winning strategies in
games over the constituent factors of the overall product arena. We study the
complexity of such a composition: its size, its runtime and the computational
complexity of finding it. This entails a study of the complexity of deciding who
wins the game.
Compositionality in an arena is closely linked to a succinct representation of
that arena. Likewise, composing a winning strategy from smaller winning strate-
gies may yield a much smaller representation for that strategy. Finding succinct
transition systems from specifications was studied in [7]. The authors consider
the problem of finding a succinct representation of a model for a given CTL
formula. They show that such succinct models are unlikely to exist in general.
Transition systems which are obtained by “multiplying” smaller transition
systems have also been studied in [8]. The authors consider the problem of model
checking such systems. They show the model checking problem for such systems
to be of high complexity for various notions of behavioral specification and model
checking problem.
Strategy machines were introduced in [9] (the model has been studied in a
different setting in [10]). They allow for a broader range of criteria by which to
compare strategies. Being based on Turing machines, strategy machines allow us,
for instance, to investigate the “runtime” of a strategy and to quantify and compare
“dynamic” memory (the tape content) and “static” memory (the control states).
The complexity of deciding the winner of a game has been subject to extensive
research in the case of games on an abstract arena [11–13]. These complexity
results depend on the size of the abstract arena. We investigate the complexity
of deciding the winner based on a composite representation of the arena.
Our paper is structured as follows: We first define two notions of product of
arenas, the parallel product and the synchronized product. The games we study
are played on arenas that are composed from smaller arenas using these two
operators. Having defined the notion of arena composition, we define strategy
machines and use them to introduce our notion of strategy composition. Sub-
sequently, we study reachability games. We do this separately for the parallel
product and the synchronized product. To this end, we first introduce two nat-
ural ways of defining a reachability condition on a composite arena, local and
synchronized reachability. For the parallel product we obtain a compositionality
theorem for both local and synchronized reachability. For the synchronized prod-
uct we show that deciding the game is Exptime complete. From this we deduce
that finding a general composition theorem is equivalent to showing Exptime
= Pspace.

2 Games on Composite Arenas

An arena is a directed, bipartite graph A = (V, E). The partition of A is
V = V^(0) ⊎ V^(1). Given v ∈ V let vE = {v′ ∈ V | (v, v′) ∈ E}. We define two
operators on arenas: the parallel product and the synchronized product. Let
𝔹 = {0, 1}.
Strategy Composition in Compositional Games 265

Definition 1 (Parallel Product). Consider arenas A1, . . . , Ak with Ai =
(Vi, Ei). The parallel product A1 ‖ · · · ‖ Ak = (V, E) is the arena given by

– V = 𝔹 × ∏_{i=1}^{k} Vi
– E = {((σ, v), (1 − σ, v′)) | ∃i : vi ∈ Vi^(σ) ∧ (vi, vi′) ∈ Ei ∧ ∀j ≠ i : vj = vj′}
– V^(σ) = {(b, v) ∈ V | b = σ}

Note that the parallel product again gives a bipartite arena. Note furthermore
that, given a vertex (σ, v) ∈ 𝔹 × ∏_i Vi, the number of components vi ∈ Vi^(0)
alternatingly increases and decreases by one along all paths starting in (σ, v).
The number of components player 0 controls in each of his moves is given by:

rank0(σ, v) = |{i | vi ∈ Vi^(0)}| + σ

Player 1 controls rank1(σ, v) = k − rank0(σ, v) + 1 components during his moves.
For every p ∈ 𝔹 we have rankp(σ′, v′) = rankp(σ, v) for all (σ′, v′) reachable
from (σ, v). If (σ, v) is clear from context, we thus simply write rankp.
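The product construction of Definition 1 can be sketched in a few lines of Python. This is only an illustrative sketch under an assumed dict-based encoding of arenas (keys 'V0', 'V1', 'E' are our invention, not the paper's):

```python
from itertools import product

def parallel_product(arenas):
    """Parallel product of bipartite arenas (Definition 1 sketch).

    Each arena is a dict with keys 'V0', 'V1' (the partition) and 'E'
    (a set of edge pairs).  A product vertex is (sigma, v), where sigma
    in {0, 1} is the turn flag and v a tuple of component vertices.
    Player sigma moves by picking one component i whose vertex lies in
    V_i^(sigma) and moving along a local edge; all others stay put.
    """
    Vs = [a['V0'] | a['V1'] for a in arenas]
    V = {(s, v) for s in (0, 1) for v in product(*Vs)}
    E = set()
    for (s, v) in V:
        for i, a in enumerate(arenas):
            if v[i] not in (a['V0'] if s == 0 else a['V1']):
                continue  # component i is not controlled at turn s
            for (x, y) in a['E']:
                if x == v[i]:
                    w = v[:i] + (y,) + v[i + 1:]
                    E.add(((s, v), (1 - s, w)))
    return V, E
```

For two one-edge cycles this yields the expected 2 · |V1| · |V2| product vertices, and from a player-0 position either component may be advanced, matching the definition of E.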

[Fig. 1: diagrams omitted.]
(a) Parallel Product: Edges are taken locally. The square player may move in,
e.g., A1 but not in A2.
(b) Synchronized Product: Where transitions permit it, edges are taken globally.
The circle player may choose transitions in A1 and A2. Ak is not affected.

Fig. 1. A vertex v in the parallel and synchronized product of (labeled) arenas
A1, . . . , Ak. The shaded vertices define a possible successor state of v.

To define the synchronized product we use labeled arenas. A labeled arena is
a triple A = (V, Δ, Σ) with V = V^(0) ⊎ V^(1) and Δ ⊆ ⋃_{σ∈𝔹} V^(σ) × Σ × V^(1−σ)
for some finite set Σ of letters.
Definition 2 (Synchronized Product). Consider labeled arenas A1, . . . , Ak
with Ai = (Vi, Δi, Σi). The synchronized product A1 ⊗ · · · ⊗ Ak = (V, Σ, Δ) is
given by

– V = 𝔹 × ∏_{i=1}^{k} Vi
– Σ = ⋃_{i=1}^{k} Σi
– ((σ, v), a, (1 − σ, v′)) ∈ Δ iff for all i, whenever a ∈ Σi and vi ∈ Vi^(σ), then
  also (vi, a, vi′) ∈ Δi, and vi = vi′ otherwise.
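For intuition, the synchronized transition rule above can be sketched as a single step function in Python (an illustrative sketch only; arenas are dict-encoded with keys 'V0', 'V1', 'Sigma', 'Delta' of our choosing, and for simplicity the sketch assumes at most one a-successor per component):

```python
def sync_step(arenas, sigma, v, a):
    """One synchronized move (Definition 2 sketch), or None if blocked.

    Each labeled arena is a dict with keys 'V0', 'V1', 'Sigma' (its
    alphabet) and 'Delta' (a set of (source, letter, target) triples).
    On letter a, every component whose alphabet contains a and whose
    vertex belongs to V_i^(sigma) must take an a-transition; all other
    components keep their vertex.
    """
    w = list(v)
    for i, ar in enumerate(arenas):
        active = a in ar['Sigma'] and v[i] in (ar['V0'] if sigma == 0 else ar['V1'])
        if not active:
            continue
        succ = [t for (s, l, t) in ar['Delta'] if s == v[i] and l == a]
        if not succ:
            return None  # letter a is blocked in component i
        w[i] = succ[0]   # deterministic toy: take the unique successor
    return (1 - sigma, tuple(w))
```

A letter outside a component's alphabet leaves that component untouched, while a letter it knows but cannot fire blocks the whole synchronized move, mirroring the "whenever ... then also" clause.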
Remark 1. Neither the parallel product nor the synchronized product is associative
in general. This is due to the fact that we absorb the information about
whose turn it is into the arena. We do so for technical reasons; it is nonessential
for the results.
In this paper we study ω-regular games. We assume the reader is familiar with
the elementary theory of ω-regular games. For an introduction see [4, 5]. In the
following, we recall some terminology. A game is a tuple G = (A, W, v0) =
(A, W) consisting of an arena A = (V, E), a winning condition W ⊆ V^ω, and
an initial vertex v0. We always assume that there is a designated initial vertex,
even if we do not always list it explicitly. G is ω-regular if W is ω-regular.
We denote the players by player 0 and player 1. A play in G is an infinite path
π = v0 v1 v2 · · · through A, starting from v0 . On nodes in V (0) player 0 chooses the
next vertex. Otherwise, player 1 chooses. The play is won by player 0 if π ∈ W .
We denote the winning set of player σ by W^(σ) = W^(σ)(G). The attractor for
player p on a set F is denoted by Attr_p^A(F) and defined as usual. It is the set of
vertices from which p can enforce a visit to F.
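The attractor can be computed by the standard backward fixed-point; here is a naive Python sketch (illustrative set-based encoding; the classical linear-time algorithm instead keeps a successor counter per opponent vertex rather than rescanning):

```python
def attractor(V0, V1, E, p, F):
    """Attr_p(F): vertices from which player p can force a visit to F.

    Backward fixed-point: a player-p vertex joins the attractor once
    SOME successor is inside, an opponent vertex once ALL successors
    are.  V0/V1 are the vertex sets of players 0 and 1, E a set of
    edges.  (Convention: opponent dead ends are not added here.)
    """
    succ = {v: set() for v in V0 | V1}
    for (u, w) in E:
        succ[u].add(w)
    mine = V0 if p == 0 else V1
    attr = set(F)
    changed = True
    while changed:
        changed = False
        for u in (V0 | V1) - attr:
            s = succ[u]
            if (u in mine and s & attr) or (u not in mine and s and s <= attr):
                attr.add(u)
                changed = True
    return attr
```

In the small example below, player 0 at 'a' can force 'f' directly, so the whole cycle lies in Attr_0({'f'}), while player 1 cannot force anything beyond F itself.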
To study games on composite arenas, we require some additional notation.
Consider a game G = (A, W ) on a composite arena A = A1 ∗ · · · ∗ Ak , with ∗ ∈
{, ⊗}. We call A1 , . . . , Ak the constituent arenas of A. A game Gi = (Ai , Wi )
for some Wi ⊆ Viω is called a component game.
The winning condition W is necessarily given by means of some finite repre-
sentation. In this paper we consider mainly reachability conditions, which are
determined by a set F ⊆ V . A play π satisfies the reachability condition F if
π(i) ∈ F for some i ∈ ℕ = {0, 1, 2, . . .}.
It is sometimes convenient to specify properties on a path in some logic. In
this paper we use LTL to express temporal properties on paths. We again assume
the reader is familiar with LTL (see [4, 5] for an introduction). We write ψ U φ
for the strict until (φ is true eventually, and, until then, ψ holds).

3 Strategy Machines and Strategy Composition


Classically, strategies are represented using (usually finite) automata. Automata
are a state space view on a computational system. They abstract away from
implementation details. This comes at the price of losing information about
important implementation aspects, such as runtime and space usage.
The abstract view of automata is sometimes coarser than required. To aug-
ment this view, in [9] strategy machines were introduced. Strategy machines are
Turing machines with three designated tapes, the IO-tape, the computation tape
and the memory tape. The semantics of a strategy machine can intuitively be
described as follows. A vertex (encoded in binary) appears on the IO-tape. The
strategy machine inspects the content of its memory tape and computes a new
vertex (using all three tapes). The content of the memory tape is updated and
the computation tape is cleared. The new vertex is written on the output tape
and the process repeats. We now recall the definition from [9]. Let 𝔹̂ = 𝔹 ∪ {#}.
Definition 3 (Strategy Machine). A strategy machine is a deterministic
3-tape Turing machine M = (Q, 𝔹, 𝔹̂, qI, qO, δ) with

– a finite set Q of states
– tape alphabet 𝔹̂ and input alphabet 𝔹
– two designated states, qI, the input state, and qO, the output state
– a designated IO-tape tIO
– a designated memory tape tM
– a designated computation tape tC

The partial transition function δ : Q × 𝔹̂³ → Q × 𝔹̂³ × {−1, 0, 1}³ satisfies

– δ(q, b) ≠ (qI, b′, d) for all q ∈ Q, b, b′ ∈ 𝔹̂³ and d ∈ {−1, 0, 1}³.
– δ(qO, b) is undefined for all b ∈ 𝔹̂³.
Since the tape and input alphabets are always the same, we usually omit them
in the list of components of a strategy machine.
We sketch the semantics of a strategy machine (a formal definition can be
found in [9]). Configurations are defined as usual. An iteration of M is a sequence
of configurations beginning with an initial configuration (with state qI) and
ending with a terminal configuration (with state qO). By definition of δ, qI and
qO appear exactly at the beginning and at the end of an iteration. The iteration
beginning in configuration c is unique (if it exists) and depends only on the input
tIO(c) on the IO-tape and on the content tM(c) of the memory tape. We write c̄
for the unique terminal configuration reachable from c and we write (c, c̄) for the
entire iteration. Its length, the number of computation steps, is denoted by L(c, c̄).
Strategy machines are intended to implement functions on sequences of inputs.
Let π ∈ (𝔹*)^∞ = (𝔹*)* ∪ (𝔹*)^ω, i.e. π(i) ∈ 𝔹* for all i ∈ dom(π). Let
cπ,0 = (qI, π(0), ε, ε, 0, 0, 0). We define cπ,i+1 to be the configuration which
inherits the memory tape content from the terminal configuration of the i-th
iteration and has π(i + 1) as input on the IO-tape. Formally, cπ,i+1 =
(qI, π(i + 1), tM(c̄π,i), ε, 0, 0, 0). We also say that (cπ,i, c̄π,i) and (cπ,i+1, c̄π,i+1)
are compatible. An iteration is admissible if it is of the form (cπ,i, c̄π,i) for some
π ∈ (𝔹*)^ω. A sequence of pairwise compatible iterations (c0, c̄0)(c1, c̄1) · · · is
called a computation of M. Define fM(π) = tIO(c̄π,0) · tIO(c̄π,1) · · · . Thus fM
maps strings from (𝔹*)^∞ to (𝔹*)^∞ (where, in our setting, elements of 𝔹* are
encoded vertices). We say M implements fM. We sometimes identify M with
fM and say, for instance, that M is a winning strategy.
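The semantics can be pictured by abstracting one whole iteration (from qI to qO) into a single function of the memory content and the current input. The following toy sketch is not the Turing-machine semantics itself, only the shape of the computation it induces:

```python
def run_strategy(step, inputs):
    """Drive an abstract strategy machine over a sequence of inputs.

    'step' collapses one full iteration (q_I ... q_O) into a function:
    (memory tape content, vertex on the IO-tape) -> (updated memory,
    output vertex).  The computation tape is implicit, since it is
    cleared between iterations.  The returned list plays the role of
    f_M applied to the input sequence.
    """
    memory = ''          # initially empty memory tape
    outputs = []
    for v in inputs:
        memory, out = step(memory, v)
        outputs.append(out)
    return outputs
```

For instance, a one-cell-memory strategy that always answers with the previously seen vertex needs only the memory tape between iterations, which is exactly what the model charges to S(M).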
One of the benefits of strategy machines is the complexity measures they offer
to evaluate a strategy. We define the latency T(M) and the space requirement
S(M) of a strategy machine M as follows:

T(M) = sup_{π ∈ (𝔹*)^ω} sup_{n ∈ ℕ} L(cπ,n, c̄π,n)

S(M) = sup_{π ∈ (𝔹*)^ω} sup_{n ∈ ℕ} |tM(c̄π,n)|

Finally, the size of M is the number ‖M‖ = |Q| of its control states.
In modular programming, subroutines are a central concept. Let us formalize
this notion. An n-template is a strategy machine M = (QM, qM,I, qM,O, δM)
with 2n distinguished states sub1, ret1, . . . , subn, retn such that δM(subi, b) is
undefined for all 1 ≤ i ≤ n and all b ∈ 𝔹̂³. Let M be an n-template and let
Si = (Qi, qi,I, qi,O, δi), 1 ≤ i ≤ n, be strategy machines. Define the strategy
machine M[S1, . . . , Sn] = (QM ⊎ ⋃_i Qi, qM,I, qM,O, δ) by δ(subi, b) = δi(qi,I, b)
and δ(qi,O, b) = (reti, b, 0, 0, 0). In all other cases δ coincides with δM or δi,
whenever this makes sense. Such a machine is called a composition of S1, . . . , Sn.
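The wiring of δ can be sketched with dict-encoded transition functions (an illustrative sketch; state names are assumed to be globally unique strings, and transitions are dicts from (state, tape symbols) to (state, written symbols, head moves) — an encoding of our choosing, not the paper's):

```python
def compose(template, machines):
    """Build M[S_1, ..., S_n] from an n-template and n strategy machines.

    The template's undefined 'sub_i' states inherit machine i's
    behaviour in its initial state (delta(sub_i, b) = delta_i(q_{i,I}, b)),
    and machine i's output state hands control back to 'ret_i' without
    touching the tapes (delta(q_{i,O}, b) = (ret_i, b, (0, 0, 0))).
    """
    delta = dict(template['delta'])
    symbols = {b for (_, b) in delta}
    for m in machines:
        symbols |= {b for (_, b) in m['delta']}
    for i, m in enumerate(machines, start=1):
        delta.update(m['delta'])
        for b in symbols:
            if (m['qI'], b) in m['delta']:
                delta[(f'sub{i}', b)] = m['delta'][(m['qI'], b)]
            delta[(m['qO'], b)] = (f'ret{i}', b, (0, 0, 0))
    return delta
```

The template keeps control of when a subroutine is called; the subroutines keep control of what happens between sub_i and ret_i, which is exactly the separation Definition 4 exploits.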

Definition 4. Let ∗ ∈ {‖, ⊗} and let G = (A1 ∗ · · · ∗ Ak, W).

1. Let G1, . . . , Gk be component games with winning strategies S1, . . . , Sk for
   player 0. A k-template M such that M[S1, . . . , Sk] is a winning strategy in G
   is called a winning composition of S1, . . . , Sk. If M is a winning composition
   of any choice S1, . . . , Sk of winning strategies for player 0 in G1, . . . , Gk, it
   is called a winning composition of G1, . . . , Gk.
2. A class Λ of games on composite arenas is said to admit polynomial
   compositions if for every G = (A1 ∗ · · · ∗ Ak, W) ∈ Λ there exist component
   games G1, . . . , Gk and a polynomial sized winning composition M of
   G1, . . . , Gk such that, for some choice S1, . . . , Sk of component winning
   strategies, T(M[S1, . . . , Sk]) ∈ poly(‖G‖). Such a tuple (M, S1, . . . , Sk) is
   called a polynomial composition.

Part of the appeal of polynomial compositions is their efficiency: by definition,
a class of games admitting polynomial compositions enables us to find strategies
with a polynomial latency (and thus a polynomial space requirement) which,
depending on G1 , . . . , Gk , have polynomial size. Note that for positionally de-
termined component games this is always the case. The converse holds for trivial
reasons. Any strategy machine M of polynomial size implementing a winning
strategy with latency bounded polynomially can trivially be seen as a polynomial
composition of any choice of machines S1 , . . . , Sk .
We elaborate a bit on the restrictions we impose in the above definition.
The requirement that all strategies in component games are interchangeable
is to ensure that we compose general component games, not specific choices of
strategies. In the definition of polynomial composition, the restriction on the size
is to avoid templates which never call their subroutines and instead implement
the entire global winning strategy on their own. Likewise, the restriction on the
latency is to avoid enabling too powerful computations during the course of a
single iteration (such as, for instance, solving the entire game every turn).

4 Games on Parallel Products

In this section we study reachability games over parallel products of arenas. If
the arena is given as a composition of smaller arenas, it is natural to also study
several compositional ways of specifying the reachability condition. Let Fi ⊆ Vi .
We study the following formalisms:

1. local reachability, where F = Floc(F1, . . . , Fk) is the set of all v ∈ ∏_i Vi with
   vi ∈ Fi for some i
2. synchronized reachability, where F = Fsync(F1, . . . , Fk) is the set of all v ∈
   ∏_i Vi with vi ∈ Fi for every i

If F = F(F1, . . . , Fk), F ∈ {Floc, Fsync}, the set player 0 has to reach is 𝔹 × F,
i.e. the 𝔹-component does not influence the outcome of the game.
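In a tuple/set encoding, the two target sets differ only in a quantifier (illustrative sketch):

```python
def in_F_loc(v, Fs):
    """Local reachability target: SOME component is in its F_i."""
    return any(v[i] in Fs[i] for i in range(len(Fs)))

def in_F_sync(v, Fs):
    """Synchronized reachability target: EVERY component is in its F_i."""
    return all(v[i] in Fs[i] for i in range(len(Fs)))
```

Here v is the component tuple of a product vertex (the 𝔹-flag is dropped, since it does not influence membership).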

Remark 2. One might consider asynchronous reachability, where all components
Fi must be reached, but not necessarily at the same time. We omit this condition
here, because it is not expressible as a reachability condition on the composite
arena.

Theorem 1. Games of the form G = (A1 ‖ · · · ‖ Ak, Floc(F1, . . . , Fk)) admit
polynomial compositions where the component games in def. 4 can be chosen as
reachability games. In particular, positional component strategies suffice.
Moreover, a polynomial composition can be computed in polynomial time.

We omit the full proof due to space constraints. However, the idea is to show that
deciding the winning set can be done by deciding conditions on the components:
player 0 wins from (σ0, v0) iff one of the following two applies:

1. There exists i with vi,0 ∈ Vi^(0) and vi,0Ei ∩ Fi ≠ ∅.
2. |{i | vi,0 ∈ Attr_0^{Ai}(Fi)}| ≥ rank1

The proof of this characterization gives component strategies for both players,
which can be composed to a winning strategy for the respective player in G.
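The two conditions translate directly into a decision procedure; here is a naive Python sketch (illustrative dict encoding as before, with a quadratic attractor; it implements exactly the characterization stated above, nothing more):

```python
def attr0(arena, F):
    """Naive player-0 attractor of F inside one component arena."""
    succ = {}
    for (u, w) in arena['E']:
        succ.setdefault(u, set()).add(w)
    A, changed = set(F), True
    while changed:
        changed = False
        for u in (arena['V0'] | arena['V1']) - A:
            s = succ.get(u, set())
            if (u in arena['V0'] and s & A) or (u in arena['V1'] and s and s <= A):
                A.add(u)
                changed = True
    return A

def player0_wins_local(arenas, Fs, sigma0, v0):
    """Test the two conditions characterizing the winning set of the
    local-reachability game on a parallel product (Theorem 1 sketch)."""
    k = len(arenas)
    # condition 1: some player-0 component can move straight into F_i
    for i, a in enumerate(arenas):
        if v0[i] in a['V0'] and any(u == v0[i] and w in Fs[i] for (u, w) in a['E']):
            return True
    # condition 2: at least rank_1 components lie in the player-0 attractor
    rank0 = sum(v0[i] in a['V0'] for i, a in enumerate(arenas)) + sigma0
    rank1 = k - rank0 + 1
    good = sum(v0[i] in attr0(a, Fs[i]) for i, a in enumerate(arenas))
    return good >= rank1
```

Note that everything is computed per component, which is the source of the Ptime bound in Corollary 1.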
Next, we consider synchronized reachability. We have:

Theorem 2. Games of the form G = (A1 ‖ · · · ‖ Ak, Fsync(F1, . . . , Fk)) admit
polynomial compositions. The component games in def. 4 can be chosen as
positionally determined games. A polynomial composition can be computed in
polynomial time.

The proof idea is to characterize the winning set by considering component
games. The proof of the characterization gives polynomial compositions for both
players. The characterization is more involved than in thm. 1. We split the
problem into subcases. For space reasons, we only state the characterization for
the case rank1 = 1 and the case where both rank1 > 1 and rank0 > 1 and σ0 = 1.

Lemma 1. In thm. 2, let (σ0, v0) be such that rank1 = 1. Then player 0 wins
from (σ0, v0) iff all of the following hold:

1. for all i we have vi,0 ∈ Attr_0^{Ai}(Fi)
2. |{i | vi,0 ∈ Attr_0^{Ai}(Fi ∩ Vi^(0))}| ≥ k − 1
3. if σ0 = 1 and vj,0 ∈ Fj ∩ Vj^(1) for some j, then vi,0 ∈ Fi for all i ≠ j or
   vj,0Ej ⊆ Attr_0^{Aj}(Fj)
A polynomial winning composition for player 0 (resp. player 1) with respect
to positional strategies in component reachability (resp. safety) games can be
computed in polynomial time.
In lem. 1 the component games are also reachability games (just like in thm. 1).
In the next lemma, where rank0 > 1 and rank1 > 1, this is no longer the case.
Instead, we use games with a temporal winning condition of low complexity. We
call a game G = (A, W ) a reachability game with safety constraint if W is given
by an LTL-formula ϕ = S U F with sets S, F ⊆ VA . We have:
Proposition 1. Every reachability game with safety constraint is determined
with positional strategies. Moreover, a winning strategy for both players can be
computed in time O(|VA | + |EA |), if it exists.
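A solver for this condition is the usual attractor computation in which only S-vertices may be added on top of F; here is a naive Python sketch (illustrative; achieving the O(|VA| + |EA|) bound of Prop. 1 additionally requires predecessor lists and counters instead of rescanning):

```python
def solve_until(V0, V1, E, S, F):
    """Player-0 winning region for the condition 'S U F': visit F while
    all earlier positions stay in S.  Backward fixed-point in which
    only vertices of S may be added on top of the target set F."""
    succ = {}
    for (u, w) in E:
        succ.setdefault(u, set()).add(w)
    W, changed = set(F), True
    while changed:
        changed = False
        for u in ((V0 | V1) & S) - W:
            s = succ.get(u, set())
            if (u in V0 and s & W) or (u in V1 and s and s <= W):
                W.add(u)
                changed = True
    return W
```

Shrinking S shrinks the winning region: a vertex outside S can never be added, even if all its successors already win.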
For simplicity, we exclude the trivial case where the initial position is already in
Fsync (F1 , . . . , Fk ). We consider the case where player 1 moves first.
Lemma 2. In thm. 2, let rank0 > 1 and rank1 > 1. Then player 0 wins from
(1, v0) ∉ Fsync(F1, . . . , Fk) iff all of the following constraints are met:

1. vi,0 ∈ Vi^(1) =⇒ (vi,0 ∈ Fi ∧ ∀vi′ ∈ vi,0Ei : vi′ ∈ Fi ∨ vi′Ei ∩ Fi ≠ ∅)
2. vi,0 ∈ Vi^(0) \ Fi =⇒ vi,0Ei ∩ Fi ≠ ∅
3. |{i | vi,0 ∈ W^(0)(Ai, (Vi^(0) ∪ Fi) U (Vi^(0) ∩ Fi))}| ≥ k − 1

A polynomial winning composition for player 0 (resp. player 1) with respect to
positional strategies in component reachability games with safety constraint can
be computed in polynomial time.
We see that in the case of synchronized reachability we have to use a stronger
notion of winning condition in the component games than we did in the global
game. What kind of component games we need depends on the initial position.
The polynomial composition in thm. 2 relies on positional winning strategies
in component games. This is closely tied to complexity in the following way:
Decision Problem (PARALLEL-REACH[F]).
Input: Arenas A1, . . . , Ak, a vertex s = (σ0, v) ∈ 𝔹 × ∏_{i=1}^{k} Vi and sets Fi ⊆ Vi
Decide: Is s ∈ W^(0) in the game G = (A1 ‖ · · · ‖ Ak, F(F1, . . . , Fk))?
Corollary 1. Let F ∈ {Floc , Fsync }. PARALLEL-REACH[F ] is Ptime-complete.
We also have the following corollary of thm. 1 and thm. 2:
Corollary 2. Let G = (A1 ‖ · · · ‖ Ak, F(F1, . . . , Fk)), where F ∈ {Floc, Fsync},
be a reachability game. There exists a winning strategy machine M for player 0
of size ‖M‖ ∈ poly(∑_{i=1}^{k} ‖Ai‖) and latency T(M) ∈ poly(∑_{i=1}^{k} ‖Ai‖).

5 Games on Synchronized Products


In this section we investigate arenas obtained via the synchronized product. Here
the situation is quite different. We first consider the complexity of deciding the
winning region in those games:
Decision Problem (SYNC-REACH[F]).
Input: Arenas A1, . . . , Ak, a vertex s = (σ0, v) ∈ 𝔹 × ∏_{i=1}^{k} Vi and sets Fi ⊆ Vi
Decide: Is s ∈ W^(0) in the game G = (A1 ⊗ · · · ⊗ Ak, F(F1, . . . , Fk))?
Theorem 3. Let F ∈ {Floc , Fsync }. SYNC-REACH[F ] is Exptime-complete.
Proof. We only show the claim for F = Floc. The proof for F = Fsync is an
adaptation of this proof.
Membership in Exptime is trivial (for instance, using a classical attractor
computation on an explicit, in-memory product graph). We therefore focus on
hardness.
The main idea is to reduce the acceptance problem of an APspace Turing
machine. Since it is APspace, only polynomially many tape cells are in use. We
introduce a labeled arena Ai for each cell plus one additional labeled arena AH
storing both the state of the machine and the head position. Letters in Ai are
indexed by i so that transitions may target a specific tape cell. A tape cell is
updated by having the players choose a transition label aligning the head position
in AH with the index of the cell to be updated. Since the transitions in AH
cannot observe the states of the Ai, the players may cheat with respect to the
content of the tape cells. The construction below introduces a mechanism that
enables players to challenge an opponent's cheating moves and win.
Let M = (Q∃, Q∀, Γ̂, Γ, q0, ΔM, {qF}) be a bipartite APspace machine. We
suppose Γ̂ is the input alphabet and Γ ⊇ Γ̂ is the tape alphabet. Let Q =
Q∃ ⊎ Q∀. Suppose w ∈ Γ̂* is an input to M. We assume that M accepts with
exactly one final state qF ∈ Q∃ and that qF is never visited before termination.
We may also assume that every configuration has at least one outgoing transition.
Suppose p is a polynomial bounding the space of M.
We define p(|w|) = n automata A1, . . . , An as follows. For every i = 1, . . . , n,
let Γi = Γ × {i}. Write γi for (γ, i) ∈ Γi. All n automata have the same
alphabet Σ = ⋃_{i=1}^{n} {vetoi} ∪ ⋃_{i=1}^{n} Γi². We define Ai = (Ai, Σ, δi) with
Ai = (Γ ∪ {⊥0, ⊥1}) × 𝔹, where δi : Ai × Σ → Ai is as follows:

δi((γ̂, σ), (γj, γj′)) = (γ′, 1 − σ)   if i = j and γ̂ = γ
δi((γ̂, σ), (γj, γj′)) = (⊥σ, 1 − σ)   if i = j but γ̂ ≠ γ
δi((γ̂, σ), (γj, γj′)) = (γ̂, 1 − σ)   if i ≠ j

for all j ∈ {1, . . . , n}, all γ̂ ∈ Γ, γj, γj′ ∈ Γj and all σ ∈ 𝔹. We also define

δi((⊥p, σ), (γj, γj′)) = (⊥p, 1 − σ)

for all j ∈ {1, . . . , n}, all p, σ ∈ 𝔹 and all γj, γj′ ∈ Γj.
Furthermore, a transition labeled with vetoj is defined on all player 1 states:

δi((s, 1), vetoj) = (⊥1, 0)   if i = j and s ∈ Γ
δi((s, 1), vetoj) = (s, 0)   if i ≠ j or s = ⊥p, p ∈ 𝔹

In particular, player 0 can never play vetoi on his components. The partition of
Ai into player 0 and player 1 states is given by Ai^(σ) = {(s, σ) | s ∈ Γ ∪ {⊥0, ⊥1}}.
272 M. Gelderie

Next, we define AH with states AH = (Q × {1, . . . , n}) ⊎ {C, (+, 0), (⊥, 0)},
where AH^(0) = {(q, h) | q ∈ Q∃} ∪ {(+, 0), (⊥, 0)} and AH^(1) = {(q, h) | q ∈
Q∀} ∪ {C}. The alphabet of this automaton is again Σ (as defined above). Its
transition relation ΔH is defined by

((q, h), (γj, γj′), (q′, h′)) ∈ ΔH ⇐⇒ h = j ∧ (q, γ, q′, γ′, d) ∈ ΔM ∧ h′ = h + d

Note that “illegal” transitions are impossible. The players can only cheat with
respect to the content of the h-th tape cell. Also, no transition labeled with vetoi
for any i is possible from a state (q, h). In addition, we now have the following
transitions:

((qF, h), (γi, γi′), C) ∈ ΔH for all i, h ∈ {1, . . . , n} and γ, γ′ ∈ Γ
(C, vetoi, (⊥, 0)) ∈ ΔH for all i ∈ {1, . . . , n}
(C, (γi, γi′), (+, 0)) ∈ ΔH for all i ∈ {1, . . . , n} and γ, γ′ ∈ Γ
 
Suppose q0 ∈ Q∃. The play begins in position ((q0, 1, 0), (#1, 0), . . . , (#n, 0)),
where #i ∈ Γi is the blank symbol of M. Player 0 moves (i.e. picks a letter) at
all states in which AH is in a state from Q∃, player 1 if it is in a state from Q∀.
This is ensured by the definition of the transition function δi, which guarantees
that each component changes from a σ-state to a (1 − σ)-state in every round.
The states (⊥, 0) and (+, 0) in AH are 0-states without outgoing transitions.
The set player 0 tries to reach is Floc({(+, 0)}, {(⊥1, 0)}, . . . , {(⊥1, 0)}).
We now show the correctness of the above construction. If M accepts w, then
player 0 has a winning strategy in the reachability game on the configuration
graph of M on w. If player 1 does not cheat and player 0 plays according to this
strategy, the play will finally reach a state ((qF, h), x1, . . . , xn) with xi ≠ (⊥p, 0)
for all p ∈ 𝔹 and i ∈ {1, . . . , n}. Recall that qF ∈ Q∃. Now player 0 must
move to C. Unless player 0 cheats (which is clearly a suboptimal choice at this
point), this implies that every component i moves from xi = (γi, 0) to (γi, 1)
by the definition of δi. Player 1 can play vetoi for some i. However, since the
i-th component is in state (γi, 1) for some γi ∈ Γi, this results in the i-th
component making a transition to state (⊥1, 0). Thus player 0 wins. If
player 1 plays (γi, γi′) for some i, the play reaches (+, 0) and thus player 0 wins.
If player 1 made an illegal transition at some point in the play, then for some i,
the state of Ai loops between (⊥1, 0) and (⊥1, 1) from that point onwards and,
again, player 0 wins.
Conversely, if M rejects w, then player 1 has a winning strategy in the safety
game on the configuration graph of M on w. This implies that, unless player 0
uses an illegal transition, the play never reaches state qF. On the other hand, if
player 0 does make an illegal transition, one component, say i, changes to state
(⊥0, 1) and remains in {(⊥0, σ) | σ ∈ 𝔹} from this point onwards. If the play
ever reaches qF after that, and thereafter reaches C, player 1 can play vetoi,
moving AH into state (⊥, 0). Component i is in state (⊥0, 1) when AH is in
C, whereby Ai never reaches state (⊥1, 0). Since player 1 never has to make an
illegal transition, no component j is in a state (⊥1, 0). Hence player 0 loses. ⊓⊔
The high complexity of SYNC-REACH[F] for F ∈ {Floc, Fsync} prohibits
polynomially computable polynomial compositions for reachability games on
synchronous arenas (with local or synchronized reachability conditions). Indeed,
finding such compositions would amount to showing Exptime = Ptime. What
can we do differently in order to succeed?
In order to find a winning composition of polynomial size, one might suspect
that more complex component games are necessary. In this event, finding
strategies in the component games would be more difficult, sparing us the
complexity dilemma. Unfortunately, this turns out to be false.
Remark 3. In the reduction above, the constituent arenas are of size ≤ c for some
constant c ∈ ℕ (essentially the size of some APspace Turing machine M deciding
an Exptime-complete problem). Thus, more complex winning conditions
on those arenas will not increase the complexity of finding winning strategies.
Another possibility is to loosen the notion of a polynomial composition. Recall
that a winning composition M of strategies S1 , . . . , Sk is a polynomial winning
composition if M[S1 , . . . , Sk ] has polynomial latency and M has polynomial
size. We now loosen this requirement as follows: A winning composition M is a
poly-space composition if M is of polynomial size and the space requirement of
M[S1 , . . . , Sk ] is polynomial.
Lemma 3. Let G = (A1 ⊗ · · · ⊗ Ak , F (F1 , . . . , Fk )), where F ∈ {Floc , Fsync }.
Given G, a strategy machine M with polynomial space requirement implementing
a strategy for player 0 in G from position p0 = (σ0, v0), and sets F1, . . . , Fk, it is
decidable in Pspace whether or not M implements a winning strategy from p0.
The proof uses APtime = Pspace to verify that there is an M-consistent loop
which does not visit F = F (F1 , . . . , Fk ) and can be reached without visiting F .
Theorem 4. The class of reachability games over synchronized products admits
poly-space compositions iff Pspace = Exptime.
Proof. Clearly Pspace = Exptime implies that a strategy machine with a poly-
nomial space requirement can compute the next move in some attractor strategy
in the course of a single iteration.
Conversely, if the class admits poly-space compositions, there always exists a
polynomial sized winning strategy machine with a polynomial space requirement
(assuming the component games are bounded by some constant, cf. Rem. 3) for
player 0 (if he wins). Hence, the following NPspace procedure is correct: we guess
a strategy machine of polynomial size and verify in Pspace whether it implements
a winning strategy. By Savitch's theorem, Pspace = NPspace. ⊓⊔

6 Conclusion
We studied the relation between the compositional nature of an arena and the
structure of a winning strategy. To this end we introduced two kinds of prod-
ucts on arenas, the parallel and the synchronized product. We defined a notion
of strategy composition which relies on strategy machines. This notion of
composition allows one to translate winning strategies in component games into
winning strategies in the global game. We proved such a composition theorem for
the class of reachability games on parallel products. We also showed that a similar
result holds on synchronized products iff Exptime = Pspace.
The results of this paper carry through to Büchi games with only minor
modifications. We also have results on the case where the reachability condition
is given explicitly (instead of as a sequence of k sets). For future research we want
to consider more complex winning conditions, such as parity and weak parity.
Also, we want to treat different ways of modeling the composite game from
constituent arenas, addressing notions of composition from the field of process
algebra and formal verification.

Acknowledgments. I would like to thank the anonymous reviewers for many


helpful suggestions, both for the presentation and for future research.

References
1. Büchi, J.R., Landweber, L.H.: Solving Sequential Conditions by Finite-State
Strategies. Trans. of the AMS 138, 295–311 (1969)
2. McNaughton, R.: Infinite games played on finite graphs. Annals of Pure and Ap-
plied Logic 65(2), 149–184 (1993)
3. Zielonka, W.: Infinite games on finitely coloured graphs with applications to au-
tomata on infinite trees. Theor. Comput. Sci. 200, 135–183 (1998)
4. Grädel, E., Thomas, W., Wilke, T. (eds.): Automata logics, and infinite games: a
guide to current research. Springer, New York (2002)
5. Löding, C.: Infinite games and automata theory. In: Apt, K.R., Grädel, E. (eds.)
Lectures in Game Theory for Computer Scientists. Cambridge U. P. (2011)
6. Baier, C., Katoen, J.: Principles of Model Checking. MIT Press (2008)
7. Fearnley, J., Peled, D., Schewe, S.: Synthesis of succinct systems. In: Chakraborty,
S., Mukund, M. (eds.) ATVA 2012. LNCS, vol. 7561, pp. 208–222. Springer,
Heidelberg (2012)
8. Harel, D., Kupferman, O., Vardi, M.Y.: On the complexity of verifying concurrent
transition systems. Inf. Comput. 173(2), 143–161 (2002)
9. Gelderie, M.: Strategy machines and their complexity. In: Rovan, B., Sassone,
V., Widmayer, P. (eds.) MFCS 2012. LNCS, vol. 7464, pp. 431–442. Springer,
Heidelberg (2012)
10. Goldin, D.Q., Smolka, S.A., Wegner, P.: Turing machines, transition systems, and
interaction. Electr. Notes Theor. Comput. Sci. 52(1), 120–136 (2001)
11. Hunter, P., Dawar, A.: Complexity bounds for regular games (extended ab-
stract). In: Jedrzejowicz, J., Szepietowski, A. (eds.) MFCS 2005. LNCS, vol. 3618,
pp. 495–506. Springer, Heidelberg (2005)
12. Dawar, A., Horn, F., Hunter, P.: Complexity Bounds for Muller Games. Theoretical
Computer Science (2011) (submitted)
13. Horn, F.: Explicit Muller Games are PTIME. In: FSTTCS, pp. 235–243 (2008)
Asynchronous Games over Tree Architectures

Blaise Genest1, Hugo Gimbert2, Anca Muscholl2, and Igor Walukiewicz2

1 IRISA, CNRS, Rennes, France
2 LaBRI, CNRS/Université Bordeaux, France

Abstract. We consider the distributed control problem in the setting
of Zielonka asynchronous automata. Such automata are compositions of
finite processes communicating via shared actions and evolving asyn-
chronously. Most importantly, processes participating in a shared action
can exchange complete information about their causal past. This gives
more power to controllers, and avoids simple pathological undecidable
cases as in the setting of Pnueli and Rosner. We show the decidability of
the control problem for Zielonka automata over acyclic communication
architectures. We provide also a matching lower bound, which is l-fold
exponential, l being the height of the architecture tree.

1 Introduction
Synthesis is by now well understood in the case of sequential systems. It is useful
for constructing small, yet safe, critical modules. Initially, the synthesis problem
was stated by Church, who asked for an algorithm to construct devices trans-
forming sequences of input bits into sequences of output bits in a way required
by a specification [2]. Later Ramadge and Wonham proposed the supervisory
control formulation, where a plant and a specification are given, and a controller
should be designed such that its product with the plant satisfies the specifica-
tion [18]. So control means restricting the behavior of the plant. Synthesis is the
particular case of control where the plant allows for every possible behavior.
For synthesis of distributed systems, a common belief is that the problem
is in general undecidable, referring to work by Pnueli and Rosner [17]. They
extended Church’s formulation to an architecture of synchronously communicat-
ing processes, that exchange messages through one slot communication channels.
Undecidability in this setting comes mainly from partial information: specifica-
tions permit to control the flow of information about the global state of the
system. The only decidable type of architectures is that of pipelines.
The setting we consider here is based on a by now well-established model
of distributed computation using shared actions: Zielonka’s asynchronous au-
tomata [20]. Such a device is an asynchronous product of finite-state processes
synchronizing on common actions. Asynchronicity means that processes can
progress at different speed. Similarly to [6,12] we consider the control problem
for such automata. Given a Zielonka automaton (plant), find another Zielonka
automaton (controller) such that the product of the two satisfies a given spec-
ification. In particular, the controller does not restrict the parallelism of the

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 275–286, 2013.
© Springer-Verlag Berlin Heidelberg 2013

system. Moreover, during synchronization the individual processes of the controller can exchange all their information about the global state of the system.
This gives more power to the controller than in the Pnueli and Rosner model,
thus avoiding simple pathological scenarios leading to undecidability. It is still
open whether the control problem for Zielonka automata is decidable.
In this paper we prove decidability of the control problem for reachability
objectives on tree architectures. In such architectures every process can com-
municate with its parent, its children, and with the environment. If a controller
exists, our algorithm yields a controller that is a finite state Zielonka automa-
ton exchanging information of bounded size. We also provide the first non-trivial
lower bound for asynchronous distributed control. It matches the l-fold expo-
nential complexity of our algorithm (l being the height of the architecture tree).
As an example, our decidability result covers client-server architectures where
a server communicates with clients, and server and clients have their own interac-
tions with the environment (cf. Figure 1). Our algorithm providing a controller
for this architecture runs in exponential time. Moreover, each controller adds
polynomially many bits to the state space of the process. Note also that this
architecture is undecidable for [17] (each process has inputs), and is neither cov-
ered by [6] (the action alphabet is not a co-graph), nor by [12] (there is no bound
on the number of actions performed concurrently).
Related work. The setting proposed by Pnueli and Rosner [17] has been thor-
oughly investigated in past years. By now we understand that, suitably using
the interplay between specifications and an architecture, one can get undecid-
ability results for most architectures rather easily. While specifications leading
to undecidability are very artificial, no elegant solution to eliminate them exists
at present.
The paper [10] gives an automata-theoretic approach to solving pipeline archi-
tectures and at the same time extends the decidability results to CTL∗ specifica-
tions and variations of the pipeline architecture, like one-way ring architectures.
The synthesis setting is investigated in [11] for local specifications, meaning that
each process has its own, linear-time specification. For such specifications, it is
shown that an architecture has a decidable synthesis problem if and only if it
is a sub-architecture of a pipeline with inputs at both endpoints. The paper [5]
proposes information forks as a uniform notion explaining the (un)decidability
results in distributed synthesis. In [15] the authors consider distributed synthesis
for knowledge-based specifications. The paper [7] studies an interesting case of
external specifications and well-connected architectures.

Fig. 1. Server/client architecture



Synthesis for asynchronous systems has been strongly advocated by Pnueli and
Rosner in [16]. Their notion of asynchronicity is not exactly the same as ours:
it means roughly that system/environment interaction is not turn-based, and
processes observe the system only when scheduled. This notion of asynchronicity
appears in several subsequent works, such as [19,9] for distributed synthesis.
As mentioned above, we do not know whether the control problem in our
setting is decidable in general. Two related decidability results are known, both
of different flavor than ours. The first one [6] restricts the alphabet of actions:
control with reachability condition is decidable for co-graph alphabets. This re-
striction excludes among others client-server architectures. The second result [12]
shows decidability by restricting the plant: roughly speaking, the restriction says
that every process can have only bounded missing knowledge about the other
processes (unless they diverge). The proof of [12] goes beyond the controller
synthesis problem, by coding it into monadic second-order theory of event struc-
tures and showing that this theory is decidable when the criterion on the plant
holds. Unfortunately, very simple plants have a decidable control problem but
undecidable MSO-theory of the associated event structure. Melliès [14] relates
game semantics and asynchronous games, played on event structures. More re-
cent work [3] considers finite games on event structures and shows a determinacy
result for such games under some restrictions.
Organization of the Paper. The next section presents basic definitions. The two
consecutive sections present the algorithm and the matching lower bound. The
full version of the paper is available at http://hal.archives-ouvertes.fr/hal-00684223.

2 Basic Definitions and Observations

We start by introducing Zielonka automata and state the control problem for
such automata. We also give a game-based formulation of the problem.

2.1 Zielonka Automata

Zielonka automata are simple parallel finite-state devices. Such an automaton is a parallel composition of several finite automata, called processes, synchronizing on shared actions. There is no global clock, so between two synchronizations, two processes can do a different number of actions. Because of this Zielonka automata are also called asynchronous automata.
A distributed action alphabet on a finite set P of processes is a pair (Σ, dom), where Σ is a finite set of actions and dom : Σ → (2^P \ {∅}) is a location function. The location dom(a) of action a ∈ Σ comprises all processes that need to synchronize in order to perform this action. A (deterministic) Zielonka automaton A = ⟨{Sp}p∈P, sin, {δa}a∈Σ⟩ is given by:

– for every process p a finite set Sp of (local) states,
– the initial state sin ∈ ∏p∈P Sp,
– for every action a ∈ Σ a partial transition function δa : ∏p∈dom(a) Sp ⇀ ∏p∈dom(a) Sp on tuples of states of processes in dom(a).

For convenience, we abbreviate a tuple (sp)p∈P of local states by sP, for P ⊆ P. We also talk about Sp as the set of p-states and of ∏p∈P Sp as global states. Actions from Σp = {a ∈ Σ | p ∈ dom(a)} are called p-actions. For p, q ∈ P, let Σp,q = {a ∈ Σ | dom(a) = {p, q}} be the set of synchronization actions between p and q. We write Σp^loc instead of Σp,p for the set of local actions of p, and Σp^com = Σp \ Σp^loc for the synchronization actions of p.
A Zielonka automaton can be seen as a sequential automaton with the state set S = ∏p∈P Sp and transitions s −a→ s′ if (sdom(a), s′dom(a)) ∈ δa and sP\dom(a) = s′P\dom(a). By L(A) we denote the set of words labeling runs of this sequential automaton that start from the initial state. Notice that L(A) is closed under the congruence ∼ generated by {ab = ba | dom(a) ∩ dom(b) = ∅}, in other words, it
is a trace-closed language. A (Mazurkiewicz) trace is an equivalence class [w]∼
for some w ∈ Σ ∗ . The notion of trace and the idea of describing concurrency
by a fixed independence relation on actions goes back to the late seventies, to
Mazurkiewicz [13] (see also [4]).
Consider a Zielonka automaton A with two processes: P, P′. The local actions of process P are {u0, u1, c0, c1} and those of process P′ are {u′0, u′1, c′0, c′1}. In addition, there is a shared action $ with dom($) = {P, P′}. From the initial state, P can reach state (i, j) by executing ui cj; same for P′ with u′k c′l and state (k, l). Action $ is enabled in every pair ((i, j), (k, l)) satisfying i = l or k = j, and it leads to the final state. So L(A) = {[ui cj u′k c′l $] | i = l or k = j}.
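This example can be made concrete in code. The following Python sketch is ours, not from the paper (the function `run` and the tuple encoding of actions are hypothetical); it simulates the two-process automaton, writing P′ as "Q" and primed actions as "u'", "c'":

```python
# Sketch of the two-process example: P has actions u0, u1, c0, c1 and
# P' (called Q here) has u'0, u'1, c'0, c'1; $ synchronizes both processes.
def run(word):
    """Run the automaton on a word; return the global state, or None if stuck."""
    state = {"P": "init", "Q": "init"}
    for a in word:
        if a == "$":
            (i, j), (k, l) = state["P"], state["Q"]
            if i == l or k == j:              # $ enabled iff i = l or k = j
                state = {"P": "final", "Q": "final"}
            else:
                return None
        else:
            kind, x = a                       # e.g. ("u", 1) or ("c'", 0)
            p = "Q" if kind.endswith("'") else "P"
            s = state[p]
            if kind in ("u", "u'") and s == "init":
                state[p] = ("u", x)           # remember the u-letter
            elif kind in ("c", "c'") and isinstance(s, tuple) and s[0] == "u":
                state[p] = (s[1], x)          # now in local state (i, j)
            else:
                return None
    return state

# [u1 c1 u'0 c'1 $] is accepted: P is in (1, 1), P' in (0, 1), and i = l = 1.
assert run([("u", 1), ("c", 1), ("u'", 0), ("c'", 1), "$"])["P"] == "final"
# [u0 c0 u'1 c'1 $] is rejected: (0, 0), (1, 1) satisfy neither i = l nor k = j.
assert run([("u", 0), ("c", 0), ("u'", 1), ("c'", 1), "$"]) is None
```

Since the local alphabets are disjoint, any interleaving of a word in L(A) is accepted as well, which illustrates the trace-closure of L(A).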
As the notion of a trace can be formulated without a reference to an accept-
ing device, one can ask if the model of Zielonka automata is powerful enough.
Zielonka’s theorem says that this is indeed the case, hence these automata are a
right model for the simple view of concurrency captured by Mazurkiewicz traces.
Theorem 1. [20] Let dom : Σ → (2^P \ {∅}) be a distribution of letters. If a language L ⊆ Σ∗ is regular and trace-closed then there is a deterministic Zielonka automaton accepting L (of size exponential in the number of processes and polynomial in the size of the minimal automaton for L, see [8]).

2.2 The Control Problem


In Ramadge and Wonham’s control setting [18] we are given an alphabet Σ
of actions partitioned into system and environment actions: Σ sys ∪ Σ env = Σ.
Given a plant P we are asked to find a controller C over Σ such that the product
P × C satisfies a given specification (the product being the standard product of
the two automata). Both the plant and the controller are finite, deterministic
automata over the same alphabet Σ. Additionally, the controller is required not

to block environment actions, which in technical terms means that from every
state of the controller there should be a transition on every action from Σ env .
The definition of our problem is the same with the difference that we take
Zielonka automata instead of finite automata. Given a distributed alphabet
(Σ, dom) as above, and a Zielonka automaton P , find a Zielonka automaton
C over the same distributed alphabet such that P × C satisfies a given specifica-
tion. Additionally it is required that from every state of C there is a transition
for every action from Σ env . The important point here is that the controller has
the same distributed structure as the plant. Hence concurrency in the controlled
system is the same as in the plant. Observe that in the controlled system P × C
the states carry the additional information computed by the controller.
Example: Reconsider the example automaton from Section 2.1, and assume that ui, u′k ∈ Σ env are the uncontrollable actions (i, k ∈ {0, 1}). So the controller needs to propose controllable actions cj and c′l, resp., in such a way that both P and P′ reach their final states f, f′ by executing the shared action $. At first sight this may seem impossible to guarantee, as it looks like process P needs to know which u′k process P′ has received, or vice-versa. Nevertheless, such a controller exists. It consists of P allowing after ui only action ci, and P′ allowing after u′k only action c′1−k. Regardless of whether the environment chooses i = j or i ≠ j, the action $ is enabled in state ((i, i), (j, 1 − j)), so both P, P′ can reach their final states.
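The claimed controller is easy to check exhaustively. The short sketch below is ours (variable names hypothetical); it verifies that for every environment choice (ui for P, u′j for P′) the resulting pair of local states enables $:

```python
# P answers u_i with c_i; P' answers u'_j with c'_{1-j}. The action $ is
# enabled in ((i, j), (k, l)) iff i = l or k = j; check all four cases.
for i in (0, 1):
    for j in (0, 1):
        p_state = (i, i)          # P's state after u_i c_i
        q_state = (j, 1 - j)      # P''s state after u'_j c'_{1-j}
        (pi, pj), (qk, ql) = p_state, q_state
        assert pi == ql or qk == pj, "$ must be enabled"
```

Indeed, if j = i then k = j equals pj = i; otherwise j = 1 − i, so l = 1 − j = i equals pi.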
It will be more convenient to work with a game formulation of this problem,
as in [6,12]. Instead of talking about controllers we will talk about distributed
strategies in a game between system and environment. A plant defines a game
arena, with plays corresponding to initial runs of A. Since A is deterministic,
we can view a play as a word from L(A) – or a trace, since L(A) is trace-closed.
Let Plays(A) denote the set of traces associated with words from L(A).
A strategy for the system will be a collection of individual strategies for each
process. The important notion here is the view each process has about the global
state of the system. Intuitively this is the part of the current play that the process
could see or learn about from other processes during a communication with them.
Formally, the p-view of a play u, denoted view p (u), is the smallest trace [v] such
that u ∼ vy and y contains no action from Σp . We write Plays p (A) for the set
of plays that are p-views: Plays p (A) = {view p (u) | u ∈ Plays(A)}.
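The p-view admits a simple concrete computation on words. The following sketch is ours (function and alphabet names hypothetical): it scans a word right to left and keeps an action exactly when it shares a process with p or with a later kept action, i.e. when the action is in the causal past of some p-action.

```python
# Compute (a linearization of) view_p of a play, given dom as a map from
# actions to sets of processes.
def view(word, dom, p):
    """Scan right to left; keep an action iff it is dependent on p or on a
    later kept action.  Returns a linearization of the p-view trace."""
    dep = {p}                      # processes whose actions p (transitively) sees
    kept = []
    for a in reversed(word):
        if dom[a] & dep:           # shares a process with something p learns about
            dep |= dom[a]
            kept.append(a)
    return list(reversed(kept))

dom = {"a": {"p"}, "b": {"q"}, "c": {"p", "q"}, "d": {"q"}}
# In the play a b c d, the q-local action d comes after the last p-action c,
# so it is invisible to p; b is seen because p learns about it through c.
assert view(["a", "b", "c", "d"], dom, "p") == ["a", "b", "c"]
assert view(["a", "b"], dom, "p") == ["a"]
```

The dropped suffix (here the single action d) is exactly the y in the definition: it contains no action from Σp and is independent of everything kept.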
A strategy for a process p is a function σp : Plays_p(A) → 2^{Σp^sys}, where Σp^sys = {a ∈ Σ sys | p ∈ dom(a)}. We require in addition, for every u ∈ Plays_p(A), that σp(u) is a subset of the actions that are possible in the p-state reached on u. A strategy is a family of strategies {σp}p∈P, one for each process.
The set of plays respecting a strategy σ = {σp }p∈P , denoted Plays(A, σ), is the
smallest set containing the empty play ε, and such that for every u ∈ Plays (A, σ):

1. if a ∈ Σ env and ua ∈ Plays(A) then ua is in Plays(A, σ);
2. if a ∈ Σ sys and ua ∈ Plays(A) then ua ∈ Plays(A, σ) provided that a ∈ σp(view_p(u)) for all p ∈ dom(a).

Plays from Plays(A, σ) are called σ-plays and we write Plays p (A, σ) for the
set Plays(A, σ) ∩ Plays p (A). The above definition says that actions of the

environment are always possible, whereas actions of the system are possible
only if they are allowed by the strategies of all involved processes.
Our winning conditions in this paper are local reachability conditions: ev-
ery process has a set of target states Fp ⊆ Sp . We also assume that states in
Fp are blocking, that is, they have no outgoing transitions. This means that if
(sdom(a), s′dom(a)) ∈ δa then sp ∉ Fp for all p ∈ dom(a). For defining winning
strategies, we need to consider also infinite σ-plays. By Plays ∞ (A, σ) we denote
the set of finite or infinite σ-plays in A. Such plays are defined as finite ones,
replacing u in the definition of Plays(A, σ) by a possibly infinite, initial run of A.
A play u ∈ Plays ∞ (A, σ) is maximal, if there is no action c such that the trace
uc is a σ-play (note that uc is defined only if no process in dom(c) is scheduled
infinitely often in u).

Definition 1. The control problem for a plant A and a local reachability condition (Fp)p∈P is to determine if there is a strategy σ = (σp)p∈P such that every maximal trace u ∈ Plays∞(A, σ) is finite and ends in ∏p∈P Fp. Such traces and strategies are called winning.

3 The Upper Bound for Acyclic Communication Graphs

We impose two simplifying assumptions on the distributed alphabet (Σ, dom).


The first one is that all actions are at most binary: |dom(a)| ≤ 2, for every a ∈ Σ.
The second requires that all uncontrollable actions are local: |dom(a)| = 1, for
every a ∈ Σ env . The first restriction makes the technical reasoning much simpler.
The second restriction reflects the fact that each process is modeled with its own,
local environment.
Since actions are at most binary, we can define an undirected graph CG with node set P and edges {p, q} if there exists a ∈ Σ with dom(a) = {p, q}, p ≠ q. Such a graph is called communication graph. We assume throughout this section that CG is acyclic and has at least one edge. This allows us to choose a leaf ℓ ∈ P in CG, with {r, ℓ} an edge in CG. So, in this section ℓ denotes this fixed leaf process and r its parent process. Starting from a control problem with input A, (Fp)p∈P we will reduce it to a control problem over the smaller (acyclic) graph CG′ = CG|P\{ℓ}. The reduction will work in exponential time. If we represent CG as a tree of depth l then applying this construction iteratively we will get an l-fold exponential algorithm to solve the control problem for the CG architecture.
The main idea is that process r can simulate the behavior of process ℓ. Indeed, after each synchronization between r and ℓ, the views of both processes are identical, and until the next synchronization (or termination) ℓ evolves locally. The way r simulates ℓ is by "guessing" the future local evolution of ℓ until the next synchronization (or termination) in a summarized form. Correctness is ensured by letting the environment challenge the guesses.
In order to proceed this way, we first show that winning control strategies can
be assumed to satisfy a “separation” property concerning the synchronizations
of process r (cf. 2nd item of Lemma 1):

Lemma 1. If there exists a winning strategy for controlling A, then there is one, say σ, such that for every u ∈ Plays(A, σ) the following holds:
1. For every process p and A = σp(view_p(u)), we have either A ⊆ Σp^com or A = {a} for some a ∈ Σp^loc.
2. Let A = σr(view_r(u)) with A ⊆ Σr^com. Then either A ⊆ Σr,ℓ or A ⊆ Σr^com \ Σr,ℓ holds.
It is important to note that the 2nd item of Lemma 1 only holds when final
states are blocking. To see this, consider the client-server example in Fig. 1 and
assume that environment can either put a client directly to a final state, or oblige
him to synchronize with the server before going to the final state. Suppose that
all states of the server are final. In this case, the server’s strategy must propose
synchronization with all clients at the same time in order to guarantee that all
clients can reach their final states.
Lemma 1 implies that the behavior of process ℓ can be divided in phases consisting of a local game ending in states where the strategy proposes communications with r and no local actions. This allows to define summaries of results of local plays of the leaf process ℓ. We denote by state_ℓ(v) the ℓ-component of the state reached on v ∈ Σ∗ from the initial state. Given a strategy σ = (σp)p∈P and a play u ∈ Plays_ℓ(A, σ), we define:

Sync_σ(u) = {(tℓ, A) | ∃x ∈ (Σℓ^loc)∗. ux is a σ-play, state_ℓ(ux) = tℓ, σℓ(ux) = A ⊆ Σr,ℓ, and A = ∅ iff tℓ is final}.

Since our winning conditions are local reachability conditions, we can show that it suffices to consider memoryless local strategies for process ℓ until the next synchronization with r (or until termination). Moreover, since final states are blocking, either all possible local plays from a given ℓ-state ultimately require synchronization with r, or they all terminate in a final state of ℓ (mixing the two situations would result in a process blocked on communication).
Lemma 2. If there exists a winning strategy for controlling A, then there is one, say σ = (σp)p∈P, such that for all plays u ∈ Plays_ℓ(A, σ) the following hold:
1. Either Sync_σ(u) ⊆ (Sℓ \ Fℓ) × (2^{Σr,ℓ} \ {∅}) or Sync_σ(u) ⊆ Fℓ × {∅}.
2. If uy is a σ-play with y ∈ (Σ \ Σℓ)∗, σr(view_r(uy)) = B ⊆ Σr,ℓ and B ≠ ∅, then for every (tℓ, A) ∈ Sync_σ(u) some action from A ∩ B is enabled in (state_r(uy), tℓ).
3. There is a memoryless local strategy τ : Sℓ → (Σℓ^sys ∩ Σℓ^loc) to reach from state_ℓ(u) the set of local states {tℓ | (tℓ, A) ∈ Sync_σ(u) for some A}.
The second item of the lemma says that every evolution of r should be compatible with every evolution of ℓ. The memoryless strategy from the third item proposes local actions of ℓ based only on the current state of ℓ and not on the history of the play. This strategy is used in a game on the transition graph of process ℓ. The third item of the lemma follows from the fact that 2-player games with reachability objectives admit memoryless winning strategies.

Definition 2. An admissible plan T from s ∈ Sℓ for process ℓ is either a subset of (Sℓ \ Fℓ) × (2^{Σr,ℓ} \ {∅}) or a subset of Fℓ × {∅}, such that there exists a memoryless local strategy τ : Sℓ → (Σℓ^sys ∩ Σℓ^loc) to reach from s the set {tℓ | (tℓ, A) ∈ T for some A}. An admissible plan T is final if T ⊆ Fℓ × {∅}.
So Lemma 2 states that if there exists a winning strategy then there is one, say σ, such that Sync_σ(u) is an admissible plan for every σ-play u. Note also that we can check in polynomial time whether T as above is an admissible plan.
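This polynomial-time check can be realized by a standard attractor fixpoint on the local transition graph of process ℓ. The sketch below is ours (all names hypothetical); it assumes the simplified rule that a non-target state is winning iff every uncontrollable successor is winning and some controllable successor is winning:

```python
# Attractor fixpoint for a local reachability game: from which states can
# a memoryless strategy force the target set, against uncontrollable moves?
def can_force(states, sys_succ, env_succ, targets):
    """sys_succ / env_succ map a state to its successors via controllable /
    uncontrollable local actions.  Returns the winning region for `targets`."""
    win = set(targets)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            # every uncontrollable move must stay winning, and some
            # controllable move must make progress towards the targets
            if all(t in win for t in env_succ.get(s, [])) and \
               any(t in win for t in sys_succ.get(s, [])):
                win.add(s)
                changed = True
    return win

# Tiny local game: from 1 the controller moves to the target 4 while the
# environment may divert to 2, which can still recover; 3 is a dead end.
sys_succ = {1: [4], 2: [4], 3: []}
env_succ = {1: [2]}
assert can_force({1, 2, 3, 4}, sys_succ, env_succ, {4}) == {1, 2, 4}
```

A plan T from s would then be admissible when its pairs are uniformly non-final or uniformly final and s lies in the region returned for the targets {tℓ | (tℓ, A) ∈ T}.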
We are now ready to define informally the reduced plant A′ on the process set P′ = P \ {ℓ}, that is the result of eliminating process ℓ. The only part that changes in A′ concerns process r, who now simulates the former processes r and ℓ. The new process r starts in state ⟨sin,r, sin,ℓ⟩. It will get into a state from Sr × Sℓ every time it simulates a synchronization between the former r and ℓ. Between these synchronizations its behaviour is as follows.
– From a state of the form ⟨sr, sℓ⟩, process r can do a controllable action ch(T), for every admissible plan T from sℓ, and go to state ⟨sr, T⟩.
– From a state of the form ⟨sr, T⟩ process r can behave as the former r: it can either do a local action (controllable or not) or a shared action with some p ≠ ℓ, that updates the Sr-component to some ⟨s′r, T⟩.
– From a state ⟨sr, T⟩ process r can also do a controllable action ch(B) for some B ⊆ Σr,ℓ and go to state ⟨sr, T, B⟩; from ⟨sr, T, B⟩ there are new, uncontrollable actions of the form (a, tℓ) where (tℓ, A) ∈ T and a ∈ A ∩ B such that (sr, tℓ) −a→ (s′r, t′ℓ) in A. This case simulates r choosing a set of synchronization actions with ℓ, and the synchronization itself. For correctness of this step it is important that B is chosen such that for every (tℓ, A) ∈ T there is some a ∈ A ∩ B enabled in (sr, tℓ).
Finally, accepting states of r in A′ are Fr × Fℓ, and ⟨sr, T⟩ for sr ∈ Fr and T a final plan. The proof showing that this construction is correct provides a reduction from the control game on A to the control game on A′.
Theorem 2. Let ℓ be the fixed leaf process with P′ = P \ {ℓ} and r its parent. Then the system has a winning strategy for A, (Fp)p∈P iff it has one for A′, (Fp)p∈P′. The size of A′ is |A| + O(Mr · 2^{Mℓ·|Σr,ℓ|}), where Mr and Mℓ are the sizes of processes r and ℓ in A, respectively.
Remark 1. Note that the bound on |A′| is better than the |A| + O(Mr · 2^{Mℓ·2^{|Σr,ℓ|}}) obtained by simply counting all possible states in the description above. The reason is that we can restrict admissible plans to be (partial) functions from Sℓ into 2^{Σr,ℓ}. That is, we do not need to consider different sets of communication actions for the same state in Sℓ.
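A quick back-of-the-envelope computation (with purely hypothetical sizes, chosen by us) shows why the functional view of plans gives the better bound:

```python
# Counting admissible plans: arbitrary relations on S_l x 2^Sigma_{r,l}
# versus partial functions from S_l into 2^Sigma_{r,l}.
M_l, sigma = 4, 2                       # hypothetical |S_l| and |Sigma_{r,l}|
naive  = 2 ** (M_l * 2 ** sigma)        # all subsets of S_l x 2^Sigma_{r,l}
as_fun = (2 ** sigma + 1) ** M_l        # per state: one action set, or undefined
assert as_fun < naive                   # 625 vs 65536 for these sizes
assert as_fun <= 2 ** (M_l * (sigma + 1))   # i.e. 2^O(M_l * |Sigma_{r,l}|)
```

The doubly exponential count collapses to a singly exponential one in |Σr,ℓ|, matching the bound stated in Theorem 2.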
Let us reconsider the example from Figure 1 of a server with k clients. Applying our reduction k times we reduce out all the clients and obtain the single-process plant whose size is Ms · 2^{(M1+···+Mk)·c}, where Ms is the size of the server, Mi is the size of client i, and c is the maximal number of communication actions between a client and the server. Our first main result also follows by applying the above reduction iteratively.

Theorem 3. The control problem for distributed plants with acyclic communi-
cation graph is decidable. There is an algorithm for solving the problem (and
computing a finite-state controller, if it exists) whose running time is bounded
by a tower of exponentials of height equal to half of the diameter of the graph.

4 The Lower Bound

Our main objective now is to show how, using a communication architecture of diameter l, one can code a counter able to represent numbers of size Tower(2, l) (with Tower(n, l) = 2^{Tower(n,l−1)} and Tower(n, 1) = n). Then an easy adaptation of the construction will allow to encode computations of Turing machines with the same space bound as the capabilities of the counters.

[Figure: process C executes blocks x1 · · · xn, y1 · · · yn, z1 · · · zn and process C̄ executes x̄1 · · · x̄n, ȳ1 · · · ȳn, z̄1 · · · z̄n, each block followed by a $-synchronization with V.]

Fig. 2. Shape of a trace with 3 processes. Dashed lines show two types of tests.

Let us first explain the mechanism we will use. Consider a trace of the shape presented in Figure 2. There are three processes C, C̄ and V. Process C repeatedly generates a sequence of n local actions and then synchronizes on action $ with the verifier process V. Process C̄ does the same. The alphabets of C and C̄ are of course disjoint. The verifier process V always synchronizes first with C and then with C̄. Observe that the actions ȳ1 · · · ȳn are concurrent to both x1 · · · xn and y1 · · · yn, but they are before z1. Suppose that we allow the environment to stop this generation process at any moment. Say it stops C at some xi, and C̄ at x̄i. We can then set the processes in such a way that they are forced to communicate xi and x̄i to V, who can verify if they are correct. The other possibility is that the environment stops C at xi and C̄ at ȳi, forcing the comparison of xi with ȳi. This way we obtain a mechanism allowing to compare position by position the sequence x1 · · · xn both with x̄1 · · · x̄n and with ȳ1 · · · ȳn. Observe that V knows which of the two cases he deals with, since the comparison with the latter sequence happens after some $ and before the next $. Now, we can use sequences of n letters to encode numbers from 0 to 2^n − 1. Then this mechanism permits us to verify if x1 · · · xn represents the same number as x̄1 · · · x̄n and the predecessor of ȳ1 · · · ȳn. Applying the same reasoning to y1 · · · yn we can test that it represents the same number as ȳ1 · · · ȳn and the predecessor of z̄1 · · · z̄n. If some test fails, the environment wins. If the environment does not stop C and C̄ at the same position, or stops only one of them, the system wins. So this way we force the processes C and C̄ to cycle through representations of numbers from 0 to
2^n − 1. Building on this idea we can encode alternating polynomial space Turing machines, and show that the control problem for this three-process architecture (with diameter 2) is Exptime-hard. The algorithm from the previous section provides the matching upper bound.
After this explanation let us introduce general counters. We start with their alphabets. Let Σi = {ai, bi} for i = 1, . . . , n. We will think of ai as 0 and bi as 1, mnemonically: 0 is round and 1 is tall. Let Σi# = Σi ∪ {#i} be the alphabet extended with an end marker.
A 1-counter is just a letter from Σ1 followed by #1. The value of a1 is 0, and that of b1 is 1. An (l + 1)-counter is a word x0 u0 x1 u1 · · · xk−1 uk−1 #l+1 where k = Tower(2, l), and for every i < k we have: xi ∈ Σl+1 and ui is an l-counter with value i. The value of the above (l + 1)-counter is ∑i<k xi 2^i. The end marker #l+1 is there for convenience. An iterated (l + 1)-counter is a nonempty sequence of (l + 1)-counters (we do not require that the values of consecutive (l + 1)-counters are consecutive).
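The counter definitions can be made concrete with a short sketch (ours; symbols are modeled as (letter, level) pairs, with ai as bit 0 and bi as bit 1, and #i written as ("#", i)):

```python
# Tower(n, l) and the recursive counter encoding from the definitions above.
def tower(n, l):
    return n if l == 1 else 2 ** tower(n, l - 1)

def counter(l, value):
    """Return the l-counter word (as a list of symbols) with the given value."""
    if l == 1:
        return [("b" if value else "a", 1), ("#", 1)]
    k = tower(2, l - 1)               # number of bit positions of an l-counter
    word = []
    for i in range(k):
        bit = (value >> i) & 1
        word.append(("b" if bit else "a", l))
        word += counter(l - 1, i)     # position label: an (l-1)-counter of value i
    word.append(("#", l))
    return word

def value(l, word):
    """Inverse of `counter`: decode the value of an l-counter."""
    bits = [s for (s, lev) in word if lev == l and s in ("a", "b")]
    return sum((1 if b == "b" else 0) << i for i, b in enumerate(bits))

assert tower(2, 1) == 2 and tower(2, 2) == 4 and tower(2, 3) == 16
for v in range(tower(2, 2)):          # a 2-counter has 2 positions: values 0..3
    assert value(2, counter(2, v)) == v
```

For l = 2 the counter has Tower(2, 1) = 2 bit positions, so values range over 0..3; an iterated 2-counter is then just a concatenation of such words.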
Suppose that we have already constructed a plant C^l with root process rl, such that every winning strategy in C^l needs to produce an iterated l-counter on rl. We now define C^{l+1}, a plant where every winning strategy needs to produce an iterated (l + 1)-counter on its root process rl+1. Recall that such a counter is a sequence of l-counters with values 0, 1, . . . , (Tower(2, l) − 1), 0, 1, . . .
The plant C^{l+1} is made of two copies of C^l, that we name D^l and D̄^l. We add three processes: rl+1, r̄l+1, Vl+1. The root rl+1 of C^{l+1} communicates with Vl+1 and with the root rl of D^l, while r̄l+1 communicates with Vl+1 and with the root of D̄^l.
In order to force C^{l+1} to generate an (l + 1)-counter, we allow the environment to compare, using Vl+1, the sequence generated by rl+1 and the sequence generated by r̄l+1. The mechanism is similar to the example above. After each letter of Σl, we add an uncontrollable action that triggers the comparison between the current letters of Σl on rl+1 and on r̄l+1. This may correspond to two types of tests: equality or successor. For equality, Vl+1 enters a losing state if (1) the symbols from rl+1 and from r̄l+1 are different; and (2) the number of remaining letters of Σl before #l is the same on both rl+1 and r̄l+1. The latter test ensures that the environment has put the challenge at the same positions of the two counters. The case for successor is similar, accounting for a possible carry. In any other case (for instance, if the test was issued on one process only instead of both rl+1 and r̄l+1), the test leads to a winning configuration.
The challenge with this schema is to keep rl+1 and r̄l+1 synchronized in the sense that either (i) the two should be generating the same l-counter, or (ii) rl+1 should be generating the consecutive counter with respect to the one generated by r̄l+1. For this, a similar communication mechanism based on $ symbols as in the example above is used. An action $l shared by rl+1 and Vl+1 is executed after each l-counter, that is after each #l shared between rl+1 and rl. Similarly with action $̄l shared by r̄l+1 and Vl+1. Process Vl+1 switches between state eq and state succ when receiving $l, and back when receiving $̄l, so it knows whether rl+1 is generating the same l-counter as r̄l+1, or the next one. As r̄l+1 does not synchronize (unless there is a challenge) with Vl+1 between two $̄l, it does not know whether rl+1 has already started producing the same l-counter or whether it is still producing the previous one. Another important point about the flow of knowledge is that while rl is informed when rl+1 is being challenged (as it synchronizes frequently with rl+1, and could thus be willing to cheat to produce a different l-counter), r̄l does not know that r̄l+1 is being challenged, and thus cheating on r̄l would be caught by its verifier.
Proposition 1. For every l, the system has a winning strategy in C^l. For every such winning strategy σ, if we consider the unique σ-play without challenges then its projection on ⋃i=1,...,l Σi# is an iterated l-counter.
Proposition 1 is the basis for encoding Turing machines, with C^l ensuring that the space bound is equal to Tower(n, l).
Theorem 4. Let l > 0. There is an acyclic architecture of diameter (4l − 2) and with 3(2l − 1) processes such that the space complexity of the control problem for it is Ω(Tower(n, l)).

5 Conclusions
Distributed synthesis is a difficult and at the same time promising problem, since
distributed systems are intrinsically complex to construct. We have considered
here an asynchronous, shared-memory model. Already Pnueli and Rosner in [16]
strongly argue in favour of asynchronous distributed synthesis. The choice of
transmitting additional information while synchronizing is a consequence of the
model we have adopted. We think that it is interesting from a practical point of
view, since it is already used in multithreaded computing (e.g., CAS primitive)
and it offers more decidable settings (e.g., client-server architecture).
Under some restrictions we have shown that the resulting control problem is
decidable. The assumption about uncontrollable actions being local represents
the most common situation where each process comes with its own environment
(e.g., a client). The assumption on binary synchronizations simplifies the defini-
tion of architecture graph and is common in distributed algorithms. The most
important restriction is that on architectures being a tree. Tree architectures are
quite rich and allow to model hierarchical situations, like server/clients (recall
that such cases are undecidable in the setting of Pnueli and Rosner). Neverthe-
less, it would be very interesting to know whether the problem is still decidable
e.g. for ring architectures. Such an extension would require new proof ideas.
A more immediate task is to consider more general winning conditions. A fur-
ther interesting research direction is the synthesis of open, concurrent recursive
programs, as considered e.g. in [1].
Our non-elementary lower bound result is somehow surprising. Since we have
full information sharing, all the complexity is hidden in the uncertainty about
actions performed in parallel by other processes.
286 B. Genest et al.
References
1. Bollig, B., Grindei, M.-L., Habermehl, P.: Realizability of concurrent recursive
programs. In: de Alfaro, L. (ed.) FOSSACS 2009. LNCS, vol. 5504, pp. 410–424.
Springer, Heidelberg (2009)
2. Church, A.: Logic, arithmetic, and automata. In: Proceedings of the International
Congress of Mathematicians, pp. 23–35 (1962)
3. Clairambault, P., Gutierrez, J., Winskel, G.: The winning ways of concurrent
games. In: LICS, pp. 235–244. IEEE (2012)
4. Diekert, V., Rozenberg, G. (eds.): The Book of Traces. World Scientific (1995)
5. Finkbeiner, B., Schewe, S.: Uniform distributed synthesis. In: LICS, pp. 321–330.
IEEE (2005)
6. Gastin, P., Lerman, B., Zeitoun, M.: Distributed games with causal memory are
decidable for series-parallel systems. In: Lodaya, K., Mahajan, M. (eds.) FSTTCS
2004. LNCS, vol. 3328, pp. 275–286. Springer, Heidelberg (2004)
7. Gastin, P., Sznajder, N., Zeitoun, M.: Distributed synthesis for well-connected
architectures. Formal Methods in System Design 34(3), 215–237 (2009)
8. Genest, B., Gimbert, H., Muscholl, A., Walukiewicz, I.: Optimal Zielonka-type
construction of deterministic asynchronous automata. In: Abramsky, S., Gavoille,
C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds.) ICALP 2010. LNCS,
vol. 6199, pp. 52–63. Springer, Heidelberg (2010)
9. Katz, G., Peled, D., Schewe, S.: Synthesis of distributed control through knowl-
edge accumulation. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS,
vol. 6806, pp. 510–525. Springer, Heidelberg (2011)
10. Kupferman, O., Vardi, M.: Synthesizing distributed systems. In: LICS (2001)
11. Madhusudan, P., Thiagarajan, P.S.: Distributed controller synthesis for local speci-
fications. In: Orejas, F., Spirakis, P.G., van Leeuwen, J. (eds.) ICALP 2001. LNCS,
vol. 2076, p. 396. Springer, Heidelberg (2001)
12. Madhusudan, P., Thiagarajan, P.S., Yang, S.: The MSO theory of connectedly
communicating processes. In: Sarukkai, S., Sen, S. (eds.) FSTTCS 2005. LNCS,
vol. 3821, pp. 201–212. Springer, Heidelberg (2005)
13. Mazurkiewicz, A.: Concurrent program schemes and their interpretations. DAIMI
Rep. PB 78, Aarhus University, Aarhus (1977)
14. Melliès, P.-A.: Asynchronous games 2: The true concurrency of innocence.
TCS 358(2-3), 200–228 (2006)
15. van der Meyden, R., Wilke, T.: Synthesis of distributed systems from knowledge-
based specifications. In: Abadi, M., de Alfaro, L. (eds.) CONCUR 2005. LNCS,
vol. 3653, pp. 562–576. Springer, Heidelberg (2005)
16. Pnueli, A., Rosner, R.: On the synthesis of an asynchronous reactive module. In:
Ronchi Della Rocca, S., Ausiello, G., Dezani-Ciancaglini, M. (eds.) ICALP 1989.
LNCS, vol. 372, pp. 652–671. Springer, Heidelberg (1989)
17. Pnueli, A., Rosner, R.: Distributed reactive systems are hard to synthesize. In:
FOCS, pp. 746–757 (1990)
18. Ramadge, P.J.G., Wonham, W.M.: The control of discrete event systems. Proceed-
ings of the IEEE 77(2), 81–98 (1989)
19. Schewe, S., Finkbeiner, B.: Synthesis of asynchronous systems. In: Puebla, G. (ed.)
LOPSTR 2006. LNCS, vol. 4407, pp. 127–142. Springer, Heidelberg (2007)
20. Zielonka, W.: Notes on finite asynchronous automata. RAIRO–Theoretical Infor-
matics and Applications 21, 99–135 (1987)
Querying the Guarded Fragment with Transitivity
Georg Gottlob1, Andreas Pieris1, and Lidia Tendera2
1 Department of Computer Science, University of Oxford, UK
[email protected]
2 Institute of Mathematics and Informatics, Opole University, Poland
[email protected]
Abstract. We study the problem of answering a union of Boolean conjunctive
queries q against a database Δ, and a logical theory ϕ which falls in the guarded
fragment with transitive guards (GF+TG). We trace the frontier between decid-
ability and undecidability of the problem under consideration. Surprisingly, we
show that query answering under GF2 +TG, i.e., the two-variable fragment of
GF+TG, is already undecidable (even without equality), whereas its monadic
fragment is decidable; in fact, it is 2EXPTIME-complete in combined complexity
and coNP-complete in data complexity. We also show that for a restricted class of
queries, query answering under GF+TG is decidable.

1 Introduction

The Guarded Fragment. The guarded fragment of first-order logic (GF) was intro-
duced by Andréka et al. [1] with the aim of explaining and generalizing the good prop-
erties of modal logic. Guarded formulas are constructed as usual first-order formulas
with the exception that all quantification must be bounded, i.e., of the form ∀x̄(α → ϕ)
or ∃x̄(α ∧ ϕ), where α is an atomic formula which guards ϕ in the sense that it contains
all the free variables of ϕ. Andréka et al. showed that modal logic can be embedded in
GF, and they argued in a convincing way that GF inherits the good properties of modal
logic. In [2], Grädel has established that GF enjoys several nice model-theoretic proper-
ties, and he also proved that satisfiability of GF-sentences is 2EXPTIME-complete, and
EXPTIME-complete for sentences with relations of bounded arity.
The guarded fragment has since been intensively studied and extended in various
ways. An interesting extension is the guarded fragment with transitivity, a natural
representative language for multi-modal logics that are used to formalize epistemic
logics. The obvious formalization of the transitivity of a binary relation R, namely
∀x∀y∀z(R(x, y) ∧ R(y, z) → R(x, z)), is not guarded and there is no way to ex-
press it in GF [2]. As shown by Ganzinger et al. [3], the two-variable guarded fragment
(GF2 ) with transitivity is already undecidable, improving an analogous result for the
three-variable guarded fragment proved by Grädel [2]. In [3], a logic which restricts
the guarded fragment with transitivity by allowing transitive relations to appear only in
guards has been proposed. This formalism, which was dubbed the guarded fragment
with transitive guards (GF+TG) [4,5], is indeed expressive enough to be able to cap-
ture multi-modal logics of type K4, S4 or S5. The decidability of the monadic fragment
of GF2 +TG (MGF2 +TG), where all non-unary relations may appear in guards only,

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 287–298, 2013.
© Springer-Verlag Berlin Heidelberg 2013
was established in [3]; its exact complexity, as well as the decidability of GF+TG, was
left as an open problem. Satisfiability of GF+TG-sentences is 2EXPTIME-complete [5],
while 2EXPTIME-hardness holds also for MGF2 +TG [6].
Querying Guarded-Based Fragments. It is evident that a large corpus of works on
GF (and extensions of it) has focused on satisfiability. More recently, the attention
has shifted on the problem of query answering, a central reasoning task in database
theory [7] and description logics [8]. An extensional database Δ, which is actually a
conjunction of ground atoms, is combined with a first-order sentence ϕ describing con-
straints which derive new knowledge. The database does not necessarily satisfy ϕ, and
may thus be incomplete. A query is not just answered against Δ, as in the classical
setting, but against the logical theory (Δ ∧ ϕ). Here we focus on union of Boolean
conjunctive queries (UCQ). A Boolean conjunctive query (BCQ) q consists of an exis-
tentially closed conjunction of atoms, while a UCQ is a disjunction of a finite number
of BCQ. Thus, given a UCQ q, one checks whether (Δ ∧ ϕ) |= q, written (Δ, ϕ) |= q.
Several fragments of GF have been considered for query answering in the context
of database theory. A notable example is the class of guarded tuple-generating depen-
dencies (or guarded TGDs) [7], that is, sentences of the form ∀x̄(ϕ(x̄) → ∃ȳ ψ(x̄, ȳ)),
where ϕ and ψ are conjunctions of atoms, and in ϕ an atom exists which contains all the
variables of x̄. Although guarded TGDs are, strictly speaking, not GF-sentences, since
their heads may be unguarded, they can be rewritten as guarded sentences [9]. Several
extensions of guarded TGDs have been investigated, see, e.g., [10,7,11]. Fragments of
GF have been also considered in the context of description logics. A prominent lan-
guage is DL-LiteR [8], which forms the OWL 2 QL profile of the W3C's standard ontology
language for modeling Semantic Web ontologies. In fact, each DL-LiteR axiom can
be written as a GF-sentence of the form ∀x̄(α(x̄) → ∃ȳ β(x̄, ȳ)). Following a more
general approach, Bárány et al. studied the problem of query answering for the whole
guarded fragment [9]. Query answering under GF is coNP-complete in data complexity,
i.e., when only the database is part of the input, and 2EXPTIME-complete in combined
complexity, i.e., when also the theory and the query are part of the input. Notice that
the data complexity is widely regarded as more meaningful in practice, since the theory
and the query are typically of a size that can be productively assumed to be fixed.
Research Challenges. While the decidability and complexity landscape of query an-
swering under GF (and fragments of it) is clearing up, the picture for extensions of GF
is still foggy. Notable exceptions are the two-variable guarded fragment with counting
quantifiers, under which query answering is coNP-complete in data complexity [12],
and the guarded negation fragment, where query answering is 2EXPTIME-complete in
combined complexity [13]. In this paper we focus on GF+TG. Our goal is to better un-
derstand the problem of query answering, and give answers to the following questions:

– Is query answering under GF+TG decidable, and if so, what is the exact data and
combined complexity?
– In case the previous question is answered negatively: (i) What is the frontier be-
tween decidability and undecidability, and what is the exact data and combined
complexity of query answering under the decidable fragment of GF+TG? (ii) Can
we gain decidability for GF+TG by restricting the syntax of the query?
We provide answers to all these questions. Notice that query answering under GF+TG
is at least as hard as (un)satisfiability of GF+TG-sentences; in fact, (Δ, ϕ) |= q iff
(Δ ∧ ϕ ∧ ¬q) is unsatisfiable. However, previous results on GF+TG are not imme-
diately applicable for the following two reasons: Δ contains constants which are for-
bidden in the original definition of GF+TG, and ¬q may not be a GF+TG-sentence.
Therefore, we had either to come up with novel techniques beyond the state of the art,
or significantly extend existing procedures.
Contribution. Our contributions can be summarized as follows:

1. We show that query answering under GF+TG is undecidable even without equality.
This is done by forcing an infinite grid to appear in every model of a GF2 +TG-
sentence, and then, by a further conjunction of formulas, we simulate a deterministic
Turing machine. The same proof shows undecidability of guarded disjunctive TGDs
(i.e., guarded TGDs extended with disjunction in rule-heads) with transitive guards.
Although the question whether the same undecidability result holds also for non-
disjunctive guarded TGDs remains open, we establish that transitivity without the
restriction to guards cannot be safely combined with guarded TGDs.
2. We trace the frontier between decidability and undecidability of query answering
by establishing that for the monadic fragment of GF2 +TG (MGF2 +TG) it is decidable; in fact, it is 2EXPTIME-complete in combined complexity and coNP-complete
in data complexity. The proof of this result is constituted by two steps. First, we
show that satisfiability of an MGF2 +TG-sentence combined with a database Δ, is
2EXPTIME-complete, and NP-complete if we consider only the database as part of
the input. Then, given q ∈ UCQ, we construct a sentence ΦΔ,q such that for every
MGF2 +TG-sentence ϕ, (Δ, ϕ) |= q iff (Δ∗ ∧ ϕ ∧ ¬ΦΔ,q ) is unsatisfiable, where
Δ∗ is obtained from Δ by adding some auxiliary atoms, and (ϕ ∧ ¬ΦΔ,q ) is an
MGF2 +TG-sentence.
3. We show decidability of query answering under GF+TG if we consider unions of
single-transitive-acyclic BCQs, that is, a restricted class of queries; it is 2EXPTIME-complete in combined complexity, and coNP-complete in data complexity.

2 Preliminaries
We work with finite relational signatures. Let us fix such a signature τ , and let width(τ )
be the maximal arity of any of the predicate symbols in τ . The guarded fragment of first-
order logic (GF), introduced by Andréka et al. [1], is the collection of first-order for-
mulas with some syntactic restrictions in the quantification pattern, which is analogous
to the relativised nature of modal logic. The set GF of formulas over τ is the smallest
set (i) containing all atomic τ -formulas and equalities, (ii) closed under logical connec-
tives ¬, ∧, ∨, →, and (iii) if x̄ and ȳ are tuples of variables, α is a τ -atom or an equality
atom containing all the variables of {x̄, ȳ}, and ϕ ∈ GF with free variables contained
in {x̄, ȳ}, then ∀x̄(α → ϕ) and ∃x̄(α ∧ ϕ) belong to GF as well. Equality atoms are
allowed to occur anywhere including as guards. To define the guarded fragment with
transitive guards (GF+TG), we additionally fix a subset τ0 ⊆ τ of transitive predicates,
and consider only those constant-free GF-formulas where the transitive predicates do
not appear outside guards. For a τ -structure A to be a model of a GF+TG-sentence ϕ,
we require all predicates of τ0 to be interpreted as transitive relations in A. The two-
variable guarded fragment with transitive guards is denoted GF2 +TG. In the monadic
fragment of GF2 +TG (MGF2 +TG) all non-unary relations may appear in guards only.
If A is a τ -structure with the universe A and B ⊆ A, then A↾B denotes the substructure of A induced on B. An atomic τ -type t(x1 , . . . , xn ) is a maximal consistent set
of τ -literals (atoms or negated atoms) whose constituent terms are among the variables
x1 , . . . , xn . In a τ -structure A the atomic type tpA (ā) of a tuple ā is the unique atomic
type t(x̄) such that A |= t(ā). A conjunctive query (CQ) is a formula of the form
∃x̄ ϕ(x̄, ȳ), where ϕ is a conjunction of atomic formulas possibly with constants. Boolean
conjunctive queries (BCQ) are conjunctive queries without free variables. A union of
(Boolean) conjunctive queries (UCQ) is a disjunction of a finite number of (Boolean)
conjunctive queries. By abuse of notation, sometimes we consider a query q ∈ UCQ as
a set of conjunctive queries. To every query q ∈ BCQ of the form ∃x̄ ϕ(x̄) over τ , one
can associate the τ -structure Q having as universe the set of variables in x̄ and atoms
as prescribed by ϕ. Then, A |= q iff there exists a homomorphism h : Q → A. We say
that q is acyclic if the associated structure Q is acyclic. By var (q) (resp., const(q)) we
denote the set of variables (resp., constants) occurring in q. The main problem tackled
in this paper, called UCQ answering, is defined as follows: given a database Δ, which
is actually a conjunction of ground atoms, a first-order theory ϕ, and a Boolean query
q ∈ UCQ, decide whether (Δ ∧ ϕ) |= q, written (Δ, ϕ) |= q. The decision version of
the problem for non-Boolean queries, which asks whether a tuple of constants belongs
to the answer of the query, can be reduced to the same problem for Boolean queries.
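The criterion A |= q iff there is a homomorphism h : Q → A yields a direct, if naive, decision procedure on finite structures. The following Python sketch (the encoding of structures and queries as tuples is our own, purely illustrative) searches for such a homomorphism by exhaustive enumeration:

```python
from itertools import product

def satisfies_bcq(universe, facts, query_atoms):
    """Decide A |= q for a finite structure A and a BCQ q by brute-force
    homomorphism search.  `facts` is a set of ground atoms (pred, args);
    `query_atoms` lists atoms whose terms are variables ('?'-prefixed
    strings) or constants.  Constants must map to themselves."""
    variables = sorted({t for _, args in query_atoms for t in args
                        if isinstance(t, str) and t.startswith('?')})
    for values in product(universe, repeat=len(variables)):
        h = dict(zip(variables, values))
        if all((pred, tuple(h.get(t, t) for t in args)) in facts
               for pred, args in query_atoms):
            return True   # h is a homomorphism from Q to A
    return False
```

The search is exponential in the number of variables, which matches the well-known NP-hardness of BCQ evaluation in combined complexity.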

3 Undecidability of Querying the GF2 +TG
In contrast to decidability of satisfiability for GF+TG, we show undecidability of UCQ
answering under GF2 +TG.
Theorem 1. UCQ answering for GF2 +TG is undecidable, even if we consider only
one transitive relation, equality-free sentences, and an empty database.
Proof (sketch). We first construct a GF2 +TG-sentence ϕgrid such that every infinite
model of ϕgrid is grid-like. The signature of ϕgrid is constituted by two binary relations
H and V (grid relations), a binary relation H̄ (an auxiliary relation which will allow us
to encode a crucial non-guarded sentence as part of the query), a transitive relation T ,
and unary relations ci,j , where 0 ≤ i ≤ 3 and 0 ≤ j ≤ 1 (ci,j describes elements of the
grid whose column number modulo 4 is i, and whose row number modulo 2 is j).
Let ϕ0 be the conjunction of the initial formulas which fixes the starting point, and
constructs the horizontal and vertical edges of the grid. Let ϕ1 be the additional con-
junction of formulas which asserts that certain elements connected by T are also con-
nected by the horizontal grid relation:
⋀i∈{0,2} ⋀j∈{0,1} ∀x∀y( T (x, y) ∧ ci,j (x) ∧ ci+1,j (y) → (H(x, y) ↔ ¬H̄(x, y)) )
⋀i∈{1,3} ⋀j∈{0,1} ∀x∀y( T (y, x) ∧ ci,j (x) ∧ ci+1,j (y) → (H(x, y) ↔ ¬H̄(x, y)) ).
Fig. 1. Grid structure for (a) GF2 +TG and (b) GTGD2 + transitivity
We need to guarantee that H is complete over V . Denote γi,j := ci,j (x) ∧ ci+1,j (y) ∧
ci,j+1 (x′) ∧ ci+1,j+1 (y′), ψ0 = ψ2 := T (x′, x) ∧ T (x, y) ∧ T (y, y′) ∧ T (x′, y′), and
ψ1 = ψ3 := T (x, x′) ∧ T (y, x) ∧ T (y′, y) ∧ T (y′, x′). The completeness of H over V
is achieved by the conjunction of formulas ϕ2 :
⋀i∈{0,1,2,3} ⋀j∈{0,1} ∀x∀y∀x′∀y′( γi,j ∧ ψi ∧ H(x, y) ∧ V (x, x′) ∧ V (y, y′) → H(x′, y′) ).

Let ϕgrid = ϕ0 ∧ ϕ1 ∧ ϕ2 . It can be shown that a grid structure as the one in Fig-
ure 1(a), where dashed arrows represent induced edges due to transitivity, appears in
every infinite model of ϕgrid . By using the infinite grid, where its i-th horizontal line
represents the i-th configuration of a deterministic Turing machine M over an empty
input tape, we can now simulate M by constructing a GF2 -sentence ϕM such that M
halts iff ϕgrid ∧ ϕM ∧ ¬∃x halt (x) is unsatisfiable; thus, the latter is an undecidable
problem. We now show that this undecidable problem can be reduced to UCQ answer-
ing for GF2 +TG. The non-guarded sentence ϕ2 is equivalent to ¬ϕ̂2 , where ϕ̂2 is:
⋁i∈{0,1,2,3} ⋁j∈{0,1} ∃x∃y∃x′∃y′( γi,j ∧ ψi ∧ H(x, y) ∧ V (x, x′) ∧ V (y, y′) ∧ H̄(x′, y′) ).

Thus, by letting ϕ̂grid = ϕ0 ∧ ϕ1 , we get that ϕgrid ∧ ϕM ∧ ¬∃x halt (x) is equivalent
to ϕ̂grid ∧ ϕM ∧ ¬(ϕ̂2 ∨ ∃x halt (x)). Hence, ϕgrid ∧ ϕM ∧ ¬∃x halt (x) is unsatisfiable
iff ϕ̂grid ∧ ϕM |= ϕ̂2 ∨ ∃x halt (x). The claim follows by observing that ϕ̂grid ∧ ϕM is
a GF2 +TG-sentence, while (ϕ̂2 ∨ ∃x halt (x)) ∈ UCQ.
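The dashed arrows of Figure 1(a) are exactly the pairs added when the relation T is closed under transitivity. For a finite fragment of the grid, this least transitive superset can be computed by a naive fixpoint iteration (an illustrative sketch, not part of the construction itself):

```python
def transitive_closure(pairs):
    """Return the least transitive relation containing `pairs`,
    adding (a, d) whenever (a, b) and (b, d) are present,
    until a fixpoint is reached."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure
```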

Interestingly, the above proof shows that query answering for two-variable guarded
disjunctive TGDs (i.e., guarded TGDs extended with disjunction in rule-heads [14])
with transitive guards (GDTGD2 + TG) is undecidable. In fact, the sentence ϕ1 of the
form ∀x̄(ϕ → (α ↔ ¬β)), which is the only part of the above construction that is not
constituted by guarded TGDs, is equivalent to the following conjunction of formulas:
¬∃x̄(ϕ ∧ α ∧ β) ∧ ∀x̄(ϕ → α ∨ β). Notice that the first conjunct forms a negated query,
while the second one is a guarded disjunctive TGD. The next result follows.
Corollary 1. UCQ answering for GDTGD2 + TG is undecidable, even if we consider
only one transitive relation, and an empty database.
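The propositional skeleton of this rewriting is easy to verify mechanically: for truth values p, a, b, the shape p → (a ↔ ¬b) agrees with ¬(p ∧ a ∧ b) ∧ (p → (a ∨ b)) under every valuation. A brute-force check (illustrative only, with ad hoc function names):

```python
from itertools import product

def equivalent(f, g, nvars=3):
    """Check propositional equivalence by enumerating all valuations."""
    return all(f(*v) == g(*v)
               for v in product([False, True], repeat=nvars))

# p -> (a <-> not b)  versus  not(p and a and b) and (p -> (a or b))
lhs = lambda p, a, b: (not p) or (a == (not b))
rhs = lambda p, a, b: (not (p and a and b)) and ((not p) or a or b)
```

Here `equivalent(lhs, rhs)` holds, mirroring the split of ϕ1 into a negated query and a guarded disjunctive TGD.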
Querying the GTGD2 + Transitivity. A challenging issue which remains open is
whether the above undecidability result holds also for guarded TGDs with transitive
guards. However, transitivity without the restriction to guards cannot be safely com-
bined with guarded TGDs, which extends the known undecidability result for GF with
two transitive relations [15,16]. This result is of high relevance in the context of classi-
cal databases, where TGDs form a central class of integrity constraints [17].

Theorem 2. UCQ answering for GTGD2 + transitivity is undecidable, even if we con-
sider only two transitive relations, an empty database, and an atomic CQ.

Proof (sketch). It is possible to construct a GTGD2 + transitivity-sentence ϕgrid such
that a grid structure as the one depicted in Figure 1(b) appears in every infinite model
of ϕgrid . Having the grid relations H and V in place, we can exploit the sentence ϕM ,
used also in the proof of Theorem 1, in order to simulate the behavior of a deterministic
Turing machine M over an empty input tape.

4 Querying the Monadic Fragment of GF2 +TG

We show that UCQ answering under MGF2 +TG is decidable. Since, for two-variable
logics, there is little to be gained by allowing predicates of arity greater than two, we
concentrate on unary and binary predicates only.
Ramified Models and Δ-satisfiability. Query answering is at least as hard as
(un)satisfiability: if P is a predicate not occurring in Δ or ϕ, then (Δ, ϕ) |= ∃x P (x)
iff (Δ∧ϕ) is unsatisfiable. Hence, we first study the related problem of Δ-satisfiability:
given a conjunction of ground atoms Δ and a formula ϕ, decide whether (Δ ∧ ϕ) is sat-
isfiable. We establish decidability of the problem and exact complexity bounds. Notice
that in the presence of Δ, existing algorithms deciding satisfiability of fragments of GF
with transitivity cannot be applied directly since constants are not allowed there.
Recall that one of the main properties of GF is the tree-model property saying that
every satisfiable guarded formula has a model A whose treewidth is bounded by the
number of variables in the formula [2]. Also, it is known that there exists a tree de-
composition of A such that each of its bags is guarded in A [9]. It is easy to see that
these properties are not preserved if we consider GF+TG-formulas. However, it was
shown that any satisfiable GF+TG-formula has a special ramified model [5]. We show
that models with similar properties can be also found in the presence of Δ. Ramified
models will be useful for both Δ-satisfiability and UCQ answering. Grädel’s analysis
for GF [2] uses the so-called Scott normal form corresponding to a relational Skolemi-
sation. For GF+TG the following variant of the normal form turned out to be useful.

Definition 1. A GF+TG-sentence in normal form is a conjunction of sentences of the
form: (1) ∃x (α(x)∧ϑ(x)), (2) ∀x̄ (α(x̄) → ∃y (β(x̄, y)∧ϑ(x̄, y))), or (3) ∀x̄ (α(x̄) →
ϑ(x̄)), where y ∈ x̄, α, β are atomic, and ϑ is quantifier-free without a transitive letter.

Lemma 1 ([5]). With every GF+TG-sentence ϕ of length n over τ one can associate
a set Φ of GF+TG-sentences in normal form over an extended signature σ such that:
(1) ϕ is satisfiable iff ⋁ψ∈Φ ψ is satisfiable, (2) |Φ| ≤ O(2^n ), |σ| ≤ n, width(σ) =
width(τ ), and for every ψ ∈ Φ, |ψ| = O(n log n), and (3) Φ can be computed in
2EXPTIME, and every sentence ψ ∈ Φ can be computed in PTIME w.r.t. n.

Also in our case we might restrict attention to formulas in normal form.

Lemma 2. Let Δ be a conjunction of ground atoms, and ϕ ∈ GF+TG. Then, (Δ ∧ ϕ)
has a (finite) model iff (Δ ∧ ⋁ψ∈Φ ψ) has a (finite) model, where Φ is as in Lemma 1.

Intuitively, the key property of the ramified models for GF+TG-sentences can be de-
scribed as follows: if we eliminate the atoms induced due to transitivity during the
construction of a ramified model, then the obtained structure A has bounded treewidth,
and there exists a tree decomposition of A such that each of its bags is guarded and
single-transitive. For the monadic case, the graph of a ramified model after removing
atoms induced due to transitivity can be seen as a forest with roots arbitrarily connected
through Δ, and where every edge is labeled with only one binary relation.

Definition 2. Let Δ be a conjunction of ground atoms with D = dom(Δ), ϕ be an
MGF2 +TG-sentence in normal form, and R |= Δ ∧ ϕ with R ⊇ D. We say that R is
Δ-ramified if there exists a set S of root choices, where D ⊆ S ⊆ R, and a finite set
of trees {Ts }s∈S rooted at elements from S such that the following conditions hold:
1. every a ∈ R is a node in one of the trees of {Ts }s∈S and for every s, s′ ∈ S such
that s ≠ s′, Ts and Ts′ are disjoint;
2. for every conjunct γ of ϕ of form (1), there is a ∈ S such that a is a witness of γ;
3. for every a ∈ R, for every conjunct γ of ϕ of form (2), if a is not a self-witness of
γ, then one of the successors of a in its tree is a witness of γ for a;
4. for every pair of distinct elements a, b ∈ R\D, for every P, P ′ ∈ τ , if R |= P (a, b)
and R |= P ′(a, b), then P = P ′;
5. for every a ∈ R \ D, for every T, T ′ ∈ τ0 , if R |= T (a, a) and R |= T ′(a, a), then
T = T ′ and a ∈ S;
6. for every pair of distinct elements a, b ∈ R\D, for every P ∈ τ \τ0 , if R |= P (a, b),
then a and b are neighbors in one of the trees of {Ts }s∈S ;
7. for every pair of distinct elements a, b ∈ R \ D, for every T ∈ τ0 , if R |= T (a, b)
then either a and b are in the same subtree Ts and there is a T -path in Ts from a
to b, or there are c, c′ ∈ D such that there is a T -path from a to c in Tc , there is a
T -path from c′ to b in Tc′ and, if c ≠ c′, then R |= T (c, c′).

Theorem 3. Let Δ be a conjunction of ground atoms, ϕ an MGF2 +TG-sentence in
normal form, and A |= Δ ∧ ϕ. Then, there is a Δ-ramified model R for (Δ ∧ ϕ) and a
homomorphism h that maps R to A.

Proof (sketch). Let A be a model of (Δ ∧ ϕ) and D = dom(Δ). For every conjunct
γ of ϕ of form (1) pick a witness aγ ∈ A, and let S be the union of D with the set
of the elements aγ . The structure R is defined as an unraveling of A from elements
of S. More precisely, we start defining R0 = A  S and taking h : R0 → A to be
the identity on R0 . Then, we proceed inductively. For every i ≥ 0, we extend the
structure Ri to Ri+1 by adding, for every element added to Ri in step i, witnesses
for conjuncts of ϕ of the form (2). In a single step of this stage, we have a ∈ Ri ,
γ = ∀x(α(x) → ∃y(β(x, y) ∧ ϑ(x, y))) and h(a) ∈ A. As A |= γ we can find a witness
b′ of γ for h(a) in A. We add a new element b to Ri+1 , define h(b) = b′ and define
tpRi+1 (a, b) using the relevant part (identified by the guard β) of tpA (h(a), b′). After
adding b as a witness of a conjunct γ with β, where β is a T -atom, we ensure that b is
connected by the transitive relation T to all elements c connected to a via a T -path in
Ri , defining tpRi+1 (b, c) from the corresponding 2-type tpA (h(b), h(c)). Other pairs
of distinct elements in Ri+1 are connected using only negative 2-types, i.e., they are not
in the interpretation of any binary relation. One can show that R is a ramified model as
defined in Definition 2, and that h is a homomorphism from R to A.

We can now show that Δ-satisfiability of MGF2 +TG-sentences is decidable.

Theorem 4. Δ-satisfiability of MGF2 +TG is 2EXPTIME-complete in combined com-
plexity, and NP-complete in data complexity.

Proof (sketch). We compute (in exponential space w.r.t. |ϕ|) the set Φ of Lemma 1.
Then we guess a sentence ψ ∈ Φ and check whether (Δ ∧ ψ) has a Δ-ramified model. To
check the latter we first guess a structure D of size at most |dom(Δ)| + |ψ| interpreting
Δ and containing witnesses for conjuncts of ψ of the form (1). Then we universally
choose an element d ∈ D and check whether D can be extended to a Δ-ramified model
with D being the set of root choices. Observe that in the next steps of the procedure,
it suffices to keep for each element a in the model a description of 1-types occurring
on transitive paths from and to a. This information can be stored using exponential
size with respect to |ϕ|. An alternating procedure working in exponential space can be
naturally derived from the construction. If only Δ is considered as part of the input, then
the above procedure works in nondeterministic polynomial time. For the lower bounds,
it is known that satisfiability of MGF2 +TG is already 2EXPTIME-hard [6]. For data
complexity the corresponding NP-hard lower bound follows from Theorem 2 in [18].

It is important to say that by adapting the notion of ramified models introduced in [5]
for GF+TG, one can show that Δ-satisfiability for GF+TG-sentences is decidable, and
of the same complexity. Details are omitted due to space limits.
Query Answering via Unsatisfiability. We now investigate query answering under
MGF2 +TG. Given a database Δ and a query q ∈ UCQ, our goal is to construct a
sentence ΦΔ,q which enjoys the following properties: (i) for each ϕ ∈ MGF2 +TG,
(Δ, ϕ) |= q iff (Δ , ϕ) |= ΦΔ,q , where Δ ⊇ Δ, and (ii) ΦΔ,q is equivalent to an
MGF2 +TG-sentence. Since (Δ , ϕ) |= ΦΔ,q iff (Δ ∧ ϕ ∧ ¬ΦΔ,q ) is unsatisfiable, we
can then rely on the results regarding Δ-satisfiability of MGF2 +TG-sentences. Let us
first introduce the class of single-acyclic queries.

Definition 3. A query q ∈ BCQ is single-acyclic if the following conditions hold: (i) q
is acyclic, (ii) for every pair of distinct variables x, y ∈ var (q), q contains at most one
atom containing both x and y, and (iii) there exists at most one transitive predicate T
such that a reflexive atom of the form T (x, x), where x ∈ var (q), occurs in q.
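For a signature of unary and binary predicates, acyclicity of the associated structure Q amounts to acyclicity of the undirected graph that the binary atoms induce on var (q), so all three conditions can be tested mechanically. A hypothetical checker (our own encoding: a query is a list of (predicate, arguments) pairs over variable names):

```python
from collections import defaultdict

def is_single_acyclic(atoms, transitive_preds):
    """Test conditions (i)-(iii) of single-acyclicity for a BCQ whose
    predicates are unary or binary."""
    pair_count = defaultdict(int)   # atoms per unordered variable pair
    reflexive_trans = set()         # transitive preds in reflexive atoms
    adj = defaultdict(set)          # undirected graph on variables
    for pred, args in atoms:
        if len(args) != 2:
            continue
        x, y = args
        if x == y:
            if pred in transitive_preds:
                reflexive_trans.add(pred)
        else:
            pair_count[frozenset((x, y))] += 1
            adj[x].add(y)
            adj[y].add(x)
    if any(c > 1 for c in pair_count.values()):
        return False                # violates (ii)
    if len(reflexive_trans) > 1:
        return False                # violates (iii)
    # (i): DFS; reaching an already-seen non-parent vertex is a cycle
    seen = set()
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, None)]
        while stack:
            node, parent = stack.pop()
            for nxt in adj[node]:
                if nxt == parent:
                    continue
                if nxt in seen:
                    return False
                seen.add(nxt)
                stack.append((nxt, node))
    return True
```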
As shown in [19], a query q ∈ BCQ is acyclic iff there exists a query p ∈ BCQ
equivalent to q which is also in GF. By exploiting this result, one can easily show that a
single-acyclic query is equivalent to a sentence of MGF2 +TG.
Lemma 3. For each single-acyclic query q ∈ BCQ over τ , there is an MGF2 +TG-
sentence χq of size linear in |q| such that, for every τ -structure A, A |= q iff A |= χq .

Fix a database Δ and a query q ∈ UCQ. Having the notion of single-acyclic queries
in place, we are now ready to construct the sentence ΦΔ,q = ⋁p∈q φΔ,p , where each
disjunct φΔ,p is a union of single-acyclic BCQs constructed as described below. Clearly,
if (Δ, ϕ) |= p, then there exists a homomorphism h that maps p to each model of
(Δ ∧ ϕ), and thus to each Δ-ramified model of (Δ ∧ ϕ). The key idea underlying our
construction is, for each such mapping h, to describe the image h(p) of p in each Δ-
ramified model of (Δ ∧ ϕ) by a union of single-acyclic BCQs. As we shall see below,
for query answering purposes, it suffices to focus on the Δ-ramified models. The formal
construction of φΔ,p is as follows. If p is single-acyclic, then φΔ,p coincides with p;
otherwise, we apply the following steps:

1. Enumerate all the possible mappings. Let H = {h | h is a mapping var (p) →
(var (p) ∪ dom(Δ))}. For h ∈ H, we denote by h(p)Δ the maximal subset of h(p)
which contains only constants of dom(Δ), while h(p)∗ = h(p) \ h(p)Δ .
2. Partition h(p)∗ into different subtrees of a Δ-ramified model. For each h ∈ H,
let Ph be the set of all possible n-tuples ⟨S1 , c1 ⟩, . . . , ⟨Sn , cn ⟩, where 1 ≤ n ≤
|var (p)|, such that: (i) {S1 , . . . , Sn } is a partition of var (p), and (ii) {c1 , . . . , cn }
is a subset of (dom(Δ) ∪ {$1 , . . . , $n }), where the $i 's are auxiliary variables.
3. Focus on a subtree by eliminating crossing transitive edges. For each h ∈ H,
let φh be the set of all possible BCQs that can be constructed as follows: for each
⟨S1 , c1 ⟩, . . . , ⟨Sn , cn ⟩ ∈ Ph , replace each atom T (x, y) of h(p)∗ , where T ∈ τ0 ,
with T (x, ci ) ∧ T (ci , cj ) ∧ T (cj , y) (or with T (x, c) ∧ T (c, y) if ci = cj = c) if
there exist 1 ≤ i ≠ j ≤ n such that x ∈ Si and y ∈ Sj .
4. Eliminate constants. For each h ∈ H, let φ−h be the set of BCQs defined as follows: for each p′ ∈ φh , in φ−h there exists a BCQ obtained from p′ ∧ h(p)Δ by
replacing, for each constant c in p′ ∧ h(p)Δ , each occurrence of c with the variable
♦c , and adding the conjunct Rc (♦c ), where Rc is an auxiliary unary predicate.
5. Describe the image. For each h ∈ H, and for each BCQ p′ ∈ φ−h :
(a) Let p′Δ be the maximal subset of p′ such that dom(p′Δ ) ⊆ {♦c }c∈dom(Δ) , and
let p′1 , . . . , p′m be the maximal connected components of p′ \ p′Δ .
(b) For each i ∈ {1, . . . , m}, if p′i is single-acyclic, then Qp′i = {p′i }; otherwise,
let Qp′i be the set of all single-acyclic BCQs which entail p′i that can be constructed by eliminating induced¹ transitive atoms from p′i ; let Qp′m+1 = {p′Δ }.
(c) If there exists i ∈ {1, . . . , m} such that Qp′i = ∅, then φ−h = ⊥; otherwise,
φ−h is defined as ⋁p′∈φ−h (×1≤i≤m+1 Qp′i ), where ×1≤i≤m+1 Qp′i contains all the
BCQs that can be constructed by keeping exactly one BCQ from each Qp′i .
6. Finalization. Let φΔ,p = ⋁h∈H φ−h .
1
An atom T (x, z) is induced if atoms of the form T (x, y) and T (y, z) already occur in pi .
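Step 1 above is a brute-force enumeration of all functions from var(p) to var(p) ∪ dom(Δ); its size, (|var(p)| + |dom(Δ)|)^{|var(p)|}, is what drives the exponential bound established later in this section. A minimal Python sketch of the enumeration (the variable and constant names are illustrative, not from the paper):

```python
from itertools import product

def all_mappings(variables, constants):
    """Enumerate every mapping h : var(p) -> var(p) ∪ dom(Δ)."""
    targets = list(variables) + list(constants)
    for image in product(targets, repeat=len(variables)):
        yield dict(zip(variables, image))

# a query with var(p) = {x, y} over dom(Δ) = {c}:
H = list(all_mappings(["x", "y"], ["c"]))
assert len(H) == (2 + 1) ** 2    # (|var(p)| + |dom(Δ)|)^{|var(p)|} = 9
assert {"x": "c", "y": "c"} in H
```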
296 G. Gottlob, A. Pieris, and L. Tendera

Before showing soundness and completeness of our construction, let us first establish two auxiliary results. The first one states that, if we focus our attention on Δ-ramified models, then our construction is complete. In the sequel, let Δ∗ = Δ ∧ ⋀_{c∈dom(Δ)} R_c(c).

Lemma 4. Consider a conjunction of ground atoms Δ, and a query q ∈ UCQ. Given a sentence ϕ ∈ MGF²+TG, if (Δ ∧ ϕ) is consistent and (Δ, ϕ) |= q, then R |= Φ_{Δ,q}, for each Δ-ramified model R of (Δ ∧ ϕ), and Φ_{Δ,q} ≢ ⊥.

Proof (sketch). By hypothesis, there exists a BCQ p in q such that (Δ, ϕ) |= p. Fix a Δ-ramified model R for (Δ ∧ ϕ), which exists by Theorem 3, and extend it to a ramified model R∗ interpreting the auxiliary symbols from Δ∗. As R∗ |= (Δ∗ ∧ ϕ), we have R∗ |= p. Let h be the homomorphism that maps p into R∗. Obviously, h ∈ H. Let S be the set of root choices in R∗. Using S we define a partition of var(p) into subsets mapped into the same subtree T_s of R∗. By construction, φ_h is nonempty, and one can show that there exists at least one disjunct γ in φ_{Δ,p} such that R |= γ.

For query answering we can consider only the Δ-ramified models of a theory.

Lemma 5. Consider a conjunction of ground atoms Δ, a sentence ϕ ∈ MGF²+TG, and a query q ∈ UCQ. Then (Δ, ϕ) |= q iff R |= q, for each Δ-ramified model R of (Δ ∧ ϕ).

Proof. (⇒) By hypothesis, each model of (Δ ∧ ϕ) entails q, and the claim follows. (⇐) Towards a contradiction, assume that each Δ-ramified model entails q, but (Δ, ϕ) ⊭ q. The latter implies that there exists a model A of (Δ ∧ ϕ) such that A ⊭ q. By Theorem 3, there exists a ramified model R of (Δ ∧ ϕ), and a homomorphism h that maps R into A. Since R |= q, there exists a homomorphism μ that maps q into R. Therefore, h ∘ μ maps q into A, and thus A |= q, which is a contradiction. The claim follows.

We are now ready to establish soundness and completeness of the construction.

Theorem 5. Consider a conjunction of ground atoms Δ, and a query q ∈ UCQ. For each ϕ ∈ MGF²+TG, (Δ, ϕ) |= q iff (Δ∗, ϕ) |= Φ_{Δ,q}.

Proof. (⇒) If (Δ ∧ ϕ) is not consistent, then also (Δ∗ ∧ ϕ) is not consistent and the claim follows. In case (Δ ∧ ϕ) is consistent, the claim follows immediately from Lemmas 4 and 5. (⇐) By hypothesis, there exists a BCQ p ∈ Φ_{Δ,q} such that (Δ∗, ϕ) |= p. By construction, p entails q, and thus (Δ∗, ϕ) |= q. The auxiliary predicates of the form R_c, where c ∈ dom(Δ), being introduced only during the construction of Φ_{Δ,q}, do not match any predicate in q, and hence (Δ, ϕ) |= q.

Let us now investigate the complexity of the obtained formula. For brevity, let r = |τ|. Also, given a query q ∈ UCQ, let H_q = max_{p∈q} |p| and V_q = max_{p∈q} |var(p)|.

Lemma 6. Consider a conjunction of ground atoms Δ, and a query q ∈ BCQ. It holds that (1) |Φ_{Δ,q}| is at most |q| · (V_q + |dom(Δ)|)^{V_q} · r^{O(H_q)} · H_q^{O(H_q)}, (2) H_{Φ_{Δ,q}} = O(H_q), and (3) Φ_{Δ,q} can be constructed in EXPTIME w.r.t. q, and in PTIME w.r.t. Δ.
Querying the Guarded Fragment with Transitivity 297

From Theorem 5 we get that (Δ, ϕ) |= q iff (Δ∗ ∧ ϕ ∧ ¬Φ_{Δ,q}) is unsatisfiable. By Lemma 6, we can construct Φ_{Δ,q} in EXPTIME w.r.t. q. Then, for each of its disjuncts p, which are exponentially many, we call the 2EXPTIME algorithm for Δ-satisfiability of MGF²+TG-sentences, provided by Theorem 4, in order to check whether (Δ∗ ∧ ϕ ∧ ¬p) is unsatisfiable. Since |p| is linear w.r.t. q, the above describes a 2EXPTIME procedure for query answering. Now, in case both ϕ and q are fixed, it is easy to see that Theorem 4 and Lemma 6 give us a coNP procedure for query answering. The double-exponential-time lower bound follows from the fact that satisfiability of MGF²+TG-sentences is 2EXPTIME-hard [6]. The coNP-hardness is inherited immediately from [20], where it was shown that query answering under a single sentence of the form ∀x(R₁(x) → R₂(x) ∨ R₃(x)) is coNP-hard in data complexity.
Theorem 6. UCQ answering for MGF²+TG is 2EXPTIME-complete in combined complexity, and coNP-complete in data complexity.
Restricting the Query. We conclude this section by briefly discussing how we can gain decidability of query answering under GF+TG. Although single-acyclic queries were tailored for MGF²+TG, it turns out that they can be naturally extended to single-transitive-acyclic queries, which are suitable for querying arbitrary GF+TG-sentences.
Definition 4. A query q ∈ BCQ is single-transitive-acyclic if the following holds: (i) q is acyclic, and (ii) for each hyperedge e in the hypergraph of q, there exists at most one pair of distinct variables x, y ∈ e, and at most one T ∈ τ₀, such that T(x, y) is in q.
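Condition (ii) of Definition 4 can be checked edge by edge: count, per hyperedge, the transitive atoms whose distinct endpoints both lie inside it. A hedged Python sketch (the representation of queries and hyperedges is ours, not the paper's):

```python
def satisfies_condition_ii(hyperedges, transitive_atoms):
    """Condition (ii): every hyperedge contains at most one transitive
    atom T(x, y) whose distinct endpoints both lie inside the edge."""
    for edge in hyperedges:
        inside = [(T, x, y) for (T, x, y) in transitive_atoms
                  if x != y and x in edge and y in edge]
        if len(inside) > 1:
            return False
    return True

# q = R(x, y, z) ∧ T(x, y): one transitive atom per edge, (ii) holds
assert satisfies_condition_ii([{"x", "y", "z"}], [("T", "x", "y")])
# a second transitive atom inside the same edge violates (ii)
assert not satisfies_condition_ii([{"x", "y", "z"}],
                                  [("T", "x", "y"), ("T2", "y", "z")])
```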
Since a BCQ q is acyclic iff there exists a guarded BCQ equivalent to q [19], one can
show that a single-transitive-acyclic query is equivalent to a GF+TG-sentence.
Lemma 7. For each single-transitive-acyclic query q ∈ BCQ over τ , there is a
GF+TG-sentence χq of size linear in |q| such that, for every τ -structure A, A |= q
iff A |= χq .
Given a database Δ, a sentence ϕ ∈ GF+TG, and a single-transitive-acyclic BCQ q,
the above lemma implies that one can decide whether (Δ, ϕ) |= q just by checking if
the sentence (Δ ∧ ϕ ∧ ¬χq ), where (ϕ ∧ ¬χq ) ∈ GF+TG, is unsatisfiable. Since the
results on Δ-satisfiability established above hold, not only for MGF2 +TG, but for the
whole fragment under consideration, we get the following complexity result.
Theorem 7. UCQ answering under GF+TG with single-transitive-acyclic BCQs is 2EXPTIME-complete in combined complexity, and coNP-complete in data complexity.

5 Future Work

We state three open problems for query answering. The first one concerns the decidability of guarded TGDs with transitive guards. The second one is whether MGF²+TG can be safely combined with counting quantifiers, an important feature for many computational logics. Finally, the third one is to pinpoint the complexity of MGF²+TG under finite models; recall that in this work we considered arbitrary (finite or infinite) models. For the latter, since MGF²+TG does not enjoy the finite model property, completely new techniques are needed.

Acknowledgements. Georg Gottlob and Lidia Tendera acknowledge the EPSRC Grant
EP/H051511/1 “ExODA”. Lidia Tendera also gratefully acknowledges her association
with St. John’s College during her visit to Oxford in 2012, and the support of Polish
Ministry of Science and Higher Education Grant N N206 37133. Andreas Pieris ac-
knowledges the ERC Grant 246858 “DIADEM” and the EPSRC Grant EP/G055114/1
“Constraint Satisfaction for Configuration: Logical Fundamentals, Algorithms and
Complexity”.

References
1. Andréka, H., van Benthem, J., Németi, I.: Modal languages and bounded fragments of pred-
icate logic. J. Philosophical Logic 27, 217–274 (1998)
2. Grädel, E.: On the restraining power of guards. J. Symb. Log. 64(4), 1719–1742 (1999)
3. Ganzinger, H., Meyer, C., Veanes, M.: The two-variable guarded fragment with transitive
relations. In: Proc. of LICS, pp. 24–34 (1999)
4. Szwast, W., Tendera, L.: On the decision problem for the guarded fragment with transitivity.
In: Proc. of LICS, pp. 147–156 (2001)
5. Szwast, W., Tendera, L.: The guarded fragment with transitive guards. Ann. Pure Appl.
Logic 128(1-3), 227–276 (2004)
6. Kieroński, E.: The two-variable guarded fragment with transitive guards is 2EXPTIME-hard.
In: Gordon, A.D. (ed.) FOSSACS 2003. LNCS, vol. 2620, pp. 299–312. Springer, Heidelberg
(2003)
7. Calì, A., Gottlob, G., Kifer, M.: Taming the infinite chase: Query answering under expressive
relational constraints. In: Proc. of KR, pp. 70–80 (2008)
8. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning
and efficient query answering in description logics: The DL-Lite family. J. Autom. Reason-
ing 39(3), 385–429 (2007)
9. Bárány, V., Gottlob, G., Otto, M.: Querying the guarded fragment. In: Proc. of LICS,
pp. 1–10 (2010)
10. Baget, J.F., Mugnier, M.L., Rudolph, S., Thomazo, M.: Walking the complexity lines for
generalized guarded existential rules. In: Proc. of IJCAI, pp. 712–717 (2011)
11. Krötzsch, M., Rudolph, S.: Extending decidable existential rules by joining acyclicity and
guardedness. In: Proc. of IJCAI, pp. 963–968 (2011)
12. Pratt-Hartmann, I.: Data-complexity of the two-variable fragment with counting quantifiers.
Inf. Comput. 207(8), 867–888 (2009)
13. Bárány, V., ten Cate, B., Segoufin, L.: Guarded negation. In: Aceto, L., Henzinger, M., Sgall,
J. (eds.) ICALP 2011, Part II. LNCS, vol. 6756, pp. 356–367. Springer, Heidelberg (2011)
14. Gottlob, G., Manna, M., Morak, M., Pieris, A.: On the complexity of ontological reasoning
under disjunctive existential rules. In: Rovan, B., Sassone, V., Widmayer, P. (eds.) MFCS
2012. LNCS, vol. 7464, pp. 1–18. Springer, Heidelberg (2012)
15. Kazakov, Y.: Saturation-based decision procedures for extensions of the guarded fragment.
PhD thesis, Universität des Saarlandes (2005)
16. Kieroński, E.: Results on the guarded fragment with equivalence or transitive relations. In:
Ong, L. (ed.) CSL 2005. LNCS, vol. 3634, pp. 309–324. Springer, Heidelberg (2005)
17. Beeri, C., Vardi, M.Y.: A proof procedure for data dependencies. J. ACM 31(4), 718–741
(1984)
18. Pratt-Hartmann, I.: Complexity of the two-variable fragment with counting quantifiers. Jour-
nal of Logic, Language and Information 14(3), 369–395 (2005)
19. Gottlob, G., Leone, N., Scarcello, F.: Robbers, marshals, and guards: Game theoretic and
logical characterizations of hypertree width. J. Comput. Syst. Sci. 66(4), 775–808 (2003)
20. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Data complexity of query answering in description logics. Artif. Intell. 195, 335–360 (2013)
Contractive Signatures with Recursive Types,
Type Parameters, and Abstract Types

Hyeonseung Im¹, Keiko Nakata², and Sungwoo Park³

¹ LRI - Université Paris-Sud 11, Orsay, France
² Institute of Cybernetics at Tallinn University of Technology, Estonia
³ Pohang University of Science and Technology, Republic of Korea

Abstract. Although theories of equivalence or subtyping for recursive


types have been extensively investigated, sophisticated interaction be-
tween recursive types and abstract types has gained little attention. The
key idea behind type theories for recursive types is to use syntactic con-
tractiveness, meaning every μ-bound variable occurs only under a type
constructor such as → or ∗. This syntactic contractiveness guarantees the
existence of the unique solution of recursive equations and thus has been
considered necessary for designing a sound theory for recursive types.
However, in an advanced type system, such as OCaml, with recursive
types, type parameters, and abstract types, we cannot easily define the
syntactic contractiveness of types. In this paper, we investigate a sound
type system for recursive types, type parameters, and abstract types.
In particular, we develop a new semantic notion of contractiveness for
types and signatures using mixed induction and coinduction, and show
that our type system is sound with respect to the standard call-by-value
operational semantics, which eliminates signature sealings. Moreover we
show that while non-contractive types in signatures lead to unsound-
ness of the type system, they may be allowed in modules. We have also
formalized the whole system and its type soundness proof in Coq.

1 Introduction
Recursive types are widely used features in most programming languages and
the key constructs to exploit recursively defined data structures such as lists and
trees. In type theory, there are two ways to exploit recursive types, namely by
using the iso-recursive or equi-recursive formulation.
In the iso-recursive formulation, a recursive type μX.τ is considered isomorphic but not equal to its one-step unfolding {X ↦ μX.τ}τ. Correspondingly, the term language provides built-in coercion functions called fold and unfold, witnessing this isomorphism:

  fold : {X ↦ μX.τ}τ → μX.τ
  unfold : μX.τ → {X ↦ μX.τ}τ
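The two coercions can be modeled concretely: a value of the recursive type is just a box around its one-step unfolding, with fold and unfold converting between the two. A small Python sketch (our encoding, not from the paper), taking μX. unit + (int ∗ X), i.e. integer lists, with None standing for the unit (nil) case:

```python
class Mu:
    """A value of the recursive type, boxing its one-step unfolding."""
    def __init__(self, unfolding):
        self.unfolding = unfolding

def fold(u):      # fold : {X ↦ μX.τ}τ → μX.τ
    return Mu(u)

def unfold(m):    # unfold : μX.τ → {X ↦ μX.τ}τ
    return m.unfolding

# the integer list [1, 2]
xs = fold((1, fold((2, fold(None)))))
assert unfold(xs)[0] == 1
assert unfold(unfold(xs)[1])[0] == 2
```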

An expanded version of this paper, containing detailed proofs and omitted definitions, and the Coq development are available at http://toccata.lri.fr/~im.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 299–311, 2013.
© Springer-Verlag Berlin Heidelberg 2013
300 H. Im, K. Nakata, and S. Park

Although in the iso-recursive formulation programs have to be annotated with fold and unfold coercion functions, this formulation usually simplifies typechecking in the presence of recursive types. For example, datatypes in SML [11] are a special form of iso-recursive types.
In contrast, the equi-recursive formulation defines a recursive type μX.τ to be equal to its one-step unfolding {X ↦ μX.τ}τ and does not require any coercion
functions as in the iso-recursive formulation. For example, polymorphic variant
and object types as implemented in OCaml [1] require structural recursive types
and thus equi-recursive types. To use equi-recursive types, it suffices to add either
of the two rules below into the type system:
  Γ ⊢ e : τ    τ ≡ σ                Γ ⊢ e : τ    τ ≤ σ
  ------------------ (typ-eq)       ------------------ (typ-sub)
  Γ ⊢ e : σ                         Γ ⊢ e : σ
Here the rule typ-eq exploits the equivalence relation ≡ between recursive types,
and the rule typ-sub the subtyping relation ≤. The equi-recursive formulation
makes it easier to write programs using recursive types, but it raises a tricky
problem in typechecking: we need to decide when two recursive types are in the
equivalence or subtyping relation. In response, several authors have investigated
theories of equivalence or subtyping for equi-recursive types [2,3,6,18].
The key idea behind type theories for equi-recursive types is to use syntactic
contractiveness [9], meaning that given a recursive type μX.τ , the use of the
recursion variable X in τ must occur under a type constructor such as → or ∗.
In other words, non-contractive types such as μX.X and unit ∗ μX.X are rejected
at the syntax level. For example, most previous work considers variants of the
following type language:
τ, σ ::= X | τ → σ | μX.(τ → σ)
The main reason for employing this syntactic contractiveness is to guarantee
the existence of the unique solution of recursive equations introduced by equi-
recursive types and obtain a sound type theory.
However, in an advanced type system, such as OCaml, with recursive types,
type parameters, abstract types, and modules, we cannot easily define the syn-
tactic contractiveness of types. To illustrate, consider the following code fragment
which is allowed in OCaml using the “-rectypes” option. The “-rectypes” option
allows arbitrary equi-recursive types.
module type T = sig
  type 'a t
  type s = s t and u = u t
  val f : int -> s
  val g : u -> bool
end

module M : T = struct
  type 'a t = 'a
  type s = int and u = bool
  let f x = x
  let g x = x
end

let h x = M.g (M.f x)
let y = h 3 (* run-time error *)
Under the usual interpretation of ML signature sealings, the module M correctly
implements, or satisfies, the signature T. Moreover the types s and u in T are
Contractive Signatures with Recursive Types 301

considered contractive in OCaml since the type cycles are guarded by the param-
eterized abstract type t in T; hence the signature T is well-formed. Furthermore
the types s and u in T are structurally equivalent, and thus the values h and
y are well-formed with types int -> bool and bool, respectively. At run-time,
however, the evaluation of y, i.e., h 3, leads to an unknown constructor 3 of
type bool, breaking type soundness.1
In this paper, we investigate a type system for equi-recursive types, type pa-
rameters, and abstract types. In our system, recursive types may be declared by
using type definitions of the form type α t = τ where both the type parameter α
and the recursive type t may appear in the type τ .2 Abstract types may be de-
clared by using the usual ML-style signature sealing operations (Section 2.1). For
this system, we develop a new notion of semantic contractiveness for types and
signatures using mixed induction and coinduction (Section 2.3). Our semantic
contractiveness determines the types s and u in the signature T above to be non-
contractive, and our type system rejects T. We then show that our type system
with semantic contractiveness is sound with respect to the standard call-by-value
operational semantics, which eliminates signature sealings (Section 2.4).
Another notable result is that even in the presence of non-contractive types
in modules, we can develop a sound type system where well-typed programs
cannot go wrong. This is particularly important since our type soundness result
may give a strong hint about the soundness of OCaml, which allows us to define
non-contractive types using recursive modules and signature sealings.
Our contributions are summarized as follows.
– To our knowledge, we are the first to consider a type system for equi-recursive
types, type parameters, and abstract types, and define a type sound semantic
notion of contractiveness.
– Since the OCaml type system allows both recursive types and abstract types,
and non-contractive types in modules, our type soundness result gives a
strong hint about how to establish the soundness of OCaml.
– We have formalized the whole system and its type soundness proof in Coq
version 8.4. Our formalization extensively uses mixed induction and coin-
duction, so it may act as a good reference for using mixed induction and
coinduction in Coq.
The remainder of the paper is organized as follows. Section 2 presents a type
system for recursive types, type parameters, and abstract types. In particular, we
consider a simple module system with a signature sealing operation and define a
structural type equivalence and semantic contractiveness using mixed induction
and coinduction. Section 3 discusses Coq mechanization and an algorithmic type
equivalence and contractiveness. Section 4 discusses related work and Section 5
concludes.
¹ We discovered the above bug together with Jacques Garrigue and it has been fixed in the development version of OCaml (available in the OCaml svn repository).
² We do not use the usual μ-notation because encoding mutually recursive type definitions into μ-types requires type-level pairs and projection, complicating the theory. Moreover, the use of type definitions better reflects the OCaml implementation.

Syntax

  types           τ, σ ::= unit | α | τ → σ | τ₁ ∗ τ₂ | τ t
  expressions     e ::= () | x | λx:τ. e | e₁ e₂ | (e₁, e₂) | πᵢ(e) | fix x:τ. e | l
  specifications  D ::= type α t | type α t = τ | val l : τ
  definitions     d_τ ::= type α t = τ   (type definitions)
                  d_e ::= let l = e      (value definitions)
  signatures      S ::= · | D, S
  modules         M ::= (d_τ, d_e)
  programs        P ::= (M, S, e) | (M, e)

Well-formed types  S; Σ ⊢ τ type

  type variable sets  Σ ::= · | {α}

Standard well-formedness rules for types are omitted.

  S ∋ type α t    S; Σ ⊢ τ type             S ∋ type α t = σ    S; Σ ⊢ τ type
  ----------------------------- (wft-abs)   ----------------------------- (wft-app)
  S; Σ ⊢ τ t type                           S; Σ ⊢ τ t type

Well-formed specifications and signatures  S ⊢ D ok    ⊢ S ok

                            S; {α} ⊢ τ type               S; · ⊢ τ type
  --------------- (wfs-abs) ------------------- (wfs-type) --------------- (wfs-val)
  S ⊢ type α t ok           S ⊢ type α t = τ ok            S ⊢ val l : τ ok

  BN(S) distinct    ∀D ∈ S, S ⊢ D ok
  ---------------------------------- (wf-sig)
  ⊢ S ok

Fig. 1. Syntax and well-formedness

2 A Type System λ^rec_abs

This section presents a type system λ^rec_abs for recursive types, type parameters, and
abstract types, permitting non-contractive types in modules. Section 2.1 presents
the syntax and inference rules for well-formedness. Section 2.2 defines a struc-
tural type equivalence and Section 2.3 defines a semantic contractiveness using
mixed induction and coinduction. Finally Section 2.4 presents a type soundness
result with respect to the standard call-by-value operational semantics.

2.1 Syntax and Well-Formedness

Figure 1 shows the syntax for λ^rec_abs and inference rules for well-formedness. We
use meta-variables α, β for type variables, s, t, u for type names, x, y, z for value
variables, and l for value names. Both types and expressions are defined in
the standard way as in the simply-typed λ-calculus except that they may refer
to type definitions and value definitions in modules. For simplicity, we include
simple signatures and modules only, which suffice to introduce abstract types,
and exclude nested modules and functors. Abstract types are then introduced

by a program of the form (M, S, e), called a signature sealing, which hides the
implementation details of the module M behind the signature S.
Recursive types are introduced by type definitions of the form type α t = τ
where τ may refer to the name t with no restriction. For example, we may define
recursive types such as type α t = α t → α t and type α t = α t → (α ∗ α) t.
We also permit a non-contractive type definition such as type α t = α t in a
module but reject a non-contractive type specification in a signature, since the
latter breaks type soundness. Any sequence dτ of type definitions (or type spec-
ifications) in modules (or signatures) may be mutually recursive, whereas no
sequence de of value definitions are mutually recursive. The main reason for this
design choice is that our focus in this paper is to investigate the interaction be-
tween non-contractive recursive types and abstract types. Moreover, to simplify
the discussion, we consider only those type constructors with a single parameter.
We can easily add into the system nullary or multi-parameter type constructors.
As for well-formedness of types, we use a judgment S; Σ ⊢ τ type to mean that type τ is well-formed under context (S, Σ). Here we use a type variable set Σ to denote either an empty set or a singleton set. We also use judgments S ⊢ D ok and ⊢ S ok to mean that specification D and signature S are well-formed, respectively. Most of the rules are standard, and we only remark that a signature S is
well-formed only if all bound type names in S are distinct from each other and
each type definition is well-formed under S (rule wf-sig). These well-formedness
conditions for signatures allow us to define arbitrarily mutually recursive type
definitions. In the remainder of the paper, we assume that every type and sig-
nature that we mention is well-formed, without explicitly saying so.

2.2 Type Equivalence

In this section, we define a type equivalence relation in terms of unfolding type definitions. The usual β-equivalence is then embedded into our type equivalence relation by means of unfolding. We use a judgment S ⊢ τ ↪ σ to mean that type τ unfolds into σ by expanding a type name in τ into its definition under S. The rule unfold below, which is the only rule for unfolding type definitions, implements this idea: given a type application τ t, it replaces type variable α with argument τ in the definition σ of t.

  S ∋ type α t = σ
  --------------------- (unfold)
  S ⊢ τ t ↪ {α ↦ τ}σ

We write S ⊢ τ ↪* σ for the reflexive and transitive closure of unfolding. We say that a type τ is vacuous under signature S if, for all τ′ with S ⊢ τ ↪* τ′, there exists τ″ such that S ⊢ τ′ ↪ τ″. In other words, vacuous types are those types that allow us to unfold type definitions infinitely using the rule unfold.
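The rule unfold is a single substitution into the stored definition. A small Python sketch with types encoded as strings and tuples (our encoding, not the paper's): a definition type α t = σ is stored as sig["t"] = (α, σ), and an application τ t as the pair (τ, "t").

```python
def subst(ty, var, arg):
    """Substitute {var ↦ arg} throughout a type term."""
    if ty == var:
        return arg
    if isinstance(ty, tuple):
        return tuple(subst(t, var, arg) for t in ty)
    return ty

def unfold(sig, ty):
    """Rule unfold: expand an application (arg, t) one step using the
    definition 'type α t = σ' stored in sig; None if no rule applies."""
    if isinstance(ty, tuple) and len(ty) == 2 and ty[1] in sig:
        var, body = sig[ty[1]]
        return subst(body, var, ty[0])
    return None

# type 'a t = 'a * 'a: unfolding (int, t) yields int * int
sig = {"t": ("a", ("*", "a", "a"))}
assert unfold(sig, ("int", "t")) == ("*", "int", "int")
# type 'a s = 'a s is vacuous: every unfolding step applies again
vac = {"s": ("a", ("a", "s"))}
assert unfold(vac, ("int", "s")) == ("int", "s")
```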
The type equivalence relation is defined by nesting induction into coinduction.
In order to structurally compare types for their equivalence, we should be able to
check equivalence for vacuous types. While using induction to compare structures
of types, we use coinduction to compare infinite unfoldings of types and thus to

Coinductive type equivalence  S; Σ ⊢ τ₁ ≡ τ₂

  S; Σ ⊢ τ =_≡ σ             S ⊢ τ ↪ τ′    S ⊢ σ ↪ σ′    S; Σ ⊢ τ′ ≡ σ′
  ============== (eq-ind)    ======================================== (eq-coind)
  S; Σ ⊢ τ ≡ σ               S; Σ ⊢ τ ≡ σ

Inductive type equivalence  S; Σ ⊢ τ₁ =_R τ₂

  -------------------- (eq-unit)   ---------------- (eq-var)
  S; Σ ⊢ unit =_R unit             S; {α} ⊢ α =_R α

  S; Σ ⊢ τᵢ R σᵢ (i = 1, 2)            S; Σ ⊢ τᵢ R σᵢ (i = 1, 2)
  -------------------------- (eq-fun)  -------------------------- (eq-prod)
  S; Σ ⊢ τ₁ → τ₂ =_R σ₁ → σ₂           S; Σ ⊢ τ₁ ∗ τ₂ =_R σ₁ ∗ σ₂

  S ∋ type α t    S; Σ ⊢ τ R σ
  ---------------------------- (eq-abs)
  S; Σ ⊢ τ t =_R σ t

  S ⊢ τ ↪ τ′    S; Σ ⊢ τ′ =_R σ                S ⊢ σ ↪ σ′    S; Σ ⊢ τ =_R σ′
  ----------------------------- (eq-lunfold)   ----------------------------- (eq-runfold)
  S; Σ ⊢ τ =_R σ                               S; Σ ⊢ τ =_R σ

Fig. 2. Type equivalence

check equivalence for vacuous types. Figure 2 shows inference rules for type equivalence, defined using the rule unfold. We use a judgment S; Σ ⊢ τ₁ ≡ τ₂ to mean that τ₁ and τ₂ are coinductively equivalent under context (S, Σ), and a judgment S; Σ ⊢ τ₁ =_R τ₂ to mean that τ₁ and τ₂ are inductively equivalent. Note that the inductive equivalence relation =_R is parameterized over a relation R, which is instantiated with the coinductive equivalence relation ≡ in the rule eq-ind. This way, we nest the inductive equivalence relation into the coinductive equivalence relation³. We use a double horizontal line for a coinductive rule and a single horizontal line for an inductive rule.
The rule eq-coind is a coinductive rule for checking equivalence between vacuous types. To show that two vacuous types τ and σ are equivalent, that is, S; Σ ⊢ τ ≡ σ, we repeatedly apply the rule eq-coind. When we get the very same
proposition to be proved in the premise, the proof is completed by coinduction.
Notably vacuous types are only equivalent to vacuous types. As for equivalence
for types other than vacuous types, we use the rule eq-ind, which nests induction
into coinduction, to compare their structures.
The inductive type equivalence compares structures of types. Given a pair
of types, we apply the rule eq-lunfold or eq-runfold a finite number of times,
unfolding type definitions, until we get a pair of the unit type, type variables,
function types, or product types. Then we structurally compare them. Note that
the rules eq-lunfold and eq-runfold are the only rules where induction plays a role.
It is crucial that these rules are defined inductively; if we allow them to be used
coinductively, a vacuous type becomes equivalent to any type. The rules eq-unit
for the unit type and eq-var for type variables are standard. The rules eq-fun
³ A definition of the form νX.F(X, μY.G(X, Y)).

Contractive types and signatures  S ⇓ τ    S ↓_C τ    S ⇓

  S ↓_⇓ τ
  ======= (c-coind)   ---------- (c-unit)   ------- (c-var)
  S ⇓ τ               S ↓_C unit            S ↓_C α

  (S, τᵢ) ∈ C (i = 1, 2)           (S, τᵢ) ∈ C (i = 1, 2)
  ---------------------- (c-fun)   ---------------------- (c-prod)
  S ↓_C τ₁ → τ₂                    S ↓_C τ₁ ∗ τ₂

  S ∋ type α t    S ↓_C τ           S ⊢ τ ↪ σ    S ↓_C σ
  ----------------------- (c-abs)   --------------------- (c-type)
  S ↓_C τ t                         S ↓_C τ

  ∀(type α t = τ) ∈ S, S ⇓ τ
  -------------------------- (c-sig)
  S ⇓

Fig. 3. Contractive types and signatures

for function types, eq-prod for product types, and eq-abs for abstract types are
where the inductive equivalence goes back to the coinductive equivalence.
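Read algorithmically, nesting induction into coinduction corresponds to the familiar "assume the pair currently being checked" technique for equi-recursive equivalence. A hedged Python sketch over transparent (nullary) type definitions — the encoding is ours, and every defined name is treated as unfoldable, so it models only the eq-lunfold/eq-runfold part of the system, not abstract types:

```python
def equiv(defs, t1, t2, assumed=frozenset()):
    """Equi-recursive equivalence: a pair already under examination
    is assumed equal (the coinductive hypothesis)."""
    if (t1, t2) in assumed:
        return True                      # close the loop, as in eq-coind
    assumed = assumed | {(t1, t2)}
    # expand defined names one step (the inductive eq-lunfold/eq-runfold)
    if isinstance(t1, str) and t1 in defs:
        return equiv(defs, defs[t1], t2, assumed)
    if isinstance(t2, str) and t2 in defs:
        return equiv(defs, t1, defs[t2], assumed)
    if isinstance(t1, str) or isinstance(t2, str):
        return t1 == t2                  # unit / type variables
    if t1[0] != t2[0] or len(t1) != len(t2):
        return False                     # mismatched constructors
    return all(equiv(defs, a, b, assumed) for a, b in zip(t1[1:], t2[1:]))

# two structurally equivalent recursive definitions, in the spirit of
# s and u from the introduction: s = s -> unit and u = u -> unit
defs = {"s": ("->", "s", "unit"), "u": ("->", "u", "unit")}
assert equiv(defs, "s", "u")
assert not equiv(defs, "s", "unit")
```

The `assumed` set grows only over pairs of subterms already present in the input, so the check terminates on cyclic definitions.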
With this definition of type equivalence, for example, we can now prove that the types s and u in the signature T in the introduction are equivalent as follows:

                                  (coinduction hypothesis)
  T ∋ type 'a t    T; · ⊢ s ≡ u
  ----------------------------- (eq-abs)
  T; · ⊢ s t =_≡ u t
  ============================= (eq-ind, eq-lunfold, eq-runfold)
  T; · ⊢ s ≡ u
Our type equivalence is indeed an equivalence relation, i.e., reflexive, sym-
metric, and transitive (see the expanded version for the proof).

2.3 Contractive Types and Signatures


Given a program (M, S, e), we restrict every type τ in a type specification
type α t = τ in S to be contractive to obtain type soundness as illustrated
in the introduction. Intuitively a type is contractive if any sequence of its un-
folding eventually produces a type constructor such as unit, α, →, or ∗. A subtle
case is a type application τ t where t is an abstract type: we require that τ be
contractive. For instance, assuming t is an abstract type, type α s = (α s ∗ α s) t
and type α s = ((α ∗ α) t) t are contractive, but type α s = (α s) t is not. This
way, we avoid the possibility of a type specification type α s = τ in a signature
to degenerate into type α s = α s during subtyping.
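This intuition can be turned into a checker that mirrors the nested induction and coinduction: an inductive pass tracks the names unfolded since the last → or ∗ constructor, while a coinductive memo of pending obligations closes loops. A rough Python sketch over a nullary encoding of our own devising (not the paper's): a type is a string (unit, a variable, or a defined name) or a tuple ("->", ..), ("*", ..), or ("app", arg, t) for an application of an abstract t.

```python
def co_contractive(defs, ty, proving=frozenset()):
    """S ⇓ ty (coinductive phase): an obligation already being
    proved is assumed to hold, closing the loop (cf. c-coind)."""
    if ty in proving:
        return True
    return ind_contractive(defs, ty, proving | {ty}, frozenset())

def ind_contractive(defs, ty, proving, unfolded):
    """S ↓ ty (inductive phase): 'unfolded' tracks names expanded since
    the last constructor; re-expanding one means the derivation never
    bottoms out, i.e. the type is not contractive."""
    if isinstance(ty, tuple) and ty[0] in ("->", "*"):
        # a constructor guards its components: fresh coinductive goals
        return all(co_contractive(defs, t, proving) for t in ty[1:])
    if isinstance(ty, tuple) and ty[0] == "app":
        _, arg, _t = ty            # abstract application: check the argument
        return ind_contractive(defs, arg, proving, unfolded)
    if ty not in defs:
        return True                # unit or a type variable
    if ty in unfolded:
        return False               # unguarded cycle
    return ind_contractive(defs, defs[ty], proving, unfolded | {ty})

# type s = (s * s) t (t abstract) is contractive: * guards the cycle
assert co_contractive({"s": ("app", ("*", "s", "s"), "t")}, "s")
# type s = s t is not: the cycle through the abstract t is unguarded
assert not co_contractive({"s": ("app", "s", "t")}, "s")
```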
Figure 3 shows inference rules for contractive types and signatures. We use two judgments, S ⇓ τ and S ↓_C τ, to define contractive types: the former defines coinductive contractiveness and the latter inductive contractiveness. The basic idea of nesting induction into coinduction is the same as for type equivalence. Note that the rule c-coind is the only coinductive rule for checking contractiveness of a type, which nests induction into coinduction.
The inductive contractiveness is defined using six rules: two axioms, two rules
going back to the coinductive contractiveness, and two inductive rules for type
applications. The unit type and type variables are by definition inductively con-
tractive (rules c-unit and c-var). In the rules c-fun and c-prod, a function type

τ₁ → τ₂ and a product type τ₁ ∗ τ₂ are inductively contractive under signature S if each component τᵢ and S are related by C, which is instantiated with the coinductive contractiveness ⇓ in the rule c-coind. The rules c-abs and c-type are where induction plays a role. An abstract type τ t is inductively contractive if so is its argument τ (rule c-abs). Finally, a type τ is inductively contractive if so is its unfolding σ (rule c-type).
is its unfolding σ (rule c-type).
A signature S is contractive, denoted by S ⇓, if for every type specification
in S, its right hand side is contractive under S (rule c-sig). With this definition
of contractiveness, for example, now the signature T in the introduction is not
contractive because the type s t in the type specification type s = s t (or u t
in type u = u t) cannot be proved to be contractive:

                                  (an infinite derivation)
                                           ⋮
  T ∋ type 'a t    T ↓_⇓ s
  ------------------------- (c-abs)
  T ⊢ s ↪ s t    T ↓_⇓ s t
  ------------------------- (c-type)
  T ∋ type 'a t    T ↓_⇓ s
  ========================= (c-coind, c-abs)
  T ⇓ s t

2.4 Type Soundness


In this section, we prove type soundness of λ^rec_abs by the usual progress and preservation properties (Theorems 1 and 4). First we give an overview of typing rules
and reduction rules in Figure 4 (where standard typing rules and reduction rules
for expressions are omitted). Most of the typing rules are standard, and we only
remark that a sealed program (M, S, e) is well-typed if the module M has some signature S′, S is contractive, S′ is a sub-signature of S, and e is well-typed under the sealed signature S (rule typ-prog-seal). Here we use a subtyping judgment S′ ≤ S, which is defined as for subtyping for record types, where width subtyping is allowed: S′ may include more specifications than S. Furthermore
any type specification is considered to be a sub-specification of an abstract type
specification with the same bound name.
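Width subtyping of signatures can be pictured as a per-specification membership check. A toy Python sketch, where the list-of-pairs representation and the matching policy are ours and deliberately gloss over type parameters and type equivalence:

```python
def sub_signature(s1, s2):
    """s1 ≤ s2: every specification of s2 is matched in s1 (s1 may
    contain more: width subtyping); a manifest spec (rhs given) must
    agree, while an abstract spec (rhs None) is matched by anything."""
    specs1 = dict(s1)
    for name, rhs in s2:
        if name not in specs1:
            return False
        if rhs is not None and specs1[name] != rhs:
            return False
    return True

# loosely modeled on the introduction: a concrete module matched
# against a sealing signature whose types are all kept abstract
impl = [("t", "'a"), ("s", "int"), ("u", "bool")]
T = [("t", None), ("s", None), ("u", None)]
assert sub_signature(impl, T)
assert not sub_signature(impl, [("v", None)])   # missing specification
```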
Reduction rules are also standard. Given a sealed program (M, S, e), we first
remove the sealed signature. Then we evaluate the module M to a module value
V , by sequentially evaluating the value definitions de in M to definition values
dv . Finally using the module value V we evaluate the expression e to a value v,
obtaining a program value (V, v).
Theorems 1 and 4 below now prove type soundness of λ^rec_abs. For the progress theorem, we also prove classification, inversion, and canonical forms lemmas as usual. In particular, the classification lemma ensures that we cannot prove equivalence of types with different constructors even in the presence of non-contractive types in modules. For example, S; Σ ⊬ τ₁ ∗ τ₂ ≡ σ₁ → σ₂ for any types τ₁, τ₂, σ₁, σ₂.

Well-typed definitions and modules  S ⊢ d_e : S_e    ⊢ M : S

                       S; · ⊢ e : τ    S, val l : τ ⊢ d_e : S_e
  --------- (typ-emp)  ----------------------------------------- (typ-val)
  S ⊢ · : ·            S ⊢ (let l = e, d_e) : (val l : τ, S_e)

  ⊢ d_τ ok    d_τ ⊢ d_e : S_e    BN(d_e) distinct
  ----------------------------------------------- (typ-mod)
  ⊢ (d_τ, d_e) : (d_τ, S_e)

Well-typed programs  ⊢ P : (S, τ)

  ⊢ M : S′    S ⇓    S′ ≤ S    S; · ⊢ e : τ                  ⊢ M : S    S; · ⊢ e : τ
  ----------------------------------------- (typ-prog-seal)  ----------------------- (typ-prog)
  ⊢ (M, S, e) : (S, τ)                                       ⊢ (M, e) : (S, τ)

Reduction rules

  values             v ::= () | λx:τ. e | (v₁, v₂)
  definition values  d_v ::= let l = v
  module values      V ::= (d_τ, d_v)
  program values     P_v ::= (V, v)

  -------------------- (red-p-seal)
  (M, S, e) −→ (M, e)

  M −→ M′                           d_v ⊢ e −→ e′
  ------------------ (red-p-mod)    ------------------------------- (red-p-exp)
  (M, e) −→ (M′, e)                 (d_τ, d_v, e) −→ (d_τ, d_v, e′)

  d_v ⊢ e −→ e′                                               d_v ∋ let l = v
  --------------------------------------------- (red-mod)    --------------- (red-name)
  (d_τ, d_v, let l = e, d_e) −→ (d_τ, d_v, let l = e′, d_e)   d_v ⊢ l −→ v

Fig. 4. Typing rules and reduction rules

Theorem 1 (Progress).
(1) If ⊢ (dτ, dv) : S and S; · ⊢ e : τ, then either e is a value or ∃e′, dv ⊢ e −→ e′.
(2) If ⊢ M : S, then either M is a module value or ∃M′, M −→ M′.
(3) If ⊢ P : (S, τ), then either P is a program value or ∃P′, P −→ P′.
The key lemma for the preservation theorem is that type equivalence is pre-
served by subtyping. In the lemma below, the signature S2 being contractive is
crucial. For example, assuming S is the inferred signature of the module M in the
introduction, although S ≤ T and T; · ⊢ s ≡ t, we have S; · ⊬ s ≡ t.
Lemma 2. If S1 ≤ S2, S2 ⇓, and S2; Σ ⊢ τ ≡ σ, then S1; Σ ⊢ τ ≡ σ.
Now using Lemma 2, we show that if a sealed program (M, S, e) is well-typed,
the program (M, e) where the sealed signature S is eliminated is also well-typed
(Lemma 3), which proves the most difficult case (4) of Theorem 4. We then prove
other cases of Theorem 4 as usual using induction and case analysis.
Lemma 3 (Contractive signature elimination). If ⊢ (M, S, e) : (S, τ), then
there exists S′ such that ⊢ (M, e) : (S′, τ) and S′ ≤ S.

Theorem 4 (Preservation).
(1) If ⊢ (dτ, dv) : S, S; · ⊢ e : τ, and dv ⊢ e −→ e′, then S; · ⊢ e′ : τ.
(2) If ⊢ M : S and M −→ M′, then ⊢ M′ : S.
(3) If P = (M, e), ⊢ P : (S, τ) and P −→ P′, then ⊢ P′ : (S, τ).
(4) If ⊢ (M, S, e) : (S, τ), then ∃S′ such that ⊢ M : S′, S′ ≤ S, and S′; · ⊢ e : τ.

3 Discussion
3.1 Coq Mechanization
For the Coq mechanization, we use Mendler-style [10] coinductive rules for type
equivalence and contractiveness in the style of Nakata and Uustalu [14], instead
of the Park-style rules in Figures 2 and 3. The reason is that Coq’s syntactic
guardedness condition for induction nested into coinduction is too weak to work
with the Park-style rules. We cannot construct corecursive functions (coinductive
proofs) that we need. For example, to enable Coq’s guarded corecursion, we use
the following Mendler-style coinductive rule instead of the Park-style rule eq-ind:
(eq-ind)   if R ⊆ ≡ and S; Σ ⊢ τ =_R σ, then S; Σ ⊢ τ ≡ σ
The main difference is that we use in the rule eq-ind a relation R that is stronger
than the coinductive equivalence relation ≡. Hence, to build a coinductive proof,
we need to find such a relation R, and in many cases we cannot just use ≡ for
R. With this definition, the Park-style rules are derivable.

3.2 Algorithmic Type Equivalence and Contractiveness


Strictly speaking, equality of equi-recursive types with type parameters is decid-
able [4]. Solomon [17] has shown it to be equivalent to the equivalence problem
for deterministic pushdown automata (DPDA), which has been shown decid-
able by Sénizergues [16]. There is, however, no known practical algorithm for
DPDA-equivalence, and it is not known whether there exists any algorithm for
unification either, which is required for type inference in the core language.
One possible approach to practical type equivalence would be to reject non-
regular recursive types as in OCaml. We can then also algorithmically decide
contractiveness of every type in a program by enumerating all the distinct type
structures that can be obtained by unfolding each type used in the program and
its subterms. Still we need a sound metatheory of non-regular recursive types to
prove soundness of an OCaml-style recursive module system, because such types
may be hidden behind signature sealings in the presence of recursive modules.
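The unfolding-based contractiveness check sketched above can be illustrated for regular recursive types written as finite μ-terms. The following Python sketch is our own toy encoding (tuples for types), not the paper's formal system: a closed μ-type is deemed contractive when head-unfolding reaches a real constructor, and non-contractive when the same head recurs.

```python
def subst(t, x, s):
    """Substitution t[s/x], for a closed replacement s (no capture issues)."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'mu':
        return t if t[1] == x else ('mu', t[1], subst(t[2], x, s))
    if tag in ('arrow', 'prod'):
        return (tag, subst(t[1], x, s), subst(t[2], x, s))
    return t  # 'unit'

def contractive(t):
    """A closed regular mu-type is contractive iff head-unfolding reaches a
    constructor; revisiting the same head signals a non-contractive mu-cycle."""
    seen = set()
    while t[0] == 'mu':
        if t in seen:
            return False          # e.g. mu x. x  or  mu x. mu y. x
        seen.add(t)
        t = subst(t[2], t[1], t)  # one unfolding step: T[mu x.T / x]
    return True
```

For instance, μx. x → unit is contractive, while μx. x and μx. μy. x are not.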

4 Related Work

The literature on subtyping for μ-types (hence without type definitions, type
parameters, and abstract types) is abundant. In this setting, contractiveness
can be checked syntactically: every μ-bound variable occurs under → or ∗. We
mention three landmark papers. Amadio and Cardelli [2] were the first to give
a subtyping algorithm. They define subtyping in three ways, which are proved
equivalent: an inclusion between unfoldings of μ-types into infinite trees, a sub-
typing algorithm, and an inductive axiomatization. Brandt and Henglein [3] give
a new inductive axiomatization in which the underlying coinductive nature of
Amadio and Cardelli’s system is internalized by allowing, informally speaking,
construction of circular proofs. Gapeyev et al. [6] is a good self-contained intro-
duction to subtyping for recursive types, including historical notes on theories of
recursive types. They define a subtyping relation on contractive μ-types as the
greatest fixed point of a suitable generating function.
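A gfp-style subtyping check in the spirit of these axiomatizations can be sketched as follows. Pairs currently under examination are assumed to hold, mirroring Brandt and Henglein's circular proofs. The μ-type grammar here (with ⊤, unit and a contravariant →) and the tuple encoding are our own illustrative choices, not any of the cited calculi.

```python
def subst(t, x, s):
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'mu':
        return t if t[1] == x else ('mu', t[1], subst(t[2], x, s))
    if tag == 'arrow':
        return ('arrow', subst(t[1], x, s), subst(t[2], x, s))
    return t  # 'unit', 'top'

def unfold(t):
    return subst(t[2], t[1], t)   # mu x.T  ->  T[mu x.T / x]

def subtype(s, t, seen=frozenset()):
    """Coinductive subtyping on contractive mu-types: a pair already under
    consideration is taken to hold (the coinduction hypothesis)."""
    if (s, t) in seen:
        return True               # close a circular proof
    seen = seen | {(s, t)}
    if s[0] == 'mu':
        return subtype(unfold(s), t, seen)
    if t[0] == 'mu':
        return subtype(s, unfold(t), seen)
    if t == ('top',):
        return True
    if s[0] == t[0] == 'arrow':   # contravariant domain, covariant codomain
        return subtype(t[1], s[1], seen) and subtype(s[2], t[2], seen)
    return s == t
```

On contractive regular types the set of reachable pairs is finite, so the search terminates.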
Danielsson and Altenkirch [5] present an axiomatization of subtyping for μ-
types using induction nested into coinduction. They formalized the development
in Agda, which supports induction nested into coinduction as a basic form.
Komendantsky [8] conducted a similar project in Coq using the Mendler-style
coinduction.
Recursive types are indispensable in theories of recursive modules since recur-
sive modules allow us to indirectly introduce recursion in types that span across
module boundaries. In this setting, one has to deal with a more expressive lan-
guage for recursive types, which may include, for instance, higher-order type
constructors, type definitions, and abstract types. Montagu and Rémy [12,13]
investigate existential types to model modular type abstraction in the context
of a structural type system. They consider its extensions with recursion (i.e.,
equi-recursive types without type parameters) and higher-order type construc-
tors separately but do not investigate a combination of the two extensions. Crary
et al. [4] first propose a type system for recursive modules using an inductive
axiomatization of (coinductive) type equivalence for equi-recursive types with
higher-order type constructors, type definitions, and abstract types. However,
the metatheory of their axiomatization such as type soundness is not investi-
gated. Rossberg and Dreyer [15] use equi-recursive types with inductive type
equivalence (i.e., they do not have a rule equivalent to contract in [2] to enable
coinductive reasoning) to prove soundness of their mixin-style recursive module
system. They do not intend to use equi-recursive types for the surface language.
Our earlier work [7] on recursive modules considers equi-recursive types with
type definitions and abstract types, but without type parameters. There we de-
fine a type equivalence relation using weak bisimilarity.

5 Conclusion and Future Work

This paper studies a type system for recursive types, type parameters, and ab-
stract types. In particular, we investigate the interaction between non-contractive
types and abstract types, and show that while non-contractive types in signa-
tures lead to unsoundness of the type system, they may be allowed in modules.
Our study is mainly motivated by OCaml, which allows us to define both ab-
stract types and equi-recursive types with type parameters (with the “-rectypes”
option). To obtain a sound type system, we develop a new notion of semantic con-
tractiveness using mixed induction and coinduction and reject non-contractive
types in signatures. We show that our type system is sound with respect to the
standard call-by-value operational semantics, which eliminates signature seal-
ings. We have also formalized the whole system and its soundness proof in Coq.
Future work includes extending our type system to the full-scale module system
including recursive modules, nested modules, and higher-order functors.

Acknowledgments. This work was supported by Mid-career Researcher Pro-


gram through NRF funded by the MEST (2010-0022061). Hyeonseung Im was
partially supported by the ANR TYPEX project n. ANR-11-BS02-007 02. Keiko
Nakata was supported by the ERDF funded EXCS project, the Estonian Min-
istry of Education and Research research theme no. 0140007s12, and the Esto-
nian Science Foundation grant no. 9398.

References
1. OCaml, http://caml.inria.fr/ocaml/
2. Amadio, R.M., Cardelli, L.: Subtyping recursive types. ACM Transactions on Pro-
gramming Languages and Systems 15(4), 575–631 (1993)
3. Brandt, M., Henglein, F.: Coinductive axiomatization of recursive type equality and
subtyping. In: de Groote, P., Hindley, J.R. (eds.) TLCA 1997. LNCS, vol. 1210,
pp. 63–81. Springer, Heidelberg (1997)
4. Crary, K., Harper, R., Puri, S.: What is a recursive module? In: PLDI 1999 (1999)
5. Danielsson, N.A., Altenkirch, T.: Subtyping, declaratively: an exercise in mixed
induction and coinduction. In: Bolduc, C., Desharnais, J., Ktari, B. (eds.) MPC
2010. LNCS, vol. 6120, pp. 100–118. Springer, Heidelberg (2010)
6. Gapeyev, V., Levin, M.Y., Pierce, B.C.: Recursive subtyping revealed. Journal of
Functional Programming 12(6), 511–548 (2002)
7. Im, H., Nakata, K., Garrigue, J., Park, S.: A syntactic type system for recursive
modules. In: OOPSLA 2011 (2011)
8. Komendantsky, V.: Subtyping by folding an inductive relation into a coinductive
one. In: Peña, R., Page, R. (eds.) TFP 2011. LNCS, vol. 7193, pp. 17–32. Springer,
Heidelberg (2012)
9. MacQueen, D., Plotkin, G., Sethi, R.: An ideal model for recursive polymorphic
types. In: POPL 1984 (1984)
10. Mendler, N.P.: Inductive types and type constraints in the second-order lambda
calculus. Annals of Pure and Applied Logic 51(1-2), 159–172 (1991)
11. Milner, R., Tofte, M., Harper, R., MacQueen, D.: The Definition of Standard ML
(Revised). The MIT Press (1997)
12. Montagu, B.: Programming with first-class modules in a core language with subtyp-
ing, singleton kinds and open existential types. PhD thesis, École Polytechnique,
Palaiseau, France (December 2010)
13. Montagu, B., Rémy, D.: Modeling abstract types in modules with open existential
types. In: POPL 2009 (2009)
14. Nakata, K., Uustalu, T.: Resumptions, weak bisimilarity and big-step semantics
for While with interactive I/O: An exercise in mixed induction-coinduction. In:
SOS 2010, pp. 57–75 (2010)
15. Rossberg, A., Dreyer, D.: Mixin’ up the ML module system. ACM Transactions
on Programming Languages and Systems 35(1), 2:1–2:84 (2013)
16. Sénizergues, G.: The equivalence problem for deterministic pushdown automata
is decidable. In: Degano, P., Gorrieri, R., Marchetti-Spaccamela, A. (eds.) ICALP
1997. LNCS, vol. 1256, pp. 671–681. Springer, Heidelberg (1997)
17. Solomon, M.: Type definitions with parameters (extended abstract). In: POPL
1978 (1978)
18. Stone, C.A., Schoonmaker, A.P.: Equational theories with recursive types (2005),
http://www.cs.hmc.edu/~stone/publications.html
Algebras, Automata and Logic for Languages
of Labeled Birooted Trees

David Janin

Université de Bordeaux, LaBRI UMR 5800,


351, cours de la Libération,
F-33405 Talence, France
[email protected]

Abstract. In this paper, we study the languages of labeled finite bi-


rooted trees: Munn’s birooted trees extended with vertex labeling. We
define a notion of finite state birooted tree automata that is shown to
capture the class of languages that are upward closed w.r.t. the natu-
ral order and definable in Monadic Second Order Logic. Then, relying
on the inverse monoid structure of labeled birooted trees, we derive a
notion of recognizable languages by means of (adequate) premorphisms
into finite (adequately) ordered monoids. This notion is shown to cap-
ture finite boolean combinations of languages as above. We also provide
a simple encoding of finite (mono-rooted) labeled trees in an antichain
of labeled birooted trees that shows that classical regular languages of
finite (mono-rooted) trees are also recognized by such premorphisms and
finite ordered monoids.

Introduction
Motivations and background. Semigroup theory has amply demonstrated its
considerable efficiency over the years for the study and fine grain analysis of
languages of finite words, that is subsets of the free monoid A∗ . This can be
illustrated most simply by the fact that a language L ⊆ A∗ is regular if and
only if there is a finite monoid S and a monoid morphism θ : A∗ → S such that
L = θ−1 (θ(L)). In this case, we say that the language L is recognized by the
finite monoid S (and the morphism θ).
Even more effectively, for every language L ⊆ A∗, the notion of recognizability
induces a notion of syntactic congruence ≃_L for the language L in such a way
that the monoid M(L) = A∗/≃_L is the smallest monoid that recognizes L.
Then, many structural properties of the language L can be decided by analyzing
the properties of its syntactic monoid M(L), e.g. regularity, star-freeness, etc.
(see [14] for more examples of such properties).
These results triggered the development of entire algebraic theories of lan-
guages of various structures elaborated on the basis of richer algebraic frame-
works such as, among others, ω-semigroups for languages of infinite words [19,12],

Partially funded by project INEDIT, ANR-12-CORD-0009
Complete version available at http://hal.archives-ouvertes.fr/hal-00784898

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 312–323, 2013.
© Springer-Verlag Berlin Heidelberg 2013
preclones or forest algebras for languages of trees [5,3,2], or indeed ω-hyperclones
for languages of infinite trees [1]. With the aim of describing more subtle
properties of languages, several extensions of the notion of recognizability by
monoids and morphisms were also taken into consideration, e.g. recognizability
by monoids and relational morphisms [13] or recognizability by ordered monoids
and monotonic morphisms [15].
A recent study of the languages of overlapping tiles [6,9] or, equivalently,
subsets of the (inverse) monoid of McAlister [11,8], has led to the definition of
quasi-recognizability: recognizability by means of (adequate) premorphisms into
(adequately ordered) ordered monoids.
As (monotonic) morphisms are particular cases of premorphisms, this no-
tion can be seen as a generalization of recognizability by (ordered) monoids
and (monotonic) morphisms [15]. To some extent, quasi-recognizability can also
be seen as a notion of co-algebraic recognizability in the sense that it is dual
to the standard notion. Indeed, (adequate) premorphisms preserve some (and
sufficiently many) decompositions while morphisms preserve all compositions.
However, this notion of quasi-recognizability has not yet been settled for we
need to restrict both the class of allowed premorphisms and/or the class of finite
ordered monoids for that notion to be effective. Without any restrictions, the
inverse image by a premorphism of a finite subset of a finite ordered monoid
may not even be computable [8]. Further still, there are several incomparable
candidates for defining such an effective restriction as illustrated by a recent and
complementary study of walking automata on birooted trees [7].
In this paper, we aim to stabilize the notion of recognizability by adequate
premorphisms by applying it to the study of languages of labeled birooted trees.
In doing so, it appears that this notion admits both simple automata theoretic
characterization and robust logical characterization.

Outline. Birooted labeled trees, called birooted F -trees, are presented in Sec-
tion 1. Equipped with an extension of Scheiblich’s product of (unlabeled) bi-
rooted trees [16], the resulting algebraic structures are inverse monoids that are
quite similar to discrete instances of Kellendonk’s tiling semigroups [10]. Then,
birooted F-trees can be ordered by the (inverse semigroup) natural order relation
that is stable under product: the inverse monoid B¹(F) of labeled birooted
F-trees is also a partially ordered monoid.
Birooted tree automata are defined and studied in Section 2. By construction,
languages recognized by these finite automata are upward closed in the natural
order. It follows that they fail to capture all languages definable by means of
Monadic Second Order (MSO) formulae. However, this loss of expressive power
is shown to be limited to the property of upward closure. Indeed, we prove
(Theorem 2) that every upward closed language of birooted trees which is MSO
definable is recognized by a finite state birooted tree automaton.
As a case in point, when F is seen as a functional signature, by embedding
the classical F -terms (see [18]) into birooted F -trees, we show (Theorem 3) that
the birooted tree image of every regular language L of F -terms is of the form
U_L ∩ D_L for some MSO definable and upward closed (resp. downward closed)
language U_L (resp. D_L).
The algebraic counterpart of birooted tree automata is presented in Section 3
where the notions of adequately ordered monoids and adequate premorphisms
are defined. The induced notion of quasi-recognizable languages of birooted F -
trees is shown to be effective (Theorem 4).
As for expressive power, it is shown that every birooted tree automaton
simply induces an adequate premorphism that recognizes the same language
(Theorem 5) and that every quasi-recognizable language is MSO definable (The-
orem 6). The picture is made complete by proving (Theorem 7) that quasi-
recognizable languages of birooted trees correspond exactly to finite boolean
combinations of upward closed MSO definable languages.
Together with Theorem 3, this result demonstrates that our proposal can
also be seen as yet another algebraic characterization of regular languages of
trees that complete that previously obtained by means of preclones [5], forest
algebras [3] or ordered monoids and admissible premorphisms [7].

Related Works. We should also mention that the notion of birooted F -tree au-
tomata defined above is an extension of that previously defined [9] for languages
of one-dimensional overlapping tiles: subsets of McAlister monoids [11].
Although closely related, we can observe that an extension of this type is
by no means straightforward. Of course going from the linear structure of over-
lapping tiles to the tree shaped structure of birooted F -trees already induces a
tangibly increased level of complexity. However, the main difference comes from
edge directions. In overlapping tiles, all edges go in the same direction while,
in birooted F -trees, edges can go back and forth (almost) arbitrarily. Proving
Theorem 2 is thus much more complex than proving an analogous result for
overlapping tiles.
Comparing our proposal with other known algebraic characterizations of lan-
guages of (mono-rooted) F -trees [5,3] is not easy. Of course, our proposal induces
a larger class of definable languages since we are dealing with birooted F -trees
and not just F -trees. However, a more relevant comparison would be to compare
the classification of languages through a full series of approaches, by restricting
even further the allowed recognizers: be them preclones as in [5], forest alge-
bras [2] or adequately ordered monoids as proposed here.
With quasi-recognizability, recognizers are monoids (and premorphisms). It
follows that the known restrictions applicable to the study of languages of
words, e.g. aperiodic monoids [14], can simply be extended to adequately ordered
monoids. Yet, the relevance of such restrictions for languages of mono-rooted or
birooted F -trees still needs to be evaluated.
Another source of difficulty comes from the fact that adequate premorphisms
are not morphisms : only disjoint products are preserved. To some extent, the
notion of quasi-recognizability by premorphisms presented here is analogous,
compared with classical recognizability by morphisms, to what unambiguous
non deterministic automata are in comparison with deterministic automata. On
the negative side, this means that the notion of quasi-recognizability has not yet
been completely understood. On the positive side, this means that it may lead
to radically new outcomes.

1 Semigroups and Monoids of Birooted F -trees


Simply said, a labeled birooted tree is a (non empty) finite connected subgraph
of the Cayley graph of the free group F G(A) with labeled vertices on some
finite alphabet F and two distinguished vertices respectively called the input
root and the output root. This definition and some of the associated properties
are detailed in this section.
Formally, let A be a finite (edge) alphabet and let Ā be a disjoint copy of A
with, for every letter a ∈ A, its copy ā ∈ Ā. Let u ↦ ū be the mapping from
(A + Ā)∗ to itself inductively defined by 1̄ = 1 and, for every u ∈ (A + Ā)∗ and
every a ∈ A, (ua)‾ = ā ū and (uā)‾ = a ū. This mapping is involutive, i.e.
(ū)‾ = u for every u ∈ (A + Ā)∗, and it is an anti-morphism, i.e. (uv)‾ = v̄ ū
for every two words u and v ∈ (A + Ā)∗.
The free group FG(A) generated by A is the quotient of (A + Ā)∗ by the least
congruence ≃ such that, for every letter a ∈ A, aā ≃ 1 and āa ≃ 1. This is
indeed a group since, for every u ∈ (A + Ā)∗, we have [u][ū] = [1], hence [ū] is
the group inverse of [u].
It is known that every class [u] ∈ F G(A) contains a unique element red(u) (the
reduced form of u) that contains no factors of the form aā nor āa for a ∈ A. In
the sequel, every such class [u] ∈ F G(A) is thus represented by its reduced form
red(u). Doing so, the product u · v of every two reduced words u and v ∈ F G(A)
is directly defined by u · v = red(uv).
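The reduction red(u) can be computed in a single stack pass that cancels adjacent aā and āa factors. A small sketch, with our own encoding: a letter a is a plain string and its copy ā is written a + "'".

```python
def inv(x):
    """a -> a-bar and a-bar -> a, with a-bar encoded as a + "'"."""
    return x[:-1] if x.endswith("'") else x + "'"

def red(word):
    """Reduced form of a word over A + A-bar: cancel adjacent inverse pairs."""
    stack = []
    for x in word:
        if stack and stack[-1] == inv(x):
            stack.pop()           # cancel a·a-bar or a-bar·a
        else:
            stack.append(x)
    return stack

def fg_mul(u, v):
    """Group product u · v = red(uv) on reduced words of FG(A)."""
    return red(u + v)

def bar(u):
    """Group inverse on reduced words: the anti-morphic image of u."""
    return [inv(x) for x in reversed(u)]
```

For example, fg_mul of ab and b̄c reduces to ac, and any u multiplied by its bar reduces to the empty word.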
Elements of F G(A), when seen as reduced words, can then be ordered by the
prefix order relation ≤p defined, for every (reduced word) u and v ∈ F G(A) by
u ≤p v when there exists (a reduced word) w ∈ F G(A) such that red(uw) =
uw = v. The associated predecessor relation ≺p is defined, for every v and
w ∈ F G(A), by v ≺p w when v <p w and w = vx for some x ∈ A + Ā.
A labeled birooted tree on the edge alphabet A and the vertex alphabet F is
a pair B = ⟨t, u⟩ where t : FG(A) → F is a partial map whose domain dom(t)
is a prefix-closed subset of FG(A) with u ∈ dom(t). In such a presentation,
1 ∈ dom(t) is the input root vertex and u ∈ dom(t) is the output root vertex.
Assuming the edge alphabet A is implicit, these labeled birooted trees are called
birooted F-trees or, when F is also implicit, simply birooted trees.
For every birooted tree B = ⟨t, u⟩ and every v ∈ dom(t), let tv : FG(A) → F
be the partial function defined by dom(tv) = v̄ · dom(t) and tv(w) = t(v · w) for
every w ∈ dom(tv). Accordingly, let Bv = ⟨tv, v̄ · u⟩ be the v-translation of the
birooted tree B.
Observe that such a translation slightly differs from the classical notion of
subtree since dom(tv) = v̄ · dom(t) contains as many vertices as dom(t). A notion
of sub-birooted tree B^p_v, with fewer vertices and thus more closely related to
the classical notion of subtree, is defined below when proving the decomposition
property (Lemma 1).

The partial product ⟨r, u⟩ · ⟨s, v⟩ of two birooted F-trees ⟨r, u⟩ and ⟨s, v⟩ is
defined, when it exists, as the birooted F-tree ⟨t, w⟩ given by w = u · v, dom(t) =
dom(r) ∪ u · dom(s), t(u′) = r(u′) for every u′ ∈ dom(r) and tu(v′) = s(v′) for
every v′ ∈ dom(s).
Observe that such a product exists if and only if the trees ru and s agree
on dom(ru) ∩ dom(s), i.e. for every v′ ∈ dom(ru) ∩ dom(s), we have
ru(v′) = r(u · v′) = s(v′). It follows that undefined products may arise when F is
not a singleton.
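The partial product can be made concrete with a small sketch. The representation here is our own: a birooted F-tree is a pair (t, u) where t is a dictionary from reduced words (tuples over A + Ā, with ā written a + "'") to labels, and None plays the role of the zero element.

```python
def inv(x):
    return x[:-1] if x.endswith("'") else x + "'"

def red(word):
    """Reduced form as a tuple, cancelling adjacent inverse pairs."""
    stack = []
    for x in word:
        if stack and stack[-1] == inv(x):
            stack.pop()
        else:
            stack.append(x)
    return tuple(stack)

def product(B1, B2):
    """Partial product <r,u> · <s,v>; returns None when the labelings clash."""
    (r, u), (s, v) = B1, B2
    t = dict(r)
    for w, label in s.items():
        uw = red(u + w)            # translate s's vertex w by u
        if t.get(uw, label) != label:
            return None            # r_u and s disagree: product undefined
        t[uw] = label
    return t, red(u + v)
```

On two one-edge trees whose roots agree, the product glues them at the output root of the first; a label clash at a shared vertex yields the zero.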
Two examples of birooted F -trees B1 and B2 are depicted below, with a
dangling input edge marking the input root and a dangling output edge marking
the output root.
[Figure: the birooted F-trees B1 and B2, each drawn with a dangling input
edge and a dangling output edge.]

The (defined) product of the birooted F -trees B1 and B2 is then depicted below.

[Figure: the product B1 · B2.]

In that picture, the circle marks the synchronization vertex that results from
the merging of the output root of B1 and the input root of B2. The a-labeled
edge from f to g emanating from that vertex is the common edge resulting from
the fusion of the two (synchronized) birooted F-trees.
The product is completed by adding a zero element 0 for the undefined case,
with 0 · ⟨t, v⟩ = ⟨t, v⟩ · 0 = 0 · 0 = 0 for every (defined) birooted tree ⟨t, v⟩.
One can easily check that the completed product is associative. The resulting
structure is thus a semigroup denoted by B(F): the semigroup of birooted F-
trees. When F is a singleton, every birooted F-tree can just be seen as a pair
(P, u) with a non-empty prefix-closed domain P ⊆ FG(A) and an output root
u ∈ P. Then, following Scheiblich's presentation [16], the semigroup B(F) is the
free inverse monoid FIM(A) generated by A with unit 1 = ({1}, 1). When F is not
a singleton, we extend the set B(F) with a unit denoted by 1. The resulting
structure is a monoid denoted by B¹(F): the monoid of birooted F-trees.
The monoid of birooted F-trees is an inverse monoid, i.e. for every B ∈ B¹(F)
there is a unique B⁻¹ ∈ B¹(F) such that BB⁻¹B = B and B⁻¹BB⁻¹ = B⁻¹.
Indeed, we necessarily have 0⁻¹ = 0, 1⁻¹ = 1 and, for every non-trivial birooted
F-tree ⟨t, u⟩, one can check that ⟨t, u⟩⁻¹ = ⟨tu, ū⟩.
As an inverse monoid, the elements of B¹(F) can be ordered by the natural order
defined, for every B and C ∈ B¹(F), by B ≤ C when B = BB⁻¹C (equivalently,
B = CB⁻¹B). One can check that 0 is the least element and that, for all defined
birooted F-trees ⟨r, u⟩ and ⟨s, v⟩, we have ⟨r, u⟩ ≤ ⟨s, v⟩ if and only if u = v,
dom(r) ⊇ dom(s) and, for every w ∈ dom(s), r(w) = s(w).
Observe that, as far as trees only are concerned, the natural order is the
reverse of the (often called) prefix order on trees. In particular, the bigger the
size of a birooted tree, the smaller the birooted tree is in the natural order.
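On a concrete representation, the characterization of the natural order reads off directly. The sketch below uses an illustrative encoding of ours (a birooted tree as a pair of a vertex-to-label dictionary and an output root).

```python
def natural_leq(B, C):
    """B <= C in the natural order: same output root, and C's tree is a
    sub-assignment of B's tree (fewer vertices, matching labels)."""
    (r, u), (s, v) = B, C
    return u == v and all(w in r and r[w] == s[w] for w in s)
```

Note the reversal: the tree with more vertices is the smaller element.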
One can easily check that the monoid of birooted F-trees is finitely generated.
We prove here a stronger statement that will be used extensively in the remainder
of the text.
A birooted tree is said to be elementary when it is either 0 or 1, or of the form
B_f = ⟨{1 ↦ f}, 1⟩ for some f ∈ F, or of the form B_{fxg} = ⟨{1 ↦ f, x ↦ g}, x⟩
for some vertex labels f and g ∈ F and some letter x ∈ A + Ā.
[Figure: the elementary birooted trees B_{fag}, B_f and B_{fāg}.]
The right projection B^R (resp. the left projection B^L) of a birooted tree B =
⟨t, u⟩ is defined to be B^R = BB⁻¹ (resp. B^L = B⁻¹B) or, equivalently, B^R =
⟨t, 1⟩ (resp. B^L = ⟨tu, 1⟩). The right projection B^R of B is also called the reset
of B.
The product B1 · B2 of two birooted trees B1 and B2 is a disjoint product when
B1 · B2 ≠ 0 and there is a unique elementary birooted tree B_f such that B1^L ≤ B_f
and B2^R ≤ B_f, i.e. B1^L ∨ B2^R = B_f. This restricted product is called a disjoint product
because, when B1 = ⟨t1, u1⟩ and B2 = ⟨t2, u2⟩, the product B1 · B2 is disjoint
if and only if t1(u1) = t2(1) = f and dom(t1) ∩ u1 · dom(t2) = {u1}, i.e. the set
of edges in B1 · B2 is the disjoint union of the set of edges of B1 and the set of
(translated) edges of B2.
Lemma 1 (Strong Decomposition). For every B ∈ B(F ), the birooted F -
tree B can be decomposed into a finite combination of elementary birooted trees
by disjoint products and (right) resets.

Proof. Let B = ⟨t, u⟩ be a birooted F-tree. We aim at proving that it can be
decomposed as stated above. We first define some specific sub-birooted trees of B
that will be used for such a decomposition.
For all vertices v and w ∈ dom(t) such that v ≺p w, let B^p_{v,w} be the two-
vertex birooted F-tree defined by B^p_{v,w} = B_{fxg} where f = t(v), g = t(w) and
v · x = w.
Let U = {v ∈ dom(t) : 1 ≤p v ≤p u} be the set of vertices that appear on
the path from the input root 1 to the output root u.
For every v ∈ dom(t), let D^p(v) be the greatest prefix-closed subset of the set
{w ∈ dom(tv) : v ≤p v · w, v · w ∈ U ⇒ w = 1} and let B^p_v = ⟨tv|D^p(v), 1⟩ be the
idempotent birooted tree obtained from B by restricting the subtree tv rooted
at the vertex v to the domain D^p(v).
Then, given u0 = 1 <p u1 <p u2 <p · · · <p u_{n−1} <p u_n = u, the increasing se-
quence (under the prefix order) of all the prefixes of the output root u, we observe
that B = B^p_{u0} B^p_{u0,u1} B^p_{u1} · · · B^p_{u_{n−1}} B^p_{u_{n−1},u_n} B^p_{u_n} with only disjoint products.
It remains thus to prove that every idempotent sub-birooted tree of the form
B^p_v for some v ∈ dom(t) can also be decomposed into an expression of the desired
form. But this is easily done by induction on the size of the birooted trees B^p_v.
Indeed, let v ∈ dom(t). In the case v is a leaf (w.r.t. the prefix order), then
B^p_v = B_{t(v)} and we are done. Otherwise, we have B^p_v = ⟨r, 1⟩ for some F-tree
r and we observe that B^p_v = ∏ { (B^p_{v,w} · B^p_w)^R : w ∈ dom(r), v ≺p w } with only
disjoint products and resets. This concludes the proof. □
The above decomposition of B as a combination of elementary birooted trees by
disjoint products and right projections is called a strong decomposition of the
birooted F -tree B.

2 Birooted F -tree Automata


In this section, we define the notion of birooted F -tree automata that is shown to
capture the class of languages of birooted F -trees that are upward closed w.r.t.
the natural order and definable in Monadic Second Order Logic (MSO).
A birooted F-tree (finite) automaton is a quadruple A = ⟨Q, δ, Δ, W⟩ defined
by a (finite) set of states Q, a (non-deterministic) state table δ : F → P(Q),
a (non-deterministic) transition table Δ : A → P(Q × Q) and an acceptance
condition W ⊆ Q × Q.
A run of the automaton A on a non-trivial birooted F-tree B = ⟨t, u⟩ is a
mapping ρ : dom(t) → Q such that, for every v ∈ dom(t):
– State coherence: ρ(v) ∈ δ(t(v)),
– Transition coherence: for every a ∈ A, if v · a ∈ dom(t) then (ρ(v), ρ(v · a)) ∈
Δ(a), and if v · ā ∈ dom(t) then (ρ(v · ā), ρ(v)) ∈ Δ(a).
The run ρ is an accepting run when (ρ(1), ρ(u)) ∈ W. The set L(A) ⊆ B(F) of
birooted F-trees B such that there is an accepting run of A on B is the language
recognized by the automaton A.
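Acceptance can be checked naively by searching over all state assignments. The brute-force sketch below is for illustration only (exponential in the tree size); it reuses an illustrative encoding of ours where vertices are tuples of letters with ā written a + "'".

```python
from itertools import product as assignments

def accepts(Q, delta, Delta, W, B):
    """Is there an accepting run of the automaton (Q, delta, Delta, W)
    on the birooted tree B = (t, u)?"""
    t, u = B
    verts = sorted(t)
    for states in assignments(sorted(Q), repeat=len(verts)):
        rho = dict(zip(verts, states))
        if any(rho[v] not in delta[t[v]] for v in verts):
            continue                     # state coherence fails
        ok = True
        for v in verts:
            if not v:
                continue                 # the input root 1 has no parent edge
            p, x = v[:-1], v[-1]
            if x.endswith("'"):          # v = p·a-bar: the a-edge goes v -> p
                ok = ok and (rho[v], rho[p]) in Delta[x[:-1]]
            else:                        # the a-edge goes p -> v
                ok = ok and (rho[p], rho[v]) in Delta[x]
        if ok and (rho[()], rho[u]) in W:
            return True
    return False
```

With a single a-edge from an f-vertex to a g-vertex, the acceptance condition decides the answer depending on which vertex is the output root.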
Every non-trivial birooted F-tree B = ⟨t, u⟩ can be seen as a (tree-shaped)
FO-structure M_B with domain dom(M_B) = dom(t), constants in_B = 1 and
out_B = u, a unary relation S_f = t⁻¹(f) for every f ∈ F and a binary
relation R_a = {(v, w) ∈ dom(t) × dom(t) : v · a = w} for every a ∈ A.
We say that a language L ⊆ B(F) is definable in monadic second order logic
(MSO) when there exists a closed MSO formula ϕ on the FO-signature {in, out} ∪
{S_f}_{f∈F} ∪ {R_a}_{a∈A} such that L = {B ∈ B(F) : M_B |= ϕ}.
The following theorem gives a rather strong characterization of the languages
recognized by finite state birooted F -tree automata.
Theorem 2. Let L ⊆ B(F ) be a language of birooted F -trees. The language is
recognized by a finite birooted F -tree automaton if and only if L is upward closed
(in the natural order) and MSO definable.

Proof. Let L ⊆ B(F ) be a language of birooted F -trees. We first prove the easiest
direction, from birooted tree automata to MSO. Then, we prove the slightly more
difficult direction from MSO to birooted tree automata.

From birooted tree automata to MSO. Assume that L is recognizable by a finite


state birooted tree automaton A. Without loss of generality, since A is finite,
we assume that the set Q of states of A is such that Q ⊆ P([1, n]) for some
n ≥ log2 |Q|.
Then, checking that a birooted tree ⟨t, u⟩ belongs to L(A) just amounts
to checking that there exists an accepting run. This can easily be de-
scribed by an existential formula of monadic second order logic of the form
∃X1 X2 · · · Xn ϕ(in, out) with n set variables X1 , X2 , . . . , Xn and a first order
formula ϕ(in, out).
Indeed, every mapping ρ : dom(t) → Q is encoded by saying, for every vertex
v ∈ dom(t), that ρ(v) = {k ∈ [1, n] : v ∈ Xk }. Then, checking that the mapping
ρ encoded in such a way is indeed an accepting run amounts to checking that
it satisfies state and transition coherence conditions and acceptance condition.
This is easily encoded in the FO-formula ϕ(x, y).
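For intuition, the correspondence between runs and tuples of set variables used in this encoding can be modelled concretely. The following Python sketch is purely illustrative (vertex names and the data layout are our own assumptions, not part of the paper):

```python
def encode_run(rho, n):
    """Encode a run rho (vertex -> state, each state a subset of {1..n},
    since Q is assumed to satisfy Q <= P([1, n])) as the vertex sets
    X_1..X_n, with v in X_k iff k in rho(v)."""
    return {k: {v for v, q in rho.items() if k in q} for k in range(1, n + 1)}

def decode_run(X, vertices, n):
    """Recover rho(v) = {k in [1, n] : v in X_k} from the sets X_1..X_n."""
    return {v: frozenset(k for k in range(1, n + 1) if v in X[k])
            for v in vertices}
```

Decoding inverts encoding, which is exactly why existentially quantifying over X1, ..., Xn captures the existence of an accepting run.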

From MSO to Birooted Tree Automata. Conversely, assume that L is upward
closed for the natural order and that L is definable in MSO. Observe that
every B = ⟨t, u⟩ can be seen as a (deterministic) tree rooted in the input
root vertex 1 with edges labeled on the alphabet A + Ā (with edge “direction”
being induced by the prefix order on F G(A)), vertices labeled on the alphabet
F × {0, 1} (with 1 used to distinguish the output root u from the other vertices).
An example of such an encoding of birooted trees into trees is depicted below.

(Figure omitted: an example birooted F-tree, with its input root marked in and its output root marked out, shown next to its encoding as a rooted tree with edge labels in A + Ā and vertex labels in F × {0, 1}, where the output root is the unique vertex labeled (g, 1).)

Since L is definable in MSO, applying (an adapted version of) the theorem of
Doner, Thatcher and Wright (see for instance [18]), there exists a finite state
tree automaton A that recognizes L. We conclude our proof by defining from
the (finite) tree automaton A a (finite) birooted tree automaton A′ such that
L(A) = L(A′).
The major difficulty in defining A′ is that the (one-root) tree automaton A
reads a tree from the (input) root to the leaves, hence following the prefix
order ≤p . Moreover, in birooted trees, such a prefix order is not encoded in the
direction of the edges. It follows that, when translating the tree automaton A into
an equivalent birooted tree automaton A′, we need to encode (and propagate)
that direction information into the states.
But this can be achieved by observing that for all vertices v and w such
that v ≺p w, the edge from v to w is uniquely defined by the letter x ∈ (A + Ā)
such that vx = w. It follows that every such vertex w (distinct from the input
root 1) will be marked in automaton A′ by a state that encodes that letter
320 D. Janin

x, thus distinguishing the unique predecessor vertex v of w from all successor
vertices w′ such that w ≺p w′. □
From now on, a language of birooted F-trees that is recognizable by a finite birooted
F-tree automaton is called a regular language of birooted F-trees.
We now aim at relating languages of birooted F-trees with languages of F-trees.
Assume, until the end of this section, that F is a finite functional
signature, that is, a finite set of symbols equipped with an arity mapping
ρ : F → P(A) that maps every function symbol f to the set ρ(f) ⊆ A of its
arguments' names.
An F-tree (also called an F-term) is a function t : A∗ → F with prefix-closed finite
domain dom(t) such that for every u ∈ dom(t), every a ∈ A, if ua ∈ dom(t) then
a ∈ ρ(t(u)). Such a finite tree t is said to be complete when, moreover, for every
u ∈ dom(t), for every a ∈ A, if a ∈ ρ(t(u)) then ua ∈ dom(t).
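These two definitions translate directly into code. The following Python sketch, with addresses modelled as tuples of argument names (an assumption of ours), checks that a candidate map is an F-tree and that it is complete:

```python
def is_ftree(t, rho):
    """Check that t (mapping addresses, given as tuples of letters of A,
    to symbols of F) is an F-tree: the domain contains the root (),
    is prefix-closed, and every edge label a satisfies a in rho(t(parent))."""
    dom = set(t)
    if () not in dom:
        return False
    for u in dom:
        if u and (u[:-1] not in dom or u[-1] not in rho[t[u[:-1]]]):
            return False
    return True

def is_complete(t, rho):
    """Check that, moreover, every argument name of every node is used."""
    return is_ftree(t, rho) and all(u + (a,) in t for u in t for a in rho[t[u]])
```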
Every F-tree t is encoded into the birooted F-tree ⟨t, 1⟩, called the birooted
image of the tree t. By extension, for every set X of F-trees, the language LX =
{⟨t, 1⟩ ∈ B(F) : t ∈ X} of birooted images of the trees of X is called the birooted
tree image of the language X.

Theorem 3. For every regular language X of complete finite F -trees, we have
LX = UX ∩ DX for some regular language UX of birooted F-trees and the
complement DX of some regular language B(F) − DX of birooted F-trees.

Proof. This essentially follows from Theorem 2. □

3 Quasi-Recognizable Languages of Birooted F -trees


Intimately related to the theory of non-regular semigroups initiated by Fountain
in the 70s (see e.g. [4]), the notion of recognizability by premorphisms was proposed
in [6] (and generalized in [9]) to define languages of positive (resp. arbitrary)
overlapping tiles. This notion is extended here to languages of birooted F-trees.
Let S be a monoid partially ordered by a relation ≤S (or just ≤ when there
is no ambiguity). We always assume that the order relation ≤ is stable under
product, i.e. if x ≤ y then xz ≤ yz and zx ≤ zy for every x, y and z ∈ S. The set
U (S) of subunits of the partially ordered monoid S is defined by U (S) = {y ∈
S : y ≤ 1}.
A partially ordered monoid S is an adequately ordered monoid when all subunits
of S are idempotent and when, for every x ∈ S, both the minimum of right
local units x^L = min{y ∈ U(S) : xy = x} and the minimum of left local units
x^R = min{y ∈ U(S) : yx = x} exist and belong to U(S). The subunits x^L and
x^R are respectively called the left projection and the right projection of x.

Examples. Every inverse monoid S ordered by the natural order is an adequately
ordered monoid, with x^L = x⁻¹x and x^R = xx⁻¹ for every x ∈ S.

As a particular case, the monoid B¹(F) ordered by the natural order is also
an adequately ordered monoid. The subunits of B¹(F) are, when distinct from
0 or 1, the birooted F-trees of the form ⟨t, 1⟩ and, indeed, for every birooted
F-tree B = ⟨t, u⟩ we have B^R = ⟨t, 1⟩ and B^L = ⟨tu, 1⟩.
For every set Q, the relation monoid P(Q × Q) ordered by inclusion is also an
adequately ordered monoid with, for every X ⊆ Q × Q, X^L = {(q, q) ∈ Q × Q :
(p, q) ∈ X} and X^R = {(p, p) ∈ Q × Q : (p, q) ∈ X}.
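This last example is concrete enough to sketch directly. In the following illustrative Python model of P(Q × Q), the monoid product is relation composition, and the projections satisfy the defining equations X · X^L = X and X^R · X = X (function names are ours):

```python
def compose(X, Y):
    """Monoid product in P(Q x Q): relation composition."""
    return {(p, r) for (p, q) in X for (q2, r) in Y if q == q2}

def left_proj(X):
    """X^L = {(q, q) : (p, q) in X}, a subunit (a subset of the identity)."""
    return {(q, q) for (_, q) in X}

def right_proj(X):
    """X^R = {(p, p) : (p, q) in X}."""
    return {(p, p) for (p, _) in X}
```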
A mapping θ : S → T between two adequately ordered monoids is a premorphism
when θ(1) = 1 and, for every x and y ∈ S, we have θ(xy) ≤T θ(x)θ(y) and,
if x ≤S y, then θ(x) ≤T θ(y). A premorphism θ : S → T is an adequate premorphism
when for every x and y ∈ S we have θ(x^L) = (θ(x))^L, θ(y^R) = (θ(y))^R
and, if xy ≠ 0 with x^L ∨ y^R = z ≺ 1, i.e. the product xy is a disjoint product,
then θ(xy) = θ(x)θ(y).
A language L ⊆ B(F) of birooted trees is a quasi-recognizable language when
there exists a finite adequately ordered monoid S and an adequate premorphism
θ : B(F ) → S such that L = θ−1 (θ(L)).

Theorem 4. Let θ : FIM(A) → S be an adequate premorphism with finite S.
For every B ∈ B(F ) the image θ(B) of the birooted F -tree B by the adequate
premorphism θ is uniquely determined by the structure of B, the structure of S
and the image by θ of elementary birooted F -trees.

Proof. This essentially follows from the adequacy assumption and the strong
decomposition property (Lemma 1). □
Now we want to show that every finite state birooted automaton induces an
adequate premorphism that recognizes the same language.

Theorem 5. Let L ⊆ B(F ) be a language of birooted F -trees. If L is recognizable
by a finite state birooted tree automaton then it is recognizable by an adequate
premorphism into a finite adequately ordered monoid.

Proof. Let L ⊆ B(F) and let A = ⟨Q, δ, Δ, T⟩ be a finite birooted tree automaton
such that L = L(A).
We define the mapping ϕA : B(F) → P(Q × Q) by letting ϕA(B) be, for
every birooted F-tree B = ⟨t, u⟩ ∈ B(F), the set of all pairs of states (p, q) ∈ Q × Q
such that there exists a run ρ : dom(t) → Q with p = ρ(1) and q = ρ(u).
The mapping ϕA is extended to 0 by taking ϕA(0) = ∅ and to 1 by taking
ϕA(1) = IQ = {(q, q) ∈ Q × Q : q ∈ Q}.
The fact that P(Q × Q) is an adequately ordered monoid has already been
detailed in the examples above. By definition we have L = ϕA⁻¹(X) with X =
{X ⊆ Q × Q : X ∩ T ≠ ∅}. Then, we prove that ϕA is indeed an adequate
premorphism. □
The following theorem tells how quasi-recognizability and MSO definability are
related.

Theorem 6. Let θ : FIM(A) → S be an adequate premorphism with finite S.
For every X ⊆ S, the language θ−1 (X) is definable in Monadic Second Order
Logic.

Proof. Let θ : FIM(A) → S be as above and let X ⊆ S. Uniformly computing
the value of θ on every birooted tree by means of an MSO formula is done by
adapting Shelah’s decomposition techniques [17]. More precisely, we show that
the strong decomposition provided by Lemma 1 is indeed definable in MSO.
Then, the computation of the value of θ on every birooted tree B can be
done from the value of θ on the elementary birooted trees and the sub-birooted
F-trees that occur in such a decomposition. □
For the picture to be complete, it remains to characterize the class of quasi-
recognizable languages w.r.t. the class of languages definable in Monadic Second
Order Logic.

Theorem 7. Let L ⊆ B(F ) be a language of birooted F -trees. The following
properties are equivalent:
(1) the language L is quasi-recognizable,
(2) the language L is a finite boolean combination of upward closed MSO defin-
able languages,
(3) the language L is a finite boolean combination of languages recognized by
finite state birooted tree automata.

Proof. The fact that (1) implies (2) essentially follows from Theorem 6. The fact
that (2) implies (3) immediately follows from Theorem 2. Last, we prove, by a classical
argument (e.g. a cartesian product of monoids), that the class of quasi-recognizable
languages is closed under boolean operations. Then, by applying Theorem 5, this
proves that (3) implies (1). □

Corollary 8. The birooted image of every regular language of F-trees is recognizable
by an adequate premorphism in a finite adequately ordered monoid.

Proof. This follows from Theorem 3 and Theorem 7. □

4 Conclusion
Studying languages of birooted F-trees, structures that generalize F-terms, we
have defined a notion of automata and a related notion of quasi-recognizability,
and we have characterized their expressive power in depth in relation to
language definability in Monadic Second Order Logic.
As a particular case, our results provide a new algebraic characterization of the
regular languages of finite F -trees. Potential links with the preclones approach [5]
or the forest algebra approach [3,2] need to be investigated further.

References
1. Blumensath, A.: Recognisability for algebras of infinite trees. Theor. Comput.
Sci. 412(29), 3463–3486 (2011)
2. Bojańczyk, M., Straubing, H., Walukiewicz, I.: Wreath products of forest algebras,
with applications to tree logics. Logical Methods in Computer Science 8(3) (2012)
3. Bojańczyk, M., Walukiewicz, I.: Forest algebras. In: Logic and Automata,
pp. 107–132 (2008)
4. Cornock, C., Gould, V.: Proper two-sided restriction semigroups and partial ac-
tions. Journal of Pure and Applied Algebra 216, 935–949 (2012)
5. Ésik, Z., Weil, P.: On logically defined recognizable tree languages. In: Pandya,
P.K., Radhakrishnan, J. (eds.) FSTTCS 2003. LNCS, vol. 2914, pp. 195–207.
Springer, Heidelberg (2003)
6. Janin, D.: Quasi-recognizable vs MSO definable languages of one-dimensional over-
lapping tiles. In: Rovan, B., Sassone, V., Widmayer, P. (eds.) MFCS 2012. LNCS,
vol. 7464, pp. 516–528. Springer, Heidelberg (2012)
7. Janin, D.: Walking automata in the free inverse monoid. Technical Report RR-
1464-12 (revised April 2013), LaBRI, Université de Bordeaux (2012)
8. Janin, D.: On languages of one-dimensional overlapping tiles. In: van Emde Boas,
P., Groen, F.C.A., Italiano, G.F., Nawrocki, J., Sack, H. (eds.) SOFSEM 2013.
LNCS, vol. 7741, pp. 244–256. Springer, Heidelberg (2013)
9. Janin, D.: Overlapping tile automata. In: Bulatov, A. (ed.) CSR 2013. LNCS,
vol. 7913, pp. 431–443. Springer, Heidelberg (2013)
10. Kellendonk, J., Lawson, M.V.: Tiling semigroups. Journal of Algebra 224(1), 140–
150 (2000)
11. Lawson, M.V.: McAlister semigroups. Journal of Algebra 202(1), 276–294 (1998)
12. Perrin, D., Pin, J.-E.: Semigroups and automata on infinite words. In: Fountain,
J. (ed.) Semigroups, Formal Languages and Groups. NATO Advanced Study In-
stitute, pp. 49–72. Kluwer Academic (1995)
13. Pin, J.-E.: Relational morphisms, transductions and operations on languages. In:
Pin, J.E. (ed.) LITP 1988. LNCS, vol. 386, pp. 34–55. Springer, Heidelberg (1989)
14. Pin, J.-E.: Finite semigroups and recognizable languages: an introduction. In: Foun-
tain, J. (ed.) Semigroups, Formal Languages and Groups. NATO Advanced Study
Institute, pp. 1–32. Kluwer Academic (1995)
15. Pin, J.-E.: Syntactic semigroups. In: Handbook of Formal Languages, ch. 10, vol. I,
pp. 679–746. Springer (1997)
16. Scheiblich, H.E.: Free inverse semigroups. Semigroup Forum 4, 351–359 (1972)
17. Shelah, S.: The monadic theory of order. Annals of Mathematics 102, 379–419
(1975)
18. Thomas, W.: Languages, automata, and logic. In: Handbook of Formal Languages,
ch. 7, vol. III, pp. 389–455. Springer (1997)
19. Wilke, T.: An algebraic theory for regular languages of finite and infinite words.
Int. J. Alg. Comput. 3, 447–489 (1993)
One-Variable Word Equations in Linear Time

Artur Jeż1,2,
1
Max Planck Institute für Informatik,
Campus E1 4, DE-66123 Saarbrücken, Germany
2
Institute of Computer Science, University of Wrocław,
ul. Joliot-Curie 15, PL-50383 Wrocław, Poland
[email protected]

Abstract. In this paper we consider word equations with one variable
(and arbitrarily many appearances of it). A recent technique of recompression,
which is applicable to general word equations, is shown to be
suitable also in this case. While in the general case it is non-deterministic,
it determinises in the case of one variable, and the obtained running time is
O(n) (in the RAM model).

Keywords: Word equations, string unification, one variable equations.

1 Introduction
Word Equations. The problem of satisfiability of word equations is considered
one of the most intriguing in computer science. The first algorithm
for it was given by Makanin [11], and his algorithm was improved several times;
however, no essentially different approach was proposed for over two decades.
An alternative algorithm was proposed by Plandowski and Rytter [16], who
presented a very simple algorithm with a (nondeterministic) running time polynomial
in n and log N, where N is the length of the length-minimal solution.
However, at that time the only bound on such length followed from Makanin's
work and it was triply exponential in n.
Soon after, Plandowski showed, using novel factorisations, that N is at most
doubly exponential [14], proving that satisfiability of word equations is in
NEXPTIME. Exploiting the interplay between factorisations and compression, he
improved the algorithm so that it worked in PSPACE [15]. On the other hand, it is
only known that the satisfiability of word equations is NP-hard.

One Variable. Constructing a cubic algorithm for word equations with
only one variable (and arbitrarily many appearances of it) is trivial. The first
non-trivial bound was given by Obono, Goralcik and Maksimenko, who devised an
O(n log n) algorithm [13]. This was improved by Dąbrowski and Plandowski [2]
to O(n + #X log n), where #X is the number of appearances of the variable
The full version of this paper is available at http://arxiv.org/abs/1302.3481
This work was supported by the Alexander von Humboldt Foundation.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 324–335, 2013.

© Springer-Verlag Berlin Heidelberg 2013

in the equation. The latter work assumed that the alphabet Σ is finite or that it
can be identified with numbers. A general solution was presented by Laine and
Plandowski [9], who gave an O(n log #X) algorithm in a simpler model, in which
the only operation on letters is their comparison.

Recompression. Recently, the author proposed a technique of recompression,
based on previous techniques of Mehlhorn et al. [12], Lohrey and Mathissen [10]
and Sakamoto [17]. This method was successfully applied to various problems
related to grammar-compressed strings [5,3,4]. Unexpectedly, this approach was
also applicable to word equations, in which case alternative proofs of many known
results were obtained [6].
The technique is based on iterative application of two replacement schemes
performed on the text t:

pair compression of ab For two different letters a, b such that the substring ab
appears in t, replace each occurrence of ab in t by a fresh letter c.
a's block compression For each maximal block a^ℓ, where a is a letter and
ℓ > 1, that appears in t, replace each such a^ℓ in t by a fresh letter a_ℓ.

In one phase, pair compression (block compression) is applied to all pairs (blocks,
respectively) that appeared at the beginning of this phase. Ideally, each letter
is compressed and so the length of t halves; even in a worst-case scenario, during
one phase t is still shortened by a constant factor.
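On a plain string t, the two schemes above can be sketched as follows (an illustrative Python model; the caller supplies the fresh letters):

```python
def pair_compression(t, a, b, c):
    """Replace every occurrence of the pair ab (a != b) by the fresh
    letter c. Since a != b, occurrences of ab cannot overlap."""
    assert a != b
    return t.replace(a + b, c)

def block_compression(t, fresh):
    """Replace every maximal block a^l with l > 1 by fresh[(a, l)]."""
    out, i = [], 0
    while i < len(t):
        j = i
        while j < len(t) and t[j] == t[i]:
            j += 1                      # t[i:j] is a maximal block
        out.append(t[i] if j == i + 1 else fresh[(t[i], j - i)])
        i = j
    return "".join(out)
```

For instance, pair_compression("abab", "a", "b", "c") yields "cc", halving the text, as in the ideal case described above.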
The surprising property is that such a schema can be efficiently applied even
to grammar-compressed data [5,3] or to text given in an implicit way, i.e. as
a solution of a word equation [6]. In order to do so, local changes of the variables
(or nonterminals) are needed: X is replaced with a^ℓ X (or X a^ℓ), where a^ℓ is
a prefix (suffix, respectively) of the substitution for X. In this way the solution
that substitutes a^ℓ w for X is implicitly replaced with one that substitutes w.

Recompression and One-Variable Equations. As the recompression works
for general word equations, it can be applied also to restricted subclasses. In the
general case it relies on nondeterminism; however, when restricted to one-variable
equations it determinises. A simple implementation has O(n + #X log n)
running time, see Section 3. Adding a few heuristics, data structures and applying
a more sophisticated analysis yields an O(n) running time, see Section 4.

Outline of the Algorithm. We present an algorithm for one-variable equations
based on the recompression; it also provides a compact description of all
solutions of such an equation. Intuitively: when pair compression is applied, say
ab is replaced by c (assuming it can be applied), then there is a one-to-one
correspondence between the solutions before and after the compression: this
correspondence simply exchanges all abs by cs and vice versa. The same applies
to the block compression. On the other hand, the modification of X can lead to
a loss of solutions (for technical reasons we do not consider the empty solution ε): when X is to

be replaced with a^ℓ X, then each solution of the form a^ℓ w has a corresponding
solution w, but the solution a^ℓ is lost in the process. So before the replacement, it is
tested whether a^ℓ is a solution and, if so, it is reported. The testing is performed
by an on-the-fly evaluation of both sides under the substitution X = a^ℓ, comparing
the obtained strings letter by letter until a mismatch is found or both strings
end.
It is easy to implement the recompression so that one phase takes linear time.
The cost is distributed among the explicit words between the variables; each such w is
charged O(|w|). If such a w is long enough, its length decreases by a constant factor
in one phase, see Lemma 8. Thus, such cost is charged to the lost length and
sums to O(n) in total. However, this is not true when w is short (in particular,
of constant length). In this case we use the fact that there are O(log n) phases
and in each phase such cost is at most O(#X ) (i.e. proportional to the number
of explicit words in total).
Using the following heuristics as well as more involved analysis the running
time can be lowered to O(n) (see Section 4 for some details):
– We save space used for problematic ‘short’ words between the variables (and
thus time needed to compress them in a phase): instead of storing multiple
copies of the same short string we store it once and have pointers to it in
the equation. Additionally we prove that those short words are substrings of
‘long’ words, which allows a bound on the sum of their lengths.
– when we compare X w_1 X w_2 . . . w_m X from one side of the equation with its
copy appearing on the other side, we make such a comparison in O(1) time
(using suffix arrays);

– the strings (Xu)^m and (Xu′)^m (under the substitution for X) are compared in
O(|u| + |u′|) time instead of the naive O(m · |u| + m · |u′|), using simple facts
from combinatorics on words.
Furthermore, a more insightful analysis shows that problematic 'short' words in
the equation invalidate several candidate solutions. This allows a tighter estimation
of the time spent on testing the solutions.

Model. To perform the recompression efficiently, an algorithm for grouping
pairs is needed. When we identify the symbols in Σ with consecutive numbers,
this is done using RadixSort in linear time.¹ Thus, all (efficient) applications
of the recompression technique make such an assumption. On the other hand, the
second of the mentioned heuristics requires checking substring equality in O(1) time;
to this end a suffix array [7] with a structure for answering longest common prefix
(lcp) queries [8] is employed, on which we use range minimum queries [1]. The last
structure needs the flexibility of the RAM model to run in O(1) time per query.
¹ RadixSort runs in time linear in the number of numbers plus the universe size. Since
we introduce new numbers in each phase, it might be that the latter is much larger than
the equation length. However, after each phase, in linear time we can replace the
letters appearing in the equation so that they constitute an interval of numbers,
which yields that RadixSort indeed has a linear running time.
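The renumbering step described in the footnote can be sketched as follows (illustrative Python; sorted() stands in for the linear-time RadixSort grouping):

```python
def renumber(t):
    """Map the letters appearing in t onto 0..k-1, so that the alphabet is
    again an interval of numbers. The paper achieves the grouping in
    linear time with RadixSort; here sorted() plays that role."""
    code = {a: i for i, a in enumerate(sorted(set(t)))}
    return [code[a] for a in t], code
```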

2 Preliminaries
One-Variable Equations. Consider a word equation A = B over one variable
X; by |A| + |B| we denote its length, and n is the initial length of the equation.
Without loss of generality, one of A and B begins with a variable and the other
with a letter [2]: if they both begin with the same symbol (be it a letter or the
nonterminal), we can remove this symbol from both, without affecting the set of
solutions; if they begin with different letters, the equation clearly has no solution.
The same applies to the last symbols of A and B. Thus, in the following
we assume that the equation is of the form
A_0 X A_1 · · · A_{n_A−1} X A_{n_A} = X B_1 · · · B_{n_B−1} X B_{n_B} ,    (1)
where A_i, B_j ∈ Σ∗ are called (explicit) words and n_A (n_B) denotes the number of
appearances of X in A (in B, respectively). A_0 (the first word) is nonempty and
exactly one of A_{n_A}, B_{n_B} (the last word) is nonempty. If this condition is violated
for any reason, we greedily restore it by cutting letters from the appropriate strings.
A substitution S assigns a string to X; we extend S to (X ∪ Σ)∗ in the obvious
way. A solution is a substitution such that S(A) = S(B). For an
equation A = B we are looking for a description of all its solutions. We disregard
the empty solution S(X) = ε and always assume that S(X) ≠ ε. In such a case,
by (1) we can determine the first (last) letter of S(X) in O(1) time.
Lemma 1. Let a be the first letter of A_0. If A_0 ∈ a^+ then S(X) ∈ a∗ for
each solution S of A = B; all such solutions can be calculated and reported in
O(|A| + |B|) time. If A_0 ∉ a∗ then there is at most one solution S(X) ∈ a^+; the
length of such a solution can be returned in O(|A| + |B|) time. For S(X) ∉ a^+,
the lengths of the a-prefixes of S(X) and A_0 are the same.
A symmetric version of Lemma 1 holds for the suffix of S(X). By
SimpleSolution(a) we denote a procedure that, for A_0 ∉ a∗, returns the unique
ℓ such that S(X) = a^ℓ is a solution (or nothing, if there is no such solution).

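For intuition only, the effect of such a test can be modelled by a naive substitute-and-compare check; unlike the paper's procedures this sketch is not linear-time, and the token-list representation of the equation is our own assumption:

```python
def is_solution(A, B, s):
    """Check whether S(X) = s solves A = B by building both sides
    (sides are token lists mixing the symbol 'X' and explicit words)."""
    build = lambda side: "".join(s if w == "X" else w for w in side)
    return build(A) == build(B)
```

For the equation aX = Xa, i.e. A = ["a", "X"] and B = ["X", "a"], exactly the substitutions from a^+ pass this check.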
Representation of Solutions. Consider any solution S of A = B. If |S(X)| ≤
|A_0| then S(X) is a prefix of A_0. When |S(X)| > |A_0|, then S(A) begins with
A_0 S(X) while S(B) begins with S(X), and thus S(X) has period A_0. Hence
S(X) = A_0^k A′, where A′ is a prefix of A_0 and k > 0. In both cases S(X) is
uniquely determined by |S(X)|, so it is enough to describe such lengths.
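Recovering S(X) from its length is then immediate; a minimal illustrative sketch:

```python
def solution_from_length(A0, length):
    """Return the unique candidate S(X) of the given length: a prefix of
    A0 if it fits, and otherwise the length-'length' prefix of A0 A0 ...
    (in that case S(X) has period A0)."""
    reps = max(1, -(-length // len(A0)))  # ceiling division
    return (A0 * reps)[:length]
```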
Each letter in the current instance of our algorithm represents some (compressed)
string of the input equation; we store its weight, which is the length
of such a string. Furthermore, when we replace X with a^ℓ X (or X a^ℓ), we keep
track of the weight of a^ℓ. In this way, for each solution of the current equation we
know the length of the corresponding solution of the original equation,
and this identifies it uniquely.

Preserving Solutions. All subprocedures of the algorithm should preserve
solutions, i.e. there should be a one-to-one correspondence between the solutions
before and after the application of the subprocedure. However, as they replace
X with a^ℓ X (or X b^r), some solutions are lost in the process and so they should
be reported. We formalise these notions.
We say that a subprocedure preserves solutions when, given an equation A = B,
it returns A′ = B′ such that, for some strings u and v:

– some solutions of A = B are reported by the subprocedure,
– S is an unreported solution of A = B if and only if there is a solution S′ of
A′ = B′ such that S(X) = uS′(X)v ≠ uv.

By PC_{ab→c}(w) we denote the string obtained from w by replacing each ab
by c (we assume that a ≠ b, so this is well-defined); this corresponds to pair
compression. We say that a subprocedure properly implements pair compression
for ab if it satisfies the conditions for preserving solutions above, but with
PC_{ab→c}(S(X)) = uS′(X)v replacing S(X) = uS′(X)v. Similarly, by BC_a(w) we
denote a string with maximal blocks a^ℓ replaced by a_ℓ (for each ℓ > 1), and we
say that a subprocedure properly implements block compression for a letter a.
Given an equation A = B, its solution S and a pair ab ∈ Σ 2 appearing in
S(A) (or S(B)) we say that this appearance is explicit, if it comes from substring
ab of A (or B, respectively); implicit, if it comes (wholly) from S(X); crossing
otherwise. A pair is crossing if it has a crossing appearance and noncrossing
otherwise. A similar notion applies to maximal blocks of as, in which case we
say that a has a crossing block or it has no crossing blocks. Alternatively, a pair
ab is crossing if b is the first letter of S(X) and aX appears in the equation or
a is the last letter of S(X) and Xb appears in the equation or a is the last and
b the first letter of S(X) and XX appears in the equation.
Unless explicitly stated otherwise, we consider crossing/noncrossing pairs ab in
which a ≠ b. As the first (last) letter of S(X) is the same for each S, the definition
of a crossing pair does not depend on the solution; the same applies to crossing
blocks.
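The alternative characterisation can be read off the equation directly. A hypothetical sketch (sides as token lists of 'X' and nonempty explicit words, with adjacent X tokens standing for an empty word between two variables; first and last are the known first and last letters of S(X)):

```python
def crossing_pairs(tokens, first, last):
    """Collect the pairs ab (a != b) that are crossing on one side:
    aX contributes (a, first), Xb contributes (last, b),
    and XX contributes (last, first)."""
    cross = set()
    for u, v in zip(tokens, tokens[1:]):
        if u != "X" and v == "X":
            cross.add((u[-1], first))
        elif u == "X" and v != "X":
            cross.add((last, v[0]))
        elif u == "X" and v == "X":
            cross.add((last, first))
    return {(a, b) for (a, b) in cross if a != b}
```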
When a pair ab is noncrossing, its compression is easy, as it is enough to
replace each explicit ab with a fresh letter c, we refer to this procedure as
PairCompNCr(a, b). Similarly, when no block of a has a crossing appearance,
the a’s blocks compression consists simply of replacing explicit a blocks, we call
this procedure BlockCompNCr(a).

Lemma 2. If ab is a noncrossing pair then PairCompNCr(a, b) properly imple-
ments pair compression for ab. If a has no crossing blocks, then BlockCompNCr(a)
properly implements the block compression for a.

The main idea of the recompression method is the way it deals with crossing
pairs: imagine that ab is a crossing pair; this is because S(X) = bw and aX
appears in A = B, or S(X) = wa and Xb appears in it (the remaining case, in
which S(X) = bwa and XX appears in the equation, is treated in the same way).
The cases are symmetric, so we deal only with the first one. To 'uncross' ab in
this case it is enough to 'left-pop' b from X: replace each X in the equation with
bX and implicitly change the solution to S(X) = w.

Algorithm 1. Pop(a, b)
1: if b is the first letter of S(X) then
2:   if SimpleSolution(b) returns ℓ then        ▷ S(X) = b^ℓ is a solution
3:     report solution S(X) = b^ℓ
4:   replace each X in A = B by bX    ▷ Implicitly change S(X) = bw to S(X) = w
5: ▷ perform symmetric actions for a

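The replacement performed in line 4 of Pop, together with the implicit change of solutions, can be sketched in the same illustrative token-list model:

```python
def left_pop(side, b):
    """Replace each X by bX on one side of the equation."""
    out = []
    for w in side:
        out += [b, "X"] if w == "X" else [w]
    return out

def build(side, s):
    """Substitute s for X and flatten the side into a string."""
    return "".join(s if w == "X" else w for w in side)
```

For any side E and word w, build(left_pop(E, "b"), w) equals build(E, "b" + w): the solution bw of the old equation corresponds to the solution w of the new one.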
Lemma 3. Pop(a, b) preserves solutions. After it the pair ab is noncrossing.

The presented procedures are merged into PairComp(a, b) that turns crossing
pairs into noncrossing ones and then compresses them.

Lemma 4. PairComp(a, b) properly implements the pair compression of ab.

The number of crossing pairs can be large; however, applying Pop(a, b), where
b and a are the first and last letters of S(X), reduces their number to 2.

Lemma 5. After Pop(a, b), where b and a are the first and last letters of S(X),
the solutions are preserved and there are at most two crossing pairs.

The problems with crossing blocks are solved in a similar fashion: a has a crossing
block if and only if aa is a crossing pair. So we 'left-pop' a from X until the first
letter of S(X) is different from a, and we do the same with the ending letter b. This
effectively removes the whole a-prefix (b-suffix, respectively) from X: suppose
that S(X) = a^ℓ w b^r, where w neither starts with a nor ends with b. Then we
replace each X by a^ℓ X b^r, implicitly changing the solution to S(X) = w. The
corresponding procedure is called CutPrefSuff.

Lemma 6. CutPrefSuff preserves solutions and after its application there are
no crossing blocks of letters.

BlockComp(a) compresses all blocks of a, regardless of whether they are crossing
or not, by first applying CutPrefSuff and then BlockCompNCr(a).

Lemma 7. BlockComp(a) properly implements the block compression for a before
its application.

3 Main Algorithm

The following algorithm OneVar is basically a simplification of the general
algorithm for testing the satisfiability of word equations [6].

Algorithm 2. OneVar reports all solutions of a given word equation
1: while |A_0| > 1 do
2:   Letters ← letters in A = B
3:   run CutPrefSuff                 ▷ There are now no crossing blocks
4:   for a ∈ Letters do              ▷ Compressing blocks, time O(|A| + |B|) in total
5:     run BlockComp(a)
6:   Pop(a, b), where b is the first and a the last letter of S(X)
7:                                   ▷ Now there are only two crossing pairs
8:   Crossing ← list of crossing pairs, Non-Crossing ← list of noncrossing pairs
9:   for each ab ∈ Non-Crossing do   ▷ Compress noncrossing pairs, O(|A| + |B|)
10:    PairCompNCr(a, b)
11:  for ab ∈ Crossing do            ▷ Compress the 2 crossing pairs, O(|A| + |B|)
12:    PairComp(a, b)
13: TestSolution                     ▷ Test solutions from a∗

We call one iteration of the main loop a phase.

Theorem 1. OneVar runs in time O(|A| + |B| + (nA + nB ) log(|A| + |B|)) and
correctly reports all solutions of a word equation A = B.

The most important property of OneVar is that the explicit strings between the
variables shorten (assuming they are long enough). We say that a word A_i (B_j)
is short if it consists of at most C = 100 letters and long otherwise.

Lemma 8. If Ai (Bj ) is long then its length is reduced by 1/4 in this phase; if
it is short then after the phase it still is.
If the first word is short then its length is shortened by at least 1 in a phase.

It is relatively easy to estimate the running time of one phase.

Lemma 9. One phase of OneVar can be performed in O(|A| + |B|) time.

The cost of one phase is charged towards the words A_0, . . . , A_{n_A}, B_1, . . . , B_{n_B}
proportionally to their lengths. Since the lengths of the long words drop by a
constant factor in each phase, in total such cost is O(n). For short words the
cost is O(1) per phase and there are O(log n) phases by Lemma 8.

4 Heuristics and Better Analysis

The main obstacle to a linear running time is the necessity of dealing with
short words, as the time spent on processing them is difficult to charge. The
improvement to linear running time is achieved by four major modifications:

several equations We store a system of several equations and look for a solution
of such a system. This allows removal of some words from the equations.

small solutions We identify a class of particularly simple solutions, called
small, and show that a solution is reported within O(1) phases from the
moment when it becomes small. In several cases of the analysis we show that
the solutions involved are small, and so it is easier to charge the time spent
on testing them.
storage All words are represented by a structure of size proportional to the size
of the long words. In this way the storage space (and so also time used for
compression) decreases by a constant factor in each phase.
testing The testing procedure is modified, so that the time it spends on the
short words is reduced. We also improve the rough estimate that SimpleSolution
takes time proportional to |A| + |B| to an estimation that counts, for
each word, whether it was included in the test or not.

Several Equations. We store several equations and look for substitutions that
simultaneously satisfy all of them. Hence we have a collection A_i = B_i of
equations, for i = 1, . . . , m, each of the form (1). Such a system is obtained
by replacing one equation A_i A′_i = B_i B′_i with the equivalent two equations
A_i = B_i and A′_i = B′_i.
Each of the equations Ai = Bi in the system specifies the first and last letter
of the solution, the length of the a-prefix and suffix, etc., exactly in the same way
as it does for a single equation. However, it is enough to use only one of them,
say A1 = B1 : if there is any conflict, then there is no solution at all. The
consistency is not checked explicitly; we simply terminate as soon as an
inconsistency is discovered. We say that Ai (Bj ) is first or last if it is in any of the
stored equations.
All operations on a single equation from previous sections (popping letters,
cutting prefixes/suffixes, pair/block compression, etc.) generalise to a system of
equations and they preserve their properties and running times, with the length
|A| + |B| of a single equation replaced by the sum of lengths of all equations,
∑_{i=1}^{m} (|Ai | + |Bi |).

Small Solutions. We say that a word w represented as w = w1 (w2)^ℓ w3 (where ℓ
is arbitrary) is almost periodic, with period size |w2 | and side size |w1 w3 | (note
that several such representations may exist; we use this notion for a particular
representation that is clear from the context). A substitution S is small if
S(X) = w^k v, where w, v are almost periodic, with period size at most C and
side size at most 6C.
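To make the definitions concrete, here is a brute-force Python sketch of the two notions; the function names and the naive search over all representations are ours, and the algorithm itself never tests smallness this way:

```python
def is_almost_periodic(w, max_period, max_side):
    """Can w be written as w1 * (w2 repeated) * w3 with |w2| <= max_period
    and |w1| + |w3| <= max_side?  Naive search over all splits."""
    n = len(w)
    for i in range(n + 1):                 # end of w1
        for j in range(i, n + 1):          # start of w3
            if i + (n - j) > max_side:     # |w1| + |w3| too large
                continue
            mid = w[i:j]
            if mid == "":
                return True                # empty middle (exponent 0)
            for p in range(1, min(max_period, len(mid)) + 1):
                if len(mid) % p == 0 and mid == mid[:p] * (len(mid) // p):
                    return True
    return False

def is_small(sx, C):
    """Is sx of the form w**k * v with w and v almost periodic,
    period size at most C and side size at most 6*C?"""
    for plen in range(1, len(sx) + 1):
        w = sx[:plen]
        if not is_almost_periodic(w, C, 6 * C):
            continue
        for k in range(len(sx) // plen + 1):
            if sx.startswith(w * k) and \
               is_almost_periodic(sx[k * plen:], C, 6 * C):
                return True
    return False
```

The quadratic search is only meant to pin down the definition; it plays no role in the running-time analysis.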

Lemma 10. Suppose that S is a small solution. There is a constant c such that
within c phases the corresponding solution is reported by OneVar.

Storing. While the long words are stored exactly as they used to, the short
words are stored more efficiently: we keep a table of short words and equations
point to the table of short words instead of storing them. We say that such
332 A. Jeż

a representation is succinct and its size is the sum of lengths of words stored in
it. Note that we do not include the size of the equation.
The correctness of such an approach is guaranteed by the fact that equality
of two explicit words is not changed by OneVar, which is shown by a simple
induction.

Lemma 11. Consider any words A and B in the input equation. Suppose that
during OneVar they were transformed to A′ and B′, neither of which is a first nor
last word. Then A = B if and only if A′ = B′.

Hence, to perform the compression it is enough to read the succinct representation
without looking at the whole equation. In particular, the compression (both
pair and block) can be performed in time proportional to the size of the succinct
representation.

Lemma 12. The compression in one phase of OneVar can be performed in time
linear in size of the succinct representation.

Ideally, we want to show that the succinct representation has size proportional
to the length of the long words. In this way its size would decrease by a constant
factor in each phase and thus be O(n) in total. In reality, we are quite close
to this: the words stored in the tables are of two types, normal and overdue.
The normal words are substrings of the long words or of A0^2 and consequently the
sum of their sizes is proportional to the size of the long words. A word becomes
overdue if at the beginning of the phase it is not a substring of a long word or
of A0^2. It may become a substring of such a word later, but it does not stop
being an overdue word in that case. The new overdue words can be identified
in linear time using standard operations on a suffix array for a concatenation of
the long and short strings appearing in the equations.

Lemma 13. In time proportional to the sum of sizes of the long words plus the
number of overdue words we can identify the new overdue words.
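The detection step behind Lemma 13 can be sketched as follows; Python's substring operator stands in for the suffix-array machinery, so the sketch is not linear time, and the use of the square of the first word is our reading of the text above:

```python
def find_new_overdue(short_words, long_words, first_word, overdue):
    """Mark as overdue every short word that is currently neither a
    substring of a long word nor of the square of the first word.
    The `in` test stands in for the suffix-array lookup of Lemma 13;
    a word that once became overdue stays overdue forever."""
    haystacks = list(long_words) + [first_word * 2]
    for w in short_words:
        if w not in overdue and not any(w in h for h in haystacks):
            overdue.add(w)
    return overdue
```

With a suffix array (plus LCP information) over the concatenation of all these strings, the same membership queries are answered within the time bound stated in Lemma 13.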

The overdue words can be removed from the equations in O(1) phases after
becoming overdue. This is shown by a series of lemmata.
We say that for a substitution S the word A is arranged against itself if each
occurrence of A in S(A) coming from an explicit Ai = A corresponds to some Bj = A
at the same position in S(B) (and symmetrically, with the sides of the equation exchanged).

Lemma 14. Consider a word A in a phase in which it becomes overdue and a
solution S. Then either S is small or A is arranged against itself.

The proof is rather easy: we consider the Ai = A that is not arranged against
some Bj = A in S(A) = S(B). Since by definition it also cannot be arranged
against a subword of a long word, case inspection gives that one of the S(X)
preceding or succeeding Ai overlaps with some other S(X), yielding that S(X)
is periodic. Furthermore, this period has length at most |Ai | ≤ C, hence S(X)
is small.

Due to Lemmata 10 and 14 the overdue words can be removed in O(1) phases
after their introduction: suppose that A becomes an overdue word in phase
ℓ. Any solution in which an overdue word A is not arranged against another
copy of A is small and so it is reported after O(1) phases. Then an equation
Ai X A X A′i = Bi X A X B′i, where Ai and Bi do not have A as a word, is equivalent
to the two equations Ai = Bi and A′i = B′i, and this procedure can be applied
recursively to A′i = B′i. This removes all copies of A from the system.

Lemma 15. Consider the set of overdue words introduced in phase ℓ. Then in
phase ℓ + c (for some constant c) we can remove all these words A from the equations.
The obtained set of equations has the same set of solutions. The time spent on
the removal of overdue words, over the whole run of OneVar, is O(n).

This allows us to bound the time spent on compression.

Lemma 16. The running time of OneVar, except for time used to test the so-
lutions, is O(n).

Testing. SimpleSolution checks whether S is a solution by comparing S(Ai ) and
S(Bi ) letter by letter, replacing X with S(X) on the fly. We say that in such a case
a letter b in S(Ai ) is tested against the corresponding letter in S(Bi ).
Suppose that for a substitution S a letter from Ai is tested against a letter
from S(XBj ) (there is some asymmetry regarding the Ai s and Bj s in the definition;
this is a technical detail without importance). We say that this test is:

protected if at least one of Ai , Ai+1 , Bj , Bj+1 is long;
failed if Ai , Ai+1 , Bj and Bj+1 are short and a mismatch for S is found before
the end of Ai+1 or Bj+1 ;
aligned if Ai = Bj and Ai+1 = Bj+1 , all of them are short and the first letter
of Ai is tested against the first letter of Bj ;
misaligned if all of Ai , Ai+1 , Bj , Bj+1 are short, Ai+1 ≠ Ai or Bj+1 ≠ Bj , and
this is not an aligned test;
periodical if Ai+1 = Ai , Bj+1 = Bj , all of them are short and this is not an
aligned test.

It is easy to show by case inspection that each test is of one of those types. We
calculate the cost of each type of test separately. For failed tests note that there
are O(1) of them in each of the O(log n) phases.

Lemma 17. The number of all failed tests is O(log n).

For protected tests, we charge the cost of the protected test to the long word
and only O(|A|) such tests can be charged to one long word A in a phase. On
the other hand, each long word is shortened by a constant factor in a phase and
so this cost can be charged to those removed letters and thus the total cost of
those tests (over the whole run of OneVar) is O(n).

Lemma 18. In one phase the number of protected tests is proportional to the
length of long words. Thus there are O(n) such tests in total.
In case of the misaligned tests, consider the phase in which the last of Ai+1 , Ai ,
Bj+1 , Bj becomes short. We show that the corresponding solution S is small
in this phase and so by Lemma 10 it is reported within O(1) following phases.
The proof is quite technical; it follows the general idea of Lemma 14: we show that
S(X) overlaps with itself and so it has a period, and a closer inspection proves
that S(X) is almost periodic.
The cost of the misaligned test is charged to the last word among Ai , Ai+1 ,
Bj , Bj+1 that became short, say Bj , and only O(1) such tests are charged to
this Bj (over the whole run of OneVar). Hence there are O(n) misaligned tests.
Lemma 19. There are O(n) misaligned tests during the whole run of OneVar.
Consider a maximal set of consecutive aligned tests; they correspond to a
comparison of Ai X Ai+1 . . . Ai+k X and Bj X Bj+1 . . . Bj+k X, where Ai+ℓ = Bj+ℓ for
ℓ = 0, . . . , k. Then the next test is either misaligned, protected or failed, so if the
cost of all those aligned tests can be bounded by O(1), they can be associated
with the succeeding test. Note that instead of performing the aligned tests (by
comparing letters), it is enough to identify the maximal (syntactically) equal
substrings of the equation. From Lemma 11 it follows that this corresponds to
the (syntactical) equality of substrings in the original equation. We identify such
substrings in O(1) per substring using a suffix array constructed for the input
equation.
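At the level of word identifiers, a maximal run of aligned tests can be found by comparing the short words themselves rather than their letters; the sketch below is linear in the number of words, whereas the suffix-array solution above answers each run in O(1):

```python
def aligned_run_length(A, B, i, j):
    """Number of consecutive positions l with A[i+l] == B[j+l]: one
    comparison of word identifiers per aligned word, instead of one
    comparison per letter of the words."""
    k = 0
    while i + k < len(A) and j + k < len(B) and A[i + k] == B[j + k]:
        k += 1
    return k
```

Because equal short words share a table identifier in the succinct representation, each comparison here is a constant-time identifier check, not a letter-by-letter one.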
Lemma 20. The total cost of aligned tests is O(n).
For the periodical tests we apply a similar charging strategy. Suppose that we
are to test the equality of (a suffix of) S((Ai X)^ℓ) and (a prefix of) S(X(Bj X)^k).
Firstly, it is easy to show that the next test is either misaligned, protected or
failed. Secondly, if |Ai | = |Bj | then the test for Ai+ℓ′ and Bj+ℓ′ with 0 < ℓ′ ≤ ℓ is
the same as the test for Ai and Bj and so they can all be skipped. If |Ai | > |Bj | then
the common part of S((Ai X)^ℓ) and S(X(Bj X)^k) has periods |S(Ai X)| and
|S(Bj X)| and consequently has a period |Ai | − |Bj | ≤ C. So to test the equality
of S((Ai X)^ℓ) and (a prefix of) S(X(Bj X)^k) it is enough to test the first common
|Ai | − |Bj | letters and check whether both S(Ai X) and S(Bj X) have period
|Ai | − |Bj |.
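This shortcut can be sketched in Python, with S(Ai X) and S(Bj X) given as plain strings (a simplification of the actual test, with our own function names):

```python
def has_period(w, p):
    """w has period p iff w[i] == w[i + p] for every valid i."""
    return all(w[i] == w[i + p] for i in range(len(w) - p))

def periodic_overlap_equal(SA, SB):
    """Shortcut for one periodical test: with SA = S(Ai X) and
    SB = S(Bj X), compare only the first d = ||SA| - |SB|| letters and
    verify that both factors have period d; then both repetitions are
    prefixes of the same periodic word, so the whole overlap matches."""
    d = abs(len(SA) - len(SB))
    if d == 0:
        return SA == SB
    return SA[:d] == SB[:d] and has_period(SA, d) and has_period(SB, d)
```

Since d ≤ C, the work per periodical test is bounded by a constant plus two period checks, which is what makes the O(n) total of Lemma 21 plausible.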
Lemma 21. Performing all periodical tests takes in total O(n) time.
This yields that the total time of testing is linear.
Lemma 22. The time spent on testing solutions during OneVar is O(n).

Acknowledgements. I would like to thank P. Gawrychowski for initiating my
interest in compressed membership problems and compressed pattern matching,
exploring which led to this work, and for pointing to relevant literature [10,12];
J. Karhumäki, for his explicit question whether the techniques of local recompression
can be applied to word equations; and W. Plandowski, for his numerous
comments and suggestions on recompression applied to word equations.

References
1. Berkman, O., Vishkin, U.: Recursive star-tree parallel data structure. SIAM J.
Comput. 22(2), 221–242 (1993)
2. Dąbrowski, R., Plandowski, W.: On word equations in one variable. Algorith-
mica 60(4), 819–828 (2011)
3. Jeż, A.: Faster fully compressed pattern matching by recompression. In: Czumaj,
A., Mehlhorn, K., Pitts, A., Wattenhofer, R. (eds.) ICALP 2012, Part I. LNCS,
vol. 7391, pp. 533–544. Springer, Heidelberg (2012)
4. Jeż, A.: Approximation of grammar-based compression via recompression. In: Fis-
cher, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 165–176. Springer,
Heidelberg (2013)
5. Jeż, A.: The complexity of compressed membership problems for finite automata.
Theory of Computing Systems, 1–34 (2013),
http://dx.doi.org/10.1007/s00224-013-9443-6
6. Jeż, A.: Recompression: a simple and powerful technique for word equations. In:
Portier, N., Wilke, T. (eds.) STACS. LIPIcs, vol. 20, pp. 233–244. Schloss Dagstuhl–
Leibniz-Zentrum fuer Informatik, Dagstuhl (2013),
http://drops.dagstuhl.de/opus/volltexte/2013/3937
7. Kärkkäinen, J., Sanders, P., Burkhardt, S.: Linear work suffix array construction.
J. ACM 53(6), 918–936 (2006)
8. Kasai, T., Lee, G., Arimura, H., Arikawa, S., Park, K.: Linear-time longest-
common-prefix computation in suffix arrays and its applications. In: Amir, A.,
Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 181–192. Springer, Heidel-
berg (2001)
9. Laine, M., Plandowski, W.: Word equations with one unknown. Int. J. Found.
Comput. Sci. 22(2), 345–375 (2011)
10. Lohrey, M., Mathissen, C.: Compressed membership in automata with compressed
labels. In: Kulikov, A., Vereshchagin, N. (eds.) CSR 2011. LNCS, vol. 6651,
pp. 275–288. Springer, Heidelberg (2011)
11. Makanin, G.S.: The problem of solvability of equations in a free semigroup. Matem-
aticheskii Sbornik 2(103), 147–236 (1977) (in Russian)
12. Mehlhorn, K., Sundar, R., Uhrig, C.: Maintaining dynamic sequences under equal-
ity tests in polylogarithmic time. Algorithmica 17(2), 183–198 (1997)
13. Obono, S.E., Goralcik, P., Maksimenko, M.N.: Efficient solving of the word equa-
tions in one variable. In: Privara, I., Ružička, P., Rovan, B. (eds.) MFCS 1994.
LNCS, vol. 841, pp. 336–341. Springer, Heidelberg (1994)
14. Plandowski, W.: Satisfiability of word equations with constants is in NEXPTIME.
In: STOC, pp. 721–725 (1999)
15. Plandowski, W.: Satisfiability of word equations with constants is in PSPACE.
J. ACM 51(3), 483–496 (2004)
16. Plandowski, W., Rytter, W.: Application of Lempel-Ziv encodings to the solution
of word equations. In: Larsen, K.G., Skyum, S., Winskel, G. (eds.) ICALP 1998.
LNCS, vol. 1443, pp. 731–742. Springer, Heidelberg (1998)
17. Sakamoto, H.: A fully linear-time approximation algorithm for grammar-based
compression. J. Discrete Algorithms 3(2-4), 416–430 (2005)
The IO and OI Hierarchies Revisited

Gregory M. Kobele1 and Sylvain Salvati2

1 University of Chicago
[email protected]
2 INRIA, LaBRI, Université de Bordeaux
[email protected]

Abstract. We study languages of λ-terms generated by IO and OI


unsafe grammars. These languages can be used to model meaning repre-
sentations in the formal semantics of natural languages following the tra-
dition of Montague [19]. Using techniques pertaining to the denotational
semantics of the simply typed λ-calculus, we show that the emptiness
and membership problems for both types of grammars are decidable. In
the course of the proof of the decidability results for OI, we identify a de-
cidable variant of the λ-definability problem, and prove a stronger form
of Statman’s finite completeness Theorem [28].

1 Introduction

At the end of the sixties, similar but independent lines of research were pursued in
formal language theory and in the formal semantics of natural language. Formal
language theory was refining the Chomsky hierarchy so as to find an adequate
syntactic model of programming languages lying in between the context-free and
context-sensitive languages. Among others, this period resulted in the definition
of IO and OI macro languages by Fischer [12] and the notion of indexed languages
(which coincide with OI macro languages) by Aho [2]. At the same time, Richard
Montague [19] was proposing a systematic way of mapping natural language sen-
tences to logical formulae representing their meanings, providing thereby a solid
foundation for the field of formal semantics. The main idea behind these two lines
of research can be summed up in the phrase ‘going higher-order.’ For macro and
indexed grammars, this consisted in parameterizing non-terminals with strings
and indices (stacks) respectively, and in Montague’s work it consisted in us-
ing the simply typed λ-calculus to map syntactic structures to their meanings.
Montague was ahead of the formal language theory community which took an-
other decade to go higher-order with the work of Damm [7]. However, the way
Damm defined higher-order grammars used (implicitly) a restricted version of
the λ-calculus that is now known as the safe λ-calculus. This restriction was
made explicit by Knapik et al. [16] and further studied by Blum and Ong [4].

Long version: http://hal.inria.fr/hal-00818069

The first author was funded by LaBRI while working on this research.

This work has been supported by ANR-12-CORD-0004 POLYMNIE.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 336–348, 2013.

© Springer-Verlag Berlin Heidelberg 2013

For formal grammars this restriction was first lifted by de Groote [8] and
Muskens [21] in the context of computational linguistics and as a way of applying
Montague’s techniques to syntactic modeling.
In the context of higher-order recursive schemes, Ong showed that safety
was not a necessary condition for the decidability of the MSO model checking
problem. Moreover, the safety restriction has been shown to be a real restriction
by Parys [23]. Nevertheless, concerning the IO and OI hierarchies, the question
as to whether safety is a genuine restriction in terms of the definable languages is
still an open problem. Aehlig et al. [1] showed that for second order OI grammars
safety was in fact not a restriction. It is nevertheless generally conjectured that
for higher-order grammars safety is in fact a restriction.
As we wish to extend Montague’s technique with the OI hierarchy so as to
enrich it with fixed-point computation as proposed by Moschovakis [20], or as
in proposals to handle presuppositions in natural languages by Lebedeva and
de Groote [10,9,17], we work with languages of λ-terms rather than with just
languages of strings or trees. In the context of languages of λ-terms, safety clearly
appears to be a restriction since, as shown by Blum and Ong [4], not every λ-
term is safe. Moreover the terms generated by Montague’s technique appear to
be unsafe in general.
This paper is thus studying the formal properties of the unsafe IO and OI lan-
guages of λ-terms. A first property that the use of unsafe grammars brings into
the picture is that the unsafe IO hierarchy is strictly included within the unsafe
OI hierarchy. The inclusion can be easily shown using a standard CPS transform
on the grammars and its strictness is implied by decidability results. Nevertheless,
it is worth noting that such a transform cannot be performed on safe grammars,
so that it is unclear whether safe IO languages are safe OI languages. This paper
focuses primarily on the emptiness and the membership problems for unsafe IO
and OI languages, by using simple techniques related to the denotational seman-
tics of the λ-calculus. For the IO case, we are going to recast some known results
from Salvati [25,24], so as to emphasize that they derive from the fact that for an
IO language and a finite model of the λ-calculus, one can effectively compute the
elements of the model which are the interpretations of terms in the language. This
allows us to show that the emptiness problem is decidable, and also, using Stat-
man’s finite completeness theorem [28], to show that the membership problem is
decidable. In contrast to the case for IO languages, we show that this property does
not hold for OI languages. Indeed, we prove that the set of closed terms of a given
type is an OI language, and thus, since λ-definability is undecidable [18], the set of
elements in a finite model that are the interpretation of terms in an OI language
cannot be effectively computed. To show the decidability of emptiness and of the
membership problems for OI, we prove a theorem that we call the Observability
Theorem; it characterizes some semantic properties of the elements of an OI lan-
guage in monotonic models, and leads directly to the decidability of the emptiness
problem. For the membership problem we prove a generalization of Statman’s fi-
nite completeness theorem which, in combination with the Observability Theorem,
entails the decidability of the membership problem of OI languages.

This work is closely related to the research that is being carried out on higher-
order recursive schemes. It differs from it in one important respect: the main
objects of study in the research on higher-order recursive schemes are the infinite
trees generated by schemes, while our work is related to the study of the Böhm
trees of λY -terms which may contain λ-binders. Such Böhm trees are closer to
the configuration graphs of Higher-order Collapsible Pushdown Automata whose
first-order theory has been shown undecidable [6]. If we were only interested in
grammars generating trees or strings, the decidability of MSO for higher-order
recursion schemes [22] would yield the decidability of both the emptiness and
the membership problems of unsafe OI grammars, but this is no longer the case
when we turn to languages of λ-terms.

Organization of the Paper. We start by giving the definitions related to the


λ-calculus, its finitary semantics, and how to define higher-order grammars in
section 2. We then present the decidability results concerning higher-order IO
languages and explain why the techniques used there cannot be extended to
OI languages in section 3. Section 4 contains the main contributions of the
paper: the notion of hereditary prime elements of monotone models together with
the Observability Theorem, and a strong form of Statman’s finite completeness
Theorem. Finally we present conclusions and a broader perspective on our results
in section 5.

2 Preliminaries
In this section, we introduce the various calculi we are going to use in the course
of the article. Then we show how those calculi may be used to define IO and OI
grammars. We give two presentations of those grammars, one using traditional
rewriting systems incorporating non-terminals, and the other as terms in one
of the calculi; these two perspectives are equivalent. In the remainder of the
paper we will switch between these two formats as is most convenient. Finally
we introduce the usual notions of full and monotone models for the calculi we
work with.

2.1 λ-Calculi
We introduce here various extensions of the simply typed λ-calculus. Given an
atomic type 0 (our results extend with no difficulty to arbitrarily many atomic
types), the set type of types is built inductively using the binary right-associative
infix operator →. We write α1 → · · · → αn → α0 for (α1 → (· · · (αn → α0 ))).
As in [14], the order of a type is: order(0) = 1, order(α → β) = max(order(α) +
1, order(β)). Constants are declared in higher-order signatures Σ which are finite
sets of typed constants {A1^α1 , . . . , An^αn }. We use constants to represent non-terminal symbols.

We assume that we are given a countably infinite set of typed λ-variables


(xα , y β ,. . . ). The families of typed λY +Ω-terms (Λα )α∈type built on a signature
Σ are inductively constructed according to the following rules: xα , cα and Ω α
are in Λα ; Y α is in Λ(α→α)→α ; if M is in Λα→β and N is in Λα , then (M N )
is in Λα ; if M is in Λβ then (λxα .M ) is in Λα→β ; if M and N are in Λ0 then
M + N is in Λ0 . When M is in Λα we say that it has type α, we write M α
to indicate that M has type α; the order of a term M , is the order of its type.
As it is customary, we omit type annotations when they can be easily inferred
or when they are irrelevant. We adopt the usual conventions about dropping
parentheses in the syntax of terms. We write M α→β + N α→β as an abbreviation
for λxα .M x + N x. The set of free variables of the term M is written F V (M ); a
term M is closed when F V (M ) = ∅. Finally we write M [x1 ← N1 , . . . , xn ← Nn ]
for the simultaneous capture-avoiding substitutions of the terms N1 , . . . , Nn for
the free occurrences of the variables x1 , . . . , xn in M .
The set of λ-terms is the set of terms that do not contain occurrences of Y , +
or Ω, and for any S ⊆ {Y, +, Ω}, the λS-terms are the λ-terms that may contain
only constants that are in S. For example, λ+Ω-terms are the terms that do not
contain occurrences of Y .
We assume the reader is familiar with the notions of β-contraction, η-contraction
and η-long forms (see [14]). The constant Ω^α stands for the undefined
term of type α, Y^α is the fixpoint combinator of type (α → α) → α, and + is
the non-deterministic choice operator. The families of terms that may contain
occurrences of Ω are naturally ordered with the least compatible relation ⊑ such
that Ω^α ⊑ M for every term M of type α; δ-contraction provides the operational
semantics of the fixpoint combinator: Y M →δ M (Y M ), and +-contraction gives
the operational semantics of the non-deterministic choice operator: M + N →+
M and M + N →+ N . Given a set R of symbols denoting compatible relations,
for S ⊆ R, S-contraction is the union of the contraction relations denoted by the
symbols in S; it will generally be written as →S . For example, →βη+ denotes
βη+-contraction. S-reduction, written →*_S , is the reflexive transitive closure of S-contraction,
and S-conversion, =S , is the smallest equivalence relation containing
→S . The notion of S-normal form is defined as usual; we simply say normal form
when S is obvious from the context. We recall (see [14]) that when M →*_βηδ+ N
and M′ and N′ are respectively the η-long forms of M and N , then M′ →*_βδ+ N′.
In the remainder of the paper we assume that we are working with terms in η-long
form and forget about η-reduction.

2.2 IO/OI Grammars and λY +-Calculus

We define a higher-order macro grammar G as a triple (Σ, R, S) where Σ is a


higher-order signature of non-terminals, R is a finite set of rules A → M where
A is a non-terminal of Σ and M is a λ-term built on Σ that has the same type
as A, and where S is a distinguished non-terminal of Σ, the start symbol. We
do not require M to be a closed term: the free variables in the right hand side
of grammatical rules play the role of terminal symbols. We also do not require

S to be of a particular type; this permits (higher-order) macro grammars to


define languages of λ-terms of arbitrary types. As noted in [8], languages of
strings and trees are particular cases of the languages we study. A grammar
has order n when the highest order of its non-terminals is n. Our higher-order
macro grammars generalize those of Damm [7] in two ways: first, they do not
necessarily verify the safety condition that Damm’s grammar implicitly verify;
second, instead of only defining languages of strings or trees, they can define
languages of λ-terms, following Montague’s tradition in the formal semantics
of natural languages. According to [4], a term M is safe when no subterm N
of M contains free variables (excluding the free variables of M which play the
role of constants) of order lower than that of N , unless N occurs as part of a
subterm N P or λx.N . A grammar is safe when the right hand sides of its rules
are all safe terms. Safe terms can be safely reduced using substitution in place
of capture-avoiding substitutions.
The rules of a grammar G = (Σ, R, S) define a natural relation →G on terms
built on Σ. We write M →G N when N is obtained from M by replacing
(without capturing free variables) an occurrence of a non-terminal A in M by a
term P , such that A → P is a rule of G. The grammar G defines two languages:

LOI (G) = {M in normal form | S →*_βG M } and LIO (G) = {M in normal form |
∃P. S →*_G P ∧ P →*_β M }. These two languages can be defined in a different
manner; in particular M is in LOI (G) iff S can be reduced to M with the head
reduction strategy that consists in always contracting top-most redices of the
relation →βG . For a given grammar G, we always have that LIO (G) ⊆ LOI (G),
but, in general, LIO (G) ≠ LOI (G). Here follows an example of a second order
macro grammar G whose free variables (or terminals) are ex^((0→0)→0) , and^(0→0→0) ,
not^(0→0) and P^(0→0) , and whose non-terminals are S^0 (the start symbol), S′^(0→0)
and cons^(0→0→0) (extending BNF notation to macro grammars).

S → ex (λx.S′ x) | and S S | not S
S′ → λy.ex (λx.S′ (cons y x)) | λy.and (S′ y) (S′ y) | λy.not (S′ y) | λy.P y
cons → λxy.x | λxy.y
The language LOI (G) represents the set of formulae of first-order logic built with
one predicate P (we use a similar construction later on to prove Theorem 8).
The language LIO (G) represents the formulae of first-order logic that can be
built with only one variable (that is each subformula of a formula represented in
LIO (G) contains at most one free variable). Given the definition of safety given
in [4], it is easily verified that the terms of these languages are not safe; this
illustrates that unsafe IO and OI languages of λ-terms are more general than
their safe counterparts. Moreover, when seen as graphs, the terms of LOI (G)
form a class of graphs of unbounded treewidth; the MSO theory of
these terms is undecidable. This explains why the decidability results we obtain
later on cannot be seen as corollaries of Ong’s Theorem [22].
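The OI language of this example grammar can be explored with a small depth-bounded Python transcription of its rules; the variable context threaded by S′ is modelled as a counter n, cons as the choice of any variable in scope, and the names x0, x1, . . . and string renderings are an illustrative convention, not notation from the paper:

```python
def S(depth):
    """Closed formulas derivable from the start symbol S within
    `depth` nested rule applications (string renderings)."""
    if depth == 0:
        return set()
    out = set()
    for body in Sp(depth - 1, 1):          # S -> ex (lambda x. S' x)
        out.add("ex x0.(%s)" % body)
    for f in S(depth - 1):
        out.add("not(%s)" % f)             # S -> not S
        for g in S(depth - 1):
            out.add("and(%s)(%s)" % (f, g))  # S -> and S S
    return out

def Sp(depth, n):
    """Formulas with free variables among x0..x_{n-1}: n models the
    context built by cons, and P may use any variable in scope."""
    if depth == 0:
        return set()
    out = set()
    for body in Sp(depth - 1, n + 1):      # S' -> \y. ex(\x. S'(cons y x))
        out.add("ex x%d.(%s)" % (n, body))
    for f in Sp(depth - 1, n):
        out.add("not(%s)" % f)             # S' -> \y. not (S' y)
        for g in Sp(depth - 1, n):
            out.add("and(%s)(%s)" % (f, g))  # S' -> \y. and (S' y)(S' y)
    for i in range(n):
        out.add("P x%d" % i)               # S' -> \y. P y  (y via cons)
    return out
```

Already at depth 3 the enumeration contains formulas such as ex x0.(ex x1.(P x0)), which use more than one variable and hence lie outside the IO language of the grammar.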
We extend the notions of IO and OI languages to λY +Ω-terms. Given a
λY +Ω-term M , its IO language LIO (M ) is the set of λ-terms N in normal form
such that there is P such that M →*_δ+ P →*_β N . Its OI language LOI (M ) is the


set of λ-terms N in normal form such that M →*_βδ+ N (we can also restrict our
attention to head-reduction). An alternative characterization of LOI (M ) is the
following. Given a term M we write ω(M ) for the immediate approximation of M ,
that is the term obtained from M as follows: ω(λxα .M ) = Ω α→β if ω(M ) = Ω β ,
and λxα .ω(M ) otherwise; ω(M N ) = Ω β if ω(M ) = Ω α→β or M = λx.P , and
ω(M N ) = ω(M )ω(N ) otherwise; ω(Y α ) = Ω (α→α)→α , ω(xα ) = xα , ω(Ω α ) =
Ω α , and ω(N1 + N2 ) = ω(N1 ) + ω(N2 ). Note that ω(M ) is a λ+Ω-term that
contains no β-redices. A λ+Ω-term Q is a finite approximation of M if there is

a P such that M →*_βδ P and Q = ω(P ). The language LOI (M ) is the union of
the languages LOI (Q) where Q ranges over the finite approximations of M .
In both the IO and OI mode of evaluation, λ+Ω-terms define finite languages,
and λY +Ω-calculus defines exactly the same classes of languages as higher-order
macro grammars.
Theorem 1. Given a higher-order macro grammar G, there is a λY +Ω-term
M so that LOI (G) = LOI (M ) and LIO (G) = LIO (M ).
Given a λY +Ω-term M there is a higher-order macro grammar G so that
LOI (G) = LOI (M ) and LIO (G) = LIO (M ).
The proof of this theorem is based on the correspondence between higher-order
schemes and λY -calculus that is given in [27]. Going from a λY +Ω-term to
a grammar is simply a direct transposition of the procedure described in [27]
with the obvious treatment for +. For the other direction, it suffices to see the
grammar as a non-deterministic scheme, which is done by viewing all the rules
A → M1 , . . . , A → Mn , of a non-terminal A as a unique rule of a scheme
A → M1 + · · · + Mn ; and then to transform the scheme into a λY +-term using
the transformation given in [27]. There is a minor technicality concerning the IO
languages; one needs to start with a grammar where every non-terminal can be

rewritten into a G-normal form using →*_G only.

2.3 Models of the λ-Calculi


Full models of the λ-calculus. We start by giving the simplest notion of models
of the λ-calculus, that of full models. A full model F is a collection of sets indexed
by types (Fα )α∈type so that Fα→β is Fβ^{Fα} , the set of functions from Fα to
Fβ . Note that F is completely determined by F0 . A full model is said to be
finite when F0 is a finite set; in that case Fα is finite for every α ∈ type. A
valuation ν is a function that maps variables to elements of F respecting typing,
meaning that, for every xα , ν(xα ) is in Fα . Given a valuation ν and a in Fα ,
we write ν[xα ← a] for the valuation which maps the variable xα to a but is
otherwise equal to ν. We can now interpret λ-terms in F , using the following
interpretation scheme: [[xα ]]^ν_F = ν(xα ), [[M N ]]^ν_F = [[M ]]^ν_F ([[N ]]^ν_F ),
and for a in Fα , [[λxα .M ]]^ν_F (a) = [[M ]]^{ν[xα ←a]}_F . For a closed term M ,
[[M ]]^ν_F does not depend on ν, and thus we simply write [[M ]]F . The following
facts are known about full models:

Theorem 2. If M =βη N then for every full model F , [[M ]]^ν_F = [[N ]]^ν_F .
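The interpretation scheme can be transcribed directly into Python for a finite full model; the tuple encoding of terms, the explicit domain annotation on abstractions, and the representation of functions as frozensets of argument–value pairs are our own conventions, chosen so that functions can themselves serve as arguments:

```python
def interp(term, env):
    """[[x]] = env(x); [[M N]] = [[M]]([[N]]); [[lam x. M]](a) = [[M]]
    in env[x <- a].  Terms are tuples: ("var", x), ("app", M, N),
    ("lam", x, dom, M), where dom is the semantic domain x ranges
    over; a function value is a frozenset of (argument, value) pairs."""
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "app":
        f = dict(interp(term[1], env))   # function table as a dict
        return f[interp(term[2], env)]
    if tag == "lam":
        _, x, dom, body = term
        # build the function table pointwise over the finite domain
        return frozenset((a, interp(body, {**env, x: a})) for a in dom)
    raise ValueError("unknown term tag: %r" % (tag,))
```

For instance, over F0 = {0, 1}, interpreting λx. f (f x) in an environment where f denotes negation yields the identity function on F0.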

Theorem 3 (Finite Completeness [28]). Given a λ-term M , there is a finite
full model FM and a valuation ν so that, for every N , [[N ]]^ν_{FM} = [[M ]]^ν_{FM} iff
N =βη M .

In this theorem, the construction of FM and ν is effective.
For a full model F , an element f of Fα is said to be λ-definable when there is
a closed M such that [[M ]]F = f . The problem of λ-definability is the problem
whose input is a finite full model F and an element f of Fα , and whose answer
is whether f is λ-definable.
Theorem 4 (Loader [18]). The λ-definability problem is undecidable.
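As a toy illustration (ours, not from the paper): over F0 = {0, 1} the type 0 → 0 has four elements, but the only closed βη-normal term of type 0 → 0 is λx.x, so exactly one of the four is λ-definable. Loader's theorem says that no algorithm can carry out this classification at all types and models.

```python
from itertools import product

# The four set-theoretic functions {0,1} -> {0,1}, encoded as tuples of
# (argument, value) pairs; only the identity is the value of a closed
# λ-term at type 0 -> 0 (the sole closed inhabitant up to βη is λx.x).
F0 = [0, 1]
F0to0 = [tuple(zip(F0, vals)) for vals in product(F0, repeat=2)]
identity_el = tuple((a, a) for a in F0)
definable = [f for f in F0to0 if f == identity_el]
```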
Given a language L of λ-terms of type α and a full model F, we write [[L]]^ν_F
for the set {[[M]]^ν_F | M ∈ L}. So in particular, for a λY+Ω-term M, we may
write [[LIO(M)]]^ν_F or [[LOI(M)]]^ν_F.

Monotone models of λY+Ω-calculus. Given two complete lattices L1 and L2, we
write L3 = Mon[L1 → L2] for the lattice of monotone functions from L1 to
L2 ordered pointwise; f is monotone if a ≤1 b implies f(a) ≤2 f(b), and, given
f and g in L3, f ≤3 g whenever for every a in L1, f(a) ≤2 g(a). Among the
functions in Mon[L1 → L2], of special interest are the step functions, which are
functions a ↦ f determined from elements a in L1 and f in L2, and are defined
so that (a ↦ f)(b) is equal to f when a ≤1 b and to ⊥2 otherwise. A monotone
model M is a collection of finite lattices indexed by types, (Mα)α∈type, where
Mα→β = Mon[Mα → Mβ] (we write ⊥α and ⊤α respectively for the least and
greatest elements of Mα). The notion of valuation on monotone models is similar
to the one on full models and we use the same notation. Terms are interpreted in
monotone models according to the following scheme: [[xα]]^ν_M = ν(xα), [[MN]]^ν_M =
[[M]]^ν_M([[N]]^ν_M), for a in Mα, [[λxα.M]]^ν_M(a) = [[M]]^{ν[xα←a]}_M, [[Ωα]]^ν_M = ⊥α,
[[M + N]]^ν_M = [[M]]^ν_M ∨ [[N]]^ν_M, and, for every a in Mα→α, [[Y]]^ν_M(a) = ⋁{aⁿ(⊥α) | n ∈ N}.
The following theorem collects well-known results on monotone models (see [3]):

Theorem 5. Given two λY+Ω-terms M and N of type α:

1. if M =βδ N then for every monotone model M, [[M]]^ν_M = [[N]]^ν_M,
2. [[M]]^ν_M = ⋁{[[Q]]^ν_M | Q is a finite approximation of M},
3. if M →*βδ+ N then [[N]]^ν_M ≤ [[M]]^ν_M,
4. if N ⊑ M then [[N]]^ν_M ≤ [[M]]^ν_M.
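The interpretation [[Y]](a) = ⋁{aⁿ(⊥α) | n ∈ N} is computable in any finite lattice by plain Kleene iteration. The following sketch (our illustration, with ad hoc names) exercises it on the powerset lattice of a small set.

```python
# Kleene iteration: starting from bottom, apply the monotone step until a
# fixpoint is reached.  In a finite lattice the ascending chain
# ⊥ ≤ a(⊥) ≤ a²(⊥) ≤ ... must stabilize, and its limit is the least
# fixpoint, i.e. the value [[Y]](a).
def lfp(step, bottom):
    x = bottom
    while True:
        nxt = step(x)
        if nxt == x:
            return x
        x = nxt

# A monotone function on the powerset lattice of {1, 2}, ordered by ⊆:
# always add 1, and add 2 once 1 is present.
def step(s):
    out = set(s) | {1}
    if 1 in s:
        out.add(2)
    return frozenset(out)
```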

3 Relations between IO, OI and Full Models


In this section, we investigate some basic properties of IO and OI languages.
We will see that the class of higher-order OI languages strictly subsumes that
of higher-order IO languages. We will then see that the emptiness and mem-
bership problems for higher-order IO languages are decidable, by showing that
The IO and OI Hierarchies Revisited 343

for a higher-order grammar G, a finite full model F, and a valuation ν, the set
[[LIO(G)]]^ν_F is effectively computable. On the other hand, we show that [[L]]^ν_F is
not in general effectively computable when L is an OI language.
A simple continuation passing style (CPS) transform witnesses that:
Theorem 6 (OI subsumes IO). Given a higher-order grammar G there is a
higher-order grammar G′ so that LIO(G) = LOI(G′).

The CPS transform naturally makes the order of G′ be the order of G plus 2.
We now show that for a full model F, a valuation ν and a given grammar G,
the set [[LIO(G)]]^ν_F can be effectively computed. A natural consequence of this is
that the emptiness and the membership problems for higher-order IO languages
are decidable. These results are known in the literature [24,25,26]; nevertheless,
we include them here so as to emphasize that they are related to the effectivity
of the set [[LIO(G)]]^ν_F, a property that, as we will see later, does not hold in the
case of OI languages.

Theorem 7 (Effective finite interpretation of IO). Given a higher-order
macro grammar G, a full model F and a valuation ν, one can effectively construct
the set [[LIO(G)]]^ν_F.
Corollary 1. Given a higher-order macro grammar G, the problem of deciding
whether LIO (G) = ∅ is P-complete.
Corollary 2. Given a higher-order macro grammar G and a term M , it is de-
cidable whether M ∈ LIO (G).
We are now going to see that the set of closed λ-terms of a given type α is
an OI language. Combined with Theorem 4, we obtain that the set [[LOI (G)]]F
cannot be effectively computed. Moreover, Theorems 6 and 7 imply that the IO
hierarchy is strictly included in the OI hierarchy.
Theorem 8. For every type α, there is a closed λY +-term M of type α such
that LOI (M ) is the set of all closed normal λ-terms of type α.
Theorem 9 (Undecidable finite interpretation of OI). Given a higher-
order macro grammar G, a finite full model F , and f an element of F , it is
undecidable whether f ∈ [[LOI (G)]].
Proof. Direct consequence of Theorems 8 and 4. ⊓⊔
Theorem 10. The class of higher-order IO languages is strictly included in the
class of higher-order OI languages.
Proof. If there were an IO grammar that could define the set of closed terms of
type α, Theorem 7 would contradict Theorem 4. ⊓⊔
This last theorem should be contrasted with the result of Haddad [13] which
shows that OI and IO coincide for schemes. The two results do not contradict
each other as IO is not defined in the same way on schemes and on grammars.

4 Emptiness and Membership for the OI Hierarchy


In this section we prove the decidability of the emptiness and membership prob-
lems for higher-order OI languages. For this we use monotone models as approx-
imations of sets of elements of full models.

4.1 Hereditary Primality and the Observability Theorem


Theorem 9 implies that the decision techniques we used for the emptiness and
the membership problems for IO do not extend to OI. So as to show that those
problems are nevertheless decidable, we are going to prove a theorem that we call
the Observability Theorem, which allows us to observe certain semantic proper-
ties of λ-terms in the OI language of a λY +Ω-term M by means of the semantic
values of M in monotone models. For this we introduce the notion of hereditary
prime elements of a monotone model.

Definition 1. In a lattice L, an element f is prime (or ∨-prime) when for
every g1 and g2 in L, f ≤ g1 ∨ g2 implies that f ≤ g1 or f ≤ g2.
Given a monotone model M = (Mα)α∈type, for every type α we define the
sets M⁺α and M⁻α by:

1. M⁺0 and M⁻0 contain the prime elements of M0 that are different from ⊥0,
2. M⁺α→β = {(⋁F) ↦ g | F ⊆ M⁻α ∧ g ∈ M⁺β},
3. M⁻α→β = {f ↦ g | f ∈ M⁺α ∧ g ∈ M⁻β}.

A valuation ν on M is said to be hereditary prime when, for every variable xα,
ν(xα) = ⋁F for some F ⊆ M⁻α. The elements of M⁺α are called the hereditary
prime elements of Mα.
The main interest of primality lies in that, if f is prime and f ≤ [[M + N]]^ν_M,
then either f ≤ [[M]]^ν_M or f ≤ [[N]]^ν_M. The notion of hereditary primality is
simply a way of making primality compatible with all the constructs of λY+Ω-
terms. The proof of the following technical lemma, from which we derive the
Observability Theorem, is mainly based on this idea.

Lemma 1. Given a λ+Ω-term Mα, a monotone model M = (Mα)α∈type, a
hereditary prime valuation ν and a hereditary prime element f of Mα, we have
the equivalence:

f ≤ [[Mα]]^ν_M ⇔ ∃N ∈ LOI(M). f ≤ [[N]]^ν_M

Theorem 5 allows us to extend Lemma 1 to λY+Ω-terms.


Theorem 11 (Observability). Given a λY+Ω-term M, a monotone model
M = (Mα)α∈type, a hereditary prime valuation ν and a hereditary prime ele-
ment f of Mα, we have the equivalence:

f ≤ [[Mα]]^ν_M ⇔ ∃N ∈ LOI(M). f ≤ [[N]]^ν_M

Proof. Since for every α, Mα is finite, according to Theorem 5.2 (and the fact
that the set of finite approximations of M is directed for the partial order ⊑),
there is a finite approximation Q of M such that [[Q]]^ν_M = [[M]]^ν_M, and thus
f ≤ [[Qα]]^ν_M. But then Q is a λ+Ω-term, and by the previous lemma this is
equivalent to there being some N in LOI(Q) such that f ≤ [[N]]^ν_M. The conclusion
follows from the fact that obviously LOI(Q) ⊆ LOI(M). The other direction
follows from Theorem 5.3. ⊓⊔

4.2 Decidability Results


We are now going to use the Observability Theorem so as to prove the decid-
ability of both the emptiness and the membership problems for OI languages.

Decidability of emptiness. We consider the monotone model E = (Eα)α∈type so
that E0 is the lattice with two elements {⊤, ⊥}, where ⊥ ≤ ⊤. We then define,
for every α, the element eα of E⁺α ∩ E⁻α by: e0 = ⊤, and eα→β = eα ↦ eβ.
We let ξ be the valuation so that for each variable xα, ξ(xα) = eα. A simple
induction gives the following lemma which, combined with Theorem 11, gives
Proposition 1 and finally Theorem 12:
Lemma 2. For every λ-term M of type γ, eγ ≤ [[M]]^ξ_E.
Proposition 1. Given a λY+Ω-term M of type α, we have that

LOI(M) ≠ ∅ ⇔ eα ≤ [[M]]^ξ_E.

Proof. If LOI(M) ≠ ∅, then there is N in normal form so that M →*βδ+ N.
Lemma 2 implies that eα ≤ [[N]]^ξ_E and thus, using Theorem 5, eα ≤ [[M]]^ξ_E. If
eα ≤ [[M]]^ξ_E, since eα is in E⁺α, from Theorem 11 there is N in LOI(M) so that
eα ≤ [[N]]^ξ_E; so in particular LOI(M) ≠ ∅. ⊓⊔
Theorem 12. The emptiness problem for OI languages is decidable.
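At order 0, Proposition 1 specializes to the familiar emptiness test for context-free grammars: interpret every terminal as ⊤, + as join, and compute the least fixpoint over the rules. The sketch below (our illustration; the grammar encoding is ad hoc, not from the paper) computes the "productive" nonterminals this way.

```python
def nonempty(rules, start):
    """rules: dict nonterminal -> list of right-hand sides (lists of
    symbols).  A symbol counts as productive if it is a terminal (not in
    rules) or an already-productive nonterminal; L(start) is nonempty iff
    the least fixpoint assigns ⊤ (i.e. 'productive') to start."""
    productive = set()
    changed = True
    while changed:
        changed = False
        for nt, rhss in rules.items():
            if nt in productive:
                continue
            if any(all(s not in rules or s in productive for s in rhs)
                   for rhs in rhss):
                productive.add(nt)
                changed = True
    return start in productive

g_empty = {'S': [['A', 'b']], 'A': [['A']]}           # A never terminates
g_full = {'S': [['A', 'b']], 'A': [['a'], ['A']]}     # A derives 'a'
```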

Decidability of membership. For the decidability of the membership problem, we
are going to prove a stronger version of Statman's finite completeness theorem.
The proofs, based mostly on logical relations (see [3]), can be found in the
appendix.
Given a finite set A, we write M(A) = (Mα (A))α∈type for the monotone
model so that M0 (A) is the lattice of subsets of A ordered by inclusion. We let
⊥A,α be the least element of Mα (A).
Definition 2. Given a λ-term M of type α, a triple T = (A, ν, f), where A
is a finite set, ν is a valuation on M(A) and f is an element of Mα(A), is
characteristic of M when:

1. for every λ-term N of type α, M =β N iff f ≤ [[N]]^ν_{M(A)},
2. f is a hereditary prime element of Mα(A) and ν is a hereditary prime val-
uation.
The stronger form of Statman's finite completeness theorem is formulated as:

Theorem 13 (Monotone finite completeness). For every type α and every


pure term M , one can effectively construct a triple T that is characteristic of M .

Using the Observability Theorem as in the proof of Proposition 1 we obtain:

Theorem 14. Given a λ-term M in normal form of type α and a λY+Ω-term
N of type α, if T = (A, ν, f) is a characteristic triple of M, then f ≤ [[N]]^ν_{M(A)}
iff M ∈ LOI(N).

Theorem 15. Given a λ-term M of type α and a λY+Ω-term N of type α, it
is decidable whether M ∈ LOI(N).

5 Conclusion
We have seen how to use models of λ-calculus so as to solve algorithmic ques-
tions, namely the emptiness and membership problems, related to the classes
of higher-order IO and OI languages of λ-terms. In so doing, we have revisited
various questions related to finite models of the λ-calculus. In particular, we
have seen that hereditary prime elements, via the Observability Theorem, play
a key role in finding effective solutions for higher-order OI languages. In combi-
nation with Theorem 8, we obtain that it is decidable whether there is a term
M whose interpretation in a monotone model is greater than a given hereditary
prime element of that model, which gives a decidability result for a restricted
notion of λ-definability. This raises at least two questions: (i) what kind of prop-
erties of λ-terms can be captured with hereditary prime elements, (ii) is there
a natural extension of this notion that still defines some decidable variant of
λ-definability.
On the complexity side, we expect that, using similar techniques as in [29],
it might be possible to prove that verifying whether the value of a λY +Ω-term
is greater than a hereditary prime element of a monotone model is of the same
complexity as the emptiness and membership problems for the safe OI hierarchy
which is (n − 2)-Exptime-complete for order n-grammars (see [11], with Huet’s
convention, the order of grammars is one plus the order of their corresponding
higher-order pushdown automaton). Of course, such a high complexity makes
the decidability results we obtained of little interest for practical applications
in natural language processing. It does however underscore the need to identify
linguistically motivated generalizations which point to tractable subclasses of
OI grammars [30]. Some restricted classes of IO grammars are known to have
low complexity [15,5]. A natural move is to see whether in the OI mode of
derivation those grammars still have reasonable complexity for the emptiness
and membership problems.

References
1. Aehlig, K., de Miranda, J.G., Ong, C.-H.L.: Safety is not a restriction at level
2 for string languages. In: Sassone, V. (ed.) FOSSACS 2005. LNCS, vol. 3441,
pp. 490–504. Springer, Heidelberg (2005)
2. Aho, A.V.: Indexed grammars - an extension of context-free grammars. J.
ACM 15(4), 647–671 (1968)
3. Amadio, R.M., Curien, P.-L.: Domains and Lambda-Calculi. Cambridge Tracts in
Theoretical Computer Science. Cambridge University Press (1998)
4. Blum, W., Ong, C.-H.L.: The safe lambda calculus. Logical Methods in Computer
Science 5(1:3), 1–38 (2009)
5. Bourreau, P., Salvati, S.: A datalog recognizer for almost affine λ-cfgs. In:
Kanazawa, M., Kornai, A., Kracht, M., Seki, H. (eds.) MOL 12. LNCS, vol. 6878,
pp. 21–38. Springer, Heidelberg (2011)
6. Broadbent, C.H.: The limits of decidability for first order logic on cpda graphs. In:
STACS, pp. 589–600 (2012)
7. Damm, W.: The IO- and OI-hierarchies. Theor. Comput. Sci. 20, 95–207 (1982)
8. de Groote, P.: Towards abstract categorial grammars. In: ACL (ed.) Proceedings
39th Annual Meeting of ACL, pp. 148–155 (2001)
9. de Groote, P., Lebedeva, E.: On the dynamics of proper names. Technical report,
INRIA (2010)
10. de Groote, P., Lebedeva, E.: Presupposition accommodation as exception handling.
In: SIGDIAL, pp. 71–74. ACL (2010)
11. Engelfriet, J.: Iterated stack automata and complexity classes. Inf. Comput. 95(1),
21–75 (1991)
12. Fischer, M.J.: Grammars with macro-like productions. PhD thesis, Harvard Uni-
versity (1968)
13. Haddad, A.: IO vs OI in higher-order recursion schemes. In: FICS. EPTCS, vol. 77,
pp. 23–30 (2012)
14. Huet, G.: Résolution d’équations dans des langages d’ordre 1,2,...,ω. Thèse de doc-
torat en sciences mathématiques, Université Paris VII (1976)
15. Kanazawa, M.: Parsing and generation as datalog queries. In: Proceedings of the
45th Annual Meeting of ACL, pp. 176–183. ACL (2007)
16. Knapik, T., Niwiński, D., Urzyczyn, P.: Higher-order pushdown trees are easy. In:
Nielsen, M., Engberg, U. (eds.) FOSSACS 2002. LNCS, vol. 2303, pp. 205–222.
Springer, Heidelberg (2002)
17. Lebedeva, E.: Expressing Discourse Dynamics Through Continuations. PhD thesis,
Université de Lorraine (2012)
18. Loader, R.: The undecidability of λ-definability. In: Logic, Meaning and Compu-
tation: Essays in Memory of Alonzo Church, pp. 331–342. Kluwer (2001)
19. Montague, R.: Formal Philosophy: Selected Papers of Richard Montague. Yale
University Press, New Haven (1974)
20. Moschovakis, Y.: Sense and denotation as algorithm and value. In: Logic Collo-
quium 1990: ASL Summer Meeting in Helsinki, vol. 2, p. 210. Springer (1993)
21. Muskens, R.: Lambda Grammars and the Syntax-Semantics Interface. In: Proceed-
ings of the Thirteenth Amsterdam Colloquium, pp. 150–155 (2001)
22. Ong, C.-H.L.: On model-checking trees generated by higher-order recursion
schemes. In: LICS, pp. 81–90 (2006)
23. Parys, P.: On the significance of the collapse operation. In: LICS, pp. 521–530
(2012)

24. Salvati, S.: Recognizability in the Simply Typed Lambda-Calculus. In: Ono, H.,
Kanazawa, M., de Queiroz, R. (eds.) WoLLIC 2009. LNCS, vol. 5514, pp. 48–60.
Springer, Heidelberg (2009)
25. Salvati, S.: On the membership problem for non-linear ACGs. Journal of Logic,
Language and Information 19(2), 163–183 (2010)
26. Salvati, S., Manzonetto, G., Gehrke, M., Barendregt, H.: Loader and Urzyczyn are
logically related. In: Czumaj, A., Mehlhorn, K., Pitts, A., Wattenhofer, R. (eds.)
ICALP 2012, Part II. LNCS, vol. 7392, pp. 364–376. Springer, Heidelberg (2012)
27. Salvati, S., Walukiewicz, I.: Recursive schemes, Krivine machines, and collapsible
pushdown automata. In: Finkel, A., Leroux, J., Potapov, I. (eds.) RP 2012. LNCS,
vol. 7550, pp. 6–20. Springer, Heidelberg (2012)
28. Statman, R.: Completeness, invariance and λ-definability. Journal of Symbolic
Logic 47(1), 17–26 (1982)
29. Terui, K.: Semantic evaluation, intersection types and complexity of simply typed
lambda calculus. In: RTA, pp. 323–338 (2012)
30. van Rooij, I.: The tractable cognition thesis. Cognitive Science 32, 939–984 (2008)
Evolving Graph-Structures and Their Implicit
Computational Complexity

Daniel Leivant¹ and Jean-Yves Marion²

¹ Indiana University, USA
[email protected]
² Université de Lorraine, LORIA, France
[email protected]

Abstract. Dynamic data-structures are ubiquitous in programming,
and they make extensive use of underlying directed multi-graph structures, such
as labeled trees, DAGs, and objects. This paper adapts well-established
static analysis methods, namely data ramification and language-based
information flow security, to programs over such graph structures. Our
programs support the creation, deletion, and updates of both vertices
and edges, and are related to pointer machines. The main result states
that a function over graph-structures is computable in polynomial time
iff it is computed by a terminating program whose graph manipulation
is ramified, provided all edges that are both created and read in a loop
have the same label.

1 Introduction
The interplay of algorithms and data-structures has been central to both the-
oretical and practical facets of programming. A core method of this relation is
the organization of data-structures by underlying directed multi-graphs, such
as trees, DAGs, and objects, where each vertex points to a record. Such data
structures are often thought of as “dynamic”, because they are manipulated
by algorithms that modify the underlying graph, namely by creating, updating
and removing vertices and edges. Our imperative language is inspired by pointer
machines [6,10] and by abstract state machines [2].
In this work we propose a simple and effective static analysis method for
guaranteeing the feasible time-complexity of programs over many dynamic data-
structures. Most static analysis efforts have focused in recent years on program
termination and on safety and security. Our work is thus a contribution to an-
other strand of static analysis, namely computational complexity.
Static analysis of computational complexity is based on several methods,
classified broadly into descriptive ones (i.e. related to Finite Model Theory),
and applicative (i.e. identifying restrictions of programs and proof methods that
guarantee upper bounds on the complexity of computation). One of the most
fruitful applicative methods has been ramification, also referred to as tiering.
Initially this method was used for inductive data, such as words and natural
numbers, but lately the method has been applied to more general forms of data.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 349–360, 2013.

c Springer-Verlag Berlin Heidelberg 2013

The intuition behind ramification is that programs’ execution time depends on


the nature of information flow during execution, and that such flow can be con-
strained naturally and effectively by imposing a precedence relation regulating
the information flow, e.g. from higher tier to lower tier. Here we refer to a ram-
ification that pertains to commands of our imperative language as well as to
expressions denoting graph elements.
Our main result states that a function over graph-structures is computable in
polynomial time iff it is computed by a terminating program whose graph ma-
nipulation is ramified, provided all edges that are both created and read in a loop
have the same label. This result considerably extends previous uses of ramifica-
tion in implicit computational complexity, and this extension touches on some
of the most important aspects of programming, which have been disregarded in
previous research. A simple modification of the proof gives an analogous charac-
terization of Logspace computation over graph structures.1 We thus believe that
this work is of both theoretical and practical significance. Our results raise in-
teresting questions about relations between data ramification and typed systems
for program security [12,9], where the concept of information flow is explicit.

2 Evolving Graph-Structures
2.1 Sorted Partial Structures
The framework of sorted structures is natural for the graph-structures we wish
to consider, with vertices and data treated as distinct sorts. Data might itself be
sorted, but that does not concern us here. Recall that in a sorted structure, if V
and D are sorts, then a function f is of type V → D if its domain consists of the
structure elements of sort V, and its range of elements of sort D.
The graphs we consider are essentially deterministic transition graphs: edges
are labeled (to which we refer as “actions”), and every vertex has at most one
out-edge with a given label. Such graphs are conveniently represented by partial
functions, corresponding to actions. An edge labeled f from vertex u to vertex v
is represented by the equality f (u) = v. When u has no f-labeled out-edges, we
leave f (u) undefined. To represent function partiality in the context of structures,
we posit a special constant nil, assumed to lie outside the sorts (or, equivalently,
in a special singleton sort)². We write f : V ⇀ D, and say that f is a partial
function from sort V to sort D, if f : V → (D ∪ {nil}). We write V ⇀ D for the
type of such partial functions.

2.2 Graph Structures


We consider sorted partial structures with a distinguished sort V of vertices. To
account for the creation of new vertices, we also include a sort R of reserved-
vertices. For simplicity, and without loss of generality, we assume only one addi-
tional sort D, which we dub data. A graph vocabulary is a sorted vocabulary for
¹ We also present extensions of our language to incorporate recursion.
² This is a minor, but deliberate, departure from the usual ontological (i.e. Church-
style) typing of sorted structures.

these sorts, where we have just five sets of identifiers: V (the vertex constants),
D (the data constants), F (function-identifiers for labeled edges, of type V ⇀ V),
G (function-identifiers for data, of type V → D), and R (the relation-identifiers),
each of some type τ × · · · × τ, where each τ is V or D. As syntactic parameters
we use v ∈ V, d ∈ D, f ∈ F, g ∈ G, and R ∈ R.
Given a sorted vocabulary Σ as above, a Σ-structure S consists of a vertex-
universe V^S which is finite, a potentially infinite reserve-universe R^S, a data-
universe D^S, a distinct object ⊥ to interpret nil, and a sort-correct interpretation
A^S for each Σ-identifier A: v^S ∈ V^S; d^S ∈ D^S; g^S ∈ [V^S → D^S] (a data-
function); f^S ∈ [V^S ⇀ V^S] (a partial function), and, for a relation-identifier R, a
relation R^S ⊆ τ^S × · · · × τ^S. Note that we do not refer to functions over data, nor
to functions of arity > 1. Also, the fact that our graphs are edge-deterministic
is reflected in our representation of edges by functions.
Our graph structures bear similarity to the struct construct in the pro-
gramming language C, and to objects (without behaviors or methods): a vertex
identifies an object, and the state of that object is given by fields that are spec-
ified by the unary function identifiers. This is why Tarjan, in defining similar
structures [11], talks about records and items rather than vertices and edges. The
restriction of a graph-structure S to the sort V of vertices can be construed as a
labeled directed multi-graph, in which there is an edge labeled by f from vertex
u to vertex v exactly when fS (u) = v. Thus the fan-out of each graph is bounded
by the number of edge-labels in Σ. Examples of graph structures abound, see
examples in Section 5. Linked-lists of data is an obvious one, of which words
(represented as linked lists of alphabet-symbols) form a special case.

2.3 Expressions
Expressions are generated from a set X of vertex-variables, a set Y of data-
variables, and the vocabulary identifiers, as follows. Equality here does not
conform strictly to the sort discipline, in that we allow equations of the form
V = nil.³
V ∈ VExpr ::= X | nil | v | f (V ) where X ∈ X
D ∈ DExpr ::= Y | d | g(V ) where Y ∈ Y
B ∈ BExpr ::= V = V | D = D | ¬(B) | R(E1 . . . En ) where R : τ n , Ei : τ

3 Imperative Programs over Graph-Structures


3.1 Programs
We refer to a skeletal imperative language, which supports pointers:

P ∈ Prg ::= X:=V | Y :=D | f (X):=V | g(X):=D | New(X)


| skip | P ; P | if (B) {P } {P } | while (B) {P }
³ Negation is useful, however, in defining commands' semantics and dispensing with
truth constants.

The boolean expression B of conditional and iterative commands is said to be


their guard. We posit that each program is given with a finite set X0 ⊂ X of
input variables.

3.2 Example: Tarjan’s Union Algorithm


The following graph-algorithm is due to Tarjan [11, p.21]. It refers to a represen-
tation of sets by linked-lists, whose initial vertex also serves as a name for the
set. The linked-list is represented by a partial-function next, and the function
parent maps each node to the head of its linked-list. The algorithm generates,
given as input two lists r, q representing disjoint sets, a list representing their
union. It successively inserts right after r’s head the entries of q; thus r is main-
tained as the name of the union.
while (q ≠ nil) { save := next(q);
  parent(q) := r;      % parent and next are modified
  next(q) := next(r);
  next(r) := q;
  q := save }
We shall see that our tiering method admits the program above.
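For concreteness, here is a direct transcription of this program into Python (ours, not from the paper): nil is modeled as None, and the next/parent partial functions as dictionaries.

```python
def union(r, q, nxt, parent):
    """Splice list q into list r right after r's head; r names the union.
    nxt and parent are dicts playing the roles of the next and parent
    edge functions; a missing key models an undefined (nil) edge."""
    while q is not None:
        save = nxt.get(q)
        parent[q] = r          # parent and next are modified
        nxt[q] = nxt.get(r)
        nxt[r] = q
        q = save
    return r
```

Each iteration moves one entry of q directly behind r's head, so the loop runs once per element of q, matching the linear behavior of Tarjan's original.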

3.3 Evolving Structures


In defining the semantics of an “uninterpreted” imperative program one refers to
structures for that program’s vocabulary, augmented with a store (i.e. environ-
ment, valuation). For programs over a Σ-structure S, a store consists then of a
function μ = μX ∪ μY , where μX : X → V S ∪ {⊥}, and μY : Y → DS . A
Σ-configuration is a pair consisting of a Σ-structure S and a store μ. We chose the
phrase configuration to stress their dynamic nature, as computation stages of an
evolving structure.
Commands of imperative languages are interpreted semantically as partial-
functions that map an initial configuration to a final one. Here we have two com-
mands whose intended semantic interpretation is to modify the structure itself. A
command New(X) modifies the sorts, by moving an element of RS to V S , and up-
dating the store to have X point to the new element. We write (S, μ)[νX] for the
resulting configuration. Thus X points to a fresh vertex. More importantly, a com-
mand of the form f (X):=V modifies the semantics of the partial-function fS , in that
for a = μ(X) and w the value of V in (S, μ) (defined formally below), an f-edge a → u
is replaced by a → w. We write (S, μ)[f(X) ← w] for the resulting interpretation.
Our structure updates are obviously related to Gurevich’s abstract sequential
machines (ASM) [2]. Gurevich divides identifiers into two classes: static iden-
tifiers, whose interpretation remains constant, and dynamic identifiers, whose
interpretation may evolve during computation. Here the only dynamic identi-
fiers are the edge functions. An ASM computation progresses through “states”,
where every state is a structure. In contrast, we refer to configurations, because
program variables play a central role in our imperative programs. Thus, the
execution of our programs progresses through configurations.

Our programming language is related, more broadly, to pointer machines. Tar-


jan [11] defined a pure reference machine, consisting of an expandable collection
of records and a finite number of registers. Pure reference machines are easily
simulated by our programs, and vice versa.
Our programs are also related to Schönhage machines [10]. Each such machine
consists of a finite control program combined with a dynamic structure (which
is essentially the same as our graph-structures). Schönhage machines are an
extension of Kolmogorov-Uspensky machines [6]. Another source of inspiration
is the work of Jones & als. on blob model of computations [3].

3.4 Semantics of Expressions


We give next the evaluation rules for Σ-expressions E in a Σ-configuration (S, μ),
writing S, μ ⊢ E ⇒e a to indicate that E evaluates to element a of S.⁴

– for b ∈ V ∪ D: S, μ ⊢ b ⇒e b^S
– for Z ∈ X ∪ Y: S, μ ⊢ Z ⇒e μ(Z)
– for h ∈ F ∪ G: if S, μ ⊢ E ⇒e a then S, μ ⊢ h(E) ⇒e h^S(a)
– if S, μ ⊢ Ei ⇒e ai for each i and ⟨a1, . . . , an⟩ ∈ R^S then S, μ ⊢ R(E1, . . . , En)
– if S, μ ⊢ Ei ⇒e ai for each i and ⟨a1, . . . , an⟩ ∉ R^S then S, μ ⊢ ¬R(E1, . . . , En)

3.5 Semantics of Programs


The semantics of programs is defined below:

– if S, μ ⊢ E ⇒e a then S, μ ⊢ Z:=E ⇒s S, μ[Z ← a] ⊢ skip
– S, μ ⊢ New(X) ⇒s (S, μ)[νX] ⊢ skip
– if S, μ ⊢ X ⇒e a and S, μ ⊢ V ⇒e b then S, μ ⊢ f(X):=V ⇒s (S, μ)[f(a) ← b] ⊢ skip
– if S, μ ⊢ P1 ⇒s S′, μ′ ⊢ P1′ then S, μ ⊢ P1; P2 ⇒s S′, μ′ ⊢ P1′; P2
– if S, μ ⊢ P1 ⇒s S′, μ′ ⊢ skip then S, μ ⊢ P1; P2 ⇒s S′, μ′ ⊢ P2
– if S, μ ⊢ B then S, μ ⊢ if (B){P0}{P1} ⇒s S, μ ⊢ P0
– if S, μ ⊢ ¬B then S, μ ⊢ if (B){P0}{P1} ⇒s S, μ ⊢ P1
– if S, μ ⊢ B then S, μ ⊢ while(B){P} ⇒s S, μ ⊢ P; while(B){P}
– if S, μ ⊢ ¬B then S, μ ⊢ while(B){P} ⇒s S, μ ⊢ skip

The phrase S, μ ⊢ P ⇒s S′, μ′ ⊢ P′ conveys that evaluating a program
P starting with configuration (S, μ) is reduced to evaluating P′ in configuration
(S′, μ′); i.e., P reduces to P′ while updating (S, μ) to (S′, μ′).
An initial configuration is a configuration (S, μ) where μ(X) = nil for every
non-input variable X. A program P computes the partial function [[P]], with
initial configurations as input, defined by: [[P]](S, μ) = (T, ξ) iff S, μ ⊢ P (⇒s)* T, ξ ⊢ skip.
⁴ Here we consider equality as just another relation.
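These rules can be animated by a small interpreter. The sketch below is our own encoding, not from the paper: commands and expressions are ad hoc tuples, nil is None, the structure S maps each edge label to a dict, and the reserve-universe is modeled by a fresh-vertex counter. It implements a representative fragment (assignment, edge update, New, sequencing, branching, loops).

```python
import itertools

def ev(V, S, mu):
    """Evaluate a vertex-expression: ('nil',), ('var', X), or ('app', f, V)."""
    if V == ('nil',):
        return None
    if V[0] == 'var':
        return mu[V[1]]
    return S[V[1]].get(ev(V[2], S, mu))   # partial function: missing edge -> None

def bev(B, S, mu):
    """Evaluate a guard: ('eq', V, V') or ('neq', V, V')."""
    l, r = ev(B[1], S, mu), ev(B[2], S, mu)
    return l == r if B[0] == 'eq' else l != r

def run(P, S, mu, fresh=None):
    """Execute command P, updating the structure S and store mu in place."""
    fresh = fresh or itertools.count(1000)    # models the reserve-universe R
    t = P[0]
    if t == 'skip':
        return
    if t == 'seq':
        run(P[1], S, mu, fresh)
        run(P[2], S, mu, fresh)
    elif t == 'set':                          # X := V
        mu[P[1]] = ev(P[2], S, mu)
    elif t == 'setedge':                      # f(X) := V
        S[P[1]][mu[P[2]]] = ev(P[3], S, mu)
    elif t == 'new':                          # New(X): bring in a fresh vertex
        mu[P[1]] = next(fresh)
    elif t == 'if':
        run(P[2] if bev(P[1], S, mu) else P[3], S, mu, fresh)
    elif t == 'while':
        while bev(P[1], S, mu):
            run(P[2], S, mu, fresh)

# while (f(X) ≠ nil) { X := f(X) }  -- walk to the end of an f-chain
walk = ('while', ('neq', ('app', 'f', ('var', 'X')), ('nil',)),
        ('set', 'X', ('app', 'f', ('var', 'X'))))
```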

3.6 Run-Time
We say that a program P runs in time t on input (S, μ), and write TimeP(S, μ) =
t, when S, μ ⊢ P (⇒s)^t T, ξ ⊢ skip for some (T, ξ). We take the size |S, μ| of
a configuration (S, μ) to be the number n of elements in the vertex-universe V.
Since the number of edges is bounded by n², we disregard them here. We also
disregard the size of the data-universe, because our programs do not modify the
data present in records. A program P runs in polynomial time if there is
a k > 0 such that TimeP(S, μ) ≤ k · |S, μ|^k for all configurations (S, μ).

4 Ramifiable Programs
4.1 Tiering
Program tiering, also referred to as ramification, has been introduced in [7] and
used in restricted form already in [1]. It serves to syntactically control the run-
time of programs. Here we adapt tiering to graph-structures. The main challenge
here is the evolution of structures in course of computation. To address it, we
consider a finite lattice T = (T, ≼, 0, ∨, ∧), and refer to the elements of T as tiers.
However, in order to simplify soundness proofs, and without loss of generality,
we will focus on the boolean lattice T = ({0, 1}, ≤, 0, ∨, ∧). We use lower-case
Greek letters α, β as discourse parameters for tiers.
Given T, we consider T-environments (Γ, Δ). Here Γ assigns a tier to each
vertex-variable, whereas Δ assigns to each function identifier f : V ⇀ V one or
several expressions of the form α → β, so that either
1. all types in Δ(f ) are of the form α → α, in which case we say that f is stable
in the environment; or
2. all types in Δ(f ) are of the form α → β, with β ≺ α, and we say that f is
reducing in the environment.
A tiering assertion is a phrase of the form Γ, Δ ⊢ V : α, where V is a vertex-
expression and (Γ, Δ) a T-environment. The correct tiering assertions are gen-
erated by the tiering system in Figure 1.
4.2 Ramifiable Programs

Given a lattice T, a program P is T-ramifiable if there is a T-environment (Γ, Δ) for which Γ, Δ ⊢ P : α for some α, and such that Γ(X) = 1 for every input variable X ∈ X0.⁵ Thus, ramifiable programs can be construed as programs decorated with tiering information.

Lemma 1 (Subject Reduction). If S, μ ⊢ P =⇒ S′, μ′ ⊢ P′ and Γ, Δ ⊢ P : α, then Γ, Δ ⊢ P′ : α.

Lemma 2 (Type Inference). The problem of deciding, given a program P and a lattice T, whether P is T-ramifiable, is decidable in polynomial time.

⁵ Recall that each program is assumed given with a set X0 of input variables.
Fig. 1. Tiering rules for vertex and boolean expressions:

– Γ, Δ ⊢ c : α
– Γ, Δ ⊢ X : α, provided Γ(X) = α
– Γ, Δ ⊢ f(V) : β, provided α → β ∈ Δ(f) and Γ, Δ ⊢ V : α
– Γ, Δ ⊢ R(V1, . . . , Vn) : α, provided Γ, Δ ⊢ Vi : α for all i
– Γ, Δ ⊢ V0 = V1 : α, provided Γ, Δ ⊢ Vi : α for i = 0, 1

Fig. 2. Tiering rules for programs:

– Γ, Δ ⊢ X := V : α, provided Γ, Δ ⊢ X : α and Γ, Δ ⊢ V : α
– Γ, Δ ⊢ f(X) := V : α, provided Γ, Δ ⊢ f(X) : α and Γ, Δ ⊢ V : α
– Γ, Δ ⊢ New(X) : 0, provided Γ, Δ ⊢ X : 0
– Γ, Δ ⊢ while(B){P} : α, provided Γ, Δ ⊢ B : α, Γ, Δ ⊢ P : α, and 0 ≺ α
– Γ, Δ ⊢ skip : 0
– Γ, Δ ⊢ P; P′ : α ∨ β, provided Γ, Δ ⊢ P : α and Γ, Δ ⊢ P′ : β
– Γ, Δ ⊢ if(B){P0}{P1} : α, provided Γ, Δ ⊢ B : α and Γ, Δ ⊢ Pi : α for i = 0, 1
– Γ, Δ ⊢ P : α, provided Γ, Δ ⊢ P : β and β ≼ α

Proof. We associate with each vertex-variable X a "tier-variable" αX, and with each function f ∈ F two variables αf and βf, with the intent that αf → βf is a possible tiering of f. The typing rules for tiers give rise to a set of linear constraints on these tier-variables, a problem that is decidable in polynomial time.
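To illustrate the constraint view (a sketch of our own, not the authors' algorithm): over the boolean lattice the constraints can be taken to be inequalities tier(x) ≥ tier(y) plus pinned values, and solved by least-fixpoint propagation. The encoding and the name solve_tiers are purely illustrative.

```python
def solve_tiers(n_vars, ge_constraints, fixed):
    """ge_constraints: pairs (x, y) meaning tier(x) >= tier(y);
    fixed: dict var -> 0 or 1 (e.g. input variables pinned to 1).
    Returns a satisfying 0/1 assignment or None, by least-fixpoint
    propagation over the boolean lattice."""
    tier = [0] * n_vars
    for v, t in fixed.items():
        if t == 1:
            tier[v] = 1
    changed = True
    while changed:                      # at most n_vars rounds
        changed = False
        for x, y in ge_constraints:
            if tier[y] > tier[x]:       # raise tier(x) to satisfy x >= y
                tier[x] = tier[y]
                changed = True
    for v, t in fixed.items():          # variables pinned to 0 must stay 0
        if tier[v] != t:
            return None
    return tier
```

Equalities are encoded as two opposite inequalities; the whole procedure is linear in the number of constraints per round, matching the polynomial bound in the lemma.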
4.3 Stationary and Tightly-Modifying Loops

We say that a function identifier f is probed in P if it occurs in P either in some assignment X := V or in the guard of a loop or a branching command. For example, f is probed in X := f(V), as well as in if (f(X) = nil){P}{P′}. The identifier f is modified in P if it occurs in an assignment f(X) := V in P.
Fix a lattice T and a T-environment (Γ, Δ). By the tiering rules, if a loop while(B){P} is of tier α then Γ, Δ ⊢ B : α. We say that the loop is stationary if no f ∈ F of type α → α is modified therein. The loop above is tightly-modifying if it has modified function-identifiers of type α → α, but at most one of those is also probed. In other words, all edges that are both created and read in a loop have the same label. For instance, in Example 3.2 above, next is both modified and probed, but parent is modified without being probed. Thus the loop, with its obvious tiering environment, is tightly-modifying.⁶

⁶ Note that next and parent are of type 1 → 1, and all variables are of tier 1. Set union can be iterated because the result r is of tier 1, unlike in most other works.
4.4 Main Characterization Theorem

Given a lattice T and Γ, Δ ⊢ P : α, we say that (Γ, Δ) is a tight ramification of P if Γ is an initial tiering, and each loop of P is stationary or tightly-modifying. We say that P is tightly-ramifiable if it has a tight T-ramification (Γ, Δ), with Γ initial, for some non-trivial T.

Theorem 1. A function over graph-structures is computable in polynomial time iff it is computed by a terminating and tightly-ramifiable program.

The Theorem follows from the Soundness Lemma 6 and the Completeness Proposition 1 below.
5 Examples of Ramified Programs

Tree insertion. The program below inserts the tree T into the binary search tree whose root is pointed-to by x. The input variables are x and T.
if (x1 = nil)
  {x1 := T : 1;}                                % then clause
  {                                             % else clause
   while ((x1 ≠ nil) and (key(T1) ≠ key(x1)))
     {if (key(T1) < key(x1)) {p1 := x1; x1 := left(x1)1}
                             {p1 := x1; x1 := right(x1)1}} : 1;
   if (key(T1) < key(p1)) {left(p1) := T1 : 1}
                          {right(p1) := T1 : 1}}

Note that neither left nor right is modified in the loop, so the loop is stationary.
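For comparison, the same insertion in ordinary imperative style; this Python sketch is purely illustrative and ignores tiers (Node, insert, and the parent variable p are our own names mirroring the program above).

```python
class Node:
    """A binary-search-tree node; left/right mirror the paper's edge labels."""
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(x, t):
    """Insert node t into the tree rooted at x; returns the root.
    p tracks the parent, as the variable p does in the tiered program."""
    if x is None:
        return t
    root, p = x, None
    while x is not None and t.key != x.key:
        p, x = x, (x.left if t.key < x.key else x.right)
    if x is None:                      # key absent: hang t under p
        if t.key < p.key:
            p.left = t
        else:
            p.right = t
    return root
```

If the key is already present, the loop exits early and the tree is left unchanged, just as the guard key(T) ≠ key(x) ends the tiered loop.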
Copying lists. Here we use New to copy a list, where the copy is in reverse
order. Note that the source list is of tier 1 while the copy is of tier 0.
y0 := nil : 0;
while (x1 ≠ nil)
  {z0 := y0 : 0; New(y0); suc(y0) := z0 : 0;
   x1 := suc(x1) : 1} : 1

The loop is stationary, because the updated occurrence of suc is of type 0 → 0.
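A direct transcription of this copying loop (an illustrative Python sketch; Cell, copy_list, and length are our own names):

```python
class Cell:
    """A list record; suc plays the role of the paper's suc edge."""
    def __init__(self):
        self.suc = None

def copy_list(x):
    """Return a fresh list with as many cells as the one headed by x,
    built in reverse traversal order: z := y; New(y); suc(y) := z."""
    y = None
    while x is not None:
        z = y
        y = Cell()      # the New command: a brand-new tier-0 vertex
        y.suc = z
        x = x.suc
    return y

def length(c):
    """Walk a list and count its cells."""
    n = 0
    while c is not None:
        n, c = n + 1, c.suc
    return n
```

Note that the copy shares no cells with the source: every cell of the result is freshly created at tier 0.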
6 Soundness of Programs for Feasibility: Run-Time Analysis
We show next that every tightly-ramified program computes a PTime function over configurations. The proof is based on the following observations, which we
articulate more precisely below. First, if we start with a configuration where no
vertex is assigned to variables of different tiers, then all configurations obtained in
the course of computation have that property. We are thus assured that vertices
can be ramified unambiguously.
The tiering rules imply that a program P of tier 0 cannot have loops, and is therefore evaluated in ≤ |P| steps. At the same time, the value of a variable
of tier 1 depends only on vertices of tier 1. This implies, as we shall see, that
the number of iterations of a given loop must be bounded by the number of
possible configurations that may be generated by its body. Our restriction to
tightly-modifying ramification guarantees that the number is polynomial.

6.1 Non-interference
Lemma 3 (Confinement). Let (Γ, Δ) be an environment. If Γ, Δ ⊢ P : 0, then Γ(X) = 0 for every variable X assigned-to in P.
The proof is a straightforward structural induction. Note also that a program P
of tier 0 cannot have a loop, and is thus evaluated within |P | steps.
We say that a vertex-tiering Γ is compatible with a store μ if Γ(X) ≠ Γ(X′) implies μ(X) ≠ μ(X′) for all X, X′ ∈ X. We say that Γ is an initial tiering if Γ(X) is 1 for X initial (i.e. X ∈ X0), and 0 otherwise. Thus an initial tiering is always compatible with an initial store.
Lemma 4. Suppose that Γ, Δ ⊢ P : α and S, μ ⊢ P =⇒ S′, μ′ ⊢ P′. If μ is compatible with Γ then so is μ′.

The proof is straightforward by structural induction on P.
We show next that tiering, when compatible with the initial configuration,
guarantees the non-interference of lower-tiered values in the run-time of higher-
tiered programs. A similar effect of tiering, albeit simpler, was observed already
in [7]. This is also similar to the security-related properties considered in [12].
Non-interference can also be rendered algebraically, as in [8].
The (Γ, Δ)-collapse of a configuration (S, μ) is the configuration (S^Δ, μ^Γ), where μ^Γ(X) = μ(X) if Γ(X) = 1, and μ^Γ(X) is undefined otherwise; whereas S^Δ is the structure identical to S except that each f for which (1 → 1) ∉ Δ(f) is interpreted as ∅. Thus (S^Δ, μ^Γ) disregards vertices that are not reachable from some variable of tier 1 using edges of type (1 → 1).
The next lemma states that a program's output vertices in tier 1 do not depend on vertices in tier 0, nor on edges that do not have tier 1 → 1.
Lemma 5. Suppose Γ, Δ ⊢ P : α and S, μ ⊢ P =⇒ S′, μ′ ⊢ P′. Then there is a configuration (S″, μ″) such that S^Δ, μ^Γ ⊢ P =⇒ S″, μ″ ⊢ P′ and (S′^Δ, μ′^Γ) = (S″^Δ, μ″^Γ).

The proof is straightforward by structural induction on programs.
6.2 Polynomial Bounds

Lemma 6 (Soundness). Assume that Γ, Δ ⊢ P : α, where P is tightly-modifying. There is a k > 0 such that for every graph-structure S and every store μ compatible with Γ, if S, μ ⊢ P (=⇒)∗ S′, μ′ ⊢ P′, then S, μ ⊢ P (=⇒)^t S′, μ′ ⊢ P′ for some t < k + |S|^k.
In proving Lemma 6 we will use the following combinatorial observation. Let G be a digraph of out-degree 1. We say that a set of vertices C generates G if every vertex in G is reachable by a path starting at C. The following lemma provides a polynomial, albeit crude, upper bound on the number of digraphs with k generators.

Lemma 7. The number (up to isomorphism) of digraphs with n vertices and a generator of size k is ≤ n^{2k²}.
Proof. A connected digraph of out-degree 1 must consist of a loop of vertices, with incoming linear spikes. If there are just k generators, then there are at most k such spikes. There are at most k entry points on the loop to choose for these spikes, and each spike is of size ≤ n. So there are at most n^k × n^k non-isomorphic connected graphs with a generator of size k.
Also, with only k generating vertices we can have at most k connected components. In sum there are at most (n^{2k})^k = n^{2k²} non-isomorphic graphs of size n with k generators.
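The bound of Lemma 7 can be sanity-checked by brute force in the single-generator case k = 1: a digraph of out-degree 1 generated by one vertex is a tail feeding a cycle, so the pair (tail length, cycle length) is a canonical form. The script below is our own illustrative check; it enumerates all such graphs on at most n vertices and confirms the n^{2k²} bound.

```python
from itertools import product

def reachable_shapes(n):
    """Enumerate, up to isomorphism, all out-degree-1 digraphs that are
    generated (fully reachable) from a single vertex, arising from
    functions on an n-element vertex set. Each such graph is a tail of
    length t feeding a cycle of length c, so (t, c) is a canonical form."""
    shapes = set()
    for f in product(range(n), repeat=n):
        seen, v = {}, 0
        while v not in seen:        # walk from the generator until we repeat
            seen[v] = len(seen)
            v = f[v]
        tail = seen[v]              # steps taken before entering the cycle
        cycle = len(seen) - tail
        shapes.add((tail, cycle))
    return shapes
```

For n = 4 there are exactly the pairs (t, c) with t + c ≤ 4 and c ≥ 1, i.e. 10 shapes, comfortably below the crude bound n^{2k²} = 16.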
Proof of Lemma 6. We proceed by structural induction on P. The only non-trivial observations are as follows. For program composition, we use the Compatibility Lemma 4.
The crucial case is, of course, where P is of the form while(B)Q. Say X1, . . . , Xm are the vertex-variables in B. The tiering rules require that B, and therefore X1, . . . , Xm, are all of tier 1.
If Q updates only edges that are not probed in P (including the guard B), then neither the execution of Q nor the evaluation of B is affected; that is, all configurations in the computation have the same vertices of tier 1, with no change in edges that affect the execution of P, by Lemma 5. Thus the truth of B in each invocation of Q is determined by the combination of values assigned to X1, . . . , Xm, while the structural changes caused by Q do not affect the execution of Q in subsequent invocations. Since we assume that P terminates, it follows that the combinations of values for X1, . . . , Xm must all be different. If n is the number of vertices of tier 1, then n ≤ |S|, and there are n^m such combinations. By the induction hypothesis Q terminates in PTime, and therefore so does P.
Suppose Q does update edges that are probed in P. Since P is assumed tightly-modifying, all such updates are for the same f ∈ F. Let C be the set of initial values of variables occurring in P (including X1, . . . , Xm and possibly others). Let U be the set of vertices reachable from C by some path of tier 1 (i.e., using edges labeled by some h ∈ F that is assigned 1 → 1 by Δ). By Lemma 5, the execution of P, including all iterated invocations of B and Q, has only vertices in U as values of tier 1. Moreover, U is generated by C, whose size is fixed by the syntax of P. It follows, by Lemma 7, that the number of such configurations is polynomial in the size of U, which in turn is bounded by the size |S| of the vertex-universe of the structure S. □
7 Completeness of Tightly-Ramifiable Programs for Feasibility

Proposition 1. Every polynomial-time function on graph-structures is computable by a terminating and tightly-ramifiable program.
Proof. Suppose f is a unary function on graph-structures, which is computed by a Turing machine M over alphabet Σ, modulo some canonical coding of graph-structures by strings. Assume that M operates in time k · n^k. We posit that M uses a read-only input tape and a work-tape. We simulate M by a tightly-ramifiable program Γ, Δ ⊢ P : 1 over graph-structures, as follows. The structures considered simulate each of the two tapes by a doubly-linked list of records, with the three fields left, right, and val, returning two pointers and an alphabet letter, respectively. We take our data constants to include each σ ∈ Σ. The input-tape is assigned tier 1, and the work-tape tier 0. Initially the work-tape is empty, and new cells are progressively created using the command New at tier 0. The machine's yield-relation between configurations is simulated at tier 0 by nested conditionals. Finally, we include in our simulation a clock, consisting of k nested loops, as in the following nesting of two loops:
u1 := head1 : 1;                      % head is a pointer to the input tape
while (u1 ≠ nil)
  {v1 := head1 : 1;
   while (v1 ≠ nil)
     {v1 := suc(v1) : 1;
      transition function is here : 0} : 1;
   u1 := suc(u1) : 1} : 1
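In ordinary code, the two nested tier-1 loops drive the tier-0 transition exactly n² times on an input list of length n (an illustrative Python sketch; run_with_clock is our own name):

```python
def run_with_clock(n, transition):
    """Apply `transition` exactly n**2 times: the two nested loops walk
    the length-n input list with cursors u and v, and the tier-0 body
    simulates one step of the machine per inner iteration."""
    steps = 0
    for u in range(n):          # while (u != nil) { ...; u := suc(u) }
        for v in range(n):      # while (v != nil) { ...; v := suc(v) }
            transition()        # one simulated machine step, at tier 0
            steps += 1
    return steps
```

With k nested loops instead of two, the clock yields n^k steps, enough to exhaust the k · n^k time bound of the simulated machine.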

8 Characterization of Log-Space Languages

Hofmann and Schöpp [4] introduced pure pointer programs based on a uniform iterator forall, and related them to computation in logarithmic space. Our programs differ from these in supporting modification of the structure, in the guises of vertex creation and edge displacement and deletion. Moreover, our programs are based on a looping construct that treats each vertex of the structure individually.
These differences notwithstanding, our characterization of PTime can be modified to a characterization of log-space, by restricting the syntax of ramifiable programs. Say that a program P over graph-structures is a jumping-program if
it uses no edge update. Jones [5] showed that a simple cons-free imperative pro-
gramming language While\Ro accepts precisely the languages decidable by Tur-
ing machines in logarithmic space. Since our jumping-programs can be rephrased
in While\Ro they too accept only log-space languages (where input strings are
represented as linked lists). Conversely, the store used by a jumping-program is
essentially of size k · log(n) + log(Q), where n is the size of the vertex-universe,
Q the size of the data-universe, and k the number of variables. Consequently,
we have:
Theorem 2. A language is accepted by a jumping-program iff it is decidable in Logspace.

9 Adding Recursion

It is not hard to augment our programming language with recursion. Here is a procedure that recursively searches for a path from vertex v to w.⁷
Proc search(v1, w1)
  {if (v = w)1 return true : 1;
   visited(v) := true : 1;
   forall t1 in AdjList(v)                      % list of adjacent nodes of v
     {if (visited(t)1 = false)
        if (search(t, w)1 = true) return true : 1;}
   return false : 1;}
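A runnable rendering of the procedure (illustrative Python; the adjacency dict stands in for AdjList):

```python
def search(adj, v, w, visited=None):
    """Depth-first search for a path from v to w. The set `visited`
    plays the role of the function identifier visited(.), the only
    identifier that is both modified and probed."""
    if visited is None:
        visited = set()
    if v == w:
        return True
    visited.add(v)
    for t in adj.get(v, []):            # forall t in AdjList(v)
        if t not in visited and search(adj, t, w, visited):
            return True
    return False
```

Marking vertices in `visited` before recursing guarantees each vertex is expanded at most once, so the recursion depth and total work stay linear in the graph.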
A restricted form of recursion is linear recursion, where at most one recursive
call is allowed in the definition of a recursive procedure. Moreover, we suppose
that each function body is stationary or tightly-modifying.
Theorem 3. On its domain of computation, a tightly-ramifiable program with linear recursive calls runs in polynomial time.

References
1. Bellantoni, S., Cook, S.A.: A new recursion-theoretic characterization of the poly-
time functions. Computational Complexity 2, 97–110 (1992)
2. Gurevich, Y.: Sequential abstract state machines capture sequential algorithms.
ACM Transactions on Computational Logic 1(1), 77–111 (2000)
3. Hartmann, L., Jones, N.D., Simonsen, J.G., Vrist, S.B.: Programming in biomolec-
ular computation: Programs, self-interpretation and visualisation. Sci. Ann. Comp.
Sci. 21(1), 73–106 (2011)
4. Hofmann, M., Schöpp, U.: Pure pointer programs with iteration. ACM Trans.
Comput. Log. 11(4) (2010)
5. Jones, N.D.: Logspace and ptime characterized by programming languages. Theor.
Comput. Sci. 228(1-2), 151–174 (1999)
6. Kolmogorov, A.N., Uspensky, V.: On the definition of an algorithm. Uspekhi Mat. Nauk 13(4) (1958)
7. Leivant, D.: Predicative recurrence and computational complexity I: Word recurrence and poly-time. In: Feasible Mathematics II. Birkhäuser Boston (1994)
8. Marion, J.-Y.: A type system for complexity flow analysis. In: LICS (2011)
9. Sabelfeld, A., Sands, D.: Declassification: dimensions and principles. J. Comput.
Secur. 17, 517–548 (2009)
10. Schönhage, A.: Storage modification machines. SIAM J. Comp. 9(3), 490–508
(1980)
11. Tarjan, R.E.: Reference machines require non-linear time to maintain disjoint sets.
In: STOC 1977, pp. 18–29. ACM (1977)
12. Volpano, D., Irvine, C., Smith, G.: A sound type system for secure flow analysis.
Journal of Computer Security 4(2/3), 167–188 (1996)

⁷ We use the construct forall X in R(u), which is "blind" in the sense that it does not depend on node ordering. As a result, no function identifier is probed except visited.
Rational Subsets and Submonoids of Wreath Products

Markus Lohrey¹, Benjamin Steinberg², and Georg Zetzsche³

¹ Universität Leipzig, Institut für Informatik
² City College of New York, Department of Mathematics
³ Technische Universität Kaiserslautern, Fachbereich Informatik

Abstract. It is shown that membership in rational subsets of wreath products H ≀ V with H a finite group and V a virtually free group is decidable. On the other hand, it is shown that there exists a fixed finitely generated submonoid in the wreath product Z ≀ Z with an undecidable membership problem.

1 Introduction
The study of algorithmic problems in group theory has a long tradition. Dehn, in his
seminal paper from 1911, introduced the word problem (Does a given word over the
generators represent the identity?), the conjugacy problem (Are two given group el-
ements conjugate?) and the isomorphism problem (Are two given finitely presented
groups isomorphic?), see [25] for general references in combinatorial group theory.
Starting with the work of Novikov and Boone from the 1950’s, all three problems were
shown to be undecidable for finitely presented groups in general. A generalization of
the word problem is the subgroup membership problem (also known as the general-
ized word problem) for finitely generated groups: Given group elements g, g1 , . . . , gn ,
does g belong to the subgroup generated by g1 , . . . , gn ? Explicitly, this problem was
introduced by Mihailova in 1958, although Nielsen had already presented in 1921 an
algorithm for the subgroup membership problem for free groups.
Motivated partly by automata theory, the subgroup membership problem was further
generalized to the rational subset membership problem. Assume that the group G is
finitely generated by the set X (where a ∈ X if and only if a−1 ∈ X). A finite au-
tomaton A with transitions labeled by elements of X defines a subset L(A) ⊆ G in
the natural way; such subsets are the rational subsets of G. The rational subset mem-
bership problem asks whether a given group element belongs to L(A) for a given finite
automaton (in fact, this problem makes sense for any finitely generated monoid). The
notion of a rational subset of a monoid can be traced back to the work of Eilenberg and
Schützenberger from 1969 [8]. Other early references are [1,11]. Rational subsets of
groups also found applications for the solution of word equations (here, quite often the
term rational constraint is used) [6,20]. In automata theory, rational subsets are tightly
related to valence automata (see [9,16,17] for details): For any group G, the empti-
ness problem for valence automata over G (which are also known as G-automata) is
decidable if and only if G has a decidable rational subset membership problem.
* This work was supported by the DAAD research project RatGroup. The second author was
partially supported by a grant from the Simons Foundation (#245268 to Benjamin Steinberg).
Omitted proofs can be found in the long version [24] of this paper.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 361–372, 2013.
© Springer-Verlag Berlin Heidelberg 2013
362 M. Lohrey, B. Steinberg, and G. Zetzsche

For free groups, Benois [2] proved that the rational subset membership problem is
decidable using a classical automaton saturation procedure (which yields a polynomial
time algorithm). For commutative groups, the rational subset membership can be solved
using integer programming. Further (un)decidability results on the rational subset mem-
bership problem can be found in [21] for right-angled Artin groups, in [28] for nilpotent
groups, and in [23] for metabelian groups. In general, groups with a decidable rational
subset membership problem seem to be rare. In [22] it was shown that if the group G
has at least two ends, then the rational subset membership problem for G is decidable
if and only if the submonoid membership problem for G (Does a given element of G
belong to a given finitely generated submonoid of G?) is decidable.
In this paper, we investigate the rational subset membership problem for wreath products. The wreath product is a fundamental operation in group theory. To define the wreath product H ≀ G of two groups G and H, one first takes the direct sum K = ⊕_{g∈G} H of copies of H, one for each element of G. An element g ∈ G acts on K by permuting the copies of H according to the left action of g on G. The corresponding semidirect product K ⋊ G is the wreath product H ≀ G.
In contrast to the word problem, decidability of the rational subset membership prob-
lem is not preserved under wreath products. For instance, in [23] it was shown that for
every non-trivial group H, the rational subset membership problem for H ≀ (Z × Z)
is undecidable. The proof uses an encoding of a tiling problem, which uses the grid
structure of the Cayley graph of Z × Z.
In this paper, we prove the following two new results concerning the rational subset
membership problem and the submonoid membership problem for wreath products:
(i) The submonoid membership problem is undecidable for Z ≀ Z. The wreath product Z ≀ Z is one of the simplest examples of a finitely generated group that is not finitely presented; see [4,5] for further results showing the importance of Z ≀ Z.
(ii) For every finite group H and every virtually free group¹ V, the group H ≀ V has a decidable rational subset membership problem; this includes, for instance, the famous lamplighter group Z2 ≀ Z.
For the proof of (i) we encode the acceptance problem for a 2-counter machine (Minsky machine [26]) into the submonoid membership problem for Z ≀ Z. One should remark that Z ≀ Z is a finitely generated metabelian group and hence has a decidable subgroup
membership problem [29,30]. For the proof of (ii), an automaton saturation procedure
is used. The termination of the process is guaranteed by a well-quasi-order (wqo) that
refines the classical subsequence wqo considered by Higman [14].
Wqo theory has also been applied successfully for the verification of infinite state
systems. This research led to the notion of well-structured transition systems [10]. Ap-
plications in formal language theory are the decidability of the membership problem
for leftist grammars [27] and Kunc’s proof of the regularity of the solutions of certain
language equations [18]. A disadvantage of using wqo theory is that the algorithms it
yields are not accompanied by complexity bounds. The membership problem for leftist
grammars [15] and, in the context of well-structured transition systems, several natu-
ral reachability problems [3,32] (e.g. for lossy channel systems) have even been shown
¹ Recall that a group is virtually free if it has a free subgroup of finite index.
not to be primitive recursive. The complexity status of the rational subset membership problem for wreath products H ≀ V (H finite, V virtually free) thus remains open. Actually, we do not even know whether the rational subset membership problem for the lamplighter group Z2 ≀ Z is primitive recursive.

2 Rational Subsets of Groups

Let G be a finitely generated group and X a finite symmetric generating set for G (symmetric means that x ∈ X ⇔ x−1 ∈ X). For a subset B ⊆ G we denote with B∗ (resp. ⟨B⟩) the submonoid (resp. subgroup) of G generated by B. The set of rational subsets of
G is the smallest set that contains all finite subsets of G and that is closed under union,
product, and ∗ . Alternatively, rational subsets can be represented by finite automata. Let
A = (Q, G, E, q0 , QF ) be a finite automaton, where transitions are labeled with ele-
ments of G: Q is the finite set of states, q0 ∈ Q is the initial state, QF ⊆ Q is the set of
final states, and E ⊆ Q×G×Q is a finite set of transitions. Every transition label g ∈ G
can be represented by a finite word over the generating set X. The subset L(A) ⊆ G
accepted by A consists of all group elements g1 g2 g3 · · · gn such that there exists a se-
quence of transitions (q0 , g1 , q1 ), (q1 , g2 , q2 ), (q2 , g3 , q3 ), . . . , (qn−1 , gn , qn ) ∈ E with
qn ∈ QF . The rational subset membership problem for G is the following decision
problem: Given a finite automaton A as above and an element g ∈ G, does g ∈ L(A)
hold? Since g ∈ L(A) if and only if 1G ∈ L(A)g −1 , and L(A)g −1 is rational, too, the
rational subset membership problem for G is equivalent to the question whether a given
automaton accepts the group identity.
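For free groups, the saturation procedure of Benois cited in the introduction can be sketched directly. The sketch below is our own illustration (the function name, the letter encoding with 'A', 'B' for a⁻¹, b⁻¹, '' for ε, and the naive fixpoint loop are our choices): it adds an ε-transition whenever a path reads x, then ε-steps, then x⁻¹; after saturation, 1_G ∈ L(A) iff some final state is ε-reachable from q0.

```python
def benois_saturate(n_states, edges):
    """edges: set of triples (p, x, q) with x in {'a','A','b','B'} or ''
    (epsilon). Repeatedly add (p, '', q) whenever p -x-> r, r =eps=>* s,
    s -x^{-1}-> q, until a fixpoint is reached."""
    inv = {'a': 'A', 'A': 'a', 'b': 'B', 'B': 'b'}
    edges = set(edges)
    while True:
        # reflexive-transitive closure of the epsilon edges
        eps = {(p, p) for p in range(n_states)}
        grow = True
        while grow:
            grow = False
            for (p, x, q) in edges:
                if x == '':
                    for (q2, r) in list(eps):
                        if q2 == q and (p, r) not in eps:
                            eps.add((p, r))
                            grow = True
        new = {(p, '', q)
               for (p, x, r) in edges if x != ''
               for (r2, s) in eps if r2 == r
               for (s2, y, q) in edges if s2 == s and y == inv[x]}
        if new <= edges:
            return edges
        edges |= new
```

The naive closure here is far from the polynomial-time implementation mentioned above, but it exhibits the saturation idea in a few lines.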
The submonoid membership problem for G is the following decision problem: Given
elements g, g1 , . . . , gn ∈ G, does g ∈ {g1 , . . . , gn }∗ hold? Clearly, decidability of the
rational subset membership problem for G implies decidability of the submonoid mem-
bership problem for G. Moreover, the latter generalizes the classical subgroup mem-
bership problem for G (also known as the generalized word problem), where the input
is the same as for the submonoid membership problem for G but it is asked whether
g ∈ ⟨g1, . . . , gn⟩ holds.
In our undecidability results in Sec. 5, we will actually consider the non-uniform
variant of the submonoid membership problem, where the submonoid is fixed, i.e., not
part of the input.

3 Wreath Products

Let G and H be groups. Consider the direct sum K = ⊕_{g∈G} H_g, where H_g is a copy of H. We view K as the set H^(G) = {f ∈ H^G | f⁻¹(H \ {1_H}) is finite} of all mappings from G to H with finite support, together with pointwise multiplication as the group operation. The group G has a natural left action on H^(G) given by (gf)(a) = f(g⁻¹a), where f ∈ H^(G) and g, a ∈ G. The corresponding semidirect product H^(G) ⋊ G is the wreath product H ≀ G. In other words:
– Elements of H ≀ G are pairs (f, g), where f ∈ H^(G) and g ∈ G.
– The multiplication in H ≀ G is defined as follows: Let (f1, g1), (f2, g2) ∈ H ≀ G. Then (f1, g1)(f2, g2) = (f, g1g2), where f(a) = f1(a)f2(g1⁻¹a).

The following intuition might be helpful: An element (f, g) ∈ H ≀ G can be thought of as a finite multiset of elements of H \ {1_H} that are sitting at certain elements of G (the mapping f), together with the distinguished element g ∈ G, which can be thought of as a cursor moving in G. If we want to compute the product (f1, g1)(f2, g2), we do this as follows: First, we shift the finite collection of H-elements that corresponds to the mapping f2 by g1: If the element h ∈ H \ {1_H} is sitting at a ∈ G (i.e., f2(a) = h), then we remove h from a and put it at the new location g1a ∈ G. This new collection corresponds to the mapping f2′ : a ↦ f2(g1⁻¹a). After this shift, we multiply the two collections of H-elements pointwise: If the elements h1 and h2 are sitting at a ∈ G (i.e., f1(a) = h1 and f2′(a) = h2), then we put the product h1h2 into the location a. Finally, the new distinguished G-element (the new cursor position) becomes g1g2.
If H (resp. G) is generated by the set A (resp. B) with A ∩ B = ∅, then H ≀ G is generated by A ∪ B.
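The recipe above is easy to implement when H = G = Z (both written additively), i.e. for the group Z ≀ Z of result (i). A minimal sketch of our own; the dict representation of finite-support functions and the name wreath_mul are illustrative choices:

```python
def wreath_mul(e1, e2):
    """Multiply two elements of Z wr Z. An element is (f, g): f is a
    dict {position in Z: nonzero value in Z} (finite support), g the
    cursor. Implements (f1, g1)(f2, g2) = (f, g1 + g2) with
    f(a) = f1(a) + f2(a - g1), since H = Z is written additively."""
    (f1, g1), (f2, g2) = e1, e2
    f = dict(f1)
    for a, h in f2.items():          # shift f2 by g1, then add pointwise
        pos = g1 + a
        f[pos] = f.get(pos, 0) + h
        if f[pos] == 0:
            del f[pos]               # keep the support finite/normalized
    return (f, g1 + g2)
```

Deleting zero entries keeps the dict a canonical representative, so equality of dicts is equality in the group.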

Proposition 1. Let K be a subgroup of G of finite index m and let H be a group. Then H^m ≀ K is isomorphic to a subgroup of index m in H ≀ G.

4 Decidability

We show that the rational subset membership problem is decidable for groups G = H ≀ V, where H is finite and V is virtually free. First, we will show that the rational subset membership problem for G = H ≀ F2, where F2 is the free group generated by a and b, is decidable. For this we make use of a particular well-quasi-order.

A Well-quasi-order. Recall that a well-quasi-order (wqo) on a set A is a reflexive and transitive relation ≼ such that for every infinite sequence a1, a2, a3, . . . with ai ∈ A there exist i < j such that ai ≼ aj. In this paper, ≼ will always be antisymmetric as well; so ≼ will be a well partial order.
For a finite alphabet X and two words u, v ∈ X∗, we write u ≼ v if there exist v0, . . . , vn ∈ X∗, u1, . . . , un ∈ X such that v = v0u1v1 · · · unvn and u = u1 · · · un. The following theorem was shown by Higman [14] (and independently Haines [13]).

Theorem 1 (Higman's Lemma). The order ≼ on X∗ is a wqo.

Let H be a group. For a monoid morphism α : X∗ → H and u, v ∈ X∗, let u ≼α v if there is a factorization v = v0u1v1 · · · unvn with v0, . . . , vn ∈ X∗, u1, . . . , un ∈ X, u = u1 · · · un, and α(vi) = 1 for 0 ≤ i ≤ n. It is easy to see that ≼α is indeed a partial order on X∗. Furthermore, let ≼H be the partial order on X∗ with u ≼H v if v = v0u1v1 · · · unvn for some v0, . . . , vn ∈ X∗, u1, . . . , un ∈ X, and u = u1 · · · un such that α(vi) = 1 for every morphism α : X∗ → H and 0 ≤ i ≤ n. Note that if H is finite, there are only finitely many morphisms α : X∗ → H. The upward closure U ⊆ X∗ of {ε} with respect to ≼H is the intersection of all preimages α−1(1) for all morphisms α : X∗ → H, which is therefore regular if H is finite (and a finite
automaton for this upward closure can be constructed from X and H). Since for w = w1 · · · wn, w1, . . . , wn ∈ X, the upward closure of {w} equals U w1 U · · · U wn U, we can also construct a finite automaton for the upward closure of any given singleton provided that H is finite. In the latter case, we can also show that ≼H is a wqo:

Lemma 1. For every finite group H and finite alphabet X, (X∗, ≼H) is a wqo.²

Proof. There are only finitely many morphisms α : X∗ → H, say α1, . . . , αℓ. If β : X∗ → H^ℓ is the morphism with β(w) = (α1(w), . . . , αℓ(w)), then for all words w ∈ X∗: β(w) = 1 if and only if α(w) = 1 for all morphisms α : X∗ → H. Thus, ≼H coincides with ≼β, and it suffices to show that ≼β is a wqo.
Let w1, w2, . . . ∈ X∗ be an infinite sequence of words. Since H^ℓ is finite, we can assume that all the wi have the same image under β; otherwise, choose an infinite subsequence on which β is constant. Consider the alphabet Y = X × H^ℓ. For every w ∈ X∗, w = a1 · · · ar, let w̄ ∈ Y∗ be the word

w̄ = (a1, β(a1))(a2, β(a1a2)) · · · (ar, β(a1 · · · ar)). (1)

Applying Thm. 1 to the sequence w̄1, w̄2, . . . yields i < j with w̄i ≼ w̄j. This means w̄i = u1 · · · ur and w̄j = v0u1v1 · · · urvr for some u1, . . . , ur ∈ Y, v0, . . . , vr ∈ Y∗. By definition of w̄i we have us = (u′s, hs) for 1 ≤ s ≤ r, where hs = β(u′1 · · · u′s) and wi = u′1 · · · u′r. Let π1 : Y∗ → X∗ be the morphism extending the projection onto the first component, and let v′s = π1(vs) for 0 ≤ s ≤ r. Then clearly wj = v′0u′1v′1 · · · u′rv′r. We claim that β(v′s) = 1 for 0 ≤ s ≤ r, from which wi ≼β wj and hence the lemma follows. Since w̄j is also obtained according to (1), we have β(u′1 · · · u′s+1) = hs+1 = β(v′0u′1v′1 · · · u′sv′su′s+1) for 0 ≤ s ≤ r − 1. By induction on s, this implies β(v′s) = 1 for 0 ≤ s ≤ r − 1. Finally, β(v′r) = 1 follows from β(u′1 · · · u′r) = β(wi) = β(wj) = β(v′0u′1v′1 · · · u′rv′r) = β(u′1 · · · u′rv′r). □
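The order ≼α can also be decided directly by a small dynamic program. The sketch below is our own illustration (we take H = Z/2 with addition mod 2 as a stand-in finite group; the name embeds and the state encoding are our choices): a state (i, h) records that i letters of u have been matched and that the gap read since the last match has α-image h.

```python
def embeds(u, v, alpha, identity=0, op=lambda x, y: (x + y) % 2):
    """Decide u <=_alpha v for a morphism alpha: letters -> Z/2.
    A factorization v = v0 u1 v1 ... un vn with alpha(vi) = 1 exists
    iff the DP below reaches state (len(u), identity): matching the
    next letter of u is allowed only when the current gap is trivial."""
    states = {(0, identity)}
    for c in v:
        nxt = set()
        for (i, h) in states:
            # absorb c into the current gap vi
            nxt.add((i, op(h, alpha[c])))
            # or match c against u[i], closing a trivial gap
            if i < len(u) and u[i] == c and h == identity:
                nxt.add((i + 1, identity))
        states = nxt
    return (len(u), identity) in states
```

The state space has size (|u| + 1) · |H|, so the check runs in time O(|v| · |u| · |H|).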

Loops. Let G = H ≀ F2 and fix free generators a, b ∈ F2. Recall that every element of F2 can be represented by a unique word over {a, a−1, b, b−1} that does not contain a factor of the form aa−1, a−1a, bb−1, or b−1b; such words are called reduced. For f ∈ F2, let |f| be the length of the reduced word representing f. Also recall that elements of G are pairs (k, f), where k ∈ K = ⊕_{g∈F2} H and f ∈ F2. In the following, we simply write kf for the pair (k, f). Fix an automaton A = (Q, G, E, q0, QF) with labels from G for the rest of Sec. 4. We want to check whether 1 ∈ L(A). Since G is generated as a monoid by H ∪ {a, a−1, b, b−1}, we can assume that E ⊆ Q × (H ∪ {a, a−1, b, b−1}) × Q.
A configuration is an element of Q × G. For configurations (p, g1 ), (q, g2 ), we write
(p, g1 ) →A (q, g2 ) if there is a (p, g, q) ∈ E such that g2 = g1 g. For elements f, g ∈
F2 , we write f ≤ g (f < g) if the reduced word representing f is a (proper) prefix
of the reduced word representing g. We say that an element f ∈ F2 \ {1} is of type
x ∈ {a, a−1 , b, b−1 } if the reduced word representing f ends with x. Furthermore,
1 ∈ F2 is of type 1. Hence, the set of types is T = {1, a, a−1 , b, b−1 }. When regarding
² One can actually show for any group H: (X∗, ≼H) is a wqo if and only if for every n ∈ N, there is k ∈ N with |⟨g1, . . . , gn⟩| ≤ k for all g1, . . . , gn ∈ H. See the full version [24].
366 M. Lohrey, B. Steinberg, and G. Zetzsche

the Cayley graph of F2 as a tree with root 1, the children of a node of type t are of
the types C(t) = {a, a−1 , b, b−1 } \ {t−1 }. Clearly, two nodes have the same type if
and only if their induced subtrees of the Cayley graph are isomorphic. The elements of
D = {a, a−1 , b, b−1 } will also be called directions.
Let p, q ∈ Q and t ∈ T . A sequence of configurations

(q1 , k1 f1 ) →A (q2 , k2 f2 ) →A · · · →A (qn , kn fn ) (2)

(recall that ki fi denotes the pair (ki , fi ) ∈ G) is called a well-nested (p, q)-computation
for t if (i) q1 = p and qn = q, (ii) f1 = fn is of type t, and (iii) fi ≥ f1 for 1 < i < n
(this last condition is satisfied automatically if f1 = fn = 1). We define the effect
of the computation to be f1−1 k1−1 kn fn ∈ K. Hence, the effect describes the change
imposed by applying the corresponding sequence of transitions, independently of the
configuration in which it starts. The depth of the computation (2) is the maximum value
of |f1−1 fi | for 1 ≤ i ≤ n. We have 1 ∈ L(A) if and only if for some q ∈ QF , there is a
well-nested (q0 , q)-computation for 1 with effect 1.
For d ∈ C(t), a well-nested (p, q)-computation (2) for t is called a (p, d, q)-loop for
t if in addition f1 d ≤ fi for 1 < i < n. Note that there is a (p, d, q)-loop for t that starts
in (p, kf ) (where f is of type t) with effect e and depth m if and only if there exists a
(p, d, q)-loop for t with effect e and depth m that starts in (p, t).
Given p, q ∈ Q, t ∈ T , d ∈ C(t), it is decidable whether there is a (p, d, q)-
loop for t: This amounts to checking whether a given automaton with input alphabet
{a, a−1 , b, b−1 } accepts a word representing the identity of F2 such that no proper
prefix represents the identity of F2 . Since this can be accomplished using pushdown
automata, we can compute the set

Xt = {(p, d, q) ∈ Q × C(t) × Q | there is a (p, d, q)-loop for t}.
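The condition defining Xt can be illustrated executably. The sketch below is ours (the tiny example automaton, the letter encoding, and the length bound are all assumptions for illustration); it tests by brute force over short paths whether some path label represents the identity of F2 while no proper non-empty prefix does, whereas the actual decision procedure uses pushdown automata as stated above:

```python
INV = {'a': 'A', 'A': 'a', 'b': 'B', 'B': 'b'}  # 'A' encodes a^-1, 'B' encodes b^-1

def is_loop_word(word):
    """True iff word represents 1 in F2 and no proper non-empty prefix does.

    The stack holds the free reduction of the prefix read so far; the stack
    being empty means that prefix represents the identity of F2."""
    if not word:
        return False
    stack = []
    for i, x in enumerate(word):
        if stack and stack[-1] == INV[x]:
            stack.pop()
        else:
            stack.append(x)
        if not stack and i < len(word) - 1:
            return False      # a proper prefix already represents 1
    return not stack

def loop_exists(edges, p, q, bound):
    """Bounded search for a path p -> q in `edges` (triples
    (state, letter, state)) labeled by such a word.  A False answer is
    inconclusive: the real algorithm decides this via pushdown automata."""
    paths = [(p, '')]
    for _ in range(bound):
        paths = [(t, w + x) for (s, w) in paths
                            for (s2, x, t) in edges if s2 == s]
        if any(s == q and is_loop_word(w) for (s, w) in paths):
            return True
    return False

# Tiny example automaton: a loop in direction 'a' that immediately returns.
edges = [(0, 'a', 1), (1, 'A', 0)]
assert loop_exists(edges, 0, 0, bound=4)
assert is_loop_word('aA') and not is_loop_word('aAbB')
```

Note that if the reduction stack never empties before the end of the word, its bottom symbol stays equal to the first letter, which is exactly the direction d of the loop.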

Loop Patterns. Given a word w = (p1 , d1 , q1 ) · · · (pn , dn , qn ) ∈ Xt∗ , a loop assignment for w is a choice of a (pi , di , qi )-loop for t for each position i, 1 ≤ i ≤ n.
The effect of a loop assignment is e1 · · · en ∈ K, where ei ∈ K is the effect of the
loop assigned to position i. The depth of a loop assignment is the maximum depth
of an appearing loop. A loop pattern for t is a word w ∈ Xt∗ that has a loop as-
signment with effect 1. The depth of the loop pattern is the minimum depth of a loop
assignment with effect 1. Note that applying the loops for the symbols in a loop pat-
tern (p1 , d1 , q1 ) · · · (pn , dn , qn ) does not have to be a computation: We do not require
qi = pi+1 . Instead, the loop patterns describe the possible ways in which a well-nested
computation can enter (and leave) subtrees of the Cayley graph of F2 in order to have
effect 1. The sets
Pt = {w ∈ Xt∗ | w is a loop pattern for t}
for t ∈ T will therefore play a central role in the decision procedure.
Recall the definition of the partial order ⪯H from Sec. 4. We have shown that ⪯H is
a wqo (Lemma 1). The second important result on ⪯H is:

Lemma 2. For each t ∈ T , Pt is an upward closed subset of Xt∗ with respect to ⪯H .


Rational Subsets and Submonoids of Wreath Products 367

Lemmas 1 and 2 already imply that each Pt is a regular language, since the upward
closure of each singleton is regular. This can also be deduced by observing that ⪯H is
a monotone order in the sense of [7]. Therein, Ehrenfeucht et al. show that languages
that are upward closed with respect to monotone well-quasi-orders are regular. Our next
step is a characterization of the Pt that allows us to compute finite automata for them.
In order to state this characterization, we need the following definitions.

Let X, Y be alphabets. A regular substitution is a map σ : X → 2Y ∗ such that
σ(x) ⊆ Y ∗ is regular for every x ∈ X. For w ∈ X ∗ , w = w1 · · · wn , wi ∈ X, let
σ(w) = R1 · · · Rn , where σ(wi ) = Ri for 1 ≤ i ≤ n. Given R ⊆ Y ∗ and a regular
substitution σ : X → 2Y ∗ , let σ −1 (R) = {w ∈ X ∗ | σ(w) ∩ R ≠ ∅}. If R is regular,
then σ −1 (R) is regular as well [31, Prop. 2.16], and an automaton for σ −1 (R) can be
obtained effectively from automata for R and the σ(x). The alphabet Yt is given by

Yt = Xt ∪ ((Q × H × Q) ∩ E).

We will interpret a word in Yt∗ as that part of a computation that happens in a node of
type t: A symbol in Yt \ Xt stands for a transition that stays in the current node and
only changes the local H-value and the state. A symbol (p, d, q) ∈ Xt represents the
execution of a (p, d, q)-loop in a subtree of the current node. The morphism πt : Yt∗ →
Xt∗ is the projection onto Xt∗ , meaning πt (y) = y for y ∈ Xt and πt (y) = ε for
y ∈ Yt \ Xt . The morphism νt : Yt∗ → H is defined by

νt ((p, d, q)) = 1 for (p, d, q) ∈ Xt


νt ((p, h, q)) = h for (p, h, q) ∈ Yt \ Xt .

Hence, when w ∈ Yt∗ describes part of a computation, νt (w) is the change it imposes
on the current node. For p, q ∈ Q and t ∈ T , define the regular set

R^t_{p,q} = {(p0 , g1 , p1 )(p1 , g2 , p2 ) · · · (pn−1 , gn , pn ) ∈ Yt∗ | p0 = p, pn = q}.

Then πt−1 (Pt ) ∩ νt−1 (1) ∩ R^t_{p,q} consists of those words over Yt that admit an assignment of loops to occurrences of symbols in Xt so as to obtain a well-nested (p, q)-computation for t with effect 1. Given d ∈ C(t), t ∈ T , the regular substitution
σt,d : Xt → 2Yd ∗ is defined by

σt,d ((p, d, q)) = ⋃ {R^d_{p′,q′} | (p, d, p′ ), (q ′ , d−1 , q) ∈ E}
σt,d ((p, u, q)) = {ε} for u ∈ C(t) \ {d}.

For tuples (Ut )t∈T and (Vt )t∈T with Ut , Vt ⊆ Xt∗ , we write (Ut )t∈T ≤ (Vt )t∈T if
Ut ⊆ Vt for each t ∈ T . We can now state the following fixpoint characterization:

Lemma 3. (Pt )t∈T is the smallest tuple such that for every t ∈ T we have ε ∈ Pt and

⋃d∈C(t) σt,d−1 (πd−1 (Pd ) ∩ νd−1 (1)) ⊆ Pt .

Given a language L ⊆ Xt∗ , let L↑t = {v ∈ Xt∗ | u ⪯H v for some u ∈ L}.



Theorem 2. The rational subset membership problem is decidable for every group G =
H ≀ F , where H is finite and F is a finitely generated free group.
Proof. Since H ≀ F is a subgroup of H ≀ F2 (since F is a subgroup of F2 ), it suffices to
show decidability for G = H ≀ F2 . First, we compute finite automata for the languages
Pt . We do this by initializing Ut(0) := {ε}↑t for each t ∈ T and then successively
extending the sets Ut(i) , which are represented by finite automata, until they equal Pt :
If there is a t ∈ T and a word

w ∈ ⋃d∈C(t) σt,d−1 (πd−1 (Ud(i) ) ∩ νd−1 (1)) \ Ut(i) ,

we set Ut(i+1) := Ut(i) ∪ {w}↑t and Uu(i+1) := Uu(i) for u ∈ T \ {t}. Otherwise we stop.
By induction on i, it follows from Lemma 2 and Lemma 3 that Ut(i) ⊆ Pt .

In each step, we obtain Ut(i+1) by adding new words to Ut(i) . Since the sets Ut(i)
are upward closed by construction and there is no infinite (strictly) ascending chain
of upward closed sets in a wqo, the algorithm above has to terminate with some tuple
(Ut(k) )t∈T . This, however, means that for every t ∈ T

⋃d∈C(t) σt,d−1 (πd−1 (Ud(k) ) ∩ νd−1 (1)) ⊆ Ut(k) .

Since on the other hand ε ∈ Ut(k) and Ut(k) ⊆ Pt , Lemma 3 yields Ut(k) = Pt .

Now we have 1 ∈ L(A) if and only if π1−1 (P1 ) ∩ ν1−1 (1) ∩ R^1_{q0,q} ≠ ∅ for some
q ∈ QF , which can be reduced to non-emptiness for finite automata. ⊓⊔
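The termination argument rests on the wqo property. In the simplified case where H is trivial, ⪯H degenerates into Higman's scattered-subword order, and the upward-closed sets manipulated by such a saturation can be represented by their finite sets of minimal words. The Python sketch below (class and function names are ours; it is an illustration of the representation, not of the full algorithm) shows this:

```python
def embeds(u, v):
    """Higman's scattered-subword order: u = u1...ur embeds in v iff
    v = v0 u1 v1 ... ur vr (the order of Lemma 1 with H trivial)."""
    it = iter(v)
    return all(x in it for x in u)

class UpwardClosed:
    """An upward-closed language represented by a finite set of minimal
    words; finiteness of that set is what Higman's lemma guarantees."""
    def __init__(self, basis=()):
        self.basis = set(basis)

    def __contains__(self, w):
        return any(embeds(u, w) for u in self.basis)

    def add(self, w):
        """Union with the upward closure {w}^up; True iff w was new."""
        if w in self:
            return False
        self.basis = {u for u in self.basis if not embeds(w, u)} | {w}
        return True

# Saturation in the style of the proof: keep adding upward closures of
# newly discovered words.  Termination is guaranteed because a wqo
# admits no infinite strictly ascending chain of upward-closed sets.
V = UpwardClosed()
for w in ['ba', 'ab', 'aba']:   # 'aba' lies above 'ab', so it adds nothing
    V.add(w)
assert V.basis == {'ab', 'ba'}
assert 'xaxbx' in V and 'b' not in V
```

The trick `x in it` advances a shared iterator, which is exactly the greedy subsequence test.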
Theorem 3. The rational subset membership problem is decidable for every group H ≀ V
with H finite and V virtually free.

Proof. This is immediate from Thm. 2 and Prop. 1: If F is a free subgroup of index m
in V , then H m ≀ F is isomorphic to a subgroup of index m in H ≀ V and decidability of
rational subset membership is preserved by finite extensions [12,17]. ⊓⊔

5 Undecidability
In this section, we will prove the second main result of this paper: The wreath product
Z ≀ Z contains a fixed submonoid with an undecidable membership problem. Our proof
is based on the undecidability of the halting problem for 2-counter machines.

2-Counter Machines. A 2-counter machine (also known as a Minsky machine) is a tuple


C = (Q, q0 , qf , δ), where Q is a finite set of states, q0 ∈ Q is the initial state, qf ∈ Q
is the final state, and δ ⊆ (Q \ {qf }) × {c0 , c1 } × {+1, −1, = 0} × Q is the set of
transitions. The set of configurations is Q×N×N, on which we define a binary relation
→C as follows: (p, m0 , m1 ) →C (q, n0 , n1 ) if and only if one of the following holds:
– There is (p, ci , b, q) ∈ δ such that b ∈ {−1, 1}, ni = mi + b, and n1−i = m1−i .
– There is (p, ci , = 0, q) ∈ δ such that ni = mi = 0 and n1−i = m1−i .
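The step relation →C is straightforward to interpret. The sketch below follows the definition above; the example machine at the end and all names are ours, and the reachability check is necessarily bounded, since the general problem is undecidable:

```python
from collections import deque

def steps(delta, config):
    """All successors of a configuration (state, m0, m1) under ->_C.
    Transitions are (src, i, op, tgt) with i in {0, 1} the counter index
    and op either +1, -1, or the zero test '=0'."""
    p, m = config[0], [config[1], config[2]]
    out = []
    for (src, i, op, q) in delta:
        if src != p:
            continue
        if op == '=0' and m[i] == 0:              # zero test
            out.append((q, m[0], m[1]))
        elif op in (1, -1) and m[i] + op >= 0:    # increment / decrement
            n = list(m)
            n[i] += op
            out.append((q, n[0], n[1]))
    return out

def reaches_halt(delta, init, final, limit=10_000):
    """Bounded BFS for (final, 0, 0); the unbounded problem is undecidable."""
    seen, todo = {init}, deque([init])
    while todo and len(seen) < limit:
        for c in steps(delta, todo.popleft()):
            if c not in seen:
                seen.add(c)
                todo.append(c)
    return (final, 0, 0) in seen

# Example machine (ours): empty counter 0, then counter 1, then halt.
delta = [('q0', 0, -1, 'q0'), ('q0', 0, '=0', 'q1'),
         ('q1', 1, -1, 'q1'), ('q1', 1, '=0', 'qf')]
assert reaches_halt(delta, ('q0', 2, 1), 'qf')
assert not reaches_halt(delta, ('q0', 2, 1), 'q_absent')
```

Note that decrements are only enabled when the counter is positive, mirroring the requirement ni = mi + b ∈ N.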

It is well known that every Turing machine can be simulated by a 2-counter machine (see e.g. [26]). In particular, we have:
Theorem 4. There is a fixed 2-counter machine C = (Q, q0 , qf , δ) such that the fol-
lowing problem is undecidable: Given m, n ∈ N, does (q0 , m, n) →∗C (qf , 0, 0) hold?

Submonoids of Z ≀ Z. In this section, we only consider wreath products of the form
H ≀ Z. An element (f, m) ∈ H ≀ Z such that the support of f is contained in the interval
[a, b] (with a, b ∈ Z) and 0, m ∈ [a, b] will also be written as a list [f (a), . . . , f (b)],
where in addition the element f (0) is labeled by an incoming (downward) arrow and
the element f (m) is labeled by an outgoing (upward) arrow.
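The pair representation can be made concrete for H = Z. In the Python sketch below (our encoding: finitely supported functions as dicts, and hypothetical generator names a and t), multiplication shifts the right-hand function by the left-hand cursor, which is exactly how the generator words below will act:

```python
from functools import reduce

# Elements of Z wr Z as pairs (f, m): f is a finitely supported map
# Z -> Z (a dict of its nonzero values) and m is the cursor position.
# The product shifts the right-hand function by the left-hand cursor:
#   (f, m)(g, n) = (f + g(. - m), m + n).

def mul(x, y):
    (f, m), (g, n) = x, y
    h = dict(f)
    for k, v in g.items():
        h[k + m] = h.get(k + m, 0) + v
        if h[k + m] == 0:
            del h[k + m]          # keep the support finite and canonical
    return (h, m + n)

def word(*gens):
    return reduce(mul, gens, ({}, 0))

# Our generator names: a moves the cursor right, t adds 1 at the cursor.
a, a_inv, t = ({}, 1), ({}, -1), ({0: 1}, 0)

# a t a t a^-1 a^-1 writes 1 at positions 1 and 2 and returns the cursor
# to 0 -- list notation [0, 1, 1], with both arrows on the first entry.
assert word(a, t, a, t, a_inv, a_inv) == ({1: 1, 2: 1}, 0)
assert mul(t, t) == ({0: 2}, 0)
```

Reading a generator word left to right and tracking the cursor in this way is how the list notations (3)–(10) below are obtained.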
We will construct a fixed finitely generated submonoid of the wreath product Z ≀ Z
with an undecidable membership problem. For this, let C = (Q, q0 , qf , δ) be the 2-
counter machine from Thm. 4. W.l.o.g. we can assume that there exists a partition Q =
Q0 ∪ Q1 such that q0 ∈ Q0 and

δ ⊆ (Q0 × {c0 } × {+1, −1, = 0} × Q1 ) ∪ (Q1 × {c1 } × {+1, −1, = 0} × Q0 ).

In other words, C alternates between the two counters. Hence, a transition (q, ci , x, p)
can be just written as (q, x, p).
Let Σ = Q ⊎ {c, #} and let ZΣ be the free abelian group generated by Σ. First, we
prove that there is a fixed finitely generated submonoid M of ZΣ ≀ Z with an undecidable
membership problem. Let a ∉ Σ be a generator for the right Z-factor; hence ZΣ ≀ Z
is generated by Σ ∪ {a}. Let K = ⊕m∈Z ZΣ . In the following, we will freely switch
between the description of elements of ZΣ ≀ Z by words over (Σ ∪ {a})±1 and by pairs
from K ⋊ Z.
Our finitely generated submonoid M of ZΣ ≀ Z is generated by the following elements. The right column shows the generators in list notation (elements of ZΣ are
written additively, i.e., as Z-linear combinations of elements of Σ):
p−1 a#a2 #aq for (p, =0, q) ∈ δ            [−p, #, 0, #, q]                       (3)
p−1 a#aca2 qa−2 for (p, +1, q) ∈ δ         [−p, #, c, 0, q]                       (4)
p−1 a#a3 qa6 c−1 a−8 for (p, −1, q) ∈ δ    [−p, #, 0, 0, q, 0, 0, 0, 0, 0, −c]    (5)
c−1 a8 ca−8                                [−c, 0, 0, 0, 0, 0, 0, 0, c]           (6)
c−1 a#a7 ca−6                              [−c, #, 0, 0, 0, 0, 0, 0, c]           (7)
qf−1 a−1                                   [0, −qf ]                              (8)
#−1 a−2                                    [0, 0, −#]                             (9)

For initial counter values m, n ∈ N let I(m, n) = aq0 a2 cm a4 cn a−6 ; its list notation is

[0, q0 , 0, m · c, 0, 0, 0, n · c].        (10)

Here is some intuition: The group element I(m, n) represents the initial configura-
tion (q0 , m, n) of the 2-counter machine C. Lemma 4 below states that (q0 , m, n) →∗C
(qf , 0, 0) is equivalent to the existence of Y ∈ M with I(m, n)Y = 1, i.e., I(m, n)−1 ∈
M . Generators of type (3)–(7) simulate the 2-counter machine C. States of C will be
stored at cursor positions 4k + 1. The values of the first (resp., second) counter will be
stored at cursor positions 8k + 3 (resp., 8k + 7). Note that I(m, n) puts a single copy
of the symbol q0 ∈ Σ at position 1, m copies of symbol c (which represents counter
values) at position 3, and n copies of symbol c at position 7. Hence, indeed, I(m, n) sets
up the initial configuration (q0 , m, n) for C. Even cursor positions will carry the special
symbol #. Note that generator (8) is the only generator which changes the cursor posi-
tion from even to odd or vice versa. It will turn out that if I(m, n)Y = 1 (Y ∈ M ), then
generator (8) has to occur exactly once in Y ; it terminates the simulation of the 2-counter
machine C. Hence, Y can be written as Y = U (qf−1 a−1 )V with U, V ∈ M . Moreover,
it turns out that U ∈ M is a product of generators (3)–(7), which simulate C. Thereby,
even cursor positions will be marked with a single occurrence of the special symbol #.
In a second phase, which corresponds to V ∈ M , these special symbols # will be re-
moved again and the cursor will be moved left to position 0. This is accomplished with
generator (9). In fact, our construction enforces that V is a power of (9).
During the simulation phase (corresponding to U ∈ M ), generators of type (3) im-
plement zero tests, whereas generators of type (4) (resp., (5)) increment (resp., decre-
ment) a counter. Finally, (6) and (7) copy the counter value to the next cursor position
that is reserved for the counter (that is copied). During such a copy phase, (6) is first
applied zero or more times. Finally, (7) is applied exactly once.
Lemma 4. For all m, n ∈ N we have: (q0 , m, n) →∗C (qf , 0, 0) if and only if there
exists Y ∈ M such that I(m, n)Y = 1.
The following result is an immediate consequence of Thm. 4 and Lemma 4.
Theorem 5. There is a fixed finitely generated submonoid M of the wreath product
ZΣ ≀ Z with an undecidable membership problem.
Finally, we can establish the main result of this section.
Theorem 6. There is a fixed finitely generated submonoid M of the wreath product
Z ≀ Z with an undecidable membership problem.

Proof. By Thm. 5 it suffices to reduce the submonoid membership problem of ZΣ ≀ Z
to the submonoid membership problem of Z ≀ Z. If m = |Σ|, then Prop. 1 shows that
ZΣ ≀ Z ≅ Zm ≀ mZ is isomorphic to a subgroup of index m in Z ≀ Z. So if Z ≀ Z had a
decidable submonoid membership problem for each finitely generated submonoid, then
the same would be true of ZΣ ≀ Z. ⊓⊔
Theorem 6 together with the undecidability of the rational subset membership problem
for groups H ≀ (Z × Z) for non-trivial H [23] implies the following: For finitely generated
non-trivial abelian groups G and H, H ≀ G has a decidable rational subset membership
problem if and only if (i) G is finite3 or (ii) G has rank 1 and H is finite.
3 If G has size m, then by Prop. 1, H m ≅ H m ≀ 1 is isomorphic to a subgroup of index m in H ≀ G.
Since H m is finitely generated abelian and decidability of the rational subset membership is
preserved by finite extensions [12,17], decidability for H ≀ G follows.

By [4], Z ≀ Z is a subgroup of Thompson's group F as well as of Baumslag's finitely
presented metabelian group ⟨a, s, t | [s, t] = [a^t , a] = 1, a^s = a a^t ⟩. Hence, we get:

Corollary 1. Thompson’s group F and Baumslag’s finitely presented metabelian group


both contain finitely generated submonoids with an undecidable membership problem.

6 Open Problems
As mentioned in the introduction, the rational subset membership problem is undecidable for every wreath product H ≀ (Z × Z), where H is a non-trivial group [23]. We
conjecture that for every non-trivial group H and every non-virtually free group G, the
rational subset membership problem for H ≀ G is undecidable. The reason is that the
undecidability proof for H ≀ (Z × Z) [23] only uses the grid-like structure of the Cayley
graph of Z × Z. In [19] it was shown that the Cayley graph of a group G has bounded
tree width if and only if the group is virtually free. Hence, if G is not virtually free,
then the Cayley-graph of G has unbounded tree width, which means that finite grids of
arbitrary size appear as minors in the Cayley-graph of G. One might therefore hope to
again reduce a tiling problem to the rational subset membership problem for H ≀ G (for
H non-trivial and G not virtually free).
Another interesting case, which is not resolved by our results, concerns the rational
subset membership problem for wreath products G ≀ V with V virtually free and G a
finitely generated infinite torsion group. Finally, all these questions can also be asked
for the submonoid membership problem. We do not know any example of a group with
decidable submonoid membership problem but undecidable rational subset membership
problem. If such a group exists, it must be one-ended [22].

References
1. Anisimov, A.V.: Group languages. Kibernetika 4, 18–24 (1971) (in Russian); English translation: Cybernetics 4, 594–601 (1973)
2. Benois, M.: Parties rationnelles du groupe libre. C. R. Acad. Sci. Paris, Sér. A 269,
1188–1190 (1969)
3. Chambart, P., Schnoebelen, P.: Post embedding problem is not primitive recursive, with appli-
cations to channel systems. In: Arvind, V., Prasad, S. (eds.) FSTTCS 2007. LNCS, vol. 4855,
pp. 265–276. Springer, Heidelberg (2007)
4. Cleary, S.: Distortion of wreath products in some finitely-presented groups. Pacific Journal
of Mathematics 228(1), 53–61 (2006)
5. Davis, T.C., Olshanskii, A.Y.: Subgroup distortion in wreath products of cyclic groups. Jour-
nal of Pure and Applied Algebra 215(12), 2987–3004 (2011)
6. Diekert, V., Muscholl, A.: Solvability of equations in free partially commutative groups is
decidable. International Journal of Algebra and Computation 16(6), 1047–1069 (2006)
7. Ehrenfeucht, A., Haussler, D., Rozenberg, G.: On regularity of context-free languages. Theor.
Comput. Sci. 27, 311–332 (1983)
8. Eilenberg, S., Schützenberger, M.P.: Rational sets in commutative monoids. Journal of Alge-
bra 13, 173–191 (1969)
9. Fernau, H., Stiebe, R.: Sequential grammars and automata with valences. Theor. Comput.
Sci. 276(1-2), 377–405 (2002)

10. Finkel, A., Schnoebelen, P.: Well-structured transition systems everywhere! Theor. Comput.
Sci. 256(1-2), 63–92 (2001)
11. Gilman, R.H.: Formal languages and infinite groups. In: Geometric and Computational Perspectives on Infinite Groups. DIMACS Ser. Discrete Math. Theoret. Comput. Sci., vol. 25,
pp. 27–51. AMS (1996)
12. Grunschlag, Z.: Algorithms in Geometric Group Theory. PhD thesis, University of California
at Berkeley (1999)
13. Haines, L.H.: On free monoids partially ordered by embedding. Journal of Combinatorial
Theory 6, 94–98 (1969)
14. Higman, G.: Ordering by divisibility in abstract algebras. Proceedings of the London Math-
ematical Society. Third Series 2, 326–336 (1952)
15. Jurdziński, T.: Leftist grammars are non-primitive recursive. In: Aceto, L., Damgård, I.,
Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008,
Part II. LNCS, vol. 5126, pp. 51–62. Springer, Heidelberg (2008)
16. Kambites, M.: Formal languages and groups as memory. Communications in Algebra 37(1),
193–208 (2009)
17. Kambites, M., Silva, P.V., Steinberg, B.: On the rational subset problem for groups. Journal
of Algebra 309(2), 622–639 (2007)
18. Kunc, M.: Regular solutions of language inequalities and well quasi-orders. Theor. Comput.
Sci. 348(2–3), 277–293 (2005)
19. Kuske, D., Lohrey, M.: Logical aspects of Cayley-graphs: the group case. Annals of Pure and
Applied Logic 131(1–3), 263–286 (2005)
20. Lohrey, M., Sénizergues, G.: Theories of HNN-extensions and amalgamated products. In:
Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4052,
pp. 504–515. Springer, Heidelberg (2006)
21. Lohrey, M., Steinberg, B.: The submonoid and rational subset membership problems for
graph groups. Journal of Algebra 320(2), 728–755 (2008)
22. Lohrey, M., Steinberg, B.: Submonoids and rational subsets of groups with infinitely many
ends. Journal of Algebra 324(4), 970–983 (2010)
23. Lohrey, M., Steinberg, B.: Tilings and submonoids of metabelian groups. Theory Comput.
Syst. 48(2), 411–427 (2011)
24. Lohrey, M., Steinberg, B., Zetzsche, G.: Rational subsets and submonoids of wreath products.
arXiv.org (2013), http://arxiv.org/abs/1302.2455
25. Lyndon, R.C., Schupp, P.E.: Combinatorial Group Theory. Springer (1977)
26. Minsky, M.L.: Computation: Finite and Infinite Machines. Prentice-Hall International (1967)
27. Motwani, R., Panigrahy, R., Saraswat, V.A., Venkatasubramanian, S.: On the decidability of
accessibility problems (extended abstract). In: Proc. STOC 2000, pp. 306–315. ACM (2000)
28. Roman’kov, V.: On the occurence problem for rational subsets of a group. In: International
Conference on Combinatorial and Computational Methods in Mathematics, pp. 76–81 (1999)
29. Romanovskii, N.S.: Some algorithmic problems for solvable groups. Algebra i Logika 13(1),
26–34 (1974)
30. Romanovskii, N.S.: The occurrence problem for extensions of abelian groups by nilpotent
groups. Sibirsk. Mat. Zh. 21, 170–174 (1980)
31. Sakarovitch, J.: Elements of Automata Theory. Cambridge University Press (2009)
32. Schnoebelen, P.: Verifying lossy channel systems has nonprimitive recursive complexity. Inf.
Process. Lett. 83(5), 251–261 (2002)
Fair Subtyping for Open Session Types⋆

Luca Padovani

Dipartimento di Informatica, Università di Torino, Italy


[email protected]

Abstract. Fair subtyping is a liveness-preserving refinement relation for session


types akin to (but coarser than) the well-known should-testing precongruence.
The behavioral characterization of fair subtyping is challenging essentially be-
cause fair subtyping is context-sensitive: two session types may or may not be
related depending on the context in which they occur, hence the traditional coin-
ductive argument for dealing with recursive types is unsound in general. In this
paper we develop complete behavioral and axiomatic characterizations of fair
subtyping and we give a polynomial algorithm to decide it.

1 Introduction
Session types [7,8] describe the type, order, and direction of messages that can be
sent over channels. In essence, session types are simple CCS-like processes using a
reduced set of operators [3,1]: termination, external and internal choices respectively
guarded by input and output actions, and recursion. For example, the session type T =
μ x.(!buy.x ⊕ !pay) denotes a channel for sending an arbitrary number of buy messages
followed by a single pay message. The session type S = μ x.(?buy.x + ?pay.?vouch)
denotes a channel for receiving an arbitrary number of buy messages, or a single pay
message followed by a vouch message. We can describe a whole session in abstract
terms as the parallel composition of the types of its endpoint channels. For instance,
T | S | !vouch describes a session with a client that buys an arbitrary number of items
and then pays, a shop that serves the client, and a bank that vouches for the client. Ses-
sion type systems check that processes use session channels according to a session type.
As an example, the typing derivation below proves that the process rec X .k!m.X
sending the message m on channel k is well typed in the channel environment k : T
provided that “m is a message of type buy” (the exact interpretation of this property is
irrelevant):
                        ───────────────────────────── [VAR]
    m : buy             X ↦ {k : x}; k : x ⊢ X
    ─────────────────────────────────────────────── [OUTPUT]
        X ↦ {k : x}; k : !buy.x ⊕ !pay ⊢ k!m.X
    ─────────────────────────────────────────────── [REC]
        k : μ x.(!buy.x ⊕ !pay) ⊢ rec X .k!m.X
Rule [ REC ] opens the recursive session type T in correspondence with recursion in the
process and augments the process environment with the association X ↦ {k : x}. In
this way, an occurrence of the process variable X in a channel environment where the
⋆ Full version: http://www.di.unito.it/~padovani/Papers/OpenFairSubtyping.pdf

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 373–384, 2013.
© Springer-Verlag Berlin Heidelberg 2013
374 L. Padovani

channel k has type x can be declared well typed. Rule [ OUTPUT ] checks that the output
performed by the process on channel k is allowed by the type of k. Finally, the residual
process after k!m is checked against the residual type of k after !buy.
It is worth noting that all session type systems admit a derivation like the one above
and that such a derivation implicitly and crucially relies on the subtyping relation !buy.x ⊕
!pay ≤ !buy.x, saying that a channel of type !buy.x ⊕ !pay can be safely used where
a channel of type !buy.x is expected. Conventional works on subtyping for session
types [6,3,1] establish that ≤ is contravariant for outputs, thereby making the theory of
session types a conservative extension of the theory of channel types [11,4]. Nonethe-
less, it can be argued that this subtyping relation is inadequate. For instance, consider
the session described earlier as T | S | !vouch and observe that all non-terminated par-
ticipants retain the potential to make progress: it is always possible for the client to pay.
If that happens, the bank can send the vouch message to the shop, at which point all
participants terminate. Accepting the typing derivation above means allowing a client
that behaves according to μ x.!buy.x to interact with a shop and a bank that behave as
S | !vouch. In the session μ x.!buy.x | S | !vouch, however, the client never pays and the
liveness of the session with respect to the bank is compromised. This example proves
that the original subtyping relation for session types, for which !buy.x ⊕ !pay ≤ !buy.x
holds, is not liveness preserving in general: there exist a context C = μ x.[ ] and two
behaviors S | !vouch such that the session described by C [!buy.x ⊕ !pay] | S | !vouch
does have the liveness property while C [!buy.x] | S | !vouch does not.
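This failure of liveness can be checked mechanically. In the Python sketch below, the three participants are encoded (by us, purely for illustration) as small finite automata, and liveness is formalized as: from every reachable state of the composition, the all-terminated state is still reachable:

```python
from collections import deque

# Each participant: map state -> {action: next_state}; '!'/'?' mark output
# and input, 'end' is the terminated state.  (Our encoding of the example.)
client     = {'t0': {'!buy': 't0', '!pay': 'end'}}     # T = mu x.(!buy.x (+) !pay)
client_bad = {'u0': {'!buy': 'u0'}}                    # mu x.!buy.x
shop = {'s0': {'?buy': 's0', '?pay': 's1'},            # S
        's1': {'?vouch': 'end'}}
bank = {'b0': {'!vouch': 'end'}}                       # !vouch

def successors(procs, state):
    """One synchronization step: match an output with someone's input."""
    out = []
    for i, p in enumerate(procs):
        for act, tgt in p.get(state[i], {}).items():
            if act.startswith('!'):
                recv = '?' + act[1:]
                for j, q in enumerate(procs):
                    if j != i and recv in q.get(state[j], {}):
                        s = list(state)
                        s[i], s[j] = tgt, q[state[j]][recv]
                        out.append(tuple(s))
    return out

def live(procs, init):
    """Our formalization of liveness for this sketch: every reachable
    state can still reach the state where all participants terminated."""
    done = ('end',) * len(procs)
    reach, todo = {init}, deque([init])
    while todo:
        for s in successors(procs, todo.popleft()):
            if s not in reach:
                reach.add(s)
                todo.append(s)

    def can_finish(s, seen=None):
        seen = set() if seen is None else seen
        if s == done:
            return True
        seen.add(s)
        return any(t not in seen and can_finish(t, seen)
                   for t in successors(procs, s))

    return all(can_finish(s) for s in reach)

assert live([client, shop, bank], ('t0', 's0', 'b0'))          # T | S | !vouch
assert not live([client_bad, shop, bank], ('u0', 's0', 'b0'))  # livelocked client
```

With the pay-capable client every reachable state can reach termination, whereas the client that only ever buys keeps the composition in a state from which the bank can never deliver vouch.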
The contribution of this work is the definition of a new subtyping relation, which
we dub fair subtyping, as the coarsest liveness-preserving refinement for possibly open
session types (like !buy.x ⊕ !pay and !buy.x above) that is a pre-congruence for all the
operators of the type language. With this definition in place, we can reject a derivation
like the one above because it is based on the law !buy.x ⊕ !pay ≤ !buy.x, which is invalid
for fair subtyping. It may be questioned whether dealing with open session types is
really necessary, given that the above derivation can also be reformulated as follows
                        ───────────────────────────── [VAR]
    m : buy             X ↦ {k : T }; k : T ⊢ X
    ─────────────────────────────────────────────── [OUTPUT]
        X ↦ {k : T }; k : !buy.T ⊕ !pay ⊢ k!m.X
    ─────────────────────────────────────────────── [REC]
                k : T ⊢ rec X .k!m.X
where T is unfolded (instead of being opened) by rule [ REC ] and associated with the
process variable X (this corresponds to using equi-recursive types, whereby a recur-
sive type and its unfolding are deemed equal). Now, this derivation should be rejected
just as the first one, with the difference that the second derivation relies on the law
!buy.T ⊕ !pay ≤ !buy.T . It turns out that this law holds even for fair subtyping, intuitively because there are only finitely many differences between (the infinite unfoldings
of) !buy.T ⊕ !pay and !buy.T . In conclusion, we are not aware of alternative ways of
detecting such invalid derivations other than forbidding subsumption within recursions,
or using a theory of open session types like the one developed in the present paper.
A behavioral refinement called should-testing, enjoying all the properties that we seek
in fair subtyping, has been extensively studied in [12]. There, should-testing is shown
to be the coarsest liveness-preserving pre-congruence of a process algebra considerably
richer than session types. Therefore, given the correspondence between session types

and processes, we could just take should-testing as the defining notion for fair subtyp-
ing. We find this shortcut unsatisfactory for several reasons: first, should-testing implies
trace equivalence between related processes. In our context, this would amount to re-
quiring invariance of outputs, essentially collapsing subtyping to type equality. Second,
no complete axiomatization is known for should-testing and its alternative characteri-
zation is based on a complex denotational model. As a consequence, it is difficult to
understand the basic laws that underlie should-testing. Third, the decision algorithm for
should-testing is linear exponential, that is remarkably more expensive compared to the
quadratic algorithm for the original subtyping [6]. Instead, by restricting the language
of processes to that of session types, we are able to show that:
– Fair subtyping is coarser than should-testing and does not imply trace equivalence.
– Fair subtyping admits a complete axiomatization obtained from that of the original
subtyping by plugging in a simple auxiliary relation in just two strategic places.
– Fair subtyping can be decided in O(n⁴) time.
In the rest of the paper we formalize session types as an appropriate subset of CCS
(Section 2) and define fair subtyping as the relation that preserves session liveness in
every context (Definition 2.2). Then, we provide a coinductive characterization of fair
subtyping that unveils its properties (Section 3). The pre-congruence property is subtle
to characterize because fair subtyping is context sensitive (two session types may or may
not be related depending on the context in which they occur). For example, we have seen
that !buy.x ⊕ !pay ≰ !buy.x, and yet !buy.(!buy.x ⊕ !pay) ⊕ !pay ≤ !buy.!buy.x ⊕ !pay
even though the unrelated terms !buy.x ⊕ !pay and !buy.x occur in corresponding positions
in the latter pair of related session types. The coinductive characterization also paves
the way to the complete axiomatization of fair subtyping and to its decision algorithm
(Section 4). In turn, the axiomatization shows how to incrementally patch the original
subtyping for session types to ensure liveness preservation. A more detailed comparison
with related work is given in the conclusions (Section 5). Because of space constraints,
proofs of the results can only be found in the long version of the paper.

2 Syntax and Semantics of Session Types


We assume an infinite set V of variables x, y, . . . and an infinite set of messages a, b,
. . . ; we let X, Y , . . . range over subsets of V. Session types are defined by the grammar

T ::= end | x | ∑i∈I ?ai .Ti | ⊕i∈I !ai .Ti | μ x.T

where the set I is always finite and non-empty and choices are deterministic, in the
sense that ai = aj implies i = j for every i, j ∈ I.
The term end denotes the type of channels on which no further operations are pos-
sible. We will often omit trailing occurrences of end. A term ∑i∈I ?ai .Ti is the type of
a channel for receiving a message in the set {ai }i∈I . According to the received message ai , the channel must be used according to Ti afterwards. Terms ⊕i∈I !ai .Ti are
analogous, but they denote the type of channels that can be used for sending messages.
Note that output session types represent internal choices (the process using a chan-
nel with output type can choose any message in the set {ai }i∈I ) while input session

Table 1. Transition system of sessions

  [T-OUTPUT]               [T-CHOICE]                       [T-INPUT]
                           k ∈ I                            k ∈ I
  ─────────────            ─────────────────────────        ────────────────────────
  !a.T −!a→ T              ⊕i∈I !ai .Ti −τ→ !ak .Tk         ∑i∈I ?ai .Ti −?ak→ Tk

  [T-PAR]                  [T-COMM]
  M −ℓ→ M′                 M −α→ M′    N −ᾱ→ N′
  ──────────────────       ─────────────────────────
  M | N −ℓ→ M′ | N         M | N −τ→ M′ | N′

types are external choices (the process using a channel with input type must be ready to
deal with any message in the set {ai }i∈I ). We will sometimes use an infix notation for
choices, writing ?a1 .T1 + · · · + ?an .Tn and !a1 .T1 ⊕ · · · ⊕ !an .Tn instead of ∑1≤i≤n ?ai .Ti
and ⊕1≤i≤n !ai .Ti , respectively. Terms μ x.T and x are used for building recursive session
types, as usual. We assume that session types are contractive, namely that they do not
contain subterms of the form μ x1 · · · μ xn .x1 . The notions of free and bound variables are
standard and so are the definitions of open and closed session types. We take an equire-
cursive point of view and identify session types modulo renaming of bound variables
and folding/unfolding of recursions. That is, μ x.T = T {μ x.T /x} where T {S/x} is the
capture-avoiding substitution of every free occurrence of x in T with S. We say that T
and S are strongly equivalent, notation T ≈ S, if their infinite unfoldings are the same
regular tree [5].
Sessions M, N, ... are abstracted as parallel compositions of session types T, S, ...; their grammar is:
M ::= T | (M | M)
We define the operational semantics of sessions by means of a labeled transition system
mimicking the actions performed by processes that behave according to session types
(in fact, we are abstracting processes into types). The transition system makes use of actions α of the form ?a and !a, describing the input and output of a message a, and of labels ℓ that are either actions or the invisible move τ. The transition system is defined in Table 1.
Rules [ T- OUTPUT ], [ T- CHOICE ], and [ T- INPUT ] deal with prefixed terms. The first and last
ones are standard. Rule [T-CHOICE] states that a process behaving according to the type ⊕i∈I !ai.Ti may internally choose, through an invisible move τ, to send any message from the set {ai}i∈I. Rule [T-PAR] (and its symmetric, omitted) propagates labels across compositions, while [T-COMM] is the synchronization rule between complementary actions resulting in an invisible move (the complement of ?a is !a and the complement of !a is ?a; we write ᾱ for the complement of α).
We use ϕ, ψ, ... to range over strings of actions, ε to denote the empty string, and ≤ to denote the usual prefix order between strings. We write =⇒ for the reflexive, transitive closure of −τ→ and =α⇒ for the composition =⇒ −α→ =⇒. We extend this notation to strings of actions so that =α1···αn⇒ stands for the composition =α1⇒ ··· =αn⇒. We write T =α⇒ (respectively, T =ϕ⇒) if there exists S such that T =α⇒ S (respectively, T =ϕ⇒ S). We write T =ϕ⇏ if not T =ϕ⇒. We let tr(T) denote the set of traces of T, namely tr(T) =def {ϕ | T =ϕ⇒}.
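The grammar and trace machinery above can be made concrete with a small executable sketch. The encoding below is our own (names such as `unfold` and `traces` are not from the paper): session types are nested tuples, and traces are computed up to a bounded length, which is enough to reproduce the trace-difference calculations used later in the examples.

```python
# A small executable sketch (ours) of the session type grammar and of the
# bounded weak traces of tr(T). Session types are nested tuples:
# ('end',), ('var', 'x'), ('mu', 'x', T),
# ('in', {a: T, ...}) for external choice, ('out', {a: T, ...}) for internal.

def subst(t, x, s):
    """T{S/x}: substitute s for the free variable x in t."""
    tag = t[0]
    if tag == 'end':
        return t
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'mu':
        return t if t[1] == x else ('mu', t[1], subst(t[2], x, s))
    return (tag, {a: subst(u, x, s) for a, u in t[1].items()})

def unfold(t):
    """Equirecursive unfolding: mu x.T = T{mu x.T / x}."""
    while t[0] == 'mu':
        t = subst(t[2], t[1], t)
    return t

def traces(t, depth):
    """All (weak) traces of t of length <= depth, as tuples of actions."""
    t = unfold(t)
    result = {()}
    if depth == 0 or t[0] in ('end', 'var'):
        return result
    prefix = '!' if t[0] == 'out' else '?'
    for a, cont in t[1].items():
        result |= {(prefix + a,) + tr for tr in traces(cont, depth - 1)}
    return result

# T = mu x.(!a.x ⊕ !b) and S = mu x.!a.x, as in Example 2.2
T = ('mu', 'x', ('out', {'a': ('var', 'x'), 'b': ('end',)}))
S = ('mu', 'x', ('out', {'a': ('var', 'x')}))
diff = traces(T, 3) - traces(S, 3)   # the strings matching (!a)*!b, up to length 3
```

Here `diff` contains exactly the bounded instances of the regular expression (!a)*!b, the trace difference that the examples of Section 3 reason about.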
Fair Subtyping for Open Session Types 377

We say that a session M is successful if every computation starting from M can be extended to a state that emits the action !OK, where OK is a special message that we assume is not used for synchronization purposes. Formally:

Definition 2.1 (Success). We say that M is successful if M =⇒ N implies N =!OK⇒.
Example 2.1. Consider T = μ x.(!a.x ⊕ !b) and S = μ x.(?a.x + ?b.?c) from the intro-
duction, where for brevity we respectively use a, b, and c in place of buy, pay, and
vouch. Then T | S | !c.!OK is a successful session because, no matter how many a mes-
sages the first participant sends, it is always possible that a b message and a c mes-
sage are sent. At that point, the third participant emits !OK. The session !b | S | !c.!OK is
successful as well. By contrast, μx.!a.x | S | !c.!OK is unsuccessful even though the first two participants keep interacting with each other, because neither of them is willing to receive the c message sent by the third participant. □
We can now define fair subtyping as the relation that preserves session success. To make
sure that fair subtyping is a pre-congruence, we quantify over all possible contexts that
apply to the two session types being compared. Contexts, ranged over by C , are just
session types with exactly one hole [ ] in place of some subterm. We write C [T ] for
the session type obtained by filling the hole in C with T . This operation differs from
variable substitution in that C may capture variables occurring free in T .
Fair subtyping is defined thus:

Definition 2.2 (Fair Subtyping). We say that T is a fair subtype of S, written T ≼ S, if M | C[T] successful implies M | C[S] successful for every M and C.
To provide the intuition underlying Definition 2.2, suppose that a (well-typed) process accesses a session using a channel k of type S. By replacing k with another channel k′ of type T ≼ S we are in fact changing the session that the process accesses. However, the process is oblivious to the replacement, in the sense that it keeps behaving on k′ as if it were a channel of type S even though in reality k′ has type T. Now, suppose that the session accessed through k′ is described by M | T and that it is successful. The replacement changes the session to M | S. By the hypothesis T ≼ S, we have the assurance that M | S is also successful, therefore the substitution is safe.
Showing that T ≼ S is difficult in general, because of the universal quantification over sessions M and contexts C in Definition 2.2. Until we characterize precisely the properties of ≼ in Section 3, it is easier to find session types that are not related by fair subtyping.
Example 2.2. Consider T = μx.(!a.x ⊕ !b) and S = μx.!a.x and take M = μx.(?a.x + ?b.!OK). Note that M relies on receiving a b message to emit !OK. Then we have T ⋠ S because M | T is successful but M | S is not (no b message is ever sent). Now, for C = μx.[ ] we have T = C[!a.x ⊕ !b] and S = C[!a.x], therefore !a.x ⊕ !b ⋠ !a.x. □

3 Fair Subtyping
We begin our study of fair subtyping by recalling the traditional subtyping relation for
session types, which we dub “unfair subtyping”.

Definition 3.1 (Unfair Subtyping). We say that 𝒰 is a coinductive subtyping if T 𝒰 S implies either (1) T = S = x, or (2) T = S = end, or (3) T = ∑i∈I ?ai.Ti and S = ∑i∈I ?ai.Si and Ti 𝒰 Si for every i ∈ I, or (4) T = ⊕i∈I∪J !ai.Ti and S = ⊕i∈I !ai.Si and Ti 𝒰 Si for every i ∈ I. Unfair subtyping, denoted by ≼U, is the largest coinductive subtyping.

Clauses (1–2) state the reflexivity of ≼U for end and type variables, while clauses (3–4) respectively state invariance and contravariance of ≼U with respect to external and internal choices. There is no need for a clause dealing with recursive types, because type equality already accounts for their unfolding and types are contractive. Unfair subtyping is essentially the standard subtyping relation for session types presented in [6].1
The appeal of unfair subtyping comes from its simplicity and intuitive rationale. The key clause (4) states that the larger session type allows in general for fewer kinds of messages to be sent: when T ≼U S, a process behaving as S can be safely placed in a context where a process behaving as T is expected, because S expresses a more deterministic behavior compared to T. Reducing non-determinism is generally perceived as harmless, but sometimes it may compromise liveness.

Example 3.1. Consider the session types T = !a.x ⊕ !b and S = !a.x and the context C = μx.[ ]. Then both {(T, S), (x, x)} and {(C[T], C[S])} are coinductive subtyping relations, from which we deduce T ≼U S and C[T] ≼U C[S]. Yet Example 2.2 shows that neither C[T] ≼ C[S] nor T ≼ S holds. □

Unfair subtyping is a necessary but not sufficient condition for fair subtyping.

Theorem 3.1. ≼ ⊊ ≼U.

Note that T ≼U S implies tr(T) ⊇ tr(S) and that ≼U may compromise session success by letting S have "too few" traces compared to T (Example 3.1). Therefore, Theorem 3.1 suggests that fair subtyping should be characterized as a restriction of Definition 3.1 where we impose additional conditions on clause (4). The condition tr(T) ⊆
tr(S) is clearly sufficient but too strong: it imposes invariance of fair subtyping with
respect to outputs, collapsing fair subtyping to equality. Nonetheless, we will show that,
when T  S, there must be an “inevitable” pair of corresponding states at some “finite
distance” from T and S for which trace inclusion holds. We formalize this property say-
ing that T converges into S. The precise definition of convergence is subtle because T
and S may be open: we must be able to reason on the property of trace inclusion be-
tween corresponding states of T and S by considering the possibility that T and S occur
in a context that binds (some of) their free variables.
We begin by introducing a notation for referring to the residual state of a session type
after some sequence of actions.

Definition 3.2 (Continuation). Let α ∈ tr(T). The continuation of T after α is the session type S such that T =⇒ −α→ S (note that S is uniquely determined because branches in session types are guarded by distinct actions). We extend the notion of continuation to sequences of actions so that T(ε) = T and T(αϕ) = T(α)(ϕ) when αϕ ∈ tr(T).
1 In practice, subtyping can be relaxed so that it is covariant with respect to external choices [6].
This difference between unfair and standard subtyping does not affect our results.

Next, we define the traces of a session type leading to a free variable:

Definition 3.3 (X-traces). The X-traces of T, denoted by X-tr(T), are the traces of T leading to a variable in X. That is, X-tr(T) = {ϕ | ∃x ∈ X : T =ϕ⇒ x}.
Convergence is the relation ↓X;Y inductively defined by the rule:

    ∀ϕ ∈ (tr(T) \ tr(S)) ∪ Y-tr(S) : ∃ψ ≤ ϕ, a : T(ψ!a) ↓∅;X∪Y S(ψ!a)
    ―――――――――――――――――――――――――――――――――――――――――  (1)
                              T ↓X;Y S
where X and Y are two sets of so-called safe and dangerous variables. Because of its subtle definition, we explain convergence incrementally and, for the time being, we consider the particular instance when X = Y = ∅. In this case rule (1) reduces to

    ∀ϕ ∈ tr(T) \ tr(S) : ∃ψ ≤ ϕ, a : T(ψ!a) ↓ S(ψ!a)
    ――――――――――――――――――――――――――――  (2)
                      T ↓ S

and it is easy to observe that its base case corresponds to the condition tr(T) \ tr(S) = ∅, that is tr(T) ⊆ tr(S). Now, suppose T ↓ S and imagine some session M composed with either T or S whose aim is to tell T and S apart, in the sense that M succeeds
with either T or S whose aim is to tell T and S apart in the sense that M succeeds
(emitting !OK) as soon as it enters some trace of T that is not present in S. In order
to achieve its goal, M will try to drive the interaction with T along some path ϕ ∈
tr(T ) \ tr(S). Rule (2) says that after following some prefix ψ of ϕ that is shared by
both T and S, M encounters an internal choice having a branch (corresponding to some
action !a) that may divert the interaction to a new stage where the residual behaviors
of T and S (respectively T (ψ !a) and S(ψ !a)) have sets of traces that are slightly less
different. We say “slightly less different” because T (ψ !a) and S(ψ !a) are one step
closer to the top of the derivation of T ↓ S, whose leaves imply trace inclusion. Since convergence is defined inductively, this means that T and S are a finite number of steps away from the point where trace inclusion holds. In conclusion, when T ↓ S holds, it is impossible for M to solely rely on the traces in tr(T) \ tr(S) in order to succeed; M can always be veered into a stage of the interaction where (some corresponding states of) T and S are no longer distinguishable, as the traces of (such corresponding states of) T and S are the same.
Example 3.2. Take T = μx.(!a.x ⊕ !b) and S = μx.(!a.!a.x ⊕ !b) and observe that tr(T) \ tr(S) is the language of strings generated by the regular expression !a(!a!a)∗!b. Given an arbitrary string in tr(T) \ tr(S) we can take ψ = ε and we have T(!b) = S(!b) = end where end ↓ end, so we can conclude T ↓ S. □
Example 3.3. Consider again the session types T = μx.(!a.x ⊕ !b) and S = μx.!a.!a.x and recall that in Example 3.1 we showed T ≼U S. Let us try to build a derivation for T ↓ S. Note that tr(T) \ tr(S) is the language of strings generated by the regular expression (!a)∗!b. Taking ϕ ∈ tr(T) \ tr(S), we have that any prefix ψ of ϕ that is in tr(T) ∩ tr(S) has the form !a···!a, and now T(ψ!a) = T and S(ψ!a) = S. Therefore, in order to prove T ↓ S, we need a derivation for T ↓ S. Since convergence is an inductive relation, T ↓ S is not derivable, which agrees with the fact that these two session types are not related by fair subtyping. □

We can now turn our attention to the general definition of ↓X;Y, whose base case adds the condition Y-tr(S) = ∅ to the trace inclusion that we have discussed earlier. First of all, observe that the naive extension of ↓ with the axiom x ↓ x, whereby every variable x converges into itself, fails to yield a pre-congruence for recursion. For example, according to this extension we would have !a.x ⊕ !b ↓ !a.x and yet Example 3.3 shows that μx.(!a.x ⊕ !b) does not converge into μx.!a.x. It is the context in which a type variable x occurs that determines whether or not x converges into itself:
– If x only lies along traces that do not distinguish T from S then it is safe, in the sense that cycles created by contexts binding x do not allow sensing any difference between T and S.
– If x lies along a path that distinguishes T from S then it is dangerous, because cycles created by contexts binding x may enable such difference to be sensed.
In the general definition of convergence, the two sets X and Y respectively contain the variables that are assumed to be safe (but that may be found to be dangerous at some later stage while proving convergence) and the variables that are known to be dangerous. Whenever a trace that distinguishes T from S is discovered (ϕ ∈ tr(T) \ tr(S)), the safe variables become dangerous ones. The condition Y-tr(S) = ∅ then restricts the application of the axiom x ↓ x to safe variables.

Example 3.4. Take T = !a.(!a.x ⊕ !b.end) ⊕ !b.end and S = !a.!a.x ⊕ !b.end, which are obtained from the session types in Example 3.2 by possibly unfolding and then opening recursions. Note that the variable x is dangerous because it lies along the trace !a!a which goes through corresponding states of T and S which differ; indeed tr(T) \ tr(S) = {!a!b}. However, we can divert from the trace !a!b by taking ψ = ε, and now we have T(!b) = S(!b) = end where end ↓∅;{x} end, so T ↓{x};∅ S. □

Example 3.5. Let T = !a.(!a.x ⊕ !b.end) ⊕ !b.end and S = !a.!a.x and let us try to build a derivation for T ↓{x};∅ S. Note that tr(T) \ tr(S) = {!b, !a!b}. The prefixes of any ϕ ∈ tr(T) \ tr(S) that are in tr(T) ∩ tr(S) are either ε or !a. If we take the prefix ψ = !a, we have T(!a!a) = S(!a!a) = x, and x ↓∅;{x} x does not hold because x is dangerous. If we take the prefix ψ = ε, we have T(!a) = !a.x ⊕ !b.end and S(!a) = !a.x, and also in this case T(!a) ↓∅;{x} S(!a) fails by iterating a similar argument. Therefore T ↓{x};∅ S does not hold, which was expected since from Example 3.3 we knew that μx.T does not converge into μx.S. □

From now on, we will often write ↓X as an abbreviation for ↓X;∅. The key property of the set X of safe variables is formalized thus:

Lemma 3.1. T ↓X∪{x} S implies μx.T ↓X μx.S.

In words, the variables in X can be safely bound by a recursive context without compromising convergence. We now show the characterization of fair subtyping:

Definition 3.4 (Coinductive Fair Subtyping). We say that 𝓕 is a coinductive fair subtyping relation if (X, T, S) ∈ 𝓕 implies T ↓X S and either:
(1) T = S = x, or
(2) T = S = end, or

Table 2. Axiomatization of convergence and fair subtyping

[C-END]      end ⊩X;Y end
[C-VAR]      x ∉ Y  implies  x ⊩X;Y x
[C-REC]      T ⊩X∪{x};Y\{x} S  implies  μx.T ⊩X;Y μx.S
[C-INPUT]    Ti ⊩X;Y Si for all i ∈ I  implies  ∑i∈I ?ai.Ti ⊩X;Y ∑i∈I ?ai.Si
[C-OUTPUT1]  Ti ⊩X;Y Si for all i ∈ I  implies  ⊕i∈I !ai.Ti ⊩X;Y ⊕i∈I !ai.Si
[C-OUTPUT2]  Tk ⊩∅;X∪Y Sk for some k ∈ I  implies  ⊕i∈I∪J !ai.Ti ⊩X;Y ⊕i∈I !ai.Si
[F-END]      end ⊩F end
[F-VAR]      x ⊩F x
[F-REC]      T ⊩F S and T ⊩{x};∅ S  implies  μx.T ⊩F μx.S
[F-INPUT]    Ti ⊩F Si for all i ∈ I  implies  ∑i∈I ?ai.Ti ⊩F ∑i∈I ?ai.Si
[F-OUTPUT]   Ti ⊩F Si for all i ∈ I  implies  ⊕i∈I∪J !ai.Ti ⊩F ⊕i∈I !ai.Si
[A-SUBT]     T ⊩F S and T ⊩V;∅ S  implies  T ⊩A S

(3) T = ∑i∈I ?ai.Ti and S = ∑i∈I ?ai.Si and (∅, Ti, Si) ∈ 𝓕 for every i ∈ I, or
(4) T = ⊕i∈I∪J !ai.Ti and S = ⊕i∈I !ai.Si and (∅, Ti, Si) ∈ 𝓕 for every i ∈ I.

We write T ≼X S if (X, T, S) ∈ 𝓕 for some coinductive fair subtyping 𝓕.

Theorem 3.2. ≼ = ≼V.
Structurally, Definition 3.4 and Definition 3.1 are very similar. The key difference between T ≼U S and T ≼X S is that in the latter T ↓X S must also hold. Note that, when checking that the continuations Ti and Si are related, the set of safe variables is emptied. This twist is motivated by the fact that applying a context μx.[ ] around T creates a cycle that necessarily goes through the initial state of T, while no context applied to T can create loops "within" T. This property makes ≼ context sensitive: consider for example the session types T = !a.x ⊕ !b and S = !a.x and the context C = !a.[ ] ⊕ !b, and note that C does not bind the variable x. Now we have C[T] ↓{x} C[S] while T ↓{x} S does not hold. Therefore C[T] ≼ C[S] even if T ⋠ S.

4 Axioms and Algorithms


The characterization developed in Section 3 allows us to axiomatize fair subtyping.
The axiomatization, besides being the first for a liveness-preserving refinement pre-
congruence, shows how to guarantee liveness preservation by patching unfair subtyping
and, more generally, the traditional subtyping for session types [6] with appropriate
conditions in just two strategic places.
Table 2 defines three inductive relations ⊩X;Y (rules [C-*]), ⊩F (rules [F-*]), and ⊩A (rule [A-SUBT]), which will be shown to coincide with ↓X;Y, ≼∅, and ≼V respectively. The axiomatization of ⊩F largely follows from clauses (1–4) of Definition 3.4, except that recursive session types are treated explicitly by rule [F-REC], which requires the condition

T ⊩{x};∅ S that verifies whether it is safe to close T and S with the context μx.[ ] (see Lemma 3.1). Fair subtyping is defined by [A-SUBT], which is basically Theorem 3.2 in the form of an inference rule. The axiomatization of convergence includes a core set of rules where [C-END], [C-INPUT], and [C-OUTPUT1] enforce trace inclusion (condition tr(T) ⊆ tr(S) in (1)) and rule [C-VAR] checks that x is not a dangerous variable (condition Y-tr(S) = ∅ in (1)). Rule [C-OUTPUT2] deals with the case in which the larger session type provides strictly fewer choices with respect to the smaller one and corresponds to the "existential part" of rule (1). In this case, there must be a common branch (k ∈ I) such that the corresponding continuations are in the convergence relation where all the safe variables have become dangerous ones (Tk ⊩∅;X∪Y Sk). Finally, rule [C-REC] deals with recursive contexts μx.[ ] by recording x as a safe variable.

Theorem 4.1 (Correctness). ⊩A ⊆ ≼.

The presented axiomatization is complete when session types have recursive terms binding the same variable in corresponding positions. We do not regard this as a limitation, though, because when T ≼ S it is always possible to find T′ and S′ that are strongly equivalent to T and S for which this property holds. For example, it is not possible to derive μx.!a.x ⊩A μx.!a.!a.x using the rules in Table 2, but μx.!a.x ⊩A μx.!a.x ≈ μx.!a.!a.x. On the contrary, making this assumption allows us to focus on the interesting aspects of the axiomatization by leaving out some well-understood technicalities [2].

Theorem 4.2 (Completeness). Let T ≼ S. Then T ≈ T′ ⊩A S′ ≈ S for some T′ and S′.

We briefly discuss an algorithm for deciding fair subtyping based on its axiomatization. The only two rules in Table 2 that are not syntax directed are [C-OUTPUT1] and [C-OUTPUT2] when J \ I = ∅, because the sets of variables may or may not change when going from the conclusion to the premise of these rules. A naive algorithm would have to backtrack in case the wrong choice is made, leading to exponential complexity. Table 3 presents an alternative set of syntax-directed rules for convergence. Space constraints prevent us from describing them in detail, but the guiding principle of these rules is simple: a judgment T #X S ⇝ Y synthesizes, whenever possible, the smallest subset Y of X such that T ⊩Y;X\Y S holds. This way, in [AC-OUTPUT1] and [AC-OUTPUT2] the index set X does not change from the conclusion to the premises, so the algorithm can just recur and then verify whether Tk #X Sk ⇝ ∅ for some branch k ∈ I: if this is the case, then [AC-OUTPUT2] applies; if not and J \ I = ∅, then [AC-OUTPUT1] applies; otherwise, the algorithm fails. This new set of rules is sound and complete:

Theorem 4.3. The following properties hold:
1. T #Y S ⇝ X implies T ⊩X;Y\X S;
2. T ⊩X;Y S implies T #X∪Y S ⇝ Z and Z ⊆ X.

Regarding the complexity of the proposed algorithm, observe that convergence can be decided in linear time using the rules in Table 3 and that, in Table 2, only [F-REC] and [A-SUBT] duplicate work. Moreover, rule [A-SUBT] is needed only once for each derivation. Therefore, the algorithm for fair subtyping is quadratic in the size of the proof tree for

Table 3. Algorithmic rules for convergence

[AC-END]      end #X end ⇝ ∅
[AC-VAR]      x #X x ⇝ {x} ∩ X
[AC-REC]      T #X∪{x} S ⇝ Y  implies  μx.T #X μx.S ⇝ Y \ {x}
[AC-INPUT]    Ti #X Si ⇝ Xi for all i ∈ I  implies  ∑i∈I ?ai.Ti #X ∑i∈I ?ai.Si ⇝ ⋂i∈I Xi
[AC-OUTPUT1]  Ti #X Si ⇝ Xi with Xi ≠ ∅ for all i ∈ I  implies  ⊕i∈I !ai.Ti #X ⊕i∈I !ai.Si ⇝ ⋂i∈I Xi
[AC-OUTPUT2]  Tk #X Sk ⇝ ∅ for some k ∈ I  implies  ⊕i∈I∪J !ai.Ti #X ⊕i∈I !ai.Si ⇝ ∅

T′ ⊩A S′, which is the same as ‖S′‖ (the number of distinct subtrees of S′). Since ‖S′‖ in the (constructive) proof of Theorem 4.2 is bounded by ‖T‖ · ‖S‖, the overall complexity for deciding T ≼ S is O(n⁴) where n = max{‖T‖, ‖S‖}.
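The syntax-directed reading of Table 3 can be sketched as a recursive function. The code below is our own reconstruction (the function name `conv` and the tuple encoding of session types are hypothetical) and is illustrative only: it assumes that T and S bind the same variables in corresponding positions, as in Section 4, and returns the synthesized subset of X, or None when no derivation exists.

```python
# Our reconstruction of the synthesis judgment T #X S ⇝ Y of Table 3.
# Session types are tuples: ('end',), ('var','x'), ('mu','x',T),
# ('in', {a: T, ...}), ('out', {a: T, ...}).

def conv(t, s, x_set):
    if t[0] == 'end' and s[0] == 'end':                   # [AC-END]
        return set()
    if t[0] == 'var' and s[0] == 'var' and t[1] == s[1]:  # [AC-VAR]
        return {t[1]} & x_set
    if t[0] == 'mu' and s[0] == 'mu' and t[1] == s[1]:    # [AC-REC]
        r = conv(t[2], s[2], x_set | {t[1]})
        return None if r is None else r - {t[1]}
    if t[0] == 'in' and s[0] == 'in' and set(t[1]) == set(s[1]):
        rs = [conv(t[1][a], s[1][a], x_set) for a in t[1]]
        if any(r is None for r in rs):                    # [AC-INPUT]
            return None
        out = set(x_set)
        for r in rs:
            out &= r
        return out
    if t[0] == 'out' and s[0] == 'out' and set(t[1]) >= set(s[1]):
        rs = {a: conv(t[1][a], s[1][a], x_set) for a in s[1]}
        if any(r == set() for r in rs.values()):          # [AC-OUTPUT2]
            return set()
        if set(t[1]) == set(s[1]) and all(rs.values()):   # [AC-OUTPUT1]
            out = set(x_set)
            for r in rs.values():
                out &= r
            return out
    return None

# Examples 3.2/3.3 (closed) and 3.4/3.5 (open) from the text
T32 = ('mu', 'x', ('out', {'a': ('var', 'x'), 'b': ('end',)}))
S32 = ('mu', 'x', ('out', {'a': ('out', {'a': ('var', 'x')}), 'b': ('end',)}))
S33 = ('mu', 'x', ('out', {'a': ('out', {'a': ('var', 'x')})}))
T34 = ('out', {'a': ('out', {'a': ('var', 'x'), 'b': ('end',)}), 'b': ('end',)})
S34 = ('out', {'a': ('out', {'a': ('var', 'x')}), 'b': ('end',)})
S35 = ('out', {'a': ('out', {'a': ('var', 'x')})})
```

On the pair of Example 3.2 the function synthesizes the empty set (convergence holds), on Example 3.3 it fails; on the open pair of Example 3.4 it synthesizes the empty set, the strongest judgment, while for Example 3.5 no derivation exists, matching the discussion above.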

5 Conclusion and Related Work


Subtyping is ubiquitous in session type systems, even when it is not explicitly men-
tioned in the type rules, and if the liveness of sessions is a major concern, the conven-
tional subtyping relation for session types is inadequate. In this paper we have defined
and characterized a liveness-preserving subtyping relation for possibly open session
types that is a pre-congruence for all the operators of the type language.
Fair subtyping shares with the should-testing pre-congruence [12] its semantic def-
inition (Definition 2.2), but differs in that should-testing is defined for a richer lan-
guage of processes. In particular, our type language disallows parallel compositions
inside T terms, while in [12] parallel composition can occur anywhere. This feature
increases the discriminating power of tests in [12] and ultimately implies that tr(T) = tr(S) when T and S are related by should-testing. To see why, consider T = !a ⊕ !b and S = !a and the context C = μx.((?a.x + ?b.!done) | [ ]). The intuition is that the term C[T] "restarts" T if T chooses to emit a and terminates if T chooses to emit b. So, ?done.!OK | C[T] is successful while ?done.!OK | C[S] is not, showing that T and S are not related. In our context, the ability to restart a process an arbitrary number of times is too powerful: session types are associated with linear channels which have exactly one owner at any given point in
time. Also, the type of the channel reduces as the channel is used and the type system
forbids "jumps" to previous stages of the protocol described by the session type, unless the session type itself allows doing so by means of an explicit recursion. If the condition tr(T) ⊆ tr(S) were to hold also in our context, fair subtyping would collapse to
equality (modulo folding/unfolding of recursions) and would therefore lack any interest
whatsoever. We conjecture that the restriction of should-testing to finite-state processes
(where parallel composition is forbidden underneath recursions) shares the same trace-
related properties (and possibly the behavioral characterization) of fair subtyping.
Fair subtyping was introduced for the first time in [10], compared to which this
work presents major differences. First of all, the subtyping relation in [10] is defined
for closed session types only and essentially coincides with the should-testing relation
in [12]. Previous works [9,12] have shown that the extension of liveness-preserving

refinements to open terms is a challenging task with substantial impact on the proper-
ties of such refinements (this extension was left as an open problem in [9] and it was
discovered to induce trace equivalence in [12]). By contrast, fair subtyping for open
session types does not induce trace equivalence and turns out to be an original liveness-
preserving pre-congruence that is not investigated elsewhere. Moreover, in the present
work we purposefully adopt a notion of “session correctness” (Definition 2.1) that is
weaker (i.e. more general) than the analogous notion in [10]. Since fair subtyping is
defined as the relation that preserves success (Definition 2.2), the net effect is that the
results presented here apply to all session type theories based on stronger notions of ses-
sion correctness. Technically, the consequence is that all session types are inhabited and
therefore the technique described in [10] based on session type difference is no longer
applicable. By contrast, here we are able to give a direct definition of convergence with
no need for auxiliary operators or notions of type emptiness.

Acknowledgments. This work was partially supported by a visiting professor position of the Université Paris Diderot, by MIUR PRIN 2010-2011 CINA, and by Ateneo/CSP
Project SALT. The author is grateful to the anonymous referees for their comments and
to Daniele Varacca and Viviana Bono for discussions on the topics of the paper.

References
1. Barbanera, F., de’Liguoro, U.: Two notions of sub-behaviour for session-based client/server
systems. In: Kutsia, T., Schreiner, W., Fernández, M. (eds.) PPDP 2010, pp. 155–164. ACM
(2010)
2. Brandt, M., Henglein, F.: Coinductive axiomatization of recursive type equality and subtyp-
ing. Fundamenta Informaticae 33(4), 309–338 (1998)
3. Castagna, G., Dezani-Ciancaglini, M., Giachino, E., Padovani, L.: Foundations of session
types. In: Porto, A., López-Fraguas, F.J. (eds.) PPDP 2009, pp. 219–230. ACM (2009)
4. Castagna, G., De Nicola, R., Varacca, D.: Semantic subtyping for the pi-calculus. Theoretical
Computer Science 398(1-3), 217–242 (2008)
5. Courcelle, B.: Fundamental properties of infinite trees. Theoretical Computer Science 25,
95–169 (1983)
6. Gay, S., Hole, M.: Subtyping for session types in the π -calculus. Acta Informatica 42(2-3),
191–225 (2005)
7. Honda, K.: Types for dyadic interaction. In: Best, E. (ed.) CONCUR 1993. LNCS, vol. 715,
pp. 509–523. Springer, Heidelberg (1993)
8. Honda, K., Vasconcelos, V.T., Kubo, M.: Language primitives and type disciplines for
structured communication-based programming. In: Hankin, C. (ed.) ESOP 1998. LNCS,
vol. 1381, pp. 122–138. Springer, Heidelberg (1998)
9. Natarajan, V., Cleaveland, R.: Divergence and fair testing. In: Fülöp, Z. (ed.) ICALP 1995.
LNCS, vol. 944, pp. 648–659. Springer, Heidelberg (1995)
10. Padovani, L.: Fair Subtyping for Multi-Party Session Types. In: De Meuter, W., Roman,
G.-C. (eds.) COORDINATION 2011. LNCS, vol. 6721, pp. 127–141. Springer, Heidelberg
(2011)
11. Pierce, B., Sangiorgi, D.: Typing and subtyping for mobile processes. Mathematical Struc-
tures in Computer Science 6(5), 409–453 (1996)
12. Rensink, A., Vogler, W.: Fair testing. Information and Computation 205(2), 125–198 (2007)
Coeffects: Unified Static Analysis
of Context-Dependence

Tomas Petricek, Dominic Orchard, and Alan Mycroft

University of Cambridge, UK
{tp322,dao29,am}@cl.cam.ac.uk

Abstract. Monadic effect systems provide a unified way of tracking effects of computations, but there is no unified mechanism for tracking
how computations rely on the environment in which they are executed.
This is becoming an important problem for modern software – we need
to track where distributed computations run, which resources a program
uses and how they use other capabilities of the environment.
We consider three examples of context-dependence analysis: liveness
analysis, tracking the use of implicit parameters (similar to tracking of
resource usage in distributed computation), and calculating caching re-
quirements for dataflow programs. Informed by these cases, we present a
unified calculus for tracking context dependence in functional languages
together with a categorical semantics based on indexed comonads. We
believe that indexed comonads are the right foundation for construct-
ing context-aware languages and type systems and that following an
approach akin to monads can lead to a widespread use of the concept.

Modern applications run in diverse environments – such as mobile phones or the cloud – that provide additional resources and meta-data about provenance and
security. For correct execution of such programs, it is often more important to
understand how they depend on the environment than how they affect it.
Understanding how programs affect their environment is a well studied area:
effect systems [13] provide a static analysis of effects and monads [8] provide a
unified semantics to different notions of effect. Wadler and Thiemann unify the
two approaches [17], indexing a monad with effect information, and showing that
the propagation of effects in an effect system matches the semantic propagation
of effects in the monadic approach.
No such unified mechanism exists for tracking the context requirements. We
use the term coeffect for such contextual program properties. Notions of context
have been previously captured using comonads [14] (the dual of monads) and by
languages derived from modal logic [12,9], but these approaches do not capture
many useful examples which motivate our work. We build mainly on the former
comonadic direction (§3) and discuss the modal logic approach later (§5).
We extend a simply-typed lambda calculus with a coeffect system based on
comonads, replicating the successful approach of effect systems and monads.

Examples of Coeffects. We present three examples that do not fit the traditional approach of effect systems and have not been considered using the modal

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 385–397, 2013.
© Springer-Verlag Berlin Heidelberg 2013
386 T. Petricek, D. Orchard, and A. Mycroft

logic perspective, but can be captured as coeffect systems (§1) – the tracking of
implicit dynamically-scoped parameters (or resources), analysis of variable live-
ness, and tracking the number of required past values in dataflow computations.
Coeffect Calculus. Informed by the examples, we identify a general algebraic
structure for coeffects. From this, we define a general coeffect calculus that unifies
the motivating examples (§2) and discuss its syntactic properties (§4).
Indexed Comonads. Our categorical semantics (§3) extends the work of
Uustalu and Vene [14]. By adding annotations, we generalize comonads to in-
dexed comonads, which capture notions of computation not captured by ordinary
comonads.

1 Motivation
Effect systems, introduced by Gifford and Lucassen [5], track effects of computations, such as memory access or message-based communication [6]. Their approach augments typing judgments with effect information: Γ ⊢ e : τ, F. In Moggi's semantics, well-typed terms Γ ⊢ e : τ are mapped to morphisms ⟦Γ⟧ → M⟦τ⟧ where M encodes effects and has the structure of a monad [8]. Wadler and Thiemann annotate monads with effect information, written M^F [17].
In contrast to the analysis of effects, our analysis of context-dependence dif-
fers in the treatment of lambda abstraction. Wadler and Thiemann explain that
“in the rule for abstraction, the effect is empty because evaluation immediately
returns the function, with no side effects. The effect on the function arrow is
the same as the effect for the function body, because applying the function will
have the same side effects as evaluating the body” [17]. We instead consider sys-
tems where λ-abstraction places requirements on both the call-site (latent re-
quirements) and declaration-site (immediate requirements), resulting in different
program properties. We informally discuss three examples that demonstrate how
contextual requirements propagate. Section 2 unifies these in a single calculus.
We write coeffect judgements C^s Γ ⊢ e : τ, where the coeffect annotation s associates context requirements with the free-variable context Γ. Function types have the form C^s τ1 → τ2, associating latent coeffects s with the parameter. The C^s Γ syntax and C^s τ types are a result of the indexed comonadic semantics (§3).
Implicit Parameters and Resources. Implicit parameters [7] are
dynamically-scoped variables. They can be used to parameterize a computation
without propagating arguments explicitly through a chain of calls and are part of
the context in which expressions evaluate. As correctly expected [7], they can be
modelled by comonads. Rebindable resources in distributed computations (e.g.,
a local clock) follow a similar pattern, but we discuss implicit parameters for
simplicity.
The following function prints a number using implicit parameters ?culture
(determining the decimal mark) and ?format (the number of decimal places):

λn.printNumber n ?culture ?format
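The dynamic scoping underlying implicit parameters can be sketched in Python (our illustration; `binding`, `lookup`, and `print_number` are hypothetical names standing in for `printNumber`, `?culture`, and `?format`, not part of the calculus):

```python
from contextlib import contextmanager

# A stack of dynamically-scoped bindings; implicit parameters such as
# ?culture and ?format are modelled as string keys.
_dynamic = [{}]

def lookup(name):
    # Search the innermost scope first, as in dynamic scoping.
    for scope in reversed(_dynamic):
        if name in scope:
            return scope[name]
    raise KeyError(f"unbound implicit parameter ?{name}")

@contextmanager
def binding(**params):
    # Rebind some implicit parameters for the dynamic extent of a block.
    _dynamic.append(params)
    try:
        yield
    finally:
        _dynamic.pop()

def print_number(n):
    # Hypothetical printNumber from the text.
    fmt = lookup("format")        # ?format: number of decimal places
    mark = lookup("culture")      # ?culture: decimal mark
    return f"{n:.{fmt}f}".replace(".", mark)

with binding(culture=",", format=2):
    assert print_number(3.14159) == "3,14"
```

Note that the callee picks up whichever bindings are in scope at the call, without any argument threading, which is exactly the contextual behaviour the coeffect system tracks.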


Coeffects: Unified Static Analysis of Context-Dependence 387

          x : τ ∈ Γ
  (var)   ──────────────
          C^∅ Γ ⊢ x : τ

  (access) ──────────────────
           C^{?a} Γ ⊢ ?a : ρ

          C^r Γ ⊢ e1 : C^t τ1 → τ2      C^s Γ ⊢ e2 : τ1
  (app)   ──────────────────────────────────────────────
          C^{r∪s∪t} Γ ⊢ e1 e2 : τ2

          C^{r∪s} (Γ, x : τ1) ⊢ e : τ2
  (abs)   ─────────────────────────────
          C^r Γ ⊢ λx.e : C^s τ1 → τ2

Fig. 1. Selected coeffect rules for implicit parameters

Figure 1 shows a type-and-coeffect system tracking the set of an expression’s
implicit parameters. For simplicity here, all implicit parameters have type ρ.
Context requirements are created in (access), while (var ) requires no implicit
parameters; (app) combines requirements of both sub-expressions as well as the
latent requirements of the function.
The (abs) rule is where the example differs from effect systems. Function
bodies can access the union of the parameters (or resources) available at the
declaration-site (C^r Γ) and at the call-site (C^s τ1). Two of the nine permissible
judgements for the above example are:

  C^∅ Γ ⊢ (. . .) : C^{?culture,?format} int → string

  C^{?culture,?format} Γ ⊢ (. . .) : C^{?format} int → string

The coeffect system infers multiple, i.e. non-principal, coeffects for functions.
Different judgments are desirable depending on how a function is used. In the
first case, both parameters have to be provided by the caller. In the second,
both are available at declaration-site, but ?format may be rebound (the precise
meaning is provided by the semantics, discussed in §3).
Implicit parameters can be captured by the reader monad, where parameters
are associated with the function codomain M^∅ (int → M^{?culture,?format} string),
modelling only the first case. Whilst the reader monad can be extended to model
rebinding, the next example cannot be structured by any monad.

Liveness Analysis. Liveness analysis detects whether a free variable of an
expression may be used (live) or whether it is definitely not used (dead). A
compiler can remove bindings to dead variables as the result is never used.
We start with a restricted analysis and briefly mention how to make it prac-
tical later (§5). The restricted form is interesting theoretically as it gives rise to
the indexed partiality comonad (§3), which is a basic but instructive example.
The coeffect system in Fig. 2 detects whether all free variables are dead (C^D Γ)
or whether at least one variable is live (C^L Γ). Variable use (var) is annotated
with L and constants with D, i.e., if c ∈ N then C^D Γ ⊢ c : int. A dead context
may be marked as live by letting D ⊑ L and adding sub-coeffecting (§2).

          x : τ ∈ Γ
  (var)   ──────────────
          C^L Γ ⊢ x : τ

          C^r Γ ⊢ e1 : C^t τ1 → τ2      C^s Γ ⊢ e2 : τ1
  (app)   ──────────────────────────────────────────────
          C^{r⊔(s⊓t)} Γ ⊢ e1 e2 : τ2

Fig. 2. Selected coeffect rules for liveness analysis


388 T. Petricek, D. Orchard, and A. Mycroft

          x : τ ∈ Γ
  (var)   ──────────────
          C^0 Γ ⊢ x : τ

          C^m Γ ⊢ e1 : C^p τ1 → τ2      C^n Γ ⊢ e2 : τ1
  (app)   ──────────────────────────────────────────────
          C^{max(m,n+p)} Γ ⊢ e1 e2 : τ2

          C^n Γ ⊢ e : τ
  (prev)  ─────────────────────
          C^{n+1} Γ ⊢ prev e : τ

          C^{min(m,n)} (Γ, x : τ1) ⊢ e : τ2
  (abs)   ──────────────────────────────────
          C^m Γ ⊢ λx.e : C^n τ1 → τ2

Fig. 3. Selected coeffect rules for causal data flow

The (app) rule can be understood by discussing its semantics. Consider semantic functions f, g, h annotated by r, s, t respectively. The sequential composition
g ◦ f is live in its parameter only when both f and g are live. In the coeffect
semantics, f is not evaluated if g ignores its parameter (regardless of evaluation
order). Thus, g ◦ f is annotated by the conjunction r ⊓ s (where L ⊓ L = L). A pointwise composition of g and h, passing the same parameter to both, is live in its
parameter if either g or h is live (i.e., the disjunction s ⊔ t). Application uses both
compositions, thus Γ is live if it is needed by e1 or by the function and by e2.
An (abs) rule (not shown) compatible with the structure in Fig. 1 combines
the context annotations using ⊓. Thus, if the body uses some variables, both the
function argument and the context of the declaration-site are marked as live.
The coeffect system thus provides a call-by-name-style semantics, where redundant computations are omitted. Liveness cannot be modelled using monads
with denotations τ1 → M^r τ2. In call-by-value languages, the argument τ1 is always evaluated. Using indexed comonads (§3), we model liveness as a morphism
C^r τ1 → τ2, where C^r is the parametric type Maybe τ = τ + 1 (which contains a
value of type τ when r = L and no value when r = D).

Efficient Dataflow. Dataflow languages (e.g., Lucid [16]) declaratively describe computations over streams. In causal data flow, programs may access
past values. In this setting, a function τ1 → τ2 becomes a function from a list of
historical values [τ1] → τ2. A coeffect system here tracks how many past values
to cache.
Figure 3 annotates contexts with an integer specifying the maximum number
of required past values. The current value is always present, so (var ) is annotated
with 0. The expression prev e gets the previous value of stream e and requires
one additional past value (prev ); e.g. prev (prev e) requires 2 past values.
The (app) rule follows the same intuition as for liveness. Sequential composi-
tion adds the tags (the first function needs n + p past values to produce p past
inputs for the second function); passing the context to two subcomputations
requires the maximum number of elements required by the two subcomputations. The (abs) rule for dataflow needs a distinct operator, min: the declaration-site
and the call-site must each provide at least the number of past values required
by the function body (as the body may use variables coming from the
declaration-site as well as the argument).
Soundness follows from our categorical model (§3). Uustalu and Vene model
causal dataflow by a non-empty list comonad NeList τ = τ × (NeList τ + 1) [14].
However, this model leads to (inefficient) unbounded lists of past elements.

The coeffect system above infers a (sound) over-approximation of the number of
required past elements and so fixed-length lists may be used instead.

2 Generalized Coeffect Calculus

The previous three examples exhibit a number of commonalities. We capture
these in the coeffect calculus. We deliberately do not over-restrict the calculus,
so as to allow for notions of context-dependent computation not discussed above.
The syntax of our calculus is that of the simply-typed lambda calculus (where
v ranges over variables, T over base types, and r over coeffect annotations):

  e ::= v | λv.e | e1 e2        τ ::= T | τ1 → τ2 | C^r τ

The type C^r τ captures values of type τ in a context specified by the annotation
r. This type appears only on the left-hand side of a function arrow C^r τ1 → τ2. In
the semantics, C^r corresponds to some data type (e.g., List or Maybe). Extensions
such as explicit let-binding are discussed later (§4).
such as explicit let -binding are discussed later (§4).
The coeffect tags r, demonstrated in the previous section, can be
generalized to a structure with three binary operators and a distinguished element.

Definition 1. A coeffect algebra (S, ⊕, ∨, ∧, e) is a set S with an element e ∈ S,
a semi-lattice (S, ∨), a monoid (S, ⊕, e), and a binary operation ∧. That is, ∀r, s, t ∈ S:

  r ⊕ (s ⊕ t) = (r ⊕ s) ⊕ t      e ⊕ r = r = r ⊕ e                    (monoid)

  r ∨ s = s ∨ r      r ∨ (s ∨ t) = (r ∨ s) ∨ t      r ∨ r = r    (semi-lattice)

The generalized coeffect calculus captures the three motivating examples (§1),
where some operators of the coeffect algebra may coincide.
The ⊕ operator represents sequential composition; guided by the categorical
model (§3), we require it to form a monoid with e. The operator ∨ corresponds to
merging of context requirements in pointwise composition and the semi-lattice
(S, ∨) defines a partial order: r ≤ s when r ∨ s = s. This ordering implies a
sub-coeffecting rule. The coeffect e is often the top or bottom of the lattice.
The ∧ operator corresponds to splitting requirements of a function body be-
tween the call- and definition-site. This operator is unrestricted in the general
system, though it has additional properties in some coeffects systems, e.g., semi-
lattice structure on ∧. Possibly these laws should hold for all coeffect systems,
but we start with as few laws as possible to avoid limiting possible uses of the
calculus. We consider constrained variants with useful properties later (§4).

Implicit parameters use sets of names S = P(Id) as tags, with union ∪ for all
three operators. Variable use is annotated with e = ∅ and ≤ is subset ordering.

Liveness uses a two-point lattice S = {D, L} where D ⊑ L. Variables are annotated with the top element e = L and constants with the bottom D. The ∨ operation
is join (⊔), and both ∧ and ⊕ are meet (⊓).

          x : τ ∈ Γ
  (var)   ──────────────
          C^e Γ ⊢ x : τ

          C^r Γ ⊢ e1 : C^t τ1 → τ2      C^s Γ ⊢ e2 : τ1
  (app)   ──────────────────────────────────────────────
          C^{r∨(s⊕t)} Γ ⊢ e1 e2 : τ2

          C^s Γ ⊢ e : τ
  (sub)   ──────────────  (s ≤ r)
          C^r Γ ⊢ e : τ

          C^{r∧s} (Γ, x : τ1) ⊢ e : τ2
  (abs)   ─────────────────────────────
          C^r Γ ⊢ λx.e : C^s τ1 → τ2

Fig. 4. Type and coeffect system for the coeffect calculus

Dataflow tags are natural numbers S = N and the operations ∨, ∧ and ⊕ correspond
to max, min and +, respectively. Variable use is annotated with e = 0 and the
order ≤ is the standard ordering of natural numbers.
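The three instantiations can be collected as concrete records (a sketch in Python; the field names `seq`, `join`, `split`, and `unit` are our encoding of ⊕, ∨, ∧, and e, assuming nothing beyond Definition 1):

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class CoeffectAlgebra:
    seq: Callable[[Any, Any], Any]    # ⊕, sequential composition (monoid)
    join: Callable[[Any, Any], Any]   # ∨, pointwise merge (semi-lattice)
    split: Callable[[Any, Any], Any]  # ∧, call-/declaration-site split
    unit: Any                         # e, the annotation of variable use

# Implicit parameters: sets of names; all three operators are union, e = ∅.
implicit = CoeffectAlgebra(
    seq=lambda r, s: r | s, join=lambda r, s: r | s,
    split=lambda r, s: r | s, unit=frozenset())

# Liveness: two-point lattice D ⊑ L, encoded as False ⊑ True;
# ∨ is join (or), while ⊕ and ∧ are both meet (and); e = L.
liveness = CoeffectAlgebra(
    seq=lambda r, s: r and s, join=lambda r, s: r or s,
    split=lambda r, s: r and s, unit=True)

# Dataflow: natural numbers with ⊕ = +, ∨ = max, ∧ = min, e = 0.
dataflow = CoeffectAlgebra(seq=lambda r, s: r + s, join=max,
                           split=min, unit=0)

# Spot-check the Definition 1 laws on the dataflow instance.
assert dataflow.seq(dataflow.unit, 3) == 3 == dataflow.seq(3, dataflow.unit)
assert dataflow.join(2, dataflow.join(3, 5)) == dataflow.join(dataflow.join(2, 3), 5)
assert dataflow.join(4, 4) == 4
```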
Coeffect Typing Rules. Figure 4 shows the rules of the coeffect calculus,
given some coeffect algebra (S, ⊕, ∨, ∧, e). The context required by a variable
(var ) is annotated with e. The sub-coeffecting rule (sub) allows the contextual
requirements of an expression to be generalized.
The (abs) rule checks the body of the function in a context annotated r ∧ s, which is
a combination of the coeffects available in the context r where the function is
defined and in a context s provided by the caller of the function. Note that
none of the judgements create a value of type C^r τ. This type appears only
immediately to the left of an arrow C^r τ1 → τ2.
In function application (app), context requirements of both expressions and
the function are combined as previously: the pointwise composition ∨ is used to
combine the coeffect r of the expression representing a function and the coeffects
of the argument, sequentially composed with the coeffects of the function: s ⊕ t.
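As a concrete illustration (a hedged sketch of ours, not the paper's algorithm), the rules of Fig. 4 instantiated with the dataflow algebra, together with the (prev) rule of Fig. 3, can be turned into a tiny coeffect synthesiser over terms encoded as nested tuples:

```python
# Dataflow instance of the coeffect rules: ⊕ = +, ∨ = max, ∧ = min, e = 0.
# The 'app' node carries the latent coeffect t of the function's type,
# which a full checker would read off the type of e1.
def coeffect(term):
    tag = term[0]
    if tag == "var":                 # (var): variable use needs e = 0
        return 0
    if tag == "prev":                # (prev): one more past value
        return 1 + coeffect(term[1])
    if tag == "abs":                 # (abs): one permissible split places the
        return coeffect(term[1])     # body's requirement k on both sites,
                                     # since min(k, k) = k
    if tag == "app":                 # (app): r ∨ (s ⊕ t) = max(r, s + t)
        _, e1, e2, t = term
        return max(coeffect(e1), coeffect(e2) + t)
    raise ValueError(f"unknown term tag {tag}")

# prev (prev x) requires two past values, as in the text:
assert coeffect(("prev", ("prev", ("var", "x")))) == 2
```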
For space reasons, we omit recursion. We note that this would require adding
coeffect variables and extending the coeffect algebra with a fixed-point operation.

3 Coeffect Semantics Using Indexed Comonads


The approach of categorical semantics interprets terms as morphisms in some
category. For typed calculi, typing judgments x1 : τ1, . . . , xn : τn ⊢ e : τ are
usually mapped to morphisms ⟦τ1⟧ × . . . × ⟦τn⟧ → ⟦τ⟧. Moggi showed the semantics of various effectful computations can be captured generally using the
(strong) monad structure [8]. Dually, Uustalu and Vene showed that (monoidal )
comonads capture various kinds of context-dependent computation [14].
We extend Uustalu and Vene’s approach to give a semantics for the coeffect
calculus by generalising comonads to indexed comonads. We emphasise semantic
intuition and abbreviate the categorical foundations for space reasons.
Indexed Comonads. Uustalu and Vene’s approach interprets well-typed terms
as morphisms C(⟦τ1⟧ × . . . × ⟦τn⟧) → ⟦τ⟧, where C encodes contexts and has a comonad
structure [14]. Indexed comonads comprise a family of object mappings C^r indexed by a coeffect r describing the contextual requirements satisfied by the
encoded context. We interpret judgments C^r (x1 : τ1, . . . , xn : τn) ⊢ e : τ as
morphisms C^r (⟦τ1⟧ × . . . × ⟦τn⟧) → ⟦τ⟧.

The indexed comonad structure provides a notion of composition for computations with different contextual requirements.
Definition 2. Given a monoid (S, ⊕, e) with binary operator ⊕ and unit e, an
indexed comonad over a category C comprises a family of object mappings C^r,
where for all r ∈ S and A ∈ obj(C) then C^r A ∈ obj(C), and:
– a natural transformation ε_A : C^e A → A, called the counit;
– a family of mappings (−)†_{r,s} from morphisms C^r A → B to
morphisms C^{r⊕s} A → C^s B in C, natural in A, B, called coextend;
such that for all f : C^r τ1 → τ2 and g : C^s τ2 → τ3 the following equations hold:

  ε ◦ f†_{r,e} = f      (ε)†_{e,r} = id      (g ◦ f†_{r,s})†_{(r⊕s),t} = g†_{s,t} ◦ f†_{r,(s⊕t)}

The coextend operation gives rise to an associative composition operation for
computations with contextual requirements (with the counit as the identity):

  ◦̂ : (C^r τ1 → τ2) → (C^s τ2 → τ3) → (C^{r⊕s} τ1 → τ3)      g ◦̂ f = g ◦ f†_{r,s}

The composition ◦̂ best expresses the intention of indexed comonads: contextual
requirements of the composed functions are combined. The properties of the
composition follow from the indexed comonad laws and the monoid (S, ⊕, e).

Example 1. Indexed comonads are analogous to comonads (in coKleisli form),
but with additional monoidal structure on the indices. Indeed, comonads are a
special case of indexed comonads with a trivial singleton monoid, e.g., ({1}, ∗, 1)
with 1 ∗ 1 = 1, where C^1 is the underlying functor of the comonad and ε and
(−)†_{1,1} are the usual comonad operations. However, as demonstrated next, not
all indexed comonads are derived from ordinary comonads.

Example 2. The indexed partiality comonad encodes free-variable contexts of
a computation which are either live or dead (i.e., have liveness coeffects) with
the monoid ({D, L}, ⊓, L), where C^L A = A encodes live contexts and C^D A = 1
encodes dead contexts, where 1 is the unit type inhabited by the single value (). The
counit operation ε : C^L A → A and coextend operations f†_{r,s} : C^{r⊓s} A → C^s B
(for all f : C^r A → B) are defined:

  ε x = x      f†_{D,D} x = ()      f†_{D,L} x = f ()      f†_{L,D} x = ()      f†_{L,L} x = f x

The indexed family C^r here is analogous to the non-indexed Maybe (or option)
data type Maybe A = A + 1. This type does not permit a comonad structure
since ε : Maybe A → A is undefined at inr (). For the indexed comonad, ε need
only be defined for C^L A = A. Thus, indexed comonads capture a broader range
of contextual notions of computation than comonads.
Moreover, indexed comonads are not restricted by the shape-preservation
property of comonads [11]: that a coextended function cannot change the shape
of the context. For example, in the second case above f†_{D,L} : C^D A → C^L B, where
the shape changes from 1 (empty context) to B (available context).
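The counit and coextend operations of Example 2 transliterate directly into Python (a sketch with our encoding: the index strings "L"/"D" stand for live/dead, and a dead context is the unit value ()):

```python
# Indexed partiality comonad: C^L A is a plain value, C^D A is ().
# Indices combine by meet, so r ⊓ s is dead whenever either side is dead.
UNIT = ()

def counit(x):
    # ε : C^L A → A, defined only on live contexts.
    return x

def coextend(f, r, s):
    # Lift f : C^r A → B to f† : C^{r ⊓ s} A → C^s B.
    def lifted(x):
        if s == "D":
            return UNIT                           # f†_{r,D} x = ()
        return f(UNIT) if r == "D" else f(x)      # f†_{D,L} x = f (), f†_{L,L} x = f x
    return lifted

inc = lambda x: x + 1                 # inc : C^L int → int
assert coextend(inc, "L", "L")(41) == 42
assert coextend(inc, "L", "D")(41) == UNIT
const7 = lambda _: 7                  # const7 : C^D A → int, ignores its context
assert coextend(const7, "D", "L")(UNIT) == 7
```

The third assertion illustrates the shape change noted above: a function over a dead context is coextended into one producing a live context.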

⟦C^r Γ ⊢ λx.e : C^s τ1 → τ2⟧ = curry (⟦C^{r∧s} (Γ, x : τ1) ⊢ e : τ2⟧ ◦ m_{r,s})

⟦C^{r∨(s⊕t)} Γ ⊢ e1 e2 : τ⟧ = (uncurry ⟦C^r Γ ⊢ e1 : C^t τ1 → τ2⟧) ◦
      (id × ⟦C^s Γ ⊢ e2 : τ1⟧†_{s,t}) ◦ n_{r,s⊕t} ◦ C^{r∨(s⊕t)} Δ

⟦C^e Γ ⊢ xi : τi⟧ = πi ◦ ε

Fig. 5. Categorical semantics for the coeffect calculus

Monoidal Indexed Comonads. Indexed comonads provide a semantics for
sequential composition, but additional structure is needed for the semantics of
the full coeffect calculus. Uustalu and Vene [14] additionally require a (lax semi-)monoidal comonad structure, which provides a monoidal operation m : CA ×
CB → C(A × B) for merging contexts (used in the semantics of abstraction).
The semantics of the coeffect calculus requires an indexed lax semi-monoidal
structure for combining contexts as well as an indexed colax monoidal structure
for splitting contexts. These are provided by two families of morphisms (given a
coeffect algebra with ∨ and ∧):
– m_{r,s} : C^r A × C^s B → C^{r∧s} (A × B), natural in A, B;
– n_{r,s} : C^{r∨s} (A × B) → C^r A × C^s B, natural in A, B.
The m_{r,s} operation merges contextual computations with tags combined by ∧
(greatest lower bound), elucidating the behaviour of m_{r,s}: merging may
result in the loss of some parts of the contexts r and s.
The n_{r,s} operation splits context-dependent computations and thus the contextual requirements. To obtain coeffects r and s, the input needs to provide at
least r and s, so the tags are combined using the ∨ operator (least upper bound).
For the sake of brevity, we elide the indexed versions of the laws required by
Uustalu and Vene (e.g., most importantly, merging and splitting are associative).
Example 3. For the indexed partiality comonad, given the liveness coeffect
algebra ({D, L}, ⊓, ⊔, ⊓, L), the additional lax/colax monoidal operations are:

  m_{L,L} (x, y) = (x, y)            n_{D,D} () = ((), ())      n_{D,L} (x, y) = ((), y)
  m_{r,s} (x, y) = () otherwise      n_{L,D} (x, y) = (x, ())   n_{L,L} (x, y) = (x, y)
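These operations also transliterate directly (a Python sketch with our encoding of a dead context as the unit value ()):

```python
# Lax/colax monoidal operations for the indexed partiality comonad.
UNIT = ()

def merge(r, s, ca, cb):
    # m_{r,s} : C^r A × C^s B → C^{r⊓s}(A × B): keep the pair only when
    # both halves are live; otherwise the merged context is dead.
    return (ca, cb) if r == "L" and s == "L" else UNIT

def split(r, s, c):
    # n_{r,s} : C^{r⊔s}(A × B) → C^r A × C^s B: discard dead halves.
    a = c[0] if r == "L" else UNIT
    b = c[1] if s == "L" else UNIT
    return (a, b)

assert merge("L", "L", 1, 2) == (1, 2)
assert merge("L", "D", 1, UNIT) == UNIT
assert split("L", "D", (1, 2)) == (1, UNIT)
assert split("D", "D", UNIT) == (UNIT, UNIT)
```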

Example 4. Uustalu and Vene model causal dataflow computations using the
non-empty list comonad NEList A = A × (1 + NEList A) [14]. Whilst this comonad
implies a trivial indexed comonad, we define an indexed comonad with integer
indices for the number of past values demanded of the context.
We define C^n A = A × (A × . . . × A), where the first A is the current (always
available) value, followed by a finite product of n past values. The definition of
the operations is a straightforward extension of the work of Uustalu and Vene.
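A sketch of these operations (our encoding: C^n A as a tuple of length n+1, newest value first; the index arithmetic in coextend mirrors ⊕ = +):

```python
# Indexed dataflow comonad: C^n A is a tuple of n+1 values,
# the current value first, followed by n past values.
def counit(xs):
    # ε : C^0 A → A extracts the current value.
    return xs[0]

def coextend(f, n, m):
    # Lift f : C^n A → B (needing n past values) to
    # f† : C^{n+m} A → C^m B; the indices add, mirroring ⊕ = +.
    def lifted(xs):
        assert len(xs) == n + m + 1
        # The i-th output is f applied to a sliding window of n+1 values.
        return tuple(f(xs[i:i + n + 1]) for i in range(m + 1))
    return lifted

prev = lambda xs: xs[1]              # prev : C^1 A → A
assert coextend(prev, 1, 2)((3, 2, 1, 0)) == (2, 1, 0)
assert counit(coextend(prev, 1, 0)((5, 4))) == 4
```

Because the indices bound the history length, fixed-size buffers suffice, which is exactly the efficiency gain over unbounded non-empty lists.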

Categorical Semantics. Figure 5 shows the categorical semantics of the coeffect calculus using additional operations: πi for projection of the i-th element of
a product, the usual curry and uncurry operations, and Δ : A → A × A duplicating
a value. While C^r is a family of object mappings, it is promoted to a family of
functors with the derived morphism mapping C^r (f) = (f ◦ ε)†_{e,r}.

The semantics of variable use and abstraction are the same as in Uustalu and
Vene’s semantics, modulo coeffects. Abstraction uses m_{r,s} to merge the outer
context with the argument context to form the context of the function body. The
indices e for ε and r, s for m_{r,s} match the coeffects of the terms. The semantics
of application is more complex. It first duplicates the free-variable values inside
the context and then splits this context using n_{r,s⊕t}. The two contexts (with
different coeffects) are passed to the two sub-expressions, where the argument
subexpression, passed a context annotated s ⊕ t, is coextended to produce a context
annotated t, which is passed into the parameter of the function subexpression (cf. given
f : A → (B → C) and g : A → B, then uncurry f ◦ (id × g) ◦ Δ : A → C).
A semantics for sub-coeffecting is omitted, but may be provided by an operation ι_{r,s} : C^r A → C^s A, natural in A, for all r, s ∈ S where s ≤ r, which
transforms a value C^r A to C^s A by ignoring some of the encoded context.

4 Syntax-Based Equational Theory


The operational semantics of every context-dependent language here differs as
the notion of context is always different. However, for coeffect calculi satisfying
certain conditions we can define a universal equational theory. This suggests
a pathway to an operational semantics for two out of our three examples (the
notion of context for data-flow is more complex).
In a pure λ-calculus, β- and η-equality for functions (also called local soundness and completeness, respectively [12]) describe how pairs of abstraction and
application can be eliminated: (λx.e2) e1 ≡β e2[x ← e1] and (λx.e x) ≡η e. The
β-equality rule, using the usual Barendregt convention of syntactic substitution,
implies a reduction, giving part of an operational semantics for the calculus.
The call-by-name evaluation strategy modelled by β-reduction is not suitable
for impure calculi; therefore a restricted β rule, corresponding to call-by-value, is
used, i.e. (λx.e2) v ≡ e2[x ← v]. Such a reduction can be encoded by a let-binding
term, let x = e1 in e2, which corresponds to sequential composition of two
computations, where the resulting pure value of e1 is substituted into e2 [4,8].
For an equational theory of coeffects, consider first a notion of let-binding
equivalent to (λx.e2) e1, which has the following type and coeffect rule:

  C^s Γ ⊢ e1 : τ1      C^{r1∧r2} (Γ, x : τ1) ⊢ e2 : τ2
  ──────────────────────────────────────────────────── (1)
  C^{r1∨(r2⊕s)} Γ ⊢ let x = e1 in e2 : τ2

For our examples, ∧ is idempotent (i.e., r ∧ r = r), implying a simpler rule:

  C^s Γ ⊢ e1 : τ1      C^r (Γ, x : τ1) ⊢ e2 : τ2
  ─────────────────────────────────────────────── (2)
  C^{r∨(r⊕s)} Γ ⊢ let x = e1 in e2 : τ2

For our examples (but not necessarily all coeffect systems), this defines a more
“precise” coeffect with respect to ≤ where r ∨ (r ⊕ s) ≤ r1 ∨ (r2 ⊕ s).
This rule removes the non-principality of the first rule (i.e., multiple possible
typings). However, using idempotency to split coeffects in abstraction would
remove additional flexibility needed by the implicit parameters example.

The coeffect r ∨ (r ⊕ s) can also be simplified for all our examples, leading to
more intuitive rules: for implicit parameters r ∪ (r ∪ s) = r ∪ s; for liveness we
get that r ⊔ (r ⊓ s) = r; and for dataflow we obtain max(r, r + s) = r + s.
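These simplifications can be spot-checked mechanically (a sanity sketch of ours, not part of the formal development):

```python
from itertools import product

# Dataflow (∨ = max, ⊕ = +): max(r, r + s) = r + s on naturals.
for r, s in product(range(6), repeat=2):
    assert max(r, r + s) == r + s

# Liveness (∨ = ⊔ encoded as or, ⊕ = ⊓ as and): r ⊔ (r ⊓ s) = r (absorption).
for r, s in product([False, True], repeat=2):
    assert (r or (r and s)) == r

# Implicit parameters (∨ = ⊕ = ∪): r ∪ (r ∪ s) = r ∪ s.
r, s = frozenset({"culture"}), frozenset({"format"})
assert r | (r | s) == r | s
```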
Our calculus can be extended with let-binding and rule (2). However, we also
consider the cases when a syntactic substitution e2[x ← e1] has the coeffects
specified by the above rule (2), and prove a subject reduction theorem for certain
coeffect calculi. We consider two common special cases, when the coeffect of
variables e is the greatest (⊤) or least (⊥) element of the semi-lattice (S, ∨), and
derive additional properties that hold of the coeffect algebra:

Lemma 1 (Substitution). Given C^r (Γ, x : τ2) ⊢ e1 : τ1 and C^s Γ ⊢ e2 : τ2,
then C^{r∨(r⊕s)} Γ ⊢ e1[x ← e2] : τ1, provided the coeffect algebra satisfies the conditions
that e is either the greatest or least element of the semi-lattice, ⊕ = ∧, and ⊕
distributes over ∨, i.e., X ⊕ (Y ∨ Z) = (X ⊕ Y) ∨ (X ⊕ Z).

Proof. By induction over the typing derivation, using the laws (§2) and the additional assumptions. ⊓⊔

Assuming →β is the usual call-by-name reduction, the following theorem models
the evaluation of coeffect calculi with a coeffect algebra that satisfies the above
requirements. We do not consider call-by-value, because our calculus does not
have a notion of value unless explicitly provided by let-binding (even a function
“value” λx.e may have immediate contextual requirements).

Theorem 1 (Subject Reduction). For a coeffect calculus satisfying the conditions of Lemma 1, if C^r Γ ⊢ e : τ and e →β e′ then C^r Γ ⊢ e′ : τ.

Proof. A direct consequence of Lemma 1. ⊓⊔

The above theorem holds for both the liveness and resources examples, but not
for dataflow. In the case of liveness, e is the greatest element (r ∨ e = e); in
the case of resources, e is the least element (r ∨ e = r), and the proof relies on
the fact that additional context requirements can be placed on the context C^r Γ
(without affecting the type of the function when substituted under a λ-abstraction).
However, the coeffect calculus also captures context-dependence in languages
with more complex evaluation strategies than call-by-name reduction based on
syntactic substitution. In particular, syntactic substitution does not provide a
suitable evaluation for dataflow (because a substituted expression needs to cap-
ture the context of the original scope).
Nevertheless, the above results show that – unlike effects – context-dependent
properties can be integrated with call-by-name languages. Our work also provides
a model of existing work, namely Haskell implicit parameters [7].

5 Related and Further Work

This paper follows the approaches of effect systems [5,13,17] and categorical
semantics based on monads and comonads [8,14]. Syntactically, coeffects differ

from effects in that they model systems where λ-abstraction may split contextual
requirements between the declaration-site and call-site.
Our indexed (monoidal) comonads (§3) fill the gap between (non-indexed)
(monoidal) comonads of Uustalu and Vene [14] and indexed monads of Atkey [2],
Wadler and Thiemann [17]. Interestingly, indexed comonads are more general
than comonads, capturing more notions of context-dependence (§1).

Comonads and Modal Logics. Bierman and de Paiva [3] model the □ modality of an intuitionistic S4 modal logic using monoidal comonads, which links our
calculus to modal logics. This link can be materialized in two ways.
Pfenning et al. and Nanevski et al. derive term languages using the Curry-Howard correspondence [12,3,9], building a metalanguage (akin to Moggi’s
monadic metalanguage [8]) that includes □ as a type constructor. For example,
in [12], the modal type □τ represents closed terms. In contrast, the semantic
approach uses monads or comonads only in the semantics. This has been employed by Uustalu and Vene and (again) Moggi [8,14]. We follow the semantic
approach.
Nanevski et al. extend an S4 term language to a contextual modal type theory
(CMTT) [9]. The context is a set of variables required by a computation, which
makes CMTT useful for meta-programming and staged computations. Our contextual types are indexed by a coeffect algebra, which is more general: it can
capture variable contexts, but also integers, two-point lattices, etc.
The work on CMTT suggests two extensions to coeffects. The first is developing the logical foundations. We briefly considered special cases of our system that
permit local soundness in §4; local completeness can be treated similarly. The
second is developing a coeffect metalanguage. The use of coeffect algebras provides additional flexibility over CMTT, allowing a wider range of applications
via a richer metalanguage.

Relating Effects and Coeffects. The difference between effects and coeffects
is mainly in the (abs) rule. While the semantic models (monads vs. comonads)
are different, they can be extended to obtain equivalent syntactic rules. To allow
splitting of implicit parameters in lambda abstraction, the reader monad needs
an operation that eagerly performs some effects of a function: (τ1 → M^{r⊕s} τ2) →
M^r (τ1 → M^s τ2). To obtain a pure lambda abstraction for coeffects, we need to
restrict the m_{r,s} operation of indexed comonads, so that the first parameter is
annotated with e (meaning no effects): C^e A × C^r B → C^r (A × B).

Structural Coeffects. To make the liveness analysis practical, we need to
associate information with individual variables (rather than the entire context).
We can generalize the calculus from this paper by adding a product operation ×
to the coeffect algebra. A variable context x : τ1, y : τ2, z : τ3 is then annotated
with r × s × t, where each component of the tag corresponds to a single variable.
The system is then extended with structural rules such as:
          C^{r×s} (Γ, x : τ1) ⊢ e : τ2
  (abs)   ─────────────────────────────
          C^r Γ ⊢ λx.e : C^s τ1 → τ2

          C^{r×s} (x : τ1, y : τ1) ⊢ e : τ2
  (contr) ──────────────────────────────────────
          C^{r∨s} (z : τ1) ⊢ e[x ← z][y ← z] : τ2

The context requirements associated with a function are exactly those linked to the
specific variable of the lambda abstraction. Rules such as contraction manipulate
variables and perform a corresponding operation on the indices.
The structural coeffect system is related to bunched typing [10] (but general-
izes it by adding indices). We are currently investigating how to use structural
coeffects to capture fine-grained context-dependence properties such as secure
information flow [15] or, more generally, those captured by the dependency core
calculus [1].

6 Conclusions

We examined three simple calculi with associated coeffect systems (liveness anal-
ysis, implicit parameters, and dataflow analysis). These were unified in the coef-
fect calculus, providing a general coeffect system parameterised by an algebraic
structure describing propagation of context requirements throughout a program.
We model the semantics of the coeffect calculus using the indexed (monoidal)
comonad structure – a novel structure, which is more powerful than (monoidal)
comonads. Indices of the indexed comonad operations manifest the semantic
propagation of context so that the propagation of information in the general
coeffect type system corresponds exactly to the semantic propagation of context
in our categorical model.
We consider the analysis of context to be essential, not least for the examples
here but also given increasingly rich and diverse distributed systems.

Acknowledgements. We thank Gavin Bierman, Tarmo Uustalu, Varmo Vene
and reviewers of earlier drafts. This work was supported by the EPSRC and
CHESS.

References
1. Abadi, M., Banerjee, A., Heintze, N., Riecke, J.G.: A core calculus of dependency.
In: Proceedings of POPL (1999)
2. Atkey, R.: Parameterised notions of computation. J. Funct. Program. 19 (2009)
3. Bierman, G.M., de Paiva, V.C.V.: On an intuitionistic modal logic. Studia Logica 65 (2000)
4. Filinski, A.: Monads in action. In: Proceedings of POPL (2010)
5. Gifford, D.K., Lucassen, J.M.: Integrating functional and imperative programming.
In: Proceedings of Conference on LISP and func. prog., LFP 1986 (1986)
6. Jouvelot, P., Gifford, D.K.: Communication Effects for Message-Based Concurrency. Technical report, Massachusetts Institute of Technology (1989)
7. Lewis, J.R., Shields, M.B., Meijer, E., Launchbury, J.: Implicit parameters: dynamic scoping with static types. In: Proceedings of POPL 2000 (2000)
8. Moggi, E.: Notions of computation and monads. Inf. Comput. 93, 55–92 (1991)
9. Nanevski, A., Pfenning, F., Pientka, B.: Contextual modal type theory. ACM
Trans. Comput. Logic 9(3), 23:1–23:49 (2008)

10. O’Hearn, P.: On bunched typing. J. Funct. Program. 13(4), 747–796 (2003)
11. Orchard, D., Mycroft, A.: A Notation for Comonads. In: Post-Proceedings of IFL 2012. LNCS, Springer, Heidelberg (to appear)
12. Pfenning, F., Davies, R.: A judgmental reconstruction of modal logic. Mathematical Structures in Comp. Sci. 11(4), 511–540 (2001)
13. Talpin, J., Jouvelot, P.: The type and effect discipline. In: Proceedings of LICS 1992, pp. 162–173 (1992)
14. Uustalu, T., Vene, V.: Comonadic Notions of Computation. Electron. Notes Theor.
Comput. Sci. 203, 263–284 (2008)
15. Volpano, D., Irvine, C., Smith, G.: A sound type system for secure flow analysis.
J. Comput. Secur. 4, 167–187 (1996)
16. Wadge, W.W., Ashcroft, E.A.: LUCID, the dataflow programming language. Academic Press Professional, Inc., San Diego (1985)
17. Wadler, P., Thiemann, P.: The marriage of effects and monads. ACM Trans. Comput. Logic 4, 1–32 (2003)
Proof Systems for Retracts in Simply Typed
Lambda Calculus

Colin Stirling

School of Informatics
University of Edinburgh
[email protected]

Abstract. This paper¹ concerns retracts in simply typed lambda calculus assuming βη-equality. We provide a simple tableau proof system
which characterises when a type is a retract of another type and which
leads to an exponential decision procedure.

1 Introduction
Type ρ is a retract of type τ if there are functions C : ρ → τ and D : τ → ρ
with D ◦ C = λx.x. This paper concerns retracts in the case of simply typed
lambda calculus [1]. Various questions can be asked. The decision problem is:
given ρ and τ , is ρ a retract of τ ? Is there an independent characterisation of
when ρ is a retract of τ ? Is there an inductive method, such as a proof system, for
deriving assertions of the form “ρ is a retract of τ ”? If so, can one also construct
(inductively) the witness functions C and D?
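For intuition, the definition can be illustrated with a toy retraction pair (our example, not from the paper, using Python pairs for a product type):

```python
# int is a retract of int × int, witnessed by C(x) = (x, x) and
# D(p) = fst p, since D ∘ C is the identity on int.
C = lambda x: (x, x)            # C : ρ → τ
D = lambda p: p[0]              # D : τ → ρ
assert all(D(C(x)) == x for x in range(10))   # D ∘ C = λx.x
```

Note that C ∘ D need not be the identity; a retract only requires one round trip to be lossless.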
Bruce and Longo [2] provide a simple proof system that solves when there are
retracts in the case that D ◦ C =β λx.x. The problem is considerably more diffi-
cult if β-equality is replaced with βη-equality. De Liguoro, Piperno and Statman
[3] show that the retract relation with respect to βη-equality coincides with the
surjection relation: ρ is a retract of τ iff for any model there is a surjection from
τ to ρ. They also provide a proof system for the affine case (when each variable
in C and D occurs at most once) assuming a single ground type. Regnier and
Urzyczyn [9] extend this proof system to cover multiple ground types. The proof
systems yield simple inductive nondeterministic algorithms belonging to NP for
deciding whether ρ is an affine retract of τ . Schubert [10] shows that the problem
of affine retraction is NP-complete and how to derive witnesses C and D from
the proof system in [9]. Under the assumption of a single ground type, decid-
ability of when ρ is a retract of τ is shown by Padovani [8] by explicit witness
construction (rather than by a proof system) of a special form.
More generally, decidability of the retract problem follows from decidability
of higher-order matching in simply typed lambda calculus [13]: ρ is a retract of τ
iff the equation λz^ρ.x1^{τ→ρ}(x2^{ρ→τ} z) =βη λz^ρ.z has a solution (the witnesses D and
C for x1, x2). Since the complexity of matching is non-elementary [15] this decidability result leaves open whether there is a better algorithm, or even a proof
¹ For a full version see http://www.homepages.inf.ed.ac.uk/cps/ret.pdf

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 398–409, 2013.

c Springer-Verlag Berlin Heidelberg 2013
Proof Systems for Retracts in Simply Typed Lambda Calculus 399

system, for the problem. In the case of β-equality matching is no guide to solv-
ability: the retract problem is simply solvable whereas β-matching is undecidable
[4].
In this paper we provide an independent solution to the retract problem. We
show it is decidable by exhibiting sound and complete tableau proof systems.
We develop two proof systems for retracts, one for the (slightly easier) case when
there is a single ground type and the other for when there are multiple ground
types. Both proof systems appeal to paths in terms. Their correctness depends
on properties of such paths. We appeal to a dialogue game between witnesses of
a retract to prove such properties: a similar game-theoretic characterisation of
β-reduction underlies decidability of matching.
In Section 2 we introduce retracts in simply typed lambda calculus and fix
some notation for terms as trees and for their paths. The two tableau proof
systems for retracts are presented in Section 3 where we also briefly examine
how they generate a decision procedure for the retract problem. In Section 4 we
sketch the proof of soundness of the tableau proof systems (and completeness
and further details are provided in the full version).

2 Preliminaries

Simple types are generated from ground types using the binary function operator
→. We let a, b, o, . . . range over ground types and ρ, σ, τ, . . . range over simple
types. Assuming → associates to the right, so ρ → σ → τ is ρ → (σ → τ ), if a
type ρ is not a ground type then it has the form ρ1 → . . . → ρn → a. We say
that a is the target type of a and of any type ρ1 → . . . → ρn → a.
Simply typed terms in Church style are generated from a countable set of
typed variables xσ using lambda abstraction and function application [1]. We
write S σ , or sometimes S : σ, to mean term S has type σ. The usual typing
rules hold: if S τ then λxσ .S τ : σ → τ ; if S σ→τ and U σ then (S σ→τ U σ ) : τ . In a
sequence of unparenthesised applications we assume that application associates
to the left, so SU1 . . . Uk is ((. . . (SU1 ) . . .)Uk ). Another abbreviation is λz1 . . . zm
for λz1 . . . λzm . Usual definitions of when a variable occurrence is free or bound
and when a term is closed are assumed.
We also assume the usual dynamics of β and η-reductions and the consequent
βη-equivalence between terms (as well as α-equivalence). Confluence and strong
normalisation ensure that terms reduce to (unique) normal forms. Moreover,
we assume the standard notion of η-long β-normal form (a term in normal form
which is not an η-reduct of some other term) which we abbreviate to lnf. The syn-
tax of such terms reflects their type: a lnf of type a is a variable x^a, or x U1 . . . Uk
where x^{ρ1→...→ρk→a} and each Ui^{ρi} is a lnf; a lnf of type ρ1 → . . . → ρn → a has
the form λx1^{ρ1} . . . xn^{ρn}.S, where S^a is a lnf.
The following definition introduces retracts between types [2,3].

Definition 1. Type ρ is a retract of type τ, written |= ρ ◁ τ, if there are terms
C : ρ → τ and D : τ → ρ such that D ◦ C =βη λx^ρ.x.
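For a concrete feel for this definition, here is a small illustration of our own (not from the paper): witnesses for σ being a retract of (σ → a) → a, the CPS-style embedding, rendered as Python lambdas with σ = int → int and a = int.

```python
# Hypothetical illustration of Definition 1: a coder C and decoder D witnessing
# that sigma is a retract of (sigma -> a) -> a, with sigma = int -> int, a = int.
C = lambda x: lambda k: k(x)               # C : sigma -> ((sigma -> a) -> a)
D = lambda f: lambda z: f(lambda s: s(z))  # D : ((sigma -> a) -> a) -> sigma

x = lambda n: n + 1
# D composed with C is the identity on sigma, up to extensionality:
assert D(C(x))(41) == x(41) == 42
```

This is an instance of the retract σ ◁ (σ → a) → a (axiom A4 of [3]) discussed in Section 3.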

The witnesses C and D to a retract can always be presented as lnfs. We can think
of C as a “coder” and D as a “decoder” [9]. Assume ρ = ρ1 → . . . → ρl → a
and τ = τ1 → . . . → τn → a: in a retract the types must share target type [9].
We instantiate the bound ρi variables in a decoder D to D(z1^{ρ1}, . . . , zl^{ρl}), often
abbreviated to D(z), and the bound variable of type ρ in C to C(x^ρ): so, |= ρ ◁ τ
if D(z1^{ρ1}, . . . , zl^{ρl})(C(x^ρ)) =βη x z1 . . . zl. From [9], we can restrict a decoder to
be of the form λf^τ.f S1^{τ1} . . . Sn^{τn} with f as head variable and a coder C(x) has
the form λy1^{τ1} . . . yn^{τn}.H(x T1^{ρ1} . . . Tl^{ρl}).

Definition 2. We say that the decoder D(z1, . . . , zl) = λf^τ.f S1^{τ1} . . . Sn^{τn} and the
coder C(x) = λy1^{τ1} . . . yn^{τn}.H(x T1^{ρ1} . . . Tl^{ρl}) are canonical witnesses for ρ ◁ τ if
D(z)(C(x)) =βη x z1 . . . zl and they obey the following properties:
1. variables f, z1, . . . , zl occur only once in D(z),
2. x occurs only once in C(x),
3. H is ε if ρ and τ are constructed from a single ground type,
4. if Ti^{ρi} contains an occurrence of yj then it is the head variable of Ti^{ρi}, zi
occurs in Sj^{τj} and Ti^{ρi} contains no other occurrences of any yk, 1 ≤ k ≤ n.

The next result follows from observations in [3,9].

Proposition 1. |= ρ ◁ τ iff there exist canonical witnesses for ρ ◁ τ.

So, if there is only a single ground type then C(x) can be restricted to have the
form λy1^{τ1} . . . yn^{τn}.x T1^{ρ1} . . . Tl^{ρl} with x as head variable [3].

Example 1. From [3]. Let ρ = ρ1 → ρ2 → o where ρ1 = ρ2 = σ → o and let
τ = τ1 → o where τ1 = σ → (o → o → o) → o and σ is arbitrary. It follows
that |= ρ ◁ τ. A decoder D(z1^{ρ1}, z2^{ρ2}) is λf^τ.f(λu^σ v^{o→o→o}.v(z1 u)(z2 u)) and a
coder C(x^ρ) is λy^{τ1}.x(λw^σ.y w(λs^o t^o.s))(λw^σ.y w(λs^o t^o.t)); so, (D(z1, z2))C(x)
→∗β x(λw^σ.z1 w)(λw^σ.z2 w) =βη x z1 z2. ⊓⊔
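As a sanity check (our sketch, not part of the paper), the witnesses of Example 1 can be transcribed directly as curried Python lambdas, instantiating both σ and the ground type o as int:

```python
# D(z1, z2) = λf. f (λu v. v (z1 u) (z2 u))
D = lambda z1, z2: lambda f: f(lambda u: lambda v: v(z1(u))(z2(u)))
# C(x) = λy. x (λw. y w (λs t. s)) (λw. y w (λs t. t))
C = lambda x: lambda y: x(lambda w: y(w)(lambda s: lambda t: s))(
                          lambda w: y(w)(lambda s: lambda t: t))

z1 = lambda u: u + 1
z2 = lambda u: u * 2
x = lambda p: lambda q: 100 * p(3) + q(3)   # an arbitrary inhabitant of rho
assert D(z1, z2)(C(x)) == x(z1)(z2) == 406  # i.e. D(z)(C(x)) behaves as x z1 z2
```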

Example 2. From [9] with multiple ground types. Let ρ = ρ1 → ρ2 → a where
ρ1 = b → a, ρ2 = o and let τ = τ1 → a where τ1 = b → (a → o → a) → a.
A decoder D(z1^{ρ1}, z2^{ρ2}) is λf^τ.f(λu1^b u2^{a→o→a}.u2(z1 u1)z2) and a coder C(x^ρ) is
λy^{τ1}.y s^b(λw1^a w2^o.x(λv^b.y v(λw1^a w2^o.w1))w2); so, (D(z1, z2))C(x) →∗β x(λv^b.z1 v)z2
=βη x z1 z2. ⊓⊔
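Example 2 can be checked the same way (again our own sketch): the three ground types b, a, o are all rendered as int, with ρ2 = o as in Figure 5, and the coder's first argument to y (a free variable of ground type b in the paper's term) becomes a dummy value, since the result does not depend on it:

```python
# D(z1, z2) = λf. f (λu1 u2. u2 (z1 u1) z2)
D = lambda z1, z2: lambda f: f(lambda u1: lambda u2: u2(z1(u1))(z2))
# C(x) = λy. y s (λw1 w2. x (λv. y v (λw1 w2. w1)) w2), with s a dummy b-value
C = lambda x: lambda y: y(0)(lambda w1: lambda w2:
        x(lambda v: y(v)(lambda p: lambda q: p))(w2))

z1 = lambda u: u + 10                     # z1 : b -> a
z2 = 7                                    # z2 : o
x = lambda g: lambda w: 1000 * g(5) + w   # x : (b -> a) -> o -> a
assert D(z1, z2)(C(x)) == x(z1)(z2) == 15007
```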

Terms are represented as special kinds of tree (that we call binding trees in
[12,14]) with dummy lambdas and an explicit binding relation. A term of the
form y^a is represented as a tree with a single node labelled y^a. In the case of
y U1 . . . Uk, when y^{ρ1→...→ρk→a}, we assume that a dummy lambda with the empty
sequence of variables is placed directly above any subterm Ui in its tree repre-
sentation if ρi is a ground type. With this understanding, the tree for y U1 . . . Uk
consists of a root node labelled y^{ρ1→...→ρk→a} and k-successor trees representing U1, . . . , Uk. We also use the abbreviation λy for λy1 . . . ym for m ≥ 0, so
y is possibly the empty sequence of variables in the case of a dummy lambda.
The tree representation of λy.S : ρ1 → . . . → ρk → a consists of a root node

(0) λf ── (1) f ── (2) λuv ── (3) v ──┬── (4) λ ── (5) z1 ── (6) λ ── (7) u
                                      └── (8) λ ── (9) z2 ── (10) λ ── (11) u

(12) λy ── (13) x ──┬── (14) λw ── (15) y ──┬── (16) λ ── (17) w
                    │                       └── (18) λst ── (19) s
                    └── (20) λw ── (21) y ──┬── (22) λ ── (23) w
                                            └── (24) λst ── (25) t

Fig. 1. D(z1, z2) and C(x) of Example 1

labelled λy and a single successor tree for S^a. The trees for C(x) and D(z1, z2)
of Example 1, where we have omitted the types, are in Figure 1.
We say that a node is a lambda (variable) node if it is labelled with a lambda
abstraction (variable). The type (target type) of a variable node is the type (target
type) of the variable at that node and the type (target type) of a lambda node
is the type (target type) of the subterm rooted at that node.
The other elaboration is that we assume an extra binary relation ↓ between
nodes in a tree that represents binding; that is, between a node labelled λy1 . . . yn
and a node below it labelled yj (that it binds). A binder λy is such that either y is
empty and therefore is a dummy lambda and cannot bind a variable occurrence
or y = y1 . . . yk and λy can only then bind variable occurrences of the form yi ,
1 ≤ i ≤ k. Consequently, we also employ the following abbreviation n ↓i m if
n ↓ m and n is labelled λy1 . . . yk and m is labelled yi . In Figure 1 we have not
included the binding relation; however, for instance, (2) ↓1 (7).

Definition 3. Lambda node n is a descendant (k-descendant) of m if either
m ↓ m′ (m ↓k m′), n is a successor of m′ for some m′ and the target types of
m, m′ and n are the same, or n′ is a descendant (k-descendant) of m and n is
a descendant of n′ for some n′.

We assume a standard presentation of nodes of a tree as sequences of integers:


an initial sequence, typically ε, is the root node; if n is a node and m is the ith
successor of n then m = ni. For the sake of brevity we have not followed this
approach in Figure 1 where we have presented each node as a unique integer (i).
Definition 4. A path of the tree of a term of type σ is a sequence of nodes
n = n1, . . . , nk where n1 is the root of the tree, each ni+1 is a successor of ni
and if nj is a variable node then for some i < j, ni ↓ nj (hence it is a closed path).
For paths m = m1, . . . , ml and n = n1, . . . , nk of type σ we write m ≺ n if for
some i > 0, for all h ≤ 2i, mh = nh, m2i+1 = m2i p, n2i+1 = n2i q and p < q.
A (closed) subtree of a tree of a term of type σ is a set of paths P of type σ
such that if m, n are distinct paths in P then m ≺ n or n ≺ m.
A path n = n1, . . . , nk is a contiguous sequence of nodes in a tree of a term
starting at the root; for i ≥ 1, each n2i−1 is a lambda node and each n2i is
a variable node (whose binder occurs earlier in the path). Path m is before n,
m ≺ n, if they have a common even length prefix and then differ as to their
successors (the one in m before that in n). These paths could, therefore, be in
the same term: a closed subtree is a set of such paths.
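With nodes presented as sequences of integers, the before relation of Definition 4 admits a direct transcription (a sketch of ours, with paths as Python lists of tuples): the two paths must agree on a common prefix of some even length 2i > 0 and then diverge, the successor index taken in m being smaller than the one taken in n.

```python
# m and n are paths: lists of nodes, each node a tuple of successor indices.
def before(m, n):
    k = 0
    while k < min(len(m), len(n)) and m[k] == n[k]:
        k += 1                      # k = length of the common prefix
    if k >= 2 and k % 2 == 0 and k < len(m) and k < len(n):
        return m[k][-1] < n[k][-1]  # m[k] = m[k-1]+(p,), n[k] = n[k-1]+(q,)
    return False

# the two paths of D(z1, z2) in Figure 1 down to the lambdas above z1 and z2
left  = [(), (1,), (1, 1), (1, 1, 1), (1, 1, 1, 1)]
right = [(), (1,), (1, 1), (1, 1, 1), (1, 1, 1, 2)]
assert before(left, right) and not before(right, left)
```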
Definition 5. A path n = n1 , . . . , nl is k-minimal provided that for each binding
node ni there are at most k distinct nodes nj , i < j ≤ l, such that ni ↓ nj . A
subtree P is k-minimal if each path in P is k-minimal.
Not every path or subtree is useful in a term. So, we define when a path or
subtree is realisable meaning that their nodes are “accessible” [7] or “reachable”
[6] in an applicative context.
Definition 6. Assume n = n1, . . . , nl is a path of odd length of a closed term T
of type σ, m is the node below nl in T and T′ is the term T when the variable u^τ
at node m is replaced with a fresh free variable z^τ. We say that n is realisable if
there is a closed term U = λy^σ.y S1 . . . Sk such that U T′ =βη λx.z W1 . . . Wq for
some q ≥ 0.
Definition 7. Assume P is a subtree of closed term T of type σ where each path
has even length, m1, . . . , mq are the leaves of P and Ti, 1 ≤ i ≤ q, is the term
T when the variable ui^{τi} at mi is replaced with a fresh free variable zi^{τi}. We say
that P is realisable if there is a closed term U = λy^σ.y S1 . . . Sk such that for
each i, U Ti =βη λx.zi W1 . . . Wqi for qi ≥ 0.
Next we define two useful operations on paths, restriction relative to a suffix and
the subtype after a prefix.
Definition 8. Assume that n = n1, . . . , np is a path, σ = σ1 → . . . → σk → a,
ni is a lambda node of type σ and w = ni, . . . , np is a suffix of n.
1. The suffix w admits σj, 1 ≤ j ≤ k, if either there is no nq, i ≤ q ≤ p, such
that ni ↓j nq or there is a j-descendant nq of ni whose type is τ1 → . . . →
τl → a and for some r there is not a t : q < t ≤ p such that nq ↓r nt and a
is the target type of τr.
2. The restriction of σ to w, σ ↾ w, is defined as σ_w where
– a_w = a,
– (σj → . . . → σk → a)_w = if w admits σj then σj → (σj+1 → . . . →
σk → a)_w else (σj+1 → . . . → σk → a)_w.

Definition 9. Assume that n = n1 , . . . , np is a path of type σ. For a prefix w


of n we define the subtype of σ after w, w(σ):

– if w = ε (the empty prefix) then σ,


– if w = n1 , . . . , nq , q ≤ p, then the type of node nq .

We also define a canonical presentation of a (prefix or suffix of a) path n =
n1, . . . , nk of type σ as a word w. If w is the empty prefix we write w = ε.
Otherwise, w = (w1, . . . , wj), j ≤ k, where for each i ≥ 0, w2i+1 = n2i+1
and if nh ↓m n2i then w2i = nh m. Thus, we distinguish between w = ε (the
empty word) and w = (ε) the prefix of length 1 consisting of the root node.
Also, we can present a subtree as a set of words. Words will occur in our proof
systems as presentations of paths. For example, w = (ε, 1, 11, 112, 1112) of type
τ as in Example 1 represents the path labelled λf, f, λuv, v, λ of D(z1, z2) in
Figure 1 when its root is ε. To illustrate Definitions 8 and 9, for the prefix
w′ = (ε, 1, 11) and the suffix w′′ = (11, 112, 1112) of w we have w′(τ) = τ1 where
τ1 = σ → (o → o → o) → o as in Example 1 and τ1 ↾ w′′ = σ → o: word w′′ of
type τ1 has labelling λu^σ v^{o→o→o}, v, λ; so, w′′ admits the first component σ of τ1
but not the second (o → o → o). The final element of w′ is the same as the first
element of w′′; in such a case we define their concatenation to be w.
Definition 10. The concatenation of (a prefix) v and (a suffix) w, v∧w, is:
ε∧w = w; if vk = w1 then (v1, . . . , vk)∧(w1, . . . , wn) = (v1, . . . , vk, w2, . . . , wn).
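Definition 10 transcribes directly (our sketch, with words as Python lists); note that the shared element appears only once in the result, matching the prefix/suffix example just above:

```python
# v ∧ w (Definition 10): words as lists; when v is non-empty its last element
# must equal the first element of w, and that shared element appears once.
def concat(v, w):
    if not v:
        return list(w)
    assert v[-1] == w[0], "final element of v must equal first element of w"
    return list(v) + list(w[1:])

assert concat([], ['11', '112']) == ['11', '112']
assert concat(['e', '1', '11'], ['11', '112', '1112']) == \
       ['e', '1', '11', '112', '1112']
```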

3 Proof Systems for Retracts

We now develop goal directed tableau proof systems for showing retracts. By
inverting the rules one obtains more classical axiomatic systems: we proceed this
way because it immediately provides a nondeterministic decision procedure
for retracts. We present two such proof systems: a slightly simpler
system for the restricted case when there is a single ground type and one for the
general case.

3.1 Single Ground Type

Assertions in our proof system are of two kinds. First is ρ ◁ τ with the meaning that ρ
is a retract of τ. The second has the form [ρ1, . . . , ρk] ◁ τ which is based on the
“product” as defined in [3]. We follow [9] in allowing reordering of components
of types since ρ → σ → τ is isomorphic to σ → ρ → τ. Instead we could include
explicit rules for reordering (as with the axiom in [3]). Moreover, we assume that
[ρ1, . . . , ρk] is a multi-set and so elements can be in any order.

I    ρ ◁ ρ

     ρ ◁ σ → τ
W    ──────────
     ρ ◁ τ

     δ → ρ ◁ σ → τ
C    ────────────────
     δ ◁ σ    ρ ◁ τ

     ρ1 → . . . → ρk → ρ ◁ σ → τ
P1   ───────────────────────────────
     [ρ1, . . . , ρk] ◁ σ    ρ ◁ τ

     [ρ1, . . . , ρk] ◁ σ
P2   ──────────────────────────────────  where
     ρ1 ◁ σ ↾ w1    . . .    ρk ◁ σ ↾ wk

– w1 ≺ . . . ≺ wk are k-minimal realisable paths of odd length of type σ

Fig. 2. Goal directed proof rules

The proof rules are given in Figure 2. There is a single axiom I, identity,
a weakening rule W, a covariance rule C, and two product rules P1 and P2.
The rules are goal directed: for instance, C allows one to decompose the goal
δ → ρ ◁ σ → τ into the two subgoals δ ◁ σ and ρ ◁ τ. I, W and C (or their variants)
occur in the proof systems for affine retracts (when variables in witnesses can
only occur at most once) [3,9]. The new rules are the product rules: P2 appeals
to k-minimal realisable paths (presented as words), and the restriction operator
of Definition 8. The proof system does not require the axiom A4 of [3], σ ◁ (σ →
a) → a: all instances are provable using W and C.
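To make the goal-directed reading concrete, here is a hypothetical Python search of ours over the rules I, W and C alone (so it decides only a sub-relation of full derivability, since the product rules are omitted); it confirms, for instance, that instances of axiom A4 are derivable from W and C:

```python
# Types over a single ground type 'o'; the pair (s, t) encodes the arrow s -> t.
def derivable(rho, tau):
    """Goal-directed proof search using only rules I, W and C."""
    if rho == tau:                    # axiom I
        return True
    if isinstance(tau, tuple):
        sigma, tau2 = tau
        if derivable(rho, tau2):      # rule W: drop tau's first argument
            return True
        if isinstance(rho, tuple):
            delta, rho2 = rho         # rule C: split both arrow types
            if derivable(delta, sigma) and derivable(rho2, tau2):
                return True
    return False

o = 'o'
sigma = (o, o)                        # sigma = o -> o
a4 = ((sigma, o), o)                  # (sigma -> o) -> o
assert derivable(sigma, a4)           # an instance of axiom A4
assert not derivable(a4, sigma)
```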

Definition 11. A successful proof tree for ρ ◁ τ is a finite tree whose root is
labelled with the goal ρ ◁ τ, the successor nodes of a node are the result of an
application of one of the rules to it, and each leaf is labelled with an axiom. We
write ⊢ ρ ◁ τ if there is a successful proof tree for ρ ◁ τ.

For some intuition about the product rules assume ρ = ρ1 → . . . → ρl → a and
τ = τ1 → . . . → τn → a. Now, |= ρ ◁ τ iff there are canonical, Definition 2,
witnesses D(z1^{ρ1}, . . . , zl^{ρl}) = λf^τ.f S1^{τ1} . . . Sn^{τn}. Since we can reorder components
of ρ and τ we can assume that z1 is in S1^{τ1}. Suppose z1, . . . , zk, where k > 1,
are in S1^{τ1} and so y1 must occur in T1^{ρ1}, . . . , Tk^{ρk}. Therefore, there is a common
coder S1^{τ1}(x1/z1, . . . , xk/zk) and k decoders Ti(z_i) where z_i = z_{i1}^{ρ_{i1}}, . . . , z_{il_i}^{ρ_{il_i}}
and ρ_{i1}, . . . , ρ_{il_i} are the components of ρi such that Ti(z_i)(S1^{τ1}(x1, . . . , xk)) =βη
xi z_i (which is similar to the product in [3]). In S1^{τ1}(x1/z1, . . . , xk/zk) there are
distinct odd length paths w1, . . . , wk of type τ1 to the lambda nodes above
x1, . . . , xk. These paths are realisable, Definition 6, because each xi belongs to
the normal form of Ti(z_i)(S1^{τ1}(x1, . . . , xk)). Using a combinatorial argument,
see the full version, S1^{τ1} can be chosen so that these words are k-minimal and
by reordering ρ’s components w1 ≺ . . . ≺ wk. We may not be able to reduce

(σ → o) → (σ → o) → o ◁ (σ → (o → o → o) → o) → o
│
├── [σ → o, σ → o] ◁ σ → (o → o → o) → o
│   ├── σ → o ◁ σ → o
│   └── σ → o ◁ σ → o
└── o ◁ o

Fig. 3. A proof tree for Example 1

to the subgoals ρ1 ◁ τ1, . . . , ρk ◁ τ1 as wi may prescribe the form of Ti(z_i):
if Ti(z_i) = λf^{τ1}.f S_1^i . . . S_m^i then path wi may prevent S_j^i containing elements
of z_i; so, this may restrict the possible distribution of z_i within the subterms
S_1^i, . . . , S_m^i, which is captured using τ1 ↾ wi.
An example proof tree is in Figure 3 for the retract of Example 1 (which is
not affine). Rule P1 is applied to the root and then P2 to the first subgoal where
w1 = (ε, 2, 21) and w2 = (ε, 2, 22). Let σ′ = σ → (o → o → o) → o. Now, σ′ ↾ w1
= σ → o = σ′ ↾ w2; in both cases only the first component of σ′ is admitted.

3.2 Multiple Ground Types


We extend the proof system to include multiple ground types. Again, assertions
are of the two kinds ρ ◁ τ and [ρ1, . . . , ρk] ◁ τ. However, we now assume that
for ρ ◁ τ to be a well-formed assertion both ρ and τ must share the same target
type (which is guaranteed when there is a single ground type). The rules for this
assertion are as before: the axiom I, weakening W, covariance C and the product
rule P1 in Figure 2; however, C carries the requirement that the target types of
δ and σ coincide. The other product rule P2′, presented in Figure 4, is different:
the arity of ρ1 → . . . → ρn → a is the maximum of n and the arities of each ρi
where a ground type a has arity 0.
In [ρ1, . . . , ρk] ◁ σ it is not required that ρj and σ share the same target type.
Instead rule P2′ requires that ρi and vi(σ), see Definition 9, do share target types:

     [ρ1, . . . , ρk] ◁ σ
P2′  ────────────────────────────────────────────  where
     ρ1 ◁ v1(σ) ↾ w1    . . .    ρk ◁ vk(σ) ↾ wk

– k′ is the maximum of k and h² where h is the arity of σ,
– there is a k′-minimal realisable subtree U of type σ where each path has even
length (which can be ∅),
– each vi is ε, a prefix of a path in U of odd length or the extension of a path in U
with a single node,
– v1∧w1 ≺ . . . ≺ vk∧wk and each vi∧wi is a k′-minimal realisable path of type σ of odd
length and, if U ≠ ∅, vi∧wi extends some path in U.

Fig. 4. Product proof rule for multiple ground types



(b → a) → o → a ◁ (b → (a → o → a) → a) → a
│
├── [b → a, o] ◁ b → (a → o → a) → a
│   ├── b → a ◁ b → a
│   └── o ◁ o
└── a ◁ a

Fig. 5. A proof tree for Example 2

for the concatenation vi∧wi see Definition 10. The specialisation to the case of
the single ground type is when U = ∅ and each vi = ε.
Let ρ = ρ1 → . . . → ρl → a and τ = τ1 → . . . → τn → a. So, |= ρ ◁ τ
iff there are canonical witnesses D(z1^{ρ1}, . . . , zl^{ρl}) = λf^τ.f S1^{τ1} . . . Sn^{τn} and C(x) =
λy1^{τ1} . . . yn^{τn}.H(x T1^{ρ1} . . . Tl^{ρl}). Assume z1, . . . , zk, where k ≥ 1, occur in S1^{τ1}. There
is a path v in C(x) to the node above x which determines a subtree U of S1^{τ1}. The
head variable in Ti^{ρi} bound in v has the same target type as ρi. There are distinct
paths v1∧w1, . . . , vk∧wk of odd length to the lambda nodes above z1, . . . , zk in S1^{τ1}:
vi is decided by the meaning of the head variable in Ti^{ρi}; so, vi(τ1) has the same
target type as ρi. The rest of the path is the tail of wi: so we need to consider
whether |= ρi ◁ vi(τ1) ↾ wi.
Figure 5 is the proof tree for the retract in Example 2. There is an application
of P1 followed by P2′. In the application of P2′ the subtree U = {(ε, 2)}, v1 = ε,
w1 = (ε, 2, 21) = v1∧w1, v2 = (ε, 2, 22) = v2∧w2 when w2 = (22). So, v1(b →
(a → o → a) → a) ↾ w1 = b → a as the first component is admitted (unlike the
second); and v2(b → (a → o → a) → a) = o = o ↾ w2.

3.3 Complexity
The proof systems provide nondeterministic decision procedures for checking
retracts. Each subgoal of a proof rule has smaller size than the goal. Hence, by
focussing on one subgoal at a time a proof witness can be presented in PSPACE.
However, this does not take into account checking that a subgoal obeys the
side conditions in the case of the product rules. Given any type σ, there are
boundedly many realisable k-minimal paths (with an upper bound of k^n where
n is the size of σ). So, overall the decision procedure requires at most EXPSPACE.

4 Soundness and Completeness


To show soundness and completeness of our proof systems, we define a dialogue
game G(D(z), C(x)) played by a single player ∀ on the trees of potential wit-
nesses for a retract that characterises when (D(z))C(x) =βη xz, similar to game
semantics [5]. The game is defined in the full version of the paper.
To provide intuition for the reader we briefly describe G(D(z1 , z2 ), C(x)) where
these terms are from Figure 1. Play starts at node (0), the binder λf at that

node is associated with C(x) rooted at (12); so, the next position is at node (1)
and therefore play jumps to (12); the binder λy at (12) is associated with node (2)
(the successor of (1)). Play proceeds to (13) and ∀ chooses to go left or right;
suppose it is left, so play is then at (14); nodes (13) and (14) are part of the
normal form. Play descends to (15) and, therefore, jumps to (2); so, with the
binder at (2), u is associated with the subtree at (16) and v with the subtree
at (18). Play proceeds to (3) and so jumps to (18); now, s is associated with (4)
and t with (8). Play proceeds to (19) and so jumps to (4), descends to (5) and
then to (6) and then to (7) and jumps to (16) before finishing at (17). This play
captures the path xλw.z1 w of the normal form.
Some of the key properties, defined in the full version, that we appeal to in the
correctness proofs below associate subtrees with realisable paths and vice versa.
For instance, as illustrated in the play above, the path rooted at (0) down to (7)
is associated with the subtree rooted at (12) and with leaves (17) and (19). Let
ρ = ρ1 → . . . → ρl → a and τ = τ1 → . . . → τn → a and let τ1 = σ = σ1 →
. . . → σm → b.
Theorem 1. (Soundness) If ⊢ ρ ◁ τ then |= ρ ◁ τ.
Proof. By induction on the depth of a proof. For the base case, the result is
clear for a proof that uses the axiom I. So, assume the result for all proofs of
depth < d. Consider now a proof of depth d. We proceed by examining the
first rule that is applied to show ⊢ ρ ◁ τ. If it is W or C the result follows
using the same arguments as in [3]. Assume the rule is W and suppose |= ρ ◁ τ.
Therefore there are terms D1 and C1 such that D1^{τ→ρ}(C1^{ρ→τ} x′) =βη x′. Now
D^{(σ→τ)→ρ} = λf^{σ→τ}.D1(f y^σ) (with y^σ a fresh variable) and C^{ρ→(σ→τ)} x′ = λs^σ.C1(x′)
are witnesses for |= ρ ◁ σ → τ. Assume that the rule is C, so |= δ ◁ σ and |= ρ ◁ τ. So there are
terms D1, C1, D2, C2 such that D1^{σ→δ}(C1^{δ→σ} x) =βη x and D2^{τ→ρ}(C2^{ρ→τ} x) =βη x.
Now D^{(σ→τ)→(δ→ρ)} = λx y.D2(x(C1 y)) and C^{(δ→ρ)→(σ→τ)} = λu z.C2(u(D1 z))
are witnesses for |= δ → ρ ◁ σ → τ.
Consider next that the first rule is P1. So after P1 there is either an application
of P2 or P2′: in the former case, there are k-minimal realisable paths w1 ≺ . . . ≺
wk of odd length of type σ such that ⊢ ρi ◁ σ ↾ wi; in the latter case, there is
a k′-minimal realisable subtree U of type σ where each path has even length;
and there are paths v1∧w1 ≺ . . . ≺ vk∧wk where each element is a k′-minimal
realisable path of type σ of odd length and, if U ≠ ∅, it extends some path in
U and where each vi is ε, a prefix of a path in U of odd length or an
extension of a path in U with a single node and ⊢ ρi ◁ vi(σ) ↾ wi; where k′ is the
maximum of k and the square of the arity of σ. So, by the induction hypothesis
there are terms Di(z_i) and Ci(xi) such that Di(z_i)(Ci(xi)) =βη xi z_i, witnesses
for ρi ◁ σ ↾ wi or ρi ◁ vi(σ) ↾ wi, and terms D′(z_{k+1}, . . . , zl) and C′(x′) such
that D′(z_{k+1}, . . . , zl)(C′(x′)) =βη x′ z_{k+1} . . . zl, witnesses for ρ_{k+1} → . . . → ρl →
a ◁ τ′ where τ′ = τ2 → . . . → τn → a. We assume that all these terms are
canonical witnesses. The term D′(z_{k+1}, . . . , zl) is λf^{τ′}.f S2^{τ2} . . . Sn^{τn} and C′(x′) is
λy2^{τ2} . . . yn^{τn}.H′(x′ T_{k+1}^{ρ_{k+1}} . . . Tl^{ρl}) where H′ = ε if the rule applied was P2.
We need to show that there are terms D(z1, . . . , zl) and C(x) that are witnesses for |= ρ ◁ τ. D(z) will have the form λf^τ.f S1^{τ1} . . . Sn^{τn} and C(x) the form
λy1^{τ1} . . . yn^{τn}.H(x T1^{ρ1} . . . Tl^{ρl}) where H = ε in the case of a single ground type. All
that remains is to define S1^{τ1} so it contains z1, . . . , zk, T1^{ρ1}, . . . , Tk^{ρk} and H (as
an extension of H′). If U = ∅ then H = H′. Otherwise, let u be an odd length
path that U is associated with (so, its head variable is y1^{τ1}). H consists of
the suffix of u followed by the subtree H′. The head variable of each Ti^{ρi} is y1 in
the case of the single ground type and g_i^{vi(σ)} in the general case (which is either
y1 or bound in u). We assume that Si is the subterm of S1^σ that is rooted at
the initial vertex of the path wi: which is S1^σ itself in the single ground type case.
To complete these terms we require that Ti^{ρi}(S1^σ(z1, . . . , zk)) =βη zi. Therefore,
removing lambda abstraction over variables z_{ij} and changing zi to xi, we require
that Ti(z_i)(Si(x1, . . . , xk)) =βη xi z_i. We construct a term C′(xi) that occurs
after the path wi in Si (and which has root xi when there is a single ground
type). We also complete Ti(z_i) whose initial part is the tree Ui associated with
the path wi.
First, we examine the single ground type case. So, S1^σ will have the form
λu1 . . . um.S1′, C′(xi) the form xi C_{i1}′ . . . C_{ip}′ and Ti(z_i) the form λfi^σ.fi V_1^i . . . V_m^i.
Assume Di(z_i) is λgi^{σ↾wi}.gi W_{i1}^i . . . W_{il}^i and Ci(xi) is λu_{i1} . . . u_{il}.xi C_1^i . . . C_p^i. Assume wi admits σ_{ij}: therefore, for some r : 1 ≤ r ≤ m, i_j = r (so, W_r^i may
contain occurrences of variables in z_i). If ur does not occur in the path wi then
we set V_r^i = W_r^i. Otherwise, there is a non-empty subpath w_{ir} of wi generated
by ur, and a subtree U_r^i of V_r^i associated with w_{ir}. Each C_j^i contains a single u_{ik}
(as head variable). Assume C_s^i contains ur. Assume that the path in W_r^i to the
lambda node above z_{is} is w_s. If we can build the same path in V_r^i (by copying
nodes of C_s^i to C_{is}′) then we are done (letting V_r^i include this path followed by
the subterm of W_r^i rooted at z_{is}). Otherwise, we initially include w_{ir} in C_{is}′ and
then try to build w_s in V_r^i by copying nodes of C_s^i to C_{is}′: in V_r^i and, therefore,
in U_r^i, there is a path whose prefix, except for its final variable vertex, is the same
as a prefix of w_s and they then differ. In the game G(C_{is}′, V_r^i), play jumps from that
variable in V_r^i to a lambda node in w_{ir}. By the definition of admits, there is a binder
n labelled λv in w_{ir} such that for some q, not(n ↓q n′) for all nodes n′ after n
in wi (and in w_{ir}). Therefore, we add a variable node labelled v_q to the end of
w_{ir} in C_{is}′; so play jumps to a lambda node in V_r^i which is a successor of a leaf
of U_r^i; below this node, we build the path w_s except for its root node (by adding
further nodes to C_{is}′ and adding the subtree rooted at z_{is} in W_r^i to V_r^i).
For the general case, assume vi(σ) = σ1′ → . . . → σ_{m′}′ → b. So, S_i^{vi(σ)} will
have the form λu1 . . . u_{m′}.S1′, C′(xi) the form Hi′(xi C_{i1}′ . . . C_{ip}′) and Ti(z_i) the
form λfi^σ.fi V_1^i . . . V_{m′}^i. Assume Di(z_i) is λgi^{vi(σ)↾wi}.gi W_{i1}^i . . . W_{il}^i and Ci(xi) is
λu_{i1} . . . u_{il}.Hi(xi C_1^i . . . C_p^i). We set Hi′ = Hi. Then we proceed in a similar fashion to the single base type case. If some ur does not occur in the path wi then
V_r^i = W_r^i; otherwise we need to build similar paths to z_{is} in W_r^i in V_r^i (by copying
vertices from C_s^i to C_{is}′ and using that wi admits (vi(σ))_r). ⊓⊔

The proof of completeness (by induction on the size of ρ) is easier.

Theorem 2. (Completeness) If |= ρ ◁ τ then ⊢ ρ ◁ τ.



5 Conclusion

We have provided tableau proof systems that characterise when a type is a retract
of another type in simply typed lambda calculus (with respect to βη-equality).
They offer a nondeterministic decision procedure for the retract problem in
EXPSPACE: it may be possible to improve on the rather crude k-minimality
bounds used on paths within the proof systems. Given the constructive proof of
correctness, we also expect to be able to extract witnesses for a retract from a
successful tableau proof tree (similar in spirit to [10]).

References
1. Barendregt, H.: Lambda calculi with types. In: Abramsky, S., Gabbay, D.,
Maibaum, T. (eds.) Handbook of Logic in Computer Science, vol. 2, pp. 118–309.
Oxford University Press (1992)
2. Bruce, K., Longo, G.: Provable isomorphisms and domain equations in models of
typed languages. In: Proc. 17th Symposium on Theory of Computing, pp. 263–272.
ACM (1985)
3. de' Liguoro, U., Piperno, A., Statman, R.: Retracts in simply typed λβη-calculus.
In: Procs. LICS 1992, pp. 461–469 (1992)
4. Loader, R.: Higher-order β-matching is undecidable. Logic Journal of the
IGPL 11(1), 51–68 (2003)
5. Ong, C.-H.L.: On model-checking trees generated by higher-order recursion
schemes. In: Procs. LICS 2006, pp. 81–90 (2006)
6. Ong, C.-H.L., Tzevelekos, N.: Functional Reachability. In: Procs. LICS 2009, pp.
286–295 (2009)
7. Padovani, V.: Decidability of fourth-order matching. Mathematical Structures in
Computer Science 10(3), 361–372 (2000)
8. Padovani, V.: Retracts in simple types. In: Abramsky, S. (ed.) TLCA 2001. LNCS,
vol. 2044, pp. 376–384. Springer, Heidelberg (2001)
9. Regnier, L., Urzyczyn, P.: Retractions of types with many atoms, pp. 1–16 (2005),
http://arxiv.org/abs/cs/0212005
10. Schubert, A.: On the building of affine retractions. Math. Struct. in Comp. Sci-
ence 18, 753–793 (2008)
11. Stirling, C.: Higher-order matching, games and automata. In: Procs. LICS 2007,
pp. 326–335 (2007)
12. Stirling, C.: Dependency tree automata. In: de Alfaro, L. (ed.) FOSSACS 2009.
LNCS, vol. 5504, pp. 92–106. Springer, Heidelberg (2009)
13. Stirling, C.: Decidability of higher-order matching. Logical Methods in Computer
Science 5(3:2), 1–52 (2009)
14. Stirling, C.: An introduction to decidability of higher-order matching (2012)
(submitted for publication), available at the author's website
15. Vorobyov, S.: The “hardest” natural decidable theory. In: Procs. LICS 1997,
pp. 294–305 (1997)
Presburger Arithmetic, Rational Generating
Functions, and Quasi-Polynomials

Kevin Woods

Oberlin College, Oberlin, Ohio, USA


[email protected]

Abstract. A Presburger formula is a Boolean formula with variables in


N that can be written using addition, comparison (≤, =, etc.), Boolean
operations (and, or, not), and quantifiers (∀ and ∃). We characterize sets
that can be defined by a Presburger formula as exactly the sets whose
characteristic functions can be represented by rational generating func-
tions; a geometric characterization of such sets is also given. In addition,
if p = (p1 , . . . , pn ) are a subset of the free variables in a Presburger
formula, we can define a counting function g(p) to be the number of
solutions to the formula, for a given p. We show that every counting
function obtained in this way may be represented as, equivalently, either
a piecewise quasi-polynomial or a rational generating function. In the full
version of this paper, we also translate known computational complexity
results into this setting and discuss open directions.

1 Introduction
A broad and interesting class of sets is the class of those that can be defined over
N = {0, 1, 2, . . .} with first order logic and addition.

Definition 1. A Presburger formula is a Boolean formula with variables in N


that can be written using addition, comparison (≤, =, etc.), Boolean operations
(and, or, not), and quantifiers (∀ and ∃). We will denote a generic Presburger
formula as F (u), where u are the free variables (those not associated with a
quantifier); we use bold notation like u to indicate vectors of variables.
We say that a set S ⊆ Nd is a Presburger set if there exists a Presburger
formula F (u) such that S = {u ∈ Nd : F (u)}.

Example 1. The Presburger formula


 
F (u) = (u > 1 and ∃b ∈ N : b + b + 1 = u)

defines the Presburger set {3, 5, 7, . . .}. Since multiplication by an integer is the
same as repeated addition, we can conceive of a Presburger formula as a Boolean
combination of integral linear (in)equalities, appropriately quantified: ∃b (u > 1
and 2b + 1 = u).

Full version available at http://www.oberlin.edu/faculty/kwoods/papers.html

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 410–421, 2013.

© Springer-Verlag Berlin Heidelberg 2013
Presburger Arithmetic, Rational Generating Functions 411

Presburger proved [35] that the truth of a Presburger sentence (a formula with
no free variables) is decidable. In contrast, a broader class of sentences, where
multiplication of variables is allowed, is undecidable; this is a consequence of the
negative solution to Hilbert’s 10th problem, given by Davis, Putnam, Robinson,
and Matiyasevich (see, for example, [19]).
We would like to understand more clearly the structure of a given Presburger
set. One way to attempt to do this is to encode the elements of the set into a
generating function.

Definition 2. Given a set S ⊆ N^d, its associated generating function is

f(S; x) = Σ_{s∈S} x^s = Σ_{(s1,...,sd)∈S} x1^{s1} x2^{s2} · · · xd^{sd}.

For example, if S is the set defined by Example 1, then

f(S; x) = x^3 + x^5 + x^7 + · · · = x^3 / (1 − x^2).
We see that, in this instance, the generating function has a nice form; this is not
a coincidence.
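The correspondence between the set and the series is easy to check numerically. A small sketch in Python (the bound and all names are ours, not the paper's):

```python
# Enumerate the Presburger set {u : u > 1 and ∃b ∈ N. 2b + 1 = u} up to a bound.
BOUND = 50
S = [u for u in range(BOUND) if u > 1 and any(2 * b + 1 == u for b in range(u))]

# Coefficients of x^3 / (1 - x^2) = x^3 + x^5 + x^7 + ... up to the same bound.
coeffs = [0] * BOUND
e = 3
while e < BOUND:
    coeffs[e] = 1
    e += 2

# The generating function is exactly the characteristic function of S.
assert S == [u for u, c in enumerate(coeffs) if c == 1]
```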

Definition 3. A rational generating function is a function that can be written
in the form

q(x) / ((1 − x^{b1}) · · · (1 − x^{bk})),

where q(x) is a polynomial in Q[x] and bi ∈ N^d \ {0}.

We will prove that S ⊆ Nd is a Presburger set if and only if f (S; x) is a rational


generating function. These are Properties 1 and 3 in the following theorem:
Theorem 1. Given a set S ⊆ Nd , the following are equivalent:
1. S is a Presburger set,
2. S is a finite union of sets of the form P ∩ (λ + Λ), where P is a polyhedron,
λ ∈ Zd , and Λ ⊆ Zd is a lattice.
3. f (S; x) is a rational generating function.
Property 2 gives a nice geometric characterization of Presburger sets; the set in
Example 1 can be written as [3, ∞) ∩ (1 + 2Z).
We are particularly interested in generating functions because of their power-
ful flexibility: we can use algebraic manipulations to answer questions about the
set. For example, f (S; 1, 1, . . . , 1) is exactly the cardinality of S (if finite). More
generally, we may want to count solutions to a Presburger formula as a function
of several parameter variables:

Definition 4. The Presburger counting function for a given Presburger formula


F (c, p) is
gF (p) = #{c ∈ Nd : F (c, p)}.
412 K. Woods

Note that c (the counted variables) and p (the parameter variables) are free
variables. We will restrict ourselves to counting functions such that gF (p) is
finite for all p ∈ Nn . One could instead either include ∞ in the codomain of gF
or restrict the domain of gF to where gF (p) is finite (this domain would itself
be a Presburger set).
A classic example is to take F (c, p) to be the conjunction of linear inequalities
of the form a1 c1 + · · ·+ ad cd ≤ a0 p, where ai ∈ Z. Then gF (p) counts the number
of integer points in the pth dilate of a polyhedron.
Example 2. If F (c1, c2, p) is 2c1 + 2c2 ≤ p, then the set of solutions (c1, c2) ∈ N^2
lies in the triangle with vertices (0, 0), (0, p/2), (p/2, 0), and

gF(p) = (1/2)(⌊p/2⌋ + 1)(⌊p/2⌋ + 2)
      = (1/8)p^2 + (3/4)p + 1     if p is even,
        (1/8)p^2 + (1/2)p + 3/8   if p is odd.
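The quasi-polynomial in Example 2 can be checked against a brute-force count of lattice points; exact rational arithmetic avoids floating-point issues (the names gF and quasi are ours):

```python
from fractions import Fraction

def gF(p):
    # Brute-force count of (c1, c2) in N^2 with 2*c1 + 2*c2 <= p.
    return sum(1 for c1 in range(p + 1) for c2 in range(p + 1)
               if 2 * c1 + 2 * c2 <= p)

def quasi(p):
    # The two polynomial pieces, one per coset of the lattice 2Z.
    if p % 2 == 0:
        return Fraction(1, 8) * p * p + Fraction(3, 4) * p + 1
    return Fraction(1, 8) * p * p + Fraction(1, 2) * p + Fraction(3, 8)

assert all(gF(p) == quasi(p) for p in range(60))
```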

The nice form of this function is also not a coincidence. For this particular type
of Presburger formula (dilates of a polyhedron), Ehrhart proved [21] that the
counting functions are quasi-polynomials:

Definition 5. A quasi-polynomial (over Q) is a function g : Nn → Q such


that there exists an n-dimensional lattice Λ ⊆ Zn together with polynomials
qλ̄ (p) ∈ Q[p], one for each λ̄ ∈ Zn /Λ, such that

g(p) = qλ̄ (p), for p ∈ λ̄.

In Example 2, we can take the lattice Λ = 2Z and each coset (the evens and the
odds) has its associated polynomial. We need something slightly more general
to account for all Presburger counting functions:

Definition 6. A piecewise quasi-polynomial is a function g : N^n → Q such that
there exists a finite partition ∪_i (Pi ∩ N^n) of N^n, with Pi polyhedra (which may
not all be full-dimensional), and there exist quasi-polynomials gi such that

g(p) = gi(p) for p ∈ Pi ∩ N^n.

One last thing that is not a coincidence: For the triangle in Example 2, we can
compute

Σ_{p∈N} gF(p) x^p = 1 + x + 3x^2 + 3x^3 + 6x^4 + · · · = 1 / ((1 − x)(1 − x^2)^2),

a rational generating function! The following theorem says that these ideas are
– almost – equivalent.
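The coefficient claim for this generating function can be verified by multiplying out the three geometric series; a short sketch (the truncation degree and all names are ours):

```python
N = 30  # compare coefficients up to degree N - 1

def geometric(b, n):
    # Coefficients of 1 / (1 - x^b) = 1 + x^b + x^(2b) + ...
    return [1 if i % b == 0 else 0 for i in range(n)]

def convolve(f, g):
    # Truncated power-series product.
    h = [0] * len(f)
    for i, a in enumerate(f):
        for j in range(len(f) - i):
            h[i + j] += a * g[j]
    return h

series = convolve(geometric(1, N), convolve(geometric(2, N), geometric(2, N)))
counts = [sum(1 for c1 in range(p + 1) for c2 in range(p + 1)
              if 2 * c1 + 2 * c2 <= p) for p in range(N)]
assert series == counts            # 1/((1-x)(1-x^2)^2) encodes gF
assert series[:5] == [1, 1, 3, 3, 6]
```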

Theorem 2. Given a function g : N^n → Q and the following three possible
properties:

A. g is a Presburger counting function,
B. g is a piecewise quasi-polynomial, and
C. Σ_{p∈N^n} g(p) x^p is a rational generating function,

we have the implications

A ⇒ B ⇔ C.
Remark 1. Proving Theorem 2 will give us much of Theorem 1, using the fol-
lowing idea. A set S ⊆ Zd corresponds exactly to its characteristic function

χS(u) = 1 if u ∈ S, and χS(u) = 0 if u ∉ S.

If S is a Presburger set defined by F (u), then

χS (u) = #{c ∈ N : F (u) and c = 0}


is a Presburger counting function.
In light of Theorem 1, we might wonder if there is a sense in which B ⇒ A.
Of course we would have to restrict g, for example requiring that its range
be in N (Theorem 1 essentially restricts the range of g to {0, 1}, as it must
be a characteristic function). The implication still does not hold, however. For
example, suppose the polynomial

g(s, t) = (t − s^2)^2

were a Presburger counting function given by a Presburger formula F(c, s, t),
that is,

g(s, t) = #{c ∈ N^d : F(c, s, t)}.

Then the set

{(s, t) ∈ N^2 : ¬∃c F(c, s, t)} = {(s, t) ∈ N^2 : g(s, t) = 0} = {(s, s^2) : s ∈ N}
would be a Presburger set. This is not the case, however, as it does not satisfy
Property 2 in Theorem 1. If the parameter is univariate, however, the following
proposition shows that we do have the implication B ⇒ A.
Proposition 1. Given a function g : N → Q, if g is a piecewise quasi-polynomial
whose range is in N, then g is a Presburger counting function.
In Section 4, we prove Theorems 1 and 2 (the proof of Proposition 1 appears in
the full version of this paper). In Section 2, we survey related work. In Section
3, we present the primary tools we need for the proofs. In the full version of this
paper, we also turn to computational questions; we survey known results, but
restate them in terms of Presburger arithmetic.

2 Related Work

Presburger arithmetic is a classical first order theory of logic, proven decidable by


Presburger [35]. Various upper and lower bounds on the complexity of decision
algorithms for the general theory have occupied the theoretical computer science
community, see [8,17,22,24,26,33].
A finite automata approach to Presburger arithmetic was pioneered in [12,15],
and continues to be an active area of research (see, for example, [10,16,30,47]).
This approach is quite different from the present paper’s, but it can attack similar
questions: for example, see [34] for results on counting solutions to Presburger
formulas (non-parametrically).
The importance of understanding Presburger Arithmetic is highlighted by the
fact that many problems in computer science and mathematics can be phrased
in this language: for example, integer programming [31,40], geometry of num-
bers [13,29], Gröbner bases and algebraic integer programming [43,45], neigh-
borhood complexes and test sets [38,44], the Frobenius problem [37], Ehrhart
theory [7,21], monomial ideals [32], and toric varieties [23]. Several of the above
references analyze the computational complexity of their specific problem. In
most of the above references, the connection to Presburger arithmetic is only
implicit.
The algorithmic complexity of specific rational generating function problems
has been addressed in, for example, [1,5,9,20,27,28]. Several of these results are
summarized in the full version of this current paper.
Connections between subclasses of Presburger arithmetic and generating func-
tions are made explicit in [3,4,5]. Connections between rational generating func-
tions and quasi-polynomials have been made in [21,41,42], and the algorithmic
complexity of their relationship was examined in [46]. Counting solutions to Pres-
burger formulas has been examined in [36], though the exact scope of the results
is not made explicit, and rational generating functions are not used. Similar
counting algorithms appear in [14], and [18] proves that the counting functions
for a special class of Presburger formulas (those whose parameters p only ap-
pear in terms ci ≤ pi ) are piecewise quasi-polynomials. This current paper is
the first to state and prove a general connection between Presburger arithmetic,
quasi-polynomials, and rational generating functions.
Theorem 1 was originally proved in the author’s thesis [48]; in this paper, it
is put into context as a consequence of the more general Theorem 2. A simpler
geometric characterization of Presburger sets (equivalent to Property 2 of The-
orem 1) was given in [25]: they are the semi-linear sets, those sets that can be
written as a finite union of sets of the form S = {a0 + Σ_{i=1}^{k} ni ai : ni ∈ N},
where ai ∈ Nd . Furthermore, if one takes these S to be disjoint and requires the
a1 , . . . , ak to be linearly independent, for each S (as [25] implicitly prove can
be done, made explicit in [18] as semi-simple sets), then each S can be encoded
with the rational generating function

x^{a0} / ((1 − x^{a1}) · · · (1 − x^{ak}))

and we obtain a slightly different version of 2 ⇒ 3 in Theorem 1. There seems


to be no previous result analogous to 3 ⇒ 2.

3 Primary Background Theorems


Here we detail several tools we will use. The first tool we need is a way to simplify
Presburger formulas. As originally proved [35] by Presburger (see [33] for a nice
exposition), we can completely eliminate the quantifiers if we are allowed to also
use modular arithmetic.
Definition 7. An extended Presburger formula is a Boolean formula with vari-
ables in N expressible in the elementary language of Presburger Arithmetic ex-
tended by the mod k operations, for constants k > 1.

Theorem 3. Given a formula F (u) in extended Presburger arithmetic (and


hence any formula in Presburger arithmetic), there exists an equivalent quanti-
fier free formula G(u) such that

{u ∈ Nd : F (u)} = {u ∈ Nd : G(u)}.

For instance, the set from Example 1 can be written as (u > 1 and u mod 2 = 1).
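For the set from Example 1, quantifier elimination trades the quantifier ∃b for a mod-2 test; the two definitions agree, as a quick check confirms (the bound is ours):

```python
# ∃b ∈ N : 2b + 1 = u  is equivalent to  u mod 2 = 1 (for u in N),
# so the quantified and quantifier-free formulas define the same set.
quantified = {u for u in range(100)
              if u > 1 and any(2 * b + 1 == u for b in range(u))}
quantifier_free = {u for u in range(100) if u > 1 and u % 2 == 1}
assert quantified == quantifier_free
```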
Next, we give two theorems that tie in generating functions. The first gives us
a way to convert from a specific type of Presburger set to a generating function.

Theorem 4. Given a point λ ∈ Z^d, a lattice Λ ⊆ Z^d, and a rational polyhedron
P ⊆ R^d_{≥0}, f(P ∩ (λ + Λ); x) (as given in Definition 2) is a rational generating
function.

The first step to proving this is to use Brion's Theorem [11], which says that the
generating function can be decomposed into functions of the form f(K ∩ (λ +
Λ); x), where K is a cone. Then, one can notice that integer points in cones have
a natural structure that can be encoded as geometric series.
a natural structure that can be encoded as geometric series.

Example 3. Let K ⊆ R2 be the cone with vertex at the origin and extreme rays
u = (1, 0) and v = (1, 2). Using the fact that the lattice (uZ + vZ) has index 2
in Z2 , with coset representatives (0, 0) and (1, 1), every integer point in K can
be written as either (0, 0) + λ1 u + λ2 v or (1, 1) + λ1 u + λ2 v, where λ1 , λ2 ∈ N.
Therefore

f(K ∩ Z^2; x) = (x^{(0,0)} + x^{(1,1)})(1 + x^u + x^{2u} + · · · )(1 + x^v + x^{2v} + · · · )
              = (x^{(0,0)} + x^{(1,1)}) / ((1 − x^u)(1 − x^v)).

See [2, Chapter VIII], for example, for more details.
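The unique-representation claim in Example 3 can be tested numerically on a finite patch of the cone. A sketch (the bound A and all names are ours):

```python
from collections import Counter

u, v = (1, 0), (1, 2)
A = 12  # check all cone points whose first coordinate is at most A

# Generate (0,0) + l1*u + l2*v and (1,1) + l1*u + l2*v, counting multiplicity.
reps = Counter()
for base in [(0, 0), (1, 1)]:
    for l1 in range(A + 1):
        for l2 in range(A + 1):
            reps[(base[0] + l1 * u[0] + l2 * v[0],
                  base[1] + l1 * u[1] + l2 * v[1])] += 1

# Integer points of K (extreme rays u, v): exactly the (a, b) with 0 <= b <= 2a.
cone = [(a, b) for a in range(A + 1) for b in range(2 * a + 1)]
assert all(reps[p] == 1 for p in cone)  # each point represented exactly once
```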


Next, we would like to be able to perform substitutions on the variables in
a rational generating function and still retain a rational generating function;
particularly, we would like to substitute in 1’s for several of the variables.

Theorem 5. Given a rational generating function f(x), then

g(z) = f(z^{l1}, z^{l2}, . . . , z^{ld}),

with li ∈ N^k, is also a rational generating function, assuming the substituted
values do not lie entirely in the poles of f. In particular, substituting in xi =
z^0 = 1 yields a rational function, if 1 is not a pole of f.

The proof is immediate: if substituting in xi = z^{li} would make any of the
binomials in the denominator of f zero (when f is written in the form from
Definition 3), then that binomial must be a factor of the numerator (or else such
z^{li} would lie entirely in the poles of f); therefore, substituting in xi = z^{li}
yields a new rational generating function.
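As a simple instance of such a substitution (our own toy example, not from the paper): specializing f(x1, x2) = 1/((1 − x1)(1 − x2)), the generating function of N^2, at x1 = x2 = z gives Σ_p (p + 1) z^p = 1/(1 − z)^2, since exactly p + 1 points of N^2 lie on each line s1 + s2 = p:

```python
N = 25
geo = [1] * N  # coefficients of 1/(1 - z)

# Coefficients of 1/(1 - z)^2, by convolving the geometric series with itself.
squared = [sum(geo[i] * geo[p - i] for i in range(p + 1)) for p in range(N)]

# After substituting x1 = x2 = z, the coefficient of z^p counts s1 + s2 = p.
counts = [sum(1 for s1 in range(p + 1) for s2 in range(p + 1) if s1 + s2 == p)
          for p in range(N)]
assert squared == counts == [p + 1 for p in range(N)]
```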
Finally, we need a connection between Presburger formulas and quasi-poly-
nomials. This is given by Sturmfels [42]:

Definition 8. Given a1 , . . . , ad ∈ Nn , the vector partition function g : Nn → N


is defined by

g(p) = #{(λ1 , . . . , λd ) ∈ Nd : p = λ1 a1 + · · · + λd ad },

that is, the number of ways to partition the vector p into parts taken from {ai }.

Theorem 6. Any vector partition function is a piecewise quasi-polynomial.

See [6] for a self-contained explanation utilizing the partial fraction expansion
of the generating function

Σ_{p∈N^n} g(p) x^p = 1 / ((1 − x^{a1}) · · · (1 − x^{ad}));

this equality can be obtained by rewriting the rational function as a product of


infinite geometric series:

(1 + x^{a1} + x^{2a1} + · · · ) · · · (1 + x^{ad} + x^{2ad} + · · · ).
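In the simplest interesting case, a1 = 1 and a2 = 2 in N, the vector partition function is g(p) = #{(λ1, λ2) ∈ N^2 : p = λ1 + 2λ2} = ⌊p/2⌋ + 1, a quasi-polynomial for the lattice Λ = 2Z. A brute-force check (all names are ours):

```python
from fractions import Fraction

def g(p):
    # Number of ways to write p = l1*1 + l2*2 with l1, l2 in N.
    return sum(1 for l1 in range(p + 1) for l2 in range(p + 1)
               if l1 + 2 * l2 == p)

def quasi(p):
    # One polynomial per coset of 2Z: p/2 + 1 on the evens, (p + 1)/2 on the odds.
    return Fraction(p, 2) + 1 if p % 2 == 0 else Fraction(p + 1, 2)

assert all(g(p) == p // 2 + 1 == quasi(p) for p in range(60))
```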

4 Proofs

4.1 Proof of Theorem 2

A ⇒ C.
Given a Presburger counting function, g(p) = #{c ∈ Nd : F (c, p)}, we
first apply Presburger Elimination (Theorem 3) to F to obtain a quantifier free
formula, G(c, p), in extended Presburger arithmetic such that g(p) = #{c ∈
Nd : G(c, p)}. Integers which satisfy a statement of the form

a1 p1 + · · · + an pn + an+1 c1 + · · · + an+d cd ≡ a0 (mod m)



are exactly sets λ + Λ, where λ ∈ Zn+d and Λ is a lattice in Zn+d . Since G(c, p)
is a Boolean combination of linear inequalities and these linear congruences, we
may write the set, S, of points (c, p) which satisfy G(c, p) as a disjoint union

S = ∪_{i=1}^{k} Pi ∩ (λi + Λi),

where, for 1 ≤ i ≤ k, Pi ⊆ R^{n+d}_{≥0} is a polyhedron, Λi is a sublattice of Z^{n+d},
and λi is in Z^{n+d}. (To see this, convert the formula into disjunctive normal form;
each conjunction will be of this form Pi ∩ (λi + Λi ); these sets may overlap, but
their overlap will also be of this form.)
Let Si = Pi ∩ (λi + Λi ). By Theorem 4, we know we can write f (Si ; y, x) as
a rational generating function, and so

f(S; y, x) = Σ_i f(Si; y, x) = Σ_{(c,p): G(c,p)} y^c x^p

can be written as a rational generating function. Finally, we substitute y =


(1, 1, . . . , 1), using Theorem 5, to obtain the rational generating function

Σ_p #{c ∈ N^d : G(c, p)} x^p = Σ_p g(p) x^p.

C ⇒ B.
It suffices to prove this for functions g such that Σ_p g(p) x^p is a rational
generating function of the form

x^q / ((1 − x^{a1})(1 − x^{a2}) · · · (1 − x^{ak})),

where q ∈ Nn , ai ∈ Nn \ {0}, because the property of being a piecewise quasi-


polynomial is preserved under linear combinations. Furthermore, we may take
q = (0, 0, . . . , 0), because multiplying by x^q only shifts the domain of the function
g. Expanding this rational generating function as a product of infinite geometric
series,

Σ_p g(p) x^p = (1 + x^{a1} + x^{2a1} + · · · ) · · · (1 + x^{ak} + x^{2ak} + · · · ),

and we see that

g(p) = #{(λ1 , . . . , λk ) ∈ Nk : p = λ1 a1 + · · · + λk ak }.

This is exactly a vector partition function, which Theorem 6 tells us is a piecewise


quasi-polynomial.

B ⇒ C.
Any piecewise quasi-polynomial can be written as a linear combination of
functions of the form

g(p) = p^a if p ∈ P ∩ (λ + Λ), and g(p) = 0 otherwise,

where a ∈ Nn , P ⊆ Rn≥0 is a polyhedron, λ ∈ Zn , and Λ is a sublattice of Zn .


Since linear combinations of rational generating functions are rational generating
functions, it suffices to prove it for such a g. Let cij , for 1 ≤ i ≤ n and 1 ≤ j ≤ ai ,
be variables, and define the polyhedron
Q = {(p, c) ∈ N^{n+a1+···+an} : p ∈ P and 1 ≤ cij ≤ pi for all cij}.
This Q is defined so that #{c : (p, c) ∈ Q} is p1^{a1} · · · pn^{an} = p^a for p ∈ P (and
0 otherwise). Using Theorem 4, we can find the generating function for the set
Q ∩ (λ + Λ) as a rational generating function. Substituting c = (1, 1, . . . , 1), using
Theorem 5, gives us Σ_p g(p) x^p as a rational generating function.
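The counting gadget behind Q can be sanity-checked in a tiny instance, say n = 2, a = (2, 1), and P = R^2_{≥0}, where #{c : (p, c) ∈ Q} should equal p1^2 · p2 (a sketch; names are ours):

```python
def count_c(p1, p2):
    # c = (c11, c12, c21) with 1 <= c11, c12 <= p1 and 1 <= c21 <= p2.
    return sum(1 for c11 in range(1, p1 + 1)
                 for c12 in range(1, p1 + 1)
                 for c21 in range(1, p2 + 1))

# The number of choices of c is the monomial p1^2 * p2, as the proof requires.
assert all(count_c(p1, p2) == p1 ** 2 * p2
           for p1 in range(8) for p2 in range(8))
```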

4.2 Proof of Theorem 1


Given a set S ⊆ Zd , define the characteristic function, χS : Nd → {0, 1}, as in
Remark 1. Define a new property:
2′. χS is a piecewise quasi-polynomial.

Translating Theorem 2 into properties of S and χS, we have 1 ⇒ (2′ ⇔ 3).
So we need to prove 2 ⇒ 1 and 2′ ⇒ 2.

2 ⇒ 1.
This is straightforward: the property of being an element of λ + Λ can be
written using linear congruences and existential quantifiers, and the property of
being an element of P can be written as a set of linear inequalities.

2′ ⇒ 2.
Since χS is a piecewise quasi-polynomial, it is constituted from associated
polynomials. Let us examine such a polynomial q(p) that agrees with χS on
some P ∩ (λ + Λ), where P ⊆ Rn≥0 is a polyhedron, λ ∈ Zn , and Λ a sublattice
of Zn . It suffices to prove that 2 holds for S ∩ P ∩ (λ + Λ), since S is the disjoint
union of such pieces.
Ideally, we would like to argue that, since q only takes on the values 0 and 1,
the polynomial q must be constant on P ∩ (λ + Λ), at least if P is unbounded.
This is not quite true; for example, if
 
P = {(x, y) ∈ R^2 : x ≥ 0 and 0 ≤ y ≤ 1},

then the polynomial q(x, y) = y is 1 for y = 1 and 0 for y = 0.


What we can say is that q must be constant on any infinite ray contained in
P ∩(λ+Λ): if we parametrize the ray by x(t) = (x1 (t), · · · , xn (t)), then q(x(t)) is
a univariate polynomial that is either 0 or 1 at an infinite number of points, and
so must be constant. Inductively, we can similarly show that q must be constant
on any cone contained in P .
Let K be the cone with vertex at the origin

K = {y ∈ Rn : y + P ⊆ P }.

Then K is the largest cone such that the cones x + K are contained in P , for
all x ∈ P ; K is often called the recession cone or characteristic cone of P (see
Section 8.2 of [39]), and the polyhedron P can be decomposed into a Minkowski
sum K + Q, where Q is a bounded polyhedron. We can write P ∩ (λ + Λ) as a
finite union (possibly with overlap) of sets of the form

Qj = (vj + K) ∩ (λ + Λ),

for some vj , and on each of these pieces q must be constant. If q is the constant
1 on Qj , then Qj is contained in S, and if q is the constant 0, then none of Qj
is in S. Since S is a finite union of the appropriate Qj , S has the form needed
for Property 2.

Acknowledgements. My thanks to the referees, particularly for pointers to


relevant related works.

References
1. Barvinok, A.: A polynomial time algorithm for counting integral points in polyhe-
dra when the dimension is fixed. Math. Oper. Res. 19(4), 769–779 (1994)
2. Barvinok, A.: A Course in Convexity. Graduate Studies in Mathematics, vol. 54.
American Mathematical Society, Providence (2002)
3. Barvinok, A.: The complexity of generating functions for integer points in
polyhedra and beyond. In: International Congress of Mathematicians, vol. III,
pp. 763–787. Eur. Math. Soc., Zürich (2006)
4. Barvinok, A., Pommersheim, J.: An algorithmic theory of lattice points in polyhe-
dra. In: New Perspectives in Algebraic Combinatorics (Berkeley, CA, 1996–1997).
Math. Sci. Res. Inst. Publ., vol. 38, pp. 91–147. Cambridge Univ. Press, Cambridge
(1999)
5. Barvinok, A., Woods, K.: Short rational generating functions for lattice point prob-
lems. J. Amer. Math. Soc. 16(4), 957–979 (2003) (electronic)
6. Beck, M.: The partial-fractions method for counting solutions to integral linear
systems. Discrete Comput. Geom. 32(4), 437–446 (2004)
7. Beck, M., Robins, S.: Computing the continuous discretely. Undergraduate Texts in
Mathematics. Springer, New York (2007); Integer-point enumeration in polyhedra
8. Berman, L.: The complexity of logical theories. Theoret. Comput. Sci. 11(1), 57,
71–77 (1980); With an introduction “On space, time and alternation”

9. Blanco, V., García-Sánchez, P.A., Puerto, J.: Counting numerical semigroups with
short generating functions. Internat. J. Algebra Comput. 21(7), 1217–1235 (2011)
10. Boudet, A., Comon, H.: Diophantine equations, Presburger arithmetic and finite
automata. In: Kirchner, H. (ed.) CAAP 1996. LNCS, vol. 1059, pp. 30–43. Springer,
Heidelberg (1996)
11. Brion, M.: Points entiers dans les polyèdres convexes. Ann. Sci. École Norm. Sup. 4,
653–663 (1988)
12. Büchi, J.R.: Weak second-order arithmetic and finite automata. Z. Math. Logik
Grundlagen Math. 6, 66–92 (1960)
13. Cassels, J.W.S.: An introduction to the geometry of numbers. Classics in Mathe-
matics. Springer, Berlin (1997); Corrected reprint of the 1971 edition
14. Clauss, P., Loechner, V.: Parametric analysis of polyhedral iteration spaces. Journal
of VLSI Signal Processing 19(2), 179–194 (1998)
15. Cobham, A.: On the base-dependence of sets of numbers recognizable by finite
automata. Math. Systems Theory 3, 186–192 (1969)
16. Comon, H., Jurski, Y.: Multiple counters automata, safety analysis and Pres-
burger arithmetic. In: Vardi, M.Y. (ed.) CAV 1998. LNCS, vol. 1427, pp. 268–279.
Springer, Heidelberg (1998)
17. Cooper, D.: Theorem proving in arithmetic without multiplication. Machine Intel-
ligence 7, 91–99 (1972)
18. D’Alessandro, F., Intrigila, B., Varricchio, S.: On some counting problems for semi-
linear sets. CoRR abs/0907.3005 (2009)
19. Davis, M.: Hilbert’s tenth problem is unsolvable. Amer. Math. Monthly 80, 233–269
(1973)
20. De Loera, J., Haws, D., Hemmecke, R., Huggins, P., Sturmfels, B., Yoshida, R.:
Short rational functions for toric algebra. To appear in Journal of Symbolic Com-
putation (2004)
21. Ehrhart, E.: Sur les polyèdres rationnels homothétiques à n dimensions. C. R.
Acad. Sci. Paris 254, 616–618 (1962)
22. Fischer, M., Rabin, M.: Super-exponential complexity of Presburger arithmetic. In:
Complexity of Computation. SIAM–AMS Proc., vol. VII, pp. 27–41. Amer. Math.
Soc., Providence (1974); Proc. SIAM-AMS Sympos., New York (1973)
23. Fulton, W.: Introduction to Toric Varieties. Annals of Mathematics Studies,
vol. 131. Princeton University Press, Princeton (1993)
24. Fürer, M.: The complexity of Presburger arithmetic with bounded quantifier al-
ternation depth. Theoret. Comput. Sci. 18(1), 105–111 (1982)
25. Ginsburg, S., Spanier, E.: Semigroups, Presburger formulas and languages. Pacific
Journal of Mathematics 16(2), 285–296 (1966)
26. Grädel, E.: Subclasses of Presburger arithmetic and the polynomial-time hierarchy.
Theoret. Comput. Sci. 56(3), 289–301 (1988)
27. Guo, A., Miller, E.: Lattice point methods for combinatorial games. Adv. in Appl.
Math. 46(1-4), 363–378 (2011)
28. Hoşten, S., Sturmfels, B.: Computing the integer programming gap. To appear in
Combinatorics (2004)
29. Kannan, R.: Test sets for integer programs, ∀∃ sentences. In: Polyhedral Combi-
natorics. DIMACS Ser. Discrete Math. Theoret. Comput. Sci, vol. 1, pp. 39–47.
Amer. Math. Soc., Providence (1990); Morristown, NJ (1989)
30. Klaedtke, F.: Bounds on the automata size for Presburger arithmetic. ACM Trans.
Comput. Log. 9(2), 34 (2008)
31. Lenstra Jr., H.: Integer programming with a fixed number of variables. Math. Oper.
Res. 8(4), 538–548 (1983)

32. Miller, E., Sturmfels, B.: Combinatorial commutative algebra. Graduate Texts in
Mathematics, vol. 227. Springer, New York (2005)
33. Oppen, D.: A superexponential upper bound on the complexity of Presburger arith-
metic. J. Comput. System Sci. 16(3), 323–332 (1978)
34. Parker, E., Chatterjee, S.: An automata-theoretic algorithm for counting solu-
tions to Presburger formulas. In: Duesterwald, E. (ed.) CC 2004. LNCS, vol. 2985,
pp. 104–119. Springer, Heidelberg (2004)
35. Presburger, M.: On the completeness of a certain system of arithmetic of whole
numbers in which addition occurs as the only operation. Hist. Philos. Logic 12(2),
225–233 (1991); Translated from the German and with commentaries by Dale
Jacquette
36. Pugh, W.: Counting solutions to Presburger formulas: how and why. SIGPLAN
Not. 29(6), 121–134 (1994)
37. Ramírez Alfonsín, J.L.: The Diophantine Frobenius problem. Oxford Lecture Series
in Mathematics and its Applications, vol. 30. Oxford University Press, Oxford
(2005)
38. Scarf, H.: Test sets for integer programs. Math. Programming Ser. B 79(1-3),
355–368 (1997)
39. Schrijver, A.: Theory of Linear and Integer Programming. Interscience Series in
Discrete Mathematics. John Wiley & Sons Ltd., Chichester (1986)
40. Schrijver, A.: Combinatorial optimization. Polyhedra and efficiency. Algorithms
and Combinatorics, vol. 24. Springer, Berlin (2003)
41. Stanley, R.P.: Decompositions of rational convex polytopes. Ann. Discrete Math. 6,
333–342 (1980); Combinatorial mathematics, optimal designs and their applica-
tions. In: Proc. Sympos. Combin. Math. and Optimal Design, Colorado State Univ.,
Fort Collins, Colo. (1978)
42. Sturmfels, B.: On vector partition functions. J. Combin. Theory Ser. A 72(2),
302–309 (1995)
43. Sturmfels, B.: Gröbner Bases and Convex Polytopes. University Lecture Series,
vol. 8. American Mathematical Society, Providence (1996)
44. Thomas, R.: A geometric Buchberger algorithm for integer programming. Math.
Oper. Res. 20(4), 864–884 (1995)
45. Thomas, R.: The structure of group relaxations. In: Aardal, K., Nemhauser, G.,
Weismantel, R. (eds.) Handbook of Discrete Optimization (2003)
46. Verdoolaege, S., Woods, K.: Counting with rational generating functions. J. Sym-
bolic Comput. 43(2), 75–91 (2008)
47. Wolper, P., Boigelot, B.: An automata-theoretic approach to Presburger arithmetic
constraints. In: Mycroft, A. (ed.) SAS 1995. LNCS, vol. 983, pp. 21–32. Springer,
Heidelberg (1995)
48. Woods, K.: Rational Generating Functions and Lattice Point Sets. PhD thesis,
University of Michigan (2004)
Revisiting the Equivalence Problem
for Finite Multitape Automata

James Worrell

Department of Computer Science, University of Oxford, UK

Abstract. The decidability of determining equivalence of deterministic


multitape automata (or transducers) was a longstanding open problem
until it was resolved by Harju and Karhumäki in the early 1990s. Their
proof of decidability yields a co-NP upper bound, but apparently not
much more is known about the complexity of the problem. In this pa-
per we give an alternative proof of decidability, which follows the basic
strategy of Harju and Karhumäki but replaces their use of group theory
with results on matrix algebras. From our proof we obtain a simple ran-
domised algorithm for deciding equivalence of deterministic multitape
automata, as well as automata with transition weights in the field of ra-
tional numbers. The algorithm involves only matrix exponentiation and
runs in polynomial time for each fixed number of tapes. If the two input
automata are inequivalent then the algorithm outputs a word on which
they differ.

1 Introduction

One-way multitape finite automata were introduced in the seminal 1959 paper
of Rabin and Scott [15]. Such automata (under various restrictions) are also
commonly known as transducers—see Elgot and Mezei [6] for an early reference.
A multitape automaton with k tapes accepts a k-ary relation on words. The
class of relations recognised by deterministic automata coincides with the class
of k-ary rational relations [6].
Two multitape automata are said to be equivalent if they accept the same
relation. Undecidability of equivalence of non-deterministic automata is rela-
tively straightforward [8]. However the deterministic case remained open for
many years, until it was shown decidable by Harju and Karhumäki [9]. Their
solution made crucial use of results about ordered groups—specifically that a
free group can be endowed with a compatible order [13] and that the ring of
formal power series over an ordered group with coefficients in a division ring and
with well-ordered support is itself a division ring (due independently to Mal-
cev [11] and Neumann [14]). Using these results [9] established the decidability
of multiplicity equivalence of non-deterministic multitape automata, i.e., whether
two non-deterministic multitape automata have the same number of accepting
computations on each input. Decidability in the deterministic case (and, more

Supported by EPSRC grant EP/G069727/1.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 422–433, 2013.

© Springer-Verlag Berlin Heidelberg 2013
Revisiting the Equivalence Problem for Finite Multitape Automata 423

generally, the unambiguous case) follows immediately. We refer the reader to [16]
for a self-contained account of the proof, including the underlying group theory.
Harju and Karhumäki did not address questions of complexity in [9]. However
the existence of a co-NP guess-and-check procedure for deciding equivalence of
deterministic multitape automata follows directly from [9, Theorem 8]. This
theorem states that two inequivalent automata are guaranteed to differ on a
tuple of words whose total length is at most the total number of states of the two
automata. Such a tuple can be guessed, and it can be checked in polynomial time
whether the tuple is accepted by one automaton and rejected by the other. In
the special case of two-tape deterministic automata, a polynomial-time algorithm
was given in [7], before decidability was shown in the general case.
A co-NP upper bound also holds for multiplicity equivalence of k-tape au-
tomata for each fixed k. However, as we observe below, if the number of tapes
is not fixed, computing the number of accepting computations of a given non-
deterministic multitape automata on a tuple of input words is #P-hard. Thus
the guess-and-check method does not yield a co-NP procedure for multiplicity
equivalence in general.
It is well-known that the equivalence problem for single-tape weighted au-
tomata with rational transition weights is solvable in polynomial time [18,19].
Now the decision procedure in [9] reduces multiplicity equivalence of multitape
automata to equivalence of single-tape automata with transition weights in a
division ring of power series over an ordered group. However the complexity of
arithmetic in this ring seems to preclude an application of the polynomial-time
procedures of [18,19]. Leaving aside issues of representing infinite power series,
even the operation of multiplying a family of polynomials in two non-commuting
variables yields a result with exponentially many monomials in the length of its
input.
In this paper we give an alternative proof that multiplicity equivalence of
multitape automata is decidable, which also yields new complexity bounds on
the problem. We use the same basic idea as [9]—reduce to the single-tape case
by enriching the set of transition weights. However we replace their use of power
series on ordered groups with results about matrix algebras and Polynomial
Identity rings (see Remark 1 for a more technical comparison). In particular,
we use the Amitsur-Levitzki theorem concerning polynomial identities in matrix
algebras. Our use of the latter is inspired by the work of [3] on non-commutative
polynomial identity testing, and our starting point is a simple generalisation of
the approach of [3] to what we call partially commutative polynomial identity
testing.
Our construction for establishing decidability immediately yields a simple ran-
domised algorithm for checking multiplicity equivalence of multitape automata
(and hence also equivalence of deterministic automata). The algorithm involves
only matrix exponentiation, and runs in polynomial time for each fixed number
of tapes.
424 J. Worrell

2 Partially Commutative Polynomial Identities


2.1 Matrix Algebras and Polynomial Identities
Let F be an infinite field. Recall that an F-algebra is a vector space over F equipped with an associative bilinear product that has an identity. Write F⟨X⟩ for the free F-algebra over a set X. The elements of F⟨X⟩ can be viewed as polynomials over a set of non-commuting variables X with coefficients in F. Each such polynomial is an F-linear combination of monomials, where each monomial is an element of X∗. The degree of a polynomial is the maximum of the lengths of its monomials.
Let A be an F-algebra and f ∈ F⟨X⟩. If f evaluates to 0 for all valuations of its variables in A then we say that A satisfies the polynomial identity f = 0. For example, an algebra satisfies the polynomial identity xy − yx = 0 if and only if it is commutative. Note that since the variables x and y do not commute, the polynomial xy − yx is not identically zero.
We denote by Mn (F ) the F -algebra of n × n matrices with coefficients in F .
The Amitsur-Levitzki theorem [1,4] is a fundamental result about polynomial
identities in matrix algebras.
Theorem 1 (Amitsur-Levitzki). The algebra Mn(F) of n × n matrices over a commutative ring F satisfies the polynomial identity

    ∑σ∈S2n sgn(σ) xσ(1) . . . xσ(2n) = 0 ,

where the sum is over the (2n)! elements of the symmetric group S2n. Moreover Mn(F) satisfies no identity of degree less than 2n.
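As a quick sanity check (not part of the paper), the case n = 2 can be verified numerically: the standard polynomial of degree 4 vanishes on any four 2 × 2 matrices, while the degree-3 analogue does not. The helper names below are ours.

```python
# Verify the Amitsur-Levitzki identity for n = 2: the standard polynomial
# s_4 vanishes on arbitrary 2x2 integer matrices.
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of indices."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def standard_polynomial(mats):
    """s_m(X_1,...,X_m) = sum over sigma of sgn(sigma) X_sigma(1)...X_sigma(m)."""
    n = len(mats[0])
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(len(mats))):
        prod = mats[perm[0]]
        for idx in perm[1:]:
            prod = mat_mul(prod, mats[idx])
        s = sign(perm)
        total = [[total[i][j] + s * prod[i][j] for j in range(n)] for i in range(n)]
    return total

# Four arbitrary 2x2 integer matrices: s_4 of them is the zero matrix.
X = [[[1, 2], [3, 4]], [[0, 1], [1, 0]], [[2, -1], [5, 7]], [[1, 1], [0, 3]]]
print(standard_polynomial(X))  # [[0, 0], [0, 0]]
```

The second half of the theorem can be seen concretely: s_3 applied to the matrix units E11, E12, E21 is non-zero.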
Given a finite set X of non-commuting variables, the generic n × n matrix algebra Fn⟨X⟩ is defined as follows. For each variable x ∈ X we introduce a family of commuting indeterminates {t(x)ij : 1 ≤ i, j ≤ n} and define Fn⟨X⟩ to be the F-algebra of n × n matrices generated by the matrices (t(x)ij) for each x ∈ X. Then Fn⟨X⟩ has the following universal property: any homomorphism from F⟨X⟩ to a matrix algebra Mn(R), with R an F-algebra, factors uniquely through the map ΦXn : F⟨X⟩ → Fn⟨X⟩ given by ΦXn(x) = (t(x)ij).

Related to the map ΦXn we also define an F-algebra homomorphism

    ΨXn : F⟨X⟩ → Mn(F[t(x)ij | x ∈ X, 1 ≤ i, j ≤ n])

by letting ΨXn(x) be the n × n matrix whose (i, i + 1) entry is t(x)i,i+1 for 1 ≤ i ≤ n − 1, i.e., the matrix with zero entries everywhere but along the superdiagonal.
Revisiting the Equivalence Problem for Finite Multitape Automata 425

2.2 Partially Commutative Polynomial Identities


In this section we introduce a notion of partially commutative polynomial iden-
tity. We first establish notation and recall some relevant facts about tensor prod-
ucts of algebras.
Write A ⊗ B for the tensor product of F-algebras A and B, and write A⊗k for the k-fold tensor power of A. If A is an algebra of a × a matrices and B an algebra of b × b matrices, then we identify the tensor product A ⊗ B with the algebra of ab × ab matrices spanned by the matrices M ⊗ N, M = (mij) ∈ A and N = (nij) ∈ B, where M ⊗ N is the block matrix

    ⎛ m11 N · · · m1a N ⎞
    ⎜   ..          ..  ⎟
    ⎝ ma1 N · · · maa N ⎠

In particular we have F⊗k = F.
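Concretely, this block-matrix identification of M ⊗ N is the Kronecker product. The short check below (our own illustration, in pure Python) verifies the mixed-product property (M ⊗ N)(M′ ⊗ N′) = MM′ ⊗ NN′, which is used implicitly whenever tensor products of matrices are multiplied in what follows.

```python
# Minimal Kronecker product and the mixed-product property.
def mat_mul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def kron(A, B):
    # Entry (i, j) of A (x) B is A[i // b][j // b] * B[i % b][j % b].
    a, b = len(A), len(B)
    return [[A[i // b][j // b] * B[i % b][j % b] for j in range(a * b)]
            for i in range(a * b)]

M, Mp = [[1, 2], [3, 4]], [[0, 1], [1, 1]]
N, Np = [[2, 0], [1, 3]], [[1, -1], [2, 0]]
lhs = mat_mul(kron(M, N), kron(Mp, Np))
rhs = kron(mat_mul(M, Mp), mat_mul(N, Np))
print(lhs == rhs)  # True
```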
A partially commuting set of variables is a tuple X = (X1, . . . , Xk), where the Xi are disjoint sets. Write F⟨X⟩ for the tensor product F⟨X1⟩ ⊗ · · · ⊗ F⟨Xk⟩. We think of F⟨X⟩ as a set of polynomials in partially commuting variables. Intuitively two variables x, y ∈ Xi do not commute, whereas x ∈ Xi commutes with y ∈ Xj if i ≠ j. Note that if each Xi is a singleton {xi} then F⟨X⟩ is the familiar ring of polynomials in commuting variables x1, . . . , xk. At the other extreme, if k = 1 then we recover the non-commutative case.
An arbitrary element f ∈ F⟨X⟩ = F⟨X1⟩ ⊗ · · · ⊗ F⟨Xk⟩ can be written uniquely as a finite sum of distinct monomials, where each monomial is a tensor product of elements of X1∗, X2∗, . . . , and Xk∗. Formally, we can write

    f = ∑i∈I αi (mi,1 ⊗ · · · ⊗ mi,k) ,                    (1)

where αi ∈ F and mi,j ∈ Xj∗ for each i ∈ I and 1 ≤ j ≤ k. Thus we can identify F⟨X⟩ with the free F-algebra over the product monoid X1∗ × . . . × Xk∗.
Define the degree of a monomial m1 ⊗ . . . ⊗ mk to be the total length |m1| + . . . + |mk| of its constituent words. The degree of a polynomial is the maximum of the degrees of its constituent monomials.
Let A = (A1, . . . , Ak) be a k-tuple of F-algebras. A valuation of F⟨X⟩ in A is a tuple of functions v = (v1, . . . , vk), where vi : Xi → Ai. Each vi extends uniquely to an F-algebra homomorphism v̂i : F⟨Xi⟩ → Ai, and we define the map v̂ : F⟨X⟩ → A1 ⊗ . . . ⊗ Ak by v̂ = v̂1 ⊗ . . . ⊗ v̂k. Often we will abuse terminology slightly and speak of a valuation of F⟨X⟩ in A1 ⊗ · · · ⊗ Ak. Given f ∈ F⟨X⟩, we say that A satisfies the partially commutative identity f = 0 if v̂(f) = 0 for all valuations v.
Next we introduce two valuations that will play an important role in the subsequent development. Recall that given a set of non-commuting variables X, we have a map ΦXn : F⟨X⟩ → Fn⟨X⟩ from the free F-algebra to the generic n-dimensional matrix algebra. We now define a valuation

    ΦXn : F⟨X⟩ → Fn⟨X1⟩ ⊗ · · · ⊗ Fn⟨Xk⟩                  (2)
by ΦXn = ΦX1n ⊗ · · · ⊗ ΦXkn. Likewise we define

    ΨXn : F⟨X⟩ → Mn(F[t(x)ij | x ∈ X1, 1 ≤ i, j ≤ n]) ⊗ · · · ⊗ Mn(F[t(x)ij | x ∈ Xk, 1 ≤ i, j ≤ n])

by ΨXn = ΨX1n ⊗ · · · ⊗ ΨXkn. We will usually elide the superscript X from ΦXn and ΨXn when it is clear from the context.
The following result generalises (part of) the Amitsur-Levitzki theorem, by giving a lower bound on the degrees of partially commutative polynomial identities holding in tensor products of matrix algebras.

Proposition 1. Let f ∈ F⟨X⟩ and let L be a field extending F. Then the following are equivalent: (i) The partially commutative identity f = 0 holds in Mn(L) ⊗F · · · ⊗F Mn(L); (ii) Φn(f) = 0. Moreover, if f has degree strictly less than n then (i) and (ii) are both equivalent to (iii) Ψn(f) = 0; and (iv) f is identically 0 in F⟨X⟩.

Proof. The implication (ii) ⇒ (i) follows from the fact that any valuation from F⟨X⟩ to Mn(L) ⊗F · · · ⊗F Mn(L) factors through Φn. To see that (i) ⇒ (ii), observe that Φn(f) is an n^k × n^k matrix in which each entry is a polynomial in the commuting variables t(x)ij. Condition (i) implies in particular that each such polynomial evaluates to 0 for all valuations of its variables in F. Since F is an infinite field, it must be that each such polynomial is identically zero, i.e., (ii) holds.
The implications (ii) ⇒ (iii) and (iv) ⇒ (i) are both straightforward, even
without the degree restriction on f .
Finally we show that (iii) ⇒ (iv). Let m1 ⊗ . . . ⊗ mk be a monomial in F⟨X⟩, where mi = mi,1 . . . mi,li ∈ Xi∗ has length li < n. Then Ψn(m1 ⊗ · · · ⊗ mk) is an n^k × n^k matrix whose first row has a single non-zero entry, which is the monomial

    t(m1,1)12 · · · t(m1,l1)l1,l1+1 · · · t(mk,1)12 · · · t(mk,lk)lk,lk+1        (3)

at index ((1, . . . , 1), (l1 + 1, . . . , lk + 1)).
It follows that Ψn maps the set of monomials in F⟨X⟩ of degree less than n injectively into a linearly independent set of matrices, each with a single monomial entry over the commuting indeterminates t(x)ij. Condition (iv) immediately follows. ⊓⊔

The hypothesis that f have degree less than n in Proposition 1 can be weakened
somewhat, but is sufficient for our purposes.

2.3 Division Rings and Ore Domains

A ring R with no zero divisors is a domain. If moreover each non-zero element of R has a two-sided multiplicative inverse, then we say that R is a division ring (also called a skew field). A domain R is a (right) Ore domain if for all a, b ∈ R \ {0}, aR ∩ bR ≠ 0. The significance of this notion is that an Ore domain can be embedded in a division ring of fractions [4, Corollary 7.1.6], something that need not hold for an arbitrary domain. If the Ore condition fails then it can easily be shown that the subalgebra of R generated by a and b is free on a and b. It follows that a domain R that satisfies some polynomial identity is an Ore domain [4, Corollary 7.5.2].
Proposition 2. The tensor product of generic matrix algebras Fn⟨X1⟩ ⊗ · · · ⊗ Fn⟨Xk⟩ is an Ore domain for each n ∈ N.
Proof (sketch). We give a proof sketch here, deferring the details to Appendix A. By the Amitsur-Levitzki theorem, Fn⟨X1⟩ ⊗ · · · ⊗ Fn⟨Xk⟩ satisfies a polynomial identity. Thus it suffices to show that Fn⟨X1⟩ ⊗ · · · ⊗ Fn⟨Xk⟩ is a domain for each n. Now it is shown in [4, Proposition 7.7.2] that Fn⟨X⟩ is a domain for each n and set of variables X. While the tensor product of domains need not be a domain (e.g., C ⊗R C ≅ C × C), the proof in [4] can be adapted mutatis mutandis to show that Fn⟨X1⟩ ⊗ · · · ⊗ Fn⟨Xk⟩ is also a domain.
To prove the latter, it suffices to find central simple F-algebras D1, . . . , Dk, each of degree n, such that the k-fold tensor product D1 ⊗F · · · ⊗F Dk is a domain. Such an example can be found, e.g., in [17, Proposition 1.1]. Then, using the fact that Di ⊗F L ≅ Mn(L) for any algebraically closed extension field L of F, one can infer that Fn⟨X1⟩ ⊗ · · · ⊗ Fn⟨Xk⟩ is also a domain. ⊓⊔

3 Multitape Automata
Let Σ = (Σ1, . . . , Σk) be a tuple of finite alphabets. We denote by S the product monoid Σ1∗ × · · · × Σk∗. Define the length of s = (w1, . . . , wk) ∈ S to be |s| = |w1| + . . . + |wk| and write S(l) for the set of elements of S of length l. A multitape automaton is a tuple A = (Σ, Q, E, Q0, Qf), where Q is a set of states, E ⊆ Q × S(1) × Q is a set of edges, Q0 ⊆ Q is a set of initial states, and Qf ⊆ Q is a set of final states. A run of A from state q0 to state qm is a finite sequence of edges ρ = e1 e2 . . . em such that ei = (qi−1, si, qi). The label of ρ is the product s1 s2 . . . sm ∈ S. Define the multiplicity A(s) of an input s ∈ S to be the number of runs with label s such that q0 ∈ Q0 and qm ∈ Qf. An automaton is deterministic if each state reads letters from a single tape and has a single transition for every input letter. Thus a deterministic automaton has a single run on each input s ∈ S.
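These definitions can be made concrete with a small dynamic program that counts accepting runs. The sketch below is ours, not the paper's; it represents an edge as (source, (tape, letter), target), so each edge consumes one letter of one tape.

```python
from functools import lru_cache

# Count the runs of a nondeterministic multitape automaton with label s,
# i.e. the multiplicity A(s), by dynamic programming over (state, positions).
def multiplicity(edges, initial, final, words):
    @lru_cache(maxsize=None)
    def count(q, pos):
        if all(p == len(w) for p, w in zip(pos, words)):
            return 1 if q in final else 0
        total = 0
        for (src, (tape, letter), tgt) in edges:
            if src == q and pos[tape] < len(words[tape]) \
                    and words[tape][pos[tape]] == letter:
                new_pos = pos[:tape] + (pos[tape] + 1,) + pos[tape + 1:]
                total += count(tgt, new_pos)
        return total
    return sum(count(q0, (0,) * len(words)) for q0 in initial)

# One state reading 'a' on tape 0 or 'b' on tape 1: the runs on ("a", "b")
# are the two interleavings of the two edges.
edges = [(0, (0, "a"), 0), (0, (1, "b"), 0)]
print(multiplicity(edges, {0}, {0}, ("a", "b")))  # 2
```

Since every edge consumes a letter, runs with label s have length |s|, so the recursion terminates; this is also why the evaluation problem is in #P.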

3.1 Multiplicity Equivalence


We say that two automata A and B over the same alphabet are multiplicity
equivalent if A(s) = B(s) for all s ∈ S. The following result implies that multi-
plicity equivalence of multitape automata is decidable.
Theorem 2 (Harju and Karhumäki). Given automata A and B with n
states in total, A and B are equivalent if and only if A(s) = B(s) for all s ∈ S
of length at most n − 1.
Theorem 2 immediately yields a co-NP bound for checking language equivalence of deterministic multitape automata. Given two inequivalent automata A and B, a distinguishing input s can be guessed, and it can be verified in polynomial time that only one of A and B accepts s. A similar idea also gives a co-NP bound for multiplicity equivalence in case the number of tapes is fixed. In general we note however that the evaluation problem—given an automaton A and input s, compute A(s)—is #P-complete. Thus it is not clear that the co-NP upper bound applies to the multiplicity equivalence problem without bounding the number of tapes.
Proposition 3. The evaluation problem for multitape automata is #P-complete.
Proof. Membership in #P follows from the observation that a non-deterministic
polynomial-time algorithm can enumerate all possible runs of an automaton A
on an input s ∈ S.
The proof of #P-hardness is by reduction from #SAT, the problem of count-
ing the number of satisfying assignments of a propositional formula. Consider
such a formula ϕ with k variables, each with fewer than n occurrences. We define
a k-tape automaton A, with each tape having alphabet {0, 1}, and consider as
input the k-tuple s = ((01)n , . . . , (01)n ). The automaton A is constructed such
that its runs on input s are in one-to-one correspondence with satisfying assign-
ments of ϕ. Each run starts with the automaton reading the symbol 0 from a
non-deterministically chosen subset of its tapes, corresponding to the set of false
variables. Thereafter it evaluates the formula ϕ by repeatedly guessing truth val-
ues of the propositional variables. If the i-th variable is guessed to be true then
the automaton reads 01 from the i-th tape; otherwise it reads 10 from the i-th
tape. The final step is to read the symbol 1 from a non-deterministically chosen
subset of the input tapes—again corresponding to the set of false variables. The
consistency of the guesses is ensured by the requirement that the automaton
have read s by the end of the computation. ⊓⊔

3.2 Decidability
We start by recalling from [9] an equivalence-respecting transformation from
multitape automata to single-tape weighted automata.
Recall that a single-tape automaton on a unary alphabet with transition weights in a ring R consists of a set of states Q = {q1, . . . , qn}, initial states Q0 ⊆ Q, final states Qf ⊆ Q, and transition matrix M ∈ Mn(R). Given such an automaton, define the initial-state vector α ∈ R1×n and final-state vector η ∈ Rn×1 respectively by

    αi = 1 if qi ∈ Q0 and αi = 0 otherwise;    ηi = 1 if qi ∈ Qf and ηi = 0 otherwise.

Then αM^l η is the weight of the (unique) input word of length l.
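A minimal sketch of this semantics over Q (variable names are ours, not the paper's): the weight of the length-l input is computed by iterated vector–matrix products, using exact rational arithmetic.

```python
from fractions import Fraction

# Weight of the length-l unary word: alpha * M^l * eta.
def weight(alpha, M, eta, l):
    vec = alpha[:]                       # row vector alpha * M^l
    n = len(M)
    for _ in range(l):
        vec = [sum(vec[i] * M[i][j] for i in range(n)) for j in range(n)]
    return sum(vec[i] * eta[i] for i in range(n))

# Two-state example over Q with weights 1/2.
M = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(0), Fraction(1)]]
alpha = [Fraction(1), Fraction(0)]       # q1 initial
eta = [Fraction(0), Fraction(1)]         # q2 final
print(weight(alpha, M, eta, 2))          # 3/4
```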
Consider a k-tape automaton A = (Σ, Q, E, Q0, Qf), where Σ = (Σ1, . . . , Σk), and write S = Σ1∗ × · · · × Σk∗. Recall the ring of polynomials

    F⟨Σ⟩ = F⟨Σ1⟩ ⊗ · · · ⊗ F⟨Σk⟩,
as defined in Section 2. Recall also that we can identify the monoid S with the set of monomials in F⟨Σ⟩, where (w1, . . . , wk) ∈ S corresponds to w1 ⊗ · · · ⊗ wk—indeed F⟨Σ⟩ is the free F-algebra on S.
We derive from A an F⟨Σ⟩-weighted automaton Â (with a single tape and unary input alphabet) that has the same sets of states, initial states, and final states as A. We define the transition matrix M of Â by combining the different transitions of A into a single matrix with entries in F⟨Σ⟩. To this end, suppose that the set of states of A is Q = {q1, . . . , qn}. Define the matrix M ∈ Mn(F⟨Σ⟩) by Mij = ∑(qi,s,qj)∈E s for 1 ≤ i, j ≤ n.
Let α and η be the respective initial- and final-state vectors of Â. Then the following proposition is straightforward. Intuitively it says that the weight of the unary word of length l in Â represents the language of all length-l tuples accepted by A.

Proposition 4. For all l ∈ N we have αM^l η = ∑s∈S(l) A(s) · s.
Now consider two k-tape automata A and B. Let the weighted single-tape automata derived from A and B have respective transition matrices MA and MB, initial-state vectors αA and αB, and final-state vectors ηA and ηB. We combine the latter into a single weighted automaton with transition matrix M, initial-state vector α, and final-state vector η, respectively defined by:

    α = (αA  αB)        M = ⎛ MA  0  ⎞        η = ⎛  ηA ⎞
                            ⎝ 0   MB ⎠            ⎝ −ηB ⎠

Proposition 5. Automata A and B are multiplicity equivalent if and only if αM^l η = 0 for l = 0, 1, . . . , n − 1, where n is the total number of states of the two automata.
Proof. Since S is a linearly independent subset of F⟨Σ⟩, from Proposition 4 it follows that A and B are multiplicity equivalent just in case αA(MA)^l ηA = αB(MB)^l ηB for all l ∈ N. The latter is clearly equivalent to αM^l η = 0 for all l ∈ N. It remains to show that we can check equivalence by looking only at exponents l in the range 0, 1, . . . , n − 1.
Suppose that αM^i η = 0 for i = 0, . . . , n − 1. We show that αM^l η = 0 for an arbitrary l ≥ n.
Consider the map Φl : F⟨Σ⟩ → Fl⟨Σ1⟩ ⊗ · · · ⊗ Fl⟨Σk⟩, as defined in (2). Observe that αM^l η is a polynomial expression in F⟨Σ⟩ of degree at most l. Therefore by Proposition 1 ((ii) ⇔ (iv)), to show that αM^l η = 0 it suffices to show that

    Φl(αM^l η) = 0 .                                    (4)

Let us write Φl(M) for the pointwise application of Φl to the matrix M, so that Φl(M) is an n × n matrix, each of whose entries is an l^k × l^k matrix belonging to Fl⟨Σ1⟩ ⊗ · · · ⊗ Fl⟨Σk⟩. Since Φl is a homomorphism and α and η are integer vectors, (4) is equivalent to:

    α Φl(M)^l η = 0 .                                   (5)

Recall from Proposition 2 that the tensor product of generic matrix algebras Fl⟨Σ1⟩ ⊗ · · · ⊗ Fl⟨Σk⟩ is an Ore domain and hence can be embedded in a division ring. Now a standard result about single-tape weighted automata with transition weights in a division ring is that such an automaton with n states is equivalent to the zero automaton if and only if it assigns zero weight to all words of length less than n (see [5, pp. 143–145] and [18]). Applying this result to the unary weighted automaton defined by α, M, and η, we see that (5) is implied by

    α Φl(M)^i η = 0,    i = 0, 1, . . . , n − 1 .       (6)

But, since Φl is a homomorphism, (6) is implied by

    αM^i η = 0,    i = 0, 1, . . . , n − 1 .            (7)

This concludes the proof. ⊓⊔
Theorem 2 immediately follows from Proposition 5.
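Over a field such as Q, the block construction in Proposition 5 already gives a concrete equivalence test, since the cited bound for weighted automata over a division ring applies directly. The following sketch is our own illustration (names and examples are ours), using exact rational arithmetic.

```python
from fractions import Fraction

# Test equivalence of two unary Q-weighted automata: build the block system
# (alpha_A alpha_B), diag(M_A, M_B), (eta_A; -eta_B) and check
# alpha * M^l * eta = 0 for l = 0, ..., n-1.
def equivalent(alphaA, MA, etaA, alphaB, MB, etaB):
    nA, nB = len(MA), len(MB)
    n = nA + nB
    M = [[MA[i][j] if i < nA and j < nA else
          MB[i - nA][j - nA] if i >= nA and j >= nA else Fraction(0)
          for j in range(n)] for i in range(n)]
    alpha = list(alphaA) + list(alphaB)
    eta = list(etaA) + [-x for x in etaB]
    vec = alpha[:]
    for _ in range(n):
        if sum(vec[i] * eta[i] for i in range(n)) != 0:
            return False
        vec = [sum(vec[i] * M[i][j] for i in range(n)) for j in range(n)]
    return True

# A assigns weight 2^l to the length-l word with one state; B computes the
# same weights with two states.
A = ([Fraction(1)], [[Fraction(2)]], [Fraction(1)])
B = ([Fraction(1), Fraction(1)],
     [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(1)]],
     [Fraction(1), Fraction(0)])
print(equivalent(*A, *B))  # True
```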
Remark 1. The difference between our proof of Theorem 2 and the proof in [9] is that we consider a family of homomorphisms of F⟨Σ⟩ into Ore domains of matrices—the maps Φl—rather than a single "global" embedding of F⟨Σ⟩ into a division ring of power series over a product of free groups. None of the maps Φl is an embedding, but it suffices to use the lower bound on the degrees of polynomial identities in Proposition 1 in lieu of injectivity. On the other hand, the fact that Fl⟨Σ1⟩ ⊗ · · · ⊗ Fl⟨Σk⟩ satisfies a polynomial identity makes it relatively straightforward to exhibit an embedding of the latter into a division ring. As we now show, this approach leads directly to a very simple randomised polynomial-time algorithm for solving the equivalence problem.

3.3 Randomised Algorithm


Proposition 5 reduces the problem of checking multiplicity equivalence of multitape automata A and B to checking the partially commutative identities αM^l η = 0, l = 0, 1, . . . , n − 1, in F⟨Σ⟩. Since each identity has degree less than n, applying Proposition 1 ((iii) ⇔ (iv)) we see that A and B are equivalent if and only if

    α Ψn(M)^l η = 0,    l = 0, 1, . . . , n − 1 .       (8)

Each equation α Ψn(M)^l η = 0 in (8) asserts the zeroness of an n^k × n^k matrix of polynomials in the commuting variables t(x)ij, with each polynomial having degree less than n. Suppose that α Ψn(M)^l η ≠ 0 for some l—say the matrix entry with index ((1, . . . , 1), (l1 + 1, . . . , lk + 1)) contains a monomial with non-zero coefficient. By (3) such a monomial determines a term s ∈ Σ1^l1 × · · · × Σk^lk with non-zero coefficient in αM^l η, and by Proposition 4 we have A(s) ≠ B(s).
We can verify each polynomial identity in (8), outputting a monomial of any non-zero polynomial, using a classical identity testing procedure based on the isolation lemma of [12].
Lemma 1 ([12]). There is a randomised polynomial-time algorithm that inputs


a multilinear polynomial f(x1, . . . , xm), represented as an algebraic circuit, and either outputs a monomial of f or reports that f is zero. Moreover the algorithm is
always correct if f is the zero polynomial and is correct with probability at least
1/2 if f is non-zero.

The idea behind the algorithm described in Lemma 1 is to choose a weight wi ∈ {1, . . . , 2m} for each variable xi of f independently and uniformly at random. Defining the weight of a monomial xi1 . . . xit to be wi1 + . . . + wit, then with probability at least 1/2 there is a unique minimum-weight monomial. The existence of a unique minimum-weight monomial can be detected by computing the polynomial g(y) = f(y^w1, . . . , y^wm), since a monomial with weight w in f yields a monomial of degree w in g. Using similar ideas one can moreover determine the composition of a minimum-weight monomial in f.
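The weight-substitution idea can be illustrated on a polynomial given explicitly as a dictionary of monomials. This toy version is ours: unlike the procedure of [12], which works with circuits and needs the substitution trick precisely because the monomials are not available explicitly, here the weights merely demonstrate how a minimum-weight monomial survives in g(y).

```python
import random

# f is a multilinear polynomial {frozenset of variable indices: coefficient};
# m is the number of variables.  Pick random weights and look for the least
# weight class of g(y) = f(y^w_1, ..., y^w_m) with non-zero total coefficient.
def detect_monomial(f, m, trials=20):
    for _ in range(trials):
        w = [random.randint(1, 2 * m) for _ in range(m)]
        g = {}                                   # weight -> total coefficient
        for mono, c in f.items():
            wt = sum(w[i] for i in mono)
            g[wt] = g.get(wt, 0) + c
        nonzero = [wt for wt, c in g.items() if c != 0]
        if nonzero:
            wmin = min(nonzero)
            for mono, c in f.items():            # report a monomial of f
                if c != 0 and sum(w[i] for i in mono) == wmin:
                    return sorted(mono)
    return None                                  # f was (probably) zero

f = {frozenset({0, 1}): 1, frozenset({1, 2}): -1, frozenset({0, 2}): 2}
print(detect_monomial(f, 3))  # some monomial of f, e.g. [0, 1]
```

Whenever the function returns, the reported variable set really is a monomial of f with non-zero coefficient, matching the one-sided error guarantee of Lemma 1.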
Applying Lemma 1 we obtain our main result:
Theorem 3. Let k be fixed. Then multiplicity equivalence of k-tape automata
can be decided in randomised polynomial time. Moreover there is a randomised
polynomial algorithm for the function problem of computing a distinguishing in-
put given two inequivalent automata.
The reason for the requirement that k be fixed is that the dimension of the entries of the matrix Ψn(M), and thus the number of polynomials to be checked for equality, depends exponentially on k.
The above use of the isolation technique generalises [10], where it is used to
generate counterexample words of weighted single-tape automata. A very similar
application in [2] occurs in the context of identity testing for non-commutative
algebraic branching programs.

4 Conclusion
We have given a simple randomised algorithm for deciding language equivalence
of deterministic multitape automata and multiplicity equivalence of nondeter-
ministic automata. The algorithm arises directly from algebraic constructions
used to establish decidability of the problem, and runs in polynomial time for
each fixed number of tapes. We leave open the question of whether there is a
deterministic polynomial-time algorithm for deciding the equivalence of deter-
ministic and weighted multitape automata with a fixed number of tapes. (Recall
that the 2-tape case is already known to be in polynomial time [7].) We also leave
open whether there is a deterministic or randomised polynomial time algorithm
for solving the problem in case the number of tapes is not fixed.

A Proof of Proposition 2
We first recall a construction of a crossed product division algebra from [17, Proposition 1.1]. Let z1, . . . , zk be commuting indeterminates and write F = Q(z1^n, . . . , zk^n) for the field of rational functions obtained by adjoining z1^n, . . . , zk^n to Q. Furthermore, let K/F be a field extension whose Galois group is generated by commuting automorphisms σ1, . . . , σk, each of order n, which has fixed field F. (Such an extension can easily be constructed by adjoining extra indeterminates to F, and having the σi be suitable permutations of the new indeterminates.) For each i, 1 ≤ i ≤ k, write Ki for the subfield of K that is fixed by each σj for j ≠ i; then define Di to be the F-algebra generated by Ki and zi such that azi = zi σi(a) for all a ∈ Ki. Then each Di is a simple algebra of dimension n^2 over its centre F. It is shown in [17, Proposition 1.1] that the tensor product D1 ⊗F · · · ⊗F Dk can be characterised as the localisation of an iterated skew polynomial ring—and is therefore a domain.
The following two propositions are straightforward adaptations of [4, Proposition 7.5.5] and [4, Proposition 7.7.2] to partially commutative identities.

Proposition 6. Let f ∈ F⟨X1⟩ ⊗ · · · ⊗ F⟨Xk⟩. If the partially commutative identity f = 0 holds in D1 ⊗F · · · ⊗F Dk then it also holds in (D1 ⊗F L) ⊗F · · · ⊗F (Dk ⊗F L) for any extension field L of F.

Proof. Noting that the Di are all isomorphic as F-algebras, let {e1, . . . , e_{n^2}} be a basis of each Di over its centre F. For each variable x appearing in f, introduce commuting indeterminates txj, 1 ≤ j ≤ n^2, and write x = ∑_{j=1}^{n^2} txj ej. Then we can express f in the form

    f = ∑_{ν∈{1,...,n^2}^k} fν · (eν(1) ⊗ · · · ⊗ eν(k)) ,        (9)

where fν ∈ F[txj : x ∈ X1, 1 ≤ j ≤ n^2] ⊗F · · · ⊗F F[txj : x ∈ Xk, 1 ≤ j ≤ n^2]. By assumption, each fν evaluates to 0 for all values of the txj in F. Since F is an infinite field it follows that each fν must be identically zero. Now we can also regard {e1, . . . , e_{n^2}} as a basis for Di ⊗F L over L. Then by (9), f = 0 also on (D1 ⊗F L) ⊗F · · · ⊗F (Dk ⊗F L). ⊓⊔

Proposition 7. Fn⟨X1⟩ ⊗ · · · ⊗ Fn⟨Xk⟩ is a domain.

Proof. Recall that if L is an algebraically closed field extension of F, then we have Di ⊗F L ≅ Mn(L) for each i. By Proposition 6 it follows that an identity f = 0 holds in D1 ⊗F · · · ⊗F Dk if and only if it holds in Mn(L) ⊗F · · · ⊗F Mn(L). But by Proposition 1 the latter holds if and only if Φn(f) is identically zero.
To prove the proposition it will suffice to show that the image of Φn contains no zero divisors, since the latter is a surjective map. Now given f, g ∈ F⟨X1⟩ ⊗ · · · ⊗ F⟨Xk⟩ with Φn(fg) = 0, we have that D1 ⊗F · · · ⊗F Dk satisfies the identity fg = 0. Since D1 ⊗F · · · ⊗F Dk is a domain, it follows that it satisfies the identity fhg = 0 for any h in F⟨X1⟩ ⊗ · · · ⊗ F⟨Xk⟩. But now Mn(L) ⊗F · · · ⊗F Mn(L) satisfies the identity fhg = 0 for any h. Since h can take the value of an arbitrary matrix (in particular, any matrix unit) it follows that Mn(L) ⊗F · · · ⊗F Mn(L) satisfies either the identity f = 0 or g = 0, and so, by Proposition 1 again, either Φn(f) = 0 or Φn(g) = 0. ⊓⊔
Acknowledgments. The author is grateful to Louis Rowen for helpful pointers


in the proof of Proposition 2.

References
1. Amitsur, S.A., Levitzki, J.: Minimal identities for algebras. Proceedings of the
American Mathematical Society 1, 449–463 (1950)
2. Arvind, V., Mukhopadhyay, P.: Derandomizing the isolation lemma and lower
bounds for circuit size. In: Goel, A., Jansen, K., Rolim, J.D.P., Rubinfeld, R.
(eds.) APPROX and RANDOM 2008. LNCS, vol. 5171, pp. 276–289. Springer,
Heidelberg (2008)
3. Bogdanov, A., Wee, H.: More on noncommutative polynomial identity testing.
In: IEEE Conference on Computational Complexity, pp. 92–99. IEEE Computer
Society (2005)
4. Cohn, P.M.: Further Algebra and Applications. Springer (2003)
5. Eilenberg, S.: Automata, Languages, and Machines, vol. A. Academic Press (1974)
6. Elgot, C.C., Mezei, J.E.: Two-sided finite-state transductions (abbreviated ver-
sion). In: SWCT (FOCS), pp. 17–22. IEEE Computer Society (1963)
7. Friedman, E.P., Greibach, S.A.: A polynomial time algorithm for deciding the
equivalence problem for 2-tape deterministic finite state acceptors. SIAM J. Com-
put. 11(1), 166–183 (1982)
8. Griffiths, T.V.: The unsolvability of the equivalence problem for λ-free nondeterministic generalized machines. J. ACM 15(3), 409–413 (1968)
9. Harju, T., Karhumäki, J.: The equivalence problem of multitape finite automata.
Theor. Comput. Sci. 78(2), 347–355 (1991)
10. Kiefer, S., Murawski, A., Ouaknine, J., Wachter, B., Worrell, J.: On the complexity
of equivalence and minimisation for Q-weighted automata. Logical Methods in
Computer Science 9 (2013)
11. Malcev, A.I.: On the embedding of group algebras in division algebras. Dokl. Akad. Nauk 60, 1499–1501 (1948)
12. Mulmuley, K., Vazirani, U.V., Vazirani, V.V.: Matching is as easy as matrix inver-
sion. In: STOC, pp. 345–354 (1987)
13. Neumann, B.H.: On ordered groups. Amer. J. Math. 71, 1–18 (1949)
14. Neumann, B.H.: On ordered division rings. Trans. Amer. Math. Soc. 66, 202–252
(1949)
15. Rabin, M., Scott, D.: Finite automata and their decision problems. IBM Journal
of Research and Development 3(2), 114–125 (1959)
16. Sakarovitch, J.: Elements of Automata Theory. Cambridge University Press (2009)
17. Saltman, D.: Lectures on Division Algebras. American Math. Soc. (1999)
18. Schützenberger, M.-P.: On the definition of a family of automata. Inf. and Con-
trol 4, 245–270 (1961)
19. Tzeng, W.: A polynomial-time algorithm for the equivalence of probabilistic au-
tomata. SIAM Journal on Computing 21(2), 216–227 (1992)
Silent Transitions in Automata with Storage

Georg Zetzsche

Fachbereich Informatik, Technische Universität Kaiserslautern,


Postfach 3049, 67653 Kaiserslautern, Germany
[email protected]

Abstract. We consider the computational power of silent transitions


in one-way automata with storage. Specifically, we ask which storage
mechanisms admit a transformation of a given automaton into one that
accepts the same language and reads at least one input symbol in each
step. We study this question using the model of valence automata. Here,
a finite automaton is equipped with a storage mechanism that is given by
a monoid. This work presents generalizations of known results on silent
transitions. For two classes of monoids, it provides characterizations of
those monoids that allow the removal of silent transitions. Both classes
are defined by graph products of copies of the bicyclic monoid and the
group of integers. The first class contains pushdown storages as well as
the blind counters while the second class contains the blind and the
partially blind counters.

1 Introduction

In a one-way automaton, a transition is called silent if it reads no input symbol.


If it has no silent transitions, such an automaton is called real-time. We consider
the problem of removing silent transitions from one-way automata with various
kinds of storage. Specifically, we ask for which kinds of storage the real-time
and the general version have equal computational power. This is an interesting
problem for two reasons. First, it has consequences for the time and space com-
plexity of the membership problem for these automata. For automata with silent
transitions, it is not even clear whether the membership problem is decidable. If,
however, an automaton has no silent transitions, we only have to consider paths
that are at most as long as the word at hand. In particular, if we can decide
whether a sequence of storage operations is valid using linear space, we can also
solve the membership problem (nondeterministically) with a linear space bound.
Similarly, if we can decide validity of such a sequence in polynomial time, we
can solve the membership problem in (nondeterministic) polynomial time. Sec-
ond, we can interpret the problem as a question on resource consumption of
restricted machine models: we ask for which storage mechanisms we can process
every input word by executing only a bounded number of operations per symbol.

This is an extended abstract. The full version of this work is available at http://arxiv.org/abs/1302.3798.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 434–445, 2013.

© Springer-Verlag Berlin Heidelberg 2013
There is a wide variety of machine models that consist of a finite state control
with a one-way input and some mechanism to store data, for example (higher
order) pushdown automata, various kinds of counter automata [8], or off-line
Turing machines that can only move right on the input tape.
For some of these models, it is known whether silent transitions can be eliminated. For example, the Greibach normal form allows their removal from pushdown automata. Furthermore, for blind counter automata (i.e., the counters can go below zero and a zero-test is only performed in the end), Greibach has also shown that silent transitions can be avoided [8]. However, for partially blind counter automata (i.e., the counters cannot go below zero and are only zero-tested in the end) or, equivalently, Petri nets, there are languages for which silent transitions are indeed necessary [8, 11].
The aim of this work is to generalize these results and obtain insights into
how the structure of the storage mechanism influences the computational power
of the real-time variant. In order to study the expressive power of real-time
computations in greater generality, we use the model of valence automata. For
our purposes, a storage mechanism consists of a (possibly infinite) set of states
and partial transformations operating on them. Such a mechanism often works
in a way such that a computation is considered valid if the composition of the
applied transformations is the identity. For example, in a pushdown storage, the
operations push and pop (for each participating stack symbol) and compositions
thereof are partial transformations on the set of words over some alphabet. In
this case, a computation is valid if, in the end, the stack is brought back to
the initial state, i.e., the identity transformation has been applied. Furthermore,
in a partially blind counter automaton, a computation is valid if it leaves the
counters with value zero, i.e., the composition of the applied operations increase
and decrease is the identity. Therefore, the set of all compositions of the partial
transformations forms a monoid such that, for many storage mechanisms, a
computation is valid if the composition of the transformations is the identity.
A valence automaton is a finite automaton in which each edge carries, in
addition to an input word, an element of a monoid. A word is then accepted if
there is a computation that spells the word and for which the product of the
monoid elements is the identity. Valence automata have been studied throughout
the last decades [4, 5, 10, 12, 16, 17].
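Valence automata are straightforward to simulate for concrete monoids. As an illustration (our own encoding, not from the paper), here is a λ-free valence automaton over (Z, +), i.e. a finite automaton with a single blind counter; a word is accepted if some path spelling it multiplies the monoid elements to the identity 0:

```python
def accepts(edges, q0, finals, w):
    """Acceptance by a lambda-free valence automaton over the monoid (Z, +).

    edges: set of (p, letter, z, q); w is accepted if some path from q0
    to a final state spells w and its z-values sum to the identity 0."""
    confs = {(q0, 0)}                      # reachable (state, counter) pairs
    for letter in w:
        confs = {(q, c + z)
                 for (p, c) in confs
                 for (p2, a, z, q) in edges
                 if p2 == p and a == letter}
    return any(q in finals and c == 0 for (q, c) in confs)

# A two-state valence automaton over Z accepting { a^n b^n : n >= 0 }.
edges = {(0, "a", +1, 0), (0, "b", -1, 1), (1, "b", -1, 1)}
assert accepts(edges, 0, {0, 1}, "aabb")
assert not accepts(edges, 0, {0, 1}, "aab")
assert not accepts(edges, 0, {0, 1}, "ba")
```

Note that the counter is blind: it may pass through negative values during a run, and only the final value is tested against the identity.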
The contribution of this work is threefold. On the one hand, we introduce a
class of monoids that accommodates, among others, all storage mechanisms for
which we mentioned previous results on silent transitions. The monoids in this
class are graph products of copies of the bicyclic monoid and the integers. On the
other hand, we present two generalizations of those established facts. Our first
main result is a characterization of those monoids in a certain subclass for which
silent transitions can be eliminated. This subclass contains, among others, both
the monoids corresponding to pushdown storages as well as those corresponding
to blind multicounter storages. Thus, we obtain a generalization and unification
of two of the three λ-removal results above. For those storage mechanisms in this
subclass for which we can remove silent transitions, there is a simple intuitive
436 G. Zetzsche

description. As the simplest example of storages covered by our result beyond


pushdowns and blind multicounters, Parikh pushdown automata [13] are also
provided with a λ-removal procedure.
The second main result is a characterization of the previous kind for the class
of those storage mechanisms that consist of a number of blind counters and
a number of partially blind counters. Specifically, we show that we can remove
silent transitions if and only if there is at most one partially blind counter. Again,
this generalizes and unifies two of the three results above. It should be noted
that all our results are effective.
In Section 2, we will fix notation and define some basic concepts. In Section 3,
we state the main results, describe how they relate to what is known, and explain
key ideas. Sections 4, 5, and 6 contain auxiliary results needed in Section 7, which
presents an outline of the proofs of the main results.

2 Basic Notions
A monoid is a set M together with an associative operation and a neutral ele-
ment. Unless defined otherwise, we will denote the neutral element of a monoid
by 1 and its operation by juxtaposition. That is, for a monoid M and a, b ∈ M ,
ab ∈ M is their product. For a, b ∈ M , we write a ≤ b if there is a c ∈ M such
that b = ac. By 1, we denote the trivial monoid that consists of just one element.
We call a monoid commutative if ab = ba for any a, b ∈ M . A subset N ⊆ M
is said to be a submonoid of M if 1 ∈ N and a, b ∈ N implies ab ∈ N . In each
monoid M , we have the submonoids H(M ) = {a ∈ M | ∃b ∈ M : ab = ba = 1},
R(M ) = {a ∈ M | ∃b ∈ M : ab = 1}, and L(M ) = {a ∈ M | ∃b ∈ M : ba = 1}.
When using a monoid M as part of a control mechanism, the subset J(M ) =
{a ∈ M | ∃b, c ∈ M : bac = 1} will play an important role. By M n , we denote
the n-fold direct product of M , i.e. M n = M × · · · × M with n factors.
Let S ⊆ M be a subset. If there is no danger of confusion with the n-fold
direct product, we write S n for the set of all elements of M that can be written
as a product of n factors from S.
Let Σ be a fixed countable set of abstract symbols, the finite subsets of which
are called alphabets. For an alphabet X, we will write X ∗ for the set of words
over X. The empty word is denoted by λ ∈ X ∗ . Together with concatenation as
its operation, X ∗ is a monoid. For a symbol x ∈ X and a word w ∈ X ∗ , let |w|x
be the number of occurrences of x in w. Given an alphabet X and a monoid M ,
subsets of X ∗ and X ∗ × M are called languages and transductions, respectively.
A family is a set of languages that is closed under isomorphism and contains at
least one non-trivial member.
Given an alphabet X, we write X ⊕ for the set of maps α : X → N. Elements
of X ⊕ are called multisets. By way of pointwise addition, written α + β, X ⊕ is a
commutative monoid. We write 0 for the empty multiset, i.e. the one that maps
every x ∈ X to 0 ∈ N. For α ∈ X ⊕ , let |α| = Σx∈X α(x). The Parikh mapping
is the mapping Ψ : Σ ∗ → Σ ⊕ with Ψ (w)(x) = |w|x for w ∈ Σ ∗ and x ∈ Σ.
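The Parikh mapping is directly computable with a multiset type; a small sketch using Python's Counter, which also illustrates that Ψ is a morphism from (Σ∗, ·) to (Σ⊕, +):

```python
from collections import Counter

def parikh(w):
    """Parikh image of w: the multiset mapping each symbol x to |w|_x."""
    return Counter(w)

# Psi is a monoid morphism: concatenation maps to multiset addition.
u, v = "abba", "cab"
assert parikh(u + v) == parikh(u) + parikh(v)
assert parikh("abba")["a"] == 2   # |w|_a
```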
Let A be a (not necessarily finite) set of symbols and R ⊆ A∗ × A∗ . The
pair (A, R) is called a (monoid) presentation. The smallest congruence of A∗
containing R is denoted by ≡R and we will write [w]R for the congruence class
of w ∈ A∗ . The monoid presented by (A, R) is defined as A∗ /≡R . Note that
since we did not impose a finiteness restriction on A, every monoid has a pre-
sentation. Furthermore, for monoids M1 , M2 we can find presentations (A1 , R1 )
and (A2 , R2 ) such that A1 ∩ A2 = ∅. We define the free product M1 ∗ M2 to
be presented by (A1 ∪ A2 , R1 ∪ R2 ). Note that M1 ∗ M2 is well-defined up to
isomorphism. By way of the injective morphisms [w]Ri ↦ [w]R1 ∪R2 , w ∈ A∗i for
i = 1, 2, we will regard M1 and M2 as subsets of M1 ∗ M2 . In analogy to the
n-fold direct product, we write M (n) for the n-fold free product of M .
Rational Sets. Let M be a monoid. An automaton over M is a tuple A =
(Q, M, E, q0 , F ), in which Q is a finite set of states, E is a finite subset of Q ×
M × Q called the set of edges, q0 ∈ Q is the initial state, and F ⊆ Q is the set of
final states. The step relation ⇒A of A is a binary relation on Q × M , for which
(p, a) ⇒A (q, b) iff there is an edge (p, c, q) such that b = ac. The set generated
by A is then S(A) = {a ∈ M | ∃q ∈ F : (q0 , 1) ⇒∗A (q, a)}.
A set R ⊆ M is called rational if it can be written as R = S(A) for some
automaton A over M . The set of rational subsets of M is denoted by RAT(M ).
Given two subsets S, T ⊆ M , we define ST = {st | s ∈ S, t ∈ T }. Since {1} ∈
RAT(M ) and ST ∈ RAT(M ) whenever S, T ∈ RAT(M ), this operation makes
RAT(M ) a monoid itself.
Let C be a commutative monoid for which we write the composition additively.
For n ∈ N and c ∈ C, we use nc to denote c + · · · + c (n summands). A subset
S ⊆ C is linear if there are elements s0 , . . . , sn such that S = {s0 + Σni=1 ai si |
ai ∈ N, 1 ≤ i ≤ n}. A set S ⊆ C is called semilinear if it is a finite union of
linear sets. By SL(C), we denote the set of semilinear subsets of C. It is well-
known that RAT(C) = SL(C) for commutative C (we will, however, sometimes
still use SL(C) to make explicit that the sets at hand are semilinear). Moreover,
SL(C) is a commutative monoid by way of the product (S, T ) ↦ S + T = {s + t |
s ∈ S, t ∈ T }. It is well-known that the class of semilinear subsets of a free
commutative monoid is closed under intersection [6].
In slight abuse of terminology, we will sometimes call a language L semilinear
if the set Ψ (L) is semilinear. If there is no danger of confusion, we will write S ⊕
for the set of all finite sums of elements of S whenever S is a subset of a
commutative monoid C. Note that if X is regarded as a subset of X ⊕ , the two
meanings of X ⊕ coincide.
Valence Automata. A valence automaton over M is an automaton A over the
monoid X ∗ × M , where X is an alphabet. An edge (p, w, m, q) in A is called a
λ-transition if w = λ. A is called λ-free if it has no λ-transitions. The language
accepted by A is defined as L(A) = {w ∈ X ∗ | (w, 1) ∈ S(A)}. The class of
languages accepted by valence automata and λ-free valence automata over M is
denoted by VA(M ) and VA+ (M ), respectively.
A finite automaton is a valence automaton over the trivial monoid 1. For a
finite automaton A = (Q, X ∗ × 1, E, q0 , F ), we also write A = (Q, X, E, q0 , F ).
Languages accepted by finite automata are called regular languages. The finite
automaton A is spelling if E ⊆ Q × X × Q, i.e. every edge carries exactly one
letter. Let M and C be monoids. A valence transducer over M with output in C
is an automaton A over X ∗ × M × C, where X is an alphabet. The transduction
performed by A is T (A) = {(x, c) ∈ X ∗ × C | (x, 1, c) ∈ S(A)}. A valence
transducer is called λ-free if it is λ-free as a valence automaton. We denote the
class of transductions performed by (λ-free) valence transducers over M with
output in C by VT(M, C) (VT+ (M, C)).
Graphs. A graph is a pair Γ = (V, E) where V is a finite set and E ⊆ {S ⊆
V | 1 ≤ |S| ≤ 2}. The elements of V are called vertices and those of E are
called edges. Vertices v, w ∈ V are adjacent if {v, w} ∈ E. If {v} ∈ E for some
v ∈ V , then v is called a looped vertex, otherwise it is unlooped. A subgraph
of Γ is a graph (V ′ , E ′ ) with V ′ ⊆ V and E ′ ⊆ E. Such a subgraph is called
induced (by V ′ ) if E ′ = {S ∈ E | S ⊆ V ′ }, i.e. E ′ contains all edges from E
incident to vertices in V ′ . By Γ \ {v}, for v ∈ V , we denote the subgraph of Γ
induced by V \ {v}. Given a graph Γ = (V, E), its underlying loop-free graph is
Γ ′ = (V, E ′ ) with E ′ = E ∩ {S ⊆ V | |S| = 2}. For a vertex v ∈ V , the elements
of N (v) = {w ∈ V | {v, w} ∈ E} are called neighbors of v. A looped clique is a
graph in which E = {S ⊆ V | 1 ≤ |S| ≤ 2}. Moreover, a clique is a loop-free
graph in which any two distinct vertices are adjacent. Finally, an anti-clique is
a graph with E = ∅.
A presentation (A, R) in which A is a finite alphabet is a Thue system. To
each graph Γ = (V, E), we associate the Thue system TΓ = (XΓ , RΓ ) over the
alphabet XΓ = {av , āv | v ∈ V }. RΓ is defined as

RΓ = {(av āv , λ) | v ∈ V } ∪ {(xy, yx) | x ∈ {av , āv }, y ∈ {aw , āw }, {v, w} ∈ E}.

In particular, we have (av āv , āv av ) ∈ RΓ whenever {v} ∈ E. To simplify nota-


tion, the congruence ≡TΓ is then also denoted by ≡Γ and [w]TΓ is also denoted
[w]Γ . In order to describe the monoids we use to model storage mechanisms, we
define monoids using graphs. To each graph Γ , we associate the monoid

MΓ = XΓ∗ /≡Γ .

If Γ consists of one vertex and has no edges, MΓ is also denoted as B and we


will refer to it as the bicyclic monoid. The generators av and āv are then also
written a and ā, respectively.

3 Overview of Results
Storage Mechanisms as Monoids. First, we will see how pushdown storages
and (partially) blind counters can be regarded as monoids of the form MΓ .
See Table 1 for examples. Clearly, in the bicyclic monoid B, a word over the
generators a and ā is the identity if and only if in every prefix of the word, there
are at least as many a’s as there are ā’s and in the whole word, there are as
many a’s as there are ā’s. Thus, a valence automaton over B is an automaton
with one counter that cannot go below zero and is zero in the end. Here, the
increment operation corresponds to a and the decrement corresponds to ā.
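This identity condition in B is easy to test; a sketch encoding a as "+" and ā as "-":

```python
def is_identity_in_B(ops):
    """Decide whether a word over {a (encoded '+'), ā (encoded '-')} is the
    identity of the bicyclic monoid B: no prefix may contain more ā's than
    a's, and the whole word must contain equally many of both."""
    c = 0
    for op in ops:
        c += 1 if op == "+" else -1
        if c < 0:          # a decrement hit an empty counter
            return False
    return c == 0

assert is_identity_in_B("++--")        # a a ā ā reduces to the identity
assert not is_identity_in_B("+-+")     # counter nonzero at the end
assert not is_identity_in_B("-+")      # a prefix goes below zero
```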

Table 1. Examples of storage mechanisms (the graphs Γ are given by figures in the original)

Monoid MΓ    Storage mechanism
B(3)         Pushdown (with three symbols)
B3           Three partially blind counters
Z3           Three blind counters
B(2) × Z2    Pushdown (with two symbols) and two blind counters

Observe that building the direct product means that both storage mechanisms
(described by the factors) are available and can be used simultaneously. Thus,
valence automata over Bn are automata with n partially blind counters. There-
fore, if Γ is a clique, then MΓ ≅ Bn corresponds to a partially blind multicounter
storage.
Furthermore, the free product of a monoid M with B yields what can be
seen as a stack of elements of M : a valence automaton over M ∗ B can store a
sequence of elements of M (separated by a) such that it can only remove the
topmost element if it is the identity element. The available operations are those
available for M (which then operate on the topmost entry) and in addition push
(represented by a) and pop (represented by ā). Thus, B ∗ B corresponds to a
stack over two symbols. In particular, if Γ is an anti-clique (with at least two
vertices), then MΓ ≅ B(n) represents a pushdown storage.
Finally, valence automata over Zn (regarded as a monoid by way of addition)
correspond to automata with n blind counters. Hence, if Γ is a looped clique,
then MΓ ≅ Zn corresponds to a blind multicounter storage.
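The validity conditions behind these multicounter storages can be phrased as a run checker; a minimal sketch (encoding ours): a run over Zn (blind) only has to end with all counters at zero, while a run over Bn (partially blind) must in addition never drive a counter below zero.

```python
def valid_run(ops, n, blind):
    """Check a sequence of counter operations on n counters.

    ops: list of (counter index, +1 or -1). A run is valid if all counters
    end at zero; partially blind counters (blind=False) must in addition
    never drop below zero."""
    c = [0] * n
    for i, d in ops:
        c[i] += d
        if not blind and c[i] < 0:
            return False
    return all(v == 0 for v in c)

run = [(0, -1), (0, +1)]                    # decrement before increment
assert valid_run(run, 1, blind=True)        # fine for a blind counter (Z)
assert not valid_run(run, 1, blind=False)   # invalid for partially blind (B)
```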
Main Results. Our class of monoids that generalizes pushdown and blind
multicounter storages is the class of MΓ where in Γ , any two looped vertices are
adjacent and no two unlooped vertices are adjacent. Our first main result is the
following.
Theorem 1. Let Γ be a graph such that any two looped vertices are adjacent
and no two unlooped vertices are adjacent. Then, the following are equivalent:
(1) VA+ (MΓ ) = VA(MΓ ).
(2) Every language in VA(MΓ ) is context-sensitive.
(3) The membership problem of each language in VA(MΓ ) is in NP.
(4) Every language in VA(MΓ ) is decidable.
(5) Γ does not contain [a fixed graph, given by a figure in the original] as an induced subgraph.

Note that this generalizes the facts that in pushdown automata and in blind
counter automata, λ-transitions can be avoided. Furthermore, while Greibach’s
construction triples the number of counters, we do not need any additional ones.
It turns out that the storages that satisfy the equivalent conditions of Theorem
1 (and the hypothesis) are exactly those in the following class.
Definition 1. Let C be the smallest class of monoids such that 1 ∈ C and when-
ever M ∈ C, we also have M × Z ∈ C and M ∗ B ∈ C.
Thus, C contains those storage types obtained by successively adding blind coun-
ters and building a stack of elements. For example, we could have a stack each
of whose entries contains n blind counters. Or we could have an ordinary push-
down and a number of blind counters. Or a stack of elements, each of which is
a pushdown storage and a blind counter, etc. The simplest example of a stor-
age mechanism in C beyond blind multicounters and pushdowns is given by the
monoids (B ∗ B) × Zn for n ∈ N. It is not hard to see that these yield the same
languages as Parikh pushdown automata [13]. Hence, our result implies that the
latter also permit the removal of λ-transitions.
Our second main result concerns storages consisting of a number of blind
counters and a number of partially blind counters.
Theorem 2. Let Γ be a graph such that any two distinct vertices are adjacent.
Then, VA+ (MΓ ) = VA(MΓ ) if and only if r ≤ 1, where r is the number of
unlooped vertices in Γ .
In other words, when one has r partially blind counters and s blind counters,
λ-transitions can be eliminated if and only if r ≤ 1. Note that this generalizes
Greibach’s result that in partially blind multicounter automata, λ-transitions
are indispensable.
Key Technical Ingredients. As a first step, we show that for M ∈ C, all lan-
guages in VA(M ) are semilinear. This is needed in various situations throughout
the proof. We prove this using an old result by van Leeuwen [14], which states
that languages that are algebraic over a class of semilinear languages are semi-
linear themselves. Thereby, the corresponding Lemma 2 slightly generalizes one
of the central components in a decidability result by Lohrey and Steinberg on
the rational subset membership problem for graph groups [15] and provides a
simpler proof (relying, however, on van Leeuwen’s result).
Second, we use an undecidability result by Lohrey and Steinberg [15] concern-
ing the rational subset membership problem for certain graph groups. We deduce
that for monoids M outside of C (and satisfying the hypothesis of Theorem 1),
VA(M ) contains an undecidable language.
Third, in order to prove our claim by induction on the construction of M ∈ C,
we use a significantly stronger induction hypothesis: We show that it is not
only possible to remove λ-transitions from valence automata, but also from va-
lence transducers with output in a commutative monoid. Here, however, the con-
structed valence transducer is allowed to output a semilinear set in each step.
Monoids that admit such a transformation will be called strongly λ-independent.

Fourth, we develop a normal form result for rational subsets of monoids in C
(see Section 6). Such normal form results have been available for monoids
described by monadic rewriting systems (see, for example, [1]), which was applied
by Render and Kambites to monoids representing pushdown storages [17]. Un-
der different terms, this normal form trick has been used by Bouajjani, Esparza,
and Maler [2] and by Caucal [3] to describe rational sets of pushdown opera-
tions. However, since the monoids in C allow commutation of certain non-trivial
elements, a technique more general than those was necessary here. In the case
of monadic rewriting systems, one transforms a finite automaton according to
rewriting rules by gluing in new edges. Here, we glue in automata accepting sets
that are semilinear by earlier steps in the proof. See Lemma 6 for details.
Fifth, we have three new techniques to eliminate λ-transitions from valence
transducers while retaining the output in a commutative monoid. Here, we need
one technique to show that if M is strongly λ-independent, then M × Z is as
well. This technique again uses the semilinearity of certain sets and a result that
provides small preimages for morphisms from multisets to the integers.
The second technique is to show that B is strongly λ-independent. We use a
construction that allows the postponement of increment operations and the early
execution of decrement operations. This is used to show that one can restrict
to computations in which a sequence of increments, followed by a sequence of
decrements, will in the end change the counter only by a bounded amount.
The third technique is to show that if M is strongly λ-independent, where
M is non-trivial, then M ∗ B is as well. Here, the storage consists of a stack of
elements of M . The construction works by encoding rational sets over M ∗ B as
elements on the stack. We have to use the semilinearity results again in order
to be able to compute the set of all possible outputs when elements from two
given rational sets cancel each other out (in the sense that push operations are
followed by pop operations).

4 Semilinear Languages

This section contains semilinearity results that will be needed in later sections.
The first lemma guarantees small preimages of morphisms from multisets to the
integers. This will be used to bound the number of necessary operations on a
blind counter in order to obtain a given counter value.
Lemma 1. Let ϕ : X ⊕ → Z be a morphism. Then for any n ∈ Z, the set ϕ−1 (n)
is semilinear. In particular, ker ϕ is finitely generated. Furthermore, there is a
constant k ∈ N such that for any μ ∈ X ⊕ , there is a ν ≤ μ with μ ∈ ν + ker ϕ
and |ν| ≤ k · |ϕ(μ)|.
Another fact used in later sections is that languages in VA(M ) are semilinear
if M ∈ C. This will be employed in various constructions, for instance when
the effect of computations (that make use of M as storage) on the output in a
commutative monoid is to be realized by a finite automaton. We prove this using
a result of van Leeuwen [14], which states that semilinearity of all languages in a
family is inherited by languages that are algebraic over this family. A language
is called algebraic over a family of languages if it is generated by a grammar in
which each production allows a non-terminal to be replaced by any word from
a language in this family.
Note that in [15], a group G is called SLI-group if every language in VA(G) is
semilinear (in different terms, however). Thus, the following recovers the result
from [15] that the class of SLI-groups is closed under taking the free product.
Lemma 2. Every L ∈ VA(M0 ∗ M1 ) is algebraic over VA(M0 ) ∪ VA(M1 ).
Combining the latter lemma with van Leeuwen’s result and a standard argument
for the preservation of semilinearity when building the direct product with Z yields
the following.
Lemma 3. Let M ∈ C. Then, every language in VA(M ) is semilinear.

5 Membership Problems
In this section, we study decidability and complexity of the membership problem
for valence automata over MΓ . Specifically, we show in this section that for
certain graphs Γ , the class VA(MΓ ) contains undecidable languages (Lemma
5), while for every Γ , membership for languages in VA+ (MΓ ) is (uniformly)
decidable. We present two nondeterministic algorithms, one of them uses linear
space and one runs in polynomial time (Lemma 4).
These results serve two purposes. First, for those graphs Γ for which there
are undecidable languages in VA(MΓ ), it follows that silent transitions are in-
dispensable. Second, if we can show that silent transitions can be removed from
valence automata over MΓ , the algorithms also apply to languages in VA(MΓ ).
Our algorithms rely on the convergence property of certain reduction sys-
tems. For more information on reduction systems, see [1, 9]. The following
lemma makes use of two algorithms to decide, given a word w ∈ XΓ∗ , whether
[w]Γ = [λ]Γ . Specifically, we have a deterministic polynomial-time algorithm
that employs a convergent trace rewriting system to successively reduce a de-
pendence graph, which is then checked for emptiness. On the other hand, the
convergence of the same rewriting system is used in a (nondeterministic) linear
space algorithm to decide the equality above. These two algorithms are then
used to verify the validity of a guessed run to decide the membership problem
for languages in VA+ (MΓ ).
Lemma 4. For each L ∈ VA+ (MΓ ), the membership problem can be decided
by a nondeterministic polynomial-time algorithm as well as a nondeterministic
linear-space algorithm. Hence, the languages in VA+ (MΓ ) are context-sensitive.
The undecidability result is shown by reducing the rational subset membership
problem of the graph group corresponding to a path on four vertices, which was
proven undecidable by Lohrey and Steinberg [15], to the membership problem
of languages L ∈ VA(MΓ ).
Lemma 5. Let Γ be a graph whose underlying loop-free graph is a path on four
vertices. Then, VA(MΓ ) contains an undecidable language.

6 Rational Sets
When removing silent transitions, we will regard an automaton with silent tran-
sition as an automaton that is λ-free but is allowed to multiply a rational subset
(of the storage monoid) for each input symbol. In order to restrict the ways in
which elements can cancel out, these rational sets are first brought into a normal
form. Our normal form result essentially states that there is an automaton that
reads the generators in an order such that certain cancellations do not occur on
any path. Note that in a valence automaton over M , we can remove all edges
labeled with elements outside of J(M ). This is due to the fact that they cannot
be part of a valid computation. In a valence transducer over M with output
in C, the edges carry elements from X ∗ × M × C. Therefore, in the situation
outlined above, a rational set S ⊆ M × C will be replaced by S ∩ (J(M ) × C).

Lemma 6. Let M ∈ C and C be a commutative monoid and S ⊆ M × C a
rational set. Then, we have S ∩ (J(M ) × C) = ⋃ni=1 Li Ui Ri , in which

Li ∈ RAT(L(M ) × C), Ui ∈ RAT(H(M ) × C), Ri ∈ RAT(R(M ) × C)

for 1 ≤ i ≤ n. Moreover,
S ∩ (L(M ) × C) = ⋃1≤i≤n, 1∈Ri Li Ui ,    S ∩ (R(M ) × C) = ⋃1≤i≤n, 1∈Li Ui Ri .

7 Silent Transitions
The first lemma in this section can be shown using a simple combinatorial
argument.
Lemma 7. Let Γ be a graph such that any two looped vertices are adjacent, no
two unlooped vertices are adjacent, and Γ does not contain [the graph from
Theorem 1 (5), given by a figure in the original] as an induced subgraph. Then, MΓ is in C.
We prove Theorem 1 by showing that VA+ (M ) = VA(M ) for every M ∈ C. This
will be done using an induction with respect to the definition of C. In order for
this induction to work, we need to strengthen the induction hypothesis. The
latter will state that for any M ∈ C and any commutative monoid C, we can
transform a valence transducer over M with output in C into another one that
has no λ-transitions but is allowed to output a semilinear set of elements in
each step. Formally, we will show that each M ∈ C is strongly λ-independent:
Let C be a commutative monoid and T ⊆ X ∗ × SL(C) be a transduction. Then
Φ(T ) ⊆ X ∗ × C is defined as Φ(T ) = {(w, c) ∈ X ∗ × C | ∃(w, S) ∈ T : c ∈ S}.
For a class F of transductions, Φ(F ) is the class of all Φ(T ) with T ∈ F .
A monoid M is called strongly λ-independent if for any commutative monoid
C, we have VT(M, C) = Φ(VT+ (M, SL(C))). Note that Φ(VT+ (M, SL(C))) ⊆
VT(M, C) holds for any M and C. In order to have equality, it is necessary to
grant the λ-free transducer the ability to output semilinear sets, since valence
transducers without λ-transitions and with output in C can only output finitely
many elements per input word. With λ-transitions, however, a valence transducer
can output an infinite set for one input word.
By choosing the trivial monoid for C, we can see that for every strongly
λ-independent monoid M , we have VA+ (M ) = VA(M ). Indeed, given a valence
automaton A over M , add an output of 1 to each edge and transform the resulting
valence transducer into a λ-free one with output in SL(1). The latter can then
clearly be turned into a valence automaton for the language accepted by A.
The following three lemmas each employ a different technique to eliminate
silent transitions. Together with Lemma 7 and the results in Section 5, they
yield the main result.
Lemma 8. B is strongly λ-independent.
Lemma 9. If M ∈ C is strongly λ-independent, then M × Z is as well.
Lemma 10. Suppose M ∈ C is non-trivial and strongly λ-independent. Then,
M ∗ B is strongly λ-independent as well.
We will now outline the proof of Theorem 2. By Theorem 1, we already know
that when r ≤ 1, we have VA+ (MΓ ) = VA(MΓ ). Hence, it suffices to show
that VA+ (MΓ ) ⊊ VA(MΓ ) if r ≥ 2. Greibach [8] and, independently, Jantzen
[11] have shown that the language L1 = {wcn | w ∈ {0, 1}∗, n ≤ bin(w)}
can be accepted by a partially blind counter machine with two counters, but not
without λ-transitions. Here, bin(w) denotes the number obtained by interpreting
w as a base 2 representation: bin(w1) = 2 · bin(w) + 1, bin(w0) = 2 · bin(w),
bin(λ) = 0. Since we have to show VA+ (Br × Zs ) ⊊ VA(Br × Zs ) and we know
L1 ∈ VA(Br × Zs ), it suffices to prove L1 ∉ VA+ (Br × Zs ). We do this by
transforming Greibach’s and Jantzen’s proof into a general property of languages
accepted by valence automata without λ-transitions. We will then apply this to
show that L1 ∉ VA+ (Br × Zs ).
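A sketch of bin and of membership in L1 (function names are ours; the left-to-right loop implements the recursive definition bin(w1) = 2 · bin(w) + 1, bin(w0) = 2 · bin(w), bin(λ) = 0):

```python
def bin_val(w):
    """bin(w): read w over {0,1} as a base-2 numeral; bin(lambda) = 0."""
    n = 0
    for bit in w:
        n = 2 * n + int(bit)
    return n

def in_L1(word):
    """Membership in L1 = { w c^n : w in {0,1}*, n <= bin(w) }."""
    w = word.rstrip("c")          # split off the trailing block of c's
    n = len(word) - len(w)
    return set(w) <= {"0", "1"} and n <= bin_val(w)

assert bin_val("101") == 5
assert in_L1("101ccccc") and not in_L1("101cccccc")
```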
Let M be a monoid. For x, y ∈ M , write x ≡ y iff x and y have the same
set of right inverses. For a finite subset S ⊆ M and n ∈ N, let fM,S (n) be the
number of equivalence classes of ≡ in S n ∩ R(M ). The following notion is also
used as a tool to prove lower bounds in state complexity of finite automata [7].
Here, we use it to prove lower bounds on the number of configurations that an
automaton must be able to reach in order to accept a language L. Let n ∈ N.
An n-fooling set for a language L ⊆ Θ∗ is a set F ⊆ Θn × Θ∗ such that (i)
for each (u, v) ∈ F , we have uv ∈ L, and (ii) for (u1 , v1 ), (u2 , v2 ) ∈ F such
that u1 ≠ u2 , we have u1 v2 ∉ L or u2 v1 ∉ L. Let gL : N → N be defined as
gL (n) = max{|F | | F is an n-fooling set for L}.
The following three lemmas imply that L1 ∉ VA+ (Br × Zs ) for any r, s ∈ N.
Lemma 11. Let M be a monoid and L ∈ VA+ (M ). Then, there is a constant
k ∈ N and a finite set S ⊆ M such that gL (n) ≤ k · fM,S (n) for all n ∈ N.
Lemma 12. For L = L1 , we have gL (n) ≥ 2n for every n ∈ N.
Lemma 13. Let M = Br × Zs for r, s ∈ N and S ⊆ M a finite set. Then, fM,S
is bounded by a polynomial.
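The bound of Lemma 12 can be checked empirically for small n with the witness family F = {(u, c^bin(u)) | u ∈ {0,1}^n}; this particular family is a natural choice on our part, not stated explicitly in the text:

```python
from itertools import product

def bin_val(w):
    """Read w over {0,1} as a base-2 numeral."""
    n = 0
    for bit in w:
        n = 2 * n + int(bit)
    return n

def in_L1(word):
    """Membership in L1 = { w c^n : w in {0,1}*, n <= bin(w) }."""
    w = word.rstrip("c")
    return set(w) <= {"0", "1"} and (len(word) - len(w)) <= bin_val(w)

def check_fooling(n):
    """Verify that F = {(u, c^bin(u)) : u in {0,1}^n} is an n-fooling set
    for L1, witnessing g_L(n) >= 2^n."""
    F = [("".join(u), "c" * bin_val("".join(u)))
         for u in product("01", repeat=n)]
    for u, v in F:
        assert in_L1(u + v)                      # condition (i)
    for (u1, v1), (u2, v2) in product(F, F):     # condition (ii)
        if u1 != u2:
            assert not in_L1(u1 + v2) or not in_L1(u2 + v1)
    return len(F)

assert check_fooling(3) == 8    # 2^3 pairs survive both conditions
```

Condition (ii) holds because distinct n-bit words have distinct bin-values, so the word with the larger value supplies a c-block that the other cannot afford.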

Acknowledgements. The author would like to thank Nils Erik Flick, Reiner
Hüchting, Matthias Jantzen, and Klaus Madlener for comments that improved
the presentation of the paper.

References
[1] Book, R.V., Otto, F.: String-Rewriting Systems. Springer, New York (1993)
[2] Bouajjani, A., Esparza, J., Maler, O.: Reachability Analysis of Pushdown Au-
tomata: Application to Model-Checking. In: Mazurkiewicz, A., Winkowski, J.
(eds.) CONCUR 1997. LNCS, vol. 1243, pp. 135–150. Springer, Heidelberg (1997)
[3] Caucal, D.: On infinite transition graphs having a decidable monadic theory.
Theor. Comput. Sci. 290(1), 79–115 (2003)
[4] Elder, M., Kambites, M., Ostheimer, G.: On Groups and Counter Automata.
Internat. J. Algebra Comput. 18(8), 1345–1364 (2008)
[5] Gilman, R.H.: Formal Languages and Infinite Groups. In: Geometric and Compu-
tational Perspectives on Infinite Groups. DIMACS Series in Discrete Mathematics
and Theoretical Computer Science, vol. 25 (1996)
[6] Ginsburg, S., Spanier, E.H.: Bounded Algol-Like Languages. Trans. Amer. Math.
Soc. 113(2), 333–368 (1964)
[7] Glaister, I., Shallit, J.: A lower bound technique for the size of nondeterministic
finite automata. Inf. Process. Lett. 59(2), 75–77 (1996)
[8] Greibach, S.A.: Remarks on blind and partially blind one-way multicounter ma-
chines. Theor. Comput. Sci. 7(3), 311–324 (1978)
[9] Huet, G.: Confluent Reductions: Abstract Properties and Applications to Term
Rewriting Systems. J. ACM 27(4), 797–821 (1980)
[10] Ibarra, O.H., Sahni, S.K., Kim, C.E.: Finite automata with multiplication. Theor.
Comput. Sci. 2(3), 271–294 (1976)
[11] Jantzen, M.: Eigenschaften von Petrinetzsprachen (in German). PhD thesis, Universität Hamburg (1979)
[12] Kambites, M.: Formal Languages and Groups as Memory. Communications in
Algebra 37(1), 193–208 (2009)
[13] Karianto, W.: Adding Monotonic Counters to Automata and Transition Graphs.
In: De Felice, C., Restivo, A. (eds.) DLT 2005. LNCS, vol. 3572, pp. 308–319.
Springer, Heidelberg (2005)
[14] van Leeuwen, J.: A generalisation of Parikh’s theorem in formal language theory.
In: Loeckx, J. (ed.) ICALP 1974. LNCS, vol. 14, pp. 17–26. Springer, Heidelberg
(1974)
[15] Lohrey, M., Steinberg, B.: The submonoid and rational subset membership prob-
lems for graph groups. J. Algebra 320(2), 728–755 (2008)
[16] Mitrana, V., Stiebe, R.: Extended finite automata over groups. Discrete Applied
Mathematics 108(3), 287–300 (2001)
[17] Render, E., Kambites, M.: Rational subsets of polycyclic monoids and valence
automata. Inform. and Comput. 207(11), 1329–1339 (2009)
New Online Algorithms for Story Scheduling
in Web Advertising

Susanne Albers and Achim Passen

Department of Computer Science, Humboldt-Universität zu Berlin


{albers,passen}@informatik.hu-berlin.de

Abstract. We study storyboarding where advertisers wish to present
sequences of ads (stories) uninterruptedly on a major ad position of a web
page. These jobs/stories arrive online and are triggered by the brows-
ing history of a user who at any time continues surfing with probability
β. The goal of an ad server is to construct a schedule maximizing the
expected reward. The problem was introduced by Dasgupta, Ghosh, Naz-
erzadeh and Raghavan (SODA’09) who presented a 7-competitive online
algorithm. They also showed that no deterministic online strategy can
achieve a competitiveness smaller than 2, for general β.
We present improved algorithms for storyboarding. First we give a
simple online strategy that achieves a competitive ratio of 4/(2 − β),
which is upper bounded by 4 for any β. The algorithm is also 1/(1 − β)-
competitive, which gives better bounds for small β. As the main result of
this paper we devise a refined√ algorithm that attains a competitive ratio
of c = 1+φ, where φ = (1+ 5)/2 is the Golden Ratio. This performance
guarantee of c ≈ 2.618 is close to the lower bound of 2. Additionally,
we study for the first time a problem extension where stories may be
presented simultaneously on several ad positions of a web page. For this
parallel setting we provide
√ an algorithm whose competitive ratio is upper
bounded by 1/(3 − 2 2) ≈ 5.828, for any β. All our algorithms work in
phases and have to make scheduling decisions only every once in a while.

1 Introduction
Online advertising has grown steadily over the last years. The worldwide online
ad spending reached $100 billion in 2012 and is expected to surpass the print ad
spending during the next few years [4,9]. In this paper we study an algorithmic
problem in advertising introduced by Dasgupta, Ghosh, Nazerzadeh and Raghavan
[3]. An advanced online ad format is storyboarding, which was first launched
by New York Times Digital and is also referred to as surround sessions [10].
In storyboarding, while a user surfs the web and visits a particular website, a
single advertiser controls a major ad position for a certain continuous period of
time. The advertiser can use these time slots to showcase a range of products
and build a linear story line. Typically several advertisers compete for the ad
position, depending on the user’s browsing history and current actions. The goal
of an ad server is to allocate advertisers to the time slots of a user’s browsing
session so as to maximize the total revenue.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 446–458, 2013.
© Springer-Verlag Berlin Heidelberg 2013
Dasgupta, Ghosh, Nazerzadeh and Raghavan [3] formulated storyboarding
as an online job scheduling problem. Consider a user that starts a web session
at time t = 0. Time is slotted. At any time t the user continues surfing with
probability β, where 0 < β ≤ 1, and stops surfing with probability 1 − β. Hence
the surfing time is a geometrically distributed random variable. Over time jobs
(advertisers) arrive online. These jobs arise based on the user’s browsing history
and accesses to web content. Each job i is specified by an arrival time ai , a
length li and a per-unit value vi . Here li is the length of the ad sequence the
advertiser would like to present and vi is the reward obtained by the server in
showing one unit of job i. This reward has to be discounted by the time when the
job unit is shown. Considering all incoming jobs, we obtain a problem instance
I = (a_i, v_i, l_i)_{i=1}^N, where N ∈ ℕ ∪ {∞}. We allow N = ∞ to model potentially
infinitely long browsing sessions and associated job arrivals.
A schedule S for I specifies which job to process at any time t ≥ 0. The
schedule does not have to contain all jobs; it is allowed to leave out (unattractive)
jobs. Schedule S is feasible if every scheduled job i is processed at times t ≥
ai for up to li time units. Moreover, it is required that each scheduled job is
processed continuously without interruption so that an advertiser can build a
story. Preemption of jobs is allowed, i.e. a job i may be processed for less than li
time units. In this case no value can be attained for the preempted unscheduled
portion of a job. Given a schedule S, its value is defined as the expected value
Σ_{t=0}^{∞} β^t v(t), where v(t) is the per-unit value of the job scheduled at time t. The
goal is to maximize this reward. Let ALG be an online algorithm that, given any
input I, constructs a schedule of value ALG(I). Let OPT (I) be the value of an
optimal offline schedule for I. Algorithm ALG is c-competitive if there exists a
constant α such that c · ALG(I) + α ≥ OPT (I) holds for all I, cf. [12].
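The objective can be made concrete with a minimal sketch (the helper name is ours, not from the paper) that evaluates the expected discounted reward of a finite schedule:

```python
def schedule_value(per_unit_values, beta):
    """Expected reward of a schedule: sum over slots t of beta^t * v(t).

    per_unit_values[t] is the per-unit value v(t) of the job shown in
    slot t (use 0.0 for an idle slot).  Hypothetical helper, not from
    the paper."""
    return sum(beta ** t * v for t, v in enumerate(per_unit_values))
```

For example, showing a job of per-unit value 3 for two slots and then a job of value 1, with β = 0.5, yields 3 + 0.5·3 + 0.25·1 = 4.75.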
Previous Work: Algorithmic problems in online advertising have received
considerable research interest lately, see e.g. [1,2,5,6,7,8,11] and references therein.
To the best of our knowledge storyboarding, from an algorithmic perspective, has
only been studied so far by Dasgupta et al. [3]. A first observation is that if β = 1,
then the scheduling problem is trivial. Every schedule that never preempts jobs
and sequences them in an arbitrary order, subject to arrival constraints, achieves
an optimal value. Therefore we concentrate on the case that the discount factor
β satisfies 0 < β < 1.
Dasgupta et al. [3] showed that no deterministic online algorithm can achieve
a competitive ratio smaller than β + β². This ratio can be arbitrarily close to
2 as β → 1. Hence, for general β, no deterministic online strategy can achieve
a competitiveness smaller than 2. As a main result Dasgupta et al. devised a
greedy algorithm that is 7-competitive. At any time the algorithm checks if it
is worthwhile to preempt the job i currently being executed. To this end the
strategy compares the reward obtained in scheduling another unit of job i to the
loss incurred in delaying jobs of per-unit value higher than vi for one time unit.
Furthermore, Dasgupta et al. addressed a problem extension where jobs have
increasing rather than constant per-unit values. They focused on the case that
value is obtained only when a job is completely finished. The authors showed that
448 S. Albers and A. Passen

no algorithm can achieve a constant competitive ratio and gave a strategy with a
logarithmic competitiveness. Finally Dasgupta et al. studied an extension where
a job must be scheduled immediately upon arrival; otherwise it is lost. Here they
proved a logarithmic lower bound on the performance of any randomized online
strategy.
Our Contribution: We present new and improved online algorithms for story-
boarding. All strategies follow the paradigm of processing a given job sequence
I in phases, where a phase consists of k consecutive time steps in the scheduling
horizon, for some k ∈ N. At the beginning of each phase an algorithm computes
a schedule for the phase, ignoring jobs that may arrive during the phase. Hence
the strategies have to make scheduling decisions only every once in a while.
First in Section 2 we give a simple algorithm that computes an optimal schedule
for each phase and preempts jobs that are not finished at the end of the
respective phase. We prove that the competitive ratio of this strategy is exactly
1/(β^{k−1}(1 − β^k)), for all k ∈ ℕ and all β. The best choice of k gives a
competitiveness of 4/(2 − β), which is upper bounded by 4 for any β. If k is set to
1, the resulting algorithm is 1/(1 − β)-competitive. This gives further improved
bounds for small β, i.e. when β < 2/3.
In Section 3, as our main contribution, we devise a refined algorithm that
prefers not to preempt jobs sequenced last in a phase but rather tries to continue
them in the following phase. The competitive ratio of this strategy is upper
bounded by 1/β^{k−1} · max{1/β^{k−1}, 1/(1 − β^{2k}), β^{3k}/(1 − β^k)}. Using the best
choice of k, we obtain a competitive factor of c = 1 + φ, where φ = (1 + √5)/2
is the Golden Ratio. Hence c ≈ 2.618 and this performance guarantee is close to
the lower bound of 2 presented by Dasgupta et al. [3] for general β.
In Section 4 we consider for the first time a problem extension where a web
page features not only one but several ad positions where stories can be presented
simultaneously. This is a natural extension because many web pages do contain
a (small) number of ad positions. Again a job sequence I = (a_i, v_i, l_i)_{i=1}^N is
triggered by the browsing history of a user. We assume that an ad server may
assign these jobs to a general number m of ad positions. Following the scheduling
terminology we refer to these ad positions as machines. In a feasible schedule
each job must be processed continuously without interruption on one machine.
A migration of jobs among machines is not allowed. The value of a schedule is
Σ_{t=0}^{∞} Σ_{j=1}^{m} β^t v(t, j), where v(t, j) is the per-unit value of the job scheduled on
machine j at time t. We extend our first algorithm to this parallel setting and
derive a strategy that achieves a competitive ratio of (1 + 1/(1 − β(2 − √2)))/(2 −
√2). For small β, this ratio can be as low as 2/(2 − √2) ≈ 3.414. For any β, the
ratio is upper bounded by 1/(3 − 2√2) ≈ 5.828.
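The two bounds just quoted for the parallel setting can be checked numerically; the sketch below (the function name is ours) evaluates the claimed ratio at the extreme values of β:

```python
import math

def parallel_ratio(beta):
    # Competitive ratio claimed for the parallel-machine algorithm:
    # (1 + 1/(1 - beta*(2 - sqrt(2)))) / (2 - sqrt(2))
    s = 2 - math.sqrt(2)
    return (1 + 1 / (1 - beta * s)) / s

# beta -> 0 gives 2/(2 - sqrt(2)) ~ 3.414; beta -> 1 gives 1/(3 - 2*sqrt(2)) ~ 5.828
assert abs(parallel_ratio(0.0) - 2 / (2 - math.sqrt(2))) < 1e-12
assert abs(parallel_ratio(1.0) - 1 / (3 - 2 * math.sqrt(2))) < 1e-9
```

The second assertion uses the identity (2 + √2)/(2 − √2) = 3 + 2√2 = 1/(3 − 2√2).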
In the analyses of the algorithms we consider quantized inputs in which job
arrival times are integer multiples of k. For the setting where one ad position
is available (Sections 2 and 3), we are able to prove an interesting property
given any quantized input: In an online schedule or a slight modification thereof,
no job starts later than in an optimal offline schedule. This property has the
important consequence that, for its scheduled job portions, an online algorithm
achieves a total value that is at least as high as that of an optimal schedule.
Hence the competitive analyses essentially reduce to bounding the loss incurred
by an online strategy in preempting jobs. For the refined algorithm this loss
analysis is quite involved and in order to prove a small competitive ratio we
have to amortize the loss of a preempted job over several phases. In the setting
where multiple ad positions are available (Section 4), such a property on job
starting times unfortunately does not hold. Therefore we construct a specific
optimal schedule S ∗ that allows us to match job units sequenced in S ∗ to job
units sequenced online. Using this matching we can upper bound the additional
value achieved by an optimal solution.
Remark: Due to space limitations the proofs of many lemmas, theorems and
corollaries are omitted in this extended abstract. They are presented in the full
version of the paper.

2 A 4-competitive Algorithm
As mentioned before, all algorithms we present in this paper process a job se-
quence in phases. Let k ≥ 1 be an integer. A k-phase consists of k consecutive
time steps in the scheduling horizon. More specifically, the n-th k-phase is the
subsequence of time steps Pn = (n − 1)k, . . . , nk − 1, for any n ≥ 1. Our first
algorithm, called ALG1 k , computes an optimal schedule for any phase, given the
jobs that are available at the beginning of the phase. Such an optimal schedule
is obtained by simply sequencing the available jobs in order of non-increasing
per-unit value. Jobs that arrive during the phase are deferred until the beginning
of the next phase.
Formally, ALG1_k works as follows. We say that a job i is available at time t
if the job has arrived by time t, i.e. a_i ≤ t, and has not been scheduled so far at
any time t' < t. Consider an arbitrary phase P_n and let Q_n be the set of jobs
that are available at the beginning of Pn . We note that Qn includes the jobs
that arrive at time (n − 1)k. ALG1 k constructs a schedule for Pn by first sorting
the jobs of Qn in order of non-increasing per-unit value. Jobs having the same
per-unit value are sorted in order of increasing arrival times; ties may be broken
arbitrarily. Given this sorted sequence, ALG1 k then assigns the jobs one by one
to Pn until the k time steps are scheduled or the job sequence ends. In the former
case, the last job assigned to Pn is preempted at the end of the phase unless the
job completes by the end of Pn . ALG1 k executes this schedule for Pn , ignoring
jobs that may arrive during the phase at times t = (n − 1)k + 1, . . . , nk − 1.
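The per-phase construction can be sketched as follows (a simplified model with our own names; jobs are (arrival, value, length) triples):

```python
def alg1_phase(available_jobs, k):
    """One phase of ALG1_k (sketch): fill k time slots with the available
    jobs, sorted by non-increasing per-unit value, ties broken by arrival
    time.  Returns the per-slot job list; a job cut off at the phase end
    is preempted, matching ALG1_k."""
    order = sorted(available_jobs, key=lambda job: (-job[1], job[0]))
    slots = []
    for arrival, value, length in order:
        if len(slots) == k:
            break
        take = min(length, k - len(slots))       # truncate at the phase end
        slots += [(arrival, value, length)] * take
    return slots
```

For instance, with jobs (0, 5, 2) and (0, 7, 1) and k = 2, the value-7 job runs first and the value-5 job gets one unit before being preempted.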
We first evaluate the performance of ALG1 k , for general k. Then we will
determine the best choice of k.
Theorem 1. For all k ∈ ℕ and all probabilities β, ALG1_k is 1/(β^{k−1}(1 − β^k))-competitive.
In the following we prove the above theorem. Let I = (a_i, v_i, l_i)_{i=1}^N be an
arbitrary input. In processing I, ALG1_k defers jobs arriving after the beginning
of a phase until the start of the next phase. Consider a k-quantized input I_k
in which the arrival time of any job is set to the next integer multiple of k,
i.e. I_k = (a_i', v_i, l_i)_{i=1}^N, where a_i' = k·⌈a_i/k⌉. If a_i is a multiple of k and hence
coincides with the beginning of a k-phase, the job is not delayed. Otherwise
coincides with the beginning of a k-phase, the job is not delayed. Otherwise
the job is delayed until the beginning of the next phase. The schedule generated
by ALG1_k for I_k is identical to that computed by ALG1_k for I. Thus
ALG1 k (Ik ) = ALG1 k (I). In order to prove Theorem 1 it will be convenient to
compare ALG1 k (Ik ) to OPT (Ik ). The next lemma ensures that OPT (Ik ) and
the true optimum OPT(I) differ by a factor of at most 1/β^{k−1}.
Lemma 1. For all k ∈ ℕ and all probabilities β, 1/β^{k−1} · OPT(I_k) ≥ OPT(I).
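The quantization step itself is straightforward to state in code (a sketch; `quantize_arrivals` is our name):

```python
import math

def quantize_arrivals(jobs, k):
    """k-quantized input I_k: every arrival time a_i is rounded up to the
    next integer multiple of k, i.e. a_i' = k * ceil(a_i / k); per-unit
    values and lengths are unchanged."""
    return [(k * math.ceil(a / k), v, l) for a, v, l in jobs]
```

A job arriving at the start of a phase keeps its arrival time; all others are pushed to the next phase boundary, delaying each job by at most k − 1 steps.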
In order to estimate OPT (Ik ) we consider a stronger optimal offline algorithm
that was also proposed by Dasgupta et al. [3]. This algorithm is allowed to resume
interrupted jobs at a later point in time. We call this offline strategy CHOP .
For any input, at any time t CHOP schedules a job having the highest per-
unit value among the unfinished jobs that have arrived until time t. Obviously,
CHOP(I_k) ≥ OPT(I_k). Let S be the schedule computed by ALG1_k for I_k and
let S' be the schedule generated by CHOP for I_k. We assume w.l.o.g. that in
S' all jobs having a certain per-unit value v are processed in the same order as
in S. More specifically, all jobs having per-unit value v are processed in order
of increasing arrival times. Jobs of per-unit value v arriving at the same time
are processed in the same order as in S. Schedule S' can be easily modified so
that this property is satisfied. For any job i, let t_S(i) denote its starting time in
S and let t_{S'}(i) be its starting time in S'. If job i is never processed in S (or
S'), then we set t_S(i) = ∞ (or t_{S'}(i) = ∞). The following lemma states that
ALG1_k starts each job at least as early as CHOP.
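CHOP itself is easy to simulate. The sketch below (our own helper, with the infinite discounted sum truncated at a finite horizon, which is harmless for β < 1) runs, at each time step, a highest-value unfinished job among those that have arrived, resuming interrupted jobs later:

```python
import heapq

def chop_value(jobs, beta, horizon):
    """CHOP (sketch): at each time step run a job with the highest per-unit
    value among the unfinished jobs that have arrived; interrupted jobs may
    be resumed later.  `jobs` are (arrival, value, length) triples."""
    pending = sorted(jobs)                      # by arrival time
    heap, idx, total = [], 0, 0.0
    for t in range(horizon):
        while idx < len(pending) and pending[idx][0] <= t:
            a, v, l = pending[idx]
            heapq.heappush(heap, [-v, a, l])    # max-heap on per-unit value
            idx += 1
        if heap:
            total += beta ** t * (-heap[0][0])  # value of the job run at t
            heap[0][2] -= 1                     # one unit processed
            if heap[0][2] == 0:
                heapq.heappop(heap)
    return total
```

Mutating `heap[0][2]` is safe because the heap key (the negated value) never changes.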
Lemma 2. For any job i, t_S(i) ≤ t_{S'}(i).

Lemma 3. For all k ∈ ℕ and all probabilities β, 1/(1 − β^k) · ALG1_k(I_k) ≥ OPT(I_k).

Proof. For any n ≥ 1, let I_n = {i | (n − 1)k ≤ t_S(i) ≤ nk − 1} be the set of jobs
scheduled by ALG1_k in phase P_n. Let ALG1_k(P_n) be the value achieved by
ALG1_k in scheduling the jobs of I_n, and let CHOP(P_n) be the value achieved
by CHOP in processing these jobs. There holds ALG1_k(I_k) = Σ_n ALG1_k(P_n).
A consequence of Lemma 2 is that all jobs ever scheduled by CHOP also occur
in ALG1_k's schedule. Thus CHOP(I_k) = Σ_n CHOP(P_n). We will show
that CHOP(P_n)/ALG1_k(P_n) ≤ 1/(1 − β^k) holds for every n ≥ 1. This implies
CHOP(I_k)/ALG1_k(I_k) ≤ 1/(1 − β^k). The lemma then follows because
CHOP(I_k) ≥ OPT(I_k).
Consider any k-phase P_n. In the schedule S let j be the last job started in P_n
and let λ_j be the number of time units for which j is sequenced in P_n and thus
in the entire schedule S. By Lemma 2, for any job i, there holds t_S(i) ≤ t_{S'}(i).
Hence the total value achieved by CHOP in scheduling the jobs i ∈ I_n with i ≠ j
as well as the first λ_j time units of job j cannot be higher than ALG1_k(P_n). If
job j is preempted in S at the end of P_n, then CHOP can achieve an additional
value in scheduling units λ_j + 1, …, l_j of job j in S'. Again, since t_S(j) ≤ t_{S'}(j),
these units cannot be sequenced before the beginning of phase P_{n+1}, i.e. at time
nk. Thus the additional value achievable for units λ_j + 1, …, l_j is upper bounded
by β^{nk}/(1 − β) · v_j, which is obtained if a job of per-unit value v_j and infinite
length is sequenced starting at time nk.
Thus CHOP(P_n) ≤ ALG1_k(P_n) + β^{nk}/(1 − β) · v_j. In each phase ALG1_k
sequences jobs in order of non-increasing per-unit value. Hence each job of I_n has
a per-unit value of at least v_j. We conclude ALG1_k(P_n) ≥ (β^{(n−1)k} − β^{nk})/(1 − β) · v_j
and CHOP(P_n)/ALG1_k(P_n) ≤ 1 + β^{nk}/(β^{(n−1)k} − β^{nk}) = 1/(1 − β^k).

Combining Lemmas 1 and 3 with the fact that ALG1_k(I) = ALG1_k(I_k),
we obtain Theorem 1. We now determine the best value of k.
Corollary 1. For k = ⌈− log_β 2⌉, the resulting ALG1_k is 4/(2 − β)-competitive.
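Reading the choice in Corollary 1 as k = ⌈−log_β 2⌉, i.e. the smallest k with β^k ≤ 1/2 (our reading; the extraction lost the brackets), the bound of Theorem 1 can be checked against 4/(2 − β) numerically:

```python
import math

def alg1_ratio(beta, k):
    # Competitive ratio of ALG1_k from Theorem 1
    return 1.0 / (beta ** (k - 1) * (1.0 - beta ** k))

def best_k(beta):
    # Smallest k with beta^k <= 1/2 (our reading of Corollary 1)
    return math.ceil(-math.log(2) / math.log(beta))

for beta in (0.3, 0.5, 0.8, 0.95):
    assert alg1_ratio(beta, best_k(beta)) <= 4 / (2 - beta) + 1e-9
```

For β = 0.5 the best choice is k = 1 and the ratio is exactly 2, well below 4/(2 − β) ≈ 2.67.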

Theorem 2. For all k ∈ ℕ and all probabilities β, the competitive ratio of ALG1_k is not smaller than 1/(β^{k−1}(1 − β^k)).

The above theorem shows that our analysis of ALG1_k is tight. Finally we consider
the algorithm ALG1_1 in which the phase length k is set to 1.
Corollary 2. For all probabilities β, the competitive ratio of ALG1_1 is exactly
1/(1 − β).

Corollary 3. Set k = 1 if β ≤ 2/3 and k = ⌈− log_β 2⌉ otherwise. Then ALG1_k achieves a competitive ratio of min{1/(1 − β), 4/(2 − β)}.

3 A Refined Algorithm

We present a second algorithm that, compared to ALG1_k, reduces the loss incurred
in preempting jobs. The algorithm also operates in k-phases. Its crucial property
is that it continues processing a job scheduled last in a phase if this job is among
the highest-valued jobs available at the beginning of the next phase.
The refined algorithm, called ALG2_k, works in two steps. Again, let P_n be
any k-phase. Step (1) is defined as follows. If n > 1, then let i_n be the job
that was scheduled last in P_{n−1} and can potentially be continued in P_n. If this
job has been scheduled for less than l_{i_n} time units in the prior schedule, until
the end of P_{n−1}, then define a residual job i_n^r by (a_{i_n}, v_{i_n}, l_{i_n}^r). Here l_{i_n}^r is the
remaining length of job i_n, i.e. l_{i_n}^r further units have to be processed to complete
the job. Let Q_n be the set consisting of job i_n^r and the jobs available at the
beginning of P_n. ALG2_k schedules the jobs of Q_n in order of non-increasing
per-unit values in P_n. Again, jobs having the same per-unit value are scheduled
in order of increasing arrival times, where ties may be broken arbitrarily. Among
jobs having a per-unit value of v = v_{i_n}, job i_n^r is scheduled first. Let S''(P_n)
denote the schedule obtained for P_n at this point.
We next describe Step (2). If S''(P_n) does not contain job i_n^r, then S''(P_n) is
equal to the final schedule S(P_n) for the phase. If S''(P_n) contains job i_n^r and
this job is scheduled for s_n^r time units starting at time t_n^r in P_n, then ALG2_k
modifies S''(P_n) so as to obtain a feasible schedule. Loosely speaking, job i_n^r is
shifted to the beginning of P_n. More precisely, the original job i_n is scheduled
for s_n^r time units at the beginning of P_n. The start of all jobs scheduled from
time (n − 1)k to time t_n^r − 1 in S''(P_n) is delayed by s_n^r time units. Between time
t_n^r + s_n^r and the end of P_n, no modification is required. The resulting schedule is
the final output S(P_n). While this schedule is executed, newly arriving jobs are
deferred until the beginning of the next phase.
A pseudo-code description of ALG2 k is given below. We remark that a long
job i may be executed over several phases, provided that its per-unit value is
sufficiently high.

Algorithm ALG2_k: Each phase P_n is handled as follows. (1) If n > 1, let i_n
be the job scheduled last in P_{n−1}. If job i_n has been scheduled for less than l_{i_n}
time units so far, define job i_n^r by (a_{i_n}, v_{i_n}, l_{i_n}^r) and add it to Q_n. Let S''(P_n) be
the schedule obtained by sequencing the jobs of Q_n in order of non-increasing
per-unit value in P_n. (2) If S''(P_n) processes job i_n^r for s_n^r time units starting
at time t_n^r, then schedule job i_n for s_n^r time units at the beginning of P_n. Jobs
originally processed from time (n − 1)k to t_n^r − 1 are delayed by s_n^r time units.
Execute this schedule S(P_n) for P_n, ignoring jobs that arrive during the phase.
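The two steps above can be sketched in a simplified model (our own names; jobs and the residual are (arrival, value, remaining_length) triples):

```python
def alg2_phase(queue, residual, k):
    """One phase of ALG2_k (sketch).  `queue` holds the jobs available at
    the phase start; `residual` is the leftover of the job run last in the
    previous phase, or None.  Step (1): sort by non-increasing per-unit
    value, the residual winning ties at its own value.  Step (2): if the
    residual made the cut, move its units to the front of the phase."""
    jobs = list(queue) + ([residual] if residual is not None else [])
    order = sorted(jobs, key=lambda j: (-j[1], j is not residual, j[0]))
    slots = []
    for job in order:
        if len(slots) == k:
            break
        slots += [job] * min(job[2], k - len(slots))
    if any(job is residual for job in slots):
        s = sum(1 for job in slots if job is residual)
        slots = [residual] * s + [job for job in slots if job is not residual]
    return slots
```

A residual of value 5 competing with a new value-9 job is kept but shifted to the front, exactly the reordering that Step (2) performs; a residual that does not rank among the top jobs is dropped (preempted).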

Theorem 3. For all k ∈ ℕ and all probabilities β, algorithm ALG2_k achieves a competitive ratio of 1/β^{k−1} · max{1/β^{k−1}, 1/(1 − β^{2k}), β^{3k}/(1 − β^k)}.

We proceed to prove the above theorem. Compared to the proof of Theorem 1
the analysis is more involved because we have to take care of the delays incurred
by ALG2_k in Step (2) when scheduling a portion of job i_n at the beginning of
phase P_n and thereby postponing the start of jobs with higher per-unit values.
Furthermore, in order to achieve a small competitive ratio we have to charge the
loss of a job preempted in a phase to several adjacent phases. To this end we
have to classify phases and form schedule segments of up to three phases.
Again, for any input I = (a_i, v_i, l_i)_{i=1}^N, we consider the k-quantized input
I_k = (a_i', v_i, l_i)_{i=1}^N, where the arrival time of any job i is set to a_i' = k·⌈a_i/k⌉. There
holds ALG2_k(I_k) = ALG2_k(I) and, as shown in Lemma 1, 1/β^{k−1} · OPT(I_k) ≥
OPT(I). We will compare ALG2_k(I_k) to CHOP(I_k), where CHOP is the
stronger optimal offline algorithm described in Section 2. Again let S denote
the schedule computed by ALG2_k for I_k and let S' be CHOP's schedule for I_k.
In S' jobs having a certain per-unit value v are processed in the same order as
in S.
In order to evaluate ALG2_k(I_k), we define a schedule S'' that allows us to
prove a statement analogous to Lemma 2 and, moreover, to compare the per-unit
values of jobs scheduled in S'' and S'. For any phase P_n, consider the schedule
S''(P_n) computed in Step (1) of ALG2_k. If n > 1 and the residual job i_n^r is
scheduled for s_n^r time units starting at time t_n^r in P_n, then modify S''(P_n) by
scheduling the original job i_n for s_n^r time units starting at time t_n^r. By slightly
overloading notation, we refer to this modified schedule as S''(P_n). Schedule S''
is the concatenation of the S''(P_n), for all n ≥ 1.
In S''(P_n) jobs are sequenced in order of non-increasing per-unit value. Among
jobs of per-unit value v = v_{i_n}, job i_n is processed first. Schedule S''(P_n) differs
from S(P_n) only in that job i_n is sequenced after the jobs having a strictly higher
per-unit value than v_{i_n}. Each such job starts and finishes in P_n. The shift of the
job portion of i_n does not affect the relative order of jobs having the same per-unit
value. Hence in S'' and S, and thus in S'' and S', jobs of a certain per-unit
value v occur in the same relative order. We note that schedule S'' is infeasible
in that a job i_n may be interrupted at the end of P_{n−1} and resumed later in P_n.
For any job i, let t_{S''}(i) be its starting time in S'', i.e. the earliest time when a
portion of job i is processed. As usual t_S(i) and t_{S'}(i) denote the starting time
of job i in S and S', respectively.

Lemma 4. For any job i, t_{S''}(i) ≤ t_{S'}(i).

A main goal of the subsequent analysis is to bound the loss incurred by ALG2_k
in preempting jobs. The following lemma will be crucial as it specifies the earliest
time when a job preempted in S can occur again in S'.

Lemma 5. If job i is preempted in S(P_n) and the following phase schedules
S(P_{n+1}), …, S(P_{n'}) only process jobs of per-unit value higher than v_i, then S'
does not schedule job i in phases P_{n+1}, …, P_{n'}.

The proof of Lemma 5 relies on another lemma that compares per-unit values
of jobs scheduled in S'' and S'. At any time t, let v_{S''}(t) be the per-unit value of
the job scheduled in S'' and let v_{S'}(t) be the per-unit value of the job scheduled
in S'. Then v_{S''}(t) ≥ v_{S'}(t).
Phase Classification: We classify phases, considering the original schedule S.
A phase Pn is called preempted if a job is preempted in S(Pn ). Phase Pn is called
continued if the job scheduled last in S(Pn ) is also scheduled at the beginning
of S(Pn+1 ). Phase Pn is complete if all jobs scheduled in S(Pn ) are finished by
the end of Pn .
We mention some properties of these phases in the schedule S. (a) In each
phase Pn at most one job is preempted in S(Pn ). (b) If Pn is a continued or
complete phase, no job is preempted in S(Pn ). (c) If Pn is a preempted phase,
then the job preempted is one having the smallest per-unit value among jobs
scheduled in S(P_n). These properties can be verified as follows. Let P_n be an
arbitrary phase. When ALG2_k constructs a schedule for P_n, it first sorts the
jobs of Q_n in order of non-increasing per-unit value. In this sorted sequence only
the last job, say job i, assigned to P_n might not be scheduled completely in the
phase and hence is a candidate for preemption. Job i is one having the smallest
per-unit value among jobs scheduled in the phase. This shows properties (a) and
(c). If job i is not moved to the beginning of the phase in Step (2) of ALG2 k and
continued at the beginning of the next phase, then Pn is a continued phase and
no job is preempted in S(Pn ). By definition, no job is preempted in a complete
phase. This shows property (b). We observe that in the schedule S each phase
is either preempted, continued or complete.
Schedule Segments: For the further analysis we partition the schedule S into
segments where a segment consists of up to three consecutive phases. The purpose
of these segments is to combine “expensive” preempted phases with other
phases so as to amortize preemption loss. First we build segments consisting of
three phases. Phases Pn , Pn+1 , Pn+2 form a segment if Pn is a preempted phase
that is not preceded by a continued phase, Pn+1 is a continued phase and Pn+2
is a preempted phase. Among the remaining phases we build segments consisting
of two phases. Phases Pn , Pn+1 form a segment if (a) Pn is a preempted phase
that is not preceded by a continued phase and Pn+1 is a continued or complete
phase or (b) Pn is a continued phase followed by a preempted phase Pn+1 . Each
remaining phase forms a separate segment. We observe that a preempted phase
that forms a separate one-phase segment is not preceded by a continued phase
and is followed by a preempted phase.
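The segmentation rules above can be made concrete in code (a sketch under our reading of the rules; the function name is ours):

```python
def build_segments(kinds):
    """Partition a phase classification into the analysis segments (sketch).
    kinds[n] is 'preempted', 'continued' or 'complete' for the (n+1)-th
    phase.  Three-phase segments are formed first, then the two-phase
    rules (a) and (b), and every remaining phase is a segment by itself."""
    segs, i, n = [], 0, len(kinds)
    while i < n:
        not_after_cont = i == 0 or kinds[i - 1] != 'continued'
        if (kinds[i] == 'preempted' and not_after_cont and i + 2 < n
                and kinds[i + 1] == 'continued' and kinds[i + 2] == 'preempted'):
            segs.append(kinds[i:i + 3]); i += 3   # preempted/continued/preempted
        elif (kinds[i] == 'preempted' and not_after_cont and i + 1 < n
                and kinds[i + 1] in ('continued', 'complete')):
            segs.append(kinds[i:i + 2]); i += 2   # rule (a)
        elif (kinds[i] == 'continued' and i + 1 < n
                and kinds[i + 1] == 'preempted'):
            segs.append(kinds[i:i + 2]); i += 2   # rule (b)
        else:
            segs.append(kinds[i:i + 1]); i += 1   # one-phase segment
    return segs
```

Note how two adjacent preempted phases (without a continued phase in between) each form a one-phase segment, matching the observation above.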
For a segment σ, let ALG2_k(σ) be the value obtained by ALG2_k on σ. More
specifically, let I be the set of jobs scheduled by ALG2_k in the phases of σ. Set
I also includes those jobs that are only partially processed in σ and might also
be scheduled in phases before or after σ. Suppose that job i ∈ I is processed for
δ_i time units starting at time t_i in σ. Then

ALG2_k(σ) = Σ_{i∈I} β^{t_i}(1 − β^{δ_i})/(1 − β) · v_i.

Let CHOP(σ) denote the value achieved by CHOP in processing the jobs and job
portions scheduled by ALG2_k in σ. More specifically, suppose that in S job i ∈ I
has been processed for λ_i time units before the beginning of σ. Then CHOP(σ)
represents the value achieved by CHOP in processing the units λ_i + 1, …, λ_i + δ_i
of job i in S'. If job i is preempted in the segment σ of S, then CHOP(σ)
additionally represents the value achieved by processing units u > λ_i + δ_i in S'.
There holds

CHOP(σ) ≤ Σ_{i∈I} β^{t_{S'}(i)+λ_i}(1 − β^{δ_i})/(1 − β) · v_i + v_p(σ),

where v_p(σ) denotes the additional value achieved by CHOP for jobs preempted
by ALG2_k in σ. We have ALG2_k(I_k) = Σ_σ ALG2_k(σ) and CHOP(I_k) =
Σ_σ CHOP(σ) because, by Lemma 4, every job scheduled by CHOP is also
scheduled by ALG2_k.
Segment Analysis: We develop three lemmas that upper bound the ratio
CHOP(σ)/ALG2_k(σ), for the various segments. In the proofs we use the following
notation. For any phase P_n let I_n denote the set of jobs that are partially
or completely processed in S(P_n). For any i ∈ I_n, let δ_{i,n} be the number of time
units for which job i is processed in S(P_n). If job i is only scheduled in phase P_n
of S, then we simply set δ_i = δ_{i,n}. Furthermore, let i_n^1 be the first job scheduled
in S(P_n). Suppose that P_n is preceded by a continued phase. When constructing
S(P_n), ALG2_k might have delayed the starting times of some jobs of I_n in
Step (2) in order to move job i_n^1 to the beginning of the phase. Let I_n' ⊂ I_n be
the set of these delayed jobs. If P_n is not preceded by a continued phase, there
are no delayed jobs and we set I_n' = ∅. We observe that t_S(i) = t_{S''}(i), for all
i ∈ I_n \ (I_n' ∪ {i_n^1}), and t_S(i) = t_{S''}(i) + δ_{i_n^1,n}, for all i ∈ I_n'.
For ease of exposition, let w(t, δ, v) = β^t(1 − β^δ)/(1 − β) · v be the value
achieved in processing a job of per-unit value v for δ time units starting at time
t. We allow δ = ∞, i.e. a job of infinite length is scheduled starting at time t.
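The helper w translates directly into code (a sketch; the extra β parameter is ours, since the paper treats β as fixed):

```python
def w(t, delta, v, beta):
    """w(t, delta, v) = beta^t * (1 - beta^delta)/(1 - beta) * v: the value
    of running a job of per-unit value v for delta steps from time t.
    delta = float('inf') models a job of infinite length (needs beta < 1)."""
    geom = 1.0 / (1.0 - beta) if delta == float('inf') \
        else (1.0 - beta ** delta) / (1.0 - beta)
    return beta ** t * geom * v
```

For example, one unit at time 0 is worth exactly v, and an infinite job started at time 0 with β = 0.5 is worth 2v.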

Lemma 6. For any σ consisting of one phase, CHOP(σ)/ALG2_k(σ) ≤ max{1/β^{k−1}, 1 + β^{3k}/(1 − β^k)}.

Proof. We first study the case that the phase P_n of σ is a continued or complete
phase, i.e. no job is preempted in S(P_n). Let i^1 = i_n^1 be the first job scheduled
in S(P_n). There holds

ALG2_k(σ) = w((n − 1)k, δ_{i^1,n}, v_{i^1}) + Σ_{i∈I_n'} w(t_{S''}(i) + δ_{i^1,n}, δ_{i,n}, v_i)
            + Σ_{i∈I_n\(I_n'∪{i^1})} w(t_{S''}(i), δ_{i,n}, v_i)
          ≥ β^{δ_{i^1,n}} (w((n − 1)k, δ_{i^1,n}, v_{i^1}) + Σ_{i∈I_n\{i^1}} w(t_{S''}(i), δ_{i,n}, v_i)).

The last inequality holds because w(t_{S''}(i) + δ_{i^1,n}, δ_{i,n}, v_i) = β^{δ_{i^1,n}} w(t_{S''}(i), δ_{i,n}, v_i).
Every job i ∈ I_n, except for possibly i^1, is started in S(P_n). If job i^1 is started
in S(P_{n'}), where n' < n, then the job is scheduled at the end of S(P_{n'}). Hence
when ALG2_k constructed S(P_{n'}), the job was not delayed in Step (2) of the
algorithm in order to move another job to the beginning of the phase. Hence
t_S(i^1) = t_{S''}(i^1) ≤ t_{S'}(i^1). Suppose that before P_n job i^1 was processed for λ_{i^1}
time units in S. Since t_{S''}(i^1) ≤ t_{S'}(i^1), the units λ_{i^1} + 1, …, λ_{i^1} + δ_{i^1,n} of job
i^1 cannot be started before the beginning of P_n in S'. For all jobs i ∈ I_n \ {i^1},
there holds t_{S''}(i) ≤ t_{S'}(i). Hence

CHOP(σ) ≤ w((n − 1)k, δ_{i^1,n}, v_{i^1}) + Σ_{i∈I_n\{i^1}} w(t_{S''}(i), δ_{i,n}, v_i).

We obtain CHOP(σ)/ALG2_k(σ) ≤ 1/β^{δ_{i^1,n}} ≤ 1/β^{k−1} because a job i ∈ I_n' can
be delayed by at most k − 1 time units.
We next study the case that P_n is a preempted phase. The preceding phase
P_{n−1} is not a continued phase while the following phase P_{n+1} is also a preempted
phase. Since P_n is not preceded by a continued phase all jobs of I_n are started
in S(P_n) and t_S(i) = t_{S''}(i), for all i ∈ I_n. We obtain

ALG2_k(σ) = Σ_{i∈I_n} w(t_{S''}(i), δ_i, v_i).

Let i_p ∈ I_n be the job preempted in S(P_n). The job is preempted at the end of
S(P_n). Moreover, its per-unit value is strictly smaller than the per-unit value of
any job scheduled in S(P_{n+1}), the schedule of the following phase, since otherwise
ALG2_k would have scheduled job i_p in S(P_{n+1}). Phase P_{n+1} is also a preempted
phase and the job preempted in S(P_{n+1}) is scheduled at the end of S(P_{n+1}).
Thus the job preempted in S(P_{n+1}) has a strictly smaller per-unit value than
any job scheduled in S(P_{n+2}). It follows that job i_p has a strictly smaller per-unit
value than any job scheduled in S(P_{n+1}) and S(P_{n+2}). Lemma 5 ensures that
CHOP does not schedule i_p in phases P_{n+1} and P_{n+2}. Thus the value achieved by
CHOP for the preempted portion of i_p is upper bounded by w((n + 2)k, ∞, v_{i_p})
and

CHOP(σ) ≤ Σ_{i∈I_n} w(t_{S''}(i), δ_i, v_i) + w((n + 2)k, ∞, v_{i_p})
         = ALG2_k(σ) + w((n + 2)k, ∞, v_{i_p}).

Job i_p has the smallest per-unit value among jobs scheduled in S(P_n). Thus
ALG2_k(σ) ≥ w((n − 1)k, k, v_{i_p}) = β^{(n−1)k}(1 − β^k)/(1 − β) · v_{i_p}. Also
w((n + 2)k, ∞, v_{i_p}) = β^{(n+2)k}/(1 − β) · v_{i_p}. We conclude CHOP(σ)/ALG2_k(σ) ≤
1 + β^{3k}/(1 − β^k).

The next two lemmas address segments consisting of at least two phases, the
analysis of which is more involved.
Lemma 7. Let σ be a segment consisting of at least two phases. If σ consists of
two phases, assume that the first one is a preempted phase. If σ consists of three
phases, assume that the per-unit value of the job preempted in the first phase
is at least as high as that of the job preempted in the third phase. There holds
CHOP(σ)/ALG2_k(σ) ≤ max{1/β^{k−1}, 1/(1 − β^{2k})}.

Lemma 8. Let σ be a segment consisting of at least two phases. If σ consists
of two phases, assume that the first one is a continued phase. If σ consists of
three phases, assume that the per-unit value of the job preempted in the first
phase is smaller than that of the job preempted in the third phase. There holds
CHOP(σ)/ALG2_k(σ) ≤ max{1/β^{k−1}, 1/(1 − β^{2k})}.

Theorem 3 now follows from the three above lemmas, taking into account that
Lemmas 7 and 8 cover all possible cases of 2-phase and 3-phase segments.

Corollary 4. Let c = 1 + φ, where φ = (1 + √5)/2 is the Golden Ratio. For
k = ⌈−(1/2) logβ c⌉ + 1, ALG2k achieves a competitive ratio of c ≈ 2.618.
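As a quick numerical sanity check of Corollary 4, one can compute c and a sample value of k. Note that the rounding rule k = ⌈−(1/2)·logβ c⌉ + 1 is our reading of the typeset formula, and the choice β = 0.9 is purely illustrative:

```python
# Sanity check for Corollary 4: c = 1 + φ ≈ 2.618, and a sample k for β = 0.9.
# The ceiling in the k formula is our interpretation of the printed expression.

import math

phi = (1 + math.sqrt(5)) / 2                  # the Golden Ratio
c = 1 + phi                                   # target competitive ratio
beta = 0.9                                    # illustrative probability
k = math.ceil(-0.5 * math.log(c, beta)) + 1   # k = ceil(-(1/2)*log_beta(c)) + 1
print(round(c, 3), k)
```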

4 An Algorithm for Multiple ad Positions


We study the setting where m parallel machines (ad positions) are available. Let
I = (ai , vi , li )_{i=1}^N be any input. Each job may be processed on one machine only,
i.e. interruption and migration of jobs are not allowed.
We define an algorithm ALG(m)k that generalizes ALG1 k of Section 2. Let
Pn = (n − 1)k, . . . , nk − 1 be any phase. Again let Qn be the set of jobs i that
have arrived until the beginning of Pn , i.e. ai ≤ (n − 1)k, and have not been
processed in the past phases P1 , . . . , Pn−1 . ALG(m)k constructs a schedule for
Pn by considering the k time steps of the phase. For t = (n − 1)k, . . . , nk − 1,
ALG(m)k determines the m jobs having the highest per-unit values among the
jobs of Qn that are unfinished at time t. Each of these jobs is scheduled for one
New Online Algorithms for Story Scheduling in Web Advertising 457

time unit. If a job was also scheduled at time t − 1, then it is assigned to the same
machine at time t. We specify a tie breaking rule if, among the unfinished jobs in
Qn , the m-th largest per-unit value is v and there exist several jobs having this
value. In this case preference is given to those jobs that have already been started
at times t′ , with t′ < t. Jobs that have not been started yet are considered in
increasing order of arrival time, where ties may be broken arbitrarily. Of course,
if at time t set Qn contains at most m unfinished jobs, then each of them is
scheduled at that time. We observe that a feasible phase schedule, in which each
job is processed without interruption on one machine, can be constructed easily:
If a job of Qn is among those having the m highest per-unit values, then the job
will remain in this subset until it is finished. Hence the job can be sequenced
continuously on the same machine. We also observe that on each machine jobs
are sequenced in order of non-increasing per-unit value. Let S(Pn ) denote the
schedule constructed for phase Pn . While S(Pn ) is executed, newly arriving jobs
are deferred until the beginning of the next phase. At the end of S(Pn ) unfinished
jobs are preempted.
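To make the phase construction concrete, here is a minimal Python sketch of one phase of ALG(m)k. The Job fields, the tie-breaking key, and the machine-assignment bookkeeping are our own simplified rendering of the rule described above, not the authors' pseudocode; it relies on the paper's observation that a started job stays among the m highest per-unit values until it finishes, so its machine is never reassigned mid-run.

```python
# A minimal sketch of one k-step phase of ALG(m)_k on m machines.
# Hypothetical Job fields; the tie-breaking is simplified to
# (per-unit value, already-started first, arrival time).

from dataclasses import dataclass
from typing import Optional

@dataclass
class Job:
    jid: int
    arrival: int
    per_unit_value: float
    remaining: int                  # remaining processing time in steps
    machine: Optional[int] = None   # machine the job is pinned to once started

def schedule_phase(jobs, m, k):
    """Schedule one phase: per step, run the m unfinished jobs with the
    highest per-unit values; a job that ran at t-1 keeps its machine.
    Returns k rows, each mapping machine index -> job id."""
    rows = []
    for _ in range(k):
        unfinished = [j for j in jobs if j.remaining > 0]
        # High value first, then already-started jobs, then earlier arrival.
        unfinished.sort(key=lambda j: (-j.per_unit_value,
                                       j.machine is None, j.arrival))
        chosen = unfinished[:m]
        free = sorted(set(range(m)) -
                      {j.machine for j in chosen if j.machine is not None})
        row = {}
        for j in chosen:
            if j.machine is None:       # newly started: pin to a free machine
                j.machine = free.pop(0)
            row[j.machine] = j.jid
            j.remaining -= 1
        rows.append(row)
    return rows
```

On the example below, job 2 (lowest per-unit value) is deferred for the whole phase while jobs 1 and 3 run without interruption on their own machines.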
Theorem 4. For all k ∈ N and all probabilities β, ALG(m)k achieves a com-
petitive ratio of 1/β^(k−1) · (1 + 1/(1 − β^k )).
We finally determine the best choice of k. For any β, the competitive ratio of
Corollary 5 is upper bounded by (1 + 1/(√2 − 1))/(2 − √2) = 1/(3 − 2√2) ≈ 5.828.

Corollary 5. For k = ⌈logβ (2 − √2)⌉, the resulting algorithm ALG(m)k
achieves a competitive ratio of (1 + 1/(1 − β(2 − √2)))/(2 − √2).

References
1. Buchbinder, N., Feldman, M., Ghosh, A., Naor, J(S.): Frequency capping in on-
line advertising. In: Dehne, F., Iacono, J., Sack, J.-R. (eds.) WADS 2011. LNCS,
vol. 6844, pp. 147–158. Springer, Heidelberg (2011)
2. Buchbinder, N., Jain, K., Naor, J(S.): Online primal-dual algorithms for maximiz-
ing ad-auctions revenue. In: Arge, L., Hoffmann, M., Welzl, E. (eds.) ESA 2007.
LNCS, vol. 4698, pp. 253–264. Springer, Heidelberg (2007)
3. Dasgupta, A., Ghosh, A., Nazerzadeh, H., Raghavan, P.: Online story scheduling
in web advertising. In: Proc. 20th Annual ACM-SIAM Symposium on Discrete
Algorithms, pp. 1275–1284 (2009)
4. http://www.emarketer.com/Article/Digital-Account-One-Five-Ad-Dollars/
1009592
5. Feldman, J., Korula, N., Mirrokni, V., Muthukrishnan, S., Pál, M.: Online ad
assignment with free disposal. In: Leonardi, S. (ed.) WINE 2009. LNCS, vol. 5929,
pp. 374–385. Springer, Heidelberg (2009)
6. Feige, U., Immorlica, N., Mirrokni, V.S., Nazerzadeh, H.: A combinatorial alloca-
tion mechanism with penalties for banner advertising. In: Proc. 17th International
Conference on World Wide Web, pp. 169–178 (2008)
7. Feldman, J., Mehta, A., Mirrokni, V.S., Muthukrishnan, S.: Online stochastic
matching: Beating 1-1/e. In: Proc. 50th Annual IEEE Symposium on Foundations
of Computer Science, pp. 117–126 (2009)
8. Ghosh, A., Sayedi, A.: Expressive auctions for externalities in online advertising.
In: Proc. 19th International Conference on World Wide Web, pp. 371–380 (2010)
9. http://www.marketingcharts.com/wp/television/
global-online-ad-spend-forecast-to-exceed-print-in-2015-25105/
10. marketingterms.com. Surround session,
http://www.marketingterms.com/dictionary/surround_session/
11. Mehta, A., Saberi, A., Vazirani, U.V., Vazirani, V.V.: AdWords and generalized
online matching. Journal of the ACM 54(5) (2007)
12. Sleator, D.D., Tarjan, R.E.: Amortized efficiency of list update and paging rules.
Communications of the ACM 28, 202–208 (1985)
Sketching for Big Data Recommender Systems
Using Fast Pseudo-random Fingerprints

Yoram Bachrach1 and Ely Porat2


1
Microsoft Research, Cambridge, UK
2
Bar-Ilan University, Ramat-Gan, Israel

Abstract. A key building block for collaborative filtering recommender systems


is finding users with similar consumption patterns. Given access to the full data
regarding the items consumed by each user, one can directly compute the simi-
larity between any two users. However, for massive recommender systems such
a naive approach requires a high running time and may be intractable in terms
of the space required to store the full data. One way to overcome this is using
sketching, a technique that represents massive datasets concisely, while still al-
lowing calculating properties of these datasets. Sketching methods maintain very
short fingerprints of the item sets of users, which allow approximately computing
the similarity between sets of different users.
The state of the art sketch [22] has a very low space complexity, and a re-
cent technique [14] shows how to exponentially speed up the computation time
involved in building the fingerprints. Unfortunately, these methods are incompati-
ble, forcing a choice between low running time or a small sketch size. We propose
an alternative sketching approach, which achieves both a low space complexity
similar to that of [22] and a low time complexity similar to [14]. We empirically
evaluate our algorithm using the Netflix dataset. We analyze the running time
and the sketch size of our approach and compare them to alternatives. Further,
we show that in practice the accuracy achieved by our approach is even better
than the accuracy guaranteed by the theoretical bounds, so it suffices to use even
shorter fingerprints to obtain high quality results.

1 Introduction
The amount of data generated and processed by computers has shown consistent ex-
ponential growth. There are currently over 20 billion webpages on the internet, and
major phone companies process tens of gigabytes of call data each day. Analyzing such
vast amounts of data requires extremely efficient algorithms, both in terms of running
time and storage. This has given rise to the field of massive datasets processing. We
focus on massive recommender systems, which provide users with recommendations
for items that they are likely to find interesting, such as music, videos, or web pages.
These systems keep a profile of each user and compare it to reference characteristics.
One approach is collaborative filtering (CF), where the stored information is the items
consumed or rated by the user in the past. CF systems predict whether an item is likely
to interest the target user by seeking users who share similar rating patterns with the
target user and then using the ratings from those like-minded users to generate a pre-
diction for the target user. Various user similarity measures have been proposed, the

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 459–471, 2013.
© Springer-Verlag Berlin Heidelberg 2013
most prominent one being the Jaccard similarity [8,4]. A naive approach, which main-
tains the entire dataset of the items examined by each user and their ratings and directly
computes the similarity between any two users, may not be tractable for big data appli-
cations, both in terms of space and time complexity. A recommender system may have
tens of millions of users1 , and may need to handle a possible set of billions of items2 . A
tractable alternative requires representing knowledge about users concisely while still
allowing inference on relations between users (e.g. user similarity). One method for
concisely representing knowledge in such settings is sketching (also known as finger-
printing) [2,20,9,1]. Such methods store fingerprints, which are concise descriptions of
the dataset, and can be thought of as an extreme lossy compression method. These fin-
gerprints are extremely short, far more concise than traditional compression techniques
would achieve. On the other hand, as opposed to compression techniques, they do not
allow a full reconstruction of the original data (even approximately), but rather only
allow inferring very specific properties of the original dataset. Sketching allows keep-
ing short fingerprints of the item sets of users in recommender systems, that can still
be used to approximately compute similarity measures between any two users [5,4,3].
Most such sketches use random hashes [2,20,9,1,5].

Our Contribution. The state of the art sketch in terms of space complexity applies many
random hashes to build a fingerprint [22], and stores only a single bit per hash. A draw-
back of this approach is its high running time, caused by the many applications of hashes
to elements in the dataset. The state of the art method in terms of running time is [14],
which exponentially speeds up the computation time for building the fingerprints. Un-
fortunately, the currently best sketching techniques in terms of space complexity [22]
and time complexity [14] are mutually incompatible, forcing a choice between reducing
space or reducing runtime. Also, the low time complexity method [14] is tailored for
computing Jaccard similarity between users, but is unsuitable for other similarity mea-
sures, such as the more sensitive rank correlation similarity measures [4]. We propose
an alternative general sketching approach, which achieves both a low space complexity
similar to that of [22] and a low time complexity similar to [14]. Our sketch uses random
hashing and has a similar space complexity to [22], storing a single bit per hash, thus out-
performing previous approaches such as [2,13,5] in space complexity. Similarly to [14],
we get an exponential speedup in one factor of the computation time. Our discussion
focuses on Jaccard similarity [5], but our approach is more general than [14], captur-
ing other fingerprints, such as frequency moments [2], Lp sketches [17], rarity [13] and
rank correlations [4]. Our sketch “ties” hashes in a novel way, allowing an exponential
runtime speedup while storing only a single bit per hash. We also make an empirical
contribution and evaluate our method using the Netflix [6] dataset. We analyze the run-
ning time and space complexity of our sketch, comparing it to the above state of the
art methods. We show that in practice the accuracy of our sketch is higher than the
theoretical bounds, so even shorter sketches achieve high quality results.

1 For example, Netflix is a famous provider of on-demand internet streaming media and employs
a recommender system with over 20 million users. Our empirical evaluation is based on the
dataset released by Netflix in a challenge to improve their algorithms [6].
2 An example is a recommender system for webpages. Each such webpage is a potential
information item, so there are billions of such items that can potentially be recommended.
Related Work. Recent work [5,4,3] already suggests sketching for recommender sys-
tems. Rather than storing the full lists of items consumed by each user and their ratings,
they only store fingerprints of the information for each user, designed so that given
the fingerprints of any two users one can accurately estimate the similarity between
them. These fingerprints are constructed using min-wise independent families of hash
functions, MWIFs for short. MWIFs were introduced in [23,8], and are useful in many
applications as they resemble random permutations. Much of the research on sketching
focused on reducing space complexity while accurately computing data stream prop-
erties, but much less attention was given to time complexity. Recent work that does
focus on running time as well as space is [18,19] and [12] which propose low runtime
sketches for locally sensitive hashing under the l2 norm. Many streaming algorithms
apply many hashes to each element in a very long stream of elements, leading to a high
and sometimes intractable computation time. Similarly to [14], our method achieves
an exponential speedup in one factor of the computation time for constructing finger-
prints of massive data streams. The heart of the method lies in using a specific family
of pseudo-random hashes shown to be approximately-MWIF [16], and for which we
can quickly locate the hashes resulting in a small value of an element under the hash.
Similarly to [24] we use the fact that family members are pairwise independent be-
tween themselves. Whereas previous models examine only one hash at a time, we read
and process “chunks” of hashes to find important elements in the chunk, exploiting the
chunk’s structure to significantly speed up computation. We show that our technique
is compatible with storing a single bit rather than the full element IDs, improving the
fingerprint size, similarly to [22].
Improving Time and Space Complexity. Rather than storing the full stream of b items
of a universe [u], sketching methods only store a fingerprint of the stream. Any sketch
achieves an estimate that is “probably approximately correct”, i.e. with high probability
the estimation error is small. Thus the size of the fingerprint and the time required to
compute it depend on the accuracy of the method ε and its confidence δ. The accuracy
ε is the allowed error (the difference between the estimated similarity and the true simi-
larity), and the confidence δ is the maximal allowed probability of obtaining an estimate
with a “large error”, higher than the allowed error ε. Similarly to [22] we store a single
bit per hash function, which results in a fingerprint of length O(ln(1/δ)/ε^2 ) bits, rather
than O(log u · ln(1/δ)/ε^2 ) bits required by previous approaches such as [14]. On the
other hand, similarly to [14], rather than computing the fingerprint of a stream of b items
in time O(b · ln(1/δ)/ε^2 ) as required by [22], we can compute it in time
O(b · log(1/δ) · log(1/ε)), achieving
an exponential speedup for the fingerprint construction. In addition to the theoretical
guarantees, in Section 3 we evaluate our approach on the Netflix dataset, contrasting it
with previous approaches in terms of time and space complexity. We also show that a
high accuracy can be obtained even for very small fingerprints. Our theoretical results
relate the required storage to the accuracy of the Jaccard similarity estimates, but only
provide an upper bound regarding the storage required; we show that in practice the
storage required to achieve a good accuracy can be much lower.
Preliminaries. Let H be a family of hashes over source X and target Y , so h ∈ H is a
function h : X → Y , where Y is ordered. We say H is min-wise independent if when
randomly choosing h ∈ H, for any subset C ⊆ X, any x ∈ C has an equal probability
to be minimal after applying h.
Definition 1. H is min-wise independent (MWIF), if for all C ⊆ X, for any x ∈ C,
Pr_{h∈H} [h(x) = min_{a∈C} h(a)] = 1/|C|.

Definition 2. H is γ-approximately min-wise independent (γ-MWIF), if for all C ⊆
X, for any x ∈ C: |Pr_{h∈H} [h(x) = min_{a∈C} h(a)] − 1/|C|| ≤ γ/|C|.

Definition 3. H is k-wise independent, if for all distinct x1 , x2 , . . . , xk ∈ X and all
y1 , y2 , . . . , yk ∈ Y : Pr_{h∈H} [(h(x1 ) = y1 ) ∧ . . . ∧ (h(xk ) = yk )] = 1/|Y|^k.

Pseudo-random Family of Hashes. We describe our hashes. Given a universe of item


IDs [u], consider a prime p, such that p > u. Consider taking random coefficients for
a d-degree polynomial in Zp . Let a0 , a1 , . . . , ad ∈ [p] be chosen uniformly at random
from [p], and the polynomial in Zp : f (x) = a0 + a1 x + a2 x2 + . . . + ad xd . Denote
by Fd all d-degree polynomials in Zp with coefficients in Zp . Our method chooses
members of this family uniformly at random. Indyk [16] shows that choosing a func-
tion f from Fd uniformly at random results in Fd being a γ-MWIF for d = O(log(1/γ)).
Randomly choosing a0 , . . . , ad is equivalent to choosing a member of Fd uniformly at
random, so f (x) = a0 + a1 x + a2 x2 + . . . + ad xd is a hash chosen at random from
the γ-MWIF Fd . Similarly, let b0 , b1 , . . . , bd ∈ [p] be chosen uniformly at random from
[p], and g(x) = b0 + b1 x + b2 x2 + . . . + bd xd , also a hash chosen at random from
the γ-MWIF Fd . Consider the hashes h0 (x) = f (x), h1 (x) = f (x) + g(x), h2 (x) =
f (x) + 2g(x), . . . , hi (x) = f (x) + ig(x), . . . , hk−1 (x) = f (x) + (k − 1)g(x). We
call the construction procedure for f (x), g(x) the base random construction, and the
construction of hi the composition construction. We prove properties of such hashes.
We denote the probability of an event E when the hash h is constructed by choosing
f, g using the base random construction and composing h(x) = f (x) + i · g(x) (for
some i ∈ [p]) as P rh (E). These constructions are similar to the constructions used
in [21], however our hashes are min-wise independent and pair-wise independent be-
tween themselves.
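The base random construction and the composition construction can be sketched as follows; the prime, the seeding, and the helper names are illustrative choices of ours, not taken from the paper.

```python
# A sketch of the base random construction and the composition construction:
# f and g are random d-degree polynomials over Z_p, and the k composed hashes
# are h_i(x) = f(x) + i*g(x) mod p. The prime P here is illustrative.

import random

P = 2_147_483_647          # a prime larger than the item universe [u]

def random_poly(d, rng):
    """Coefficients a_0, ..., a_d drawn uniformly at random from [P]."""
    return [rng.randrange(P) for _ in range(d + 1)]

def eval_poly(coeffs, x):
    """Horner evaluation of the polynomial at x, modulo P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def make_hashes(d, k, seed=0):
    """Base construction (choose f, g) plus composition (h_i = f + i*g)."""
    rng = random.Random(seed)
    f, g = random_poly(d, rng), random_poly(d, rng)
    fx = lambda x: eval_poly(f, x)
    gx = lambda x: eval_poly(g, x)
    return [lambda x, i=i: (fx(x) + i * gx(x)) % P for i in range(k)]
```

Note that all k values h_0(x), ..., h_{k−1}(x) are determined by the two evaluations f(x) and g(x); this arithmetic structure is exactly what the fast column procedure of Section 2.1 exploits.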
Lemma 1 (Uniform Minimal Values). Let f, g be constructed using the base con-
struction, using d = O(log(1/γ)). For any z ∈ [u], any X ⊆ [u] and any value i used to
compose h(x) = f (x) + i · g(x): Pr_h [h(z) ≤ min_{y∈X} h(y)] = (1 ± γ) · 1/|X|. Proof
in full version.
Lemma 2 (Pairwise Interaction). Let f, g be constructed using the base construction
for d = O(log(1/γ)). For all x1 , x2 ∈ [u] and X1 , X2 ⊆ [u], and all i ≠ j used to
compose hi (x) = f (x) + i · g(x) and hj (x) = f (x) + j · g(x): Pr_{f,g∈Fd} [(hi (x1 ) ≤
min_{y∈X1} hi (y)) ∧ (hj (x2 ) ≤ min_{y∈X2} hj (y))] = (1 ± γ)^2 · 1/(|X1 | · |X2 |). Proof
in full version.
2 Collaborative Filtering Using Pseudo-random Fingerprints


Collaborative filtering systems provide a target user recommendations for items based
on the consumption patterns of other users. Many such systems rank users by their sim-
ilarity to the target user in their past consumption, then find items many such users have
consumed but the target user has not yet consumed, and recommend them to the target
user [26,27,11]. We focus on estimating the similarity between users (see [28] for a
survey of methods for generating recommendations based on similarity information).
A common measure of similarity between two users is Jaccard similarity. Our item
universe consists of the IDs of all items users may consume, [u] = {1, 2, 3, . . . , u}.
There are u different items in the universe, but each user only examined some of these.
Consider one user who has consumed the items C1 ⊆ [u] in the past, and another
user who has consumed the items C2 ⊆ [u]. The Jaccard similarity between the
two users is J1,2 = |C1 ∩ C2 |/|C1 ∪ C2 |. Several fingerprinting methods were
proposed for approximating relations between massive datasets, including the Jaccard similarity [8].
proximating relations between massive datasets, including the Jaccard similarity [8].
We use hashes defined earlier to exponentially speed up such computations. We use
pseudo-random effects, so we must relax the MWIF requirement to a pairwise inde-
pendence requirement (2-wise independence). We briefly review earlier sketches for
estimating Jaccard similarity. Consider a hash h ∈ H chosen from a MWIF H (or
a γ-MWIF). We apply h on all elements C1 and examine the minimal integer we get,
mh1 = arg minx∈C1 h(x). We do the same to C2 and examine mh2 = arg minx∈C2 h(x).
Jaccard similarity sketches are based on computing the probability that m1 = m2 :
P rh∈H [mh1 = mh2 ] = P rh∈H [arg minx∈C1 h(x) = arg minx∈C2 h(x)]. Theorems 1
and 2 are proven in [7,8] regarding a hash h chosen uniformly at random from a MWIF
H and γ-MWIF, correspondingly.3

Theorem 1 (Jaccard / Collision Probability (MWIF)). Pr_{h∈H} [mi^h = mj^h ] = Ji,j .

Theorem 2 (Jaccard / Collision Probability (γ-MWIF)). |Pr_{h∈H} [mi^h = mj^h ] −
Ji,j | ≤ γ.

Rather than storing the full Ci ’s, previous approaches [7,8] store their finger-
prints. Given k hashes h1 , . . . , hk randomly chosen from a γ-MWIF, we can store
mi^h1 , . . . , mi^hk . Given Ci , Cj , for any x ∈ [k], the probability that mi^hx = mj^hx is
Ji,j ± γ. A hash hx where we have mi^hx = mj^hx is a hash collision. We can estimate
Ji,j by counting the proportion of collision hashes out of all the chosen hashes. In this
approach, the fingerprint contains k item identities in [u], since for any x, mi^hx is in [u].
Thus, the fingerprint requires k log u bits. To achieve an accuracy ε and confidence δ,
such approaches require k = O(ln(1/δ)/ε^2 ).
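The classic fingerprint just described can be illustrated with a small sketch. The random affine hashes below are a toy stand-in for a (γ-)MWIF (they are only pairwise independent), so the collision probability is only approximately the Jaccard similarity, but the estimate-by-collision-counting mechanics are the same.

```python
# Toy illustration of the classic min-wise fingerprint: store, per hash, the
# element attaining the minimal hash value, and estimate Jaccard similarity
# as the fraction of colliding fingerprint positions. Affine hashes modulo a
# prime stand in for a (gamma-)MWIF here.

import random

P = 2_147_483_647   # illustrative prime modulus

def make_minhash(k, seed=0):
    rng = random.Random(seed)
    params = [(rng.randrange(1, P), rng.randrange(P)) for _ in range(k)]
    def fingerprint(items):
        # For each hash, keep the element (not the value) attaining the minimum.
        return [min(((a * x + b) % P, x) for x in items)[1] for a, b in params]
    return fingerprint

def estimate_jaccard(fp1, fp2):
    """Fraction of positions where the two fingerprints collide."""
    return sum(m1 == m2 for m1, m2 in zip(fp1, fp2)) / len(fp1)
```

Note that each stored position is a full element ID (log u bits); the paper's contribution below is to shrink this to a single bit per hash.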
Our General Approach. We use a “block fingerprint” that estimates J = Ji,j with
accuracy ε and confidence 7/8. It stores a single bit per hash (where many previous
approaches store log u bits per hash). Later we show how to get a given accuracy ε
and confidence δ, by combining several such blocks. To get a single bit per hash, we
use a hash mapping elements in [u] to a single bit — φ : [u] → {0, 1}, taken from
a pairwise independent family (PWIF for short) of hashes. Rather than using mi^h =
arg min_{x∈C1} h(x) we use mi^{φ,h} = φ(arg min_{x∈C1} h(x)). Storing mi^{φ,h} rather
than mi^h shortens the sketch by a log u factor.

3 The full version includes a proof for Theorem 1.
Theorem 3. Pr_{h∈H} [mi^{φ,h} = mj^{φ,h} ] = Ji,j /2 + 1/2 ± γ/2.

Proof. Pr_{h∈H,φ∈H′} [mi^{φ,h} = mj^{φ,h} ] = Pr[mi^{φ,h} = mj^{φ,h} | mi^h = mj^h ] ·
Pr_{h∈H} [mi^h = mj^h ] + Pr[mi^{φ,h} = mj^{φ,h} | mi^h ≠ mj^h ] · Pr_{h∈H} [mi^h ≠ mj^h ] =
1 · Pr_{h∈H} [mi^h = mj^h ] + 1/2 · (1 − Pr_{h∈H} [mi^h = mj^h ]) = (1 + Ji,j ± γ)/2.

The purpose of the fingerprint block is to estimate Ji,j with accuracy ε. We use
k = 8.02/ε^2 hashes. Denote α = (2^10 − 1)/2^10 , and let γ = (1 − α) · ε = ε/2^10 . We
construct a γ-MWIF.4 To construct the family, consider choosing a0 , . . . , ad and b0 , b1 , . . . , bd
uniformly at random from [p], constructing the polynomials f (x) = a0 + a1 x + a2 x^2 +
. . . + ad x^d , g(x) = b0 + b1 x + b2 x^2 + . . . + bd x^d , and using the k hashes hi (x) =
f (x) + i · g(x), where i ∈ {0, 1, . . . , k − 1}.5 We also use a hash φ : [u] → {0, 1} chosen
from the PWIF of such hashes. We say there is a collision on hl if mi^{φ,hl} = mj^{φ,hl} , and
denote the random variable Zl where Zl = 1 if there is a collision on hl for users i, j
and Zl = 0 if there is no such collision. Zl = 1 with probability 1/2 + J/2 ± γ/2 and
Zl = 0 with probability 1/2 − J/2 ± γ/2. Thus E(Zl ) = 1/2 + J/2 ± γ/2. Denote Xl = 2Zl − 1.
E(Xl ) = 2E(Zl ) − 1 = J ± γ. Xl can take two values, −1 when Zl = 0, and 1 when
Zl = 1. Thus Xl^2 always takes the value of 1, so E(Xl^2 ) = 1. Consider X = Σ_{l=1}^k Xl ,
and take Y = Ĵ = X/k as an estimator for J. We show that for the above k, Y is accurate
up to ε with probability at least 7/8.
Theorem 4 (Simple Estimator). Pr(|Y − J| ≤ ε) ≥ 7/8. Proof in full version.
Due to Theorem 4, we approximate J with accuracy ε and confidence 7/8 using a “block
fingerprint” for Ci , composed of mi^{h1,φ1} , . . . , mi^{hk,φk} , where h1 , . . . , hk are random
members of a γ-MWIF and φ1 , . . . , φk are chosen from the PWIF of hashes φ : [u] →
{0, 1}. It suffices to take k = O(1/ε^2 ) to achieve this. Constructing each hi can be done
by choosing f, g using the base random construction and composing hi (x) = f (x) +
i · g(x). The base random construction chooses f, g uniformly at random from Fd , the
family of d-degree polynomials in Zp , where d = O(log(1/ε)). This achieves a γ-MWIF
where γ = (1 − α) · ε = ε/2^10 .

Achieving a Desired Confidence. We combine several independent fingerprints to in-
crease the confidence to a level δ. Earlier in this section we proposed a fingerprint of
length k to get a confidence of 7/8. Consider taking m fingerprints for each stream, each
of length k. Given two streams, i, j, we have m pairs of fingerprints, each approximating
J with accuracy ε and confidence 7/8. Denote the obtained estimators by Ĵ1 , Ĵ2 , . . . , Ĵm ,
and the median of these values by Ĵ. Consider using m > (32/9) ln(1/δ) “blocks”.
4 The accuracy γ is stronger than the ε required of the full fingerprint, for reasons examined
later.
5 This is similar to [21], but here the hashes are MWIF and pair-wise independent between
themselves.
Fig. 1. A fingerprint “chunk” for a stream

Theorem 5 (Median Estimator). Pr(|Ĵ − J| ≤ ε) ≥ 1 − δ. Proof in full version.

By Theorem 5, to get |Ĵ − J| ≤ ε it suffices to take m > (32/9) ln(1/δ) blocks, each with
k = 8.02/ε^2 hashes, or (32/9) ln(1/δ) · 8.02/ε^2 ≤ 28.45 ln(1/δ)/ε^2 hashes in total. Thus,
we use O(ln(1/δ)/ε^2 ) hashes.
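Putting the pieces together, a toy rendering of the one-bit block fingerprint and the median estimator might look as follows. Again, random affine hashes stand in for the paper's γ-MWIF and PWIF families (so the bias guarantees are only approximate), and all names are ours.

```python
# Toy sketch of the one-bit block fingerprint and the median estimator: each
# position stores the bit phi(argmin h); a block estimates J via
# J_hat = 2*(collision rate) - 1, inverting Pr[collision] = 1/2 + J/2; the
# median over m independent blocks boosts the confidence.

import random
import statistics

P = 2_147_483_647   # illustrative prime modulus

def block_params(k, rng):
    """k quadruples: (a, b) plays the hash h, (c, d) plays the one-bit phi."""
    return [(rng.randrange(1, P), rng.randrange(P),
             rng.randrange(1, P), rng.randrange(P)) for _ in range(k)]

def block_fingerprint(items, params):
    """One block: a single bit phi(argmin h) per hash position."""
    bits = []
    for a, b, c, d in params:
        mn = min(items, key=lambda x: (a * x + b) % P)   # argmin under h
        bits.append(((c * mn + d) % P) & 1)              # phi(argmin)
    return bits

def block_estimate(bits1, bits2):
    coll = sum(z1 == z2 for z1, z2 in zip(bits1, bits2))
    return 2.0 * coll / len(bits1) - 1.0

def median_estimate(items1, items2, m, k, seed=0):
    """Median over m independent blocks of k one-bit positions each."""
    rng = random.Random(seed)
    ests = []
    for _ in range(m):
        p = block_params(k, rng)
        ests.append(block_estimate(block_fingerprint(items1, p),
                                   block_fingerprint(items2, p)))
    return statistics.median(ests)
```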

2.1 Fast Fingerprint Computation


We discuss speeding up the fingerprint computation for a set of b items X =
{x1 , . . . , xb } where xi ∈ [u]. The fingerprint has m “block fingerprints”, with block r
constructed using k hashes h1^r , . . . , hk^r , built using 2 · d random coefficients in Zp . The
i’th location in the block is the minimal item in X under hi : mi = arg minx∈X hi (x),
which is then hashed through a hash φ mapping elements in [u] to a single bit. We show
how to quickly compute the block fingerprint (m1 , . . . , mk ). A naive way to do this is
applying k · b hashes to compute hi (xj ) for i ∈ [k], j ∈ [b]. The values hi (xj ) where
i ∈ [k], j ∈ [b] form a matrix, where row i has the values (hi (x1 ), . . . , hi (xb )), shown
in Figure 1.
Once all hi (xj ) values are computed for i ∈ [k], j ∈ [b] , for each row i we check
for which column j the row’s minimal value occurs, and store mi = xj . Comput-
ing the fingerprint requires finding the minimal value across the rows (and the value
xj for the column j where this minimal value occurs). To speed up the process, we
use a method similar to [25] as a building block. Recall the hashes hi were defined
as hi (x) = f (x) + ig(x) where f (x), g(x) are d-degree polynomials with random
coefficients in Zp . Our algorithm is based on a procedure that gets a value x ∈ [u]
and a threshold t, and returns all elements in (h0 (x), h1 (x), . . . , hk−1 (x)) which are
smaller than t, as well as their locations. Formally, the method returns the index list
It = {i|hi (x) ≤ t} and the value list Vt = {hi (x)|i ∈ It } (note these are lists, so
the j’th location in Vt , Vt [j], contains hIt [j] (x)). We call this the column procedure,
and denote by pr-small-loc(f (x), g(x), k, x, t) the function that returns It , and
by pr-small-val(f (x), g(x), k, x, t) the function that returns Vt . We describe
an implementation of these operations later in this section, with a running time of
O(log k + |It |), rather than the naive algorithm which evaluates O(k) hashes. Thus, this
procedure quickly finds small elements across columns (by “small” we mean smaller
than t). Our algorithm keeps a bound for the minimal value for each row. It goes through
the columns, finding the small values in each, and updates the row bounds where these
occur.
block-update((x1 , . . . , xb ), f (x), g(x), k, t):
1. Let mi = ∞ for i ∈ [k] and let pi = 0 for i ∈ [k]
2. For j = 1 to b:
   (a) Let It = pr-small-loc(f (x), g(x), k, xj , t)
   (b) Let Vt = pr-small-val(f (x), g(x), k, xj , t)
   (c) For y = 1 to |It |: // positions of the small elements
       i. If m_{It[y]} > Vt [y] // an update to row It[y] is required
          A. m_{It[y]} = Vt [y], p_{It[y]} = xj
If our method updates mi , pi for row i, mi indeed contains the minimal value in that
row, and pi the column where this minimal value occurs, since if even a single update
occurred then the row indeed contains an item that is smaller than t, so the minimal
item in that row is smaller than t and an update would occur for that item. On the other
hand, if all the items in a row are bigger than t, an update would not occur for that
row. The running time of the column procedure is O(log k + |It |), which is a random
variable that depends on the number of elements returned for that column, |It |. Denote
by Lj the number of elements returned for column j (i.e. |It | for column j). Since we
have b columns, the running time of the block update is O(b log k) + O(Σ_{j=1}^b Lj ). The
total number of returned elements is Σ_{j=1}^b Lj , which is the total number of elements
that are smaller than t. We denote by Yt = Σ_{j=1}^b Lj the random variable which is the
number of all elements in the block that are smaller than t. The running time of our
block update is thus O(b log k + Yt ). The random variable Yt depends on t, since the
smaller t is the fewer elements are returned and the faster the column procedure runs. On
the other hand, we only update rows whose minimal value is below t, so if t is too low
we have a high probability of having rows which are not updated correctly. A certain
compromise t value combines a good running time of the block update with a good
probability of correctly computing the values for all the rows.
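For reference, the update logic can be checked with a naive, non-accelerated rendering of block-update, in which the column procedure is a plain O(k) scan rather than the paper's fast O(log k + |It|) method; the tiny prime and the function names here are illustrative.

```python
# A reference (non-accelerated) version of block-update. The column procedure
# is the naive O(k) scan, standing in for the fast method, so that the
# row-update logic can be checked against brute force.

P = 101  # small illustrative prime

def column(a, b, k, t):
    """Indices i in [0, k) and values with (a + i*b) mod P <= t."""
    idx, val = [], []
    for i in range(k):
        v = (a + i * b) % P
        if v <= t:
            idx.append(i)
            val.append(v)
    return idx, val

def block_update(xs, f, g, k, t):
    m = [float("inf")] * k          # per-row bound on the minimal value
    p = [0] * k                     # item attaining that bound
    for x in xs:                    # one column per item
        idx, val = column(f(x), g(x), k, t)
        for y in range(len(idx)):   # positions of the small elements
            if m[idx[y]] > val[y]:
                m[idx[y]], p[idx[y]] = val[y], x
    return m, p
```

With t large enough that every element is returned, the result must coincide with the brute-force row minima, which is what the check below verifies.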

Theorem 6. Given the threshold t = 12 · p · l/b, where l = 80 + 2 log(1/ε) (so l =
O(log(1/ε))), the runtime of the block-update procedure is O(b log(1/ε) +
(1/ε^2 ) log(1/ε)). Proof in full version.

Computing Minimal Elements in the Series. We recursively implement pr-small-
loc(f (x), g(x), k, x, t) and pr-small-val(f (x), g(x), k, x, t), the procedures for
computing It and Vt . The hashes hi were defined as hi (x) = f (x) + ig(x) where
f (x), g(x) are d-degree polynomials with random coefficients in Zp . Consider x ∈ Zp
for which we seek all the values (and indices) in (h0 (x), h1 (x), . . . , hk−1 (x)) smaller
than t. Given x, we evaluate f (x), g(x) in time O(d) = O(log(1/γ)),6 and denote a =
f (x) ∈ Zp and b = g(x) ∈ Zp . Thus, we seek all values in {a mod p, (a + b)
mod p, (a + 2b) mod p, . . . , (a + (k − 1)b) mod p} smaller than t, and the indices
i where they occur. Consider the series S = (s1 , . . . , sk ) where si = (a + ib) mod p
and i ∈ {0, 1, . . . , k − 1}. We denote the arithmetic series a + bi mod p for i ∈
{0, 1, . . . , k − 1} as S(a, b, k, p), so under this notation S = S(a, b, k, p). Given a
value we can find the index where it occurs, and vice versa. To get the value for index
i, we compute (a + ib) mod p. To get the index i where value v occurs, we solve
v = a + ib in Zp (i.e. i = (v − a)/b mod p). This can be done in O(log p) time using
Euclid’s algorithm. We compute b^−1 in Zp once to transform all values to generating
indices. We call a location i where si < si−1 a flip location. The first index is a flip
location if (a − b) mod p > a. First, consider b < p/2. If si is a flip location, we have
si−1 < p but si−1 + b > p, so si < b. As b < p/2 there is at least one location that is not
a flip location between any two flip locations. Given S = S(a, b, k, p), denote by f (S)
the flip locations in S. Denote f0 (S) = f (S), and by fi (S) the elements occurring i
places after the closest flip location. Lemmas 3 and 4 are proven in the full version.

6 Using multipoint evaluation we can calculate it in amortized time O(log^2 log(1/γ)). We can use
other constructions for d-wise independence which can be evaluated in O(1) time but use more
space.

Lemma 3 (Flip Locations Are Small). If b < p/2, at most k/2 elements are flip locations,
and all elements that are smaller than b are flip locations.

Lemma 4 (Element Comparison). If b < p/2, x ∈ fi (S), y ∈ fj (S) for i > j, then
x > y.

The first flip location is ⌈(p − a)/b⌉, as to exceed p we add b exactly ⌈(p − a)/b⌉
times. There are ⌊(a + bk)/p⌋ flip locations. Denote the first flip location as
j = ⌈(p − a)/b⌉, with value a′ = (a + jb) mod p. Denote b′ = (b − p) mod b and the
number of flip locations as k ′ = ⌊(a + bk)/p⌋. The flip locations form an arithmetic
progression [25].

Lemma 5 (Flip Locations Arithmetic Progression). The flip locations of S =
S(a, b, k, p) are also an arithmetic progression S ′ = S(a′ , b′ , k ′ , b).

Using these lemmas, we search for the elements smaller than t by examining the flip-
location series recursively. In case b < t, given q = ⌈t/b⌉, due to Lemma 4 all elements
of f0 (S), f1 (S), . . . , fq−1 (S) are smaller than t, and all of them must be returned. We
must also scan fq (S) and return those of its elements which are smaller than t. This
additional scan requires O(|fq (S)|) time, and |fq (S)| ≤ |f (S)|. Thus the case of b < t
examines O(|It |) elements. By Lemma 3, if b ≥ t, all non-flip locations are at least b
and thus not smaller than t, so we need only consider the flip locations as candidates.
Using Lemma 5 we scan the flip locations recursively, examining the arithmetic series
of the flip locations. If at most half of the elements in each recursion are flip locations,
this gives a logarithmic running time, but if b is high, more than half the elements may
be flip locations. When b > p/2 we examine the same series, in reverse order. The first
element in the reversed series is the last element of the current series, and rather than
progressing in steps of b, we progress in steps of p − b. Thus we obtain the same
elements, but in reverse order. In this reversed series, at most half the elements are flip
locations. The procedure below implements our method. It finds elements smaller
than t in time O(log k + |It |), where |It | is the number of
468 Y. Bachrach and E. Porat

such values. Given the returned values, we can also recover the indices generating
them. We use the same b−1 for all |It | values, so this can be done in time
O(c log c + |It |) (usually c is a constant).
ps-min(a, b, p, k, t) :
1. if b < t:
   (a) Vt = []; if a < t then Vt = Vt + [a + ib for i in range(⌈(t − a)/b⌉)]
   (b) j = ⌈(p − a)/b⌉ // first flip location (excluding the first index)
   (c) while j < k:
       i. v = (a + jb) mod p
       ii. while j < k and v < t:
           A. Vt .append(v); j = j + 1; v = v + b
       iii. j = j + ⌈(p − v)/b⌉ // next flip location
   (d) return Vt
2. if b > p/2 then return ps-min((a + (k − 1) · b) mod p, p − b, p, k, t)
3. j = ⌈(p − a)/b⌉; newk = ⌊(a + bk)/p⌋
4. if a < b then j = 0 and newk = newk + 1 // index 0 is the first flip location
5. return ps-min((a + jb) mod p, −p mod b, b, newk , t)
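The procedure can be transcribed into runnable Python. The sketch below is our own, not the authors' code: the names cdiv and ps_min are ours, the subseries length is computed in the exact form ⌊(a + (k − 1)b)/p⌋ plus one when index 0 is a flip, and an extra guard handles a run that wraps past p while scanning.

```python
def cdiv(x, y):
    """Ceiling division for nonnegative x and positive y."""
    return -(-x // y)

def ps_min(a, b, p, k, t):
    """Values smaller than t in the series (a, a+b, ..., a+(k-1)b) mod p,
    found by recursing on the flip locations (indices where the series wraps)."""
    if k <= 0:
        return []
    if b == 0:
        return [a] * k if a < t else []
    if b < t:
        # Scan the runs directly: every flip value is < b < t.
        vt = [a + i * b for i in range(min(k, cdiv(t - a, b)))] if a < t else []
        j = cdiv(p - a, b)           # first flip location
        while j < k:
            v = (a + j * b) % p      # value at the flip, always < b
            while j < k and v < t:
                vt.append(v)
                j += 1
                v += b
            if j < k and v >= p:
                continue             # the run wrapped: index j is itself a flip
            j += cdiv(p - v, b)      # skip the rest of the run to the next flip
        return vt
    if 2 * b > p:
        # Reverse the series (last element first, step p - b) so that at most
        # half of the elements of the reversed series are flip locations.
        return ps_min((a + (k - 1) * b) % p, p - b, p, k, t)
    # t <= b <= p/2: only flip locations can hold values below t (Lemma 3);
    # they form the arithmetic series S(a', b', k', b) of Lemma 5.
    kp = (a + (k - 1) * b) // p      # flips among indices 1..k-1 (exact count)
    j = cdiv(p - a, b)
    if a < b:                        # index 0 is itself a flip location
        j, kp = 0, kp + 1
    return ps_min((a + j * b) % p, (-p) % b, b, kp, t)
```

A brute-force cross-check over the whole series confirms that the recursion returns exactly the elements below the threshold.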

3 Empirical Analysis
We empirically evaluated our sketch using the Netflix dataset [6]. This is a movie rat-
ings dataset, with 100 million movie ratings, provided by roughly half a million users
on a collection of 17,000 movies. As there are 100 million ratings, even this medium-
sized dataset is difficult to fit in memory7 , so a massive recommender systems dataset
certainly cannot fit in the main memory, making sketching necessary to handle such
datasets [5]. The state of the art space complexity is achieved using the sketching tech-
nique of [22]. Consider using it to estimate Jaccard similarity, with a reasonable ac-
curacy of ε = 0.01 and confidence level of δ = 0.001. The approach of [22] applies
roughly 100,000 hash functions for each entry. Each hash computation requires 20 mul-
tiplication operations, and as there are 100 million entries in the dataset, sketching the
entire dataset requires more than 2 · 1014 multiplications. This takes more than a day to
run on a powerful machine. On the other hand, although the approach of [14] allows
a much shorter running time (less than an hour), it requires sacrificing the low space
achieved by the method of [22]. We first compare our approach with [22,14] in terms
of the running time. Figure 2 shows the running time for generating a fingerprint for a
target with 1,000 items, both under our method (FPRF - Fast Pseudo Random Finger-
prints), and under the sketch of [22] (appearing under “1-bit”, as it maintains a single
bit per hash used) and the sketch of [14] (“FPS”, after the names of the authors). The
Figure indicates the massive saving in computation time our approach offers over the
approach of [22], and shows that the running time of our approach and that of [14] is
very similar.
We now examine the accuracy achieved by our approach, which depends on the
sketch size. To analyze empirical accuracy, we isolated users who provided ratings for
7
The Netflix data can easily be stored on disk. It is even possible to store it in memory on
a machine with a large RAM, by compressing it or using a sparse matrix representation.
Fig. 2. Left: running time of fingerprint computation. Right: accuracy depending on size.

over 1,000 movies. There are over 10,000 such users in the dataset, and as these users
have rated many movies, the Jaccard similarity between two such users is very fine-
grained. We tested the fingerprint size required to achieve a target accuracy level for the
Jaccard similarity. Consider a fingerprint size of k bits. Given two users, denote the true
Jaccard similarity between their lists of rated movies as J. J can be easily computed
using the entire dataset. Alternatively, we can use a fingerprint of size k, resulting in an
estimate Ĵ that has a certain error. The error for a pair of users is e = |J − Ĵ|. We can
sample many such user pairs, and examine the average error obtained using a fingerprint
of size k, which we call the empirical inaccuracy. We wish to minimize the error in our
estimates, but to reduce the inaccuracy we must use larger fingerprints. As each user
in our sample rated at least 1,000 movies, storing the full list of rated movies for a
user takes 1,000 integers. The Netflix dataset only has 17,000 movies, so we require at
least 15 bits to store the ID of each movie. Thus the full data for a user takes at least
15,000 bits. The space required for this data grows linearly with the numbers of movies
a user has rated. Increasing the size of the universe of movies also increases the storage
requirements, as more bits would be required to represent the ID of each movie.
Using our sketch, the required space does not depend on the number of ratings per
user, or on the number of movies in the system, but rather on the target accuracy for the
similarity estimate. Earlier sketches [5,3] eliminated the dependency on the number of
ratings, but not on the number of movies in the system. Also, our fingerprints are faster
to compute.
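To make the fingerprinting concrete, here is a minimal min-hash estimator of Jaccard similarity. This is our own illustration using random linear hash functions h(x) = (ux + v) mod P, not the pseudo-random construction of this paper; the 1-bit sketch of [22] would additionally keep only the lowest bit of each minimum.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime used as the hash range

def minhash_sketch(items, k, seed=0):
    """k approximate min-wise hashes of a set of integers."""
    rng = random.Random(seed)
    coeffs = [(rng.randrange(1, P), rng.randrange(P)) for _ in range(k)]
    return [min((u * x + v) % P for x in items) for (u, v) in coeffs]

def jaccard_estimate(sketch_a, sketch_b):
    """Fraction of agreeing minima, an estimate of the Jaccard similarity."""
    return sum(x == y for x, y in zip(sketch_a, sketch_b)) / len(sketch_a)
```

For two users whose rated-movie sets overlap in a third of their union, the estimate concentrates around 1/3 as k grows, with standard deviation roughly sqrt(J(1 − J)/k).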
We tested how the average accuracy of our Jaccard similarity estimates changes as
we change the fingerprint size. We tried fingerprints of different sizes, ranging from
500 bits to 10,000 bits. For each such size we sampled many pairs of users, and com-
puted the average inaccuracy of the Jaccard similarity estimates. The results are given
in Figure 2, for both our approach and for the sketch of [14] (“FPS”), as well as the “1-
bit” sketch of [22]. Lower numbers indicate better empirical accuracy. Figure 2 shows
that our sketch achieves a very high accuracy in estimating Jaccard similarity, even for
small fingerprints. Even for a fingerprint size of 2500 bits per user, the Jaccard similar-
ity can be estimated with an error smaller than 1.5%. Thus using fingerprints reduces
the required storage to roughly 10% of that of the full dataset, without sacrifising much
accuracy in estimating user similarity. The figure also indicates that for any sketch size,
the accuracy achieved by our approach is superior to that of the FPS sketch [14]. This is
predictable since the theoretical accuracy guarantee for our approach is better than that
for the sketch of [14]. The figure shows no significant difference in accuracy between
our sketch and the 1-bit sketch [22].
Figure 2 shows that on the Netflix dataset, our sketch has the good properties of
the mutually exclusive sketches of [22,14], and outperforms each of these state of the
art methods in either running time or accuracy. The Netflix dataset is a small dataset,
and the saving in space is much greater for larger datasets. A recommender system for
web pages is likely to have several orders of magnitude more users and information
items. While the storage requirements for such a massive recommender system grow
by several orders of magnitude when storing the full data, the required space remains
almost the same using our sketch. Previous approaches [5,3] compute the sketch in
time quadratic in the required accuracy. Using our approach, computing the sketch
only requires time logarithmic in the accuracy, which makes it tractable even when the
required accuracy is very high.

4 Conclusions
We presented a fast method for sketching massive datasets, based on pseudo-random
hashes. Though we focused on collaborative filtering and examined the Jaccard simi-
larity in detail, the same technique can be used for any fingerprint based on minimal
elements under several hashes. Our approach is thus a general technique for exponen-
tially speeding up computation of various fingerprints, while maintaining a single bit
per hash. We showed that even for these small fingerprints which can be quickly com-
puted, the required number of hashes is asymptotically similar to previously known
methods, and is logarithmic in the required confidence and polynomial in the required
accuracy. Our empirical analysis shows that for the Netflix dataset the required storage
is even smaller than the theoretical bounds.
Several questions remain open. Can we speed up the sketch computation further?
Can similar methods be used that are not based on minimal elements under hashes?

References
1. Aggarwal, C.C.: Data streams: models and algorithms. Springer-Verlag New York Inc.
(2007)
2. Alon, N., Matias, Y., Szegedy, M.: The Space Complexity of Approximating the Frequency
Moments. J. Computer and System Sciences 58(1), 137–147 (1999)
3. Bachrach, Y., Herbrich, R.: Fingerprinting Ratings for Collaborative Filtering — Theoretical
and Empirical Analysis. In: Chavez, E., Lonardi, S. (eds.) SPIRE 2010. LNCS, vol. 6393,
pp. 25–36. Springer, Heidelberg (2010)
4. Bachrach, Y., Herbrich, R., Porat, E.: Sketching algorithms for approximating rank correla-
tions in collaborative filtering systems. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds.) SPIRE
2009. LNCS, vol. 5721, pp. 344–352. Springer, Heidelberg (2009)
5. Bachrach, Y., Porat, E., Rosenschein, J.S.: Sketching techniques for collaborative filtering.
In: IJCAI, Pasadena, California (July 2009)
6. Bennett, J., Lanning, S.: The netflix prize. In: KDD Cup and Workshop (2007)
7. Broder, A.Z.: On the resemblance and containment of documents. Sequences (1998)
8. Broder, A.Z., Charikar, M., Frieze, A.M., Mitzenmacher, M.: Min-wise independent permu-
tations. Journal of Computer and System Sciences 60(3), 630–659 (2000)
9. Cormode, G., Muthukrishnan, S.: An improved data stream summary: the count-min sketch
and its applications. Journal of Algorithms 55(1), 58–75 (2005)
10. Cormode, G., Muthukrishnan, S., Rozenbaum, I.: Summarizing and mining inverse distribu-
tions on data streams via dynamic inverse sampling. In: VLDB (2005)
11. Das, A.S., Datar, M., Garg, A., Rajaram, S.: Google news personalization: scalable online
collaborative filtering. In: WWW. ACM (2007)
12. Dasgupta, A., Kumar, R., Sarlos, T.: Fast locality-sensitive hashing. In: SIGKDD (2011)
13. Datar, M., Muthukrishnan, S.: Estimating rarity and similarity over data stream windows.
In: Möhring, R., Raman, R. (eds.) ESA 2002. LNCS, vol. 2461, pp. 323–335. Springer,
Heidelberg (2002)
14. Feigenblat, G., Shiftan, A., Porat, E.: Exponential time improvement for min-wise based
algorithms. In: SODA (2011)
15. Hoeffding, W.: Probability inequalities for sums of bounded random variables. Journal of the
American Statistical Association 58(301), 13–30 (1963)
16. Indyk, P.: A Small Approximately Min-Wise Independent Family of Hash Functions. Journal
of Algorithms 38(1), 84–90 (2001)
17. Indyk, P.: Stable distributions, pseudorandom generators, embeddings, and data stream com-
putation. Journal of the ACM (JACM) 53(3), 323 (2006)
18. Kane, D.M., Nelson, J., Porat, E., Woodruff, D.P.: Fast moment estimation in data streams in
optimal space. In: STOC (2011)
19. Kane, D.M., Nelson, J., Woodruff, D.P.: An optimal algorithm for the distinct elements prob-
lem. In: PODS, pp. 41–52. ACM (2010)
20. Karp, R.M., Shenker, S., Papadimitriou, C.H.: A simple algorithm for finding frequent ele-
ments in streams and bags. ACM Transactions on Database Systems (TODS) 28(1), 51–55
(2003)
21. Kirsch, A., Mitzenmacher, M.: Less hashing, same performance: a better Bloom filter. In:
Azar, Y., Erlebach, T. (eds.) ESA 2006. LNCS, vol. 4168, pp. 456–467. Springer, Heidelberg
(2006)
22. Li, P., Koenig, C.: b-Bit minwise hashing. In: WWW (2010)
23. Mulmuley, K.: Randomized geometric algorithms and pseudorandom generators. Algorith-
mica (1996)
24. Pǎtraşcu, M., Thorup, M.: On the k-Independence Required by Linear Probing and Minwise
Independence. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis,
P.G. (eds.) ICALP 2010. LNCS, vol. 6198, pp. 715–726. Springer, Heidelberg (2010)
25. Pavan, A., Tirthapura, S.: Range-efficient counting of distinct elements in a massive data
stream. SIAM Journal on Computing 37(2), 359–379 (2008)
26. Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., Riedl, J.: Grouplens: an open architec-
ture for collaborative filtering of netnews. In: Computer Supported Cooperative Work (1994)
27. Sarwar, B., Karypis, G., Konstan, J., Reidl, J.: Item-based collaborative filtering recommen-
dation algorithms. In: WWW (2001)
28. Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Advances in Ar-
tificial Intelligence 2009, 4 (2009)
Physarum Can Compute Shortest Paths:
Convergence Proofs and Complexity Bounds

Luca Becchetti1 , Vincenzo Bonifaci2 , Michael Dirnberger3 ,


Andreas Karrenbauer3, and Kurt Mehlhorn3
1
Dipartimento di Informatica e Sistemistica, Sapienza Università di Roma, Italy
2
Istituto di Analisi dei Sistemi ed Informatica “Antonio Ruberti”,
Consiglio Nazionale delle Ricerche, Rome, Italy
3
Max Planck Institute for Informatics, Saarbrücken, Germany

Abstract. Physarum polycephalum is a slime mold that is apparently


able to solve shortest path problems. A mathematical model for the
slime’s behavior in the form of a coupled system of differential equations
was proposed by Tero, Kobayashi and Nakagaki [TKN07]. We prove that
a discretization of the model (Euler integration) computes a (1 + ε)-
approximation of the shortest path in O(mL(log n + log L)/ε3 ) itera-
tions, with arithmetic on numbers of O(log(nL/ε)) bits; here, n and m
are the number of nodes and edges of the graph, respectively, and L is
the largest length of an edge. We also obtain two results for a directed
Physarum model proposed by Ito et al. [IJNT11]: convergence in the
general, nonuniform case and convergence and complexity bounds for
the discretization of the uniform case.

1 Introduction
Physarum polycephalum is a slime mold [BD97] that is apparently able to solve
shortest path problems. In [NYT00], Nakagaki, Yamada, and Tóth report on the
following experiment (see Figure 1): They built a maze, that was later covered
with pieces of Physarum (the slime can be cut into pieces that will merge if
brought into each other’s vicinity), and then fed the slime with oatmeal at two
locations. After a few hours, the slime retracted to the shortest path connecting
the food sources in the maze. The experiment was repeated with different mazes;
in all experiments, Physarum retracted to the shortest path. Tero, Kobayashi
and Nakagaki [TKN07] propose a mathematical model for the behavior of the
mold. Physarum is modeled as a tube network traversed by liquid flow, with the
flow satisfying the standard Poiseuille assumption from fluid mechanics. In the
following, we use terminology from the theory of electrical networks, relying on
the fact that equations for electrical flow and Poiseuille flow are the same [Kir10].
In particular, let G be an undirected graph1 with node set N , edge set E,
length labels l ∈ RE++ ,2 and two distinguished nodes s0 , s1 ∈ N . In our discussion,
1
One can easily generalize the model and extend our results to multigraphs at the
expense of heavier notation. Details will appear in the full version of the paper.
2
We let RA , RA+ and RA++ denote the set of real, nonnegative real, and positive real
vectors (respectively) whose components are indexed by A.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 472–483, 2013.

c Springer-Verlag Berlin Heidelberg 2013
Fig. 1. The experiment in [NYT00] (reprinted from there): (a) shows the maze uni-
formly covered by Physarum; the yellow color indicates the presence of Physarum. Food
(oatmeal) is provided at the locations labelled AG. After a while, the mold retracts to
the shortest path connecting the food sources as shown in (b) and (c). (d) shows the
underlying abstract graph. The video [You] shows the experiment.

x ∈ RE+ will be a state vector representing the diameters of the tubular channels
of the Physarum (edges of the graph). The value xe is called the capacity of edge
e. The nodes s0 and s1 represent the location of two food sources. Physarum’s
dynamical system is described by the system of differential equations [TKN07]

ẋ = |q(x, l)| − x. (1)

Equation (1) is called the evolution equation, as it determines the dynamics


of the system over time. It is a compact representation of a system of ordinary
differential equations, one for every edge of the graph; the absolute value operator
|·| is applied componentwise. The vector q ∈ RE , known as the current flow, is
determined by the capacities and lengths of the edges, as follows (see Section 2
for the precise definitions). Force one unit of current from the source to the sink
in an electrical network, where the resistance re of edge e is given by re = le /xe ,
and call qe the resulting current across edge e. In [BMV12, Bon13], it was shown
that the dynamics (1) converges to the shortest source-sink path in the following
sense: the potential difference between source and sink converges to the length of
the shortest source-sink path, the capacities of the edges on the shortest source-
sink path3 converge to one, and the capacities of all other edges converge to
zero.
Our first contribution relies on a numerical approximation of (1), as given by
Euler’s method [SM03],

Δx = h · (|q(x, l)| − x) , (2)

or, making the dependency on time explicit,

x(t + 1) − x(t) = h · (|q(x(t), l)| − x(t)) , (3)

where h ∈ (0, 1) is the step size of the discretization. We prove that the dynamics
(3) converges to the shortest source-sink path. More precisely, let opt be the
length of the shortest path, n and m be the number of nodes and edges of the
3
We assume uniqueness of the shortest path for simplicity of exposition.
Fig. 2. Photographs of the connecting paths between two food sources (FS). (a) The
rectangular sheet-like morphology of the organism immediately before the presentation
of two FS and illumination of the region indicated by the dashed white lines. (b),(c)
Examples of connecting paths in the control experiment in which the field was uni-
formly illuminated. A thick tube was formed in a straight line (with some deviations)
between the FS. (d)-(f) Typical connecting paths in a nonuniformly illuminated field
(95 K lx). Path length was reduced in the illuminated field, although the total path
length increased. Note that fluctuations in the path are exhibited from experiment to
experiment. (Figure and caption reprinted from [NIU+ 07, Figure 2].)

graph, and L be the largest length of an edge. We show that, for ε ∈ (0, 1/300)
and for h = ε/mL, the discretized model yields a solution of value at most (1 +
O(ε))opt in O(mL(log n + log L)/ε3 ) steps, even when O(log(nL/ε))-bit number
arithmetic is used. For bounded L, the time bound is therefore polynomial in
the size of the input data.
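The discretization (3) can be exercised directly on the smallest nontrivial instance: two parallel source-sink edges of lengths l1 < l2, where the unit current splits in proportion to the conductances xe /le . The following sketch is our own toy simulation, not the paper's experiment; simulate and its parameters are our names.

```python
def simulate(l1, l2, h, steps, x1=1.0, x2=1.0):
    """Euler discretization (3) on two parallel s0-s1 edges."""
    for _ in range(steps):
        c1, c2 = x1 / l1, x2 / l2        # conductances x_e / l_e
        q1 = c1 / (c1 + c2)              # unit current splits by conductance
        q2 = 1.0 - q1
        x1 += h * (q1 - x1)              # x(t+1) = x(t) + h(|q| - x(t))
        x2 += h * (q2 - x2)
    return x1, x2
```

Running it with l1 = 1, l2 = 3 drives the capacity of the shorter edge toward one and the other toward zero, as the convergence result predicts.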
Our second contribution was inspired by the following experiment of Nakagaki
et al., reported in [NIU+ 07] (see also Figure 2). They cover a rectangular plate with
Physarum and feed it at opposite corners of the plate. Two-thirds of the plate are
put under a bright light, and one-third is kept in the dark. Under uniform lighting
conditions, Physarum would retract to a straight-line path connecting the food
sources [NYT00]. However, Physarum does not like light and therefore forms a path
with one kink connecting the food sources. The path is such that the part under
light is shorter than in a straight-line connection. In the theory section of [NYT00],
a reactivity parameter ae > 0 is introduced into (1):

ẋe (t) = |qe (x, l)| − ae xe (t). (4)

Note that if, for example, qe (x, l) = 0, the capacity of edge e decreases with
a rate that depends on ae . To model the experiment, ae = 1 for edges in the
dark part of the plate, and ae = C > 1 for the edges in the lighted area, where
C is a constant. The authors of [NIU+ 07] report that in computer simulations,
the dynamics (4) converges to the shortest source-sink path with respect to the
modified length function ae le . A proof of convergence is currently only available
for the uniform case ae = 1 for all e, see [BMV12, Bon13].
A directed version of model (4) was proposed in [IJNT11]. The graph G =
(N, E) is now a directed graph. For a state vector x(t), the flows are defined as
above. A flow qe (x, l) is positive if it flows in the direction of e and is negative
otherwise. The dynamics becomes

ẋe (t) = qe (x, l) − ae xe (t). (5)

Although this model apparently has no physical counterpart, it has the advan-
tage of allowing one to treat directed graphs. Ito et al. [IJNT11] prove con-
vergence to the shortest source-sink path in the uniform case (ae = 1 for all
e). In fact, they show convergence for a somewhat more general problem, the
transportation problem, as does [BMV12] for the undirected model.
We show that the dynamics (5) converges to the shortest directed source-
sink path under the modified length function ae le . This generalizes the conver-
gence result of [IJNT11] from the uniform (ae = 1 for all e) to the nonuniform
case, albeit only for the shortest path problem. Our proof combines arguments
from [MO07, MO08, IJNT11, BMV12, Bon13] and we believe it is simpler than
the one in [IJNT11]. Moreover, for the uniform case (that is, ae = 1 for all e),
we can prove convergence for the discretized model

xe (t + 1) = xe (t) + h(qe (x, l) − xe (t)), (6)

where h ≤ 1/(n(4nm2 LX02 )2 ) is the step size; here, X0 is the maximum between
the largest capacity and the inverse of the smallest capacity at time zero. In
particular, let P ∗ be the shortest directed source-sink path and let ε ∈ (0, 1) be
arbitrary: we show xe (t) ≥ 1 − ε/2 for e ∈ P ∗ and xe (t) ≤ ε for e ∉ P ∗ , whenever
t ≥ (4nL/h)(3 ln X0 + 2 ln(2m/ε)).

Outline of the paper. The remainder of the paper is structured as follows. In


Section 2 we give basic definitions and properties. In Section 3 we study the
discrete dynamics (3). Section 4 concerns the directed models (5) and (6). We
close with some concluding remarks in Section 5.
2 Electrical Networks

Without loss of generality, assume that N = {1, 2, . . . , n}, E = {1, 2, . . . , m}


and assume an arbitrary orientation of the edges.4 Let A = (ave )v∈N,e∈E be the
incidence matrix of G under this orientation, that is, ave = +1 if v is the tail of
e, ave = −1 if v is the head of e, and ave = 0, otherwise. Then q is defined as the
unit-value flow from s0 to s1 of minimum energy, that is, as the unique optimal
solution to the following continuous quadratic optimization problem, related to
Thomson’s principle from physics [Bol98, Theorem IX.2]:

min q T Rq such that Aq = b. (7)

Here, R = diag(l/x) ∈ RE×E is the diagonal matrix with value re = le /xe
for the e-th element of the main diagonal, and b ∈ RN is the vector defined by
bv = +1 if v = s0 , bv = −1 if v = s1 , and bv = 0, otherwise. The value re is
called the resistance of edge e. Node s0 is called the source, node s1 the sink. The
quantity η = q T Rq is the energy; the quantity bs0 = 1 is the value of the flow
q. The optimality conditions for (7) imply that there exist values p1 , . . . , pn ∈ R
(potentials) that satisfy Ohm’s law [Bol98, Section II.1]:

qe = (pu − pv )/re , whenever edge e is oriented from u to v. (8)

By the conservation of energy principle, the total energy equals the difference
between the source and sink potentials, times the value of the flow [Bol98, Corol-
lary IX.4]:
η = (ps0 − ps1 )bs0 = ps0 − ps1 . (9)
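In a discrete implementation, q is obtained by solving the Laplacian system for the node potentials and then applying Ohm's law (8). The following pure-Python sketch (our own helper, electrical_flow, written only for tiny examples; a real solver would use a sparse linear-algebra library) illustrates this on a four-node diamond graph:

```python
def electrical_flow(n, edges, x, l, s0, s1):
    """Unit-value electrical flow with resistances r_e = l_e / x_e.
    edges is a list of (u, v) pairs, each oriented u -> v."""
    # Weighted Laplacian with conductances c_e = x_e / l_e.
    L = [[0.0] * n for _ in range(n)]
    for e, (u, v) in enumerate(edges):
        c = x[e] / l[e]
        L[u][u] += c; L[v][v] += c
        L[u][v] -= c; L[v][u] -= c
    b = [0.0] * n
    b[s0], b[s1] = 1.0, -1.0
    # Ground the sink (p[s1] = 0) and solve the reduced system by
    # Gauss-Jordan elimination with partial pivoting.
    idx = [w for w in range(n) if w != s1]
    A = [[L[u][w] for w in idx] + [b[u]] for u in idx]
    m = len(A)
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(m):
            if r != col and A[r][col] != 0.0:
                f = A[r][col] / A[col][col]
                A[r] = [ar - f * ac for ar, ac in zip(A[r], A[col])]
    p = [0.0] * n
    for i, w in enumerate(idx):
        p[w] = A[i][-1] / A[i][i]
    # Ohm's law (8): q_e = (p_u - p_v) / r_e.
    return [(p[u] - p[v]) * x[e] / l[e] for e, (u, v) in enumerate(edges)]
```

On a diamond with unit capacities and path lengths 2 and 4, the unit current splits 2/3 versus 1/3, inversely to the path resistances.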

3 Convergence of the Undirected Physarum Model

In this section we characterize Physarum’s convergence properties in the undi-


rected model, as given by equation (3):

x(t + 1) = x(t) + h · (|q(x(t), l)| − x(t)) .

Assumptions on the input data: We assume that the length labels l and the
initial conditions x(0) satisfy the following:

a. each s0 -s1 path in G has a distinct overall length; in particular, there is a


unique shortest s0 -s1 path;
b. all capacities are initialized to one:

x(0) = 1; (10)
4
In the directed model discussed in Section 4, this orientation is simply the one given
by the directed graph.
c. the initially minimum capacity cut is the source cut, and it has unit capacity:

1TS · x(0) ≥ 1T0 · x(0) = 1, for any s0 -s1 cut S, (11)

where 1S is the characteristic vector of the set of edges in the cut S, and 10
is the characteristic vector of the set of edges incident to the source. Notice
that this can be achieved even when s0 does not have degree 1, by connecting a
new source s′0 to s0 via a length-1, capacity-1 edge.
d. every edge has length at least 1.

Basic properties: The first property we show is that the set of fractional s0 -s1
paths is an invariant for the dynamics.
Lemma 1. Let x = x(t) be the solution of (3) under the initial conditions
x(0) = 1. The following properties hold at any time t ≥ 0: (a) x > 0, (b)
1TS · x ≥ 1T0 · x = 1, and (c) x ≤ 1.
Proof. (a.) Let e ∈ E be any edge. Since |qe | ≥ 0, by the evolution equation (3)
we have Δxe (t) = h(|qe | − xe (t)) ≥ −hxe (t). Therefore, by induction, xe (t + 1) ≥
xe (t) − hxe (t) = (1 − h)xe (t) > 0 as long as h < 1.
(b.) We use induction. The property is true for x(0) by the assumptions on
the input data. Then, using (3), induction, and the fact that 1TS · |q| ≥ 1 for any
cut S,

1TS ·x(t+1) = 1TS ·(x(t)+h(|q|−x(t))) = (1−h)1TS ·x(t)+h1TS ·|q| ≥ 1−h+h = 1.

The fact that 1T0 · x = 1 can be shown similarly.


(c.) Easy induction, along the same lines as the proof of (a.). ⊓⊔
An equilibrium point of (3) is a vector x ∈ RE+ such that Δx = 0. Our assump-
tions imply that there are a finite number of equilibrium points: each equilibrium
corresponds to an s0 -s1 path of the network, and vice versa.
Lemma 2. If x = 1P for some s0 -s1 path P , then x is an equilibrium point.
Conversely, if x is an equilibrium point, then x = 1P for some s0 -s1 path P .
Proof. The proof proceeds along the same lines as for the continuous case, see
[Bon13, Lemma 2.3]. ⊓⊔

Convergence: Recall that, by (9),

η = Σe∈E re qe2 = q T Rq = ps0 − ps1 , (12)

and let
V = lT x = Σe∈E le xe = Σe∈E re x2e = xT Rx. (13)

Here η is the energy dissipated by the system, as well as the potential difference
between source and sink. Notice that the quantity V can be interpreted as the
“infrastructural cost” of the system; in other terms, it is the cost that would
be incurred if every link were traversed by a flow equal to its current capacity.
While η may decrease or increase during the evolution of the system, we will
show that η ≤ V and that V is always decreasing, except on equilibrium points.
Lemma 3. η ≤ V .
Proof. To see the inequality, consider any flow f of maximum value subject to
the constraint that 0 ≤ f ≤ x. The minimum capacity of a source-sink cut is 1
at any time, by Lemma 1(b). Therefore, by the Max Flow-Min Cut Theorem,
the value of the flow f must be 1. Then by (7),
η = q T Rq ≤ f T Rf ≤ xT Rx = V. ⊓⊔
Lemma 4. V is a Lyapunov function for (3); in other words, it is continuous
and satisfies (i) V ≥ 0 and (ii) ΔV ≤ 0. Moreover, ΔV = 0 if and only if
Δx = 0.
Proof. V is continuous and nonnegative by construction. Moreover,

ΔV /h = lT Δx/h = lT (|q| − x) by (3),
= xT R |q| − xT Rx by l = Rx,
= (xT R1/2 ) · (R1/2 |q|) − xT Rx
≤ (xT Rx)1/2 · (q T Rq)1/2 − xT Rx by Cauchy-Schwarz [Ste04],
= (ηV )1/2 − V
≤ V − V by Lemma 3
= 0.
Observe that ΔV = 0 is possible only when equality holds in the Cauchy-Schwarz
inequality. This, in turn, implies that the two vectors R1/2 x and R1/2 |q| are
parallel, that is, |q| = λx for some λ ∈ R. However, by Lemma 1(b), the capacity
of the source cut is 1 and, by (7), the sum of the currents across the source cut
is 1. Therefore, λ = 1 and Δx = h(|q| − x) = 0. ⊓⊔
Corollary 1. As t → ∞, x(t) approaches an equilibrium point of (3), and η(t)
approaches the length of the corresponding s0 -s1 path.
Proof. The existence of a Lyapunov function V implies [LaS76, Theorem 6.3]
that x(t) approaches the set {x ∈ RE + : ΔV = 0}, which by Lemma 4 is the
same as the set {x ∈ RE + : Δx = 0}. Since this set consists of isolated points
(Lemma 2), x(t) must approach one of those points, say the point 1P for some
s0 -s1 path P . When x = 1P , one has η = V = 1TP · l. ⊓⊔
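These Lyapunov properties can also be observed numerically. The sketch below is our own toy check on two parallel source-sink edges with x(0) = 1; it tracks η (the dissipated energy, here the parallel resistance of the two edges) and V = lT x along the discrete dynamics (3), and confirms η ≤ V with V nonincreasing.

```python
def lyapunov_trace(l1, l2, h, steps):
    """Track (eta, V) along the dynamics (3) for two parallel
    source-sink edges, both starting at capacity 1."""
    x1 = x2 = 1.0
    trace = []
    for _ in range(steps):
        c1, c2 = x1 / l1, x2 / l2        # conductances x_e / l_e
        eta = 1.0 / (c1 + c2)            # energy of the unit flow
        V = l1 * x1 + l2 * x2            # infrastructural cost
        trace.append((eta, V))
        q1 = c1 / (c1 + c2)              # current splits by conductance
        x1 += h * (q1 - x1)
        x2 += h * ((1.0 - q1) - x2)
    return trace
```

The invariant x1 + x2 ≥ 1 (Lemma 1(b)) is what guarantees η ≤ V here, exactly as in the proof of Lemma 3.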

Convergence to an approximate shortest path and convergence time: We will


track the convergence process via three main quantities: two of these, η and V ,
have already been introduced. The third one is defined as
W = Σe∈P ∗ le ln xe ,
where P ∗ is the shortest path. Recall that opt denotes the length of P ∗ . Observe
that W (t) ≤ 0 for all t (due to Lemma 1(c)) and W (0) = 0 due to the choice of
initial conditions. Also observe that V (0) = lT · x(0) = Σe∈E le ≤ mL, where m
is the number of edges of the graph and L is the length of the longest edge.
For a fixed ε ∈ (0, 1/300), we set h = ε/mL. We will bound the number of
steps before V falls below (1 + 3ε)3 opt < (1 + 10ε)opt.

Definition 1. We call a V -step any time step t such that η(t) ≤ (1 + 3ε)opt and
V (t) > (1 + 3ε)3 opt. We call a W -step any time step t such that η(t) > (1 + 3ε)opt
and V (t) > (1 + 3ε)3 opt.

Lemma 5. The number kV of V -steps is at most O((log n + log L)/(hε)).

Proof. For any V -step t we have, by the proof of Lemma 4 and the assumptions
on η and V ,

ΔV ≤ h((ηV )1/2 − V ) = hV ((η/V )1/2 − 1)
≤ hV (1/(1 + 3ε) − 1) = −hV (3ε/(1 + 3ε)) ≤ −hV ε,

so that V (t + 1) ≤ (1 − hε)V (t). In other words, V decreases by at least an hε
factor at each V -step. Moreover, V is nonincreasing at every step of the whole
process, and after it gets below (1 + 3ε)3 opt there are no more V -steps. Therefore,
the number of V -steps, kV , is at most log1/(1−hε) (V (0)/opt) ≤ (ln V (0))/(hε) =
O(log(mL)/(hε)) (we used the assumption that opt ≥ 1). ⊓⊔

Lemma 6. At every W -step, W increases by at least opt · h/2.

Proof. Let P∗ be the shortest path, so that 1ᵀ_{P∗} · l = opt. For a W-step t, we
have

    W(t+1) − W(t) = ∑_{e∈P∗} l_e ln( x_e(t+1)/x_e(t) )
                  = ∑_{e∈P∗} l_e ln( 1 + h(|p_u − p_v|/l_e − 1) ),

where u, v are the endpoints of edge e. Using the bound ln(1 + z) ≥ z/(1 + z),
which is valid for any z > −1 (recall that h < 1), we obtain

    W(t+1) − W(t) ≥ ∑_{e∈P∗} l_e · h(|p_u − p_v|/l_e − 1) / (1 + h(|p_u − p_v|/l_e − 1))

                  = h · [ ∑_{e∈P∗} |p_u − p_v| / (1 + h(|p_u − p_v|/l_e − 1))
                        − ∑_{e∈P∗} l_e / (1 + h(|p_u − p_v|/l_e − 1)) ]

                  ≥ h · [ ∑_{e∈P∗} |p_u − p_v| / (1 + hη) − ∑_{e∈P∗} l_e / (1 − h) ],
480 L. Becchetti et al.

where we used |p_u − p_v|/l_e − 1 < η (we are using the assumption l_e ≥ 1 for all e).
Since ∑_{e∈P∗} |p_u − p_v| ≥ η and η ≤ V ≤ mL, we obtain further

    W(t+1) − W(t) ≥ h ( η/(1 + hmL) − opt/(1 − h) )
                  = h · ((1 − h)η − (1 + hmL) opt) / ((1 − h)(1 + hmL))
                  ≥ opt·h · ((1 − ε)(1 + 3ε) − (1 + ε)) / ((1 − h)(1 + ε))
                  > opt·h · (ε − 3ε²)/(1 + ε) ≥ ε·opt·h/2,

where the third inequality follows since ε = hmL by definition of h and since h =
ε/(mL) ≤ ε (note that mL ≥ 1 from the definition of L). The fourth inequality
follows from simple calculus, while the fifth follows since (1 − 3ε)/(1 + ε) ≥ 1/2
whenever ε ≤ 1/7. ⊓⊔
Lemma 7. At every V -step, W decreases by at most 2opt · h.
Proof. Trivially, x_e(t+1) ≥ (1 − h)x_e(t), hence ln x_e(t+1) ≥ ln x_e(t) − ln(1/(1 −
h)) ≥ ln x_e(t) − 2h (since h < 1/2). The claim follows from the definition of
W. ⊓⊔
Lemma 8. The number k_W of W-steps is at most 4k_V/ε = O(mL(log n +
log L)/ε³).

Proof. At every W-step, W increases by at least ε·opt·h/2. But W is always
bounded above by 0, is decreased by at most 2opt·h·k_V in total over all V-steps,
and starts with W(0) = 0. The claim follows. ⊓⊔
Theorem 1. After at most O(mL(log n + log L)/ε³) steps, V decreases below
(1 + 10ε) opt.

Proof. Until the time that V gets below (1 + 3ε)³ opt ≤ (1 + 10ε) opt, every
step is either a V-step or a W-step, of which there can be at most k_V + k_W =
O(mL(log n + log L)/ε³) in total. ⊓⊔

Approximate Computation. Real arithmetic is not needed for the results of the
preceding section; in fact, arithmetic with O(log(nL/ε)) bits suffices. The proof
that approximate arithmetic suffices mimics the proof in the preceding section;
details are deferred to a full version of the paper.

4 Convergence of the Directed Physarum Model


We characterize Physarum's convergence properties in the directed model. We
assume (A1) x_e(0) > 0 for all e, (A2) there is a directed path from the source
to the sink, (A3) edge lengths are integral, and (A4) the shortest source-sink
path is unique. It is convenient to study the dynamics

    ẋ_e(t) = a_e (q_e(t) − x_e(t))                                    (14)

instead of (5). This is simply a change of variables and a rescaling of the
edge lengths. We define several constants: a_min = min(1, min_e a_e), x_max(0) =
max(1, max_e x_e(0)), x_min(0) = min(1, min_e x_e(0)), X₀ = max(x_max(0), 1/x_min(0)),
and L = max_e l_e. P∗ denotes the shortest directed source-sink path. We prove:
Theorem 2. Assume (A1)–(A4) and let ε ∈ (0, 1) be arbitrary. If t ≥ (nL/a_min) ·
(3 ln X₀ + 2 ln(2m/ε)), then x_e(t) ≥ 1 − 2ε for e ∈ P∗ and x_e(t) ≤ ε for e ∉ P∗.

Electrical flows are uniquely determined by Kirchhoff's and Ohm's laws. In our
setting, the electrical flow q(t) and the vertex potentials p(t) are functions of
time. For an edge e = (u, v), let η_e(t) = p_u(t) − p_v(t), and let η(t) = p_{s₀}(t) −
p_{s₁}(t). We have the following facts: (1) For any directed source-sink path P,
∑_{e∈P} η_e(t) = η(t). (2) x_e(t) ≤ max(1, x_e(0)) ≤ x_max(0) for all t. (3) x_e(t) > 0
for all e ∈ E and all t (the existence of a directed source-sink path is crucial
here). (4) ln x_e(t) = ln x_e(0) + a_e (η̂_e(t)/l_e − 1) · t, where η̂_e(t) = (1/t) ∫₀ᵗ η_e(s) ds
is the average potential drop on edge e up to time t. For a directed source-sink
path P, let

    l_P = ∑_{e∈P} l_e     and     w_P(t) = ∑_{e∈P} (l_e/a_e) ln x_e(t)
be its length and its weighted sum of log capacities, respectively. The quantity wP
was introduced in [MO07, MO08], and the following property (15) was derived
in these papers.

Lemma 9. Assume (A1), (A2) and let P be any directed source-sink path. Then

    ẇ_P(t) = η(t) − l_P     and     (d/dt)(w_P(t) − w_{P∗}(t)) = l_{P∗} − l_P.     (15)

Moreover, w_P(t) ≤ (3nL ln X₀)/a_min − t if P is a non-shortest source-sink
path and (A3) holds. For ε ∈ (0, 1), let t₁ = nL(3 ln X₀ + ln(1/ε))/a_min. Then
min_{e∈P} x_e(t) ≤ ε for t ≥ t₁.

The last claim states that for any non-shortest path P , mine∈P xe (t) goes to
zero. This is not the same as stating that there is an edge in P whose capacity
converges to zero. Such a stronger property will be shown in the proof of the
main theorem.

The Convergence Proof: The proof proceeds in two steps. We first show that
the vector of edge capacities becomes arbitrarily close to a nonnegative non-
circulatory flow and then prove the main theorem. A flow is nonnegative if
f_e ≥ 0 for all e, and it is non-circulatory if f_e = 0 for at least one edge e on
every directed cycle.
Lemma 10. Assume (A1) and (A2). For t > t₀ := (1/a_min) ln(3mX₀), there is
a nonnegative non-circulatory flow f(t) with

    |f_e(t) − x_e(t)| ≤ 5mX₀ · e^{−a_min·t}.                          (16)

Proof. We follow the analysis in [IJNT11], taking reactivities into account. ⊓⊔

We are now ready for the proof of the main theorem.



Proof (of Theorem 2). Let P be the set of non-shortest simple source-sink paths,
and let t > t₀, where t₀ is defined as in Lemma 10. The nonnegative non-circulatory
flow f(t) can be written as a sum of flows along simple directed
source-sink paths, i.e.,

    f(t) = α_{P∗}(t)·1_{P∗} + ∑_{P∈P} α_P(t)·1_P

with nonnegative coefficients α_P. This representation is not unique. However,
there is always a representation with at most m nonzero coefficients.⁵ For any
edge e and any path P with e ∈ P, the flow f_e(t) is at least α_P(t).
Let ε ∈ (0, 1) be arbitrary. For

    t ≥ (1/a_min) · max{ ln(10m²X₀/ε),  nL(3 ln X₀ + ln(2m/ε)) },

we have |f_e(t) − x_e(t)| ≤ ε/(2m) for all e (Lemma 10) and min_{e∈P} x_e(t) ≤
ε/(2m) for every non-shortest path P (Lemma 9). Thus, every non-shortest path
contains an edge e with f_e(t) ≤ ε/m, so α_P(t) ≤ ε/m for all non-shortest
paths P, and hence

    x_e(t) ≤ m · ε/m ≤ ε     for all e ∉ P∗.

The value of the flow f is one. The total flow along the non-shortest paths is at
most ε. Thus the flow along P∗ is at least 1 − ε. Hence x_e(t) ≥ 1 − ε − ε/(2m) ≥
1 − 2ε for all e ∈ P∗. Finally, ln(10m²X₀/ε) ≤ nL(3 ln X₀ + 2 ln(2m/ε)). ⊓⊔

Discretization. We study the discretization of the system of differential equations
(14). We proceed in discrete time steps t = 0, 1, 2, . . . and define the dynamics

    x_e(t+1) = x_e(t) + h·a_e·(q_e(t) − x_e(t)),                      (17)

where h is the step size. We will need the following additional assumptions: (A5)
a_e = 1 for all e, and (A6) there is an edge e₀ = (s₀, s₁) of length nL and initial
capacity 0. Observe that the existence of this edge does not change the shortest
directed source-sink path. Our main theorem becomes the following; the proof
structure for the discrete case is similar to the one for the continuous case.
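The discrete dynamics (17) are straightforward to simulate. The sketch below is an illustrative toy (not the authors' code) under assumption (A5), a_e = 1: at each step it computes the electrical flow q(t) by solving the Laplacian system for the vertex potentials, then applies the capacity update. On a three-node graph with a direct source-sink edge of length 1 and a two-hop detour of total length 2, the capacity concentrates on the shortest edge.

```python
import numpy as np

def electrical_flow(n, edges, lengths, x, s0, s1):
    """Unit current from s0 to s1; the conductance of edge e is x_e / l_e."""
    L = np.zeros((n, n))
    for (u, v), le, xe in zip(edges, lengths, x):
        c = xe / le
        L[u, u] += c; L[v, v] += c
        L[u, v] -= c; L[v, u] -= c
    b = np.zeros(n); b[s0], b[s1] = 1.0, -1.0
    p = np.zeros(n)                         # ground the last node
    p[:-1] = np.linalg.solve(L[:-1, :-1], b[:-1])
    # Ohm's law per edge: q_e = (p_u - p_v) * x_e / l_e
    return np.array([(p[u] - p[v]) * xe / le
                     for (u, v), le, xe in zip(edges, lengths, x)])

def physarum_discrete(n, edges, lengths, s0, s1, h=0.01, steps=3000):
    """Iterate x_e(t+1) = x_e(t) + h (q_e(t) - x_e(t)), i.e. (17) with a_e = 1."""
    x = np.ones(len(edges))
    for _ in range(steps):
        q = electrical_flow(n, edges, lengths, x, s0, s1)
        x = x + h * (q - x)
    return x

# Direct edge (0, 2) has length 1; the path 0-1-2 has total length 2.
edges = [(0, 2), (0, 1), (1, 2)]
x = physarum_discrete(3, edges, [1.0, 1.0, 1.0], s0=0, s1=2)
print(x)   # capacity concentrates on the shortest edge (0, 2)
```

With step size h = 0.01 and 3000 iterations (time 30), the shortest edge's capacity is essentially 1 and the detour capacities are negligible, matching the (1 + ε)-approximation behavior the analysis predicts.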

Theorem 3. Assume (A1)–(A6) and h ≤ 1/(24·n·(4nmL·X₀²)²). Let ε ∈ (0, 1) be
arbitrary. For

    t ≥ (4nL/h) · (3 ln X₀ + 2 ln(2m/ε)),

x_e(t) ≥ 1 − 2ε for e ∈ P∗ and x_e(t) ≤ ε for e ∉ P∗.
⁵ Let α_{P∗}(t) be the minimum value of f_e(t) for e ∈ P∗. Subtract α_{P∗}(t)·1_{P∗} from f(t).
As long as f(t) is not the zero flow, determine a source-sink path P carrying nonzero
flow and set α_P(t) to the minimum value of f_e(t) for e ∈ P. Subtract α_P(t)·1_P from
f(t).
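The greedy decomposition sketched in the footnote can be written out directly. The following illustrative implementation (ours, not the paper's) takes a nonnegative non-circulatory s-t flow given per edge, repeatedly extracts one source-sink path carrying positive flow, and subtracts its bottleneck; since each round zeroes at least one edge, at most m paths are produced.

```python
def decompose_flow(edges, f, s, t, tol=1e-12):
    """Greedy path decomposition of a nonnegative non-circulatory s-t flow.

    `edges` is a list of directed pairs (u, v); `f[i]` is the flow on edge i.
    Returns a list of (amount, [edge indices]) pairs, at most len(edges) long.
    """
    f = list(f)
    paths = []
    while True:
        # DFS for an s-t path using only edges with remaining flow > tol.
        parent = {s: None}
        stack = [s]
        reached = False
        while stack:
            u = stack.pop()
            if u == t:
                reached = True
                break
            for i, (a, b) in enumerate(edges):
                if a == u and f[i] > tol and b not in parent:
                    parent[b] = (u, i)
                    stack.append(b)
        if not reached:
            break
        # Recover the path, subtract its bottleneck flow.
        path, u = [], t
        while u != s:
            u, i = parent[u]
            path.append(i)
        path.reverse()
        amount = min(f[i] for i in path)
        for i in path:
            f[i] -= amount
        paths.append((amount, path))
    return paths

# Flow of value 1: 0.7 on the direct edge (0, 2), 0.3 along 0-1-2.
paths = decompose_flow([(0, 2), (0, 1), (1, 2)], [0.7, 0.3, 0.3], s=0, t=2)
print(paths)   # two weighted source-sink paths summing to the flow value
```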

5 Conclusions and Future Work

We summarize our three main results: the discretization (3) of the undirected
Physarum model computes a (1 + ε)-approximation of the shortest source-sink
path in O(mL(log n + log L)/ε³) iterations with arithmetic on numbers
of O(log(nL/ε)) bits. The dynamics (5) of the nonuniform directed Physarum
model converges to the shortest directed source-sink path under the modified
length function a_e·l_e. Within time (nL/a_min)·(3 ln X₀ + 2 ln(2m/ε)), an
ε-approximation is reached. For the uniform model (a_e = 1), we also prove
convergence of the discretization.
There are many open questions: (i) Convergence of the nonuniform undirected
model; (ii) Convergence of the discretized nonuniform directed model; (iii) Are
our bounds best possible? In particular, can the dependency on L be replaced
by a dependency on log L?

References
[BD97] Baldauf, S.L., Doolittle, W.F.: Origin and evolution of the slime molds
(Mycetozoa). Proc. Natl. Acad. Sci. USA 94, 12007–12012 (1997)
[BMV12] Bonifaci, V., Mehlhorn, K., Varma, G.: Physarum can compute shortest
paths. Journal of Theoretical Biology 309, 121–133 (2012); A preliminary
version of this paper appeared at SODA 2012, pp. 233–240
[Bol98] Bollobás, B.: Modern Graph Theory. Springer, New York (1998)
[Bon13] Bonifaci, V.: Physarum can compute shortest paths: A short proof. Infor-
mation Processing Letters 113(1-2), 4–7 (2013)
[IJNT11] Ito, K., Johansson, A., Nakagaki, T., Tero, A.: Convergence properties for
the Physarum solver. arXiv:1101.5249v1 (January 2011)
[Kir10] Kirby, B.J.: Micro- and Nanoscale Fluid Mechanics: Transport in Microflu-
idic Devices. Cambridge University Press, Cambridge (2010)
[LaS76] LaSalle, J.B.: The Stability of Dynamical Systems. SIAM (1976)
[MO07] Miyaji, T., Ohnishi, I.: Mathematical analysis to an adaptive network of
the Plasmodium system. Hokkaido Mathematical Journal 36(2), 445–465
(2007)
[MO08] Miyaji, T., Ohnishi, I.: Physarum can solve the shortest path problem on
Riemannian surface mathematically rigourously. International Journal of
Pure and Applied Mathematics 47(3), 353–369 (2008)
[NIU+07] Nakagaki, T., Iima, M., Ueda, T., Nishiura, Y., Saigusa, T., Tero, A.,
Kobayashi, R., Showalter, K.: Minimum-risk path finding by an adaptive
amoebal network. Physical Review Letters 99(068104), 1–4 (2007)
[NYT00] Nakagaki, T., Yamada, H., Tóth, Á.: Maze-solving by an amoeboid organ-
ism. Nature 407, 470 (2000)
[SM03] Süli, E., Mayers, D.: Introduction to Numerical Analysis. Cambridge Uni-
versity Press (2003)
[Ste04] Steele, J.: The Cauchy-Schwarz Master Class: An Introduction to the Art
of Mathematical Inequalities. Cambridge University Press (2004)
[TKN07] Tero, A., Kobayashi, R., Nakagaki, T.: A mathematical model for adaptive
transport network in path finding by true slime mold. Journal of Theoret-
ical Biology 244, 553–564 (2007)
[You] http://www.youtube.com/watch?v=czk4xgdhdY4
On Revenue Maximization for Agents
with Costly Information Acquisition
Extended Abstract

L. Elisa Celis¹, Dimitrios C. Gklezakos², and Anna R. Karlin²

¹ Xerox Research Centre India
[email protected]
² University of Washington
{gklezd,karlin}@cs.washington.edu

Abstract. A prevalent assumption in traditional mechanism design is


that buyers know their precise value for an item; however, this assump-
tion is rarely true in practice. In most settings, buyers can “deliberate”,
i.e., spend money or time, in order to improve their estimate of an item's
value. It is known that the deliberative setting is fundamentally different
than the classical one, and desirable properties of a mechanism such as
equilibria, revenue maximization, or truthfulness, may no longer hold.
In this paper we introduce a new general deliberative model in which
users have independent private values that are a-priori unknown, but can
be learned. We consider the design of dominant-strategy revenue-optimal
auctions in this setting. Surprisingly, for a wide class of environments,
we show the optimal revenue is attained with a sequential posted price
mechanism (SPP). While this result is not constructive, we show how
to construct approximately optimal SPPs in polynomial time. We also
consider the design of Bayes-Nash incentive compatible auctions for a
simple deliberative model.

1 Introduction
In many real-world scenarios, people rarely know precisely how they value an
item, but can pay some cost (e.g., money, time or effort) to attain more certainty.
This not only occurs in online ad markets (where advertisers can buy information
about users), but also in everyday life. Suppose you want to buy a house. You
would research the area, school district, commute, and possibly pay experts such
as a real estate agent or an inspection company. Each such action has some cost,
but also helps better evaluate the worth of the house. In some cases, e.g., if you
find out your commute would be more than an hour, you may simply walk away.
However, if the commute is reasonable, you may choose to proceed further and
take more actions (at more cost) in order to gain even more information. This
continues until you take a final decision. A deliberative agent as defined in this
paper has this kind of multiple-round information-buying capability.
Previous work shows that mechanism design for deliberative agents is funda-
mentally different than classical mechanism design due to the greater flexibility

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 484–495, 2013.
© Springer-Verlag Berlin Heidelberg 2013

in the agents’ strategies. A classical agent has to decide how much information
to reveal; a deliberative agent has to additionally decide how much information
to acquire. This affects equilibrium behavior. For example, in second-price auc-
tions, deliberative agents do not have dominant strategies [20], and standard
mechanisms and techniques do not apply. Revenue-optimal mechanisms have re-
mained elusive, and the majority of positive results are for simple models where
agents can determine their value exactly in one round, often restricting further
to binary values or single-item auctions [2,5,3,24,7]. Positive results for more
general deliberative settings restrict agents in other ways, such as forcing agents
to acquire information before the mechanism begins [22,8], to commit to partic-
ipate before deliberating [13], or to deliberate in order to be served [4]. (Also see
[1,19,14].) Furthermore, impossibility results exist for certain types of dominant-strategy¹
deliberative mechanisms [20,21]. These results, however, rely crucially
on the fact that agents are assumed to have the ability to deliberate about other
agents' values. In an independent value model, this is not a natural assumption.
In this paper we continue a line of research begun by Thompson and Leyton-
Brown [24,7] related to dominant strategy mechanism design in the independent
value model. Specifically, we extend results to a general deliberative model in
which agents can repeatedly refine their information. Our first main result is that
the profit-maximizing mechanism (a.k.a. optimal mechanism) is, without loss of
generality, a sequential posted price mechanism (SPP).² Our second main result
is that, via a suitable reduction, we can leverage classical results (see [10,11,18])
showing that revenue-optimal mechanisms can be approximated with SPPs, in
order to construct approximately optimal SPPs in our setting. These are the
first results in an interesting model that raises many more questions than it
answers. In the final section, we take first steps towards understanding Bayes-Nash
incentive-compatible mechanisms in a simpler deliberative setting.
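The mechanism format at the center of these results is easy to state in code. The sketch below is ours, not the paper's (the price list and the acceptance-oracle interface are illustrative assumptions): agents are approached in order with take-it-or-leave-it prices, a rejected agent is never served, and selling stops once the feasibility constraint (here, a k-unit supply) binds.

```python
def sequential_posted_price(prices, accepts, capacity=1):
    """Run a sequential posted price (SPP) mechanism.

    `prices[i]` is the take-it-or-leave-it offer to agent i, and
    `accepts[i](p)` reports whether agent i accepts price p (in the
    deliberative setting this call hides the agent's optimal deliberation
    strategy). A rejected agent is never served again; selling stops once
    the feasibility constraint (a k-unit supply here) is exhausted.
    """
    revenue, served = 0.0, []
    for i, p in enumerate(prices):
        if len(served) >= capacity:
            break
        if accepts[i](p):
            revenue += p
            served.append(i)
    return revenue, served

# Toy run: agent 0 turns down 0.6, agent 1 takes 0.8, and the single unit is gone.
accepts = [lambda p: p <= 0.5, lambda p: p <= 0.9, lambda p: True]
revenue, served = sequential_posted_price([0.6, 0.8, 0.7], accepts, capacity=1)
print(revenue, served)   # 0.8 [1]
```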

2 Deliberative Model
In our model, each agent has a set of “deliberative possibilities” that describe
the ways in which they can acquire information about their value.
Definition 1 (Deliberative Possibilities). An agent’s deliberative possibili-
ties are represented by tuples (F , D, c) where
– F is a probability distribution over the possible values the agent can have;
– D is a set of deliberations the agent can perform; and
– c : D → R+ , where c(d) > 0 represents the cost to perform deliberation d.
In any given state (F , D, c), the agent may choose one of the deliberations d ∈ D
to perform. A deliberation is a random function that maps the agent to a new
¹ Dominant strategy equilibria occur when each agent has a strategy that is optimal
against any (potentially suboptimal) strategies the other agents play.
² In a sequential posted price mechanism (SPP), agents are offered take-it-or-leave-it
prices in sequence; the mechanism is committed to sell to an agent at the offered
price if she accepts, and will not serve the agent at any point if she rejects.

state (F  , D , c ), where F  is the new prior the agent has over his value, D is
the new set of deliberations the agent can perform, and c (·) the corresponding
costs. The distribution over new priors is such that the marginals agree with F ;
i.e., while v ∼ F  is drawn from a new distribution (the updated prior), v ∼ d(F )
is identically distributed as v ∼ F . 3
We focus on the design of mechanisms in single-parameter environments [17,16],
where each agent has a single private (in our case, unknown) value for "service",
and there is a combinatorial feasibility constraint on the set of agents that can
be served simultaneously.⁴
A mechanism in the deliberative setting is a (potentially) multi-stage process
in which the mechanism designer interacts with the agents. It concludes with
an allocation and payment rule (x, p), where x_i = 1 if agent i is served and 0
otherwise⁵, and agent i is charged p_i.⁶ The mechanism designer knows the agents'
otherwise5 and agent i is charged pi .6 The mechanism designer knows the agents’
deliberative possibilities and initial priors. At any point during the execution of
the mechanism, an agent is free to perform any of her deliberation possibilities
according to her current state (F , D, c). Indeed, it may be in the mechanism
designer’s best interest to incentivize her to do certain deliberations.
We focus in this paper on a public communication model; i.e., every agent
observes the interaction between any other agent and the mechanism. Versions
of our results also extend to the private communication model. We also make
the standard assumption that agents have full knowledge of the mechanism to
be executed. Crucially however, if and when an agent deliberates, there is no
way for other agents or the mechanism to certify that the deliberation occurred.
Moreover, the outcome of a deliberation is always private. Hence, the mecha-
nism designer must incentivize the agent appropriately in order to extract this
information.
If, over the course of the execution, an agent performs deliberations d₁, . . . , d_k,
then her expected utility is

    x_i · E[F^(k) | results of d₁, . . . , d_k] − p_i − ∑_{1≤j≤k} c(d_j).

We assume that agents choose their actions so as to maximize their expected


utility. However, note that it is possible that an agent’s realized utility turns out
to be negative.
In classical settings, the revelation principle [15,23] is a crucial step in narrow-
ing the search for good mechanisms. In our deliberative setting, we cannot simply
³ For example, the initial prior might be U[0, 1], and the agent might be able to
repeatedly determine which of two quantiles her value is in, at some cost. Note
that in general we do not impose any restriction on the type of distributions or
deliberations except, in some cases, that they be finite (see, e.g., Definition 4).
⁴ Examples include single-item auctions, k-item auctions, and matroid environments.
⁵ Clearly, this allocation must satisfy the feasibility constraints of the given
single-parameter environment; e.g., in a single-item auction, ∑_i x_i ≤ 1.
⁶ As is standard, the mechanism does not charge agents that are not served. Hence,
p_i = 0 when x_i = 0.

apply the revelation principle to “flatten” mechanisms to a single stage, since


the mechanism cannot simulate information gathering on behalf of the agent.
However, a minor variant of the revelation principle developed by Thompson
and Leyton-Brown [24] can be shown to also apply to our general setting. To
state it formally, we consider the following type of mechanism:
Definition 2 (Simple Deliberative Mechanism⁷). A simple deliberative
mechanism (SDM) is a multi-stage mechanism where at each stage either
– a single agent (or set of agents) is asked to perform and report the result of
a specified deliberation⁸, or
– the mechanism outputs an allocation and payment rule (x, p). (Note that an
allocation can be made to agents that were never asked to deliberate.)
Note that for deterministic dominant strategy mechanisms, SDMs can interact
with a single agent at a time and restrict interactions to soliciting and receiving
the results of their deliberations without loss of generality.
A strategy of a deliberative agent against an SDM consists of either
1. not deliberating and reporting a result r̃,
2. performing each requested deliberation and reporting a result r̃ (which may
depend on the real result r of the deliberation), or
3. performing other or additional deliberation(s) and reporting a result r̃ (which
may depend on the real result(s) r of the deliberation(s)).

Definition 3. A truthful strategy always takes option (2) and reports r̃ = r,
the true result of the deliberation. A truthful SDM is one in which truth-telling
is a dominant strategy for every agent. That is, no matter how other
agents behave, it is in her best interest to execute the truthful strategy.
A crucial step in narrowing the search for good mechanisms is the Revelation
Principle. A version for simple deliberative agents generalizes to our setting
without complication.
Lemma 1 (Revelation Principle [24]). For any deliberative mechanism M
and equilibrium σ of M, there exists a truthful SDM N which implements the
same outcome as M in equilibrium σ.

3 Revenue Domination by SPPs


Theorem 1 (SPPs are Optimal). Any deterministic truthful mechanism M
in a single-parameter deliberative environment is revenue-dominated by a sequential
posted price mechanism M′, under the assumption that M does not exploit
indifference points.⁹

⁷ This definition was given in [24] under the name Dynamically Direct Mechanism.
⁸ The deliberation the agent is asked to perform must be one of her current deliberative
possibilities, assuming she did as she was told up to that point.
⁹ I.e., if the agent is indifferent between receiving and not receiving the item, the
mechanism commits to either serving or not serving this agent without probing
other agents.

In other words, any optimal mechanism can be transformed into a revenue-


equivalent SPP.
We use the following lemma which extends results from [7,24] to our more
general setting. We omit the proof, which is a straightforward extension.
Lemma 2 (Generalized Influence Lemma).
Consider a truthful SDM M, and a bounded agent who has performed t
deliberations resulting in a prior distribution F^(t). Let S = support(F^(t)). If the
agent is asked to perform deliberation d, then there exist thresholds L_d, H_d ∈ S
such that
– If the agent deliberates and reports a value ≤ L_d, she will not be served.
– If the agent deliberates and reports a value ≥ H_d, she will be served.
– It is possible that the deliberation results in a value ≤ L_d or ≥ H_d.
Note that "value" above refers to the agent's effective value E[F^(t+1)], where
F^(t+1) ∼ d(F^(t)).
We now show that at the time the agent is first approached, the price she is
charged does not depend on the future potential actions of any agent.
Lemma 3. Let M be a dominant-strategy truthful SDM for deliberative agents.
If M asks an agent to deliberate, we can modify M so that there is a single
price p determined by the history before M’s first interaction with this agent.
Moreover, the agent will win if her effective value is above p, and lose if it is
below p. The modification preserves truthfulness and can only improve revenue.
Proof. Let M be an SDM as above. Note that M can be thought of as a tree
where at each node the mechanism asks an agent to perform a deliberation (say
d). Each child corresponds to a potential result reported by the agent, at which
point, the mechanism either probes an (potentially the same) agent or terminates
at a leaf with an allocation and payment rule. With some abuse of notation, we
say an agent reports a value in H if her value is above Hd , reports a value in L
if her value is below Ld , and reports a value in M otherwise.
Consider an execution of M in which agent i is asked to deliberate. Consider
the path of execution. We show that we can modify M without any loss of revenue
so that after i is first asked to deliberate, the price at which she gets the item
(assuming she is served) is effectively fixed and does not depend on the rest of
the path.
Firstly, note that if agent i reports a value in H, then, by Lemma 2, she
is served. Assume that i determines her value is in H after deliberation.
Consider the case where there is only one possible value i can report in H and
she is not asked to deliberate again after this point. If this can result in multiple
possible prices, the difference must depend only on the behavior of other agents,
and is not intrinsic to i. In fact, all such prices must be acceptable to her due to
truthfulness, since, for a fixed set of strategies the other agents take, it should not
be the case that she prefer to lie and say her value is in L and hence avoid service
at a price that is too high. Therefore, any such mechanism can be modified by
replacing all these prices with the maximum such price.

Otherwise, if there are two different prices, p and p′, that can be reached
depending on i's report(s), then this again contradicts dominant-strategy truthfulness.
This is due to the fact that an agent's behavior must be truthful against
any set of fixed strategies for the other agents. Thus, whenever i reports a value
in H, she must be charged the same price p.
If agent i is never served when i reports v ∈ M , the proof is complete. Assume
otherwise. Let pm be some price that she is served at along a path in M and let
ph be the price she is charged if she reports a value in Hd . Clearly, if pm > ph
then if i’s value is in M she would have incentive to lie. Additionally, if pm < ph ,
then there is a set of strategies we can fix for the other agents for which i would
again have incentive to lie. Hence, by dominant strategy truthfulness, pm = ph .
We conclude by observing that, by truthfulness, it is straightforward to see
that an agent must be served at the above price p whenever her effective value
is above p (and not served otherwise). ⊓⊔

Lemma 4. Let M be a truthful SDM such that the price it charges i, assuming
i is served, only depends on the history before M’s first interaction with i. Then,
M is revenue-equivalent to an SPP N .

Proof. Given M, we construct N. Consider any node in the SDM M where
some agent, say i, is asked to deliberate. Let ĥ_i be the history leading up to this
node. Lemma 3 implies that from this point forward, whenever agent i is served,
she is charged p_i. Let N, under the same ĥ_i, offer price p_i to agent i.
Since M is truthful, each deliberation M asks i to perform is in her best
interest. Additionally, there is no deliberation i would like to perform which the
mechanism did not ask her to perform. Let her final effective value be ṽ_i. Clearly,
by truthfulness, i gets the item if and only if p_i ≤ ṽ_i. Similarly, when N offers
her price p_i it will be in her interest to take the same sequence of deliberations,
otherwise M was not truthful to begin with. She accepts if and only if p_i ≤ ṽ_i,
matching the scenario under which she accepts M's offer.
The above holds for any deliberative node. Hence, M and N have the same
expected revenue, completing the proof. ⊓⊔

Combining Lemmas 4 and 1 concludes the proof of Theorem 1.

4 Approximating Optimal Revenue

While our result above is nonconstructive, it turns out to be easy to construct


approximately-optimal SPPs. The basic approach is simple:

1. For any price p ∈ [0, ∞), determine the utility-maximizing set of deliberations
the agent would perform, and the probability α(p) that the agent
accepts this price when she deliberates optimally. We denote the agent's
optimal expected utility when offered a price of p by u(p).
2. Note that f(p) = 1 − α(p) defines a cumulative distribution function on
[0, ∞).
3. Observe that this implies any SPP has the same expected revenue in the
deliberative setting as it does in the classical setting when agents' values are
drawn from the distribution with cumulative distribution function f(p) = 1 − α(p).¹⁰
4. Use known approximation results [9,10,11,18] that show how to derive approximately
optimal SPPs in the classical setting to derive an approximately
optimal SPP in the deliberative setting.

We apply this recipe to bounded deliberative agents for which we can efficiently
compute the distribution f(p) = 1 − α(p).
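For intuition, step 1 above can be carried out by backward induction on the (finite) deliberation tree. The sketch below is an illustrative model of ours, not the paper's: a choice node stores the current effective value E[F] and the available deliberations, each with a cost and a distribution over resulting child nodes, and `solve` returns the agent's optimal expected utility u(p) and acceptance probability α(p) at a posted price p.

```python
from dataclasses import dataclass, field

@dataclass
class Choice:
    mean: float                                  # effective value E[F] at this node
    delibs: list = field(default_factory=list)
    # each deliberation: (cost, [(probability, resulting child Choice), ...])

def solve(node, p):
    """Optimal expected utility u(p) and acceptance probability alpha(p)."""
    # Accept iff the effective value covers the price; rejecting yields 0.
    if node.mean >= p:
        best_u, best_a = node.mean - p, 1.0
    else:
        best_u, best_a = 0.0, 0.0
    for cost, outcomes in node.delibs:           # option: pay to deliberate
        u, a = -cost, 0.0
        for q, child in outcomes:
            cu, ca = solve(child, p)
            u += q * cu
            a += q * ca
        if u > best_u:
            best_u, best_a = u, a
    return best_u, best_a

# Prior U[0,1] (mean 0.5); for cost 0.05 the agent learns which half her value
# lies in, updating the effective value to 0.25 or 0.75 with equal probability.
root = Choice(0.5, [(0.05, [(0.5, Choice(0.25)), (0.5, Choice(0.75))])])
print(solve(root, 0.6))   # deliberating is optimal: u = 0.025, alpha = 0.5
print(solve(root, 0.3))   # accepting outright is optimal: u = 0.2, alpha = 1.0
```

Evaluating `solve` on a grid of prices traces out exactly the piecewise-linear convex u and the decreasing step function α of Lemma 5, and 1 − α(·) is the distribution fed to the classical SPP approximation results in step 4.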
Definition 4 (Bounded Deliberative Agent). A deliberative agent is
bounded if the following holds:
1. Every prior has bounded expectation, i.e., E[F^(t)] < ∞ for all i, t.
2. Every set of deliberative actions is finite, i.e., |D^(t)| < ∞ for all i, t.
3. Every deliberative action results in one of finitely many potential priors F^(t).
4. There is some finite T such that D^(t) = ∅ for all t ≥ T, i.e., no further
deliberation is possible.
We now state a lemma that contains the key insight for this result.
Lemma 5. The probability and utility functions α(p) and u(p) have the follow-
ing properties:
1. u is piecewise linear and convex.
2. α is a step function and decreasing.
3. If the agents are bounded, α and u can be constructed in polynomial time in
the size of the deliberation tree.

Proof. Consider the deliberative decision-making an agent faces when offered


service at a particular price p. We think of her as alternating between decisions
(deliberate, accept the price or reject it), and receiving the random results of a
deliberation. This process defines a deliberation tree (which is finite, under the
boundedness assumption above).
We call the decision nodes choice nodes; each has a corresponding tuple of
deliberative possibilities (F, D, c). From each choice node, the agent can proceed to
an accept or reject leaf. When such a node is selected, the deliberation process
terminates with the agent having effective value equal to the expected value of
his updated prior. Alternately, the agent can proceed to a deliberation node. One
such node exists for each potential deliberation d ∈ D, and proceeding to such
a node comes at cost c(d). The strategy for the agent at a choice node consists
of deciding which child to select. Obviously, under optimal play, the child that
yields the maximum expected utility is chosen.
At a deliberation node, the chosen deliberation d is performed. The children
of a deliberation node d are the set of possible (F  , D , c ) that can be returned
¹⁰ Clearly, the expected revenue of this SPP in the classical setting is generally less than
the expected revenue of the optimal mechanism in the classical setting for agents
with values drawn from these distributions.

when d is applied to F . Note that the agent’s expected utility at a deliberation


node is simply a convex combination of the expected utilities of its children.
Consider a specific deliberation subtree T . Let uT (p) be the agent’s optimal
expected utility conditioned on reaching the root of this subtree. Also let αT (p)
be the probability the agent accepts an offer of p conditioned on reaching the
root of this subtree when she uses an optimal deliberation strategy (from this
point onwards). The lemma states that for any T , uT (p) is a piecewise linear
and convex function of p, αT (p) is a decreasing step function in p, and that both
functions can be constructed in polynomial time in the size of the deliberation
tree. The proof is by induction on the height of the deliberation subtree.
Base Case: The base case is a single node, which is either an accept or a reject
node. For an accept node A reached via a sequence of deliberations d₁, . . . , d_t
with final prior F^(t), the probability of acceptance is 1 and u_A = E[F^(t)] −
p − ∑_{i=1}^t c(d_i). For a reject node, the probability of acceptance is 0 and u_R =
−∑_{i=1}^t c(d_i). In this case, all three propositions of the lemma hold trivially.
Inductive Step: Let us first consider the claims about u(p). By the inductive
hypothesis, for any tree of height h, the utility function is convex and piecewise
linear. A tree of height h + 1 is constructed by conjoining a set of m trees
{T1 , ..., Tm }, the maximum height of which is h, via a single root. The convexity
of the utility function is immediate from the fact that it is either a convex
combination of convex functions or the maximum of a set of convex functions.
When taking the convex combination of the children, the utility function of the
root has a breakpoint (a price where the utility function changes slope) for each
breakpoint a child node has, and thus if b(Ti ) is the number of breakpoints in
the utility curve for the child T_i, then the total number of breakpoints at the
root is |b(T)| ≤ Σ_{i=1}^m |b(T_i)|.
When the utility function at the root is the pointwise maximum of the utility
functions associated with the children, order the set of all breakpoints (at most
Σ_{i=1}^m |b(T_i)| of them) associated with any of the children of the root by their p value.
Within any of these intervals, the utility function is the maximum of m lines,
which can generate at most m − 1 new breakpoints. Thus, the total number of
breakpoints is at most |b(T)| ≤ m·Σ_{i=1}^m |b(T_i)|. Inductively, this implies that the
total time to compute the utility function is O(|T|²).
Now consider α(p). Let α_{T_i}(p) be the probability that the agent will accept
an offer at price p conditioned on reaching the root of T_i. The inductive hypothesis
is that α_{T_i}(p) = −u′_{T_i}(p), where we extend the derivative −u′_{T_i}(p) to all
breakpoints by right continuity.

If the root of T is a deliberation node, then in each interval in the partition
defined by the breakpoints the slope of the linear function representing u_T(p) is
the convex combination of the slopes of the individual utilities u_{T_i}(p). Therefore
u_T(p) = Σ_{i=1}^m q(T_i)·u_{T_i}(p) and α_T(p) = Σ_{i=1}^m q(T_i)·α_{T_i}(p), where q(T_i) is the
probability of the deliberation outcome associated with T_i. It then follows from
the inductive hypothesis that α_T(p) = −Σ_{i=1}^m q(T_i)·u′_{T_i}(p) = −u′_T(p).
492 L.E. Celis, D.C. Gklezakos, and A.R. Karlin
When T is a choice node, the acceptance probability for each p is precisely
that associated with the T_i that has maximum utility for that p, and we obtain
α_T(p) = α_{T_i}(p) = −u′_{T_i}(p) = −u′_T(p).

Finally, by convexity, for p < p′, we have u′_T(p′) ≥ u′_T(p), which, by the
previous equalities, implies that −α_T(p′) ≥ −α_T(p), or equivalently α_T(p′) ≤
α_T(p), concluding the proof. ⊓⊔
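To make the recursion concrete, here is a small self-contained sketch (our own toy illustration — the tree, prices, and function names `u`/`alpha` are invented, not from the paper). It evaluates u_T(p) and α_T(p) by the same case analysis and checks numerically that α_T is a decreasing step function equal to the negated slope of u_T inside each linear piece:

```python
# Toy deliberation tree: the agent can accept at the prior mean 0.6, reject,
# or pay 0.1 to learn whether her value is 0.3 or 0.9 (each with prob. 1/2).
def u(node, p):
    kind = node[0]
    if kind == 'accept':          # value estimate minus price minus sunk costs
        return node[1] - p - node[2]
    if kind == 'reject':          # only the sunk deliberation costs are lost
        return -node[1]
    if kind == 'choice':          # optimal play: best child
        return max(u(c, p) for c in node[1])
    return sum(q * u(c, p) for q, c in node[1])  # deliberation: convex combination

def alpha(node, p):
    kind = node[0]
    if kind == 'accept':
        return 1.0
    if kind == 'reject':
        return 0.0
    if kind == 'choice':          # acceptance probability of the utility-maximal child
        return alpha(max(node[1], key=lambda c: u(c, p)), p)
    return sum(q * alpha(c, p) for q, c in node[1])

c = 0.1
tree = ('choice', [
    ('accept', 0.6, 0.0),
    ('reject', 0.0),
    ('delib', [(0.5, ('choice', [('accept', 0.3, c), ('reject', c)])),
               (0.5, ('choice', [('accept', 0.9, c), ('reject', c)]))]),
])

# For this tree α is 1 below p = 0.5, then 0.5 on (0.5, 0.7), then 0,
# and −u' matches α inside each linear piece (test points avoid the kinks).
for p in (0.2, 0.6, 0.8):
    h = 1e-3
    slope = (u(tree, p + h) - u(tree, p - h)) / (2 * h)
    assert abs(alpha(tree, p) + slope) < 1e-9
```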
The following lemma is then immediate.
Lemma 6. Let M be any SPP in a single-parameter deliberative setting. Then
the expected revenue of M in this deliberative setting is equal to the expected
revenue of M in the classical setting with agents whose values v are drawn from
the distribution F (v) = 1 − α(v).
This gives us a direct connection between the deliberative and classical
settings, and, crucially, allows us to apply results from the classical setting.
Specifically, using the fact that optimal mechanisms in the classical settings are
well-approximated by SPPs [9,10,11,18] and that SPPs are optimal for deliber-
ative settings, we obtain the following theorem.
Theorem 2 (Constructive SPP 2-Approximation). Consider a collection
of bounded deliberative agents. An SPP that 2-approximates the revenue of op-
timal deterministic deliberative mechanisms for public communication matroid
environments can be constructed in polynomial time.
5 Bayes-Nash Incentive Compatible Mechanisms
In this section, we consider the design of Bayes-Nash incentive compatible (BIC)
mechanisms in the following simplified setting.
Definition 5. A public communication, single deliberation environment is a set-
ting where an agent’s value is drawn from a distribution with continuous and
atomless density function f (·), and each agent has a single deliberative possibil-
ity after which they learn their exact value v.¹¹
Suppose that an agent is asked to deliberate at some point during the execution of
the mechanism. Denote by a(v) the probability that the agent receives the allocation
when her value is v.¹² Recall we assume a public communication model, hence the
agent knows the reports of all agents that deliberated before her. Her expected
payment p(v) can be characterized as in the classical setting [23] and her utility
is u(v) = a(v)v − p(v).
Note that an SDM is Bayes-Nash incentive compatible (BIC) if and only if
each agent’s expected utility is maximized by complying with the requests of the
mechanism whenever other agents are also compliant. We provide a characteri-
zation for BIC mechanisms.
¹¹ Note that a mechanism will ask an agent to deliberate at most once.
¹² This probability is taken over the values of all agents asked to deliberate at the same time or later than this agent.

Proposition 1. Consider a set of single-parameter agents in a single deliberation
environment. A simple deliberative mechanism is BIC if and only if for
each agent i:
1. If i is asked to deliberate then
   (a) a′_i(v) ≥ 0, with payment rule as in the classical setting.
   (b) ∫_0^∞ ( ∫_0^{v_i} a_i(x)dx ) f_i(v_i)dv_i ≥ ∫_0^μ a_i(x)dx + c_i.
2. If i is offered the item at price e without being asked to deliberate, then
   ∫_0^e F_i(v)dv ≤ c_i.

Proof. For notational simplicity, we present the proof when F_i = F and c_i = c for all i.
Condition 1a: Follows as in the classical setting [23] after deliberation.
Condition 1b: In the last case, we have to consider what the agent could
gain from this action. The utility of an agent in BNE with allocation probability
a(x) that performs a deliberation is u(v) = ∫_0^v a(x)dx − c. Given that v ∼ F, the
expected utility u_d of an agent that performs a deliberation with cost c is:

u_d = ∫_0^∞ ( ∫_0^v a(x)dx ) f(v)dv − c.
The utility of a player that does not deliberate and reports valuation w is:

u_E(w) = μ·a(w) − p(w) = μ·a(w) − w·a(w) + ∫_0^w a(x)dx,

where μ = E[F]. Observing that u_E is maximized at w = μ, and simplifying the
constraint that u_d ≥ u_E(w), we obtain

∫_0^∞ ( ∫_0^v a(x)dx ) f(v)dv ≥ ∫_0^μ a(x)dx + c.

Condition 2: If the mechanism offers the item to the agent and expects her to
take it without deliberation at price e, it must be that her utility u_E = μ − e is
greater than the utility she could obtain from deliberating, that is:

μ − e ≥ ∫_e^∞ (v − e) f(v)dv − c,   which after simplification is   ∫_0^e F(v)dv ≤ c.   ⊓⊔
To give an example, consider a single item auction in the classical setting, with
two agents whose values are drawn uniformly on [0, 1]. Hence, the revenue-optimal
mechanism is a Vickrey auction with reserve price 1/2, which achieves
an expected revenue of 5/12. Note that this auction is dominant strategy truthful.
In the deliberative setting this is no longer the case, since that would require
that, ex post, an agent has "no regrets". However, a deliberative agent will regret
having paid a deliberation cost c if she ends up losing. It follows, though, from
Proposition 1 that VCG is BIC for c < 1/12. It also follows from condition 2
that an agent will take the item without deliberating at a price up to √(2c). Thus,
when c = 1/12 − ε the following mechanism raises more revenue than VCG:

– Approach agent 1, ask her to deliberate and report her value v_1. If it is above
  p* = 1 − √(2c), sell her the item at price p*.
– Otherwise, approach agent 2, and offer her the item at price √(2c) without
  deliberation.
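As a numerical sanity check of this example (our own sketch; the function names and the Monte Carlo estimate are ours, not from the paper), the following compares the classical Vickrey revenue with the exact expected revenue of the two-step mechanism above:

```python
import random

def vickrey_reserve_revenue(r, trials=200_000, seed=1):
    """Monte Carlo: expected revenue of a Vickrey auction with reserve r
    and two i.i.d. U[0,1] bidders (classical, non-deliberative setting)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        v1, v2 = rng.random(), rng.random()
        hi, lo = (v1, v2) if v1 >= v2 else (v2, v1)
        if hi >= r:                 # a sale happens only above the reserve
            total += max(r, lo)     # winner pays max(reserve, second value)
    return total / trials

def sequential_revenue(c):
    """Exact expected revenue of the two-step mechanism sketched above:
    agent 1 deliberates and buys at p* = 1 - sqrt(2c) when v1 > p*;
    otherwise agent 2 takes the item at sqrt(2c) without deliberating."""
    q = (2 * c) ** 0.5              # price agent 2 accepts without deliberating
    p_star = 1 - q
    return (1 - p_star) * p_star + p_star * q   # P(v1 > p*) = q = 1 - p*

c = 1 / 12 - 1e-4                   # deliberation cost just below the threshold 1/12
classical = vickrey_reserve_revenue(0.5)       # ≈ 5/12 ≈ 0.4167
deliberative = sequential_revenue(c)           # ≈ 0.483
assert deliberative > classical
```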
Surprisingly, the problem of designing an optimal mechanism, even for 2 agents
and a single item, seems to be difficult.

6 Future Work
We view this as very preliminary work in the setting of deliberative environ-
ments; numerous open problems remain. In the specific model studied, directions
for future research include understanding other communication models, and the
power of randomization. Beyond dominant strategies, revenue maximization us-
ing other solution concepts is wide open. It would also be interesting to study
objectives other than revenue maximization. Finally, it would be interesting to
derive “price of anarchy” style results that compare optimal revenue in deliber-
ative and non-deliberative settings.

Acknowledgements. We would like to thank Kevin Leyton-Brown and Dave
Thompson for introducing us to the fascinating setting of deliberative agents.

References
1. Babaioff, M., Kleinberg, R., Leme, R.P.: Optimal mechanisms for selling informa-
tion. In: 12th International World Wide Web Conference (2012)
2. Bergemann, D., Valimaki, J.: Information acquisition and efficient mechanism de-
sign. Econometrica 70(3) (2002)
3. Bergemann, D., Valimaki, J.: Information acquisition and efficient mechanism de-
sign. Econometrica 70(3) (2002)
4. Bikhchandani, S.: Information acquisition and full surplus extraction. Journal of
Economic Theory (2009)
5. Cavallo, R., Parkes, D.C.: Efficient metadeliberation auctions. In: AAAI, pp. 50–56
(2008)
6. Celis, L.E., Gklezakos, D.C., Karlin, A.R.: On revenue maximization for agents
with costly information acquisition (2013), Full version:
http://homes.cs.washington.edu/~gklezd/publications/deliberative.pdf
7. Celis, L.E., Karlin, A., Leyton-Brown, K., Nguyen, T., Thompson, D.: Approxi-
mately revenue-maximizing mechanisms for deliberative agents. In: Association for
the Advancement of Artificial Intelligence (2011)
8. Chakraborty, I., Kosmopoulou, G.: Auctions with endogenous entry. Economic Letters 72(2) (2001)
9. Chawla, S., Hartline, J., Kleinberg, R.: Algorithmic pricing via virtual valuations.
In: Proc. 9th ACM Conf. on Electronic Commerce (2007)
10. Chawla, S., Hartline, J., Malec, D., Sivan, B.: Sequential posted pricing and multi-
parameter mechanism design. In: Proc. 41st ACM Symp. on Theory of Computing
(2010)

11. Chawla, S., Malec, D., Sivan, B.: The power of randomness in bayesian optimal
mechanism design. In: ACM Conference on Electronic Commerce, pp. 149–158
(2010)
12. Compte, O., Jehiel, P.: Auctions and information acquisition: Sealed-bid or dy-
namic formats? Levine’s Bibliography 784828000000000495, UCLA Department of
Economics (October 2005)
13. Cramer, J., Spiegel, Y., Zheng, C.: Optimal selling mechanisms with costly information acquisition. Technical report (2003)
14. Cremer, J., McLean, R.P.: Full extraction of surplus in bayesian and dominant
strategy auctions. Econometrica 56(6) (1988)
15. Gibbard, A.: Manipulation of voting schemes: a general result. Econometrica 41,
211–215 (1973)
16. Hartline, J.: Lectures on approximation and mechanism design. Lecture notes
(2012)
17. Hartline, J., Karlin, A.: Profit maximization in mechanism design. In: Nisan, N.,
Roughgarden, T., Tardos, É., Vazirani, V. (eds.) Algorithmic Game Theory, ch.
13, pp. 331–362. Cambridge University Press (2007)
18. Kleinberg, R., Weinberg, S.M.: Matroid prophet inequalities. In: Symposium on
Theoretical Computer Science (2012)
19. Larson, K.: Reducing costly information acquisition in auctions. In: AAMAS,
pp. 1167–1174 (2006)
20. Larson, K., Sandholm, T.: Strategic deliberation and truthful revelation: an impos-
sibility result. In: ACM Conference on Electronic Commerce, pp. 264–265 (2004)
21. Lavi, R., Swamy, C.: Truthful and near-optimal mechanism design via linear pro-
gramming. In: Proc. 46th IEEE Symp. on Foundations of Computer Science (2005)
22. Levin, D., Smith, J.L.: Equilibrium in auctions with entry. American Economic
Review 84, 585–599 (1994)
23. Myerson, R.: Optimal auction design. Mathematics of Operations Research 6,
58–73 (1981)
24. Thompson, D.R., Leyton-Brown, K.: Dominant-strategy auction design for agents
with uncertain, private values. In: Twenty-Fifth Conference of the Association for
the Advancement of Artificial Intelligence, AAAI 2011 (2011)
Price of Stability in Polynomial Congestion Games

George Christodoulou and Martin Gairing

Department of Computer Science, University of Liverpool, U.K.
Abstract. The Price of Anarchy in congestion games has attracted a
lot of research over the last decade. This resulted in a thorough under-
standing of this concept. In contrast the Price of Stability, which is an
equally interesting concept, is much less understood.
In this paper, we consider congestion games with polynomial cost
functions with nonnegative coefficients and maximum degree d. We give
matching bounds for the Price of Stability in such games, i.e., our tech-
nique provides the exact value for any degree d.
For linear congestion games, tight bounds were previously known.
Those bounds hold even for the more restricted case of dominant equi-
libria, which may not exist. We give a separation result showing that
already for congestion games with quadratic cost functions this is not
possible; that is, the Price of Anarchy for the subclass of games that
admit a dominant strategy equilibrium is strictly smaller than the Price
of Stability for the general class.

1 Introduction

During the last decade, the quantification of the inefficiency of game-theoretic
equilibria has been a popular and successful line of research. The two most widely
adopted measures for this inefficiency are the Price of Anarchy (PoA) [17] and
the Price of Stability (PoS) [3].
Both concepts compare the social cost in a Nash equilibrium to the optimum
social cost that could be achieved via central control. The PoA is pessimistic
and considers the worst-case such Nash equilibrium, while the PoS is optimistic
and considers the best-case Nash equilibrium. Therefore, the PoA can be used
as an absolute worst-case guarantee in a scenario where we have no control over
equilibrium selection. On the other hand, the PoS gives an estimate of what
is the best we can hope for in a Nash equilibrium; for example, if the players
collaborate to find the optimal Nash equilibrium, or if a trusted mediator suggests
this solution to them. Moreover, it is a much more accurate measure for those
instances that possess unique Nash equilibria.
Congestion games [20] have been a driving force in recent research on these
inefficiency concepts. In a congestion game, we are given a set of resources and
each player selects a subset of them (e.g. a path in a network). Each resource
⋆ This work was supported by EPSRC grants EP/K01000X/1 and EP/J019399/1.
F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 496–507, 2013.
© Springer-Verlag Berlin Heidelberg 2013

has a cost function that only depends on the number of players that use it.
Each player aspires to minimise the sum of the resources’ costs in its strategy
given the strategies chosen by the other players. Congestion games always admit
a pure Nash equilibrium [20], where players pick a single strategy and do not
randomize. Rosenthal [20] showed this by means of a potential function having
the following property: if a single player deviates to a different strategy then the
value of the potential changes by the same amount as the cost of the deviating
player. Pure Nash equilibria correspond to local optima of the potential function.
Games admitting such a potential function are called potential games and every
potential game is isomorphic to a congestion game [19].
Today we have a strong theory which provides a thorough understanding of
the PoA in congestion games [1,4,5,11,21]. This theory includes the knowledge
of the exact value of the PoA for games with linear [4,11] and polynomial [1] cost
functions, a recipe for computing the PoA for general classes of cost functions
[21], and an understanding of the “complexity” of the strategy space required to
achieve the worst case PoA [5].
In contrast, we still only have a very limited understanding of the Price of
Stability (PoS) in congestion games. Exact values for the PoS are only known
for congestion games with linear cost functions [9,13] and certain network cost
sharing games [3]. The reason for this is that there are more considerations when
bounding the PoS as compared to bounding the PoA. For example, for linear
congestion games, the techniques used to bound the PoS are considerably more
involved than those used to bound the PoA.
A fundamental concept in the design of games is the notion of a dominant-
strategy equilibrium. In such an equilibrium each player chooses a strategy which
is better than any other strategy no matter what the other players do. It is well-
known that such equilibria do not always exist, as the requirements imposed are
too strong. However, it is appealing for a game designer, as it makes outcome
prediction easy. It also simplifies the strategic reasoning of the players and is
therefore an important concept in mechanism design. If we restrict to instances
where such equilibria exist, it is natural to ask how inefficient those equilibria
can be. Interestingly, for linear congestion games, they can be as inefficient as
the PoS [9,13,14].

1.1 Contribution and High-Level Idea
Results. In this paper we study the fundamental class of congestion games with
polynomial cost functions of maximum degree d and nonnegative coefficients.
Our main result reduces the problem of finding the value of the Price of Sta-
bility to a single-parameter optimization problem. It can be summarized in the
following theorem (which combines Theorem 2 and Theorem 3).
Theorem 1. For congestion games with polynomial cost functions with maxi-
mum degree d and nonnegative coefficients, the Price of Stability is given by
PoS = max_{r>1} [ (2^d·d + 2^d − 1)·r^{d+1} − (d + 1)·r^d + 1 ] / [ (2^d + d − 1)·r^{d+1} − (d + 1)·r^d + 2^d·d − d + 1 ].

For any degree d, this gives the exact value of the Price of Stability. For example,
for d = 1 and d = 2, we get
max_{r>1} (3r² − 2r + 1)/(2r² − 2r + 2) = 1 + √3/3 ≈ 1.577   and   max_{r>1} (11r³ − 3r² + 1)/(5r³ − 3r² + 7) ≈ 2.36,

respectively. The PoS converges to d + 1 for large d.
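The maximization in Theorem 1 is easy to evaluate numerically. The sketch below (our own illustration; the function names are invented) reproduces the quoted values by a crude grid search:

```python
from math import sqrt

def pos_objective(d, r):
    """The maximand from Theorem 1 for degree-d polynomial congestion games."""
    num = (2**d * d + 2**d - 1) * r**(d + 1) - (d + 1) * r**d + 1
    den = (2**d + d - 1) * r**(d + 1) - (d + 1) * r**d + 2**d * d - d + 1
    return num / den

def pos_value(d, r_max=50.0, steps=100_000):
    # crude grid search over r > 1; the maximizer sits at moderate r
    return max(pos_objective(d, 1.0 + (r_max - 1.0) * i / steps)
               for i in range(1, steps + 1))

assert abs(pos_value(1) - (1 + sqrt(3) / 3)) < 1e-3   # linear costs: ≈ 1.577
assert abs(pos_value(2) - 2.36) < 1e-2                # quadratic costs: ≈ 2.36
```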
We further show that in contrast to linear congestion games [13,14], already
for d = 2, there is no instance which admits a dominant strategy equilibrium and
achieves this value. More precisely, we show in Theorem 4 that for the subclass
of games that admit a dominant strategy equilibrium the Price of Anarchy is
strictly smaller than the Price of Stability for the general class.

Upper Bound Techniques. Both finding upper and lower bounds for the PoS
seem to be a much more complicated task than bounding the PoA. For the PoA
of a class of games, one needs to capture the worst-case example of any Nash
equilibrium, and the PoA methodology has been heavily based on this fact. On
the other hand, for the PoS of the same class one needs to capture the worst-
case instance of the best Nash equilibrium. So far, we do not know a useful
characterization of the set of best-case Nash equilibria. It is not straightforward
to transfer the techniques for the PoA to solve the respective PoS problem.
A standard approach that has been followed for upper bounding the PoS can
be summarised as follows:
1. Define a restricted subset R of Nash equilibria.
2. Find the Price of Anarchy with respect to Nash equilibria that belong in R.
The above recipe introduces new challenges: What is a good choice for R, and
more importantly, how can we incorporate the description of R in the Price of
Anarchy methodology? For example, if R is chosen to be the set of all Nash
equilibria, then one obtains the PoA bound. Finding an appropriate restriction
is a non-trivial task and might depend on the nature of the game, so attempts
vary in the description level of R from natural, “as the set of equilibria with
optimum potential”, to the rather more technical definitions like “the equilibria
that can be reached from a best-response path starting from an optimal setup”.
Like previous work (see for example [3,6,9,13,14]) we consider the PoA of Nash
equilibria with minimum potential (or in fact with potential smaller than the
one achieved in the optimum).
Then we use a linear combination of two inequalities, which are derived from
the potential and the Nash equilibrium conditions, respectively. Using only the
Nash inequality gives the PoA value [1]. Using only the potential inequality gives
an upper bound of d + 1. The question is what is the best way to combine these
inequalities to obtain the minimum possible upper bound? Caragiannis et al. [9]
showed how to do this for linear congestion games. Our analysis shows how to
combine them optimally for all polynomials (cf. parameter ν̂ in Definition 3).
The main technical challenge is to extend the techniques used for proving
upper bounds for the PoA [1,11,21]. In general those techniques involve optimiz-
ing over two parameters λ, μ such that the resulting upper bound on the PoA

is minimized and certain technical conditions are satisfied – Roughgarden [21]
refers to those conditions as (λ, μ)-smoothness. The linear combination of the
two inequalities mentioned above adds a third parameter ν, which makes the
analysis much more involved.

Lower Bound Techniques. Proving lower bounds for the PoA and PoS is usually
done by constructing specific classes of instances. However, there is a conceptual
difference: Every Nash equilibrium provides a lower bound on the PoA, while for
the PoS we need to give a Nash equilibrium and prove that this is the best Nash
equilibrium. To guarantee optimality, the main approach is based on constructing
games with unique equilibria. One way to guarantee this is to define a game
with a dominant-strategy equilibrium. This approach gives tight lower bounds
in congestion games with linear cost functions [13,14]. Recall, that our separation
result (Theorem 4) shows that, already for d = 2, dominant-strategy equilibria
will not give us a tight lower bound. Thus, we use a different approach. We
construct an instance with a unique Nash equilibrium and show this by using
an inductive argument (Lemma 1).
The construction of our lower bound was governed by the inequalities used
in the proof of the upper bound. At an abstract level, we have to construct an
instance that uses the cost functions and loads on the resource that make all
used inequalities tight. This is not an easy task as there are many inequalities:
most prominently, one derived from the Nash equilibrium condition, one from
the potential, and a third one that upper bounds a linear combination of them
(see Proposition 1). To achieve this we had to come up with a completely novel
construction.

1.2 Related Work

The term Price of Stability was introduced by Anshelevich et al. [3] for a network
design game, which is a congestion game with special decreasing cost functions.
For such games with n players, they showed that the Price of Stability is exactly
Hn , i.e., the n’th harmonic number. For the special case of undirected networks,
the PoS is known to be strictly smaller than Hn [15,7,12,3], but while the best
general upper bound [15] is close to Hn , the best current lower bound is a con-
stant [8]. For special cases, better upper bounds can be achieved. Li [18] showed an
upper bound of O(log n/ log log n) when the players share a common sink, while
Fiat et al. [16] showed a better upper bound of O(log log n) when in addition
there is a player in every vertex of the network. Chen and Roughgarden [10]
studied the PoS for the weighted variant of this game, where each player pays for
a share of each edge cost proportional to her weight, and Albers [2] showed that
the PoS is Ω(log W/ log log W ), where W is the sum of the players’ weights.
The PoS has also been studied in congestion games with increasing cost functions.
For linear congestion games, the PoS is equal to 1 + √3/3 ≈ 1.577, where
the lower bound was shown in [13] and the upper bound in [9]. Bilò [6] showed
upper bounds on the PoS of 2.362 and 3.322 for congestion games with quadratic

and cubic functions respectively. He also gives non-matching lower bounds, which
are derived from the lower bound for linear cost functions in [14].
Awerbuch et al. [4] and Christodoulou and Koutsoupias [11] showed that the
PoA of congestion games with linear cost functions is 5/2. Aland et al. [1] obtained
the exact value on the PoA for polynomial cost functions. Roughgarden’s [21]
smoothness framework determines the PoA with respect to any set of allowable
cost functions. These results have been extended to the more general class of
weighted congestion games [1,4,5,11].

Roadmap. The rest of the paper is organized as follows. In Section 2 we introduce
polynomial congestion games. In Section 3 and 4, we present our matching lower
and upper bounds on the PoS. We present a separation result in Section 5. Due
to space constraints, some of the proofs are deferred to the full version of this
paper.

2 Definitions
For any positive integer k ∈ N, denote [k] = {1, . . . , k}. A congestion game [20]
is a tuple (N, E, (Si )i∈N , (ce )e∈E ), where N = [n] is a set of n players and E is
a set of resources. Each player chooses as her pure strategy a set si ⊆ E from a
given set of available strategies Si ⊆ 2E . Associated with each resource e ∈ E is
a nonnegative cost function ce : N → R+ . In this paper we consider polynomial
cost functions with maximum degree d and nonnegative coefficients; that is, every
cost function is of the form c_e(x) = Σ_{j=0}^d a_{e,j}·x^j with a_{e,j} ≥ 0 for all j.
A pure strategy profile is a choice of strategies s = (s1 , s2 , ...sn ) ∈ S = S1 ×
· · ·×Sn by players. We use the standard notation s−i = (s1 , . . . , si−1 , si+1 , . . . sn ),
S−i = S1 × · · · × Si−1 × Si+1 × · · · × Sn , and s = (si , s−i ). For a pure strategy
profile s define the load n_e(s) = |{i ∈ N : e ∈ s_i}| as the number of players that
use resource e. The cost for player i is defined by C_i(s) = Σ_{e∈s_i} c_e(n_e(s)).
Definition 1. A pure strategy profile s is a pure Nash equilibrium if and only
if for every player i ∈ N and for all si ∈ Si , we have Ci (s) ≤ Ci (si , s−i ).
Definition 2. A pure strategy profile s is a (weakly) dominant strategy equilib-
rium if and only if for every player i ∈ N and for all si ∈ Si and s−i ∈ S−i , we
have Ci (s) ≤ Ci (si , s−i ).
The social cost of a pure strategy profile s is the sum of the players' costs

SC(s) = Σ_{i∈N} C_i(s) = Σ_{e∈E} n_e(s)·c_e(n_e(s)).

Denote opt = mins SC(s) as the optimum social cost over all strategy profiles
s ∈ S. The Price of Stability of a congestion game is the social cost of the
best-case Nash equilibrium over the optimum social cost
PoS = min_{s is a Nash equilibrium} SC(s) / opt.

The PoS for a class of games is the largest PoS among all games in the class.
For a class of games that admit dominant strategy equilibria, the Price of
Anarchy of dominant strategies, dPoA, is the worst case ratio (over all games)
between the social cost of the dominant strategies equilibrium and the optimum
social cost.

Congestion games admit a potential function Φ(s) = Σ_{e∈E} Σ_{j=1}^{n_e(s)} c_e(j), which
was introduced by Rosenthal [20] and has the following property: for any two
strategy profiles s and (si , s−i ) that differ only in the strategy of player i ∈ N , we
have Φ(s) − Φ(si , s−i ) = Ci (s) − Ci (si , s−i ). Thus, the set of pure Nash equilibria
correspond to local optima of the potential function. More importantly, there
exists a pure Nash equilibrium s such that

Φ(s) ≤ Φ(s′) for all s′ ∈ S.   (1)
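Rosenthal's argument is constructive: any sequence of improving moves strictly decreases Φ and therefore terminates at a pure Nash equilibrium. A minimal sketch of this dynamic (our own toy instance; the function names are invented):

```python
def loads(profile):
    load = {}
    for strat in profile:
        for e in strat:
            load[e] = load.get(e, 0) + 1
    return load

def player_cost(i, profile, cost_fns):
    load = loads(profile)
    return sum(cost_fns[e](load[e]) for e in profile[i])

def potential(profile, cost_fns):
    # Rosenthal's potential: sum over resources e of c_e(1) + ... + c_e(n_e(s))
    return sum(sum(cost_fns[e](j) for j in range(1, n + 1))
               for e, n in loads(profile).items())

def better_response_dynamics(strategies, cost_fns, profile):
    """Each improving move decreases the potential by exactly the mover's
    saving, so this loop terminates at a pure Nash equilibrium."""
    improved = True
    while improved:
        improved = False
        for i, S_i in enumerate(strategies):
            for s in S_i:
                trial = profile[:i] + (s,) + profile[i + 1:]
                if player_cost(i, trial, cost_fns) < player_cost(i, profile, cost_fns):
                    assert potential(trial, cost_fns) < potential(profile, cost_fns)
                    profile, improved = trial, True
    return profile

# Two players, two identical linear resources: dynamics ends with them split.
cost_fns = {'a': lambda x: x, 'b': lambda x: x}
strategies = [(('a',), ('b',)), (('a',), ('b',))]
final = better_response_dynamics(strategies, cost_fns, (('a',), ('a',)))
assert sorted(s[0] for s in final) == ['a', 'b']
```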

3 Lower Bound
In this section we use the following instance to show a lower bound on PoS.
Example 1. Given nonnegative integers n, k and d, define a congestion game as
follows:

– The set of resources E is partitioned into E = A ∪ B ∪ {Γ}, where A consists of
  n resources A = {A_i | i ∈ [n]}, B consists of n(n − 1) resources B = {B_ij | i, j ∈ [n], i ≠ j},
  and Γ is a single resource.
– All cost functions are monomials of degree d given as follows:
  • For i ∈ [n] the cost of resource A_i is given by c_{A_i}(x) = α_i·x^d, where
    α_i = (k + i)^d + ε for sufficiently small ε > 0.
  • Denote T_i = ((k + i)^d − (k + i − 1)^d) / (2^{2d} − 1). Resource B_ij with i, j ∈ [n], i ≠ j
    has cost c_{B_ij}(x) = β_ij·x^d, where β_ij = T_j if i < j and β_ij = 2^d·T_i if i > j.
  • For resource Γ we have c_Γ(x) = x^d.
– There are n + k players. Each player i ∈ [n] has two strategies s_i, s*_i, where

  s_i = {Γ} ∪ {B_ij | j ∈ [n], j ≠ i},   and
  s*_i = {A_i} ∪ {B_ji | j ∈ [n], j ≠ i}.

  The remaining players i ∈ [n + 1, n + k] are fixed to choose the single resource
  Γ. To simplify notation denote by s = (s_1, . . . , s_n) and s* = (s*_1, . . . , s*_n) the
  corresponding strategy profiles. Those profiles correspond to the unique Nash
  equilibrium and to the optimal allocation, respectively.
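For small parameters the construction can be instantiated and checked by brute force. The sketch below is our own encoding of Example 1 (reading the extracted costs as α_i = (k+i)^d + ε and β_ij = T_j for i < j, 2^d·T_i for i > j); it enumerates all 2^n profiles with exact rational arithmetic:

```python
from fractions import Fraction
from itertools import product

def example1_nash_equilibria(n, k, d, eps=Fraction(1, 10**6)):
    """All pure NE over the n strategic players of Example 1; in a profile,
    0 encodes s_i (resource Γ plus the B_ij) and 1 encodes s*_i."""
    T = {i: Fraction((k + i)**d - (k + i - 1)**d, 2**(2 * d) - 1)
         for i in range(1, n + 1)}

    def strategy(i, star):
        if star:
            return {('A', i)} | {('B', j, i) for j in range(1, n + 1) if j != i}
        return {('G',)} | {('B', i, j) for j in range(1, n + 1) if j != i}

    def resource_cost(res, x):
        if res[0] == 'G':                                  # c_Γ(x) = x^d
            return Fraction(x**d)
        if res[0] == 'A':                                  # α_i = (k+i)^d + ε
            return (Fraction((k + res[1])**d) + eps) * x**d
        i, j = res[1], res[2]
        beta = T[j] if i < j else 2**d * T[i]
        return beta * x**d

    def cost(i, profile):
        load = {('G',): k}                                 # k players fixed on Γ
        for p in range(1, n + 1):
            for res in strategy(p, profile[p - 1]):
                load[res] = load.get(res, 0) + 1
        return sum(resource_cost(res, load[res]) for res in strategy(i, profile[i - 1]))

    def is_ne(profile):
        return all(
            cost(i, profile) <= cost(i, profile[:i - 1] + (1 - profile[i - 1],) + profile[i:])
            for i in range(1, n + 1))

    return [pr for pr in product((0, 1), repeat=n) if is_ne(pr)]

assert example1_nash_equilibria(3, 2, 2) == [(0, 0, 0)]   # s is the unique NE
```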

In the following lemma we show that s is the unique Nash equilibrium for the
game in Example 1. To do so, we show that s1 is a dominant strategy for player 1
and that given that the first i − 1 players play s1 , . . . , si−1 , then si is a dominant
strategy for player i ∈ [n].

Lemma 1. In the congestion game from Example 1, s is the unique Nash equilibrium.

We use the instance from Example 1 to show the lower bound in the following
theorem. We define ρ = n/k and r = (k + n)/k = 1 + ρ > 1. We let n → ∞ and
determine the r > 1 which maximises the resulting lower bound¹. Note that
r > 1 is the ratio of the loads on resource Γ in Example 1.

Theorem 2. For congestion games with polynomial cost functions with maximum
degree d and nonnegative coefficients, we have

PoS ≥ max_{r>1} [ (2^d·d + 2^d − 1)·r^{d+1} − (d + 1)·r^d + 1 ] / [ (2^d + d − 1)·r^{d+1} − (d + 1)·r^d + 2^d·d − d + 1 ].   (2)

4 Upper Bound
In this section we show an upper bound on the PoS for polynomial congestion
games. We start with two technical lemmas and a definition, all of which will be
used in the proof of Proposition 1. This proposition is the most technical part of
the paper. It shows an upper bound on a linear combination of two expressions;
one is derived from the Nash equilibrium condition and the other one from the
potential. Equipped with this, we prove our upper bound in Theorem 3.
Lemma 2. Let f be a nonnegative and convex function. Then for all nonnegative
integers x, y with x ≥ y, Σ_{i=y+1}^x f(i) ≥ ∫_y^x f(t)dt + ½(f(x) − f(y)).
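For monomials the integral in Lemma 2 has a closed form, so the inequality can be spot-checked directly (our own sanity check; the helper name is invented):

```python
def lemma2_gap(t, x, y):
    """LHS minus RHS of Lemma 2 for the convex monomial f(i) = i**t
    and integers x >= y >= 0; nonnegative by the lemma."""
    lhs = sum(i**t for i in range(y + 1, x + 1))
    rhs = (x**(t + 1) - y**(t + 1)) / (t + 1) + (x**t - y**t) / 2
    return lhs - rhs

# nonnegative for every monomial degree and every integer pair x >= y
assert all(lemma2_gap(t, x, y) >= -1e-9
           for t in range(5) for y in range(6) for x in range(y, 12))
```

(For t = 0 and t = 1 the bound holds with equality, which is a quick way to see it is tight.)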

Definition 3. Define ν̂ as the minimum ν such that

f(ν) := ( 2^d + (d − 1)(1 − 1/r^{d+1}) − 1/r )·ν − d·(1 − 1/r^{d+1}) ≥ 0

for all r > 1.

Observe that for all d ≥ 1 and r > 1, f(ν) is a monotone increasing function in ν.
Thus ν̂ ∈ (0, 1] is well defined since f(0) < 0 and f(1) > 0 for all r > 1. Moreover,
f(ν) ≥ 0 for all ν ≥ ν̂. We will make use of the following bounds on ν̂.
Lemma 3. Define ν̂ as in Definition 3. Then d/(2^d + d − 1) ≤ ν̂ < (d + 1)/(2^d + d − 1).

Proposition 1. Let 1 ≥ ν ≥ ν̂ and define λ = d + 1 − dν and μ = (2^d + d − 1)ν − d.
Then for all polynomial cost functions c with maximum degree d and nonnegative
coefficients and for all nonnegative integers x, y we have

ν·y·c(x + 1) + (1 − ν)(d + 1)·[ Σ_{i=1}^y c(i) − Σ_{i=1}^x c(i) ] ≤ (μ + ν − 1)·x·c(x) + λ·y·c(y).
¹ Notice that the value r that optimizes the right-hand side expression of (2) might
not be rational. The lower bound is still valid as we can approximate an irrational r
arbitrarily closely by a rational.

Proof. Since c is a polynomial cost function with maximum degree d and nonnegative
coefficients, it is sufficient to show the claim for all monomials of degree
t where 0 ≤ t ≤ d. Thus, we will show that

(μ + ν − 1)·x^{t+1} + λ·y^{t+1} − ν·y(x + 1)^t + (1 − ν)(d + 1)·[ Σ_{i=1}^x i^t − Σ_{i=1}^y i^t ] ≥ 0   (3)

for all nonnegative integers x, y and degrees 0 ≤ t ≤ d.
Fix some 0 ≤ t ≤ d. First observe that (3) is trivially fulfilled for y = 0, as all
the negative terms disappear. So in the following we assume y ≥ 1.
Elementary calculations show that (3) holds when 0 ≤ x ≤ y. So in the
following we assume x > y ≥ 1. By Lemma 2, we have

Σ_{i=1}^x i^t − Σ_{i=1}^y i^t = Σ_{i=y+1}^x i^t ≥ (x^{t+1} − y^{t+1})/(t + 1) + ½(x^t − y^t)
                              ≥ (x^{t+1} − y^{t+1})/(d + 1) + ½(x^t − y^t).   (4)
Moreover, since x ≥ 2 we can bound
\[ (x+1)^t \;=\; \sum_{i=0}^{t}\binom{t}{i}x^{t-i} \;\le\; x^t + x^{t-1}\sum_{i=1}^{t}\binom{t}{i}\Bigl(\frac{1}{2}\Bigr)^{i-1} \;=\; x^t + x^{t-1}\cdot 2\Bigl[\Bigl(\frac{3}{2}\Bigr)^{t}-1\Bigr]. \qquad (5) \]
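Inequality (5) can be checked directly for small x and t (a quick numerical sketch; note that it is tight at x = 2):

```python
def bound_5_holds(x, t):
    """(x+1)^t <= x^t + x^(t-1) * 2 * ((3/2)^t - 1), valid for all x >= 2."""
    return (x + 1) ** t <= x ** t + x ** (t - 1) * 2 * ((3 / 2) ** t - 1) + 1e-9

assert all(bound_5_holds(x, t) for x in range(2, 50) for t in range(0, 12))
```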
Using (4) and (5) and by defining r = x/y > 1, we can lower bound the left-hand-side of (3) by
\begin{align*}
&\mu\cdot x^{t+1} + (\lambda+\nu-1)\cdot y^{t+1} - \nu\cdot y(x+1)^t + \tfrac{1}{2}(1-\nu)(d+1)\bigl(x^t-y^t\bigr) \\
&\quad\ge \Bigl(\mu+\frac{\lambda+\nu-1}{r^{t+1}}\Bigr)\cdot x^{t+1} - \frac{\nu}{r}\cdot\Bigl(x^{t+1}+x^t\cdot 2\Bigl[\Bigl(\frac{3}{2}\Bigr)^{t}-1\Bigr]\Bigr) + \frac{1}{2}(1-\nu)(d+1)\Bigl(1-\frac{1}{r^t}\Bigr)\cdot x^t \\
&\quad= \underbrace{\Bigl(\mu+\frac{\lambda+\nu-1}{r^{t+1}}-\frac{\nu}{r}\Bigr)}_{:=A(\nu)}\cdot x^{t+1} + \underbrace{\Bigl[\frac{1}{2}(1-\nu)(d+1)\Bigl(1-\frac{1}{r^t}\Bigr)-\frac{2\nu}{r}\Bigl(\Bigl(\frac{3}{2}\Bigr)^{t}-1\Bigr)\Bigr]}_{:=B(\nu)}\cdot x^t.
\end{align*}

First observe that by using the definitions of λ, μ, we get
\begin{align*}
A(\nu) &= \Bigl[\,2^d+(d-1)\Bigl(1-\frac{1}{r^{t+1}}\Bigr)-\frac{1}{r}\,\Bigr]\cdot\nu - d\Bigl(1-\frac{1}{r^{t+1}}\Bigr) \\
&\ge \Bigl[\,2^d+(d-1)\Bigl(1-\frac{1}{r^{d+1}}\Bigr)-\frac{1}{r}\,\Bigr]\cdot\nu - d\Bigl(1-\frac{1}{r^{d+1}}\Bigr) \\
&\ge 0,
\end{align*}
504 G. Christodoulou and M. Gairing

where the first inequality holds since ν ≤ 1 and the second inequality is by Definition 3 and ν ≥ ν̂. Since x ≥ 2, we get
\[ A(\nu)\cdot x^{t+1} + B(\nu)\cdot x^t \;\ge\; \bigl(2A(\nu)+B(\nu)\bigr)\cdot x^t. \]
To complete the proof we show that 2A(ν) + B(ν) ≥ 0 for ν ≥ ν̂.

\begin{align*}
2A(\nu)+B(\nu) &= \Bigl[\,2^{d+1}+2(d-1)\Bigl(1-\frac{1}{r^{t+1}}\Bigr)-\frac{2}{r}-\frac{1}{2}(d+1)\Bigl(1-\frac{1}{r^t}\Bigr)-\frac{2}{r}\Bigl(\Bigl(\frac{3}{2}\Bigr)^{t}-1\Bigr)\Bigr]\cdot\nu \\
&\quad\; - 2d\Bigl(1-\frac{1}{r^{t+1}}\Bigr) + \frac{1}{2}(d+1)\Bigl(1-\frac{1}{r^t}\Bigr) \\
&= \Bigl[\,2^{d+1}+2(d-1)\Bigl(1-\frac{1}{r^{t+1}}\Bigr)-\frac{1}{2}(d+1)\Bigl(1-\frac{1}{r^t}\Bigr)-\frac{2}{r}\Bigl(\frac{3}{2}\Bigr)^{t}\Bigr]\cdot\nu \\
&\quad\; - 2d\Bigl(1-\frac{1}{r^{t+1}}\Bigr) + \frac{1}{2}(d+1)\Bigl(1-\frac{1}{r^t}\Bigr),
\end{align*}

which again is a monotone increasing function in ν. Since ν̂ ≥ d/(2^d + d − 1) by Lemma 3 and ν ≥ ν̂, we get
\begin{align*}
2A(\nu)+B(\nu) &\ge \Bigl[\,2^{d+1}+2(d-1)\Bigl(1-\frac{1}{r^{t+1}}\Bigr)-\frac{1}{2}(d+1)\Bigl(1-\frac{1}{r^t}\Bigr)-\frac{2}{r}\Bigl(\frac{3}{2}\Bigr)^{t}\Bigr]\cdot\frac{d}{2^d+d-1} \\
&\quad\; - 2d\Bigl(1-\frac{1}{r^{t+1}}\Bigr) + \frac{1}{2}(d+1)\Bigl(1-\frac{1}{r^t}\Bigr) \\
&= \frac{(d+1)(2^d-1)\cdot r^{t+1} - 4d\bigl(\frac{3}{2}\bigr)^{t}\cdot r^t - (d+1)(2^d-1)\cdot r + 4d\,2^d}{2\,(2^d+d-1)\,r^{t+1}}.
\end{align*}
Define D(d, t) as the numerator of this term. Thus,
\[ D(d,t) \;=\; (d+1)(2^d-1)\cdot r^{t+1} \;-\; 4d\Bigl(\frac{3}{2}\Bigr)^{t}\cdot r^t \;-\; (d+1)(2^d-1)\cdot r \;+\; 4d\,2^d. \]
If r ≤ 4/3 then
\[ D(d,t) \;\ge\; (d+1)(2^d-1)\cdot r\,(r^t-1) \;\ge\; 0 \]
for all integers d ≥ 1 and 0 ≤ t ≤ d. If r ≥ 4/3 then
\begin{align*}
D(d,t) &\ge (d+1)(2^d-1)\cdot r\,(r^t-1) - 4d\Bigl(\frac{3}{2}\Bigr)^{t}\cdot(r^t-1) \\
&\ge \Bigl[(d+1)(2^d-1)\cdot\frac{4}{3} - 4d\Bigl(\frac{3}{2}\Bigr)^{d}\Bigr]\cdot(r^t-1),
\end{align*}
which is nonnegative for all integers d ≥ 4 and 0 ≤ t ≤ d. For t ≤ d ≤ 3, D(d, t) ≥ 0 can be checked using elementary calculus. ⊓⊔
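For the residual cases t ≤ d ≤ 3, the nonnegativity of D(d, t) can also be confirmed by a numerical grid check over r (an illustration complementing the calculus argument, not a proof):

```python
def D(d, t, r):
    """Numerator from the proof; must be >= 0 for all r > 1 and 0 <= t <= d."""
    return ((d + 1) * (2 ** d - 1) * r ** (t + 1) - 4 * d * (3 / 2) ** t * r ** t
            - (d + 1) * (2 ** d - 1) * r + 4 * d * 2 ** d)

# grid over r in (1, 21]; D(1,1,r) = 2(r-2)^2 touches zero at r = 2
assert all(D(d, t, 1 + k / 500) >= -1e-9
           for d in (1, 2, 3) for t in range(d + 1) for k in range(1, 10000))
```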

We are now ready to prove the upper bound of our main result.
Theorem 3. For congestion games with polynomial cost functions with maximum degree d and nonnegative coefficients, we have
\[ PoS \;\le\; \max_{r>1}\; \frac{(2^d d + 2^d - 1)\cdot r^{d+1} - (d+1)\cdot r^d + 1}{(2^d + d - 1)\cdot r^{d+1} - (d+1)\cdot r^d + 2^d d - d + 1}. \]

Proof. Let s* be an optimum assignment and let s be a pure Nash equilibrium with Φ(s) ≤ Φ(s*). Such a Nash equilibrium exists by (1). Define x_e = n_e(s) and y_e = n_e(s*). Then
\begin{align*}
SC(s) &\le SC(s) + (d+1)\bigl(\Phi(s^*)-\Phi(s)\bigr) \\
&= \sum_{e\in E} n_e(s)\cdot c_e\bigl(n_e(s)\bigr) + (d+1)\sum_{e\in E}\Bigl(\sum_{i=1}^{n_e(s^*)} c_e(i) - \sum_{i=1}^{n_e(s)} c_e(i)\Bigr) \\
&= \sum_{e\in E}\Bigl[x_e\cdot c_e(x_e) + (d+1)\Bigl(\sum_{i=1}^{y_e} c_e(i) - \sum_{i=1}^{x_e} c_e(i)\Bigr)\Bigr]. \qquad (6)
\end{align*}

Moreover, since s is a pure Nash equilibrium, we have
\[ SC(s) \;=\; \sum_{i=1}^{n} C_i(s) \;\le\; \sum_{i=1}^{n} C_i(s^*_i, s_{-i}) \;\le\; \sum_{i=1}^{n}\sum_{e\in s^*_i} c_e\bigl(n_e(s)+1\bigr) \;=\; \sum_{e\in E} y_e\cdot c_e(x_e+1). \qquad (7) \]

Let ν̂ be as defined in Definition 3. Taking the convex combination ν̂·(7) + (1−ν̂)·(6) of those inequalities gives
\[ SC(s) \;\le\; \sum_{e\in E}\Bigl[\hat{\nu}\cdot y_e\cdot c_e(x_e+1) + (1-\hat{\nu})\,x_e\cdot c_e(x_e) + (1-\hat{\nu})(d+1)\Bigl(\sum_{i=1}^{y_e} c_e(i) - \sum_{i=1}^{x_e} c_e(i)\Bigr)\Bigr]. \]

With λ = d + 1 − dν̂ and μ = (2^d + d − 1)ν̂ − d, applying Proposition 1 gives
\[ SC(s) \;\le\; \sum_{e\in E}\bigl[\mu\cdot x_e\cdot c_e(x_e) + \lambda\cdot y_e\cdot c_e(y_e)\bigr] \;=\; \mu\cdot SC(s) + \lambda\cdot SC(s^*). \]

Thus,
\[ \frac{SC(s)}{SC(s^*)} \;\le\; \frac{\lambda}{1-\mu} \;=\; \frac{d+1-d\hat{\nu}}{d+1-(2^d+d-1)\hat{\nu}}. \]

By Definition 3, for all real numbers r > 1, we have
\[ \hat{\nu} \;\ge\; \frac{d\bigl(1-\frac{1}{r^{d+1}}\bigr)}{2^d + (d-1)\bigl(1-\frac{1}{r^{d+1}}\bigr) - \frac{1}{r}}. \qquad (8) \]

Denote by r̂ the value of r > 1 which makes inequality (8) tight. Such a value r̂ must exist since ν̂ is the minimum value satisfying this inequality. So,
\[ \hat{\nu} \;=\; \frac{d\,(\hat{r}^{d+1}-1)}{2^d\,\hat{r}^{d+1} + (d-1)(\hat{r}^{d+1}-1) - \hat{r}^{d}}. \]

Substituting this in the bound from Theorem 3 gives
\begin{align*}
PoS &\le \frac{d+1-d\hat{\nu}}{d+1-(2^d+d-1)\hat{\nu}} \\
&= \frac{(d+1)2^d\hat{r}^{d+1} + (d^2-1)(\hat{r}^{d+1}-1) - (d+1)\hat{r}^{d} - d^2(\hat{r}^{d+1}-1)}{2^d(d+1)\hat{r}^{d+1} + (d^2-1)(\hat{r}^{d+1}-1) - (d+1)\hat{r}^{d} - 2^d d(\hat{r}^{d+1}-1) - d(d-1)(\hat{r}^{d+1}-1)} \\
&= \frac{(2^d d + 2^d - 1)\,\hat{r}^{d+1} - (d+1)\,\hat{r}^{d} + 1}{(2^d + d - 1)\,\hat{r}^{d+1} - (d+1)\,\hat{r}^{d} + 2^d d - d + 1} \\
&\le \max_{r>1}\; \frac{(2^d d + 2^d - 1)\cdot r^{d+1} - (d+1)\cdot r^d + 1}{(2^d + d - 1)\cdot r^{d+1} - (d+1)\cdot r^d + 2^d d - d + 1},
\end{align*}
which proves the upper bound. ⊓⊔
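As a numerical illustration of Theorem 3, the maximization over r can be approximated by a grid search. For d = 2 this reproduces the ≈ 2.36 value quoted in Section 5, and for d = 1 it gives ≈ 1.577, consistent with the exact linear-case value 1 + 1/√3:

```python
def pos_upper_bound(d, steps=20000):
    """Grid-search approximation of the max over r > 1 in Theorem 3."""
    def g(r):
        num = (2 ** d * d + 2 ** d - 1) * r ** (d + 1) - (d + 1) * r ** d + 1
        den = (2 ** d + d - 1) * r ** (d + 1) - (d + 1) * r ** d + 2 ** d * d - d + 1
        return num / den
    return max(g(1.0 + k / 1000.0) for k in range(1, steps))

assert 2.35 < pos_upper_bound(2) < 2.37   # quadratic cost functions
```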

5 Separation

For the linear case, the Price of Stability was equal to the Price of Anarchy of
dominant strategies, as the matching lower bound instance would hold for dominant strategies. Here, we show that linear functions were a degenerate case, and that this is not true for higher-order polynomials. We show that for games that possess dominant equilibria, the Price of Anarchy is strictly smaller². Our separation leaves open the question of the exact value of the Price of Anarchy of dominant strategies for these games.
Theorem 4. Consider a congestion game with quadratic cost functions which admits a dominant strategy equilibrium s. Then
\[ \frac{SC(s)}{opt} \;\le\; \frac{7}{3}. \]

Observe that this upper bound is strictly smaller than the exact value of the
PoS for general congestion games with quadratic cost functions from Theorem 1,
which was ≈ 2.36.

References
1. Aland, S., Dumrauf, D., Gairing, M., Monien, B., Schoppmann, F.: Exact price
of anarchy for polynomial congestion games. SIAM Journal on Computing 40(5),
1211–1233 (2011)
2. Albers, S.: On the value of coordination in network design. SIAM Journal on Com-
puting 38(6), 2273–2302 (2009)
² By a more elaborate analysis one can come up with an upper bound of ≈ 2.242. Here we just wanted to demonstrate the separation of the two measures.

3. Anshelevich, E., Dasgupta, A., Kleinberg, J.M., Tardos, É., Wexler, T.,
Roughgarden, T.: The price of stability for network design with fair cost allocation.
SIAM Journal on Computing 38(4), 1602–1623 (2008)
4. Awerbuch, B., Azar, Y., Epstein, A.: The price of routing unsplittable flow. In: Proceedings of STOC, pp. 57–66 (2005)
5. Bhawalkar, K., Gairing, M., Roughgarden, T.: Weighted congestion games: Price of
anarchy, universal worst-case examples, and tightness. In: de Berg, M., Meyer, U.
(eds.) ESA 2010, Part II. LNCS, vol. 6347, pp. 17–28. Springer, Heidelberg (2010)
6. Bilò, V.: A unifying tool for bounding the quality of non-cooperative solutions
in weighted congestion games. In: Erlebach, T., Persiano, G. (eds.) WAOA 2012.
LNCS, vol. 7846, pp. 215–228. Springer, Heidelberg (2013)
7. Bilò, V., Bove, R.: Bounds on the price of stability of undirected network design
games with three players. Journal of Interconnection Networks 12(1-2), 1–17 (2011)
8. Bilò, V., Caragiannis, I., Fanelli, A., Monaco, G.: Improved lower bounds on
the price of stability of undirected network design games. In: Kontogiannis, S.,
Koutsoupias, E., Spirakis, P.G. (eds.) SAGT 2010. LNCS, vol. 6386, pp. 90–101.
Springer, Heidelberg (2010)
9. Caragiannis, I., Flammini, M., Kaklamanis, C., Kanellopoulos, P., Moscardelli,
L.: Tight bounds for selfish and greedy load balancing. Algorithmica 61, 606–637
(2011)
10. Chen, H.-L., Roughgarden, T.: Network design with weighted players. Theory of
Computing Systems 45, 302–324 (2009)
11. Christodoulou, G., Koutsoupias, E.: The price of anarchy of finite congestion
games. In: Proceedings of STOC, pp. 67–73 (2005)
12. Christodoulou, G., Chung, C., Ligett, K., Pyrga, E., van Stee, R.: On the price of
stability for undirected network design. In: Bampis, E., Jansen, K. (eds.) WAOA
2009. LNCS, vol. 5893, pp. 86–97. Springer, Heidelberg (2010)
13. Christodoulou, G., Koutsoupias, E.: On the price of anarchy and stability of cor-
related equilibria of linear congestion games. In: Brodal, G.S., Leonardi, S. (eds.)
ESA 2005. LNCS, vol. 3669, pp. 59–70. Springer, Heidelberg (2005)
14. Christodoulou, G., Koutsoupias, E., Spirakis, P.G.: On the performance of approx-
imate equilibria in congestion games. Algorithmica 61(1), 116–140 (2011)
15. Disser, Y., Feldmann, A.E., Klimm, M., Mihalák, M.: Improving the H_k-bound on the price of stability in undirected Shapley network design games. CoRR, abs/1211.2090 (2012); To appear in CIAC 2013
16. Fiat, A., Kaplan, H., Levy, M., Olonetsky, S., Shabo, R.: On the price of stability for
designing undirected networks with fair cost allocations. In: Bugliesi, M., Preneel,
B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4051, pp. 608–618.
Springer, Heidelberg (2006)
17. Koutsoupias, E., Papadimitriou, C.: Worst-case equilibria. In: Meinel, C., Tison,
S. (eds.) STACS 1999. LNCS, vol. 1563, pp. 404–413. Springer, Heidelberg (1999)
18. Li, J.: An O(log n / log log n) upper bound on the price of stability for undirected Shapley network design games. Information Processing Letters 109(15), 876–878 (2009)
19. Monderer, D., Shapley, L.: Potential games. Games and Economic Behavior 14, 124–143 (1996)
20. Rosenthal, R.W.: A class of games possessing pure-strategy Nash equilibria. Inter-
national Journal of Game Theory 2, 65–67 (1973)
21. Roughgarden, T.: Intrinsic robustness of the price of anarchy. Communications of
the ACM 55(7), 116–123 (2012)
Localization for a System of Colliding Robots

Jurek Czyzowicz1 , Evangelos Kranakis2, and Eduardo Pacheco2


1
Université du Québec en Outaouais, Gatineau, Québec J8X 3X7, Canada
2
Carleton University, Ottawa, Ontario K1S 5B6, Canada

Abstract. We study the localization problem in the ring: a collection of


n anonymous mobile robots are deployed in a continuous ring of perime-
ter one. All robots start moving at the same time along the ring with arbi-
trary velocity, starting in clockwise or counterclockwise direction around
the ring. The robots bounce against each other when they meet. The task
of each robot is to find out, in finite time, the initial position and the
initial velocity of every deployed robot. The only way that robots perceive information about the environment is by colliding with their neighbors; no other type of communication among robots is possible.
We assume the principle of momentum conservation as well as the
conservation of energy, so robots exchange velocities when they collide.
The capabilities of each robot are limited to: observing the times of its
collisions, being aware of its velocity at any time, and processing the
collected information. Robots have no control of their walks or veloci-
ties. Robots’ walks depend on their initial positions, velocities, and the
sequence of collisions. They are not equipped with any visibility mecha-
nism.
The localization problem for bouncing robots has been studied previ-
ously in [1,2] in which robots are assumed to move at the same velocity.
The configuration of initial positions of robots and their speeds is con-
sidered feasible, if there is a finite time, after which every robot starting
at this configuration knows initial positions and velocities of all other
robots. Authors of [1] conjectured that if robots have arbitrary veloci-
ties, the problem might be solvable, if the momentum conservation and
energy preservation principles are assumed.
In this paper we prove that the conjecture in [1] is false. We show that
the feasibility of any configuration and the required time for solving it
under such stronger constraints depend only on the collection of velocities
of the robots. More specifically, if v0 , v1 , . . . , vn−1 are the velocities of a
given robot configuration S, we prove that S is feasible if and only if $v_i \neq \bar{v}$ for all 0 ≤ i ≤ n − 1, where $\bar{v} = \frac{v_0+\cdots+v_{n-1}}{n}$. To figure out the initial positions of all robots no more than $\min_{0\le i\le n-1}\frac{2}{|v_i-\bar{v}|}$ time is required.

1 Introduction
Due to their simplicity, efficiency, and flexibility, mobile agents or robots have been widely used in diverse areas such as artificial intelligence, computational economics, and robotics [3]. Mobile robots are autonomous entities that possess

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 508–519, 2013.

c Springer-Verlag Berlin Heidelberg 2013
Localization for a System of Colliding Robots 509

the ability to move within their environment, to interact with other robots, to
perceive the information of the environment and to process this information.
Some examples of tasks carried out by mobile robots are environment explo-
ration, perimeter patrolling, mapping, pattern formation and localization.
In order to reduce power consumption and to prevent scalability problems,
minimum communication among robots and robots with limited capabilities are
frequently sought. With this in mind, in this paper, we study distributed systems
of mobile robots that allow no communication whatsoever and have limited
capabilities. Part of our motivation is to understand the algorithmic limitations
of what a set of such robots can compute.
The task of our interest is that each robot localizes the initial position of
every other robot also deployed on the ring with respect to its own position. Our
model assumes robot anonymity, collisions as the only way of interaction between
robots, and robots’ movements completely out of their control. The abilities of
each robot are limited to observing the time of any of its collisions, awareness of
its velocity at any time and the capacity to process this information.
Distributed applications often concern mobile robots of very limited commu-
nication and sensing capabilities, mainly due to the limited production cost,
size and battery power. Such collections of mobile robots, called swarms, often
perform exploration or monitoring tasks in hazardous or hard to access envi-
ronments. The usual swarm robot attributes assumed for distributed models
include anonymity, negligible dimensions, no explicit communication, no com-
mon coordinate system (see [4]). In most situations involving such weak robots
the fundamental research question concerns the feasibility of solving the given
task (cf. [5,4]).
In our paper, besides the limited sensing and communication capabilities,
a robot has absolutely no control on its movement, which is determined by the
bumps against its neighbors. In [6,7] the authors introduced population protocols,
modeling wireless sensor networks by limited finite-state computational devices.
The agents of population protocols also follow mobility patterns totally out
of their control. This is called passive mobility, intended to model, e.g., some
unstable environment, like a flow of water, chemical solution, human blood, wind
or unpredictable mobility of agents’ carriers (e.g. vehicles or flocks of birds).
Pattern formation is sometimes considered as one of the steps of more complex
distributed tasks. Our interest in the problem of this paper was fueled by the
patrolling problem [8]. Patrolling is usually more efficient if the robots are evenly
distributed around the environment. Clearly, location discovery is helpful in
uniform spreading of the collection. [9] investigated a related problem, where
uniform spreading in one-dimensional environment has been studied.
The dynamics of the robots in our model is similar to the one observed in
some systems of gas particles which have motivated some applications for mo-
bile robots. The study of the dynamics of particles sliding in a surface that collide
among themselves has been of great interest in physics for a long time. Much of
such work has been motivated for the sake of understanding the dynamic prop-
erties of gas particles [10,11,12,13]. The simplest models of such particle systems
510 J. Czyzowicz, E. Kranakis, and E. Pacheco

assume either a line or a ring as the environment in which particles move. For
instance, Jepsen [14], similarly to our paper, considers particles of equal mass
and arbitrary velocity moving in a ring. He assumes the conservation of momen-
tum and conservation of energy principles, such that when two particles collide
they exchange velocities. Jepsen studies the probabilistic movement of particles
because of its importance for understanding some gas equilibrium properties.
Other works have found applications of particle systems in different fields. For
example, Cooley and Newton [15,16] described a method to generate pseudo
random numbers efficiently by using the dynamics of particle systems.
The distributed computing community has exploited the simple dynamics of
some particle systems to design algorithms for mobile robots. [17] consider a
system of mobile robots that imitate the impact behavior of n elastic particles
moving in a ring. They considered a set of n synchronous robots that collide elas-
tically moving in a frictionless ring in order to carry out perimeter surveillance.
Sporadic communication at the times of collision is assumed. Another example
of a system of robots that mimics particle’s dynamics is found in [18], where the
problem of motion synchronization in the line is studied.
In our paper we also assume a system wherein robots imitate the dynamics of
gas particles moving in a ring, with the restriction that robots can not communi-
cate. The task we consider is localization of the initial position of every robot in
the ring. We call this problem the localization problem. The localization problem
has been previously studied in [2] from a randomized approach and robots oper-
ating in synchronous rounds. Czyzowicz et al. [1] considered a simplified version
of the localization task from a deterministic point of view. They assumed that
all robots have equal speed and that when two of them collide they bounce back.
Czyzowicz et al. show that all robot configurations in which not all the robots
have the same initial direction are feasible and provided a position detection
algorithm for all feasible configurations.
When robots’ velocities are arbitrary and each robot is aware only of its
current velocity, as we assume in this paper, the characterization of all feasible
robot configurations becomes much more complex. In [1] infeasible configurations
can be detected by robots in finite time, while we show here, that without the
knowledge of initial velocities, there always exists some robot for which it is
impossible to decide whether the given configuration is feasible, even if the total
number of robots is known in advance.
We provide a complete characterization of all feasible configurations. If
v0 , v1 , . . . , vn−1 is the collection of velocities of a given robot configuration S,
we prove that S is feasible if and only if for all i, 0 ≤ i ≤ n − 1, we have $v_i \neq \frac{v_0+\cdots+v_{n-1}}{n} = \bar{v}$. Moreover, we provide an upper bound of $\min_{0\le i\le n-1}\frac{2}{|v_i-\bar{v}|}$ on
the necessary time to solve the localization problem. Hence the feasibility of any
robot configuration is independent of the starting positions of the robots.

2 The Model
We consider a set of n synchronous and anonymous robots r0 , r1 , . . . , rn−1 de-
ployed on a continuous, one-dimensional ring of perimeter one, represented by

the real segment [0, 1) wherein 0 and 1 are identified as the same point. Each
robot ri starts moving at time t = 0 with velocity vi either in counterclockwise
or in clockwise direction. By |vi | we denote the speed of velocity vi .
By ri (t) ∈ [0, 1) we denote the position in the ring of robot ri at time t.
Let P denote the sequence p0 , p1 , . . . , pn−1 of initial positions of the robots,
meaning pi = ri (0), 0 ≤ i ≤ n − 1. W.l.o.g, we assume p0 = 0 as well as a
non-decreasing order in the counterclockwise direction of the initial positions of
the robots. If robot ri moves freely along the ring with velocity vi during the
time interval [t1 , t2 ], then its position at time t ∈ [t1 , t2 ] is given by ri (t) =
ri (t1 ) + vi · (t − t1 ). When two robots collide they exchange velocities following
the principle of momentum conservation and conservation of energy in classical
mechanics for objects of equal mass [19]. We assume that in any collision no
more than two robots are involved.
Regarding the capabilities of the robots, we assume that each of them has a
clock which can measure time in a continuous way. Each robot is always aware of
its clock, current velocity and the time of any of its collisions. The movement of
a robot is beyond its control in that it depends solely on its initial position and
velocity, as well as the collisions with other robots along the way. At the time
of deployment, no robot is aware of the initial position and the velocity of any
other robot nor of the total number of robots deployed in the ring. Moreover,
robots do not have a common sense of direction.
Let S = (P, V ) be a system of n mobile robots r0 , r1 , . . . , rn−1 with initial po-
sitions P = (p0 , p1 , . . . , pn−1 ) and velocities V = (v0 , v1 , . . . , vn−1 ) respectively;
we denote by v̄ the average of the velocities in V . We say that the localization
problem for S is feasible if there exists a finite time T , such that each robot can
determine the initial positions, and the initial velocities of all robots in the sys-
tem with respect to its own starting position and its own orientation of the ring.
This should be accomplished by each robot by observing the times of a sequence
of collisions taking place within some time interval [0, T ]. Note that each colli-
sion is accompanied by the measurement of collision time and a corresponding
exchange of velocities.
At issue here is not only to determine the feasibility of the localization problem
for the given system S, but also to characterize all such feasible system instances.

3 Feasibility of the Localization Problem

For two points p, q in the ring, by d(+) (p, q) we denote the counterclockwise
distance from p to q in the ring, i.e. the distance which needs to be travelled in
the counterclockwise direction in order to arrive at q starting from p. Note that
for p ≠ q we have 0 < d^{(+)}(p, q) < 1, and d^{(+)}(p, q) = 1 − d^{(+)}(q, p).
In order to visualize the dynamics of the robots in the ring, we consider an infinite line L = (−∞, ∞) and for each robot $r_i$ we create an infinite number of its copies $r_i^{(j)}$, all having the same initial velocity, such that their initial positions in L are $r_i^{(j)}(0) = j + r_i(0)$ for all integer values of j ∈ Z.

We use the idea of a baton, applied previously in [1,2], in order to simplify our arguments and to gain intuition about the dynamics of the robots. Assume that each robot holds a virtual object, called a baton, and when two robots collide they exchange their batons. By $b_i^{(j)}$ we denote the baton originally held by robot $r_i^{(j)}$ and by $b_i^{(j)}(t)$ we denote the position of this baton on L at time t. Notice that the velocity of baton $b_i^{(j)}$ is constant, so its trajectory corresponds to a line of slope $1/v_i$.
By putting together the infinite line and the trajectories of batons, we can depict the walk of the robots up to any given time. For instance, in Fig. 1, the dynamics of a system of three mobile robots is depicted. The walk of robot $r_0$ along the ring corresponds to the thick polyline.

Fig. 1. Trajectory of robot r0 corresponds to the thick polyline. The times of its first
six collisions are also shown.
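The baton abstraction can be exercised in code. The following event-driven simulation (our sketch; the robot count, positions, and velocities are arbitrary test values) uses the fact that robots preserve their cyclic order, so only adjacent pairs collide, and checks that the multiset of robot positions at time T coincides with the multiset of freely moving baton positions $(p_i + v_i T) \bmod 1$:

```python
def simulate(ps, vs, T, eps=1e-12):
    """Event-driven simulation of bouncing robots on the unit ring.
    On each collision the two robots exchange velocities (elastic, equal mass).
    Returns the robot positions at time T."""
    ps, vs = list(ps), list(vs)
    n, t = len(ps), 0.0
    while True:
        best, pair = None, None
        for i in range(n):                    # only adjacent pairs can collide
            j = (i + 1) % n
            gap = (ps[j] - ps[i]) % 1.0       # ccw gap from robot i to robot j
            rel = vs[i] - vs[j]               # closing speed of that gap
            if rel > eps:
                dt = gap / rel
                if best is None or dt < best:
                    best, pair = dt, i
        if best is None or t + best > T:      # no further collision before T
            dt = T - t
            return [(p + v * dt) % 1.0 for p, v in zip(ps, vs)]
        t += best
        ps = [(p + v * best) % 1.0 for p, v in zip(ps, vs)]
        i, j = pair, (pair + 1) % n
        vs[i], vs[j] = vs[j], vs[i]           # exchanging velocities = exchanging batons

ps, vs, T = [0.0, 0.3, 0.7], [0.9, -0.2, 0.4], 2.0
robots = sorted(simulate(ps, vs, T))
batons = sorted((p + v * T) % 1.0 for p, v in zip(ps, vs))
assert all(abs(a - b) < 1e-6 for a, b in zip(robots, batons))
```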

When a robot moves from any given position p on line L to the position p + 1
(or p − 1) such a robot has completed a tour along the ring in counterclockwise
(or resp. clockwise) direction. For example r0 in Fig. 1 has completed two coun-
terclockwise tours along the ring between times t0 and t3. We show first that the feasibility of the localization problem does not change when the initial speeds of all robots are increased or decreased by the same value.

Definition 1. A translation of a system $S = (P, (v_0, \ldots, v_{n-1}))$ is a system $S_c = (P, (v_0', \ldots, v_{n-1}'))$, where $v_i' = v_i - c$, $0 \le i \le n-1$, for $c \in \mathbb{R}$.

Lemma 1. Let S be a system of robots and let Sc be any of its translations. For
every time t, velocities vi and vj are exchanged in S at time t if, and only if at
time t velocities vi − c and vj − c are exchanged in Sc .

Proof. Since S is also a translation of $S_c$, it is enough to prove the lemma in one direction. Consider any translation $S_c$ of system S and let $v_i'$ and $v_j'$ be such that $v_i' = v_i - c$ and $v_j' = v_j - c$ for some $c \in \mathbb{R}$, and let $b_i$, $b_j$, $b_i'$, and $b_j'$ be the corresponding batons of velocities $v_i$, $v_j$, $v_i'$, and $v_j'$ respectively.
Since the times of exchange of velocities $v_i$ and $v_j$ coincide with the times of exchange of batons $b_i$ and $b_j$, it is enough to prove that for every time t at which batons $b_i$ and $b_j$ are exchanged in S, so are batons $b_i'$ and $b_j'$ in $S_c$.
We prove first that the time of the first meeting of $b_i$ and $b_j$ in S is the same as the time of the first meeting of $b_i'$ and $b_j'$ in $S_c$. The statement is clearly true when $v_i = v_j$, since in both systems S and $S_c$ the batons stay forever at the same distance on the ring and no meeting ever occurs. Suppose then, by symmetry, that $v_i \ge 0$ and $v_i > v_j$. Let d be the initial counterclockwise distance (i.e. at time t = 0) from $b_i$ to $b_j$, i.e. $d = d^{(+)}(b_i(0), b_j(0))$. Observe that the batons approach at speed $|v_i - v_j|$. (Note that this holds as well when the robots have different directions, i.e. $v_j < 0$: then $|v_i - v_j| = |v_i| + |v_j|$.) Hence the first meeting of $b_i$ and $b_j$ occurs at time $t^* = \frac{d}{|v_i - v_j|}$. However, in $S_c$ we have $v_i' = v_i - c > v_j - c = v_j'$ and again the two batons approach, reducing their original distance d with speed $|v_i' - v_j'| = |v_i - v_j|$, meeting eventually after the same time $t^*$. A careful reader may observe that the above argument holds independently of the directions that the batons $b_i$, $b_j$, $b_i'$, and $b_j'$ may have.
Observe that the same analysis holds by induction for the k-th meeting of the robots, for k = 2, 3, . . .. Indeed, if i < j, i.e. $p_i < p_j$, and $d = p_j - p_i$, then the k-th meeting of $b_i$ and $b_j$ corresponds to the intersection of the trajectory of the copy $b_i^{(0)}$ of baton $b_i$ with the copy $b_j^{(k-1)}$ of baton $b_j$. As their initial distance on L equals d + k − 1, this meeting occurs at time $t^* = \frac{d+k-1}{|v_i - v_j|}$ in both systems S and $S_c$. If i > j we have $d = 1 - (p_i - p_j)$ and the k-th meeting of $b_i$ and $b_j$ corresponds to the intersection of the trajectories of $b_i^{(0)}$ and $b_j^{(k)}$, which happens at time $t^* = \frac{d+k}{|v_i - v_j|}$ in both systems S and $S_c$. ⊓⊔

Fig. 2 illustrates Lemma 1. The walk of robot r1 is represented by a thick polyline to illustrate how the walk of a robot is affected in a translation of a system.
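Lemma 1 can also be illustrated with the closed-form meeting times from its proof. The sketch below assumes $v_i > v_j$, so that baton $b_i$ closes the counterclockwise gap, and checks that the meeting times are unaffected by a translation (the constant c = 0.37 is an arbitrary choice):

```python
def baton_meeting_times(p_i, v_i, p_j, v_j, kmax=6):
    """Meeting times of batons b_i and b_j on the unit ring, assuming v_i > v_j:
    the k-th meeting closes a line distance of d + (k - 1), where d is the
    initial ccw distance from b_i to b_j."""
    d = (p_j - p_i) % 1.0
    return [(d + k) / abs(v_i - v_j) for k in range(kmax)]

c = 0.37   # arbitrary translation constant
original   = baton_meeting_times(0.1, 1.4, 0.6, -0.3)
translated = baton_meeting_times(0.1, 1.4 - c, 0.6, -0.3 - c)
assert all(abs(a - b) < 1e-9 for a, b in zip(original, translated))
```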

Lemma 2. Let S be a system of robots and $S_c$ any of its translations. If $t_i$ is the time of the i-th collision of robot $r_q$ in S, and $t_i'$ the time of the i-th collision of robot $r_q$ in $S_c$, then $t_i = t_i'$ for i ≥ 1, where $t_0 = t_0' = 0$. Moreover, if $v(t_i)$ is the velocity of robot $r_q$ at time $t_i$, then the velocity of $r_q$ at time $t_i'$ is $v(t_i) - c$.
Proof. Assume the lemma holds for i, so at time $t_i$ robot $r_q$ obtains some baton $b_j$ in S, while at the same time $r_q$ obtains the corresponding baton $b_j'$ in $S_c$. Let $t_{i+1}$ denote the first time moment after $t_i$ when baton $b_j$ meets another baton in S, say $b_k$. By Lemma 1, at the same time $t_{i+1}$ baton $b_j'$ meets $b_k'$ in $S_c$. As $v_k' = v_k - c$ the claim follows. ⊓⊔

We show below that every robot, by each of its collisions, acquires information
about the initial position (relative to its own initial position) and initial velocity

Fig. 2. a) depicts a system of three robots r0, r1, r2 whose velocities are 3, 2, 1 respectively; b) depicts a translation of a) with new velocities 1, 0, −1 (after subtracting 2 from each original velocity). Notice that the times at which the collisions take place are not affected.

of some other robot of the system. We show later that if S is feasible, at some
time moment the collision revealing the position and velocity of any other robot
will eventually arise. However it is worth noting that up to that time moment,
some collisions revealing the positions of the same robot may arise several times.
We assume that, at time t = 0 each robot learns about its initial velocity.

Lemma 3. Consider the collisions obtained by robot $r_q$ deployed at its initial position $p_q$ in the ring. Suppose that the last collision of robot $r_q$, at time $t_i$, revealed some robot $r_s$ of initial velocity $v_s$ and initial position at counterclockwise distance $d_s$ from $p_q$, i.e. $d_s = d^{(+)}(p_q, p_s)$. Further assume that, at time $t_{i+1}$, robot $r_q$ collides, obtaining velocity $v_t$. Then there exists in S a robot $r_t$ with initial velocity $v_t$ such that $d^{(+)}(p_q, p_t) = ((v_s - v_t)t_{i+1} + d_s) \bmod 1$.
Proof. Between times $t_i$ and $t_{i+1}$ robot $r_q$ moves with velocity $v_s$, so we may assume that it holds baton $b_s$. The time of collision $t_{i+1}$ corresponds to the time of intersection of the trajectory of some copy $b_s^{(j)}$ of this baton with the trajectory of some copy $b_t^{(k)}$ of the baton moving with velocity $v_t$. The absolute distance in L between the starting positions $b_s^{(j)}(0)$ and $b_t^{(k)}(0)$ equals $|v_s - v_t|t_{i+1}$. Therefore $d^{(+)}(p_s, p_t) = (v_s - v_t)t_{i+1} \bmod 1$. Since by the assumption of the lemma $d^{(+)}(p_q, p_s) = d_s$, we have $d^{(+)}(p_q, p_t) = (d^{(+)}(p_q, p_s) + d^{(+)}(p_s, p_t)) \bmod 1 = ((v_s - v_t)t_{i+1} + d_s) \bmod 1$. ⊓⊔

It follows from Lemma 3 that for a robot to figure out the starting position
of every other robot it should acquire every velocity of the system in a finite
amount of time. Lemma 3 provides the core of an algorithm for robots to report
the starting position of every robot. We describe such an algorithm later on. The
next lemma is an immediate consequence of Lemma 2 and Lemma 3.
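The update rule of Lemma 3 can be traced on a tiny hand-computed instance (our illustration): two robots, $r_0$ at $p_0 = 0$ with $v_0 = 1$ and $r_1$ at $p_1 = 0.4$ with $v_1 = 0$. Robot $r_0$ first collides at $t_1 = 0.4$, when its baton reaches $r_1$, and again at $t_2 = 1.4$, when that baton completes a full lap of the ring:

```python
def reveal(d_s, v_s, v_t, t):
    """Lemma 3: ccw distance from r_q's start to the newly revealed robot,
    given the previously revealed robot (distance d_s, velocity v_s), the
    newly observed velocity v_t, and the collision time t."""
    return ((v_s - v_t) * t + d_s) % 1.0

d1 = reveal(0.0, 1.0, 0.0, 0.4)   # before t1, r0 carries its own baton (v_s=1, d_s=0)
d2 = reveal(d1, 0.0, 1.0, 1.4)    # at t2 the baton of velocity 1 comes back
assert abs(d1 - 0.4) < 1e-9       # r1 is discovered at ccw distance 0.4
assert abs(d2) < 1e-9             # r0 rediscovers its own starting position
```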

Lemma 4. For any system S and its translation Sc , the position discovery prob-
lem is solvable for S, if and only if it is solvable for Sc .

Given a fixed point ρ in the ring, which we call the reference point and S a
system of robots, we associate with each robot ri an integer counter ci that we
call cycle counter. A cycle counter ci increases its value by one each time robot
ri traverses the reference point ρ in the counterclockwise direction and decreases
by one when traversing ρ in clockwise direction. We denote by ci (t) the value of
cycle counter ci at time t. The initial value of ci is set to 0, meaning ci (0) = 0.
Let $D_i^{(+)}(t)$ denote the total distance that robot $r_i$ travelled until time t in the counterclockwise direction, and $D_i^{(-)}(t)$ the total distance travelled by $r_i$ in the clockwise direction. Denote $D_i(t) = D_i^{(+)}(t) - D_i^{(-)}(t)$. The following observation is an immediate consequence of $\sum_{i=0}^{n-1} v_i = 0$ for the system $S_{\bar{v}}$:

Observation 1. For any system $S_{\bar{v}}$ at any time moment t we have $\sum_{i=0}^{n-1} D_i(t) = 0$.

Lemma 5. Consider the translation Sv̄ of any system S. At any time t, no two
cycle counters differ by more than 1, i.e. |ci (t) − cj (t)| ≤ 1, 0 ≤ i, j ≤ n − 1.
Moreover, there should be a cycle counter ck(t) such that ck(t) (t) = 0 for some
0 ≤ k(t) ≤ n − 1.
Proof. Let us observe that since robots can not overpass each other they always
keep their initial cyclic order. Therefore, we can simulate the traversals on ρ
by the robots by assuming that robots remain static while ρ is moving in one
of the two directions along the ring; when ρ traverses a robot ri in clockwise
direction, counter ci increases by one and decreases by one if ρ traverses ri in
counterclockwise direction.
We prove first that |ci (t)| ≤ 1, for each 0 ≤ i ≤ n − 1. Indeed, suppose to the
contrary, that |ci (t)| ≥ 2. Consider first the case when ci (t) ≥ 2. In such a case,
ri must have traversed point ρ at least two more times in the counterclockwise
direction than in the clockwise one. Since the robots do not change their rela-
tive order around the ring, each other robot rj must have traversed ρ at least
once more in the counterclockwise direction than in the clockwise n−1one. Hence
(+) (−)
Di (t) > Di (t) for each i = 0, . . . , n − 1. This contradicts i=0 Di (t) = 0.
The argument for ci (t) ≤ −2 is symmetric.
It is easy to see that there are no two robots ri, rj such that ci(t) = 1 and cj(t) = −1. Indeed, in such a case these robots must have traversed point ρ in opposite directions, which would have forced them to overpass, a contradiction. Hence the values of all cycle counters at time t belong to the set {0, 1} or to the set {0, −1}. However, ci(t) ≠ 0 for all i = 0, . . . , n − 1 would imply that the values Di(t) are all positive or all negative, contradicting Σ_{i=0}^{n−1} Di(t) = 0, which concludes the proof.
We can conclude with the following Corollary:

Corollary 1. For each robot ri of Sv̄ and any time t we have |Di (t)| < 1.
516 J. Czyzowicz, E. Kranakis, and E. Pacheco

Proof. Suppose to the contrary that |Di(t)| ≥ 1; by symmetry, assume that Di^(+)(t) − Di^(−)(t) ≥ 1. In such a case, ri at time t has made a full counterclockwise tour around the ring. By putting the reference point ρ = ri(0), we notice that this forces each other robot rj to have cj(t) ≥ 1, which contradicts Observation 1.
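The statement of Corollary 1 can be checked numerically. The sketch below (our own scaffolding, with example positions and velocities whose average is 0) runs an exact event-driven simulation on the unit ring; an elastic collision of identical robots is equivalent to the two robots exchanging velocities.

```python
def simulate(pos, vel, T):
    """Advance n elastically colliding robots on the unit ring to time T;
    return the net signed displacements D_i(T)."""
    n = len(pos)
    p, v = list(pos), list(vel)   # positions kept in cyclic order
    d = [0.0] * n                 # net displacement of each robot
    t = 0.0
    while t < T:
        # find the earliest collision among adjacent pairs
        t_next, pair = T - t, None
        for i in range(n):
            j = (i + 1) % n
            gap = (p[j] - p[i]) % 1.0
            rel = v[i] - v[j]                  # closing speed of the pair
            if rel > 1e-12 and gap / rel < t_next:
                t_next, pair = gap / rel, (i, j)
        for i in range(n):                     # advance everyone exactly
            d[i] += v[i] * t_next
            p[i] = (p[i] + v[i] * t_next) % 1.0
        t += t_next
        if pair is not None:                   # elastic collision = velocity swap
            i, j = pair
            v[i], v[j] = v[j], v[i]
    return d

d = simulate([0.10, 0.35, 0.60, 0.85], [0.30, -0.10, 0.20, -0.40], 50.0)
assert abs(sum(d)) < 1e-9                      # Observation 1
assert max(abs(x) for x in d) < 1.0            # Corollary 1
```

With a nonzero average velocity the displacements drift without bound, so the second assertion would eventually fail after enough simulated time.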

[Fig. 3: a space-time diagram of four robots r0, r1, r2, r3; the horizontal axis shows ring positions unrolled over −1, 0, 1, 2, and the vertical axis shows time.]

Fig. 3. An example of a system of robots where the average of the velocities is equal
to 0. Notice that no robot completes more than one round in any direction.

Fig. 3 depicts a system of mobile robots in which the average of the velocities is equal to 0. Notice that no robot in the picture ever completes more than one round along the ring in either direction. The movements of r0 are shown with a thick polyline to illustrate this.

Theorem 1. For any system of n mobile robots S = (P, V ), the localization problem is feasible if, and only if, vi ≠ v̄ for every vi ∈ V . Moreover, if the problem is feasible, then each robot knows the positions and the velocities of all other robots before time T = 2 / min_{0≤i≤n−1} |vi − v̄|.

Proof. By Lemma 4, it is sufficient to prove the theorem for Sv̄ = (P, Vv̄ ).
We prove first that if some robot ri has the initial velocity vi = v̄ = 0, then
the system is not feasible. For the localization problem to be feasible in Sv̄ , each
robot must hold every baton at some time within some finite time interval [0, T ].
We prove by contradiction that, if there is a baton bq of velocity 0, then there
exists a robot whose trajectory will not intersect the trajectory of bq . Thus, such
a robot would not obtain the information about the velocity and the position of
robot rq .
Consider cycle counters cj (t) for each robot rj , 0 ≤ j ≤ n − 1, where the
reference point is set to ρ = rq (0). Because vq = 0, there is always a robot of Sv̄
that remains motionless at point ρ. In other words, each robot of Sv̄ , in order to
hold baton bq has to move to position ρ and collide with the current robot at that
position. Observe that it is not possible that all robots arrive at point ρ from
Localization for a System of Colliding Robots 517

the same direction around the ring. Indeed, in such a case the robot velocities
would be all positive or all negative, implying v̄ ≠ 0. Consequently, observe that
there must exist two time moments t1 , t2 and two consecutive robots ri and ri+1
(where index i + 1 is taken modulo n) such that one of these robots visited ρ at
time t1 while walking in one direction and the other robot visited ρ at time t2
while walking in the opposite direction. Notice that t1 ≠ t2, since we supposed that no three robots meet simultaneously, and ρ coincides with a stationary robot.
Suppose, that ri arrived at ρ at time t1 while walking clockwise and ri+1
arrived at ρ at time t2 while walking counterclockwise. As robots are arranged
in the counterclockwise order around the ring it follows that within the time
interval [t1 , t2 ] each other robot has to walk counterclockwise through ρ (or walk
more times counterclockwise than clockwise) increasing its cycle counter.
Let S′ = (P′, Vv̄), where P′ = (r′0(t1), . . . , r′n−1(t1)), and let c′j be the respective cycle counter of robot r′j for every 0 ≤ j ≤ n − 1. Notice that during the time interval [0, t2 − t1] every robot of S′ behaves exactly the same way as does every robot in Sv̄ in the time interval [t1, t2]. Thus, at time t∗ = t2 − t1 we have c′j(t∗) > 0 for all 0 ≤ j ≤ n − 1, which contradicts Lemma 5. This implies that there is at least one robot that does not learn the initial position of all robots.
The cases where ri+1 , rather than ri , arrived at ρ at time t1 and when the
directions of ri and ri+1 while walking through ρ are reversed, are symmetric.
Suppose now that no robot has the initial velocity v̄ = 0. Consider any robot
ri and the interval I2 = [ri(0) − 1, ri(0) + 1] of the infinite line L. By Corollary 1, robot ri never leaves interval I2 during its movement, hence its trajectory is confined to the vertical strip of width 2 (cf. Fig. 3). Consider any baton bj. Suppose, by symmetry, that vj > 0. Take the trajectory of the copy of baton bj which originates from the left half of I2, i.e., from the segment [ri(0) − 1, ri(0)]. This trajectory goes across the vertical strip of width 2 enclosing I2 and leaves it before time 2/|vj|, forcing a meeting of robot ri and baton bj. If vj < 0 we take a copy of bj starting at the right half of I2, i.e., in [ri(0), ri(0) + 1], and the argument is the same. The time 2/|vj| is maximized for the j minimizing |vj|.
An example of an infeasible robot configuration is shown in Fig. 2, in which
robot r0 never learns the initial position of robot r1 .
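The feasibility test and the time bound of Theorem 1 depend only on the multiset of velocities; a minimal sketch (function name and tolerance are ours):

```python
def localization_bound(velocities, eps=1e-12):
    """Return the Theorem 1 time bound T = 2 / min_i |v_i - v_bar|,
    or None when the system is infeasible (some v_i equals the average)."""
    v_bar = sum(velocities) / len(velocities)
    gap = min(abs(v - v_bar) for v in velocities)
    if gap < eps:                  # a robot moving at the average velocity
        return None
    return 2.0 / gap

assert abs(localization_bound([0.3, -0.1, 0.2, -0.4]) - 20.0) < 1e-9
assert localization_bound([0.5, 0.1, 0.3]) is None   # 0.3 is the average velocity
```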

4 The Localization Algorithm


In this section we present an algorithm to solve the localization problem. The algorithm is based on Lemma 3. According to Theorem 1, if robots have knowledge of the velocities of the other robots on the ring, it is possible for them to detect the infeasibility of the system without even starting to move, or to stop it when the variable clock reaches the value 2 / min_{0≤i≤n−1} |vi − v̄|. Otherwise, the algorithm is designed to run indefinitely. However, we can also assume that a central authority, perhaps having knowledge of the velocities of the robots in the system, may modify a robot's signal variable move to halt the execution. The present algorithm may report the position of the same robot more than once; this may be
518 J. Czyzowicz, E. Kranakis, and E. Pacheco

clearly avoided by providing the robots with linear-size memory to recall all previously output robots.
The main theorem ensures that by this time all robots have discovered all the initial positions, provided the system is feasible.
In algorithm RingLocalization we assume that a robot has, at any time, immediate access to its clock as well as to its current velocity through the variables clock and velocity, respectively. The values of these variables cannot be modified by the robot, and they are updated instantaneously when a collision happens. We can assume that the values of these variables correspond to the readings of the robot's sensors. A robot uses auxiliary variables, namely old velocity and pos, for recalling the position and the velocity of the robot detected through its last collision.

Algorithm RingLocalization;
1. var pos ← 0, old velocity ← velocity : real; move ← true : boolean;
2. reset clock to 0;
3. while move do
4. walk until collision;
5. pos ← ((velocity − old velocity) · clock + pos) mod 1;
6. output ("Robot of velocity" velocity "detected at position" pos);
7. old velocity ← velocity;

Since variable pos clearly keeps track of ds = d^(+)(pq, ps), Theorem 1 and Lemma 3 imply the following result.
Theorem 2. Let S = (P, V ) be a system of robots. Suppose that no robot has initial velocity v̄, meaning vi ≠ v̄ for all vi ∈ V , and that the algorithm RingLocalization is executed by each robot for time 2 / min_{0≤i≤n−1} |vi − v̄|. Then,
every robot correctly reports the initial positions and directions of all robots on
the ring with respect to its initial position.
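The bookkeeping behind RingLocalization can be illustrated as follows: the velocity v_new acquired at a collision identifies a baton whose straight-line trajectory passes through the collision point, so its initial position can be recovered as (x(t) − v_new·t) mod 1. This is equivalent, up to orientation conventions, to the update in line 5 of the algorithm; the event-driven simulation scaffolding below is ours.

```python
def run(pos, vel, T):
    """Simulate collisions; whenever a robot acquires a new velocity,
    recover the initial position of the corresponding baton."""
    n = len(pos)
    p, v = list(pos), list(vel)
    t, found = 0.0, set()
    while t < T:
        t_next, pair = T - t, None
        for i in range(n):
            j = (i + 1) % n
            gap, rel = (p[j] - p[i]) % 1.0, v[i] - v[j]
            if rel > 1e-12 and gap / rel < t_next:
                t_next, pair = gap / rel, (i, j)
        t += t_next
        for i in range(n):
            p[i] = (p[i] + v[i] * t_next) % 1.0
        if pair is not None:
            i, j = pair
            v[i], v[j] = v[j], v[i]            # collision: exchange velocities
            # robot i now carries the baton of velocity v[i]; the baton's
            # trajectory is a straight line through the collision point (t, p[i])
            found.add(round((p[i] - v[i] * t) % 1.0, 6))
    return found

pos = [0.10, 0.35, 0.60, 0.85]
vel = [0.30, -0.10, 0.20, -0.40]
recovered = run(pos, vel, 25.0)
assert recovered and recovered <= {round(x, 6) for x in pos}
```

Every reported position coincides with a true initial position, matching the correctness claim of Theorem 2 for this example.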

5 Conclusions
We characterized the configurations of all feasible systems. Observe that without the knowledge of velocities, even if the number of robots in the system is known, it is impossible for a robot to decide at any time whether the system is infeasible. Indeed, by Theorem 1, to any system S it is possible to add a new robot of velocity equal to the average v̄, making S infeasible for at least some robot ri of S. Consequently, given an arbitrarily large time T∗, it is also possible to add to S a robot of velocity close to v̄, so that the system stays feasible, but not within the time bound of T∗. Notice also that already for two robots at small distance ε, starting in opposite directions with small velocities v1 and v2 = −v1, it takes time (1 − ε)/(2v1) to get the first collision, so a worst-case localization time proportional to 1 / min_{0≤i≤n−1} |vi − v̄| is unavoidable.

Fast Collaborative Graph Exploration

Dariusz Dereniowski1, Yann Disser2, Adrian Kosowski3, Dominik Pająk3, and Przemysław Uznański3

1 Gdańsk University of Technology, Poland
2 TU Berlin, Germany
3 CEPAGE project, Inria Bordeaux Sud-Ouest, France

Abstract. We study the following scenario of online graph exploration. A team


of k agents is initially located at a distinguished vertex r of an undirected graph.
At every time step, each agent can traverse an edge of the graph. All vertices have
unique identifiers, and upon entering a vertex, an agent obtains the list of identi-
fiers of all its neighbors. We ask how many time steps are required to complete
exploration, i.e., to make sure that every vertex has been visited by some agent.
We consider two communication models: one in which all agents have global
knowledge of the state of the exploration, and one in which agents may only
exchange information when simultaneously located at the same vertex. As our
main result, we provide the first strategy which performs exploration of a graph
with n vertices at a distance of at most D from r in time O(D), using a team of
agents of polynomial size k = Dn^(1+ε) < n^(2+ε), for any ε > 0. Our strategy works
in the local communication model, without knowledge of global parameters such
as n or D.
We also obtain almost-tight bounds on the asymptotic relation between exploration time and team size, for large k. For any constant c > 1, we show that in the global communication model, a team of k = Dn^c agents can always complete exploration in D(1 + 1/(c−1) + o(1)) time steps, whereas at least D(1 + 1/c − o(1)) steps are sometimes required. In the local communication model, D(1 + 2/(c−1) + o(1)) steps always suffice to complete exploration, and at least D(1 + 2/c − o(1)) steps are sometimes required. This shows a clear separation between the global and local communication models.

1 Introduction
Exploring an undirected graph-like environment is relatively straightforward for a single agent. Assuming the agent is able to distinguish which neighboring vertices it has previously visited, there is no better systematic traversal strategy than a simple depth-first search of the graph, which takes 2(n − 1) moves in total for a graph with n vertices.
The situation becomes more interesting if multiple agents want to collectively explore
the graph starting from a common location. If arbitrarily many agents may be used, then

This work was initiated while A. Kosowski was visiting Y. Disser at ETH Zurich.
Supported by ANR project DISPLEXITY and by NCN under contract DEC-
2011/02/A/ST6/00201. The authors are grateful to Shantanu Das for valuable discussions
and comments on the manuscript. The full version of this paper is available online at:
http://hal.inria.fr/hal-00802308.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 520–532, 2013.
© Springer-Verlag Berlin Heidelberg 2013
Fast Collaborative Graph Exploration 521

we can generously send n·D agents through the graph, where D is the distance from the starting vertex to the most distant vertex of the graph. At each step, we spread the agents located at each vertex (almost) evenly among all its neighbors, and thus explore the graph in D steps.
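One concrete way to realize this unlimited-agent bound is to dedicate one agent per vertex and route it along a BFS shortest path, so that every vertex is reached within D steps (the example graph and all names below are ours):

```python
from collections import deque

def explore_with_many_agents(adj, r):
    """Compute a BFS shortest-path tree from r; one dedicated agent walking
    each root-to-vertex path visits every vertex within D steps."""
    parent, dist = {r: None}, {r: 0}
    q = deque([r])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in parent:
                parent[w], dist[w] = u, dist[u] + 1
                q.append(w)
    D = max(dist.values())
    visited = set()
    for target in adj:               # one agent per vertex (n agents suffice)
        path, u = [], target
        while u is not None:         # walk back up the BFS tree to the root
            path.append(u)
            u = parent[u]
        assert len(path) - 1 <= D    # this agent needs at most D steps
        visited.update(path)
    return visited, D

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
visited, D = explore_with_many_agents(adj, 0)
assert visited == set(adj) and D == 3
```

The sketch assumes a connected graph given offline; the point is only that with enough agents the time D is trivially achievable, which is the baseline the paper's team-size trade-off is measured against.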
While the cases with one agent and arbitrarily many agents are both easy to under-
stand, it is much harder to analyze the spectrum in between these two extremes. Of
course, we would like to explore graphs in as few steps as possible (i.e., close to D),
while using a team of as few agents as possible. In this paper we study this trade-off
between exploration time and team size. A trivial lower bound on the number of steps
required for exploration with k agents is Ω(D + n/k): for example, in a tree, some
agent has to reach the most distant node from r, and each edge of the tree has to be
traversed by some agent. We look at the case of larger groups of agents, for which D
is the dominant factor in this lower bound. This complements previous research on the
topic for trees [6,8] and grids [17], which usually focused on the case of small groups
of agents (when n/k is dominant).
Another important issue when considering collaborating agents concerns the model
that is assumed for the communication between agents. We need to allow communica-
tion to a certain degree, as otherwise there is no benefit to using multiple agents for
exploration [8]. We may, for example, allow agents to freely communicate with each
other, independent of their whereabouts, or we may restrict the exchange of informa-
tion to agents located at the same location. This paper also studies this tradeoff between
global and local communication.

The Collaborative Online Graph Exploration Problem. We are given a graph G =


(V, E) rooted at some vertex r. The number of vertices of the graph is bounded by
n. Initially, a set A of k agents is located at r. We assume that vertices have unique
identifiers that admit a total ordering. In each step, an agent visiting vertex v receives a
complete list of the identifiers of the nodes in N (v), where N (v) is the neighborhood of
v. Time is discretized into steps, and in each step, an agent can either stay at its current
vertex or slide along an edge to a neighboring vertex. Agents have unique identifiers,
which allows agents located at the same node and having the same exploration history to
differentiate their actions. We do not explicitly bound the memory resources of agents,
enabling them in particular to construct a map of the previously visited subgraph, and
to remember this information between time steps. An exploration strategy for G is a
sequence of moves performed independently by the agents. A strategy explores the
graph G in t time steps if for all v ∈ V there exists time step s ≤ t and an agent g ∈ A,
such that g is located at v in step s. Our goal is to find an exploration strategy which
minimizes the time it takes to explore a graph in the worst case, with respect to the
shortest path distance D from r to the vertex furthest from r in the graph.
We distinguish between two communication models. In exploration with global com-
munication we assume that, at the end of each step s, all agents have complete knowl-
edge of the explored subgraph. In particular, in step s all agents know the number of
edges incident to each vertex of the explored subgraph which lead to unexplored ver-
tices, but they have no information on any subgraph consisting of unexplored vertices.
In exploration with local communication two agents can exchange information only if
they occupy the same vertex. Thus, each agent g has its own view on which vertices
522 D. Dereniowski et al.

Table 1. Our bounds for the time required to explore general graphs using Dn^c agents. The same upper and lower bounds hold for trees. The lower bounds use graphs with D = n^o(1).

Communication model   | Upper bound                        | Lower bound
Global communication  | D · (1 + 1/(c−1) + o(1)) (Thm. 3)  | D · (1 + 1/c − o(1)) (Thm. 5)
Local communication   | D · (1 + 2/(c−1) + o(1)) (Thm. 3)  | D · (1 + 2/c − o(1)) (Thm. 5)

were explored so far, constructed based only on the knowledge that originates from the agent's own observations and from other agents that it has met.
Our results. Our main contribution is an exploration strategy for a team of polynomial size that explores graphs in an asymptotically optimal number of steps. More precisely, for any ε > 0, the strategy can operate with Dn^(1+ε) < n^(2+ε) agents and takes time O(D). It works even under the local communication model and without prior knowledge of n or D.
We first restrict ourselves to the exploration of trees (Section 2). We show that with
global communication trees can be explored in time D · (1 + 1/(c − 1) + o(1)) for any
c > 1, using a team of Dn^c agents. Our approach can be adapted to show that with
local communication trees can be explored in time D · (1 + 2/(c − 1) + o(1)) for any
c > 1, using the same number of agents. We then carry the results for trees over to
the exploration of general graphs (Section 3). We obtain precisely the same asymptotic
bounds for the number of time steps needed to explore graphs with Dn^c agents as for
the case of trees, under both communication models.
Finally, we provide lower bounds for collaborative graph exploration that almost
match our positive results (Section 4). More precisely, we show that, in the worst case
and for any c > 1, exploring a graph with Dn^c agents takes at least D · (1 + 1/c − o(1))
time steps in the global communication model, and at least D·(1+2/c−o(1)) time steps
in the local communication model. Table 1 summarizes our upper and corresponding
lower bounds.
Related Work. Collaborative online graph exploration has been intensively studied for
the special case of trees. In [8], a strategy is given which explores any tree with a team of
k agents in O(D+n/ log k) time steps, using a communication model with whiteboards
at each vertex that can be used to exchange information. This corresponds to a compet-
itive ratio of O(k/ log k) with respect to the optimum exploration time of Θ(D + n/k)
when the graph is known. In [13] the authors show that the competitive ratio of the strategy presented in [8] is precisely k/ log k. Another DFS-based algorithm, given in [2], has an exploration time of O(n/k + D^(k−1)) time steps, which provides an improvement only for graphs of small diameter and small teams of agents, k = O(log_D n). For a special subclass of trees called sparse trees, [6] introduces online strategies with a competitive ratio of O(D^(1−1/p)), where p is the density of the tree as defined in that work. The best currently known lower bound is much lower: in [7], it is shown that any deterministic exploration strategy with k < √n agents has a competitive ratio of Ω(log k/ log log k), even
in the global communication model. A stronger lower bound of Ω(k/ log k) holds for so-called greedy algorithms [13]. Both for deterministic and randomized strategies, the competitive ratio is known to be at least 2 − 1/k when k < √n [8]. None of these lower bounds concern larger teams of agents. In [16] a lower bound of Ω(D^(1/(2c+1))) on the competitive ratio is shown to hold for a team of k = n^c agents, but this lower bound only concerns so-called rebalancing algorithms which keep all agents at the same height in the tree throughout the exploration process.
The same model for online exploration is studied in [17], where a strategy is proposed for exploring graphs which can be represented as a D × D grid with a certain number of disjoint rectangular holes. The authors show that such graphs can be explored with a team of k agents in time O(D log² D + n log D/k), i.e., with a competitive ratio of O(log² D). By adapting the approach for trees from [7], they also show lower bounds on the competitive ratio in this class of graphs of Ω(log k/ log log k) for deterministic strategies and Ω(√(log k)/ log log k) for randomized strategies. These lower bounds also hold in the global communication model.
Collaborative exploration has also been studied with different optimization objec-
tives. An exploration strategy for trees with global communication is given in [7],
achieving a competitive ratio of (4 − 2/k) for the objective of minimizing the maxi-
mum number of edges traversed by an agent. In [5] a corresponding lower bound of
3/2 is provided.
Our problem can be seen as an online version of the k Traveling Salesmen Problem
(k-TSP) [9]. Online variants of TSP (for a single agent) have been studied in various
contexts. For example, the geometric setting of exploring grid graphs with and without
holes is considered by [10,11,14,15,17], where a variety of competitive algorithms with
constant competitive ratios is provided. A related setting is studied in [4], where an
agent has to explore a graph while being attached to the starting point by a rope of
restricted length. A similar setting is considered in [1], in which each agent has to
return regularly to the starting point, for example for refueling. Online exploration of
polygons is considered in [3,12].

2 Tree Exploration
We start our considerations by designing exploration strategies for the special case when
the explored graph is a tree T rooted at a vertex r. For any exploration strategy, the set
of all encountered vertices (i.e., all visited vertices and their neighbors) at the beginning
of step s = 1, 2, 3, . . . forms a connected subtree of T , rooted at r and denoted by T (s) .
In particular, T (1) is the vertex r together with its children, which have not yet been
visited. For v ∈ V (T ) we write T (s) (v) to denote the subtree of T (s) rooted at v. We
denote by L(T (s) , v) the number of leaves of the tree T (s) (v). Note that L(T (s) , v) ≤
L(T (s+1) , v) because each leaf in T (s) (v) is either a leaf of the tree T (s+1) or the root of
a subtree containing at least one vertex. If v is an unencountered vertex at the beginning
of step s, i.e., its parent was not yet visited, we define L(T (s) , v) = 1.

2.1 Tree Exploration with Global Communication


We are ready to give the procedure TEG (Tree Exploration with Global Communication).
The pseudocode uses the command “move(s)”, describing the move to be performed by
each agent, specifying the destination at which the agent appears at the start of time
step s + 1. Since the agents can communicate globally, the procedure can centrally
coordinate the movements of each agent. For simplicity we assume that x agents spawn
in r in each time step, for some given value of x. Then, the total number of agents used
after l steps is simply lx.
Procedure TEG (tree T with root r, integer x) at time step s:
Place x new agents at r.
for each v ∈ V (T (s)) which is not a leaf do: { determine moves of the agents located at v }
    Let Av^(s) be the set of agents currently located at v.
    Denote by v1, v2, . . . , vd the children of v.
    Let i∗ := arg max_i {L(T (s), vi)}. { vi∗ is the child of v with the largest value of L }
    Partition Av^(s) into disjoint sets Av1, Av2, . . . , Avd, such that:
        (i) |Avi| = ⌊ |Av^(s)| · L(T (s), vi) / L(T (s), v) ⌋, for i ∈ {1, 2, . . . , d} \ {i∗},
        (ii) |Avi∗| = |Av^(s)| − Σ_{i∈{1,2,...,d}\{i∗}} |Avi|.
    for each i ∈ {1, 2, . . . , d} do for each agent g ∈ Avi do move(s) g to vertex vi.
end for
end procedure TEG.
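The partitioning step of TEG amounts to giving every child except the leaf-richest one its floored proportional share, with the leaf-richest child i∗ absorbing the remainder. A sketch (the flooring is our reading of expression (i); function and variable names are ours):

```python
def partition_agents(num_agents, leaf_counts):
    """Split num_agents among children proportionally to their subtree
    leaf counts, following expressions (i) and (ii) of procedure TEG."""
    total = sum(leaf_counts)
    i_star = max(range(len(leaf_counts)), key=lambda i: leaf_counts[i])
    shares = [num_agents * L // total for L in leaf_counts]        # (i), floored
    rest = num_agents - sum(s for i, s in enumerate(shares) if i != i_star)
    shares[i_star] = rest                                          # (ii)
    return shares

shares = partition_agents(100, [5, 3, 1])      # leaf counts of the children
assert shares == [56, 33, 11]                  # floors give 55, 33, 11; child 0 gets the rest
assert sum(shares) == 100                      # no agent is lost
```

Rule (ii) guarantees that the child with the most leaves never receives less than its exact proportional share, which is what the proof of Lemma 1 relies on.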
The following lemma provides a characterization of the tradeoff between exploration
time and the number of agents x released at every round in procedure TEG. In the
following, all logarithms are with base 2 unless a different base is explicitly given.
Lemma 1. In the global communication model, procedure TEG with parameter x explores any rooted tree T in at most D · (1 + 1/(log_n x − 1 − log_n(2 log x))) time steps, for x > 6(n log n + 1).
Proof. Fix any leaf f of the tree T. We want to prove that procedure TEG visits the leaf f after at most D · (1 + 1/(log_n x − 1 − log_n(2 log x))) time steps. Take the path F = (f0, f1, f2, . . . , fDf) from r to f in T, where r = f0, f = fDf, and Df ≤ D. We define the wave of agents ws, starting from r at time s and traversing the path F, as the maximal sequence of non-empty sets of agents which leave the root in step s and traverse edges of F in successive time steps, i.e., ws = (Af0^(s), Af1^(s+1), . . .), where we use the notation from procedure TEG. The size of wave ws in step s + t is defined to be |Aft^(s+t)|, i.e., the number of exploring agents located at vertex ft at the beginning of time step s + t; initially, every wave has size |Af0^(s)| = x. Note that each agent in Afi^(s+i), 0 ≤ i < Df, is located at r at the start of time step s. We denote the number of leaves in the subtree of T^(i) rooted at fj by λj^(i) = L(T^(i), fj). Recall that if fj is not yet discovered in step i, by the definition of the function L, we have λj^(i) = 1. In general, 1 ≤ λj^(i) ≤ n. We define

    αi = (x/2) · (λ1^(i) / λ0^(i)) · (λ2^(i+1) / λ1^(i+1)) · · · (λDf^(i+Df−1) / λDf−1^(i+Df−1)),
and define α∗i as the number of agents of the i-th wave that reach the leaf f , i.e., the size
of the i-th wave in step i + Df . If α∗1 = α∗2 = · · · = α∗i−1 = 0 and α∗i ≥ 1 for some
time step i, then we say that leaf f is explored by the i-th wave. Before we proceed with
the analysis, we show the following auxiliary claim.
Claim (*). Let i be a time step for which αi ≥ log x. Then α∗i ≥ αi, and thus αi is a lower bound on the number of agents reaching f in step i + Df.

Proof (of the claim). We define cj = λj+1^(i+j) / λj^(i+j) for j = 0, . . . , Df − 1. For i ≥ 1 we have αi = (x/2) · Π_{j=0}^{Df−1} cj. Since cj ≤ 1 for all j and since αi ≥ log x, there exist at most log x different j such that cj ≤ 1/2. Denote the set of all such j by J, with |J| ≤ log x. Also, denote the size of wave wi in step i + s by as (for s = 0, 1, 2, . . .); in particular, a0 = x.
Consider some index s for which cs > 1/2. We have λs+1^(i+s) / λs^(i+s) > 1/2, thus more than half of all leaves of the tree T^(i+s)(fs) also belong to the tree T^(i+s)(fs+1). But then, in time step i + s + 1, agents are sent from fs to fs+1 according to the definition in expression (ii) in procedure TEG. Thus, we can lower-bound the size of wave wi in step i + s + 1 by as+1 ≥ as·cs. Otherwise, if cs ≤ 1/2 (i.e., if s ∈ J), then agents are sent according to the definition in expression (i) in procedure TEG, and hence as+1 ≥ as·cs − 1. Note that these bounds also hold if there are no agents left in the wave, i.e., as = as+1 = 0. Thus, we have:

    as+1 ≥ as·cs − δs,   where δs = 1 if s ∈ J, and δs = 0 otherwise.
In this way we expand the expression for α∗i = aDf:

    α∗i = aDf ≥ aDf−1·cDf−1 − δDf−1 ≥ . . . ≥ (. . . ((a0·c0 − δ0)·c1 − δ1)·c2 − . . .)·cDf−1 − δDf−1
        = x · Π_{j=0}^{Df−1} cj − Σ_{j=0}^{Df−1} ( δj · Π_{p=j+1}^{Df−1} cp ) ≥ 2αi − Σ_{j=0}^{Df−1} δj ≥ 2αi − |J| ≥ 2αi − log x.

Since by assumption αi ≥ log x, we obtain α∗i ≥ 2αi − log x ≥ αi, which completes the proof of the claim.

We now show that if the number of waves a in the execution of the procedure is sufficiently large, then there exists an index i ≤ a such that αi ≥ log x. Thus, taking into account Claim (*), leaf f is explored at the latest by the a-th wave.
Take a waves and consider the product Π_{i=1}^{a} αi. Note that λDf^(s) = 1 for every s. Thus, simplifying the product of all αi by cancelling the repeating terms in the numerators and denominators, and using 1 ≤ λj^(i) ≤ n for each remaining factor, we get

    Π_{i=1}^{a} αi = (x/2)^a · Π_{i=1}^{a} Π_{j=0}^{Df−1} ( λj+1^(i+j) / λj^(i+j) ) ≥ (x/2)^a / n^(a+Df−1) ≥ (x/2)^a / n^(a+D).   (1)

We want to find a such that Π_{i=1}^{a} αi ≥ (log x)^a. Taking into account (1), it is sufficient to find a satisfying

    (x/2)^a / n^(a+D) ≥ (log x)^a,

which for sufficiently large x (we take x > 6(n log n + 1)) can be equivalently transformed, by taking logarithms and elementary arithmetic, to the form:

    a ≥ D / (log_n x − 1 − log_n(2 log x)).

Hence, for a = ⌈D / (log_n x − 1 − log_n(2 log x))⌉, there exists some i such that αi ≥ log x. For the same i we have α∗i ≥ log x, by Claim (*). Thus, a waves are sufficient to explore the path F. This analysis can be done for any leaf f, so it is enough to send a waves in order to explore the graph G. Considering that a wave wi is completed by the end of step D + i − 1, the exploration takes at most D + a − 1 time steps in total. Thus, the exploration takes at most D · (1 + 1/(log_n x − 1 − log_n(2 log x))) time steps. ⊓⊔

We remark that in the above lemma, the total number of agents used throughout all steps of procedure TEG is x · D · (1 + 1/(log_n x − 1 − log_n(2 log x))). For any c > 1, by appropriately setting x = Θ(n^c), we directly obtain the following theorem.

Theorem 1. For any fixed c > 1 and known n, the online tree exploration problem with global communication can be solved in at most D · (1 + 1/(c−1) + o(1)) time steps using a team of k ≥ Dn^c agents. ⊓⊔
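As an informal numerical check of this bound (our illustration, not part of the paper), the time bound of Lemma 1 can be evaluated directly; with x = n^c it decreases towards D · (1 + 1/(c−1)) as n grows, which is the content of Theorem 1. The choice of logarithm base for log x affects only the lower-order terms, so natural logarithms are used below.

```python
import math

def teg_time_bound(n, D, x):
    # Lemma 1's bound: D * (1 + 1 / (log_n x - 1 - log_n(2 log x))).
    denom = math.log(x, n) - 1 - math.log(2 * math.log(x), n)
    assert denom > 0, "x must satisfy x > 6(n log n + 1)"
    return D * (1 + 1 / denom)

# With x = n^c the bound tends to D * (1 + 1/(c-1)); here c = 2 and D = 10,
# so the printed values decrease towards 20 as n grows.
for n in (10**3, 10**6, 10**12):
    print(n, round(teg_time_bound(n, 10, n**2), 2))
```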

2.2 Tree Exploration with Local Communication


In this section we propose a strategy for tree exploration under the local communica-
tion model. In the implementation of the algorithm we assume that whenever two agents
meet, they exchange all information they possess about the tree. Thus, after the meeting,
the knowledge about the explored vertices and their neighborhoods is a union of the
knowledge of the two agents before the meeting. Since agents exchange information
only if they occupy the same vertex, at any time s, the explored tree T (s) may only par-
tially be known to each agent, with different agents possibly knowing different subtrees
of T (s) .
In order to obtain a procedure for the local communication model, we modify proce-
dure TEG from the previous section. Observe that in procedure TEG, agents never move
towards the root of the tree, hence, in the local communication model, agents cannot
exchange information with other agents located closer to the root. The new strategy is
given by the procedure TEL (Tree Exploration with Local Communication).
In procedure TEL, all agents are associated with a state flag which may be set either
to the value “exploring” or “notifying”. Agents in the “exploring” state act similarly to
agents in the global exploration procedure, with the requirement that they always move to a vertex in groups
of 2 or more agents. Every time a group of “exploring” agents visits a new vertex, it
detaches two of its agents, changes their state to “notifying”, and sends them back along
the path leading back to the root. These agents notify every agent they encounter on their
Fast Collaborative Graph Exploration 527

way about the discovery of the new vertices. Although information about the discovery may be delayed, in every step s, all agents at vertex v know the entire subtree T^{(s')}(v) which was explored until some previous time step s' ≤ s. The state flag also has a third state, “discarded”, which is assigned to agents no longer used in the exploration process.
The formulation of procedure TEL is not given from the perspective of individual agents; however, based on its description, the decision on what move to make in the current step can be made by each individual agent. The correctness of the definition of the procedure relies on the subsequent lemma, which guarantees that for a certain value s' the tree T^{(s')}(v) is known to all agents at v.
Procedure TEL (tree T with root r, integer x) at time step s:

  Place x new agents at r in state “exploring”.
  for each v ∈ V(T^{(s)}) which is not a leaf do: { determine moves of the agents located at v }
    if v ≠ r then for each agent g at v in state “notifying” do move(s) g to the parent of v.
    if v contains at least two agents in state “exploring” and agents at v do not have
    information of any agent which visited v before step s then:
      { send two new notifying agents back to the root from newly explored vertex v }
      Select two agents g*, g** at v in state “exploring”.
      Change state to “notifying” for agents g* and g**.
      move(s) g* to the parent of v. { g** will move to the parent one step later }
    end if
    Let A_v^{(s)} be the set of all remaining agents in state “exploring” located at v.
    Denote by v_1, v_2, ..., v_d all children of v, and by δ the distance from r to v.
    s' := ⌊(δ+s)/2⌋. { s' is a time in the past such that T^{(s')}(v) is known to the agents at v }
    Let i* := arg max_i {L(T^{(s')}, v_i)}. { v_{i*} is the child of v with the largest value of L }
    Partition A_v^{(s)} into disjoint sets A_{v_1}, A_{v_2}, ..., A_{v_d}, such that:
      (i) |A_{v_i}| = ⌊ |A_v^{(s)}| · L(T^{(s')}, v_i) / L(T^{(s')}, v) ⌋, for i ∈ {1, 2, ..., d} \ {i*},
      (ii) |A_{v_{i*}}| = |A_v^{(s)}| − Σ_{i ∈ {1,...,d} \ {i*}} |A_{v_i}|.
    for each i ∈ {1, ..., d} do if |A_{v_i}| ≥ 2 then for each agent g ∈ A_{v_i} do move(s) g to v_i.
    for each i ∈ {1, ..., d} do if |A_{v_i}| = 1 then change state to “discarded” for the agent in A_{v_i}.
  end for
  for each v ∈ V(T^{(s)}) which is a leaf do move(s) all agents located at v to the parent of v.
end procedure TEL.
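The partition step of TEL (items (i) and (ii) above) can be sketched in isolation. This is our illustrative reading, not code from the paper: `weights[i]` stands for L(T^{(s')}, v_i), and we let their sum play the role of L(T^{(s')}, v); the floor in rule (i) and the state bookkeeping of the full procedure are assumptions of the sketch.

```python
def partition_agents(num_agents, weights):
    """Split agents among children proportionally to their L-values,
    giving the child with the largest L-value the leftover agents."""
    total = sum(weights)
    i_star = max(range(len(weights)), key=lambda i: weights[i])
    shares = [0] * len(weights)
    for i, w in enumerate(weights):
        if i != i_star:
            shares[i] = (num_agents * w) // total  # floor of the proportional share
    shares[i_star] = num_agents - sum(shares)      # rule (ii): remainder goes to v_{i*}
    return shares

print(partition_agents(10, [3, 5, 2]))  # -> [3, 5, 2]
print(partition_agents(7, [1, 1, 1]))   # -> [3, 2, 2]
```

Note that no agents are lost in the split: the child with the largest L-value absorbs whatever the floors leave over, which is what lets the analysis charge rounding losses to at most one agent per small child.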

Lemma 2. Let T be a tree rooted at some vertex r and let v be a vertex with distance δ to r. After running procedure TEL until time step s, all agents which are located at vertex v at the start of time step s know the tree T^{(s')}(v), for s' = ⌊(δ+s)/2⌋.

(Some proofs are omitted from this extended abstract.)

Lemma 3. In the local communication model, procedure TEL with parameter x explores any rooted tree T in at most D · (1 + (2 + 1/log n)/(log_n x − 1 − log_n(4 log x))) time steps, for x > 17(n log n + 1).

Proof (sketch). As in the proof of Lemma 1, we consider any leaf f and the path F = (f_0, f_1, ..., f_{D_f}) from r to f. As before, we denote the number of leaves in the subtree of T^{(i)} rooted at f_j by λ_j^{(i)} = L(T^{(i)}, f_j). Recall that if f_j is not yet discovered in step i, we have L(T^{(i)}, f_j) = 1. We adopt the definition of a wave from Lemma 1. We define the values α_i differently, however, to take into account the fact that the procedure relies on a delayed exploration tree, and that some waves lose agents as a result of deploying notifying agents:

$$\alpha_i=\frac{x}{4}\cdot\frac{\lambda_1^{(i/2)}}{\lambda_0^{(i/2)}}\cdot\frac{\lambda_2^{(i/2+1)}}{\lambda_1^{(i/2+1)}}\cdots\frac{\lambda_{D_f}^{(i/2+D_f-1)}}{\lambda_{D_f-1}^{(i/2+D_f-1)}}.$$
We call a wave that discovered at least log x new nodes (or equivalently, a wave whose agents were the first to visit at least log x nodes of the tree) a discovery wave. Thus, there are at most D_f/log x ≤ D/log x discovery waves along the considered path. Observe that if a wave is not a discovery wave, then the number of notifying agents it sends out is at most 2 log x.
We denote by α*_i the number of agents of the i-th wave that reach leaf f. We first prove that the following analogue of Claim (*) from the proof of Lemma 1 holds for non-discovery waves (we leave out the details from this extended abstract).

Claim (**). Let i be a time step for which w_i is not a discovery wave and α_i ≥ log x. Then α*_i ≥ α_i, and thus α_i is a lower bound on the number of agents reaching f in step i + D_f.
Finally, we prove that if the number of waves a in the execution of the procedure is sufficiently large, i.e., a ≥ D · ((2 + 1/log n)/(log_n x − 1 − log_n(4 log x))), then there exists an index i ≤ a such that wave w_i is not a discovery wave and α_i ≥ log x. Exploration is then completed when the last wave reaches the leaves, i.e., in D + a − 1 steps, which completes the proof. ⊓⊔
Proceeding as in the previous subsection, from Lemma 3 we obtain a strategy for online exploration of trees in the model with local communication.
Theorem 2. For any fixed c > 1, the online tree exploration problem can be solved in the model with local communication and knowledge of n using a team of k ≥ Dn^c agents in at most D · (1 + 2/(c−1) + o(1)) time steps. ⊓⊔
3 General Graph Exploration


In this section we develop strategies for exploration of general graphs, both with global
communication and with local communication. These algorithms are obtained by mod-
ifying the tree-exploration procedures given in the previous section.
Given a graph G = (V, E) with root vertex r, we call P = (v0 , v1 , v2 , . . . , vm ) with
r = v0 , vi ∈ V , and {vi , vi+1 } ∈ E a walk of length (P ) = m. Note that a walk
may contain a vertex more than once. We introduce the notation P [j] to denote vj , i.e.,
the j-th vertex of P after the root, and P [0, j] to denote the walk (v0 , v1 , . . . , vj ), for
j ≤ m. The last vertex of path P is denoted by end(P ) = P [(P )]. The concatenation
of a vertex u to path P , where u ∈ N (end(P )) is defined as the path P  ≡ P + u of
length (P ) + 1 with P  [0, (P )] = P and end(P  ) = u.

Let P be the set of walks P in G having length 0 ≤ ℓ(P) < n. We introduce a linear order on walks in P such that for two walks P_1 and P_2, we say that P_1 < P_2 if ℓ(P_1) < ℓ(P_2), or ℓ(P_1) = ℓ(P_2) and there exists an index j < ℓ(P_1) such that P_1[0, j] = P_2[0, j] and P_1[j+1] < P_2[j+1]. The comparison of vertices from V is understood as comparison of their identifiers in G.
We now define the tree T with vertex set P and root (r) ∈ P, such that vertex P' is a child of vertex P if and only if P' = P + u, for some u ∈ N(end(P)). We first show that agents can simulate the exploration of T while in fact moving around graph G. Intuitively, while an agent is following a path from the root to the leaves of T, its location in T corresponds to the walk taken by this agent in G.

Lemma 4. A team of agents can simulate the virtual exploration of tree T starting from
root (r), while physically moving around graph G starting from vertex r. The simulation
satisfies the following conditions:
(1) An agent virtually occupying a vertex P of T is physically located at a vertex
end(P ) in G.
(2) Upon entering a vertex P of T in the virtual exploration, the agent obtains the
identifiers of all children of P in T .
(3) A virtual move along an edge of T can be performed in a single time step, by
moving the agent to an adjacent location in G.
(4) Agents occupying the same virtual location P in T can communicate locally, i.e.,
they are physically located at the same vertex of G.
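The correspondence asserted by Lemma 4 can be sketched concretely (our illustration, not the paper's code; the class and method names are ours). A virtual vertex of T is simply a walk P, the agent's physical position is end(P), and a virtual move to a child P + u is a single physical move to u in G, matching conditions (1)–(3).

```python
class VirtualAgent:
    """Simulates a walk down the virtual tree T while moving in graph G."""

    def __init__(self, adj, r):
        self.adj = adj
        self.walk = (r,)              # current virtual vertex of T: a walk from r

    def position(self):
        return self.walk[-1]          # condition (1): physically located at end(P)

    def children(self):
        # condition (2): identifiers of the children of P in T
        return [self.walk + (u,) for u in self.adj[self.position()]]

    def descend(self, u):
        # condition (3): one virtual edge of T = one physical move in G
        assert u in self.adj[self.position()]
        self.walk = self.walk + (u,)

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
a = VirtualAgent(adj, 0)
a.descend(1)
a.descend(2)
print(a.walk, a.position())  # (0, 1, 2) 2
```

Condition (4) follows immediately: two agents at the same walk P share the same end(P), hence the same vertex of G.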

We remark that the number of vertices of tree T is exponential in n. Hence, our goal is
to perform the simulation with only a subset of the vertices of T . For a vertex v ∈ V ,
let Pmin (v) ∈ P be the minimum (with respect to the linear order on P) walk ending
at v. We observe that, by property (1) in Lemma 4, if, for all v ∈ V , the vertex Pmin (v)
of T has been visited by at least one agent in the virtual exploration of T , the physical
exploration of G is completed. We define Pmin = {Pmin (v) : v ∈ V }, and show that
all vertices of Pmin are visited relatively quickly if we employ the procedure TEG (or
TEL) for T , subject to a simple modification. In the original algorithm, we divided the
agents descending to the children of the vertex according to the number of leaves of
the discovered subtrees. We introduce an alternate definition of the function L(T (s) , v),
so as to take into account only the number of vertices in T (s) corresponding to walks
which are smallest among all walks in T (s) sharing the same end-vertex.
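The walks P_min(v) can be computed explicitly for a small example. Since length dominates the linear order, P_min(v) is a shortest r–v path whose vertex sequence is lexicographically smallest among shortest ones; the sketch below (our illustration, not the paper's code) finds them by extending BFS layers in sorted order, so the first walk reaching each vertex is the minimal one.

```python
def p_min_walks(adj, r):
    """Return, for each vertex v, the minimum walk from r to v under the
    order 'shorter first, ties broken lexicographically by identifiers'."""
    best = {r: (r,)}                   # v -> minimal walk ending at v
    frontier = [(r,)]
    while frontier:
        next_frontier = []
        for walk in sorted(frontier):          # smaller walks extend first
            for u in sorted(adj[walk[-1]]):    # smaller neighbors first
                if u not in best:              # first walk to reach u is minimal
                    best[u] = walk + (u,)
                    next_frontier.append(best[u])
        frontier = next_frontier
    return best

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
print(p_min_walks(adj, 0))
# -> {0: (0,), 1: (0, 1), 2: (0, 2), 3: (0, 1, 3)}
```

In particular P_min contains at most n walks, which is what makes restricting the simulation to the subtree they span (Lemma 5 below) worthwhile despite T itself being exponentially large.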

Lemma 5. Let T^{(s)} ⊆ T be a subtree of T rooted at (r). For P ∈ V(T^{(s)}), let L(T^{(s)}, P) be the number of vertices v of G for which the subtree of T^{(s)} rooted at P contains a vertex representing the smallest walk contained in T^{(s)} which ends at v:

$$L(T^{(s)},P)=\Big|\,V(T^{(s)}(P))\ \cap\ \bigcup_{v\in V}\big\{\min\{P'\in V(T^{(s)})\,:\,\mathrm{end}(P')=v\}\big\}\Big|,$$

and for P ∈ P \ V(T^{(s)}), let L(T^{(s)}, P) = 1. Subject to this definition of L, procedure TEG with parameter x > 6(n log n + 1) (procedure TEL with parameter x > 17(n log n + 1)) applied to tree T starting from root (r) visits all vertices from P_min within D · (1 + 1/(log_n x − 1 − log_n(2 log x))) (respectively, D · (1 + (2 + 1/log n)/(log_n x − 1 − log_n(4 log x)))) time steps.
Proof. The set P_min spans a subtree T_min = T[P_min] in T, rooted at (r). We can perform an analysis analogous to that used in the proofs of Lemmas 1 and 3, evaluating sizes of waves of agents along paths in the subtree T_min. We observe that for any P ∈ P_min which is not a leaf in T_min, we always have L(T^{(s)}, P) ≥ 1. Moreover, we have L(T^{(s)}, P) ≤ |V(T^{(s)}(P))|, and so L(T^{(s)}, P) ≤ n. Since these two bounds were the only required properties of the function L in the proofs of Lemmas 1 and 3, the analysis from these proofs applies within the tree T_min without any changes. It follows that each vertex of P_min is reached by the exploration algorithm within D · (1 + 1/(log_n x − 1 − log_n(2 log x))) time steps in the case of global communication, and within D · (1 + (2 + 1/log n)/(log_n x − 1 − log_n(4 log x))) time steps in the case of local communication. ⊓⊔
We recall that by Lemma 4, one step of exploration of tree T can be simulated by a single step of an agent running on graph G. Thus, appropriately choosing x = Θ(n^c) in Lemma 5, we obtain our main theorem for general graphs.
Theorem 3. For any c > 1, the online graph exploration problem with knowledge of n can be solved using a team of k ≥ Dn^c agents:
– in at most D · (1 + 1/(c−1) + o(1)) time steps in the global communication model,
– in at most D · (1 + 2/(c−1) + o(1)) time steps in the local communication model.
For the case when we do not assume knowledge of (an upper bound on) n, we provide
a variant of the above theorem which also completes exploration in O(D) steps, with a
slightly larger multiplicative constant.
Theorem 4. For any c > 1, there exists an algorithm for the local communication
model, which explores a rooted graph of unknown order n and unknown diameter D
using a team of k agents, such that its exploration time is O(D) if k ≥ Dnc .
We remark that by choosing x = Θ(n log n) in Lemma 5, we can also explore a graph using k = Θ(Dn log n) agents in time Θ(D log n), with local communication. This bound is the limit of our approach in terms of the smallest allowed team of agents.
4 Lower Bounds
In this section, we show lower bounds for exploration with Dnc agents, complementary
to the positive results given by Theorem 3. The graphs that produce the lower bound are
a special class of trees. The same class of trees appeared in the lower bound from [8]
for the competitive ratio of tree exploration algorithms with small teams of agents. In
our scenario, we obtain different lower bounds depending on whether communication
is local or global.
Theorem 5. For all n > 1, every increasing function f such that log f(n) = o(log n), and every constant c > 0, there exists a family of trees T_{n,D}, each with n vertices and height D = Θ(f(n)), such that
(i) for every exploration strategy with global communication that uses Dn^c agents there exists a tree in T_{n,D} such that the number of time steps required for its exploration is at least D(1 + 1/c − o(1)),
(ii) for every exploration strategy with local communication that uses Dn^c agents there exists a tree in T_{n,D} such that the number of time steps required for exploration is at least D(1 + 2/c − o(1)).
When looking at the problem of minimizing the size of the team of agents, our work (Theorem 4) shows that it is possible to achieve asymptotically optimal online exploration time of O(D) using a team of k ≤ Dn^{1+ε} agents, for any ε > 0. For graphs of small diameter, D = n^{o(1)}, we can thus explore the graph in O(D) time steps using k ≤ n^{1+ε} agents. This result almost matches the lower bound on team size of k = Ω(n^{1−o(1)}) for the case of graphs of small diameter, which follows from the trivial lower bound Ω(D + n/k) on exploration time (cf. e.g. [8]). The question of establishing precisely what team size k is necessary and sufficient for performing exploration in O(D) steps in a graph of larger diameter remains open.
References
1. Awerbuch, B., Betke, M., Rivest, R.L., Singh, M.: Piecemeal graph exploration by a mobile
robot. Information and Computation 152(2), 155–172 (1999)
2. Brass, P., Cabrera-Mora, F., Gasparri, A., Xiao, J.: Multirobot tree and graph exploration.
IEEE Transactions on Robotics 27(4), 707–717 (2011)
3. Czyzowicz, J., Ilcinkas, D., Labourel, A., Pelc, A.: Worst-case optimal exploration of terrains
with obstacles. Information and Computation 225, 16–28 (2013)
4. Duncan, C.A., Kobourov, S.G., Kumar, V.S.A.: Optimal constrained graph exploration. ACM
Transactions on Algorithms 2(3), 380–402 (2006)
5. Dynia, M., Korzeniowski, M., Schindelhauer, C.: Power-aware collective tree exploration.
In: Grass, W., Sick, B., Waldschmidt, K. (eds.) ARCS 2006. LNCS, vol. 3894, pp. 341–351.
Springer, Heidelberg (2006)
6. Dynia, M., Kutyłowski, J., Meyer auf der Heide, F., Schindelhauer, C.: Smart robot teams
exploring sparse trees. In: Královič, R., Urzyczyn, P. (eds.) MFCS 2006. LNCS, vol. 4162,
pp. 327–338. Springer, Heidelberg (2006)
7. Dynia, M., Łopuszański, J., Schindelhauer, C.: Why robots need maps. In: Prencipe, G., Zaks,
S. (eds.) SIROCCO 2007. LNCS, vol. 4474, pp. 41–50. Springer, Heidelberg (2007)
8. Fraigniaud, P., Gąsieniec, L., Kowalski, D.R., Pelc, A.: Collective tree exploration. Networks 48(3), 166–177 (2006)
9. Frederickson, G.N., Hecht, M.S., Kim, C.E.: Approximation algorithms for some routing
problems. SIAM Journal on Computing 7(2), 178–193 (1978)
10. Gabriely, Y., Rimon, E.: Competitive on-line coverage of grid environments by a mobile
robot. Computational Geometry 24(3), 197–224 (2003)
11. Herrmann, D., Kamphans, T., Langetepe, E.: Exploring simple triangular and hexagonal grid
polygons online. CoRR, abs/1012.5253 (2010)
12. Higashikawa, Y., Katoh, N.: Online exploration of all vertices in a simple polygon. In: Proc.
6th Frontiers in Algorithmics Workshop and the 8th Int. Conf. on Algorithmic Aspects of
Information and Management (FAW-AAIM), pp. 315–326 (2012)
13. Higashikawa, Y., Katoh, N., Langerman, S., Tanigawa, S.-I.: Online graph exploration al-
gorithms for cycles and trees by multiple searchers. Journal of Combinatorial Optimization
(2013)

14. Icking, C., Kamphans, T., Klein, R., Langetepe, E.: Exploring an unknown cellular en-
vironment. In: Proc. 16th European Workshop on Computational Geometry (EuroCG),
pp. 140–143 (2000)
15. Kolenderska, A., Kosowski, A., Małafiejski, M., Żyliński, P.: An improved strategy for ex-
ploring a grid polygon. In: Kutten, S., Žerovnik, J. (eds.) SIROCCO 2009. LNCS, vol. 5869,
pp. 222–236. Springer, Heidelberg (2010)
16. Łopuszański, J.: Tree exploration. Tech-report, Institute of Computer Science, University of
Wrocław, Poland (2007) (in Polish)
17. Ortolf, C., Schindelhauer, C.: Online multi-robot exploration of grid graphs with rectangu-
lar obstacles. In: Proc. 24th ACM Symp. on Parallelism in Algorithms and Architectures
(SPAA), pp. 27–36 (2012)
Deterministic Polynomial Approach in the Plane

Yoann Dieudonné¹ and Andrzej Pelc²

¹ MIS, Université de Picardie Jules Verne, France
² Département d’informatique, Université du Québec en Outaouais,
Gatineau, Québec, Canada
Abstract. Two mobile agents with range of vision 1 start at arbitrary


points in the plane and have to accomplish the task of approach, which
consists in getting at distance at most one from each other, i.e., in getting
within each other’s range of vision. An adversary chooses the initial po-
sitions of the agents, their possibly different starting times, and assigns
a different positive integer label and a possibly different speed to each
of them. Each agent is equipped with a compass showing the cardinal
directions, with a measure of length and a clock. Each agent knows its
label and speed but not those of the other agent and it does not know
the initial position of the other agent relative to its own. Agents do not
have any global system of coordinates and they cannot communicate.
Our main result is a deterministic algorithm to accomplish the task of
approach, working in time polynomial in the unknown initial distance
between the agents, in the length of the smaller label and in the inverse
of the larger speed. The distance travelled by each agent until approach is
polynomial in the first two parameters and does not depend on the third.
The problem of approach in the plane reduces to a network problem: that
of rendezvous in an infinite grid.

1 Introduction

Among numerous tasks performed by mobile agents one of the most basic and well
studied is that of meeting (or rendezvous) of two agents [4,30]. Agents are mobile
entities equipped with computational power and they may model humans, animals,
mobile robots, or software agents in communication networks. Applications of ren-
dezvous are ubiquitous. People may want to meet in an unknown town, rescuers
have to find a lost tourist in the mountains, while animals meet to mate or to give
food to their offspring. In human-made environments mobile robots meet to exchange collected samples or to divide between them the task of future exploration
of a contaminated terrain, while software agents meet to share data collected from
nodes of a network or to distribute between them the task of collective network
maintenance and checking for faulty components. The basic task of meeting of two
agents is a building block for gathering many agents. In all these cases it is impor-
tant to achieve the meeting in an efficient way.

Supported in part by NSERC discovery grant and by the Research Chair in Distributed Computing of the Université du Québec en Outaouais.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 533–544, 2013.
© Springer-Verlag Berlin Heidelberg 2013
534 Y. Dieudonné and A. Pelc

The Model and the Problem. Agents are usually modeled as points moving
in a graph representing the network, or in the plane. In the first case the meeting
of two agents is defined as both agents being at the same time in the same node
of the graph [15,31] or in the same point inside an edge [11,14]. This is possible to
achieve even when agents have radius of vision 0, i.e., when they cannot sense the
other agent prior to the meeting. As observed in [13], if agents freely circulate
in the plane and can start in arbitrary unknown points of it, bringing them
simultaneously to the same point is impossible with radius of vision 0. A natural
assumption in this case is that agents have some positive radius of vision and
the task is to bring them within this distance, so they can see each other. Once
this is achieved, agents are in contact, so they can get even closer and exchange
information or objects. It should be noted that the expression radius of vision
does not need to be interpreted optically. It is a distance at which the agents can
mutually sense each other optically, audibly (e.g. by emitting sounds in the dark),
chemically (animals smelling each other), or even by touch (real agents are not
points but have positive size, so if a point is chosen inside each of them, there is
some positive distance s, such that they will touch before these chosen points get
at distance s). Since in this paper we study agents moving in the plane, we are
interested in the above described task of bringing the points representing them
at some pre-defined positive distance. Without loss of generality we assume that
this distance is 1 and we call approach the task of bringing the points representing
the two agents at distance at most 1.
An adversary chooses the initial positions of the agents, which are two arbi-
trary points of the plane, it chooses their possibly different starting times, and
it assigns a different label and a possibly different speed to each of them. At
all times each of the agents moves at the assigned speed or stays idle. Labels
are positive integers. Each agent is equipped with a compass showing the cardi-
nal directions, with a measure of length, and with a clock. Clocks of the agents
are not necessarily synchronized. Hence an agent can perform basic actions of
the form: “go North/East/South/West at a given distance” and “stay idle for
a given amount of time”. In fact, these will be the only actions performed by
agents in our solution. Each agent knows its label, having a clock and a mea-
sure of length it can calculate its speed, but it has no information about the
other agent: it does not know the initial position of the other agent relative to
its own, the distance separating them, it does not know the speed of the other
agent or its label. Agents do not have any global system of coordinates and,
prior to accomplishing the approach, they cannot communicate. The cost of an
algorithm accomplishing the task of approach is the total distance travelled by
both agents, and the time of an algorithm is counted from the start of the later
agent.
Our Results. Our main result is a deterministic algorithm to accomplish the
task of approach, working in time polynomial in the unknown initial distance
between the agents, in the length of (the binary representation of) the shorter
label and in the inverse of the larger speed. (Hence it is also polynomial in the
distance, the length of the other label and the inverse of the other speed.) The

cost of the algorithm is polynomial in the first two parameters and does not
depend on the third. The problem of approach in the plane reduces to a network
problem: that of rendezvous in an infinite grid. Due to lack of space, all the
proofs will appear in the journal version of the paper.
Discussion and Open Problems. In this paper we are only interested in
deterministic solutions to the approach problem. Randomized solutions, based on
random walks on a grid, are well known [29]. Let us first discuss the assumptions
concerning the equipment of the agents. From an application point of view the
three tools provided to the agents, i.e., a compass, a unit of length and a clock,
do not seem unrealistic when agents are humans or robots. Nevertheless it is
an interesting question if these tools are really necessary. It is clear that an
agent needs some kind of compass just to be able to change direction of its
walk. However it remains open if our result remains true if the compasses of the
agents are subject to some level of inaccuracy. The same remark concerns the
measure of length: without it an agent could not carry out any moving plan,
but it remains open if the result is still valid if agents have different, possibly
completely unrelated units of length. Probably the most interesting question
concerns the necessity of equipping the agents with a clock. In our solution
clocks play a vital role, as the algorithm is based on interleaving patterns of
moves with prescribed waiting periods. However, the possibility of designing a
polynomial algorithm for the task of approach without any waiting periods, with
each agent always traveling at its steady speed prescribed by the adversary, is
not excluded. Such an algorithm could possibly work without relying on any
clock.
We count the execution time of the algorithm from the starting time of the
later agent. Notice that counting time from the start of the earlier agent does
not make sense. Indeed, the adversary can assign an extremely small speed to
the earlier agent, give a large speed to the later one and start it only when the
earlier agent traversed half of the initial distance between them. The time of
traversing this distance by the earlier agent can be arbitrarily large with respect
to the inverse of the speed of the later agent, and approach must use at least
this time, if the initial distance is larger than 2.
Notice that assigning different labels to agents is the only way to break sym-
metry between them in a deterministic way, and hence to ensure deterministic
approach. Anonymous (identical) agents would make identical moves, and hence
could never approach if started simultaneously with the same speed.
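To illustrate how distinct labels break symmetry (this is a generic, classical construction, not the algorithm of this paper), a label can be mapped to a prefix-free encoding, and each agent can then schedule "move" and "wait" phases according to the bits of its encoding. Since no encoding of one label is a prefix of another, the two schedules are guaranteed to differ at some phase.

```python
def transform(label):
    """Prefix-free encoding of a positive integer: double each bit of the
    binary representation, then append the delimiter '01'."""
    bits = bin(label)[2:]
    return "".join(2 * b for b in bits) + "01"

e5, e6 = transform(5), transform(6)   # 5 = 101, 6 = 110 in binary
print(e5, e6)                          # 11001101 11110001
# Neither encoding is a prefix of the other, so move/wait schedules
# driven by these bit strings must disagree at some phase.
assert not e5.startswith(e6) and not e6.startswith(e5)
```

The delimiter '01' can never occur across a pair boundary inside the doubled region (which consists only of '00' and '11' pairs), which is what makes the encoding prefix-free.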
Concerning the complexity of our algorithm, our goal in this paper is only
to keep both the time and the cost polynomial. We do not make any attempt
at optimizing the obtained polynomials. Some improvements may have been
obtained by using more complicated but slightly more efficient procedures or by
performing tighter analysis. However getting optimal time and cost seems to be
a very challenging open problem.
Next, it is interesting to ponder the degree of asynchrony allowed in the nav-
igation of the agents. Some asynchrony is included in our model, by allowing
the adversary to assign arbitrary, possibly different, mutually unknown speeds

to both agents. Nevertheless we assume that when agents move, they move at
constant speed. A higher level of asynchrony was assumed in [7,11,13,14,22]: the
adversary could change the speed of each agent arbitrarily or halt the agent for
an arbitrary finite time. In such a model it is of course impossible to limit the
time of approach, so our main result could not remain valid. However, it is still
perhaps possible to preserve our other result: design an algorithm in the scenario
of arbitrarily varying speeds of agents, in which the distance travelled by each
agent until approach is polynomial in the unknown initial distance between the
agents and in the length of the shorter label. We leave this as an open problem.
Finally, let us consider the issue of the memory size of the agents. In our model
we do not impose any restriction on it, treating agents, from the computational
point of view, as Turing machines. However, it is easy to see that the execution
of our algorithm requires memory of O(log D + log L + log(1/v)) bits, where D is
the initial distance between the agents, L is the agent’s label and v is its speed.
As for the corresponding lower bound, Ω(log L) bits of memory are necessary to
store the label of the agent, and it can be shown that Ω(log D) bits of memory
are necessary for approach even in the easier scenario where agents are on a line
instead of the plane. By contrast, the lower bound Ω(log(1/v)) is much less clear.
Indeed, if a wait-free solution not using the clock is possible, it could perhaps
be implemented by agents whose memory size does not depend on their speed.
Also this question remains open.
Related Work. The literature on rendezvous can be broadly divided accord-
ing to whether the agents move in a randomized or in a deterministic way. An
extensive survey of randomized rendezvous in various scenarios can be found in
[4], cf. also [2,3,5,6,26]. In the sequel we briefly discuss the literature on deter-
ministic rendezvous that is more closely related to our scenario. This literature
is naturally divided according to the way of modeling the environment: agents
can either move in a graph representing a network, or in the plane. Deterministic
rendezvous in networks has been surveyed in [30].
In most papers on rendezvous in networks a synchronous scenario was as-
sumed, in which agents navigate in the graph in synchronous rounds. Rendezvous
with agents equipped with tokens used to mark nodes was considered, e.g., in [27].
Rendezvous of two agents that cannot mark nodes but have unique labels was
discussed in [15,25,31]. These papers are concerned with the time of synchronous
rendezvous in arbitrary graphs. In [15] the authors show a rendezvous algorithm
polynomial in the size of the graph, in the length of the shorter label and in
the delay between the starting time of the agents. In [25,31] rendezvous time is
polynomial in the first two of these parameters and independent of the delay.
Memory required by two anonymous agents to achieve deterministic rendezvous
has been studied in [20,21] for trees and in [12] for general graphs.
Rendezvous of more than two agents, often called gathering, has been studied,
e.g., in [16,17,28,32]. In [16] agents were anonymous, while in [32] the authors
considered gathering many agents with unique labels. Gathering many labeled
agents in the presence of Byzantine agents was studied in [17]. Gathering many
agents in the plane has been studied in [8,9,19] under the assumption that agents

are memoryless, but they can observe other agents and make navigation deci-
sions based on these observations. Fault-tolerant aspects of this problem were
investigated, e.g., in [1,10]. On the other hand, gathering memoryless agents in
a ring, assuming that agents can see the entire ring and positions of agents in
it, was studied in [23,24].
Asynchronous rendezvous of two agents in a network has been studied in
[7,11,13,14,18,22] in a model when the adversary can arbitrarily change the speed
of each agent or halt the agent for an arbitrary finite time. As mentioned previ-
ously, in this model time cannot be bounded, hence the authors concentrated on
the cost of rendezvous, measured as the total number of edge traversals executed
by both agents. In [14] the authors investigated the cost of rendezvous in the
infinite line and in the ring. They also proposed a rendezvous algorithm for an
arbitrary graph with a known upper bound on the size of the graph. This as-
sumption was subsequently removed in [13], but both in [14] and in [13] the cost
of rendezvous was exponential in the size of the graph and in the larger label.
In [22] asynchronous rendezvous was studied for anonymous agents and the cost
was again exponential. The result from [13] implies a solution to the problem of
approach in the plane at cost exponential in the initial distance between agents
and in the larger of the labels.
The first asynchronous rendezvous algorithms at cost polynomial in the initial
distance of the agents were presented in [7,11]. In these papers the authors
worked in infinite multidimensional grids and their result implies a solution to
the problem of approach in the plane at cost polynomial in the initial distance of
the agents. However, they used the powerful assumption that each agent knows
its starting position in a global system of coordinates. It should be stressed
that the assumptions and the results in these papers are incomparable to the
assumptions and results in the present paper. In [7,11] the authors allow the
adversary to arbitrarily control and change the speed of each agent, and hence
cannot control the time of rendezvous and do not use clocks. To get polynomial
cost (indeed, their cost is close to optimal) they use the assumption of known
starting position in an absolute system of coordinates. By contrast, we assume
arbitrary, possibly different and unknown but constant speeds of each agent, use
clocks and different integer labels of agents but not their positions (in fact our
agents are completely ignorant of where they are) and obtain an algorithm of
polynomial time and cost (in the previously described parameters).
In a recent paper [18] we designed a rendezvous algorithm working for an
arbitrary finite graph in the above asynchronous model. The algorithm has cost
polynomial in the size of the graph and in the length of the smaller label. Again,
the assumptions and the results are incomparable to those of the present paper.
First, in [18], as in [7,11,13], time cannot be controlled and cost in [18] is polyno-
mial in the size of the graph. More importantly, it is unlikely that the methods
from [18], tailored for arbitrary finite graphs, could be used even to obtain our
present result about cost. Indeed, it is easy to see that making cost polynomial
in the initial distance is not possible in arbitrary graphs, as witnessed by the
case of the clique: the adversary can hold one agent at a node and make the
538 Y. Dieudonné and A. Pelc

other agent traverse Θ(n) edges before rendezvous (even at steady speed), in
spite of the initial distance 1. Also in [18] agents walk in the same finite graph,
which is not the case in our present scenario.

2 Preliminaries
It follows from [13] that the problem of approach can be reduced to that of
rendezvous in an infinite grid, in which every node u is adjacent to 4 nodes at
Euclidean distance 1 from it, and located North, East, South and West from node
u. We call this grid a basic grid. Rendezvous in this grid means simultaneously
bringing two agents starting at arbitrary nodes of the grid to the same node or
to the same point inside some edge.
Hence in the rest of the paper we will consider rendezvous in a basic grid,
instead of the task of approach. Instructions in a rendezvous algorithm are:
“go North/East/South/West at distance 1” and “stay idle for a given amount
of time”. Before executing our rendezvous algorithm in a basic grid, an agent
performs two preprocessing procedures. The first is the procedure of transforming
the label of the agent and works as follows. Let L = (b0 b1 . . . br−1 ) be the binary
representation of the label of the agent. We define its transformed label L∗
as the binary sequence (b0 b0 b1 b1 . . . br−1 br−1 01). Notice that the length of the
transformed label L∗ is 2r + 2, where r is the length of label L. Moreover,
transformed labels are never prefixes of each other and they must differ at some
position different from the first. This is why original labels are transformed.
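As an illustration, this doubling-plus-suffix transformation can be sketched as follows (the function name and the integer encoding of labels are ours, not the paper's):

```python
def transform_label(label: int) -> str:
    """Double each bit of the binary representation of the label,
    then append the suffix "01", as described above."""
    bits = format(label, "b")              # b0 b1 ... b(r-1), no leading zeros
    return "".join(b + b for b in bits) + "01"

# 5 = (101) in binary, so its transformed label is (11001101), of length 2*3 + 2.
transform_label(5)
```

The doubled bits always come in equal pairs, while the suffix "01" does not; this is why the transformed label of one agent can never be a prefix of the transformed label of another.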
The second preprocessing procedure performed by an agent is computing the
inverse of its speed: the agent measures the time θ it takes it to traverse a
distance of length 1. This time, called the basic time of the agent, will be used to
establish the length of waiting periods in the execution of the algorithm. In fact,
measuring θ can be done when the agent traverses the first edge of the basic grid
indicated by the algorithm.
We denote by Δ the initial distance between agents in the basic grid.
Notice that D ≤ Δ ≤ √2 · D, where D is the initial Euclidean distance between
the agents. Let λ be the length of the shorter of the transformed labels of agents,
and let τ be the shorter basic time.
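The two inequalities above are just the standard comparison between the Manhattan and Euclidean metrics; a quick numerical sanity check (illustrative only):

```python
import math
import random

random.seed(1)
for _ in range(1000):
    # Two starting nodes of the basic grid, drawn at random.
    x1, y1 = random.randint(-100, 100), random.randint(-100, 100)
    x2, y2 = random.randint(-100, 100), random.randint(-100, 100)
    delta = abs(x1 - x2) + abs(y1 - y2)    # Manhattan distance in the basic grid
    d = math.hypot(x1 - x2, y1 - y2)       # Euclidean distance
    assert d <= delta <= math.sqrt(2) * d + 1e-9
```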

3 Algorithm
Patterns. We first describe several patterns of moves that will be used by our
algorithm. All the patterns are routes in the basic grid, and distances between
nodes are measured also in the basic grid, i.e., in the Manhattan metric. We use
N (resp. E,S,W) to denote the instruction “make an edge traversal by port North
(resp. East, South, West)”. We define the reverse path to the path v1 , . . . , vk of
the agent as the path vk , vk−1 , . . . , v1 . We also define a sub-path of the path
v1 , . . . , vk as vi , vi+1 , . . . , vj−1 , vj , for some 1 ≤ i < j ≤ k.
Pattern BALL(v, s), for a node v and an integer s ≥ 1, visits all nodes of the grid
at distance at most s from v and traverses all edges of the grid between such nodes.
Moreover, executing this pattern the agent is always at distance at most s from v.
Let S(v, i) be the set of nodes at distance exactly i from v. Pattern BALL(v, s) is
executed in s phases, each of which starts and ends at v. Phase 1 is the unit cross with
center v corresponding to the sequence NSEWSNWE. Suppose that phase i − 1,
for i > 1, has been executed. Let v1 , . . . , vq be nodes of S(v, i − 1) with v1 situated
North of v, and all other nodes of S(v, i − 1) ordered clockwise. Phase i consists of
q stages σ1 , . . . , σq . Stage σ1 consists of going from v to v1 using the shortest path
and performing the unit cross NSEWSNWE with center v1 . Stage σj , for 1 < j < q
consists of going from vj−1 to vj using the unique path of length 2 with midpoint
in S(v, i − 2) and performing the unit cross NSEWSNWE with center vj . Stage
σq consists of going from vq−1 to vq , performing the unit cross NSEWSNWE with
center vq and going back to v by the lexicographically smallest shortest path (coded
as a sequence of letters N, E, S, W).
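The precise phase-by-phase route matters for the analysis, but the coverage guarantee of BALL is easy to state in code. The sketch below is a simplified stand-in (a depth-first closed walk, not the pattern defined above) with the same guarantee: it starts and ends at v, visits every node of the ball of radius s, and never leaves the ball. All names are ours.

```python
def ball_walk(v, s):
    """Closed walk from v that visits every grid node at Manhattan
    distance at most s from v, staying inside that ball throughout.
    A simplified substitute for pattern BALL(v, s), for illustration."""
    steps = [(0, 1), (1, 0), (0, -1), (-1, 0)]   # N, E, S, W
    route, seen = [v], {v}

    def dfs(u):
        for dx, dy in steps:
            w = (u[0] + dx, u[1] + dy)
            if w not in seen and abs(w[0] - v[0]) + abs(w[1] - v[1]) <= s:
                seen.add(w)
                route.append(w)
                dfs(w)
                route.append(u)    # backtrack over the same edge
    dfs(v)
    return route

walk = ball_walk((0, 0), 2)   # visits all 13 nodes of the radius-2 ball
```

Each recursive call crosses one edge twice (once forward, once backtracking), so the walk length is linear in the number of nodes of the ball, in the spirit of BALL's cost.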
Pattern SUPERBALL(v, s), for a node v and an integer s ≥ 0, consists
of performing the sequence of patterns BALL(v, 1), BALL(v, 2), ..., BALL(v, s),
followed by the reverse path of this sequence of patterns.
For the subsequent patterns we will use the following notation. For i =
1, . . . , s, let w(i, 1), w(i, 2), ... ,w(i, q(i)) be the enumeration of all nodes u at
distance at most i from v in the lexicographic order of the lexicographically
smallest shortest path from v to u.
Pattern FLOWER(v, s, k) is executed in phases 1, 2, . . . , ks. Each phase con-
sists of two parts. For i ≤ s, part 1 of phase i consists of q(i) stages. Stage
j consists of going from v to w(i, j) by the lexicographically smallest short-
est path π(i, j), then executing SUPERBALL(w(i, j), i) and then backtrack-
ing to v using the path reverse to π(i, j). For i > s, part 1 of phase i is the
same as part 1 of phase s, except that SUPERBALL(w(s, j), s) is replaced
by SUPERBALL(w(s, j), i). Part 2 of every phase is backtracking using the
reverse path to that used in part 1.
Pattern BOUQUET(v, s, k) is executed in epochs 1, 2, . . . , ks. Each epoch con-
sists of two parts. For i ≤ s, part 1 of epoch i consists of q(i) stages. Stage j consists
of going from v to w(i, j) by the lexicographically smallest shortest path π(i, j),
then executing phase 1, phase 2, ..., phase i of FLOWER(w(i, j), s, k) and then
backtracking to v using the path reverse to π(i, j). For i > s, part 1 of epoch i is the
same as part 1 of epoch s, except that the execution of phase 1, phase 2, ..., phase
s of FLOWER(w(s, j), s, k) is replaced by the execution of phase 1, phase 2, ...,
phase s, phase s + 1, ..., phase i of FLOWER(w(s, j), s, k). Part 2 of every epoch
is backtracking using the reverse path to that used in part 1.
Pattern CATCH-BOUQUET(v, s, k) is executed in q(s) stages. Stage j
consists of going from v to w(s, j) by the lexicographically smallest shortest path
π(s, j), then executing pattern BOUQUET(w(s, j), s, k) and then backtracking
to v using the path reverse to π(s, j).
Pattern BORDER(v, s, n) consists of executing SUPERBALL(v, s) n times.
Apart from the above patterns of moves, our algorithm will use procedures
WAIT0(v, s, k), WAIT1(v, s, k), WAIT2(v, s, k), and WAIT3(v, s, k). Each of
these procedures consists of waiting at the initial position v of the agent for a
prescribed period of time. We will specify these periods of waiting later on.
The Main Idea. The main idea of our rendezvous algorithm in the basic grid is
the following. In order to guarantee rendezvous, symmetry in the actions of the
agents must be broken. Since agents have different transformed labels, this can
be done by designing the algorithm so that each agent processes consecutive bits
of its transformed label, acting differently when the current bit is 0 and when
it is 1. The aim is to force rendezvous when each agent processes the bit cor-
responding to the position where their transformed labels differ. This approach
requires to overcome two major difficulties. The first is that due to the possibly
different starting times and different speeds, agents may execute corresponding
bits of their transformed labels at different times. This problem is solved in our
algorithm by carefully scheduling patterns BORDER and waiting times, in or-
der to synchronize the agents. Patterns BORDER and waiting times have the
following role in this synchronization effort. While a pattern BORDER executed
by one agent pushes the other agent to proceed in its execution, or otherwise
rendezvous is accomplished, waiting periods slow down the executing agent. The
joint application of these two algorithmic ingredients guarantees that agents will
at some point execute almost simultaneously the bit on which they differ. The
second difficulty is to orchestrate rendezvous after the first difficulty has been
overcome, i.e., when each agent executes this bit. This is done by combining
waiting periods with patterns that are included in one another for some pa-
rameters. Our algorithm is designed in such a way that the execution of bit 0
consists of executing a pattern FLOWER followed by a waiting period followed
by pattern CATCH-BOUQUET, while the execution of bit 1 consists of ex-
ecuting a pattern BOUQUET followed by a waiting period. According to our
algorithm, either FLOWER will be included in BOUQUET, or BOUQUET
will be included in CATCH-BOUQUET. If one agent executes a pattern
P′ included in the pattern P′′ executed simultaneously by the other agent, one
agent must “catch” the other, i.e., rendezvous must occur. The main role of the
waiting periods associated with these patterns is to slow down the agent execut-
ing the pattern included in the other. Indeed, these waiting periods ensure that
the agent executing pattern P′ does not complete it too early and start some
other action. It can be shown that this synchronization occurs soon enough in
the execution to guarantee rendezvous at polynomial time and cost.
Description of the Algorithm. We are now ready to present a detailed de-
scription of our rendezvous algorithm in the basic grid, executed by an agent
with transformed label L∗ = (c0 c1 . . . cℓ−1) of length ℓ and with basic time θ.
The agent starts at node v. For technical reasons we define cj = 0 for all j ≥ ℓ.
The main “repeat” loop is executed until rendezvous is accomplished. At this
time both agents stop.
For any pattern P we will use the notation C[P] to denote the number of edge
traversals in the execution of P.
Algorithm Meeting
s := 1; i := 0
repeat
    j := 0
    while j ≤ s − 1 do
        if cj = 0 then
            execute FLOWER(v, s, 4i + 1)
            execute WAIT0(v, s, 4i + 1)
            execute CATCH-BOUQUET(v, s, 4i + 1)
        else
            execute BOUQUET(v, s, 4i + 1)
            execute WAIT1(v, s, 4i + 1)
        endif
        j := j + 1; i := i + 1; N := 3 · C[CATCH-BOUQUET(v, s, 4i + 1)]
        if j ≤ s − 1 then
            execute BORDER(v, (4i + 1)s, N)
            execute WAIT2(v, s, 4i + 1)
        else
            M := 2s · C[BORDER(v, (4i + 1)s, N)]
            execute BORDER(v, (4i + 1)s, M)
            execute WAIT3(v, s, 4i + 1)
        endif
    endwhile
    s := s + 1
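The control flow of the algorithm can be mirrored by the following skeleton, in which every pattern execution and waiting period is replaced by a logged event; it only shows the order in which bits and patterns are scheduled and how the counters evolve, and all names are ours:

```python
def meeting_schedule(transformed_label: str, max_pieces: int):
    """Skeleton of Algorithm Meeting: records, for each piece s, which
    bit cj is processed and which patterns would be executed. Bits past
    the end of the transformed label are 0, as in the text."""
    def c(j):
        return transformed_label[j] if j < len(transformed_label) else "0"

    events = []
    s, i = 1, 0
    while s <= max_pieces:               # truncated form of the "repeat" loop
        j = 0
        while j <= s - 1:
            if c(j) == "0":
                events.append((s, j, "FLOWER, WAIT0, CATCH-BOUQUET", 4 * i + 1))
            else:
                events.append((s, j, "BOUQUET, WAIT1", 4 * i + 1))
            j += 1
            i += 1
            events.append((s, j, "fence" if j <= s - 1 else "wall", 4 * i + 1))
        s += 1
    return events, i
```

Note that after the s-th piece the counter i equals 1 + 2 + · · · + s = s(s + 1)/2; in particular, between corresponding segments of the s-th and (2s)-th pieces the counter grows by exactly s² + s(s − 1)/2, the shift used in the definitions of the waiting periods below.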

It remains to give the lengths of the waiting periods used by our algorithm. To
this end we introduce the following terminology. We first define fences and walls.
A wall is a pattern BORDER executed in the algorithm immediately prior to a
waiting period WAIT3. A fence is any other pattern BORDER. We next define
pieces as follows. Notice that any execution of the algorithm can be viewed as
a concatenation of chunks of the following form: some sequence of instructions
Q immediately followed by a wall, immediately followed by a waiting period
WAIT3. We define the first piece Q1 as the sequence of instructions before the
first wall, and the i-th piece Qi, for i > 1, as the sequence of instructions between
the end of the (i − 1)-th WAIT3 and the beginning of the i-th wall. We next define
segments. Consider any piece. It can be viewed as a concatenation of chunks of
the following form: some sequence of instructions S immediately followed by
a fence, immediately followed by a waiting period WAIT2. We define the first
segment S1 of a piece as the sequence of instructions before the first fence of
this piece, and the i-th segment Si of the piece, for i > 1, as the sequence
of instructions between the end of the (i − 1)-th WAIT2 in the piece and the
beginning of the i-th fence in it. Notice that segments correspond to bits of the
transformed label of the agent: these are sequences of instructions executed in
the statement “if cj = 0 then ... else ...”. A pattern FLOWER, BOUQUET or
CATCH-BOUQUET will be called an atom of its segment. Now we are ready
to give the lengths of waiting periods WAIT0(v, s, 4i + 1), WAIT1(v, s, 4i + 1),
WAIT2(v, s, 4i + 1), and WAIT3(v, s, 4i + 1).
Consider the waiting period WAIT0(v, s, 4i + 1).
Let τ0 = θ · C[FLOWER(v, s, 4i + 1)] be the time spent by the agent to perform
FLOWER(v, s, 4i + 1). The length of the waiting period WAIT0(v, s, 4i + 1) is
defined as τ0 · C[BOUQUET(v, 2s, 4i + 1)]. Notice that if WAIT0(v, s, 4i + 1)
is located in the m-th segment of the s-th piece, then its length upper-bounds
the time of executing the first atom of the m-th segment of the (2s)-th piece –
assuming that this segment corresponds to bit 1 – by an agent with basic time
τ0 (because this atom is BOUQUET(v, 2s, 4i + 1)). This property is essential
for the proof of correctness.
Consider the waiting period WAIT1(v, s, 4i + 1) located in the m-th segment
of the s-th piece. Let τ1 = θ · C[BOUQUET(v, s, 4i + 1)] be the time spent by the
agent to perform BOUQUET(v, s, 4i + 1). Let i′ = i + s² + s(s − 1)/2. The length
of the waiting period WAIT1(v, s, 4i + 1) is defined as the time of executing the
m-th segment of the (2s)-th piece – assuming that this segment corresponds to
bit 0 – by an agent with basic time τ1.
Consider the waiting period WAIT2(v, s, 4i + 1) located immediately before
the m-th segment of the s-th piece. Let τ2 be the sum of times spent by the agent
with basic time θ to perform the following chunks: the (m − 1)-th segment of the
s-th piece, the (m − 1)-th fence of the s-th piece, and the first atom of the m-th
segment of the s-th piece. The length of the waiting period WAIT2(v, s, 4i + 1)
is defined as the time to perform the (m − 1)-th segment of the (2s)-th piece –
assuming that this segment corresponds to bit 1 – together with the (m − 1)-th
fence of the (2s)-th piece, by an agent with basic time τ2.
Consider the waiting period WAIT3(v, s, 4i + 1). This period is located im-
mediately before the (s + 1)-th piece. Let τ3 be the sum of times spent by the
agent with basic time θ to perform the following chunks: the s-th piece, the s-th
wall, and the first atom of the first segment of the (s + 1)-th piece. The length
of the waiting period WAIT3(v, s, 4i + 1) is defined as the time to perform the
(2s)-th wall by an agent with basic time τ3.
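Numerically, each waiting length is a product of a measured basic time and pattern costs. For instance, the length of WAIT0 can be computed as below (the cost values in the call are hypothetical, since the actual C[·] values depend on the patterns defined earlier):

```python
def wait0_length(theta, cost_flower, cost_bouquet_2s):
    """Length of WAIT0(v, s, 4i+1): the agent's time for FLOWER(v, s, 4i+1),
    i.e. theta * C[FLOWER(v, s, 4i+1)], multiplied by C[BOUQUET(v, 2s, 4i+1)]."""
    tau0 = theta * cost_flower
    return tau0 * cost_bouquet_2s

wait0_length(0.5, 120, 4000)   # -> 240000.0
```

An agent with basic time τ0 traverses C[BOUQUET(v, 2s, 4i + 1)] edges in exactly this time, which is the upper-bounding property used in the correctness proof.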
The lengths of the waiting periods having been defined, the description of our
algorithm is now complete.

4 Correctness and Complexity


In this section we formulate results stating that Algorithm Meeting accomplishes
rendezvous in the basic grid, that its execution time is polynomial in Δ, λ and
τ , and that its cost is polynomial in Δ and λ. The proofs of these results will
appear in the journal version of the paper.
The following theorem implies that Algorithm Meeting is correct.

Theorem 1. The rendezvous of agents executing Algorithm Meeting must occur
before the first time when one of them completes the (2(Δ + λ) + 1)-th piece.

Our next result estimates the complexity of Algorithm Meeting.


Theorem 2. Let Δ be the initial distance between agents in the basic grid. Let
λ be the length of the shorter of the transformed labels of agents. Let τ be the
shorter of the basic times of the agents. Then the execution time of Algorithm
Meeting is polynomial in Δ, λ and τ , and its cost is polynomial in Δ and λ.

Since Δ is linear in the initial Euclidean distance between agents, and the length
of the transformed label of an agent is linear in the length of the original label,
in view of the reduction described in Section 2 we have the following corollary
concerning the task of approach in the plane.

Corollary 1. Let D be the initial Euclidean distance between agents in the
plane, let γ be the length of the binary representation of the shorter label and
let τ be the inverse of the larger speed of the agents. Then deterministic ap-
proach between agents is possible in time polynomial in D, γ and τ , and at cost
polynomial in D and γ.

References
1. Agmon, N., Peleg, D.: Fault-tolerant gathering algorithms for autonomous mobile
robots. SIAM J. Comput. 36, 56–82 (2006)
2. Alpern, S.: The rendezvous search problem. SIAM J. on Control and Optimiza-
tion 33, 673–683 (1995)
3. Alpern, S.: Rendezvous search on labelled networks. Naval Research Logistics 49,
256–274 (2002)
4. Alpern, S., Gal, S.: The theory of search games and rendezvous. Int. Series in
Operations Research and Management Science. Kluwer Academic Publishers (2002)
5. Alpern, S., Baston, V., Essegaier, S.: Rendezvous search on a graph. Journal of
Applied Probability 36, 223–231 (1999)
6. Anderson, E., Weber, R.: The rendezvous problem on discrete locations. Journal
of Applied Probability 28, 839–851 (1990)
7. Bampas, E., Czyzowicz, J., Gąsieniec, L., Ilcinkas, D., Labourel, A.: Almost opti-
mal asynchronous rendezvous in infinite multidimensional grids. In: Lynch, N.A.,
Shvartsman, A.A. (eds.) DISC 2010. LNCS, vol. 6343, pp. 297–311. Springer, Hei-
delberg (2010)
8. Cieliebak, M., Flocchini, P., Prencipe, G., Santoro, N.: Solving the robots gathering
problem. In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.)
ICALP 2003. LNCS, vol. 2719, pp. 1181–1196. Springer, Heidelberg (2003)
9. Cohen, R., Peleg, D.: Convergence properties of the gravitational algorithm in
asynchronous robot systems. SIAM J. Comput. 34, 1516–1528 (2005)
10. Cohen, R., Peleg, D.: Convergence of autonomous mobile robots with inaccurate
sensors and movements. SIAM J. Comput. 38, 276–302 (2008)
11. Collins, A., Czyzowicz, J., Gąsieniec, L., Labourel, A.: Tell me where I am so
I can meet you sooner. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf
der Heide, F., Spirakis, P.G. (eds.) ICALP 2010. LNCS, vol. 6199, pp. 502–514.
Springer, Heidelberg (2010)
12. Czyzowicz, J., Kosowski, A., Pelc, A.: How to meet when you forget: Log-space
rendezvous in arbitrary graphs. Distributed Computing 25, 165–178 (2012)
13. Czyzowicz, J., Labourel, A., Pelc, A.: How to meet asynchronously (almost) every-
where. ACM Transactions on Algorithms 8, article 37 (2012)
14. De Marco, G., Gargano, L., Kranakis, E., Krizanc, D., Pelc, A., Vaccaro, U.: Asyn-
chronous deterministic rendezvous in graphs. Theoretical Computer Science 355,
315–326 (2006)
15. Dessmark, A., Fraigniaud, P., Kowalski, D., Pelc, A.: Deterministic rendezvous in
graphs. Algorithmica 46, 69–96 (2006)
16. Dieudonné, Y., Pelc, A.: Anonymous meeting in networks. In: Proc. 24th Annual
ACM-SIAM Symposium on Discrete Algorithms (SODA 2013), pp. 737–747 (2013)
17. Dieudonné, Y., Pelc, A., Peleg, D.: Gathering despite mischief. In: Proc.
23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2012),
pp. 527–540 (2012)
18. Dieudonné, Y., Pelc, A., Villain, V.: How to meet asynchronously at polynomial
cost. In: Proc. 32nd Annual ACM Symposium on Principles of Distributed Com-
puting, PODC 2013 (to appear, 2013)
19. Flocchini, P., Prencipe, G., Santoro, N., Widmayer, P.: Gathering of asynchronous
oblivious robots with limited visibility. In: Ferreira, A., Reichel, H. (eds.) STACS
2001. LNCS, vol. 2010, pp. 247–258. Springer, Heidelberg (2001)
20. Fraigniaud, P., Pelc, A.: Deterministic rendezvous in trees with little memory.
In: Taubenfeld, G. (ed.) DISC 2008. LNCS, vol. 5218, pp. 242–256. Springer,
Heidelberg (2008)
21. Fraigniaud, P., Pelc, A.: Delays induce an exponential memory gap for rendezvous
in trees. In: Proc. 22nd Ann. ACM Symposium on Parallel Algorithms and Archi-
tectures (SPAA 2010), pp. 224–232 (2010)
22. Guilbault, S., Pelc, A.: Asynchronous rendezvous of anonymous agents in arbitrary
graphs. In: Fernàndez Anta, A., Lipari, G., Roy, M. (eds.) OPODIS 2011. LNCS,
vol. 7109, pp. 421–434. Springer, Heidelberg (2011)
23. Klasing, R., Kosowski, A., Navarra, A.: Taking advantage of symmetries: Gather-
ing of many asynchronous oblivious robots on a ring. Theoretical Computer Sci-
ence 411, 3235–3246 (2010)
24. Klasing, R., Markou, E., Pelc, A.: Gathering asynchronous oblivious mobile robots
in a ring. Theoretical Computer Science 390, 27–39 (2008)
25. Kowalski, D., Malinowski, A.: How to meet in anonymous network. In: Flocchini,
P., Gąsieniec, L. (eds.) SIROCCO 2006. LNCS, vol. 4056, pp. 44–58. Springer,
Heidelberg (2006)
26. Kranakis, E., Krizanc, D., Morin, P.: Randomized rendez-vous with limited mem-
ory. In: Laber, E.S., Bornstein, C., Nogueira, L.T., Faria, L. (eds.) LATIN 2008.
LNCS, vol. 4957, pp. 605–616. Springer, Heidelberg (2008)
27. Kranakis, E., Krizanc, D., Santoro, N., Sawchuk, C.: Mobile agent rendezvous in
a ring. In: Proc. 23rd Int. Conference on Distributed Computing Systems (ICDCS
2003), pp. 592–599. IEEE (2003)
28. Lim, W., Alpern, S.: Minimax rendezvous on the line. SIAM J. on Control and
Optimization 34, 1650–1665 (1996)
29. Mitzenmacher, M., Upfal, E.: Probability and computing: randomized algorithms
and probabilistic analysis. Cambridge University Press (2005)
30. Pelc, A.: Deterministic rendezvous in networks: A comprehensive survey. Net-
works 59, 331–347 (2012)
31. Ta-Shma, A., Zwick, U.: Deterministic rendezvous, treasure hunts and strongly
universal exploration sequences. In: Proc. 18th ACM-SIAM Symposium on Discrete
Algorithms (SODA 2007), pp. 599–608 (2007)
32. Yu, X., Yung, M.: Agent rendezvous: a dynamic symmetry-breaking problem.
In: Meyer auf der Heide, F., Monien, B. (eds.) ICALP 1996. LNCS, vol. 1099,
pp. 610–621. Springer, Heidelberg (1996)
Outsourced Pattern Matching

Sebastian Faust1, Carmit Hazay2, and Daniele Venturi3

1 Security and Cryptography Laboratory, EPFL, Switzerland
2 Faculty of Engineering, Bar-Ilan University, Israel
3 Department of Computer Science, Aarhus University, Denmark

Abstract. In secure delegatable computation, computationally weak
devices (or clients) wish to outsource their computation and data to
an untrusted server in the cloud. While most earlier work considers the
general question of how to securely outsource any computation to the
cloud server, we focus on concrete and important functionalities and give
the first protocol for the pattern matching problem in the cloud. Loosely
speaking, this problem considers a text T that is outsourced to the cloud
S by a client CT . In a query phase, clients C1 , . . . , Cl run an efficient pro-
tocol with the server S and the client CT in order to learn the positions
at which a pattern of length m matches the text (and nothing beyond
that). This is called the outsourced pattern matching problem and is
highly motivated in the context of delegatable computing since it offers
storage alternatives for massive databases that contain confidential data
(e.g., health related data about patient history). Our constructions offer
simulation-based security in the presence of semi-honest and malicious
adversaries (in the random oracle model) and limit the communication
in the query phase to O(m) bits plus the number of occurrences — which
is optimal. In contrast to generic solutions for delegatable computation,
our schemes do not rely on fully homomorphic encryption but instead
use novel ideas for solving pattern matching, based on efficiently solv-
able instances of the subset sum problem.

1 Introduction
The problem of securely outsourcing computation to an untrusted server gained
momentum with the recent penetration of cloud computing services. In cloud
computing, clients can lease computing services on demand rather than main-
taining their own infrastructure. While such an approach naturally has numerous
advantages in cost and functionality, the outsourcing mechanism crucially needs
to enforce privacy of the outsourced data and integrity of the computation. Cryp-
tographic solutions for these challenges have been put forward with the concept
of secure delegatable computation [1,6,11,2,8].

Supported in part by the BEAT project 7th Framework Research Programme of the
European Union, grant agreement number: 284989.

Supported from the Danish National Research Foundation, the National Science
Foundation of China (under the grant 61061130540), the Danish Council for In-
dependent Research (under the DFF Starting Grant 10-081612) and also from the
CFEM research center within which part of this work was performed.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 545–556, 2013.
© Springer-Verlag Berlin Heidelberg 2013
In secure delegatable computation, computationally weak devices (or clients)
wish to outsource their computation and data to an untrusted server. The ulti-
mate goal in this setting is to design efficient protocols that minimize the compu-
tational overhead of the clients and instead rely on the extended resources of the
server. Of course, the amount of work invested by the client in order to verify the
correctness of the computation shall be substantially smaller than running the
computation by itself. Indeed, if this was not the case then the client could carry
out the computation itself. Another ambitious goal of delegatable computation
is to design protocols that minimize the communication between the cloud and
the client.
Most recent works in the area of delegatable computation propose solutions
to securely outsource any functionality to an untrusted server [1,6,11,2]. Such
generic solutions often suffer from rather poor efficiency and high communica-
tion overhead due to the use of fully homomorphic encryption [12]. An exception
is the randomized encoding technique used by [1] which instead relies on garbled
circuits. Furthermore, these solution concepts typically examine a restricted sce-
nario where a single client outsources its computation to an external untrusted
server. Only few recent works study the setting with multiple clients that mutu-
ally distrust each other and wish to securely outsource a joint computation on
their inputs with reduced costs, e.g., [15,17]. Of course, also in this more complex
setting recent constructions build up on fully homomorphic encryption.
To move towards more practical schemes, we may focus on particularly effi-
cient constructions for specific important functionalities. This approach has the
potential to avoid the use of fully homomorphic encryption by exploiting the
structure of the particular problem we intend to solve. Some recent works have
considered this question [3,22,20]. While these schemes are more efficient than
the generic constructions mentioned above, they typically only achieve very lim-
ited privacy or do not support multiple distrusting clients. In this paper, we
follow this line of work and provide the first protocols for pattern matching in
the cloud. In contrast to most earlier works, our constructions achieve a high-
level of security, while avoiding the use of FHE and minimizing the amount of
communication between the parties. We emphasize that even with the power of
fully homomorphic encryption it is not clear how to get down to communication
complexity that is linear in the number of matches in two rounds.1

Pattern Matching in the Cloud. The problem of pattern matching considers a
text T of length n and a pattern of length m with the goal to find all the locations
where the pattern matches the text. In a secure pattern matching protocol, one
party holds the text whereas the other party holds the pattern and attempts
to learn all the locations of the pattern in the text (and only that), while the
party holding the text learns nothing about the pattern. Unfortunately, such
protocols are not directly applicable in the cloud setting, mostly because the
1 A one-round solution based on FHE would need a circuit that tolerates the maximal
number of matches — which in the worst case is proportional to the length of the
text.
communication overhead per search query grows linearly with the text length.
Moreover, the text holder delegates its work to an external untrusted server and
cannot control the content of the server’s responses.
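On a query, the positions revealed to the querying client are exactly the output of plain (insecure) pattern matching, shown here for reference (illustrative code, ours):

```python
def match_positions(text: str, pattern: str):
    """All 0-based positions at which pattern occurs in text;
    occurrences may overlap."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]

match_positions("abababa", "aba")   # -> [0, 2, 4]
```

The point of the outsourced protocol is that the server helps produce exactly this list without ever seeing the text or pattern in the clear, with query-phase communication proportional to m plus the length of the list.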
In the outsourced setting we consider a set of clients CT , (C1 , . . . , Cl ) that
interact with a server S in the following way. (1) In a setup phase client CT
uploads a preprocessed text to an external server S. This phase is run only
once and may be costly in terms of computation and communication. (2) In a
query phase clients C1 , . . . , Cl query the text by searching patterns and learn the
matched text locations. The main two goals of our approach are as follows:

1. Simulation-based security: We model outsourced pattern matching by
a strong simulation-based security definition (cf. Section 2). Namely, we de-
fine a new reactive outsourced functionality FOPM that ensures the secrecy
and integrity of the outsourced text and patterns. For instance, a semi-honest
server does not gain any information about the text and patterns, except of
what it can infer from the answers to the search queries. If the server is ma-
liciously corrupted the functionality implies the correctness of the queries’
replies as well. As in the standard secure computation setting, simulation-
based modeling is simpler and stronger than game-based definitions.
2. Sublinear communication complexity during the query phase: We consider an
amortized model, where the communication and computational costs of the
clients are reduced with the number of queries. More concretely, while in the
setup phase communication and computation are linear in the length of the
text, we want the overall communication and client work during the query
phase to be linear in the number of matches (which is optimal). Of course, we
also require the server to run in polynomial time. Clearly, such a strong
efficiency requirement comes at a price, as it allows the server to learn the
number of matches. We model this additional information by giving the server
some leakage for each pattern query, which will be described in detail below.

1.1 Our Contribution

To simplify notation we will always only talk about a single client C that interacts
with CT and S in the query phase.

Modeling Outsourced Pattern Matching. We give a specification of an ideal
execution with a trusted party by defining a reactive outsourced pattern matching
functionality FOPM. This functionality works in two phases: in the preprocessing
phase client CT uploads its preprocessed text T̃ to the server. Next, in an
iterative query phase, upon receiving a search query p the functionality asks for
the approval of client CT (which may also refuse the query in the real execution)
and of the server (which, if corrupted, may abort the execution). To
model the additional leakage that is required to minimize communication, we ask
the functionality to forward to the server the matched positions in the text upon
548 S. Faust, C. Hazay, and D. Venturi

receiving an approval from CT. Our functionality returns all matched positions,
but it can be modified so that only the first few matched positions are returned.^2

Difficulties with Simulating FOPM. The main challenge in designing a simulator
for this functionality is the case where the server is corrupted. In this case the
simulator must commit to some text in a way that later allows it (when taking
the role of the server, given some trapdoor) to reply to pattern queries in a
consistent way. More precisely, when the simulator commits to a preprocessed text,
the leakage that the corrupted server obtains (namely, the positions where the
pattern matches the text) has to be consistent with the information that it later
sees during the query phases. This implies that the simulator must have flexibility
when it later matches the committed text to the trapdoors. This difficulty
does not arise in the classic two-party setting, since there the simulator always
plays against a party that contributes an input to the computation, which it can
first extract, whereas here the server is just a tool to run a computation. Due to
this inherent difficulty the text must be encoded in such a way that, given a search
query p and a list of text positions (i1, . . . , it), one can produce a trapdoor for
p such that the "search" in the preprocessed text, using this trapdoor,
yields (i1, . . . , it). We note that alternative solutions that permute the text to
prevent the server from learning the matched positions necessarily require that
the server does not collude with the clients. In contrast, our solutions allow such
a strong collusion between the clients and the server.

Solutions Based on Searchable/Non-Committing Encryption. To better motivate
our solution, let us consider a toy example first. Assume we encrypt each
substring of length m in T using searchable encryption [4], which allows running
a search over an encrypted text by producing a trapdoor for the searched word
(or a pattern p). Given the trapdoor, the server can check each ciphertext and
return the text positions in which the verification succeeds. The first problem
that arises with this approach is that searchable encryption does not ensure the
privacy of the searched patterns. While this issue may be addressed by tweaking
existing constructions of searchable encryption, a more severe problem is that
the simulator must commit in advance to (searchable) encryptions of a text that
later allow it to "find" p at positions that are consistent with the leakage. In other
words, all the plaintexts in the specified positions must be associated with the
keyword p ahead of time. Of course, as the simulator does not know the actual
text T, it cannot produce such a consistent preprocessed text. An alternative
solution may be obtained by combining searchable encryption with techniques from
non-committing encryption [5]. However, it is unclear how to combine these
two tools, even in the random oracle model.

^2 This definition is more applicable for search engines, where the first few results
are typically more relevant, whereas the former variant is more applicable for a
DNA search, where it is important to find all matched positions. For simplicity we
only consider the first variant; our solutions support both variants.

Semi-Honest Outsourced Pattern Matching from Subset Sum. Our first construction
for outsourced pattern matching is secure against semi-honest adversaries.
In this construction client CT generates a vector of random values, conditioned
on the sum of the elements in all positions that match the pattern being equal to
a specified value that will be explained below. Namely, CT builds an instance T̃
of the subset sum problem, where given a trapdoor R the goal is to find whether
there exists a subset in T̃ that sums to R. More formally, the subset sum problem
is parameterized by two integers ℓ and M. An instance of the problem is
generated by picking random vectors T̃ ← Z_M^ℓ, s ← {0,1}^ℓ and outputting
(T̃, R = T̃ · s mod M). The problem is to find s given T̃ and a trapdoor R.
Looking ahead, we will have such a trapdoor Rp for each pattern p of length
m, such that if p matches T then with overwhelming probability there will be
a unique solution to the subset sum instance (T̃, Rp). This unique solution is
placed at exactly the positions where the pattern appears in the text. The client
C that wishes to search for a pattern p obtains this trapdoor from CT and
hands it to the server. Consequently, we are interested in easy instances of the
subset sum problem, since we require the server to solve it for each query. This
is in contrast to prior cryptographic constructions, e.g., [18], that design
cryptographic schemes based on the hardness of this problem. We therefore consider
low-density instances, which can be solved in polynomial time by a reduction to
finding a short vector in a lattice [16,10,7].
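To make the encoding concrete, the following minimal Python sketch builds T̃ for a toy text under simplifying assumptions: trapdoors are drawn uniformly at random here (the actual construction derives them from a PRF, as described next), parameter sizes are illustrative only, and all helper names are ours.

```python
import random

def encode_text(T, m, M, rng):
    """Build the preprocessed text T~ (a vector over Z_M) for patterns of
    length m: every distinct pattern p occurring in T gets a random trapdoor
    R_p, and the entries of T~ at the positions where p occurs are random
    values that sum to R_p modulo M."""
    ell = len(T) - m + 1                      # number of pattern positions
    # group positions by the pattern that starts there
    positions = {}
    for i in range(ell):
        positions.setdefault(T[i:i + m], []).append(i)
    T_tilde = [0] * ell
    trapdoors = {}
    for p, idxs in positions.items():
        R_p = rng.randrange(M)                # toy stand-in for the PRF value
        vals = [rng.randrange(M) for _ in idxs[:-1]]
        vals.append((R_p - sum(vals)) % M)    # force the subset sum to R_p
        for i, a in zip(idxs, vals):
            T_tilde[i] = a
        trapdoors[p] = R_p
    return T_tilde, trapdoors

rng = random.Random(1)
T_tilde, trapdoors = encode_text("abracadabra", 3, M=2**32, rng=rng)
# positions 0 and 7 hold "abr"; their entries sum to the trapdoor mod M
assert (T_tilde[0] + T_tilde[7]) % 2**32 == trapdoors["abr"]
```

By construction, the subset of positions summing to a trapdoor is exactly the set of matches of the corresponding pattern.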
We further note that the security of the scheme relies heavily on the
unpredictability of the trapdoor. Namely, in order to ensure that the server cannot
guess the trapdoor for some pattern p (and thus solve the subset sum problem and
find the matched locations), we require that the trapdoor is unpredictable. We
therefore apply a pseudorandom function (PRF) F to the pattern and fix this
value as the trapdoor, where the key k for the PRF is picked by CT, and the two
clients CT and C communicate via a secure two-party protocol to compute the
evaluation of the PRF.

Efficiency Considerations. The scheme described above does not yet satisfy the
desired properties outlined in the previous paragraphs and has very limited
use in practice. Recall that the server is asked to solve subset sum instances
of the form (T̃, Rp), where T̃ is a vector of length ℓ = n − m + 1 with elements
from Z_M for some integer M. In order to ensure correctness we must guarantee
that, given a subset sum instance, each trapdoor has a unique solution with high
probability. In other words, the collision probability, which equals 2^ℓ/M (stated
also in [13]), should be negligible. Fixing M = 2^{κ+n} for a security parameter
κ ensures this for a large enough κ, say whenever κ ≥ 80. On the other hand,
we need the subset sum problem to be solvable in polynomial time. A simple
calculation (see Eq. (1)) yields in this case a value of ℓ ≈ √κ. This poses an
inherent limitation on the length of the text to be preprocessed. For instance,
even using a high value of κ ≈ 10^4 (yielding subset sum elements of size
approximately 10 KByte) limits the length of the text to only 100 bits. This scheme
also requires quadratic communication complexity in the text length during the
setup phase, since client CT sends O(n^2 + κn) bits.

An Improved Solution Using Packaging. To overcome this limitation, we employ
an important extension of our construction based on packaging. First, the text
is partitioned into smaller pieces of length 2m which are handled separately
by the protocol, where m is some practical upper bound on the pattern length.
Moreover, every two consecutive blocks overlap in m positions, so that we
do not miss any match in the original text. Even though this approach introduces
some overhead, yielding a text T′ of overall length 2n, note that now Eq. (1)
yields ℓ = 2m − m + 1 = m + 1 < √κ, which is an upper bound on the length of
the pattern (and not on the length of the text as before). Namely, we remove the
limitation on the text length and consider much shorter block lengths for the
subset sum algorithm. As a result, the communication complexity in the setup
phase is O(mn + κn), whereas the communication complexity in the query phase
is O(κm). For short queries (which is typically the case), these measures achieve
the appealing properties we sought.
This comes at a price though, since we now need to avoid using the same trapdoor
for some pattern p in every block, as repetitions would allow the server to extract
potentially valid trapdoors (that have not been queried yet) and deduce
information about the text. We solve this problem by requiring the function
that outputs the trapdoors to have some form of "programmability" (which allows
the simulator to answer all queries consistently). Specifically, we implement
this function using the random oracle methodology on top of the PRF, so that a
trapdoor is now computed as H(F(k, p)||b), for b the block number. Now,
the simulator can program the oracle to match the positions where the
pattern appears in each block. Note that using just the random oracle (without
the PRF) is not sufficient either, since an adversary that controls the server
and has access to the random oracle can apply it to p as well.
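This per-block trapdoor derivation can be sketched as follows, with HMAC-SHA256 standing in for the PRF F and SHA-256 for the random oracle H. Both stand-ins and all names are our illustrative choices; in the protocol, C obtains F(k, p) obliviously rather than computing it from k as done here.

```python
import hashlib
import hmac

def prf(k: bytes, p: bytes) -> bytes:
    """Stand-in for the PRF F(k, p); HMAC-SHA256 is a standard PRF candidate."""
    return hmac.new(k, p, hashlib.sha256).digest()

def trapdoor(k: bytes, p: bytes, b: int, M: int) -> int:
    """Per-block trapdoor H(F(k, p) || b), with SHA-256 playing the random
    oracle H and its digest reduced into Z_M."""
    h = hashlib.sha256(prf(k, p) + b.to_bytes(4, "big")).digest()
    return int.from_bytes(h, "big") % M

k = b"\x00" * 32
# distinct blocks yield independent-looking trapdoors for the same pattern
assert trapdoor(k, b"abr", 1, 2**64) != trapdoor(k, b"abr", 2, 2**64)
```

Binding the block number b into the hash input is what prevents a trapdoor learned for one block from being reused on another.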

Malicious Outsourced Pattern Matching. We also extend our construction to the
malicious setting, tolerating malicious attacks. Our proof ensures that the
server returns the correct answers by employing Merkle commitments and
zero-knowledge (ZK) sets. Informally speaking, Merkle commitments are succinct
commitment schemes for which the commitment size is independent of the length
of the committed value (or set). This tool is very useful in ensuring correctness,
since now, upon committing to T̃, the server decommits the solution to the
subset sum trapdoor and client C can simply verify that the decommitted values
correspond to the trapdoor. Nevertheless, this solution does not cover the case
of a mismatch: a corrupted server can always return a "no-match"
message. In order to resolve this technicality we borrow techniques from ZK set
arguments [19], used for proving whether an element is in a specified set without
disclosing any further information. Next, proving security against a corrupted C
is a straightforward extension of the semi-honest proof, using the modifications
we made above and the fact that the protocol implementing the oblivious
PRF evaluation is secure against malicious adversaries as well.
The case of a corrupted CT is more challenging, since we first need to extract
the text T, but also verify CT's computations with respect to the random
oracle when it produces T̃. The only proof technique that we are aware of for
proving correctness when using a random oracle is cut-and-choose (e.g., as done
in [14]), which inflates the communication complexity by an additional statistical
parameter. Instead, we do not require that the server can immediately verify
the correctness of the outsourced text, but only ensure that if CT cheats with
respect to some query p, then it will be caught during the query phase whenever
p is queried. The crux of our protocol is that the simulator does not need to
verify all computations at once, but only the computations with respect to the
asked queries. This enables us to avoid the costly cut-and-choose technique, since
verification is done using a novel technique of derandomizing CT's computations.
We notice that this requires us to slightly adjust the description of our idealized
functionality. For space reasons, we defer the details to the full version [9] and
focus on the semi-honest case here.
We remark that all the solutions described above can be combined
into a single protocol which is secure even in the case of a collusion between
S and client C. When a collusion between S and client CT occurs, we cannot
guarantee either privacy or correctness, since the simulator cannot extract the
text, as the preprocessing protocol is "run" between the two corrupted parties.
We stress that such a collusion does not imply that security collapses to the
standard two-party setting.

2 Modeling Outsourced Pattern Matching


Outsourced pattern matching consists of two phases. In the setup phase a
client CT uploads a (preprocessed) text T̃ to an external server S. This phase is
run only once. In the query phase client C queries the text by searching for
patterns and learns the matched text locations. We formalize security using the
ideal/real paradigm. Denote by Tj the substring of length m that starts at text
location j. The pattern matching ideal functionality in the outsourced setting is
depicted in Fig. 1. We write |T| for the bit length of T and assume that client C
asks a number of queries pi (i ∈ [λ], λ ∈ N).

The Definition. Formally, denote by IDEAL_{FOPM,Sim(z)}(κ, (−, T, (p1, . . . , pλ)))
the output of an ideal adversary Sim, server S and clients CT, C in the above
ideal execution of FOPM upon inputs (−, (T, (p1, . . . , pλ))) and auxiliary input z
given to Sim.
We implement functionality FOPM via a protocol π = (πPre, πQuery, πOpm)
consisting of three two-party protocols, specified as follows. Protocol πPre is run
in the preprocessing phase by CT to preprocess the text T and forward the
outcome T̃ to S. During the query phase, protocol πQuery is run between CT and
C (holding a pattern p); this protocol outputs a trapdoor Rp that depends on
p and will enable the server to search the preprocessed text. Lastly, protocol
πOpm is run by S upon input the preprocessed text and trapdoor Rp (forwarded
by C); this protocol returns to C the matched text positions (if any). We denote
by REAL_{π,Adv(z)}(κ, (−, T, (p1, . . . , pλ))) the output of adversary Adv, server S
and clients CT, C in a real execution of π = (πPre, πQuery, πOpm) upon inputs
(−, (T, (p1, . . . , pλ))) and auxiliary input z given to Adv.

Functionality FOPM

Let m, λ ∈ N. Functionality FOPM initializes the table B to empty and
proceeds as follows, running with clients CT and C, server S and adversary Sim.

1. Upon receiving a message (text, T, m) from CT, send (preprocess, |T|, m) to S
and Sim, and record (text, T).
2. Upon receiving a message (query, pi ) from client C (for i ∈ [λ]), where message
(text, ·) has been recorded and |pi | = m, it checks if the table B already contains
an entry of the form (pi , ·). If this is not the case then it picks the next available
identifier id from {0, 1}∗ and adds (pi , id) to B. It sends (query, C) to CT and
Sim.
(a) Upon receiving (approve, C) from client CT , read (pi , id) from B and send
(query, C, (i1 , . . . , it ), id) to server S, for all text positions {ij }j∈[t] such
that Tij = pi . Otherwise, if no (approve, C) message has been received
from CT , send ⊥ to C and abort.
(b) Upon receiving (approve, C) from Sim, read (pi , id) from B and send
(query, pi , (i1 , . . . , it ), id) to client C. Otherwise, send ⊥ to client C.

Fig. 1. The outsourced pattern matching functionality

Definition 1 (Security of outsourced pattern matching). We say that
π securely implements FOPM if for any PPT real adversary Adv there exists
a PPT simulator Sim such that for any tuple of inputs (T, (p1, . . . , pλ)) and
auxiliary input z,

{IDEAL_{FOPM,Sim(z)}(κ, (−, T, (p1, . . . , pλ)))}_{κ∈N} ≈_c {REAL_{π,Adv(z)}(κ, (−, T, (p1, . . . , pλ)))}_{κ∈N},

where ≈_c denotes computational indistinguishability.

The schemes described in the next sections implement the ideal functionality
FOPM in the random oracle model.

3 A Scheme with Passive Security

In this section we present our implementation of the outsourced pattern matching
functionality FOPM that is formalized in Fig. 1, and prove its security against
semi-honest adversaries. A scheme with security against malicious adversaries
is described in the full version of this paper [9], building upon the protocol in
this section. Recall first that in the outsourced variant of the pattern matching
problem, client CT preprocesses the text T and then stores it on the server S in
such a way that the preprocessed text can be used later to answer search queries
submitted by client C. The challenge is to find a way to hide the text (in order
to obtain privacy), while enabling the server to carry out searches on the hidden
text whenever it is in possession of an appropriate trapdoor.

Protocol πSH = (πPre , πQuery , πOpm )

Let κ ∈ N be the security parameter and let M, m, n, μ be integers, where for
simplicity we assume that n is a multiple of 2m. Further, let H : {0,1}^μ → Z_M
be a random oracle and F : {0,1}^κ × {0,1}^m → {0,1}^μ be a PRF. Protocol πSH
involves a client CT holding a text T ∈ {0,1}^n, a client C querying for patterns
p ∈ {0,1}^m, and a server S. The interaction between the parties is specified below.

Setup phase, πPre. The protocol is invoked between client CT and server S.
Given input T and integer m, client CT picks a random key k ∈ {0,1}^κ and
first prepares the text T for the packaging by writing it as

T′ := (B1, . . . , Bu) = ((T[1], . . . , T[2m]), (T[m + 1], . . . , T[3m]), . . . ,
(T[n − 2m + 1], . . . , T[n])),

where u = n/m − 1. Next, for each block Bb and each of the m + 1 patterns
p ∈ {0,1}^m that appear in Bb we proceed as follows (suppose there are at
most t matches of p in Bb).
1. Client CT evaluates Rp := H(F(k, p)||b), samples a1, . . . , at−1 ∈ Z_M at
random and then fixes at = Rp − Σ_{j=1}^{t−1} aj mod M.
2. Set B̃b[vj] = aj for all j ∈ [t], where {vj}j∈[t] (vj ∈ [m + 1]) denotes the
set of indices corresponding to the positions where p occurs in Bb. Later in
the proof we will be more precise and explicitly denote to which block vj
belongs by using the notation vj^b.
Finally, CT outsources the text T̃ = (B̃1, . . . , B̃u) to S.

Query phase, πQuery. Upon issuing a query p ∈ {0,1}^m by client C, clients CT
and C engage in an execution of protocol πQuery, which implements the oblivious
PRF functionality (k, p) → (−, F(k, p)). Upon completion, C learns F(k, p).

Oblivious pattern matching phase, πOpm. This protocol is executed between
server S (holding T̃) and client C (holding F(k, p)). Upon receiving F(k, p)
from C, the server proceeds as follows for each block B̃b. It interprets
(H(F(k, p)||b), B̃b) as a subset sum instance and computes s as the solution
of B̃b · s = H(F(k, p)||b). Let {vj}j∈[t] denote the set of indices such that
s[vj] = 1; then the server S returns the set of indices {ϕ(b, vj)}b∈[u],j∈[t] to
the client C.

Fig. 2. Semi-honest outsourced pattern matching

We consider a new approach and reduce the pattern matching problem to
the subset sum problem. Namely, consider a text T of length n, and assume we
want to allow searching for patterns of length m. For some integer M ∈ N, we
assign to each distinct pattern p that appears in T a random element Rp ∈ Z_M.
Letting ℓ = n − m + 1, the preprocessed text T̃ is a vector in Z_M^ℓ with elements
specified as follows. For each pattern p that appears t times in T, we sample
random values a1, . . . , at ∈ Z_M such that Rp = Σ_{j=1}^{t} aj. Denote by ij ∈ [ℓ]
the jth position in T where p appears and set T̃[ij] = aj. Notice that for each
pattern p, there exists a vector s ∈ {0,1}^ℓ such that Rp = T̃ · s. Hence, the
positions in T̃ where pattern p matches are identified by a vector s and can be
viewed as the solution of the subset sum problem instance (Rp, T̃).
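For illustration, the server's task on such an instance can be mimicked by a naive exhaustive solver. This is a toy stand-in of our own: the actual protocol relies on polynomial-time algorithms for low-density instances [16,10,7], and the helper name is ours.

```python
from itertools import combinations

def solve_subset_sum(T_tilde, R, M):
    """Exhaustively find a subset of positions of T_tilde summing to R mod M.
    Exponential time, so only usable on tiny toy examples; the server would
    instead run a polynomial-time low-density subset sum solver."""
    ell = len(T_tilde)
    for t in range(1, ell + 1):
        for idxs in combinations(range(ell), t):
            if sum(T_tilde[i] for i in idxs) % M == R:
                return list(idxs)
    return None  # trapdoor matches no subset: pattern does not occur

# toy instance: positions 1 and 3 encode the queried pattern
M = 2**16
T_tilde = [11, 500, 42, 700, 9]
R = (T_tilde[1] + T_tilde[3]) % M
assert solve_subset_sum(T_tilde, R, M) == [1, 3]
```

The returned index set plays the role of the solution vector s: its entries mark exactly the matched positions.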
Roughly, our protocol works as follows. During protocol πPre, we let the client
CT generate the preprocessed text T̃ as described above and send the result to
the server S. Later, when a client C wants to learn at which positions a pattern
p matches the text, clients C and CT run protocol πQuery; at the end of this
protocol, C learns the trapdoor Rp corresponding to p. Hence, during πOpm, client
C sends this trapdoor to S, which can solve the subset sum problem instance
(Rp, T̃). The solution to this problem corresponds to the matches of p, which are
forwarded to the client C. To avoid CT having to store all trapdoors, we rely
on a PRF to generate the trapdoors. More precisely, instead of sampling
the trapdoors Rp uniformly at random, we set Rp := F(k, p), where F is a PRF.
Thus, during the query phase C and CT run an execution of an oblivious PRF
protocol, whereby C learns the output of the PRF, i.e., the trapdoor Rp.

Efficiency. Although the protocol described above provides a first basic solution
for outsourced pattern matching, it suffers from a strong restriction, as only
very short texts are supported. (On the positive side, the above scheme does not
rely on a random oracle.) The server S is asked to solve subset sum instances
of the form (T̃, Rp), where T̃ is a vector of length ℓ = n − m + 1 with elements
from Z_M for some integer M. To achieve correctness, we require that each subset
sum instance has a unique solution with high probability. In order to satisfy this
property, one needs to set the parameters such that the value 2^ℓ/M is negligible.
Fixing M = 2^{κ+ℓ} achieves a reasonable correctness level.
On the other hand, we need to let S solve subset sum instances efficiently.
The hardness of subset sum depends on the ratio between ℓ and log M, which
is usually referred to as the density Δ of the subset sum instance. In particular,
both instances with Δ < 1/ℓ (so-called low-density instances) and Δ > ℓ/log^2 ℓ
(so-called high-density instances) can be solved in polynomial time. Note,
however, that the constraint on the ratio 2^ℓ/M immediately rules out algorithms
for high-density subset sum (e.g., algorithms based on dynamic programming,
since they usually need to process a matrix of dimension M). On the other hand,
for low-density instances, an easy calculation shows that ℓ + κ > ℓ^2, so that
we need to choose κ, ℓ in such a way that

ℓ < (1/2) (√(1 + 4κ) − 1).   (1)

The above analysis yields a value of ℓ ≈ √κ. This poses an inherent limitation on
the length of the text. For instance, even using κ ≈ 10^4 (yielding subset sum
elements of size approximately 10 KByte) limits the length of the text to only 100 bits.
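A quick numeric sanity check of this limitation, assuming the low-density condition ℓ² < ℓ + κ from the analysis above (the helper name is ours):

```python
import math

def max_text_length(kappa: int) -> int:
    """Largest ell with ell**2 < ell + kappa, i.e. the longest text (counted
    in pattern positions) for which the instance stays low-density."""
    # start from the positive root of ell**2 - ell - kappa = 0, rounded down
    ell = (1 + math.isqrt(1 + 4 * kappa)) // 2
    while ell * ell >= ell + kappa:
        ell -= 1
    return ell

# even a huge security parameter only allows texts of about 100 positions
assert max_text_length(10_000) == 100
```

This reproduces the 100-bit figure quoted in the text for κ ≈ 10^4 and motivates the packaging technique that follows.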
Packaging. To overcome this severe limitation, we partition the text into smaller
pieces, each of length 2m, where each such piece is handled as a separate instance
of the protocol. More specifically, for a text T = (T[1], . . . , T[n]) let
(T[1], . . . , T[2m]), (T[m+1], . . . , T[3m]), . . . be blocks, each of length 2m, such
that every two consecutive blocks overlap in m bits. Then, for each pattern p that
appears in the text, the client CT computes an individual trapdoor for each block
where the pattern p appears. In other words, suppose that pattern p appears in
block Bb; then we compute the trapdoor for this block (and pattern p) as
H(F(k, p)||b). Here, H is a cryptographic hash function that will be modeled as
a random oracle in our proofs. Given the trapdoors, we apply the preprocessing
algorithm to each block individually.
The sub-protocols πQuery and πOpm work as described above with a small
change. In πQuery client C learns the output of the PRF F(k, p) instead of the
actual trapdoors, and in πOpm client C forwards the result F(k, p) directly to S.
The server can then compute the actual trapdoors using the random oracle. This
is needed to keep the communication complexity of the protocol low. Note that in
this case, if we let {vj^b}j∈[tb] be the set of indices corresponding to the positions
where p occurs in a given block Bb, the server needs to map these positions to
the corresponding positions in T (and this has to be done for each of the blocks
where p matches). It is easy to see that such a mapping from a position vj^b
in block Bb to the corresponding position in the text T can be computed as
ϕ(b, vj) = (b − 1)m + vj. The entire protocol is shown in Fig. 2.
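The block decomposition and the position mapping ϕ can be sketched as follows. Python slicing is 0-based internally, while the block index b and position v are kept 1-based to match ϕ(b, v) = (b − 1)m + v; the helper names are ours.

```python
def blocks(T: str, m: int):
    """Split T into u = n/m - 1 overlapping blocks of length 2m; consecutive
    blocks share m characters, so no match across a boundary is lost."""
    n = len(T)
    assert n % m == 0 and n >= 2 * m
    u = n // m - 1
    return [T[(b - 1) * m:(b + 1) * m] for b in range(1, u + 1)]

def phi(b: int, v: int, m: int) -> int:
    """Map position v in block b (both 1-based, as in the paper) to the
    corresponding 1-based position in the full text."""
    return (b - 1) * m + v

T = "abcdefghijkl"          # n = 12
m = 3
bs = blocks(T, m)
assert bs == ["abcdef", "defghi", "ghijkl"]
# a pattern found at position 2 of block 2 sits at position 5 of T
assert T[phi(2, 2, m) - 1: phi(2, 2, m) - 1 + m] == "efg"
```

The overlap of m characters guarantees that every length-m window of T lies entirely inside at least one block.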
Note that now each of the preprocessed blocks B̃b consists of ℓ = m + 1 elements
in Z_M. The advantage is that the blocks are reasonably short, which yields subset
sum instances of the form (B̃b, Rp). Combined with Eq. (1) this yields a value
of ℓ = 2m − m + 1 = m + 1 < √κ, which is an upper bound on the length of
the pattern (and not on the length of the text as before). By combining many
blocks we can support texts of any length polynomial in the security parameter.
Finally, we emphasize that the communication/computational complexities of
πQuery depend on the underlying oblivious PRF evaluation. In particular, they
depend only on m (due to the algebraic structure of the PRF of [21]). Using
improved PRFs can further reduce the communication complexity. On the other
hand, the communication complexity of πOpm is dominated by the number of
matches of p in T, which is optimal.
We state the following result. The proof can be found in the full version [9].
Theorem 1. Let κ ∈ N be the security parameter. For integers n, m we set
λ = poly(κ), μ = poly(κ), u = n/m − 1, ℓ = (m + 1)u and M = 2^{m+κ+1}. We
furthermore require that κ is such that 2^{m+1}/M is negligible (in κ). Assume
H : {0,1}^μ → Z_M is a random oracle and F : {0,1}^κ × {0,1}^m → {0,1}^μ is a
pseudorandom function. Then protocol πSH from Fig. 2 securely implements the
FOPM functionality in the presence of semi-honest adversaries.

References
1. Applebaum, B., Ishai, Y., Kushilevitz, E.: From secrecy to soundness: Efficient
verification via secure computation. In: Abramsky, S., Gavoille, C., Kirchner, C.,
Meyer auf der Heide, F., Spirakis, P.G. (eds.) ICALP 2010. LNCS, vol. 6198, pp.
152–163. Springer, Heidelberg (2010)
2. Asharov, G., Jain, A., López-Alt, A., Tromer, E., Vaikuntanathan, V., Wichs, D.:
Multiparty computation with low communication, computation and interaction
via threshold FHE. In: Pointcheval, D., Johansson, T. (eds.) EUROCRYPT 2012.
LNCS, vol. 7237, pp. 483–501. Springer, Heidelberg (2012)

3. Benabbas, S., Gennaro, R., Vahlis, Y.: Verifiable delegation of computation over
large datasets. In: Rogaway, P. (ed.) CRYPTO 2011. LNCS, vol. 6841, pp. 111–131.
Springer, Heidelberg (2011)
4. Boneh, D., Di Crescenzo, G., Ostrovsky, R., Persiano, G.: Public key encryption
with keyword search. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004.
LNCS, vol. 3027, pp. 506–522. Springer, Heidelberg (2004)
5. Canetti, R., Feige, U., Goldreich, O., Naor, M.: Adaptively secure multi-party
computation. In: STOC, pp. 639–648 (1996)
6. Chung, K.-M., Kalai, Y., Vadhan, S.: Improved delegation of computation us-
ing fully homomorphic encryption. In: Rabin, T. (ed.) CRYPTO 2010. LNCS,
vol. 6223, pp. 483–501. Springer, Heidelberg (2010)
7. Coster, M.J., Joux, A., LaMacchia, B.A., Odlyzko, A.M., Schnorr, C.-P., Stern,
J.: Improved low-density subset sum algorithms. Computational Complexity 2,
111–128 (1992)
8. Damgård, I., Faust, S., Hazay, C.: Secure two-party computation with low com-
munication. In: Cramer, R. (ed.) TCC 2012. LNCS, vol. 7194, pp. 54–74. Springer,
Heidelberg (2012)
9. Faust, S., Hazay, C., Venturi, D.: Outsourced pattern matching. Cryptology ePrint
Archive, Report 2013/XX, http://eprint.iacr.org/
10. Frieze, A.M.: On the Lagarias-Odlyzko algorithm for the subset sum problem. SIAM
J. Comput. 15(2), 536–539 (1986)
11. Gennaro, R., Gentry, C., Parno, B.: Non-interactive verifiable computing: Out-
sourcing computation to untrusted workers. In: Rabin, T. (ed.) CRYPTO 2010.
LNCS, vol. 6223, pp. 465–482. Springer, Heidelberg (2010)
12. Gentry, C.: Fully homomorphic encryption using ideal lattices. In: STOC,
pp. 169–178 (2009)
13. Impagliazzo, R., Naor, M.: Efficient cryptographic schemes provably as secure as
subset sum. J. Cryptology 9(4), 199–216 (1996)
14. Ishai, Y., Kilian, J., Nissim, K., Petrank, E.: Extending oblivious transfers effi-
ciently. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 145–161. Springer,
Heidelberg (2003)
15. Kamara, S., Mohassel, P., Raykova, M.: Outsourcing multi-party computation.
IACR Cryptology ePrint Archive, 2011:272 (2011)
16. Lagarias, J.C., Odlyzko, A.M.: Solving low-density subset sum problems. J.
ACM 32(1), 229–246 (1985)
17. López-Alt, A., Tromer, E., Vaikuntanathan, V.: On-the-fly multiparty computation
on the cloud via multikey fully homomorphic encryption. In: STOC, pp. 1219–1234
(2012)
18. Lyubashevsky, V., Palacio, A., Segev, G.: Public-key cryptographic primitives prov-
ably as secure as subset sum. In: Micciancio, D. (ed.) TCC 2010. LNCS, vol. 5978,
pp. 382–400. Springer, Heidelberg (2010)
19. Micali, S., Rabin, M.O., Kilian, J.: Zero-knowledge sets. In: FOCS, pp. 80–91 (2003)
20. Mohassel, P.: Efficient and secure delegation of linear algebra. IACR Cryptology
ePrint Archive 2011:605 (2011)
21. Naor, M., Reingold, O.: Number-theoretic constructions of efficient pseudo-random
functions. In: FOCS, pp. 458–467 (1997)
22. Papamanthou, C., Tamassia, R., Triandopoulos, N.: Optimal verification of oper-
ations on dynamic sets. In: Rogaway, P. (ed.) CRYPTO 2011. LNCS, vol. 6841,
pp. 91–110. Springer, Heidelberg (2011)
Learning a Ring Cheaply and Fast

Emanuele G. Fusco1, Andrzej Pelc2, and Rossella Petreschi1


1
Computer Science Department, Sapienza, University of Rome, 00198 Rome, Italy
{fusco,petreschi}@di.uniroma1.it
2
Département d’informatique, Université du Québec en Outaouais,
Gatineau, Québec J8X 3X7, Canada
[email protected]

Abstract. We consider the task of learning a ring in a distributed way:
each node of an unknown ring has to construct a labeled map of it.
Nodes are equipped with unique labels. Communication proceeds in
synchronous rounds. In every round every node can send arbitrary messages
to its neighbors and perform arbitrary local computations. We study
tradeoffs between the time (number of rounds) and the cost (number
of messages) of completing this task in a deterministic way: for a given
time T we seek bounds on the smallest number of messages needed for
learning the ring in time T. Our bounds depend on the diameter D of
the ring and on the delay θ = T − D above the least possible time D
in which this task can be performed. We prove a lower bound Ω(D^2/θ)
on the number of messages used by any algorithm with delay θ, and we
design a class of algorithms that give an almost matching upper bound:
for any positive constant 0 < ε < 1 there is an algorithm working with
delay θ ≤ D and using O(D^2 (log* D)/θ^{1−ε}) messages.

Keywords: labeled ring, message complexity, time, tradeoff.

1 Introduction

The Model and the Problem. Constructing a labeled map of a network is
one of the most demanding distributed tasks that nodes can accomplish in a
network. Each node has a distinct label and in the beginning each node knows
network. Each node has a distinct label and in the beginning each node knows
only its own label. Moreover, ports at each node of degree d are arbitrarily
numbered 0, . . . , d − 1. The goal is for each node to get an isomorphic copy
of the graph underlying the network, including node labels and port numbers.
Once nodes acquire this map, any other distributed task, such as leader election
[9,13], minimum weight spanning tree construction [2], renaming [1], etc. can be
performed by nodes using only local computations. Thus constructing a labeled
map converts all distributed network problems to centralized ones, in the sense

This research was done during the visit of Andrzej Pelc at Sapienza, University of
Rome, partially supported by a visiting fellowship from this university.

Partially supported by NSERC discovery grant and by the Research Chair in Dis-
tributed Computing at the Université du Québec en Outaouais.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 557–568, 2013.
© Springer-Verlag Berlin Heidelberg 2013

that nodes can solve them simulating a central monitor. We are interested in the
efficiency of deterministic algorithms for labeled map construction.
In this paper we use the extensively studied LOCAL model of communication
[12]. In this model, communication proceeds in synchronous rounds and all nodes
start simultaneously. In each round each node can exchange arbitrary messages
with all its neighbors and perform arbitrary local computations. The time of
completing a task is the number of rounds it takes. Our goal is to investigate
tradeoffs between the time of constructing a labeled map and its cost, i.e., the
number of messages needed to perform this task. To see extreme examples of
such a tradeoff, consider the map construction task on an n-node ring. The
fastest way to complete this task is in time D, where D = ⌊n/2⌋ is the diameter
of the ring. This can be achieved by flooding, but the number of messages used
is then Θ(n2 ). On the other hand, cost Θ(n) (which is optimal) can be achieved
by a version of the time slicing algorithm [11], but then time may become very
large and depends on the labels of the nodes.
The general problem of tradeoffs between time and cost of labeled map con-
struction can be formulated as follows.

For a given time T , what is the smallest number of messages needed for
constructing a labeled map by each node in time T ?

For trees this problem is trivial: leaves of an n-node tree initiate the communica-
tion process and information about ever larger subtrees gets first to the central
node (or central pair of adjacent nodes) and then back to all leaves, using time
equal to the diameter of the tree and O(n) messages, both of which are optimal.
However, as soon as there are cycles in the network, there is no canonical place
to start information exchange on each cycle and proceeding fast seems to force
many messages to be sent in parallel, which in turn intuitively implies large
cost. This phenomenon is present already in the simplest such network, i.e., the
ring. Indeed, our study shows that meaningful tradeoffs between time and cost
of labeled map construction already occur in rings.
We consider rings whose nodes have unique labels that are binary strings of
length polynomial in the size of the ring. (Our results are valid also for much longer
labels, but these can be dismissed for practicality reasons.) In the beginning, every
node knows only its own label, the allowed time T and the diameter D of the ring.
Equivalently, we provide each node with its label, with the diameter D and with
the delay θ = T − D, which is the extra time allowed on top of the minimum time
D in which labeled map construction can be achieved, knowing D a priori.
Knowing its own label is an obvious assumption. Without any additional
knowledge, nodes would have to assume the least possible time and hence do
flooding at quadratic cost. Instead of providing nodes with D and θ, we could
have provided them only with the allowed delay over the least possible time of
learning the ring without a priori knowledge of the diameter. This would not
affect our asymptotic bounds. However, it would result in more cumbersome
formulations because, without knowing D a priori, the optimal time of labeled
map construction varies between D and D + 1, depending on whether the ring is

of even or odd size. We are interested in achieving map construction with small
delay: in particular, we assume θ ≤ D.
We assume that messages are of arbitrary size, but in our algorithms they
need only to be sufficiently large to contain already acquired information about
the n-node ring, i.e., strings of up to n labels and port numbers. This is a
natural assumption for the task of labeled map construction whose output has
large size, similarly as is done, e.g., in gossiping [7]. This should be contrasted
with such tasks as leader election [9], distributed minimum weight spanning tree
construction [2], or distributed coloring [8], where each node has to output only
a small amount of information, and considered messages are often of small size.
Our Results. We prove almost tight upper and lower bounds on the minimum
cost (number of messages) needed to deterministically perform labeled map con-
struction on a ring in a given time. Our bounds depend on the diameter D of
the ring and on the delay θ = T − D above the least possible time D in which
this task can be performed. We prove a lower bound Ω(D2 /θ) on the cost of
any algorithm with delay θ, and we design a class of algorithms that give an
almost matching upper bound: for any positive constant 0 < ε < 1 there is an
algorithm working with delay θ ≤ D and using O(D2 (log∗ D)/θ1−ε ) messages.
We also provide tradeoffs between time and cost of labeled map construction for
a more general class of graphs, when the delay is larger.
Due to the lack of space, several proofs are omitted.
Related Work. The task of constructing a map of a network has been studied
mostly for anonymous networks, both in the context of message passing systems
[14] and using a mobile agent exploring a network [3]. The goal was to determine
the feasibility of map construction (also called topology recognition) and to
find fast algorithms performing this task. For networks with unique labels, map
construction is of course always feasible and can be done in time equal to the
diameter of the network plus one (in the LOCAL model), which is optimal.
Tradeoffs between the time and the number of messages have been studied
for various network problems, including leader election [6,9,13], weak unison [10],
and gossiping [5]. It should be noticed that if the requirement concerning time is
loose, i.e., concerns only the order of magnitude, then there are no tradeoffs to
speak of for labeled map construction. It follows from [2] that minimum weight
spanning tree construction can be done in time O(n) and at cost O(m+n log n) in
any network with n nodes and m edges, both of which are known to be optimal.
This implies the same complexities for constructing a labeled map. However, our
results show that the task of labeled map construction is very sensitive to time:
time vs. cost tradeoffs occur for the ring between the time spans D and 2D.
To the best of our knowledge, the problem of time vs. cost tradeoffs for labeled
map construction has never been studied before.

2 The Lower Bound


The main result of this section is a lower bound Ω(D²/θ) on the cost of any
labeled map construction algorithm working with delay θ on a ring with diameter D.

We prove the lower bound on the class of oriented rings of even size. (Restricting
the class on which the lower bound is proved only increases the strength of the
result.) We formalize orientation by assigning port numbers 0 and 1 in the clockwise
order at each node. For every node v, let ℓ(v) be its label.
We first define the history H(v, t) of node v at time t. Intuitively H(v, t) rep-
resents the entire knowledge that node v can acquire by time t. Since we want
to prove a lower bound on cost, it is enough to assume that whenever a node v
sends a message to a neighbor in round t + 1, the content of this message is its
entire history H(v, t). We define histories of all nodes by simultaneous induction
on t. Define H(v, 0) as the one-element sequence ⟨ℓ(v)⟩. In the inductive
definition, we will use two symbols, s0 and s1, corresponding to the lack of message
(silence) on port 0 and 1, respectively. Assume that histories of all nodes are
defined until round t. We define H(v, t + 1) as:
– ⟨H(v, t), s0, s1⟩, if v did not get any message in round t + 1;
– ⟨H(v, t), s0, H(u, t)⟩, if v did not get any message in round t + 1 on port 0 but
received a message on port 1 from its clockwise neighbor u in that round;
– ⟨H(v, t), H(w, t), s1⟩, if v did not get any message in round t + 1 on port 1
but received a message on port 0 from its counterclockwise neighbor w in
that round;
– ⟨H(v, t), H(w, t), H(u, t)⟩, if v received a message on port 0 from its counter-
clockwise neighbor w and a message on port 1 from its clockwise neighbor
u, in round t + 1.
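For concreteness, this inductive definition can be mirrored by a small simulation (an illustrative sketch added here, not part of the paper's model; we adopt the assumed conventions that port 0 points counterclockwise, port 1 clockwise, and that whenever the pattern puts a message on an edge it is sent in both directions):

```python
S0, S1 = "s0", "s1"  # silence symbols for ports 0 and 1

def histories(labels, f, T):
    """H[v] after T rounds on an oriented ring of len(labels) nodes.

    f(e, t) is 1 iff a message crosses edge e (a frozenset of its two
    endpoints) in round t.  H(v, 0) is identified with the label of v.
    """
    n = len(labels)
    H = {v: labels[v] for v in range(n)}
    for t in range(1, T + 1):
        new = {}
        for v in range(n):
            w, u = (v - 1) % n, (v + 1) % n   # counterclockwise / clockwise
            left = H[w] if f(frozenset((w, v)), t) else S0   # port 0
            right = H[u] if f(frozenset((v, u)), t) else S1  # port 1
            new[v] = (H[v], left, right)
        H = new
    return H

def known_labels(h):
    """Labels appearing anywhere inside a (nested) history."""
    if isinstance(h, tuple):
        return set().union(*(known_labels(x) for x in h))
    return set() if h in (S0, S1) else {h}
```

Under flooding (f ≡ 1), after t rounds H(v, t) contains exactly the labels of N(v, t), matching the intuition that H(v, t) captures everything v can learn by time t.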
We define a communication pattern until round t for the set E of all edges of
the ring as a function f : E × {1, . . . , t} −→ {0, 1}, where f (e, i) = 0, if and
only if no message is sent on edge e in round i. Executing a map construction
algorithm A on a given ring determines a communication pattern, which in turn
determines histories H(v, t), for all nodes v and all rounds t.
For any path π_k = ⟨u_0, . . . , u_k⟩ between nodes u_0 and u_k we define, by induction
on k, the communication delay δ(π_k, f) induced on π_k by the communication
pattern f. For k = 1, δ(π_1, f) = d if and only if f({u_0, u_1}, i + 1) = 0 for
all i < d, and f({u_0, u_1}, d + 1) = 1. In particular, if f({u_0, u_1}, 1) = 1 then
δ(π_1, f) = 0. Suppose that δ(π_{k−1}, f) has been defined. We define δ(π_k, f) =
δ(π_{k−1}, f) + d if and only if f({u_{k−1}, u_k}, δ(π_{k−1}, f) + k + i) = 0 for all i < d, and
f({u_{k−1}, u_k}, δ(π_{k−1}, f) + k + d) = 1. In particular, if f({u_{k−1}, u_k}, δ(π_{k−1}, f) +
k) = 1 then δ(π_k, f) = δ(π_{k−1}, f). Intuitively the communication delay on a
path between u and v indicates the additional time, with respect to the length
of this path, that it would take node v to acquire any information about node
u, along this path, if no information could be coded by silence. In fact some
information can be coded by silence, and analyzing this phenomenon is the main
conceptual difficulty of our lower bound proof. In particular, we will show that if
map construction has to be performed quickly, then the number of configurations
that can be coded by silence is small with respect to the total number of possible
instances, and hence many messages have to be used for some of them.
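Unrolling the inductive definition of δ gives a short computation (an illustrative sketch, assuming f(e, t) ∈ {0, 1} with rounds numbered from 1 and every edge eventually carrying a message):

```python
def comm_delay(path, f):
    """Delay δ(π_k, f) induced on path = [u_0, ..., u_k] by pattern f.

    A message leaving u_0 in round 1 reaches u_p in round p plus the
    delay accumulated so far; each edge adds the number of silent
    rounds it imposes before the message can cross.
    """
    delay = 0
    for k in range(1, len(path)):
        e = frozenset((path[k - 1], path[k]))
        d = 0
        while not f(e, delay + k + d):  # f(e, ...) == 0 means silence
            d += 1
        delay += d
    return delay
```

With f ≡ 1 the delay is 0 and information traverses the path in exactly its length; a pattern that keeps the first edge silent during rounds 1 and 2 induces delay 2.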
We define the communication delay induced by a communication pattern f
between a node x and its antipodal node x̄ as the minimum of the delays induced
by f on the two paths connecting x and x̄. By N(v, i) we denote the neighborhood
of v with radius i, i.e., the set of nodes at distance at most i from v, including v
itself. We also use N←(v, i) and N→(v, i) to denote the part of the neighborhood
N(v, i) clockwise (respectively counterclockwise) from v, including v itself.
The next lemma will be used to provide a necessary condition for correctness
of a map construction algorithm working with a given delay. This condition will
be crucial in proving the lower bound on the cost of such algorithms.
Lemma 1. Let A be a labeled map construction algorithm. Let R and R′ be two
rings of size 2D, such that R′ is obtained from R by changing the label ℓ(x) of a
single node x to label ℓ′(x). Let x̄ be the antipodal node of node x. Assume that A
determines the same communication pattern f on R and R′. Let τ be the delay
induced by f between x and x̄. Then the history H(x̄, D + τ − 1) is the same in
R and R′.
Theorem 1. Any labeled map construction algorithm A working with delay θ
on the class of rings of diameter D has cost Ω(D²/θ).
Proof. Let A be a labeled map construction algorithm working with delay θ on
the class of rings of diameter D. Consider an oriented ring R of size 2D. We will
assign labels to nodes in R in such a way that A uses at least D⌈D/(θ + 1)⌉
messages.
If, for some node x, there exist two labels ℓ(x) and ℓ′(x) such that, for some
labeling of the remaining nodes, the communication pattern f determined by A
on both resulting rings is the same, and the delay induced by f between x and
its antipodal node x̄ is larger than θ, then algorithm A is incorrect, by Lemma 1.
Indeed, the history H(x̄, D + θ) would be the same in both labeled rings, and
node x̄ would fail in the correct construction of the labeled map for one of them.
Hence, for any node x, there can be only one label ℓ(x) for any communication
pattern inducing communication delay larger than θ between x and x̄. Let X be
the set of all such labels. Since there are at most D + θ ≤ 2D rounds of com-
munication, there are at most (2D)^{2D} distinct communication patterns. Hence
|X| ≤ (2D)^{2D}. Let x_0, x_1, . . . , x_{2D−1} be the clockwise enumeration of all nodes
in the ring. Assign the lexicographically smallest label ℓ(x_i) ∉ X ∪ {ℓ(x_j) : j < i} to
node x_i. Recall that by our assumption, labels are binary sequences of length
polynomial in D. In fact, for the purpose of this proof it is enough to work with
sequences of length bounded by D². Indeed, (2D)^{2D} ∈ o(2^{D²}), hence there are
enough available labels outside of X for the construction of our labeled ring.
We will show that algorithm A uses at least D⌈D/(θ + 1)⌉ messages for the
above labeling of ring R. Let g be the communication pattern induced by algo-
rithm A on this labeled ring. Let y_0, y_1, . . . , y_{⌈2D/(θ+1)⌉−1} be nodes of the ring
R such that y_{i+1} is at clockwise distance θ + 1 from y_i. Assume, without loss of
generality, that for at least ⌈D/(θ + 1)⌉ nodes y_j, the communication delay on the
clockwise path between y_j and its antipodal node ȳ_j is at most θ. Let Y be the
set of these nodes y_j. For any node y = x_h in Y, we define the set Z_y of size D
as follows. (All additions of indices are modulo 2D.) Elements of Z_y are ordered
pairs composed of an edge and a round number, of the form ({x_{h+p}, x_{h+p+1}}, r_p),
for 0 ≤ p < D, where r_p = δ(⟨x_h . . . x_{h+p+1}⟩, g) + p + 1. By the definition of
communication delay induced on a path, we have g({x_{h+p}, x_{h+p+1}}, r_p) = 1, for
all 0 ≤ p < D.
We now show that the sets Z_y are pairwise disjoint. Pick two nodes from Y: a node
y = x_h and a node y′ = x_{h+d} at clockwise distance d < D from x_h. Consider
a node x_{h+p}, with d ≤ p < D. Consider two pairs, ({x_{h+p}, x_{h+p+1}}, r_p) ∈ Z_{x_h}
and ({x_{h+p}, x_{h+p+1}}, r′_{p−d}) ∈ Z_{x_{h+d}}. Since δ(⟨x_{h+d} . . . x_{h+p+1}⟩, g) ≤ θ, we have
r′_{p−d} = p − d + δ(⟨x_{h+d} . . . x_{h+p+1}⟩, g) + 1 ≤ p − d + θ + 1. By the definition of Y, we
have d > θ, hence r_p = p + δ(⟨x_h . . . x_{h+p+1}⟩, g) + 1 ≥ p + 1 > p − d + θ + 1 ≥ r′_{p−d}.
This implies that r_p ≠ r′_{p−d}, and hence the sets Z_y and Z_{y′} are disjoint. Notice that if
y′ is at distance D from y, then y′ is the antipodal node of y and hence Z_y and Z_{y′}
are disjoint because the edges in their elements are different. It follows that all
sets Z_y are pairwise disjoint, hence ∪_{y∈Y} Z_y has at least D⌈D/(θ + 1)⌉ elements.
Since each element corresponds to at least one message sent, we conclude that
the algorithm uses at least D⌈D/(θ + 1)⌉ ∈ Ω(D²/θ) messages. ⊓⊔

3 The Algorithm

The general idea of our labeled map construction algorithm is to spend the
allowed delay θ in a preprocessing phase that deactivates some nodes, using
the residual time D for a phase devoted to information spreading. This results
in a reduction of the overall cost of the algorithm, with respect to flooding,
since non-active nodes are only responsible for relaying messages originated at
nodes that remained active after the preprocessing phase. Hence, this approach
requires deactivating as many nodes as possible. However, within delay θ, we
cannot afford to deactivate sequences of consecutive nodes of length larger than
2θ. Indeed, deactivating such long sequences would imply that the label of some
non-active node is unknown to all active nodes, which would make the time of
the information spreading phase exceed the remaining D rounds. We reconcile
these opposite requirements by defining local rules that allow us to deactivate
almost half of the currently active nodes, without deactivating two consecutive
ones. This process is then iterated as many times as possible within delay θ.
The preprocessing phase of our algorithm is divided into stages, each of which
is in turn composed of multiple steps. In the first stage, all nodes are active.
Nodes that become non-active at the end of a stage will never become active
again. In order to simplify the description of the algorithm, we will use the
concept of residual ring. In such a ring, the set of nodes is a subset of the
original set of nodes, and edges correspond to paths of consecutive removed
nodes. In particular, stage i is executed on the residual ring R_i composed of the
nodes that remained active at the end of the previous stage. Communication
between consecutive nodes of R_i is simulated by multi-hop communication in
the original ring, where non-active nodes relay messages of active nodes. Each
simulated message exchange during stage i is allotted 2^{i−1} rounds.
Steps inside stage i are devoted to the election of (i, j)-leaders, where j is the
number of the step. At the beginning of the first step of stage i, the (i, 0)-leaders
are all the still active nodes. Step j of stage i is executed on the residual ring R_{i,j}
composed of the (i, j − 1)-leaders from step j − 1. Multi-hop communication
between two consecutive nodes in R_{i,j} is allotted 2^{i−1}4^{j−1} rounds.
Whenever a node v (active or not) sends or relays a message to its neighbor
w, it appends to the message its label and the port number, at v, corresponding
to the edge {v, w}. In order to simplify the description of the algorithm, we omit
these message parts. We use log to denote logarithms to base two.
We first introduce three procedures that will be used as parts of our algorithm.
The first procedure is due to Cole and Vishkin [4] and Goldberg et al. [8]. It colors
every ring with at most three colors, so that adjacent nodes have distinct colors.
We call it Procedure RTC as an abbreviation of ring three coloring.
Procedure RTC
Input: i, j.
The procedure starts from a ring whose nodes have unique labels of k bits and
produces a coloring of the ring using at most 3 colors in time O(log∗ k). Let
{1, 2, 3} be the set of these colors. Let α log∗ k, where α is a positive constant,
be an upper bound on the duration of this procedure when labels are of k bits.
The procedure with input i, j is executed on the residual ring R_{i,j}. ⋄
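The deterministic three-coloring behind Procedure RTC can be sketched as follows (a simplified single-process simulation of the Cole–Vishkin color reduction of [4,8] on an oriented ring, assuming unique non-negative integer labels; it produces colors {0, 1, 2} rather than the paper's {1, 2, 3}, and the actual procedure runs distributedly, one round per iteration):

```python
def cv_round(colors):
    """One Cole-Vishkin reduction round on an oriented ring.

    Each node compares its color with its clockwise neighbor's, finds
    the lowest bit position k where they differ, and recolors itself
    2k + b, where b is its own bit at position k.  If the input coloring
    is proper, the output coloring is proper as well.
    """
    n = len(colors)
    new = []
    for i in range(n):
        diff = colors[i] ^ colors[(i + 1) % n]
        k = (diff & -diff).bit_length() - 1       # lowest differing bit
        new.append(2 * k + ((colors[i] >> k) & 1))
    return new

def three_color_ring(labels):
    """Properly color an oriented ring with colors {0, 1, 2}."""
    colors = list(labels)
    while max(colors) > 5:        # O(log* L) rounds down to 6 colors
        colors = cv_round(colors)
    n = len(colors)
    for big in (5, 4, 3):         # three more rounds: 6 colors -> 3
        colors = [min({0, 1, 2} - {colors[i - 1], colors[(i + 1) % n]})
                  if colors[i] == big else colors[i]
                  for i in range(n)]
    return colors
```

Nodes of the currently largest color are never adjacent, so they can all recolor simultaneously with a color absent from both neighbors.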
The second procedure elects (i, j)-leaders in the ring Ri,j .
Procedure Elect
Input: i, j.
Each node u sends its color c(u) ∈ {1, 2, 3} to its neighbors in R_{i,j}.
Let v and w be the neighbors of u in R_{i,j}.
Node u becomes an (i, j)-leader if and only if c(u) > c(v) and c(u) > c(w). ⋄
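In a centralized simulation (a hypothetical helper, assuming the colors come from a proper coloring as produced by RTC), the election is just a strict local-maximum test:

```python
def elect(colors):
    """Indices of nodes whose color strictly exceeds both neighbors'."""
    n = len(colors)
    return [i for i in range(n)
            if colors[i] > colors[i - 1] and colors[i] > colors[(i + 1) % n]]
```

Since a proper coloring with three colors cannot increase along more than two consecutive edges, one can check that consecutive elected leaders end up within distance 4 of each other, which is the source of the factor 4 in Lemma 2.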
The third procedure is used to deactivate a subset of active nodes at the end
of each stage.
Procedure Deactivate
Input: i, ε.
Each (i, ⌈log(8/ε)⌉)-leader u sends its color c(u) ∈ {1, 2, 3} to both its neighbors
in R_i.
All nodes in R_i that are not (i, ⌈log(8/ε)⌉)-leaders, upon receiving a message
containing a sequence of colors from a neighbor in R_i, add their color to the
message and relay it to the other neighbor in R_i.
Let l and r be two consecutive (i, ⌈log(8/ε)⌉)-leaders. Let S be the sequence
of consecutive active nodes between l and r. Each node in the sequence S, upon
discovering the sequence of colors in S and its position in the sequence, proceeds
according to the following rules.
– If S is of odd length, i.e., S = l a1 . . . a_{k−1} a_k b_{k−1} . . . b1 r, nodes a_t and b_t
become non-active, for all odd values of t. This means that every second
node is deactivated, starting from both ends.
– If S is of even length, i.e., S = l a_k . . . a1 b1 . . . b_k r, nodes a_t and b_t become
non-active, for all even values of t. This means that every second node is
deactivated, starting from the neighbors of the two central nodes. ⋄
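The two rules can be checked with a small helper (a hypothetical, centralized rendering of the rule; positions 1..m number the active nodes of S strictly between the two leaders l and r):

```python
def deactivated_positions(m):
    """Positions (1..m) of the m active nodes strictly between two
    consecutive leaders that become non-active under the odd/even
    rules of Procedure Deactivate."""
    out = set()
    if m % 2 == 1:                  # S = l a1 ... ak ... b1 r, m = 2k - 1
        k = (m + 1) // 2
        for t in range(1, k + 1, 2):       # odd t
            out.add(t)                     # a_t, counted from the left end
            out.add(m + 1 - t)             # b_t, counted from the right end
    else:                           # S = l ak ... a1 b1 ... bk r, m = 2k
        k = m // 2
        for t in range(2, k + 1, 2):       # even t
            out.add(k + 1 - t)             # a_t
            out.add(k + t)                 # b_t
    return sorted(out)
```

No two returned positions are adjacent, so every deactivated node keeps an active neighbor, while roughly half of the nodes of S disappear; this drives the geometric decrease of Lemma 5.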

We are now ready to provide a detailed description of our labeled map con-
struction algorithm. For each task that cannot be carried out locally, we allot
a specific number of rounds to maintain synchronization between the execution
of a given part of the algorithm by different nodes. In the analysis we will show
that the allotted times are always sufficient.
Algorithm RingLearning
Input: D, θ, and ε.
Phase 1 – preprocessing
set all nodes as active – (locally);
for i ← 1 to log θ − 2⌈log(8/ε)⌉ − log(α(log∗ D + 3)) //STAGE
  construct the residual ring R_i of active nodes – (locally);
  elect all nodes in R_i as (i, 0)-leaders – (locally);
  for j ← 1 to ⌈log(8/ε)⌉ //STEP
    construct the residual ring R_{i,j} of (i, j − 1)-leaders – (locally);
    assign color c(u) to all nodes u in R_{i,j} with procedure RTC(i, j);
    (allotted time 2^{i−1}4^{j−1}α(log∗ D + 1))
    elect (i, j)-leaders with procedure Elect(i, j);
    (allotted time 2^{i−1}4^{j−1})
  run procedure Deactivate(i, ε) in R_i;
  (allotted time 2^{i−1}4^{⌈log(8/ε)⌉})
Phase 2 – information spreading
in round θ + 1 each node that is still active constructs locally a labeled map of
the part of the original ring consisting of nodes from which it received messages
during Phase 1, and sends this map to its neighbors;
both active and non-active nodes that receive a message from one neighbor send
it to the other neighbor;
at time D + θ, all nodes have the labeled map of the ring and stop. ⋄
We now prove the correctness of Algorithm RingLearning and analyze it by
estimating its cost for a given delay θ. The first two lemmas show that the time
2^{i−1} allotted for multi-hop communication between consecutive active nodes in
stage i, and the time 2^{i−1}4^{j−1} allotted for multi-hop communication between
consecutive (i, j − 1)-leaders in step j of stage i, are sufficient to perform the
respective tasks.

Lemma 2. The distance between two consecutive (i, j)-leaders is at most 2^{i−1}4^j.

The next two lemmas will be used to prove the correctness of Algorithm RingLearning.

Lemma 3. All calls to procedures RTC, Elect, and Deactivate can be carried
out within times allotted in Algorithm RingLearning.

Proof. In view of Lemma 2, time 2^{i−1}4^{j−1} is sufficient to perform a message
exchange between consecutive (i, j − 1)-leaders in stage i.
Let L be the length of the binary strings that are labels of nodes. Since L is
polynomial in D, the execution of Procedure RTC(i, j) in the residual ring R_{i,j}
is completed in time at most 2^{i−1}4^{j−1}α log∗ L ≤ 2^{i−1}4^{j−1}α(log∗ D + 1). Hence
the allotted time is sufficient.
Running Procedure Elect(i, j) requires time 2^{i−1}4^{j−1} to allow each (i, j − 1)-
leader to learn the new color of its neighboring (i, j − 1)-leaders. Hence the
allotted time is sufficient.
Running Procedure Deactivate(i, ε) on the residual ring R_i takes time
2^{i−1}4^{⌈log(8/ε)⌉}. Indeed, within this time, all nodes between two consecutive
(i, ⌈log(8/ε)⌉)-leaders learn the labels of all nodes between them and decide locally
if they should be deactivated. Hence the allotted time is sufficient. ⊓⊔

Lemma 4. log θ − 2⌈log(8/ε)⌉ − log(α(log∗ D + 3)) stages can be completed
in time θ.

We are now ready to prove the correctness of our algorithm.


Theorem 2. Upon completion of Algorithm RingLearning all nodes of the ring
correctly construct its labeled map.

Proof. The correctness of Procedure RTC follows from [8], provided that enough
time is allotted for its completion. Elections of (i, j)-leaders are carried out ac-
cording to the largest color rule by Procedure Elect(i, j), provided that each
node knows the colors assigned to its neighbors in Ri,j . Decisions to become non-
active can be carried out locally by each node, according to the appropriate rule
from Procedure Deactivate, provided that the nodes of each sequence S between
two (i, ⌈log(8/ε)⌉)-leaders know the entire sequence. By Lemma 3 the times
allotted to all three procedures are sufficient to satisfy the above conditions.
Due to Lemma 4, all nodes stop executing the preprocessing phase within
round θ, hence D more rounds are available for the information spreading phase.
At the end of stage i each (i, ⌈log(8/ε)⌉)-leader knows the sequences of node
labels and port numbers connecting it to both closest (i, ⌈log(8/ε)⌉)-leaders.
Hence, at the beginning of the information spreading phase, the union of the
sequences known to all active nodes covers the entire ring, and consecutive se-
quences overlap. This in turn implies that, after D rounds of the information
spreading phase, all nodes get the complete labeled map of the ring. ⊓⊔

The next three lemmas are used to analyze the cost of Algorithm RingLearning,
running with delay θ.

Lemma 5. At the end of stage i there are at most n((ε/2 + 1)/2)^i active nodes
in a ring of size n.

Lemma 6. The cost of the preprocessing phase of Algorithm RingLearning with
input parameters D, θ, and ε, where 0 < ε < 1, is O(D log∗ D θ^{log(1+ε)} log θ/ε²).

Proof. As shown in the proof of Lemma 4, the time used for stage i is at most
2^{i−1}4^s α(log∗ D + 3), where s = ⌈log(8/ε)⌉ is the number of steps in each stage.
In view of Lemma 5, during stage i there are at most n((ε/2 + 1)/2)^{i−1} active
nodes in a ring of size n. Hence the cost of stage i is at most
    2^{i−1} 4^s α(log∗ D + 3) · n ((ε/2 + 1)/2)^{i−1}.

Since the number of stages is less than log θ, the overall cost of the preprocessing
phase is less than

    Σ_{i=1}^{⌊log θ⌋} 2^{i−1} 4^s α(log∗ D + 3) · n ((ε/2 + 1)/2)^{i−1}.

Bounding each summand by the last one, which is the largest, we obtain

    Σ_{i=1}^{⌊log θ⌋} 2^{i−1} 4^s α(log∗ D + 3) · n ((ε/2 + 1)/2)^{i−1} ≤ (8/ε)² αn(log∗ D + 3) θ^{log(1+ε/2)} log θ,

which is O(D log∗ D θ^{log(1+ε)} log θ/ε²). ⊓⊔

Lemma 7. The cost of the information spreading phase of Algorithm
RingLearning with input parameters D, θ, and ε, where 0 < ε < 1,
is O(D² log∗ D/(ε²θ^{1−ε})).

Theorem 3. The cost of Algorithm RingLearning, executed in time D + θ in a
ring of diameter D, is O(D² log∗ D/θ^{1−ε}), for any constant parameter 0 < ε < 1
and any θ ≤ D.

Proof. Lemmas 6 and 7 imply that the cost of Algorithm RingLearning, ex-
ecuted with parameters D, θ, and ε, in a ring of diameter D, is of the or-
der O(D log∗ D θ^{log(1+ε/2)} log θ + D² log∗ D/θ^{1−ε}), for any constant 0 < ε <
1. Since log(1 + ε/2) − ε is negative for all ε > 0, and θ ≤ D, we have
θ^{1+log(1+ε/2)−ε} log θ < D for sufficiently large D. Hence D/θ^{1−ε} > θ^{log(1+ε/2)} log θ,
which implies

    O(D log∗ D θ^{log(1+ε/2)} log θ + D² log∗ D/θ^{1−ε}) = O(D² log∗ D/θ^{1−ε}). ⊓⊔

4 Discussion and Open Problems


We proved almost matching upper and lower bounds for the tradeoffs between
time and cost of the labeled map construction task in the class of rings. Can
these tradeoffs be generalized to a larger class of networks? Since lower bounds
are stronger when established on a more restricted class of graphs, the challenge
would be to extend our algorithms, that provide an almost matching upper
bound, to more general networks.
First observe that our approach could not be extended directly. Indeed, we rely
on repeated stages consisting of coloring with few colors and of node deactivation.

A subsequent stage works on the residual network of active nodes from the
previous stage. In the case of rings, the residual network remains a ring and hence
coloring with few colors can be done again. As soon as we move to networks of
degree higher than two, the maximum degree of the residual network can grow
exponentially in the number of stages, and thus the technique of fast coloring
with few colors cannot be applied repeatedly. However, allowing delays larger
than D, but still linear in D, permits to use a different approach that is successful
on a larger class of networks.
Consider the class of networks in which neighborhoods of nodes grow polynomially
in the radius. More precisely, let N_r(v) be the set of all nodes within
distance at most r from v. We will say that a network has polynomially growing
neighborhoods, if there exists a constant c ≥ 1 (called the growth parameter)
such that |N_r(v)| ∈ Θ(r^c) for all nodes v. Notice that the class of networks
with polynomially growing neighborhoods is fairly large, as it includes, e.g., all
multidimensional grids and tori, as well as rings. On the other hand, all such
networks have bounded maximum degree.
Consider the following doubling algorithm, working in two phases. The prepro-
cessing phase of the algorithm is a generalization of the leader election algorithm
for rings from [9]. In the beginning all nodes are active. Each node v that is
active at the beginning of stage i ≥ 0 has the largest label in the neighborhood
N_{2^i}(v). In stage i, every active node sends its label to distance 2^{i+1} and it
remains active at the end of this stage if it does not receive any larger label. We
devote a given amount of time τ to the preprocessing phase. The rest of the
algorithm is the information spreading phase, in which each node that is still
active constructs independently a BFS spanning tree of the network in time D.
Information exchange in each BFS tree is then initiated by its leaves and com-
pleted in additional time 2D. (Hence BFS trees constructed by active nodes are
used redundantly, but, as will be seen, the total cost can still be controlled.)
Upon completion of information spreading, each node has a labeled map of the
whole network.
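The preprocessing phase of the doubling algorithm is easy to simulate on a ring (an illustrative, centralized sketch; the number of stages is determined by the time budget τ, the elimination mirrors the algorithm of [9], and labels are assumed distinct):

```python
def doubling_preprocess(labels, stages):
    """Set of nodes still active after the given number of doubling stages.

    In stage i (counted from 0) every active node sends its label to
    distance 2^(i+1) and stays active iff no larger label from another
    active node reaches it.
    """
    n = len(labels)

    def dist(u, v):                      # distance on the ring
        return min((u - v) % n, (v - u) % n)

    active = set(range(n))
    for i in range(stages):
        r = 2 ** (i + 1)
        active = {v for v in active
                  if all(labels[v] > labels[u]
                         for u in active if u != v and dist(u, v) <= r)}
    return active
```

After stage i the surviving nodes are pairwise more than 2^{i+1} apart, so stage i has at most n/2^i senders, each reaching distance 2^{i+1}; every stage therefore costs O(n) messages, which is where the n log τ term of Proposition 1 comes from.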
We now analyze the cost of the above doubling algorithm.
Proposition 1. The cost of the doubling algorithm, executed in time 3D + τ on
a network with polynomially growing neighborhoods, of diameter D and size n,
is in O(n log τ + nD^β/τ^β), for some constant β > 1.
In particular, for τ ∈ Θ(D), i.e., when the total available time is (3 + η)D for
some constant η > 0, the total cost is O(n log D).
We close the paper with two open problems. The above tradeoffs are valid
for fairly large running times (above 3D). This means that the tradeoff curve
remains flat for a long period of time. It is thus natural to ask for tradeoffs
between cost and time for delays below D, i.e., for overall time below 2D. Can
such tradeoffs be established for some other classes of networks (such as bounded
degree networks or even just grids and tori), similarly as we did for rings?
Finally, notice that for rings the information spreading phase can be performed
in time 2D (instead of 3D) by letting each active node initiate two sequences
of messages (one clockwise, and the other counterclockwise), each containing

labels of all already visited nodes. Moreover, the overall cost of the doubling
algorithm, executed in time 2D + τ on a ring of diameter D and size n, is
O(n log τ + nD/τ) = O(D log τ + D²/τ). This should be compared to the cost
of Algorithm RingLearning, which can be as small as O(D^{1+ε} log∗ D) for total
time 2D and any constant ε > 0. The cost of the doubling algorithm becomes
asymptotically smaller when the overall time is larger than 2D + D^{1−ε}/log∗ D.
Closing the small gap between our bounds on the time vs. cost tradeoffs for
labeled map construction on rings is another open problem.

Competitive Auctions for Markets
with Positive Externalities

Nick Gravin1 and Pinyan Lu2


1 Division of Mathematical Sciences, School of Physical and Mathematical Sciences,
Nanyang Technological University, Singapore
[email protected]
2 Microsoft Research Asia, China
[email protected]

Abstract. In digital goods auctions, the auctioneer sells an item in un-
limited supply to a set of potential buyers. The objective is to design a
truthful auction that maximizes the auctioneer’s total profit. Motivated
by the observation that the buyers’ valuation of the good might be in-
terconnected through a social network, we study digital goods auctions
with positive externalities among buyers. This defines a multi-parameter
auction design problem where the private valuation of every buyer is
a function of the set of other winning buyers. The main contribution
of this paper is a truthful competitive mechanism for subadditive valua-
tions. Our competitive result is with respect to a new solution benchmark
F (3) . On the other hand, we show a surprising impossibility result if com-
paring to the stronger benchmark F (2) , where the latter has been used
quite successfully in digital goods auctions without externalities [16].

1 Introduction
In economics, the term externality is used to describe situations in which private
costs or benefits to the producers or purchasers of a good or service differ from
the total social costs or benefits entailed in its production and consumption. In
this context a benefit is called a positive externality, while a cost is referred to
as a negative one. One need not go far to find examples of positive external
influence in digital and communications markets, where a customer's decision
to buy a good or purchase a service strongly relies on its popularity among
his/her friends or, more generally, among other customers; e.g., instant messenger
and cell phone users will want a product that allows them to talk easily and
cheaply with their friends. Another good example is a social network, where a user is
more likely to appreciate membership in a network if many of his/her friends
are already using it. There exist a number of applications, like the very popular
FarmVille on the online social network Facebook, where a user would have more fun
when participating with friends. In fact, quite a few such applications explicitly
reward players with a large number of friends.
On the other hand, negative external effects occur when a potential buyer,
e.g. a big company, incurs a great loss if a subject it fights for, like a small firm or

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 569–580, 2013.

© Springer-Verlag Berlin Heidelberg 2013

company, goes to its direct competitor. Another well-studied example related to


computer science is the allocation of advertisement slots [1, 13–15, 17, 23], where
every customer would like to see a smaller number of competitors’ advertisements
on a web page that contains his/her own advert. One may also face mixed
externalities as in the case of selling nuclear weapons [21], where countries would
like to see their allies win the auction rather than their foes.
We investigate the problem of mechanism design for auctions with positive
externalities. We study a scenario where an auctioneer sells a good, with no more
than a single copy ending up in the hands of each customer. We define a model
for externalities among the buyers in a sealed-bid auction with an unlimited
supply of the good. This type of auction arises naturally in digital markets,
where making a copy of the good (e.g., a CD with songs or games, or an extra
copy of an online application) has a negligible cost compared to the final price
and can be done at any time the seller chooses.
A similar agenda was introduced in [18], whose authors consider a Bayesian
framework and study positive externalities in social networks with
single-parameter bidders and submodular valuations. The model in the most gen-
eral form can be described by a number of bidders n, each with a non-negative
private valuation function vi (S) depending on the possible winning set S. This is
a natural multi-parameter mechanism design model that may be considered a
generalization of the classical auctions with unlimited supply, i.e., auctions where
the number of items for sale is greater than the number of buyers.
Traditionally the main question arising in such situations is how to maximize
the seller’s revenue. In literature on the classical auctions without any exter-
nalities many diverse approaches to this question have been developed. In the
current work we pick a classical approach and benchmark (cf. [16]), namely the
best-uniform-price benchmark called F , which is different from Bayesian frame-
work. There one seeks to maximize the ratio of the mechanism’s revenue to the
revenue of F taken in the worst case over all possible bids. In particular a mech-
anism is called competitive if such a ratio is bounded by some uniform constant
for each possible bid. However, it was shown that there is no competitive truthful
mechanism w.r.t. F , and therefore to get around this problem, a slightly modi-
fied benchmark F (2) [16] was proposed. The only difference between F (2) and F is one
additional requirement that at least two buyers should be in a winning set. Thus
F (2) becomes a standard benchmark in analyzing digital auctions [11,12,16,20].
Similarly to F (2) one may define benchmark F (k) for any fixed constant k. It
turns out that the same benchmarks can be naturally adapted to the case of
positive externalities. Surprisingly, F (2) fails to serve as a benchmark in social
networks with positive externalities, i.e. no competitive mechanism exists w.r.t.
F (2) . Therefore, we go further and consider the next natural candidate for the
benchmark, which is F (3) .
The main contribution of this paper is a universally truthful competitive
mechanism for the general multi-parameter model with subadditive valuations
(a substantially broader class than submodular) w.r.t. the F (3) benchmark. We com-
plement this result with a proof that no truthful mechanism can achieve constant

ratio w.r.t. F (2) . In order to do so, we introduce a restricted model with a single
private parameter which in some respects resembles the one considered in [18];
further, for this restricted model we give a simple geometric characterization of
all truthful mechanisms and, based on this characterization, show that there
is no competitive truthful mechanism w.r.t. F (2) .
Our model is the so-called multi-parameter or multi-dimensional model (see
[25]), as utility of every agent may not be described by a single real number for
all possible outcomes of the mechanism. Mechanism design in this case is known
to be harder than in the single-parameter domains.

1.1 Related Work

Many studies on externalities in the direction of pricing and marketing strategies


over social networks have been conducted over the past few years. In many ways,
they have been caused by the development of social-networks on the Internet,
which has allowed companies to collect information about each user and user
relationships.
Earlier works have generally focused on influence maximization prob-
lems (see Chapter 24 of [24]). For instance, Kempe et al. [22] study the algorith-
mic question of finding a set of nodes of highest influence in a social network.
From the economic literature one could name such papers as [26], which studies
the effect of network topology on a monopolist’s profits and [10], which studies
a multi-round pricing game, where a seller may lower his price in an attempt to
attract low value buyers. These works take no heed of algorithmic motivation.
There are several more recent papers [2, 7, 9, 19] studying the question of
revenue maximization as well as work studying the posted price mechanisms
[3, 5, 8, 19].
We could not continue without mentioning a beautiful line of research on
revenue maximization for classical auctions, where the objective is to maximize
the seller’s revenue compared to a benchmark in the worst case. We cite here only
some papers that are most relevant to our setting [4,11,12,16,20]. With respect
to the refined best-uniform-price benchmark F (2) a number of mechanisms with
constant competitive ratio were obtained; each subsequent paper improving the
competitive ratio of the previous one [11, 12, 16, 20]. The best known current
mechanism is due to Hartline and McGrew [20] and has a competitive ratio
of 3.25. On the other hand, a lower bound of 2.42 has been proven by Goldberg
et al. [16]. The question of closing the gap still remains open.

2 Preliminaries

We suppose that in a marketplace n agents are present, the set of which we


denote by [n]. Each agent i has a private valuation function vi , which assigns a
non-negative real number to each possible winner set S ⊂ [n]. The seller organizes
a single-round sealed-bid auction, where agents submit their valuations bi (S)
to an auctioneer for all possible winner sets S, and the auctioneer then chooses

agents who will obtain the good and a vector of prices to charge each of them.
The auctioneer is interested in maximizing his/her revenue.
For every i ∈ [n] we impose the following mild requirements on vi .
1. vi (S) ≥ 0.
2. vi (S) = 0 if i ∉ S.
3. vi (S) is a monotone sub-additive function of S, i.e.,
   (a) vi (S) ≤ vi (R) if S ⊆ R ⊆ [n];
   (b) vi (S ∪ R) ≤ vi (S) + vi (R) for each i ∈ S, R ⊆ [n].
We should note here that the sub-additivity requirement is only for those subsets
that include the agent i. This is a natural assumption since vi (S) = 0 if i ∉ S.

2.1 Mechanism Design


Each agent in turn would like to get a positive utility that is as high as possible
and may lie strategically about his/her valuation. The utility ui (S) of an agent i
for a winning set S is simply the difference between his valuation vi (S) and the price pi
that the auctioneer charges him. Thus one of the desired properties for the auction is the
well known concept of truthfulness or incentive compatibility, i.e. the condition
that every agent maximizes his utility by truth telling.
It is worth mentioning that our model is that of multi-parameter mechanism
design and, moreover, that collecting the whole collection of values vi (S) for every
i ∈ [n] and S ⊂ [n] would require a number of bits exponential in n and thus
is inefficient. However, in the field of mechanism design there is a way to get
around such a problem of exponential input size with the broadly recognized
concept of black box value queries. The latter simply means that the auctioneer,
instead of getting the whole collection of bids instantly, may ask every agent i
during the mechanism execution only for a small part of his input, i.e., a
number of questions about valuation of i for certain sets. We note that the
agent still may lie in the response to each such query. We denote the bid of i by
bi (S) to distinguish it from the actual valuation vi (S). Thus if we are interested
in designing a computationally efficient mechanism, we can only ask in total a
polynomial in n number of queries.
Throughout the paper, we denote by M a mechanism with allocation rule
A and payment rule P. Allocation algorithm A may ask queries about valuations
of any agent for any possible set of winners. Thus A has an oracle black box
access to the collection of bid functions bi (S). For each agent i in the winning
set S the payment algorithm decides a price pi to charge. The utility of agent i is
then ui = vi (S) − pi if i ∈ S and 0 otherwise. To emphasize the fact that agents
may report untruthfully we will use ui (bi ) notation for the utility function in
the general case and ui (vi ) in the case of truth telling. We assume voluntary
participation for the agents, that is ui ≥ 0 for each i who reports the truth.

2.2 Revenue Maximization and Possible Benchmarks


We discuss here the problem of revenue maximization from the seller's point
of view. The revenue of the auctioneer is simply the total payment Σi∈S pi of

all buyers in the winning set. We assume that the seller incurs no additional
cost for making a copy of the good. This assumption is essential for our model,
since unlike the classical digital auction case there is no simple reduction of the
settings with a positive price per issued item to the settings with zero price.
The best revenue the seller can hope for is Σi∈[n] vi ([n]). However, it is not
realistic when the seller does not know agents’ valuation functions. We follow the
tradition of previous literature [11, 12, 16, 20] of algorithmic mechanism design
on competitive auctions with limited or unlimited supply and consider the best
revenue uniform price benchmark, which is defined as maximal revenue that
the auctioneer can get for a fixed uniform price for the good. In the literature
on classical competitive auctions this benchmark was called F and is formally
defined as follows.

Definition 1 (F without Externalities). For the vector of agents' bids b,

    F (b) = max_{c ≥ 0, S ⊂ [n]} { c · |S| : ∀i ∈ S, bi ≥ c }.

This definition generalizes naturally to our model with externalities and is


defined rigorously as follows.
Definition 2 (F with Externalities). For the collection of agents' bid func-
tions b,

    F (b) = max_{c ≥ 0, S ⊂ [n]} { c · |S| : ∀i ∈ S, bi (S) ≥ c }.

The important point in considering F in the setting of classical auctions is that


the auctioneer, when he/she is given in advance the best uniform price, can run
a truthful mechanism with corresponding revenue. It turns out that the same
mechanism works neatly for our model. Specifically, a seller who is given the
price c in advance can begin with the set of all agents and drop one by one those
agents with negative utility (bi (S) − c < 0); once no agents are left to
delete, the auctioneer sells the item to all surviving buyers at the given price c.
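As an illustration, this fixed-price procedure admits a direct transcription (a sketch under our own representation of bids as a function `bid(i, S)`; all names are illustrative, not from the paper):

```python
def sell_at_fixed_price(agents, bid, c):
    """Given the uniform price c in advance, start from the set of all
    agents and repeatedly drop those with negative utility, i.e. agents
    whose bid for the current winner set is below c; then sell to the
    survivors at price c."""
    S = set(agents)
    while True:
        to_drop = {i for i in S if bid(i, frozenset(S)) < c}
        if not to_drop:
            break
        S -= to_drop
    return S, {i: c for i in S}  # winners and the uniform prices charged
```

Note that with positive externalities dropping one agent can lower the remaining agents' valuations, which is why the deletion must be iterated until a fixed point.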
In these circumstances, a natural problem arising for the auctioneer is to
devise a truthful mechanism which has a good approximation ratio of the mech-
anism’s revenue to the revenue of the benchmark at any possible bid vector b.
Such a ratio is usually called the competitive ratio of a mechanism. However, it
was shown (cf. [16]) that no truthful mechanism can guarantee any constant com-
petitive ratio w.r.t. F . Specifically, the unbounded ratio appears in the instances
where the benchmark buys only one item at the highest price. To overcome this
obstacle, a slightly modified benchmark F (2) has been proposed and a number
of competitive mechanisms w.r.t. F (2) were obtained [11, 12, 16, 20]. The only
difference of F (2) from F is one additional requirement that at least two buyers
should be in the winning set. Similarly, for any k ≥ 2 we may define F (k) .

Definition 3.

    F (k) (b) = max_{c ≥ 0, S ⊂ [n]} { c · |S| : |S| ≥ k, ∀i ∈ S, bi (S) ≥ c }.
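For intuition, on small instances the benchmark can be evaluated by brute force over winner sets: for a fixed S the best uniform price is min over i in S of bi(S), so there is no need to search over prices. The following sketch (exponential in n by design, names ours) makes this concrete:

```python
from itertools import combinations

def benchmark_f_k(agents, bid, k):
    """Brute-force F^(k): the best uniform-price revenue c * |S| over all
    winner sets S with |S| >= k such that every i in S bids at least c."""
    agents = list(agents)
    best = 0
    for size in range(k, len(agents) + 1):
        for S in combinations(agents, size):
            Sf = frozenset(S)
            # the optimal uniform price for this candidate set S
            price = min(bid(i, Sf) for i in S)
            best = max(best, price * size)
    return best
```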

However, in the case of our model the benchmark F (2) does not imply the exis-
tence of a constant-approximation truthful mechanism. To illustrate this,
in Section 4 we introduce a couple of new models which differ from the
original one in certain additional restrictions on the domain of agents' bids. We
further give a complete characterization of truthful mechanisms for these new
restricted settings substantially exploiting the fact that every agent’s bidding
language is single-parameter. Later, we use that characterization to argue that
no truthful mechanism can achieve constant approximation with respect to F (2)
benchmark even for these cases. On the positive side, and quite surprisingly, we
can furnish our work in the next section with a truthful mechanism which has
a constant approximation ratio w.r.t. the F (3) benchmark for the general case of
multi-parameter bidding.

3 Competitive Mechanism

Here we give a competitive truthful mechanism, that is a mechanism which


guarantees that the auctioneer gets a constant fraction of the revenue he could
get for the best fixed price benchmark assuming that all agents bid truthfully.
We call it Promotion-Testing-Selling Mechanism. In the mechanism we
give the good to certain agents for free, that is without requiring any payment.
The general scheme of the mechanism is as follows.

Promotion-Testing-Selling Mechanism

1. Put every agent at random into one of the sets A, B, C.
2. Denote by rA (C) and rB (C) the largest fixed-price revenues one can
   extract from C given that, respectively, either A or B got the
   good for free.
3. Let r(C) = max{rA (C), rB (C)}.
4. Sell items to agents in A for free.
5. Apply Cost Sharing Mechanism(r(C), B, A) to extract revenue r(C)
from set B given that A got the good for free.

Bidders in A receive items for free and increase the demand of agents from B.
One may say that they “advertise” the goods and resemble the promotion that
occurs when selling to participants. The agents in C play the role of the “testing”
group, the only service of which is to determine the right price. Note that we
take no agents of the testing group into the winning set, therefore, they have
nothing to gain for bidding untruthfully. The agents of B appear to be the source
of the mechanism’s revenue, which is being extracted from B by a cost sharing
mechanism as follows.
We note here that a more “natural” mechanism is simply to set r(C) =
rA (C) rather than max{rA (C), rB (C)}. But unfortunately, we have a counter-
example showing that this simpler mechanism cannot guarantee a constant ap-
proximation ratio compared to our benchmark.

Cost Sharing Mechanism(r,X,Y)

1. S ← X.
2. Repeat until T = ∅:
   – T ← {i | i ∈ S and bi (S ∪ Y ) < r/|S|}.
   – S ← S \ T.
3. If S ≠ ∅, sell items to everyone in S at price r/|S|.
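In the same illustrative style as the sketches one might write for the fixed-price procedure (bids represented by a function `bid(i, S)`; names are ours), the cost-sharing routine reads:

```python
def cost_sharing(r, X, Y, bid):
    """Try to extract total revenue r from the agents of X in equal shares,
    given that the agents of Y received the good for free: repeatedly drop
    every agent whose bid for the current winner set (together with Y) is
    below the share r / |S|, then sell at that share if anyone survives."""
    S, Y = set(X), set(Y)
    while S:
        share = r / len(S)
        T = {i for i in S if bid(i, frozenset(S | Y)) < share}
        if not T:
            return S, {i: share for i in S}
        S -= T
    return set(), {}  # target revenue r was too high; nothing is sold
```

As each drop shrinks S, the share r/|S| only grows while monotone valuations only shrink; this is the structure the truthfulness argument below relies on.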

Lemma 4. Promotion-Testing-Selling Mechanism is universally truthful.

Proof. The partitioning of the set [n] into A, B, C does not depend on the
agent bids. When the partition is fixed, our mechanism becomes deterministic.
Therefore, we are only left to prove the truthfulness for that deterministic part.
Let us do so by going through the proof separately for each set A, B and C.

– Bids of agents in A do not affect the outcome of the mechanism. Therefore,


they have no incentive to lie.
– No agent from C could profit from bidding untruthfully, since her utility will
be zero regardless of the bid.
– Let us note that the Cost Sharing Mechanism is applied to the agents
in B and the value of r does not depend on their bids, since both rA (C) and
rB (C) are extracted from C irrespective of the bids from A and B. Also let us
note that at each step of the cost sharing mechanism the possible payment
r/|S| is rising, while the valuation, by the monotonicity
condition, is going down. Hence, manipulation of a bid does not help any
agent to survive in the winning set and receive positive utility, if by bidding
truthfully he/she has been dropped from the winning set. Mis-reporting a
bid could not help an agent to alter the surviving set and at the same time
remain a winner. These two observations conclude the proof of truthfulness
for B.

Therefore, from now on we may assume that bi (S) = vi (S).

Theorem 5. Promotion-Testing-Selling Mechanism is universally truthful
and has an expected revenue of at least F (3) /324.

Proof. We are left to prove the lower bound on the competitive ratio of our
mechanism, as we have shown the truthfulness in Lemma 4.
For the purpose of analysis, we separate the random part of our mechanism
into two phases. In the first phase, we divide agents randomly into three groups
S1 , S2 , S3 and in the second one, we label the groups at random by A, B and
C. Note that the combination of these two phases produces exactly the same
distribution over partitions as in the mechanism.
Let S be the set of winners in the optimal F (3) solution and let the best fixed
price be p∗ . For 1 ≤ i ≠ j ≤ 3 we may compute rij , the largest revenue for a fixed
price that one can extract from set Si given Sj is “advertising” the good, that

is agents in Sj get the good for free and thus increase the valuations of agents
from Si though contribute nothing directly to the revenue.
First, let us note that the cost-sharing part of our mechanism will extract one
of these rij for at least one of the six possible labelings, for every sample of the
dividing phase (in general the cost-sharing mechanism may extract 0 revenue, e.g.
if the target revenue is set too high). Indeed, let i0 and j0 be the indices for
which ri0 j0 achieves its maximum over all rij and let {k0 } = {1, 2, 3} \ {i0 , j0 }. Then
the cost-sharing mechanism will extract the revenue r(C) = max(rA (C), rB (C))
on the labeling with Sj0 = A, Si0 = B and Sk0 = C. It turns out, as we will
prove in the following lemma, that one can get a lower bound on this revenue
within a constant factor of rF (C), the revenue obtained from the agents of C in
the benchmark F (3) .

Lemma 6. r(C) ≥ rF (C)/4.

Proof. Let Sc = S ∩ C. Thus, by the definition of F (3) , we have rF (C) = |Sc | · p∗


and for all i ∈ Sc , vi (S) ≥ p∗ .
We define a subset T of Sc as the final result of the following procedure.
1. T ← ∅ and X ← {i | i ∈ Sc and vi (A ∪ {i}) ≥ p∗ /2}.
2. While X ≠ ∅:
   – T ← T ∪ X,
   – X ← {i | i ∈ Sc \ T and vi (A ∪ T ∪ {i}) ≥ p∗ /2}.

For any agent i of T we have vi (A ∪ T ) ≥ p∗ /2 because the valuation function is
monotone. Now if |T | ≥ |Sc |/2, we get the desired lower bound. Indeed,

    r(C) ≥ rA (C) ≥ (|Sc |/2) · (p∗ /2) = |Sc | · p∗ /4 = rF (C)/4.
Otherwise, let W = Sc \ T . Then we have |W | ≥ |Sc |/2. For an agent i ∈ W it
holds true that vi (A ∪ T ∪ {i}) < p∗ /2, since otherwise we would have included
i into T . However, since i wins in the optimal F (3) solution, we have vi (S) ≥ p∗ .
The former two inequalities together with the subadditivity of vi (·) (which gives
vi (S \ (A ∪ T )) + vi (A ∪ T ∪ {i}) ≥ vi (S)) allow us to conclude that
vi (S \ (A ∪ T )) ≥ p∗ /2 for each i ∈ W . Hence, we get vi (B ∪ W ) ≥ p∗ /2 for each
i ∈ W , since S \ (A ∪ T ) ⊆ B ∪ W . Therefore, we are done with the lemma's proof, since

    r(C) ≥ rB (C) ≥ |W | · p∗ /2 ≥ |Sc | · p∗ /4 = rF (C)/4.
Let k1 , k2 , k3 be the number of winners of the optimal F (3) solution, respectively,
in S1 , S2 , S3 .
For any fixed partition S1 , S2 , S3 of the dividing phase, by applying Lemma 6
we get that the expected revenue of our mechanism over the distribution of six
permutations in the second phase is at least

    (1/6) · (1/4) · min{k1 , k2 , k3 } · p∗ .

In order to conclude the proof of the theorem we are only left to estimate the
expected value of min{k1 , k2 , k3 } from below by some constant factor of |S|. The
next lemma will do this for us.
Lemma 7. Let m ≥ 3 items be put independently at random into one of three
boxes, and let a, b and c be the random variables denoting the numbers of items
in these boxes. Then E[min{a, b, c}] ≥ (2/27) m.
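The bound is tight at m = 3, where E[min{a, b, c}] = 6/27 = 2 · 3/27. For small m it can be verified exactly by summing over the multinomial distribution (a verification script of ours, not part of the paper):

```python
from math import comb

def expected_min_of_three(m):
    """Exact E[min(a, b, c)] when m items are thrown independently and
    uniformly into three boxes (a + b + c = m, multinomial weights)."""
    total = 0.0
    for a in range(m + 1):
        for b in range(m - a + 1):
            c = m - a - b
            prob = comb(m, a) * comb(m - a, b) / 3 ** m
            total += prob * min(a, b, c)
    return total
```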

By definition of the benchmark F (3) we have m = k1 + k2 + k3 ≥ 3 and thus


we can apply Lemma 7. Combining every bound we have so far on the expected
revenue of our mechanism we conclude the proof with the following lower bound.
    (1/6) · (1/4) · E[min{k1 , k2 , k3 }] · p∗ ≥ (1/24) · (2/27) · p∗ · m = F (3) /324.

4 Restricted Single-Parameter Valuations


Here we introduce a couple of special restricted cases of the general setting with a
single parameter bidding language. For these models we only specify restrictions
on the valuation functions. In each case we assume that ti is a single private
parameter for agent i that he submits as a bid, and wi (S) and wi′ (S) are fixed,
publicly known functions for each possible winning set S. The models then are
described as follows.
– Additive valuation vi (ti , S) = ti + wi (S).
– Scalar valuation vi (ti , S) = ti · wi (S).
– Linear valuation vi (ti , S) = ti · wi (S) + wi′ (S), i.e. a combination of the
  previous two.
Note that we still require wi (S) = wi′ (S) = 0 if i ∉ S. These settings are now
single parameter domains, which is the most well studied and understood case
in mechanism design.
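These bidding languages can be written down directly; since the additive and scalar forms are special cases of the linear one, a single constructor suffices (our own illustrative encoding, not from the paper):

```python
def linear_valuation(w, w2):
    """v_i(t, S) = t * w_i(S) + w'_i(S) with publicly known w_i, w'_i.
    Scalar valuations take w'_i identically 0; additive ones take
    w_i(S) = 1 on sets containing i (and 0 otherwise, as required)."""
    return lambda t, S: t * w(S) + w2(S)
```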

4.1 A Characterization
The basic question of mechanism design is to describe truthful mechanisms in
terms of simple geometric conditions. Given a vector of n bids, b = (b1 , . . . , bn ),
let b−i denote the vector, where bi is replaced with a ‘?’. It is well known that
truthfulness implies a monotonicity condition stating that if an agent i wins
for the bid vector b = (b−i , bi ) then she should win for any bid vector (b−i , b′i )
with b′i ≥ bi . In single-dimensional domains monotonicity turns out to be a suffi-
cient condition for truthfulness [6], where prices are determined by the threshold
functions.
In our model, valuation of an agent may vary for different winning sets and
thus may depend on his/her bid. Nevertheless, any truthful mechanism still has
to have a bid-independent allocation rule, although now it is not sufficient for the
truthfulness. However, in the case of linear valuation functions we are capable
of giving a complete characterization.

Theorem 8. In the model with linear valuation functions vi (ti , S) = ti · wi (S) +
wi′ (S), an allocation rule A may be truthfully implemented if and only if it satisfies
the following conditions:
1. A is bid-independent, that is, for each agent i, bid vector b = (b−i , bi ) with
   i ∈ A(b) and any b′i ≥ bi , it holds that i ∈ A(b−i , b′i ).
2. A encourages asymptotically higher bids, i.e. for any fixed b−i and b′i ≥ bi ,
   it holds that wi (A(b−i , b′i )) ≥ wi (A(b−i , bi )).
Here we prove that these conditions are indeed necessary. The sufficiency part of
the theorem is deferred to the full paper version, where we prove the characteri-
zation for a slightly more general family of single parameter valuation functions.
Proof. The necessity of the first monotonicity condition was known, so we prove
here that the second condition is also necessary. In the truthful mechanism,
an agent’s payment should not depend on his/her bid, if by changing it the
mechanism does not shift the allocated set. We denote by p the payment of
agent i for winner set A(b−i , bi ) and by p′ the payment of agent i for winner set
A(b−i , b′i ). If the agent's true value is bi , by truthfulness, we have

bi · wi (A(b−i , bi )) + wi′ (A(b−i , bi )) − p ≥ bi · wi (A(b−i , b′i )) + wi′ (A(b−i , b′i )) − p′ .

And if the agent's true value is b′i , we have

b′i · wi (A(b−i , b′i )) + wi′ (A(b−i , b′i )) − p′ ≥ b′i · wi (A(b−i , bi )) + wi′ (A(b−i , bi )) − p.

Adding these two inequalities and using the fact that b′i ≥ bi , we have

wi (A(b−i , b′i )) ≥ wi (A(b−i , bi )).

4.2 From F (2) to F (3)


Here we show that the usage of F (2) as a benchmark may lead to an unbounded
approximation ratio even for the restricted single parameter scalar valuations.
This justifies why we used a slightly modified benchmark F (3) in Section 3.
Theorem 9. There is no universally truthful mechanism that can achieve a
constant approximation ratio w.r.t. F (2) .
Proof. Consider an example with two people, in which every bidder values the
outcome where both agents get items much higher than the outcome where only
one agent gets the item. That is, v1 (x, {1}) = x, v2 (y, {2}) = y and v1 (x, {1, 2}) =
M x, v2 (y, {1, 2}) = M y for a large constant M . We note that these are single
parameter scalar valuations. We also note that these valuation functions are
indeed subadditive according to our definition. The subadditive requirement is
only for the subsets that include the current agent and, in fact, any valuation
function for two agents is subadditive by our definition.
We will show that any universally truthful mechanism MD with a distribution
D over truthful mechanisms cannot achieve an approximation ratio better than

M . Each truthful mechanism M in D either sells items to both bidders for some
pair of bids (b1 , b2 ), or for all pairs of bids sells not more than one item. In
the first case, by our characterization of truthful mechanisms (see Theorem 8),
M should also sell two items for the bids (x, b2 ) and (b1 , y), where x ≥ b1 and
y ≥ b2 . Therefore, M has to sell two items for any bid (x, y) with x ≥ b1 and
y ≥ b2 . Let us denote the first and the second group of mechanisms in D by G1
and G2 respectively.
For any small ε we may pick a sufficiently large x0 such that at least a 1 − ε
fraction of the G1 mechanisms in D sell two items for the bids (x = x0 /(2M ),
y = x0 /(2M )).
Note that
– revenue of F (2) for the bids (x0 , x0 ) is 2M x0 ,
– revenue of any M in G2 for the bids (x0 , x0 ) is not greater than x0 ,
– revenue of more than a 1 − ε fraction of the G1 mechanisms in D is not greater
  than 2M · x0 /(2M ) = x0 ,
– revenue of the remaining ε fraction of G1 mechanisms is not greater than
2M x0 .
Thus we can upper bound the revenue of MD by x0 (1 − ε) + 2M x0 ε while the
revenue of F (2) is 2M x0 . By choosing a sufficiently large M and small ε we get an
arbitrarily large approximation ratio.
Remark 10. In fact, the same inapproximability result w.r.t. F (2) holds for a
weaker notion of truthfulness, namely truthfulness in expectation.

580 N. Gravin and P. Lu
Efficient Computation of Balanced Structures

David G. Harris1, Ehab Morsy2, Gopal Pandurangan3, Peter Robinson4, and Aravind Srinivasan5
1 Department of Applied Mathematics, University of Maryland, College Park, MD 20742
[email protected]
2 Division of Mathematical Sciences, Nanyang Technological University, Singapore 637371 and Department of Mathematics, Suez Canal University, Ismailia 22541, Egypt
[email protected]
3 Division of Mathematical Sciences, Nanyang Technological University, Singapore 637371 and Department of Computer Science, Brown University, Providence, RI 02912
[email protected]
4 Division of Mathematical Sciences, Nanyang Technological University, Singapore 637371
[email protected]
5 Department of Computer Science and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742
[email protected]

Abstract. Basic graph structures such as maximal independent sets (MIS’s) have spurred much theoretical research in distributed algorithms,
and have several applications in networking and distributed computing as
well. However, the extant (distributed) algorithms for these problems do
not necessarily guarantee fault-tolerance or load-balance properties: For
example, in a star-graph, the central vertex, as well as the set of leaves,
are both MIS’s, with the latter being much more fault-tolerant and bal-
anced — existing distributed algorithms do not handle this distinction.
We propose and study “low-average degree” or “balanced” versions of
such structures. Interestingly, in sharp contrast to, say, MIS’s, it can
be shown that checking whether a structure is balanced, will take sub-
stantial time. Nevertheless, we are able to develop good sequential and
distributed algorithms for such “balanced” versions. We also complement
our algorithms with several lower bounds.


Supported in part by NSF Award CNS-1010789.
Supported in part by Nanyang Technological University grant M58110000, Singapore Ministry of Education (MOE) Academic Research Fund (AcRF) Tier 2 grant MOE2010-T2-2-082, and MOE AcRF Tier 1 grant MOE2012-T1-001-094, and by a grant from the United States-Israel Binational Science Foundation (BSF).
Supported in part by Nanyang Technological University grant M58110000 and Singapore Ministry of Education (MOE) Academic Research Fund (AcRF) Tier 2 grant MOE2010-T2-2-082.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 581–593, 2013.
© Springer-Verlag Berlin Heidelberg 2013
582 D.G. Harris et al.

1 Introduction
Fundamental graph-theoretic structures such as maximal independent set (MIS)
and minimal dominating set (MDS) and their efficient distributed computa-
tion are very important, especially in the context of distributed computing and
networks where they have many applications [8]. MIS, for example, is a basic
building block in distributed computing and is useful in basic tasks such as mon-
itoring, scheduling, routing, clustering, etc. (e.g., [7,9]). Extensive research has
gone into designing fast distributed algorithms for these problems since the early
eighties (e.g., see [5,10] and the references therein). We now know that problems
such as MIS are quite local, i.e., they admit distributed algorithms that run in a
small number of rounds (typically logarithmic in the network size). However, one
main drawback of these algorithms is that there is no guarantee on the quality
of the structure output. For example, the classical MIS algorithm of Luby [6]
computes an MIS in O(log n) rounds (throughout, n stands for number of nodes
in the network) with high probability, but does not give any guarantees on the
properties of the output MIS. (Another O(log n) round parallel algorithm was
independently found by Alon, Babai, and Itai [1].) In this paper, we initiate a
systematic study of “balanced” versions of these structures, i.e., the average de-
gree of the nodes belonging to the structure (the degrees of nodes in the structure
are with respect to the original subgraph, and not with respect to the subgraph
induced by the structure) is small, in particular, compared to the average degree
of the graph (note that, in general, the best possible balance we can achieve is
the average degree of the graph, as in a regular graph). For example, as we de-
fine later, a balanced MIS (BMIS) is an MIS where the average degree of nodes
belonging to the MIS is small.
We note that the maximum independent set (which is a well-studied NP-complete
problem [2]) in a graph G is not necessarily a BMIS in G. Consider the graph G
that contains a complete graph Kp (assume p is even), and a complete bipartite
graph KA,B with |A| = 2 and |B| = 3. Each vertex in A is connected to a different
half of the set of vertices in Kp (i.e., one vertex of A is connected to one half of
vertices of Kp and the second vertex of A is connected to the other half of Kp ),
and each vertex in B is connected to all vertices in Kp . Clearly, B is the maximum
independent set in G and has average degree p + 2, while A is a BMIS in G since
its average degree is p/2 + 3. Thus BMIS is a different problem compared to the
maximum independent set problem (which is not the focus of this paper).
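The construction above is easy to check mechanically. The following sketch (our own illustration, not from the paper; the vertex names and adjacency-set representation are ours) builds the graph for an even p and confirms the two average degrees claimed in the text:

```python
def build_example(p):
    """The graph from the text: a clique K_p, A = {a0, a1}, B = {b0, b1, b2},
    with K_{A,B} complete bipartite, each A-vertex joined to one half of K_p,
    and each B-vertex joined to all of K_p."""
    assert p % 2 == 0
    K, A, B = list(range(p)), ["a0", "a1"], ["b0", "b1", "b2"]
    adj = {v: set() for v in K + A + B}
    def edge(u, v):
        adj[u].add(v); adj[v].add(u)
    for i in K:
        for j in K:
            if i < j:
                edge(i, j)               # the clique K_p
    for i in K[:p // 2]:
        edge("a0", i)                    # a0 sees the first half of K_p
    for i in K[p // 2:]:
        edge("a1", i)                    # a1 sees the second half of K_p
    for b in B:
        for i in K:
            edge(b, i)                   # B-vertices see all of K_p
        for a in A:
            edge(a, b)                   # complete bipartite K_{A,B}
    return adj, A, B

def avg_deg(adj, S):
    return sum(len(adj[v]) for v in S) / len(S)

adj, A, B = build_example(8)
print(avg_deg(adj, B))   # p + 2   = 10.0
print(avg_deg(adj, A))   # p/2 + 3 = 7.0
```

Both A and B are independent sets here, so the sketch directly exhibits the gap between the maximum independent set B and the better-balanced MIS A.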
There are two key motivations for studying balanced structures. The first is
from an application viewpoint. In distributed networks, especially in resource-
constrained networks such as ad hoc networks, sensor and mobile networks,
it is important to design structures that favor load balancing of tasks among
nodes (belonging to the structure). This is crucial in extending the lifetime of
the network (see e.g., [11] and the references therein). For example, in a typical
application, an MIS (or an MDS) can be used to form clusters with low diameter,
with the nodes in the MIS being the “clusterheads” [7]. Each clusterhead is
responsible for monitoring the nodes that are adjacent to it. Having an MIS with
low degree is useful in a resource/energy-constrained setting since the number

of nodes monitored per node in the MIS will be low (on average). This can lead
to better load balancing, and consequently less resource or energy consumption
per node, which is crucial for ad hoc and sensor networks, and help in extending
the lifetime of such networks while also leading to better fault-tolerance. For
example, in an n-node star graph, the above requirements imply that it is better
for the leaf nodes to form the MIS rather than the central node alone. In fact,
the average degree of the MIS formed by the leaf nodes (which is 1) is within a
constant factor of the average degree of a star (which is close to 2), whereas the
average degree of the MIS consisting of the central node alone (which is n − 1)
is much larger.
Another potential application of balanced structures is in the context of dy-
namic networks where one would like to maintain structures such as MIS effi-
ciently even when nodes or links (edges) fail or change with time. For example,
this is a feature of ad hoc and mobile networks where links keep changing either
due to mobility or failures. BMIS can be a good candidate for maintaining an
MIS efficiently in an incremental fashion: since the degrees of nodes in the MIS
are balanced, this will lead to less overhead per insertion or deletion.
The second key motivation of our work is understanding the complexity of
local computation of globally optimal (or near optimal) fundamental structures.
The correctness of structures such as MIS or MDS can be verified strictly locally
by a distributed algorithm. In the case of MIS, for example, each node can check
the MIS property by communicating only with its neighbors; if there is a violation
at least one node will raise an alarm. On the other hand, it is not difficult to
show that the correctness of balanced structures such as BMIS cannot be locally
verified (in the above sense) as the BMIS refers to a “global” property: nodes
have to check the average degree property, in addition to the MIS property.
In fact, one can show that at least D rounds (D being the network diameter)
would be needed to check whether a structure is a BMIS. Moreover, we prove
that BMIS is an NP-hard problem and hence the optimality of the structure is
not easy to check even in a centralized setting. A key issue that we address in
this paper is whether one can compute near-optimal local (distributed) solutions
to balanced global structures such as BMIS. A main result of this paper is that
despite the global nature, we can design efficient distributed algorithms that
output high quality balanced structures.
Our work is also a step towards understanding the algorithmic complexity of
balanced problems. While every MIS is an MDS, they differ significantly in their
balanced versions. In particular, we show that there exist graphs for which no
MIS is a good BMDS. Hence we need a different approach to compute a good
BMDS as compared to a good BMIS. Even for BMIS, we show that while one can
(for example) use Luby’s algorithm [6] to efficiently compute an MIS, the same
approach fails to compute a good quality BMIS. We present new algorithms for
computing such balanced structures.

1.1 Problems Addressed and our Results


We consider an undirected simple graph G = (V, E) with n nodes and m edges.
We denote the average degree of G by δ = 2m/n. More generally, given any subset

S ⊆ V , we define the average degree of S, denoted by δS , as the total degree of the vertices of S divided by the number of vertices in S, i.e., δS = (Σv∈S dv )/|S|, where dv is the degree of node v in G. To simplify the problem, we assume that G has no isolated vertices, i.e., dv ≥ 1 for all v ∈ V . (This assumption can be easily removed.)
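As a quick illustration (our own sketch; the adjacency-list representation and function name are our choice), δ and δS can be computed as follows. Note that degrees are always taken in the full graph G, even when S is a proper subset:

```python
def avg_degree(adj, S=None):
    """Average degree of the vertex set S, with degrees taken in G.
    adj maps each vertex to its list of neighbors in G."""
    if S is None:
        S = adj.keys()            # delta = average degree of all of V
    S = list(S)
    return sum(len(adj[v]) for v in S) / len(S)

# A star K_{1,3}: the center has degree 3, each leaf degree 1.
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
print(avg_degree(star))                     # delta = 6/4 = 1.5
print(avg_degree(star, S=["x", "y", "z"]))  # leaf MIS: 1.0
print(avg_degree(star, S=["c"]))            # center MIS: 3.0
```

This matches the star-graph discussion in the Introduction: the MIS of leaves is far more balanced than the MIS consisting of the central vertex alone.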
We consider the following fundamental graph structures: A Maximal Inde-
pendent Set (MIS) is an inclusion-maximal vertex subset S ⊆ V such that no
two vertices in S are neighbors. A Minimal Dominating Set (MDS) is an
inclusion-minimal vertex subset S ⊆ V such that every vertex in G is either in
S or is a neighbor of a vertex in S. A Minimal Vertex Cover (MVC) is an
inclusion-minimal vertex subset S ⊆ V such that every edge in G has at least
one endpoint in S.
This paper is concerned with the “balanced” versions of the above problems, which are optimization versions of these binary problems.
1. Balanced Maximal Independent Set (BMIS): Given an undirected graph
G, a BMIS is an MIS S in G that minimizes the average degree of S. In other words,
the BMIS has the minimum average degree among all MIS’s in G.
2. Balanced Minimal Dominating Set (BMDS): Given an undirected graph
G, a BMDS is an MDS D in G that minimizes the average degree of D.
3. Balanced Minimal Vertex Cover (BMVC): Given an undirected graph
G, a BMVC is an MVC C in G that minimizes the average degree of C.
Our Results. We first note that the trivial lower bound is δ for all balanced
problems which follows from the example of a regular graph (where all nodes have
the same degree). Hence, in general, the average degree of a balanced structure
cannot be guaranteed to be less than δ. On the other hand, there exist graphs
where the average degree of the BMIS is significantly smaller than δ (e.g., con-
sider a graph on 2n nodes, out of which n nodes form a complete graph, and each of these vertices is also connected by a single edge to a distinct one of the remaining n nodes). This leads us to two basic questions: (i) In every given graph G,
does there always exist a BMIS whose average degree is at most δ? and (ii) Can
question (i) be answered for a specific graph G in polynomial time? We answer
both questions in the negative.
We show that unlike MIS, its balanced version, BMIS, is NP-hard. In particular, in the full paper [4] we show that the following decision version of the problem is NP-complete: “Given a graph G, is there an MIS in G with average degree at most δ?” In fact we show that the optimization version BMIS is quite hard to approximate in polynomial time: it cannot be approximated in polynomial time to within a factor of Ω(√n) (cf. [4]).
Henceforth, we focus on obtaining solutions for BMIS that are good compared
to the average degree of the graph. We show that we can obtain near-tight
solutions that compare well with δ. The following are our main results:
Theorem 1. There is a (centralized) algorithm that selects an MIS of average degree at most δ²/8 + O(δ) and runs in O(nm³ log n) time with high probability.¹

¹ “With high probability” (w.h.p.) means with probability ≥ 1 − 1/n^{Ω(1)}.

To show the above theorem (the full proof is in [4]) we show that Luby’s MIS algorithm [6] returns an MIS with average degree at most δ²/8 + O(δ), albeit
with inverse polynomially small probability. This can be easily turned into a
centralized algorithm by repeating this algorithm a polynomial number of times
till the desired bound is obtained. (However, this does not give a fast distributed
algorithm.) The above algorithm is nearly optimal with respect to the average
degree of the MIS, as we show an almost matching lower bound (this also answers
the question (i) posed above in the negative):

Theorem 2. For any real number α > 1, there is a graph G with average degree ≤ α, but in which every MIS has average degree ≥ α²/8 + 3α/4 + 5/8.

We next consider distributed approximation algorithms for BMIS and show that
we can output near-optimal solutions fast, i.e., solutions that are close to the
lower bound. We consider the following standard model for our distributed algo-
rithms where the given graph G represents a system of n nodes (each node has
a distinct ID). Each node runs an instance of the distributed algorithm and the
computation advances in synchronous rounds, where, in each round, nodes can
communicate with their neighbors in G by sending messages of size O(log n). A
node initially has only local knowledge limited to itself and its neighbors (it may
however know n, the network size). We assume that local computation (performed by the node itself) is free as long as it is polynomial in the network size.
Each node u has local access to a special bit (initially 0) that indicates whether
u is part of the output set. Our focus is on the time complexity, i.e., the number
of rounds of the distributed computation.
We present two distributed algorithms for BMIS (cf. Section 2.1); the second algorithm gives a better bound on the average degree at the cost of (somewhat)
increased run time. However, both algorithms are fast, i.e., run in polylogarith-
mic rounds.

Theorem 3. Consider a graph G = (V, E) with average degree δ.
1. There is a distributed algorithm that runs in O(log n log log n) rounds and with high probability outputs an MIS with average degree O(δ²).
2. There is a distributed algorithm that runs in log^{2+o(1)} n rounds and with high probability outputs an MIS with average degree (1 + o(1))(δ²/4 + δ).

Note that in general, due to the lower bound (cf. Theorem 2), the bounds provided by the algorithms of the above theorem are optimal up to constant factors.
We next present results on BMDS. Since an MIS is also an MDS, an algorithm
for MIS can also be used to output an MDS. However, this can lead to a bad
approximation guarantee, since there are graphs for which every MIS has a bad
average degree compared to some MDS. This follows from the graph family used
in the proof of Theorem 2: while the average degree of every MIS (of any graph
in the family) is Ω(δ²), there exists an MDS with average degree only O(δ).
Because an MIS is also an MDS, the results of Theorem 3 also hold for BMDS.
Our next theorem shows that much better guarantees are possible for BMDS.

Theorem 4. Any graph G with average degree δ has a minimal dominating set with average degree at most O(δ log δ/ log log δ). Furthermore, there is a sequential randomized algorithm for finding such an MDS in polynomial time w.h.p.

The next theorem shows that the bound of Theorem 4 is optimal (in general),
up to constant factors:
Theorem 5. For any real number α > 0, there are graphs with average degree ≤ α, but for which any MDS has an average degree of Ω(α log α/ log log α).

Finally, we show the following result for the BMVC problem which shows that
there cannot be any bounded approximation algorithm for the problem:
Theorem 6. For any real number α > 2, there are graphs for which the average degree is at most α, but for which the average degree of any MVC approaches infinity.

2 Balanced Maximal Independent Set


We first prove Theorem 2 which shows that there are graphs G for which the degree of every MIS is much larger than the degree of G itself. More importantly, the theorem gives a lower bound on the quality of BMIS in general: one cannot guarantee an MIS whose average degree is less than δ²/8 + Θ(δ).
Proof of Theorem 2. Consider the graph consisting of a copies of Kb , as well as one copy of Kc,c , where b = ⌈(3 + α)/2⌉ and c = ⌊(1/2)(√(2ab(α − b + 1) + α²) + α)⌋. The resulting graph has average degree (ab(b − 1) + 2c²)/(ab + 2c) ≤ α. Every MIS of this graph contains one vertex from each Kb , as well as one half of the vertices of Kc,c , for an average degree of (ab + c²)/(a + c). As a tends to infinity, this average degree approaches (3 + α − b)b/2 ≥ α²/8 + 3α/4 + 5/8. ⊓⊔
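As a numerical sanity check on this construction (a sketch of our own; the exact closed forms for b and c are our reading of the typeset formulas, so treat the numbers as illustrative), we can evaluate the two average-degree expressions for a concrete α and a large a:

```python
import math

def theorem2_instance(alpha, a):
    """Our reading of the construction: b = ceil((3+alpha)/2) and
    c = floor((alpha + sqrt(alpha^2 + 2ab(alpha - b + 1))) / 2),
    chosen so the graph's average degree stays at most alpha."""
    b = math.ceil((3 + alpha) / 2)
    c = math.floor((alpha + math.sqrt(alpha ** 2 + 2 * a * b * (alpha - b + 1))) / 2)
    avg_graph = (a * b * (b - 1) + 2 * c * c) / (a * b + 2 * c)
    avg_mis = (a * b + c * c) / (a + c)  # one vertex per K_b, one side of K_{c,c}
    return avg_graph, avg_mis

g, m = theorem2_instance(4, 10 ** 6)
print(g <= 4)                               # graph stays sparse: True
print(m >= 4 ** 2 / 8 + 3 * 4 / 4 + 5 / 8)  # every MIS is dense: True
```

For α = 4 and a = 10⁶ the graph's average degree comes out just under 4 while the forced MIS degree is close to the limiting value (3 + α − b)b/2 = 6.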

2.1 Distributed Algorithms for BMIS

This section is devoted to designing different distributed algorithms for BMIS.²
In particular we will prove Theorem 3 (Parts 1, and 2). The proposed algorithms
do not require any global information of the original graph other than n.

Proof of Part 1 of Theorem 3. We propose a distributed algorithm that constructs an MIS I of G such that the following two properties hold with high probability: (a) I has average degree at most O(δ²), and (b) I is constructed within O(log n × log log n) rounds.
This algorithm does not require any global information of the original graph,
other than the network size n. The algorithm depends on a parameter φ, which
is held constant. For any constant c > 0, one can choose φ appropriately so that
the distributed algorithms succeeds with probability 1 − n−c ; the parameter φ
2
Omitted proofs are included in the full paper [4].

First Phase – Luby’s algorithm on Gφ√(n/ log n) :
(Recall that Gx denotes the subgraph of vertices of degree ≤ x.)
1. Each vertex v in Gφ√(n/ log n) marks itself independently with probability 1/(2dv ).
2. If two adjacent nodes are marked, unmark the one with higher degree (breaking ties arbitrarily).
3. Add any marked nodes to the independent set I.
Second Phase – Extending the MIS:
1. For i = 0, 1, . . . , ⌈(1/2) log₂ log n − log₂ φ⌉, repeat the following:
2. Let xi = 2^i φ√(n/ log n). Run Luby’s MIS algorithm for φ log n iterations to extend the current independent set I to an MIS of the graph Gxi .
Third Phase:
1. Using Luby’s algorithm, extend the current independent set I to an MIS of G.
Algorithm 1. Distributed Algorithm for Approximating BMIS
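A centralized simulation of one marking round of the First Phase may clarify the mechanics. This is our own sketch under our own conventions: ties in step 2 are broken by vertex id, and `active` stands for the vertex set of the current subgraph Gx:

```python
import random

def luby_round(adj, active, rng):
    """One round: mark v with probability 1/(2 d_v); if two adjacent
    vertices are both marked, the higher-degree one (ties by id) is
    unmarked. Returns the surviving marked vertices."""
    marked = {v for v in active if rng.random() < 1.0 / (2 * len(adj[v]))}
    return {v for v in marked
            if all((len(adj[u]), u) > (len(adj[v]), v)
                   for u in adj[v] if u in marked)}

# On a 10-cycle the surviving marked vertices always form an independent set.
n = 10
cyc = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
I = luby_round(cyc, set(cyc), random.Random(1))
assert all(u not in I for v in I for u in cyc[v])
```

Repeating such rounds on the remaining subgraph yields an MIS; the three phases of Algorithm 1 differ only in which subgraph Gx the rounds are applied to.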

will only affect the running time by a constant factor. This is the strongest form
in which an algorithm can be said to succeed with high probability.
The algorithm has three phases, which are intended to address the cases where δ ≤ O(√(n/ log n)), Θ(√(n/ log n)) ≤ δ ≤ Θ(√n), and δ ≥ Ω(√n), respectively. The first phase runs Luby’s algorithm for MIS on the vertices with degree ≤ √(n/ log n). The next phase gradually extends the resulting independent set by
finding MIS’s of the subgraphs consisting of successively larger degrees. Finally,
using Luby’s algorithm, this is extended to an MIS of G itself. It is easy to see that
this leads to an MIS of G, and the resulting algorithm runs in O(log n×log log n)
rounds. We will also show that if we run only Phases I and III of this algorithm,
then we can obtain an MIS of degree O(δ² log δ) in time O(log n).
We introduce the following definition which will be used throughout the proof.
For any real number s, we let Gs denote the subgraph of G induced on the vertices of degree ≤ s. This notation is used in describing Algorithm 1.
The following basic principle will be used in a variety of places in this proof:
Proposition 1. Suppose a graph G has n vertices and average degree δ, and suppose s > 1. Then the subgraph Gsδ contains at least n(1 − 1/s) vertices. (Otherwise, more than n/s vertices would have degree exceeding sδ, contributing total degree more than nδ.)
We now show that this algorithm has good behavior in the first two parameter regimes. The third regime δ = Ω(√n) is trivial.

Lemma 1 (First Phase). Suppose δ ≤ (φ/2)√(n/ log n). Then with probability 1 − n^{−Ω(1/φ)}, the independent set produced at the end of the first phase contains Ω(n/δ) vertices. In particular, the final MIS produced has average degree O(δ²).

Proof. Let n′, δ′ denote the number of vertices and average degree of the graph Gφ√(n/ log n) . Note δ′ ≤ δ. By Proposition 1 we have n′ ≥ n/2.
For each vertex v ∈ Gφ√(n/ log n) , let Xv be the random variable indicating that v was marked, and X′v the random variable indicating that v was accepted

1: Let φ > 1 be a fixed parameter. Initialize I = ∅.
2: for i = 0, . . . , ⌈logφ n⌉ do
3: Using any distributed MIS algorithm, extend I to an MIS of the graph G_{φ^i}.
4: Return the final MIS I.

Algorithm 2. Greedy Distributed Approximation Algorithm for BMIS

into I (i.e., it did not conflict with a higher-degree vertex). Let Y denote the size of this independent set, i.e., Y = Σv X′v , summed over v ∈ Gφ√(n/ log n) . As shown in [6], we have E[Y ] = Ω(n′/δ′) = Ω(n/δ). Now, we want to show that Y is concentrated around its mean. As described in [3], this sum can be viewed as a “read-k family”: each variable X′v is a Boolean function of the underlying independent variables Xv , and each variable Xv affects at most k = O(φ√(n/ log n)) of the Boolean functions. Hence this sum obeys a concentration bound similar to the Chernoff bound, albeit with the exponent divided by k. In particular, the probability that Y deviates below its mean by a constant factor satisfies P (Y ≤ (1 − x)E[Y ]) ≤ e^{−E[Y ]x²/(2k)} ≤ e^{−Ω((log n)/φ)} = n^{−Ω(1/φ)}. Hence, with high probability, the total number of vertices returned from the first phase is Ω(n/δ), as desired. ⊓⊔
Lemma 2 (Second Phase). Suppose (φ/2)√(n/ log n) < δ ≤ (1/2)√n. Then with probability 1 − n^{−Ω(1/φ)}, the independent set produced at the end of the second phase contains Ω(n/δ) vertices. In particular, the final MIS produced has average degree O(δ²).
Proof. Let n′, δ′ represent the number of vertices and average degree of G√n . By Proposition 1 we must have n′ = Ω(n) and δ′ ≤ δ.
If δ′ ≤ (φ/2)√(n/ log n), then by Proposition 1 there would be Ω(n) vertices of G with degree ≤ (φ/2)√(n/ log n). By Lemma 1, phase 1 would then produce an independent set with Ω(n/√(n/ log n)) = Ω(n/δ) vertices.
So suppose δ′ ≥ (φ/2)√(n/ log n). Now, as i increases, xi is multiplied by a factor of 2 as it increases from φ√(n/ log n) to √n. In particular, there is some value of i for which 2δ′ ≤ xi ≤ 4δ′. At this point, the standard analysis shows that φ log n iterations of Luby’s algorithm produce, with probability 1 − n^{−Ω(1/φ)}, an MIS of the graph Gxi . By Proposition 1, Gxi contains Ω(n′) vertices. Furthermore, any MIS of Gxi must contain Ω(n′/δ′) vertices; the reason for this is that the maximum degree of any vertex in Gxi is O(δ′), and it is necessary to select Ω(n′/δ′) vertices simply to ensure that every vertex is covered by the MIS.
Now, at stage i we produce an MIS of Gxi which contains Ω(n/δ) vertices. This is eventually extended to an MIS of G with Ω(n/δ) vertices. ⊓⊔

Proof of Part 2 of Theorem 3. The greedy algorithm for BMIS is very simple. We label the vertices in order of increasing degree (breaking ties arbitrarily). Each vertex is added to the independent set (initially I = ∅), unless it is adjacent to an earlier vertex already selected. This is a simple deterministic algorithm which requires time O(m).
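In code, the sequential greedy reads as follows (our own sketch; using a comparison sort gives O(m + n log n) rather than the stated O(m), which would require bucketing vertices by degree):

```python
def greedy_bmis(adj):
    """Scan vertices by nondecreasing degree; add a vertex unless an
    already-selected vertex is adjacent to it."""
    I = set()
    for v in sorted(adj, key=lambda v: len(adj[v])):
        if not any(u in I for u in adj[v]):
            I.add(v)
    return I

# Star K_{1,4}: the degree-1 leaves are scanned first, so the greedy
# returns the balanced MIS consisting of all leaves.
star = {"c": ["l1", "l2", "l3", "l4"],
        "l1": ["c"], "l2": ["c"], "l3": ["c"], "l4": ["c"]}
print(sorted(greedy_bmis(star)))  # ['l1', 'l2', 'l3', 'l4']
```

On the star graph this recovers exactly the low-average-degree MIS discussed in the Introduction, whereas an arbitrary MIS algorithm might return the central vertex alone.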

Theorem 7. The greedy algorithm produces an MIS of degree at most δ²/4 + δ.
Proof. Order the vertices in order of increasing degree d1 ≤ d2 ≤ · · · ≤ dn . Define the indicator variable xv to be 1 if v ∈ I and 0 otherwise, where I is the MIS produced. For any pair of vertices u and v with du ≥ dv , we also define the indicator yvu to be 1 if v ∈ I and there is an edge from v to u. (It may seem strange to include the variable yvv , as we always have yvv = 0 in the intended solution, but this will be crucial in our proof, which is based on LP relaxation.)
As the greedy algorithm selects v iff no earlier vertex was adjacent to it, we have xv = 1 if and only if y1v = y2v = · · · = y_{v−1,v} = 0. In particular, xv satisfies the linear constraint xv ≥ 1 − y1v − y2v − · · · − yvv . The variables x, y also clearly satisfy the linear constraints ∀v : 0 ≤ xv ≤ 1, ∀v ≤ u : 0 ≤ yvu , and ∀v : Σu yvu ≤ dv xv , which we refer to as the core constraints. The final MIS contains Σv xv vertices and Σv dv xv edges, and hence the average degree of the resulting MIS is δI = Σv dv xv / Σv xv .
We wish to find an upper bound on the ratio R = Σv dv xv / Σv xv . The variables x, y satisfy many other linear and non-linear constraints, and in particular are forced to be integral. However, we will show that the core constraints are sufficient to bound R. The way we will prove this is to explicitly construct a solution x, y which satisfies the core constraints and maximizes R subject to them, and then show that the resulting x, y still satisfies R ≤ δ²/4 + δ.
Let x, y be real vectors which maximize R among all real vectors satisfying the core constraints, and, among all such vectors, which minimize Σ_{u>v} yvu . Suppose yvu > 0 for some u > v. If xu = 1, then we simply decrement yvu by ε. The constraint xu ≥ 1 − y1u − · · · − yuu clearly remains satisfied as xu = 1, and all other constraints are unaffected. The objective function is also unchanged. However, this reduces Σ_{u>v} yvu , contradicting our choice of x, y.
Suppose yvu > 0 for some u > v, and xu < 1 strictly. Note that yvu ≤ dv xv , so we must have xv > 0 strictly. For some sufficiently small ε, we change x, y as follows: y′vu = yvu − ε, y′vv = yvv + ε/(dv + 1), x′v = xv − ε/(dv + 1), x′u = xu + ε/(du + 1), and y′uu = yuu + ε du /(du + 1). All other values remain unchanged. We claim that the constraints on x′, y′ are still preserved. Furthermore, the numerator of R increases while the denominator decreases; hence R′ ≥ R. This contradicts the maximality of x, y.
In summary, we can assume yvu = 0 for all u > v. In this case, the core constraints on v become simply 1 − yvv ≤ xv ≤ 1 and yvv ≤ dv xv . It is a simple exercise to maximize R subject to these constraints (every vertex operates completely independently), and the maximum is achieved when xv = 1/(dv + 1) for dv ≤ t, and xv = 1 for dv > t, for some threshold t. In this case, the objective function R(x) satisfies

R ≤ ( Σ_{dv≤t} dv /(dv + 1) + Σ_{dv>t} dv ) / ( Σ_{dv≤t} 1/(dv + 1) + Σ_{dv>t} 1 ).

Let δS , δB denote the average degrees of the vertices of degree ≤ t and > t respectively, and let nS , nB represent the number of such vertices. Then by concavity, we have

R ≤ ( nS δS /(δS + 1) + nB δB ) / ( nS /(δS + 1) + nB ) ≤ ( δ(δB − δS ) + δB δS (δ − δS ) ) / ( δS (δ − δS ) + (δB − δS ) )

1: Mark each vertex of degree > 2δ independently with prob. logt t where
2δ log δ
t = log log δ
.
2: Mark every vertex of degree  2δ.
3: If any vertex v is not marked, and none of the neighbors of v are marked,
then mark v.
4: Let M denote the set of marked vertices at this point. M forms a dominating
set of G, but is not necessarily minimal. Using any algorithm, select a minimal
dominating set M  ⊆ M .
5: Check if δM   t. If so, return M  . Otherwise, return FAIL.

Algorithm 3. Approximation Algorithm for BMDS
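One trial of Algorithm 3 can be simulated as below. This is our own sketch: the minimal-izing in step 4 is done by a simple greedy prune (which the algorithm statement leaves open), the formula for t is guarded so it is defined for small δ, and step 5's degree check and retry loop are omitted:

```python
import math
import random

def bmds_trial(adj, rng):
    """One run of the marking scheme; returns a minimal dominating set."""
    n = len(adj)
    delta = sum(len(adj[v]) for v in adj) / n            # average degree
    t = max(2.0, 2 * delta * math.log(delta + 2)
                 / math.log(math.log(delta + 2) + 2))    # guarded t
    p = math.log(t) / t
    # Steps 1-2: mark all low-degree vertices, sample high-degree ones.
    M = {v for v in adj if len(adj[v]) <= 2 * delta or rng.random() < p}
    # Step 3: any vertex with no marked neighbor marks itself.
    M |= {v for v in adj if v not in M and not M.intersection(adj[v])}
    # Step 4: prune to a minimal dominating set.  Removing v can only
    # break domination of v and its neighbors, so that is all we check.
    for v in sorted(M, key=lambda v: -len(adj[v])):
        rest = M - {v}
        if all(u in rest or rest.intersection(adj[u])
               for u in [v] + list(adj[v])):
            M = rest
    return M

cyc = {i: [(i - 1) % 12, (i + 1) % 12] for i in range(12)}
D = bmds_trial(cyc, random.Random(0))
assert all(v in D or D.intersection(cyc[v]) for v in cyc)  # D dominates G
```

The one-pass prune yields a minimal set because domination constraints only tighten as vertices are removed: a vertex kept at its turn remains necessary afterwards.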

Routine calculus shows that this achieves its maximum value at δB = ∞, δS = δ/2, yielding R ≤ δ²/4 + δ as claimed. ⊓⊔

This greedy algorithm can be converted, with only a little loss, to a distributed algorithm as shown in Algorithm 2. This algorithm is basically the sequential greedy algorithm, except that we quantize the degrees to powers of some parameter φ. Allowing φ → 1 sufficiently slowly, we obtain an algorithm which requires log^{2+o(1)} n rounds and returns an MIS of degree (1 + o(1))(δ²/4 + δ) w.h.p. As we have seen in Theorem 2, this is within a factor of 2 of the lowest degree possible.

3 Balanced Minimal Dominating Set

For arbitrary graphs, we turn our attention to designing algorithms for finding
approximate solutions to BMDS. Since any MIS in a given graph G is also an MDS in G, all algorithms designed for BMIS also return an MDS in G of the same average degree. Thus, we have the same bounds (and distributed algorithms) corresponding to those in Section 2. However, for BMDS, better bounds are possible. Given a graph with average degree δ, we will show a polynomial-time algorithm that finds an MDS of average degree O(δ log δ/ log log δ). We will also construct a family of graphs G for which every MDS has average degree Ω(δ log δ/ log log δ).
Proof of Theorem 4. For a target degree t, and any set of vertices V0, we define S_t^{V0} = Σ_{v∈V0} (d_v − t). Our goal is to find an MDS X with S_t^X ≤ 0, for some t = O(δ log δ / log log δ).
Let x = 2δ and divide the vertices into three classes: A, the set of vertices of degree ≤ x; B, the set of vertices of degree > x which have at least one neighbor in A; and C, the set of vertices of degree > x, all of whose neighbors are in B or C. Mark each vertex in B ∪ C with probability p = (log t)/t. Next, form the set Y ⊆ B ∪ C by inserting all marked vertices in B ∪ C and vertices in C with no marked neighbors. Clearly Y dominates C, and A ∪ Y dominates G. Now, select two subsets A′ ⊆ A and Y′ ⊆ Y such that X = A′ ∪ Y′ is an MDS of G.
Efficient Computation of Balanced Structures 591
We first examine S_t^{Y′}. Any vertex of G with degree ≤ t contributes at most 0 to S_t^{Y′}. Otherwise, suppose v has degree > t. If v ∈ B, it is selected for Y with probability at most (log t)/t. If v ∈ C, each of its neighbors is marked with probability (log t)/t, so it is selected for Y with probability at most (log t)/t + (1 − (log t)/t)^t ≤ 2(log t)/t. Hence the expected contribution of such a vertex to S_t^{Y′} is at most 2·d_v·(log t)/t. Summing over all such vertices, we have E[S_t^{Y′}] ≤ 2|B ∪ C|·δ_{B∪C}·(log t)/t, where δ_{B∪C} denotes the average degree of vertices in B ∪ C.
Now, some of the vertices in A are dominated by B-vertices of Y′. Let A0 be the set of vertices not dominated by Y′. These vertices can only be dominated by vertices of A′, so we must have |A′|(δ_{A′} + 1) ≥ |A0|. Subject to the conditions |A′|(δ_{A′} + 1) ≥ |A0| and δ_{A′} ≤ x, we have S_t^{A′} ≤ |A0|(x − t)/(x + 1). The ultimate MDS may contain vertices in A − A0 as well; however, as x ≤ t, these will have a negative contribution to S_t, and hence they only help us in showing an upper bound on S_t.
Consider the expected value E[|A0|]. A vertex v ∈ A lies in A0 if none of its neighbors are marked (this is a sufficient, though not necessary, condition), and vertices are marked independently with probability p. Hence E[|A0|] ≥ Σ_{v∈A} (1 − p)^{d_v} ≥ |A|(1 − p)^{δ_A}.
Putting all this together, we have that the final MDS X = A′ ∪ Y′ satisfies E[S_t^X] ≤ 2p|B ∪ C|·δ_{B∪C} + |A|·((x − t)/(x + 1))·(1 − p)^{δ_A}. For δ sufficiently large, p approaches zero, so that (1 − p)^{δ_A} ≥ e^{−2p·δ_A}.
We know that |A| + |B| + |C| = n and |A|·δ_A + (|B| + |C|)·δ_{B∪C} = nδ. Eliminating |A|, |B|, |C|, we have

E[S_t^X] ≤ n ( 2δ_{B∪C}(δ − δ_A) log t / (t(δ_{B∪C} − δ_A)) − (δ_{B∪C} − δ)(t − 2δ)·t^{−2δ_A/t} / ((2δ + 1)(δ_{B∪C} − δ_A)) )
Routine calculus shows that, for t sufficiently large, this achieves its maximum value at δ_{B∪C} → ∞, δ_A = t log((t − 2δ)/(4δ + 2)) / log t, yielding

E[S_t^X] ≤ (2δ log t)/t − 2 log((t − 2δ)/(4δ + 2)) − 2.
For t = 2δ log δ / log log δ, the RHS approaches −∞ as δ → ∞. This implies that E[S_t^X] ≤ 0, so there is a positive probability of selecting an MDS of average degree ≤ t.
Note that for t = 2δ log δ / log log δ, we have S_t^X ≥ −δn log^{O(1)} n and E[S_t^X] ≤ −Ω(1). By Markov's inequality, the random variable S_t^X is negative with probability ≥ δ^{−1} n^{−1} log^{−O(1)} n. Hence, after δn log^{O(1)} n iterations of this sampling process, we find an MDS of degree O(δ log δ / log log δ) with high probability. The total work expended is m^2 log^{O(1)} n. This is summarized as Algorithm 3.
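The sampling round just described is compact enough to sketch directly. The Python below follows Algorithm 3 on an adjacency-set representation; the greedy minimalization in step 4 and all function names are our own illustrative choices (the algorithm permits any minimalization procedure), and δ must exceed e for t to be well defined.

```python
import math
import random

def is_dominating(adj, D):
    """True if every vertex is in D or has a neighbor in D."""
    return all(v in D or adj[v] & D for v in adj)

def balanced_mds_sample(adj, delta, seed=None):
    """One sampling round of the BMDS approximation (Algorithm 3, sketch).

    adj: dict mapping each vertex to the set of its neighbors.
    delta: average degree of the graph (must exceed e so that
    log log delta > 0). Returns a minimal dominating set, or None
    (FAIL) if its average degree exceeds t = 2*delta*log(delta)/loglog(delta).
    """
    rng = random.Random(seed)
    t = 2 * delta * math.log(delta) / math.log(math.log(delta))
    p = math.log(t) / t

    # Steps 1-2: low-degree vertices are always marked; high-degree
    # vertices are marked independently with probability p.
    marked = {v for v in adj if len(adj[v]) <= 2 * delta or rng.random() < p}
    # Step 3: mark any vertex with no marked vertex in its closed neighborhood.
    for v in adj:
        if v not in marked and not (adj[v] & marked):
            marked.add(v)
    # Step 4: M dominates G; prune it greedily to a minimal dominating set.
    M = set(marked)
    for v in sorted(M, key=lambda u: -len(adj[u])):
        if is_dominating(adj, M - {v}):
            M.remove(v)
    # Step 5: accept only if the average degree meets the target.
    avg = sum(len(adj[v]) for v in M) / len(M)
    return M if avg <= t else None
```

On a clique K6 (δ = 5), for example, every vertex is marked in step 2 and the pruning leaves a single dominating vertex.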
We next prove Theorem 5, which shows that this bound O(δ log δ / log log δ) is optimal up to constant factors.
Proof of Theorem 5. We will construct a graph of average degree δ = O(α), all of whose MDSs have degree Ω(α log α / log log α). To simplify the proof, we will ignore rounding issues. As all the quantities tend to infinity with α, such rounding issues are negligible for α sufficiently large.
Define k = log_2(α log α / log log α). We define a random process which constructs a graph with three types of vertices, which we denote A, B, C (these play the same role as in the proof of Theorem 4). The vertices in A and B are organized into clusters of related vertices. For class A, there are l = 4^k/α^2 clusters of size α. For class B, there are r clusters of size 2^k/r, for some r = Θ(k) (the constant will be specified later). There are 2^k − 1 vertices in class C. These are not organized into clusters but are considered individually. We index these vertices by the nonzero k-dimensional binary vectors over the finite field GF(2); that is, C corresponds to GF(2)^k − {0}.
We add the following edges to the graph (some of these edges are deterministic, some are random):
1. From each A-vertex to the other vertices in the same A-cluster.
2. From each B-vertex to all the other B-vertices, even those outside its cluster.
3. For each B-cluster b, we choose a random nonzero binary vector v_b in GF(2)^k. For each vertex in C, indexed by vector w, we construct an edge from all the B-vertices in the cluster b to the vertex w iff v_b · w = 1. The dot product here is taken over the field GF(2).
4. For each A-cluster a, we select α/(2^k/r) B-clusters uniformly at random, with replacement. We add an edge from every vertex in the A-cluster a to every vertex in the selected B-clusters.
This graph has degree O(α). In the full paper [4], we show that with high probability, every MDS of G has degree Ω(α log α / log log α).
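For concreteness, the random construction above can be sketched in Python with toy parameters; the cluster counts and sizes below are illustrative stand-ins for the values l = 4^k/α^2, 2^k/r, etc. prescribed by the proof, and the function name is ours.

```python
import itertools
import random

def lower_bound_graph(k, num_a, a_size, r, b_size, rng):
    """Toy instance of the Theorem 5 construction (sketch).

    A: num_a clusters of a_size vertices, each cluster a clique.
    B: r clusters of b_size vertices; all B-vertices form one clique.
    C: the nonzero vectors of GF(2)^k, one vertex per vector.
    Returns an adjacency dict {vertex: set of neighbors}.
    """
    A = [[('A', i, j) for j in range(a_size)] for i in range(num_a)]
    B = [[('B', i, j) for j in range(b_size)] for i in range(r)]
    C = [w for w in itertools.product((0, 1), repeat=k) if any(w)]
    adj = {v: set() for cl in A + B for v in cl}
    adj.update({w: set() for w in C})

    def connect(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for cluster in A:                    # rule 1: edges inside each A-cluster
        for u, v in itertools.combinations(cluster, 2):
            connect(u, v)
    all_b = [v for cl in B for v in cl]  # rule 2: B is one big clique
    for u, v in itertools.combinations(all_b, 2):
        connect(u, v)
    for cluster in B:                    # rule 3: GF(2) dot-product test
        vb = rng.choice(C)
        for w in C:
            if sum(x * y for x, y in zip(vb, w)) % 2 == 1:
                for u in cluster:
                    connect(u, w)
    for cluster in A:                    # rule 4: random B-clusters, with replacement
        for bc in rng.choices(B, k=max(1, a_size * r // 2 ** k)):
            for u in cluster:
                for v in bc:
                    connect(u, v)
    return adj
```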
4 Conclusion
We initiate the study (graph-theoretic, algorithmic, and distributed) of the bal-
anced versions of some fundamental graph-theoretic structures. As discussed
in Section 1, the study of balanced structures can be useful in providing fault-tolerant, load-balanced MISs and MDSs. We develop reasonably close upper and lower bounds for many of these problems. Furthermore, for the BMIS problem, we are able to develop fast (local) distributed algorithms that achieve an approximation close to the best possible in general. A main problem left open is whether one can do the same for the BMDS problem. We view our results also as a step in understanding the complexity of local computation of these structures, whose optimality itself cannot be verified locally.
Acknowledgement. We would like to thank the anonymous referees for their helpful comments.
References
1. Alon, N., Babai, L., Itai, A.: A fast and simple randomized parallel algorithm for
the maximal independent set problem. J. Algorithms 7(4), 567–583 (1986)
2. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory
of NP-Completeness. W. H. Freeman (1979)
3. Gavinsky, D., Lovett, S., Saks, M., Srinivasan, S.: A tail bound for read-k families
of functions. arXiv preprint arXiv:1205.1478 (2012)
4. Harris, D.G., Morsy, E., Pandurangan, G., Robinson, P., Srinivasan, A.: Efficient
Computation of Balanced Structures, http://static.monoid.at/balanced.pdf
5. Kuhn, F., Moscibroda, T., Wattenhofer, R.: Local computation: Lower and upper
bounds. CoRR, abs/1011.5470 (2010)
6. Luby, M.: A simple parallel algorithm for the maximal independent set problem.
SIAM J. Comput. 15(4), 1036–1053 (1986)
7. Moscibroda, T.: Clustering. In: Algorithms for Sensor and Ad Hoc Networks, pp.
37–60 (2007)
8. Peleg, D.: Distributed Computing: A Locality-Sensitive Approach. SIAM (2000)
9. Rajaraman, R.: Topology control and routing in ad hoc networks: a survey.
SIGACT News 33(2), 60–73 (2002)
10. Suomela, J.: Survey of local algorithms. ACM Comput. Surv. 45(2) (2013)
11. Zhang, H., Shen, H.: Balancing energy consumption to maximize network lifetime
in data-gathering sensor networks. IEEE TPDS 20(10), 1526–1539 (2009)
A Refined Complexity Analysis of Degree Anonymization in Graphs

Sepp Hartung¹, André Nichterlein¹, Rolf Niedermeier¹, and Ondřej Suchý²

¹ Institut für Softwaretechnik und Theoretische Informatik, TU Berlin
{sepp.hartung,andre.nichterlein,rolf.niedermeier}@tu-berlin.de
² Faculty of Information Technology, Czech Technical University in Prague
[email protected]
Abstract. Motivated by a strongly growing interest in graph anonymization in the data mining and databases communities, we study the NP-hard problem of making a graph k-anonymous by adding as few edges as possible. Herein, a graph is k-anonymous if for every vertex in the graph there are at least k − 1 other vertices of the same degree. Our algorithmic results shed light on the performance quality of a popular heuristic due to Liu and Terzi [ACM SIGMOD 2008]; in particular, we show that the heuristic provides optimal solutions in case that many edges need to be added. Based on this, we develop a polynomial-time data reduction, yielding a polynomial-size problem kernel for the problem parameterized by the maximum vertex degree. This result is in a sense tight, since we also show that the problem is already NP-hard for H-index three, implying NP-hardness for smaller parameters such as average degree and degeneracy.
1 Introduction

For many scientific disciplines, including the understanding of the spread of dis-
eases in a globalized world or power consumption habits with impact on fighting
global warming, the availability of (anonymized) social network data becomes
more and more important. In a landmark paper, Liu and Terzi [16] introduced
the following simple graph-theoretic model for identity anonymization on (social)
networks. Herein, they transferred the k-anonymity concept known for tabular
data in databases [9] to graphs.
Degree Anonymity [16]
Input: An undirected graph G = (V, E) and two positive integers k and s.
Question: Is there an edge set E′ over V with |E′| ≤ s such that G′ = (V, E ∪ E′) is k-anonymous, that is, for every vertex v ∈ V there are at least k − 1 other vertices in G′ having the same degree?
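Since the k-anonymity condition depends only on the degree sequence, it can be checked in linear time with a degree histogram; a minimal sketch (the function name is ours):

```python
from collections import Counter

def is_k_anonymous(degrees, k):
    """True iff every occurring degree is shared by at least k vertices,
    i.e., every nonempty block of the degree sequence has size >= k."""
    return all(count >= k for count in Counter(degrees).values())

# A path on 4 vertices has degree sequence [1, 2, 2, 1]:
assert is_k_anonymous([1, 2, 2, 1], 2)        # each degree occurs twice
assert not is_k_anonymous([1, 2, 2, 1], 3)
```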

Liu and Terzi [16] assume in this model that an adversary (who wants to de-
anonymize the network) knows only the degree of the vertex of a target individual;

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 594–606, 2013.
© Springer-Verlag Berlin Heidelberg 2013
A Refined Complexity Analysis of Degree Anonymization in Graphs 595

this is a modest adversarial model. Clearly, there are stronger adversarial mod-
els which (in many cases very realistically) assume that the adversary has more
knowledge, making it possible to breach privacy provided by a “k-anonymized
graph” [20]. Moreover, it has been argued that graph anonymization has funda-
mental theoretical barriers which prevent a fully effective solution [1]. Degree
Anonymity, however, provides the perhaps most basic and still practically rele-
vant model for graph anonymization; it is the subject of active research [4, 5, 18].
Graph anonymization problems are typically NP-hard. Thus, almost all
algorithms proposed in this field are heuristic in nature, this also being true
for algorithms for Degree Anonymity [16, 18]. Indeed, as the field of graph
anonymization is young and under strong development, there is very little re-
search on its theoretical foundations, particularly concerning computational com-
plexity and algorithms with provable performance guarantees [6].
Our Contributions. Our central result is to show that Degree Anonymity has
a polynomial-size problem kernel when parameterized by the maximum vertex
degree Δ of the input graph. In other words, we prove that there is a polynomial-
time algorithm that transforms any input instance of Degree Anonymity into
an equivalent instance with at most O(Δ^7) vertices. Indeed, we encounter a "win-win" situation when proving this result: We show that Liu and Terzi's heuristic strategy [16] finds an optimal solution when the size s of a minimum solution is larger than 2Δ^4. As a consequence, we can bound s by O(Δ^4) and, hence, the polynomial kernel we provide for the combined parameter (Δ, s) is in fact a polynomial kernel for Δ alone. Furthermore, our kernelization has the useful property (e.g., for approximations) that each solution derived for the kernel instance corresponds one-to-one to a solution of the original instance. While this kernelization
directly implies fixed-parameter tractability for Degree Anonymity parame-
terized by Δ, we also develop a further improved fixed-parameter algorithm.
In addition, we prove that the polynomial kernel for Δ is tight in the sense that even for constant values of the "stronger" (that is, provably smaller) parameter H-index¹, Degree Anonymity becomes NP-hard. The same proof also yields
NP-hardness in 3-colorable graphs. Further, from a parameterized perspective,
we show that Degree Anonymity is W[1]-hard when parameterized by the
solution size s (the number of added edges), even when k = 2. In other words,
there is no hope for tractability even when the level k of anonymity is low and
the graph needs only few edge additions (meaning little perturbation) to achieve
k-anonymity.
Why is the parameter “maximum vertex degree Δ” of specific interest? First,
note that from a parameterized complexity perspective it seems to be a "tight" parameterization in the sense that for the only slightly "stronger" parameter H-index our results already show NP-hardness for H-index three (also implying
hardness e.g. for the parameters degeneracy and average degree). Social networks
typically have few vertices with high degree and many vertices of small degree.
¹ The H-index of a graph G is the maximum integer h such that G has at least h vertices with degree at least h. Thus, G has at most h vertices of degree larger than h.
596 S. Hartung et al.

Leskovec and Horvitz [15] studied a huge instant-messaging network (180 million
nodes) with maximum degree bounded by 600. For the DBLP co-author graph
generated in February 2012 with more than 715,000 vertices we measured a
maximum degree of 804 and an H-index of 208, so there are not more than 208
vertices with degree larger than 208. Thus, a plausible strategy might be to only
anonymize vertices of “small” degree and to remove high-degree vertices for the
anonymization process because it might be overly expensive to anonymize these
high-degree vertices and since they might be well-known (that is, not anonymous)
anyway. Indeed, high-degree vertices can be interpreted as outliers [2], potentially
making their removal plausible.
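The H-index of footnote 1 can be computed with one pass over the degree sequence sorted in nonincreasing order; a small sketch (function name ours):

```python
def h_index(degrees):
    """Largest h such that at least h vertices have degree at least h."""
    degs = sorted(degrees, reverse=True)
    h = 0
    while h < len(degs) and degs[h] >= h + 1:
        h += 1
    return h

# Star K_{1,5}: one vertex of degree 5, five of degree 1.
assert h_index([5, 1, 1, 1, 1, 1]) == 1
assert h_index([3, 3, 3, 2]) == 3
```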

Related Work. The most important reference is Liu and Terzi’s work [16] where
the basic model was introduced, sophisticated (heuristic) algorithms (also using
algorithms to determine the realizability of degree sequences) have been devel-
oped and validated on experimental data. Somewhat more general models have
been considered by Zhou and Pei [25] (studying the neighborhood of vertices
instead of only the degree) and by Chester et al. [5] (anonymizing a subset of
the vertices of the input). Chester et al. [4] investigate the variant of adding
vertices instead of edges. Building on Liu and Terzi’s work, Lu et al. [18] pro-
pose a “more efficient and more effective” algorithm for Degree Anonymity.
Again, this algorithm is heuristic in nature. Today, the field of graph anonymiza-
tion has grown tremendously with numerous surveys and research directions. We
only mention some directly related work.
There are many other, often more complicated, models for graph anonymization. Weaknesses of Degree Anonymity and of other models (mainly depending on the assumed adversary model, where for many practical situations the adversary may, e.g., have an auxiliary network that helps in de-anonymizing) have been pointed out [1, 20, 24]. In conclusion, given the generality of background
knowledge an adversary may or may not have, graph anonymization remains a
chimerical target [18] and, thus, a universally best model is not available.
Finally, from a (parameterized) computational complexity perspective, the
closest work we are aware of is due to Mathieson and Szeider [19] who provide
a study on editing graphs to satisfy degree constraints. In their basic model,
each vertex is equipped with a degree list and the task is to edit the graph such
that each vertex achieves a degree contained in its degree list. They study the
editing operations edge addition, edge deletion, and vertex deletion and provide
numerous parameterized tractability and intractability results. Interestingly, on
the technical side they also rely on the computation of general factors in graphs
(as we do) and they also study kernelization, where they leave as most challenging
open problem to extend their kernelization results to cases that include vertex
deletion and edge addition, emphasizing that the presence of edge additions
makes their approach inapplicable.
Due to the lack of space, many technical details are deferred to a full version
of the paper.
2 Preliminaries
Parameterized complexity. The concept of parameterized complexity was pio-
neered by Downey and Fellows [7] (see also [8, 21] for more recent textbooks). A
parameterized problem is called fixed-parameter tractable if there is an algorithm
that decides any instance (I, k), consisting of the “classical” instance I and a
parameter k ∈ N_0, in f(k) · |I|^{O(1)} time, for some computable function f solely
depending on k. A core tool in the development of fixed-parameter algorithms
is polynomial-time kernelization [3, 12]. Here, the goal is to transform a given
problem instance (I, k) in polynomial time into an equivalent instance (I′, k′), the so-called kernel, such that k′ ≤ g(k) and |I′| ≤ g(k) for some function g. If g
is a polynomial, then it is called a polynomial kernel. A parameterized problem
that is classified as W[1]-hard (using so-called parameterized reductions) is un-
likely to admit a fixed-parameter algorithm. There is good complexity-theoretic
reason to believe that W[1]-hard problems are not fixed-parameter tractable.

Graphs and Anonymization. We use standard graph-theoretic notation. All graphs studied in this paper are simple, i.e., there are no self-loops and no
multi-edges. For a given graph G = (V, E) with vertex set V and edge set E we
set n = |V | and m = |E|. Furthermore, by degG (v) we denote the degree of a
vertex v ∈ V in G and ΔG denotes the maximum degree of any vertex of G. For
0 ≤ a ≤ ΔG let DG (a) = {v ∈ V | degG (v) = a} be the block of degree a, that is,
the set of all vertices with degree a in G. Thus, being k-anonymous is equivalent
to each block being of size either zero or at least k. The complement graph of G is denoted by G̅ = (V, E̅), where E̅ = {{u, v} | u, v ∈ V, {u, v} ∉ E}. The subgraph of G induced by a vertex subset V′ ⊆ V is denoted by G[V′]. For an edge subset E′ ⊆ E, V(E′) denotes the set of all endpoints of edges in E′, and G[E′] = (V(E′), E′).
For a set of edges S with endpoints in a graph G, we denote by G + S the graph
that results by inserting all edges in S into G and we call S an edge insertion set
for G. Thus, Degree Anonymity is the question whether there is an edge insertion set S of size at most s such that G + S is k-anonymous. In this case S is called a k-insertion set for G. We omit subscripts if the graph is clear from the context.

3 Hardness Results
In this section we provide two polynomial-time many-to-one reductions yielding
three (parameterized) hardness results.

Theorem 1. Degree Anonymity is NP-hard on 3-colorable graphs and on graphs with H-index three.

Proof (Sketch). We give a reduction from the NP-hard Independent Set problem, where given a graph G = (V, E) and a positive integer h, the question is whether there is a size-h independent set, that is, a vertex subset of pairwise nonadjacent vertices. We assume without loss of generality that in the given Independent Set instance (G, h) it holds that |V| ≥ 2h + 1. We construct an equivalent instance (G′ = (V′, E′), k, s) for Degree Anonymity as follows. We start with a copy G′ of G, denoting by v′ ∈ V′ the copy of the vertex v ∈ V. Then, for each vertex v ∈ V we add to G′ degree-one vertices adjacent to v′ such that v′ has degree Δ_G in G′. Next we add a star with Δ_G + h − 1 leaves and denote its central vertex by c. We conclude the construction by setting k = h + 1 and s = h^2.
Independent Set is NP-hard on 3-colorable graphs [23, Lemma 6] and on graphs with maximum degree three [11, GT20]. Observe that if G is 3-colorable, then G′ is also 3-colorable. Furthermore, if G has maximum degree three, then only the central vertex c might have degree larger than three, implying that the H-index of G′ is three. ⊓⊔

Degree Anonymity is W[1]-hard with respect to the standard parameterization, that is, by the number s of edges that may be added:

Theorem 2. Degree Anonymity is W[1]-hard parameterized by the number of inserted edges s, even if k = 2.

4 Polynomial Kernel for the Maximum Degree Δ

In this main section we provide a polynomial kernel with respect to the param-
eter maximum degree Δ (Theorem 4). Our proof has two main ingredients: first
we show in Section 4.2 a polynomial kernel with respect to the combined pa-
rameter (Δ, s); second we show in Section 4.3 that a slightly modified variant of
Liu and Terzi’s heuristic [16] exactly solves any instance having a minimum-size
k-insertion set of size at least (Δ^2 + 4Δ + 3)^2. Hence, either we can solve a given instance in polynomial time or we can upper-bound s by (Δ^2 + 4Δ + 3)^2,
implying that the kernel polynomial in (Δ, s) is indeed polynomial only in Δ.
We begin by presenting the main technical tool used in our work, the so-called
f -Factor problem.

4.1 f -Factor Problem

Degree Anonymity has a close connection to the polynomial-time solvable f-Factor problem [17, Chapter 10]: Given a graph G = (V, E) and a function f : V → N_0, does there exist an f-factor, that is, a subgraph G′ = (V, E′) of G such that deg_{G′}(v) = f(v) for all vertices? One can reformulate Degree Anonymity using f-Factor as follows: Given an instance (G, k, s), the question is whether there is a function f : V → N_0 such that the complement graph G̅ contains an f-factor, Σ_{v∈V} f(v) ≤ 2s (every edge is counted twice in the sum of degrees), and for all v ∈ V it holds that |{u ∈ V | deg_G(u) + f(u) = deg_G(v) + f(v)}| ≥ k (the k-anonymity requirement). The following lemma guarantees the existence of an f-factor in graphs fulfilling certain requirements on the maximum degree and the number of vertices.
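The arithmetic side of this reformulation is easy to check; the sketch below (names ours) verifies that Σ_{v∈V} f(v) ≤ 2s and that the shifted degree sequence is k-anonymous. Whether the complement graph actually contains the f-factor must still be decided separately with a factor algorithm.

```python
from collections import Counter

def certifies_anonymity(degrees, f, k, s):
    """Check the two counting conditions of the f-Factor reformulation
    for degree list `degrees` and increment list `f` (aligned by index)."""
    if sum(f) > 2 * s:
        return False                      # more than s edges would be needed
    new_degrees = [d + x for d, x in zip(degrees, f)]
    return all(c >= k for c in Counter(new_degrees).values())

# Path P4 has degrees [1, 2, 2, 1]; one edge joining its two endpoints
# (f raises both endpoint degrees by 1) makes the graph 2-regular:
assert certifies_anonymity([1, 2, 2, 1], [1, 0, 0, 1], k=4, s=1)
```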
Lemma 1 ([14]). Let G = (V, E) be a graph with minimum vertex degree δ and let a ≤ b be two positive integers. Suppose further that

δ ≥ (b/(a + b))·|V|  and  |V| > ((a + b)/a)·(b + a − 3).

Then, for any function f : V → {a, a + 1, ..., b} where Σ_{v∈V} f(v) is even, G has an f-factor.

As we are interested in an f-factor in the complement graph of our input graph G, we use Lemma 1 with a = 1, b = Δ + 2, and minimum degree δ ≥ n − Δ − 1. This directly leads to the following.

Corollary 1. Let G = (V, E) be a graph with n vertices, minimum degree n − Δ − 1, Δ ≥ 1, and let f : V → {1, . . . , Δ + 2} be a function such that Σ_{v∈V} f(v) is even. If n ≥ Δ^2 + 4Δ + 3, then G has an f-factor.

4.2 Polynomial Kernel for (Δ, s)


Our kernelization algorithm is based on the following observation. For a given
graph G, consider for some 1 ≤ i ≤ ΔG the block DG (i), that is, the set of
all vertices of degree i. If DG (i) contains many vertices, then the vertices are
“interchangeable”:
Observation 1. Let G = (V, E) be a graph, let S be a k-insertion set for G, and let v ∈ V(S) ∩ D_G(i) be a vertex such that |D_G(i)| > (Δ + 2)s. Then there exists a vertex u ∈ D_G(i) \ V(S) such that replacing v by u in every edge of S results in a k-insertion set for G.

Proof. Since |S| ≤ s, the vertex v can be incident to at most s edges in S. Denoting the set of these edges by S^v, one can clearly replace v by u ∈ D_G(i) if u is non-adjacent to all vertices in V(S^v) \ {v} (this allows inserting all edges) and u ∉ V(S) (the sizes of the blocks in G + S do not change). However, as V(S) contains at most 2s vertices from D_G(i) and each of the at most s vertices in V(S^v) \ {v} has at most Δ_G neighbors in G, such a vertex u ∈ D_G(i) exists if |D_G(i)| > (Δ + 2)s. ⊓⊔

By Observation 1, in our kernel we only need to keep at most (Δ + 2)s vertices in each block: If an optimal k-insertion set S contains a vertex v ∈ V(S) that we did not keep, then by Observation 1 we can replace v by some vertex we kept.
There are two major problems that need to be fixed to obtain a kernel: First,
when removing vertices from the graph, the degrees of the remaining vertices
change. Second, k might be “large” and, thus, removing vertices (during kernel-
ization) in one block may breach the k-anonymity constraint. To overcome the
first problem we insert some “dummy-vertices” which are guaranteed not to be
contained in any k-insertion set. However, to solve the second problem we need
to adjust the parameter k as well as the number of vertices that we keep from
each block.
Algorithm 1. The pseudocode of the algorithm producing a polynomial kernel with respect to (Δ, s).
1: procedure producePolyKernel(G = (V, E), k, s)
2:   if |V| ≤ Δ(β + 4s) then // β is defined as β = (Δ + 4)s + 1
3:     return (G, k, s)
4:   k′ ← min{k, β}; A ← ∅
5:   for i ← 1 to Δ do
6:     if 2s < |D_G(i)| < k − 2s then
7:       return trivial no-instance // insufficient budget for D_G(i)
8:     if k ≤ β then // determine retained vertices
9:       x ← min{|D_G(i)|, β + 4s}
10:    else if |D_G(i)| ≤ 2s then
11:      x ← |D_G(i)|
12:    else
13:      x ← k′ + min{4s, |D_G(i)| − k} // observe that k′ = β
14:    add x vertices from D_G(i) to A
15:  G′ ← G[A]
16:  for each v ∈ A do // add vertices to preserve degree of retained vertices
17:    add to G′ deg_G(v) − deg_{G′}(v) many degree-one vertices adjacent to v
18:  denote with P the set of vertices added in Line 17
19:  by adding matched pairs of vertices, ensure that |P| ≥ max{4Δ + 4s + 4, k′}
20:  if Δ + s + 1 is even then
21:    G^F = (P, E^F) ← (Δ + s + 1)-factor in the complement of G′[P]
22:  else
23:    G^F = (P, E^F) ← (Δ + s + 2)-factor in the complement of G′[P]
24:  G′ ← G′ + E^F
25:  return (G′, k′, s)

We now explain the kernelization algorithm in detail (see Algorithm 1 for the pseudocode). Let (G, k, s) be an instance of Degree Anonymity. For brevity we set β = (Δ + 4)s + 1. We compute in polynomial time an equivalent instance (G′, k′, s) with at most O(Δ^3 s) vertices: First, set k′ = min{k, β} (Line 4). We arbitrarily select from each block D_G(i) a certain number x of vertices and collect all these vertices into the set A (Line 14). To cope with the above-mentioned second problem, the "certain number" is defined by a case distinction on the value of k (see Lines 5 to 14). Intuitively, if k is large then we distinguish between "small" blocks of size at most 2s and "large" blocks of size at least k − 2s. Obviously, if there is a block which is neither small nor large, then the instance is a no-instance (see Line 7). Thus, in the problem kernel we keep for small blocks the "distance to size zero" and for large blocks the "distance to size k". Furthermore, in order to separate between small and large blocks it is sufficient that k′ > 4s. However, to guarantee that Observation 1 is applicable, the case distinction is a little more complicated; see Lines 5 to 14.
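The case distinction of Lines 5 to 14 can be isolated as a small function; the sketch below (names ours) returns the number x of vertices retained from one block, with None standing for the trivial no-instance of Line 7:

```python
def retained_block_size(block_size, k, s, delta):
    """Vertices kept from a block of the given size (Lines 5-14, sketch)."""
    beta = (delta + 4) * s + 1
    k_prime = min(k, beta)
    if 2 * s < block_size < k - 2 * s:
        return None                              # insufficient budget (Line 7)
    if k <= beta:
        return min(block_size, beta + 4 * s)     # Line 9
    if block_size <= 2 * s:
        return block_size                        # small block, kept whole
    return k_prime + min(4 * s, block_size - k)  # large block: distance to k
```

For example, with k = 3, s = 2, Δ = 5 a block of 100 vertices shrinks to β + 4s = 27, while k = 100, s = 1, Δ = 3 makes a block of 5 vertices a certified no-instance.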
We start building G′ by first copying G[A] into it (Line 15). Next, adding a pendant vertex to v means that we add a new vertex to G′ and make it adjacent to v. For each v ∈ A we add pendant vertices to v to ensure that deg_{G′}(v) = deg_G(v) (Line 17). The vertices of A stay untouched in the following. Denote the set of all pendant vertices by P. Next, we add enough pairwise adjacent vertices to P to ensure that |P| ≥ max{k′, 4Δ + 4s + 4} (Line 19). Hence, |P| ≤ max{|A| · Δ, k′, 4Δ + 4s + 4} + 1. To avoid that vertices in P help to anonymize the vertices in A, we "shift" the degree of the vertices in P (see Lines 20 to 24): We add edges between the vertices in P to ensure that the degree of all vertices in P is Δ + s + 2 (when Δ + s + 1 is even) or Δ + s + 3 (when Δ + s + 2 is even). For ease of notation, let χ denote the new degree of the vertices in P. Observe that before adding edges all vertices in P have degree one in G′. Thus, the minimum degree in the complement graph of G′[P] is |P| − 2. Furthermore, for each v ∈ P we denote by f(v) the number of incident edges v requires to have the described degree. It follows that f(v) is even and hence Σ_{v∈P} f(v) is even. Hence, setting a = b = χ − 1 fulfills all conditions of Lemma 1. Thus, the required f-factor exists and can be found in O(|P|·√|P|·(Δ + s)) time [10]. This completes the description of the kernelization algorithm.
The key point of the correctness of the kernelization is to show that, without loss of generality, no k-insertion set S for G′ of size |S| ≤ s affects any vertex in P. This is ensured by "shifting" the degree of all vertices in P by s + 1 (or s + 2), implying that none of the vertices in A can "reach" the degree of any vertex in P by adding at most s edges. Hence each block is a subset either of A or of P. We now prove that we may assume that an edge insertion set does not affect any vertex in P. All we need to prove this is the fact that A contains at least β + 4s vertices from at least one block in G; observe that this is ensured by the check in Line 2.

Lemma 2. If there is a k-insertion set S for G′ with |S| ≤ s, then there is also a k-insertion set S′ for G′ with |S′| = |S| such that V(S′) ∩ P = ∅.

Based on Lemma 2 we now prove the correctness of our kernelization algorithm.

Theorem 3. Degree Anonymity admits a kernel with O(Δ^3 s) vertices.

Proof. The polynomial kernel is computed by Algorithm 1. As f-Factor can be solved in polynomial time [10], Algorithm 1 runs in polynomial time. The correctness of the kernelization algorithm is deferred to a full version. It remains to bound the size of the kernel. To this end, observe that each block in A has size at most β + 4s (see Lines 9, 11, and 13). Thus, |A| = O(Δβ) = O(Δ^2 s). Furthermore, the set P contains at most max{Δ|A|, k′, 4Δ + 4s + 4} + 1 vertices (see Lines 17 to 19). Thus, |P| = O(Δ^3 s) and, hence, the reduced instance contains O(Δ^3 s) vertices. ⊓⊔

4.3 A Polynomial-Time Algorithm for “Large” Solution Instances


In this section we show that if a minimum-size k-insertion set S is "large" compared to Δ, then one can solve the instance in polynomial time (Lemma 5). Towards this, we first show that a large solution influences the degrees of many vertices. Then the main idea is that if it influences the degree of "many" vertices
from the same block, say DG (i), then by Observation 1 the corresponding ver-
tices can be arbitrarily “interchanged”. Thus it is not important to know which
vertex from DG (i) has to be “moved” up to a certain degree by adding edges,
because Observation 1 ensures that we can greedily find one. This, however, im-
plies that the actual structure of the input graph (which forbids to insert certain
edges since they are already present) no longer matters. Hence, we solve Degree Anonymity without taking the graph structure into account. Thereby, if we can k-anonymize the degree sequence corresponding to G (the sequence of degrees of G) such that "many" degrees have to be adjusted, then by Corollary 1 we can conclude that the complement graph G̅ contains an f-factor, where f(v) captures the difference between the degree of v in G and the anonymized degree sequence. The f-factor can be found in polynomial time [10] and, hence, a k-insertion set can be found in polynomial time. We now formalize this idea.
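As an illustration of the degree-sequence view, the simple greedy below k-anonymizes a degree sequence by only raising degrees; this is not Liu and Terzi's optimal dynamic program [16], just a sketch of the abstraction (names ours):

```python
from collections import Counter

def greedy_anonymize_degrees(degrees, k):
    """Raise degrees (never lower them) so that every occurring value
    is shared by at least k positions: scan in nonincreasing order and
    lift each group of k (the last group absorbs any short tail) to
    its largest member."""
    order = sorted(range(len(degrees)), key=lambda i: -degrees[i])
    out = list(degrees)
    n, start = len(order), 0
    while start < n:
        end = start + k
        if n - end < k:          # avoid leaving a tail group of size < k
            end = n
        group = order[start:end]
        target = max(out[i] for i in group)
        for i in group:
            out[i] = target
        start = end
    return out

assert greedy_anonymize_degrees([5, 3, 3, 2, 1], 2) == [5, 5, 3, 3, 3]
```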
We first show that a “large” minimum-size k-insertion set increases the max-
imum degree by at most two.

Lemma 3. Let G = (V, E) be a graph and let S be a minimum-size k-insertion set. If |V(S)| ≥ Δ^2 + 4Δ + 3, then the maximum degree in G + S is at most Δ + 2.

Proof. Let G be a graph with maximum degree Δ and k an integer. Let S be a


minimum-size edge set such that G+S is k-anonymous and |V (S)| ≥ Δ2 +4Δ+3.
Now assume towards a contradiction that the maximum degree in G + S is at
least Δ + 3. We show that there exists an edge set S′ such that G + S′ is
k-anonymous, |S′| < |S|, and G + S′ has maximum degree at most Δ + 2.
First we introduce some notation. Let f be a function f : V → N0 defined
as f(v) = degG+S(v) − degG(v) for all v ∈ V. Furthermore, denote by X
the set of all vertices having degree more than Δ + 2 in G + S, that is, X =
{v ∈ V | f(v) + degG(v) ≥ Δ + 3}. Observe that G[S] is an f-factor of G
and 2|S| = Σv∈V f(v). We now define a new function f′ : V → N0 such that G
contains an f′-factor, denoted by G′ = (V, S′), where the edge set S′ has the
properties described in the previous paragraph.
We define f′ for all v ∈ V as follows:

         ⎧ f(v)                if v ∉ X,
f′(v) =  ⎨ Δ − degG(v) + 1    if v ∈ X and f(v) + degG(v) − Δ − 1 is even,
         ⎩ Δ − degG(v) + 2    otherwise.

First observe that degG(v) + f′(v) ≤ Δ + 2 for all v ∈ V. Furthermore, observe
that f′(v) = f(v) for all v ∈ V \ X, and for all v ∈ X it holds that f′(v) < f(v)
and f(v) − f′(v) is even. Thus, Σv∈V f(v) > Σv∈V f′(v) and Σv∈V f′(v) is
even. It remains to show that (i) G contains an f′-factor G′ = (V, S′) and (ii)
G + S′ is k-anonymous.
To prove (i) let Ṽ = {v ∈ V | f′(v) > 0} and observe that if f(v) > 0,
then, by definition of f′, we have f′(v) > 0 and hence Ṽ = V(S). Furthermore,
let G̃ = G[Ṽ]. Observe that G̃ has minimum degree |Ṽ| − Δ − 1 and |Ṽ| =
|V(S)| ≥ Δ² + 4Δ + 3. Thus, the conditions of Corollary 1 are satisfied and
hence G̃ contains an f′|Ṽ-factor G̃′ = (Ṽ, S′). By definition of Ṽ it follows
that G′ = (V, S′) is an f′-factor of G. Thus, it remains to show (ii).
Assume towards a contradiction that G + S′ is not k-anonymous, that is,
there exists some vertex v ∈ V such that 1 ≤ |DG+S′(degG+S′(v))| < k. Let d =
degG+S(v) and d′ = degG+S′(v). Observe that d′ = degG(v) + f′(v). Thus, if v ∉
X, then by definition of f′ it holds that d′ = degG(v) + f(v) = d ≤ Δ + 2. Hence,
for all vertices u ∈ DG+S(d′) it follows that u ∉ X. Thus, DG+S(d′) ⊆ DG+S′(d′)
and since G + S is k-anonymous we have |DG+S′(d′)| ≥ k, a contradiction. On
the other hand, if v ∈ X, that is, d > Δ + 2, then |DG+S(d)| ≥ k since G + S is
k-anonymous. Furthermore, by the definitions of DG+S(d), f, and X we have for
all u ∈ DG+S(d) that degG(u) + f(u) = d, u ∈ X, and, thus, f′(u) + degG(u) = d′.
Therefore, DG+S(d) ⊆ DG+S′(d′) and |DG+S′(d′)| ≥ k, a contradiction. ⊓⊔
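The case distinction in the proof is mechanical enough to state as code. The following Python sketch (ours; the function name and dictionary encoding are illustrative, not from the paper) builds f′ from f, the degrees in G, and Δ; the accompanying checks mirror the two properties used above, namely that the new degrees are capped at Δ + 2 and that every difference f(v) − f′(v) is even.

```python
def capped_factor(deg, f, Delta):
    """Construct f' as in the proof of Lemma 3: vertices that S pushes
    above degree Delta + 2 get their demand reduced, preserving the
    parity of f(v) - f'(v)."""
    # X: vertices whose degree in G + S exceeds Delta + 2
    X = {v for v in deg if f[v] + deg[v] >= Delta + 3}
    fp = {}
    for v in deg:
        if v not in X:
            fp[v] = f[v]
        elif (f[v] + deg[v] - Delta - 1) % 2 == 0:
            fp[v] = Delta - deg[v] + 1
        else:
            fp[v] = Delta - deg[v] + 2
    return fp
```

For instance, with Δ = 3 and a vertex of degree 3 whose demand f is 3, the vertex lands in X and its demand is capped to 1, so its final degree is Δ + 1 instead of Δ + 3.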

We now formalize the anonymization of degree sequences. A multiset of positive
integers D = {d1, . . . , dn} that corresponds to the degrees of all vertices in a
graph is called a degree sequence. A degree sequence D is k-anonymous if each
number in D occurs at least k times in D. Clearly, the degree sequence of a
k-anonymous graph G is k-anonymous. Moreover, if a graph G can be trans-
formed by at most s edge insertions into a k-anonymous graph, then the degree
sequence of G can be transformed into a k-anonymous degree sequence by in-
creasing the integers by no more than 2s in total (clearly, the reverse direction
fails in general because of the graph structure). As we are only interested in
a degree sequence corresponding to the graph of a Degree Anonymity instance
where s is large, by Lemma 3 we can require the integers in a k-anonymous
degree sequence to be upper-bounded by Δ + 2.
k-Degree Sequence Anonymity (k-DSA)
Input: Two positive integers k and s and a degree sequence D =
{d1, . . . , dn} with d1 ≤ d2 ≤ . . . ≤ dn and Δ = dn.
Question: Is there a k-anonymous degree sequence D′ = {d′1, . . . , d′n} with
di ≤ d′i and max_{1≤i≤n} d′i ≤ Δ + 2 such that Σ_{i=1}^{n} (d′i − di) = 2s?

By slightly modifying a dynamic programming-based heuristic introduced by


Liu and Terzi [16], we next prove that k-Degree Sequence Anonymity is
polynomial-time solvable. Note that Liu and Terzi [16] use their heuristic to
solve Degree Anonymity by first solving the problem on the degree sequence
of the input graph G and then trying to “realize” (adding the corresponding
edges to G) the produced k-anonymous degree sequence.
Lemma 4. k-Degree Sequence Anonymity can be solved in polynomial
time.
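The dynamic program behind Lemma 4 follows the Liu–Terzi grouping idea: sort the degrees, partition them into consecutive groups, and raise every degree in a group to the group's maximum; a standard observation lets one restrict group sizes to between k and 2k − 1. The following Python sketch (ours; it computes the minimum total increase and ignores k-DSA's exact-2s target and Δ + 2 cap) illustrates the recurrence:

```python
def k_anonymize_cost(degrees, k):
    """Minimum total increase needed to make the degree multiset k-anonymous,
    raising each group of consecutive (sorted) degrees to the group maximum."""
    d = sorted(degrees)
    n = len(d)
    INF = float("inf")
    dp = [INF] * (n + 1)   # dp[j]: minimum cost for the first j degrees
    dp[0] = 0
    for j in range(k, n + 1):
        # last group is d[i..j-1]; sizes between k and 2k-1 suffice
        for i in range(max(0, j - 2 * k + 1), j - k + 1):
            if dp[i] == INF:
                continue
            group_cost = sum(d[j - 1] - d[t] for t in range(i, j))
            dp[j] = min(dp[j], dp[i] + group_cost)
    return dp[n]
```

For example, with D = {1, 2, 2, 3} and k = 2, grouping as {1, 2} and {2, 3} yields the 2-anonymous sequence {2, 2, 3, 3} at total cost 2.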
We now have all ingredients to solve Degree Anonymity in polynomial time
in case it has a “large” minimum-size k-insertion set. More formally, let (G, k, s)
be an instance and let D be the degree sequence of G. First, find the largest
i ≤ s such that (D, k, i) is a yes-instance of k-Degree Sequence Anonymity.
If i is “large”, then we prove that we can transfer the solution to G. In all other

cases, since any k-insertion set for G of size j ≤ s directly implies that (D, k, j)
is a yes-instance for k-Degree Sequence Anonymity, it follows that we can
bound the parameter s by a function in Δ.
Lemma 5. Let (G, k, s) be an instance of Degree Anonymity. Either one
can decide the instance in polynomial time or (G, k, s) is a yes-instance if and
only if (G, k, min{(Δ² + 4Δ + 3)², s}) is a yes-instance.
By Lemma 5 it follows that in polynomial time we can either find a solution or
we have s < (Δ² + 4Δ + 3)². By Theorem 3 this implies our main result.
Theorem 4. Degree Anonymity admits an O(Δ⁷)-vertex kernel.

5 Fixed-Parameter Algorithm
We provide a direct combinatorial algorithm for the combined parameter (Δ, s).
Roughly speaking, for fixed k-insertion set S the algorithm branches into all
suitable structures of G[S], that is, graphs of at most 2s vertices with vertex
labels from {1, . . . , Δ}. Then the algorithm checks whether the respective struc-
ture occurs as a subgraph in G such that the labels on the vertices match the
degree of the corresponding vertex in G.
Theorem 5. Degree Anonymity can be solved in (6s²Δ³)^{2s} · s² · n^{O(1)} time.
Note that due to the upper bound s < (Δ² + 4Δ + 3)² (see Lemma 5) and
the polynomial kernel for the parameter Δ (see Theorem 4), Theorem 5 also
provides an algorithm running in Δ^{O(Δ⁴)} + n^{O(1)} time.

6 Conclusion
One of the grand challenges of theoretical research on computationally hard
problems is to gain a better understanding of when and why heuristic algorithms
work [13]. In this theoretical study, we contributed to a better theoretical under-
standing of a basic problem in graph anonymization, on the one side partially
explaining the quality of a successful heuristic approach [16] and on the other
side providing a first step towards a provably efficient algorithm for relevant spe-
cial cases (bounded-degree graphs). Our work just being one of the first steps
in the so far underdeveloped field of studying the computational complexity of
graph anonymization [6], there are numerous challenges for future research. For
instance, our focus was on classification results rather than on engineering the upper
bounds; improving these bounds is a natural next step. Second, it would be interesting to perform a
data-driven analysis of parameter values on real-world networks in order to gain
parameterizations that can be exploited in a broad-band multivariate complex-
ity analysis [22] of Degree Anonymity. Finally, with Degree Anonymity
we focused on a very basic problem of graph anonymization; there are numer-
ous other models (partially mentioned in the introductory section) that ask for
similar studies.
A Refined Complexity Analysis of Degree Anonymization in Graphs 605

References
[1] Aggarwal, C.C., Li, Y., Yu, P.S.: On the hardness of graph anonymization. In:
Proc. 11th IEEE ICDM, pp. 1002–1007. IEEE (2011)
[2] Aggarwal, G., Feder, T., Kenthapadi, K., Khuller, S., Panigrahy, R., Thomas,
D., Zhu, A.: Achieving anonymity via clustering. ACM Transactions on Algo-
rithms 6(3), 1–19 (2010)
[3] Bodlaender, H.L.: Kernelization: New upper and lower bound techniques. In: Chen,
J., Fomin, F.V. (eds.) IWPEC 2009. LNCS, vol. 5917, pp. 17–37. Springer, Hei-
delberg (2009)
[4] Chester, S., Kapron, B.M., Ramesh, G., Srivastava, G., Thomo, A., Venkatesh, S.:
k-Anonymization of social networks by vertex addition. In: Proc. 15th ADBIS (2).
CEUR Workshop Proceedings, vol. 789, pp. 107–116 (2011), CEUR-WS.org
[5] Chester, S., Gaertner, J., Stege, U., Venkatesh, S.: Anonymizing subsets of social
networks with degree constrained subgraphs. In: Proc. ASONAM, pp. 418–422.
IEEE Computer Society (2012)
[6] Chester, S., Kapron, B., Srivastava, G., Venkatesh, S.: Complexity of social net-
work anonymization. Social Network Analysis and Mining (2012) (available online)
[7] Downey, R.G., Fellows, M.R.: Parameterized Complexity. Springer (1999)
[8] Flum, J., Grohe, M.: Parameterized Complexity Theory. Springer (2006)
[9] Fung, B.C.M., Wang, K., Chen, R., Yu, P.S.: Privacy-preserving data publishing: A
survey of recent developments. ACM Computing Surveys 42(4), 14:1–14:53 (2010)
[10] Gabow, H.N.: An efficient reduction technique for degree-constrained subgraph
and bidirected network flow problems. In: Proc. 15th STOC, pp. 448–456. ACM
(1983)
[11] Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory
of NP-Completeness. Freeman (1979)
[12] Guo, J., Niedermeier, R.: Invitation to data reduction and problem kernelization.
SIGACT News 38(1), 31–45 (2007)
[13] Karp, R.M.: Heuristic algorithms in computational molecular biology. J. Comput.
Syst. Sci. 77(1), 122–128 (2011)
[14] Katerinis, P., Tsikopoulos, N.: Minimum degree and f -factors in graphs. New
Zealand J. Math. 29(1), 33–40 (2000)
[15] Leskovec, J., Horvitz, E.: Planetary-scale views on a large instant-messaging net-
work. In: Proc. 17th WWW, pp. 915–924. ACM (2008)
[16] Liu, K., Terzi, E.: Towards identity anonymization on graphs. In: Proc. ACM
SIGMOD 2008, pp. 93–106. ACM (2008)
[17] Lovász, L., Plummer, M.D.: Matching Theory. Annals of Discrete Mathematics,
vol. 29. North-Holland (1986)
[18] Lu, X., Song, Y., Bressan, S.: Fast identity anonymization on graphs. In: Liddle,
S.W., Schewe, K.-D., Tjoa, A.M., Zhou, X. (eds.) DEXA 2012, Part I. LNCS,
vol. 7446, pp. 281–295. Springer, Heidelberg (2012)
[19] Mathieson, L., Szeider, S.: Editing graphs to satisfy degree constraints: A param-
eterized approach. J. Comput. Syst. Sci. 78(1), 179–191 (2012)
[20] Narayanan, A., Shmatikov, V.: De-anonymizing social networks. In: Proc. 30th
IEEE SP, pp. 173–187. IEEE (2009)

[21] Niedermeier, R.: Invitation to Fixed-Parameter Algorithms. Oxford University Press (2006)
[22] Niedermeier, R.: Reflections on multivariate algorithmics and problem parameteri-
zation. In: Proc. 27th STACS. LIPIcs, vol. 5, pp. 17–32. Schloss Dagstuhl–Leibniz-
Zentrum für Informatik (2010)
[23] Phillips, C., Warnow, T.J.: The asymmetric median tree—a new model for build-
ing consensus trees. Discrete Appl. Math. 71(1-3), 311–335 (1996)
[24] Sala, A., Zhao, X., Wilson, C., Zheng, H., Zhao, B.Y.: Sharing graphs using differ-
entially private graph models. In: Proc. 11th ACM SIGCOMM, pp. 81–98. ACM
(2011)
[25] Zhou, B., Pei, J.: The k-anonymity and l-diversity approaches for privacy preser-
vation in social networks against neighborhood attacks. Knowl. Inf. Syst. 28(1),
47–77 (2011)
Sublinear-Time Maintenance of Breadth-First
Spanning Tree in Partially Dynamic Networks

Monika Henzinger¹, Sebastian Krinninger¹, and Danupon Nanongkai²

¹ University of Vienna, Fakultät für Informatik, Austria
² Nanyang Technological University, Singapore

Abstract. We study the problem of maintaining a breadth-first spanning tree (BFS tree) in partially dynamic distributed networks modeling
a sequence of either failures or additions of communication links (but not
both). We show (1 + ε)-approximation algorithms whose amortized time
(over some number of link changes) is sublinear in D, the maximum di-
ameter of the network. This breaks the Θ(D) time bound of recomputing
“from scratch”.
Our technique also leads to a (1 + ε)-approximate incremental algorithm for single-source shortest paths (SSSP) in the sequential (usual
RAM) model. Prior to our work, the state of the art was the classic ex-
act algorithm of [9] that is optimal under some assumptions [27]. Our
result is the first to show that, in the incremental setting, this bound can
be beaten in certain cases if a small approximation is allowed.

1 Introduction

Complex networks are among the most ubiquitous models of interconnections


between a multiplicity of individual entities, such as computers in a data center,
human beings in society, and neurons in the human brain. The connections
between these entities are constantly changing; new computers are gradually
added to data centers, or humans regularly make new friends. These changes
are usually local as they are known only to the entities involved. Despite their
locality, they could affect the network globally; a single link failure could result
in several routing path losses or destroy the network connectivity. To maintain
its robustness, the network has to quickly respond to changes and repair its
infrastructure. The study of such tasks has been the subject of several active
areas of research, including dynamic, self-healing, and self-stabilizing networks.
One important infrastructure in distributed networks is the breadth-first span-
ning (BFS) tree [23,25]. It can be used, for instance, to approximate the network
diameter and to provide a communication backbone for broadcast of information

Full version available at http://eprints.cs.univie.ac.at/3703

The research leading to these results has received funding from the European
Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement
no. 317532.

Work partially done while at University of Vienna, Austria.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 607–619, 2013.

© Springer-Verlag Berlin Heidelberg 2013
608 M. Henzinger, S. Krinninger, and D. Nanongkai

through the network, routing, and control. In this paper, we study the problem
of maintaining a BFS tree on dynamic distributed networks. Our main interest
is repairing a BFS tree as fast as possible after each topology change.
Model. We model the communication network by the CONGEST model [25],
one of the major models of (locality-sensitive) distributed computation. Consider
a synchronous network of processors modeled by an undirected unweighted graph
G, where nodes model the processors and edges model the bounded-bandwidth
links between the processors. We let V (G) and E(G) denote the set of nodes
and edges of G, respectively. For any node u and v, we let dG (u, v) be the
distance between u and v in G. The processors (henceforth, nodes) are assumed
to have unique IDs of O(log n) bits and infinite computational power. Each
node has limited topological knowledge; in particular, it only knows the IDs of
its neighbors and knows no other topological information. The communication
is synchronous and occurs in discrete pulses, called rounds. All the nodes wake
up simultaneously at the beginning of each round. In each round each node
u is allowed to send an arbitrary message of O(log n) bits through each edge
(u, v) that is adjacent to u, and the message will reach v at the end of the
current round. There are several measures to analyze the performance of such
algorithms, a fundamental one being the running time, defined as the worst-case
number of rounds of distributed communication.
We model dynamic networks by a sequence of attack and recovery stages fol-
lowing the initial preprocessing. The dynamic network starts with a preprocessing
on the initial network denoted by N0 , where nodes communicate on N0 for some
number of rounds. Once the preprocessing is finished, we begin the first attack
stage where we assume that an adversary, who sees the current network N0 and
the states of all nodes, inserts and deletes an arbitrary number of edges in N0 .
We denote the resulting network by N1 . This is followed by the first recovery
stage where we allow nodes to communicate on N1 . After the nodes have finished
communicating, the second attack stage starts, followed by the second recovery
stage, and so on. We assume that Nt is connected for every stage t. For any
algorithm, we let the total update time be the total number of rounds needed
by nodes to communicate during all recovery stages. Let the amortized update
time be the total time divided by q which is defined to be the number of edges
inserted and deleted. Important parameters in analyzing the running time are n,
the number of nodes (which remains the same throughout all changes) and D, the
maximum diameter, defined to be the maximum diameter among all networks
in {N0 , N1 , . . .}. Note that D ≤ n since we assume that the network remains
connected throughout. Following the convention from the area of (sequential)
dynamic graph algorithms, we say that a dynamic network is fully dynamic if
both insertions and deletions can occur in the attack stages. Otherwise, it is par-
tially dynamic. Specifically, if only edge insertions can occur, it is an incremental
dynamic network. If only edge deletions can occur, it is decremental.
Our model highlights two aspects of dynamic networks: (1) how quickly a network can recover its infrastructure after changes and (2) how edge failures and additions affect the network. These aspects have been studied earlier but we are
Sublinear-Time Maintenance of BFS Tree in Dynamic Networks 609

not aware of any previous model identical to ours. To highlight these aspects,
there are also a few assumptions inherent in our model. First, it is assumed that
the network remains static in each recovery stage. This assumption is often used
(e.g. [18,13,21,24]) and helps emphasize the running time aspect of dynamic
networks. Second, our model assumes that only edges can change. While there
are studies that assume that nodes can change as well (e.g. [18,13]), this as-
sumption is common and practical; see, e.g., [1,4,22,8] and references therein.
Our amortized update time is also similar in spirit to the amortized commu-
nication complexity heavily studied earlier (e.g. [4]). Finally, the results in this
paper are on partially dynamic networks. While fully dynamic algorithms are
more desirable, we believe that the partially dynamic setting is worth studying,
for two reasons. The first reason, which is our main motivation, comes from an
experience in the study of sequential dynamic algorithms, where insights from
the partially dynamic setting often lead to improved fully dynamic algorithms.
Moreover, partially dynamic algorithms can be useful in cases where one type of
changes occurs much more frequently than the other type.
Problem. We are interested in maintaining an approximate BFS tree. For any
α ≥ 1, an α-approximate BFS tree of graph G with respect to a given root s is
a spanning tree T such that for every node v, dT (s, v) ≤ αdG (s, v) (note that,
clearly, dT (s, v) ≥ dG (s, v)). If α = 1, then T is an (exact) BFS tree. The goal
of our problem is to maintain an approximate BFS tree Tt at the end of each
recovery stage t in the sense that every node v knows its approximate distance
to the preconfigured root s in Nt and, for each neighbor u of v, v knows if u is
its parent or child in Tt .
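The α-approximation condition can be checked mechanically. The following Python sketch (ours; the tree is encoded by parent pointers, and function names are illustrative) verifies d_T(s, v) ≤ α · d_G(s, v) for every node:

```python
from collections import deque

def bfs_dist(adj, s):
    """Hop distances from s in a graph given as an adjacency dict."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def is_alpha_approx_bfs_tree(adj, parent, s, alpha):
    """Check d_T(s, v) <= alpha * d_G(s, v) for every node, where the
    spanning tree T is given by parent pointers (parent[s] is None)."""
    dist_g = bfs_dist(adj, s)
    for v in adj:
        d_t, u = 0, v
        while u != s:            # walk up the tree to measure d_T(s, v)
            u = parent[u]
            d_t += 1
        if d_t > alpha * dist_g[v]:
            return False
    return True
```

On a 4-cycle with the tree being the path 0–1–2–3, node 3 has tree distance 3 but graph distance 1, so the tree is a 3-approximate but not a 2-approximate BFS tree.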
Results. Clearly, maintaining a BFS tree by recomputing it from scratch in
every recovery stage requires Θ(D) time. Our main results are partially dynamic
algorithms that break this time bound over a long run. They can maintain,
for any constant 0 < ε ≤ 1, a (1 + ε)-approximate BFS tree in time that is
sublinear in D when amortized over ω((n log D)/D) edge changes. To be precise, the
amortized update time over q edge changes is O((1 + n^{1/3}D^{2/3}/q^{1/3}) log D + n/q) in the
incremental setting and O((n^{1/7}D^{6/7}/q^{1/7}) log D + n^{7/2}/q) in the decremental one. For
the particular case of q = Ω(n), we get amortized update times of O(D^{2/3} log D)
and O(D^{6/7} log D) for the incremental and decremental cases, respectively. Our
algorithms do not require any prior knowledge about the dynamic network, e.g.,
D and q. We note that, while there is no previous literature on this problem,
one can parallelize the algorithm of Even and Shiloach [9] (also see [17,27]) to
obtain an amortized update time of O(nD/q + 1) over q changes in both the
incremental and the decremental setting. This bound is sublinear in D when
q = ω(n). Our algorithms give a sublinear time guarantee for a smaller number
of changes, especially in applications where D is large. Consider, for example,
an application where we want to maintain a BFS tree of the network under
link failures until the network diameter is larger than, say n/10 (at this point,
the network will alert an administrator). In this case, our algorithms guarantee

a sublinear amortized update time after a polylogarithmic number of failures


while previous algorithms cannot do so.
In the sequential (usual RAM) model, our technique also gives an incremental
(1 + ε)-approximation algorithm for the single-source shortest paths (SSSP)
problem with an amortized update time of O(mn^{2/5}/q^{3/5}) per insertion and
O(1) query time, where m is the number of edges in the final graph, and q is the
number of edge insertions. Prior to this result, only the classic exact algorithm
of Even and Shiloach [9] from the 80s, with O(mn/q) amortized update time,
was known. No further progress has been made in the last three decades, and
Roditty and Zwick [27] provided an explanation for this by showing that the al-
gorithm of [9] is likely to be the fastest combinatorial exact algorithm, assuming
that there is no faster combinatorial algorithm for Boolean matrix multiplica-
tion. Very recently, Bernstein and Roditty [5] showed that, in the decremental
setting, this bound can be broken if a small approximation is allowed. Our result
is the first one of the same spirit in the incremental setting; i.e., we break the
bound of Even and Shiloach for the case q = o(n^{3/2}).
Related Work. The problem of computing on dynamic networks is a classic
problem in the area of distributed computing, studied from as early as the 70s;
see, e.g. [4] and references therein. The main motivation is that dynamic networks
better capture real networks, which experience failures and additions of new
links. There is a large number of models of dynamic networks in the literature,
each emphasizing different aspects of the problem. Our model closely follows the
model of the sequential setting and, as discussed earlier, highlights the amortized
update time aspect. It is closely related to the model in [20] where the main goal
is to optimize the amortized update time using static algorithms in the recovery
stages. The model in [20] is still slightly different from us in terms of allowed
changes. For example, the model in [20] considers weighted networks and allows
small weight changes but no topological changes; moreover, the message size can
be unbounded (i.e., the static algorithm in the recovery stage operates under
the so-called LOCAL model). Another related model is the controlled dynamic
model (e.g. [19,2]), where the topological changes do not happen instantaneously
but are delayed until getting a permit to do so from the resource controller.
Our algorithms can be used in this model as we can delay the changes until
each recovery stage is finished. Our model is similar to, and can be thought
of as a combination of, two types of models: those in, e.g., [18,13,21,24] whose
main interest is to determine how fast a network can recover from changes using
static algorithms in the recovery stages, and those in, e.g., [4,1,8], which focus on
the amortized cost per edge change. Variations of partially dynamic distributed
networks have also been considered (e.g. [14,26,7,6]).
The problem of constructing a BFS tree has been studied intensively in var-
ious distributed settings for decades (see [25,23] and references therein). The
studies were also extended to more sophisticated structures such as minimum
spanning trees (e.g. [11]) and Steiner trees [16]. These studies usually focus on
static networks, i.e., they assume that the network never changes and want to
construct a BFS tree once, from scratch. While we are not aware of any results

on maintaining a BFS tree on dynamic networks, there are a few related results.
Much previous attention (e.g. [4]) has been paid on the problem of maintaining
a spanning tree. In a seminal paper by Awerbuch et al. [4], it was shown that
the amortized message complexity of maintaining a spanning tree can be sig-
nificantly smaller than the cost of the previous approach of recomputing from
scratch [1]. Our result is in the same spirit as [4] in breaking the cost of recom-
puting from scratch. An attempt to maintain spanning trees of small diameter
has also motivated a problem called best swap. The goal is to replace a failed edge
in the spanning tree by a new edge in such a way that the diameter is minimized.
This problem has recently gained considerable attention in both sequential (e.g.
[15,3]) and distributed (e.g. [12,10]) settings.
In the sequential dynamic graph algorithms literature, a problem similar to
ours is the single-source shortest paths (SSSP) problem on undirected graphs.
This problem has been studied in partially dynamic settings and has applications
to other problems, such as all-pairs shortest paths and reachability. As we have
mentioned earlier, the classic bound of [9], which might be optimal [27], has
recently been improved by a decremental approximation algorithm [5], and we
achieve a similar result in the incremental setting.

2 Main Technical Idea

All our algorithms are based on a simple idea of lazy updating. Implementing
this idea on different models requires modifications to cope with difficulties and
to maximize efficiency. In this section, we explain the main idea by sketching a
simple algorithm and its analysis for the incremental setting in the sequential
and the distributed model. We start with an algorithm that has additive error:
Let κ and δ be parameters. For every recovery stage t, we maintain a tree Tt
such that dNt(s, v) ≤ dTt(s, v) ≤ dNt(s, v) + κδ for every node v. We will do this
by recomputing a BFS tree from scratch O(q/κ + nD/δ²) times.
During the preprocessing, our algorithm constructs a BFS tree of N0 , denoted
by T0 . This means that every node u knows its parent and children in T0 and the
value of dT0 (s, u). Suppose that, in the first attack stage, an edge is inserted, say
(u, v) where dN0 (s, u) ≤ dN0 (s, v). As a result, the distances from v to s might
decrease, i.e. dN1 (s, v) < dN0 (s, v). In this case, the distances from s to some
other nodes (e.g. the children of v in T0 ) could decrease as well, and we may wish
to recompute the BFS tree. Our approach is to do this lazily: We recompute the
BFS tree only when the distance from v to s decreases by at least δ; otherwise,
we simply do nothing! In the latter case, we say that v is lazy. Additionally, we
regularly “clean up” by recomputing the BFS tree after every κ insertions.
To prove an additive error of κδ, observe that errors occur for this single
insertion only when v is lazy. Intuitively, this causes an additive error of δ since
we could have decreased the distance of v and other nodes by at most δ, but we
did not. This argument can be extended to show that if we have i lazy nodes,
then the additive error will be at most iδ. Since we do the cleanup every κ
insertions, the additive error will be at most κδ as claimed.
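A minimal sequential sketch of this lazy scheme (ours; the paper's distributed implementation differs in the details, and the class and method names are illustrative) may help: phase-start distances are kept stale, and a recomputation is triggered only by the periodic cleanup after κ insertions or by a violation of the lazy rule.

```python
from collections import deque

class LazyBFS:
    """Keep the phase-start distances d_{G0}(., s); recompute only after
    kappa insertions or when an inserted edge shortcuts some node by more
    than delta. Reported distances have additive error at most kappa*delta."""

    def __init__(self, adj, s, kappa, delta):
        self.adj, self.s = adj, s
        self.kappa, self.delta = kappa, delta
        self.recomputations = 0
        self._new_phase()

    def _new_phase(self):
        self.count = 0
        self.recomputations += 1
        dist = {self.s: 0}
        q = deque([self.s])
        while q:
            u = q.popleft()
            for v in self.adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        self.d0 = dist              # distances at the start of the phase

    def insert(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)
        self.count += 1
        if self.count == self.kappa:
            self._new_phase()       # periodic cleanup after kappa insertions
        elif max(self.d0[u], self.d0[v]) > min(self.d0[u], self.d0[v]) + self.delta:
            self._new_phase()       # lazy rule violated: recompute

    def estimate(self, x):
        return self.d0[x]           # possibly stale distance
```

On a path 0–1–…–9 rooted at 0 with δ = 3, inserting the edge (2, 4) keeps the algorithm lazy (the gap is only 2), while inserting (0, 9) forces a recomputation.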

For the number of BFS tree recomputations, first observe that the cleanup
clearly contributes O(q/κ) recomputations in total, over q insertions. Moreover,
a recomputation also could be caused by some node v, whose distance to s de-
creases by at least δ. Since every time a node v causes a recomputation, its dis-
tance decreases by at least δ, and dN0 (s, v) ≤ D, v will cause the recomputation
at most D/δ times. This naive argument shows that there are nD/δ recomputa-
tions (caused by n different nodes) in total. This analysis is, however, not enough
for our purpose. A tighter analysis, which is crucial to all our algorithms, relies
on the observation that when v causes a recomputation, the distance from v's
neighbor, say v′, to s also decreases by at least δ − 1. Similarly, the distance of v′'s
neighbor to s decreases by at least δ − 2, and so on. This leads to the conclusion
that one recomputation corresponds to δ + (δ−1) + (δ−2) + . . . = Ω(δ²) distance
decreases. Thus, the number of recomputations is at most nD/δ². Combining the
two bounds, we get that the number of BFS tree computations is O(q/κ + nD/δ²)
as claimed. We get the total time in the sequential and distributed models by
multiplying this number by m, the final number of edges, and by D (the time
for one BFS tree computation), respectively.
To convert the additive error into a multiplicative error of (1 + ε), we execute
the above algorithm only for nodes whose distances to s are greater than κδ/ε.
For other nodes, we can use the algorithm of Even and Shiloach [9] to maintain
a BFS tree of depth κδ/ε. This requires an additional time of O(mκδ/ε) in the
sequential model and O(nκδ/ε) in the distributed model.
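For the bounded-depth part, an Even–Shiloach-style structure exploits that in the incremental setting levels only ever decrease, so the total relaxation work per node is bounded by the depth cap. A minimal Python sketch of one insertion step (our simplification; the real structure also maintains parent pointers and supports deletions):

```python
from collections import deque

INF = float("inf")

def insert_and_relax(adj, level, u, v, cap):
    """Incremental step of a depth-bounded shortest-path tree in the spirit
    of Even and Shiloach [9]: after inserting (u, v), propagate distance
    decreases, tracking only nodes within distance `cap` of the root."""
    adj[u].add(v)
    adj[v].add(u)
    q = deque()
    for a, b in ((u, v), (v, u)):
        if level[a] + 1 < level[b] and level[a] + 1 <= cap:
            level[b] = level[a] + 1
            q.append(b)
    while q:
        x = q.popleft()
        for y in adj[x]:
            if level[x] + 1 < level[y] and level[x] + 1 <= cap:
                level[y] = level[x] + 1
                q.append(y)
    return level
```

Since each tracked node's level can only decrease and never drops below 0, the total work over all insertions is O(m · cap) in the sequential model, matching the O(mκδ/ε) term above.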
By setting κ and δ appropriately, the above algorithm immediately gives us
the claimed time bound for the sequential model. For incremental distributed
networks, we need one more idea called layering, where we use different values of
δ and κ depending on the distance of each node to s. In the decremental setting,
the situation is much more difficult, mainly because it is nearly impossible for a
node v to determine how much its distance to s has increased after a deletion.
Moreover, unlike the incremental case, nodes cannot simply “do nothing” when
an edge is deleted. We have to cope with this using several other ideas, e.g.,
constructing an imaginary tree (where edges sometimes represent paths).

3 Incremental Algorithm

We now present a framework for an incremental algorithm that allows up to


q edge insertions and provides an additive approximation of the distances to a
distinguished node s. Subsequently we will explain how to use this algorithm
to get (1 + ε)-approximations in the RAM model and the distributed model,
respectively. For simplicity we assume that the initial graph is connected. The
algorithm can be modified to work without this assumption within the same
running time. We defer the details to a full version of the paper.
The algorithm (see Algorithm 1) works in phases. At the beginning of every
phase we compute a BFS tree T0 of the current graph, say G0 . Every time an
edge (u, v) is inserted, the distances of some nodes to s in G might decrease.
Our algorithm tries to be as lazy as possible. That is, when the decrease does

not exceed some parameter δ, our algorithm keeps its tree T0 and accepts an
additive error of δ for every node. When the decrease exceeds δ, our algorithm
starts a new phase and recomputes the BFS tree. It also starts a new phase after
every κ edge insertions to keep the additive error limited to κδ. The algorithm
will answer a query for the distance from a node x to s by returning dG0 (x, s),
the distance from x to s at the beginning of the current phase. It can also return
the path from x to s in T0 of length dG0 (x, s). Besides δ and κ, the algorithm has
a third parameter X which indicates up to which distance from s the BFS tree
will be computed. In the following we denote by G0 the state of the graph at the
beginning of the current phase and by G we denote the current state of the graph
after all insertions. It is easy to see that the algorithm gives the desired additive

Algorithm 1. Incremental algorithm


1: procedure Insert(u, v)
2: k ←k+1
3: if k = κ then Initialize( )
4: if dG0 (u, s) > dG0 (v, s) + δ then Initialize( )
5: procedure Initialize( ) (Start new phase)
6: k ← 0
7: Compute BFS tree T of depth X rooted at s and current distances dG0 (·, s)

approximation by considering the shortest path of a node x to the root s in the
current graph G. By the main rule in Line 4 of the algorithm, the inequality
dG0 (u, s) ≤ dG0 (v, s) + δ holds for every edge (u, v) that was inserted since the
beginning of the current phase (otherwise a new phase would have been started).
Since at most κ edges have been inserted, the additive error is at most κδ.
Lemma 1. For every node x such that dG0 (x, s) ≤ X, Algorithm 1 maintains
the invariant dG (x, s) ≤ dG0 (x, s) ≤ dG (x, s) + κδ.
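The phase-based bookkeeping of Algorithm 1 fits in a few lines. The following is an illustrative Python sketch only, not the authors' implementation: it recomputes a full BFS in Initialize, omits the depth bound X (effectively X = n), and checks the Line-4 rule for both orientations of the undirected edge.

```python
from collections import deque

def bfs_dist(adj, s):
    """Unweighted BFS distances from s over an adjacency-set graph."""
    dist = {v: float("inf") for v in adj}
    dist[s] = 0
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if dist[y] == float("inf"):
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

class IncrementalSSSP:
    """Sketch of Algorithm 1: lazy phases with additive error at most kappa*delta."""

    def __init__(self, adj, s, kappa, delta):
        self.adj, self.s, self.kappa, self.delta = adj, s, kappa, delta
        self.initialize()

    def initialize(self):
        """Start a new phase: recompute the distances d_{G0}(., s)."""
        self.k = 0
        self.d0 = bfs_dist(self.adj, self.s)

    def insert(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)
        self.k += 1
        if self.k == self.kappa:
            self.initialize()                      # cap the error at kappa*delta
        elif abs(self.d0[u] - self.d0[v]) > self.delta:
            self.initialize()                      # rule of Line 4, both orientations

    def query(self, x):
        """Return d_{G0}(x, s); by Lemma 1, d_G <= query <= d_G + kappa*delta."""
        return self.d0[x]
```

On a path 0–1–…–5 rooted at 0, inserting the edge (1, 3) changes distances by at most δ and is handled lazily, while inserting (0, 5) violates the Line-4 rule and triggers a new phase.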
If we insert an edge (u, v) such that the inequality dG0 (u, s) ≤ dG0 (v, s) + δ
does not hold, we cannot guarantee the additive error anymore. Nevertheless
the algorithm makes a lot of progress in some sense: After the edge insertion, u
is linked to v whose initial distance to s was significantly smaller than the one
from u to s. This implies that the distance from u to s has decreased by at least
δ since the beginning of the current phase.
Lemma 2. If an edge (u, v) is inserted such that dG0 (u, s) > dG0 (v, s) + δ, then
dG0 (u, s) ≥ dG (u, s) + δ.
Since we consider undirected, unweighted graphs, a large decrease in distance
for one node also implies a large decrease in distance for many other nodes.
Lemma 3. Let G and G′ be unweighted, undirected graphs such that G is con-
nected, V (G) = V (G′ ), and E(G) ⊆ E(G′ ). If there is a node y such that
dG (y, s) ≥ dG′ (y, s) + δ, then ∑x∈V (G) dG (x, s) ≥ ∑x∈V (G′ ) dG′ (x, s) + Ω(δ²).
This is the key observation for the efficiency of our algorithm as it limits the
number of times a new phase starts, which is the expensive part of our algorithm.
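The quadratic drop of Lemma 3 can be checked on a concrete (hypothetical) instance: adding one shortcut edge to a path decreases the distance of the far endpoint by δ, and the distances of the Ω(δ) nodes near it each drop by Ω(δ) as well.

```python
from collections import deque

def dist_sum(adj, s):
    """Sum of BFS distances from s over all nodes."""
    dist = {v: float("inf") for v in adj}
    dist[s] = 0
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if dist[y] == float("inf"):
                dist[y] = dist[x] + 1
                queue.append(y)
    return sum(dist.values())

# Path 0 - 1 - ... - 20 rooted at s = 0.
n = 21
adj = {i: set() for i in range(n)}
for i in range(n - 1):
    adj[i].add(i + 1)
    adj[i + 1].add(i)

before = dist_sum(adj, 0)          # 0 + 1 + ... + 20 = 210
adj[0].add(n - 1)                  # shortcut edge (0, 20): node 20 drops
adj[n - 1].add(0)                  # from distance 20 to 1, i.e. delta = 19
after = dist_sum(adj, 0)           # sum of min(i, 21 - i) = 110
delta = 19
```

The total decrease is 100 ≥ δ²/4, matching the Ω(δ²) bound of the lemma.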
614 M. Henzinger, S. Krinninger, and D. Nanongkai

Lemma 4. If κ ≤ q and δ ≤ X, then the total running time of Algorithm 1 is
O(TBFS (X) · q/κ + TBFS (X) · nX/δ² + q), where TBFS (X) is the time needed for
computing a BFS tree up to depth X.

3.1 RAM Model


In the RAM model we use a standard approach for turning the additive κδ-
approximation of Algorithm 1 into a multiplicative (1 + ε)-approximation. We
use the Even-Shiloach algorithm for maintaining the exact distances to s up to
depth κδ/ε. Algorithm 1 now only has to be approximately correct for every node
x such that dG (x, s) ≥ κδ/ε, and indeed dG (x, s) + κδ ≤ dG (x, s) + εdG (x, s) =
(1 + ε)dG (x, s). The Even-Shiloach tree adds O(mκδ/ε) to the running time
of Lemma 4 (where computing a BFS tree takes time O(m)). We choose the
parameters κ and δ in a way that balances the dominant terms in the running
time.
Theorem 5. In the RAM model, there is an incremental (1 + ε)-approximate
SSSP algorithm for inserting up to q edges that has a total update time of
O(mn^{2/5} q^{2/5}/ε) where m is the number of edges in the final graph. It answers
distance and path queries in optimal worst-case time.

3.2 Distributed Model


In the distributed model, we use a different approach for obtaining the (1 + ε)-
approximation. We run log D parallel instances of Algorithm 1, where instance i
provides a (1 + ε)-approximation for nodes in the distance range from 2^i to 2^{i+1}.
For every such range, we can determine the targeted additive approximation A
that guarantees a (1 + )-approximation in that range. We set the parameters
κ and δ in a way that gives κδ = A and minimizes the running time of Algo-
rithm 1. Here, this approach is more efficient than the two-layered approach that
uses a single Even-Shiloach tree for small distances. The fact that the time for
computing a BFS tree of depth X is proportional to X in the distributed model
nicely fits into the running time of the multi-layer approach.
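The per-range additive budgets can be sketched as follows. The function name and the concrete budget A = ε · 2^i are illustrative assumptions: A is simply chosen so that an additive error of A is a (1 + ε) multiplicative error anywhere inside the range.

```python
def instance_params(D, eps):
    """One (lo, hi, A) triple per parallel instance of Algorithm 1.
    Instance i covers distances in [2^i, 2^(i+1)) and is run with an
    additive budget A = eps * 2^i, so d + A <= (1 + eps) * d for d >= lo."""
    params = []
    i = 0
    while 2 ** i <= D:
        lo = 2 ** i
        params.append((lo, 2 * lo, eps * lo))
        i += 1
    return params
```

For D = 100 this yields seven instances (i = 0, …, 6), and within every range the additive budget indeed implies a (1 + ε)-approximation.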
Theorem 6. In the distributed model, there is an incremental algorithm for
maintaining a (1 + ε)-approximate BFS tree under up to q edge insertions that
has a total running time of O((q^{2/3} n^{1/3} D^{2/3}/ε^{2/3} + q) log D + n), where D is the
initial diameter.

4 Decremental Algorithm
In the decremental setting we use an algorithm of the same flavor as in the
incremental setting (see Algorithm 2). However, the update procedure is more
complicated because it is not obvious which edge should be used to repair the
tree after a deletion. Our solution exploits the fact that in the distributed model
it is relatively cheap to examine the local neighborhood of every node. As in

Algorithm 2. Decremental algorithm


1: procedure Delete(u, v)
2: k ←k+1
3: if k = κ then Initialize( )
4: Delete edge (u, v) from F
5: Compute tree T from F and dG0 (·, s): T ← UpdateTree(F, dG0 (·, s))
6: if UpdateTree reports distance increase by at least δ then Initialize( )
7: procedure Initialize( ) (Start new phase)
8: k ← 0
9: Compute BFS tree T of depth X rooted at s. Set F ← T
10: Compute current distances dG0 (·, s)
11: procedure UpdateTree(F , dG0 (·, s))
12: At any time: U = {u ∈ V | u has no outgoing edge in F and u ≠ s}
13: while U ≠ ∅ do
14: for all u ∈ U do u marks every node x such that d^w_F (x, u) ≤ 3κδ
15: for all u ∈ U do (Search process)
16: u tries to find a node v by breadth-first search such that
17: (1) dG (u, v) + dG0 (v, s) ≤ dG0 (u, s) + δ
18: (2) v is not marked
19: (3) dG (u, v) ≤ (κ + 1)δ + 1
20: if such a node v could be found then
21: Add edge (u, v) of weight d^w_F (u, v) = dG (u, v) to F
22: if no such node v could be found for any u ∈ U then
23: return “distance increase by at least δ”
24: return F

the incremental setting, the algorithm has the parameters κ, δ, and X. The
tree update procedure of Algorithm 2 either computes a (weighted) tree T that
approximates the real distances with additive error κδ, or it reports a distance
increase by at least δ since the beginning of the current phase. Let T0 denote
the BFS tree computed at the beginning of the current phase and let F be
the forest resulting from removing those edges from T0 that have already been
deleted in the current phase. After every edge deletion, the tree update procedure
tries to rebuild a tree T by starting from F . Every node u that had a parent
in T0 but has no parent in F tries to find a “good” node v to reconnect to.
This process is repeated until F is a full tree again. Algorithm 2 imposes three
conditions (Lines 17-19) on a “good” node v. Condition (1) guarantees that
the error introduced by each reconnection is at most δ, (2) avoids that the
reconnections introduce any cycles, and (3) states that the node v should be
found relatively close to u. This is the key to efficiently find such a node.
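The three conditions on a "good" node v translate directly into a predicate. The following sketch uses hypothetical dictionaries for the current distances dG, the phase-start distances dG0, and the marked set; it is a restatement of Lines 17–19, not the authors' code.

```python
def is_good_target(u, v, dG, dG0, marked, kappa, delta):
    """Conditions (1)-(3) of Algorithm 2 for reconnecting a root u to v.
    dG[u][v] is the current distance between u and v; dG0[x] is the
    distance of x to s at the beginning of the phase."""
    return (dG[u][v] + dG0[v] <= dG0[u] + delta        # (1) error per step <= delta
            and v not in marked                         # (2) keeps F acyclic
            and dG[u][v] <= (kappa + 1) * delta + 1)    # (3) v is found nearby
```

For example, with κ = 2 and δ = 1 a node v at current distance 2 with dG0(v, s) = dG0(u, s) − 1 satisfies all three conditions, while any marked node is rejected by (2).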

4.1 Analysis of Tree Update Procedure

For the analysis of the tree update procedure of Algorithm 2, we assume that
the edges in F are directed. When we compute a BFS tree, we consider all edges
as directed towards the root. The weighted, directed distance from x to y in F
is denoted by d^w_F (x, y). We assume for the analysis that F initially contains all
nodes. By T we denote the graph returned by the algorithm. By Condition (1),
every reconnection made by the tree update procedure adds an additive error
of δ. In total there are at most κ reconnections (one per previous edge deletion)
and therefore the total additive error introduced is κδ.

Lemma 7. After every iteration, we have, for all nodes x and y such that
d^w_F (x, y) < ∞, that d^w_F (x, y) + dG0 (y, s) ≤ dG0 (x, s) + κδ.

By Condition (2) we avoid that the reconnection process introduces any cycle.
Ideally, a node u ∈ U should reconnect to a node v that is in the subtree of the
root s. We could achieve this if every node in U marked its whole subtree in
F . However, this would be too inefficient. Instead, marking the subtree up to a
limited depth (3κδ) is sufficient.

Lemma 8. After every iteration, the graph F is a forest.

We can show that the algorithm makes progress in every iteration. There is
always at least one node for which a “good” reconnection is possible that fulfills
the conditions of the algorithm. Even more, if such a node does not exist, then
there is a node whose distance to s has increased by at least δ since the beginning
of the current phase.

Lemma 9. In every iteration, if dG (x, s) ≤ dG0 (x, s) + δ for every node x ∈
U , then for every node u ∈ U with minimal dG0 (u, s), there is a node v ∈ V
such that (1) dG (u, v) + dG0 (v, s) ≤ dG0 (u, s) + δ, (2) v is not marked, and (3)
dG (u, v) ≤ (κ + 1)δ + 1.

The marking and the search process both take time O(κδ). Since there is at
least one reconnection in every iteration (unless the algorithm reports a distance
increase), there are at most κ iterations that take time O(κδ) each.

Lemma 10. The tree update procedure of Algorithm 2 either reports “distance
increase” and guarantees that there is a node x such that dG (x, s) > dG0 (x, s)+δ,
or it computes a tree T such that for every node x we have dG0 (x, s) ≤ dG (x, s) ≤
d^w_T (x, s) ≤ dG0 (x, s) + κδ. It runs in time O(κ²δ).

In the following we clarify some implementation issues of the tree update proce-
dure in the distributed model.
Weighted Edges. The tree computed by the algorithm contains weighted edges.
Such an edge e corresponds to a path P of the same distance in the network.
We implement weighted edges by a routing table for every node v that stores
the next node on P if a message is sent over v as part of the weighted edge e.
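These routing tables can be sketched as a per-node next-hop map keyed by the weighted edge; the data layout and function names below are illustrative, not part of the paper.

```python
def build_routing(path):
    """Routing entries so that a message for the weighted edge
    e = (path[0], path[-1]) is forwarded hop by hop along the path P."""
    e = (path[0], path[-1])
    table = {}
    for a, b in zip(path, path[1:]):
        table[(a, e)] = b              # at node a, traffic for e goes next to b
    return table

def route(table, e):
    """Follow the table from e[0] to e[1]; the hop count equals the weight."""
    hops = [e[0]]
    while hops[-1] != e[1]:
        hops.append(table[(hops[-1], e)])
    return hops
```

A weighted edge of weight 3 over the path 1–2–3–4 is then routed in exactly three hops.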
Avoiding Congestion. The marking can be done in parallel without congestion
because the trees in the forest F do not overlap. We avoid congestion during the
search as follows. If a node receives more than one message from its neighbors,
we always give priority to search requests originating from the node u with the
lowest distance dG0 (u, s) to s in G0 . By Lemma 9, we know that the search of
at least one of the nodes u with minimal dG0 (u, s) will always be successful.
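The priority rule amounts to each congested node granting only the request whose origin has minimum dG0(·, s). The message layout below is a hypothetical illustration of that rule.

```python
def grant_requests(incoming, d0):
    """incoming[node] lists the search origins whose requests reached node
    in this step; each node forwards only the request of the origin u with
    smallest d0[u] = dG0(u, s) (ties broken by name)."""
    return {node: min(origins, key=lambda u: (d0[u], u))
            for node, origins in incoming.items()}
```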
Reporting Deletions. The nodes that do not have a parent in F before the
update procedure starts do not necessarily know that a new edge deletion has
happened. Such a node only has to become active and do the marking and the
search if there is a change in its neighborhood of distance 3κδ, otherwise it can
still use the weighted edge in the tree T that it previously used because the three
conditions imposed by the algorithm will still be fulfilled. After the deletion of
an edge (x, y), the nodes x and y can inform all nodes at distance 3κδ in time
O(κδ), which is within our projected running time.

4.2 Analysis of Decremental Distributed Algorithm


The tree update procedure provides an additive approximation of the shortest
paths. It also provides a means for detecting that the distance of some node to
s has increased by at least δ since the beginning of the current phase. Using the
distance increase argument of Lemma 3, we can provide a running time analysis
for the decremental algorithm that is very similar to the incremental algorithm.

Lemma 11. If κ ≤ q and δ ≤ X, then the total running time of Algorithm 2 is
O(qX/κ + nX²/δ² + qκ²δ). It maintains the following invariant: If dG0 (x, s) ≤ X,
then dG0 (x, s) ≤ dG (x, s) ≤ d^w_T (x, s) ≤ dG0 (x, s) + κδ.

We use a similar approach as in the incremental setting to get the (1 + ε)-
approximation. We run log D parallel instances of the algorithm, where instance i
covers the distance range from 2^i to 2^{i+1}. By a careful choice of the parameters
κ and δ for each instance we can guarantee a (1 + ε)-approximation.
Theorem 12. In the decremental distributed dynamic network, we can main-
tain a (1 + ε)-approximate BFS tree over q edge deletions in a total running time
of O((q^{6/7} n^{1/7} D^{6/7}) log D + (n + q)/ε^{7/2}), where D is the maximum diameter.

5 Conclusion and Open Problems


In this paper, we showed that maintaining a breadth-first search spanning tree
can be done in an amortized update time that is sublinear in D in partially
dynamic networks. Many problems remain open. For example, can we get a sim-
ilar result for the case of fully dynamic networks? How about weighted networks
(even partially dynamic ones)? Can we also get a sublinear time bound for the
all-pairs shortest paths problem? Moreover, in addition to the sublinear time
complexity achieved in this paper, it is also interesting to obtain algorithms
with small bounds on message complexity and memory.

Acknowledgements. We thank the reviewers of ICALP 2013 for pointing to
related papers and to an error in an example given in the previous version.

References
1. Afek, Y., Awerbuch, B., Gafni, E.: Applying static network protocols to dynamic
networks. In: FOCS, pp. 358–370 (1987)
2. Afek, Y., Awerbuch, B., Plotkin, S.A., Saks, M.E.: Local management of a global
resource in a communication network. J. ACM 43(1), 1–19 (1996)
3. Alstrup, S., Holm, J., de Lichtenberg, K., Thorup, M.: Maintaining information in
fully dynamic trees with top trees. ACM Transactions on Algorithms 1(2), 243–264
(2005); announced at ICALP 1997 and SWAT 2000
4. Awerbuch, B., Cidon, I., Kutten, S.: Optimal maintenance of a spanning tree. J.
ACM 55(4) (2008); announced at FOCS 1990
5. Bernstein, A., Roditty, L.: Improved dynamic algorithms for maintaining approxi-
mate shortest paths under deletions. In: SODA, pp. 1355–1365 (2011)
6. Cicerone, S., D’Angelo, G., Stefano, G.D., Frigioni, D.: Partially dynamic efficient
algorithms for distributed shortest paths. Theor. Comput. Sci. 411(7-9), 1013–1037
(2010)
7. Cicerone, S., D’Angelo, G., Stefano, G.D., Frigioni, D., Petricola, A.: Partially dy-
namic algorithms for distributed shortest paths and their experimental evaluation.
JCP 2(9), 16–26 (2007)
8. Elkin, M.: A near-optimal distributed fully dynamic algorithm for maintaining
sparse spanners. In: PODC, pp. 185–194 (2007)
9. Even, S., Shiloach, Y.: An on-line edge-deletion problem. J. ACM 28(1), 1–4 (1981)
10. Flocchini, P., Enriques, A.M., Pagli, L., Prencipe, G., Santoro, N.: Point-of-failure
shortest-path rerouting: Computing the optimal swap edges distributively. IEICE
Transactions 89-D(2), 700–708 (2006)
11. Garay, J., Kutten, S., Peleg, D.: A sublinear time distributed algorithm for
minimum-weight spanning trees. SIAM J. on Computing 27, 302–316 (1998); an-
nounced at FOCS 1993
12. Gfeller, B., Santoro, N., Widmayer, P.: A distributed algorithm for finding all best
swap edges of a minimum-diameter spanning tree. IEEE Trans. Dependable Sec.
Comput. 8(1), 1–12 (2011); announced at DISC 2007
13. Hayes, T.P., Saia, J., Trehan, A.: The forgiving graph: a distributed data structure
for low stretch under adversarial attack. Distributed Computing 25(4), 261–278
(2012); announced at PODC 2009
14. Italiano, G.F.: Distributed algorithms for updating shortest paths. In:
WDAG(DISC), pp. 200–211 (1991)
15. Italiano, G.F., Ramaswami, R.: Maintaining spanning trees of small diameter. Al-
gorithmica 22(3), 275–304 (1998)
16. Khan, M., Kuhn, F., Malkhi, D., Pandurangan, G., Talwar, K.: Efficient distributed
approximation algorithms via probabilistic tree embeddings. Distributed Comput-
ing 25(3), 189–205 (2012); announced at PODC 2008
17. King, V.: Fully dynamic algorithms for maintaining all-pairs shortest paths and
transitive closure in digraphs. In: FOCS, pp. 81–91 (1999)
18. Korman, A.: Improved compact routing schemes for dynamic trees. In: PODC,
pp. 185–194 (2008)
19. Korman, A., Kutten, S.: Controller and estimator for dynamic networks. Inf. Com-
put. 223, 43–66 (2013)
20. Korman, A., Peleg, D.: Dynamic routing schemes for graphs with low local density.
ACM Transactions on Algorithms 4(4) (2008)
21. Krizanc, D., Luccio, F.L., Raman, R.: Compact routing schemes for dynamic ring
networks. Theory Comput. Syst. 37(5), 585–607 (2004)
22. Kuhn, F., Lynch, N.A., Oshman, R.: Distributed computation in dynamic net-
works. In: STOC, pp. 513–522 (2010)
23. Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann Publishers, San Francisco
(1996)
24. Malpani, N., Welch, J.L., Vaidya, N.H.: Leader election algorithms for mobile ad
hoc networks. In: DIAL-M, pp. 96–103 (2000)
25. Peleg, D.: Distributed computing: a locality-sensitive approach. SIAM, Philadel-
phia (2000)
26. Ramarao, K.V.S., Venkatesan, S.: On finding and updating shortest paths distribu-
tively. J. Algorithms 13(2), 235–257 (1992)
27. Roditty, L., Zwick, U.: On dynamic shortest paths problems. Algorithmica 61(2),
389–401 (2011); announced at ESA 2004
Locally Stable Marriage with Strict Preferences

Martin Hoefer1 and Lisa Wagner2


1
Max-Planck-Institut für Informatik and Saarland University, Germany
[email protected]
2
Dept. of Computer Science, RWTH Aachen University, Germany
[email protected]

Abstract. We study two-sided matching markets with locality of in-
formation and control. Each male (female) agent has an arbitrary strict
preference list over all female (male) agents. In addition, each agent is
a node in a fixed network. Agents learn about possible partners dynam-
ically based on their current network neighborhood. We consider con-
vergence of dynamics to locally stable matchings that are stable with
respect to their imposed information structure in the network. While ex-
istence of such states is guaranteed, we show that reachability becomes
NP-hard to decide. This holds even when the network exists only among
one side. In contrast, if only one side has no network and agents re-
member a previous match every round, reachability is guaranteed and
random dynamics converge with probability 1. We characterize this pos-
itive result in various ways. For instance, it holds for random memory
and for memory with the most recent partner, but not for memory with
the best partner. Also, it is crucial which partition of the agents has
memory. Finally, we conclude with results on approximating maximum
locally stable matchings.

1 Introduction

Matching problems form the basis of many assignment and allocation tasks en-
countered in computer science, operations research, and economics. A prominent
and popular approach in all these areas is stable matching, as it captures aspects
like distributed control and rationality of participants that arise in many as-
signment problems today. A variety of allocation problems in markets can be
analyzed within the context of two-sided stable matching, e.g., the assignment
of jobs to workers [2,5], organs to patients [18], or general buyers to sellers. In ad-
dition, stable marriage problems have been successfully used to study distributed
resource allocation problems in networks [9].
In this paper, we consider a game-theoretic model for matching with dis-
tributed control and information. Agents are rational agents embedded in a
(social) network and strive to find a partner for a joint relationship or activity,

Supported by DFG Cluster of Excellence MMCI and grant Ho 3831/3-1.
An extended full version of this paper can be found at
http://arxiv.org/abs/1207.1265

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 620–631, 2013.
© Springer-Verlag Berlin Heidelberg 2013
Locally Stable Marriage with Strict Preferences 621

e.g., to do sports, write a research paper, exchange data etc. Such problems are
of central interest in economics and sociology, and they act as fundamental coor-
dination tasks in distributed computer networks. Our model extends the stable
marriage problem, in which we have sets U and W of men and women. Each
man (woman) can match to at most one woman (man) and has a complete strict
preference list over all women (men). Given a matching M , a blocking pair is a
man-woman pair such that both strictly improve by matching to each other. A
matching without blocking pair is a stable matching.
A central assumption in stable marriage is that every agent knows all agents
it can match to. In reality, however, agents often have limited information about
their matching possibilities. For instance, in a large society we would not expect
a man to match up with any other woman immediately. Instead, there exist re-
strictions in terms of knowledge and information that allow some pairs to match
up directly, while others would have to get to know each other before being
able to start a relationship. We incorporate this aspect by assuming that agents
are embedded in a fixed network of links. Links represent an enduring knowl-
edge relation that is not primarily under the control of the agents. Depending
on the interpretation, links could represent, e.g., family, neighbor, colleague or
teammate relations. Each agent strives to build one matching edge to a partner.
The set of links and edges defines a dynamic information structure based on
triadic closure, a standard idea in social network theory: If two agents have a
common friend, they are likely to meet and learn about each other. Translated
into our model this implies that each agent can match only to partners in its
2-hop neighborhood of the network of matching edges and links. Then, a local
blocking pair is a blocking pair of agents that are at hop distance at most 2 in
the network. Consequently, a locally stable matching is a matching without local
blocking pairs. Local blocking pairs are a subset of blocking pairs. In turn, every
stable matching is a locally stable matching, because it allows no (local or global)
blocking pairs. Thus, one might be tempted to think that locally stable match-
ings are easier to find and/or reach using distributed dynamics than ordinary
stable matchings. In contrast, we show in this paper that locally stable match-
ings have a rich structure and can behave quite differently than ordinary stable
matchings. Our study of locally stable matching with arbitrary strict preferences
significantly extends recent work on the special case of correlated or weighted
matching [11], in which preferences are correlated via matching edge benefits.

Contribution. We concentrate on the important case of two-sided markets, in
which a (locally) stable matching is always guaranteed to exist. Our primary
interest is to characterize convergence properties of iterative round-based dy-
namics with distributed control, in which in each round a local blocking pair is
resolved. We focus on the Reachability problem: Given an instance and an
initial matching, is there a sequence of local blocking pair resolutions leading
to a locally stable matching? In Section 3 we see that there are cases, in which
a locally stable matching might never be reached. This is in strong contrast to
the case of weighted matching, in which it is easy to show convergence of every
sequence of local blocking pair resolutions with a potential function. In fact, it
622 M. Hoefer and L. Wagner

is NP-hard to decide Reachability, even if the network exists only among one
partition of agents. Moreover, there exist games and initial matchings such that
every sequence of local blocking pairs terminating in a locally stable matching is
exponentially long. Hence, Reachability might even be outside NP. If we need
to decide Reachability for a given initial matching and a specific locally stable
matching to be reached, the problem is even NP-hard for correlated matching.
Our NP-hardness results hold even if the network exists only among one par-
tition. In Section 4, we concentrate on a more general class of games in which
links exist in one partition and between partitions (i.e., one partition has no
internal links). This is a natural assumption when considering objects that do
not generate knowledge about each other, e.g., when matching resources to net-
worked nodes or users, where initially resources are only known to a subset of
users. Here we characterize the impact of memory on distributed dynamics. For
recency memory, each agent remembers in every round the most recent partner
that is different from the current one. With recency memory, Reachability
is always true, and for every initial matching there exists a sequence of poly-
nomially many local or remembered blocking pairs leading to a locally stable
matching. In contrast, for quality memory where all agents remember their best
partner Reachability stays NP-hard. This formally supports the intuition that
recency memory is more powerful than quality memory, as the latter yields agents
that are “hung up” on preferred but unavailable partners. This provides a novel
distinction between recency and quality memory.
Our positive results for recency memory in Section 4 imply that if we pick lo-
cal blocking pairs uniformly at random in each step, we achieve convergence with
probability 1. The proof relies only on the memory of one partition. In contrast,
if only the other partition has memory, we obtain NP-hardness of Reachabil-
ity. Convergence with probability 1 can also be guaranteed for random mem-
ory if in each round each agent remembers one of his previous matches chosen
uniformly at random. The latter result holds even when links exist among or
between both partitions. However, using known results on stable marriage with
full information [1], convergence time can be exponential with high probability,
independently of any memory.
In contrast to ordinary stable matchings, two locally stable matchings can
have very different size. This motivates our search for maximum locally stable
matchings in Section 5. While a simple 2-approximation algorithm exists, we can
show a non-approximability result of 1.5 − ε under the unique games conjecture.
For spatial reasons most of the proofs are omitted but can be found in the full
version.

Related Work. Locally stable matchings were introduced by Arcaute and Vassil-
vitskii [2] in a two-sided job-market model, in which links exist only among one
partition. The paper uses strong uniformity assumptions on the preferences and
addresses the lattice structure for stable matchings and a local Gale-Shapley
algorithm. More recently, we studied locally stable matching with correlated
preferences in the roommates problem, where arbitrary pairs of agents can be
matched [11]. Using a potential function argument, Reachability is always
true and convergence guaranteed. Moreover, for every initial matching there
is a polynomial sequence of local blocking pairs that leads to a locally stable
matching. The expected convergence time of random dynamics, however, is ex-
ponential. If we restrict to resolution of pairs with maximum benefit, then for
random memory the expected convergence time becomes polynomial, but for
recency or quality memory convergence time remains exponential, even if the
memory is of polynomial size.
For an introduction to stable marriage and some of its variants we refer the
reader to several books in the area [10, 19]. There is a significant literature on
dynamics, especially in economics, which is too broad to survey here. These
works usually do not address issues like computational complexity or worst-
case bounds. We focus on a subset of prominent analytical works related to
our scenario. For the stable marriage problem, it is known that better-response
dynamics, in which agents sequentially deviate to blocking pairs, can cycle [16].
On the other hand, Reachability is always true, and for every initial matching
there exists a sequence of polynomially many steps to a stable matching [20]. If
blocking pairs are chosen uniformly at random at each step, convergence time is
exponential [1] in the worst case.
In the roommates problem, in which every pair of agents can be matched, sta-
ble matchings can be absent. Deciding existence and computing stable matchings
if they exist can be done in polynomial time [14]. In addition, if a stable matching
exists, then Reachability is always true [8]. A similar statement can be made
even more generally for relaxed concepts like P -stable matchings that always
exist [12]. Ergodic sets of the underlying Markov chain have been studied [13]
and related to random dynamics [15]. In addition, for computing (variants of)
stable matchings via iterative entry dynamics see [4–6].
The problem of computing a maximum locally stable matching has recently
been considered in [7]. In addition to characterizations for special cases, an NP-
hardness result is shown, as well as non-approximability of (21/19 − ε) unless P = NP.
Computing maximum stable matchings with ties and incomplete lists has gener-
ated a significant amount of research interest over the past decade. The currently
best results are a 1.5-approximation algorithm [17] and (4/3 − ε)-hardness under
the unique games conjecture [21].

2 Preliminaries

A network matching game (or network game) consists of a (social) network N =
(V, L), where V is a set of vertices representing agents and L ⊆ {{u, v} | u, v ∈
V, u ≠ v} is a set of fixed links. A set E ⊆ {{u, v} | u, v ∈ V, u ≠ v} defines the
potential matching edges. A state is a matching M ⊆ E such that for each v ∈ V
we have |{e | e ∈ M, v ∈ e}| ≤ 1. An edge e = {u, v} ∈ M provides utilities
bu (e), bv (e) > 0 for u and v, respectively. If for every e ∈ E we have bu (e) =
bv (e) = b(e) > 0, we speak of correlated preferences or a correlated network game.
Otherwise, we will assume that each agent has a total order ≻ over its possible
matching partners, and for every agent the utility of matching edges is given
according to this ranking. Throughout the paper we focus on the two-sided or
bipartite case, which is often referred to as the stable marriage problem, where V
is divided into two disjoint sets U and W such that E ⊆ {{u, w}| u ∈ U, w ∈ W }.
Note that this does not imply that N is bipartite. If further the vertices of U
are isolated in N , we speak of a job-market game for consistency with [2, 11].
To describe stability in network matching games, we assume agents u and
v are accessible in state M if they have a distance of at most 2 in the graph
G = (V, L ∪ M ). A state M has a local blocking pair e = {u, v} ∈ E if u and v
are accessible and are each either unmatched in M or matched through an edge
e such that e serves a strictly smaller utility than e. Thus, in a local blocking
pair both agents can strictly increase their utility by generating e (and possibly
dismissing some other edge thereby). A state M that has no local blocking pair
is a locally stable matching.
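The definitions above can be made concrete. The following sketch enumerates the local blocking pairs of a state; it is illustrative code, not the authors' algorithm, and it assumes complete strict preference lists given best-first.

```python
def local_blocking_pairs(men, women, pref, links, matching):
    """Return all local blocking pairs (u, w) of the state `matching`.
    pref[a] lists a's possible partners, best first; `links` is the fixed
    network L; `matching` is a set of frozensets (the edges of M)."""
    edges = {frozenset(e) for e in links} | set(matching)
    adj = {}
    for e in edges:
        a, b = tuple(e)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def accessible(u, v):              # hop distance <= 2 in (V, L ∪ M)
        n1 = adj.get(u, set())
        return v in n1 or any(v in adj.get(x, set()) for x in n1)

    partner = {}
    for e in matching:
        a, b = tuple(e)
        partner[a], partner[b] = b, a

    def improves(a, b):                # a strictly prefers b over a's current state
        cur = partner.get(a)
        return cur is None or pref[a].index(b) < pref[a].index(cur)

    return {(u, w) for u in men for w in women
            if frozenset((u, w)) not in matching
            and accessible(u, w) and improves(u, w) and improves(w, u)}
```

A state is a locally stable matching exactly when this function returns the empty set.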
Most of our analysis concerns iterative round-based dynamics, where in each
round we pick one local blocking pair, add it to M , and remove all edges that
conflict with this new edge. We call one such step a local improvement step.
With random dynamics we refer to the process when in each step the local
blocking pair is chosen uniformly at random from the ones available. A local
blocking pair {u, v} that is resolved, but is not a link as well, must be connected
by some distance-2 path (u, w, v) in G = (V, L ∪ M ) before the step. This path can consist of
two links, or of exactly one link and one matching edge. In the latter case, let
w.l.o.g. {u, w} be the matching edge. As u can have only one matching edge, the
local improvement step will delete {u, w} to create {u, v}. For simplicity, we will
refer to this fact as “an edge moving from {u, w} to {u, v}” or “u’s edge moving
from w to v”.
In subsequent sections we will assume that agents have memory that allows
to “remember” one matching partner from a former round. In this case, a pair
{u, v} of agents becomes accessible not only by a distance-2 path in G, but
also when u appears in the memory of v. Hence, in this case a local blocking
pair can be based solely on access through memory. For random memory, we
assume that in every round each agent remembers a previous matching partner
chosen uniformly at random. For recency memory, each agent remembers the last
matching partner that is different from the current partner. For quality memory,
each agent remembers the best previous matching partner.
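The three update rules can be pinned down by tiny functions; in this sketch (our own; `history` is an assumed list of past partners, newest last, and `prefs` an assumed best-first preference list) each rule returns the remembered partner, or None if nothing applies:

```python
import random

def quality_memory(history, prefs):
    """Remember the best previous partner; prefs lists partners best-first."""
    return min(history, key=prefs.index) if history else None

def recency_memory(history, current):
    """Remember the most recent partner different from the current one."""
    for p in reversed(history):
        if p != current:
            return p
    return None

def random_memory(history):
    """Remember a previous partner chosen uniformly at random each round."""
    return random.choice(history) if history else None
```

For example, with history ['a', 'b', 'c'] and preference order ['b', 'a', 'c'], quality memory returns 'b'; recency memory with current partner 'c' also returns 'b'.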

3 Complexity of Reachability and Duration

Complexity. In contrast to ordinary stable marriage, there exist small examples
showing that Reachability does not always hold for locally stable matchings
(see [11, Thm. 5] or our circling gadget in the full version). Here we consider
the complexity of this problem and show NP-hardness results when agents have
strict preferences. This is in contrast to correlated network games, where
Reachability always holds and for every initial matching there is always a
sequence to a locally stable matching of polynomial length [11]. Still, we show
here that deciding Reachability for a given particular matching is also
NP-hard, even for correlated
Locally Stable Marriage with Strict Preferences 625

job-market games. We present the proof of the latter result in detail, as it
provides the basic idea for the omitted NP-hardness proofs as well.

Theorem 1. It is NP-hard to decide Reachability from the initial matching
M = ∅ to a given locally stable matching in a correlated network game.

Proof. We use a reduction from 3Sat. Given a 3Sat formula with k variables
x_1, . . . , x_k and l clauses C_1, . . . , C_l, where clause C_j contains the
literals l_1^j, l_2^j and l_3^j, we have

U = {u_{x_i} | i = 1, . . . , k} ∪ {u_{C_j} | j = 1, . . . , l} ∪ {b_h | h = 1, . . . , k + l − 1},
W = {v_{x_i}, x_i, x̄_i | i = 1, . . . , k} ∪ {v_{C_j} | j = 1, . . . , l} ∪ {a, a_1}.

For the social links see the picture shown below.

[Figure: the vertices u_{C_1}, b_1, . . . , u_{C_l}, b_l, u_{x_1}, b_{l+1}, . . . , u_{x_k}
appear in a row; vertex a, together with a_1, connects to the literal vertices
x_1, x̄_1, . . . , x_k, x̄_k; below them the vertices v_{x_1}, . . . , v_{x_k},
v_{C_1}, . . . , v_{C_l} are shown.]

We do not restrict the set of matching edges, but assume that every edge not
appearing in the list below has benefit less than 1 (resulting in them being
irrelevant for the dynamics). The other benefits are given as follows.

u ∈ U     w ∈ W                   b({u, w})
u_{C_j}   a                       j                 j = 1, . . . , l
u_{x_i}   a                       i + l             i = 1, . . . , k
b_h       a                       h + 1/2           h = 1, . . . , k + l − 1
u_{C_j}   l_1^j / l_2^j / l_3^j   k + l + 1         j = 1, . . . , l
u_{x_i}   x_i / x̄_i               k + l + 1         i = 1, . . . , k
u_{C_j}   v_{x_i}                 k + l + 1 + i     i = 1, . . . , k, j = 1, . . . , l
u_{x_i}   v_{x_{i′}}              k + l + 1 + i′    i = 1, . . . , k, i′ = 1, . . . , i
u_{C_j}   v_{C_{j′}}              2k + l + 1 + j′   j = 1, . . . , l, j′ = 1, . . . , j

Our goal is to reach M∗ = {{u_s, v_s} | s ∈ {x_1, . . . , x_k} ∪ {C_1, . . . , C_l}}.
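For concreteness, the benefit table can be generated programmatically. The following sketch uses our own encoding (a literal +i stands for x_i and −i for its negation x̄_i; vertex names are strings); note how the h + 1/2 benefits interleave the b-vertices between consecutive u-vertices at a:

```python
def build_benefits(k, clauses):
    """Benefit table of the reduction for a 3-SAT instance with k variables
    and clauses given as lists of signed literals (our own encoding)."""
    l = len(clauses)
    b = {}
    for j in range(1, l + 1):                      # u_Cj -- a
        b[(f"u_C{j}", "a")] = j
    for i in range(1, k + 1):                      # u_xi -- a
        b[(f"u_x{i}", "a")] = i + l
    for h in range(1, k + l):                      # b_h -- a
        b[(f"b{h}", "a")] = h + 0.5
    for j, clause in enumerate(clauses, start=1):  # u_Cj -- its three literals
        for lit in clause:
            w = f"x{lit}" if lit > 0 else f"xbar{-lit}"
            b[(f"u_C{j}", w)] = k + l + 1
    for i in range(1, k + 1):                      # u_xi -- x_i / xbar_i
        b[(f"u_x{i}", f"x{i}")] = k + l + 1
        b[(f"u_x{i}", f"xbar{i}")] = k + l + 1
    for i in range(1, k + 1):
        for j in range(1, l + 1):                  # u_Cj -- v_xi
            b[(f"u_C{j}", f"v_x{i}")] = k + l + 1 + i
        for ip in range(1, i + 1):                 # u_xi -- v_xi'
            b[(f"u_x{i}", f"v_x{ip}")] = k + l + 1 + ip
    for j in range(1, l + 1):
        for jp in range(1, j + 1):                 # u_Cj -- v_Cj'
            b[(f"u_C{j}", f"v_C{jp}")] = 2 * k + l + 1 + jp
    return b
```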


First, note that additional matching edges can only be introduced at {u_{C_1}, a}.
Furthermore, once a vertex u_y, y ∈ {x_1, . . . , x_k} ∪ {C_1, . . . , C_l}, is matched to
a vertex other than a, it blocks the introduction of any edge for a vertex lying
behind u_y on the path from u_{C_1} to u_{x_k}. Also, the vertices b_h prevent an edge
626 M. Hoefer and L. Wagner

from moving on from one u-vertex to another after it has left a. Thus, at the time
when an edge to a clause u-vertex is created that still exists in the final matching
(but is connected to some vCj then), the edges for all variable u-vertices must
have been created already.
Assume that the 3Sat formula is satisfiable. Then we first create a matching
edge at {u_{C_1}, a}, move it over the u- and b-vertices to u_{x_k}, and then move it
into the branching to the one of x_k or x̄_k that negates its value in the satisfy-
ing assignment. Similarly, one after the other (in descending order), we create
a matching edge at a for each of the variable u-vertices and move it into the
branching to the variable vertex that negates its value in the satisfying assign-
ment. As every clause is fulfilled, at least one of the three vertices that yield an
improvement for the clause u-vertex from a is not blocked by a matching edge to
a variable u-vertex. Then, the edges to clause u-vertices can bypass the existing
edges (again, one after the other in descending order) and reach their positions
in M ∗ . After that, the variable-edges can leave the branching and move to their
final position in the same order as before.
Now assume that we can reach M ∗ from ∅. We note that the edges to clause
u-vertices have to overtake the edges to variable u-vertices somewhere on the
way to reach their final position. The only place to do so is in the branching
leading over the x_i and x̄_i. Thus all variable-edges have to wait at some x_i or
x̄_i until the clause-edges have passed. But from a, vertex u_{x_i} is only willing to
switch to x_i or x̄_i. Thus, every such vertex blocks out a different variable (either in
its true or in its false value). Similarly, a vertex u_{C_j} will only move further from
a if it can reach one of its literals. Hence, if all clauses can bypass the variables,
then for every clause there was one of its literals left open for passage. Thus, if
we set each variable to the value that yields the passage for clause-edges in the
branching, we obtain a satisfying assignment. &
%

Corollary 1. It is NP-hard to decide Reachability to a given locally stable
matching in a correlated job-market game.

Theorem 2. It is NP-hard to decide Reachability from the initial matching
M = ∅ to an arbitrary locally stable matching in a bipartite network game.

Length of Sequences. We now consider the number of improvement steps required
to reach locally stable matchings. In general, there is a network game and an
initial matching such that we need an exponential number of steps before reach-
ing any locally stable matching. This is again in contrast to the correlated case,
where every reachable stable matching can be reached in a polynomial number
of steps. We present the latter result and defer the much more technical proof
of the general lower bound to the full version.

Theorem 3. For every network game with correlated preferences, every locally
stable matching M∗ ⊆ E and initial matching M0 ⊆ E such that M∗ can be
reached from M0 through local improvement steps, there exists a sequence of at
most O(|E|³) local improvement steps leading from M0 to M∗.

Proof. Consider an arbitrary sequence between M0 and M∗. We will explore
which steps in the sequence are necessary and which parts can be omitted. We
rank all edges by their benefit (allowing multiple edges to have the same rank)
such that r(e) > r(e′) iff b(e) > b(e′), and set r_max = max{r(e) | e ∈ E}. Recall
from Section 2 that we can account for edges in such a way that every edge e has at
most one direct predecessor e′ in the sequence, which was necessary to build e.
Because e was a local blocking pair, we know r(e′) < r(e). Thus, every edge e
has at most r_max predecessors. Our proof is based on two crucial observations:

(1) An edge can only be deleted by a stronger edge; that is, every chain of one
edge deleting the next is limited in length by r_max.
(2) If an edge is created, then possibly moved, and finally deleted without delet-
ing an edge on its way, this edge would not have to be introduced in the first
place.

Suppose our initial matching is the empty matching, then every edge in the
locally stable matching has to be created and by (repeated application of) (2)
we only need to create and move edges that are needed for the final matching.
Thus we have |M∗| edges, each of which makes at most r_max steps.
Now if we start with an arbitrary matching, the sequence might be forced
to delete some edges that cannot be used for the final matching. Each of these
edges generates a chain of edges deleting each other throughout the sequence,
but (1) tells us that this chain is limited as well as the number of steps each
of these edges has to make. The only remaining issue is what happens to edges
“accidentally” deleted during this procedure. Again, we can use (2) to argue
that there is no reason to rebuild such an edge just to delete it again. Thus, such
deletions can happen only once for every edge we had in M0 (not necessarily
on the position it had in M0 ). It does not do any harm if it happens to an
edge of one of the deletion-chains, as it would just end as desired. For the edges
remaining in M∗ the same bound holds as before. Thus, we have an overall
bound of |M0| · r_max · r_max + |M∗| · r_max ∈ O(|E|³) steps, where the first term
results from the deletion chains and the second one from the edges surviving in
the final matching. ⊓⊔

Theorem 4. There is a network game with strict preferences such that a locally
stable matching can be reached by a sequence of local improvement steps from
the initial matching M = ∅, but every such sequence has length 2^{Ω(|V|)}.

4 Memory

Given the impossibility results in the last section, we now focus on the impact of
memory. As a direct initial result, no kind of memory can guarantee reachability
of a given locally stable matching, even in a correlated job-market game.
Corollary 2. It is NP-hard to decide Reachability to a given locally stable
matching in a correlated job-market game with any kind of memory.

Let us instead concentrate on the impact of memory on reaching an arbitrary
locally stable matching. For our treatment we will focus on the case in which
the network links satisfy L ⊆ (W × W ) ∪ (U × W ). We assume that in every step, every
agent remembers one previous matching partner.

Quality Memory. With quality memory, each agent remembers the best match-
ing partner he ever had before. While this seems quite a natural choice and
appears like a smart strategy, it can be easily fooled by starting with a much-
liked partner, who soon after matches with someone more preferred and never
becomes available again. This way, the memory becomes useless, which leaves us
with the same dynamics as before.
Proposition 1. There is a network game with strict preferences, links L ⊆
(W × W ) ∪ (U × W ), quality memory and initial matching M = ∅ such that
no locally stable matching can be reached with local improvement steps from M .
This even holds if every agent remembers the best k previous matches.
Theorem 5. It is NP-hard to decide Reachability to an arbitrary locally sta-
ble matching in a network game with quality memory.

Recency Memory. With recency memory, each agent remembers the last partner
he has been matched to. This is again a very natural choice, as it mirrors the
human tendency to remember the most recent events best. Interestingly, here
we actually can ensure that a locally stable matching can be reached.
Theorem 6. For every network game with strict preferences, links L ⊆ (U ×
W ) ∪ (W × W ), recency memory and every initial matching, there is a sequence
of O(|U|²|W|²) many local improvement steps to a locally stable matching.
Proof. Our basic approach is to construct the sequence in two phases, similarly
to [1]. In the first phase, we let the matched vertices from U improve, but
ignore the unmatched ones. In the second phase, we make sure that vertices from
W have improved after every round.
Preparation phase: As long as there is at least one u ∈ U with u matched and
u part of a blocking pair, allow u to switch to the better partner.
The preparation phase terminates after at most |U | · |W | steps, as in every
round one matched u ∈ U strictly improves in terms of preference. This can hap-
pen at most |W | times for each matched u. In addition, the number of matched
vertices from U only decreases.
Memory phase: As long as there is a u ∈ U with u part of a blocking pair,
pick u and execute a sequence of local improvement steps involving u until u is
not part of any blocking pair anymore. For every edge e = {u′, w} with u′ ≠ u
that was deleted during the sequence, recreate e from the memory of u′.
We claim that if we start the memory phase after the preparation phase, at the
end of every round we have the following invariants: The vertices from W that
have been matched before are still matched, they do not have a worse partner
than before, and at least one of them is matched strictly better than before.
Also, only unmatched vertices from U are involved in local blocking pairs.

Obviously, at the end of the preparation phase the only U -vertices in local
blocking pairs are unmatched, i.e., initially only unmatched U -vertices are part
of local blocking pairs. Let u be the vertex chosen in the following round of
the memory phase. At first we consider the outcome for w ∈ W . If w is the
vertex matched to u in the end, then w clearly has improved. Otherwise w gets
matched to its former partner (if it had one) through memory and thus has the
same utility as before. In particular, every w that represents an improvement
to some u but was blocked by a higher ranked vertex still remains blocked.
Together with the fact that u plays local improvement steps until it is not part
of a local blocking pair anymore, this guarantees that all matched U -vertices
cannot improve at the end of the round. As one W -vertex improves in every
round, we have at most |U | · |W | rounds in the memory phase, where every
round consists of at most |W | steps by u and at most |U | − 1 edges reproduced
from memory. &
%

The existence of sequences to (locally) stable matchings also implies that ran-
dom dynamics converge in the long run with probability 1 [8, 12, 20]. In general,
we cannot expect fast convergence here, as there are instances where random
dynamics yield an exponential sequence with high probability even if all infor-
mation is given – e.g., reinterpret the instance from [1] with L = U × W , then
every agent knows every possible partner and memory has no effect.
Observe that the previous proof relies only on the recency memory of the agents
in U. Hence, the existence of short sequences holds even if only agents from U have
memory. In contrast, if only agents from W have recency memory, the previous
NP-hardness constructions can be extended.

Theorem 7. It is NP-hard to decide Reachability to an arbitrary locally stable
matching in a network game with links L ⊆ (U × W ) ∪ (W × W ) and recency
memory only for agents in W.

Random Memory. Finally, with random memory, each agent remembers a part-
ner chosen uniformly at random in each step. We consider the question of
reaching a locally stable matching from every starting state, even in general net-
work games. While we cannot expect fast convergence, we can show that random
memory helps with reachability:

Theorem 8. For every network game with random memory, random dynamics
converge to a locally stable matching with probability 1.

5 Maximum Locally Stable Matchings


As the size of locally stable matchings can vary significantly – up to the point
where the empty matching as well as a matching that includes every vertex is
locally stable – it is desirable to target locally stable matchings of maximal size.
We address the computational complexity of finding maximum locally stable
matchings by relating it to the independent set problem.

Theorem 9. For every graph G = (V, E) there is a job-market game that admits
a maximum locally stable matching of size |V| + k if and only if G contains a
maximum independent set of size k.

Proof. Given a graph G = (V, E), |V| = n, we construct the job-market game
with network N = (V′ = U ∪ W, L). For every v ∈ V we have u_{v,1}, u_{v,2} ∈ U and
w_{v,1}, w_{v,2} ∈ W. We have the links {w_{v,1}, w_{v′,2}} and {w_{v′,2}, w_{v,2}} if v′ ∈ N(v).
We allow matching edges {u_{v,1}, w_{v,1}}, {u_{v,1}, w_{v′,2}} for v′ ∈ N(v), {u_{v,1}, w_{v,2}}
and {u_{v,2}, w_{v,2}}. Each u_{v,1} prefers w_{v,2} to every w_{v′,2}, v′ ∈ N(v), and every
w_{v′,2} to w_{v,1}. The preferences between the different neighbors can be chosen
arbitrarily. Each w_{v,2} prefers u_{v,1} to every u_{v′,1}, v′ ∈ N(v), and every u_{v′,1} to
u_{v,2}. Again the neighbors can be ordered arbitrarily. Vertices w_{v,1} and u_{v,2} have
only one possible matching partner.
We claim that G has a maximum independent set of size k iff N has a locally
stable matching of size n + k.
Let S be a maximum independent set in G. Then M = {{u_{v,1}, w_{v,2}} | v ∈
V \ S} ∪ {{u_{v,1}, w_{v,1}}, {u_{v,2}, w_{v,2}} | v ∈ S} is a locally stable matching, as the
edges {u_{v,1}, w_{v,2}} are always stable. For the other vertices, the independent set
property tells us that for v ∈ S all vertices v′ ∈ N(S) generate stable edges
{u_{v′,1}, w_{v′,2}} that keep u_{v,1} from switching to w_{v′,2}. Thus {u_{v,1}, w_{v,1}} is stable
and w_{v,2} cannot see u_{v,1}, which stabilizes {u_{v,2}, w_{v,2}}.
Now let M be a maximum locally stable matching for the job-market game.
Further, we choose M such that every u_{v,1} is matched, which is possible as re-
placing a matching partner of w_{v,2} by (the unmatched) u_{v,1} will not generate
instabilities or lower the size of M. We note that no u_{v,1} is matched to some
w_{v′,2} with v ≠ v′, as from there u_{v,1} and w_{v,2} can see each other and thus con-
stitute a blocking pair. Then, for S = {v | {u_{v,2}, w_{v,2}} ∈ M}, |S| = |M| − n and S is
an independent set, as every u_{v,2} can only be matched to its vertex w_{v,2}, which
means that u_{v,1} must be matched to w_{v,1}. But this edge is only stable if every
w_{v′,2}, v′ ∈ N(v), is blocked by u_{v′,1}. Hence N(v) ∩ S = ∅ for every v ∈ S. ⊓⊔
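The reduction itself is mechanical; this sketch (our own encoding with tuples like ("u", v, 1) for u_{v,1}; the preference lists are omitted for brevity) builds the link set and the allowed matching edges from an adjacency dictionary:

```python
def build_game(G):
    """Sketch of the Theorem 9 construction from an undirected graph G,
    given as an adjacency dict {v: set of neighbors} (our own encoding)."""
    links, match_edges = set(), set()
    for v, nbrs in G.items():
        # Matching edges local to v: u_{v,1}-w_{v,1}, u_{v,1}-w_{v,2}, u_{v,2}-w_{v,2}.
        match_edges.add(frozenset({("u", v, 1), ("w", v, 1)}))
        match_edges.add(frozenset({("u", v, 1), ("w", v, 2)}))
        match_edges.add(frozenset({("u", v, 2), ("w", v, 2)}))
        for v2 in nbrs:
            # Links {w_{v,1}, w_{v',2}} and {w_{v',2}, w_{v,2}} for v' in N(v).
            links.add(frozenset({("w", v, 1), ("w", v2, 2)}))
            links.add(frozenset({("w", v2, 2), ("w", v, 2)}))
            # Matching edge u_{v,1}-w_{v',2} for v' in N(v).
            match_edges.add(frozenset({("u", v, 1), ("w", v2, 2)}))
    return links, match_edges
```

On the single-edge graph {1: {2}, 2: {1}} this yields 3 links and 8 allowed matching edges.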

This result allows us to transfer hardness of approximation results for indepen-
dent set to locally stable matching.

Corollary 3. Finding a maximum locally stable matching is NP-complete. Under
the unique games conjecture, the problem cannot be approximated within
1.5 − ε, for any constant ε.

In fact, our reduction applies in the setting of the job-market game, where one
side has no network at all. This shows that even under quite strong restric-
tions the hardness of approximation holds. In contrast, it is easy to obtain a
2-approximation in every network game that admits a globally stable matching.

Proposition 2. If a (globally) stable matching exists, every such stable match-
ing is a 2-approximation for the maximum locally stable matching.

References
1. Ackermann, H., Goldberg, P., Mirrokni, V., Röglin, H., Vöcking, B.: Uncoordinated
two-sided matching markets. SIAM J. Comput. 40(1), 92–106 (2011)
2. Arcaute, E., Vassilvitskii, S.: Social networks and stable matchings in the job mar-
ket. In: Leonardi, S. (ed.) WINE 2009. LNCS, vol. 5929, pp. 220–231. Springer,
Heidelberg (2009)
3. Austrin, P., Khot, S., Safra, M.: Inapproximability of vertex cover and independent
set in bounded degree graphs. Theory of Computing 7(1), 27–43 (2011)
4. Biró, P., Cechlárová, K., Fleiner, T.: The dynamics of stable matchings and half-
matchings for the stable marriage and roommates problems. Int. J. Game The-
ory 36(3-4), 333–352 (2008)
5. Blum, Y., Roth, A., Rothblum, U.: Vacancy chains and equilibration in senior-level
labor markets. J. Econom. Theory 76, 362–411 (1997)
6. Blum, Y., Rothblum, U.: “Timing is everything” and marital bliss. J. Econom.
Theory 103, 429–442 (2002)
7. Cheng, C., McDermid, E.: Maximum locally stable matchings. In: Proc. 2nd Intl.
Workshop Matching under Preferences (MATCH-UP), pp. 51–62 (2012)
8. Diamantoudi, E., Miyagawa, E., Xue, L.: Random paths to stability in the room-
mates problem. Games Econom. Behav. 48(1), 18–28 (2004)
9. Goemans, M., Li, L., Mirrokni, V., Thottan, M.: Market sharing games applied to
content distribution in ad-hoc networks. IEEE J. Sel. Area Comm. 24(5), 1020–
1033 (2006)
10. Gusfield, D., Irving, R.: The Stable Marriage Problem: Structure and Algorithms.
MIT Press (1989)
11. Hoefer, M.: Local matching dynamics in social networks. Inf. Comput. 222, 20–35
(2013)
12. Inarra, E., Larrea, C., Moris, E.: Random paths to P -stability in the roommates
problem. Int. J. Game Theory 36(3-4), 461–471 (2008)
13. Inarra, E., Larrea, C., Moris, E.: The stability of the roommate problem revisited.
Core Discussion Paper 2010/7 (2010)
14. Irving, R.: An efficient algorithm for the ”stable roommates” problem. J. Algo-
rithms 6(4), 577–595 (1985)
15. Klaus, B., Klijn, F., Walzl, M.: Stochastic stability for roommate markets. J.
Econom. Theory 145, 2218–2240 (2010)
16. Knuth, D.: Marriages stables et leurs relations avec d’autres problemes combina-
toires. Les Presses de l’Université de Montréal (1976)
17. McDermid, E.: A 3/2-approximation algorithm for general stable marriage. In:
Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W.
(eds.) ICALP 2009, Part I. LNCS, vol. 5555, pp. 689–700. Springer, Heidelberg
(2009)
18. Roth, A., Sönmez, T., Ünver, M.U.: Pairwise kidney exchange. J. Econom. The-
ory 125(2), 151–188 (2005)
19. Roth, A., Sotomayor, M.O.: Two-sided Matching: A study in game-theoretic mod-
eling and analysis. Cambridge University Press (1990)
20. Roth, A., Vate, J.V.: Random paths to stability in two-sided matching. Economet-
rica 58(6), 1475–1480 (1990)
21. Yanagisawa, H.: Approximation algorithms for stable marriage problems. PhD the-
sis, Kyoto University, Graduate School of Informatics (2007)
Distributed Deterministic Broadcasting
in Wireless Networks of Weak Devices

Tomasz Jurdzinski¹, Dariusz R. Kowalski², and Grzegorz Stachowiak¹

¹ Institute of Computer Science, University of Wroclaw, Poland
² Department of Computer Science, University of Liverpool, United Kingdom

Abstract. Many futuristic technologies, such as Internet of Things or
nano-communication, assume that a large number of simple devices of
very limited energy and computational power will be able to communi-
cate efficiently via wireless medium. Motivated by this, we study broad-
casting in the model of ad-hoc wireless networks of weak devices with
uniform transmission powers. We compare two settings: with and with-
out local knowledge about the immediate neighborhood. In the latter set-
ting, we prove an Ω(n log n)-round lower bound and develop an algorithm
matching this formula. This result could be made more accurate with
respect to network density or, more precisely, the maximum node de-
gree Δ in the communication graph. If Δ is known to the nodes, it is
possible to broadcast in O(DΔ log² n) rounds, which is almost optimal
in the class of networks parametrized by D and Δ due to the lower
bound Ω(DΔ). In the setting with local knowledge, we design a scalable
and almost optimal algorithm accomplishing broadcast in O(D log² n)
communication rounds, where n is the number of nodes and D is the
eccentricity of a network. This can be improved to O(D log g) if network
granularity g is known to the nodes. Our results imply that the cost of
“local communication” is a dominating component in the complexity of
wireless broadcasting by weak devices, unlike in traditional models with
non-weak devices in which well-scalable solutions can be obtained even
without local knowledge.

1 Introduction
1.1 The Model
We consider a wireless network consisting of n stations, also called nodes, de-
ployed in the Euclidean plane and communicating by a wireless medium. The
Euclidean metric on the plane is denoted dist(·, ·). Each station v has its trans-
mission power Pv , which is a positive real number. There are three fixed model
parameters: path loss α > 2, threshold β ≥ 1, and ambient noise N > 0.


The full version of the paper is available at [13]. This work was supported by the
Polish National Science Centre grant DEC-2012/06/M/ST6/00459.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 632–644, 2013.
© Springer-Verlag Berlin Heidelberg 2013

The SINR(v, u, T) ratio, for given stations u, v and a set of (transmitting)
stations T, is defined as follows:

    SINR(v, u, T) = P_v · dist(v, u)^{−α} / (N + Σ_{w ∈ T\{v}} P_w · dist(w, u)^{−α})    (1)

In the weak devices model considered in this work, a station u successfully receives
a message from a station v in a round if v ∈ T, u ∉ T, and:
a) P_v · dist^{−α}(v, u) ≥ (1 + ε)βN, and
b) SINR(v, u, T) ≥ β,
where T is the set of stations transmitting at that time and ε > 0 is a fixed
signal sensitivity parameter of the model.¹
Ranges and Uniformity. The communication range r_v of a station v is the ra-
dius of the ball in which a message transmitted by the station is heard, provided
no other station transmits at the same time. A network is uniform when the trans-
mission powers P_v, and thus the ranges r_v, of all stations are equal, and nonuniform
otherwise. In this paper, only uniform networks are considered, i.e., P_v = P and
r = r_v = (P/(Nβ(1 + ε)))^{1/α}. The range area of a station v is defined to be the
ball of radius r centered in v.
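As a small executable illustration of equation (1), conditions (a) and (b), and the range formula (stations are points in the plane; all numeric parameter values below are arbitrary examples, not values fixed by the model):

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def sinr(v, u, T, P, alpha, noise):
    # Equation (1) with uniform power P_w = P for all stations.
    signal = P * dist(v, u) ** (-alpha)
    interference = sum(P * dist(w, u) ** (-alpha) for w in T if w != v)
    return signal / (noise + interference)

def receives(u, v, T, P, alpha=3.0, beta=1.0, noise=1.0, eps=0.5):
    """Weak-device reception test: does u hear v given transmitter set T?"""
    if u in T or v not in T:
        return False
    cond_a = P * dist(v, u) ** (-alpha) >= (1 + eps) * beta * noise  # energy spike
    cond_b = sinr(v, u, T, P, alpha, noise) >= beta                  # SINR threshold
    return cond_a and cond_b

def comm_range(P, alpha, beta, noise, eps):
    # Uniform communication range r = (P / (N * beta * (1 + eps)))^(1/alpha).
    return (P / (noise * beta * (1 + eps))) ** (1 / alpha)
```

For instance, with P = 16, α = 3, β = N = 1 and ε = 1 the range is (16/2)^{1/3} = 2; with the default parameters of `receives`, a lone transmitter at distance 1 is received, while one at distance 10 already fails the energy-spike condition (a).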
Communication Graph and Graph Notation. The communication graph
G(V, E), also called the reachability graph, of a given network consists of all
network nodes and edges (v, u) such that u is in the range area of v. Note that
the communication graph is symmetric for uniform networks. By a neighborhood
of a node u we mean the set (and positions) of all neighbors of u in G, i.e.,
the set {w | (w, u) ∈ E(G)}. The graph distance from v to w is equal to the
length of a shortest path from v to w in the communication graph, where the
length of a path is equal to the number of its edges. The eccentricity of a node
is the maximum graph distance from this node to any other node (note that the
eccentricity is of order of the diameter). By Δ we denote the maximum degree
of a node in the communication graph.
Synchronization. It is assumed that algorithms work synchronously in rounds,
each station can either act as a sender or as a receiver during a round. We do
not assume global clock ticking.
Carrier Sensing. We consider the model without carrier sensing, that is, a
station u has no other feedback from the wireless channel than receiving or not
receiving a message in a round t.

¹ This model is motivated by the fact that it is too costly for weak devices to have
receivers doing signal acquisition continuously, c.f. [7]. Therefore, in many systems
they rather wait for an energy spike, c.f. condition (a), and once they see it, they
start sampling and correlating to synchronize and acquire a potential packet pream-
ble [19]. Once synchronized, they can detect signals, c.f. condition (b).

Knowledge of Stations. Each station has its unique ID from the set [N],²
where N is polynomial in n. Stations also know their locations, and parameters n,
N . Some subroutines use the granularity g, defined as r divided by the minimum
distance between any two stations (c.f., [5]). We distinguish between networks
without local knowledge (ad hoc), where stations do not know anything about
the topology of the network, and networks with local knowledge, in which each
station knows locations and IDs of its neighbors in the communication graph.
Broadcasting Problem and Complexity Parameters. In the broadcast
problem, there is one distinguished node, called the source, which initially holds
a piece of information (also called a source message or a broadcast message).
The goal is to disseminate this message to all other nodes. The complexity
measure is the worst-case time to accomplish the broadcast task, taken over
all connected networks with specified parameters. Time, also called the round
complexity, denotes the number of communication rounds in the execution of a
protocol: from the round when the source is activated with its source message
till the broadcast task is accomplished. For the sake of complexity formulas, we
consider the following parameters: n, N , D, and g.
Messages and Initialization of Stations Other than Source. We assume
that a single message sent in the execution of any algorithm can carry the broad-
cast message and at most a polynomial (in the size of the network) number of
control bits. (For the purpose of our algorithms, it is sufficient that positions of
stations on the plane are stored with accuracy requiring O(log n) bits; therefore,
we assume that each message contains the position of its sender.) A station other
than the source starts executing the broadcast protocol after the first successful
receipt of the source message; it is often called a non-spontaneous wake-up model.

1.2 Our Results

In this paper we present the first study of deterministic distributed broadcasting
in wireless networks of weak devices with uniform transmission powers, deployed
in the two dimensional Euclidean space. We distinguish between the two settings:
with and without local knowledge about the neighbors in the communication
graph. In the latter model, we develop an algorithm accomplishing broadcast
in O(n log n) rounds, which matches the lower bound (Sections 2.1 and 2.3,
resp.). Then, an algorithm accomplishing broadcast in time O(DΔ log² n) is
presented, where D is the eccentricity of the source and Δ is the largest degree
of a node in the communication graph (Section 2.2). This algorithm is close
to the lower bound Ω(DΔ), see Section 2.3. Our solution for networks with
local knowledge works in O(D log² n) rounds (Section 3), which provides only a
small O(log² n) overhead over the straightforward lower bound of Ω(D), and is
faster, in the worst case, than any algorithm designed for networks without local
knowledge, for eccentricity D = o(n/log n) or maximal degree Δ = ω(1). It also
implies that the cost of learning neighborhoods by stations in wireless network
² We denote [i] = {1, 2, . . . , i} and [i, j] = {i, i + 1, . . . , j} for i, j ∈ N.

is much higher, by a factor of around min{n/D, Δ}, than the cost of broadcast itself
(i.e., broadcast performed when such neighborhoods would be provided). If the
granularity g is known, a complexity of O(D log g) can be achieved by a variation
of the algorithm mentioned above.
Our results rely on novel techniques which simultaneously exploit specific
properties of conflict resolution in the SINR model (see e.g., [1]) and several
algorithmic techniques developed for a different radio network model. In par-
ticular, we show how to efficiently combine a novel SINR-based communication
technique, ensuring several simultaneous point-to-point communications inside
the range area of one station (which is unfeasible to achieve in the radio network
model), with strongly selective families and methods based on geometric grids
developed in the context of radio networks. As a result, we are able to transform
algorithms relying on the knowledge of network’s granularity into algorithms of
asymptotically similar performance (up to a log n factor) that do not require such
knowledge; this is in particular demonstrated in the leader election algorithms.
Details of some algorithms and technical proofs can be found in the full version
of the paper [13].

1.3 Previous and Related Results


To the best of our knowledge, this is the first theoretical study of the problem
of distributed deterministic broadcasting in ad hoc wireless networks of weak
devices. In what follows, we list most relevant results in the SINR-based model
and in the older, but still related, radio network model.
SINR Models. In the model of (uniform) weak devices, distributed algorithms
for building a backbone structure in O(Δ polylog n) rounds were constructed
in [11]. Unlike in our broadcast problem, in [11] it was assumed that all nodes si-
multaneously start building the backbone. That result combined with the results
of this work implicates that there is an extra cost payed for the lack of initial
synchronization. If devices are not weak (i.e., not restricted by the fact that the
signal must be sufficiently strong in order to be noticed), broadcasting can be
done in O(D log2 n), as proved in [14]. Combined with results in this paper, it
proves a complexity gap between the two models: weak and non-weak devices.
Under the SINR-based models in ad hoc setting, a few other problems were
also studied, such as deterministic data aggregation [10] and local broadcasting
[20], in which nodes have to inform only their neighbors in the correspond-
ing reachability graph. The considered setting allowed power control by algo-
rithms, in which, in order to avoid collisions, stations could transmit with any
power smaller than the maximal one. Randomized solutions for contention res-
olution [15] and local broadcasting [8] were also obtained.
There is a vast amount of work on centralized algorithms under the SINR
model. The most studied problems include connectivity, capacity maximization,
and link scheduling; for recent results and references we refer the reader to the
survey [9]. Multiple Access Channel properties were also recently studied under
the SINR model, c.f., [18].
636 T. Jurdzinski, D.R. Kowalski, and G. Stachowiak

Radio Network Model. In this model, a transmitted message is successfully
heard if there are no other simultaneous transmissions from the neighbors of the
receiver in the communication graph. This model takes into account neither the
real strength of the received signals nor signals from outside of the
close proximity. In the geometric ad hoc setting, Dessmark and Pelc [4] were the
first to study this problem. They analyzed the impact of local knowledge,
defined as the range within which stations can discover the nearby stations.
Emek et al. [5] designed a broadcast algorithm working in time O(Dg) in Unit
Disc Graph (UDG) radio networks with eccentricity D and granularity g. Later,
Emek et al. [6] developed a matching lower bound Ω(Dg). In the graph-based
model of radio networks, in which stations are not explicitly deployed in a metric
space, the fastest O(n log(n/D))-round deterministic algorithm was developed
by Kowalski [16], and an almost matching lower bound was given by Kowalski and
Pelc [17], who also studied fast randomized solutions (in parallel with [3]).
above results hold without assuming local knowledge. √ With local knowledge,
Jurdzinski and Kowalski [12] showed a lower bound
√ Ω( Dn log n) on the number
of rounds and an algorithm of complexity O(D n log6 n).

1.4 Technical Preliminaries


In the broadcast problem, a round counter can be easily maintained by already
informed nodes by passing it along the network with the source message, so in all
algorithms we in fact assume having a global clock. For simplicity of analysis, we
assume that every message sent during an execution of our broadcast protocols
contains the broadcast message; in practice, the message content could be further
optimized in order to reduce the total number of transmitted bits in real
executions. We say that a station v transmits c-successfully
in round t if v transmits a message in round t and this message is heard by each
station u within Euclidean distance at most c from v. We say that a station
v transmits successfully in round t if it transmits r-successfully, i.e., each of its
neighbors in the communication graph can hear its message. Finally, v transmits
successfully to u in round t if v transmits a message and u receives this message in
round t. A station that has received the broadcast message is called informed.
Grids. Given a parameter c > 0, we define a partition of the 2-dimensional
space into square boxes of size c × c by the grid Gc, in such a way that: all
boxes are aligned with the coordinate axes, point (0, 0) is a grid point, and each box
includes its left side without the top endpoint and its bottom side without the
right endpoint, but does not include its right and top sides. We say that (i, j)
are the coordinates of the box with its bottom left corner located at (c·i, c·j),
for i, j ∈ Z. A box with coordinates (i, j) ∈ Z² is denoted C(i, j). As observed
in [4,5], the grid G_{r/√2} is very useful in the design of algorithms for UDG
(unit disk graph) radio networks, where r is equal to the range of each station.
This follows from the fact that r/√2 is the largest parameter of a grid such that
each station in a box is in the range of every other station in that box. We fix
γ = r/√2 and call Gγ the pivotal grid. If not stated otherwise, our considerations
Distributed Broadcasting 637

will refer to (boxes of) Gγ. Boxes C, C′ of the pivotal grid are neighbors in
a network if there are stations v ∈ C and v′ ∈ C′ such that the edge (v, v′)
belongs to the communication graph. We define the set DIR ⊂ [−2, 2]² such
that (d1, d2) ∈ DIR iff it is possible that boxes C(i, j) and C(i + d1, j + d2) are
neighbors.
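For illustration, the box-coordinate convention above (left and bottom sides included, right and top sides excluded) is exactly floor division. The following Python sketch is ours, not from the paper:

```python
import math

def box_coords(x, y, c):
    """Coordinates (i, j) of the box of grid G_c containing point (x, y).

    Each box includes its left and bottom sides and excludes its right and
    top sides, which matches the floor-division convention."""
    return (math.floor(x / c), math.floor(y / c))

def same_box(p, q, c):
    # With c = r / sqrt(2) (the pivotal grid), two stations sharing a box
    # are guaranteed to be within range of each other.
    return box_coords(*p, c) == box_coords(*q, c)
```

For example, box_coords(-0.5, 0.2, 1) gives (-1, 0), so the bottom-left corner of box C(i, j) is indeed the grid point (c·i, c·j).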
Schedules. A (general) broadcast schedule S of length T wrt N ∈ N is a mapping
from [N] to binary sequences of length T. A station with identifier v ∈ [N] follows
the schedule S of length T in a fixed period of time consisting of T rounds:
v transmits a message in round t of that period iff position t mod T of S(v)
is equal to 1. For tuples (i1, j1), (i2, j2), the relation (i1, j1) ≡ (i2, j2) mod d
for d ∈ N denotes that (|i1 − i2| mod d) = 0 and (|j1 − j2| mod d) = 0. A set
of stations A on the plane is δ-diluted wrt Gc, for δ ∈ N \ {0}, if for any two
stations v1, v2 ∈ A with grid coordinates (i1, j1) and (i2, j2), respectively, the
relationship (i1, j1) ≡ (i2, j2) mod δ holds. We say that δ-dilution is applied
to a schedule S if each round of an execution of S is replaced with δ² rounds
parameterized by (i, j) ∈ [0, δ − 1]² such that a station v ∈ C(a, b) can transmit
a message only in the rounds (i, j) such that (i, j) ≡ (a, b) mod δ.
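A δ-diluted execution can be sketched as follows; the helper names are ours, and the mapping of the round counter to sub-rounds is one natural reading of the definition:

```python
def diluted_subround(t, delta):
    """Split round counter t of a delta-diluted execution into the original
    round index and the (i, j) parameter of the current sub-round: each
    original round is replaced by delta**2 sub-rounds."""
    original_round, phase = divmod(t, delta * delta)
    return original_round, divmod(phase, delta)

def may_transmit(box, t, delta):
    # A station in box C(a, b) may transmit only in sub-rounds (i, j)
    # with (i, j) congruent to (a, b) mod delta, so any two stations
    # cleared to transmit in the same sub-round form a delta-diluted set.
    (a, b) = box
    _, (i, j) = diluted_subround(t, delta)
    return (a % delta, b % delta) == (i, j)
```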

Proposition 1. For each α > 2 and ε > 0, there exists a constant d0 such that
the following properties hold. Assume that a set of n stations A is d-diluted wrt
the grid Gx, where x = γ/c, c ∈ N, c > 1 and d ≥ d0. Moreover, at most one
station from A is located in each box of Gx. Then, if all stations from A transmit
simultaneously, each of them transmits (2r/c)-successfully.

Proposition 2. For each α > 2 and ε > 0, there exists a constant d satisfying
the following property. Let A be a set of stations such that min_{u,v∈A} dist(u, v) =
x·√2, where x ≤ γ. If a station u ∈ C(i, j) for a box C(i, j) of Gx is transmitting
in a round t and no other station in any box C(i′, j′) of Gx such that max{|i −
i′|, |j − j′|} ≤ d is transmitting in that round, then v can hear the message from
u in round t.

Selective families. A family S = (S0, . . . , Ss−1) of subsets of [N] is an (N, k)-ssf
(strongly-selective family) of length s if, for every non-empty subset Z of [N]
such that |Z| ≤ k and for every element z ∈ Z, there is a set Si in S such that
Si ∩ Z = {z}. It is known that there exists an (N, k)-ssf of size O(k² log N) for
every k ≤ N, c.f., [2]. We identify a family of sets S = (S0, . . . , Ss−1) with the
broadcast schedule S′ such that the ith bit of S′(v) is equal to 1 iff v ∈ Si.
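The strong-selectivity property can be checked by brute force on small instances; this exponential-time verifier (ours, for intuition only) makes the definition concrete:

```python
from itertools import combinations

def is_ssf(family, N, k):
    """Return True iff `family` (a list of subsets of range(N)) is an
    (N, k)-ssf: every non-empty Z of [N] with |Z| <= k and every z in Z
    admit some S_i with S_i intersect Z = {z}.  Exponential in N."""
    for size in range(1, k + 1):
        for Z in combinations(range(N), size):
            Zs = set(Z)
            for z in Zs:
                if not any(set(S) & Zs == {z} for S in family):
                    return False
    return True
```

The family of all singletons {{0}, . . . , {N−1}} is trivially an (N, N)-ssf of length N; the point of the O(k² log N) constructions of [2] is that for small k much shorter families suffice.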

2 Algorithms without Local Knowledge

2.1 Size Dependent Algorithm

In this section we consider networks in which a station knows only n, N, its
own ID, and its coordinates in the Euclidean space. We develop an algorithm
SizeUBr, which repeatedly executes two interleaved threads.

The first thread keeps combining stations into groups such that eventually,
for any box C of the pivotal grid, all stations located in C form one group.
Moreover, each group has a leader, and eventually each station should be aware
of (i) which group it belongs to, (ii) which station is the leader of that group,
and (iii) which stations belong to that group. Upon waking up, each station
forms a group with a single element (itself), and the groups grow gradually
by merging. The merging process builds upon the following observation. Let σ be
the smallest distance between two stations and let u, v be the closest stations.
Thus, there is at most one station in each box of the grid G_{σ/√2}. Then, if u
transmits a message and no other station within distance d·σ, for some constant d,
transmits at the same time, then v can hear that message (see Prop. 2). Using
an (N, (2d + 1)²)-strongly-selective family as a broadcast schedule S on the set of
leaders of groups, c.f., [16], one can assure that such a situation occurs once in
every O(log N) rounds. If u can hear v and v can hear u during such a schedule, the
groups of u and v can be merged. In order to coordinate the merging process,
we implicitly build a matching among pairs (u, v) such that u can hear v and v
can hear u during an execution of S.
The second thread is supposed to guarantee that the broadcast message is
transmitted from boxes containing informed stations to their neighbors. Each
station determines its temporary ID (TID) as the rank of its ID in the set of IDs
in its group. Using these TIDs, the stations apply a round-robin strategy. Thus,
if each group corresponds to all stations in the appropriate box, transmissions
are successful (see Prop. 1), and thus they guarantee that the neighbors of a box
containing informed stations will also contain informed stations.
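The TID assignment and the resulting round-robin schedule are straightforward; a sketch (our own helpers, with group membership assumed already known to each station):

```python
def temporary_ids(group_ids):
    """TID of a station = rank of its ID within its group (0-based)."""
    return {v: rank for rank, v in enumerate(sorted(group_ids))}

def transmits_in_round(v, group_ids, t):
    # Round-robin on TIDs: station v transmits when the round counter,
    # taken modulo the group size, equals its TID.
    return t % len(group_ids) == temporary_ids(group_ids)[v]
```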
The main problem with the implementation of these ideas is that, as long as there
are many groups inside a box, transmissions in the second thread may cause
unwanted interference. Another problem is that the set of stations attending the
protocol changes gradually, as new stations become informed and can join the
execution of the protocol. These issues are managed by measuring the progress
of the protocol using amortized analysis. The details of the implementation and
analysis can be found in the full version of the paper.

Theorem 1. Algorithm SizeUBr performs broadcasting in each n-node network
in O(n log n) rounds, in the setting without local knowledge.

2.2 Degree Dependent Algorithm

In this section we present the algorithm GenBroadcast, whose complexity is
optimized with respect to the maximum degree of the communication graph. The core
of this algorithm is a leader election procedure which, given a set of stations V,
chooses exactly one station (the leader) in each box C of the pivotal grid
containing at least one element from V. This procedure works in O(log n · log N) =
O(log² n) rounds and is executed repeatedly during GenBroadcast. The set
of stations attending a particular leader election execution consists of all
stations which have received the broadcast message and have not been chosen leaders of
their boxes in previous executions of the leader election procedure. Moreover, at

Algorithm 1. LeaderElection(V, n)
1: For each v ∈ V : cand(v) ← true;
2: for i = 1, . . . , log n + 1 do  ▷ Elimination
3:   for j, k ∈ [0, 2] do
4:     Execute S twice on the set:
5:       {w ∈ V | cand(w) = true and w ∈ C(j′, k′)
6:        such that (j′ mod 3, k′ mod 3) = (j, k)};
7:     Each w ∈ V determines and stores Xw during the first execution of S, and
8:       Xv, for each v ∈ Xw, during the second execution of S;
9:   for each v ∈ V do
10:    u ← min(Xv);
11:    if Xv = ∅ or v > min(Xu ∪ {u}) then cand(v) ← false; ph(v) ← i;
12: For each v ∈ V : state(v) ← active;  ▷ Selection
13: for i = log n, (log n) − 1, . . . , 2, 1 do
14:   Vi ← GranLeaderElection({v ∈ V |
15:     ph(v) = i, state(v) = active}, n);  ▷ Vi – leaders
16:   Each element v ∈ Vi sets state(v) ← leader and
17:     transmits successfully using constant dilution (see Prop. 1);
18:   Simultaneously, for each v ∈ V which can hear some u ∈ box(v): state(v) ← passive.

the end of each execution of the leader election procedure, each leader chosen in
that execution transmits a message successfully — this can be done in a constant
number of rounds, by using d-dilution with an appropriate constant d (c.f., Prop. 1).
In this way, each station receives the source message after O(DΔ log² n) rounds.
(Note that there are at most Δ stations in a box of the pivotal grid.)
In the following, we describe the leader election algorithm — its pseudo-code
is presented as Algorithm 1. We are given a set of stations V of size at most
n. The set V is not known to stations, each station knows merely whether it
belongs to V or it does not belong to V . In the algorithm, we use (N, e)-ssf S of
size s = O(log N ), where e = (2d + 1)2 and d is the constant depending merely
on the parameters of the model, the same as in Section 2.1 (see also Prop. 2).
Let Xv , for a given execution of S be the set of stations which belong to box(v)
and v can hear them during that execution.
The following proposition combines properties of ssf with Prop. 2.
Proposition 3. For each α > 2 and ε > 0, there exists a constant k satisfying
the following property. Let W be a 3-diluted (wrt the pivotal grid) set of stations
and let C be a box of the pivotal grid. If min_{u,v∈C∩W} dist(u, v) = x ≤ r/n and
dist(u, v) = x for some u, v ∈ W such that box(u) = box(v) = C, then v can hear
the message from u during an execution of an (N, k)-ssf on W.
The leader election algorithm consists of two stages. The first stage gradually
eliminates elements from the set of candidates for the leaders of boxes in
consecutive executions of the ssf S in the first for loop. Therefore, we call this stage
Elimination. Let phase l of the Elimination stage denote the executions of S for
i = l. Each station v “eliminated” in phase l is assigned the value ph(v) = l.
Let V(l) = {v | ph(v) > l} and VC(l) = {v | ph(v) > l and box(v) = C} for l ∈ N
and C being a box of the pivotal grid. That is, VC(l) is the set of stations from
C which are not eliminated until phase l. The key property of the sets VC(l) is
that |VC(l + 1)| ≤ |VC(l)|/2 for each box C and l ∈ N, and that the granularity
of VC(lC) is smaller than n, where lC is the largest l ∈ N such that VC(l) is not empty.


Therefore, we can choose the leader of each box C by applying (simultaneously
in each box) the granularity-dependent leader election algorithm
GranLeaderElection, described later in Section 3.2, on the set VC(lC) with upper bound
n on granularity. Note that in this way we can elect the leaders in O(log N) = O(log n)
rounds. However, the stations in C do not necessarily know the value
of lC. Therefore, the second stage (called Selection) applies the granularity-dependent
leader election on V(log n), V(log n − 1), V(log n − 2), and so on. When
the leader of a box C is chosen, all stations in C become silent (state passive
in line 18), i.e., they do not attend the following executions of
GranLeaderElection. It is important that a station becomes silent after the leader of its box is
chosen, since the granularity of VC(i) might be larger than n for i < lC. Activity
of stations from such a box C for i < lC during the Selection stage could cause
large interference preventing other boxes from electing leaders.
Recall that each leader broadcasts successfully at the end of the execution of
LeaderElection in which it is elected. If each station attends consecutive leader
election executions until it becomes the leader of its box, the broadcast message
is transmitted from a box C to all its neighbors in O(Δ log² n) rounds, since there
are at most Δ stations in each box of the pivotal grid. Therefore, we obtain the
algorithm GenBroadcast, providing the following result.

Theorem 2. Algorithm GenBroadcast completes broadcast in O(DΔ log² n)
rounds in any network without local knowledge.

2.3 Lower Bounds


Theorem 3. There exist: (i) an infinite family of n-node networks requiring
Ω(n log n) rounds to accomplish deterministic broadcast, and (ii) for every DΔ =
O(n), an infinite family of n-node networks of diameter D and maximum degree
Δ requiring Ω(DΔ) rounds to accomplish deterministic broadcast.

Proof (Sketch). We describe a family of networks F such that broadcasting
requires time Ω(D log N). By Li we denote the set of stations at distance i
from the source in the communication graph. Each element of F is formed as a
sequential composition of D networks V1, . . . , VD, each of eccentricity 3, such that:
– the source s is connected with two nodes v1 , v2 in L1 with arbitrary IDs and
fixed positions;
– v1 , v2 are connected with w, the only element of L2 , and satisfy the condition:

P · dist(v1, w)^{−α} = P · dist(v2, w)^{−α} − N/2 .   (2)

Sequential composition of networks V1, . . . , VD stands for identifying the element
w of network component Vi with the source s of network component Vi+1.

Note that if v1 and v2 transmit simultaneously in a network component Vi,
the message is not received by w. Using a simple counting argument, one can force
such a choice of IDs of v1 and v2 that Ω(log N) rounds are necessary until a round
in which exactly one of v1, v2 successfully transmits a message to w under the
SINR model of weak devices. Since D = Θ(n) in the above construction and
log n = Θ(log N), the bound Ω(n log N) holds.
The above proof can be extended to obtain the lower bound Ω(DΔ) by considering
the following class of network components Vi: the source s, located at the
origin (0, 0), is the only element of L0; L1 consists of Δ nodes v0, . . . , vΔ−1,
where the position of vi is (γ·i/Δ, γ) for 0 ≤ i ≤ Δ − 1; and L2 contains only
one node wj, with coordinates (γ·j/Δ, γ + r), i.e., wj can receive a message only
from vj. ⊓⊔
This result can also be transformed to the case of randomized algorithms. We
sketch the idea of this transformation by considering networks from the family
F described in the proof of Theorem 3. Recall that each element of layer L1
should transmit as the only element of L1 in order to guarantee that the only
element of L2 is informed, regardless of its location. However, by simple counting
arguments, the expected number of steps after which some element of L1 transmits
as the only one is Ω(log n) or Ω(Δ), respectively.

3 Algorithms for Networks with Local Knowledge


In this section we assume that each station knows n, N as well as IDs and
locations of all stations in its range area. We start with presenting a generic al-
gorithmic scheme. Next, we describe an algorithm for networks with additionally
known granularity bound g. Finally we provide a solution for the general setting
when granularity g is not known in advance.

3.1 Generic Algorithmic Scheme


In the first round the source sends the broadcast message. Then, we repeat the
generic procedure Inter-Box-Bdcst, whose ith repetition is aimed at transmitting
the broadcast message from boxes of the pivotal grid containing at least one sta-
tion that has received the broadcast message in the previous execution of Inter-
Box-Bdcst (or from the source) to boxes which are their neighbors. The specific
implementation of procedure Inter-Box-Bdcst depends on the considered setting.
Each station v is in state s(v), which may be equal to one of the follow-
ing three values: asleep, active, or idle. At the beginning, the source sends the
source message and all stations of its box in the pivotal grid set their states to
active, while all the remaining stations are in the asleep state. The states of sta-
tions change only at the end of Inter-Box-Bdcst, according to the following rules:

Rule 1: All stations in state active change their state to idle.


Rule 2: A station u changes its state from asleep to active if it has received the
broadcast message in the current execution of Inter-Box-Bdcst from a station v
such that either v was in state active or v belongs to the same box of the pivotal
grid as u. That is, let C be a box of the pivotal grid and let u ∈ C be in state
asleep at the beginning of Inter-Box-Bdcst. The only possibility that u receives
a message and does not change its state from asleep to active at the end of
Inter-Box-Bdcst is that each message received by u was sent by a station v which
was in state asleep when it sent the message and v ∉ C.
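Rules 1 and 2 define a simple per-station state machine; in this sketch (ours, not the paper's) the reception condition of Rule 2 is abstracted into a boolean:

```python
def next_state(state, heard_from_active_or_own_box):
    """State update at the end of Inter-Box-Bdcst (Rules 1 and 2).

    `heard_from_active_or_own_box` is True iff the station received the
    broadcast message from an active station, or from a station in its
    own pivotal-grid box, during the current execution."""
    if state == "active":
        return "idle"                         # Rule 1
    if state == "asleep" and heard_from_active_or_own_box:
        return "active"                       # Rule 2
    return state                              # idle stays idle, asleep stays asleep
```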
The intended properties of an execution of Inter-Box-Bdcst are:
(I) For each box C of the pivotal grid, states of all stations in C are equal.
(P) The broadcast message is (successfully) sent from each box C containing
stations in state active to all stations located in boxes which are neighbors
of C.
The following proposition easily follows from the above stated properties.
Proposition 4. If (I) and (P) are satisfied, the source message is transmitted
to the whole network in O(D · T ) rounds, where T is the number of rounds in a
single execution of Inter-Box-Bdcst.

3.2 A Granularity-Dependent Algorithm


First, we develop a broadcasting algorithm with known granularity g of the
network. The main ingredient of this protocol is a leader election algorithm, called
GranLeaderElection(A, g), which, given a set of stations A, chooses the leader in
each box of the pivotal grid containing stations from A (at the beginning, each
station knows only whether it belongs to A or not). The idea of the leader election
procedure is as follows. Granularity g implies that each station is the leader
of a box of Gx, where x = γ/h for h = min{2^i | 2^i ≥ g}. Then, leaders of boxes
of G_{2^i x} are chosen among leaders of boxes of G_{2^{i−1} x}, for i = 1, 2, . . . , log h, in a
constant number of rounds with the help of Prop. 1. Thus, leaders in boxes of the
pivotal grid can be chosen in O(log g) rounds.
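The choice of h and the number of promotion levels can be sketched as follows (log h = O(log g) levels, each taking a constant number of rounds; helper name is ours):

```python
def dilution_levels(g):
    """Return h = min{2**i : 2**i >= g} and the number of grid levels,
    log2(h), climbed from G_{gamma/h} up to the pivotal grid G_gamma."""
    h, levels = 1, 0
    while h < g:
        h, levels = h * 2, levels + 1
    return h, levels
```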
Given the above (local) leader election procedure, the procedure Inter-Box-Bdcst
is implemented as follows. For each direction (d1, d2) ∈ DIR, leaders
are elected in all boxes among stations in state active which have neighbors in the
direction (d1, d2). Then, these leaders send messages successfully using dilution
(see Prop. 1). Moreover, since each station knows all stations in its box, the
station with the smallest ID among the newly informed in each box sends the broadcast
message, which is delivered to all stations in its box. In this way, we obtain a procedure
Inter-Box-Bdcst satisfying the invariants (I) and (P) and working in time O(log g),
which gives a broadcasting algorithm working in time O(D log g).

Theorem 4. Algorithm GranUBr accomplishes broadcast in any n-node network
of diameter D and granularity g in O(D log g) rounds, in the setting with local
knowledge.

3.3 General Algorithm


In this section we develop Algorithm DiamUBr, which also builds on the generic
scheme from Section 3.1. Procedure Inter-Box-Bdcst required by the generic
algorithm is implemented as in Section 3.2; the only difference is that
GranLeaderElection with complexity O(log g) is replaced with the procedure LeaderElection
from Section 2.2 (Alg. 1). By Prop. 4 and the round complexity O(log² n) of
algorithm LeaderElection, we obtain the following result.

Theorem 5. Algorithm DiamUBr completes broadcast in any n-node network
of diameter D in O(D log² n) rounds, in the setting with local knowledge.

References
1. Avin, C., Emek, Y., Kantor, E., Lotker, Z., Peleg, D., Roditty, L.: SINR diagrams:
Towards algorithmically usable SINR models of wireless networks. In: PODC,
pp. 200–209 (2009)
2. Clementi, A.E.F., Monti, A., Silvestri, R.: Selective families, superimposed codes,
and broadcasting on unknown radio networks. In: SODA, pp. 709–718 (2001)
3. Czumaj, A., Rytter, W.: Broadcasting algorithms in radio networks with unknown
topology. In: FOCS, pp. 492–501 (2003)
4. Dessmark, A., Pelc, A.: Broadcasting in geometric radio networks. J. Discrete Al-
gorithms 5(1), 187–201 (2007)
5. Emek, Y., Gasieniec, L., Kantor, E., Pelc, A., Peleg, D., Su, C.: Broadcasting in UDG
radio networks with unknown topology. Distributed Computing 21(5), 331–351
(2009)
6. Emek, Y., Kantor, E., Peleg, D.: On the effect of the deployment setting on broad-
casting in Euclidean radio networks. In: PODC, pp. 223–232 (2008)
7. Goldsmith, A.J., Wicker, S.B.: Design challenges for energy-constrained ad hoc
wireless networks. IEEE Wireless Communications 9(4), 8–27 (2002)
8. Goussevskaia, O., Moscibroda, T., Wattenhofer, R.: Local broadcasting in the phys-
ical interference model. In: DIALM-POMC, pp. 35–44 (2008)
9. Goussevskaia, O., Pignolet, Y.A., Wattenhofer, R.: Efficiency of wireless networks:
Approximation algorithms for the physical interference model. Foundations and
Trends in Networking 4(3), 313–420 (2010)
10. Hobbs, N., Wang, Y., Hua, Q.-S., Yu, D., Lau, F.C.M.: Deterministic distributed
data aggregation under the SINR model. In: Agrawal, M., Cooper, S.B., Li, A.
(eds.) TAMC 2012. LNCS, vol. 7287, pp. 385–399. Springer, Heidelberg (2012)
11. Jurdzinski, T., Kowalski, D.R.: Distributed backbone structure for algorithms in
the SINR model of wireless networks. In: Aguilera, M.K. (ed.) DISC 2012. LNCS,
vol. 7611, pp. 106–120. Springer, Heidelberg (2012)
12. Jurdzinski, T., Kowalski, D.R.: On the complexity of distributed broadcasting and
MDS construction in radio networks. In: Baldoni, R., Flocchini, P., Binoy, R. (eds.)
OPODIS 2012. LNCS, vol. 7702, pp. 209–223. Springer, Heidelberg (2012)
13. Jurdzinski, T., Kowalski, D.R., Stachowiak, G.: Distributed deterministic broad-
casting in wireless networks of weak devices under the SINR model. CoRR,
abs/1210.1804 (2012)
14. Jurdzinski, T., Kowalski, D.R., Stachowiak, G.: Distributed deterministic broad-
casting in uniform-power ad hoc wireless networks. CoRR, abs/1302.4059 (2013)

15. Kesselheim, T., Vöcking, B.: Distributed contention resolution in wireless


networks. In: Lynch, N.A., Shvartsman, A.A. (eds.) DISC 2010. LNCS, vol. 6343,
pp. 163–178. Springer, Heidelberg (2010)
16. Kowalski, D.R.: On selection problem in radio networks. In: PODC, pp. 158–166
(2005)
17. Kowalski, D.R., Pelc, A.: Broadcasting in undirected ad hoc radio networks.
Distributed Computing 18(1), 43–57 (2005)
18. Richa, A., Scheideler, C., Schmid, S., Zhang, J.: Towards jamming-resistant and
competitive medium access in the SINR model. In: Proc. 3rd ACM Workshop on
Wireless of the Students, by the Students, for the Students, S3 2011, pp. 33–36
(2011)
19. Schmid, S., Wattenhofer, R.: Algorithmic models for sensor networks. In: IPDPS.
IEEE (2006)
20. Yu, D., Wang, Y., Hua, Q.-S., Lau, F.C.M.: Distributed local broadcasting algo-
rithms in the physical interference model. In: DCOSS, pp. 1–8 (2011)
Secure Equality and Greater-Than Tests
with Sublinear Online Complexity

Helger Lipmaa¹ and Tomas Toft²

¹ Institute of CS, University of Tartu, Estonia
² Dept. of CS, Aarhus University, Denmark

Abstract. Secure multiparty computation (MPC) allows multiple parties
to evaluate functions without disclosing the private inputs. Secure
comparisons (testing equality and greater-than) are important primitives
required by many MPC applications. We propose two equality tests for
ℓ-bit values with O(1) online communication that require O(ℓ) respectively
O(κ) total work, where κ is a correctness parameter.
Combining these with ideas of Toft [16], we obtain (i) a greater-than
protocol with sublinear online complexity in the arithmetic black-box
model (O(c) rounds and O(c·ℓ^{1/c}) work online, with c = log ℓ resulting
in logarithmic online work). In contrast to Toft, we do not assume two
mutually incorruptible parties, but O(ℓ) offline work is required; and
(ii) two greater-than protocols with the same online complexity as the
above, but with overall complexity reduced to O(log ℓ·(κ + log log ℓ)) and
O(c·ℓ^{1/c}·(κ + log ℓ)); these require two mutually incorruptible parties, but
are highly competitive with respect to online complexity when compared
to existing protocols.

Keywords: Additively homomorphic encryption, arithmetic black box,
secure comparison, secure equality test.

1 Introduction

Secure multiparty computation (MPC) considers the following problem: n parties
hold inputs, x1, . . . , xn, for a function, f; they wish to evaluate f without
disclosing their inputs to each other or third-parties. Numerous solutions to this
problem exist; many provide secure arithmetic over a field or ring, e.g., ZM for
an appropriate M , by relying either on secret sharing or additively homomorphic
encryption. The overall structure of those solutions is similar, thus the details
of the constructions may be abstracted away and MPC-protocols can be con-
structed based on secure arithmetic. This idea was formalized as the arithmetic
black-box (ABB) by Damgård and Nielsen [7]. For a longer discussion of MPC
and the ABB, see Sect. 2.
Secure ZM -arithmetic may be used to emulate integer computation when
inputs/outputs are less than M (which typically can be chosen quite freely).
However, other operations may be needed. Secure comparison – equality testing
(Eq) and greater-than testing (GT) – are two important problems in the (MPC)

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 645–656, 2013.

© Springer-Verlag Berlin Heidelberg 2013
646 H. Lipmaa and T. Toft

Table 1. A comparison of sublinear GT protocols for bitlength ℓ

Result       Online rounds   Online work                Overall work               Correctness
Adversary structure with two mutually incorruptible parties
[16]         O(c)            O(c·ℓ^{1/c}·(κ + log ℓ))   O(c·ℓ^{1/c}·(κ + log ℓ))   Statistical
[16]         O(log ℓ)        O(log ℓ·(κ + log log ℓ))   O(log ℓ·(κ + log log ℓ))   Statistical
This paper   O(c)            O(c·ℓ^{1/c})               O(c·ℓ^{1/c}·(κ + log ℓ))   Statistical
This paper   O(log ℓ)        O(log ℓ)                   O(log ℓ·(κ + log log ℓ))   Statistical
Arbitrary adversary structure
[18]         O(1)            O(ℓ/log ℓ)                 O(ℓ)                       Perfect
This paper   O(c)            O(c·ℓ^{1/c})               O(ℓ)                       Perfect
This paper   O(log ℓ)        O(log ℓ)                   O(ℓ)                       Perfect

literature. They are required for tasks as diverse as auctions, data-mining, and
benchmarking. A prime example is the first real-world MPC execution [4], which
required both integer additions and GT tests.
In this paper, we introduce two new Eq tests and improve over the state of the art
in GT testing in the ABB model. The main focus is online efficiency, i.e., parties
may generate joint randomness in advance (e.g., while setting up an auction) to
increase efficiency once the inputs have been supplied (bids have been given).

Related Work. Secure comparison and its applications form a very active
topic with too many papers to mention all. Damgård et al. [6] proposed the
first constant-rounds protocols, which required O(ℓ log ℓ) secure multiplications.
Nishide and Ohta [13] improved this to O(ℓ) work for GT and O(κ) work for
equality, where κ is a correctness parameter.
Until recently, all GT tests had a complexity (at least) linear in the bitlength,
ℓ, of the inputs, but in [16], Toft proposed the first sublinear constructions.
These utilized proofs of boundedness and required the presence of 2 mutually
incorruptible parties, i.e., one of two named parties was required to be honest.
This is naturally satisfied in the two-party case (n = 2), but the multiparty case is
left with either a corruption threshold of 1 or a non-standard adversary structure.
In [18], Yu proposed a sublinear, constant-rounds protocol in the ABB model
based on sign modules. His protocol requires O(ℓ/log ℓ) operations online and
works for an ABB over a finite field, i.e., prime M. It does not appear that
the ideas work with a composite M such as is needed by Paillier encryption. See
Table 1 for an overview of existing sublinear GT tests.

Contribution. We propose a collection of actively secure protocols. We first
introduce two new protocols for equality testing of ℓ-bit values. The first is
perfectly correct with O(1) ABB-operations online and O(ℓ) ABB-operations
overall. The second reduces overall communication to O(κ) at the cost of imperfect
correctness, i.e., κ is a correctness parameter; it also requires two mutually
incorruptible parties. Both improve online complexity dramatically over previous
work. Additionally, we use these in combination with ideas of [16] to obtain new
GT tests for ℓ-bit values in the ABB model. We end up with multiple variations.
First, ABB-protocols with O(log ℓ) work and rounds (respectively O(c·ℓ^{1/c})
work in O(c) rounds for constant c) online, and O(ℓ) work overall. Second, we reduce
overall work to O(log ℓ·(κ + log log ℓ)) (respectively O(c·ℓ^{1/c}·(κ + log ℓ))) at the cost
of requiring two mutually incorruptible parties. All constructions require proofs
of boundedness to prevent active attacks. In contrast to [18], we do not utilize
sign modules; hence our protocols work for Paillier encryption-based MPC as
well. In that setting, our GT tests are the first with sublinear online complexity
and arbitrary adversary structure.

2 Preliminaries

The Arithmetic Black-Box. Many MPC protocols work by having parties
supply their inputs “secretly,” e.g., using secret sharing, which allows a value to
be split between parties such that it remains unknown unless sufficiently many
agree. A homomorphic scheme allows parties to compute sums, while secure
multiplication requires interaction. Once the desired result has been computed,
it is straightforward to output it by reconstructing. The arithmetic black-box
of [7] captures this type of behaviour, making it a convenient model for presenting
MPC protocols. This allows protocol construction with focus on the task at hand
rather than “irrelevant details” such as the specifics and security guarantees of
the underlying cryptographic primitives.
Formally, the arithmetic black-box is an ideal functionality, FABB , and pro-
tocols are constructed in a hybrid model where access to this functionality is
given. FABB can be thought of as a (virtual) trusted third party, who provides
storage of elements of a ring, ZM , as well as arithmetic computation on stored
values. Here, M will be either a prime or an RSA-modulus, i.e., the product of
two large, odd primes. We provide an intuitive presentation of the ABB here;
see [7] or [17] for a formal definition. Full simulation-based proofs are possible;
due to space constraints we merely sketch privacy proofs.
Secure storage (input/output) can be thought of as secret sharing and we
use the notation of Damgård et al. [6], writing ABB-values in square brackets,
[[x]]. ABB-arithmetic is written using “plaintext space,” infix operators, e.g.,
[[x·y +z]] ← [[x]]·[[y]]+[[z]]. Naturally, such operations eventually refer to protocols
between P1 , . . . , Pn , e.g., the protocols of Ben-Or et al. [3].
The complexity of a protocol in the FABB hybrid model is the number of basic
operations performed, input/output and arithmetic. We assume that these op-
erations may be executed concurrently, i.e., that executing the underlying cryp-
tographic protocols concurrently does not violate security. Round complexity
is defined as the number of sequential sets of concurrent operations performed
(basic operations typically require a constant number of rounds; in this case
constant-rounds in the ABB model implies constant-rounds in the actual pro-
tocol). Finally, we focus on communication complexity and therefore consider
addition costless; typical ABB realizations are based on additively homomorphic
primitives, and this is a standard choice. Additionally, ABB-computation
occurs in two phases: (i) random values are generated within FABB before the
inputs are known (preprocessing or offline phase), and (ii) when the inputs are
available within FABB , the result is computed (online phase). Focus is predom-
inantly on the efficiency of the online phase.

Known ABB Constructions and Additional Primitives. The following
known primitives are needed in the proposed constructions. These are exclusively
needed as part of the preprocessing phase; in practice it may be preferable to
utilize simpler (non-constant-rounds) solutions.
– RandElem: Generates a uniformly random, secret element of ZM stored
within the ABB. Considered as 1 multiplication and 1 round, [6].
– RandBit: Generates a uniformly random, secret bit stored within the ABB.
O(1) multiplications in O(1) rounds, [6].¹
– RandBits: Generates a uniformly random ZM-value r and its binary representation
r = Σi 2^i · ri, ri ∈ {0, 1}, stored as elements of ZM; O(log M)
multiplications in O(1) rounds, [6].
– RandInv: Generates a uniformly random element in Z*M along with its
inverse; O(1) multiplications in O(1) rounds, [6].
– prefix×: Prefix product takes a vector of invertible, secret values, [[r1]], [[r2]],
. . ., [[rm]], and computes the prefix-products, i.e., [[∏_{i=1}^{j} ri]] for 1 ≤ j ≤ m,
using O(m) ABB operations in O(1) rounds [2,6].
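The constant-rounds prefix product of [2,6] rests on a telescoping trick: sandwich each secret value between two random invertible masks, open the masked products, and unmask every prefix locally. The following plaintext sketch (our own illustration, not a secure implementation; the function name is ours) shows the arithmetic:

```python
# Plaintext sketch of the constant-rounds prefix-product trick: mask each
# x_i between two random invertible elements, open the masked values w_i,
# and recover each prefix product by local unmasking.
import random

def prefix_products(xs, M):
    m = len(xs)
    # RandInv: random invertible masks r_0..r_m together with their inverses.
    rs = []
    while len(rs) < m + 1:
        r = random.randrange(1, M)
        try:
            rinv = pow(r, -1, M)
        except ValueError:
            continue  # r not invertible modulo M; resample
        rs.append((r, rinv))
    # Open w_i = r_{i-1} * x_i * r_i^{-1}; products of the w_i telescope.
    ws = [(rs[i][0] * xs[i] * rs[i + 1][1]) % M for i in range(m)]
    prefixes = []
    acc = 1
    for j in range(m):
        acc = (acc * ws[j]) % M  # public value r_0 * (x_1 ... x_{j+1}) * r_{j+1}^{-1}
        prefixes.append((rs[0][1] * acc * rs[j + 1][0]) % M)  # unmask
    return prefixes

M = 101  # a small prime, for illustration only
print(prefix_products([3, 7, 9, 2], M))  # → [3, 21, 88, 75]
```

In the real protocol the mask pairs come from RandInv, only the w_i are opened, and the unmaskings are ABB-multiplications; all openings can happen in parallel, which is where the O(1) round count comes from.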
We also require that the ABB can verify that an input is of bounded size, e.g.,
[[x]] < 2^ℓ. x is known by the inputter, Pi, so this corresponds to executing a proof
of boundedness. A communication-efficient solution (Θ(1) group elements) can
be obtained using the sum-of-four-squares technique [10]: Pi supplies an integer
input (decomposed into squares) which is converted to a ZM element;
this can be done using integer commitments (for encryption) or a linear integer
secret sharing scheme [15] (for Shamir sharing). An alternative is to use the
constant-communication non-interactive zero-knowledge argument of [5]; there,
Pi commits to a vector of digits of x and uses the techniques of [8,11] to prove
that the encrypted x belongs to the given range.

Disclose-If-Equal. In a disclose-if-equal (DIE) protocol between Alice and
Bob, Alice gets to know Bob’s secret β exactly if the value α she encrypted equals
x (where x is a value known to Bob only, or possibly to both). Otherwise, Alice
obtains a (pseudo)random plaintext. See [1,9] for original definitions.
If the plaintext space is ZM for a prime M (as it is in the case of the secret
sharing setting), one can use the following simple protocol inspired by [1] (here,
Encpk means encryption by public key pk and Decsk means decryption by the
corresponding secret key sk): (1) Alice sends q ← Encpk(α) to Bob. (2) If the
ciphertext is invalid, Bob returns ⊥. Otherwise, he returns a ← (q · Encpk(−x))^r ·
Encpk(β), where r ← ZM. (3) Alice computes Decsk(a) = r(α − x) + β, which is
equal to β when α = x. Clearly, this protocol is perfectly complete, and encryption,
decryption, and exponent-arithmetic can be replaced by ABB operations.
¹ When M is an RSA-modulus, complexity is linear in the number of parties; for
simplicity we assume that this is constant. (This is only used in preprocessing.)
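The algebra of this simple protocol can be replayed in plaintext (our own illustrative code; encryption and decryption are elided, so only the decrypted response Decsk(a) = r(α − x) + β is modeled):

```python
# Plaintext model (no actual encryption) of the simple DIE protocol for a
# prime modulus M: Bob's reply decrypts to r*(alpha - x) + beta.
import random

def die_prime(alpha, x, beta, M):
    r = random.randrange(M)  # Bob's fresh randomizer
    return (r * (alpha - x) + beta) % M

M = 2**61 - 1  # a prime, standing in for the plaintext space
assert die_prime(42, 42, 777, M) == 777  # alpha == x: Alice learns beta
a = die_prime(41, 42, 777, M)            # alpha != x: a is uniform in Z_M,
assert 0 <= a < M                        # since alpha - x is invertible mod prime M
```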
If M is not a prime but has sufficiently large prime factors (like in the case of
existing additively homomorphic public-key cryptosystems), then the resulting
DIE protocol, proposed by Laur and Lipmaa in [9], is somewhat more complicated.
Let ℓ be the bitlength of β. Let T ← ⌈2^{−ℓ} · M⌉. Let spf(M) be the smallest
prime factor of the plaintext group order M. We assume ℓ ≤ ½ · log₂ M + log₂ ε,
where ε ≤ 2^{−80} is the hiding parameter. Here we assume that Bob knows the
public key and Alice knows the secret key and the parties use an additively
homomorphic public-key cryptosystem like the one by Paillier [14]. (1) Alice sends
q ← Encpk(α) to Bob. (2) If the ciphertext is invalid, Bob returns ⊥. Otherwise,
he returns a ← (q · Encpk(−x))^r · Encpk(β + 2^ℓ · t), where r ← ZM and t ← ZT.
(3) Alice computes Decsk(a) mod 2^ℓ.
As shown in [9], this protocol is (1 − ε)-semisimulatable [12] (that is, game-based
computationally private against a malicious server, and simulation-based
statistically private against a malicious client) as long as 2^{ℓ−1}/spf(M) is
bounded by ε. That is, if x ≠ α then the distribution of U(ZM) · (α − x) +
2^ℓ · U(ZT) is ε-far from the uniform distribution U(ZM) on ZM. Since in the
case of Paillier, spf(M) ≈ √M, we need that ℓ − 1 − ½ · log₂ M ≤ log₂ ε, i.e.,
ℓ < ½ · log₂ M + log₂ ε, as mentioned. The idea behind including the additional
term 2^ℓ · t in the Laur-Lipmaa protocol is that if M is composite, then
Decsk((q · Encpk(−x))^{U(ZM)}) = U(ZM) · (α − x) can be a random element of a
nontrivial subgroup of ZM and thus far from random in ZM; adding 2^ℓ · U(ZT)
guarantees that the result is almost uniform in ZM.
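A toy computation (our own sketch, with artificially small primes and encryption elided) illustrates both the correctness of reducing modulo 2^ℓ and the role of the 2^ℓ · t term for composite M:

```python
# Sketch of the Laur-Lipmaa response for composite M = p*q: when alpha - x
# shares a factor with M, r*(alpha - x) ranges only over a subgroup of Z_M,
# and the extra term t * 2^ell smooths the low-order bits Alice sees.
import random

p, q = 101, 103          # stand-ins for large primes, M = p*q as in Paillier
M = p * q
ell = 3                  # toy bitlength of beta (the paper needs ell << (log M)/2)
T = M >> ell             # ~ 2^{-ell} * M

def die_composite(alpha, x, beta):
    r = random.randrange(M)
    t = random.randrange(T)
    a = (r * (alpha - x) + beta + (t << ell)) % M
    return a % (1 << ell)  # Alice reduces mod 2^ell

assert die_composite(7, 7, 5) == 5  # equal inputs always reveal beta
# For alpha - x = p, r*(alpha - x) mod M is one of only q multiples of p;
# without the t * 2^ell term the low bits of a would leak information.
```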

3 Secure Equality Tests


It is well-known that equality testing can be implemented using a zero-test (given
additively homomorphic primitives) as x = y ⇔ x − y = 0; w.l.o.g., we focus on
testing whether x equals 0 and present two new, secure protocols.
The first zero-test is based on the Hamming distance between a mask and
the masked value. Complexity is linear in the bit-length, but only O(1) ABB
multiplications and outputs are needed online. Hence, when a preprocessing
phase is present, this is highly efficient. Additionally, we present a variation
allowing comparison of ℓ-bit numbers with O(ℓ) preprocessing and O(1) work
online, when 2^{ℓ+k+log n} ≪ M for statistical security parameter k and n parties.
The second approach is based on DIE and reduces the problem from arbitrary
size inputs to κ-bit inputs, where κ is a correctness parameter, e.g., 80. This
simpler problem may then be solved, e.g., using the Hamming-based approach.

3.1 Equality from Hamming Distance

Let ℓM = ⌈log₂ M⌉ be the bitlength of M. The protocol, denoted eqH, is seen as
Protocol 1. It is a variation of [13] with a highly optimized online phase. (Though
phrased differently, Nishide and Ohta [13] did essentially the same thing.)
Alice                          ABB([[x]])                          Bob

Preprocessing:
  [[r]]; [[r_{ℓM−1}]], . . . , [[r0]] ← RandBits()
  ([[R]], [[R^{−1}]]) ← RandInv()
  [[R]], [[R^2]], . . . , [[R^{ℓM}]] ← prefix×([[R]], [[R]], . . . , [[R]])

Online:
  [[m]] ← [[r]] + [[x]]
  m is output to Alice and Bob
  [[1 + H]] ← 1 + Σ_{i=0}^{ℓM−1} (mi + [[ri]] − 2 · mi · [[ri]])
  [[mH]] ← [[R^{−1}]] · [[1 + H]]
  mH is output to Alice and Bob
  for i ← 0 to ℓM do [[(1 + H)^i]] ← mH^i · [[R^i]]
  [[x =? 0]] ← Σ_{i=0}^{ℓM} αi · [[(1 + H)^i]] = [[P_{ℓM}(1 + H)]]

Protocol 1. eqH, secure zero-testing based on Hamming distance

Correctness. Picking a uniformly random, unknown [[r]] and revealing m =
[[x]] + [[r]] allows testing x = 0 by testing whether m = r. If [[r]] is generated
in binary, we can compute the Hamming distance [[H]] = Σ_{i=0}^{ℓM−1} [[ri]] ⊕ mi =
Σ_{i=0}^{ℓM−1} (mi + [[ri]] − 2 · mi · [[ri]]), and test if H = 0. Since H ≤ ℓM, the latter
is simpler than the general zero-test. Let P_{ℓM}(x) = Σ_{i=0}^{ℓM} αi · x^i denote the (at
most) ℓM-degree polynomial that maps 1 to 1 and x ∈ {2, 3, . . . , ℓM + 1} to 0.²
Evaluating P_{ℓM} at 1 + H determines H + 1 = 1 ⇔ m = r ⇔ x = 0.
To avoid Ω(ℓM) online multiplications when computing the ℓM + 1 powers
of [[1 + H]], the following trick is used: A uniformly random value, [[R]] ∈ Z*M, is
chosen in advance, and its powers [[R^0]], [[R^1]], . . . , [[R^{ℓM}]] and inverse, [[R^{−1}]],
are computed in the offline phase. In the online phase, mH = [[R^{−1}]] · [[1 + H]]
is computed and revealed, and the powers of [[1 + H]] are computed from the
powers of [[R]] and the powers of mH, which can be done locally by all parties:
mH^i · [[R^i]] = (R^{−1}(1 + H))^i · [[R^i]] = [[(1 + H)^i]].
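The correctness argument can be replayed in plaintext (a non-secure sketch of ours; it computes H directly and evaluates the Lagrange-interpolated polynomial P_{ℓM} of footnote 2, instead of using the masked powers of [[1 + H]]):

```python
# Plaintext walk-through (no secret sharing) of the eq_H zero test: mask,
# Hamming distance to the mask's bits, then the interpolated polynomial P.
import random

def lagrange_poly_at(points, X, M):
    # Evaluate the polynomial interpolating `points` at X, over prime Z_M.
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (X - xj) % M
                den = den * (xi - xj) % M
        total = (total + yi * num * pow(den, -1, M)) % M
    return total

def eq_h_zero_test(x, M, ellM):
    r = random.randrange(M)
    m = (x + r) % M                                    # the opened value
    H = sum(((m >> i) & 1) ^ ((r >> i) & 1) for i in range(ellM))
    # P maps 1 -> 1 and 2..ellM+1 -> 0; evaluate it at 1 + H.
    points = [(1, 1)] + [(v, 0) for v in range(2, ellM + 2)]
    return lagrange_poly_at(points, 1 + H, M)

M = 2**13 - 1  # a small (Mersenne) prime, so ellM = 13
assert eq_h_zero_test(0, M, 13) == 1
assert eq_h_zero_test(5, M, 13) == 0
```

Online, the ABB only has to open m and mH; the Hamming distance and the evaluation of P_{ℓM} are affine in the shared values, which is why they are costless.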

Privacy. Two values are revealed in eqH, m and mH. Since r is uniformly
random in ZM, then so is m = x + r. Similarly, since 1 + ℓM is smaller³ than the
smallest prime factor of M, we have 1 + H ∈ Z*M. Thus (1 + H) · R^{−1} is uniformly
random in Z*M as R is uniformly random. Simulation in the FABB-hybrid model
consists of providing “fake” m and mH distributed as the real ones.

Complexity. The preprocessing phase consists of generating [[r]] along with its
bits, [[ri]], as well as [[R]], [[R^{−1}]], and [[R^i]] for i ∈ {1, . . . , ℓM}. Overall this
amounts to O(ℓM) work. Online, only 1 ABB-multiplication (to compute mH)
and 2 outputs are needed. Computing the Hamming distance and evaluating
P_{ℓM} are costless.
² P_{ℓM} exists both when M is a prime or an RSA-modulus and the coefficients, αi,
can be computed using Lagrange interpolation. For technical reasons, the input to
P_{ℓM} must belong to Z*M; this is ensured by adding 1.
³ Always the case since M is either a prime or the product of two large primes.
Bounded Inputs. If the input is of bounded size, [[x]] < 2^ℓ, and 2^{ℓ+k+log n} ≪ M
where k is a statistical security parameter, the following variation is possible:
Each party Pj inputs a uniformly random k-bit value, r^{(j)}, and the n parties
jointly generate ℓ random bits, [[ri]], using RandBit. The ABB then computes
[[r]] ← Σ_{j=1}^{n} [[r^{(j)}]] · 2^ℓ + (Σ_{i=0}^{ℓ−1} 2^i · [[ri]]). Here, r statistically masks x: m mod 2^ℓ is
uniformly random, while a single r^{(j)} masks the ℓ’th carry bit of the addition, x +
r, i.e., ⌊m/2^ℓ⌋ is statistically indistinguishable from a sum of uniformly random
k-bit values plus the r^{(j)} of malicious parties. Testing equality between r mod 2^ℓ
and m mod 2^ℓ is sufficient; note that this zero-test allows equality testing even
when the difference between the inputs is negative.
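A small sketch (our own, with toy parameters and all values in the clear) of the bounded-input mask and the resulting low-bits equality check:

```python
# Sketch of the bounded-input mask: r = (sum of the parties' k-bit shares)
# shifted past the ell low bits, plus ell jointly generated random bits.
# Opening m = x + r then statistically hides the ell-bit input x.
import random

def build_mask(n, k, ell):
    high = sum(random.randrange(1 << k) for _ in range(n))  # the r^(j)
    low = random.randrange(1 << ell)                        # the ell RandBits
    return (high << ell) + low

n, k, ell = 3, 40, 32
for x in (0, 123456):              # ell-bit inputs
    r = build_mask(n, k, ell)
    m = x + r                      # opened; M is assumed large enough that no wrap occurs
    # Zero-testing x reduces to comparing the low ell bits of m and r:
    assert (m % (1 << ell) == r % (1 << ell)) == (x == 0)
```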

Theorem 1. Given two ℓ-bit values [[x]] and [[y]] stored in an n-party arithmetic
black-box for ZM augmented with a proof of boundedness, equality may
be computed with 2 outputs and 1 ABB-multiplication in the online phase and
O(ℓ) operations overall. This is the case both when ℓ = ℓM as well as when
2^{ℓ+k+log n} ≪ M, where k is a statistical security parameter.

3.2 Equality from DIE

We utilize the DIE protocol in the ABB model to construct a statistically correct
zero test (and hence an equality test) in the presence of mutually incorruptible
parties, denoted Alice and Bob. Complexity is linear in the correctness parameter,
κ, i.e., it is only useful when the input is of greater bitlength, say ℓ = 1000 and
κ = 80. For the sake of concreteness, we describe the case where M is composite.
The idea is to transform [[x]], x ∈ {0, 1}^ℓ, to [[y]], where y = 0 when x = 0,
and y is (1 − ε)-close to uniformly random, for an exponentially small ε, when
x ≠ 0. Note that here we use the security parameter κ as the bitlength in the
DIE protocol. (See Sect. 2 for the explanation of ε = 2^{κ−1}/spf(M).) The value
y is then used to “mask” t · 2^κ + β, i.e., disclose it when x = 0 and hide it
otherwise. The value revealed to Alice is always statistically close to uniformly
random, hence reducing it modulo 2^κ and testing equality with β provides a zero
test with a probability of failure of 2^{−κ}. Details are seen as Protocol 2, where eq
denotes the equality test from Sect. 3.1 but for κ-bit inputs. We focus on the case
when M is an RSA-modulus and limit the description to the two-party case. The
main benefit of this combined protocol is that combining it with the equality
test above replaces the O(ℓ) offline computation/communication with O(κ)
offline computation/communication. As a drawback, it requires two mutually
incorruptible parties and has only statistical (not perfect) correctness.

Correctness. When x = 0, we have m = t · 2^κ + β and therefore m̃ = β. Thus,
the final equality test correctly determines equality with 0. When x ≠ 0, [[x]] · [[r]]
is (1 − ε)-close to uniformly random since [[r]] is generated using RandElem and
therefore guaranteed to be uniformly random. This implies that m is statistically
close to uniformly random, independently of t · 2^κ + β. Thus, m reveals statistically
almost no information about β. We remark that the ABB must verify not only
Alice                ABB([[x]]), κ, T = ⌈2^{−κ} · M⌉                Bob

Preprocessing:
  [[r]] ← RandElem()                              β ← Z_{2^κ}; t ← Z_T
  Bob inputs β and t with proofs β < 2^κ and t < T

Online:
  [[m]] ← [[x]] · [[r]] + (2^κ · [[t]] + [[β]])
  m is output to Alice
  Alice inputs m̃ = m mod 2^κ with proof m̃ < 2^κ
  [[x = 0]] ← eq(m̃, β)

Protocol 2. eqDIE, secure zero-testing based on disclose-if-equal

m̃ < 2^κ, but also that m̃ = m mod 2^κ. This can be done by providing not
only m̃ = m mod 2^κ < 2^κ, but also ⌊m/2^κ⌋ < ⌈M/2^κ⌉, and verifying that
m = ⌊m/2^κ⌋ · 2^κ + m̃ (e.g., by outputting the difference).
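In plaintext, the reduction performed by Protocol 2 looks as follows (an illustrative sketch of ours with a toy modulus; in the protocol x · r, t, and β of course stay inside the ABB):

```python
# Plaintext sketch of the eq_DIE reduction: an ell-bit zero test on x becomes
# a kappa-bit equality test between m mod 2^kappa and Bob's secret beta.
import random

def eq_die_reduce(x, M, kappa):
    T = M >> kappa                        # floor(M / 2^kappa), conservative choice
    r = random.randrange(M)               # RandElem
    beta = random.randrange(1 << kappa)   # Bob's secret kappa-bit value
    t = random.randrange(T)
    m = (x * r + (t << kappa) + beta) % M
    return m % (1 << kappa), beta         # feed these to the kappa-bit eq test

M = (2**31 - 1) * (2**61 - 1)             # toy composite modulus (product of primes)
m_tilde, beta = eq_die_reduce(0, M, 16)
assert m_tilde == beta                    # x = 0 always yields equality
# For x != 0, m_tilde == beta holds only with probability about 2^-kappa.
```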

Privacy. A corrupt Bob receives no outputs from the ABB, hence simulation
is trivial: do nothing. For a corrupt Alice, note that the only value leaving the
ABB is m, hence this is the only possible information leak. Since Bob is honest,
t · 2^κ + β is chosen correctly; thus, no matter the value of x, m will be statistically
close to uniformly random – either due to Bob’s random choice or the addition
of x · r. Hence, the simulation will consist of a uniformly random element.

Complexity. The protocol consists of one random element generation, three
inputs, and one output plus the invocation of eq. Using eqH of the previous
subsection implies O(κ) work overall and O(1) (but a slightly worse constant)
work online. We state the following theorem:
Theorem 2. Given two ℓ-bit values [[x]] and [[y]] stored in an n-party arithmetic
black-box for ZM augmented with a proof of boundedness, equality may be computed
with 3 outputs and 2 ABB-multiplications in the online phase and O(κ)
operations overall when two mutually incorruptible parties are present.

4 Greater-Than with Sublinear Online Complexity


Toft [16] recently introduced the first sublinear GT protocols, i.e., protocols
computing [[x ≥ y]] from [[x]] and [[y]]. Utilizing eqH and eqDIE from Sect. 3,
we propose two different (and orthogonal) improvements: (i) We can eliminate
the need for two mutually incorruptible parties; this comes at the cost of linear
preprocessing, or (ii) We improve efficiency when two mutually incorruptible
parties exist by an order of magnitude. Similarly to [16] we assume 2^{ℓ+k+log n} ≪
M, where k is a statistical security parameter and n the number of parties.
The overall idea behind Toft’s construction is to perform a GT-test through
log ℓ equality tests: If the ℓ/2 most significant bits of [[x]] and [[y]] differ then
Alice                    ABB([[x]], [[y]], ℓ)                    Bob

if ℓ = 1 then return
([[e_1^{(ℓ)}]], . . . , [[e_{Se}^{(ℓ)}]]) ← eq^{(ℓ/2),preproc}
for i ← 0 to ℓ − 1 do [[ri]] ← RandBit
[[r_⊥^{(ℓ)}]] ← Σ_{i=0}^{ℓ/2−1} 2^i [[ri]]
[[r_⊤^{(ℓ)}]] ← Σ_{i=0}^{ℓ/2−1} 2^i [[r_{i+ℓ/2}]]
r^{(A,ℓ)} ← Z_{2^k}                                    r^{(B,ℓ)} ← Z_{2^k}
Alice inputs r^{(A,ℓ)} with proof r^{(A,ℓ)} < 2^k     Bob inputs r^{(B,ℓ)} with proof r^{(B,ℓ)} < 2^k
[[R^{(ℓ)}]] ← 2^ℓ ([[r^{(A,ℓ)}]] + [[r^{(B,ℓ)}]]) + 2^{ℓ/2} [[r_⊤^{(ℓ)}]] + [[r_⊥^{(ℓ)}]]
([[g_1^{(ℓ)}]], . . . , [[g_{Sg}^{(ℓ)}]]) ← gt^{(ℓ/2),log,preproc}

Protocol 3. gt^{(ℓ),log,preproc}: Preprocessing for the secure, ℓ-bit GT test, gt^{(ℓ),log}

ignore the ℓ/2 least significant ones; if they are equal then continue with the
ℓ/2 least significant ones. (This description is not correct, but provides sufficient
intuition at this point.)

4.1 Sublinear Online Communication in the ABB Model

The main idea of the construction, gt^{(ℓ),log} = (gt^{(ℓ),log,preproc}, gt^{(ℓ),log,online}),
for comparing ℓ-bit numbers is to take the two mutually incorruptible
parties of [16] and implement one using the ABB and executing the other
“publicly.” The core task of the ABB-party is then to generate appropriately
distributed random values. Letting eq^{(ℓ)} = (eq^{(ℓ),preproc}, eq^{(ℓ),online}) denote an
equality test for ℓ-bit numbers (and its offline and online phases), preprocessing
consists of invoking eq^{(2^j),preproc} as well as generating log ℓ random values
[[R^{(2^j)}]] ← 2^{2^j} (Σ_{i=1}^{n} [[r^{(i,2^j)}]]) + 2^{2^{j−1}} [[r_⊤^{(2^j)}]] + [[r_⊥^{(2^j)}]]
for j ∈ {1, . . . , log ℓ}, where [[r^{(i,2^j)}]] is a uniformly random k-bit number
supplied by Pi and [[r_⊤^{(2^j)}]], [[r_⊥^{(2^j)}]] are uniformly random 2^{j−1}-bit
values unknown to all. Details are seen as Protocol 3.
The online phase of the construction is seen as Protocol 4 and explained
in the correctness argument below. For clarity, we include preprocessed values
implicitly in invocations of subprotocols.

Correctness. Correctness is immediate for single bit inputs: 1 − y + x · y is 1
exactly when x ≥ y. For ℓ > 1, the goal is to transform the comparison of ℓ-bit
integers to a comparison of ℓ/2-bit integers. Observe that ⌊z/2^ℓ⌋ equals the
desired result and that this can be computed as 2^{−ℓ}([[z]] − [[z mod 2^ℓ]]). Further,
since 2^{ℓ+k+log n} ≪ M, we have z mod 2^ℓ ≡ m − r (mod 2^ℓ). We reduce m and
[[r]] before the subtraction, which ensures that the result lies between −2^ℓ and
2^ℓ. The correct result is obtained by adding 2^ℓ when this is negative, i.e., when
[[r mod 2^ℓ]] > m mod 2^ℓ. The latter implies [[f]] = 1 since we recursively compare
the ℓ/2 most- or least-significant bits of [[r mod 2^ℓ]] and m mod 2^ℓ depending on
whether the ℓ/2 most significant bits differed.
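Stripping away the masking and the ABB, the recursion implemented obliviously by Protocol 4 selects, at every level, either the top or the bottom halves and halves the bit-length. A plaintext sketch (our own; the real protocol replaces the visible branch by the secret equality-test bit [[b]]):

```python
# Plaintext mirror of the gt recursion: compare the top halves; on equality
# recurse into the bottom halves, halving the bit-length in every call.
def gt(x, y, ell):
    # Returns 1 iff x >= y for ell-bit inputs (ell a power of two).
    if ell == 1:
        return 1 - y + x * y           # 1 exactly when x >= y
    half = ell // 2
    x_top, y_top = x >> half, y >> half
    b = 1 if x_top == y_top else 0     # the protocol's equality test
    # On equality keep the low halves, otherwise the high halves.
    x_sel = (x & ((1 << half) - 1)) if b else x_top
    y_sel = (y & ((1 << half) - 1)) if b else y_top
    return gt(x_sel, y_sel, half)

assert all(gt(x, y, 4) == (1 if x >= y else 0)
           for x in range(16) for y in range(16))
```

The recursion depth is log ℓ, with one equality test and one multiplexing multiplication per level, matching the 3 log ℓ outputs and 2 log ℓ + 1 multiplications counted below.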
Alice                        ABB([[x]], [[y]])                        Bob

if ℓ = 1 then return 1 − [[y]] + [[x]] · [[y]]

[[z]] ← 2^ℓ + [[x]] − [[y]]
[[m]] ← [[z]] + [[R^{(ℓ)}]]
m is output to Alice and Bob
m_⊥ = m mod 2^{ℓ/2}                          m_⊥ = m mod 2^{ℓ/2}
m_⊤ = ⌊m/2^{ℓ/2}⌋ mod 2^{ℓ/2}                m_⊤ = ⌊m/2^{ℓ/2}⌋ mod 2^{ℓ/2}

[[b]] ← eq^{(ℓ/2),online}(m_⊤, [[r_⊤^{(ℓ)}]])
[[m̃]] ← [[b]] · (m_⊥ − m_⊤) + m_⊤
[[r̃]] ← [[b]] · ([[r_⊥^{(ℓ)}]] − [[r_⊤^{(ℓ)}]]) + [[r_⊤^{(ℓ)}]]
[[f]] ← 1 − (gt^{(ℓ/2),log,online}([[m̃]], [[r̃]]))
[[z mod 2^ℓ]] ← ((m mod 2^ℓ) − (2^{ℓ/2} [[r_⊤^{(ℓ)}]] + [[r_⊥^{(ℓ)}]])) + 2^ℓ [[f]]
return [[x ≥ y]] ← 2^{−ℓ} ([[z]] − [[z mod 2^ℓ]])

Protocol 4. gt^{(ℓ),log,online}: Online phase of the secure, ℓ-bit GT test, gt^{(ℓ),log}

Privacy. In each recursive call, m = z + r is revealed, but this is statistically
indistinguishable from a random value distributed as r – as above, r statistically
masks z as the bit-length is (at least) k bits longer; for an honest i’th party Pi,
2^ℓ · r^{(i)} + 2^{ℓ/2} · r_⊤ + r_⊥ is uniformly random.

Complexity. Preprocessing requires O(ℓ) work: Though there is a logarithmic
number of rounds, each one deals with a problem of half size. Hence, the combined
r_⊤, r_⊥ and random masks for eqH,(·) are only O(ℓ) bits overall. Round
complexity is O(1), as the iterations can be preprocessed in parallel.
Each online iteration (for j ∈ {1, . . . , log ℓ}) requires an output (m) and an
execution of eq^{(2^j),online}. Additionally, an ABB-multiplication is used to copy
the most significant differing halves (if these exist). The remaining computation
is purely local or in the form of ABB-additions. Thus, the overall complexity is
O(log ℓ) given that eq^{(·),online} requires a constant number of ABB operations.
Implementing eq^{(·)} as eqH,(·), the above results in a protocol with 3 log ℓ outputs
and 2 log ℓ + 1 ABB-multiplications online; three outputs and two ABB-multiplications
per iteration and a single secure ABB-multiplication in the final,
single-bit comparison. We state the following theorem:
Theorem 3. Given two ℓ-bit values [[x]] and [[y]] stored in an n-party arithmetic
black-box for ZM augmented with a proof of boundedness, greater-than may be
computed with 3 log ℓ outputs and 2 log ℓ + 1 ABB-multiplications in the online
phase when 2^{ℓ+k+log n} ≪ M, where k is a statistical security parameter.
We may adapt the constant-rounds protocol of [16] to the present setting. Sketching
the solution, let c be a (constant) integer and split m mod 2^ℓ into ℓ^{1/c}
strings of length ℓ^{1−1/c}. The most significant differing strings may be determined
using O(ℓ^{1/c}) equality tests and arithmetic; these are then compared
recursively. Overall this requires c iterations and O(c · ℓ^{1/c}) equality tests and
ABB-multiplications/outputs.
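A plaintext sketch of this block-wise variant (our own illustration; the protocol determines the most significant differing block with secure equality tests rather than a visible scan):

```python
# Sketch of the c-round variant: split into ell^(1/c) blocks per level
# instead of two halves, so the recursion depth becomes c.
def gt_blocks(x, y, ell, blocks):
    # Returns 1 iff x >= y; ell should be divisible into `blocks` equal strings.
    if ell <= 1:
        return 1 if x >= y else 0
    width = ell // blocks                    # the ell^(1 - 1/c) string length
    for i in range(blocks - 1, -1, -1):      # most significant block first
        xb = (x >> (i * width)) & ((1 << width) - 1)
        yb = (y >> (i * width)) & ((1 << width) - 1)
        if xb != yb:                         # the O(ell^(1/c)) equality tests
            return gt_blocks(xb, yb, width, blocks)
    return 1                                 # all blocks equal, so x == y

assert all(gt_blocks(x, y, 16, 4) == (1 if x >= y else 0)
           for x in range(0, 65536, 777) for y in range(0, 65536, 991))
```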
Theorem 4. Given two ℓ-bit values [[x]] and [[y]] stored in an n-party arithmetic
black-box for ZM augmented with a proof of boundedness, greater-than may be
computed with O(c · ℓ^{1/c}) ABB operations in O(c) rounds in the online phase
when 2^{ℓ+k+log n} ≪ M, where k is a security parameter.

4.2 Sublinear, DIE-Based Greater-Than

eqDIE,(·) is much more efficient than the equality test used in [16]. Thus, combining
this with Toft’s original protocol⁴ improves practical efficiency and reduces
the theoretical online complexity – O(log ℓ) rounds and work online and⁵
O(log ℓ · (κ + log log ℓ)) ABB-operations overall. The constant-rounds protocol may
also be combined with eqDIE, resulting in an O(c)-rounds protocol with O(c · ℓ^{1/c})
work online and O(ℓ^{1/c}(κ + log ℓ)) work overall. We state the following theorems:

Theorem 5. Given two ℓ-bit values [[x]] and [[y]] stored in an n-party arithmetic
black-box for ZM augmented with a proof of boundedness, GT may be computed in
the presence of two mutually incorruptible parties with 4 log ℓ outputs and 3 log ℓ +
1 ABB-multiplications in the online phase and O(log ℓ · (κ + log log ℓ)) operations
overall when 2^{ℓ+k+log n} ≪ M, where k is a statistical security parameter.

Theorem 6. Given two ℓ-bit values [[x]] and [[y]] stored in an n-party arithmetic
black-box for ZM augmented with a proof of boundedness, greater-than may be
computed in the presence of two mutually incorruptible parties with O(c · ℓ^{1/c})
ABB-operations in O(c) rounds in the online phase and O(ℓ^{1/c}(κ + log ℓ)) operations
overall when 2^{ℓ+k+log n} ≪ M, where k is a statistical security parameter.

Acknowledgements. The first author was supported by the Estonian Research
Council, and by the European Union through the European Regional Development
Fund. The second author was supported by COBE financed by “The Danish
Agency for Science, Technology and Innovation,” with additional support from the
Danish National Research Foundation and The National Science Foundation of
China (under the grant 61061130540) for the Sino-Danish Center for the Theory
of Interactive Computation.

References

1. Aiello, W., Ishai, Y., Reingold, O.: Priced Oblivious Transfer: How to Sell Digital
Goods. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, pp. 119–135.
Springer, Heidelberg (2001)
2. Bar-Ilan, J., Beaver, D.: Non-Cryptographic Fault-Tolerant Computing in a Con-
stant Number of Rounds of Interaction. In: Rudnicki, P. (ed.) PODC 1989,
pp. 201–209. ACM Press (1989)
⁴ The key difference from Protocol 4 is that Bob selects r, while only Alice learns m.
⁵ We add log log ℓ to κ to compensate for a non-constant number of equality tests.
3. Ben-Or, M., Goldwasser, S., Wigderson, A.: Completeness Theorems for Non-
Cryptographic Fault-Tolerant Distributed Computation. In: STOC 1988, pp. 1–10.
ACM Press (1988)
4. Bogetoft, P., Christensen, D.L., Damgård, I., Geisler, M., Jakobsen, T., Krøigaard,
M., Nielsen, J.D., Nielsen, J.B., Nielsen, K., Pagter, J., Schwartzbach, M., Toft,
T.: Secure Multiparty Computation Goes Live. In: Dingledine, R., Golle, P. (eds.)
FC 2009. LNCS, vol. 5628, pp. 325–343. Springer, Heidelberg (2009)
5. Chaabouni, R., Lipmaa, H., Zhang, B.: A Non-interactive Range Proof with
Constant Communication. In: Keromytis, A.D. (ed.) FC 2012. LNCS, vol. 7397,
pp. 179–199. Springer, Heidelberg (2012)
6. Damgård, I.B., Fitzi, M., Kiltz, E., Nielsen, J.B., Toft, T.: Unconditionally Se-
cure Constant-Rounds Multi-party Computation for Equality, Comparison, Bits
and Exponentiation. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876,
pp. 285–304. Springer, Heidelberg (2006)
7. Damgård, I.B., Nielsen, J.B.: Universally Composable Efficient Multiparty Com-
putation from Threshold Homomorphic Encryption. In: Boneh, D. (ed.) CRYPTO
2003. LNCS, vol. 2729, pp. 247–264. Springer, Heidelberg (2003)
8. Groth, J.: Short Pairing-Based Non-interactive Zero-Knowledge Arguments. In:
Abe, M. (ed.) ASIACRYPT 2010. LNCS, vol. 6477, pp. 321–340. Springer, Heidel-
berg (2010)
9. Laur, S., Lipmaa, H.: A New Protocol for Conditional Disclosure of Secrets and
Its Applications. In: Katz, J., Yung, M. (eds.) ACNS 2007. LNCS, vol. 4521,
pp. 207–225. Springer, Heidelberg (2007)
10. Lipmaa, H.: On Diophantine Complexity and Statistical Zero-Knowledge Argu-
ments. In: Laih, C.-S. (ed.) ASIACRYPT 2003. LNCS, vol. 2894, pp. 398–415.
Springer, Heidelberg (2003)
11. Lipmaa, H.: Progression-Free Sets and Sublinear Pairing-Based Non-Interactive
Zero-Knowledge Arguments. In: Cramer, R. (ed.) TCC 2012. LNCS, vol. 7194, pp.
169–189. Springer, Heidelberg (2012)
12. Naor, M., Pinkas, B.: Oblivious Transfer and Polynomial Evaluation. In: STOC
1999, pp. 245–254. ACM Press (1999)
13. Nishide, T., Ohta, K.: Multiparty Computation for Interval, Equality, and Com-
parison Without Bit-Decomposition Protocol. In: Okamoto, T., Wang, X. (eds.)
PKC 2007. LNCS, vol. 4450, pp. 343–360. Springer, Heidelberg (2007)
14. Paillier, P.: Public-Key Cryptosystems Based on Composite Degree Residuosity
Classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238.
Springer, Heidelberg (1999)
15. Thorbek, R.: Linear Integer Secret Sharing. Ph.D. thesis, Aarhus University (2009)
16. Toft, T.: Sub-linear, Secure Comparison with Two Non-colluding Parties. In: Cata-
lano, D., Fazio, N., Gennaro, R., Nicolosi, A. (eds.) PKC 2011. LNCS, vol. 6571,
pp. 174–191. Springer, Heidelberg (2011)
17. Toft, T.: Primitives and Applications for Multiparty Computation. Ph.D. thesis,
Aarhus University (2007)
18. Yu, C.H.: Sign Modules in Secure Arithmetic Circuits. Tech. Rep. 2011/539, IACR
(October 1, 2011), http://eprint.iacr.org/2011/539 (checked in February 2013)
Temporal Network Optimization
Subject to Connectivity Constraints

George B. Mertzios¹, Othon Michail²,
Ioannis Chatzigiannakis², and Paul G. Spirakis²,³

¹ School of Engineering and Computing Sciences, Durham University, UK
² Computer Technology Institute & Press “Diophantus” (CTI), Patras, Greece
³ Department of Computer Science, University of Liverpool, UK
[email protected], {michailo,ichatz,spirakis}@cti.gr

Abstract. In this work we consider temporal networks, i.e. networks
defined by a labeling λ assigning to each edge of an underlying graph G
a set of discrete time-labels. The labels of an edge, which are natural
numbers, indicate the discrete time moments at which the edge is avail-
able. We focus on path problems of temporal networks. In particular, we
consider time-respecting paths, i.e. paths whose edges are assigned by λ
a strictly increasing sequence of labels. We begin by giving two efficient
algorithms for computing shortest time-respecting paths on a temporal
network. We then prove that there is a natural analogue of Menger’s the-
orem holding for arbitrary temporal networks. Finally, we propose two
cost minimization parameters for temporal network design. One is the
temporality of G, in which the goal is to minimize the maximum number
of labels of an edge, and the other is the temporal cost of G, in which
the goal is to minimize the total number of labels used. Optimization of
these parameters is performed subject to some connectivity constraint.
We prove several lower and upper bounds for the temporality and the
temporal cost of some very basic graph families such as rings, directed
acyclic graphs, and trees.

1 Introduction
A temporal (or dynamic) network is, loosely speaking, a network that changes
with time. This notion encloses a great variety of both modern and traditional
networks such as information and communication networks, social networks,
transportation networks, and several physical systems.
In this work, embarking from the foundational work of Kempe et al. [KKK00],
we consider discrete time, that is, we consider networks in which changes occur
at discrete moments in time, e.g. days. This choice is not only a very natural

Supported in part by (i) the project FOCUS implemented under the “ARISTEIA”
Action of the OP “Education and Lifelong Learning” and co-funded by the EU
(ESF) and Greek National Resources, (ii) the FET EU IP project MULTIPLEX
under contract no 317532, and (iii) the EPSRC Grant EP/G043434/1. Full version:
http://ru1.cti.gr/aigaion/?page=publication&kind=single&ID=977

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 657–668, 2013.

© Springer-Verlag Berlin Heidelberg 2013

abstraction of many real systems but also gives to the resulting models a purely
combinatorial flavor. In particular, we consider those networks that can be de-
scribed via an underlying graph G and a labeling λ assigning to each edge of
G a (possibly empty) set of discrete labels. Note that this is a generalization of
the single-label-per-edge model used in [KKK00], as we allow many time-labels
to appear on an edge. These labels are drawn from the natural numbers and
indicate the discrete moments in time at which the corresponding connection is
available. For example, in the case of a communication network, availability of
a communication link at some time t may mean that a communication protocol
is allowed to transmit a data packet over that link at time t.
In this work, we initiate the study of the following fundamental network design
problem: “Given an underlying (di)graph G, assign labels to the edges of G so that
the resulting temporal graph λ(G) minimizes some parameter while satisfying some
connectivity property”. In particular, we consider two cost optimization parame-
ters for a given graph G. The first one, called temporality of G, measures the maxi-
mum number of labels that an edge of G has been assigned. The second one, called
temporal cost of G, measures the total number of labels that have been assigned
to all edges of G (i.e. if |λ(e)| denotes the number of labels assigned to edge e, we
are interested in Σ_{e∈E} |λ(e)|). Each of these two cost measures can be minimized
subject to some particular connectivity property P that the temporal graph λ(G)
has to satisfy. In this work, we consider two very basic connectivity properties. The
first one, that we call the all paths property, requires the temporal graph to pre-
serve every simple path of its underlying graph, where by “preserve a path of G”
we mean that the labeling should provide at least one strictly increasing sequence
of labels on the edges of that path (we also call such a path time-respecting).
For an illustration, consider a directed ring u1 , u2 , . . . , un . We want to deter-
mine the temporality of the ring subject to the all paths property, that is, we
want to find a labeling λ that preserves every simple path of the ring and at the
same time minimizes the maximum number of labels of an edge. Consider the
paths P1 = (u1 , . . . , un ) and P2 = (un−1 , un , u1 , u2 ). It is immediate to observe
that an increasing sequence of labels on the edges of path P1 implies a decreasing
pair of labels on edges (un−1 , un ) and (u1 , u2 ). On the other hand, path P2 uses
first (un−1 , un ) and then (u1 , u2 ) thus it requires an increasing pair of labels
on these edges. It follows that in order to preserve both P1 and P2 we have to
use a second label on at least one of these two edges, thus the temporality is at
least 2. Next, consider the labeling that assigns to each edge (ui , ui+1 ) the labels
{i, n + i}, where 1 ≤ i ≤ n and un+1 = u1 . It is not hard to see that this labeling
preserves all simple paths of the ring. Since the maximum number of labels that
it assigns to an edge is 2, we conclude that the temporality is also at most 2. In
summary, the temporality of preserving all simple paths of a directed ring is 2.
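This argument can be checked mechanically. The sketch below (our own illustration, not part of the paper) tests whether a labeling admits a strictly increasing label choice along a path — greedily taking the smallest feasible label at each step is optimal for this, so the greedy answer is exact — and verifies the labeling {i, n + i} on every simple path (arc) of a directed ring:

```python
def preserves(path_label_sets):
    """Greedy check: does a strictly increasing label choice exist along
    the path? path_label_sets lists each edge's label set in path order."""
    prev = 0
    for labels in path_label_sets:
        nxt = min((l for l in labels if l > prev), default=None)
        if nxt is None:
            return False
        prev = nxt
    return True

def ring_arcs(n):
    """All simple paths of a directed n-ring: contiguous arcs of
    1..n-1 edges, identified by (first edge, number of edges)."""
    return [(s, length) for s in range(n) for length in range(1, n)]

def ring_preserved(n, labeling):
    """labeling[i] is the label set of edge (u_i, u_{i+1 mod n})."""
    return all(
        preserves([labeling[(s + j) % n] for j in range(length)])
        for s, length in ring_arcs(n)
    )

two_labels = lambda n: [{i + 1, n + i + 1} for i in range(n)]  # labels {i, n+i}
one_label = lambda n: [{i + 1} for i in range(n)]              # single label i
```

With two labels per edge every arc is preserved, while a single label per edge fails on the wrapping arcs, matching the lower bound of 2.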
The other connectivity property that we define, called the reach property,
requires the temporal graph to preserve a path from node u to node v whenever
v is reachable from u in the underlying graph. Furthermore, the minimization
of each of our two cost measures can be affected by some problem-specific con-
straints on the labels that we are allowed to use. We consider here one of the
Temporal Network Optimization Subject to Connectivity Constraints 659

most natural constraints, namely an upper bound on the age of the constructed
labeling λ, where the age of a labeling λ is defined to be equal to the maximum
label of λ minus its minimum label plus 1. Now the goal is to minimize the cost
parameter, e.g. the temporality, satisfy the connectivity property, e.g. all paths,
and additionally guarantee that the age does not exceed some given natural k.
Returning to the ring example, it is not hard to see that if we additionally re-
strict the age to be at most n − 1 then we can no longer preserve all paths of a
ring using at most 2 labels per edge. In fact, we must now necessarily use the
worst possible number of labels, i.e. n − 1 on every edge.
Minimizing such parameters may be crucial as, in most real networks, making
a connection available and maintaining its availability does not come for free. At
the same time, such a study is important from a purely graph-theoretic perspec-
tive as it gives some first insight into the structure of specific families of temporal
graphs (e.g. no temporal ring exists with fewer than n + 1 labels). Finally, we
believe that our results are a first step towards answering the following funda-
mental question: “To what extent can algorithmic and structural results of graph
theory be carried over to temporal graphs? ”. For example, is there an analogue
of Menger’s theorem for temporal graphs? One of the results of the present work
is an affirmative answer to the latter question.

1.1 Related Work


Single-label Temporal Graphs and Menger’s Theorem. The model of
temporal graphs that we consider in this work is a direct extension of the single-
label model studied in [Ber96] and [KKK00] to allow for many labels per edge.
In [KKK00], Kempe et al., among other things, proved that there is no analogue
of Menger’s theorem, at least in its original formulation, for arbitrary single-
label temporal networks. In this work, we go a step ahead showing that if one
reformulates Menger’s theorem in a way that takes time into account then a very
natural temporal analogue of Menger’s theorem is obtained. Furthermore, in the
present work, we consider a path as time-respecting if its edges have strictly
increasing labels and not non-decreasing as in the above papers.
Continuous Availabilities (Intervals). Some authors have naturally assumed
that an edge may be available for continuous time-intervals. The techniques used
there are quite different from those needed in the discrete case [XFJ03, FT98].
Distributed Computing on Dynamic Networks. A notable set of recent
works has studied (distributed) computation in worst-case dynamic networks in
which the topology may change arbitrarily from round to round (see e.g. [KLO10,
MCS12]). Population protocols [AAD+ 06] and variants [MCS11a] are collections
of passively mobile finite-state agents that compute something useful in the
limit. Another interesting direction assumes random network dynamicity and the
interest is on determining “good” properties of the dynamic network that hold
with high probability and on designing protocols for distributed tasks [CMM+ 08,
AKL08]. For introductory texts cf. [CFQS12, MCS11b, Sch02].
Distance Labeling. A distance labeling of a graph G is an assignment of unique
labels to the vertices of G so that the distance between any two vertices can be
inferred from their labels alone [GPPR01, KKKP04]. There, the labeling param-
eter to be minimized is the binary length of an appropriate distance encoding,
which is different from our cost parameters.

1.2 Contribution
In Section 2, we formally define the model of temporal graphs under consid-
eration and provide all further necessary definitions. In Section 3, we give two
efficient algorithms for computing shortest time-respecting paths. Then in Sec-
tion 4 we present an analogue of Menger’s theorem which we prove valid for
arbitrary temporal graphs. In the full paper, we also apply our Menger’s ana-
logue to substantially simplify the proof of a recent result on distributed token
gathering. In Section 5, we formally define the temporality and temporal cost
optimization metrics for temporal graphs. In Section 5.1, we provide several up-
per and lower bounds for the temporality of some fundamental graph families
such as rings, directed acyclic graphs (DAGs), and trees, as well as an inter-
esting trade-off between the temporality and the age of rings. Furthermore, we
provide in Section 5.2 a generic method for computing a lower bound of the
temporality of an arbitrary graph w.r.t. the all paths property, and we illustrate
its usefulness in cliques and planar graphs. Finally, we consider in Section 5.3
the temporal cost of a digraph G w.r.t. the reach property, when additionally
the age of the resulting labeling λ(G) is restricted to be the smallest possible.
We prove that this problem is APX-hard. To prove our claim, we first prove
(which may be of interest in its own right) that the Max-XOR(3) problem is
APX-hard via a PTAS reduction from Max-XOR. In the Max-XOR(3) problem, we
are given a 2-CNF formula φ, every literal of which appears in at most 3 clauses,
and we want to compute the greatest number of clauses of φ that can be simul-
taneously XOR-satisfied. Then we provide a PTAS reduction from Max-XOR(3)
to our temporal cost minimization problem. On the positive side, we provide
an (r(G)/n)-factor approximation algorithm for the latter problem, where r(G)
denotes the total number of reachabilities in G.

2 Preliminaries
Given a (di)graph G = (V, E), a labeling of G is a mapping λ : E → 2IN , that is,
a labeling assigns to each edge of G a (possibly empty) set of natural numbers,
called labels.

Definition 1. Let G be a (di)graph and λ be a labeling of G. Then λ(G) is the
temporal graph (or dynamic graph) of G with respect to λ. Furthermore, G is
the underlying graph of λ(G).

We denote by λ(E) the multiset of all labels assigned to the underlying graph by
the labeling λ and by |λ| = |λ(E)| their cardinality (i.e. |λ| = ∑_{e∈E} |λ(e)|). We
also denote by λmin = min{l ∈ λ(E)} the minimum label and by λmax = max{l ∈
λ(E)} the maximum label assigned by λ. We define the age of a temporal graph
λ(G) as α(λ) = λmax − λmin + 1. Note that if λmin = 1 then we have
α(λ) = λmax . For every graph G we denote by LG the set of all possible labelings
λ of G. Furthermore, for every k ∈ N, we define LG,k = {λ ∈ LG : α(λ) ≤ k}.
For every time r ∈ IN, we define the rth instance of a temporal graph λ(G)
as the static graph λ(G, r) = (V, E(r)), where E(r) = {e ∈ E : r ∈ λ(e)} is the
(possibly empty) set of all edges of the underlying graph G that are assigned label
r by labeling λ. A temporal graph λ(G) may be also viewed as a sequence of static
graphs (G1 , G2 , . . . , Gα(λ) ), where Gi = λ(G, λmin + i − 1) for all 1 ≤ i ≤ α(λ).
Another, often convenient, representation of a temporal graph is the following.
Definition 2. The static expansion of a temporal graph λ(G) is a DAG H =
(S, A) defined as follows. If V = {u1 , u2 , . . . , un } then S = {uij : λmin − 1 ≤ i ≤
λmax , 1 ≤ j ≤ n} and A = {(u(i−1)j , uij′ ) : j = j′ or (uj , uj′ ) ∈ E(i) for some
λmin ≤ i ≤ λmax }.
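Definition 2 translates directly into code. The following sketch (our own illustration; the node naming and the dict-based edge representation are ours) builds the time-layered copies and both kinds of arcs:

```python
def static_expansion(n, edge_labels):
    """Static expansion of a temporal graph on nodes 1..n.

    edge_labels maps a directed edge (u, v) to its set of labels.
    Node copies are pairs (i, j): node j at time i. Arcs either stay
    on the same node from one layer to the next (j = j'), or follow
    an edge of the underlying graph that is present at time i."""
    all_labels = [l for ls in edge_labels.values() for l in ls]
    lmin, lmax = min(all_labels), max(all_labels)
    nodes = [(i, j) for i in range(lmin - 1, lmax + 1)
                    for j in range(1, n + 1)]
    arcs = set()
    for i in range(lmin, lmax + 1):
        for j in range(1, n + 1):
            arcs.add(((i - 1, j), (i, j)))        # "wait" arc: j = j'
        for (u, v), ls in edge_labels.items():
            if i in ls:
                arcs.add(((i - 1, u), (i, v)))    # temporal edge at time i
    return nodes, arcs
```

For example, `static_expansion(2, {(1, 2): {1}})` yields two layers (times 0 and 1), the two wait arcs, and the single temporal arc from node 1 at time 0 to node 2 at time 1.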
A journey (or time-respecting path) J of a temporal graph λ(G) is a path
(e1 , e2 , . . . , ek ) of the underlying graph G = (V, E), where ei ∈ E, together
with labels l1 < l2 < . . . < lk such that li ∈ λ(ei ) for all 1 ≤ i ≤ k. In words, a
journey is a path that uses strictly increasing edge-labels. If labeling λ defines
a journey on some path P of G then we also say that λ preserves P . A natural
notation for a journey is (e1 , l1 ), (e2 , l2 ), . . . , (ek , lk ) where each (ei , li ) is called
a time-edge. A (u, v)-journey J is called foremost from time t ∈ IN if l1 ≥ t and
lk is minimized. We say that a journey J leaves from node u (arrives at, resp.)
at time t if (u, v, t) ((v, u, t), resp.) is a time-edge of J. Two journeys are called
out-disjoint (in-disjoint, respectively) if they never leave from (arrive at, resp.)
the same node at the same time. If, in addition to the labeling λ, a positive
weight w(e) > 0 is assigned to every edge e ∈ E, then we get a weighted tempo-
ral graph. If this is the case, then a journey J is called shortest if it minimizes
the sum of the weights of its edges.
Throughout the text, unless otherwise stated, we denote by n the number
of nodes of (di)graphs and by d(G) the diameter of a (di)graph G, that is the
length of the longest shortest path between any two nodes of G. Finally, by δu
we denote the degree of a node u ∈ V (G) (in case of an undirected graph G).

3 Journey Problems
Theorem 1. Let λ(G) be a temporal graph, s ∈ V be a source node, and tstart
a time s.t. λmin ≤ tstart ≤ λmax . There is an algorithm that correctly computes
for all w ∈ V \{s} a foremost (s, w)-journey from time tstart . The running time
of the algorithm is O(nα³(λ) + |λ|).
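The core idea behind such an algorithm can be sketched as a time sweep (an illustration of ours, not the paper's implementation, so the bound of Theorem 1 is not claimed for it): process the labels in increasing order and finalize each node's earliest arrival the first time it is reached.

```python
def foremost_arrivals(edge_labels, source, t_start):
    """Earliest (foremost) arrival labels from `source` via journeys
    whose labels are strictly increasing and start at >= t_start.

    edge_labels maps a directed edge (u, v) to its label set. Returns
    a dict node -> arrival label; `source` gets t_start - 1, meaning
    it is available before any edge is used."""
    arrival = {source: t_start - 1}
    times = sorted({l for ls in edge_labels.values()
                      for l in ls if l >= t_start})
    for t in times:
        newly = set()
        for (u, v), ls in edge_labels.items():
            # strict increase: u must have arrived strictly before t
            if t in ls and u in arrival and arrival[u] < t and v not in arrival:
                newly.add(v)
        for v in newly:           # deferred, so arrivals at time t
            arrival[v] = t        # cannot enable other edges at time t
    return arrival
```

Since times are processed in increasing order, the first label at which a node is reached is its foremost arrival.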
Theorem 2. Let λ(G) be a weighted temporal graph and let s, t ∈ V . Assume
also that |λ(e)| = 1 for all e ∈ E. Then, we can compute a shortest journey J
between s and t in λ(G) (or report that no such journey exists) in O(m log m +
∑_{v∈V} δv²) = O(n³) time, where m = |E|.
4 A Menger’s Analogue for Temporal Graphs

In this section, we prove that, in contrast to an important negative result from
[KKK00], there is a natural analogue of Menger’s theorem that is valid for all
temporal networks. In the full paper, we also apply our theorem to substantially
simplify the proof of a recent token gathering result.
When we say that we remove node departure time (u, t) we mean that we
remove all edges leaving u at time t, i.e. we remove the set {(u, v) ∈ E : t ∈
λ(u, v)}. So, when we ask how many node departure times are needed to separate
two nodes s and v we mean how many node departure times must be selected so
that after the removal of all the corresponding time-edges the resulting temporal
graph has no (s, v)-journey.

Theorem 3 (Menger’s Temporal Analogue). Take any temporal graph
λ(G), where G = (V, E), with two distinguished nodes s and v. The maximum
number of out-disjoint journeys from s to v is equal to the minimum number of
node departure times needed to separate s from v.

Proof. Assume, in order to simplify notation, that λmin = 1. Take the
static expansion H = (S, A) of λ(G). Let {ui1 } and {uin } represent s and
v over time, respectively (first and last columns, respectively), where 0 ≤
i ≤ λmax . We extend H as follows. For each uij , 0 ≤ i ≤ λmax − 1,
with at least 2 outgoing edges to nodes different than u(i+1)j , e.g. to nodes
u(i+1)j1 , u(i+1)j2 , . . . , u(i+1)jk , we add a new node wij and the edges (uij , wij )
and (wij , u(i+1)j1 ), (wij , u(i+1)j2 ), . . . , (wij , u(i+1)jk ). We also define an edge ca-
pacity function c : A → {1, λmax } as follows. All edges of the form (uij , u(i+1)j )
take capacity λmax and all other edges take capacity 1. We are interested in the
maximum flow from u01 to uλmax n . As this is simply a usual static flow network,
the max-flow min-cut theorem applies stating that the maximum flow from u01
to uλmax n is equal to the minimum of the capacity of a cut separating u01 from
uλmax n . Finally, observe that (i) the maximum number of out-disjoint journeys
from s to v is equal to the maximum flow from u01 to uλmax n and (ii) the min-
imum number of node departure times needed to separate s from v is equal to
the minimum of the capacity of a cut separating u01 from uλmax n . ⊓⊔
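The construction in the proof can be exercised on small instances. The sketch below (our illustration) builds a compact, equivalent form of the capacitated expansion — one unit-capacity gadget per departure event (u, t), playing the role of the w-nodes, plus effectively unbounded "wait" arcs — and runs a plain Edmonds–Karp max-flow; by the theorem, the returned value is the maximum number of out-disjoint (s, v)-journeys. Labels are assumed to be natural numbers ≥ 1.

```python
from collections import defaultdict, deque

def add_arc(cap, u, v, c):
    cap[u][v] += c
    cap[v][u] += 0            # make sure the residual arc exists

def max_flow(cap, s, t):
    """Textbook Edmonds-Karp on an adjacency-dict capacity map."""
    flow = defaultdict(int)
    total = 0
    while True:
        parent, queue = {s: None}, deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in cap[u]:
                if v not in parent and cap[u][v] - flow[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][w] - flow[(u, w)] for u, w in path)
        for u, w in path:
            flow[(u, w)] += aug
            flow[(w, u)] -= aug
        total += aug

def max_out_disjoint_journeys(n, edge_labels, s, v):
    """Maximum number of out-disjoint (s, v)-journeys."""
    lmax = max(l for ls in edge_labels.values() for l in ls)
    cap = defaultdict(lambda: defaultdict(int))
    for j in range(1, n + 1):                 # "wait" arcs, huge capacity
        for i in range(1, lmax + 1):
            add_arc(cap, (i - 1, j), (i, j), n * lmax + 1)
    departures = defaultdict(set)             # (u, t) -> possible heads
    for (u, w), ls in edge_labels.items():
        for t in ls:
            departures[(u, t)].add(w)
    for (u, t), heads in departures.items():  # capacity-1 departure gadget
        g = ('dep', u, t)
        add_arc(cap, (t - 1, u), g, 1)
        for w in heads:
            add_arc(cap, g, (t, w), 1)
    return max_flow(cap, (0, s), (lmax, v))
```

In the example `{(1, 2): {1}, (2, 3): {2}, (1, 3): {1, 2}}` only two departure events leave node 1, so at most two out-disjoint journeys from 1 to 3 exist, and two are realized.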

5 Minimum Cost Temporal Connectivity

In this section, we introduce (in Definition 3) the temporality and temporal cost
measures. These measures can be minimized subject to some particular connec-
tivity property P that the labeled graph λ(G) has to satisfy. For simplicity of
notation, we consider the connectivity property P as a subset of the set LG of all
possible labelings λ on the (di)graph G. Furthermore, the minimization of each
of these two cost measures can be affected by some problem-specific constraints
on the labels that we are allowed to use. We consider one of the most natural
constraints, namely an upper bound on the age of the constructed labeling.
Definition 3. Let G = (V, E) be a (di)graph, αmax ∈ N, and P be a connectivity
property. Then the temporality of (G, P, αmax ) is

τ(G, P, αmax ) = min_{λ∈P∩LG,αmax} max_{e∈E} |λ(e)|

and the temporal cost of (G, P, αmax ) is

κ(G, P, αmax ) = min_{λ∈P∩LG,αmax} ∑_{e∈E} |λ(e)|.

Furthermore τ(G, P) = τ(G, P, ∞) and κ(G, P) = κ(G, P, ∞).

Note that Definition 3 can be stated for an arbitrary property P of the labeled
graph λ(G) (e.g. some proper coloring-preserving property). Nevertheless, we
only consider here P to be a connectivity property of λ(G). In particular, we
investigate the following two connectivity properties P:

– all-paths(G) = {λ ∈ LG : for all simple paths P of G, λ preserves P },


– reach(G) = {λ ∈ LG : for all u, v ∈ V where v is reachable from u in G, λ
preserves at least one simple path from u to v}.

5.1 Basic Properties of Temporality Parameters


5.1.1 Preserving All Paths
We begin with some simple observations on τ (G, all paths). Recall that given a
(di)graph G our goal is to label G so that all simple paths of G are preserved by
using as few labels per edge as possible. First note that if p(G) is the length of
the longest path in G then τ (G, all paths) ≤ p(G) for all graphs G: just give to
every edge the labels {1, 2, . . . , p(G)}.
A topological sort of a digraph G is a linear ordering of its nodes such that
if G contains an edge (u, v) then u appears before v in the ordering. It is well
known that a digraph G can be topologically sorted iff it is a DAG.

Proposition 1. If G is a DAG then τ (G, all paths) = 1.

Proof. Take a topological sort u1 , u2 , . . . , un of G. Give to every edge (ui , uj ),
where i < j, label i. ⊓⊔
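For instance, the proof's labeling can be sketched as follows (an illustration of ours, using a Kahn-style topological sort):

```python
def topological_order(nodes, edges):
    """Kahn's algorithm; assumes the digraph is a DAG."""
    indeg = {v: 0 for v in nodes}
    succ = {v: [] for v in nodes}
    for u, v in edges:
        indeg[v] += 1
        succ[u].append(v)
    order = []
    ready = [v for v in nodes if indeg[v] == 0]
    while ready:
        u = ready.pop()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order

def dag_all_paths_labeling(nodes, edges):
    """Single label per edge preserving every path of a DAG: edge
    (u, v) gets the topological position of u, so labels strictly
    increase along any path."""
    pos = {u: i + 1 for i, u in enumerate(topological_order(nodes, edges))}
    return {(u, v): pos[u] for (u, v) in edges}
```

Along any path the tails occur at strictly increasing topological positions, so every path of the DAG is time-respecting under this single-label assignment.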

5.1.2 Preserving All Reachabilities


Now, instead of preserving all paths, we impose the apparently simpler requirement
of preserving just a single path between every reachability pair u, v ∈ V . We claim
that it is sufficient to understand how τ(G, reach) behaves on strongly connected
digraphs. Let C(G) be the set of all strongly connected components of a digraph G.
The following lemma proves that, w.r.t. the reach property, the temporality of any
digraph G is upper bounded by the maximum temporality of its components.

Lemma 1. τ (G, reach) ≤ maxC∈C(G) τ (C, reach) for every digraph G.



Lemma 1 implies that any upper bound on the temporality of preserving the
reachabilities of strongly connected digraphs can be used as an upper bound
on the temporality of preserving the reachabilities of general digraphs. An in-
teresting question is whether there is some bound on τ (G, reach) either for all
digraphs or for specific families of digraphs. By using Lemma 1, it can be proved
that indeed there is a very satisfactory generic upper bound.
Theorem 4. τ (G, reach) ≤ 2 for all digraphs G.

5.1.3 Restricting the Age


Now notice that for all G we have τ (G, reach, d(G)) ≤ d(G); recall that d(G)
denotes the diameter of (di)graph G. Indeed it suffices to label each edge by
{1, 2, . . . , d(G)}. Thus, a clique G has trivially τ (G, reach, d(G)) = 1 as d(G) = 1
and we can only have large τ (G, reach, d(G)) in graphs with large diameter. For
example, a directed ring G of size n has τ (G, reach, d(G)) = n − 1. Indeed,
assume that from some edge e, label 1 ≤ i ≤ n − 1 is missing. It is easy to see
that there is some shortest path between two nodes of the ring that in order to
arrive by time n − 1 must use edge e at time i. As this label is missing, it uses
label i+1, thus it arrives by time n which is greater than the diameter. On a ring
we can preserve the diameter only if all edges have the labels {1, 2, . . . , n − 1}.
On the other hand, there are graphs with large diameter in which
τ (G, reach, d(G)) is small. This may also be the case even if G is strongly con-
nected. For example, consider the graph with nodes u1 , u2 , . . . , un and edges
(ui , ui+1 ) and (ui+1 , ui ) for all 1 ≤ i ≤ n − 1. In words, we have a directed line
from u1 to un and an inverse one from un to u1 . The diameter here is n − 1 (e.g.
the shortest path from u1 to un ) but τ (G, reach, d(G)) = 1: simply label one
path 1, 2, ..., n − 1 and label the inverse one 1, 2, ..., n − 1 again, i.e. give to edges
(ui , ui+1 ) and (un−i+1 , un−i ) label i. Now consider an undirected tree T .
Theorem 5. If T is an undirected tree then τ (T, all paths, d(T )) ≤ 2.
We next present an interesting trade-off between the temporality and the age of
a directed ring.
Theorem 6. If G is a directed ring and α = (n − 1) + k, where 1 ≤ k ≤ n − 1, then
τ(G, all paths, α) = Θ(n/k) and in particular (n − 1)/(k + 1) + 1 ≤ τ(G, all paths, α) ≤
n/(k + 1) + 1. Moreover, τ(G, all paths, n − 1) = n − 1 (i.e. when k = 0).

5.2 A Generic Method for Lower Bounding Temporality


We show here that there are graphs G for which τ (G, all paths) = Ω(p(G))
(recall that p(G) denotes the length of the longest path in G), that is graphs
in which the optimum labeling, w.r.t. temporality, is very close to the trivial
labeling λ(e) = {1, 2, . . . , p(G)}, for all e ∈ E.
Definition 4. Call a set K = {e1 , e2 , . . . , ek } ⊆ E(G) of edges of a digraph G
an edge-kernel if for every permutation π = (ei1 , ei2 , . . . , eik ) of K there is a
simple path of G that visits all edges of K in the ordering defined by π.
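On small digraphs the definition can be checked by brute force; the sketch below (ours, exponential-time and only for toy instances) enumerates all simple paths and all orderings of K:

```python
from itertools import permutations

def is_edge_kernel(nodes, edges, K):
    """Brute-force test of the edge-kernel definition on a small
    digraph: every permutation of K must occur, in order, along
    some simple path."""
    succ = {u: [v for (a, v) in edges if a == u] for u in nodes}
    paths = []                      # edge sequences of all simple paths
    def extend(path):
        paths.append(list(zip(path, path[1:])))
        for v in succ[path[-1]]:
            if v not in path:
                extend(path + [v])
    for s in nodes:
        extend([s])
    def realized(order):
        for pe in paths:
            i, ok = -1, True
            for e in order:
                try:
                    i = pe.index(e, i + 1)   # e must appear after position i
                except ValueError:
                    ok = False
                    break
            if ok:
                return True
        return False
    return all(realized(p) for p in permutations(K))
```

On the complete digraph on 4 nodes, the two vertex-disjoint edges (1, 2) and (3, 4) form an edge-kernel of size 2 = ⌊4/2⌋, in line with Lemma 2, whereas {(1, 2), (2, 1)} does not, since no simple path can use (1, 2) and later (2, 1).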

The following theorem states that an edge-kernel of size k needs at least k labels
on some edge(s).

Theorem 7 (Edge-Kernel Lower Bound). If a digraph G contains an edge-
kernel of size k then τ(G, all paths) ≥ k.

The usefulness of Theorem 7 is that it allows us to establish a lower bound k on
the temporality of a graph G by only proving the existence of an edge-kernel of
size k in G. We now apply this to complete digraphs and planar graphs.

Lemma 2. If G is a complete digraph of order n then it has an edge-kernel of
size ⌊n/2⌋.

Now Theorem 7 implies that if G is a complete digraph then ⌊n/2⌋ ≤
τ(G, all paths) ≤ n − 1.
Lemma 3. There exist planar graphs having edge-kernels of size Ω(n^{1/3}).

5.3 Computing the Cost


5.3.1 Hardness of Approximation
Consider a boolean formula φ in conjunctive normal form with two literals in every
clause (2-CNF). Let τ be a truth assignment of the variables of φ and α = (ℓ1 ∨ ℓ2 )
be a clause of φ. Then α is XOR-satisfied in τ, if one of the literals {ℓ1 , ℓ2 } of the
clause α is true in τ and the other one is false in τ. The number of clauses of φ that
are XOR-satisfied in τ is denoted by |t(φ)|. The formula φ is XOR-satisfiable if there
exists a truth assignment τ of φ such that every clause of φ is XOR-satisfied in τ. The
Max-XOR problem is the following maximization problem: given a 2-CNF formula
φ, compute the greatest number of clauses of φ that can be simultaneously XOR-
satisfied in a truth assignment τ, i.e. compute the greatest value for |t(φ)|. The Max-
XOR(k) problem is the special case of the Max-XOR problem, where every
literal of the input formula φ appears in at most k clauses of φ. Max-XOR is known
to be APX-hard, i.e. it does not admit a PTAS unless P = NP [KMSV99, CKS01].
In the next lemma we prove that Max-XOR(3) remains APX-hard by providing a
PTAS reduction from Max-XOR.
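For intuition, Max-XOR can be solved exhaustively on small formulas (an illustrative brute force of ours; literals are encoded as signed integers, +i for xi and −i for its negation):

```python
from itertools import product

def max_xor(clauses, n_vars):
    """Greatest number of simultaneously XOR-satisfied clauses of a
    2-CNF formula: a clause (l1 ∨ l2) counts iff its two literals get
    different truth values under the assignment."""
    def value(lit, assignment):
        v = assignment[abs(lit) - 1]
        return v if lit > 0 else not v
    return max(
        sum(1 for l1, l2 in clauses
              if value(l1, a) != value(l2, a))
        for a in product([False, True], repeat=n_vars)
    )
```

For example, the two clauses (x1 ∨ x2) and (x1 ∨ ¬x2) demand x1 ≠ x2 and x1 = x2 simultaneously, so at most one can be XOR-satisfied.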

Lemma 4. The Max-XOR(3) problem is APX-hard.

Now we provide a reduction from the Max-XOR(3) problem to the problem of
computing κ(G, reach, d(G)). Let φ be an instance formula of Max-XOR(3) with
n variables x1 , x2 , . . . , xn and m clauses. Since every variable xi appears in φ
(either as xi or as ¬xi ) in at most 3 clauses, it follows that m ≤ 3n/2. We will
construct from φ a graph Gφ in which every directed cycle has length at most 2. Then,
as we prove in Theorem 8, κ(Gφ , reach, d(Gφ )) ≤ 39n − 4m − 2k if and only if
there exists a truth assignment τ of φ with |t(φ)| ≥ k, i.e. τ XOR-satisfies at
least k clauses of φ. Since φ is an instance of Max-XOR(3), we can replace every
clause (¬xi ∨ ¬xj ) by the clause (xi ∨ xj ) in φ, since (¬xi ∨ ¬xj ) = (xi ∨ xj ) in XOR.
Furthermore, whenever (¬xi ∨ xj ) is a clause of φ, where i < j, we can replace
this clause by (xi ∨ ¬xj ), since (¬xi ∨ xj ) = (xi ∨ ¬xj ) in XOR. Thus, we can assume
w.l.o.g. that every clause of φ is either of the form (xi ∨ xj ) or (xi ∨ ¬xj ), i < j.
For every i = 1, 2, . . . , n we construct the graph Gφ,i of Figure 1. Note that
the diameter of Gφ,i is d(Gφ,i ) = 9 and the maximum length of a directed
cycle in Gφ,i is 2. In this figure, we call the induced subgraph of Gφ,i on the
13 vertices {s^{xi}, u1^{xi}, . . . , u6^{xi}, v1^{xi}, . . . , v6^{xi}} the trunk of Gφ,i . Furthermore, for
every p ∈ {1, 2, 3}, we call the induced subgraph of Gφ,i on the 5 vertices
{u7,p^{xi}, u8,p^{xi}, v7,p^{xi}, v8,p^{xi}, tp^{xi}} the pth branch of Gφ,i . Finally, we call the edges
u6^{xi}u7,p^{xi} and v6^{xi}v7,p^{xi} the transition edges of the pth branch of Gφ,i . Furthermore, for every
i = 1, 2, . . . , n, let ri ≤ 3 be the number of clauses in which variable xi appears
in φ. For every 1 ≤ p ≤ ri , we assign the pth appearance of the variable xi
(either as xi or as ¬xi ) in a clause of φ to the pth branch of Gφ,i .
Consider now a clause α = (ℓi ∨ ℓj ) of φ, where i < j. Then, by our as-
sumptions on φ, it follows that ℓi = xi and ℓj ∈ {xj , ¬xj }. Assume that the
literal ℓi (resp. ℓj ) of the clause α corresponds to the pth (resp. to the qth)
appearance of the variable xi (resp. xj ) in φ. Then we identify the vertices of
the pth branch of Gφ,i with the vertices of the qth branch of Gφ,j as follows.
If ℓj = xj then we identify the vertices u7,p^{xi}, u8,p^{xi}, v7,p^{xi}, v8,p^{xi}, tp^{xi} with the vertices
v7,q^{xj}, v8,q^{xj}, u7,q^{xj}, u8,q^{xj}, tq^{xj}, respectively. Otherwise, if ℓj = ¬xj then we identify the
vertices u7,p^{xi}, u8,p^{xi}, v7,p^{xi}, v8,p^{xi}, tp^{xi} with the vertices u7,q^{xj}, u8,q^{xj}, v7,q^{xj}, v8,q^{xj}, tq^{xj}, respec-
tively. This completes the construction of the graph Gφ . Note that, similarly to
the graphs Gφ,i , 1 ≤ i ≤ n, the diameter of Gφ is d(Gφ ) = 9 and the maximum
length of a directed cycle in Gφ is 2. Furthermore, note that for each of the m
clauses of φ, one branch of a gadget Gφ,i coincides with one branch of a gadget
Gφ,j , where 1 ≤ i < j ≤ n, while every Gφ,i has three branches. Therefore Gφ
has exactly 3n − 2m branches which belong to only one gadget Gφ,i , and m
branches that belong to two gadgets Gφ,i , Gφ,j .

Fig. 1. The gadget Gφ,i for the variable xi



Theorem 8. There exists a truth assignment τ of φ with |t(φ)| ≥ k if and only
if κ(Gφ , reach, d(Gφ )) ≤ 39n − 4m − 2k.
Using Theorem 8, we are now ready to prove the main theorem of this section.
Theorem 9 (Hardness of Approximating the Temporal Cost). The prob-
lem of computing κ(G, reach, d(G)) is APX-hard, even when the maximum
length of a directed cycle in G is 2.
Proof. Denote now by OPTMax-XOR(3) (φ) the greatest number of clauses that
can be simultaneously XOR-satisfied by a truth assignment of φ. Then Theo-
rem 8 implies that

κ(Gφ , reach, d(Gφ )) ≤ 39n − 4m − 2 · OPTMax-XOR(3) (φ).

Note that a random assignment XOR-satisfies each clause of φ with probability
1/2, and thus we can easily compute (even deterministically) an assignment τ that
XOR-satisfies m/2 clauses of φ. Therefore OPTMax-XOR(3) (φ) ≥ m/2, and thus, since
every variable xi appears in at least one clause of φ, it follows that n ≤ m ≤
2 · OPTMax-XOR(3) (φ).
Assume that there is a PTAS for computing κ(G, reach, d(G)). Then, for every
ε > 0 we can compute in polynomial time a labeling λ for the graph Gφ , such
that |λ| ≤ (1 + ε) · κ(Gφ , reach, d(Gφ )).
Given such a labeling λ we can compute by the sufficiency part (⇐) of the
proof of Theorem 8 a truth assignment τ of φ such that 39n − 4m − 2|t(φ)| ≤ |λ|,
i.e. 2|t(φ)| ≥ 39n − 4m − |λ|.
Therefore it follows by all the above that 2|t(φ)| ≥ 39n − 4m − (1 + ε) ·
κ(Gφ , reach, d(Gφ )) ≥ 39n − 4m − (1 + ε) · (39n − 4m − 2 · OPTMax-XOR(3) (φ)) =
ε · (4m − 39n) + 2(1 + ε) · OPTMax-XOR(3) (φ) ≥ −35εm + (2 + 2ε) · OPTMax-XOR(3) (φ)
≥ −35ε · 2 OPTMax-XOR(3) (φ) + (2 + 2ε) · OPTMax-XOR(3) (φ) = (2 − 68ε) ·
OPTMax-XOR(3) (φ) and thus

|t(φ)| ≥ (1 − 34ε) · OPTMax-XOR(3) (φ).

That is, assuming a PTAS for computing κ(G, reach, d(G)), we obtain a PTAS
for the Max-XOR(3) problem, which is a contradiction by Lemma 4. Therefore
computing κ(G, reach, d(G)) is APX-hard. Finally, notice that the constructed
graph Gφ has maximum length of a directed cycle at most 2. ⊓⊔

5.3.2 Approximating the Cost


In this section, we provide an approximation algorithm for computing
κ(G, reach, d(G)), which complements the hardness result of Theorem 9. Given
a digraph G define, for every u ∈ V , u’s reachability number r(u) = |{v ∈ V :
v is reachable from u}| and r(G) = ∑_{u∈V} r(u), that is r(G) is the total number
of reachabilities in G.
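The quantity r(G) is easy to compute with one graph search per node (a straightforward sketch of ours; whether u counts itself is a matter of convention, and here v counts when it is reachable from u by a nonempty path, so u itself counts only if it lies on a directed cycle):

```python
def total_reachabilities(nodes, edges):
    """r(G) = sum over u of r(u), where r(u) counts the nodes
    reachable from u by a nonempty directed path."""
    succ = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
    total = 0
    for s in nodes:
        seen, stack = set(), list(succ[s])
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(succ[u])
        total += len(seen)        # r(s)
    return total
```

For a directed path on 3 nodes this gives r(G) = 2 + 1 + 0 = 3, while for a directed 3-cycle every node reaches all three nodes, giving r(G) = 9.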
Theorem 10. There is an r(G)/(n − 1)-factor approximation algorithm for computing
κ(G, reach, d(G)) on any weakly connected digraph G.

References
[AAD+ 06] Angluin, D., Aspnes, J., Diamadi, Z., Fischer, M.J., Peralta, R.: Compu-
tation in networks of passively mobile finite-state sensors. Distributed
Computing 18(4), 235–253 (2006)
[AKL08] Avin, C., Koucký, M., Lotker, Z.: How to explore a fast-changing world
(Cover time of a simple random walk on evolving graphs). In: Aceto,
L., Damgård, I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A.,
Walukiewicz, I. (eds.) ICALP 2008, Part I. LNCS, vol. 5125, pp. 121–132.
Springer, Heidelberg (2008)
[Ber96] Berman, K.A.: Vulnerability of scheduled networks and a generalization of
Menger’s theorem. Networks 28(3), 125–134 (1996)
[CFQS12] Casteigts, A., Flocchini, P., Quattrociocchi, W., Santoro, N.: Time-varying
graphs and dynamic networks. IJPEDS 27(5), 387–408 (2012)
[CKS01] Creignou, N., Khanna, S., Sudan, M.: Complexity classifications of boolean
constraint satisfaction problems. SIAM Monographs on Discrete Mathe-
matics and Applications (2001)
[CMM+ 08] Clementi, A.E., Macci, C., Monti, A., Pasquale, F., Silvestri, R.: Flooding
time in edge-markovian dynamic graphs. In: Proc. of the 27th ACM Symp.
on Principles of Distributed Computing (PODC), pp. 213–222 (2008)
[FT98] Fleischer, L., Tardos, É.: Efficient continuous-time dynamic network flow
algorithms. Operations Research Letters 23(3), 71–80 (1998)
[GPPR01] Gavoille, C., Peleg, D., Pérennes, S., Raz, R.: Distance labeling in graphs.
In: Proc. of the 12th annual ACM-SIAM Symposium on Discrete Algo-
rithms (SODA), Philadelphia, PA, USA, pp. 210–219 (2001)
[KKK00] Kempe, D., Kleinberg, J., Kumar, A.: Connectivity and inference prob-
lems for temporal networks. In: Proceedings of the 32nd Annual ACM
Symposium on Theory of Computing (STOC), pp. 504–513 (2000)
[KKKP04] Katz, M., Katz, N.A., Korman, A., Peleg, D.: Labeling schemes for flow
and connectivity. SIAM Journal on Computing 34(1), 23–40 (2004)
[KLO10] Kuhn, F., Lynch, N., Oshman, R.: Distributed computation in dynamic
networks. In: Proceedings of the 42nd ACM Symposium on Theory of
Computing (STOC), pp. 513–522. ACM, New York (2010)
[KMSV99] Khanna, S., Motwani, R., Sudan, M., Vazirani, U.: On syntactic ver-
sus computational views of approximability. SIAM Journal on Comput-
ing 28(1), 164–191 (1999)
[MCS11a] Michail, O., Chatzigiannakis, I., Spirakis, P.G.: Mediated population pro-
tocols. Theoretical Computer Science 412(22), 2434–2450 (2011)
[MCS11b] Michail, O., Chatzigiannakis, I., Spirakis, P.G.: New Models for Popula-
tion Protocols. In: Lynch, N.A. (ed.) Synthesis Lectures on Distributed
Computing Theory. Morgan & Claypool (2011)
[MCS12] Michail, O., Chatzigiannakis, I., Spirakis, P.G.: Causality, influence, and
computation in possibly disconnected synchronous dynamic networks. In:
Baldoni, R., Flocchini, P., Binoy, R. (eds.) OPODIS 2012. LNCS, vol. 7702,
pp. 269–283. Springer, Heidelberg (2012)
[Sch02] Scheideler, C.: Models and techniques for communication in dynamic net-
works. In: Alt, H., Ferreira, A. (eds.) STACS 2002. LNCS, vol. 2285, pp.
27–49. Springer, Heidelberg (2002)
[XFJ03] Xuan, B., Ferreira, A., Jarry, A.: Computing shortest, fastest, and foremost
journeys in dynamic networks. International Journal of Foundations of
Computer Science 14(02), 267–285 (2003)
Strong Bounds for Evolution in Networks

George B. Mertzios¹ and Paul G. Spirakis²,³

¹ School of Engineering and Computing Sciences, Durham University, UK
² Department of Computer Science, University of Liverpool, UK
³ Computer Technology Institute and University of Patras, Greece
[email protected], [email protected]

Abstract. This work extends what is known so far for a basic model of
evolutionary antagonism in undirected networks (graphs). More specif-
ically, this work studies the generalized Moran process, as introduced
by Lieberman, Hauert, and Nowak [Nature, 433:312-316, 2005], where
the individuals of a population reside on the vertices of an undirected
connected graph. The initial population has a single mutant of a fitness
value r (typically r > 1), residing at some vertex v of the graph, while
every other vertex is initially occupied by an individual of fitness 1. At
every step of this process, an individual (i.e. vertex) is randomly chosen
for reproduction with probability proportional to its fitness, and then it
places a copy of itself on a random neighbor, thus replacing the individ-
ual that was residing there. The main quantity of interest is the fixation
probability, i.e. the probability that eventually the whole graph is occu-
pied by descendants of the mutant. In this work we concentrate on the
fixation probability when the mutant is initially on a specific vertex v,
thus refining the older notion of Lieberman et al. which studied the fix-
ation probability when the initial mutant is placed at a random vertex.
We then aim at finding graphs that have many “strong starts” (or many
“weak starts”) for the mutant. Thus we introduce a parameterized no-
tion of selective amplifiers (resp. selective suppressors) of evolution. We
prove the existence of strong selective amplifiers (i.e. for h(n) = Θ(n)
vertices v the fixation probability of v is at least 1 − c(r)/n for a function c(r) that depends only on r), and the existence of quite strong
selective suppressors. Regarding the traditional notion of fixation probability from a random start, we provide strong upper and lower bounds:
first we demonstrate the non-existence of “strong universal” amplifiers,
and second we prove the Thermal Theorem, which states that for any
undirected graph, when the mutant starts at vertex v, the fixation probability is at least (r − 1)/(r + deg v/degmin). This theorem (which extends the
“Isothermal Theorem” of Lieberman et al. for regular graphs) implies
an almost tight lower bound for the usual notion of fixation probability.
Our proof techniques are original and are based on new domination ar-
guments which may be of general interest in Markov Processes that are
of the general birth-death type.

This work was partially supported by (i) the FET EU IP Project MULTIPLEX
(Contract no 317532), (ii) the ERC EU Grant ALGAME (Agreement no 321171),
and (iii) the EPSRC Grant EP/G043434/1. The full version of this paper is available
at http://arxiv.org/abs/1211.2384

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 669–680, 2013.

© Springer-Verlag Berlin Heidelberg 2013

1 Introduction
Population and evolutionary dynamics have been extensively studied [2, 6, 7, 15,
21, 24, 25], mainly on the assumption that the evolving population is homoge-
neous, i.e. it has no spatial structure. One of the main models in this area is the
Moran Process [19], where the initial population contains a single mutant with
fitness r > 0, with all other individuals having fitness 1. At every step of this
process, an individual is chosen for reproduction with probability proportional
to its fitness. This individual then replaces a second individual, which is chosen
uniformly at random, with a copy of itself. Such dynamics as the above have been
extensively studied also in the context of strategic interaction in evolutionary
game theory [11–14, 23].
In a recent article, Lieberman, Hauert, and Nowak [16] (see also [20]) in-
troduced a generalization of the Moran process, where the individuals of the
population are placed on the vertices of a connected graph (which is, in general,
directed) such that the edges of the graph determine competitive interaction. In
the generalized Moran process, the initial population again consists of a single
mutant of fitness r, placed on a vertex that is chosen uniformly at random, with
each other vertex occupied by a non-mutant of fitness 1. An individual is chosen
for reproduction exactly as in the standard Moran process, but now the second
individual to be replaced is chosen among its neighbors in the graph uniformly
at random (or according to some weights of the edges) [16, 20]. If the underly-
ing graph is the complete graph, then this process becomes the standard Moran
process on a homogeneous population [16, 20]. Several similar models describing
infections and particle interactions have been also studied in the past, including
the SIR and SIS epidemics [10, Chapter 21], the voter and antivoter models and
the exclusion process [1,9,17]. However such models do not consider the issue of
different fitness of the individuals.
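For concreteness, the step rule just described is easy to simulate; the following is an illustrative sketch (not from the paper), with the graph given as an adjacency dictionary and `r` the mutant fitness:

```python
import random

def run_moran(adj, start, r, rng):
    """Simulate the generalized Moran process until absorption.

    adj: dict mapping each vertex to the list of its neighbors.
    start: vertex carrying the single initial mutant of fitness r.
    Returns True on fixation (all vertices mutant), False on extinction.
    """
    infected = {start}
    vertices = list(adj)
    n = len(vertices)
    while 0 < len(infected) < n:
        # An individual reproduces with probability proportional to fitness.
        weights = [r if v in infected else 1.0 for v in vertices]
        v = rng.choices(vertices, weights=weights, k=1)[0]
        # Its copy replaces a uniformly random neighbor.
        u = rng.choice(adj[v])
        if v in infected:
            infected.add(u)
        else:
            infected.discard(u)
    return len(infected) == n
```

On the complete graph this coincides with the standard Moran process, so the empirical fixation frequency approaches (1 − 1/r)/(1 − 1/r^n).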
The central question that emerges in the generalized Moran process is how the
population structure affects evolutionary dynamics [16, 20]. In the present work
we consider the generalized Moran process on arbitrary finite, undirected, and
connected graphs. On such graphs, the generalized Moran process terminates
almost surely, reaching either fixation of the graph (all vertices are occupied
by copies of the mutant) or extinction of the mutants (no copy of the mutant
remains). The fixation probability of a graph G for a mutant of fitness r, is the
probability that eventually fixation is reached when the mutant is initially placed
at a random vertex of G, and is denoted by fr (G). The fixation probability can,
in principle, be determined using standard Markov Chain techniques. But doing
so for a general graph on n vertices requires solving a linear system of 2^n linear
equations. Such a task is not computationally feasible, even numerically. As a
result of this, most previous work on computing fixation probabilities in the
generalized Moran process was either restricted to graphs of small size [6] or
to graph classes which have a high degree of symmetry, reducing thus the size
of the corresponding linear system (e.g. paths, cycles, stars, and cliques [3–5]).
Experimental results on the fixation probability of random graphs derived from
grids can be found in [22].

A recent result [8] shows how to construct fully polynomial randomized ap-
proximation schemes (FPRAS) for the probability of reaching fixation (when
r ≥ 1) or extinction (for all r > 0). The result of [8] uses a Monte Carlo es-
timator, i.e. it runs the generalized Moran process several times¹, while each
run terminates in polynomial time with high probability [8]. Note that improved
lower and upper bounds on the fixation probability immediately lead to a better
estimator here. Until now, the only known general bounds for the fixation probability on connected undirected graphs are that fr(G) ≥ 1/n and fr(G) ≤ 1 − 1/(n + r).
Lieberman et al. [16, 20] proved the Isothermal Theorem, stating that (in the
case of undirected graphs) the fixation probability of a regular graph (i.e. of
a graph with overall the same vertex degree) is equal to that of the complete
graph (i.e. the homogeneous population of the standard Moran process), which
equals (1 − 1/r)/(1 − 1/r^n), where n is the size of the population. Intuitively,
in the Isothermal Theorem, every vertex of the graph has a temperature which
determines how often this vertex is being replaced by other individuals dur-
ing the generalized Moran process. The complete graph (or equivalently, any
regular graph) serves as a benchmark for measuring the fixation probability of
an arbitrary graph G: if fr (G) is larger (resp. smaller) than that of the com-
plete graph then G is called an amplifier (resp. a suppressor ) [16, 20]. Until
now only graphs with similar (i.e. a little larger or smaller) fixation probability
than regular graphs have been identified [3–5, 16, 18], while no class of strong
amplifiers/suppressors is known so far.
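The Isothermal Theorem can be checked empirically: on any regular graph the simulated fixation frequency should match (1 − 1/r)/(1 − 1/r^n). A self-contained illustrative sketch (not from the paper) on a 6-cycle:

```python
import random

def moran_fixates(adj, start, r, rng):
    """One run of the generalized Moran process; True iff fixation."""
    infected = {start}
    vertices = list(adj)
    while 0 < len(infected) < len(vertices):
        weights = [r if v in infected else 1.0 for v in vertices]
        v = rng.choices(vertices, weights=weights, k=1)[0]
        u = rng.choice(adj[v])
        if v in infected:
            infected.add(u)
        else:
            infected.discard(u)
    return len(infected) == len(vertices)

def isothermal(r, n):
    """Fixation probability of any n-vertex regular graph."""
    return (1 - 1 / r) / (1 - 1 / r ** n)

# A cycle is 2-regular, so its fixation probability obeys the formula.
n, r = 6, 2.0
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
rng = random.Random(0)
est = sum(moran_fixates(cycle, i % n, r, rng) for i in range(2000)) / 2000
```

The starting vertex is irrelevant here because the cycle is vertex-transitive; `est` should agree with `isothermal(r, n)` up to Monte Carlo noise.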
Our Contribution. The structure of the graph, on which the population re-
sides, plays a crucial role in the course of evolutionary dynamics. Human societies
or social networks are never homogeneous, while certain individuals in central po-
sitions may be more influential than others [20]. Motivated by this, we introduce
in this paper a new notion of measuring the success of an advantageous mutant
in a structured population, by counting the number of initial placements of the
mutant in a graph that guarantee fixation of the graph with large probability.
This provides a refinement of the notion of fixation probability. Specifically, we
do not any more consider the fixation probability as the probability of reaching
fixation when the mutant is placed at a random vertex, but we rather consider
the probability fr (v) of reaching fixation when a mutant with fitness r > 1 is
introduced at a specific vertex v of the graph; fr (v) is termed the fixation prob-
ability of vertex v. Using this notion, the fixation probability fr(G) of a graph G = (V, E) with n vertices is fr(G) = (1/n) Σ_{v∈V} fr(v).
We aim at finding graphs that have many “strong starts” (or many “weak starts”) of the mutant. Thus we introduce the notions of (h(n), g(n))-selective amplifiers (resp. (h(n), g(n))-selective suppressors), which include those graphs with n vertices for which there exist at least h(n) vertices v with fr(v) ≥ 1 − c(r)/g(n) (resp. fr(v) ≤ c(r)/g(n)) for an appropriate function c(r) of r. We contrast this new

1
For approximating the probability to reach fixation (resp. extinction), one needs a
number of runs which is about the inverse of the best known lower (resp. upper)
bound of the fixation probability.

notion of (h(n), g(n))-selective amplifiers (resp. suppressors) with the notion of


g(n)-universal amplifiers (resp. suppressors) which include those graphs G with n vertices for which fr(G) ≥ 1 − c(r)/g(n) (resp. fr(G) ≤ c(r)/g(n)) for an appropriate
function c(r) of r. For a detailed presentation and a rigorous definition of these
notions we refer to Section 2.
Using these new notions, we prove that there exist strong selective ampli-
fiers, namely (Θ(n), n)-selective amplifiers (called the urchin graphs). Further-
more we prove that there exist also quite strong selective suppressors, namely (n/(φ(n)+1), n/φ(n))-selective suppressors (called the φ(n)-urchin graphs) for any function φ(n) = ω(1) with φ(n) ≤ √n.
Regarding the traditional measure of the fixation probability fr (G) of undi-
rected graphs G, we provide upper and lower bounds that are much stronger
than the bounds 1/n and 1 − 1/(n + r) that were known so far [8]. More specifically, first of all we demonstrate the non-existence of “strong” universal amplifiers by showing that for any graph G with n vertices, the fixation probability fr(G) is strictly less than 1 − c(r)/n^{3/4+ε}, for any ε > 0. This is in wide contrast with what
happens in directed graphs, as Lieberman et al. [16] provided directed graphs
with arbitrarily large fixation probability (see also [20]).
On the other hand, we provide our lower bound in the Thermal Theorem,
which states that for any vertex v of an arbitrary undirected graph G, the fixation
probability fr(v) of v is at least (r − 1)/(r + deg v/degmin) for any r > 1, where deg v is the degree of v in G (i.e. the number of its neighbors) and degmin (resp. degmax) is the minimum (resp. maximum) degree in G. This result extends the Isothermal Theorem for regular graphs [16]. In particular, we consider here a different notion of temperature for a vertex than [16]: the temperature of vertex v is 1/deg v. As it turns out, a “hot” vertex (i.e. with high temperature) affects its neighbors more often than a “cold” vertex (with low temperature). The Thermal Theorem, which takes into account the vertex v on which the mutant is introduced, immediately provides our lower bound (r − 1)/(r + degmax/degmin) for the fixation probability fr(G) of any undirected graph G. The latter lower bound is almost tight, as it implies that fr(G) ≥ (r − 1)/(r + 1) for a regular graph G, while the Isothermal Theorem implies that the fixation probability of a regular graph G tends to (r − 1)/r as the size of G increases. Note that our new upper/lower bounds for the fixation probability
lead to better time complexity of the FPRAS proposed in [8], as the Monte Carlo
technique proposed in [8] now needs to simulate the Moran process fewer times (to estimate fixation or extinction).
Our techniques are original and of a constructive combinatorics flavor.
For the class of strong selective amplifiers (the urchin graphs) we introduce
a novel decomposition of the Markov chain M of the generalized Moran pro-
cess into n − 1 smaller chains M1 , M2 , . . . , Mn−1 , and then we decompose each
Mk into two even smaller chains M^k_1, M^k_2. Then we exploit a new way of com-
posing these smaller chains (and returning to the original one) that is carefully
done to maintain the needed domination properties. For the proof of the lower
bound in the Thermal Theorem, we first introduce a new and simpler weighted

process that bounds fixation probability from below (the generalized Moran pro-
cess is a special case of this new process). Then we add appropriate dummy states
to its (exponentially large) Markov chain, and finally we iteratively modify the
resulting chain by maintaining the needed monotonicity properties. Eventually
this results in the desired lower bound of the Thermal Theorem. Finally, our
proof for the non-existence of strong universal amplifiers is done by contradic-
tion, partitioning appropriately the vertex set of the graph and discovering an
appropriate independent set that leads to the contradiction.

2 Preliminaries

Throughout the paper we consider only finite, connected, undirected graphs


G = (V, E). Our results apply to connected graphs as, otherwise, the fixation
probability is necessarily zero. The edge e ∈ E between two vertices u, v ∈ V is
denoted by e = uv. For a vertex subset X ⊆ V , we write X + y and X − y for
X ∪ {y} and X \ {y}, respectively. Furthermore, throughout r denotes the fitness
of the mutant, while the value r is considered to be independent of the size n of
the network, i.e. we assume that r is constant. For simplicity of presentation, we
call a vertex v “infected” if a copy of the mutant is placed on v. For every vertex
subset S ⊆ V we denote by fr (S) the fixation probability of the set S, i.e. the
probability that, starting with exactly |S| copies of the mutant placed on the
vertices of S, the generalized Moran process will eventually reach fixation. By
the definition of the generalized Moran process fr (∅) = 0 and fr (V ) = 1, while
for S ∉ {∅, V },

fr(S) = [ Σ_{xy∈E} ( (r/deg x)·fr(S + y) + (1/deg y)·fr(S − x) ) ] / [ Σ_{xy∈E} ( r/deg x + 1/deg y ) ]

Therefore, eliminating self-loops in the above Markov process,

fr(S) = [ Σ_{xy∈E, x∈S, y∉S} ( (r/deg x)·fr(S + y) + (1/deg y)·fr(S − x) ) ] / [ Σ_{xy∈E, x∈S, y∉S} ( r/deg x + 1/deg y ) ]   (1)
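For very small graphs, the exponential system (1) can nevertheless be solved directly, e.g. by Gauss–Seidel iteration over all 2^n subsets, anchored by the absorbing states ∅ and V. An illustrative sketch (not the approach of the paper, which avoids this exponential system):

```python
from itertools import combinations

def exact_fixation(adj, r, sweeps=5000):
    """Solve system (1): f[S] for every subset S of vertices."""
    verts = sorted(adj)
    n = len(verts)
    deg = {v: len(adj[v]) for v in verts}
    states = [frozenset(c) for k in range(n + 1)
              for c in combinations(verts, k)]
    f = {S: 1.0 if len(S) == n else 0.0 for S in states}
    for _ in range(sweeps):
        for S in states:
            if 0 < len(S) < n:
                num = den = 0.0
                for x in S:
                    for y in adj[x]:
                        if y not in S:  # only edges crossing the boundary
                            num += (r / deg[x]) * f[S | {y}] \
                                 + (1 / deg[y]) * f[S - {x}]
                            den += r / deg[x] + 1 / deg[y]
                f[S] = num / den
    return f
```

On K3 with r = 2 this yields fr(v) = 4/7 for every singleton, matching the Isothermal Theorem value (1 − 1/2)/(1 − 1/8).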

In the next definition we introduce the notions of universal and selective


amplifiers.

Definition 1. Let G be an infinite class of undirected graphs. If there exists an


n0 ∈ N, an r0 ≥ 1, and some function c(r), such that for every graph G ∈ G
with n ≥ n0 vertices and for every r > r0 :
– fr(G) ≥ 1 − c(r)/g(n), then G is a class of g(n)-universal amplifiers,
– there exists a subset S of at least h(n) vertices of G, such that fr(v) ≥ 1 − c(r)/g(n) for every vertex v ∈ S, then G is a class of (h(n), g(n))-selective amplifiers.

Moreover, G is a class of strong universal (resp. strong selective) amplifiers if


G is a class of n-universal (resp. (Θ(n), n)-selective) amplifiers.
Similarly to Definition 1, we introduce the notions of universal and selective
suppressors.
Definition 2. Let G be an infinite class of undirected graphs. If there exist func-
tions c(r) and n0 (r), such that for every r > 1 and for every graph G ∈ G with
n ≥ n0 (r) vertices:
– fr(G) ≤ c(r)/g(n), then G is a class of g(n)-universal suppressors,
– there exists a subset S of at least h(n) vertices of G, such that fr(v) ≤ c(r)/g(n) for every vertex v ∈ S, then G is a class of (h(n), g(n))-selective suppressors.
Moreover, G is a class of strong universal (resp. strong selective) suppressors if
G is a class of n-universal (resp. (Θ(n), n)-selective) suppressors.
Note that n0 = n0 (r) in Definition 2, while in Definition 1 n0 is not a function of
r. The reason for this is that, since we consider the fitness value r to be constant,
the size n of G needs to be sufficiently large with respect to r in order for G to
act as a suppressor. Indeed, if we let r grow arbitrarily, e.g. if r = n², then for
any graph G with n vertices the fixation probability fr (v) tends to 1 as n grows.
The next lemma follows by Definitions 1 and 2.
Lemma 1. If G is a class of g(n)-universal amplifiers (resp. suppressors), then
G is a class of (Θ(n), g(n))-selective amplifiers (resp. suppressors).
The most natural question that arises by Definitions 1 and 2 is whether there ex-
ists any class of strong selective amplifiers/suppressors, as well as for which func-
tions h(n) and g(n) there exist classes of g(n)-universal amplifiers/suppressors
and classes of (h(n), g(n))-selective amplifiers/suppressors. In Section 3 and 4
we provide our results on amplifiers and suppressors, respectively.

3 Amplifier Bounds
In this section we prove that there exist no strong universal amplifiers (Sec-
tion 3.1), although there exists a class of strong selective amplifiers (Section 3.2).

3.1 Non-existence of Strong Universal Amplifiers


Theorem 1. For any function g(n) = Ω(n^{3/4+ε}) for some ε > 0, there exists no
graph class G of g(n)-universal amplifiers for any r > r0 = 1.
Proof (sketch). The proof is done by contradiction. It involves a surprising par-
tition of the vertices of the graph into three sets V1 , V2 , V3 , where V1 and V2 are
independent sets, and N (v) ⊆ V3 for every v ∈ V1 ∪ V2 . For the detailed proof
we refer to the full paper in the Appendix.
Corollary 1. There exists no infinite class G of undirected graphs which are
strong universal amplifiers.

3.2 A Class of Strong Selective Amplifiers


In this section we present the first class G = {Gn : n ≥ 1} of strong selective am-
plifiers, which we call the urchin graphs. Namely, the graph Gn has 2n vertices,
consisting of a clique with n vertices, an independent set of n vertices, and a per-
fect matching between the clique and the independent set, as it is illustrated in
Figure 1(a). For every graph Gn , we refer for simplicity to a vertex of the clique
of Gn as a clique vertex of Gn , and to a vertex of the independent set of Gn as a
nose of Gn , respectively. We prove in this section that the class G of urchin graphs is a class of strong selective amplifiers. Namely, we prove that, whenever r > r0 = 5, the fixation probability of any nose v of any graph Gn is fr(v) ≥ 1 − c(r)/n, where c(r)
is a function that depends only on the mutant fitness r.
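A sketch of the construction (illustrative; here clique vertices are 0..n−1 and nose n+i is matched to clique vertex i):

```python
def urchin_graph(n):
    """Adjacency lists of the urchin graph G_n (2n vertices):
    an n-clique, n noses, and a perfect matching between them."""
    adj = {v: [] for v in range(2 * n)}
    for i in range(n):
        for j in range(i + 1, n):   # clique edges
            adj[i].append(j)
            adj[j].append(i)
        adj[i].append(n + i)        # matching edge: clique vertex i -> nose
        adj[n + i].append(i)
    return adj
```

Each clique vertex has degree n (n − 1 clique neighbors plus its nose), and each nose has degree 1.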
Let v be a clique vertex (resp. a nose) and u be its adjacent nose (resp. clique
vertex). If v is infected and u is not infected, then v is called an isolated clique ver-
tex (resp. isolated nose), otherwise v is called a covered clique vertex (resp. cov-
ered nose). Let k ∈ {0, 1, . . . , n}, i ∈ {0, 1, 2, . . . , n − k}, and x ∈ {0, 1, 2, . . . , k}.
Denote by Q^k_{i,x} the state of Gn with exactly i isolated clique vertices, x isolated noses, and k − x covered noses. An example of the state Q^k_{i,x} is illustrated in Figure 1. Furthermore, for every k, i ∈ {0, 1, . . . , n}, we define the state P^k_i of Gn as follows. If i ≤ k, then P^k_i is the state with exactly i covered noses and k − i isolated noses. If i > k, then P^k_i is the state with exactly k covered noses and i − k isolated clique vertices. Note that Q^k_{i,0} = P^k_{k+i} and Q^k_{0,x} = P^k_{k−x}, for every k ∈ {0, 1, . . . , n}, i ∈ {0, 1, 2, . . . , n − k}, and x ∈ {0, 1, 2, . . . , k}. Two examples of the state P^k_i, for the cases where i ≤ k and i > k, are shown in Figure 1.


Fig. 1. (a) The “urchin” graph Gn. Furthermore, the state (b) Q^k_{i,x} and the state P^k_i, where (c) i ≤ k, and (d) i > k.

Let k ∈ {1, 2, . . . , n − 1}. For all appropriate values of i and x, we denote by


q^k_{i,x} (resp. p^k_i) the probability that, starting at state Q^k_{i,x} (resp. P^k_i), we eventually arrive at a state with k + 1 infected noses before we arrive at a state with k − 1
infected noses.
Lemma 2. Let 1 ≤ k ≤ n − 1. Then q^k_{i,x} > q^k_{i−1,x−1}, for every i ∈ {1, 2, . . . , n − k} and every x ∈ {1, 2, . . . , k}.
Corollary 2. Let k ∈ {1, 2, . . . , n − 1}, i ∈ {0, 1, . . . , n − k}, and x ∈ {0, 1, . . . , k}. Then q^k_{i,x} > p^k_{k+i−x}.

Note by Corollary 2 that, in order to compute a lower bound for the fixation
probability fr (v) of a nose v of the graph Gn , we can assume that, whenever we
have k infected noses and i infected clique vertices, we are at state P^k_i. That is, in the Markov chain of the generalized Moran process, we replace any transition to a state Q^k_{i,x} with a transition to state P^k_{k+i−x}. Denote this relaxed Markov chain by M; we will compute a lower bound of the fixation probability of state P^1_0 in the Markov chain M (cf. Theorem 2).
In order to analyze M, we decompose it first into the n − 1 smaller Markov
chains M1 , M2 , . . . , Mn−1 , as follows. For every k ∈ {1, 2, . . . , n−1}, the Markov
chain Mk captures all transitions of M between states with k infected noses. We
denote by Fk−1 (resp. Fk+1 ) an arbitrary state with k − 1 (resp. k + 1) infected
noses. Moreover, we consider Fk−1 and Fk+1 as absorbing states of Mk . Since we
want to compute a lower bound of the fixation probability, whenever we arrive
at state Fk+1 (resp. at state Fk−1 ), we assume that we have the smallest number
of infected clique vertices with k + 1 (resp. with k − 1) infected noses. That is,
whenever Mk reaches state Fk+1, we assume that M has reached state P^{k+1}_{k+1} (and thus we move to the Markov chain Mk+1). Similarly, whenever Mk reaches state Fk−1, we assume that M has reached state P^{k−1}_0 (and thus we move to the Markov chain Mk−1).

A Decomposition of Mk into Two Markov Chains. In order to analyze


the Markov chain Mk, where k ∈ {1, 2, . . . , n − 1}, we decompose it into two smaller Markov chains {M^k_1, M^k_2}.
In M^k_1, we consider the state P^k_{k+1} absorbing. For every i ∈ {0, 1, . . . , k} denote by h^k_i the probability that, starting at state P^k_i in M^k_1, we eventually reach state P^k_{k+1} before we reach state Fk−1. In this Markov chain M^k_1, every transition probability between two states is equal to the corresponding transition probability in Mk.
In M^k_2, we denote by s^k_i, where i ∈ {k, k + 1, . . . , n}, the probability that, starting at state P^k_i, we eventually reach state Fk+1 before we reach state Fk−1. In this Markov chain M^k_2, the transition probability from state P^k_k to state P^k_{k+1} (resp. to state Fk−1) is equal to h^k_k (resp. 1 − h^k_k), while all other transition probabilities between two states in M^k_2 are the same as the corresponding transition probabilities in Mk.

Urchin Graphs are Strong Selective Amplifiers. We now conclude our


analysis by combining the results of Section 3.2 on the two Markov chains M^k_1 and M^k_2. In the Markov chain M, the transition from state P^k_0 to the states P^k_k, P^{k−1}_0 is done through the Markov chain M^k_1, and the transition from state P^k_k to the states P^{k+1}_{k+1}, P^{k−1}_0 is done through the Markov chain M^k_2, respectively.
In the Markov chain M, the transition probability from state P^k_k to state P^{k+1}_{k+1} (resp. P^{k−1}_0) is s^k_k (resp. 1 − s^k_k). Recall that s^k_k is the probability that, starting at P^k_k in M^k_2 (and thus also in M), we reach state Fk+1 before we reach Fk−1. Furthermore, the transition probability from state P^k_0 to state P^k_k is equal to the

probability that, starting at P^k_0 in M^k_1, we reach P^k_k before we reach Fk−1. Note that this probability is larger than h^k_0. Therefore, in order to compute a lower bound of the fixation probability of a nose in Gn, we can assume that in M the transition probability from state P^k_0 to P^k_k (resp. P^{k−1}_0) is h^k_0 (resp. 1 − h^k_0).
Note that for every k ∈ {2, . . . , n − 1} the set of infected vertices of state P^k_0 is a strict subset of the infected vertices of state P^k_k. Therefore, in order to compute a lower bound of the fixation probability of state P^1_0 in M, we can relax M by changing every transition from state P^{k−1}_{k−1} to state P^k_k to a transition from state P^{k−1}_{k−1} to state P^k_0, where k ∈ {2, . . . , n − 1}. After eliminating the states P^k_k in M, where k ∈ {1, 2, . . . , n − 1}, we obtain an equivalent birth-death process Bn. Denote by p1 the fixation probability of state P^1_0 in Bn, i.e. p1 is the probability that, starting at state P^1_0 in Bn, we eventually arrive at state P^n_n. For the next
theorem we use the lower bounds of Section 3.2.

Theorem 2. For any r > 5 and for sufficiently large n, the fixation probability
p1 of state P^1_0 in Bn is p1 ≥ 1 − c(r)/n, for some appropriate function c(r) of r.
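Absorption probabilities of a birth-death chain such as Bn admit the classical gambler's-ruin closed form, which is the type of expression behind bounds like Theorem 2. A generic sketch (the arrays b and d are placeholder up/down probabilities, not the actual transition values of Bn, which the paper derives from the quantities h and s above):

```python
def birth_death_hit_top(b, d):
    """Probability that a birth-death chain on {0, 1, ..., N}, started at
    state 1, is absorbed at N rather than at 0.  b[k] / d[k] are the up /
    down probabilities at interior state k + 1, so N = len(b) + 1.
    Classical formula: 1 / (1 + sum_k prod_{j<=k} d_j / b_j)."""
    total, prod = 1.0, 1.0
    for bk, dk in zip(b, d):
        prod *= dk / bk
        total += prod
    return 1.0 / total
```

With constant ratio d/b = 1/r this recovers the Moran fixation probability (1 − 1/r)/(1 − 1/r^N), the sanity check below.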

We are now ready to provide our main result in this section.

Theorem 3. The class G = {Gn : n ≥ 1} of urchin graphs is a class of strong


selective amplifiers.

4 Suppressor Bounds
In this section we prove our lower bound for the fixation probability of an ar-
bitrary undirected graph, namely the Thermal Theorem (Section 4.1), which
generalizes the analysis of the fixation probability of regular graphs [16]. Furthermore, we present for every function φ(n), where φ(n) = ω(1) and φ(n) ≤ √n, a class of (n/(φ(n)+1), n/φ(n))-selective suppressors in Section 4.2.

4.1 The Thermal Theorem

Consider a graph G = (V, E) and a fitness value r > 1. Denote by Mr (G) the
generalized Moran process on G with fitness r. Then, for every subset S ∉ {∅, V } of its vertices, the fixation probability fr(S) of S in Mr(G) is given by (1), where fr(∅) = 0 and fr(V ) = 1. That is, the fixation probabilities fr(S), where S ∉ {∅, V }, are the solution of the linear system (1) with boundary conditions fr(∅) = 0 and fr(V ) = 1.
Suppose that at some iteration of the generalized Moran process the set S of vertices is infected and that the edge xy ∈ E (where x ∈ S and y ∉ S) is activated, i.e. either x infects y or y disinfects x. Then (1) implies that the probability that x infects y is higher if 1/deg x is large; similarly, the probability that y disinfects x is higher if 1/deg y is large. Therefore, in a fashion similar to [16], we call for every vertex v ∈ V the quantity 1/deg v the temperature of v: a
“hot” vertex (i.e. with high temperature) affects its neighbors more often than

a “cold” vertex (i.e. with low temperature). It follows now by (1) that for every
set S ∉ {∅, V } there exists at least one pair x(S), y(S) of vertices with x(S) ∈ S, y(S) ∉ S, and x(S)y(S) ∈ E such that

fr(S) ≥ [ (r/deg x(S))·fr(S + y(S)) + (1/deg y(S))·fr(S − x(S)) ] / [ r/deg x(S) + 1/deg y(S) ]   (2)

Thus, solving the linear system that is obtained from (2) by replacing inequalities
with equalities, we obtain a lower bound for the fixation probabilities fr (S),
where S ∉ {∅, V }. In the next definition we introduce a weighted generalization
Thermal Theorem.

Definition 3. (the linear system L0 ) Let G = (V, E) be an undirected graph


and r > 1. Let every vertex v ∈ V have weight (temperature) dv > 0. The
linear system L0 on the variables pr (S), where S ⊆ V , is given by the following
equations whenever S ∉ {∅, V }:

pr(S) = [ r·d_{x(S)}·pr(S + y(S)) + d_{y(S)}·pr(S − x(S)) ] / [ r·d_{x(S)} + d_{y(S)} ]   (3)

with boundary conditions pr (∅) = 0 and pr (V ) = 1.

With a slight abuse of notation, whenever S = {u1 , u2 , . . . , uk }, we denote


pr (u1 , u2 , . . . , uk ) = pr (S).

Observation 1. The linear system L0 in Definition 3 corresponds naturally to


the Markov chain M0 with one state for every subset S ⊆ V , where the states ∅
and V are absorbing, and every non-absorbing state S has exactly two transitions
to the states S + y(S) and S − x(S) with transition probabilities qS = r·d_{x(S)}/(r·d_{x(S)} + d_{y(S)}) and 1 − qS, respectively.

Observation 2. Let G = (V, E) be a graph and r > 1. For every vertex x ∈ V


let dx = 1/deg x be the temperature of x. Then fr(S) ≥ pr(S) for every S ⊆ V , where the values pr(S) are the solution of the linear system L0.

Before we provide the Thermal Theorem (Theorem 4), we first prove an auxiliary
result in the next lemma which generalizes the Isothermal Theorem of [16] for
regular graphs, i.e. for graphs with the same number of neighbors for every
vertex.

Lemma 3. Let G = (V, E) be a graph with n vertices, r > 1, and du be the


same for all vertices u ∈ V . Then pr(u) = (1 − 1/r)/(1 − 1/r^n) ≥ 1 − 1/r for every vertex u ∈ V .

We are now ready to provide our main result in this section which provides a
lower bound for the fixation probability on arbitrary graphs, parameterized by
the maximum ratio between two different temperatures in the graph.

Theorem 4 (Thermal Theorem). Let G = (V, E) be a connected undirected


graph and r > 1. Then fr(v) ≥ (r − 1)/(r + deg v/degmin) for every v ∈ V .

The lower bound for the fixation probability in Theorem 4 is almost tight. Indeed,
if a graph G = (V, E) with n vertices is regular, i.e. if deg u = deg v for every
u, v ∈ V , then fr(G) = (1 − 1/r)/(1 − 1/r^n) by Lemma 3 (cf. also the Isothermal Theorem in [16]), and thus fr(G) ≈ (r − 1)/r for large enough n. On the other hand, Theorem 4 implies for a regular graph G that fr(G) ≥ (r − 1)/(r + 1).
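The per-vertex bound of Theorem 4 is immediate to evaluate on any graph; a small illustrative sketch:

```python
def thermal_lower_bound(adj, r):
    """Thermal Theorem lower bound (r - 1) / (r + deg(v) / deg_min)
    on the fixation probability f_r(v), for every vertex v."""
    deg = {v: len(adj[v]) for v in adj}
    dmin = min(deg.values())
    return {v: (r - 1) / (r + deg[v] / dmin) for v in adj}
```

On the star K_{1,3} with r = 2 the bound is 1/3 for each (hot) leaf but only 1/5 for the high-degree (cold) center, while on any regular graph it is uniformly (r − 1)/(r + 1).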

4.2 A Class of Selective Suppressors

In this section we present for every function φ(n), where φ(n) = ω(1) and φ(n) ≤ √n, the class Gφ(n) = {Gφ(n),n : n ≥ 1} of (n/(φ(n)+1), n/φ(n))-selective suppressors. We call these graphs φ(n)-urchin graphs, since for φ(n) = 1 they coincide with the class of urchin graphs in Section 3.2. For every n, the graph Gφ(n),n = (Vφ(n),n , Eφ(n),n) has n vertices. Its vertex set Vφ(n),n can be partitioned into two sets V^1_{φ(n),n} and V^2_{φ(n),n}, where |V^1_{φ(n),n}| = n/(φ(n)+1) and |V^2_{φ(n),n}| = φ(n)·n/(φ(n)+1), such that V^1_{φ(n),n} induces a clique and V^2_{φ(n),n} induces an independent set in Gφ(n),n. Furthermore, every vertex u ∈ V^2_{φ(n),n} has φ(n) neighbors in V^1_{φ(n),n}, and every vertex v ∈ V^1_{φ(n),n} has φ²(n) neighbors in V^2_{φ(n),n}. Therefore deg v = n/(φ(n)+1) + φ²(n) − 1 for every v ∈ V^1_{φ(n),n} and deg u = φ(n) for every u ∈ V^2_{φ(n),n}.
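The text fixes only the sizes and degrees of the two sides; one concrete wiring that realizes them (the cyclic attachment below is an illustrative assumption, not the paper's) is:

```python
def phi_urchin(m, phi):
    """A phi-urchin graph on n = (phi + 1) * m vertices: an m-clique
    (vertices 0..m-1) plus phi*m independent vertices, each adjacent to
    phi clique vertices; every clique vertex gains phi**2 cross edges."""
    assert 1 <= phi <= m
    adj = {v: set() for v in range((phi + 1) * m)}
    for i in range(m):                    # clique edges
        for k in range(i + 1, m):
            adj[i].add(k)
            adj[k].add(i)
    for j in range(phi * m):              # cross edges, cyclic pattern
        u = m + j
        for t in range(phi):
            v = (j + t) % m               # phi consecutive clique vertices
            adj[u].add(v)
            adj[v].add(u)
    return adj
```

Each clique vertex then has degree n/(φ+1) + φ² − 1 and each independent vertex has degree φ, as required.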

Theorem 5. For every function φ(n), where φ(n) = ω(1) and φ(n) ≤ √n, the class Gφ(n) = {Gφ(n),n : n ≥ 1} of φ(n)-urchin graphs is a class of (n/(φ(n)+1), n/φ(n))-selective suppressors.

References

1. Aldous, D., Fill, J.: Reversible Markov Chains and Random Walks on Graphs.
Monograph in preparation,
http://www.stat.berkeley.edu/aldous/RWG/book.html
2. Antal, T., Scheuring, I.: Fixation of strategies for an evolutionary game in finite
populations. Bulletin of Math. Biology 68, 1923–1944 (2006)
3. Broom, M., Hadjichrysanthou, C., Rychtar, J.: Evolutionary games on graphs
and the speed of the evolutionary process. Proceedings of the Royal Society
A 466(2117), 1327–1346 (2010)
4. Broom, M., Hadjichrysanthou, C., Rychtar, J.: Two results on evolutionary
processes on general non-directed graphs. Proceedings of the Royal Society
A 466(2121), 2795–2798 (2010)
5. Broom, M., Rychtar, J.: An analysis of the fixation probability of a mutant on spe-
cial classes of non-directed graphs. Proceedings of the Royal Society A 464(2098),
2609–2627 (2008)
6. Broom, M., Rychtar, J., Stadler, B.: Evolutionary dynamics on small order graphs.
Journal of Interdisciplinary Mathematics 12, 129–140 (2009)

7. Taylor, C., Fudenberg, D., Sasaki, A., Nowak, M.A.: Evolutionary game dynamics
in finite populations. Bulletin of Math. Biology 66(6), 1621–1644 (2004)
8. Díaz, J., Goldberg, L., Mertzios, G., Richerby, D., Serna, M., Spirakis, P.: Approximating fixation probabilities in the generalized Moran process. In: Proceedings of
the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 954–960 (2012)
9. Durrett, R.: Lecture notes on particle systems and percolation. Wadsworth Pub-
lishing Company (1988)
10. Easley, D., Kleinberg, J.: Networks, Crowds, and Markets: Reasoning about a
Highly Connected World. Cambridge University Press (2010)
11. Gintis, H.: Game theory evolving: A problem-centered introduction to modeling
strategic interaction. Princeton University Press (2000)
12. Hofbauer, J., Sigmund, K.: Evolutionary Games and Population Dynamics. Cam-
bridge University Press (1998)
13. Imhof, L.A.: The long-run behavior of the stochastic replicator dynamics. Annals
of applied probability 15(1B), 1019–1045 (2005)
14. Kandori, M., Mailath, G.J., Rob, R.: Learning, mutation, and long run equilibria
in games. Econometrica 61(1), 29–56 (1993)
15. Karlin, S., Taylor, H.: A First Course in Stochastic Processes, 2nd edn. Academic
Press, NY (1975)
16. Lieberman, E., Hauert, C., Nowak, M.A.: Evolutionary dynamics on graphs. Na-
ture 433, 312–316 (2005)
17. Liggett, T.M.: Interacting Particle Systems. Springer (1985)
18. Mertzios, G.B., Nikoletseas, S., Raptopoulos, C., Spirakis, P.G.: Natural models for
evolution on networks. In: Chen, N., Elkind, E., Koutsoupias, E. (eds.) Internet and
Network Economics. LNCS, vol. 7090, pp. 290–301. Springer, Heidelberg (2011)
19. Moran, P.A.P.: Random processes in genetics. Proceedings of the Cambridge Philo-
sophical Society 54, 60–71 (1958)
20. Nowak, M.A.: Evolutionary Dynamics: Exploring the Equations of Life. Harvard
University Press (2006)
21. Ohtsuki, H., Nowak, M.A.: Evolutionary games on cycles. Proceedings of the Royal
Society B: Biological Sciences 273, 2249–2256 (2006)
22. Rychtář, J., Stadler, B.: Evolutionary dynamics on small-world networks. Interna-
tional Journal of Computational and Mathematical Sciences 2(1), 1–4 (2008)
23. Sandholm, W.H.: Population games and evolutionary dynamics. MIT Press (2011)
24. Taylor, C., Iwasa, Y., Nowak, M.A.: A symmetry of fixation times in evoultionary
dynamics. Journal of Theoretical Biology 243(2), 245–251 (2006)
25. Traulsen, A., Hauert, C.: Stochastic evolutionary game dynamics. In: Reviews of
Nonlinear Dynamics and Complexity, vol. 2. Wiley, NY (2008)
Fast Distributed Coloring Algorithms
for Triangle-Free Graphs

Seth Pettie and Hsin-Hao Su

University of Michigan

Abstract. Vertex coloring is a central concept in graph theory and


an important symmetry-breaking primitive in distributed computing.
Whereas degree-Δ graphs may require palettes of Δ+1 colors in the worst
case, it is well known that the chromatic number of many natural graph
classes can be much smaller. In this paper we give new distributed algo-
rithms to find (Δ/k)-coloring in graphs of girth 4 (triangle-free graphs),
girth 5, and trees, where k is at most (1/4 − o(1)) ln Δ in triangle-free
graphs and at most (1 − o(1)) ln Δ in girth-5 graphs and trees, and o(1)
is a function of Δ. Specifically, for Δ sufficiently large we can find such
a coloring in O(k + log∗ n) time. Moreover, for any Δ we can compute
such colorings in roughly logarithmic time for triangle-free and girth-
5 graphs, and in O(log Δ + log_Δ log n) time on trees. As a byproduct,
our algorithm shows that the chromatic number of triangle-free graphs
is at most (4 + o(1))Δ/ln Δ, which improves on Jamall's recent bound of
(67 + o(1))Δ/ln Δ. Also, we show that (Δ + 1)-coloring for triangle-free
graphs can be obtained in sublogarithmic time for any Δ.

1 Introduction

A proper t-coloring of a graph G = (V, E) is an assignment from V to {1, . . . , t}


(colors) such that no edge is monochromatic, or equivalently, each color class
is an independent set. The chromatic number χ(G) is the minimum number of
colors needed to properly color G. Let Δ be the maximum degree of the graph.
It is easy to see that sometimes Δ + 1 colors are necessary, e.g., on an odd cycle
or a (Δ + 1)-clique. Brooks’ celebrated theorem [9] states that these are the only
such examples and that every other graph can be Δ-colored. Vizing [31] asked
whether Brooks’ Theorem can be improved for triangle-free graphs. In the 1970s
Borodin and Kostochka [8], Catlin [10], and Lawrence [21] independently proved
that χ(G) ≤ (3/4)(Δ + 2) for triangle-free G, and Kostochka (see [17]) improved
this bound to χ(G) ≤ (2/3)(Δ + 2).
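The Δ + 1 bound itself follows from the obvious greedy argument: color the vertices in any order; since a vertex has at most Δ neighbors, a palette of Δ + 1 colors always has a free color. A minimal sketch (the adjacency-map representation is our own illustration, not part of the paper):

```python
def greedy_coloring(adj):
    """Greedily color a graph given as {vertex: set_of_neighbors}.

    Each vertex takes the smallest color not used by an already-colored
    neighbor, so at most max_degree + 1 colors are ever needed.
    """
    color = {}
    for u in adj:
        taken = {color[v] for v in adj[u] if v in color}
        c = 0
        while c in taken:
            c += 1
        color[u] = c
    return color

# A 5-cycle has maximum degree 2, so greedy uses at most 3 colors.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = greedy_coloring(cycle)
assert all(coloring[u] != coloring[v] for u in cycle for v in cycle[u])
assert max(coloring.values()) <= 2
```

Note the odd cycle is exactly a tight case: 2 colors do not suffice, so Δ + 1 = 3 is attained, as the text observes.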

Existential Bounds. Better asymptotic bounds were achieved in the 1990s by


using an iterated approach, often called the “Rödl Nibble”. The idea is to color
a very small fraction of the graph in a sequence of rounds, where after each

This work is supported by NSF CAREER grant no. CCF-0746673, NSF grant no.
CCF-1217338, and a grant from the US-Israel Binational Science Foundation.

F.V. Fomin et al. (Eds.): ICALP 2013, Part II, LNCS 7966, pp. 681–693, 2013.

© Springer-Verlag Berlin Heidelberg 2013
682 S. Pettie and H.-H. Su

round some property is guaranteed to hold with some small non-zero probability.
Kim [18] proved that in any girth-5 graph G, χ(G) ≤ (1 + o(1))Δ/ln Δ. This bound
is optimal to within a factor of 2 under any lower bound on girth. (Constructions
of Kostochka and Masurova [19] and Bollobás [7] show that there are graphs G of
arbitrarily large girth with χ(G) > Δ/(2 ln Δ).) Building on [18], Johansson (see [23])
proved that χ(G) = O(Δ/ln Δ) for any triangle-free (girth-4) graph G.¹ In relatively
recent work Jamall [14] proved that the chromatic number of triangle-free graphs
is at most (67 + o(1))Δ/ln Δ.

Algorithms. We assume the LOCAL model [26] of distributed computation.²


Grable and Panconesi [12] gave a distributed algorithm that (Δ/k)-colors a girth-
5 graph in O(log n) time, where Δ > log^{1+δ} n and k ≤ ε ln Δ, for any δ > 0
and some ε < 1 depending on δ.³ Jamall [15] showed a sequential algorithm for
O(Δ/ln Δ)-coloring a triangle-free graph in O(nΔ² ln Δ) time, for any ε > 0
and Δ > log^{1+ε} n.
Note that there are two gaps between the existential [14,18,23] and algorithmic
results [12, 15]. The algorithmic results use a constant factor more colors than
necessary (compared to the existential bounds) and they only work when Δ ≥
log^{1+Ω(1)} n is sufficiently large, whereas the existential bounds hold for all Δ.

New Results. We give new distributed algorithms for (Δ/k)-coloring triangle-


free graphs that simultaneously improve on both the existential and algorithmic
results of [12, 14, 15, 23]. Our algorithms run in log^{1+o(1)} n time for all Δ and in
O(k + log∗ n) time for Δ sufficiently large. Moreover, we prove that the chromatic
number of triangle-free graphs is (4 + o(1))Δ/ln Δ.

Theorem 1. Fix a constant ε > 0. Let Δ be the maximum degree of a triangle-free graph G, assumed to be at least some Δ_ε depending on ε. Let k ≥ 1 be a parameter such that 2ε ≤ 1 − 4k/ln Δ. Then G can be (Δ/k)-colored, in time O(k + log∗ Δ) if Δ^{1 − 4k/ln Δ − ε} = Ω(ln n), and, for any Δ, in time on the order of

min{e^{O(√(ln ln n))}, Δ + log∗ n} · (k + log∗ Δ) · (log n / Δ^{1 − 4k/ln Δ − ε}) = log^{1+o(1)} n

The first time bound comes from an O(k + log∗ Δ)-round procedure, each round
of which succeeds with probability 1 − 1/ poly(n). However, as Δ decreases the
probability of failure tends to 1. To enforce that each step succeeds with high
¹ We are not aware of any extant copy of Johansson's manuscript. It is often cited as
a DIMACS Technical Report, though no such report exists. Molloy and Reed [23]
reproduced a variant of Johansson's proof showing that χ(G) ≤ 160Δ/ln Δ for
triangle-free G.
² In short, vertices host processors which operate in synchronized rounds; vertices can
communicate one arbitrarily large message across each edge in each round; local
computation is free; time is measured by the number of rounds.
³ They claimed that their algorithm could also be extended to triangle-free graphs.
Jamall [15] pointed out a flaw in their argument.
Fast Distributed Coloring Algorithms for Triangle-Free Graphs 683

probability we use a version of the Local Lemma algorithm of Moser and Tardos [24] optimized for the parameters of our problem.⁴
By choosing k = ln Δ/(4 + ε) and applying Theorem 1 with constant ε/(2(4 + ε)),
we obtain new bounds on the chromatic number of triangle-free graphs.

Corollary 1. For any ε > 0 and Δ sufficiently large (as a function of ε), χ(G) ≤
(4 + ε)Δ/ln Δ. Consequently, the chromatic number of triangle-free graphs is (4 +
o(1))Δ/ln Δ, where the o(1) is a function of Δ.
 
Our result also extends to girth-5 graphs with Δ^{1 − 4k/ln Δ − ε} replaced by Δ^{1 − k/ln Δ − ε},
which allows us to (1 + ε)Δ/ln Δ-color such graphs. Our algorithm can clearly
be applied to trees (girth ∞). Elkin [11] noted that with Bollobás's construction [7],
Linial's lower bound [22] on coloring trees can be strengthened to show
that it is impossible to o(Δ/ln Δ)-color a tree in o(log_Δ n) time. We prove that
it is possible to (1 + o(1))Δ/ln Δ-color a tree in O(log Δ + log_Δ log n) time.
Also, we show that (Δ + 1)-coloring for triangle-free graphs can be obtained in
exp(O(√(log log n))) time.

Technical Overview. In the iterated approaches of [12, 14, 18, 23] each vertex u
maintains a palette, which consists of the colors that have not been selected by
its neighbors. To obtain a t-coloring, each palette consists of colors {1, . . . , t}
initially. In each round, each u tries to assign itself a color (or colors) from
its palette, using randomization to resolve the conflicts between itself and the
neighbors. The c-degree of u is defined to be the number of its neighbors whose
palettes contain c. In Kim’s algorithm [18] for girth-5 graphs, the properties
maintained for each round are that the c-degrees are upper bounded and the
palette sizes are lower bounded. In girth-5 graphs the neighborhoods of the
neighbors of u only intersect at u and therefore have a negligible influence on each
other, that is, whether c remains in one neighbor’s palette has little influence
on a different neighbor of u. Due to this independence one can bound the c-
degree after an iteration using standard concentration inequalities. In triangle-
free graphs, however, there is no guarantee of independence. If two neighbors
of u have identical neighborhoods, then after one iteration they will either both
keep or both lose c from their palettes. In other words, the c-degree of u is
a random variable that may not have any significant concentration around its
mean. Rather than bound c-degrees, Johansson [23] bounded the entropy of the
remaining palettes so that each color is picked nearly uniformly in each round.
Jamall [14] claimed that although each c-degree does not concentrate, the average
c-degree (over each c in the palette) does concentrate. Moreover, it suffices to
consider only those colors within a constant factor of the average in subsequent
iterations.
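The palette and c-degree bookkeeping described above is easy to make concrete; the following small sketch (our notation, not the paper's) just spells out the definition of the c-degree:

```python
def c_degree(u, c, adj, palette):
    """The c-degree of u: how many neighbors of u still hold color c."""
    return sum(1 for v in adj[u] if c in palette[v])

# Toy instance: a path 0-1-2 where every vertex starts with palette {0, 1}.
adj = {0: {1}, 1: {0, 2}, 2: {1}}
palette = {v: {0, 1} for v in adj}
assert c_degree(1, 0, adj, palette) == 2
palette[2].discard(0)                      # vertex 2 drops color 0...
assert c_degree(1, 0, adj, palette) == 1   # ...so the 0-degree of vertex 1 falls
```

This is exactly the quantity whose concentration fails in triangle-free graphs: whether a neighbor keeps c depends on that neighbor's own neighborhood, which may coincide with another neighbor's.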
Our (Δ/k)-coloring algorithm performs the same coloring procedure in each
round, though the behavior of the algorithm has two qualitatively distinct phases.
⁴ Note that for many reasonable parameters (e.g., k = O(1), Δ = log^{1−δ} n), the
running time is sublogarithmic.

In the first O(k) rounds the c-degrees, palette sizes, and probability of remain-
ing uncolored are very well behaved. Once the available palette is close to the
number of uncolored neighbors the probability of remaining uncolored begins
to decrease drastically in each successive round, and after O(log∗ n) rounds all
vertices are colored, w.h.p.
Our analysis is similar to that of Jamall [14] in that we focus on bounding the
average of the c-degrees. However, our proof needs to take a different approach,
for two reasons. First, to obtain an efficient distributed algorithm we need to
obtain a tighter bound on the probability of failure in the last O(log∗ n) rounds,
where the c-degrees shrink faster than a constant factor per round. Second, there
is a small flaw in Jamall’s application of Azuma’s inequality in Lemma 12 in [14],
the corresponding Lemma 17 in [15], and the corresponding lemmas in [16]. It
is probably possible to correct the flaw, though we manage to circumvent this
difficulty altogether. See the full version for a discussion of this issue.
The second phase presents different challenges. The natural way to bound
c-degrees using Chernoff-type inequalities gives error probabilities that are ex-
ponential in the c-degree, which is fine if it is Ω(log n) but becomes too large
as the c-degrees are reduced in each coloring round. At a certain threshold we
switch to a different analysis (along the lines of Schneider and Wattenhofer [30])
that allows us to bound c-degrees with high probability in the palette size, which,
again, is fine if it is Ω(log n).
In both phases, if we cannot obtain small error probabilities (via concentration
inequalities and a union bound) we revert to a distributed implementation of
the Moser-Tardos Lovász Local Lemma algorithm [24]. We show that for certain
parameters the symmetric LLL can be made to run in sublogarithmic time.
For the extensions to trees and the (Δ + 1)-coloring algorithm for triangle-free
graphs, we adopt the ideas from [5,6,29] to reduce the graph into several smaller
components and color each of them separately by deterministic algorithms [4,25],
which will run faster as the size of each subproblem is smaller.

Organization. Section 2 presents the general framework for the analysis. Sec-
tion 3 describes the algorithms and discusses what parameters to plug into the
framework. Section 4 describes the extension to graphs of girth 5, trees, and the
(Δ + 1)-coloring algorithm for triangle-free graphs.

2 The Framework

Every vertex maintains a palette that consists of all colors not previously chosen
by its neighbors. The coloring is performed in rounds, where each vertex chooses
zero or more colors in each round. Let Gi be the graph induced by the uncolored
vertices after round i, so G = G0 . Let Ni (u) be u’s neighbors in Gi and let Pi (u)
be its palette after round i. The c-neighbors Ni,c (u) consist of those v ∈ Ni (u)
with c ∈ Pi (v). Call |Ni (u)| the degree of u and |Ni,c (u)| the c-degree of u after
round i. This notation is extended to sets of vertices in a natural way, e.g.,
Ni (Ni (u)) is the set of neighbors of neighbors of u in Gi .

Algorithm 2 describes the iterative coloring procedure. In each round, each
vertex u selects a set Si(u) of colors by including each c ∈ Pi−1(u) independently
with probability πi, to be determined later. If some c ∈ Si(u) is not selected by
any neighbor of u then u can safely color itself c. In order to remove dependencies
between various random variables we exclude colors from u's palette more
aggressively than is necessary. First, we exclude any color selected by a neighbor,
that is, Si(Ni−1(u)) does not appear in Pi(u). The probability that a color c is not
selected by a neighbor is (1 − πi)^{|Ni−1,c(u)|}. Suppose that this quantity is at least
some threshold βi for all c. We force c to be kept with probability precisely βi by
putting c in a keep-set Ki(u) with probability βi/(1 − πi)^{|Ni−1,c(u)|}. The probability
that c ∈ Ki(u) \ Si(Ni−1(u)) is therefore βi, assuming βi/(1 − πi)^{|Ni−1,c(u)|}
is a valid probability; if it is not then c is ignored. Let P̃i(u) be what remains of
u's palette. Algorithm 2 has two variants. In Variant B, Pi(u) is exactly P̃i(u),
whereas in Variant A, Pi(u) is the subset of P̃i(u) whose c-degrees are sufficiently
low, at most 2ti, where ti is a parameter that will be explained below.

Include each c ∈ Pi−1 (u) in Si (u) independently with probability πi .


For each c, calculate rc = βi/(1 − πi)^{|Ni−1,c(u)|}.
If rc ≤ 1, include c ∈ Pi−1 (u) in Ki (u) independently with probability rc .
return (Si (u), Ki (u)).

Algorithm 1. Select(u, πi , βi )

repeat
Round i = 1, 2, 3, . . . .
for each u ∈ Gi−1 do
(Si (u), Ki (u)) ← Select(u, πi , βi )
Set P̃i(u) ← Ki(u) \ Si(Ni−1(u))
if Si(u) ∩ P̃i(u) ≠ ∅ then color u with any color in Si(u) ∩ P̃i(u) end if
(Variant A) Pi(u) ← {c ∈ P̃i(u) | |Ni,c(u)| ≤ 2ti}
(Variant B) Pi(u) ← P̃i(u)
end for
Gi ← Gi−1 \ {colored vertices}
until the termination condition occurs

Algorithm 2. Coloring-Algorithm(G0, {πi }, {βi })
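A direct, unoptimized transcription of Select and one round of the Variant B procedure may help; the data representation and the concrete values of πi and βi below are illustrative choices of ours, not the paper's:

```python
import random

def select(u, palette, c_deg, pi, beta):
    """Select(u, pi_i, beta_i): sample a color set S(u) and a keep-set K(u)."""
    S = {c for c in palette[u] if random.random() < pi}
    K = set()
    for c in palette[u]:
        r = beta / (1 - pi) ** c_deg(u, c)      # r_c; c is ignored when r_c > 1
        if r <= 1 and random.random() < r:
            K.add(c)
    return S, K

def coloring_round(adj, palette, color, pi, beta):
    """One round of Coloring-Algorithm, Variant B (no c-degree filter)."""
    live = [u for u in adj if color[u] is None]
    def c_deg(u, c):        # |N_{i-1,c}(u)|: uncolored neighbors still holding c
        return sum(1 for v in adj[u] if color[v] is None and c in palette[v])
    picks = {u: select(u, palette, c_deg, pi, beta) for u in live}
    for u in live:
        S, K = picks[u]
        taken = set()
        for v in adj[u]:
            if v in picks:
                taken |= picks[v][0]            # colors selected by neighbors
        palette[u] = K - taken                  # P_i(u) = K_i(u) \ S_i(N_{i-1}(u))
        if S & palette[u]:                      # a selected color survived, so
            color[u] = min(S & palette[u])      # no neighbor selected it: safe

random.seed(0)
adj = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}   # a 6-cycle
palette = {u: set(range(4)) for u in adj}
color = {u: None for u in adj}
for _ in range(20):
    coloring_round(adj, palette, color, pi=0.3, beta=0.4)
# Invariant: two adjacent vertices never adopt the same color.
assert all(color[u] is None or color[v] is None or color[u] != color[v]
           for u in adj for v in adj[u])
```

The final assertion reflects the safety property in the text: u adopts c only if c survived into Pi(u), i.e., no neighbor selected c, so two adjacent vertices can never commit to the same color in the same round.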

The algorithm is parameterized by the sampling probabilities {πi }, the ideal


c-degrees {ti } and the ideal probability {βi } of retaining a color. The {βi } define
how the ideal palette sizes {pi } degrade. Of course, the actual palette sizes and
c-degrees after i rounds will drift from their ideal values, so we will need to reason
about approximations of these quantities. We will specify the initial parameters
and the terminating conditions when applying both variants in Section 3.

2.1 Analysis A
Given {πi}, p0, t0, and δ, the parameters for Variant A are derived below.

βi = (1 − πi)^{2ti−1}        αi = (1 − πi)^{(1 − (1+δ)^{i−1}/2) p̄i}
pi = βi pi−1                 ti = max(αi βi ti−1, T)                (1)
p̄i = (1 − δ/8)^i pi          t̄i = (1 + δ)^i ti
Let us take a brief tour of the parameters. The sampling probability πi will be
inversely proportional to ti−1, the ideal c-degree at the end of round i − 1. (The
exact expression for πi depends on ε.) Since we filter out colors with more
than twice the ideal c-degree, the probability that a color is not selected by
any neighbor is at least (1 − πi)^{2ti−1} = βi. Note that since πi = Θ(1/ti−1) we
have βi = Θ(1). Thus, we can force all colors to be retained in the palette with
probability precisely βi, making the ideal palette size pi = βi pi−1. Remember
that a c-neighbor stays a c-neighbor if it remains uncolored and it does not
remove c from its palette. The latter event happens with probability βi. We use
αi as an upper bound on the probability that a vertex remains uncolored, so the
ideal c-degree should be ti = αi βi ti−1. To account for deviations from the ideal
we let p̄i and t̄i be approximate versions of pi and ti, defined in terms of a small
error control parameter δ > 0. Furthermore, certain high probability bounds will
fail to hold if ti becomes too small, so we will not let it go below a threshold T.
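The recurrences in (1) are easy to iterate numerically. The sketch below uses the Phase I choice πi = 1/(2Kti−1 + 1) from Section 3.1 (with the ideal ti−1 standing in for its relaxed version, and with illustrative starting values): ti falls geometrically to the floor T while the palette shrinks only by the constant factor βi per round.

```python
def variant_a_params(p0, t0, T, delta, K, rounds):
    """Iterate the ideal-parameter recurrences of Analysis A (Variant A)."""
    p, t, out = p0, t0, []
    for i in range(1, rounds + 1):
        pi = 1.0 / (2 * K * t + 1)
        beta = (1 - pi) ** (2 * t)              # ~ e^{-1/K}: a constant
        p = beta * p                            # p_i = beta_i * p_{i-1}
        pbar = (1 - delta / 8) ** i * p         # relaxed palette bound
        alpha = (1 - pi) ** ((1 - (1 + delta) ** (i - 1) / 2) * pbar)
        t = max(alpha * beta * t, T)            # t_i = max(alpha_i beta_i t_{i-1}, T)
        out.append((p, t))
    return out

steps = variant_a_params(p0=1000.0, t0=4000.0, T=10.0, delta=0.01, K=8.0, rounds=60)
assert steps[-1][1] == 10.0   # the c-degree bound has reached the floor T
```

The numbers here are only for illustration; the paper fixes p0 = Δ/k, t0 = Δ and δ = 1/log² Δ in Section 3.1.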
When the graph has girth 5, the concentration bounds allow us to show that
|Pi(u)| ≥ pi and |Ni,c(u)| ≤ ti with certain probabilities. As pointed out by
Jamall [14,15], |Ni,c(u)| does not concentrate in triangle-free graphs. He showed
that the average c-degree, ni(u) = (Σ_{c∈Pi(u)} |Ni,c(u)|)/|Pi(u)|, concentrates and
will be bounded above by ti with a certain probability. Since ni(u) concentrates,
it is possible to bound the fraction of colors filtered for having c-degrees larger
than 2ti.
Let λi(u) = min(1, |Pi(u)|/p̄i). Since |Pi(u)| is supposed to be at least p̄i, if
we do not filter out colors, 1 − λi(u) can be viewed as the fraction that has
been filtered. In the following we state an induction hypothesis equivalent to
Jamall's [14]:

Di(u) ≤ t̄i,   where Di(u) = λi(u) ni(u) + (1 − λi(u)) · 2ti

Di(u) can be interpreted as the average of the c-degrees of Pi(u) with p̄i − |Pi(u)|
dummy colors whose c-degrees are exactly 2ti. Notice that Di(u) ≤ t̄i also implies
1 − λi(u) ≤ (1 + δ)^i/2, because (1 − λi(u)) · 2ti ≤ Di(u) ≤ t̄i. Therefore:

|Pi(u)| ≥ (1 − (1 + δ)^i/2) p̄i

Recall Pi(u) is the palette consisting of colors c for which |Ni,c(u)| ≤ 2ti.
The main theorem for this section shows the inductive hypothesis holds with
a certain probability. See the full version for the proof.
Theorem 2. Suppose that Di−1(x) ≤ t̄i−1 for all x ∈ Gi−1. Then for a given u ∈
Gi−1, Di(u) ≤ t̄i holds with probability at least 1 − Δe^{−Ω(δ²T)} − (Δ² + 2)e^{−Ω(δ²p̄i)}.

2.2 Analysis B
Analysis A has a limitation for smaller c-degrees, since the probability guarantee
becomes weaker as ti goes down. Therefore, Analysis A only works well for
ti ≥ T, where T is a threshold for certain probability guarantees. For example,
if we want Theorem 2 to hold with high probability in n, then we must have
T = Ω(log n).
To get a good probability guarantee below T , we will use an idea by Schneider
and Wattenhofer [30]. They took advantage of the trials done for each color inside
the palette, rather than just considering the trials on whether each neighbor is
colored or not. We demonstrate this idea in the proof of Theorem 3 in the full
version. The probability guarantee in the analysis will not depend on the current
c-degree but on the initial c-degree and the current palette size.
The parameters for Variant B are chosen based on an initial lower bound on
the palette size p0, an upper bound on the c-degree t0, and an error control
parameter δ. The selection probability is chosen to be πi = 1/(ti−1 + 1) and the
probability a color remains in a palette is βi = (1 − πi)^{ti−1}. The ideal palette
size and its relaxation are pi = βi pi−1 and p̄i = (1 − δ)^i pi, and the ideal c-degree
is ti = max(αi ti−1, 1). One can show the probability of remaining uncolored is
upper bounded by αi = 5t0/p̄i.
Let Ei(u) denote the event that |Pi(u)| ≥ p̄i and |Ni,c(u)| < ti for all c ∈
Pi(u). Although a vertex could lose a c-neighbor either because that neighbor
becomes colored or because it loses c from its palette, in this analysis we only
use the former to bound the c-degree. Also, if Ei−1(u) is true, then
Pr(c ∉ Si(Ni−1(u))) > βi for all c ∈ Pi−1(u). Thus in Select(u, πi, βi), we will
not ignore any colors in the palette. Each color remains in the palette with
probability exactly βi.
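Iterating Analysis B's recurrences numerically shows why a constant number of rounds suffices once the palette dwarfs the initial c-degree (the starting values below are illustrative, not the paper's):

```python
def variant_b_params(p0, t0, delta, rounds):
    """Iterate the ideal-parameter recurrences of Analysis B."""
    p, t, ts = p0, t0, []
    for i in range(1, rounds + 1):
        pi = 1.0 / (t + 1)                 # selection probability
        beta = (1 - pi) ** t               # >= 1/e: prob. a color survives
        p = beta * p
        pbar = (1 - delta) ** i * p        # relaxed palette bound
        alpha = 5 * t0 / pbar              # bound on Pr[vertex stays uncolored]
        t = max(alpha * t, 1.0)
        ts.append(t)
    return ts

# Palette (10^6) far above the initial c-degree (100): the c-degree bound
# collapses to its floor of 1 almost immediately and stays there.
ts = variant_b_params(p0=1e6, t0=100.0, delta=0.01, rounds=6)
assert ts[-1] == 1.0
```

Once ti = 1, an uncolored vertex has no c-neighbors at all, which is exactly the terminating situation exploited in Section 3.2.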
The following theorem shows the inductive hypothesis holds with a certain
probability. See the full version for the proof.
Theorem 3. If Ei−1(x) holds for all x ∈ Gi−1, then for a given u ∈ Gi−1,
Ei(u) holds with probability at least 1 − Δe^{−Ω(t0)} − (Δ² + 1)e^{−Ω(δ²p̄i)}.

3 The Coloring Algorithms


The algorithm in Theorem 1 consists of two phases. Phase I uses Analysis A
and Phase II uses Analysis B. First, we will give the parameters for both phases.
Then, we will present the distributed algorithm that makes the induction
hypotheses of Theorem 2 (Di(u) ≤ t̄i) and Theorem 3 (Ei(u)) hold for all u ∈ Gi
with high probability in n for every round i.
 
Let ε₁ = 1 − 4k/ln Δ − 2ε/3 and ε₂ = 1 − 4k/ln Δ − ε/3. We will show that upon reaching
the terminating condition of Phase I (which will be defined later), we will have
|Pi(u)| ≥ Δ^{ε₂} for all u ∈ Gi and |Ni,c(u)| < Δ^{ε₁} for all u ∈ Gi and all c ∈ Pi(u).
At this point, for a non-constructive version, we can simply apply the results
about list coloring constants [13, 27, 28] to get a proper coloring, since at this
point there is an ω(1) gap between |Ni,c (u)| and |Pi (u)| for every u ∈ Gi . One can
turn the result of [27] into a distributed algorithm with the aid of Moser-Tardos

Lovász Local Lemma algorithm to amplify the success probability. However, to


obtain an efficient distributed algorithm we use Analysis B in Phase II.
Since our result holds for large enough Δ, we can assume whenever necessary
that Δ is sufficiently large. The asymptotic notation will be with respect to Δ.

3.1 Parameters for Phase I


In this phase, we use Analysis A with the following parameters and terminating
condition: πi = 1/(2Kti−1 + 1), where K = 4/ε is a constant, p0 = Δ/k, t0 = Δ and
δ = 1/log² Δ. This phase ends after the round when ti ≤ T, where T is defined
to be Δ^{ε₁}/3.
First, we consider the algorithm for at most the first O(log Δ) rounds. For
these rounds, we can assume the error (1 + δ)^i ≤ (1 + 1/log² Δ)^{O(log Δ)} ≤
e^{O(1/log Δ)} = 1 + o(1) and similarly (1 − δ/8)^i ≥ (1 − 1/(8 log² Δ))^{O(log Δ)} ≥
e^{−O(1/log Δ)} = 1 − o(1). We will show the algorithm reaches the terminating
condition during these rounds, where the error is under control.
The probability a color is retained, βi = (1 − πi)^{2ti−1} ≥ e^{−1/K}, is bounded
below by a constant. The probability a vertex remains uncolored is at most
αi = (1 − πi)^{(1 − (1+δ)^{i−1}/2) p̄i} ≤ e^{−(1−o(1)) C pi−1/ti−1}, where C = 1/(4Ke^{1/K}).
Let si = ti/pi be the ratio between the ideal c-degree and the ideal palette size.
Initially, s0 = k, and si = αi si−1 ≤ si−1 e^{−(1−o(1))(C/si−1)}. At first, si decreases
roughly linearly, by about C per round, until the ratio si ≈ C is a constant. Then,
si decreases rapidly, in the order of iterated exponentiation. Therefore, it takes
roughly O(k + log∗ Δ) rounds to reach the terminating condition where ti ≤ T.
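The two regimes of si are easy to see numerically. Dropping the (1 − o(1)) factor, the recurrence si = si−1 e^{−C/si−1} first loses roughly C per round and then collapses at an iterated-exponential rate (the values of k and C below are illustrative):

```python
import math

def ratio_trajectory(k, C, floor=1e-12):
    """Iterate s_i = s_{i-1} * exp(-C / s_{i-1}) starting from s_0 = k."""
    s, out = float(k), []
    while s > floor:
        s *= math.exp(-C / s)
        out.append(s)
    return out

traj = ratio_trajectory(k=10.0, C=0.5)
# Regime 1: while s >> C the drop per round is roughly C (so ~k/C rounds).
assert abs((10.0 - traj[0]) - 0.5) < 0.1
# Regime 2: once s ~ C, a handful of extra rounds push s below any threshold,
# mirroring the O(k + log* Delta) round bound.
assert len(traj) < 10.0 / 0.5 + 25
```

The second assertion is loose on purpose: the point is only that the tail after the linear phase costs very few rounds.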
Our goal is to show that upon reaching the terminating condition, the palette size
bound pi exceeds T by some amount; in particular, pi ≥ 30e^{3/ε}Δ^{ε₂}. See
the full version for the proof of the following lemma.

Lemma 1. Phase I terminates in (4 + o(1))Ke^{1/K} k + O(log∗ Δ) rounds, where
K = 4/ε. Moreover, pi ≥ 30e^{3/ε}Δ^{ε₂} for every round i in this phase.

Thus, if the induction hypothesis Di(u) ≤ t̄i holds for every u ∈ Gi for every round
i during this phase, we will have |Pi(u)| ≥ (1 − (1 + δ)^i/2)p̄i ≥ 10e^{3/ε}Δ^{ε₂} for all
u ∈ Gi and |Ni,c(u)| ≤ 2ti < Δ^{ε₁} for all u ∈ Gi and all c ∈ Pi(u) in the end.

3.2 Parameters for Phase II


In Phase II, we will use Analysis B with the following parameters and terminating
condition: p0 = 10e^{3/ε}Δ^{ε₂}, t0 = Δ^{ε₁} and δ = 1/log² Δ. This phase terminates
after 3/ε rounds.
First note that the number of rounds, 3/ε, is a constant. We show p̄i ≥ 5Δ^{ε₂}
for each round 1 ≤ i ≤ 3/ε, so there is always a sufficiently large gap between the
current palette size and the initial c-degree, which implies the shrinking factor
of the c-degrees is αi = 5t0/p̄i ≤ Δ^{−ε/3}. Since p̄i shrinks by at most a βi ≥ e^{−1}
factor every round, p̄i ≥ (1 − δ)^i ∏_{j=1}^{i} βj · p0 ≥ ((1 − δ)e^{−1})^i · 10e^{3/ε}Δ^{ε₂} ≥ 5Δ^{ε₂}.
Fast Distributed Coloring Algorithms for Triangle-Free Graphs 689

Now since αi ≤ Δ^{−ε/3}, after 3/ε rounds, ti ≤ t0 ∏_{j=1}^{i} αj ≤ Δ^{ε₁}(Δ^{−ε/3})^{3/ε} ≤ 1.
The c-degree bound t_{3/ε} thus becomes 1. Recall that the induction hypothesis Ei(u)
is the event that |Pi(u)| ≥ p̄i and |Ni,c(u)| < ti. If Ei(u) holds for every u ∈ Gi
for every round i during this phase, then in the end, every uncolored vertex has
no c-neighbors, as implied by |Ni,c(u)| < ti ≤ 1. This means these vertices can
be colored with any color in their palettes, which are non-empty.

3.3 The Distributed Coloring Algorithm

We will show a distributed algorithm that makes the induction hypothesis in


Phase I and Phase II hold with high probability in n.
Fix the round i and assume the induction hypothesis holds for all x ∈ Gi−1.
For u ∈ Gi−1, define A(u) to be the bad event that the induction hypothesis
fails at u (i.e., Di(u) > t̄i in Phase I or Ei(u) fails in Phase II). Let
p = e^{−Δ^{1 − 4k/ln Δ − ε}}/(eΔ⁴). By Theorem 2 and Theorem 3, Pr(A(u)) is at most

Δe^{−Ω(δ²T)} + (Δ² + 2)e^{−Ω(δ²p̄i)}   or   Δe^{−Ω(t0)} + (Δ² + 1)e^{−Ω(δ²p̄i)}.

Since T = Δ^{ε₁}/3, t0 = Δ^{ε₁}, and p̄i ≥ Δ^{ε₂}, we have Pr(A(u)) ≤ p for large enough Δ.

If Δ^{1 − 4k/ln Δ − ε} > c log n, then p < 1/n^c. By the union bound over all u ∈ Gi−1,
the probability that any of the events A(u) occurs is at most 1/n^{c−1}. The induction
hypothesis holds for all u ∈ Gi ⊆ Gi−1 with high probability. In this case,
O(k + log∗ Δ) rounds suffice, because each round succeeds with high probability.

On the other hand, if Δ^{1 − 4k/ln Δ − ε} < c log n, then we apply Moser and Tardos'
resampling algorithm to make the induction hypothesis hold at every vertex
simultaneously with high probability. At round i, the bad event A(u) depends on
the random variables generated by Select(v, πi, βi) for v within distance 2 of u in
Gi−1. Therefore, the dependency graph G^{≤4}_{i−1} consists of edges (u, v) such that
dist_{Gi−1}(u, v) ≤ 4. Each event A(u) shares variables with at most d < Δ⁴ other
events. The Lovász Local Lemma [1] implies that if ep(d + 1) ≤ 1, then with
non-zero probability none of the events A(u) occurs. Moser and Tardos showed
how to boost this probability by resampling. In each round of resampling, their
algorithm finds an MIS I in the dependency graph induced by the set of bad
events B and then resamples the random variables that I depends on. In our
case, it corresponds to finding an MIS I in G^{≤4}_{i−1}[B], where B = {u ∈ Gi−1 |
A(u) occurs}. Then, we redo Select(v, πi, βi) for v ∈ G within distance 2 from I to
resample the random variables that I depends on. By plugging in the parameters
for the symmetric case, their proof shows that if ep(d + 1) ≤ 1 − γ, then the
probability that any of the bad events occurs after t rounds of resampling is at
most (1 − γ)^t n/d. Thus, O(log n/log(1/(1−γ))) rounds will be sufficient for the
induction hypothesis to hold at every vertex with high probability in n.⁵

⁵ In the statement of Theorem 1.3 in [24], 1/γ is used as an approximation for
1/log(1/(1−γ)). However, this difference can be significant in our case, when 1 − γ
is very small.
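For intuition, here is a sequential toy version of the resampling loop (the distributed variant described above resamples a maximal independent set of bad events per round; the event structure below is a made-up example satisfying ep(d + 1) ≤ 1):

```python
import random

def resample_until_good(variables, events, redraw, is_bad, max_rounds=10**5):
    """Moser-Tardos-style loop: while some event is bad, resample its variables."""
    for _ in range(max_rounds):
        bad = [e for e in events if is_bad(e, variables)]
        if not bad:
            return variables
        for v in events[bad[0]]:        # resample one bad event's variables
            variables[v] = redraw()
    raise RuntimeError("did not converge")

# Toy instance: biased coin flips; event e_i is bad iff flips i and i+1 are
# both heads. Here p = 0.04 and each event depends on d = 2 others, so
# e * p * (d + 1) < 1 and the symmetric LLL condition holds.
random.seed(1)
n = 30
redraw = lambda: random.random() < 0.2
flips = {i: redraw() for i in range(n)}
events = {i: (i, i + 1) for i in range(n - 1)}
is_bad = lambda e, vs: vs[e] and vs[e + 1]
result = resample_until_good(flips, events, redraw, is_bad)
assert not any(result[i] and result[i + 1] for i in range(n - 1))
```

The loop only ever touches variables of currently-bad events, which is the property the Moser-Tardos analysis charges against the LLL condition.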

As shown in previous sections, p ≤ e^{−Δ^{1 − 4k/ln Δ − ε}}/(eΔ⁴). We can let 1 − γ =
ep(d + 1) ≤ e^{−Δ^{1 − 4k/ln Δ − ε}}. Therefore, O(log n/Δ^{1 − 4k/ln Δ − ε}) resampling rounds
will be sufficient. Also, an MIS can be found in O(Δ + log∗ n) time [3, 20], or in
e^{O(√(log log n))} time since Δ ≤ (c log n)^{1/(1 − 4k/ln Δ − ε)} ≤ (c log n)^{1/ε} ≤ log^{O(1)} n [5]. Each
of the O(k + log∗ Δ) rounds is delayed by O(log n/Δ^{1 − 4k/ln Δ − ε}) resampling rounds,
which are further delayed by the rounds needed to find an MIS. Therefore, the
total number of rounds is

O( (k + log∗ Δ) · (log n/Δ^{1 − 4k/ln Δ − ε}) · min{ exp(O(√(log log n))), Δ + log∗ n } )

Note that this is always at most log^{1+o(1)} n.

4 Extensions
4.1 Graphs of Girth at Least 5
For graphs of girth at least 5, existential results [18, 23] show that there exists
a (1 + o(1))Δ/ln Δ-coloring. We state a matching algorithmic result. The proof
will be included in the full version.
Theorem 4. Fix a constant ε > 0. Let Δ be the maximum degree of a girth-5 graph G, assumed to be at least some Δ_ε depending on ε. Let k ≥ 1 be a parameter such that 2ε ≤ 1 − k/ln Δ. Then G can be (Δ/k)-colored, in time O(k + log∗ Δ) if Δ^{1 − k/ln Δ − ε} = Ω(ln n), and, for any Δ, in time on the order of

min{e^{O(√(ln ln n))}, Δ + log∗ n} · (k + log∗ Δ) · (log n / Δ^{1 − k/ln Δ − ε}) = log^{1+o(1)} n

4.2 Trees
Trees are graphs of infinite girth. According to Theorem 4, it is possible to get a
(Δ/k)-coloring in O(k + log∗ Δ) time if Δ^{1 − k/ln Δ − ε} = Ω(log n). If Δ^{1 − k/ln Δ − ε} =
O(log n), we will show that, using an additional O(q) colors, it is possible to get a
(Δ/k + O(q))-coloring in O(k + log∗ n + (log log n)/(log q)) time. By choosing q = Δ,
we can find a (1 + o(1))Δ/ln Δ-coloring in O(log Δ + log_Δ log n) rounds.
The algorithm is the same as in the framework, except that at the end of each
round we delete the bad vertices, which are the vertices that fail to satisfy the
induction hypothesis. The remaining vertices must satisfy the induction hypothesis,
and we then continue the next round on these vertices. Using the idea
from [5,6,29], we can show that after O(k + log∗ Δ) rounds of the algorithm, the
size of each component formed by the bad vertices is at most O(Δ⁴ log n) with
high probability. See the full version for the proof.
Barenboim and Elkin's deterministic algorithm [4] obtains an O(q)-coloring in
O(log n/log q + log∗ n) time for trees (arboricity = 1). We then apply their algorithm
on each component formed by bad vertices. Since the size of each component
is at most O(Δ⁴ log n), their algorithm will run in O((log log n + log Δ)/log q + log∗ n)
time, using the additional O(q) colors. Note that this running time is actually
O((log log n)/(log q) + log∗ n), since Δ = log^{O(1)} n.

4.3 (Δ + 1)-Color Triangle-Free Graphs in Sublogarithmic Time

We show that a (Δ + 1)-coloring of a triangle-free graph can be obtained in
exp(O(√(log log n))) rounds for any Δ. Let k = 1 and ε = 1/4. By Theorem 1,
there exists a constant Δ₀ such that for all Δ ≥ Δ₀, if Δ^{1/2} ≥ log n, then a
(Δ + 1)-coloring can be found in O(log∗ Δ) time. If Δ < Δ₀, then (Δ + 1)-coloring
can be solved in O(Δ + log∗ n) = O(log∗ n) rounds [3,20]. Otherwise, if
Δ₀ ≤ Δ < log² n, then we can apply the same technique as for trees to bound by
O(Δ⁴ log n) = polylog(n) the size of each bad component, whose vertices failed
to satisfy the induction hypothesis in the O(log∗ Δ) rounds. Panconesi and
Srinivasan's deterministic network decomposition algorithm [25] obtains a
(Δ + 1)-coloring in exp(O(√(log n))) rounds for graphs with n vertices. In fact,
their decomposition also yields a proper coloring whenever the graph can be
greedily colored (e.g., whenever the palette size exceeds the degree at each
vertex). Therefore, by applying their algorithm, each bad component can be
properly colored in exp(O(√(log log n))) rounds.
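The case analysis above can be summarized as a dispatch on Δ. The sketch below is illustrative only: the cutoff Δ₀ = 32 and the regime labels are hypothetical placeholders, since the theorem only guarantees that some constant Δ₀ exists.

```python
import math

def coloring_regime(delta, n, delta_0=32):
    """Select which of the three regimes of the (Delta+1)-coloring
    argument applies to a triangle-free graph with max degree delta.
    Uses natural log for illustration; constants are placeholders."""
    if delta < delta_0:
        # (Delta+1)-coloring in O(delta + log* n) = O(log* n) rounds [3,20]
        return "small"
    if delta ** 0.5 >= math.log(n):
        # Theorem 1 applies directly: O(log* delta) rounds
        return "large"
    # delta_0 <= delta < log^2 n: bad components have polylog(n) size;
    # finish with network decomposition [25] in exp(O(sqrt(log log n)))
    return "intermediate"
```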

5 Conclusion

The time bounds of Theorem 1 show an interesting discontinuity. When Δ is
large we can cap the error at 1/poly(n) by using standard concentration
inequalities and a union bound. When Δ is small we can use the Moser-Tardos
LLL algorithm to reduce the failure probability again to 1/poly(n). Thus, the
distributed complexity of our coloring algorithm is tied to the distributed
complexity of the constructive Lovász Local Lemma.
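For concreteness, the Moser-Tardos algorithm referenced here is, in its centralized form, a simple resampling loop. The sketch below is a hedged illustration of that loop; the variable/event representation is invented for this example and is not the distributed variant the text refers to.

```python
def moser_tardos(samplers, bad_events, rng, max_steps=100000):
    """samplers: dict var -> function(rng) producing a fresh sample.
    bad_events: list of (predicate, vars) pairs; predicate takes the
    current assignment and returns True iff the bad event holds."""
    assignment = {x: s(rng) for x, s in samplers.items()}
    for _ in range(max_steps):
        violated = next((vs for pred, vs in bad_events
                         if pred(assignment)), None)
        if violated is None:
            return assignment  # no bad event occurs: done
        for x in violated:  # resample only the bad event's variables
            assignment[x] = samplers[x](rng)
    raise RuntimeError("exceeded max_steps")
```

Under the Lovász Local Lemma condition, the expected number of resamplings is linear in the number of bad events, which is what makes the loop efficient.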
We showed that χ(G) ≤ (4 + o(1))Δ/ ln Δ for triangle-free graphs G. It would
be interesting to see if it is possible to reduce the palette size to (1+o(1))Δ/ ln Δ,
matching Kim’s [18] bound for girth-5 graphs.
Alon et al. [2] and Vu [32] extended Johansson’s result [23] for triangle-free
graphs to obtain an O(Δ/ log f )-coloring for locally sparse graphs (the latter
also works for list coloring), in which no neighborhood of any vertex spans more
than Δ²/f edges. It would be interesting to extend our result to locally sparse
graphs and other sparse graph classes.

References

1. Alon, N., Spencer, J.H.: The Probabilistic Method. Wiley Series in Discrete Math-
ematics and Optimization. Wiley (2011)
2. Alon, N., Krivelevich, M., Sudakov, B.: Coloring graphs with sparse neighborhoods.
Journal of Combinatorial Theory, Series B 77(1), 73–82 (1999)
692 S. Pettie and H.-H. Su

3. Barenboim, L., Elkin, M.: Distributed (Δ + 1)-coloring in linear (in Δ) time.
In: STOC 2009, pp. 111–120. ACM, New York (2009)
4. Barenboim, L., Elkin, M.: Sublogarithmic distributed MIS algorithm for sparse
graphs using Nash-Williams decomposition. Distrib. Comput. 22, 363–379 (2010)
5. Barenboim, L., Elkin, M., Pettie, S., Schneider, J.: The locality of distributed
symmetry breaking. In: FOCS 2012, pp. 321–330 (October 2012)
6. Beck, J.: An algorithmic approach to the Lovász local lemma. Random Structures
& Algorithms 2(4), 343–365 (1991)
7. Bollobás, B.: Chromatic number, girth and maximal degree. Discrete Mathemat-
ics 24(3), 311–314 (1978)
8. Borodin, O.V., Kostochka, A.V.: On an upper bound of a graph’s chromatic num-
ber, depending on the graph’s degree and density. Journal of Combinatorial Theory,
Series B 23(2-3), 247–250 (1977)
9. Brooks, R.L.: On colouring the nodes of a network. Mathematical Proceedings of
the Cambridge Philosophical Society 37(02), 194–197 (1941)
10. Catlin, P.A.: A bound on the chromatic number of a graph. Discrete Math. 22(1),
81–83 (1978)
11. Elkin, M.: Personal communication
12. Grable, D.A., Panconesi, A.: Fast distributed algorithms for Brooks-Vizing color-
ings. Journal of Algorithms 37(1), 85–120 (2000)
13. Haxell, P.E.: A note on vertex list colouring. Comb. Probab. Comput. 10(4),
345–347 (2001)
14. Jamall, M.S.: A Brooks’ Theorem for Triangle-Free Graphs. ArXiv e-prints (2011)
15. Jamall, M.S.: A Coloring Algorithm for Triangle-Free Graphs. ArXiv e-prints
(2011)
16. Jamall, M.S.: Coloring Triangle-Free Graphs and Network Games. Dissertation.
University of California, San Diego (2011)
17. Jensen, T.R., Toft, B.: Graph coloring problems. Wiley-Interscience series in dis-
crete mathematics and optimization. Wiley (1995)
18. Kim, J.H.: On Brooks' theorem for sparse graphs. Combinatorics, Probability
and Computing 4, 97–132 (1995)
19. Kostochka, A.V., Mazurova, N.P.: An inequality in the theory of graph coloring.
Metody Diskret. Analiz. 30, 23–29 (1977)
20. Kuhn, F.: Weak graph colorings: distributed algorithms and applications. In: SPAA
2009, pp. 138–144. ACM, New York (2009)
21. Lawrence, J.: Covering the vertex set of a graph with subgraphs of smaller degree.
Discrete Mathematics 21(1), 61–68 (1978)
22. Linial, N.: Locality in distributed graph algorithms. SIAM J. Comput. 21(1),
193–201 (1992)
23. Molloy, M., Reed, B.: Graph Colouring and the Probabilistic Method. Algorithms
and Combinatorics. Springer (2001)
24. Moser, R.A., Tardos, G.: A constructive proof of the general Lovász local lemma.
J. ACM 57(2), 11:1–11:15 (2010)
25. Panconesi, A., Srinivasan, A.: On the complexity of distributed network decompo-
sition. Journal of Algorithms 20(2), 356–374 (1996)
26. Peleg, D.: Distributed Computing: A Locality-Sensitive Approach. Monographs on
Discrete Mathematics and Applications. SIAM (2000)
27. Reed, B.: The list colouring constants. Journal of Graph Theory 31(2), 149–153
(1999)
28. Reed, B., Sudakov, B.: Asymptotically the list colouring constants are 1. J. Comb.
Theory Ser. B 86(1), 27–37 (2002)
29. Rubinfeld, R., Tamir, G., Vardi, S., Xie, N.: Fast local computation algorithms. In:
ICS 2011, pp. 223–238 (2011)
30. Schneider, J., Wattenhofer, R.: A new technique for distributed symmetry break-
ing. In: PODC 2010, pp. 257–266. ACM, New York (2010)
31. Vizing, V.G.: Some unsolved problems in graph theory. Uspekhi Mat.
Nauk 23(6(144)), 117–134 (1968)
32. Vu, V.H.: A general upper bound on the list chromatic number of locally sparse
graphs. Comb. Probab. Comput. 11(1), 103–111 (2002)
Author Index

Abboud, Amir I-1 Cheilaris, Panagiotis I-208


Albers, Susanne II-4, II-446 Chekuri, Chandra I-328
Almagor, Shaull II-15 Chen, Hubie II-125
Alur, Rajeev II-37 Cheriyan, Joseph I-340
Anand, S. I-13 Childs, Andrew M. I-105
Andoni, Alexandr I-25 Chrétien, Rémy II-137
Aumüller, Martin I-33 Christodoulou, George II-496
Austrin, Per I-45 Chrobak, Marek I-135
Avis, David I-57 Columbus, Tobias I-93
Cortier, Véronique II-137
Babenko, Maxim I-69 Curticapean, Radu I-352
Bachrach, Yoram II-459 Cygan, Marek I-196, I-364
Barthe, Gilles II-49 Czerwiński, Wojciech II-150
Basset, Nicolas II-61 Czyzowicz, Jurek II-508
Bateni, MohammadHossein I-81
Bauer, Reinhard I-93 De, Anindya I-376
Becchetti, Luca II-472 Delaune, Stéphanie II-137
Belovs, Aleksandrs I-105 Demaine, Erik D. I-388, I-400
Benaim, Saguy II-74 Demri, Stéphane II-162
Benedikt, Michael II-74 Deniélou, Pierre-Malo II-174
Bhattacharyya, Arnab I-123 Dereniowski, Dariusz II-520
Dhar, Amit Kumar II-162
Bienkowski, Marcin I-135
Diakonikolas, Ilias I-376
Bille, Philip I-148, I-160
Di Cosmo, Roberto II-187
Bläser, Markus I-172
Dietzfelbinger, Martin I-33
Bläsius, Thomas I-184
Dieudonné, Yoann II-533
Bodlaender, Hans L. I-196
Dinur, Irit I-413
Bohler, Cecilia I-208
Dirnberger, Michael II-472
Boker, Udi II-15, II-89
Disser, Yann II-520
Bonifaci, Vincenzo II-472
Dobbs, Neil I-135
Boros, Endre I-220
Doerr, Benjamin I-255
Braverman, Mark I-232
Duan, Ran I-425
Braverman, Vladimir I-244
Bringmann, Karl I-13, I-255, I-267 Elbassioni, Khaled I-220
Brunsch, Tobias I-279 Etessami, Kousha II-199
Bulánek, Jan I-291
Bun, Mark I-303 Faust, Sebastian II-545
Byrka, Jaroslaw I-135 Fearnley, John II-212
Filmus, Yuval I-437
Carreiro, Facundo II-101 Fischer, Johannes I-148
Celis, L. Elisa II-484 Fotakis, Dimitris I-449
Chan, T.-H. Hubert I-315 Friedmann, Oliver II-224
Charatonik, Witold II-74 Friedrich, Tobias I-13, I-267
Charlier, Émilie II-113 Fu, Yuxi II-238
Chatzigiannakis, Ioannis II-657 Fusco, Emanuele G. II-557
Gairing, Martin II-496 Kaski, Petteri I-45


Ganian, Robert II-250 Kavitha, Telikepalli I-601
Gao, Zhihan I-340 Kieroński, Emanuel II-74
Garg, Naveen I-13 Kim, Eun Jung I-613
Gelderie, Marcus II-263 Klaedtke, Felix II-224
Genest, Blaise II-275 Klein, Kim-Manuel I-589
Georgiou, Konstantinos I-340 Klein, Rolf I-208
Gilbert, Anna C. I-461 Kleinberg, Jon II-1
Gimbert, Hugo II-275 Kobele, Gregory M. II-336
Gklezakos, Dimitrios C. II-484 Koivisto, Mikko I-45
Glaßer, Christian I-473 Kolmogorov, Vladimir I-625
Gørtz, Inge Li I-148, I-160 Konrad, Christian I-637
Goldberg, Andrew V. I-69 Kopelowitz, Tsvi I-148
Goldenberg, Elazar I-413 Kosowski, Adrian II-520
Golovach, Petr A. I-485 Kothari, Robin I-105
Gorı́n, Daniel II-101 Koucký, Michal I-291
Gottlob, Georg II-287 Kowalski, Dariusz R. II-632
Gravin, Nick II-569 Král’, Daniel II-250
Grier, Daniel I-497 Kranakis, Evangelos II-508
Grossi, Roberto I-504 Kratsch, Dieter I-485
Guo, Heng I-516 Kratsch, Stefan I-196
Gupta, Anupam I-69 Krinninger, Sebastian II-607
Gur, Tom I-528 Kucherov, Gregory I-650
Gurvich, Vladimir I-220 Kumar, Amit I-13
Kumar, Mrinal I-661
Hajiaghayi, MohammadTaghi I-81 Kuperberg, Denis II-89
Harris, David G. II-581 Kupferman, Orna II-15, II-89
Hartung, Sepp II-594 Kushilevitz, Eyal I-576
Hazay, Carmit II-545
Heggernes, Pinar I-485 Lampis, Michael I-673
Hemenway, Brett I-540 Landau, Gad M. I-160
Henzinger, Monika II-607 Lange, Martin II-224
Hirt, Martin I-552 Langer, Alexander I-613
Hliněný, Petr II-250 Langerman, Stefan I-388
Hoefer, Martin II-620 Lauria, Massimo I-437, I-684
Leivant, Daniel II-349
Iacono, John I-388
Lenhardt, Rastislav II-74
Im, Hyeonseung II-299
Leonardos, Nikos I-696
Indyk, Piotr I-564
Levi, Reut I-709
Ishai, Yuval I-576
Lewi, Kevin I-1
Janin, David II-312 Li, Mingfei I-315
Jansen, Klaus I-589 Li, Xin I-576
Jeffery, Stacey I-105 Liaghat, Vahid I-81
Jeż, Artur II-324 Lipmaa, Helger II-645
Jurdziński, Marcin II-212 Liu, Chih-Hung I-208
Jurdzinski, Tomasz II-632 Lohrey, Markus II-361
Lu, Pinyan II-569
Kamae, Teturo II-113
Karlin, Anna R. II-484 Määttä, Jussi I-45
Karrenbauer, Andreas II-472 Magniez, Frédéric I-105
Maheshwari, Gaurav I-661 Park, Sungwoo II-299


Makino, Kazuhisa I-220 Passen, Achim II-446
Marion, Jean-Yves II-349 Patitz, Matthew J. I-400
Martens, Wim II-150 Paul, Christophe I-613
Marx, Dániel I-721, II-28, II-125 Pelc, Andrzej II-533, II-557
Masopust, Tomáš II-150 Petreschi, Rossella II-557
Mathieu, Claire I-733 Petricek, Tomas II-385
Mauro, Jacopo II-187 Pettie, Seth II-681
Mazowiecki, Filip II-74 Pieris, Andreas II-287
Megow, Nicole I-745 Pilipczuk, Marcin I-364
Mehlhorn, Kurt I-425, II-472 Polyanskiy, Yury I-25
Mertzios, George B. II-657, II-669 Porat, Ely I-461, II-459
Michail, Othon II-657 Prabhakaran, Manoj I-576
Mikša, Mladen I-437 Pudlák, Pavel I-684
Morsy, Ehab II-581 Puzynina, Svetlana II-113
Moruz, Gabriel I-757
Mucha, Marcin I-769 Raghothaman, Mukund II-37
Muscholl, Anca II-275 Raman, Rajeev I-504
Mycroft, Alan II-385 Rao, Anup I-232
Rao, Satti Srinivasa I-504
Nagarajan, Viswanath I-69 Raptopoulos, Christoforos II-29
Nakata, Keiko II-299 Raykov, Pavel I-552
Nanongkai, Danupon II-607 Raz, Ran I-528
Naves, Guyslain I-328 Razenshteyn, Ilya I-564
Nederlof, Jesper I-196 Reidl, Felix I-613
Negoescu, Andrei I-757 Reitwießner, Christian I-473
Nekrich, Yakov I-650 Robinson, Peter II-581
Neumann, Adrian I-255 Rödl, Vojtěch I-684
Ngo, Hung Q. I-461 Rogers, Trent A. I-400
Nguyen, Dung T. I-473 Röglin, Heiko I-279
Nguy˜ên, Huy L. I-25 Ron, Dana I-709
Nichterlein, André II-594 Rosén, Adi I-637
Niedermeier, Rolf II-594 Rossmanith, Peter I-613
Nikoletseas, Sotiris II-29 Rudra, Atri I-461
Ning, Li I-315 Rutter, Ignaz I-93, I-184
Nordström, Jakob I-437
Nowicki, Tomasz I-135 Sach, Benjamin I-148
Sahai, Amit I-576
Obdržálek, Jan II-250 Saks, Michael I-291
O’Donnell, Ryan I-780 Salvati, Sylvain II-336
Olmedo, Federico II-49 Sangnier, Arnaud II-162
Orchard, Dominic II-385 Sarma M.N., Jayalal I-661
Ostrovsky, Rafail I-244, I-540, I-576 Sau, Ignasi I-613
Özkan, Özgür I-388 Schröder, Lutz II-101
Schwartz, Jarett II-250
Pacheco, Eduardo II-508 Schweller, Robert T. I-400
Padovani, Luca II-373 Selman, Alan L. I-473
Pająk, Dominik II-520
Pandurangan, Gopal II-581 Shepherd, F. Bruce I-328
Papadopoulou, Evanthia I-208 Sikdar, Somnath I-613
Singla, Sahil I-340 Vildhøj, Hjalte Wedel I-148


Skrzypczak, Michal II-89 Vilenchik, Dan I-244
Sliacan, Jakub I-255 Villanger, Yngve I-485
Solomon, Shay I-315 Vinyals, Marc I-437
Spirakis, Paul G. II-29, II-657, II-669
Srinivasan, Aravind II-581 Wagner, Dorothea I-93, I-184
Stachowiak, Grzegorz II-632 Wagner, Lisa II-620
Steinberg, Benjamin II-361 Walukiewicz, Igor II-275
Stewart, Alistair II-199 Ward, Justin I-792
Stirling, Colin II-398 Weimann, Oren I-160, I-828
Strauss, Martin J. I-461 Weinstein, Omri I-232
Su, Hsin-Hao II-681 Widmayer, Peter II-36
Suchý, Ondřej II-594 Williams, Tyson I-516
Summers, Scott M. I-400 Wimmer, Karl I-840
Sviridenko, Maxim I-135, I-769, I-792 Witek, Maximilian I-473
Świrszcz, Grzegorz I-135 Woods, Damien I-400
Woods, Kevin II-410
Tan, Li-Yang I-780 Wootters, Mary I-540
Tendera, Lidia II-287 Worrell, James II-74, II-422
Teska, Jakub II-250 Wu, Yihong I-25
Thaler, Justin I-303
Thapen, Neil I-684 Yannakakis, Mihalis II-199
Tiwary, Hans Raj I-57 Yehudayoff, Amir I-232
Toft, Tomas II-645 Yoshida, Nobuko II-174
Tzamos, Christos I-449 Yoshida, Yuichi I-123, I-840
Young, Neal E. I-135
Uppman, Hannes I-804 Yuster, Raphael I-828
Uznański, Przemyslaw II-520
Zacchiroli, Stefano II-187
Varma, Nithin M. I-601 Zamboni, Luca Q. II-113
Végh, László A. I-721 Zavattaro, Gianluigi II-187
Velner, Yaron I-816 Zavershynskyi, Maksym I-208
Venturi, Daniele II-545 Zetzsche, Georg II-361, II-434
Venturini, Rossano I-504 Zhou, Hang I-733
Verschae, José I-745 Zuckerman, David I-576
