Research Seminars 2014

Computing with Uncertainty


We consider problems where the input data is initially uncertain but the exact value of an input item can be obtained at a certain cost. For example, a typical setting is that instead of an exact value, only an interval containing the exact value is given. An update of an input item then reveals its exact value. This scenario models real-world situations where only estimates of some input parameters are known initially, but it is possible to obtain the actual values of the input parameters with some extra effort. An algorithm performs a number of updates until it has gathered sufficient information to output a correct solution to the problem. The goal is to minimise the number of updates.


We discuss several problems in the setting of computing with uncertainty, including the minimum spanning tree problem and the minimum multicut problem in trees. Algorithms are evaluated using competitive analysis, comparing the number of updates that the algorithm makes on an instance of the problem to the best possible number of updates that suffices to solve that instance.
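To make the setting concrete, here is a minimal sketch (our own illustration, not an algorithm from the talk) for one of the simplest instances of the model: identifying an item of minimum value when each item is initially known only through a closed interval, and an update reveals the exact value at unit cost.

    # Each item starts as a closed interval [lo, hi] containing its hidden exact
    # value; reveal(i) is the costly update that returns the exact value of item i.
    # The simple strategy below returns an item of minimum value together with the
    # number of updates spent; it is correct, though not necessarily update-optimal.

    def find_minimum(intervals, reveal):
        exact = {}                                   # items updated so far
        lo = [l for l, _ in intervals]               # current lower bounds
        while True:
            i = min(range(len(lo)), key=lambda j: lo[j])   # smallest lower bound
            if i in exact:
                # its exact value is <= every other lower bound, hence <= every value
                return i, len(exact)
            lo[i] = exact[i] = reveal(i)             # pay one update; interval collapses

    hidden_values = [7, 4, 9]                        # unknown to the algorithm
    intervals = [(5, 10), (3, 6), (8, 12)]
    print(find_minimum(intervals, reveal=lambda i: hidden_values[i]))
    # (1, 1): item 1 is a minimum, verified after a single update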


(The talk is based on joint work with Michael Hoffmann, Frank Kammer, Danny Krizanc, Matus Mihalak, and Rajeev Raman.)

Safe Specification of Operator Precedence Rules


Parsing is one of the most researched fields in computer science. Even today, however, generating a correct and efficient parser from a grammar specification, for example for the OCaml programming language, is difficult. There are efficient parser generators for subclasses of context-free grammars, such as LL(1) or LR(1), but the most natural grammar for a language usually does not fit these classes and requires manual rewriting. This rewriting typically introduces understandability and maintenance problems.


Generalized parsing algorithms can handle any context-free grammar, thus removing the burden of grammar rewriting, but they leave open the problem of ambiguity: a single string may yield multiple parse trees. In the case of OCaml, which is essentially an expression-based functional language, getting the correct precedence of operators is very tricky. Current techniques based on generalized parsing, for some OCaml examples, either fail to return any parse tree or fail to disambiguate the result and return multiple parse trees.


In this paper we present an approach to specifying operator precedence based on declarative disambiguation constructs and an implementation mechanism based on grammar rewriting. We identify a problem with existing generalized context-free parsing and disambiguation technology: generating a correct parser for a language such as OCaml using declarative precedence specification is not possible without resorting to some manual grammar transformation. Our approach provides a fully declarative solution to operator precedence specification for context-free grammars, is independent of any parsing technology, and is safe in that it guarantees that the language of the resulting grammar will be the same as the language of the specification grammar. We evaluate our new approach by specifying the precedence rules from the OCaml reference manual against the highly ambiguous reference grammar and validate the output of our generated parser.
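For reference, the sketch below (our own illustration, not the mechanism proposed in the paper) shows the textbook transformation that encodes operator precedence and associativity by rewriting an ambiguous grammar E -> E op E | atom into one nonterminal per precedence level. It is exactly this kind of manual, error-prone rewriting that a declarative and safe precedence specification aims to make unnecessary for the grammar writer.

    # Hypothetical precedence table: earlier entries bind more weakly.
    PRECEDENCE = [
        ("+", "left"),
        ("*", "left"),
        ("**", "right"),
    ]

    def stratify(atom="E_atom"):
        """Produce one production per precedence level for the ambiguous grammar."""
        rules = []
        for level, (op, assoc) in enumerate(PRECEDENCE):
            lhs = f"E{level}"
            tighter = f"E{level + 1}" if level + 1 < len(PRECEDENCE) else atom
            if assoc == "left":   # left recursion makes the operator left-associative
                rules.append(f"{lhs} -> {lhs} {op} {tighter} | {tighter}")
            else:                 # right recursion for right-associative operators
                rules.append(f"{lhs} -> {tighter} {op} {lhs} | {tighter}")
        return rules

    for rule in stratify():
        print(rule)
    # E0 -> E0 + E1 | E1
    # E1 -> E1 * E2 | E2
    # E2 -> E_atom ** E2 | E_atom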

A Testing Service for Software Binaries


Software testing is an essential phase of software development, yet it is still done mostly manually, which makes it expensive and is the main reason why software is still plagued by bugs with critical consequences. This talk will discuss an automated software testing service for software binaries. The service is at the intersection of the S2E testing platform and cloud computing. S2E enables environment-sensitive program analysis of closed-source software and is practical even for large, complex systems, such as an entire Microsoft Windows software stack. S2E simultaneously exercises entire families of execution paths in a scalable manner and employs virtualization to perform the desired analyses in vivo; this removes the need for the abstract models required by state-of-the-art symbolic execution tools. S2E uses dynamic binary translation to directly interpret x86 machine code, so it can analyze a wide range of software, including proprietary systems. Cloud computing facilitates the deployment of the S2E testing platform at large scale. The combination of these technologies makes it possible to have accurate, automated testing for software binaries.

Compression of finite-state automata through failure transitions


Several linear-time algorithms for automata-based pattern matching rely on failure transitions for efficient back-tracking. Like epsilon transitions, failure transitions do not consume input symbols, but unlike them, they may only be taken when no other transition is applicable. At a semantic level, this conveniently models catch-all clauses and allows for compact language representation.
Our work investigates two versions of the minimisation problem. The input is a DFA A and an integer k, and the question is whether k or more transitions can be saved by transforming the DFA into a failure automaton. In the transition-reduction problem, we replace regular transitions with failure transitions without adding states, whereas in the transition-minimisation problem, any transformation is allowed, including adding states. We show that both problems are NP-complete, and give an approximation algorithm for the transition-reduction problem.
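For illustration, a minimal sketch (our own example, not code from the paper) of the lookup semantics: a failure transition is followed, without consuming input, only when the current state has no regular transition on the next symbol.

    # delta: regular transitions, dict (state, symbol) -> state
    # fail:  failure transitions, dict state -> state (assumed acyclic)

    def accepts(delta, fail, start, accepting, word):
        state = start
        for symbol in word:
            # fall back along failure transitions until a regular transition exists
            while (state, symbol) not in delta:
                if state not in fail:          # neither regular nor failure transition
                    return False
                state = fail[state]            # consumes no input: a catch-all clause
            state = delta[(state, symbol)]
        return state in accepting

    delta = {(0, "a"): 1, (1, "b"): 2, (2, "c"): 0, (0, "b"): 0}
    fail = {2: 0}                              # state 2 falls back to state 0
    print(accepts(delta, fail, 0, {2}, "ab"))    # True
    print(accepts(delta, fail, 0, {2}, "abab"))  # True, via the failure transition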

A Collapsible Approach to Higher Order Verification


Higher-order features are becoming increasingly commonplace in industrial programming languages, e.g. the introduction of lambda expressions to C++, the popularity of Scala, or the understated appearance of functions as data in scripting languages such as Python. My research has focussed on extending automata-based verification techniques to be able to reason about higher-order function calls.
I will give an overview of the kinds of automata we use and how they relate to higher-order programs, and survey the current state of the art in verification tools for higher-order programs. I will then consider the introduction of concurrent features to higher-order programs, and describe the kinds of approaches taken so far.

The complexity of finite-valued CSPs


Let L be a set of rational-valued functions on a fixed finite domain; such a set is called a finite-valued constraint language. We are interested in the problem of minimising a function given explicitly as a sum of functions from L. We establish a dichotomy theorem with respect to exact solvability for all finite-valued languages defined on domains of arbitrary finite size. We present a simple algebraic condition that characterises the tractable cases. Moreover, we show that a single algorithm based on linear programming solves all tractable cases. Furthermore, we show that there is a single reason for intractability; namely, a very specific reduction from Max-Cut.
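To fix notation with a tiny example (an illustration only, not code from the papers): an instance over a finite-valued language is a sum of functions from the language applied to tuples of variables, and the reduction from Max-Cut mentioned above needs only a single binary function that rewards unequal labels.

    from itertools import product

    def cut(a, b):
        # cost 0 if the endpoints receive different labels, 1 otherwise, so
        # minimising the sum of these terms maximises the cut of the graph
        return 0 if a != b else 1

    def minimise(n_vars, domain, terms):
        """terms: list of (function, scope); exhaustive search, for clarity only."""
        best, best_assignment = float("inf"), None
        for assignment in product(domain, repeat=n_vars):
            value = sum(f(*(assignment[v] for v in scope)) for f, scope in terms)
            if value < best:
                best, best_assignment = value, assignment
        return best, best_assignment

    edges = [(0, 1), (1, 2), (0, 2)]           # a triangle: one edge must stay uncut
    print(minimise(3, (0, 1), [(cut, e) for e in edges]))
    # (1, (0, 0, 1)): objective value 1, i.e. a maximum cut of size 2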


(based on work published at FOCS'12 and STOC'13, joint work with J. Thapper)

A Machine-Assisted Proof of Gödel's Incompleteness Theorems


An Isabelle/HOL formalisation of Gödel's two incompleteness theorems is presented. Aspects of the development include two separate treatments of variable binding: Urban's nominal package and de Bruijn indices. The work follows Swierczkowski's detailed proof of the theorems using hereditarily finite set theory.

Strongly-Typed Language Support for Internet-Scale Information Sources in F# 3.0


"Data, data everywhere, ...." Modern computing is highly information rich, but our programming languages are information sparse, especially our strongly typed ones. In this talk, we will show that through the use of a simple, intuitive, scalable and on-demand compile-time mechanism called "type providers", we can bridge modern strongly-typed languages through to the myriad of heterogeneous data services found in the modern programming context - for example entity graphs, databases, web services, spreadsheets and even other programming languages. Are information spaces "just" libraries? Can we gives types to "everything", and if so, how does that change the way we think about types and metadata at all? What does it mean for future type systems?


The work is done and delivered as part of the F# programming language, which is open-source and cross-platform; see http://fsharp.org. At the start of the talk I'll describe the latest developments in F# and how people and companies contribute to the language.

The Rise of RaaS: the Resource-as-a-Service Cloud


Over the next few years, a new model of buying and selling cloud computing resources will evolve. Instead of providers exclusively selling server equivalent virtual machines for relatively long periods of time (as done in today’s IaaS clouds), providers will increasingly sell individual resources (such as CPU, memory, and I/O resources) for a few seconds at a time. We term this nascent economic model of cloud computing the Resource-as-a-Service (RaaS) cloud, and we argue that its rise is the likely culmination of recent trends in the construction of IaaS clouds and of the economic forces operating on both providers and clients.


Cloud providers would like to maximize their clients’ satisfaction by renting precious resources to those clients who value them the most. But real-world cloud clients are selfish: they will only tell their providers the truth about how much they value resources when it is in their own best interest to do so. How can real-world cloud providers allocate resources efficiently to those (selfish) clients who value them the most? Physical memory is the scarcest resource in today’s cloud computing platforms. We present Ginseng, the first market-driven cloud system that allocates memory efficiently to selfish cloud clients. Ginseng incentivizes selfish clients to bid their true value for the memory they need when they need it. Ginseng continuously collects client bids, finds an efficient memory allocation, and re-allocates physical memory to the clients that value it the most. Ginseng achieves a 6X–15X improvement (83%–100% of the optimum) in aggregate client satisfaction when compared with state-of-the-art approaches for cloud memory allocation.


Joint work with Assaf Schuster, Muli Ben-Yehuda, Eyal Posener, Ahuva Mu'alem, and Dan Tsafrir.

The ICT4Rehab project - A fully-integrated ICT platform for patient rehabilitation including clinical database, body tracking, serious gaming and data mining


Rehabilitation therapists have taken an interest in gaming applications since the first 'physical' games (e.g., Wii, Kinect) became available. The therapists' aim was to find an alternative to conventional therapy in order to (re-)motivate patients to perform their rehabilitation schemes. However, such games were developed for entertainment, and they are closer to fitness exercises than to proper physical rehabilitation. More recently, the Serious Gaming field has developed so-called 'adapted rehabilitation games'. This later development brought some progress, but unfortunately many rehabilitation aspects remain simplified or simply absent. The reason for this simplification is a lack of integration of key rehabilitation content within the games. For example, a detailed understanding of the patient's troubles (i.e., the anatomical system(s) involved, underlying pathologies, etc.) and of the rehabilitation schemes available to therapists should be deeply embedded into the games. Furthermore, linking the serious games with the rest of the patient's clinical history (anamnesis, clinical treatment, etc.) is required if one wishes to obtain wide clinical acceptance. Dedicated data analysis tools should also be available for clinical research and clinical reporting.

The ICT4Rehab platform includes interlinked components: a serious-gaming development platform covering several rehabilitation aspects, full patient data management through a dedicated database (including most clinical historical data alongside the patient data generated by the serious games), statistical analysis based on data mining, and clinical reporting using standard clinical and biomechanical conventions. The platform has been developed around Cerebral Palsy, but the underlying architecture allows adaptation to other pathologies linked to motion disorders. The ICT4Rehab project is currently reaching the valorization and industrialization stages.

From Big Data to Little Knowledge


The main intellectual appeal of ‘Big Data’ is its promise to generate knowledge from data. This talk will present a critical discussion of the popular view ‘more_data ⇒ more_knowledge’. I will focus on the methodological aspects of data-analytic knowledge discovery, in the context of several real-life applications ranging from financial engineering to medical diagnosis. These important methodological issues include: (a) ‘asking the right question’, e.g., the formalization of application domain requirements, (b) tuning the parameters of adaptive statistical/machine learning algorithms, and (c) the interpretation of predictive (black-box) data-analytic models.

Tractable Approximations of Graph Isomorphism


In this talk I will discuss families of equivalence relations that approximate graph isomorphism and are polynomial-time decidable. The Weisfeiler-Lehman method provides one such family, with a large variety of equivalent characterisations emerging separately from algebra, combinatorics and logic, which I will review. It is known (by Cai-Fürer-Immerman 1992) that no fixed level of the hierarchy exactly characterises isomorphism. This has recently inspired the definition of a new hierarchy of equivalences, which I will present, and which can be seen as a strengthening of the W-L method to algebras over finite fields.
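For a concrete reference point, here is a minimal sketch of the 1-dimensional variant (the standard colour refinement procedure, not material from the talk): vertices are repeatedly recoloured by their current colour together with the multiset of their neighbours' colours; differing colour histograms certify non-isomorphism, while equal histograms prove nothing.

    def refine(adj, colours):
        sig = {v: (colours[v], tuple(sorted(colours[u] for u in adj[v]))) for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        return {v: palette[sig[v]] for v in adj}

    def wl_equivalent(adj1, adj2):
        # refine the disjoint union so colour names are comparable across graphs
        adj = {("a", v): [("a", u) for u in ns] for v, ns in adj1.items()}
        adj.update({("b", v): [("b", u) for u in ns] for v, ns in adj2.items()})
        colours = {v: 0 for v in adj}
        for _ in range(len(adj)):               # refinement stabilises within n rounds
            colours = refine(adj, colours)
        hist = lambda tag: sorted(c for (t, _), c in colours.items() if t == tag)
        return hist("a") == hist("b")           # equal histograms: WL cannot distinguish

    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}                 # path on 4 vertices
    star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}                 # star on 4 vertices
    cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}    # one 6-cycle
    two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                     3: [4, 5], 4: [3, 5], 5: [3, 4]}             # two 3-cycles
    print(wl_equivalent(path, star))               # False: WL separates them
    print(wl_equivalent(cycle6, two_triangles))    # True: non-isomorphic yet WL-equivalent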


This is joint work with Bjarki Holm.

Programming Robotic Agents: A Multi-tasking Teleo-Reactive Approach


This talk will present a multi-threaded and multi-tasking robotic agent architecture in which the concurrently executing tasks, which may alternate in their use of several robotic resources, are programmed in TeleoR. TeleoR is a major extension of Nilsson's Teleo-Reactive Procedures language (TR) for robotic agents, which I described in my previous presentation at RH about 5 years ago. TeleoR-programmed tasks are robust and opportunistic, redoing or skipping robotic resource actions as appropriate. This makes them well suited to robot/robot or human/robot co-operative tasks. TeleoR's most important extension of TR is the concept and use of task atomic procedures to control the deadlock- and starvation-free use of the robotic resources by an agent's concurrent tasks. This use is illustrated in a simulation video at http://youtu.be/f81U0iHNzB0.
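To give a flavour of the underlying idea (a plain-Python illustration, not TeleoR syntax), a teleo-reactive procedure is an ordered list of guard-action rules that is continuously re-evaluated against the latest percepts, with the first applicable rule firing; the redoing and skipping of actions falls out of this re-evaluation, since a regression of the world re-enables an earlier, more basic rule.

    def tr_step(rules, percepts):
        """rules: ordered list of (guard, action); return the action to run now."""
        for guard, action in rules:
            if guard(percepts):
                return action
        raise RuntimeError("no applicable rule; TR procedures normally end with a catch-all")

    # hypothetical single-task example: achieve 'holding the block'
    rules = [
        (lambda p: p["holding"],         "task_done"),
        (lambda p: p["hand_over_block"], "grasp"),
        (lambda p: True,                 "move_hand_to_block"),   # catch-all rule
    ]

    print(tr_step(rules, {"holding": False, "hand_over_block": False}))  # move_hand_to_block
    print(tr_step(rules, {"holding": False, "hand_over_block": True}))   # grasp
    print(tr_step(rules, {"holding": True,  "hand_over_block": True}))   # task_done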

Feedback Generation for Performance Problems in Introductory Programming Assignments


Providing feedback on programming assignments manually is a tedious, error-prone, and time-consuming task. In this talk, I will motivate and address the problem of generating feedback on performance aspects in introductory programming assignments. We studied a large number of functionally correct student solutions to introductory programming assignments and observed: (1) There are different algorithmic strategies, with varying levels of efficiency, for solving a given problem. These different strategies merit different feedback. (2) The same algorithmic strategy can be implemented in countless different ways, and these differences are not relevant when reporting feedback on the student program.


We propose a lightweight programming language extension that allows a teacher to define an algorithmic strategy by specifying certain key values that should occur during the execution of an implementation. We describe an approach based on dynamic analysis to test whether a student's program matches a teacher's specification. Our experimental results illustrate the effectiveness of both our specification language and our dynamic analysis.
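A minimal sketch of the idea (our own illustration; the paper defines its own specification language and analysis): the teacher names key values that must occur during execution, the student's program is instrumented to record intermediate values, and the dynamic check tests whether the key values appear, in order, in the recorded trace.

    def observe(trace, value):
        trace.append(value)                 # record an intermediate value
        return value

    def student_power(b, n, trace):         # a hypothetical student submission
        result = 1
        for _ in range(n):
            result = observe(trace, result * b)
        return result

    def matches_key_values(trace, key_values):
        """Do the key values occur in the trace, in order (subsequence test)?"""
        it = iter(trace)
        return all(any(k == t for t in it) for k in key_values)

    trace = []
    student_power(2, 4, trace)
    teacher_key_values = [2, 4, 8, 16]      # hypothetical spec for base 2, exponent 4
    print(matches_key_values(trace, teacher_key_values))   # True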

Fixed-parameter tractable canonization and isomorphism test for graphs of bounded treewidth


In the talk I will sketch a fixed-parameter tractable algorithm that, given a parameter k and two graphs G1, G2, either concludes that one of these graphs has treewidth at least k, or determines whether G1 and G2 are isomorphic. The running time of the algorithm on an n-vertex graph is 2^{O(k^5 log k)} * n^5, and this is the first fixed-parameter algorithm for Graph Isomorphism parameterized by treewidth. The algorithm relies on a surprisingly simple twist in the classic approximation algorithm for treewidth due to Robertson and Seymour that makes the choices made in the algorithm isomorphism-invariant. This is joint work with Daniel Lokshtanov, Michal Pilipczuk, and Saket Saurabh, to be presented at FOCS'14.

"Think global, act local!" Or the other way around?


Choreographic approaches to the design, analysis, and implementation of distributed and concurrent applications hinge on the idea that one first comes up with a specification that provides a 'global viewpoint' of the application, and then obtains specifications of 'local viewpoints' that implementations can use as a blueprint to check their correctness. In this way, the development of applications is "top-down" (going from the specification to the implementation).


In this talk I will first give an intuition for "choreographic" approaches, then discuss some recent results based on choreographies that guarantee properties of distributed applications, and finally show that choreographies can also be used "bottom-up" to determine whether distributed components can interact harmoniously.

Verification of Asynchronously Communicating Systems


Since software systems are becoming increasingly concurrent and distributed, modelling and verifying the interactions among their components is a crucial problem. Such modelling can be approached in two ways, namely top-down or bottom-up. In several application domains these systems are designed around message communication and the interactions among components. For instance, in Service-Oriented Computing, the top-down model is referred to as a choreography, and it describes a communication contract between the services participating in a composition. In the bottom-up model, services are described using interaction protocols formalised as transition systems, i.e. service behaviours. Two fundamental questions relate to these approaches: given a communication contract, is it possible to realise (implement) that contract? Given a set of distributed components, is it possible to check their interoperability with respect to a correctness property? We are interested in studying both questions for systems communicating asynchronously through unbounded FIFO buffers. We propose sufficient and necessary conditions for checking the realizability and the compatibility of distributed systems.

Measuring disease similarity through ontological analysis of phenotypes


Over the past few decades, advances in molecular biology have greatly increased our understanding of disease. Our perspective on diseases has evolved from simple causal relationships linking a single gene to a disease towards a more complex relationship linking biological network modules (possibly containing many genes) to diseases. In this context, a disease is the consequence of a perturbation in the underlying biological networks. For a large number of diseases the location of these perturbations is unknown, and in order to locate them, it would be important to be able to measure the distance from such diseases to those whose perturbations are known.


In this talk, I will present a novel approach that we have developed for solving this problem. Our fundamental assumption is that diseases whose symptoms are similar have perturbations which are close in the networks, and therefore we measure phenotypic similarity to quantify their distances at a molecular level. We devised a method to obtain similarity scores for all pairs of diseases by associating all inherited diseases in OMIM with terms in the MeSH ontologies, and calculating semantic similarities. Our results not only show a significant improvement over the state of the art, but also provide insight into the effective integration of different databases to exploit existing phenotype information to characterise the genotype-phenotype relationship.
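To illustrate one standard way of scoring semantic similarity over an ontology (an illustration only; the measure used in this work may differ), the sketch below computes a Resnik-style score on a toy is-a hierarchy: the similarity of two terms is the information content of their most informative common ancestor.

    import math

    # toy is-a hierarchy (term -> parents) and annotation counts; purely hypothetical
    parents = {
        "cardiomyopathy": ["heart_disease"],
        "arrhythmia": ["heart_disease"],
        "hypertension": ["cardiovascular_disease"],
        "heart_disease": ["cardiovascular_disease"],
        "cardiovascular_disease": ["disease"],
        "disease": [],
    }
    counts = {"cardiomyopathy": 5, "arrhythmia": 5, "hypertension": 10,
              "heart_disease": 20, "cardiovascular_disease": 40, "disease": 100}

    def ancestors(term):
        seen, stack = {term}, list(parents[term])
        while stack:
            t = stack.pop()
            if t not in seen:
                seen.add(t)
                stack.extend(parents[t])
        return seen

    def information_content(term):
        return -math.log(counts[term] / counts["disease"])   # IC = -log p(term)

    def resnik_similarity(t1, t2):
        return max(information_content(t) for t in ancestors(t1) & ancestors(t2))

    print(resnik_similarity("cardiomyopathy", "arrhythmia"))     # shares heart_disease
    print(resnik_similarity("cardiomyopathy", "hypertension"))   # only cardiovascular_disease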

Challenges in modelling and analyzing quantitative aspects of a bike-sharing product line


Bike-sharing systems are becoming popular not only as a sustainable means of transportation in the urban environment, but also as a challenging case study that presents interesting run-time optimization problems. As a side study within a research project on quantitative analysis that used bike sharing as a case study, we observed that the deployed systems exhibit a wide variety of features.


We have therefore applied variability analysis to define a family of bike-sharing systems, and we have sought support in available tools. We have thus established a tool chain of (academic) tools that provide different functionalities for the analysis of software product lines, from feature modelling to product derivation, and from quantitative evaluation of product attributes to model checking value-passing modal specifications.


The tool chain is currently being evaluated within the aforementioned project as a complement to more sophisticated product-based analysis techniques.

Preference Aggregation on Structured Domains


Arrow's famous impossibility theorem (1951) states that there is no perfect voting rule: for three or more candidates, no voting rule can satisfy a small set of very appealing axioms. However, this is no longer the case if we assume that voters' preferences satisfy certain restrictions, such as being single-peaked or single-crossing. In this talk, we discuss single-peaked and single-crossing elections, as well as some other closely related restricted preference domains, and associated algorithmic questions.
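As a small illustration of what such a restriction means (our own sketch, not material from the talk), the following checks whether a profile is single-peaked with respect to a given left-to-right axis: each voter has a most-preferred candidate, and preference must decrease monotonically as one moves away from that peak along the axis.

    def is_single_peaked(profile, axis):
        """profile: list of rankings (best first); axis: left-to-right candidate order."""
        position = {c: i for i, c in enumerate(axis)}
        for ranking in profile:
            rank = {c: i for i, c in enumerate(ranking)}
            peak = position[ranking[0]]
            for a in ranking:
                for b in ranking:
                    # if a lies between the peak and b on the axis,
                    # the voter must prefer a to b
                    if (peak <= position[a] < position[b]
                            or position[b] < position[a] <= peak):
                        if rank[a] > rank[b]:
                            return False
        return True

    axis = ["left", "centre_left", "centre_right", "right"]
    profile = [
        ["centre_left", "left", "centre_right", "right"],
        ["right", "centre_right", "centre_left", "left"],
    ]
    print(is_single_peaked(profile, axis))   # True for this profile on this axis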

Constraint Solving in Symbolic Execution

Dynamic symbolic execution is an automated program analysis technique that employs an SMT solver to systematically explore paths through a program. It has been acknowledged in recent years as a highly effective technique for generating high-coverage test suites as well as for uncovering deep corner-case bugs in complex real-world software, and the tremendous advances in SMT-solving technology are among the key factors responsible for that success. Nevertheless, constraint solving remains one of the fundamental challenges of symbolic execution, and for many programs it is the main performance bottleneck.


In this talk, I will first give a very general introduction to symbolic execution, and then I will present the results reported in our CAV 2013 paper on integrating support for multiple SMT solvers in the dynamic symbolic execution engine KLEE. In particular, I will outline the key characteristics of the SMT queries generated during symbolic execution, describe several high-level optimisations that KLEE employs to exploit those characteristics, introduce an extension of KLEE that uses a number of state-of-the-art SMT solvers (Boolector, STP and Z3) through the metaSMT solver framework, and compare the solvers' performance when run under KLEE. In addition, I will discuss several options for designing a parallel portfolio solver for symbolic execution tools. This is joint work with Cristian Cadar.
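As a tiny concrete example of the queries involved (a sketch using the z3 Python bindings for illustration; KLEE itself runs on LLVM bitcode and talks to solvers through its own expression layer and metaSMT), a symbolic executor conjoins the current path condition with a branch condition and asks the solver whether the branch is feasible, using a model as a concrete test input when it is.

    from z3 import BitVec, Solver, ULT, sat

    x = BitVec("x", 32)                    # a symbolic 32-bit program input
    path_condition = ULT(x, 100)           # constraints collected along the path so far
    branch_condition = (x & 1) == 1        # the branch we would like to explore next

    solver = Solver()
    solver.add(path_condition, branch_condition)
    if solver.check() == sat:
        print("branch feasible, test input:", solver.model()[x])   # some odd x below 100
    else:
        print("branch infeasible, prune this path")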

Proof of the Conforti-Rao Conjecture for linear balanced matrices


Balanced matrices were introduced by Berge in 1970. A 0,1 matrix is balanced if for every square submatrix with two 1’s per row and column, the number of 1’s is a multiple of 4. This notion was extended to 0,+1,-1 matrices by Truemper in 1982. These matrices have been studied extensively in the literature due to their important polyhedral properties.
The structure of these matrices has been studied in terms of their bipartite graph representations: a bipartite graph is balanced if every chordless cycle has length 0 mod 4, and it is balanceable if there exists an assignment of weights +1 and -1 to the edges of the graph so that, in the resulting weighted bipartite graph, every chordless cycle has weight 0 mod 4. These structural results led to polynomial-time recognition algorithms for these classes.
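For concreteness, a brute-force sketch of the matrix definition (exponential, an illustration only; the recognition algorithms referred to above are far more sophisticated):

    from itertools import combinations

    def is_balanced(M):
        """0/1 matrix as a list of rows; checks the defining condition exhaustively."""
        m, n = len(M), len(M[0])
        for k in range(2, min(m, n) + 1):
            for rows in combinations(range(m), k):
                for cols in combinations(range(n), k):
                    sub = [[M[r][c] for c in cols] for r in rows]
                    two_per_row = all(sum(row) == 2 for row in sub)
                    two_per_col = all(sum(sub[i][j] for i in range(k)) == 2 for j in range(k))
                    if two_per_row and two_per_col and sum(map(sum, sub)) % 4 != 0:
                        return False
        return True

    # the vertex-edge incidence matrix of a triangle is the classic unbalanced example
    triangle = [[1, 1, 0],
                [1, 0, 1],
                [0, 1, 1]]
    print(is_balanced(triangle))   # False: two 1's per row and column, but six 1's in total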


In 1992 Conforti and Rao conjectured that every balanced bipartite graph has an edge that is not the unique chord of a cycle. The conjecture was posed in a paper in which they study the structure of linear balanced matrices (i.e. bipartite graphs), but it was left open how to use the decomposition theorem obtained there to prove the conjecture for this class. In this talk I will present the proof of the Conforti-Rao Conjecture for linear balanced and other classes of matrices, using known decomposition theorems. This is joint work with Aboulker, Radovanović, Trotignon and Trunck.
