
Summary: Cook (1971) The Complexity of Theorem-Proving Procedures

[Summary written by Carlo Maria Scandolo]

A decision or recognition problem concerns the issue of determining if an element (input string) is in a given set (language).
We want to study the complexity of such problems. We can do this by considering ordinary (multi-tape) Turing machines as well as
non-deterministic Turing machines. There are two main results in the paper concerning propositional logic:


The first states that every problem that can be solved in polynomial time by a non-deterministic Turing machine can be reduced to checking whether a given logical proposition expressed in disjunctive normal form is a tautology. In more modern terminology, we would say that this tautology problem is NP-hard (equivalently, its complement, satisfiability, is NP-complete). In other words, if we could solve the tautology problem efficiently, we would also solve all problems in the class NP, namely all problems that can be
solved in polynomial time by a non-deterministic Turing machine.
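In modern notation (stated for satisfiability rather than the disjunctive-normal-form tautology problem Cook actually worked with), this first result is usually written: for every language L in NP there is a polynomial-time computable function f such that

    \[ x \in L \iff f(x) \in \mathrm{SAT}. \]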

The second result establishes that checking whether a logical proposition in disjunctive normal form is a tautology has the same degree of difficulty as the following problems:

  1. checking whether a proposition (expressed in any form) is a
    tautology;
  2. checking whether a proposition in disjunctive normal form where each disjunct contains at most three conjuncts (literals) is a tautology;
  3. given two finite undirected graphs, checking whether the former is isomorphic to a subgraph of the latter.

To sum up, it is in general a hard problem to determine whether a given
logical proposition is a tautology, even when the proposition is expressed in disjunctive normal form. Another consequence is that it is probably fruitless to look for a polynomial-time algorithm for the subgraph isomorphism decision problem on a deterministic Turing machine: that problem is as hard as the tautology problem.

Bibliography
S. A. Cook, "The Complexity of Theorem-Proving Procedures," Proceedings of the Third Annual ACM Symposium on Theory of Computing, 1971.

Summarizing a Lecture by Von Neumann Given in 1948 on the Theory of Automata

[Summary written by Patrick Gemmell]

We read The General and Logical Theory of Automata, a lecture given by John Von Neumann at the Hixon Symposium, Pasadena, in September, 1948. (At about 40 pages, we can only assume the talk finished some time in 1949.)

In actual fact, the lecture was fairly entertaining from a historical perspective, and it gives some idea of the direction in which people thought computing, neuroscience, and AI might be heading in 1948.

First off, JVN (John Von Neumann) was keen to argue for an axiomatic black box approach to understanding biology and brains. In particular, he saw brains as collections of about 10 billion black boxes that emitted pulses and behaved in an essentially digital fashion. (The figure for the number of neurons is, as far as we know, roughly correct, though it must be said that neurons are often modelled in more detail, e.g. using the Hodgkin-Huxley model, for which the Nobel Prize was awarded in ‘63.) Amusingly, JVN described the computer as smaller than the brain “in the sense which really matters,” i.e. switching complexity. Today the computer is smaller in both senses (a modern A8 chip from an iPhone has about 1 billion switches) though tomorrow it may be the brain which will be smaller in the “only sense that matters.”

JVN also took time to contrast analogous machines (analogue computers, where numbers are represented by physical quantities such as current, rotation, or fluid) with digital ones. According to JVN, the main advantage of digital machines is that they can carry out calculations to whatever precision we design them for, while analogous machines always have, in practice, a fairly limited signal-to-noise ratio. On the other hand, JVN pointed out that each step of a digital calculation must be correct or else errors can be very large: interestingly, JVN could think of no other human endeavour where the result essentially depends on a sequence of a billion steps. Nor could we, can you? Today most people seem to have forgotten about both of these interesting characteristics of digital computers and merely take them for granted.

In the talk, JVN was prescient about future developments in some ways but missed the mark in others. For example, JVN was clear that it is important to determine the number of steps needed in a computing procedure (a la computational complexity theory) rather than just whether a procedure can be computed in principle (a la computability theory/Turing '36). On the other hand, JVN thought that an equally important development would be to take account of the non-zero probability of errors at intermediate steps in a computation. Today, algorithms are usually accompanied by descriptions of their complexity, but no one really worries about the probability of failure, as this is taken care of by coding layers, transmission protocols, and hardware redundancy. JVN also thought that there was "no doubt" that one can design self-repairing machines, though today self-healing computers remain a daydream, so that, again, we tend to rely on redundancy more than anything more organically inspired.

Towards the end of his talk, JVN discussed self-reproducing automata and neuroscience. This was perhaps the most interesting part of the reading because JVN was opinionated, and may yet be proved right. [JVN's stance seems to reflect his interest in natural science and appears in contrast to Turing's, as well as to the major thread of AI that dominated the post-cybernetics/pre-connectionist era.] First, JVN thought it unfortunate that computation was so closely connected to formal logic, which he thought of as a rigid, combinatorial field, divorced from the more cultivated area of number and analysis. Second, JVN thought that, given biology's success at dealing with abstract concepts, logic and computing would eventually borrow more from neuroscience than neuroscience would from formal logic. While there have been many advances on the logical side of computing since 1948, it does seem that today more excitement is being generated by deep learning or hierarchical Bayesian modelling than by symbolic AI; certainly, if an undergraduate were asked to solve JVN's example problem of recognising triangles, they would be much more likely to train up a neural network than to think about how to describe the attributes of geometric objects in logical terms. On the other hand, the majority of computing is not (yet!) AI, and the logical/discrete techniques developed since JVN's talk support the bulk of the big ideas in the computer science curriculum, such as languages, automata, compilers, formal verification, database theory, operating systems, and so on.

JVN's talk finished with a discussion of self-replicating machines. Here JVN explained that all one needs to make self-replicating machines is (a) a machine that fabricates whatever machine is described on a blueprint, (b) a machine that duplicates blueprints, and (c) a machine that takes a blueprint, feeds it to the manufacturing machine (a), copies it with the copying machine (b), and then shoves the newly copied blueprint into the newly minted machine. Here we were finally a little incredulous, and decided that JVN's black-box thinking had gone a little too far: now everything interesting was happening inside the black box. Perhaps we should read about the Universal Constructor next.
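A rough software analogue of this blueprint-plus-copier trick (our analogy, not one JVN draws) is a quine: comments aside, the two-line Python program below prints its own source text, with the string s playing the role of the blueprint that is both instantiated and copied into the output.

    # s is the blueprint; print(s % s) builds the program text from it and
    # also copies the blueprint itself into the output.
    s = 's = %r\nprint(s %% s)'
    print(s % s)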

Summary: Russell Impagliazzo (1995) “A personal view of average-case complexity”

[Summary written by Jotun Hein]

This paper cannot be called a classic, but it was informative and opinionated, and it led us to the key papers in the field in no time. The paper consists of two quite separate parts:

A – a classification of the discrete computational world into five possible countries. I found it a bit pointless, and I could not see the relationship between the first two countries and the last three.

ALGORITHMICA: here P = NP.

HEURISTICA: here problems are polynomial on average over all possible data sets of a given size, but intractable in the worst case.

PESSILAND [Danes might swap E & I]: here there are problems that are hard on average, but no one-way (encryption) functions.

MINICRYPT: here one-way functions exist, but public-key encryption is impossible (must mean that the function can be inverted).

CRYPTOMANIA: here public-key encryption is possible.

B goes through Levin's original definition of average-case complexity. Unfortunately, no real example is given of a problem where there is a difference between average-case and worst-case complexity. The ideas are quite understandable. An algorithm can only have polynomial average-case complexity if the set of inputs on which it is not polynomial shrinks sufficiently fast as the input size grows. One tricky thing is that the complexity is not defined in terms of the algorithm at hand but only in terms of the problem together with a distribution on inputs. There are a lot of things not discussed here, like how polynomial transformations of one problem into another skew the distribution on the possible data sets.
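For reference, the definition usually attributed to Levin (this is the standard textbook formulation, not a formula copied from Impagliazzo's paper) says that an algorithm with running time t(x) is polynomial on average with respect to an input distribution \mu if there is an \varepsilon > 0 such that

    \[ \sum_{x} \mu(x)\, \frac{t(x)^{\varepsilon}}{|x|} \; < \; \infty . \]

The exponent \varepsilon is what allows rare, very slow inputs, provided their probability falls off fast enough.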

I think complexity based on distributions would be interesting to pursue in the following case: generating data from the coalescent with mutation on a finite string. If you have long strings and a low mutation rate, you are in the perfect-phylogeny domain, where a linear-time algorithm exists. If you make the mutation rate infinite, you get a uniform distribution on all data sets, and the worst-case complexity is NP-complete when the distribution is not taken into account.

Summary: Shor (1999) Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer

[Summary by Søren Riis]

On the morning of Wednesday 18th of May, a reduced select group met to discuss “Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer,” P. W. Shor [1].

Finding the factors of a given large number is a fundamental computational problem called factoring. Shor's result is that factoring can be solved in roughly n^2 steps on a quantum computer (where n is the number of bits of the number we want to factor). In contrast, no polynomial-time algorithm is known for an ordinary computer; the best known classical algorithms take superpolynomial time. A traditional computer is based on bits, each of which is either zero or one. A quantum computer instead uses qubits; a qubit can be zero, one, or some superposition of zero and one. The way Shor's algorithm works is that it first converts the problem of factoring into the problem of finding the period of a very long sequence. This period finding is the central quantum-mechanical part of Shor's algorithm. Once the period is found, the result can be used to factorise the number.
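As a purely classical illustration of that reduction (our own sketch, with our own function names: the period is found by brute force rather than by the quantum Fourier transform, so it only works for tiny numbers), a factor can be extracted from the period with a couple of gcd computations:

    from math import gcd
    import random

    def find_period(a, n):
        # Smallest r > 0 with a^r = 1 (mod n). This is the step a quantum
        # computer speeds up; done classically like this it takes time
        # exponential in the number of bits of n.
        r, x = 1, a % n
        while x != 1:
            x = (x * a) % n
            r += 1
        return r

    def shor_classical(n):
        # Reduction from factoring to period finding, for an odd composite n
        # that is not a prime power.
        while True:
            a = random.randrange(2, n)
            d = gcd(a, n)
            if d > 1:
                return d                  # lucky guess: a already shares a factor
            r = find_period(a, n)
            if r % 2 == 1:
                continue                  # need an even period
            y = pow(a, r // 2, n)
            if y == n - 1:
                continue                  # trivial square root of 1, try again
            return gcd(y - 1, n)          # non-trivial factor of n

    print(shor_classical(15))             # prints 3 or 5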

Public-key cryptography, e.g. the RSA algorithm we discussed in a previous meeting, relies on the assumption that factoring is computationally hard. RSA would be rendered insecure if Shor's quantum factoring algorithm could be implemented at scale.
After the meeting, I had some fun writing a simple Python program on my laptop that simulates Shor's algorithm. Maybe at a later stage this program could be turned into a proper teaching tool.


[1] P. W. Shor, “Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer,” SIAM Rev., vol. 41, no. 2, pp. 303–332, Jan. 1999.

Summary: Rivest, Shamir, and Adleman (1978) A Method for Obtaining Digital Signatures and Public-Key Cryptosystems

On the morning of April 27, we convened to read "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems" by Rivest, Shamir, and Adleman [1]. The following summary was written by Mathias C. Cornjäger:

This paper (first published in 1978) outlined the first practicable public-key encryption system. This encryption system is still in wide use today and is known by the initials of the paper's three authors: RSA.

The paper itself is well written and readable. The first four sections give an introduction to the theory of "public-key cryptosystems" (a concept already known at the time the paper was written), after which the authors outline their proposed implementation of such a system, along with arguments for why they think it ought to be hard to break.

The goal of a "public key cryptosystem" is to establish a protocol for exchanging private messages between two parties via insecure channels of communication, without relying on shared secrets between the two parties or on a trusted third party. Such protocols had already been proposed earlier by Diffie and Hellman [2], who elaborated on how a class of encryption and decryption functions satisfying a list of four axioms could in theory be used to facilitate private messages and digital signatures. It is worth reiterating that the main contribution of this paper was to give the first example of a class of functions satisfying these four axioms.

An interesting detail which these first sections laid out quite clearly was the difference between authentication and signing of messages. A message sent from Alice to Bob can be authenticated if Bob is able to satisfy himself that the message he has just received was indeed sent by Alice (or at least by someone with knowledge of Alice's private key), and that it has not been modified in transit. The same message is regarded as signed only if Bob can convince not only himself, but also a third party (say, a judge, should a contractual dispute arise), that the message he received was in fact sent by Alice and that no one (including Bob himself) could have altered or forged the signed message.

In the proposed encryption-scheme, a public-private key-pair consists of two tuples (e,n) and (d,n), where n is the product of two large primes p and q. If these numbers satisfy the conditions that

  1. d is co-prime to (p-1)*(q-1), and
  2. e*d = 1 holds modulo (p-1)*(q-1),

it can be proven that for any non-negative integer M < n the identity (M^e)^d = (M^d)^e = M holds modulo n. Encryption then consists of raising a number to the power e modulo n, and decryption of raising it to the power d (again modulo n).

The proposed method for encrypting a message intended for the owner of the public key (e,n) is then to encode the message as a number M < n, and from M to obtain the cipher C = M^e (mod n) (if the message is too long, it is broken up into blocks).
This cipher C can then be transmitted, and the intended recipient can decode it by computing C^d = (M^e)^d = M (mod n). Anyone else who obtains C will have a hard time recovering M, since they do not know d.
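To make the arithmetic concrete, here is a toy Python sketch (the tiny primes and the particular value of d are illustrative choices of ours, not numbers from the paper; real keys use primes hundreds of digits long). It follows the paper's order of picking d first and deriving e as its inverse; the modular inverse via pow needs Python 3.8 or later.

    from math import gcd

    p, q = 61, 53              # toy primes for illustration only
    n = p * q                  # public modulus
    phi = (p - 1) * (q - 1)

    d = 2753                   # chosen coprime to (p-1)*(q-1)
    assert gcd(d, phi) == 1
    e = pow(d, -1, phi)        # e*d = 1 (mod (p-1)*(q-1)), Python 3.8+

    M = 42                     # message encoded as a number M < n
    C = pow(M, e, n)           # encryption: C = M^e (mod n)
    assert pow(C, d, n) == M   # decryption: C^d = (M^e)^d = M (mod n)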

Since (e,n) is public, and d can be computed from (e, p, q) using the extended Euclidean algorithm, the security of the cryptosystem relies on large numbers being difficult to factor (it would otherwise be easy for a third party to obtain p and q from n). The problem of factorizing integers is a well-studied problem in number theory, and no efficient method has been found in the past 300 years (and it is unknown whether any exists). With this in mind, the authors argue that the system is unlikely to be broken any time soon.

The last sections of the paper briefly outline proposed algorithms for efficiently generating key-pairs, as well as a discussion of why other methods of obtaining d from (e,n) can be conjectured to be at least as hard as factoring n.

Next meeting (May 18) we intend to review Shor's paper [3], in which he proposes a polynomial-time factoring algorithm (on a quantum computer).

References:
[1] R. L. Rivest, A. Shamir, and L. Adleman, “A method for obtaining digital signatures and public-key cryptosystems,” Commun. ACM, vol. 21, no. 2, pp. 120–126, 1978.
[2] W. Diffie and M. E. Hellman, "New directions in cryptography," IEEE Trans. Inform. Theory, vol. IT-22, no. 6, pp. 644–654, Nov. 1976.
[3] P. W. Shor, “Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer,” SIAM Rev., vol. 41, no. 2, pp. 303–332, Jan. 1999.

Summary: Bell (1964) On the Einstein-Podolsky-Rosen paradox

[Summary by Søren Riis]
On Friday 8th April, we discussed the paper "On the Einstein-Podolsky-Rosen paradox", Physics 1 (3), 195 (1964), by John Bell, in which he introduced what later became known as Bell's inequality.

One of the remarkable features of quantum physics is that a locally realistic theory cannot always describe measurements on spatially separated systems. Correlations that are allowed within the framework of a locally causal theory are usually referred to as classical.
The point is that quantum mechanics does not always result in “classical” correlations, and this can be quantified by the violation of Bell’s inequalities. These are constraints that have to be satisfied by any classical correlations.
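The inequality in Bell's 1964 paper can be written, for three measurement settings a, b, c and the correlation function P predicted by any local hidden-variable theory, as

    \[ 1 + P(b, c) \;\ge\; \lvert P(a, b) - P(a, c) \rvert , \]

and quantum mechanics predicts correlations for entangled spin measurements that violate this bound for suitable choices of the settings.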

In the meeting, we went over Bell's paper. I was already familiar with the 1935 Einstein-Podolsky-Rosen paradox (the EPR paradox), and I have also tried to get my head round Bohr's lesser-known response (also from 1935) to the EPR paper. I had also read various modern presentations of Bell's inequality, but I had never read Bell's original paper.

I think John Bell’s interpretation of quantum physics is somewhat unclear. In Quantum Gravity 2, pp. 611 (1981) he wrote:
“When the ‘system’ in question is the whole world where is the ‘measurer’ to be found? Inside, rather than outside, presumably. What exactly qualifies some subsystems to play this role? Was the world wave function waiting to jump for thousands of millions of years until a single-celled living creature appeared? Or did it have to wait a little longer for some highly qualified measurer — with a Ph.D.?”

Without going into the technical details a (simplified) way to think of the problem is that Alice and Bob are performing measurements of so-called entangled particles at two spatially separated places. The entanglement ensures that there is 100% correlation between Alice’s result of a measurement on her particle and the corresponding measurement on Bob’s particle, provided they measure the same quantity.

In the setup for Bell's inequality, Alice can choose between three measurements A, B and C, and Bob also has a choice between the same three measurements A, B and C.
Expressed somewhat crudely (and stripping away further details), quantum physics predicts that we can have non-transitivity, so:

  1. the result of A is strongly correlated with the result of B, AND
  2. the result of B is strongly correlated with the result of C YET
  3. the result of A is uncorrelated with the result of C.

To dramatise the "essence" of the situation further, imagine that Alice and Bob repeatedly choose two measurements among A, B and C. After hundreds of repeated experiments they notice that

  1. if they measure A and B they always get the same result;
  2. if they measure B and C they always get the same result.

But to their utter amazement, they notice that

  3. if they measure A and C they always get two different (opposite) results.

There have been various attempts to interpret quantum physics to make sense of such “impossible” correlations.
Within the last ten years, Bohr’s Copenhagen interpretation has lost its status as the primary interpretation, and an increasing number of physicists support wild interpretations like Everett’s many-worlds interpretation.

In my dramatised EPR thought experiment there is no reason to believe Alice and Bob would conclude that they split into different copies of themselves. The way I see the problem (which I think is the essence of the Copenhagen interpretation), it seems more natural that Alice and Bob find that the world as a whole follows some mathematical laws, so even their choice of measurements is part of this mathematical pattern.
From this view, it is not just the two particles that are entangled. It seems that they would be forced to conclude that their choice of which measurement to perform is similarly entangled, in agreement with some underlying mathematical principles constraining the universe as a whole.

Summary: Papadimitriou (2014) Algorithms, complexity, and the sciences

[Summary by Jotun Hein]

On Wednesday February 10th at 10 AM we discussed "Algorithms, complexity, and the sciences" by Christos Papadimitriou, published in PNAS in 2014.

This is in principle an outlier, since you can't call a review published two years ago a classic, but computation is so central to all we do that we felt discussing it would be an ideal way to update ourselves and possibly choose the next CLASSICS to read.

As summarized by Patrick Gemmell (famous unbeliever and Facebook-denier): algorithms are analysed in terms of complexity – roughly speaking, the amount of time one may have to wait for the algorithm to produce a solution. In practice, algorithm designers have usually found that the problems their algorithms address fall into two groups: those that can be solved efficiently (in polynomial time) and those that are tougher. Though it is technically an open question whether all search problems can be solved efficiently (shortest path, partition into pairs, min-cut, Euler path, and linear programming can be), it is thought that some problems (e.g. travelling salesman, partition, clique, Hamiltonian path, integer linear programming) cannot. This belief is, roughly speaking, formalized as the P != NP conjecture.

The class NP denotes the class of search problems. P denotes the class of search problems that can be solved efficiently. If P = NP life would be all rainbows and unicorns. (Aaronson has spoken about this science fiction scenario). Probably P != NP and life will remain tough.

People are particularly interested in the NP-complete problems. These problems are the same in the sense that if you can show that one of them can be solved efficiently, then you can show that they all can, and furthermore that P = NP. (Note: a new tough problem is shown to be NP-complete by providing a reduction – an efficient way of turning the input of one problem into an equivalent input for another. Also note: most problems that have not been classified as in P or as NP-complete are total, having the property that a solution must exist for every input, e.g. prime factorization.)

Considering complexity leads to some interesting and quite philosophical questions. For example, in economics the notion of Nash equilibrium is often used. Traditionally, rational agents are conceived as playing games and arriving at situations whereby all players are doing as well as they can, given the actions of the others. Papadimitriou suggests that "because groups of players are supposed to attain this pattern of play in real life, there should be a way to simulate [the equilibrium finding] process on a computer in less than astronomical time." Unfortunately, recent results have shown that finding Nash equilibria is a PPAD-complete problem, and therefore unlikely to be efficiently solvable. So, even if one is willing to accept that economic agents behave rationally (and there are plenty of reasons why they might not), it is not clear how agents could be solving the difficult problem of finding Nash equilibria in general. If algorithmic processes are not able to find a solution efficiently, why should we expect a human being to be able to do so, even if she is supported by an IT department and an army of Oxford-educated management consultants? Furthermore, why should we allow economic reasoning that assumes agents can find Nash equilibria to dominate public policy? I'm sure economists have discussed the pros and cons of game theory as a model in great depth, but perhaps not from the computational perspective.

In general, expecting the limitations of a computer to apply to nature is a strange idea. Digital philosophers take things to the extreme by hypothesising that the universe is a computer. Is the behaviour of nature subject to algorithmic constraints? Should scientific theories be efficiently implementable on a computer before they are taken seriously? This seems rather extreme because, despite its algorithmic universality, a Turing machine is a very particular kind of conceptual device, one that goes through a sequence of states. It's not clear we want to use it everywhere. (Turing '52 did not worry about whether his reaction-diffusion systems were efficiently computable on a Turing '36 machine – computers were used to approximate the equations in the '52 paper, but we don't remember Turing confusing the world with a computer.) Furthermore, in the case of Nash equilibria, there is an empirical question of whether the particular games agents are supposed to play are necessarily tough to crack. Perhaps they are quite easy? Perhaps we play the same games over and over again, and the solutions are intuitive to us? I wonder if economists have any idea?

A final interesting aspect of this paper was the discussion of the MWU (multiplicative weight update) algorithm. This very simple algorithm – change the weight you give each strategy in proportion to its past performance and a learning parameter – is apparently able to find solutions to some fairly challenging problems. Moreover, given the standard assumptions of population genetics (weak selection, infinite populations, no linkage, etc.), a computational analysis of the Wright-Fisher equations seems to suggest that idealised populations play a coordination game whose dynamics are described by the MWU strategy. This maximizes the entropy of the allele distributions at a particular locus. This leads to the idea that maintaining diversity is a large part of what evolution is about. The analysis of diploid populations is used as a potential explanation for a discrepancy between theoretical and observed heterozygosity seen in populations.
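As a minimal sketch of the update rule itself (our own Python illustration, not the paper's population-genetics derivation; the payoff function below is made up), MWU keeps one weight per strategy, multiplies each weight by (1 + eps * payoff) every round, and plays strategies with probability proportional to the weights:

    import random

    def mwu(payoffs_at, n_strategies, rounds, eps=0.1):
        # Multiplicative weight update: each round, every strategy's weight is
        # scaled by (1 + eps * its payoff); the mixed strategy played is the
        # normalised weight vector.
        weights = [1.0] * n_strategies
        for t in range(rounds):
            payoffs = payoffs_at(t)       # payoff of each strategy this round
            weights = [w * (1.0 + eps * p) for w, p in zip(weights, payoffs)]
        total = sum(weights)
        return [w / total for w in weights]

    # Hypothetical example: the second strategy pays slightly more on average,
    # so its weight (and hence the probability of playing it) grows over time.
    print(mwu(lambda t: [random.random(), random.random() + 0.05], 2, 2000))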

Again, this application of algorithms does appear to raise some interesting questions. But we are not sure there really is an excess of heterozygosity – based on our experience of biology, we think it might not be so easy to find the appropriate sites to support such conclusions. We also wondered how, when starting from the same assumptions, the population geneticists and the computational people come to different conclusions about heterozygosity. Either one group is wrong or there are some matters of taste involved. To be really happy we want to see the classical approximations and reasoning side by side with the new computational perspective. For this reason we'll probably read Chastain E, et al. (2014) Algorithms, games, and evolution. Proc Natl Acad Sci USA 111(29):10620–10623 next. Papadimitriou is a co-author, so his photo can stay for some weeks.

The next meeting will be at 9 AM on March 9th.