Category Archives: Reading clubs

Skype Book Discussion Group in “Computational Complexity of Sampling”

The present version can be found here:

http://renyi.hu/~miklosi/CCC/ComputationalComplexityOfCountingAndSampling.pdf

The author, Istvan Miklos, believes he will always be ahead of the readers in writing. We would then write a review to be published at about the same time as the book, and we will put an extended report on this page:

https://heingroupoxford.com/learning-resources/lectures/

We also give a summarizing lecture when we have finished the book. When we did this earlier, we met every second day and covered about 20 pages each time, but the pace can depend on the individual book. We did a similar thing with Mike Steel’s 2016 book, which I believe was beneficial to both author and readers.

The ideal number of participants in such a group is 3-5. It would have to be online since I will be in Israel. I like to choose a time at either the start or the end of the working day so that it interferes minimally with work. If you know somebody interested in participating, please tell me. If it proves a crappy book, we will stop reading, but that is not what I expect.

Extreme Reading – status report

There is a famous Danish sketch called “Jarl Kakadue” from the show “Casper og Mandrilaftalen”. In the sketch, Jarl explains how he completed an iron man, but instead of running a marathon, he got a good night’s sleep.
“But isn’t that cheating?” the host asks, to which Jarl replies “No, because such a run takes a couple of hours, but a proper night’s sleep is at least 8 hours.”

As the sketch goes on, more and more of the exercise gets replaced. The full thing can be seen here: (in Danish)

The concept of Extreme Reading is also a modified iron man in the following sense:
instead of swimming, we read a book.
instead of cycling, we summarise the book
and
instead of running a marathon, we run half a marathon (over 3 days)

So each day, we read for a couple of hours, ran 7 kilometers, read some more and then we summarized the book for each other and discussed it.

The book in question was “The Origin and Nature of Life on Earth: The Emergence of the Fourth Geosphere” by Eric Smith and Harold J. Morowitz.

Unfortunately, the book is rather wordy and not very mathematical. The individual sections are nicely structured, but the book lacks a main message and a sense of direction.

This is puzzling, since Morowitz’s other books are usually shorter and more precise. However, Morowitz died before the book was published, was very weak in his last decade, published little in that period and was in general very concise in his formulations, while this book is very long (at times lengthy). It is unclear how much Morowitz contributed to the present book.

This book is 600 pages long and consists of 8 chapters. This is a very hard topic to write a coherent book about and the chapters are quite free-standing contributions to describing or explaining the theory of life.

Eric Smith gave a talk somewhat based on the book, which can be found here: https://www.youtube.com/watch?v=0cwvj0XBKlE

The 4 geospheres are:
Atmosphere (air)
Hydrosphere (water)
Lithosphere (earth)
Biosphere (life)

The point of the title is that life should be thought of as a planetary property. However, the point seems more philosophical than scientific, which is the case with many of the subtle points in the book.

A longer summary will be added later.

Overall, the project was a success. We managed to run and read a lot. It is a very satisfying feeling to be both mentally and physically exhausted and we can definitely recommend similar undertakings.

EXTREME READING!! I: Origins of Life

I am happy with our little book clubs, but they induce the wish to read more books than we actually read. Especially I have found it frustrating that we had to cut our economics studies short. I know some people have tried to read very large amounts in a very short time span like 12-24-36 hours. It is really demanding but most likely very rewarding.

Now I should like to try this on:

Carlin and Soskice (2014) Macroeconomics: Institutions, Instability, and the Financial System – about 600 pages

Eric Smith and Harold Morowitz (2016): The Origin and Nature of Life on Earth.

I should like to start Friday morning 9AM and be done by Sunday 6PM.

We will start with the Origin of Life book and do it April 21st to 23rd.
Does somebody want to participate?? It is possible to do via Skype.
I suggest each day:

Read 100 pages

Write 1 page summary

Run 7 km

Lunch 2pm

Read 100 pages

Write 1 page summary

Dinner – Sleep

 

Take Monday off. Maybe all week. Maybe quit academia.
We will also make a PowerPoint presentation of the book, but maybe after the 3 days.
I originally wanted to suggest running a marathon, but realism made me suggest a ½ marathon in installments instead.

Chomsky, Chapter 1

The humanities book club is currently reading Noam Chomsky’s Aspects of the Theory of Syntax, first published in 1965. The following is a summary of the first chapter, written by Mathias Cronjäger:

The first chapter of Chomsky’s book sets the stage for the subsequent discussion in the three later chapters of the book. Whereas he will later go into more technical detail, Chomsky here paints with a rather broad brush. This frustrated some members of the group, who expressed a desire to see concrete examples and exact statements. In the chapter, Chomsky outlines what he means by grammar (a model of how an idealised speaker-listener processes language), and introduces a range of critical distinctions (such as linguistic performance versus linguistic competence). In terms of grammatical structures, he is keen to distinguish the surface structure of sentences (structural rules about how they are pronounced and expressed) from their deep structure (structural rules for how their semantic content is organized and how to interpret them). He gives a simple example of two English sentences (“I persuaded John to leave” and “I expected John to leave”) and proceeds to demonstrate that they have very different deep structures in spite of their similar surface structure.

A further distinction made is between descriptive and explanatory theories of grammar. A descriptive account of language is just a set of rules for producing valid sentences in a language (a grammar), or a set of such, which reproduces the structure of a language in a manner that conforms to the linguistic intuitions of native speakers. An explanatory theory of language goes further by also assigning each grammar of a language a notion of “simplicity”, which determines which grammar gives the simplest account of a corpus of linguistic data. Such an explanatory theory is not just a theory of language structure, but also one of language acquisition. This is because such a theory can then explain why someone learning a language internalises one set of rules (the simpler ones) over another potential set of rules. The correct notion of “simplicity” in this context is therefore one that corresponds to how humans internally process language. After having introduced this notion of simplicity, Chomsky proceeds to spend a great deal of effort outlining why there is nothing simple about determining which measure of simplicity actually accounts for how language acquisition works in humans.
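To make “a set of rules for producing valid sentences” concrete, here is a minimal sketch of a toy rewrite-rule grammar. The rules and vocabulary are hypothetical and chosen only for illustration; they are not taken from Chomsky’s book and do not attempt to model deep versus surface structure.

```python
from itertools import product

# Hypothetical toy rewrite rules (not from Chomsky's book).
# Keys are non-terminal symbols; anything that is not a key is a word.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"], ["sleeps"]],
    "N": [["dog"], ["cat"]],
    "V": [["sees"]],
}


def expand(symbol):
    """Return every word sequence that can be derived from `symbol`."""
    if symbol not in RULES:  # terminal word: nothing left to rewrite
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Expand each symbol on the right-hand side, then combine
        # every choice for the first symbol with every choice for the next.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
for sentence in sentences:
    print(sentence)
```

A grammar of this kind is purely descriptive in the sense above: it says which strings are sentences, but nothing about why a learner would settle on these rules rather than another set generating the same strings.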

This chapter also includes discussion of linguistic universals, a topic where Chomsky has strong opinions and with which he is often associated. He contrasts the empiricist position (that the only mental procedure universal to all language acquisition is our general capacity for inductive reasoning) with the rationalist position (that we are all born with some basic mental procedures specifically for acquiring and processing language). To put it mildly, Chomsky is not convinced by the arguments of the empiricists. The rationalist position holding true would imply the existence of universal properties that all languages possess: discovering and formalising these into a universal grammar is a project that motivates much of Chomsky’s theory-building. In particular, it is the reason why he is not content to just give an account of English or German language use: he wants to find structures that all languages share (being able to do so would also lend empirical support to the rationalist position).

Limits of Complexity

[written by William K. Larsen]

We read two papers. Both were review articles from the journal Nature.

Summary of: Ultimate physical limits to computation (2000), by Seth Lloyd

The article is written for a general scientifically literate audience and is mostly easily digestible. It describes, for instance, the rather elementary concept of an AND-gate, but it covers advanced topics as well.

Lloyd defines the Ultimate physical laptop as a “computer” with mass of 1 kg and volume of 1 liter.

2 main arguments are made concerning the ultimate physical laptop:

  1. Energy limits the speed of computation, which leads to a limit of approximately 10^50 computations per second (10^41 GHz).
  2. Entropy limits memory space, which leads to a limit of approximately 10^31 bits (10^21 GB).
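The orders of magnitude above can be checked with a few lines of arithmetic. The sketch below assumes the energy bound takes the Margolus-Levitin form 2E/(πħ) with E = mc², which is the form used in Lloyd's paper but is not spelled out in this summary. The memory figure is simply taken as 10^31 bits and converted to gigabytes; the function name is made up for illustration.

```python
import math

# Physical constants in SI units.
C = 299_792_458.0        # speed of light, m/s
HBAR = 1.054571817e-34   # reduced Planck constant, J*s


def max_ops_per_second(mass_kg):
    """Bound 2E/(pi*hbar) on operations per second, with E = m*c^2."""
    energy = mass_kg * C ** 2  # rest energy in joules
    return 2.0 * energy / (math.pi * HBAR)


ops = max_ops_per_second(1.0)       # the 1 kg "ultimate laptop"
ghz = ops / 1e9                     # 1 GHz = 1e9 operations per second

memory_bits = 1e31                  # order of magnitude quoted above
memory_gb = memory_bits / 8 / 1e9   # 8 bits per byte, 1e9 bytes per GB

print(f"{ops:.2e} ops/s = {ghz:.2e} GHz; {memory_gb:.2e} GB")
```

Both conversions are just powers of ten: dividing 10^50 by 10^9 gives the 10^41 GHz figure, and 10^31 bits is on the order of 10^21 GB.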

Overall, the paper makes you think of a computer as a physical object with energy, mass and volume, rather than the idealized model often used in computer science. It gives a taste of a number of physical concepts and ideas along the way.

Our discussion:

The limits presented in the paper are very large upper bounds. One could theoretically fill an empty jar with colored sand and consider the very large number of possible configurations as being representative of the information stored. The problem is that the jar of sand cannot be utilized for work in the same way that a computer can.
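As a rough sketch of the jar-of-sand argument: if each grain sits in a fixed, distinguishable position and can take one of k colors, then n grains have k^n configurations, which corresponds to n·log2(k) bits. The grain count and number of colors below are hypothetical numbers picked for illustration, and the fixed-position assumption is ours, not something from the paper.

```python
import math


def jar_capacity_bits(num_grains, num_colors):
    """Bits needed to index one arrangement of colored grains.

    Assumes each grain occupies a fixed, distinguishable position, so
    there are num_colors ** num_grains arrangements in total.
    """
    return num_grains * math.log2(num_colors)


# Hypothetical jar: one million grains, four colors.
bits = jar_capacity_bits(num_grains=10**6, num_colors=4)
print(f"{bits:.0f} bits")
```

The count grows quickly with the number of grains, which is the sense in which such bounds are generous: they measure distinguishable states, not states that can be read, written and processed.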

This paper was written before the death of Moore’s law and thus the limits may have seemed more like an important perspective than a fun thought experiment.

 

Brief summary of: Limits on fundamental limits to computation (2014), by Igor L. Markov

The aim of this paper is to get a perspective on the different possible barriers for the development of computers.

Lloyd’s paper considered the problem of fundamental limits to computation by looking at energy, space and information.

This corresponds to only 3 of 25 cells in the following table from the paper:

[Table from Markov (2014): limits of computation]

Markov considers different cells in this table and makes some concluding remarks.

Our discussion of the paper:

The paper is really hard on the quantum computer – claiming that quantum computers can only find use when simulating quantum-chemical phenomena and that this is also uncertain.

Markov generally does not go into too much depth with anything, but the references seem very useful. In particular, 8 of the 99 references have been highlighted and given a short description.

Limits of computation

Thursday February 16th 10AM we will discuss:

http://www.nature.com/nature/journal/v512/n7513/full/nature13570.html

and

http://www.nature.com/nature/journal/v406/n6799/full/4061047a0.html

Jotun will join via Skype; at least Soren and William will be in the Department of Statistics, Oxford, room LG.05.

Summary: Ioannidis (2005) Why Most Published Research Findings Are False

[written by Mathias C. Cronjäger]

Summary of our discussion

Compared to most other papers we have read over the course of running this reading group, Ioannidis (2005) is rather recent. In light of its large impact (it has been the most downloaded paper from PLOS Medicine) we are however comfortable referring to it as a modern classic.

It is a short and very well written paper, which does not presume technical expertise on the part of the reader: anyone familiar with basic statistics and manipulation of fractions will be able to follow the technical arguments made. It contains important insights regarding how sceptical one should be that a result reported as statistically significant indeed represents a “true” effect.

Since the basic arguments made by Ioannidis are only a slight variation on basic statistical arguments, the fact that this paper made such a large impact in other fields (such as medicine and psychology) reflects rather poorly on how well the statistical community has managed to communicate with people outside our own field.

Summary of the paper itself

The arguments in the paper revolve around computing the positive predictive value/PPV (the rate of “true” effects being detected by studies reporting positive results relative to the total number of positive results reported), given different values of the following parameters:

  • R – the rate of “true” effects being tested relative to non-effects being tested. From a Bayesian perspective, this corresponds to the prior odds of an effect being present, i.e. a prior probability of R/(R+1).
  • α – the rate of type I error. This corresponds to the probability that an individual experiment will have a statistically significant outcome in spite of no true effect being present.
  • β – the rate of type II error. This corresponds to the probability that an experiment will fail to detect a true effect which is present and instead yield a statistically insignificant outcome.

These three parameters are standard in the theory of the PPV. Ioannidis introduces a fourth parameter to account for bias not accounted for in the above:

  • u – the probability that a false effect tested gets reported as a positive result, even though it would under ideal conditions have been statistically insignificant.

This fudge factor can incorporate anything from badly designed experiments or researchers being less sceptical of positive results to post-hoc change of study design, p-hacking or even outright fraud. Ioannidis does not go into addressing how likely any of these factors are to contribute to u, but contents himself with re-deriving an expression for the PPV if some amount of bias is taken into account.
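The PPV expression can be sketched directly from the definitions of R, α, β and u. The sketch below assumes, as in Ioannidis's own derivation, that bias turns a fraction u of otherwise negative results into reported positives regardless of whether the underlying effect is true. The function name and the example values are ours, chosen for illustration.

```python
def ppv(R, alpha, beta, u=0.0):
    """Positive predictive value for pre-study odds R, with optional bias u.

    Counts are per tested non-effect, so R true effects are tested
    for every one non-effect.
    """
    # True effects reported as positive: detected outright (1 - beta),
    # or missed (beta) but reported as positive anyway because of bias (u).
    true_positives = R * ((1 - beta) + u * beta)
    # Non-effects reported as positive: type I errors (alpha), or
    # correctly negative (1 - alpha) but reported as positive due to bias.
    false_positives = alpha + u * (1 - alpha)
    return true_positives / (true_positives + false_positives)


# Illustrative values: even odds, 5% type I error, 80% power.
print(ppv(R=1.0, alpha=0.05, beta=0.2))          # no bias
print(ppv(R=1.0, alpha=0.05, beta=0.2, u=0.1))   # modest bias
# A long-shot hypothesis tested with low power and substantial bias.
print(ppv(R=0.1, alpha=0.05, beta=0.8, u=0.3))
```

With u = 0 this reduces to the standard expression (1 − β)R / (R − βR + α), so a positive result is more likely true than false only when (1 − β)R > α.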

The author then considers the effect that multiple groups investigating the same effect independently of one another will have: if just one group has a statistically significant result, this is likely to get published even if the negative results of other groups are not. This means that for “hot” topics (which are subject to a great number of parallel experiments) we should be even more wary of single studies reporting statistically significant effects.
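The multiple-groups effect can be sketched in the same way. The version below assumes n independent groups with identical α and β, no bias, and that an effect counts as a reported positive as soon as at least one group obtains a significant result; this matches the expression in Ioannidis's paper as we understand it. The function name and values are illustrative.

```python
def ppv_multiple_teams(R, alpha, beta, n):
    """PPV when n independent groups test the same effect (no bias).

    A positive is reported if at least one group finds significance.
    """
    # A true effect is missed only if all n groups miss it.
    true_positives = R * (1 - beta ** n)
    # A non-effect yields a positive unless all n groups correctly
    # find nothing.
    false_positives = 1 - (1 - alpha) ** n
    return true_positives / (true_positives + false_positives)


# Illustrative values: even odds, 5% type I error, 80% power.
for n in (1, 2, 5, 10):
    print(n, round(ppv_multiple_teams(1.0, 0.05, 0.2, n), 3))
```

For n = 1 this is the ordinary PPV; with these values, adding groups raises the chance of a false positive faster than it raises the chance of a true one, so the PPV of a lone significant result falls.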

Based on his mathematical arguments, Ioannidis then proceeds to give a list of six corollaries, all of which are again reasonably well known to statisticians and most practising scientists (such as “smaller studies tend to have a lower PPV all other factors being equal” or “Greater flexibility in study design and how to measure outcomes leads to a lower PPV”).

In his discussion Ioannidis supports the polemical title of the paper by arguing that even for conservatively chosen values of R, α, β, and u, we would expect a PPV below 50%. Finally he gives an overview of how the state of affairs might be improved. Here his prescriptions are similar to what other statisticians and researchers have argued for, such as increasing transparency (pre-registration of trials; making raw data and code used in analysis available) and encouraging the publication of negative results.

The author concludes by suggesting that it is his personal belief that a number of “classics” in various fields would not hold up if replications thereof were attempted. Given the results of later replication efforts (such as the Open Science Collaboration 2015 replication attempts of 100 results in psychology, listed in the references), this seems prescient.

References

The paper itself:

Ioannidis, J.P.A., 2005. Why Most Published Research Findings Are False. PLoS Medicine, 2(8), p.e124. Available at: http://dx.plos.org/10.1371/journal.pmed.0020124.

Later papers expanding on the topic

Colquhoun, D., 2014. An investigation of the false discovery rate and the misinterpretation of P values. Royal Society Open Science, pp.1–15. Available at: http://rsos.royalsocietypublishing.org/content/1/3/140216.

Jager, L.R. & Leek, J.T., 2014. An estimate of the science-wise false discovery rate and application to the top medical literature. Biostatistics (Oxford, England), 15(1), pp.1–12. Available at: http://biostatistics.oxfordjournals.org/content/15/1/1.

Leek, J.T. & Jager, L.R., 2016. Is most published research really false? Available at: http://biorxiv.org/lookup/doi/10.1101/050575.

A famous replication study in Psychology

Open Science Collaboration, 2015. Estimating the reproducibility of psychological science. Science, 349(6251), p.aac4716. Available at: http://www.ncbi.nlm.nih.gov/pubmed/26315443.