Category Archives: Reading clubs

Blog archives

Here we collect our writing on various topics from our day-to-day work and our reading clubs.

Limits of computation

Thursday February 16th 10AM we will discuss:




Jotun will join via Skype; at least Soren and William will be in the Department of Statistics, Oxford, room LG.05.

Summary: Ioannidis (2005) Why Most Published Research Findings Are False

[written by Mathias C. Cronjäger]

Summary of our discussion

Compared to most other papers we have read over the course of running this reading group, Ioannidis (2005) is rather recent. In light of its large impact (it has been the most downloaded paper from PLOS Medicine) we are however comfortable referring to it as a modern classic.

It is a short and very well written paper, which does not presume technical expertise on the part of the reader: anyone familiar with basic statistics and manipulation of fractions will be able to follow the technical arguments made. It contains important insights regarding how sceptical one should be that a result reported as statistically significant indeed represents a “true” effect.

Since the basic arguments made by Ioannidis are only a slight variation on basic statistical arguments, the fact that this paper made such a large impact in other fields (such as medicine and psychology) reflects rather poorly on how well the statistical community has managed to communicate with people outside our own field.

Summary of the paper itself

The arguments in the paper revolve around computing the positive predictive value (PPV): the number of “true” effects among studies reporting positive results, relative to the total number of positive results reported. The PPV is computed for different values of the following parameters:

  • R – the ratio of “true” effects being tested to non-effects being tested. From a Bayesian perspective, this corresponds to the prior odds of an effect being present; the prior probability is R/(R+1).
  • α – the rate of type I error. This corresponds to the probability that an individual experiment will have a statistically significant outcome in spite of no true effect being present.
  • β – the rate of type II error. This corresponds to the probability that an experiment will fail to detect a true effect which is present and instead yield a statistically insignificant outcome.
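As a minimal sketch of how these three parameters combine: for every non-effect tested there are R true effects, of which a fraction 1 − β are detected, while a fraction α of the non-effects produce false positives. This gives PPV = (1 − β)R / (R − βR + α), which is the expression in the paper. The function name and the example values below are our own choices for illustration.

```python
def ppv(R, alpha, beta):
    """Positive predictive value without bias.

    R     : ratio of true effects to non-effects among those tested
    alpha : type I error rate
    beta  : type II error rate (power is 1 - beta)
    """
    true_positives = (1 - beta) * R   # true effects that are detected
    false_positives = alpha           # non-effects that come out significant
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Illustrative values only: conventional alpha = 0.05 and power = 0.8.
    for R in (1.0, 0.1, 0.01):
        print(f"R = {R}: PPV = {ppv(R, alpha=0.05, beta=0.2):.3f}")
```

With α and β held fixed, the PPV drops quickly as R gets smaller, which is the core of the argument: the fewer true effects there are among the hypotheses being tested, the less a significant result tells us.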

These three parameters are standard in the theory of the PPV. Ioannidis introduces a fourth parameter to account for bias not covered by the above:

  • u – the proportion of tested effects that get reported as positive results because of bias, even though they would under ideal conditions have been statistically insignificant.

This fudge factor can incorporate anything from badly designed experiments or researchers being less sceptical of positive results to post-hoc changes of study design, p-hacking or even outright fraud. Ioannidis does not address how likely any of these factors are to contribute to u, but contents himself with re-deriving an expression for the PPV when some amount of bias is taken into account.
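A rough sketch of the bias-adjusted expression, following the formula in the paper: PPV = ((1 − β)R + uβR) / (R + α − βR + u − uα + uβR). Note that in this formula the bias u turns would-be negatives into positives both for non-effects and for true effects that were missed. The function name and example values are ours, chosen for illustration.

```python
def ppv_with_bias(R, alpha, beta, u):
    """Positive predictive value when a proportion u of would-be
    negative results is reported as positive because of bias."""
    # True effects: detected directly, or missed but reported anyway.
    true_positives = (1 - beta) * R + u * beta * R
    # Non-effects: ordinary type I errors, or negatives reported anyway.
    false_positives = alpha + u * (1 - alpha)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Illustrative values only.
    for u in (0.0, 0.1, 0.3):
        value = ppv_with_bias(R=1.0, alpha=0.05, beta=0.2, u=u)
        print(f"u = {u}: PPV = {value:.3f}")
```

Setting u = 0 recovers the bias-free PPV, and increasing u lowers the PPV whenever the bias-free PPV is above the base rate R/(R+1).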

The author then considers the effect of multiple groups investigating the same effect independently of one another: if just one group has a statistically significant result, this is likely to get published even if the negative results of the other groups are not. This means that for “hot” topics (which are subject to a great number of parallel experiments) we should be even more wary of single studies reporting statistically significant effects.
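The multiple-groups argument can be sketched in the same way. Assuming n independent studies of equal power and no bias, the paper gives the PPV of an isolated positive finding as PPV = R(1 − β^n) / (R + 1 − (1 − α)^n − Rβ^n): a positive result now means that at least one of the n studies was significant. The function name and example values are again our own.

```python
def ppv_multiple_teams(R, alpha, beta, n):
    """PPV when n independent, equally powered studies test the same
    effect and a positive is reported if at least one is significant.
    Assumes no bias (u = 0)."""
    # A true effect is found unless all n studies miss it.
    true_positives = R * (1 - beta ** n)
    # A non-effect is "found" unless all n studies correctly come out negative.
    false_positives = 1 - (1 - alpha) ** n
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Illustrative values only.
    for n in (1, 5, 10, 50):
        value = ppv_multiple_teams(R=1.0, alpha=0.05, beta=0.2, n=n)
        print(f"n = {n}: PPV = {value:.3f}")
```

For n = 1 this reduces to the single-study PPV. With these example values the PPV falls steadily as more groups chase the same effect.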

Based on his mathematical arguments, Ioannidis then proceeds to give a list of six corollaries, all of which are again reasonably well known to statisticians and most practising scientists (such as “smaller studies tend to have a lower PPV all other factors being equal” or “Greater flexibility in study design and how to measure outcomes leads to a lower PPV”).

In his discussion Ioannidis supports the polemical title of the paper by arguing that even for conservatively chosen values of R, α, β, and u, we would expect a PPV below 50%. Finally he gives an overview of how the state of affairs might be improved. Here his prescriptions are similar to what other statisticians and researchers have argued for, such as increasing transparency (pre-registration of trials; making raw data and code used in analysis available) and encouraging the publication of negative results.

The author concludes by stating his personal belief that a number of “classics” in various fields would not hold up if replications were attempted. Given the results of later replication efforts (such as the Open Science Collaboration's 2015 attempts to replicate 100 famous results in psychology, listed in the references below), this seems prescient.


The paper itself:

Ioannidis, J.P.A., 2005. Why Most Published Research Findings Are False. PLoS Medicine, 2(8), p.e124. Available at: http://dx.plos.org/10.1371/journal.pmed.0020124.

Later papers expanding on the topic

Colquhoun, D., 2014. An investigation of the false discovery rate and the misinterpretation of P values. Royal Society Open Science, pp.1–15. Available at: http://rsos.royalsocietypublishing.org/content/1/3/140216.

Jager, L.R. & Leek, J.T., 2014. An estimate of the science-wise false discovery rate and application to the top medical literature. Biostatistics (Oxford, England), 15(1), pp.1–12. Available at: http://biostatistics.oxfordjournals.org/content/15/1/1.

Leek, J.T. & Jager, L.R., 2016. Is most published research really false? Available at: http://biorxiv.org/lookup/doi/10.1101/050575.

A famous replication study in Psychology

Open Science Collaboration, 2015. Estimating the reproducibility of psychological science. Science, 349(6251), p.aac4716. Available at: http://www.ncbi.nlm.nih.gov/pubmed/26315443.


Penrose is Great

We have started reading Roger Penrose: The Road to Reality, which so far is a superb book. Normally we meet on Skype on Tuesday mornings at 6.30 AM for 60-90 minutes, and we cover 20-40 pages per session. By the coming week we will have done 5 chapters:

1 The roots of science

2 An ancient theorem and a modern question

3 Kinds of number in the physical world

4 Magical complex numbers

5 Geometry of logarithms, powers, and roots

In Jotun’s view the book strikes a great balance between formulas, intuitive explanations and historical background.

Done with Keynes!!!!!

Today we turned the last page on John Maynard Keynes: General Theory. It has been an extremely impenetrable book: there is hardly any data, and the formulas that are there are very simple and not central to the main arguments. I hope to arrange that we can meet with an economist (Sophocles Mavroides) a couple of weeks from now. We will each suggest 2 books for what to go through next. Since reading Keynes has not increased my understanding of the overall economy much, I am keen to study a more modern work with both theory and data analysis. I myself went through 2 very large papers by Mavroides that he then explained to me, which was very rewarding. One was on identifiability of a set of dynamic models for macroeconomics and the second was on fitting the New Keynesian Phillips Curve [NKPC] to economic data for the last 60 years. The Phillips Curve was originally an empirical observation of an inverse relationship between the levels of inflation and unemployment. These papers contained much of what I missed in Keynes: data and theory.

However, my co-readers felt they had had enough economics for now. I intend to suggest these two for our next readings:

Jotun1: Noam Chomsky (1965) Aspects of the Theory of Syntax

Chomsky is a towering intellect. I read one longer 1955 grammar paper of his, and it could be the best paper I have ever read. A larger summary of this theory would be very rewarding to read, and the book is about 200 pages.

Jotun2: I would not suggest a book but rather that we go through 10 Nobel Lectures by key economists, doing 2 lectures at each meeting. I recently read the three 2013 Nobel lectures in Chemistry by Karplus, Levitt and Warshel, and they were an absolutely ideal introduction to the history and problems of Molecular Dynamics. So now I suggest the same for economics, and it could be these laureates: Friedrich Hayek, Leonid Kantorovich, Paul Samuelson, Milton Friedman, John Nash, Daniel Kahneman, Paul Krugman, Scholes, Stiglitz and Amartya Sen. This could be done in 5 meetings lasting 3+ months.

In general this book club should:

  1. Read things I would/could not have read otherwise.


and the actual books fall into 2 topics:

I. Current Affairs: Global Warming, Immigration Studies, What is Democracy, The Cause of Conflicts, Theories of Religion, Understanding the Economy, Conceptual Foundation of Political Ideologies, … I would rather read 300-500 pages of good overview than a lot of daily news.

II.  Real Classics: Spinoza, Kant, Hegel, Hume, Global Sacred texts,…

2. I think overly long books take too much time. Ideally 200-300 pages, max 500.

3.  I think it is a good idea to put possible books in Dropbox so we can pre-view them before buying them.


Next Wednesday: Why Most Published Research Findings Are False

The next meeting of the Classic papers discussion group will take place next Wednesday (1 Feb. 2017) at 9:30 in the Department of Statistics, room LG.05. We will be discussing a modern classic: Why Most Published Research Findings Are False by John P. A. Ioannidis. All participants are welcome (if you don’t have building access, just talk to the receptionist when you arrive, who should let you in).

Link to the paper: http://dx.doi.org/10.1371/journal.pmed.0020124

Since its publication, the paper has led to widespread debate within science about the problem of irreproducibility: how widespread it is, and what to do about it. Given the central importance of this paper within those debates, as well as the various reproducibility studies it has spawned, we deem this paper well worth reading, and hope that many people will show up to discuss it with us.