
Author Archives: thewildwilli

Blog archives

Here we collect our writing on various topics from our day-to-day work and our reading clubs.


We (not the royal we here) intend to read this over the next 2 years in installments of 20ish pages every second week. I think it will do wonders for our understanding of physics, and if not, we will write an angry letter to Sir Roger Penrose and The Times.

Since one of us has standard working hours, we will do it before work (probably around 6.30 AM to 8.00 AM) over Skype. It is a good start to the day and your breakfast will taste better. If anybody wants to join, please contact me. We will start in January.


Summary of “Mathematical Chemistry and Chemoinformatics” page 399-417

Last time we made the following comment: “An interesting point is that if it was possible to exclude the possibility that the molecule had a ring, then the correct result would jump from position 16 to position 2.”
This time we are doing exactly that sort of thing, based on the theory in chapter 8.

Structure criteria (that some substructure is either present or absent) that hold with probability over 95% are used to restrict the set of possibilities. In the example, this reduces the set of candidates from around 30 to around 3.

MOLGEN-MS is compared with two competitors, ACD MS Fragmenter and MetFrag. They seem to be about the same in quality: some achieve a better percentage-wise ranking (RRP) but also produce more candidates.

Mass spectrometry yields more than just a spiky diagram. It also produces the diagram in real time. To take the time element into account, we look at retention time.
Retention properties – retention is the time it takes from insertion to observation and is used in chromatography. (There are two retention indices, but they build on the same idea.)
There is some error when measuring the retention time, so one approach is to allow a molecule if it is within 2 standard deviations of a retention measurement.

Two other properties can be used, in the same way as retention time, to exclude unlikely candidates:
Partitioning properties – a measure of how the molecule partitions, which is related to retention time.
Steric energy – molecules whose steric energy is too high are so weird that they cannot exist.
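
To make the exclusion idea concrete, here is a minimal sketch of the "within 2 standard deviations" rule for retention. The candidate names, the predicted and measured retention values, and the standard deviation are all made up for illustration; they are not taken from the book.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    predicted_retention: float  # hypothetical predicted retention value


def within_tolerance(predicted, measured, std_dev, n_sigma=2.0):
    """True if the prediction lies within n_sigma standard deviations of the measurement."""
    return abs(predicted - measured) <= n_sigma * std_dev


def filter_by_retention(candidates, measured, std_dev, n_sigma=2.0):
    """Keep only the candidates whose predicted retention is compatible with the measurement."""
    return [
        c for c in candidates
        if within_tolerance(c.predicted_retention, measured, std_dev, n_sigma)
    ]


# Invented example data: three candidates and one measurement.
candidates = [
    Candidate("A", 1510.0),
    Candidate("B", 1650.0),
    Candidate("C", 1475.0),
]
kept = filter_by_retention(candidates, measured=1500.0, std_dev=20.0)
print([c.name for c in kept])
```

With these numbers, candidate B is excluded because its prediction is 150 units away from the measurement, well outside the allowed 40. The same kind of filter could presumably be applied to partitioning properties or steric energy with a different threshold.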

An alternative to the above approach is consensus scoring. The idea is to give each candidate a combined rank instead of eliminating unlikely candidates criterion by criterion. This requires a formula for combining the different criteria, which the chapter provides. The formula has some disadvantages, but overall this seems like a more promising approach.
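
As a rough illustration of consensus scoring, the sketch below ranks candidates by a weighted average of per-criterion scores. This is not the formula from the chapter (which we do not reproduce here); the weighted average, the criterion names, the weights, and the scores are all our own assumptions.

```python
def consensus_score(scores, weights):
    """Weighted average of per-criterion scores (an assumed, illustrative formula)."""
    total_weight = sum(weights.values())
    return sum(weights[k] * scores[k] for k in weights) / total_weight


def rank_candidates(candidates, weights):
    """Return candidate names ordered from best to worst consensus score."""
    return sorted(
        candidates,
        key=lambda name: consensus_score(candidates[name], weights),
        reverse=True,
    )


# Invented scores in [0, 1]; higher means more compatible with the measurement.
candidates = {
    "A": {"spectrum_match": 0.9, "retention": 0.2, "steric": 0.8},
    "B": {"spectrum_match": 0.7, "retention": 0.9, "steric": 0.9},
    "C": {"spectrum_match": 0.4, "retention": 0.5, "steric": 0.3},
}
# Hypothetical weights: the spectrum match counts twice as much as the others.
weights = {"spectrum_match": 2.0, "retention": 1.0, "steric": 1.0}

ranking = rank_candidates(candidates, weights)
print(ranking)
```

Note that candidate A, which a hard retention cut-off might have eliminated, stays in the list and is simply ranked lower. That is the main attraction of combining criteria rather than filtering on each one.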

The example used tries to find the molecules present in contaminated groundwater in Germany.

The chapter concludes by naming some ways to improve CASE studies in the future. It mentions that there is a long way to go before the process is automated.

The second conclusion is about CASE with high accuracy data. It is not efficient in practice, since the databases are in their infancy, but people are working on it.

General comments:
The workflow in figure 9.9 is not guilty of the same lack of detail as the previous ones.
Chapters 7 and 8 also seemed like case studies.

Next up is a talk on Monday 14-16 in the Department of Statistics. We will discuss the contents of the entire book. Interested people are welcome.

Molecular dynamics day

This event takes place in Oxford on Monday 13th, 1-6pm, and is free to attend: http://www.stats.ox.ac.uk/events/molecular_dynamics_day

Bill Gates can’t spell ”von Neumann” !!!

Bill Gates can’t spell ”von Neumann” !!! [p52 line 8]

Jimi Cullen and I read the Gates and Papadimitriou paper “Bounds for Sorting by Prefix Reversal” (1979), which investigates how many flips [reversals of positions 1,2,…,k] are needed to convert {1,2,3,…,n} into an arbitrary permutation. They found linear upper and lower bounds. This is sometimes called the pancake flipping problem. They also investigate the same problem when the pancakes must all end with the same side up, so that each one experiences an even number of flips.
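
To show what a flip is, here is a small sketch of the naive pancake sort: repeatedly bring the largest unsorted pancake to the top and then flip it down into place. This is not the Gates and Papadimitriou construction. It is the simple textbook method, which uses at most 2n-3 flips for n ≥ 2 distinct pancakes, so it also gives a linear (but weaker) upper bound. The function names are our own.

```python
def flip(stack, k):
    """Reverse the first k elements in place (a prefix reversal)."""
    stack[:k] = stack[:k][::-1]


def pancake_sort(stack):
    """Sort a copy of the stack using only prefix reversals; return it and the flips used."""
    stack = list(stack)
    flips = []
    for size in range(len(stack), 1, -1):
        # Position of the largest pancake among the first `size` positions.
        largest = max(range(size), key=stack.__getitem__)
        if largest == size - 1:
            continue  # already in its final position
        if largest != 0:
            flip(stack, largest + 1)  # bring it to the top
            flips.append(largest + 1)
        flip(stack, size)  # flip it down into place
        flips.append(size)
    return stack, flips


sorted_stack, flips = pancake_sort([3, 1, 4, 2])
print(sorted_stack, flips)
```

Sorting a permutation into the identity and turning the identity into that permutation are the same problem read in opposite directions: the same flips applied in reverse order undo the sort.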

This is a precursor to a paper by Hannenhalli and Pevzner (1995), which investigates the same problem when any interval in the permutation may be flipped (inverted). That problem also has a version with restrictions on the inversions on a given gene.

We were interested in this because we wondered if one could add swaps of neighbors to the TKF91 model, which would be relevant to its applications in linguistics. This led us to read Lowrance and Fischer (1974), which investigates the parsimony version of the problem. They only have a restricted solution, for the case where swaps can’t cross in the final alignment. We find this problem fun and will now try to look at a stochastic “pure swap” problem and then maybe add insertion-deletions.

I should run this posting through Microsoft’s spelling checker to make sure I don’t have too many spelling errors myself, but I don’t trust it anymore.
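
For readers who want to play with the parsimony version, here is a sketch of one well-known restricted variant, often called optimal string alignment: the usual edit-distance recursion plus one extra case for swapping two neighbors, where a swapped pair is never edited again. We are not claiming this is exactly the restriction or the algorithm in the paper, and the unit costs are an assumption on our part.

```python
def swap_edit_distance(a, b):
    """Edit distance with insertions, deletions, substitutions and swaps of neighbors.

    Restricted variant (optimal string alignment): a swapped pair is not edited further.
    All operations are assumed to cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete the first i symbols of a
    for j in range(cols):
        d[0][j] = j  # insert the first j symbols of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Swap of two neighbors: a ends in "xy" and b ends in "yx".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(swap_edit_distance("ab", "ba"))
print(swap_edit_distance("kitten", "sitting"))
```

The restriction shows up on inputs like "ca" and "abc": this variant reports 3, whereas a swap followed by an insertion between the swapped letters would only cost 2.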

Phylogeny: Discrete and random processes in evolution

We have read and reviewed “Phylogeny: Discrete and random processes in evolution” by Mike Steel.

The slides for a talk given by Jotun to sum up the book can be seen here: http://tinyurl.com/Steel-Phylogeny.

We have written up a full review of the book which can be found here.

An abbreviated version of the above review has been published on the SIAM News blog, in three parts: (part I, part II, part III)

Comments on the process:

The book has actually done us a lot of good and I hope to return to some of its topics at a more leisurely pace. It will be at 2pm in the new Statistics Building. We will write an extensive review of the book that we hope to publish at the same time as the book itself. This way of reading a book is extremely good for everybody: the author gets feedback in the final stages of writing, we get discussions with the author, and the book gets reviewed when it is published and not 2 years after. I tried to get the same arrangement with the forthcoming multivolume history of the Danish language, but the publisher and editors weren’t too eager.

Physical copy of the book: http://bookstore.siam.org/cb89/