
Author Archives: Michael

Blog archives

Here we collect our writing on various topics from our day-to-day work and our reading clubs.

Talk by Julián Echave Tues 11am

Professor Julián Echave will be giving an informal talk on Tuesday 10 April at 11am in the Small Lecture Theatre at the Department of Statistics.

Title: Protein evolutionary divergence is not random


A simple comparison of homologous proteins shows clear patterns of differential conservation/variation at the levels of amino-acid sequence, 3D structure, and protein motions. For instance, the rate of sequence evolution varies among sites, protein structures diverge more at some sites than others, and some protein vibrations are more variable than others. The default explanation of evolutionary patterns is the rather fuzzy concept of “functional importance”: the underlying assumption is that any extra conservation/variability is due to natural selection. However, while selection does indeed shape sequence divergence, the patterns of divergence of structure and motion are mostly shaped by the physics of the response of proteins to random mutations.


CSML talk: Probabilistic Inference of Nucleotide Coevolution

Michael Golden will be presenting “Probabilistic Inference of Nucleotide Coevolution” at the Computational Statistics and Machine Learning seminar today at 15:30 in the Department of Statistics. His slides are available here.


Pairs of nucleotide positions within biologically functional nucleic acid secondary structures often exhibit evidence of coevolution that is consistent with base-pairing. PICNIC is a probabilistic sequence evolution model that assesses rates of mutation at base-paired sites in alignments of DNA or RNA sequences. PICNIC is able to fully account for an unknown secondary structure, and in doing so can be used to predict a secondary structure shared amongst an alignment of sequences. PICNIC was used to infer rates of coevolution associated with GC, AU (AT in DNA), and GU (GT in DNA) dinucleotides in non-coding RNA alignments, and single-stranded RNA and DNA virus alignments. Strong evidence was found for GU dinucleotides being selectively favoured at base-paired sites in non-coding RNA and RNA virus alignments, with marginal evidence for GT dinucleotides being selectively favoured at base-paired sites in DNA virus alignments. The strength of coevolution was also measured at base-paired sites in a SHAPE-MaP-determined HIV-1 NL4-3 RNA secondary structure, using a corresponding alignment containing large numbers of HIV-1 group M sequences. The PICNIC-inferred degrees of coevolution were more strongly correlated with experimentally determined SHAPE-MaP pairing scores than degrees of coevolution measured using three mutual information methods that do not take into account phylogenetic dependencies.
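
For readers unfamiliar with the mutual information baselines mentioned above, here is a minimal sketch of plain mutual information between two alignment columns. It is not PICNIC, and it is not necessarily any of the three methods used in the comparison, which the abstract does not name. The function name `column_mutual_information` and the toy alignment are made up for illustration. Note that the calculation treats every sequence as an independent observation, which is exactly the phylogenetic blind spot the abstract refers to.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j, gap_chars="-."):
    """Mutual information (in bits) between alignment columns i and j.

    Hypothetical helper for illustration only. Every sequence counts as an
    independent sample, so shared ancestry (phylogeny) is ignored.
    """
    # Keep only sequences with a nucleotide (not a gap) at both positions.
    pairs = [
        (seq[i].upper(), seq[j].upper())
        for seq in alignment
        if seq[i] not in gap_chars and seq[j] not in gap_chars
    ]
    n = len(pairs)
    if n == 0:
        return 0.0

    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with counts to limit rounding error.
        ratio = p_ab * n * n / (left_counts[a] * right_counts[b])
        mi += p_ab * log2(ratio)
    return mi


# Toy alignment (made up): columns 0 and 4 always form a complementary pair,
# while column 1 is fully conserved.
toy_alignment = [
    "GAAAC",
    "CAAAG",
    "AAAAU",
    "UAAAA",
]

print(column_mutual_information(toy_alignment, 0, 4))  # covarying columns
print(column_mutual_information(toy_alignment, 0, 1))  # conserved column
```

On this toy input the covarying pair of columns scores 2 bits and the pair involving the conserved column scores 0. In a real alignment, closely related sequences inflate such scores because a single ancestral substitution is counted many times over.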