
Tuesday, 26 August 2025

sequestering carbon, several books at a time CLIV

The latest batch:


Some of these are due to recommendations at the Belfast Eastercon; some from the Seattle Worldcon (which we attended virtually).

Wednesday, 25 December 2024

sequestering carbon, one Christmas at a time XII

His'n'hers Christmas presents (oh, there was also a soldering iron and mat, but they're not books...)



Wednesday, 7 December 2022

sequestering carbon, several books at a time CXXVI

It's been a while since the last update, so there are rather more in the pile than usual:



Sunday, 9 October 2022

QBism: the future of quantum physics

Hans Christian von Baeyer.
QBism: the future of quantum physics.
Harvard University Press. 2016


Quantum mechanics is notoriously unintuitive, with its “collapsing wavefunctions” or “many worlds” or whatever is added as an interpretation to its weird predictions. Yet those predictions belong to the most accurate physical theory known. 

Quantum mechanics seems to say we have to give up some cherished notion of the world: if we want locality (no “spooky action at a distance”, only local interactions that propagate), we can’t have realism (the idea that there is some definite thing or process there), and vice versa. Quantum Bayesianism lets go of reality, but in an interesting way. 

This brilliant little book takes the reader through a description of the weirdnesses of quantum mechanics, and of the Bayesian interpretation of probability, before applying the latter to the former.

The key idea of Bayesian, as opposed to frequentist, statistics is that probabilities are about our knowledge of the world (we assign a prior probability of 50% to a fair coin toss landing heads, because we don’t know all the details of the initial spin, the air currents, etc). Bayesianism defines how to update this prior knowledge (the 50/50 chance) to a new probability once we have further data (anything from more knowledge about those initial conditions, to an observation of how the coin actually landed). So in QBism the probabilistic quantum wavefunction is interpreted as our prior knowledge of the system. This gets around “wavefunction collapse”: we just update our prior with the additional observational data. It gets around “Wigner’s friend”: Wigner has one prior wavefunction; his friend, who has observed more of the situation, has an updated, and therefore different, wavefunction. The wavefunction belongs to the observer, not to the system.
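
To make the updating rule concrete, here is a minimal Python sketch (my own toy numbers, nothing from the book): a prior belief over a coin’s unknown bias, with Bayes’ theorem applied once per observed flip.

```python
# Bayesian updating on a coin of unknown bias: a toy sketch.
import numpy as np

# Discretise the possible values of the coin's heads-probability.
bias = np.linspace(0.01, 0.99, 99)

# Prior: before any flips, every bias is equally plausible.
prior = np.ones_like(bias) / bias.size

def update(belief, heads):
    """One step of Bayes' rule: posterior is likelihood times prior."""
    likelihood = bias if heads else (1.0 - bias)
    posterior = likelihood * belief
    return posterior / posterior.sum()

# Observe three heads in a row; belief shifts towards a high bias.
belief = prior
for flip in [True, True, True]:
    belief = update(belief, flip)

print(f"expected P(heads) after 3 heads: {bias @ belief:.3f}")  # ~0.8
```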

Quantum mechanics allows us to calculate the best possible prior from our knowledge of the system. What makes quantum mechanics weird is that a calculated wavefunction that says a photon has a 50/50 chance of doing something, say, is the best possible knowledge we can have of the system: unlike in the classical case, it is not possible to improve the prediction by having more data about how it was set up. This is why we have to give up “reality”: the wavefunction isn’t “real”, because it is not a property of the system itself, it is just the best possible description of it. There is no underlying “real” system that “knows” what the actual answer will be (unlike the classical coin toss). Once the photon has interacted with something, the system has changed, and we can update our prior with the new observed data.

This loss of “realism” does not mean QBism denies the existence of a real world “out there”. It instead leads to a startlingly different view of that real world. The world is not a deterministic automaton, set going at the start, trundling along a pre-determined track. It is a world undergoing constant creation by quantum systems (which may include observers) interacting. 

[p208] Understood in this way, the QBist universe is not static but dynamic; less like an intricate clockwork and more like the interior of an evolving star that is not alive in the conventional sense but bubbling with creative energy and continual surprise. It is real but veiled, objective but unpredictable, and substantial but unfinished,

von Baeyer writes in a very accessible way. I admit, the sections on the delayed-choice and GHZ experiments could have gone more slowly. But overall, the exposition is brilliant: on quantum mechanics, on Bayesianism, and then on the combination. This book changed my world view.






For all my book reviews, see my main website.

Sunday, 16 January 2022

book review: The End of Average

Todd Rose.
The End of Average: how we succeed in a world that values sameness.
Penguin. 2015

Nobody is average height and average weight and average arm length and…. Choose enough parameters, and nobody is average in all of them. “One size fits all” actually fits no-one.
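
As a toy quantification of that point (my own made-up thresholds, not Rose’s data): call someone “average” on a trait if they sit in the middle 30% of the population, and count how many people are average on every one of k independent traits.

```python
# How fast "average in everything" vanishes as traits are added.
# Assumes independent traits; the review notes real deviations
# aren't well correlated, which is close to this assumption.
import random

def average_on_all(k):
    # "Average" on a trait = in the middle 30% of the population.
    return all(0.35 <= random.random() <= 0.65 for _ in range(k))

n = 100_000
for k in [1, 3, 10]:
    count = sum(average_on_all(k) for _ in range(n))
    print(f"{k:2d} traits: {count} of {n:,} people average on all")
# With 10 traits, only about 0.3**10, or 6 people in a million, qualify.
```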

Rose provides examples of where the assumption that there is an “average person” can go badly awry (for example, fitting varied pilots into standardised cockpits), and of where the idea came from historically. There were two competing historical camps: one took the average to be a measure of “perfection”, with outliers being misfits; the other held that deviations from average are correlated, so someone above average in height, say, will also be above average in everything else. Both camps are wrong: everyone is an outlier in multiple dimensions, and deviations aren’t well correlated.

Rose goes on to describe the perils of standardisation, and how variation needs to be accommodated, and how it can be an advantage: different people are good at different things.

A readable and informative little book, this should be read by everyone responsible for evaluating or designing things for others, from job interviews to education, via cockpits.




For all my book reviews, see my main website.

Saturday, 5 October 2019

book review: Risk Savvy

Gerd Gigerenzer.
Risk Savvy: how to make good decisions.
Penguin. 2014

There are two main points made in this book.

Firstly, you need to use a different reasoning process for situations where you know the risks and their odds than for uncertain situations where you don’t; and you need to be able to distinguish the two cases. Under uncertainty, rules of thumb are usually better than trying to calculate unknown odds. Gigerenzer gives some examples. I particularly liked his discussion of the real Monty Hall problem, rather than the “tidied up” version used for probability calculations. The real situation is much messier, and, as I have pointed out, you need to know the full rules beforehand: the stated solution works only if the host doesn’t cheat.
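
For contrast, here is a quick simulation of that tidied-up version, under the rules it silently assumes (and the real game doesn’t guarantee): the host knows where the car is, always opens a goat door, and always offers the switch.

```python
# Monty Hall, tidied-up rules: the host always opens a goat door
# and always offers the switch. If the host can cheat, the 2/3
# answer below no longer follows.
import random

def play(switch):
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # Host opens a door that hides a goat and isn't the pick.
    opened = random.choice([d for d in doors if d not in (pick, car)])
    if switch:
        pick = next(d for d in doors if d not in (pick, opened))
    return pick == car

trials = 100_000
print("stay:  ", sum(play(False) for _ in range(trials)) / trials)  # ~1/3
print("switch:", sum(play(True) for _ in range(trials)) / trials)   # ~2/3
```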

Secondly, even when the odds and risks are known, most statistics are so badly presented, possibly to make better headlines, that even the experts don’t understand what they say; you need to look at the real underlying rates. “Behaviour X doubles the chance of cancer Y” may not be a problem if the chance of cancer Y is extremely small in the first place. Gigerenzer gives examples of a way to present rates rather than conditional probabilities that makes it much easier to see and understand the true risks.
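
A back-of-the-envelope version of that point, with numbers invented for illustration rather than taken from the book:

```python
# "Behaviour X doubles the risk of cancer Y", as natural frequencies.
base_rate = 1 / 10_000          # chance of cancer Y without behaviour X
doubled = 2 * base_rate         # chance of cancer Y with behaviour X

per_10k_without = base_rate * 10_000
per_10k_with = doubled * 10_000
print(f"without X: {per_10k_without:.0f} case per 10,000 people")
print(f"with X:    {per_10k_with:.0f} cases per 10,000 people")
print(f"headline: 'risk doubled'; reality: "
      f"{per_10k_with - per_10k_without:.0f} extra case per 10,000")
```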

There are many good cases discussed in here, with a large chunk of the book given over to healthcare. For example, there is a lot about medical screening, false positives, and increased “survival” rates being due entirely to earlier diagnosis, and nothing to do with living longer in total if diagnosed earlier (“lead time bias”). Survival rates are different from mortality rates.
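
Lead time bias is easy to see with invented numbers (mine, not the book’s): screening moves the diagnosis earlier, the death date not at all, and yet the five-year “survival rate” jumps from 0% to 100%.

```python
# Lead time bias: earlier diagnosis inflates "survival" even when
# nobody lives a day longer. Ages are invented for illustration.
death_age = 70

diagnosis_unscreened = 67   # diagnosed when symptoms appear
diagnosis_screened = 60     # same tumour, found earlier by screening

def five_year_survival(diagnosis_age):
    return death_age - diagnosis_age >= 5

print("5-year survival, unscreened:", five_year_survival(diagnosis_unscreened))  # False
print("5-year survival, screened:  ", five_year_survival(diagnosis_screened))    # True
print("age at death, either way:   ", death_age)  # mortality unchanged
```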

Some of the discussions do feel a little disjointed. In particular, there is early emphasis on how most real world issues deal with uncertainty (rules of thumb) rather than risk (calculating odds), yet much of the book is on increasing statistical literacy. No matter; there is much good material in here.




For all my book reviews, see my main website.

Monday, 26 August 2019

book review: The Book of Why

Judea Pearl, Dana Mackenzie.
The Book of Why: the new science of cause and effect.
Penguin. 2018

We have all heard the old saying “correlation is not causation”. This is a problem for statistics, since all it can measure is correlation. Pearl here argues that this is because statisticians are restricting themselves too much, and that it is possible to do more. There is no magic; to get this more, you have to add something into the system, but that something is very reasonable: a causal model.

He organises his argument using the three-runged “ladder of causation”. On the bottom rung is pure statistics, reasoning about observations: what is the probability of recovery, found from observing these people who have taken a drug? The second rung allows reasoning about interventions: what is the probability of recovery if I were to give these other people the drug? And the top rung includes reasoning about counterfactuals: what would have happened if that person had not received the drug?
 
Intervention (rung 2) is different from observation alone (rung 1) because the observations may be (almost certainly are) of a biased group: observing only those who took the drug for whatever reason, maybe because they were already sick in a particular hospital, or because they were rich enough to afford it, or because of some other confounding variable. The intervention, however, is a different case: people are specifically given the drug. The purely statistical way of moving up to rung 2 is to run a randomised controlled trial (RCT), to remove the effect of confounding variables, and thereby to make the observed results the same as the results from intervention. For this reason the RCT is often known as the “gold standard” of experimental research.
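
Here is a small simulation of that difference, with my own illustrative numbers rather than Pearl’s: initial health is a confounder that drives both taking the drug and recovering, while the drug itself does nothing at all.

```python
# Rung 1 vs rung 2: observation versus intervention, with a
# confounder (initial health) and a drug that has no effect.
import random

def person(randomised):
    healthy = random.random() < 0.5
    if randomised:
        took_drug = random.random() < 0.5   # assigned by coin flip (RCT)
    else:
        took_drug = random.random() < (0.8 if healthy else 0.2)
    recovered = random.random() < (0.9 if healthy else 0.3)  # drug ignored
    return took_drug, recovered

def recovery_rates(randomised, n=100_000):
    outcomes = {True: [], False: []}
    for _ in range(n):
        took, recovered = person(randomised)
        outcomes[took].append(recovered)
    return {took: sum(rs) / len(rs) for took, rs in outcomes.items()}

print("observational:", recovery_rates(False))  # drug-takers look better
print("randomised:   ", recovery_rates(True))   # roughly equal: no real effect
```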

But here’s the thing: what is a confounding variable, and what is not? In order to know what to control for, and what to ignore, the experimenter has to have some kind of implicit causal model in their head. It has to be implicit, because statisticians are not allowed to talk about causality! Yet it must exist to some degree, otherwise how do we even know which variables to measure, let alone control for? Pearl argues to make this causal model explicit, and use it in the experimental design. Then, with respect to this now explicit causal model, it is possible to reason about results more powerfully. (He does not address how to discover this model: that is a different part of the scientific process, of modelling the world. However, observations can be used to test the model to some degree: some models are simply too causally strong to support the observed situation.)

Pearl uses this framework to show how and why the RCT works. More importantly, he also shows that it is possible to reason about interventions sometimes from observations alone (hence data mining pure observations becomes more powerful), or sometimes with fewer controlled variables, without the need for a full RCT. This is extremely useful, since there are many cases where RCTs are unethical, impractical, or too expensive. RCTs are not the “gold standard” after all; they are basically a dumb sledgehammer approach. He also shows how to use the causal model to calculate which variables do need to be controlled for, and how controlling for certain variables is precisely the wrong thing to do.
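
As a sketch of that last point, here is Pearl’s back-door adjustment applied to the same toy confounded set-up as above (numbers again invented): once the causal model says initial health Z is the only confounder, P(recovery | do(drug)) falls out of purely observational conditionals, with no RCT needed.

```python
# Back-door adjustment over a single confounder Z:
#   P(recovery | do(drug)) = sum_z P(recovery | drug, z) * P(z)
# Toy numbers matching the simulation above: the drug does nothing.

P_z = {"healthy": 0.5, "sick": 0.5}                    # P(Z)
P_recover = {                                          # P(recovery | arm, Z)
    ("drug", "healthy"): 0.9, ("drug", "sick"): 0.3,
    ("none", "healthy"): 0.9, ("none", "sick"): 0.3,
}

def do(arm):
    """Interventional recovery probability via back-door adjustment."""
    return sum(P_recover[(arm, z)] * pz for z, pz in P_z.items())

print("P(recovery | do(drug)):", do("drug"))   # 0.6
print("P(recovery | do(none)):", do("none"))   # 0.6 -- no causal effect
```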

Using such causal models also allows us to ascend to the third rung: reasoning about counterfactuals, where experiments are in principle impossible. This gives us the power to reason about different worlds: What’s the probability that Fred would have died from lung cancer if he hadn’t smoked? What’s the probability that a given heat wave would have happened with less CO2 in the atmosphere?

[p51] probabilities encode our beliefs about a static world, causality tells us whether and how probabilities change when the world changes, be it by intervention or by act of imagination.

This is a very nicely written book, with many real world examples. The historical detail included shows how and why statisticians neglected causality. It is not always an easy read – the concepts are quite intricate in places – but it is a crucially important read. We should never again bow down to “correlation is not causation”: we now know how to discover when it is.

Highly recommended.




For all my book reviews, see my main website.