One measure of how two variables interact is called correlation. One annoyance is that
famous saying all statisticians have tattooed near their hearts: "correlation does not
imply causation". This is because correlation is symmetric: however much variable 1
correlates with variable 2, variable 2 correlates just as much with variable 1. This is not
what we are looking for with volcanoes. Three other assumptions must
hold for correlation to be meaningful: linearity, randomness and joint normality.
If powerful non-linear effects (like AIDS) are present, we have to
abandon traditional correlations. Randomness means we are not controlling the values
(sadly true, in the case of geological events) and that values are mostly continuous.
That means correlations with gender, which is discrete (2 or 3 values), are out. Joint
normality can be pictured as a 3-D graph of the two variables together forming a bell.
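
To make the symmetry and linearity points concrete, here is a minimal sketch in plain Python. The function name pearson and all the numbers are made up for illustration; they are not real measurements. It computes the usual Pearson correlation coefficient, shows that swapping the two variables gives the same answer, and shows that a perfectly dependent but non-linear pair (a parabola) can still score zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations: unchanged if xs and ys are swapped.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical, roughly linear data (illustration only).
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.1, 3.9, 6.2, 8.1, 9.8]

r_ab = pearson(a, b)
r_ba = pearson(b, a)  # same value: correlation cannot tell which way cause runs

# y is completely determined by x, but the relationship is not linear.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]
r_parabola = pearson(x, y)  # zero, despite total dependence

print(r_ab, r_ba, r_parabola)
```

The parabola case is why the linearity assumption matters: a correlation of zero says only that there is no straight-line relationship, not that the variables are unrelated.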
Another possibility is covariance analysis. What we can do is look at how the second
variable's values change as the first variable's values change. We are obliged to have the
same number of observations for both variables, so events dated only to the year, or older
events dated only to within plus or minus 1000 years, will not be sufficient. That means we
are likely losing all but the last 300
years of volcanic eruptions. Not to worry. That's still plenty of data. But what kind of
data? Well, we can't have just zero (no eruption) or one (eruption). Suppose we have
a pile of VEI scores (Volcanic Explosivity Index, devised by Chris Newhall of the U.S.
Geological Survey and Steve Self at the University of Hawaiʻi nearly 30 years ago). VEIs
go from 0 (mild) to 8 (largest ever), but they are logarithmic (not linear) and discrete.
Suppose we shift the scale to run from 1 to 9 (so that 0 means no eruption) and undo the
logarithm by expanding the values to 10, 100, 1000 ... This is the same thing we had to do
with earthquake
Richter values. So now we have one problem: given two locations a hyperbolic distance
apart, can we assert that a seismic event of a certain value at location A causes a certain
range of values at location B? If so, what is the speed of causality, and is there a
relationship to distance? Now it gets tricky: suppose we have an eruption or earthquake at
location A.
Will that increase or decrease the gas, the magma or some other energy in the fault?
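
As a rough sketch of the transformation and the covariance idea described above, the following plain Python shifts VEI scores to 1 through 9, expands them to powers of ten, and then compares two sites at several time lags. Everything here is an assumption for illustration: the function names (linearize_vei, covariance, lagged_covariance), the yearly records for the two sites, and the choice of lags are all hypothetical. A large covariance at some lag would only hint at a delay worth investigating; it would not establish that A causes B.

```python
def linearize_vei(vei):
    """Shift VEI 0..8 to 1..9 and expand to 10**1 .. 10**9.

    None stands for a year with no eruption and maps to 0.
    """
    if vei is None:
        return 0
    if not 0 <= vei <= 8:
        raise ValueError("VEI must be between 0 and 8")
    return 10 ** (vei + 1)

def covariance(xs, ys):
    """Sample covariance; both series need the same number of observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def lagged_covariance(xs, ys, lag):
    """Covariance of xs[t] with ys[t + lag], for lag >= 0."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return covariance(xs, ys)
    return covariance(xs[:-lag], ys[lag:])

# Hypothetical yearly VEI records (None = no eruption). Site B is built to
# echo site A two years later, purely to show what a lag looks like.
site_a = [None, 2, None, None, 4, None, None, None, 3, None, None, None]
site_b = [None, None, None, 2, None, None, 4, None, None, None, 3, None]

lin_a = [linearize_vei(v) for v in site_a]
lin_b = [linearize_vei(v) for v in site_b]

# Try lags of 0 to 4 years and keep the one with the largest covariance.
best_lag = max(range(5), key=lambda k: lagged_covariance(lin_a, lin_b, k))
print(best_lag)
```

Because the expanded values span many orders of magnitude, the largest events dominate the covariance, so a real analysis would need to check that one big eruption is not driving the whole result.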
As it happens, this type of challenge has been tackled by epidemiologists. If an
epidemic disease kills too quickly, has very obvious symptoms or is too lethal, it can
never spread. AIDS and some influenzas have found optimal velocities. That's not good
for us.
Amidst all the doom and gloom, there is an interesting bit of mathematics. What we
want to do is unleash a pandemic of panels to trigger an economic earthquake.