Log of Determinant

This should be a well-known technique by now, but it seems not to be.

In statistical work, one frequently sees the expression \log{|A|}, where the bars denote the determinant of a covariance matrix A. With a large covariance matrix, computing the determinant directly can easily overflow, even though the final value is usually a reasonable finite number.

The covariance structure is key: since A is symmetric positive definite, we can first compute a Cholesky decomposition A = LL', which is quite numerically stable. We then have two triangular matrices whose determinants are the products of their diagonal entries (proof left as an exercise to the reader). Taking the log transforms the product into a sum!

Thus \log{|A|} = 2\sum_i \log{L_{ii}}, where L_{ii} are the diagonal entries of L. You’ll find that this rarely overflows, and it can even (depending on how sophisticated the determinant routine is) speed up the work!
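As a minimal sketch of the trick (the post has no code, so this is just an illustration in Python with NumPy, checked against slogdet):

```python
import numpy as np

def log_det(A):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    L = np.linalg.cholesky(A)                # A = L L', L lower triangular
    return 2.0 * np.sum(np.log(np.diag(L)))  # log|A| = 2 * sum(log L_ii)

# Quick sanity check on a synthetic covariance-like matrix
rng = np.random.default_rng(0)
Z = rng.standard_normal((500, 50))
A = Z.T @ Z / 500                            # 50x50 SPD matrix
sign, ref = np.linalg.slogdet(A)             # reference log-determinant
print(np.isclose(log_det(A), ref))           # True
```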

The Voyage of Life

I went to the National Gallery today to take a final look before I leave next week. This particular series of paintings caught my eye, especially the one above.

It just seems to reflect the drastic kick of reality I’ve been ingesting. When I was little, dreams were so lofty, ideas so wild, and thoughts ran carelessly. Now, the future seems perilous at times… what if my next steps are wrong?

I’m still looking forward to the next year… but what lies after Cornell truly scares me.

Generating GLM Data and Eccentricity

At work, I had to generate random data to run logistic regressions on. In one unusual case, the slice sampler was performing far worse than expected. The code was simple and contained no mistakes; we thought something had gone badly wrong with the whole testing framework.

What ended up happening was that our data matrix X was generated from a uniform distribution on 0 to 1, while the reference runs were generated on -0.5 to 0.5. The parameters \beta of the logistic regression were generated from -0.5 to 0.5 in both cases.
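For concreteness, here is a hypothetical reconstruction of that setup (the actual test code isn't shown here; this is a Python/NumPy sketch using the distributions described above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 10

# Hypothetical reconstruction of the two setups described above
X = rng.uniform(0.0, 1.0, size=(n, p))       # our runs: Uniform(0, 1)
# X = rng.uniform(-0.5, 0.5, size=(n, p))    # reference runs: Uniform(-0.5, 0.5)
beta = rng.uniform(-0.5, 0.5, size=p)        # true coefficients
prob = 1.0 / (1.0 + np.exp(-X @ beta))       # logistic (inverse-logit) link
y = rng.binomial(1, prob)                    # Bernoulli responses
```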

Logistic regression is a special case of a GLM, so we have a linear predictor X\beta. Theoretically, in both cases the predictor should be centered at 0. Yet it’s (almost) always true that shifting X away from 0 makes the condition number of the design matrix larger.
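A quick numerical check of that claim, again just a NumPy sketch rather than anything from the original testing framework:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 1000, 10

X_shifted  = rng.uniform(0.0, 1.0, size=(n, p))    # columns centered at 0.5
X_centered = rng.uniform(-0.5, 0.5, size=(n, p))   # columns centered at 0

print(np.linalg.cond(X_shifted))    # noticeably larger
print(np.linalg.cond(X_centered))   # much smaller, close to 1
```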

Maybe a small proof will come later? It seems to have to do with the eigenvalues of the sum of two random matrices… which is not entirely trivial.

Work

Work is so tough. I thought it would be pretty chill and I’d have tons of time to relax at night. Nope… too tired to do anything else.

I’ll be posting some statistics soon though!

Paper

Busy with final projects, and watching Pokemon.

*Team Rocket does something stupid
Jessie: why did we do that?
James: because we have to fill the half hour