Generating GLM Data and Eccentricity

At work, I had to generate random data to run logistic regressions on. In one unusual case, the slice sampler was performing far worse than expected. The code was simple, and contained no mistake; we thought something incredibly bad had happened with the whole testing framework.

It turned out that our data matrix $X$ was generated from a uniform distribution on 0 to 1, whereas the reference runs generated it from -0.5 to 0.5. In both cases, the parameters $\beta$ of the logistic regression were generated from -0.5 to 0.5.
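For concreteness, here is a minimal sketch of how such test data could be simulated. It is not our actual testing code: the function name `simulate_logistic_data`, its arguments, and the choice of a uniform distribution for $\beta$ are assumptions made for illustration.

```python
import numpy as np


def simulate_logistic_data(n_rows, n_cols, x_low, x_high, seed=0):
    """Draw X, beta and binary responses y for a logistic regression.

    Hypothetical helper: X is uniform on [x_low, x_high), and beta is
    assumed (not confirmed by the post) to be uniform on [-0.5, 0.5).
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(x_low, x_high, size=(n_rows, n_cols))
    beta = rng.uniform(-0.5, 0.5, size=n_cols)
    eta = X @ beta                       # linear predictor
    prob = 1.0 / (1.0 + np.exp(-eta))    # inverse logit link
    y = rng.binomial(1, prob)            # one Bernoulli draw per row
    return X, beta, y


# The two setups from the post differ only in the range of X.
X_shifted, beta_shifted, y_shifted = simulate_logistic_data(200, 3, 0.0, 1.0)
X_centered, beta_centered, y_centered = simulate_logistic_data(200, 3, -0.5, 0.5)
```

Using the same seed for both calls makes the two data sets differ only through the range of $X$, which is the comparison of interest.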

Logistic regression is a special case of a GLM, which means it has a linear predictor $X\beta$. In theory, the linear predictor is centered at 0 in both cases, because $\beta$ is centered at 0. Somehow, though, it's (almost) always true that shifting $X$ away from 0 makes the condition number of $X$ larger.
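The effect is easy to check numerically. The sketch below is an illustration rather than our original test: the function name `design_condition_numbers` and the default sizes are made up. It draws one matrix uniformly on -0.5 to 0.5, shifts the same draws by 0.5, and compares the 2-norm condition numbers of the two matrices.

```python
import numpy as np


def design_condition_numbers(n_rows=500, n_cols=5, seed=0):
    """Return (cond of centered X, cond of shifted X) for one random draw.

    Hypothetical helper; the sizes are arbitrary illustration values.
    """
    rng = np.random.default_rng(seed)
    X_centered = rng.uniform(-0.5, 0.5, size=(n_rows, n_cols))
    X_shifted = X_centered + 0.5  # same draws, now uniform on [0, 1)
    # For a rectangular matrix, np.linalg.cond defaults to the ratio of
    # the largest to the smallest singular value.
    return np.linalg.cond(X_centered), np.linalg.cond(X_shifted)


if __name__ == "__main__":
    cond_centered, cond_shifted = design_condition_numbers()
    print(f"centered: {cond_centered:.2f}, shifted: {cond_shifted:.2f}")
```

Note that the condition number of $X^\top X$ is the square of the condition number of $X$, so any gap seen here is amplified for quantities that depend on $X^\top X$.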

Maybe a small proof will come later? This seems to come down to the eigenvalues of a sum of two random matrices, which is not entirely trivial.
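In the meantime, here is a heuristic that points in the same direction, though it is not the proof. Assume the entries of the $n \times p$ matrix $X$ are i.i.d. with mean $\mu$ and variance $\sigma^2$ (the symbols $n$, $p$, $\mu$, $\sigma^2$ and $\kappa$ are introduced here for illustration), and look at the expected Gram matrix instead of the random one:

$$
\mathbb{E}\left[X^\top X\right] = n\left(\sigma^2 I_p + \mu^2 \mathbf{1}\mathbf{1}^\top\right).
$$

Its eigenvalues are $n\sigma^2$ with multiplicity $p - 1$ and $n(\sigma^2 + p\mu^2)$ along the direction $\mathbf{1}$, so its condition number is

$$
\kappa\left(\mathbb{E}\left[X^\top X\right]\right) = 1 + \frac{p\mu^2}{\sigma^2}.
$$

For entries uniform on -0.5 to 0.5 we have $\mu = 0$ and the ratio is 1. For entries uniform on 0 to 1 we have $\mu = 1/2$ and $\sigma^2 = 1/12$, so the ratio is $1 + 3p$. This only describes the expectation; it says nothing rigorous about a single draw of $X$, which is where the difficulty lies.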
