Reproducibility of Numerical Data

Using git to version-control my code and LaTeX files is probably one of the best habits I have picked up in my academic career. I do sometimes slack off and end up with conflicts between the repositories on my work computer, laptop, and home computer, which are agonizing to resolve, but those instances are rare. In general, I try to commit and push once a day with a short comment on what was accomplished, which lets me backtrack to a great extent if needed.

Unfortunately, there seem to be two issues, which I am running into right now while revising a paper. The first is that I should also state the versions of the auxiliary software and packages that my own code depends on. I found this out the hard way when I noticed that different versions of gmsh produced different meshes, even on simple domains such as the cube. I believe this can be resolved simply by pinning versions in a setup.py or a requirements text file.
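As a complement to pinning versions, here is a minimal sketch of how the versions actually in use could be saved next to the data. It assumes the dependencies are installed as Python distributions (the names "numpy" and "gmsh" below are only examples) and that external tools such as gmsh answer to a `--version` flag. The helper names are my own and not part of any library.

```python
import importlib.metadata
import json
import platform
import shutil
import subprocess


def package_versions(names):
    """Return {name: version} for installed Python distributions (None if missing)."""
    versions = {}
    for name in names:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def tool_version(command):
    """Return what an external tool prints for its version, or None if it is not on PATH."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    # Some tools print the version to stderr rather than stdout, so check both.
    return (result.stdout.strip() or result.stderr.strip()) or None


def environment_snapshot(packages, tools):
    """Collect interpreter, package, and tool versions into one dictionary."""
    return {
        "python": platform.python_version(),
        "packages": package_versions(packages),
        "tools": {name: tool_version(cmd) for name, cmd in tools.items()},
    }


if __name__ == "__main__":
    # Example names only; replace them with whatever the project depends on.
    snapshot = environment_snapshot(["numpy", "gmsh"], {"gmsh": ["gmsh", "--version"]})
    print(json.dumps(snapshot, indent=2))
```

Saving this JSON alongside each batch of results would make it possible to tell later which gmsh version generated a given mesh.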

The other is that I need to record the exact code and parameter configuration whenever I present data. This entails committing the code every time data is added to the write-up, and then adding the commit ID to the LaTeX file as well.
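Adding the commit ID by hand is easy to forget, so here is a rough sketch of how it might be automated. It assumes git is on the PATH and the script runs inside the repository. The macro name `\datacommit`, the file name `commit.tex`, and the function names are hypothetical choices of mine.

```python
import subprocess
from pathlib import Path


def git_output(*args, repo="."):
    """Run a git command in `repo` and return its stripped standard output."""
    result = subprocess.run(
        ["git", *args], cwd=repo, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def current_commit(repo="."):
    """Return (commit hash, dirty flag) for the repository at `repo`."""
    commit = git_output("rev-parse", "HEAD", repo=repo)
    # `git status --porcelain` prints nothing when the working tree is clean.
    dirty = bool(git_output("status", "--porcelain", repo=repo))
    return commit, dirty


def latex_commit_macro(commit, dirty, macro="datacommit", length=10):
    """Build a \\newcommand line that defines `macro` as the abbreviated commit ID."""
    label = commit[:length] + ("-dirty" if dirty else "")
    return f"\\newcommand{{\\{macro}}}{{\\texttt{{{label}}}}}\n"


def write_commit_file(path="commit.tex", repo="."):
    """Write the macro definition to `path` so the write-up can \\input it."""
    commit, dirty = current_commit(repo)
    Path(path).write_text(latex_commit_macro(commit, dirty))
```

Calling `write_commit_file()` just before generating the data and putting `\input{commit.tex}` in the preamble would let `\datacommit` appear in a caption or footnote. The `-dirty` suffix flags data produced from uncommitted changes, which is exactly the case I want to avoid.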

 
