Good Enough Practices in Scientific Computing

Much of the work that I do as a consultant (or at least used to do, as these days my juniors tend to do most of the hands-on work) falls under a category that I’d describe as scientific computing. Scientific computing is the type of software development done to solve an analytical problem, usually involving the processing and analysis of data that you have collated or collected.

The practices involved in scientific computing are different from those involved in traditional software engineering. The objectives and context are similar, but not the same.

Software engineering is interested in the development of software that is correct and free from errors. Scientific computing is interested in analysis results that are correct and free from errors. The correctness of the software is a means of achieving correct analysis results, but it is not the only means, as the scientist (or consultant in my case) will perform their own checks over the results before they are willing to publish.

The users of the software developed by a software engineer are many. The software engineer wants their software to be robust to the inputs of this wide range of users. The user of the software developed in scientific computing is often the developer themselves, or their colleagues. The user can be expected to be competent, unlikely to provide silly inputs, and well equipped to detect and correct errors arising from their use of the software.

In scientific computing, reproducibility is paramount. When our results are challenged, we need to be able to explain and justify them.

Best Practices for Scientific Computing is a paper from 2014 that considers good practice in the field of software engineering, and how these practices should be applied to scientific computing.

It was followed in 2017 by the paper Good enough practices in scientific computing, which drew upon the authors’ experience of helping scientists try to follow the recommendations in the first paper. The recommendations in the second paper reflect the fact that scientists are not software engineers, and that some of the tooling and practices recommended in the first paper are more challenging for the average scientist to implement than the authors first thought.

Both of these papers are well worth reading, and have strongly influenced how I approach my work.