Iterative design of experiment
I’ve been reading a little of D C Montgomery’s “Design and analysis of experiments” for a computer networks maths course at UPC. Towards the end of chapter 1, which covers the hows and whys of statistical methods for the design of experiments, it diverges from the usual bumph about controlled variables, response variables, replicates, etc., and starts talking about iterative design: an underpinning concept of Agile programming. I was surprised to find it here, but it seems to make sense.
Throughout this entire process, it is important to keep in mind that experimentation is an important part of the learning process, where we tentatively formulate hypotheses about a system, perform experiments to investigate these hypotheses, and on the basis of the results formulate new hypotheses, and so on. This suggests that experimentation is iterative.
There are the magic words: “iterative” and “learning process”.
It is usually a major mistake to design a single, large, comprehensive experiment at the start of a study. A successful experiment requires knowledge of the important factors, the ranges over which these factors should be varied, the appropriate number of levels to use, and the proper units of measurement for these variables. Generally we do not perfectly know the answers to these questions, but we learn about them as we go along.
Discovering the unknown by exploration, the fallacy of big-design-up-front, it’s still sounding familiar…
As an experimental program progresses, we often drop some input variables, add others, change the region of exploration for some factors, or add new response variables. Consequently, we usually experiment sequentially, and as a general rule, no more than about 25% of the available resources should be invested in the first experiment. This will ensure that sufficient resources are available to perform confirmation runs and ultimately accomplish the final objective of the experiment.
Changing specifications, setting aside resources to deal with this variability, the difference between what you may expect and what you may observe.
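To make the 25% rule of thumb concrete, here is a minimal sketch of the arithmetic. Montgomery gives only the guideline, not a formula, so the function name `first_experiment_budget`, the use of runs as the resource, and the choice to round down are all my own assumptions.

```python
def first_experiment_budget(total_runs: int, fraction: float = 0.25) -> tuple[int, int]:
    """Split a run budget into (first experiment, held in reserve).

    Hypothetical helper: the 0.25 default mirrors the "no more than
    about 25%" guideline quoted above; rounding down is an assumption.
    """
    if total_runs < 0:
        raise ValueError("total_runs must be non-negative")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Round down so the first experiment never exceeds the chosen fraction.
    first = int(total_runs * fraction)
    return first, total_runs - first


first, reserve = first_experiment_budget(40)
print(first, reserve)  # 10 30
```

With 40 runs available, that leaves 30 for follow-up experiments and confirmation runs once the first 10 have shown which factors and ranges actually matter.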
Recall: this isn’t a book on Agile, or even on software engineering; this is a book on using statistical techniques to effect the scientific method.
Montgomery even goes so far as to make it the closing idea of this first chapter:
Experiments are usually iterative. Remember that, in most situations, it is unwise to design too comprehensive an experiment at the start of a study. Successful design requires knowledge of the important factors, the ranges over which these factors are varied, the appropriate number of levels for each factor, and the proper units of measurement for each factor and response. Generally, we are not well-equipped to answer these questions at the beginning of the experiment, but we learn the answers as we go along. This argues in favour of the iterative or sequential approach discussed previously. Of course, there are situations where comprehensive experiments are entirely appropriate, but as a general rule, most experiments should be iterative. Consequently, we usually should not invest more than about 25% of the resources of experimentation (runs, budget, time, etc.) in the initial design. Often these first efforts are just learning experiences, and some resources must be available to accomplish the final objectives of the experiment.
I find it quite reassuring that these principles, which seem simple, well founded and effective, already have a solid place in other disciplines through a seemingly independent origin. A general theory of discovering by doing rather than postulating?
Finally, I couldn’t resist this gem from the opening page:
Literally, an experiment is a test.