I'd second the advice about artificial settings, for the reasons (efficient, focused development) that Gunnar gave.

When it comes to which types of tests are *most* useful, my experience is that they fall into two categories:

1. Systematic and serious errors your own engine insists on making.
2. Systematic collections of specific types of problems, preferably in artificial settings without any distractions.