From: "Mark Boon" <tesuji@xxxxxxxxxxxxxxxxx> wrote:
It appears that you are thinking of "common subtree" as the case of at least two rules involving the same predicate. Since the example involved only 52 predicates total, and 10,000 rules with 15 predicates each, each of those 52 predicates is essentially guaranteed to appear in many different rules, until you have nearly reached the bottom of the decision tree.I'm not following all this, it's way too complicated. You are seeing a theoretical problem that is easily solved in practice. If there are no common sub-trees from a certain point, there's no reason at all to continue maintaining a tree structure.
This is true as long as you reuse decision subtrees (making it technically a DAG, instead of a tree), and as long as each board matches only one pattern, but it does not solve the space problem.Instead you can make a leaf node that contains a list of the remaining questions. If you think about it you'll find that this way the tree will contain as many leafs as there are patterns.
If you wrote "usually remains", I would agree. But a million patterns in a thousand different shapes, as I was daydreaming about, is a particularly ornery kind of practise. While I do not expect that these rules would be one-tenth as pathological as the theoretical example, I would expect them to be problematic enough to require something other than a single straightforward decision tree using the theoretical minimum number of comparisons.And in practise the number of internal nodes remains very limited as well, provided you take good care in which order you insert the questions in the tree.