One method for building classification trees is to choose split variables by maximising expected entropy. This can be extended through the application of imprecise probability by replacing instances of expected entropy with the maximum possible expected entropy over credal sets of probability distributions. Such methods may not take full advantage of the opportunities offered by imprecise probability theory. In this paper, we change focus from the maximum possible expected entropy to the full range of expected entropy. We present an entropy minimisation algorithm using the non-parametric inference approach to multinomial data. We also present an interval comparison method based on two user-chosen parameters, which includes previously presented splitting criteria (maximum entropy and entropy interval dominance) as special cases. This method is then applied to 13 datasets, and the various possible values of the two user-chosen parameters are compared both with each other and with the entropy maximisation criterion that our approach generalises.
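As a rough illustration of the quantities involved, the sketch below computes an entropy interval over a credal set and the two special-case comparison rules named above (maximum entropy and entropy interval dominance). It is a minimal sketch only: it substitutes the imprecise Dirichlet model (IDM), with hyperparameter s, for the paper's non-parametric inference model for multinomial data, the greedy mass-allocation step is an approximation, and all function names are hypothetical. The paper's own two-parameter comparison method is not reproduced here.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability vector; zero entries are skipped."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def entropy_interval_idm(counts, s=1.0):
    """Entropy interval [H_min, H_max] over an IDM credal set.

    Illustration only: the paper uses non-parametric inference for
    multinomial data; the IDM is substituted here because its credal set
    has the simple closed form p_j in [n_j/(n+s), (n_j+s)/(n+s)].
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()

    # Minimum entropy: skew the distribution as far as possible by
    # giving the whole extra mass s to one category with maximal count.
    c_min = counts.copy()
    c_min[np.argmax(c_min)] += s
    h_min = entropy(c_min / (n + s))

    # Maximum entropy: approach uniformity by feeding the extra mass,
    # in small increments, to whichever category is currently smallest
    # (a crude numerical version of the usual levelling allocation).
    c_max = counts.copy()
    step = s / 1000.0
    for _ in range(1000):
        c_max[np.argmin(c_max)] += step
    h_max = entropy(c_max / (n + s))

    return h_min, h_max

def preferred_max_entropy(h_a, h_b):
    """Maximum-entropy criterion (one special case): compare upper
    endpoints only; the split with the smaller upper expected entropy
    is preferred."""
    return h_a[1] < h_b[1]

def preferred_interval_dominance(h_a, h_b):
    """Entropy interval dominance (the other special case): prefer A
    only when its entire interval lies below B's, i.e. A's upper
    endpoint is below B's lower endpoint."""
    return h_a[1] < h_b[0]

# Example: entropy intervals for the class counts of two candidate splits.
h_a = entropy_interval_idm([10, 3, 2])
h_b = entropy_interval_idm([5, 5, 5])
print(h_a, h_b)
print(preferred_max_entropy(h_a, h_b), preferred_interval_dominance(h_a, h_b))
```

In a tree-growing loop, such interval rules would replace the usual point-valued expected-entropy comparison; a two-parameter method as described in the abstract would interpolate between the two special-case rules sketched here.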
Paul Fink | Paul.Fink@stat.uni-muenchen.de
Richard Crossman | r.j.crossman@warwick.ac.uk