If a discrete variable on which the training procedure tries to split takes more than max_categories values, finding the exact best subset may take a very long time, because the search is exponential in the number of categories. Instead, many decision tree engines (including ML) look for a suboptimal split in this case by clustering all the samples into max_categories clusters (i.e. some categories are merged together).
Note that this technique is used only in N(>2)-class classification problems. For regression and 2-class classification, the optimal split can be found efficiently without clustering, so the parameter is not used in those cases.
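To give a feel for why the exact search is described as exponential, here is a minimal Python sketch that counts and enumerates the two-way groupings of a categorical variable's values. It is only an illustration of the combinatorics, not the code used by ML or Emgu CV; the function names `count_binary_splits` and `enumerate_binary_splits` are hypothetical helpers introduced for this example.

```python
from itertools import combinations


def count_binary_splits(k: int) -> int:
    """Number of ways to divide k distinct categories into two non-empty groups."""
    if k < 2:
        return 0
    return 2 ** (k - 1) - 1


def enumerate_binary_splits(categories):
    """Yield each (left, right) grouping once; only practical for small inputs.

    Assumes the categories are distinct values.
    """
    cats = list(categories)
    if len(cats) < 2:
        return
    first, rest = cats[0], cats[1:]
    # Pin the first category to the left side so mirrored splits are not repeated.
    # r stops before len(rest) so the right side is never empty.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = (first, *extra)
            right = tuple(c for c in rest if c not in extra)
            yield left, right


if __name__ == "__main__":
    for k in (4, 8, 16, 32):
        print(k, count_binary_splits(k))
```

With 4 categories there are 7 candidate groupings, but with 16 there are already 32,767, and the count roughly doubles with each additional category. Merging categories into at most max_categories clusters bounds this number before the split search starts.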
Namespace: Emgu.CV.ML.Structure
Assembly: Emgu.CV.ML (in Emgu.CV.ML.dll) Version: 126.96.36.1995 (188.8.131.525)
C#: public int maxCategories
Visual Basic: Public maxCategories As Integer
F#: val mutable maxCategories: int