Methods and systems are disclosed for selecting a selected-sub-set of features from a plurality of features for training a machine learning module. Training the machine learning module enables classification of an electronic document to a target label, and the plurality of features is associated with the electronic document. In one embodiment, the method comprises analyzing a given training document to extract the plurality of features and, for a given not-yet-selected feature of the plurality of features: iteratively generating a set of relevance parameters, iteratively generating a set of redundancy parameters, and determining a feature significance score based on the set of relevance parameters and the set of redundancy parameters. The method further comprises selecting the feature associated with the highest value of the feature significance score and adding the selected feature to the selected-sub-set of features.
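
The greedy loop described above can be illustrated with a minimal sketch. The abstract does not say how the relevance and redundancy parameters are computed, so the sketch makes several assumptions: relevance is taken to be the empirical mutual information between a feature and the target label, the redundancy parameters are the mutual information between the candidate and each already-selected feature, and the feature significance score is relevance minus mean redundancy (an mRMR-style criterion, swapped in for illustration only). The names `mutual_information`, `select_features`, and the toy feature names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def select_features(columns, labels, k):
    """Greedily pick k feature names from `columns` (name -> list of values).

    Assumed scoring (not taken from the abstract):
    score = MI(feature, label) - mean MI(feature, already-selected feature).
    """
    selected = []
    remaining = set(columns)
    while remaining and len(selected) < k:
        best_name, best_score = None, -math.inf
        for name in sorted(remaining):  # sorted for deterministic tie-breaking
            relevance = mutual_information(columns[name], labels)
            # One redundancy parameter per feature already in the sub-set
            redundancy_terms = [
                mutual_information(columns[name], columns[s]) for s in selected
            ]
            redundancy = (
                sum(redundancy_terms) / len(redundancy_terms)
                if redundancy_terms
                else 0.0
            )
            score = relevance - redundancy
            if score > best_score:
                best_name, best_score = name, score
        # Add the highest-scoring feature to the selected sub-set
        selected.append(best_name)
        remaining.remove(best_name)
    return selected


# Toy data: 8 training documents, binary features, binary target label.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "kw_invoice": [1, 1, 1, 0, 0, 0, 0, 0],      # strongly relevant
    "kw_invoice_dup": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate, fully redundant
    "long_doc": [1, 1, 0, 1, 0, 0, 1, 0],        # weaker, but adds new information
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],           # independent of the label
}

print(select_features(columns, labels, 2))
```

On this toy data the second pick is `long_doc` rather than the duplicate keyword feature: the duplicate is just as relevant as the first pick, but its redundancy with the already-selected feature outweighs that relevance under the assumed score.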