Selecting Interesting Articles Using Their Similarity Based Only on Positive Examples
Authors | |
---|---|
Year of publication | 2005 |
Type | Article in Proceedings |
Conference | Computational Linguistics and Intelligent Text Processing |
MU Faculty or unit | |
Citation | |
Field | Informatics |
Keywords | machine learning; text categorization; text filtration; text similarity; k-NN; ranking |
Description | Automated searching for interesting text documents frequently suffers from a very poor balance between positive and negative examples, or from one class missing entirely. This paper proposes a ranking approach based on the k-NN algorithm, adapted to determine the degree of similarity of new documents to a representative collection of positive examples only. Based on the precision-recall relation, a user can decide in advance how many articles should pass through the filter and how similar they must be. |
Related projects: |
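
The ranking idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say how documents are represented or how neighbour similarities are combined, so the bag-of-words vectors, the cosine similarity, the mean over the k nearest positives, and all function and parameter names (`knn_score`, `rank_and_filter`, `min_score`, `max_results`) are assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Plain term counts; an assumed stand-in for the paper's representation."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_score(doc_vec, positive_vecs, k=3):
    """Mean similarity to the k most similar positive examples (assumed scoring)."""
    sims = sorted((cosine(doc_vec, p) for p in positive_vecs), reverse=True)
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0


def rank_and_filter(candidates, positives, k=3, min_score=0.0, max_results=None):
    """Rank candidates by similarity to the positive collection only.

    min_score controls how similar released articles must be,
    max_results controls how many are released.
    """
    positive_vecs = [vectorize(p) for p in positives]
    scored = [(knn_score(vectorize(c), positive_vecs, k), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    kept = [(s, c) for s, c in scored if s >= min_score]
    return kept if max_results is None else kept[:max_results]


if __name__ == "__main__":
    positives = [
        "machine learning text categorization",
        "text filtering with k-nn ranking",
        "document similarity for text classification",
    ]
    candidates = [
        "chocolate cake recipe",
        "ranking text documents by similarity",
    ]
    for score, text in rank_and_filter(candidates, positives, k=2):
        print(f"{score:.3f}  {text}")
```

No negative examples are used anywhere: a new document is scored only against the positive collection, and the two filter parameters correspond to the user's advance choice of how many and how similar the released articles should be.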