On the Use of Side Information for Text Mining using Clustering and Classification Techniques-A Survey

Subamanikandan A, Arulmurugan R

Abstract

In many text mining applications, side information is available along with the text documents. Such side information may be of different kinds, such as links in the document, document provenance information, user-access behavior from web logs, or other non-textual attributes. Such attributes may contain a large amount of information that is useful for clustering. However, the relative importance of this side information is difficult to estimate, especially when some of it is noisy. In such cases, it can be risky to incorporate side information into the mining process, because it can either improve the quality of the representation or add noise to it. In this paper, we design an algorithm that combines classical partitioning algorithms with probabilistic models in order to create an effective clustering approach. We then show how to extend the approach to the classification problem.

Article Details

Published

2014-11-28

Issue

Vol. 3 No. 11 (2014)

Section

Articles

How to Cite

On the Use of Side Information for Text Mining using Clustering and Classification Techniques-A Survey. (2014). International Journal of Engineering and Computer Science, 3(11). http://www.ijecs.in/index.php/ijecs/article/view/2305
