Investigation and application of improved text mining based on support vector machine

Text mining structures the input text, derives patterns from the structured data, and then evaluates and interprets the output. It can help an organization derive potentially valuable business insights from text-based content. We built a RapidMiner process to examine spam messages and learn to classify them. Several thousand messages were analyzed, and a Support Vector Machine learner classified them with 91.89% accuracy in a very simple process. We discuss how to examine the frequency of words in documents and explain the basics of the Support Vector Machine method, cross-validation, and dataset balancing.
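The preprocessing steps named in the abstract (word frequency, dataset balancing, cross-validation) can be illustrated with a minimal sketch. It is not the RapidMiner process used in the paper: it uses plain Python instead, and the function names, the downsampling strategy, the contiguous fold layout, and the toy messages are all assumptions made for illustration. The Support Vector Machine learner itself is omitted.

```python
import random
from collections import Counter


def word_frequencies(text):
    """Count lowercase, whitespace-separated tokens in one message."""
    return Counter(text.lower().split())


def balance_by_downsampling(examples, seed=0):
    """Downsample every class to the size of the smallest class (assumed strategy)."""
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append((text, label))
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


def k_fold_indices(n_items, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size


# Toy messages invented for this sketch; they are not the paper's data.
messages = [
    ("win a free prize now", "spam"),
    ("free free offer click now", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch at noon", "ham"),
    ("please review the report", "ham"),
    ("see you at the meeting", "ham"),
]

freq = word_frequencies(messages[1][0])
balanced = balance_by_downsampling(messages)
folds = list(k_fold_indices(len(balanced), k=2))
```

In a full pipeline, each fold's training indices would feed a classifier and the held-out indices would be used to estimate accuracy; balancing before splitting keeps the majority class from dominating that estimate.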

Tang Zhi-hang