How classification is used
Text classification is used for a number of purposes:
- Spam detection
- Authorship attribution
- Sentiment analysis
- Age and gender identification
- Determining the subject of a document
- Language identification
Spam is an unfortunate reality for most e-mail users. If an e-mail can be classified as spam, then it can be moved to a spam folder. The text of a message can be analyzed, and certain attributes can be used to designate the e-mail as spam. These attributes can include misspellings, the lack of an appropriate e-mail address for the recipients, and a non-standard URL.
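As a rough illustration of how such attributes might be combined, the following Python sketch flags a message when enough of them are present. Everything in it is an assumption made for the example: the tiny misspelling list, the choice to treat a URL with a raw IP address as "non-standard", the function names, and the threshold of two attributes. A practical spam filter would normally learn these signals from labeled data rather than rely on hand-written rules.

```python
import re

# Hypothetical list of misspellings; a real system would use a
# dictionary or a trained model instead of a fixed set.
KNOWN_MISSPELLINGS = {"recieve", "garantee", "congradulations"}

# Assumption for this sketch: a URL whose host is a raw IP address
# counts as "non-standard".
IP_URL = re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}")


def spam_attributes(recipient, body):
    """Return which of the three attributes are present in a message."""
    words = re.findall(r"[a-z]+", body.lower())
    return {
        "misspellings": any(w in KNOWN_MISSPELLINGS for w in words),
        "missing_recipient": "@" not in recipient,
        "nonstandard_url": IP_URL.search(body) is not None,
    }


def is_spam(recipient, body, threshold=2):
    """Designate a message as spam when enough attributes are present."""
    return sum(spam_attributes(recipient, body).values()) >= threshold
```

Calling `is_spam("undisclosed-recipients", "You will recieve a prize at http://192.168.0.1/win")` would designate the message as spam, since all three attributes are present, while an ordinary message sent to a well-formed address would not be flagged.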
Classification has been used to determine the authorship of documents. This has been done for historical documents such as The Federalist Papers and for the book Primary Colors, where the authors have been identified.
Sentiment analysis is a technique that determines the attitude expressed in text. Movie reviews have been a popular domain, but it can be used for almost any product review. This helps companies better assess how...