Multiple Aspect Ranking Using Sentiment Classification for Data Mining

IJCSEC Front Page

Abstract
Numerous consumer reviews of products are now available on the Internet. Consumer reviews contain rich and valuable knowledge for both firms and users. However, the reviews are often disorganized, leading to difficulties in information navigation and knowledge acquisition. This article proposes a product aspect ranking framework, which automatically identifies the important aspects of products from online consumer reviews, aiming at improving the usability of the numerous reviews. The important product aspects are identified based on two observations: 1) the important aspects are usually commented on by a large number of consumers and 2) consumer opinions on the important aspects greatly influence their overall opinions on the product. We then develop a probabilistic aspect ranking algorithm to infer the importance of aspects by simultaneously considering aspect frequency and the influence of consumer opinions given to each aspect over their overall opinions. The experimental results on a review corpus of 21 popular products in eight domains demonstrate the effectiveness of the proposed approach. Moreover, we apply product aspect ranking to two real-world applications, i.e., document-level sentiment classification and extractive review summarization, and achieve significant performance improvements, which demonstrate the capacity of product aspect ranking in facilitating real-world applications.
Index Terms: Product aspects, aspect ranking, aspect identification, sentiment classification, consumer review, extractive review summarization
I. Introduction
Recent years have witnessed the rapid expansion of e-commerce. A recent study from ComScore reports that online retail spending in the U.S. reached $37.5 billion in Q2 2011. Millions of products from various merchants are offered online. For example, Bing Shopping has indexed more than five million products, Amazon.com archives more than 36 million products, and Shopper.com records more than five million products from over 3,000 merchants. Most retail Websites encourage consumers to write reviews to express their opinions on various aspects of the products. Here, an aspect, also called a feature in the literature, refers to a component or an attribute of a certain product. For instance, the sample review “The battery life of Nokia N95 is amazing.” reveals a positive opinion on the aspect “battery life” of the product Nokia N95. The volume of such reviews is large: CNet.com hosts more than seven million product reviews, whereas Pricegrabber.com contains millions of reviews on more than 32 million products in 20 distinct categories from over 11,000 merchants. These consumer reviews contain rich and valuable knowledge and have become an important resource for both consumers and firms [9]. Consumers commonly seek quality information from online reviews prior to purchasing a product, while many firms use online reviews as important feedback in their product development, marketing, and consumer relationship management. Generally, a product may have hundreds of aspects. For example, iPhone 3GS has more than three hundred aspects such as “usability,” “design,” “application,” and “3G network.” We argue that some aspects are more important than others and have a greater impact on consumers’ eventual decision making as well as on firms’ product development strategies. For example, some aspects of iPhone 3GS, e.g., “usability” and “battery,” concern most consumers and are more important than others such as “usb” and “button.”
For a camera product, aspects such as “lenses” and “picture quality” would greatly influence consumer opinions on the camera, and they are more important than aspects such as “a/v cable” and “wrist strap.” Hence, identifying important product aspects will improve the usability of numerous reviews and is beneficial to both consumers and firms. We therefore propose a product aspect ranking framework to automatically identify the important aspects of products from online consumer reviews. Our assumption is that the important aspects of a product possess the following characteristics: (a) they are frequently commented on in consumer reviews; and (b) consumers’ opinions on these aspects greatly influence their overall opinions on the product. A straightforward frequency-based solution is to regard the aspects that are frequently commented on in consumer reviews as important. However, consumers’ opinions on the frequent aspects may not influence their overall opinions on the product, and thus may not influence their purchasing decisions. For example, most consumers frequently criticize the bad “signal connection” of iPhone 4, but they may still give high overall ratings to iPhone 4.
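The gap between a frequency-only ranking and one that also accounts for opinion influence can be illustrated with a small sketch. This is not the probabilistic aspect ranking algorithm of this article. It is a simplified stand-in that, as an assumption, measures influence by the Pearson correlation between the opinion on an aspect and the overall rating, and combines it with frequency through an ad hoc product. The toy reviews, the function names, and the scoring rule are all hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data (invented for illustration): each review has an
# overall rating (1-5) and per-aspect opinions (-1 negative, +1 positive).
reviews = [
    {"overall": 5, "aspects": {"signal": -1, "usability": 1}},
    {"overall": 4, "aspects": {"signal": -1, "usability": 1}},
    {"overall": 2, "aspects": {"signal": -1, "usability": -1}},
    {"overall": 5, "aspects": {"signal": -1, "usability": 1, "battery": 1}},
    {"overall": 1, "aspects": {"signal": -1, "usability": -1, "battery": -1}},
    {"overall": 4, "aspects": {"signal": -1}},
]


def aspect_counts(reviews):
    """Count how many reviews comment on each aspect."""
    return Counter(a for r in reviews for a in r["aspects"])


def frequency_ranking(reviews):
    """Frequency-based baseline: most frequently commented aspects first."""
    return [aspect for aspect, _ in aspect_counts(reviews).most_common()]


def opinion_influence(reviews, aspect):
    """Pearson correlation between the aspect opinion and the overall rating.

    Assumption of this sketch: correlation is used as a crude proxy for
    influence. Returns 0.0 when the correlation is undefined.
    """
    pairs = [(r["aspects"][aspect], r["overall"])
             for r in reviews if aspect in r["aspects"]]
    if len(pairs) < 2:
        return 0.0
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # opinion (or rating) never varies, so no measurable link
    return cov / (var_x * var_y) ** 0.5


def combined_ranking(reviews):
    """Ad hoc score: frequency times absolute influence (not the paper's model)."""
    counts = aspect_counts(reviews)
    scores = {a: counts[a] * abs(opinion_influence(reviews, a)) for a in counts}
    return sorted(scores, key=scores.get, reverse=True)


print("frequency only:", frequency_ranking(reviews))
print("frequency x influence:", combined_ranking(reviews))
```

In this toy data, “signal” is the most frequently mentioned aspect, yet it is criticized in every review regardless of the overall rating, so its estimated influence is zero and it drops to the bottom once influence is taken into account.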

References:

  1. Bezdek.J.C et al, (2003) “Convergence of alternating optimization,” J. Neural Parallel Scientific Comput., vol. 11, no. 4, pp. 351–368.
  2. Ding.X et al, (2008) “A holistic lexicon-based approach to opinion mining,” in Proc. WSDM, New York, NY, USA, pp. 231–240.
  3. Gupta.V et al, (2010) “A survey of text summarization extractive techniques,” J. Emerg. Technol. Web Intell., vol. 2, no. 3, pp. 258–268.
  4. Ghose.A et al, (2010) “Estimating the helpfulness and economic impact of product reviews: Mining text and reviewer characteristics,” IEEE Trans. Knowl. Data Eng., vol. 23, no. 10, pp. 1498–1512.
  5. Paltoglou.G et al, (2010) “A study of information retrieval weighting schemes for sentiment analysis,” in Proc. 48th Annu. Meeting ACL, Uppsala, Sweden, pp. 1386–1395.
  6. Pang.B et al, (2002) “Thumbs up? Sentiment classification using machine learning techniques,” in Proc. EMNLP, Philadelphia, PA, USA, pp. 79–86.
  7. Pang.B et al, (2004) “A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts techniques,” in Proc. ACL, Barcelona, Spain, pp. 271–278.
  8. Pang.B et al, (2008) “Opinion mining and sentiment analysis,” in Found. Trends Inform. Retrieval, vol. 2, no. 1–2, pp. 1–135.
  9. Snyder.B et al, (2007) “Multiple aspect ranking using the good grief algorithm,” in Proc. HLT-NAACL, New York, NY, USA, pp. 300–307.
  10. Zha.Z.-J et al, (2014) “Product aspect Ranking and its Application”.