Institute of Information Science, Academia Sinica

Academic Seminar

CONCEPT-LEVEL SENTIMENT ANALYSIS

  • Speaker: Prof. Erik Cambria (Assistant Professor at Nanyang Technological University)
    Host: 許聞廉
  • Time: 2014-05-12 (Mon.) 10:00 ~ 12:00
  • Venue: Auditorium 106, New Building, Institute of Information Science
Abstract

As the Web rapidly evolves, Web users are evolving with it. In an era of social connectedness, people are becoming increasingly enthusiastic about interacting, sharing, and collaborating through social networks, online communities, blogs, Wikis, and other online collaborative media. In recent years, this collective intelligence has spread to many different areas, with particular focus on fields related to everyday life such as commerce, tourism, education, and health, causing the size of the Social Web to expand exponentially.

Distilling knowledge from such a large amount of unstructured information, however, is extremely difficult: the contents of today’s Web are perfectly suitable for human consumption but remain hardly accessible to machines. The opportunity to capture the opinions of the general public about social events, political movements, company strategies, marketing campaigns, and product preferences has raised growing interest both within the scientific community, where it has led to many exciting open challenges, and in the business world, because of the remarkable benefits for marketing and financial market prediction.

Mining opinions and sentiments from natural language is equally hard, as it requires a deep understanding of most of the explicit and implicit, regular and irregular, syntactical and semantic rules proper to a language. Existing approaches mainly rely on parts of text in which opinions and sentiments are explicitly expressed, such as polarity terms, affect words, and their co-occurrence frequencies. However, opinions and sentiments are often conveyed implicitly through latent semantics, which makes purely syntactical approaches ineffective.

Concept-level sentiment analysis focuses on a semantic analysis of text through the use of web ontologies or semantic networks, which allow the aggregation of conceptual and affective information associated with natural language opinions. By relying on external knowledge, such approaches step away from the blind use of keywords and word co-occurrence counts and rely instead on the implicit features associated with natural language concepts. Unlike purely syntactical techniques, concept-based approaches can also detect sentiments that are expressed in a subtle manner, e.g., through the analysis of concepts that do not explicitly convey any emotion but are implicitly linked to other concepts that do. The bag-of-concepts model can represent the semantics associated with natural language much better than the bag-of-words model. In the bag-of-words model, a concept such as 'cloud computing' would be split into two separate words, disrupting the semantics of the input sentence (in which, for example, the word 'cloud' could wrongly activate concepts related to 'weather').
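
The contrast between the two models can be illustrated with a minimal Python sketch. It is not the speaker's implementation: the `CONCEPTS` set is a hypothetical stand-in for the multi-word concepts a real system would draw from a semantic network or ontology, and whitespace tokenization with greedy longest matching is an assumption made for brevity.

```python
from collections import Counter

# Hypothetical mini-lexicon of multi-word concepts (an assumption for this
# sketch); a real system would obtain these from a semantic network.
CONCEPTS = {("cloud", "computing"), ("touch", "screen")}
MAX_CONCEPT_LEN = max(len(c) for c in CONCEPTS)


def bag_of_words(text):
    """Count individual lowercase tokens."""
    return Counter(text.lower().split())


def bag_of_concepts(text):
    """Greedy longest match: keep known multi-word concepts as single units."""
    tokens = text.lower().split()
    bag = Counter()
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, down to two tokens.
        for n in range(min(MAX_CONCEPT_LEN, len(tokens) - i), 1, -1):
            span = tuple(tokens[i:i + n])
            if span in CONCEPTS:
                bag[" ".join(span)] += 1
                i += n
                break
        else:
            # No multi-word concept starts here; keep the single token.
            bag[tokens[i]] += 1
            i += 1
    return bag


sentence = "cloud computing lowers costs"
print(bag_of_words(sentence))     # 'cloud' and 'computing' counted separately
print(bag_of_concepts(sentence))  # 'cloud computing' kept as one concept
```

In the bag-of-words output the isolated token 'cloud' is free to match weather-related entries, whereas the bag-of-concepts output never exposes it on its own.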

Analysis at the concept level allows for the inference of semantic and affective information associated with natural language text and, hence, enables comparative, fine-grained, feature-based sentiment analysis. Rather than gathering isolated opinions about a whole item (e.g., iPhone5), users are generally more interested in comparing different products according to specific features (e.g., iPhone5’s vs Galaxy S3’s touchscreen), or even sub-features (e.g., fragility of iPhone5’s vs Galaxy S3’s touchscreen). In this context, the construction of comprehensive common and common-sense knowledge bases is key for feature-spotting and polarity detection, respectively. Common sense, in particular, is necessary to properly deconstruct natural language text into sentiments: for example, to appraise the concept 'small room' as negative in a hotel review and 'small queue' as positive in a patient opinion, or the concept 'go read the book' as positive in a book review but negative in a movie review.
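
The domain dependence of polarity described above can be sketched as a lookup keyed on both the concept and the review domain. This is only an illustration under stated assumptions: the `POLARITY` table and the `appraise` helper are hypothetical, and the entries simply restate the four examples from the abstract rather than values from any real common-sense knowledge base.

```python
# Hypothetical polarity table keyed by (concept, domain). The entries mirror
# the examples in the abstract and are not taken from a real knowledge base.
POLARITY = {
    ("small room", "hotel review"): "negative",
    ("small queue", "patient opinion"): "positive",
    ("go read the book", "book review"): "positive",
    ("go read the book", "movie review"): "negative",
}


def appraise(concept, domain):
    """Return the polarity of a concept in a given domain, or 'unknown'."""
    return POLARITY.get((concept.lower(), domain.lower()), "unknown")


# The same concept receives opposite labels depending on the domain.
print(appraise("go read the book", "book review"))
print(appraise("go read the book", "movie review"))
```

A keyword-only lexicon would have to assign a single polarity to 'small' or 'read', which is why the lookup here is keyed on the whole concept together with its context.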