Institute of Information Science, Academia Sinica

Events

TIGP (SNHCC)--Conquering Cross-source Failure for News Credibility: Learning Generalizable Representations beyond Content Embedding

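  • Speaker: Prof. Yi-Shin Chen (陳宜欣), Department of Computer Science, National Tsing Hua University
    Host: TIGP (SNHCC)
  • Time: 2020-06-15 (Mon.) 14:00 ~ 16:00
  • Location: Webex Meeting ID: 581 250 658 / Password: sfBbyVa429f
Abstract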

Meeting Room Link: https://meetingsapac3.webex.com/meetingsapac3/j.php?MTID=m84fe877b512673837438f73ceaa0d414

Log in after 1:30 PM, June 15. The lecture will start at 2:00 PM. 

False information on the Internet has caused severe damage to society. Researchers have proposed methods to determine the credibility of news and have obtained good results. Because different media sources (publishers) have different content generators (writers) and may focus on different topics or aspects, the word and topic distribution of each media source diverges from that of the others. We expose a challenge to the generalizability of existing content-based methods: they fail to perform consistently when applied to news from media sources absent from the training set, a problem we call cross-source failure. A cross-source setting can reduce the accuracy of current methods by more than 15 to 19%; content-sensitive features are considered one of the major causes of cross-source failure in content-based approaches. To overcome this challenge, we propose a syntactic network for news credibility (SYNC), which focuses on function words and syntactic structure to learn generalizable representations for news credibility and to reinforce cross-source robustness across different media. Experiments with cross-validation on 194 real-world media sources showed that the proposed method learns generalizable features and outperforms state-of-the-art methods on unseen media sources. Extensive analysis of the learned embedding representations demonstrates the strength of the proposed method compared with current content-embedding approaches. We envision that SYNC, on account of its good generalizability, will be more robust in real-life applications.
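The cross-source setting described in the abstract can be illustrated with a small sketch: whole media sources are held out, so no source in the test fold appears in training. This is only an illustration under stated assumptions, not the evaluation code used in the talk; the function name `cross_source_folds`, the `(source, text, label)` tuple layout, the round-robin fold assignment, and the toy source names are all hypothetical.

```python
from collections import defaultdict


def cross_source_folds(articles, n_folds):
    """Yield (train, test) lists such that no media source appears in both.

    `articles` is a list of (source, text, label) tuples. This layout is an
    assumption made for illustration, not the dataset format of the talk.
    """
    by_source = defaultdict(list)
    for article in articles:
        by_source[article[0]].append(article)

    # Assign whole sources to folds round-robin, in sorted order so the
    # split is deterministic.
    sources = sorted(by_source)
    fold_of = {src: i % n_folds for i, src in enumerate(sources)}

    for fold in range(n_folds):
        train = [a for s in sources if fold_of[s] != fold for a in by_source[s]]
        test = [a for s in sources if fold_of[s] == fold for a in by_source[s]]
        yield train, test


# Toy data with hypothetical source names and labels.
articles = [
    ("source_a", "text 1", 1),
    ("source_a", "text 2", 1),
    ("source_b", "text 3", 0),
    ("source_c", "text 4", 1),
    ("source_d", "text 5", 0),
]

for train, test in cross_source_folds(articles, n_folds=2):
    train_sources = {a[0] for a in train}
    test_sources = {a[0] for a in test}
    # Every test article comes from a source the model never saw in training.
    assert train_sources.isdisjoint(test_sources)
```

Splitting by source rather than by article is what separates this setting from ordinary cross-validation, where articles from the same publisher can land on both sides of the split.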