[Most-ai-contest] Multi-spans extraction v18_b20
kysu
kysu at iis.sinica.edu.tw
Thu Apr 2 11:51:13 CST 2020
I have checked the attached error cases. Possible reasons/solutions are listed as follows for your reference. I would like to discuss those issues with Prof. Menphis Chen first.
Prof. Chen: Could you please suggest a time slot for the discussion that is convenient for you? I will call you via Skype to give a detailed explanation. Thanks.
KY
------------------------------------------------------
Training-Set Errors:
1. “問題:莎士比亚的喜剧作品有哪些?(在文本中)”, “問題:莎士比亚的悲剧作品有哪些?(在文本中)”, “問題:莎士比亚的悲喜剧作品有哪些?(在文本中)”, page 7: “Angle bracket pair” formatting information and external knowledge about the names of those scripts should be helpful.
2. “問題:试举出文本中五项作为生质燃料原料的粮食作物”, page 9: “Sentence Scope” information should be helpful.
3. “問題:生物医学所使用的奈米结晶金属是哪两种?(在本文中)”, page 10: The answer tends to be a complete NP constituent.
4. “問題:有哪些大众运输工具可以到达 2019 年台湾宝可梦活动会场呢?”, page 13: External knowledge about the NE “中和新芦线” should be helpful.
5. “問題:根据文本的数据,男性癌症前三名是哪三种?”, page 14: “Word-Matching Indicator” (such as 男性 and 癌症) should be helpful.
6. “問題:这起事故的肇事者所使用的交通工具为何?”, “問題:这起事故的肇事者其姓氏分别为何?”, page 15: “Sentence Scope” information should be helpful.
7. “問題:三级座舱设计的波音 747 客机的舱等是哪三个?”, page 15: Benchmark error.
8. “問題:澎湖旅行团总共安排了几个景点?”, “問題:澎湖旅行团安排了哪五个景点?”, page 17: Need “Item formatting information” (items can be easily identified from the visual layout, but not by NLP alone).
9. “問題:考古学家从遗址发掘中发现了人类会用什么工具来捕捞溪中的鱼蟹?”, page 18: Benchmark error.
10. “問題:山羊老师教大家哪两种类型的垃圾需要压扁或踩扁后回收丢到资源回收筒里?”, page 19: “Sentence Scope” information should be helpful.
11. “問題:目前在台湾出现的肠病毒有哪几型?”, page 21: Adopted rules could be modified to cover this case.
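For reference, a minimal sketch of what a “Word-Matching Indicator” combined with “Sentence Scope” could look like. This is only an illustration, not the feature used in MSPE: the function names are hypothetical, and the two example sentences are invented for the demo and are not taken from the data set.

```python
def word_matching_indicator(question_terms, sentences):
    """For each sentence, count how many of the question terms it contains."""
    return [sum(term in sent for term in question_terms) for sent in sentences]


def best_scope(question_terms, sentences):
    """Index of the sentence matching the most question terms (first one on ties)."""
    scores = word_matching_indicator(question_terms, sentences)
    return max(range(len(sentences)), key=scores.__getitem__)


# Invented example sentences (NOT from the benchmark passages).
sentences = [
    "女性癌症前三名是A、B、C。",
    "男性癌症前三名是D、E、F。",
]
terms = ["男性", "癌症"]  # the indicator words mentioned in item 5

scores = word_matching_indicator(terms, sentences)
scope = best_scope(terms, sentences)
```

Restricting answer extraction to the sentence selected by `best_scope` is one possible way to keep spans from a neighbouring, non-matching sentence out of the prediction.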
Test-Set Errors:
1. “問題: 在「乌台诗案」中,有哪些人出面救了苏东坡?”, page 22: Adopted rules could be modified to cover this case.
2. “問題: 高雄市被列为国定古迹的建筑物是哪些?”, page 23: Adopted rules could be modified to cover this case.
3. “問題: 屏东县双层巴士观光服务行经路线停靠哪些景点?”, page 23: Adopted rules could be modified to cover this case.
4. “問題: 全民健保实施之前的健康保险有哪些?”, page 24: Adopted rules could be modified to cover this case.
5. “問題: 癌症分期是根据哪些原则来判断?”, page 24: Need “Item formatting information” (items can be easily identified from the visual layout, but not by NLP alone).
6. “問題: 小阿姨说,等春天来临时,哪些花儿会相继绽放,那十的植物元变成了万紫千红的锦绣世界?”, page 25: Adopted rules could be modified to cover this case (e.g., matching more Q-strings).
Development-Set Errors:
1. “問題: 「阿拉伯之春」运动发生在哪些区域?”, page 28: Adopted rules could be modified to cover this case.
2. “問題: 鲸鲨曾经在哪些海域出现过?”, page 30: Adopted rules could be modified to cover this case.
3. “問題: 菲律宾的鲸鲨保育政策为何?”, page 30: Adopted rules could be modified to cover this case (e.g., matching more Q-strings).
4. “問題: 开车出门前要记得做哪些车辆检查以减少交通事故发生?”, page 31: Adopted rules could be modified to cover this case.
5. “問題: 世界上最大的教堂有哪些建筑师与艺术家的设计?”, page 32: Adopted rules could be modified to cover this case.
6. “問題: 世界上最大的教堂内部有哪些艺术家的大作?(本文中)”, page 32: Benchmark error.
7. “問題: 内政部消防署提出防范纵火方法「三从四得」,请问是哪「三从」 ?”, “問題: 内政部消防署提出防范纵火方法「三从四得」,请问是哪「四得」 ?”, page 32: Need “Item formatting information”.
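As a rough idea of how “Item formatting information” might be recovered when the passage keeps its line breaks, the sketch below collects lines that start with a list marker. The marker set (“1.”, “1、”, “(1)”, “一、”) is an assumption on my side, and `extract_items` is a hypothetical helper, not part of MSPE.

```python
import re

# Assumed list markers at the start of a line: "(1)", "1." / "1、", or "一、".
ITEM_MARKER = re.compile(
    r"^\s*(?:\(\d+\)|\d+[.、]|[一二三四五六七八九十]+、)\s*(.+?)\s*$"
)


def extract_items(text):
    """Return the content of every line that begins with a list marker."""
    items = []
    for line in text.splitlines():
        match = ITEM_MARKER.match(line)
        if match:
            items.append(match.group(1))
    return items


# Invented passage layout, for illustration only.
demo = "景点如下:\n1. A\n2. B\n(3) C\n一、D\n说明文字"
demo_items = extract_items(demo)
```

If the line breaks are lost during preprocessing, this kind of cue is no longer available, which is consistent with the remark that the items are easy to see visually but hard to identify from the text alone.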
------------------------------------------------
From: most-ai-contest-bounces at iis.sinica.edu.tw [mailto:most-ai-contest-bounces at iis.sinica.edu.tw] On Behalf Of 闍怵羅
Sent: Wednesday, April 1, 2020 6:31 PM
To: Most-ai Contest <Most-ai-contest at iis.sinica.edu.tw>
Subject: [Most-ai-contest] Multi-spans extraction v18_b20
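Hi all,
Sorry it took so long to post this update.
The attachment contains two files:
MSPE_v18_branchy20.py <when integrating, simply overwrite the previous file>
MSPE_error.pdf <contains the current overall status, plus the passage, question, prediction, and answer for every question that was answered incorrectly.>
<Because the passages are included, the page count is very high; the contents have been organized.>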
=================
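The current status of MSPE is as follows:
1. There is now a version that extracts answers using the gold support evidence, but it makes almost no difference in performance.
The reason is that the sentences selected by the original ensemble of 5 models already include almost all of the gold support evidence sentences.
In other words, extracting answers from each gold support evidence sentence yields almost the same answers as the original models produce.
2. Extracting the relevant answers directly from the question is still being written, but the questions it could rescue have almost all been solved by the new version of MSPE already, so it may not bring any further improvement.
3. For the questions that are still wrong, almost all are cases where too few answers were extracted.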
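The observation in point 1 can be checked with a small coverage measure. The sketch below is only an assumption about how such a check might be written: `evidence_coverage` and its input layout (sentence ids per question, one set per model) are hypothetical and do not come from MSPE_v18_branchy20.py.

```python
def evidence_coverage(gold, predicted):
    """Fraction of questions whose gold evidence is fully covered by the ensemble.

    gold:      dict mapping question id -> set of gold evidence sentence ids
    predicted: dict mapping question id -> list of sets, one set of selected
               sentence ids per model in the ensemble
    """
    if not gold:
        return 0.0
    covered = 0
    for qid, gold_ids in gold.items():
        # Union of the sentences selected by all models for this question.
        selected = set().union(*predicted.get(qid, []))
        if gold_ids <= selected:
            covered += 1
    return covered / len(gold)


# Toy ids, invented for illustration.
gold = {"q1": {1, 2}, "q2": {3}}
predicted = {"q1": [{1}, {2, 5}], "q2": [{4}]}
coverage = evidence_coverage(gold, predicted)
```

A value close to 1.0 on the real data would be consistent with the finding that switching to gold support evidence changes the extracted answers very little.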