[Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm)

范正忠 jjfan at iis.sinica.edu.tw
Thu Apr 2 11:19:12 CST 2020


Dear all, 

Please find DuReader_6000 with support_evidence distribution. 

jjfan 


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Thursday, April 2, 2020 7:27:10 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Please find DRCD_1.7.12 with support_evidence distribution. 

jjfan 


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Wednesday, April 1, 2020 4:46:00 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Please find FGC_release_1.7.12, with refined correct answers and one additional piece of information, the support_evidence distribution.

SHINT is changed to a tuple with 
[0]: supporting sentence index 
[1]: supporting evidence score of each sentence in the passage 
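
A minimal sketch of how an answer module might consume the new SHINT tuple. It assumes element [0] is a list of supporting sentence indices and element [1] holds one score per passage sentence; the function name top_evidence and the sample values are hypothetical and not part of the release.

```python
def top_evidence(shint, k=2):
    """Return the k sentence indices with the highest evidence score.

    Assumes shint == (supporting_indices, scores), where scores has one
    entry per sentence in the passage (layout assumed, not confirmed).
    A list loaded from JSON unpacks the same way as a tuple.
    """
    supporting_indices, scores = shint
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical SHINT value for a four-sentence passage.
example_shint = ([1, 3], [0.05, 0.81, 0.10, 0.62])
print(top_evidence(example_shint))
```

Ranking by score rather than trusting element [0] alone lets a module widen or narrow the evidence window with k.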

AHINT: {NUMBER: 'X'} is added to predict how many items are in the answer: 
1 means one item, 
2 means two items, ... 
N means multiple items, exact count unknown 
X means unknown 
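
The NUMBER codes above could be decoded along these lines. This is only a sketch: the function name parse_number_hint and the returned labels are hypothetical, and it assumes the value may arrive either as a string or as an integer.

```python
def parse_number_hint(value):
    """Map an AHINT NUMBER value to a (kind, count) pair.

    '1', '2', ... -> ("exact", n)
    'N'           -> ("multiple", None)  # several items, count unknown
    'X'           -> ("unknown", None)
    """
    value = str(value)  # assumption: digits may be stored as int or str
    if value == "N":
        return ("multiple", None)
    if value == "X":
        return ("unknown", None)
    if value.isdecimal() and int(value) >= 1:
        return ("exact", int(value))
    raise ValueError(f"unexpected NUMBER hint: {value!r}")


print(parse_number_hint("2"))
print(parse_number_hint("N"))
```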

The owner of each answer module may want to consider how to use the new features above. 

https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

Best, 
jjfan 



From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Friday, March 27, 2020 5:23:18 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Please find FGC_release_1.7.11 with correct atoken position index. 
https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

Best, 
jjfan 



From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Thursday, March 26, 2020 11:46:55 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 
Please find FGC_release_1.7.10 with golden answer corrections. 
https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 
Best, 
jjfan 


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Monday, March 23, 2020 9:41:09 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Please find FGC_release_1.7.9 & DRCD_1.7.9 with the last version of NER (Chiao-Wei version) 
https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

Best, 
jjfan 



From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Wednesday, March 18, 2020 2:27:45 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Please find DRCD_1.7.8, i.e. DRCD with NER (Chiao-Wei version). 

https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

Best, 
jjfan 


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Monday, March 16, 2020 3:07:34 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all 

Please find FGC_release_1.7.8 with improved NE extraction. 
There are 56,732 NEs in total (across the train, dev, and test data sets), compared with 52,996 NEs in 1.7.7 (a 7% increase). 
https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

Best, 
jjfan 

From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Thursday, March 12, 2020 8:56:59 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Please use this version, FGC_release_1.7.7, which adjusts the passage of D097: 
https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

Best, 
jjfan 


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Wednesday, March 11, 2020 6:53:23 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Update FGC_release_1.7.6 & DRCD dataset ( https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK ) with 

1. a refinement of the sentence-splitting algorithm, which solves the embedded-quotation issue that occurred in D033: 

"《美国-墨西哥-加拿大协议》United States-Mexico-Canada Agreement, (简称《美墨加协议》USMCA;西班牙语:Tratado entre México, Estados Unidos y Canadá,T-MEC;法语:Accord Canada–États-Unis–Mexique,ACEUM)是加拿大、墨西哥和美国之间的一项自由贸易协定," 

2. a fix for the NER character index issue that occurs when a sentence begins with a space character, as in D087: 

" 记者 宋玲 报导 以往在台北高雄的观光重镇才会看得到的双层巴士 今天在屏东亮向 顶层开阔的空间 游客可以边吹风边看风景 下层则备有冷气 车上将由有导览人员提供解说服务。" 

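The two fixes above can be illustrated with a rough sketch. This is not the release's actual algorithm: the terminator set, the bracket pairs, and the function names split_sentences and to_original_index are all assumptions made for illustration. The first function refuses to split inside brackets or quotation marks (the D033 case); the second maps an index computed on a left-stripped sentence back to the original sentence (the D087 case).

```python
# Assumed bracket/quote pairs and sentence terminators (not from the release).
OPEN_TO_CLOSE = {"(": ")", "(": ")", "《": "》", "「": "」", "“": "”"}
CLOSERS = set(OPEN_TO_CLOSE.values())
TERMINATORS = set("。!?;")


def split_sentences(text):
    """Split on terminators, but never inside brackets or quotation marks."""
    sentences, start, depth = [], 0, 0
    for i, ch in enumerate(text):
        if ch in OPEN_TO_CLOSE:
            depth += 1
        elif ch in CLOSERS:
            depth = max(depth - 1, 0)
        elif ch in TERMINATORS and depth == 0:
            sentences.append(text[start:i + 1])
            start = i + 1
    if start < len(text):
        sentences.append(text[start:])  # trailing text without a terminator
    return sentences


def to_original_index(index_in_stripped, sentence):
    """Shift a character index from sentence.lstrip() back to sentence."""
    leading = len(sentence) - len(sentence.lstrip())
    return index_in_stripped + leading


print(split_sentences("甲(乙;丙)丁。戊。"))
print(to_original_index(3, " 记者 宋玲"))
```

A single depth counter is enough for this sketch; a stricter version would check that each closing mark matches the most recent opening mark.
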
Best, 
jjfan 



From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Friday, March 6, 2020 8:12:52 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Update FGC_release_1.7.5 & DRCD dataset 
https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

The modifications: 
1. Information Extraction (NER, TOKEN, POS) is moved from DIE or QIE into the corresponding sentences, under the IE keyword, because we found that Stanford CoreNLP fails when the given paragraph is longer than 653 characters. 

2. The DRCD datasets (DRCD, ASR, Kaggle, Lee) are all provided with the IE keyword. 
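
As a rough illustration of item 1, annotation can be run sentence by sentence so that no single call exceeds the reported length limit. The record layout, the function name attach_sentence_ie, and the stand-in annotator fake_annotate are hypothetical; only the IE keyword and the 653-character figure come from this message.

```python
MAX_CHARS = 653  # the source reports failures above this length


def attach_sentence_ie(sentences, annotate):
    """Annotate each sentence on its own and store the result under "IE".

    `annotate` is a hypothetical callable (text -> dict) standing in for the
    real NER/TOKEN/POS pipeline; the record layout is an assumption.
    """
    records = []
    for text in sentences:
        if len(text) > MAX_CHARS:
            raise ValueError(f"sentence still exceeds {MAX_CHARS} characters")
        records.append({"text": text, "IE": annotate(text)})
    return records


def fake_annotate(text):
    """Toy annotator: whitespace tokens only, empty NER and POS lists."""
    return {"TOKEN": text.split(), "NER": [], "POS": []}


print(attach_sentence_ie(["a b", "c"], fake_annotate))
```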

Best, 
jjfan 



From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Monday, March 2, 2020 8:20:55 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Update FGC_release_1.7.4 with corrections of wrong answer-type and answer-mode 

https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

In order to improve performance we let more answer-modules participate in answering questions. 
1. For those questions with answer-type 'date-duration', we add 'date-duration' to their answer-mode 
2. For those questions with answer-type 'num-measure', we add 'arithmetic-operations' to their answer-mode 
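
The two rules above amount to a small lookup from answer-type to an extra answer-mode. A sketch, assuming each question carries its answer-type as a string and its answer-modes as a list (the function name widen_answer_modes and the lower-case comparison are illustrative choices, not the release's code):

```python
# Rule table taken from the two items above.
EXTRA_MODE_BY_TYPE = {
    "date-duration": "date-duration",
    "num-measure": "arithmetic-operations",
}


def widen_answer_modes(answer_type, answer_modes):
    """Return the answer modes with the type-implied mode appended if missing."""
    modes = list(answer_modes)
    extra = EXTRA_MODE_BY_TYPE.get(answer_type.lower())
    if extra is not None and extra not in (m.lower() for m in modes):
        modes.append(extra)
    return modes


print(widen_answer_modes("num-measure", ["single-span-extraction"]))
```

Checking for an existing entry keeps the function idempotent, so it can be rerun on an already updated release without duplicating modes.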

The distribution of this newly released data-set is as follows. (Please note that the total count of answer modes exceeds that of answer types, since a question may be assigned more than one answer mode.) 

Train 

Answer Type      Count    Share
YesNo               87    9.94%
Num-Measure         71    8.11%
Kinship             59    6.74%
Person              89   10.17%
Date-Duration      158   18.06%
Location           119   13.60%
Organization        84    9.60%
Object             181   20.69%
Event               14    1.60%
Misc                13    1.49%
Total              875  100.00%

Answer Mode                              Count    Share
YesNo (yes/no question)                     87    7.70%
Multi-Spans-Extraction (enumeration)        87    7.70%
Kinship                                     56    4.96%
Single-Span-Extraction (single answer)     651   57.61%
Date-Duration                              158   13.98%
Arithmetic-Operations                       71    6.28%
Counting                                    11    0.97%
Comparing-Members                            9    0.80%
Common-Sense                                 0    0.00%
Total                                     1130  100.00%

Dev 

Answer Type      Count    Share
YesNo               28   11.57%
Num-Measure         27   11.16%
Kinship             30   12.40%
Person              27   11.16%
Date-Duration       26   10.74%
Location            30   12.40%
Organization        19    7.85%
Object              48   19.83%
Event                2    0.83%
Misc                 5    2.07%
Total              242  100.00%

Answer Mode                              Count    Share
YesNo (yes/no question)                     28    8.78%
Multi-Spans-Extraction (enumeration)        31    9.72%
Kinship                                     26    8.15%
Single-Span-Extraction (single answer)     179   56.11%
Date-Duration                               26    8.15%
Arithmetic-Operations                       27    8.46%
Counting                                     2    0.63%
Comparing-Members                            0    0.00%
Common-Sense                                 0    0.00%
Total                                      319  100.00%

Test 

Answer Type      Count    Share
YesNo               25   13.16%
Num-Measure         19   10.00%
Kinship             12    6.32%
Person              14    7.37%
Date-Duration       31   16.32%
Location            27   14.21%
Organization        20   10.53%
Object              34   17.89%
Event                4    2.11%
Misc                 4    2.11%
Total              190  100.00%

Answer Mode                              Count    Share
YesNo (yes/no question)                     25   10.20%
Multi-Spans-Extraction (enumeration)        26   10.61%
Kinship                                      6    2.45%
Single-Span-Extraction (single answer)     134   54.69%
Date-Duration                               31   12.65%
Arithmetic-Operations                       20    8.16%
Counting                                     3    1.22%
Comparing-Members                            0    0.00%
Common-Sense                                 0    0.00%
Total                                      245  100.00%
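
Distributions like the ones above can be recomputed with two counters, which also shows why the answer-mode total is larger than the answer-type total (875 vs 1130 in Train): each question contributes one type but possibly several modes. The field names answer_type and answer_modes and the three sample questions are hypothetical.

```python
from collections import Counter


def distribution(questions):
    """Count answer types (one per question) and answer modes (one or more)."""
    type_counts = Counter(q["answer_type"] for q in questions)
    mode_counts = Counter(m for q in questions for m in q["answer_modes"])
    return type_counts, mode_counts


# Hypothetical sample: three questions, five mode assignments.
sample = [
    {"answer_type": "Num-Measure",
     "answer_modes": ["Single-Span-Extraction", "Arithmetic-Operations"]},
    {"answer_type": "Date-Duration",
     "answer_modes": ["Single-Span-Extraction", "Date-Duration"]},
    {"answer_type": "YesNo", "answer_modes": ["YesNo"]},
]
type_counts, mode_counts = distribution(sample)
print(sum(type_counts.values()), sum(mode_counts.values()))
```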






From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Wednesday, February 26, 2020 2:00:09 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear All, 

Update DROP dataset for date-duration & arithmetic-operation mode (DROP.7z) 

https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

Best, 
jjfan 


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Friday, February 21, 2020 12:45:00 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear All, 

Enclosed please find FGC_release_1.7.3 version with additional information 

1. For each passage & question, (NER, CORF, RELATION, TOKEN, POS) are extracted for your training requirements. 
2. FGC_release_ss_test.json contains 132 questions, all with the 'Single-Span-Extraction' answer-mode, for use as your benchmark dataset. 

Best, 
jjfan 


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Friday, February 21, 2020 9:46:23 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear All, 

Updated news (PBS, MOTI, CDC, CNA) dataset 

https://drive.google.com/drive/folders/1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

jjfan 


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Friday, February 14, 2020 2:36:03 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear All, 

Updated FGC_release_1.7.2 with correction of answer_mode & answer_type 

https://drive.google.com/open?id=1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

jjfan 



From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Monday, February 10, 2020 12:36:32 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear All, 

Updated FGC_release_1.7.1 with refinement 

1. Added SHINT:[] to questions with QTYPE:"申論" (essay questions) 
2. The output is written with json.dump(output_dict, out_fh, indent=4, ensure_ascii=False) for easier viewing 
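
To see what the json.dump arguments in item 2 change, the short comparison below writes the same dictionary twice. The dictionary contents are a made-up example and in-memory buffers stand in for out_fh. With the default ensure_ascii=True, Chinese characters are written as \u escapes; with ensure_ascii=False they stay readable, and indent=4 pretty-prints the structure.

```python
import io
import json

# Hypothetical record; only the two key names come from the release notes.
output_dict = {"QTYPE": "申論", "SHINT": []}

escaped_fh = io.StringIO()
json.dump(output_dict, escaped_fh, indent=4)  # default ensure_ascii=True

readable_fh = io.StringIO()
json.dump(output_dict, readable_fh, indent=4, ensure_ascii=False)

print(escaped_fh.getvalue())
print(readable_fh.getvalue())
```

When writing to a real file with ensure_ascii=False, open it with encoding="utf-8" so the result does not depend on the platform default.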

Best, 
jjfan 


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Wednesday, February 5, 2020 11:31:21 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Please refer to the following link for FGC_release_1.7 & DuReader dataset 

https://drive.google.com/open?id=1y0UCb-n1YKlKUsQk2GJz6iKJqQFJ25rK 

jjfan 

From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Wednesday, February 5, 2020 8:23:10 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

20 domains 



From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Wednesday, February 5, 2020 7:51:36 AM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

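Correction: 

1. DM 19 domains assigned person: 廖沛俊, 謝尊安, 蔡仕竑, 梁朝鈞 
1.1 By 2/7, provide the 'order' item details for each domain (based on the CSV and dialog scripts provided by the contest organizers, related online ordering systems, and personal experience); for the format, refer to the GIST format in the ppt 
1.2 By 2/14, based on the 'order' item details, produce system bot question texts and user bot answer statements (& corresponding regular expressions); for details, please consult Dr. Lin 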


From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Tuesday, February 4, 2020 5:11:28 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

2.2 謝尊安 & Smolka co-work 
is corrected to: 
2.2 郭家銍 & Smolka co-work 



From: "范正忠" <jjfan at iis.sinica.edu.tw> 
To: "Balancy" <balancy at iis.sinica.edu.tw> 
Cc: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Tuesday, February 4, 2020 5:07:54 PM 
Subject: Re: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

Enclosed please find 
1. Dr. Lin's DM presentation slides and 
2. The last Dialog dataset (with CSV, output order items, and dialog examples) 
3. The related schedule is as follows: 



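Thank you all for participating. The conclusions of today's meeting are as follows: 

1. DM 19 domains assigned person: 廖沛俊, 謝尊安, 蔡仕竑, 梁朝鈞 
1.1 By 2/7, provide the 'order' item details for each domain (based on the CSV and dialog scripts provided by the contest organizers, related online ordering systems, and personal experience); for the format, refer to the GIST format in the ppt 
1.2 By 2/4, based on the 'order' item details, produce system bot question texts and user bot answer statements (& corresponding regular expressions); for details, please consult Dr. Lin 

2. Essay questions: 
2.1 Smolka will first implement the 'Query-focused' model for Chinese 
2.2 謝尊安 & Smolka co-work 

3. The 1.7 ver. Train/Dev/Test dataset will be adjusted so that the same passage does not appear in more than one of the Train/Dev/Test sets; it will be released once this is done 

4. When a module's improvements are complete, please send it to me so that I can run the system's e2e performance test 

5. The NN library is tentatively PyTorch; the relevant owners should agree on a common version 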
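
One way to meet the requirement in item 3 (no passage shared across Train/Dev/Test) is to split by passage rather than by question. This is a sketch under assumptions: the field name passage_id, the 80/10/10 ratios, and the function name split_by_passage are illustrative and do not describe how the 1.7 release was actually produced.

```python
import random


def split_by_passage(questions, ratios=(0.8, 0.1, 0.1), seed=0):
    """Assign whole passages to train/dev/test so that no passage is shared."""
    passage_ids = sorted({q["passage_id"] for q in questions})
    random.Random(seed).shuffle(passage_ids)  # fixed seed: reproducible split
    n_train = int(len(passage_ids) * ratios[0])
    n_dev = int(len(passage_ids) * ratios[1])
    bucket = {}
    for i, pid in enumerate(passage_ids):
        if i < n_train:
            bucket[pid] = "train"
        elif i < n_train + n_dev:
            bucket[pid] = "dev"
        else:
            bucket[pid] = "test"
    splits = {"train": [], "dev": [], "test": []}
    for q in questions:
        splits[bucket[q["passage_id"]]].append(q)
    return splits


# Hypothetical data: 20 passages with 3 questions each.
demo = [{"passage_id": f"D{p:03d}", "qid": f"D{p:03d}Q{n}"}
        for p in range(20) for n in range(3)]
demo_splits = split_by_passage(demo)
print({name: len(items) for name, items in demo_splits.items()})
```

Because questions follow their passage, the question-level ratios only approximate the passage-level ratios when passages have different numbers of questions.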


jjfan 


From: "Balancy" <balancy at iis.sinica.edu.tw> 
To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw> 
Sent: Tuesday, February 4, 2020 8:04:40 AM 
Subject: [Most-ai-contest] 科技大擂台討論會(Today, 12:30-15:00 pm) 

Dear all, 

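The 科技大擂台 discussion meeting is today. The time and place are as follows; please set aside time to attend. 

Time: 2/4 (Tue), 12:30-15:00 
Place: Conference Room 101, New Building 

Notes 
* Today's presenters: please save your presentation files to the desktop of the conference room computer before the meeting. 

Thank you all for your cooperation :) 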

---- 
Best, 
吳佩瑾 Peggy, Pei-Jin Wu 
Assistant, Prof. 蘇克毅's lab 
Administrative Assistant 
Institute of Information Science, Academia Sinica 
Tel: 02-2788-3799 ext . 1453 
Email: balancy at iis.sinica.edu.tw 

_______________________________________________ 
Most-ai-contest mailing list 
Most-ai-contest at iis.sinica.edu.tw 
https://www.iis.sinica.edu.tw/mailman/listinfo/most-ai-contest 