[Most-ai-contest] The performance of the last version FGC QA system (4/4)

Sat Apr 4 11:42:41 CST 2020

Dear Dr. Fan,

Does this mean that if we used only the single-span module for all questions,
we would get a test-set score of 0.759, which would also be the best result so far?

Thanks,
Best, Menphis.

> On April 4, 2020, at 11:25, 范正忠 <jjfan at iis.sinica.edu.tw> wrote:
> 
> 
> “Single-Span” => the performance of the 'single-span-extraction' answer mode, regardless of which answer module gets the highest score.
> 
> “Kuo” => the performance of the 'single-span ensemble' answer module by Kuo & Liao.
> 
> “Simonc” => not activated, since the “Simonc” performance is lower than that of "Kuo". This means that, for the 'single-span-extraction' answer mode, the system activates only the "Kuo" answer module in the current test.
> 
> jjfan
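The activation rule described above (keep only the better-performing module for an answer mode) could be sketched roughly as follows. This is a minimal illustration, not the FGC QA system's code: the function name `pick_active_module` and the score values are hypothetical placeholders, and the thread does not say which split or metric is used for the comparison.

```python
from typing import Mapping


def pick_active_module(scores: Mapping[str, float]) -> str:
    """Return the name of the module with the highest score.

    Hypothetical helper; on a tie, the first module in iteration order wins.
    """
    if not scores:
        raise ValueError("no candidate modules")
    return max(scores, key=lambda name: scores[name])


# Placeholder scores for illustration only, not measured results.
single_span_scores = {"Kuo": 0.75, "Simonc": 0.70}

active = pick_active_module(single_span_scores)
print(active)
```

With these placeholder scores, only "Kuo" would be activated for the single-span-extraction mode.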
> From: "kysu" <kysu at iis.sinica.edu.tw>
> To: "范正忠" <jjfan at iis.sinica.edu.tw>
> Cc: "Simonc" <simonc at iis.sinica.edu.tw>, "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw>, "kysu" <kysu at iis.sinica.edu.tw>
> Sent: Saturday, April 4, 2020 11:16:35 AM
> Subject: RE: [Most-ai-contest] The performance of the last version FGC QA system (4/4)
> 
> What is the difference between the third column “Single-Span” and the fourth column “Kuo”? Please fill in the column of “Simonc” next time for comparison. Thanks.
> 
>  
> 
> KY
> 
>  
> 
> From: 范正忠 [mailto:jjfan at iis.sinica.edu.tw] 
> Sent: Saturday, April 4, 2020 10:57 AM
> To: kysu <kysu at iis.sinica.edu.tw>
> Cc: Simonc <simonc at iis.sinica.edu.tw>; Most-ai Contest <Most-ai-contest at iis.sinica.edu.tw>
> Subject: Re: [Most-ai-contest] The performance of the last version FGC QA system (4/4)
> 
>  
> 
> Sorry! Here is a correction to the performance of the single-span-ensemble module:
> 
>  
> 
> Cells under Single-Span, Kuo, Simonc, Multi-Span, and Date-Duration are train/dev/test accuracy; Train, Dev, and Test are overall accuracy.
> 
> Date  | Modules                         | Single-Span       | Kuo               | Simonc            | Multi-Span        | Date-Duration     | Train | Dev   | Test
> ------+---------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------+-------+------
> Apr 2 | ENSEMBLEModule: 12 models (new) | 0.842/0.712/0.741 | 0.899/0.699/0.741 | 0.771/0.635/0.705 |                   |                   | 0.757 | 0.652 | 0.653
>       | + single_span_multi_hop_v2_1    |
> Apr 2 | ENSEMBLEModule: 12 models (new) | 0.890/0.801/0.750 | 0.899/0.699/0.741 |                   | 0.493/0.476/0.350 | 0.589/0.720/0.581 | 0.786 | 0.709 | 0.658
> Apr 2 | single_span_multi_hop_v2_1      | 0.775/0.692/0.714 |                   | 0.771/0.635/0.705 |                   |                   | 0.714 | 0.64  | 0.637
> Apr 3 | ENSEMBLEModule: 12 models (new) | 0.890/0.801/0.750 | 0.899/0.699/0.741 |                   | 0.548/0.667/0.450 | 0.589/0.720/0.581 | 0.791 | 0.725 | 0.668
>       | MSPE_v18_branchy27              |
>       | date_duration_module_4          |
> Apr 4 | ENSEMBLEModule: 12 models (new) | 0.888/0.801/0.759 | 0.897/0.699/0.750 |                   | 0.548/0.667/0.450 | 0.753/0.720/0.774 | 0.817 | 0.737 | 0.705
>       | MSPE_v18_branchy27              |                   | 0.949/0.837/0.876 |
>       | date_duration_module_4          |
> 
>  
> 
>  
> 
>  
> 
>  
> 
> From: "kysu" <kysu at iis.sinica.edu.tw>
> To: "范正忠" <jjfan at iis.sinica.edu.tw>, "Simonc" <simonc at iis.sinica.edu.tw>
> Cc: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw>
> Sent: Saturday, April 4, 2020 10:48:57 AM
> Subject: RE: [Most-ai-contest] The performance of the last version FGC QA system (4/4)
> 
>  
> 
> Thanks. Glad to know that we have got a new record high performance! Let’s keep pushing the performance.
> 
>  
> 
> KY
> 
>  
> 
> From: most-ai-contest-bounces at iis.sinica.edu.tw [mailto:most-ai-contest-bounces at iis.sinica.edu.tw] On Behalf Of 范正忠
> Sent: Saturday, April 4, 2020 10:24 AM
> To: Simonc <simonc at iis.sinica.edu.tw>
> Cc: Most-ai Contest <Most-ai-contest at iis.sinica.edu.tw>
> Subject: Re: [Most-ai-contest] The performance of the last version FGC QA system (4/4)
> 
>  
> 
> Dear all,
> 
>  
> 
> Enclosed please find the current performance of the FGC QA system.
> 
>  
> 
>  
> 
> Cells under Single-Span, Kuo, Simonc, Multi-Span, and Date-Duration are train/dev/test accuracy; Train, Dev, and Test are overall accuracy.
> 
> Date  | Modules                         | Single-Span       | Kuo               | Simonc            | Multi-Span        | Date-Duration     | Train | Dev   | Test
> ------+---------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------+-------+------
> Apr 2 | ENSEMBLEModule: 12 models (new) | 0.890/0.801/0.750 | 0.899/0.699/0.741 |                   | 0.493/0.476/0.350 | 0.589/0.720/0.581 | 0.786 | 0.709 | 0.658
> Apr 3 | ENSEMBLEModule: 12 models (new) | 0.890/0.801/0.750 | 0.899/0.699/0.741 |                   | 0.548/0.667/0.450 | 0.589/0.720/0.581 | 0.791 | 0.725 | 0.668
>       | MSPE_v18_branchy27              |
>       | date_duration_module_4          |
> Apr 4 | ENSEMBLEModule: 12 models (new) | 0.888/0.801/0.759 | 0.949/0.837/0.876 |                   | 0.548/0.667/0.450 | 0.753/0.720/0.774 | 0.817 | 0.737 | 0.705
>       | MSPE_v18_branchy27              |
>       | date_duration_module_4          |
> 
>  
> 
> Please check the details in the Performance.txt file. The Date-Duration performance is similar to what 郭家銍 reported.
> 
>  
> 
> Best,
> 
> jjfan
> 
>  
> 
>  
> 
> From: "范正忠" <jjfan at iis.sinica.edu.tw>
> To: "Simonc" <simonc at iis.sinica.edu.tw>
> Cc: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw>
> Sent: Friday, April 3, 2020 4:45:39 PM
> Subject: [Most-ai-contest] The performance of the last version FGC QA system
> 
>  
> 
> Dear all,
> 
>  
> 
> Enclosed please find the current performance of the FGC QA system.
> 
>  
> 
> Cells under Single-Span, Kuo, Multi-Span, and Date-Duration are train/dev/test accuracy; Train, Dev, and Test are overall accuracy.
> 
> Date  | Modules                         | Single-Span       | Kuo               | Multi-Span        | Date-Duration     | Train | Dev   | Test
> ------+---------------------------------+-------------------+-------------------+-------------------+-------------------+-------+-------+------
> Apr 2 | ENSEMBLEModule: 12 models (new) | 0.890/0.801/0.750 | 0.899/0.699/0.741 | 0.493/0.476/0.350 | 0.589/0.720/0.581 | 0.786 | 0.709 | 0.658
> Apr 3 | ENSEMBLEModule: 12 models (new) | 0.890/0.801/0.750 | 0.899/0.699/0.741 | 0.548/0.667/0.450 | 0.589/0.720/0.581 | 0.791 | 0.725 | 0.668
>       | MSPE_v18_branchy27              |
>       | date_duration_module_4          |
> 
>  
> 
> 郭家銍, please help check Date-Duration ver. 4.
> 
>  
> 
> Best,
> 
> jjfan
> 
>  
> 
>  
> 
> From: "范正忠" <jjfan at iis.sinica.edu.tw>
> To: "Simonc" <simonc at iis.sinica.edu.tw>
> Cc: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw>
> Sent: Thursday, April 2, 2020 3:56:36 PM
> Subject: Re: [Most-ai-contest] Updated Performance Data with Gold Answer Inclusion Rate
> 
>  
> 
> Dear all,
> 
>  
> 
> Enclosed please find the performance of the latest code:
> 
>  
> 
> ENSEMBLEModule: 12 models (new), single_span_multi_hop_v2_1
> 
> Multi-Spans: MSPE_v18_branchy20
> 
> date_duration_module_3
> 
> arithmetic_module_3
> 
> kinship_module6
> 
>  
> 
> Overall:
> 
> train -> total: 882, correct: 668, accuracy: 0.757
> 
> dev -> total: 247, correct: 161, accuracy: 0.652
> 
> test -> total: 193, correct: 126, accuracy: 0.653
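The overall figures above are consistent with accuracy = correct / total, rounded to three decimals. A minimal Python sketch of that check follows; the function name `accuracy` and the dictionary layout are illustrative assumptions, not taken from the FGC QA code.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of correctly answered questions, rounded to 3 decimals.

    Hypothetical helper, not part of the FGC QA system.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return round(correct / total, 3)


# (total, correct) pairs copied from the report above.
overall = {"train": (882, 668), "dev": (247, 161), "test": (193, 126)}

for split, (total, correct) in overall.items():
    print(f"{split}: {accuracy(correct, total):.3f}")
```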
> 
>  
> 
> Single-Span:
> 
> train -> portion:0.619, count:546, errors: 86, accuracy: 0.842
> 
>     ensemble only -> portion:0.619, count:546, errors: 37, accuracy: 0.932
> 
>     multi-hop only -> portion:0.619, count:546, errors: 76, accuracy: 0.861
> 
>  
> 
> dev -> portion:0.632, count:156, errors: 45, accuracy: 0.712
> 
>      ensemble only -> portion:0.632, count:156, errors: 37, accuracy: 0.763
> 
>      multi-hop only -> portion:0.632, count:156, errors: 43, accuracy: 0.724
> 
> test -> portion:0.580, count:112, errors: 29, accuracy: 0.741
> 
>      ensemble only -> portion:0.580, count:112, errors: 23, accuracy: 0.795
> 
>      multi-hop only -> portion:0.580, count:112, errors: 26, accuracy: 0.768
> 
>  
> 
> Multi-Spans:
> 
> train -> portion:0.083, count:73, errors: 37, accuracy: 0.493
> 
> dev -> portion:0.085, count:21, errors: 11, accuracy: 0.476
> 
> test -> portion:0.104, count:20, errors: 13, accuracy: 0.350
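The per-type lines above appear to follow portion = count / total and accuracy = (count - errors) / count, where total is the split size from the Overall section. The sketch below reproduces two of the reported rows under that assumption; the `TypeStats` class is a hypothetical container introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeStats:
    """Counts for one answer type on one split (hypothetical container)."""
    total: int   # all questions in the split
    count: int   # questions of this answer type
    errors: int  # questions of this type answered incorrectly

    @property
    def portion(self) -> float:
        # Share of the split that belongs to this answer type.
        return round(self.count / self.total, 3)

    @property
    def accuracy(self) -> float:
        # Share of this type's questions answered correctly.
        return round((self.count - self.errors) / self.count, 3)


# Counts copied from the report above.
single_span_train = TypeStats(total=882, count=546, errors=86)
multi_span_test = TypeStats(total=193, count=20, errors=13)

print(single_span_train.portion, single_span_train.accuracy)
print(multi_span_test.portion, multi_span_test.accuracy)
```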
> 
>  
> 
> jjfan
> 
> From: "Simonc" <simonc at iis.sinica.edu.tw>
> To: "Most-ai Contest" <Most-ai-contest at iis.sinica.edu.tw>
> Sent: Tuesday, March 31, 2020 6:00:52 PM
> Subject: [Most-ai-contest] Updated Performance Data with Gold Answer Inclusion Rate
> 
>  
> 
> Dear all,
> 
>  
> 
> The attached file contains the performance data from our results on 3/27.
> 
> This time, the gold answer inclusion rate is also included. (That is, counting the cases where the correct answer is included in the answer candidates.)
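The gold answer inclusion rate described above could be computed along these lines. This is only a sketch: it assumes inclusion means an exact string match between the gold answer and one of the candidates, which the message does not specify, and the function name and toy data are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def gold_inclusion_rate(examples: Iterable[Tuple[str, Sequence[str]]]) -> float:
    """Fraction of questions whose gold answer appears among the candidates.

    Assumes exact string matching; the real system may normalize answers.
    """
    examples = list(examples)
    if not examples:
        raise ValueError("no examples")
    hits = sum(1 for gold, candidates in examples if gold in candidates)
    return hits / len(examples)


# Toy data for illustration only, not taken from the attached file.
toy = [
    ("1998", ["1998", "1989"]),   # gold answer is among the candidates
    ("Taipei", ["Tainan"]),       # gold answer is missing
]
print(gold_inclusion_rate(toy))
```

A high inclusion rate with a lower final accuracy would suggest that errors come from candidate ranking rather than candidate generation.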
> 
>  
> 
> Regards,
> 
> 張光瑜
> 
> 
> _______________________________________________
> Most-ai-contest mailing list
> Most-ai-contest at iis.sinica.edu.tw
> https://www.iis.sinica.edu.tw/mailman/listinfo/most-ai-contest