<html><body><div style="font-family: arial, helvetica, sans-serif; font-size: 12pt; color: #000000"><div><span style="font-size: 12pt;">Hi <span style="color: rgb(0, 0, 0); font-family: 微軟正黑體, sans-serif;">上堡,</span></span></div><div><br></div><div><span style="font-size: 12pt;"><span style="color: rgb(0, 0, 0); font-family: 微軟正黑體, sans-serif;">Thank you for your reply. Please also explain this to everyone at the 1/14 meeting. Thanks!</span></span></div><div><br></div><div><span style="font-size: 12pt;"><span style="color: rgb(0, 0, 0); font-family: 微軟正黑體, sans-serif;">jjfan</span></span></div><div><br></div><hr id="zwchr" data-marker="__DIVIDER__"><div data-marker="__HEADERS__"><b>From: </b>"闍怵羅" &lt;s2w81234@gmail.com&gt;<br><b>To: </b>"Most-ai Contest" &lt;Most-ai-contest@iis.sinica.edu.tw&gt;<br><b>Sent: </b>Thursday, January 2, 2020 4:09:18 PM<br><b>Subject: </b>[Most-ai-contest] Brief explanation of Multi-span<br></div><div><br></div><div data-marker="__QUOTED_TEXT__"><div dir="ltr"><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Hello everyone,</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">I am 羅上堡, the person responsible for Multi-span Extraction.</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">This approach is implemented entirely with rules (rule-based), so there are quite a few steps, and some of them may not strictly need to be presented.</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt"><font face="微軟正黑體, sans-serif"><span style="font-size:21.3333px">Below is the rough overall flow, divided into 16 steps. Since it has been a long time since I last wrote up a flow like this, I am also attaching a minimal flowchart to help explain how the steps relate to one another.</span></font></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt"><br></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">The overall flow is as follows:</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step1</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Extract the text and the NER annotations of the Passage and the Question.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step2</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Remove all special symbols from P (the Passage).</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )"> </span><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">※《『<span lang="EN-US">?</span>「》」』:<span lang="EN-US">~@#</span>¥<span lang="EN-US">%</span>……<span lang="EN-US">&*</span>():<span lang="EN-US">]+...</span></span><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif"></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step3</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Recompute the Begin and End positions of the NER entities.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )"> </span><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">※Because of Step2, the original Begin and End positions are shifted and no longer correct.</span></p>
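<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">A minimal sketch of how Step2 and Step3 could fit together, not the actual contest code. It assumes NER spans are stored as (begin, end) character offsets with an exclusive end. The symbol set <code>SPECIAL</code> and the function name <code>strip_and_remap</code> are hypothetical, and the set covers only the symbols visible in the list above.</p>

```python
# Hypothetical subset of the special symbols; the full list is not given.
SPECIAL = set("※《『?「》」』:~@#¥%…&*():]+")


def strip_and_remap(passage, entities):
    """Remove special symbols and shift NER offsets to match.

    entities: list of (begin, end) offsets into `passage`, end exclusive.
    Returns the cleaned passage and the remapped offsets.
    """
    new_index = []  # new_index[i] = count of kept characters before position i
    kept = []
    for ch in passage:
        new_index.append(len(kept))
        if ch not in SPECIAL:
            kept.append(ch)
    new_index.append(len(kept))  # sentinel so end == len(passage) also maps
    cleaned = "".join(kept)
    remapped = [(new_index[b], new_index[e]) for b, e in entities]
    return cleaned, remapped


passage = "《今天》是總統大選"
entities = [(5, 7)]  # "總統" in the original passage
cleaned, remapped = strip_and_remap(passage, entities)
```

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">Building the index map in the same pass as the removal keeps the two steps consistent: every offset is shifted by exactly the number of symbols removed before it.</p>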
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step4</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Build the BERT input matrix: <span lang="EN-US">[CLS]<b>Q</b>[SEP]<b>P</b>[SEP]</span>.</span></p>
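<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">The layout in Step4 can be illustrated without any model library. The sketch below assumes character-level tokens as a stand-in for a real tokenizer and a maximum length of 512; both are assumptions, as is the helper name <code>build_bert_input</code>.</p>

```python
def build_bert_input(question, passage, max_len=512):
    """Lay out [CLS] Q [SEP] P [SEP] with matching segment ids."""
    # Character-level tokens stand in for a real tokenizer (assumption).
    q_tokens = list(question)
    p_tokens = list(passage)
    # Truncate the passage so the whole sequence fits in max_len.
    budget = max_len - len(q_tokens) - 3  # 3 special tokens
    p_tokens = p_tokens[:max(budget, 0)]
    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + p_tokens + ["[SEP]"]
    # Segment 0 covers [CLS] + Q + first [SEP]; segment 1 covers P + last [SEP].
    segment_ids = [0] * (len(q_tokens) + 2) + [1] * (len(p_tokens) + 1)
    passage_offset = len(q_tokens) + 2  # index of the first passage token
    return tokens, segment_ids, passage_offset


tokens, segment_ids, offset = build_bert_input("哪天", "今天是總統大選")
```

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">Keeping <code>passage_offset</code> makes it possible to convert predicted token positions back into passage offsets in the later steps.</p>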
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step5</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Extract the last sentence of the Question.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step6</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: From the Step5 result, extract keywords that indicate how many answers are expected; if there are none, treat the question as one with no specified number of answers.</span></p>
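<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">One way Step5 and Step6 might look, as a sketch only. The actual keywords are not listed in this description, so the pattern below (a numeral followed by a measure word such as 個 or 項) and the names <code>last_sentence</code> and <code>expected_answer_count</code> are hypothetical.</p>

```python
import re

# Hypothetical mapping and pattern; the real keyword rules are not specified.
CN_DIGITS = {"一": 1, "二": 2, "兩": 2, "三": 3, "四": 4, "五": 5,
             "六": 6, "七": 7, "八": 8, "九": 9, "十": 10}
COUNT_RE = re.compile(r"(\d+|[一二兩三四五六七八九十])\s*(?:個|項|種|位|名)")


def last_sentence(question):
    """Return the last non-empty sentence of the question."""
    parts = [s for s in re.split(r"[。!?!?]", question) if s.strip()]
    return parts[-1].strip() if parts else ""


def expected_answer_count(question):
    """Return the requested number of answers, or None if unspecified."""
    match = COUNT_RE.search(last_sentence(question))
    if match is None:
        return None
    token = match.group(1)
    return int(token) if token.isdigit() else CN_DIGITS[token]
```

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">Returning <code>None</code> rather than a default number keeps the "no specified number" case explicit for the later selection step.</p>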
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step7</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Feed the matrix from Step4 to BERT to produce its output.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step8</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: From the Step7 output, generate the top-k Begin and End positions.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step9</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Search the top-k candidates for answers, checking whether each one exceeds a length of 20. If it does, move on to the next candidate (top-k+1), and continue until the required number of answers is reached or no candidates remain.</span></p>
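<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">A rough sketch of the selection in Step8 and Step9. It assumes the model output is available as two plain lists of per-position scores, and that Begin/End pairs are ranked by the sum of their scores; that ranking rule and the names <code>top_k_indices</code> and <code>select_spans</code> are assumptions. Only the length limit of 20 comes from the description.</p>

```python
from itertools import product

MAX_ANSWER_LEN = 20  # length limit stated in Step9


def top_k_indices(scores, k):
    """Indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def select_spans(start_scores, end_scores, k, wanted=None):
    """Pick valid (begin, end) spans, end inclusive, in score order.

    wanted: number of answers requested, or None for "unspecified".
    """
    starts = top_k_indices(start_scores, k)
    ends = top_k_indices(end_scores, k)
    # Rank every Begin/End pair by summed score (assumed ranking rule).
    pairs = sorted(product(starts, ends),
                   key=lambda p: start_scores[p[0]] + end_scores[p[1]],
                   reverse=True)
    chosen = []
    for begin, end in pairs:
        if begin > end:
            continue  # End before Begin is not a span
        if end - begin + 1 > MAX_ANSWER_LEN:
            continue  # too long: fall through to the next candidate
        chosen.append((begin, end))
        if wanted is not None and len(chosen) == wanted:
            break
    return chosen
```

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">Skipping an over-long candidate and continuing down the ranked list corresponds to "taking the next top-k+1 result" in Step9.</p>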
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step10</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: For all answers selected in Step9, handle answers that are contained in another answer (Within) or that overlap another answer (Overlap).</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">※<span lang="EN-US">Within condition</span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">Answer1: </span><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">今天是總統大選 ("today is the presidential election")</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )"><span lang="EN-US">Answer2</span>: 是總統大 (a fragment inside Answer1)</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )"><span lang="EN-US">Result</span>: 今天是總統大選</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">※<span lang="EN-US">Overlap condition</span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">Answer1: </span><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">今天是總統大選 ("today is the presidential election")</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )"><span lang="EN-US">Answer2</span>: 總統大選的日子 ("the day of the presidential election")</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )"><span lang="EN-US">Result</span>: 總統大選 ("presidential election", the shared part)</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">※This part of the code was unexpectedly awkward to handle, so I spent a comparatively long time on it.</span></p>
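<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">The two examples above suggest a rule of this shape: a contained answer is absorbed by the answer that contains it, and two overlapping answers are reduced to their shared part. The sketch below works under that reading, with spans as (begin, end) passage offsets and an exclusive end. The names <code>resolve_pair</code> and <code>resolve_all</code> are hypothetical, and the real code may well differ.</p>

```python
def resolve_pair(a, b):
    """Combine two spans if they are nested or overlapping, else None."""
    (a_begin, a_end), (b_begin, b_end) = a, b
    if a_begin >= b_begin and b_end >= a_end:
        return b  # a lies within b: keep the containing span
    if b_begin >= a_begin and a_end >= b_end:
        return a  # b lies within a
    low, high = max(a_begin, b_begin), min(a_end, b_end)
    if high > low:
        return (low, high)  # partial overlap: keep the shared part
    return None  # disjoint spans are left alone


def resolve_all(spans):
    """Repeat pairwise resolution until no two spans interact."""
    spans = list(spans)
    changed = True
    while changed:
        changed = False
        for i in range(len(spans)):
            for j in range(i + 1, len(spans)):
                merged = resolve_pair(spans[i], spans[j])
                if merged is not None:
                    del spans[j]        # j > i, so index i stays valid
                    spans[i] = merged
                    changed = True
                    break
            if changed:
                break
    return spans


passage = "今天是總統大選的日子"
within = resolve_all([(0, 7), (2, 6)])    # 今天是總統大選 / 是總統大
overlap = resolve_all([(0, 7), (3, 10)])  # 今天是總統大選 / 總統大選的日子
```

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">Each merge removes one span, so the loop always terminates. Restarting the scan after a merge matters because a merged span can newly interact with spans that were previously disjoint from both inputs.</p>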
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step11</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Check whether any candidate answer contains the enumeration comma 『、』. If so, run Step12; if not, run Step13.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step12-1</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: For each answer that contains 『、』, extend it forward to the full stop 『。』, then use the <span lang="EN-US">jieba</span> word-segmentation result to obtain the answers that follow each 『、』.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step12-2</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: If the character 『等』 ("etc.") is encountered, extend further to find the words that follow 『等』 and use them to expand the answer.</span></p>
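<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">A simplified sketch of Step12-1 only. It replaces the jieba segmentation with a plain split on 『、』, so each item runs up to the next 『、』 or the full stop rather than ending at a word boundary, and it does not attempt the 『等』 handling of Step12-2. The function name <code>expand_enumeration</code> and the exclusive-end span convention are assumptions.</p>

```python
def expand_enumeration(passage, begin, end):
    """Extend the span [begin, end) to the next full stop and split on '、'.

    Returns the list of enumerated items, or the answer unchanged
    (as a one-item list) when it contains no enumeration comma.
    """
    answer = passage[begin:end]
    if "、" not in answer:
        return [answer]
    stop = passage.find("。", end)
    if stop == -1:
        stop = len(passage)  # no full stop: run to the end of the passage
    extended = passage[begin:stop]
    # Plain split stands in for the jieba-based item extraction.
    return [item for item in extended.split("、") if item]


passage = "今天有蘋果、香蕉、芒果。明天沒有。"
items = expand_enumeration(passage, 3, 8)  # span covers "蘋果、香蕉"
```

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">Here the predicted span stops after the second item, and extending to the full stop recovers the third one.</p>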
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step13</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Check for the Within and Overlap conditions again. If either occurs, resolve it in the same way as in Step10 and then run Step15; if not, run Step14.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step14</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Since no special case was encountered, join the filtered answers using the simplest rule.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step15</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Using the NER information, expand each selected answer so that any NER entity it only partially covers is restored in full, making the answer more complete.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">※<span lang="EN-US">Example condition</span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">Answer: </span><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )">統大選 (missing the first character of 總統, "president")</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )"><span lang="EN-US">NER</span>: 總統</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-family:'微軟正黑體' , sans-serif;color:rgb( 191 , 191 , 191 )"><span lang="EN-US">Result</span>: 總統大選 ("presidential election")</span></p>
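<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">The expansion in Step15 can be sketched as a union of the answer span with every NER span it touches. This assumes spans are (begin, end) offsets with an exclusive end and that a single pass over the entities is enough; the name <code>expand_with_ner</code> is hypothetical.</p>

```python
def expand_with_ner(span, entities):
    """Grow `span` to fully cover every entity it partially covers.

    span and entities use (begin, end) offsets, end exclusive.
    """
    begin, end = span
    for ent_begin, ent_end in entities:
        # The spans share at least one character.
        if min(end, ent_end) > max(begin, ent_begin):
            begin = min(begin, ent_begin)
            end = max(end, ent_end)
    return begin, end


passage = "今天是總統大選"
answer = (4, 7)             # "統大選"
entities = [(0, 2), (3, 5)]  # "今天", "總統"
expanded = expand_with_ner(answer, entities)
```

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif">Entities that do not touch the answer, such as 今天 here, leave it unchanged. A single pass depends on entity order when entities are adjacent to each other, so a real implementation might sort them or repeat the pass.</p>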
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">Step16</span><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif">: Output the final result.</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:'calibri' , sans-serif"><span style="font-size:16pt;font-family:'微軟正黑體' , sans-serif"><br></span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt"><font face="微軟正黑體, sans-serif"><span style="font-size:21.3333px">Thank you.</span></font></p></div>
<br>_______________________________________________<br>Most-ai-contest mailing list<br>Most-ai-contest@iis.sinica.edu.tw<br>https://www.iis.sinica.edu.tw/mailman/listinfo/most-ai-contest<br></div></div></body></html>