<div dir="ltr"><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">Hi everyone,</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">I'm 羅上堡, the person responsible for Multi-span Extraction.</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">This implementation is entirely rule-based, so there are quite a few steps, and some of them may not strictly need to be presented.</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt"><font face="微軟正黑體, sans-serif"><span style="font-size:21.3333px">The overall flow below has 16 steps. Because it has been a long time since I last wrote up a process like this, I am also attaching a minimal flowchart to show how the steps relate to one another.</span></font></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt"><br></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">The overall flow is as follows:<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step1</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Extract the text and the <span lang="EN-US">NER</span> annotations of the <span lang="EN-US">Passage</span> and the <span lang="EN-US">Question</span>.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step2</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Remove all special symbols from <span lang="EN-US">P</span>.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)"> </span><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">※《『<span lang="EN-US">?</span>「》」』:<span lang="EN-US">~@#</span>¥<span lang="EN-US">%</span>……<span lang="EN-US">&*</span>():<span lang="EN-US">]+...</span></span><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif"></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step3</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Recompute the <span lang="EN-US">Begin</span> and <span lang="EN-US">End</span> positions of each <span lang="EN-US">NER</span> entity.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)"> </span><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">※ <span lang="EN-US">Step2</span> removes characters, so the original <span lang="EN-US">Begin</span> and <span lang="EN-US">End</span> positions are shifted and no longer correct.<span lang="EN-US"></span></span></p>
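<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif">Step2 and Step3 can be sketched roughly as follows. This is only an illustration, not the actual code: the symbol set <code>SPECIAL</code> is a small hypothetical subset of the real list, the function name <code>strip_and_remap</code> is made up, and entity offsets are assumed to be (begin, end) pairs with an exclusive end.</span></p>

```python
import re

# Hypothetical subset of the special symbols removed in Step2.
SPECIAL = re.compile(r"[《》『』「」??::~@#¥%…&*()()\]+]")


def strip_and_remap(passage, entities):
    """Remove special symbols and shift entity offsets to match (Step2 + Step3).

    entities: list of (begin, end) offsets into the original passage,
    with `end` exclusive (an assumption made for this sketch).
    """
    new_index = []  # new_index[i] = number of kept characters before position i
    kept = []
    for ch in passage:
        new_index.append(len(kept))
        if not SPECIAL.match(ch):
            kept.append(ch)
    new_index.append(len(kept))  # sentinel so that end == len(passage) is valid

    cleaned = "".join(kept)
    remapped = [(new_index[b], new_index[e]) for b, e in entities]
    return cleaned, remapped


cleaned, remapped = strip_and_remap("《總統》大選", [(1, 3)])
print(cleaned, remapped)
```

<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif">Building the index map in the same pass as the cleaning keeps the two steps consistent, since every offset is derived from exactly the characters that were dropped.</span></p>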
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step4</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Build the <span lang="EN-US">BERT</span> input matrix: <span lang="EN-US">[CLS]<b>Q</b>[SEP]<b>P</b>[SEP]</span>.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step5</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Extract the last sentence of the <span lang="EN-US">Question</span>.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step6</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: From the result of <span lang="EN-US">Step5</span>, extract keywords that indicate how many answers are expected; if there are none, treat the question as one that does not specify an answer count.<span lang="EN-US"></span></span></p>
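<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif">A minimal sketch of Step5 and Step6, under stated assumptions: the sentence delimiters, the keyword pattern (哪 followed by a number) and the names <code>last_sentence</code> and <code>expected_answer_count</code> are all hypothetical, since the actual keyword rules are not listed here.</span></p>

```python
import re

# Hypothetical mapping from Chinese numerals to integers.
CN_NUM = {"一": 1, "二": 2, "兩": 2, "三": 3, "四": 4, "五": 5,
          "六": 6, "七": 7, "八": 8, "九": 9, "十": 10}

# Hypothetical keyword rule: "哪" followed by a digit string or a Chinese numeral.
COUNT_PATTERN = re.compile(r"哪(\d+|[一二兩三四五六七八九十])")


def last_sentence(question):
    """Step5: return the last non-empty sentence of the question."""
    parts = [s.strip() for s in re.split(r"[。!?!?]", question) if s.strip()]
    return parts[-1] if parts else ""


def expected_answer_count(question):
    """Step6: return the requested number of answers, or None if unspecified."""
    match = COUNT_PATTERN.search(last_sentence(question))
    if match is None:
        return None  # question does not specify how many answers it wants
    token = match.group(1)
    return int(token) if token.isdigit() else CN_NUM[token]


print(expected_answer_count("台灣有很多城市。請問哪兩個城市最大?"))
print(expected_answer_count("請問首都在哪裡?"))
```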
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step7</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Feed the matrix from <span lang="EN-US">Step4</span> to <span lang="EN-US">BERT</span> to produce its output.</span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step8</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: From the output of <span lang="EN-US">Step7</span>, generate the <span lang="EN-US">top-k</span> <span lang="EN-US">Begin</span> and <span lang="EN-US">End</span> positions.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step9</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Look for answers within the <span lang="EN-US">top-k</span> candidates while checking whether each one exceeds a length of <span lang="EN-US">20</span>; if it does, move on to the next-ranked (<span lang="EN-US">top-k+1</span>) result, until the required number of answers is reached or no candidates remain.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step10</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: For all answers selected in <span lang="EN-US">Step9</span>, resolve contained <span lang="EN-US">(Within)</span> and overlapping <span lang="EN-US">(Overlap)</span> answers.<span lang="EN-US"></span></span></p>
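<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif">The selection in Step8 and Step9 might look something like the sketch below. It assumes the model output has already been reduced to two plain lists of per-position scores, ranks a candidate by the sum of its begin and end scores, and treats the end position as inclusive; those choices, and the names <code>select_spans</code> and <code>MAX_LEN</code>, are assumptions for illustration rather than the actual implementation.</span></p>

```python
import heapq

MAX_LEN = 20  # maximum allowed answer length, as in Step9


def select_spans(start_scores, end_scores, wanted, k=5):
    """Pick up to `wanted` (begin, end) spans from the top-k begin/end positions.

    `end` is inclusive here. Candidates longer than MAX_LEN are skipped and
    the next-ranked candidate is tried instead.
    """
    top_starts = heapq.nlargest(
        k, range(len(start_scores)), key=start_scores.__getitem__)
    top_ends = heapq.nlargest(
        k, range(len(end_scores)), key=end_scores.__getitem__)

    # Rank every valid (begin <= end) pairing by its combined score.
    candidates = sorted(
        ((start_scores[b] + end_scores[e], b, e)
         for b in top_starts for e in top_ends if e >= b),
        reverse=True,
    )

    chosen = []
    for _, b, e in candidates:
        if e - b + 1 > MAX_LEN:
            continue  # too long: fall through to the next candidate
        chosen.append((b, e))
        if len(chosen) == wanted:
            break
    return chosen


start = [0.0] * 30
end = [0.0] * 30
start[0], start[25] = 5.0, 3.0
end[28], end[2] = 5.0, 2.0
print(select_spans(start, end, wanted=2, k=2))
```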
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">※<span lang="EN-US">Within
condition</span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">Answer1: </span><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">今天是總統大選 ("today is the presidential election")<span lang="EN-US"> </span></span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)"><span lang="EN-US">Answer2</span>: 是總統大 (a fragment contained in Answer1)</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)"><span lang="EN-US">Result</span>: 今天是總統大選<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">※<span lang="EN-US">Overlap
condition</span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">Answer1: </span><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">今天是總統大選 ("today is the presidential election")<span lang="EN-US"> </span></span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)"><span lang="EN-US">Answer2</span>: 總統大選的日子 ("the day of the presidential election")</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)"><span lang="EN-US">Result</span>: 總統大選 ("presidential election")<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">※ This part of the <span lang="EN-US">code</span> was unexpectedly awkward to handle, so I spent quite a while working on it.<span lang="EN-US"></span></span></p>
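<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif">Read off the two examples above, the rule appears to be: when one answer lies within another, keep the longer one; when two answers partially overlap, keep only the shared part. The sketch below implements that reading on (begin, end) offsets with an exclusive end. The function names <code>merge_pair</code> and <code>resolve</code> are hypothetical, and the real code may treat edge cases differently.</span></p>

```python
def merge_pair(a, b):
    """Combine two (begin, end) spans, end exclusive.

    Within:  one span contains the other -> keep the containing span.
    Overlap: spans partially overlap     -> keep only the shared part.
    Returns None when the spans do not touch at all.
    """
    (ab, ae), (bb, be) = a, b
    if ab <= bb and be <= ae:
        return a  # b lies within a
    if bb <= ab and ae <= be:
        return b  # a lies within b
    lo, hi = max(ab, bb), min(ae, be)
    if lo < hi:
        return (lo, hi)  # partial overlap: keep the intersection
    return None


def resolve(spans):
    """Fold a list of spans so that no two results are Within or Overlap."""
    result = []
    for span in spans:
        merged = True
        while merged:  # a merged span may now collide with another result
            merged = False
            for i, other in enumerate(result):
                combined = merge_pair(span, other)
                if combined is not None:
                    span = combined
                    del result[i]
                    merged = True
                    break
        result.append(span)
    return result


text = "今天是總統大選的日子"
for begin, end in resolve([(0, 7), (3, 10)]):
    print(text[begin:end])
```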
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step11</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Check whether any candidate answer contains the enumeration comma 『、』; if so, run <span lang="EN-US">Step12</span>; otherwise, run <span lang="EN-US">Step13</span>.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step12-1</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: For each answer that contains 『、』, extend it forward to the full stop 『。』, then use the word segmentation from <span lang="EN-US">jieba</span> to pick up the answer items that follow each 『、』.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step12-2</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: When the character 『等』 ("and so on") is encountered, extend further to find the words that follow it and expand the answer accordingly.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step13</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Check again for <span lang="EN-US">Within</span> and <span lang="EN-US">Overlap</span> cases. If any occur, resolve them in the same way as in <span lang="EN-US">Step10</span> and then run <span lang="EN-US">Step15</span>; if not, proceed to <span lang="EN-US">Step14</span>.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step14</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Since no special case was encountered, join the filtered answers together using the simplest <span lang="EN-US">Rule</span>.<span lang="EN-US"></span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step15</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Using the <span lang="EN-US">NER</span> information, expand each selected answer so that any <span lang="EN-US">NER</span> entity it only partially covers is restored in full, making the answer more complete.<span lang="EN-US"></span></span></p>
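<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif">The first half of Step12-1 can be illustrated with a deliberately simplified sketch. It only extends the span to the next 『。』 and splits on 『、』; it does not call jieba to find word boundaries and does not handle the 『等』 case from Step12-2, so the last item simply runs to the full stop. The function name <code>expand_enumeration</code> and the exclusive-end offsets are assumptions.</span></p>

```python
def expand_enumeration(passage, begin, end):
    """Extend the span [begin, end) to the next full stop and split on '、'.

    Simplified illustration: no word segmentation is applied, so each item
    is whatever text sits between two enumeration commas.
    """
    answer = passage[begin:end]
    if "、" not in answer:
        return [answer]  # nothing to expand (Step11 would route to Step13)

    stop = passage.find("。", end)
    if stop == -1:
        stop = len(passage)  # no full stop found: run to the end of the passage

    return [item for item in passage[begin:stop].split("、") if item]


passage = "他喜歡蘋果、香蕉、橘子。今天下雨。"
print(expand_enumeration(passage, 3, 7))
```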
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">※<span lang="EN-US">Example
condition</span></span></p>
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">Answer: </span><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)">統大選 </span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)"><span lang="EN-US">NER</span>: 總統 ("president")</span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif;color:rgb(191,191,191)"><span lang="EN-US">Result</span>: 總統大選 ("presidential election")<span lang="EN-US"></span></span></p>
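<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-family:微軟正黑體,sans-serif">The example above can be reproduced with a short sketch of Step15. It assumes answers and NER entities are both (begin, end) offsets into the cleaned passage with an exclusive end, and that entities do not overlap one another; the function name <code>expand_to_entities</code> is hypothetical.</span></p>

```python
def expand_to_entities(span, entities):
    """Grow an answer span so that every entity it touches is fully covered.

    span and entities use (begin, end) offsets with `end` exclusive.
    """
    begin, end = span
    for ent_begin, ent_end in entities:
        # The entity shares at least one character with the answer.
        if ent_begin < end and begin < ent_end:
            begin = min(begin, ent_begin)
            end = max(end, ent_end)
    return begin, end


passage = "今天是總統大選"
answer = (4, 7)        # "統大選"
entities = [(3, 5)]    # "總統"
begin, end = expand_to_entities(answer, entities)
print(passage[begin:end])
```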
<p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span lang="EN-US" style="font-size:16pt;font-family:微軟正黑體,sans-serif">Step16</span><span style="font-size:16pt;font-family:微軟正黑體,sans-serif">: Output the final result.<span lang="EN-US"></span></span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt;font-size:12pt;font-family:Calibri,sans-serif"><span style="font-size:16pt;font-family:微軟正黑體,sans-serif"><br></span></p><p class="MsoNormal" style="margin:0cm 0cm 0.0001pt"><font face="微軟正黑體, sans-serif"><span style="font-size:21.3333px">Thank you.</span></font></p></div>