Privacy Crisis due to Crisis Response on the Web

Shao-Yu Wu, Ming-Hung Wang, and Kuan-Ta Chen


Abstract

In recent disasters, the web has served as a medium of communication among disaster response teams, survivors, local citizens, curious onlookers, and zealous people who are willing to assist victims affected by disasters. To encourage and speed up information dissemination, availability and ease of use are normally the top concerns in designing disaster response web services, so a design of free-formed inputs without access control is commonly adopted. However, such a design may result in personal information disclosure and privacy leakage.
In this paper, using a case study of a real-life disaster response service, the MKER (Morakot Event Reporting) forum, we show that the disclosure of personal information and the resulting privacy leakage are indeed a serious and ongoing problem. In our case, we have successfully mapped 1,438 unique cell phone numbers and 1,383 unique addresses to individuals using an automated method, not to mention the much greater invasion of privacy that could be effected by manual analysis of the messages posted on the forum. To resolve this issue, we propose several means to mitigate and prevent such privacy leakage on disaster response services.
Keywords: crisis informatics; disaster management; disaster response; privacy leakage; situation awareness; user privacy

1  Introduction

As web technology has become pervasive with the ubiquity of the Internet, it has also expanded the reach of disaster sociology: people can now easily participate in disaster response and relief efforts, whether through peer-to-peer communication or public participation [1,2,3]. In recent disasters, the web has served as a medium of communication among disaster response teams, survivors, local citizens, and even curious onlookers. Furthermore, it provides opportunities for zealous people who would like to assist victims affected by disasters [1,4] and to actively engage in the creation of valuable information rather than being passive information consumers.
When a disaster occurs, all individuals respond to it in their own ways. The communication needs during the crisis response stage are extraordinary in both volume and form. Synchronous communication, such as phone conversations, is always heavily used among emergency response teams and between people in the disaster area and those outside it. Meanwhile, asynchronous communication via the web is equally important. For example, people who are in the vicinity of a crisis can provide firsthand situation updates to the public via the web, and people who cannot reach acquaintances in the affected area can call for help on the web.
To fulfill the asynchronous communication needs during crisis response, websites are set up for various purposes (cf. Section 2.1) such as situation overview, relief information, and donation management. Such websites are especially useful when certain information, such as a search for missing people, needs to be broadcast to an unspecified audience and real-time responses are not necessary or feasible.
Although disaster response web services are asynchronous in nature, availability and ease of use are normally the top concerns in providing such services, so as to encourage and speed up information dissemination. As a result, these sites mostly provide information for public access without the role-based access control that is commonly used by other websites. Furthermore, they usually accept free-formed inputs from the public because countless types of information may need to be disseminated via the websites. Because efficiency is given much higher weight than privacy on disaster response services, we anticipate that personal information disclosure and the resultant privacy leakage would be a serious issue worthy of investigation.
In this paper, using a case study based on a real-life disaster response service, the MKER (Morakot Event Reporting) forum, we show that the disclosure of personal information and the resulting privacy leakage are indeed a serious and ongoing problem. In our case, we successfully mapped 1,438 unique cell phone numbers and 1,383 unique addresses to individual persons using a fully automated approach, not to mention the much greater invasion of privacy that could be effected by manual analysis of the messages posted on the forum. The leaked privacy information, once it becomes available for malicious use, could be disastrous given the huge amount of information on a single site.
Our contribution in this paper is three-fold:
  1. We identify the privacy risks that are caused by users' communication on disaster response web services.
  2. As a demonstration, we extensively analyze the MKER forum, a disaster response forum set up for Typhoon Morakot in Aug 2009, to understand how people utilize such services with free-formed inputs and how personal information is disclosed. We also present automated analysis techniques to quantify privacy leakage on the forum.
  3. We propose several solutions to mitigate and prevent such privacy leakage.
The remainder of this paper is organized as follows. Section 2 provides a review of common disaster response web services and related work. To illustrate the privacy leakage risks, Section 3 introduces Typhoon Morakot and the MKER forum that was set up in response to it. In Section 4, we present how we analyze the messages publicly available on the MKER forum and how privacy leakage arises. In Section 5, we propose several means of mitigating and preventing the aforementioned privacy leakage on current and future disaster response sites. Finally, Section 6 draws our conclusions.

2  Background

Web disaster response services have proved valuable in that they provide an efficient means of gathering real-time updates from social reporters (some of whom may be witnesses of the crisis) who are at vantage points and able to access firsthand information. This crowdsourcing model demonstrated its usefulness and played an unprecedented role in the recent Haiti earthquake response [5]. In this section, we first provide an overview of various disaster response services on the web and then review related work that examined the use of web services in recent disasters.

2.1  Disaster Response Web Services

When facing a disaster, people use information from any kind of source as long as it satisfies their needs and informs their actions [6]. In this subsection, we categorize commonly seen web services designated for disaster responses.

2.1.1  Information Portal

Sahana [7] is an open-source disaster management system that addresses the common coordination requirements of relief operations and rehabilitation efforts. It was officially deployed by the Center for National Operations in Sri Lanka as a part of their official portal in 2005. Since then, it has been used in response to the 2005 Kashmir earthquake, the 2006 Guinsaugon mudslides in the Philippines, and the 2006 Yogyakarta earthquake in Indonesia.
The Disaster Portal created by Project RESCUE [8] is an example of a web portal for bidirectional communication between response parties and the public during emergency situations. It features a situation overview map, emergency shelter status, donation management, and services for family reunification. It was officially launched by the City of Ontario during the Southern California wildfires in Oct 2007 [9].

2.1.2  Social Media Web Service

A variety of social media web services have proved helpful in disaster response; here we illustrate their use in recent disasters:
  1. Social networking websites (such as Facebook and MySpace): Facebook was used as an information gathering center during the 2007 Virginia Tech and the 2008 Northern Illinois University shootings [10].
  2. Microblogging services (such as Twitter and Plurk): Broadcast communication via Twitter during the 2009 Red River floods has been shown to be efficient and effective in enhancing the situational awareness of the public [11,12].
  3. Web publication tools (such as blogs and wikis): Wikipedia was used to generate and disseminate information during the 2007 Virginia Tech shooting. The complete list of the 32 victims was compiled there even before the university released the information [4].
  4. Data mashup services (such as Google Maps, AlertMap, and FlickrVision): Google Maps was used by social reporters in the affected area to report and disseminate updates about the disaster during the 2007 California wildfires [13,3].
  5. Discussion boards and forums: Web forums have been used for coordinating disaster response and relief efforts. For example, during the 2008 Sichuan earthquake in China, netizens used a popular online discussion forum, Tianya Community, to share and disseminate information about the disaster [14].

2.2  Related Work

Starbird et al. [12] analyzed 7,183 Twitter messages (tweets) posted during the flooding of the Red River Valley in the US and Canada in March and April 2009 to investigate the form and content of Twitter communications regarding the hazard. Their analysis indicates that around 10% of the tweets were original, over a quarter were synthetic (i.e., original tweet messages incorporating outside knowledge such as news from mass media and geographical facts), and around three quarters were derivative information. In addition to the Red River flooding event, the same authors further examined the tweets posted during the Oklahoma grassfires of April 2009 [11] and identified information that may help enhance situational awareness during an ongoing disaster.
Torrey et al. [15] observed online communication in response to Hurricane Katrina in 2005 in four online communities (two blogs and two forums) to understand how people who were willing to donate goods to disaster victims communicated online and facilitated the distribution of donations. They found that online communities played an important role in both information access and trust development in disaster relief, and that large discussion forums tend to be more sustainable in disaster relief than small community blogs.
Moreover, during the 2007 California wildfires, residents in the affected area used Google Maps mashups and social media such as Twitter to report and disseminate updates about the disaster [13,3]. Meanwhile, popular social networking sites such as Facebook were used as information gathering centers during the 2007 Virginia Tech and the 2008 Northern Illinois University shootings [10]. Additionally, during the 2008 Sichuan earthquake in China, netizens used the Tianya Community to share and disseminate information about the disaster [14].
Although the use of social web services in disaster response has been extensively studied, the privacy issues in using such web services have rarely been mentioned. Herold [16] discussed concerns regarding sensitive information disclosure due to system malfunction or insecure data transmission. Motivated by a similar concern, we instead investigate the privacy disclosure risks arising from communications on disaster response forums and propose countermeasures to such risks without sacrificing the capability of the web services.

3  Typhoon Morakot and The Forum

3.1  Typhoon Morakot

Figure 1: The track of Typhoon Morakot between Aug 3 and Aug 11, 2009 [17]
Figure 2: The number of deaths and missing people caused by Typhoon Morakot in each county of Taiwan
Figure 3: The accumulated precipitation brought by Typhoon Morakot between Aug 5 and Aug 10, 2009
In the recorded history of Taiwan, Typhoon Morakot was the most severe typhoon in terms of casualties and injuries, and it produced a record-breaking level of rainfall. Early on Aug 7, 2009, the storm attained its peak intensity with a wind speed of 140 km/h (equivalent to 87 mph) according to the Japan Meteorological Agency (JMA). Approximately 24 hours later, the storm emerged back over water into the Taiwan Strait and weakened to a severe tropical storm before making landfall in China on Aug 9 (see Figure 1). During the four-day period (i.e., from Aug 7 to Aug 10) in which Typhoon Morakot struck Taiwan, the heavy rainfall in the mountain area exceeded 1,500 millimeters in 24 hours, and the accumulated precipitation over this period was more than 3,000 millimeters (see Figure 3). The maximum cumulative rainfall over three consecutive days in this period approached the world's highest rainfall record.
Table 1: A comparison between the Typhoon Morakot and annual precipitations observed in weather stations in the disaster area
(All rainfall values are in mm. "Annual" is the annual rainfall, the dated columns are daily rainfall, "08/07-08/10" is the Morakot total, and "vs. annual" is the Morakot total relative to annual rainfall.)

Station               Annual   08/07   08/08   08/09   08/10  08/07-08/10  vs. annual
Chiayi Alishan         3,910     420   1,161   1,166     218        2,965         76%
Pingtung Sandimen      3,884     745   1,402     394     332        2,872         74%
Chiayi Jhuci           3,801     556   1,185     877     156        2,775         73%
Kaohsiung Taoyuan      4,086     501   1,283     583     423        2,790         68%
Kaohsiung Liouguei     3,138     236   1,178     696     351        2,461         78%
Chiayi Fanlu           3,437     708     815     601      79        2,202         64%
Chiayi Dapu            2,749     482   1,214     458       3        2,156         78%
Kaohsiung Jiasian      2,861     400   1,072     345     203        2,020         71%
Nantou Sinyi           3,254     170     717     909     134        1,929         59%
Kaohsiung Maolin       3,152     252     743     230     179        1,404         45%
Pingtung Wutai         2,898     206     580     208     165        1,160         40%
Kaohsiung Cishan       2,365      91     620     128      85          924         39%
The heavy rainfall brought by Typhoon Morakot caused flooding over a total area of 400 square kilometers. The regions with heavy precipitation and flooding were mostly concentrated in southwestern Taiwan, where a number of stations reported more than 70% of the annual precipitation in merely three days (Table 1). Among the affected regions, the most severely damaged was Shiaolin village in Kaohsiung County. The torrential rainfall caused a large landslide, which wiped out the whole village and killed around 500 people [18].
To summarize, Typhoon Morakot caused various kinds of damage, such as landslides, debris flows, fallen rocks, overflowed levees, building collapses, and road and bridge destruction. There were 677 deaths and 22 people missing due to the hazard (see Figure 2). Economically, the estimated loss in agriculture alone was more than US$5.3 billion.

3.2  The MKER Forum

Figure 4: The screenshot of the MKER (Morakot Event Reporting) forum
Figure 5: The screenshot of the "Making a New Report" interface of the MKER forum
When Typhoon Morakot was hitting Taiwan, a number of disaster response web services were set up for various purposes, such as information portals, situation updates, and donation management. Among these sites, the MKER (Morakot Event Reporting) forum went online on Aug 9, 2009 and was designated as an information exchange site for the Morakot disaster with a discussion-board-like interface. Since its launch, people with different roles gathered at this site for three primary purposes: 1) asking for relief support, 2) seeking missing people, and 3) posting situation updates of people, goods, and any information related to the disaster. Figure 4 shows a screenshot of the MKER forum. The information posted on the forum is organized in "threads" (i.e., records), where a thread contains the following fields: 1) the county and the detailed address associated with an event, 2) contact information of the thread originator, 3) description of the event, 4) help in need (if applicable), and 5) responses (follow-ups) to the thread.
Note that when a user initiates a new thread, he or she provides only the first four fields (cf. the "Submit a New Report" interface in Figure 5), as the Response field is intended to be supplied by others and can contain any number of replies from different respondents. To reduce duplicate reports, thread originators were encouraged to use the keyword search functionality before initiating a new thread. In this way, if the information to be posted is related to an existing thread, the user can append follow-up messages to the thread via the "Add Response" interface.
From Aug 7 to Oct 22, people posted a total of 4,315 threads and 12,244 responses on the MKER forum, and 85% of the threads were posted between Aug 9 and Aug 14, as shown in Figure 6. According to the graph, the forum received 1,030 threads on Aug 11, which was the first day after Typhoon Morakot struck Taiwan. To illustrate what the information on the forum looks like, we provide five example threads in Table 2. In the first example, the originator could not contact his/her three acquaintances and therefore asked for any news about them. The second example was similarly intended to look for two missing people, and in this case, someone responded to the thread that both people were safe.
To acquire an overview of the information posted on the forum, we randomly sampled 500 threads and manually classified them by their intentions. The sampled results indicate that 38% of the threads were disaster situation updates, 28% inquired about the status of certain people, and the remaining 34% asked for relief.
Figure 6: The forum thread count timeline
Table 2: Examples of posts on the disaster relief forum we analyzed

Example 1
  Location: Namaxia Township, Minsheng Vil.
  Contact: (blank)
  Description: Has anyone seen them? If so, plz contact me, thanks!
  Relief/Help in Need: (blank)
  Response: [08/13 22:45] Can any Namaxia Township People who know them contact me? Or can anyone who know their family reply to this thread?

Example 2
  Location: No.***, Nanheng Rd., Launong Vil., Liugui Township, Kaohsiung County
  Contact: phone # 0938-*********, 0910-*********, Ms. Pan
  Description: Looking for Pan (28-year-old Male), Pan (43-year-old Female)
  Relief/Help in Need: (blank)
  Response: [08/15 01:06] Both of them are safe! All people live there are safe.

Example 3
  Location: Taoyuan Township, Kaohsiung County
  Contact: (blank)
  Description: home was flooded
  Relief/Help in Need: Please send more soldiers/rescuers to the disaster area to help search bodies and assist survivors
  Response: [08/14 01:21] I agree with this suggestion. Hope Government officials can see this post. [08/14 01:24] I've contacted w/ Mr. Chou at the rescue center, Kaohsiung. 07-******ext****

Example 4
  Location: Xinfa Vil., Liugui Township
  Contact: Mr. Chang, 0911******
  Description: Does anyone know if ***Chien is safe? If you have any info please contact me. Thanks!
  Relief/Help in Need: (blank)
  Response: [08/13 12:33] Looking for ***Lin. who lives in No.**, Xinkai, Liugui Township. Anyone with any info please contact 0987******

Example 5
  Location: Dabangu Vil., Chiayi County
  Contact: 0989******
  Description: power cut, road closure in the village.
  Relief/Help in Need: relief, living goods
  Response: [08/11 23:13] Urgent need for food, water, living goods. [08/11 23:16] contact name is Ms. Yang, Thanks.
For privacy concerns, we have partially replaced sensitive information with *.

3.3  Privacy Leakage Potentials

From the example threads in Table 2, it can be observed that personal information on the MKER forum was completely disclosed to the public without any protection. Such information exposure seems unavoidable because in many cases, the thread initiator would like anyone who has the requested information to contact him/her directly. In such critical conditions, mobile and landline phones are strongly preferred over Internet communication tools such as e-mail. In addition, to ask about the status of certain people, it is usually much more helpful, and sometimes mandatory, to provide the name and residential address, as well as the gender and age, of each person who is to be found.
As of the writing of this paper (Apr 2011), Typhoon Morakot left Taiwan more than 1.5 years ago, and there is no longer any posting activity on the MKER forum. However, all the information posted during the Morakot crisis is still available to every Internet user (and web crawler) today. In fact, the same situation can be observed on many other disaster response websites for Typhoon Morakot, which we believe is due to reasons including historical archiving and memorial purposes. This further increases the privacy leakage risks and makes such disaster response websites rich sources from which malicious users can mine others' private information. Motivated by these observations, we will quantify the degree of privacy leakage on the MKER forum and propose several means to resolve this issue in the rest of this paper.

4  Privacy Leakage on the MKER Forum

In this section, we analyze the threads on the MKER forum and reveal privacy leakage based on that information. We begin by defining and extracting personal information from the threads, then examine how such information was disclosed by users, and conclude the section by providing a measure of privacy leakage on the forum.
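Table 3: The degree of personal information disclosure in each of the Location, Contact, Description, and Response fields
(Each cell gives the number of threads, followed by the percentage of threads in parentheses.)

Category  Location        Contact         Description     Response        Overall
Name      19 (0.4%)       1,187 (27.5%)   1,108 (25.6%)   1,467 (34%)     2,469 (57.2%)
Contact   2 (0.1%)        2,377 (55.0%)   282 (6.5%)      1,419 (32.8%)   2,980 (69.0%)
Address   3,420 (79.2%)   83 (1.9%)       1,412 (32.7%)   1,631 (37.8%)   3,769 (87.3%)
Either    3,428 (79.3%)   2,521 (58.4%)   2,138 (49.5%)   2,484 (57.5%)   4,115 (95.3%)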

4.1  Personal Information Disclosure

To analyze the free-formed comments on the MKER forum, we define the following three categories of personal information:
  1. Name: An identifiable person's name can be either a full name (family name and first name), a first name, or a title followed by a last name, e.g., Ms. Cartmon.
  2. Address: An identifiable address can be either a county name, a village name, or a street name.
  3. Contact: An identifiable contact can be either a cell phone number, a landline number, or an e-mail address.
We developed a program that automatically extracts all the identifiable personal information above from publicly available users' comments on the MKER forum. The identification of personal names and addresses is based on table lookups, while the identification of contacts is based on regular expressions, as both phone numbers and e-mail addresses have rather strict formats. As a result, we identified 2,866 unique personal names, 2,556 unique addresses, and 2,903 unique contacts from the 4,315 threads on the forum. Among the threads, 4,115 (95%) contain at least one category of personal information, while 1,920 (44%) contain all three categories. The numbers of threads containing each combination of identifiable personal information are shown in Figure 7.
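As a rough illustration of this extraction step, the following Python sketch combines table lookups for names and addresses with regular expressions for contacts. It is not the program used in this study: the phone patterns assume Taiwanese formats (cell numbers as "09" plus eight digits, landlines as a two- or three-digit area code plus six to eight digits), and the lookup tables, the function name `extract`, and the sample entries are hypothetical.

```python
import re

# Assumed formats (not taken from the paper): cell numbers are "09" plus
# eight digits; landlines are an area code starting with 0 (but not 09)
# plus six to eight digits. Hyphens between groups are optional.
CELL_RE = re.compile(r"(?<!\d)09\d{2}-?\d{3}-?\d{3}(?!\d)")
LANDLINE_RE = re.compile(r"(?<!\d)0[2-8]\d?-?\d{6,8}(?!\d)")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical lookup tables; a real system would load complete lists.
NAME_TABLE = {"Ms. Pan", "Mr. Chang"}
ADDRESS_TABLE = {"Liugui Township", "Xinfa Vil.", "Kaohsiung County"}


def extract(text):
    """Return (category, matched_text, start_offset) tuples found in text."""
    found = []
    patterns = (("cell", CELL_RE), ("landline", LANDLINE_RE), ("email", EMAIL_RE))
    for category, pattern in patterns:
        for match in pattern.finditer(text):
            found.append((category, match.group(), match.start()))
    tables = (("name", NAME_TABLE), ("address", ADDRESS_TABLE))
    for category, table in tables:
        for entry in table:
            # Table lookup: report every literal occurrence of the entry.
            for match in re.finditer(re.escape(entry), text):
                found.append((category, entry, match.start()))
    # Sort by position so later steps can reason about distances.
    return sorted(found, key=lambda item: item[2])
```

Keeping the character offset of each match is a deliberate choice in this sketch, since a position-based analysis of the extracted items needs to know where each one appears in a field.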
We further analyze the relationship between the role a person plays during disaster response and the tendency of his or her name to be disclosed on the web. By a manual analysis of a random sample of 500 personal names, we found that 65% of the personal names belong to people affected by the disaster, 17% belong to the people who made the comments on the web, and the remaining 18% belong to third parties, such as fire fighters and people in emergency response teams.
Figure 7: The Venn diagram of the number of threads which contain personal names, addresses, and contacts
Table 4: A breakdown of the disclosure analysis of each personal information category in each field on the MKER forum
(Each cell gives total occurrences / unique values.)

Name                Location         Contact          Description      Response         Overall
Full Name           12 / 8           574 / 337        2,461 / 1,533    3,047 / 1,050    6,094 / 2,296
First Name          4 / 4            89 / 50          212 / 117        733 / 210        1,038 / 314
Title & Last Name   6 / 5            673 / 156        277 / 92         1,153 / 178      2,109 / 256
Total               22 / 17          1,336 / 543      2,950 / 1,742    4,933 / 1,438    9,241 / 2,866

Address             Location         Contact          Description      Response         Overall
County              4,623 / 108      28 / 11          252 / 78         316 / 84         5,219 / 157
Village             4,522 / 269      58 / 33          2,191 / 216      3,126 / 208      9,897 / 366
Street              2,058 / 315      40 / 24          813 / 198        1,755 / 295      4,666 / 492
Village & Street    2,040 / 277      21 / 12          2,750 / 762      3,990 / 854      8,801 / 1,541
Total               13,243 / 969     147 / 80         6,006 / 1,254    9,187 / 1,441    28,583 / 2,556

Contact             Location         Contact          Description      Response         Overall
Cell Phone          1 / 1            2,509 / 1,640    316 / 198        2,174 / 778      5,000 / 2,171
Landline            2 / 2            372 / 303        166 / 139        820 / 342        1,360 / 679
Email Addr          0 / 0            38 / 24          33 / 20          24 / 17          95 / 53
Total               3 / 3            2,919 / 1,967    515 / 357        3,018 / 1,137    6,455 / 2,903

4.2  Information Disclosure Analysis

In Table 3, we summarize the degree of personal information disclosure in each of the Location, Contact, Description, and Response fields, where the first three fields are filled in by thread initiators, and the Response field can be appended at any time by the thread initiator or others. According to the table, we identified one or more addresses in the Location field of about 79% of the threads, and one or more contacts in the Contact field of 55% of the threads. Interestingly, the comments in the Description field allow us to extract personal names from about 26% and addresses from about 33% of the threads, which implies that thread initiators often mentioned personal names and locations in the description of events or needs. On the other hand, while the Response field contains relatively higher ratios of personal names (34%) and addresses (38%), the field also contains contacts in 33% of the threads, which indicates that the respondents to a thread often included their own contact information for further communication.
We further provide a breakdown of the disclosure of each personal information category in each field in Table 4. From the table, we observe that personal names appeared more frequently in the Description and Response fields, which is consistent with our finding that around 65% of the disclosed personal names belong to people who were affected by the disaster, as the names of people who made the comments tend to be left in the Contact field. Also, full names were used most of the time, as unique personal identification was extremely important in such emergency conditions. Similarly, more than half of the address occurrences were identified in the Description and Response fields even though a Location field is provided, as users tended to provide more location-relevant information in the event description and in follow-up communications.
The identifiable contact information consists primarily of cell phone numbers (74%) and landline numbers (23%). This again shows that people prefer to communicate synchronously by phone for urgent and critical purposes.

4.3  Privacy Leakage

Table 5: The occurrence numbers of successful bindings between personal names, addresses and contacts (cell phones, landline phones, and email addresses)
(Each cell gives total occurrences / unique values.)

Name                Landline         Cell Phone       Email            Address          Overall
Full Name           124 / 89         1,642 / 596      7 / 7            2,537 / 1,100    4,310 / 1,792
First Name          26 / 15          311 / 124        0 / 0            175 / 98         512 / 237
Title & Last Name   120 / 85         1,349 / 718      23 / 8           345 / 185        1,837 / 996
Total               270 / 189        3,302 / 1,438    30 / 15          3,057 / 1,383    6,659 / 3,025
Having extracted the three categories of personal information, we can now gauge the leakage of privacy information on the MKER forum. Here we refer to privacy information as the binding of a piece of personally identifiable information (PII) to its associated personal information. More specifically, we regard a successful binding between a personal name and a corresponding address or contact as an instance of privacy leakage.
We perform the bindings between the extracted personal names and personal information using the positions of their appearance in each thread. Specifically, if a personal name appears in the same field as a certain contact (or address) and the distance between them is within 30 characters, the contact (or address) is considered bound to the personal name. If multiple names are simultaneously bound to the same personal information, only the closest name (in terms of character distance) is considered successfully bound.
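The binding rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this study: each extracted item is represented as a hypothetical `(text, start, end)` span within one field, and the distance between a name and an item is assumed to be the number of characters separating the two spans, since the exact distance measure is not specified here.

```python
def bind(names, items, max_distance=30):
    """Bind each contact or address to the closest name in the same field.

    names and items are lists of (text, start, end) spans taken from one
    field of one thread. An item is bound only if some name lies within
    max_distance characters; if several do, the closest one wins.
    """
    bindings = []
    for item_text, item_start, item_end in items:
        best = None  # (gap, name_text) of the closest name seen so far
        for name_text, name_start, name_end in names:
            if name_end <= item_start:
                gap = item_start - name_end   # name precedes the item
            elif item_end <= name_start:
                gap = name_start - item_end   # name follows the item
            else:
                gap = 0                       # spans overlap
            if gap <= max_distance and (best is None or gap < best[0]):
                best = (gap, name_text)
        if best is not None:
            bindings.append((best[1], item_text))
    return bindings
```

For example, with a name ending at offset 7 and a phone number starting at offset 10, the gap is 3 characters and the pair is bound; a number whose nearest name is more than 30 characters away stays unbound.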
We list the number of occurrences of privacy leakage in Table 5. The table shows that we are able to infer 1,438 person-cell-phone pairs and 1,383 person-address pairs based on the extracted personal information; the bindings for landline numbers and e-mail addresses are far fewer, mainly because these two contact categories also occur much less often. Among the names successfully bound to certain information, 60% are full names, while 33% are titles followed by last names. The predominance of full names and cell phone numbers among the disclosures makes the privacy leakage a serious issue, as the full name and cell phone number of an individual are sufficient to perform fraudulent and other malicious activities against him/her.
We also summarize the proportions of addresses and the three categories of contact information with and without name bindings in Figure 8. The graph shows that around half of the addresses are successfully bound to individuals, while around two thirds of the cell phone numbers are successfully bound. In contrast, merely 28% of the landline numbers are bound to personal names. We believe this can be attributed to the fact that many of the landlines in users' comments are owned by governmental or non-governmental organizations related to emergency rescue or relief, such as Red Cross branches, temples, and local police offices.
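The proportions quoted above can be reproduced from the unique counts in Table 4 (all extracted items) and Table 5 (items bound to a name). The short Python sketch below does this arithmetic; it assumes that the proportions are computed over unique items, which is consistent with the 28% figure for landlines, and the function name `bound_share` is only illustrative.

```python
# Unique counts copied from the "Overall" column of Table 4 (extracted)
# and from the "Total" row of Table 5 (bound to a personal name).
extracted = {"address": 2556, "cell phone": 2171, "landline": 679, "email": 53}
bound = {"address": 1383, "cell phone": 1438, "landline": 189, "email": 15}


def bound_share(category):
    """Fraction of unique items in a category that are bound to a name."""
    return bound[category] / extracted[category]


for category in extracted:
    print(f"{category}: {bound_share(category):.1%} bound to a name")
```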
Figure 8: The percentages of addresses and the three categories of contact information with and without name bindings
To sum up, based on a total of 5.2 MB of users' plain-text comments on the MKER forum, we inferred 3,025 privacy leakage incidents using a fully automated method with simple table lookups and regular expressions. A malicious user who aims to exploit the dataset could certainly extract much more "useful" privacy information via manual analysis. For example, the ages and genders of people are usually provided in requests made to query the status of certain people. Moreover, users' comments frequently reveal the relationship between the commenter and the persons he/she mentioned (e.g., uncle, aunt, or grandchild); sometimes the commenter even provided the complete member list of a family. As we merely use the MKER forum as an example and rely only on automated analysis, the overall privacy crisis due to crisis response websites would be devastatingly severe if we consider the number of similar services, the number of crisis events, and the number of malicious users.

5  Solutions to Privacy Leakage

In this section, we discuss potential solutions to the identified privacy leakage due to crisis responses on the web.

5.1  Remedy to Current Services

For services that are currently running, such as the MKER forum, the web server should first prevent the site's content from being crawled by search engines. A web service can adopt the Robots Exclusion Standard by placing a robots.txt file under the root directory of the web server to instruct web spiders not to crawl the site's content. However, the Robots Exclusion Standard is advisory rather than enforced; a web spider or a malicious user can simply ignore robots.txt and crawl the data anyway. Second, the site owners can mask all personal information, such as contacts and addresses, using table lookup and regular expression techniques as we did in Section 4.1.
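As a concrete illustration of the first remedy, the sketch below shows a robots.txt that asks every compliant crawler to stay away from the entire site, and uses Python's standard `urllib.robotparser` module to confirm how a well-behaved crawler would interpret it. The crawler name and the URL are placeholders, not the forum's real address.

```python
from urllib.robotparser import RobotFileParser

# Contents of a robots.txt placed at the web server's root directory.
# "User-agent: *" addresses all crawlers; "Disallow: /" covers every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder crawler name and URL, used only to query the parsed rules.
allowed = parser.can_fetch("AnyCrawler", "http://forum.example.org/thread/123")
print(allowed)
```

A crawler that honors the standard would treat every page as off limits under these rules, but nothing in the file itself stops a client that chooses to ignore it.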
Although the two methods cannot help much if the site's content has already been stored by a search engine or any third party, they are at least helpful in preventing further privacy leakage caused by the information currently available on the sites.
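The second remedy, masking contacts in already published text, might look like the following Python sketch. The patterns are simplifying assumptions (any run of at least seven digits, optionally split by hyphens, is treated as a phone number), and the function names and the "[email hidden]" placeholder are hypothetical; masking names and addresses would additionally require lookup tables.

```python
import re

# Assumed pattern: seven or more digits, with optional hyphens between
# them, not adjacent to other digits.
PHONE_RE = re.compile(r"(?<!\d)\d(?:-?\d){6,}(?!\d)")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_phone(match):
    """Keep the first four characters and replace later digits with '*'."""
    number = match.group()
    return number[:4] + re.sub(r"\d", "*", number[4:])


def mask_contacts(text):
    """Mask phone numbers and e-mail addresses in a comment."""
    text = PHONE_RE.sub(mask_phone, text)
    return EMAIL_RE.sub("[email hidden]", text)


print(mask_contacts("Call Ms. Pan at 0938-123456 or pan@example.org"))
```

Keeping the leading characters of a number preserves a hint of its type (cell or landline prefix) while hiding the part that identifies the subscriber.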

5.2  Personal Information Filtering and Protection

For future disaster response web services, we suggest that such services should adopt certain mechanisms for personal information filtering and protection to avoid privacy leakage.
To facilitate personal information filtering, the system should be able to recognize all possible personal names and addresses, which we consider feasible as personal names can be checked against a name dictionary and Internet map services can be used for address checking. When a user attempts to post comments containing personal names or addresses, the system can warn him/her about the risks that the operation may cause.
The system can also mask out significant parts of names and addresses, such as the first name and the street and finer levels of address information. If a reader needs to access the masked information, the system could provide a mechanism to grant access. For example, the interested reader may be required to authenticate him/herself with an SMS verification code sent by a certified emergency response organization.
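A minimal sketch of this masking follows, assuming that names arrive already split into family and given parts and that addresses arrive as components ordered from coarse to fine. Both assumptions, the function names, and the boolean that stands in for the SMS verification result are illustrative only.

```python
from typing import Sequence


def mask_name(family_name: str, given_name: str) -> str:
    """Keep the family name and hide every character of the given name."""
    return f"{family_name} {'*' * len(given_name)}"


def mask_address(components: Sequence[str], keep_levels: int = 2) -> str:
    """Show only the coarsest `keep_levels` parts of an address.

    `components` is ordered from coarse to fine, e.g. county, township,
    street, house number; the street and anything finer are hidden.
    """
    visible = list(components[:keep_levels])
    hidden = ["***"] * max(len(components) - keep_levels, 0)
    return ", ".join(visible + hidden)


def render_address(components: Sequence[str], reader_is_verified: bool) -> str:
    """Return the full address only to a reader who passed verification."""
    if reader_is_verified:
        return ", ".join(components)
    return mask_address(components)


if __name__ == "__main__":
    # Placeholder address components, not taken from the dataset.
    address = ["Example County", "Example Township", "Example Road", "No. 12"]
    print(mask_name("Chen", "Mei-Ling"))
    print(render_address(address, reader_is_verified=False))
```

Keeping the county and township visible preserves the situational value of a report (where help is needed) while withholding the detail that pinpoints a household.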
To protect personal contact information from being disclosed, disaster response systems may provide "forwarding" mechanisms. For example, if a user would like to leave his mobile phone number in his comments, the system can "transform" the phone number into an extension number that is visible to the public. A reader of the comments must then dial the system's representative phone number followed by the extension number to reach the comment poster, without learning his actual number. The system has to pay the communication cost on behalf of the callers for their phone conversations with the comment poster, but this cost can be kept low by limiting the duration of each forwarded call (footnote 9). In addition, the system should adopt certain authentication mechanisms to prevent Sybil attacks, in which malicious users register a large number of fake mobile phone numbers as the system's extension numbers and misuse them in other ways.
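The bookkeeping behind such a forwarding mechanism can be sketched as a small directory that maps real numbers to public extensions. The class and method names are hypothetical, the sequential four-digit extensions are an arbitrary choice, and the 180-second cap follows the 3-minute figure suggested in footnote 9. The sketch assumes ownership of each number was verified before registration, so it does not itself defend against Sybil attacks, and it does not place any calls.

```python
import itertools
from typing import Dict, Tuple


class ForwardingDirectory:
    """Maps real phone numbers to public extension numbers."""

    def __init__(self, max_call_seconds: int = 180, first_extension: int = 1000) -> None:
        self.max_call_seconds = max_call_seconds
        self._next_extension = itertools.count(first_extension)
        self._extension_by_phone: Dict[str, str] = {}
        self._phone_by_extension: Dict[str, str] = {}

    def register(self, verified_phone: str) -> str:
        """Return the extension for a phone number, allocating one if needed.

        The caller is assumed to have verified ownership of the number already.
        """
        if verified_phone not in self._extension_by_phone:
            extension = str(next(self._next_extension))
            self._extension_by_phone[verified_phone] = extension
            self._phone_by_extension[extension] = verified_phone
        return self._extension_by_phone[verified_phone]

    def publish(self, comment: str, verified_phone: str) -> str:
        """Replace the real number in a comment with its public extension."""
        extension = self.register(verified_phone)
        return comment.replace(verified_phone, f"ext. {extension}")

    def route(self, extension: str) -> Tuple[str, int]:
        """Resolve a dialed extension to (real number, allowed call seconds)."""
        return self._phone_by_extension[extension], self.max_call_seconds
```

Only route ever returns the real number, and it would be called by the telephony side of the system rather than exposed to readers of the forum.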
By adopting both personal information filtering and protection mechanisms, we believe that the privacy leakage on crisis response web services can be mitigated to a reasonable degree.

6  Conclusion and Future Work

In this paper, we have identified the privacy risks caused by users' communication on disaster response websites. Based on an analysis of the MKER forum, we have shown that the revelation of personal information and the resulting privacy leakage on such services is indeed a serious issue. We have also proposed several means to prevent such privacy leakage on current and future disaster response services, respectively. In the future, we plan to implement the proposed mechanisms for personal information filtering and protection (cf. Section 5.2) and make them publicly available to disaster response service providers, in the hope of resolving the Privacy Crisis due to Crisis Response on the Web.

References

[1] L. Palen and S. B. Liu, "Citizen communications in crisis: anticipating a future of ICT-supported public participation," in Proc. CHI '07.    ACM, pp. 727-736.
[2] S. Vieweg, L. Palen, S. B. Liu, A. L. Hughes, and J. Sutton, "Collective intelligence in disaster: Examination of the phenomenon in the aftermath of the 2007 Virginia Tech shooting," 2008.
[3] J. Sutton, L. Palen, and I. Shklovski, "Backchannels on the front lines: Emergent uses of social media in the 2007 southern california wildfires," in Proc. ISCRAM '08.
[4] L. Palen, S. Vieweg, J. Sutton, S. B. Liu, and A. Hughes, "Crisis informatics: Studying crisis in a networked world," in Proc. e-SS '07.
[5] F. D. L. Wigand, "Twitter takes wing in government: diffusion, roles, and management," in Proc. dg.o '10.    Digital Government Society of North America, 2010, pp. 66-71.
[6] J. G. Taylor, S. C. Gillette, and F. C. S. C. (U.S.), Communicating with wildland interface communities during wildfire.    USGS Fort Collins Science Center, 2005.
[7] SAHANA. [Online]. Available: http://sahanafoundation.org/
[8] RESCUE project. [Online]. Available: http://www.itr-rescue.org/index.php
[9] RESCUE Disaster Portal. [Online]. Available: http://www.disasterportal.org/
[10] L. Palen and S. Vieweg, "The emergence of online widescale interaction in unexpected events: assistance, alliance & retreat," in Proc. CSCW '08.    ACM, 2008, pp. 117-126.
[11] S. Vieweg, A. L. Hughes, K. Starbird, and L. Palen, "Microblogging during two natural hazards events: what Twitter may contribute to situational awareness," in Proc. CHI '10.    ACM, 2010, pp. 1079-1088.
[12] K. Starbird, L. Palen, A. L. Hughes, and S. Vieweg, "Chatter on the red: what hazards threat reveals about the social life of microblogged information," in Proc. CSCW '10.    ACM, 2010, pp. 241-250.
[13] I. Shklovski, L. Palen, and J. Sutton, "Finding community through information and communication technology in disaster response," in Proc. CSCW '08.    ACM, 2008, pp. 127-136.
[14] Y. Qu, P. F. Wu, and X. Wang, "Online community response to major disaster: A study of Tianya forum in the 2008 Sichuan earthquake," Hawaii International Conference on System Sciences, pp. 1-11, 2009.
[15] C. Torrey, M. Burke, M. Lee, A. Dey, S. Fussell, and S. Kiesler, "Connected giving: Ordinary people coordinating disaster relief on the Internet," Hawaii International Conference on System Sciences, p. 179a, 2007.
[16] R. Herold, "Addressing privacy issues during disaster recovery," Information Systems Security, vol. 14, no. 6, pp. 16-22, 2006.
[17] "Historical typhoon database provided by central weather bureau, Taiwan," http://rdc28.cwb.gov.tw/data.php.
[18] "Extreme events and disasters from typhoon Morakot - the biggest threat ever to Taiwan," Environmental Protection Administration, Executive Yuan, ROC, Tech. Rep., 2009.

Footnotes:

1. http://typhoon.oooo.tw
2. http://www.tianya.cn/
3. The name Morakot was assigned by the Japan Meteorological Agency (JMA) on August 3, 2009.
4. http://www.jma.go.jp/
5. http://wmo.asu.edu/world-greatest-seventy-two-hour-3-day-rainfall
6. http://typhoon.oooo.tw/
7. For the ease of analysis, we combine the original Description and Needs fields into the current Description field.
8. http://www.robotstxt.org/robotstxt.html
9. We consider such limits reasonable as a short period, say, 3 minutes, is sufficient for both parties to authenticate each other and exchange their actual phone numbers if further communications are needed.


Sheng-Wei Chen (also known as Kuan-Ta Chen)
http://www.iis.sinica.edu.tw/~swc 
Last Update September 28, 2019