We hope that, in the not too distant future, people will not have to constantly update antivirus tools or worry about trojan horses, computer viruses, and fraud when they surf the Internet. We envisage a time when users' network experiences will not be ruined by threats to privacy and acts of piracy, such as the deletion of documents by malicious software, the theft of credit card details and online game accounts, and the unauthorized publication of sensitive information (for example, how often a user visits dating websites). Thus, our research also includes the detection and prevention of malicious and fraudulent activities on the Internet.
Phishing Page Detection
Phishing is a form of online identity theft associated with both social engineering and technical subterfuge. To protect end users from reaching phishing pages, various kinds of visual similarity-based phishing page detectors have been developed. However, scammers are clever enough to create polymorphic phishing pages that evade those detectors. We call this kind of countermeasure phishing page polymorphism.
We have proposed two approaches to counteract phishing page polymorphism: 1) We proposed a layout-based mechanism to detect polymorphic phishing pages. The mechanism analyzes the layout of web pages rather than their HTML code, colors, or content. 2) To improve the mechanism, we applied a state-of-the-art invariant content description technique from the image processing field to phishing page detection. We use the Contrast Context Histogram (CCH) to compute the degree of similarity between suspicious pages and authentic pages based on discriminative keypoint features in web pages. The empirical evaluation results showed that both of the proposed schemes achieved high accuracy and low error rates.
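The idea of scoring a suspicious page against an authentic page by matching keypoint features can be illustrated with a minimal Python sketch. It does not implement CCH; it assumes that each page has already been reduced to a list of descriptor vectors, and the function names, the distance threshold `max_dist`, the scoring rule, and the toy descriptors are all hypothetical choices made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def page_similarity(suspect, authentic, max_dist=0.25):
    """Fraction of the authentic page's keypoints that have a close match
    among the suspect page's keypoints (hypothetical scoring rule)."""
    if not suspect or not authentic:
        return 0.0
    matched = 0
    for descriptor in authentic:
        nearest = min(euclidean(descriptor, s) for s in suspect)
        if nearest <= max_dist:
            matched += 1
    return matched / len(authentic)

# Toy 3-dimensional descriptors (assumed data); real descriptors are longer.
authentic_page = [(0.1, 0.9, 0.3), (0.8, 0.2, 0.5), (0.4, 0.4, 0.9)]
lookalike_page = [(0.12, 0.88, 0.31), (0.79, 0.22, 0.48), (0.0, 0.0, 0.0)]
unrelated_page = [(0.9, 0.9, 0.9), (0.0, 0.5, 0.0)]

score_lookalike = page_similarity(lookalike_page, authentic_page)
score_unrelated = page_similarity(unrelated_page, authentic_page)
print(score_lookalike, score_unrelated)
```

In this toy data, two of the three authentic keypoints find a close match in the lookalike page and none do in the unrelated page, so a detector built this way would flag pages whose score exceeds some tuned threshold.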
"Counteracting Phishing Page Polymorphism: An Image Layout Analysis Approach," Ieng-Fat Lam, Wei-Cheng Xiao, Szu-Chi Wang, and Kuan-Ta Chen, Proceedings of ISA 2009.
"Fighting Phishing with Discriminative Keypoint Features of Webpages," Kuan-Ta Chen, Jau-Yuan Chen, Chun-Rong Huang, and Chu-Song Chen, IEEE Internet Computing, pp. 30--37, May, 2009. [web | paper]
Fast-Flux Botnet Detection
The fast-flux service network architecture has been widely adopted by bot herders to increase the productivity and extend the lifespan of botnets' domain names. A fast-flux botnet is distinctive in two ways: first, each of its domain names is mapped to different sets of IP addresses over time; second, legitimate users' requests are handled by machines other than those the users contact directly. Earlier methods for detecting fast-flux botnets rely mostly on the first property. This approach is effective, but it requires a long observation time, up to a few days, before a conclusion can be drawn. To address this timing issue, we proposed a method that can detect in real time whether a web service is hosted by a fast-flux botnet. Our scheme relies on three characteristics of fast-flux botnets: 1) requests are delegated to machines other than the ones users contact, 2) bots are not dedicated to malicious services, and 3) the hardware used by bots is normally inferior to that of dedicated servers. As these characteristics are intrinsic and invariant, it is extremely difficult, if not impossible, for bot herders to evade our detection scheme.
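The first property, a domain name mapping to different sets of IP addresses over time, can be illustrated with a minimal Python sketch. It is not the real-time scheme described above; it only shows the kind of bookkeeping that observation-based methods need. The function name `ip_churn`, the observation format, and the sample addresses (taken from reserved documentation ranges) are assumptions made for illustration.

```python
def ip_churn(observations):
    """Summarize how the IP set behind one domain changes across lookups.

    observations: chronological list of (timestamp_seconds, list_of_ips),
    one entry per DNS lookup (hypothetical input format).
    """
    seen = set()
    new_per_lookup = []
    for _, ips in observations:
        fresh = set(ips) - seen          # addresses not seen in earlier lookups
        new_per_lookup.append(len(fresh))
        seen |= fresh
    span = observations[-1][0] - observations[0][0] if observations else 0
    return {
        "distinct_ips": len(seen),
        "new_per_lookup": new_per_lookup,
        "span_seconds": span,
    }

# Three hourly lookups for a stable domain and for a fluxing one (assumed data).
stable = [(0, ["192.0.2.10"]), (3600, ["192.0.2.10"]), (7200, ["192.0.2.10"])]
fluxing = [
    (0, ["198.51.100.1", "198.51.100.2"]),
    (3600, ["198.51.100.3", "198.51.100.2"]),
    (7200, ["203.0.113.7", "203.0.113.8"]),
]

stable_summary = ip_churn(stable)
fluxing_summary = ip_churn(fluxing)
print(stable_summary)
print(fluxing_summary)
```

The sketch makes the timing issue concrete: the difference between the two domains only becomes visible after several lookups spread over time, and legitimate services that rotate addresses would need even longer observation to be told apart.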
"Fast-Flux Bot Detection in Real Time," Ching-Hsiang Hsu, Chun-Ying Huang, and Kuan-Ta Chen, RAID 2010. [web | paper]
P2P and VoIP Traffic Detection
P2P traffic now constitutes a substantial proportion of Internet traffic; therefore, identifying P2P traffic is essential for traffic management tasks such as service differentiation and capacity planning. However, modern P2P applications often use proprietary protocols, dynamic port numbers, and packet encryption, which render traditional port-based and signature-based identification approaches futile. To address these issues, we have proposed a behavior-based approach that accurately recognizes whether P2P applications are running on certain hosts based on the applications' signaling traffic. Our approach is particularly useful because it does not need to access the packet payload and can recognize applications based purely on their signaling behavior.
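What "payload-free" means in practice can be illustrated with a minimal Python sketch that builds a per-host profile from packet headers alone. The two features used here (number of distinct remote peers and share of small packets), the 200-byte cutoff, the record format, and the sample trace are hypothetical choices for illustration, not the features of the approach described above.

```python
from collections import defaultdict

def signaling_profile(packets, small_bytes=200):
    """Build per-host statistics from header fields only.

    packets: iterable of (timestamp, local_host, remote_host, size_bytes);
    no payload is inspected (hypothetical record format).
    """
    stats = defaultdict(lambda: {"peers": set(), "small": 0, "total": 0})
    for _, host, remote, size in packets:
        entry = stats[host]
        entry["peers"].add(remote)
        entry["total"] += 1
        if size <= small_bytes:
            entry["small"] += 1
    return {
        host: {
            "distinct_peers": len(entry["peers"]),
            "small_packet_ratio": entry["small"] / entry["total"],
        }
        for host, entry in stats.items()
    }

# Assumed trace: one host exchanging short messages with many peers,
# another host moving bulk data to a single server.
packets = [
    (0.00, "10.0.0.5", "198.51.100.1", 60),
    (0.05, "10.0.0.5", "198.51.100.2", 80),
    (0.09, "10.0.0.5", "203.0.113.9", 90),
    (0.20, "10.0.0.5", "203.0.113.40", 1400),
    (0.30, "10.0.0.8", "192.0.2.25", 1400),
    (0.35, "10.0.0.8", "192.0.2.25", 1500),
]

profiles = signaling_profile(packets)
print(profiles)
```

A classifier would then be trained on profiles of this kind; which features actually discriminate between applications is an empirical question that the sketch does not answer.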
In addition, we found that human voice activity can be inferred from VoIP traffic, even if the traffic is encrypted. We proposed an innovative network-layer voice activity detection (VAD) algorithm that can extract voice activity from encrypted and non-silence-suppressed Skype traffic. The results showed that our scheme achieved decent performance even when a high degree of timing randomness was injected into the network traffic. Furthermore, as VoIP traffic may use random ports and is hard to detect, we showed that the inferred voice activity can be used to detect VoIP traffic in a port- and protocol-independent manner.
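A network-layer VAD can be illustrated with a minimal Python sketch that never looks inside a packet. It assumes, purely for illustration, that packets sent while the user is speaking tend to be larger than those sent during silence; the window length, the 100-byte threshold, the function names, and the sample sizes are hypothetical and do not reproduce the algorithm described above.

```python
def infer_voice_activity(packet_sizes, window=3, threshold=100.0):
    """Return one flag per packet: True where the trailing moving average
    of packet sizes reaches the threshold (assumed to indicate speech)."""
    active = []
    for i in range(len(packet_sizes)):
        start = max(0, i - window + 1)
        recent = packet_sizes[start:i + 1]
        active.append(sum(recent) / len(recent) >= threshold)
    return active

def talk_spurts(active):
    """Collapse per-packet flags into (first_index, last_index) runs."""
    spurts = []
    start = None
    for i, flag in enumerate(active):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            spurts.append((start, i - 1))
            start = None
    if start is not None:
        spurts.append((start, len(active) - 1))
    return spurts

# Assumed packet sizes in bytes for one direction of a call.
sizes = [60, 62, 58, 140, 150, 145, 61, 59, 60, 150, 148]
flags = infer_voice_activity(sizes)
spurts = talk_spurts(flags)
print(spurts)
```

Smoothing makes the estimate more robust to isolated outliers at the cost of a short lag: in the sample data the first run of large packets starts at index 3, but the trailing average only crosses the threshold at index 4.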
"Peer-to-Peer Application Recognition Based on Signaling Activity," Chen-Chi Wu, Kuan-Ta Chen, Yu-Chun Chang, and Chin-Laung Lei, Proceedings of IEEE ICC 2009. [web | paper]
"Inferring Speech Activities from Encrypted Skype Traffic," Yu-Chun Chang, Kuan-Ta Chen, Chen-Chi Wu, and Chin-Laung Lei, Proceedings of IEEE Globecom 2008. [web | paper]
"Detecting VoIP Traffic Based on Human Conversation Patterns," Chen-Chi Wu, Kuan-Ta Chen, Yu-Chun Chang, and Chin-Laung Lei, Proceedings of IPTCOMM 2008. [web | paper]
Information Leakage in Social Network Services
Revealing personal information in online social network services is a double-edged sword. Information exposure is a plus, if not a must, for people who wish to enlarge their social circles. However, leakage of personal information, especially one's identity, may invite malicious attacks from the real world and cyberspace, such as stalking, slander, personalized spamming, and phishing. Even if people try their best to conceal personal information online, their friends may "betray" them unintentionally. We considered the problem of involuntary information leakage in social network services and demonstrated its seriousness with a case study of Wretch, the largest social network site in Taiwan. Wretch allows users to annotate their friends' profiles with a one-line description, in which the friends' private information, including real name, age, and school attendance records, may be revealed without their consent. Among the 592,548 effective profiles that we collected, the first name of 72% of the accounts and the full name of 30% of the accounts could be inferred by using a number of heuristics. The age of 15% of the account holders and at least one school attended by 42% of the holders could also be inferred. We are working on solutions to mitigate the involuntary information leakage problem that we identified.
"Involuntary Information Leakage in Social Network Services," Ieng-Fat Lam, Kuan-Ta Chen, Ling-Jyh Chen, Proceedings of IWSEC 2008 (Best Paper Award). [web | paper]