Web, Internet, Networking, and Software
Principal Investigators:
Internet Knowledge Extraction and Management
This project focuses on Internet document
extraction, classification and query processing. We address the
problem of extracting knowledge from web sites along two directions.
In the first direction, we extract knowledge from web documents
using an existing ontology. In the second direction, we recognize
the informative structure of a web site and use it to extract the informative
content, which is usually mixed with undesired redundant information.
We are also interested in developing new methods, algorithms, and technologies
to extract knowledge from web pages of a specific type, such as citation
lists or bibliographies.
In addition, the project studies the learning
and management of knowledge from Internet documents, including the
following topics: personalized knowledge management, topic discovery,
and event detection and tracking. The personalization system filters
information for users based on user interests. Topic discovery finds
interesting topics via hyperlinks and contents of Web documents. The
goal of event detection and life cycle tracking is to identify specific
events and, across several document sources, to distinguish the articles
about those events from other articles. How to present a complicated event
so that readers can comprehend it is also an interesting research topic.
WLAN/WMAN as Access Network
This project focuses on enhancing the IEEE 802.11 and
802.16 families to provide QoS guarantees, multimedia
communication, and multi-hop access, so that they can be integrated
as a full-fledged access network. Resource management and fairness are
the major issues for QoS guarantees. Besides theoretical study, we implement
embedded system prototypes on Linux for performance evaluation.
Our studies cover MAC-layer, network-layer, and transport-layer protocols.
Network Measurement
Network measurement is a fundamental research problem
in the field of computer networks. The results of network measurements
are critical for efficient network design, management, and usage.
With the growing popularity of emerging network technologies, such
as WiFi, Bluetooth, ZigBee, WiMedia, WiMAX, and GPRS/3G, it is becoming
increasingly desirable to have a simple, accurate, end-to-end, and
less intrusive approach to measuring the network. In this project, we
intend to study network measurements in emerging and challenging network
scenarios. We plan to develop several approaches that can measure and
monitor the network accurately, promptly, and at low cost.
In addition, using network measurement results, we
intend to develop QoS enhancement schemes for various network services/applications.
For instance, for mobile network applications, we plan to develop
an algorithm that quickly adapts network services to real-time
network measurement results in order to optimize the
user-perceived service quality for mobile users. Moreover, we plan to
apply network measurement results to benefit emerging peer-to-peer,
overlay, mesh, sensor, ad hoc, and/or opportunistic networks, so that
network resources are used effectively and QoS is
improved.
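
As an illustration of adapting a service to measurement results, the following is a minimal sketch and not the project's algorithm. It smooths round-trip-time samples with an exponentially weighted moving average and picks a media bitrate from the smoothed value. The class name, the function name, the thresholds, and the bitrates are all hypothetical; the smoothing weight 1/8 is the value TCP uses for its smoothed RTT.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RttEstimator:
    """Exponentially weighted moving average of RTT samples (hypothetical helper)."""
    alpha: float = 0.125               # weight of the newest sample (assumed)
    srtt_ms: Optional[float] = None    # smoothed RTT; None until the first sample

    def update(self, sample_ms: float) -> float:
        if self.srtt_ms is None:
            # The first sample initializes the estimate.
            self.srtt_ms = sample_ms
        else:
            self.srtt_ms = (1 - self.alpha) * self.srtt_ms + self.alpha * sample_ms
        return self.srtt_ms


# (maximum smoothed RTT in ms, bitrate in kbit/s); illustrative values only.
BITRATE_LADDER = [(50.0, 2000), (150.0, 800), (float("inf"), 300)]


def choose_bitrate_kbps(srtt_ms: float) -> int:
    """Return the bitrate of the first ladder step whose RTT bound is not exceeded."""
    for max_rtt_ms, kbps in BITRATE_LADDER:
        if srtt_ms <= max_rtt_ms:
            return kbps
    raise AssertionError("unreachable: the last ladder step has no upper bound")
```

A caller would feed each new measurement to `update` and pass the returned value to `choose_bitrate_kbps`; smoothing keeps a single delayed packet from triggering a quality switch.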
Formal Verification
Our research goal is to improve the quality of computer
systems. For decades, formal methods have been used to ensure
the correctness of computer systems. We are developing and applying
formal verification techniques to help system engineers build high-quality
computer systems.
Currently, the formal verification group focuses on the
following projects:
(1) System-on-Chip verification:
SoC systems require the integration of both hardware
and software components. Integration is difficult, however, because
hardware and software components are developed in different
environments. In this project, we propose an integrated
verification platform to check the correctness of SoC systems. The idea
is to develop a core verification tool which allows users to translate
hardware and software specifications to the core verification language.
(2) OMocha model checker:
OMocha is a model checker for reactive modules. It serves
as the core verification platform of our System-on-Chip verification
project. We have also developed model checking algorithms and implemented
them in OMocha. We use CUDD and Chaff as the underlying BDD and SAT
packages. To improve the quality of the model checker, we use Objective
Caml to implement high-level algorithms in OMocha. Currently, OMocha
has both BDD-based and SAT-based model checking algorithms.
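
To make the idea of model checking concrete, here is a minimal sketch of an invariant check by breadth-first search over explicitly enumerated states. It is not OMocha's algorithm: OMocha works symbolically with BDDs or SAT, whereas this sketch stores every reachable state individually, which only works for small systems. The function names and the example counter system are hypothetical.

```python
from collections import deque


def check_invariant(initial_states, successors, invariant):
    """Explore all reachable states breadth-first.

    Returns None if `invariant` holds in every reachable state, otherwise a
    shortest path (list of states) from an initial state to a violating state.
    """
    parent = {}          # state -> predecessor on the search tree (None for initial)
    queue = deque()
    for state in initial_states:
        if state not in parent:
            parent[state] = None
            queue.append(state)
    while queue:
        state = queue.popleft()
        if not invariant(state):
            # Rebuild the counterexample by walking the predecessor links.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def counter_successors(state):
    """Toy transition relation: a counter that wraps around at 4."""
    return [(state + 1) % 4]


# The property "the counter never reaches 3" is violated after three steps.
counterexample = check_invariant([0], counter_successors, lambda s: s != 3)
```

The returned path plays the role of the error trace that a model checker reports when a property fails.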
(3) Software verification:
Recently, software verification has come under the spotlight
in various research communities. Thanks to the success of hardware
verification, many researchers now think that software verification
may provide an answer to improve the quality of software systems.
The key issue in software verification is abstract interpretation.
Because of the intractability of model checking problems, model checkers
cannot verify large systems directly. Abstraction techniques simplify programs
so that they can be verified by model checkers. We are interested
in developing new abstraction techniques and implementing them in OMocha.
Web-Based Collaborative Problem-Solving Environment
We have launched an ambitious project called "ShareTone"
to build a web-based collaborative problem-solving environment.
This project aims to facilitate collaboration among researchers
and practitioners over the Internet. Users in the problem-solving environment
(PSE) can share pertinent information and ideas, demonstrate their new
findings, and solve a particular problem collaboratively.
The kernel of this project is built upon an efficient
knowledge management mechanism. We integrate the Zope open source
utility with the Plone content management system to construct a
unique type of knowledge portal that supports fundamental functions
of collaboration, i.e., data sharing, software warehousing, application
sharing, and workflow control. The content management mechanism also
allows users to implement discipline-dependent content types within
a given ontology. We have implemented a demonstration knowledge portal called
"OpenCPS" (http://www.opencps.org), in which knowledge is
regarded as a map relating instances among the problem, solution, and
implementation spaces.
The Plone/Zope software system supports the Python language
for further development of plug-in packages. Taking advantage of this
extensibility, we have developed tele-conferencing and concurrent collaboration
service packages to enhance the performance of the collaborative
problem-solving environment. The tele-conferencing service includes
Internet Relay Chat (IRC) and videoconference systems. The concurrent
collaboration service is referred to as the ShareTone CSCW (Computer
Supported Cooperative Work) system, in which we provide reusable system
components and some useful applications such as "CollabJEditor" and
"Composer". The CollabJEditor is an editor that allows concurrent
editing from multiple users; the Composer is an authoring system which
handles backend collaborative tasks and helps users develop collaborative
applications easily. All these add-on applications use a client-server
architecture. The client-side programs are written in Java and
embedded into the whole system by Python scripts that run in response to web
users' communication and collaboration requests.
The ShareTone Replication Center (STRC, http://www.sharetone.org)
is now available for users to replicate our model of collaborative
PSE. Users can download the latest system core to set up
their own knowledge portals equipped with the fundamental collaboration
functions, i.e., content type management, software warehousing, application
sharing, and workflow control. The above-mentioned communication tools
and concurrent collaboration applications are optional plug-ins that users
can select.
In addition to the general model of the web environment,
users can develop extra applications for use in a particular discipline.
For example, we have developed the following applications for the
OpenCPS knowledge portal that make it a useful practice platform for
an algorithm design course, "Geometric Computing and Visualization":
(1) GeoBuilder
A collaborative visual debugging tool built upon
the ShareTone CSCW components. GeoBuilder supports 2D and 3D geometric
algorithm visualization. During 3D algorithm visualization, the
drawing engine can dynamically decide the camera position for users
to effectively track 3D geometric objects.
(2) Concept map generator
A visualization tool to display the sitemap of the
knowledge portal and observe the concept map of group knowledge.
This map generator supports both learner and expert modes for
interacting with the knowledge portal. Users can not only learn from
but also contribute to the group knowledge through the concept
map.
(3) Algorithm benchmark system
A Python service package integrated with a database
and a group of task servers. This system can compare the memory
usage, the CPU time, and the experimental results of various algorithms.
It can also help users observe how the space and time requirements
of an algorithm grow with input size.
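
The kind of measurement such a system performs can be sketched with the Python standard library alone. The following is a minimal illustration, not the portal's service package: it has no database or task servers, and the function name `benchmark` and its parameters are hypothetical. It records CPU time with `time.process_time` and peak Python heap usage with `tracemalloc` for a series of input sizes.

```python
import time
import tracemalloc


def benchmark(func, make_input, sizes):
    """Run `func` on inputs of growing size and record CPU time and peak memory.

    `make_input(n)` builds an input of size n. Returns one dict per size.
    Note that tracing allocations slows execution, so the CPU times are only
    comparable with each other, not with untraced runs.
    """
    rows = []
    for n in sizes:
        data = make_input(n)           # built before tracing starts
        tracemalloc.start()
        start = time.process_time()
        result = func(data)
        cpu_seconds = time.process_time() - start
        _current, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        rows.append({
            "n": n,
            "cpu_seconds": cpu_seconds,
            "peak_bytes": peak_bytes,
            "result": result,
        })
    return rows


# Example: how does the built-in sort behave on reversed lists of growing size?
rows = benchmark(sorted, lambda n: list(range(n, 0, -1)), [10, 100, 1000])
```

Plotting `cpu_seconds` and `peak_bytes` against `n` gives the growth curves described above, and comparing the `result` fields of two algorithms checks that they agree.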
XML Document Processing: Principles and Practices
XML documents, in particular when they are constrained
using DTD (Document Type Definition) or XML Schema, are highly
structured. The transformation of XML documents of one DTD (or
Schema) to XML documents of another DTD can hence be thought of as
a structural mapping between two sets of constraints. We are interested
in modeling such structural mappings in a formal way, and in putting
these formal models to practical use.
We propose a parametric content model for XML DTDs
and construct, automatically, their validation procedures. The model
provides a basis for typeful XML programming in ML (a type-safe programming
language with higher-order functions and parametric modules) and leads
to a theory of modular XML transformations.
We have also been working on various formal models
of XML streaming processors so as to ensure that the processors
have low memory consumption and always generate output of
the required structure.
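
The low-memory half of this goal can be illustrated with a minimal sketch that uses the incremental parser in the Python standard library. It is not one of the formal models mentioned above and gives no guarantee about output structure; the function name and the element names in the example are hypothetical. The processor handles one record at a time and discards its content immediately afterwards.

```python
import io
import xml.etree.ElementTree as ET


def sum_field(stream, record_tag, field_tag):
    """Add up an integer field over all records without keeping their content.

    `iterparse` reports each element when its end tag is read; clearing a
    record after use drops its children and text, so the content of processed
    records does not accumulate (the emptied record elements themselves
    remain attached to the root).
    """
    total = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == record_tag:
            total += int(elem.findtext(field_tag, default="0"))
            elem.clear()
    return total


# Hypothetical census-like document with two records.
document = io.BytesIO(
    b"<census>"
    b"<area><pop>10</pop></area>"
    b"<area><pop>32</pop></area>"
    b"</census>"
)
total_population = sum_field(document, "area", "pop")
```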
On the more practical side, we have been using XML
and related technologies, such as SVG (Scalable Vector Graphics),
in a Web-based population mapping system, called Taiwan Social
Map (http://tsm.iis.sinica.edu.tw), for online aggregation and visualization
of census datasets.
Also, many application domains have formed consortia
or coalitions to define domain-specific XML vocabularies for the
interchange of data among organizations within the application domains.
As XML database management matures, XML vocabularies may
even become the standard information model for future enterprise information
systems.
We have been involved in some of these standardization
activities as a technology provider, in particular, the domestic
development of XBRL for financial information representation and HL7/CDA
for medical record representation. We have formed a research group with
some scholars in accounting and information technology to develop domestic
XBRL taxonomies and build tools for XBRL financial information conversion
and analysis. We have joined forces with the Taiwan Medical Informatics
Association to define CDA-based XML schemas for medical records used
in hospitals in Taiwan and to develop a Web-based prototype system based
on these schemas that emphasizes data conversion to CDA and to conventional
medical databases.
With the increasing number of emerging XML vocabularies,
there is a clear need for an effective way to construct user interfaces
for XML data (documents) within a given vocabulary. So far, user
interfaces for XML data have been typically constructed from scratch
for specific XML vocabularies and applications. This approach is labor-intensive
and very costly considering that enterprise-level XML vocabularies
typically contain hundreds, even thousands, of elements and attributes,
and the user interface must handle dynamic XML data structures,
syntactic constraints, and presentation layout.
The current project is aimed at the construction of
graphical user interfaces for users to create, update and interact
with XML data.
We have been developing a generic interactive component,
Forms-XML, that generates form-based user interfaces for XML vocabularies,
and an XML document editor using Forms-XML. Forms-XML has the following
novel features. First, it prevents, rather than detects, syntactic violations,
so that a working document is always kept compliant with a given schema,
thus freeing its users from any concerns regarding XML syntax. Second, Forms-XML
supports neighborhood insertions, that is, the insertion of an element
within a neighborhood around the position where the insertion command is
issued. Thus the user does not have to locate the exact position for
an insertion. Finally, Forms-XML offers facilities to customize and
fine-tune its user interface. These include an application programming
interface, a customization file, and a means to link to an external CSS
style sheet. With all these features, Forms-XML is able to generate user
interfaces for ordinary users who have no knowledge of XML.
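
The idea of a neighborhood insertion can be sketched as follows. This is not Forms-XML's algorithm, and it assumes a much simpler content model than a real DTD or schema allows: the children of an element must appear in a fixed tag order, and every tag may repeat any number of times. The function name and the example vocabulary are hypothetical. Given the position where the user issued the command, the sketch inserts the new element at the nearest position that keeps the document valid.

```python
def neighborhood_insert(children, new_tag, cursor, order):
    """Insert `new_tag` at the valid position closest to `cursor`.

    children: current child tags, assumed valid with respect to `order`
    cursor:   index where the insertion command was issued
    order:    the tag order required by the (simplified) content model
    Returns a new list; the input list is not modified.
    """
    rank = {tag: i for i, tag in enumerate(order)}
    if new_tag not in rank:
        raise ValueError(f"{new_tag!r} is not allowed here")
    r = rank[new_tag]
    # Position i is valid if nothing before it must come later than the new
    # tag and nothing from it onward must come earlier.
    valid = [
        i for i in range(len(children) + 1)
        if all(rank[c] <= r for c in children[:i])
        and all(rank[c] >= r for c in children[i:])
    ]
    if not valid:
        raise ValueError("the current children do not follow the required order")
    best = min(valid, key=lambda i: abs(i - cursor))
    return children[:best] + [new_tag] + children[best:]


# The user asks for an author while the cursor is at the end of the article;
# the only valid place is right after the title.
article = neighborhood_insert(
    ["title", "para", "para"], "author", cursor=3,
    order=["title", "author", "para"],
)
```

Because only valid positions are ever considered, the document cannot become invalid through an insertion, which matches the prevent-rather-than-detect approach described above.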
Open Spatial Information Technologies
Recent years have ushered in standard languages,
such as Geography Markup Language (GML) and Scalable Vector Graphics
(SVG), for the exchange of geographical and spatial data. Since their
advent, the project on open spatial information technologies has
successfully used them, together with only open source software, to
build a prototype system that retrofits a legacy data standard used
in Taiwan for the exchange of topographic maps. The prototype converts
topographic maps in legacy format to GML-based XML documents. The XML
documents are then visualized by a GML-to-SVG transformer. The current
research of the project focuses on promising new directions of Geographic
Information Systems (GIS) research with the use of new standard languages.
These languages, together with a Web-based open technology framework,
make it possible to build novel spatial systems. Such systems will not only
integrate heterogeneous sources of geographic and non-geographic data,
but can also be easily customized to meet the special needs of individuals
and communities and be made easily accessible from the Web by various
end devices. A goal of this work is to design and build a prototype
open spatial information system that embodies the research results
and demonstrates new concepts and approaches.
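
A GML-to-SVG transformation of the kind mentioned above can be sketched in a few lines. This is a minimal illustration, not the project's transformer: it assumes two-dimensional coordinates given as `x y` pairs in `gml:posList` elements, handles no other GML geometry encoding, and uses a hypothetical function name and drawing style. The y coordinate is flipped because the SVG y-axis points downward.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
SVG_NS = "http://www.w3.org/2000/svg"


def gml_poslists_to_svg(gml_text, height):
    """Turn every gml:posList in the document into an SVG polyline.

    `height` is the height of the drawing area in map units and is used
    to flip the y-axis.
    """
    root = ET.fromstring(gml_text)
    svg = ET.Element(f"{{{SVG_NS}}}svg")
    for pos_list in root.iter(f"{{{GML_NS}}}posList"):
        numbers = [float(value) for value in (pos_list.text or "").split()]
        pairs = zip(numbers[0::2], numbers[1::2])
        points = " ".join(f"{x:g},{height - y:g}" for x, y in pairs)
        ET.SubElement(
            svg, f"{{{SVG_NS}}}polyline",
            points=points, fill="none", stroke="black",
        )
    return svg


# A single contour line as it might appear in a GML document.
gml = (
    '<gml:LineString xmlns:gml="http://www.opengis.net/gml">'
    "<gml:posList>0 0 10 5</gml:posList>"
    "</gml:LineString>"
)
svg_root = gml_poslists_to_svg(gml, height=100)
```

Calling `ET.tostring(svg_root)` serializes the result so that it can be written to a file and opened in a browser.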
