|
Computation Theory and Algorithms Laboratory
Principal
Investigators:
[ Group Profile ]
(1) Graph Theory and Algorithms
Graphs are used to model many important applications and are a main tool for solving many
theoretical problems. We often start with probing fundamental theoretical problems such as
the structure of graphs with certain properties. With these properties, we then design efficient
solutions to them. We work on algorithmic graph problems arisen from real-world applications.
From phylogenetics, we work on two basic problems, the Steiner root problem and the supertree
construction problem. For a set of taxa with similarity relations represented by a graph,
the Steiner root problem is to find a phylogenetic tree of these taxa whose power is exactly the
graph. The supertree construction problem is to yield a phylogenetic tree from a set of small
phylogenetic trees with different but overlapped taxa sets. Though there exist many supertree
construction algorithms, biologists still desire an efficient algorithm providing comprehensive
information of phylogenetic relation among taxa. We also study fundamental algorithms arisen
from applications with massive data. One of the areas involves augmentation algorithms
on graphs with partition constraints. This stems from data privacy protection problems, and
has applications in various fault tolerant designs. One other area of our research concerns
algorithms for finding the tree editing distance when the distance has a limited bound. This
problem has applications in comparing large trees, which are models for many applications
including several problems in the area of bioinformatics.
(2) Geometric Computing
The Voronoi diagram is a very versatile and well-studied geometric construct in geometric
computing. The properties, complexity, and computation of Voronoi diagram are among the
fundamental problems to study. By introducing different metrics, such as time-distance, the
diagram will change. In general, we plan to study how the properties of the Voronoi diagram
vary with the metric selected, and how we make use of the properties of the selected metric,
if any, to construct the diagram. Other problems such as Map Labeling, i.e., how to label given
features of a map so that these labels do not overlap, and Defensive Competitive Facility Location,
i.e., finding the location of a facility so that the gain of prospective competitors is minimized
according to certain rules of competition, are also of interest. In biological computing
we use some important techniques, such as random sampling, parametric search, geometric
cutting, expander graph, matrix searching, etc. to tackle certain subsequence problems in a
given DNA or arbitrary sequence, e.g., finding a maximum density subsequence or a maximum
sum subsequence. We hope to use geometric techniques to solve additional problems
in biological computing, and moreover, apply them to problems in pattern recognition, image
processing, data mining, and so on.
(3) Network Design and Analysis
The network design problem considers how to connect a collection of sites by a network such
that certain requirements are met. Topics in network design arise as central issues in wide areas,
such as VLSI design, wireless sensor network, bioinformatics, and communication networks.
With respect to different applications, these issues demonstrate themselves in different forms,
with subtle similarities via various quality measurements and constraints in practice. From an
algorithmic point of view, we consider the construction and evaluation of networks under one
or more objective functions. We focus on several important measures to evaluate the quality
of networks, including weight (total length of all edges), diameter (longest path between any
two nodes), dilation (largest ratio of network distance to Euclidean distance), routing cost (total
sum over the network distances between all vertex pairs), bottleneck bandwidth or capacity,
etc. We will also address issues about fault tolerance and dynamic maintenance of networks
under node/link-transient failures. We will extend our research to related multi-criteria network
optimization problems. Most of these problems have been shown to
be computationally intractable. Thus, we will explore the underlying
structures or properties of these problems, and attempt to obtain
suitable approximation solutions or heuristics that are practically acceptable
for the situation under consideration.
(4) Data intensive computation: theory
Logic and knowledge representation for massive data: Much information
and knowledge are implicit in massive data. In recent years,
intelligent agent systems with learning and reasoning ability have
received much attention. We want to study the knowledge representation
and inference problems relevant to intelligent agent systems
based on formal logics. We will study methods to inductively produce
useful rules and knowledge with special attention provided to the
representation problem of such extracted knowledge. With a proper
knowledge representation framework, derived knowledge can serve
as the basis for further reasoning and decision making of the intelligent
agent systems. Different agent systems can exchange knowledge
based on the common representation, and agent systems can
produce even more complex knowledge by invoking a proper fusion
mechanism to incorporate knowledge from various sources. The fully
distributed yet coordinated knowledge extraction and processing
mechanism can derive useful knowledge for decision making from a
massive data set. This can effectively mitigate the information explosion
problem for decision makers.
Algorithms and protocols for privacy enhancing technology with
massive data: In recent years, many institutions have collected massive
and comprehensive individual data for various purposes. To
share those data can be beneficial to society, but it also poses a great
threat to individual privacy. How to share the data and keep individual
privacy intact is the topic we are interested in. We proposed
a logic framework to study risks to privacy when publishing microdata,
and a quantitative measurement for the privacy threat. We are
working on a more challenging issue involving the database linkage
problem: how to get the intended final answers without really linking
the databases. Multiparty privacy computation can be applied and
we plan to use secure scalar product as a building block to construct
basic functions which in turn can be used to build various application
systems. The ultimate goal is to develop a complete system, in which
users can write their program in certain high level languages and the
code will be automatically translated into a secure multiparty code
by a compiler.
(5) Computational Complexity
Randomness has become a valuable resource in computation. For
many computational problems, randomized algorithms are simpler,
more natural, or more efficient than the deterministic algorithms currently
known. However, randomized algorithms typically depend on
the availability of a perfect random source, whose existence even in
the nature is debatable. There are two approaches to resolve this issue.
One approach is to study whether or not randomized algorithms
can be derandomized into deterministic ones without sacrificing the
performance too much. We show how such derandomization can be
achieved under reasonable assumptions, and we also study its potential
limitation. We will study how to derandomize special classes
of randomized algorithms, such as those using logarithmic space or
constant parallel time, without relying on any assumption. The other
approach is to design procedures, called extractors, which can extract
almost perfect randomness from slightly random sources. We provide
an optimal construction of extractors, which settles a long-standing
open problem, and we also find a surprising application of extractors
in cryptography, for building an encryption scheme with an everlasting
security. We will extend the scope of extractors to consider
the possibility of extracting randomness from sources that only look
somewhat random to computationally-bounded observers, but contain
no randomness at all in a statistical sense.
(6) Cryptography
Efficient Cryptography and CHES (Crypto Hardware and Embedded
Systems):
Some situations call for the utmost effort in optimization. Here, it is
not just cryptanalytic efforts that require the highest efficiency, in
many cases so does encryption. Sometimes it is because resources
are limited: It seems that computers are ubiquitous, and will soon be
working invisibly and seamlessly in many ways; hence, security and
privacy are becoming pressing issues. RSA may lose its dominance
within 5-10 years, even without quantum computers, because it is
too heavy-weight for low-resource use. Indeed, because of the need
for better security in pervasive or ubiquitous computing, NATO is
planning to adopt ECC (ECIES, ECDSA) as its next standard. In the
opposite direction, higher efficiency in encryption algorithms can
result in higher security levels, or cheaper components for the same
security.
In this area, we study topics ranging from restricted linear algebra
and resource-limited arithmetic, to fast arithmetic and efficient
primitives. We are known for designing cryptographical approaches
for specialized hardware, including implementing cryptographical
algorithms on vector units in CPUs, FPGAs, ASICs, and GPU
(graphic processing units). One of our record-breaking results is
using GPUs to assist cryptanalytic computations. We also study the
implementation of practical information security algorithms, such
as using intelligent agents to assist server-less authenticated information
exchanges.
Post-Quantum Cryptography:
This term has two major meanings. One is the study of cryptosystems
using quantum effects to establish security and privacy, such
as the famous BB84 protocol. The other is the study of cryptography
that does not fall with the advent of quantum computers, which
are expected to become a reality within two decades. Our research
on MPKCs (Multivariate Public-Key Cryptosystems) has advanced
the understanding of the field from both theoretical and practical
viewpoints. MPKCs operate on a vector of variables over a small
field, instead of on an element in a huge algebraic structure (as
in RSA or ECC). This key characteristic makes MPKCs faster while
maintaining comparable design security; hence, they are useful for
low-resource environments, such as embedded systems and smart
cards. Recently, we have conducted several analyses and proposed
improvements to the design of such primitives. Today we have one
of the leading research teams in multivariate cryptosystems.
Algebraic Cryptanalysis:
We have made practical advances in equation-solving and algebraic
cryptanalysis, especially in Groebner Bases and the related XL
(eXtended Linearization) method and its variants. These systemsolving
methods have shaken the field of stream ciphers, and researchers
are still looking for a replacement to the venerable RC4
cipher. We are still working on faster implementations and additional
theory behind such system-solvers. This work also relates to
the previously mentioned area above, since an attack on an MPKC
is equivalent to solving an instance of the multivariate quadratic
systems (MQ) or the extended isomorphism of polynomials (EIP)
problems.
|

|