We study various topics in parallel processing, including parallel algorithm designs, architectures, compilers, and scientific applications and their implementations.
1. The design of efficient PRAM algorithms for graph theory, and their implementations and performance evaluations on parallel computers. Graph algorithms designed for the PRAM model are often very modular in design, and problems are solved by using these basic primitives. Therefore, we propose a set of basic parallel operations to solve various problems from graph theory. We are also interested in finding good mapping techniques to implement these basic operations on parallel machines.
2. The cost-effective tradeoff in the design of parallel architectures. Problems derived from the same application domain, or having a common mathematical structure, can often be solved efficiently by a special-purpose machine with a simple interconnection network, instead of an expensive general-purpose computer with a complicated network. We will investigate the tension between the efficiency of parallel programs and the complexity of the underlining computer architectures, so that we can find a cost-effective middle ground for a particular group of applications.
3. Language and compiler design. Parallel programming languages and compilers are the interfaces between parallel algorithms and computers. We are interested in the essential language features that make parallel machines easy to use, and at the same time we study various compiler issues that are critical in achieving high performance. We will study the parallelization of sequential codes, the optimization of communication and synchronization, and processor-level code optimization, and evaluate their impacts on the efficiency of code generated by a parallel compiler.
4. Practical issues from the implementation of scientific applications. We plan to design and implement software packages for scientific computations, including partial differential equation and linear system solvers, on parallel machines. We have implemented Barnes-Hut's N-body algorithms for the study of astrophysics and multi-filament vortex simulations. The high performance implementation introduces many innovative ideas in dynamic and irregular data structure designs and algorithmic optimizations for distributed memory parallel systems, and can be used as a framework to develop other similar N-body algorithms, e.g. Greengard-Rokhlin's fast multipole method. In addition to our previous emphasis on numerical applications, we are also interested in non-numerical, e.g., combinatorial optimization, algorithms to solve problems in VLSI design, network synthesis and analysis, and scheduling.
The principal investigators of this group are PeiZong Lee, Chien-Min Wang, Tsan-Sheng Hsu, Tyng-Ruey Chuang, and Jan-Jan Wu. Currently available parallel machines include an IBM SP-2, an nCUBE2/32, and a workstation cluster of eight Ultrasparc-2 workstations connected by the fast Ethernet.