Page 19 - 2017 Brochure
Research Description
Jan-Jan Wu, Research Fellow
Yuan-Hao Chang, Associate Research Fellow
Sheng-Wei Chen, Research Fellow
Pei-Zong Lee, Research Fellow

MapReduce framework was utilized by an Apache open-source project named
Hadoop. The Hadoop MapReduce framework was very successful and widely
adopted in the processing of large datasets. However, our experience with suffix array
construction on Hadoop showed that excessive disk usage and access may occur.
As a result, performance is degraded and the scale of the application is limited.
Expansive MapReduce (EMR) applications, such as suffix array construction, are
a group of applications that have performance and scalability issues with Hadoop.
Our first approach is to keep the input data of MapReduce jobs in in-memory data
stores so that the volume of partially duplicated data is minimized. The
second approach is to pass intermediate key/index pairs to reducers if the size of
a key/index pair is much smaller than the size of a key/value pair. The intermediate
values can later be retrieved from in-memory data stores by reducers. This may
greatly reduce the disk usage and access in the shuffle and sorting phase. Our third
approach is to optimize the response time of in-memory data stores, so as to improve
the overall performance. With these approaches, we shall propose a MapReduce
framework for EMR applications, which will be validated using suffix array
construction as our testbed.
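
The key/index idea can be illustrated with a minimal single-process sketch in Python. It is not the proposed framework and does not use Hadoop: the dictionary `store` is a hypothetical stand-in for an in-memory data store, and the function name, the `prefix_len` parameter, and the grouping by short prefix are assumptions made only for illustration. The mapper emits (key, index) pairs instead of (key, suffix) pairs, and the reducer fetches each suffix from the store by its index.

```python
from collections import defaultdict


def build_suffix_array(text: str, prefix_len: int = 2) -> list[int]:
    """Simulate a key/index shuffle for suffix array construction."""
    # Hypothetical in-memory data store: the input is kept once,
    # rather than being duplicated into every intermediate value.
    store = {"input": text}

    # Map phase: emit (key, index) pairs. The key is a short prefix of
    # the suffix; the index is a small integer, not the suffix itself.
    shuffled = defaultdict(list)
    for index in range(len(text)):
        key = text[index:index + prefix_len]
        shuffled[key].append(index)

    # Reduce phase: keys are processed in sorted order. Each reducer
    # retrieves the suffixes it needs from the store by index and sorts
    # the indices of its group by the full suffix.
    suffix_array = []
    for key in sorted(shuffled):
        data = store["input"]
        suffix_array.extend(sorted(shuffled[key], key=lambda i: data[i:]))
    return suffix_array
```

In this sketch the data passed between the map and reduce phases consists only of short keys and integer indices, which is the property the second approach relies on; a real deployment would replace `store` with a networked in-memory store and would have to account for its lookup latency.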

Chien-Min Wang, Associate Research Fellow

4. Non-volatile Memory as Main Memory and Storage

Because data-intensive applications are increasingly being run on computer
systems, it has become critically important to improve I/O efficiency. Compensating for
the low access performance of existing storage devices by enlarging DRAM capacity
to support more data-intensive applications increases energy consumption,
which is a major concern in system design. Thus, a new opportunity for non-volatile
memory (NVM) has presented itself. We propose to adopt NVM as both main
memory and storage. Replacing DRAM with NVM will reduce the energy consumption
of main memory, and replacing disk drives with NVM will enhance I/O performance because
of NVM's superior byte-addressability, non-volatility, scalable capacity, and high
access performance. To achieve this goal, we propose an NVM translation layer for
the NVM controller, which considers how the main memory is used by the operating
system together with the access patterns between main memory and storage. In this
translation layer, we resolve the endurance issue by allocating young NVM cells from
the whole NVM space for main memory use, and switch NVM cells from the storage
space to the main memory space in order to enhance performance when loading and
storing data between the main-memory buffer and storage. In the future, we will
continue to tackle these efficiency issues from the perspective of the operating
system by rethinking memory management and file system design to maximize
the benefits of using NVM as both main memory and storage.
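
The two mechanisms described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the proposed translation layer: the class name, the per-block wear counters, and the method names are hypothetical, and the policy of giving storage the most-worn free block is an assumption added for contrast (the text only states that young cells go to main memory).

```python
class NvmTranslationLayer:
    """Toy model of wear-aware allocation and role switching."""

    def __init__(self, wear_counts):
        # wear[b] is the number of writes block b has absorbed so far.
        self.wear = dict(enumerate(wear_counts))
        self.free = set(self.wear)
        self.role = {}  # block id -> "memory" or "storage"

    def allocate_for_memory(self):
        # Main memory is write-intensive, so it receives the youngest
        # (least-worn) free block from the whole NVM space.
        block = min(self.free, key=lambda b: (self.wear[b], b))
        self.free.remove(block)
        self.role[block] = "memory"
        return block

    def allocate_for_storage(self):
        # Assumption: storage is written less often, so it can take
        # the most-worn free block.
        block = max(self.free, key=lambda b: (self.wear[b], -b))
        self.free.remove(block)
        self.role[block] = "storage"
        return block

    def load(self, block):
        # "Loading" switches a storage block into the main-memory space
        # by changing its role; no data is copied and no write occurs.
        if self.role.get(block) != "storage":
            raise ValueError("block is not in the storage space")
        self.role[block] = "memory"
        return block

    def write(self, block):
        # Every write ages the block.
        self.wear[block] += 1
```

Because `load` only remaps the block, the wear counter is untouched, which is one way to read the performance benefit of switching cells between the two spaces instead of copying data.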

Figure: Non-volatile Memory Translation Layer