Institute of Information Science, Academia Sinica





Optimizing Performance of Memory Systems in Multicore Processors

  • Lecturer: Dr. Krishna Kavi (NSF Industry/University Cooperative Research Center)
    Host: Dr. Pen-Chung Yew
  • Time: 2010-04-16 (Fri.) 10:30 – 12:00
  • Location: Auditorium 106 at new IIS Building
Abstract: This talk focuses on techniques to improve memory performance in multicore processors. Per-core cache memories are becoming smaller as more cores (CPUs) are added to chips, and pin limitations constrain the available bandwidth to DRAMs. Graphics processing units (GPUs) contain very small local memories but no caches; data must be sent to these units before computations can proceed. Thus the performance of multicore systems is limited by the performance of their memory systems. Our research has developed both hardware and software solutions to overcome these limitations. Software solutions include profiling data access patterns, relocating data, and restructuring code to improve performance. Hardware solutions include reconfigurable caches that can be optimally configured for each application, caches partitioned for different data types or used as scratch-pad memories, and hardware support for intelligent management of data.

Spatial locality can be improved by profiling dynamically allocated objects and relocating them to contiguous memory. Even data exhibiting temporal locality is likely to be evicted between uses if the reuse distance (the time between successive accesses of a data object) is large. Code-restructuring techniques such as fusing two loops into a single loop or tiling large loops into smaller ones can reduce reuse distance, leading to better cache performance. The locality exhibited by data depends on object types and how they are accessed in an application; better performance can be achieved if cache memories are partitioned and reconfigured optimally to meet the divergent needs of different data types and access patterns. Combining data and code restructuring with reconfigurable caches can lead to even better performance, and hardware can be used to aid in the allocation and relocation of dynamic data.

Bio: Dr. Krishna Kavi is currently a Professor of Computer Science and Engineering and the Director of the NSF Industry/University Cooperative Research Center for Net-Centric Software and Systems at the University of North Texas. From 2001 to 2009 he served as the Chair of the department. He has also held an Endowed Chair Professorship in Computer Engineering at the University of Alabama in Huntsville and served on the faculty of the University of Texas at Arlington. He was a Scientific Program Manager at the US National Science Foundation during 1993-1995, and he has served on several editorial boards and program committees. His research is primarily on computer systems architecture, including multi-threaded and multi-core processors, cache memories, and hardware-assisted memory managers. He has also conducted research in formal methods, parallel processing, and real-time systems. He has published more than 150 technical papers in these areas, received more than US $4M in research grants, and graduated 12 PhD and more than 35 MS students. He received his PhD from Southern Methodist University in Dallas, Texas, and a BS in EE from the Indian Institute of Science in Bangalore, India.