Efficient Virtual Address Translation Architecture

Virtual memory is a key technology in modern computer systems. Every memory access requires a translation from a virtual address to a physical address, so address translation performance is important. We study opportunities to reduce address translation overhead. We have proposed architectures that minimize the front address translation cost and that let each translation entry cover a broader range of addresses, improving address translation efficiency.
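As a rough illustration of why broader coverage per translation entry matters, the sketch below computes how much memory a translation cache can cover. The entry count (64) and the page sizes (4 KiB and 2 MiB) are assumptions chosen for the example, not parameters of our proposed architectures.

```python
KIB = 1024
MIB = 1024 * KIB


def translation_coverage(entries: int, bytes_per_entry: int) -> int:
    """Bytes of memory reachable without a translation miss."""
    return entries * bytes_per_entry


# Hypothetical 64-entry translation cache.
small = translation_coverage(64, 4 * KIB)   # 64 * 4 KiB = 256 KiB
large = translation_coverage(64, 2 * MIB)   # 64 * 2 MiB = 128 MiB

print(f"4 KiB per entry: {small // KIB} KiB covered")
print(f"2 MiB per entry: {large // MIB} MiB covered")
```

With the same number of entries, the coverage grows in proportion to the range each entry translates.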

Acceleration System for Machine Learning

Machine learning has recently become a key application. Because many user applications directly use machine learning services, service providers must meet service level agreements (SLAs) to guarantee the user experience. The considerable computational demand of machine learning workloads makes this difficult. Dedicating resources to a workload makes it easy to guarantee the SLA, but it wastes resources. The goal of this study is to improve resource utilization while guaranteeing the performance of machine learning workloads.

Hardware-assisted Security for Cloud Systems

Security is one of the most important features in cloud computing environments. We focus on enhancing security by proposing new hardware support or by using existing hardware support. The goal of this study is to improve security with minimal architectural modifications and performance overhead. We have proposed secure memory architectures and architectural support for secure virtualized environments.

CPU Scheduling

CPU scheduling is a traditional but important topic in computer systems. We have studied CPU scheduling in various contexts, ranging from asymmetric multi-core systems to virtualized environments. The goal of our CPU scheduling study is to improve fairness while mitigating scheduling artifacts.

Efficient Memory Persistency for Future Memory Systems

In future systems with non-volatile memory (NVM), in-memory data remain durable and persistent across system power cycles. We study an efficient way to support atomicity and durability of data in storage-class memory (SCM) through hardware-assisted write-ahead logging in non-volatile memory. We propose a new hardware logging scheme that updates in-place data asynchronously and directly while optimizing log writes, which substantially reduces the number of NVM writes.
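The write-ahead rule itself can be sketched in software. The toy model below uses generic undo logging and is not the hardware scheme we propose: before each in-place update it records the old value in a log, so an interrupted transaction can be rolled back during recovery. The class and method names are hypothetical, and Python dictionaries and lists stand in for persistent memory.

```python
_ABSENT = object()  # marks an address that had no value before the write


class PersistentRegion:
    """Toy stand-in for NVM: `data`, `log`, and `committed` survive a crash."""

    def __init__(self):
        self.data = {}
        self.log = []          # undo records: (address, old value)
        self.committed = True

    def begin(self):
        self.log.clear()
        self.committed = False

    def write(self, addr, value):
        # Write-ahead rule: log the old value before updating in place.
        self.log.append((addr, self.data.get(addr, _ABSENT)))
        self.data[addr] = value

    def commit(self):
        # Once committed, the undo records are no longer needed.
        self.committed = True
        self.log.clear()

    def recover(self):
        # After a crash, undo an unfinished transaction in reverse order.
        if not self.committed:
            for addr, old in reversed(self.log):
                if old is _ABSENT:
                    self.data.pop(addr, None)
                else:
                    self.data[addr] = old
            self.log.clear()
            self.committed = True


region = PersistentRegion()
region.begin()
region.write("a", 1)
region.commit()

region.begin()
region.write("a", 2)
region.write("b", 3)
region.recover()          # simulate a crash before commit
print(region.data)
```

Each update in this sketch costs two writes, one to the log and one to the data, which is the kind of overhead that log-write optimization aims to reduce.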

RDMA-based Memory Disaggregation

In cluster systems, the memory of some nodes can be underutilized. We propose an RDMA-based memory disaggregation system to fully utilize the memory in distributed systems. Memory disaggregation introduces multiple tiers into the memory system. In multi-tiered memory systems, the memory management policy is important for performance: we can achieve high performance by placing frequently accessed data in local memory. We study system architectures and memory management policies.
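A minimal sketch of such a placement policy, assuming a simple access-count heuristic: the most frequently accessed pages go to local memory and the rest go to remote memory. The function name, the trace format, and the capacity value are hypothetical, and a real policy would also have to account for migration cost and changing access patterns.

```python
from collections import Counter


def place_pages(access_trace, local_capacity):
    """Split pages into (local, remote) sets by access frequency."""
    counts = Counter(access_trace)
    # Keep the `local_capacity` most frequently accessed pages local.
    local = {page for page, _ in counts.most_common(local_capacity)}
    remote = set(counts) - local
    return local, remote


# Hypothetical trace of page numbers; local memory holds two pages.
trace = [1, 1, 1, 2, 2, 3, 4, 4, 4, 4]
local, remote = place_pages(trace, local_capacity=2)
print("local:", sorted(local), "remote:", sorted(remote))
```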

Memory Compression Architecture

Memory compression is a technique that serves the physical address space with fewer physical memory chips. A memory compression architecture aims to reduce the cost of the memory system, but it introduces additional latency and requires architectural modifications. The goal of this study is to design an efficient memory compression architecture with short latency for frequently accessed data. We mainly focus on hardware-level memory compression.
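The trade-off can be sketched in software, assuming 4 KiB pages and using zlib as a stand-in for a hardware compressor. Pages marked as hot are stored uncompressed so that reads avoid decompression, while the remaining pages are compressed to save space. The class name and the fixed hot-page set are hypothetical and do not describe our architecture.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes


class CompressedMemory:
    """Toy page store that compresses every page not marked as hot."""

    def __init__(self, hot_pages):
        self.hot_pages = set(hot_pages)
        self.store = {}  # page number -> (is_compressed, payload)

    def write(self, page_no, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        if page_no in self.hot_pages:
            self.store[page_no] = (False, data)
        else:
            self.store[page_no] = (True, zlib.compress(data))

    def read(self, page_no):
        is_compressed, payload = self.store[page_no]
        # Only cold pages pay the decompression latency.
        return zlib.decompress(payload) if is_compressed else payload

    def footprint(self):
        """Bytes actually occupied by the stored payloads."""
        return sum(len(payload) for _, payload in self.store.values())


memory = CompressedMemory(hot_pages={0})
zero_page = bytes(PAGE_SIZE)
memory.write(0, zero_page)   # hot: stored uncompressed
memory.write(1, zero_page)   # cold: stored compressed
print(memory.footprint(), "bytes used for", 2 * PAGE_SIZE, "bytes of data")
```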

Flexible and Cost-efficient HW-based Memory Mapping Mechanism

Hybrid memory systems have emerged alongside new memory technologies and memory models, such as 3D-stacked memory and non-volatile memory, to exploit the strengths of different memory types in a single system. In an existing homogeneous memory system, the OS can serve access requests through address translation from virtual to physical addresses. Hybrid memory systems, however, require a new mechanism to find the actual memory location. Our goal is to build a flexible and cost-efficient HW-based memory mapping mechanism for hybrid memory systems. This mapping structure lets a system process memory requests quickly with minimal area overhead for mapping information.
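The extra lookup can be pictured as a remapping table from physical page numbers to a device and a frame within that device. The sketch below is a software illustration only: the 4 KiB page size, the device names, and the `RemapTable` class are assumptions, and it does not reflect the structure of our hardware mechanism.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class RemapTable:
    """Maps a physical page number to (device, frame number)."""

    def __init__(self):
        self.entries = {}

    def map_page(self, ppn, device, frame):
        self.entries[ppn] = (device, frame)

    def locate(self, paddr):
        """Return (device, device-local address) for a physical address."""
        ppn, offset = divmod(paddr, PAGE_SIZE)
        device, frame = self.entries[ppn]
        return device, frame * PAGE_SIZE + offset


table = RemapTable()
table.map_page(2, "stacked", 5)   # hypothetical 3D-stacked memory device
table.map_page(3, "nvm", 0)       # hypothetical non-volatile memory device
print(table.locate(2 * PAGE_SIZE + 12))
```

Each entry in such a table consumes storage, which is why the area overhead of the mapping information matters.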