Commit

update
naivewong committed May 6, 2024
1 parent 52d47f2 commit 6a600b5
Showing 2 changed files with 99 additions and 57 deletions.
6 changes: 3 additions & 3 deletions css/myMiddle.css
@@ -26,12 +26,12 @@ body {
}

ul{
padding-left: 1.4em;
padding-left: 0em;
}

li{
padding-left: 1.4em;
margin-left: 2.4em;
padding-left: 0em;
margin-left: 0em;
}


150 changes: 96 additions & 54 deletions index.html
@@ -165,22 +165,28 @@ <h2>Biography</h2>
<p>Education</p>
<ul>
<li>
2019 - 2023: Ph.D. in Computer Science and Engineering, CUHK
<div>
<div style="float:left">Ph.D., Computer Science and Engineering, CUHK</div>
<div style="float:right">2019 - 2023</div>
</div>
</li>
<li>
2014 - 2019: B.Eng. in Computer Engineering, CUHK (first class honour, dean list of 2017, 2019) (2+2 joint education with SYSU)
<div>
<div style="float:left">B.Eng., with First Class Honor, Computer Engineering, CUHK</div>
<div style="float:right">2014 - 2019</div>
</div>
</li>
</ul>
My research includes -</p>
<ul>
<li>
Big data applications: timeseries management system and databases.
Big data systems: timeseries management system and databases.
</li>
<li>
Storage libraries/applications: key-value stores.
Storage engines: LSM-tree-based key-value stores.
</li>
<li>
File systems and in-storage computing (hardware/software co-design).
File systems and in-storage computing.
</li>
</ul>

@@ -189,84 +195,112 @@ <h2>Biography</h2>
<h2>Publications</h2>
<ul>
<li>
<b>A Spatio-Temporal Series Data Model with Efficient Indexing and Layout for Cloud-Based Trajectory Data Management.</b> <br/>
Yang Guo, <b>Zhiqi Wang</b>, Jin Xue, and Zili Shao. <br/>
<i>The 40th International Conference on Data Engineering (<b>ICDE 2024</b>)(CCF-A).</i> <br/>
[<a>code</a>] [<a>paper</a>] <br/><br/>
</li>
<li>
<b>MirrorKV: An Efficient Key-Value Store on Hybrid Cloud Storage with Balanced Performance of Compaction and Querying.</b> <br/>
MirrorKV: An Efficient Key-Value Store on Hybrid Cloud Storage with Balanced Performance of Compaction and Querying. <br/>
<b>Zhiqi Wang</b>, and Zili Shao. <br/>
<i>The 43rd ACM SIGMOD International Conference on Management of Data (<b>SIGMOD 2024</b>)(CCF-A).</i> <br/>
[<a href="https://dl.acm.org/doi/10.1145/3626736" target="“blank”">paper</a>] <br/><br/>
</li>
<li>
<b>Lightning Talk: Model, Framework and Integration for In-Storage Computing with Computational SSDs.</b> <br/>
Tianyu Wang, Jin Xue, Zelin Du, <b>Zhiqi Wang</b>, Yaotian Cui, and Zili Shao. <br/>
<i>The 60th ACM/IEEE Design Automation Conference (<b>DAC 2023</b>)(CCF-A)(invited paper).</i> <br/>
[<a href="https://doi.org/10.1109/DAC56929.2023.10247955" target="“blank”">paper</a>] <br/><br/>
</li>
<li>
<b>ForestTI: A Scalable Inverted-Index-Oriented Timeseries Management System with Flexible Memory Efficiency.</b> <br/>
ForestTI: A Scalable Inverted-Index-Oriented Timeseries Management System with Flexible Memory Efficiency. <br/>
<b>Zhiqi Wang</b>, and Zili Shao. <br/>
<i>The 42nd ACM SIGMOD International Conference on Management of Data (<b>SIGMOD 2023</b>)(CCF-A).</i> <br/>
[<a href="https://github.com/naivewong/forestti" target="“blank”">code</a>] [<a href="https://dl.acm.org/doi/10.1145/3589260" target="“blank”">paper</a>] <br/><br/>
</li>
<li>
<b>BSCache: A Brisk Semantic Caching Scheme for Cloud-based Performance Monitoring Timeseries Systems.</b> <br/>
Kai Zhang, <b>Zhiqi Wang</b>, and Zili Shao. <br/>
<i>Proceedings of the 51st International Conference on Parallel Processing (<b>ICPP 2022</b>)(CCF-B).</i> <br/>
[<a href="https://github.com/kaizhang15/BSCache" target="“blank”">code</a>] [<a href="https://dl.acm.org/doi/abs/10.1145/3545008.3546183" target="“blank”">paper</a>] <br/><br/>
</li>
<li>
<b>TimeUnion: An Efficient Architecture with Unified Data Model for Timeseries Management Systems on Hybrid Cloud Storage.</b> <br/>
TimeUnion: An Efficient Architecture with Unified Data Model for Timeseries Management Systems on Hybrid Cloud Storage. <br/>
<b>Zhiqi Wang</b>, and Zili Shao. <br/>
<i>The 41st ACM SIGMOD International Conference on Management of Data (<b>SIGMOD 2022</b>)(CCF-A).</i> <br/>
[<a href="https://github.com/naivewong/timeunion" target="“blank”">code</a>] [<a href="https://dl.acm.org/doi/10.1145/3514221.3526175" target="“blank”">paper</a>] <br/><br/>
</li>
<li>
<b>TagTree: Global Tagging Index with Efficient Querying for Time Series Databases.</b> <br/>
Jin Xue, <b>Zhiqi Wang</b>, and Zili Shao. <br/>
<i>The 36th IEEE International Parallel & Distributed Processing Symposium (<b>IPDPS 2022</b>)(CCF-B).</i> <br/>
[<a href="https://github.com/Jimx-/tagtree" target="“blank”">code</a>] [<a href="https://ieeexplore.ieee.org/abstract/document/9820720" target="“blank”">paper</a>] <br/><br/>
</li>
<li>
<b>Heracles: An Efficient Storage Model and Data Flushing for Performance Monitoring Timeseries.</b> <br/>
Heracles: An Efficient Storage Model and Data Flushing for Performance Monitoring Timeseries. <br/>
<b>Zhiqi Wang</b>, Jin Xue, and Zili Shao. <br/>
<i>The 47th International Conference on Very Large Data Bases (<b>VLDB 2021</b>)(CCF-A), Volume 14(6), 1080-1092.</i> <br/>
[<a href="https://github.com/naivewong/heracles" target="“blank”">code</a>] [<a href="https://www.vldb.org/pvldb/vol14/p1080-wang.pdf" target="“blank”">paper</a>] <br/><br/>
</li>
</ul>

<h2>Experience</h2>
<ul>
<li>
11/2023 - Present: Postdoc in CUHK. In-storage computation research.
A Spatio-Temporal Series Data Model with Efficient Indexing and Layout for Cloud-Based Trajectory Data Management. <br/>
Yang Guo, <b>Zhiqi Wang</b>, Jin Xue, and Zili Shao. <br/>
<i>The 40th International Conference on Data Engineering (<b>ICDE 2024</b>)(CCF-A).</i> <br/>
[<a>code</a>] [<a>paper</a>] <br/><br/>
</li>
<li>
Lightning Talk: Model, Framework and Integration for In-Storage Computing with Computational SSDs. <br/>
Tianyu Wang, Jin Xue, Zelin Du, <b>Zhiqi Wang</b>, Yaotian Cui, and Zili Shao. <br/>
<i>The 60th ACM/IEEE Design Automation Conference (<b>DAC 2023</b>)(CCF-A)(invited paper).</i> <br/>
[<a href="https://doi.org/10.1109/DAC56929.2023.10247955" target="“blank”">paper</a>] <br/><br/>
</li>
<li>
06/2022 - 08/2022: Huawei Cloud Database Innovation Lab internship. Optimization of the storage engine of <a href="http://opengemini.org/" target="“blank”">OpenGemini</a>.
BSCache: A Brisk Semantic Caching Scheme for Cloud-based Performance Monitoring Timeseries Systems. <br/>
Kai Zhang, <b>Zhiqi Wang</b>, and Zili Shao. <br/>
<i>Proceedings of the 51st International Conference on Parallel Processing (<b>ICPP 2022</b>)(CCF-B).</i> <br/>
[<a href="https://github.com/kaizhang15/BSCache" target="“blank”">code</a>] [<a href="https://dl.acm.org/doi/abs/10.1145/3545008.3546183" target="“blank”">paper</a>] <br/><br/>
</li>
<li>
06/2019 - 08/2019: Google Summer of Code 2019. Optimization of the storage engine of <a href="https://prometheus.io/" target="“blank”">Prometheus</a>.
TagTree: Global Tagging Index with Efficient Querying for Time Series Databases. <br/>
Jin Xue, <b>Zhiqi Wang</b>, and Zili Shao. <br/>
<i>The 36th IEEE International Parallel & Distributed Processing Symposium (<b>IPDPS 2022</b>)(CCF-B).</i> <br/>
[<a href="https://github.com/Jimx-/tagtree" target="“blank”">code</a>] [<a href="https://ieeexplore.ieee.org/abstract/document/9820720" target="“blank”">paper</a>] <br/><br/>
</li>
</ul>

<h2>Projects</h2>
<h2>Working Experience</h2>
<ul>
<li>
<b>Gemini</b>: A monolithic software/hardware co-design key-value file system with computational storage. First, it contains a host-side kernel file system, which translates the file semantics to key-value commands. Second, we customize the Linux NVMe driver to bypass the Linux block layer and transmit the key-value commands. Third, we carefully design the flash translation layer (FTL) in our real hardware platform (a computational SSD) to handle the received key-value commands and manage the physical area of the SSD.
<div>
<div style="float:left">Postdoctoral Fellow in CUHK</div>
<div style="float:right">11/2023 - Present</div>
</div>
</li>
<li>
<div>
<div style="float:left">Huawei Cloud Database Innovation Lab (internship)</div>
<div style="float:right">06/2022 - 08/2022</div>
</div>
</li>
<li>
<b>MirrorKV</b>: An LSM-tree-based key-value store tailored for cloud storage (EBS, S3). The key idea is to design different compaction mechanisms for different storage tiers, and manage keys and values in two mirrored LSM-trees to maintain the data locality. This project derives from RocksDB (C++).
<div>
<div style="float:left">Google Summer of Code (internship)</div>
<div style="float:right">06/2019 - 08/2019</div>
</div>
</li>
</ul>

<h2>Research Experience</h2>
<b>Big Data Systems</b> <br/>
<i>Supported by Hong Kong General Research Fund: Optimizing Storage System Design for Spatial-Temporal Big Data (RGC Ref No. 15224918). Serve as the project participant.</i>
<ul>
<li>
<b>ForestTI</b>: A memory-efficient timeseries storage engine. The key idea is to design a flexible inverted index that can dynamically alter the structure based on the memory pressure. This project derives from TimeUnion.
<b>Timeseries Management Systems</b> <br/>
A thorough study of the main design decisions of timeseries management systems, including the data model, memory data management, and persistent data management.
<ul>
<li>Data model: To solve the data redundancy issue of the timeseries data from the same data source, we propose a unified data model for both tags and data samples of timeseries, with a novel compression mechanism and a two-level indexing design.</li>
<li>Memory data management: To mitigate the memory overhead and maintain more timeseries with limited memory, we design a flexible inverted index that can dynamically adapt its structure to the memory pressure.</li>
<li>Persistent data management: To achieve high insertion throughput of big timeseries data, we design a dynamic time-partitioned LSM-tree with high insertion throughput, decent space efficiency, and efficient out-of-order data handling.</li>
</ul>
</li>
</ul>

<b>Storage Engines</b> <br/>
<i>Supported by Hong Kong General Research Fund: StoreLess: Eliminating Redundancy for LSM-tree based Key-Value Stores as Database Storage Engines in Internet Applications (RGC Ref No. 14219422). Serve as the project participant.</i>
<ul>
<li>
<b>TimeUnion</b>: A timeseries storage engine tailored for cloud storage (EBS, S3). First, it proposes a unified data model for timeseries tag management. Second, it presents a time-partitioned LSM-tree with hot/cold data separation and efficient out-of-order data handling. This project is written from scratch in C++.
<b>LSM-Tree-Based Key-Value Stores with Hybrid Cloud Storage</b> <br/>
LSM-tree-based key-value stores are widely used as the storage engines of big data systems. As the data volume scales up, it is a natural trend to deploy the system on the cloud. However, existing LSM-tree designs cannot adapt to cloud storage because of the huge performance gap. We design MirrorKV with balanced read/write performance: it separates keys and values into two mirrored LSM-trees for better data locality and read performance, and uses different compaction mechanisms for fast and slow storage to improve write performance.
</li>
</ul>

<b>File Systems and In-Storage Computing</b> <br/>
<i>Supported by Hong Kong General Research Fund Project: Data Model and Programming Framework for Function Offloading in In-SSD Computing (RGC Ref No. 14202123). Serve as the project participant.</i>
<ul>
<li>
<b>Heracles</b>: A timeseries storage engine with group data management and efficient data flushing mechanism. This project derives from the storage engine of Prometheus (Golang).
<b>A Monolithic Software/Hardware Co-Design Key-Value File System</b> <br/>
To mitigate the metadata manipulation overhead and I/O amplification of the traditional file systems designed for block storage, we implement a file system with a key-value interface, which offloads the data management to our computational storage platform.
<ul>
<li>Host-side key-value filesystem: It translates the file semantics (inode and page contents) to key-value commands correspondingly.</li>
<li>Host storage communication: We customize the Linux NVMe driver to bypass the Linux block layer and transmit the key-value commands.</li>
<li>Storage-side design: We carefully design the flash translation layer (FTL) to handle the received key-value commands and manage the physical area of the SSD.</li>
</ul>
</li>
</ul>

@@ -280,9 +314,9 @@ <h2>Awards</h2>
</li>
</ul>

<h2>Teaching</h2>
<h2>Teaching Experience</h2>
CSCI3150: Introduction to Operating Systems
<ul>
CSCI3150: Introduction to Operating Systems
<li>
Fall 2019
</li>
@@ -294,7 +328,7 @@ <h2>Teaching</h2>
</li>
</ul>

<h2>Professional Activities</h2>
<h2>Professional Experience</h2>
<ul>
<li>
Participation & Talks
@@ -308,11 +342,19 @@ <h2>Professional Activities</h2>
<li>
External Reviewer
<ul>
<li>2024: CODES+ISSS</li>
<li>2023: TODS</li>
<li>2022: DAC, CODES+ISSS, SIGPLAN/SIGBED</li>
<li>2021: DAC, ICCAD, CODES+ISSS</li>
<li>2020: DAC, ICCAD, CODES+ISSS</li>
<li>Journal</li>
<ul>
<li>ACM Transactions on Database Systems (TODS)</li>
<li>IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)</li>
</ul>
<li>Conference</li>
<ul>
<li>Design Automation Conference (DAC)</li>
<li>International Conference on Computer Design (ICCD)</li>
<li>Design Automation and Test in Europe Conference (DATE)</li>
<li>International Conference on Computer Aided Design (ICCAD)</li>
<li>Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)</li>
</ul>
</ul>
</li>
</ul>
