diff --git a/README.md b/README.md index abf159dd..c176bc73 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,14 @@ -# libCacheSim - building and running cache simulations +

+ + libCacheSim + +

+ +

+A high-performance library for building and running cache simulations +

+ +--- [![build](https://github.com/1a1a11a/libCacheSim/actions/workflows/build.yml/badge.svg)](https://github.com/1a1a11a/libCacheSim/actions/workflows/build.yml) [![Python Release](https://github.com/1a1a11a/libCacheSim/actions/workflows/pypi-release.yml/badge.svg)](https://github.com/1a1a11a/libCacheSim/actions/workflows/pypi-release.yml) @@ -6,44 +16,6 @@ [![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/1a1a11a/libCacheSim/badge)](https://scorecard.dev/viewer/?uri=github.com/1a1a11a/libCacheSim) - -- [libCacheSim - building and running cache simulations](#libcachesim---building-and-running-cache-simulations) - - [News](#news) - - [What is libCacheSim](#what-is-libcachesim) - - [libCacheSim features](#libcachesim-features) - - [Supported algorithms](#supported-algorithms) - - [Eviction algorithms](#eviction-algorithms) - - [Admission algorithms](#admission-algorithms) - - [Prefetching algorithms](#prefetching-algorithms) - - [Build and Install libCacheSim](#build-and-install-libcachesim) - - [One-line install](#one-line-install) - - [Install dependency](#install-dependency) - - [Build libCacheSim](#build-libcachesim) - - [Developer Setup](#developer-setup) - - [Pre-commit Hooks](#pre-commit-hooks) - - [Usage](#usage) - - [cachesim (a high-performance cache simulator)](#cachesim-a-high-performance-cache-simulator) - - [basic usage](#basic-usage) - - [Run a single cache simulation](#run-a-single-cache-simulation) - - [Run multiple cache simulations with different cache sizes](#run-multiple-cache-simulations-with-different-cache-sizes) - - [Debug cachesim](#debug-cachesim) - - [Plot miss ratio curve](#plot-miss-ratio-curve) - - [Trace analysis](#trace-analysis) - - [Miss ratio curves profiling](#miss-ratio-curves-profiling) - - [Using libCacheSim as a library](#using-libcachesim-as-a-library) - - [Extending libCacheSim (new algorithms and trace types)](#extending-libcachesim-new-algorithms-and-trace-types) - - [Python package](#python-package) - 
- [Simulation with python](#simulation-with-python) - - [Extending new algorithm](#extending-new-algorithm) - - [Open source cache traces](#open-source-cache-traces) - - [Contributions](#contributions) - - [Reference](#reference) - - [License](#license) - - [Related](#related) - - - - ## News * **2024 Oct**: **S3-FIFO** gets an upgrade! Please try out the new version (the old is now renamed to S3-FIFOv0). * **2023 June**: **QDLP** is available now, see [our paper](https://dl.acm.org/doi/10.1145/3593856.3595887) for details. @@ -51,7 +23,6 @@ * **2024 Jan**: We compiled a list of open-source cache datasets at the bottom of this page --- - ## What is libCacheSim * a high-performance **cache simulator** for running cache simulations. * a high-performance and versatile trace analyzer for **analyzing different cache traces**. @@ -59,7 +30,6 @@ --- - ## libCacheSim features * **High performance** - over 20M requests/sec for a realistic trace replay. * **High memory efficiency** - predictable and small memory footprint. @@ -71,7 +41,6 @@ * **Efficient Miss Ratio Curve profiler** - quickly build highly accurate miss ratio curves on large-scale workloads; see [here](/doc/quickstart_mrcProfiler.md). --- - ## Supported algorithms cachesim supports the following algorithms: ### Eviction algorithms @@ -103,9 +72,7 @@ cachesim supports the following algorithms: --- - ## Build and Install libCacheSim - ### One-line install We provide some scripts for quick installation of libCacheSim. ```bash @@ -115,13 +82,14 @@ If this does not work, please 1. let us know what system you are using and what error you get 2. read the following sections for self-installation. - +
+Step-by-step installation guide + ### Install dependency libCacheSim uses [cmake](https://cmake.org/) build system and has a few dependencies: [glib](https://developer.gnome.org/glib/), [tcmalloc](https://github.com/google/tcmalloc), [zstd](https://github.com/facebook/zstd). Please see [install.md](/doc/install.md) for instructions on how to install the dependencies. - ### Build libCacheSim cmake recommends **out-of-source build**, so we do it in a new directory: ```bash @@ -137,10 +105,14 @@ cmake -G Ninja .. && ninja [sudo] ninja install popd ``` +
+ + +
+ Developer setup - ### Developer Setup -For developers, we provide tools to ensure code quality and consistent formatting: +If you contribute to libCacheSim, we provide tools to ensure code quality and consistent formatting: #### Pre-commit Hooks We provide a git pre-commit hook that runs linting checks before each commit, helping catch issues early: @@ -157,14 +129,13 @@ The pre-commit hook: - Prevents committing code with formatting, static analysis, or compiler issues - Logs are preserved for debugging in `.lint-logs/` directory +
+ --- - ## Usage - ### cachesim (a high-performance cache simulator) After building and installing libCacheSim, `cachesim` should be in the `_build/bin/` directory. - #### basic usage ``` ./bin/cachesim trace_path trace_type eviction_algo cache_size [OPTION...] @@ -172,7 +143,6 @@ After building and installing libCacheSim, `cachesim` should be in the `_build/b use `./bin/cachesim --help` to get more information. - #### Run a single cache simulation Run the example traces using the LRU eviction algorithm and a 1 GB cache size. @@ -181,7 +151,6 @@ Run the example traces using the LRU eviction algorithm and a 1 GB cache size. ./bin/cachesim ../data/trace.vscsi vscsi lru 1gb ``` - #### Run multiple cache simulations with different cache sizes ```bash # Note that there is no space between the cache sizes @@ -202,7 +171,6 @@ Run the example traces using the LRU eviction algorithm and a 1 GB cache size. See [quick start cachesim](/doc/quickstart_cachesim.md) for more usages. - #### Debug cachesim We provide a debug script to help you debug cachesim with GDB. For detailed usage instructions, see [debug guide](/doc/usage.md). @@ -214,7 +182,6 @@ We provide a debug script to help you debug cachesim with GDB. For detailed usag ./scripts/debug.sh -- data/cloudPhysicsIO.vscsi vscsi lru,s3fifo 100mb,1gb ``` - #### Plot miss ratio curve You can plot miss ratios of different algorithms and sizes, and plot the miss ratios over time. @@ -235,7 +202,6 @@ python3 plot_appr_mrc.py MINI ../data/twitter_cluster52.vscsi vscsi s3fifo "0.00 --- - ### Trace analysis libCacheSim also has a trace analyzer that provides a lot of useful information about the trace. And it is very fast, designed to work with billions of requests. @@ -244,7 +210,6 @@ See [trace analysis](/doc/quickstart_traceAnalyzer.md) for more details. --- - ### Miss ratio curves profiling Constructing fine-grained miss ratio curves for large-scale workloads is very demanding on CPU and memory resources. 
libCacheSim provides advanced miss ratio curves profiling tools to help you quickly build miss ratio curves for large-scale workloads. See [mrcProfiler](/doc/quickstart_mrcProfiler.md) for more details. @@ -253,11 +218,13 @@ Constructing fine-grained miss ratio curves for large-scale workloads is very de --- - ### Using libCacheSim as a library libCacheSim can be used as a library for building cache simulators. For example, you can build a cache cluster with consistent hashing or a multi-layer cache simulator. +
+ See a code example + Here is a simplified example showing the basic APIs. ```c #include <libCacheSim.h> @@ -299,13 +266,13 @@ To run the executable, ```bash ./test.out ``` +
See [here](/doc/advanced_lib.md) for more details, and see [example folder](/example) for examples on how to use libCacheSim, such as building a cache cluster with consistent hashing, multi-layer cache simulators. --- - ### Extending libCacheSim (new algorithms and trace types) libCacheSim supports *txt*, *csv*, and *binary* traces. We prefer binary traces because they allow libCacheSim to run faster, and the traces are more compact. @@ -316,7 +283,6 @@ If you need to add a new trace type or a new algorithm, please see [here](/doc/a We encourage the users to check [deepWiki](https://deepwiki.com/1a1a11a/libCacheSim) for a more detailed documentation. --- - ## Python package If you are not extremely sensitive to the performance, our python binding can offer you an easier way to access the core feature of libCacheSim. @@ -340,6 +306,8 @@ print(f"Obj miss ratio: {obj_miss_ratio:.4f}, byte miss ratio: {byte_miss_ratio: ### Extending new algorithm With python package, you can extend new algorithm to test your own eviction design **without any C/C++ compilation**. +
+ See an example below ```python import libcachesim as lcs @@ -375,11 +343,11 @@ obj_miss_ratio, byte_miss_ratio = cache.process_trace(reader) print(f"Obj miss ratio: {obj_miss_ratio:.4f}, byte miss ratio: {byte_miss_ratio:.4f}") ``` +
See more information in [README.md](./libCacheSim-python/README.md) of the Python binding. --- - ## Open source cache traces In the [repo](/data/), there are sample traces in different formats (`csv`, `txt`, `vscsi`, and `oracleGeneral`). Note that the sampled traces are **very small** and __should not be used for evaluating different algorithms' miss ratios__. The full traces can be found either with the original release or the processed `oracleGeneral` format. @@ -395,22 +363,11 @@ struct { ``` The compressed traces can be used with libCacheSim without decompression. And libCacheSim provides a `tracePrint` tool to print the trace in a human-readable format. +We provide a more comprehensive cache datasets at [https://github.com/cacheMon/cache_dataset](https://github.com/cacheMon/cache_dataset). -| Dataset | Year | Type | Original release | OracleGeneral format | -|---------------|------|:---------:|:-----------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------:| -| Tencent Photo | 2018 | object | [link](http://iotta.snia.org/traces/parallel?only=27476) | [link](https://ftp.pdl.cmu.edu/pub/datasets/twemcacheWorkload/cacheDatasets/tencentPhoto/) | -| WikiCDN | 2019 | object | [link](https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Caching) | [link](https://ftp.pdl.cmu.edu/pub/datasets/twemcacheWorkload/cacheDatasets/wiki/) | -| Tencent CBS | 2020 | block | [link](http://iotta.snia.org/traces/parallel?only=27917) | [link](https://ftp.pdl.cmu.edu/pub/datasets/twemcacheWorkload/cacheDatasets/tencentBlock/) | -| Alibaba Block | 2020 | block | [link](https://github.com/alibaba/block-traces) | [link](https://ftp.pdl.cmu.edu/pub/datasets/twemcacheWorkload/cacheDatasets/alibabaBlock/) | -| Twitter | 2020 | key-value | [link](https://github.com/twitter/cache-trace) | 
[link](https://ftp.pdl.cmu.edu/pub/datasets/twemcacheWorkload/cacheDatasets/twitter/) | -| MetaKV | 2022 | key-value | [link](https://cachelib.org/docs/Cache_Library_User_Guides/Cachebench_FB_HW_eval/#list-of-traces) | [link](https://ftp.pdl.cmu.edu/pub/datasets/twemcacheWorkload/cacheDatasets/metaKV/) | -| MetaCDN | 2023 | object | [link](https://cachelib.org/docs/Cache_Library_User_Guides/Cachebench_FB_HW_eval/#list-of-traces) | [link](https://ftp.pdl.cmu.edu/pub/datasets/twemcacheWorkload/cacheDatasets/metaCDN/) | - -Among the large number of traces, I recommend using the newer ones from Twitter (cluster52), Wiki, and Meta. --- - ## Contributions We gladly welcome pull requests. Before making any large changes, we recommend opening an issue and discussing your proposed changes. @@ -418,8 +375,10 @@ If the changes are minor, then feel free to make them without discussion. This project adheres to Google's coding style. By participating, you are expected to uphold this code. --- - ## Reference +
+ Please cite the following papers if you use libCacheSim. + ``` @inproceedings{yang2020-workload, author = {Juncheng Yang and Yao Yue and K. V. Rashmi}, @@ -455,16 +414,16 @@ This project adheres to Google's coding style. By participating, you are expecte numpages = {10}, } ``` -If you used libCacheSim in your research, please cite the above papers. And we welcome you to send us a link to your paper and add a reference to [references.md](references.md). +If you used libCacheSim in your research, please cite the above papers. + +
--- - ## License See [LICENSE](LICENSE) for details. - ## Related * [PyMimircache](https://github.com/1a1a11a/PyMimircache): a python based cache trace analysis platform, now deprecated --- diff --git a/doc/assets/logo.jpg b/doc/assets/logo.jpg new file mode 100644 index 00000000..779fe2a3 Binary files /dev/null and b/doc/assets/logo.jpg differ diff --git a/doc/assets/logo_circle.png b/doc/assets/logo_circle.png new file mode 100644 index 00000000..0bf3e5d8 Binary files /dev/null and b/doc/assets/logo_circle.png differ