Commit 0b8101b: new skill
1 parent f4dfeb7

1 file changed

Lines changed: 33 additions & 10 deletions

agentic/SKILL.md

@@ -7,20 +7,43 @@ Background: we are working on a learned cache project, basic idea is:
 for a trace, we use the first 20% of requests for feature extraction, then use these features to predict the best parameters for the remaining 80%. However, to avoid ad-hoc parameter tuning, we do the
 labeling by searching for the best parameters over the whole trace.
 This introduces a problem: for the same trace, we have three settings:
-1. best parameters for the whole trace
+1. default parameters for the whole trace
 2. default parameters for the first 20% of the trace + best parameters for the remaining 80%
-3. default parameters for the whole trace
+3. best parameters for the whole trace

-We hypothesize that case 1 < case 2 < case 3 in terms of miss ratio, but we find that sometimes case 2 is worse than case 3, which is counterintuitive.
+We hypothesize that case 3 < case 2 < case 1 in terms of miss ratio, but we find that sometimes case 2 is worse than case 1 (its miss ratio is higher), which is counterintuitive.

-Then we conducted a typical analysis over `/mnt/cfs/oracleReuse/tencentBlock/tencentBlock.ns4712.oracleGeneral.zst`.
+Then we conducted a typical analysis over `/mnt/cfs/oracleReuse/tencentBlock/tencentBlock.ns4712.oracleGeneral.zst` with the code in [compare.sh](../grid_search/analysis_output/compare.sh) and [analyze_compare_results.py](../grid_search/analysis_output/analyze_compare_results.py); the result is shown in the figure below:

+![image](../grid_search/analysis_output/hit_trajectory_compare_pairwise.png)

-When explaining code, always include:
+> note: the value for `after-n-reqs` can be calculated as 20% of the total number of requests in the trace, i.e. 20% * 1,000,000 = 200,000 in this case. The total request count can be found in [cluster_stats.csv](../grid_search/analysis_output/cluster_stats.csv)
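The arithmetic in the note on `after-n-reqs` is simple enough to script. A minimal sketch follows; the function name and the assumption that the total request count is available as a plain integer (e.g. read from cluster_stats.csv) are mine, not the repo's:

```python
def switch_point(total_requests: int, warmup_frac: float = 0.2) -> int:
    """Number of requests served with default parameters before switching
    to the predicted best parameters (the `after-n-reqs` value).

    Assumes total_requests is known up front, e.g. taken from
    cluster_stats.csv (hypothetical column layout)."""
    return int(total_requests * warmup_frac)

print(switch_point(1_000_000))  # 200000 for tencentBlock.ns4712
```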
-1. **Start with an analogy**: Compare the code to something from everyday life
-2. **Draw a diagram**: Use ASCII art to show the flow, structure, or relationships
-3. **Walk through the code**: Explain step-by-step what happens
-4. **Highlight a gotcha**: What's a common mistake or misconception?
+In this case, our analysis result is:
+a. Diagnose the tail cases: they happen when the whole-trace optimum mismatches the optimum for switching after 20% of requests.
+    The mismatch can be attributed to a parameter combination that is too specific to one trace.
+    A typical example is tencentBlock.ns4712.oracleGeneral.zst: under cache size ratio = 0.1, the best parameters are small 0.2; ghost 3; s -> m threshold 1; g -> m threshold 1; skip ratio 25%.
+    Result 1: whole trace, default parameters - miss ratio 0.3122
+    Result 2: whole trace, optimal parameters - miss ratio 0.2744
+    Result 3: default parameters for 20%, then switching to the whole-trace optimal - miss ratio 0.3879
+    If we plot the misses/hits after the first 20% of requests, we find a common point: the cache size cannot cover the frequently requested scan pattern. The whole-trace optimal utilizes the limited cache space in an efficient but overly specific way (not general enough), and it depends heavily on the cache state (vulnerable) - shown in the figure.

-Keep explanations conversational. For complex concepts, use multiple analogies.
+Further evidence: when we increase the cache size ratio to 0.2, the miss ratios show no meaningful difference:
+    Result 1 - miss ratio 0.0810
+    Result 2 - miss ratio 0.0782
+    Result 3 - miss ratio 0.0782

+b. Inspired by this, we are conducting label cleaning to remove labels that are too specialized for certain traces, by checking the behavior of similar hyperparameters.
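The label-cleaning idea in (b) could be sketched as a robustness check: keep a labeled "best" combination only if its neighbors in the parameter grid also perform well, i.e. the optimum sits on a plateau rather than a spike. All names, the grid layout, and the tolerance below are hypothetical illustrations, not the project's actual code:

```python
def is_robust_label(grid, best, neighbours, tol=0.02):
    """Accept a labeled optimum only if every nearby hyperparameter
    combination is within `tol` miss ratio of it; a sharp spike means
    the label is too specialized to this trace and should be dropped.

    grid: dict mapping a parameter tuple -> measured miss ratio."""
    best_mr = grid[best]
    return all(grid[n] - best_mr <= tol for n in neighbours if n in grid)

# Toy grid: the neighbour at 0.3879 is far worse than the optimum,
# so this label would be rejected as over-specialized.
grid = {("s0.2", "g3"): 0.2744, ("s0.3", "g3"): 0.2790, ("s0.2", "g2"): 0.3879}
print(is_robust_label(grid, ("s0.2", "g3"), [("s0.3", "g3"), ("s0.2", "g2")]))
# prints False
```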
+Now you are an analyzer with expertise. Please analyze other traces with similar behavior (i.e. case 2 is worse than case 1) and explain each case.

+Before you start, here is how to find those cases:

+- [optimal.csv](../grid_search/analysis_output/optimal.csv) contains the miss ratio for case 3 for all traces (case 3 corresponds to the miss_ratio column)
+- [case2.csv](../grid_search/analysis_output/case2.csv) contains the miss ratio for case 2 for all traces (case 2 corresponds to the miss_ratio column)
+- [baseline01.csv](../grid_search/analysis_output/baseline01.csv) contains the miss ratio for case 1 for all traces with default parameters and cache size ratio = 0.1 (find records with the s3fifo algo)
+- [baseline001.csv](../grid_search/analysis_output/baseline001.csv) contains the miss ratio for case 1 for all traces with default parameters and cache size ratio = 0.01 (find records with the s3fifo algo)
+- [baseline0001.csv](../grid_search/analysis_output/baseline0001.csv) contains the miss ratio for case 1 for all traces with default parameters and cache size ratio = 0.001 (find records with the s3fifo algo)

+You can then find the traces with similar behavior by comparing the miss ratios for case 1 and case 2, and analyze the possible reasons for the observed behavior.
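The screening step just described can be sketched in a few lines. For clarity this uses in-memory dicts rather than the real CSVs; in practice the case-2 and case-1 values would be loaded from case2.csv and the matching baseline file (keeping only the s3fifo rows), and the trace-name key used here is an assumption about their layout:

```python
def anomalous_traces(case2, baseline):
    """Return traces where case 2 (default 20% + switch) has a HIGHER
    miss ratio than case 1 (default parameters for the whole trace),
    i.e. the counterintuitive cases worth diagnosing.

    case2, baseline: dicts mapping trace name -> miss ratio
    (hypothetical stand-ins for case2.csv and baseline01.csv rows)."""
    return sorted(t for t, mr in case2.items()
                  if t in baseline and mr > baseline[t])

case2 = {"ns4712": 0.3879, "ns17": 0.0510}      # from case2.csv
baseline = {"ns4712": 0.3122, "ns17": 0.0600}   # from baseline01.csv (s3fifo)
print(anomalous_traces(case2, baseline))  # ['ns4712']
```

Here ns4712 is flagged because 0.3879 > 0.3122, matching the worked example above, while ns17 behaves as hypothesized and is skipped.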
