Releases: UTSASRG/Scaler
v0.2.4
v0.2.3
v0.2.3 implemented thread attribution Approach 1 as described in #86. Approach 1 is logical and easier to prove correct than Approach 2. The implementation does not introduce data races while still maintaining low overhead.
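The approaches themselves are only described in #86, but as a rough, hypothetical sketch of how per-thread attribution can stay race-free on the hot path, each thread can accumulate time into its own thread-local slot so no synchronization is needed (the `attribute` helper and all constants below are illustrative, not Scaler's actual code):

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Per-thread accumulator: no locking is needed because each thread
 * only ever touches its own copy. */
static _Thread_local uint64_t tl_api_time_ns;

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical hook tail: attribute elapsed time to the calling thread. */
static void attribute(uint64_t start_ns) {
    tl_api_time_ns += now_ns() - start_ns;
}

static void *worker(void *arg) {
    (void)arg;
    uint64_t t0 = now_ns();
    struct timespec d = {0, 5 * 1000 * 1000};  /* stands in for real API work */
    nanosleep(&d, NULL);
    attribute(t0);
    printf("thread attributed %llu ns\n", (unsigned long long)tl_api_time_ns);
    return NULL;
}

int main(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    return 0;
}
```

Per-thread totals collected this way can be aggregated after the threads finish, which keeps the hot path free of shared writes.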
Proved effectiveness through simple examples.
Added thread imbalance examples to the paper.
Improved performance and made Scaler faster than all other tools.
Improved the benchmark suites to include kernel memory measurement and made results more stable by adding delays between benchmarks.
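As a minimal sketch of what this involves, assuming kernel-side memory is sampled from `/proc/meminfo` fields such as `Slab` and `KernelStack` and that a fixed settle delay is inserted between runs (the real suite's mechanism and delay value are not documented here):

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Read one field (in kB) from /proc/meminfo, e.g. "Slab" or "KernelStack";
 * returns -1 if the field is not found. */
static long meminfo_kb(const char *field) {
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) return -1;
    char line[256];
    long val = -1;
    size_t len = strlen(field);
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, field, len) == 0 && line[len] == ':') {
            sscanf(line + len + 1, "%ld", &val);
            break;
        }
    }
    fclose(f);
    return val;
}

int main(void) {
    const unsigned settle_seconds = 30;  /* assumed delay, not the suite's real value */
    for (int run = 0; run < 3; run++) {
        /* run_one_benchmark();  <- placeholder for launching one benchmark */
        printf("run %d: Slab=%ld kB, KernelStack=%ld kB\n",
               run, meminfo_kb("Slab"), meminfo_kb("KernelStack"));
        sleep(settle_seconds);  /* let the system settle before the next run */
    }
    return 0;
}
```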
v0.2.2
v0.2.2 implemented thread attribution Approach 2 as described in #86. We decided not to add outlier removal, as thread attribution already helps make performance bugs more apparent in Scaler's output.
With this implementation, we are able to present time attribution (thread attribution, Join/Wait time attribution) as a major contribution and make the effectiveness experiment results more explainable.
We also identified 4 new examples to prove effectiveness.
v0.2.1
v0.2.1 has a series of improvements to help us understand Scaler's data.
v0.2.1 contains a newly implemented benchmark toolkit. The toolkit supports automated experiments, automated artifact collection, benchmarking across multiple machines, and file integrity checks, and it provides a unified, easily expandable interface to run PARSEC and real applications together. Currently, the benchmark lets us test PARSEC, httpd, nginx, memcached, redis, mysql, and postgresql. Scaler reports performance results on all of those applications except postgresql, which seems to have issues caused by its multi-process design. We also identified that the benchmark machine has CPU and disk errors; the postgresql problem is probably not Scaler's problem.
v0.2.1 also has a series of new Python scripts to help interpret the benchmark results.
v0.2.1 removed support for the previous Fine-Grained-Dynamic-Sampling (FGDS) method. The main problem is that we cannot justify that FGDS will not affect the correctness of the results. Some details can be seen in #85.
v0.2.0
v0.2.0 greatly improved Scaler's performance. A new counting method was introduced, and the pre-hook overhead was significantly reduced.
v0.2.0 also supports adaptive timing with customizable strategies; this greatly reduces the overall overhead.
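The notes do not spell out the strategy, but a minimal sketch of the count-everything, time-a-sample idea (assuming a fixed sampling period and a plain C hot path; the real counting path is in assembly and the real strategy is configurable) could look like this:

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define SAMPLE_EVERY 64   /* assumed sampling period, not Scaler's actual value */

typedef struct {
    uint64_t calls;        /* incremented on every invocation (cheap path) */
    uint64_t sampled_ns;   /* time measured on sampled invocations only */
    uint64_t samples;
} api_stat_t;

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Pre/post hooks for one intercepted API symbol. */
static void pre_hook(api_stat_t *s, uint64_t *start_ns) {
    s->calls++;
    *start_ns = (s->calls % SAMPLE_EVERY == 0) ? now_ns() : 0;
}

static void post_hook(api_stat_t *s, uint64_t start_ns) {
    if (start_ns) {                    /* only sampled calls pay for timing */
        s->sampled_ns += now_ns() - start_ns;
        s->samples++;
    }
}

/* Estimated total time = average sampled cost * total call count. */
static double estimated_total_ns(const api_stat_t *s) {
    return s->samples ? (double)s->sampled_ns / s->samples * s->calls : 0.0;
}

int main(void) {
    api_stat_t s = {0};
    for (int i = 0; i < 10000; i++) {
        uint64_t t0;
        pre_hook(&s, &t0);
        /* ...intercepted API body would run here... */
        post_hook(&s, t0);
    }
    printf("calls=%lu estimated total=%.0f ns\n",
           (unsigned long)s.calls, estimated_total_ns(&s));
    return 0;
}
```

Keeping the per-call path to a counter increment while timing only a sample is what lets the counts stay exact while the timing overhead shrinks.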
Overhead tested on PARSEC benchmark:
- ASM Counting
- Runtime: 1.5%
- Memory: 2.5%
- ASM Counting + C pre-hook (invoked on every API call; this is the maximum possible overhead for the C pre-hook)
- Runtime: 6.39%
- Memory: 3%
- ASM Counting + C pre-hook + C post-hook (invoked on every API call; this is the maximum possible overhead for Scaler)
- Runtime: 23.1%
- Memory: 2.6%
- ASM Counting + C pre-hook + C post-hook + Adaptive counting (the overhead reported in our paper; this may change based on experiment results, but we can generally keep it under 5%)
- Runtime: 1.94%
- Memory: 1.18%
PARSEC runtime and memory overhead charts.
v0.1.9
v0.1.8
v0.1.7
v0.1.7 is an extension of v0.1.2 with performance improvements and jmp handling.
Improvements include:
- Runtime:
- Reduced function calls.
- Instruction optimization.
- Branch prediction optimization.
- Removed dynamic compilation.
- Memory:
- Reduced unnecessary structures.
- Stability:
- Handled the jmp problem discovered in v0.1.6.
v0.1.7's commits also include an experimental way to map an address to an id in O(1) time; that version also reduced the number of jmp instructions. Unfortunately, the trade-off between runtime and memory was hard to balance, so that version was abandoned. The new v0.1.7 is similar to v0.1.2 but with significantly less overhead.
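The experimental scheme itself is not documented here, but one common way to get O(1) address-to-id lookup is to direct-index a fixed-stride table, paying memory for every possible slot, which matches the runtime-versus-memory tension mentioned above. The sketch below is purely hypothetical; the stride, field names, and addresses are made up:

```c
#include <stdint.h>
#include <stdio.h>

#define PLT_ENTRY_STRIDE 16   /* assumed size of one PLT entry in bytes */

typedef struct {
    uintptr_t plt_base;       /* start address of the PLT section */
    size_t    entry_count;    /* number of entries covered by the table */
} plt_index_t;

/* Return the symbol id for an address inside the PLT, or -1 if it is
 * outside the covered range: a subtraction and a division, no search. */
static long addr_to_id(const plt_index_t *idx, uintptr_t addr) {
    if (addr < idx->plt_base) return -1;
    size_t id = (size_t)(addr - idx->plt_base) / PLT_ENTRY_STRIDE;
    return (id < idx->entry_count) ? (long)id : -1;
}

int main(void) {
    plt_index_t idx = { .plt_base = 0x401020, .entry_count = 128 };
    printf("id of 0x401060 = %ld\n", addr_to_id(&idx, 0x401060));  /* 0x40 / 16 = 4 */
    return 0;
}
```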
Evaluation on parsec:
v0.1.6
v0.1.6 mainly focuses on reducing memory overhead.
Optimizations were performed in two aspects:
- Optimized non-hook code to make it more memory efficient.
- Implemented a more efficient hook method.
This new method removes the need for dynamic compilation and reduces the memory consumption of the hook part. However, during benchmark testing I found that some user libraries also use jmp to reach PLT entries. Although the majority of functions use the standard pattern (call xxx@PLT), I cannot detect the non-standard ones, and when jmp is used the program crashes. I had to revert to the original method. The problem is illustrated in #45.
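As a hypothetical illustration of the two patterns (assuming the hook depends on the normal call/return path through the PLT, which these notes do not spell out; `api_call` and both wrapper names are made up), compiling the following with `gcc -O2 -fPIC -S` typically shows `call api_call@PLT` in the first function and a tail-call `jmp api_call@PLT` in the second:

```c
/* api_call is assumed to live in a shared library, so both functions
 * go through the PLT. */
extern int api_call(int x);

int wrapper_call(int x) {
    /* The result is modified, so the compiler must keep call + ret:
     *     call api_call@PLT
     *     add  $1, %eax
     *     ret
     * Control returns here after the library call. */
    return api_call(x) + 1;
}

int wrapper_tail(int x) {
    /* Plain forwarding is usually emitted as a tail call:
     *     jmp api_call@PLT
     * Control never comes back to wrapper_tail; this is the
     * non-standard pattern the note above refers to. */
    return api_call(x);
}
```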
The memory overhead is reduced significantly compared to v0.1.2.
I tested on swaptions (the program that used the most memory in previous tests), and the memory overhead dropped from 4.6x to 1.08x.
In v0.1.7, I will revert the hook part to the original version.