pe - perf event wrapper

Linux perf event C++ wrapper

Measure how much your code costs in terms of hardware instructions, cache misses, branch misses and memory allocations.

1. Compile pe

You will need the sashamakarenko/makefile project cloned next to the pe working copy:

$> git clone https://github.com/sashamakarenko/makefile.git makefile
$> git clone https://github.com/sashamakarenko/pe.git pe
$> cd pe
$> make

2. Create code snippet

Let us invoke gettimeofday 20 times on CPU core 3 and collect statistics from the calls. Instrument your own code or create pe/src/tests/TestGettimeofday.cpp like this:

#include <pe/Measurement.h>
#include <sys/time.h> // gettimeofday, timeval
#include <iostream>

int main( int argc, char** argv )
{
    pe::Measurement m;
    m.pinToCpuCore( 3 );
    m.addEvent( pe::EventType::cpuCycles );
    m.addEvent( pe::EventType::hwInstructions );
    m.addEvent( pe::EventType::branchInstructions );
    m.addEvent( pe::EventType::llCacheReadMisses );
    m.addEvent( pe::EventType::branchMisses );
    m.addEvent( pe::EventType::memory ); // you will have to use LD_PRELOAD=build/lib/release/libPePreload-1.0.so to capture memory usage
    m.initialize( 20 );

    timeval tv; // filled by gettimeofday below
    std::cout << "\ngettimeofday:" << std::endl;
    for( int i = 0; i < m.getMaxCaptures(); ++i )
    {
        m.startCapture();
        // this is what we measure
        gettimeofday( &tv, nullptr );
        m.stopCapture();
    }
    m.prepareResults();
    m.printCaptures();
    m.showAverageValues( std::cout );
    m.rewind();
    return 0;
}
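For orientation, the kernel interface that a Linux perf event wrapper presumably builds on is the perf_event_open system call. The sketch below is not taken from the pe sources. It is a minimal illustration, under that assumption, of counting one hardware event around a piece of code, which is roughly what a startCapture()/stopCapture() pair has to achieve. The helper name countInstructions is hypothetical.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of pe: counts user-space instructions
// retired while work() runs on the calling thread.
// Returns -1 if the counter cannot be opened or read
// (for example when perf events are restricted or there is no PMU).
long long countInstructions( void (*work)() )
{
    perf_event_attr attr;
    std::memset( &attr, 0, sizeof( attr ) );
    attr.type           = PERF_TYPE_HARDWARE;
    attr.size           = sizeof( attr );
    attr.config         = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled       = 1; // start stopped, enable explicitly below
    attr.exclude_kernel = 1; // count user space only
    attr.exclude_hv     = 1;

    // pid = 0: calling thread, cpu = -1: any CPU, no group, no flags
    int fd = static_cast<int>( syscall( SYS_perf_event_open, &attr, 0, -1, -1, 0 ) );
    if( fd < 0 )
    {
        return -1;
    }

    ioctl( fd, PERF_EVENT_IOC_RESET, 0 );
    ioctl( fd, PERF_EVENT_IOC_ENABLE, 0 );
    work(); // this is what we measure
    ioctl( fd, PERF_EVENT_IOC_DISABLE, 0 );

    std::uint64_t count = 0;
    ssize_t n = read( fd, &count, sizeof( count ) );
    close( fd );
    return n == static_cast<ssize_t>( sizeof( count ) ) ? static_cast<long long>( count ) : -1;
}
```

A call such as countInstructions( []{ /* code under test */ } ) then yields one raw counter value. A wrapper like pe adds the bookkeeping on top: several events at once, repeated captures and averaged output.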

3. Compile

$> make

4. Run measurement

Enable perf events:

$> echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
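To confirm that the setting took effect, the value can be read back from the same file. The following is a small sketch with a hypothetical helper name, readPerfEventParanoid. It only assumes that the file contains a single integer, where -1 is the least restrictive level.

```cpp
#include <fstream>

// Hypothetical helper, not part of pe: reads the current
// perf_event_paranoid level into `level`.
// Returns false if the file is missing or unreadable.
bool readPerfEventParanoid( int& level )
{
    std::ifstream in( "/proc/sys/kernel/perf_event_paranoid" );
    return static_cast<bool>( in >> level );
}
```

A measurement program could call this at startup and print a hint when the level is higher than expected, instead of failing later when the events are opened.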

Bring the frequency up:

$> sudo cpufreq-set -c 3 -g performance

Run the code:

$> make check

Relax the CPU:

$> sudo cpufreq-set -c 3 -g powersave

5. Results

The metrics correspond to 10 calls in a row. Hopefully the column names are self-explanatory. The nanos/call column corresponds to the time of a single call on a 5GHz CPU.

Cost of time measurement functions

event	nanos/call	cpu.cycles	hw.instrs	br.instrs	br.misses	bus.cycles	cch.l1d.rmiss
gettimeofday	13.2	661	821	140	0	3	2
clock_gettime	14.3	714	911	170	3	3	3
rdtsc	7.0	352	61	0	0	2	0
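The rows above are consistent with a simple relation between cpu.cycles and nanos/call: cycles for the whole capture, divided by the 10 calls per capture, divided by the clock rate in GHz. The sketch below spells out that arithmetic. The function name nanosPerCall is hypothetical, and the relation is inferred from the table rather than taken from the pe sources.

```cpp
#include <cassert>

// cycles: cpu.cycles reported for one capture
// calls:  number of calls inside one capture (10 in the tables)
// ghz:    assumed CPU clock in GHz, i.e. cycles per nanosecond
double nanosPerCall( double cycles, int calls, double ghz )
{
    assert( calls > 0 && ghz > 0.0 );
    return cycles / calls / ghz;
}
```

For gettimeofday, nanosPerCall( 661, 10, 5.0 ) gives 13.22, which matches the 13.2 shown in the table. The same holds for clock_gettime (714 cycles, 14.28) and rdtsc (352 cycles, 7.04).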

Functions and methods

See src/tests/TestLibCalls.cpp for details. To keep the compiler from optimizing the calls away, for functions returning int we actually measure:

externVolatileInt += function();
externVolatileInt += function();
... // repeated 10 times
externVolatileInt += function();
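A self-contained sketch of this pattern is shown below. It is not the code from TestLibCalls.cpp: the function body, the noinline attribute (GCC/Clang specific) and the tenCalls helper are assumptions made so that the example fits in one file. Because the accumulator is volatile, every addition is an observable side effect, so the compiler cannot drop the calls or merge them into one.

```cpp
// Stand-in for the accumulator; defined here to keep the sketch in one file.
volatile int externVolatileInt = 0;

// Hypothetical function under test. noinline keeps it a real call
// even though the definition is visible in this translation unit.
__attribute__((noinline)) int function()
{
    return 7;
}

// One capture body: ten back-to-back calls, written out rather than looped
// so that no loop counter or branch is added to the measured code.
void tenCalls()
{
    externVolatileInt += function();
    externVolatileInt += function();
    externVolatileInt += function();
    externVolatileInt += function();
    externVolatileInt += function();
    externVolatileInt += function();
    externVolatileInt += function();
    externVolatileInt += function();
    externVolatileInt += function();
    externVolatileInt += function();
}
```

Note that the measured cost then includes the read-modify-write of the volatile variable, so the figures are best compared against each other rather than read as the absolute cost of a bare call.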

Without LTO

event	nanos/call	cpu.cycles	hw.instrs	br.instrs
void()	1.4	70	20	20
void(int)	0.7	36	30	20
int(int)	0.9	43	81	20
inline int base.get()	0.7	37	43	0
int base.get()	0.9	45	83	20
int base.*getIntPtr()	1.2	61	123	30
virtual int base.get()	0.9	43	93	20
virtual int derived.get()	0.9	43	93	20
virtual int virtDerived.get()	1.3	67	133	20
inline base.getIndirect()	1.0	52	46	0
inline derived.getIndirect()	1.0	48	46	0

With LTO

event	nanos/call	cpu.cycles	hw.instrs	br.instrs
void()	0.0	0	0	0
void(int)	0.0	0	0	0
int(int)	1.0	48	30	0
inline int base.get()	0.9	47	40	0
int base.get()	0.9	44	40	0
int base.*getIntPtr()	1.3	64	120	30
virtual int base.get()	1.0	48	71	20
virtual int derived.get()	0.9	45	71	20
virtual int virtDerived.get()	1.5	77	120	20
inline base.getIndirect()	1.0	48	40	0
inline derived.getIndirect()	0.9	43	40	0
