Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix incorrect implementation of getting CallGraphSCCRevTopoOrder #1640

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

canliture
Copy link
Contributor

@canliture canliture commented Jan 26, 2025

an example of test.c:

#include <stdlib.h>

#define SZ 100

int* mem_alloc(unsigned sz)
{
    int *buf = (int *)malloc(sizeof(int) * sz);
    return buf;
}

void set_mem(int *ar, int x)
{
    ar[0] = x;
    ar[x] = 0;
}

void baz(int arg)
{
    int *buf = mem_alloc(SZ);
    set_mem(buf, 1);
    int t1 = buf[0];
    int t2 = buf[1];
    buf[t1-1] = 0;
    buf[t2-1] = 0;
}

before fix, MRGenerator::getCallGraphSCCRevTopoOrder will get:

baz
set_mem
mem_alloc
malloc

after fix, it will get:

malloc
mem_alloc
set_mem
baz

Copy link

codecov bot commented Jan 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 63.70%. Comparing base (141be58) to head (d50f567).
Report is 1 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1640   +/-   ##
=======================================
  Coverage   63.70%   63.70%           
=======================================
  Files         245      245           
  Lines       25999    26001    +2     
  Branches     4509     4508    -1     
=======================================
+ Hits        16563    16565    +2     
  Misses       9436     9436           
Files with missing lines Coverage Δ
svf/include/Graphs/SCC.h 88.88% <100.00%> (+0.57%) ⬆️
svf/include/WPA/WPAFSSolver.h 100.00% <100.00%> (ø)
svf/lib/MSSA/MemRegion.cpp 70.45% <100.00%> (ø)
svf/lib/WPA/Andersen.cpp 94.08% <100.00%> (-0.02%) ⬇️

@yuleisui
Copy link
Collaborator

Would it be possible to put the implementation in SCC.h so that every graph including calligraphscc can use it?

inline GNodeStack &topoNodeStack()

NodeStack revTopoStack;
NodeStack& topoStack = this->getSCCDetector()->topoNodeStack();
while (!topoStack.empty())
NodeStack revTopoStack = this->getSCCDetector()->revTopoNodeStack();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why making changes on SCCDetect ? It was correct, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems correct.

@@ -250,6 +250,7 @@ void MRGenerator::collectModRefForCall()
{
PTACallGraphNode* subCallGraphNode = callGraph->getCallGraphNode(*it);
/// Get mod-ref of all callsites calling callGraphNode
std::cout << subCallGraphNode->getFunction()->getName() << std::endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this debugging line.

while (!topoOrder.empty())
NodeStack revTopoOrder = getSCCDetector()->revTopoNodeStack();
/// referenced topoOrder is used for restoring the topological order for later solving.
NodeStack &topoOrder = getSCCDetector()->topoNodeStack();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

topoOrder has been there when you retrieve it and should not be changed later at line 725.

GNodeStack &topoOrder = topoNodeStack();
while(!topoOrder.empty())
{
NodeID callgraphNodeID = topoOrder.top();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

callgraphNodeID => nodeID

NodeID nodeId = revTopoOrder.top();
revTopoOrder.pop();
topoOrder.push(nodeId);
/// restore the topological order for later solving.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no restore is needed as you never popped any element in topoOrder

while (!topoOrder.empty())
NodeStack revTopoOrder = getSCCDetector()->revTopoNodeStack();
/// referenced topoOrder is used for restoring the topological order for later solving.
NodeStack &topoOrder = getSCCDetector()->topoNodeStack();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line: NodeStack &topoOrder = getSCCDetector()->topoNodeStack()

revTopoOrder.pop();
topoOrder.push(nodeId);
/// restore the topological order for later solving.
topoOrder.push(repNodeId);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line topoOrder.push(repNodeId);

@yuleisui
Copy link
Collaborator

Could you run the below bcs and ensure you patch produces the same average points-to size using wpa -ander. Please attach the results similar as here:

https://app.filemail.com/d/uayyovelgnuoixz
https://github.com/user-attachments/files/18392297/502.gcc_r-16.bc.zip

@yuleisui
Copy link
Collaborator

You could also validate using your local bcs and report back the stats.

@canliture
Copy link
Contributor Author

Could you run the below bcs and ensure you patch produces the same average points-to size using wpa -ander. Please attach the results similar as here:

https://app.filemail.com/d/uayyovelgnuoixz https://github.com/user-attachments/files/18392297/502.gcc_r-16.bc.zip

Yeah, I first validate the second url. The first url seems be out-dated.

@yuleisui
Copy link
Collaborator

Could you run the below bcs and ensure you patch produces the same average points-to size using wpa -ander. Please attach the results similar as here:
https://app.filemail.com/d/uayyovelgnuoixz https://github.com/user-attachments/files/18392297/502.gcc_r-16.bc.zip

Yeah, I first validate the second url. The first url seems be out-dated.

Do you have any bc in your local machine? Try to validate another bc and upload your bc and results.

@canliture
Copy link
Contributor Author

Could you run the below bcs and ensure you patch produces the same average points-to size using wpa -ander. Please attach the results similar as here:
https://app.filemail.com/d/uayyovelgnuoixz https://github.com/user-attachments/files/18392297/502.gcc_r-16.bc.zip

Yeah, I first validate the second url. The first url seems be out-dated.

Do you have any bc in your local machine? Try to validate another bc and upload your bc and results.

Yes, I find some bcs to validate.

@canliture
Copy link
Contributor Author

I choose 3 project bcs with top size in https://github.com/SVF-tools/Test-Suite/tree/master/test_cases_bc/crux-bc

I run the wpa command in debug mode, for example:

/Users/liture/CLionProjects/SVF-Dev-Env/SVF/cmake-build-debug/bin/wpa -ander /Users/liture/CLionProjects/SVF-Dev-Env/SVF/Test-Suite/test_cases_bc/crux-bc/tmux.bc

In the following diff command context, before means before committing my changes, after means after committing my changes.
For example, bash.bc.before.txt means statistics about bash.bc before committing my changes, bash.bc.after.txt means statistics about bash.bc after committing my changes.

bash.bc, source is: https://github.com/SVF-tools/Test-Suite/blob/master/test_cases_bc/crux-bc/bash.bc

$ diff -u bash.bc.before.txt bash.bc.after.txt
--- bash.bc.before.txt  2025-01-26 19:15:13
+++ bash.bc.after.txt   2025-01-26 19:27:00
@@ -19,7 +19,7 @@
 IndCallSites        72
 LoadsNum            59146
 MaxStructSize       24
-NonPtrObj           8132
+NonPtrObj           8135
 ReturnsNum          5385
 StackObjs           9595
 StoresNum           27665
@@ -32,9 +32,9 @@
 VarArrayObj         157
 VarStructObj        462
 ----------------Time and memory stats--------------------
-LLVMIRTime          5.542
-SVFIRTime           7.674
-SymbolTableTime     2.728
+LLVMIRTime          6.388
+SVFIRTime           7.678
+SymbolTableTime     2.713
 #######################################################
 
 *********PTACallGraph Stats (Andersen analysis)***************
@@ -65,7 +65,7 @@
 MemoryUsageVmsize   0
 SCCDetectTime       0
 SCCMergeTime        0
-TotalTime           58.191
+TotalTime           52.389
 UpdateCGTime        0
 ----------------Numbers stats----------------------------
 AddrProcessed       15085
@@ -116,21 +116,21 @@
 ****Persistent Points-To Cache Statistics: Andersen's analysis bitvector****
 ################ (program : bash.bc)###############
 UniquePointsToSets       15689
-TotalUnions              174202
+TotalUnions              174203
 PropertyUnions           142918
 UniqueUnions             1401
 LookupUnions             27249
-PreemptiveUnions         2634
+PreemptiveUnions         2635
 TotalComplements         668673
 PropertyComplements      644062
 UniqueComplements        515
 LookupComplements        23581
 PreemptiveComplements    515
-TotalIntersections       1067892
-PropertyIntersections    1066301
-UniqueIntersections      5
+TotalIntersections       1067893
+PropertyIntersections    1066300
+UniqueIntersections      6
 LookupIntersections      552
-PreemptiveIntersections  1034
+PreemptiveIntersections  1035
 #######################################################
 
 Process finished with exit code 0

libcurl.so.bc, source is: https://github.com/SVF-tools/Test-Suite/blob/master/test_cases_bc/crux-bc/libcurl.so.bc

$ diff -u libcurl.so.bc.before.txt libcurl.so.bc.after.txt
--- libcurl.so.bc.before.txt    2025-01-26 19:17:05
+++ libcurl.so.bc.after.txt     2025-01-26 19:29:09
@@ -32,9 +32,9 @@
 VarArrayObj         220
 VarStructObj        199
 ----------------Time and memory stats--------------------
-LLVMIRTime          3.352
-SVFIRTime           5.138
-SymbolTableTime     1.566
+LLVMIRTime          3.648
+SVFIRTime           5.701
+SymbolTableTime     1.737
 #######################################################
 
 *********PTACallGraph Stats (Andersen analysis)***************
@@ -65,7 +65,7 @@
 MemoryUsageVmsize   0
 SCCDetectTime       0
 SCCMergeTime        0
-TotalTime           65.245
+TotalTime           61.742
 UpdateCGTime        0
 ----------------Numbers stats----------------------------
 AddrProcessed       10897
@@ -115,22 +115,22 @@
 
 ****Persistent Points-To Cache Statistics: Andersen's analysis bitvector****
 ################ (program : libcurl.so.bc)###############
-UniquePointsToSets       17799
+UniquePointsToSets       17800
 TotalUnions              271805
-PropertyUnions           82352
-UniqueUnions             5906
-LookupUnions             173276
-PreemptiveUnions         10271
-TotalComplements         1215598
+PropertyUnions           82343
+UniqueUnions             5905
+LookupUnions             173285
+PreemptiveUnions         10272
+TotalComplements         1215597
 PropertyComplements      1042920
 UniqueComplements        5128
-LookupComplements        162427
+LookupComplements        162426
 PreemptiveComplements    5123
-TotalIntersections       507640
-PropertyIntersections    494808
+TotalIntersections       507641
+PropertyIntersections    494809
 UniqueIntersections      59
-LookupIntersections      2489
-PreemptiveIntersections  10284
+LookupIntersections      2488
+PreemptiveIntersections  10285
 #######################################################
 
 Process finished with exit code 0

tmux.bc, source is: https://github.com/SVF-tools/Test-Suite/blob/master/test_cases_bc/crux-bc/tmux.bc

$ diff -u tmux.bc.before.txt tmux.bc.after.txt
--- tmux.bc.before.txt  2025-01-26 19:20:19
+++ tmux.bc.after.txt   2025-01-26 19:33:41
@@ -32,9 +32,9 @@
 VarArrayObj         127
 VarStructObj        385
 ----------------Time and memory stats--------------------
-LLVMIRTime          4.205
-SVFIRTime           7.519
-SymbolTableTime     2.422
+LLVMIRTime          4.263
+SVFIRTime           8.364
+SymbolTableTime     2.929
 #######################################################
 
 *********PTACallGraph Stats (Andersen analysis)***************
@@ -65,7 +65,7 @@
 MemoryUsageVmsize   0
 SCCDetectTime       0
 SCCMergeTime        0
-TotalTime           137.919
+TotalTime           132.443
 UpdateCGTime        0
 ----------------Numbers stats----------------------------
 AddrProcessed       14918
@@ -116,21 +116,21 @@
 ****Persistent Points-To Cache Statistics: Andersen's analysis bitvector****
 ################ (program : tmux.bc)###############
 UniquePointsToSets       20463
-TotalUnions              734596
+TotalUnions              734597
 PropertyUnions           137709
 UniqueUnions             6728
 LookupUnions             577604
-PreemptiveUnions         12555
+PreemptiveUnions         12556
 TotalComplements         1868738
 PropertyComplements      1330628
 UniqueComplements        5273
 LookupComplements        527564
 PreemptiveComplements    5273
-TotalIntersections       1197616
-PropertyIntersections    1187010
-UniqueIntersections      20
-LookupIntersections      28
-PreemptiveIntersections  10558
+TotalIntersections       1197617
+PropertyIntersections    1187016
+UniqueIntersections      21
+LookupIntersections      21
+PreemptiveIntersections  10559
 #######################################################
 
 Process finished with exit code 0

@yuleisui
Copy link
Collaborator

@canliture
Copy link
Contributor Author

Could you also run this https://github.com/user-attachments/files/18392297/502.gcc_r-16.bc.zip

Yeah, I have a try.

@canliture
Copy link
Contributor Author

Could you also run this https://github.com/user-attachments/files/18392297/502.gcc_r-16.bc.zip

When I run two testcases, all nearly full of memory with my laptop. I don't know why the time cost difference is 1000+ s.

I also find that: wpa will not reached the function MRGenerator::collectModRefForCall, the place that calls getCallGraphSCCRevTopoOrder.

My laptop is mac os:

2.6 GHz 6-Core Intel Core i7
Intel UHD Graphics 630 1536 MB
16 GB 2667 MHz DDR4

I run two testcases in release mode, and diff is here below.
Currently we can be sure that the commit will not affect the final points-to result, but potentially affect the performance.

$ diff -u 502.gcc_r-16.bc.before.txt 502.gcc_r-16.bc.after.txt
--- 502.gcc_r-16.bc.before.txt	2025-01-27 19:45:11
+++ 502.gcc_r-16.bc.after.txt	2025-01-27 14:38:53
@@ -30,9 +30,9 @@
 VarArrayObj         1505
 VarStructObj        5840
 ----------------Time and memory stats--------------------
-LLVMIRTime          47.759
-SVFIRTime           47.614
-SymbolTableTime     22.9
+LLVMIRTime          51.387
+SVFIRTime           54.674
+SymbolTableTime     23.792
 #######################################################
 
 *********PTACallGraph Stats (Andersen analysis)***************
@@ -63,7 +63,7 @@
 MemoryUsageVmsize   0
 SCCDetectTime       0
 SCCMergeTime        0
-TotalTime           6557.64
+TotalTime           7626.45
 UpdateCGTime        0
 ----------------Numbers stats----------------------------
 AddrProcessed       164468
@@ -139,7 +139,7 @@
 MemoryUsageVmsize   0
 SCCDetectTime       0
 SCCMergeTime        0
-TotalTime           6608.77
+TotalTime           7710.31
 UpdateCGTime        0
 ----------------Numbers stats----------------------------
 AddrProcessed       164468
@@ -190,20 +190,20 @@
 ****Persistent Points-To Cache Statistics: Andersen's analysis bitvector****
 ################ (program : 502.gcc_r-16.bc)###############
 UniquePointsToSets       204056
-TotalUnions              5326743
+TotalUnions              5326741
 PropertyUnions           3024648
 UniqueUnions             30687
 LookupUnions             2212694
-PreemptiveUnions         58714
+PreemptiveUnions         58712
 TotalComplements         24523125
 PropertyComplements      22608331
 UniqueComplements        19815
 LookupComplements        1875164
 PreemptiveComplements    19815
-TotalIntersections       66660467
-PropertyIntersections    66620516
-UniqueIntersections      28
-LookupIntersections      278
-PreemptiveIntersections  39645
+TotalIntersections       66660465
+PropertyIntersections    66620533
+UniqueIntersections      25
+LookupIntersections      264
+PreemptiveIntersections  39643
 #######################################################

@yuleisui
Copy link
Collaborator

You should use release SVF for testing. The time differences are quite large. Could you re-try it? It would be good to understand which modifications you made in the patch affected the performance.

@canliture
Copy link
Contributor Author

You should use release SVF for testing. The time differences are quite large. Could you re-try it? It would be good to understand which modifications you made in the patch affected the performance.

it's release mode. I have a re-try to understand some other problems.

@canliture
Copy link
Contributor Author

I run two testcases in release mode every time I restart the machine.

I think the time cost is not real, it's not memory efficient in my laptop, maybe it spends much time to swap the memory.

Maybe when I have a high performance computer (with more memory, e.g. 64GB memory) in hands, I will re-try later again.

image

Here below is diff:

diff -u before.txt after.txt 
--- before.txt	2025-01-28 11:43:03
+++ after.txt	2025-01-28 11:43:21
@@ -17,7 +17,7 @@
 IndCallSites        3208
 LoadsNum            792650
 MaxStructSize       266
-NonPtrObj           68433
+NonPtrObj           68428
 ReturnsNum          121494
 StackObjs           111699
 StoresNum           287262
\ No newline at end of file
@@ -30,15 +30,15 @@
 VarArrayObj         1505
 VarStructObj        5840
 ----------------Time and memory stats--------------------
-LLVMIRTime          46.64
-SVFIRTime           43.333
-SymbolTableTime     22.643
+LLVMIRTime          77.089
+SVFIRTime           69.757
+SymbolTableTime     24.676
 #######################################################
 
 *********PTACallGraph Stats (Andersen analysis)***************
 ################ (program : 502.gcc_r-16.bc)###############
 ----------------Numbers stats----------------------------
-CalRetPairInCycle   138791
+CalRetPairInCycle   138795
 MaxNodeInCycle      13798
 NodeInCycle         14258
 TotalCycle          245
\ No newline at end of file
@@ -63,7 +63,7 @@
 MemoryUsageVmsize   0
 SCCDetectTime       0
 SCCMergeTime        0
-TotalTime           5440.85
+TotalTime           5634.06
 UpdateCGTime        0
 ----------------Numbers stats----------------------------
 AddrProcessed       164468
\ No newline at end of file
@@ -114,7 +114,7 @@
 *********PTACallGraph Stats (Andersen analysis)***************
 ################ (program : 502.gcc_r-16.bc)###############
 ----------------Numbers stats----------------------------
-CalRetPairInCycle   138791
+CalRetPairInCycle   138795
 MaxNodeInCycle      13798
 NodeInCycle         14258
 TotalCycle          245
\ No newline at end of file
@@ -139,7 +139,7 @@
 MemoryUsageVmsize   0
 SCCDetectTime       0
 SCCMergeTime        0
-TotalTime           5480.38
+TotalTime           5673.53
 UpdateCGTime        0
 ----------------Numbers stats----------------------------
 AddrProcessed       164468
\ No newline at end of file
@@ -190,19 +190,19 @@
 ****Persistent Points-To Cache Statistics: Andersen's analysis bitvector****
 ################ (program : 502.gcc_r-16.bc)###############
 UniquePointsToSets       204113
-TotalUnions              5327937
+TotalUnions              5327933
 PropertyUnions           3024978
 UniqueUnions             30760
 LookupUnions             2213359
-PreemptiveUnions         58840
+PreemptiveUnions         58836
 TotalComplements         24522206
 PropertyComplements      22607432
 UniqueComplements        19856
 LookupComplements        1875062
 PreemptiveComplements    19856
-TotalIntersections       66662738
-PropertyIntersections    66622702
-UniqueIntersections      30
-LookupIntersections      277
-PreemptiveIntersections  39729
+TotalIntersections       66662744
+PropertyIntersections    66622724
+UniqueIntersections      25
+LookupIntersections      270
+PreemptiveIntersections  39725
 #######################################################

@yuleisui
Copy link
Collaborator

@bjjwwang could you test the time differences on our machine?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants