Setup PhysicsSchedule and SubstepSchedule to use single-threaded executor #92
By default, schedules use the multi-threaded executor. Since all systems in these schedules are chained together and run one after another, there is nothing to parallelize, so the multi-threaded executor only adds overhead.
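As a rough illustration of why dispatching a strictly ordered chain buys nothing, here is a minimal standard-library sketch. It is not the code from this PR and does not use Bevy: the `System` alias, `integrate`, `solve`, `run_inline` and `run_dispatched` are hypothetical names, and spawning a thread per system exaggerates the cost compared to a real task pool. In Bevy itself, the executor of a schedule can be selected with `Schedule::set_executor_kind(ExecutorKind::SingleThreaded)`.

```rust
use std::thread;

/// Hypothetical stand-in for a system: a function that mutates the state.
type System = fn(&mut u64);

fn integrate(x: &mut u64) {
    *x += 1;
}

fn solve(x: &mut u64) {
    *x *= 2;
}

/// Run the chain directly on the calling thread, one system after another.
fn run_inline(systems: &[System], state: &mut u64) {
    for system in systems {
        system(state);
    }
}

/// Run the same chain by handing each system to a worker thread and waiting
/// for it to finish before starting the next one. The order is identical,
/// so the extra dispatch and synchronization is pure overhead.
fn run_dispatched(systems: &[System], state: &mut u64) {
    for &system in systems {
        let mut local = *state;
        let handle = thread::spawn(move || {
            system(&mut local);
            local
        });
        *state = handle.join().expect("worker thread panicked");
    }
}
```

Both functions produce the same final state for the same chain; only the cost of getting there differs, which is the effect the benchmarks below measure.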
This improves performance in benchmarks by between 12% and 50%:
3x3 cubes, 30 steps:
![image](https://private-user-images.githubusercontent.com/42153076/254876341-782f706b-a2e8-442f-b48e-d1fd2052c758.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk3ODMwMDYsIm5iZiI6MTczOTc4MjcwNiwicGF0aCI6Ii80MjE1MzA3Ni8yNTQ4NzYzNDEtNzgyZjcwNmItYTJlOC00NDJmLWI0OGUtZDFmZDIwNTJjNzU4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE3VDA4NTgyNlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTUzMTIzOGM5ZWM3MGU2MTUzOTY2NzhkMzFiZDM2N2UxMjNjZDFhNzk5N2Q3ZWY2NjNiOWVmY2IzNWE1ZWZmYzQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.PxmWEuNGLyffta6WfEHUa5iK7BEFnuNBRs-XgT9tB7w)
5x5 cubes, 30 steps:
![image](https://private-user-images.githubusercontent.com/42153076/254876382-37c5b371-8e7d-42b9-9dc0-1f71a74990d2.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk3ODMwMDYsIm5iZiI6MTczOTc4MjcwNiwicGF0aCI6Ii80MjE1MzA3Ni8yNTQ4NzYzODItMzdjNWIzNzEtOGU3ZC00MmI5LTlkYzAtMWY3MWE3NDk5MGQyLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE3VDA4NTgyNlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWMyM2EzZjJmMzc0NjM5YWY0MGRlMDVjNmJkZTNmNmE1Mzk2NTdkMWMzNjAzZGZlODZkOGEzODA2MTU4YWRjNWQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.SApyMcXFWTSouj55RbVVstyELuqQ9pjOZZnaMRrubxc)
10x10 cubes, 30 steps:
![image](https://private-user-images.githubusercontent.com/42153076/254876408-f8e7f9d9-2914-431b-951f-35cef35f283d.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk3ODMwMDYsIm5iZiI6MTczOTc4MjcwNiwicGF0aCI6Ii80MjE1MzA3Ni8yNTQ4NzY0MDgtZjhlN2Y5ZDktMjkxNC00MzFiLTk1MWYtMzVjZWYzNWYyODNkLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE3VDA4NTgyNlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWZlZWE3Mjc0NTIzOTYzMGI1YWM4YzcyMjg5ZDE3MzFjMWU1YTVlMmRjMWNjZTJhNjlkN2U1NGFjMGMyYjgxMjAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.9qsHQ7hD7G1cQZyAfDNlRART8dynmaAlMLJvp4cBh8Q)
The relative speedup is smaller with more cubes, since the scheduling overhead per task is fixed while the simulation work grows with the number of cubes.
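A simplified cost model makes this concrete. It is an assumption for illustration, not something derived from the benchmark data: let $W$ be the simulation work per step, which grows with the number of cubes, and $O$ the fixed scheduling overhead that the single-threaded executor avoids. The fraction of step time saved is then:

```math
\frac{(W + O) - W}{W + O} = \frac{O}{W + O}
```

With $O$ constant, this fraction shrinks toward zero as $W$ grows, which matches the trend from the 3x3 to the 10x10 benchmark.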
In the future, systems should be parallelized internally (probably using rayon), so using the single-threaded executor will leave the other threads free for that parallel work.