-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added the possibility to check the used middleware for memory operations. #2
Conversation
6446f03
to
808445e
Compare
@wjwwood fyi, performance_test tool now includes memory_tools too |
🆒 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A design problem:
If a user fails to properly initialize the memory checker, it will fail silently giving the user a false assurance that there are no unwanted calls.
We have seen it today on your system.
Please introduce a check before the steady time that all possible calls (malloc
, realloc
, calloc
, and free
) are captured indeed.
@wjwwood, can this falsifiability be somehow automated and be done inside of the tool itself?
#ifdef MEMORY_TOOLS_ENABLED | ||
osrf_testing_tools_cpp::memory_tools::initialize(); | ||
|
||
auto on_unexpected_memory = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto const on_unexpected_memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my own education:
What happens with this lambda when you get out of scope?
Should not you use a plain function callback in such a case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lambda is wrapped in a https://en.cppreference.com/w/cpp/utility/functional/function. So this should fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function is now also const, as requested and correct.
|
||
auto on_unexpected_memory = | ||
[](osrf_testing_tools_cpp::memory_tools::MemoryToolsService &service) { | ||
std::cerr << "unexpected malloc"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a misleading statement. It might be malloc
, free
, calloc
, or realloc
.
I would create separate callbacks for each call with the corresponding messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This std::cerr
should not be there. The next line already prints which type of allocation happened:
malloc (not expected) 104 -> 0x2db7440
Stack trace (most recent call last):
#33 Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in
#32 Object "/home/andreas.pasternak/1429/apex_ws/src/performance_test/performance_test/cmake-build-debug/perf_test", at 0x40a168, in
#31 Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7f70b6cd282f, in __libc_start_main
#30 Object "/home/andreas.pasternak/1429/apex_ws/src/p
I removed this line. Then multiple callbacks are not required at the moment.
service.print_backtrace(); | ||
}; | ||
osrf_testing_tools_cpp::memory_tools::on_unexpected_calloc(on_unexpected_memory); | ||
osrf_testing_tools_cpp::memory_tools::on_unexpected_free(on_unexpected_memory); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on_unexpected_free
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
service.print_backtrace(); | ||
}; | ||
osrf_testing_tools_cpp::memory_tools::on_unexpected_calloc(on_unexpected_memory); | ||
osrf_testing_tools_cpp::memory_tools::on_unexpected_free(on_unexpected_memory); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on_unexpected_calloc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
osrf_testing_tools_cpp::memory_tools::on_unexpected_calloc(on_unexpected_memory); | ||
osrf_testing_tools_cpp::memory_tools::on_unexpected_free(on_unexpected_memory); | ||
osrf_testing_tools_cpp::memory_tools::on_unexpected_malloc(on_unexpected_memory); | ||
osrf_testing_tools_cpp::memory_tools::on_unexpected_realloc(on_unexpected_memory); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on_unexpected_malloc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Se above.
osrf_testing_tools_cpp::memory_tools::on_unexpected_free(on_unexpected_memory); | ||
osrf_testing_tools_cpp::memory_tools::on_unexpected_malloc(on_unexpected_memory); | ||
osrf_testing_tools_cpp::memory_tools::on_unexpected_realloc(on_unexpected_memory); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on_unexpected_realloc
Sorry for the comments' placements, it's probably my Chrome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Np.
README.md
Outdated
@@ -29,7 +29,7 @@ mkdir -p perf_test_ws/src | |||
cd perf_test_ws/src | |||
git clone git@github.com:ApexAI/performance_test | |||
cd .. | |||
ament build --parallel --build-tests --cmake-args -DCMAKE_BUILD_TYPE=Debug -- | |||
ament build --parallel --build-tests --cmake-args -DCMAKE_BUILD_TYPE=Release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--parallel
is known to be problematic under certain conditions.
If a user wants to build fast, this user will use this option. Otherwise, it's a premature optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a patient person. ;)
2d6e80c
to
df4fa16
Compare
Until I can make function like this in void
malloc_test_function(const std::string & str)
{
void * some_memory = std::malloc(1024);
// We need to do something with the malloc'ed memory to make sure this
// function doesn't get optimized away. memset isn't enough, so we do a
// memcpy from a passed in string, which is enough to keep the optimizer away.
// see: https://github.com/osrf/osrf_testing_tools_cpp/pull/8
memcpy(some_memory, str.c_str(), str.length());
std::free(some_memory);
}
void
assert_memory_tools_is_working()
{
bool saw_malloc = false;
on_malloc([&]() {
saw_malloc = true;
});
OSRF_TESTING_TOOLS_CPP__SCOPE_EXIT({on_malloc(nullptr);});
{
const std::string test_str = "test message";
malloc_test_function(test_str);
}
ASSERT_TRUE(saw_malloc);
}
I don't think it makes sense to do it implicitly, though having a function to test for it is certainly useful enough to have it in If I had an implicit check I'd want to make a distinction between "memory tools is supported on this platform and expected, but not catching things" and "memory tools may or may not be expected to work on this platform, and it is not catching things". |
I guess, 90% would prefer to assert in both cases. |
I will consider making it the default, but in my opinion it's not as obvious as you portray it. Almost all of the tests in https://github.com/ros2/rcl/blob/master/rcl/test/rcl/test_node.cpp It would be equally inconvenient to circumvent the check in all tests like this one. Also, I don't believe that a majority of people would want memory tools to fail on Windows, where your only workaround is to skip the whole test on Windows. |
@wjwwood: Thank you for your input. I adapted your function but it does not seem to work. If I remove the assertion function, memory tools work properly. But when I call the function the lambda |
Not sure, have you setup memory tools before hand? (it won't work if memory tools is uninitialized or if memory checking is not enabled) Otherwise, perhaps the code in the |
It seems like you are initializing it correctly. Are the other calls working (like later in your tool)? |
Oh, you said that is was (sorry working on little sleep :p). I'd see if it's getting optimized out or not. Try adding a print statement in the malloc test function (warning it might cause the check to work, but break again when you remove it). |
Yes, you are right and I'm not. How about a checker function that returns:
|
I can do that or have the user give an argument to the "start" function from your example API to dictate the behavior, or I can do both. I'll be working on the improvements to memory tools tomorrow. |
@wjwwood: Np, I am glad that you help me! So the |
Is it that the |
No that doesn't seem likely either. Can you give me (or point me to) some setup instructions for this and I can try it out myself. |
@wjwwood: I already updated the readme on this branch. so you should be able to follow it. You just need to have memory tools available and run performance test with |
I'll try it out, but it might take me a while. |
Not sure I'll get to this tonight. I'll do my best to look at it tomorrow morning, but while the release is still slowly wrapping up I'm pretty busy. |
@wjwwood: Could you find the time to look into this issue? Thank you! |
Yup, I'm back at work today, I'll catch up on it. |
@deeplearningrobotics I'm not able to build without the micro dds rmw implementation, is that intentional? I get:
|
/cc @dejanpan Here's what I'm using:
|
@wjwwood The actual error is further down: In file included from /home/dejan.pangercic/1578/apex_ws/build/performance_test/src/idlgen/fast_rtps/gen/fast_rtps/Time_PubSubTypes.cxx:26:0:
/home/dejan.pangercic/1578/apex_ws/build/performance_test/src/idlgen/fast_rtps/gen/fast_rtps/Time_PubSubTypes.h: At global scope:
/home/dejan.pangercic/1578/apex_ws/build/performance_test/src/idlgen/fast_rtps/gen/fast_rtps/Time_PubSubTypes.h:48:41: error: ‘SerializedPayload_t’ has not been declared
bool serialize(void *data, SerializedPayload_t *payload);
^
/home/dejan.pangercic/1578/apex_ws/build/performance_test/src/idlgen/fast_rtps/gen/fast_rtps/Time_PubSubTypes.h:49:31: error: ‘SerializedPayload_t’ has not been declared
bool deserialize(SerializedPayload_t *payload, void *data);
^
/home/dejan.pangercic/1578/apex_ws/build/performance_test/src/idlgen/fast_rtps/gen/fast_rtps/Time_PubSubTypes.h:51:38: error: ‘InstanceHandle_t’ has not been declared
bool getKey(void *data, InstanceHandle_t *ihandle); and for that I suspect that the bundled version of fastrtpsgen is out of date: https://github.com/ApexAI/performance_test/tree/memory_check/performance_test/bin. I am looking into this. |
@wjwwood update: I am now looking into why Bouncy fails. It must be a change in fastRTPS. |
Documentation bug: $ source ros2_install_path/local_setup.bash
-bash: ros2_install_path/local_setup.bash: No such file or directory The Readme should explain how to get actual folder instead of |
Documentation bug: No current directory defined. mkdir -p perf_test_ws/src What if my |
Documentation bug: $ source ~/ros2_install/local_setup.bash
-bash: ~/ros2_install/local_setup.bash: No such file or directory The |
|
Scratch the previous comment. My bash had an alias for |
Documentation error: no prerequisite requirement for JDK. The build failed:
|
Multiple build errors, seems of two kinds only:
and
|
Looks like a legit bug to me: the header file contains using namespace eprosima::fastrtps; But no reference to Here is the full contents of // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
/*!
* @file Header_PubSubTypes.h
* This header file contains the declaration of the serialization functions.
*
* This file was generated by the tool fastcdrgen.
*/
#ifndef _HEADER__PUBSUBTYPES_H_
#define _HEADER__PUBSUBTYPES_H_
#include <fastrtps/TopicDataType.h>
using namespace eprosima::fastrtps;
#include "Header_.h"
namespace std_msgs
{
namespace msg
{
namespace dds_
{
/*!
* @brief This class represents the TopicDataType of the type Header_ defined by the user in the IDL file.
* @ingroup HEADER_
*/
class Header_PubSubType : public TopicDataType {
public:
typedef Header_ type;
Header_PubSubType();
virtual ~Header_PubSubType();
bool serialize(void *data, SerializedPayload_t *payload);
bool deserialize(SerializedPayload_t *payload, void *data);
std::function<uint32_t()> getSerializedSizeProvider(void* data);
bool getKey(void *data, InstanceHandle_t *ihandle);
void* createData();
void deleteData(void * data);
MD5 m_md5;
unsigned char* m_keyBuffer;
};
}
}
}
#endif // _Header__PUBSUBTYPE_H_ |
@serge-nikulin: I updated the readme for bouncy and fixed some defects and tested the code on the following platforms:
I think the README is reasonable to follow for a user with very basic ROS knowledge. Please test in ADE or Ubuntu to not run into independent systems. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 9 files at r1, 1 of 2 files at r3, 2 of 2 files at r4.
Reviewable status: 7 of 11 files reviewed, 5 unresolved discussions (waiting on @wjwwood and @serge-nikulin)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r5.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @serge-nikulin)
Out of curiosity, why do you use One usually goes to floats (from integers) when one needs dynamic range at the price of precision and resolution. It seems to me in this case timestamps do not need a wide dynamic range ([0 ... 10] sec?) but need both precision and resolution. |
|
@serge-nikulin: I could not use any built-in time types, I wanted to use |
Yes, using |
Do you intend to use your program across different boxes? In this case, you can't use See https://en.cppreference.com/w/cpp/chrono/steady_clock:
Because the clock does not have a universal reference point (like wall clock does), it can't be used across machines (including VM) and even processes on a well secured OS (they isolate and salt such timers to prevent crypto leaks). Steady clock timer value would be different and unrelated on different machines (including VM). Because we do not care, our ADE possibly leaks steady timers across containers, but this is a bug (of Docker?), not a feature to take for granted. |
@serge-nikulin: You are right. As you always want to use a steady clock in robotics the use of a steady clock here is appropriate, IMHO. If people want to test between processes without a common steady clock (because of different physical or logical machines) then one could add:
But this is then part of another feature. |
Please add this restriction to the document, otherwise people will measure
inter box time and get wrong results. It would be nice to stop such
attempts automatically.
…On Fri, Jul 27, 2018 at 10:32 AM deeplearningrobotics < ***@***.***> wrote:
@serge-nikulin <https://github.com/serge-nikulin>: As you always want to
use a steady clock in robotics the use of a steady clock here is
appropriate. If people want to test between processes without a common
steady clock (because of different physical or logical machines) then one
could add:
- The option to select a different clock.
- The option to measure round trip.
But this is then part of another feature.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AeCfXXyfJXyOczCiC275uU9rDeTc0O5fks5uK3jlgaJpZM4U6c1X>
.
|
Because this app is Linux-only and my forte is OS X and Windows, I assume somebody else has actually built and ran it in Linux (I could not: can't install ROS2). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this app is Linux-only and my forte is OS X and Windows, I assume somebody else has actually built and ran it in Linux (I could not: can't install ROS2).
@@ -51,7 +51,6 @@ class DataRunnerBase : boost::noncopyable | |||
|
|||
auto on_unexpected_memory = | |||
[](osrf_testing_tools_cpp::memory_tools::MemoryToolsService &service) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be hard to decipher.
@serge-nikulin: Thank you, I will create an issue for the clock and time-related issues you mentioned and fix them in a seperate MR.. |
There's now a function to check if memory tools is working or not: |
This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)