automatic argument/retval display for well-known functions #158

Closed
honggyukim wants to merge 8 commits from check/auto-args

honggyukim
Collaborator

Hi @namhyung and @Taeung

I'm writing some code for item 3 and have written a partial working prototype.

I would like to ask you about the interface and internals.

  • a new option --auto-args is added
  • utils/autoargs.h describes the list of arguments / retvals for well-known library functions
  • the arguments and retvals are passed as shown in the field of uftrace info.

I would like to hear your comments.

Thanks,
Honggyu

Collaborator Author

honggyukim commented Sep 12, 2017

Here is an example usage and output:

$ uftrace --auto-args a.out 
Hello World!
# DURATION    TID     FUNCTION
   1.296 us [86017] | __monstartup();
   1.010 us [86017] | __cxa_atexit();
            [86017] | main() {
   7.497 us [86017] |   puts("Hello World!");
   6.150 us [86017] |   fflush();
   9.430 us [86017] |   malloc(0x5f5e100) = 0x7f8465ceb010;
  12.400 us [86017] |   free(0x7f8465ceb010);
  38.921 us [86017] | } /* main */

Collaborator Author

honggyukim commented Sep 12, 2017

I would like to find a better way to specify the arguments and retvals, and their types as well, for a long list of functions.

Owner

@namhyung namhyung left a comment

Thanks for doing this. But I think we don't need to pass the auto arg list through the env. Both of uftrace and libmcount can have the list, so it can just check the name of function and use it for argspec.

@@ -114,7 +114,7 @@ struct mcount_shmem {
 };
 
 /* first 4 byte saves the actual size of the argbuf */
-#define ARGBUF_SIZE 1024
+#define ARGBUF_SIZE 16384
Owner

I don't think this is needed since it's not for passing the argspec. It is used to save the actual contents of arguments and return values.

Collaborator Author

@honggyukim honggyukim Sep 13, 2017

I see. I just limited the maximum length to 48 in 1bed671 because some strings are too long to record. Thanks!

Collaborator Author

honggyukim commented Sep 13, 2017

Thanks for doing this. But I think we don't need to pass the auto arg list through the env.

Hmm, I don't quite understand what you mean. I didn't pass the auto arg list through the env because it may exceed the length limit of an environment variable. I just keep using the argspec: form to save it in the info file.

Both of uftrace and libmcount can have the list, so it can just check the name of function and use it for argspec.

Sorry, but I don't get your point here either. Could you please explain in more detail?

@honggyukim honggyukim force-pushed the check/auto-args branch 2 times, most recently from 225220c to 60ab035 Compare September 13, 2017 08:13
@honggyukim
Collaborator Author

Thanks for doing this. But I think we don't need to pass the auto arg list through the env.

Hmm, I don't quite understand what you mean. I didn't pass the auto arg list through the env because it may exceed the length limit of an environment variable. I just keep using the argspec: form to save it in the info file.

To be clear, I passed the auto arg list directly to libmcount.so by including argspec.h at record time, and uftrace includes the same argspec.h to write the argspec to the info file. So it doesn't pass the auto arg list; both sides just use the same information.

@namhyung
Owner

To be clear, I passed the auto arg list directly to libmcount.so by including argspec.h at record time, and uftrace includes the same argspec.h to write the argspec to the info file. So it doesn't pass the auto arg list; both sides just use the same information.

Oh, sorry about the misunderstanding. I can see that the code does not pass the arg string but adds it directly.
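
The idea discussed here is that both uftrace and libmcount hold the same built-in list and match entries by function name. A minimal sketch of such a lookup, assuming the semicolon-separated "name@spec" form used in this PR; `auto_args_sketch` and `find_auto_spec` are hypothetical names for illustration, not uftrace functions:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical built-in list in the "name@spec;" form used in this PR. */
static const char auto_args_sketch[] =
	"malloc@arg1;"
	"free@arg1;";

/*
 * Return a pointer to the spec text that follows "name@" inside list and
 * store its length (up to the next ';') in *len.  Returns NULL when the
 * function name has no entry.  Illustration only.
 */
static const char *find_auto_spec(const char *list, const char *name,
				  size_t *len)
{
	size_t name_len = strlen(name);
	const char *pos = list;

	while (*pos) {
		const char *end = strchr(pos, ';');
		size_t entry_len = end ? (size_t)(end - pos) : strlen(pos);

		/* an entry matches only if the name is followed directly by '@' */
		if (entry_len > name_len &&
		    !strncmp(pos, name, name_len) && pos[name_len] == '@') {
			*len = entry_len - name_len - 1;
			return pos + name_len + 1;
		}

		/* skip this entry and its separator */
		pos += entry_len;
		if (*pos == ';')
			pos++;
	}
	return NULL;
}
```

Because the '@' must follow the name immediately, a prefix such as "mall" does not match the "malloc" entry.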

Owner

@namhyung namhyung left a comment

This is an incomplete, arbitrary list of standard library functions. I'm not sure how to maintain this list; it'd be great if a conversion script could be provided as well. We should also consider how this can be useful in the long term.

cmd-record.c Outdated
@@ -352,9 +354,46 @@ static int fill_file_header(struct opts *opts, int status, struct rusage *rusage
if (write(fd, &hdr, sizeof(hdr)) != (int)sizeof(hdr))
pr_err("writing header info failed");

char *orig_args = opts->args;
char *orig_retval = opts->retval;
Owner

I prefer declaring variables before any statement like in (old) C syntax.

Collaborator Author

done!

cmd-record.c Outdated

if (orig_args) {
alloc_size = strlen(auto_args_list) + strlen(orig_args) + 2;
opts->args = (char*)malloc(alloc_size);
Owner

Please use xmalloc (and its friends) and you don't need to cast it in C.

Collaborator Author

done!

cmd-record.c Outdated
opts->args = (char*)malloc(alloc_size);
strncpy(opts->args, auto_args_list, alloc_size);
}
pr_dbg("opts->args = %s\n", opts->args);
Owner

For the debug message, I think a level 1 message (pr_dbg) should be used to describe high-level actions. I don't want too many messages when using -v alone.

In this case, it may be like "auto arg is used" or "extending args using builtin auto-arg list". Messages like above can use level 2 or 3 according to the verbosity.

Collaborator Author

Thanks. I changed it to pr_dbg2.

if (arg_env) {
alloc_size = strlen(auto_args_list) + strlen(arg_env) + 2;
argument_str = (char*)malloc(alloc_size);
sprintf(argument_str, "%s;%s", auto_args_list, arg_env);
Owner

you can use xasprintf() instead.

Collaborator Author

done!

else {
alloc_size = strlen(auto_args_list) + 1;
argument_str = (char*)malloc(alloc_size);
strncpy(argument_str, auto_args_list, alloc_size);
Owner

And xstrdup() for this.

Collaborator Author

done!
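
The two suggestions above (a formatting allocator for the combined case, a string duplicate for the plain case) can be sketched as follows. This is only an illustration: `xstrdup_sketch`, `xasprintf_sketch` and `merge_arg_specs` are hypothetical stand-ins written with standard C, and the real xstrdup()/xasprintf() helpers in uftrace may have different signatures and error handling.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for xstrdup(): duplicate a string, abort on allocation failure. */
static char *xstrdup_sketch(const char *s)
{
	size_t len = strlen(s) + 1;
	char *p = malloc(len);	/* no cast needed in C */

	if (p == NULL)
		abort();
	memcpy(p, s, len);
	return p;
}

/* Stand-in for xasprintf(): format into a freshly allocated buffer. */
static char *xasprintf_sketch(const char *fmt, ...)
{
	va_list ap;
	int len;
	char *buf;

	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measure the output first */
	va_end(ap);
	if (len < 0)
		abort();

	buf = malloc((size_t)len + 1);
	if (buf == NULL)
		abort();

	va_start(ap, fmt);
	vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);
	return buf;
}

/*
 * Join the built-in list with the user-supplied spec (may be NULL).
 * The caller owns the result and must free() it.
 */
static char *merge_arg_specs(const char *auto_list, const char *user_args)
{
	if (user_args)
		return xasprintf_sketch("%s;%s", auto_list, user_args);
	return xstrdup_sketch(auto_list);
}
```

With helpers like these, the manual size computation (strlen + strlen + 2) and the strncpy/sprintf pair from the reviewed code are no longer needed.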

utils/autoargs.h Outdated
* TODO: need to write a script to generate this form from prototypes
*/
static char *auto_args_list = "\
malloc@arg1;free@arg1;\
Owner

C concatenates adjacent string literals into one. You could also make it a const array rather than a pointer variable. So it could be:

static const char auto_arg_list[] =
  "AAA"
  "BBB"
  "CCC"
;

Collaborator Author

I changed it to a single string that is split into many adjacent string literals.
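
As a concrete instance of the array form suggested above, here is a small sketch using the two entries from the reviewed snippet. The names `auto_args_array` and `auto_args_length` are hypothetical; one side benefit of the array form is that the length is known at compile time via sizeof.

```c
#include <stddef.h>
#include <string.h>

/*
 * Adjacent string literals are concatenated at compile time, so this
 * defines a single string.  The entries come from the reviewed snippet.
 */
static const char auto_args_array[] =
	"malloc@arg1;"
	"free@arg1;"
;

/* For a const array, sizeof gives the size including the NUL terminator. */
static size_t auto_args_length(void)
{
	return sizeof(auto_args_array) - 1;
}
```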

Collaborator Author

honggyukim commented Sep 14, 2017

@namhyung I appreciate your detailed comments. I applied your suggestions and updated the branch as a work in progress. I still need to think about how to generate the auto args/retvals list with a script.
I think I can start by supporting the function list from ltrace, given in the link below:
https://github.com/dkogan/ltrace/blob/master/etc/libc.so.conf

@honggyukim
Collaborator Author

Besides that, I would like to remove the library specifier when giving args/retval before writing autoargs.h,
i.e. removing libstdc++ in malloc@libstdc++,arg1

@@ -267,6 +267,7 @@ static unsigned save_to_argbuf(void *argbuf, struct list_head *args_spec,
if (str) {
unsigned i;
char *dst = ptr + 2;
const unsigned str_limit = 48;
Owner

For this kind of magic number, I prefer using a macro constant in capital letters. Also, 48 seems too small. I don't know what a good size is, but 100 looks reasonable to me.

Owner

I also see a crash with a long string argument during uftrace dump.

Collaborator Author

@honggyukim honggyukim Sep 14, 2017

For this kind of magic number, I prefer using a macro constant in capital letters.

Okay, I will use capital letters.

Also, 48 seems too small. I don't know what a good size is, but 100 looks reasonable to me.
I also see a crash with a long string argument during uftrace dump.

I set it to 64 before, but uftrace dump crashed. I'm not sure why, but /usr/bin/gcc does many more string comparisons than I expected.

@namhyung
Owner

Besides that, I would like to remove the library specifier when giving args/retval before writing autoargs.h,
i.e. removing libstdc++ in malloc@libstdc++,arg1

Yep, I'm working on that now..

Collaborator Author

honggyukim commented Sep 14, 2017

@namhyung I appreciate your detailed comments. I applied your suggestions and updated the branch as a work in progress. I still need to think about how to generate the auto args/retvals list with a script.
I think I can start by supporting the function list from ltrace, given in the link below:
https://github.com/dkogan/ltrace/blob/master/etc/libc.so.conf

latrace by Jiri Olsa has the same feature of showing library arguments/retval. I wanted to add the better example headers from latrace but found that it's GPL v3. We probably cannot use them directly, so we can just generate our own list of functions.
https://github.com/jkkm/latrace/tree/master/etc/latrace.d/headers

@honggyukim
Collaborator Author

Besides that, I would like to remove the library specifier when giving args/retval before writing autoargs.h,
i.e. removing libstdc++ in malloc@libstdc++,arg1

Yep, I'm working on that now..

I appreciate that!

@honggyukim honggyukim force-pushed the check/auto-args branch 2 times, most recently from 3351468 to 9b7014e Compare September 14, 2017 07:10
@honggyukim
Collaborator Author

FYI, here is a snippet of the gcc trace result:

$ uftrace --nest-libcall --auto-args /usr/bin/gcc hello.c
    ...
   0.250 us [40905] | strcmp("cpp-output", "c");
   0.237 us [40905] | strcmp("c-header", "c");
   0.220 us [40905] | strcmp("c", "c");
   0.567 us [40905] | strncmp("mtune=generic", "E|M|MM:%(trad_capable_cpp) %(cpp_options) %(cp...", 1);
   0.480 us [40905] | strncmp("march=x86-64", "E|M|MM:%(trad_capable_cpp) %(cpp_options) %(cp...", 1);
   0.494 us [40905] | strncmp("mtune=generic", "M|MM:%(trad_capable_cpp) %(cpp_options) %(cpp_...", 1);
   0.476 us [40905] | strncmp("march=x86-64", "M|MM:%(trad_capable_cpp) %(cpp_options) %(cpp_...", 1);
   0.563 us [40905] | strncmp("mtune=generic", "MM:%(trad_capable_cpp) %(cpp_options) %(cpp_de...", 2);
   0.447 us [40905] | strncmp("march=x86-64", "MM:%(trad_capable_cpp) %(cpp_options) %(cpp_de...", 2);
   0.500 us [40905] | strncmp("mtune=generic", "E:%{!M:%{!MM:          %{traditional:%eGNU C n...", 1);
   0.463 us [40905] | strncmp("march=x86-64", "E:%{!M:%{!MM:          %{traditional:%eGNU C n...", 1);
   0.517 us [40905] | malloc(481) = 0xdeb7a0;
   0.433 us [40905] | memcpy(0xdeb7a0, 0x477384, 480);
   0.480 us [40905] | strncmp("mtune=generic", "M:%{!MM:          %{traditional:%eGNU C no lon...", 1);
   0.443 us [40905] | strncmp("march=x86-64", "M:%{!MM:          %{traditional:%eGNU C no lon...", 1);
   0.303 us [40905] | malloc(475) = 0xdeb990;
   0.243 us [40905] | memcpy(0xdeb990, 0xdeb7a5, 474);
    ...

In such cases, long strings may consume a lot of record data, so I couldn't increase the limit beyond 48.
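
The size cap discussed in this thread can be sketched as below. This is not the code in save_to_argbuf(); `STR_LIMIT_SKETCH` and `copy_limited` are hypothetical names, and the sketch assumes the limit of 48 counts the terminating NUL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro name; 48 is the value used in this PR. */
#define STR_LIMIT_SKETCH 48

/*
 * Copy src into dst (dst_size bytes), storing at most STR_LIMIT_SKETCH
 * bytes including the terminating NUL.  Returns the number of bytes
 * stored, not counting the NUL.
 */
static size_t copy_limited(char *dst, size_t dst_size, const char *src)
{
	size_t limit = dst_size < STR_LIMIT_SKETCH ? dst_size : STR_LIMIT_SKETCH;
	size_t len = 0;

	if (limit == 0)
		return 0;

	/* stop at the cap without reading the rest of a long source string */
	while (len < limit - 1 && src[len] != '\0')
		len++;

	memcpy(dst, src, len);
	dst[len] = '\0';
	return len;
}
```

Capping each string this way bounds the per-call record size, which is the trade-off against longer, more informative arguments raised in the review.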

Collaborator Author

honggyukim commented Sep 18, 2017

This is an incomplete, arbitrary list of standard library functions. I'm not sure how to maintain this list; it'd be great if a conversion script could be provided as well. We should also consider how this can be useful in the long term.

I've added a script that generates autoargs.h from a pre-defined prototypes.h header file in de3d0ec. It's not possible to add every function at once, so we can keep adding functions as they are needed.

The example usage is as follows:

$ cat simple_prototypes.h
void *malloc(size_t size);
void free(void* ptr);
const char* strcpy(char* dest, const char* src);
int open(const char* pathname, int flags);
int pthread_mutex_lock(pthread_mutex_t *mutex);


$ ./gen-autoargs.py simple_prototypes.h
  GEN      autoargs.h


$ cat autoargs.h
    ...
static char *auto_args_list =
        "malloc@arg1/u;"
        "malloc@libstdc++,arg1/u;"
        "free@arg1/x;"
        "free@libstdc++,arg1/x;"
        "strcpy@arg1/s,arg2/s;"
        "strcpy@libstdc++,arg1/s,arg2/s;"
        "open@arg1/s,arg2;"
        "open@libstdc++,arg1/s,arg2;"
        "pthread_mutex_lock@arg1/x;"
        "pthread_mutex_lock@libstdc++,arg1/x;"
;

static char *auto_retvals_list =
        "malloc@retval/x;"
        "malloc@libstdc++,retval/x;"
        "strcpy@retval/s;"
        "strcpy@libstdc++,retval/s;"
        "open@retval;"
        "open@libstdc++,retval;"
        "pthread_mutex_lock@retval;"
        "pthread_mutex_lock@libstdc++,retval;"
;

Please take a look at it!
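
The generated output above suggests a simple mapping from C types to argspec format suffixes: size_t becomes /u, char pointers become /s, other pointers become /x, int gets no suffix, and a void return produces no entry. A minimal sketch of that mapping follows; `argspec_suffix` is a hypothetical helper inferred from the example output only, not the logic of gen-autoargs.py.

```c
#include <stddef.h>
#include <string.h>

/*
 * Map a C type name to an argspec format suffix, following the pattern
 * visible in the generated output.  Returns NULL for plain "void", which
 * produces no entry.  Illustration only.
 */
static const char *argspec_suffix(const char *type)
{
	/* pointers: strings for char pointers, hex for everything else */
	if (strchr(type, '*'))
		return strstr(type, "char") ? "/s" : "/x";
	if (!strcmp(type, "void"))
		return NULL;
	if (!strcmp(type, "size_t"))
		return "/u";
	return "";	/* e.g. int: no suffix in the example output */
}
```

For example, the parameter type of free() ("void* ptr") maps to /x, matching "free@arg1/x;" in the list.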

@honggyukim
Collaborator Author

Here is the example usage:

$ uftrace --auto-args tests/t-autoargs hello 
hello 
# DURATION    TID     FUNCTION
            [11503] | main() {
   2.044 us [11503] |   strlen("autoargs test") = 13; 
   1.583 us [11503] |   calloc(1, 14) = 0x1b28a80;
   1.193 us [11503] |   free(0x1b28a80);
   1.344 us [11503] |   strcmp("hello", "hello") = 0;
   4.056 us [11503] |   puts("hello") = 6;
  18.513 us [11503] | } /* main */

@honggyukim honggyukim force-pushed the check/auto-args branch 4 times, most recently from a076aa6 to 22639e2 Compare September 18, 2017 16:18
@honggyukim
Collaborator Author

@namhyung It seems that this work is almost done but it always generates autoargs.h whenever I run make. If you know how to fix it, please let me know. Thanks!

@@ -44,7 +44,7 @@
 type_specifier.extend(["std::string"])
 
 # The contents of libnames will be made to generate library name specifier for arg spec
-libnames = ["", "libstdc++"]
+libnames = ["", "libstdc++", "libc++", "libbfd", "libLLVM"]
Collaborator Author

I know it is a bit weird to add such libraries, but this commit is added for reference. If the library specifier is removed, we won't need this, and the args/retval list becomes shorter as well.

Makefile Outdated
@@ -193,6 +194,9 @@ $(filter-out $(objdir)/uftrace.o,$(UFTRACE_OBJS)): $(objdir)/%.o: $(srcdir)/%.c
$(objdir)/version.h: PHONY
@$(srcdir)/misc/version.sh $@ $(VERSION_GIT)

$(objdir)/autoargs.h: PHONY
@$(objdir)/utils/gen-autoargs.py $(objdir)/utils/prototypes.h
Owner

I don't think we need to run gen-autoargs.py every time. If we provide autoargs.h, the script needs to run only after prototypes.h has changed. Also, I'd like to put the script and prototype header in the "misc" directory rather than "utils".

Collaborator Author

Hi, I've just updated as you said. I removed the commit for Makefile and added autoargs.h instead. Also I moved those files to misc. Thanks.

@namhyung namhyung changed the title from "automatic argument/retval display for known functions" to "automatic argument/retval display for well-known functions" Sep 22, 2017
Owner

namhyung commented Sep 23, 2017

Unlike version.sh, it doesn't need to be called every time or deleted on make clean. It depends only on prototypes.h and the script itself, so the following patch should work IMHO. But I think it'd be better if the script could take an argument to specify the name of the output file.

diff --git a/Makefile b/Makefile
index a8e65d8..455371d 100644
--- a/Makefile
+++ b/Makefile
@@ -194,8 +194,8 @@ $(filter-out $(objdir)/uftrace.o,$(UFTRACE_OBJS)): $(objdir)/%.o: $(srcdir)/%.c
 $(objdir)/version.h: PHONY
        @$(srcdir)/misc/version.sh $@ $(VERSION_GIT)
 
-$(objdir)/autoargs.h: PHONY
-       @$(objdir)/utils/gen-autoargs.py $(objdir)/utils/prototypes.h
+$(objdir)/autoargs.h: $(srcdir)/utils/gen-autoargs.py $(srcdir)/utils/prototypes.h
+       $(QUIET_GEN)$(srcdir)/utils/gen-autoargs.py $(srcdir)/utils/prototypes.h
 
 $(objdir)/uftrace: $(UFTRACE_OBJS) $(UFTRACE_ARCH_OBJS) $(objdir)/libtraceevent/libtraceevent.a
        $(QUIET_LINK)$(CC) $(UFTRACE_CFLAGS) -o $@ $(UFTRACE_OBJS) $(UFTRACE_ARCH_OBJS) $(UFTRACE_LDFLAGS)
@@ -242,8 +242,7 @@ clean:
        $(Q)$(RM) $(objdir)/*.o $(objdir)/*.op $(objdir)/*.so $(objdir)/*.a
        $(Q)$(RM) $(objdir)/utils/*.o $(objdir)/utils/*.op $(objdir)/libmcount/*.op
        $(Q)$(RM) $(objdir)/gmon.out $(srcdir)/scripts/*.pyc $(TARGETS)
-       $(Q)$(RM) $(objdir)/uftrace-*.tar.gz
-       $(Q)$(RM) $(objdir)/version.h $(objdir)/autoargs.h
+       $(Q)$(RM) $(objdir)/uftrace-*.tar.gz $(objdir)/version.h
        @$(MAKE) -sC $(srcdir)/arch/$(ARCH) clean
        @$(MAKE) -sC $(srcdir)/tests ARCH=$(ARCH) clean
        @$(MAKE) -sC $(srcdir)/doc clean
diff --git a/utils/gen-autoargs.py b/utils/gen-autoargs.py
index b15451a..41e171b 100755
--- a/utils/gen-autoargs.py
+++ b/utils/gen-autoargs.py
@@ -270,5 +270,4 @@ def main(argv):
 
 
 if __name__ == "__main__":
-  print("  GEN      " + argspec_file)
   sys.exit(main(sys.argv))

Collaborator Author

honggyukim commented Sep 24, 2017

@namhyung Thanks, but how can I go back to the previous patch series in the same branch? I changed it to add autoargs.h directly and then force-pushed the same branch.

@namhyung
Owner

git help reflog?

@honggyukim
Collaborator Author

Thanks! I've just updated as you said, especially modified gen-autoargs.py to accept more options. Please take a look at it.

This adds *.sw[opn], *.patch and .orig rules to .gitignore.

Signed-off-by: Honggyu Kim <honggyu.kp@gmail.com>

This is a preparation for adding a new option --auto-args, which
displays arguments / return values for known library functions such as
libc functions.

Signed-off-by: Honggyu Kim <honggyu.kp@gmail.com>

Since some strings are too long to store in the buffer, recording them
sometimes exceeds ARGBUF_SIZE and fails.

This patch limits the maximum string size to 48, including the
terminating NUL, to prevent such problems.

Signed-off-by: Honggyu Kim <honggyu.kp@gmail.com>

This adds a list of prototypes of well-known library functions in
"prototypes.h" and a parser that generates uftrace-format argspecs.

The usage is as follows:
  $ ./misc/gen-autoargs.py ./misc/prototypes.h

Then it generates "autoargs.h" that can be directly included by uftrace
for --auto-args option support.

In the below example, "simple_prototypes.h" has only 5 functions.  It
can be passed to "gen-autoargs.py" and the output is as follows:

  $ cat simple_prototypes.h
  void *malloc(size_t size);
  void free(void* ptr);
  const char* strcpy(char* dest, const char* src);
  int open(const char* pathname, int flags);
  int pthread_mutex_lock(pthread_mutex_t *mutex);

  $ ./gen-autoargs.py simple_prototypes.h
    GEN      autoargs.h

  $ cat autoargs.h
      ...
  static char *auto_args_list =
          "malloc@arg1/u;"
          "free@arg1/x;"
          "strcpy@arg1/s,arg2/s;"
          "open@arg1/s,arg2;"
          "pthread_mutex_lock@arg1/x;"
  ;

  static char *auto_retvals_list =
          "malloc@retval/x;"
          "strcpy@retval/s;"
          "open@retval;"
          "pthread_mutex_lock@retval;"
  ;

Signed-off-by: Honggyu Kim <honggyu.kp@gmail.com>

autoargs.h is generated by "gen-autoargs.py" based on "prototypes.h".
This patch adds the generated file, which will be updated as the list
of functions is updated later on.

Signed-off-by: Honggyu Kim <honggyu.kp@gmail.com>

Since gen-autoargs.py generates a very long args/retval list,
uftrace fails to store the entire list in the current buffer.

It needs a huge buffer to store the list to properly support the --auto-args
option.  This patch makes the buffer 8 times bigger.

Signed-off-by: Honggyu Kim <honggyu.kp@gmail.com>

"utils/prototypes.h" contains the prototypes of well-known library
functions, especially libc functions.

The argspec format that uftrace understands is generated by "gen-autoargs.py"
based on the "utils/prototypes.h" file.  The generated "autoargs.h"
file is then directly included by uftrace for --auto-args support.

This patch adds automatic args/retval rules directly to uftrace when
--auto-args is enabled.

It allows uftrace to display args/retval even for distributed binaries
without -pg or -finstrument-functions build.

Signed-off-by: Honggyu Kim <honggyu.kp@gmail.com>

  $ uftrace --auto-args tests/t-autoargs hello
  hello
  # DURATION    TID     FUNCTION
              [11503] | main() {
     2.044 us [11503] |   strlen("autoargs test") = 13;
     1.583 us [11503] |   calloc(1, 14) = 0x1b28a80;
     1.193 us [11503] |   free(0x1b28a80);
     1.344 us [11503] |   strcmp("hello", "hello") = 0;
     4.056 us [11503] |   puts("hello") = 6;
    18.513 us [11503] | } /* main */

Signed-off-by: Honggyu Kim <honggyu.kp@gmail.com>

@namhyung
Owner

Merged: aeeb2b3
