Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to split arguments with shlex.split in ExecuteProcess #711

Open
wants to merge 12 commits into
base: rolling
Choose a base branch
from

Conversation

wjwwood
Copy link
Member

@wjwwood wjwwood commented Jun 15, 2023

We already do this for the prefix filter, and it should not break any existing uses, because of how shlex.split works.

Just blindly applying shlex.split will break things, due to complex combinations, so adding an argument to opt into the behavior, and make it possible to set it via launch configuration to support some cases where changing the executable entry in the launch file isn't convenient.

This option was needed because the existing behavior sometimes prevents passing multiple command line arguments as a single launch configuration, because they would be grouped into a single argv entry in the main function of the program it was passed to when shell=False is used with subprocess.

fixes #710


A little more detail on the original problem and how this fixes it...

The tests added in this pr are good examples of the root problem and how this fixes it, but to summarize briefly here:

There's an arg_echo.py script I use in the tests which just prints the arguments received and exits with the count as the return code.

Then consider this test case that was added:

preamble = [sys.executable, os.path.join(os.path.dirname(__file__), 'argv_echo.py')]

def test_execute_process_split_arguments_override_in_launch_file():
    execute_process_args = {
        'cmd': preamble + ['--some-arg "some string"'],
        'output': 'screen',
        'shell': False,
        'log_cmd': True,
    }
    execute_process_action1 = ExecuteProcess(**execute_process_args)
    execute_process_action2 = ExecuteProcess(**execute_process_args)

    ld = LaunchDescription([
        # Control to test the default.
        execute_process_action1,
        # Change default with LaunchConfiguration, test again.
        SetLaunchConfiguration('split_arguments', 'True'),
        execute_process_action2,
    ])
    ls = LaunchService()
    ls.include_launch_description(ld)
    assert 0 == ls.run(shutdown_when_idle=True)

    assert execute_process_action1.return_code == 2, execute_process_action1.process_details['cmd']
    assert execute_process_action2.return_code == 3, execute_process_action2.process_details['cmd']

You can see that without the use of split_arguments=True (in this case being set from a LaunchConfiguration) the script interprets the last argument as '--some-arg "some string"' as a single contiguous argument, which breaks some argument parsers, like the one used in gazebo classic for example. But if you enable the new feature, then it splits that argument into --some-arg and some string, which is what we want.

This is important because if you cannot use shell=True, which we cannot (see ros-simulation/gazebo_ros_pkgs#1376), and you have to pass multiple arguments in a single substitution (e.g. https://github.com/ros-simulation/gazebo_ros_pkgs/blob/df368e60598aa82940bffc7b0db46420b04577a6/gazebo_ros/launch/gzserver.launch.py#L71) or if you need to pass arguments which contain whitespace in a single substitution (e.g. workedaround here ros-simulation/gazebo_ros_pkgs#1502), then you need something like this.

I also considered ways in which a substitution could return a list of substitutions instead of a string, but that was just too big of a change and conceptually had too many issues. I also considered a special syntax or substitution type that the ExecuteLocal action could handle in a specific way, but that also seemed clunky and after trying to implement it I couldn't get something satisfactory, and so I ended up with the current approach, which certainly isn't perfect.

Also, if you want more context on the interactions between cmd as a sequence (list) vs cmd as a string vs cmd as a sequence containing a single string and settings like shell=True, the subprocess documentation in python is helpful: https://docs.python.org/3/library/subprocess.html#subprocess.Popen (search for shell=True and read around those sections)

Hopefully that's enough context, but please let me know if there are more questions.

@wjwwood wjwwood self-assigned this Jun 15, 2023
@wjwwood
Copy link
Member Author

wjwwood commented Jun 15, 2023

Looks like this breaks some tests, so I need to see why this might not be a safe change to make.

@wjwwood
Copy link
Member Author

wjwwood commented Aug 16, 2023

@wjwwood wjwwood force-pushed the wjwwood/shlex_split_cmd_to_executable branch from a4b1461 to 611af57 Compare September 20, 2023 02:07
@wjwwood wjwwood changed the title apply shlex.util to substitutions in cmd given to Executable Add option to split arguments with shlex.split in ExecuteProcess Sep 20, 2023
@wjwwood wjwwood marked this pull request as ready for review September 21, 2023 00:09
Copy link
Contributor

@adityapande-1995 adityapande-1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be generalizing the problem here, but can we add an option to specify the delimiter here, instead of just a whitespace (which can be the default) ?

ExecuteProcess for instance can be used any executable, some of which can accept arguments like some_exec flag1:=flag1Value instead of some_exec --flag1 flag1Value, which would also require separation of arguments.

Either way a test for the := would be nice.

@wjwwood
Copy link
Member Author

wjwwood commented Sep 22, 2023

But in the case of some_exec flag1:=flag1Value the argument parser of the application would need to handle the splitting. I don't actually think we want launch splitting flag1 and flag1Value apart.

The pull request only support whitespace delimiters because that's what shell's support.

Either way a test for the := would be nice.

What would the test look like? Just something with := in it?

Copy link
Contributor

@methylDragon methylDragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance this will break cases where $(exec cmd) type args are passed in?

I tested this with shlex.split() and it seems like it splits it up instead of pre-expanding the command.

shlex.split("wow $(echo ls)")
# Out: ['wow', '$(echo', 'ls)']

# Granted a user could always... (and arguably should)
shlex.split('wow "$(echo ls)"')
# Out: ['wow', '$(echo ls)']

Similarly, we might want to set comments=True on shlex.
Otherwise comments might get interpreted as args (but I'm not too sure about this.)

>>> shlex.split('wow ab:=1\n  "some string"  WORKSPACE=a # aa a', comments=True)
['wow', 'ab:=1', 'some string', 'WORKSPACE=a']
>>> shlex.split('wow ab:=1\n  "some string"  WORKSPACE=a # aa a', comments=False)
['wow', 'ab:=1', 'some string', 'WORKSPACE=a', '#', 'aa', 'a']

Copy link
Contributor

@methylDragon methylDragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than my comments this looks alright to me.

That's a nifty trick with the tests, using the exit code like that 😬

@adityapande-1995
Copy link
Contributor

But in the case of some_exec flag1:=flag1Value the argument parser of the application would need to handle the splitting. I don't actually think we want launch splitting flag1 and flag1Value apart.

The pull request only support whitespace delimiters because that's what shell's support.

Either way a test for the := would be nice.

What would the test look like? Just something with := in it?

Ah right got it nevermind, shell doesn't need to worry about it. Looks good to me !

Copy link
Contributor

@adityapande-1995 adityapande-1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good with green CI !

@wjwwood
Copy link
Member Author

wjwwood commented Sep 22, 2023

I tested this with shlex.split() and it seems like it splits it up instead of pre-expanding the command.

My understanding is that any of our substitutions should be processed before the changes in this pr get to it, or if you want to use bash-like things like $(cmd args ...) then you need to use shell=True and then this change has no effect.

And yes, there are cases where the user will just need to quote things to avoid them being split.

Also, I'll mention that the argument parsing that happens in xml/yaml is still very fragile, and I would love to improve the robustness of that, for example something like the xml <executable cmd="python3 -c &quot;print('$(var some_launch_config)')&quot;" ... /> will break the parser because it tries to shlex.split() the string python3 -c "print(' and then crashes because of unmatched quotes. But this is a pre-existing issue that this pr neither made better or worse. However, I didn't have time to fully track down a solution to that. It's a tricky problem for sure.

Similarly, we might want to set comments=True on shlex.

I actually considered that, but I thought that might interfere with arguments that contain #, but then they could just be quoted and/or escaped, so maybe it would be good to do that, but on the other hand, why include comments in the command like that? I mean I guess it might be from user input or something, but there really shouldn't be comments in the cmd and if there are, then for them to be process, you should use shell=True. I don't think it's the responsibility/scope of the split_arguments option/idea to handle everything a shell might do.

So I'm kind of 👎 on turning on comments=True for shlex.split(), but I think I could be convinced otherwise.

@wjwwood
Copy link
Member Author

wjwwood commented Sep 22, 2023

Thanks for the reviews so far!

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@wjwwood
Copy link
Member Author

wjwwood commented Sep 26, 2023

Looks like the shlex.split() may be breaking something on Windows with the \ path separators.

@wjwwood
Copy link
Member Author

wjwwood commented Sep 26, 2023

I think d469612 and c75ccf4 should fix it (and I think the right way, based on the docs and some discussions I found online).

New CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@@ -73,6 +74,8 @@
from ..utilities.type_utils import normalize_typed_substitution
from ..utilities.type_utils import perform_typed_substitution

g_is_windows = 'win' in platform.system().lower()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll also comment that this is a good way to detect Windows proper vs Cygwin on Windows, versus other ways of detecting windows, like os.name.

@methylDragon
Copy link
Contributor

I tested this with shlex.split() and it seems like it splits it up instead of pre-expanding the command.

My understanding is that any of our substitutions should be processed before the changes in this pr get to it, or if you want to use bash-like things like $(cmd args ...) then you need to use shell=True and then this change has no effect.

And yes, there are cases where the user will just need to quote things to avoid them being split.

Also, I'll mention that the argument parsing that happens in xml/yaml is still very fragile, and I would love to improve the robustness of that, for example something like the xml <executable cmd="python3 -c &quot;print('$(var some_launch_config)')&quot;" ... /> will break the parser because it tries to shlex.split() the string python3 -c "print(' and then crashes because of unmatched quotes. But this is a pre-existing issue that this pr neither made better or worse. However, I didn't have time to fully track down a solution to that. It's a tricky problem for sure.

Similarly, we might want to set comments=True on shlex.

I actually considered that, but I thought that might interfere with arguments that contain #, but then they could just be quoted and/or escaped, so maybe it would be good to do that, but on the other hand, why include comments in the command like that? I mean I guess it might be from user input or something, but there really shouldn't be comments in the cmd and if there are, then for them to be process, you should use shell=True. I don't think it's the responsibility/scope of the split_arguments option/idea to handle everything a shell might do.

So I'm kind of 👎 on turning on comments=True for shlex.split(), but I think I could be convinced otherwise.

No issues on both of those!

I was thinking comments in the command that's passed could be used as comments. But all contexts in which you'd conceivably use this would also have the ability to have comments, so there's no need for that.

Looking at the windows changes

Copy link
Contributor

@methylDragon methylDragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question, but otherwise I'm approving this conditioned on green CI

launch/launch/actions/execute_local.py Outdated Show resolved Hide resolved
@wjwwood
Copy link
Member Author

wjwwood commented Sep 27, 2023

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@methylDragon
Copy link
Contributor

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

Return code None 😱 😱 😱

@wjwwood
Copy link
Member Author

wjwwood commented Sep 28, 2023

Yeah, there's still an issue on Windows. I'm in the process of spinning up a workspace on my Windows VM to debug it.

Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
Signed-off-by: William Woodall <william@osrfoundation.org>
@wjwwood wjwwood force-pushed the wjwwood/shlex_split_cmd_to_executable branch from 0c54b3a to 03fed58 Compare January 22, 2024 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Substitutions in XML are not expanded intuitively when used for the purpose of executing programs.
3 participants