Rewrite the bash scripts in python #1215
Robin Mills <notifications@github.com> writes:
**Proposal**
Rewrite the bash scripts in python
**Design**
The bash scripts are structured. They have a common core in functions.source which we can easily convert to a Python module.
So, `copyFiles a b` becomes `ET.copyFiles('a','b')`
`runTest exiv2 -pa filename` becomes `ET.runTest("exiv2 -pa %s" % filename)`
ET.runTest() executes a command and returns an array of strings (line-by-line output).
We collect the lines as the test proceeds:
```
r=[]
r.append(ET.runTest("..........."))
# And at the end we compare r to test/data/geotag-test.out
ET.reportTest("geotag-test",r)
```
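A minimal sketch of what such a helper might look like, assuming it is built on the standard `subprocess` module. The name `run_test` is a hypothetical stand-in for the proposed `ET.runTest()`, and the sketch uses `extend` rather than `append` so that `r` stays a flat list of lines.

```python
import subprocess
import sys


def run_test(command):
    """Run a command line and return its output as a list of lines.

    Hypothetical stand-in for the proposed ET.runTest(); not project code.
    """
    completed = subprocess.run(
        command,
        shell=True,                 # the proposal passes one command string
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr, as the bash driver does with 2>&1
        universal_newlines=True,    # text mode, so CR-LF arrives as LF
    )
    return completed.stdout.splitlines()


# Collect the output as the test proceeds.
r = []
r.extend(run_test('"%s" -c "print(1); print(2)"' % sys.executable))
```

Because the output is decoded in text mode and split into lines, the `tr -d '\r'` step used in the bash scripts would no longer be needed.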
To avoid piping data with grep, cut and other linux utilities, we implement them in python. So:
```
ET.cut(delimiter, field, lines)   # returns an array of shorter strings
ET.grep("pattern", lines)         # returns a shorter array of strings
```
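As a rough illustration, both filters fit in a few lines of standard-library Python. The function names and argument order below follow the proposal, but the bodies are an assumption: `cut` is taken to mean "from the given field onwards", matching the `cut -d' ' -f 2-` call in geotag-test.sh.

```python
import re


def grep(pattern, lines):
    """Return the lines that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


def cut(delimiter, field, lines):
    """Return each line from the 1-based field onwards (like cut -f N-)."""
    result = []
    for line in lines:
        if delimiter not in line:
            # Like cut without -s: lines with no delimiter pass through whole.
            result.append(line)
        else:
            result.append(delimiter.join(line.split(delimiter)[field - 1:]))
    return result
```

With these, `cut(' ', 2, lines)` drops the first space-separated field from every line, and `grep("GPSInfo", lines)` keeps only the lines that mention GPSInfo.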
The new scripts can be added one at a time without breaking anything.
When a test is converted, we replace the code in foo-test.sh with `python3 foo-test.py`
**Benefits**
1 Cross Platform
2 Simpler design than tests/system_tests.py
In that case system_tests.py should be decommissioned; having yet another
test suite is IMHO not beneficial.
3 Can be introduced as time permits
4 No documentation changes!
5 Eliminate line-ending issues
6 Eliminate dos2unix, tr, pipes and other unix hackery
**Effort**
The existing scripts can be easily converted to python. We have 25 scripts with 1424 lines of code (average 56 lines/script). This is a one-week job.
**Example**
Here's geotag-test.sh
```
#!/usr/bin/env bash
# Test driver for geotag
source ./functions.source
( cd "$testdir"
printf "geotag" >&3
jpg=FurnaceCreekInn.jpg
gpx=FurnaceCreekInn.gpx
copyTestFiles $jpg $gpx
echo --- show GPSInfo tags ---
runTest exiv2 -pa --grep GPSInfo $jpg
tags=$(runTest exiv2 -Pk --grep GPSInfo $jpg | tr -d '\r') # MSVC puts out cr-lf lines
echo --- deleting the GPSInfo tags
for tag in $tags; do runTest exiv2 -M"del $tag" $jpg; done
runTest exiv2 -pa --grep GPS $jpg
echo --- run geotag ---
runTest geotag -ascii -tz -8:00 $jpg $gpx | cut -d' ' -f 2-
echo --- show GPSInfo tags ---
runTest exiv2 -pa --grep GPSInfo $jpg
) 3>&1 > $results 2>&1
printf "\n"
# ----------------------------------------------------------------------
# Evaluate results
cat $results | tr -d $'\r' > $results-stripped
mv $results-stripped $results
reportTest $results $good
# That's all Folks!
##
```
In python, it'll look something like this:
```
#!/usr/bin/env python3
# Test driver for geotag
import exiv2Test as ET
t   = "geotag-test"
jpg = "FurnaceCreekInn.jpg"
gpx = "FurnaceCreekInn.gpx"
r   = []
print(t)
r.append(ET.copyTestFiles(jpg, gpx))
r.append(ET.echo("--- show GPSInfo tags ---"))
r.append(ET.runTest("exiv2 -pa --grep GPSInfo %s" % jpg))
tags = ET.runTest("exiv2 -Pk --grep GPSInfo %s" % jpg)
r.append(ET.echo("--- deleting the GPSInfo tags"))
for tag in tags:
    r.append(ET.runTest("exiv2 -M'del %s' %s" % (tag, jpg)))
r.append(ET.runTest("exiv2 -pa --grep GPS %s" % jpg))
r.append(ET.echo("--- run geotag ---"))
r.append(ET.cut(' ', 2, ET.runTest("geotag -ascii -tz -8:00 %s %s" % (jpg, gpx))))
r.append(ET.echo('--- show GPSInfo tags ---'))
r.append(ET.runTest("exiv2 -pa --grep GPSInfo %s" % jpg))
ET.reportTest(t, r)
# That's all Folks!
##
```
|
@D4N I don't think you've understood my intention. I'm wondering: "Is there a quick and simple way to remove our dependency on bash?" I admire system_tests.py. Our future test strategy should be based on system_tests.py. So, this is a proposal for v0.27.4 and not v0.28. |
This is straight-forward and works on macOS, Windows (msvc/mingw/cygwin), solaris and ubuntu.
Looks very similar to bash:
Curiously, there's an unexpected wobble:
When the command |
The code returning 1 is in I don't think we should do anything about this as it's not an error.
|
I've had ideas about how to achieve this in The current construction of system_tests.py is built around the idea of executing a single command and comparing it to the output string stored in the python test file. Good architecture. The test is totally self contained. We could extend the machinery to compare the total output to the reference file in In this way, we'd need 25 new test scripts (in tests/bash_tests), which would be something very similar to what I've written above. I still believe the project can be done in about 1 week. And if it's done in the system_tests, the effort will live into the future. However, as I have said, this proposal relates to Exiv2 v0.27.4 which might never happen. @D4N. Thoughts? |
The bash script test/geotag-test.sh remains in the test suite as documented for v0.27.3. It can be executed directly in the terminal, or via make. The script becomes a one-liner! The big win is that users of cmd.exe in Windows can run those scripts without requiring msys2/bash. And the cmake command
This is a clear winner and fits perfectly into Dan's tests/system_tests.py. I think Dan's objections are not well grounded. This proposal moves our test apparatus forward without loss of existing infrastructure. If the project Exiv2 v0.27.4 is undertaken, I will implement this in test/exiv2Tests.py. I don't intend to contribute to Exiv2 v0.28. test/exiv2Tests.py can be ported into tests/system_tests.py and it's up to the engineers on the v0.28 project to decide what should be done. |
Here's another reason to go down this "pythonic bash_tests" road. Currently, there's no obvious way to test a cross-compiled build on wine. In README.md, I documented how I tested it using a shared drive. And I've speculated that we could install MinGW/msys2 in wine. However, there's another way. We can easily install python3 in wine and run the 100% python test suite. I've downloaded a zip of python38 from python.org. It's already built and ready to go.
Incredible, almost. It crashes in about 1 second.
It called an API which isn't in wine. However this is very promising. I hope to get this working for Exiv2 v0.27.4. |
I've reported the wine/python3 issue to the python team: I've reported the wine/python3 issue to the wine team: |
The wine people have said "Fixed in Wine 4.0". I'll test my prototype (above) of |
Here's another, and possibly even simpler way to achieve the goal of getting those bash scripts to run on Windows. Use the python cmd2 module: https://www.youtube.com/watch?v=pebeWrTqIIw In the presentation, Todd said "you can run shell commands and pipe the data exactly the same on Windows, Linux - it doesn't matter". This is music to my ears. We simply modify the bash scripts to run in the cmd2 environment and we're done. Don't touch anything in system_tests.py @LeoHsiao1 Interested? |
I tried using cmd2. It executes successfully on Linux:
[root@CentOS ~]# python3 test_cmd2.py
(Cmd) help | grep py | wc -l
1
But it cannot execute bash commands directly on Windows:
(Cmd) help | grep py | wc -l
'grep' is not an internal or external command, nor is it a runnable program
Pipe process exited with code 255 before command could run |
Thanks for looking at this, @LeoHsiao1 Ah, yes. If grep's not there, we'll have to fake it. That's not so tough. You can define commands in cmd2. So, if you define do_grep, it'll get called. I don't have a definitive list of commands used by the test suite, however they are fairly simple utilities like cut, grep, sed, diff and checksum. Another possibility is to get the code for those and compile them. However a lot of the code for those old unix utilities is really horrible. Ugly, ugly code that compiles and links lots of other horrible ugly code. However we could create simple similar commands e2_cut, e2_grep and so on. That's not so tough. However, the do_grep, do_cut strategy is probably easier as it's all written in python. |
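The do_grep idea can be sketched without installing anything by using the standard library's `cmd` module, which uses the same `do_*` naming convention described above. This is only an illustration under that assumption: `FakeShell` is a made-up name, the `grep PATTERN FILE` argument form is invented for the example, and cmd2 itself is not used here.

```python
import cmd
import io
import re


class FakeShell(cmd.Cmd):
    """Tiny command loop with a pure-Python grep (hypothetical example)."""

    prompt = "(Cmd) "

    def do_grep(self, arg):
        """grep PATTERN FILE: print the lines of FILE that match PATTERN."""
        pattern, path = arg.split(None, 1)
        regex = re.compile(pattern)
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                if regex.search(line):
                    # Always emit LF so the output is identical on every platform.
                    self.stdout.write(line.rstrip("\r\n") + "\n")

    def do_quit(self, arg):
        """Leave the command loop."""
        return True


# Commands can be driven from a script as well as interactively.
captured = io.StringIO()
shell = FakeShell(stdout=captured)
```

Calling `shell.onecmd("grep GPSInfo some-file.txt")` dispatches to `do_grep`, and `FakeShell().cmdloop()` would give an interactive prompt with the same command.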
Oh, I thought CMD2 already provided these commands. |
I've had several new ideas about this. I'm writing down my thoughts. This isn't a design set in concrete. It's thoughts.
There are 621 programs run and all of them are built (exiv2 etc)
2 Analysis of test/*.sh and test/functions.sh using
It's cp, mkdir, grep, sed, tr, cut, xmllint, md5sum and diff. We don't need xmllint as it is only used to "prettify" XMP being extracted from the images. However, I'm sure we can easily create something in Python to do this. I've never thought about what md5sum does. I test two files for equality. I think md5sum uses something in zlib and I know python can do that also.
3 Chapter 7 in the book (I/O in Exiv2)
When I started writing the book, I started to write a new program
So, when I started working on tvisitor.cpp, it had an Io class which is similar to (although not as comprehensive as) BasicIo. I decided that samples/io.cpp was unnecessary and removed it from the book (there's a relic in SVN around r4950). If I restore and develop samples/io.cpp, we can use that from cmd2. So |
This looks promising. I've taken the demo code "first_app.py" https://cmd2.readthedocs.io/en/latest/examples/first_app.html#first-application and added the command md5sum.
A big win here is that we get away from subtle differences in the platform implementation of diff, checksum, grep and the other utilities and only have our pythonic flavour. So we escape from bash and linux utilities. We can run this on every platform (Windows, macOS, Linux, UNIX). We have to implement the command-set required by our bash scripts.
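As one illustration of a platform-independent member of that command set, the checksum behind an md5sum command could be computed with Python's `hashlib` module. This is a sketch under that assumption; the thread does not show how the md5sum command was actually implemented, and the function name and chunk size are chosen only for the example.

```python
import hashlib


def md5sum(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:   # binary mode: no line-ending translation
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files can then be tested for equality by comparing `md5sum(a) == md5sum(b)`, with the same result on every platform.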
Here's my first_app.py:
|
There are several limitations that seem significant:
1 Piping stdout
You can only pipe output to shell commands. You can't pipe to cmd2 fake commands. Rats. It's not the end of the world. That implies we have to use named files in the test/tmp directory. I would like a cmd2 fake command such as prettyXML to format XML from stdin:
We may have to express that as:
Perhaps we can persuade Todd to add a decorator to cmd2 to make that elegant and invisible! even if it's implemented with temporary files.
Without hooking the readline parser, I'm not sure you can use ; to separate multiple commands on one line! I don't think this is important. For sure geotag-test.sh does not do this.
We might need to jump in and out of python to achieve this. |
Here's another possibility. Use xonsh. This is a bash look-alike written in python3 and runs on Windows, macOS, Linux, cygwin64 and MinGW/msys2. Can't get it to work on Solaris, FreeBSD or NetBSD. |
I started looking at the test cases for Exiv2 after work today. |
Thanks @LeoHsiao1 I don't think we should use cmd2. That would be a mistake. You can see my original proposal and prototype. I converted geotag.sh We don't have to bother with functions.source, its only purpose was to provide a little library of bash functions. I added that about 2012 because the test suite was just a collection of random bash scripts with no structure at all at that time! xonsh looks promising - however we need it to work everywhere. No exceptions. I couldn't get it to install on UNIX (Solaris, FreeBSD and NetBSD) - probably because the python installed at the moment is < 3.5. By the way, the test suite runs well on Windows when you install MinGW/bash and set up the path correctly. Before you dive in and do work, please run the bash test suite in /test and Dan's excellent /tests/system_tests.py. Think carefully before doing a lot of work - your time is valuable and I want it used effectively. |
In regards to For things like defining functions and all that, we support a slightly extended Python syntax for scripting where it is all normal Python plus the ability to call cmd2 commands from Python and get their output. This is available via the built-in For tests, we have always used |
Thank You for your very helpful input, @tleonhardt. We should also look at I really like Python. It's very elegant, consistent and simplistic - quite the opposite of perl. I gave a 5 minute "lightning" presentation at PyCon2010 in Atlanta, Georgia about Python and Exiv2. 5 minutes of fun. Enjoy! https://clanmills.com/2010/PyCon/RobinPyCon2010.mov |
From your demo code, it looks like you've been using Python for years. |
@LeoHsiao1 I don't use python very often. When I use it, it always seems clean and sensible. You use:
for item in items:
    ....
Makes a lot more sense than:
for ( const structure iterator item = items.begin() ; item != (*item).end() ; item++ )
    ...
I wrote a compiler/interpreter in 1986 for a CAD system. It was called "PL" = Programming Language. The compiler accepted .pls (source) and generated .plc (code) which was executed by plr (runtime) and the debugger was pld. It was based on the PL/0 code in Chapter 5 of Wirth's book "Algorithms + Data Structures = Programs". PL was really like Python. I don't know anything about pytest. Don't know anything about CTest either. The world of software is now huge. More and more has to be learned every year. Management continue with the myth that every engineer can handle everything. Thank Goodness Medical Doctors don't behave like that. The bash scripts work well and I've carefully documented in README.md how to install MinGW and setup the PATH to run them. However, it would be great to replace the bash scripts with python scripts. You can see in the discussion that I agreed with Dan that we should extend his beautiful tests/system_tests.py code to replace scripts such as test/geotag-test.sh with tests/geotag-test.py. |
This is my current understanding:
You mentioned two ways on May 23:
Do you mean to convert bash scripts to Python scripts that depend on |
I don't think we have exiv2tests.py - I think that's something invented by GitHub auto-complete! I think there are 25 scripts:
There are other shell scripts which have nothing to do with test:
tests/system_tests.py is very nice code. However, to "pipe" the data through a filter (such as grep, sed, or cut) involves redefining the stdin/stdout handler and I think that's a little clumsy. I think you should investigate what I've put into the proposal.
What I like about system_tests.py is that the test program is complete. It has both the test and the reference output in a single package. However, I know Dan found it to be a lot of work. For example, the old script test/bugfixes.sh became about 100 python scripts! test/bugfixes.sh was a jumble of tests that had accumulated over about 15 years. You may find that a script such as icc-test.sh becomes a lot of different scripts. We shouldn't do that. We will make mistakes.
If we're running the test icc-test, I think it's fine to generate the output in tests/tmp/icctest.out and compare it to the reference output in test/data/icc-test.out. It's more than fine. It's desirable. We only want to test if the behaviour changes. Please do the comparison in python and not with the utility diff, which has different options and handles line-endings differently on different platforms.
I strongly encourage you to run the existing scripts (both the "new" python script in tests/ and the "old" bash scripts in test/). Look and think before writing lots of new code. I've made a proposal and implemented a prototype for geotag-test. When you understand what already exists, you can choose to ignore my proposal and go down another road. However, I'd like to review your replacement for one test (eg geotag-test or icc-test) before you spend a lot of time on this. I'm not very familiar with the internals of system_tests.py, however I added a feature to that for v0.27.3 and thought the code was well written and quite easy to understand. If you're stuck, Dan will probably help you. |
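The request above to do the comparison in Python rather than with diff could be sketched roughly as follows. The helper names `read_lines` and `report_test` are hypothetical, and the sketch assumes the output and reference files are UTF-8 text; it normalises CR-LF to LF before comparing and uses the standard `difflib` module to show any differences.

```python
import difflib


def read_lines(path):
    """Read a text file as a list of lines with CR-LF normalised to LF."""
    with open(path, "rb") as handle:
        data = handle.read()
    text = data.replace(b"\r\n", b"\n").decode("utf-8", errors="replace")
    return text.split("\n")


def report_test(name, output_path, reference_path):
    """Compare a test's output file with its reference; return True if equal."""
    output = read_lines(output_path)
    reference = read_lines(reference_path)
    if output == reference:
        print("%s: passed" % name)
        return True
    print("%s: FAILED" % name)
    diff = difflib.unified_diff(
        reference, output,
        fromfile=reference_path, tofile=output_path, lineterm="",
    )
    for line in diff:
        print(line)
    return False
```

A call such as `report_test("icc-test", "tests/tmp/icctest.out", "test/data/icc-test.out")` would then behave the same way whether the output was produced with LF or CR-LF line endings.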
BTW, in case you think I was making up the story about writing a compiler and interpreter, here's proof. Some pages from the user manual and a typical drawing made by the CAD system. Very proud of that project. They laid me off in 1991 when the business was in trouble and I went to AGFA in Belgium to write a PostScript interpreter. And then to Adobe and 15 years in Silicon Valley. It's been an adventure. And now I'm retired and writing my book! |
Thank You for doing this, @LeoHsiao1 I will review this today (Sunday). Where is the code? Have you submitted a PR? Or can you attach a zip to this issue? I'm working on IPTC for the book today. I haven't worked on ISOBMFF for a couple of weeks. |
The content I submitted is in this directory: https://github.com/LeoHsiao1/exiv2/tree/master/test/test_py |
Thanks. I'll look at that later today. |
Thank You @LeoHsiao1 Your code is beautiful and implements my design perfectly. I'm delighted. Buy yourself a beer! I think it's working as follows:
Some comments:
Have a look at what Dan did in tests/system_tests.py. There's a command-line argument
Your code is neat and careful. I like code to be in "columns" whenever possible because my eye can read it more quickly and with less thought.
You've used a variable called
I added a lot of HTTP stuff to that in v0.27.3. I spin up the python web-server to do some HTTP testing. The new code has to be retained. If you want to move it to another file, that's fine. By the way, it was Dan's idea to spin up the python web-server and that has worked well on every platform.
When you've got this into a PR, I'll undoubtedly have more to say. I think you should ask Dan to review it. Dan is really smart and thinks of things that nobody else considers. You'll like him a lot. Great Engineer.
I'm really delighted by your work. Thank You Very Much. |
|
Please submit the PR to 0.27-maintenance. Let's do this in steps. Integrate your utils code into tests/system_tests.py and put something in tests/runner.py to execute pytest for the user. You'll see that in "non-verbose" mode, Dan emits a string of dots as the tests proceed. In "verbose mode" there is a line of output for every test. I find verbose very useful when I add new tests because I can see that they are run. And you can run a single test with the syntax: If you modify runner.py, you'll have clear sight of the --verbose switch and do not need to modify test/Makefile. You will probably have to put your scripts into test/python_scripts that is hidden from runner.py and only contains scripts understood by pytest. Once the PR is in place, you can deal with the tests one at a time. At some time in the future, you or I will open a new issue to say "Remove the old bash tests in test/*.sh". As currently documented the command:
When you have finished the PR, and it has been reviewed and integrated in 0.27-maintenance, we'll have a period when we are running the real bash scripts AND your nice new scripts. By the time of the next release, I hope the bash scripts will be history. I don't know when or what's going to happen next with the project. Since January, there has been almost nothing submitted to 'master'. I shipped 0.27.3 in June. I have offered to ship either 0.27.4 or 0.28. I've had friendly correspondence with the guys, however they are inactive. When I decided at the end of March to do 0.27.3, I hoped they would use the C-19 lock-down to do some work on 'master'. They are three really good guys. They have a lot of other pressure in their life from the office, their partners and so on. Both Dan and Luis have done a lot of great work on 0.27 and 0.28. Kevin did great security work last year on 0.27.2 (and 0.27.3). |
Huge thanks to @LeoHsiao1 for successfully undertaking this project and submitting PR #1257. There are several minor topics remaining such as removing bash from Currently the command This issue will remain open until all use of bash in the test suite is eliminated. |
Thanks to @LeoHsiao1 for undertaking the work to bring this to a successful conclusion. The great work by Dan in system_tests.py has been extended to provide python functions which emulate linux utilities such as grep, diff and cut. Thank You, Mr Lion. |
Proposal
Rewrite the bash scripts in python. The purpose of this proposal is to remove bash from the test suite.
Design
The bash scripts are structured. They have a common core in functions.source which we can easily convert to a Python module.
So, `copyFiles a b` becomes `ET.copyFiles('a','b')`
`runTest exiv2 -pa $filename` becomes `ET.runTest("exiv2 -pa %s" % filename)`
ET.runTest() executes a command and returns an array of strings (line-by-line output).
We collect the lines as the test proceeds:
To avoid piping data via grep, cut and other linux utilities, we implement them in python. So:
When we test the output with `ET.report()` and the test fails, we write the file tmp/test.out and the output can be inspected with your favourite diff utility.
Benefits
1 Cross Platform.
2 Simpler design than tests/system_tests.py.
3 Can be introduced as time permits.
4 No documentation changes!
5 We know the test is identical because we do not touch data/test.out.
6 Eliminate line-ending issues.
7 Eliminate diff, dos2unix, tr, pipes and other unix hackery.
8 Binary output support.
Effort
The new scripts can be developed one at a time without breaking anything.
When a test is converted, we replace the code in `foo-test.sh` with `python3 foo-test.py`
The existing scripts can be easily converted to python. We have 25 scripts with 1424 lines of code (average 56 lines/script). This is a one-week job.
Example
Here's geotag-test.sh
In python, it'll look something like this: