Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Matplotlib output for Scatter plot #3134

Merged
merged 11 commits into from
Aug 28, 2018

Conversation

markotoplak
Copy link
Member

Issue

Our outputs are ugly and non-configurable. Outputting python code could solve output flexibility problem for people with basic programming knowledge.

Description of changes

The changes here just enable the user to save graphs as Python code. Currently a pyplot plot is drawn directly.

The code gets the data for drawing from the pyqtgraph's graph.

Includes
  • Code changes
  • Tests
  • Documentation

@BlazZupan
Copy link
Contributor

This is a good first step, but it needs some polishing. In the case when dots where colored I am missing a legend. Also, the code includes two calls to plt.scatter, where the first one unnecessary plots a black edge around the point, while removing it I get exactly Orange's plot, as intended:

In any case, the output is much better than current PDF output. It would be great to work on this prototype and complete it to PR, and then ask Jure to help in polishing it. In summary:

  • add the legend (this may require to split plt.scatter to several calls for discrete legend)
  • remove black edges
  • minimize the output by limiting precision for sizes and edgecolors (use, say, np.float16)

@BlazZupan
Copy link
Contributor

Also, could you please add "Portable Document Format (from Matplotlib) (*.pdf)" to the list of supported outputs. The order of the outputs would then be:

  1. Portable Document Format (from Matplotlib) (*.pdf)
  2. png
  3. Scalable Vector Format
  4. Python code (with Matplotlib)
  5. Portable Document Format (*.pdf)

@markotoplak markotoplak force-pushed the matplotlib_scatterplot branch from a29cdf4 to d05db8e Compare July 23, 2018 13:27
@codecov-io
Copy link

codecov-io commented Jul 23, 2018

Codecov Report

Merging #3134 into master will increase coverage by 0.06%.
The diff coverage is 93.72%.

@@            Coverage Diff             @@
##           master    #3134      +/-   ##
==========================================
+ Coverage   82.74%   82.81%   +0.06%     
==========================================
  Files         344      346       +2     
  Lines       59323    59545     +222     
==========================================
+ Hits        49088    49313     +225     
+ Misses      10235    10232       -3

@markotoplak markotoplak force-pushed the matplotlib_scatterplot branch from d05db8e to 73571fe Compare July 23, 2018 13:27
@BlazZupan
Copy link
Contributor

Works well! When exporting matplotlib pdf's on Mac, OS X asks to modify the extension and the whole process gets confusing. Could you please remove .matplotlib from the extension and simply use pdf? Also, if possible, when saving to .py, reduce the precision edgecolors and facecolors to reduce the size of the file. When these are saved for categorical values, perhaps the length of this file could be reduced even more if edgecolors and facecolors would be created programmatically, avoiding creating arrays with huge chunks of arrays with same numbers.

@markotoplak markotoplak force-pushed the matplotlib_scatterplot branch 2 times, most recently from 5c3b86f to d5d1665 Compare July 26, 2018 14:52
@markotoplak
Copy link
Member Author

I added "compression" of colors to shorten the generated code. Generating code with different numpy versions is a pain - therefore I avoided numpy's repr. This exporter is now only enabled for the scatterplot as the only ".pdf" exporter.

Missing functionality:

  • a legend

A test for the CD diagram fail now. It only fails if I run all the tests, not just that one. Perhaps some file imports matplotlib before and that confuses it. @lanzagar ? If we are making matplotlib a dependency, that is easy to solve: just stop using mock in that test.

Warning: I added matplotlib to requirements. Before merging we would need to take care that package build files are updated.

Looking back, I think this PR should be reimplemented. Reading from pyqtgraph structures has many hidden details, and the code generator is then not aware of what the content really means. implementing equivalent functionality in the corresponding widgets (which also know the meaning of what they draw) should be better. I did not go that way because:

  1. It was easier to integrate my current approach with how graphs are saved (FileFormats taking a scene).
  2. Because scatterplot code is so dirty that was hard to find what to draw in there.

A mistake, probably!

@BlazZupan
Copy link
Contributor

BlazZupan commented Jul 31, 2018

On "I think this PR should be reimplemented: reading from pyqtgraph structures has many hidden details, and the code generator is then not aware of what the content really means." -- I agree, it would probably be easier if the implementation would be widget-specific, but at this stage, what do you suggest. I guess we have two options:

  1. merge these changes reimplement
  2. close with no merge, reimplement

@markotoplak, which one do you think is better?

I wish this implementation would set an example how to implement this functionality in other widgets, where reading the pyqtgraph structures would be even more problematic or not possible since other means of plotting where used. Example include hierarchical clustering, box plot, and silhouette plot.

@markotoplak
Copy link
Member Author

markotoplak commented Aug 1, 2018 via email

@markotoplak markotoplak force-pushed the matplotlib_scatterplot branch from 1786167 to f4a8a4f Compare August 24, 2018 11:50
@markotoplak markotoplak changed the title [WIP][NOMERGE] Matplotlib output for Scatter plot [ENH] Matplotlib output for Scatter plot Aug 24, 2018
@markotoplak
Copy link
Member Author

@ales-erjavec, could you please check if I added the matplotlib dependency correctly?

@ales-erjavec
Copy link
Contributor

could you please check if I added the matplotlib dependency correctly?

Looks Ok.

@lanzagar lanzagar added this to the 3.16 milestone Aug 28, 2018
@lanzagar lanzagar merged commit d3aae03 into biolab:master Aug 28, 2018
@markotoplak markotoplak deleted the matplotlib_scatterplot branch September 7, 2018 09:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants