Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added jupyter notebook integration and new visualization #129

Merged
merged 99 commits into from
Jun 14, 2023

Conversation

khoda81
Copy link
Contributor

@khoda81 khoda81 commented Mar 17, 2023

Changes in this PR

  • New visualization
    • States are circles instead of ellipses
    • Final state is indicated with a doublecircle
    • Initial state is indicated with a horizontal arrow
    • Multiple transitions between two states are grouped in a comma-separated label
    • Added support for SVG and JPEG
  • Jupyter integration
    • Automatically display the SVG representation of FA
    • Support for "image/jpeg", "image/png" and "image/svg+xml" mime-types

Examples of visual changes

Old New
old dfa new dfa
N/A (Resulted in an error, the generated dot was not valid) dfa-minify

More Example images

images.zip
The ones not included in the old folder resulted in error when generating.

Note: Tests for the pydot graph need to be updated. I am waiting for these new visualizations to be approved and finalized first.

@eliotwrobson
Copy link
Collaborator

@khoda81 This is really cool! I haven't looked at the code super closely, but would it be possible to use this Jupyter integration as part of an optional dependency, as in the discussion here: #124

I think that it would be really nice to have the changes here accompany a move towards the architecture described in the issue above.

@khoda81
Copy link
Contributor Author

khoda81 commented Mar 22, 2023

@eliotwrobson Looking at Visual Automata seems like I reinvented the wheel. I wish it were mentioned in the project readme.

I agree that DFA/NFA/GNFA classes need some refactoring action; the way I implemented this visualization is by adding IPython integration in the FA class and implementing a few methods (like iter_states, iter_transitions, is_accepted, ...) in the DFA/NFA/GNFA classes to refactor the visualization logic. The visualization code does not care about what is being visualized it just calls the iter_* methods. This also makes it really easy to add visualization for PDAs and others. This is (for now) lacking from the Visual Automata. Refactoring this logic would cut the size of the code almost in half.

We can add Visual Automata as an optional dependency, then surround the import with a try-except block; just like it is done in many libraries like flask. The rest is just replacing my visualizations in the FA class with Visual Automata ones and printing a warning if the import failed and falling back to the default repr of the class. This way the only code added to this library would be only a few lines of code in the FA class, with a few utility methods (iter_states, iter_transitions, is_accepted, ...) in DFA/NFA/GNFA.

This would also remove the need for the VisualAutomata/Visualizer class for visualizations because the IPython integration methods are only called by Jupyter. So there is no need to wrap the DFA and it will use the visualization automatically when used in Jupyter and act just like it did before everywhere else and when Visual Automata import failed.

If my rough explanations made sense and this is what you expected, I will implement these changes.

@eliotwrobson
Copy link
Collaborator

@khoda81 I think the way you refactored this is definitely more towards the direction that I think we want to go in with respect to the visualizations described in that issue. Instead of adding Visual Automata as a dependency, this would be a great opportunity to start to integrate that code with your refactors in here (since the purpose of that original issue was to move Visual Automata into this library). As you stated, with your refactors, the size of the code will decrease significantly.

@caleb531 do you have any input?

@caleb531
Copy link
Owner

@eliotwrobson @khoda81 I am hesitant about adding Visual Automata as a dependency because it would duplicate all of the non-visualization Automata code. So I am more in favor of integrating the code into this library.

While I do like my Visualizer class proposal from #124, I also want to consider what is most practical for people using this library. If this PR provides the right IPython / Jupyter integrations (i.e. something more practical than the Visualizer class idea), then I'd be fine with that direction. Especially considering that I am not very familiar with IPython or Jupyter.

I suppose if we nix the Visualizer class, I will again raise my concern about the DFA/NFA classes becoming too bloated, but I also want to recognize that practicality beats purity.

Also cc'ing @lewiuberg on this thread so that he's in the loop about this proposed new direction, especially since he was already supposedly starting on the Visualizer implementation for #124. @lewiuberg, if you have any thoughts on this, please feel free to chime in.

@khoda81
Copy link
Contributor Author

khoda81 commented Mar 23, 2023

@eliotwrobson @caleb531 Alright then, not adding Visual Automata as a dependency makes this implementation a whole lot easier, it's just a matter of stealing the visualizations from Visual Automata and replacing mine with them.
This is the implemented behavior in Jupyter:

from automata.fa.dfa import DFA

dfa = DFA(
    states={'q0', 'q1', 'q2'},
    input_symbols={'0', '1'},
    transitions={
        'q0': {'0': 'q0', '1': 'q1'},
        'q1': {'0': 'q0', '1': 'q2'},
        'q2': {'0': 'q2', '1': 'q1'},
    },
    initial_state='q0',
    final_states={'q1'}
)

dfa

dfa

dfa.minify(retain_names=True)

dfa-minify

It would act just like it did before everywhere else and prints the result of dfa.__repr__(). There would be no need for any Visualizer class. These integrations happen by implementing _repr_mimebundle_ (implemented in FA) which is only called by Jupyter to check if there is any visualization available, in case of an error it would fallback to __repr__. This would be way more practical.
Then the purpose of the show_diagram method would be to give the user options to customize visualizations and save them if needed.

@lewiuberg
Copy link
Contributor

Hi guys! First, sorry for being a bit on/off. I have had a crazy period at work. I have been working on this from time to time, but since automata isn't really something that affects me in my work the direction we decide on isn't that important to me. If you guys want to "steal" the code or add it as a dependency it is fine by me. I just feel it is very beneficial for learning purposes and I don't want all the time I invested into it to go to waste. Although I think it would be the most effective to implement it to automata-lib in terms of maintenance.

One cool thing I want to share is that I just spoke to a professor at Colombia University, and they are now starting to use Visual-Automata in their classes 😃

@eliotwrobson
Copy link
Collaborator

One cool thing I want to share is that I just spoke to a professor at Colombia University, and they are now starting to use Visual-Automata in their classes 😃

That's outstanding! That's actually part of why I really like this PR, the integration with something like Jupyter makes it much easier to create self-contained course content using this library.

Copy link
Collaborator

@eliotwrobson eliotwrobson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, looks really great so far!

automata/fa/fa.py Outdated Show resolved Hide resolved
automata/fa/fa.py Outdated Show resolved Hide resolved
automata/fa/fa.py Outdated Show resolved Hide resolved
automata/fa/fa.py Outdated Show resolved Hide resolved
automata/fa/nfa.py Outdated Show resolved Hide resolved
automata/fa/fa.py Outdated Show resolved Hide resolved
automata/fa/fa.py Outdated Show resolved Hide resolved
@caleb531
Copy link
Owner

caleb531 commented Mar 26, 2023

@khoda81 This is a great start so far! Thank you so much for what you are contributing.

I do have a few points of constructive feedback:

  1. I've tried to chime in on all the comments that @eliotwrobson left for you. I agree with most of his feedback, and would ask that you implement those suggestions of his on which I have indicated approval.
  2. The automated tests are currently failing. Please make sure to run nose2 while you are developing and fix any broken tests (or, as needed, add new tests to maintain code coverage)
  3. I think my recent merge of Better numbering usage + other small updates #130 has caused a conflict in this branch, so that will need to be resolved as well before I am able to merge this PR

automata/fa/fa.py Outdated Show resolved Hide resolved
@eliotwrobson eliotwrobson mentioned this pull request Mar 29, 2023
automata/fa/dfa.py Outdated Show resolved Hide resolved
automata/fa/dfa.py Outdated Show resolved Hide resolved
@caleb531
Copy link
Owner

caleb531 commented Apr 1, 2023

@khoda81 Just a heads up—for more consistent code style across the entire codebase, I plan to integrate the black auto-formatter and apply it to all Python files in the library.

Once I do this on the develop branch, I will update this PR to auto-format the files you've already modified. This should prevent future formatting conflicts, and also eliminate some feedback loops in this PR for pedantic topics like quote style. 😅

cc @eliotwrobson

@caleb531 caleb531 force-pushed the jupyter-integration branch 2 times, most recently from cf81d69 to 2be4d6c Compare April 1, 2023 19:15
@caleb531
Copy link
Owner

caleb531 commented Apr 1, 2023

@khoda81 Okay, I've added black to the develop branch, and have rebased your feature branch (jupyter-integration) off of my official develop branch. Please make sure to pull in these new (rebased) changes properly.

I've done my best to resolve conflicts and preserve the integrity of your changes, but please double-check just to be sure.

cc @eliotwrobson

@khoda81
Copy link
Contributor Author

khoda81 commented Apr 2, 2023

@caleb531 black is a great addition! It would be cool to add it as a pre-commit hook. I might do that after this PR. It would help greatly with contributions. I usually use Gymnasium's pre-commit hooks as a template for my own projects.

@caleb531
Copy link
Owner

caleb531 commented Apr 2, 2023

It would be cool to add it as a pre-commit hook. I might do that after this PR. It would help greatly with contributions. I usually use Gymnasium's pre-commit hooks as a template for my own projects.

@khoda81 So I have limited knowledge of pre-commit hooks, but from what I understand, they can't be committed to the repository directly and require some manual setup to activate. So it seems like the pre-commit hook is only good for those contributors who take the time to set it up. I guess the same could be said of editor plugins.

In your experience, is there a good way to streamline the management/adoption of pre-commit hooks for a project?

cc @eliotwrobson (in case you have any thoughts on this)

@khoda81
Copy link
Contributor Author

khoda81 commented Apr 2, 2023

@caleb531 To add them to this project we need to only add the .pre-commit-config.yaml file to the root of the project, which contains the hooks (like black, trailing-whitespace, isort, etc) and configurations related to them.
What changes for contributors is explained in the contribution file here (which this project also needs, btw). The setup is pretty minimal, pre-commit can be pip installed once and most hooks would also be automatically set up once. Visit pre-commit.com for more info.

@caleb531
Copy link
Owner

caleb531 commented Apr 2, 2023

@khoda81 This project does have a CONTRIBUTING file, but it's under the .github subdirectory:

https://github.com/caleb531/automata/blob/develop/.github/CONTRIBUTING.md

Anyway, you've persuaded me that a pre-commit hook would be a valuable addition to this project. I would greatly appreciate a separate PR for this if you have the time to contribute one.

@khoda81
Copy link
Contributor Author

khoda81 commented Apr 2, 2023

@caleb531 my bad, I somehow missed the file. I will add pre-commit hooks in a separate PR.

@khoda81
Copy link
Contributor Author

khoda81 commented Apr 3, 2023

@eliotwrobson @caleb531
I have been playing around with different implementations of the step visualization feature and I have two ways of achieving it:

  1. Adding an alternate method to DFA/NFA that returns the steps with a data structure that is easier to work with for visualization
  2. Adding a switch case for DFA/NFA in the show_diragram method to translate the result of read_input_stepwise based on the class that we are working with.

The first solution is more on brand with the rest of this PR but would add another method for the purpose of visualization which I have been trying to avoid. The upside would be that if implemented properly it can replace read_input_stepwise and gives a more generalized API for FAs and makes it easier to add visualizations for PDAs or any custom automata as well. The second solution though would be probably less code and using the existing code.
We can also use the second option as a temporary solution and play around with different generalizations of FAs in a separate PR.
Any thoughts on this?

@eliotwrobson
Copy link
Collaborator

@khoda81 For option 1, do you mean that the method would return a data structure holding information about each step that is taken, and then the visualization code could just read this directly? Does this differ greatly from just yielding all results from the generator given by read_input_stepwise? I don't have an immediate strong opinion either way.

@caleb531
Copy link
Owner

caleb531 commented Apr 4, 2023

@khoda81 I think any major re-architecture of the input-reading APIs will require some iteration and proof-of-concepts, for sure. On that note, I like the idea of starting with the second option since it seems like it would have less of an API impact, and would be a good starting point to explore larger changes.

Out of curiosity, what kinds of detail is missing from the current input-reading API that a new API would be more useful in providing?

@khoda81
Copy link
Contributor Author

khoda81 commented Apr 5, 2023

@caleb531 @eliotwrobson My proposal for the new API is here #138.
I need the read_input_stepwise to tell me which edges were used in every single transition. I will handle NFA/DFA input read separately using a good old if-else for now...

BTW, I have a hard time figuring out what the purpose of GNFA class is. Does not seem to be used anywhere internally.

@khoda81
Copy link
Contributor Author

khoda81 commented Apr 9, 2023

Step visualizations are implemented and seem to be working properly. I am torn between different ways to label transitions in a way not to make the diagram cluttered but concise.
step_viz1
step_viz3
step_viz5
@eliotwrobson @caleb531 any opinions/proposals?

@eliotwrobson
Copy link
Collaborator

@khoda81 I think it's always going to be a bit tricky visualizing D/NFAs reading input much longer than their number of states (inherently things are going to get cluttered, it would be better to have an animation rather than a static image but that's way out of scope I think). I like the way you have the color grading become more green as more of the input is being read. Maybe try italicizing the [#1] labels so they stand out from just the input symbols? Overall things look good though.

Could you try an example with a somewhat larger DFA constructed using the from_finite_language method? I think it might be a common use case for people.

I haven't looked at the code, but it seems like a few of the backend changes might overlap a bit with #139, and this branch is already bit out of date with develop. Once that PR is merged, the next step is to rebase the changes here and see if any logic can be condensed (cc @caleb531 to see if we can get that PR merged soon, since it's already been approved).

automata/fa/nfa.py Outdated Show resolved Hide resolved
@eliotwrobson
Copy link
Collaborator

eliotwrobson commented Jun 6, 2023

@caleb531 @khoda81 I've pushed changes 10706a6 that fix the correctness issue I was seeing (at least when testing with the small example NFA), along with tests verifying this for that particular instance. There have been a few uncovered lines that have been exposed, but the test cases are checking for most of the important issues related to correctness.

The preceding issue was the last major obstacle to this PR. All that remains is to add a few test cases that ensure correctness and cover the last remaining lines. I should be able to attack some of these tests, but would greatly appreciate some help on this, as there are a few pesky uncovered lines that require some weird setup code to hit. I also want to be sure about the optimality of the new accepting path finding algorithm.

EDIT: After making another pass, all that's left is to test that everything works even with the optional imports.

@eliotwrobson eliotwrobson mentioned this pull request Jun 6, 2023
2 tasks
@caleb531
Copy link
Owner

@eliotwrobson @khoda81 At this point, I'm tempted to just bypass the slight drop in coverage (to address in a separate PR) for the sake of merging this PR. It will unblock #147 and any other future PRs that touch these files (which would often be the case). I've also been realizing that achieving a perfect 100% code coverage can be impractical or even unattainable, sometimes.

@eliotwrobson I think my only requirement for wrapping up this PR at this point would be to work through the remaining comment threads to reach resolution for all of them.

@eliotwrobson
Copy link
Collaborator

@caleb531 sounds like a plan. I closed all of the conversations related to changes that I already made. There are only 4 remaining ones open, and they're all fairly tangential or something that we can resolve with testing later (minor stuff related to how the project is configured). I'll make another pass.

@caleb531 caleb531 added this to the v9 milestone Jun 14, 2023
@eliotwrobson
Copy link
Collaborator

@caleb531 I've resolved all of the open comment threads on this PR. The only dangling items are:

  • The coverage, where I agree with you that we should merge and then maybe add a test that everything works without the optional dependencies later.
  • The documentation, which I'm inclined to go over later since this PR doesn't really change the API in a substantial way, and we can add some of the new helper functions to the public interface before release.

Other than those, we have very high coverage and tests for most everything that's been added. Once you give the code here a quick lookover / are comfortable, I think we're good to go ahead and merge!

Copy link
Owner

@caleb531 caleb531 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eliotwrobson @khoda81 This works for me! I've just approved everything, so I'll go ahead and merge now.

@caleb531 caleb531 merged commit d90100a into caleb531:develop Jun 14, 2023
@eliotwrobson eliotwrobson mentioned this pull request Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants