Add parameter to limit depth of recursion across relationships #97
Ideally the best solution would be for it to stop if it reaches a node it's already visited. However, that might be complicated to implement and requires keeping track of all visited nodes. A recursion depth count would stop it running infinitely, but the end result coming from the method would probably be unusable. It'd probably be good to have it throw an exception if the recursion goes beyond a given depth (this would probably be good to have either way). I haven't worked on this project in years, though, so I'm not likely the best person to implement this. A recursion depth counter shouldn't be too hard to implement and I'd accept a PR for that if it had tests and docs.
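The depth-counter-plus-exception idea described above can be sketched with a toy traversal. This is only an illustration: the names `RecursionDepthExceeded` and `build_schema` are hypothetical, and a plain nested dict stands in for the real SQLAlchemy relationship graph.

```python
class RecursionDepthExceeded(Exception):
    """Raised when traversal goes deeper than the allowed limit."""


def build_schema(node, max_depth, _depth=0):
    """Walk a nested mapping of relationships, raising an error rather
    than silently truncating once the nesting exceeds max_depth."""
    if _depth > max_depth:
        raise RecursionDepthExceeded(
            "relationship nesting exceeds max_depth=%d" % max_depth)
    return {name: build_schema(child, max_depth, _depth + 1)
            for name, child in node.items()}
```

Raising instead of truncating matches the suggestion above: the implementer chooses the limit and gets a loud failure if the data is deeper than expected, instead of a schema that is silently missing levels.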
Thanks for taking the time to respond to this @tisdall. This package is pretty useful to my company so I might be able to convince them to allow me some time to create a pull request for this. A couple of responses to your comments:
It seems like a straightforward solution to this would be to add the
Can you elaborate on why the schema may be unusable in those cases? It seems like it would be possible to stop adding relationships once they are more than x levels away from the root table. In our case I can't imagine a scenario where we want tables more than three or four levels from the root added to the schema, so it would still be nice to be able to prevent the creation of large parts of the schema that are never intended to be used.
@grant-humphries - What I meant was that if you had data/schema that went 8 layers deep but returned only 4, the result wouldn't be very usable as it's missing data, and if you used that to edit the data the end result would save only 4 layers and truncate the rest. So, I think it should be an exception and leave it up to the implementer to deal with it as they want. In other words, it should always be the case that … In your specific example, you'd specify the limit to recursion to be 4 since you expect nothing to be deeper than that. Anything that went deeper would be an exception.
It's been a while since I used SQLAlchemy so I can't remember how it handles some things... I don't think you can just look at class names as you could have a table that has foreign keys to itself. I think you'd have to look at the table and the primary keys to identify if it's the exact same row you've already visited.
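The visited-node tracking mentioned above can be sketched with a simplified traversal where a dict of table names to related table names stands in for the SQLAlchemy mapper graph. The function name `reachable_tables` is hypothetical; the point is that tracking what has already been visited lets self-referential and mutually-referencing tables terminate:

```python
def reachable_tables(graph, root):
    """Collect every table reachable from `root`, visiting each table
    once so self-references and foreign-key cycles cannot recurse
    forever."""
    visited = set()
    order = []

    def walk(table):
        if table in visited:      # already seen (possibly a self-referential FK)
            return
        visited.add(table)
        order.append(table)
        for related in graph.get(table, ()):
            walk(related)

    walk(root)
    return order
```

As the comment notes, identifying "the same node" in real data would need the table plus primary key values rather than just a name, but the termination logic is the same.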
This may sound simple-minded, but what about immediately setting the …
@grant-humphries - can you provide a failing example that we can run tests against? Take a look at the 'tests' directory here for how the tests work and see if you can modify one to mimic your data structure and hopefully replicate the problem. |
@tisdall - I'll take a look, as I am encountering the same issue |
@tisdall - Actually, looking at my models and your tests and doing some debugging, I'm not noticing any sort of recursive circle issues. What I am noticing is that it's recursing as deeply down the stack as necessary to create a full schema. My ORM contains about 400 models, many of which are intertwined. To create a full schema for 1 model ends up creating over 250k schemas (many of which are duplicates, just at different levels in different relationships). I have added some additional models to the tests and added one more test to allow for includes to support dot notation. I will likely submit a pull request this week for review. |
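One way to tame the duplicate-schema explosion described above would be to build each schema once and share it across the many places it appears, registering it in a cache before recursing so cycles also terminate. This is only a sketch of the idea (the names `build_schemas` and the dict-based "schema" are invented for illustration), not what the eventual pull request implements:

```python
def build_schemas(graph, root, cache=None):
    """Build one schema object per model instead of one per occurrence
    by memoizing on the model name; repeated relationships become
    shared references rather than fresh copies."""
    if cache is None:
        cache = {}
    if root in cache:
        return cache[root]
    schema = {"name": root, "children": []}
    cache[root] = schema          # register before recursing to break cycles
    for rel in graph.get(root, ()):
        schema["children"].append(build_schemas(graph, rel, cache))
    return schema
```

With 400 intertwined models, memoization turns the 250k-node blowup into at most one schema per model, at the cost of the result being a shared graph rather than a tree.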
@Joishi - How are you planning on handling the issue in the PR? |
I'm not sure what you mean? This is the first time I've worked on a shared project in Github, so I'm not familiar with the expected workflow. I have forked the repo and made some local changes (not pushed yet) .. was going to submit a pull request from my fork. Or maybe I'm not understanding what you are asking correctly. :) |
@Joishi - well, are you adding a parameter to limit the recursion, or are you implementing something else? |
@tisdall - I'm going to add support for dot notation of includes. For example:

includes = [ … ]

This will generate a schema that contains what you specify in the includes, and only those values. If you don't specify an includes, it will get everything by default (as it does now). The above example would generate a schema that includes the relationship rel1, but only the col1 of that relationship. Any relationships THAT object has will not be traversed.
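The dot-notation parsing this implies can be sketched as follows: split each dotted path and fold the parts into a nested mapping, so 'rel1.col1' means "traverse relationship rel1, but only include col1 of it". The helper name `parse_includes` is hypothetical and the sketch ignores how the tree is then applied to schema construction:

```python
def parse_includes(includes):
    """Turn dotted include paths like 'rel1.col1' into a nested dict
    describing which relationships/columns to traverse. An empty dict
    at a leaf means 'include this, traverse no further'."""
    tree = {}
    for path in includes:
        node = tree
        for part in path.split("."):
            node = node.setdefault(part, {})
    return tree
```

A schema builder walking this tree would only recurse into keys that are present, which is how includes can cut off traversal of unneeded relationships.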
@Joishi - Just be aware that I haven't touched anything related to this library for quite some time, so I can't guarantee that it'll be merged any time soon... I'll likely need some input from some other users before going ahead with a merge. |
If a test is run to generate the full Route schema, it creates 113 total SQLAlchemySchemaNode objects as it traverses all relationships. For small ORMs this may not be an issue, but the problem gets exponentially worse as the size of the ORM grows. This commit will enable users to explicitly state which relationships and columns to include in subnodes by allowing for dot-notation syntax in the includes keyword arg. Resolves: stefanofontanelli#97
@tisdall - I submitted the pull request but it failed on 2.6 support; I'm guessing it has to do with the list comprehension I used. I will review what 2.6 support will require. Oh... or not. This is in the build log:
Yeah, SQLA dropped support for Python 2.6 so we probably should too. |
Hi folks, I'm attempting to upgrade from version 0.3.1 to 0.3.3. The recursion across all relationships to create schema nodes for more distantly related tables (as opposed to just primary relationships) is one of the features that is leading me to upgrade, but the ORM that I'm working with is pretty massive. When calling SQLAlchemySchemaNode against one of our tables, I let the code run for a minute or so and the add_nodes method got called just under 41,000 times. I've let it run for several minutes longer than that, but the task I was attempting never completed and python was chewing up a pretty good chunk of memory during that time.

In light of this I propose that a recursion depth parameter be added that limits how many tables away from the root a relationship can be and still be added to the schema. It may be a good idea to have a default value in place for this as well to avoid situations like the one I describe.