Slow export on large database due to ClearFilesByExtension #527
Thanks for the feedback on this! I am interested in seeing some more details on the performance issues; I recently did some work on a file scanning function. But before I dig into this, I had a couple of questions about your environment. Are these database projects on local SSD storage, or a network location? Several minutes seems excessively long, especially for subsequent exports with few changes. Would you be able to post the Performance Reports from the end of the export log on one of these larger databases? That could be helpful in seeing the counts and times for some of the internal operations.
I would love to assist! Let me gather the requested info. We would be a good test environment because of the large number and large size of our databases.
We are in a high-end network environment with state-of-the-art hardware. Development is primarily performed on a remote desktop server, and all storage locations are network locations. Unfortunately, we are still stuck on the legacy .mdb database format. We are mostly split FE/BE with linked tables to SQL Server on the network; there are a few local Access tables used primarily as temporary locations for data as needed.
I want to emphasize that I don't think ClearFilesByExtension() itself is the issue, but rather where it is called in the loop. As a result, it runs 1,500 times (500 x 3 for the 3 extensions) if we have 500 tables. It needs to run just once, before the loop. I am fairly confident in this analysis, and that's where the delay is.
Export.log attached. I ran this one this morning and it took about 30 minutes due to the tables issue.
Thank you! This is helpful feedback because sometimes performance issues only become obvious in scenarios dealing with large amounts of data.
The remote desktop server shouldn't be a problem, although you will probably see a difference between SSDs and mechanical drives due to the higher random-access read throughput that SSDs provide across different parts of the drive at the same time.

The key here is to make sure the developers are working on a local clone of the repository directly on the application server, not trying to export source to a network location on a different server. (It will still function over a network share, but performance will suffer drastically.) In the ideal scenario, each developer makes a clone of the relevant repositories (one for each database or system) and performs the development work locally on their machine, committing changes back to the repository, which are then merged to the upstream forks per your development workflow. Operations like exporting and building from source are very fast when running on an SSD and typical development hardware.

Obviously not everyone gets to choose how the development process happens in their environment, but I share that background as what I consider a typical usage scenario, for which this add-in is designed to (hopefully) function as optimally as possible.
I absolutely agree! This should only be called once during an export operation, and probably only during a full export. I am already refactoring some things to move this out of the component type classes. Also, just as a friendly note, the GitHub project is public on the Internet, so you will want to redact any sensitive information from your log file that you don't want posted here. I will go ahead and paste in the relevant section below:
As you correctly surmised, clearing files by extension is definitely the bottleneck. The good news is that this should be changing with some refactoring I am working on today.
These calls only need to be run during a full export. Definitely not repeated for each object exported. We may do some further cleanup later as some of these legacy files are far enough removed from the current version that the checks are probably no longer needed. See #527
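One way to express "only during a full export" is a simple flag check ahead of the object loop. A hedged Python sketch of that shape (the real add-in is VBA; `full_export`, `export_table`, and the folder layout here are hypothetical):

```python
import os

def clear_files_by_extension(folder, ext):
    """Delete every file in `folder` ending with the given
    extension (illustrative stand-in for the VBA routine)."""
    for name in os.listdir(folder):
        if name.lower().endswith("." + ext.lower()):
            os.remove(os.path.join(folder, name))

def export_table(folder, table_name):
    """Placeholder for the real per-table export work."""
    with open(os.path.join(folder, table_name + ".json"), "w") as f:
        f.write("{}")

def export_source(folder, tables, full_export=False):
    # Legacy-file cleanup runs at most once, and only on a full
    # export -- never once per exported object inside the loop.
    if full_export:
        for ext in ("LNKD", "bas", "tdf"):
            clear_files_by_extension(folder, ext)
    for table in tables:
        export_table(folder, table)
```

A partial export then touches only the changed objects, while a full export first purges any stale legacy files in one pass.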
@rsonnier-cfs - If you build the
I will get back to this next week and follow up here. Also, I have another suggestion that I have implemented in my copy that I want to run by you to see if it's something you would be interested in adding. It's related to the export path. Should I start a new thread to discuss?
Sure, a new issue would be great for the export path idea. We try to use separate issues for unrelated bugs and feature requests. Thanks!
I am new to this site, but I have been using Version_Control_v3.4.23 for about a year. I tried upgrading to Version_Control_v4.0.34 months ago, but was discouraged by the extreme slowness of the export source function compared to the previous version. Last month, I decided to go ahead and move to the V4 anyway. So my exports have now increased from seconds or a few minutes to 30 minutes or more! BTW...I really like the new version, and it has resolved some of the issues we were experiencing with v3.
In our environment, we have 4 developers and over 40 MS Access databases. Some of those databases have in excess of 400-500 linked tables (I inherited all of this and am in the process of cleaning up unused linked tables). Regardless, many of our databases actively use well over 100-200 linked tables (and views).
When investigating the cause of the extreme delays in the export-to-source process, it was fairly easy to determine that the delay occurred in the export-tables section of the routine. By suspending code and stepping through the export process, I zeroed in on new calls (not in v3) to Public Sub ClearFilesByExtension, which appear to search for and remove legacy files with the extensions "LNKD", "bas", and "tdf" from the tbldefs source folder.
The issue is that ClearFilesByExtension seems to be called in the wrong place in the code. I am still evaluating the logic behind this, but it appears to me that these calls only need to run once during the export-tables section of the code. Instead, the ClearFilesByExtension sub is called in the loop for EVERY table in the database project, so in a large database with hundreds of linked tables, the delay is excessive. Further, my source does not have any of these file types in the tbldefs folder, so I am guessing that ClearFilesByExtension may be related to file types created by an earlier version of the add-in?
    MODULE: modImportExport
      Public Sub ExportSource
        (calls) cDbObject.Export

which leads to:

    MODULE: clsDbTableDef
      Private Sub IDbComponent_Export
        ...
        ClearFilesByExtension IDbComponent_BaseFolder, "LNKD"
        ClearFilesByExtension IDbComponent_BaseFolder, "bas"
        ClearFilesByExtension IDbComponent_BaseFolder, "tdf"
        ...
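For context, the routine named in the trace deletes leftover source files of a given extension from the component's base folder. A rough Python equivalent of that behavior (the actual implementation is VBA and may differ in details such as wildcard matching and case handling):

```python
import glob
import os

def clear_files_by_extension(base_folder, ext):
    """Rough Python equivalent of ClearFilesByExtension: delete
    every '*.<ext>' file in base_folder. Each call scans the folder
    once, which is why repeating it per table is costly, especially
    over a network share."""
    for path in glob.glob(os.path.join(base_folder, "*." + ext)):
        os.remove(path)

def clear_legacy_table_files(base_folder):
    """The three calls shown in IDbComponent_Export amount to
    purging these legacy extensions from the tbldefs folder."""
    for ext in ("LNKD", "bas", "tdf"):
        clear_files_by_extension(base_folder, ext)
```

Because the three calls are invariant across the loop iterations, nothing about them depends on the current table, which is what makes hoisting them safe.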