Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermediate commit on each ISPAC... #24

Closed
fpatou opened this issue Aug 31, 2018 · 8 comments
Closed

Intermediate commit on each ISPAC... #24

fpatou opened this issue Aug 31, 2018 · 8 comments
Assignees

Comments

@fpatou
Copy link

fpatou commented Aug 31, 2018

hello,
Great tool and I'm glade to use it and see that some other use it to...

I use it to analyse 1200 dtsx files upon 100 ispac files. I build a pivot table enabling the analysis of all the material stored in the SSIS_META db by adding some objects in the database. (i would be happy to share if you want).

For the time being I'm wondering if this is possible to have an intermediate commit after each ISPAC analysis file as the analysis phase take around 3 hours and i get a timeout at final commit ?

Or is it possible to add (and replace would be the cherry on top of the cake) ispac to existing RUN ID ?

What do you thing about this features ? Do you think this if feasible ? and can you help me to locate the part of the code i would have to change....

Of course i already to look at it by myself and make some changes but would be great if someone can help me to understand some part of the architecture of the application.

Reagrds

@keif888
Copy link
Owner

keif888 commented Sep 2, 2018

At the moment the commit for the repository is called once the analysis is complete.
It uses SqlBulkCopy to save the data, utilising the WriteToServer method. There is another overload on that method, that allows specifying a DataRowState. See the Repository.cs around line 261.

In theory, the commit could be modified to use the alternate WriteToServer(DataTable, DataRowState) and include only DataRowState of Added.
Then after each commit of a table, the rows would need updated to a DataRowState of Unchanged, via a call to AcceptChanges() against the table.
WriteToServer does not update the records in the associated DataTable's state.
Then the repository commit could be called many times, for just added records.
Some work to handle DataRowState.Modified will be required, as comments can be updated. But the vast majority of the execution is adding new records.

If this was done, then the various loops that are scanning items, could include a call to the repository commit.

@fpatou
Copy link
Author

fpatou commented Sep 4, 2018

Hello...
I did a modifications see blow and it works for the context of ispac files

in program.cs :
public static void Commit_Intermediate(Repository repository)
{
Console.Write("Committing ISPAC analysis information to database...");
repository.Commit_Intermediate();
Console.WriteLine("Done for ISPAC File.");
}

in Repository.cs :
public void Commit_Intermediate()
{
// write each data table to the database
SqlBulkCopy bulkCopy = new SqlBulkCopy(repositoryConnection);

		bulkCopy.DestinationTableName = "RunScan";
		bulkCopy.WriteToServer(runScanTable);
		runScanTable.Clear (); //in the final & next intermadiate commit mean nothing to add
		
		bulkCopy.DestinationTableName = "Objects";
		bulkCopy.WriteToServer(objectTable);
		objectTable.Clear ();
		
		bulkCopy.DestinationTableName = "ObjectDependencies";
		bulkCopy.WriteToServer(objectDependenciesTable);
		objectDependenciesTable.Clear ();

		bulkCopy.DestinationTableName = "ObjectAttributes";
		bulkCopy.WriteToServer(objectAttributesTable);
		objectAttributesTable.Clear ();

		bulkCopy.DestinationTableName = "ObjectTypes";
		bulkCopy.WriteToServer(objectTypesTable);
		objectTypesTable.Clear ();


		bulkCopy.Close();
	}

and in SsisEnumerator.cs ONLY for ISPAC FILES :

	private void EnumerateProjects(string rootFolder, string pattern, bool recurseSubFolders, string locationName)
    {
        string[] filesToInspect;
        Console.Write("Enumerating projects...");
        filesToInspect = System.IO.Directory.GetFiles(rootFolder, pattern, (recurseSubFolders) ? System.IO.SearchOption.AllDirectories : System.IO.SearchOption.TopDirectoryOnly);
        Console.WriteLine("done.");

        foreach (string projectFileName in filesToInspect)
        {
           EnumerateIntegrationServicePack(projectFileName, locationName);
			Program.Commit_Intermediate(repository);
        }
    }

@keif888
Copy link
Owner

keif888 commented Sep 4, 2018

Your code will work.
It's just not quite the way I have implemented it. (I hadn't added a comment that I'd made changes, as I haven't finished regression testing everything.)

If you grab the latest commit you can see they way that I implemented it, which is slightly different, as I changed the main commit routine to handle both bulk insert and updates to change records.
I've also got significantly more commit's happening, on a "per folder basis", including where you have yours above.

@keif888 keif888 self-assigned this Sep 4, 2018
@fpatou
Copy link
Author

fpatou commented Sep 4, 2018

hello,

Ok i can see your change :-) glade to see you already work on it :-)

As is said can be very interesting in a big SSIS referential to avoid 2 hours treatment to go in the trash...

Of course the next step after this one would be to allow a new analyse to be done with the same run id avoiding to re analyse ISPAC already loaded.

And the cherry on the cake would be to allow the storage of the ISPAC filename + timestamp update date to allow a run upon a ispac referential and update only ISPAC file that have been updated without re analyse all the set...
this is the first milestone of the project for me to implement in order to have a daily updated SSIS metadata upon our 1200 dtsx with 100 Ispac files.

Hope you share the same roadmap :-) any way thank you for your work.

@keif888
Copy link
Owner

keif888 commented Sep 4, 2018

If you could raise each of those ideas as a separate issues.
That way I can tag them as enhancements, and at least provide ideas on how they could be implemented.

@fpatou
Copy link
Author

fpatou commented Sep 4, 2018

NO probs thank you again

@keif888 keif888 added the fixed label Sep 7, 2018
@fpatou
Copy link
Author

fpatou commented Sep 12, 2018

hello...we try this and found that if the project is in exception the commit happen...this corrupt the run cause the analysis is uncompleted.
This commit should only happen if there is no exception i guess ...

Anyway we are now looping on each isapc...meaning this intermediate commit may be not needed anymore.

Regards

keif888 added a commit that referenced this issue Sep 12, 2018
…s fired whilst scanning an SSIS package. Improvements to fix for Issue #24
@keif888
Copy link
Owner

keif888 commented Sep 12, 2018

I have added Rollback capability, which removes any added records, when an exception happens whilst scanning an SSIS package. Each ssis package is now committed individually.

keif888 added a commit that referenced this issue Sep 12, 2018
…back only does the in flight package. Issue #24 fix improvement
@keif888 keif888 closed this as completed Apr 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants