From 0419fd35b19cea9ce86cf786e69d410233effdb6 Mon Sep 17 00:00:00 2001
From: Qing Tomlinson
Date: Tue, 24 Sep 2024 15:23:45 -0600
Subject: [PATCH] Introduce a new traversal policy

The "always" traversal policy behaves as follows:
- If the tool result (e.g. licensee) for a specific component exists, the
  component will be refetched and the tool will be rerun.
- If the tool result for a specific component is missing, using the
  "always" policy leads to an "Unreachable for reprocessing" status and
  the tool being skipped.

The "always" traversal policy is basically a rerun of all the previously
run tools. This makes it somewhat cumbersome for retriggering a harvest,
especially in integration tests.

The proposed new policy makes reharvesting simpler:
- When the tool result for a component is available, the tool will be
  rerun and the tool result updated, similar to the "always" policy.
- When the tool result for a component is not available, the component
  will be fetched and the tool will be run.

In summary, the "reharvestAlways" policy reruns the harvest tools if
results exist and runs the harvest tools if results are missing.
---
 ghcrawler/lib/traversalPolicy.js | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/ghcrawler/lib/traversalPolicy.js b/ghcrawler/lib/traversalPolicy.js
index badb8679..208113c4 100644
--- a/ghcrawler/lib/traversalPolicy.js
+++ b/ghcrawler/lib/traversalPolicy.js
@@ -143,6 +143,10 @@ class TraversalPolicy {
     return new TraversalPolicy('storageOnly', 'always', TraversalPolicy._resolveMapSpec(map))
   }
 
+  static reharvestAlways(map) {
+    return new TraversalPolicy('mutables', 'always', TraversalPolicy._resolveMapSpec(map))
+  }
+
   static clone(policy) {
     return new TraversalPolicy(policy.fetch, policy.freshness, policy.map)
   }
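
Review note: the sketch below illustrates the difference the commit message describes between the two policies. It is a simplified stand-in, not the real class. `TraversalPolicySketch` and `describeMissingResult` are hypothetical names, the map argument is passed through without `_resolveMapSpec`, and attributing the `'storageOnly'` factory in the diff context to the "always" policy is an assumption based on the commit message. The outcome strings only restate what the commit message says happens when a tool result is missing.

```javascript
// Hypothetical, simplified stand-in for TraversalPolicy. It models only the
// three constructor arguments visible in the diff.
class TraversalPolicySketch {
  constructor(fetch, freshness, map) {
    this.fetch = fetch
    this.freshness = freshness
    this.map = map
  }

  // Assumed to correspond to the existing "always" policy (diff context lines).
  static always(map) {
    return new TraversalPolicySketch('storageOnly', 'always', map)
  }

  // Mirrors the factory added by the patch: same freshness, different fetch.
  static reharvestAlways(map) {
    return new TraversalPolicySketch('mutables', 'always', map)
  }
}

// Hypothetical helper restating the commit message: what happens to a
// component that has no stored tool result under a given policy.
function describeMissingResult(policy) {
  return policy.fetch === 'storageOnly'
    ? 'skipped (Unreachable for reprocessing)'
    : 'component fetched and tool run'
}

const always = TraversalPolicySketch.always({})
const reharvest = TraversalPolicySketch.reharvestAlways({})
console.log('always:', describeMissingResult(always))
console.log('reharvestAlways:', describeMissingResult(reharvest))
```

Both factories share the `'always'` freshness value, so existing results are rerun either way. Only the fetch argument differs, and per the commit message that difference is what lets missing results be harvested instead of skipped.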