Could you please provide more explanation on how to pretrain PKG? #6

Open
huoxingdawang opened this issue Oct 9, 2023 · 1 comment


huoxingdawang commented Oct 9, 2023

Hi, thank you for your very interesting work and for sharing your code! I was wondering if you could provide more information on how to build the PKG. I saw that you explained the details of how to generate the PKG in the appendix of your paper, but I can't find the relevant code in this repo. In the paper, you mentioned that generating the PKG requires the S3D model. As far as I know, S3D is a very large model that requires GPUs, but the readme also says that "The preprocessing stage does not require GPUs." This really confused me. Can you provide more information?

@hongluzhou
Contributor

Thank you for your interest! Please refer to the function obtain_external_knowledge(args, logger) for the code to build the PKG. Specifically:

  • 'segment_wikistep_sim_scores_ready' indicates whether the similarity scores between video segments and wikiHow steps have been computed and saved to disk. If they are not ready, the function get_sim_scores() will be called.

  • 'nodes_formed' indicates whether the graph nodes of the PKG have been formed. As explained in the paper, graph nodes are created by clustering the wikiHow steps.

  • 'edges_formed' indicates whether the graph edges of the PKG have been established. Once the function get_edges() is completed, the graph structure will be ready.

The subsequent functions prefixed with 'pseudo_label_*' pertain to extracting different types of pseudo labels based on the constructed PKG.
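
The flag-gated flow described in the bullets above can be sketched roughly as follows. This is only an illustration, not the repo's implementation: the three flag names and the functions get_sim_scores() and get_edges() come from this thread, while "get_nodes", `build_pkg_sketch`, and the `stage_fns` dictionary are hypothetical names introduced here.

```python
from types import SimpleNamespace

# (flag on args, stage name). get_sim_scores and get_edges are named in this
# thread; "get_nodes" is a hypothetical name for the step-clustering stage.
STAGES = [
    ("segment_wikistep_sim_scores_ready", "get_sim_scores"),
    ("nodes_formed", "get_nodes"),
    ("edges_formed", "get_edges"),
]


def build_pkg_sketch(args, stage_fns, log=print):
    """Run, in order, every stage whose 'ready' flag is False.

    Returns the names of the stages that were actually run.
    """
    ran = []
    for flag, name in STAGES:
        if getattr(args, flag):
            # The result of this stage is assumed to already be on disk.
            log(f"{name}: result already available, skipping")
            continue
        log(f"{name}: running")
        stage_fns[name](args)
        ran.append(name)
    return ran


if __name__ == "__main__":
    # Placeholder stage functions; the real ones do the actual work.
    stage_fns = {name: (lambda a: None) for _, name in STAGES}
    args = SimpleNamespace(
        segment_wikistep_sim_scores_ready=True,
        nodes_formed=False,
        edges_formed=False,
    )
    build_pkg_sketch(args, stage_fns)
```

With the flags set as in the example, only the node and edge stages would run, since the similarity scores are marked as ready.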

To save computation time and avoid using GPUs during PKG construction, we run the S3D model in advance to extract features and save them to disk. PKG construction then only reads these saved features, which is why the preprocessing stage itself does not require GPUs. Instructions for feature extraction using S3D can be found here: https://github.com/salesforce/paprika#feature-extraction.
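
As a minimal sketch of the "extract once, reuse from disk" idea, the snippet below caches features in a file and then scores segments against steps on the CPU. Everything here is an assumption made for illustration: `load_or_compute`, `dot_similarity`, the pickle file format, and the plain dot-product score are not taken from the repo, whose storage format and scoring may differ.

```python
import os
import pickle
import tempfile


def load_or_compute(path, compute_fn):
    """Return the object cached at `path`; compute and save it only if missing."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute_fn()  # the expensive step, e.g. GPU feature extraction
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


def dot_similarity(segment_feats, step_feats):
    """Dot-product score between every segment vector and every step vector."""
    return [
        [sum(a * b for a, b in zip(seg, step)) for step in step_feats]
        for seg in segment_feats
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "segment_feats.pkl")
        # Stand-in for pre-extracted features: two toy 2-d segment vectors.
        feats = load_or_compute(path, lambda: [[1.0, 0.0], [0.0, 2.0]])
        # A second call reads the file instead of recomputing.
        feats_again = load_or_compute(path, lambda: [])
        scores = dot_similarity(feats_again, [[1.0, 1.0]])
        print(scores)
```

Once the features are on disk, the downstream steps only read files and do arithmetic, so none of them needs a GPU.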
