Look at automating indexing load distribution to speed up indexing/re-indexing. #576

Open
WadeBarnes opened this issue Oct 9, 2020 · 1 comment

WadeBarnes commented Oct 9, 2020

When re-indexing the prod environment recently, we broke the indexing load down using `update_index --start` and `--end` and distributed the processes across a number of pods (16). This greatly sped up the indexing process: >8M credentials were indexed in ~6 hours. If the process were automated, the time could be reduced significantly (an hour or more).

Breaking the indexing down into 1 hour intervals worked well for dealing with all of the indexes that were created during the initial data load, which covers the bulk of the data. One month intervals worked well for periods where indexing was done on credentials issued based on BC Registry events.
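
A rough sketch of what the automation could look like: split the overall date range into fixed-size windows, spread the windows across the pods round-robin, and build one `update_index` command per window. Everything here other than `update_index --start` / `--end` is an assumption for illustration: the function names, the `python manage.py` prefix, the ISO-8601 timestamp format, and the example dates are all hypothetical and would need to be matched to the real deployment.

```python
from datetime import datetime, timedelta


def split_windows(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    current = start
    while current < end:
        # Clamp the last window so it never runs past the overall end.
        window_end = min(current + step, end)
        yield current, window_end
        current = window_end


def assign_round_robin(windows, worker_count):
    """Distribute windows across worker_count buckets, one bucket per pod."""
    buckets = [[] for _ in range(worker_count)]
    for index, window in enumerate(windows):
        buckets[index % worker_count].append(window)
    return buckets


def build_commands(windows):
    """Build one command string per window.

    The 'python manage.py' prefix and the timestamp format are assumptions.
    """
    return [
        f"python manage.py update_index --start {s.isoformat()} --end {e.isoformat()}"
        for s, e in windows
    ]


if __name__ == "__main__":
    # Hypothetical range: one day of initial-load data in 1 hour windows.
    windows = list(
        split_windows(datetime(2020, 1, 1), datetime(2020, 1, 2), timedelta(hours=1))
    )
    for pod, bucket in enumerate(assign_round_robin(windows, 16)):
        for command in build_commands(bucket):
            print(f"pod-{pod}: {command}")
```

The window size is just the `step` argument, so the same code could use hourly windows for the initial-load period and a larger step (for example `timedelta(days=30)` as an approximation of a month) for the event-driven periods. Round-robin is the simplest distribution; if some windows hold far more credentials than others, a work queue that pods pull from would likely balance better.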

@WadeBarnes

Example breakdown:
Indexing.txt
