Improve performance with large data sets #36

Draft
wants to merge 10 commits into
base: master

@convenient convenient commented Jul 24, 2024

resolves #9

To improve performance:

  • Page over data to prevent loading all into memory
  • Batch the update commands so we don't have to run multiple individual update queries
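
The two points above can be illustrated with a minimal sketch. This is not the code in this PR: it uses Python and an in-memory SQLite table instead of PHP and MySQL, `reencrypt` is a toy stand-in for the real decrypt/encrypt step, and the `entity_id` key column, the helper names, and the demo data are assumptions made for illustration. Here "batching" means sending a whole page of updates in one `executemany` call and one transaction; the PR may batch differently.

```python
import sqlite3


def reencrypt(value):
    """Toy stand-in for decrypting with the old key and encrypting with the new one."""
    return "new:" + value.split(":", 1)[1]


def make_demo_db(row_count):
    """Build an in-memory table filled with hypothetical "old:<n>" values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_order_payment ("
        "entity_id INTEGER PRIMARY KEY, cc_number_enc TEXT)"
    )
    conn.executemany(
        "INSERT INTO sales_order_payment (entity_id, cc_number_enc) VALUES (?, ?)",
        [(i, f"old:{i}") for i in range(1, row_count + 1)],
    )
    conn.commit()
    return conn


def reencrypt_in_batches(conn, batch_size=10000):
    """Page over the table by primary key and update one page per transaction."""
    last_id = 0
    batches = 0
    updated = 0
    while True:
        # Only one page of rows is held in memory at a time.
        rows = conn.execute(
            "SELECT entity_id, cc_number_enc FROM sales_order_payment "
            "WHERE entity_id > ? AND cc_number_enc IS NOT NULL "
            "ORDER BY entity_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        params = [(reencrypt(value), entity_id) for entity_id, value in rows]
        # One executemany call and one commit per page, not one per row.
        with conn:
            conn.executemany(
                "UPDATE sales_order_payment SET cc_number_enc = ? WHERE entity_id = ?",
                params,
            )
        last_id = rows[-1][0]  # resume after the last key seen
        batches += 1
        updated += len(rows)
    return batches, updated


conn = make_demo_db(25)
batches, updated = reencrypt_in_batches(conn, batch_size=10)
print(f"batch count: {batches}, total records updated: {updated}")
```

Paging on the primary key (`entity_id > last_id`) rather than on `OFFSET` keeps each page query cheap however far into the table the loop has progressed, which matters at the row counts measured below.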

TODO

  • We should profile at the end in some manner for a concrete measurement
  • Get something into CI for this
  • Address TODO comments in PR

Summary

  • Original implementation - 2.5 million records ~= 8m46s
  • Proposed implementation - 2.5 million records ~= 1m4s
  • Proposed implementation - 25 million records ~= 11m1s
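
As a rough sanity check on these figures, the short script below turns the rounded timings from the summary into throughput, speedup, and scaling numbers. It only does arithmetic on the values quoted above; the variable names are illustrative and nothing here is measured independently.

```python
import math


def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair into seconds."""
    return minutes * 60 + seconds


# (records, elapsed seconds), using the rounded times from the summary
runs = {
    "original, 2.5M": (2_500_000, to_seconds(8, 46)),
    "proposed, 2.5M": (2_500_000, to_seconds(1, 4)),
    "proposed, 25M": (25_000_000, to_seconds(11, 1)),
}

for label, (records, elapsed) in runs.items():
    print(f"{label}: {records / elapsed:,.0f} records/s")

# Speedup of the proposed implementation on the same 2.5 million records
speedup = runs["original, 2.5M"][1] / runs["proposed, 2.5M"][1]
print(f"speedup at 2.5M records: {speedup:.1f}x")

# How the proposed run time grows when the data set grows tenfold
scaling = runs["proposed, 25M"][1] / runs["proposed, 2.5M"][1]
print(f"time ratio for 10x the data: {scaling:.1f}x")

# Number of batches needed for 2.5 million records at a batch size of 10000
batch_count = math.ceil(2_500_000 / 10_000)
print(f"batch count: {batch_count}")
```

On these numbers the proposed implementation is roughly 8x faster on the same data set, and ten times the data takes only slightly more than ten times as long, which suggests close to linear scaling.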

Baseline build

This was generated by making a branch with the original ChangeEncryptionKey.php file and measuring how it performed under the same test.

For posterity, see #41.

Generating a new encryption key

real    8m46.067s
user    1m57.702s
sys     0m54.372s
PASS
A new key will be generated for re-encryption, use "--key" to specify a custom key.
The system currently has 1 keys
Generating a new encryption key using the magento core class
_reEncryptSystemConfigurationValues - start
_reEncryptSystemConfigurationValues - end
_reEncryptCreditCardNumbers - start
_reEncryptCreditCardNumbers - end
reEncryptEnvConfigurationValues - start
reEncryptEnvConfigurationValues - end
Cleaning cache
Done
cc_number_enc
1:3:Y+Vfqd/P2VFws/1d1l3KUm7DefiEK+8UuPXsjZLPy5yiniWFWUbV4S7YiMQ9VQWV
1:3:ObbdarSNVDDDlbp+Wzs8lkrUNGGSXiEnKNa6J2qztFchM5w8ewKArbyitG95spU1
1:3:dVCvUsULbR5MThPjiZupC96cH/BsIzAKI9Xt3kEDA3PHAktee1wSnLq5WUlVwfJ3
1:3:f42nrfoDfbZxOah560YcrdaD964ClS98QDv5sIi9pM87rcxpamt84r8Y1YS8Ngaa
1:3:pmMR3kjiuo9Ffqrx+a5I3VCyGtjCHeRc6hVQgz+m4zcQvf7QW7I9csNT86un/eUH

New build

https://app.circleci.com/pipelines/github/genecommerce/module-encryption-key-manager/100/workflows/31a8735c-5a08-4add-a5c2-13e1d6c452f9/jobs/94

Generating a new encryption key

real    1m4.381s
user    0m11.750s
sys     0m1.129s
PASS
A new key will be generated for re-encryption, use "--key" to specify a custom key.
The system currently has 1 keys
Generating a new encryption key using the magento core class
_reEncryptSystemConfigurationValues - start
_reEncryptSystemConfigurationValues - end
_reEncryptCreditCardNumbers - start
_reEncryptCreditCardNumbers - total possible records: 2500000
_reEncryptCreditCardNumbers - batch size:             10000
_reEncryptCreditCardNumbers - batch count:            250
_reEncryptCreditCardNumbers - total records updated:  2500000
_reEncryptCreditCardNumbers - end
reEncryptEnvConfigurationValues - start
reEncryptEnvConfigurationValues - end
Cleaning cache
Done
cc_number_enc
1:3:pU72eUmaUyl8zt/+6urRRwXZkQOhOLH5f46wXQvnsIpeX7wmL1R88+scz+rw4utf
1:3:pDFpTbclXYOekn0qhbVmlLYkBWdpksAKe8/2azFXxbaNwpCmiSscdqn30PdTWv1j
1:3:IY0C8vwgATh4xa6Tlz1f7SBgQ77Db9jityxA3JAWSdR/Fhm0aPxWVDiogINr/mIa
1:3:CUbE6GMFaAgOm0Hn3y1FsofhCsnwYh+q9nJX7alYzbgkXkg7zhDqZ9sjjuRIhP+t
1:3:HUG3gXyIXJs7MvdaJqbT3cg9+5qR3up9lb3k5MoxF7VwSpBRYtXzArwjO1qlG6nU

Extra large data set

25 million rows

Generating a new encryption key

real    11m0.911s
user    1m58.354s
sys     0m10.758s
PASS
A new key will be generated for re-encryption, use "--key" to specify a custom key.
The system currently has 1 keys
Generating a new encryption key using the magento core class
_reEncryptSystemConfigurationValues - start
_reEncryptSystemConfigurationValues - end
_reEncryptCreditCardNumbers - start
_reEncryptCreditCardNumbers - total possible records: 25000000
_reEncryptCreditCardNumbers - batch size:             10000
_reEncryptCreditCardNumbers - batch count:            2500
_reEncryptCreditCardNumbers - total records updated:  25000000
_reEncryptCreditCardNumbers - end
reEncryptEnvConfigurationValues - start
reEncryptEnvConfigurationValues - end
Cleaning cache
Done
cc_number_enc
1:3:8hhO5NotWUFLfPAzybRUpojMCZoinhlvWjwQz7h5fY7kWqg4XEgou4IoRtcT50Gj
1:3:9/MFBMZzd/keueHRSYLaB3+WKnbHpdJiATswS8C+xqjdjk2X4TvS71fTPwWRUAoS
1:3:FwMeT8GBvLoeqSN59cvxfCKhmfB/LvgJoX4DoIComQ7hFVdn/GVDsnrRfzgfGtz9
1:3:UYp/qBO3yS/QropCcwsh3VCSlKUHUT0Nhbe79vTMM6FvG99wV27yje6yqhHRxL0a
1:3:O6K5ZOgzFDdg2B5tlAN+Gh9B9dXg1Kn8zyIp5BD5bsAdYYdUAOFLMWp0y88qh3+c

@convenient convenient force-pushed the performance-improvements branch from bbbde12 to 575323e Compare July 24, 2024 13:14
@convenient convenient added the help wanted Extra attention is needed label Jul 24, 2024
Successfully merging this pull request may close these issues.

Improve handling of sales_order_payment cc_number_enc and reencrypt-column