Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Base64 class to match coding style #10371

Closed
wants to merge 1 commit into from

Conversation

Joe-Abraham
Copy link
Contributor

@Joe-Abraham Joe-Abraham commented Jul 2, 2024

Rewrite to match the Velox coding guidelines.
Rename constCharsetContains to findCharacterInCharset for clarity.
Add comments.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 2, 2024
@Joe-Abraham Joe-Abraham marked this pull request as draft July 2, 2024 08:32
Copy link

netlify bot commented Jul 2, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 2de0472
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/66b4dfb3fa75ff0008b7ea8b

Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joe-Abraham Some high-level comments.

velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.cpp Outdated Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.cpp Outdated Show resolved Hide resolved
velox/common/encode/Base64.cpp Outdated Show resolved Hide resolved
Copy link
Contributor Author

@Joe-Abraham Joe-Abraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the changes as per your comments

velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.cpp Outdated Show resolved Hide resolved
velox/common/encode/Base64.cpp Outdated Show resolved Hide resolved
@Joe-Abraham Joe-Abraham force-pushed the clean_base64 branch 2 times, most recently from 72f2e1e to 7b31e43 Compare July 9, 2024 05:01
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joe-Abraham Can you clarify the changes to the API?

velox/common/encode/Base64.h Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
Copy link
Contributor Author

@Joe-Abraham Joe-Abraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated the comments

velox/common/encode/Base64.h Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.h Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
velox/common/encode/Base64.h Outdated Show resolved Hide resolved
@Joe-Abraham Joe-Abraham force-pushed the clean_base64 branch 4 times, most recently from e659d21 to 4ade5dc Compare July 19, 2024 12:02
velox/common/encode/Base64.h Show resolved Hide resolved
velox/common/encode/Base64.cpp Outdated Show resolved Hide resolved
constexpr bool constCharsetContains(

/// Searches for a character within a charset up to a certain index.
static constexpr bool findCharacterInCharSet(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove constexpr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is indirectly used by static assertion, so couldn't be removed

velox/common/encode/Base64.cpp Outdated Show resolved Hide resolved
velox/common/encode/Base64.cpp Outdated Show resolved Hide resolved
velox/common/encode/Base64.cpp Outdated Show resolved Hide resolved
@Joe-Abraham Joe-Abraham force-pushed the clean_base64 branch 2 times, most recently from 94a53ba to 570bb3e Compare July 21, 2024 07:52
@Joe-Abraham
Copy link
Contributor Author

@majetideepak I have made the following changes

  1. checkForwardIndex, findCharacterInCharSetand checkReverseIndex, and other necessary functions have been moved to the anonymous namespace.
  2. checkForwardIndex and checkReverseIndex are used in static assertion, so constexpr can't be removed from all three functions.
  3. I have added new test cases for findCharacterInCharSet.
  4. For the functions checkForwardIndex, findCharacterInCharSet, and checkReverseIndex, I tried moving the function definition to the cpp file, and I am running into the compiler error. I think it is impossible to keep the definition and declaration of the constexpr functions separate. I did a bit of research and found this https://stackoverflow.com/questions/27345284/is-it-possible-to-declare-constexpr-class-in-a-header-and-define-it-in-a-separat

@Joe-Abraham Joe-Abraham force-pushed the clean_base64 branch 3 times, most recently from ccc856e to 551b286 Compare July 22, 2024 03:31
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joe-Abraham I just noticed the usage of checkForwardIndex and checkReverseIndex. They are only used by the static_assert in the cpp file.
They are not used by any other function. Can you clarify the scope of their refactor and the new tests added? Do we need to test outside of those static asserts?

@Joe-Abraham Joe-Abraham force-pushed the clean_base64 branch 5 times, most recently from 9780b07 to f2727c7 Compare August 8, 2024 14:24
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joe-Abraham One comment. Thanks!

velox/common/encode/tests/Base64Test.cpp Outdated Show resolved Hide resolved
@majetideepak
Copy link
Collaborator

@bikramSingh91, @mbasmanova do you have any feedback on this? Thanks!

@@ -86,4 +89,14 @@ TEST_F(Base64Test, calculateDecodedSizeProperSize) {
EXPECT_EQ(14, encoded_size);
}

TEST_F(Base64Test, checksPadding) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider renaming these test to match the methods they are testing: isPadded and numPadding.

@@ -59,9 +62,9 @@ TEST_F(Base64Test, calculateDecodedSizeProperSize) {
EXPECT_EQ(18, encoded_size);

encoded_size = 21;
EXPECT_THROW(
VELOX_ASSERT_THROW(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joe-Abraham Thank you for the cleanup.

template <class T>
static std::string
encodeImpl(const T& data, const Charset& charset, bool include_pad);

/// Encodes the specified data using the provided charset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume here the caller is expected to allocate 'out' of sufficient length. It would be nice to clarify this in the comment.

static uint8_t Base64ReverseLookup(char p, const ReverseIndex& table);
/// Performs a reverse lookup in the reverse index to retrieve the original
/// index of a character in the base.
static uint8_t base64ReverseLookup(char p, const ReverseIndex& reverseIndex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a private method? If so, comments should use //

}

static inline size_t countPadding(const char* src, size_t len) {
/// Counts the number of padding characters in encoded data.
static inline size_t numPadding(const char* src, size_t len) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: using std::string_view for the argument would be nicer

private:
/// Checks if there is padding in encoded data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// for private methods

static size_t
decode(const char* src, size_t src_len, char* dst, size_t dst_len);

/// Decodes the specified number of characters from the 'src' using URL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add some links to material that explains the difference between 'encoding' and 'URL encoding'.

Would you clarify what happens if 'src' cannot be decoded?

static void decodeUrl(
const std::pair<const char*, int32_t>& payload,
std::string& output);

/// Decodes the specified URL encoded text.
static std::string decodeUrl(folly::StringPiece text);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to replace folly::StringPiece with std::string_view

static std::string encodeUrl(folly::StringPiece text);

/// Decodes the specified URL encoded payload and writes the result to the
/// 'output'.
static void decodeUrl(
const std::pair<const char*, int32_t>& payload,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use std::string_view here?

/// Reverse lookup table for decoding purposes.
/// Maps each possible encoded character to its corresponding numeric value
/// within the encoding base.
using ReverseIndex = std::array<uint8_t, kReverseIndexSize>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be public?

if (!src_len) {
return 0;
}

auto needed = calculateDecodedSize(src, src_len);
if (dst_len < needed) {
throw Base64Exception(
VELOX_USER_FAIL(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to make this API non-throwing: https://velox-lib.io/blog/optimize-try-more

template <class T>
/* static */ std::string
Base64::encodeImpl(const T& data, const Charset& charset, bool include_pad) {
/* static */ std::string Base64::encodeImpl(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// static

@mbasmanova mbasmanova changed the title Improve base64 implementation Refactor Base64 class to match coding style Aug 9, 2024
@mbasmanova mbasmanova added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label Aug 9, 2024
@facebook-github-bot
Copy link
Contributor

@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@xiaoxmeng merged this pull request in a4389f9.

Copy link

Conbench analyzed the 1 benchmark run on commit a4389f93.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

@Joe-Abraham
Copy link
Contributor Author

@majetideepak I will add a new PR for status change alone first, and then we will return to this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants