
Add Cassandra Store #360

Merged: 5 commits merged into feast-dev:v0.3-branch on Jan 8, 2020
Conversation

@smadarasmi (Contributor) commented Dec 11, 2019

Notes/Comments:

  • Cassandra's table name has to be declared on the object-mapper entity via the @Table annotation (see the sketch after this list). I've left table_name as a field in CassandraConfig for use in the Serving module. If the user registers a Cassandra config with a different table name (not "feature_store"), this will throw an exception.
  • Downgraded Guava to 25.
    • Beam 2.16 uses Cassandra 3.4.0 (so we cannot use Cassandra 4.x, which shades Guava).
    • Cassandra 3.4.0 uses Guava 16.0 but has a compatibility check to use a different class when the Guava version is > 19.0.
    • Guava 26 (the version used in Feast) has a breaking change to the method used in that compatibility check in Cassandra's dependency, hence version 25.
  • I'm aware that there is duplicated code between core and serving. I think that should be solved in a different PR that introduces a common module.
  • A default TTL is set when registering the Cassandra store (so records are not kept forever).
  • This PR handles out-of-order ingestion to Cassandra by setting the write time to the event_timestamp, so older records will get a tombstone if newer records with the same primary key exist (also illustrated in the sketch after this list). I'm aware that this solution does not work for all serving stores, but we need it until there is a generic way to handle out-of-order arrivals.
  • I've refactored common Serving code into OnlineServingService. The Redis implementation will have complete tests, but Cassandra will not, as those cases should already be covered by Redis since both extend OnlineServingService.
  • Integration tests and unit tests are currently mixed. I think separating them should be its own PR anyway, since there are already existing integration tests in Feast and it avoids this PR introducing more changes than it already does.
    • I also explored mocking Cassandra responses to write unit tests, but I don't think it is worth the effort or that it would provide much value. Mocking Cassandra's Session and ResultSet would be a pain.
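As a rough illustration of the table-name and write-time points above, here is a minimal sketch using the DataStax Java driver 3.x object mapper and query builder. This is not the actual Feast implementation: the class names, column names, the "feast" keyspace, and the microsecond timestamp unit are all illustrative assumptions.

import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.querybuilder.QueryBuilder;
import com.datastax.driver.mapping.annotations.Column;
import com.datastax.driver.mapping.annotations.PartitionKey;
import com.datastax.driver.mapping.annotations.Table;
import java.nio.ByteBuffer;

// The object mapper needs the table name at compile time in the annotation,
// which is why CassandraConfig.table_name must match the annotated name.
@Table(name = "feature_store")
class FeatureRow {
  @PartitionKey private String key;

  @Column(name = "value")
  private ByteBuffer value;
  // getters/setters omitted for brevity
}

class FeatureWriter {
  // Pin the Cassandra write time to the event timestamp and attach a TTL, so a
  // late-arriving older record is superseded by a newer one already stored.
  void write(Session session, String key, ByteBuffer value,
             long eventTimestampMicros, int ttlSeconds) {
    Statement insert =
        QueryBuilder.insertInto("feast", "feature_store") // keyspace name is an assumption
            .value("key", key)
            .value("value", value)
            .using(QueryBuilder.timestamp(eventTimestampMicros))
            .and(QueryBuilder.ttl(ttlSeconds));
    session.execute(insert);
  }
}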

/cc @ches

@feast-ci-bot (Collaborator)

Hi @smadarasmi. Thanks for your PR.

I'm waiting for a gojek member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ches (Member) commented Dec 11, 2019

References #335.

We're tracking BEAM-6911 to upgrade Beam's CassandraIO to the latest generation of the driver, which drops the Guava dependency altogether 🎉 It also has a new DAO API which I think may open the door to enhancing CassandraIO in ways that could avoid the hardcoded @Table name annotation we've had to accept. With luck I'm hoping to find time during the holidays to tackle that ticket.

Assuming it gets done, that'd come along with a future Beam release so Feast in turn can reap the benefits later.

Using Guava 26 in Feast may run a risk of problems integrating with Beam, so the downgrade here might be prudent anyway.

This is obviously a big PR. We're happy to discuss any questions further by whatever medium is efficient.

@woop (Member) commented Dec 12, 2019

@smadarasmi @ches super impressed!

Please give us a little bit of time to process it.

We are currently working on project namespacing, and we are hoping to cut a 0.4 release for that pretty soon (two weeks). That will affect serving, since your table/document naming will include the project name (feature sets can have the same name across projects). However, this probably won't be a significant difference from what you have.

Would it make sense to do the review of this PR soonish, but only merge in after project namespacing? I'd feel a little bit more comfortable with you updating the Cassandra code than us.

Which reminds me, we can add you as codeowners of the parts of the codebase that you created. That would allow you to make changes without having to wait for our approval.

@woop (Member) commented Dec 12, 2019

> ...which drops the Guava dependency altogether. It also has a new DAO API which I think may open the door to enhancing CassandraIO...

Ok, this is great. I wonder if we can nudge anyone from the Beam team to look at this sooner.

@woop (Member) commented Dec 12, 2019

/ok-to-test

@smadarasmi (Contributor Author)

@woop Do you know why the checks fail, but when I open the link it shows all tests are passing? I don't have permission to view the build logs.

@smadarasmi (Contributor Author)

/retest

@zhilingc (Collaborator)

@smadarasmi yeah, the failures are not propagating properly; in the end-to-end tests, Core fails to start up. My guess is that you are missing a @Transactional somewhere (where you retrieve/write the JobInfo object, which uses Lobs).
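A minimal sketch of the kind of change being suggested here, assuming Spring's @Transactional and a JPA-style repository; the service and repository names are illustrative, not the actual Feast classes.

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class JobInfoService {

  private final JobInfoRepository jobInfoRepository; // hypothetical JPA repository for JobInfo

  public JobInfoService(JobInfoRepository jobInfoRepository) {
    this.jobInfoRepository = jobInfoRepository;
  }

  // Lob columns can only be read/written inside a transaction, so the method
  // that loads or saves JobInfo needs to run transactionally.
  @Transactional
  public JobInfo getJobInfo(String jobId) {
    return jobInfoRepository
        .findById(jobId)
        .orElseThrow(() -> new IllegalArgumentException("Unknown job: " + jobId));
  }
}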

@woop (Member) commented Dec 17, 2019

I have created a v0.3-branch branch from which we will continue to cut patches. We will officially stop doing that from master.

@ches, since you indicated that you will be staying on 0.3 for a while, can I suggest that we target this Cassandra release to this v0.3 branch? The rest of the folks will continue on master for async job management and project namespacing. I am updating the merge target. Please let me know if you have concerns.

If time permits, we can try to get Cassandra updated to changes for 0.4, but I don't really want to hold up 0.4 for this yet. I think we can always bring it in later as a patch (not ideal) or in 0.5.

By the way, @voonhous will be spiking this PR to see if it's functional and get it merged in.

@woop changed the base branch from master to v0.3-branch on December 17, 2019 01:37
@smadarasmi (Contributor Author)

> @smadarasmi yeah, the failures are not propagating properly; in the end-to-end tests, Core fails to start up. My guess is that you are missing a @Transactional somewhere (where you retrieve/write the JobInfo object, which uses Lobs).

@zhilingc Seems like they are passing now; let me know if you still think there are issues that need to be fixed.

@woop Yes, I think we can target this for 0.3 for now. Updating it for 0.4 shouldn't be much work at all: since the Cassandra key currently consists of the concatenated feature set name (and version) plus the entities, I think we just need to add the project namespace to the key (see the sketch below).
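A minimal sketch of that key change; the delimiters, helper name, and entity encoding are illustrative assumptions, not the actual Feast implementation.

import java.util.Map;
import java.util.stream.Collectors;

class CassandraKeys {
  // Prepend the project namespace to the existing featureSetName:version + entities key.
  static String buildKey(
      String project, String featureSetName, int version, Map<String, String> entities) {
    String entityPart =
        entities.entrySet().stream()
            .sorted(Map.Entry.comparingByKey())
            .map(e -> e.getKey() + "=" + e.getValue())
            .collect(Collectors.joining("|"));
    return project + "/" + featureSetName + ":" + version + ":" + entityPart;
  }
}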

for (FeatureSetSpec spec : specs.values()) {
  String featureSetName = spec.getName() + ":" + spec.getVersion();
  if (spec.getMaxAge() != null && spec.getMaxAge().getSeconds() > 0) {
    // Use the feature set's max age (in seconds) as the TTL for this feature set.
    maxAges.put(featureSetName, Math.toIntExact(spec.getMaxAge().getSeconds()));
  }
}
Collaborator:

Just one thing: even though it's a field in the feature set, our intention for maxAge is to be a query-time concept (that can change depending on the user's use case) rather than an ingestion-time one, so I'm not entirely sure that using it as the TTL is the best choice.

Contributor Author:

It makes sense that maxAge is a query-time concept, where each team using the same features may have different expiration values.

How would you propose we do this? Redis currently sets TTL to null (no TTL). Do you have any thoughts or work in progress regarding this?

Member:

> It makes sense that maxAge is a query-time concept, where each team using the same features may have different expiration values.
>
> How would you propose we do this? Redis currently sets TTL to null (no TTL). Do you have any thoughts or work in progress regarding this?

Seems like there are two options:

  1. Don't expire (this will accumulate bad state over time)
  2. Expire, but force users to reingest

Is it possible to leave this up to whoever sets up the infra, or does it have to be defined as part of schemas?

Collaborator:

We configure our Redis deployments as LRU caches in lieu of a TTL, but afaik there isn't such an option for Cassandra :/ There isn't really an alternative here, I guess. It shouldn't be a problem as long as it's documented that the feature set's max age is used for the TTL when using this store.
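As an aside on the infra-level question above: Cassandra does support a table-level default TTL that an operator could set independently of feature set schemas, and per-write TTLs still take precedence over it. A minimal sketch, assuming the keyspace/table names discussed in this PR.

import com.datastax.driver.core.Session;

class DefaultTtlSetup {
  // Table-level default TTL in seconds; rows written without an explicit
  // USING TTL inherit this value, while per-write TTLs override it.
  static void applyDefaultTtl(Session session, int ttlSeconds) {
    session.execute(
        "ALTER TABLE feast.feature_store WITH default_time_to_live = " + ttlSeconds);
  }
}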

Contributor Author:

I'll think about where to add the documentation for this.

I'm not sure it's a good idea not to have the TTL set up by infra, though. Some features, such as embeddings, may need to go without expiration for a long time.

Contributor Author:

I've added some documentation in Store.proto. I think it makes sense to put it there, as users of the Cassandra store would check the proto file to see the required and supported configurations anyway.
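To make the agreed TTL behavior concrete, a minimal sketch of the fallback: use the feature set's max age when present, otherwise the default expiration configured on the Cassandra store. The helper name and the FeatureSetSpec import path are assumptions, not the actual Feast code.

import feast.core.FeatureSetProto.FeatureSetSpec; // assumed import path for the generated proto

class TtlResolver {
  // Prefer the feature set's max age (seconds) as the row TTL; fall back to the
  // default expiration configured on the Cassandra store when max age is absent.
  static int resolveTtlSeconds(FeatureSetSpec spec, int defaultTtlSeconds) {
    if (spec.getMaxAge() != null && spec.getMaxAge().getSeconds() > 0) {
      return Math.toIntExact(spec.getMaxAge().getSeconds());
    }
    return defaultTtlSeconds;
  }
}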

@smadarasmi (Contributor Author)

/retest

smadarasmi added 4 commits January 3, 2020 17:35
  * Downgraded Guava to 25
    * Beam 2.16 uses Cassandra 3.4.0 (so we cannot use Cassandra 4.x, which shades Guava)
    * Cassandra 3.4.0 uses Guava 16.0 but has a compatibility check to use a different class when the Guava version is > 19.0.
    * Guava 26 (the version previously used) has a breaking change to the method used in that compatibility check in Cassandra's dependency, hence version 25
  * Using Cassandra's internal 'writetime' field to handle out-of-order arrivals. When older records are ingested for a primary key that already exists in Cassandra, they are set as tombstones and ignored on retrieval.
    * Aware that this way of handling out-of-order arrivals is specific to Cassandra, but until we have a general way to handle them we need to do it this way
  * Cassandra's object mapper requires stating the table's name in the @Table annotation
    * table_name is still part of CassandraConfig for use in the serving module
    * if the user registers a CassandraConfig with a table name other than "feature_store", this will throw an exception
  * Abstracted OnlineServingService as a common implementation for online serving stores
  * Complete tests remain in RedisServingServiceTest, while Cassandra tests only cover basic writes and some implementation details specific to Cassandra
@davidheryanto (Collaborator)

/lgtm

@feast-ci-bot (Collaborator)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: smadarasmi, woop

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@feast-ci-bot merged commit 505bdc6 into feast-dev:v0.3-branch on Jan 8, 2020
Wirick pushed a commit to postmates/feast that referenced this pull request Jan 24, 2020
* create cassandra store for registration and ingestion

  * Downgraded Guava to 25
    * Beam 2.16 uses Cassandra 3.4.0 (so we cannot use Cassandra 4.x, which shades Guava)
    * Cassandra 3.4.0 uses Guava 16.0 but has a compatibility check to use a different class when the Guava version is > 19.0.
    * Guava 26 (the version previously used) has a breaking change to the method used in that compatibility check in Cassandra's dependency, hence version 25
  * Using Cassandra's internal 'writetime' field to handle out-of-order arrivals. When older records are ingested for a primary key that already exists in Cassandra, they are set as tombstones and ignored on retrieval.
    * Aware that this way of handling out-of-order arrivals is specific to Cassandra, but until we have a general way to handle them we need to do it this way
  * Cassandra's object mapper requires stating the table's name in the @Table annotation
    * table_name is still part of CassandraConfig for use in the serving module
    * if the user registers a CassandraConfig with a table name other than "feature_store", this will throw an exception

* add cassandra serving service

  * Abstracted OnlineServingService as a common implementation for online serving stores
  * Complete tests remain in RedisServingServiceTest, while Cassandra tests only cover basic writes and some implementation details specific to Cassandra

* update documentation to reflect current API and add cassandra store to docs

* add default expiration to cassandra config for when featureset does not have max age

* docs update, spotless check, and bug fix on cassandra schema
Wirick pushed a commit to postmates/feast that referenced this pull request Mar 16, 2020
Wirick added a commit to postmates/feast that referenced this pull request Mar 16, 2020