feat: BigTable online store #3140

Merged
merged 24 commits into from
Oct 5, 2022

Commits on Sep 30, 2022

  1. Initial implementation of BigTable online store.

    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    b349ad6
  2. Attempt to run bigtable integration tests.

    Currently focusing on just getting the tests running locally. I've only
    built the python3.8 requirements.
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    f9f45bb
  3. Got the BigTable tests running in local containers

    Signed-off-by: Abhin Chhabra <chhabra.abhin@gmail.com>
    chhabrakadabra committed Sep 30, 2022
    2a1a863
  4. Set serialization version when computing entity ID

    Signed-off-by: Abhin Chhabra <chhabra.abhin@gmail.com>
    chhabrakadabra committed Sep 30, 2022
    96814d6
  5. Switch to the recommended layout in bigtable.

    This was recommended by the BigTable dev team. Details of this layout
    will be added to the documentation in a future commit.
    
    Signed-off-by: Abhin Chhabra <chhabra.abhin@gmail.com>
    chhabrakadabra committed Sep 30, 2022
    6e6233f
  6. Minor bugfixes.

    - If a row is empty when fetching data, don't process it any further.
    - If a task in the threadpool fails, bubble up that failure.
    - If a `created_ts` is not available, use an empty string. `None` does
      not automatically serialize to bytes.
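
The second and third fixes above can be sketched in plain Python. This is a minimal illustration, not the code from this PR: `run_all` and `timestamp_to_bytes` are hypothetical names, and the sketch assumes the timestamp is held as an optional string.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


def run_all(tasks: Iterable[Callable[[], int]], max_workers: int = 4) -> List[int]:
    """Run tasks in a thread pool and bubble up the first failure."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        for future in futures:
            # Future.result() re-raises any exception raised in the worker,
            # so a failed task cannot be silently ignored.
            results.append(future.result())
    return results


def timestamp_to_bytes(created_ts: Optional[str]) -> bytes:
    """Serialize an optional timestamp, falling back to an empty string."""
    # None has no encode() method, so substitute "" before converting to bytes.
    return (created_ts or "").encode("utf-8")
```

Without the call to `result()`, an exception raised inside a worker stays stored on its future and the caller never sees it.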
    
    Signed-off-by: Abhin Chhabra <chhabra.abhin@gmail.com>
    chhabrakadabra committed Sep 30, 2022
    2a0d09a
  7. Move BigTable online store out of contrib

    As per feedback on the PR.
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    98532a5
  8. Attempt to run integration tests in CI.

    Provide the GCP project and the bigtable instance ID for the tests to
    connect to.
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    2a65fef
  9. Delete tables for entity-less feature views.

    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    de795f3
  10. Table names should be shorter than 50 characters

    This is BigTable's limit on table name length, and it's causing test failures.
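
One way to stay under such a limit is to truncate long names and append a short hash so they remain unique. The sketch below is an assumption for illustration only: `bounded_table_name`, the `project.entity-entity` naming scheme, and the 8-character digest are hypothetical and not taken from this PR.

```python
import hashlib
from typing import List

TABLE_NAME_LIMIT = 50  # names must be shorter than this, per the commit message
DIGEST_LENGTH = 8      # hypothetical choice for the uniqueness suffix


def bounded_table_name(project: str, entity_names: List[str]) -> str:
    """Build a deterministic table name shorter than TABLE_NAME_LIMIT."""
    full_name = f"{project}.{'-'.join(sorted(entity_names))}"
    if len(full_name) < TABLE_NAME_LIMIT:
        return full_name
    # Keep a readable prefix and append a digest of the full name so that
    # two long names sharing the same prefix still map to different tables.
    digest = hashlib.sha1(full_name.encode("utf-8")).hexdigest()[:DIGEST_LENGTH]
    prefix_length = TABLE_NAME_LIMIT - 1 - DIGEST_LENGTH - 1
    return f"{full_name[:prefix_length]}_{digest}"
```

Sorting the entity names keeps the result independent of the order in which they are listed.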
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    bf798e8
  11. Optimize bigtable reads.

    - Fetch all the rows in one bigtable fetch.
    - Get only the columns that are necessary (using a column regex filter).
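
As a rough sketch of the column filter, the regex can be built from the requested feature names. The helper name `column_qualifier_regex` is hypothetical, and the sketch assumes that column qualifiers are the feature names themselves. With the `google-cloud-bigtable` client, the result would typically be passed to `row_filters.ColumnQualifierRegexFilter`, and all row keys added to a single `RowSet` for one `read_rows` call.

```python
import re
from typing import Iterable


def column_qualifier_regex(feature_names: Iterable[str]) -> bytes:
    """Return a regex that matches exactly the given column qualifiers."""
    # Escape each name so characters such as "." are matched literally,
    # and sort them so the same inputs always give the same pattern.
    escaped = sorted(re.escape(name) for name in feature_names)
    return f"^({'|'.join(escaped)})$".encode("utf-8")


# Hedged usage outline (requires google-cloud-bigtable and a real table):
#   row_set = RowSet()
#   for key in row_keys:
#       row_set.add_row_key(key)
#   rows = table.read_rows(
#       row_set=row_set,
#       filter_=ColumnQualifierRegexFilter(column_qualifier_regex(names)),
#   )
```

Anchoring the pattern keeps a requested name such as `rate` from also selecting a column called `conv_rate`.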
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    eb3ab91
  12. dynamodb: switch to mock_dynamodb

    The latest rebuilding of requirements has upgraded the `moto` library
    past the `4.0.0` release, which has a couple of breaking changes.
    Specifically, the `mock_dynamodb2` decorator has been deprecated. See
    https://github.com/spulec/moto/blob/master/CHANGELOG.md#400 for more
    details.
    
    The actual PR (getmoto/moto#4919) mentions that
    it's because the `mock_dynamodb` decorator is now equivalent to the
    `mock_dynamodb2` decorator.
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    1383b7e
  13. minor: rename BigTable to Bigtable

    This matches the GCP docs.
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    6986fa9
  14. Wrote some Bigtable documentation.

    Closely mirrors the docs for the other online stores.
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    3cd76a8
  15. Bugfix: Deal with missing row keys.

    It looks like the bigtable client will just skip over non-existent row
    keys.
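
Because missing keys are skipped rather than returned as empty rows, the results have to be realigned with the request. A minimal sketch, assuming the fetched rows arrive as `(row_key, row_data)` pairs; `align_rows` is a hypothetical helper, not the function used in this PR.

```python
from typing import Any, Iterable, List, Optional, Sequence, Tuple


def align_rows(
    requested_keys: Sequence[bytes],
    fetched_rows: Iterable[Tuple[bytes, Any]],
) -> List[Optional[Any]]:
    """Return one entry per requested key, with None for keys not found."""
    # Index the rows that came back by key, then walk the request in order
    # so that skipped keys leave an explicit None in their position.
    found = dict(fetched_rows)
    return [found.get(key) for key in requested_keys]
```

Callers can then rely on the output having the same length and order as the list of requested keys.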
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    c7449cc
  16. Fix linting issues.

    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    f356312
  17. Generate requirements files.

    - As of version `1.49`, the various python packages in the [grpc
      repo](https://github.com/grpc/grpc/tree/master/src/python) require
      `protobuf>=4.21.3`. Unfortunately, this is incompatible with all
      versions of `tensorflow-metadata` (see [this
      issue](tensorflow/metadata#37)). And since
      `piptools` doesn't backtrack during dependency resolution, the
      requirement files cannot be regenerated without adding an upper limit
      on these grpc libraries directly in `setup.py`.
    - The previous attempt to upgrade usages of the `mock_dynamodb2`
      decorator to the newest version failed. Since I'm not an expert in
      dynamodb, it made sense to just cap the test tool to the version
      already being used in CI.
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    ff62c6b
  18. Don't bother materializing created timestamp.

    Had a discussion with Danny about whether it's useful to copy this
    column. He agreed that there's no value to storing this in the online
    store.
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    cce3602
  19. Remove tensorflow-metadata.

    Turns out that this dependency is not required. We removed all
    references to it in [this
    PR](feast-dev#2063), but did not remove it
    from `setup.py`. Removing it makes many of the restrictions imposed
    in previous commits unnecessary.
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Sep 30, 2022
    943ee3f

Commits on Oct 5, 2022

  1. Minor fix to Bigtable documentation.

    Feedback from Danny mentioned that Bigtable should be able to store
    multiple versions of the same key and fetch the latest at read time.
    This makes sense and means that concurrent writes should work just fine.
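
The read-side behaviour described above can be modelled in plain Python. This is only a simulation under stated assumptions: each column is represented as a list of `(timestamp, value)` cells, and `latest_values` is a hypothetical name rather than anything from the Bigtable client.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, bytes]  # (write timestamp in microseconds, stored value)


def latest_values(cells_by_column: Dict[str, List[Cell]]) -> Dict[str, bytes]:
    """Pick the most recently written value for every column."""
    # Concurrent writers simply add cells; the reader resolves the winner
    # by choosing the cell with the highest timestamp in each column.
    return {
        column: max(cells, key=lambda cell: cell[0])[1]
        for column, cells in cells_by_column.items()
        if cells
    }
```

In this model no writer has to coordinate with another, since an older write never overwrites a newer one at read time.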
    
    Signed-off-by: Abhin Chhabra <abhin.chhabra@shopify.com>
    chhabrakadabra committed Oct 5, 2022
    ab80b42
  2. update roadmap docs

    Signed-off-by: Danny Chiao <danny@tecton.ai>
    adchia committed Oct 5, 2022
    4755745
  3. Fix roadmap doc

    Signed-off-by: Danny Chiao <danny@tecton.ai>
    adchia committed Oct 5, 2022
    c0f2d8e
  4. Change link to point to roadmap page

    Signed-off-by: Danny Chiao <danny@tecton.ai>
    adchia committed Oct 5, 2022
    2d6bdac
  5. change order in roadmap

    Signed-off-by: Danny Chiao <danny@tecton.ai>
    adchia committed Oct 5, 2022
    992c318