Add support for new filtering ops #581

Merged: 47 commits, Jun 17, 2024

Collaborator

@czaloom czaloom commented May 14, 2024

Feature

This PR implements a new filtering schema for Valor and connects the refactored query generator (#597) to the API.

This includes support for logical operations (AND, OR, NOT) and the ability to apply filters per table.

API Endpoints

Filters are no longer passed as query parameters in API calls.

GET - returns all instances of a table
/datasets
/models
/data
/labels

POST - posts a filter and returns the matching instances of a table
/datasets/filter
/models/filter
/data/filter
/labels/filter
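
To illustrate the difference between the two endpoint styles, the sketch below builds (but does not send) one request of each kind using only the standard library. The host name and the JSON payload shape are assumptions made for illustration; the real filter schema is defined by the API, and sending the filter in the request body is inferred from "post a filter" rather than stated in the PR.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # hypothetical host, not taken from the PR


def build_get_all(table: str) -> Request:
    """GET /<table>: fetch every instance of a table, with no filter involved."""
    return Request(f"{BASE_URL}/{table}", method="GET")


def build_post_filter(table: str, filter_payload: dict) -> Request:
    """POST /<table>/filter: the filter is carried in the JSON body, not the query string."""
    body = json.dumps(filter_payload).encode("utf-8")
    return Request(
        f"{BASE_URL}/{table}/filter",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload shape, for illustration only.
example_filter = {"labels": {"op": "eq", "lhs": "label.key", "rhs": "class"}}

get_req = build_get_all("datasets")
post_req = build_post_filter("labels", example_filter)
```

Because the filter lives in the body, the URL stays the same regardless of how deeply nested the filter is.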

Examples

# No swimmer, small boats
no_swimmer_small_boats = client.get_datums(
    Filter(
        datums=And(
            Label.key == "class",
            Label.value != "swimmer",
        ),
        annotations=And(
            Label.key == "class",
            Label.value == "boat",
            Annotation.bounding_box.area < 50,
        ),
    )
)
assert len(no_swimmer_small_boats) == 1
assert no_swimmer_small_boats[0].uid == "uid1"

# No swimmer, large boats
no_swimmer_large_boats = client.get_datums(
    Filter(
        datums=And(
            Label.key == "class",
            Label.value != "swimmer",
        ),
        annotations=And(
            Label.key == "class",
            Label.value == "boat",
            Annotation.bounding_box.area > 50,
        ),
    )
)
assert len(no_swimmer_large_boats) == 1
assert no_swimmer_large_boats[0].uid == "uid2"

# Swimmer with small boat
swimmer_with_small_boats = client.get_datums(
    Filter(
        datums=And(
            Label.key == "class",
            Label.value == "swimmer",
        ),
        annotations=And(
            Label.key == "class",
            Label.value == "boat",
            Annotation.bounding_box.area < 50,
        ),
    )
)
assert len(swimmer_with_small_boats) == 1
assert swimmer_with_small_boats[0].uid == "uid3"

# Swimmer with large boat
swimmer_with_large_boats = client.get_datums(
    Filter(
        datums=And(
            Label.key == "class",
            Label.value == "swimmer",
        ),
        annotations=And(
            Label.key == "class",
            Label.value == "boat",
            Annotation.bounding_box.area > 50,
        ),
    )
)
assert len(swimmer_with_large_boats) == 1
assert swimmer_with_large_boats[0].uid == "uid4"
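
To make the four cases easier to follow, here is a toy in-memory model of how a datum-level condition and an annotation-level condition could combine. It does not use Valor or its query generator. The `Datum` and `Box` classes, the sample areas, and the matching rule (a datum is returned when it passes the datum-level test and owns at least one annotation passing the annotation-level test) are all assumptions, chosen so that the results line up with the asserts above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Box:
    label: str   # value of the "class" label on this annotation
    area: float  # bounding-box area (arbitrary units)


@dataclass
class Datum:
    uid: str
    boxes: List[Box] = field(default_factory=list)


# Hypothetical data: every datum has one boat; uid3 and uid4 also have a swimmer.
DATA = [
    Datum("uid1", [Box("boat", 25.0)]),
    Datum("uid2", [Box("boat", 100.0)]),
    Datum("uid3", [Box("boat", 25.0), Box("swimmer", 5.0)]),
    Datum("uid4", [Box("boat", 100.0), Box("swimmer", 5.0)]),
]


def has_swimmer(datum: Datum) -> bool:
    return any(box.label == "swimmer" for box in datum.boxes)


def small_boat(box: Box) -> bool:
    return box.label == "boat" and box.area < 50


def large_boat(box: Box) -> bool:
    return box.label == "boat" and box.area > 50


def get_datums(
    datum_pred: Callable[[Datum], bool],
    annotation_pred: Callable[[Box], bool],
) -> List[str]:
    """Return uids of datums that pass the datum-level test and own
    at least one annotation that passes the annotation-level test."""
    return [
        d.uid
        for d in DATA
        if datum_pred(d) and any(annotation_pred(b) for b in d.boxes)
    ]


results = {
    "no_swimmer_small_boats": get_datums(lambda d: not has_swimmer(d), small_boat),
    "no_swimmer_large_boats": get_datums(lambda d: not has_swimmer(d), large_boat),
    "swimmer_with_small_boats": get_datums(has_swimmer, small_boat),
    "swimmer_with_large_boats": get_datums(has_swimmer, large_boat),
}
```

Each of the four combinations selects exactly one datum in this toy data, mirroring the one-result-per-query pattern in the examples.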

@czaloom czaloom closed this Jun 10, 2024
@czaloom czaloom force-pushed the czaloom-update-query-filtering branch from dd47208 to 79928c3 June 10, 2024 20:10
@czaloom czaloom reopened this Jun 10, 2024
@czaloom czaloom marked this pull request as ready for review June 14, 2024 04:25
@czaloom czaloom requested review from ntlind and ekorman as code owners June 14, 2024 04:25
require_bounding_box=False,
require_polygon=False,
require_raster=False,
labels=schemas.LogicalFunction(
Contributor

this is much longer and harder to read than the old version. is it necessary to write it this way? I'm sure the answer is "yes", and I know it doesn't matter much since this will all live in the backend, but...

Collaborator Author

I couldn't think of a more intuitive name. I could split it into And, Or, and Not, like the client?

Contributor

I just meant that we're going from one class Filter with a few easy arguments to 5(?) classes with more complicated arguments and doubling the lines of code in the process. I don't have any real suggestions here as I'm not deep enough in the code to understand if we need to be this specific with all of these different classes

Collaborator Author

"5(?) classes"

Which ones are you referring to?

Filters are constructed from two classes:

  • Condition (e.g. ==, !=, >, etc.)
  • LogicalFunction (e.g. AND, OR, NOT)
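
A minimal sketch of how a filter tree might be composed from just these two kinds of node. The field names (`lhs`, `op`, `rhs`, `args`) and the `evaluate` helper are hypothetical and are not taken from `valor_api.schemas`; the point is only that leaf comparisons plus one recursive logical node are enough to express nested AND/OR/NOT.

```python
import operator
from dataclasses import dataclass
from typing import Any, Dict, Tuple, Union

# Comparison symbols mapped to their standard-library implementations.
_COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


@dataclass(frozen=True)
class Condition:
    """Leaf node: compare one field of a row against a constant."""
    lhs: str  # hypothetical: name of the field to read from the row
    op: str   # one of the keys in _COMPARATORS
    rhs: Any  # constant to compare against


@dataclass(frozen=True)
class LogicalFunction:
    """Inner node: combine child nodes with "and", "or", or "not"."""
    op: str
    args: Tuple[Union["Condition", "LogicalFunction"], ...]


Node = Union[Condition, LogicalFunction]


def evaluate(node: Node, row: Dict[str, Any]) -> bool:
    """Recursively evaluate a filter tree against a single row."""
    if isinstance(node, Condition):
        return _COMPARATORS[node.op](row[node.lhs], node.rhs)
    if node.op == "and":
        return all(evaluate(arg, row) for arg in node.args)
    if node.op == "or":
        return any(evaluate(arg, row) for arg in node.args)
    if node.op == "not":
        if len(node.args) != 1:
            raise ValueError("'not' takes exactly one argument")
        return not evaluate(node.args[0], row)
    raise ValueError(f"unknown logical op: {node.op!r}")


# "area < 50 AND NOT (value == 'swimmer')"
small_non_swimmer = LogicalFunction(
    "and",
    (
        Condition("area", "<", 50),
        LogicalFunction("not", (Condition("value", "==", "swimmer"),)),
    ),
)

matches_small_boat = evaluate(small_non_swimmer, {"value": "boat", "area": 25})
matches_small_swimmer = evaluate(small_non_swimmer, {"value": "swimmer", "area": 25})
matches_large_boat = evaluate(small_non_swimmer, {"value": "boat", "area": 100})
```

Client-side helpers such as And, Or, and Not could then be thin wrappers that fix the `op` field, which is one way to read the suggestion made earlier in this thread.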

api/valor_api/schemas/__init__.py (outdated, resolved)
api/valor_api/main.py (resolved)
client/valor/schemas/symbolic/operators.py (outdated, resolved)
client/valor/schemas/symbolic/operators.py (resolved)
@@ -313,7 +312,7 @@ def test_generate_segmentation_data(

     for image in dataset.get_datums():
         uid = image.uid
-        sample_gt = dataset.get_groundtruth(uid)
+        sample_gt = dataset.get_groundtruth(uid)  # type: ignore - issue #604
Contributor

what's causing there to be a type issue on this line with these changes?

Collaborator Author

@czaloom czaloom Jun 14, 2024

Removed type ignore. There is red highlighting but it passes pre-commit.

Some of these do fail pre-commit. This is an issue documented in #604.

Contributor

@ntlind ntlind Jun 14, 2024

I thought this was to be the last filtering PR, so I was expecting issues like this one to be fixed. I'm ok with pushing this into another PR if you think that's best.

Collaborator Author

It is the last one to work on filter structure. These type ignores are related to pyright issues that only affect JSON generation in the client.

integration_tests/client/metrics/test_segmentation.py (outdated, resolved)
ekorman previously approved these changes Jun 17, 2024
@ekorman ekorman dismissed their stale review June 17, 2024 17:03

too soon

@czaloom czaloom merged commit 02a8260 into main Jun 17, 2024
11 checks passed
@czaloom czaloom deleted the czaloom-update-query-filtering branch June 17, 2024 22:10
Development

Successfully merging this pull request may close these issues.

Update Filter Schemas + Query Generation
3 participants