feat(rust,python,cli): add SQL engine support for LIKE and ILIKE pattern matching #13522

Merged 2 commits on Jan 8, 2024

alexander-beedie
Collaborator

@alexander-beedie alexander-beedie commented Jan 8, 2024

Closes #12399.
Closes #9104.

A long-requested addition, implementing LIKE and ILIKE for the SQL interface (though this first cut does not yet support the optional follow-on ESCAPE clause).

Rewrites the given pattern as a regular expression, escaping regex special characters and replacing the SQL wildcards % and _ with .* and . respectively, before dispatching the translated pattern to str().contains. There is a simple fast path for patterns that contain no wildcard characters (and are not case-insensitive).
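
The rewrite described above can be sketched in plain Python. This is an illustrative approximation only, not the Rust code from this PR: the helper names `like_to_regex` and `sql_like` are hypothetical, Python's `re` module stands in for the regex engine, `re.fullmatch` is assumed as the way the translated pattern is tied to the whole string, and treating the fast path as a plain equality check is also an assumption.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into a regex body (hypothetical helper)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # % matches any run of characters, including none
        elif ch == "_":
            parts.append(".")  # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is matched literally
    return "".join(parts)


def sql_like(value: str, pattern: str, case_insensitive: bool = False) -> bool:
    """Evaluate `value LIKE pattern` (or ILIKE when case_insensitive=True)."""
    # Assumed fast path: no wildcards and case-sensitive, so compare directly.
    if not case_insensitive and "%" not in pattern and "_" not in pattern:
        return value == pattern
    flags = re.DOTALL | (re.IGNORECASE if case_insensitive else 0)
    # fullmatch requires the whole string to match, as LIKE does.
    return re.fullmatch(like_to_regex(pattern), value, flags) is not None


print(like_to_regex("%D%i"))
print(sql_like("Abu Dhabi", "%D%i"))
print(sql_like("Ras Al Khaimah", "%al%", case_insensitive=True))
```

Escaping each literal character individually keeps regex metacharacters in the pattern, such as `.` or `(`, from being interpreted as regex syntax. The ESCAPE clause is not handled, matching the scope stated above.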

Examples

import polars as pl

df = pl.DataFrame(
    data = [
        ("Dubai", 3564931),
        ("Abu Dhabi", 1807000),
        ("Sharjah", 1405000),
        ("Al Ain", 846747),
        ("Ajman", 490035),
        ("Ras Al Khaimah", 191753),
        ("Fujairah", 118933),
        ("Umm Al Quwain", 59098), 
    ],
    schema = {"City": pl.String, "Population": pl.Int32},
)

with pl.SQLContext(cities=df, eager_execution=True) as ctx:
    print( ctx.execute("SELECT * FROM cities WHERE City LIKE '%D%i'") )
    # ┌───────────┬────────────┐
    # │ City      ┆ Population │
    # │ ---       ┆ ---        │
    # │ str       ┆ i32        │
    # ╞═══════════╪════════════╡
    # │ Dubai     ┆ 3564931    │
    # │ Abu Dhabi ┆ 1807000    │
    # └───────────┴────────────┘
    print( ctx.execute("SELECT * FROM cities WHERE City ILIKE '%al%'") )
    # ┌────────────────┬────────────┐
    # │ City           ┆ Population │
    # │ ---            ┆ ---        │
    # │ str            ┆ i32        │
    # ╞════════════════╪════════════╡
    # │ Al Ain         ┆ 846747     │
    # │ Ras Al Khaimah ┆ 191753     │
    # │ Umm Al Quwain  ┆ 59098      │
    # └────────────────┴────────────┘
    print( ctx.execute("SELECT * FROM cities WHERE City LIKE '%a__a%'") )
    # ┌────────────────┬────────────┐
    # │ City           ┆ Population │
    # │ ---            ┆ ---        │
    # │ str            ┆ i32        │
    # ╞════════════════╪════════════╡
    # │ Sharjah        ┆ 1405000    │
    # │ Ras Al Khaimah ┆ 191753     │
    # │ Fujairah       ┆ 118933     │
    # └────────────────┴────────────┘

@github-actions bot added the enhancement, python, and rust labels Jan 8, 2024
@alexander-beedie added the A-sql label Jan 8, 2024
Member

@ritchie46 ritchie46 left a comment

Yes, this one is a must!

@ritchie46 ritchie46 merged commit 7ff3bf7 into pola-rs:main Jan 8, 2024
32 checks passed
@alexander-beedie alexander-beedie deleted the sql-like-constraints branch January 8, 2024 12:00