Merge branch 'main' into wkx/risingwave-backend
KeXiangWang authored Jan 25, 2024
2 parents 3781dc8 + 5de08c7 commit cfa40e8
Showing 10 changed files with 1,895 additions and 239 deletions.
2 changes: 1 addition & 1 deletion compose.yaml
@@ -379,7 +379,7 @@ services:
- exasol:/data

flink-jobmanager:
- image: flink:1.18.0
+ image: flink:1.18.1
environment:
FLINK_PROPERTIES: |
jobmanager.rpc.address: flink-jobmanager
4 changes: 2 additions & 2 deletions docker/flink/Dockerfile
@@ -1,3 +1,3 @@
- FROM flink:1.18.0
+ FROM flink:1.18.1
# ibis-flink requires PyFlink dependency
- RUN wget -nv -P $FLINK_HOME/lib/ https://repo1.maven.org/maven2/org/apache/flink/flink-python/1.18.0/flink-python-1.18.0.jar
+ RUN wget -nv -P $FLINK_HOME/lib/ https://repo1.maven.org/maven2/org/apache/flink/flink-python/1.18.1/flink-python-1.18.1.jar
Binary file added docs/posts/ibis-analytics/dag.png
2,109 changes: 1,883 additions & 226 deletions docs/posts/ibis-analytics/index.qmd

Large diffs are not rendered by default.

Binary file modified docs/posts/ibis-analytics/motherduck.png
Binary file modified docs/posts/ibis-analytics/thumbnail.png
Binary file modified docs/posts/ibis-analytics/top.png
13 changes: 4 additions & 9 deletions docs/posts/ibis-duckdb-geospatial/index.qmd
@@ -117,7 +117,7 @@ boroughs
```

```{python}
- boroughs.filter(boroughs.geom.intersects(broad_station.select(broad_station.geom).to_array()))
+ boroughs.filter(_.geom.intersects(broad_station.geom))
```

### `d_within` (ST_DWithin)
@@ -133,15 +133,10 @@ streets
Using the deferred API, we can check which streets are within `d=10` meters of distance.

```{python}
- sts_near_broad = streets.filter(_.geom.d_within(broad_station.select(_.geom).to_array(), 10))
+ sts_near_broad = streets.filter(_.geom.d_within(broad_station.geom, 10))
sts_near_broad
```

- ::: {.callout-note}
- In the previous query, `streets` and `broad_station` are different tables. We use [`to_array()`](../../reference/expression-tables.qmd#ibis.expr.types.relations.Table.to_array) to generate a
- scalar subquery from a table with a single column (whose shape is scalar).
- :::
-
To visualize the findings, we will convert the tables to GeoPandas DataFrames.

```{python}
@@ -201,7 +196,7 @@ To find if there were any homicides in that area, we can find where the polygon
200 meters buffer to our "Broad St" station point intersects with the geometry column in our homicides table.

```{python}
- h_near_broad = homicides.filter(_.geom.intersects(broad_station.select(_.geom.buffer(200)).to_array()))
+ h_near_broad = homicides.filter(_.geom.intersects(broad_station.geom.buffer(200)))
h_near_broad
```

@@ -210,7 +205,7 @@ data we can't tell the street near which it happened. However, we can check if t
distance of a street.

```{python}
- h_street = streets.filter(_.geom.d_within(h_near_broad.select(_.geom).to_array(), 2))
+ h_street = streets.filter(_.geom.d_within(h_near_broad.geom, 2))
h_street
```

4 changes: 3 additions & 1 deletion ibis/formats/pandas.py
@@ -33,7 +33,9 @@ def to_ibis(cls, typ, nullable=True):
elif pdt.is_datetime64_dtype(typ):
return dt.Timestamp(nullable=nullable)
elif isinstance(typ, pdt.CategoricalDtype):
- return dt.String(nullable=nullable)
+ if typ.categories is None or pdt.is_string_dtype(typ.categories):
+     return dt.String(nullable=nullable)
+ return cls.to_ibis(typ.categories.dtype, nullable=nullable)
elif pdt.is_extension_array_dtype(typ):
if _has_arrow_dtype and isinstance(typ, pd.ArrowDtype):
return PyArrowType.to_ibis(typ.pyarrow_dtype, nullable=nullable)
2 changes: 2 additions & 0 deletions ibis/formats/tests/test_pandas.py
@@ -103,6 +103,8 @@ def test_dtype_from_pandas_arrow_list_dtype():
dt.Timestamp("US/Eastern"),
),
(pd.CategoricalDtype(), dt.String()),
+ (pd.CategoricalDtype(["a", "b", "c"]), dt.String()),
+ (pd.CategoricalDtype(np.array([1, 2, 3], dtype="int64")), dt.int64),
(pd.Series([], dtype="string").dtype, dt.String()),
],
ids=str,
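
Reviewer note: the categorical branch in `ibis/formats/pandas.py` now dispatches on the dtype of the categories instead of always returning a string type. The sketch below reproduces only that decision using pandas alone, so it can be run without ibis. The helper name `categorical_to_type_name` is hypothetical, it returns a plain type name rather than an ibis datatype, and it assumes `pdt` in the diff refers to `pandas.api.types`.

```python
import numpy as np
import pandas as pd
import pandas.api.types as pdt  # assumption: this is what `pdt` aliases in the diff


def categorical_to_type_name(typ: pd.CategoricalDtype) -> str:
    """Hypothetical stand-in for the new categorical branch of `to_ibis`.

    Returns a type name instead of an ibis datatype so the example
    depends only on pandas and NumPy.
    """
    # An unparameterized CategoricalDtype has no categories; string
    # categories also keep the previous behavior (a string type).
    if typ.categories is None or pdt.is_string_dtype(typ.categories):
        return "string"
    # Otherwise the result follows the dtype of the categories themselves.
    return str(typ.categories.dtype)


# The three cases covered by the parametrized test in this commit.
cases = [
    pd.CategoricalDtype(),
    pd.CategoricalDtype(["a", "b", "c"]),
    pd.CategoricalDtype(np.array([1, 2, 3], dtype="int64")),
]
for case in cases:
    print(categorical_to_type_name(case))
```

The three cases mirror the parametrized entries in `ibis/formats/tests/test_pandas.py`: the first two resolve to a string type and the integer-backed categorical resolves to `int64`.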