Constructing DataGrids

For the following examples, dg is assumed to be a Kangas DataGrid.

import kangas as kg

dg = kg.DataGrid(name="A Meaningful Name", ...)

Make sure you call dg.save() when you are done constructing your DataGrid. Also make sure you give the DataGrid a name; otherwise, it will be saved in your temporary directory.

For more details, see DataGrid()

Overview

You can think of a Kangas DataGrid as a list of lists, or a list of dicts.
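
The two row shapes in that analogy can be pictured with plain Python. This is only an illustration of the data layout, not Kangas code; the column names and values are made up for the example.

```python
# Plain-Python picture of the two row shapes; no Kangas calls are made here.
columns = ["column a", "column b", "column c"]  # hypothetical column names

# Position-oriented rows: each value's column is given by its position.
rows_as_lists = [
    [1, "cat", 0.91],
    [2, "dog", 0.47],
]

# Keyword-oriented rows: each value is labeled with its column name.
rows_as_dicts = [dict(zip(columns, row)) for row in rows_as_lists]

# Going back to positional rows only needs the column order.
round_trip = [[row[name] for name in columns] for row in rows_as_dicts]
```

Both shapes carry the same information; the dict form simply makes the column of each value explicit.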

Note that adding rows and changing a DataGrid is quick and easy while it is in memory. After a DataGrid has been saved, however, altering it is much more expensive. In some circumstances, it may be easier to re-create the DataGrid from scratch than to add rows after it has been saved.

Appending Data

For in-memory DataGrids, you can easily add rows via the DataGrid.append() or DataGrid.extend() methods.

You can append a row using a list of position-oriented data:

dg.append([value1, value2, value3, ...])

Or you can append a row using a dictionary of keyword-oriented data, where each keyword is a column name:

dg.append({"column a": value_a, "column b": value_b, "column c": value_c, ...})

Likewise, you can extend a DataGrid with multiple rows at once:

dg.extend([
    [value1_1, value1_2, value1_3, ...],
    [value2_1, value2_2, value2_3, ...],
])

Or you can extend with many rows using a list of dictionaries of keyword-oriented data, where each keyword is a column name:

dg.extend([
    {"column a": value1_a, "column b": value1_b, "column c": value1_c, ...},
    {"column a": value2_a, "column b": value2_b, "column c": value2_c, ...},
])

Note that for DataGrids on disk (after DataGrid.save()), you are restricted to using DataGrid.extend(), because adding rows to a saved DataGrid is such an expensive procedure.
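
Since each change to a saved DataGrid is expensive, one possible approach is to collect rows and hand them to DataGrid.extend() in fewer, larger calls. The sketch below is a hypothetical helper, not part of Kangas: the name extend_in_batches and the batch_size parameter are assumptions, and it relies only on dg having an extend() method. Whether batching actually reduces the cost for your DataGrid is something to measure.

```python
from itertools import islice


def extend_in_batches(dg, rows, batch_size=1000):
    """Pass rows to dg.extend() in chunks of at most batch_size rows.

    Hypothetical helper: only assumes dg has an extend() method.
    Returns the number of extend() calls that were made.
    """
    iterator = iter(rows)
    calls = 0
    while True:
        # Pull the next chunk of rows from the iterator.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        dg.extend(batch)
        calls += 1
    return calls
```

For example, extend_in_batches(dg, row_generator(), batch_size=500) would consume a hypothetical row_generator() lazily, 500 rows at a time.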

Using Images

You can store images in a DataGrid by using the Image() class or a Python Imaging Library (PIL) image.

The Image() class can take a variety of formats, including:

a filename (including zip and tgz formats)
a URL (including zip and tgz formats)
a list or numpy array
a TensorFlow tensor
a PyTorch tensor
a PIL Image
a file-like object

In addition, you can provide additional arguments, including:

format
scale
shape
colormap
minmax
channels

See Image() for more information.

Image Metadata

When you create an image, you may also pass in a dictionary of items as metadata. Metadata can be nested, provided that the values are ultimately JSON-encodable (numbers, strings, None, etc.).

The following key has a special meaning when you supply it:

name - used to title the image in the UI

The following are automatically logged as metadata:

filename - fallback title for the UI, if a name is not given
assetId - fallback title for the UI, if a filename is not given

Annotations for an image (bounding boxes, regions, etc.) are also stored in the image's metadata. For more information, see:

An image's metadata can be viewed in the image's expanded dialog in the UI.
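
To check ahead of time that a nested metadata dictionary is JSON-encodable, one option is to try encoding it with Python's standard json module. This is a minimal sketch; the helper name is_json_encodable and the sample dictionaries are made up, and Kangas may apply its own, different validation.

```python
import json


def is_json_encodable(metadata):
    """Return True if json.dumps() accepts the metadata, False otherwise."""
    try:
        json.dumps(metadata)
    except (TypeError, ValueError):
        # TypeError: unsupported value type; ValueError: e.g. circular reference.
        return False
    return True


# Nested metadata whose leaves are strings, numbers, and None.
good = {"name": "sample 1", "source": {"split": "train", "index": 7, "label": None}}

# A set is not JSON-encodable, so this dictionary fails the check.
bad = {"name": "sample 2", "tags": {"a", "b"}}
```

Converting the offending value to a list (for example, sorted(bad["tags"])) makes the second dictionary pass the same check.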

Appending columns

Before or after saving your DataGrid, you may add additional columns:

dg.append_column("column name", [row1_value, row2_value, row3_value, ...])

or:

dg.append_columns(
    ["New Column 1", "New Column 2"],
    [
        ["row1 col1",
         "row2 col1"],
        ["row1 col2",
         "row2 col2"],
    ],
)

or:

dg.append_columns(
    {
     "column 1 name": [row1_col1, row2_col1, row3_col1, ...],
     "column 2 name": [row1_col2, row2_col2, row3_col2, ...],
    }
)

Note that the number of row values must exactly match the number of rows in the DataGrid.
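
Because the lengths must match exactly, it can help to check the new columns before appending them. The helper below is a hypothetical sketch for the dictionary form; check_column_lengths is not a Kangas function, and row_count is whatever number of rows you know the DataGrid to contain.

```python
def check_column_lengths(columns, row_count):
    """Raise ValueError if any column's value list is not row_count long.

    columns: dict mapping column name -> list of row values
    row_count: number of rows currently in the DataGrid
    """
    mismatched = {
        name: len(values)
        for name, values in columns.items()
        if len(values) != row_count
    }
    if mismatched:
        raise ValueError(
            f"expected {row_count} values per column, got {mismatched}"
        )


new_columns = {
    "column 1 name": [10, 20, 30],
    "column 2 name": ["a", "b", "c"],
}
check_column_lengths(new_columns, row_count=3)  # passes silently
```

If the check passes, the same dictionary can then be handed to dg.append_columns().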

Removing rows

After reading or constructing a DataGrid, you can remove a row using the DataGrid.pop(INDEX) method:

dg.pop(0)

The INDEX is a zero-based integer.
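
When removing several rows by index, the order of removal matters if later rows shift down after each pop. The sketch below assumes that DataGrid.pop() behaves like list.pop() in this respect, which is an assumption rather than documented behavior; pop_many is a hypothetical helper that only relies on dg having a pop(index) method.

```python
def pop_many(dg, indices):
    """Remove the rows at the given zero-based indices.

    Pops from the highest index down so that earlier removals do not
    change the position of rows still waiting to be removed.
    Returns the number of rows removed.
    """
    unique_descending = sorted(set(indices), reverse=True)
    for index in unique_descending:
        dg.pop(index)
    return len(unique_descending)


# A plain list stands in for a DataGrid here, since it also has pop(index).
rows = ["row 0", "row 1", "row 2", "row 3"]
removed_count = pop_many(rows, [0, 2])
```

Popping in ascending order instead would remove "row 0" and then "row 3", because "row 2" would have moved to index 1 after the first pop.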

For additional information see:

Visualizing updates

As you edit your data, Kangas will automatically update any running visualizations. The Kangas UI polls the Kangas server every few seconds to monitor for updates to the underlying DataGrid, meaning you can edit your DataGrid and visualize your changes in real time without relaunching the Kangas server.

Full Example

Check out the MNIST Classification Example to see a full code example!

Kangas DataGrid is completely open source; sponsored by Comet ML
