Deephaven Python API MVP (#1094)
* feature-776, implemented the Python client with limited functionality

* added two env vars DHCE_HOST and DHCE_PORT to support running tests against remote DHCE servers, also refactored the unit tests to be more friendly to use for CI

* Deephaven Python Client MVP

* fixed setup.py/requirements.txt and removed the LICENSE file

* update README and related changes

* complete code refactoring in response to review feedback, some work left to update the docstrings, but code itself is reviewable.

fixes #961, #873, #874, #776
jmao-denver authored Aug 28, 2021
1 parent 404d5fd commit 6ef97c8
Showing 41 changed files with 12,802 additions and 0 deletions.
6 changes: 6 additions & 0 deletions .gitignore
@@ -42,8 +42,14 @@ build/pipcache/
.programfiles/
workspaces/
devopsLocal/
pyclient/venv
pyclient/dist
pyclient/*.egg-info

jenkins/failures

node_modules
.java-version
.project
.classpath
.settings
59 changes: 59 additions & 0 deletions pyclient/README.md
@@ -0,0 +1,59 @@

# Deephaven Python Client

Deephaven Python Client is a Python package created by Deephaven Data Labs. It is a client API that allows Python applications to remotely access Deephaven data servers.

## Source Directory

### From the deephaven-core repository root
``` shell
$ cd pyclient
```
## Dev environment setup
``` shell
$ pip3 install -r requirements.txt
```

## Build
``` shell
$ python3 setup.py bdist_wheel
```
## Run tests
``` shell
$ python3 -m unittest discover tests
```
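
The commit message mentions two environment variables, `DHCE_HOST` and `DHCE_PORT`, for running the tests against a remote server. The following is a minimal sketch of how a test helper might resolve them. The helper name `resolve_server_address` and the fallback values `localhost` and `10000` are assumptions for illustration and are not taken from this commit.

```python
import os

# Assumed fallbacks: the README only says "default configuration", so
# localhost:10000 is a guess, not a value confirmed by this commit.
DEFAULT_HOST = "localhost"
DEFAULT_PORT = 10000


def resolve_server_address(environ=None):
    """Return (host, port) from DHCE_HOST / DHCE_PORT, or the assumed defaults."""
    env = os.environ if environ is None else environ
    host = env.get("DHCE_HOST") or DEFAULT_HOST
    port = int(env.get("DHCE_PORT") or DEFAULT_PORT)
    return host, port


if __name__ == "__main__":
    host, port = resolve_server_address()
    print(f"tests would connect to {host}:{port}")
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise in unit tests. The resolved pair would presumably be handed to the `Session` constructor, whose parameters are not shown in this excerpt.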
## Install
``` shell
$ pip3 install dist/pydeephaven-0.0.1-py3-none-any.whl
```
## Quick start

```python
>>> from pydeephaven import Session
>>> session = Session() # assuming Deephaven Community Edition is running locally with the default configuration
>>> table1 = session.time_table(period=1000000).update(column_specs=["Col1 = i % 2"])
>>> df = table1.snapshot().to_pandas()
>>> print(df)
                Timestamp  Col1
0     1629681525690000000     0
1     1629681525700000000     1
2     1629681525710000000     0
3     1629681525720000000     1
4     1629681525730000000     0
...                   ...   ...
1498  1629681540670000000     0
1499  1629681540680000000     1
1500  1629681540690000000     0
1501  1629681540700000000     1
1502  1629681540710000000     0

>>> session.close()

```
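
The quick start ends with an explicit `session.close()`. One way to make sure that call happens even when a query raises is `contextlib.closing` from the standard library. The sketch below uses a hypothetical `StubSession` class in place of `pydeephaven.Session` so that it runs without a server; only the `close()` method is assumed to match the real class.

```python
from contextlib import closing


class StubSession:
    """Hypothetical stand-in for pydeephaven.Session; runs without a server."""

    def __init__(self):
        self.is_closed = False

    def close(self):
        self.is_closed = True


def run_query(session_factory):
    # closing() calls session.close() when the block exits, even on error.
    with closing(session_factory()) as session:
        # Real code would build tables here, e.g. session.time_table(...).
        return session


session = run_query(StubSession)
print(session.is_closed)  # prints True
```

With the real client, `run_query(Session)` would follow the same pattern, assuming a server is reachable with the default configuration.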

## Related documentation
* https://deephaven.io/
* https://arrow.apache.org/docs/python/index.html

## API Reference
[Start here](https://deephaven.io/core/docs/clientapis/python)
20 changes: 20 additions & 0 deletions pyclient/docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
35 changes: 35 additions & 0 deletions pyclient/docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
8 changes: 8 additions & 0 deletions pyclient/docs/source/_static/custom.css
@@ -0,0 +1,8 @@

/* Spacing after functions. */
dl {
margin-bottom: 50px;
}

/* Do not print the module value properties (e.g. "= None") */
em.property { display: none }
61 changes: 61 additions & 0 deletions pyclient/docs/source/conf.py
@@ -0,0 +1,61 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys

sys.path.insert(0, os.path.abspath('../..'))

# -- Project information -----------------------------------------------------

project = 'Deephaven Python Client API'
copyright = '2021, Deephaven Data Labs'
author = 'Deephaven Data Labs'

# The full version, including alpha/beta/rc tags
release = '0.0.1'

# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.napoleon', 'sphinx.ext.todo', 'sphinx.ext.viewcode', 'sphinx.ext.autodoc',
"sphinx_autodoc_typehints"]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ["proto"]

# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'alabaster'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

html_css_files = ['custom.css']
html_theme_options = {
'page_width': '80%',
'sidebar_width': '35%',
}

add_module_names = False
37 changes: 37 additions & 0 deletions pyclient/docs/source/index.rst
@@ -0,0 +1,37 @@
.. Deephaven Python Client API documentation master file, created by
   sphinx-quickstart on Thu Aug 19 12:27:56 2021.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Deephaven Python Client API Documentation
===========================================================================

Deephaven Python Client (pydeephaven) is a Python API built on top of Deephaven’s highly efficient OpenAPI, which is based on gRPC and Apache Arrow. It allows Python applications to remotely connect to Deephaven data servers, export/import data with the server, run Python scripts on the server, and execute powerful queries on data tables.

Because Deephaven data servers and Deephaven clients, including pydeephaven, exchange data in the Apache Arrow format, pydeephaven is able to leverage ‘pyarrow’, the Python bindings of Arrow (https://arrow.apache.org/docs/python/), for data representation and integration with other data analytics tools such as NumPy and Pandas.

Examples:
>>> from pydeephaven import Session
>>> from pyarrow import csv
>>> session = Session() # assuming Deephaven Community Edition is running locally with the default configuration
>>> table1 = session.import_table(csv.read_csv("data1.csv"))
>>> table2 = session.import_table(csv.read_csv("data2.csv"))
>>> joined_table = table1.join(table2, keys=["key_col_1", "key_col_2"], columns_to_add=["data_col1"])
>>> df = joined_table.snapshot().to_pandas()
>>> print(df)
>>> session.close()

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   modules


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
4 changes: 4 additions & 0 deletions pyclient/docs/source/modules.rst
@@ -0,0 +1,4 @@
.. toctree::
   :maxdepth: 4

   pydeephaven
55 changes: 55 additions & 0 deletions pyclient/docs/source/pydeephaven.rst
@@ -0,0 +1,55 @@
pydeephaven package
===================

.. toctree::
   :maxdepth: 2

pydeephaven.constants module
----------------------------

.. automodule:: pydeephaven.constants
   :members:
   :show-inheritance:

pydeephaven.dherror module
--------------------------

.. automodule:: pydeephaven.dherror
   :members:
   :show-inheritance:

pydeephaven.query module
------------------------

.. automodule:: pydeephaven.query
   :members:
   :show-inheritance:

pydeephaven.session module
--------------------------

.. automodule:: pydeephaven.session
   :members:
   :show-inheritance:
   :special-members: __init__

pydeephaven.table module
------------------------

.. automodule:: pydeephaven.table
   :members:
   :show-inheritance:

pydeephaven._table_interface module
-----------------------------------

.. automodule:: pydeephaven._table_interface
   :members:
   :show-inheritance:

Module contents
---------------

.. automodule:: pydeephaven
   :members:
   :show-inheritance:
33 changes: 33 additions & 0 deletions pyclient/pydeephaven/__init__.py
@@ -0,0 +1,33 @@
#
# Copyright (c) 2016-2021 Deephaven Data Labs and Patent Pending
#
"""Deephaven Python Client (`pydeephaven`) is a Python API built on top of Deephaven's highly efficient Open API, which is
based on gRPC and Apache Arrow. It allows Python applications to remotely connect to Deephaven data servers,
export/import data with the server, run Python scripts on the server, and execute powerful queries on data tables.
Because Deephaven data servers and Deephaven clients, including pydeephaven, exchange data in the Apache Arrow format,
pydeephaven is able to leverage 'pyarrow', the Python bindings of Arrow (https://arrow.apache.org/docs/python/), for
data representation and integration with other data analytics tools such as NumPy and Pandas.
Examples:
>>> from pydeephaven import Session
>>> from pyarrow import csv
>>> session = Session() # assuming Deephaven Community Edition is running locally with the default configuration
>>> table1 = session.import_table(csv.read_csv("data1.csv"))
>>> table2 = session.import_table(csv.read_csv("data2.csv"))
>>> joined_table = table1.join(table2, keys=["key_col_1", "key_col_2"], columns_to_add=["data_col1"])
>>> df = joined_table.snapshot().to_pandas()
>>> print(df)
>>> session.close()
"""

from .table import Table
from .session import Session
from .dherror import DHError
from ._combo_aggs import ComboAggregation
from .constants import SortDirection, MatchRule
from ._table_interface import TableInterface
from .query import Query

__all__ = ["Session", "Table", "Query", "TableInterface", "ComboAggregation", "DHError", "SortDirection", "MatchRule"]