[HOPSWORKS-2206] HSFS profile to install with and without Hive dependencies (#200)

moritzmeister committed Dec 18, 2020
1 parent 8a59a4e commit 110b613
Showing 4 changed files with 22 additions and 7 deletions.
6 changes: 5 additions & 1 deletion docs/integrations/python.md
@@ -30,9 +30,13 @@ Create a file called `featurestore.key` in your designated Python environment an
To be able to access the Hopsworks Feature Store, the `HSFS` Python library needs to be installed in the environment from which you want to connect to the Feature Store. You can install the library through pip. We recommend using a Python environment manager such as *virtualenv* or *conda*.

```
-pip install hsfs~=[HOPSWORKS_VERSION]
+pip install hsfs[hive]~=[HOPSWORKS_VERSION]
```

!!! attention "Hive Dependencies"

By default, `HSFS` assumes Spark/EMR is used as execution engine and therefore Hive dependencies are not installed. Hence, on a local Python evnironment, if you are planning to use a regular Python Kernel **without Spark/EMR**, make sure to install the **"hive"** extra dependencies (`hsfs[hive]`).

!!! attention "Matching Hopsworks version"
The **major version of `HSFS`** needs to match the **major version of Hopsworks**.

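Once installed, a plain Python kernel can reach the Feature Store through the Hive engine. A minimal connection sketch, assuming the `featurestore.key` file created above; the host and project name are placeholders:

```
import hsfs

# Connect from a plain Python environment (Hive engine, no Spark/EMR).
# "my.hopsworks.host" and "my_project" are placeholders for your setup.
connection = hsfs.connection(
    host="my.hopsworks.host",
    project="my_project",
    api_key_file="featurestore.key",  # the key file created above
)
fs = connection.get_feature_store()  # the project's default feature store
```
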
6 changes: 5 additions & 1 deletion docs/integrations/sagemaker.md
@@ -141,9 +141,13 @@ You have two options to make your API key accessible from SageMaker:
To be able to access the Hopsworks Feature Store, the `HSFS` Python library needs to be installed. One way of achieving this is by opening a Python notebook in SageMaker and installing the `HSFS` with a magic command and pip:

```
-!pip install hsfs~=[HOPSWORKS_VERSION]
+!pip install hsfs[hive]~=[HOPSWORKS_VERSION]
```

!!! attention "Hive Dependencies"

By default, `HSFS` assumes Spark/EMR is used as execution engine and therefore Hive dependencies are not installed. Hence, on AWS SageMaker, if you are planning to use a regular Python Kernel **without Spark/EMR**, make sure to install the **"hive"** extra dependencies (`hsfs[hive]`).

!!! attention "Matching Hopsworks version"
The **major version of `HSFS`** needs to match the **major version of Hopsworks**.

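On SageMaker the connection looks much the same, except the API key can be read from one of the two AWS stores mentioned above. A sketch, assuming the key is kept in AWS Secrets Manager; the host and project name are placeholders:

```
import hsfs

# Connect from a SageMaker notebook (Hive engine, no Spark/EMR).
connection = hsfs.connection(
    host="my.hopsworks.host",        # placeholder
    project="my_project",            # placeholder
    secrets_store="secretsmanager",  # or "parameterstore"
    hostname_verification=True,
)
fs = connection.get_feature_store()
```
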
11 changes: 10 additions & 1 deletion python/hsfs/engine/__init__.py
@@ -14,7 +14,8 @@
# limitations under the License.
#

-from hsfs.engine import spark, hive
+from hsfs.engine import spark
+from hsfs.client import exceptions

_engine = None

@@ -25,6 +26,14 @@ def init(engine_type, host=None, cert_folder=None, project=None, cert_key=None):
    if engine_type == "spark":
        _engine = spark.Engine()
    elif engine_type == "hive":
+        try:
+            from hsfs.engine import hive
+        except ImportError:
+            raise exceptions.FeatureStoreException(
+                "Trying to instantiate Hive as engine, but 'hive' extras are "
+                "missing in HSFS installation. Install with `pip install "
+                "hsfs[hive]`."
+            )
        _engine = hive.Engine(host, cert_folder, project, cert_key)


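Moving the `hive` import inside `init` means a Spark-only installation imports `hsfs` cleanly; the missing extras surface, with an actionable message, only when the Hive engine is actually requested. A sketch of that behaviour (the host and project name are placeholders):

```
from hsfs import engine
from hsfs.client.exceptions import FeatureStoreException

try:
    # Raises here, not at `import hsfs`, if the "hive" extras are missing.
    engine.init("hive", host="my.hopsworks.host", project="my_project")
except FeatureStoreException as err:
    print(err)  # suggests `pip install hsfs[hive]`
```
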
6 changes: 2 additions & 4 deletions python/setup.py
@@ -22,10 +22,7 @@ def read(fname):
"boto3",
"pandas",
"numpy",
"pyhopshive[thrift]",
"PyMySQL",
"pyjks",
"sqlalchemy",
"mock",
],
extras_require={
@@ -37,7 +34,8 @@
"mkdocs",
"mkdocs-material",
"keras-autodoc",
"markdown-include"]
"markdown-include"],
"hive": ["pyhopshive[thrift]", "sqlalchemy", "PyMySQL"],
},
author="Logical Clocks AB",
author_email="moritz@logicalclocks.com",
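The net effect of the packaging change: a default install stays lean for Spark/EMR setups, while the Hive client stack is pulled in only on request, using the same version placeholder as the docs above:

```
pip install hsfs~=[HOPSWORKS_VERSION]        # Spark/EMR engine only
pip install hsfs[hive]~=[HOPSWORKS_VERSION]  # adds pyhopshive[thrift], sqlalchemy, PyMySQL
```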
