Publish PYDF source
PiperOrigin-RevId: 570962548
rstz authored and copybara-github committed Oct 5, 2023
1 parent ec6513c commit a576c23
Showing 77 changed files with 9,785 additions and 0 deletions.
95 changes: 95 additions & 0 deletions yggdrasil_decision_forests/port/python/.bazelrc
@@ -0,0 +1,95 @@
# Bazel configuration for Yggdrasil Decision Forests
#
# OPTIONS
#
# Linux
# linux_cpp17: Linux build. Uses C++17.
#
# Linux options:
#
# linux_avx2: AVX2.
#
# Windows
#
# On Windows, uncomment the output_user_root option to avoid long path issues.
#
# windows_cpp17: Windows build. Uses C++17.
#
# Windows options:
#
# windows_avx2: AVX2.
#

# Common flags.
common --experimental_repo_remote_exec

# Avoid long path issues on Windows
# startup --output_user_root=C:/tmpbld

# Flags to compile with or without a recent version of Bazel.

# Required for bazel <=4, fails with bazel >= 5.
# common --incompatible_restrict_string_escapes=false

build -c opt
build --announce_rc
build --noincompatible_strict_action_env

# Enable after adding python headers to protobuf.
build --define=use_fast_cpp_protos=true
build --define=allow_oversize_protos=true
build --define=grpc_no_ares=true

# Nice print
build:linux --copt=-fdiagnostics-color=always
build --color=yes

# Suppress C++ compiler warnings.
build:linux --copt=-w
build:linux --host_copt=-w
build:macos --copt=-w
build:windows --copt=/W0

# Build mode.
build:linux_cpp17 --cxxopt=-std=c++17
build:linux_cpp17 --host_cxxopt=-std=c++17
build:linux_cpp17 --config=linux

build:macos --cxxopt=-std=c++17
build:macos --host_cxxopt=-std=c++17

build:windows_cpp17 --cxxopt=/std:c++17
build:windows_cpp17 --host_cxxopt=/std:c++17
build:windows_cpp17 --config=windows

build:windows_cpp20 --cxxopt=/std:c++20
build:windows_cpp20 --host_cxxopt=/std:c++20
build:windows_cpp20 --config=windows

# Instruction set optimizations
build:linux_avx2 --copt=-mavx2
build:windows_avx2 --copt=/arch:AVX2

# Misc build options we need for windows.
build:windows --copt=/D_USE_MATH_DEFINES
build:windows --host_copt=/D_USE_MATH_DEFINES
build:windows --copt=-DWIN32_LEAN_AND_MEAN
build:windows --host_copt=-DWIN32_LEAN_AND_MEAN
build:windows --copt=-DNOGDI
build:windows --host_copt=-DNOGDI
build:windows --linkopt=/NDEBUG
build:windows --host_linkopt=/NDEBUG
build:windows --linkopt=/OPT:REF
build:windows --host_linkopt=/OPT:REF
build:windows --linkopt=/OPT:ICF
build:windows --host_linkopt=/OPT:ICF
build:windows --experimental_strict_action_env=true
build:windows --copt=/Zc:preprocessor
build:windows --host_copt=/Zc:preprocessor
build:windows --materialize_param_files
build:windows --features=compiler_param_file
build:windows --verbose_failures
build:windows --copt=/Oi
build:windows --host_copt=/Oi
build:windows --copt=/GL
build:windows --host_copt=/GL
15 changes: 15 additions & 0 deletions yggdrasil_decision_forests/port/python/CHANGELOG.md
@@ -0,0 +1,15 @@
# Changelog

## 0.0.1 - 2023-10-03

Initial Alpha Release

### Features

- Training, prediction, evaluation, import from Pandas
- Learners: Gradient Boosted Trees (and derivatives), Random Forest (and
  derivatives), CART.

#### Release music

Frisch heran (Opus 386), Johann Strauss (Sohn)
42 changes: 42 additions & 0 deletions yggdrasil_decision_forests/port/python/README.md
@@ -0,0 +1,42 @@
# Port of Yggdrasil / TensorFlow Decision Forests for Python

The Python port of Yggdrasil Decision Forests (YDF) is a lightweight wrapper
around the YDF C++ library. It allows direct, fast access to YDF's methods and
also offers advanced import / export, evaluation and inspection methods. While
the package is called YDF, the wrapping code is sometimes lovingly called *PYDF*.

It is not a replacement for its sister project
[TensorFlow Decision Forests](https://github.com/tensorflow/decision-forests)
(TF-DF). Instead, it complements TF-DF for use cases that cannot be solved
through the Keras API.

## Installation

To install YDF for Python, install the package with pip:

```sh
pip install ydf
```

## Compiling & Building

To build the Python port of YDF, install GCC-9 and run the following commands
from the root of the port/python directory in the YDF repository:

```sh
PYTHON_BIN=python3.9
./tools/test_pydf.sh
./tools/build_pydf.sh $PYTHON_BIN
```

## Frequently Asked Questions

* **Is it PYDF or YDF?** The name of the library is simply ydf, and so is the
  name of the corresponding pip package. Internally, the team sometimes uses
  the name *PYDF* because it fits so well.
* **What is the status of PYDF?** PYDF is currently in Alpha development. Some
  parts already work well (training models and generating predictions), others
  are yet to be added. The API surface may still change without notice.
* **How should you pronounce PYDF?** The preferred pronunciation is
  "Py-dee-eff" / ˈpaɪˈdiˈɛf (IPA).

19 changes: 19 additions & 0 deletions yggdrasil_decision_forests/port/python/WORKSPACE
@@ -0,0 +1,19 @@
workspace(name = "ydf")

# This workspace (YDF Python) relies on YDF C++ located at "../../../"
local_repository(
    name = "ydf_cc",
    path = "../../../",
)

load("//ydf:library.bzl", ydf_load_deps = "load_dependencies")
ydf_load_deps()

# Load the dependencies of YDF.
load("@ydf_cc//yggdrasil_decision_forests:library.bzl", ydf_cc_load_deps = "load_dependencies")
ydf_cc_load_deps(repo_name = "@ydf_cc", exclude_repo="tensorflow")

load("@com_github_grpc_grpc//bazel:grpc_deps.bzl", "grpc_deps")
grpc_deps()
load("@com_github_grpc_grpc//bazel:grpc_extra_deps.bzl", "grpc_extra_deps")
grpc_extra_deps()
6 changes: 6 additions & 0 deletions yggdrasil_decision_forests/port/python/config/MANIFEST.in
@@ -0,0 +1,6 @@
include LICENSE
include README.md
include CHANGELOG.md
recursive-include * *.so
recursive-include * *.dylib
recursive-include * *.dll
131 changes: 131 additions & 0 deletions yggdrasil_decision_forests/port/python/config/setup.py
@@ -0,0 +1,131 @@
# Copyright 2022 Google LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Setup file for pip's build.
This file is used by tools/build_pip_package.sh.
"""
import platform
import setuptools
from setuptools.command.install import install
from setuptools.dist import Distribution

_VERSION = "0.0.1"

with open("README.md", "r", encoding="utf-8") as fh:
  long_description = fh.read()

REQUIRED_PACKAGES = [
    "numpy",
    "absl_py",
    "protobuf>=3.14",
]

OPTIONAL_PACKAGES = {"pandas": ["pandas"], "plot": ["matplotlib"]}


class InstallPlatlib(install):

  def finalize_options(self):
    install.finalize_options(self)
    if self.distribution.has_ext_modules():
      self.install_lib = self.install_platlib


class BinaryDistribution(Distribution):

  def has_ext_modules(self):
    return True

  def is_pure(self):
    return False


try:
  from wheel.bdist_wheel import bdist_wheel as _bdist_wheel

  class bdist_wheel(_bdist_wheel):

    def finalize_options(self):
      _bdist_wheel.finalize_options(self)
      self.root_is_pure = False

    def get_tag(self):
      python, abi, plat = _bdist_wheel.get_tag(self)
      if platform.system() == "Darwin":
        # Uncomment one of the lines below to adapt the platform string when
        # cross-compiling.
        # plat = "macosx_12_0_arm64"
        # plat = "macosx_10_15_x86_64"
        pass
      return python, abi, plat

except ImportError:
  bdist_wheel = None

setuptools.setup(
    cmdclass={
        "bdist_wheel": bdist_wheel,
        "install": InstallPlatlib,
    },
    name="ydf",
    version=_VERSION,
    author="Mathieu Guillame-Bert, Richard Stotz, Jan Pfeifer",
    author_email="decision-forests-contact@google.com",
    description=(
        "YDF (short for Yggdrasil Decision Forests) is a library for training,"
        " serving, evaluating and analyzing decision forest models such as"
        " Random Forest and Gradient Boosted Trees."
    ),
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/google/yggdrasil-decision-forests",
    project_urls={
        "Documentation": "https://ydf.readthedocs.io/",
        "Source": "https://github.com/google/yggdrasil-decision-forests.git",
        "Tracker": (
            "https://github.com/google/yggdrasil-decision-forests/issues"
        ),
    },
    classifiers=[
        "Intended Audience :: Developers",
        "Intended Audience :: Education",
        "Intended Audience :: Science/Research",
        "License :: OSI Approved :: Apache Software License",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Programming Language :: Python :: 3.11",
        "Programming Language :: Python :: 3 :: Only",
        "Topic :: Scientific/Engineering",
        "Topic :: Scientific/Engineering :: Mathematics",
        "Topic :: Scientific/Engineering :: Artificial Intelligence",
        "Topic :: Software Development",
        "Topic :: Software Development :: Libraries",
        "Topic :: Software Development :: Libraries :: Python Modules",
    ],
    distclass=BinaryDistribution,
    packages=setuptools.find_packages(),
    python_requires=">=3.8",
    license="Apache 2.0",
    keywords=(
        "machine learning decision forests random forest gradient boosted"
        " decision trees classification regression ranking uplift"
    ),
    install_requires=REQUIRED_PACKAGES,
    extras_require=OPTIONAL_PACKAGES,
    include_package_data=True,
    zip_safe=False,
)
2 changes: 2 additions & 0 deletions yggdrasil_decision_forests/port/python/dev_requirements.txt
@@ -0,0 +1,2 @@
pandas
matplotlib
17 changes: 17 additions & 0 deletions yggdrasil_decision_forests/port/python/documentation/glossary.md
@@ -0,0 +1,17 @@
# Glossary

## Confusion matrix

The confusion matrix shows how many times the model predicted a given class
(column) while the example was really of a given class (row). Values in the
confusion matrix are weighted.
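
As a rough illustration of this definition, the sketch below builds a weighted
confusion matrix in plain Python. The function name `weighted_confusion_matrix`
and its arguments are hypothetical and not part of the YDF API. Rows index the
true class and columns the predicted class, and each example contributes its
weight instead of a count of one.

```python
def weighted_confusion_matrix(labels, predictions, weights, classes):
    """Returns matrix[true_class][predicted_class] = sum of example weights.

    Hypothetical helper for illustration only; not the YDF implementation.
    """
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0.0] * len(classes) for _ in classes]
    for label, prediction, weight in zip(labels, predictions, weights):
        # Row = true class, column = predicted class.
        matrix[index[label]][index[prediction]] += weight
    return matrix


labels = ["cat", "cat", "dog", "dog"]
predictions = ["cat", "dog", "dog", "dog"]
weights = [1.0, 2.0, 1.0, 0.5]
matrix = weighted_confusion_matrix(labels, predictions, weights, ["cat", "dog"])
for class_name, row in zip(["cat", "dog"], matrix):
    print(class_name, row)
```

With all weights equal to 1, the matrix reduces to plain counts.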

## Number of examples

Number of examples in the test dataset. The example weight is not taken into
account.

## Weighted number of examples

Number of examples in the test dataset, where each example counts according to
its weight.
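
The difference between the two counts can be sketched in a few lines of Python.
This assumes that taking the example weight into account means summing the
weights. The helper `count_examples` is hypothetical and not part of the YDF
API.

```python
def count_examples(weights):
    """Returns (number of examples, weighted number of examples).

    Hypothetical helper for illustration only; assumes the weighted count
    is the sum of the per-example weights.
    """
    num_examples = len(weights)  # Every example counts as 1.
    weighted_num_examples = sum(weights)  # Every example counts as its weight.
    return num_examples, weighted_num_examples


weights = [1.0, 2.0, 1.0, 0.5]
num_examples, weighted_num_examples = count_examples(weights)
print(num_examples, weighted_num_examples)
```

When every weight is 1, both values coincide.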
@@ -0,0 +1,3 @@
This directory contains the documentation for automatically generated reports in
PYDF. Ultimately, this documentation will be moved to the YDF website, i.e.
"third_party/yggdrasil_decision_forests/documentation/rtd".
3 changes: 3 additions & 0 deletions yggdrasil_decision_forests/port/python/requirements.txt
@@ -0,0 +1,3 @@
numpy
absl-py
protobuf>=3.14
@@ -0,0 +1 @@
licenses(["notice"])
@@ -0,0 +1,24 @@
"""Pybind project."""

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

def deps():
    # Version 2.11.1
    PYBIND_BAZEL_COMMIT_HASH = "fd7d88857cca3d7435b06f3ac6abab77cd9983b2"
    PYBIND_BAZEL_SHA = "34bc7959304c22ca7b37be06c6078b6be1ffd12e683961aadb1c18d28d7d9d5f"
    PYBIND_COMMIT_HASH = "8a099e44b3d5f85b20f05828d919d2332a8de841"
    PYBIND_SHA = "e7fc4519e2c59737d38751fab8de2192865a06dac4f45231b96f9628e62da5a6"
    http_archive(
        name = "pybind11_bazel",
        strip_prefix = "pybind11_bazel-{commit}".format(commit = PYBIND_BAZEL_COMMIT_HASH),
        urls = ["https://github.com/pybind/pybind11_bazel/archive/{commit}.tar.gz".format(commit = PYBIND_BAZEL_COMMIT_HASH)],
        sha256 = PYBIND_BAZEL_SHA,
    )

    http_archive(
        name = "pybind11",
        build_file = "@pybind11_bazel//:pybind11.BUILD",
        strip_prefix = "pybind11-{commit}".format(commit = PYBIND_COMMIT_HASH),
        urls = ["https://github.com/pybind/pybind11/archive/{commit}.tar.gz".format(commit = PYBIND_COMMIT_HASH)],
        sha256 = PYBIND_SHA,
    )
@@ -0,0 +1 @@
licenses(["notice"])
@@ -0,0 +1,13 @@
"""Pybind absl wrappers project."""

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

def deps():
    PYBIND_ABSL_COMMIT_HASH = "fcfff8502fad281b0c1197872a1e30cdab69a323"
    PYBIND_ABSL_SHA = "ad51ee8caa3919c030cffdc1494bde346dba188db191dc4be9ff652e2c4a4d2f"
    http_archive(
        name = "com_google_pybind11_abseil",
        urls = ["https://github.com/pybind/pybind11_abseil/archive/{commit}.tar.gz".format(commit = PYBIND_ABSL_COMMIT_HASH)],
        strip_prefix = "pybind11_abseil-{commit}".format(commit = PYBIND_ABSL_COMMIT_HASH),
        sha256 = PYBIND_ABSL_SHA,
    )
@@ -0,0 +1 @@
licenses(["notice"])