A unified Data Analytics and AI platform for distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
Analytics Zoo seamless scales TensorFlow, Keras and PyTorch to distributed big data (using Spark, Flink & Ray).
-
End-to-end pipeline for applying AI models (TensorFlow, PyTorch, OpenVINO, etc.) to distributed big data
- Write TensorFlow or PyTorch inline with Spark code for distributed training and inference.
- Native deep learning (TensorFlow/Keras/PyTorch/BigDL) support in Spark ML Pipelines.
- Directly run Ray programs on big data cluster through RayOnSpark.
- Plain Java/Python APIs for (TensorFlow/PyTorch/BigDL/OpenVINO) Model Inference.
-
High-level ML workflow for automating machine learning tasks
- Cluster Serving for automatically distributed (TensorFlow/PyTorch/Caffe/OpenVINO) model inference .
- Scalable AutoML for time series prediction.
-
Built-in models for Recommendation, Time Series, Computer Vision and NLP applications.
You may want to develop your AI solutions using Analytics Zoo if:
- You want to easily apply AI models (e.g., TensorFlow, Keras, PyTorch, BigDL, OpenVINO, etc.) to distributed big data.
- You want to transparently scale your AI applications from a single laptop to large clusters with "zero" code changes.
- You want to deploy your AI pipelines to existing YARN or K8S clusters WITHOUT any modifications to the clusters.
- You want to automate the process of applying machine learning (such as feature engineering, hyperparameter tuning, model selection, distributed inference, etc.).
- Check out the Getting Started page for a quick overview of how to use Analytics Zoo.
- Refer to the Python, Scala and Docker guides to install Analytics Zoo.
- Visit the Document Website (mirror in China) for more information on Analytics Zoo.
- Check the Powered By & Presentations pages for real-world applications using Analytics Zoo.
- Join the Google Group (or subscribe to the Mail List) for more questions and discussions on Analytics Zoo.