Below is DataHub's roadmap for the short and medium term. We'll revise it regularly and welcome suggestions from the community.
Replace PDSC with PDL [WIP]
- More readable, Java-like syntax + code-gen based on annotations
- Split up unified events to improve scalability & modularity
- Models + UI
- Models + UI
- Link datasets to jobs & flows
- Make schemas searchable
- Support GraphQL schemas
- Simple tag-based data privacy metadata
- Add query-after-write capability to local DAO
- Support the majority of Gremlin-compatible graph DBs
- Config-driven UI
- Generate TypeScript types from Pegasus
- UI to highlight high-value information about Entities within Search and Entity Pages
- Migration from docker-compose to Kubernetes for Docker container orchestration
- Run DataHub in Azure and provide how-to guides
- Models + impact analysis
- Indexing in OLAP store (Pinot) with TTL
- Users will be able to like and follow entities
- Dataset & field-level commenting
- Initially focus on rest.li services & GraphQL integration
- Support a wide range of document stores
- Use GraphQL exclusively for frontend queries
- Use Redux exclusively for UI state management
- Add docker-based integration tests
- Donate code to the Apache Software Foundation
- TypeScript-only frontend development
- Modeling in protobuf + serving in gRPC