[SPARK-2393][SQL] Cost estimation optimization framework for Catalyst logical plans & sample usage. #1238

Closed
wants to merge 27 commits into from

Commits on Jul 29, 2014

  1. Prototype impl of estimations for Catalyst logical plans.

    - Also add simple size-getters for ParquetRelation and
      MetastoreRelation.
    - Add a rule to auto-convert equi-joins to BroadcastHashJoin, if a table
      has smaller size, based on the above getter (for MetastoreRelation).
    concretevitamin committed Jul 29, 2014
    56a8e6e
  2. Typo.

    concretevitamin committed Jul 29, 2014
    5bf5586
  3. Refactors.

    - Remove BaseRelation from Catalyst and clean up related code (e.g.
      unmake SparkLogicalPlan a BaseRelation).
    - Remove broadcastTables from SQLConf and clean up related code.
    - Add EstimatesSuite.
    - Address some review comments.
    concretevitamin committed Jul 29, 2014
    84301a4
  4. Cleanups.

    concretevitamin committed Jul 29, 2014
    dcff9bd
  5. de3ae13
  6. 7a60ab7
  7. Move SQLConf to Catalyst & add default val for sizeInBytes.

    Conflicts:
    	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/SQLConf.scala
    concretevitamin committed Jul 29, 2014
    73412be
  8. 73cde01
  9. 7d9216a
  10. e5bcf5b
  11. Add comment.

    concretevitamin committed Jul 29, 2014
    3ba8f3e
  12. 4ef0d26
  13. 0ef9e5b
  14. 43d38a6
  15. ca5b825
  16. 2d99eb5
  17. 573e644
  18. 729a8e2
  19. 549061c
  20. 01b7a3e
  21. Get size info from metastore for MetastoreRelation.

    Additionally, remove size estimate from ParquetRelation since the Hadoop
    FileSystem API calls can be expensive (e.g. S3FileSystem has a lot of
    RPCs).
    concretevitamin committed Jul 29, 2014
    6e594b8
  22. 8bd2816
  23. 16fc60a
  24. Remove childrenStats.

    concretevitamin committed Jul 29, 2014
    9951305
  25. 2f2fb89
  26. Use BigInt for stat; for logical leaves, by default throw an exception.

    Also cleanups & scaladoc fixes per review comments.
    concretevitamin committed Jul 29, 2014
    8663e84
  27. 329071d