Paul Rogers edited this page Feb 4, 2020 · 30 revisions

JSON Reader

Branches:

  • DRILL-6953-rev - Newest version
  • DRILL-6953 - Original version with a bunch of batch count fixes

Review comments suggest that the current approach needs adjustment. Maybe:

  • Pull out the JSON reader itself (not the full format plugin).
  • Pull out the projection rework plus better projection tests.
  • Track down other bugs and fix them as separate PRs.
  • Finally, follow up with the revised JSON format plugin.

Prior Plan

Tasks:

  • Make JSON reader reusable (request from Charles)
  • Fix issues reported in PR

Follow-on work:

  • Run all unit tests with new reader enabled
  • Refactor to allow use of JSON reader outside of JSON file scan
  • Provided-schema support
  • Review past two years of JSON changes
  • Switch JSON function to use this reader
  • Switch Kafka to use this reader

Base/Dummy scan

Reviews suggest the current approach is too limited. Split into two streams.

  • Better API, more robust than the minimal "Base" framework.
  • Separate out the filter push-down. More robust solution.
  • Cancel current PR after above is done.

Revised Copier

Tasks:

  • PR: DRILL-7486: refactor reader creation - merged
  • PR: DRILL-TBD: refactor tests to use new schema builder
  • PR: DRILL-TBD: Bulk copy in RSL

On branch svr-exp3

  • Get copier to work with bulk copy. (Done)
  • Split copier tests into multiple files. (Done)
  • Bulk copy tests for structured data types. (Done)
  • PR for move of allocator into RSL options.
  • PR for restructure of reader creator and indexes.
  • PR for bulk copy feature in RSL.
  • PR for copier.

Current problem: TestCsvWithSchema.testBlankCols() fails with an SV4 from the sort. The likely cause is batch ownership. Maybe move merging into the sort first?

Revised Mock

  • Revise, PR ColumnMetadata
  • Revise mock to keep its structure externally, use CMD internally
  • Revise to use Base structure

PR for general clean-up.

  • Copy from copier branch
  • Copy from Sumo branch
  • Copy from mock branch
  • Copy from CountFix branch

Code Gen Refactoring

  • Refactor ExprTreeMaterializer to use schema, not vectors.
  • Project code gen unit tests
  • Unit tests for specific bits of code gen
  • Begin process of thinking how to incorporate column readers/writers

Parquet reader

  • Review existing code, work out an approach

Data Model

  • Prepare a writeup, gathering recent comments.
  • Look into Java Object support.
  • Work out an evolution plan.

Docker/K8s

Wait for Abhishek.

Other

  • Remove OUT_OF_MEMORY result status (DRILL-7487) - Done
  • Remove STOP result status
    • Working branch: stop
    • DRILL-7507: Convert fragment interrupts to exceptions - Merged
    • DRILL-7506: Simplify code gen error handling - Approved, awaiting merge
    • PR for other error handling
    • PR to remove STOP
    • PR to remove other items
  • Remove NOT_YET result status
  • Find fixed-size-block branch

Branches

  • CountFix - Probably obsolete
  • CountFix2 - Probably obsolete
  • CountFix3 - Probably obsolete
  • DRILL-6832 - Merged
  • DRILL-6951 - Merged
  • DRILL-6951-1 - Obsolete?
  • DRILL-6953 - Probably obsolete
  • DRILL-6953-2 - Probably obsolete
  • DRILL-6953-orig - Probably obsolete
  • DRILL-6953-rev - PR for JSON reader
  • DRILL-7181 - Merged
  • DRILL-7224
  • DRILL-7257 - Merged
  • DRILL-7258 - Merged
  • DRILL-7261 - Merged
  • DRILL-7261-old - Obsolete?
  • DRILL-7278 - Merged
  • DRILL-7279 - Merged
  • DRILL-7292 - Merged
  • DRILL-7293 - Merged
  • DRILL-7293-orig
  • DRILL-7306 - Merged
  • DRILL-7306-debug - Obsolete?
  • DRILL-7311
  • DRILL-7311-2
  • DRILL-7311-debug
  • DRILL-7324 - Merged
  • DRILL-7327 - Merged
  • DRILL-7333 - Abandoned, done via other PRs
  • DRILL-7333-orig - Obsolete?
  • DRILL-7358 - Merged
  • DRILL-7377 - Merged
  • DRILL-7377x - Obsolete?
  • DRILL-7402 - Merged
  • DRILL-7403 - Merged
  • DRILL-7412 - Merged
  • DRILL-7413 - Merged
  • DRILL-7413x - Obsolete?
  • DRILL-7414 - Merged
  • DRILL-7424 - Merged
  • DRILL-7436 - Merged
  • DRILL-7439 - Abandoned, done via other PRs
  • DRILL-7441 - Merged
  • DRILL-7442 - Merged
  • DRILL-7445 - Merged
  • DRILL-7446 - Merged
  • DRILL-7447
  • DRILL-7456 - Merged
  • DRILL-7458 - Base framework PR
  • DRILL-7458-2
  • DRILL-7476 - Merged
  • DRILL-7479 - Merged
  • DRILL-7486 - Merged
  • DRILL-7487 - Merged
  • DRILL-7502 - Merged
  • DRILL-7503 - Merged
  • DRILL-7506
  • DRILL-7507 - Merged
  • Dec10
  • Dec30
  • Dec30b
  • JavaObjRow - Quick & dirty Java object batch prototype
  • July14
  • June18
  • June6
  • Nov7
  • Nov7b
  • Nov7c
  • Oct19
  • Oct26
  • Oct29
  • RowSetRev4 - Probably obsolete
  • cg-test
  • cleanup-Dec1
  • error
  • error2
  • error3
  • lastSetFix
  • logrev
  • logrev-exp1
  • logrev-exp2
  • logrev-exp3
  • master
  • md-type
  • perf
  • stop - Work to retire STOP status
  • svr-exp
  • svr-exp2
  • svr-exp3
  • vectorcheck