
[Relay] Refactor Interpreter to treat lowering as IRModule->IRModule rewrite. #8597

Merged
merged 2 commits on Aug 17, 2021

Commits on Aug 17, 2021

  1. This continues the work outlined in the RFC

      https://discuss.tvm.apache.org/t/rfc-relay-tecompiler-rewrite-existing-compile-engine-to-match-updated-compiler-flow/9233
    This gets about halfway there for the Interpreter:
    
    * Remove direct access to TECompiler from the interpreter; instead call
      tec::LowerTEExpr when 'preparing' a module and expression for evaluation.
    * Make clear there is no phase distinction between create_interpreter and
      evaluate on the Python side -- both must be prepared together over a single IRModule.
    * In return, ensure the result of evaluate on the Python side is a packed func
      ready to directly apply 'simple' arguments to an already-interpreted closure
      (see the sketch after this list).
    * The interpreter builds and caches primitive TIR functions (and their corresponding
      dynamic shape functions) as packed funcs as they are encountered.
    * Clean up uses of the interpreter for constant folding on the C++ side.
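
    To make the single-phase preparation concrete, here is a minimal usage sketch
    of the Python side after this change. This is a sketch only: the toy model,
    target, and device below are illustrative assumptions, not code from this PR.

    ```python
    import numpy as np
    import tvm
    from tvm import relay

    # Build a trivial Relay module: main(x) = abs(x).
    x = relay.var("x", shape=(2, 3), dtype="float32")
    mod = tvm.IRModule.from_expr(relay.Function([x], relay.abs(x)))

    # "debug" selects the interpreter-backed executor. create_executor and
    # evaluate are prepared together over the single IRModule; evaluate
    # returns a packed func that applies 'simple' arguments directly.
    func = relay.create_executor(
        "debug", mod=mod, device=tvm.cpu(0), target="llvm"
    ).evaluate()

    print(func(np.random.rand(2, 3).astype("float32")))
    ```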
    
    Future work:
    * Fold LoweredModule into IRModule so tec::LowerTEExpr is just another pass.
    * Get rid of the implicit caching of lowered functions in TECompiler.
    * Make the calling convention from Relay to TIR explicit, and remove all the function
      attribute hackery currently needed so the interpreter can correctly invoke lowered
      functions as it encounters them.
    * Make TECompiler private. Though we could do this now, it would make migrating the
      VM and AOT uses of CompileEngine harder.
    
    Force a gc between sphinx-gallery items to reclaim GPU memory. (apache#8722)
    
    GPU memory is only released once the PackedFunc for evaluating the model is
    garbage-collected by Python. In CI we're seeing intermittent 'CUDA: Out of memory'
    failures while processing the tutorials, and tracing showed no gc was happening
    between items. Not confident this will solve the problem, but worth a try.
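
    Below is a sketch of the kind of change, assuming the docs build can use
    sphinx-gallery's reset_modules hook in conf.py. The resetter signature
    follows sphinx-gallery's documented custom-resetter interface; the actual
    patch in apache#8722 may differ.

    ```python
    # conf.py sketch: run gc between gallery items so the PackedFunc (and the
    # GPU memory it pins) from the previous tutorial is actually released.
    import gc

    def force_gc(gallery_conf, fname):
        gc.collect()

    sphinx_gallery_conf = {
        # ...existing options elided...
        # Custom resetters run between examples.
        "reset_modules": ("matplotlib", force_gc),
    }
    ```
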
    mbs-octoml committed Aug 17, 2021
    a249ea8
  2. Get rid of log spam.

    mbs-octoml committed Aug 17, 2021
    72c0002