Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logs in DHaaL are confusing #2734

Closed
niloc132 opened this issue Aug 15, 2022 · 1 comment · Fixed by #2763
Closed

Logs in DHaaL are confusing #2734

niloc132 opened this issue Aug 15, 2022 · 1 comment · Fixed by #2763
Assignees
Labels
Milestone

Comments

@niloc132
Copy link
Member

Split off from #2453

From the original issue, linked above:

Revisit how logs are handled, redirected, displayed, to be sure that if the web console is used at all, it sees what it would expect to see.

Roughly, there are two kinds of logs in Deephaven - stdout/stderr which will be written mostly by script code or handle some basic app logging, and Deephaven logs, which is configurable at startup or at runtime but almost exclusively runs from inside the server or engine. Stdout/err should almost always write to the python console, whether it is interactive or not, and both should write to the web console. Additionally, Deephaven logs should generally be written to stderr (not stdout) so a long-running batch script or server can see its output.

Likely will be fixed along with #2723.

@niloc132 niloc132 added bug Something isn't working logging python-server-side labels Aug 15, 2022
@niloc132 niloc132 added this to the Aug 2022 milestone Aug 15, 2022
@niloc132 niloc132 self-assigned this Aug 15, 2022
@niloc132
Copy link
Member Author

niloc132 commented Aug 17, 2022

Current flow of logs in DHaaL:

  • python starts, creates its own sys.stdout/err objects, which point to the process's own stdout/stderr
  • jvm starts, creates its own System.out/err objects, which point to the process's own stdout/stderr
  • Deephaven logging rewires sys.stdout/err objects with ones that only forward to java's System.out/err, which has no net effect (but enables other steps)
graph TD;

subgraph Outputs
terminal{{python console}}
web{{web console}}
end
style terminal fill:#ccc
style web fill:#ccc

dh(dh log buffer)
dh--->web

stdout(process stdout)
stdout-->terminal

sysstdout0(Python's original sys.stdout)
sysstdout0-->stdout;
systemout0(Java's original System.out)
systemout0--> stdout;

sysstdout(sys.stdout reference)
%% sysstdout --> sysstdout0

systemout(System.out reference)
%% systemout --> systemout0

pylogadapter(PythonLogAdapter)
pylogadapter-->systemout1
sysstdout-->pylogadapter


systemout1(dh general system.out replacement)

%% systemout1-->dh %% disabled today
systemout1-->systemout0
systemout-->systemout1


subgraph Use cases
usesystemout["System.out("test")"]
usesysstdout["print("test")"]
usedhlog["log.info().append("test")"]
style usesystemout fill: #fff
style usesysstdout fill: #fff
style usedhlog fill: #fff
end
usesystemout-->systemout
usesysstdout-->sysstdout
usedhlog-->dh


Loading

Simple change to enable stdout/err to make it to the web UI:

  • Deephaven java code should replace System.out/err with implementations that "tee" output to DH logs, so they are buffered to the UI. This was disabled for the first iteration.
graph TD;

subgraph Outputs
terminal{{python console}}
web{{web console}}
end
style terminal fill:#ccc
style web fill:#ccc

dh(dh log buffer)
dh--->web

stdout(process stdout)
stdout-->terminal

sysstdout0(Python's original sys.stdout)
sysstdout0-->stdout;
systemout0(Java's original System.out)
systemout0--> stdout;

sysstdout(sys.stdout reference)
%% sysstdout --> sysstdout0

systemout(System.out reference)
%% systemout --> systemout0

pylogadapter(PythonLogAdapter)
pylogadapter-->systemout1
sysstdout-->pylogadapter


systemout1(dh general system.out replacement)

systemout1-->|previously missing| dh 
systemout1-->systemout0
systemout-->systemout1


subgraph Use cases
usesystemout["System.out("test")"]
usesysstdout["print("test")"]
usedhlog["log.info().append("test")"]
style usesystemout fill: #fff
style usesysstdout fill: #fff
style usedhlog fill: #fff
end
usesystemout-->systemout
usesysstdout-->sysstdout
usedhlog-->dh


Loading

Simple change to enable DH logs to write to stderr (preferred over stdout in case there is a process watching output). Not diagrammed at this time because it is more than a little hard to read.

  • Deephaven logging that replaces System.out/err can test if python's old sys.stderr is a tty, if so skip. Note that this introduces a dependency on no other changes be made along the way. We cannot test accurately if Java's System.err is going to the tty or not, so we have to lean on python for this.
  • Either write logs directly to python's old sys.stderr, or to Java's old System.err. Writing to Java's old System.err seems preferable in terms of staying in Java and avoiding the GIL, but we do want to take care to defer to python's expectations in this case.

Potential changes to avoid future surprises:

  • If python code replaces stdout/stderr before DH Java starts, it might surprise the user that their redirection no longer works. Instead, consider also changing our current implementation to replace Java's System.out/err with redirections to whatever the current Python sys.stdout/stderr values are. This must take place before Deephaven replaces Java System.out/err with a "tee" output to DH logs. Example: jupyter.
graph TD;

subgraph Outputs
terminal{{python console}}
web{{web console}}
end
style terminal fill:#ccc
style web fill:#ccc

dh(dh log buffer)
dh--->web

stdout(process stdout)
stdout-->terminal

sysstdout0(Python's original sys.stdout)
sysstdout0-->stdout;
systemout0(Java's original System.out)
systemout0--> stdout;

sysstdout(sys.stdout reference)
%% sysstdout --> sysstdout0

systemout(System.out reference)
%% systemout --> systemout0

pystreamtee(dh python StreamTee)
pystreamtee-->sysstdout0
sysstdout-->pystreamtee


systemout1(dh general system.out replacement)

systemout1-->dh
%%systemout1-->systemout0 % Removed in this step
systemout-->sysstdout
pystreamtee-->systemout1


subgraph Use cases
usesystemout["System.out("test")"]
usesysstdout["print("test")"]
usedhlog["log.info().append("test")"]
style usesystemout fill: #fff
style usesysstdout fill: #fff
style usedhlog fill: #fff
end
usesystemout-->systemout
usesysstdout-->sysstdout
usedhlog-->dh


Loading

Fix for #2723

  • Avoid replacing Python's sys.stdout/stderr directly, but instead wrap these in python so that isatty and fileno work, among others. As above, this would protect against the existing python script replacing stdout and being surprised that we don't send logs where they expect. This is not diagrammed, since it is already handled by letting python code have one stream replacement directly wrap another. Strawman, built with Jianfeng:
class TeeStream(io.TextIOBase):
   encoding = 'utf-8'
   closed = False

   def __init__(self, stream, write_func, flush_func, close_func):
       """
       Creates a new TeeStream to let output be written from out place, but be sent to
       multiple places.

       Ideally, the stream would be passed as more funcs, but we have to manage correctly
       propagating certain non-java details like encoding and isatty.
       :param stream: the underlying python stream to use as a prototype
       :param write_func: a function to call when data should be written
       :param flush_func: a function to call when data should be flushed
       :param close_func: a function to call when the stream should be closed
       """
       self.stream = stream
       self.write_func = write_func
       self.flush_func = flush_func
       self.close_func = close_func

       if stream.encoding is not None:
           # Read this in the constructor, since encoding is read as an attr, not a method
           self.encoding = stream.encoding

   def isatty(self):
       return self.stream.isatty()

   def fileno(self):
       return self.stream.fileno()

   def write(self, string):
       self.write_func(string)
       self.stream.write(string)

   def flush(self):
       self.close_func()
       self.stream.flush()

   def close(self):
       self.closed = True
       self.stream.close()

Current flow of logs in Java jetty/netty servers with python enabled

  • jvm starts, creates its own System.out/err objects, which point to the process's own stdout/stderr
  • Deephaven java code replaces System.out/err with impls that "tee" output to DH logs, so they are buffered to the UI
  • python starts, creates its own sys.stdout/err objects, which point to the process's own stdout/stderr
  • Java_org_jpy_PyLib_startPython0 invokes PyLib_RedirectStdOut, which creates the jpy_stdout module, which overwrites python's sys.stdout and sys.stderr with a simple "file-like" object that only handles write and flush, and both, write to the stdout builtin, so stderr distinctions are lost. I'm not clear why this is done at all (python's own stdout/stderr should still write to the process's own stdout/err), but as of this step any write in python would be skip the "tee" installed above.
  • Deephaven logging rewires sys.stdout/err objects with ones that only forward to java's System.out/err, enabling stdout vs err distinctions, and allowing stdout/err to be shown to web users.

While this implementation supports the features we want and is internally consistent, it would now be inconsistent with the fixes proposed above (fixing isatty/fileno/encoding/etc, as well as requiring a different implementation. As such, I suggest instead the following changes:

  • In jpy upstream, discuss removing the stderr/stdout redirect
  • With this change in place, instead of replacing python's sys.stdout/err objects with a Java PythonLogAdapter, instead use the same java->py redirect in DHaaL
  • This implies setting stdout/stderr.skipReal=true earlier in startup so that output isn't written to the process's stdout/stderr twice. This comes with the risk of missing stdout/stderr after this is configured but before python is started.

Note that this approach would preclude 0-many concurrent python sessions (and while jpy doesn't support that today, the underlying c library does), so we should be cautious in considering this.

niloc132 added a commit to niloc132/deephaven-core that referenced this issue Aug 26, 2022
niloc132 added a commit to niloc132/deephaven-core that referenced this issue Aug 26, 2022
niloc132 added a commit to niloc132/deephaven-core that referenced this issue Aug 26, 2022
niloc132 added a commit to niloc132/deephaven-core that referenced this issue Aug 30, 2022
niloc132 added a commit to niloc132/deephaven-core that referenced this issue Sep 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant