Debugger stops working #460

Closed
expez opened this issue Dec 11, 2017 · 15 comments

@expez
Member

expez commented Dec 11, 2017

Expected behavior

Debugger keeps instrumenting stuff

Actual behavior

After a while (with no discernible pattern) the debugger stops working. I instrument functions, but the breakpoint doesn't trigger anymore. If I restart the REPL it starts working again.

Steps to reproduce the problem

I don't have time to create a minimal reproducible case. This happens in the (huge) system at work, when I'm sending web requests using restclient-mode to a version of our system running locally.

Since web-requests might run on different threads, that might be related. I'm hoping this issue will give me some stuff to try, before I start re-creating a smaller version of the system.

Environment & Version information

;; CIDER 0.16.0snapshot (package: 20171001.112), nREPL 0.2.12
;; Clojure 1.8.0, Java 1.8.0_151

Operating system

Ubuntu 16.04 LTS

@expez expez added the bug label Dec 11, 2017
@bbatsov
Member

bbatsov commented Dec 13, 2017

I'm afraid in its current form this bug report is not actionable at all.

@expez
Member Author

expez commented Dec 13, 2017

I was hoping to start a dialogue, with stuff to try, to avoid sinking hours into creating a minimal repro.

I'm so pressed at work that I don't have hours to sink into this. For now I've just taken the productivity hit by going back to using spyscope instead of the debugger.

If nobody has any suggestions as to how to debug this, we should indeed just close this, until someone else comes along to spearhead this.

@bbatsov
Member

bbatsov commented Dec 13, 2017

At least: is this a recent regression, or has it been a problem for a while? @vspinu made some changes to the debugger in 0.15, but no one has complained since it was released, so I assume this is something from 0.16, right? But we didn't really change anything there except the deferred middleware loading.

@expez
Member Author

expez commented Dec 13, 2017

I'm a bit more conservative about upgrading emacs packages at work than at home, so this is probably due to the changes in 0.15.

I think this is related to threads, somehow. Perhaps the breakpoint information is imperfectly shared among threads? The reason I suspect this is that even after web-requests stop tripping breakpoints, I can get the debugger to start by instrumenting a function and calling it from the REPL.

@vspinu
Contributor

vspinu commented Dec 13, 2017

Maybe obvious, but if your complex system passes functions around (higher order fns, ring, transducers etc) then you need to instrument it before the start of the system. Hard to say what's going on, really. At the least please try to remember what was the last debug action before the breakage. It's rather strange that instrumentation stops working without any errors or messages.

@expez
Member Author

expez commented Dec 14, 2017

> please try to remember what was the last debug action before the breakage

I just did the following, to reproduce the problem:

  1. Restart REPL and start system using reloaded workflow.
  2. Instrument a regular function in the bowels of the system.
  3. Hit my system with a web request, using restclient-mode.
  4. Hit n[ext] a few times, and enjoy a working debugger.
  5. Hit c[ontinue]
  6. C-c C-c
  7. C-u C-c C-c

At this point the function has the red box around it (again) to indicate instrumentation, but the debugger is no longer triggered by web requests.

nrepl messages associated with failed instrumentation:

(-->
 id         "27"
 op         "eval"
 session    "3c503663-1acd-4da5-91c0-ce0c1b8f47cc"
 time-stamp "2017-12-14 12:18:05.641002243"
 code       "#dbg
(defn- foo!
  [{:keys [qux..."
 column     1
 file       "/home/user/path/project/src/project/service/foo_service.cl..."
 line       279
 ns         "project.service.foo-service"
)
(<--
 id         "27"
 session    "3c503663-1acd-4da5-91c0-ce0c1b8f47cc"
 time-stamp "2017-12-14 12:18:05.937256370"
 ns         "project.service.foo-service"
 value      "#'project.service.foo-service/bar!"
)
(<--
 id         "27"
 session    "3c503663-1acd-4da5-91c0-ce0c1b8f47cc"
 time-stamp "2017-12-14 12:18:05.968759376"
 status     ("done")
)
(<--
 id                 "27"
 session            "3c503663-1acd-4da5-91c0-ce0c1b8f47cc"
 time-stamp         "2017-12-14 12:18:06.445546860"
 changed-namespaces (dict ...)
 repl-type          "clj"
 status             ("state")
)

I tried looking at the messages when it was working and when it wasn't, but the chatter looked identical.

@vspinu
Contributor

vspinu commented Dec 14, 2017

I see. I think this happens because of how c is implemented. It is related (or identical) to clojure-emacs/cider#1869 and to #429. (Could someone please label those with "debugger"?)

Also with recent changes to the debugger the relevant #1054 could be finally implemented. Will try to have a look at all these but probably not before the new year.

@expez
Member Author

expez commented Dec 14, 2017

Your guess is spot on. If I only use n I can keep re-evaluating and using the debugger, but as soon as I hit c the debugger won't get triggered on subsequent web requests going through that function or any other.

I've tagged the issues you mentioned.

@expez expez added the debugger label Dec 14, 2017
@zsrail

zsrail commented Mar 7, 2018

I had the same issue and found minimal reproduction steps and a workaround.

First, an example without threads:

(ns debug-the-debugger.core)

(defn foo []
  (println "Still going..")
  (Thread/sleep 500))

(defn infinite-loop []
  (repeatedly foo))

When I use cider-debug-defun-at-point to instrument foo and then call infinite-loop at the repl the debugger predictably starts at the first breakpoint in foo. Then, if I repeatedly press n the debugger will loop through foo multiple times, but if I press c foo keeps getting called but the debugger disappears.

This behavior is more likely to be annoying if the loop is being run on another thread that lasts the lifetime of your application.

(ns debug-the-debugger.now-with-threads)

(def keep-running? (atom true))

(defn foo []
  (println "Still going...")
  (Thread/sleep 500))

(defn infinite-loop []
  (while @keep-running?
    (foo)))

(defn main []
  (future (infinite-loop)))

In this example, if I instrument foo, run main at the repl, and then use c at the debugger, I'm not able to debug the calls to foo again unless I restart the thread.

My workaround is to bind *skip-breaks* to (atom nil) before every call to foo.

(ns debug-the-debugger.threads-fixed
  (:require [cider.nrepl.middleware.debug :refer [*skip-breaks*]]))

(def keep-running? (atom true))

(defn foo []
  (println "Still going..")
  (Thread/sleep 500))

(defn infinite-loop []
  (while @keep-running?
    (binding [*skip-breaks* (atom nil)]
      (foo))))

(defn main []
  (future (infinite-loop)))

Now when I use c the debugger will restart in the next call to foo.
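One way to see why the rebinding helps is a toy model of the skip state. This is only a simplified sketch, not the cider-nrepl implementation: `*skip*`, `breakpoint!`, `run-calls` and `run-calls-rebinding` are hypothetical names, and it assumes that c stores a "skip everything" marker in whatever atom is currently bound to the dynamic var.

```clojure
(ns skip-breaks-model.core)

;; Hypothetical stand-in for cider.nrepl.middleware.debug/*skip-breaks*.
(def ^:dynamic *skip* (atom nil))

;; Counts how many times the model debugger actually stopped.
(def hits (atom 0))

(defn breakpoint!
  "Model breakpoint. Stops (counts a hit) unless the bound skip state is :all.
  `command` models the key pressed at the prompt: :next or :continue."
  [command]
  (when-not (= :all @*skip*)
    (swap! hits inc)
    ;; Assumption: c marks the currently bound atom as 'skip everything'.
    (when (= command :continue)
      (reset! *skip* :all))))

(defn run-calls
  "Hits the breakpoint n times with one shared skip atom, like a
  long-lived thread that keeps the binding it started with."
  [n command]
  (dotimes [_ n]
    (breakpoint! command)))

(defn run-calls-rebinding
  "Same loop, but each call gets a fresh skip atom, as in the workaround."
  [n command]
  (dotimes [_ n]
    (binding [*skip* (atom nil)]
      (breakpoint! command))))
```

In this model, answering with :next keeps stopping on every call, a single :continue silences every later call that shares the atom, and rebinding per call limits the effect of :continue to the call in which it was pressed.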

@zsrail

zsrail commented Mar 7, 2018

I played around in cider.nrepl.middleware.debug a bit and was able to fix the issue I was having (i.e. pressing c in the debugger no longer prevents me from debugging future calls by the same thread). I did it by putting skip-breaks atoms in the breakpoints' STATE__ maps and rebinding the global *skip-breaks* var to the local skip-breaks atoms during calls to break.

Here are the functions I changed:

(defmacro with-initial-debug-bindings
  "Let-wrap `body` with STATE__ map containing code, file, line, column etc.
  STATE__ is an anaphoric variable available to all breakpoint macros. Ends with
  __ to avoid conflicts with user locals and to signify that it's an internal
  variable which is cleaned in `sanitize-env' along other clojure's
  temporaries."
  {:style/indent 1}
  [& body]
  ;; NOTE: *msg* is the message that instrumented the function,
  `(let [~'STATE__ {:msg ~(let [{:keys [code id file line column ns]} *msg*]
                            (-> {:code code
                                 ;; Passing clojure.lang.Namespace object
                                 ;; as :original-ns breaks nREPL in bewildering
                                 ;; ways.
                                 :original-id id, :original-ns (str (or ns *ns*))
                                 :file file, :line line, :column column}
                                ;; There's an nrepl bug where the column starts counting
                                ;; at 1 if it's after the first line. Since this is a
                                ;; top-level sexp, a (= col 1) is much more likely to be
                                ;; wrong than right.
                                (update :column #(if (= % 1) 0 %))))
                    :forms @*tmp-forms*
                    :skip-breaks (atom nil)}]
     ~@body))

(defn break
  "Breakpoint function.
  Send the result of form and its coordinates to the client and wait for
  response with `read-debug-command`."
  [coor val locals STATE__]
  (binding [*skip-breaks* (:skip-breaks STATE__)]
    (cond
      (skip-breaks? coor (get-in STATE__ [:msg :code])) val
      ;; The length of `coor` is a good indicator of current code
      ;; depth.
      (= (:mode @*skip-breaks*) :trace)
      (do (print-step-indented (count coor) (get-in STATE__ [:forms coor]) val)
          val)
      ;; Most common case - ask for input.
      :else
      (read-debug-command val (assoc (:msg STATE__)
                                    :debug-value (pr-short val)
                                    :coor coor
                                    :locals locals)))))

However, this introduced another potentially annoying behavior. Now, if I use c while debugging a function, I have to reinstrument the function to debug it a second time (no matter which thread calls it). I think the only way to avoid both behaviors at the same time would be to have the skip-breaks atoms belong to debugging sessions that only last for a single function call, as discussed in clojure-emacs/cider#1054.

benzenwen added a commit to benzenwen/cider-nrepl that referenced this issue Jun 16, 2018
@benzenwen

Here's a fork with @Toad-Racer's workaround benzenwen/cider-nrepl Hattip @Toad-Racer

@leourbina

@vspinu it seems from your comment here that this shouldn't be too hard to fix. Is this something you're planning on spending time on in the near future? Let me know if I can help (I'm totally unfamiliar with cider-nrepl's internals, but willing to learn).

@vspinu
Contributor

vspinu commented Aug 26, 2018

It's not hard, but it needs some careful consideration because all of the related issues (including clojure-emacs/cider#1054) need to be tackled simultaneously. At some point I will have a look (unless someone else beats me to it), but I have no clear idea when. I plan to resume some Clojure projects in September, and if this issue bugs me much I will take a stab at it fairly soon.

@pjstadig
Contributor

pjstadig commented Jan 16, 2019

This was marked as closed, but I'm still experiencing the issue with CIDER 0.20. I've created a repo to reproduce it: https://github.com/pjstadig/cider-debugger-stops-working.

Based on my understanding of the issue, I'm not sure there is an easy fix. I would think maybe the best one could do is leave *skip-breaks* unbound so you would just get an error when trying to debug without a binding, and then provide some utility with-debugging macro to establish a binding? That doesn't seem satisfying, but it is at least not misleading.
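As a rough illustration of that first idea, here is a minimal sketch. It is not cider-nrepl code: `*skip*`, `with-debugging` and `skipping?` are hypothetical names, with `*skip*` standing in for `*skip-breaks*` declared without a root value.

```clojure
(ns unbound-skip-model.core)

;; Hypothetical stand-in for *skip-breaks*, declared with no root value.
(def ^:dynamic *skip*)

(defmacro with-debugging
  "Hypothetical helper: runs body with a fresh skip-state atom bound."
  [& body]
  `(binding [*skip* (atom nil)]
     ~@body))

(defn skipping?
  "Reads the skip state. Throws if no binding has been established,
  because an unbound var's value cannot be dereferenced."
  []
  (= :all @*skip*))
```

Calling `(skipping?)` inside `(with-debugging ...)` returns false, while calling it with no binding in place throws, so a missing binding fails loudly instead of silently skipping breakpoints.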

Another option would be to have the (instrumented?) code check if *skip-breaks* is thread-bound? and if not then bind it.
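A minimal sketch of that check, again with hypothetical names (`*skip*` standing in for `*skip-breaks*`, and `with-skip-state` for whatever the instrumented code would expand to):

```clojure
(ns thread-bound-skip-model.core)

;; Hypothetical stand-in for *skip-breaks*; the root value is unused here.
(def ^:dynamic *skip* nil)

(defmacro with-skip-state
  "Runs body with *skip* bound to a fresh atom, unless the current
  thread already has a thread-local binding for it."
  [& body]
  `(let [run# (fn [] ~@body)]
     (if (thread-bound? #'*skip*)
       (run#)
       (binding [*skip* (atom nil)]
         (run#)))))
```

One caveat: `future` conveys the caller's bindings to the new thread, so a thread started from inside an existing binding would still report the var as thread-bound and keep sharing the original atom. The check on its own would mainly help threads that start with no binding at all.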

Aside from that, I guess one would have to find a way to manage state that doesn't involve bindings and atoms.

Apologies if there's still on-going work on this, or if there are already better ideas (I know there's been discussion about it across several different issues). I may dig into this a bit more, so if there are any ideas about what should happen, I'd be interested in them.

@holtzermann17

I cloned the repo from @pjstadig and followed the instructions, after adding :repl {:plugins [[cider/cider-nrepl "0.24.0-SNAPSHOT"]]} to :profiles. I couldn't reproduce the problem using a current CIDER! Nevertheless, I find that in practice the CIDER debugger does stop working for me when I am trying to debug a larger web app.

I don't have a reproduction case - perhaps it's rather challenging to capture one, because it silently fails. If someone can give me advice on how to capture useful information about the state of the debugger, I will add more information here.

CIDER 0.24.0snapshot (package: 20200208.809), nREPL 0.6.0
