Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Base R plot support #2

Closed
kylebutts opened this issue Oct 12, 2023 · 10 comments
Closed

Base R plot support #2

kylebutts opened this issue Oct 12, 2023 · 10 comments

Comments

@kylebutts
Copy link
Owner

kylebutts commented Oct 12, 2023

@grantmcdermott I'm wondering if I could pick your brain about base R plot support (or more likely, using plot2). I've added support for fixest_multi usage (multiple y variables and/or split/fsplit samples). Now you can replace feols with fwlplot and it will just work :-)

I have the following dataset ready to plot from previous calls (see bottom for example of what the dataset would look like). If there is no split/fsplit, then sample is a character column of empty strings "".

For plotting, if there is both I do facet_grid(rows = var, cols = sample) if the estimation is split and multiple y variables. If there is only 1 of split or multiple y, then I use facet_wrap. Otherwise, what we have.

I saw your most recent comment here: grantmcdermott/tinyplot#76, and was wondering if this is something that might be easily possible in plot2?

         sample   idx    var       y_resid       x_resid           fit           lwr           upr
         <char> <int> <char>         <num>         <num>         <num>         <num>         <num>
 7:     cyl = 4     9   disp  2.291529e+01  4.627560e-01  2.286022e+01  1.229056e+01  3.342988e+01
 8:     cyl = 4     8   disp  5.531900e+01  1.039647e+00  5.135873e+01  3.308987e+01  6.962759e+01
 9:     cyl = 6     1   disp  8.526513e-12 -1.275000e-01  8.540724e-12  8.465837e-12  8.615610e-12
10:     cyl = 6    10   disp  8.554935e-12  2.042810e-14  8.540724e-12  8.497488e-12  8.583959e-12
11:     cyl = 6    11   disp  8.554935e-12  2.042810e-14  8.540724e-12  8.497488e-12  8.583959e-12
12:     cyl = 6     2   disp  8.526513e-12  1.275000e-01  8.540724e-12  8.465837e-12  8.615610e-12
13: Full sample     1   disp  1.202142e+01 -3.645965e-01 -1.538589e+01 -2.786976e+01 -2.902013e+00
23: Full sample     9   disp  1.108624e+01  4.346205e-01  1.834088e+01  4.748985e+00  3.193278e+01
24: Full sample     8   disp  5.716882e+01  1.066898e+00  4.502284e+01  1.904455e+01  7.100112e+01
25:     cyl = 4    19    mpg -8.817797e-01 -3.726589e-01  1.349912e+00 -1.675924e+00  4.375747e+00
31:     cyl = 4     9    mpg -6.758828e-01  4.627560e-01 -1.676277e+00 -5.002769e+00  1.650214e+00
32:     cyl = 4     8    mpg -5.066455e+00  1.039647e+00 -3.765994e+00 -9.515585e+00  1.983596e+00
33:     cyl = 6    10    mpg  7.000000e-01  2.042810e-14  1.477903e-12 -8.894343e+00  8.894343e+00
34:     cyl = 6    11    mpg -7.000000e-01  2.042810e-14  1.477903e-12 -8.894343e+00  8.894343e+00
35:     cyl = 6     2    mpg  8.668621e-13  1.275000e-01  8.668021e-13 -1.257850e+01  1.257850e+01
36: Full sample     3    mpg -1.091027e+00 -3.594840e-01  1.343803e+00 -8.147531e-01  3.502358e+00
37: Full sample    20    mpg  4.979079e+00 -3.419456e-01  1.283776e+00 -8.296313e-01  3.397184e+00
         sample   idx    var       y_resid       x_resid           fit           lwr           upr
@grantmcdermott
Copy link
Contributor

Hi Kyle.

The tl;dr read is that, yes, facets are on the roadmap for plot2. I haven't written down the formal implementation, but I believe it should work following the basic format that I outlined in the comment you linked too. TBH I hadn't considered facet_grid support yet and was mostly thinking i.t.o. facet_wrap... But I guess both should be possible.

One UI thing I'd like feedback on is potential formula method support and implementation. My current thinking is that we could denote facets after a second vertical bar, e.g. y ~ x | colour_var | facet_var. You could expand to grids via a two-part formula, y ~ x | colour_var | grid_var1 ~ grid_var2.

I can't make any promises RE timeline. But my goal is to finish this type of feature support in time for a CRAN submission before 2023 is out. I may go ahead and merge the current PR since that's the current bottleneck for further development rn.

@grantmcdermott
Copy link
Contributor

grantmcdermott commented Oct 12, 2023

OTOH if you're in a rush, you may only need plot2 if you are worried about a nice legend (potentially irrelevant here?). Quick mockup below using your dataset. I'm still using plot2 here, but only for the ribbon part since I'm lazy I don't feel like going the manual polygon route.

dat = data.table::fread('~/Desktop/kyle.csv')
sdat = split(dat, list(dat$var, dat$sample))

par(mfcol = c(length(unique(dat$var)), length(unique(dat$sample))))

invisible(lapply(
    seq_along(sdat),
    \(i) {
        # plot points
        with(
            sdat[[i]],
            plot(
                x = x_resid, y = y_resid,
                pch = 16,
                xlim = range(dat$x_resid), ylim = range(c(dat$upr, dat$lwr)),
                frame.plot = FALSE, axes = FALSE, ann = FALSE
            )
        )
        # add grid
        grid()
        # facet details for some label calcs
        facets = par("mfcol") 
        n_facets = Reduce(`*`, par("mfcol"))
        # x axis
        if (i %% facets[1] == 0) {
            axis(1, lty = 0)
            title(xlab = "Residualized x")
        }
        # y axis
        if (i %in% 1:facets[1]) {
            axis(2, lty = 0, las = 1)
            title(ylab = "Residualized y")
        }
        # 1st facet labels
        if (i %% facets[1] == 1) {
            mtext(sub(".*\\.", "", names(sdat))[i], side = 3)
        }
        # 2 facet labels
        if (i %in% c(n_facets, n_facets-1)) {
            mtext(sub("\\..*", "", names(sdat))[i], side = 4, crt = 180)
        }
        
        # add best fit ribbon (here, using plot2)
        with(
            sdat[[i]], 
            plot2::plot2(
                x = x_resid, y = fit, ymin = lwr, ymax = upr, 
                type = "ribbon", col = "darkgreen",
                add = TRUE
                )
        )
    }
))

Created on 2023-10-12 with reprex v2.0.2

@kylebutts
Copy link
Owner Author

This is really great Grant! I didn't know about par(mfcol). Really should dive into base R plotting at some point...

@kylebutts
Copy link
Owner Author

Completed by 71eb918

@grantmcdermott
Copy link
Contributor

Just a HU that plot2 v0.0.4 (just released on R-universe) includes full-featured facet support. One nice thing compared to your current bespoke solution is that it correctly spaces and scales the axes titles relative to the individual plot windows. So you don't end up with the left-most plots being "thinner" as a result of axes printing.

library(plot2)

# resids dataset obtained by running debugonce(fwlplot:::plot_resids_base_r)
# and writing to disk
resids = read.csv('~/Desktop/resids.csv')

resids |>
  with(plot2(
    x = x_resid, y = y_resid,
    facet = sample ~ var,
    facet.args = list(bg = "grey90"),
    pch  = 21, fill = adjustcolor(1, alpha = 0.1),
    frame = FALSE, grid = TRUE
  ))
resids |>
  with(plot2(
    x = x_resid, y = fit, ymin = lwr, ymax = upr,
    type = "ribbon", col = "dodgerblue",
    facet = sample ~ var,
    add = TRUE
  ))

Created on 2024-01-26 with reprex v2.0.2

(Obvs looks a bit silly here b/c of reprex's squashed plot aspect ratio. But you can see that things at least look proportional.)

We could probably port that logic here, or I can flag once plot2 hits CRAN.

@kylebutts
Copy link
Owner Author

Thanks for the heads up. This is great. Since there's going to be basically 0 more features added on this, I'll port over to plot2's feature and then push to CRAN when you do.


FYI, with reprex you can use # %%/#' since it uses knitr::spin under the hood. e.g.

library(fixest)
library(plot2)

aq = airquality
aq$Month = factor(month.abb[aq$Month], levels = month.abb[5:9])

mod = lm(Temp ~ 0 + Month / Day, data = aq)
aq = cbind(aq, predict(mod, interval = "confidence"))

# %% fig.width = 8, fig.height = 6, dpi = 300
# Plot the original points 
with(
  aq,
  plot2(
    x = Day, y = Temp,
    by = Month,
    facet = "by", facet.args = list(bg = "grey90"),
    palette = "dark2",
    grid = TRUE, frame = FALSE, ylim = c(50, 100),
    main = "Actual and predicted air temperatures"
  )
)

yields

library(fixest)
library(plot2)
#> Warning: package 'plot2' was built under R version 4.3.2

aq = airquality
aq$Month = factor(month.abb[aq$Month], levels = month.abb[5:9])

mod = lm(Temp ~ 0 + Month / Day, data = aq)
aq = cbind(aq, predict(mod, interval = "confidence"))
# Plot the original points 
with(
  aq,
  plot2(
    x = Day, y = Temp,
    by = Month,
    facet = "by", facet.args = list(bg = "grey90"),
    palette = "dark2",
    grid = TRUE, frame = FALSE, ylim = c(50, 100),
    main = "Actual and predicted air temperatures"
  )
)
# Add the model predictions to the same plot 
with(
  aq,
  plot2(
    x = Day, y = fit,
    ymin = lwr, ymax = upr,
    by = Month, facet = "by",
    type = "ribbon",
    palette = "dark2",
    add = TRUE
  )
)

Created on 2024-01-26 with reprex v2.1.0

@kylebutts kylebutts reopened this Feb 17, 2024
@kylebutts
Copy link
Owner Author

Made the switch! I'll try to keep an eye out when tinyplot gets put on CRAN @grantmcdermott.

@grantmcdermott
Copy link
Contributor

Now on CRAN :-)
https://cran.r-project.org/package=tinyplot

@kylebutts
Copy link
Owner Author

Congrats on getting it through to CRAN, Grant!

@grantmcdermott
Copy link
Contributor

Congrats on the new CRAN release. We can close, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants