Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python] Neptune LightGBM callback yields an error with the new logic introduced in v3.3.0 #4719

Closed
aptlin opened this issue Oct 26, 2021 · 12 comments · Fixed by #4723
Closed
Labels

Comments

@aptlin
Copy link

aptlin commented Oct 26, 2021

Description

Neptune-LightGBM integration is broken due to changes in handling the callbacks in v3.3.0.

Reproducible example

  1. Install the dependencies
pip install lightgbm==3.3.0 neptune-client==0.10.10 neptune-lightgbm==0.9.13
  1. Run the example:
import copy
from neptune.new.integrations.lightgbm import NeptuneCallback as LGBMNeptuneCallback
import neptune.new as npt
import numpy as np
import pandas as pd
from scipy.special import expit

import lightgbm as lgb

#################
# Taken from https://github.com/microsoft/LightGBM/blob/master/examples/python-guide/logistic_regression.py
# Simulate some binary data with a single categorical and
#   single continuous predictor
np.random.seed(0)
N = 1000
X = pd.DataFrame({
    'continuous': range(N),
    'categorical': np.repeat([0, 1, 2, 3, 4], N / 5)
})
CATEGORICAL_EFFECTS = [-1, -1, -2, -2, 2]
LINEAR_TERM = np.array([
    -0.5 + 0.01 * X['continuous'][k]
    + CATEGORICAL_EFFECTS[X['categorical'][k]] for k in range(X.shape[0])
]) + np.random.normal(0, 1, X.shape[0])
TRUE_PROB = expit(LINEAR_TERM)
Y = np.random.binomial(1, TRUE_PROB, size=N)
#################

run = npt.init(mode='debug')
callback = LGBMNeptuneCallback(run)
callbacks = [callback]
model = lgb.LGBMClassifier()
model.fit(X, Y, callbacks=callbacks)

This fails with

TypeError: cannot pickle '_io.TextIOWrapper' object

Environment info

LightGBM version: 3.3.0

Additional Comments

neptune-lightgbm version: 0.9.13
neptune-client version: 0.10.10

The reason why this happens is because of using a deepcopy for callbacks:

callbacks = copy.deepcopy(callbacks)

Indeed, the error is the same for the code below:

import copy
copy.deepcopy(callbacks)
@jameslamb jameslamb changed the title Neptune LightGBM callback yields an error with the new logic introduced in v3.3.0 [python] Neptune LightGBM callback yields an error with the new logic introduced in v3.3.0 Oct 26, 2021
@jameslamb
Copy link
Collaborator

Thanks for the report! Just a small note, I've modified your initial description to include a link that is anchored to a specific commit. That way, that link will continue to point to the code you're talking about, even if the file is changed.

In case you're not familiar with this feature of GitHub, you can see https://docs.github.com/en/repositories/working-with-files/using-files/getting-permanent-links-to-files#press-y-to-permalink-to-a-file-in-a-specific-commit.

@jameslamb jameslamb added the bug label Oct 26, 2021
@aptlin
Copy link
Author

aptlin commented Oct 26, 2021

Thanks! It was pinned to the v3.3.0 tag, so that should have been fine

@jameslamb
Copy link
Collaborator

It was pinned to the v3.3.0 tag, so that should have been fine

Ah, sorry! I probably just misread that. 😂

changes in handling the callbacks in v3.3.0

Looking at the blame, I can see the change you're talking about came from #4574, specifically.

I'm not sure if we've ever discussed this question:

do all callbacks need to be serializable?

Based on the usage I see in lightgbm.train(), I don't think they're stored in the Booster object and therefore wouldn't be persisted if you pickle that object:

# process callbacks

That copy.deepcopy() performed in the point you've linked to (added in #4574) is there because lightgbm appends some other callback functions internally, and we wanted to avoid having side-effects on the user-provided list. I think that's desirable behavior.

Can you think of a way to achieve both the behavior "lightgbm training does not have side effects on the list of callbacks you pass in" AND "lightgbm training can accept items in callbacks that are not serializable"?

If not, maybe the solution here is a change in neptune instead of lightgbm.

@jameslamb jameslamb added question and removed bug labels Oct 26, 2021
@aptlin
Copy link
Author

aptlin commented Oct 26, 2021

Sounds strange why we should limit the callbacks to be serializable objects. What's the reasoning for that other than avoiding side effects (which are desirable given the Neptune integration, since we want to send metrics during the training process)?

@aptlin
Copy link
Author

aptlin commented Oct 26, 2021

Tagging neptune integration contributors who might be interested in this issue.
cc: @shnela @PiotrJander

@StrikerRUS
Copy link
Collaborator

Actually, we deepcopy callbacks because internally some deprecated args are transformed into callbacks (this will be removed in the future) and we append one more callback

evals_result = {}
callbacks.append(record_evaluation(evals_result))

Given the list mutability it is unacceptable to append to the originally provided callback argument. Also, there can be some critical crashes in the sklearn-wrapper specifically due to this reason. For example, refer to
# Do not modify original args in fit function
# Refer to https://github.com/microsoft/LightGBM/pull/2619
eval_metric_list = copy.deepcopy(eval_metric)

#2610.

But I guess (requires more investigation) that for this particular case with callback shallow copy will be enough. @aptlin Will this fit your (Neptune's) needs?

@aptlin
Copy link
Author

aptlin commented Oct 26, 2021

Yes, thanks! The only critical thing was just passing the neptune integration as one of the callbacks.

copy.copy(callbacks) with callbacks as in the example I gave works fine

@StrikerRUS
Copy link
Collaborator

@jameslamb Don't you mind including this into the upcoming 3.3.1 bug fixing release?

@jameslamb
Copy link
Collaborator

Don't you mind including this into the upcoming 3.3.1 bug fixing release

@StrikerRUS sure, seems ok to me, as long as it can be done quickly. I don't want to delay too long, since I think getting the R package back up on CRAN should be treated as a fairly urgent concern.

Actually, we deepcopy callbacks because ... we append one more callback
Given the list mutability it is unacceptable to append

To be clear, this is what I meant by "wanted to avoid having side-effects on the user-provided list". Thanks for adding a bit more detail there in case my language wasn't clear!

@aptlin
Copy link
Author

aptlin commented Oct 26, 2021

Right, so I think all it would take is just replacing deepcopy with copy here

callbacks = copy.deepcopy(callbacks)

@StrikerRUS will you create a PR please?

@StrikerRUS
Copy link
Collaborator

Yeah, I can do this today.

@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants