Hi everyone, first of all, thank you for your work, this project is quite amazing!
I have had the occasion to test the constrained generation feature of llama.cpp, using a grammar at inference time, and it works perfectly.
However, I have recently been thinking about a hypothetical new feature: would it be possible to use a grammar while training a model? I looked for similar proposals with an "enhancement" tag, but couldn't find one.
For example, my finetuning problem is a multilabel (each label binary) text classification task, so the JSON format is convenient for the LLM's answers: I can have one field per label, with the boolean value predicted by the model.
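For concreteness, the grammar I use is roughly the following GBNF (the label names `label_a` / `label_b` are just placeholders for my real labels):

```gbnf
# Hypothetical grammar for two binary labels; real label names would replace label_a / label_b.
root ::= "{" ws "\"label_a\":" ws bool "," ws "\"label_b\":" ws bool ws "}"
bool ::= "true" | "false"
ws   ::= [ \t\n]*
```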
The thing is, using the grammar only at inference, when I have access to the training conditions, seems suboptimal to me: it perturbs the token probability distribution of a model that is not "used" to it. So in terms of training efficiency (QLoRA finetuning), this would surely cost a fair amount of accuracy compared to a model that had already encountered the grammar constraints during training.
In addition, from a strictly computational perspective, could this also be a way to speed up training? What I mean is that backpropagation would not need to be applied to every token in the target, but only to those where the model truly intervenes, which would greatly reduce the number of related function calls. I am not sure about this part, however, as I am not a specialist in causal LM training, so I don't know whether such a "discontinuous" application of backpropagation would be realistic.
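To make the idea concrete, here is a minimal PyTorch-style sketch of what I imagine, assuming a hypothetical `allowed_mask` tensor obtained by running the grammar over the target sequence (no such helper exists in llama.cpp or elsewhere today; producing it is exactly the part that would need to be implemented):

```python
# Minimal sketch only: not an existing llama.cpp or Hugging Face API.
import torch
import torch.nn.functional as F

def grammar_constrained_loss(logits, labels, allowed_mask):
    """Cross-entropy computed over grammar-allowed tokens only.

    logits:       (batch, seq, vocab) float - model outputs
    labels:       (batch, seq)        long  - target token ids
    allowed_mask: (batch, seq, vocab) bool  - True where the grammar allows a token
                  (hypothetical: would come from running the grammar over the targets)
    """
    # Train on the same restricted distribution the grammar imposes at inference:
    # forbidden tokens get -inf logits, i.e. zero probability after softmax.
    masked_logits = logits.masked_fill(~allowed_mask, float("-inf"))

    # Positions where the grammar leaves exactly one legal token are "forced":
    # the model has no real choice there, so they are dropped from the loss.
    # This is where the potential compute saving mentioned above would come from.
    forced = allowed_mask.sum(dim=-1) == 1
    targets = labels.masked_fill(forced, -100)  # -100 = ignore_index for cross_entropy

    return F.cross_entropy(
        masked_logits.transpose(1, 2),  # cross_entropy expects (batch, vocab, seq)
        targets,
        ignore_index=-100,
    )
```

Whether skipping the forced positions would actually translate into a backprop speedup, rather than just a smaller loss term, is precisely the part I am unsure about.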
Thank you for your time!