
[Not a bug] Support for fine-tuning codestral-mamba? #87

Open
ciscodoojung opened this issue Aug 5, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@ciscodoojung

Python Version

n/a

Pip Freeze

n/a

Reproduction Steps

n/a

Expected Behavior

n/a

Additional Context

Hi, first of all, thank you all for your help.

I have been trying to use this code (mistral-finetune) to fine-tune the codestral-mamba model (https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1), but it seems the code does not expect the Mamba architecture. Is there a plan to support fine-tuning codestral-mamba in this repo, or are there other tools we can use to fine-tune it?

Your response is greatly appreciated! Thank you!

Suggested Solutions

No response

@ciscodoojung ciscodoojung added the bug Something isn't working label Aug 5, 2024
@acodercat

+1

@ciscodoojung
Author

Just to update the thread (not related to this repo).

Hugging Face transformers v4.44.0 was released yesterday with support for training codestral-mamba.
Release: https://github.com/huggingface/transformers/releases/tag/v4.44.0
PR: huggingface/transformers#32080

Hope this helps! Thank you.
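
For anyone landing here, below is a minimal sketch of what fine-tuning could look like with transformers >= 4.44.0 and the standard `Trainer` API. The toy in-memory dataset, hyperparameters, and pad-token fallback are illustrative assumptions, not a vetted recipe; adapt them to your own data and hardware.

```python
# Sketch: causal-LM fine-tuning of Mamba-Codestral with Hugging Face transformers >= 4.44.0.
# The dataset and hyperparameters below are placeholders for illustration only.
import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "mistralai/Mamba-Codestral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    # Fallback if the checkpoint ships without a pad token (assumption).
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Toy corpus standing in for a real code dataset.
texts = [
    "def add(a, b):\n    return a + b",
    "def sub(a, b):\n    return a - b",
]
dataset = Dataset.from_dict({"text": texts})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mamba-codestral-finetuned",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal) language modeling labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Note this targets the Hugging Face path mentioned above, not this repository's training pipeline.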
