Issues: huggingface/transformers
- #35352 [bug] Maybe the way SequenceClassification Model calculates the last non-pad token is not reasonable (opened Dec 20, 2024 by liangxuZhang)
- #35350 [bug] SinkCache (StreamLLM) implemented over the post-RoPE key cache might result in confused positions during inference (opened Dec 19, 2024 by wangguangtao0722)
- #35349 [bug] A warning message that MultiScaleDeformableAttention.so is not found in /root/.cache/torch_extensions if ninja is installed with transformers (opened Dec 19, 2024 by cainmagi)
- #35346 [Feature request] [Mamba2] Varlen implementation (opened Dec 19, 2024 by vasqu)
- #35343 [bug] Llama model: torch.compile output for a custom device does not match eager/CPU when generation_config.use_cache is set to True (opened Dec 19, 2024 by vpandya-quic)
- #35337 [Feature request] Option to disable model caching when using "pipeline" (opened Dec 19, 2024 by FadiAmon)
- #35335 [bug] Default arguments in DebertaConfig disable relative attention, contrary to the docs and deberta-base (opened Dec 19, 2024 by bauwenst)
- #35332 [bug] DeBERTa's DisentangledSelfAttention hardcodes the float dtype, which causes a bfloat16 overflow error (opened Dec 19, 2024 by bauwenst)
- #35330 [bug] Tokenizer decode with timestamps fails for extended vocabulary (opened Dec 18, 2024 by bnestor)
- #35327 [ExecuTorch, Feature request] InternVL is ExecuTorch compatible (opened Dec 18, 2024 by guangy10)
- #35326 [bug] Unable to convert llama 3.3 weights to hf.py (opened Dec 18, 2024 by AshishMulupuri)
- #35315 [bug] train_new_from_iterator() does not work when pre_tokenizer is null (opened Dec 18, 2024 by cecheta)
- #35311 [bug] Unclear what happens when using torchrun, multi-GPU, and trainer arguments (opened Dec 17, 2024 by davies-w)
- #35308 [bug] Multi-GPU training crashes with IterableDataset and different-length inputs (e.g. next-token prediction) (opened Dec 17, 2024 by avishaiElmakies)
- #35298 [trainer] [Question] Why doesn't trainer.state.epoch land on a round number after training? (opened Dec 16, 2024 by qgallouedec)
- #35290 [bug] Custom 4D tensor causes shape mismatch error (opened Dec 16, 2024 by fingertap)
- #35286 [bug] Version 4.47.0 produces different generation results when using a quantized AWQ model (opened Dec 16, 2024 by xin3he)
- #35283 [New model, Vision, contributions-welcome, Good Second Issue] Request to add D-FINE (opened Dec 15, 2024 by brockt96)
- #35282 [Feature request] Qwen2vl support for GGUF (opened Dec 15, 2024 by cheald)
- #35280 [bug, Vision] Vision models don't work for non-square objects (opened Dec 15, 2024 by liujilei156231)