Added support for Jais models #3183

Merged
merged 51 commits into from
Mar 21, 2024

grandiose-pizza
Contributor

@grandiose-pizza grandiose-pizza commented Mar 4, 2024

Jais models are pretrained on a curated Arabic and English text corpus and fine-tuned on curated Arabic and English prompt-response pairs. They were trained from scratch by Core42 in partnership with MBZUAI and Cerebras Systems on the Condor Galaxy supercomputer. The architecture is a transformer-based, decoder-only (GPT-3 style) model that uses the SwiGLU non-linearity. It implements ALiBi position embeddings, which let the model extrapolate to longer sequence lengths and improve context handling and model precision.
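As a rough illustration of the ALiBi position embeddings mentioned above, here is a minimal pure-Python sketch of the bias scheme from the original ALiBi paper, restricted to a power-of-two number of heads. The function names are hypothetical, and the actual Jais implementation (and its vLLM port) may differ in detail.

```python
import math


def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head slopes 2^(-8*i/n) for i = 1..n (power-of-two head counts only)."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles a power-of-two head count")
    return [2.0 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)]


def alibi_bias(slope: float, seq_len: int) -> list[list[float]]:
    """Causal bias for one head, added to the attention scores before softmax.

    Entry [q][k] is -slope * (q - k) for keys at or before the query position,
    and -inf for future keys (the causal mask).
    """
    return [
        [-slope * (q - k) if k <= q else -math.inf for k in range(seq_len)]
        for q in range(seq_len)
    ]
```

Because the penalty depends only on the distance between query and key, no learned position table is needed, which is what allows evaluation at sequence lengths longer than those seen in training.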

The research can be studied here: https://arxiv.org/pdf/2308.16149.pdf

Metrics can be found here: https://huggingface.co/core42/jais-30b-chat-v3

These are SoTA Arabic-English bilingual models. The playground can be accessed at https://arabic-gpt.ai/

@grandiose-pizza grandiose-pizza changed the title from "Added support for Jais amodels" to "Added support for Jais models" Mar 4, 2024
@grandiose-pizza
Contributor Author

Not ready to be merged yet. Bugs found; fixes to be done.

@robertgshaw2-redhat
Collaborator

Nice PR! I left a few cosmetic changes, plus some ideas for slightly improving the models' performance via fusion in the MLP and a fused activation function.
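To make the fusion idea concrete, here is a small pure-Python sketch, not the PR's code. It assumes a SwiGLU-style MLP with two input projections, stacks their weights so a single matrix-vector product feeds a combined activation, and produces the same result as the unfused version. All names (`w_gate`, `w_up`, `swiglu_fused`, and so on) are hypothetical.

```python
import math


def matvec(weight: list[list[float]], x: list[float]) -> list[float]:
    """Plain matrix-vector product; each row of `weight` yields one output."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def silu(v: float) -> float:
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu_unfused(w_gate, w_up, x):
    """Two separate projections followed by an elementwise silu(gate) * up."""
    gate = matvec(w_gate, x)
    up = matvec(w_up, x)
    return [silu(g) * u for g, u in zip(gate, up)]


def swiglu_fused(w_gate_up, x):
    """One projection over the stacked weights, then a combined activation."""
    out = matvec(w_gate_up, x)
    half = len(out) // 2  # first half is the gate, second half is the up branch
    return [silu(g) * u for g, u in zip(out[:half], out[half:])]
```

On a GPU the benefit comes from launching one larger matmul and one activation kernel instead of several smaller ones; the arithmetic itself is unchanged.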

@grandiose-pizza grandiose-pizza changed the title from "Added support for Jais models" to "[WIP] Added support for Jais models" Mar 5, 2024
@esmeetu esmeetu added the "new model" (Requests to new models) label Mar 5, 2024
@7ossam81

@robertgshaw2-neuralmagic could you please check the status of the PR?

@esmeetu
Collaborator

esmeetu commented Mar 21, 2024

@grandiose-pizza Sorry for the delay of a few days. Please merge the latest update; you can now just read the model's scale parameter from the config and pass it to LogitsProcessor, without a custom Sampler.
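A minimal sketch of that pattern, using stand-in classes rather than vLLM's real ones: the model reads a scale from its config once at construction time and hands it to a logits processor, so no custom sampler is needed. `ToyConfig`, `ToyLogitsProcessor`, `ToyModel`, and the attribute name `logits_scale` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    # Hypothetical config; the attribute name `logits_scale` is an assumption.
    vocab_size: int
    logits_scale: float = 1.0


class ToyLogitsProcessor:
    """Stand-in that multiplies logits by a constant scale (not vLLM's class)."""

    def __init__(self, vocab_size: int, scale: float = 1.0) -> None:
        self.vocab_size = vocab_size
        self.scale = scale

    def __call__(self, logits: list[float]) -> list[float]:
        if len(logits) != self.vocab_size:
            raise ValueError("logits length must match vocab_size")
        return [self.scale * value for value in logits]


class ToyModel:
    """Reads the scale from the config and delegates scaling to the processor."""

    def __init__(self, config: ToyConfig) -> None:
        scale = getattr(config, "logits_scale", 1.0)  # default when absent
        self.logits_processor = ToyLogitsProcessor(config.vocab_size, scale=scale)

    def compute_logits(self, raw_logits: list[float]) -> list[float]:
        return self.logits_processor(raw_logits)
```

Keeping the scale inside the logits step means the sampler stays generic and can be shared across models.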

@grandiose-pizza
Contributor Author

grandiose-pizza commented Mar 21, 2024

> @grandiose-pizza Sorry for the delay of a few days. Please merge the latest update; you can now just read the model's scale parameter from the config and pass it to LogitsProcessor, without a custom Sampler.

Hi @esmeetu ,

I have updated the code to adapt to #3233. Tested in both multi-GPU and single-GPU settings; it works as expected.

However, there was a small oversight in your PR, and I have included a fix for it in this push.

I am running into an error for gpt2. I think you may have missed updating next_tokens properly in gpt2.py only; the rest is fine. I have included this fix in this PR. I hope this is okay:

Here is the original bug.
The model's call to the sampler's forward function passes three values:

    def sample(
        self,
        logits: torch.Tensor,
        sampling_metadata: SamplingMetadata,
    ) -> Optional[SamplerOutput]:
        next_tokens = self.sampler(self.lm_head_weight, logits,
                                   sampling_metadata)
        return next_tokens

But the sampler forward function itself accepts two values:

    def forward(
        self,
        logits: torch.Tensor,
        sampling_metadata: SamplingMetadata,
    ) -> Optional[SamplerOutput]:
        assert logits is not None
        _, vocab_size = logits.shape

I am getting the following error due to this:
TypeError: forward() takes 3 positional arguments but 4 were given
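The mismatch can be reproduced in isolation with a stand-in class (hypothetical, not vLLM's `Sampler`): `forward` takes `self` plus two arguments, so a call site that still passes three triggers the same error, and dropping the stale first argument resolves it. That this matches the exact change made in gpt2.py is an assumption based on the snippets above.

```python
class StandInSampler:
    """Hypothetical stand-in with the two-argument forward shown above."""

    def forward(self, logits, sampling_metadata):
        assert logits is not None
        return ("sampled", len(logits))

    def __call__(self, *args):
        # Mimics a module-style call that forwards all arguments to forward().
        return self.forward(*args)


sampler = StandInSampler()
lm_head_weight, logits, sampling_metadata = object(), [0.1, 0.9], {}

# Stale call site: three arguments plus self makes four positional arguments.
stale_error = None
try:
    sampler(lm_head_weight, logits, sampling_metadata)
except TypeError as exc:
    stale_error = str(exc)

# Updated call site: only the logits and the sampling metadata are passed.
next_tokens = sampler(logits, sampling_metadata)
```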

Collaborator

@esmeetu esmeetu left a comment

Thanks for catching that! LGTM.

@esmeetu esmeetu enabled auto-merge (squash) March 21, 2024 08:31
@esmeetu esmeetu merged commit 4c07dd2 into vllm-project:main Mar 21, 2024
32 checks passed
tjohnson31415 added a commit to tjohnson31415/vllm that referenced this pull request Mar 21, 2024
* upstream/main:
  [Misc] Bump up transformers to v4.39.0 & Remove StarCoder2Config (vllm-project#3551)
  [Misc][Log] Add log for tokenizer length not equal to vocabulary size (vllm-project#3500)
  [🚀 Ready to be merged] Added support for Jais models (vllm-project#3183)
  Fix 1D query issue from `_prune_hidden_states` (vllm-project#3539)
  [PREFIX CACHING FOLLOW UP] OrderedDict-based evictor (vllm-project#3431)
  [BugFix] Hot fix in setup.py for neuron build (vllm-project#3537)
  Migrate `logits` computation and gather to `model_runner` (vllm-project#3233)
  [1/n][Chunked Prefill] Refactor input query shapes (vllm-project#3236)
  [1/n] Triton sampling kernel (vllm-project#3186)
  [Bugfix] Fix ROCm support in CMakeLists.txt (vllm-project#3534)
@grandiose-pizza grandiose-pizza mentioned this pull request Mar 31, 2024
3 tasks
@grandiose-pizza grandiose-pizza changed the title from "[🚀 Ready to be merged] Added support for Jais models" to "Added support for Jais models" Apr 8, 2024
Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024
Labels
new model Requests to new models
6 participants