Question #1

Open
naymaraq opened this issue May 20, 2024 · 10 comments

Comments

@naymaraq

Hi,

I have a couple of questions about the challenge:

  1. Can we apply spelling correction before using a speaker correction model?
  2. Is there a leaderboard available?
  3. Can we use non-open source LLMs like GPT-4?
  4. Do we need to change only the speaker tags, or can we change the words as well?
@naymaraq
Author

Also, I am unable to achieve any improvement in cpWER with the provided baseline beam search, compared to the cpWER obtained when no correction is made.

@tango4j
Owner

tango4j commented May 25, 2024

  1. Yes, you can make any type of correction before feeding the text into LLMs or the beam search decoder.
    This is because it is very challenging to force LLMs to fix only the speaker tagging without changing the word tokens. However, you are responsible for any WER degradation that the correction causes.

  2. We haven't set up a leaderboard yet, but if there is demand, we can consider opening it.

  3. Yes. You can use any type of LLM, as long as you state the prompting method in the technical paper.

  4. I guess this is similar to Q1. You are allowed to change words. This is also already mentioned in the description.
    You are even allowed to use Track 1's system to correct the ASR errors.
    However, you should be aware that some ground-truth files were only corrected for speaker tags, without fixing the words, so fixing the words could sometimes hurt cpWER. Having said that, there is no restriction on correcting the words.
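
For readers unfamiliar with the metric, the point about word fixes hurting cpWER can be illustrated with a small sketch. cpWER (concatenated minimum-permutation WER) concatenates each speaker's words, tries every mapping between reference and hypothesis speakers, and reports the lowest total word error count divided by the number of reference words. The code below is only an illustration under that definition, not the challenge's official scorer; the function names `word_edit_distance` and `cpwer` and the toy transcripts are hypothetical, and the brute-force permutation search is only practical for a handful of speakers.

```python
from itertools import permutations


def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1]


def cpwer(ref_by_spk, hyp_by_spk):
    """Brute-force cpWER sketch.

    Both arguments map a speaker label to that speaker's concatenated
    transcript (a plain string). Labels need not match across the two.
    """
    refs = [text.split() for text in ref_by_spk.values()]
    hyps = [text.split() for text in hyp_by_spk.values()]
    # Pad with empty speakers so both sides have the same speaker count.
    n = max(len(refs), len(hyps))
    refs += [[]] * (n - len(refs))
    hyps += [[]] * (n - len(hyps))
    total_ref_words = sum(len(r) for r in refs)
    if total_ref_words == 0:
        raise ValueError("reference contains no words")
    # Try every speaker mapping and keep the one with the fewest errors.
    best_errors = min(
        sum(word_edit_distance(r, hyps[k]) for r, k in zip(refs, perm))
        for perm in permutations(range(n))
    )
    return best_errors / total_ref_words


# Toy reference in which the words are left as the ASR produced them.
ref = {"A": "hello how are you", "B": "i am fine thanks"}

# One word ("you") attributed to the wrong speaker.
print(cpwer(ref, {"spk0": "hello how are", "spk1": "you i am fine thanks"}))  # 0.25
# Speaker tags fixed, words untouched.
print(cpwer(ref, {"spk0": "hello how are you", "spk1": "i am fine thanks"}))  # 0.0
# Speaker tags fixed, but a word was rewritten away from the reference.
print(cpwer(ref, {"spk0": "hello how are you", "spk1": "I'm fine thanks"}))   # 0.25
```

In the last case the speaker tags are perfect, yet rewriting "i am" as "I'm" costs as much as the original tagging error, because the reference still contains the uncorrected words.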

@tango4j
Owner

tango4j commented May 25, 2024

The challenge organizers decided to exclude the audio source from this challenge after we had chosen the baseline system.
We also realized that the baseline does not improve results on the given dataset.

We will release the list of the subset of files that the baseline system improves. We are also planning to upload another baseline. Until then, please treat the baseline code as a tool for checking input/output.

@naymaraq
Author

Thanks for the answers, @tango4j

@tango4j tango4j reopened this Jun 13, 2024
@tango4j
Owner

tango4j commented Jun 13, 2024

@naymaraq
We created a leaderboard today and reduced the size of the dev/eval sets to 10~13 files for .

https://huggingface.co/spaces/GenSEC-LLM/task2_speaker_tagging_leaderboard

A few teams are preparing to submit.
It is a good time to check the performance of your speaker tagging corrector on this.

@naymaraq Let me know if you have any questions. Thank you.

@naymaraq
Author

@tango4j
Thank you for sharing the updates. We submitted our solution on the reduced dataset. Do we need to send the output for the whole dataset? Also, do we need to do anything besides uploading seglist.json to the Hugging Face eval tool?

@tango4j
Owner

tango4j commented Jun 17, 2024

@naymaraq
Thank you so much for putting effort into this.
Your submission seems to perform far better than the baseline on both dev and eval.

If there is no need for a tie-breaker, we might not ask participants to evaluate on the other, bigger splits of the dataset.
As for the technical description, let me ask about it and get back to you.

@tango4j
Owner

tango4j commented Jun 19, 2024

@naymaraq

Hi, the committee says that you are encouraged to submit a paper. It must contain at least 2 pages of technical details and at most 6 pages. You may include fine-tuning, prompting, and data processing for training/inference, as well as parameter tuning. Model information should also be included in the technical paper.
You should register your paper by June 20, 2024; you can then modify it until June 27, 2024.
The paper will appear in the proceedings of SLT 2024.

https://sites.google.com/view/gensec-challenge/home

Technical Papers
Please submit a challenge submission paper through [CMT system]. Minimum 2 page - Max 6 page is allowed.

For templates, detailed requirements, please visit https://2024.ieeeslt.org/paper_submission/
June 20, 2024: Paper submission deadline
June 27, 2024: Paper update deadline

@naymaraq
Author

Hi @tango4j

We posted the technical details of our submission here: link. Unfortunately, it didn't pass the review stage.

@tango4j
Owner

tango4j commented Sep 12, 2024 via email
