Fantastic work! Is code data considered in Cosmopedia? #11

Open
UniverseFly opened this issue Feb 22, 2024 · 1 comment

Comments

@UniverseFly

Wow, this is super cool work, and thanks for open-sourcing everything! I wonder whether Cosmopedia has tried using code data as seeds and rephrasing them into high-quality data. We did some exploration of this in Magicoder for instruction tuning, but in our case the "rephrasing" required very delicate prompt design, so I am quite excited about this development and would love to hear any insights into rephrasing code instructions.

@loubnabnl
Collaborator

Thank you! Yes, we're planning to try generating code data. We could use the Magicoder instructions as seeds to generate coding tutorials, similar to how we used UltraChat and OpenHermes. It might take a few iterations, though, since the quality depends heavily on the coding performance of the LLM we use, much like the issues we've seen with math reasoning.

2 participants