Clarification Needed: Is the Bike Flow Prediction Example Truly Zero-Shot? #24

Open
Kaleemullahqasim opened this issue Oct 14, 2024 · 4 comments

@Kaleemullahqasim

I came across the Example-1: Bike Flow Prediction (Zero-shot scenario) in your paper, and I have some concerns regarding the classification of this task as “zero-shot.”

As I understand it, a zero-shot scenario typically involves the model performing a task without being provided any specific prior examples or historical data that directly relate to the task. However, in the provided bike flow prediction example, the model is given 12 time steps of historical inflow and outflow data. This seems to provide the model with concrete examples to base its predictions on, which would generally classify the task as a few-shot or data-driven prediction rather than a zero-shot task.

Could you clarify why this is being labeled as a zero-shot scenario? If it is indeed zero-shot, could you explain the reasoning behind the classification, given that historical data is provided for the prediction task?

Thank you for your time and help in clearing this up!

@Kaleemullahqasim
Author

https://huggingface.co/datasets/bjdwh/ST_data_urbangpt/viewer?row=0

Even the training dataset is not zero-shot. You did add few-shot examples, but for some reason you always label them as zero-shot.

@LZH-YS1998
Collaborator

Hello. We understand what may be causing the confusion. The spatio-temporal prediction task involves forecasting future values from historical data, whether in a full-shot, few-shot, or zero-shot setting. In this context, we distinguish the zero-shot setting from the others by whether the model has been trained on historical data from the test regions. If the model has not been trained on data from the test regions, we consider it a zero-shot setting. We have formalized this task in Section 2, titled "Spatio-Temporal Zero-Shot Learning."
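
For readers following the thread, the region-based definition above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the helper names (`make_windows`, `region_disjoint_split`), the synthetic data, and the 12-step forecast horizon are assumptions. Only the 12-step history length comes from the example discussed in this issue.

```python
import numpy as np

HISTORY_STEPS = 12  # history length from the bike flow example in this issue
HORIZON = 12        # assumed forecast horizon (hypothetical)


def make_windows(series, history=HISTORY_STEPS, horizon=HORIZON):
    """Slice one region's (T, 2) inflow/outflow series into (past, future) pairs."""
    pairs = []
    for start in range(len(series) - history - horizon + 1):
        past = series[start:start + history]
        future = series[start + history:start + history + horizon]
        pairs.append((past, future))
    return pairs


def region_disjoint_split(flows, test_regions):
    """Split samples so that no test region contributes any training sample."""
    train, test = [], []
    for region_id, series in flows.items():
        bucket = test if region_id in test_regions else train
        bucket.extend((region_id, past, future)
                      for past, future in make_windows(series))
    return train, test


# Synthetic flows for 5 hypothetical regions, 48 time steps each.
rng = np.random.default_rng(0)
flows = {r: rng.integers(0, 50, size=(48, 2)) for r in range(5)}

train, test = region_disjoint_split(flows, test_regions={3, 4})
train_regions = {r for r, _, _ in train}
test_region_ids = {r for r, _, _ in test}
```

In this sketch every test sample still carries a 12-step history window as model input. Under the definition above, the setting counts as zero-shot because `train_regions` and `test_region_ids` do not overlap, not because the input history is absent.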

@Kaleemullahqasim
Author

Thank you for clarifying the concept of “Spatio-Temporal Zero-Shot Learning” in your study. I appreciate your explanation of how the zero-shot setting is defined based on the absence of training on the specific test regions.

However, after reviewing the approach, I have a follow-up question. It appears that the model still relies on historical data from the test region for making predictions, which can be viewed as a form of few-shot learning. If the goal is to generalize to new regions without explicit training, could this not be achieved by carefully crafting few-shot prompts that provide the model with the necessary historical data and context during inference?

Why fine-tune or train the model at all if a few-shot prompt can solve the whole problem? After all, you are still supplying that historical data at the testing (prediction) stage.

@LZH-YS1998
Collaborator

Hello, we understand your concern. The spatio-temporal prediction task we're discussing is inherently defined as predicting future trends from historical data. Without historical data, it becomes a fundamentally different task. You may also want to look into similar works to see how they define "zero-shot" for time series prediction or spatio-temporal prediction:
"Large Language Models Are Zero-Shot Time Series Forecasters"; "Unified Training of Universal Time Series Transformers"... I hope this helps.

Regarding your point about few-shot learning, we have not found that providing a few examples in the prompt effectively solves this problem. Therefore, instruction fine-tuning is necessary.
