Allow using exogenous data in hierarchical forecasting models #124

Closed
antoinecarme opened this issue Apr 24, 2020 · 5 comments

@antoinecarme
Owner

antoinecarme commented Apr 24, 2020

PyAF does not yet allow using exogenous data (explanatory variables) to enrich the models used in hierarchies.

Expected behavior: it should be possible either to define a single exogenous dataset shared by all hierarchy nodes or to set exogenous data per node.

antoinecarme added a commit that referenced this issue Apr 24, 2020
Added a test with the same exogenous data for all hierarchy nodes
@antoinecarme
Owner Author

First specification method: one exogenous dataset for all nodes, given as a tuple (dataframe, list of used variables).

https://github.com/antoinecarme/pyaf/blob/hierarchical_exog/tests/hierarchical/test_hierarchy_AU_AllMethods_Exogenous_all_nodes.py

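import pandas as pd

def create_exog_data(b1):
    # fake exog data based on date variable
    lDate1 = b1.mPastData['Date']
    # Future dates are not strictly needed: exogenous data are treated as missing when not available.
    lDate2 = b1.mFutureData['Date']
    # Series.append was removed in pandas 2.0; pd.concat is the equivalent.
    lDate = pd.concat([lDate1, lDate2])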
    lExogenousDataFrame = pd.DataFrame()
    lExogenousDataFrame['Date'] = lDate
    lExogenousDataFrame['Date_second'] = lDate.dt.second
    lExogenousDataFrame['Date_minute'] = lDate.dt.minute
    lExogenousDataFrame['Date_hour'] = lDate.dt.hour
    lExogenousDataFrame['Date_dayofweek'] = lDate.dt.dayofweek
    lExogenousDataFrame['Date_day'] = lDate.dt.day
    lExogenousDataFrame['Date_dayofyear'] = lDate.dt.dayofyear
    lExogenousDataFrame['Date_month'] = lDate.dt.month
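    # Series.dt.week was removed in pandas 2.0; isocalendar().week gives the ISO week number.
    lExogenousDataFrame['Date_week'] = lDate.dt.isocalendar().week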
    # a column in the exog data can be of any type
    lExogenousDataFrame['Date_day_name'] = lDate.dt.day_name()
    lExogenousDataFrame['Date_month_name'] = lDate.dt.month_name()
    lExogenousVariables = [col for col in lExogenousDataFrame.columns if col.startswith('Date_')]
    lExogenousData = (lExogenousDataFrame , lExogenousVariables) 
    return lExogenousData

antoinecarme added a commit that referenced this issue Apr 24, 2020
Added a test with the same exogenous data for some hierarchy nodes
@antoinecarme
Owner Author

Second specification method: per-node exogenous data, given as a dict with lExogenousData[signal] = (dataframe, list of used variables).

https://github.com/antoinecarme/pyaf/blob/hierarchical_exog/tests/hierarchical/test_hierarchy_AU_AllMethods_Exogenous_per_node.py

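import pandas as pd

def create_exog_data(b1):
    # fake exog data based on date variable
    lDate1 = b1.mPastData['Date']
    # Future dates are not strictly needed: exogenous data are treated as missing when not available.
    lDate2 = b1.mFutureData['Date']
    # Series.append was removed in pandas 2.0; pd.concat is the equivalent.
    lDate = pd.concat([lDate1, lDate2])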
    lExogenousDataFrame = pd.DataFrame()
    lExogenousDataFrame['Date'] = lDate
    lExogenousDataFrame['Date_second'] = lDate.dt.second
    lExogenousDataFrame['Date_minute'] = lDate.dt.minute
    lExogenousDataFrame['Date_hour'] = lDate.dt.hour
    lExogenousDataFrame['Date_dayofweek'] = lDate.dt.dayofweek
    lExogenousDataFrame['Date_day'] = lDate.dt.day
    lExogenousDataFrame['Date_dayofyear'] = lDate.dt.dayofyear
    lExogenousDataFrame['Date_month'] = lDate.dt.month
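    # Series.dt.week was removed in pandas 2.0; isocalendar().week gives the ISO week number.
    lExogenousDataFrame['Date_week'] = lDate.dt.isocalendar().week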
    # a column in the exog data can be of any type
    lExogenousDataFrame['Date_day_name'] = lDate.dt.day_name()
    lExogenousDataFrame['Date_month_name'] = lDate.dt.month_name()
    lExogenousVariables = [col for col in lExogenousDataFrame.columns if col.startswith('Date_')]
    lExogenousData = {}
    # define exog only for three state nodes
    lExogenousData["NSW_State"] = (lExogenousDataFrame , lExogenousVariables[:3]) 
    lExogenousData["VIC_State"] = (lExogenousDataFrame , lExogenousVariables[-3:]) 
    lExogenousData["QLD_State"] = (lExogenousDataFrame , lExogenousVariables) 
    return lExogenousData
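
To make the slicing in the per-node specification concrete, here is a small standalone sketch (plain Python, not PyAF code). It assumes the variable list follows the order in which the `Date_*` columns are added in `create_exog_data`; the name `per_node_variables` is made up for the illustration.

```python
# Column names in the order create_exog_data adds them.
# The 'Date' key column is excluded because it does not start with 'Date_'.
exog_variables = [
    "Date_second", "Date_minute", "Date_hour", "Date_dayofweek", "Date_day",
    "Date_dayofyear", "Date_month", "Date_week", "Date_day_name", "Date_month_name",
]

# Same slicing as in the per-node specification above (hypothetical name).
per_node_variables = {
    "NSW_State": exog_variables[:3],   # first three variables
    "VIC_State": exog_variables[-3:],  # last three variables
    "QLD_State": exog_variables,       # all variables
}

for node, variables in per_node_variables.items():
    print(node, variables)
```

So NSW_State would use the second/minute/hour variables, VIC_State the week number plus the two name columns, and QLD_State all ten. Every other node has no entry in the dict.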

@antoinecarme
Owner Author

The M5 Competition

https://mofc.unic.ac.cy/m5-competition/


@antoinecarme
Owner Author

    def get_exogenous_data(self, signal):
        # A signal is a hierarchy node.
        if self.mExogenousData is None:
            return None
        if isinstance(self.mExogenousData, tuple):
            # Same exogenous data for all signals.
            return self.mExogenousData
        if isinstance(self.mExogenousData, dict):
            # One exogenous data entry per signal; None if the signal has no entry.
            return self.mExogenousData.get(signal)
        raise tsutil.PyAF_Error("BAD_EXOGENOUS_DATA_SPECIFICATION")
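
The lookup above can be tried outside PyAF with a standalone sketch like the following. The function name `resolve_exogenous_data` is hypothetical, `ValueError` stands in for `tsutil.PyAF_Error`, and a plain string stands in for the exogenous dataframe; none of these are part of the PyAF API.

```python
def resolve_exogenous_data(exogenous_spec, signal):
    """Hypothetical standalone version of the per-signal lookup."""
    if exogenous_spec is None:
        return None
    if isinstance(exogenous_spec, tuple):
        # Shared specification: every signal gets the same (dataframe, variables).
        return exogenous_spec
    if isinstance(exogenous_spec, dict):
        # Per-node specification: signals without an entry get None.
        return exogenous_spec.get(signal)
    # ValueError is a stand-in for tsutil.PyAF_Error.
    raise ValueError("BAD_EXOGENOUS_DATA_SPECIFICATION")


# A string placeholder replaces the real dataframe to keep the sketch dependency-free.
shared = ("exog_frame", ["Date_day", "Date_month"])
per_node = {"NSW_State": ("exog_frame", ["Date_day"])}

print(resolve_exogenous_data(shared, "VIC_State"))    # the shared tuple
print(resolve_exogenous_data(per_node, "NSW_State"))  # the NSW_State entry
print(resolve_exogenous_data(per_node, "VIC_State"))  # None: no entry for this node
```

Anything other than `None`, a tuple, or a dict (a list, for instance) is rejected as a bad specification.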

antoinecarme added a commit that referenced this issue Apr 24, 2020
Added two tests for explanatory variables with grouped signals
@antoinecarme changed the title from "Allow using exogenous data in hierachical models" to "Allow using exogenous data in hierarchical forecasting models" Apr 25, 2020
@antoinecarme
Owner Author

antoinecarme commented Apr 28, 2020

Closing.

Will be officially available in release 2.0
