🔢 Not computing the node features in HungaryCPDataLoader #86
Comments
Added descriptive docstrings for HungaryCPDataLoader. Made a couple of additions to the docstrings of CoraDataLoader. Added assert checks for the parameters passed to these two dataloaders. Also opened an issue related to HungaryCPDataLoader (#86) regarding the node features array.
Montevideo Bus
Something similar was also noticed in MontevideoBusDataLoader, where we are only calculating the target and not the features. We were also not considering the lags while calculating the target.

Our Version

def _get_targets(self, target_var: str = "y"):
    targets = []
    for node in self._dataset["nodes"]:
        y = node.get(target_var)
        targets.append(np.array(y))
    stacked_targets = np.stack(targets).T
    standardized_targets = (
        stacked_targets - np.mean(stacked_targets, axis=0)
    ) / np.std(stacked_targets, axis=0)
    self._all_targets = np.array([
        standardized_targets[i, :].T
        for i in range(len(standardized_targets))
    ])

PyG-T version

def _get_features(self, feature_vars: List[str] = ["y"]):
    features = []
    for node in self._dataset["nodes"]:
        X = node.get("X")
        for feature_var in feature_vars:
            features.append(np.array(X.get(feature_var)))
    stacked_features = np.stack(features).T
    standardized_features = (
        stacked_features - np.mean(stacked_features, axis=0)
    ) / np.std(stacked_features, axis=0)
    self.features = [
        standardized_features[i : i + self.lags, :].T
        for i in range(len(standardized_features) - self.lags)
    ]

def _get_targets(self, target_var: str = "y"):
    targets = []
    for node in self._dataset["nodes"]:
        y = node.get(target_var)
        targets.append(np.array(y))
    stacked_targets = np.stack(targets).T
    standardized_targets = (
        stacked_targets - np.mean(stacked_targets, axis=0)
    ) / np.std(stacked_targets, axis=0)
    self.targets = [
        standardized_targets[i + self.lags, :].T
        for i in range(len(standardized_targets) - self.lags)
    ]

For the new dataloader that is being implemented (#85), I will be following the PyG-T version unless our previous version is a better choice.
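To make the role of the lags concrete, here is a minimal sketch of the windowing that the PyG-T list comprehensions perform. The helper name `build_lagged_windows` and the toy array are assumptions for illustration only; standardization is left out so the values are easy to trace.

```python
import numpy as np


def build_lagged_windows(series: np.ndarray, lags: int):
    """Split a (time, nodes) array into lagged features and next-step targets.

    Hypothetical helper for illustration; it mirrors the list comprehensions
    in the PyG-T snippets but is not part of either library.
    """
    num_steps = series.shape[0]
    # features[i] holds steps i .. i + lags - 1, transposed to (nodes, lags)
    features = [series[i : i + lags, :].T for i in range(num_steps - lags)]
    # targets[i] holds the single step that follows that window, shape (nodes,)
    targets = [series[i + lags, :].T for i in range(num_steps - lags)]
    return features, targets


# Toy data: 6 time steps, 2 nodes; values 0..11 so each entry is easy to trace.
series = np.arange(12, dtype=float).reshape(6, 2)
features, targets = build_lagged_windows(series, lags=4)

print(len(features), features[0].shape, targets[0].shape)
print(features[0])
print(targets[0])
```

With 6 time steps and 4 lags there are only 2 usable samples, whereas storing every time step (as in our `_all_targets`) keeps all 6 rows and leaves the windowing to a later stage.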
PedalMe
Something similar was also noticed for the PedalMe dataset in our version.
WikiMath
Noticed that there is a difference in our version when calculating the targets.

PyG-T Version

def _get_targets_and_features(self):
    targets = []
    for time in range(self._dataset["time_periods"]):
        targets.append(np.array(self._dataset[str(time)]["y"]))
    stacked_target = np.stack(targets)
    standardized_target = (
        stacked_target - np.mean(stacked_target, axis=0)
    ) / np.std(stacked_target, axis=0)
    self.features = [
        standardized_target[i : i + self.lags, :].T
        for i in range(len(targets) - self.lags)
    ]
    self.targets = [
        standardized_target[i + self.lags, :].T
        for i in range(len(targets) - self.lags)
    ]

Our Version

def _set_targets(self):
    r"""Calculates and sets the target attributes"""
    targets = []
    for time in range(self.gdata["total_timestamps"]):
        targets.append(np.array(self._dataset[str(time)]["y"]))
    stacked_target = np.stack(targets)
    standardized_target = (stacked_target - np.mean(stacked_target, axis=0)) / (
        np.std(stacked_target, axis=0) + 10**-10
    )
    breakpoint()
    self._all_targets = np.array(
        [standardized_target[i, :].T for i in range(len(targets))]
    )
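One visible difference between the two WikiMath snippets is the `10**-10` added to the standard deviation in our version. The sketch below, on made-up data, shows what that term changes when a node's series is constant over time. The `standardize` helper and its `eps` parameter are assumptions for illustration, not functions from either codebase.

```python
import numpy as np


def standardize(values: np.ndarray, eps: float = 0.0) -> np.ndarray:
    """Column-wise z-score over the time axis, with an optional epsilon."""
    mean = np.mean(values, axis=0)
    std = np.std(values, axis=0)
    # Silence the divide warnings so the NaN result can be inspected directly.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (values - mean) / (std + eps)


# Toy (time, nodes) array: node 0 varies, node 1 is constant over time.
stacked_target = np.array([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])

without_eps = standardize(stacked_target)            # PyG-T style denominator
with_eps = standardize(stacked_target, eps=10**-10)  # "Our Version" denominator

print(np.isnan(without_eps[:, 1]).all())
print(with_eps[:, 1])
```

For the constant node the plain division is 0/0 and yields NaN, while the epsilon variant yields zeros; for the varying node the two results agree to within floating-point tolerance.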
While going through the methods in HungaryCPDataLoader during the dataset abstraction task, I noticed the following in the _get_targets_and_features method.
Our Version
The PyTorch Geometric Temporal version, however, computes the features list.
PyG-T Version
Need to confirm why we omitted this computation in our dataloader and add it back in if necessary.
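If the features list is added back, one quick way to confirm it lines up with the targets is a consistency check like the sketch below. It assumes the PyG-T layout seen in the other dataloaders in this thread (`features[i]` of shape (nodes, lags), `targets[i]` of shape (nodes,)); the function name and the stand-in data are hypothetical.

```python
import numpy as np


def features_match_targets(features, targets) -> bool:
    """Check that consecutive lag windows and targets line up.

    Assumes the PyG-T layout: features[i] has shape (nodes, lags) and
    targets[i] has shape (nodes,). Under that layout the newest column of
    window i + 1 is the same time step as target i.
    """
    if len(features) != len(targets):
        return False
    return all(
        np.array_equal(features[i + 1][:, -1], targets[i])
        for i in range(len(targets) - 1)
    )


# Stand-in data built the same way as the PyG-T list comprehensions.
lags = 3
standardized_target = np.arange(10, dtype=float).reshape(5, 2)
features = [standardized_target[i : i + lags, :].T for i in range(5 - lags)]
targets = [standardized_target[i + lags, :].T for i in range(5 - lags)]

print(features_match_targets(features, targets))
# A loader that never builds the features list has nothing to compare against.
print(features_match_targets([], targets))
```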