Simplest version of dumping train config #558

orazve · 2023-12-19T16:56:48Z

Thank you for your contribution to the Graph Data Science Client project.

Before submitting this PR, please read Contributing to the Neo4j Ecosystem.

Make sure:

You signed the Neo4j CLA (Contributor License Agreement) so that we are allowed to ship your code in our library
Your contribution is covered by tests

netlify · 2023-12-19T16:56:53Z

✅ Deploy Preview for neo4j-graph-data-science-client ready!

Name	Link
🔨 Latest commit	`bdf5369`
🔍 Latest deploy log	https://app.netlify.com/sites/neo4j-graph-data-science-client/deploys/6581cb527387df00084d388d
😎 Deploy Preview	https://deploy-preview-558--neo4j-graph-data-science-client.netlify.app

adamnsch

Cool :)

adamnsch · 2023-12-20T09:13:48Z

graphdatascience/model/model_proc_runner.py

@@ -45,6 +46,34 @@ def create(
            relationship_type_embeddings,
        )

+    @compatible_with("train", min_inclusive=ServerVersion(2, 5, 0))
+    @client_only_endpoint("gds.model.transe")
+    def train(self,


I'm a little confused about how the predict-only TransE we've had previously fits into this. They seem like different things. I also wonder whether it would be better to have a single gds.kge.train call instead of one dedicated to each scoring function.

orazve · 2023-12-20

I think we are adding a training step in addition to the existing prediction one. Maybe it's worth having a fully separate API, like gds.kge.train or gds.kge.transe.train.

Personally, I prefer to mention the scoring method in the function name, as in gds.kge.transe.train rather than gds.kge.train, because KGE algorithms differ and are meant to capture different relationship properties.

Our API proposal has a gds.model.transe.train call, which is why I wrote it that way.

adamnsch · 2023-12-21

Ok! Fair enough :) I do still prefer gds.kge.train. I think it makes sense because the embeddings are what gets trained, not TransE, which is a scoring function. And since all KGE algorithms are the same (in terms of, e.g., hyperparameters) except for the scoring function, it would make sense to group them in the API for simplicity, shared docs, etc. In that sense the scoring function is just another hyperparameter, and one may even want to use an ensemble of them. I like how PyKEEN designed their API.
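To make the two positions in this thread concrete, here is a minimal sketch of the "scoring function as a hyperparameter" idea. It is not the GDS API: SCORING_FUNCTIONS and score_triple are hypothetical names made up for illustration, and only the TransE and DistMult formulas themselves are standard. The last lines also show the point about different relationship properties: DistMult scores a triple and its reverse identically, while TransE generally does not.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def transe_score(h: Vector, r: Vector, t: Vector) -> float:
    """TransE: negative L2 distance between (h + r) and t. Higher is better."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def distmult_score(h: Vector, r: Vector, t: Vector) -> float:
    """DistMult: trilinear product sum_i h_i * r_i * t_i. Higher is better."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


# Hypothetical registry (an assumption, not part of any library):
# the scoring function is looked up by name like any other hyperparameter.
SCORING_FUNCTIONS: Dict[str, Callable[[Vector, Vector, Vector], float]] = {
    "transe": transe_score,
    "distmult": distmult_score,
}


def score_triple(scoring_function: str, h: Vector, r: Vector, t: Vector) -> float:
    """Score one (head, relation, tail) triple with the named scoring function."""
    try:
        fn = SCORING_FUNCTIONS[scoring_function.lower()]
    except KeyError:
        raise ValueError(f"Unknown scoring function: {scoring_function!r}") from None
    return fn(h, r, t)


# Toy embeddings chosen so that h + r == t.
h, r, t = [1.0, 0.0], [0.5, 1.0], [1.5, 1.0]

# TransE is direction-sensitive: the reversed triple scores worse.
print(score_triple("TransE", h, r, t), score_triple("TransE", t, r, h))

# DistMult is symmetric in head and tail: both directions score the same.
print(score_triple("DistMult", h, r, t), score_triple("DistMult", t, r, h))
```

With a registry like this, a single train entry point could take the scoring function name as one more config value, whereas per-function endpoints would each bind one entry of the registry.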

adamnsch · 2023-12-20T09:14:46Z

graphdatascience/model/model_proc_runner.py

+              # loss: str
+              ) -> int:
+        config = {'scoring_function': 'TransE',
+                  'proportions': proportions,


I call this split_ratios in the Python runtime. I think "ratio" is the more common term for this (it is also what PyKEEN uses, for example).
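For reference, a rough sketch of what a split_ratios parameter could mean in practice. The function name split_triples, the split names TRAIN/VALID/TEST, and the dict shape of split_ratios are all assumptions for illustration; the PR does not show the actual type of this parameter or how the runtime consumes it.

```python
import random
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[int, str, int]  # (head id, relationship type, tail id)


def split_triples(
    triples: Sequence[Triple],
    split_ratios: Dict[str, float],
    seed: int = 42,
) -> Dict[str, List[Triple]]:
    """Shuffle triples and cut them into named splits according to split_ratios."""
    if any(ratio < 0 for ratio in split_ratios.values()):
        raise ValueError("split ratios must be non-negative")
    if abs(sum(split_ratios.values()) - 1.0) > 1e-9:
        raise ValueError("split ratios must sum to 1.0")

    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits

    splits: Dict[str, List[Triple]] = {}
    names = list(split_ratios)
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            # The last split takes the remainder so rounding drops no triple.
            end = len(shuffled)
        else:
            end = min(len(shuffled), start + round(split_ratios[name] * len(shuffled)))
        splits[name] = shuffled[start:end]
        start = end
    return splits


triples = [(i, "KNOWS", i + 1) for i in range(10)]
splits = split_triples(triples, {"TRAIN": 0.8, "VALID": 0.1, "TEST": 0.1})
print({name: len(part) for name, part in splits.items()})
```

Validating that the ratios sum to 1.0 up front gives a clear client-side error instead of a silent mis-split later.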

initial simplest version of kg train API

bdf5369

adamnsch reviewed Dec 20, 2023
