Undocumented requirements for input nodes/edges file names? #471

Open
amykglen opened this issue Oct 25, 2023 · 0 comments
Labels
bug Something isn't working

Describe the bug
The neo4j-upload CLI command fails to upload my KGX-formatted JSON Lines files and prints the following warnings:

[KGX][jsonl_source.py][               parse] WARNING: Parse function cannot resolve the KGX file type in name nodes-tiny.jsonl. Skipped...
[KGX][jsonl_source.py][               parse] WARNING: Parse function cannot resolve the KGX file type in name edges-tiny.jsonl. Skipped...

To Reproduce
You can reproduce this by running the following command with Neo4j running on localhost, where nodes-tiny.jsonl and edges-tiny.jsonl are any KGX-formatted nodes/edges JSON Lines files.

kgx neo4j-upload --uri bolt://localhost:7687 --username neo4j --password [password] --input-format jsonl nodes-tiny.jsonl edges-tiny.jsonl

Expected behavior
I would expect that command to upload my files to Neo4j successfully.

Additional context
I eventually figured out that if I rename my nodes/edges files so that their names end with nodes.jsonl and edges.jsonl, the command completes successfully. In other words, the following command works (it differs only in the file names):

kgx neo4j-upload --uri bolt://localhost:7687 --username neo4j --password [password] --input-format jsonl tiny-nodes.jsonl tiny-edges.jsonl
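
For anyone hitting the same warning, here is a minimal sketch of the renaming workaround in Python. It assumes only what I observed above (the file name must end with nodes.jsonl or edges.jsonl). The helper names conforming_name and stage_for_upload are hypothetical and not part of KGX.

```python
from pathlib import Path
import shutil


def conforming_name(path: str, kind: str) -> str:
    """Return a file name ending in '<kind>.jsonl' (kind is 'nodes' or 'edges').

    Hypothetical helper: it encodes the naming rule observed in this issue,
    not a documented KGX requirement.
    """
    p = Path(path)
    suffix = f"{kind}.jsonl"
    if p.name.endswith(suffix):
        return p.name  # already ends with the expected suffix
    return f"{p.stem}_{suffix}"  # e.g. nodes-tiny.jsonl -> nodes-tiny_nodes.jsonl


def stage_for_upload(path: str, kind: str, out_dir: str) -> Path:
    """Copy `path` into `out_dir` under a conforming name and return the new path."""
    target = Path(out_dir) / conforming_name(path, kind)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(path, target)
    return target
```

The staged copies can then be passed to kgx neo4j-upload in place of the original files.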

I might have missed it, but I don't see this file naming requirement in the documentation. Could this requirement either be loosened (e.g., require only that nodes/edges appears anywhere in the file name, rather than at the end), or be documented clearly somewhere?
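
To make the request concrete, here is a rough sketch of the two rules side by side. This is not the actual KGX code (I have not traced the logic in jsonl_source.py); kind_by_suffix mirrors the behavior I observed, and kind_anywhere is the looser rule I'm suggesting. Both function names are made up for illustration.

```python
from pathlib import Path
from typing import Optional

KINDS = ("nodes", "edges")


def kind_by_suffix(filename: str, fmt: str = "jsonl") -> Optional[str]:
    """Observed behavior (assumption): the name must end with '<kind>.<fmt>'."""
    name = Path(filename).name
    for kind in KINDS:
        if name.endswith(f"{kind}.{fmt}"):
            return kind
    return None  # corresponds to the "cannot resolve the KGX file type" warning


def kind_anywhere(filename: str, fmt: str = "jsonl") -> Optional[str]:
    """Proposed looser rule: '<kind>' may appear anywhere in the file name."""
    name = Path(filename).name
    if not name.endswith(f".{fmt}"):
        return None
    hits = [kind for kind in KINDS if kind in name]
    # Refuse to guess when the name mentions both kinds or neither.
    return hits[0] if len(hits) == 1 else None
```

With the looser rule, nodes-tiny.jsonl and edges-tiny.jsonl would both resolve, while an ambiguous name such as nodes_and_edges.jsonl would still be rejected.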

(As a side note, I see that the KGX specification lists the file names as nodes.jsonl and edges.jsonl, but that exact naming does not appear to be required in practice - examples in the kgx package documentation use different file names, like test_nodes.jsonl (here))

@amykglen amykglen added the bug Something isn't working label Oct 25, 2023