-
Notifications
You must be signed in to change notification settings - Fork 14
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only other suggestion is to replace ws://
with workspace://
since ws://
is ordinarily used for WebSocket URLs.
if strings.HasPrefix(paths[0], "ws://") { | ||
if len(paths) > 1 { | ||
return 0, 0, fmt.Errorf("cannot ingest multiple paths from workspace") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would not catch the case where the user passes a normal path as the first parameter, and a ws:// path as the second.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think path will be "director:///" if it is local, and s3 and r2 s3:///"
r2:///`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we're only checking the very first path for the ws:// prefix. If it's on the second item in the paths
slice, it won't get caught.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ws:// is overloaded, but fairly clear with regards to knowledge (which is file-based) I guess 🤷♂️
Anyway, those checks are "superficial", as only Otto should be using this stuff.
But you're right and I will fix the checks later on 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm willing to accept these shortcomings (that it only checks the first path and that ws is overloaded).
@g-linville @StrongMonkey if this is your only issue with the PR, can your finish your review?
if err != nil { | ||
return err | ||
exitErr0(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have to do this because otherwise cobra will return a non-zero if you bubble an err all the way up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup
default: true | ||
globals: | ||
ingestion: | ||
textsplitter: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean you are just using textsplitter until you write the "enhanced" splitter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - but the new one will land today
load.gpt
tool as a wrapper aroundknowledge load
to transform an input file into Knowledge JSON format and store it to another file.ingest
andload
output final error (if any) as JSON to stdout (and exit with0
) or don't print anything at all. All other logging should go to stderr (slog does that by default, so we're good)--metadata foo=bar
foringest
andload
commands (also usable viaKNOWLEDGE="foo=bar,baz=bom"
in env)ingest
andload
both use the gptscript SDK to read from and write to the workspace if paths are prefixed witthws://
(to keep knowledge usable as a standalone tool, making it easier to test and evaluate)