-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Request/AvailableDAGs APIs #68
Conversation
8f4485c
to
299bbb0
Compare
@@ -30,6 +31,7 @@ impl fmt::Debug for StoredBlock { | |||
|
|||
f.debug_struct("StoredBlock") | |||
.field("cid", &cid_str) | |||
.field("filename", &self.filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you're storing the blocks in sqlite not in individual files (and if you were I would hope you could derive the filename from the CID).
So I'm guessing this filename member is actually from a UnixFS directory's link to this block? e.g. the directory entry's name for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The filename originates in this API message ApplicationAPI::ImportFile { path: String }
. We just pull out the "file name" piece of the path in Storage::import_path
.
@@ -68,7 +69,11 @@ impl Storage { | |||
} | |||
} | |||
if let Some(root_cid) = root_cid { | |||
if let Some(filename) = path.file_name().and_then(|p| p.to_str()) { | |||
self.provider.name_dag(&root_cid, filename)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, interesting - you're naming root CIDs. I think Pinata does this IIRC (but doesn't really expose it via gateway).
window_size: Option<u32>, | ||
) -> Result<Vec<StoredBlock>> { | ||
let mut base_query = " | ||
WITH RECURSIVE cids(x,y,z) AS ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not familiar with WITH RECURSIVE
. That's interesting.
How is performance? Is this function (regardless of how it's implemented) a memory limitation concern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't run into any show stopping performance issues with files in the 5-50 mB range, but also haven't done much memory profiling around larger files here. I think we are Ok for the smaller files scheduled for usage in the demonstration.
} | ||
|
||
impl StorageProvider for SqliteStorageProvider { | ||
fn import_block(&self, block: &StoredBlock) -> Result<()> { | ||
self.conn.execute( | ||
"INSERT OR IGNORE INTO blocks (cid, data) VALUES (?1, ?2)", | ||
(&block.cid, &block.data), | ||
"INSERT OR IGNORE INTO blocks (cid, data, filename) VALUES (?1, ?2, ?3)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "or ignore" makes me immediately question what the PK and/or unique constraints are. And looking at that code leads me to a side question: The purpose of an autonumber-style "id"?
But back to the main question: it looks like you did not modify the unique index on CID. So if you add a second file with the same content it gets silently dropped completely - no record of that filename existing. Is this really desired?
Perhaps it's too academic a concern, but in a pure mathematical sense a filename probably should't be a column in blocks, right? It would be a separate table that associates... I guess many-to-many?
})? | ||
// TODO: Correctly catch/log/handle errors here | ||
.filter_map(|cid| cid.ok()) | ||
.collect(); | ||
Ok(roots) | ||
} | ||
|
||
fn name_dag(&self, cid: &str, file_name: &str) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conversely this method assumes you want to drop the old name.
I would've guessed there'd be a parameter, perhaps bool or bool-equivalent-enum, specifying whether you'd like to do this or do WHERE cid = ?2 AND filename = NULL
)?; | ||
Ok(()) | ||
} | ||
|
||
fn get_missing_cid_blocks(&self, cid: &str) -> Result<Vec<String>> { | ||
// First get all block cid+id associated with root cid | ||
let blocks: Vec<(String, Option<i32>)> = self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious that you don't want to use a recursive select here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and fixed this for consistency and in case it becomes an issue later
@@ -103,6 +116,11 @@ pub enum ApplicationAPI { | |||
Version { | |||
version: String, | |||
}, | |||
RequestAvailableDags, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I assume it wouldn't be a concern here, are there any cross-version compatibility concerns with changes to this enum? Like, is it append-only for new enum variants or whatever?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently there is no assumed cross-version compatibility for these enums. For the purposes of the initial demonstrations we are assuming that all sides are running the same version, and that we have the control to make that so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A future enhancement along the path to SDK-ifying this software would be building more cross-version compatibility into these enums.
@@ -213,6 +213,9 @@ impl<T: Transport + Send + 'static> Listener<T> { | |||
} | |||
None | |||
} | |||
Message::ApplicationAPI(ApplicationAPI::RequestAvailableDags) => { | |||
Some(handlers::get_available_dags(self.storage.clone())?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunate that this sort of mapping can't be generated.
@@ -418,6 +418,7 @@ impl<T: Transport + Send + 'static> Shipper<T> { | |||
cid: Cid::try_from(block.cid)?.to_string(), | |||
data: block.data, | |||
links, | |||
filename: block.filename, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may very well be misunderstanding the purpose of this mod, but are the various satellites & ground stations supposed to agree on the filenames of roots?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to give recognizable names to DAGs/root CIDs for recognition by operators and/or observers. Maintaining the same "file name" across satellites & ground stations will hopefully help folks recognize the completion of a DAG transfer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think at some point we should pick one way handle conflicting names (old wins silently, new wins silently, error) and use the same approach in both import_dag and also name_dag.
But if, for right now, the answer is "it doesn't matter which happens", then this is fine. We could, for example, simply ask that they never use a file with the same content as another file they've used.
Sorry, I didn't mean to ignore your two comments on this! Yes I agree that it should be addressed at some point. I created #69 to address this in the future. |
Adds the
RequestAvailableDags
API, and association between filenames and DAGs.