Support incremental loading data for 'vineyard-graph-loader' (#1563)
What do these changes do?
-------------------------

Add support for incrementally loading vertices and edges in
vineyard-graph-loader.
For step_by_step, the expected behavior is as follows:

Order | Case | Expected behavior
-- | -- | --
1 | Combine multiple files of the same label | Same as the previous vineyard-graph-loader behavior
2 | Combine multiple files of the same label, with duplicated vertex data | Drop duplicates
3 | Combine multiple files of the same label, with duplicated edge data | Log a warning and become a multigraph (as in the previous behavior)
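
The step_by_step cases above can be illustrated with a minimal, self-contained sketch. This is not the vineyard implementation: `VertexId`, `Edge`, `MergeVertices`, and `MergeEdges` are hypothetical names, and plain standard containers stand in for the arrow tables the loader actually operates on.

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <set>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not vineyard types.
using VertexId = int64_t;
using Edge = std::pair<VertexId, VertexId>;

// Case 2: append a new batch of vertices to an existing label and drop
// any id that is already present. Returns the number of dropped ids.
std::size_t MergeVertices(std::vector<VertexId>& existing,
                          const std::vector<VertexId>& batch) {
  std::unordered_set<VertexId> seen(existing.begin(), existing.end());
  std::size_t dropped = 0;
  for (VertexId id : batch) {
    if (seen.insert(id).second) {
      existing.push_back(id);  // first occurrence: keep it
    } else {
      ++dropped;  // duplicate: drop it
    }
  }
  return dropped;
}

// Case 3: append a new batch of edges to an existing label. Duplicates are
// kept (the result is a multigraph) and only a warning is logged.
// Returns the number of duplicated edges found in the batch.
std::size_t MergeEdges(std::vector<Edge>& existing,
                       const std::vector<Edge>& batch) {
  std::set<Edge> seen(existing.begin(), existing.end());
  std::size_t duplicated = 0;
  for (const Edge& e : batch) {
    if (!seen.insert(e).second) {
      ++duplicated;
    }
    existing.push_back(e);  // always kept, even when duplicated
  }
  if (duplicated > 0) {
    std::cerr << "warning: " << duplicated
              << " duplicated edge(s) added; graph is now a multigraph\n";
  }
  return duplicated;
}
```

The asymmetry is deliberate in this sketch: a duplicated vertex id is dropped because it names the same vertex twice, while a duplicated edge is kept as a parallel edge, matching the table above.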

For whole (or the default way), the expected behavior is as follows:

Order | Case | Expected behavior
-- | -- | --
1 | Combine multiple files of the same label | Same as the previous vineyard-graph-loader behavior
2 | Combine multiple files of the same label, with duplicated vertex data | Users are expected not to add this kind of data
3 | Combine multiple files of the same label, with duplicated edge data | Users are expected not to add this kind of data
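
Since the whole mode relies on the user not supplying duplicated data, a pre-check over the input can be useful before loading. The sketch below is one possible way to do that and is not part of this commit: `FindDuplicateIds` is a hypothetical helper, and each inner vector stands in for the vertex ids read from one file of the same label.

```cpp
#include <cstdint>
#include <set>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for illustration only; not a vineyard type.
using VertexId = int64_t;

// Returns every id that occurs more than once across all files of a label.
// An empty result means the input is safe to load in the whole mode.
std::set<VertexId> FindDuplicateIds(
    const std::vector<std::vector<VertexId>>& files) {
  std::unordered_set<VertexId> seen;
  std::set<VertexId> duplicates;
  for (const auto& file : files) {
    for (VertexId id : file) {
      if (!seen.insert(id).second) {
        duplicates.insert(id);  // seen before, in this file or an earlier one
      }
    }
  }
  return duplicates;
}
```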

Fixes #1295

Signed-off-by: SighingSnow <1263750383@qq.com>
SighingSnow authored Sep 26, 2023
1 parent a6fc10a commit 1de8226
Showing 18 changed files with 1,665 additions and 97 deletions.
11 changes: 11 additions & 0 deletions modules/graph/fragment/arrow_fragment.vineyard-mod
@@ -626,6 +626,17 @@ class [[vineyard]] ArrowFragment
ObjectID vm_id,
const int concurrency = std::thread::hardware_concurrency()) override;

boost::leaf::result<ObjectID> AddVerticesToExistedLabel(
Client& client, label_id_t label_id,
std::shared_ptr<arrow::Table>&& vertex_table, ObjectID vm_id,
const int concurrency = std::thread::hardware_concurrency()) override;

boost::leaf::result<ObjectID> AddEdgesToExistedLabel(
Client& client, label_id_t label_id,
std::shared_ptr<arrow::Table>&& edge_table,
const std::set<std::pair<std::string, std::string>>& edge_relations,
const int concurrency = std::thread::hardware_concurrency()) override;

/// Add a set of new edge labels to graph. Edge label id started from
/// edge_label_num_.
boost::leaf::result<ObjectID> AddNewEdgeLabels(
27 changes: 27 additions & 0 deletions modules/graph/fragment/arrow_fragment_base.h
@@ -70,6 +70,33 @@ class ArrowFragmentBase : public vineyard::Object {
VINEYARD_ASSERT(false, "Not implemented");
return vineyard::InvalidObjectID();
}
// Add vertices incrementally to an existing vertex label.
virtual boost::leaf::result<ObjectID> AddVerticesToExistedLabel(
Client& client, PropertyGraphSchema::LabelId label_id,
std::shared_ptr<arrow::Table>&& vertex_table, ObjectID vm_id,
const int concurrency = std::thread::hardware_concurrency()) {
VINEYARD_ASSERT(false, "Not implemented");
return vineyard::InvalidObjectID();
}

/**
* @brief Add edges incrementally to an existing edge label.
*
* @param client
* @param label_id the label id of the existing edge label.
* @param edge_table the newly added edges
* @param edge_relations
* @param concurrency
* @return boost::leaf::result<ObjectID>
*/
virtual boost::leaf::result<ObjectID> AddEdgesToExistedLabel(
Client& client, PropertyGraphSchema::LabelId label_id,
std::shared_ptr<arrow::Table>&& edge_table,
const std::set<std::pair<std::string, std::string>>& edge_relations,
const int concurrency = std::thread::hardware_concurrency()) {
VINEYARD_ASSERT(false, "Not implemented");
return vineyard::InvalidObjectID();
}

virtual boost::leaf::result<ObjectID> AddEdges(
Client& client,