bulk upsert fails if oplog.timestamp is deleted #62

Open
CIB opened this issue Jul 8, 2016 · 2 comments

CIB commented Jul 8, 2016

The official documentation says that deleting the timestamp file should be possible. But if I start Neo4j Doc Manager once, stop it, delete the timestamp file, and start it again, I get the following error:

mongo-connector -m $MONGODB -t $NEO4JDB -d $NEO4JDOCMANAGER


2016-07-08 07:30:53,976 [CRITICAL] mongo_connector.oplog_manager:625 - Exception during collection dump
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/site-packages/mongo_connector/util.py", line 32, in wrapped
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.4/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 89, in bulk_upsert
    tx.commit()
  File "/usr/local/lib/python3.4/site-packages/py2neo/cypher/core.py", line 333, in commit
    return self.post(self.__commit or self.__begin_commit)
  File "/usr/local/lib/python3.4/site-packages/py2neo/cypher/core.py", line 288, in post
    raise self.error_class.hydrate(error)
py2neo.cypher.error.schema.ConstraintViolation: Node 0 already exists with label Test and property "_id"=[577ccc5e39a414a3d7d17171]

@johnymontana

Thanks for pointing this out @CIB. Neo4j Doc Manager creates a uniqueness constraint on the _id property (the value of the ObjectID for each document), so this error is thrown because the bulk upsert is trying to create nodes that already exist. Currently bulk_upsert uses CREATE Cypher statements, but I suppose we could try changing those to MERGE and SET statements to avoid these constraint violation errors. I will try some performance tests with this to see if it makes sense. In the meantime, you could delete the data in Neo4j before restarting the doc manager to avoid this error.
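
To illustrate the CREATE to MERGE idea mentioned above, here is a minimal sketch of how an idempotent upsert statement could be built. It is not the doc manager's actual code: `build_upsert_statement` is a hypothetical helper, and the `{name}` parameter placeholders assume the legacy Cypher parameter syntax (newer Neo4j versions use `$name` instead).

```python
def build_upsert_statement(label, doc_id, properties):
    """Return (cypher, params) for an idempotent node upsert.

    Hypothetical helper for illustration; not part of neo4j_doc_manager.
    """
    # Escape backticks so the label can be quoted safely in Cypher.
    safe_label = label.replace("`", "``")
    # MERGE matches an existing node by _id or creates it if missing, so
    # replaying the collection dump should not violate the uniqueness
    # constraint on _id. SET n += ... then adds or updates the properties.
    cypher = (
        "MERGE (n:`" + safe_label + "` {_id: {doc_id}}) "
        "SET n += {props}"
    )
    params = {"doc_id": doc_id, "props": dict(properties)}
    return cypher, params


cypher, params = build_upsert_statement(
    "Test", "577ccc5e39a414a3d7d17171", {"name": "example"}
)
print(cypher)
```

With the py2neo transaction API that appears in the traceback, the resulting pair could presumably be queued with `tx.append(cypher, params)` before `tx.commit()`. Whether MERGE is fast enough for large collection dumps is exactly the open performance question raised above.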

@simonthme

Hello, I'm facing the same problem. Any news on this?
Thanks.
