-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zodbcmp - Tool to compare two ZODB databases #128
Conversation
mintid, minstor, maxstor = l[0] | ||
print("not-equal: tid %s present in stor%i but not in stor%i" % ( | ||
ashex(mintid), minstor, maxstor)) | ||
return 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
txn1 = next(iter1, None)
txn2 = next(iter2, None)
if txn1 != txn2:
if verbose:
mintid, minstor, maxstor = min(
(txn1.tid if txn1 else inf, 1, 2),
(txn2.tid if txn2 else inf, 2, 1))
print(...)
return 1
if not txn1:
if verbose:
print("equal")
return 0
MAX_TID
(instead of inf
) may also be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought about MAX_TID too but:
- there is no MAX_TID in ZODB (it is in NEO) - in ZODB it is
utils.maxtid
- in both ZODB and NEO it is 7fff... instead of ffff... for signedness. Still having 7f vs ff looks there can be a risk of skipping transactions.
- I wanted code here to be generic, not depending on particular MAX value a tid can be. This can be achieved with
inf
approach.
|
||
# actual txn comparison | ||
tcmp = txncmp(txn1, txn2) | ||
if tcmp: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really wonder why you like temp variables so much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes the code more structured and clear to me.
if not tidmax: | ||
tidmax = None | ||
|
||
return tidmin, tidmax |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return (tidmin.decode('hex') or None,
tidmax.decode('hex') or None)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the or None
hint. However since .split("..") and .decode("hex") raise different exceptions, for proper error reporting I decided to wrap both & hex-decode actions and reaise dedicated exception so users will get e.g.
E: invalid tidrange: q..
instead of
Traceback (most recent call last):
File "./zodbcmp.py", line 205, in <module>
main()
File "./zodbcmp.py", line 192, in main
tidmin, tidmax = parse_tidrange(argv[2])
File "./zodbcmp.py", line 153, in parse_tidrange
tidmin = tidmin.decode('hex')
File ".../lib/python2.7/encodings/hex_codec.py", line 42, in hex_decode
output = binascii.a2b_hex(input)
TypeError: Odd-length string
See the change in interdiff for next patch revision.
if exc_type is SystemExit: | ||
raise # this was sys.exit() call, not an error | ||
traceback.print_exc() | ||
sys.exit(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try:
main2()
except SystemExit:
raise
except:
traceback.print_exc()
sys.exit(2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, this was a thinko on my side, thanks for correction.
This is a tool to compare two ZODB databases in between tidmin..tidmax transaction range with default range being -∞..+∞ - (whole database). For comparision both databases are scanned at storage layer and every transaction content is compared bit-to-bit between the two. The program stops either at first difference found, or when whole requested transaction range is scanned with no difference detected. Database storages are specified in files with ZConfig-based storage definition, e.g. %import neo.client <NEOStorage> master_nodes ... name ... </NEOStorage> Please see https://lab.nexedi.com/nexedi/neoppod/merge_requests/4 for one of possible contexts. The tool is generic though and is not NEO-specific. It should be able to even check two different storages like ZEO & NEO, or FileStorage and NEO etc and thus can be handy.
@jmuchemb thanks for feedback. I've adjusted the patch. The interdiff is below: diff --git a/src/ZODB/scripts/zodbcmp.py b/src/ZODB/scripts/zodbcmp.py
index f70739e73..bc16a6e24 100755
--- a/src/ZODB/scripts/zodbcmp.py
+++ b/src/ZODB/scripts/zodbcmp.py
@@ -147,19 +147,24 @@ def usage(out):
""", file=out)
# tidmin..tidmax -> (tidmin, tidmax)
+class TidRangeInvalid(Exception):
+ pass
+
def parse_tidrange(tidrange):
- tidmin, tidmax = tidrange.split("..")
- tidmin = tidmin.decode('hex')
- tidmax = tidmax.decode('hex')
+ try:
+ tidmin, tidmax = tidrange.split("..")
+ except ValueError: # not exactly 2 parts in between ".."
+ raise TidRangeInvalid(tidrange)
+
+ try:
+ tidmin = tidmin.decode("hex")
+ tidmax = tidmax.decode("hex")
+ except TypeError: # hex decoding error
+ raise TidRangeInvalid(tidrange)
# empty tid means -inf / +inf respectively
# ( which is None in IStorage.iterator() )
- if not tidmin:
- tidmin = None
- if not tidmax:
- tidmax = None
-
- return tidmin, tidmax
+ return (tidmin or None, tidmax or None)
def main2():
verbose = False
@@ -189,8 +194,8 @@ def main2():
if len(argv) > 2:
try:
tidmin, tidmax = parse_tidrange(argv[2])
- except ValueError:
- usage(sys.stderr)
+ except TidRangeInvalid as e:
+ print("E: invalid tidrange: %s" % e, file=sys.stderr)
sys.exit(2)
stor1 = ZODB.config.storageFromFile(open(storconf1, 'r'))
@@ -202,10 +207,9 @@ def main2():
def main():
try:
main2()
+ except SystemExit:
+ raise # this was sys.exit() call, not an error
except:
- exc_type, _, _ = sys.exc_info()
- if exc_type is SystemExit:
- raise # this was sys.exit() call, not an error
traceback.print_exc()
sys.exit(2) |
This sounds like a useful tool, but I'd rather it not be included in ZODB. This has nothing to do with the tool, which I haven't studied :), but with the weight of ZODB maintenance. This weight has been historically been onerous in the case of utility scripts, which like this one, have lacked tests and tended to rot. If it were up to me, the scripts directory would go away. FWIW, I hope that soon there will be some sort of namespace package for the ZODB ecosystem. This could be a tool to release separately within that namespace. (Although I would prefer it have tests.) |
@jimfulton thanks for feedback. It could eventually have tests like checking database against itself is equal and aginst another db a bit modified - as not equal. It's a pity you do not want core (imho) ZODB-related things to be in ZODB repository itself. Just for the reference
but because of this I assume retrying the pull-request with tests for zodbcmp which are run on every |
At a high level, it would be better to release this separately. At the same time, we should create a page on zodb.org (via a ZODB/docs pr :)) that makes it easy for people to find tools like this. |
ok I'm moving this to https://lab.nexedi.com/kirr/zodbtools . I hesitate to change ZODB/docs about the tool for now. Thanks once again for feedback, |
Project to be a place for various ZODB utilitieis. Jim Fulton suggested to have it separate from ZODB/scripts/ [1] and now we have it here. [1] zopefoundation/ZODB#128 (comment)
This is a tool to compare two ZODB databases in between tidmin..tidmax
transaction range with default range being -∞..+∞ - (whole database).
For comparision both databases are scanned at storage layer and every
transaction content is compared bit-to-bit between the two. The program stops
either at first difference found, or when whole requested transaction range is
scanned with no difference detected.
Database storages are specified in files with ZConfig-based storage definition, e.g.
Please see https://lab.nexedi.com/nexedi/neoppod/merge_requests/4 for
one of possible contexts.
The tool is generic though and is not NEO-specific. It should be able to
even check two different storages like ZEO & NEO, or FileStorage and NEO
etc and thus can be handy.