Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backend agnostic read-only DB wrapper framework #2664

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from
Draft

Backend agnostic read-only DB wrapper framework #2664

wants to merge 2 commits into from

Conversation

mjfh
Copy link
Contributor

@mjfh mjfh commented Sep 27, 2024

Proof-of-concept, up for discussion.

Code might raise to a wish-list what is really needed, useful, etc.


proc posixPrngRand(state: var uint32): byte =
## POSIX.1-2001 example of a rand() implementation, see manual page rand(3).
state = state * 1103515245 + 12345;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is useful to have an intentionally reproducible PRNG, but this particular one isn't very good, and conveniently due to exactly this inclusion in POSIX standards has been widely studied to the point of being an example of what not to do in undergraduate CS lectures:

The output of this generator isn’t very random. If we use the sample generator listed above, then the sequence of 16 key bytes will be highly non-random. For instance, it turns out that the low bit of each successive output of rand() will alternate (e.g., 0, 1, 0, 1, 0, 1, . . . ). Do you see why? The low bit of x * 1103515245 is the same as the low bit of x, and then adding 12345 just flips the low bit. Thus the low bit alternates.

It's true though that LCGs are simple and easy, and that's useful here too. For a while Mersenne Prime was typical, but that's a bit more involved.

Xorshift might be one approach, more popular in the last few years. https://prng.di.unimi.it/ suggests it's fairly good quality as well.

https://nim-lang.org/docs/random.html uses this, even, and maybe using initRand with specific values and std/random suffices for this purpose.

proc lastBlockNumber(e1db: Era1DbRef): BlockNumber =
var
minNum = BlockNumber(1)
maxNum = BlockNumber(4700013) # MainNet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit specific? it might be enough practically to have this maxNum be efectively the max of mainnet/Holesky/Sepolia, since there presumably will never again be a network with era1 files since they'll launch merged, but then that should be noted, rather than treating the tests as only applicable to mainnnet (or if they are mainnet-specific by necessity, they probably shouldn't be framed as a general-purpose test utility).

var
minNum = BlockNumber(1)
maxNum = BlockNumber(4700013) # MainNet
middle = (maxNum + minNum) div 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, (in)famously, this was one of the longstanding binary search bugs because it can overflow, even if the relevant block numbers are unlikely to overflow a BlockNumber = uint64.

Practically, it probably can't, but

AristoDbMemory.newCoreDbRef()

proc newCommonInstance*(): CommonRef =
const networkId = MainNet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another case of, this is probably a bit too specific for a "backend agnostic", or it should at least be better-documented/enforced that all of this applies to mainnet only.

But there's a practical tradeoff here of, something like a Sepolia or Holesky era1 test can be done to completion much more quickly than mainnet, so constraining this to mainnet constraints tests to being only on small slices.

# ------------------------------------------------------------------------------

proc setTraceLevel* =
discard
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of these setFooLevel should need the discard statements, even when the when condition is false. These aren't templates, for which that would be necessary.

mainDir: openArray[string] = mainDir;
subDir: openArray[string] = subDir;
): Result[string,void] =
for dir in baseDir:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this particular search pattern, apparently including searching the root directory ($DirSep)?

Also there's an odd mix here between this very generic search pattern and otherwise specialized to mainnet.

proc edgeDbMain*() =
testEra1CoreDbMain()

when isMainModule:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When imported from all_tests elsewhere, this when ensures it won't run any tests.

key: uint64;
): Result[Blob,EdgeDbError] =
## Simple linear get policy.
var error = EdgeColUnsupported
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error symbol has a couple of other typical uses:

  • chronicles logging error
  • results magical variable error

Those already often conflict, e.g., arnetheduck/nim-results#34 which triggered the workarounds in arnetheduck/nim-results#35 and while those two are somewhat baked in, less chance of random future blowups to avoid reusing the error symbol beyond that unless necessary.

key: openArray[byte];
): Result[Blob,EdgeDbError] =
## Ditto for `Blob` like key
var error = EdgeColUnsupported
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check hdrData == rlp.encode(e1Tpl.header)
check bdyData == rlp.encode(e1Tpl.body)

when isMainModule:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to be run standalone or imported? Nothing else in this PR references this module.

@arnetheduck
Copy link
Member

What is this wrapper for? ie for the moment, we should not need any additional wrappers or interfaces (in fact, we have too many already) - we should not be abstracting or interfacing anything prior to there being a non-abstract version working as intended.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants