Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

not to save unclean object database solution for issue #946 #1311

Closed
wants to merge 3 commits into from

Conversation

cogutvalera
Copy link
Member

PR for #946 issue "Possible to save unclean object database to file during replay"

@@ -155,7 +155,7 @@ int main(int argc, char** argv) {
if (unhandled_exception)
{
elog("Exiting with error:\n${e}", ("e", unhandled_exception->to_detail_string()));
node->shutdown();
node->shutdown(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not correct.

Please carefully evaluate all scenarios and the need of passing in different parameters. Best leave comments in code explaining why.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry forgot here to check "replay-blockchain" mode for condition execution with different params

@@ -220,7 +220,8 @@ void database::close(bool rewind)
// DB state (issue #336).
clear_pending();

object_database::flush();
if (is_clean)
object_database::flush();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to change the parameter name to do_flush or similar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with you do_flush name is better and makes more sense

@@ -109,7 +109,7 @@ namespace graphene { namespace chain {
* Will close the database before wiping. Database will be closed when this function returns.
*/
void wipe(const fc::path& data_dir, bool include_blocks);
void close(bool rewind = true);
void close(bool rewind = true, bool is_clean = true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add in-code documentation here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

@@ -53,7 +53,7 @@ namespace graphene { namespace app {
void initialize(const fc::path& data_dir, const boost::program_options::variables_map&options);
void initialize_plugins( const boost::program_options::variables_map& options );
void startup();
void shutdown();
void shutdown(bool is_clean = true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best to have doc here as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

the on-disk object database would be considered clean and be loaded into memory directly, that's why we need to
pass do_flush = FALSE in event of exception while rebuilding object graph by replaying all blocks.
*/
if( options.count("replay-blockchain") )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not correct. Replay can be triggered due to several reasons, having replay-blockchain specified in options is only one of them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Researching all reasons.

@@ -109,7 +109,15 @@ namespace graphene { namespace chain {
* Will close the database before wiping. Database will be closed when this function returns.
*/
void wipe(const fc::path& data_dir, bool include_blocks);
void close(bool rewind = true, bool is_clean = true);
/**
* @brief Shutdown application
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is in database class but not application.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry.

void close(bool rewind = true, bool is_clean = true);
/**
* @brief Shutdown application
* @param rewind pop all of the blocks that we can given our undo history,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence is a bit hard to understand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, ok.

@@ -155,6 +157,10 @@ int main(int argc, char** argv) {
{
elog("Exiting with error:\n${e}", ("e", unhandled_exception->to_detail_string()));

chain::block_database block_id_to_block;
block_id_to_block.open(data_dir / "database" / "block_num_to_block");
Copy link
Member

@abitmore abitmore Sep 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not put logic about underlying files in main(). Please do it in database class. (Note: I haven't checked you code since it shouldn't be here, the logic can be wrong if moved to database)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Thanks !

@cogutvalera
Copy link
Member Author

now seems much better and much simpler solution with checking undo_db is disabled or not, what are your thoughts about this solution ? is it correct ? or I've missed something again ?

Thanks !

last_block->block_num() > node->chain_database()->head_block_num()
)
)
if( node->chain_database()->undo_db_disabled() )
node->shutdown(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's good to deal with database in main(). IMHO a better approach would be telling application either "we got an error, please shutdown" or "we are fine, please shutdown", and let application decide what to do. Then, perhaps application doesn't know what to do either and will pass the info to database and let database decide, or perhaps it knows what to do and do it accordingly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks !

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I think the best way to put this logic inside application and not inside database, IMHO application must know if there was a exception during replay mode and that database must not be flushed. I think this logic in application makes more sense than if it would be in database.

{
if( my->_p2p_network )
my->_p2p_network->close();
if( my->_chain_db )
{
my->_chain_db->close(true, do_flush);
if (after_exception)
my->_chain_db->close(true, my->_chain_db->undo_db_enabled());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This code looks a bit ugly to me.
  2. I haven't checked whether the whole logic is correct yet.

@abitmore
Copy link
Member

The more I think about the issue and read your code, the more strongly I feel that we should refactor/rewrite some code buried deeply but not try to add code in main() to handle uncaught exceptions. For example, why database::reindex() didn't catch exceptions in the first place and handle them locally then return properly?

@cogutvalera please think/analyze before coding. The way you trying to fix the issue (write some code then change over and over) has led to unnecessary (wasted) work/efforts.

@cogutvalera
Copy link
Member Author

cogutvalera commented Sep 10, 2018

I thought we do not need rewrite a lot of code so deeply. IMHO I thought that we need to fix this issue with minor changes. Should we make deep code refactor/changes for this issue and not minor changes ?

@abitmore
Copy link
Member

Try to avoid unnecessary/duplicate efforts. If we simply patch it this time, we'll have to spend more time/resources to cleanup the patch in the future. This issue is not a high priority one, so we have much time to fix it in a good manner. If we can improve the code structure while fixing this issue, we'll gain much more value.

Don't let the estimated hours bind you down. If you got extra useful work done, I'm sure you'll be compensated. On the opposite, I don't feel good to compensate the hours for writing some code which have to be rewritten later.

@cogutvalera
Copy link
Member Author

Ok. Thanks ! Understood ! I won't be scared next time to make more code changes for more ideal solutions even if it will require deep code refactoring ! So I will think/analyze this issue more deeply then with best design paterns, design principles and approaches.

I do not worry about spending more time and about compensation, just tried to make changes as small as possible without refactoring a lot current architecture, but I was wrong because for more ideal architecture we must not be scared about deep code refactoring and architecture refactoring.

Thanks !

@abitmore abitmore added this to the 201812 - Feature Release milestone Sep 14, 2018
@cogutvalera
Copy link
Member Author

I've created Error Handling new issue #1338 for better architecture design and approach, but this issue is fixed by simple enough solution with minor changes, because new Error Handling implementation by Eithers must be implemented as NEW issue, it requires a lot of code changes and deep architecture design refactor. Of course we can implement it step by step, just need more discussion about Eithers.

@cogutvalera
Copy link
Member Author

cogutvalera commented Oct 19, 2018

is this PR and related issue within low priority scope ?

@pmconrad
Copy link
Contributor

Argh. Please remove that huge comment, having the discussion here on github is sufficient.
You can update your original message with a short summary of discussion and solution.

@cogutvalera
Copy link
Member Author

ok sure ! Thank you very much !

@cogutvalera
Copy link
Member Author

Done ! Thank you !

@cogutvalera
Copy link
Member Author

cogutvalera commented Nov 1, 2018

is something wrong with this PR or with my solution ? maybe I've missed anything ? Or we can merge and close it maybe ? What are your thoughts friends ?

Thank you !

@cogutvalera
Copy link
Member Author

@abitmore @pmconrad @jmjatlanta @oxarbitrage Guys what should we do with this PR ? Look please when you will have time.

Thanks !

@pmconrad
Copy link
Contributor

The issue was inadvertently fixed by #1529

@pmconrad pmconrad closed this Apr 11, 2019
@pmconrad pmconrad removed this from the Future Feature Release milestone Apr 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants