add sqlite pragma settings to sql_storage.py - change distinct to aggregate in mongodb.py #916

Merged: 18 commits into gunthercox:master, Aug 11, 2017

lesleslie
Contributor

This commit is in response to #873. It allows SQLite pragma settings to be set, and it applies

PRAGMA journal_mode=WAL
PRAGMA synchronous=NORMAL

when SQLite is used as the database. These settings should speed things up while still maintaining data integrity.

See #873 for more details.
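
For readers unfamiliar with these two pragmas, here is a minimal sketch using the standard-library sqlite3 module. It is not the code added to sql_storage.py; the helper name `connect_with_pragmas` and the temporary database path are assumptions made only for illustration.

```python
import os
import sqlite3
import tempfile


def connect_with_pragmas(path):
    """Open a SQLite database and apply the two pragmas named in this PR."""
    connection = sqlite3.connect(path)
    # WAL lets readers keep working while a writer is active. The journal
    # mode is stored in the database file, so it persists once set.
    connection.execute("PRAGMA journal_mode=WAL")
    # NORMAL syncs to disk less often than FULL. Unlike journal_mode, this
    # setting applies per connection and must be set on each new one.
    connection.execute("PRAGMA synchronous=NORMAL")
    return connection


# Hypothetical database file, created in a temp directory for the demo.
db_path = os.path.join(tempfile.mkdtemp(), "example.sqlite3")
connection = connect_with_pragmas(db_path)

# Read the settings back to confirm they took effect.
journal_mode = connection.execute("PRAGMA journal_mode").fetchone()[0]
synchronous = connection.execute("PRAGMA synchronous").fetchone()[0]
print(journal_mode, synchronous)
connection.close()
```

WAL needs a file-backed database; an in-memory database reports its own journal mode instead, which is why the sketch writes to a temporary file.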

@lesleslie
Contributor Author

Also changes default_session.id to default_conversation_id in the examples learning_feedback_example.py and learning_new_response.py.

Those examples broke somewhere between 7.4 and 7.6.

@lesleslie
Contributor Author

lesleslie commented Aug 9, 2017

Changes distinct to aggregate for response_query in get_response_statements in mongodb.py.

This probably closes #747, #697, #686.

I didn't actually time it, but the query now seems to run at least 4x faster on an 18MB collection.

And yes, it should now be possible to get past the 16MB BSON limit (I think).
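
As background for the change described above: the `distinct` command returns all values in a single reply document, which is subject to the maximum BSON document size, while `aggregate` returns results through a cursor. A minimal sketch of the idea follows. It is not the exact query in mongodb.py; the function names and the field name passed in are assumptions, and `collection` is any object with a pymongo-style `aggregate` method.

```python
def distinct_values_pipeline(field):
    """Build an aggregation pipeline yielding each distinct value of `field`."""
    return [
        # Grouping on the field emits one output document per distinct value,
        # with that value stored under "_id".
        {"$group": {"_id": "$" + field}},
    ]


def distinct_values(collection, field):
    """Collect distinct values through aggregate() instead of distinct().

    Results are read from a cursor, so they are not packed into one
    reply document the way a distinct() result is.
    """
    pipeline = distinct_values_pipeline(field)
    return [document["_id"] for document in collection.aggregate(pipeline)]
```

With a real collection this would be called as `distinct_values(collection, "text")`, where "text" is a placeholder field name. One difference to keep in mind: `distinct` treats each element of an array field as a separate value, whereas `$group` groups on the whole array unless a `$unwind` stage is added first.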

@lesleslie changed the title from "add sqlite pragma settings" to "add sqlite pragma settings to sql_storage.py - change distinct to aggregate in mongodb.py" on Aug 9, 2017
@@ -116,7 +116,7 @@ def get_response(self, input_item, conversation_id=None):

if not self.read_only:
self.learn_response(statement, previous_statement)
- self.storage.add_to_converation(conversation_id, statement, response)
+ self.storage.add_to_conversation(conversation_id, statement, response)
Collaborator

👍

@gunthercox
Owner

@lesleslie There are some great changes here, thank you. Once the tests are passing I'll be happy to merge in your changes.

Thank you again.

@lesleslie
Contributor Author

lesleslie commented Aug 11, 2017 via email

@lesleslie
Contributor Author

Ok - all tests pass!

Sorry about the sloppy pull request and all the commits. I have a better idea of the process now. This was also my first experience with Travis-CI but I think I have a good handle on that now too.
I'll be much cleaner about my pull requests moving forward.

I will note here that increasing the sort buffer size using:

self.client.admin.command({'setParameter': 1, 'internalQueryExecMaxBlockingSortBytes': 44040192})

does not currently work for Python 2 on Linux or Python 3 on Windows.

@@ -22,7 +22,7 @@
output_adapter='chatterbot.output.TerminalAdapter'
)

- DEFAULT_SESSION_ID = bot.default_session.id
+ DEFAULT_SESSION_ID = bot.default_conversation_id
Collaborator

I think this could also be renamed to DEFAULT_CONVERSATION_ID

Contributor Author

done.

@@ -16,7 +16,7 @@

bot.train("chatterbot.corpus.english")

- DEFAULT_SESSION_ID = bot.default_session.id
+ DEFAULT_SESSION_ID = bot.default_conversation_id
Collaborator

I think this could also be renamed to DEFAULT_CONVERSATION_ID

Contributor Author

done.

@@ -92,6 +92,9 @@ def __init__(self, **kwargs):
# Use the default host and port
self.client = MongoClient(self.database_uri)

# Increase the sort buffer to 42M
self.client.admin.command({'setParameter': 1, 'internalQueryExecMaxBlockingSortBytes': 44040192})
Owner

It looks like tests are failing after this change:

pymongo.errors.OperationFailure: no such command: 'internalQueryExecMaxBlockingSortBytes', bad cmd: '{
internalQueryExecMaxBlockingSortBytes: 44040192, setParameter: 1
}'
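
The key order in the error above (internalQueryExecMaxBlockingSortBytes before setParameter) suggests one possible cause, though this is a guess that was not verified in this thread. MongoDB takes the command name from the first key of the command document, and a plain Python dict only guarantees insertion order from Python 3.7 onward. A minimal sketch that fixes the order explicitly is shown below; the function names are assumptions, and `client` is any object exposing a pymongo-style `admin.command`.

```python
from collections import OrderedDict

# 42 * 1024 * 1024 bytes, the value used in this PR.
SORT_BUFFER_BYTES = 44040192


def build_set_parameter_command(name, value):
    """Return a command document whose first key is 'setParameter'."""
    # The server reads the command name from the first key, so the
    # order of the pairs here matters.
    return OrderedDict([("setParameter", 1), (name, value)])


def increase_sort_buffer(client, size=SORT_BUFFER_BYTES):
    """Send the setParameter command through a pymongo-style client."""
    command = build_set_parameter_command(
        "internalQueryExecMaxBlockingSortBytes", size
    )
    return client.admin.command(command)
```

PyMongo's documentation for `Database.command` makes the same point about key order and recommends passing the command name as a string with the remaining fields as keyword arguments, or using a `bson.son.SON` instance, for multi-key commands.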

@gunthercox
Owner

Thank you again. Fantastic job @lesleslie putting all this together. It is greatly appreciated!

@gunthercox gunthercox merged commit 2ff9a2c into gunthercox:master Aug 11, 2017