AIML english corpus updation into chatterbot #516
],
[
"FATHER",
"My father is Gunter Cox"
I'm not sure I feel comfortable with my name being in the data. Could you please remove it?
Sure. I don't understand the reason, though.
Thank you. The reason is simply that I want ChatterBot to be useful to as many developers as possible. So far, I've made sure that all the data in the training corpus is relatively generic. I feel like this is too specific. Also, keep in mind that ChatterBot is a tool for creating chat bots, it is not a chat bot itself so it shouldn't have an identity.
@gunthercox any comments or suggestions on this PR? I am almost done except for the knowledge corpus; I'll make another PR for it soon.
@gunthercox I will re-submit the knowledge corpus some other time. Any comments or suggestions?
I've left a few comments on parts of the files that appear to have issues. Hopefully this feedback is helpful. Let me know if you have any questions.
[
"JOKE",
"Did you hear the one about the Mountain Goats in the Andes? It was Ba a a a a a d.",
"I never forget a face, but in your case I'll make an exception.",
I don't believe the format of this file would train an instance of ChatterBot properly. Each of these strings is a single statement, but none of them are related to each other. ChatterBot's corpus trainer expects the list to represent a conversation.
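To illustrate the point about conversations, here is a minimal sketch in plain Python. It does not call ChatterBot itself; it only assumes that each statement in a list is treated as a reply to the statement before it. The helper name `response_pairs` and the `greeting` example are hypothetical and not part of this pull request.

```python
def response_pairs(conversation):
    """Pair each statement with the statement that directly follows it."""
    return list(zip(conversation, conversation[1:]))


# The list from the diff above: unrelated jokes preceded by a label.
jokes = [
    "JOKE",
    "Did you hear the one about the Mountain Goats in the Andes? It was Ba a a a a a d.",
    "I never forget a face, but in your case I'll make an exception.",
]

# A hypothetical list in which each statement answers the previous one.
greeting = ["Hello", "Hi, how are you?", "I am doing well."]

for name, conversation in (("jokes", jokes), ("greeting", greeting)):
    print(name)
    for prompt, reply in response_pairs(conversation):
        print("  ", repr(prompt), "->", repr(reply))
```

Under that assumption, the jokes list would teach the second joke as a reply to the first joke, and the first joke as a reply to the word "JOKE", which is probably not what was intended.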
"gossip": [
[
"GOSSIP",
A number of these files have this extra single word at the top, it looks like it might be a duplicate of the outer corpus label. Either way, it is not a valid part of a conversation.
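One possible way to clean this up in bulk is sketched below. It assumes a corpus file is a JSON object mapping a category name to a list of conversations, as in the fragment above, and it only drops a first statement that matches the category name exactly (ignoring case). The function name `strip_leading_label` and the sample statements are hypothetical.

```python
def strip_leading_label(corpus):
    """Return a copy of the corpus without leading statements that repeat the category name."""
    cleaned = {}
    for category, conversations in corpus.items():
        cleaned[category] = [
            # Drop the first statement only when it equals the category label.
            conv[1:] if conv and conv[0].strip().lower() == category.lower() else conv
            for conv in conversations
        ]
    return cleaned


sample = {
    "gossip": [
        ["GOSSIP", "Do you know any gossip?", "No, I do not."],
        ["Tell me some gossip", "I have none to share."],
    ]
}

print(strip_leading_label(sample))
```

Labels that differ from the category name (a singular form, for example) would not be caught by this check and would still need a manual pass.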
"History has two broad interpretations, depending on whether you accept the role of individuals as important or not."
],
[
"WHO INVENTED THE LIGHT",
Proper sentence case is preferred in the corpus files.
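A rough way to find statements that need attention is sketched below. It only flags candidates for manual review rather than rewriting them, since proper nouns cannot be re-capitalized automatically. The function names and the sample reply are hypothetical, and the same category-to-conversations layout as in the diff is assumed.

```python
def needs_case_review(statement):
    """Flag statements written entirely in capitals or starting with a lowercase letter."""
    letters = [c for c in statement if c.isalpha()]
    all_caps = len(letters) > 1 and all(c.isupper() for c in letters)
    first_letter = letters[0] if letters else ""
    return all_caps or first_letter.islower()


def flag_case_issues(corpus):
    """Collect (category, statement) pairs that are not in sentence case."""
    return [
        (category, statement)
        for category, conversations in corpus.items()
        for conversation in conversations
        for statement in conversation
        if needs_case_review(statement)
    ]


sample = {"history": [["WHO INVENTED THE LIGHT", "A placeholder reply."]]}
print(flag_case_issues(sample))
```

Short acronyms used as a whole statement would also be flagged, so the output is a review list, not a list of definite errors.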
I should have checked this sooner, but I just noticed that many of the files in this pull request appear to contain text that was directly copied from the repository at https://github.com/drwallace/aiml-en-us-foundation-alice/
The header included in each of the aiml files states:
<!-- This program is open source code released under -->
<!-- the terms of the GNU General Public License -->
<!-- as published by the Free Software Foundation. -->
The GNU General Public License is a copyleft license, which means that derivative works can only be distributed under the same license terms. ChatterBot is licensed under the BSD license, not the GPL, so legally these files cannot be redistributed in ChatterBot without explicit permission from the copyright holder.
If he agrees to the reuse, can we use this material? Is there a specific mechanism for requesting a license change?
If the owner agrees to allow the content to be released under a different license, then yes, it is safe to use. Usually it just takes an email to get in contact with the copyright owner.
@gunthercox any updates?
Because reusing text from https://github.com/drwallace/aiml-en-us-foundation-alice was authorized by @drwallace, this should be ok to merge soon. I will check over the changes to make sure everything is valid before merging it.
Overall this looks great. I've checked over the first few files and noted a few changes that need to be made. Please check over the rest of the files to make sure that there aren't similar issues to the ones I commented on.
"Artificial intelligence is the branch of engineering and science devoted to constructing machines that think."
],
[
"what language are you written",
There appears to be a missing "in" at the end of this sentence. I believe it should read:
what language are you written in
],
[
"It pays",
"No i am free of cost!!! you could start from here https://github.com/gunthercox/chatterbot"
This seems too specific to be useful to other developers. Maybe just remove the "you could start from here https://github.com/gunthercox/chatterbot" part.
{
"profile": [
[
"interests",
"interests" should probably be something like "What are your interests?"
],
[
"whats your masters email address",
"gunthercx@gmail.com"
As I mentioned in an earlier review, I would appreciate it if my name and email address were removed. I don't think this data would be useful to other developers who want to train their chat bot to communicate.
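A quick scan for email-like strings could help catch entries like this one before the next review. The sketch below uses a deliberately simple pattern that will miss unusual addresses; the names `EMAIL_PATTERN` and `find_email_like` and the sample data are hypothetical, and the category-to-conversations layout from the diff is assumed.

```python
import re

# Simple, intentionally loose pattern: something@domain.tld
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def find_email_like(corpus):
    """Collect (category, statement) pairs that contain an email-like string."""
    hits = []
    for category, conversations in corpus.items():
        for conversation in conversations:
            for statement in conversation:
                if EMAIL_PATTERN.search(statement):
                    hits.append((category, statement))
    return hits


sample = {
    "profile": [
        ["What is your email address?", "user@example.com"],
        ["What are your interests?", "I like to chat."],
    ]
}

print(find_email_like(sample))
```

Personal names cannot be found this way, so those would still need to be searched for by hand.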
"i will consume electricity"
],
[
"location",
"location" might be better represented in the form of a question such as "What is your location?"
"i don't have any brothers. but i have a lot of clones."
],
[
"father",
The statement "father" should be a full sentence. Maybe something like "Who is your father?"
"a human"
],
[
"mother",
The statement "mother" should be a full sentence. Maybe something like "Who is your mother?"
],
[
"Tell me about your dreams",
"I dream that i will become a better."
It looks like there is a noun missing at the end of this sentence. A better what?
],
[
"for dinner",
"i don't dinner menu for you"
I'm not sure what "i don't dinner menu for you" is supposed to mean. It doesn't appear to be grammatically correct.
"history": [
[
"american civil war",
"american civil war" looks like a topic, when the text here should probably be a question or something else.
af48af1 to f1aa00d
I rebased this pull request against the master branch to bring in the existing fix for the 2017 New Year's test bug.
1. Take my name out of it
2. Fix sentence capitalization
The requested changes have been made.
@gunthercox thank you very much
After looking into the repository at https://github.com/drwallace/aiml-en-us-foundation-alice, I think we need to update many conversations. I have started updating a few important ones.