Meet Sage! Sage tries to guess what you're thinking by asking a series of questions. It is flexible and can be customized to work with any set of questions and answers. Think of the possibilities! You could set it up to guess your favourite fictional character, act as customer support for common problems, or diagnose medical conditions based on symptoms. You provide it with a database of answers and the questions it can ask the user; based on these, it decides through a series of questions which answer fits best. That's right, I said "best-fitting" answer, because it takes into account mistakes in the user's responses. Sound like magic? Sort of, plus some beautiful mathematics. For now, let's look at a demo and see how to use it ourselves.
You only need two things. First, SageCore.h
which can be found in the repo by navigating to Sage\Sage\include\
from the repo root directory. Second, you will need to statically link Sage to your project.
If you don't know how to do that, I recommend this excellent video by the Cherno, which will take you through the entire process in Visual Studio. I have made
a release with the .lib files.
Sage is very easy to use; the hard part is getting the data! Sage is all contained in a C++ class called SageCore. You can include it using
#include "SageCore.h"
SageCore contains the brain of Sage, but you will need to supply it with the questions and answers it needs. To do this, extend the SageCore class and define the virtual
functions std::string LookupQuestionPrompt(size_t question_j)
and float LookupAnswer(size_t character_i, size_t question_j)
:
class Sage : public SageCore {
public:
using SageCore::SageCore;
// Implement the pure virtual function
float LookupAnswer(size_t character_i, size_t question_j) override {
// Logic to determine answer to question_j for character_i
return ans;
}
// Implement the pure virtual function
std::string LookupQuestionPrompt(size_t question_j) override {
// Logic for finding the text corresponding to question_j
return question_j_prompt;
}
};
What just happened? Nothing much: we just built a new class called Sage on top of SageCore. Occasionally, while Sage is running, it will want to ask the user a question to
get more information. It does so by calling the method std::string LookupQuestionPrompt(size_t question_j)
with the number question_j as its argument. Let's say
question_j=5, so Sage is asking for the prompt of question 5 (0-indexed) to show to the user. The function might then return the string "Is your character tall?".
This function is called only once each time the user is asked a question. However, float LookupAnswer(size_t character_i, size_t question_j) will be called much
more often!
float LookupAnswer(size_t character_i, size_t question_j) is called whenever Sage wants to know the answer to question_j about character_i. For example, if character_i=2 and question_j=5, and character_i=2 represents Robert Borone, then Sage wants to know whether Robert Borone is tall. We want to say yes, so we return 1.0f. To say no we would return 0.0f. We are also free to return anything in between, like 0.5f, which corresponds to "I don't know". That's it! You are now ready to start Sage.
You do this by specifying the set of responses available to the user and the degree of certainty each one corresponds to. Here is an example:
std::map<char, float> response_map = {
{'y', 0.8f},
{'n', 0.2f},
{'p', 0.65f},
{'u', 0.35f},
};
This means the possible options per question are 'y' (yes), 'n' (no), 'p' (probably), and 'u' (unlikely). Why didn't I set yes to 1.0f, for example? For a deeper dive check out sec::responses. For now, suffice it to say that if a user says yes or no we should not trust them 100%. If we did, a user answering "no" to the question "Is Robert tall?" would eliminate Robert entirely, even though Robert may be the right answer and the user simply made a mistake. With a value like 0.2f for "no", Robert is still suppressed as a possible answer, but he is not ruled out.
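To make the difference between suppressing and eliminating a character concrete, here is a minimal sketch of a single Bayesian update. The likelihood formula and the names Likelihood and Update are assumptions made for illustration; they are not taken from SageCore, whose internals may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical likelihood model (an assumption, not SageCore's code):
// `truth` is the stored answer for a character in [0, 1] and `response`
// is the response_map value for the key the user pressed.
float Likelihood(float truth, float response) {
    return response * truth + (1.0f - response) * (1.0f - truth);
}

// One Bayesian update: scale each character's weight by the likelihood of
// the response, then renormalise so the weights sum to 1 again.
std::vector<float> Update(std::vector<float> weights,
                          const std::vector<float>& truths,
                          float response) {
    assert(weights.size() == truths.size());
    float total = 0.0f;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        weights[i] *= Likelihood(truths[i], response);
        total += weights[i];
    }
    if (total > 0.0f) {
        for (float& w : weights) w /= total;
    }
    return weights;
}
```

Under this model, take three equally weighted characters where only the first one is tall, so truths = {1.0f, 0.0f, 0.0f}. A "no" mapped to 0.2f leaves the first character with a weight of roughly 0.11, so a later answer can still bring it back. A "no" mapped to 0.0f sets its weight to exactly zero, and no later answer can recover it.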
Let's do an example to get a better sense.
- Define your set of characters
std::vector<std::string> characters = {
"Robert Borone",
"Buzz Lightyear",
"Phoebe Buffay"
};
- Define a set of questions
std::vector<std::string> questions = {
"Is your character a toy?",
"Is your character known for their space adventures?",
"Does your character play a musical instrument?",
"Is your character a human?",
"Does your character have a catchphrase?",
"Is your character tall?"
};
- Define the answers to each question for each character
std::vector<std::vector<float>> true_answers = {
{0.0f, 0.0f, 0.0f, 1.0f, 0.0f, 1.0f}, // Robert Borone
{1.0f, 1.0f, 0.0f, 0.5f, 1.0f, 0.0f}, // Buzz Lightyear
{0.0f, 0.0f, 1.0f, 1.0f, 0.35f, 0.0f} // Phoebe Buffay
};
Here each row corresponds to a particular character and each column corresponds to a question. The order of the questions is the same as the order we wrote them in step 2. Notice that for Buzz Lightyear it's unclear whether he is a human, since he is also a toy, so I put 0.5f for "I don't know".
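Because the table is indexed by position, a row with a missing column or a value outside the 0.0f to 1.0f range is easy to introduce by accident. The helper below is a small sketch for catching that before Sage starts; IsValidAnswerTable is a hypothetical name and is not part of SageCore.

```cpp
#include <cassert>  // for the assert shown in the usage note
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SageCore): returns true when the table has
// one row per character, one column per question, and every entry in [0, 1].
bool IsValidAnswerTable(const std::vector<std::vector<float>>& table,
                        std::size_t num_characters,
                        std::size_t num_questions) {
    if (table.size() != num_characters) return false;
    for (const auto& row : table) {
        if (row.size() != num_questions) return false;
        for (float value : row) {
            if (value < 0.0f || value > 1.0f) return false;
        }
    }
    return true;
}
```

One way to use it would be assert(IsValidAnswerTable(true_answers, characters.size(), questions.size())); just before constructing Sage.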
- Extend SageCore
class Sage : public SageCore {
public:
using SageCore::SageCore;
// Implement the pure virtual function
float LookupAnswer(size_t character_i, size_t question_j) override {
// Logic to determine answer to question_j for character_i
return ans;
}
// Implement the pure virtual function
std::string LookupQuestionPrompt(size_t question_j) override {
// Logic for finding the text corresponding to question_j
return question_j_prompt;
}
};
- Implement
std::string LookupQuestionPrompt(size_t question_j)
std::string LookupQuestionPrompt(size_t question_j) override {
return questions[question_j];
}
- Implement
float LookupAnswer(size_t character_i, size_t question_j)
float LookupAnswer(size_t character_i, size_t question_j) override {
return true_answers[character_i][question_j];
}
- Define allowed user responses
std::map<char, float> response_map = {
{'y', 0.9f},
{'n', 0.1f},
{'p', 0.65f},
{'u', 0.35f},
};
- Instantiate Sage and start it up
Sage sage(characters.size(), questions.size(), response_map);
while (sage.RefineGuess());
std::cout << characters[sage.TopGuess()];
Done! Here is the full script:
#include <iostream>
#include <vector>
#include <array>
#include <map>
#include <string>
#include "SageCore.h"
const unsigned int number_of_questions = 6;
std::vector<std::array<float, number_of_questions>> true_answers;
std::vector<std::string> questions;
class Sage : public SageCore {
public:
using SageCore::SageCore;
// Implement the pure virtual function
float LookupAnswer(size_t character_i, size_t question_j) override {
return true_answers[character_i][question_j];
}
// Implement the pure virtual function
std::string LookupQuestionPrompt(size_t question_j) override {
return questions[question_j];
}
};
int main()
{
std::vector<std::string> characters = {
"Robert Borone",
"Buzz Lightyear",
"Phoebe Buffay"
};
questions = {
"Is your character a toy?",
"Is your character known for their space adventures?",
"Does your character play a musical instrument?",
"Is your character a human?",
"Does your character have a catchphrase?",
"Is your character tall?"
};
true_answers = {
{0.0f, 0.0f, 0.0f, 1.0f, 0.0f, 1.0f}, // Robert Borone (from "Everybody Loves Raymond")
{1.0f, 1.0f, 0.0f, 0.5f, 1.0f, 0.0f}, // Buzz Lightyear
{0.0f, 0.0f, 1.0f, 1.0f, 0.35f, 0.3f} // Phoebe Buffay
};
std::map<char, float> response_map = {
{'y', 0.9f},
{'n', 0.1f},
{'p', 0.65f},
{'u', 0.35f},
};
Sage sage(characters.size(), questions.size(), response_map);
while (sage.RefineGuess());
std::cout << characters[sage.TopGuess()];
}
When Sage asks a question it has already asked, it is doing something humans might do too. When you want to be sure of an answer, you pose the question again. If you receive a second confident answer, you are more likely to trust it. Sage is doing the same. That said, I've programmed Sage to never ask the same question twice in a row. There are two ways to reduce repeated questions:
- Allow Sage to trust the response it receives from the user more, i.e. change the response map so that yes corresponds to 0.9f rather than 0.7f, for instance. If you want Sage to never ask a repeat question, set the response map as follows:
std::map<char, float> response_map = {
{'y', 1.0f},
{'n', 0.0f},
};
The downside, of course, is that if the user gives an incorrect response, Sage will never reconsider.
- Have more questions of a greater variety. The best types of questions are those where the user is as likely to respond in the affirmative as in the negative, regardless of the character the user has chosen. In other words, a more varied set of questions allows Sage to explore more nuanced paths.
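As a rough way to check how evenly a question splits your characters, you could score each column of the answer table with the binary entropy of its average answer. This is only a sketch under the assumption that every character is equally likely; BinaryEntropy and QuestionBalance are hypothetical helpers, not part of SageCore, and Sage's own question selection may work differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Binary entropy in bits: 1.0 when yes and no are equally likely,
// 0.0 when the answer is certain either way.
float BinaryEntropy(float p) {
    if (p <= 0.0f || p >= 1.0f) return 0.0f;
    return -p * std::log2(p) - (1.0f - p) * std::log2(1.0f - p);
}

// Hypothetical scoring helper: averages column question_j over all rows
// (characters weighted equally) and returns the entropy of that average.
float QuestionBalance(const std::vector<std::vector<float>>& table,
                      std::size_t question_j) {
    assert(!table.empty());
    float p_yes = 0.0f;
    for (const auto& row : table) {
        assert(question_j < row.size());
        p_yes += row[question_j];
    }
    p_yes /= static_cast<float>(table.size());
    return BinaryEntropy(p_yes);
}
```

A score near 1.0 means the question divides the characters about evenly, while a score near 0.0 means almost every character gives the same answer and the question tells Sage very little. In the three-character example, "Is your character a toy?" scores higher than "Is your character a human?" by this measure.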
Sage uses very elegant concepts from probability and information theory to update its guess. Using this approach, as opposed to binary trees for example, permits the user to occasionally answer a question incorrectly. How impactful a mistake is depends on the response_map chosen.
Let's assume we are performing an experiment where the user plays our guessing game. The outcome of this experiment is given by a Cartesian product of all possible characters the user could have chosen from the available options, the questions asked and the answers given, i.e.
where
As we ask questions and receive responses we would like to update our guess, which we can achieve using Bayes' Theorem, given by
We can use this result to ask, given the response of the
This is exactly what we need to update our beliefs! We begin by defining priors. By default Sage gives each character the same initial weight; this can be modified by specifying a list of priors when instantiating the class. Each question and response then refines Sage's guess.
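For reference, the update described above can be written in the standard form of Bayes' theorem with the denominator expanded over all characters. The symbols here are chosen for illustration and are assumptions: $c_i$ is the event that the user picked character $i$ out of $N$, and $r$ is the response just received.

```latex
P(c_i \mid r) = \frac{P(r \mid c_i)\, P(c_i)}{\sum_{k=1}^{N} P(r \mid c_k)\, P(c_k)},
\qquad
P(c_i) = \frac{1}{N} \quad \text{(default uniform prior)}
```

After each response the left-hand side becomes the prior for the next question, which is how repeated questions and answers gradually sharpen the guess.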
Sage aims to ask questions which minimize the entropy of the random variable,
If we ask question
where