[NL Interface] Making medium_ft the default #2920
Thanks Jehangir! Taking a look...
(Sorry, sent early.) Do you have the diff report between small vs. medium_ft?
Here is the diff report in the previous PR (#2908) description:
Just updated the Python test goldens. One problem will need to be fixed in the next batch of embedding refresh (this should be a TODO): in the multi-sv test for "hispanic women phd", we now get this SV to show up: Count_Death_DiseasesOfTheNervousSystem_White. Upon further investigation, it seems this SV has the following description and alternatives: 25% of hispanics live in low-poverty neighborhoods; We need to fix these sentences because the word "hispanic" should not feature here.
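The mismatch described above could be caught with a small check before the next embedding refresh. The following is only a sketch: the `SvSentences` shape and the `findSuspectSentences` helper are hypothetical and not part of the repo; the DCID and sentence are the ones quoted in the comment.

```typescript
// Hypothetical shape: one stat var DCID plus the sentences used to embed it.
interface SvSentences {
  dcid: string;
  sentences: string[];
}

// Returns the sentences that mention `term` (case-insensitive) although the
// DCID itself does not, a hint that the sentence may be misattributed.
function findSuspectSentences(svEntry: SvSentences, term: string): string[] {
  const needle = term.toLowerCase();
  if (svEntry.dcid.toLowerCase().indexOf(needle) !== -1) {
    return [];
  }
  return svEntry.sentences.filter(
    (s) => s.toLowerCase().indexOf(needle) !== -1
  );
}

// The SV and sentence quoted in the comment above.
const nervousSystemSv: SvSentences = {
  dcid: "Count_Death_DiseasesOfTheNervousSystem_White",
  sentences: ["25% of hispanics live in low-poverty neighborhoods"],
};
const suspectSentences = findSuspectSentences(nervousSystemSv, "hispanic");
```

A plain substring test like this would also flag legitimate sentences for SVs whose DCID uses a different spelling of the term, so its output is a list to review by hand, not a list to delete.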
}}
/>
Medium-5K (experimental)
Medium-FT-5K (experimental)
Call it Medium-FineTuned? Drop (experimental)?
@@ -38,6 +38,7 @@ export const NL_URL_PARAMS = {
export const NL_INDEX_VALS = {
SMALL: "small",
MEDIUM: "medium",
Drop MEDIUM?
Could you also change the checkbox default to MEDIUM_FT:
website/static/js/apps/nl_interface/app.tsx
Lines 45 to 47 in f734137
const [indexType, setIndexType] = useState(
getUrlTokenOrDefault(NL_URL_PARAMS.IDX, NL_INDEX_VALS.SMALL)
);
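As a rough illustration of the requested change, the sketch below shows how swapping the fallback to MEDIUM_FT affects URLs with and without an explicit index parameter. It makes several assumptions: the `"medium_ft"` value, the `"idx"` parameter name, and the `getTokenOrDefault` helper are hypothetical stand-ins, not the repo's actual `getUrlTokenOrDefault`.

```typescript
// SMALL and MEDIUM appear in the diff; the MEDIUM_FT value is assumed.
const NL_INDEX_VALS = {
  SMALL: "small",
  MEDIUM: "medium",
  MEDIUM_FT: "medium_ft",
} as const;

// Hypothetical URL parameter name for the index type.
const IDX_PARAM = "idx";

// Hypothetical stand-in for getUrlTokenOrDefault: returns the value of `key`
// in a "k1=v1&k2=v2" string, or `fallback` when the key is absent or empty.
function getTokenOrDefault(
  query: string,
  key: string,
  fallback: string
): string {
  for (const pair of query.replace(/^[#?]/, "").split("&")) {
    const eq = pair.indexOf("=");
    if (eq < 0) continue;
    if (decodeURIComponent(pair.slice(0, eq)) === key) {
      const value = decodeURIComponent(pair.slice(eq + 1));
      return value !== "" ? value : fallback;
    }
  }
  return fallback;
}

// No index in the URL: the fine-tuned index is selected by default.
const defaultIndex = getTokenOrDefault("", IDX_PARAM, NL_INDEX_VALS.MEDIUM_FT);
// An explicit index in the URL still wins over the default.
const explicitIndex = getTokenOrDefault(
  "#q=obesity&idx=small",
  IDX_PARAM,
  NL_INDEX_VALS.MEDIUM_FT
);
```

In app.tsx itself the change would presumably be limited to the second argument of the existing call.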
- Should we fix the long-tail SVs for the "blood pressure" query? https://screenshot.googleplex.com/3aQgF2er4bwtec2 They make 3 queries on go/dcnl-main-demo look odd.
…e in the embeddings PR
Maybe we fix the chart title for those to be a bit more readable, and then it's good.
This is resolved now.
This is also perhaps OK for now. Can we file a bug for it?
This is also fixed now!
Updated the chart title.
Opened #2964 to track.
I think this looks good!
Thanks for patiently working through this!
],
"title": "Greenhouse Gas Emissions from Agriculture in the World (${date})",
"title": "Annual Amount of Emissions: Other Energy Use, Greenhouse Gas in the World (${date})",
"type": "MAP"
TODO: we can remove CLimateTrace_OtherEnergyUse
@@ -1061,7 +1061,7 @@
"NumberOfMonths_WetBulbTemperature_35COrMore_RCP85_MeanRelativeHumidity": "Number of Months Reaching Wet Bulb Temperature based on RCP 8.5 Mean Relative Humidity",
"NumberOfMonths_WetBulbTemperature_35COrMore_RCP85_MinRelativeHumidity": "Number of Months Reaching Wet Bulb Temperature based on RCP 8.5 Min Relative Humidity",
"PalmerDroughtSeverityIndex_Atmosphere": "Palmer Drought Severity Index",
"Percent_Person_18OrMoreYears_WithHighBloodPressure_ReceivedTakingBloodPressureMedication": "Percent of Population With High Blood Pressure Who Are Taking Blood Pressure Medication Among Population Who Are 18 Years or Older in Age;",
"Percent_Person_18OrMoreYears_WithHighBloodPressure_ReceivedTakingBloodPressureMedication": "Percent of Adults Taking Medication Among Those With High Blood Pressure",
nit: Percent of Adults with Blood Pressure Taking Medication
In this PR, we have the following:
Below is the manually checked report, covering each integration test scenario, of the diffs between the old (small index embeddings) and new (medium_ft index embeddings) results as they show up on the frontend (the changes may be hard to parse from the config JSONs):
'What are the projected temperature extremes across California',
Not necessarily a good or bad thing. The downside is that the Mean/Min/Max temps are not projected.
'Where were the major fires in the last year',
'Tell me about Placer County',
'What were the most common jobs there',
'Which jobs have grown the most',
'What are the most common health issues there',
'Which counties in california have the highest levels of blood pressure',
'Which counties in the USA have the highest levels of blood pressure',
'How does this correlate with income',
'What is the meaning of life',
'How big are the public schools in Sunnyvale',
'What is the prevalence of asthma there',
'What is the commute pattern there',
'How does that compare with San Bruno',
'Which cities in the SF Bay Area have the highest larceny',
'What countries in Africa had the greatest increase in life expectancy',
'Number of Shakespeare fans in San Francisco and Chicago.',
'Crime in California and Florida',
'counties in California with highest crime',
'obesity in California',
'GDP of countries in the US',
"Poverty vs. Obesity in California",
"Poverty vs. Obesity in California and Florida",
"California cities with hispanic population over 10000",
"Prevalence of Asthma in California cities with hispanic population over 10000",
Could be improved with some finetuning.
'tell me about palo alto',
'US states which have that the cheapest houses',
'what about in florida',
'compare with california and new york state and washington state',
'show me the population of mexico city',
'counties in the US with the most poverty',
'Where are the most rural districts in India',
'Life expectancy across provinces of China',
'GDP of counties in the United Kingdom',
'Districts in Turkey with the highest fertility rate',
'Floods in Brazil',
'Drought in Africa',
'which cities in the Santa Clara County have the highest larceny?',
'household median income across tracts of Placer County',
'how many people are unemployed in zip codes of washington state?'
'tell me about poverty in africa',
'which countries have show the greatest reduction?',
'health in the world',
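A report like the one above could be partly automated by diffing the SVs each index returns per query. The sketch below assumes results are available as a map from query string to a list of SV DCIDs; the `ResultsByQuery` type, the `diffResults` helper, and the placeholder SV names are all hypothetical, and the sample data is made up purely to show the output shape.

```typescript
// Hypothetical shape: query string -> SV DCIDs returned for it.
type ResultsByQuery = { [query: string]: string[] };

// Difference in returned SVs for one query between two indexes.
interface QueryDiff {
  query: string;
  added: string[]; // only in the new index results
  removed: string[]; // only in the old index results
}

// Compares per-query SV lists and keeps only the queries that changed.
function diffResults(
  oldRes: ResultsByQuery,
  newRes: ResultsByQuery
): QueryDiff[] {
  const oldKeys = Object.keys(oldRes);
  const queries = oldKeys.concat(
    Object.keys(newRes).filter((q) => oldKeys.indexOf(q) === -1)
  );
  const diffs: QueryDiff[] = [];
  for (const query of queries) {
    const before = oldRes[query] || [];
    const after = newRes[query] || [];
    const added = after.filter((sv) => before.indexOf(sv) === -1);
    const removed = before.filter((sv) => after.indexOf(sv) === -1);
    if (added.length > 0 || removed.length > 0) {
      diffs.push({ query, added, removed });
    }
  }
  return diffs;
}

// Made-up sample data with placeholder SV names, not real test output.
const oldSample: ResultsByQuery = {
  "obesity in California": ["SV_A", "SV_B"],
  "Floods in Brazil": ["SV_D"],
};
const newSample: ResultsByQuery = {
  "obesity in California": ["SV_B", "SV_C"],
  "Floods in Brazil": ["SV_D"],
};
const sampleDiffs = diffResults(oldSample, newSample);
```

This only compares set membership, so ranking changes within a query would still need the manual pass.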