Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NL Interface] Making medium_ft the default #2920

Merged
merged 44 commits into from
Jul 19, 2023
Merged

[NL Interface] Making medium_ft the default #2920

merged 44 commits into from
Jul 19, 2023

Conversation

jehangiramjad
Copy link
Contributor

@jehangiramjad jehangiramjad commented Jul 11, 2023

In this PR, we have the following:

  1. Making the medium_ft embeddings and the finetuned model to be the default.
  2. Updating integration tests with medium_ft embeddings (and associated finetuned model)

Below is the manually checked report (for each integration test scenario) for the diffs between the old (small index embeddings) and new (medium_ft index embeddings) for what shows up on the frontend (it may be hard to parse the changes from the config jsons):

textbox_sample:

  1. 'family earnings in california':

demo_feb2023:

  1. 'What are the projected temperature extremes across California',

    • same top few charts
    • Now we see more charts (Mean, Min, Max temperature, Heat Events etc show up as well).
      Not necessarily a good/bad thing. Bad is that the Mean/Min/Max temps are not projected.
  2. 'Where were the major fires in the last year',

    • Same chart and data
  3. 'Tell me about Placer County',

    • Same
  4. 'What were the most common jobs there',

    • Same chart and data
  5. 'Which jobs have grown the most',

    • Same top charts shown
  6. 'What are the most common health issues there',

    • Same chart and data
  7. 'Which counties in california have the highest levels of blood pressure',

  8. 'Which counties in the USA have the highest levels of blood pressure',

    • Same as above (same top chart but one more charts now being shown which isn't terrible).
  9. 'How does this correlate with income',

    • Same top correlation chart.
    • Due to the second chart in steps above, there are more correlation charts shown in for medium_ft but with the limit on the number of charts shown, the top four appear to be slightly different between small vs medium_ft.
  10. 'What is the meaning of life',

    • Both show the same thing: no result with no place in query message.

demo2_cities_feb2023:

  1. 'How big are the public schools in Sunnyvale',

    • Same rankings
  2. 'What is the prevalence of asthma there',

  3. 'What is the commute pattern there',

    • Same chart
  4. 'How does that compare with San Bruno',

    • Same chart
  5. 'Which cities in the SF Bay Area have the highest larceny',

    • Same charts (which is an improvement compared to the previous medium_ft index where other crimes were also showing up very highly).
  6. 'What countries in Africa had the greatest increase in life expectancy',

    • Same top few charts
    • Improvement: in the small index there are two charts about Death per Capita which are not extremely relevant. This is also an improvement over the old medium_ft index which was showing mortality SVs (this is due to the fine-tuning to disassociate death with life expectancy).

demo_fallback:

  1. 'Number of Shakespeare fans in San Francisco and Chicago.',

    • Same chart
  2. 'Crime in California and Florida',

    • Pretty much the same charts (one chart re-ordering): small has SV about "rape" show up but medium_ft does not.
  3. 'counties in California with highest crime',

    • Same behavior in both.
    • Nothing comes up for "counties" and the suggestion about using California produces the same starts (same as above).
    • Note that "cities in California..." Works.
  4. 'obesity in California',

    • Same as above.
    • Improvement: SV about overweight population among 4 years or younger shows up now.
  5. 'GDP of countries in the US',

    • Top charts are the same and except for some re-ordering below, the charts are mostly the same.

multisv:

  1. "Poverty vs. Obesity in California",

    • Same chart
  2. "Poverty vs. Obesity in California and Florida",

    • Same top chart
    • Different charts and ordering after that in small vs medium_ft
    • Improvement: the small index brings up many "above poverty" SVs whereas the medium_ft now only brings up one.
    • Small matches these SVs: https://screenshot.googleplex.com/6Fqomb2amuwjFUL
    • Medium_ft matches these SVs: https://screenshot.googleplex.com/4yUkYbhpnoqCdq2
    • Improvement: small index matches many races with poverty but they mostly disappear from the medium_ft
    • Improvement over previous medium_ft index: "Count_Person_PovertyStatusDetermined" is not showing (due to fine tuning).
  3. "California cities with hispanic population over 10000",

    • Same chart
  4. "Prevalence of Asthma in California cities with hispanic population over 10000",

    • Same top chart
    • Both now show just one chart (whereas previously medium_ft was showing more).
    • One thing to potentially fix is that even though we don't see the chart, the top there SVs matched in medium_ft also have race info (Count_Death_DiseasesOfTheRespiratorySystem_BlackOrAfricanAmerican and Percent_Person_18To64Years_NoHealthInsurance_HispanicOrLatino).
      Could be improved with some finetuning.

demo_climatetrace:

  1. Which countries emit the most greenhouse gases?
    • Same top chart.
    • Not great: Similar SVs and charts but different orderings with medium_ft bringing up many more "emissions" SVS which is not great.

place_detection_e2e:

  1. 'tell me about palo alto',

    • same chart
  2. 'US states which have that the cheapest houses',

  3. 'what about in florida',

    • Similar as above
  4. 'compare with california and new york state and washington state',

    • Similar as above
  5. 'show me the population of mexico city',

    • Same top charts
    • Female and Male population charts/SVs no longer show in medium_ft (which is OK but not ideal)
  6. 'counties in the US with the most poverty',

    • Same top chart
    • Different charts and ordering after that in small vs medium_ft
    • Improvement: the small index brings up many "above poverty" SVs whereas the medium_ft now only brings up one.
    • Improvement: small index matches many races with poverty but they mostly disappear from the medium_ft
    • Improvement over previous medium_ft index: "Count_Person_PovertyStatusDetermined" is not showing (due to fine tuning).

international:

  1. 'Where are the most rural districts in India',

  2. 'Life expectancy across provinces of China',

    • Same top 3 charts
    • Improvement: SV about deaths drop out in medium_ft (due to finetuning).
  3. 'GDP of counties in the United Kingdom',

    • Smart charts at top
    • Medium_ft now also showing the PerCapita SV (and then taking the per capita of that SV which is bad)
  4. 'Districts in Turkey with the highest fertility rate',

  5. 'Floods in Brazil',

    • Same chart
  6. 'Drought in Africa',

    • Same chart

usa_map_types:

  1. 'which cities in the Santa Clara County have the highest larceny?',

      • Same charts (which is an improvement compared to the previous medium_ft index where other crimes were also showing up very highly).
  2. 'household median income across tracts of Placer County',

  3. 'how many people are unemployed in zip codes of washington state?'

    • Same charts
    • Improvement over previous medium_ft where extra unnecessary charts were also showing up.

sdg:

  1. 'tell me about poverty in africa',

    • Same charts
  2. 'which countries have show the greatest reduction?',

    • Same charts
  3. 'health in the world',

    • Same charts

@jehangiramjad jehangiramjad requested a review from pradh July 11, 2023 21:06
@pradh
Copy link
Contributor

pradh commented Jul 12, 2023

Thanks Jehangir! Taking a look...

@pradh
Copy link
Contributor

pradh commented Jul 12, 2023

Thanks Jehangir! Taking a look...

(Sorry sent early) Do you have the diff report between small vs. medium_ft?

@jehangiramjad
Copy link
Contributor Author

Here is the differ report in the previous PR (#2908) description:

  1. Linked is the differ report (html) and spreadsheet (link) which show the comparison vs the currently deployed "small" index and also makes several comments to the differences compared to a "medium" index which did not use the finetuned model.

@jehangiramjad
Copy link
Contributor Author

Just updated the Python test goldens. One problem which will need to be fixed in the next batch of embedding refresh is this (should will be a TODO):

In the multi-sv test for "hispanic women phd", we now get this SV to show up:

Count_Death_DiseasesOfTheNervousSystem_White

Upon further investigation, seems like this SV has the following description and alternatives:

25% of hispanics live in low-poverty neighborhoods;
Count of Mortality Event: G00-G98 (Diseases of The Nervous System), White;
Hispanic Population With Live in Low Poverty Neighborhood,25th Percentile;
a quarter of hispanics live in low-poverty neighborhoods;
about one-fourth of hispanics live in low-poverty neighborhoods;
one in four hispanics lives in a low-poverty neighborhood

We need to fix these sentences because the word "hispanic" should not feature here.

}}
/>
Medium-5K (experimental)
Medium-FT-5K (experimental)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call it Medium-FineTuned ? Drop (experimental)??

@@ -38,6 +38,7 @@ export const NL_URL_PARAMS = {
export const NL_INDEX_VALS = {
SMALL: "small",
MEDIUM: "medium",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop MEDIUM?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also change the checkbox default to MEDIUM_FT:

const [indexType, setIndexType] = useState(
getUrlTokenOrDefault(NL_URL_PARAMS.IDX, NL_INDEX_VALS.SMALL)
);

Copy link
Contributor

@pradh pradh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Should we fix the long-tail SVs for "blood pressure" query? https://screenshot.googleplex.com/3aQgF2er4bwtec2 That makes 3 queries on the go/dcnl-main-demo look odd

@jehangiramjad jehangiramjad requested a review from pradh July 18, 2023 22:06
@pradh
Copy link
Contributor

pradh commented Jul 19, 2023

  1. Should we fix the long-tail SVs for "blood pressure" query? https://screenshot.googleplex.com/3aQgF2er4bwtec2 That makes 3 queries on the go/dcnl-main-demo look odd

Maybe we fix the chart title for those to be a bit more readable and then its good.

@pradh
Copy link
Contributor

pradh commented Jul 19, 2023

  1. Slightly wonder if we can do something about the climate trace GHG query? Because that is a very common query and dc/topic/GHGEmissionsBySource has too many SVs that push the main one out of the screen: https://screenshot.googleplex.com/C5b8nF4xjEaF85J

This is resolved now.

@pradh
Copy link
Contributor

pradh commented Jul 19, 2023

  1. In go/dcnl-ct-demo, with medium_ft, the [How does that compare with russia?] page has many more charts (https://screenshot.googleplex.com/9DqPsPxG9ZasUMt) than the [top sources of co2 emissions in USA] page. That's because the "ranking-across-vars" doesn't get inherited thru context. We can improve on the fullfillment side.

This is also perhaps OK for now. We can have a bug?

@pradh
Copy link
Contributor

pradh commented Jul 19, 2023

This is also fixed now!

@jehangiramjad
Copy link
Contributor Author

  1. Should we fix the long-tail SVs for "blood pressure" query? https://screenshot.googleplex.com/3aQgF2er4bwtec2 That makes 3 queries on the go/dcnl-main-demo look odd

Maybe we fix the chart title for those to be a bit more readable and then its good.

Updated the chart title.

@jehangiramjad
Copy link
Contributor Author

3. In go/dcnl-ct-demo, with medium_ft, the [How does that compare with russia?] page has many more charts (https://screenshot.googleplex.com/9DqPsPxG9ZasUMt) than the [top sources of co2 emissions in USA] page. That's because the "ranking-across-vars" doesn't get inherited thru context. We can improve on the fullfillment side.

Opened #2964 to track

Copy link
Contributor

@pradh pradh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good!

Thanks for patiently working through this!

],
"title": "Greenhouse Gas Emissions from Agriculture in the World (${date})",
"title": "Annual Amount of Emissions: Other Energy Use, Greenhouse Gas in the World (${date})",
"type": "MAP"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: we can remove CLimateTrace_OtherEnergyUse

@@ -1061,7 +1061,7 @@
"NumberOfMonths_WetBulbTemperature_35COrMore_RCP85_MeanRelativeHumidity": "Number of Months Reaching Wet Bulb Temperature based on RCP 8.5 Mean Relative Humidity",
"NumberOfMonths_WetBulbTemperature_35COrMore_RCP85_MinRelativeHumidity": "Number of Months Reaching Wet Bulb Temperature based on RCP 8.5 Min Relative Humidity",
"PalmerDroughtSeverityIndex_Atmosphere": "Palmer Drought Severity Index",
"Percent_Person_18OrMoreYears_WithHighBloodPressure_ReceivedTakingBloodPressureMedication": "Percent of Population With High Blood Pressure Who Are Taking Blood Pressure Medication Among Population Who Are 18 Years or Older in Age;",
"Percent_Person_18OrMoreYears_WithHighBloodPressure_ReceivedTakingBloodPressureMedication": "Percent of Adults Taking Medication Among Those With High Blood Pressure",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Percent of Adults with Blood Pressure Taking Medication

@jehangiramjad jehangiramjad merged commit f2779f5 into datacommonsorg:master Jul 19, 2023
9 checks passed
@jehangiramjad jehangiramjad deleted the embeddings branch July 19, 2023 06:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants