Classify search engine results page content #6000

Closed
jsecretan opened this issue Sep 12, 2019 · 7 comments · Fixed by brave/brave-core#13522

Comments


jsecretan commented Sep 12, 2019

No description provided.

@jsecretan jsecretan added bug priority/P3 The next thing for us to work on. It'll ride the trains. feature/ads labels Sep 12, 2019
@tmancey tmancey changed the title SERP pages are not being classified in the ads user model Fixes SERP pages are not being classified in the ads user model Sep 23, 2019
@tmancey tmancey added the QA/Yes label Sep 23, 2019
@tmancey tmancey added this to the 0.72.x - Nightly milestone Sep 23, 2019
@jsecretan jsecretan changed the title Fixes SERP pages are not being classified in the ads user model SERP pages are not being classified in the ads user model Oct 1, 2019
@NejcZdovc NejcZdovc removed this from the 1.2.x - Dev milestone Nov 22, 2019
@tmancey tmancey changed the title SERP pages are not being classified in the ads user model Search engine result pages are not being classified Mar 23, 2020
@tmancey tmancey self-assigned this May 29, 2022
@tmancey tmancey added priority/P3 The next thing for us to work on. It'll ride the trains. QA/Yes release-notes/exclude labels May 29, 2022
@tmancey tmancey added the OS/Android Fixes related to Android browser functionality label May 31, 2022
@tmancey tmancey added this to the 1.41.x - Nightly milestone May 31, 2022
@LaurenWags LaurenWags added the QA/In-Progress Indicates that QA is currently in progress for that particular issue label Jun 29, 2022
LaurenWags (Member) commented Jun 29, 2022

Verified with

Brave | 1.41.82 Chromium: 103.0.5060.66 (Official Build) beta (x86_64)
-- | --
Revision | 20b1569438a85e631d15e83eb355e3e326e5da6f-refs/branch-heads/5060@{#1066}
OS | macOS Version 12.4 (Build 21F79)

Verified test plan from brave/brave-core#13522 for a selection of the sites listed (not all).

https://github.com/ - PASSED
  • Using 1.40.107
    • Launched a clean profile and enabled ads.
    • Waited for all items to complete and components to download.
    • Did a search for "ruby".
    • Confirmed no mention of "Classified text with the top segment as..." after performing this search.
    • Did see the below in my logs:
[79627:259:0629/123826.460466:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification
  • Repeated the above using 1.41.82.
    • After searching github.com for "ruby", confirmed the below shows in my logs:
[77166:259:0629/101652.316679:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as technology & computing-programming
https://amazon.com/ - PASSED
  • Using 1.40.107
    • Launched a clean profile and enabled ads.
    • Waited for all items to complete and components to download.
    • Did a search for "espresso machine".
    • Saw mention of "Classified text with the top segment as..." after performing this search:
[78961:259:0629/122141.826300:VERBOSE1:text_classification_processor.cc(65)] Classified text with the top segment as home-appliances
  • Repeated the above using 1.41.82.
    • After searching amazon.com for "espresso machine", confirmed the below shows in my logs:
[79039:259:0629/122241.965933:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as home-appliances
https://www.semanticscholar.org/ - PASSED
  • Using 1.40.107
    • Launched a clean profile and enabled ads.
    • Waited for all items to complete and components to download.
    • Did a search for "economic growth".
    • Confirmed no mention of "Classified text with the top segment as..." after performing this search.
    • Did see the below in my logs:
[79287:259:0629/123445.689539:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification
  • Repeated the above using 1.41.82.
    • After searching semanticscholar.org for "economic growth", confirmed the below shows in my logs:
[79488:259:0629/123736.755781:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as business-business
https://www.webcrawler.com/ - PASSED
  • Using 1.40.107
    • Launched a clean profile and enabled ads.
    • Waited for all items to complete and components to download.
    • Did a search for "monstera".
    • Saw mention of "Classified text with the top segment as..." after performing this search:
[79883:259:0629/124430.419159:VERBOSE1:text_classification_processor.cc(65)] Classified text with the top segment as home-garden
  • Repeated the above using 1.41.82.
    • After searching webcrawler.com for "monstera", confirmed the below shows in my logs:
[79972:259:0629/124631.194934:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as home-garden
https://www.amazon.co.uk/ - PASSED
  • Using 1.40.107
    • Launched a clean profile and enabled ads.
    • Waited for all items to complete and components to download.
    • Did a search for "anker charger".
    • Saw mention of "Classified text with the top segment as..." after performing this search:
[85368:259:0629/155720.006724:VERBOSE1:text_classification_processor.cc(65)] Classified text with the top segment as home-appliances
  • Repeated the above using 1.41.82.
    • After searching amazon.co.uk for "anker charger", confirmed the below shows in my logs:
[85565:259:0629/160111.840785:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as cell phones-cell phones

@LaurenWags LaurenWags added QA Pass-macOS and removed QA/In-Progress Indicates that QA is currently in progress for that particular issue labels Jun 29, 2022

stephendonner commented Jun 30, 2022

Verification PASSED using

Brave | 1.41.86 Chromium: 103.0.5060.66 (Official Build) beta (64-bit)
-- | --
Revision | 20b1569438a85e631d15e83eb355e3e326e5da6f-refs/branch-heads/5060@{#1066}
OS | Linux

Shared Steps:

  1. installed 1.41.86
  2. launched Brave
  3. loaded the base URL noted, for each
  4. searched for the quoted term
  5. checked console logs for text-classification snippet
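The log check in step 5 can be scripted. Below is a minimal sketch, assuming the console output has been saved as text and that the only two messages of interest are the ones quoted in this thread. The function name `classify_log_line` and the regular expression are illustrative assumptions, not part of Brave's tooling.

```python
import re

# Matches the desktop verbose-log prefix seen in this thread, for example:
# [14463:14463:0630/144939.553472:VERBOSE1:text_classification_processor.cc(64)] message
LOG_LINE = re.compile(
    r"^\[[^\]]*:VERBOSE\d+:(?P<file>[\w.]+)\((?P<line>\d+)\)\]\s*(?P<msg>.*)$"
)

CLASSIFIED_PREFIX = "Classified text with the top segment as "
SKIPPED_MESSAGE = "Search engine pages are not supported for text classification"


def classify_log_line(line):
    """Return ("classified", segment), ("skipped", None), or None for other lines.

    Hypothetical helper: it only recognizes the two messages quoted in this issue.
    """
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    msg = match.group("msg")
    if msg.startswith(CLASSIFIED_PREFIX):
        # Everything after the fixed prefix is the reported top segment.
        return ("classified", msg[len(CLASSIFIED_PREFIX):])
    if msg == SKIPPED_MESSAGE:
        return ("skipped", None)
    return None
```

Running each saved log line through `classify_log_line` and keeping the non-`None` results gives a quick per-build summary of whether the page was classified or skipped.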

duckduckgo.com - "shoes"

1.40.107

[14081:14081:0630/144804.244637:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.86

[14463:14463:0630/144939.553472:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as fashion-fashion

search.brave.com - "barbie dolls"

1.40.107

[18922:18922:0630/170221.055122:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.86

[19797:19797:0630/170514.302088:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as other-other

stackoverflow.com - "python"

1.40.107

[20212:20212:0630/170859.943418:VERBOSE1:text_classification_processor.cc(65)] Classified text with the top segment as technology & computing-technology & computing

1.41.86

[21522:21522:0630/171410.003110:VERBOSE1:text_classification_processor.cc(65)] Classified text with the top segment as technology & computing-programming

twitter.com/explore - "covid"

1.40.107

[21939:21939:0630/171607.933452:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.86

[22368:22368:0630/171759.284545:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as travel-adventure travel

excite.com - "ukraine"

1.40.107

[22784:22784:0630/172129.282665:VERBOSE1:text_classification_processor.cc(65)] Classified text with the top segment as history-history

1.41.86

[23205:23205:0630/172314.808215:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as history-history

search.yahoo.com - "news"

1.40.107

[23690:23690:0630/173019.199762:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.86

[24110:24110:0630/173145.632865:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as other-other

uk.search.yahoo.com - "mozilla firefox"

1.40.107

[24542:24542:0630/173353.184219:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.86

[24976:24976:0630/173525.163903:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as technology & computing-software

@stephendonner

@tmancey can you check both @LaurenWags's results and mine, above?

In particular, can you confirm that examples like search.yahoo.com and uk.search.yahoo.com should both classify text? We don't exclude classification due to locale, right?

tmancey (Contributor) commented Jul 5, 2022

@stephendonner LGTM, all working as expected. The ML text classifier decides whether it can classify the text, and the results above are consistent with that. Thanks

Uni-verse (Contributor) commented Jul 5, 2022

Verified on Samsung GS 21 5G & Galaxy Tab S7 using

Brave | 1.41.91 Chromium: 103.0.5060.114 (Official Build) beta (64-bit)
-- | --
Revision | a1c2360c5b02a6d4d6ab33796ad8a268a6128226-refs/branch-heads/5060@{#1124}
OS | Android 12; Build/SP1A.210812.016

STR from brave/brave-core#13522 (comment):

  • installed 1.41.91
  • launched Brave
  • enabled ads and verbose logging for ads in the QA preferences
  • loaded the base URL noted, for each
  • searched for the quoted term
  • checked console logs for text-classification snippet
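On Android the same messages arrive wrapped in logcat's timestamp, PID, TID, priority and tag columns, as in the results below. A minimal sketch of pulling those fields apart, assuming the layout shown in this comment; `parse_logcat_line` and its field names are hypothetical and chosen only for illustration.

```python
import re

# Matches logcat lines as pasted in this comment, for example:
# 07-05 11:52:05.227 10630 10630 V chromium: [VERBOSE1:ads_impl.cc(219)] message
LOGCAT_LINE = re.compile(
    r"^(?P<date>\d\d-\d\d)\s+(?P<time>[\d:.]+)\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+(?P<priority>[A-Z])\s+"
    r"(?P<tag>[^:\s]+):\s+"
    r"\[VERBOSE\d+:(?P<file>[\w.]+)\((?P<line>\d+)\)\]\s*(?P<msg>.*)$"
)


def parse_logcat_line(line):
    """Return a dict of the fields in one logcat line, or None if it does not match."""
    match = LOGCAT_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Numeric columns are converted so they can be compared or sorted.
    fields["pid"] = int(fields["pid"])
    fields["tid"] = int(fields["tid"])
    fields["line"] = int(fields["line"])
    return fields
```

The `file` and `msg` fields are enough to tell a classification result (`text_classification_processor.cc`) from the older "not supported" message (`ads_impl.cc`).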

amazon.com - "fishing gear"

Results

1.41.91
07-05 11:52:05.227 10630 10630 V chromium: [VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as sports-fishing

1.40.79
07-05 12:35:46.048 22798 22798 V chromium: [VERBOSE1:text_classification_processor.cc(65)] Classified text with the top segment as sports-fishing

duckduckgo.com - "Top gun maverick"

Results

1.41.91
07-05 12:03:47.153 10630 10630 V chromium: [VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as arts & entertainment-film

1.40.79
07-05 12:42:24.008 22798 22798 V chromium: [VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

search.brave.com - "Obi wan kenobi"

Results

1.41.91
07-05 12:08:27.126 10630 10630 V chromium: [VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as hobbies & interests-sci-fi

1.40.79
07-05 12:45:05.436 22798 22798 V chromium: [VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

github.com - "cardano"

Results

1.41.91
07-05 12:10:29.702 10630 10630 V chromium: [VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as technology & computing-programming

1.40.79
07-05 12:49:09.538 22798 22798 V chromium: [VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

youtube.com - "crypto"

Results

1.41.91
07-05 12:17:25.017 10630 10630 V chromium: [VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as education-homeschooling

1.40.79
07-05 12:51:30.818 22798 22798 V chromium: [VERBOSE1:text_classification_processor.cc(65)] Classified text with the top segment as crypto-crypto
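The logged segments above appear to follow a parent-child pattern, for example `technology & computing-programming`. A minimal sketch of splitting one into its two parts, under the assumption that the first hyphen is the separator, so that a child such as `sci-fi` keeps its own hyphen. `split_segment` is a hypothetical helper for reading these logs, not a function from brave-core.

```python
def split_segment(segment):
    """Split a logged segment such as "food & drink-wine" into (parent, child).

    Assumption: the first hyphen separates parent from child, so the child in
    "hobbies & interests-sci-fi" is "sci-fi".
    """
    parent, separator, child = segment.partition("-")
    if not separator:
        # No hyphen at all: treat the whole string as the parent.
        return (segment, None)
    return (parent, child)
```

Grouping results by the parent part makes it easier to compare builds when only the child differs, as in the stackoverflow.com "python" results earlier in the thread.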

@MadhaviSeelam

Verification PASSED using

Brave | 1.41.91 Chromium: 103.0.5060.114 (Official Build) beta (64-bit)
-- | --
Revision | a1c2360c5b02a6d4d6ab33796ad8a268a6128226-refs/branch-heads/5060@{#1124}
OS | Windows 11 Version 21H2 (Build 22000.739)

Verified test plan from brave/brave-core#13522 for sites as below:

  • installed 1.41.91
  • launched Brave
  • enabled ads
  • visited some of the sites as mentioned in the test plan
  • searched for a term
  • checked console logs for text-classification snippet

baidu.com - wine

1.40.113

[29400:15920:0707/150007.827:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.91

[13716:27124:0707/145308.881:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as food & drink-wine

qwant.com - iPhone

1.40.113

[9656:28920:0707/152523.268:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.91

[22368:6708:0707/164144.117:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as technology & computing-software

semanticscholar.org - corona

1.40.113

[15552:22112:0707/153502.686:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.91

[11944:29600:0707/164302.347:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as business-business

findx.com - dogs

1.40.113

[640:21512:0707/154903.703:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.91

[28400:16208:0707/164432.537:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as gaming-gaming

google.com - toyota

1.40.113

[21608:27868:0707/151942.766:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.91

[13340:21536:0707/164532.108:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as automotive-automotive

yandex.com - mail

1.40.113

[16300:26528:0707/164738.480:VERBOSE1:ads_impl.cc(219)] Search engine pages are not supported for text classification

1.41.91

[17688:9740:0707/164826.477:VERBOSE1:text_classification_processor.cc(64)] Classified text with the top segment as technology & computing-technology & computing

@tmancey tmancey added this to Ads Jun 10, 2024
@tmancey tmancey moved this to Done in Ads Jun 10, 2024