Improve PageFind search results #47137

Open
cjyabraham opened this issue Jul 12, 2024 · 6 comments
Labels
area/web-development: Issues or PRs related to the kubernetes.io's infrastructure, design, or build processes
kind/bug: Categorizes issue or PR as related to a bug.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@cjyabraham
Contributor

We now have PageFind search serving users who cannot access the Google Programmable Search, such as those behind the firewall in China. There have been reports that the search results are not well ranked, so we'd like to improve them.

To properly assess the quality of the search results, we should find a way to measure their quality against some kind of baseline, such as what is provided by our Google Programmable Search engine. It may be best to, say, start with the 20 most common searches and then grade how suitable the results are. Grading the results over a broad range of search terms will ensure we're not optimizing things for just one or two particular use-cases.

This work should be done by someone who is familiar with the Kubernetes docs and knows what results would be best served for a particular query. Once we see where things are now, we can tune the PageFind results to see if we can improve their score to an acceptable level.
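As a rough illustration of the grading idea above, here is a minimal sketch that scores a search backend by mean reciprocal rank (MRR) over a fixed query list. It is not an existing tool in this repository. The `search` callable, the `judgments` mapping (query to the set of URLs a docs expert considers correct), and the function names are all hypothetical; running the same queries and judgments through both PageFind and the Google Programmable Search would give two comparable scores.

```python
from statistics import mean


def reciprocal_rank(results, relevant):
    """Return 1/rank of the first relevant URL in `results`, or 0.0 if none appears."""
    for rank, url in enumerate(results, start=1):
        if url in relevant:
            return 1.0 / rank
    return 0.0


def grade(queries, search, judgments, k=10):
    """Score a search backend over a list of queries.

    `search` is a hypothetical callable: query string -> ranked list of URLs.
    `judgments` maps each query to the set of URLs judged correct by a reviewer.
    Only the top `k` results per query are considered.
    Returns (mean reciprocal rank, per-query scores).
    """
    scores = {q: reciprocal_rank(search(q)[:k], judgments[q]) for q in queries}
    return mean(scores.values()), scores
```

MRR is only one possible metric. It rewards putting a correct page first, which matches the complaint about ranking, but it ignores how good the remaining results are.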

@cjyabraham cjyabraham added the kind/bug Categorizes issue or PR as related to a bug. label Jul 12, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 12, 2024
@nate-double-u
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 12, 2024
@nate-double-u
Contributor

/area web-development

@dipesh-rawat
Member

/area web-development

/priority important-soon
(Please feel free to adjust the priority as needed, if the SIG consensus leans toward a different priority)

@k8s-ci-robot k8s-ci-robot added area/web-development Issues or PRs related to the kubernetes.io's infrastructure, design, or build processes priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jul 14, 2024
@TPXP
Contributor

TPXP commented Jul 15, 2024

Ideally, the search should also understand the same aliases as kubectl (svc -> service, ...). Google handles some of the aliases but not all of them.
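One way this could work is to expand short names in the query before it reaches the search index. The sketch below is an assumption about how that might look, not a PageFind feature: the table is a small hand-picked subset of kubectl short names mapped to singular resource names, and `expand_aliases` is a hypothetical helper.

```python
# Small illustrative subset of kubectl short names; not a complete list.
KUBECTL_SHORT_NAMES = {
    "po": "pod",
    "svc": "service",
    "deploy": "deployment",
    "ns": "namespace",
    "cm": "configmap",
    "ing": "ingress",
    "pvc": "persistentvolumeclaim",
    "sts": "statefulset",
    "ds": "daemonset",
}


def expand_aliases(query):
    """Replace each whitespace-separated token that is a known short name.

    Matching is case-insensitive; tokens that are not short names are
    returned unchanged, including their original casing.
    """
    return " ".join(
        KUBECTL_SHORT_NAMES.get(token.lower(), token) for token in query.split()
    )
```

For example, `expand_aliases("svc loadbalancer")` returns `"service loadbalancer"`. Replacing the token outright is the simplest choice; a real integration might instead search for both the alias and the expanded name.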

By the way, starting with entity types (pod, ingress, service, ...) and making sure the first result is the page presenting that type may be a great start.
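A check like this could be automated with a small script. The following is only a sketch under assumptions: `search` is a hypothetical callable returning ranked URLs for a query, and the `expected` mapping of entity type to concept page would have to be filled in by someone who knows the docs.

```python
def top_result_misses(expected, search):
    """Return the entity types whose first search result is not the expected page.

    `expected` maps a search term to the URL that should rank first.
    `search` is a hypothetical callable: query string -> ranked list of URLs.
    The returned dict maps each failing term to the URL that ranked first
    instead (or None when the search returned no results).
    """
    misses = {}
    for term, expected_url in expected.items():
        results = search(term)
        top = results[0] if results else None
        if top != expected_url:
            misses[term] = top
    return misses
```

An empty return value means every listed entity type has its expected page in first place.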

@sftim
Contributor

sftim commented Aug 21, 2024

> Ideally, the search should also understand the same aliases as kubectl (svc -> service, ...). Google handles some of the aliases but not all of them.
>
> By the way, starting with entity types (pod, ingress, service, ...) and making sure the first result is the page presenting that type may be a great start.

I think that's a great idea, but it could be its own feature request, @TPXP.

@sftim
Contributor

sftim commented Aug 21, 2024

We're not staffing this, so:
/remove-priority important-soon
/priority important-longterm

People accessing our docs from behind state censorship may have search options available other than our built-in search.

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Aug 21, 2024

6 participants