Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add search by index to issues api #2606

Closed

Conversation

Morlinest
Copy link
Member

This PR adds issue index field to bleve for indexing and adds option to search issues by index (search by prefix - eg. Issue{Index: 1234} will match ?index=1, 12, 123 or 1234).

@codecov-io
Copy link

codecov-io commented Sep 25, 2017

Codecov Report

Merging #2606 into master will decrease coverage by <.01%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2606      +/-   ##
==========================================
- Coverage   27.32%   27.32%   -0.01%     
==========================================
  Files          86       86              
  Lines       17143    17144       +1     
==========================================
  Hits         4684     4684              
- Misses      11781    11782       +1     
  Partials      678      678
Impacted Files Coverage Δ
models/issue_indexer.go 8.33% <0%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 339d7de...4e78ff3. Read the comment docs.

@tboerger tboerger added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Sep 25, 2017
return issueIDs, nil
// SearchIssuesByIndex searches for issues by given conditions.
// Returns the matching issue IDs
func SearchIssuesByIndex(repoID, issueIndex int64) ([]int64, error) {
Copy link
Member

@ethantkoenig ethantkoenig Sep 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure using the indexer here is the best approach.

First of all, existing indexer entries won't have an IssueIndex field, so this feature won't work without some sort of indexer migration (which right now we don't support)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know good and effective solution for searching numbers like this?

@ethantkoenig
Copy link
Member

@Morlinest What exactly is the intended use case here?

@Morlinest
Copy link
Member Author

@ethantkoenig It can be used in #2531. Also it can be used inside comments, like github does.

@ethantkoenig
Copy link
Member

ethantkoenig commented Sep 25, 2017

Can't we just use a SQL where condition? Ajax calls for autocomplete don't seem like a performance-critical component, and even if they were using the DB might still be faster for medium-sized repos.

@Morlinest
Copy link
Member Author

@ethantkoenig Sure, if you know good where condition for this. I was asking on discord about this, this is result.

@ethantkoenig
Copy link
Member

ethantkoenig commented Sep 26, 2017

@Morlinest WHERE CAST(index AS STRING) LIKE "123%". I think this is valid in all of our supported RDBMS's, but you should double-check.

@lafriks
Copy link
Member

lafriks commented Sep 26, 2017

@ethantkoenig cast(col as string) like '1%' performance in database would be very poor

@ethantkoenig
Copy link
Member

ethantkoenig commented Sep 26, 2017

@lafriks If we introduce an index, I think the performance would be acceptable, since we're only doing prefix searches (as opposed to LIKE '%foo%')

In fact, it seems like MySQL, PostgreSQL, and MSSQL all support function-based indices, so we might not even need to introduce a new column (edit: Sqlite too)

@lafriks lafriks closed this Sep 26, 2017
@lafriks lafriks reopened this Sep 26, 2017
@lafriks
Copy link
Member

lafriks commented Sep 26, 2017

Sorry, accidently closed. Index would help yes but it still needs to be verified if cast as text does work on all databases as I think mysql will need convert as text. Sqlite btw also supports functional index for this. But I think doing it in bleve index would be better

@ethantkoenig
Copy link
Member

ethantkoenig commented Sep 26, 2017

@lafriks It's not clear to me that bleve is better here. Suppose there is a repository with 2000 issues. ~1100 of these issues begin with 1, so if I search for "1", the bleve indexer will return an array consisting of 1100 issue ids. We then have to include these 1100 issue ids in a very large IN clause, which might be slower than just doing an indexed search in the DB.

I could be convinced otherwise, but if we do decide bleve, we'll need to introduce some sort of indexer migration framework (which should definitely be a separate PR).

@Morlinest
Copy link
Member Author

@ethantkoenig Do you have idea for migration? I found that only way is to delete index and create new one. But I don't know how to check some kind of version of current bleve index.

If I understand, this PR should be blocked till introduction of bleve index migration.

@lafriks
Copy link
Member

lafriks commented Sep 26, 2017

It think that you can create migration that just drops and recreates that index

@lafriks
Copy link
Member

lafriks commented Sep 26, 2017

I think that for autocomplete only top X must be returned. Doesn't bleve supports that?

@Morlinest
Copy link
Member Author

@lafriks I haven't found it yet. Still searching.

@Morlinest
Copy link
Member Author

@lafriks Do you want to use DB version for bleve migration?

@lafriks
Copy link
Member

lafriks commented Sep 26, 2017

Yes, why not it can be used all kind of migrations just like there are migrations to change git gitea hooks

@Morlinest
Copy link
Member Author

Added migration.

@Morlinest
Copy link
Member Author

Morlinest commented Sep 26, 2017

@lafriks @ethantkoenig I've looked again at github solution. It's more simple. They are just returning "suggestions" and I think it's list of 1000 last updated issues. What about using similar approach instead of adding this feature? Now we are limited for max 50 issues by default, but it's enough right now and can be easily changed.

Edit: Limit is 10 in config file.

@lafriks
Copy link
Member

lafriks commented Sep 26, 2017

@Morlinest currently if I understand code correctly it will search for all issues not limited in count from bleve and only limit them in sql later

@Morlinest
Copy link
Member Author

@lafriks Yes (2147483647 is hardcoded limit), I've worked on limiting it but found more problems than it solves. Eg. you can find first 10 ID by Index, but they can be closed, so you have to add isClosed to bleve too. Same if you want to search another page. So for adding limit, you have to add all other conditions to it.

@lafriks
Copy link
Member

lafriks commented Sep 26, 2017

I don't know how advanced queries does bleve support but if it would be possible to already get total item count and issue ids only for single page that would be great performance improvement but that is probably out of scope of this PR

@Morlinest
Copy link
Member Author

Morlinest commented Sep 26, 2017

@lafriks We can add isClose but as number. Bleve supports:

Supported field types:
Text, Numeric, Date

Supported query types:
Term, Phrase, Match, Match Phrase, Prefix
Conjunction, Disjunction, Boolean
Numeric Range, Date Range
Simple query syntax for human entry

Also there is SortBy() function, where you can sort by array of field or _id and _score`. Also supports reversed order.

Edit: It looks like boolean can be used too as field.

@lafriks
Copy link
Member

lafriks commented Sep 26, 2017

yes but to fully support that we would need to add all fields that need to be sorted by in issues page (created time, updated time, comment count and priority) or filtered by (assigned to user id, author id, label id array, milestone id). And in such case select from issue table would not be even needed.

@ethantkoenig
Copy link
Member

IMO, adding every column of the issue table to the indexer is not the best approach. I'm afraid we'll end up with a bloated index that will be difficult to keep in the sync with the DB.

@Morlinest
Copy link
Member Author

@lafriks @ethantkoenig Anyway, I'll close this PR as it's main purpose is doable by other (easier) way. If there will be need for adding fields to indexer, we can reopen/reuse this.

@Morlinest Morlinest closed this Sep 27, 2017
@go-gitea go-gitea locked and limited conversation to collaborators Nov 23, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants