Support search qualifiers #8386
Comments
Gitea version (or commit ref): 1.10.0+dev-375-g8a828500e You can use the footer from try.gitea.io. I know this is nit-picky of me, but |
Sure, I am usually more careful when posting bugs. |
Another benefit of this way of searching is that it is more clear for the user when selecting search options in the UI. For example if selecting "Assigned to you" it would end up as "assignee:davidsvantesson" in the search field. |
Perhaps we should consider using a localizable resource for this (e.g. "assignee", "asignado", "asignato", etc.), preferably with many options. For example, in
To avoid complicating the code too much, the system could just use the first one in the list when building the search string for the search field in the returned form, as a sort of "normalization". As for other kinds of keywords (like "fixes", "closes", etc.), this is not something you want to change whenever you update the translations from Crowdin. |
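The "first one in the list" normalization could be sketched roughly as follows. This is only an illustration: the `qualifierAliases` table and the `normalizeQualifier` helper are hypothetical names, and the alias spellings are the examples from the comment above, not actual Gitea translation resources.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifierAliases maps a qualifier to its accepted spellings.
// Hypothetical data: the first entry of each list is the normalized form.
var qualifierAliases = map[string][]string{
	"assignee": {"assignee", "asignado", "asignato"},
}

// normalizeQualifier rewrites "alias:value" so that the alias is replaced by
// the first entry of the list it belongs to. Terms without a colon, or with
// an unknown qualifier name, are returned unchanged.
func normalizeQualifier(term string) string {
	name, value, found := strings.Cut(term, ":")
	if !found {
		return term
	}
	for _, aliases := range qualifierAliases {
		for _, alias := range aliases {
			if strings.EqualFold(name, alias) {
				return aliases[0] + ":" + value
			}
		}
	}
	return term
}

func main() {
	fmt.Println(normalizeQualifier("asignado:davidsvantesson")) // assignee:davidsvantesson
}
```

Keeping the accepted spellings in one table would let the search field always show a single form regardless of which spelling the user typed.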
I am so used to working in English that I didn't consider localization, but it sounds like a good idea. I also think the current behavior of splitting on "," should be reconsidered, although it would be a breaking change. It doesn't feel very standardized and is not documented (I didn't know about it before looking into the code). I think it is better to search for each whitespace-separated word separately and allow quotation marks for an exact match. Then package text/scanner can be used with default settings to split the string. |
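A minimal sketch of the proposed splitting (whitespace-separated terms, double quotes for exact phrases) is shown here. Note that it uses a small hand-written loop rather than text/scanner as suggested above, to keep the behavior easy to follow; `splitQuery` is a hypothetical helper, not existing Gitea code.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitQuery splits a search string on whitespace. Text inside double quotes
// is kept together as one term and the quote characters are dropped.
// An unterminated quote extends to the end of the input.
func splitQuery(input string) []string {
	var terms []string
	var current strings.Builder
	inQuotes := false

	// flush appends the term collected so far, if any.
	flush := func() {
		if current.Len() > 0 {
			terms = append(terms, current.String())
			current.Reset()
		}
	}

	for _, r := range input {
		switch {
		case r == '"':
			inQuotes = !inQuotes
		case unicode.IsSpace(r) && !inQuotes:
			flush()
		default:
			current.WriteRune(r)
		}
	}
	flush()
	return terms
}

func main() {
	terms := splitQuery(`"sentence of words" topic:mytopic separate`)
	fmt.Printf("%q\n", terms) // ["sentence of words" "topic:mytopic" "separate"]
}
```

With this approach a qualifier value containing spaces could be written as `topic:"my topic"`, which comes out as the single term `topic:my topic`.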
I agree with you about the spaces. As a breaking change, there are three options the way I see this:
In this matter I'm all for (1), but it's just my opinion. If implementing (3) is trivial, it could be added, with a sensible default decided by consensus. |
For 1.10, I choose option 3 (as a transition into proposed behavior), and on version 1.11, use option 1 (remove transitional flag). |
@bagasme This is only a proposal issue so far. I think the only change in behavior would be that terms are no longer separated by commas, and exact searches need quotes. The previous behavior was undocumented anyhow, and I think we can add documentation for the new behavior. The special queries can be kept for backward compatibility (like &topic=1). Preferably there should be a "best match" sorting, e.g. by how many of the search words occur in the repo. I don't know how hard it is to make an effective algorithm for that. |
@davidsvantesson Besides documenting the (proposed) new behavior, the old one should be documented too. This will come in handy when we switch to the new behavior and users complain that their old, undocumented syntax doesn't work anymore and want an explanation... |
I tried to do some research on "best match" text searches for SQL. Most solutions are tied to specific SQL databases (e.g. Oracle: REGEXP_COUNT, MsSQL: Rank). |
I think bleve can do that; it's the default text search engine for issues. I don't mind if simpler SQL indexing lacks this kind of feature. For SQL search it's hard to decide what should count as "best match". Number of times a word appears in the title? (that should be low) Number of times it appears in the body? Number of comments, counting the body, that contain the word? Most of these will be very heavy on the database. Hence, bleve should be preferred. A "best match" is more useful when you do some semantic analysis on the text, like counting any of "do", "did", "done" as synonyms for each other. And that's language dependent; we could make Gitea support x number of languages, but that's another whole can of worms. |
Sounds reasonable. So a best match search for repositories would need a bleve index only for the name, description and other repository metadata. |
I have not understood bleve fully, but it seems it can support this directly if just indexing different fields of the repo metadata: |
We're not using that interface. We're one level below that, constructing the query objects manually: gitea/modules/indexer/issues/bleve.go Lines 222 to 231 in 6551a9d
But of course that means we can do it either way. |
Here's the code where the analyzers are decided: gitea/modules/indexer/issues/bleve.go Lines 129 to 133 in 6551a9d
Those decide what kind of analysis you want on the strings (they must be decided at the moment the index is built!). |
Maybe it is better to build up a custom search query to have more control. If we are to support localization, we would need to do that. |
We should have our own rules rather than follow bleve's, because we will support many indexer backends, e.g. Elasticsearch. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions. |
Searching by owner:name is a good idea. Can this feature be added officially? |
@lunny any progress on this issue? |
Nobody is working on this currently. |
Would love to see this for code search as well to be able to filter a query to a subtree like |
Description
Gitea should support search qualifiers when searching for repositories, issues, PRs or users. This would allow more flexible options when searching.
Example when searching for repos:
topic:
is:private
is:public
owner:name
Search qualifiers shall always be AND search terms (in contrast to text search which is OR).
Note: Gitea currently splits search terms on commas, not spaces, so this example:
"sentence of words,topic:mytopic,separate sentence"
would search for repositories where "sentence of words" OR "separate sentence" is in the name or description AND which have the topic "mytopic".
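To make the comma-separated example concrete, here is a rough sketch of how such a string might be separated into free-text keywords (combined with OR) and qualifiers (combined with AND). The types and the `parseCommaSearch` function are hypothetical and only recognize the qualifier names listed above; this is not how Gitea currently parses searches.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifier is a single "name:value" restriction; all qualifiers must match (AND).
type qualifier struct {
	Name  string
	Value string
}

// parsedSearch holds the two kinds of terms found in a search string.
type parsedSearch struct {
	Keywords   []string    // free-text terms; any of them may match (OR)
	Qualifiers []qualifier // restrictions; all of them must match (AND)
}

// knownQualifiers lists the qualifier names proposed in this issue.
var knownQualifiers = map[string]bool{"topic": true, "is": true, "owner": true}

// parseCommaSearch splits the input on commas and classifies each part.
// A part counts as a qualifier only if it has a known name and a non-empty
// value; everything else is treated as free text.
func parseCommaSearch(input string) parsedSearch {
	var result parsedSearch
	for _, part := range strings.Split(input, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		name, value, found := strings.Cut(part, ":")
		if found && knownQualifiers[name] && value != "" {
			result.Qualifiers = append(result.Qualifiers, qualifier{Name: name, Value: value})
			continue
		}
		result.Keywords = append(result.Keywords, part)
	}
	return result
}

func main() {
	p := parseCommaSearch("sentence of words,topic:mytopic,separate sentence")
	fmt.Printf("keywords=%q qualifiers=%v\n", p.Keywords, p.Qualifiers)
}
```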
Maybe this can be solved by indexing with bleve and using required and exclusion of fields:
https://blevesearch.com/docs/Query-String-Query/
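As a rough illustration of the idea, the following sketch builds a string in the query string syntax described at the link above, where a `+` prefix marks a required term, a `-` prefix marks an excluded one, `field:value` restricts a term to a field, and quotes form a phrase. The field name `topic` and the helpers `fieldTerm`, `quoteIfNeeded` and `buildQueryString` are assumptions made for this example; the code only assembles the string and does not call bleve. How optional terms interact with required ones should be checked against the bleve documentation before relying on this.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldTerm is a value restricted to one index field (field names are hypothetical).
type fieldTerm struct {
	Field string
	Value string
}

// quoteIfNeeded wraps a value in double quotes when it contains whitespace so
// that it is treated as a phrase. Embedded quotes are not handled in this sketch.
func quoteIfNeeded(value string) string {
	if strings.ContainsAny(value, " \t") {
		return `"` + value + `"`
	}
	return value
}

// buildQueryString joins optional keywords, required field terms ("+") and
// excluded field terms ("-") into one space-separated query string.
func buildQueryString(keywords []string, required, excluded []fieldTerm) string {
	var parts []string
	for _, k := range keywords {
		parts = append(parts, quoteIfNeeded(k))
	}
	for _, t := range required {
		parts = append(parts, "+"+t.Field+":"+quoteIfNeeded(t.Value))
	}
	for _, t := range excluded {
		parts = append(parts, "-"+t.Field+":"+quoteIfNeeded(t.Value))
	}
	return strings.Join(parts, " ")
}

func main() {
	q := buildQueryString(
		[]string{"sentence of words", "separate"},
		[]fieldTerm{{Field: "topic", Value: "mytopic"}},
		nil,
	)
	fmt.Println(q) // "sentence of words" separate +topic:mytopic
}
```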