How to make Solr synonyms work with KeywordTokenizerFactory?
I have a field that uses KeywordTokenizerFactory in both the index and query analyzers, because I need to find documents that begin with a query phrase like "Lorem Ipsum". I search with q=myField:/Lorem Ipsum/. The result must match the query exactly: it must start with Lorem and end with Ipsum, with no words between them. I also have a synonym for Ipsum: Dolor. So when I search for "Lorem Ipsum", I want to find documents whose myField contains only "Lorem Ipsum" or "Lorem Dolor".
I can't split the phrase into tokens at index time, because then I couldn't be sure of finding only documents that start and end with the words I need. I don't know Solr well and I'm really stuck here.
Maybe I'm missing something and there is another way to get the same result?
solr lucene
Would doing it at query time be acceptable (since you're already using regular expressions)? Otherwise you'll have to generate multiple input strings, one for each synonym you want to expand. Synonyms work on the token level, and when you're using keywordtokenizer, that will just be one single, large token.
– MatsLindh
Nov 15 '18 at 9:16
@MatsLindh what do you mean by "doing it at query time"?
– through.a.haze
Nov 15 '18 at 10:13
Using a query like myField:/Lorem (Ipsum|Dolor)/, i.e. use the regex capabilities to allow multiple words in that position instead.
– MatsLindh
Nov 15 '18 at 10:23
@MatsLindh but I run the query from the client side, and the client knows nothing about synonyms. There could also be other synonyms, e.g. Lorem == Sit. How should I handle this case? Building something like myField:/(Lorem|Sit) (Ipsum|Dolor)/ can get very complicated.
– through.a.haze
Nov 15 '18 at 10:34
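One way to keep the client unaware of synonyms might be to build the regex on the server side, just before the query is sent to Solr. The sketch below is only an illustration and is not from the thread: SYNONYM_GROUPS, escape_lucene_regex and build_regex_query are hypothetical names, and the synonym groups are hard-coded here, although they could presumably be loaded from the same synonyms file Solr uses. The set of escaped characters is an assumption based on Lucene's regexp syntax and should be checked against the Solr version in use.

```python
# Hypothetical synonym groups; every word in a group is treated as
# interchangeable with the others in that group.
SYNONYM_GROUPS = [
    {"Ipsum", "Dolor"},
    {"Lorem", "Sit"},
]

# Assumption: characters reserved by Lucene's regexp syntax, plus "/",
# which delimits a regex in the query string.
LUCENE_REGEX_SPECIALS = set('.?+*|{}[]()"\\#@&<>~/')


def escape_lucene_regex(word):
    """Backslash-escape characters that would otherwise act as regex operators."""
    return "".join("\\" + ch if ch in LUCENE_REGEX_SPECIALS else ch for ch in word)


def build_regex_query(field, phrase, groups=SYNONYM_GROUPS):
    """Turn a phrase into a regex query with one alternation per word.

    The whole phrase stays a single regex, so it still has to match the
    entire KeywordTokenizer token.
    """
    parts = []
    for word in phrase.split():
        alternatives = {word}
        for group in groups:
            if word in group:
                alternatives |= group
        # Keep the word the user typed first, then its synonyms in sorted order.
        ordered = [word] + sorted(alternatives - {word})
        escaped = [escape_lucene_regex(w) for w in ordered]
        if len(escaped) == 1:
            parts.append(escaped[0])
        else:
            parts.append("(" + "|".join(escaped) + ")")
    return "{}:/{}/".format(field, " ".join(parts))


if __name__ == "__main__":
    print(build_regex_query("myField", "Lorem Ipsum"))
```

For the phrase "Lorem Ipsum" this builds myField:/(Lorem|Sit) (Ipsum|Dolor)/, the same query as in the comment above, so the client can keep sending the plain phrase while the alternations are generated in one place.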
asked Nov 14 '18 at 17:58
through.a.haze