How to make Solr synonyms work with KeywordTokenizerFactory?



























I have a field that uses KeywordTokenizerFactory in both the index and query analyzers, because I need to find documents that begin with a query phrase like "Lorem Ipsum", so I search with q=myField:/Lorem Ipsum/. The result must match the query exactly: it must start with Lorem and end with Ipsum, with no words between them. I also have a synonym for Ipsum: Dolor. So when I search for "Lorem Ipsum", I want to find documents where myField contains only "Lorem Ipsum" or "Lorem Dolor".

I can't split the phrase into tokens when indexing, because then I couldn't be sure of finding only documents that start and end with the words I need. I don't know Solr well and am really stuck here.

Maybe I'm missing something and can get the same result another way?

































  • Would doing it at query time be acceptable (since you're already using regular expressions)? Otherwise you'll have to generate multiple input strings, one for each synonym you want to expand. Synonyms work on the token level, and when you're using keywordtokenizer, that will just be one single, large token.

    – MatsLindh
    Nov 15 '18 at 9:16
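The index-time alternative mentioned in this comment (one input string per synonym expansion) could be sketched as below. This is only an illustration run by the indexing client before documents are sent to Solr: the function name `expand_variants` and the `synonym_groups` structure are hypothetical, and it assumes the field can hold several values (for example a multiValued field), which the question does not state.

```python
from itertools import product
from typing import Dict, Iterable, List, Set


def expand_variants(value: str, synonym_groups: Iterable[Set[str]]) -> List[str]:
    """Return every variant of `value` obtained by swapping words for synonyms.

    `synonym_groups` is a hypothetical structure: each set holds words that
    are interchangeable with one another.
    """
    # Map each word to the sorted list of all words in its group.
    lookup: Dict[str, List[str]] = {}
    for group in synonym_groups:
        ordered = sorted(group)
        for word in group:
            lookup[word] = ordered

    # For each word in the value, list its alternatives (or just itself).
    options = [lookup.get(word, [word]) for word in value.split()]

    # The Cartesian product yields one string per combination.
    return [" ".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    for variant in expand_variants("Lorem Ipsum", [{"Ipsum", "Dolor"}]):
        print(variant)
```

Each returned string would then be indexed as its own keyword token, so a plain query for "Lorem Ipsum" can match a document that was originally "Lorem Dolor". The number of variants grows multiplicatively with the number of words that have synonyms.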











  • @MatsLindh what do you mean by "doing it at query time"?

    – through.a.haze
    Nov 15 '18 at 10:13











  • Using a query like myField:/Lorem (Ipsum|Dolor)/ - i.e. use the regex capabilities to allow multiple words in the location instead.

    – MatsLindh
    Nov 15 '18 at 10:23











  • @MatsLindh but I do the query from the client side, and the client knows nothing about synonyms. Also, there could be other synonyms as well, e.g. Lorem == Sit. How should I handle this case? Making something like myField:/(Lorem|Sit) (Ipsum|Dolor)/ can be very complicated.

    – through.a.haze
    Nov 15 '18 at 10:34
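Building the regex does not have to be done by hand. As a rough sketch, a small layer between the client and Solr that has access to the synonym list could generate the query from the plain phrase. The names `build_regex_query`, `escape_term` and the `synonyms` mapping are hypothetical, and the escaping covers the characters reserved in Lucene regular expressions plus the `/` delimiter; treat it as a starting point, not a complete implementation.

```python
from typing import Dict, List

# Characters with special meaning in Lucene regular expressions, plus "/",
# which delimits a regex in Solr's standard query syntax.
LUCENE_REGEX_SPECIALS = set('.?+*|{}[]()"\\#@&<>~/')


def escape_term(term: str) -> str:
    """Backslash-escape regex-reserved characters in a literal word."""
    return "".join("\\" + ch if ch in LUCENE_REGEX_SPECIALS else ch for ch in term)


def build_regex_query(field: str, phrase: str, synonyms: Dict[str, List[str]]) -> str:
    """Turn a plain phrase into a regex query with one alternation per word.

    `synonyms` is a hypothetical mapping from a word to its equivalents.
    """
    parts = []
    for word in phrase.split():
        # The word itself first, then its synonyms without duplicates.
        alternatives = [word] + [s for s in synonyms.get(word, []) if s != word]
        escaped = [escape_term(a) for a in alternatives]
        if len(escaped) == 1:
            parts.append(escaped[0])
        else:
            parts.append("(" + "|".join(escaped) + ")")
    return f"{field}:/{' '.join(parts)}/"


if __name__ == "__main__":
    table = {"Lorem": ["Sit"], "Ipsum": ["Dolor"]}
    print(build_regex_query("myField", "Lorem Ipsum", table))
```

With this approach the client keeps sending the plain phrase, and only the intermediate layer needs to know the synonyms.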
















solr lucene
















asked Nov 14 '18 at 17:58









through.a.haze



















