Elasticsearch: mapping a text field for search optimization
I have implemented a text search application that indexes news articles and allows users to search for keywords, phrases, or dates inside these texts.
After considering the options (mainly Solr vs. Elasticsearch), I ended up testing Elasticsearch.
The part I am now stuck on concerns the mapping and the query-construction options best suited to some special cases I have encountered. The current mapping has one field that contains the text and needs to be analyzed in order to be searchable.
The relevant part of the mapping for that field:
"txt": { "type" : "string", "term_vector" : "with_positions_offsets", "analyzer" : "shingle_analyzer" }
where shingle_analyzer is:
"analysis" : { "filter" : { "filter_snow": { "type":"snowball", "language":"romanian" }, "shingle":{ "type":"shingle", "max_shingle_size":4, "min_shingle_size":2, "output_unigrams":"true", "filler_token":"" }, "filter_stop":{ "type":"stop", "stopwords":["_romanian_"] } }, "analyzer" : { "shingle_analyzer" : { "type" : "custom", "tokenizer" : "standard", "filter" : ["lowercase","asciifolding", "filter_stop","filter_snow","shingle"] } }}
My question concerns the following situations:
- When I search for "ing", several results for "ing." are returned.
- When I search for "e!", the analyzer strips the punctuation and there are no results.
- When I search for uppercased common terms used as company names (like "Apple", among multiple others), the lowercase filter produces useless results.
The idea I have is to build different fields with different filters to cover these possible issues, along the lines of the sketch below.
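Something like this is what I have in mind; a sketch only, where the sub-field names (exact, cased) and the two extra analyzers are my own invention:

"txt": {
    "type": "string",
    "term_vector": "with_positions_offsets",
    "analyzer": "shingle_analyzer",
    "fields": {
        "exact": {
            "type": "string",
            "analyzer": "exact_analyzer"
        },
        "cased": {
            "type": "string",
            "analyzer": "cased_analyzer"
        }
    }
}

with two additional custom analyzers in the analysis settings:

"analyzer": {
    "exact_analyzer": {
        "type": "custom",
        "tokenizer": "whitespace",
        "filter": ["lowercase", "asciifolding"]
    },
    "cased_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": ["asciifolding"]
    }
}

The whitespace tokenizer keeps punctuation attached to the token, so "ing." and "e!" would survive as searchable terms in txt.exact, while cased_analyzer omits the lowercase filter so "Apple" would stay distinguishable from "apple" in txt.cased.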
Three questions:
- Is splitting the field into three sub-fields with different analyzers the correct way to go?
- How do I use the different fields when searching? (A sketch of what I currently imagine is below.)
- Could someone explain how scoring would work across these fields?
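For the second question, this is the kind of query I imagine; a sketch using multi_match over the hypothetical sub-fields from the mapping above, with per-field boosts:

{
    "query": {
        "multi_match": {
            "query": "Apple",
            "fields": ["txt", "txt.exact^2", "txt.cased^3"]
        }
    }
}

My understanding is that with the default best_fields behaviour each field is scored independently (its own TF/IDF relevance multiplied by the ^boost) and the highest-scoring field determines the document score, but I would like this confirmed for the multi-field case.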