Elasticsearch: mapping a text field for search optimization


I have to implement a text search application that indexes news articles and allows a user to search for keywords, phrases, or dates inside these texts.

After some consideration of my options (mainly Solr vs. Elasticsearch), I ended up doing some testing with Elasticsearch.

The part I am stuck on is the mapping and the search query construction best suited to some special cases I have encountered. My current mapping has one field that contains the text and needs to be analyzed in order to be searchable.

The specific part of the mapping for this field:

    "txt": {
      "type": "string",
      "term_vector": "with_positions_offsets",
      "analyzer": "shingle_analyzer"
    }

where shingle_analyzer is defined as:

    "analysis": {
      "filter": {
        "filter_snow": {
          "type": "snowball",
          "language": "romanian"
        },
        "shingle": {
          "type": "shingle",
          "max_shingle_size": 4,
          "min_shingle_size": 2,
          "output_unigrams": "true",
          "filler_token": ""
        },
        "filter_stop": {
          "type": "stop",
          "stopwords": ["_romanian_"]
        }
      },
      "analyzer": {
        "shingle_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "asciifolding", "filter_stop", "filter_snow", "shingle"]
        }
      }
    }

My question concerns the following situations:

  1. I have to search for "ing", and several occurrences of "ing." are returned.
  2. I have to search for "e!", but the analyzer strips the punctuation and there are no results.
  3. I have to search for uppercased common terms that are used as company names (like "Apple", but with multiple words), and the lowercase filter produces useless results.

My idea is to build different fields with different filters to cover these possible issues.
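To make the idea concrete, here is a minimal sketch of what such a mapping might look like. It is only an illustration under several assumptions: an Elasticsearch version in which "string" fields accept a "fields" block (multi-fields), a hypothetical document type named article, hypothetical sub-field names exact and cased, and a hypothetical cased_analyzer that drops the lowercase filter. The existing shingle_analyzer definition would have to sit in the same analysis block and is omitted here for brevity. The built-in whitespace analyzer splits only on whitespace, so tokens such as "ing." and "e!" keep their punctuation and case.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "cased_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "article": {
      "properties": {
        "txt": {
          "type": "string",
          "term_vector": "with_positions_offsets",
          "analyzer": "shingle_analyzer",
          "fields": {
            "exact": { "type": "string", "analyzer": "whitespace" },
            "cased": { "type": "string", "analyzer": "cased_analyzer" }
          }
        }
      }
    }
  }
}
```

With a layout like this, the same source text would be indexed three ways and addressed as txt, txt.exact, and txt.cased. Whether this is the right split is exactly what I am unsure about.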

Three questions:

  1. Is splitting the field into 3 fields with different analyzers the correct way to do this?
  2. How do I use the different fields when searching?
  3. Could someone explain how scoring would work when these fields are included?

