Tips and Tricks in a world of Mix

So, after the last post about Elasticsearch, which explained some of the technology's terms, I'm getting to real-life problems.

So, after entering the data into Elasticsearch in the last post, I now have to delete it all! Oh my, how did that happen? What shall I do now?

 

Well, if you are only starting out, it's not that bad: just delete the index, which removes the data and the existing mapping as well.

curl -XDELETE "http://localhost:9200/test"

WHY?

One of the requirements was to make the data searchable by just a few characters, not only by the whole word.

So .. ?

Well, that actually means the default indexing applied when the index was created is not good enough; we should have defined the index settings manually, specifying from the beginning what kind of analysis should be performed on the index.

 

The nGram filter allows us to break the data we enter into small tokens which we can search later. So if you have

“Jerusalem”

and define nGram with min_gram 7 and max_gram 20, you'll get tokens like [Jerusal, Jerusale, Jerusalem] indexed (plus the other substrings of those lengths, such as erusale and rusalem, since nGram slides across the whole word).

Of course, it is more logical to start with a min_gram of two characters and go up from there.
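To get a feel for which tokens a given min_gram/max_gram pair produces, here is a minimal Python sketch of the nGram logic (an illustration only, not Elasticsearch's actual implementation):

```python
def ngrams(text, min_gram, max_gram):
    """Emit every substring of text whose length is between min_gram and max_gram."""
    tokens = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(text) - size + 1):
            tokens.append(text[start:start + size])
    return tokens

print(ngrams("Jerusalem", 7, 20))
# ['Jerusal', 'erusale', 'rusalem', 'Jerusale', 'erusalem', 'Jerusalem']
```

With min_gram 2 the same word yields dozens of tokens, which is exactly what makes two-character autocomplete queries match.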

I tried to put the index settings and the mapping into one file, but it failed with the error:

Analyzer [your_analyzer_name] not found for field [_all]

When I split them into two separate requests, it worked.

So, to the last post I added a manual index definition, CreateIndex.js:

{
    "settings": {
        "number_of_shards": 1,
        "analysis": {
            "filter": {
                "your_name_for_nGram_filter": {
                    "type": "nGram",
                    "min_gram": 2,
                    "max_gram": 20,
                    "token_chars": ["letter", "digit", "punctuation", "symbol"]
                }
            },
            "analyzer": {
                "your_name_for_index_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": ["lowercase", "asciifolding", "your_name_for_nGram_filter"]
                },
                "your_name_for_search_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": ["lowercase", "asciifolding"]
                }
            }
        }
    }
}
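To build an intuition for what the custom index analyzer above does, here is a rough Python simulation of its pipeline (whitespace tokenizer, then the lowercase, asciifolding, and nGram filters). It only mimics the behavior; it is not how Elasticsearch implements it:

```python
import unicodedata

def ngrams(text, min_gram, max_gram):
    """All substrings of text with length between min_gram and max_gram."""
    return [text[i:i + n]
            for n in range(min_gram, max_gram + 1)
            for i in range(len(text) - n + 1)]

def index_analyze(text, min_gram=2, max_gram=20):
    """Rough simulation of the index analyzer: whitespace tokenize,
    lowercase, asciifold, then break each word into nGrams."""
    tokens = []
    for word in text.split():                       # whitespace tokenizer
        word = word.lower()                         # lowercase filter
        word = unicodedata.normalize("NFKD", word)  # asciifolding (approximate)
        word = word.encode("ascii", "ignore").decode("ascii")
        tokens.extend(ngrams(word, min_gram, max_gram))
    return tokens

print("je" in index_analyze("Jérusalem"))  # True: a two-character query matches
```

The search analyzer is the same pipeline minus the nGram step, so the query text stays whole while the indexed text is pre-chopped.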

 

Then you run curl to create it:

curl -XPUT "http://localhost:9200/test" -d @c:\pathto\CreateIndex.js

{"acknowledged":true}

Now we have the index settings right, with autocomplete suggestions starting from 2 letters.

 

Now we re-enter the mapping and data from the last post, just adding some features to the mapping.

CreateMappings.js:

{
    "name_of_your_object": {
        "_all": {
            "search_analyzer": "your_name_for_search_analyzer",
            "index_analyzer": "your_name_for_index_analyzer"
        },
        "properties": {
            "field_you_dont_want_to_break_into_small_tokens": {
                "type": "string",
                "index": "not_analyzed"
            },
            "always_in_query_field": {
                "type": "string",
                "include_in_all": true
            }
        }
    }
}

Then you run the curl:

curl -XPUT "http://localhost:9200/test/name_of_your_object/_mapping" -d @C:\pathto\CreateMappings.js

{"acknowledged":true}
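A quick sketch of what these mapping options mean (my own illustration, not Elasticsearch code): a not_analyzed field is stored as one exact token and only matches as a whole value, while an analyzed field is broken into searchable tokens:

```python
def index_field(value, analyzed=True):
    """not_analyzed fields are kept as one exact token;
    analyzed fields are tokenized (here: lowercased and split on whitespace)."""
    return value.lower().split() if analyzed else [value]

print(index_field("New York", analyzed=False))  # ['New York']  -> exact match only
print(index_field("New York"))                  # ['new', 'york'] -> partial matches
```

And include_in_all: true means the field's content also feeds the catch-all _all field, which is what our two analyzers above were attached to.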

 

Now we'll enter the actual data, as in the last post:

curl -XPOST "http://localhost:9200/test/name_of_your_object/_bulk" --data-binary @c:\pathto\formatizedToIndex.json

 

Now you have data with analyzers inside Elasticsearch, with autocomplete. Happy searching!
