Not Analyzed & Multi-Fields

  • Lucene, and hence ElasticSearch, breaks strings into terms using built-in or custom tokenizers
  • Some strings don't make sense to tokenize, e.g. a UUID or GUID, often used as the equivalent of a primary key and/or unique identifier
  • not_analyzed: the legacy ElasticSearch mapping option that suppresses tokenization; in ElasticSearch 5+ the keyword field type serves the same purpose:
    curl -XPUT 'localhost:9200/orders/orders/_mapping?pretty=true' \
     -H 'Content-Type: application/json' \
     -d '
    {
     "orders": {
        "properties": {
           "id": {
              "type": "text",
              "index": false
           }
        }
     }
    }'
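
    With that mapping in place, the id can be looked up with a term query, which skips analysis at search time and matches the indexed value verbatim. A hedged sketch — the index name, field, and sample UUID below are illustrative assumptions:

    # Term query: the input is NOT analyzed, so it must match the
    # exact value that was indexed (appropriate for keyword /
    # not_analyzed fields; a match query would tokenize the input first).
    curl -XGET 'localhost:9200/orders/_search?pretty=true' \
     -H 'Content-Type: application/json' \
     -d '
    {
     "query": {
        "term": {
           "id": "550e8400-e29b-41d4-a716-446655440000"
        }
     }
    }'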
    
  • What do you think the difference will be when searching or aggregating on a tokenized uuid/guid vs. a non-tokenized one?
  • What if I need both the tokenized and the non-tokenized version of the same field?
  • Multi-Fields Mapping allows indexing the same data twice:
    curl -XPUT 'localhost:9200/ordering/orders/_mapping?pretty=true' \
     -H 'Content-Type: application/json' \
     -d '
    {  
     "orders":{  
        "properties": {  
           "streetName": {
              "type":"text",
              "fields": {  
                 "notparsed": {  
                    "type":"keyword",
                    "index":"not_analyzed"
                 }
              }
           }
        }
     }
    }'
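
    With the multi-field mapping above, each version can be queried under its own name; the non-analyzed copy is addressed as streetName.notparsed. A hedged sketch — the index name and sample street value are illustrative assumptions:

    # match on the analyzed field: the input is tokenized and
    # lowercased, so a single token like "main" can match "Main Street".
    curl -XGET 'localhost:9200/ordering/_search?pretty=true' \
     -H 'Content-Type: application/json' \
     -d '
    {
     "query": { "match": { "streetName": "main" } }
    }'

    # term on the non-analyzed sub-field: only the exact,
    # untokenized value matches.
    curl -XGET 'localhost:9200/ordering/_search?pretty=true' \
     -H 'Content-Type: application/json' \
     -d '
    {
     "query": { "term": { "streetName.notparsed": "Main Street" } }
    }'

    The non-analyzed sub-field is also the one to use for terms aggregations and sorting, where per-token buckets would be meaningless.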
    
