Yazılım Çorbası: Elasticsearch match Query

Introduction

The match query is one of the analyzed queries. The description is as follows:

match query
The standard query for performing full text queries, including fuzzy matching and phrase or proximity queries.

Another description is as follows:

Creates a boolean query that returns results if the search term is present in the field.

Term queries vs Match query

The explanation is as follows:

Term-level queries are not analyzed. The match queries that work on text fields, on the other hand, are analyzed. The same analyzers used during the indexing process (unless search queries were explicitly defined with different analyzers) process the search words in match queries. If a standard analyzer (default analyzer) is used during the indexing of our document, the search words are analyzed using the same standard analyzer before the search is executed.

Additionally, the standard analyzer applies the same lowercase token filter (remember, the lowercase token filter is applied during the indexing) to the search words. Thus, if you provide the search keywords as uppercased, they are converted to lowercase letters and searched against the inverted index. For example, if we change the title value to use uppercase criteria such as "title": "JAVA" and rerun the query, the results are the same as the search query in listing 10.4. If you change the title value to lowercase or in any other way (e.g., java, jaVA, etc.), the query still returns the same results.

Standard analyzer

In short, the standard analyzer converts every word to lowercase, both at index time and at search time, as the passage quoted above describes.
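The difference between term-level and match lookups can be illustrated with a small, self-contained Python sketch. It is a simplified model, not Elasticsearch code: `simple_analyze` is a hypothetical stand-in for the standard analyzer that only splits text into words and lowercases them, and the two sample titles are made up for illustration.

```python
import re

def simple_analyze(text):
    """Rough stand-in for the standard analyzer: split into words, then lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]

# Hypothetical documents: id -> title
titles = {1: "Java Complete Guide", 2: "Python in Practice"}

# Build the inverted index from the analyzed (lowercased) tokens.
inverted_index = {}
for doc_id, title in titles.items():
    for token in simple_analyze(title):
        inverted_index.setdefault(token, set()).add(doc_id)

def term_lookup(value):
    """Term-level behaviour: the value is looked up exactly as given."""
    return inverted_index.get(value, set())

def match_lookup(text):
    """Match behaviour: the text is analyzed first, then each token is looked up."""
    hits = set()
    for token in simple_analyze(text):
        hits |= inverted_index.get(token, set())
    return hits

print(term_lookup("JAVA"))   # no hits: "JAVA" is not a token in the index
print(match_lookup("JAVA"))  # finds document 1 through the token "java"
print(match_lookup("jaVA"))  # same result as the line above
```

In this model the uppercase search word only finds the document once it has gone through the same lowercasing step that was applied when the index was built.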

Syntax

Short Form

The syntax is as follows:

GET books/_search
{
  "query": {
    "match": { 
      "FIELD": "SEARCH TEXT" 
    }
  }
}

Example

We do it like this:

GET books/_search
{
  "query": {
    "match": { 
      "title": "Java" 
    }
  }
}

Long Form

The syntax is as follows:

GET books/_search
{
  "query": {
    "match": {
      "FIELD": {
        "query": "<SEARCH TEXT>",
        "<parameter>": "<MY_PARAM>"
      }
    }
  }
}

The explanation is as follows:

As you can see in the snippet, the match query expects the search criteria to be defined in the form of a field value. The field can be any of the text fields present in a document, whose values are to be matched. The value can be a word or multiple words, given either as uppercase, lowercase, or camel case.
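As a rough sketch, the long form can also be assembled programmatically. The helper below is hypothetical (`build_match_query` is not part of any Elasticsearch client); it only builds the request body as a Python dictionary and assumes that extra keyword arguments are match parameters such as `operator` or `fuzziness`.

```python
import json

def build_match_query(field, text, **params):
    """Return a search body that uses the long form of the match query."""
    options = {"query": text}
    options.update(params)  # e.g. operator="AND" or fuzziness=1
    return {"query": {"match": {field: options}}}

body = build_match_query("title", "Java Complete Guide", operator="AND")
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the long-form syntax shown above and could be sent as the body of a `_search` request.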

Multiple Indices

Example

We do it like this:

GET new_books,classics,top_sellers,crime*/_search
{
  ...
}

The explanation is as follows:

We can search across multiple indices by providing comma-separated indices in the search URL.

As you can see, any number of indices can be provided when invoking the _search endpoint, including wildcards.

Note: If we omit the index (or indices) in the search request, we effectively search every index. For example, GET _search { ... } searches across all the indices in the cluster.
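The way the index list ends up in the URL can be sketched in a few lines of Python. `search_path` is a hypothetical helper written for this illustration; it only builds the request path and does not contact a cluster.

```python
def search_path(indices=None):
    """Build the request path for the _search endpoint."""
    if not indices:
        return "/_search"  # no index given: the search covers every index
    # Join the names with commas and no spaces; wildcards such as "crime*" pass through.
    return "/" + ",".join(name.strip() for name in indices) + "/_search"

print(search_path(["new_books", "classics", "top_sellers", "crime*"]))
# /new_books,classics,top_sellers,crime*/_search
print(search_path())
# /_search
```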

The match Query Matches If Any of the Given Values Is Present

The match query can be thought of as an OR query. A document is included in the results if any of the whole words in the query appears in the specified field.

Example

The explanation is as follows:

Keywords: “puerto baham”
It will look for countries that have “puerto” or “baham” in their name, so it will return users from Puerto Rico and Bahamas, which is exactly what we want.

Example

Suppose we have the following search:

GET books/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Java Complete Guide"
      }
    }
  },
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}

This search is actually equivalent to the following one. That is, it returns every book whose title field contains Java, Complete, or Guide:

GET books/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Java Complete Guide",
        "operator": "OR" 
      }
    }
  }
}

To change this behavior, we do it like this:

GET books/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Java Complete Guide",
        "operator": "AND" 
      }
    }
  }
}
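The effect of the operator can be imitated with plain Python sets. This is a simplified model under stated assumptions: analysis is reduced to lowercasing and splitting on whitespace, relevance scoring is ignored, and the book titles are invented for the example.

```python
def tokens(text):
    """Simplified analysis: lowercase, then split on whitespace."""
    return set(text.lower().split())

def matches(title, query, operator="OR"):
    """Return True if the title satisfies the query under the given operator."""
    title_tokens = tokens(title)
    query_tokens = tokens(query)
    if operator.upper() == "AND":
        return query_tokens <= title_tokens   # every query word must appear
    return bool(query_tokens & title_tokens)  # at least one query word appears

titles = ["Java Complete Guide", "Effective Java", "Python Cookbook"]
results = {}
for op in ("OR", "AND"):
    results[op] = [t for t in titles if matches(t, "Java Complete Guide", op)]
    print(op, results[op])
```

With OR, any title that shares at least one word with the query is kept; with AND, only titles that contain all three words remain.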

Example - minimum_should_match attribute

The explanation is as follows:

What if we want to find documents that match at least a few words from the given set of words? In the previous example, suppose we want at least two words out of three to match (say, Java and Guide, for example). This is where the minimum_should_match attribute comes in handy.

The minimum_should_match attribute indicates the minimum number of words that should be used to match the documents.

We do it like this:

GET books/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Java Complete Guide",
        "operator": "OR",
        "minimum_should_match": 2 
      }
    }
  }
}
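The counting rule behind this query can be sketched as follows. The model is deliberately minimal and only covers an integer `minimum_should_match`; it lowercases and splits on whitespace instead of running a real analyzer, and the titles are hypothetical.

```python
def matched_word_count(title, query):
    """Count how many distinct query words occur in the title."""
    title_tokens = set(title.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in title_tokens)

def satisfies(title, query, minimum_should_match=1):
    """Return True if enough query words are present in the title."""
    return matched_word_count(title, query) >= minimum_should_match

titles = ["Java Complete Guide", "The Java Guide", "Effective Java"]
hits = [t for t in titles if satisfies(t, "Java Complete Guide", 2)]
print(hits)
```

Here "Effective Java" is dropped because it contains only one of the three query words, while the other two titles contain at least two.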

Fuzzy Search

The explanation is as follows:

Simply put, fuzziness is a mechanism to correct a user’s spelling mistakes in query criteria.

Fuzziness makes character changes to string input so that it is the same as the string that may exist in the index. It employs the Levenshtein distance algorithm to fix incorrect spellings.

A match query also allows us to add a fuzziness parameter to fix spelling mistakes. We can set it as a numeric value, where the expected values are 0, 1, or 2, meaning none, one, or two character changes (insertions, deletions, modifications), respectively. In addition to setting these values, we can also use an AUTO setting; we let the engine deal with the changes by setting AUTO as its fuzziness parameter.

Example

We do it like this:

GET books/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Kava",
        "fuzziness": 1 
      }
    }
  }
}
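To see why a fuzziness of 1 is enough for "Kava" to reach "java", the edit distance can be computed by hand. The sketch below implements the classic Levenshtein distance named in the quoted text; it is an illustration of the idea, not the engine's own implementation, and `fuzzy_match` is a hypothetical helper that assumes both terms are lowercased first.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]

def fuzzy_match(query_term, indexed_term, fuzziness):
    """Return True if the terms are within the allowed number of edits."""
    return levenshtein(query_term.lower(), indexed_term.lower()) <= fuzziness

print(levenshtein("kava", "java"))     # 1: one substitution (k -> j)
print(fuzzy_match("Kava", "java", 1))  # True
print(fuzzy_match("Kava", "java", 0))  # False
```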

Searching Across All Fields

Example

We do it like this:

{"query": { "match": { "_all": "meaning" } } }

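The explanation is as follows:

...looks for the term “meaning” in all of the fields in all of the documents in your cluster.

Note that the _all field was deprecated in Elasticsearch 6.0 and removed in 7.0, so this example applies to older versions only.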

Specifying the Fields to Return

Example

We do it like this:

{
  "query": {
    "match": { "_all": "meaning" }
  },
  "fields": ["name", "surname", "age"],
  "from": 100, "size": 20
}

The explanation is as follows:

Here, we’re using the “fields” element to restrict which fields should be returned and the “from” and “size” elements to tell Elasticsearch we’re looking for documents 100 to 119 (starting at 100 and counting 20 documents).
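The arithmetic behind "from" and "size" can be checked with a list slice. This is only a model of the paging window: `page` is a hypothetical helper, and the 500 numbered hits stand in for an already sorted result list.

```python
def page(hits, from_=0, size=10):
    """Return the window of hits that starts at offset from_ and holds up to size items."""
    return hits[from_:from_ + size]

all_hits = list(range(500))  # stand-in for 500 sorted hits, numbered from 0
window = page(all_hits, from_=100, size=20)
print(window[0], window[-1], len(window))  # 100 119 20
```

Because the offset is zero-based, the window starts at hit 100 and ends at hit 119, which gives the 20 documents described above.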

Example - score

Suppose we have the following query:

GET /_search
{
   "query" : {
     "match" : {
       "tweet" : "grow up"
     }
  }
}

We get the following output:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.9790175,
    "hits": [
      {
        "_index": "App3",
        "_type": "tweets",
        "_id": "2",
        "_score": 1.9790175,
        "_source": {
          "name": "Katrina Kaif",
          "age": 22,
          "tweet": "We never really grow up, we only learn how to act in public."
        }
      },
      {
        "_index": "App3",
        "_type": "tweets",
        "_id": "114",
        "_score": 0.30432263,
        "_source": {
          "name": "Ajay Devgn",
          "age": 62,
          "tweet": "Stress is when you wake up screaming and you realize you haven’t fallen asleep yet."
        }
      }
    ]
  }
}

The explanation is as follows:

Lines 2 to 8 of the response show meta information, such as the 3 ms the query took to return the result and some details about the shards.
From line 9 onwards we see the actual query results.
Line 10 tells us that there are two matching results for the query.
Line 11 shows the maximum relevance _score value, 1.979. This is followed by the two matching objects, the first with a _score value of 1.979 and the second with a _score value of 0.304. The drastic score difference is likely due to the fact that the second tweet doesn’t have “grow up” as a phrase. It only has the word “up”.
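Reading the scores out of such a response takes only a few lines of Python. The sketch below works on a trimmed copy of the output shown above (only the fields it needs are kept) and assumes the older response shape in which "total" is a plain number.

```python
import json

# Trimmed copy of the response above: only ids and scores are kept.
response_text = """
{
  "hits": {
    "total": 2,
    "max_score": 1.9790175,
    "hits": [
      {"_id": "2", "_score": 1.9790175},
      {"_id": "114", "_score": 0.30432263}
    ]
  }
}
"""

response = json.loads(response_text)
hits = response["hits"]
print("total:", hits["total"], "max_score:", hits["max_score"])

# Collect (id, score) pairs in the order the response returned them.
scores = [(hit["_id"], hit["_score"]) for hit in hits["hits"]]
for doc_id, score in scores:
    print(doc_id, score)
```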

Thursday, 11 February 2021

Elasticsearch match Query - For Full Text Search
