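Introduction
In the era of big data, efficient data retrieval is a key requirement for many application systems. Elasticsearch, a powerful open-source distributed search and analytics engine, provides two basic ways to search, helping developers pull exactly the information they need out of massive data sets. Each has its own characteristics and suits different business scenarios and query needs. This article looks at both in depth, with plenty of examples and detailed explanations, to help readers master Elasticsearch search techniques.
Data preparation: data JSON
Overview of Elasticsearch search methods
Introduction to the two search methods
Elasticsearch supports two basic search methods: sending search parameters through the REST request URI, and sending them through the REST request body. Understanding how the two differ and where each applies is the foundation for using Elasticsearch efficiently.
Method 1: Send search parameters through the REST request URI
- How it works: the search parameters are appended directly to the URI as a query string and passed to the Elasticsearch server. This approach is simple and direct, and suits simple searches.
- Example: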
- GET bank/_search?q=*&sort=account_number:asc
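- Parameter explanation:
- q=*: q is the query condition; * is a wildcard, so this matches all documents.
- sort=account_number:asc: sort specifies the sort order; here results are sorted by the account_number field in ascending order. asc means ascending, desc means descending.
- Response analysis: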
- {
- "took" : 235,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 1000,
- "relation" : "eq"
- },
- "max_score" : null,
- "hits" : [
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "0",
- "_score" : null,
- "_source" : {
- "account_number" : 0,
- "balance" : 16623,
- "firstname" : "Bradshaw",
- "lastname" : "Mckenzie",
- "age" : 29,
- "gender" : "F",
- "address" : "244 Columbus Place",
- "employer" : "Euron",
- "email" : "bradshawmckenzie@euron.com",
- "city" : "Hobucken",
- "state" : "CO"
- },
- "sort" : [
- 0
- ]
- },
- // remaining documents omitted here
- ]
- }
- }
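- - `took`: the time Elasticsearch spent executing the query, in milliseconds; here it is 235 ms.
- - `timed_out`: whether the search request timed out; `false` means the query finished within the allowed time.
- - `_shards`: information about the shards that were searched. `total` is the total number of shards, `successful` the number searched successfully, `skipped` the number skipped, and `failed` the number that failed. Here there is 1 shard in total and it was searched successfully.
- - `hits.total.value`: the number of matching documents, here 1000. Because the query matches all documents, this means the `bank` index holds 1000 documents.
- - `max_score`: the highest relevance score among the hits. It is `null` here because the results are sorted by `account_number` rather than by score, so no scores are computed.
- - `hits.hits.sort`: the sort values of each document (present when results are not sorted by relevance score). The results are sorted by `account_number` in ascending order, so each document's `sort` value is its `account_number`.
- - `hits.hits._score`: the relevance score of each document; it is `null` here for the same reason as `max_score`.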
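The request and the response fields described above can be sketched in Python. This is only an illustration: the host `http://localhost:9200`, the helper names `build_uri_search` and `summarize`, and the trimmed `sample_response` dictionary are assumptions made for the example, and nothing is sent over the network.

```python
from urllib.parse import urlencode

def build_uri_search(index, q, sort, base="http://localhost:9200"):
    """Build a URI search URL; `base` is a hypothetical local host."""
    # safe="*:" leaves the wildcard and the field:order separator unescaped
    query_string = urlencode({"q": q, "sort": sort}, safe="*:")
    return f"{base}/{index}/_search?{query_string}"

def summarize(response):
    """Pick out the response fields discussed in the article."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "total": hits["total"]["value"],
        "sort_values": [hit["sort"][0] for hit in hits["hits"]],
    }

# Trimmed copy of the response shown above (only the first hit is kept).
sample_response = {
    "took": 235,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 1000, "relation": "eq"},
        "max_score": None,
        "hits": [
            {"_id": "0", "_score": None,
             "_source": {"account_number": 0, "firstname": "Bradshaw"},
             "sort": [0]},
        ],
    },
}

url = build_uri_search("bank", "*", "account_number:asc")
summary = summarize(sample_response)
print(url)
print(summary)
```

In a real client the URL would be passed to an HTTP library and the JSON body decoded before calling `summarize`.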
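Method 2: Send search parameters through the REST request body
How it works: the search parameters are placed in the body of the HTTP request and sent to the Elasticsearch server. The body is written in a domain-specific language (DSL) that uses JSON to define complex query conditions, sort rules, pagination settings, and so on. This approach is highly flexible and can meet complex search needs.
(1) Basic syntax
Elasticsearch provides a JSON-style DSL for executing queries. It is called the Query DSL, and it is very comprehensive.
The typical structure of a query clause: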
- QUERY_NAME:{
- ARGUMENT:VALUE,
- ARGUMENT:VALUE,...
- }
If the query targets a specific field, its structure is as follows:
- {
- QUERY_NAME:{
- FIELD_NAME:{
- ARGUMENT:VALUE,
- ARGUMENT:VALUE,...
- }
- }
- }
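As a rough sketch of the two shapes above, the nesting can be reproduced with plain Python dictionaries. The helper names `simple_query` and `field_query` are hypothetical, and the `query` and `operator` arguments passed to `match` are used only as sample arguments.

```python
import json

def simple_query(query_name, **arguments):
    """QUERY_NAME: {ARGUMENT: VALUE, ...}"""
    return {query_name: arguments}

def field_query(query_name, field_name, **arguments):
    """QUERY_NAME: {FIELD_NAME: {ARGUMENT: VALUE, ...}}"""
    return {query_name: {field_name: arguments}}

# A clause with no arguments, and a clause aimed at one field.
all_docs = simple_query("match_all")
by_address = field_query("match", "address", query="mill road", operator="and")

print(json.dumps({"query": all_docs}))
print(json.dumps({"query": by_address}))
```

Either dictionary can be placed under the top-level `query` key of a request body.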
- GET bank/_search
- {
- "query": {
- "match_all": {}
- },
- "from": 0,
- "size": 5,
- "sort": [
- {
- "account_number": {
- "order": "desc"
- }
- }
- ]
- }
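- // match_all matches every document; return 5 documents starting from offset 0
- query defines how the search is performed;
- match_all is a query type that matches every document; Elasticsearch lets you combine many query types inside query to build complex searches;
- besides query, other parameters such as sort and size can be passed to change the results;
- from and size together implement pagination;
- sort sets the ordering; with multiple sort fields, a later field only orders documents whose earlier fields are equal, otherwise the earlier field decides.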
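The pagination and sorting rules above can be sketched in Python. This is a minimal illustration under assumptions: the helper `page_request` and its 1-based `page` argument are hypothetical, and the three `records` are invented solely to show how a second sort field breaks ties.

```python
import json

def page_request(page, page_size, sort_field="account_number", order="desc"):
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # offset of the first hit to return
        "size": page_size,               # number of hits per page
        "sort": [{sort_field: {"order": order}}],
    }

first_page = page_request(1, 5)
third_page = page_request(3, 5)
print(json.dumps(first_page, indent=2))

# Multi-field ordering modelled locally: age ascending, then balance
# descending. balance only matters for records whose age is equal.
records = [
    {"id": 1, "age": 30, "balance": 100},
    {"id": 2, "age": 28, "balance": 50},
    {"id": 3, "age": 30, "balance": 300},
]
ordered_ids = [r["id"] for r in sorted(records, key=lambda r: (r["age"], -r["balance"]))]
print(ordered_ids)
```

`first_page` has the same content as the request body shown above; `third_page` asks for hits 10 through 14.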
(2) Returning only some fields
- GET bank/_search
- {
- "query": {
- "match_all": {}
- },
- "from": 0,
- "size": 5,
- "sort": [
- {
- "account_number": {
- "order": "desc"
- }
- }
- ],
- "_source": ["balance","firstname"]
-
- }
Query result:
- {
- "took" : 18,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 1000,
- "relation" : "eq"
- },
- "max_score" : null,
- "hits" : [
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "999",
- "_score" : null,
- "_source" : {
- "firstname" : "Dorothy",
- "balance" : 6087
- },
- "sort" : [
- 999
- ]
- },
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "998",
- "_score" : null,
- "_source" : {
- "firstname" : "Letha",
- "balance" : 16869
- },
- "sort" : [
- 998
- ]
- },
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "997",
- "_score" : null,
- "_source" : {
- "firstname" : "Combs",
- "balance" : 25311
- },
- "sort" : [
- 997
- ]
- },
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "996",
- "_score" : null,
- "_source" : {
- "firstname" : "Andrews",
- "balance" : 17541
- },
- "sort" : [
- 996
- ]
- },
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "995",
- "_score" : null,
- "_source" : {
- "firstname" : "Phelps",
- "balance" : 21153
- },
- "sort" : [
- 995
- ]
- }
- ]
- }
- }
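The effect of `_source` in the request above can be modelled locally in a few lines of Python. This is a simplified picture of the result, not how Elasticsearch implements it; the helper name `filter_source` is hypothetical, and the sample document is the record for account 999 with an invented `age` value added to stand in for the fields that get dropped.

```python
def filter_source(source, fields):
    """Keep only the listed top-level fields of a document."""
    return {name: source[name] for name in fields if name in source}

# firstname and balance come from the result above; age is a made-up
# placeholder for the fields that the filter removes.
document = {"account_number": 999, "balance": 6087, "firstname": "Dorothy", "age": 0}

trimmed = filter_source(document, ["balance", "firstname"])
print(trimmed)
```

Returning fewer fields keeps responses smaller, which is useful when only a couple of values are needed for display.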
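(3) match query
- Basic (non-string) types: in "account_number": 20 the value may be written with or without quotes; on a non-string field, match performs an exact match.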
- GET bank/_search
- {
- "query": {
- "match": {
- "account_number": "20"
- }
- }
- }
match returns the document whose account_number is 20.
Query result:
- {
- "took" : 1,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 1,
- "relation" : "eq"
- },
- "max_score" : 1.0,
- "hits" : [
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "20",
- "_score" : 1.0,
- "_source" : {
- "account_number" : 20,
- "balance" : 16418,
- "firstname" : "Elinor",
- "lastname" : "Ratliff",
- "age" : 36,
- "gender" : "M",
- "address" : "282 Kings Place",
- "employer" : "Scentric",
- "email" : "elinorratliff@scentric.com",
- "city" : "Ribera",
- "state" : "WA"
- }
- }
- ]
- }
- }
- GET bank/_search
- {
- "query": {
- "match": {
- "address": "kings"
- }
- }
- }
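This is a full-text search: the search text is split into terms and matched against the field, and the results are ordered by relevance score.
Query result: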
- {
- "took" : 30,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 2,
- "relation" : "eq"
- },
- "max_score" : 5.990829,
- "hits" : [
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "20",
- "_score" : 5.990829,
- "_source" : {
- "account_number" : 20,
- "balance" : 16418,
- "firstname" : "Elinor",
- "lastname" : "Ratliff",
- "age" : 36,
- "gender" : "M",
- "address" : "282 Kings Place",
- "employer" : "Scentric",
- "email" : "elinorratliff@scentric.com",
- "city" : "Ribera",
- "state" : "WA"
- }
- },
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "722",
- "_score" : 5.990829,
- "_source" : {
- "account_number" : 722,
- "balance" : 27256,
- "firstname" : "Roberts",
- "lastname" : "Beasley",
- "age" : 34,
- "gender" : "F",
- "address" : "305 Kings Hwy",
- "employer" : "Quintity",
- "email" : "robertsbeasley@quintity.com",
- "city" : "Hayden",
- "state" : "PA"
- }
- }
- ]
- }
- }
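(4) match_phrase [phrase matching]
The search text is matched as one whole phrase rather than as independent terms: its terms must appear together and in the same order.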
- GET bank/_search
- {
- "query": {
- "match_phrase": {
- "address": "mill road"
- }
- }
- }
This finds all records whose address contains the phrase mill road, and gives each one a relevance score.
Result:
- {
- "took" : 32,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 1,
- "relation" : "eq"
- },
- "max_score" : 8.926605,
- "hits" : [
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "970",
- "_score" : 8.926605,
- "_source" : {
- "account_number" : 970,
- "balance" : 19648,
- "firstname" : "Forbes",
- "lastname" : "Wallace",
- "age" : 28,
- "gender" : "M",
- "address" : "990 Mill Road",
- "employer" : "Pheast",
- "email" : "forbeswallace@pheast.com",
- "city" : "Lopezo",
- "state" : "AK"
- }
- }
- ]
- }
- }
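The difference between match_phrase and match can be seen in the following examples:
match_phrase does phrase matching.
match does term matching; for example, 990 Mill matches results that contain 990 or Mill.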
- GET bank/_search
- {
- "query": {
- "match_phrase": {
- "address": "990 Mill"
- }
- }
- }
Query result:
- {
- "took" : 0,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 1,
- "relation" : "eq"
- },
- "max_score" : 10.806405,
- "hits" : [
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "970",
- "_score" : 10.806405,
- "_source" : {
- "account_number" : 970,
- "balance" : 19648,
- "firstname" : "Forbes",
- "lastname" : "Wallace",
- "age" : 28,
- "gender" : "M",
- "address" : "990 Mill Road",
- "employer" : "Pheast",
- "email" : "forbeswallace@pheast.com",
- "city" : "Lopezo",
- "state" : "AK"
- }
- }
- ]
- }
- }
Using match on the keyword sub-field
- GET bank/_search
- {
- "query": {
- "match": {
- "address.keyword": "990 Mill"
- }
- }
- }
Query result: no documents matched
- {
- "took" : 0,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 0,
- "relation" : "eq"
- },
- "max_score" : null,
- "hits" : [ ]
- }
- }
Change the search text to "990 Mill Road"
- GET bank/_search
- {
- "query": {
- "match": {
- "address.keyword": "990 Mill Road"
- }
- }
- }
One document is returned
- {
- "took" : 1,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 1,
- "relation" : "eq"
- },
- "max_score" : 6.5032897,
- "hits" : [
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "970",
- "_score" : 6.5032897,
- "_source" : {
- "account_number" : 970,
- "balance" : 19648,
- "firstname" : "Forbes",
- "lastname" : "Wallace",
- "age" : 28,
- "gender" : "M",
- "address" : "990 Mill Road",
- "employer" : "Pheast",
- "email" : "forbeswallace@pheast.com",
- "city" : "Lopezo",
- "state" : "AK"
- }
- }
- ]
- }
- }
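When a text field is matched through its keyword sub-field, the search text must equal the field's entire value; it is an exact match.
match_phrase does phrase matching: as long as the text contains the phrase, the document matches.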
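The three behaviours compared above can be imitated with a small in-memory model. This is a deliberately simplified sketch, not Elasticsearch's implementation: `analyze` is only a rough stand-in for an analyzer (lowercase, split on non-alphanumeric characters), scoring is ignored, and all function names are hypothetical. The addresses are taken from the results in this article.

```python
import re

def analyze(text):
    """Rough stand-in for an analyzer: lowercase and split into terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match(field_value, query):
    """Term matching: any query term found in the field is enough."""
    field_terms = set(analyze(field_value))
    return any(term in field_terms for term in analyze(query))

def match_phrase(field_value, query):
    """Phrase matching: query terms must be adjacent and in order."""
    field_terms, query_terms = analyze(field_value), analyze(query)
    n = len(query_terms)
    return n > 0 and any(
        field_terms[i:i + n] == query_terms
        for i in range(len(field_terms) - n + 1)
    )

def keyword_equals(field_value, query):
    """keyword sub-field: the whole, unanalyzed value must be equal."""
    return field_value == query

addresses = ["990 Mill Road", "198 Mill Lane", "282 Kings Place"]

by_match = [a for a in addresses if match(a, "990 Mill")]
by_phrase = [a for a in addresses if match_phrase(a, "990 Mill")]
by_keyword_partial = [a for a in addresses if keyword_equals(a, "990 Mill")]
by_keyword_full = [a for a in addresses if keyword_equals(a, "990 Mill Road")]

print(by_match, by_phrase, by_keyword_partial, by_keyword_full)
```

In this model, match is the loosest of the three and the keyword comparison the strictest, which mirrors the hit counts in the results above.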
(5) multi_match [multi-field matching]
- GET bank/_search
- {
- "query": {
- "multi_match": {
- "query": "mill",
- "fields": [
- "state",
- "address"
- ]
- }
- }
- }
This matches documents whose state or address contains mill; the search text is also split into terms during the query.
Query result:
- {
- "took" : 28,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 4,
- "relation" : "eq"
- },
- "max_score" : 5.4032025,
- "hits" : [
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "970",
- "_score" : 5.4032025,
- "_source" : {
- "account_number" : 970,
- "balance" : 19648,
- "firstname" : "Forbes",
- "lastname" : "Wallace",
- "age" : 28,
- "gender" : "M",
- "address" : "990 Mill Road",
- "employer" : "Pheast",
- "email" : "forbeswallace@pheast.com",
- "city" : "Lopezo",
- "state" : "AK"
- }
- },
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "136",
- "_score" : 5.4032025,
- "_source" : {
- "account_number" : 136,
- "balance" : 45801,
- "firstname" : "Winnie",
- "lastname" : "Holland",
- "age" : 38,
- "gender" : "M",
- "address" : "198 Mill Lane",
- "employer" : "Neteria",
- "email" : "winnieholland@neteria.com",
- "city" : "Urie",
- "state" : "IL"
- }
- },
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "345",
- "_score" : 5.4032025,
- "_source" : {
- "account_number" : 345,
- "balance" : 9812,
- "firstname" : "Parker",
- "lastname" : "Hines",
- "age" : 38,
- "gender" : "M",
- "address" : "715 Mill Avenue",
- "employer" : "Baluba",
- "email" : "parkerhines@baluba.com",
- "city" : "Blackgum",
- "state" : "KY"
- }
- },
- {
- "_index" : "bank",
- "_type" : "account",
- "_id" : "472",
- "_score" : 5.4032025,
- "_source" : {
- "account_number" : 472,
- "balance" : 25571,
- "firstname" : "Lee",
- "lastname" : "Long",
- "age" : 32,
- "gender" : "F",
- "address" : "288 Mill Street",
- "employer" : "Comverges",
- "email" : "leelong@comverges.com",
- "city" : "Movico",
- "state" : "MT"
- }
- }
- ]
- }
- }
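The "any of the listed fields" behaviour of multi_match can be pictured with a short local model. As before, this is a simplified sketch rather than Elasticsearch's implementation: `analyze` is a rough stand-in for an analyzer, scoring is ignored, and the helper names are hypothetical. The three documents are trimmed copies of records shown in this article.

```python
import re

def analyze(text):
    """Rough stand-in for an analyzer: lowercase and split into terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def multi_match(document, query, fields):
    """True when any listed field shares at least one term with the query."""
    query_terms = set(analyze(query))
    return any(
        query_terms & set(analyze(document.get(field, "")))
        for field in fields
    )

documents = [
    {"account_number": 970, "address": "990 Mill Road", "state": "AK"},
    {"account_number": 136, "address": "198 Mill Lane", "state": "IL"},
    {"account_number": 20, "address": "282 Kings Place", "state": "WA"},
]

matched = [
    d["account_number"]
    for d in documents
    if multi_match(d, "mill", ["state", "address"])
]
print(matched)
```

A search text with several terms would match if any one of them appears in any of the listed fields.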