Insert the test data
POST nba/_bulk
{"index":{"_index":"nba","_type":"_doc","_id":"1"}}
{"countryEn":"United States","teamName":"老鹰","birthDay":831182400000,"country":"美国","teamCityEn":"Atlanta","code":"jaylen_adams","displayAffiliation":"United States","displayName":"杰伦 亚当斯","schoolType":"College","teamConference":"东部","teamConferenceEn":"Eastern","weight":"86.2 公斤","teamCity":"亚特兰大","playYear":1,"jerseyNo":"10","teamNameEn":"Hawks","draft":2018,"displayNameEn":"Jaylen Adams","heightValue":1.88,"birthDayStr":"1996-05-04","position":"后卫","age":23,"playerId":"1629121"}
{"index":{"_index":"nba","_type":"_doc","_id":"2"}}
{"countryEn":"New Zealand","teamName":"雷霆","birthDay":743140800000,"country":"新西兰","teamCityEn":"Oklahoma City","code":"steven_adams","displayAffiliation":"Pittsburgh/New Zealand","displayName":"斯蒂文 亚当斯","schoolType":"College","teamConference":"西部","teamConferenceEn":"Western","weight":"120.2 公斤","teamCity":"俄克拉荷马城","playYear":6,"jerseyNo":"12","teamNameEn":"Thunder","draft":2013,"displayNameEn":"Steven Adams","heightValue":2.13,"birthDayStr":"1993-07-20","position":"中锋","age":26,"playerId":"203500"}
{"index":{"_index":"nba","_type":"_doc","_id":"5"}}
{"countryEn":"United States","teamName":"马刺","birthDay":490593600000,"country":"美国","teamCityEn":"San Antonio","code":"lamarcus_aldridge","displayAffiliation":"Texas/United States","displayName":"拉马库斯 阿尔德里奇","schoolType":"College","teamConference":"西部","teamConferenceEn":"Western","weight":"117.9 公斤","teamCity":"圣安东尼奥","playYear":13,"jerseyNo":"12","teamNameEn":"Spurs","draft":2006,"displayNameEn":"LaMarcus Aldridge","heightValue":2.11,"birthDayStr":"1985-07-19","position":"中锋-前锋","age":34,"playerId":"200746"}
{"index":{"_index":"nba","_type":"_doc","_id":"6"}}
{"countryEn":"Canada","teamName":"鹈鹕","birthDay":887000400000,"country":"加拿大","teamCityEn":"New Orleans","code":"nickeil_alexander-walker","displayAffiliation":"Virginia Tech/Canada","displayName":"Nickeil Alexander-Walker","schoolType":"College","teamConference":"西部","teamConferenceEn":"Western","weight":"92.5 公斤","teamCity":"新奥尔良","playYear":0,"jerseyNo":"","teamNameEn":"Pelicans","draft":2019,"displayNameEn":"Nickeil Alexander-Walker","heightValue":1.96,"birthDayStr":"1998-02-09","position":"后卫","age":21,"playerId":"1629638"}
{"index":{"_index":"nba","_type":"_doc","_id":"7"}}
{"countryEn":"United States","teamName":"尼克斯","birthDay":727074000000,"country":"美国","teamCityEn":"New York","code":"kadeem_allen","displayAffiliation":"Arizona/United States","displayName":"卡迪姆 艾伦","schoolType":"College","teamConference":"东部","teamConferenceEn":"Eastern","weight":"90.7 公斤","teamCity":"纽约","playYear":2,"jerseyNo":"0","teamNameEn":"Knicks","draft":2017,"displayNameEn":"Kadeem Allen","heightValue":1.9,"birthDayStr":"1993-01-15","position":"后卫","age":26,"playerId":"1628443"}
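A match query analyzes the query string into terms, and a document is a hit as soon as any of those terms matches. Term order affects neither which documents match nor their final score.
The two forms below are equivalent: the first is more concise, the second is easier to extend with additional options.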
# Searching for "York New" instead returns the same hits with the same scores.
GET /nba/_search
{
  "query": {
    "match": {
      "teamCityEn": "New York"
    }
  }
}
GET /nba/_search
{
  "query": {
    "match": {
      "teamCityEn": {
        "query": "New York"
      }
    }
  }
}
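The match query supports two operators, "and" and "or", with "or" as the default: a document matches if it contains any one of the analyzed terms. To require every term to match, set operator to "and".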
GET /nba/_search
{
  "query": {
    "match": {
      "teamCityEn": {
        "query": "New York",
        "operator": "and"
      }
    }
  }
}
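The "or" and "and" behaviour can be sketched in a few lines of Python. This is only a toy model, not how Elasticsearch is implemented: it assumes a simplified analyzer that lowercases and splits on whitespace, and the helper names analyze and match are made up for illustration.

```python
def analyze(text):
    """Toy analyzer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def match(field_value, query, operator="or"):
    """Return True if the analyzed query matches the analyzed field value."""
    field_terms = set(analyze(field_value))
    query_terms = analyze(query)
    hits = [term in field_terms for term in query_terms]
    # "or": any term is enough; "and": every term must be present.
    return any(hits) if operator == "or" else all(hits)


cities = ["Atlanta", "Oklahoma City", "New Orleans", "New York"]

or_hits = [c for c in cities if match(c, "New York")]
and_hits = [c for c in cities if match(c, "New York", operator="and")]
reversed_hits = [c for c in cities if match(c, "York New")]

print(or_hits)        # ['New Orleans', 'New York']
print(and_hits)       # ['New York']
print(reversed_hits)  # ['New Orleans', 'New York']
```

With "or", "New Orleans" is a hit because it shares the term "new"; with "and", only "New York" remains. Reversing the query terms changes nothing.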
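The "and" operator is coarse: every term must match. When only a partial match is required, use minimum_should_match, which sets the minimum number of terms that must match. The example below requires at least 2 of the terms New, York, and States to match.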
GET /nba/_search
{
  "query": {
    "match": {
      "teamCityEn": {
        "query": "New York States",
        "minimum_should_match": 2
      }
    }
  }
}
Besides an absolute number, minimum_should_match also accepts a percentage, as in the following query.
GET /nba/_search
{
  "query": {
    "match": {
      "teamCityEn": {
        "query": "New York States",
        "minimum_should_match": "50%"
      }
    }
  }
}
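This requires at least 50% of the 3 terms to match. There is a pitfall here: 50% of 3 terms is 1.5, but ES rounds down to 1, so any document that matches just one of the analyzed terms appears in the results.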
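The rounding can be checked with a small calculation. The sketch below is an illustration only: required_matches is a hypothetical helper, and it covers just plain non-negative counts and percentages, not the other formats minimum_should_match accepts.

```python
def required_matches(num_terms, minimum_should_match):
    """Number of terms that must match, for a count (2) or a percentage ("50%")."""
    if isinstance(minimum_should_match, str) and minimum_should_match.endswith("%"):
        percent = int(minimum_should_match[:-1])
        # Integer floor division models the rounding down: 3 * 50 // 100 == 1.
        return (num_terms * percent) // 100
    return int(minimum_should_match)


print(required_matches(3, 2))      # 2
print(required_matches(3, "50%"))  # 1
print(required_matches(3, "75%"))  # 2
print(required_matches(4, "50%"))  # 2
```

For a three-term query, a percentage has to reach roughly 67% before two terms are required.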
The following example queries multiple fields and gives each field a different weight with boost.
GET /nba/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "teamCityEn": {
              "query": "New York",
              "boost": 2
            }
          }
        },
        {
          "match": {
            "displayName": {
              "query": "亚当斯",
              "boost": 5
            }
          }
        }
      ]
    }
  }
}
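As a rough model of how boost influences a bool query with should clauses: each matching clause contributes its score multiplied by its boost, and the contributions are added up. The raw scores below are invented numbers, not real Elasticsearch output, and bool_should_score is a hypothetical helper.

```python
def bool_should_score(clauses):
    """Sum boost * score over the should clauses that matched.

    Each clause is (raw_score, boost); raw_score is None when the clause
    did not match the document.
    """
    return sum(score * boost for score, boost in clauses if score is not None)


# Hypothetical raw scores for two documents: (city clause, name clause).
doc_city_only = [(1.2, 2), (None, 5)]  # matches the city clause only
doc_name_only = [(None, 2), (1.0, 5)]  # matches the name clause only

print(bool_should_score(doc_city_only))  # 2.4
print(bool_should_score(doc_name_only))  # 5.0
```

Although the city clause has the higher raw score here, the larger boost on the name clause ranks the second document first.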
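This requirement is the closest to how we use a search engine in everyday life.
When searching, a single input often has to be matched against several fields. For example, the input "亚当斯" should find similar content in three fields: displayName, teamName, and country.
Without any further tuning, either of the following two forms can be used.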
# Method 1: bool with should clauses
GET /nba/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "teamName": {
              "query": "亚当斯"
            }
          }
        },
        {
          "match": {
            "displayName": {
              "query": "亚当斯"
            }
          }
        },
        {
          "match": {
            "country": {
              "query": "亚当斯"
            }
          }
        }
      ]
    }
  }
}
# Method 2: multi_match, covered in detail later
GET /nba/_search
{
  "query": {
    "multi_match": {
      "query": "亚当斯",
      "fields": [
        "displayName",
        "teamName",
        "country"
      ]
    }
  }
}
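One form is the very concise multi_match; the other combines bool, should, and match.
The bool query checks every field and adds the scores of all matching should clauses into the final score. (multi_match with its default type, best_fields, instead takes the score of the best-matching field.) With bool and should, a document that scores very high on one field but low on the other two can therefore be outranked by a document that scores moderately on all three.
Usually, though, we want the document whose single best field scores highest, and a match in any one of the three fields is enough. This is where dis_max comes in.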
GET /nba/_search
{
  "query": {
    "dis_max": {
      "queries": [
        {
          "match": {
            "teamName": {
              "query": "亚当斯"
            }
          }
        },
        {
          "match": {
            "displayName": {
              "query": "亚当斯"
            }
          }
        },
        {
          "match": {
            "country": {
              "query": "亚当斯"
            }
          }
        }
      ]
    }
  }
}
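The difference between adding up clause scores (bool with should) and keeping only the best one (dis_max) is easiest to see with numbers. The per-field scores below are invented for illustration and the two helper functions are hypothetical; this models only how the scores are combined, not how Elasticsearch computes them.

```python
def bool_should_score(field_scores):
    """bool/should: add up the scores of all matching clauses."""
    return sum(field_scores)


def dis_max_score(field_scores):
    """dis_max without tie_breaker: keep only the best clause's score."""
    return max(field_scores, default=0.0)


# Hypothetical per-field scores (0.0 means the field did not match).
strong_single = [3.0, 0.0, 0.0]    # one field matches very well
weak_spread = [1.25, 1.25, 1.25]   # three fields match weakly

print(bool_should_score(strong_single), bool_should_score(weak_spread))  # 3.0 3.75
print(dis_max_score(strong_single), dis_max_score(weak_spread))          # 3.0 1.25
```

Summing ranks the weakly matching document first (3.75 against 3.0), while dis_max ranks the document with the single strong match first (3.0 against 1.25).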
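The query above uses only the score of the best-matching clause and ignores how well the other fields match. To let the other fields count as well, add tie_breaker: the final score is then the best score plus tie_breaker times the scores of the other matching clauses.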
GET /nba/_search
{
  "query": {
    "dis_max": {
      "tie_breaker": 0.7,
      "queries": [
        {
          "match": {
            "teamName": {
              "query": "亚当斯"
            }
          }
        },
        {
          "match": {
            "displayName": {
              "query": "亚当斯"
            }
          }
        },
        {
          "match": {
            "country": {
              "query": "亚当斯"
            }
          }
        }
      ]
    }
  }
}
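How tie_breaker changes the dis_max score can be sketched as below. The clause scores are made-up numbers and dis_max_score is a hypothetical helper. The formula (best score plus tie_breaker times the remaining matching clause scores) follows the documented behaviour of dis_max, but treat the code as an illustration rather than the actual implementation.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Best clause score plus tie_breaker times the other matching clauses' scores."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    others = sum(clause_scores) - best
    return best + tie_breaker * others


scores = [2.0, 1.0, 0.5]  # hypothetical scores of the matching clauses

print(dis_max_score(scores))                   # 2.0
print(dis_max_score(scores, tie_breaker=0.5))  # 2.75
print(dis_max_score(scores, tie_breaker=1.0))  # 3.5
```

A tie_breaker of 0 gives pure dis_max, a value of 1.0 gives the same result as summing all clauses, and values in between (such as the 0.7 used in the query) let the other fields contribute a fraction of their scores.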
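You can of course combine all three: use boost to give one field a higher weight, dis_max to pick the highest-scoring clause, and tie_breaker to take the other fields into account.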
GET /nba/_search
{
  "query": {
    "dis_max": {
      "tie_breaker": 0.7,
      "queries": [
        {
          "match": {
            "teamName": {
              "query": "亚当斯"
            }
          }
        },
        {
          "match": {
            "displayName": {
              "query": "亚当斯",
              "boost": 3
            }
          }
        },
        {
          "match": {
            "country": {
              "query": "亚当斯"
            }
          }
        }
      ]
    }
  }
}
To be continued...