
Sorting Elasticsearch hits with "prefix first" logic

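I want a sorted result set for auto-suggest in which entries that start with the search term appear at the top, followed by entries that merely "contain" it somewhere in the text. For example, for the search term "advocate", the results should be:

Advocate x
Advocate Yx
Some advocate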

However, my result set gives a higher score to results that contain the term than to those that start with it. How can I fix this?

Mappings:

{
  "settings": {
    "index": {
      "max_ngram_diff": 39
    },
    "analysis": {
      "normalizer": {
        "custom_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": ["lowercase", "asciifolding"]
        }
      },
      "analyzer": {
        "custom_analyzer": {
          "tokenizer": "custom_tokenizer",
          "filter": ["lowercase"]
        },
        "autocomplete_search": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": "lowercase"
        }
      },
      "tokenizer": {
        "custom_tokenizer": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 40,
          "token_chars": ["letter", "digit", "whitespace", "punctuation", "symbol"]
        }
      }
    }
  },
  "mappings": {
    "relations": {
      "properties": {
        "primaryTerm": {
          "type": "text",
          "analyzer": "custom_analyzer",
          "search_analyzer": "autocomplete_search",
          "fielddata": "true",
          "fields": {
            "raw": {
              "type": "keyword",
              "normalizer": "custom_normalizer"
            }
          }
        },
        "entityType": {
          "type": "keyword",
          "normalizer": "custom_normalizer"
        },
        "variants": {
          "type": "text",
          "normalizer": "custom_normalizer"
        }
      }
    }
  }
}

Search query:

String query = "{\"bool\": {\"should\": ["
    + "{\"query_string\": {\"query\": \"advocate\", \"fields\": [\"primaryTerm\"]}},"
    + "{\"query_string\": {\"query\": \"advocate\", \"fields\": [\"primaryTerm.raw^2\"]}}"
    + "]}}";

Elasticsearch response:

{
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": 12,
    "max_score": 6.094379,
    "hits": [
      {"_index": "agencyvars", "_type": "relations", "_id": "qCeqHHgBcFeeTWhjAoua", "_score": 6.094379,
       "_source": {"entityType": "Agency", "primaryTerm": "ACT ADVOCATES", "variants": []}},
      {"_index": "agencyvars", "_type": "relations", "_id": "OyeqHHgBcFeeTWhjJYxu", "_score": 5.6339674,
       "_source": {"primaryTerm": "TALWAR ADVOCATES", "variants": ["TALWAR & ADVOCATES"]}},
      {"_index": "agencyvars", "_type": "relations", "_id": "BSeqHHgBcFeeTWhjGIyJ", "_score": 5.1183944,
       "_source": {"primaryTerm": "ZEUSIP ADVOCATES LLP",
                   "variants": ["ZEUS IP,ADVOCATES", "ZEUSIP ADVOCATES", "ZEUS IP ADVOCATES", "ZEUS IP",
                                "ZEUSIPADVOCATES LLP", "ZIUSIP ADVOCATES"]}},
      {"_index": "agencyvars", "_type": "relations", "_id": "3CeqHHgBcFeeTWhjTYyZ", "_score": 4.6892724,
       "_source": {"primaryTerm": "MURTI & MURTI ADVOCATES"}},
      {"_index": "agencyvars", "_type": "relations", "_id": "0SeqHHgBcFeeTWhjjI18", "_score": 4.4118576,
       "_source": {"primaryTerm": "ANAND AND ANAND ADVOCATES",
                   "variants": ["AANAND & ANAND ADVOCATES", "NAND AND ANAND ADVOCATES", "ANAND & ANAND,",
                                "ANAND & ANAND ADVOCATES", "ANAND & ANAND", "ANAND&ANAND",
                                "ANAND AND ANAND ADVOCAETES", "ANAND AND ANAND ADVOCATE",
                                "ANAND AND ANANDADVOCATES", "AND ANAND ADVOCATES", "ANAND & ANAND ADVOCATES.",
                                "ANAND AND ANAN", "ANAND AND ANAND", "ANAND AND ANAND ADVOCATES,",
                                "ANAND AND ANAND ADVOCATES.", "ANAND AND ANAND,", "ANAND AND"]}},
      {"_index": "agencyvars", "_type": "relations", "_id": "2CeqHHgBcFeeTWhjTIyn", "_score": 3.2560868,
       "_source": {"primaryTerm": "STAR IP Advocates and IPR Attorneys",
                   "variants": ["STARIP,ADVOCATES & IP ATTORNEYS"]}},
      {"_index": "agencyvars", "_type": "relations", "_id": "3yeqHHgBcFeeTWhjD4uW", "_score": 2.521993,
       "_source": {"primaryTerm": "ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY", "variants": []}}
    ]
  }
}

In short, the scores are:

"_score": 5.6339674, "_source": {"primaryTerm": "TALWAR ADVOCATES"}

"_score": 5.1183944, "_source": {"primaryTerm": "INTELLEXIP ADVOCATES"}

"_score": 2.521993, "_source": {"primaryTerm": "ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY"}

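PS: A short explanation along with the answer would be much appreciated, as I am new to Elasticsearch.

Solution

To apply prefix-first logic, you can use a prefix query together with the boost parameter. Try the query below: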

{
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "query": "advocate",
            "fields": ["primaryTerm"]
          }
        },
        {
          "prefix": {
            "primaryTerm.raw": {
              "value": "advocate",
              "boost": 2
            }
          }
        }
      ]
    }
  }
}
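
If the query is assembled in application code, the same body can be built from the search term and a tunable boost. The following is only a minimal sketch in Python: the function name `build_prefix_first_query` is hypothetical, and lowercasing the prefix value is an assumption based on the lowercasing normalizer on `primaryTerm.raw` in the question's mapping (asciifolding is not reproduced here).

```python
import json


def build_prefix_first_query(term, boost=2.0):
    """Build a bool query: match on primaryTerm plus a boosted prefix clause on primaryTerm.raw.

    Hypothetical helper for illustration; it only builds the request body
    and does not talk to Elasticsearch.
    """
    # Assumption: primaryTerm.raw is stored lowercased by its normalizer,
    # so the prefix value is lowercased to match the stored form.
    prefix_value = term.lower()
    return {
        "query": {
            "bool": {
                "should": [
                    # Matches any document whose n-grams contain the term.
                    {"query_string": {"query": term, "fields": ["primaryTerm"]}},
                    # Adds extra score only when the whole value starts with the term.
                    {"prefix": {"primaryTerm.raw": {"value": prefix_value, "boost": boost}}},
                ]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_prefix_first_query("advocate", boost=2.0), indent=2))
```

Keeping the boost as a parameter makes it easy to raise it if prefix matches still rank below "contains" matches on a given index.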

The search result will be:

"hits": [
  {
    "_index": "67049029",
    "_type": "_doc",
    "_id": "1",
    "_score": 2.0386105,
    "_source": {
      "primaryTerm": "ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY"
    }
  },
  {
    "_index": "67049029",
    "_type": "_doc",
    "_id": "3",
    "_score": 0.08597656,
    "_source": {
      "primaryTerm": "TALWAR ADVOCATES"
    }
  },
  {
    "_index": "67049029",
    "_type": "_doc",
    "_id": "2",
    "_score": 0.07815027,
    "_source": {
      "primaryTerm": "INTELLEXIP ADVOCATES"
    }
  }
]
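
As a rough explanation of why the prefix clause is needed: with the n-gram tokenizer from the question's mapping (`min_gram` 1, `max_gram` 40, all character classes kept), every substring of `primaryTerm` becomes a token, so "advocate" matches a name wherever it occurs and the match alone says nothing about position. The Python sketch below imitates that tokenization locally. It is a simplified stand-in for the real analyzer, and `char_ngrams` is a hypothetical helper, not an Elasticsearch API.

```python
def char_ngrams(text, min_gram=1, max_gram=40):
    """Return the set of lowercased character n-grams of `text`.

    Simplified imitation of an n-gram tokenizer followed by a lowercase
    filter, assuming every character is a valid token character.
    """
    text = text.lower()
    grams = set()
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(text):
                break
            grams.add(text[start:start + size])
    return grams


names = [
    "ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY",
    "TALWAR ADVOCATES",
    "INTELLEXIP ADVOCATES",
]
term = "advocate"
for name in names:
    ngram_match = term in char_ngrams(name)          # what the n-gram field can tell us
    prefix_match = name.lower().startswith(term)     # what the prefix clause adds
    print(f"{name!r}: ngram match={ngram_match}, prefix match={prefix_match}")
```

All three names match on n-grams, but only the first also matches as a prefix, which is the signal the boosted `prefix` clause on `primaryTerm.raw` contributes.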

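Update 1:

A boost of 2 does not work in your case because "TALWAR ADVOCATES" scores 5.6339674, while "ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY" scores 2.521993.

When you multiply 2.521993 by 2, you get 5.043986. Since 5.043986 < 5.6339674, "TALWAR ADVOCATES" still ranks above the prefix match.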
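
The arithmetic above can be checked, and a larger boost estimated, with a few lines of Python. This is only a sketch that follows the simplified model used in this update, where the boost multiplies the document's score. Actual Elasticsearch scoring combines the clause scores of a `bool` query in a more involved way, so treat the result as a rough starting point rather than an exact threshold. The function names are hypothetical.

```python
# Scores taken from the question's response.
TALWAR_SCORE = 5.6339674      # "TALWAR ADVOCATES" (contains the term)
PREFIX_DOC_SCORE = 2.521993   # "ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY" (starts with it)


def boosted_score(score, boost):
    """Score after a multiplicative boost (simplified model, an assumption)."""
    return score * boost


def min_boost_to_reach(score, competitor):
    """Boost at which `score` equals `competitor`; any larger boost outranks it."""
    return competitor / score


with_boost_2 = boosted_score(PREFIX_DOC_SCORE, 2)
print("boost 2 gives:", with_boost_2)
print("still below TALWAR ADVOCATES:", with_boost_2 < TALWAR_SCORE)
print("boost must exceed:", min_boost_to_reach(PREFIX_DOC_SCORE, TALWAR_SCORE))
```

The same check can be repeated against the highest score in the response (`max_score`) to estimate a boost that would lift the prefix match above every other hit.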
