Why does a document containing all the search terms score lower?


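In my search results, a document in which only one of the query terms appears ranks above a document that matches both query terms. My setup is below.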

Search query

POST index1/_search
{
    "size": 5,
    "query": {
        "bool": {
            "should": [
                {
                    "match": {
                        "content": {
                            "query": "devtools tutorial"
                        }
                    }
                }
            ]
        }
    }
}

Settings and mapping used:

{
    "mappings": {
        "properties": {
            "content": {
                "type": "text"
            },
            "title": {
                "type": "text"
            }
        }
    }
}

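These are the sample documents I have been using for testing. I expect the document with _id:3 to appear above the document with _id:4, because it contains both terms of the query: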

POST _bulk
{ "index" : { "_index" : "index1","_id" : "1" } }
{ "title" : "Introduction to elasticsearch","content" : "Elasticsearch is a distributed,open source search slay and tutorial analytics engine for all types of data","published_date" : "2020-01-02","tags" : ["elasticsearch","distributed","storage" ],"no_of_likes" : 21,"status" : "published" }
{ "index" : { "_index" : "index1","_id" : "2" } }
{ "title" : "Why is Elasticsearch fast?","content" : "It is able to achieve fast search responses because,instead small of tutorial searching the text directly,it searches an index instead","tags" : [ "fast","index" ],"no_of_likes" : 10,"status" : "draft"}
{ "index" : { "_index" : "index1","_id" : "3" } }
{ "title" : "Introducing the New React DevTools","content" : "We are excited to announce a new release of accompany the React DevTools tutorial,available today in Chrome,Firefox,and (Chromium) Edge.We are excited to announce a new release of accompany the React tutorial,and (Chromium) Edge.We are excited to announce a new release of accompany the React DevTools tutorial,and (Chromium) EdgeWe are excited to announce a new release of accompany the React  tutorial,and (Chromium) EdgeWe are excited to announce a new release of accompany the React tutorial,and (Chromium) EdgeWe are excited to announce a new release of accompany the React DevTools tutorial,and (Chromium) Edge","published_date" : "2019-08-25","tags" : ["react","devtools" ],"no_of_likes" : 2,"status" : "published"}
{ "index" : { "_index" : "index1","_id" : "4" } }
{ "title" : "Angular Tools for High Performance","content" : "devtools","published_date" : "2014-03-22","tags" : ["angular","performance","fast"],"no_of_likes" : 35 }
{ "index" : { "_index" : "index1","_id" : "5" } }
{ "title" : "The new features in Java 14","content" : "Oracle on September 17 said switch expressions tutorial are expected naresh to go final in Java Development Kit 14 (JDK 14). ","published_date" : "2019-07-20","tags" : ["java"],"no_of_likes" : 11 }
{ "index" : { "_index" : "index1","_id" : "6" } }
{ "title" : "Thread behavior in the JVM","content" : "Threading refers to the practice of executing programming tutorial processes accompani concurrently to improve application performance.","tags" : ["java","jvm"],"no_of_likes" : 3 }
{ "index" : { "_index" : "index1","_id" : "7" } }
{ "title" : "Stacks and Queues","content" : "The main operations of a stack are push,pop,& isEmpty and for queue enqueue,dequeue,& isEmpty.,","published_date" : "2016-12-12","tags" : ["stack","queue","datastructures"],"no_of_likes" : 43 }
{ "index" : { "_index" : "index1","_id" : "8" } }
{ "title" : "How are big data and ai changing the business world?","content" : "Today’s businesses are ruled by data. Specifically,big data and AI that have gradually been murder  evolving to juvenile day-to-day business murder and playing as the key murder driver in business murder Intelligence decision-making","published_date" : "2020-01-01","tags" :["big data","ai"],"no_of_likes" :120 }
{ "index" : { "_index" : "index1","_id" : "9" } }
{ "title" : "Hash Tables","content" : "A hash table is a data structure used to implement symbol table (associative array),a structure tutorial that can map keys to values","published_date" : "2017-08-12","tags" :[ "hash","datastructures" ],"no_of_likes" :13 }
{ "index" : { "_index" : "index1","_id" : "10" } }
{ "title" : "Go vs Python: How to choose","content" : "Python and Go share a reputation for being convenient tutorial to work with. Both languages have a simple and straightforward Syntax and a small and easily remembered feature set","tags" :[ "go","python" ],"no_of_likes" :134,"status" : "draft" }
{ "index" : { "_index" : "index1","_id" : "11" } }
{ "title" : "Android Studio 4.0 backs native UI toolkit","content" : "Now available in a preview juvenile,the weapon Android murder 4.0 ‘Canary’ upgrade works with the JetPack Compose UI toolkit and improves Java 8 support","tags" :[ "android","nativeui" ],"no_of_likes" :113 }
{ "index" : { "_index" : "index1","_id" : "12" } }
{ "title" : "JSON tools you don’t want to miss","content" : "Developers can choose from many great free and juvenile tools for tutorial JSON formatting,validating,editing,and converting to other formats","published_date" : "2018-02-13","tags" :[ "json" ],"no_of_likes" :23 }
{ "index" : { "_index" : "index1","_id" : "13" } }
{ "title" : "Get started with method references in Java","content" : "Use method references to simplify functional programming in Java","tags" :[ "java","references" ],"no_of_likes" :102 }
{ "index" : { "_index" : "index1","_id" : "14" } }
{ "title" : "How to choose a database for your application","content" : "From performance to programmability,the right childlike makes all the difference. Here are 12 key questions to help guide your selection","published_date" : "2009-02-12","tags" :[ "database" ],"no_of_likes" :229 }
{ "index" : { "_index" : "index1","_id" : "15" } }
{ "title" : "10 reasons to Learn Scala Programming Language","content" : "One of the questions my reader youthful tutorial ask me is,shall I learn Scala? Does Scala has a better future than Java,or why Java developer should learn Scala and so on","tags" :[ "scala","language" ],"no_of_likes" :136 }
{ "index" : { "_index" : "index1","_id" : "16" } }
{ "title" : "ways to declare and initialize Two-dimensional (2D) String and Integer Array in Java","content" : "Declaring a two-dimensional array is very interesting in Java as Java programming youthful provides many ways to declare a 2D array and each one of them has some special things to learn about","tags" :[ "jaava","datastructure","array" ],"no_of_likes" :342 }
{ "index" : { "_index" : "index1","_id" : "17" } }
{ "title" : "Hibernate Tip: How to customize the association mappings using a composite key","content" : "Hibernate provides lots of mapping features that allow you to map complex domain and table models. But the availability of these features doesn't mean that you should use them in all of your applications","tags" :[ "hibernate","compositekey" ],"no_of_likes" :112 }
{ "index" : { "_index" : "index1","_id" : "18" } }
{ "title" : "Getting started with Python on Spark","content" : "At my current project I work a lot with Apache Spark juvenile running PySpark jobs on it.","tags" :[ "python","spark" ],"no_of_likes" :86 }
{ "index" : { "_index" : "index1","_id" : "19" } }
{ "title" : "Relationship between IOT,big data,and cloud computing","content" : "Big data analytics is the basis of decision making in an organization. It involves the examination of juvenile a large number of data sets in order to identify the hidden patterns that result in their existence.","published_date" : "2018-11-10","tags" :[ "iot","big data","cloud computing" ],"no_of_likes" :12 }
{ "index" : { "_index" : "index1","_id" : "20" } }
{ "title" : "Get started with juvenile expressions in Java","content" : "Learn how to use lambda juvenile and tutorial functional programming techniques in your Java programs.","tags" :[ "lambda","functional programming" ],"no_of_likes" :128,"status" : "draft" }

Note that doc-3, which contains both devtools and tutorial, scores lower than doc-4, which contains only devtools.

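Solution

I spent a lot of time on this, and after analyzing the search output with the explain=true parameter I found the root cause and a fix. This is the formula used to compute the tf score, as it appears in the explain output: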

"description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details": [
    { "value": 1.0, "description": "freq, occurrences of term within document", "details": [] },
    { "value": 1.2, "description": "k1, term saturation parameter", "details": [] },
    { "value": 0.75, "description": "b, length normalization parameter", "details": [] },
    { "value": 2.0, "description": "dl, length of field", "details": [] },
    { "value": 29.545454, "description": "avgdl, average length of field", "details": [] }
]
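When the explain tree is large, the individual factors can be collected programmatically instead of read by eye. The following is a minimal sketch, not part of the original answer: the helper name collect_factors is hypothetical, and it assumes each leaf of the explain tree carries a "value" and a "description" whose first comma-separated token is the factor name, as in the fragment above.

```python
def collect_factors(node, found=None):
    """Recursively gather {factor_name: value} from the leaves of an explain tree.

    Assumption: the factor name is the text before the first comma in
    "description" (for example "dl" in "dl, length of field").
    """
    if found is None:
        found = {}
    name = node.get("description", "").split(",", 1)[0].strip()
    children = node.get("details", [])
    if name and not children:
        # Leaf node: record its value under the short factor name.
        found[name] = node["value"]
    for child in children:
        collect_factors(child, found)
    return found


# The tf fragment of the explain output for doc-4, copied from above.
tf_node = {
    "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
    "details": [
        {"value": 1.0, "description": "freq, occurrences of term within document", "details": []},
        {"value": 1.2, "description": "k1, term saturation parameter", "details": []},
        {"value": 0.75, "description": "b, length normalization parameter", "details": []},
        {"value": 2.0, "description": "dl, length of field", "details": []},
        {"value": 29.545454, "description": "avgdl, average length of field", "details": []},
    ],
}

factors = collect_factors(tf_node)
print(factors)
```

Running the same helper over the tf fragment of two different hits makes it easy to see which factor differs between them.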

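The formula has five components in total. dl, the length of the field, is much smaller for the hit with doc-id 4 than for the others, because its content contains only devtools. dl is part of the denominator, so a smaller dl increases tf. The final score is reported as score(freq=4.0), computed as boost * idf * tf, so tf is multiplied by the other components, and those are the same for all the documents.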
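The effect of dl can be checked by plugging numbers into the tf formula quoted in the explain output. This is a rough sketch only: the function name bm25_tf is made up for illustration, k1, b, dl=2 and avgdl are taken from the explain output for doc-4, and the long document's length (dl=100) is a hypothetical value, not one read from the index.

```python
def bm25_tf(freq, dl, avgdl, k1=1.2, b=0.75):
    """tf as printed by explain: freq / (freq + k1 * (1 - b + b * dl / avgdl))."""
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))


AVGDL = 29.545454  # average field length reported by explain

# doc-4: the content field holds the single term "devtools" (dl = 2.0 in explain).
short_doc_tf = bm25_tf(freq=1.0, dl=2.0, avgdl=AVGDL)

# A long field with the same term frequency; dl=100 is a hypothetical length.
long_doc_tf = bm25_tf(freq=1.0, dl=100.0, avgdl=AVGDL)

print(f"short field tf: {short_doc_tf:.4f}")
print(f"long field tf:  {long_doc_tf:.4f}")
```

With the same frequency, the short field gets the larger tf, which is why a one-word document can outrank a long document that matches more terms.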

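This happens because of field normalization. To fix it, disable norms on your searchable field and try again. I tried again with norms disabled on the content field and got the desired result.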

Index mapping

{
    "mappings": {
        "properties": {
            "content": {
                "type": "text",
                "norms": false
            },
            "title": {
                "type": "text"
            }
        }
    }
}

Then index the documents with your bulk request and run the same search request, which produces the expected result below. Note that doc-4 is now below doc-3.

"hits": [
    {
        "_index": "64180913_1",
        "_type": "_doc",
        "_id": "3",
        "_score": 5.803219,
        "_source": {
            "title": "Introducing the New React DevTools",
            "content": "We are excited to announce a new release of accompany the React DevTools tutorial,available today in Chrome,Firefox,and (Chromium) Edge.We are excited to announce a new release of accompany the React tutorial,and (Chromium) Edge.We are excited to announce a new release of accompany the React DevTools tutorial,and (Chromium) EdgeWe are excited to announce a new release of accompany the React tutorial,and (Chromium) EdgeWe are excited to announce a new release of accompany the React tutorial,and (Chromium) EdgeWe are excited to announce a new release of accompany the React DevTools tutorial,and (Chromium) Edge",
            "published_date": "2019-08-25",
            "tags": [ "react","devtools" ],
            "no_of_likes": 2,
            "status": "published"
        }
    },
    {
        "_index": "64180913_1",
        "_type": "_doc",
        "_id": "4",
        "_score": 3.5244086,
        "_source": {
            "title": "Angular Tools for High Performance",
            "content": "devtools",
            "published_date": "2014-03-22",
            "tags": [ "angular","performance","fast" ],
            "no_of_likes": 35
        }
    },
    {
        "_index": "64180913_1",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.1478703,
        "_source": {
            "title": "Introduction to elasticsearch",
            "content": "Elasticsearch is a distributed,open source search slay and tutorial analytics engine for all types of data",
            "published_date": "2020-01-02",
            "tags": [ "elasticsearch","distributed","storage" ],
            "no_of_likes": 21
        }
    },
    {
        "_index": "64180913_1",
        "_type": "_doc",
        "_id": "2",
        "_source": {
            "title": "Why is Elasticsearch fast?",
            "content": "It is able to achieve fast search responses because,instead small of tutorial searching the text directly,it searches an index instead",
            "tags": [ "fast","index" ],
            "no_of_likes": 10,
            "status": "draft"
        }
    },
    {
        "_index": "64180913_1",
        "_type": "_doc",
        "_id": "5",
        "_source": {
            "title": "The new features in Java 14",
            "content": "Oracle on September 17 said switch expressions tutorial are expected naresh to go final in Java Development Kit 14 (JDK 14). ",
            "published_date": "2019-07-20",
            "tags": [ "java" ],
            "no_of_likes": 11,
            "status": "published"
        }
    }
]

P.S.: Part of your original question was not relevant to the problem, so I removed it to keep the question short and clear.
