This post just records a few simple Painless examples so that I don't forget them later; it does not go into Painless syntax in any depth.
PUT /index_person
{"mappings": {"properties": {"name": {"type": "keyword"},"age": {"type": "integer"},"province": {"type": "keyword"}}}
}
PUT /index_person/_bulk
{"index":{"_id":1}}
{"name":"张三","age":20,"province":"湖北"}
{"index":{"_id":2}}
{"name":"李四","age":25,"province":"北京"}
{"index":{"_id":3}}
{"name":"王五","age":30,"province":"湖南"}
POST index_person/_update/1
{"script": {"lang": "painless","source": """ctx['_source']['age'] += params['incrAge']""","params": {"incrAge": 2}}
}
POST index_person/_update_by_query
{"query": {"term": {"province": {"value": "北京"}}},"script": {"lang": "painless","source": """ctx['_source']['age'] -= params['decrAge']""","params": {"decrAge": 1}}
}
POST index_person/_update/1
{"script": {"lang": "painless","source": """
  // Default value: apply the update and re-index the document
  ctx.op = 'index';
  if (ctx._source.age < 20) {
    // Do nothing, leave the document as it is
    ctx.op = 'none';
  } else {
    // Delete this document
    ctx.op = 'delete';
  }
"""}
}
PUT _scripts/add_city
{"script":{"lang": "painless","source": "ctx._source.city = params.city"}
}
Here add_city is the id of the stored script.
POST index_person/_update_by_query
{"query": {"term": {"province": {"value": "湖南"}}},"script": {"id": "add_city","params": {"city": "长沙"}}
}
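To check what is actually stored under that id, the stored script can be read back. This is only a small sketch that reuses the add_city id from above; the commented-out DELETE is a cleanup step, not part of the original example.

```console
# Returns the stored script source and its language
GET _scripts/add_city

# Optional cleanup once the script is no longer needed
# DELETE _scripts/add_city
```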
PUT _ingest/pipeline/pipeline_index_person_small
{"description": "Route documents with age < 10 to the index_person_small index","processors": [{"script": {"source": """if(ctx.age < 10){ctx._index = 'index_person_small';}"""}}]
}
PUT index_person/_doc/4?pipeline=pipeline_index_person_small
{"name":"赵六","age": 8,"province":"四川"
}
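Before indexing real documents, the pipeline can be tried out with the simulate API. A minimal sketch, assuming the pipeline above has already been created; the second document (name "test", age 30) is made up for comparison and is not part of the original data.

```console
POST _ingest/pipeline/pipeline_index_person_small/_simulate
{
  "docs": [
    {"_index": "index_person", "_source": {"name": "赵六", "age": 8, "province": "四川"}},
    {"_index": "index_person", "_source": {"name": "test", "age": 30, "province": "四川"}}
  ]
}
```

In the response, the `_index` of each simulated document shows where it would end up: the first one should be rerouted to index_person_small, the second should stay in index_person. Nothing is written to either index.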

If the user is from 湖南 (Hunan), use age as the score; otherwise the function contributes 0.
GET index_person/_search
{"query": {"function_score": {"query": {"match_all": {}},"functions": [{"script_score": {"script": """if(doc.province.size() == 0){return 0;}if(doc.province.value == '湖南'){return doc.age.value;}return 0;"""}}],"boost_mode": "sum","score_mode": "sum"}}
}
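The same idea can also be written with the standalone script_score query instead of function_score. This is a swapped-in alternative, not the original approach, and the result is not identical: here the script's return value replaces the query score instead of being added to it. The script must not return a negative score.

```console
GET index_person/_search
{
  "query": {
    "script_score": {
      "query": {"match_all": {}},
      "script": {
        "source": """
          // Guard against documents that have no province or no age
          if (doc['province'].size() == 0 || doc['age'].size() == 0) {
            return 0;
          }
          if (doc['province'].value == '湖南') {
            return doc['age'].value;
          }
          return 0;
        """
      }
    }
  }
}
```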

GET index_person/_search
{"query": {"match_all": {}},"fields": ["double_age"], "script_fields": {"double_age": {"script": {"lang": "painless","source": "doc.age.value * 2"}}}
}
For documents with age < 25, return a double_age field; other documents are left alone.
GET index_person/_search
{"query": {"match_all": {}},"fields": ["double_age"],"runtime_mappings": {"double_age":{ "type": "keyword","script": """if(doc.age.size() == 0){return;}if(doc.age.value < 25){emit(doc.age.value * 2 + '');}"""}}
}
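In a runtime field script, the value has to be returned by calling emit, but never call emit(null); if a document should have no value, just don't call emit.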
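As a variation on the keyword example above, here is a sketch of the same runtime field declared as a long, so emit receives a number instead of a string. The field name double_age_long is made up for this illustration.

```console
GET index_person/_search
{
  "query": {"match_all": {}},
  "fields": ["double_age_long"],
  "runtime_mappings": {
    "double_age_long": {
      "type": "long",
      "script": """
        // Not calling emit means the document has no value for this field
        if (doc['age'].size() != 0 && doc['age'].value < 25) {
          emit(doc['age'].value * 2);
        }
      """
    }
  }
}
```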
POST _reindex
{"source": {"index": "index_person"},"dest": {"index": "index_person_new"},"script": {"lang": "painless","source": """if(ctx._source.age < 25){ctx._source.tag = '年轻人';}else{ctx._source.tag = '中年人';}"""}
}

GET index_person/_search
{"query": {"script": {"script": {"lang": "painless","source": """if(doc.age.size() == 0){return false;}return doc.age.value < 25;"""}}}
}
GET index_person/_search
{"size": 0, "aggs": {"agg_province": {"terms": {"script": {"lang": "painless","source": """return doc.province"""}, "size": 10}},"agg_age":{"avg": {"script": "params._source.age"}}}
}

Debug.explain can be used for some simple debugging.
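A minimal sketch of how this looks in practice, assuming the document with id 2 still exists in index_person. Debug.explain throws on purpose, so the request fails and the document is not modified; the error response describes the object that was passed in (its runtime class and a string representation).

```console
POST index_person/_update/2
{
  "script": {
    "lang": "painless",
    "source": "Debug.explain(ctx._source)"
  }
}
```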
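doc[…]: using the doc keyword causes the terms of that field to be loaded into memory (cached), which gives faster execution but uses more memory. In addition, the doc[…] notation only allows simple-valued fields (you cannot return a JSON object from it) and only makes sense for non-analyzed or single-term-based fields. Even so, using doc is still the recommended way to access document values whenever possible.
params._source[…]: the _source has to be loaded and parsed every time it is used, so accessing _source is comparatively slower.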
![doc[..] and params._source[..] in scripts](/uploadfile/202405/8967adcefdccb5e.png)
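To see the two access styles side by side, both can be used as script fields in one search. This is only a sketch; the field names age_from_doc_values and age_from_source are made up for the illustration.

```console
GET index_person/_search
{
  "query": {"match_all": {}},
  "script_fields": {
    "age_from_doc_values": {
      "script": {"lang": "painless", "source": "doc['age'].value"}
    },
    "age_from_source": {
      "script": {"lang": "painless", "source": "params['_source']['age']"}
    }
  }
}
```

Both fields return the same number for these documents. The difference is in how the value is read: the first comes from doc values, the second requires loading and parsing the whole _source of each hit.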

For more details, see the documentation: https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-contexts.html