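Overview

Goal: use sound mappings and query patterns to avoid the slow queries caused by deep pagination and expensive aggregations, and keep performance stable through the refresh interval, field types, sorting, and `search_after`.

Applies to: high-concurrency read/write workloads such as log and metric search, order and transaction search, and user-profile aggregation.

Core Practices

Index and mapping design:

```
PUT /orders-2025-11
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "refresh_interval": "5s",
    "index.routing.allocation.total_shards_per_node": 2,
    "index.codec": "best_compression"
  },
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "order_id": {"type": "keyword", "doc_values": true},
      "user_id": {"type": "keyword", "doc_values": true},
      "status": {"type": "keyword"},
      "amount": {"type": "scaled_float", "scaling_factor": 100},
      "currency": {"type": "keyword"},
      "items": {
        "type": "nested",
        "properties": {
          "sku": {"type": "keyword"},
          "qty": {"type": "integer"},
          "price": {"type": "scaled_float", "scaling_factor": 100}
        }
      },
      "created_at": {"type": "date"},
      "text": {"type": "text", "analyzer": "standard"}
    }
  }
}
```

With 3 primary shards and 1 replica the index has 6 shard copies, so `total_shards_per_node: 2` needs at least three data nodes; on a smaller cluster some shards stay unassigned.

Writes and refresh: write in batches with `_bulk`, and set `refresh_interval` so that frequent refreshes do not eat into write throughput. When a write must be visible to search as soon as the request returns, add `?refresh=wait_for`.

Query optimization: prefer `filter` clauses (no scoring) on exact-value fields (`keyword`, numeric, date). For deep pagination use `search_after` instead of `from/size`. Aggregate on `keyword` fields with `doc_values` enabled, and do not aggregate on `text` fields.

Example `search_after` pagination, first page:

```
POST /orders-2025-11/_search
{
  "size": 100,
  "sort": [
    {"created_at": "desc"},
    {"order_id": "desc"}
  ],
  "query": {
    "bool": {
      "filter": [
        {"term": {"status": "PAID"}},
        {"range": {"created_at": {"gte": "now-7d"}}}
      ]
    }
  }
}
```

Next page (pass the `sort` values of the last hit on the previous page as `search_after`):

```
POST /orders-2025-11/_search
{
  "size": 100,
  "sort": [
    {"created_at": "desc"},
    {"order_id": "desc"}
  ],
  "search_after": ["2025-11-26T10:00:00.000Z", "ORD-0000123"],
  "query": {
    "bool": {
      "filter": [
        {"term": {"status": "PAID"}},
        {"range": {"created_at": {"gte": "now-7d"}}}
      ]
    }
  }
}
```

Aggregation example (document count and total amount per status):

```
POST /orders-2025-11/_search
{
  "size": 0,
  "aggs": {
    "by_status": {
      "terms": {"field": "status"},
      "aggs": {
        "total_amount": {"sum": {"field": "amount"}}
      }
    }
  },
  "query": {"match_all": {}}
}
```

Verification and Monitoring

Profile analysis:

```
POST /orders-2025-11/_search?profile=true
{
  "query": {"match_all": {}},
  "size": 10,
  "sort": [{"created_at": "desc"}]
}
```

Cluster and index status and settings:

```
GET /_cat/indices?v
GET /orders-2025-11/_settings
GET /_nodes/stats/indices/search
```

Watching slow queries: set `index.search.slowlog.threshold.query.warn` and `...fetch.warn` (for example to `2s`), then read the slow log under `/var/log/elasticsearch/` (the default log directory for package installs).

Common Pitfalls

Deep pagination with `from/size` drives up heap and CPU usage, because every shard has to collect and sort `from + size` hits, and the sum is capped by `index.max_result_window` (10,000 by default). Use `search_after` instead, or `scroll` for offline batch processing.

Running a terms aggregation or a sort on a `text` field. `text` fields have no `doc_values`, so the request is rejected unless `fielddata` is enabled, and enabling it loads the field's terms into heap. Use a `keyword` field instead.

Too many shards or an ill-suited `refresh_interval` cause jitter in both queries and writes. Size both according to data volume and node resources.

Conclusion

Strict mappings, filter-first queries, `search_after` pagination, and slow-log monitoring together keep Elasticsearch query performance and cost under control and help maintain delivery quality.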
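The two `search_after` requests shown earlier differ only in the cursor, so the paging loop is easy to drive from client code. The following is a minimal sketch in Python, not a definitive implementation: `iter_hits` and the `search` callable are hypothetical names introduced here, and `search` stands in for whatever client call you use to POST a request body to `/orders-2025-11/_search` and return the parsed JSON response.

```python
from copy import deepcopy
from typing import Any, Callable, Dict, Iterator

# Hypothetical type: takes a request body, returns the parsed search response.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_hits(search: SearchFn, body: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every hit for `body`, following `search_after` page by page.

    `body` must contain `size` and a `sort` whose last key is unique
    (here `order_id`), so that no hit is skipped or repeated.
    """
    request = deepcopy(body)  # never mutate the caller's request
    page_size = request["size"]
    while True:
        hits = search(request)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        if len(hits) < page_size:
            return  # a short page is the last page
        # The cursor is the `sort` array of the last hit on this page.
        request["search_after"] = hits[-1]["sort"]
```

Injecting `search` keeps the loop independent of any particular client library and makes it testable without a cluster. Because the cursor is taken verbatim from the response, the code never has to format date values itself.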
