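Overview

- Goal: use sensible mappings and query patterns to avoid slow queries caused by deep pagination and expensive aggregations; keep performance stable through the refresh interval, field types, sorting, and `search_after`.
- Applies to: high-concurrency read/write workloads such as log and metric search, order and transaction search, and user-profile aggregations.

Core concepts and practice

- Index and mapping design:

```
PUT /orders-2025-11
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "refresh_interval": "5s",
    "index.routing.allocation.total_shards_per_node": 2,
    "index.codec": "best_compression"
  },
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "order_id": {"type": "keyword", "doc_values": true},
      "user_id": {"type": "keyword", "doc_values": true},
      "status": {"type": "keyword"},
      "amount": {"type": "scaled_float", "scaling_factor": 100},
      "currency": {"type": "keyword"},
      "items": {
        "type": "nested",
        "properties": {
          "sku": {"type": "keyword"},
          "qty": {"type": "integer"},
          "price": {"type": "scaled_float", "scaling_factor": 100}
        }
      },
      "created_at": {"type": "date"},
      "text": {"type": "text", "analyzer": "standard"}
    }
  }
}
```

- Writes and refresh: write in batches with `_bulk`, and set `refresh_interval` so that frequent refreshes do not hurt indexing throughput; when a write must be visible to search as soon as the request returns, use `?refresh=wait_for`.
- Query optimization:
  - Prefer `filter` clauses (no scoring) on exact-value fields (`keyword`, numeric, date);
  - For deep pagination, use `search_after` instead of `from/size`;
  - Aggregate on `keyword` fields with `doc_values` enabled; avoid aggregating on `text` fields.

Examples

- `search_after` pagination (first page):

```
POST /orders-2025-11/_search
{
  "size": 100,
  "sort": [
    {"created_at": "desc"},
    {"order_id": "desc"}
  ],
  "query": {
    "bool": {
      "filter": [
        {"term": {"status": "PAID"}},
        {"range": {"created_at": {"gte": "now-7d"}}}
      ]
    }
  }
}
```

- Next page (pass the `sort` values of the last hit on the previous page as `search_after`; the values below are illustrative):

```
POST /orders-2025-11/_search
{
  "size": 100,
  "sort": [
    {"created_at": "desc"},
    {"order_id": "desc"}
  ],
  "search_after": ["2025-11-26T10:00:00.000Z", "ORD-0000123"],
  "query": {
    "bool": {
      "filter": [
        {"term": {"status": "PAID"}},
        {"range": {"created_at": {"gte": "now-7d"}}}
      ]
    }
  }
}
```

- Aggregation example (document count and total amount per status):

```
POST /orders-2025-11/_search
{
  "size": 0,
  "aggs": {
    "by_status": {
      "terms": {"field": "status"},
      "aggs": {
        "total_amount": {"sum": {"field": "amount"}}
      }
    }
  },
  "query": {"match_all": {}}
}
```

Validation and monitoring

- Profile analysis:

```
POST /orders-2025-11/_search?profile=true
{
  "query": {"match_all": {}},
  "size": 10,
  "sort": [{"created_at": "desc"}]
}
```

- Cluster/index status and settings:

```
GET /_cat/indices?v
GET /orders-2025-11/_settings
GET /_nodes/stats/indices/search
```

- Watching for slow queries: set `index.search.slowlog.threshold.query.warn` and `index.search.slowlog.threshold.fetch.warn` (for example to `2s`), then check the slow log under `/var/log/elasticsearch/`.

Common pitfalls

- Deep pagination with `from/size` drives up heap and CPU usage, because every shard has to collect and sort `from + size` hits; use `search_after` instead, or `scroll` for offline batch processing.
- Running terms aggregations or sorting on `text` fields: `text` fields have no `doc_values`, so these requests are rejected by default and, if `fielddata` is enabled, become slow and heap-hungry; use a `keyword` field instead.
- Too many shards, or an ill-suited `refresh_interval`, cause jitter in both queries and writes; size them according to data volume and node resources.

Conclusion

- Strict mappings, filter-first queries, `search_after` pagination, and slow-log monitoring together keep Elasticsearch query performance and cost under control and help maintain delivery quality.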
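
The slow-log thresholds mentioned under validation and monitoring can be applied to an existing index through the update-settings API. The request below is a minimal sketch, assuming the `orders-2025-11` index from the examples and the `2s` value the text gives as an example; the right thresholds depend on your own latency targets.

```
PUT /orders-2025-11/_settings
{
  "index.search.slowlog.threshold.query.warn": "2s",
  "index.search.slowlog.threshold.fetch.warn": "2s"
}
```

Both settings are dynamic, so they should take effect without reopening the index, and `GET /orders-2025-11/_settings` can be used to confirm that they were stored.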
