Elasticsearch index name (Elasticsearch: aggregation queries)
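Aggregations (aggs)
Aggregations are generally used for statistical analysis of data, similar to GROUP BY in MySQL.
Aggregations rest on two basic concepts: buckets and metrics.
A bucket groups documents by some criterion; each resulting group is one bucket. Grouping phones by brand, for example, yields a Xiaomi bucket and a Huawei bucket.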
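Bucket grouping types
Date Histogram Aggregation: groups by date interval; for example, with an interval of one week, each week automatically becomes one group
Histogram Aggregation: groups by numeric interval, analogous to the date version
Terms Aggregation: groups by term; documents whose term matches exactly fall into the same group
Range Aggregation: groups numeric or date values by range; you specify the start and end of each range and documents are grouped by segment
Elasticsearch's grouping options are quite powerful: a plain MySQL GROUP BY on a column corresponds most closely to the Terms Aggregation, whereas Elasticsearch can also group by interval and by range out of the box.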
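Metrics
Metrics are similar to MySQL functions such as AVG and MAX; they compute values such as the average or the maximum within each group.
Some commonly used metric aggregations:
Avg Aggregation: average
Max Aggregation: maximum
Min Aggregation: minimum
Percentiles Aggregation: percentiles
Stats Aggregation: returns avg, max, min, sum, and count together
Sum Aggregation: sum
Top hits Aggregation: the top matching documents in each bucket
Value Count Aggregation: number of values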
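The metric types listed above correspond to ordinary summary statistics. As a rough illustration only, the following plain-Java sketch computes the five numbers a Stats Aggregation reports. It does not call Elasticsearch; the class name MetricSketch and the sample prices are assumptions made for this example.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

// Plain-Java illustration of the numbers a Stats Aggregation reports.
// It does not call Elasticsearch; the prices below are hypothetical.
final class MetricSketch {

    // Returns count, sum, min, max and average of the given values in one pass.
    static DoubleSummaryStatistics stats(double... values) {
        return DoubleStream.of(values).summaryStatistics();
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics s = stats(3500, 4500, 5500, 4500, 5500);
        System.out.println("count = " + s.getCount());   // Value Count
        System.out.println("sum   = " + s.getSum());     // Sum
        System.out.println("min   = " + s.getMin());     // Min
        System.out.println("max   = " + s.getMax());     // Max
        System.out.println("avg   = " + s.getAverage()); // Avg
    }
}
```

Each getter maps to one of the single-value metrics in the list, which is why Stats is convenient when several of them are needed at once.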
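Let's start with the simplest case, the terms bucket. Here brand_aggs is a user-chosen name for the aggregation, terms selects the terms bucket type, and field: brand means documents are bucketed by the brand field. Setting size to 0 means no search hits are returned. This also shows that pagination does not affect the aggregation result, so a paged query and its aggregations can be returned in the same response.
The following query groups documents by brand name and counts each group: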
GET /goods/_search
{
  "size" : 0,
  "aggs" : {
    "brand_aggs" : {
      "terms" : {
        "field" : "brand"
      }
    }
  }
}
Query result:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "brand_aggs" : { // aggregation name
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ // one bucket per brand
        {
          "key" : "华为", // brand name (Huawei), since documents are grouped by brand
          "doc_count" : 3 // number of documents in the bucket
        },
        {
          "key" : "小米",
          "doc_count" : 2
        }
      ]
    }
  }
}
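As you can see, the document count of each bucket is returned by default without adding any metric. To get the average phone price per brand, a metric has to be added.
Average metric
GET /goods/_search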
{
  "size" : 0,
  "aggs" : {
    "brand_aggs" : {
      "terms" : {
        "field" : "brand"
      },
      "aggs" : {
        "avg_price" : {
          "avg" : {
            "field" : "price"
          }
        }
      }
    }
  }
}
Response:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "brand_aggs" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "华为",
          "doc_count" : 3,
          "avg_price" : {
            "value" : 4500.0
          }
        },
        {
          "key" : "小米",
          "doc_count" : 2,
          "avg_price" : {
            "value" : 5000.0
          }
        }
      ]
    }
  }
}
Code implementation
public void testAggs() {
    // Bucket by brand
    AbstractAggregationBuilder<?> aggregationBuilder = AggregationBuilders.terms("brand_aggs").field("brand");
    // Average metric: compute the average price within each brand bucket
    aggregationBuilder.subAggregation(AggregationBuilders.avg("avg_price").field("price"));
    NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
            .withPageable(PageRequest.of(0, 1)) // page size must be greater than 0
            .addAggregation(aggregationBuilder)
            .build();
    SearchHits<GoodsInfo> goodsInfos = elasticsearchRestTemplate.search(nativeSearchQuery, GoodsInfo.class);
    Terms brandTerms = goodsInfos.getAggregations().get("brand_aggs");
    brandTerms.getBuckets().forEach(bucket -> {
        System.out.println(bucket.getKey());      // brand name
        System.out.println(bucket.getDocCount()); // number of documents in the bucket
        ParsedAvg avgPrice = bucket.getAggregations().get("avg_price"); // average price
        System.out.println(avgPrice.getValue());
    });
}
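To make the semantics of a terms bucket with a nested avg metric concrete, here is a minimal plain-Java sketch that reproduces the brand_aggs and avg_price numbers in memory. It does not use Elasticsearch or Spring Data. The Phone class, the class name TermsAvgSketch, and the five sample prices are hypothetical, chosen only so that the counts and averages agree with the response shown earlier; "Huawei" and "Xiaomi" stand in for the keys 华为 and 小米.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory illustration of "terms bucket + avg metric". Not the Elasticsearch API.
final class TermsAvgSketch {

    // Hypothetical document holding only the two fields the aggregation uses.
    static final class Phone {
        private final String brand;
        private final double price;

        Phone(String brand, double price) {
            this.brand = brand;
            this.price = price;
        }

        String getBrand() { return brand; }
        double getPrice() { return price; }
    }

    // Hypothetical sample data, consistent with the counts and averages shown above.
    static List<Phone> samplePhones() {
        return Arrays.asList(
                new Phone("Huawei", 3500), new Phone("Huawei", 4500), new Phone("Huawei", 5500),
                new Phone("Xiaomi", 4500), new Phone("Xiaomi", 5500));
    }

    // Groups phones by brand (the bucket) and summarizes price per group (the metric).
    // A TreeMap orders buckets by key; Elasticsearch orders terms buckets by doc_count by default.
    static Map<String, DoubleSummaryStatistics> aggregate(List<Phone> phones) {
        return phones.stream().collect(Collectors.groupingBy(
                Phone::getBrand,
                TreeMap::new,
                Collectors.summarizingDouble(Phone::getPrice)));
    }

    public static void main(String[] args) {
        aggregate(samplePhones()).forEach((brand, s) ->
                System.out.println(brand + " doc_count=" + s.getCount() + " avg_price=" + s.getAverage()));
    }
}
```

The count comes for free with every bucket, while the average exists only because a summarizing step was attached to the grouping, which mirrors the distinction between doc_count and an explicit metric.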
The following example counts phones per price band, using an interval of 500:
GET /goods/_search
{
  "size" : 0,
  "aggs" : {
    "price_histogram" : {
      "histogram" : {
        "field" : "price",
        "interval" : 500
      }
    }
  }
}
Result:
{
  "took" : 103,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "price_histogram" : {
      "buckets" : [
        {
          "key" : 3500.0,
          "doc_count" : 1
        },
        {
          "key" : 4000.0,
          "doc_count" : 0
        },
        {
          "key" : 4500.0,
          "doc_count" : 2
        },
        {
          "key" : 5000.0,
          "doc_count" : 0
        },
        {
          "key" : 5500.0,
          "doc_count" : 2
        }
      ]
    }
  }
}
Code:
public void testHistogram() {
    // Bucket prices in steps of 500
    AbstractAggregationBuilder<?> aggregationBuilder = AggregationBuilders.histogram("price_histogram").field("price").interval(500);
    NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
            .withPageable(PageRequest.of(0, 1)) // page size must be greater than 0
            .addAggregation(aggregationBuilder)
            .build();
    SearchHits<GoodsInfo> goodsInfos = elasticsearchRestTemplate.search(nativeSearchQuery, GoodsInfo.class);
    ParsedHistogram priceHistogram = goodsInfos.getAggregations().get("price_histogram");
    priceHistogram.getBuckets().forEach(bucket -> {
        System.out.println(bucket.getKey());      // lower bound of the bucket
        System.out.println(bucket.getDocCount()); // number of documents in the bucket
    });
}
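The bucket keys in the histogram result follow from rounding each value down to a multiple of the interval (with the default offset of 0). The plain-Java sketch below illustrates that rule and the zero-count buckets that fill the gaps. It is not the Elasticsearch implementation; the class name HistogramSketch and the sample prices are assumptions picked to agree with the result shown earlier.

```java
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of numeric histogram bucketing. Not the Elasticsearch API.
final class HistogramSketch {

    // Bucket key of a value: the value rounded down to a multiple of interval.
    static double bucketKey(double value, double interval) {
        return Math.floor(value / interval) * interval;
    }

    // Counts values per bucket, then adds zero-count buckets for the gaps
    // between the smallest and the largest key.
    static Map<Double, Long> histogram(double[] values, double interval) {
        TreeMap<Double, Long> buckets = new TreeMap<>();
        for (double v : values) {
            buckets.merge(bucketKey(v, interval), 1L, Long::sum);
        }
        if (!buckets.isEmpty()) {
            double first = buckets.firstKey();
            long steps = Math.round((buckets.lastKey() - first) / interval);
            for (long i = 1; i < steps; i++) {
                buckets.putIfAbsent(first + i * interval, 0L);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        double[] prices = {3500, 4500, 5500, 4500, 5500}; // hypothetical sample prices
        System.out.println(histogram(prices, 500));
        // prints {3500.0=1, 4000.0=0, 4500.0=2, 5000.0=0, 5500.0=2}
    }
}
```

The empty 4000 and 5000 buckets appear because gaps are filled by default; setting min_doc_count to 1 in the request would drop them.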
Count the phones priced from 4000 (inclusive) to 6000 (exclusive):
GET /goods/_search
{
  "size" : 0,
  "aggs" : {
    "price_range" : {
      "range" : {
        "field" : "price",
        "ranges" : [
          {
            "from" : 4000,
            "to" : 6000
          }
        ]
      }
    }
  }
}
Result:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "price_range" : {
      "buckets" : [
        {
          "key" : "4000.0-6000.0",
          "from" : 4000.0,
          "to" : 6000.0,
          "doc_count" : 4
        }
      ]
    }
  }
}
Code:
public void testRangeAggrs() {
    // Single range bucket: 4000 <= price < 6000
    AbstractAggregationBuilder<?> aggregationBuilder = AggregationBuilders.range("price_range").field("price").addRange(4000, 6000);
    NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
            .withPageable(PageRequest.of(0, 1)) // page size must be greater than 0
            .addAggregation(aggregationBuilder)
            .build();
    SearchHits<GoodsInfo> goodsInfos = elasticsearchRestTemplate.search(nativeSearchQuery, GoodsInfo.class);
    ParsedRange priceRange = goodsInfos.getAggregations().get("price_range");
    priceRange.getBuckets().forEach(bucket -> {
        System.out.println(bucket.getKey());      // bucket key, e.g. 4000.0-6000.0
        System.out.println(bucket.getDocCount()); // number of documents in the bucket
    });
}
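A range bucket includes its from value and excludes its to value. The following plain-Java sketch illustrates that boundary rule and mimics the key format seen in the result. It does not call Elasticsearch; the class name RangeSketch, its method names, and the sample prices are assumptions made for this example.

```java
// In-memory illustration of a single range bucket. Not the Elasticsearch API.
final class RangeSketch {

    // Counts the values v with from <= v < to (from inclusive, to exclusive).
    static long countInRange(double[] values, double from, double to) {
        long count = 0;
        for (double v : values) {
            if (v >= from && v < to) {
                count++;
            }
        }
        return count;
    }

    // Mimics the "from-to" key format shown in the result above.
    static String defaultKey(double from, double to) {
        return from + "-" + to;
    }

    public static void main(String[] args) {
        double[] prices = {3500, 4500, 5500, 4500, 5500}; // hypothetical sample prices
        System.out.println(defaultKey(4000, 6000) + " doc_count=" + countInRange(prices, 4000, 6000));
        // prints 4000.0-6000.0 doc_count=4
    }
}
```

Because the upper bound is exclusive, adjacent ranges such as 4000 to 6000 and 6000 to 8000 never count the same document twice.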
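The following date histogram query on the cars index counts documents per calendar month of the sold field, formats the bucket keys as yyyy-MM, evaluates dates in the -01:00 time zone, and omits months without documents (min_doc_count of 1). In Elasticsearch 7.2 and later, interval is deprecated for date_histogram in favor of calendar_interval.
GET /cars/_search
{
  "size" : 0,
  "aggs" : {
    "date" : {
      "date_histogram" : {
        "field" : "sold",
        "interval" : "1M",
        "format" : "yyyy-MM",
        "time_zone" : "-01:00",
        "min_doc_count" : 1
      }
    }
  }
}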
Result:
"aggregations" : {
  "date" : {
    "buckets" : [
      {
        "key_as_string" : "2013-12",
        "key" : 1385859600000,
        "doc_count" : 1
      },
      {
        "key_as_string" : "2014-02",
        "key" : 1391216400000,
        "doc_count" : 1
      },
      {
        "key_as_string" : "2014-05",
        "key" : 1398906000000,
        "doc_count" : 1
      },
      {
        "key_as_string" : "2014-07",
        "key" : 1404176400000,
        "doc_count" : 1
      },
      {
        "key_as_string" : "2014-08",
        "key" : 1406854800000,
        "doc_count" : 1
      },
      {
        "key_as_string" : "2014-10",
        "key" : 1412125200000,
        "doc_count" : 1
      },
      {
        "key_as_string" : "2014-11",
        "key" : 1414803600000,
        "doc_count" : 2
      }
    ]
  }
}
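The key values in this result are epoch milliseconds for the start of each month in the -01:00 time zone, which is why they fall one hour after midnight UTC. The plain-Java sketch below shows one way to reproduce a key and its key_as_string from a sale timestamp using java.time. It is an illustration rather than the Elasticsearch implementation; the class name MonthBucketSketch and the sample timestamp are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// In-memory illustration of monthly date bucketing with a time zone. Not the Elasticsearch API.
final class MonthBucketSketch {

    // Corresponds to "time_zone": "-01:00" in the request.
    static final ZoneOffset ZONE = ZoneOffset.ofHours(-1);

    // Start of the calendar month that contains the instant, evaluated in ZONE.
    static ZonedDateTime monthStart(Instant sold) {
        return sold.atZone(ZONE).withDayOfMonth(1).truncatedTo(ChronoUnit.DAYS);
    }

    // Bucket key: epoch milliseconds of the month start.
    static long key(Instant sold) {
        return monthStart(sold).toInstant().toEpochMilli();
    }

    // Bucket label, corresponding to "format": "yyyy-MM".
    static String keyAsString(Instant sold) {
        return DateTimeFormatter.ofPattern("yyyy-MM").format(monthStart(sold));
    }

    public static void main(String[] args) {
        Instant sold = Instant.parse("2014-11-05T00:00:00Z"); // hypothetical sale timestamp
        System.out.println(keyAsString(sold) + " " + key(sold));
        // prints 2014-11 1414803600000
    }
}
```

The time zone also decides which month a document belongs to: a sale at 2014-11-01T00:30:00Z is still 31 October in -01:00, so it lands in the 2014-10 bucket.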