Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

61 餐馆营业额变化增长 #74

Open
astak16 opened this issue Sep 23, 2023 · 0 comments
Open

61 餐馆营业额变化增长 #74

astak16 opened this issue Sep 23, 2023 · 0 comments
Labels

Comments

@astak16
Copy link
Owner

astak16 commented Sep 23, 2023

题目

题目链接:餐馆营业额变化增长

你是餐馆的老板,现在你想分析一下可能的营业额变化增长(每天至少有一位顾客)。

计算以 7 天(某日期 + 该日期前的 6 天)为一个时间段的顾客消费平均值。average_amount 要 保留两位小数。

结果按 visited_on 升序排序。

返回结果格式的例子如下。

Create table If Not Exists Customer (customer_id int, name varchar(20), visited_on date, amount int);
Truncate table Customer;
insert into Customer (customer_id, name, visited_on, amount) values ('1', 'Jhon', '2019-01-01', '100');
insert into Customer (customer_id, name, visited_on, amount) values ('2', 'Daniel', '2019-01-02', '110');
insert into Customer (customer_id, name, visited_on, amount) values ('3', 'Jade', '2019-01-03', '120');
insert into Customer (customer_id, name, visited_on, amount) values ('4', 'Khaled', '2019-01-04', '130');
insert into Customer (customer_id, name, visited_on, amount) values ('5', 'Winston', '2019-01-05', '110');
insert into Customer (customer_id, name, visited_on, amount) values ('6', 'Elvis', '2019-01-06', '140');
insert into Customer (customer_id, name, visited_on, amount) values ('7', 'Anna', '2019-01-07', '150');
insert into Customer (customer_id, name, visited_on, amount) values ('8', 'Maria', '2019-01-08', '80');
insert into Customer (customer_id, name, visited_on, amount) values ('9', 'Jaze', '2019-01-09', '110');
insert into Customer (customer_id, name, visited_on, amount) values ('1', 'Jhon', '2019-01-10', '130');
insert into Customer (customer_id, name, visited_on, amount) values ('3', 'Jade', '2019-01-10', '150');

-- 时间不连续的例子
insert into Customer (customer_id, name, visited_on, amount) values ('1', 'Jhon', '2019-01-01', '100');
insert into Customer (customer_id, name, visited_on, amount) values ('4', 'Khaled', '2019-01-04', '130');
insert into Customer (customer_id, name, visited_on, amount) values ('5', 'Winston', '2019-01-05', '110');
insert into Customer (customer_id, name, visited_on, amount) values ('6', 'Elvis', '2019-01-06', '140');
insert into Customer (customer_id, name, visited_on, amount) values ('7', 'Anna', '2019-01-07', '150');
insert into Customer (customer_id, name, visited_on, amount) values ('8', 'Maria', '2019-01-08', '80');
insert into Customer (customer_id, name, visited_on, amount) values ('9', 'Jaze', '2019-01-09', '110');
insert into Customer (customer_id, name, visited_on, amount) values ('1', 'Jhon', '2019-01-10', '130');
insert into Customer (customer_id, name, visited_on, amount) values ('3', 'Jade', '2019-01-10', '150');
Customer 表:
+-------------+--------------+--------------+-------------+
| customer_id | name         | visited_on   | amount      |
+-------------+--------------+--------------+-------------+
| 1           | Jhon         | 2019-01-01   | 100         |
| 2           | Daniel       | 2019-01-02   | 110         |
| 3           | Jade         | 2019-01-03   | 120         |
| 4           | Khaled       | 2019-01-04   | 130         |
| 5           | Winston      | 2019-01-05   | 110         |
| 6           | Elvis        | 2019-01-06   | 140         |
| 7           | Anna         | 2019-01-07   | 150         |
| 8           | Maria        | 2019-01-08   | 80          |
| 9           | Jaze         | 2019-01-09   | 110         |
| 1           | Jhon         | 2019-01-10   | 130         |
| 3           | Jade         | 2019-01-10   | 150         |
+-------------+--------------+--------------+-------------+
在 SQL 中,(customer_id, visited_on) 是该表的主键。
该表包含一家餐馆的顾客交易数据。
visited_on 表示 (customer_id) 的顾客在 visited_on 那天访问了餐馆。
amount 是一个顾客某一天的消费总额。

输出:
+--------------+--------------+----------------+
| visited_on   | amount       | average_amount |
+--------------+--------------+----------------+
| 2019-01-07   | 860          | 122.86         |
| 2019-01-08   | 840          | 120            |
| 2019-01-09   | 840          | 120            |
| 2019-01-10   | 1000         | 142.86         |
+--------------+--------------+----------------+
解释:
第一个七天消费平均值从 2019-01-01 到 2019-01-07 是restaurant-growth/restaurant-growth/ (100 + 110 + 120 + 130 + 110 + 140 + 150)/7 = 122.86
第二个七天消费平均值从 2019-01-02 到 2019-01-08 是 (110 + 120 + 130 + 110 + 140 + 150 + 80)/7 = 120
第三个七天消费平均值从 2019-01-03 到 2019-01-09 是 (120 + 130 + 110 + 140 + 150 + 80 + 110)/7 = 120
第四个七天消费平均值从 2019-01-04 到 2019-01-10 是 (130 + 110 + 140 + 150 + 80 + 110 + 130 + 150)/7 = 142.86

本题考察的知识是如何累加一段时间区间内的值

有两种实现方式:

  1. 使用窗口函数,窗口函数比较好理解使用 6 PRECEDING AND current ROW 就能查找出来了(方案一)
  2. 使用自连,连接条件不太容易想到,需要使用 DATEDIFF 函数,这个函数可以计算两个日期之间的天数,然后使用 BETWEEN 条件(方案二和方案三)

解析

  1. 要知道过去 7 天的平均消费额,需要先知道每天的总消费额,作为临时表 tmp1
    select visited_on, sum(amount) sum_amount from Customer group by visited_on

    +-------------+--------------+
    | visited_on  | sum_amount   |
    +-------------+--------------+
    | 2019-01-01  |  100         |
    | 2019-01-02  |  110         |
    | 2019-01-03  |  120         |
    | 2019-01-04  |  130         |
    | 2019-01-05  |  110         |
    | 2019-01-06  |  140         |
    | 2019-01-07  |  150         |
    | 2019-01-08  |  80          |
    | 2019-01-09  |  110         |
    | 2019-01-10  |  280         |
    +-------------+--------------+
    
  2. 使用窗口函数,计算过去 7 天的总的消费额,作为临时表 tmp2
    select sum(sum_amount) sum_amount over (order by to_days(visited_on) range between 6 preceding and current row) as sum_amount from tmp1

    | visited_on  | sum_amount   |
    +-------------+--------------+
    | 2019-01-01  | 100          |
    | 2019-01-02  | 210          |
    | 2019-01-03  | 330          |
    | 2019-01-04  | 460          |
    | 2019-01-05  | 570          |
    | 2019-01-06  | 710          |
    | 2019-01-07  | 860          |
    | 2019-01-08  | 840          |
    | 2019-01-09  | 840          |
    | 2019-01-10  | 1000         |
    +-------------+--------------+
    
  3. 计算过去 7 天的平均消费额,作为临时表 tmp3
    select visited_on, sum_amount amount, sum_amount / 7 as average_amount from tmp2

    | visited_on  | sum_amount   | average_amount |
    +-------------+--------------+----------------+
    | 2019-01-01  | 100	        | 14.2857        |
    | 2019-01-02  | 210	        | 30.0000        |
    | 2019-01-03  | 330	        | 47.1429        |
    | 2019-01-04  | 460	        | 65.7143        |
    | 2019-01-05  | 570	        | 81.4286        |
    | 2019-01-06  | 710	        | 101.4286       |
    | 2019-01-07  | 860	        | 122.8571       |
    | 2019-01-08  | 840	        | 120.0000       |
    | 2019-01-09  | 840	        | 120.0000       |
    | 2019-01-10  | 1000	     | 142.8571       |
    +-------------+-------------+----------------+
    
  4. 筛选出计算数据大于等于七天的数据

    • 需要知道表中日期最小的一天,作为临时表 tmp4
      select min(visited_on) min_visited_on from Customer
    | min_visited_on  |
    +-----------------+
    | 2019-01-01      |
    +-----------------+
    
    • 使用 datediff(expr1, expr2) 函数,计算两个日期之间的天数,这里需要大于等于 6
      select visited_on, amount, round(average_amount, 2) average_amount from tmp3 where datediff(visited_on, (select min(visited_on) from Customer)) >= 6
    | visited_on  | amount       | average_amount |
    +-------------+--------------+----------------+
    | 2019-01-07	| 860          |  122.8571      |
    | 2019-01-08	| 840          |  120.0000      |
    | 2019-01-09	| 840          |  120.0000      |
    | 2019-01-10	| 1000         |  142.8571      |
    +-------------+--------------+----------------+
    

最终 sql 语句如下:

SELECT
   visited_on,
   sum_amount amount,
   ROUND( sum_amount / 7, 2 ) average_amount
FROM (
   SELECT
      visited_on,
      SUM( sum_amount ) OVER ( ORDER BY to_days(visited_on) RANGE BETWEEN 6 PRECEDING AND current ROW ) sum_amount
   FROM (
      SELECT
         visited_on,
         SUM( amount ) sum_amount
      FROM Customer
      GROUP BY visited_on
   ) tmp1
) tmp2
WHERE DATEDIFF(visited_on, ( SELECT MIN( visited_on ) FROM Customer )) >= 6;

上面 sql 可以简化一下,不过有问题,就是如果时间不连续,排序不会跳过。

也就是说 rk > 7 只能筛选出连续 7 天的数据

SELECT
   visited_on,
   amount,
   SUM( amount / 7, 2 ) average_amount
FROM (
   SELECT
      visited_on,
      RANK() OVER ( ORDER BY visited_on ) AS rk,
      SUM(SUM( amount )) OVER ( ORDER BY visited_on RANGE INTERVAL 7-1 DAY PRECEDING ) AS amount
   FROM Customer GROUP BY visited_on
) AS tep WHERE rk >= 7 ORDER BY 1

方法二

此方法是使用自连,连接的条件是时间连续 7 天,这个方法如果时间不连续,就会有问题

WITH t AS (
   SELECT visited_on, SUM( amount ) amount FROM Customer GROUP BY visited_on
)
SELECT a.visited_on, SUM( b.amount ) amount, ROUND( AVG( b.amount ), 2 ) average_amount
FROM t a, t b
WHERE DATEDIFF( a.visited_on, b.visited_on ) BETWEEN 0 AND 6
GROUP BY a.visited_on COUNT(*) = 7;

方法三

SELECT
   a.visited_on,
   sum( b.amount ) AS amount,
   round( sum( b.amount ) / 7, 2 ) AS average_amount
FROM
   ( SELECT DISTINCT visited_on FROM Customer ) a
   JOIN Customer b ON datediff( a.visited_on, b.visited_on ) BETWEEN 0 AND 6
WHERE
   a.visited_on >= ( SELECT min( visited_on ) FROM Customer ) + 6
GROUP BY a.visited_on
ORDER BY visited_on
@astak16 astak16 added the 中等 label Sep 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant