HiveSQL经典面试题(建议点赞收藏)

打印 上一主题 下一主题

主题 557|帖子 557|积分 1671

目录
经典面试系列
每科成绩都大于80分的学生信息
连续登录问题
行列转换问题
留存问题:看当天登录后第N天是否登录
TopN问题
累计计算问题
HSQL进阶版
 直播间在线最大人数
SQL循环
计算中位数
产生连续数值

经典面试系列


  • 每科成绩都大于80分的学生信息

    • 建表+初始化
      1. --创建表scdn_student_score_test 并且初始化三个学生成绩
      2. create table hdw_tmp_dev.scdn_student_score_test as
      3. select '张三' as name,'数学' as subject ,'80' as score
      4. union all
      5. select '张三' as name,'语文' as subject ,'90' as score
      6. union all
      7. select '张三' as name,'英语' as subject ,'90' as score
      8. union all
      9. select '李四' as name,'数学' as subject ,'90' as score
      10. union all
      11. select '李四' as name,'语文' as subject ,'90' as score
      12. union all
      13. select '李四' as name,'英语' as subject ,'70' as score
      14. union all
      15. select '王五' as name,'数学' as subject ,'90' as score
      16. union all
      17. select '王五' as name,'语文' as subject ,'90' as score
      18. union all
      19. select '王五' as name,'英语' as subject ,'50' as score
      20. --查询结果显示
      21. | scdn_student_score_test.name  | scdn_student_score_test.subject  | scdn_student_score_test.score  |
      22. +-------------------------------+----------------------------------+--------------------------------+
      23. | 张三                            | 数学                               | 80                             |
      24. | 张三                            | 语文                               | 90                             |
      25. | 张三                            | 英语                               | 90                             |
      26. | 李四                            | 数学                               | 90                             |
      27. | 李四                            | 语文                               | 90                             |
      28. | 李四                            | 英语                               | 70                             |
      29. | 王五                            | 数学                               | 90                             |
      30. | 王五                            | 语文                               | 90                             |
      31. | 王五                            | 英语                               | 50                             |
      32. +-------------------------------+----------------------------------+---------------------
      33. --求所有学科成绩都大于等于80分的学生姓名
      复制代码
    • 思路一(头脑转换):所有问题找最小(最大)
      1. select
      2.     t1.name
      3.     ,t1.min_score
      4. from
      5.     (
      6.         select
      7.             name
      8.             ,min(score)  as min_score
      9.         from hdw_tmp_dev.scdn_student_score_test
      10.         group by name
      11.     ) as t1 --求出最小的成绩
      12. where t1.min_score >=80 --最小成绩大于等于80,则这个学生的所有成绩都会大于等于80
      复制代码
    • 执行效果

    • 思路二(巧用左关联举行筛选):
      1. select
      2.     t1.name
      3. from
      4. (
      5.     select
      6.         name
      7.     from hdw_tmp_dev.scdn_student_score_test
      8.     group by name
      9. ) as t1
      10. left join
      11. (
      12.     select
      13.         name
      14.     from hdw_tmp_dev.scdn_student_score_test
      15.     where score <80
      16.     group by name
      17. ) as t2 on t1.name = t2.name
      18. where t2.name is null
      复制代码

  • 连续登录问题

    • 建表+初始化
      1. create table hdw_tmp_dev.csdn_user_login_test as
      2. select 'xiaoming' as user_name,'2024-01-01' as login_date
      3. union all
      4. select 'xiaoming' as user_name,'2024-01-02' as login_date
      5. union all
      6. select 'xiaoming' as user_name,'2024-01-03' as login_date
      7. union all
      8. select 'xiaoming' as user_name,'2024-01-04' as login_date
      9. union all
      10. select 'xiaoming' as user_name,'2024-01-05' as login_date
      11. union all
      12. select 'dahuang' as user_name,'2024-01-02' as login_date
      13. union all
      14. select 'dahuang' as user_name,'2024-01-03' as login_date
      15. union all
      16. select 'dahuang' as user_name,'2024-01-04' as login_date
      17. union all
      18. select 'dahuang' as user_name,'2024-01-05' as login_date
      19. union all
      20. select 'lucky_dog' as user_name,'2024-01-01' as login_date
      21. union all
      22. select 'lucky_dog' as user_name,'2024-01-03' as login_date
      23. union all
      24. select 'lucky_dog' as user_name,'2024-01-04' as login_date
      25. union all
      26. select 'lucky_dog' as user_name,'2024-01-05' as login_date
      复制代码
    • Sql参考
      1. select
      2.     t2.user_name
      3.     ,t2.date_begin_flag
      4.     ,count(1) as max_login_days
      5. from
      6. (
      7.     select
      8.         t1.user_name
      9.         ,t1.login_date
      10.         ,date_sub(date(login_date),t1.rn) as date_begin_flag
      11.     from
      12.     (
      13.         select
      14.             user_name
      15.             ,login_date
      16.             ,row_number()over(partition by user_name order by login_date) as rn
      17.         from hdw_tmp_dev.csdn_user_login_test
      18.     ) as t1
      19. ) as t2  
      20. group by t2.user_name
      21.     ,t2.date_begin_flag
      复制代码
    • 执行过程+效果


  • 行列转换问题

    • 建表+初始化:参考问题一的表
      1. | t1.name              | t1.subject             | t1.score  |
      2. +-------------------------------+----------------------------------+-----------------+
      3. | 张三                 | 数学                    | 80                             |
      4. | 张三                 | 语文                    | 90                             |
      5. | 张三                 | 英语                    | 90                             |
      6. | 李四                 | 数学                    | 90                             |
      7. | 李四                 | 语文                    | 90                             |
      8. | 李四                 | 英语                    | 70                             |
      9. | 王五                 | 数学                    | 90                             |
      10. | 王五                 | 语文                    | 90                             |
      11. | 王五                 | 英语                    | 50                             |
      12. +-------------------------------+----------------------------------+------------------
      复制代码
    • 行专列Sql参考
      1. select
      2.     name
      3.     ,max(case when subject = '数学' then score end) as math_score
      4.     ,max(case when subject = '语文' then score end) as china_score
      5.     ,max(case when subject = '英语' then score end) as english_score
      6. from hdw_tmp_dev.scdn_student_score_test
      7. group by name
      复制代码
    • 行专列执行效果

    • 列转行:可以参考建表初始化语句(union all)
      1. select
      2.     name
      3.     ,collect_set(subject)  as subject_set
      4. from hdw_tmp_dev.scdn_student_score_test
      5. group by name
      复制代码
    • 每个学生选课效果(多行变一行):

    •  将上面的效果睁开(一行变多行)数据准备:
      1. create table hdw_tmp_dev.scdn_student_score_test_collect as
      2. select 'zhangsan' as name ,'"数学","语文","英语"' as subject_list
      3. union all
      4. select 'lisi' as name ,'"美术","生物","物理"' as subject_list
      5. union all
      6. select 'wangwu' as name ,'"计算机","日语","韩语"' as subject_list
      复制代码
    • 炸开代码
      1. select
      2.     name
      3.     ,subject_list
      4.     ,subject_name
      5. from hdw_tmp_dev.scdn_student_score_test_collect
      6. lateral view explode(split(subject_list,',')) extend_sub as subject_name
      复制代码
    • 效果


  • 留存问题:看当天登录后第N天是否登录

    • 建表+初始化:参照连续登录表
    • Sql参考
      1. --方案一:利用lead(日期,N)是否等于 当天登录实践+N天
      2. select
      3.     t1.user_name
      4.     ,t1.logon_date
      5.     ,case when lead1_logon_date = date_add(logon_date,1) then '1天留存' end as 1day_remain
      6.     ,case when lead3_logon_date = date_add(logon_date,3) then '3天留存' end as 3day_remain
      7. from
      8. (
      9.     select
      10.         user_name
      11.         ,logon_date
      12.         ,lead(user_name,1)over(partition by user_name order by logon_date) as lead1_user_name
      13.         ,lead(logon_date,1)over(partition by user_name order by logon_date) as lead1_logon_date
      14.         ,lead(user_name,3)over(partition by user_name order by logon_date) as lead3_user_name
      15.         ,lead(logon_date,3)over(partition by user_name order by logon_date) as lead3_logon_date
      16.     from hdw_tmp_dev.csdn_user_logon_test
      17. ) as t1
      18. --方案二:
      19. select
      20.     t2.first_log_date       as first_log_date
      21.     ,count(t2.user_id)      as new_user_cnt --新用户数
      22.     ,count(t3.user_id)      as next_user_id --次日回访用户数
      23.     ,count(t4.user_id)      as 30_user_id   --30天回访用户数
      24.     ,count(t3.user_id)/count(t2.user_id)    as next_back_rate --次日回访率
      25.     ,count(t4.user_id)/count(t2.user_id)    as 30_back_rate   --30天回访率
      26. from
      27. (
      28.     select
      29.         first_log_date
      30.         ,user_id
      31.         ,date_add(first_log_date,1)  as next_log_date
      32.         ,date_add(first_log_date,29) as 30_log_date
      33.     from
      34.     (
      35.         select
      36.             user_id
      37.             ,log_time
      38.             ,first_value(date(log_time))over(partition by user_id) as first_log_date
      39.         from user_log
      40.     ) as t1
      41.     group by first_log_date
      42.         ,user_id
      43. ) as t2
      44. left join
      45. (
      46.     select
      47.         user_id
      48.         ,date(log_date) as log_date  
      49.     from user_log
      50.     group by user_id
      51.         ,date(log_date) as log_date
      52. ) as t3 on t2.user_id = t2.user_id and t2.next_log_date = t3.log_date
      53. left join
      54. (
      55.     select
      56.         user_id
      57.         ,date(log_date) as log_date  
      58.     from user_log
      59.     group by user_id
      60.         ,date(log_date) as log_date
      61. ) as t4 on t2.user_id = t4.user_id and t2.30_log_date = t4.log_date
      62. group by t2.first_log_date
      复制代码
    • 方案一执行效果


  • TopN问题

    • 建表+初始化
      1. create table hdw_tmp_dev.scdn_student_score_test1 as
      2. select '张三' as name,'数学' as subject ,'80' as score
      3. union all
      4. select '张三' as name,'语文' as subject ,'90' as score
      5. union all
      6. select '张三' as name,'英语' as subject ,'90' as score
      7. union all
      8. select '李四' as name,'数学' as subject ,'90' as score
      9. union all
      10. select '李四' as name,'语文' as subject ,'90' as score
      11. union all
      12. select '李四' as name,'英语' as subject ,'70' as score
      13. union all
      14. select '王五' as name,'数学' as subject ,'90' as score
      15. union all
      16. select '王五' as name,'语文' as subject ,'90' as score
      17. union all
      18. select '王五' as name,'英语' as subject ,'50' as score
      19. union all
      20. select '小明' as name,'数学' as subject ,'88' as score
      21. union all
      22. select '小明' as name,'语文' as subject ,'99' as score
      23. union all
      24. select '小明' as name,'英语' as subject ,'77' as score
      25. union all
      26. select '小文' as name,'数学' as subject ,'66' as score
      27. union all
      28. select '小文' as name,'语文' as subject ,'89' as score
      29. union all
      30. select '小文' as name,'英语' as subject ,'90' as score
      复制代码
    • Sql参考:求每科前三名对应的人员的成绩单
      1. select
      2.     *
      3. from
      4. (
      5.     select
      6.         name
      7.         ,subject
      8.         ,score
      9.         ,row_number()over(partition by subject order by score desc) as rn
      10.     from hdw_tmp_dev.scdn_student_score_test1
      11. ) as t1
      12. where rn<=3
      复制代码
    • 执行效果


  • 累计计算问题

    • 建表+初始化
      1. create table hdw_tmp_dev.user_sale_date as
      2. select '001' as user_id,'2024-02-01' as sale_date, 100 as amount
      3. union all
      4. select '001' as user_id,'2024-02-02' as sale_date, 200 as amount
      5. union all
      6. select '001' as user_id,'2024-02-03' as sale_date, 300 as amount
      7. union all
      8. select '001' as user_id,'2024-02-04' as sale_date, 400 as amount
      9. union all
      10. select '001' as user_id,'2024-02-05' as sale_date, 500 as amount
      11. union all
      12. select '001' as user_id,'2024-02-06' as sale_date, 600 as amount
      复制代码
    • SQL逻辑
      1. select
      2.     user_id
      3.     ,sale_date
      4.     ,amount
      5.     ,sum(amount)over(partition by user_id order by sale_date) as accuma_amount --按日期逐渐累加
      6.     ,sum(amount)over(partition by user_id) as total_amount --按人汇总
      7.     ,avg(amount)over(partition by user_id) as avg_amount --安人平均每天
      8.     ,max(amount)over(partition by user_id) as max_amount --单日最大销售
      9.     ,min(amount)over(partition by user_id) as min_amount --单日最小销售
      10. from hdw_tmp_dev.user_sale_date
      复制代码
    • 效果展示


HSQL进阶版


  •  直播间在线最大人数

    • 建表+初始化
      1. create table hdw_tmp_dev.csdn_user_login_time_detail as
      2. select '001' as user_id,'2024-02-01 10:00:00' as begin_time,'2024-02-01 12:00:00' as end_time
      3. union all
      4. select '002' as user_id,'2024-02-01 11:00:00' as begin_time,'2024-02-01 13:00:00' as end_time
      5. union all
      6. select '003' as user_id,'2024-02-01 11:00:00' as begin_time,'2024-02-01 14:00:00' as end_time
      7. union all
      8. select '004' as user_id,'2024-02-01 15:00:00' as begin_time,'2024-02-01 16:00:00' as end_time
      复制代码
    • SQL逻辑
      1. select
      2.     t1.user_id
      3.     ,t1.time1
      4.     ,t1.flag
      5.     ,sum(flag)over(order by t1.time1) as user_cnt
      6. from
      7. (
      8.     select
      9.         user_id
      10.         ,begin_time as time1
      11.         ,1    as flag
      12.     from hdw_tmp_dev.csdn_user_login_time_detail
      13.     union all
      14.     select
      15.         user_id
      16.         ,end_time
      17.         ,-1    as flag
      18.     from hdw_tmp_dev.csdn_user_login_time_detail
      19. ) as t1
      复制代码
    • 效果展示


  • SQL循环

    • 建表+初始化
      1. create table hdw_tmp_dev.cycle_1 as
      2. select '1011' as a
      3. union all
      4. select '0101' as a
      复制代码
    • SQL逻辑
      1. select  
      2.         a,
      3.         concat_ws(",",collect_list(cast(index  as  string)))  as  res
      4. from  (
      5.         select  
      6.                 a,
      7.                 index+1  as  index,
      8.                 chr
      9.         from  (
      10.                 select  
      11.                         a,
      12.                         concat_ws(",",substr(a,1,1),substr(a,2,1),substr(a,3,1),substr(a,-1))  str
      13.                 from  hdw_tmp_dev.cycle_1 as t8
      14.         )  tmp1
      15.         lateral  view  posexplode(split(str,","))  t  as  index,chr
      16.         where  chr  =  "1"
      17. )  tmp2
      18. group  by  a;
      复制代码
    • 效果展示


  • 计算中位数

    • 建表+初始化:
      1. create table hdw_tmp_dev.user_sale_date as
      2. select '001' as user_id,'2024-02-01' as sale_date, 100 as amount
      3. union all
      4. select '001' as user_id,'2024-02-02' as sale_date, 200 as amount
      5. union all
      6. select '001' as user_id,'2024-02-03' as sale_date, 300 as amount
      7. union all
      8. select '001' as user_id,'2024-02-04' as sale_date, 400 as amount
      9. union all
      10. select '001' as user_id,'2024-02-05' as sale_date, 500 as amount
      11. union all
      12. select '001' as user_id,'2024-02-06' as sale_date, 600 as amount
      复制代码
    • SQL逻辑
      1. select
      2.     t1.user_id
      3.     ,t1.sale_date
      4.     ,t1.amount
      5.     ,avg(t1.amount)over(partition by t1.user_id) as zhongwenshu
      6. from
      7. (
      8.     select
      9.         user_id
      10.         ,sale_date
      11.         ,amount
      12.         ,row_number()over(partition by user_id order by amount) as rn
      13.     from hdw_tmp_dev.user_sale_date
      14. ) as t1
      15. left join
      16. (
      17.     select
      18.         user_id
      19.         ,count(1) as cnt
      20.     from hdw_tmp_dev.user_sale_date
      21.     group by user_id
      22. ) as t2 on t1.user_id = t2.user_id
      23. where t1.rn in (cnt/2,(cnt+1)/2,cnt/2+1)
      24. --总个数为奇数命中(cnt+1)/2;总个数为偶数命中:cnt/2,cnt/2+1,两数相加求平均值
      复制代码
    • 效果展示


  • 产生连续数值

    • SQL逻辑
      1. --产生1到10的连续数据
      2. select
      3.     start_id + pos  as  id
      4.     ,pos
      5.     ,val
      6. from(
      7.     select
      8.         1   as  start_id,
      9.         10  as  end_id
      10. )  m  lateral  view  posexplode(split(space(end_id - start_id),''))  t  as  pos,  val
      11. --方案二
      12. select
      13.     row_number()  over()  as  id
      14. from   
      15.     (select  split(space(99), '') as  x)  t
      16. lateral  view
      17. explode(x)  ex;
      复制代码
    • 效果展示



免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作!更多信息从访问主页:qidao123.com:ToB企服之家,中国第一个企服评测及商务社交产业平台。
回复

使用道具 举报

0 个回复

倒序浏览

快速回复

您需要登录后才可以回帖 登录 or 立即注册

本版积分规则

慢吞云雾缓吐愁

金牌会员
这个人很懒什么都没写!

标签云

快速回复 返回顶部 返回列表