Scala Exercises

SQL join syntax example

Data:
  order.txt (order_id,user_id,amount)
  order011,u001,300
  order012,u002,200
  order023,u006,100
  order056,u007,300
  order066,u003,500
  order055,u004,300
  order021,u005,300
  order014,u001,100
  order025,u005,300
  order046,u007,30
  order067,u003,340
  order098,u008,310

  user.txt (user_id,user_name,age,name)
  u001,hls,22,fengjie
  u002,wangwu,31,lisi
  u003,zhangyanru,22,tananpengyou
  u004,laocao,26,fengyi
  u005,mengqi,12,nvmengqi
  u006,haolei,38,sb
  u007,wanghongjing,24,wife
  u009,wanghongjing,24,wife

  Each joined row should look like: order011  u001   300  hls  22   fengjie
Code example:
  package com.doit.day03

  import scala.io.Source

  object JoinDemo {
    def main(args: Array[String]): Unit = {
      // user.txt line format: user_id,user_name,age,name  e.g. u001,hls,22,fengjie
      val bs1 = Source.fromFile("D:\\develop\\ideaWorkSpace\\myself\\study\\scalaDemo\\data\\user.txt")
      // order.txt line format: order_id,user_id,amount  e.g. order011,u001,300
      val bs2 = Source.fromFile("D:\\develop\\ideaWorkSpace\\myself\\study\\scalaDemo\\data\\order.txt")

      // Read both files into lists so they can be traversed more than once
      // (the Iterator returned by getLines() is exhausted after one pass), then close the files.
      val users: List[String] = try bs1.getLines().toList finally bs1.close()
      val orders: List[String] = try bs2.getLines().toList finally bs2.close()

      val userTuple: List[(String, String, String, String)] = users.map(line => {
        val arr: Array[String] = line.split(",", -1)
        // user_id, user_name, age, name
        (arr(0), arr(1), arr(2), arr(3))
      })
      val orderTuple: List[(String, String, String)] = orders.map(line => {
        val arr: Array[String] = line.split(",", -1)
        // order_id, user_id, amount
        (arr(0), arr(1), arr(2))
      })

      // Left join: orders left join users on user_id
      println("============left join================")
      // Convert the user data to a map keyed by user_id
      val userMap: Map[String, (String, String, String, String)] = userTuple.map(u => (u._1, u)).toMap
      // Walk over every order and attach the user fields; orders without a user get "null" placeholders
      val r = orderTuple.map(o => {
        val user = userMap.getOrElse(o._2, ("null", "null", "null", "null"))
        (o._1, o._2, o._3, user._2, user._3, user._4)
      })
      // Print the result
      r.sortBy(_._1).foreach(println)

      // Inner join: the join condition is user_id = user_id
      println("============join================")
      for (user <- userTuple) {
        for (order <- orderTuple) {
          if (user._1 == order._2) {
            println((order._1, user._1, order._3, user._2, user._3, user._4))
          }
        }
      }
    }
  }
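The nested loop in the join compares every user with every order. As a rough sketch of an alternative, the users can be indexed by user_id once, and both join flavours then become a single pass over the orders. This is only an illustration on a few in-memory rows taken from the data above: the names `JoinSketch`, `User`, `Order`, `innerJoin`, `leftJoin`, `sampleUsers` and `sampleOrders` are hypothetical and not part of the original post.

```scala
object JoinSketch {
  // Hypothetical row types; the field names follow the column comments in the post.
  final case class User(userId: String, userName: String, age: String, name: String)
  final case class Order(orderId: String, userId: String, amount: String)

  // Inner join: orders with no matching user are dropped.
  def innerJoin(orders: List[Order], users: List[User]): List[(Order, User)] = {
    val byId: Map[String, User] = users.map(u => (u.userId, u)).toMap
    orders.flatMap(o => byId.get(o.userId).map(u => (o, u)))
  }

  // Left join: every order is kept; the user side is None when there is no match.
  def leftJoin(orders: List[Order], users: List[User]): List[(Order, Option[User])] = {
    val byId: Map[String, User] = users.map(u => (u.userId, u)).toMap
    orders.map(o => (o, byId.get(o.userId)))
  }

  // A few rows copied from user.txt and order.txt.
  val sampleUsers: List[User] = List(
    User("u001", "hls", "22", "fengjie"),
    User("u002", "wangwu", "31", "lisi"),
    User("u009", "wanghongjing", "24", "wife")
  )
  val sampleOrders: List[Order] = List(
    Order("order011", "u001", "300"),
    Order("order012", "u002", "200"),
    Order("order098", "u008", "310")
  )

  def main(args: Array[String]): Unit = {
    // Print the inner-join rows in the column order of the expected result.
    innerJoin(sampleOrders, sampleUsers).foreach { case (o, u) =>
      println(s"${o.orderId}\t${o.userId}\t${o.amount}\t${u.userName}\t${u.age}\t${u.name}")
    }
  }
}
```

In this sample, order098 belongs to u008, who has no row in the user data, so it disappears from the inner join and shows up with an empty user side in the left join. Using `Option` instead of "null" strings is a design choice of this sketch, not something the post prescribes.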
PV and UV example

Data:
  site1,user1,2018-03-01 02:12:22
  site1,user2,2018-03-05 04:12:22
  site1,user2,2018-03-05 04:13:22
  site1,user2,2018-03-05 04:14:22
  site1,user2,2018-03-05 04:15:22
  site4,user7,
  site1,user2,2018-03-05 05:15:22
  site1,user2,2018-03-05 08:15:22
  site1,user3,2018-03-05 04:15:22
  site1,user4,2018-03-05 05:15:22
  site1,user3,2018-03-07 11:12:22
  site1,user3,2018-03-08 11:12:22
  site2,user4,2018-03-07 15:12:22
  site3,user5,2018-03-07 08:12:22
  site3,user6,2018-03-05 08:12:22
  site1,user1,2018-03-08 11:12:22
  site1,,2018-03-08 11:12:22
  site2,user2,2018-03-07 15:12:22
  site3,user5,2018-03-07 08:12:22
  site3,user5,2018-03-07 18:12:22
  site3,user6,2018-03-05 08:12:22
  site4,user7,2018-03-03 10:12:22
  site2,,2018-03-08 11:12:22
  site3,user5,2018-03-07 08:12:22
  site3,user6,2018-03-05 08:12:22
  site4,user5,2018-03-03 10:12:22
  site4,user7,2018-02-20 11:12:22
Code:
  package com.doit.day03

  import scala.io.{BufferedSource, Source}

  /**
   * Requirement: compute the pv and uv of each site for each day
   * pv: number of page views
   * uv: number of unique visitors
   */
  object PVUVDemo {
    def main(args: Array[String]): Unit = {
      val source: BufferedSource = Source.fromFile("D:\\develop\\ideaWorkSpace\\myself\\study\\scalaDemo\\data\\pvuv.txt")
      // Read all lines, then close the file
      val list: List[String] = try source.getLines().toList finally source.close()
      // Filter out dirty data: keep only lines with exactly three non-empty fields.
      // split(",", -1) keeps trailing empty fields, so "site4,user7," still yields three fields.
      val filtered: List[String] = list.filter(line => {
        val arr: Array[String] = line.split(",", -1)
        arr.length == 3 && !arr.exists(_.isEmpty)
      })
      val events: List[(String, String, String)] = filtered.map(line => {
        val arr: Array[String] = line.split(",")
        // site1,user1,2018-03-01 02:12:22 -> the first 10 characters of the timestamp are the date
        val date: String = arr(2).substring(0, 10)
        (arr(0), arr(1), date)
      })
      // Key every event by (date, site); the value is the user
      val tuples: List[((String, String), String)] = events.map(tp => {
        ((tp._3, tp._1), tp._2)
      })
      val grouped: Map[(String, String), List[((String, String), String)]] = tuples.groupBy(_._1)
      // pv: how many times the site was viewed on that day
      val pv: Map[(String, String), Int] = grouped.map(tp => (tp._1, tp._2.size))
      // uv: how many distinct users viewed the site on that day
      val uv: Map[(String, String), Int] = grouped.map(tp => (tp._1, tp._2.map(_._2).distinct.size))
      println("============pv================")
      pv.foreach(println)
      println("============uv================")
      uv.foreach(println)
    }
  }
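To check the filtering and counting without the file on disk, the same steps can be sketched as one function over in-memory lines. This is a minimal illustration, assuming the three-field `site,user,timestamp` layout shown in the data; the names `PvUvSketch`, `pvUv` and `sampleLines` are hypothetical and not part of the original post.

```scala
object PvUvSketch {
  // Returns (date, site) -> (pv, uv), counting only well-formed lines.
  def pvUv(lines: List[String]): Map[(String, String), (Int, Int)] = {
    val events: List[((String, String), String)] = lines
      .map(_.split(",", -1))                                     // -1 keeps trailing empty fields
      .filter(arr => arr.length == 3 && !arr.exists(_.isEmpty))  // drop dirty lines
      .map(arr => ((arr(2).take(10), arr(0)), arr(1)))           // ((date, site), user)

    events.groupBy(_._1).map { case (key, rows) =>
      val users = rows.map(_._2)
      (key, (users.size, users.distinct.size))                   // (pv, uv)
    }
  }

  // A few lines copied from the data above, including two dirty ones.
  val sampleLines: List[String] = List(
    "site1,user2,2018-03-05 04:12:22",
    "site1,user2,2018-03-05 04:13:22",
    "site1,user3,2018-03-05 04:15:22",
    "site4,user7,",
    "site1,,2018-03-08 11:12:22",
    "site3,user5,2018-03-07 08:12:22"
  )

  def main(args: Array[String]): Unit = {
    pvUv(sampleLines).foreach(println)
  }
}
```

On these sample lines, site1 on 2018-03-05 has three views from two distinct users, and the two lines with an empty field are dropped before counting.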

何小豆儿在此