[Tech-style gossip] A Python visualization dashboard for public-opinion analysis of netizen comments on the "Zhang Tianai" incident

写过一篇 posted on 2022-9-16 17:21:15


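Contents

[*]1. Background of the incident
[*]2. 微热点 (Weiredian) analysis
[*]2. Self-built Python public-opinion analysis

[*]2.1 Python crawler
[*]2.2 Visualization dashboard

[*]2.2.1 Main title
[*]2.2.2 Word cloud
[*]2.2.3 Bar chart
[*]2.2.4 Pie chart (rose chart)
[*]2.2.5 Map

[*]3. Demo video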

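1. Background of the incident

Hi everyone, I'm 马哥python说.
On 2022-08-25 the actress Zhang Tianai (张天爱) released an audio clip online with the caption: "A repeat offender. I hope all girls keep their eyes open."
https://upload-images.jianshu.io/upload_images/28008898-53f0b45d2ac21057.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
It has been viewed 250 million times so far and shot onto the trending list almost immediately.
2. 微热点 (Weiredian) analysis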

The data below comes from 微热点.
According to this public-opinion analysis site, the heat index for "Zhang Tianai" peaked at 92.56 at 22:00 on August 25.
https://upload-images.jianshu.io/upload_images/28008898-80d2dc8afffd1094.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
Overall heat of "Zhang Tianai" across the web:
https://upload-images.jianshu.io/upload_images/28008898-0eff0b58d55cb462.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
Evaluation metrics for "Zhang Tianai" in online media:
https://upload-images.jianshu.io/upload_images/28008898-545992c93e7b9b6a.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
Keyword analysis for "Zhang Tianai":
https://upload-images.jianshu.io/upload_images/28008898-955e90d50f682d7f.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
Regional analysis for "Zhang Tianai":
https://upload-images.jianshu.io/upload_images/28008898-70064202170bd468.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
2. Self-built Python public-opinion analysis

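2.1 Python crawler

First, find the id in the URL of the post.
The value of the id parameter in the target link is the id we need:
(see the original post)
Plug that id into my Python crawler. Part of the crawler code is shown below.
The key logic is how max_id is handled:
(see the original post)
For the first page, do not pass the max_id parameter.
For every later page, pass max_id, taking its value from the previous page's r.json()['data']['max_id'].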
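The max_id rule described above can be sketched as a small paging loop. This is only an illustration, not the author's code: `build_params`, `iter_comment_pages` and the injected `fetch` callable are hypothetical names, and treating a falsy max_id as "no more pages" is an assumption.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple


def build_params(post_id: str, max_id: Optional[int] = None) -> Dict[str, str]:
    """Build the query parameters for one page of comments."""
    params = {"id": post_id}
    if max_id is not None:  # first page: leave max_id out entirely
        params["max_id"] = str(max_id)
    return params


def iter_comment_pages(
    post_id: str,
    fetch: Callable[[Dict[str, str]], dict],
    max_pages: int = 3,
) -> Iterator[Tuple[int, List[dict]]]:
    """Yield (page_number, comments), feeding each page's max_id into the next request."""
    max_id: Optional[int] = None
    for page in range(1, max_pages + 1):
        payload = fetch(build_params(post_id, max_id))
        yield page, payload["data"]["data"]
        max_id = payload["data"]["max_id"]  # value to send with the next page
        if not max_id:  # assumption: 0 or None means there is no next page
            break
```

In real use, `fetch` could wrap the request shown in the post, for example `lambda p: requests.get(url, params=p, headers=headers).json()`, where `url` and `headers` are whatever the crawler already defines.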
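First, send a request to the page:
import requests

r = requests.get(url, headers=headers)  # send the request
print(r.status_code)  # check the response status code
print(r.json())  # inspect the response body
Next is the logic for parsing the data: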
datas = r.json()['data']['data']
for data in datas:
    page_list.append(page)
    id_list.append(data['id'])
    dr = re.compile(r'<[^>]+>', re.S)  # regex that strips HTML tags from the comment
    text2 = dr.sub('', data['text'])
    text_list.append(text2)  # comment text
    time_list.append(trans_time(v_str=data['created_at']))  # comment time
    like_count_list.append(data['like_count'])  # number of likes on the comment
    source_list.append(data['source'])  # commenter's IP location
    user_name_list.append(data['user']['screen_name'])  # commenter's name
    user_id_list.append(data['user']['id'])  # commenter's id
    user_gender_list.append(tran_gender(data['user']['gender']))  # commenter's gender
    follow_count_list.append(data['user']['follow_count'])  # number of accounts the commenter follows
    followers_count_list.append(data['user']['followers_count'])  # commenter's follower count
Finally, the logic for saving the data:
df = pd.DataFrame(
    {
        'id': * len(time_list),
        '评论页码': page_list,
        '评论id': id_list,
        '评论时间': time_list,
        '评论点赞数': like_count_list,
        '评论者IP归属地': source_list,
        '评论者姓名': user_name_list,
        '评论者id': user_id_list,
        '评论者性别': user_gender_list,
        '评论者关注数': follow_count_list,
        '评论者粉丝数': followers_count_list,
        '评论内容': text_list,
    }
)
if os.path.exists(v_comment_file):  # the file already exists: do not write the header again
    header = False
else:  # otherwise write the CSV header
    header = True
# save to the CSV file
df.to_csv(v_comment_file, mode='a+', index=False, header=header, encoding='utf_8_sig')
print('结果保存成功:{}'.format(v_comment_file))
To keep this short, I won't go into the request headers, cookies, the page loop, data cleaning and other details.
Here is the final data:
https://upload-images.jianshu.io/upload_images/28008898-24b10220c9fc48d3.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
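The parsing code calls trans_time and tran_gender without showing them. The sketch below is one possible shape for those helpers, not the author's implementation. The timestamp format ("Thu Aug 25 22:10:11 +0800 2022"), the gender codes 'm' and 'f', and the returned labels are all assumptions. `clean_text` is a hypothetical wrapper around the same tag-stripping regex used in the post.

```python
import re
from datetime import datetime

TAG_RE = re.compile(r"<[^>]+>", re.S)  # same pattern as in the parsing code


def clean_text(raw_html: str) -> str:
    """Strip HTML tags from a comment body."""
    return TAG_RE.sub("", raw_html)


def trans_time(v_str: str) -> str:
    """Convert a timestamp to 'YYYY-MM-DD HH:MM:SS'.

    Assumed input format: 'Thu Aug 25 22:10:11 +0800 2022'.
    """
    dt = datetime.strptime(v_str, "%a %b %d %H:%M:%S %z %Y")
    return dt.strftime("%Y-%m-%d %H:%M:%S")


def tran_gender(gender_tag: str) -> str:
    """Map a gender code to a readable label (assumed codes: 'm' and 'f')."""
    return {"m": "male", "f": "female"}.get(gender_tag, "unknown")
```

Compiling the regex once at module level, rather than inside the loop as in the post, avoids repeating the same call for every comment.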
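2.2 Visualization dashboard

First, here is the final interactive dashboard:
The dashboard contains 5 charts:

[*]Main title - Line
[*]Word cloud - WordCloud
[*]Bar chart - Bar
[*]Pie chart - Pie
[*]Map - Map
Next, I'll walk through the code for each one.
2.2.1 Main title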

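pyecharts has no chart dedicated to titles, so I repurposed the Line component to render the main title.
from pyecharts import options as opts
from pyecharts.charts import Line
from pyecharts.commons.utils import JsCode

line3 = (
    Line(init_opts=opts.InitOpts(width="1000px",  # width
                                 height="625px",  # height
                                 bg_color={"type": "pattern", "image": JsCode("img"),
                                           "repeat": "repeat", }))  # set the background image
    .add_xaxis([None])  # insert empty placeholder data
    .add_yaxis("", [None])  # insert empty placeholder data
    .set_global_opts(
        title_opts=opts.TitleOpts(title=v_title,
                                  pos_left='center',
                                  title_textstyle_opts=opts.TextStyleOpts(font_size=45,
                                                                          color='#51c2d5',
                                                                          align='left'),
                                  pos_top='top'),
        yaxis_opts=opts.AxisOpts(is_show=False),  # hide the y axis
        xaxis_opts=opts.AxisOpts(is_show=False))  # hide the x axis
)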
# set the background image
line3.add_js_funcs(
    """
    var img = new Image(); img.src = '大屏背景.jpg';
    """
)
line3.render('大标题.html')
print('页面渲染完毕:大标题.html')
The key part here is how the background image is handled. I picked a picture of Zhang Tianai:
https://upload-images.jianshu.io/upload_images/28008898-da3ceab2332497db.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
Then add_js_funcs is used to set this picture as the background of the whole dashboard.
Main title result:
https://upload-images.jianshu.io/upload_images/28008898-ff33cfad5099eb86.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
2.2.2 Word cloud

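First, clean the comment data:
cmt_list = df['评论内容'].values.tolist()  # convert to a list
cmt_list = # data cleaning
cmt_str = ' '.join(cmt_list)  # join into a single string
Then pass the cleaned data to the word-cloud function. Core code: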
wc = WordCloud(init_opts=opts.InitOpts(width=chart_width, height=chart_height, theme=theme_config, chart_id='wc1'))
wc.add(series_name="词汇",
       data_pair=data,
       word_gap=1,
       word_size_range=,
       mask_image='张天爱背景图.png',
       )  # add the data
wc.set_global_opts(
    title_opts=opts.TitleOpts(pos_left='center',
                              title="张天爱评论-词云图",
                              title_textstyle_opts=opts.TextStyleOpts(font_size=20)  # title settings
                              ),
    tooltip_opts=opts.TooltipOpts(is_show=True),  # show tooltips
)
wc.render('张天爱词云图.html')  # generate the HTML file
print('渲染完成:' + '张天爱词云图.html')
Here is the result:
https://upload-images.jianshu.io/upload_images/28008898-72c72c64cfcf991e.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
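The post does not show how the `data` passed as data_pair is built. A word cloud generally takes (word, count) pairs, so a minimal sketch of that step might look like the following. `build_data_pair` is a hypothetical helper. The default whitespace tokenizer and the rule of dropping one-character tokens are assumptions chosen to keep the example dependency-free. For Chinese comments a segmenter such as `jieba.lcut` could be passed as `tokenize` instead.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def build_data_pair(
    comments: Iterable[str],
    tokenize: Callable[[str], List[str]] = str.split,
    stopwords: frozenset = frozenset(),
    top_n: int = 100,
) -> List[Tuple[str, int]]:
    """Count word frequencies and return the top_n (word, count) pairs."""
    counter: Counter = Counter()
    for comment in comments:
        for word in tokenize(comment):
            word = word.strip()
            # assumption: one-character tokens and stopwords carry little meaning
            if len(word) > 1 and word not in stopwords:
                counter[word] += 1
    return counter.most_common(top_n)
```

The returned list has the shape expected by `data_pair`, so something like `data = build_data_pair(cmt_list)` would slot into the word-cloud call shown in the post.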
2.2.3 Bar chart

For the top 10 most frequent words in the comments, draw a bar chart.
Core code:
bar = Bar(
    init_opts=opts.InitOpts(theme=theme_config, width=chart_width, height=chart_height,
                            chart_id='bar_cmt'))  # initialize the bar chart
bar.add_xaxis(x_data)  # add x-axis data
bar.add_yaxis("数量", y_data)  # add y-axis data
bar.reversal_axis()  # swap the axes so the bars are horizontal
bar.set_series_opts(label_opts=opts.LabelOpts(position="right"))  # label position
bar.set_global_opts(
    legend_opts=opts.LegendOpts(pos_left='right'),
    title_opts=opts.TitleOpts(title=v_title, pos_left='center'),  # title
    toolbox_opts=opts.ToolboxOpts(is_show=False, ),  # hide the toolbox
    xaxis_opts=opts.AxisOpts(name="数量", axislabel_opts={"rotate": 0}),  # x-axis name
    yaxis_opts=opts.AxisOpts(name="关键词",
                             axislabel_opts=opts.LabelOpts(font_size=9, rotate=0),  # y-axis name
                             ))
bar.render(v_title + ".html")  # generate the HTML file
print('渲染完成:' + v_title + '.html')
Here is the result:
https://upload-images.jianshu.io/upload_images/28008898-1214abc30de3ce67.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240
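How x_data and y_data are derived is not shown in the post. Assuming a list of (word, count) pairs is already available, one way to get the top 10 is sketched below. `top_n_axes` is a hypothetical helper. It returns the words in ascending order of count because, once the axes are swapped, the category axis is normally drawn from the bottom up, so this ordering should put the most frequent word at the top.

```python
from typing import List, Sequence, Tuple


def top_n_axes(
    pairs: Sequence[Tuple[str, int]], n: int = 10
) -> Tuple[List[str], List[int]]:
    """Split the n most frequent (word, count) pairs into category and value lists."""
    top = sorted(pairs, key=lambda kv: kv[1], reverse=True)[:n]
    top.reverse()  # ascending, so the largest bar ends up at the top after the flip
    x_data = [word for word, _ in top]
    y_data = [count for _, count in top]
    return x_data, y_data
```

With this, `x_data, y_data = top_n_axes(pairs)` would provide the two lists that the bar-chart code passes to add_xaxis and add_yaxis.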
2.2.4 Pie chart (rose chart)

First, run sentiment analysis on the comments with the snownlp library.
for comment in v_cmt_list:
    tag = ''
    sentiments_score = SnowNLP(comment).sentiments
    if sentiments_score < 0.4:  # a sentiment score below 0.4 is classed as negative
        tag = '消极'
        neg_count += 1
    elif 0.4
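The listing above is cut off after the first threshold, so the rest of the rule is not known. The sketch below shows one common way to finish this kind of bucketing. Only the 0.4 cutoff for negative comes from the post. The 0.6 cutoff for positive, the neutral band in between, and the English labels are assumptions. `classify` and `tally` are hypothetical helpers that take plain scores, such as the values returned by `SnowNLP(comment).sentiments`.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def classify(score: float, neg_below: float = 0.4, pos_above: float = 0.6) -> str:
    """Bucket a sentiment score in [0, 1] into a label."""
    if score < neg_below:  # from the post: below 0.4 counts as negative
        return "negative"
    if score > pos_above:  # assumption: above 0.6 counts as positive
        return "positive"
    return "neutral"  # assumption: everything in between is neutral


def tally(scores: Iterable[float]) -> List[Tuple[str, int]]:
    """Count the labels and return (label, count) pairs in a fixed order."""
    counts = Counter(classify(s) for s in scores)
    return [(label, counts.get(label, 0)) for label in ("positive", "neutral", "negative")]
```

The (label, count) pairs are in the shape a pyecharts Pie series takes as its data_pair argument.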
