干翻全岛蛙蛙 posted on 2024-6-13 20:24:59

Operating HDFS with Python from the local machine: setup and common problems

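I. Installing CentOS 7 in a virtual machine and setting up a shared folder
II. Complete walkthrough: Hadoop pseudo-distributed setup on CentOS 7
III. Operating HDFS with Python from the local machine: setup and common problems
IV. MapReduce setup
V. Mapper-reducer programming setup
VI. Installing the Hive data warehouse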


I. Environment setup

1. Start the virtual machine and start Hadoop

Make sure the Hadoop web page opens in a browser:
https://img-blog.csdnimg.cn/421313a035a541f6b9b158cce9dc61f9.png
2. Edit the hosts file on the local machine

Run ifconfig on the virtual machine to find its current IP:
https://img-blog.csdnimg.cn/ff1022c6c2eb42b180bf8589d6aa4c11.png
Open C:\Windows\System32\drivers\etc\, edit the hosts file, and append the line 192.168.137.134 hadoop4 at the end (use the IP that ifconfig reported).
https://img-blog.csdnimg.cn/70c5206464dd4e91800c341816b8b00c.png
If the file cannot be edited:
right-click it -> Properties -> Security -> select the Users entry -> Edit -> tick Modify.
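To confirm that the new entry was saved, the hosts file can be read back and searched for the host name. The sketch below is a minimal illustration; the function name lookup_host is hypothetical, and the path in the commented usage is the one given in this post.

```python
def lookup_host(hosts_text, hostname):
    """Return the IP mapped to hostname in hosts-file text, or None."""
    for raw_line in hosts_text.splitlines():
        # Everything after '#' is a comment in the hosts format.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        # Host names are case-insensitive.
        if hostname.lower() in (name.lower() for name in names):
            return ip
    return None


# Usage on the local Windows machine (path as given in this post):
# with open(r"C:\Windows\System32\drivers\etc\hosts", errors="replace") as f:
#     print(lookup_host(f.read(), "hadoop4"))
```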
3. Test connectivity with ping

Open any folder in File Explorer -> File (top left) -> Open Windows PowerShell ->
run the ping test
https://img-blog.csdnimg.cn/58ecdc9309c64c2abbc08618e9c61007.png
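A successful ping only shows that the virtual machine is reachable; it does not show that the NameNode web port is accepting connections. A minimal sketch of a TCP check from the local machine follows. The helper name port_is_open is hypothetical, and port 50070 is the one used in the connection step later in this post.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Usage (host name and port taken from this post):
# print(port_is_open("hadoop4", 50070))
```

If this returns False while ping works, the likely causes are that Hadoop is not running or that a firewall on the virtual machine blocks the port.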
4. Install the hdfs package

Press Win+R, type cmd and press Enter (any Python or conda environment will do), then run:
pip install hdfs -i https://pypi.douban.com/simple/
https://img-blog.csdnimg.cn/a3d08384287d4d4f9437709adfc3a8a9.png
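Because the package may land in a different Python or conda environment than the one used later, it can help to check from inside the interpreter that the module is importable. This is a small sketch using only the standard library; the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


# Usage:
# print(is_installed("hdfs"))
```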
II. Operating HDFS with Python

1. Connect

PS D:\software\Bandicam\video> python
Python 3.9.7 (default, Sep 16 2021, 16:59:28) :: Anaconda, Inc. on win32

Warning:
This Python interpreter is in a conda environment, but the environment has
not been activated.  Libraries may fail to load.  To activate this environment
please see https://conda.io/activation

Type "help", "copyright", "credits" or "license" for more information.
>>> from hdfs.client import Client
>>> link=Client('http://hadoop4:50070')
>>> link.list('/')
[]
2. Create a directory

Creating a directory fails with a permission error:
>>> link.makedirs('/test')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Anaconda3\install_position\lib\site-packages\hdfs\client.py", line 1036, in makedirs
    self._mkdirs(hdfs_path, permission=permission)
  File "D:\Anaconda3\install_position\lib\site-packages\hdfs\client.py", line 118, in api_handler
    raise err
hdfs.util.HdfsError: Permission denied: user=dr.who, access=WRITE, inode="/":huangqifa:supergroup:drwxr-xr-x
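Fix:
On the virtual machine, run hadoop fs -chmod -R 777 / (this makes every path world-writable, so it is only appropriate on a test cluster).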
https://img-blog.csdnimg.cn/49245447bb1248bda7679a4a75e842b0.png
>>> link.makedirs('/test')
>>> link.list('/')
['test']
>>>
https://img-blog.csdnimg.cn/64d5423e87a8433699cf36f2b21a8ede.png
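The error names user=dr.who, the default user Hadoop assigns to HTTP requests that carry no user name. On a cluster without Kerberos, WebHDFS accepts a user.name query parameter, and the hdfs package exposes it through InsecureClient(url, user=...). This is an alternative to loosening permissions, not what the post itself does. The sketch below builds the underlying REST URL to show what changes; the helper webhdfs_url is hypothetical, and the user huangqifa is the directory owner shown in the error message.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base_url, hdfs_path, op, user=None):
    """Build a WebHDFS REST URL, optionally naming the acting user."""
    params = {"op": op}
    if user is not None:
        params["user.name"] = user
    return "{}/webhdfs/v1{}?{}".format(
        base_url.rstrip("/"), quote(hdfs_path), urlencode(params)
    )


print(webhdfs_url("http://hadoop4:50070", "/test", "MKDIRS", user="huangqifa"))

# With the hdfs package, the same idea would look like this:
# from hdfs import InsecureClient
# link = InsecureClient("http://hadoop4:50070", user="huangqifa")
# link.makedirs("/test")
```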
Creating a directory fails because the NameNode is in safe mode:
>>> link.makedirs('/test')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Anaconda3\install_position\lib\site-packages\hdfs\client.py", line 1036, in makedirs
    self._mkdirs(hdfs_path, permission=permission)
  File "D:\Anaconda3\install_position\lib\site-packages\hdfs\client.py", line 118, in api_handler
    raise err
hdfs.util.HdfsError: Cannot create directory /test. Name node is in safe mode.
Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
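Fix:
Free up disk space on the virtual machine if it is low (otherwise, as the message notes, the NameNode returns to safe mode immediately), then run on the virtual machine: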
hdfs dfsadmin -safemode leave
https://img-blog.csdnimg.cn/fc804c903cc64678a95f1c6cb2482931.png
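Whether the NameNode has actually left safe mode can also be checked from the local machine. The sketch below assumes that the NameNode's JMX endpoint on the same web port exposes a NameNodeInfo bean with a Safemode attribute that is an empty string when safe mode is off. Treat that as an assumption to verify against your Hadoop version. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed JMX query for the NameNodeInfo bean.
JMX_QUERY = "/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo"


def safemode_message(jmx_payload):
    """Return the safe-mode message from a parsed JMX response ('' means off)."""
    beans = jmx_payload.get("beans", [])
    if not beans:
        raise ValueError("no NameNodeInfo bean in JMX response")
    return beans[0].get("Safemode", "")


def fetch_safemode(base_url="http://hadoop4:50070", timeout=5):
    """Query the NameNode over HTTP and return its safe-mode message."""
    with urlopen(base_url + JMX_QUERY, timeout=timeout) as resp:
        return safemode_message(json.load(resp))


# Usage:
# message = fetch_safemode()
# print("safe mode is off" if not message else message)
```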
3. Upload a file (jps must show both the DataNode and the NameNode running)

>>> link.makedirs('/test')
>>> link.list('/')
['test']
>>> link.upload('/test','C:/readme.txt')
'/test/readme.txt'
>>> link.list('/test')
['readme.txt']
https://img-blog.csdnimg.cn/9aeb9968fb53489b8900dd077b43d0c7.png
https://img-blog.csdnimg.cn/e6ff67ec73584ffdb8561cd842dfec43.png
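The upload steps can be wrapped in a small helper that creates the target directory only when it is missing and optionally replaces an existing file. This is a sketch, assuming client is an hdfs client object such as the link created earlier; the helper name upload_file is hypothetical. It relies on status(path, strict=False) returning None for a missing path and on upload accepting an overwrite keyword.

```python
import os


def upload_file(client, hdfs_dir, local_path, overwrite=False):
    """Upload local_path into hdfs_dir, creating the directory if needed."""
    if not os.path.isfile(local_path):
        raise FileNotFoundError(local_path)
    # strict=False makes status() return None instead of raising for a missing path.
    if client.status(hdfs_dir, strict=False) is None:
        client.makedirs(hdfs_dir)
    return client.upload(hdfs_dir, local_path, overwrite=overwrite)


# Usage with the connection from this post:
# upload_file(link, "/test", "C:/readme.txt", overwrite=True)
```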
4. Write a file

>>> link.write('/test/test01.txt',"hello world")
>>> link.list('/test')
['readme.txt', 'test01.txt']
https://img-blog.csdnimg.cn/69c76f869c9e4f6398de284fe4872920.png
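By default write refuses to replace an existing file, so running the same call twice raises an error. The sketch below writes text with an explicit encoding and reads it back to verify the content. It assumes client is an hdfs client object such as link; the helper names write_text and read_text are hypothetical.

```python
def write_text(client, hdfs_path, text, overwrite=True):
    """Write a string to an HDFS file as UTF-8, replacing it by default."""
    client.write(hdfs_path, data=text, overwrite=overwrite, encoding="utf-8")


def read_text(client, hdfs_path):
    """Read an entire HDFS file and return it as a string."""
    # client.read() is a context manager that yields a file-like reader.
    with client.read(hdfs_path, encoding="utf-8") as reader:
        return reader.read()


# Usage with the connection from this post:
# write_text(link, "/test/test01.txt", "hello world")
# print(read_text(link, "/test/test01.txt"))
```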
5. Download a file or directory

>>> link.download('/test/test01.txt','D:/')
'D:\\test01.txt'
>>>
https://img-blog.csdnimg.cn/deefcbdbf39348d497303dbb5232f5e8.png
https://img-blog.csdnimg.cn/6918cba8b6b04d17a43263ba5306e8b9.png
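download also refuses to replace an existing local file unless overwrite=True is passed, so a second run of the call above fails. A minimal sketch that creates the local target directory first and allows overwriting follows. It assumes client is an hdfs client object such as link; the helper name download_to is hypothetical.

```python
import os


def download_to(client, hdfs_path, local_dir, overwrite=True):
    """Download an HDFS file or directory into local_dir and return the local path."""
    # Create the destination directory so the download lands inside it.
    os.makedirs(local_dir, exist_ok=True)
    return client.download(hdfs_path, local_dir, overwrite=overwrite)


# Usage with the connection from this post:
# print(download_to(link, "/test/test01.txt", "D:/"))
```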
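III. Possible problems

Problem: the physical machine cannot ping the virtual machine's IP, or the Hadoop web page at 192.168… does not open on the physical machine.
Fix: reset the NAT network in the virtual machine's network settings.
In the virtual machine software: Edit (top left) -> Virtual Network Editor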
https://img-blog.csdnimg.cn/034fc680870541c6b2d0a1c61ec2985a.png
https://img-blog.csdnimg.cn/3b81ba3add7041dd80fd81b526230e53.png
