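I. Installing CentOS 7 in a virtual machine and setting up a shared folder
II. Complete walkthrough of a pseudo-distributed Hadoop setup on CentOS 7
III. Working with HDFS from Python on the local machine: setup and common problems
IV. Setting up MapReduce
V. Setting up mapper-reducer programming
VI. Installing the Hive data warehouse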
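I. Environment setup
1. Boot the virtual machine and start Hadoop
Make sure the Hadoop web page opens in a browser.
2. Edit the hosts file on the local machine
Run ifconfig in the virtual machine to find its current IP address.
Open C:\Windows\System32\drivers\etc\ and edit the hosts file, appending the line 192.168.137.134 hadoop4 at the end.
If the file cannot be edited:
right-click it -> Properties -> Security -> select the Users entry -> Edit -> check Modify.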
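To confirm that the new entry was saved correctly, the hosts file can be parsed and checked. The following is a minimal sketch using only the standard library; the function names parse_hosts and has_entry are hypothetical helpers written for this example, and keeping the first mapping seen for a hostname is an assumption of the sketch.

```python
def parse_hosts(text):
    """Return a dict mapping each hostname (lower-cased) to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Assumption of this sketch: keep the first mapping seen for a name.
            mapping.setdefault(name.lower(), ip)
    return mapping


def has_entry(text, ip, hostname):
    """Return True if hostname is mapped to ip in the given hosts-file text."""
    return parse_hosts(text).get(hostname.lower()) == ip
```

On the Windows machine it could be called as has_entry(open(r"C:\Windows\System32\drivers\etc\hosts").read(), "192.168.137.134", "hadoop4").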
3. Test connectivity with ping
Open any folder in File Explorer -> File (top-left) -> Open Windows PowerShell,
then ping the hostname added above (hadoop4).
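A successful ping only shows that the host is reachable; it does not show that the NameNode web port is accepting connections. The sketch below tries a TCP connection instead. The function name port_open is hypothetical, and the host and port in the usage note (hadoop4, 50070) are the values used in this article.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and hostname resolution failures.
        return False
```

For the setup in this article, port_open("hadoop4", 50070) should return True once Hadoop is running and the hosts entry is in place.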
4. Install the hdfs package
Press Win+R, type cmd and press Enter (the package can be installed into any Python or conda environment)
- pip install hdfs -i https://pypi.douban.com/simple/
II. Working with HDFS from Python
1. Establish a connection
- PS D:\software\Bandicam\video> python
- Python 3.9.7 (default, Sep 16 2021, 16:59:28) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
- Warning:
- This Python interpreter is in a conda environment, but the environment has
- not been activated. Libraries may fail to load. To activate this environment
- please see https://conda.io/activation
- Type "help", "copyright", "credits" or "license" for more information.
- >>> from hdfs.client import Client
- >>> link=Client('http://hadoop4:50070')
- >>> link.list('/')
- []
2. Create a directory
Creating a directory fails with a permission error:
- >>> link.makedirs('/test')
- Traceback (most recent call last):
- File "<stdin>", line 1, in <module>
- File "D:\Anaconda3\install_position\lib\site-packages\hdfs\client.py", line 1036, in makedirs
- self._mkdirs(hdfs_path, permission=permission)
- File "D:\Anaconda3\install_position\lib\site-packages\hdfs\client.py", line 118, in api_handler
- raise err
- hdfs.util.HdfsError: Permission denied: user=dr.who, access=WRITE, inode="/":huangqifa:supergroup:drwxr-xr-x
Fix:
On the virtual machine, run hadoop fs -chmod -R 777 /
- >>> link.makedirs('/test')
- >>> link.list('/')
- ['test']
- >>>
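The error names user=dr.who because the request carried no user name, so the NameNode fell back to its default static web user. The hdfs client talks to Hadoop through the WebHDFS REST interface, where the caller is identified by the user.name query parameter. The sketch below only builds such a URL to show where the user name goes; webhdfs_url is a hypothetical helper written for this example, and huangqifa is the directory owner shown in the error message above.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base, path, op, user=None, **params):
    """Build a WebHDFS REST URL of the form <base>/webhdfs/v1<path>?op=...&user.name=..."""
    query = {"op": op, **params}
    if user is not None:
        # Without user.name the request is treated as coming from the static web user.
        query["user.name"] = user
    return f"{base.rstrip('/')}/webhdfs/v1{quote(path)}?{urlencode(query)}"


# The hdfs package can send this parameter for you, for example:
#   from hdfs import InsecureClient
#   link = InsecureClient('http://hadoop4:50070', user='huangqifa')
```

Connecting as the owning user this way is an alternative to opening up permissions on the whole file system, which may be preferable outside a throwaway test VM.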
Creating a directory can also fail because the NameNode is in safe mode:
- >>> link.makedirs('/test')
- Traceback (most recent call last):
- File "<stdin>", line 1, in <module>
- File "D:\Anaconda3\install_position\lib\site-packages\hdfs\client.py", line 1036, in makedirs
- self._mkdirs(hdfs_path, permission=permission)
- File "D:\Anaconda3\install_position\lib\site-packages\hdfs\client.py", line 118, in api_handler
- raise err
- hdfs.util.HdfsError: Cannot create directory /test. Name node is in safe mode.
- Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
Fix:
On the virtual machine, run:
- hdfs dfsadmin -safemode leave
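The two failures above can be told apart by their message text. As a rough sketch, the helper below maps each message to the fix command quoted in this article; the name diagnose and the labels "permission" and "safe_mode" are invented for this example, and the matching relies on the exact wording shown in the tracebacks above.

```python
import re

# Fix commands are the ones quoted in this article; the keys are labels made up for this sketch.
FIXES = {
    "permission": "hadoop fs -chmod -R 777 /",
    "safe_mode": "hdfs dfsadmin -safemode leave",
}


def diagnose(message):
    """Return (kind, suggested_command) for a recognised error message, else (None, None)."""
    if re.search(r"Permission denied: user=\S+, access=WRITE", message):
        return "permission", FIXES["permission"]
    if "Name node is in safe mode" in message:
        return "safe_mode", FIXES["safe_mode"]
    return None, None
```

As the safe-mode message itself notes, if the NameNode is low on resources it will return to safe mode right after leaving it, so space may need to be freed on the virtual machine first.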
3. Upload a file (jps on the virtual machine must show both the DataNode and NameNode processes running)
- >>> link.makedirs('/test')
- >>> link.list('/')
- ['test']
- >>> link.upload('/test','C:/readme.txt')
- '/test/readme.txt'
- >>> link.list('/test')
- ['readme.txt']
4. Write a file
- >>> link.write('/test/test01.txt',"hello world")
- >>> link.list('/test')
- ['readme.txt', 'test01.txt']
5. Download a file or directory
- >>> link.download('/test/test01.txt','D:/')
- 'D:\\test01.txt'
- >>>
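The create, write, and list steps above can be combined into one small function that also reads the file back to check its content. This is a sketch, not the only way to do it: round_trip is a hypothetical name, and client stands for a connected client object such as the link variable created earlier. It uses the makedirs, write, list, and read methods of the hdfs package's client.

```python
def round_trip(client, hdfs_dir, name, text):
    """Write text to hdfs_dir/name, confirm it is listed, and return what is read back."""
    client.makedirs(hdfs_dir)  # create the directory, recursively if needed
    path = f"{hdfs_dir.rstrip('/')}/{name}"
    # overwrite=True lets the function be re-run without failing on an existing file.
    client.write(path, data=text, overwrite=True, encoding="utf-8")
    if name not in client.list(hdfs_dir):
        raise RuntimeError(f"{name} was not found in {hdfs_dir} after writing")
    with client.read(path, encoding="utf-8") as reader:
        return reader.read()
```

With the connection from this article, round_trip(link, '/test', 'test01.txt', 'hello world') would be expected to return the same string that was written.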
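III. Possible problems
Problem: the physical machine cannot ping the virtual machine's IP, or the Hadoop web page at 192.168… does not open from the physical machine.
Fix:
Reset the NAT network in the virtual machine's network settings:
virtual machine software -> Edit (top-left) -> Virtual Network Editor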