Deploying Hadoop HDFS with Docker

First, create a network:

docker network create hadoop
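
Running docker network create a second time fails if the network already exists. A minimal sketch of an idempotent variant follows; ensure_network is a hypothetical helper name, not part of Docker, and it assumes docker network inspect exits with a non-zero status for an unknown network.

```shell
# Create the Docker network only if it does not exist yet.
# ensure_network is a hypothetical helper name, not a Docker command.
ensure_network() {
  local name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    # inspect succeeded, so the network is already there
    echo "network $name already exists"
  else
    docker network create "$name"
  fi
}

# Usage: ensure_network hadoop
```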

Create a directory named hadoop on the host and save the following YAML in it as docker-compose.yml:

version: "3"
services:
  namenode:
    image: uhopper/hadoop-namenode
    hostname: namenode1
    container_name: namenode1
    # Must match the network name, so that other hosts can be discovered automatically (e.g. datanode1.hadoop)
    domainname: hadoop
    ports:
      - "50070:50070"
    volumes:
      - ./nn:/hadoop/dfs/name
    environment:
      - HDFS_CONF_dfs_replication=1
      - CLUSTER_NAME=ns1
  
  datanode1:
    image: uhopper/hadoop-datanode
    hostname: datanode1
    container_name: datanode1
    domainname: hadoop
    volumes:
      - ./dn1:/hadoop/dfs/data
    environment:
      - HDFS_CONF_dfs_replication=1
      - CLUSTER_NAME=ns1
      - CORE_CONF_fs_defaultFS=hdfs://namenode1:8020
      

networks:
  default:
    # Use an external network (without this, Compose automatically creates a network named ${directory name}_${name} at startup)
    external:
      name: hadoop

Start the services with docker-compose up -d. You can then open http://127.0.0.1:50070 to view the NameNode page. You can also open a shell inside the container:

docker exec -it namenode1 bash

Try running hdfs dfs -put /etc/issue / there. If it fails with an error like this:

17/12/24 08:39:15 WARN hdfs.DFSClient: DataStreamer Exception
java.nio.channels.UnresolvedAddressException
        at sun.nio.ch.Net.checkAddress(Net.java:101)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1537)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1313)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)

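then the client could not resolve the DataNode's hostname. Check that the network configuration is correct and that ping datanode1.hadoop returns an IP address.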
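
The name resolution check can also be scripted. The sketch below assumes getent is available in the container image, which has not been verified for the uhopper images; check_host is a hypothetical helper name. Paste it into the shell opened with docker exec.

```shell
# check_host is a hypothetical helper: print the address a hostname
# resolves to, or report the failure and return non-zero.
check_host() {
  local host="$1" addr
  if addr=$(getent hosts "$host"); then
    # getent prints "ADDRESS  NAME..."; keep only the first field
    echo "$host resolves to ${addr%% *}"
  else
    echo "$host does not resolve" >&2
    return 1
  fi
}

# Usage inside the namenode1 container: check_host datanode1.hadoop
```

If the name does not resolve, compare the domainname values in docker-compose.yml with the name of the external network.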

Updated: 2024-12-22