哔哔大数据

The name directory does not exist

Error description:

    Directory /usr/local/src/hadoop/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

Fix: create the name directory, reformat the NameNode, and start the cluster again.

The NameNode does not start

Error description: after Hadoop has been stopped, starting it again brings up no NameNode process.

Fix: empty the data directory under the temporary directory tmp, then start again.

The NameNode is in safe mode

Error description:

    Name node is in safe mode.

Fix: leave safe mode.

    hadoop dfsadmin -safemode leave

To enter safe mode:

    hadoop dfsadmin -safemode enter

In Hadoop 2.x the same subcommands are also available as hdfs dfsadmin; the hadoop dfsadmin form is deprecated but still works.

Hive cannot run MR jobs

Error description:

    Task with the most failures(4):
    -----
    Task ID:
      task_1594519690907_0001_m_000000

    URL:
      http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1594519690907_0001&tipid=task_1594519690907_0001_m_000000
    -----
    Diagnostic Messages for this Task:
    Container launch failed for container_1594519690907_0002_01_000005 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:162)
        at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:393)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    MapReduce Jobs Launched:
    Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
    Total MapReduce CPU Time Spent: 0 msec

Fix: the NodeManager has no mapreduce_shuffle auxiliary service registered, so confirm that yarn-site.xml contains the following property:

    <!-- Auxiliary service through which reducers fetch map output from the NodeManager -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

Then restart Hadoop.
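The yarn-site.xml check can also be scripted before restarting. The following is only a minimal sketch: the function name has_shuffle_aux_service is hypothetical, and the check is purely textual, so it will not notice a property that sits inside an XML comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): succeeds if the given
# yarn-site.xml lists mapreduce_shuffle under yarn.nodemanager.aux-services.
# Returns 2 if the file cannot be read, 1 if the setting is missing.
has_shuffle_aux_service() {
    local conf_file="$1"
    [ -r "$conf_file" ] || return 2
    # Strip all whitespace so the match works whether or not <name> and
    # <value> sit on separate lines.
    tr -d '[:space:]' < "$conf_file" |
        grep -Eq '<name>yarn\.nodemanager\.aux-services</name><value>[^<]*mapreduce_shuffle'
}
```

A possible call, assuming the configuration directory used elsewhere in this post: has_shuffle_aux_service /usr/local/src/hadoop-2.9.2/etc/hadoop/yarn-site.xml && echo "aux-service configured".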


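Versions

    Component   Version     Download code
    Hadoop      2.9.2       qtf9
    jdk         1.8.0_221   yjwt
    centOS      7.0         bodr

Environment preparation

1. Change the machine names

Machine name mapping:

    master: 192.168.5.139
    slave1: 192.168.5.143
    slave2: 192.168.5.145

Edit the hostname file: vi /etc/hostname
Apply it: hostname <machine name>
Check it: hostname

2. Map hostnames to IPs on master

File to edit: vi /etc/hosts

    192.168.5.139 master
    192.168.5.143 slave1
    192.168.5.145 slave2

Once master is done, send hosts to the slave1 and slave2 nodes:

    for i in {1..2}; do scp /etc/hosts root@slave${i}:/etc; done

Role assignment

    Machine   HDFS roles           YARN roles
    master    DataNode/NameNode    NodeManager/ResourceManager
    slave1    DataNode             NodeManager
    slave2    DataNode             NodeManager

Prerequisites

1. Passwordless ssh login

Run on every machine:

    ssh-keygen -t rsa

Then send the authorized_keys file from master to the other nodes. On master, this command generates the authorized_keys file:

    ssh-copy-id -i /root/.ssh/id_rsa.pub master

Send authorized_keys to slave1 and slave2:

    scp /root/.ssh/authorized_keys root@slave1:/root/.ssh/
    scp /root/.ssh/authorized_keys root@slave2:/root/.ssh/

From master, test passwordless login to slave1 and slave2 with: ssh <machine name>

2. Set up the jdk on master; it is sent to the other nodes later together with hadoop

Extract the jdk on master and configure its environment variables.

Building the Hadoop cluster

Extract the Hadoop package and configure environment variables

Extract the hadoop package under /usr/local/src/hadoop and add HADOOP_HOME to the environment variables. The configuration files below use /usr/local/src/hadoop-2.9.2 as the install path, so either extract to that path or adjust the paths to match.

Edit the configuration files

Go to the hadoop-2.9.2/etc/hadoop directory.

1. Edit hadoop-env.sh

First change: comment out the original JAVA_HOME line and add the explicit path.

    # The java implementation to use.
    # export JAVA_HOME=${JAVA_HOME}
    export JAVA_HOME=/usr/local/src/jdk1.8.0_221

Second change: comment out the original HADOOP_CONF_DIR line and add the explicit path.

    # export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
    export HADOOP_CONF_DIR=/usr/local/src/hadoop-2.9.2/etc/hadoop

After editing, remember to run source hadoop-env.sh.

2. Edit core-site.xml

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://master:9000</value>
        </property>
        <!-- Temporary directory -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/usr/local/src/hadoop-2.9.2/tmp</value>
        </property>
    </configuration>

3. Edit hdfs-site.xml

Add the following to hdfs-site.xml:

    <configuration>
        <!-- Number of replicas per block -->
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
        <!-- HTTP address and port of the SecondaryNameNode -->
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>master:50090</value>
        </property>
        <!-- HTTPS address and port of the SecondaryNameNode -->
        <property>
            <name>dfs.namenode.secondary.https-address</name>
            <value>master:50091</value>
        </property>
    </configuration>

4. Edit yarn-site.xml

    <configuration>
        <!-- List of directories used to store localized files -->
        <!-- Create the directory first: mkdir -p /usr/local/src/nm/localdir -->
        <property>
            <name>yarn.nodemanager.local-dirs</name>
            <value>/usr/local/src/nm/localdir</value>
        </property>
        <!-- How reducers fetch their data -->
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <!-- Host of the YARN ResourceManager; in this cluster that is master -->
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>master</value>
        </property>
        <!-- Skip the virtual memory check; very useful on virtual machines -->
        <property>
            <name>yarn.nodemanager.vmem-check-enabled</name>
            <value>false</value>
            <description>Whether virtual memory limits will be enforced for containers</description>
        </property>
        <!-- Memory (MB) each NodeManager makes available to YARN containers -->
        <property>
            <name>yarn.nodemanager.resource.memory-mb</name>
            <value>3276</value>
        </property>
        <!-- Largest memory allocation (MB) a single container request may ask for; larger requests raise an exception -->
        <property>
            <name>yarn.scheduler.maximum-allocation-mb</name>
            <value>3276</value>
        </property>
        <!-- Number of CPU vcores each NodeManager makes available to YARN containers -->
        <property>
            <name>yarn.nodemanager.resource.cpu-vcores</name>
            <value>4</value>
        </property>
        <!-- Largest number of vcores a single container request may ask for; larger requests raise an exception -->
        <property>
            <name>yarn.scheduler.maximum-allocation-vcores</name>
            <value>4</value>
        </property>
    </configuration>

5. Edit mapred-site.xml

First make a copy of the template:

    cp mapred-site.xml.template mapred-site.xml

    <configuration>
        <!-- Runtime framework for MapReduce jobs; one of local, classic or yarn -->
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <!-- Host and port of the MapReduce JobHistory server -->
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>master:10020</value>
        </property>
        <!-- Host and port of the MapReduce JobHistory server web UI -->
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>master:19888</value>
        </property>
    </configuration>

6. Edit the slaves file

This file lists the machines that run the worker daemons. Delete the original localhost entry and add:

    master
    slave1
    slave2

Distribute the configuration to slave1 and slave2

Send hadoop and java to slave1 and slave2:

    scp -r /usr/local/src/ root@slave1:/usr/local/
    scp -r /usr/local/src/ root@slave2:/usr/local/

Send the environment variable file to slave1 and slave2:

    scp /etc/profile root@slave1:/etc/
    scp /etc/profile root@slave2:/etc/

After distributing, remember to run source /etc/profile on slave1 and slave2.

Start the Hadoop cluster

1. Format the NameNode. This only needs to be run on the master machine:

    hdfs namenode -format

2. Start the cluster. Run on master:

    start-all.sh

Verification

Verify with jps. Each machine should list these processes:

    master              slave1         slave2
    Jps                 Jps            Jps
    NodeManager         NodeManager    NodeManager
    DataNode            DataNode       DataNode
    NameNode
    SecondaryNameNode
    ResourceManager

Verify in the browser. First stop the firewall:

    systemctl stop firewalld.service

Then open:

    <master IP>:50070    (HDFS NameNode web UI)
    <master IP>:8088     (YARN ResourceManager web UI)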
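The jps check can be automated by comparing each machine's jps output with the process list expected for its role. The sketch below is only an illustration: the function name missing_daemons and the role labels "master" and "slave" are hypothetical, and the expected process names are taken from the jps listing in the verification step.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints every daemon that is expected for the role
# ("master" or "slave") but does not appear in the given jps output.
missing_daemons() {
    local role="$1" jps_output="$2" expected daemon
    case "$role" in
        master) expected="NameNode SecondaryNameNode ResourceManager DataNode NodeManager" ;;
        slave)  expected="DataNode NodeManager" ;;
        *) echo "unknown role: $role" >&2; return 2 ;;
    esac
    for daemon in $expected; do
        # jps prints "<pid> <name>"; compare the whole second field so that
        # NameNode is not satisfied by a SecondaryNameNode line.
        if ! printf '%s\n' "$jps_output" |
                awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

On master this could be called as missing_daemons master "$(jps)", and on a worker as missing_daemons slave "$(jps)"; empty output means every expected process is running.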