Hive On Tez cloudera5.4

配置

tez-site.xml

<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/apps/tez-0.8.5/,${fs.defaultFS}/apps/tez-0.8.5/lib/</value>
  </property>

  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
  </property>

  <property>
    <name>tez.runtime.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>tez.runtime.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
</configuration>

调优参数

tez.grouping.min-size 分片最小限制

报错处理

找不到lzo

Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2018)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
        ... 23 more

LZO包没加载到,我是把lzo.jar放入tez的lib里面,但是这样会导致不能用到native库,暂时没找到解决方法。 找不到org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator

Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator

同样的解决方法,把hive-exec.jar放进去

errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: Error while doing final merge 
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:320)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:285)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Not a valid ifile header
        at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.verifyHeaderMagic(IFile.java:837)
        at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.isCompressedFlagEnabled(IFile.java:844)
        at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.<init>(IFile.java:547)
        at org.apache.tez.runtime.library.common.sort.impl.TezMerger$DiskSegment.init(TezMerger.java:407)
        at org.apache.tez.runtime.library.common.sort.impl.TezMerger$MergeQueue.merge(TezMerger.java:753)
        at org.apache.tez.runtime.library.common.sort.impl.TezMerger.merge(TezMerger.java:192)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.finalMerge(MergeManager.java:1203)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:583)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:316)

我是重新指定了一下压缩算法(tez.runtime.compress.codec),不知道为什么。压缩关掉也可以。

<property>
    <name>tez.runtime.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>tez.runtime.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>

写的不错的文章

Tez系列第二篇 Hive_on_tez

updatedupdated2023-12-062023-12-06