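Besides the official Hadoop releases, cloudera.com also publishes its own Hadoop distribution. Reportedly, one of the people who worked on Hadoop development at Yahoo moved over to cloudera.com (Hugo told me this). So I replaced the official hadoop-0.20.1 with cloudera.com's hadoop-0.20.1+152.tar.gz, plus hive-0.4.0+14.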

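Configuration is the same as for the official release; see my earlier posts or the documentation online for details. But once everything was configured, running start-all.sh produced the following error: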

+======================================================================+
|      Error: JAVA_HOME is not set and Java could not be found         |
+----------------------------------------------------------------------+
| Please download the latest Sun JDK from the Sun Java web site        |
|       > http://java.sun.com/javase/downloads/ <                      |
|                                                                      |
| Hadoop requires Java 1.6 or later.                                   |
| NOTE: This script will find Sun Java whether you install using the   |
|       binary or the RPM based installer.                             |
+======================================================================+

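I had clearly set it in hadoop-env.sh and, to be safe, set it again in /etc/profile. Logged in as the hadoop user, running echo $JAVA_HOME from any directory prints "/usr/local/jdk", which is the correct JDK path on my server! Checking the documentation again, it says JDK 1.6 or later is required; I am using jdk1.6.0_16, so that should meet the requirement.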
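The check above only prints the variable. A minimal sketch that goes one step further and also confirms that a java binary exists under that path could look like this; check_java_home is a made-up helper name, not something shipped with Hadoop, and it is meant to be run as the same user that starts Hadoop:

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): report whether JAVA_HOME
# is set and whether it contains an executable bin/java.
check_java_home() {
  if [ -z "$JAVA_HOME" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "no executable java under $JAVA_HOME/bin"
    return 1
  fi
  echo "JAVA_HOME=$JAVA_HOME looks usable"
}

# Run the check for the current environment; do not abort on failure.
check_java_home || true
```

If this reports a usable JDK and Hadoop still complains, the variable is probably not reaching the scripts that Hadoop runs, which is where the search below starts.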


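So where could the problem be? Time to grep and see which files contain the string JAVA_HOME. I searched hadoop/conf first: only hadoop-env.sh contains it. Then I checked another directory, hadoop/bin, with the following output: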

[root@hadoops2 hadoop]# grep JAVA_HOME bin/*
bin/hadoop:export JAVA_HOME=/usr/local/jdk
bin/hadoop:#   JAVA_HOME        The java implementation to use.  Overrides JAVA_HOME.
bin/hadoop:if [ "$JAVA_HOME" != "" ]; then
bin/hadoop:  #echo "run java in $JAVA_HOME"
bin/hadoop:  JAVA_HOME=$JAVA_HOME
bin/hadoop:if [ "$JAVA_HOME" = "" ]; then
bin/hadoop:  echo "Error: JAVA_HOME is not set."
bin/hadoop:JAVA=$JAVA_HOME/bin/java
bin/hadoop:CLASSPATH=${CLASSPATH}:$JAVA_HOME/lib/tools.jar
bin/hadoop-config.sh:if [ -z "$JAVA_HOME" ]; then
bin/hadoop-config.sh:      export JAVA_HOME=$candidate
bin/hadoop-config.sh:  if [ -z "$JAVA_HOME" ]; then
bin/hadoop-config.sh:|      Error: JAVA_HOME is not set and Java could not be found         |
bin/rcc:#   JAVA_HOME        The java implementation to use.  Overrides JAVA_HOME.
bin/rcc:if [ "$JAVA_HOME" != "" ]; then
bin/rcc:  #echo "run java in $JAVA_HOME"
bin/rcc:  JAVA_HOME=$JAVA_HOME
bin/rcc:if [ "$JAVA_HOME" = "" ]; then
bin/rcc:  echo "Error: JAVA_HOME is not set."
bin/rcc:JAVA=$JAVA_HOME/bin/java
bin/rcc:CLASSPATH=${CLASSPATH}:$JAVA_HOME/lib/tools.jar


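Well! Quite a few files touch JAVA_HOME. I went through them one by one, and hadoop-config.sh looked like the prime suspect: one section of it matches the error printed when starting Hadoop exactly. Here is that fragment: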

# attempt to find java
if [ -z "$JAVA_HOME" ]; then
  for candidate in \
    /usr/lib/jvm/java-6-sun \
    /usr/lib/j2sdk1.6-sun \
    /usr/local/jdk \
    /usr/java/jdk1.6* \
    /usr/java/jre1.6* \
    /Library/Java/Home ; do
    if [ -e $candidate/bin/java ]; then
      export JAVA_HOME=$candidate
      break
    fi
  done
  # if we didn't set it
  if [ -z "$JAVA_HOME" ]; then
    cat 1>&2 <<EOF
+======================================================================+
|      Error: JAVA_HOME is not set and Java could not be found         |
+----------------------------------------------------------------------+
| Please download the latest Sun JDK from the Sun Java web site        |
|       > http://java.sun.com/javase/downloads/ <                      |
|                                                                      |
| Hadoop requires Java 1.6 or later.                                   |
| NOTE: This script will find Sun Java whether you install using the   |
|       binary or the RPM based installer.                             |
+======================================================================+
EOF
    exit 1
  fi
fi

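I added the line /usr/local/jdk \ in the middle of this candidate list (the fragment above already shows it), ran it again, and everything worked!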
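To see why the extra entry matters, here is a rough sketch of the same candidate search pulled out into a standalone function. find_java_home is a hypothetical name, and the demo runs against a throwaway directory tree rather than the real filesystem, so the paths under $demo are invented for illustration only:

```shell
#!/bin/sh
# Sketch of the candidate search from hadoop-config.sh, wrapped in a
# hypothetical function so it can be tried outside Hadoop.
# Usage: find_java_home DIR...  (prints the first DIR containing bin/java)
find_java_home() {
  for candidate in "$@"; do
    if [ -e "$candidate/bin/java" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Demo in a temporary directory: a fake JDK installed at usr/local/jdk.
demo=$(mktemp -d)
mkdir -p "$demo/usr/local/jdk/bin"
: > "$demo/usr/local/jdk/bin/java"

# Without the extra entry the search fails (the case that triggers the error box).
find_java_home "$demo/usr/lib/jvm/java-6-sun" "$demo/usr/lib/j2sdk1.6-sun" \
  || echo "not found"

# With usr/local/jdk in the list the search succeeds and prints that path.
find_java_home "$demo/usr/lib/jvm/java-6-sun" "$demo/usr/lib/j2sdk1.6-sun" \
  "$demo/usr/local/jdk"

rm -rf "$demo"
```

The search only looks at the directories in its list, so a JDK installed anywhere else is invisible to it unless JAVA_HOME is already set when the script runs.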