Why do Hive queries fail when hive.execution.engine is set to MR, but work when it is set to Tez?

I am using the HDP 2.1 sandbox for my work. The Hive version, as indicated by the jar file name, is hive-exec-0.13.0.2.1.1.0-385.jar.

I have created a directory in HDFS containing weather information. The actual data is in text files with the following five fields (usafid:string, obsdate:string, winddir:int, windspeed:int, visibility:double). For example, the file contents look like this:

  • 725805 201301010853 70 8 10.0
  • 725805 201301010953 350 6 10.0
  • 725805 201301011053 20 11 10.0
  • 725805 201301011153 20 8 10.0
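
For reference, this is roughly how the data was staged in HDFS (wind_2013.txt is just a placeholder name for the local file):

# create the target directory and copy the tab-delimited files into it
# (wind_2013.txt is a placeholder; the real files have the same layout)
hdfs dfs -mkdir -p /WEATHER/PROCESSED/WIND_RECORDS
hdfs dfs -put wind_2013.txt /WEATHER/PROCESSED/WIND_RECORDS/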

I am now overlaying a Hive external table on that directory using the following DDL:

CREATE DATABASE weather;
USE weather;
CREATE EXTERNAL TABLE IF NOT EXISTS wind( 
    usafid     STRING,
    obsdate    STRING,
    winddir    INT,
    windspeed  INT,
    visibility DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES  TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/WEATHER/PROCESSED/WIND_RECORDS';
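
The table definition itself can be double-checked from the Hive shell: DESCRIBE FORMATTED prints the LOCATION, SerDe, and input/output formats recorded in the metastore, and SHOW CREATE TABLE prints the DDL Hive stored:

DESCRIBE FORMATTED wind;
SHOW CREATE TABLE wind;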

When I run the query SELECT * FROM wind;, it works fine. But if I run the query SELECT * FROM wind WHERE windspeed = 3;, Hive launches an MR job and fails with the following stack trace:

2014-10-29 00:10:58,975 ERROR [IPC Server handler 3 on 52990] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1414566304731_0001_m_000000_0 - exited : java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0
	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:284)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:250)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:256)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:383)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:376)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:552)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:168)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
	at java.beans.XMLDecoder.readObject(XMLDecoder.java:250)
	at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObject(Utilities.java:679)
	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:622)
	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:272)

I did a lot of research and digging and was able to trace the error to the deserialization of the query plan. Any query with a WHERE clause fails. If I set the execution engine to Tez using the following command, the queries run fine:

set hive.execution.engine=tez; 
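
For reference, running set with just the property name prints the value currently in effect, so it is easy to flip between the two engines and reproduce the behavior described above (failure on mr, success with tez):

set hive.execution.engine;
set hive.execution.engine=mr;
set hive.execution.engine=tez;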

I am not sure what is happening or why the queries fail when hive.execution.engine=mr (which I believe is the default).

EDIT: I set up a 3-node cluster and used Ambari to install and configure HDP 2.1. I am unable to reproduce the problem on the 3-node cluster. It looks like the issue manifests itself only in the standalone HDP 2.1 sandbox VM.