Monday, November 3, 2014

Getting Hive metastore to work with HDInsight Emulator.

For some reason, Hive metastore.exe (which runs as a service) did not create the metastore db that is required for Hive to store table definitions. 

Hive reported the following exception at startup in the log files. 
"set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out"


Because of this error, the "show databases" command did not work and I could not use Hive with the HDInsight Emulator.

After hours of investigation, I made the following edits to "C:\hdp\hive-0.13.0.2.1.3.0-1981\conf\hive-site.xml". The key change is that the metastore now uses the Derby EmbeddedDriver instead of the ClientDriver.

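  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>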
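
To double-check that the edits were actually saved, the two properties can be read back from the file. Below is a minimal Python sketch, assuming hive-site.xml uses the usual configuration/property/name/value layout. The function name read_hive_properties and the inline SAMPLE text are my own illustration, not part of Hive or HDInsight.

```python
import xml.etree.ElementTree as ET


def read_hive_properties(xml_text):
    """Return a {name: value} dict for every <property> element in the XML text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Illustrative sample only; in practice, read the text of your own hive-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for key, val in read_hive_properties(SAMPLE).items():
        print(key, "=", val)
```

To check the real file, pass its contents instead of SAMPLE, for example by opening the hive-site.xml path mentioned above and reading it as text.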

Then I stopped the default metastore.exe service and started the metastore with the following command, with the initial directory set to the Hive home:
"Hive.cmd --service metastore"

This created a folder named "metastore_db" (matching the databaseName in the connection URL) and finally Hive started successfully!

By default, the metastore service listens for requests on port 9083.
You can confirm that the service is running by executing the netstat command shown below.
netstat -ano | find "9083"
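
The same check can be scripted. Here is a small Python sketch that simply tries to open a TCP connection to the port. It assumes the metastore is on the local machine and on the default port 9083. The helper name is_port_open is my own. Note that it only tells you something is accepting connections on that port, not that it is the Hive metastore.

```python
import socket


def is_port_open(host="127.0.0.1", port=9083, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "listening" if is_port_open() else "not reachable"
    print("Port 9083 is", state)
```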

Tuesday, September 2, 2014

Not running in a hosted service or the Development Fabric

Runtime Exception:
Could not create Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener, Microsoft.WindowsAzure.Diagnostics, Version=2.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35.

{"Not running in a hosted service or the Development Fabric."}

Fix:
I had accidentally set the Web Role project as the startup project. Setting the cloud service project in the solution as the startup project got me past this exception.

Metaprogramming with T4

Recently, I learnt how to generate code using T4 text templates. The requirement was to read data from a CSV file and create in-memory objects based on the data representation in the CSV file. So if my CSV file had records with 4 columns, the code had to read the file and create in-memory objects such that each object has 4 fields mapped to the 4 columns. There were multiple CSV files of different record types and manually creating a class for each record type was tedious. After doing commonality/variability analysis, it was apparent that I could use T4 to generate the CSV file reader for each record type by reading metadata about the CSV file.
As my understanding of T4 evolves, I will share my learning here. For reference, here are some good links that provide guidelines on code generation and when to use it.

Code Generation good or evil
Code Generation Pros and Cons of T4 template
MSDN Code Generation and T4 templates
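
To illustrate the idea behind the CSV scenario above, here is a rough Python analogue. It is not T4 and not the code I actually wrote; it only shows the general technique of generating a reader class from metadata (here, just the CSV header row). It assumes the column names are valid identifiers. The names generate_record_class, load_generated_class, Person and the sample data are all made up for the example.

```python
import csv
import io
from string import Template

# Text template for the generated class; $placeholders are filled from metadata.
CLASS_TEMPLATE = Template('''\
class $class_name:
    """Generated record type for one CSV layout."""
    FIELDS = $fields

    def __init__(self, $args):
$assignments

    @classmethod
    def read_all(cls, stream):
        import csv
        rows = csv.reader(stream)
        next(rows)  # skip the header row
        return [cls(*row) for row in rows]
''')


def generate_record_class(class_name, header):
    """Return source code for a class with one field per CSV column."""
    fields = [h.strip() for h in header]
    assignments = "\n".join(f"        self.{f} = {f}" for f in fields)
    return CLASS_TEMPLATE.substitute(
        class_name=class_name,
        fields=repr(fields),
        args=", ".join(fields),
        assignments=assignments,
    )


def load_generated_class(class_name, header):
    """Execute the generated source and return the resulting class object."""
    namespace = {}
    exec(generate_record_class(class_name, header), namespace)
    return namespace[class_name]


if __name__ == "__main__":
    sample = "id,name,city,age\n1,Ann,Pune,30\n2,Bob,Oslo,41\n"
    header = next(csv.reader(io.StringIO(sample)))
    print(generate_record_class("Person", header))
    Person = load_generated_class("Person", header)
    for person in Person.read_all(io.StringIO(sample)):
        print(person.name, person.city)
```

A real T4 template would emit the generated source at design time so the compiler can check it, whereas this sketch builds the class at run time. Values are kept as strings; type conversion per column would need extra metadata.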

What is success?

The journey of life takes us through varied experiences like landing an admission at a prestigious college, earning a degree, getting hired,...