I have been playing around with Datanucleus JPA on HBase for the last few days. Datanucleus also supports JDO, but I chose JPA because of Sun's standardization on it. There are a couple of good tutorials already available, but my personal favorite is Matzew's. Apart from a few jar dependency issues, it is a quite straightforward Maven script that runs a Jetty server and a web app.
But if you are crazy about IDEs like Eclipse, it is a bit tricky to get things in place and working. So here is a tutorial that guides you through running the same application in Eclipse and Tomcat.
Prerequisites:
- Hadoop and HBase up and running.
- Basic knowledge of HBase concepts (schema, columns, column families, etc.).
- Datanucleus Eclipse plug-in installed in Eclipse.
- High-level understanding of Datanucleus and JPA.
For Whom:
If you are facing any of the following issues:
- Integrating HBase using JPA in your application.
- Porting Matzew’s Maven script based example to an Eclipse and Tomcat environment.
- Suffering from the “javax.persistence.* class not found” exception.
- The “No persistence providers available for storename” exception is frustrating you in spite of trying the solutions suggested on forums.
- Confused about where to put persistence.xml in your web application.
- Need some way to control the column family names in your HBase schema.
- @Column(name=”familyname:columnname”) has no effect on your HBase column families.
- Can't find the Datanucleus menu item after successfully installing the Datanucleus Eclipse plug-in.
Let’s Start:
Here I am using the same code committed by Matzew on GitHub and will show you the steps to create a simple web application.
Step 1: Create a simple Servlet.
I have created a simple servlet which persists the Contact entity class into the database.
import java.io.IOException;
Step 2: Web application Structure
Your Eclipse dynamic web project should look like this.
Make sure to add all the required jars (pretty obvious) and remember a few critical points:
- Make sure you have “persistence-api-XXX.jar” in WEB-INF\lib (because Tomcat doesn’t ship with its own version of the persistence API jar).
- Add META-INF\persistence.xml to the source (src) folder.
- Place orm.xml alongside the Entity classes.
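The first two points can be checked from code before you start chasing exceptions. Below is a minimal sketch that uses only the JDK; the class name `JpaClasspathCheck` and its methods are hypothetical helpers of mine, not part of Datanucleus or of Matzew's project. It tests whether `javax.persistence.Persistence` can be loaded and whether `META-INF/persistence.xml` is visible on the classpath. In a typical Eclipse dynamic web project, files in the source folder end up under WEB-INF\classes, which is where the second check would expect to find the file.

```java
import java.net.URL;

/** Classpath sanity checks for a JPA web app (illustrative helper, not a Datanucleus API). */
final class JpaClasspathCheck {

    private JpaClasspathCheck() {}

    /** Returns true if the named class can be loaded by the current class loader. */
    static boolean classPresent(String className) {
        try {
            // 'false' avoids running static initializers; we only care about visibility.
            Class.forName(className, false, loader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the URL of a classpath resource, or null if it is not visible. */
    static URL locate(String resourcePath) {
        return loader().getResource(resourcePath);
    }

    /** Prefers the context class loader, which is the web app's loader inside Tomcat. */
    private static ClassLoader loader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return cl != null ? cl : JpaClasspathCheck.class.getClassLoader();
    }

    public static void main(String[] args) {
        System.out.println("javax.persistence.Persistence loadable: "
                + classPresent("javax.persistence.Persistence"));
        System.out.println("META-INF/persistence.xml found at: "
                + locate("META-INF/persistence.xml"));
    }
}
```

Calling the two methods from the servlet's init() and logging the results should tell you quickly whether you are missing the persistence API jar or have persistence.xml in the wrong place.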
Step 3: Important Configuration XMLs
The persistence.xml file
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
Fine touch by orm.xml file
The idea of this XML file is to map your Entity class to the corresponding table in the database. You can also map member variables of the Entity class to fields/columns of the table and declare constraints on them as well.
<?xml version="1.0" encoding="UTF-8"?>
The structure of this XML file is quite self-explanatory, but there are still a few points to remember:
- The <id> …</id> tag is used to declare and map the row key of the HBase table.
- <basic>...</basic> maps a field to a column name of the table.
- Most important: note that the column name is given in “family-name:column-name” format. If you don’t specify the column name in this format, Datanucleus will take the class name as the column family name of the HBase table.
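To make the naming rule concrete, here is a small sketch of how a mapped column name splits into a column family and a column. It only illustrates the rule as described above and is not Datanucleus's actual code; the class `HBaseColumnName`, the method `resolve`, and the family name `details` are made up for the example.

```java
/** Illustrates the "family-name:column-name" rule; not Datanucleus's actual implementation. */
final class HBaseColumnName {
    final String family;
    final String qualifier;

    private HBaseColumnName(String family, String qualifier) {
        this.family = family;
        this.qualifier = qualifier;
    }

    /**
     * Splits a mapped column name at the first ':'.
     * If no family is given, falls back to defaultFamily (the class name, per the rule above).
     */
    static HBaseColumnName resolve(String mappedName, String defaultFamily) {
        int sep = mappedName.indexOf(':');
        // sep < 0: no colon at all; sep == 0: empty family part. Both use the default family.
        String family = sep > 0 ? mappedName.substring(0, sep) : defaultFamily;
        String qualifier = mappedName.substring(sep + 1);
        return new HBaseColumnName(family, qualifier);
    }

    @Override
    public String toString() {
        return family + ":" + qualifier;
    }

    public static void main(String[] args) {
        // Explicit family: the column lands in the "details" family.
        System.out.println(resolve("details:firstName", "Contact")); // prints details:firstName
        // No family given: the class name is used as the family.
        System.out.println(resolve("firstName", "Contact"));         // prints Contact:firstName
    }
}
```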
Step 4: Enhance your Entity classes
Step 5: Deploy and Run
Go ahead. Deploy your first JPA app on the server and run it.
If everything goes well, you will have a Contact table created in HBase with one row.
I have uploaded the zip of this Eclipse Project here.
That’s It. Enjoy!!
Additional Resources:
- https://github.com/matzew/hbase-jpa-jsf/
- http://www.datanucleus.org/products/accessplatform/hbase/support.html
Hello Mayur,
First of all, I would like to congratulate you on this nice tutorial!
You mentioned: "I have uploaded the zip of this Eclipse Project here." But unfortunately, I can't find the link.
Could you please verify the link to your project zip?
Thank you very much!
Hi Karray,
That link was broken, so I removed it. Now it's fixed.
Thanks for notifying me!
;) You are welcome!
By the way: I tried to run your project with Eclipse and a Tomcat 5.5 server, but when I run the project (Run on Server) I get a 404 error with the description: The required resource is not available!
I tried to figure out the problem, and I suspect that there is no "index.html" or "index.jsp" page (I am not sure, anyway)!
Do you have a suggestion? Thank you!
Karray,
I have kept this tutorial as simple as possible, so I am not using JSP/HTML in this project. There is only one servlet, "Index". When you hit this servlet, it will insert one contact row in the HBase table.
Hope it's clear now :)
Okkk, it's clear now !
Just one last question, because I really want to make it work!
When I run your example, I get this error in my Eclipse console output (you should download the file):
http://www.2shared.com/file/tijLwDwv/logexample.html
This is, also, a picture of your project implementation on Eclipse:
http://data.imagup.com/7/1119080445.png
So what can I do to resolve this?
-------------------------------------------
PS: HBase and Hadoop are running, and the Datanucleus plug-in is working.
--------------------------------------------
Thank you very much !
Clean and build the project in Eclipse. I hope that resolves your issue.
Mayur,
Do you have any performance benchmarks for Datanucleus against Avro or Thrift?
-Jignesh