

In document Hbase in Action (Page 172-175)

Alternative HBase clients

6.1.2 Script table schema from the UNIX shell

Back when you were first learning HBase, you started development on the TwitBase application. One of the first things you did with TwitBase was to create a users table using the HBase shell. As TwitBase grew, so did your schema. Tables for Twits and Followers soon emerged. All management code for those tables accumulated in the InitTables class. Java isn’t a convenient language for schema management because it’s verbose and requires building a custom application for each migration. Let’s reimagine that code as HBase shell commands.

The main body of code for creating a table in InitTables looks mostly the same for each table:

System.out.println("Creating Twits table...");
HTableDescriptor desc = new HTableDescriptor(TwitsDAO.TABLE_NAME);
HColumnDescriptor c = new HColumnDescriptor(TwitsDAO.INFO_FAM);
c.setMaxVersions(1);
desc.addFamily(c);
admin.createTable(desc);
System.out.println("Twits table created.");

You can achieve the same effect using the shell:

hbase(main):001:0> create 'twits', {NAME => 't', VERSIONS => 1}
0 row(s) in 1.0500 seconds

A brush with JRuby

If you’re familiar with the Ruby programming language, the create command may look conspicuously like a function invocation. That’s because it is. The HBase shell is implemented in JRuby. We’ll look more at this link to JRuby later in this chapter.

Five lines of Java reduced to a single shell command? Not bad. Now you can take that HBase shell command and wrap it in a UNIX shell script. Note that the line exec hbase shell may be slightly different for you if the hbase command isn’t on your path. You handle that scenario in the final script, shown in listing 6.1:

#!/bin/sh

exec $HBASE_HOME/bin/hbase shell <<EOF
create 'twits', {NAME => 't', VERSIONS => 1}
EOF
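The here-document in that script is plain text fed to the HBase shell on standard input. As a minimal sketch, you can preview exactly what would be sent by substituting cat for the hbase command. The function name preview_schema is hypothetical and not part of TwitBase.

```shell
#!/bin/sh
# preview_schema is a hypothetical helper: it prints the commands that the
# here-document would send to `hbase shell`, using cat as a stand-in.
preview_schema() {
    cat <<EOF
create 'twits', {NAME => 't', VERSIONS => 1}
EOF
}
```

Calling preview_schema prints the create command without contacting a cluster, which is a cheap way to check a script before running it for real.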

Adding the other tables to your script is easy:

exec $HBASE_HOME/bin/hbase shell <<EOF
create 'twits', {NAME => 't', VERSIONS => 1}
create 'users', {NAME => 'info'}
create 'follows', {NAME => 'f', VERSIONS => 1}
create 'followedBy', {NAME => 'f', VERSIONS => 1}
EOF

At this point, you’ve moved your table and column family names out of Java. Overriding them on the command line is now much easier:

#!/bin/sh

TWITS_TABLE=${TWITS_TABLE-'twits'}
TWITS_FAM=${TWITS_FAM-'t'}

exec $HBASE_HOME/bin/hbase shell <<EOF
create '$TWITS_TABLE', {NAME => '$TWITS_FAM', VERSIONS => 1}
create 'users', {NAME => 'info'}
create 'follows', {NAME => 'f', VERSIONS => 1}
create 'followedBy', {NAME => 'f', VERSIONS => 1}
EOF
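The ${VAR-default} expansion used for TWITS_TABLE and TWITS_FAM substitutes the default only when the variable is unset. The closely related ${VAR:-default} form also falls back when the variable is set but empty. The sketch below contrasts the two; the function names are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers contrasting the two default-value expansions.

# Falls back to 'twits' only when TWITS_TABLE is unset.
table_if_unset() {
    printf '%s\n' "${TWITS_TABLE-twits}"
}

# Falls back to 'twits' when TWITS_TABLE is unset OR set to the empty string.
table_if_unset_or_empty() {
    printf '%s\n' "${TWITS_TABLE:-twits}"
}
```

With the first form, running a script as TWITS_TABLE= ./script.sh (script name assumed) would pass an empty table name through to HBase, so the colon form may be the safer choice if empty values are possible in your environment.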

If you update your application code to read those same constants from a configuration file, you can move your schema definition completely out of the Java code. Now you can easily test different versions of TwitBase against different tables on the same HBase cluster. That flexibility will simplify the process of bringing TwitBase to production. The complete script is shown next.

Listing 6.1 UNIX shell replacement for InitTables.java

#!/bin/sh

# Find hbase command
HBASE_CLI="$HBASE_HOME/bin/hbase"
test -n "$HBASE_HOME" || {
  echo >&2 'HBASE_HOME not set. using hbase on $PATH'
  HBASE_CLI=$(which hbase)
}

# Determine table and column family names
TWITS_TABLE=${TWITS_TABLE-'twits'}
TWITS_FAM=${TWITS_FAM-'t'}
USERS_TABLE=${USERS_TABLE-'users'}
USERS_FAM=${USERS_FAM-'info'}
FOLLOWS_TABLE=${FOLLOWS_TABLE-'follows'}
FOLLOWS_FAM=${FOLLOWS_FAM-'f'}
# Read the same variable that is being assigned, so overrides take effect
FOLLOWEDBY_TABLE=${FOLLOWEDBY_TABLE-'followedBy'}
FOLLOWEDBY_FAM=${FOLLOWEDBY_FAM-'f'}

# Run shell commands
exec "$HBASE_CLI" shell <<EOF
create '$TWITS_TABLE',
  {NAME => '$TWITS_FAM', VERSIONS => 1}
create '$USERS_TABLE',
  {NAME => '$USERS_FAM'}
create '$FOLLOWS_TABLE',
  {NAME => '$FOLLOWS_FAM', VERSIONS => 1}
create '$FOLLOWEDBY_TABLE',
  {NAME => '$FOLLOWEDBY_FAM', VERSIONS => 1}
EOF
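The lookup of the hbase command can also be written as a small function. The following is a sketch under stated assumptions rather than the book's code: find_hbase is a hypothetical name, and it uses the POSIX command -v builtin in place of which, since which is not guaranteed to exist or behave consistently on every system.

```shell
#!/bin/sh
# find_hbase is a hypothetical helper that prints the path of the hbase
# command. It prefers $HBASE_HOME/bin/hbase and otherwise searches $PATH.
find_hbase() {
    if [ -n "${HBASE_HOME-}" ]; then
        printf '%s\n' "$HBASE_HOME/bin/hbase"
    else
        # command -v prints nothing and returns non-zero if hbase is not found
        command -v hbase
    fi
}
```

A script could then fail early with something like HBASE_CLI=$(find_hbase) || exit 1 instead of invoking an empty command name.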

This was a primer on how you can use the HBase shell to create scripts that make it easy to do janitorial tasks on your HBase deployment. The HBase shell isn’t something you’ll use as your primary access method to HBase; it’s not meant to have an entire application built on top of it. It’s an application itself that has been built on top of JRuby, which we study next.

6.2 Programming the HBase shell using JRuby

The HBase shell provides a convenient interactive environment and is sufficient for many simple administrative tasks. But it can become tedious for more complex operations. As we mentioned in the previous section, the HBase shell is implemented in JRuby (see footnote 1). Behind the scenes is a nice library exposing the HBase client to JRuby. You can access that library in your own scripts to create increasingly complex automation over HBase. In this example, you’ll build a tool for interacting with the TwitBase users table, similar to the UsersTool you wrote in Java. This will give you a feel for interacting with HBase from JRuby.

Programming HBase via this JRuby interface is one step above the shell in terms of sophistication. If you find yourself writing complex shell scripts, a JRuby application may be a preferable approach. If for whatever reason you need to use the C implementation of Ruby instead of JRuby, you’ll want to explore Thrift. We demonstrate using Thrift from Python later in this chapter; using it from Ruby is similar.

You can find the completed TwitBase.jrb script from this section in the TwitBase project source at https://github.com/hbaseinaction/twitbase/blob/master/bin/TwitBase.jrb.

6.2.1 Preparing the HBase shell

The easiest way to launch your own JRuby applications is through the existing HBase shell. If you haven’t already done so, locate the shell by following the instructions at the beginning of the previous section.

1 JRuby is the Ruby programming language implemented on top of the JVM. Learn more at http://jruby.org/.


Once you’ve found the hbase command, you can use that as the interpreter for your own scripts. This is particularly useful because it handles importing the necessary libraries and instantiates all the classes you’ll need. To get started, create a script to list the tables. Call it TwitBase.jrb:

def list_tables()
  @hbase.admin(@formatter).list.each do |t|
    puts t
  end
end

list_tables
exit

The variables @hbase and @formatter are two instances created for you by the shell. They’re part of that JRuby API you’re about to take advantage of. Now give the script a try:

$ $HBASE_HOME/bin/hbase shell ./TwitBase.jrb
followers
twits
users
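If you run scripts like this often, a small wrapper can save typing. The sketch below is an illustration only: run_jrb is a hypothetical function, and HBASE_CLI is assumed to hold the path to the hbase command, defaulting to hbase on your path. It checks that the script is readable before handing it to the HBase shell.

```shell
#!/bin/sh
# run_jrb is a hypothetical wrapper: it runs a JRuby script through the
# HBase shell. HBASE_CLI (an assumed variable) may override the command.
run_jrb() {
    script=$1
    if [ ! -r "$script" ]; then
        echo "cannot read script: $script" >&2
        return 1
    fi
    "${HBASE_CLI:-hbase}" shell "$script"
}
```

For example, run_jrb ./TwitBase.jrb behaves like the command shown above, and setting HBASE_CLI=echo turns the call into a dry run that only prints the arguments.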

With everything in place, let’s start working with TwitBase.
