This is the configuration with which I successfully connected to Cassandra through Pig.
==========================================================================================
core-site.xml on HADOOP
==========================================================================================
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/lib64/hadoop/tmp</value>
</property>
<property>
<name>cassandra.thrift.address</name>
<value>localhost</value>
</property>
<property>
<name>cassandra.thrift.port</name>
<value>9160</value>
</property>
<property>
<name>cassandra.partitioner.class</name>
<value>org.apache.cassandra.dht.RandomPartitioner</value>
</property>
</configuration>
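
To confirm which Cassandra endpoint a given core-site.xml actually points at, the value of a property can be read back from the file. The following is only a minimal sketch: `get_hadoop_property` is a hypothetical helper name, not part of Hadoop, and it assumes each <name> and <value> element sits on its own line, as in the file above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): print the <value> that follows
# a given <name> in a Hadoop-style XML configuration file.
# Assumption: every <name> and <value> element is on its own line.
get_hadoop_property() {
    file=$1
    name=$2
    awk -v want="$name" '
        # Remember that the requested <name> line has been seen.
        index($0, "<name>" want "</name>") { found = 1; next }
        # The first <value> line after it holds the setting.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$file"
}
```

For example, `get_hadoop_property "$HADOOP_HOME/conf/core-site.xml" cassandra.thrift.port` should print the port configured above. A property that is not present prints nothing.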
==========================================================================================
==========================================================================================
system environment (/etc/profile) for HADOOP, PIG and CASSANDRA
==========================================================================================
# Java
export JAVA_HOME=/usr/lib64/jvm/java-6-sun
# Cassandra (defined first because HADOOP_CLASSPATH below refers to CASSANDRA_LIB)
export cassandra_home=/usr/share/cassandra
export CASSANDRA_HOME=$cassandra_home
export CASSANDRA_LIB=/usr/share/cassandra
export CASSANDRA_CLASSPATH=$cassandra_home
# Hadoop (all configuration variables point at the same conf directory)
export HADOOP_HOME=/usr/lib64/hadoop
export HADOOPDIR=$HADOOP_HOME/conf
export HADOOP_CONF_DIR=$HADOOPDIR
export HADOOP_CONF=$HADOOPDIR
export HADOOP_CLASSPATH=$HADOOP_CONF:$CASSANDRA_LIB
# Pig (reuses the Hadoop conf directory)
export PIG_HOME=/usr/lib64/pig
export PIG_CLASSPATH=$HADOOPDIR
export PIG_CONF=$HADOOPDIR
export PIG_CONF_DIR=$HADOOPDIR
# Cassandra Thrift endpoint, same host and port as in core-site.xml
export PIG_RPC_PORT=9160
export PIG_INITIAL_ADDRESS=localhost
export PATH=$PATH:$HADOOP_HOME/bin:$PIG_HOME/bin
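
Because /etc/profile is read from top to bottom, a variable that is referenced before it is assigned expands to an empty string without any warning. The sketch below is one way to check, after logging in again, that none of the variables ended up empty. `report_unset_vars` is a hypothetical helper name introduced only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every listed variable
# that is unset or empty in the current shell.
report_unset_vars() {
    for var in "$@"; do
        # Indirect lookup: copy the value of the variable named in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var"
        fi
    done
}
```

A call such as `report_unset_vars JAVA_HOME HADOOP_HOME HADOOP_CLASSPATH PIG_HOME CASSANDRA_HOME CASSANDRA_LIB` prints nothing when every variable has a value.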
================================================================================================
================================================================================================
HADOOP_CLASSPATH
================================================================================================
A configuration note on enabling the task trackers to run queries over Cassandra: update HADOOP_CLASSPATH in <hadoop>/conf/hadoop-env.sh so that it includes the Cassandra libraries. For example, put something like this in hadoop-env.sh on each of your task trackers:
export HADOOP_CLASSPATH=/opt/cassandra/lib/*:$HADOOP_CLASSPATH
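
If hadoop-env.sh may be sourced more than once, the export above adds the same entry again each time, and it leaves a trailing colon when HADOOP_CLASSPATH starts out empty. One possible way around both, shown only as a sketch, is a small helper that prepends the entry when it is missing. `prepend_classpath` is a hypothetical name, and /opt/cassandra/lib is the example path from the line above, not necessarily where Cassandra is installed.

```shell
#!/bin/sh
# Hypothetical helper: print a colon-separated classpath with the given
# entry placed in front, unless the entry is already part of it.
prepend_classpath() {
    entry=$1
    current=$2
    case ":$current:" in
        # Entry already present: return the classpath unchanged.
        *":$entry:"*) printf '%s\n' "$current" ;;
        *)
            if [ -n "$current" ]; then
                printf '%s\n' "$entry:$current"
            else
                # Empty classpath: avoid producing a trailing colon.
                printf '%s\n' "$entry"
            fi
            ;;
    esac
}
```

In hadoop-env.sh this could be used as `HADOOP_CLASSPATH=$(prepend_classpath '/opt/cassandra/lib/*' "$HADOOP_CLASSPATH")` followed by `export HADOOP_CLASSPATH`. The single quotes keep the shell from expanding the `*`, so the wildcard reaches the JVM unchanged.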
================================================================================================
To connect Hadoop to Cassandra without using Pig, some of these settings are not needed.
If you do not access Cassandra directly from Pig, I expect it can be accessed directly from Java using a driver.