This is the configuration with which I successfully connected to Cassandra through Pig.

==========================================================================================
core-site.xml on HADOOP
==========================================================================================
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/lib64/hadoop/tmp</value>
  </property>
  <property>
    <name>cassandra.thrift.address</name>
    <value>localhost</value>
  </property>
  <property>
    <name>cassandra.thrift.port</name>
    <value>9160</value>
  </property>
  <property>
    <name>cassandra.partitioner.class</name>
    <value>org.apache.cassandra.dht.RandomPartitioner</value>
  </property>
</configuration>
==========================================================================================
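To double-check which Thrift address and port Hadoop will actually pick up, it can help to read the values straight back out of core-site.xml. The following is only a minimal sketch: `get_prop` is a hypothetical helper name (not part of Hadoop or Pig), and it assumes each `<name>` and `<value>` element sits on its own line, as in the file above.

```shell
#!/bin/sh
# get_prop FILE NAME
# Print the <value> that follows <name>NAME</name> in a Hadoop-style XML file.
# Hypothetical helper; assumes one <name> or <value> element per line.
get_prop() {
  awk -v want="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>/, "", n)      # drop everything up to the opening tag
      sub(/<\/name>.*/, "", n)    # drop the closing tag and anything after it
      found = (n == want)
    }
    /<value>/ && found {
      v = $0
      sub(/.*<value>/, "", v)
      sub(/<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$1"
}
```

For example, `get_prop "$HADOOP_HOME/conf/core-site.xml" cassandra.thrift.port` should print the port configured above. It prints nothing when the property is missing.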

==========================================================================================
system environment(/etc/profile) for HADOOP, PIG and CASSANDRA
==========================================================================================
export JAVA_HOME=/usr/lib64/jvm/java-6-sun

# Cassandra paths come first because HADOOP_CLASSPATH below uses CASSANDRA_LIB
export cassandra_home=/usr/share/cassandra
export CASSANDRA_HOME=$cassandra_home
export CASSANDRA_LIB=/usr/share/cassandra
export CASSANDRA_CLASSPATH=$cassandra_home

export HADOOP_HOME=/usr/lib64/hadoop
export HADOOPDIR=$HADOOP_HOME/conf
export HADOOP_CONF_DIR=$HADOOPDIR
export HADOOP_CONF=$HADOOPDIR
export HADOOP_CLASSPATH=$HADOOP_CONF:$CASSANDRA_LIB

# Pig reuses the Hadoop conf directory; the port and address point at Cassandra's Thrift endpoint
export PIG_HOME=/usr/lib64/pig
export PIG_CLASSPATH=$HADOOPDIR
export PIG_CONF=$HADOOPDIR
export PIG_CONF_DIR=$HADOOPDIR
export PIG_RPC_PORT=9160
export PIG_INITIAL_ADDRESS=localhost

export PATH=$PATH:$HADOOP_HOME/bin:$PIG_HOME/bin
================================================================================================
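With this many variables it is easy to leave one unset or pointing at the wrong place, so a quick sanity check after sourcing /etc/profile may be useful. The sketch below is just an illustration: `check_env_dirs` is a hypothetical helper name, and it only checks that each named variable is set and refers to an existing directory.

```shell
#!/bin/sh
# check_env_dirs VAR...
# For each variable name, report if it is unset/empty or not a directory.
# Returns non-zero when at least one check fails. Hypothetical helper.
check_env_dirs() {
  status=0
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name=$value is not a directory"
      status=1
    fi
  done
  return $status
}
```

After `. /etc/profile`, a call such as `check_env_dirs JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR PIG_HOME CASSANDRA_HOME` prints one line per problem and stays silent when everything resolves.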

================================================================================================
HADOOP_CLASSPATH
================================================================================================

One configuration note on getting the task trackers to perform queries over Cassandra: update HADOOP_CLASSPATH in your <hadoop>/conf/hadoop-env.sh to include the Cassandra libraries. For example, put something like this in the hadoop-env.sh on each of your task trackers:

export HADOOP_CLASSPATH=/opt/cassandra/lib/*:$HADOOP_CLASSPATH

================================================================================================
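Since the same line has to go into hadoop-env.sh on every task tracker, one way to apply it repeatedly without creating duplicates is a small idempotent append. This is a sketch under assumptions: `add_cassandra_classpath` is a hypothetical helper name, and the file and library paths are whatever your installation uses.

```shell
#!/bin/sh
# add_cassandra_classpath ENV_FILE LIB_DIR
# Append the HADOOP_CLASSPATH export to ENV_FILE unless the identical
# line is already there. Hypothetical helper, not part of Hadoop.
add_cassandra_classpath() {
  # \$ keeps $HADOOP_CLASSPATH literal so it is expanded when hadoop-env.sh runs
  line="export HADOOP_CLASSPATH=$2/*:\$HADOOP_CLASSPATH"
  if grep -qxF -- "$line" "$1" 2>/dev/null; then
    echo "already present"
  else
    printf '%s\n' "$line" >> "$1"
    echo "added"
  fi
}
```

For instance, `add_cassandra_classpath /usr/lib64/hadoop/conf/hadoop-env.sh /opt/cassandra/lib` writes the export line shown above on the first run and leaves the file untouched on later runs.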

If you connect Hadoop to Cassandra without Pig, some of these settings are not needed.
If you do not go through Pig, I believe you can also access Cassandra directly from Java using a driver.
