Custom UDFs

  • Let's first use a built-in UDF. On the Hive shell, run:

SELECT
  LOWER(origin)
FROM
  flight_data
LIMIT 10;

  • Now, let's write our own UDF. We need Maven to build the code, so let's install it first. On the bash shell, type:

cd ~
wget http://apache.mirrors.lucidnetworks.net/maven/maven-3/3.1.1/binaries/apache-maven-3.1.1-bin.tar.gz
tar -xzvf apache-maven-3.1.1-bin.tar.gz
export PATH=$PATH:$(pwd)/apache-maven-3.1.1/bin

  • Normally, this is where you would write the UDF code. For now, let's just download it. On the bash shell, clone this repo and build it with Maven:

cd ~
git clone https://github.com/markgrover/hive-translate.git
cd hive-translate
mvn clean package
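
  • For a sense of what a translate-style UDF computes, here is a minimal stand-alone sketch of the character-mapping logic. It is not the code from the hive-translate repo: the class name `TranslateSketch` is hypothetical, and the semantics (each character of the input found in `from` is replaced by the character at the same position in `to`, and dropped if `to` is too short) are an assumption modeled on PostgreSQL-style `translate`. A real Hive UDF would wrap logic like this in a class that extends Hive's `GenericUDF`.

```java
// Hypothetical sketch; NOT the implementation from the hive-translate repo.
class TranslateSketch {

    // Assumed semantics: every character of `input` that occurs in `from`
    // is replaced by the character at the same index in `to`. If `from` is
    // longer than `to`, the unmatched characters are removed from the input.
    // Works on UTF-16 chars, so supplementary code points are not handled.
    static String translate(String input, String from, String to) {
        if (input == null || from == null || to == null) {
            return null; // SQL-style behavior: NULL in, NULL out
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            int pos = from.indexOf(c); // first occurrence in `from` wins
            if (pos < 0) {
                out.append(c);              // not in `from`: keep as-is
            } else if (pos < to.length()) {
                out.append(to.charAt(pos)); // mapped character
            }
            // else: `to` has no counterpart, so the character is dropped
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("abc", "a", "A")); // prints "Abc"
    }
}
```

Under these assumptions, the three-argument call used later in this walkthrough would map to `translate("abc", "a", "A")`.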

  • Now, let's use the UDF we just compiled. Start the Hive shell from the hive-translate directory so that the relative JAR path below resolves, then type the following to register the UDF with Hive:

ADD JAR target/translate-udf-0.0.1-SNAPSHOT.jar;
CREATE TEMPORARY FUNCTION my_translate AS 'org.mgrover.hive.translate.GenericUDFTranslate';

  • On the Hive shell, type the following to use the newly created UDF:

SELECT
   my_translate('abc','a','A')
FROM
   flight_data
LIMIT 10;