Writing User Defined Functions For Pig
If you are processing a bunch of data (grouping it, joining it, filtering it), then you should probably be using Pig.
So go download that, and get it all set up. You need:
Java 1.6 (with JAVA_HOME set)
Hadoop (with HADOOP_HOME set)
Pig (of course)
Put the relevant bin directories in your PATH too.
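If it helps, here is a rough sketch of what that setup could look like in a shell profile. The install paths are assumptions (adjust them to wherever you unpacked things), and PIG_INSTALL is a made-up convenience variable for this sketch, not something Pig requires.

```shell
# Assumed install locations; change these to match your machine.
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
export HADOOP_HOME=/opt/hadoop
# PIG_INSTALL is a hypothetical helper variable used only in this sketch.
export PIG_INSTALL=/opt/pig

# Put the java, hadoop and pig launchers on the PATH.
export PATH="$JAVA_HOME/bin:$HADOOP_HOME/bin:$PIG_INSTALL/bin:$PATH"
```

Once that is loaded, running `java -version`, `hadoop version` and `pig -version` is a quick way to check that each tool resolves from the PATH.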
Pig 101
So here’s a simple Pig script.
This registers a jar file and defines a custom UDF for doing whatever. It happens to be a log line p...