After several failures in Sahara UI environment, I just attempted to set up Pig on Mater
Node (VM)
following
https://pig.apache.org/docs/r0.13.0/start.html#build ( current release 0.15)
Downloaded Pig on Master VM and configured it, so that I could run
simple pig script like (filtering "ERROR" line from sahara-engine.log) :-
messages = LOAD './input.txt';
out = FILTER messages BY $0 MATCHES '^+.*ERROR+.*';
STORE out INTO 'output';
in mapreduce mode :
pig -x mapreduce prg1.pig
with no errors.
Folder output got created in hadoop home directory and contained file with correct
extraction been done.
Then I configured Sahara Data sources input and output as follows:-
demo.sahara/input.txt ( previously uploading input.txt to swift container demo of AIO RDO
Liberty running Hadoop VMs )
demo.sahara/output .
Created job-binary-template uploading prg1.pig
messages = LOAD '$INPUT';
out = FILTER messages BY $0 MATCHES '^+.*ERROR+.*';
STORE out INTO '$OUPUT';
and got on second worker-node java trace :-
Configurator.java:842)
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
at
org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
at org.apache.log4j.Logger.getLogger(Logger.java:104)
at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:262)
at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:108)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1025)
at
org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:844)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:541)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:292)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:269)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:657)
at org.apache.hadoop.service.AbstractService.<clinit>(AbstractService.java:43)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.PigMain], main() threw
exception, begin > end in range (begin, end): (1455138266308, 1455138263146)
java.lang.IllegalArgumentException: begin > end in range (begin, end): (1455138266308,
1455138263146)
at
org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetApplicationsRequestPBImpl.setStartRange(GetApplicationsRequestPBImpl.java:340)
at
org.apache.oozie.action.hadoop.LauncherMainHadoopUtils.getChildYarnJobs(LauncherMainHadoopUtils.java:68)
at
org.apache.oozie.action.hadoop.LauncherMainHadoopUtils.killChildYarnJobs(LauncherMainHadoopUtils.java:88)
at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:216)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:380)
at
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:301)
at
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:187)
at
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:230)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
log4j:WARN No appenders could be found for logger
(org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See
http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Either I have wrong understanding how set up Pig on Hadoop Cluster or doing mistakes in
Sahara GUI environment.
One more thing confusing me Pig should be installed during Sahara's Hadoop Cluster
generating, it is not a part of
of Hadoop Cluster ( no matter which plugin has been used ), how Sahara suggests to set up
Pig Jobs if there is no
any Pig on cluster's VMs. I am missing something here.
Please, advise.
Boris.