Tuesday, January 20, 2015

Create a dynamic output directory from a reducer in MapReduce

Using the MultipleOutputs class, a reducer can write into output directories whose names are decided at run time.
For example, it can write files under /<output path>/<system date>/ that use the named output as the file-name prefix.

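In the main driver, the actual output path is set as usual; in addition, every named output that the reducer writes to must be registered:
MultipleOutputs.addNamedOutput(job, "Combined", TextOutputFormat.class, Text.class, Text.class);
MultipleOutputs.addNamedOutput(job, "UnProcessed", TextOutputFormat.class, Text.class, Text.class);
// "Dummy" is the named output used by the reducer below; writing to an unregistered name fails at run time.
MultipleOutputs.addNamedOutput(job, "Dummy", TextOutputFormat.class, Text.class, Text.class);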

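Reducer class:
import java.io.IOException;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// The class name is illustrative; it must differ from the Hadoop Reducer base class it extends.
// TextArrayWritable is not defined in this post; it is assumed to be an ArrayWritable subclass holding Text values.
public class DynamicOutputReducer extends Reducer<Text, TextArrayWritable, Text, Text> {
    private MultipleOutputs<Text, Text> mos;
    private String currentDate = "";

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<Text, Text>(context);
        // Get the current date and time; it becomes the output sub-directory name.
        DateFormat dateFormat = new SimpleDateFormat("yyyyMMdd-HHmmss");
        currentDate = dateFormat.format(new Date());
    }

    @Override
    public void reduce(Text key, Iterable<TextArrayWritable> values,
            Context context) throws IOException, InterruptedException {
        ArrayList<Text[]> sortedList = new ArrayList<Text[]>();
        // Divided into MsgReq and MsgResponse lists
        for (TextArrayWritable value : values) {
            Text[] createList = (Text[]) value.toArray();
            if (!createList[0].toString().contains("0000000000000000")) {
                mos.write("Dummy", key, new Text(createList[0].toString()), currentDate + "/Dummy");
            }
            sortedList.add(createList);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Close MultipleOutputs so that all named-output files are flushed.
        mos.close();
    }
}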

Here:
mos.write("Dummy", key, new Text(createList[0].toString()), currentDate + "/Dummy");
In the above statement:
First argument: the named output, "Dummy"
Second argument: the key, key
Third argument: the value, new Text(createList[0].toString())
Fourth argument: the baseOutputPath, currentDate + "/Dummy", i.e. <sysdate> + "/" + <NamedOutput>
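
The fourth argument is just a string, so it can be built and checked outside Hadoop. Below is a minimal sketch in plain Java; OutputPaths, dateDir and baseOutputPath are hypothetical names used only for illustration and are not part of the Hadoop API. The date is passed in as a parameter because setup() runs once per reduce task, so calling new Date() there may give different tasks different directory names; one option is to compute the timestamp once in the driver and hand it to the reducers through the job configuration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper (not part of Hadoop): builds the string that is passed
// as the fourth argument (baseOutputPath) of mos.write(...).
final class OutputPaths {
    private OutputPaths() {}

    // Formats the date with the same pattern that the reducer's setup() uses.
    static String dateDir(Date date, TimeZone zone) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd-HHmmss");
        format.setTimeZone(zone);
        return format.format(date);
    }

    // Joins the date directory and the named output: <sysdate>/<NamedOutput>.
    static String baseOutputPath(String dateDir, String namedOutput) {
        return dateDir + "/" + namedOutput;
    }
}
```

In the reducer, the call would then look like mos.write("Dummy", key, value, OutputPaths.baseOutputPath(currentDate, "Dummy")).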

Result (relative to the job output path):
<sysdate>/Dummy-r-00000
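
The file name in the result follows a fixed pattern: the baseOutputPath, then "-r-" for a reduce task, then the partition number padded to five digits. The sketch below only mirrors that pattern in plain Java to show where a record ends up; OutputFileNames and reducerFile are hypothetical names, this is not Hadoop's own code, and it assumes uncompressed text output with no file extension.

```java
import java.util.Locale;

// Hypothetical helper (not part of Hadoop): mirrors the naming pattern seen in
// the result above, <output path>/<baseOutputPath>-r-<5-digit partition>.
final class OutputFileNames {
    private OutputFileNames() {}

    static String reducerFile(String outputDir, String baseOutputPath, int partition) {
        return String.format(Locale.ROOT, "%s/%s-r-%05d", outputDir, baseOutputPath, partition);
    }
}
```

For example, with an output path of /out, a baseOutputPath of 20150120-101500/Dummy and partition 0, the sketch yields /out/20150120-101500/Dummy-r-00000.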

Have a nice day...............:-)
           
