MapReduce Programs

teragen:

Generates the input data for terasort.

To run:

bin/hadoop jar <path to jar file> teragen <num rows> <output dir>

Input:

The number of rows to generate and the output directory. For example, to generate 10 rows:

bin/hadoop jar <path to jar file> teragen 10 <path to output dir>
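It is often easier to think of a run in terms of data volume than row count. The snippet below is a minimal sketch of that conversion, assuming each teragen row is 100 bytes (the fixed TeraSort record size). The variable names are illustrative only and are not part of Hadoop.

```shell
# Assumption: teragen writes fixed-size rows of 100 bytes each.
row_bytes=100

# Example target: 1 GiB of generated data.
target_bytes=$((1024 * 1024 * 1024))

# Integer division, so the result rounds down to a whole number of rows.
num_rows=$((target_bytes / row_bytes))

echo "$num_rows"   # prints 10737418
```

The printed value can then be passed as <num rows> in the teragen command above.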

Output:

The output will be stored under the specified directory.

terasort:

Sorts the data generated by teragen.

To run:

bin/hadoop jar <path to jar file> terasort <input dir> <output dir>

Input:

This program takes the data generated by the teragen program as input.

Output:

The output will be stored under the specified directory.

teravalidate:

Checks the results of terasort.

TeraValidate ensures that the output data of TeraSort is globally sorted.
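"Globally sorted" means more than each output file being sorted on its own: reading the part files one after another, in name order, must give a single sorted sequence. The following is only a local analogy using small text files, not how teravalidate is implemented (the real tool runs as a MapReduce job over 100-byte records). The helper name is_globally_sorted and the sample data are made up for this sketch.

```shell
# Hypothetical helper: succeeds if the part files in directory $1,
# concatenated in name order, form one sorted sequence of lines.
is_globally_sorted() {
    # sort -c only checks the order; it exits non-zero on the first
    # out-of-order line. LC_ALL=C gives a plain byte-wise comparison.
    cat "$1"/part-* | LC_ALL=C sort -c 2>/dev/null
}

# Sample data: two part files whose key ranges do not overlap.
demo_dir=$(mktemp -d)
printf 'apple\nbanana\n' > "$demo_dir/part-00000"
printf 'cherry\ndate\n'  > "$demo_dir/part-00001"

if is_globally_sorted "$demo_dir"; then
    echo "globally sorted"
else
    echo "not globally sorted"
fi
```

If the contents of the two files were swapped, each file would still be sorted individually, but the check would fail because the first file would then hold larger keys than the second.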

To run:

bin/hadoop jar <path to jar file> teravalidate <output dir of terasort> <terasort-validate dir>

Input:

This program takes the output of the terasort program as input.

Output:

The output will be stored under the specified directory. A checksum of the data is also generated.
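The three programs are normally run back to back, each one reading the directory written by the previous step. Below is a rough sketch of such a wrapper. Everything in it is an assumption made for illustration: the function names run and run_tera_pipeline, the DRY_RUN switch, the input/output/validate subdirectory names and the /tmp/tera base path. The jar path is left as the same placeholder used above. By default the script only prints the commands instead of executing them.

```shell
# Placeholders for this sketch; override them through the environment.
HADOOP="${HADOOP:-bin/hadoop}"
JAR="${JAR:-<path to jar file>}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = number of rows, $2 = base directory (hypothetical layout below).
run_tera_pipeline() {
    rows=$1
    base=$2
    # Each step runs only if the previous one succeeded.
    run "$HADOOP" jar "$JAR" teragen "$rows" "$base/input" &&
    run "$HADOOP" jar "$JAR" terasort "$base/input" "$base/output" &&
    run "$HADOOP" jar "$JAR" teravalidate "$base/output" "$base/validate"
}

run_tera_pipeline 10 /tmp/tera
```

Setting DRY_RUN=0 and JAR to the real jar location would execute the three jobs in order, stopping at the first one that fails.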
