Zubair Nabi edited this page Apr 2, 2015 · 10 revisions
Before you begin, create the dataset required by the InfoSphere Streams benchmark: [Create dataset for InfoSphere Streams benchmark](Create dataset for InfoSphere Streams benchmark)
Overview
The StreamsEmailBenchmark project contains the InfoSphere Streams application that processes the emails.
Prerequisites
Copy your serialized/compressed dataset (obtained using StreamsPrepareDataset) to StreamsEmailBenchmark/data
Note: The files must be numbered from zero, that is, filename0.av through filename<parallelism-1>.av.
For instance, to process two files in parallel, name them filename0.av and filename1.av.
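The naming convention can be checked before building. The following is a minimal sketch and not part of the project: the function name check_inputs and its arguments are hypothetical, and it assumes only that the files are numbered from 0 to parallelism minus 1 with the .av extension, as in the example above.

```shell
# Hypothetical helper (not part of StreamsEmailBenchmark): verify that
# <dir>/<prefix>0.av .. <dir>/<prefix><N-1>.av all exist.
# Usage: check_inputs <dir> <prefix> <parallelism>
check_inputs() {
    dir=$1
    prefix=$2
    parallelism=$3
    missing=0
    i=0
    while [ "$i" -lt "$parallelism" ]; do
        f="$dir/$prefix$i.av"
        if [ ! -f "$f" ]; then
            # Report each missing file on stderr and remember the failure.
            echo "missing: $f" >&2
            missing=1
        fi
        i=$((i + 1))
    done
    # Exit status 0 if every expected file exists, 1 otherwise.
    return "$missing"
}
```

For two files, this would be called as check_inputs StreamsEmailBenchmark/data filename 2.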
Compile
To build the application, go to the root directory of StreamsEmailBenchmark and run make all PARALLELISM=<parallelism> at the command line, where <parallelism> is the number of input files to process in parallel.
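If every .av file in the data directory should be processed, the parallelism value can be derived from the file count instead of typed by hand. This is a sketch under that assumption; count_inputs is a hypothetical helper, not something the project provides.

```shell
# Hypothetical helper: print the number of .av files in a directory.
# Usage: count_inputs <dir>
count_inputs() {
    n=0
    for f in "$1"/*.av; do
        # If nothing matches, the pattern stays literal and -f fails,
        # so an empty directory yields 0.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

From the root directory of StreamsEmailBenchmark, the build could then be started with make all PARALLELISM=$(count_inputs data).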
Execution
To run the application:
Make sure a Streams instance has been created and started.
Submit the job to the Streams instance:
streamtool submitjob output/Main/Distributed/Main.adl -P filename=<input_file_name> -P windowTime=<flush_interval_for_metrics_in_secs> -P printWindowMetrics=<yes_or_no>
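As an illustration, the submission can be wrapped in a small function. This is a sketch only: submit_benchmark and the DRY_RUN variable are hypothetical, and the values used in the usage note (filename, 10, yes) are example assumptions. The function just assembles the streamtool command shown above from its three arguments.

```shell
# Hypothetical wrapper around the submitjob command shown above.
# Usage: submit_benchmark <input_file_name> <window_secs> <yes|no>
submit_benchmark() {
    # Rebuild the positional parameters as the full command line.
    set -- streamtool submitjob output/Main/Distributed/Main.adl \
        -P "filename=$1" -P "windowTime=$2" -P "printWindowMetrics=$3"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command without contacting the Streams instance.
        echo "$*"
    else
        "$@"
    fi
}
```

Running DRY_RUN=1 submit_benchmark filename 10 yes prints the command that would be submitted, which is useful for checking the parameters before a real run.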
Results Collection
Metrics are written to stdout in standalone execution and to the logs in distributed execution.
CPU time can be obtained by inspecting the SPL graph in Streams Studio.