Working with Spring XD

Create a Stream:

In Spring XD, a basic stream defines the ingestion of event-driven data from a source to a sink, passing through any number of processors along the way. You can create a new stream by issuing a stream create command from the XD shell. Stream definitions are built from a simple DSL. For example, execute:

stream create --name ticktock --definition "time | log" --deploy

This defines a stream named ticktock based on the DSL expression time | log. The DSL uses the "pipe" symbol, |, to connect a source to a sink. The stream server finds the time and log definitions in the modules directory and uses them to set up the stream. In this simple example, the time source sends the current time as a message each second, and the log sink outputs it using the logging framework at the WARN logging level. Since the --deploy flag was provided, the stream is deployed immediately. In the console where you started the server, you will see log output similar to the following:

13:09:53,812 INFO http-bio-8080-exec-1 module.SimpleModule:109 - started module: Module [name=log, type=sink]
13:09:53,813 INFO http-bio-8080-exec-1 module.ModuleDeployer:111 - launched sink module: ticktock:log:1
13:09:53,911 INFO http-bio-8080-exec-1 module.SimpleModule:109 - started module: Module [name=time, type=source]
13:09:53,912 INFO http-bio-8080-exec-1 module.ModuleDeployer:111 - launched source module: ticktock:time:0
13:09:53,945 WARN task-scheduler-1 logger.ticktock:141 - 2013-06-11 13:09:53
13:09:54,948 WARN task-scheduler-1 logger.ticktock:141 - 2013-06-11 13:09:54
13:09:55,949 WARN task-scheduler-2 logger.ticktock:141 - 2013-06-11 13:09:55

To stop the stream and remove its definition completely, use the stream destroy command:

stream destroy --name ticktock

It is also possible to stop and restart the stream without removing its definition, using the undeploy and deploy commands. The shell supports command completion, so you can press the Tab key to see which commands and options are available.
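
For instance, assuming the ticktock stream created earlier is still defined, stopping and restarting it might look like the following sketch. The command forms mirror stream destroy; it is worth confirming them with tab completion in your version of the shell.

```
stream undeploy --name ticktock
stream deploy --name ticktock
```

Unlike stream destroy, undeploy leaves the definition registered, so the stream does not have to be created again before it is redeployed.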

Using Hadoop:

Spring XD supports the following Hadoop distributions:

hadoop22 – Apache Hadoop 2.2.0 (default)

hadoop24 – Apache Hadoop 2.4.1

phd1 – Pivotal HD 1.1

phd20 – Pivotal HD 2.0

cdh5 – Cloudera CDH 5.0.0

hdp21 – Hortonworks Data Platform 2.1

To specify the distribution libraries to use for Hadoop client connections, pass the --hadoopDistro option to the xd-container and xd-shell commands:

xd/bin>$ ./xd-shell --hadoopDistro <distribution>
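
As a concrete illustration, selecting the Apache Hadoop 2.4.1 libraries would look roughly like this. The hadoop24 value is taken from the list of supported distributions above. The container invocation is assumed to be run from the same xd/bin directory as the shell.

```
xd/bin>$ ./xd-shell --hadoopDistro hadoop24
xd/bin>$ ./xd-container --hadoopDistro hadoop24
```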

