IComponent is the interface shared by every component in a topology.
It declares two methods:

    package org.apache.storm.topology;

    import java.io.Serializable;
    import java.util.Map;

    /**
     * Common methods for all possible components in a topology. This interface is used
     * when defining topologies using the Java API.
     */
    public interface IComponent extends Serializable {

        /**
         * Declare the output schema for all the streams of this topology.
         *
         * @param declarer this is used to declare output stream ids, output fields, and whether or not each output stream is a direct stream
         */
        void declareOutputFields(OutputFieldsDeclarer declarer);

        /**
         * Declare configuration specific to this component. Only a subset of the "topology.*" configs can
         * be overridden. The component configuration can be further overridden when constructing the
         * topology using {@link TopologyBuilder}
         *
         */
        Map<String, Object> getComponentConfiguration();

    }

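ISpout is the core interface for implementing spouts. A spout is responsible for feeding messages into the topology for processing. For every tuple a spout emits, Storm tracks the directed acyclic graph (DAG) of tuples generated from it. When Storm detects that every tuple in that DAG has been processed successfully, it sends an ack message to the spout.
If a tuple is not fully processed within the configured timeout, Storm sends a fail message to the spout.
When a spout emits a tuple, it can tag the tuple with a message id, which can be of any type. When Storm acks or fails a message, it passes the same message id back to the spout to identify which tuple it refers to. If the spout omits the message id or sets it to null, Storm does not track the message, and the spout receives no ack or fail callback for it.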
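The bookkeeping that message ids make possible can be sketched without any Storm classes. The class below is a hypothetical stand-in (`PendingTrackerSketch` and all of its method names are invented for illustration): it remembers each emitted message under its id, forgets it on ack, and re-queues it on fail. In a real spout the emit step would be a call such as `SpoutOutputCollector.emit(tuple, msgId)`.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in, not a Storm class: models how a reliable spout
// might track emitted messages by message id.
class PendingTrackerSketch {
    private final Queue<String> toSend = new ArrayDeque<>();
    private final Map<Object, String> pending = new HashMap<>();
    private long nextId = 0;

    /** Adds a message to the outgoing queue. */
    void offer(String message) {
        toSend.add(message);
    }

    /** Models nextTuple(): returns the message id used, or null if nothing was emitted. */
    Object emitNext() {
        String message = toSend.poll();
        if (message == null) {
            return null;
        }
        Long msgId = nextId++;
        pending.put(msgId, message); // remember the message until it is acked or failed
        // A real spout would emit here, passing msgId as the message id.
        return msgId;
    }

    /** Models ack(msgId): the message is fully processed, so forget it. */
    void ack(Object msgId) {
        pending.remove(msgId);
    }

    /** Models fail(msgId): put the message back on the queue for replay. */
    void fail(Object msgId) {
        String message = pending.remove(msgId);
        if (message != null) {
            toSend.add(message);
        }
    }

    int pendingCount() {
        return pending.size();
    }

    int queuedCount() {
        return toSend.size();
    }
}
```

Emitting without a message id would correspond to skipping the `pending.put` step entirely: nothing is tracked, and neither callback ever fires for that message.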
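Storm executes ack(), fail(), and nextTuple() on the same thread. This means an ISpout implementation does not need to worry about concurrency between these methods. It also means the implementation must keep nextTuple() non-blocking; otherwise nextTuple() can block ack() and fail() calls that are waiting to be processed.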
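A minimal sketch of a non-blocking nextTuple(), again using a hypothetical stand-in class rather than Storm's API: the source is read with `poll()`, which returns immediately when the queue is empty, instead of a blocking call such as `BlockingQueue.take()`. The one-millisecond sleep when idle is an assumption about a reasonable back-off, chosen to avoid spinning the CPU.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in, not a Storm class: shows the non-blocking shape
// that a nextTuple() implementation needs.
class NonBlockingSourceSketch {
    private final ConcurrentLinkedQueue<String> inbox = new ConcurrentLinkedQueue<>();
    final List<String> emitted = new ArrayList<>();

    /** Called by some producer thread to hand a message to the spout. */
    void receive(String message) {
        inbox.add(message);
    }

    /** Models nextTuple(): returns true if a message was emitted. */
    boolean nextTuple() {
        String message = inbox.poll(); // returns null right away when empty
        if (message == null) {
            try {
                Thread.sleep(1); // brief pause so an idle loop does not burn CPU
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return false; // return promptly so ack/fail can run
        }
        emitted.add(message); // a real spout would emit through its collector here
        return true;
    }
}
```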
ISpout declares the following methods:

    package org.apache.storm.spout;

    import org.apache.storm.task.TopologyContext;
    import java.util.Map;
    import java.io.Serializable;

    /**
     * ISpout is the core interface for implementing spouts. A Spout is responsible
     * for feeding messages into the topology for processing. For every tuple emitted by
     * a spout, Storm will track the (potentially very large) DAG of tuples generated
     * based on a tuple emitted by the spout. When Storm detects that every tuple in
     * that DAG has been successfully processed, it will send an ack message to the Spout.
     *
     * If a tuple fails to be fully processed within the configured timeout for the
     * topology (see {@link org.apache.storm.Config}), Storm will send a fail message to the spout
     * for the message.
     *
     * When a Spout emits a tuple, it can tag the tuple with a message id. The message id
     * can be any type. When Storm acks or fails a message, it will pass back to the
     * spout the same message id to identify which tuple it's referring to. If the spout leaves out
     * the message id, or sets it to null, then Storm will not track the message and the spout
     * will not receive any ack or fail callbacks for the message.
     *
     * Storm executes ack, fail, and nextTuple all on the same thread. This means that an implementor
     * of an ISpout does not need to worry about concurrency issues between those methods. However, it
     * also means that an implementor must ensure that nextTuple is non-blocking: otherwise
     * the method could block acks and fails that are pending to be processed.
     */
    public interface ISpout extends Serializable {

        /**
         * Called when a task for this component is initialized within a worker on the cluster.
         * It provides the spout with the environment in which the spout executes.
         *
         * This includes the:
         *
         * @param conf The Storm configuration for this spout. This is the configuration provided to the topology merged in with cluster configuration on this machine.
         * @param context This object can be used to get information about this task's place within the topology, including the task id and component id of this task, input and output information, etc.
         * @param collector The collector is used to emit tuples from this spout. Tuples can be emitted at any time, including the open and close methods. The collector is thread-safe and should be saved as an instance variable of this spout object.
         */
        void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector);

        /**
         * Called when an ISpout is going to be shutdown. There is no guarentee that close
         * will be called, because the supervisor kill -9's worker processes on the cluster.
         *
         * The one context where close is guaranteed to be called is a topology is
         * killed when running Storm in local mode.
         */
        void close();

        /**
         * Called when a spout has been activated out of a deactivated mode.
         * nextTuple will be called on this spout soon. A spout can become activated
         * after having been deactivated when the topology is manipulated using the
         * `storm` client.
         */
        void activate();

        /**
         * Called when a spout has been deactivated. nextTuple will not be called while
         * a spout is deactivated. The spout may or may not be reactivated in the future.
         */
        void deactivate();

        /**
         * When this method is called, Storm is requesting that the Spout emit tuples to the
         * output collector. This method should be non-blocking, so if the Spout has no tuples
         * to emit, this method should return. nextTuple, ack, and fail are all called in a tight
         * loop in a single thread in the spout task. When there are no tuples to emit, it is courteous
         * to have nextTuple sleep for a short amount of time (like a single millisecond)
         * so as not to waste too much CPU.
         */
        void nextTuple();

        /**
         * Storm has determined that the tuple emitted by this spout with the msgId identifier
         * has been fully processed. Typically, an implementation of this method will take that
         * message off the queue and prevent it from being replayed.
         */
        void ack(Object msgId);

        /**
         * The tuple emitted by this spout with the msgId identifier has failed to be
         * fully processed. Typically, an implementation of this method will put that
         * message back on the queue to be replayed at a later time.
         */
        void fail(Object msgId);

    }

IRichSpout extends both the ISpout and IComponent interfaces.

    package org.apache.storm.topology;

    import org.apache.storm.spout.ISpout;

    /**
     * When writing topologies using Java, {@link IRichBolt} and {@link IRichSpout} are the main interfaces
     * to use to implement components of the topology.
     *
     */
    public interface IRichSpout extends ISpout, IComponent {

    }

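IBolt is the core interface for implementing bolts. An IBolt represents a component that takes tuples as input and produces tuples as output. It can do anything from filtering and joining to functions and aggregations. An IBolt does not have to process a tuple immediately; it may hold onto tuples and process them later.
A bolt's lifecycle is as follows: the IBolt object is created on the client machine, serialized into the topology (using Java serialization), and submitted to the cluster's master node (Nimbus). Supervisors then launch worker processes, which deserialize the object, call prepare on it, and start processing tuples.
To parameterize an IBolt, set the parameters through its constructor and save them as instance variables. The instance variables are then serialized and shipped to every task that executes this bolt across the cluster. When defining bolts in Java, use the IRichBolt interface, which adds the methods required by the Java TopologyBuilder API.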
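The lifecycle and parameterization rule can be imitated with plain Java serialization. In this sketch every name (`ThresholdFilterSketch`, `LifecycleSketch`, `shipToWorker`) is hypothetical and no Storm class is involved: the constructor parameter travels with the serialized object, while runtime state is marked `transient` and built in a prepare step after deserialization, which mirrors where Storm calls prepare.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a parameterized bolt; not a Storm class.
class ThresholdFilterSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int threshold;                   // set on the client, serialized
    private transient Map<String, Integer> counts; // runtime state, not serialized

    ThresholdFilterSketch(int threshold) {
        this.threshold = threshold;
    }

    /** Models prepare(): build runtime state on the worker. */
    void prepare() {
        counts = new HashMap<>();
    }

    /** Models execute(): true once a word has been seen at least threshold times. */
    boolean execute(String word) {
        int seen = counts.merge(word, 1, Integer::sum);
        return seen >= threshold;
    }

    boolean isPrepared() {
        return counts != null;
    }
}

// Imitates submission: serialize on the "client", deserialize on the "worker".
class LifecycleSketch {
    static ThresholdFilterSketch shipToWorker(ThresholdFilterSketch bolt) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(bolt);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ThresholdFilterSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}
```

The design point is that anything created in the constructor must itself be serializable, so non-serializable resources (connections, caches, and the like) are better created in prepare.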
IBolt declares the following methods:

    package org.apache.storm.task;

    import org.apache.storm.tuple.Tuple;
    import java.util.Map;
    import java.io.Serializable;

    /**
     * An IBolt represents a component that takes tuples as input and produces tuples
     * as output. An IBolt can do everything from filtering to joining to functions
     * to aggregations. It does not have to process a tuple immediately and may
     * hold onto tuples to process later.
     *
     * A bolt's lifecycle is as follows:
     *
     * IBolt object created on client machine. The IBolt is serialized into the topology
     * (using Java serialization) and submitted to the master machine of the cluster (Nimbus).
     * Nimbus then launches workers which deserialize the object, call prepare on it, and then
     * start processing tuples.
     *
     * If you want to parameterize an IBolt, you should set the parameters through its
     * constructor and save the parameterization state as instance variables (which will
     * then get serialized and shipped to every task executing this bolt across the cluster).
     *
     * When defining bolts in Java, you should use the IRichBolt interface which adds
     * necessary methods for using the Java TopologyBuilder API.
     */
    public interface IBolt extends Serializable {

        /**
         * Called when a task for this component is initialized within a worker on the cluster.
         * It provides the bolt with the environment in which the bolt executes.
         *
         * This includes the:
         *
         * @param topoConf The Storm configuration for this bolt. This is the configuration provided to the topology merged in with cluster configuration on this machine.
         * @param context This object can be used to get information about this task's place within the topology, including the task id and component id of this task, input and output information, etc.
         * @param collector The collector is used to emit tuples from this bolt. Tuples can be emitted at any time, including the prepare and cleanup methods. The collector is thread-safe and should be saved as an instance variable of this bolt object.
         */
        void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector);

        /**
         * Process a single tuple of input. The Tuple object contains metadata on it
         * about which component/stream/task it came from. The values of the Tuple can
         * be accessed using Tuple#getValue. The IBolt does not have to process the Tuple
         * immediately. It is perfectly fine to hang onto a tuple and process it later
         * (for instance, to do an aggregation or join).
         *
         * Tuples should be emitted using the OutputCollector provided through the prepare method.
         * It is required that all input tuples are acked or failed at some point using the OutputCollector.
         * Otherwise, Storm will be unable to determine when tuples coming off the spouts
         * have been completed.
         *
         * For the common case of acking an input tuple at the end of the execute method,
         * see IBasicBolt which automates this.
         *
         * @param input The input tuple to be processed.
         */
        void execute(Tuple input);

        /**
         * Called when an IBolt is going to be shutdown. There is no guarentee that cleanup
         * will be called, because the supervisor kill -9's worker processes on the cluster.
         *
         * The one context where cleanup is guaranteed to be called is when a topology
         * is killed when running Storm in local mode.
         */
        void cleanup();

    }

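The execute contract above allows a bolt to hold tuples and ack them later, as long as every input is eventually acked or failed. A rough sketch of that pattern follows, with integers standing in for tuples and lists standing in for the collector; `BatchingSumSketch` and its fields are hypothetical and not part of Storm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not a Storm class: holds inputs until a batch is
// full, then emits one aggregate and acks every held input.
class BatchingSumSketch {
    private final int batchSize;
    private final List<Integer> held = new ArrayList<>();
    final List<Integer> emitted = new ArrayList<>(); // stands in for collector.emit
    final List<Integer> acked = new ArrayList<>();   // stands in for collector.ack

    BatchingSumSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Models execute(): buffers the input and flushes when the batch is full. */
    void execute(int input) {
        held.add(input);
        if (held.size() < batchSize) {
            return; // keep holding; nothing is acked yet
        }
        int sum = 0;
        for (int value : held) {
            sum += value;
        }
        emitted.add(sum);    // one output for the whole batch
        acked.addAll(held);  // every held input is acked exactly once
        held.clear();
    }

    int heldCount() {
        return held.size();
    }
}
```

Because unacked tuples count against the topology's tuple timeout, a bolt that holds inputs this way would presumably also need a time-based flush so that a partially filled batch is not failed and replayed.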
IRichBolt extends both the IBolt and IComponent interfaces.

    package org.apache.storm.topology;

    import org.apache.storm.task.IBolt;

    /**
     * When writing topologies using Java, {@link IRichBolt} and {@link IRichSpout} are the main interfaces
     * to use to implement components of the topology.
     *
     */
    public interface IRichBolt extends IBolt, IComponent {

    }

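IBasicBolt extends the IComponent interface.
IBasicBolt has methods with the same names as IRichBolt's (prepare, execute, and cleanup), but acking is handled automatically for IBasicBolt's execute method.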

    package org.apache.storm.topology;

    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.tuple.Tuple;
    import java.util.Map;

    public interface IBasicBolt extends IComponent {

        void prepare(Map<String, Object> topoConf, TopologyContext context);

        /**
         * Process the input tuple and optionally emit new tuples based on the input tuple.
         *
         * All acking is managed for you. Throw a FailedException if you want to fail the tuple.
         */
        void execute(Tuple input, BasicOutputCollector collector);

        void cleanup();

    }

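What "all acking is managed for you" amounts to can be sketched as a wrapper around the user's execute logic: ack when it returns normally, fail when it throws the designated exception. The types below are hypothetical stand-ins (`FailedTupleSketch` plays the role of Storm's FailedException, and strings play the role of tuples); this is an illustration of the idea, not Storm's actual wrapper code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Storm's FailedException.
class FailedTupleSketch extends RuntimeException {
    FailedTupleSketch(String message) {
        super(message);
    }
}

// Hypothetical stand-in for the user-written execute logic of a basic bolt.
interface BasicLogicSketch {
    void execute(String input);
}

// Wraps user logic so that acking and failing happen automatically.
class AutoAckWrapperSketch {
    final List<String> acked = new ArrayList<>();
    final List<String> failed = new ArrayList<>();
    private final BasicLogicSketch logic;

    AutoAckWrapperSketch(BasicLogicSketch logic) {
        this.logic = logic;
    }

    void execute(String input) {
        try {
            logic.execute(input);
            acked.add(input);  // normal return: ack the input
        } catch (FailedTupleSketch e) {
            failed.add(input); // designated exception: fail the input
        }
    }
}
```

A basic bolt therefore suits the common case where each input is acked at the end of execute; logic that holds tuples across calls needs the explicit acking of IRichBolt instead.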