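1. Custom counters
Counters are used to monitor the progress and status of a job running in Hadoop.
Suppose the source file contains the following records (fields are separated by tabs):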
a b
c d e f
g h i
To count how many records have more than 3 fields and how many have fewer than 3, we can use counters. The code is as follows:
public void map(LongWritable key, Text value,
        OutputCollector<Text, Text> output, Reporter reporter)
        throws IOException {
    // Split the record into its tab-separated fields
    String[] split = value.toString().split("\t");
    if (split.length > 3) {
        // More than 3 fields: increment the "isLong" counter in group "MyCounter"
        org.apache.hadoop.mapred.Counters.Counter counter = reporter.getCounter("MyCounter", "isLong");
        counter.increment(1);
    } else if (split.length < 3) {
        // Fewer than 3 fields: increment the "isShort" counter in group "MyCounter"
        org.apache.hadoop.mapred.Counters.Counter counter = reporter.getCounter("MyCounter", "isShort");
        counter.increment(1);
    }
}
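To see which records each counter picks up, the classification can be traced outside Hadoop. The following is a minimal plain-Java sketch, not the Hadoop API: the class name CounterSketch, the helper count(), and the three sample records (with 2, 4, and 3 tab-separated fields) are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the counting rule; no Hadoop classes are used.
class CounterSketch {

    // Classifies each tab-separated record and tallies the two counters.
    static Map<String, Long> count(List<String> records) {
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("isLong", 0L);
        counters.put("isShort", 0L);
        for (String record : records) {
            int fields = record.split("\t").length;
            if (fields > 3) {
                counters.merge("isLong", 1L, Long::sum);
            } else if (fields < 3) {
                counters.merge("isShort", 1L, Long::sum);
            }
            // Records with exactly 3 fields touch neither counter.
        }
        return counters;
    }

    public static void main(String[] args) {
        // Hypothetical sample records with 2, 4, and 3 fields.
        List<String> records = Arrays.asList("a\tb", "c\td\te\tf", "g\th\ti");
        System.out.println(count(records)); // prints {isLong=1, isShort=1}
    }
}
```

Note that a record with exactly 3 fields is counted by neither branch, so the two counters need not add up to the total number of input records.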
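2. Custom data types in Hadoop
Hadoop's built-in data types include:
BooleanWritable: a standard boolean value
ByteWritable: a single-byte value
DoubleWritable: a double-precision floating-point number
FloatWritable: a single-precision floating-point number
IntWritable: an integer
LongWritable: a long integer
Text: text stored in UTF-8 format
NullWritable: a placeholder used when the key or the value of a <key, value> pair is empty
Implementing a custom data type:
1. Implement the Writable interface and override its write() and readFields() methods, which handle serialization for network transfer and for file input and output.
2. If the data type is used as a key in MapReduce, the key must be comparable: implement the WritableComparable interface and override write(), readFields(), and compareTo().
The code is as follows:
Code 1: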
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class Person implements Writable {
    long id;
    String name;
    long age;

    // Deserialize the fields in the same order that write() emits them
    @Override
    public void readFields(DataInput in) throws IOException {
        this.id = in.readLong();
        this.name = in.readUTF();
        this.age = in.readLong();
    }

    // Serialize the fields: id, then name, then age
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(name);
        out.writeLong(age);
    }

    @Override
    public String toString() {
        return "id:" + id + " name:" + name + " age:" + age;
    }

    public long getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public long getAge() {
        return age;
    }
}
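The write()/readFields() pair can be checked without a cluster by round-tripping an object through a byte array. The sketch below runs on the JDK alone: PersonRecord is a hypothetical stand-in with the same three fields and does not implement Hadoop's Writable, and RoundTripSketch and roundTrip() are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class RoundTripSketch {
    // Serializes a record to bytes and reads it back into a fresh instance.
    static PersonRecord roundTrip(PersonRecord original) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            original.write(out);
        }
        PersonRecord copy = new PersonRecord();
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            copy.readFields(in);
        }
        return copy;
    }

    public static void main(String[] args) throws IOException {
        PersonRecord p = new PersonRecord();
        p.id = 1L;
        p.name = "alice";
        p.age = 30L;
        PersonRecord copy = roundTrip(p);
        System.out.println(copy.id + " " + copy.name + " " + copy.age); // prints "1 alice 30"
    }
}

// Hypothetical stand-in: same write()/readFields() shape, no Hadoop dependency.
class PersonRecord {
    long id;
    String name;
    long age;

    void write(DataOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(name);
        out.writeLong(age);
    }

    // Fields must be read in exactly the order they were written.
    void readFields(DataInput in) throws IOException {
        this.id = in.readLong();
        this.name = in.readUTF();
        this.age = in.readLong();
    }
}
```

If readFields() reads the values but never assigns them, the copy keeps its default values (0, null, 0), which is a quick way to spot a broken implementation.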
Code 2: key-based comparison
package cn.com.bonc.hadoop;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;
public class PersonSortByAge implements WritableComparable<PersonSortByAge> {
    long id;
    String name;
    long age;

    @Override
    public void readFields(DataInput in) throws IOException {
        // Assign each value to its field, in the order write() emits them
        this.id = in.readLong();
        this.name = in.readUTF();
        this.age = in.readLong();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(name);
        out.writeLong(age);
    }
    @Override
    public int compareTo(PersonSortByAge o) {
        // Order keys by age, as the class name indicates; Long.compare cannot overflow
        return Long.compare(this.age, o.age);
    }

    @Override
    public String toString() {
        return "id:" + id + " name:" + name + " age:" + age;
    }
}
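A compareTo() that casts a long difference to int, as in (int) (a - b), can return zero or the wrong sign when the two values are far apart, while Long.compare does not have this problem. The sketch below illustrates both behaviors in plain Java: AgeKey, CompareSketch, and the helper methods are hypothetical names, and Comparable stands in for WritableComparable so that the example runs without Hadoop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CompareSketch {
    // Hypothetical key holding only the field used for ordering.
    static final class AgeKey implements Comparable<AgeKey> {
        final long age;

        AgeKey(long age) {
            this.age = age;
        }

        @Override
        public int compareTo(AgeKey o) {
            return Long.compare(this.age, o.age);
        }
    }

    // The subtraction idiom, kept only to show where it goes wrong.
    static int compareBySubtraction(long a, long b) {
        return (int) (a - b);
    }

    // Sorts keys by their natural order and returns the ages in that order.
    static List<Long> sortedAges(long... ages) {
        List<AgeKey> keys = new ArrayList<>();
        for (long age : ages) {
            keys.add(new AgeKey(age));
        }
        Collections.sort(keys);
        List<Long> out = new ArrayList<>();
        for (AgeKey k : keys) {
            out.add(k.age);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedAges(30, 5, 18)); // prints [5, 18, 30]
        long big = 1L << 32;
        // The low 32 bits of (0 - 2^32) are all zero, so the cast reports "equal".
        System.out.println(compareBySubtraction(0L, big)); // prints 0
        System.out.println(Long.compare(0L, big) < 0);     // prints true
    }
}
```

For small values such as realistic ages or ids the two approaches usually agree, but Long.compare costs nothing extra and stays correct for the full range of long.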