使用JavaApi写Spark若是PairRDD的key值为自定义的类型,须要重写hashcode以及equals方法,否则就会发现相同的Key值并无进行聚合操做。apache
例如:使用User类型做为Keyeclipse
public class User { private String name; private String age; public String getName() { return name; } public void setName(String name) { this.name = name; } public String getAge() { return age; } public void setAge(String age) { this.age = age; } @Override public int hashCode() { final int prime = 31; int result = 1; result = prime * result + ((age == null) ? 0 : age.hashCode()); result = prime * result + ((name == null) ? 0 : name.hashCode()); return result; } @Override public boolean equals(Object obj) { if (this == obj) return true; if (obj == null) return false; if (getClass() != obj.getClass()) return false; User other = (User) obj; if (age == null) { if (other.age != null) return false; } else if (!age.equals(other.age)) return false; if (name == null) { if (other.name != null) return false; } else if (!name.equals(other.name)) return false; return true; } }
通常eclipse能够自动的生成类型的hashcode以及equals方法,不须要本身特别处理,ide
若是遇到特殊的状况的话,咱们能够使用commons-lang3包里面的HashCodeBuilder以及EqualsBuilder两个工具类来生成相应的方法工具
package run.aaa.spark; import org.apache.commons.lang3.builder.EqualsBuilder; import org.apache.commons.lang3.builder.HashCodeBuilder; public class User { private String name; private String age; public String getName() { return name; } public void setName(String name) { this.name = name; } public String getAge() { return age; } public void setAge(String age) { this.age = age; } @Override public int hashCode() { return HashCodeBuilder.reflectionHashCode(this); } @Override public boolean equals(Object obj) { return EqualsBuilder.reflectionEquals(this, obj); } }