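I've recently been reading *Programming Pearls* (《编程珠玑》) to improve my coding skills and the way I approach problems. The book's examples are written in C or C++, so I decided to implement some of its ideas in Java. This chapter discusses how to sort a very large set of non-repeating random integers under a tight memory limit, for example sorting 7-digit integers with only 1 MB of memory.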
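Problem 1: how do we sort quickly?

Approach 1: read the integers in a single pass and sort them with a multi-pass merge sort.
Approach 2: read the integers in several passes and sort them over multiple rounds.
Approach 3, the focus of this post: sort with a bitmap (bit vector) representation of the set.

The idea first. A 20-bit string can represent any small set of non-negative integers that are all less than 20. For example, the set {1, 2, 3, 5, 8, 13} can be written as the following string:

0 1 1 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0

The bits at positions that correspond to values in the set are 1, and all other bits are 0. In the same way, every 7-digit decimal integer is a number below 10 million, so the whole input can be represented by a string of 10 million bits, where bit i is 1 exactly when integer i is present.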
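To make the representation concrete, here is a minimal sketch that builds the 20-bit string for the example set and then reads the values back in ascending order by scanning the bits. The class and method names (`BitStringDemo`, `toBitString`, `sortedValues`) are made up for this illustration, and a `boolean[]` stands in for real bits to keep it readable.

```java
import java.util.Arrays;

// Illustrative sketch only: names are hypothetical, and a boolean[] stands in for a packed bit vector.
final class BitStringDemo {

    /** Marks which of the values 0..universe-1 are present. */
    static boolean[] mark(int[] values, int universe) {
        boolean[] present = new boolean[universe];
        for (int v : values) {
            present[v] = true;
        }
        return present;
    }

    /** Returns the set as a string of 0s and 1s separated by spaces, position 0 first. */
    static String toBitString(int[] values, int universe) {
        boolean[] present = mark(values, universe);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < universe; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(present[i] ? '1' : '0');
        }
        return sb.toString();
    }

    /** Scanning the positions from low to high yields the values in sorted order. */
    static int[] sortedValues(int[] values, int universe) {
        boolean[] present = mark(values, universe);
        int[] out = new int[values.length];
        int k = 0;
        for (int i = 0; i < universe; i++) {
            if (present[i]) {
                out[k++] = i;
            }
        }
        return Arrays.copyOf(out, k);
    }

    public static void main(String[] args) {
        int[] input = {13, 5, 1, 8, 3, 2}; // unsorted, no duplicates
        System.out.println(toBitString(input, 20));
        // prints: 0 1 1 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0
        System.out.println(Arrays.toString(sortedValues(input, 20)));
        // prints: [1, 2, 3, 5, 8, 13]
    }
}
```

Note that no comparisons between elements happen anywhere: the order falls out of the positions themselves.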
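Problem 2: how do we generate k distinct random numbers within a range of n?

The idea is to swap positions. First fill r[0...n] so that every element holds its own index: r[i] = i. Then pick a random integer `random` in the range [1, n] and swap r[1] with r[random]; next pick another random integer `random` in the range [2, n] and swap r[2] with r[random]; and so on, repeating k times. The first k items are then the random numbers we need. The code is as follows: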
<!-- lang: java -->

    package ckj.chapter1;

    import java.io.BufferedWriter;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;
    import java.util.Random;

    public class RandomNumber {
        /** Upper bound of the value range: candidates are 1..ARRAY_LENGTH. */
        public static final int ARRAY_LENGTH = 3000000;
        private int size;
        private List<Integer> arrayList;

        RandomNumber() {
            arrayList = new ArrayList<Integer>(ARRAY_LENGTH);
            for (int i = 1; i <= ARRAY_LENGTH; i++) {
                arrayList.add(i);
            }
        }

        public RandomNumber(int size) {
            this();
            this.size = size;
        }

        /** Partial shuffle: after step i, positions 0..i hold distinct random values. */
        public List<Integer> generateRandNum() {
            Random r = new Random();
            for (int i = 0; i < size; i++) {
                int itemp = arrayList.get(i);
                // Pick a random index in [i, ARRAY_LENGTH - 1] and swap it into position i.
                int rtemp = r.nextInt(ARRAY_LENGTH - i) + i;
                arrayList.set(i, arrayList.get(rtemp));
                arrayList.set(rtemp, itemp);
            }
            arrayList = arrayList.subList(0, size);
            return arrayList;
        }

        public static void print(Collection<Integer> array) {
            System.out.println(array);
        }

        public void writeFile(String fileName) {
            // try-with-resources closes the writer and the underlying streams.
            try (BufferedWriter bw = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(fileName)))) {
                String buf = arrayList.toString();
                // Strip the surrounding '[' and ']' so the file holds "a, b, c".
                bw.write(buf.substring(1, buf.length() - 1));
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            int size = 50;
            RandomNumber rand = new RandomNumber(size);
            RandomNumber.print(rand.generateRandNum());
            rand.writeFile("random.txt");
        }
    }
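For comparison, the same swap idea can be sketched with a primitive `int[]`, which avoids boxing three million `Integer` objects. This is an illustrative variant, not the code used for the measurements in this post; the names `PartialShuffle` and `sample` are invented here.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative variant of the swap-based sampler; names are hypothetical.
final class PartialShuffle {

    /** Returns k distinct integers drawn from 1..n; requires 0 <= k <= n. */
    static int[] sample(int n, int k, Random rnd) {
        if (k < 0 || k > n) {
            throw new IllegalArgumentException("need 0 <= k <= n");
        }
        int[] r = new int[n];
        for (int i = 0; i < n; i++) {
            r[i] = i + 1; // candidates 1..n
        }
        for (int i = 0; i < k; i++) {
            // Choose an index in [i, n - 1] and swap that element into position i.
            int j = i + rnd.nextInt(n - i);
            int tmp = r[i];
            r[i] = r[j];
            r[j] = tmp;
        }
        // The first k positions now hold k distinct values.
        return Arrays.copyOf(r, k);
    }

    public static void main(String[] args) {
        int[] picked = sample(20, 5, new Random());
        System.out.println(Arrays.toString(picked));
    }
}
```

Because each step only ever swaps elements of the array, no value can appear twice in the result, which is exactly the "no duplicates" property the bitmap sort relies on.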
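Next, I compare three approaches: the built-in sort for an ArrayList (Collections.sort), the sorted TreeSet collection, and bit sort.

First, define an abstract class named Sort, which reads the random numbers from the file and records how long the sort takes.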
<!-- lang: java -->

    package ckj.chapter1;

    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;

    public abstract class Sort {
        public String sortName;
        public List<Integer> testArray;

        public Sort() {
            this.testArray = readFromFile("random.txt");
        }

        /** Reads the comma-separated integers written by RandomNumber.writeFile(). */
        private List<Integer> readFromFile(String fileName) {
            List<Integer> tempList = new ArrayList<Integer>();
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(new FileInputStream(fileName)))) {
                String s;
                while ((s = br.readLine()) != null) {
                    String[] arrayString = s.split(", ");
                    for (int i = 0; i < arrayString.length; i++) {
                        tempList.add(Integer.parseInt(arrayString[i]));
                    }
                }
            } catch (NumberFormatException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
            return tempList;
        }

        /** Runs sort() and prints the elapsed time in milliseconds. */
        public void sortTime() {
            long t1 = System.currentTimeMillis();
            sort();
            long costTime = System.currentTimeMillis() - t1;
            System.out.println("Time of " + sortName + " : " + costTime);
        }

        abstract public void sort();

        public static void print(Collection<Integer> array) {
            System.out.println(array);
        }

        public void print() {
            System.out.println(testArray);
        }
    }
Next, the SetSort class implements the concrete sort method by calling Collections.sort().
<!-- lang: java -->

    package ckj.chapter1.setsort;

    import java.util.Collections;

    import ckj.chapter1.Sort;

    public class SetSort extends Sort {

        public SetSort() {
            this.sortName = "SetSort";
        }

        @Override
        public void sort() {
            Collections.sort(testArray);
        }

        public static void main(String[] args) {
            Sort s = new SetSort();
            s.sortTime();
        }
    }
TreeSetSort uses a TreeSet, which keeps its elements sorted automatically, so no explicit sort call is needed.
<!-- lang: java -->

    package ckj.chapter1.treeset;

    import java.util.Set;
    import java.util.TreeSet;

    import ckj.chapter1.Sort;

    public class TreeSetSort extends Sort {
        public Set<Integer> st;

        public TreeSetSort() {
            this.sortName = "TreeSet";
        }

        @Override
        public void sort() {
            // Building the TreeSet inserts every element in sorted position.
            st = new TreeSet<Integer>(this.testArray);
        }

        @Override
        public void print() {
            System.out.println(st);
        }
    }
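Last is BitSort, the bit sort. Because an int is 32 bits wide, each array element can hold the flags for 32 numbers, so an int array of ARRAY_LENGTH/32 + 1 elements is enough to hold all the data. Calling set() for every input value does the sorting; test() is used when printing the result.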
<!-- lang: java -->

    package ckj.chapter1.bitsort;

    import ckj.chapter1.RandomNumber;
    import ckj.chapter1.Sort;

    public class BitSort extends Sort {
        private int[] sortArray;

        public BitSort() {
            this.sortName = "BitSort";
            // One bit per candidate value; +1 so that ARRAY_LENGTH itself fits.
            this.sortArray = new int[RandomNumber.ARRAY_LENGTH / 32 + 1];
        }

        // i >> 5 selects the int (i / 32); i & 0x1f selects the bit within it (i % 32).
        private void set(int i) {
            this.sortArray[i >> 5] |= (1 << (i & 0x1f));
        }

        private void clr(int i) {
            this.sortArray[i >> 5] &= ~(1 << (i & 0x1f));
        }

        private int test(int i) {
            return (this.sortArray[i >> 5] & (1 << (i & 0x1f)));
        }

        @Override
        public void sort() {
            // The generated values lie in 1..ARRAY_LENGTH inclusive, so the loops use <=.
            for (int i = 0; i <= RandomNumber.ARRAY_LENGTH; i++)
                clr(i);
            for (int i = 0; i < this.testArray.size(); i++) {
                set(this.testArray.get(i));
            }
        }

        @Override
        public void print() {
            for (int i = 0; i <= RandomNumber.ARRAY_LENGTH; i++) {
                // Compare with != 0: bit 31 makes the masked value negative.
                if (test(i) != 0) {
                    System.out.print(i + " ");
                }
            }
            System.out.println();
        }

        /*
        public static void main(String[] args) {
            Sort s = new BitSort();
            s.sortTime();
            s.print();
        }
        */
    }
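The index arithmetic in set(), clr() and test() is easy to misread, so here is a small worked sketch of how a value maps to a word index and a bit offset. The helper names (`BitIndexDemo`, `wordIndex`, `bitOffset`) are invented for this illustration; only the expressions `i >> 5` and `i & 0x1f` come from the code above.

```java
// Illustrative sketch: shows where a value lands in an int[] used as a bit vector.
final class BitIndexDemo {

    /** Index of the int that holds bit i; equals i / 32 for non-negative i. */
    static int wordIndex(int i) {
        return i >> 5;
    }

    /** Position of bit i inside that int; equals i % 32 for non-negative i. */
    static int bitOffset(int i) {
        return i & 0x1f;
    }

    public static void main(String[] args) {
        int[] samples = {0, 31, 32, 70, 3000000};
        for (int v : samples) {
            System.out.println(v + " -> word " + wordIndex(v) + ", bit " + bitOffset(v));
        }
        // prints:
        // 0 -> word 0, bit 0
        // 31 -> word 0, bit 31
        // 32 -> word 1, bit 0
        // 70 -> word 2, bit 6
        // 3000000 -> word 93750, bit 0

        // Setting bit 31 makes the word negative, which is why a membership
        // check should compare the masked value with != 0 rather than > 0.
        int[] words = new int[1];
        words[wordIndex(31)] |= 1 << bitOffset(31);
        System.out.println(words[0] < 0); // prints: true
    }
}
```

The last sample shows why the array needs the extra element: the value 3000000 lands in word 93750, which only exists because the array has 3000000/32 + 1 = 93751 elements.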
The main class, MainClass, runs the test code:
<!-- lang: java -->

    package ckj.chapter1;

    import ckj.chapter1.bitsort.BitSort;
    import ckj.chapter1.setsort.SetSort;
    import ckj.chapter1.treeset.TreeSetSort;

    public class MainClass {
        private static final int _RANDOMSIZE = 1000000;

        public static void main(String[] args) {
            generateRandom();

            // Each Sort subclass re-reads random.txt in its constructor,
            // so all three runs sort the same input.
            Sort s = new SetSort();
            s.sortTime();

            s = new TreeSetSort();
            s.sortTime();

            s = new BitSort();
            s.sortTime();
        }

        private static void generateRandom() {
            RandomNumber rand = new RandomNumber(_RANDOMSIZE);
            rand.generateRandNum();
            rand.writeFile("random.txt");
        }
    }
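Finally, a test result. With 1,000,000 distinct positive integers generated at random from the range 1 to 3,000,000, the running times (in milliseconds) compare as follows: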
<!-- lang: java -->

    Time of SetSort : 576
    Time of TreeSet : 1537
    Time of BitSort : 34
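As the numbers show, bit sort is by far the fastest on this data set: roughly 17 times faster than Collections.sort and roughly 45 times faster than TreeSet. It makes a single pass over the input and never compares two elements, so it pays off most on large sets of distinct integers drawn from a bounded range. That's one more sorting technique to keep in mind.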
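As a side note, the JDK ships a ready-made bit vector, `java.util.BitSet`, so the same idea can be sketched without hand-written shifts and masks. This was not part of the timing test above, and the names `BitSetSortDemo` and `sortDistinct` are invented for this example; it assumes the input values are distinct, non-negative and no larger than `maxValue`.

```java
import java.util.Arrays;
import java.util.BitSet;

// Illustrative sketch using java.util.BitSet; not the code that was timed in this post.
final class BitSetSortDemo {

    /** Sorts distinct non-negative integers that are all <= maxValue. */
    static int[] sortDistinct(int[] values, int maxValue) {
        BitSet bits = new BitSet(maxValue + 1);
        for (int v : values) {
            bits.set(v); // mark each value as present
        }
        int[] out = new int[bits.cardinality()];
        int k = 0;
        // nextSetBit walks the set bits from low to high, so the output is sorted.
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
            out[k++] = i;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] input = {8, 3, 13, 1, 5, 2};
        System.out.println(Arrays.toString(sortDistinct(input, 20)));
        // prints: [1, 2, 3, 5, 8, 13]
    }
}
```

If the input did contain duplicates, they would collapse into a single bit, so this approach only counts as sorting when the values are known to be unique.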