Given a data stream input of non-negative integers a1, a2, ..., an, ..., summarize the numbers seen so far as a list of disjoint intervals. For example, suppose the integers from the data stream are 1, 3, 7, 2, 6, ..., then the summary will be: [1, 1] [1, 1], [3, 3] [1, 1], [3, 3], [7, 7] [1, 3], [7, 7] [1, 3], [6, 7] **Follow up:** What if there are lots of merges and the number of disjoint intervals are small compared to the data stream's size?
这里面提到了一个disjoint interval的概念,它是指不相交的区间。若是新来的数据与当前的区间集产生了重合,则须要将当前的区间集进行合并。从而确保每次获得的数据集都是不相交的。java
这道题目总体来讲有两种思路,一种就是每次插入数据的时候将出现交集的区间进行合并,另外一种就是在插入数据时只更新区间的范围,并不将区间合并。优化
这两种方案在不一样特色的数据集下各有优点。第一种方法适用于数据量大区间集少的场景。由于区间集少,而每次合并带来的区间集的数据移动成本较低。后者则恰好相反,适用于区间集较多的场景,只有在读取区间集的时候,去触发合并操做。this
这里是采用方案一来实现的。若是咱们维护了一组有序的区间集,这时数据流中传来一个新的数字,则该数字和区间流中的区间一共有以下几种状况:code
代码以下:对象
List<Pair> pairs = new ArrayList<>(); public void addNum(int val) { if (pairs.size() == 0) { pairs.add(new Pair(val, val)); return; } int left = 0, right = pairs.size() - 1; while (left <= right) { int mid = (left + right) / 2; if (pairs.get(mid).left < val) { left = mid + 1; }else if (pairs.get(mid).left > val) { right = mid - 1; }else { return; } } if (left == 0) { if (pairs.get(left).left == val + 1) { pairs.get(left).left = val; }else { pairs.add(0, new Pair(val, val)); } } else if (left == pairs.size()) { if (pairs.get(left-1).right == val - 1) { pairs.get(left-1).right = val; }else if (pairs.get(left-1).right < val - 1) { pairs.add(new Pair(val, val)); } }else if (pairs.get(left-1).right == val - 1 && pairs.get(left).left == val + 1) { pairs.get(left-1).right = pairs.get(left).right; pairs.remove(left); }else if (pairs.get(left-1).right == val - 1) { pairs.get(left - 1).right = val; }else if (pairs.get(left).left == val + 1) { pairs.get(left).left = val; }else if (pairs.get(left-1).right < val){ pairs.add(left, new Pair(val, val)); } } public int[][] getIntervals() { int[][] result = new int[pairs.size()][2]; for (int i = 0 ; i < pairs.size() ; i++) { result[i][0] = pairs.get(i).left; result[i][1] = pairs.get(i).right; } return result; } public static class Pair{ int left; int right; public Pair(int left, int right) { this.left = left; this.right = right; } }
上面的代码新建了类Pair做为区间的对象。而后使用List对区间Pair按照从小到大存储。每次调用addNum传入val时,都会利用二分法找到左边界比val大的最近的区间。找到该区间后再按照上文的判断方式逐个处理。假设数据流大小为n,区间的平均大小为m,则这种方法的时间复杂度为O(lgM*n)。可是由于判断逻辑较多,所以代码显得凌乱,可读性不好。rem
这里可使用TreeMap进行优化,TreeMap默认上是最小堆的实现,它帮咱们省去了大量二分法的逻辑,同时在插入区间的时候,时间复杂度也下降到O(NlgN)。只须要对边界场景进行判断便可。代码以下:get
TreeMap<Integer, Integer> treeMap = new TreeMap<>(); public void addNum(int val) { if (treeMap.containsKey(val)) { return; } Integer lowerKey = treeMap.lowerKey(val); Integer higherKey = treeMap.higherKey(val); if (lowerKey != null && higherKey != null && treeMap.get(lowerKey) == val - 1 && higherKey == val + 1) { treeMap.put(lowerKey, treeMap.get(higherKey)); treeMap.remove(higherKey); }else if (lowerKey != null && treeMap.get(lowerKey) >= val - 1) { treeMap.put(lowerKey, Math.max(val, treeMap.get(lowerKey))); }else if (higherKey != null && higherKey == val + 1) { treeMap.put(val, treeMap.get(higherKey)); treeMap.remove(higherKey); }else { treeMap.put(val, val); } } public int[][] getIntervals() { int[][] result = new int[treeMap.size()][2]; int index = 0; for (int key : treeMap.keySet()){ result[index][0] = key; result[index][1] = treeMap.get(key); } return result; }