LeetCode
  • Introduction
  • Design
    • 348. Design Tic-Tac-Toe
    • 534. Design TinyURL
    • 535. Encode and Decode TinyURL
    • 346. Moving Average from Data Stream
    • 281. Zigzag Iterator
    • 381. Insert Delete GetRandom O(1) - Duplicates allowed
    • 432. All O`one Data Structure
    • 341. Flatten Nested List Iterator
    • 642. Design Search Autocomplete System
    • 170. Two Sum III - Data structure design
    • 622. Design Circular Queue
    • 295. Find Median from Data Stream
  • OOD
    • 146.LRU Cache(Amazon)
  • Sort
    • Bucket Sort
      • 791. Custom Sort String
      • 347. Top K Frequent Elements
      • 1057. Campus bikes
    • 280. Wiggle Sort
    • 75. Sort Colors
  • Binary Search
    • 278. First Bad Version
    • 162. Find Peak Element in O(logn)
    • LintCode 183. Wood Cut
    • 33. Search in Rotated Sorted Array
    • 81. Search in Rotated Sorted Array II
    • 153. Find Minimum in Rotated Sorted Array
    • 154. Find Minimum in Rotated Sorted Array II
    • 34. Find First and Last Position of Element in Sorted Array
    • Count number of occurrences (or frequency) in a sorted array
    • 378. Kth Smallest Element in a Sorted Matrix
    • 240. Search a 2D Matrix II
    • 658. Find K Closest Elements
    • 410. Split Array Largest Sum
  • Data Structure
    • Arrays
      • Two Pointer
        • 11. Container With Most Water
      • 283. Move Zeros
      • 26. Remove Duplicates from Sorted Array
      • 88. Merge Sorted Array
      • 683. K Empty Slots
      • 4. Median of Two Sorted Arrays
      • Interval
        • 56. Merge Interval
        • 57. Insert Interval
      • 54. Spiral Matrix
      • 48. Rotate Image(Amazon, MicroSoft, Apple)
      • 448. Find All Numbers Disappeared in an Array
      • 228. Summary Ranges
      • 289. Game of Life
      • 298. Product of Array Except Self
      • 825. Friends of Appropriate Ages
    • Stack
      • Postfix to Infix
      • 155. Min Stack
      • 716. Max Stack
      • 388. Longest Absolute File Path
      • 316. Remove Duplicate Letters
      • 636. Exclusive Time of Functions
      • Calculator Series
        • 227. Basic Calculator
        • 772. Basic Calculator III
    • Trie
      • 208. Implement Trie (Prefix Tree)
      • 211. Add and Search Word - Data structure design
      • 336. Palindrome Pairs
      • 212. Word Search II
    • Heap
      • 218. The Skyline Problem
      • 23. Merge k Sorted Lists
      • 252. Meeting Rooms
      • 253. Meeting Rooms II
      • 215. Kth Largest Element in an Array
      • 692. Top K Frequent Words
      • 373. Find K Pairs with Smallest Sums
    • Deque
      • 239. Sliding Window Maximum
    • Map
      • 128. Longest Consecutive Sequence
      • 350. Intersection of Two Arrays II
      • 760. Find Anagram Mappings
    • Segment Tree / Binary Indexed Tree
      • 308. Range Sum Query 2D - Mutable
      • 218. The Skyline Problem
      • 315. Count of Smaller Numbers After Self
      • 308. Range Sum Query 2D - Mutable
      • 493. Reverse Pairs
    • Linked List
      • 19. Remove Nth Node From End of List
      • 24. Swap Nodes in Pairs
      • 160. Intersection of Two Linked Lists
      • 138. Copy List with Random Pointer
      • 143. Reorder List
      • 445. Add Two Numbers II
    • Tree
      • BST
        • 109. Convert Sorted List to Binary Search Tree
        • 285. Inorder Successor in BST
        • Convert Binary Search Tree (BST) to Sorted Doubly-Linked List
        • Construct Binary Tree from Inorder and Postorder Traversal
      • 101. Symmetric Tree
      • 105. Construct Binary Tree from Preorder and Inorder Traversal
      • 106. Construct Binary Tree from Inorder and Postorder Traversal
      • 110. Balanced Binary Tree
      • 129. Sum Root to Leaf Numbers
      • 199. Binary Tree Right Side View
      • 257. Binary Tree Paths
      • 543. Diameter of Binary Tree
      • Sum Root to Leaf Numbers
    • Hash Table
      • 325. Maximum Size Subarray Sum Equals k
      • 314. Binary Tree Vertical Order Traversal
    • Union-Find
      • 721. Accounts Merge
      • 305. Number of Islands II
      • 323. Number of Connected Components in an Undirected Graph
      • 261. Graph Valid Tree
      • 947. Most Stones Removed with Same Row or Column
    • Linked List
      • 138. Copy List with Random Pointer
      • 21. Merge Two Sorted Lists
      • 23. Merge K Sorted Lists
      • 206. Reverse Linked List
      • 237. Delete Node in a Linked List
      • 141. Linked List Cycle
      • 142. Linked List Cycle II
      • 148. Sort List
      • 708. Insert into a Cyclic Sorted List
  • Graph
    • BFS/DFS
      • Topological Sort
        • 207. Course Schedule
        • 210. Course Schedule II
        • 133. Clone Graph
        • 675. Cut Off Trees for Golf Event
        • 269. Alien Dictionary
      • BackTracking
        • 39. Combination Sum
        • 40. Combination Sum II
        • 79. Word Search
        • 212. Word Search II
        • 282. Expression Add Operators
      • Memoization
        • 329. Longest Increasing Path in a Matrix
      • 733. Flood Fill
      • 279. Perfect Squares
      • 200. Number of Islands (Amazon)
      • 694. Number of Distinct Islands
      • 711. Number of Distinct Islands II
      • 98. Validate Binary Search Tree
      • 785. Is Graph Bipartite?
      • 491. Increasing Subsequences
      • 51. N-Queens
      • 52. N-Queens II
      • 286. Walls and Gates
      • 111. Minimum Depth of Binary Tree
      • 753. Craking the Safe
    • 399. Evaluate Division
    • 127. Word Ladder
    • 126. Word Ladder II
    • 339. Nested List Weight Sum
  • Traversal
    • 297. Serialize and Deserialize Binary Tree
    • 236. Lowest Common Ancestor of a Binary Tree
    • 235. Lowest Common Ancestor of a Binary Search Tree
    • 144. Binary Tree Preorder Traversal
    • 255. Verify Preorder Sequence in Binary Search Tree
    • 94. Binary Tree Inorder Traversal
    • 145. Binary Tree Post Order Traversal
    • 102. Level Order Traversal
    • 103. Binary Tree Zipzag Level Order Traversal
    • 107. Binary Tree Level Order Traversal
    • 156. Binary Upside Down
    • 173. Binary Search Tree Iterator
    • 99. Recover Binary Search Tree
    • Morris Traversal
    • 498. Diagonal Traverse
    • Flatten
      • Flatten Binary Tree to Linked List
  • Math
    • Bit Manipulation
      • 231. Power of Two
      • 393. UTF-8 Validation
      • 371. Sum of Two Integers
    • 463. Island Perimeter
    • 50. Pow(x,n)
    • 326. Power of Three
    • 67. Add Binary
    • 29. Divide Two Integers
    • 360. Sort Transformed Array
    • 247. Strobogrammatic Number II
    • 277. Find the Celebrity
    • 246. Strobogrammatic Number
    • 415. Add Strings
    • 398. Reservoir Sampling
    • 43. Multiply Strings
    • 311. Sparse Matrix Multiplication
    • 8. String to Integer (Atoi)
    • 7. Reverse Integer
    • 829. Consecutive Numbers Sum
    • MiniMax
      • Guess the Word
  • Sum
    • 1. Two Sum
    • 167. Two Sum II - Input array is sorted
    • 15. 3Sum
    • 16. 3Sum Closest
    • 18. 4Sum
    • 454. 4Sum II
    • Path sum
      • 124. Binary Tree Maximum Path Sum
      • 113. Path Sum II
      • 437. Path Sum III
    • 209. Minimum Size Subarray Sum
    • 862. Shortest Subarray with Sum at Least K
    • 560. Subarray Sum Equals K
    • 523. Continuous Subarray Sum
  • Parentheses
    • 20. Valid Parentheses
    • 32. Longest Valid Parentheses
    • 301. Remvoe Invalid Parenthesis
    • 678. Valid Parenthesis String
  • String
    • Substring Search
      • 3. Longest Substring Without Repeating Characters
      • 438. Find All Anagrams in a String
      • 76. Minimum Window Substring
      • 30. Substring with Concatenation of All Words
      • 159. Longest Substring with At Most Two Distinct Characters
      • 340. Longest Substring with At Most K Distinct Characters
      • 395. Longest Substring with At Least K Repeating Characters
    • 12. Integer to Roman
    • 13. Roman to Integer
    • 44. Wildcard Matching
    • 242. Valid Anagram
    • 49. Group Anagrams
    • 657. Judge Route Circle
    • 482. License Key Formatting
    • 681. Next Closet Time
    • 387. First Unique Character in a String
    • 426. Convert Binary Search Tree to Sorted Doubly Linked List
    • 273. Integer to English Words
    • 157. Read N Characters Given Read4
    • 158. Read N Characters Given Read4 II - Call multiple times
    • 480. Valid Word Abbreviation
    • 409. Longest Palindrome
    • 680. Valid Palindrome II
    • 125. Valid Palindrome
    • 468. Validate IP Address
    • 161. One Edit Distance
    • 38. Count and Say
    • 1096. Brace Expansion II
  • Greedy
    • 630. Course Schedule III
    • 621. Task Scheduler
    • 358. Rearrange String k Distance Apart
    • 406. Queue Reconstruction by Height
    • 316. Remove Duplicate Letters
    • 767. Reorganize String
  • Dynamic Programming
    • Palindromic Substring
      • 5. longest Palindromic Substring
    • Subarray
      • 53. Maximum Subarray
      • 152. Maximum Product Subarray
    • Backpack problem
      • Backpack I
      • Backpack II
      • Backpack III
      • Backpack IV
      • Backpack V
      • Backpack VI aka: Combination Sum IV
      • Coin Change 2
    • 70. Climbing Stairs
    • 10. Regular Expression Matching
    • 91. Decode Ways
    • 338. Counting Bits
    • 121. Best Time to Buy and Sell Stock
    • 122. Best Time to Buy and Sell Stock II
    • 123. Best Time to Buy and Sell Stock III
    • 139. Word Break
    • 140. Word Break II
    • 198. House Robber
    • 72. Edit Distance
    • 221. Maximal Square
    • 84. Largest Rectangle in Histogram
    • 312. Burst Balloons
    • 304. Range Sum Query 2D - Immutable
    • 518. Coin Change 2
    • Longest Arithmetic Progression
    • 975. Odd Even Jump
    • 1066. Campus Bikes II
  • JingChi.ai
    • Word Composition
    • Sliding Window Median
    • 537. Erect the Fence (Convex Hull Problem)
    • LintCode 558: Sliding Window Matrix Maximum
    • orienteering game
    • 扫雷
  • Twitter
    • Identifying Triangle
    • Last and Second-Last
    • 300. Longest increasing subsequence
    • Twin String
    • 647. Number of palindromic substring
    • Wildcard Matching
  • Akuna
    • C++ Intern
    • V1
    • V2
    • Cut the Sticks
    • Quant Dev
    • Postfix_to_infix
    • Drone Delivery
    • phone
  • LintCode Contest
    • Ask For Cooling Time
  • SQL
    • 176. Second Highest Salary
    • 597. Friend Requests I: Overall Acceptance Rate
  • FB 19
    • Convert Binary Search Tree (BST) to Sorted Doubly-Linked List
    • 3Sum
    • Minimum Window Substring
    • Count number of occurrences (or frequency) in a sorted array
    • Count NO2
    • Valid Palindrome
    • Merge k Sorted Lists
    • Kth Largest Element in an Array
    • Move Zeros
    • Remove Invalid Parenthesis
    • friends
    • Integer to English
    • Islands
    • Valid Palindrome
    • sec
      • Longest Increasing Path in a Matrix
      • Product of Array Except Self
      • Binary Tree Vertical Order Traversal
      • Add Binary
      • Valid Parentheses
      • sup
        • 128. Longest Consecutive Sequence
        • Combination Sum II
        • Add and Search Word - Data structure design
        • feb
  • Google 20
    • 410. Split Array Largest Sum
  • 818. Race Card
Powered by GitBook
On this page

Was this helpful?

  1. Design

295. Find Median from Data Stream

Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.

For example,

[2,3,4], the median is3

[2,3], the median is(2 + 3) / 2 = 2.5

Design a data structure that supports the following two operations:

  • void addNum(int num) - Add a integer number from the data stream to the data structure.

  • double findMedian() - Return the median of all elements so far.

Example:

addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3) 
findMedian() -> 2

Follow up:

  1. If all integer numbers from the stream are between 0 and 100, how would you optimize it?

  2. If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it? (Segment Trees)

  3. how to improve it and make sure the method is thread-safe if the two heaps are shared - Use read and write locks.

Thoughts:

  1. Having two queues, one (right part) puts smallest number on the top, and one (left part) keeps largest element on the top.

  2. Using Binary Search Tree: As the input numbers are random, so the height of the binary search tree is O(logN) We maintain every single node's children's size and it's easy to implement because it just has add operation.

  3. There are so many ways around this problem, that frankly, it is scary. Here are a few more that I came across:

  4. Code

class MedianFinder {

    /** initialize your data structure here. */
    private Queue<Long> left;
    private Queue<Long> right;

    public MedianFinder() {
            left = new PriorityQueue();
            right = new PriorityQueue();
    }

    public void addNum(int num) {
        right.add((long) num);
        left.add(-right.poll());
        if(right.size() < left.size())
            right.add(-left.poll());
    }

    public double findMedian() {
        return right.size() > left.size()
            ?right.peek()
            :(right.peek() - left.peek())/2.0; 
    }
}

/**
 * Your MedianFinder object will be instantiated and called as such:
 * MedianFinder obj = new MedianFinder();
 * obj.addNum(num);
 * double param_2 = obj.findMedian();
 */

Java: reverse ordering

class MedianFinder {

    /** initialize your data structure here. */
    private Queue<Integer> l;
    private Queue<Integer> r;
    boolean even;
    public MedianFinder() {
            l = new PriorityQueue();
            r = new PriorityQueue(Collections.reverseOrder());
            even = true;
    }

    public void addNum(int num) {
             if(even){
                l.add(num);
                r.add(l.poll());
            }
            else{
                r.add(num);
                l.add(r.poll());
            }
        even=!even;
    }

    public double findMedian() {
         return even?(r.peek() + l.peek()) / 2.0:r.peek();
    }
}

/**
 * Your MedianFinder object will be instantiated and called as such:
 * MedianFinder obj = new MedianFinder();
 * obj.addNum(num);
 * double param_2 = obj.findMedian();
 */

C++: max_heap

class MedianFinder {
    priority_queue<long> left, right;// max_heap
public:
    /** initialize your data structure here. */

    MedianFinder() {

    }

    void addNum(int num) {
        left.push(num);
        right.push(-left.top()); left.pop();
        if(left.size() < right.size()){
            left.push(-right.top()); right.pop();
        }
    }

    double findMedian() {
        return left.size() > right.size()
            ? left.top()
            : (left.top() - right.top())/2.0;
    }
};

/**
 * Your MedianFinder object will be instantiated and called as such:
 * MedianFinder obj = new MedianFinder();
 * obj.addNum(num);
 * double param_2 = obj.findMedian();
 */

Python

from heapq import *
class MedianFinder(object):

    def __init__(self):
        """
        initialize your data structure here.
        """
        self.heaps = [], []


    def addNum(self, num):
        """
        :type num: int
        :rtype: void
        """
        left, right = self.heaps # min heap
        heappush(left, -heappushpop(right, num))
        if len(right) < len(left):
            heappush(right, -heappop(left))

    def findMedian(self):
        """
        :rtype: float
        """
        left, right = self.heaps
        if len(right) > len(left):
            return float(right[0])
        return (right[0] - left[0])/2.0        


# Your MedianFinder object will be instantiated and called as such:
# obj = MedianFinder()
# obj.addNum(num)
# param_2 = obj.findMedian()

Code: BST (Even if the incoming data ins't random, we can still use red-black tree so that finding median value is super fast.)

struct BST {
    struct node {
        int val;
        int size;
        node* left, *right;
        node(int v) : size(1), val(v) {};
    } *Null, *root;

    BST() {
        Null = new node(0);
        Null -> size = 0;
        root = Null;
    }

    void add(int val, node*& R) {
        if(R == Null) {
            R = new node(val);
            R -> left = R -> right = Null;
            return;
        }
        if(R->val <= val) add(val, R->left);
        else add(val, R->right);
        R->size = R->left->size + R->right->size + 1;

    }

    int rank(int k) {
        node* t = root;
        while(true) {
            int leftSize =  t -> left -> size;
            if(leftSize == k) return t -> val;
            if(leftSize > k) {
                t = t -> left;
            } else {
                k = k - leftSize - 1;
                t = t -> right;
            }
        }
        return -1;
    }
};




class MedianFinder {
public:
    BST* bst;
    MedianFinder() {
        bst = new BST();
    }
    // Adds a number into the data structure.
    void addNum(int num) {
        bst->add(num, bst->root);
    }

    // Returns the median of current data stream
    double findMedian() {
        int sz = bst -> root -> size;
        if(sz % 2 == 0) {
            return 1.0 * (bst -> rank(sz / 2) + bst -> rank(sz / 2 - 1)) / 2;
        } else return bst->rank(sz / 2);

    }

Binary Search: O(n) (Add function may cost o(n).Insert function costs o(n))

private List<double> m_listNum = new List<double>();

public void AddNum(double num) {

    int lintPos = m_listNum.BinarySearch( num ) ;
    if (lintPos >= 0) {
        m_listNum.Insert(lintPos, num);
    } else {
        lintPos = ~lintPos;
        if (lintPos == m_listNum.Count) {
            m_listNum.Add(num);
        } else {
            m_listNum.Insert(lintPos, num);
        }
    }
}

// return the median of current data stream

public double FindMedian() {

    int lintCount = m_listNum.Count ;

    if (lintCount == 0 ) throw new Exception("array is empty");

    if (lintCount % 2 == 0)
        return (m_listNum[lintCount / 2 - 1] + m_listNum[lintCount / 2]) / 2;
    else 
        return m_listNum[lintCount / 2];
}
Previous622. Design Circular QueueNextOOD

Last updated 5 years ago

Was this helpful?

Further Thoughts (Analysis written by )

Buckets!If the numbers in the stream are statistically distributed, then it is easier to keep track of buckets where the median would land, than the entire array. Once you know the correct bucket, simply sort it find the median. If the bucket size is significantly smaller than the size of input processed, this results in huge time saving. has an interesting implementation

Reservoir Sampling. Following along the lines of using buckets: if the stream is statistically distributed, you can rely on Reservoir Sampling. Basically, if you could maintain just one good bucket (or reservoir) which could hold a representative sample of the entire stream, you could estimate the median of the entire stream from just this one bucket. This means good time and memory performance. Reservoir Sampling lets you do just that. Determining a"good"size for your reservoir?_Now, that's a whole other challenge. A good explanation for this can be found in

Segment Trees are a great data structure if you need to do a lot of insertions or a lot of read queries over a limited range of input values. They allow us to do all such operations fast and in roughly the same amount of time, always.The only problem is that they are far from trivial to implement. Take a look at my if you are interested.

Order Statistic Trees are data structures which seem to be tailor-made for this problem. They have all the nice features of a BST, but also let you find the k^{th} ​​order element stored in the tree. They are a pain to implement and no standard interview would require you to code these up. But they are fun to use if they are already implemented in the language of your choice.

Reference: 's and 's

@babhishek21
@mitbbs8080
here.
this StackOverflow answer.
introductory article on Segment Trees
5
StefanPochmann
post
lgn
post