LeetCode
  • Introduction
  • Design
    • 348. Design Tic-Tac-Toe
    • 534. Design TinyURL
    • 535. Encode and Decode TinyURL
    • 346. Moving Average from Data Stream
    • 281. Zigzag Iterator
    • 381. Insert Delete GetRandom O(1) - Duplicates allowed
    • 432. All O`one Data Structure
    • 341. Flatten Nested List Iterator
    • 642. Design Search Autocomplete System
    • 170. Two Sum III - Data structure design
    • 622. Design Circular Queue
    • 295. Find Median from Data Stream
  • OOD
    • 146.LRU Cache(Amazon)
  • Sort
    • Bucket Sort
      • 791. Custom Sort String
      • 347. Top K Frequent Elements
      • 1057. Campus bikes
    • 280. Wiggle Sort
    • 75. Sort Colors
  • Binary Search
    • 278. First Bad Version
    • 162. Find Peak Element in O(logn)
    • LintCode 183. Wood Cut
    • 33. Search in Rotated Sorted Array
    • 81. Search in Rotated Sorted Array II
    • 153. Find Minimum in Rotated Sorted Array
    • 154. Find Minimum in Rotated Sorted Array II
    • 34. Find First and Last Position of Element in Sorted Array
    • Count number of occurrences (or frequency) in a sorted array
    • 378. Kth Smallest Element in a Sorted Matrix
    • 240. Search a 2D Matrix II
    • 658. Find K Closest Elements
    • 410. Split Array Largest Sum
  • Data Structure
    • Arrays
      • Two Pointer
        • 11. Container With Most Water
      • 283. Move Zeros
      • 26. Remove Duplicates from Sorted Array
      • 88. Merge Sorted Array
      • 683. K Empty Slots
      • 4. Median of Two Sorted Arrays
      • Interval
        • 56. Merge Interval
        • 57. Insert Interval
      • 54. Spiral Matrix
      • 48. Rotate Image(Amazon, MicroSoft, Apple)
      • 448. Find All Numbers Disappeared in an Array
      • 228. Summary Ranges
      • 289. Game of Life
      • 298. Product of Array Except Self
      • 825. Friends of Appropriate Ages
    • Stack
      • Postfix to Infix
      • 155. Min Stack
      • 716. Max Stack
      • 388. Longest Absolute File Path
      • 316. Remove Duplicate Letters
      • 636. Exclusive Time of Functions
      • Calculator Series
        • 227. Basic Calculator
        • 772. Basic Calculator III
    • Trie
      • 208. Implement Trie (Prefix Tree)
      • 211. Add and Search Word - Data structure design
      • 336. Palindrome Pairs
      • 212. Word Search II
    • Heap
      • 218. The Skyline Problem
      • 23. Merge k Sorted Lists
      • 252. Meeting Rooms
      • 253. Meeting Rooms II
      • 215. Kth Largest Element in an Array
      • 692. Top K Frequent Words
      • 373. Find K Pairs with Smallest Sums
    • Deque
      • 239. Sliding Window Maximum
    • Map
      • 128. Longest Consecutive Sequence
      • 350. Intersection of Two Arrays II
      • 760. Find Anagram Mappings
    • Segment Tree / Binary Indexed Tree
      • 308. Range Sum Query 2D - Mutable
      • 218. The Skyline Problem
      • 315. Count of Smaller Numbers After Self
      • 308. Range Sum Query 2D - Mutable
      • 493. Reverse Pairs
    • Linked List
      • 19. Remove Nth Node From End of List
      • 24. Swap Nodes in Pairs
      • 160. Intersection of Two Linked Lists
      • 138. Copy List with Random Pointer
      • 143. Reorder List
      • 445. Add Two Numbers II
    • Tree
      • BST
        • 109. Convert Sorted List to Binary Search Tree
        • 285. Inorder Successor in BST
        • Convert Binary Search Tree (BST) to Sorted Doubly-Linked List
        • Construct Binary Tree from Inorder and Postorder Traversal
      • 101. Symmetric Tree
      • 105. Construct Binary Tree from Preorder and Inorder Traversal
      • 106. Construct Binary Tree from Inorder and Postorder Traversal
      • 110. Balanced Binary Tree
      • 129. Sum Root to Leaf Numbers
      • 199. Binary Tree Right Side View
      • 257. Binary Tree Paths
      • 543. Diameter of Binary Tree
      • Sum Root to Leaf Numbers
    • Hash Table
      • 325. Maximum Size Subarray Sum Equals k
      • 314. Binary Tree Vertical Order Traversal
    • Union-Find
      • 721. Accounts Merge
      • 305. Number of Islands II
      • 323. Number of Connected Components in an Undirected Graph
      • 261. Graph Valid Tree
      • 947. Most Stones Removed with Same Row or Column
    • Linked List
      • 138. Copy List with Random Pointer
      • 21. Merge Two Sorted Lists
      • 23. Merge K Sorted Lists
      • 206. Reverse Linked List
      • 237. Delete Node in a Linked List
      • 141. Linked List Cycle
      • 142. Linked List Cycle II
      • 148. Sort List
      • 708. Insert into a Cyclic Sorted List
  • Graph
    • BFS/DFS
      • Topological Sort
        • 207. Course Schedule
        • 210. Course Schedule II
        • 133. Clone Graph
        • 675. Cut Off Trees for Golf Event
        • 269. Alien Dictionary
      • BackTracking
        • 39. Combination Sum
        • 40. Combination Sum II
        • 79. Word Search
        • 212. Word Search II
        • 282. Expression Add Operators
      • Memoization
        • 329. Longest Increasing Path in a Matrix
      • 733. Flood Fill
      • 279. Perfect Squares
      • 200. Number of Islands (Amazon)
      • 694. Number of Distinct Islands
      • 711. Number of Distinct Islands II
      • 98. Validate Binary Search Tree
      • 785. Is Graph Bipartite?
      • 491. Increasing Subsequences
      • 51. N-Queens
      • 52. N-Queens II
      • 286. Walls and Gates
      • 111. Minimum Depth of Binary Tree
      • 753. Craking the Safe
    • 399. Evaluate Division
    • 127. Word Ladder
    • 126. Word Ladder II
    • 339. Nested List Weight Sum
  • Traversal
    • 297. Serialize and Deserialize Binary Tree
    • 236. Lowest Common Ancestor of a Binary Tree
    • 235. Lowest Common Ancestor of a Binary Search Tree
    • 144. Binary Tree Preorder Traversal
    • 255. Verify Preorder Sequence in Binary Search Tree
    • 94. Binary Tree Inorder Traversal
    • 145. Binary Tree Post Order Traversal
    • 102. Level Order Traversal
    • 103. Binary Tree Zipzag Level Order Traversal
    • 107. Binary Tree Level Order Traversal
    • 156. Binary Upside Down
    • 173. Binary Search Tree Iterator
    • 99. Recover Binary Search Tree
    • Morris Traversal
    • 498. Diagonal Traverse
    • Flatten
      • Flatten Binary Tree to Linked List
  • Math
    • Bit Manipulation
      • 231. Power of Two
      • 393. UTF-8 Validation
      • 371. Sum of Two Integers
    • 463. Island Perimeter
    • 50. Pow(x,n)
    • 326. Power of Three
    • 67. Add Binary
    • 29. Divide Two Integers
    • 360. Sort Transformed Array
    • 247. Strobogrammatic Number II
    • 277. Find the Celebrity
    • 246. Strobogrammatic Number
    • 415. Add Strings
    • 398. Reservoir Sampling
    • 43. Multiply Strings
    • 311. Sparse Matrix Multiplication
    • 8. String to Integer (Atoi)
    • 7. Reverse Integer
    • 829. Consecutive Numbers Sum
    • MiniMax
      • Guess the Word
  • Sum
    • 1. Two Sum
    • 167. Two Sum II - Input array is sorted
    • 15. 3Sum
    • 16. 3Sum Closest
    • 18. 4Sum
    • 454. 4Sum II
    • Path sum
      • 124. Binary Tree Maximum Path Sum
      • 113. Path Sum II
      • 437. Path Sum III
    • 209. Minimum Size Subarray Sum
    • 862. Shortest Subarray with Sum at Least K
    • 560. Subarray Sum Equals K
    • 523. Continuous Subarray Sum
  • Parentheses
    • 20. Valid Parentheses
    • 32. Longest Valid Parentheses
    • 301. Remvoe Invalid Parenthesis
    • 678. Valid Parenthesis String
  • String
    • Substring Search
      • 3. Longest Substring Without Repeating Characters
      • 438. Find All Anagrams in a String
      • 76. Minimum Window Substring
      • 30. Substring with Concatenation of All Words
      • 159. Longest Substring with At Most Two Distinct Characters
      • 340. Longest Substring with At Most K Distinct Characters
      • 395. Longest Substring with At Least K Repeating Characters
    • 12. Integer to Roman
    • 13. Roman to Integer
    • 44. Wildcard Matching
    • 242. Valid Anagram
    • 49. Group Anagrams
    • 657. Judge Route Circle
    • 482. License Key Formatting
    • 681. Next Closet Time
    • 387. First Unique Character in a String
    • 426. Convert Binary Search Tree to Sorted Doubly Linked List
    • 273. Integer to English Words
    • 157. Read N Characters Given Read4
    • 158. Read N Characters Given Read4 II - Call multiple times
    • 480. Valid Word Abbreviation
    • 409. Longest Palindrome
    • 680. Valid Palindrome II
    • 125. Valid Palindrome
    • 468. Validate IP Address
    • 161. One Edit Distance
    • 38. Count and Say
    • 1096. Brace Expansion II
  • Greedy
    • 630. Course Schedule III
    • 621. Task Scheduler
    • 358. Rearrange String k Distance Apart
    • 406. Queue Reconstruction by Height
    • 316. Remove Duplicate Letters
    • 767. Reorganize String
  • Dynamic Programming
    • Palindromic Substring
      • 5. longest Palindromic Substring
    • Subarray
      • 53. Maximum Subarray
      • 152. Maximum Product Subarray
    • Backpack problem
      • Backpack I
      • Backpack II
      • Backpack III
      • Backpack IV
      • Backpack V
      • Backpack VI aka: Combination Sum IV
      • Coin Change 2
    • 70. Climbing Stairs
    • 10. Regular Expression Matching
    • 91. Decode Ways
    • 338. Counting Bits
    • 121. Best Time to Buy and Sell Stock
    • 122. Best Time to Buy and Sell Stock II
    • 123. Best Time to Buy and Sell Stock III
    • 139. Word Break
    • 140. Word Break II
    • 198. House Robber
    • 72. Edit Distance
    • 221. Maximal Square
    • 84. Largest Rectangle in Histogram
    • 312. Burst Balloons
    • 304. Range Sum Query 2D - Immutable
    • 518. Coin Change 2
    • Longest Arithmetic Progression
    • 975. Odd Even Jump
    • 1066. Campus Bikes II
  • JingChi.ai
    • Word Composition
    • Sliding Window Median
    • 537. Erect the Fence (Convex Hull Problem)
    • LintCode 558: Sliding Window Matrix Maximum
    • orienteering game
    • 扫雷
  • Twitter
    • Identifying Triangle
    • Last and Second-Last
    • 300. Longest increasing subsequence
    • Twin String
    • 647. Number of palindromic substring
    • Wildcard Matching
  • Akuna
    • C++ Intern
    • V1
    • V2
    • Cut the Sticks
    • Quant Dev
    • Postfix_to_infix
    • Drone Delivery
    • phone
  • LintCode Contest
    • Ask For Cooling Time
  • SQL
    • 176. Second Highest Salary
    • 597. Friend Requests I: Overall Acceptance Rate
  • FB 19
    • Convert Binary Search Tree (BST) to Sorted Doubly-Linked List
    • 3Sum
    • Minimum Window Substring
    • Count number of occurrences (or frequency) in a sorted array
    • Count NO2
    • Valid Palindrome
    • Merge k Sorted Lists
    • Kth Largest Element in an Array
    • Move Zeros
    • Remove Invalid Parenthesis
    • friends
    • Integer to English
    • Islands
    • Valid Palindrome
    • sec
      • Longest Increasing Path in a Matrix
      • Product of Array Except Self
      • Binary Tree Vertical Order Traversal
      • Add Binary
      • Valid Parentheses
      • sup
        • 128. Longest Consecutive Sequence
        • Combination Sum II
        • Add and Search Word - Data structure design
        • feb
  • Google 20
    • 410. Split Array Largest Sum
  • 818. Race Card
Powered by GitBook
On this page

Was this helpful?

  1. Design

534. Design TinyURL

Previous348. Design Tic-Tac-ToeNext535. Encode and Decode TinyURL

Last updated 5 years ago

Was this helpful?

Note: For the coding companion problem, please see:.

How would you design a URL shortening service that is similar to ?

Background: TinyURL is a URL shortening service where you enter a URL such ashttps://leetcode.com/problems/design-tinyurland it returns a short URL such ashttp://tinyurl.com/4e9iAk.

Requirements:

  1. For instance,"http://tinyurl.com/**4e9iAk" is the tiny url for the page"https://leetcode.com/problems/design-tinyurl". The

    identifier (the highlighted part) can be any string with 6 alphanumeric characters containing0-9,a-z,A-Z.

  2. Each shortened URL must be unique; that is, no two different URLs can be shortened to the same URL.

Note about Questions: Below are just a small subset of questions to get you started. In real world, there could be many follow ups and questions possible and the discussion is open-ended (No one true or correct way to solve a problem). If you have more ideas or questions, please ask in Discuss and we may compile it here!

Questions:

  1. How many unique identifiers possible? Will you run out of unique URLs?

    1. Take short_url as a 62 base notation. 6 bits could represent 62^6=57 billion. Of course this will run out (assuming there is no expiration for the short_url)

  2. Should the identifier be increment or not? Which is easier to design? Pros and cons?

    1. Look at the "Thoughts" part (Hash function vs Base62)

  3. Mapping an identifier to an URL and its reversal - Does this problem ring a bell to you?

  4. How do you store the URLs? Does a simple flat file database work?

    1. SQL?

  5. What is the bottleneck of the system? Is it read-heavy or write-heavy ? 1. Read-Heavy?

  6. Estimate the maximum number of URLs a single machine can store.

    1. 10M new mappings (long URL to short URL) per day.

    2. assume each mapping takes 100B in average.

    3. 1GB every day. 1 TB hard drive could stand for 3 years.

    4. Storage is not the problem for this kind of system. Service like Netflix may have storage issues

    5. Through SN( Scenario: Long URL to short URL and reversed.; Need:Assume the system is not massive if you are not sure) analysis, we could have a big picture of the system. In general, this system is not hard and could be handled by a single SSD Machine.

  7. Estimate the maximum number of queries per second (QPS) for decoding a shortened URL in a single machine.

    1. Daily User: 100M.

    2. Daily usage per person: (Write) long2short 0.1s, (Read) short2long 1s.

    3. Daily request: Write 10M, Read 100M.

    4. QPS: Since a day is 86400s approximately 100K: Write 100, Read 1K.

    5. Peak QPS: Write 200, Read 2K.

    6. Thousand level can be handled by a single SSD MySQL Machine

  8. How would you scale the service? For example, a viral link which is shared in social media could result in a peak QPS at a moment's notice.

  9. How could you handle redundancy? i,e, if a server is down, how could you ensure the service is still operational?

  10. Keep URLs forever or prune, pros/cons? How we do pruning? (Contributed by @alex_svetkin)

  11. What API would you provide to a third-party developer? (Contributed by @alex_svetkin) 1. Only one service: URLService 1. Core (Business Logic) Layer:Class: URLService 2. Interface:

    • URLService.encode(String long_url)

    • URLService.decode(String short_url)

    • Web Layer:

      1. REST API:

    • GET: /{short_url}, return a http redirect response(301)

    • POST: method -

  12. If you can enable caching, what would you cache and what's the expiry time? (Contributed by @Humandroid)

  13. SQL vs NoSQL: 1. Does it need to support transactions? NoSQL does not support transaction. No -> NoSQL 2. Do we need rich SQL query? NoSQL does not support as many queries as SQL. No -> NoSQL 3. Pursue development efficiency? Most Web Framework supports SQL database very well (with ORM()). It means fewer codes for the system. Does not matter because of only a few lines of code -> NoSQL 4. Do we need to use AUTO_INCREMENT ID? NoSQL couldn't do this. It only has a global unique Object_id. Our algorithm needs AUTO_INCREMENT ID.-> SQL 5. Does the system has a high requirement for QPS? NoSQL has high performance. For example, Memcached's QPS could reach million level, MondoDB does 10K level, MySQL only supports K level. Write 200, Read 2K. Not high.-> SQL 6. How high is the system's scalability? SQL requires developers write their codes to scale, while NoSQL comes with them (sharding, replica). Not high -> SQL

Thoughts:

For the coding part, the goal is to compress the long URL and make sure the original url would be re-applied after the encode-decode. For the systematic algorithm part, here are two proposed solution:

    1. These two algorithms could make sure hash values are randomly distributed, implementation-wise simple.

    2. Cons: the conflicts are inevitable. Any hash algorithm could have inevitable conflicts.

    3. Solutions: 1. use (long_url + timestamp) as the hash function key. 2. When conflicts, regenerates the hash value(it's different because timestamp changes).

      Overall, when urls are over 1 billion, there would be a lot of conflicts and the efficiency could be very low.

  1. Base62: Take short_url as a 62 base notation. 6 bits could represent 62^6=57 billion. 1. This implementation simply adopts the base-62 (26*2 letters + 10 digits) mapping to the URL string. Take short_url as a 62 base notation. 6 bits could represent 62^6=57 billion. 2. Each short_url represent a decimal digit. It could be the auto_increment_id in SQL database.

// You can type code here and execute it.
#include<iostream>
#include<algorithm>
#include<string>
using namespace std;

string idToShortURL(long int test){
    char map[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    string shorturl;
    while(test){
        shorturl.push_back(map[test%62]);
        test = test/62;
    }
    reverse(shorturl.begin(), shorturl.end());
    return shorturl;
}

long int shortURLtoID (string shortURL){
    long int id = 0;
    for (int i = 0; i < shortURL.length(); i++){
        if ('a' <= shortURL[i] && shortURL[i]<='z')
            id = id * 62 + shortURL[i] - 'a';
        if ('A' <= shortURL[i] && shortURL[i] <= 'Z')
            id = id * 62 + shortURL[i] - 'A' + 26;
        if('0' <= shortURL[i] && shortURL[i]<='9')
            id = id * 62 + shortURL[i] - '0' + 52;
    }
    return id;
  }

// function 
int main(int argc, char** argv) {
    unordered_map<int, string> record;

    for (int i = 1; i< argc; i++){
         cout<<"original URL: "<<(argv[i])<<endl;
        record[i] = (argv[i]);
    }
    for (int i = 1; i< argc; i++){

        cout<<"test URL: "<<record[i]<<endl;
        string shortURL = idToShortURL(i);
        cout<<"generated URL: "<<shortURL<<endl;
        cout<<"Mapped URL back to ID: "<<shortURLtoID(shortURL)<<endl;
    }

    // int test = 66666;
    // string shortURL = idToShortURL(test);
    // cout<<"generated URL: "<<shortURL<<endl;
    // cout<<"Mapped URL back to ID: "<<shortURLtoID(shortURL)<<endl;
}

Hash Function: long_url -> md5/sha1: a. md5 convert a string into 128 binary bits, generally represented as 16 bytes hex: -> c93a360dc7f3eb093ab6e304db516653; b. sha1 convert a string into 160 binary bits, generally represented as 20 bytes hex: -> dff85871a72c73c3eae09e39ffe97aea63047094

Special thanks: and for the reference!

Encode and Decode TinyURL
TinyURL
goo.gl
google shorten URL
Object-Relational Mapping
http://site.douban.com/chuan
http://site.douban.com/chuan
A "complete" solution for TinyURL (Leetcode System Design)
Segmentfault