• Definition
    • Defining the architecture, components, modules, interfaces and data for a system to satisfy specified requirements
  • Conceptual design -> Logical design -> Physical design (Macro -> Micro; going Micro -> Macro also works)
  • What is a good design?
    • Healthiness
      • Execution
      • Communication
    • Simplicity
      • No more, no less
      • Understandable

  • Focus in this lesson: Fundamental questions in system design interviews
    • Design the system
    • Estimate queries per second (QPS)
    • Scale the system
  • Design Netflix/YouTube/Spotify
    • Tags: Uber, Google, Alibaba

Please design "Netflix" - Macro

  • Crack a design in 5 steps:
    • Scenario: case/interface
    • Necessity: constraint/hypothesis
    • Application: service/algorithm
    • Kilobit: data
    • Evolve
  • Scenario: case/interface
      1. Enumerate (chat w/ interviewer)
        1. register/login
        2. play movie
        3. movie recommendation
        4. ...
      2. Sort
        1. play movie
          1. get channels
          2. get movies in channels
          3. play a movie in a channel
        2. ...
  • Necessity: constraint/hypothesis (mainly estimate from experience, supplemented by asking the interviewer for data)
      1. Ask: how many active users?
      2. Predict
        1. User
          1. average concurrent users = $$\frac{\text{daily active users}}{\text{seconds per day}} \times \text{average online time (in seconds)}$$
          2. peak users = average concurrent users * 6 (6 is an empirical value)
          3. Max peak users in 3 months = peak users * 2 (allows for user growth in the three months after launch)
      3. Traffic
        1. Traffic per user = 3Mbps
        2. Max peak traffic = Max peak users * traffic per user
      4. Memory
        1. Memory per user = 10KB
        2. Max daily memory = daily active users * 2 (growth over 3 months) * 10KB = 100GB (fine for Redis, which is OK up to the TB range)
      5. Storage
        1. Total # of movies = 14,000
        2. Movie storage = # of movies * average movie size(multiple versions) = 14,000*50GB = 700TB
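
The Necessity estimates above can be checked with a few lines of arithmetic. This is a minimal sketch, not a definitive model: the function name `estimate` and its parameter names are hypothetical, the daily-active-user figure of 5,000,000 is inferred from the 100GB memory result (5,000,000 * 2 * 10KB), and the one-hour average online time is purely an assumption for illustration. Decimal units (1GB = 10^6 KB, 1TB = 10^3 GB) are used throughout.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_active_users, avg_online_seconds,
             peak_factor=6,      # empirical value from the notes
             growth_factor=2,    # user growth in the 3 months after launch
             mbps_per_user=3,    # traffic per user
             kb_per_user=10,     # memory per user
             movie_count=14_000,
             gb_per_movie=50):   # average size, multiple versions included
    """Back-of-the-envelope capacity numbers for the Necessity step."""
    avg_concurrent = daily_active_users / SECONDS_PER_DAY * avg_online_seconds
    peak_users = avg_concurrent * peak_factor
    max_peak_users = peak_users * growth_factor
    return {
        "avg_concurrent_users": avg_concurrent,
        "peak_users": peak_users,
        "max_peak_users": max_peak_users,
        # Mbps -> Gbps
        "max_peak_traffic_gbps": max_peak_users * mbps_per_user / 1000,
        # KB -> GB
        "max_daily_memory_gb": daily_active_users * growth_factor * kb_per_user / 1_000_000,
        # GB -> TB
        "movie_storage_tb": movie_count * gb_per_movie / 1000,
    }


# Assumed inputs: 5M daily active users, 1 hour online per user per day.
result = estimate(daily_active_users=5_000_000, avg_online_seconds=3600)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")
```

With these assumed inputs the memory and storage figures reproduce the 100GB and 700TB values in the notes; the user and traffic figures depend entirely on the assumed online time.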
  • Application: service/algorithm
      1. Replay the cases and add a service for each request
      2. Merge the services

  • Kilobit: data
      1. Attach the dataset for each request under the service that handles it
      2. Choose storage types
        1. User Service - Accounts (MySQL)
        2. Channel Service - Channel List (MongoDB?? -> document store)
        3. Movie Service - Movies (Files)
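
The Application and Kilobit steps above can be sketched together: replay each request, assign it to a service, merge requests that land on the same service, then attach a dataset and storage type. This is only an illustration. The `Service` class, the `build_services` helper, and the request-to-service mapping are hypothetical; only the three services and their storage choices come from the notes.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    dataset: str
    storage: str
    requests: list = field(default_factory=list)


# Storage choices taken from the notes: service -> (dataset, storage type).
SERVICE_DATA = {
    "User Service": ("Accounts", "MySQL"),
    "Channel Service": ("Channel List", "MongoDB"),
    "Movie Service": ("Movies", "Files"),
}

# Assumed mapping of requests (from the Scenario step) to services.
REQUESTS = [
    ("register/login", "User Service"),
    ("get channels", "Channel Service"),
    ("get movies in channels", "Channel Service"),
    ("play a movie in a channel", "Movie Service"),
]


def build_services(requests, service_data):
    """Replay the requests; requests sharing a service are merged into it."""
    services = {}
    for request, service_name in requests:
        if service_name not in services:
            dataset, storage = service_data[service_name]
            services[service_name] = Service(service_name, dataset, storage)
        services[service_name].requests.append(request)
    return services


services = build_services(REQUESTS, SERVICE_DATA)
for svc in services.values():
    print(f"{svc.name}: {svc.requests} -> {svc.dataset} ({svc.storage})")
```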
  • Evolve
      1. Analyze
        1. Better: constraints
        2. Broader: new cases
        3. Deeper: details
      2. Views
        1. Performance
        2. Scalability (# of users, # of machines)
        3. Robustness

Please design "Netflix" - Micro

  • Design recommendation module

  • Rule of thumb: 10^6 to 10^9 operations take approximately 1s
  • Similarity <- bucket sort (inverted index: build the index with the movie as the key)
  • Dispatcher -> load balancer
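
One way to read the similarity note above is as follows: build an inverted index from each movie to the users who watched it, count how many movies every other user shares with the target user, and rank those users by putting them into buckets keyed by the shared count. The sketch below follows that reading. The function names, the sample watch history, and the final recommendation step are assumptions for illustration, not the method from the lesson.

```python
from collections import defaultdict


def build_inverted_index(watch_history):
    """Map each movie (the key) to the set of users who watched it."""
    index = defaultdict(set)
    for user, movies in watch_history.items():
        for movie in movies:
            index[movie].add(user)
    return index


def similar_users(user, watch_history, index):
    """Rank other users by number of shared movies, using bucket sort."""
    shared = defaultdict(int)
    for movie in watch_history[user]:
        for other in index[movie]:
            if other != user:
                shared[other] += 1
    # Bucket k holds the users sharing exactly k movies with `user`.
    buckets = [[] for _ in range(len(watch_history[user]) + 1)]
    for other, count in shared.items():
        buckets[count].append(other)
    ranked = []
    for count in range(len(buckets) - 1, 0, -1):  # most shared first
        ranked.extend(sorted(buckets[count]))     # sorted for stable output
    return ranked


def recommend(user, watch_history, index, top_k=2):
    """Suggest unseen movies watched by the top_k most similar users."""
    seen = set(watch_history[user])
    suggestions = []
    for other in similar_users(user, watch_history, index)[:top_k]:
        for movie in watch_history[other]:
            if movie not in seen:
                seen.add(movie)
                suggestions.append(movie)
    return suggestions


history = {
    "alice": ["m1", "m2", "m3"],
    "bob": ["m1", "m2", "m4"],
    "carol": ["m3", "m5"],
    "dave": ["m6"],
}
index = build_inverted_index(history)
print(similar_users("alice", history, index))  # ['bob', 'carol']
print(recommend("alice", history, index))      # ['m4', 'm5']
```

Because the shared count is bounded by the number of movies the target user has watched, the bucket pass is linear in the number of candidate users, which avoids a comparison sort over all of them.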

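  • Types of resource-intensive workloads
    • Disk-intensive: heavy reads/writes (?), e.g. saving crawler results
    • CPU-intensive: compute-heavy, e.g. deduplication inside the crawler
    • Memory-intensive: e.g. secondary computation over crawler results
    • Network-bandwidth-intensive: e.g. the crawler fetching web pages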

Improve Robustness

  • The dispatcher may have a mirror
  • Feed manager -> timeline manager (news feed)
  • Loggers -> Monitor
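
As a rough illustration of the first and last points above, the sketch below shows a dispatcher that load-balances round-robin across workers, with a mirror that takes over when the primary is down and a log that a monitor could read. The `Dispatcher` class, the `dispatch_with_failover` helper, and the `healthy` flag are hypothetical names; a real system would detect failures with health checks rather than a manually set flag.

```python
import itertools


class Dispatcher:
    """Hands each request to the next worker in round-robin order."""

    def __init__(self, name, workers):
        self.name = name
        self.healthy = True  # assumption: toggled by some health check
        self._workers = itertools.cycle(workers)

    def dispatch(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{request} -> {next(self._workers)} (via {self.name})"


def dispatch_with_failover(request, primary, mirror, log):
    """Try the primary dispatcher; on failure, log it and use the mirror."""
    try:
        return primary.dispatch(request)
    except ConnectionError as err:
        log.append(str(err))  # a monitor could alert on these entries
        return mirror.dispatch(request)


primary = Dispatcher("primary", ["worker-1", "worker-2"])
mirror = Dispatcher("mirror", ["worker-1", "worker-2"])
log = []

print(dispatch_with_failover("play movie", primary, mirror, log))
primary.healthy = False  # simulate the primary dispatcher failing
print(dispatch_with_failover("play movie", primary, mirror, log))
print(log)
```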
