Designing Machine Learning Platforms

Table of Contents
- [Overview of Search Rank Chapter from educative.io and other sources:](#overview-of-search-rank-chapter-from-educative-dot-io-and-other-sources)

Overview of Search Rank Chapter from educative.io and other sources:

Problem: Design a Twitter Feed system that will show the most relevant tweets for a user based on their social graph

  • Timestamp-based approach: all tweets generated by a user's followees since the user's last visit are displayed in reverse chronological order.

  • Instead, we need to rank the most relevant tweets first:

  • Scale:

    • 500 million DAU and on average each user is connected to 100 users.
    • Each user fetches their feed 10 times in a day.

    500 million * 10 = 5 billion times per day the ranking system will run.

    “Given a list of tweets, train an ML model that predicts the probability of engagement of tweets and orders them based on that score”
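The quoted task can be sketched in miniature: given candidate tweets and a model's predicted engagement probabilities, order them by score. This is a hypothetical sketch only; `predict_engagement` is a stub standing in for a trained model, and the scores are invented for illustration.

```python
# Sketch: rank candidate tweets by predicted engagement probability.
# A real system would compute features from the tweet, user, and
# context, then run a trained model; the stub below just reads a
# pre-filled score.

def predict_engagement(tweet):
    # Placeholder for a trained model's probability-of-engagement output.
    return tweet["predicted_score"]

def rank_tweets(tweets):
    # Order tweets by predicted engagement, highest first.
    return sorted(tweets, key=predict_engagement, reverse=True)

tweets = [
    {"id": 1, "predicted_score": 0.12},
    {"id": 2, "predicted_score": 0.87},
    {"id": 3, "predicted_score": 0.45},
]
print([t["id"] for t in rank_tweets(tweets)])  # [2, 3, 1]
```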

  • Goal: Maximize user engagement.

  • User actions can be positive or negative

    • Positive actions:
      • Time spent viewing the tweet
      • Liking
      • Retweeting
      • Commenting
    • Negative Actions:
      • Hiding a tweet
      • Reporting a tweet as inappropriate
  • User engagement metrics:

  • Ways to increase user engagement:
    • Focus on increasing the number of comments
    • Increase overall engagement, i.e., comments, likes, and retweets
    • Increase time spent on Twitter
    • Decrease the average number of negative actions per user
  • All engagements are not equally important, so assign a different weight to each action.

The above metric is calculated as follows:

  • In a day, 2000 tweets were viewed.
  • There were 70 likes, 80 comments, 20 retweets, and 5 reports.
  • The weighted impact is calculated by multiplying each action's occurrence count by its weight.
  • The weighted impacts are summed up to determine the score.
  • The score is normalized by the total number of users.
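The steps above can be sketched as follows. The per-action weights and the user count used for normalization are assumptions for illustration (real weights are product decisions, e.g., comments weighted higher than likes, reports weighted negatively):

```python
# Weighted engagement score for one day, using the counts from the
# example above. Weights and user count are assumed values.

daily_counts = {"like": 70, "comment": 80, "retweet": 20, "report": 5}
weights = {"like": 1, "comment": 3, "retweet": 2, "report": -5}  # assumed

# Multiply each action's count by its weight, then sum.
weighted_impact = sum(daily_counts[a] * weights[a] for a in daily_counts)

# Normalize by the number of users so scores from different periods
# and user populations are comparable.
total_users = 100  # assumed
score = weighted_impact / total_users

print(weighted_impact, score)  # 325 3.25
```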

Why is normalization important?

  • The score is calculated over a period of time for a given number of users. If scores are calculated over different periods or for different numbers of users, they will not be comparable without normalization.

Architecture:

Continuous Integration:

  1. Version Control –> Git
  2. Automated Testing –> Pytest, pytest-cov, Codecov
  3. Static Code Analysis –> Pylint, flake8, bandit
  4. CI server: Jenkins, CircleCI, or Travis CI to automate the build, test, and validation process

Continuous Delivery:

  1. IaC (Infrastructure as Code) –> Terraform
  2. Containerization –> Docker
  3. CD target: Kubernetes, ECS, Google Cloud Run to automate the deployment process

Continuous Testing:

  1. Unit Testing –> Unittest, Pytest, nose
  2. Integration Testing –> Robot Framework, Behave, PyAutoGUI to test integration with other models
  3. Load Testing –> Apache JMeter, Gatling, Locust to test scalability and performance of the model
  4. A/B Testing –> TensorFlow Serving, Kubeflow, SageMaker to deploy multiple versions and test them against different user groups or scenarios
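One building block behind A/B testing is deterministic user bucketing. A minimal sketch, assuming users are split by a hash of their ID (the serving tools listed above handle this at scale; this only illustrates the idea):

```python
# Deterministic A/B bucketing: hashing the user ID makes the
# assignment stable across requests without storing any state.
import hashlib

def assign_variant(user_id, variants=("control", "treatment")):
    # Hash the user ID and map it onto one of the variants.
    digest = hashlib.sha256(str(user_id).encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# The same user always lands in the same bucket.
print(assign_variant(42) == assign_variant(42))  # True
```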

Example 1: Build a model with CI

  1. Set up code in Git
  2. Write Unit tests using Pytest.
  3. Create CI Pipeline using Jenkins or Travis CI
  4. Configure pipeline to run unit tests automatically whenever repo changes
  5. Use code coverage tool like pytest-cov.
  6. Use code quality tool like pylint.
  7. Use a coverage reporting tool like Codecov when reviewing and merging code changes.
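Step 2 above (unit tests with Pytest) might look like the following sketch. `normalize_score` is an invented function under test, not from the source; Pytest auto-discovers functions named `test_*` and runs the bare `assert` statements inside them.

```python
# Hypothetical function under test: normalize a weighted engagement
# score by the number of users.
def normalize_score(weighted_impact, num_users):
    # Guard against invalid input before normalizing.
    if num_users <= 0:
        raise ValueError("num_users must be positive")
    return weighted_impact / num_users

# Pytest discovers and runs these automatically (pytest test_score.py).
def test_normalize_score():
    assert normalize_score(300, 100) == 3.0

def test_normalize_score_rejects_zero_users():
    try:
        normalize_score(300, 0)
        assert False, "expected ValueError"
    except ValueError:
        pass
```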

Example 2: Deploying a model with CD

  1. Containerize the model-serving code with Docker.
  2. Provision infrastructure with Terraform.
  3. Deploy the container via Kubernetes, ECS, or Google Cloud Run, and automate rollouts from the CD pipeline.
