Question
YouTube’s analytics team is studying the evolution of tech content creators. They have historical data
showing how creators’ content changes over time in terms of technical depth and entertainment value.
Each video is rated on two scales, technical depth and entertainment value, and each creator posts one video every week.
You have a dataset of 100 creators spread across 52 weeks. Each line in the dataset contains the <tech value, entertainment value> of the previous video and the <tech value, entertainment value> of the next video posted by the same creator. Analyzing this will show you how the content evolves over time.
Now, you are given a different list of 30 creators and their current state of content <tech value, entertainment value>. Among these 30 creators, figure out:
- the creator with the highest technical depth after 4 weeks
- the creator with the highest entertainment value after 4 weeks
- the creators who switched from tech-focused to entertainment-focused, and those who switched from entertainment-focused to tech-focused
You can output the index of each creator in the list of 30 creators (starting at 0).
Datasets
Solution
Here’s the code for reference and some notes on the solution below.
We need to use the data to calculate the transformation matrix.
The transformation matrix is a 2x2 matrix that tells us how much the current tech depth and entertainment value
influence the future tech depth and entertainment value.
To generate this matrix, we use the least squares regression method. Either follow the link above or ask your favourite LLM tool to build an understanding. Applying this method
to the data gives us the following matrix.
[[0.70500624 0.19902547]
[0.09087316 0.89926622]]
- 0.70500624 represents how much the current tech depth influences the future tech depth
- 0.19902547 represents how much the current entertainment value influences the future tech depth
- 0.09087316 represents how much the current tech depth influences the future entertainment value
- 0.89926622 represents how much the current entertainment value influences the future entertainment value
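As a rough illustration of the least squares step, the sketch below fits a 2x2 matrix from (previous, next) pairs. It assumes each dataset row is laid out as [prev_tech, prev_ent, next_tech, next_ent]; the helper name `fit_transition` and the synthetic data are hypothetical stand-ins, not the actual dataset or the reference code.

```python
import numpy as np

def fit_transition(pairs):
    """Estimate A such that next ≈ A @ prev.

    Assumed row layout: [prev_tech, prev_ent, next_tech, next_ent].
    """
    pairs = np.asarray(pairs, dtype=float)
    X = pairs[:, :2]  # previous states, one per row
    Y = pairs[:, 2:]  # next states, one per row
    # In row form, next = A @ prev becomes Y = X @ A.T, so lstsq solves for A.T.
    A_T, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return A_T.T

# Synthetic stand-in for the real dataset (assumption, for illustration only).
rng = np.random.default_rng(0)
A_true = np.array([[0.7, 0.2],
                   [0.1, 0.9]])
prev = rng.uniform(0, 10, size=(200, 2))
nxt = prev @ A_true.T + rng.normal(0, 0.01, size=(200, 2))

A_hat = fit_transition(np.hstack([prev, nxt]))
print(A_hat)  # should be close to A_true
```

With the real data, the same call would be fed the parsed dataset rows instead of the synthetic pairs.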
Now that we have the transformation matrix, we can use it to predict the future state of any
creator. The idea is to multiply the transformation matrix with the current state of the creator
to get the future state.
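For a single week, the multiplication looks like this. The starting state `x0` is a made-up creator used only for illustration.

```python
import numpy as np

# Transformation matrix from the least squares fit above.
A = np.array([[0.70500624, 0.19902547],
              [0.09087316, 0.89926622]])

# Hypothetical creator: tech depth 10, entertainment value 2.
x0 = np.array([10.0, 2.0])

# One week later: each new value is a weighted mix of both current values.
x1 = A @ x0
print(x1)  # roughly [7.448, 2.707]
```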
To compute the kth state, we have two options
- Multiply the transformation matrix with the current state k times
- Use eigenvalues and eigenvectors
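The second option scales better when k is large: once the matrix is decomposed into its eigenvalues and eigenvectors, computing the kth power only requires raising the two eigenvalues to the power k. For a 2x2 matrix and k = 4 the difference is negligible, so either option works here.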
import numpy as np

def predict(A, x0, k):
    # A^k = V @ diag(eigenvalues^k) @ V^-1, assuming A is diagonalizable
    eigenvalues, eigenvectors = np.linalg.eig(A)
    return eigenvectors @ np.diag(eigenvalues ** k) @ np.linalg.inv(eigenvectors) @ x0
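As a sanity check, the eigendecomposition route can be compared with plain repeated multiplication. The sketch below uses `np.linalg.matrix_power` for the latter; the names `predict_eig` and `predict_power` and the starting state are illustrative choices, not part of the reference code.

```python
import numpy as np

def predict_eig(A, x0, k):
    # A^k via eigendecomposition, assuming A is diagonalizable.
    eigenvalues, eigenvectors = np.linalg.eig(A)
    return eigenvectors @ np.diag(eigenvalues ** k) @ np.linalg.inv(eigenvectors) @ x0

def predict_power(A, x0, k):
    # A^k via repeated multiplication (handled by matrix_power).
    return np.linalg.matrix_power(A, k) @ x0

A = np.array([[0.70500624, 0.19902547],
              [0.09087316, 0.89926622]])
x0 = np.array([10.0, 2.0])  # hypothetical starting state

a = predict_eig(A, x0, 4)
b = predict_power(A, x0, 4)
print(np.allclose(a, b))  # both routes should agree up to rounding
```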
Applying predict to all 30 creators (in the test set), we get the final state for each, and then compute
- np.argmax(final_state[:, 0]) to get the creator with the highest technical depth after 4 weeks
- np.argmax(final_state[:, 1]) to get the creator with the highest entertainment value after 4 weeks
- the argmins of each creator's initial and final state, comparing them to tell which creators switched from tech-focused to entertainment-focused and which switched from entertainment-focused to tech-focused
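Putting the three answers together might look like the sketch below. The `creators` array is a tiny made-up stand-in for the list of 30, and the sketch uses argmax along axis 1 to find each creator's dominant dimension, which for two columns carries the same information as the argmin mentioned above (ignoring ties).

```python
import numpy as np

A = np.array([[0.70500624, 0.19902547],
              [0.09087316, 0.89926622]])

# Hypothetical current states, one row per creator: [tech, entertainment].
creators = np.array([[9.0, 1.0],
                     [2.0, 8.0],
                     [5.0, 4.0]])

# Apply A four times to every row at once (row form of A^4 @ x).
final_state = creators @ np.linalg.matrix_power(A, 4).T

top_tech = int(np.argmax(final_state[:, 0]))  # highest technical depth
top_ent = int(np.argmax(final_state[:, 1]))   # highest entertainment value

# Dominant dimension per creator: 0 = tech-focused, 1 = entertainment-focused.
before = np.argmax(creators, axis=1)
after = np.argmax(final_state, axis=1)
tech_to_ent = np.where((before == 0) & (after == 1))[0]
ent_to_tech = np.where((before == 1) & (after == 0))[0]

print(top_tech, top_ent, tech_to_ent, ent_to_tech)
```

The printed indices refer to rows of `creators`, starting at 0, which matches the output format the question asks for.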
Why does this matter?
- A transition (or transformation) matrix can be leveraged to predict the future state of a system
- This is used in prediction, system stability, recommendation systems, etc.
- This is used in Markov Chains to predict the future state of a system
- This is used in Computer Graphics, Finance modelling, NLP, and Social Media Analysis.