Build a TikTok Data Science App with Streamlit and Python | Data Science Project
Introduction
Hey everyone, my name is Nicole Astronaut, and in this article, we're going through an exciting project where we'll apply our data analytics skills to build a real-time TikTok analytics dashboard. We'll utilize Python, specific libraries like Streamlit and Playwright, and go through the process step-by-step, reflecting a real-life client and developer relationship.
Step 1: Gathering Data from TikTok
First, we'll collect TikTok data. TikTok's official API doesn't make it easy to pull trending videos or videos for a specific hashtag, so we'll use an unofficial Python library instead. Here’s the general approach:
- Create and activate a virtual environment to keep our dependencies isolated.
- Install the necessary libraries, then download the browser binaries that Playwright needs:
pip install tiktokapi playwright
python -m playwright install
- Write a Python script to fetch data from TikTok using the API library and process it into a JSON-friendly format.
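The script from the last step can be sketched as below. This is a minimal sketch rather than the script used in the project: `fetch_videos` is a hypothetical placeholder for the calls to the unofficial TikTok library (whose interface differs between versions), the records it returns are made-up stand-ins, and the output filename `tiktok_raw.json` is an assumption.

```python
import json
import sys


def fetch_videos(hashtag):
    """Hypothetical placeholder: swap in real calls to the TikTok library here.

    Returns made-up records shaped like nested video dictionaries.
    """
    return [
        {"id": "1", "description": f"example clip #{hashtag}",
         "stats": {"diggCount": 10, "playCount": 100}},
        {"id": "2", "description": f"another clip #{hashtag}",
         "stats": {"diggCount": 5, "playCount": 40}},
    ]


def save_videos(videos, path="tiktok_raw.json"):
    """Write the list of video dictionaries to a JSON file and return its path."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(videos, f, ensure_ascii=False, indent=2)
    return path


def main(hashtag):
    """Fetch videos for one hashtag and save them to disk."""
    videos = fetch_videos(hashtag)
    return save_videos(videos)


# Only runs when a hashtag is given on the command line,
# e.g. `python tiktok.py funny`
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Taking the hashtag from the command line keeps the script usable on its own and makes it easy to call from another program later.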
Step 2: Preprocessing Data
The data from TikTok will not be in the most convenient format, with nested dictionaries making it challenging to analyze directly. We'll:
- Import the necessary modules (e.g., json, pandas).
- Preprocess the data by flattening these nested structures into a more tabular format.
- Export the cleaned data to a CSV, making it more manageable and shareable.
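The flattening and export steps can be sketched with `pandas.json_normalize`, which expands nested dictionaries into dotted column names such as `stats.diggCount`. The function name `videos_to_csv`, the sample records, and the default filename are assumptions made for illustration; the real TikTok payload has many more fields.

```python
import pandas as pd


def videos_to_csv(videos, path="tiktok_data.csv"):
    """Flatten a list of nested video dictionaries and save them as CSV."""
    # json_normalize turns {"stats": {"diggCount": 10}} into a "stats.diggCount" column
    df = pd.json_normalize(videos)
    df.to_csv(path, index=False)
    return df


# Made-up records standing in for the real TikTok response
sample = [
    {"id": "1", "description": "first clip",
     "stats": {"diggCount": 10, "playCount": 100}},
    {"id": "2", "description": "second clip",
     "stats": {"diggCount": 5, "playCount": 40}},
]

flat = pd.json_normalize(sample)
print(list(flat.columns))
```

Each nested key becomes its own column, so the result can be filtered, sorted, and plotted like any other table.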
Step 3: Building the Streamlit Dashboard
Next, we move on to building our dashboard. We'll use Streamlit, which simplifies the creation of interactive web apps with Python.
Basic Setup
Install Streamlit and set up the basics:
pip install streamlit
Creating the App Script
Create an app.py file and import the necessary modules:
import streamlit as st
import pandas as pd
import plotly.express as px
Adding a Search Bar and Data Fetching Button
In app.py, set up a search bar for hashtags and a button to fetch data:
hashtag = st.text_input("Search for a hashtag here", value="")
if st.button("Get Data"):
    pass  # The data fetching function is called here (see the next section)
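Users may type the hashtag with a leading `#` or stray spaces. A small helper could tidy the input before it is passed on, assuming the fetching script expects the bare tag name. `normalize_hashtag` is a hypothetical name, not part of Streamlit.

```python
def normalize_hashtag(raw):
    """Strip surrounding whitespace and any leading '#' characters."""
    return raw.strip().lstrip("#").strip()
```

In app.py this would wrap the text input value, for example `hashtag = normalize_hashtag(hashtag)`, before the button handler uses it.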
Fetching and Displaying Data
Invoke the TikTok data fetching function upon button click and display it:
import subprocess
import sys

def get_data(hashtag):
    # Run the scraper as a separate process; passing a list avoids quoting problems
    subprocess.run([sys.executable, "tiktok.py", hashtag], check=True)

if st.button("Get Data"):
    get_data(hashtag)
    df = pd.read_csv('tiktok_data.csv')  # Assuming the result is saved to this file
    st.dataframe(df)
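If the fetching script fails, the CSV may be missing or left over from an earlier run. One way to guard against that is to check the script's exit code before reading the file. The helper name `run_script` is hypothetical, and the script path and arguments are whatever your project uses.

```python
import subprocess
import sys


def run_script(script_path, *args):
    """Run a Python script with the current interpreter; return True on success."""
    result = subprocess.run(
        [sys.executable, script_path, *args],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface the script's error output so failures are easy to diagnose
        print(result.stderr, file=sys.stderr)
    return result.returncode == 0
```

In app.py, the button handler could read the CSV only when `run_script("tiktok.py", hashtag)` returns True and show a message with `st.error` otherwise.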
Step 4: Adding Visualizations
We'll add various visualizations using Plotly:
if st.button("Get Data"):
    # Other code here...
    fig = px.histogram(df, x="description", y="stats.diggCount", title="Top TikToks")
    st.plotly_chart(fig)
    scatter_fig = px.scatter(df, x="stats.playCount", y="stats.commentCount",
                             title="Play Count vs Comment Count",
                             size="stats.shareCount", color="stats.likeCount")
    st.plotly_chart(scatter_fig)
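With many videos, a chart keyed on the full description can become hard to read. One option, not taken from the original project, is to keep only the most-liked rows before plotting. The sketch below assumes the flattened column `stats.diggCount` used in the chart above; the helper name `top_videos` and the sample rows are made up.

```python
import pandas as pd


def top_videos(df, n=10, by="stats.diggCount"):
    """Return the n rows with the largest values in the `by` column."""
    return df.nlargest(n, by).reset_index(drop=True)


# Made-up rows for illustration only
sample = pd.DataFrame({
    "description": ["a", "b", "c"],
    "stats.diggCount": [5, 50, 20],
})
top_two = top_videos(sample, n=2)
```

The trimmed frame can then be passed to `px.histogram` in place of `df`.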
Step 5: Enhancing the Dashboard with a Sidebar and Responsiveness
Add a sidebar for instructional information and make the page layout wider. st.set_page_config is meant to be the first Streamlit command in the script, so place it at the top of app.py, right after the imports and before the search bar and sidebar:
st.set_page_config(layout="wide")
st.sidebar.markdown("# TikTok Analytics")
st.sidebar.markdown("This dashboard allows you to analyze trending TikToks in real-time.")
st.sidebar.markdown("## Instructions")
st.sidebar.markdown("1. Enter the hashtag you want to analyze.")
st.sidebar.markdown("2. Click Get Data.")
st.sidebar.markdown("3. View and analyze the visualizations and table below.")
Finally, run your Streamlit app:
streamlit run app.py
Conclusion
We've built a real-time TikTok analytics dashboard using Python and Streamlit, with interactive visualizations driven by data fetched from TikTok. The dashboard can be iterated on and refined based on client feedback, which shows how agile development principles apply to data science projects.
Keywords
- Data Analytics
- TikTok API
- Python
- Streamlit
- Data Preprocessing
- Visualization
- Plotly
- JSON
- CSV
FAQ
What is Streamlit?
Streamlit is an open-source Python library that makes it easy to create and share interactive web applications for data science and machine learning projects.
How do I install the TikTok API and other necessary libraries?
Use the following command:
pip install tiktokapi playwright streamlit pandas plotly
How do I preprocess nested JSON data into a flat CSV file?
You can create a helper function to convert nested structures into a flat dictionary and then use pandas to export the result to CSV.
Why did you use subprocess to call the TikTok API script?
Using subprocess to call the TikTok API script avoids threading issues with Streamlit and allows for stable interaction with the API.
How can I add custom visualizations to the Streamlit dashboard?
You can use Plotly Express to create custom visualizations and embed them in your Streamlit app using st.plotly_chart(). Adjust the settings and parameters according to your data attributes.