It has been a while since I last published an article on my Medium account because I was too busy, and I think the holiday is a good time to wrap up what has happened over the past few years.
Here are some updates on my current status:
- I am still in the SEO field and doing it quite hands-on
- I’ve been working for Crypto.com since 2019
- I started learning Python in 2020 and have applied it to my work
- I’ve been buying more Bitcoin since 2020 and I am a HODLer
Since I have been learning Python, I want to share some of my learning experience as a regular refresher. I learned Python because I found it good for large-scale data analysis. At Crypto.com, we have a section called Crypto.com Price with nearly 3,000 pages and more than 65,000 organic keywords to track and report on, so traditional reporting methods such as Google Sheets or Excel tables would be quite challenging and not scalable.
Another driving force was the inspiration of one of our industry legends, Hamlet Batista (CEO of RankSense, who passed away in January 2021 due to COVID-19). I was fascinated by his deep knowledge of how to use Python for SEO to bring the game to the next level.
One of my job routines is SEO reporting: visualizing large-scale data in a simplified way so that management can get a quick overview of the search performance of our products (sub-directories of our website).
Plotly — Python Open Source Graphing Library
For data visualisation, I use the open-source Python library Plotly to generate treemap charts that address this reporting need.
Plotly is a free and open-source Python library, licensed under the MIT license, that can make interactive, publication-quality graphs. Below are some charts that can be produced with Plotly.
In this article, we will use treemap charts to visualize our SEO data. Treemap charts visualize hierarchical data using nested rectangles, and you can categorize them and assign different parameters based on your needs. Below is an example of a treemap chart created with the Plotly library.
Install Plotly to get started
Now you will need to install Plotly to get everything started.
You will also need a Python IDE first; I personally use Google Colab.
Installing Plotly in Google Colab is fairly easy. You need to run the pip install command below.
pip install plotly==4.14.1
Alternatively, you can also use this command
pip install --upgrade plotly
Then we need to import Plotly for use.
To make a treemap chart, we use the plotly.express module (usually imported as px). Plotly Express is a built-in part of the plotly library and is the recommended starting point for creating most common figures.
You will need to use the command below to import it.
import plotly.express as px
Importing the Plotly library is not enough; we need data to populate the treemap diagram. To handle data in Python, Pandas is a great tool and a very commonly used module in the data science field. To import Pandas in your favorite notebook, you can use the command below.
import pandas as pd
Upload SEO data
After you run the above commands, it’s time to upload the data to build the treemap chart. In this example, I use data from the Organic keywords 2.0 report in the SEO tool Ahrefs. You can find this report in the Ahrefs dashboard and export the data as a CSV file, which you can then save in .xls or .xlsx format for the subsequent analysis. More details about the Organic keywords 2.0 report can be found in the YouTube video below or here.
To upload Excel files from your local storage, you will need to run the additional commands below. Once these commands are executed, an upload widget will appear on screen and you can upload files from your computer.
from google.colab import files
uploaded = files.upload()
After the data file has been uploaded, you will need to convert it into a pandas dataframe for analysis. You can do this by reading the data with the pandas.read_excel() method and storing the result in a variable df.
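Here is a minimal sketch of this step. It relies on the fact that files.upload() in Colab returns a dictionary mapping each file name to its raw bytes, so the bytes can be wrapped in io.BytesIO and handed to pandas. The load_keywords helper, the sample.csv file name and the column names are hypothetical stand-ins, not the real Ahrefs export; the fake uploaded dictionary is only there so the snippet also runs outside Colab.

```python
import io

import pandas as pd


def load_keywords(filename, data):
    """Hypothetical helper: turn uploaded bytes into a DataFrame."""
    buffer = io.BytesIO(data)
    if filename.lower().endswith((".xls", ".xlsx")):
        # Excel files go through pandas.read_excel()
        return pd.read_excel(buffer)
    # Anything else is treated as a raw CSV export
    return pd.read_csv(buffer)


# In Colab this dictionary would come from: uploaded = files.upload()
# Assumption: a tiny made-up CSV stands in for the real report here.
uploaded = {
    "sample.csv": b"Keyword,Volume,Current position\n"
                  b"bitcoin price,1000,3\n"
                  b"ethereum price,800,5\n"
}

filename, data = next(iter(uploaded.items()))
df = load_keywords(filename, data)
```

If you only ever upload one .xlsx file, calling pd.read_excel() on the wrapped bytes directly is enough; the helper just saves converting the CSV export by hand.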
You can preview the dataframe by printing it.
You can also preview the first n rows with the head() method; the default number of rows is 5.
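The two preview options look roughly like this. The small dataframe is made-up sample data standing in for the uploaded keyword report, and its column names are assumptions.

```python
import pandas as pd

# Hypothetical sample data standing in for the uploaded keyword report
df = pd.DataFrame({
    "Keyword": [f"keyword {i}" for i in range(1, 8)],
    "Volume": [700, 600, 500, 400, 300, 200, 100],
})

print(df)                 # prints the dataframe (long ones get truncated)

first_five = df.head()    # no argument: the first 5 rows
first_three = df.head(3)  # first n rows, here n = 3
print(first_three)
```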
Once the dataframe is ready, you will use its columns to construct the hierarchy of the treemap chart; as mentioned, treemap charts present data in a hierarchical way using nested rectangles. You will need to create different variables to store the data from different columns, just like the screenshot example below.
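As a sketch of what that can look like, each column of interest is pulled out of the dataframe into its own variable. All column names and values here are hypothetical; in particular, "Category" is assumed to be a column you add yourself to group keywords, so adjust the names to match your own export.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from the uploaded report
df = pd.DataFrame({
    "Country": ["us", "us", "gb"],
    "Category": ["price", "exchange", "price"],
    "Keyword": ["bitcoin price", "crypto exchange", "ethereum price"],
    "Volume": [1000, 500, 800],
    "Current position": [3, 12, 5],
    "Previous position": [4, 15, 5],
    "URL": ["/price/bitcoin", "/exchange", "/price/ethereum"],
})

# Hierarchy levels, from the outer rectangles to the inner ones
country = df["Country"]
category = df["Category"]
keyword = df["Keyword"]

# Metrics used for the rectangle size, the color and the hover box
volume = df["Volume"]
position = df["Current position"]
previous_position = df["Previous position"]
url = df["URL"]
```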
After you have stored the column data in these variables, it’s time to use them to construct the treemap chart with the snippet of commands below.
px.treemap can take a path parameter corresponding to a list of columns, and the order of the entries (country, category, keyword) corresponds to the hierarchy of the treemap from the outer level to the inner level.
The color_continuous_scale argument sets the color scale applied to the color argument, here running from green for the smallest value to red for the largest value.
The hover_data argument lists the variables that you want to show when the mouse hovers over a particular rectangle.
Finally, we call plotly.offline.plot(fig, filename='') to write the treemap chart to an HTML file, and we download it with one last command below.
The output will look something like this: the area of each rectangle represents the keyword’s search volume, while the color represents its search ranking. When you hover over a particular rectangle, the corresponding page URL and the previous keyword ranking are shown too.
I hope you find this article useful. Your applause and shares will be my driving force to keep sharing my Python learning on Medium.