Bulk Loading Performance Tests With PageSpeed Insights API & Python



Google offers the PageSpeed Insights API to help SEO professionals and developers by combining real-world data with simulation data, providing load performance timing data for web pages.

The difference between Google PageSpeed Insights (PSI) and Lighthouse is that PSI includes both real-world and lab data, while Lighthouse performs a page loading simulation by modifying the connection and user-agent of the device.

Another point of difference is that PSI doesn’t supply any information related to web accessibility, SEO, or progressive web apps (PWAs), while Lighthouse provides all of the above.

Thus, when we use the PageSpeed Insights API for bulk URL loading performance tests, we won’t have any data for accessibility.

However, PSI provides more information related to page speed performance, such as “DOM Size,” “Deepest DOM Child Element,” “Total Task Count,” and “DOM Content Loaded” timing.

One more advantage of the PageSpeed Insights API is that it gives the “observed metrics” and “actual metrics” different names.

In this guide, you’ll learn:

  • How to create a production-level Python script.
  • How to use APIs with Python.
  • How to construct data frames from API responses.
  • How to analyze the API responses.
  • How to parse URLs and process URL requests’ responses.
  • How to store the API responses with proper structure.

An example output of the PageSpeed Insights API call with Python is below.

Screenshot: example output of the PageSpeed Insights API (from author, June 2022)

Libraries For Using PageSpeed Insights API With Python

The necessary libraries to use the PSI API with Python are below.

  • Advertools retrieves the testing URLs from the sitemap of a website.
  • Pandas constructs the data frame and flattens the JSON output of the API.
  • Requests makes the request to the specific API endpoint.
  • JSON parses the API response into a dictionary so that each value can be read by its key.
  • Datetime adds the current date to the output file’s name.
  • URLlib parses the test subject website’s URL.

How To Use PSI API With Python?

To use the PSI API with Python, follow the steps below.

  • Get a PageSpeed Insights API key.
  • Import the necessary libraries.
  • Parse the URL for the test subject website.
  • Take the date of the moment for the file name.
  • Take URLs into a list from a sitemap.
  • Choose the metrics that you want from the PSI API.
  • Create a for loop for taking the API response for all URLs.
  • Construct the data frame with the chosen PSI API metrics.
  • Output the results in the form of XLSX.

1. Get PageSpeed Insights API Key

Use the PageSpeed Insights API documentation to get the API key.

Click the “Get a Key” button below.

Image: PSI API key (from developers.google.com, June 2022)

Choose a project that you have created in Google Developer Console.

Image: Google Developer Console API project (from developers.google.com, June 2022)

Enable the PageSpeed Insights API on that specific project.

Image: enabling the PageSpeed Insights API (from developers.google.com, June 2022)

You will need to use this specific API key in your API requests.

2. Import The Necessary Libraries

Use the lines below to import the fundamental libraries.

    import advertools as adv
    import pandas as pd
    import requests
    import json
    from datetime import datetime
    from urllib.parse import urlparse

3. Parse The URL For The Test Subject Website

To parse the URL of the subject website, use the code structure below.

    domain = urlparse(sitemap_url)
    domain = domain.netloc.split(".")[1]

The “domain” variable is the parsed version of the sitemap URL.

The “netloc” represents the specific URL’s domain section. When we split it with “.”, index 1 takes the “middle section,” which represents the domain name.

Here, “0” is for “www,” “1” is for the “domain name,” and “2” is for the “domain extension,” if we split it with “.”
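
As a rough illustration of the split described above, the sketch below parses a hypothetical sitemap URL (the “example.com” addresses are assumptions used only for demonstration). It also shows that index 1 only yields the domain name when the host starts with a single prefix such as “www.”

```python
from urllib.parse import urlparse

# Hypothetical sitemap URL, used only for illustration.
sitemap_url = "https://www.example.com/sitemap.xml"

parts = urlparse(sitemap_url).netloc.split(".")
print(parts)   # ['www', 'example', 'com']
domain = parts[1]
print(domain)  # example

# Caveat: without a "www" prefix, index 1 is the extension, not the name.
bare_parts = urlparse("https://example.com/sitemap.xml").netloc.split(".")
print(bare_parts[1])  # com
```

If the sites you test do not all use “www,” you may want to pick the index based on the number of parts instead of hard-coding 1.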

4. Take The Date Of Moment For File Name

To take the date of the specific function call moment, use the “datetime.now” method.

“datetime.now” provides the specific time of the specific moment. Use “strftime” with the “%Y”, “%m”, and “%d” values. “%Y” is for the year. “%m” and “%d” are numeric values for the specific month and day.

    date = datetime.now().strftime("%Y_%m_%d")
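
To see what this format string produces, here is a minimal sketch that applies the same pattern to a fixed date, so the result is predictable. The helper name “date_stamp” is hypothetical and not part of the original script.

```python
from datetime import datetime


def date_stamp(moment: datetime) -> str:
    """Format a datetime as year_month_day for use in a file name."""
    return moment.strftime("%Y_%m_%d")


# A fixed date keeps the example reproducible; the script itself uses datetime.now().
print(date_stamp(datetime(2022, 6, 15)))  # 2022_06_15
```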

5. Take URLs Into A List From A Sitemap

To take the URLs into a list from a sitemap file, use the code block below.

    sitemap = adv.sitemap_to_df(sitemap_url)
    sitemap_urls = sitemap["loc"].to_list()

If you read the Python Sitemap Health Audit, you can learn further information about sitemaps.
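
A sitemap can list the same address more than once, so it may be worth cleaning the list before sending requests. The sketch below is one possible approach, not part of the original script: the helper “unique_urls” and the sample list are hypothetical, and treating “/blog” and “/blog/” as the same page is an assumption that may not hold for every site.

```python
def unique_urls(urls):
    """Drop repeated URLs, treating "/page" and "/page/" as the same address."""
    seen = set()
    result = []
    for url in urls:
        key = url.rstrip("/")
        if key not in seen:
            seen.add(key)
            result.append(url)
    return result


# Hypothetical stand-in for sitemap["loc"].to_list()
sample_urls = [
    "https://www.example.com/",
    "https://www.example.com/blog",
    "https://www.example.com/blog/",
    "https://www.example.com/",
]
print(unique_urls(sample_urls))
# ['https://www.example.com/', 'https://www.example.com/blog']
```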

6. Choose The Metrics That You Want From PSI API

To choose the PSI API response JSON properties, you should see the JSON file itself.

It is highly relevant to the reading, parsing, and flattening of JSON objects.

It is even related to Semantic SEO, thanks to the concept of the “directed graph” and “JSON-LD” structured data.

In this article, we won’t focus on examining the specific PSI API response’s JSON hierarchies.

You can see the metrics that I have chosen to gather from the PSI API. The selection is richer than the basic default output of the PSI API, which only gives the Core Web Vitals metrics, or Speed Index, Interaction to Next Paint, Time to First Byte, and First Contentful Paint.

Of course, it also gives “suggestions” such as “Avoid Chaining Critical Requests,” but there is no need to put a sentence into a data frame.

In the future, these suggestions, or even every individual chain event with its KB and ms values, can be taken into a single column with the name “psi_suggestions.”

For a start, you can check the metrics that I have chosen; many of them will probably be new to you.

The first section of the PSI API metrics is below.

    fid = []
    lcp = []
    cls_ = []
    url = []
    fcp = []
    performance_score = []
    total_tasks = []
    total_tasks_time = []
    long_tasks = []
    dom_size = []
    maximum_dom_depth = []
    maximum_child_element = []
    observed_fcp = []
    observed_fid = []
    observed_lcp = []
    observed_cls = []
    observed_fp = []
    observed_fmp = []
    observed_dom_content_loaded = []
    observed_speed_index = []
    observed_total_blocking_time = []
    observed_first_visual_change = []
    observed_last_visual_change = []
    observed_tti = []
    observed_max_potential_fid = []

This section includes all the observed and simulated fundamental page speed metrics, along with some non-fundamental ones, like “DOM Content Loaded” or “First Meaningful Paint.”

The second section of PSI metrics focuses on possible byte and time savings from the amount of unused code.

    render_blocking_resources_ms_save = []
    unused_javascript_ms_save = []
    unused_javascript_byte_save = []
    unused_css_rules_ms_save = []
    unused_css_rules_bytes_save = []

A third section of the PSI metrics focuses on server response time and on the benefits of using responsive images, or the cost of not using them.

    possible_server_response_time_saving = []
    possible_responsive_image_ms_save = []

Note: The overall performance score comes from “performance_score.”

7. Create A For Loop For Taking The API Response For All URLs

The for loop takes all of the URLs from the sitemap file and uses the PSI API for each of them, one by one. The for loop for PSI API automation has several sections.

The first section of the PSI API for loop begins with duplicate URL prevention.

In sitemaps, you can see a URL that appears multiple times. This section prevents it.

for i in sitemap_urls[:9]:
         # Prevent duplicate trailing slash ("/") URL requests from overriding the information.
         if i.endswith("/"):
               r = requests.get(f"https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url={i}&strategy=mobile&locale=en&key={api_key}")
         else:
               r = requests.get(f"https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url={i}/&strategy=mobile&locale=en&key={api_key}")

Remember to include your “api_key” at the end of the endpoint for the PageSpeed Insights API.

Check the status code. In sitemaps, there might be URLs with non-200 status codes; these should be cleaned.
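
The request above places the page address directly into the query string. As a sketch of an alternative, the helper below builds the same endpoint URL with “urllib.parse.urlencode,” which percent-encodes the page URL so that any “?” or “&” inside it is not mistaken for an API parameter. The function name “build_psi_url” and the placeholder key are hypothetical, and the trailing slash rule simply mirrors the loop above.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_psi_url(page_url, api_key, strategy="mobile", locale="en"):
    """Return a PSI request URL for page_url, adding a trailing slash if missing."""
    if not page_url.endswith("/"):
        page_url += "/"
    query = urlencode({
        "url": page_url,
        "strategy": strategy,
        "locale": locale,
        "key": api_key,
    })
    return f"{PSI_ENDPOINT}?{query}"


# "YOUR_API_KEY" is a placeholder, not a working key.
print(build_psi_url("https://www.example.com/blog", "YOUR_API_KEY"))
```

The returned string can be passed to “requests.get” in place of the f-string. Note that blindly appending “/” is only safe for plain paths; addresses ending in a file name or a query string would need different handling.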

         if r.status_code == 200:
               data_ = json.loads(r.text)

The next section reads the specific metrics from the “data_” dictionary that we have just created and appends them to the lists defined earlier.

               performance_score.append(data_["lighthouseResult"]["categories"]["performance"]["score"] * 100)
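
Two details of this line can cause trouble in a bulk run: the score may be missing (null) when an audit fails, and multiplying a float by 100 can give a value slightly off a whole number. The sketch below shows one way to handle both. The helper name and the trimmed sample response are hypothetical.

```python
def performance_score_percent(data_):
    """Return the Lighthouse performance score on a 0-100 scale, or None."""
    score = data_["lighthouseResult"]["categories"]["performance"]["score"]
    return None if score is None else round(score * 100)


# Trimmed, hypothetical responses containing only the keys used here.
sample = {"lighthouseResult": {"categories": {"performance": {"score": 0.87}}}}
failed = {"lighthouseResult": {"categories": {"performance": {"score": None}}}}

print(performance_score_percent(sample))  # 87
print(performance_score_percent(failed))  # None
```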

The next section focuses on the “total task” count and DOM size.
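
The code for this step is not shown here, so the following is only a sketch. It assumes that the response exposes a “diagnostics” audit with “numTasks,” “totalTaskTime,” and “numTasksOver50ms,” and a “dom-size” audit whose first item holds the total element count. These key names and the sample values are assumptions that you should verify against a real response, since the JSON shape can change between Lighthouse versions.

```python
# Trimmed, hypothetical response; real responses contain many more keys.
data_ = {"lighthouseResult": {"audits": {
    "diagnostics": {"details": {"items": [
        {"numTasks": 1250, "totalTaskTime": 3400.5, "numTasksOver50ms": 12}
    ]}},
    "dom-size": {"details": {"items": [
        {"statistic": "Total DOM Elements", "value": 1432}
    ]}},
}}}

total_tasks, total_tasks_time, long_tasks, dom_size = [], [], [], []

audits = data_["lighthouseResult"]["audits"]
diagnostics = audits["diagnostics"]["details"]["items"][0]

total_tasks.append(diagnostics["numTasks"])
total_tasks_time.append(diagnostics["totalTaskTime"])
long_tasks.append(diagnostics["numTasksOver50ms"])
dom_size.append(audits["dom-size"]["details"]["items"][0]["value"])

print(total_tasks, total_tasks_time, long_tasks, dom_size)
# [1250] [3400.5] [12] [1432]
```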


The next section takes the “DOM Depth” and “Deepest DOM Element.”
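
Again as a sketch under assumptions: the example below supposes that the “dom-size” audit lists its statistics in a fixed order (total elements, maximum depth, maximum child elements) and reads the second and third items by position. The order, the labels, and the sample numbers are not guaranteed, so check them in your own responses.

```python
# Hypothetical items of the "dom-size" audit, in the assumed order.
dom_items = [
    {"statistic": "Total DOM Elements", "value": 1432},
    {"statistic": "Maximum DOM Depth", "value": 21},
    {"statistic": "Maximum Child Elements", "value": 48},
]
data_ = {"lighthouseResult": {"audits": {"dom-size": {"details": {"items": dom_items}}}}}

maximum_dom_depth, maximum_child_element = [], []

items = data_["lighthouseResult"]["audits"]["dom-size"]["details"]["items"]
maximum_dom_depth.append(items[1]["value"])      # second item: deepest nesting level
maximum_child_element.append(items[2]["value"])  # third item: most children under one parent

print(maximum_dom_depth, maximum_child_element)  # [21] [48]
```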


The next section takes the specific observed test results from the PageSpeed Insights API.
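
One way to collect several observed values without repeating the same long lookup is to map each list name to its JSON key and loop over the mapping. The sketch below assumes the observed values live in the first item of the “metrics” audit under keys such as “observedFirstContentfulPaint.” Only a subset of the lists is shown, and both the key names and the sample numbers are assumptions.

```python
# Map of output column name -> assumed key in the "metrics" audit.
OBSERVED_KEYS = {
    "observed_fcp": "observedFirstContentfulPaint",
    "observed_lcp": "observedLargestContentfulPaint",
    "observed_dom_content_loaded": "observedDomContentLoaded",
    "observed_speed_index": "observedSpeedIndex",
    "observed_first_visual_change": "observedFirstVisualChange",
    "observed_last_visual_change": "observedLastVisualChange",
}
columns = {name: [] for name in OBSERVED_KEYS}

# Trimmed, hypothetical response; "observedSpeedIndex" is left out on purpose.
data_ = {"lighthouseResult": {"audits": {"metrics": {"details": {"items": [{
    "observedFirstContentfulPaint": 980,
    "observedLargestContentfulPaint": 2100,
    "observedDomContentLoaded": 1500,
    "observedFirstVisualChange": 950,
    "observedLastVisualChange": 3200,
}]}}}}}

metrics = data_["lighthouseResult"]["audits"]["metrics"]["details"]["items"][0]
for name, key in OBSERVED_KEYS.items():
    # .get() stores None for a missing key, which keeps every list the same length.
    columns[name].append(metrics.get(key))

print(columns["observed_fcp"], columns["observed_speed_index"])  # [980] [None]
```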


The next section takes the amount of unused code, with the wasted bytes and milliseconds, along with the render-blocking resources.
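
As a hedged sketch of this step, the helper below reads the estimated savings from an audit’s “details” and returns None when a page does not report that audit. It assumes the audits are named “render-blocking-resources,” “unused-javascript,” and “unused-css-rules” and expose “overallSavingsMs” and “overallSavingsBytes.” The helper name “overall_saving” and the sample values are hypothetical.

```python
def overall_saving(audits, audit_id, field="overallSavingsMs"):
    """Return the estimated saving for one audit, or None if it is not reported."""
    return audits.get(audit_id, {}).get("details", {}).get(field)


# Trimmed, hypothetical audits section of a PSI response.
audits = {
    "render-blocking-resources": {"details": {"overallSavingsMs": 450}},
    "unused-javascript": {"details": {"overallSavingsMs": 300,
                                      "overallSavingsBytes": 152000}},
    # "unused-css-rules" is left out on purpose to show the fallback.
}

render_blocking_resources_ms_save = [overall_saving(audits, "render-blocking-resources")]
unused_javascript_ms_save = [overall_saving(audits, "unused-javascript")]
unused_javascript_byte_save = [overall_saving(audits, "unused-javascript", "overallSavingsBytes")]
unused_css_rules_ms_save = [overall_saving(audits, "unused-css-rules")]

print(unused_javascript_byte_save, unused_css_rules_ms_save)  # [152000] [None]
```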


The next section provides the responsive image benefits and server response timing.
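
A minimal sketch of this step follows, assuming the two values come from the “server-response-time” and “uses-responsive-images” audits and that each reports “overallSavingsMs.” The audit names, the field, and the sample numbers are assumptions to confirm against a real response.

```python
# Trimmed, hypothetical response containing only the two audits used here.
data_ = {"lighthouseResult": {"audits": {
    "server-response-time": {"details": {"overallSavingsMs": 120.5}},
    "uses-responsive-images": {"details": {"overallSavingsMs": 0}},
}}}

possible_server_response_time_saving = []
possible_responsive_image_ms_save = []

audits = data_["lighthouseResult"]["audits"]
possible_server_response_time_saving.append(
    audits["server-response-time"]["details"]["overallSavingsMs"])
possible_responsive_image_ms_save.append(
    audits["uses-responsive-images"]["details"]["overallSavingsMs"])

print(possible_server_response_time_saving, possible_responsive_image_ms_save)
# [120.5] [0]
```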


The next section makes the function continue to work in case there is an error.
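
The error handling code is not shown here, so this is only one possible approach. The sketch wraps each lookup in a small helper that returns a default value when an expected key is missing, which keeps every list the same length even when one URL fails. The helper name “safe_extract” and the sample responses are hypothetical.

```python
def safe_extract(extractor, data_, default=None):
    """Run extractor(data_) and return default if the expected keys are missing."""
    try:
        return extractor(data_)
    except (KeyError, IndexError, TypeError):
        return default


lcp = []
responses = [
    {"lighthouseResult": {"audits": {"largest-contentful-paint": {"numericValue": 2450.0}}}},
    {"error": {"code": 500}},  # hypothetical failed response without audit data
]

for data_ in responses:
    lcp.append(safe_extract(
        lambda d: d["lighthouseResult"]["audits"]["largest-contentful-paint"]["numericValue"],
        data_,
    ))

print(lcp)  # [2450.0, None]
```

Catching only lookup-related exceptions, rather than every exception, keeps unrelated bugs visible while still letting the loop move on to the next URL.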


Example Usage Of PageSpeed Insights API With Python For Bulk Testing

To use the specific code blocks, put them into a Python function.

Run the script, and you will get 29 page speed-related metrics in the columns below.

Screenshot: PageSpeed Insights API output columns (from author, June 2022)
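
The last two steps, constructing the data frame and writing the XLSX file, can be sketched as follows. The function names and the file name pattern are hypothetical, and writing “.xlsx” files with pandas also requires an Excel engine such as openpyxl to be installed.

```python
def output_filename(domain, date):
    """Build the XLSX file name from the parsed domain and the date stamp."""
    return f"{domain}_pagespeed_{date}.xlsx"


def check_equal_lengths(columns):
    """Raise ValueError if the metric lists do not all have the same length."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")


def save_results(columns, domain, date):
    """Write the collected metric lists to an XLSX file and return its name."""
    import pandas as pd  # imported here so the helpers above work without pandas

    check_equal_lengths(columns)
    filename = output_filename(domain, date)
    pd.DataFrame(columns).to_excel(filename, index=False)
    return filename


print(output_filename("example", "2022_06_15"))  # example_pagespeed_2022_06_15.xlsx
```

A call might look like save_results({"url": url, "lcp": lcp, "performance_score": performance_score}, domain, date), using the lists filled inside the loop. The length check matters because a data frame cannot be built from lists of different sizes, which happens when a failed request appends to some lists but not others.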


The PageSpeed Insights API provides different types of page loading performance metrics.

It demonstrates how Google engineers perceive the concept of page loading performance, and how they possibly use these metrics from a ranking, UX, and quality-understanding point of view.

Using Python for bulk page speed tests gives you a snapshot of the entire website to help analyze possible user experience, crawl efficiency, conversion rate, and ranking improvements.


Featured Image: Dundanim/Shutterstock


