
Still Getting 429 Errors After 4.9.1 Update #573

Closed
searchsolved opened this issue Apr 10, 2023 · 21 comments

Comments

@searchsolved

I'm still getting consistent 429 errors after the latest PyTrends update.

Should I be doing something different? I've read the docs and they look the same as before the PyTrends update.

Thanks.

@uzairamer

uzairamer commented Apr 10, 2023

Basically, you need to get the NID cookie. The following solution worked for me:

import requests

session = requests.Session()
session.get('https://trends.google.com')
cookies_map = session.cookies.get_dict()
nid_cookie = cookies_map['NID']

then plug the NID cookie into the TrendReq object:

from pytrends.request import TrendReq

TrendReq(hl='en-US', tz=360, retries=3, requests_args={'headers': {'Cookie': f'NID={nid_cookie}'}})
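For reference, here is the whole flow in one self-contained sketch (the keyword list and timeframe are just example values):

import requests
from pytrends.request import TrendReq

# 1. Fetch a fresh NID cookie from Google Trends.
session = requests.Session()
session.get('https://trends.google.com')
nid_cookie = session.cookies.get_dict()['NID']

# 2. Send that cookie with every pytrends request via requests_args.
pytrends = TrendReq(hl='en-US', tz=360, retries=3,
                    requests_args={'headers': {'Cookie': f'NID={nid_cookie}'}})
pytrends.build_payload(['python'], timeframe='today 3-m', geo='US')
print(pytrends.interest_over_time().head())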

@DumbFace

The solution posted by @uzairamer above worked for me too. Thanks a lot!

@searchsolved
Author

Thanks that worked for me too!

@fackse

fackse commented Apr 10, 2023

The solution provided by @uzairamer works for me too. Thanks a lot @uzairamer !

@emlazzarin
Collaborator

emlazzarin commented Apr 10, 2023 via email

@Terseus
Collaborator

Terseus commented Apr 11, 2023

Regarding the NID cookie workaround posted by @uzairamer above: I don't understand why this solution works at all; it's basically what the GetGoogleCookie method is already doing, or at least trying to do.

If someone who can work around the problem using this solution could post more information, we could fix it in the library. We need to know at least:

  • what problem you are having: how many successful downloads do you get before hitting 429 errors?
  • how you are using the TrendReq objects: do you create one object for every request, or one for a whole batch?
  • how you are applying this workaround: manually before every request? Once every 100 requests?

@whatalnk

Replying to @Terseus's point that GetGoogleCookie should already be doing this:

It seems that GetGoogleCookie is not working as intended.

I accessed https://trends.google.com/trends/explore?geo=JP, which is used by GetGoogleCookie, from Google Chrome (incognito window):

  • First time: it returned an error (404 or 429)
  • Second time (after reload): it returned a normal response

https://trends.google.com/trends?geo=JP is OK, and it redirects to https://trends.google.com/home?geo=JP.

@Terseus
Collaborator

Terseus commented Apr 11, 2023

Hi @whatalnk

That request is expected to fail with a 429; pytrends doesn't care whether it succeeds, it's only meant to generate a valid NID cookie.

Even when the request fails, the backend returns a NID cookie that we can use for the next request(s).
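To illustrate the point (a minimal sketch, not the exact pytrends code): even an error response carries Set-Cookie headers, so the cookie can be read regardless of the status code.

import requests

# A cookie-less client often gets a 429 from the explore endpoint,
# but the response typically still sets the NID cookie.
resp = requests.get('https://trends.google.com/trends/explore/?geo=US')
print(resp.status_code)         # may well be 429
print(resp.cookies.get('NID'))  # the cookie is usually present anyway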

Thanks for the response, though.

@fackse

fackse commented Apr 11, 2023

Currently, I want to retrieve the trends for about 5536 terms. Before the NID cookie hint came up here, I tried it like this:

[...]
pytrends = TrendReq(hl='en-US', tz=360, retries=3)
# Set the number of terms per request (maximum 5)
terms_per_request = 5

dfs = []
# Loop through the chunks of unique names
for name_chunk in tqdm(list(chunk(name_time_ranges_tuples, terms_per_request)), desc="Fetching trends data"):
    names, time_ranges = zip(*name_chunk)
    try:
        # Build the payload with the current name chunk
        pytrends.build_payload(names, timeframe="2023-03-06 2023-04-03", geo='US')
        result = pytrends.interest_over_time()
        result = result.drop('isPartial', axis=1)
        dfs.append(result)
    except Exception as e:
        print(f"Error for {names}: {e}")
        
    time.sleep(60)
[...]

With the 5536 terms (chunked) I got the 429 error 230 times. I didn't follow the progress live to the end, but at the beginning the error came on roughly every second call; apparently it occurred less frequently later.

On another machine I used the following code. In this case the abort condition was never reached:

import requests
import pandas as pd
from rich.progress import Progress  # progress bar (assuming the 'rich' library)
from pytrends.request import TrendReq

def process_chunk(chunk, index, retries=3):
    for attempt in range(retries):
        try:
            pytrends.build_payload(chunk, timeframe="2023-03-06 2023-04-03", geo='US')
            result = pytrends.interest_over_time()
            result = result.drop('isPartial', axis=1)
            return result
        except Exception as e:
            if attempt < retries - 1:
                print(f"Error processing chunk {index} (attempt {attempt + 1}): {e}. Retrying...")
            else:
                print(f"Error processing chunk {index} (final attempt {attempt + 1}): {e}. Giving up.")
                return None

session = requests.Session()
session.get('https://trends.google.com')
cookies_map = session.cookies.get_dict()
nid_cookie = cookies_map['NID']
proxy = 'http://<user>:<pw>@isp2.hydraproxy.com:9989'

trends_df = pd.DataFrame()
dfs = []

pytrends = TrendReq(hl='en-US', tz=360, retries=3, proxies=[proxy] * 100000, requests_args={'headers': {'Cookie': f'NID={nid_cookie}'}})

with Progress() as progress:
    task = progress.add_task("[cyan]Processing chunks...", total=len(chunks))

    for index, chunk in enumerate(chunks):
        result = process_chunk(chunk, index)
        if result is not None:
            dfs.append(result)
        progress.update(task, advance=1)

trends_df = pd.concat(dfs)

@Terseus
Collaborator

Terseus commented Apr 11, 2023

Hi @fackse,

I see you're using proxies in the second solution, but not in the first.

The TrendReq class behaves differently depending on whether proxies are used:

  • when using proxies, the instance retrieves a new NID cookie for every request;
  • when not using them, the instance reuses the same NID cookie over and over again.
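For illustration, the two configurations look like this (a sketch; the proxy URLs are placeholders):

from pytrends.request import TrendReq

# Without proxies: one NID cookie is obtained at construction and reused.
pytrends_fixed = TrendReq(hl='en-US', tz=360, retries=3)

# With proxies: the instance rotates through the list and retrieves
# a fresh NID cookie for every request.
proxies = ['https://user:pw@proxy1.example.com:8080',
           'https://user:pw@proxy2.example.com:8080']
pytrends_rotating = TrendReq(hl='en-US', tz=360, retries=3, proxies=proxies)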

Please can you try executing the first solution using proxies the same way you do in the second, and report whether there's any difference in the error rate?

Thanks a lot.

@whatalnk

Thank you for your quick reply. That comment was a misunderstanding on my part.

Well, I found something funny.

I downloaded the source code (pytrends==4.9.1) from PyPI and checked the contents.
It seems the change from commit f6b2d0c is not included.

In request.py, GetGoogleCookie still uses f'{BASE_TRENDS_URL}/?geo={self.hl[-2:]}', not f'{BASE_TRENDS_URL}/explore/?geo={self.hl[-2:]}'.
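A quick way to check whether an installed copy contains the fix is to look at the method's source (a sketch using the standard library; it simply tests for the corrected URL):

import inspect
from pytrends.request import TrendReq

# True if GetGoogleCookie requests the /explore/ endpoint,
# i.e. the fix from commit f6b2d0c is present.
print('explore' in inspect.getsource(TrendReq.GetGoogleCookie))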

@fackse

fackse commented Apr 11, 2023

@Terseus: I would like to do that, but now another error occurs; no idea if it is related to this. In any case, I can't use pytrends right now.
Tested on two different machines. I am working in a Jupyter notebook and have restarted the kernel several times, as well as Jupyter itself.

I also took a look at the results from the last run, did some sampling, and compared them to the Google Trends page. It showed that pytrends often reports NaNs where values are actually present. I don't want to post the data here publicly, but I can send you an excerpt if you want.

Edit:
The reported error was due to user error 😅 Thank you @whatalnk

@whatalnk

@fackse

pytrends.build_payload("Meech", timeframe="2023-03-06 2023-04-03", geo='US')

kw_list is interpreted as M, e, e, c, h, not Meech.
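In plain Python terms, a string is itself an iterable of characters, which is effectively what build_payload ends up iterating over:

print(list("Meech"))  # ['M', 'e', 'e', 'c', 'h']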

How about

pytrends.build_payload(["Meech"], timeframe="2023-03-06 2023-04-03", geo='US')

@fackse

fackse commented Apr 11, 2023

Re @whatalnk's suggestion above to pass ["Meech"] as a list instead of the bare string:

You're absolutely right, my bad! 🤦‍♂️

@Terseus
Collaborator

Terseus commented Apr 11, 2023

Hi @whatalnk,


Thanks a lot for checking it and raising it here.

I've checked the package on pypi.org (both wheel and sdist) and you're right: version 4.9.1 doesn't contain the fix from PR #570.

Version 4.9.1 was generated from commit ed8c400dd9e0b52d878187802ad01c4f7e1b9a71, whose original branch doesn't contain the code from #570.

@emlazzarin, can you please make a 4.9.2 release from the current master branch?

In the meantime, @fackse, please install pytrends from the current master branch and retry your code. To do so you can:

  1. clone the repo on your machine, e.g. git clone https://github.com/GeneralMills/pytrends /home/fackse/pytrends.
  2. install the code directly from the repo in editable mode with pip install -e /home/fackse/pytrends.

Thank you.

@fackse

fackse commented Apr 11, 2023

It appears to be working! I installed it using "pip install git+https://github.com/GeneralMills/pytrends". I tested it with 1108 requests, each consisting of 5 keywords. To speed up the process, I used a ThreadPoolExecutor and a proxy to parallelize the requests. The call was made with the following code:

pytrends = TrendReq(hl='en-US', tz=360, retries=3, proxies=[proxy]*100000)
pytrends.build_payload(chunk, timeframe="2023-03-06 2023-04-03", geo='US')

During the test, I encountered 196 instances of the message "Proxy error. Changing IP". To better control the behavior after the third attempt, I set the retries parameter to 3. Only 15 times did a request fail to go through within three attempts, in which case I had to re-initialize TrendReq (again with 3 retries, as seen in the code above).
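For reference, a minimal sketch of the parallelized setup described here, reusing the process_chunk helper from the earlier comment (the worker count and the chunks list are assumptions):

from concurrent.futures import ThreadPoolExecutor

# Fetch all chunks in parallel; each call returns a DataFrame or None.
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(process_chunk, chunk, i)
               for i, chunk in enumerate(chunks)]
    results = [f.result() for f in futures]

dfs = [df for df in results if df is not None]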

@datacubed

@emlazzarin @Terseus sorry to pester, but any idea when we'll get this much-needed release?

@emlazzarin
Collaborator

emlazzarin commented Apr 13, 2023 via email

@emlazzarin
Collaborator

https://pypi.org/project/pytrends/4.9.2/ has now been updated with the correct code. Thanks!

@JUSTINDSBAUTISTA

Hello,

I don't know if you will respond or not, but it's worth a try.
I am an intern and have been given the task of scraping Google Trends:

  • I have to get the related queries and the graph,
  • and write them to .csv.

I did research for almost a week, but I haven't gotten any guidance from my workplace.
I did find the code below on the internet; it fetches the related queries and related topics (but not the graph) and saves them as a .csv file. However, you have to copy and paste each URL from the browser's Network tab.

I would be happy to learn Python.
Thank you for your help.

@JUSTINDSBAUTISTA

import httpx
import json
import pandas as pd

# Set the geographical location to Canada
geo_location = "CA"

# Add the API URLs
topics_url = f"https://trends.google.com/trends/api/widgetdata/relatedsearches?hl=en-US&tz=240&req=%7B%22restriction%22:%7B%22geo%22:%7B%22country%22:%22CA%22%7D,%22time%22:%222024-09-12T18%5C%5C:05%5C%5C:57+2024-09-13T18%5C%5C:05%5C%5C:57%22,%22originalTimeRangeForExploreUrl%22:%22now+1-d%22,%22complexKeywordsRestriction%22:%7B%22keyword%22:%5B%7B%22type%22:%22BROAD%22,%22value%22:%22programming%22%7D%5D%7D%7D,%22keywordType%22:%22ENTITY%22,%22metric%22:%5B%22TOP%22,%22RISING%22%5D,%22trendinessSettings%22:%7B%22compareTime%22:%222024-09-11T18%5C%5C:05%5C%5C:57+2024-09-12T18%5C%5C:05%5C%5C:57%22%7D,%22requestOptions%22:%7B%22property%22:%22%22,%22backend%22:%22CM%22,%22category%22:0%7D,%22language%22:%22en%22,%22userCountryCode%22:%22CA%22,%22userConfig%22:%7B%22userType%22:%22USER_TYPE_LEGIT_USER%22%7D%7D&token=APP6_UEAAAAAZuXQhcOGUPuW6AsOipJBkyUfkQjfuYgk"

queries_url = f"https://trends.google.com/trends/api/widgetdata/relatedsearches?hl=en-US&tz=240&req=%7B%22restriction%22:%7B%22geo%22:%7B%22country%22:%22CA%22%7D,%22time%22:%222024-09-12T18%5C%5C:05%5C%5C:57+2024-09-13T18%5C%5C:05%5C%5C:57%22,%22originalTimeRangeForExploreUrl%22:%22now+1-d%22,%22complexKeywordsRestriction%22:%7B%22keyword%22:%5B%7B%22type%22:%22BROAD%22,%22value%22:%22programming%22%7D%5D%7D%7D,%22keywordType%22:%22QUERY%22,%22metric%22:%5B%22TOP%22,%22RISING%22%5D,%22trendinessSettings%22:%7B%22compareTime%22:%222024-09-11T18%5C%5C:05%5C%5C:57+2024-09-12T18%5C%5C:05%5C%5C:57%22%7D,%22requestOptions%22:%7B%22property%22:%22%22,%22backend%22:%22CM%22,%22category%22:0%7D,%22language%22:%22en%22,%22userCountryCode%22:%22CA%22,%22userConfig%22:%7B%22userType%22:%22USER_TYPE_LEGIT_USER%22%7D%7D&token=APP6_UEAAAAAZuXQhdFr9kwhcYtatVcsQ2f0ELPYNdUo"

# Get the data from the API URLs
topics_response = httpx.get(url=topics_url)
queries_response = httpx.get(url=queries_url)

# Strip the )]}', anti-JSON-hijacking prefix and parse the responses as JSON
topics_data = json.loads(topics_response.text.replace(")]}',", ""))
queries_data = json.loads(queries_response.text.replace(")]}',", ""))

result = []

# Parse the topics data and append each entry to the result list
for topic in topics_data["default"]["rankedList"][1]["rankedKeyword"]:
    topic_object = {
        "Title": topic["topic"]["title"],
        "Search Volume": topic["value"],
        "Link": "https://trends.google.com/" + topic["link"],
        "Geo Location": geo_location,
        "Type": "search_topic",
    }
    result.append(topic_object)

# Parse the queries data and append each entry to the result list
for query in queries_data["default"]["rankedList"][1]["rankedKeyword"]:
    query_object = {
        "Title": query["query"],
        "Search Volume": query["value"],
        "Link": "https://trends.google.com/" + query["link"],
        "Geo Location": geo_location,
        "Type": "search_query",
    }
    result.append(query_object)

print(result)

# Create a Pandas dataframe and save the data into CSV
df = pd.DataFrame(result)
df.to_csv("keywords.csv", index=False)
