Table of Contents:
- What is an Attribute Error in Python?
- What is the 'NoneType' object has no attribute 'text' Error in Web Scraping?
- Why is the 'NoneType' object has no attribute 'text' Error Difficult to Fix?
- Why does the 'NoneType' object has no attribute 'text' error occur during web scraping?
- How to Fix the 'NoneType' object has no attribute 'text' Error?
- Wrap Up
When you are web scraping with Python, especially with Beautiful Soup, one of the most common errors you will encounter is 'NoneType' object has no attribute 'text'.
This error is especially common among newcomers to web scraping.
As usual, to solve it, most web scraping developers simply Google 'NoneType' object has no attribute 'text' and open a bunch of Stack Overflow results, only to hit a dead end.
And that's the biggest problem with this error.
What is an Attribute Error in Python?
An attribute error in Python is an exception (AttributeError) raised when you try to access an attribute or method that the object does not have.
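As a minimal sketch of that definition, the snippet below triggers the exception twice: once on an ordinary object and once on None. The Product class and its title attribute are hypothetical names made up for illustration.

```python
class Product:
    """Hypothetical class used only to illustrate AttributeError."""

    def __init__(self, title):
        self.title = title


product = Product("Desk lamp")
print(product.title)  # this attribute exists, so the lookup works

try:
    product.price  # Product never defines a price attribute
except AttributeError as exc:
    product_message = str(exc)
    print(product_message)

nothing = None
try:
    nothing.text  # None has no text attribute either
except AttributeError as exc:
    none_message = str(exc)
    print(none_message)
```

The second message is the one this post is about: the object on the left of the dot turned out to be None.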
What is the 'NoneType' object has no attribute 'text' Error in Web Scraping?
It is an exception raised when we try to get the text of an object, in this case an HTML tag, that does not exist on the page or is not where the selector expects it. When Beautiful Soup cannot find the requested element, it returns None instead of a tag, and None has no text attribute.
Hence, Python raises the exception 'NoneType' object has no attribute 'text'.
Honestly, that is the first clue on how to fix this!
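Here is a small sketch of how the error appears with Beautiful Soup. The HTML snippet, the tag names, and the class names are assumptions invented for this example, and it uses Python's built-in "html.parser".

```python
from bs4 import BeautifulSoup

# Made-up product markup: it has a title but no rating element.
html = "<div class='product'><h2 class='title'>Desk lamp</h2></div>"
soup = BeautifulSoup(html, "html.parser")

title_tag = soup.find("h2", class_="title")      # a Tag object
rating_tag = soup.find("span", class_="rating")  # nothing matches, so None

print(title_tag.text)
print(rating_tag)

try:
    rating_tag.text  # same as writing None.text
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

The error comes from the line that reads .text, but the real problem is one step earlier: the find() call that returned None.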
Why is the 'NoneType' object has no attribute 'text' Error Difficult to Fix?
Most developers get stuck with this error when web scraping because the message is not descriptive enough: it does not tell you which element is missing or on which page.
How do you google something you can't even describe to yourself?
Why does the 'NoneType' object has no attribute 'text' error occur during web scraping?
When you get Unstructured HTML Data During Web Scraping
Naturally, this happens because of the unstructured data we get from a website when we scrape it.
Data from web scraping do not come in a neat and structured format like an API.
For instance, if you pull some data from an API and the value of an attribute or key of the data is missing, you usually still get that key with a null or empty value.
It tells you straight up that there is no value for that key or attribute, and you can still work with the rest of the data as you pull it.
In web scraping, on the other hand, you cannot pull the text (the value) from an element (the key) that doesn't exist.
So the lookup just comes back as None, and when you then try to access .text on it, you get this attribute error.
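The contrast can be sketched side by side. Both the JSON payload and the HTML below are invented for illustration; a real API or page will look different.

```python
import json

from bs4 import BeautifulSoup

# API-style data: the "rating" key is still present, its value is just null.
api_response = '{"title": "Desk lamp", "rating": null}'
product = json.loads(api_response)
key_present = "rating" in product
api_rating = product["rating"]

# Scraped data: the rating element is simply absent from the markup.
html = "<div class='product'><h2 class='title'>Desk lamp</h2></div>"
soup = BeautifulSoup(html, "html.parser")
scraped_rating_tag = soup.find("span", class_="rating")

print(key_present, api_rating)  # the key exists and holds None
print(scraped_rating_tag)       # there is no tag at all to read text from
```

In the API case you get an empty value you can store as is. In the scraping case there is no element, so any attribute access on the result fails.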
Looping Through Multiple Pages with Similar Data
A common scenario where this happens is when you are looping through several pages. Let's take an eCommerce product page, for instance.
Say you want to get the title, price, description, and rating of each product. You manually scan through a couple of pages and find all these data present.
You may assume that they all have the data.
Meanwhile, there is a possibility that one down the line does not have the rating data.
You run your scraper and print out all your results. As you go along, all of a sudden, you get this error.
That is the second clue: it helps you find the specific object throwing the exception.
Whichever product or page comes right after the last one that was processed correctly is the one missing the piece of information you are trying to get.
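One way to pin down the offending page is to check for None inside the loop and record the URL instead of letting the scraper crash. This is only a sketch: the example.com URLs and the inline HTML stand in for pages you would normally download, and the class names are assumptions.

```python
from bs4 import BeautifulSoup

# Stand-ins for downloaded product pages; the second one has no rating.
pages = {
    "https://example.com/products/1": "<h2 class='title'>Desk lamp</h2><span class='rating'>4.5</span>",
    "https://example.com/products/2": "<h2 class='title'>Floor lamp</h2>",
    "https://example.com/products/3": "<h2 class='title'>Wall lamp</h2><span class='rating'>3.9</span>",
}

missing_rating = []
for url, html in pages.items():
    soup = BeautifulSoup(html, "html.parser")
    rating_tag = soup.find("span", class_="rating")
    if rating_tag is None:
        # Remember which page lacks the element and keep going.
        missing_rating.append(url)
        print(f"No rating element on {url}")
        continue
    print(url, rating_tag.text)

print(missing_rating)
```

With the URL in hand, you can open that page in the browser and see what is different about it.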
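Also keep in mind that when we use requests to access a website, we are not using our browser. We are sending a request directly to the web server, so the HTML we get back may differ from what the browser shows, for example when parts of the page are rendered by JavaScript.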
Hopefully, you now understand what that error means. Now, how can you fix it?
How to Fix the 'NoneType' object has no attribute 'text' Error?
Double-check the web pages that you will be scraping. Make sure that every element you expect actually exists on each page.
Verify that the selector or selectors you are using are correct.
Try a different selector or a different selection approach, or move back up the DOM tree, print out the parent element, and confirm that it contains what you expect.
Check if the website you are trying to scrape is accessible through requests and web scraping.
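On top of the checks above, you can make the scraper tolerate missing elements. The sketch below is one possible approach, not the only one: text_or_default is a hypothetical helper name, and the selectors and HTML are assumptions made up for the example.

```python
from bs4 import BeautifulSoup


def text_or_default(soup, selector, default=None):
    """Return the stripped text of the first match, or `default` if nothing matches."""
    tag = soup.select_one(selector)
    if tag is None:
        return default
    return tag.get_text(strip=True)


# Made-up product markup with a title but no rating.
html = "<div class='product'><h2 class='title'> Desk lamp </h2></div>"
soup = BeautifulSoup(html, "html.parser")

product = {
    "title": text_or_default(soup, "h2.title"),
    "rating": text_or_default(soup, "span.rating", default="N/A"),
}
print(product)
```

The None check happens in one place, so a product without a rating yields a placeholder value instead of stopping the whole run.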
There you go. If you find this post helpful, do not hesitate to share.