Thanks a lot, great video! I don't understand why your videos are so underrated. By the way, is there a way to get the Request URL from the website's source code?
you can get the request URL from the response object, e.g.
# make HTTP request (the URL needs a scheme, otherwise requests raises MissingSchema)
import requests
response = requests.get('https://google.com')
# print the final URL of the response
print(response.url)
# output (after redirect)
https://www.google.com/
My website makes its requests via POST 😭
then just use requests.post(url, data=payload), e.g.
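(a minimal sketch - the endpoint URL and form fields below are made-up placeholders, copy the real ones from the request you see in your browser's Network tab)
# mimic the site's AJAX POST request
import requests
# hypothetical endpoint and payload
url = 'https://example.com/api/items'
payload = {'page': 1, 'sort': 'price'}
# use json=payload instead of data=payload if the site sends a JSON body
response = requests.post(url, data=payload)
print(response.status_code)
print(response.text)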
ChatGPT would be super helpful
Thank you very much. Very awesome.
When I tried the code, I encountered an error like this:
Traceback (most recent call last):
File "demo.py", line 31, in
scraper.start_me()
File "demo.py", line 27, in start_me
self.to_csv()
File "demo.py", line 22, in to_csv
writer.writerow(row)
File "C:\Users\Future\AppData\Local\Programs\Python\Python36\lib\csv.py", line 155, in writerow
return self.writer.writerow(self._dict_to_list(rowdict))
File "C:\Users\Future\AppData\Local\Programs\Python\Python36\lib\encodings\cp1256.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2161' in position 247: character maps to <undefined>
But I still got table.csv despite the error.
This is a Windows-specific Unicode character encoding issue. To get rid of the error, add this keyword argument on the line where you open the file stream:
with open('your.csv', 'w', encoding='utf-8')
@monkey_see_monkey_do Thank you very much. Now it works. But when I open the CSV file I find a lot of empty rows in the output (rows 2, 4, 6, 8, 10, and so on).
By the way, I found a way to use Pandas to export to CSV, and I think it's easier than using the csv module.
@KhalilYasser your issue is OS-specific. I don't have Windows and can't reproduce it on my side. People use the csv module the same way everywhere; the skipped rows are a matter of how your platform handles text encoding and newlines, not a flaw in the csv package. The fact that pandas handles it implicitly doesn't make the standard csv module bad.
@monkey_see_monkey_do Thank you very much for the fast reply. Is there any workaround, on the Unicode point, to get rid of the empty rows?
@KhalilYasser You don't need to get rid of them, you need to handle them properly instead. Just google "python csv unicode error windows" - there are lots of Windows-specific solutions. I've had similar questions from subscribers before, and all of them managed to find a way forward quite easily. They didn't mention the exact solutions though, so I can't help in particular.
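For reference, the blank rows on Windows typically come from newline translation, and the csv module docs say to open the file with newline=''. A minimal sketch (the field names and data are placeholders, not from the video):
# newline='' stops the csv writer from emitting blank rows on Windows,
# encoding='utf-8' avoids the charmap UnicodeEncodeError
import csv
rows = [{'name': 'Sample \u2161', 'price': '10'}]  # placeholder data
with open('table.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=['name', 'price'])
    writer.writeheader()
    for row in rows:
        writer.writerow(row)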
Anyway, I think you should focus on the technique of fetching data by faking the AJAX request and apply it to the domain of your interest. 90% of the data I've been scraping didn't have any Unicode characters (you usually encounter them when, say, a special character for square meters/feet or a currency symbol occurs). Unicode characters are often the same every time, so you can track and replace them, e.g.
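(a rough sketch of the track-and-replace idea - the characters in the mapping are just examples, like the '\u2161' from the traceback above)
# map known troublesome unicode characters to ASCII stand-ins before writing
replacements = {'\u2161': 'II', '\u00b2': '2'}  # Roman numeral two, superscript two
def clean(text):
    # apply every replacement in turn
    for char, repl in replacements.items():
        text = text.replace(char, repl)
    return text
print(clean('Area: 50 m\u00b2'))  # output: Area: 50 m2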