I'm trying to scrape data from the Tesco website to get product names and prices. My code is below. Some products have no price because they are sold out, and Python raises an error because there is nothing to scrape for them. I'd like the script to skip that tile and move on to the next one when the price isn't available.

Does anyone know how I can do this?

from bs4 import BeautifulSoup
import requests

#URL to be scraped
url_to_scrape = 'https://www.tesco.com/groceries/en-GB/shop/fresh-food/all?page=1&count=48'
#Load html's plain data into a variable
plain_html_text = requests.get(url_to_scrape)
#parse the data
soup = BeautifulSoup(plain_html_text.text, "lxml")

#Get the name of the class
for name_of in soup.find_all('div',class_='product-tile-wrapper'):
    name =name_of.h3.a.text
    print(name)
    price = name_of.find('div', class_='price-details--wrapper')
    pricen =price.find('span', class_='value').text
    print(pricen)
Ambilli Radhakrishnan 11 April 2020, 19:48

1 Answer

Best Answer

Use a try-except block:

from bs4 import BeautifulSoup
import requests

#URL to be scraped
url_to_scrape = 'https://www.tesco.com/groceries/en-GB/shop/fresh-food/all?page=1&count=48'
#Load html's plain data into a variable
plain_html_text = requests.get(url_to_scrape)
#parse the data
soup = BeautifulSoup(plain_html_text.text, "lxml")

#Get the name of the class
for name_of in soup.find_all('div',class_='product-tile-wrapper'):
    try:
        name =name_of.h3.a.text
        print(name)
        price = name_of.find('div', class_='price-details--wrapper')
        pricen =price.find('span', class_='value').text
        print(pricen)
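    except AttributeError:
        # find() returned None for a missing element (e.g. a sold-out product), so skip this tile
        pass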
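
To see why the try-except helps, here is a minimal sketch of the failure on its own. It assumes a hand-written tile with no price block (hypothetical markup standing in for a sold-out product, not copied from the Tesco page) and uses the built-in "html.parser" so it runs without lxml. `find()` returns `None` when nothing matches, and calling `.find()` on `None` is what raises the error:

```python
from bs4 import BeautifulSoup

# Hypothetical tile with no price block, standing in for a sold-out product.
html = '<div class="product-tile-wrapper"><h3><a>Strawberries</a></h3></div>'
tile = BeautifulSoup(html, "html.parser")

# find() returns None when no matching element exists.
price = tile.find("div", class_="price-details--wrapper")
print(price)

try:
    # Calling a method on None raises AttributeError.
    price.find("span", class_="value")
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

Catching `AttributeError` specifically, rather than every exception, keeps unrelated problems such as network errors or typos visible.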

You can also make it more informative by printing a message when the price is missing:

from bs4 import BeautifulSoup
import requests

#URL to be scraped
url_to_scrape = 'https://www.tesco.com/groceries/en-GB/shop/fresh-food/all?page=1&count=48'
#Load html's plain data into a variable
plain_html_text = requests.get(url_to_scrape)
#parse the data
soup = BeautifulSoup(plain_html_text.text, "lxml")

#Get the name of the class
for name_of in soup.find_all('div',class_='product-tile-wrapper'):
    name =name_of.h3.a.text
    print(name)
    try:
        price = name_of.find('div', class_='price-details--wrapper')
        pricen =price.find('span', class_='value').text
        print(pricen)
    except AttributeError:
        # No price element was found for this tile
        print('Sold Out')
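
As an alternative to try-except, you could check for `None` explicitly. The sketch below is only an illustration: `extract_products` and `SAMPLE_HTML` are hypothetical names, the sample markup merely mimics the class names used above (the live Tesco page may differ or may be rendered by JavaScript), and it parses with the built-in "html.parser" instead of lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup reusing the class names from the code above.
SAMPLE_HTML = """
<div class="product-tile-wrapper">
  <h3><a href="#">Bananas Loose</a></h3>
  <div class="price-details--wrapper"><span class="value">0.76</span></div>
</div>
<div class="product-tile-wrapper">
  <h3><a href="#">Strawberries 400g</a></h3>
</div>
"""


def extract_products(html):
    """Return (name, price) tuples; price is None when no price is found."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for tile in soup.find_all("div", class_="product-tile-wrapper"):
        heading = tile.find("h3")
        link = heading.find("a") if heading is not None else None
        if link is None:
            # No product name in this tile, so skip it entirely.
            continue
        name = link.get_text(strip=True)

        # Each lookup may return None, so check before going deeper.
        wrapper = tile.find("div", class_="price-details--wrapper")
        value = wrapper.find("span", class_="value") if wrapper is not None else None
        price = value.get_text(strip=True) if value is not None else None
        products.append((name, price))
    return products


for name, price in extract_products(SAMPLE_HTML):
    print(name, price if price is not None else "Sold Out")
```

Returning `None` for a missing price leaves it to the caller to decide whether to skip the product or label it as sold out.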
Joshua 11 April 2020, 16:50