Here is my code so far. Basically, I want to scrape the top 10 artists and build a DataFrame with three columns: the artist's name, the link, and the HTML for each artist. How can I move forward from here? Also, this part doesn't work: all it prints is 'SoundCloud'. Any suggestions are welcome.

from urllib.request import urlopen
import pandas as pd 
import requests
import bs4
import re
url='https://soundcloud.com/popular/searches'
source=urlopen(url).read().decode('utf-8')
bsource=bs4.BeautifulSoup(source)
tag=bsource.find('a', href=True)
print(tag.string)

JoshBob 29 May 2021, 02:41

1 answer

Best answer

Try:

import requests
import pandas as pd
from bs4 import BeautifulSoup


url = "https://soundcloud.com/popular/searches"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

data = []
for a in soup.select("ol a")[:10]:
    data.append(
        {
            "name": a.get_text(strip=True),
            "link": "https://soundcloud.com" + a["href"],
            "html": str(a),
        }
    )

df = pd.DataFrame(data)
print(df)

Prints:

           name                                            link                                               html
0  nba youngboy  https://soundcloud.com/search?q=nba%20youngboy  <a href="/search?q=nba%20youngboy">nba youngbo...
1    juice wrld    https://soundcloud.com/search?q=juice%20wrld    <a href="/search?q=juice%20wrld">juice wrld</a>
2        polo g        https://soundcloud.com/search?q=polo%20g            <a href="/search?q=polo%20g">polo g</a>
3      lil baby      https://soundcloud.com/search?q=lil%20baby        <a href="/search?q=lil%20baby">lil baby</a>
4      lil durk      https://soundcloud.com/search?q=lil%20durk        <a href="/search?q=lil%20durk">lil durk</a>
5  xxxtentacion    https://soundcloud.com/search?q=xxxtentacion  <a href="/search?q=xxxtentacion">xxxtentacion</a>
6        j cole        https://soundcloud.com/search?q=j%20cole            <a href="/search?q=j%20cole">j cole</a>
7      rod wave      https://soundcloud.com/search?q=rod%20wave        <a href="/search?q=rod%20wave">rod wave</a>
8  moneybagg yo  https://soundcloud.com/search?q=moneybagg%20yo  <a href="/search?q=moneybagg%20yo">moneybagg y...
9      king von      https://soundcloud.com/search?q=king%20von        <a href="/search?q=king%20von">king von</a>
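
As for why the original snippet only printed 'SoundCloud': `find()` returns just the first matching tag, which is presumably the site's header link rather than an artist, whereas `select()` returns every match. Below is a minimal offline sketch of that difference. The `SAMPLE_HTML` markup and the `extract_links` helper are hypothetical stand-ins made up for illustration; the real page may be structured differently.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for the real page (an assumption).
SAMPLE_HTML = """
<html><body>
<a href="/">SoundCloud</a>
<ol>
  <li><a href="/search?q=juice%20wrld">juice wrld</a></li>
  <li><a href="/search?q=polo%20g">polo g</a></li>
</ol>
</body></html>
"""


def extract_links(html, base="https://soundcloud.com", limit=10):
    """Hypothetical helper: collect name, link and raw HTML of links inside <ol>."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # select() returns all matches; the slice keeps at most `limit` of them.
    for a in soup.select("ol a[href]")[:limit]:
        rows.append(
            {
                "name": a.get_text(strip=True),
                "link": base + a["href"],
                "html": str(a),
            }
        )
    return pd.DataFrame(rows, columns=["name", "link", "html"])


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() stops at the first <a href=...>, which here is the header link.
first = soup.find("a", href=True)
print(first.string)

# The list items are only reached by selecting inside the <ol>.
df = extract_links(SAMPLE_HTML)
print(df)
```

If the live page changes its layout, the `ol a` selector is the part most likely to need adjusting.
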
Andrej Kesely 28 May 2021, 23:45