I'm trying to build a simple web scraping tool.
Right now I'm having an issue extracting data from each row because the opening <tr> tag is missing.
(Only the opening <tr> tag is missing; the closing </tr> tag is still there.)
Below is my code:
from bs4 import BeautifulSoup
import requests
url = "https://companiesmarketcap.com/dow-jones/largest-companies-by-market-cap/"
data = requests.get(url).text
print(data)
The output is missing the opening <tr> tag; each row only has the closing </tr>:
<tbody>
((THERE'S SUPPOSED TO BE A <tr> TAG HERE!))
<td class="fav"><img alt="favorite icon" src="/img/fav.svg?v2" data-id="2"></td>
</td><td class="rank-td td-right" data-sort="1">1
</td><td class="name-td">
<div class="logo-container"><img loading="lazy" class="company-logo" alt="Apple logo" src="/img/company-logos/64/AAPL.png" data-img-path="/img/company-logos/64/AAPL.png" data-img-dark-path="/img/company-logos/64/AAPL.D.png"></div>
<div class="name-div"><a href="/apple/marketcap/"><div class="company-name">Apple</div>
<div class="company-code"><span class="rank d-none"></span>AAPL</div>
</a></div></td><td class="td-right" data-sort="2891576508416">$2.891 T</td><td class="td-right" data-sort="18592">$185.92</td><td data-sort="18" class="rh-sm"><span class="percentage-green"><svg class="a" viewBox="0 0 12 12"><path d="M10 8H2l4-4 4 4z"></path></svg>0.18%</span></td><td class="p-0 sparkline-td red"><svg><path d="M0,21 5,18 10,22 15,14 20,16 25,12 30,8 35,14 40,11 45,3 50,3 55,4 60,8 65,6 70,10 75,11 80,13 85,13 90,14 95,14 100,13 105,16 110,16 115,31 120,34 125,39 130,41 135,31 140,32 145,30 150,31 155,30" /></svg></td><td>?? <span class="responsive-hidden">USA</span></td>
</tr>
Thank you!
+ I tried the following:
soup = BeautifulSoup(data, "lxml")
table = soup.find("table")
# print(table)
rows = table.find_all("tr")
But it didn't quite work because, again, the opening <tr> tag is missing.
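
One possible workaround, sketched under stated assumptions rather than offered as a definitive fix: since the closing </tr> tags are still present, the raw HTML can be split on them and each piece parsed on its own, so the missing opening <tr> no longer matters. The function name parse_rows and the trimmed-down SAMPLE_HTML are hypothetical and only for illustration; the class names (rank-td, company-name, company-code) are taken from the output above. The built-in "html.parser" is used here instead of "lxml" only to keep the sketch free of extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down sample that mimics the output in the question:
# every row ends with </tr>, but no row starts with <tr>.
SAMPLE_HTML = """
<table><tbody>
<td class="fav"></td><td class="rank-td td-right" data-sort="1">1
</td><td class="name-td">
<div class="name-div"><a href="/apple/marketcap/"><div class="company-name">Apple</div>
<div class="company-code"><span class="rank d-none"></span>AAPL</div>
</a></div></td><td class="td-right" data-sort="2891576508416">$2.891 T</td>
</tr>
<td class="fav"></td><td class="rank-td td-right" data-sort="2">2
</td><td class="name-td">
<div class="name-div"><a href="/microsoft/marketcap/"><div class="company-name">Microsoft</div>
<div class="company-code"><span class="rank d-none"></span>MSFT</div>
</a></div></td><td class="td-right" data-sort="2000000000000">$2.000 T</td>
</tr>
</tbody></table>
"""


def parse_rows(html):
    """Split the markup on the surviving </tr> tags and parse each piece."""
    rows = []
    for chunk in html.split("</tr>"):
        piece = BeautifulSoup(chunk, "html.parser")
        name = piece.find(class_="company-name")
        if name is None:
            # Header rows or trailing markup: nothing to extract here.
            continue
        ticker = piece.find(class_="company-code")
        rank = piece.find("td", class_="rank-td")
        rows.append({
            "rank": rank.get_text(strip=True) if rank else None,
            "name": name.get_text(strip=True),
            "ticker": ticker.get_text(strip=True) if ticker else None,
        })
    return rows


rows = parse_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

With the real page, the same function could be called as parse_rows(data), where data is the requests.get(url).text string from the question. Whether the live markup matches this sample exactly is an assumption. Switching the parser to "html5lib" may also recover the rows, because the HTML5 parsing rules insert an implied <tr> when a <td> appears directly inside <tbody>, but that has not been checked against this site.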