In this Web Scraping in Python video, we are going to do a simple web scraping exercise to extract the list of Excel functions from Microsoft’s website.
I recorded this video because I was recently developing an Excel application and wanted to know which functions are available in Excel and when some of them were released. After examining the Excel functions site, I noticed that the page is well built and the HTML markup is well structured and organized. So I thought this would be a good exercise for beginners to get some hands-on experience.
Source Code:
import requests
import pandas as pd
from bs4 import BeautifulSoup

url = 'https://support.office.com/en-us/article/excel-functions-alphabetical-b3944572-255d-4efb-bb96-c6d90033e188'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')

# Locate the functions table by its id, then collect every row in it
tableFunctions = soup.find('table', attrs={'id': 'tblID0EBDAAA'})
functions = tableFunctions.find_all('tr')

lstFunctions = []
for function in functions:
    try:
        values = []
        tds = function.find_all('td')
        if tds:
            # Cell text: function name and description
            for td in tds:
                values.append(td.text.strip())
            # Alt text of the row's image (used for the version column), or '' if there is none
            img = function.find('img')
            values.append(img['alt'] if img else '')
            lstFunctions.append(values)
    except Exception as e:
        # Print the offending row and the error, then move on to the next row
        print(function)
        print(e)
        continue

# Skip the first collected row and write the rest to an Excel file
dfFunctions = pd.DataFrame(lstFunctions[1:], columns=['Function Name', 'Desc', 'Excel version released'])
dfFunctions.to_excel('Excel Functions List.xlsx', index=False)
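If you want to try the row-extraction step without hitting the live page, here is a minimal sketch that runs the same idea against a small hand-written table. The markup in SAMPLE_HTML, the id "sample", and the helper name extract_rows are all made up for illustration; the real page's structure may differ. Note that in this sample the header row uses th cells, so it is skipped automatically and no [1:] slice is needed; whether that holds for the real page is not checked here.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption, not copied from the real page):
# each data row has two <td> cells and an optional image whose alt text
# stands in for the version information.
SAMPLE_HTML = """
<table id="sample">
  <tr><th>Function</th><th>Description</th></tr>
  <tr><td>ABS function</td><td>Returns the absolute value of a number</td></tr>
  <tr><td>ARRAYTOTEXT function</td><td><img alt="Office 365 button"> Returns an array of text values</td></tr>
</table>
"""


def extract_rows(table):
    """Return [cell texts..., image alt or ''] for every row that has <td> cells."""
    rows = []
    for tr in table.find_all('tr'):
        tds = tr.find_all('td')
        if not tds:
            # Rows without <td> cells (e.g. a <th> header row) are skipped
            continue
        values = [td.text.strip() for td in tds]
        img = tr.find('img')
        values.append(img.get('alt', '') if img else '')
        rows.append(values)
    return rows


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
rows = extract_rows(soup.find('table', attrs={'id': 'sample'}))
for row in rows:
    print(row)
```

Once the rows look right on a sample like this, the same list can be passed to pd.DataFrame with the three column names used in the script.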