In this class, you'll learn how to perform web scraping on COVID-19 data using Python.
- Python Basics
- pandas Basics
- HTML Basics
- CSS Basics
- Beautiful Soup (bs4) module
- requests module
- lxml module
Web scraping, also known as web data extraction, is the process of retrieving or “scraping” data from a website. This information is collected and then exported into a format that is more useful for the user, be it a spreadsheet or an API.
- Always be respectful and try to get permission to scrape; do not bombard a website with scraping requests, otherwise your IP address may get blocked!
- Be aware that websites change often, meaning your code could go from working to totally broken from one day to the next.
- Request a response from the webpage.
- Parse and extract the data with the help of Beautiful Soup and lxml.
- Download and export the data with pandas into Excel.
Web scraping can serve several purposes; the most popular ones are Investment Decision Making, Competitor Monitoring, News Monitoring, Market Trend Analysis, Appraising Property Value, Estimating Rental Yields, Politics and Campaigns, and many more.
We will fetch the data from the Worldometer website, because we are interested in the table on that site which lists all the countries together with their current reported coronavirus cases, new cases for the day, total deaths, new deaths for the day, etc.
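As a quick preview of where we are heading, the hedged sketch below pulls that table straight into a pandas DataFrame. The URL and the assumption that the countries table is the first `<table>` on the page reflect Worldometer's layout at the time of writing and may need adjusting; the rest of this tutorial builds the same result step by step with requests and Beautiful Soup.

```python
# A minimal sketch, assuming requests, pandas and lxml are installed and
# that the countries table is the first <table> on the page.
from io import StringIO

import pandas as pd
import requests

url = "https://www.worldometers.info/coronavirus/"   # assumed URL of the Worldometer page
response = requests.get(url)                          # step 1: request a response

tables = pd.read_html(StringIO(response.text))        # step 2: parse every <table> in the HTML (uses lxml)
covid_df = tables[0]                                  # assumption: the countries table comes first

print(covid_df.head())                                # inspect the first few rows
covid_df.to_excel("covid19_data.xlsx", index=False)   # step 3: export to Excel (needs openpyxl)
```

Before building this pipeline step by step, it helps to understand the basic structure of an HTML document.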
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<h1> Scraping </h1>
<p> Hello </p>
</body>
</html>
- `<!DOCTYPE html>`: HTML documents must start with a type declaration.
- The HTML document is contained between `<html>` and `</html>`.
- The meta and script declaration of the HTML document is between `<head>` and `</head>`.
- The visible part of the HTML document is between `<body>` and `</body>` tags.
- Title headings are defined with the `<h1>` through `<h6>` tags.
- Paragraphs are defined with the `<p>` tag.
- Other useful tags include `<a>` for hyperlinks, `<table>` for tables, `<tr>` for table rows, and `<td>` for table columns.
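To see how these tags map to Python objects, here is a small sketch that parses the sample document from above with Beautiful Soup; it assumes the bs4 and lxml packages installed in the next step, and the variable names are just illustrative.

```python
from bs4 import BeautifulSoup

# The sample HTML document from above, stored as a Python string.
sample_html = """
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<h1> Scraping </h1>
<p> Hello </p>
</body>
</html>
"""

soup = BeautifulSoup(sample_html, "lxml")  # parse the string with the lxml parser

print(soup.h1.text)  # ' Scraping ' -- the text inside the <h1> tag
print(soup.p.text)   # ' Hello '    -- the text inside the <p> tag
```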
Open your command prompt and run the following commands (one at a time):
- `pip install requests`
- `pip install lxml`
- `pip install bs4`
- Use the requests library to grab the page, as sketched below.
- This may fail if you have a firewall blocking Python/Jupyter.
- Sometimes you need to run this twice if it fails the first time.
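A minimal sketch of this step, assuming the Worldometer URL used throughout this tutorial:

```python
import requests

url = "https://www.worldometers.info/coronavirus/"  # assumed URL of the Worldometer page

# Grab the page; this may fail behind a firewall -- simply re-run the request if it does.
response = requests.get(url)

print(response.status_code)  # 200 means the request succeeded
print(response.text[:500])   # first 500 characters of the raw HTML string
```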
BeautifulSoup library already has lots of built-in tools and methods to grab information from a string of this nature (basically an HTML file). It is a Python library for pulling data out of HTML and XML files.
Using BeautifulSoup we can create a "soup" object that contains all the "ingredients" of the webpage.
Once installed, we can import it in our Python code.
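Below is a hedged sketch of how the soup object ties the whole workflow together: fetch the page, locate the countries table, walk its rows, and export the result with pandas. The table id `main_table_countries_today` is an assumption about Worldometer's current markup; check it in your browser's developer tools before relying on it.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://www.worldometers.info/coronavirus/"  # assumed URL
response = requests.get(url)

# Create the "soup" object from the raw HTML, using the lxml parser.
soup = BeautifulSoup(response.text, "lxml")

# Assumption: the countries table carries this id -- verify it in your browser's dev tools.
table = soup.find("table", id="main_table_countries_today")

# Walk every row (<tr>) and collect the text of every data cell (<td>).
rows = []
for tr in table.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:  # header rows use <th> instead of <td>, so they come back empty and are skipped
        rows.append(cells)

covid_df = pd.DataFrame(rows)  # column names could also be taken from the <th> cells
print(covid_df.head())

covid_df.to_excel("covid19_data.xlsx", index=False)  # export with pandas (needs openpyxl)
```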
Starring and forking is free for you, but it tells me and other people that this tutorial was helpful and that you like it.
Go here if you aren't here already and click the ➞ ✰ Star and ⵖ Fork buttons in the top right corner. You will be asked to create a GitHub account if you don't already have one.
- Go here and click the big green ➞ Code button in the top right of the page, then click ➞ Download ZIP.
- Extract the ZIP and open it. Unfortunately I don't have any more specific instructions, because how exactly this is done depends on which operating system you run.
- Launch ipython notebook from the folder which contains the notebooks. Open each one of them and select `Kernel > Restart & Clear Output`. This will clear all the outputs, and now you can understand each statement and learn interactively.
If you have git and you know how to use it, you can also clone the repository instead of downloading a ZIP and extracting it. An advantage of doing it this way is that you don't need to download the whole tutorial again to get the latest version; all you need to do is pull with git and run ipython notebook again.
I'm Dr. Milaan Parmar and I have written this tutorial. If you think you can add to, correct, edit, or enhance this tutorial, you are most welcome 🙏
See GitHub's contributors page for details.
If you have trouble with this tutorial, please tell me about it by creating an issue on GitHub, and I'll make this tutorial better. This is probably the best choice if you had trouble following the tutorial and something in it should be explained better. You will be asked to create a GitHub account if you don't already have one.
If you like this tutorial, please give it a ⭐ star.
You may use this tutorial freely at your own risk. See LICENSE.
Copyright (c) 2020 Dr. Milaan Parmar