Let’s create a client-side web scraper using JavaScript running in the browser. We’ll use the fetch API to make HTTP requests and DOMParser to parse HTML content. Here’s how you can do it:
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Web Scraper</title>
</head>
<body>
  <h1>Scraped Articles</h1>
  <ul id="article-list"></ul>
  <script>
    // Fetch the raw HTML of a page; resolves to null if the request fails.
    async function fetchPage(url) {
      try {
        const response = await fetch(url);
        if (!response.ok) {
          throw new Error(`Failed to fetch page: HTTP ${response.status}`);
        }
        return await response.text();
      } catch (error) {
        console.error('Error fetching page:', error);
        return null;
      }
    }

    // Parse an HTML string and return the text of every <h2> element.
    function parsePage(html) {
      const parser = new DOMParser();
      const doc = parser.parseFromString(html, 'text/html');
      return Array.from(doc.querySelectorAll('h2')).map(title => title.textContent);
    }

    // Fetch the page, extract the article titles, and render them as list items.
    async function main() {
      const url = 'https://example.com/articles';
      const html = await fetchPage(url);
      if (html) {
        const articles = parsePage(html);
        const articleList = document.getElementById('article-list');
        articles.forEach(title => {
          const listItem = document.createElement('li');
          listItem.textContent = title;
          articleList.appendChild(listItem);
        });
      }
    }

    main();
  </script>
</body>
</html>
In this example:
- We define three functions: fetchPage, parsePage, and main.
- fetchPage uses the fetch API to make an HTTP request to the specified URL and returns the HTML content of the page.
- parsePage uses DOMParser to parse the HTML content and extract the titles of articles.
- main is the main function that fetches the page, parses it, and then displays the titles of articles on the webpage.
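The titles returned by parsePage are raw textContent values, so depending on the page's markup they may carry stray whitespace, be empty, or repeat. A minimal sketch of a cleanup step is shown here; cleanTitles is a hypothetical helper that is not part of the example above, and it could be applied to the result of parsePage before the list is rendered.

```javascript
// Hypothetical helper (not in the original example): normalizes the raw
// strings returned by parsePage before they are displayed.
function cleanTitles(titles) {
  const seen = new Set();
  const cleaned = [];
  for (const raw of titles) {
    // Collapse runs of whitespace (including newlines) and trim both ends.
    const title = String(raw ?? '').replace(/\s+/g, ' ').trim();
    // Skip empty strings and titles that have already been collected.
    if (title === '' || seen.has(title)) continue;
    seen.add(title);
    cleaned.push(title);
  }
  return cleaned;
}
```

In main, this would amount to replacing parsePage(html) with cleanTitles(parsePage(html)).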
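You can replace 'https://example.com/articles' with the URL of the website you want to scrape. If that site does not mark up its article titles as h2 elements, adjust the selector passed to querySelectorAll to match.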
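Since different sites mark up their titles differently, it can help to pass the selector in rather than hard-coding it. The following is a sketch under that assumption: extractText is a hypothetical name, and the selector 'article h2.title' is only an illustration, not something taken from a real site.

```javascript
// Sketch: a selector-driven variant of parsePage. `extractText` is a
// hypothetical name; `doc` is any object exposing querySelectorAll, such as
// the Document returned by DOMParser.parseFromString.
function extractText(doc, selector = 'h2') {
  return Array.from(doc.querySelectorAll(selector), (node) =>
    (node.textContent || '').trim()
  );
}

// Browser-only usage; skipped where DOMParser is unavailable (e.g. Node.js).
if (typeof DOMParser !== 'undefined') {
  const html = '<article><h2 class="title"> First post </h2></article>';
  const doc = new DOMParser().parseFromString(html, 'text/html');
  console.log(extractText(doc, 'article h2.title'));
}
```

Taking the parsed document as a parameter keeps the extraction step separate from parsing, so the same function can be reused with several selectors on one page.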
Remember, client-side web scraping is limited by browser security policies such as the same-origin policy and CORS. A cross-origin fetch succeeds only if the target server sends CORS headers that permit your page's origin, so requests to many websites will fail. Always respect the terms of service of the websites you’re scraping and ensure your scraping activities comply with legal and ethical guidelines.
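When a request is blocked by CORS, fetch rejects with a TypeError, and a script cannot tell that apart from other network failures. The sketch below separates that case from HTTP error statuses so the cause of a failed scrape is easier to report. The names fetchPageWithDiagnostics and fetchImpl are hypothetical; fetchImpl defaults to the global fetch and exists only so the function can be exercised without a network.

```javascript
// Sketch: separates HTTP errors from network-level failures.
// `fetchPageWithDiagnostics` and `fetchImpl` are hypothetical names.
async function fetchPageWithDiagnostics(url, fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(url);
    if (!response.ok) {
      // The server answered, but with an error status (e.g. 404 or 500).
      return { html: null, error: `HTTP ${response.status}` };
    }
    return { html: await response.text(), error: null };
  } catch (error) {
    // fetch rejects with a TypeError on network failures; in browsers this
    // includes requests blocked by CORS.
    const reason = error instanceof TypeError
      ? 'network error or blocked by CORS'
      : String(error);
    return { html: null, error: reason };
  }
}
```

A caller could then check the error field and show a message in the page instead of silently rendering an empty list.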