Building an Automated Web Browser with Asyncio, Aiohttp, and Playwright

Introduction

In the fast-paced digital world, automation is crucial for efficiency and scalability. Web automation, specifically, allows businesses and developers to streamline repetitive tasks, gather data, and interact with web applications programmatically. This article delves into an advanced Python script that leverages asyncio, aiohttp, and playwright to create an automated browser experience, complete with proxy rotation and URL navigation.

Full Code (Simple Copy And Paste):

import asyncio
import aiohttp
from playwright.async_api import Playwright, async_playwright
import random

async def make_api_request():
    api_url = "https://api.prestigeproxies.com/rotate/vodafone.trial?port=01&secret=mIHVrT85ux"
    async with aiohttp.ClientSession() as session:
        async with session.get(api_url) as response:
            if response.status == 200:
                print("Proxy rotated successfully.")
            else:
                print(f"Failed to rotate proxy. Status: {response.status}")

async def close_and_reopen_browser(playwright: Playwright):
    global browser, context, page
    
    # Close the browser
    await context.close()
    await browser.close()

    # Make API request to rotate the proxy
    await make_api_request()
    await asyncio.sleep(5)  # non-blocking pause so the rotation can take effect
    # Reopen the browser
    await run(playwright)

async def run(playwright: Playwright) -> None:
    global browser, context, page

    proxy_info = {
        "server": f"http://yawer:[email protected]:55401",
        "username": "yawer",
        "password": "mIHVrT85ux"
    }

    browser = await playwright.chromium.launch(
        headless=False,
        proxy=proxy_info
    )
    context = await browser.new_context()
    
    # List of URLs to open
    urls = [
        "https://tastycherrygames.com/games/btc/bitcoinclicker.html",
        "https://blokette.com",
        "https://cryptoellen.com",
        "https://cryptochemy.com",
        "https://circlekgame.com",
        "https://cosmobulletin.com",
        "https://jebate.com",
        "https://queentakes.com",
        "https://cryptotaxy.com",
        "https://chipste.com",
        "https://bit.ly/coverga",
        "https://tastycherrygames.com/games/feb/aads.html",
        "https://staggereddam.com/vc4vs0gxcb?key=7b529dfba7708e88def930dd1c4666d9"
    ]
    
    # Shuffle URLs to open them in random order
    random.shuffle(urls)
    
    pages = []
    for url in urls:
        page = await context.new_page()
        await page.goto(url, timeout=0)
        pages.append(page)

    # Wait 15 seconds so the pages can finish loading
    await asyncio.sleep(15)

    # Close and reopen the browser
    await close_and_reopen_browser(playwright)

    # Sleep for 10 minutes before closing the context and browser
    await asyncio.sleep(600)

    await context.close()
    await browser.close()

async def main() -> None:
    async with async_playwright() as playwright:
        await run(playwright)

asyncio.run(main())

Understanding the Code

Before diving into the code’s functionality, let’s break down the key components and libraries involved:

  1. Asyncio: A library to write concurrent code using the async/await syntax.
  2. Aiohttp: An asynchronous HTTP client/server framework for making HTTP requests.
  3. Playwright: A library to automate Chromium, Firefox, and WebKit with a single API.

Code Breakdown

Importing Necessary Libraries

import asyncio
import aiohttp
from playwright.async_api import Playwright, async_playwright
import random

Here, we import the required libraries: asyncio handles the asynchronous scheduling, aiohttp manages HTTP requests, playwright.async_api provides the browser automation tools, and random is used to shuffle the URL list.

Function: make_api_request

async def make_api_request():
    api_url = "https://api.prestigeproxies.com/rotate/vodafone.trial?port=01&secret=mIHVrT85ux"
    async with aiohttp.ClientSession() as session:
        async with session.get(api_url) as response:
            if response.status == 200:
                print("Proxy rotated successfully.")
            else:
                print(f"Failed to rotate proxy. Status: {response.status}")

This function performs an asynchronous HTTP GET request to a proxy rotation API. If the response status is 200, it indicates a successful proxy rotation.

Function: close_and_reopen_browser

async def close_and_reopen_browser(playwright: Playwright):
    global browser, context, page

    await context.close()
    await browser.close()

    await make_api_request()
    await asyncio.sleep(5)  # non-blocking pause so the rotation can take effect

    await run(playwright)

This function closes the current browser and its context, calls the proxy rotation API, and then reopens the browser by invoking run again. The await asyncio.sleep(5) gives the rotation a moment to take effect without blocking the event loop (a plain time.sleep would stall every other coroutine).

Function: run

async def run(playwright: Playwright) -> None:
    global browser, context, page

    proxy_info = {
        "server": f"http://yawer:[email protected]:55401",
        "username": "yawer",
        "password": "mIHVrT85ux"
    }

    browser = await playwright.chromium.launch(
        headless=False,
        proxy=proxy_info
    )
    context = await browser.new_context()

    urls = [
        "https://tastycherrygames.com/games/btc/bitcoinclicker.html",
        "https://blokette.com",
        "https://cryptoellen.com",
        "https://cryptochemy.com",
        "https://circlekgame.com",
        "https://cosmobulletin.com",
        "https://jebate.com",
        "https://queentakes.com",
        "https://cryptotaxy.com",
        "https://chipste.com",
        "https://bit.ly/coverga",
        "https://tastycherrygames.com/games/feb/aads.html",
        "https://staggereddam.com/vc4vs0gxcb?key=7b529dfba7708e88def930dd1c4666d9"
    ]

    random.shuffle(urls)

    pages = []
    for url in urls:
        page = await context.new_page()
        await page.goto(url, timeout=0)
        pages.append(page)

    await asyncio.sleep(15)

    await close_and_reopen_browser(playwright)

    await asyncio.sleep(600)

    await context.close()
    await browser.close()

This function sets up the Playwright browser with proxy settings and opens a series of URLs. Key steps include:

  1. Proxy Configuration: The browser is launched with specific proxy settings.
  2. URL Navigation: A list of URLs is shuffled and then navigated in new pages.
  3. Closing and Reopening: After a brief pause, the browser is closed, the proxy is rotated via the API, and a fresh browser session is launched.
  4. Sleep Intervals: The script includes sleep intervals to simulate user behavior and allow for proxy rotation.

Main Function

async def main() -> None:
    async with async_playwright() as playwright:
        await run(playwright)

asyncio.run(main())

The main function initializes the asynchronous Playwright context and runs the primary automation function.

Detailed Explanation and Use Cases

1. Asynchronous Programming with Asyncio

asyncio enables concurrent code execution, crucial for web scraping and automation tasks where waiting for network responses or browser actions can otherwise block the program flow. By using async/await, the script can handle multiple tasks, such as making HTTP requests and controlling the browser, without being sequentially blocked.
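
As a minimal, self-contained sketch of this idea (the fetch_data and demo names are illustrative, not part of the script), asyncio.gather runs several awaitables concurrently instead of one after another:

import asyncio

async def fetch_data(name: str, delay: float) -> str:
    # Simulate a network wait without blocking the event loop
    await asyncio.sleep(delay)
    return f"{name} finished after {delay}s"

async def demo() -> None:
    # Both coroutines run concurrently, so this takes ~2 seconds, not 3
    results = await asyncio.gather(
        fetch_data("proxy rotation", 1),
        fetch_data("page navigation", 2),
    )
    print(results)

asyncio.run(demo())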

2. Handling HTTP Requests with Aiohttp

Aiohttp is used to manage HTTP requests efficiently. In this script, it’s employed to interact with the proxy rotation API, ensuring that the browser uses a fresh proxy for each session, thereby reducing the risk of being blocked by the target websites.
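
If extra robustness is needed, the same request could be given an explicit timeout and basic error handling. The sketch below is one possible variation of make_api_request; the 10-second timeout is an arbitrary choice, not something the original script uses:

import asyncio
import aiohttp

async def make_api_request_with_timeout(api_url: str) -> bool:
    timeout = aiohttp.ClientTimeout(total=10)  # fail fast instead of hanging
    try:
        async with aiohttp.ClientSession(timeout=timeout) as session:
            async with session.get(api_url) as response:
                return response.status == 200
    except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
        print(f"Proxy rotation request failed: {exc}")
        return False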

3. Automating Browser Actions with Playwright

Playwright provides a robust API for browser automation. It supports multiple browsers (Chromium, Firefox, and WebKit), and its asynchronous API fits seamlessly with asyncio. Key functionalities used in this script include:

  • Launching Browser with Proxy: The browser is configured to use a proxy server for all network requests.
  • Context Management: Playwright’s context feature allows the script to manage multiple browser contexts, each with isolated storage and settings (see the sketch after this list).
  • Page Navigation: The script opens multiple URLs in separate pages, simulating user interaction with the websites.
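
Playwright also accepts proxy settings per context rather than only at browser launch, which can be useful when each session should route through a different proxy. The sketch below uses placeholder proxy values; depending on the browser and Playwright version, a browser-level proxy may also need to be configured for per-context proxies to take effect:

from playwright.async_api import async_playwright

async def open_with_context_proxy() -> None:
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        # Each context carries its own proxy plus isolated cookies and storage
        context = await browser.new_context(
            proxy={
                "server": "http://proxy.example.com:8080",  # placeholder proxy
                "username": "user",
                "password": "pass",
            }
        )
        page = await context.new_page()
        await page.goto("https://example.com")
        print(await page.title())
        await browser.close()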

4. Proxy Rotation

Proxy rotation is vital for tasks like web scraping to avoid detection and blocking. By regularly changing the proxy server, the script mimics multiple users accessing the websites, thus evading anti-bot mechanisms.
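
The script delegates rotation to a provider API, but the same idea can be sketched locally by cycling through a small pool of proxy configurations and launching each new browser session with the next one. The addresses below are placeholders:

import itertools

# Hypothetical pool of proxies in the format Playwright expects
PROXY_POOL = [
    {"server": "http://proxy1.example.com:8080"},
    {"server": "http://proxy2.example.com:8080"},
    {"server": "http://proxy3.example.com:8080"},
]

proxy_cycle = itertools.cycle(PROXY_POOL)

def next_proxy() -> dict:
    # Return the next proxy in round-robin order for the next browser launch
    return next(proxy_cycle)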

5. Practical Use Cases

  • Web Scraping: Automate data extraction from multiple websites without getting blocked.
  • Automated Testing: Test web applications under different network conditions and proxy settings.
  • SEO Monitoring: Check website rankings and performance from various locations using different proxies.

Challenges and Solutions

1. Managing Asynchronous Tasks

Handling multiple asynchronous tasks can be complex. Ensuring that tasks like making HTTP requests and controlling the browser are performed efficiently requires careful management of the event loop and task scheduling.

Solution: Using asyncio and aiohttp together with Playwright’s asynchronous API helps streamline these tasks, allowing for efficient and concurrent execution.
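
One concrete way to apply this to the script above would be to open the pages concurrently instead of one by one. This is a sketch, assuming a context and a urls list like the ones created in run, and it requires import asyncio:

async def open_all_pages(context, urls):
    async def open_one(url):
        page = await context.new_page()
        # timeout=0 disables the navigation timeout, as in the original script
        await page.goto(url, timeout=0)
        return page

    # Navigate to every URL concurrently rather than sequentially
    return await asyncio.gather(*(open_one(url) for url in urls))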

2. Proxy Reliability

Proxies can sometimes be unreliable or slow, affecting the performance of the script.

Solution: Implementing error handling and retries for proxy requests ensures that the script can recover from proxy failures and continue executing.
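
A simple retry wrapper around the proxy-rotation request might look like the following sketch; the attempt count and backoff values are arbitrary choices, not part of the original script:

import asyncio
import aiohttp

async def rotate_proxy_with_retries(api_url: str, attempts: int = 3) -> bool:
    for attempt in range(1, attempts + 1):
        try:
            async with aiohttp.ClientSession() as session:
                async with session.get(api_url) as response:
                    if response.status == 200:
                        return True
                    print(f"Attempt {attempt}: unexpected status {response.status}")
        except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
            print(f"Attempt {attempt} failed: {exc}")
        if attempt < attempts:
            # Exponential backoff before retrying
            await asyncio.sleep(2 ** attempt)
    return False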

3. Browser Automation Stability

Automating browsers can be prone to crashes or unexpected behavior, especially when dealing with multiple tabs and contexts.

Solution: Regularly closing and reopening the browser, as done in the script, helps maintain stability. Additionally, monitoring for exceptions and implementing retries can improve reliability.
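
For example, a navigation helper could catch Playwright errors and retry on a fresh page before giving up. This is a sketch, assuming a context object like the one created in run; the 30-second timeout and two attempts are arbitrary choices:

from playwright.async_api import Error as PlaywrightError

async def goto_with_retries(context, url: str, attempts: int = 2):
    for attempt in range(1, attempts + 1):
        page = await context.new_page()
        try:
            await page.goto(url, timeout=30_000)  # 30s instead of the script's unlimited timeout
            return page
        except PlaywrightError as exc:
            print(f"Navigation to {url} failed (attempt {attempt}): {exc}")
            await page.close()
    return None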

Conclusion

This Python script showcases the power of combining asyncio, aiohttp, and Playwright for advanced web automation. By incorporating asynchronous programming, efficient HTTP request handling, and robust browser automation, the script achieves reliable and scalable web interactions. Whether for web scraping, automated testing, or SEO monitoring, this approach provides a solid foundation for building sophisticated automation tools.
