Process Status Scrape
Year
2024
Tech & Technique
Python, Selenium, Pandas, Docker, AWS, Google Sheets
Description
A web scraper for a property regularization company (Aprovcon) that collects up-to-date statuses of various construction processes from government websites and saves all the statuses in a Google Sheets spreadsheet.
Key Features:
- Automated Status Tracking: Automatically retrieves the current status of construction and regularization processes from official government sources.
- Government Data Integration: Connects to various government websites and extracts the relevant information for up-to-date insights.
- Real-time Data Updates: Keeps statuses current, providing timely information on process progression.
- Google Sheets Integration: Stores and organizes all collected status data in a centralized Google Sheets spreadsheet for easy access and analysis by the Aprovcon team.
- Streamlined Regularization Workflow: Automates manual tracking tasks, significantly improving the efficiency and speed of property regularization for Aprovcon.
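The tracking loop behind these features can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `StatusRecord`, `collect_statuses`, and the `fetch_status` callable are hypothetical names, and the real lookup would be a Selenium routine specific to each government site. Passing the lookup in as a function keeps the loop independent of any one website and lets a single failed lookup be recorded without stopping the run.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List


@dataclass
class StatusRecord:
    process_id: str   # identifier of the construction/regularization process
    status: str       # status text, or an error note if the lookup failed
    checked_at: str   # ISO 8601 timestamp (UTC) of the lookup attempt
    ok: bool          # False when the lookup raised an exception


def collect_statuses(
    process_ids: Iterable[str],
    fetch_status: Callable[[str], str],  # hypothetical per-site lookup
) -> List[StatusRecord]:
    """Look up every process ID, recording failures instead of aborting."""
    records: List[StatusRecord] = []
    for pid in process_ids:
        checked_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
        try:
            status = fetch_status(pid).strip()
            records.append(StatusRecord(pid, status, checked_at, True))
        except Exception as exc:  # one broken page should not stop the run
            records.append(
                StatusRecord(pid, f"lookup failed: {exc}", checked_at, False)
            )
    return records
```

In practice `fetch_status` would wrap the browser automation for one site, and the returned records would feed the spreadsheet update.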
My Role
Web Scraper
- Automated Web Scraping: Designed, developed, and deployed a robust web scraper to automate the collection of real-time construction process statuses from diverse government websites for Aprovcon.
- CAPTCHA Evasion: Engineered and implemented techniques to bypass multiple, varied CAPTCHA systems, ensuring uninterrupted data retrieval.
- Persistent Data Access: Overcame restrictive access protocols and dynamic content challenges on government platforms by identifying and leveraging specific system behaviors to ensure consistent and reliable information gathering.
- Data Integration & Reporting: Automated the parsing, structuring, and saving of all collected statuses into a Google Sheets spreadsheet, giving Aprovcon an up-to-date and accessible data source for tracking property regularization processes.
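As an illustration of the structuring step before the spreadsheet write, the sketch below turns scraped records into a header row plus one row per process, keeping only the most recent status for each. The column names in `HEADER` and the function `to_sheet_rows` are assumptions made for this example, not the project's real schema; the resulting list of lists is the tabular shape that Sheets client libraries generally accept for a range update.

```python
from typing import Dict, List

# Hypothetical column layout for the spreadsheet
HEADER = ["process_id", "source", "status", "checked_at"]


def to_sheet_rows(records: List[Dict[str, str]]) -> List[List[str]]:
    """Keep the latest record per process and return header + data rows."""
    latest: Dict[str, Dict[str, str]] = {}
    for rec in records:
        key = rec["process_id"]
        # ISO 8601 timestamps in the same timezone sort correctly as strings
        if key not in latest or rec["checked_at"] >= latest[key]["checked_at"]:
            latest[key] = rec

    rows: List[List[str]] = [list(HEADER)]
    for key in sorted(latest):  # stable, predictable row order
        rec = latest[key]
        rows.append([rec.get(col, "") for col in HEADER])
    return rows
```

Sorting by process ID keeps each process on a predictable row between runs, which makes the sheet easier to scan and compare over time.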