Web scraping, data mining and data extraction services are available for lead generation, business process automation, research, and marketing.
Custom web scrapers are written in Python (BeautifulSoup, Requests, Selenium). Data is extracted, filtered, and packaged in various formats, including CSV, JSON, and XML.
Web scraping features include:
- Extracting data tables, text, images, links, etc.
- Filtering and compiling data into various formats, including JSON, XML, CSV, and SQL.
- Setting up alerts for new content discovery.
- Capturing screenshots or downloading the full HTML of websites.
- “Real” interactions with websites, such as clicking buttons, opening dropdowns, and entering text into forms.
Scraping tools used: Python (BeautifulSoup, Requests, Selenium) and headless Chrome.
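As a rough illustration of the extract, filter, and package steps listed above, here is a minimal sketch. It uses only Python's standard library (`html.parser`, `csv`, `json`) instead of BeautifulSoup so that it runs without any installs; the sample HTML, the `TableAndLinkParser` class, and the `extract` and `to_csv` helpers are all hypothetical names made up for this example, not part of any delivered project.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableAndLinkParser(HTMLParser):
    """Collects table rows and link targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []      # parsed rows, each a list of cell strings
        self.links = []     # href values found on <a> tags
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract(html):
    """Return (records, links); the first table row is treated as the header."""
    parser = TableAndLinkParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return [], parser.links
    header, *body = parser.rows
    records = [dict(zip(header, row)) for row in body]
    return records, parser.links


def to_csv(records):
    """Serialize a non-empty list of dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up sample page used only for this illustration.
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>city</th></tr>
  <tr><td>Acme Dojo</td><td><a href="/leeds">Leeds</a></td></tr>
  <tr><td>Beta Karate</td><td>York</td></tr>
</table>
"""

records, links = extract(SAMPLE_HTML)
print(json.dumps(records, indent=2))  # JSON packaging
print(to_csv(records))                # CSV packaging
```

A production scraper would typically fetch pages with Requests or Selenium and parse them with BeautifulSoup; the packaging step stays much the same.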
Recent Web Scraping and Data Transformation tasks performed:
- System to rename thousands of image files stored in Dropbox folders. The desktop-based app was built using Python and the Dropbox API.
- Web app to translate Excel documents while preserving sheet styles and formulas. The app was developed using Python (Flask) and the Google Cloud Translation API.
- Scraped a list of 100,000 funeral homes across the U.S.
- Scraped a list of 3,000 martial arts institutes in the UK and identified the tools used to build their sites, their mobile responsiveness, and their page load speed.
- Built a system to automatically scrape property listings and publish to a WordPress site.
- Scraped product listings from a home appliances vendor using Selenium.
- Built a system to scrape job listings from Indeed and import them into a WordPress-based website.
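For a sense of what the "import into WordPress" step in the last item can look like, the sketch below builds (but does not send) a request to the WordPress REST API's posts endpoint. It is an assumption about one possible approach, not a description of the system that was built: the listing field names (`title`, `company`, `location`, `url`), the `build_post_request` helper, the example site URL, and the use of Application Passwords for authentication are all hypothetical choices for this example.

```python
import base64
import html
import json
import urllib.request


def build_post_request(site_url, username, app_password, listing):
    """Build (but do not send) a request that creates a draft WordPress post."""
    # Escape scraped text before embedding it in HTML content.
    company = html.escape(listing["company"])
    location = html.escape(listing["location"])
    link = html.escape(listing["url"], quote=True)

    payload = {
        "title": listing["title"],
        "content": f'<p>{company}, {location}</p><p><a href="{link}">View listing</a></p>',
        "status": "draft",  # keep imported posts unpublished until reviewed
    }

    # HTTP Basic credentials, assuming a WordPress Application Password.
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()

    return urllib.request.Request(
        url=f"{site_url.rstrip('/')}/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Made-up listing and credentials used only for this illustration.
sample_listing = {
    "title": "Data Engineer",
    "company": "Example Co",
    "location": "Remote",
    "url": "https://example.com/jobs/123",
}
request = build_post_request("https://example.com/", "editor", "app-password", sample_listing)
print(request.get_method(), request.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(request)`; it is left out here so the sketch runs without network access or a real site.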
Do you have a web scraping, data extraction or business process automation requirement?