Web Scraping Specialist (Senior) ID30242

Department: Engineering
Specialization: Data Engineer
Experience: Senior
Technologies: Big Data Python
Client: Aura Intel
Special referral bonus: No
Hot position?: Hot
Technical flow: Data Engineer Python
Engineering technical flow: Data Engineer Python
Non-engineering technical flow: none
What you will do

  • Web Scraping & Data Extraction: design, develop, and optimize web scraping strategies for large-scale data extraction from dynamic websites; identify and assess relevant data sources, ensuring alignment with business objectives; implement automated web scraping solutions using Python and libraries like Scrapy, BeautifulSoup, and Selenium; build resilient and adaptable scrapers that can handle website structure changes, rate limits, and anti-scraping measures.
  • Data Processing & Integration: cleanse, validate, and transform extracted data to ensure accuracy, consistency, and usability; store and manage large volumes of scraped data using best-in-class storage solutions; develop ETL pipelines to integrate scraped data into data warehouses and analytics platforms; collaborate with cross-functional teams, including data scientists and engineers, to make scraped data actionable.
  • Web Scraping & Optimization: optimize scraping procedures to improve efficiency, reliability, and scalability across multiple data sources; implement solutions for bypassing CAPTCHAs, rotating user agents, and managing proxy services; continuously monitor, troubleshoot, and maintain scraping scripts to minimize disruptions due to site changes.
  • Compliance & Documentation: stay up to date with legal, ethical, and compliance considerations related to web scraping and data collection; ensure data collection processes align with best practices and regulatory requirements; maintain clear and detailed documentation of scraping methodologies, data pipelines, and best practices.
Must haves

  • 5+ years of hands-on experience in web scraping, data extraction, and integration;
  • Strong proficiency in Python and web scraping frameworks (Scrapy, BeautifulSoup, Selenium);
  • Expertise in handling dynamic content, browser fingerprinting, and bypassing anti-bot mechanisms (e.g., CAPTCHAs, rate limits, proxy rotation);
  • Deep understanding of HTML, CSS, XPath, and JavaScript-rendered content;
  • Experience working with large-scale data storage solutions and optimizing retrieval performance;
  • Strong grasp of ETL processes, data pipelines, and data warehousing;
  • Familiarity with APIs for data extraction and integration from public and restricted sources;
  • Strong problem-solving skills with an ability to debug and adapt to changing web structures;
  • Solid understanding of web scraping ethics, legal implications, and compliance guidelines;
  • Upper-Intermediate English level.
Nice to haves

  • Bachelor’s degree in Computer Science, Data Science, Information Technology, or a related field;
  • Experience with cloud-based distributed scraping systems (AWS, GCP, Azure);
  • Knowledge of big data frameworks and experience handling high-volume datasets within Snowflake;
  • Familiarity with machine learning techniques for data extraction and natural language processing (NLP);
  • Experience working with JSON, XML, CSV, and other structured data formats;
  • Proficiency with version control systems (Git).

AgileEngine is one of the Inc. 5000 fastest-growing companies in the US and a top-3 ranked dev shop according to Clutch. We create award-winning custom software solutions that help companies across 15+ industries change the lives of millions.

If you like a challenging environment where you’re working with the best and are encouraged to learn and experiment every day, there’s no better place — guaranteed! 🙂

The benefits of joining us

Professional growth

Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps

Competitive compensation

We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities

A selection of exciting projects

Join projects that build modern solutions for top-tier clients, including Fortune 500 enterprises and leading product brands

Flextime

Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whichever makes you happiest and most productive.

Your AgileEngine journey starts here

1. Test task

We will review your CV and send you a test task via email

2. Intro call

Our recruitment team will reach out to you to discuss available opportunities

3. Technical interview

You will have an interview with your future team lead

Our geography

Washington DC, USA (UTC-5)
Miami, USA (UTC-5)
Mexico (UTC-6)
Colombia (UTC-5)
Brazil (UTC-3)
Argentina (UTC-3)
Ukraine, Europe (UTC+2)
Poland, Europe (UTC+1)
Portugal (UTC+0)
India (UTC+5:30)

About AgileEngine

2010: Founded as a dev tool vendor with a 2-person team
2012: Opened a dev center in Ukraine
2014: Pivoted into outsourced product development
2015: Launched mobile and UI labs
2016: Got our first Inc. 5000 award
2017: Opened a dev center in Argentina
2019: Became a top-3 ranked custom software developer in DC, Ukraine, and Argentina
2020: Became the #1 software development company to hire in 2020
2021: Opened new dev centers in Mexico and Colombia, counting 500+ experts

How we lead

A company where experts grow, hone their skills, and do what they love, AgileEngine is guided by these principles:

Stay agile and embrace changes

Thrive in a results-driven culture with individual autonomy

Innovate with fellow experts in a no-blame environment

Learn from mistakes and move on

Foster mutual trust and support
