Web Scraping Specialist

MLabs
New York, NY

Location: Remote with a 6-hour overlap with EST

Remote | Full-time

Compensation: $75K - $100K

We are hiring on behalf of our client, who is seeking a Web Scraping Specialist to join a specialized technical team building the infrastructure that delivers large volumes of web data for training advanced AI models. This organization operates a massive distributed crawler and manages complex pipelines for ingesting, segmenting, and annotating billions of data points, including videos, transcripts, and audio files.

The successful candidate will lead efforts to gather and analyze data, optimize scraping processes, and help scale access to high-quality public web data. This role is ideal for a lean, technical builder who thrives in a fast-paced environment without bureaucratic red tape.

Key Responsibilities

  • Code Development: Write, test, and refine high-performance code to extract data from various online sources, ensuring maximum reliability and efficiency.
  • Data Retrieval: Manage complex data retrieval tasks, including handling pagination and dynamic content loaded via AJAX.
  • Data Quality: Clean and format extracted data to ensure it meets rigorous quality standards for downstream analysis and processing.
  • Database Management: Store and manage scraped data in appropriate databases, optimizing for both access speed and long-term data integrity.
  • Monitoring and Maintenance: Regularly monitor scraping processes and infrastructure to identify and resolve issues, ensuring a continuous and stable data flow.

Requirements

  • Extraction Expertise: Demonstrated ability to extract data from complex websites with minimal supervision, supported by a portfolio of past projects.
  • Technical Proficiency: Advanced skills in Python or JavaScript, specifically with libraries and frameworks such as BeautifulSoup, Scrapy, or Selenium.
  • Advanced Programming: Strong knowledge of asynchronous programming, multithreading, and distributed scraping architectures.
  • Web Fundamentals: In-depth knowledge of HTML, CSS, JavaScript, and the Document Object Model (DOM).
  • Data Storage: Experience with NoSQL databases (e.g., MongoDB, Cassandra), including the ability to design efficient storage solutions.
  • Cloud Infrastructure: Experience deploying and managing large-scale scraping jobs using cloud services such as AWS, Google Cloud, or Azure.
  • Preferred Skills: Ability to apply machine learning algorithms for data cleaning, categorization, or predictive analysis; active participation in relevant open-source projects.

Benefits

  • Competitive Compensation: A highly competitive salary ranging from $75,000 to $100,000, complemented by a comprehensive benefits and equity package.
  • Impactful Work: The opportunity to work at the forefront of AI development and web-scale knowledge graph creation.
  • High-Output Culture: A professional environment that prioritizes low ego, technical autonomy, and rapid execution.
  • Remote Flexibility: This is a remote position requiring a 6-hour overlap with the core team's schedule.

Due to the high volume of applications we anticipate, we regret that we are unable to provide individual feedback to all candidates. If you do not hear back from us within 4 weeks of your application, please assume that you have not been successful on this occasion. We genuinely appreciate your interest and wish you the best in your job search.

Commitment to Equality and Accessibility

At MLabs, we are committed to offering equal opportunities to all candidates. We do not discriminate, we keep our job adverts accessible, and we provide information in accessible formats. Our goal is to foster a diverse, inclusive workplace with equal opportunities for all. If you need any reasonable adjustments during any part of the hiring process, or you would like to see the job advert in an accessible format, please let us know at the earliest opportunity by emailing ***email_hidden***.

MLabs Ltd collects and processes the personal information you provide such as your contact details, work history, resume, and other relevant data for recruitment purposes only. This information is managed securely in accordance with MLabs Ltd’s Privacy Policy and Information Security Policy, and in compliance with applicable data protection laws. Your data may be shared only with clients and trusted partners where necessary for recruitment purposes. You may request the deletion of your data or withdraw your consent at any time by contacting [email protected].

Posted 2026-06-26
