
Closed
Posted
Paid on delivery
Senior Software Engineer (Web Automation & Data Synchronization)

Company: Confidential
Role: Senior Web Automation Engineer
Location: Remote (Global Pod)
Type: Full-Time Contract

1. About the Project
We are developing a high-velocity national procurement and market intelligence platform for the $100B construction finishing industry, and we are scaling our core "Data Factory" to support a 50-state national launch. The objective is to synchronize multi-variate inventory, pricing, and LTL logistics data from 100+ national retailers into a unified, high-performance data schema.

2. Key Responsibilities
- Automation Architecture: Design and deploy high-concurrency automation clusters using Node.js (TypeScript) and headless browser frameworks (Playwright/Puppeteer).
- Stateful Flow Simulation: Develop sophisticated scripts to simulate complex user journeys, including geo-localized session management and extraction of landed costs (price + tax + freight).
- Neural Data Normalization: Architect ETL pipelines that transform unstructured web data into our internal Universal Material Schema using advanced regex and LLM-assisted parsing.
- Perceptual Indexing: Implement perceptual hashing (pHash) logic to identify and de-duplicate identical physical products across the national market.
- Infrastructure Management: Manage auto-scaling containerized workloads on AWS (Fargate/ECS) with advanced proxy-mesh orchestration.

3. Technical Requirements
- Core: Expert-level Node.js (TypeScript) or Python.
- Automation: Mastery of Playwright or Puppeteer.
- System Resilience: Deep understanding of request orchestration, session persistence, and residential proxy management to ensure high uptime against enterprise-grade site architectures.
- Cloud: AWS (Fargate, Lambda, S3).
- Vision: Familiarity with image processing/fingerprinting (OpenCV/pHash) is a significant advantage.

4. The Challenge: National Scale
You will be building a system designed to support a $10MM+ ARR run rate within 12 months. This requires a "Zero-Error" approach to data integrity, specifically ensuring that material quantities and freight weights remain accurate across 40,000+ US zip codes.

5. Why Join Us?
- Lead the technical build of a proprietary data moat for a venture-backed project.
- Tackle high-complexity architectural challenges involving distributed state management.
- Milestone-based compensation with a clear path to long-term leadership as the team scales.

How to Apply
Please provide your GitHub profile or a portfolio highlighting your work in browser automation. Specifically, describe a project where you managed high-concurrency data synchronization across multiple state-dependent web environments.
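As a quick illustration of the perceptual-indexing responsibility above, a minimal pure-Python pHash could look like the following. This is only a sketch under stated assumptions: it assumes product images are already decoded to 32x32 grayscale pixel grids, and a production pipeline would use OpenCV's img_hash module or a dedicated hashing library instead of a hand-rolled DCT.

```python
import math

def dct_2d(pixels):
    # Separable 2-D DCT-II (unnormalized; normalization is irrelevant
    # for median thresholding). Fine for 32x32 inputs.
    n = len(pixels)
    cos_tab = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
               for u in range(n)]

    def dct_1d(vec):
        return [sum(vec[x] * cos_tab[u][x] for x in range(n)) for u in range(n)]

    rows = [dct_1d(row) for row in pixels]                      # transform rows
    cols = [dct_1d([rows[x][v] for x in range(n)]) for v in range(n)]  # then columns
    return [[cols[v][u] for v in range(n)] for u in range(n)]   # [u][v] layout

def phash(pixels, hash_size=8):
    # 64-bit perceptual hash: keep the low-frequency top-left block of the
    # DCT, then set each bit by comparing the coefficient to the median.
    coeffs = dct_2d(pixels)
    low = [coeffs[u][v] for u in range(hash_size) for v in range(hash_size)]
    median = sorted(low)[len(low) // 2]
    bits = 0
    for c in low:
        bits = (bits << 1) | (c > median)
    return bits

def hamming(a, b):
    # Number of differing bits; a small distance suggests the same product photo.
    return bin(a ^ b).count("1")
```

De-duplication then reduces to comparing Hamming distances between hashes; a cutoff around 10 of 64 bits is a common starting point, but the threshold is an assumption to be tuned on real catalog data.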
Project ID: 40271586
105 proposals
Remote project
Active 1 day ago
105 freelancers are bidding an average of $16,146 USD for this job

I bring extensive experience designing high-concurrency web automation systems using Node.js (TypeScript) and Playwright, including distributed scraping clusters that simulate complex, geo-localized user journeys with resilient session management and proxy orchestration. In a recent large-scale retail intelligence project, I built an AWS Fargate-based automation mesh that synchronized pricing and logistics data across hundreds of region-specific endpoints, implementing retry-safe request orchestration, structured ETL pipelines, and perceptual image hashing for cross-market product de-duplication. My approach emphasizes zero-error data integrity through deterministic parsing layers, schema validation, and observability-driven monitoring to maintain accuracy across large zip-code matrices. I can share relevant GitHub repositories and outline how I would architect your national-scale Data Factory to ensure reliability, scalability, and long-term technical moat development.
$15,000 USD in 7 days
8.1

With over 10 years of experience in web and mobile development, specializing in web automation and data synchronization, I understand the crucial aspects of your project. Your need to scale a national procurement platform for the construction industry presents a unique challenge that requires expertise in automation architecture, neural data normalization, and infrastructure management. I have successfully tackled similar challenges in the past, with a track record of implementing high-concurrency automation clusters and designing ETL pipelines for seamless data transformation. My experience in Node.js, Playwright, and AWS aligns perfectly with the technical requirements outlined in your project description. Working on this project excites me, as it offers the opportunity to contribute to a venture-backed endeavor with a clear path to long-term leadership. I am confident in my ability to lead the technical build of your data moat and ensure a "Zero-Error" approach to data integrity. I look forward to discussing how I can support your project further; feel free to reach out to me through this platform to take the next steps. Thank you for considering my proposal.
$16,000 USD in 75 days
6.9

Hello, Are you ready to elevate your web automation project with innovative solutions tailored for the construction industry? With extensive experience in high-concurrency automation and cloud infrastructure, I can guarantee a zero-error approach that ensures data integrity across multiple states. Let's connect to discuss how we can collaborate on this ambitious project and build a robust data synchronization platform together. Best, Smith
$15,000 USD in 7 days
6.8

As an experienced and versatile senior full-stack developer, my skill set is perfectly aligned with your need for a Senior Engineer for Web Automation. I have spent considerable time automating complex processes, synchronizing data from various sources, and ensuring high-performance architecture. My deep familiarity with Node.js (TypeScript) and Puppeteer will enable me to contribute meaningfully to your project right off the bat. I understand the considerable technical challenges involved in synchronizing a vast multi-variate dataset from a multitude of retailers into a unified format. My expertise in ETL pipelines using advanced regex and LLM-assisted parsing will be pivotal in transforming the unstructured web data into a structured, highly efficient format. Moreover, I'm very familiar with AWS services like Fargate, Lambda, and S3, which is especially valuable as they are a central part of your infrastructure. This strong technical foundation, accompanied by my ability to seamlessly handle distributed state management, makes me uniquely prepared to bring complete efficiency to your $100B construction finishing industry automation project.
$20,000 USD in 90 days
6.9

Greetings,

Are you looking for a single senior engineer or a small dedicated pod to architect and scale the full Data Factory automation layer? Our understanding is that you need a senior-level engineer to design high-concurrency automation clusters using Node.js or Python with Playwright or Puppeteer, simulate complex geo-localized flows, extract landed costs, normalize unstructured data into a unified schema, implement perceptual hashing for product de-duplication, and manage auto-scaling workloads on AWS Fargate or ECS with resilient proxy orchestration.

Our team has strong experience in large-scale browser automation, distributed scraping systems, ETL pipeline design, session persistence, residential proxy management, AWS container orchestration, and high-integrity data synchronization across state-dependent environments. We focus on zero-error data validation and scalable architecture ready for national rollout. We provide 5 months of FREE support along with a long-term collaboration guarantee.

For a quick response and one-on-one communication, please click the chat button, as we are available most of the time. FYI, the current bid amount is a placeholder to submit the proposal. Looking forward to connecting via chat.

Regards, Yasir, LEADconcept

PS: Let me know if you want to see our team's past work to assess our skills/expertise, or references from past customers.
$15,000 USD in 7 days
6.6

Hi, I am Elias from Miami. I’ve reviewed your scope. This isn’t simple scraping — it’s distributed stateful automation at national scale with strict data integrity requirements. I’d approach this as a resilient data factory: high-concurrency Playwright clusters in Node.js (TypeScript), structured session orchestration with geo-aware proxy routing, and controlled extraction of landed-cost components. ETL would normalize into a versioned universal schema, combining deterministic parsing (regex rules) with LLM-assisted edge-case handling. For de-duplication, I’d layer pHash-based perceptual indexing with metadata fingerprinting to reduce false positives. Infrastructure would run containerized on ECS/Fargate with autoscaling tied to queue depth and health metrics.

I have a few points I’d like to clarify:
Q1: Are retailer integrations fully browser-driven, or do some expose structured APIs we can hybridize?
Q2: How are you validating material quantity and freight weight accuracy today — rule-based or reconciliation layer?
Q3: Is there an existing internal schema, or would I be co-designing the Universal Material Schema?

Happy to share relevant automation work and discuss how to structure this for zero-error scale.
$15,000 USD in 7 days
6.7

Hello, As a senior software engineer and the owner of Live Experts, LLC, I am excited to offer my unique set of skills for this project. Firstly, our team is proficient with Node.js (TypeScript) and Python, which are essential for your data synchronization needs. We have significant experience architecting and managing automation with Playwright and Puppeteer, specializing in headless browser frameworks well suited to simulating complex user journeys. Our mastery of AWS (Fargate, Lambda, S3) could be of great value in managing the auto-scaling workloads this project demands. Moreover, our expertise extends beyond traditional engineering boundaries. With a deep understanding of Machine Learning (ML), Deep Learning (DL), and Artificial Intelligence (AI), we can apply these skills to the perceptual-hashing work (OpenCV/pHash) your indexing and data-normalization requirements call for. We believe in continuously challenging ourselves; hence, we stay on top of cutting-edge technologies to deliver dynamic solutions. Finally, it's not just about the technicality but the challenge at hand too. Building a system to support $10MM+ ARR requires meticulous execution like ours at Live Experts. Our zero-error approach to data integrity ensures high uptime against enterprise-grade site architectures. I believe my extensive skill set aligns perfectly with not only the technical demands but also the aspirations of your company. Thanks!
$20,000 USD in 4 days
6.4

Dear ,

We carefully studied the description of your project and can confirm that we understand your needs and are interested in your project. Our team has the necessary resources to start your project as soon as possible and complete it in a very short time. We have been in this business for 25 years, and our technical specialists have strong experience in data processing, cloud computing, Node.js, and other technologies relevant to your project. Please review our profile at https://www.freelancer.com/u/tangramua, where you can find detailed information about our company, our portfolio, and recent client reviews. Please contact us via Freelancer Chat to discuss your project in detail.

Best regards, Sales department, Tangram Canada Inc.
$15,934 USD in 5 days
6.5

Hello! This is the type of high-concurrency, state-aware automation architecture I specialize in. I have built large-scale scraping and synchronization systems using Node.js with TypeScript and Playwright, deployed across containerized AWS clusters with proxy orchestration and distributed job queues.

How I would approach this:

Automation Architecture
• Playwright cluster with context isolation per retailer and region
• Centralized orchestration layer with Redis-backed job queue
• Session fingerprint persistence with geo-scoped proxy routing
• Intelligent retry logic with adaptive backoff

Stateful Flow Simulation
• Cookie-jar persistence per zip-code region
• Tax and freight calculation flows fully simulated through checkout state machines
• Deterministic capture of landed-cost components

ETL and Normalization
• Structured parsing layer using regex pipelines
• LLM-assisted field reconciliation for messy catalog structures
• Canonical Internal Material Schema enforcement with validation guards

Perceptual Indexing
• pHash implementation using OpenCV
• Image deduplication and product clustering
• Confidence scoring to prevent false merges

Infrastructure
• AWS Fargate task autoscaling
• S3 for raw capture archive
• Lambda-based event triggers
• Proxy mesh routing with health monitoring

Available to discuss architecture depth immediately. Best regards, Jasmin
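The "intelligent retry logic with adaptive backoff" piece mentioned above is small but load-bearing in any scraping cluster. A minimal sketch in Python follows; the function name, defaults, and injectable `sleep`/`rng` parameters are illustrative choices, not anything specified in the posting.

```python
import random
import time

def with_backoff(task, max_attempts=5, base_delay=0.5, max_delay=30.0,
                 sleep=time.sleep, rng=random.random):
    """Run `task` with capped exponential backoff and full jitter.

    `sleep` and `rng` are injectable so the policy can be unit-tested
    without real waiting.
    """
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # budget exhausted; surface the last failure
            # Full jitter: wait a random fraction of the capped exponential
            # delay, which spreads retries and avoids thundering-herd spikes.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * delay)
```

In a real cluster this would typically catch only transient errors (timeouts, 429/503 responses) rather than bare `Exception`, and record each retry for observability.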
$15,000 USD in 7 days
6.6

I am a Senior Software Engineer specializing in web automation and data synchronization, perfectly aligning with your project requirements. With extensive experience in Node.js (TypeScript) and cloud infrastructure on AWS, I have successfully designed high-concurrency automation clusters and managed containerized workloads across dynamic environments. My proficiency with Playwright and Puppeteer frameworks positions me well for your complex automation architecture needs. In previous roles, I developed stateful flow simulations for intricate user journeys and implemented ETL pipelines transforming unstructured data into coherent schemas. I possess expertise in perceptual hashing for product de-duplication and have managed auto-scaling solutions on AWS Fargate/ECS with advanced proxy orchestration. My familiarity with OpenCV/pHash optimizes image processing, crucial for ensuring data integrity at a national scale. I am interested in discussing how I can contribute to your $10MM+ ARR goal. Could we explore specific challenges you foresee in state management? I am ready to share my GitHub portfolio demonstrating relevant projects.
$180,000 USD in 60 days
6.6

Hello, We've thoroughly reviewed your project requirements for a Senior Web Automation Engineer and are excited about the opportunity to collaborate. This project aligns perfectly with our expertise in developing high-concurrency automation systems and advanced data synchronization. Previously, we successfully delivered a similar project involving the creation of a national-scale procurement platform, integrating data from numerous retail sources. Our experience with complex ETL pipelines and LLM-assisted parsing was pivotal in transforming vast amounts of unstructured data into a cohesive schema. With over 8 years of experience, we are well-versed in Node.js (TypeScript) and Playwright/Puppeteer, ensuring robust automation architecture. Our background in AWS and advanced proxy management guarantees seamless scaling and resilience. Our portfolio showcases projects where we delivered top-rated results, reinforcing our capability to meet your high standards. Please message us with more details so we can provide a tailored proposal within 24 hours. We're eager to contribute to your project's success. Best regards, Puru Gupta
$20,000 USD in 50 days
6.2

⭐⭐⭐⭐⭐ Senior Software Engineer for Web Automation & Data Synchronization

❇️ Hi My Friend, I hope you're doing well. I reviewed your project requirements and see you're looking for a Senior Web Automation Engineer. Look no further; Zohaib is here to help you! My team has successfully completed 50+ similar projects in web automation and data synchronization. I will create efficient automation architectures and ensure smooth data synchronization across various platforms.

➡️ Why Me? I can easily manage your web automation and data synchronization project, as I have 5 years of experience in Node.js, TypeScript, and Python. My skills include automation design, ETL pipeline creation, and session management. Additionally, I have a strong grip on AWS services, ensuring high performance and reliability for your system.

➡️ Let's have a quick chat to discuss your project in detail, and let me show you samples of my previous work.

➡️ Skills & Experience:
✅ Node.js (TypeScript) ✅ Python ✅ Playwright ✅ Puppeteer ✅ Automation Architecture ✅ ETL Pipelines ✅ Data Normalization ✅ AWS (Fargate, Lambda, S3) ✅ Session Management ✅ Proxy Management ✅ Data Integrity ✅ Image Processing (OpenCV)

Waiting for your response! Best Regards, Zohaib
$12,000 USD in 2 days
6.1

Hello, I’m very interested in the Senior Web Automation Engineer role and the scale of what you’re building. I’ve designed high-concurrency automation systems in Node.js (TypeScript) using Playwright, running containerized workloads on AWS ECS/Fargate, where session persistence, proxy rotation, geo-context simulation, and resilient retry orchestration were critical to maintaining uptime against enterprise-grade sites. In one project involving multi-state pricing aggregation, I built a distributed scraping cluster with queue-based workload partitioning, structured logging, and schema-normalized ETL pipelines that transformed inconsistent web data into a unified internal model with strict validation rules to prevent quantity and pricing drift. I’d be happy to share my GitHub and walk through a prior high-concurrency automation build in detail, including cluster topology, proxy mesh design, and failure-recovery strategy. This is exactly the kind of technically ambitious, data-moat project I enjoy leading. Best regards, Juan
$10,000 USD in 7 days
5.6

I build browser automation systems - BrowserChef is my browser automation tool used by thousands of users. High-concurrency scraping with Playwright and smart proxy rotation is something I've handled for several clients over the years. For this project specifically - geo-localized session management to extract landed costs (price + tax + freight) across 100+ retailers at scale is exactly the kind of distributed data challenge I enjoy. The pHash dedup layer is a nice touch; I've done perceptual hashing for product matching across catalogs before. Stack I'd bring: Node.js/TypeScript + Playwright, AWS Fargate for containerized workers, Redis for session state, residential proxy pools with rotation. ETL normalization via schema + LLM-assisted parsing for the messy edge cases. Happy to share my GitHub and walk through past automation work on a quick call. What's the target scope for the first milestone? - Usama
$12,000 USD in 30 days
5.2

Hi, how are you? I’m a Senior Software Engineer specializing in high-concurrency web automation and distributed data pipelines. I’ve built large-scale scraping and synchronization systems using Node.js (TypeScript) and Python with Playwright, running containerized workloads on AWS (ECS/Fargate, Lambda, S3).

In a recent project, I architected a multi-region automation cluster that synchronized pricing and logistics data across 70+ geo-dependent retail endpoints. I implemented stateful session pools (geo-targeted proxies + persistent cookies), request orchestration with adaptive throttling, and resilient retry queues. Data was normalized into a unified schema using regex + LLM-assisted parsing, with strict validation rules to prevent quantity/weight drift. We also deployed perceptual hashing (pHash via OpenCV) to deduplicate SKUs across vendors, achieving >99.7% data consistency at scale. I’m comfortable designing proxy-mesh strategies, container auto-scaling policies, and zero-error validation pipelines suitable for a $10MM+ ARR growth trajectory.

A few questions:
- What is your current concurrency target (sessions per retailer)?
- Are you already using residential proxy providers?
- How are you validating freight/weight integrity today?
- Is the Universal Material Schema finalized or evolving?

I hope to discuss with you as soon as possible. Best Regards
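The "strict validation rules to prevent quantity/weight drift" idea above can be made concrete with a small guard layer. The following is a hedged sketch: the record fields, the 5% tolerance, and the existence of a trusted catalog weight are all assumptions for illustration, not details from the posting.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LandedCost:
    sku: str
    zip_code: str
    price: float      # unit price, USD
    tax: float        # USD
    freight: float    # USD
    weight_lb: float  # freight weight as scraped

def validate(record, catalog_weight_lb, tol=0.05):
    """Return a list of integrity errors for a scraped record.

    Rejects malformed zip codes, impossible cost components, and freight
    weights that drift more than `tol` from the catalog's known weight.
    """
    errors = []
    if record.price <= 0:
        errors.append("non-positive price")
    if record.tax < 0 or record.freight < 0:
        errors.append("negative tax or freight")
    if not (record.zip_code.isdigit() and len(record.zip_code) == 5):
        errors.append(f"malformed zip code: {record.zip_code!r}")
    drift = abs(record.weight_lb - catalog_weight_lb) / catalog_weight_lb
    if drift > tol:
        errors.append(f"weight drift {drift:.1%} exceeds {tol:.0%}")
    return errors

def landed_cost(record):
    # Landed cost = price + tax + freight, per the posting's definition.
    return round(record.price + record.tax + record.freight, 2)
```

Records with a non-empty error list would be quarantined for reconciliation rather than written to the unified schema.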
$18,000 USD in 90 days
5.4

⭐⭐⭐⭐⭐ CnELIndia, led by Raman Ladhani, can architect and execute your national-scale Data Factory with a zero-error mindset. We propose a phased approach:

- Automation Architecture: Build high-concurrency clusters in Node.js (TypeScript) using Playwright with adaptive request orchestration, geo-aware session persistence, and intelligent residential proxy rotation to ensure resilience across enterprise-grade retailer sites.
- Stateful Flow & Landed Cost Engine: Develop deterministic user-journey simulators to capture price, tax, and LTL freight across 40,000+ ZIP codes, with validation layers to prevent quantity/weight drift.
- Neural ETL Pipeline: Implement regex + LLM-assisted normalization into your Universal Material Schema with strict checksum and anomaly-detection controls.
- Perceptual De-duplication: Integrate OpenCV-based pHash fingerprinting to unify identical SKUs nationally.
- Cloud Scale: Deploy containerized workloads on AWS Fargate/ECS with auto-scaling, S3 data lakes, and monitoring for sustained 50-state throughput.

Our focus: accuracy, concurrency, and scalable distributed state management aligned with your ARR targets.
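A deterministic regex rule of the kind the "regex + LLM-assisted normalization" step describes might look like this. The dimension format, pattern, and output field names are hypothetical examples; a real catalog needs many such rules, with an LLM fallback reserved for the long tail the rules miss.

```python
import re

# Hypothetical rule for a common US finishing-material dimension format,
# e.g. '3 in. x 6 in.' or '11.75" x 12"'. Not from the posting.
DIM_RE = re.compile(
    r"(?P<w>\d+(?:\.\d+)?)\s*(?:in\.?|\")\s*x\s*"
    r"(?P<h>\d+(?:\.\d+)?)\s*(?:in\.?|\")",
    re.IGNORECASE,
)

def normalize_dimensions(raw: str):
    """Extract width/height in inches from a raw listing title, or None.

    Deterministic rules like this run first; titles they cannot parse
    would be routed to an LLM-assisted fallback for reconciliation.
    """
    m = DIM_RE.search(raw)
    if not m:
        return None
    return {"width_in": float(m.group("w")), "height_in": float(m.group("h"))}
```

Keeping the deterministic layer in front of the LLM keeps the common case cheap, reproducible, and auditable.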
$15,000 USD in 7 days
5.3

Hello, I am an experienced Node.js/TypeScript developer with expertise in high-concurrency web automation using Playwright and Puppeteer. I have designed and deployed scalable automation clusters, managing stateful session flows, proxy orchestration, and complex multi-step user simulations for large-scale e-commerce and procurement data pipelines. I can architect ETL pipelines to normalize unstructured web data into a unified schema, implement perceptual hashing (pHash) for deduplication, and ensure zero-error data integrity across multiple geolocations. I am also proficient with AWS (Fargate, Lambda, S3) for containerized workload management and auto-scaling, ensuring high availability and resilience under national-scale operations.

Clarification questions:
- Are there preferred sites or retailers to prioritize for initial automation clusters, or is coverage uniform across all 100+ targets?
- Should the perceptual hashing pipeline operate in real time during extraction, or in batch post-processing?

GitHub portfolio: [Insert link showcasing high-concurrency automation projects]

Thanks, Asif
$20,000 USD in 30 days
5.0

Hi, there! Most automation stacks don’t fail because of scraping. They fail because distributed state, geo variance, and data normalization spiral out of control at scale. When 100+ retailers behave differently across 40,000 zip codes, you need architecture, not scripts. I build high-concurrency Playwright clusters in Node.js with clean task sharding, session persistence, and intelligent proxy orchestration. I would design Fargate-based workers segmented by retailer and geo context, each simulating full cart flows to extract true landed cost including tax and freight. Raw data streams into a hardened ETL layer where I normalize into a strict internal schema with rule-based parsing plus LLM-assisted edge handling. For product de-duplication, I implement pHash fingerprinting with similarity scoring to create a reliable national product-identity layer. Everything is containerized, auto-scaled, logged centrally, and designed for deterministic reconciliation. I think in systems that survive scale and revenue pressure, not scraping demos. Happy to share GitHub examples of high-concurrency automation and multi-state synchronization builds.
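The "clean task sharding" idea, with workers segmented by retailer and geo context, can be sketched as a deterministic hash assignment. This is an illustrative sketch only; the function name, key format, and retailer/region values are assumptions, and a production system would layer rebalancing and health-aware routing on top.

```python
import hashlib

def shard_for(retailer: str, region: str, num_workers: int) -> int:
    """Deterministically map a (retailer, region) unit of work to a worker shard.

    Hash-based assignment pins each retailer+geo context to one worker, so
    session state, cookies, and proxy affinity stay local to a single
    container instead of bouncing between cluster nodes.
    """
    key = f"{retailer}:{region}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_workers
```

Because the mapping depends only on the key and worker count, any orchestrator node can route a job without consulting shared state; changing `num_workers` reshuffles assignments, which is where consistent hashing would come in for smoother scaling.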
$10,000 USD in 30 days
4.7

Greetings! I’m a top-rated freelancer with 16+ years of experience and a portfolio of 750+ satisfied clients. I specialize in delivering high-quality, professional web automation and data synchronizing services tailored to your unique needs. Please feel free to message me to discuss your project and review my portfolio. I’d love to help bring your ideas to life! Looking forward to collaborating with you! Best regards, Revival
$10,000 USD in 30 days
4.1

With over a decade of experience navigating the intricate terrain of web automation and data synchronization, my company, WellSpring Infotech, comes armed with an arsenal of skills that ensures the success of projects just like yours. We understand that your goal is to scale your core "Data Factory", and that is exactly what we are known for: scaling! One need central to your project is a "Zero-Error" approach to data integrity, specifically ensuring that material quantities and freight weights remain accurate across 40,000+ US zip codes; this is where our technical expertise shines. We implement robust, fluid ETL pipelines that transform unstructured web data into your Internal Universal Material Schema using advanced regex and LLM-assisted parsing. Furthermore, at WellSpring Infotech, we are passionate not only about technology but also about fostering lasting partnerships. Beyond consistently delivering results that exceed expectations, we work with an eye toward the future. Your project is set to grow at a rapid pace, and we are already excited about the possibilities it portends. As such, we welcome milestone-based compensation with a clear path to long-term leadership as the team scales; we are ready to be part of your success story! Take a look at our track record on our GitHub profile, and let's get started on building a proprietary data moat for your venture-backed project! Thank you!
$15,000 USD in 7 days
4.2

Bethesda, United States
Payment method verified
Member since Mar 2, 2026