Python script for crawling API stops for some reason - make suggestions for improvement

  • Status: Closed
  • Prize: $20
  • Entries received: 4
  • Winner: marioada

Contest Instructions

Dear all,

We're using the script below to make requests through the crawling provider [login to view URL] (the documentation can be accessed here after creating a free account: [login to view URL]).

The script works well in general, but one problem remains: it simply stops working from time to time, sometimes after successfully crawling a couple of hundred URLs, sometimes only after a couple of thousand. We can't get it stable enough to crawl tens of thousands of URLs.

Please make suggestions right in the code, including a comment that describes why you made each change. We'll then test it and award the prize if the change brings the desired result.

Looking forward to your contributions!
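
One common reason a long-running crawl loop stops without any error is an HTTP request made without a timeout, which can block indefinitely on a stalled connection; another is an unhandled network exception that ends the loop. Since the script itself is not reproduced here, the following is only a minimal sketch under those assumptions, using the Python standard library. The function name `fetch_with_retries` and all of its parameters are hypothetical and are not part of the provider's API.

```python
import http.client
import time
import urllib.request


def fetch_with_retries(url, fetch=None, max_attempts=4, timeout=30.0,
                       base_delay=1.0, sleep=time.sleep):
    """Return the response body as bytes, or None if every attempt fails."""
    if fetch is None:
        def fetch(target):
            # An explicit timeout keeps a stalled connection from
            # blocking the whole crawl forever.
            with urllib.request.urlopen(target, timeout=timeout) as response:
                return response.read()

    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except (OSError, http.client.HTTPException) as exc:
            # OSError covers timeouts, connection resets and urllib's
            # URLError/HTTPError; HTTPException covers truncated responses.
            if attempt == max_attempts:
                print(f"giving up on {url} after {attempt} attempts: {exc!r}")
                return None
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return None
```

In a crawl loop, a `None` result would be logged and skipped so that one bad URL does not end the whole run. The `fetch` and `sleep` parameters exist only so the retry logic can be exercised without network access.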

Employer Feedback

“Mario is a great guy and a pleasure to work with!”

thomasjohn6, Germany.

Public Clarification Board

  • imo581
    • 3 weeks ago

    I tried your script with some links. The API responds with status code 403 Forbidden. When I called the API from a browser, it returned this message: "Token is invalid or account is temporarily blocked! please login to your dashboard for more details". Is something wrong with your subscription?

    1. thomasjohn6
      Contest Holder
      • 3 weeks ago

      Hello Islam, thanks for your interest in the contest! For somewhat obvious reasons, I removed the real token from the script before posting it in public :-)

  • busygayan
    • 3 weeks ago

    It literally makes no sense for you to pay a third-party service, and their prices are pretty high.

    Why don't you create your own tiny system that can get this done?
    It's nothing complicated.

    1. busygayan
      • 3 weeks ago

      So 40 bucks, plus you need a server that can handle 50K plain requests per hour? So, to answer the question:

      Proxy crawl cost - 2,500 USD (basic, not JavaScript)
      Custom approach cost - less than 400 USD (with a 64 GB / 16 vCPU server)

      JavaScript-based crawl on proxy crawl - $5,054.90
      Custom approach cost - less than 1,000 USD (192 GB of RAM, 32 vCPU server)

      Besides all that, the code is custom, it's transparent, and debugging is much easier.
      Your data is private.

    2. busygayan
      • 3 weeks ago

      I have a bot that crawls Facebook daily with over 1,000 concurrent accounts. It is custom coded using Selenium with Python, and I make over 100 requests each second (each request has its own unique IP / proxy). Still, I spend only around 2,000 a month.

      This makes no sense, and the customer is technically being ripped off, paying almost 5x the amount. The customer is still stuck having to debug his own code; I'm not even going to go into why the code fails. You could pay a couple of engineers a salary and have your own servers maintained with zero issues for the amount that you spend on this company. Even if you're doing this on a small scale, it makes no sense.

      High cohesion is not bad at all, that's basically my point.
      Good luck.

  • thomasjohn6
    Contest Holder
    • 3 weeks ago

    Thanks for your comment! However, for now we would like to keep the convenience of such a provider; maybe later we will do it on our own. So do you have any idea what the problem in the script could be? Thanks in advance!

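
The 403 Forbidden response reported in this thread is a case where retrying cannot help: the token is rejected, so every later request would fail the same way. A minimal sketch of separating such responses from transient ones follows. The names `Action` and `classify_status` are hypothetical, and the mapping reflects general HTTP status semantics, not documented behaviour of the crawling provider.

```python
from enum import Enum


class Action(Enum):
    OK = "ok"        # use the response
    RETRY = "retry"  # transient failure, try again after a delay
    ABORT = "abort"  # credentials rejected, stop the whole crawl
    SKIP = "skip"    # this URL will not succeed, move on


def classify_status(status: int) -> Action:
    """Map an HTTP status code to what the crawl loop should do next."""
    if 200 <= status < 300:
        return Action.OK
    if status in (401, 403):
        # Invalid or blocked token: further requests would fail too.
        return Action.ABORT
    if status == 429 or 500 <= status < 600:
        # Rate limiting or server-side trouble is usually temporary.
        return Action.RETRY
    return Action.SKIP
```

Stopping with a clear message on `Action.ABORT` makes a blocked account visible immediately, instead of looking like a crawl that silently stopped.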
