
Python Script to Download

$25-50 USD / hour

Closed
Posted almost 10 years ago


I have a set of URLs that look like this: [login to view URL] (which redirects to [login to view URL]). The valid values for file are not sequential. I have a list of valid ones (in the millions), which I can prepare as a text file, a SQLite database, or whatever we agree on. The idea is to run the script with multiple threads: download 10,000 files, zip them, download another 10,000, zip them, and so on until the list is exhausted. :-) I want to get this done ASAP.
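
As a rough illustration of the workflow described above, the sketch below splits a list of IDs into groups of 10,000, downloads each group with a thread pool, and writes each group to its own zip archive. The `fetch` callable, the `batch_NNNNN.zip` naming, and the worker count are assumptions made for illustration; the real URL pattern is hidden in the posting, so the caller has to supply it. Error handling is left out: if `fetch` raises, the exception propagates and the current archive is left incomplete.

```python
import zipfile
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from pathlib import Path


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def download_in_batches(ids, fetch, out_dir, batch_size=10_000, workers=16):
    """Download every id with `fetch(id) -> bytes`, one zip archive per batch.

    `fetch` is a hypothetical caller-supplied function (for example a thin
    wrapper around urllib.request.urlopen); it is not part of the posting.
    Returns the list of archive paths written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for n, batch in enumerate(batched(ids, batch_size), start=1):
            # pool.map preserves input order, so ids and payloads line up.
            payloads = pool.map(fetch, batch)
            archive = out_dir / f"batch_{n:05d}.zip"
            with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
                for file_id, data in zip(batch, payloads):
                    zf.writestr(str(file_id), data)
            archives.append(archive)
    return archives
```

Because `batched` reads the list lazily, the IDs could come from a text file opened line by line rather than being loaded into memory all at once.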
Project ID: 6025530

About the project

23 proposals
Remote project
Active 10 years ago

23 freelancers are bidding on average $30 USD/hour for this job
I am an expert in delivering custom scripts and willing to discuss further about the project specifications.
$52 USD in 3 days
4.9 (46 reviews)
6.3
Hello, This looks like a very interesting project. I have extensive experience with HTTP libraries under Python (I have used urllib2, tornado, requests, mechanize) and with threaded programming (especially the multiprocessing module). If the number of files is in the millions we must first establish how many simultaneous requests you are allowed to make to that specific domain. If that number is in the tens then thread pools (or better yet process pools because of the Python GIL) will probably work. However if the number is in the hundreds the asynchronous HTTP client provided by tornado is a much more elegant and faster solution. If the number of simultaneous requests is in the thousands a solution based on combining tornado and multiprocessing is also a good idea. Can you give me some details about the files? How large are they usually; are they all PDF files? I'm asking because I want to get an idea about the compression level that can be achieved and the disk space needed for this project. If you can provide a few URLs so I can get a better idea about this project I will run a few tests and let you know my opinions; afterwards if you are happy with the results we can continue the collaboration on this project. Thank you for your time! Best wishes, Ionut
$30 USD in 16 days
5.0 (25 reviews)
5.5
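
The point in the bid above about capping simultaneous requests can be sketched with a semaphore. This example uses the standard library's asyncio instead of the tornado client the bidder mentions, and `fetch` is a hypothetical coroutine function supplied by the caller, so treat it as an illustration of the idea rather than the bidder's actual approach.

```python
import asyncio


async def fetch_all(ids, fetch, max_concurrent=100):
    """Run `fetch(id)` for every id with at most `max_concurrent` in flight.

    `fetch` is a hypothetical coroutine function supplied by the caller.
    Returns a dict mapping id -> result, or id -> the exception raised.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(file_id):
        # The semaphore blocks here once `max_concurrent` fetches are running.
        async with semaphore:
            try:
                return file_id, await fetch(file_id)
            except Exception as exc:  # keep going; report the failure instead
                return file_id, exc

    pairs = await asyncio.gather(*(bounded(i) for i in ids))
    return dict(pairs)
```

A caller would run it with `asyncio.run(fetch_all(ids, fetch, max_concurrent=50))`, choosing the limit to match whatever the target server tolerates.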
I am an expert in the skills this project requires and have already done similar tasks. Please get back to me so I can show you some of my work. I will not ask for anything upfront; pay me only when you are satisfied with the progress.
$38 USD in 40 days
3.0 (61 reviews)
7.3
Hi there! I have done this type of work before. If you could, please message me with more information, such as the URL file list. Thanks! -Steve
$25 USD in 3 days
4.7 (9 reviews)
4.2
Hello, I can write this script for you today. It will split the list into smaller lists of 10,000 entries each. It will then download the first 10,000 files to folder1, zip folder1, delete the folder, start again with folder2, zip folder2, and so on until the end of the list. SQLite might be better if there are millions of entries. Please send me at least a few links to test with and I'll send you the script to try out, just to be sure that I can write it for you. Best Regards, Constantin Plaiasu
$33 USD in 3 days
5.0 (15 reviews)
4.1
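
The zip-then-delete step from the bid above might look like the following minimal sketch. The function name and the choice to place the archive next to the folder are assumptions; it relies only on `shutil.make_archive` and `shutil.rmtree` from the standard library.

```python
import shutil
from pathlib import Path


def archive_and_remove(folder):
    """Zip `folder` into a sibling `<folder>.zip`, then delete the folder.

    Returns the path of the archive that was written.
    """
    folder = Path(folder).resolve()
    # make_archive appends ".zip" to the base name itself; the archive is
    # created next to the folder, so it never ends up inside its own input.
    archive = shutil.make_archive(str(folder), "zip", root_dir=str(folder))
    shutil.rmtree(folder)
    return Path(archive)
```

Calling `archive_and_remove("folder1")` after a batch finishes would leave `folder1.zip` on disk and free the space used by the unpacked files.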
Hi, starting from the text file is enough. Implementing parallel downloads with Python is straightforward. I would start multiple workers, each taking its own set of 10,000 files to download. This way you will finish as fast as possible, and this solution needs only Python standard modules, so it runs on most installations. I think the total work will take roughly two hours, so you should get results quickly. Regards from Berlin, Guido
$25 USD in 5 days
5.0 (2 reviews)
3.6
Hi, I have such a script ready to go. My work would be to test it on the first 50 files, then schedule it and check periodically how far along we are. Some things to think about:
- Are the files downloaded from a single site, so that a firewall might block us after xxxx requests?
- Do you have an FTP server ready to receive the files from the batches?
- This might run for a few days. I need to spend a few hours on this, but I can give a precise estimate during the first runs.
I would use Python for this and have a server ready that can handle it! Dirk
$40 USD in 8 days
5.0 (1 review)
2.3
Hello, shouldn't be a problem. What will be done with the urls that failed to download? Written to another file/database/whatever to try later? Any authentication necessary? KR, Oliver
$27 USD in 3 days
5.0 (1 review)
1.7
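
One possible answer to the question in the bid above about failed downloads is to retry each one a few times and write the IDs that never succeed to a file for a later pass. The sketch below assumes a hypothetical `fetch(id)` callable that raises `OSError` on network failure (`urllib.error.URLError` and socket timeouts are both `OSError` subclasses); the function names and the retry count are illustrative.

```python
def fetch_with_retries(ids, fetch, attempts=3):
    """Call `fetch(id)` up to `attempts` times per id.

    Returns (results, failed): a dict of id -> payload for the successes and
    a list of ids that never succeeded.
    """
    results, failed = {}, []
    for file_id in ids:
        for _ in range(attempts):
            try:
                results[file_id] = fetch(file_id)
                break
            except OSError:
                continue  # try again until the attempts run out
        else:
            # The inner loop finished without a break: every attempt failed.
            failed.append(file_id)
    return results, failed


def save_failed(failed, path):
    """Write one failed id per line so a later run can read them back."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.writelines(f"{file_id}\n" for file_id in failed)
```

The retry file has the same one-ID-per-line shape as a plain text input list, so it could be fed straight back into the downloader.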
I am highly interested in this project. I can complete it to your expectations and within the deadline. I have extensive experience with the technologies you mentioned, and I am open to meeting your budget and deadline.
$28 USD in 3 days
5.0 (2 reviews)
1.8
Hi, I have experience in this type of work. Kindly share the details so that we can start. Thanks in advance, Karthi palanisamy
$25 USD in 3 days
4.7 (2 reviews)
1.0
Hi, I am a Perl expert. I do web scraping and have scraped websites such as Amazon. The same thing can be achieved in Perl too. It will be a command-line tool and faster. Just drop me a message for further discussion. Regards, Nithin
$25 USD in 3 days
0.0 (0 reviews)
0.0
I've worked with Python on a lot of projects, but I have never gotten a project from here. I think I can do this. I've worked with Python on a Linux distribution (Mandriva Linux) and for the city council of the city where I live (a database migration from Lotus Notes to PostgreSQL, using Python). I hope I can help you. Thanks in advance.
$33 USD in 3 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$25 USD in 3 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$27 USD in 3 days
0.0 (0 reviews)
0.0
Sir, I already have a Python script for automatically downloading URLs (for scraping an e-book site; the links are listed in a file). I can modify it to fit your needs ASAP. Thank you :-)
$27 USD in 3 days
0.0 (1 review)
0.0
I am a computer engineer. If accepted, this will be my first job on Freelancer. But trust me when I say that I know what I am doing and that your project is very simple to complete. It will take me less than an hour. The work will be good enough that you will surely be compelled to leave a positive review. Thank you.
$25 USD in 1 day
0.0 (0 reviews)
0.0
I have a script that does the same work. This would be my first job on freelancer.com, so I am asking for very little pay.
$25 USD in 2 days
0.0 (0 reviews)
0.0
This should be a very quick project. I could easily have it done by Friday. I have experience with Python and threading.
$27 USD in 5 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$25 USD in 3 days
0.0 (0 reviews)
0.0
Hello, I have 9+ years of experience in Perl, Python, C, C++, and C++11. I have worked at reputed companies such as IBM and Alcatel-Lucent. I wish to take up this project and deliver it on time. Thanks, Nagaraj
$41 USD in 3 days
0.0 (0 reviews)
0.0

About the client

Evansville, United States
5.0
2
Payment method verified
Member since May 17, 2014

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)