Scenario: The website has 100 categories, with roughly 50 pages in each category. On each page, I'll be collecting names. I have already built a working system that gets the users from the first page. I need help looping it to the next page, and then to the next category once the pages for that category run out. From there, all names should be saved to a single column of a .csv file. Then we can remove the duplicates.
category = 1
page = 1
web = ("wget [url removed, login to view]" + category + page)
output = [url removed, login to view](web)
names = [url removed, login to view]('name=')[1:]
for i in names:
    print([url removed, login to view]('">'))
if [url removed, login to view]("[url removed, login to view]") == -1:
print "Set loop to go to next page"
print "End of category pages, continue to next category.."
^ Sample code. It will not work as is, but it gives you the general idea.
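The flow described above (every page of every category, then one CSV column, then duplicates removed) could be sketched roughly as follows. This is only an illustration under assumptions: `fetch`, `next_marker`, `extract_names`, `scrape_all`, `dedupe`, and `save_names` are hypothetical names, the real URL and the text that marks a "next page" link are not shown in the posting, and the `name=` / `">` parsing is a guess based on the sample.

```python
import csv


def extract_names(html):
    # Assumption: each name sits between 'name=' and '">' in the page source.
    chunks = html.split('name=')[1:]
    return [chunk.split('">')[0] for chunk in chunks]


def scrape_all(fetch, categories, max_pages, next_marker):
    # fetch(category, page) is a hypothetical callable returning the page HTML;
    # in practice it would wrap whatever download step already works.
    names = []
    for category in categories:
        for page in range(1, max_pages + 1):
            html = fetch(category, page)
            names.extend(extract_names(html))
            # Assumption: a page with no next-page marker is the last one
            # in its category, so move on to the next category.
            if next_marker not in html:
                break
    return names


def dedupe(names):
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(names))


def save_names(names, path):
    # Write the names as a single CSV column with a header row.
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['name'])
        writer.writerows([name] for name in names)
```

Passing the download step in as `fetch` keeps the looping logic separate from the site-specific request, so the loop can be tried against canned pages before it is pointed at the real site.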
I need someone who is proficient in Python. If you can multithread this to make it faster, let me know as well.
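One possible way to multithread the job is to scrape each category in its own worker thread, since categories are independent of one another. The sketch below uses the standard library's `concurrent.futures.ThreadPoolExecutor`. As before, `fetch`, `next_marker`, `scrape_category`, `scrape_parallel`, and the worker count are hypothetical, and the parsing is a guess based on the sample.

```python
from concurrent.futures import ThreadPoolExecutor


def scrape_category(fetch, category, max_pages, next_marker):
    # Walk the pages of one category until no next-page marker is found.
    names = []
    for page in range(1, max_pages + 1):
        html = fetch(category, page)  # hypothetical download callable
        for chunk in html.split('name=')[1:]:
            # Assumption: the name ends at the first '">'.
            names.append(chunk.split('">')[0])
        if next_marker not in html:
            break
    return names


def scrape_parallel(fetch, categories, max_pages, next_marker, workers=8):
    # Run one category per task; map() yields results in category order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda category: scrape_category(
                fetch, category, max_pages, next_marker),
            categories,
        )
        return [name for names in results for name in names]
```

Threads suit this kind of work because most of the time is spent waiting on the network. The worker count should stay modest so the target site is not flooded with requests.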
22 freelancers are bidding an average of $34 for this job
I can loop through the pages and can also make a multithreaded version. But I have to look at what you already have and check out the target website for this task.
Hey there, I am willing to work on this project. We can discuss the details over chat. I can start work right away. Thank you for considering my proposal.