Web scraping with Python, running 24/7 with concurrent threads every x minutes

In progress. Posted 6 years ago. Paid on delivery.

I spent 1 hour writing this scope. Please spend 5 minutes reading it, then apply for the job.

I need price information from n different pages at the same time, every x milliseconds (around 30 minutes).

It needs to run 24/7 as a Windows service or as a Docker service in the cloud (preferred).

Scraping of all pages listed in the XML must use concurrent (threaded) requests, so that the data is captured at the same point in time.
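
A minimal sketch of the concurrent requests described above, using only the Python standard library. The names `fetch_page` and `scrape_all` and the shape of the `pages` dictionary are assumptions made for illustration; they do not come from the attached XML.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_page(url, timeout=10):
    """Download one page and return its body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_all(pages, fetch=fetch_page, max_workers=10):
    """Fetch every page concurrently; return {page_id: body or None}.

    `pages` is a hypothetical {page_id: url} mapping built from the XML.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all requests up front so they start at nearly the same moment.
        futures = {page_id: pool.submit(fetch, url) for page_id, url in pages.items()}
        for page_id, future in futures.items():
            try:
                results[page_id] = future.result()
            except Exception:
                # One failed page must not block the others; the caller
                # can log the failure and fall back to the default of 0.
                results[page_id] = None
    return results
```

With `max_workers=10` the pool covers the stated upper limit of 10 pages, so no request has to wait for a free thread.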

After all data has been collected, run the calculations and then save all values.

I need a web service feed that serves this information so it can be checked in Power BI or QlikView.

I think 3 services will be enough. Service credentials will be in the XML file.

A possible diagram is attached.

Save every action to a log file named with the related date, so that errors can be traced.
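
One possible shape for such a feed, sketched with the standard library only: a small HTTP endpoint that returns the saved values as JSON. The table name `results`, its columns, and the file name `prices.db` are assumptions, and authentication with the credentials from the XML is left out of this sketch.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_PATH = "prices.db"  # hypothetical SQLite file name


def values_as_json(conn):
    """Return all saved rows as a JSON array of objects."""
    rows = conn.execute(
        "SELECT item_id, value, captured_at FROM results ORDER BY captured_at"
    ).fetchall()
    return json.dumps(
        [{"item_id": r[0], "value": r[1], "captured_at": r[2]} for r in rows]
    )


class FeedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Open a fresh connection per request; SQLite connections
        # should not be shared across threads by default.
        conn = sqlite3.connect(DB_PATH)
        try:
            body = values_as_json(conn).encode("utf-8")
        finally:
            conn.close()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve the feed until the process is stopped."""
    HTTPServer(("0.0.0.0", port), FeedHandler).serve_forever()
```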
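
A rough sketch of date-named logging with Python's `logging` module. The directory `logs`, the file name pattern `scraper-YYYY-MM-DD.log`, and the helper name `get_daily_logger` are assumptions; the real values would come from the XML config.

```python
import logging
import os
from datetime import date


def get_daily_logger(log_dir="logs", today=None):
    """Return a logger that writes to a file named after the given date."""
    today = today or date.today()
    os.makedirs(log_dir, exist_ok=True)
    path = os.path.join(log_dir, f"scraper-{today.isoformat()}.log")
    logger = logging.getLogger(f"scraper.{today.isoformat()}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

The standard library's `logging.handlers.TimedRotatingFileHandler` is another option if a single long-running logger should roll over to a new file at midnight.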

1- You will get the config of the running process from the XML file (time, log, mails...).

2- You will get the pages, XPath and regex from the XML file.

3- You will get the calculations from the XML.

4- You will get the warning conditions from the XML.

5- SQLite is enough for the DB, but you may use another one if you prefer it for better results.

6- I wish to run this project on Amazon Web Services, but if that's not possible I will provide a VM for the setup.
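
To illustrate points 1 to 3, here is a hedged sketch of reading such a file with `xml.etree.ElementTree`. The element and attribute names in `SAMPLE_CONFIG` are invented for this example and will differ from the attached XML. ElementTree only supports a subset of XPath, so applying the configured XPath to real HTML pages would likely need a library such as lxml.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical layout; the real attached XML will look different.
SAMPLE_CONFIG = r"""
<config>
  <process interval_minutes="5" log_dir="logs"/>
  <pages>
    <page id="p1" url="https://example.com/a"
          xpath=".//span[@id='price']" regex="([0-9]+(?:\.[0-9]+)?)"/>
  </pages>
  <calculations>
    <calc id="calc1" formula="10*5"/>
    <calc id="calc2" formula="calc1*5"/>
  </calculations>
</config>
"""


def load_config(xml_text):
    """Parse the config into plain Python structures."""
    root = ET.fromstring(xml_text.strip())
    process = root.find("process")
    return {
        "interval_minutes": int(process.get("interval_minutes")),
        "pages": [
            {
                "id": p.get("id"),
                "url": p.get("url"),
                "xpath": p.get("xpath"),
                "regex": re.compile(p.get("regex")),
            }
            for p in root.iter("page")
        ],
        # Document order is preserved, which matters for calculations.
        "calculations": [(c.get("id"), c.get("formula")) for c in root.iter("calc")],
    }


def extract_number(text, pattern):
    """Apply the configured regex; return group 1 as a float, or 0.0."""
    match = pattern.search(text)
    if not match:
        return 0.0
    return float(match.group(1))
```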

General assumptions

- The number of pages won't be more than 10, probably 4, but it must be capable of handling 10 (I mention this because of the threading).

- The minimum interval x will be 5 minutes, so you have 5 minutes to calculate.

- There won't be more than 10 calculations.

- All scraped values will be numeric, mostly money.

- All values default to 0. If a value can't be calculated for any reason, log why and where it failed (the XML id) and set it to 0.

- Calculations are done in the order they appear in the XML. They do not run in concurrent threads; they run after all concurrent page scraping has finished.

- Calculations can use other calculation IDs in their formulas. For example, the calc1 formula can be 10x5 and the calc2 formula can be calc1*5, and I need to get 250. If calc1 has not been calculated yet because of the XML order, the calc2 result will be 0, but you need to write that problem to the log file as "calc1 was null - calc2 couldn't be calculated".
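
The ordering rule in the last bullet can be sketched as follows. This is one possible approach, not a required design: it assumes formulas use `*` for multiplication (rather than the `10x5` notation above) and are limited to `+`, `-`, `*`, `/` and parentheses. The names `run_calculations` and `_eval` are illustrative.

```python
import ast
import logging
import operator

log = logging.getLogger("calc")

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node, values):
    """Evaluate a small arithmetic expression tree without using eval()."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        if values.get(node.id) is None:
            raise LookupError(f"{node.id} was null")
        return values[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left, values), _eval(node.right, values))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, values)
    raise ValueError("unsupported expression")


def run_calculations(calcs, scraped):
    """Evaluate (calc_id, formula) pairs in XML order; failures become 0."""
    values = dict(scraped)  # scraped page values are usable by their ids
    results = {}
    for calc_id, formula in calcs:
        try:
            result = _eval(ast.parse(formula, mode="eval").body, values)
        except Exception as exc:
            # e.g. "calc1 was null - calc2 couldn't be calculated"
            log.error("%s - %s couldn't be calculated", exc, calc_id)
            result = 0
        else:
            values[calc_id] = result  # only successful results can be referenced
        results[calc_id] = result
    return results
```

In this sketch a failed calculation is saved as 0 but is not made available to later formulas, so anything that depends on it also fails and is logged. Treating the 0 as a usable input instead would be an equally valid reading of the requirement.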

Alert logic

The alert logic is easy.

If this is an SMS-type alert, find the admins from ops (such as admin) and look up each one's GSM number. If you can't find it, skip that admin but log it.

Call the SMS web page, using the details under the SMS children in the XML, with this formula:

//

{URL}?GSMNO={findadminGSMnumber}&text={GETALERTTEXT}&{child}={child/text()}....

Currently it is:

[login to view URL];123123&text=balance%20is%20low&user=tako&pass=mako&title=POAS&attr1=123&attr2=345

You will make GET requests.
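
As a sketch of the URL formula above, the query string can be built with `urllib.parse`. The base URL in the usage example is a placeholder, since the real one is not shown in this posting, and the function names are illustrative.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_sms_url(base_url, gsm_no, text, extra_params):
    """Build {URL}?GSMNO=...&text=...&{child}={child text}... for a GET request."""
    params = {"GSMNO": gsm_no, "text": text}
    params.update(extra_params)  # one entry per SMS child element in the XML
    # quote_via=quote encodes spaces as %20, as in the example URL above.
    return base_url + "?" + urlencode(params, quote_via=quote)


def send_sms(base_url, gsm_no, text, extra_params, timeout=10):
    """Send the alert with a GET request and return the HTTP status code."""
    url = build_sms_url(base_url, gsm_no, text, extra_params)
    with urlopen(url, timeout=timeout) as response:
        return response.status
```

For example, `build_sms_url("https://sms.example/send", "123123", "balance is low", {"user": "tako"})` yields `https://sms.example/send?GSMNO=123123&text=balance%20is%20low&user=tako`.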

If it's an email-type alert, get the email address and send the mail. The title (subject) will be the same as the text.

You will use the SMTP details from the config.

I need an XML configuration file like the one attached.
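
A minimal sketch of the email alert using `smtplib` and `email.message` from the standard library. The function names are illustrative, and the use of STARTTLS with a login is an assumption about the SMTP server; the host, port and credentials would come from the config.

```python
import smtplib
from email.message import EmailMessage


def build_alert_email(alert_text, sender, recipient):
    """Create a message whose subject is the same as the alert text."""
    msg = EmailMessage()
    msg["Subject"] = alert_text
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(alert_text)
    return msg


def send_alert_email(msg, host, port, user, password):
    """Send the message; assumes the server supports STARTTLS and login."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)
```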

latest update ---

- I want to get the BTC price from 4 sites every 5 minutes. Some of them have web services.
- I want to get the exchange rate of a currency (I will tell you which one later).
- I want to make some calculations (I will share an Excel file with you).
- I want to see a dashboard screen showing that day's highest price difference, the all-time highest price difference, and the price difference by hour (chart example: https://google-developers.appspot.com/chart/interactive/docs/gallery/piechart ).
- I want to see the full history in a data grid (component example: https://datatables.net ).
- I want this project to run as a service.
- I want to view this dashboard and history data grid from anywhere, behind a user and password login.

If possible I want to host this on Amazon services, but their IPs are mostly blocked, so maybe we can run it on a local VM.
All source code will be shared; everything must be editable by someone with coding knowledge.
I don't want anything in a black box.
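
The dashboard figures could be derived along these lines. This sketch assumes that "price difference" means the highest price minus the lowest price across the sites at one capture time; the real formula may differ once the Excel file is shared. The row layout and function names are also assumptions.

```python
from collections import defaultdict
from datetime import datetime


def spread_by_capture(rows):
    """rows: (captured_at, source, price) tuples. Return {captured_at: max - min}."""
    prices = defaultdict(list)
    for captured_at, _source, price in rows:
        prices[captured_at].append(price)
    return {ts: max(p) - min(p) for ts, p in prices.items()}


def highest_spread_by_hour(rows):
    """Return {hour_start: highest spread seen within that hour}."""
    by_hour = {}
    for ts, spread in spread_by_capture(rows).items():
        hour = ts.replace(minute=0, second=0, microsecond=0)
        by_hour[hour] = max(spread, by_hour.get(hour, 0.0))
    return by_hour
```

The highest difference of the day, or of all time, is then the maximum of the per-hour values over the chosen date range.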

Amazon Web Services, Data Mining, Python, RESTful, Web Scraping

Project ID: #15896898

About the project

19 bids. Remote project. Active 6 years ago.