Algorithm Optimization 3


• We will provide one dataset with one target variable (“Score”), a timestamp, and 24 independent variables. The dataset contains ~55 thousand observations; however, your solution should scale to a much larger dataset.

• The goal is to write at most 6 sets of “greater than” and “less than” restrictions on the independent variables. Each set of restrictions will return a subsample of the dataset on which we evaluate an objective function.

• Specifically, the objective function is the sum of the target of the observations in the selected subsample. Each query (set of restrictions) has to return at least 10 valid responses.

• In addition, any observations that come less than 60 seconds after a valid observation in this subsample will be removed. So each query has to return at least 10 responses that are 60 seconds apart from each other.

• In other words, your goal in this project is to corner up to 6 regions of the dataset using intervals on the independent variables, and to maximize the density of positive values of the target.
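The rules above can be sketched as a small scoring function. This is only an illustration under stated assumptions, not the employer's evaluation code: the function name `evaluate_query`, the row layout (a list of dicts with `timestamp`, `Score` and variable keys) and the `bounds` format are hypothetical, and an observation exactly 60 seconds after the last kept one is assumed to count as valid.

```python
from datetime import datetime, timedelta

def evaluate_query(rows, bounds, min_gap=timedelta(seconds=60), min_count=10):
    """Score one query (one set of restrictions).

    rows   : list of dicts with "timestamp" (datetime), "Score" and variable keys
    bounds : {variable: (low, high)}; the row must satisfy low < x < high,
             and either side may be None to leave it unrestricted
    Returns the sum of Score over valid observations, or None if fewer
    than min_count valid observations remain.
    """
    def matches(row):
        for var, (low, high) in bounds.items():
            x = row[var]
            if low is not None and not x > low:    # "greater than" restriction
                return False
            if high is not None and not x < high:  # "less than" restriction
                return False
        return True

    # Order the selected subsample chronologically before applying the 60 s rule.
    selected = sorted((r for r in rows if matches(r)), key=lambda r: r["timestamp"])

    kept, last = [], None
    for r in selected:
        # Drop anything less than min_gap after the last valid observation.
        if last is None or r["timestamp"] - last >= min_gap:
            kept.append(r)
            last = r["timestamp"]

    if len(kept) < min_count:
        return None
    return sum(r["Score"] for r in kept)

# Tiny illustration with made-up data: one observation every 30 seconds.
start = datetime(2020, 1, 1)
demo = [{"timestamp": start + timedelta(seconds=30 * i), "Score": 1.0, "W1R1": 0.5}
        for i in range(25)]
print(evaluate_query(demo, {"W1R1": (0.0, 1.0)}))
```

Under this reading, the total objective would be the sum of `evaluate_query` over at most 6 such queries.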


• You can find the dataset in the Excel file “[url removed, login to view]”.

• You will see that some variables come in a version A and a version B (for instance W2 R2). In such cases you can use one or the other, but not both.

• Your restrictions cannot use more decimal places than occur in the observations. For instance, a restriction on W1R1 cannot be 0.015; it must be either 0.01 or 0.02.
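One possible way to respect the decimal-place rule is to infer the precision from the observed values and then snap any candidate threshold to its two nearest allowed values. The sketch below is an assumption-laden illustration: the helper names `decimal_places` and `allowed_neighbours` are hypothetical, and it assumes the observations are stored without floating-point noise (for example 0.01, not 0.010000000001).

```python
from decimal import Decimal, ROUND_FLOOR, ROUND_CEILING

def decimal_places(values):
    """Largest number of decimal places among the observed values."""
    return max(
        max(0, -Decimal(str(v)).normalize().as_tuple().exponent)
        for v in values
    )

def allowed_neighbours(threshold, places):
    """Return the nearest allowed thresholds at or below and at or above
    `threshold`, given the number of decimal places in the data."""
    step = Decimal(1).scaleb(-places)  # places=2 gives Decimal('0.01')
    t = Decimal(str(threshold))
    return (
        float(t.quantize(step, rounding=ROUND_FLOOR)),
        float(t.quantize(step, rounding=ROUND_CEILING)),
    )

# Made-up observations for a single variable.
places = decimal_places([0.01, 0.2, 3.0])
print(allowed_neighbours(0.015, places))
```

Using `Decimal` rather than plain float arithmetic avoids rounding surprises when the step is a power of ten.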


• Regression analysis, neural networks, SVM, and K-clusters will not help you much. These methods classify observations by applying a weighted average of the independent variables. The classification rule has to be applied to the independent variables directly; it cannot be applied to a weighted average of them or to any other function of them.

• Make sure to order the timestamp chronologically.

• A start could be plotting the density of the independent variables for the subsample of positive target values and for the subsample of negative target values. Then you can identify regions with a high density of positive target observations.
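As a starting point for the density comparison suggested above, the counts behind such a plot can be computed per bin for a single variable. This is a minimal sketch, not a prescribed method: the function name `sign_counts_by_bin` and the fixed `bin_width` are hypothetical choices, and observations with a target of exactly zero are ignored.

```python
import math
from collections import Counter

def sign_counts_by_bin(values, scores, bin_width):
    """For one independent variable, count positive-target and
    negative-target observations in each bin.

    Bin index b covers the interval [b * bin_width, (b + 1) * bin_width).
    Returns {bin_index: (n_positive, n_negative)}.
    """
    pos, neg = Counter(), Counter()
    for x, s in zip(values, scores):
        b = math.floor(x / bin_width)
        if s > 0:
            pos[b] += 1
        elif s < 0:
            neg[b] += 1
    return {b: (pos[b], neg[b]) for b in sorted(set(pos) | set(neg))}

# Made-up values for one variable and the matching target values.
counts = sign_counts_by_bin([0.1, 0.2, 0.6, 0.7, 1.2], [1, -1, 2, 3, -4], 0.5)
for b, (n_pos, n_neg) in counts.items():
    print(b, n_pos, n_neg)
```

Bins where the positive count clearly exceeds the negative count are candidate intervals for a restriction on that variable.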


• All accepted bids will be awarded on completion.

• We are going to judge the performance of each bid both inside the sample (milestone 1) and outside the sample (milestone 2). Good performance means a high aggregate sum of the target variable.

• After this stage we will ask you to provide details of how you would maintain the existing algorithms (milestone 3) over a much larger dataset of ~500 thousand observations.

Skills: Algorithm, Database Programming, Mathematics, Software Development, Statistics

About the employer:
(26 reviews) Birmingham, United Kingdom

Project ID: #5995085

12 freelancers are bidding on average $707 for this job


I am an expert in MATLAB vectorization techniques. With vectorization, processing a large data set takes much less time, on the order of a fraction of a second, but if you write the same program using loops, then in that case for large data ...

$666 USD in 10 days
(40 reviews)

A proposal has not yet been provided

$998 USD in 10 days
(2 reviews)

Hi, good day! I have read the description and I am confident that it can be done given ample time. I have done similar projects in the past related to optimization theory. This one is similar to those project ...

$850 USD in 15 days
(13 reviews)

I have some ideas on how to solve this efficiently. I assume, because all accepted bids will be awarded, that you don't have any interest in gathering up algorithms; you are looking for good people for this kind of thing, ...

$750 USD in 7 days
(8 reviews)

Hi, very interesting project. Could you send me the dataset? I'd code this in R if you don't mind. Best regards, marcin

$400 USD in 7 days
(13 reviews)

Hi, I know the previous project was granted to someone else, but since you said you need several people to work in parallel, I would like to bid for this project. Please let me know specifically which task is expect ...

$388 USD in 10 days
(5 reviews)

Let an expert do it. I have 8+ years of experience. Can we discuss the project? Please initiate a chat with me so that we can discuss the project at a broader level. Why you should hire me: 1. I have a very g ...

$1546 USD in 10 days
(2 reviews)

Hi brother, I am a data scientist working in a multinational company. My work is to see the hidden patterns in large and complex data sets, and covers predictive analytics, data mining, machine learning and also uses the stati ...

$555 USD in 10 days
(7 reviews)

I am a subject matter expert in mathematics, statistics, computer science and physics, and an SEO (search engine optimization) specialist. I worked as a MATLAB and statistics consultant for several years for many compani ...

$666 USD in 10 days
(2 reviews)

Hi, I have more than 14 years of experience and I am an expert in this kind of work. I have completed more than 210 projects. Please look at the feedback left by my employers to know more about my work. Waiting for your positive ...

$500 USD in 20 days
(1 review)

A proposal has not yet been provided

$750 USD in 10 days
(2 reviews)

Please give me your best time for discussion. My Skype ID: Vijaywebsolutions. Thanks, [url removed, login to view]

$412 USD in 15 days
(0 reviews)