Optimization Subject Project in Engineering along with a Report -- 2

Closed. Posted Oct 26, 2015. Paid on delivery

Dear freelancers, please go through all the questions and contact me if you are sure that you can complete the whole project.

Consider 10 slot machines, of which you can play only one at a time. You get a reward from whichever machine you play. The goal is to find the most profitable machine and play it as often as possible. The p-code Matlab function generate_MAB.m provides 10 randomly generated reward sequences of length N. You can design an algorithm that learns a good strategy by playing the multi-armed bandit problem N times. Example Matlab code Test_MAb.m is provided, in which a simplified epsilon-greedy strategy makes the trade-off between exploration and exploitation. The goal is to minimize the total regret (or the percentage of reward lost), that is, the difference between the ideal cumulative reward (obtained by always choosing the best arm) and the actual reward obtained by your algorithm.
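As a rough illustration of the setup above, here is a minimal epsilon-greedy sketch with a regret calculation. It is written in Python rather than Matlab, and it does not reproduce generate_MAB or Test_MAb.m: the function make_rewards is a hypothetical stand-in that draws Bernoulli (0 or 1) rewards from assumed arm means, and epsilon = 0.1 is an assumed value. Regret is measured here against the best single arm in hindsight.

```python
import random

def make_rewards(n_steps, means, rng):
    """Hypothetical stand-in for generate_MAB: one 0/1 reward sequence per arm."""
    return [[1.0 if rng.random() < m else 0.0 for _ in range(n_steps)]
            for m in means]

def epsilon_greedy(rewards, epsilon, rng):
    """Play every step once; return (total reward, plays per arm)."""
    n_arms, n_steps = len(rewards), len(rewards[0])
    counts = [0] * n_arms        # number of times each arm was played
    estimates = [0.0] * n_arms   # running mean reward of each arm
    total = 0.0
    for t in range(n_steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                            # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])   # exploit
        r = rewards[arm][t]      # only the chosen arm's reward is observed
        counts[arm] += 1
        estimates[arm] += (r - estimates[arm]) / counts[arm]
        total += r
    return total, counts

def regret(rewards, total):
    """Return (regret, percentage of reward lost) versus the best single arm."""
    best = max(sum(seq) for seq in rewards)
    loss = best - total
    percent = 100.0 * loss / best if best > 0 else 0.0
    return loss, percent

if __name__ == "__main__":
    rng = random.Random(0)
    means = [0.1 + 0.08 * a for a in range(10)]   # assumed arm means
    rewards = make_rewards(1000, means, rng)
    total, counts = epsilon_greedy(rewards, 0.1, rng)
    print(regret(rewards, total), counts)
```

With epsilon set to 0 the strategy never explores, so with these zero-initialized estimates it can stay on the first arm forever; that is the failure mode the exploration term is meant to avoid.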

Try to formulate your algorithm design as an engineering optimization problem and test your hypothesis for multiple runs of the 10 reward sequences.
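One possible way to set up the multi-run test described above is sketched below in Python. Everything in it is an assumption made for illustration: the Bernoulli reward generator replaces generate_MAB, the arm means are made up, and UCB1 is used only as one example of a candidate strategy to compare against epsilon-greedy. The posting does not prescribe it. Which strategy wins will depend on the actual reward distribution, so no result is claimed here.

```python
import math
import random

def make_rewards(n_steps, means, rng):
    """Hypothetical stand-in for generate_MAB: one 0/1 reward sequence per arm."""
    return [[1.0 if rng.random() < m else 0.0 for _ in range(n_steps)]
            for m in means]

def play(rewards, choose, rng):
    """Run one game with the arm-selection rule `choose`; return total reward."""
    n_arms, n_steps = len(rewards), len(rewards[0])
    counts, estimates, total = [0] * n_arms, [0.0] * n_arms, 0.0
    for t in range(n_steps):
        arm = choose(t, counts, estimates, rng)
        r = rewards[arm][t]
        counts[arm] += 1
        estimates[arm] += (r - estimates[arm]) / counts[arm]
        total += r
    return total

def epsilon_greedy(epsilon):
    """Build an epsilon-greedy selection rule with a fixed epsilon."""
    def choose(t, counts, estimates, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(counts))
        return max(range(len(counts)), key=lambda a: estimates[a])
    return choose

def ucb1(t, counts, estimates, rng):
    """UCB1 rule: try each arm once, then maximize mean plus confidence bonus."""
    for arm, c in enumerate(counts):
        if c == 0:
            return arm
    # t equals the number of plays made so far
    return max(range(len(counts)),
               key=lambda a: estimates[a] + math.sqrt(2.0 * math.log(t) / counts[a]))

def compare(n_runs=100, n_steps=1000, seed=0):
    """Return (mean regret of epsilon-greedy, mean regret of UCB1, UCB1 wins)."""
    rng = random.Random(seed)
    means = [0.1 + 0.08 * a for a in range(10)]   # assumed arm means
    reg_eg, reg_ucb = [], []
    for _ in range(n_runs):
        rewards = make_rewards(n_steps, means, rng)
        best = max(sum(seq) for seq in rewards)   # best single arm in hindsight
        reg_eg.append(best - play(rewards, epsilon_greedy(0.1), rng))
        reg_ucb.append(best - play(rewards, ucb1, rng))
    wins = sum(u < e for u, e in zip(reg_ucb, reg_eg))
    return sum(reg_eg) / n_runs, sum(reg_ucb) / n_runs, wins

if __name__ == "__main__":
    print(compare())
```

The third return value counts the runs in which the candidate had strictly lower regret, which separates "better on average" from "better in every single run".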

1. When N is very large, does any arm stand out as providing the maximum expected reward? Can your algorithm eventually select the most profitable arm with high probability?

2. When you fix N (say N=1000), can your algorithm perform better than the simplified epsilon-greedy strategy on average (say over 100 runs)? Can your algorithm perform better in every single run?

3. Propose your own evaluation criterion for comparing your optimized sequential arm selection strategy with the epsilon-greedy strategy. Can you find another way to adjust epsilon sequentially? Can you optimize the epsilon-greedy strategy if, after each selection, you can see the reward of every arm and not just the one you selected?

4. Assume that you start with $100, that selecting an arm costs you $1.00 each time, and that you receive the reward from the arm you selected. Set N to be very large and simulate your arm selection strategy until either you have used up all your money or you have exceeded $200 for the first time. Document the time step at which your algorithm has to stop. Can you optimize your strategy so that it reaches $200 with high probability in as few steps as possible?
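Questions 3 and 4 can be explored together with a small simulation such as the Python sketch below. It is only a sketch under stated assumptions: rewards are drawn on the fly instead of coming from generate_MAB, each arm is assumed to pay a fixed payout of $2 with its own win probability (so the bankroll moves by plus or minus $1 per play), and the schedule epsilon_t = min(1, c / (t + 1)) is just one assumed way to adjust epsilon sequentially. The names simulate_budget, constant_epsilon and decaying_epsilon are hypothetical.

```python
import random

def constant_epsilon(t, value=0.1):
    """Fixed exploration rate, as in a plain epsilon-greedy strategy."""
    return value

def decaying_epsilon(t, c=10.0):
    """Assumed schedule: explore heavily at first, then less as t grows."""
    return min(1.0, c / (t + 1))

def simulate_budget(win_probs, schedule, payout=2.0, start=100.0,
                    target=200.0, cost=1.0, max_steps=100000, seed=0):
    """Play until broke, above target, or max_steps; return (steps, money, outcome)."""
    rng = random.Random(seed)
    n_arms = len(win_probs)
    counts, estimates = [0] * n_arms, [0.0] * n_arms
    money, t = start, 0
    while money >= cost and money <= target and t < max_steps:
        if rng.random() < schedule(t):
            arm = rng.randrange(n_arms)                            # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])   # exploit
        # assumed reward model: fixed payout with the arm's win probability
        reward = payout if rng.random() < win_probs[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        money += reward - cost    # pay $1 to play, then collect the reward
        t += 1
    if money > target:
        outcome = "target"
    elif money < cost:
        outcome = "broke"
    else:
        outcome = "max_steps"
    return t, money, outcome

if __name__ == "__main__":
    probs = [0.30 + 0.03 * a for a in range(10)]   # assumed win probabilities
    for name, schedule in [("constant", constant_epsilon),
                           ("decaying", decaying_epsilon)]:
        print(name, simulate_budget(probs, schedule))
```

Repeating the simulation over many seeds and recording the outcome and the stopping step gives an empirical estimate of how often, and how quickly, a given schedule reaches the target.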

Note: The Matlab code files [url removed, login to view] and generate_MAB.p are available.

I hope you understand.

Thank you

Electrical Engineering, Mathematics, Matlab and Mathematica

Project ID: #8773250

About the project

Remote project. Active Dec 2, 2015
go4mugam01

We are a team of experienced engineers, researchers and MBAs. We offer the following services: Academic Writing: HND Assignments, GCSE/GCE O+ and A+ e-Portfolios and Technical Reports, All Windows/Li…

$100 USD in 5 days
(30 reviews)
4.8
pinetree200

Hi, dear friend! I saw your project description. I am an expert in SolidWorks, AutoCAD, MATLAB, FEA and CAD/CAM/CAE, and I have a lot of experience in this field. I can help you with this. Please contact me. Best regards.

$25 USD in 1 day
(6 reviews)
2.8