Reinforcement learning: Markov Decision Process

February 9, 2023, by Mehmood

In the previous blog, we learned the basic terminology used in reinforcement learning. Now we are going to look at the basic mathematics and rules behind reinforcement learning, i.e., the Markov Decision Process (MDP).

Markov Decision Processes (MDPs) are mathematical frameworks for modeling decision-making problems in which an agent takes actions to maximize a reward signal. This is where MDPs connect with reinforcement learning, because in reinforcement learning we also want to maximize the reward. In this blog post, we’ll take a closer look at what MDPs are, how they are constructed, and how they can be solved. But before turning to MDPs, we need to cover the fundamentals they are built on: the Markov property and the Markov chain.

Markov Property

The Markov property is a fundamental concept in Markov Decision Processes (MDPs). It states that the future is independent of the past given the present. In other words, the next state of a system depends only on the current state and the action taken in it, not on any earlier states or actions.

Formally, the Markov property can be expressed as follows:

For any time step t, the probability distribution over the state at time t+1, given the full history of states and actions up to t, is equal to the probability distribution over the state at time t+1 given only the state and the action at time t.
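In symbols, the statement above can be written as follows. The notation is an assumption introduced here for illustration: S_t stands for the state and A_t for the action at time t, and s' is any candidate next state.

```latex
\[
P\bigl(S_{t+1} = s' \mid S_t, A_t, S_{t-1}, A_{t-1}, \dots, S_0, A_0\bigr)
  = P\bigl(S_{t+1} = s' \mid S_t, A_t\bigr)
\]
```

The left-hand side conditions on the entire history, the right-hand side only on the present, and the Markov property says the two are the same.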

This property makes MDPs well-suited for modeling decision-making problems where the future is uncertain, but the uncertainty can be reduced by taking action and observing the results.

The Markov property is a key requirement for MDPs because it allows us to model the decision-making process in a way that is computationally tractable. By assuming the Markov property, we can simplify the problem of finding an optimal policy by considering only the current state and the immediate rewards and transitions, rather than the entire history of the system. This allows us to use algorithms like value iteration and policy iteration to solve the MDP efficiently. Next, we will take a look at the Markov chain.
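To make the value iteration mentioned above a little more concrete, here is a minimal sketch on a toy MDP. Everything in it is assumed for illustration and does not come from this post: the two states, the two actions, the transition table P, the rewards, and the discount factor GAMMA. Note how each update looks only at the current state, which is exactly what the Markov property permits.

```python
# Toy MDP, invented for illustration: P[state][action] is a list of
# (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)],                  # action 0: stay in state 0
        1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},  # action 1: usually reach state 1
    1: {0: [(1.0, 1, 2.0)],                  # action 0: stay in state 1
        1: [(1.0, 0, 0.0)]},                 # action 1: go back to state 0
}
GAMMA = 0.9  # discount factor (assumed value)


def q_value(V, state, action):
    """Expected return of taking `action` in `state`, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[state][action])


def value_iteration(tol=1e-8):
    """Repeat the Bellman optimality update until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        new_V = {s: max(q_value(V, s, a) for a in P[s]) for s in P}
        delta = max(abs(new_V[s] - V[s]) for s in P)
        V = new_V
        if delta < tol:
            break
    # The greedy policy picks the best action in each state under V.
    policy = {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}
    return V, policy


V, policy = value_iteration()
print(V, policy)
```

In this toy problem the greedy policy moves from state 0 toward state 1 and then stays there, since staying in state 1 keeps collecting the larger reward.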

Setting Virtual Environment For Atari Games and Running Airstriker Genesis using gym-retro

September 1, 2020, by Mehmood

In this blog, I will set up a virtual environment using virtualenv, installed with pip. It is always better to work in a virtual environment when a machine learning, reinforcement learning, or any other task depends on specific library versions. You can also create a virtual environment using Anaconda, but in this blog I will go with one created through pip. The rest of the steps will be the same.

The first thing to do is install the package that will be used to create the virtual environment:

pip install virtualenv

Next, create the virtual environment and activate it with the following commands:

virtualenv striker
source ./striker/bin/activate
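If you want to confirm that the activation worked, one way is to ask the Python interpreter itself. This is only a small sketch and not part of the original steps; the helper name in_virtualenv is made up for illustration. It relies on sys.prefix differing from sys.base_prefix inside an environment (older virtualenv releases set sys.real_prefix instead).

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases mark the environment with sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Otherwise sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("inside a virtual environment:", in_virtualenv())
```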

Now the virtual environment is activated. Next, install the libraries needed to run retro:

pip install tensorflow
pip install retro

Next, run the Airstriker-Genesis game with randomly sampled actions:

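import retro


def main():
    # Create the Airstriker-Genesis environment.
    env = retro.make(game='Airstriker-Genesis')
    obs = env.reset()
    try:
        while True:
            # Take a randomly sampled action and draw the resulting frame.
            obs, rew, done, info = env.step(env.action_space.sample())
            env.render()
            if done:
                # Episode finished: start a new one.
                obs = env.reset()
    except KeyboardInterrupt:
        pass  # stop the endless loop with Ctrl+C
    finally:
        env.close()  # always release the emulator


if __name__ == "__main__":
    main()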

When you run this code you will get this error.

Flask Website Deployment using Docker Compose on Azure Cloud

July 18, 2020, by Mehmood

In this blog, I will deploy a Flask app behind nginx using Docker Compose. If you don’t know how Docker Compose works, read my previous blog first.
Before we get to work, here is a brief introduction to what I am going to use. I created a virtual machine on the Azure cloud platform. If you don’t know how to create a virtual machine on Azure, my earlier blog shows how to do it in a few simple steps.
After creating the virtual machine, SSH into it with a simple command:

ssh halcyoona@40.114.31.5

Installation

To install Docker and Docker Compose on your virtual machine, run these commands in the terminal:

sudo apt install docker.io
sudo apt install docker-compose

Create a directory with the name of your app; I am calling mine my_flask_app.

mkdir my_flask_app
cd my_flask_app

In this folder, create two more directories named flask and nginx. Nginx is the entry point of the app: it receives the requests and forwards them to the Flask app.

mkdir flask
mkdir nginx

Flask App Setting

Now move into the flask directory and create a Python virtual environment. First install the package required to create it, python3-venv, then create the environment with the following commands:

cd flask
sudo apt install python3-venv
python3 -m venv env

Now activate the environment with the simple command:

source env/bin/activate

And then install flask and uwsgi.

pip install flask uwsgi

The flask package is used to create the application.
uWSGI is an application server that implements the Web Server Gateway Interface (WSGI); nginx communicates with the Flask app through it.
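To give a feel for what the Web Server Gateway Interface asks of an application, here is a minimal sketch that uses only the Python standard library. It is not the Flask app from this post: the names application and call_app are made up for illustration, and call_app merely imitates what a WSGI server such as uWSGI does when a request arrives. A Flask application object is itself a WSGI callable of the same shape.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI callable: takes the request environ, returns the body chunks."""
    body = b"Hello from a WSGI app\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke the app once with a fake request, roughly as a server would."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app reports instead of writing to a socket.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(application)
print(status, body)
```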

Vector Grapher

July 18, 2018, by Mehmood

Vector Grapher was our group project for the CS-103 course. The idea was proposed by my colleague Ahmed Waheed. The group project is considered one of the most important parts of any course. That’s why we decided to do something new and unique that would better test the skills we had learned in the computer programming course, and whose output would be a productive project, not just a project. This was our first ever project at university, so we were excited, passionate, and zealous about it, and a little bit confused. At that time, though, we didn’t know what to do.

Then we saw this in our Calculus-II book:

The helix seen in the screenshot above was our motivation. We decided to do this as our group project in CS-103. We met our Calculus instructor, Professor Dr. M. Tariq Rahim, and told him about our idea. Dr. M. Tariq Rahim said that it was a brilliant idea, one that no student at Fast university had done yet. He added that it was not a very difficult job, that we could do it easily by working hard, and that he would be there for us if we needed any help. We were motivated, and we settled on Vector Grapher as our semester project.

With the first and most difficult part accomplished, the second task was making a proposal to submit to our Computer Programming course instructor, Assistant Professor M. Tehseen Khan. Making a proposal was not a piece of cake: we had a call of almost half an hour at midnight to write it, because we had not been able to submit the proposal in time.

Halcyoona

Category: Project
