Thursday, April 18, 2019

Wednesday, April 3, 2019

Machine Learning Notes - 03

What is the process for a machine learning project?

A simple and useful explanation:

"

5-Step Systematic Process

I liked to use a 5-step process:
  1. Define the Problem
  2. Prepare Data
  3. Spot Check Algorithms
  4. Improve Results
  5. Present Results...............",
Read all

https://machinelearningmastery.com/process-for-working-through-machine-learning-problems/


Problem definition process:

Problem Definition Framework

I use a simple framework when defining a new problem to address with machine learning. The framework helps me to quickly understand the elements and motivation for the problem and whether machine learning is suitable or not.
The framework involves answering three questions to varying degrees of thoroughness:
  • Step 1: What is the problem?
  • Step 2: Why does the problem need to be solved?
  • Step 3: How would I solve the problem?
Read all..



Best Resources found on Machine Learning with Python

1. https://machinelearningmastery.com/machine-learning-in-python-step-by-step/

Machine Learning Notes 02

The very first thing you create with a machine learning program is a data model, which can be used in business applications for predictions, forecasting and so on.

There are six main steps in creating a model:

1. Load Data
2. Clean up data
3. Create the model
4. Train the model
5. Test the Model
6. Deploy the Model 
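
The six steps above can be sketched end to end as follows. This is only a minimal illustration under several assumptions that are not in these notes: the data is synthetic, the model is scikit-learn's LogisticRegression, and "deploy" is reduced to serializing the trained model with pickle.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 1. Load data (synthetic here; a real project would read a file instead).
rng = np.random.default_rng(0)
data = pd.DataFrame({"x1": rng.normal(size=200), "x2": rng.normal(size=200)})
data["label"] = (data["x1"] + data["x2"] > 0).astype(int)
data.loc[[3, 17], "x1"] = np.nan  # simulate two missing values

# 2. Clean up data: drop the rows that contain missing values.
data = data.dropna()

# 3. Create the model.
model = LogisticRegression()

# 4. Train the model on one part of the data.
X_train, X_test, y_train, y_test = train_test_split(
    data[["x1", "x2"]], data["label"], test_size=0.25, random_state=0
)
model.fit(X_train, y_train)

# 5. Test the model on rows it has not seen during training.
accuracy = model.score(X_test, y_test)
print("held-out accuracy:", accuracy)

# 6. Deploy: serialize the trained model so an application can load it later.
saved = pickle.dumps(model)
restored = pickle.loads(saved)
```

In a real application the pickled bytes would be written to a file or model store, and the application would call predict on the restored object.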

Step 1 - Load the data

To load the data, we mostly use the Python library called "pandas".

Example of loading data from a CSV file:

import pandas as pd
import matplotlib.pyplot as plt
# %matplotlib inline  (only needed in Jupyter notebooks)
filename = 'datafile.csv'
columnnames = ['preg', 'pres', 'skin']
# names= supplies the column names, so the file is read as having no header row
data = pd.read_csv(filename, names=columnnames)


To check the shape of the data (number of rows and columns), use:

print(data.shape)

To see summary statistics of the data, use:

data.describe()

To see how many rows fall under each value of a column, group by that column:

print(data.groupby('pres').size())
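
The inspection calls above can be tried without a file on disk. The sketch below assumes a tiny made-up CSV held in memory (the six rows are invented for illustration) and uses the same column names as the example.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for datafile.csv.
csv_text = """6,72,35
1,66,29
8,64,0
1,66,23
0,40,35
5,72,0
"""
columnnames = ["preg", "pres", "skin"]
data = pd.read_csv(io.StringIO(csv_text), names=columnnames)

print(data.shape)            # (rows, columns)
summary = data.describe()    # count, mean, std, min, quartiles, max per column
print(summary)
counts = data.groupby("pres").size()  # number of rows per distinct 'pres' value
print(counts)
```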

To visualize the data, use univariate and multivariate plots.

Univariate
We start with some univariate plots, that is, plots of each individual variable.
Given that the input variables are numeric, we can create box and whisker plots of each.
Code:
data.plot(kind='box', subplots=True, layout=(2,2), sharex=False, sharey=False)
This draws a box plot for each numeric column in the data set.
To view a histogram of each column, use:

data.hist()
plt.show()
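
A self-contained version of the plotting calls might look like the sketch below. The sample rows are invented, the non-interactive "Agg" backend is chosen only so the script runs without a display, and the scatter matrix is one possible multivariate plot (an assumption; the notes do not name a specific one).

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line to see windows
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical sample rows standing in for datafile.csv.
csv_text = "6,72,35\n1,66,29\n8,64,0\n1,66,23\n0,40,35\n5,72,0\n"
data = pd.read_csv(io.StringIO(csv_text), names=["preg", "pres", "skin"])

# Univariate: one box-and-whisker plot per column, then one histogram per column.
data.plot(kind="box", subplots=True, layout=(2, 2), sharex=False, sharey=False)
data.hist()

# Multivariate: pairwise scatter plots between all columns.
scatter_matrix(data)

n_figures = len(plt.get_fignums())
plt.show()  # has no visible effect under the Agg backend
```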
continuing....
Step 2 - Clean the data 

Before processing the data, it needs to be cleaned up. In Python we do that with two main libraries:

1. numpy
2. pandas

Data cleanup can take several forms:

1. Dropping unwanted columns from the data frame
2. Changing the index of the data frame
3. Tidying up fields in the data
4. Combining str methods with NumPy to clean columns
5. Cleaning the entire data set with the applymap function
6. Renaming columns and skipping rows
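
Several of the techniques in this list can be sketched on a small made-up data frame. Everything below (column names, values, the cleanup rules) is hypothetical and only meant to show the shape of each call. One caveat: newer pandas releases rename applymap to DataFrame.map, so the sketch picks whichever of the two is available.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data, invented for illustration only.
raw = pd.DataFrame({
    "ID": [101, 102, 103],
    "Name ": [" alice ", "BOB", " carol"],
    "Place": ["London (UK)", "Paris", "Rome (IT)"],
})

# Renaming columns.
data = raw.rename(columns={"ID": "id", "Name ": "name", "Place": "place"})

# Changing the index of the data frame.
data = data.set_index("id")

# Cleaning the entire data set element by element (applymap, or map on newer pandas).
elementwise = data.map if hasattr(data, "map") else data.applymap
data = elementwise(lambda v: v.strip() if isinstance(v, str) else v)

# Tidying up a field with str methods.
data["name"] = data["name"].str.title()

# Combining str methods with NumPy: strip a trailing "(..)" only where one exists.
has_suffix = data["place"].str.contains("(", regex=False)
stripped = data["place"].str.replace(r"\s*\(.*\)$", "", regex=True)
data["place"] = np.where(has_suffix, stripped, data["place"])

print(data)
```

Skipping rows is usually done at load time, for example through the skiprows parameter of pd.read_csv.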


First, import the two main libraries:

import numpy as np
import pandas as pd

pandas provides a function called "drop" for removing columns from a data frame.

First load the data from a CSV file, repeating the earlier code:

filename = 'datafile.csv'
columnnames = ['preg', 'pres', 'skin']
data = pd.read_csv(filename, names=columnnames)

Define the columns to drop from the data frame (the names must match existing columns):

drop_columns = ['preg', 'pres']

Then drop the columns like this:

data.drop(drop_columns, inplace = True, axis = 1)
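
A runnable sketch of the drop step, using a small invented data frame in place of the CSV file. It shows both the in-place form used above and the variant that returns a new data frame and leaves the original untouched.

```python
import pandas as pd

# Hypothetical rows with the same column names as the example above.
data = pd.DataFrame({"preg": [6, 1, 8], "pres": [72, 66, 64], "skin": [35, 29, 0]})
drop_columns = ["preg", "pres"]

# Variant 1: return a new data frame; 'data' itself is not modified.
trimmed = data.drop(columns=drop_columns)
columns_before = list(data.columns)

# Variant 2: modify 'data' in place (axis=1 means "columns").
data.drop(drop_columns, inplace=True, axis=1)

print(list(trimmed.columns), list(data.columns))
```

A name in drop_columns that does not exist in the data frame raises a KeyError, so a misspelled column name shows up immediately.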


Machine Learning Notes 01

Machine learning divides mainly into two categories:

1. Supervised Learning
2. Unsupervised Learning

Supervised Learning

Main scenarios for supervised learning:
1. Classification
2. Prediction
3. Sequence Prediction

Main methods for supervised learning:

1. Nearest Neighbors
2. Logistic Regression
3. Support Vector Machine
4. Random Forest
5. Naive Bayes
6. Neural Networks

When choosing a learning method, change your mindset: do not ask "How" and "Why",

but ask "Does it work?"
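
One way to ask "Does it work?" is to try each method on the same data and compare scores. The sketch below assumes scikit-learn and its bundled iris data set (neither is named in these notes) and uses 5-fold cross-validation with mostly default settings; it is a quick spot check, not a tuned comparison.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# One estimator per method in the list above.
models = {
    "Nearest Neighbors": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Neural Network": MLPClassifier(max_iter=2000, random_state=0),
}

# Mean accuracy over 5 cross-validation folds for each method.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```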