Motivation and Machine Learning - Part2

Day 5

Learnt about tabular data: data arranged in a table, like an Excel spreadsheet, with rows, columns, and cells where they intersect.

Each row describes a single observation, product or entity.

Columns describe the properties or features of the item. Column values can be continuous (numeric values that can take any value within a range) or discrete (categorical) values, which come from a limited set and need to be converted to numbers.

Each cell holds a single value at the intersection of a row and a column.

In machine learning we ultimately work with numbers, specifically vectors. So everything that isn't a number, like categorical variables, text, pictures, videos and audio inputs, is eventually converted to an array of numbers.

Day 6

Revision - 2.8 Scaling Data and 2.9 Encoding categorical data

The point of scaling data is to transform it to fit within some range or scale, say 0-1 or 1-100. Every value in a feature is scaled the same way, so the relative differences between values are preserved. Scaling can also speed up the training process.

Two common approaches to scaling: standardization and normalizing.

Standardization rescales data to have a mean of zero and a variance of one. Variance is an indication of the data's dispersion. Formula: (x - mean) / standard deviation, where the standard deviation is the square root of the variance.

Normalizing simply rescales the data into the range [0, 1]. Formula: (x - xmin) / (xmax - xmin)
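Here is a minimal sketch of both scaling approaches in plain Python. The function names and the `ages` column are made up for illustration; they are not from the course. Standardization here divides by the standard deviation (the square root of the variance), which is what gives the result a variance of one.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale to mean 0 and variance 1: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed non-zero)
    return [(x - mu) / sigma for x in values]

def normalize(values):
    """Rescale into [0, 1]: (x - xmin) / (xmax - xmin)."""
    lo, hi = min(values), max(values)  # assumes hi > lo
    return [(x - lo) / (hi - lo) for x in values]

ages = [20, 30, 40, 50, 60]  # hypothetical feature column
standardized = standardize(ages)
normalized = normalize(ages)
print(normalized)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Both functions assume the column has at least two distinct values; otherwise the denominator would be zero.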

Categorical data needs to be encoded into numbers so that it can be modeled by machine learning algorithms. Two approaches: ordinal encoding and one-hot encoding.

Ordinal encoding assumes there's an order or rank of importance. It converts categorical data into integers ranging from zero to the number of categories minus one.

One-hot encoding doesn't assume any order. It creates a new column for each category, so the number of new columns corresponds to the number of distinct categories.
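The two encodings can be sketched in a few lines of plain Python. The helper names and the sample categories below are my own assumptions for illustration, not something from the lessons.

```python
def ordinal_encode(values, order):
    """Map each category to its position in `order` (0 .. n_categories - 1)."""
    index = {category: i for i, category in enumerate(order)}
    return [index[v] for v in values]

def one_hot_encode(values):
    """Create one 0/1 column per distinct category (columns sorted by name)."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

sizes = ["small", "large", "medium", "small"]   # ordered categories
ordinal = ordinal_encode(sizes, ["small", "medium", "large"])
print(ordinal)  # [0, 2, 1, 0]

colors = ["red", "green", "blue", "green"]      # no natural order
columns, one_hot = one_hot_encode(colors)
print(columns)  # ['blue', 'green', 'red']
print(one_hot)  # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Note how the ordinal version needs the order to be supplied, while the one-hot version only needs the set of distinct categories.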


Day 7

I revised lessons 2.10 - Image data and 2.11 - Text data. My summary:

Image data needs to be represented as numbers before it can be fed to ML algorithms, which is why we describe it in terms of pixels.

A pixel can be viewed as a tiny square of color obtained by a combination of 3 color channels (RGB) or more.

So images are represented as 3D arrays with a height, a width and a number of channels. It's important to use the same aspect ratio for all images in an ML dataset. In case you're wondering what aspect ratio means, it refers to the ratio between an image's width and its height.

It's also important to normalize image data by subtracting per channel mean pixel values from individual pixels.
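As a rough sketch of that mean subtraction, here is a tiny 2x2 RGB image stored as nested lists (height x width x channels). The image values and function names are made up for illustration. In practice the per-channel means are usually computed over a whole training set rather than a single image; this sketch uses one image only to keep things small.

```python
def per_channel_mean(image):
    """Average pixel value of each channel over the whole image."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    totals = [0.0] * channels
    for row in image:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [t / (height * width) for t in totals]

def subtract_channel_mean(image):
    """Subtract each channel's mean from every pixel in that channel."""
    means = per_channel_mean(image)
    return [[[value - means[c] for c, value in enumerate(pixel)]
             for pixel in row]
            for row in image]

# Hypothetical 2x2 image: red, green, blue and white pixels.
image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]
centered = subtract_channel_mean(image)
print(per_channel_mean(image))  # [127.5, 127.5, 127.5]
print(centered[0][0])           # [127.5, -127.5, -127.5]
```

After the subtraction, each channel has a mean of zero.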

Text data is also processed into numbers before use. There are two steps in that: normalization, then vectorization.

Normalization transforms a piece of text into a canonical (official) form. This helps handle different forms of a word that mean the same thing (lemmatization) and removes unnecessary words (stop words).

Tokenization refers to splitting a string of text into a list of smaller parts.
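A toy sketch of tokenization and normalization in plain Python follows. The stop-word set and the lemma lookup table are tiny made-up stand-ins; real projects normally rely on an NLP library for both. Splitting first and then normalizing each token is one common ordering, assumed here for simplicity.

```python
import re

# Tiny illustrative lists, not real linguistic resources.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "on"}
LEMMAS = {"cats": "cat", "sitting": "sit", "sat": "sit", "mats": "mat"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def normalize_tokens(tokens):
    """Drop stop words and map each remaining token to its lemma."""
    return [LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS]

tokens = tokenize("The cats are sitting on the mats.")
print(tokens)  # ['the', 'cats', 'are', 'sitting', 'on', 'the', 'mats']

clean = normalize_tokens(tokens)
print(clean)   # ['cat', 'sit', 'mat']
```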

Then comes vectorization. The point is to encode the text in numerical form. We identify the most relevant features or key words after normalizing and assign a value to each one.

Two approaches: TF-IDF (term frequency-inverse document frequency) and Word2Vec.

The difference between them is that TF-IDF ignores the order of words and gives a matrix based on how often each word in the vocabulary appears (frequency) and how many documents it appears in, while Word2Vec gives a unique vector for each word based on the words appearing around it.
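To make the TF-IDF matrix concrete, here is a minimal sketch using the plain textbook formula, tf x log(N / df). The `tf_idf` function and the three toy documents are my own illustration; libraries often use smoothed variants, so their numbers will differ. Each document is assumed to be a non-empty list of already-normalized tokens.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return (vocabulary, matrix): one row per document, one column per word."""
    vocabulary = sorted({word for doc in documents for word in doc})
    n_docs = len(documents)
    # Number of documents each word appears in.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    matrix = []
    for doc in documents:
        counts = Counter(doc)
        row = [
            (counts[w] / len(doc)) * math.log(n_docs / doc_freq[w])
            for w in vocabulary
        ]
        matrix.append(row)
    return vocabulary, matrix

docs = [
    ["cat", "sit", "mat"],
    ["dog", "sit", "log"],
    ["cat", "chase", "dog"],
]
vocabulary, matrix = tf_idf(docs)
print(vocabulary)  # ['cat', 'chase', 'dog', 'log', 'mat', 'sit']
for row in matrix:
    print([round(value, 3) for value in row])
```

In the first document, "mat" scores higher than "sit" because "mat" appears in only one document while "sit" appears in two. Word order plays no part in the result.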

Sorry, the summary is longer than I planned. Hope you like it. :blush: Let's keep going.

Day 8

I took lessons 2.12 - 2.14. They centered on the two perspectives on machine learning: the computer science perspective and the statistical perspective.

In the computer science perspective, we try to determine the program that, given the input data (input features), can produce a correct output or expected prediction.

In the statistical perspective, the ML algorithm is trying to learn a hypothetical function f of the given input variables (x), which are the independent variables. The output is the dependent variable.

The challenge is the same in both perspectives: determining a program or learning a function that produces the dependent variable.

Day 9

Summary 2.15-2.24

Part 1

2.15(Tools for Machine Learning)

ML tools consist of libraries, development environments and services that provide support for the ML ecosystem.

Common libraries: Scikit-learn, a classical machine learning library.

Keras, TensorFlow and PyTorch - popular deep learning libraries.

Development environments: the most popular is Jupyter notebooks, where you can write your code in Python and split it into cells that run separately. Other environments are Azure Notebooks, Azure Databricks, Visual Studio Code etc.

Cloud services are needed to support the development environments. Microsoft Azure is a leading cloud computing service for building, testing, deploying and managing apps and services through Microsoft data centers. Cloud services are important because sometimes the data is too large and you need a faster processor to handle your ML workloads, which can easily be provisioned in the cloud.

Microsoft has Azure Data Science VMs (virtual machines), Azure Databricks, Azure Machine Learning Compute and even SQL Server ML Services to handle all kinds of ML workloads irrespective of size.

For notebooks, Jupyter, Databricks, R Markdown and Apache Zeppelin are the most popular.

A library is a collection of pre-written (and compiled) code which you can make use of in your own project by simply importing it.

2.16 (ML libraries)

Python is a programming language which can be used for ML. It has two important libraries for this: Pandas, which allows you to work efficiently with DataFrames, and NumPy, which supports operations on arrays and high-level mathematical functions for numerical optimization.

For ML: the most popular library is Scikit-learn. There is also Spark ML, which is also used for classical ML.

For DL: there are two core libraries, TensorFlow and PyTorch. Keras is a library that was developed to make TensorFlow easier to work with for deep learning.

For visualization: we have Seaborn, Plotly, Matplotlib, and Bokeh. Seaborn provides a high-level interface built on Matplotlib, with some additional features. Bokeh generates interactive data visualizations.

Revision, Lessons 2.17-2.18

Cloud services for ML provide support for managing the core assets used in ML projects.

Datasets: define, version and monitor the data used in ML runs.

Experiments & runs: organize ML workloads and keep track of each task executed through the service.

Pipelines: provide a structured flow of tasks to model complex ML flows.

Model registry: manages models and registers them, with support for versioning and deployment to production.

Endpoints: expose real-time endpoints for scoring, as well as pipelines for advanced automation.

Datastores: data sources connected to the service environment, like blob stores, file shares, data lake stores and databases. Simply a place from which you can create datasets.

Compute: a designated compute resource where you run your ML training.

Difference between models and algorithms

Models are specific representations learnt from data by an algorithm. The model is the end product.

Algorithms can be seen as prescriptive recipes that transform data into models.

Linear regression assumes there's a linear relationship between x, the independent variable, and y, the dependent variable. It is represented as y = mx + c, where m is the slope and c is the intercept.
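To see how an algorithm turns data into a model here, this is a small sketch that fits y = mx + c with ordinary least squares, treating x as the input and y as the value to predict. The `fit_line` function and the data points are made up for illustration. Least squares is the algorithm, and the resulting pair (m, c) is the model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns slope m and intercept c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)  # assumes xs are not all equal
    m = covariance / variance_x
    c = mean_y - m * mean_x
    return m, c

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]   # generated from y = 2x + 1
m, c = fit_line(xs, ys)
print(m, c)         # 2.0 1.0

prediction = m * 5 + c  # use the model on a new input
print(prediction)       # 11.0
```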

:sunglasses: That's quite a lot. Hope you found it useful.

Automated ML: rapidly iterates over many combinations of algorithms and hyperparameters to help you find the best model based on your defined metrics.

