With machine learning all over the place, it is becoming increasingly important to capture best practices and solutions for tackling common ML problems. Design patterns are a way of capturing these problems and providing reusable answers through generic, well-proven ML designs. They are ways of thinking when designing solutions or building ML systems.
Now the question is: can we abstract machine learning best practices into design patterns?
According to Wikipedia, “Design patterns are formalized best practices that the programmer can use to solve common problems when designing an application or system”.
In a world full of chaos, it is patterns that provide us with a sense of order. They are everywhere: in nature, art, science, and thought. A child begins their learning journey by identifying patterns, which help them see relationships and form generalizations.
Pattern is fundamental to our understanding of the world.
A pattern can be defined as a regularity that repeats itself. In any given situation, patterns help us identify the underlying order and ask the right questions. …
Data visualization is one of the most powerful forms of communication. It is the practice of encoding information into graphical representations, and it supports both exploratory and explanatory analysis. Humans grasp information more readily through visual representations. Though the process of data visualization is abstract, understanding the underlying framework helps us build better dashboards. To design effective dashboards, it is crucial to answer the following questions:
1. What is being analyzed (Data)
2. Why visualization is required (Tasks)
3. How to encode data effectively (Methodology)
It is important to analyze and understand the data being visualized, tasks supported by the…
Testing new ideas and figuring out better ways of doing things is an everyday process. But how do we know whether the change we implemented is significantly better than the previous method? This is where the t-test comes to our rescue. A t-test compares two sample means to find out whether the difference between them is real or occurred simply due to chance. An idea to be tested is called a hypothesis, and the process of testing such ideas is called hypothesis testing.
The t-test is one of the most important statistical tools used to test the…
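As a sketch of the idea, here is a two-sample t-test computed from scratch. The data and the "old vs. new design" framing are purely illustrative:

```python
import math
from statistics import mean, variance

# Hypothetical data: page-load times (seconds) under an old and a new design.
old = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 12.0]
new = [11.2, 11.0, 11.4, 11.1, 11.3, 10.9, 11.2, 11.1]

n1, n2 = len(old), len(new)

# Pooled variance: assumes both samples share a common population variance.
sp2 = ((n1 - 1) * variance(old) + (n2 - 1) * variance(new)) / (n1 + n2 - 2)

# Two-sample t statistic: difference in means scaled by its standard error.
t = (mean(old) - mean(new)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# With n1 + n2 - 2 = 14 degrees of freedom, the two-tailed 5% critical
# value is about 2.145; |t| beyond that suggests a real difference.
print(f"t = {t:.2f}, significant at 5%: {abs(t) > 2.145}")
```

In practice a library routine such as `scipy.stats.ttest_ind` would also return the exact p-value; the hand computation above just makes the formula visible.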
Today, in this golden age of drowning in data every single moment, we are flooded with opinions masked under the name of information. False conclusions are disastrous! Here’s Hasan Minhaj’s take on it: https://www.youtube.com/watch?v=icNirsV1rLA
Generally, the bigger the data, the better the results. But on the other hand, the more data we have, the higher the risk of being fooled by randomness into drawing false conclusions. And p-hacking is exactly that.
But before we delve into the details, let’s understand the basic terms.
Statistics is the science that helps us manage risk in this uncertain world. …
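To see how testing many hypotheses on pure noise manufactures "significant" findings, here is a small simulation. The z-test approximation and all numbers are illustrative:

```python
import math
import random

random.seed(42)  # fixed seed so the run is reproducible

def z_test_p(a, b):
    """Two-sample z-test p-value (normal approximation; fine for large, equal-size samples)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    va = sum((x - ma) ** 2 for x in a) / (n - 1)
    vb = sum((x - mb) ** 2 for x in b) / (n - 1)
    z = (ma - mb) / math.sqrt(va / n + vb / n)
    # Two-tailed p-value from the standard normal CDF.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Run 200 A/A "experiments": both groups come from the SAME distribution,
# so every significant result is a false positive.
false_positives = 0
for _ in range(200):
    a = [random.gauss(0, 1) for _ in range(100)]
    b = [random.gauss(0, 1) for _ in range(100)]
    if z_test_p(a, b) < 0.05:
        false_positives += 1

# At the 5% level we expect roughly 5% false positives: about 10 of 200.
print(f"false positives: {false_positives} / 200")
```

Run enough tests and some will come out "significant" by chance alone; reporting only those is p-hacking.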
If you have been in the data science space for a decent amount of time, you will have realized that ensemble techniques are one of the core strategies for winning Kaggle competitions.
The story of the 2006 Netflix Prize is one of the game-changing tales in AI folklore. The winning entry used ensemble techniques to bag the million-dollar prize.
Before we move on to ensemble techniques, let’s look at the concept of the “wisdom of the crowd”.
The wisdom of the crowd is the idea of collective intelligence popularized by James Surowiecki in his book, The Wisdom of…
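The statistical intuition behind both the wisdom of the crowd and ensembling can be checked directly: if many independent voters are each right only 60% of the time, a majority vote is right far more often. A sketch, with made-up numbers:

```python
from math import comb

p = 0.6   # accuracy of each individual voter (illustrative)
n = 101   # number of independent voters; odd, so no ties

# Probability that a strict majority (51 or more of 101) votes correctly:
# a binomial tail sum over all winning outcomes.
majority_acc = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n // 2 + 1, n + 1)
)

print(f"individual accuracy: {p:.2f}, majority-vote accuracy: {majority_acc:.3f}")
```

The catch, as Surowiecki stresses, is independence: errors that are correlated across voters (or across models in an ensemble) do not cancel out this way.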
Let’s say we are left in the middle of a mountain range, blindfolded, and need to find the lowest point in the range.
One of the most intuitive ways to go about it is to feel the slope of the ground.
From the position where we are standing, we check all possible directions for the steepest descending slope and move downhill in that direction.
We take each step one at a time, iteratively, until we reach a point where there is no downward slope in any direction, and we stop there. …
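The blindfolded-hiker procedure above maps directly onto gradient descent. A minimal sketch on a simple bowl-shaped function (the function, starting point, and step size are chosen for illustration):

```python
# f(x, y) = (x - 3)^2 + (y + 2)^2 has its lowest point at (3, -2).
def grad(x, y):
    # Analytic gradient: it points in the direction of steepest ASCENT,
    # so we step in the opposite direction.
    return 2 * (x - 3), 2 * (y + 2)

x, y = 0.0, 0.0       # starting position (arbitrary)
learning_rate = 0.1   # size of each downhill step

for _ in range(100):  # take small steps until we settle at the bottom
    gx, gy = grad(x, y)
    x -= learning_rate * gx
    y -= learning_rate * gy

print(f"converged near ({x:.3f}, {y:.3f})")  # close to (3, -2)
```

The "feel the slope" step is the gradient evaluation, and "no downward slope in any direction" is a gradient of (approximately) zero.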
Natural Language Processing (NLP) is a branch of AI that helps machines understand and interpret human language, bridging the gap between human and machine language.
We use the concept of analogies between words to predict a country, given the name of its capital city.
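With learned word vectors, such analogies reduce to vector arithmetic plus a nearest-neighbor search. Below is a toy sketch with handcrafted 3-d vectors; real embeddings (e.g. word2vec) are learned from large corpora, and these numbers are contrived purely so the analogy works:

```python
import math

# Handcrafted toy "embeddings" (illustrative only).
vectors = {
    "france":  [1.0, 0.0, 0.0],
    "paris":   [1.0, 0.0, 1.0],
    "germany": [0.0, 1.0, 0.0],
    "berlin":  [0.0, 1.0, 1.0],
    "japan":   [1.0, 1.0, 0.0],
    "tokyo":   [1.0, 1.0, 1.0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Analogy: France is to Paris as ? is to Berlin.
query = [f - p + b for f, p, b in
         zip(vectors["france"], vectors["paris"], vectors["berlin"])]

# Nearest neighbor by cosine similarity, excluding the query words themselves.
candidates = {w: v for w, v in vectors.items()
              if w not in {"france", "paris", "berlin"}}
answer = max(candidates, key=lambda w: cosine(query, candidates[w]))
print(answer)  # germany
```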
Machine learning and deep learning algorithms generally deal with numeric data. So, to convert text into numbers, the bag-of-words technique was developed to extract numeric features from text. It uses the frequency distribution of words to find the number of times each word appears in the text, a process also known as vectorization…
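A minimal bag-of-words vectorizer can be written with `collections.Counter`; the tiny two-document corpus below is made up for illustration:

```python
from collections import Counter

# Tiny illustrative corpus.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

# Build a fixed vocabulary from all documents, sorted so that each
# word gets a stable column index.
vocab = sorted({word for doc in docs for word in doc.split()})

def vectorize(doc):
    """Map a document to a vector of word counts over the vocabulary."""
    counts = Counter(doc.split())
    return [counts[word] for word in vocab]

vectors_bow = [vectorize(doc) for doc in docs]
print(vocab)        # ['cat', 'dog', 'log', 'mat', 'on', 'sat', 'the']
print(vectors_bow)  # [[1, 0, 0, 1, 1, 1, 2], [0, 1, 1, 0, 1, 1, 2]]
```

Library implementations such as scikit-learn's `CountVectorizer` follow the same idea, adding tokenization options, sparse storage, and n-gram support.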
Sentiment analysis is an NLP technique that allows us to classify whether a text, tweet, or comment is positive, neutral, or negative. Today’s technology enables users to express their emotions and thoughts on social platforms more openly than ever before. So sentiment analysis has become a mandatory tool for businesses to understand user sentiment, gauge their performance, and tailor their products and services to user needs, making their systems more efficient.
Logistic regression is a supervised machine learning technique for classification problems. Supervised machine learning algorithms train on a labeled dataset along…
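A bare-bones sketch of logistic regression trained by gradient descent on a made-up sentiment dataset; the texts, vocabulary, and hyperparameters are all illustrative:

```python
import math

# Made-up labeled dataset: 1 = positive sentiment, 0 = negative.
data = [
    ("good great good", 1),
    ("great product good", 1),
    ("bad terrible", 0),
    ("terrible bad bad", 0),
]

vocab = sorted({w for text, _ in data for w in text.split()})

def features(text):
    # Bag-of-words counts over the fixed vocabulary.
    words = text.split()
    return [words.count(w) for w in vocab]

def sigmoid(z):
    # Squashes any real number into (0, 1): an interpretable probability.
    return 1 / (1 + math.exp(-z))

# Train with plain stochastic gradient descent on the log-loss.
weights = [0.0] * len(vocab)
bias = 0.0
lr = 0.5

for _ in range(200):
    for text, label in data:
        x = features(text)
        pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        err = pred - label  # gradient of the log-loss w.r.t. the logit
        weights = [w - lr * err * xi for w, xi in zip(weights, x)]
        bias -= lr * err

def predict(text):
    x = features(text)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) > 0.5

print(predict("good great"), predict("bad"))  # True False
```

Real sentiment pipelines would use a library implementation (e.g. scikit-learn's `LogisticRegression`) plus proper tokenization and a held-out test set; the loop above only exposes the mechanics.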
In a world with no perfect solutions, optimizing existing solutions is the only way to make progress. But real-life problems often come with a set of constraints, and the Lagrangian function comes to the rescue in such situations. Every business deals with many constraints on a daily basis: manufacturing equipment, workforce, budget, and so on. So the goal is to optimize the objective function within the constraints.
Step 1: Identify the objective function. It represents the goal, e.g., maximizing profit or minimizing the error rate.
Step 2: Identify the constraint function. It represents the limitations in the system…
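As a worked sketch of these two steps, take an invented problem: maximize profit f(x, y) = x·y subject to the budget constraint x + y = 100. Setting the gradient of the Lagrangian to zero gives x = y = 50, which the brute-force check below confirms:

```python
# Objective:  f(x, y) = x * y           (what we want to maximize)
# Constraint: g(x, y) = x + y - 100 = 0 (a fixed budget, illustrative)
#
# Lagrangian: L(x, y, λ) = x*y - λ*(x + y - 100)
# ∂L/∂x = y - λ = 0 and ∂L/∂y = x - λ = 0  imply  x = y = 50.

# Numeric check: scan integer points on the constraint and take the best.
best_x = max(range(0, 101), key=lambda x: x * (100 - x))
print(best_x, 100 - best_x, best_x * (100 - best_x))  # 50 50 2500
```

The multiplier λ also has a useful reading: it is the rate at which the optimal profit would improve per unit of extra budget.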