This is the second part of the Airflow series as indicated in the title. I have tried to explain and implement the features in the articles. Here is a list of articles that I have posted so far:
Template Search Path is an argument in the DAG definition that mainly contains a list of folder locations that Airflow will look into…
This post is not exactly an introduction to Airflow, but it is exactly what you'll need once you have been introduced to Airflow. That should not stop beginners from reading and understanding this post; sometimes it is better to start from level 1 instead of level 0. Airflow is simple yet complicated. Well, that’s not very helpful, is it? That was a contradictory statement; don't worry, even I didn't get it at first. You’ll understand this once you start working with Airflow. In this post, I’ll explain common Airflow features along with some important features which are mentioned/explained…
I am going to describe my introduction to PL/Python the way a face cream advertisement describes a product that changed someone's life. Recently I was working in Python to get data from a source or database, do some calculations in Python, and then insert the results into another table in the same database. The amount of data was large and the job needed to run daily; it took a long time to run and consumed a lot of resources using Python (psycopg2). Then I came across PL/Python, and since then my world has changed! …
For The Neighborhood Of Mumbai
This report is for the final course of the Data Science Specialization, a 9-course series created by IBM and hosted on the Coursera learning platform. The series covers the entire data science process, from collecting data to making predictive conclusions based on the data.
The problem and analysis approach are left for learners to decide and explore on their own, with a requirement to leverage the Foursquare location data to explore or compare neighbourhoods or cities of our choice or to come up with a problem that we can use the Foursquare location data to…
Data science is like a beautiful painting, but what if you don’t have paints for that painting? There would be no data science without data (valid data, obviously). Mostly there are 3 cases of raw data:
1. When you don’t have data readily available from any trusted platform (e.g. Kaggle, data.gov, a database, etc.).
2. When you get data from a trusted platform/database but it’s not appropriate (the data is biased, it doesn’t have the features you require, or it needs too much cleaning).
3. When you have appropriate data available from a trusted platform/ (eg. …
Make your machine understand the data better
In this digital world, “encoding” is a powerful word, with its applications and presence spread across all aspects of life. In general, encoding is the process of converting data from one form to another required form. One of the best examples of encoding comes from the era before the internet: Morse code, a character encoding scheme that encodes text characters as standardized sequences of two different signal durations called dots and dashes, or dits and dahs.
The world of data also makes use of encoding for encryption, telecommunication of data…
Optimize KNN with the best value of K.
There’s a famous saying,
“You are who you surround yourself with”
Just as the saying goes, today we’ll look at a Machine Learning algorithm which is based on this saying. From psychology, and from a layman's understanding too, we know how our surroundings and associations affect our behaviour. That’s why, as kids, we are always told to stay away from bad influences.
OK, enough philosophy; let's get down to understanding the title of this post. K-nearest Neighbors is one of the easiest algorithms to understand and is very effective at the same time.
…
A Quick Guide to get started with Git
Yes, the title clearly states the purpose of this article, i.e. this article is just enough for you to get started with Git and start committing. Yes, Git requires commitment, but here it doesn’t restrict one’s freedom of action. You’ll get to know about this commitment as you read through the article.
How many of you know GitHub? You must have heard of GitHub mainly from people involved in technology or people who code or more categorically the geeks, who rave about GitHub all the time. Most people only know GitHub as…
This tutorial is a continuation of my previous post, as the title suggests. If you know the basics of tree-based learning algorithms, and more specifically the Decision Tree algorithm, then you can continue on your quest to master the “Decision Tree Algorithm”. But if you are a beginner or a novice, or you can’t recall the concept, then I would suggest you go through Machine Learning Basics: Decision Tree From Scratch (Theoretical Framework), because the Information Gain of the theoretical part is higher than that of the practical part (if you didn’t get this ->> Click<<-). …
Data Scientist | Imagine — Believe — Achieve |