Gradient Boosting Machines (GBMs) are a boosting ensemble technique. Boosting can perform well because both bias and variance can be controlled through careful hyperparameter tuning. GBMs fit shallow decision trees, in contrast to the single-split stumps typically used in AdaBoost. They sit between AdaBoost and XGBoost in predictive performance: GBMs generally outperform AdaBoost while having simpler mechanics than XGBoost.
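A minimal sketch of the idea, using scikit-learn's `GradientBoostingRegressor` on synthetic data (the dataset and every hyperparameter value here are my illustrative choices, not prescriptions): shallow trees plus a small learning rate are the knobs that trade bias against variance.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression data, purely for illustration
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Shallow trees (max_depth=3) and a small learning_rate control the
# bias/variance trade-off; n_estimators is the number of boosting rounds
gbm = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.05, max_depth=3, random_state=42
)
gbm.fit(X_train, y_train)
print(f"R^2 on held-out data: {gbm.score(X_test, y_test):.3f}")
```

Lowering `learning_rate` usually demands more trees; tuning the two together is the "careful hyperparameter tuning" mentioned above.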
Let X be the feature set with m samples and n features. Let y be the continuous response.
Parameters of the model are represented as:
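The formula itself appears to have been lost here; a plausible reconstruction, assuming the standard additive formulation of gradient boosting (the notation below, with `T` boosting rounds and learning rate ν, is mine, not the author's):

```latex
F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{m} L(y_i, \gamma),
\qquad
F_t(x) = F_{t-1}(x) + \nu\, h_t(x), \quad t = 1, \dots, T
```

where each \(h_t\) is a shallow regression tree fit to the pseudo-residuals of \(F_{t-1}\), and \(L\) is a differentiable loss over the continuous response \(y\).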
India is a melting pot of cultures, traditions, and values. The subcontinent of over a billion people houses its citizens across 36 States and Union Territories. The quality of education is affected not only by the presence of different educational boards but also by the different socio-economic factors prevalent in each region.
I grew up in a beautiful Christian brothers’ school in a quiet town. After completing school, I stayed in a couple of different places in India, which exposed me to diverse backgrounds and diverse schooling practices.
Just as any problem is solved most effectively by working at the grassroots level, I believe…
K-Nearest Neighbors (KNN) is a lazy learner: it does not build a model as such, it simply stores the training data and defers computation to prediction time. When the features have different ranges, scaling them is necessary before calculating distances, or the feature with the largest range will dominate.
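A quick sketch of scale-then-neighbor, using a scikit-learn pipeline on the Iris dataset (the dataset and `n_neighbors=5` are just illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Scale first so no feature dominates the Euclidean distance
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)  # "fitting" here only stores the (scaled) training data
print(f"Accuracy: {knn.score(X_test, y_test):.3f}")
```

The pipeline also guarantees the scaler is fit only on training data, so test-time distances use the same transformation without leakage.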
If, while using Linear Regression, you thought to yourself, “gosh, how can I use this for classification?”, you are reading the right article. Logistic Regression borrows the idea of a linear fit from Linear Regression and uses it to demarcate classes, handling multiple classes in an OvR (one-vs-rest) fashion. Since the required output is a probability, the model applies a sigmoid transformation to keep the output bounded within [0, 1]. The loss function also changes: instead of the continuous convex squared-error loss seen in Linear Regression, Logistic Regression minimizes log loss (binary cross-entropy).
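A small sketch of both ideas at once, assuming scikit-learn's `OneVsRestClassifier` wrapper and the Iris dataset (both my choices for illustration): one binary logistic model is fit per class, and each model's linear score passes through a sigmoid to land in (0, 1).

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)

# One binary logistic model per class (one-vs-rest)
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

def sigmoid(z):
    """Squash a linear score into the (0, 1) interval."""
    return 1.0 / (1.0 + np.exp(-z))

# Each per-class linear score becomes a bounded probability-like value
scores = clf.decision_function(X[:1])
print(sigmoid(scores))
```

Whichever class's sigmoid output is highest wins the OvR vote for that sample.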
This is often the first algorithm anybody learns when they step into the world of Machine Learning. The advantage of the model lies in its simplicity: a linear model has high bias and low variance. Normalizing the features also helps the algorithm converge faster when using gradient descent. Missing values need to be imputed or dropped in Linear Regression, and outliers distort the best-fit line, so they should be filtered using a boxplot (IQR rule) or another method.
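To make the outlier point concrete, here is a sketch on synthetic data (the data, the injected outliers, and the 1.5×IQR threshold are all my illustrative choices): the boxplot rule drops points beyond 1.5 times the interquartile range, after which an ordinary least-squares fit recovers the true line.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 3.0 * x + 5.0 + rng.normal(0, 1.0, 200)  # true slope 3, intercept 5
y[:5] += 50  # inject a few gross outliers

# Boxplot (IQR) rule: keep points within 1.5 * IQR of the quartiles
q1, q3 = np.percentile(y, [25, 75])
iqr = q3 - q1
mask = (y >= q1 - 1.5 * iqr) & (y <= q3 + 1.5 * iqr)

model = LinearRegression().fit(x[mask].reshape(-1, 1), y[mask])
print(model.coef_[0], model.intercept_)  # should land near 3 and 5
```

Without the mask, the five corrupted points drag the intercept upward; with it, the fit stays close to the generating line.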
“That’s the thing about books. They let you travel without moving your feet.” — Jhumpa Lahiri, The Namesake 
I belong to a family where someone is always poring over books, newspapers, or magazines. At school, we had a library and a library period dedicated to reading. However, much to the annoyance of my friends and family, I was not a child who was keen on reading.
During college, I realized the value of reading books outside the curriculum. It is important to open your mind to black holes and to understand the intricacies of human nature. …