Error Analysis for Tabular Data - Part-2 (Residual Analysis)

A residual is the difference between the recorded value of the dependent variable and its predicted value, computed for every row in a data set. Plotting the residuals against the predicted values is a time-honoured model assessment technique and a great way to see all your modelling results in two dimensions.
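
As a minimal sketch of that definition, the snippet below computes one residual per row and pairs it with the prediction, which gives the points for a residual plot. The sign convention (recorded minus predicted) and the sample values are assumptions made for illustration; they are not taken from the post.

```python
def residuals(observed, predicted):
    """Return observed - predicted for each row (assumed sign convention)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


# Hypothetical values, for illustration only.
observed = [3.0, 5.0, 7.5, 10.0]
predicted = [2.5, 5.5, 7.0, 11.0]

res = residuals(observed, predicted)

# Points for a residual plot: x = predicted value, y = residual.
plot_points = list(zip(predicted, res))
```

Feeding `plot_points` to any scatter-plot tool gives the two-dimensional view: a well-behaved model tends to show residuals scattered around zero with no visible pattern.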

Monotonic Constraints for Boosting Models

Boosting models like XGBoost and LightGBM have been some of the go-to models for tabular datasets. This is because these models perform well and can learn a variety of non-linear relationships within the data. However, in the real world, building a high-performing model can sometimes result in a non-intuitive one. There are also often requirements to incorporate domain knowledge into ML models.

ML Model Metric Credibility

"How confident are you that the model performance (on test data) is not below 0.75 AUC?" I came across this question in my work, and it inspired me to write this blog post.
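
One common way to approach a question like this is to bootstrap the test set. The sketch below is an assumption about how it could be done, not necessarily the method the post uses: it resamples the test rows with replacement, recomputes AUC each time, and reports the fraction of resamples whose AUC is at least the threshold. The function names `auc` and `prob_auc_at_least` are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def prob_auc_at_least(labels, scores, threshold, n_boot=1000, seed=0):
    """Fraction of bootstrap resamples whose AUC is >= threshold."""
    rng = random.Random(seed)
    n = len(labels)
    hits = 0
    used = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has only one class
        ss = [scores[i] for i in idx]
        used += 1
        if auc(ys, ss) >= threshold:
            hits += 1
    if used == 0:
        raise ValueError("no usable bootstrap resamples")
    return hits / used
```

Calling `prob_auc_at_least(y_test, y_score, 0.75)` then gives a rough, resampling-based answer to the question. It reflects sampling variability in the test set only, so it says nothing about data drift after deployment.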

MultiObjective Optimisation with Genetic Algorithm

For most real-world problems, we have to optimise two or more objective functions at once. For example, in a machine learning problem we may want to maximise accuracy while also increasing interpretability. Almost always, these objectives will conflict with each other.
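
When objectives pull against each other, there is usually no single best solution, only a set of trade-offs in which no candidate is beaten on every objective (the Pareto front). The sketch below illustrates that idea under the assumption that both objectives are maximised. The candidate scores are made up, and this is only the dominance check, not a genetic algorithm.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (accuracy, interpretability score) pairs, for illustration only.
candidates = [(0.90, 0.2), (0.85, 0.6), (0.80, 0.5), (0.70, 0.9), (0.60, 0.8)]

front = pareto_front(candidates)
# (0.80, 0.5) is dominated by (0.85, 0.6), and (0.60, 0.8) by (0.70, 0.9).
```

Multi-objective genetic algorithms typically evolve a population towards such a front, leaving the final choice among the surviving trade-offs to the practitioner.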

Up and running!

Hello there, this is my personal blog, where I note down my thoughts, both personal and professional. I plan to write about technology, the books I have read, the side projects I am working on, and so on. I also plan to share my ideas freely here, so that others can discuss them with me and contribute as well.
