Machine Learning Skills.
Data visualization using different types of plots and packages such as
matplotlib and seaborn.
Dimensionality reduction using PCA and t-SNE.
Text pre-processing by cleaning the data, removing stop words, stemming,
lemmatizing, converting text to numerical vectors, word embedding, and other
operations as required.
Converting text data into numerical vectors using BoW and TF-IDF with n-grams,
word2vec, average word2vec, and TF-IDF-weighted word2vec.
Splitting the data using random splitting or time-based splitting based on the
problem.
Finding the best values for hyper-parameters by applying grid search, random
search, Hyperopt, and Optuna, and optimization using gradient descent.
Applying cross-validation using K-fold and leave-one-out cross-validation.
Creating the appropriate models using a set of modelling algorithms such as:
K-Nearest Neighbours, Naïve Bayes, Logistic regression, Linear regression,
Support vector machine (classifier and regressor), Decision trees (classifier
and regressor).
Use of ensemble models (Bagging, Boosting, Stacking, and Cascading).
Clustering data using K-means (and K-medoids) clustering, Hierarchical
clustering, and Density-based clustering techniques.
Creating recommendation systems using content-based and collaborative filtering
approaches, including hybrid systems that combine both.