My Journey to Kaggle

Foreword

After studying the basics of data science, I think Kaggle is the right place for me to sharpen my machine learning and data science skills. With Kaggle, the hardest parts have already been done for you: collecting and cleaning the data, and defining the problem to be solved with it. Moreover, Kaggle offers me the opportunity to learn from brilliant and more experienced data scientists from all over the world.

It is always a good idea to have a solid foundation first. Consistent practice is the only way to improve your data science skills: the best way to get better at data science is to do it as often as you can. Consistent practice also means that your aim is learning, not focusing all your attention on winning.

Structure

Discuss

This section is dedicated to helping beginners like me get started.

Datasets

It is a good idea to use the ‘search’ feature to look up some of the standard datasets out there.

Notebooks

This is a cool feature that lets participants submit “Notebooks”, which are short scripts that explore a concept, showcase a technology, or even share a solution. You can get novel ideas by going through popular notebooks (formerly called kernels).

Competitions

Once you have worked with the Notebooks and the datasets, it’s a good idea to get into the competitions. Start with the ‘Getting Started’ category, which provides numerous guiding tutorials and simpler datasets.

  • Featured Competitions
  • Research Competitions
  • Non-traditional Competitions
  • Recruitment Competitions

Data Exploration

Data exploration is an indispensable first step in data science. It informs the decisions you will make during model training. In the process you will come to understand the different features and the statistical distribution of their values, and learn about null and missing values. Seaborn is a popular and highly recommended library for data exploration; it provides high-level functions to plot and visualize the data.

Tips

Start Simple

To get a feel for a real project, take your time to train and practice on simpler, more manageable datasets.

Solo

Working alone on a project helps you learn more because you are forced to tackle every step of the journey yourself, from exploratory analysis and handling dirty data to model training.

Incremental targets

  • Beat the benchmark solution.
  • Score in the top 50%.
  • Score in the top 25%.
  • Score in the top 10%.
  • Win one.