Skills every aspiring data scientist should master and demonstrate — when presenting oneself for potential data science opportunities

OCTAVE - John Keells Group
4 min read · Jan 26, 2023

Written by Kiruparan Balachandran, Manager Data Scientist at OCTAVE

The following are some of the most common questions I hear from aspiring data scientists:

  • How can I become a data scientist?
  • What are the skills required to become a data scientist?
  • How can I present myself to a recruiter as a potential candidate?

Of course, answering the second and third questions will answer the first question.

Hence, this series of articles highlights the critical skills an aspiring data scientist should learn, singles out the skills worth a deep dive, and shows how aspirants can present themselves for potential data science opportunities.

Pure Mathematics

Most programming languages provide libraries that let data scientists build machine learning models with little effort. Still, it is highly advisable to firm up your pure mathematics knowledge before stepping into data science. Aspirants should master the core areas of calculus (e.g., optimization algorithms such as gradient descent use derivatives to find a local minimum) and linear algebra (e.g., the weights and inputs of a neural network are represented as matrices and vectors and processed with linear algebra operations).
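To make the calculus example concrete, here is a minimal sketch of gradient descent on a single-variable function. The function `f`, the learning rate, and the step count are illustrative assumptions chosen for this example, not values from any particular library or project.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the derivative to approach a local minimum."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


def f(x):
    # Illustrative function with a single minimum at x = 3.
    return (x - 3) ** 2


def f_prime(x):
    # Derivative of f, obtained with basic calculus.
    return 2 * (x - 3)


x_min = gradient_descent(f_prime, x0=0.0)
print(round(x_min, 4), round(f(x_min), 4))
```

Each step moves `x` in the direction that decreases `f`, which is the idea behind the optimizers used to train most machine learning models.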

Image by Undrey on Freepik

Statistics

Like pure mathematics, statistics is another core skill every data scientist should master. Ideation in almost all advanced analytics projects is driven by descriptive statistics (e.g., mean, median, mode, variance, and standard deviation). Inferential statistics, on the other hand, helps you generalise from sample data to a larger population.
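As a small illustration of both ideas, the sketch below computes descriptive statistics for a made-up sample and then an approximate 95% confidence interval for the population mean. The data values are invented for the example, and the interval uses a normal approximation; for a sample this small, a t-distribution would usually be the more appropriate choice.

```python
import math
import statistics

# Hypothetical sample, e.g. daily order counts from eight stores.
sample = [12, 15, 11, 15, 18, 14, 15, 13]

# Descriptive statistics: summarise the sample itself.
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
variance = statistics.variance(sample)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(sample)      # sample standard deviation

# Inferential statistics: estimate a range for the population mean.
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
margin = z * std_dev / math.sqrt(len(sample))
ci_low, ci_high = mean - margin, mean + margin

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, std_dev={std_dev:.2f}")
print(f"approx. 95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```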

Image by pikisuperstar on Freepik

Machine Learning

Expertise in the above two topics makes this area much easier to explore. Task-driven learning (supervised learning), data-driven learning (unsupervised learning), and learning from errors (reinforcement learning) are the three main paradigms of machine learning. A deep dive into ensemble methods such as bagging and boosting will help you understand supervised learning in detail.
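To give a feel for bagging, here is a toy sketch written from scratch rather than with a machine learning library. It trains several one-split classifiers ("stumps") on bootstrap samples of a made-up one-dimensional dataset and combines them by majority vote. The function names and the data are assumptions made for this illustration; in practice you would typically use an established library implementation.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a one-split classifier that predicts 1 when x > threshold."""
    xs = sorted({x for x, _ in points})
    # Candidate thresholds: below the minimum, between neighbours, above the maximum.
    candidates = [xs[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    candidates += [xs[-1] + 1.0]

    def errors(threshold):
        return sum((x > threshold) != bool(y) for x, y in points)

    return min(candidates, key=errors)


def fit_bagging(points, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_estimators):
        sample = [rng.choice(points) for _ in points]
        stumps.append(fit_stump(sample))
    return stumps


def predict_bagging(stumps, x):
    """Combine the stumps by majority vote."""
    votes = Counter(int(x > threshold) for threshold in stumps)
    return votes.most_common(1)[0][0]


# Made-up training data: x from 0 to 10 in steps of 0.5, label 1 when x > 5.
train = [(i / 2, int(i / 2 > 5)) for i in range(21)]
model = fit_bagging(train)
print(predict_bagging(model, 2.0), predict_bagging(model, 8.0))
```

Boosting differs in that the models are trained one after another, with each new model focusing on the examples the previous ones got wrong.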

Programming Language

The previous three skills give a solid foundation in the conceptual aspects of data science; thereafter, you must master a programming language to compute descriptive statistics, carry out inferential statistics, and implement machine learning models. R is very popular among statisticians, and Python is the language most data science professionals prefer.

Big Data Processing

Python and R help you create proof-of-concept solutions with limited data, but in most advanced analytics projects you will end up processing millions of data points. So, it is highly recommended that you learn more about scalable machine learning libraries (e.g., Spark MLlib), storage formats that support big data processing (e.g., Parquet and Delta), and platforms (e.g., Databricks) that enable big data processing.

Problem Solving

The skills discussed so far are technical, and on their own they are not sufficient to play the data scientist’s role. You will also engage with clients and delivery teams to create advanced analytics solutions as part of your day-to-day work. By developing this skill, you should be able to convert a generalized problem statement into a more specific one and come up with actionable advanced analytics solutions.

Image by patcharin on Freepik

Data Visualisation

With the above skills, you can perform exploratory data analysis and build models. However, a data scientist’s role also includes presenting the model outcomes to project stakeholders. You have to match the complexity of your dashboards to their audience. Further, you must be able to decide how best to visualise the model insights and recommendations so that they are easily understood by, and useful to, the end user.

Story Telling

Like problem-solving, this is another non-technical skill the aspiring data scientist should master. It is frequently misinterpreted as a matter of language fluency. However, it is about how simply you can communicate complex advanced analytics solutions and outcomes to stakeholders. You may have to adapt both the content and the way you communicate it to suit the audience.

Image by vectorjuice on Freepik

Becoming a data scientist is a journey. If you aspire to become a better data scientist, do not limit your learning to what is explained above. Practice these skills further on different forums to sharpen your knowledge.

The next article will take a few of the above topics and explore them in more depth.

Written by OCTAVE - John Keells Group

OCTAVE, the John Keells Group Centre of Excellence for Data and Advanced Analytics, is the cornerstone of the Group’s data-driven decision making.
