How to get into data science

I recently attended a talk on “Introduction To Data Science and How You Can Get Started” by General Assembly in Singapore. While the speakers shared the kind of work they were doing, I felt that by the end of the seminar the question of how to get started had not been answered in much depth.

Here’s my take on how you can get into data science:

  1. Understand that data science is a spectrum across (in order of difficulty) business intelligence, business analytics and predictive modelling
    It can be as simple as downloading some data and using Excel to build dashboards (reporting and intelligence). You can take it a step further and establish some trends and correlations. Finally, for predictive modelling, there are tools like KXEN that allow businesses to build models through “black box” software.
  2. Pick up the tools of the trade: SQL, Python, statistics, Tableau, QlikView
    If you’re just starting your foray into data science, machine learning is probably a bit of a stretch. To begin with, learn how to extract, transform and manipulate data using SQL and Python. There are plenty of resources online, such as W3Schools or Codecademy.
  3. Get comfortable with large amounts of data
    Most people find millions of rows of data with hundreds of columns (also known as features or attributes) very frightening. There’s no getting around this; it simply takes getting used to. You can run your analysis on samples of the data first, and do SQL counts of the results before running the entire query, so that you know whether the result is roughly what you expect.
  4. Know that data is NEVER clean
    I’ve seen and used the data of several banks, insurers, travel companies and telecommunication companies, and one thing is common: the data is never clean. There will be missing values, inconsistent formats and extreme values outside the normal ranges, just to name a few. As with large amounts of data, this takes getting used to. Expect to spend a lot of time cleaning the data before you can even start working on it. One common mistake is to base an analysis on incorrect data, which results in an incorrect business decision.
  5. Learn to structure an analysis
    My methodology is to form a hypothesis based on research or observations, extract the data needed to test it, and define a possible actionable outcome from that analysis. I’ve made a career out of doing this!
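
To make steps 3 and 4 a bit more concrete, here’s a minimal sketch in Python using the built-in sqlite3 module. The `customers` table, its columns, the sample rows and the 0 to 120 age range are all made up for illustration; your own data will look different. The idea is simply to count rows before running anything heavy, and to count the usual data problems (missing values, inconsistent formats, extreme values) before trusting an analysis.

```python
import sqlite3

# Hypothetical in-memory table standing in for a much larger customer dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, age INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, 34, "SG"),
        (2, None, "SG"),   # missing value
        (3, 29, "sg"),     # inconsistent format
        (4, 212, "MY"),    # extreme value outside a plausible range
        (5, 41, "MY"),
    ],
)


def count_rows(conn, where_clause, params=()):
    """Return how many rows match a filter, without fetching the rows."""
    sql = f"SELECT COUNT(*) FROM customers WHERE {where_clause}"
    return conn.execute(sql, params).fetchone()[0]


# Step 3: check the size of a result before running the full query.
expected_sg = count_rows(conn, "UPPER(country) = ?", ("SG",))

# Step 4: count common data-quality problems.
missing_age = count_rows(conn, "age IS NULL")
extreme_age = count_rows(conn, "age NOT BETWEEN 0 AND 120")  # assumed range
mixed_case = count_rows(conn, "country <> UPPER(country)")

print("Singapore rows:", expected_sg)
print("Missing ages:", missing_age)
print("Implausible ages:", extreme_age)
print("Lower-case country codes:", mixed_case)
```

If any of these counts surprises you, that’s the cue to look at the data more closely before building anything on top of it.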

Data science is a really interesting field and will continue to play an important role in companies big and small. Even if it’s not a career that you wish to pursue, having some data-related skills will definitely give you an edge.
