Data Analytics is a vast field that draws on many tools, techniques and strategies. The basic idea of Data Analytics is to obtain insights by studying and querying data, often using several tools to perform various operations on it. Data analysis can honestly seem overwhelming at first; however, being a data analyst is a great profession for people who love to work with numbers, and the journey into learning data analysis doesn’t have to be stressful.
WeSoft has arranged an internal staff tutorial and workshop, presented and taught by Sourav Kundu, in which attendees can absorb the basic knowledge and techniques of Data Analytics at their own pace and along their own learning curve. It is an honest attempt to give our staff the confidence to launch their first simple, hands-on Data Analytics project. It covers the following topics:
— Session 1 —
Basic Tools: No matter what type of business insights you are trying to find by implementing a Data Analytics project, you are likely to be expected to know how to use the tools of data analysis. This means a statistical programming language, like R or Python, and a database querying language like SQL.
Basic Statistics: At least a basic understanding of statistics is vital for a data scientist. As a result of this training and workshop you should be familiar with statistical tests, distributions, regression, maximum likelihood estimators, etc. The same holds for machine learning, but one of the more important aspects of your statistics knowledge will be understanding when different techniques are (or aren’t) a valid approach. Statistics is important at all types of company, but especially at data-driven companies where the product itself is not data-focused and product stakeholders will depend on your help to make decisions and to design and evaluate experiments.
— Session 2 —
Basic process and techniques of working with data.
Data Munging: Often, the data you’re analyzing is going to be messy and difficult to work with. Because of this, it’s really important to know how to deal with imperfections in data. Some examples of data imperfections include missing values, inconsistent string formatting (e.g., ‘New York’ versus ‘new york’ versus ‘ny’), and inconsistent date formatting (‘2014-01-01’ vs. ‘01/01/2014’, Unix time vs. timestamps, etc.).
Data Visualization & Communication: Visualizing and communicating data is incredibly important, especially at young companies that are making data-driven decisions for the first time, or at companies where data scientists are viewed as people who help others make data-driven decisions.
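The imperfections listed above can be illustrated with a minimal Python sketch. It is not taken from the workshop material: the alias table, the list of accepted date layouts, and the function names clean_city and clean_date are all hypothetical and chosen only for this example. It also assumes that slash-separated dates are month-first and that bare numbers are Unix time in seconds (UTC).

```python
from datetime import datetime, timezone

# Hypothetical lookup table: lowercase spellings mapped to one canonical name.
CITY_ALIASES = {"new york": "New York", "ny": "New York"}

# Date layouts tried in order (assumption: slash dates are month/day/year).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def clean_city(raw):
    """Return a canonical city name, or None for a missing value."""
    if raw is None or not raw.strip():
        return None
    key = raw.strip().lower()
    # Spellings not in the alias table are passed through in title case.
    return CITY_ALIASES.get(key, raw.strip().title())


def clean_date(raw):
    """Return an ISO 'YYYY-MM-DD' string, or None if missing or unparseable."""
    if raw is None or raw == "":
        return None
    if isinstance(raw, (int, float)):
        # Numbers are treated as Unix time in seconds, interpreted in UTC.
        return datetime.fromtimestamp(raw, tz=timezone.utc).date().isoformat()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    rows = [("New York", "2014-01-01"), ("new york", "01/01/2014"),
            ("ny", 1388534400), ("", None)]
    for city, date in rows:
        print(clean_city(city), clean_date(date))
```

The first three sample rows all normalize to the same city and the same date, while the last row yields None for both fields, so missing values stay explicit instead of being silently filled in.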
— Session 3 —
Statistics for Data Scientists (includes Excel Basics and R Basics)
Introduction to R (or Python)
Data Analytics using R (or Python)
— Workshop —
Finally, the workshop consists of a “Proof of Concept” (POC): a small and simple Data Analytics project for learning hands-on how to get insights by structuring, studying and querying data, using statistical methods that can be programmed in the R language.
Brief Profile of Sourav Kundu
Sourav Kundu is an experienced and accomplished professional with over 20 years of IT experience, of which 12 years are in Data Analytics theory and practice. His recent applications are in the areas of Insurance Business Process Improvement, the Internet of Things and Healthcare, in general helping insurance companies discover business insights to achieve their business goals. His recent position was CIO of Cigna Hong Kong.
Sourav holds a Doctor of Engineering degree in Non-Linear Control Engineering, using Genetic Algorithms for non-linear control optimization. He has taught courses in Mechanical Engineering, Computer Science, Philosophy, Mathematics and Data Science at universities around the world, as a faculty member in Japan and Australia and as visiting faculty in the USA, India and elsewhere.