The difference between Data Analytics, Data Analysis, Data Mining, Data Science, Machine Learning, Data Engineer and Big Data


I have had cause to listen to a number of debates about the difference between Data Analytics, Data Analysis, Data Mining, Data Science, Machine Learning, Data Engineer and Big Data. If I had to summarize machine learning in one sentence, I would say it is a collection of algorithms and techniques used to design systems that learn from data.
Now, with the exception of big data, they are all subsets of mathematics. The names change to reflect the domain, or as a branding exercise to suggest that something different is being done when, at a fundamental level, it mostly is not.
Again, the algorithms of ML are very general: they usually have a strong mathematical and statistical basis that does not take domain knowledge or data pre-processing into account.
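To make the point about generality concrete, here is a minimal sketch of a system that "learns from data": an ordinary least-squares line fit written in plain Python. The function name fit_line and both small datasets are hypothetical, made up purely for illustration. The same routine is applied unchanged to two unrelated domains, because it knows nothing about what the numbers mean.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical marketing data: advertising spend vs. sales.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

# Hypothetical utility data: outdoor temperature vs. heating energy.
temps = [10.0, 20.0, 30.0]
energy = [50.0, 40.0, 30.0]

# The algorithm is identical in both cases; only the data changes.
sales_model = fit_line(ad_spend, sales)
energy_model = fit_line(temps, energy)
print(sales_model, energy_model)
```

Everything domain-specific (deciding which variables matter, cleaning the data, interpreting the fitted slope) happens outside fit_line, and that part is where the differently named roles mostly differ.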

Here are the key differences, seen through the backgrounds people bring to data science:

1. Software engineering.
There are many people in the technology industry doing data science at scale who would simply call themselves software engineers. There are also a significant number of engineers who transitioned internally into data-focused roles, and have invested significant time improving their statistics skills. I see quite a few people with this background who have math degrees.

2. Quantitative advanced degrees.
Many data scientists transitioned from an MS or PhD in Statistics, Electrical Engineering, Physics, Mechanical Engineering, Bioinformatics, Chemical Engineering, or similar into a data science role. These disciplines have strong common foundations and utilize many overlapping techniques.

3. Data analysis.
A growing number of data scientists worked first in data analysis and/or predictive modeling, and then picked up machine learning and improved the software engineering skills required to move into a data science role. These individuals also tend to have quantitative backgrounds.

Data Analysis, Data Mining, Machine Learning and Mathematical Modeling are tools: means towards an end. Analytics, Business Intelligence, Econometrics and Artificial Intelligence are application areas: domains that use the tools above (and others) to produce results within their subject. Among them, Analytics is probably the more generic term (i.e. not domain-specific).

