Over the last few years, more and more people have heard about data science. Since 2010 the popularity of data science has exploded. The number of data scientist job postings increased by 15,000% between 2010 and 2012. There are academic programs that train specialists in data science, and you can even earn a PhD in the field. Dozens of conferences on data science, big data and AI are held annually. There were several reasons for this interest:
- First, the need to analyze the growing volume of data collected by corporations and governments.
- Second, the falling price of computational hardware, improvements in computational software, and the emergence of new data science methods.
The popularity of social networks and online services revealed a huge potential for monetization, to be unlocked through new products and an information advantage over competitors. Big companies started to form teams of people to analyze the data they collected.
What is Data Science?
- First, Data Science is the study of data and their properties. It focuses on discovering from the data which questions to ask, rather than starting from a fixed question.
- Second, Data Science is a multidisciplinary field of study that combines domain expertise, programming skills, and knowledge of math and statistics to extract meaningful insights from data.
- Third, models in Data Science are not static. They are constantly tested, updated and improved over time, which makes it possible to obtain actionable information from data and not just historical facts.
Data Science tasks require knowledge and experience in several areas. The collection, analysis, interpretation, presentation and organization of data require a solid background in statistics, linear algebra and calculus. Experience in computer science and software development is necessary for data manipulation and processing. Domain expertise (in business, philology, medicine, biology, and so on) gives an understanding of the problems that need to be solved and of the kinds of data available in the domain. Written and verbal communication skills are also in high demand in Data Science. Finding an expert in all of these areas is about as likely as finding a unicorn. Therefore, the most successful strategy is to build a Data Science team of specialists in different areas.
With a mature Data Science team and a high level of digitalization, an organization can carry out the main Data Science activities: description, discovery, prediction and advice. These activities are listed in order of increasing implementation complexity. At the first stage, the Data Science team uses basic analytic functions and refines raw data in order to describe it. An example is describing the distribution of customers by location. At the next stage, it becomes possible to discover hidden relationships or patterns in the data, for example groups of customers with similar purchasing behavior. Then, past observations can be used to predict future ones. A typical case at this stage is predicting which products certain customer groups are more likely to purchase. The last and most difficult stage is to define the possible decisions, optimize over them, and advise the decision that gives the best outcome. Each of these stages can improve an organization's activity, but a company gains the most from the Predict and Advise stages.
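The four stages can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the purchase log, the customer names, the `MARGIN` table and all function names are hypothetical, the greedy basket-overlap grouping is a crude stand-in for a real clustering method, and the peer-frequency score is a stand-in for a real predictive model.

```python
from collections import Counter

# Hypothetical purchase log: (customer, location, product).
PURCHASES = [
    ("ann", "Berlin", "coffee"), ("ann", "Berlin", "filter"),
    ("bob", "Berlin", "coffee"), ("bob", "Berlin", "filter"),
    ("eve", "Berlin", "coffee"),
    ("cat", "Paris", "tea"), ("cat", "Paris", "kettle"),
    ("dan", "Paris", "tea"),
]
# Hypothetical profit per sold unit, used only in the advise step.
MARGIN = {"coffee": 2.0, "filter": 3.0, "tea": 1.5, "kettle": 12.0}


def baskets(purchases):
    """Map each customer to the set of products they bought."""
    result = {}
    for customer, _, product in purchases:
        result.setdefault(customer, set()).add(product)
    return result


def describe(purchases):
    """Describe: number of distinct customers per location."""
    location_of = {customer: location for customer, location, _ in purchases}
    return Counter(location_of.values())


def discover(purchases):
    """Discover: greedily group customers whose baskets overlap."""
    groups = []  # each entry is (set of members, set of their products)
    for customer, basket in sorted(baskets(purchases).items()):
        for members, products in groups:
            if products & basket:
                members.add(customer)
                products |= basket  # in-place update of the group's products
                break
        else:
            groups.append(({customer}, set(basket)))
    return [members for members, _ in groups]


def predict(purchases):
    """Predict: score products that group peers bought but the customer did not."""
    bought = baskets(purchases)
    predictions = []  # (customer, product, share of peers who bought it)
    for members in discover(purchases):
        for customer in sorted(members):
            peers = members - {customer}
            if not peers:
                continue
            counts = Counter(p for peer in peers for p in bought[peer])
            for product, n in sorted(counts.items()):
                if product not in bought[customer]:
                    predictions.append((customer, product, n / len(peers)))
    return predictions


def advise(purchases, margin):
    """Advise: choose the one recommendation with the highest expected margin."""
    options = [(prob * margin[product], customer, product)
               for customer, product, prob in predict(purchases)]
    return max(options) if options else None


if __name__ == "__main__":
    print(describe(PURCHASES))
    print(discover(PURCHASES))
    print(predict(PURCHASES))
    print(advise(PURCHASES, MARGIN))
```

On this toy data the sketch finds two customer groups, predicts that `eve` is likely to buy a filter and `dan` a kettle, and advises promoting the kettle to `dan` because of its higher assumed margin. Each function builds on the previous one, which mirrors the increasing complexity of the stages.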
The benefits of data science depend on the industry and on the organization's goals. For most organizations, however, the common advantages of adopting data science are improved operational efficiency and workflow, and better data-driven decision making. Today, Data Science is evolving rapidly as it touches the main aspects of our lives: healthcare, financial trading, travel, energy management, social media, e-commerce, human resources, fraud detection, etc. The coming years will bring a deeper integration of Data Science into most domains of business, science and daily life.