Data science (DS) is in high demand in almost every industry. Because data plays an
important role in business-critical decisions, DS has become a must-have
function in most organizations.
Data science (DS) is no longer a new term, even for a layperson.
Today, industries of all kinds benefit from DS applications that
were unimaginable a few decades ago but are now possible thanks to artificial
intelligence (AI). AI is not a new concept; it has been around for at least
60-70 years. However, scientific and technology communities could only make
full use of AI after advances in computing and related technologies.
DS uses historical data to predict the class, numerical value
or cluster/group of a given sample. Examples include predicting flight
departure times, employee churn and loan defaults, detecting fraud, forecasting
the weather and recommending products online.
DS relies heavily on machine learning (ML), which is a branch of
AI. ML can be divided into two main categories: supervised ML and unsupervised
ML. Although both rely on historical data, the former requires data with a
response variable (the value a model predicts), also called tagged or labeled data,
while the latter works on data without a response variable, i.e. data without a
label or tag. There is a third kind of ML, known as semi-supervised learning,
which requires only a portion of the data to be tagged or labeled. (Read on to know more about Machine Learning)
Supervised ML mainly consists of classification, regression and
forecast models. Classification models predict the class of an object,
situation or person (for example: male or female, rain or no rain).
Regression models estimate numerical values such as age or salary.
Forecast models are mainly time series models that estimate numerical values,
as regression models do, but for future points in time (example: share price).
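To make the difference between classification and regression concrete, here is a minimal sketch in plain Python. It is only an illustration: the toy data, the function names `knn_classify` and `fit_line`, and the choice of K-NN and one-variable least squares are assumptions made for this example, not a recommended setup.

```python
import math
from collections import Counter

# Toy labeled data, made up for illustration:
# (hours of cloud cover, humidity in percent) -> label
labeled_weather = [
    ((1.0, 30.0), "no rain"),
    ((2.0, 35.0), "no rain"),
    ((1.5, 40.0), "no rain"),
    ((7.0, 85.0), "rain"),
    ((8.0, 90.0), "rain"),
    ((6.5, 80.0), "rain"),
]

def knn_classify(sample, training_data, k=3):
    """Classification: majority vote among the k nearest labeled points."""
    by_distance = sorted(training_data, key=lambda row: math.dist(sample, row[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def fit_line(xs, ys):
    """Regression: least-squares line for one feature, returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

# Toy regression data: years of experience -> salary (in thousands)
years = [1, 2, 3, 4, 5]
salary = [32, 34, 36, 38, 40]

predicted_class = knn_classify((7.5, 88.0), labeled_weather)  # a class label
slope, intercept = fit_line(years, salary)
predicted_salary = slope * 6 + intercept                      # a number
print(predicted_class, predicted_salary)
```

The classifier returns one of the labels seen in the training data, while the regression returns a number on a continuous scale. In practice the features would usually be scaled before computing distances, which this sketch skips.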
Unsupervised ML builds models with algorithms such as
clustering and Principal Component Analysis (PCA). Clustering divides data into
clusters based on similarities within a cluster and
dissimilarities between clusters. PCA reduces the number of variables by
projecting the data onto a few principal components, the directions that
capture most of its variance, which makes similar data points easier to spot.
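As a rough illustration of how clustering groups unlabeled points, here is a small K-means sketch in plain Python. The data is made up, the function name `kmeans` is hypothetical, and the start from the first k points is a simplification chosen to keep the example deterministic; real implementations usually pick starting centroids more carefully.

```python
import math

def kmeans(points, k, iterations=10):
    """K-means with a simple deterministic start: the first k points."""
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters

# Unlabeled toy data: two visibly separate groups, no response variable.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Note that no labels are passed in: the algorithm only sees the coordinates and discovers the two groups from the distances between points.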
Some well-known models are listed below:
Classification: Random Forest, Deep Learning (Deep Neural
Networks, RNN, LSTM), Decision Tree, K-NN, Naive Bayes, Logistic Regression,
Ensemble Methods (boosting and bagging), Support Vector Machines (SVM),
Recommender System, etc.
Regression: Linear Regression and some of the algorithms
mentioned under classification
Forecast: ARIMA, Exponential Smoothing Models, etc.
Clustering: K-means, Hierarchical Clustering, mixture models,
etc.
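Of the forecast models listed above, simple exponential smoothing is one of the easiest to write out by hand. The sketch below assumes the simplest variant (no trend, no seasonality); the function name, the price series and the smoothing factor `alpha=0.5` are made up for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed levels; the last level is the one-step-ahead forecast."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # start the level at the first observation
    levels = [level]
    for observed in series[1:]:
        # New level = weighted average of the latest value and the old level.
        level = alpha * observed + (1 - alpha) * level
        levels.append(level)
    return levels

prices = [100.0, 102.0, 101.0, 105.0]  # hypothetical daily share prices
levels = simple_exponential_smoothing(prices, alpha=0.5)
forecast = levels[-1]  # forecast for the next, unseen day
print(levels, forecast)
```

A larger `alpha` makes the forecast react faster to recent values, while a smaller one smooths out short-term noise.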
Applications and Benefits of using data science:
DS applications offer numerous advantages across
industries. Ten of them are listed below:
1. Predicting future possibilities to come up with successful
strategies
2. Increased security through fraud and anomaly detection
3. Reduced errors through anomaly detection
4. Increased efficiency through automation using DS
5. Reduced Turn-Around-Time through automation using DS
6. Enhanced customer experience by learning about their behavior
through DS applications
7. Streamlined business processes through automation of data
collection and analysis using DS
8. Risk analysis and management
9. Personalized services to customers using DS-based recommender
systems
10. Share-market trading using DS algorithms
The above list is indicative, not exhaustive.
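Several items in the list above mention anomaly detection. One very simple way to do it, sketched here under the assumption that the data is a single numeric column, is to flag values that lie far from the mean in units of standard deviation (a z-score). The transaction amounts, the function name `find_anomalies` and the threshold of 2.0 are made up for illustration.

```python
import statistics

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical transaction amounts with one suspicious entry.
transactions = [20, 22, 19, 21, 23, 20, 18, 22, 21, 500]
flagged = find_anomalies(transactions)
print(flagged)
```

With such a small sample a single extreme value inflates the standard deviation, so the threshold here is kept at 2.0 rather than the often-quoted 3.0. Production fraud systems typically use far richer features and models than this.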
DS works well only when the data is useful; otherwise it
can be a complete waste of effort and time. There is a famous saying in the data
science community: "Garbage in, garbage out." If you feed useless data
to your algorithm, you can only expect a useless model that may not help you
make the right decision. (More on this...)
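One practical way to act on "garbage in, garbage out" is to screen records before any model sees them. The following is a minimal sketch, assuming records arrive as dictionaries; the field names, the sample rows and the function `clean_records` are hypothetical, and real projects usually need many more checks (ranges, types, outliers).

```python
def clean_records(records, required_fields):
    """Split records into usable rows and rejected rows with a reason."""
    seen = set()
    clean, rejected = [], []
    for record in records:
        # A field counts as missing if it is absent, None or an empty string.
        missing = [f for f in required_fields if record.get(f) in (None, "")]
        key = tuple(record.get(f) for f in required_fields)
        if missing:
            rejected.append((record, "missing: " + ", ".join(missing)))
        elif key in seen:
            rejected.append((record, "duplicate"))
        else:
            seen.add(key)
            clean.append(record)
    return clean, rejected

# Hypothetical loan data with a missing value, a duplicate and a blank label.
records = [
    {"customer_id": 1, "age": 34, "defaulted": False},
    {"customer_id": 2, "age": None, "defaulted": True},
    {"customer_id": 1, "age": 34, "defaulted": False},
    {"customer_id": 3, "age": 51, "defaulted": ""},
]
clean, rejected = clean_records(records, ["customer_id", "age", "defaulted"])
for record, reason in rejected:
    print(record["customer_id"], reason)
```

Keeping the rejected rows together with a reason, instead of silently dropping them, makes it easier to see how much of the data was unusable and why.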
The following are some of the main industries that have benefited a
great deal from DS applications:
Airlines, e-commerce, banking, financial services, insurance,
healthcare, cybersecurity, social media, etc.
