Data mining refers to extracting or mining knowledge from large amounts of data. In other words, data mining is the science, art, and technology of exploring large and complex bodies of data in order to discover useful patterns. Theoreticians and practitioners are continually seeking improved techniques to make the process more efficient, cost-effective, and accurate. Many other terms carry a similar or slightly different meaning, such as knowledge mining from data, knowledge extraction, data/pattern analysis, and data dredging.

Some treat data mining as a synonym for another popularly used term, Knowledge Discovery from Data, or KDD, while others view data mining as simply an essential step in the process of knowledge discovery, in which intelligent methods are applied in order to extract data patterns.

Gregory Piatetsky-Shapiro coined the term "Knowledge Discovery in Databases" in 1989. However, the term "data mining" became more popular in the business and press communities. Currently, the terms Data Mining and Knowledge Discovery are used interchangeably.

Nowadays, data mining is used in almost all places where a large amount of data is stored and processed.

**Knowledge Discovery From Data Consists of the Following Steps:**

- Data cleaning (to remove noise or irrelevant data).
- Data integration (where multiple data sources may be combined).
- Data selection (where data relevant to the analysis task are retrieved from the database).
- Data transformation (where data are transformed or consolidated into forms appropriate for mining, for example by performing summary or aggregation functions).
- Data mining (an important process where intelligent methods are applied in order to extract data patterns).
- Pattern evaluation (to identify the interesting patterns representing knowledge, based on some interestingness measures).
- Knowledge presentation (where knowledge representation and visualization techniques are used to present the mined knowledge to the user).
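
As a concrete (toy) sketch, the steps above can be walked through in a few lines of Python; the records and field names here are invented for illustration:

```python
# Toy walk-through of the KDD steps on a small list of sales records.
raw = [
    {"item": "milk", "qty": 2},
    {"item": "milk", "qty": None},   # noisy record: missing quantity
    {"item": "bread", "qty": 1},
    {"item": "milk", "qty": 3},
]

# Data cleaning: drop records with missing values.
cleaned = [r for r in raw if r["qty"] is not None]

# Data selection: keep only the attributes relevant to the task.
selected = [(r["item"], r["qty"]) for r in cleaned]

# Data transformation: consolidate by aggregation (total quantity per item).
totals = {}
for item, qty in selected:
    totals[item] = totals.get(item, 0) + qty

# "Mining" and evaluation, trivially: the best-selling item is the extracted pattern.
top_item = max(totals, key=totals.get)
print(top_item, totals)   # knowledge presentation, in its simplest form
```

Real pipelines replace each step with far richer machinery, but the flow from raw data to presented pattern is the same.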

We now discuss the different data mining techniques used to produce the desired output.

**Data Mining Techniques**

### 1. Association

Association analysis is the discovery of association rules showing attribute-value conditions that frequently occur together in a given set of data. Association analysis is widely used for market basket or transaction data analysis, and association rule mining is a significant and exceptionally active area of data mining research. One method of association-based classification, called associative classification, consists of two steps. In the first step, association rules are generated using a modified version of the standard association rule mining algorithm known as Apriori. The second step constructs a classifier based on the association rules discovered.
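
A minimal sketch of the Apriori support-counting idea (the baskets and the 0.5 support threshold are invented for illustration):

```python
def frequent_itemsets(transactions, min_support):
    """Minimal Apriori-style miner: count support level by level, keeping
    only itemsets that meet the minimum support threshold."""
    n = len(transactions)
    frequent = {}
    k = 1
    candidates = [frozenset([i]) for i in sorted({i for t in transactions for i in t})]
    while candidates:
        support = {c: sum(1 for t in transactions if c <= t) / n for c in candidates}
        level = {c: s for c, s in support.items() if s >= min_support}
        frequent.update(level)
        # Join step: combine surviving k-itemsets into (k+1)-itemset candidates.
        keys = list(level)
        candidates = list({a | b for a in keys for b in keys if len(a | b) == k + 1})
        k += 1
    return frequent

baskets = [frozenset(t) for t in ({"milk", "bread"}, {"milk", "bread", "eggs"},
                                  {"bread", "eggs"}, {"milk", "eggs"})]
frequent = frequent_itemsets(baskets, min_support=0.5)
```

Because every superset of an infrequent itemset is itself infrequent, Apriori can prune candidates level by level; here {milk, bread, eggs} never survives because it appears in only one of four baskets.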

### 2. Classification

Classification is the process of finding a set of models (or functions) that describe and distinguish data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown. The derived model is based on the analysis of a set of training data (i.e., data objects whose class label is known). The derived model may be represented in various forms, such as classification (if-then) rules, decision trees, and neural networks. Data mining offers several types of classifiers:

- Decision Tree
- SVM(Support Vector Machine)
- Generalized Linear Models
- Bayesian Classification
- Classification by Backpropagation
- K-NN Classifier
- Rule-Based Classification
- Frequent-Pattern Based Classification
- Rough set theory
- Fuzzy Logic

**Decision Trees:** A decision tree is a flow-chart-like tree structure, where each internal node represents a test on an attribute value, each branch denotes an outcome of the test, and leaves represent classes or class distributions. Decision trees can be easily converted into classification rules. Decision tree induction is a nonparametric approach to building classification models: it does not require any prior assumptions about the probability distribution of the class and other attributes. Decision trees, especially small ones, are relatively easy to interpret, and their accuracy is comparable to that of other classification techniques on many simple data sets. They provide an expressive representation for learning discrete-valued functions; however, they do not generalize well to certain types of Boolean problems.

As an example, a decision tree built on the Iris data set from the UCI machine learning repository separates the three class labels present in the data: Setosa, Versicolor, and Virginica.
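
A hand-built tree (with invented attributes and values, not the Iris data) shows both the flow-chart structure and how easily it converts to if-then rules:

```python
# A hand-built decision tree as nested dicts; attributes and values are invented.
tree = {"attr": "outlook", "branches": {
    "sunny":    {"attr": "humidity", "branches": {"high": "no", "normal": "yes"}},
    "overcast": "yes",
    "rain":     "yes",
}}

def classify(node, sample):
    """Follow attribute tests from the root until a leaf (class label) is reached."""
    while isinstance(node, dict):
        node = node["branches"][sample[node["attr"]]]
    return node

def to_rules(node, conds=()):
    """Flatten the tree into if-then classification rules, one per root-to-leaf path."""
    if not isinstance(node, dict):
        return [(conds, node)]
    rules = []
    for value, child in node["branches"].items():
        rules += to_rules(child, conds + ((node["attr"], value),))
    return rules
```

Each root-to-leaf path becomes one rule; for instance the path sunny/high yields "if outlook = sunny and humidity = high then class = no".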

**Support Vector Machine (SVM) Classifier Method:** The Support Vector Machine is a supervised learning method used for classification and also for regression. When the output of the support vector machine is a continuous value, the method is said to perform regression; when it predicts a class label for the input object, it performs classification. The independent variables may or may not be quantitative. Kernel functions transform data that are linearly non-separable in one domain into another domain where the instances become linearly separable. Kernel functions may be linear, quadratic, Gaussian, or anything else that achieves this purpose. A linear classifier is one that bases its decision on a linear function of its inputs. Applying a kernel function arranges the data instances within a multi-dimensional space so that there is a hyperplane separating instances of one class from those of another. The advantage of Support Vector Machines is that they can use certain kernels to transform the problem, so that linear classification techniques can be applied to nonlinear data. Once the data can be divided into two classes, the aim is to find the best hyperplane separating the two kinds of instances.
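
The kernel idea can be illustrated with an explicit quadratic feature map on made-up 1-D points; a real SVM would also learn the separating hyperplane from the data rather than having it fixed by hand:

```python
# In one dimension the classes interleave (outer, inner, inner, outer),
# so no single threshold on x can separate them.
points = {-3.0: "outer", -1.0: "inner", 1.0: "inner", 3.0: "outer"}

def phi(x):
    """Quadratic feature map: lift x into the 2-D point (x, x^2)."""
    return (x, x * x)

def predict(x):
    # In the lifted space, the hyperplane x2 = 5 separates the two classes.
    return "outer" if phi(x)[1] > 5 else "inner"
```

With a Gaussian kernel the same trick works implicitly, without ever computing the lifted coordinates.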

**Generalized Linear Models:** Generalized Linear Models (GLM) are a statistical technique for linear modeling. GLM provides extensive coefficient statistics and model statistics, as well as row diagnostics. It also supports confidence bounds.

**Bayesian Classification:** A Bayesian classifier is a statistical classifier. It can predict class membership probabilities, for instance, the probability that a given sample belongs to a particular class. Bayesian classification is based on Bayes' theorem. Studies comparing classification algorithms have found a simple Bayesian classifier, known as the naive Bayesian classifier, to be comparable in performance with decision tree and neural network classifiers. Bayesian classifiers have also shown high accuracy and speed when applied to large databases. Naive Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption, termed class conditional independence, is made to simplify the calculations involved, and in this sense is considered "naive". Bayesian belief networks are graphical models that, unlike naive Bayesian classifiers, allow the representation of dependencies among subsets of attributes. Bayesian belief networks can also be used for classification.
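
A toy naive Bayesian classifier over invented categorical weather records: it multiplies the class prior by per-attribute likelihoods under the class-conditional-independence assumption (no smoothing, so unseen values zero out a class):

```python
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (attribute dict, class label) pairs."""
    priors = Counter(label for _, label in samples)
    cond = defaultdict(Counter)            # (label, attribute) -> value counts
    for feats, label in samples:
        for attr, val in feats.items():
            cond[(label, attr)][val] += 1
    return priors, cond

def predict(model, feats):
    priors, cond = model
    total = sum(priors.values())
    best, best_p = None, -1.0
    for label, count in priors.items():
        p = count / total                  # P(class)
        for attr, val in feats.items():
            # Class conditional independence: multiply per-attribute likelihoods.
            p *= cond[(label, attr)][val] / count
        if p > best_p:
            best, best_p = label, p
    return best

data = [({"outlook": "sunny", "windy": "no"},  "play"),
        ({"outlook": "sunny", "windy": "yes"}, "play"),
        ({"outlook": "rain",  "windy": "yes"}, "stay"),
        ({"outlook": "rain",  "windy": "no"},  "play"),
        ({"outlook": "rain",  "windy": "yes"}, "stay")]
model = train(data)
```
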

**Classification by Backpropagation:** Backpropagation learns by iteratively processing a set of training samples, comparing the network's prediction for each sample with the actual known class label. For each training sample, the weights are modified to minimize the mean squared error between the network's prediction and the actual class. These modifications are made in the "backward" direction, i.e., from the output layer through each hidden layer down to the first hidden layer (hence the name backpropagation). Although convergence is not guaranteed, in general the weights eventually converge and the learning process stops.
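
A single backward update for one sigmoid unit under the squared-error delta rule (toy numbers, one sample) shows the error shrinking after the weight change:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def forward(w, b, x):
    return sigmoid(w * x + b)

w, b = 0.5, 0.0          # initial weight and bias (arbitrary)
x, y = 1.0, 1.0          # one training sample: input and target class (as 1.0)
lr = 0.5                 # learning rate

out = forward(w, b, x)
err_before = (y - out) ** 2

# Gradient of the squared error w.r.t. w and b (the delta rule).
delta = (out - y) * out * (1 - out)
w -= lr * delta * x
b -= lr * delta

err_after = (y - forward(w, b, x)) ** 2
```

A full network repeats exactly this step per layer, propagating the deltas backward from the output layer through each hidden layer.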

**K-Nearest Neighbor (K-NN) Classifier Method:** The k-nearest neighbor (K-NN) classifier is an example-based classifier: the training documents themselves are used for comparison, rather than an explicit class representation such as the class profiles used by other classifiers. As such, there is no real training phase. When a new document has to be classified, the k most similar documents (neighbors) are found, and if a large enough proportion of them are assigned to a particular class, the new document is also assigned to that class; otherwise it is not. Additionally, finding the nearest neighbors can be sped up using traditional search strategies.
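
A minimal K-NN classifier on invented 2-D points: there is no training phase, just a majority vote among the k closest stored examples:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))  # squared Euclidean
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
```
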

**Rule-Based Classification:** Rule-based classification represents the knowledge in the form of if-then rules. A rule is assessed according to its accuracy and coverage. If more than one rule is triggered, conflict resolution is needed; it can be performed using three different strategies: size ordering, class-based ordering, and rule-based ordering. Rule-based classifiers have several advantages:

- Rules are easier to understand than a large tree.
- Rules are mutually exclusive and exhaustive.
- Each attribute-value pair along a path forms a conjunction, and each leaf holds the class prediction.
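
A sketch of if-then rule firing with size ordering for conflict resolution (the rules and attributes are invented):

```python
# Each rule is (conditions, class). Both rules below fire for a sunny, humid day;
# size ordering resolves the conflict by preferring the more specific rule.
rules = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "sunny"}, "yes"),
]

def rule_classify(rules, sample, default="yes"):
    ordered = sorted(rules, key=lambda r: len(r[0]), reverse=True)  # size ordering
    for conds, cls in ordered:
        if all(sample.get(attr) == val for attr, val in conds.items()):
            return cls          # first matching rule wins
    return default              # fallback when no rule fires
```
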

**Frequent-Pattern Based Classification:** Frequent pattern discovery (also called FP discovery, FP mining, or frequent itemset mining) is part of data mining. It is the task of finding the most frequent and relevant patterns in large datasets. The idea was first proposed for mining transaction databases. Frequent patterns are subsets (itemsets, subsequences, or substructures) that appear in a data set with a frequency no less than a user-specified or automatically determined threshold.

**Rough Set Theory:** Rough set theory can be used for classification to discover structural relationships within imprecise or noisy data. It applies to discrete-valued features, so continuous-valued attributes must be discretized before use. Rough set theory is based on the establishment of equivalence classes within the given training data: all the data samples forming an equivalence class are indiscernible, that is, identical with respect to the attributes describing the data. Rough sets can also be used for feature reduction (identifying and removing attributes that do not contribute to the classification of the given training data) and relevance analysis (assessing the contribution or significance of each attribute with respect to the classification task). The problem of finding the minimal subsets (reducts) of attributes that can describe all the concepts in the given data set is NP-hard; however, algorithms that reduce the computation have been proposed. In one method, for example, a discernibility matrix stores the differences between attribute values for each pair of data samples. Rather than searching the entire training set, the matrix is searched to detect redundant attributes.
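
The indiscernibility idea can be sketched by partitioning invented samples into equivalence classes with respect to a chosen attribute subset:

```python
from collections import defaultdict

def equivalence_classes(samples, attrs):
    """Group sample ids that are indiscernible w.r.t. the chosen attributes."""
    classes = defaultdict(list)
    for s in samples:
        classes[tuple(s[a] for a in attrs)].append(s["id"])
    return dict(classes)

samples = [
    {"id": 1, "color": "red",  "size": "big"},
    {"id": 2, "color": "red",  "size": "big"},
    {"id": 3, "color": "blue", "size": "big"},
]

# With both attributes, samples 1 and 2 are indiscernible; 3 stands apart.
groups = equivalence_classes(samples, ["color", "size"])
# Dropping "color" merges all three into one class, so "color" carries information.
coarse = equivalence_classes(samples, ["size"])
```

Comparing the partitions induced by different attribute subsets is the basis for finding reducts.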

**Fuzzy Logic:** Rule-based systems for classification have the disadvantage that they involve sharp cut-offs for continuous attributes. Fuzzy logic is valuable for data mining systems performing grouping/classification because it provides the benefit of working at a high level of abstraction. In general, the use of fuzzy logic in rule-based systems involves the following:

- Attribute values are changed to fuzzy values.
- For a given new example, more than one fuzzy rule may apply. Every applicable rule contributes a vote for membership in the categories, and typically the truth values for each predicted category are summed.
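
A sketch of the two steps above with an invented "income" attribute: crisp values are fuzzified, then each applicable rule votes for a category:

```python
def fuzzy_income(x):
    """Map a crisp income value (in $1000s; the ranges are invented) to fuzzy
    memberships instead of applying a single sharp cut-off."""
    low = max(0.0, min(1.0, (50 - x) / 30))
    high = max(0.0, min(1.0, (x - 30) / 30))
    return {"low": low, "high": high}

def classify(x):
    m = fuzzy_income(x)
    # Both rules may fire partially; each contributes a vote, and the truth
    # values per category are summed (here: one rule per category).
    votes = {"reject": m["low"], "approve": m["high"]}
    return max(votes, key=votes.get)
```

An income of 45 is partly "low" (about 0.17) and partly "high" (0.5) at the same time, which a sharp threshold at, say, 40 could not express.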

### 3. Prediction

Data prediction is a two-step process, similar to data classification. However, for prediction we do not use the term "class label attribute", because the attribute for which values are being predicted is continuous-valued (ordered) rather than categorical (discrete-valued and unordered). The attribute can be referred to simply as the predicted attribute. Prediction can be viewed as the construction and use of a model to assess the class of an unlabeled object, or to assess the value or value ranges of an attribute that a given object is likely to have.

### 4. Clustering

Unlike classification and prediction, which analyze class-labeled data objects or attributes, clustering analyzes data objects without consulting an identified class label. In general, the class labels do not exist in the training data simply because they are not known to begin with. Clustering can be used to generate these labels. The objects are clustered based on the principle of maximizing the intra-class similarity and minimizing the inter-class similarity. That is, clusters of objects are created so that objects inside a cluster have high similarity to each other, but are very dissimilar to objects in other clusters. Each cluster that is generated can be seen as a class of objects, from which rules can be inferred. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that groups similar events together.
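
A minimal 1-D k-means sketch (the points and starting centers are invented) that groups unlabeled values by iteratively reassigning them to the nearest center:

```python
def kmeans_1d(points, centers, iters=10):
    """Tiny 1-D k-means: assign each point to its nearest center,
    then move each center to the mean of its cluster."""
    for _ in range(iters):
        clusters = {c: [] for c in centers}
        for p in points:
            nearest = min(centers, key=lambda c: abs(c - p))
            clusters[nearest].append(p)
        centers = [sum(members) / len(members)
                   for members in clusters.values() if members]
    return sorted(centers)

points = [1, 2, 3, 10, 11, 12]             # two obvious groups, but no labels given
centers = kmeans_1d(points, [1.0, 12.0])   # starting guesses, chosen by hand
```

The two final centers (2.0 and 11.0) can then serve as generated class labels for the points nearest to each.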

### 5. Regression

Regression can be defined as a statistical modeling method in which previously obtained data are used to predict a continuous quantity for new observations. This classifier is also known as the continuous-value classifier. There are two types of regression models: linear regression and multiple linear regression.
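
A least-squares linear regression fit, computed in closed form on invented points that lie exactly on y = 2x + 1:

```python
def linear_fit(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]       # exactly y = 2x + 1
slope, intercept = linear_fit(xs, ys)
prediction = slope * 5 + intercept         # predict for a new observation x = 5
```

Multiple linear regression extends the same idea to two or more input attributes.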

### 6. Artificial Neural Network (ANN) Classifier Method

An artificial neural network (ANN), also referred to simply as a "neural network" (NN), is a computational model based on biological neural networks. It consists of an interconnected collection of artificial neurons. A neural network is a set of connected input/output units in which each connection has a weight associated with it. During the learning phase, the network learns by adjusting the weights so as to predict the correct class label of the input samples. Neural network learning is also called connectionist learning because of the connections between units. Neural networks involve long training times and are therefore more appropriate for applications where this is feasible. They require a number of parameters that are typically best determined empirically, such as the network topology or "structure". Neural networks have been criticized for their poor interpretability, since it is difficult for humans to interpret the symbolic meaning behind the learned weights. These features initially made neural networks less desirable for data mining.

The advantages of neural networks, however, include their high tolerance to noisy data as well as their ability to classify patterns on which they have not been trained. In addition, several algorithms have recently been developed for extracting rules from trained neural networks. These factors contribute to the usefulness of neural networks for classification in data mining.

An artificial neural network is an adaptive system that changes its structure based on the information flowing through the network during a learning phase. The ANN relies on the principle of learning by example. There are two classical types of neural networks: the perceptron and the multilayer perceptron.

### 7. Outlier Detection

A database may contain data objects that do not comply with the general behavior or model of the data. These data objects are outliers. The investigation of outlier data is known as outlier mining. An outlier may be detected using statistical tests that assume a distribution or probability model for the data, or using distance measures, where objects having only a small fraction of "close" neighbors in space are considered outliers. Rather than using statistical or distance measures, deviation-based techniques identify outliers by inspecting differences in the principal characteristics of items in a group.
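
A sketch of the statistical approach: flag values whose z-score under an assumed normal model exceeds a threshold (the data and the threshold of 2.0 are invented):

```python
import statistics

def zscore_outliers(data, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean,
    assuming an (approximately) normal model for the data."""
    mu = statistics.mean(data)
    sd = statistics.pstdev(data)           # population standard deviation
    return [x for x in data if abs(x - mu) / sd > threshold]

readings = [10, 11, 9, 10, 12, 10, 50]    # one value breaks the general pattern
flagged = zscore_outliers(readings)
```

Distance-based and deviation-based methods replace the normality assumption with neighbor counts or group-characteristic comparisons, respectively.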

### 8. Genetic Algorithm

Genetic algorithms are adaptive heuristic search algorithms belonging to the larger class of evolutionary algorithms. They are based on the ideas of natural selection and genetics: an intelligent exploitation of random search, guided by historical data, that directs the search into regions of better performance in the solution space. They are commonly used to generate high-quality solutions to optimization and search problems. Genetic algorithms simulate the process of natural selection, in which species that can adapt to changes in their environment survive, reproduce, and pass on to the next generation. In simple words, they simulate "survival of the fittest" among individuals of consecutive generations when solving a problem. Each generation consists of a population of individuals, and each individual represents a point in the search space and a possible solution. Each individual is represented as a string of characters/integers/floats/bits, analogous to a chromosome.
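
A toy genetic algorithm for the classic OneMax problem (maximize the number of 1-bits in a chromosome); the population size, rates, and elitism scheme are arbitrary choices for illustration:

```python
import random

random.seed(0)  # deterministic run for the example

def fitness(bits):
    return sum(bits)  # OneMax: an individual is fitter the more 1-bits it has

def evolve(pop, generations=30, mut_rate=0.05):
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = [pop[0][:], pop[1][:]]                 # elitism: keep the two fittest
        while len(next_pop) < len(pop):
            a, b = random.sample(pop[:len(pop) // 2], 2)  # select from the fitter half
            cut = random.randrange(1, len(a))
            child = a[:cut] + b[cut:]                     # one-point crossover
            child = [bit ^ (random.random() < mut_rate)   # bit-flip mutation
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

population = [[random.randint(0, 1) for _ in range(12)] for _ in range(20)]
start_best = max(fitness(ind) for ind in population)
best = evolve(population)
```

Because the elite individuals are copied unchanged, the best fitness in the population can never decrease from one generation to the next.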

## FAQs

### What are the five data mining techniques?

**Below are 5 data mining techniques that can help you create optimal results.**

- Classification analysis. This analysis is used to retrieve important and relevant information about data, and metadata. ...
- Association rule learning. ...
- Anomaly or outlier detection. ...
- Clustering analysis. ...
- Regression analysis.

### How many data mining techniques are there?

**16** Data Mining Techniques: The Complete List.

### What are the major data mining techniques?

**10 Key Data Mining Techniques and How Businesses Use Them**

- Clustering.
- Association.
- Data Cleaning.
- Data Visualization.
- Classification.
- Machine Learning.
- Prediction.
- Neural Networks.

### What are the 3 types of data mining?

**2. Descriptive Data Mining**

- Clustering Analysis.
- Summarization Analysis.
- Association Rules Analysis.
- Sequence Discovery Analysis.

### What are the 6 processes of data mining?

Data mining is as much an analytical process as it is a set of specific algorithms and models. Like the CIA Intelligence Process, the CRISP-DM process model has been broken down into six steps: **business understanding, data understanding, data preparation, modeling, evaluation, and deployment**.

### What is the purpose of data mining techniques?

Data mining is the process of sorting through large data sets to identify patterns and relationships that can help solve business problems through data analysis. Data mining techniques and tools **enable enterprises to predict future trends and make more-informed business decisions**.

### What are data mining techniques in machine learning?

Data mining is designed to extract the rules from large quantities of data, while machine learning teaches a computer how to learn and comprehend the given parameters. Or to put it another way, data mining is simply **a method of researching to determine a particular outcome based on the total of the gathered data**.

### What is one of the most well-known data mining techniques?

**Prediction**. Prediction is one of the most valuable data mining techniques, since it's used to project the types of data you'll see in the future. In many cases, just recognizing and understanding historical trends is enough to chart a somewhat accurate prediction of what will happen in the future.

### What are data mining techniques (PDF)?

Data mining is **a process of extraction of useful information and patterns from huge data**. It is also called the knowledge discovery process, knowledge mining from data, knowledge extraction, or data/pattern analysis.

### Which algorithm is used in data mining?

Some of the popular data mining algorithms are **C4.5 for decision trees, K-means for cluster analysis, the Naive Bayes algorithm, Support Vector Machine algorithms, and the Apriori algorithm for time series data mining**. These algorithms are part of data analytics implementations for business.

### What are the different types of mining?

There are four main mining methods: **underground, open surface (pit), placer, and in-situ mining**. Underground mines are more expensive and are often used to reach deeper deposits.

### Which of the following is not a data mining technique?

Q. Which of the following is not a data mining functionality?

- B. classification and regression
- C. selection and interpretation
- D. clustering and analysis

Answer: C. selection and interpretation

### Is a data mining technique used to predict future behavior?

**Predictive modeling** is a data-mining technique used to predict future behavior and anticipate the consequences of change.

### What are the 4 characteristics of data mining?

**Characteristics of a data mining system**

- Large quantities of data. The volume of data so great it has to be analyzed by automated techniques e.g. satellite information, credit card transactions etc.
- Noisy, incomplete data. ...
- Complex data structure. ...
- Heterogeneous data stored in legacy systems.

### What are the two types of models in data mining?

1. **Linear Regression** involves finding the optimal line to fit two attributes, so that one attribute can be used to predict the other. 2. Multiple Linear Regression involves two or more attributes, with the data fit to a multidimensional space.

### What are the types of data sources?

There are three types of data sources: **relational**, **multidimensional (OLAP)**, and **dimensionally modeled relational**.

### What is the first step of data mining?

The first step is to define a data preparation input model. This means to localize and relate the relevant data in the database. This task is usually performed by a database administrator (DBA) or a data warehouse administrator, because it requires knowledge about the database model.

### What are data mining models?

A data mining model **gets data from a mining structure and then analyzes that data by using a data mining algorithm**. The mining structure and mining model are separate objects. The mining structure stores information that defines the data source.

### What are the major issues in data mining?

**Some of the Data mining challenges are given as under:**

- Security and Social Challenges.
- Noisy and Incomplete Data.
- Distributed Data.
- Complex Data.
- Performance.
- Scalability and Efficiency of the Algorithms.
- Improvement of Mining Algorithms.
- Incorporation of Background Knowledge.

### What is the scope of data mining?

Data created by data mining is **used by businesses to boost their revenues, know about business investment risks, improve customer relationships, etc**. Data mining is a crucial part of successful business analytics in organizations. Its tools help us to analyze historical as well as real-time data.

### Is data mining easy to learn?

Data mining is often perceived as a challenging process to grasp. However, **learning this important data science discipline is not as difficult as it sounds**. Read on for a comprehensive overview of data mining's various characteristics, uses, and potential job paths.

### Which is better: data mining or machine learning?

As machine learning is an automated process, the results produced by **machine learning will be more precise as compared to data mining**.

### What are the advantages of data mining?

**It helps businesses make informed decisions**. It helps detect credit risks and fraud. It helps data scientists easily analyze enormous amounts of data quickly. Data scientists can use the information to detect fraud, build risk models, and improve product safety.

### What industries use data mining?

**Top 8 Use Cases of Data Mining Across Different Industries:**

- Telecom. Vodafone. T-Mobile.
- Retail. Walmart. Amazon.
- Healthcare. Cardinal Health. DOJ.
- Advertising. Netflix. Spotify.

### Where can data mining be applied?

Data Mining can be applied to **any type of data e.g. Data Warehouses, Transactional Databases, Relational Databases, Multimedia Databases, Spatial Databases, Time-series Databases, World Wide Web**. Data mining provides competitive advantages in the knowledge economy.

### What are the steps in the data mining process?

**7 Key Steps in the Data Mining Process**

- Data Cleaning.
- Data Integration.
- Data Reduction for Data Quality.
- Data Transformation.
- Data Mining.
- Pattern Evaluation.
- Representing Knowledge in Data Mining.

### What are the types of data mining?

Data mining has several types, including **pictorial data mining, text mining, social media mining, web mining, and audio and video mining** amongst others.

### What are data mining techniques in healthcare?

Data mining consists in discovering knowledge with techniques such as **classification and regression trees, logistic regression, and neural networks**, which are adequate for predicting the health status of a patient by taking into account various medical parameters (also known as attributes) and demographic parameters.

### What is another name for data mining?

Data mining is also known as **Knowledge Discovery in Data** (KDD).

### Where is data mining used?

Banks use data mining to better understand market risks. It is commonly applied to **credit ratings and to intelligent anti-fraud systems to analyse transactions, card transactions, purchasing patterns and customer financial data**.

### Why is data mining important?

So why is data mining important for businesses? Businesses that utilize data mining are able to have a competitive advantage, better understanding of their customers, good oversight of business operations, improved customer acquisition, and new business opportunities.

### What is one reason why data mining is used in the healthcare industry?

In the healthcare industry specifically, data mining can be used to decrease costs by increasing efficiencies, improve patient quality of life, and perhaps most importantly, **save the lives of more patients**.

### How can data mining improve patient outcomes?

**Benefits of data mining in healthcare**

- Enhanced clinical decision-making. ...
- Increased diagnosis accuracy. ...
- Improved treatment efficiency. ...
- Avoiding harmful drug and food interactions. ...
- Better customer relationships. ...
- Detection of insurance fraud. ...
- Enabling predictive analysis. ...
- Classification.

### What is the concept of data mining?

Data mining is **the process of discovering actionable information from large sets of data**. Data mining uses mathematical analysis to derive patterns and trends that exist in data.

### What is data mining w3schools?

Data Mining is **a process of finding potentially useful patterns from huge data sets**. It is a multi-disciplinary skill that uses machine learning, statistics, and AI to extract information to evaluate future events probability.

### What is data mining and its process?

Data mining is **the process of sorting through large data sets to identify patterns and relationships that can help solve business problems through data analysis**. Data mining techniques and tools enable enterprises to predict future trends and make more-informed business decisions.