What is summarization and data generalization in data mining?

Before looking at summarization and data generalization in data mining, let's first review what data mining is.

What is data mining?

Data mining is the process of discovering patterns in large data sets, using methods at the intersection of machine learning, statistics, and database systems. The patterns it finds can then be presented in a clear, readable form.

It is generally used by companies that want to turn their raw data into useful information. With the help of software, they can look for patterns in large batches of data.

In business, data mining helps companies learn more about their customers. With it they can develop more effective marketing strategies, increase sales, and lower costs. They can also present their data in a more informative way, so that everyone who needs it, including prospective partners of the company, can access it.

Data mining is also known as knowledge discovery in data (KDD). Practicing it requires a good working knowledge of the software and tools involved.

There are generally five types of data mining.

  1. Pictorial data mining: Information is extracted from pictures. For many data sets it is also necessary to present the data in pictorial form.
  2. Text data mining: The process of deriving high-quality information from text. A large amount of text taken together can be very unclear, so text data mining picks out the high-quality information in it.
  3. Social media data mining: The process of obtaining data from user-generated content on social media sites and mobile apps in order to extract different kinds of patterns.
  4. Web data mining: The application of data mining techniques to discover patterns from the World Wide Web (WWW).
  5. Audio and video mining: Audio and video content is converted into information. It applies when the information to be conveyed is in audio or video form.

Four kinds of tools are used in data mining:

  • Query and reporting tools.
  • Intelligent agents.
  • Multidimensional analysis tools.
  • Statistical tools.

Data generalization in data mining: Data generalization is a process that abstracts a large set of task-relevant data in a database from a lower conceptual level to a higher conceptual level.

Generalization methods fall into two approaches:

  • The data cube (or OLAP) approach.
  • The attribute-oriented induction approach.
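
As a rough illustration of the attribute-oriented induction approach, the sketch below replaces a low-level attribute value (a city) with a higher-level concept (a country) and then merges the tuples that have become identical, keeping a count. The records, the `CITY_TO_COUNTRY` hierarchy, and the function name `generalize` are all made up for this example, and it covers only the generalize-and-merge step, not a complete implementation of the approach.

```python
from collections import Counter

# Hypothetical concept hierarchy: city (low level) -> country (higher level).
CITY_TO_COUNTRY = {
    "Mumbai": "India",
    "Delhi": "India",
    "Toronto": "Canada",
    "Vancouver": "Canada",
}

# Hypothetical task-relevant tuples: (city, major).
records = [
    ("Mumbai", "Science"),
    ("Delhi", "Science"),
    ("Toronto", "Arts"),
    ("Vancouver", "Science"),
    ("Delhi", "Arts"),
]


def generalize(rows, hierarchy):
    """Climb one level of the hierarchy, then merge identical tuples with a count."""
    generalized = [(hierarchy[city], major) for city, major in rows]
    return Counter(generalized)


summary = generalize(records, CITY_TO_COUNTRY)
for (country, major), count in sorted(summary.items()):
    print(country, major, count)
```

Five detailed tuples are reduced to four generalized ones, and the count records how many original tuples each generalized tuple stands for.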

What is summarization?

Summarization is the process of shortening a set of data computationally in order to create a subset (a summary) that represents the most important and relevant information in the original content.

Summarization can be applied to text and images as well as to videos.

When summarizing, we have to make sure that the result contains only relevant and useful information.

If the result contains anything irrelevant that should not be there, it cannot be called a summary.

A summary should always be as informative as possible, and it should be written so that it is useful to its readers or viewers.
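
For text, one simple way to keep only the most relevant content is extractive summarization: score each sentence by how often its words occur in the whole text and keep the top-scoring sentences. The sketch below is a minimal illustration of that idea, not a production method. The `summarize` function, the tiny `STOPWORDS` set, and the sample text are assumptions made for the example.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "is", "it", "of", "in", "to", "from", "that", "this"}


def summarize(text, max_sentences=2):
    """Keep the sentences whose words occur most often in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    frequency = Counter(words)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(frequency[t] for t in tokens if t not in STOPWORDS)

    # Pick the top-scoring sentences, then restore their original order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)


text = (
    "Data mining finds patterns in large data sets. "
    "The weather was pleasant yesterday. "
    "Summarization keeps the most relevant patterns from the data. "
    "Companies use data mining to turn raw data into useful information."
)
print(summarize(text))
```

The off-topic sentence about the weather shares no frequent words with the rest of the text, so it scores lowest and is left out. Note that summing word frequencies favors longer sentences; dividing the score by sentence length is a common adjustment.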

How can we use data generalization in data mining effectively?

Conceptually, the data cube can be viewed as a kind of multidimensional data generalization. In general, data generalization summarizes data by replacing relatively low-level values (for example, the numeric values of an attribute such as age) with higher-level concepts (for example, young, middle-aged, or senior).
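
The sketch below illustrates both ideas on a few made-up records: numeric ages are replaced with higher-level age groups, and purchase amounts are then aggregated first over two dimensions (age group and city) and then over age group alone, much as a data cube is rolled up. The records, the age cut-offs, and the names `age_group` and `roll_up` are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical low-level records: (age, city, purchase amount).
sales = [
    (19, "Delhi", 120.0),
    (23, "Delhi", 80.0),
    (41, "Mumbai", 200.0),
    (47, "Delhi", 150.0),
    (68, "Mumbai", 60.0),
]


def age_group(age):
    """Map a numeric age to a higher-level concept (cut-offs are assumptions)."""
    if age < 30:
        return "young"
    if age < 60:
        return "middle-aged"
    return "senior"


def roll_up(rows, key):
    """Aggregate purchase totals along the dimensions chosen by `key`."""
    totals = defaultdict(float)
    for age, city, amount in rows:
        totals[key(age, city)] += amount
    return dict(totals)


# Two-dimensional view: (age group, city).
by_group_and_city = roll_up(sales, lambda age, city: (age_group(age), city))
# Roll up further by dropping the city dimension.
by_group = roll_up(sales, lambda age, city: age_group(age))

print(by_group_and_city)
print(by_group)
```

Each roll-up trades detail for a more compact view: five records become four cells, and then three.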

Not every reader, whether a student or a middle-aged person, will be familiar with the high-level concepts, so the generalized data has to be presented in a way that everyone can understand easily.

Given the large amount of data stored in databases, it is useful to be able to describe concepts in concise and succinct terms at generalized (rather than low) levels of abstraction.

What is the purpose of summarization?

We generally summarize because we want to condense a large amount of information into key points, for example to present a theory or a body of work briefly as context for a thesis.

A summary can also be presented in an innovative way. The key points can highlight the most important information, and the information can be laid out in bullets.

A summary is especially useful when you do not have time to read everything, because you can read just the summary instead. Most data contains a huge amount of information, which also makes it hard to memorize.

Being able to read a summary instead of the original content is very useful when time is short.

To summarize means to sum up the main points of something, so a summary should be written accordingly.

How can we summarize our data in an innovative way?

  • Read and understand the data carefully: If we do not understand the data, we cannot write an accurate summary of it. Read the complete data thoroughly at least once so that the summary is relevant.
  • Think about the purpose of the data: Understand the purpose behind the data. Every data set contains some hidden concepts; find them and restate them in the summary.
  • Select the relevant information: Choose only the relevant information from the original data. Do not add any irrelevant information or topics.
  • Find the main ideas (what is most important): A summary should always be informative. If the content does not contain any important information or ideas, it cannot be called a summary.
  • Change the structure of the data: Restructure the data so that it is easy to take in and use.
  • Rewrite the main ideas in complete sentences: Always present each idea in full, because incomplete information is not useful.
  • Check the data again: Check the summary before finalizing it. Many people read only the summary, so any error in it harms the readers as well as the goodwill of the company.

Data and objects in databases contain detailed information at the primitive concept level. It is useful to be able to summarize a large set of data and present it at a high conceptual level.
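
For numeric data, the simplest form of such a summary is a handful of descriptive statistics. The sketch below reduces a list of detailed values to a count, a range, a mean, and a median using Python's standard `statistics` module. The order amounts and the function name `describe` are invented for the example.

```python
import statistics

# Hypothetical detailed records: one order amount per row.
order_amounts = [120.0, 80.0, 200.0, 150.0, 60.0, 95.0, 310.0]


def describe(values):
    """Reduce a list of numbers to a handful of summary statistics."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


summary = describe(order_amounts)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The median is reported alongside the mean because a single large value, such as the 310.0 order here, pulls the mean upward but leaves the median unchanged.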

