Key Phrase Extraction – Fundamentals of Natural Language Processing
Key Phrase Extraction
The Azure Cognitive Service for Language is a cloud-based collection of machine learning and AI algorithms that you can use to build intelligent applications that work with written language. One of the features it offers is key phrase extraction, which quickly identifies the main concepts in a text. For example, given the text “The food was delicious and the staff were wonderful,” key phrase extraction returns the main topics: “food” and “wonderful staff.”
Key phrase extraction is the process of evaluating the text of a document to identify its main points. Consider the restaurant scenario above. Depending on how many reviews you have collected, reading through all of them can take a long time. Instead, you can use the Language service’s key phrase extraction to summarize the main points.
You might get a review like this:
“We had a wonderful evening after coming here for a birthday supper. A kind waiter welcomed us right away and led us to our table. The meal was fantastic, the setting was informal, and the service was top-notch. If you appreciate excellent food and friendly service, you should check out this restaurant.”
Key phrase extraction can add context to this review by identifying phrases such as the following:
- Friendly service
- Birthday supper
- Fantastic meal
- Kind waiter
- Informal setting
- Restaurant
You can use sentiment analysis to determine whether this review is positive, and you can use the key phrases to identify the important elements of the review.
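To summarize many reviews at once, the extracted phrases can be counted across the whole collection. The following is a minimal sketch, not a definitive implementation. It assumes the `azure-ai-textanalytics` Python package and your own Language resource; the `endpoint` and `key` values are placeholders, the function names are hypothetical, and the `batch_size` default is an assumption that should be checked against the service's current request limits.

```python
from collections import Counter


def extract_key_phrases(reviews, endpoint, key, batch_size=10):
    """Return one list of key phrases per review, using the Azure SDK.

    Assumes the azure-ai-textanalytics package is installed and that
    endpoint/key belong to your own Language resource (placeholders).
    """
    # Imported lazily so the counting helper below works without the SDK.
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    client = TextAnalyticsClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
    phrases_per_review = []
    # batch_size is an assumption; check the service's request limits.
    for start in range(0, len(reviews), batch_size):
        batch = reviews[start:start + batch_size]
        for result in client.extract_key_phrases(batch):
            # Failed documents come back as error objects; keep an empty list.
            phrases_per_review.append(
                [] if result.is_error else list(result.key_phrases)
            )
    return phrases_per_review


def most_common_phrases(phrases_per_review, top_n=5):
    """Count in how many reviews each phrase appears (case-insensitive)."""
    counts = Counter()
    for phrases in phrases_per_review:
        # A set makes each phrase count at most once per review.
        counts.update({phrase.lower() for phrase in phrases})
    return counts.most_common(top_n)
```

Passing the output of `extract_key_phrases` to `most_common_phrases` gives a ranked list of the topics that reviewers mention most often. Counting each phrase once per review keeps a single long review from dominating the summary.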
Entity Detection
Entity extraction, also called named entity recognition (NER), is a natural language processing task that identifies key information in text and classifies it into a set of predefined categories. It is a common data preprocessing step that finds the most relevant information (entities) in free-form text such as news articles, web pages, and text fields. Examples of entities include the names of people, places, organizations, and things, as well as dates, email addresses, and phone numbers. Once entities are extracted, they can be used to fill in a database entry. This structure makes more complex analyses possible, such as those involving entity relationships, event detection, and sentiment analysis. NER is essentially a two-step process:
- Identifying the entities in the text
- Classifying them into predefined categories
The Language service turns the unstructured text you give it into a list of entities that you can work with.
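The two steps can be illustrated with a deliberately small toy example. This is only a sketch of the idea, not how the Language service works internally: the service uses trained models, while the code below uses hypothetical regular-expression patterns and illustrative category names that cover only three easy entity types.

```python
import re

# Hypothetical patterns for illustration only. A real NER model learns
# richer categories (people, places, organizations) from data.
PATTERNS = {
    "Email": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+",
    "PhoneNumber": r"\d{3}-\d{3}-\d{4}",
    "Date": r"\d{4}-\d{2}-\d{2}",
}


def find_candidates(text):
    """Step 1: identify candidate spans (here, whitespace-separated tokens)."""
    return [token.strip(".,;:!?()") for token in text.split()]


def categorize(candidate):
    """Step 2: return the first category whose pattern matches the whole span."""
    for category, pattern in PATTERNS.items():
        if re.fullmatch(pattern, candidate):
            return category
    return None


def extract_entities(text):
    """Run both steps and keep only the spans that received a category."""
    entities = []
    for candidate in find_candidates(text):
        category = categorize(candidate)
        if category is not None:
            entities.append({"text": candidate, "category": category})
    return entities


entities = extract_entities(
    "Email ana@example.com or call 555-010-1234 before 2024-05-17."
)
```

Each item in `entities` pairs the matched text with its category, which is the shape needed to fill in a database entry. With the Azure SDK for Python, the analogous call is `TextAnalyticsClient.recognize_entities`, whose results expose each entity's `text` and `category`.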