Posts

Showing posts with the label REGEX

Using GROK for Information Extraction from Text

What is information extraction from text?

One of the key parts of working with text data is extracting information from the raw text. Let's take an example of a sentence that belongs to some dataset and has data in the following form:

Details are: Name Japneet Singh Age 27 Profession Software Engineer

The information extracted from this text would look like:

Name: Japneet Singh
Age: 27
Profession: Software Engineer

This information can then be used further in any Machine Learning model. Generally, we perform this step in the very early stages of data preprocessing. There are many advanced ways to deal with it, but the old way of using regex remains the undefeated champion. REGEX plays an important role whenever we are working with text data. Here, we will discuss two ways to extract this information: REGEX and GROK.

The REGEX Approach

Regex is defined by regular-expressions.info as A regular expressi