Posts

Why should you learn PYTHON in 2020

What is Coding/Programming? This post is for everyone who is new to coding/programming and is about to start their journey here. We give instructions to a computer with the keyboard or mouse, telling it to perform tasks such as printing a document, writing an email, playing music, dimming the monitor backlight, and many more. The computer is a hardware device that needs instructions to run. Whenever we perform a task on it, some code runs in the background and tells the machine what to do. This happens with the help of a programming language, which acts as an interface between the user and the hardware. So coding/programming is the act of writing these instructions in a programming language; when they run on a computer, they perform some task. Phew! Programming Languages There is a large variety of programming languages available right now and every lan

Writing an Instagram Bot with Python

Growing on Instagram is a difficult and time-consuming job if you are not already famous. I started my food blogging page @_.interwined_dodos._ (do check it out and follow me there) and ran into these challenges while growing it. Some of the most common ones were: interacting with the whole community throughout the day is hard; Instagram blocks like, comment, follow and unfollow actions if you interact too much in a short span of time; and following and engaging with new people from the same or a different community is, again, time-consuming. Being a lazy yet ambitious person, I wanted to grow on Instagram without spending much time on it. This is when my developer mind kicked in and I thought of automating all this. That journey led me to blogs about Selenium and then to my destination, InstaPy. InstaPy is a tool developed to automate Instagram tasks using Selenium and the brain. It has a large variety of actions that ca

Autocomplete Using Redis and Python

Reading about the use cases of Redis, I came across one in which it can be used to implement autocomplete functionality. I am going to show you the code to implement autocomplete in under 40 lines. Here we go:
    # redis client for python
    import redis
    # flask to expose APIs to the outside world
    from flask import Flask, request, jsonify
    app = Flask("autocomplete")
    # creating a redis connection
    r = redis.StrictRedis(host='localhost', port=7001, db=0)
    # route to add a value to the autocomplete list
    '''
    FORMAT: localhost:5000/add?name=<name>
    '''
    @app.route('/add')
    def add_to_dict():
        try:
            name = request.args.get('name')
            n = name.strip()
            for l in range(1,len(n)):
                prefix = n[0:l]
                r.zadd('compl',{prefix:0})
            r.zadd('compl',{n+"*":0})
            return "Added"
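To make the indexing step in the /add route easier to follow, here is a minimal sketch that builds the same list of sorted-set members without needing Redis or Flask. The function name autocomplete_members is hypothetical, and the reading of the trailing "*" as a marker for a complete word is an assumption based on the code above, not something the excerpt states.

```python
def autocomplete_members(name):
    """Return the members the /add route would store for one name.

    Hypothetical helper for illustration only; it mirrors the loop in the
    excerpt but does not talk to Redis.
    """
    n = name.strip()
    # Every proper prefix of the word, e.g. "r", "re", "red", "redi" for "redis"
    members = [n[0:l] for l in range(1, len(n))]
    # The full word with a trailing "*" (assumed to mark a complete entry)
    members.append(n + "*")
    return members


print(autocomplete_members("redis"))
```

Because every member is stored with the same score of 0, Redis keeps them in lexicographic order inside the sorted set, which is what makes prefix lookups possible.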

Extracting text from PDF for NLP tasks

Introduction
Natural Language Processing involves collecting data from various sources, and you are not always lucky enough to get ready-made data. Often you have to extract the data yourself, and one of those sources is files. In this post, I will talk specifically about PDF files.
Getting the Guns ready
After some exploration on the internet, I came across the Python package PyPDF2, which sounded like a good contender for what we want (text extraction), although it can do more than we need. This package can also be used to generate, decrypt, and merge PDF files, but its usage details are not very clear, which is why I decided to write a post explaining it.
Installation
    pip install PyPDF2
Reading the File and extracting Text
    import PyPDF2
    filename = 'complete path of your pdf file'
    # opening the file
    pdfFileObj = open(filename,'rb')
    # creating a pdf reader object
    pdfReader = PyPDF2.PdfFileReader(pdf

Error Handling in R

What is an Exception?
An unwanted situation that may arise while your code is being executed is called an exception, e.g. when your code attempts to divide a value by zero.
Exception Handling
Exception handling is the process of handling the errors that might occur in the code so that it does not halt abruptly. In simple English, our code should either finish the intended task or print a useful message if it cannot complete it. Here is some code with a non-numeric value in a list, where we try to divide 5 by every element of v:
    # a list with one non-numeric value
    v <- list(1, 2, 4, '0', 5)
    for (i in v) {
      print(5/i)
    }
Here the code stops abruptly with an error as soon as it reaches the non-numeric value, and the remaining elements are never processed. To avoid these situations we use the exception handling constructs available in R.
Exception Handling Constructs in R
    try
    tryCatch
Using try
We need to enclose the objectionable statements

Information Extraction using GROKS in Python

Groks in Python
In my previous blog, I wrote about information extraction using GROKS and REGEX. If you have not read it, I encourage you to go through that post first. One important aspect of any tool is the ability to use it in different environments and to automate tasks with it. In this post, we will look at using GROKs in Python with the pygrok library. By now we know that GROKs are a more readable form of regular expressions.
Installation
Pygrok is an implementation of GROK patterns in Python and is available through pip:
    pip install pygrok
Usage
The library is extremely useful for working with the pre-built GROKs as well as our own custom-built ones. Let's start with a very basic example:
Parsing Text
    # import the package
    from pygrok import Grok
    # text to be processed
    text = 'gary is male, 25 years old and weighs 68.5 kilograms'
    # pattern which you want to match
    pattern = '%{WORD:
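Since GROKs are described above as a more readable form of regular expressions, it can help to see what the plain-regex equivalent of such a pattern might look like. The sketch below uses only Python's standard re module, not pygrok, and the field names (name, gender, age, weight) are hypothetical choices for illustration rather than the ones used in the original post.

```python
import re

# Same sample sentence as in the excerpt
text = "gary is male, 25 years old and weighs 68.5 kilograms"

# Named groups play the role that named GROK fields play in a GROK pattern.
# The group names here are assumptions made for this sketch.
pattern = re.compile(
    r"(?P<name>\w+) is (?P<gender>\w+), "
    r"(?P<age>\d+) years old and weighs "
    r"(?P<weight>\d+(?:\.\d+)?) kilograms"
)


def parse_person(line):
    """Return a dict of the named fields, or None if the line does not match."""
    match = pattern.search(line)
    return match.groupdict() if match else None


print(parse_person(text))
```

All values come back as strings, so any numeric conversion would be a separate step in this regex-only version.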

Using GROK for Information Extraction from Text

What is information extraction from text?
One of the key parts of working with text data is extracting information from the raw text. Let's take an example of a sentence that belongs to some dataset and has data in the following form:
    Details are: Name Japneet Singh Age 27 Profession Software Engineer
The information extracted from this text would look like:
    Name: Japneet Singh
    Age: 27
    Profession: Software Engineer
This information can then be used in any Machine Learning model. Generally, we perform this step in the very early stages of data preprocessing. There are many advanced ways to deal with it, but the old way of using regex remains the undefeated champion. REGEX plays an important role whenever we are working with text data. Here, we will discuss two ways to extract this information: REGEX and GROK.
The REGEX Approach
Regex is defined by regular-expression.info as A regular expressi
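As a rough illustration of the regex approach on the sample sentence above, here is a minimal sketch using Python's standard re module. The pattern and the function name extract_details are assumptions made for this example; they are not taken from the original post, and the pattern relies on the labels Name, Age and Profession appearing in that order.

```python
import re

# Sample sentence from the excerpt
text = "Details are: Name Japneet Singh Age 27 Profession Software Engineer"

# The labels act as anchors; the text between them is captured as the value.
pattern = re.compile(
    r"Name\s+(?P<name>.+?)\s+Age\s+(?P<age>\d+)\s+Profession\s+(?P<profession>.+)"
)


def extract_details(sentence):
    """Return the extracted fields as a dict, or None if nothing matches."""
    match = pattern.search(sentence)
    if match is None:
        return None
    info = match.groupdict()
    info["age"] = int(info["age"])  # age is captured as digits, so convert it
    return info


print(extract_details(text))
```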