Some combination of steps and patterns is used to convert the customer's text or speech into structured data, and that data is then used, in turn, to select a suitable answer based on machine learning algorithms. This is how a chatbot usually responds to your question: even though there is no person behind it, you still get relevant answers.

The next application is analyzing banking documents and extracting information from them. Banks end up with huge quantities of documentation for various purposes, such as regulatory requirements, and a lot of KYC checks happen as well. If they do not take a disciplined approach to classifying and managing these documents, it has a huge cost impact on storing and retrieving them for later use.

So NLP helps in handling this problem by streamlining document classification and by performing analytics on the documents whenever there is a requirement.

All right, the next application is analyzing conversations about a product or service. Banks also hold a huge log of customer conversations in the form of audio data. Whenever we call customer service at a bank, we get a message saying that the conversation will be recorded, right? All this data, once stored, becomes a great source for understanding customer sentiment on the various products and services offered by the bank. NLP helps in converting the audio data to text and in performing further analysis to understand the complaints or feedback from customers, so that the bank can take further action to improve its service delivery.

The next application where we can use NLP is gathering real-time intelligence on stocks, company data, and so on. There are many news APIs, which you might have used or come across in different instances. These APIs are available to crawl the news in real time, and we can apply NLP techniques to that news to understand customer opinions, the status of stocks, the annual reports or financial data of companies, and the latest current affairs and news on the companies that are being traded by the banks on behalf of their customers. Applying machine learning and NLP helps in real-time market research and in taking strategic decisions on investments.

The next application is monitoring sentiment. Much of the news, such as annual reports, mergers, or acquisitions, affects how investors view a company. Similarly, even the general tone of news coverage and social media sentiment can have an impact on the markets.

So NLP helps in converting public opinion into sentiment factors, which can further help a bank in taking strategic decisions.

The next application is anticipating customer concerns. We know that we gather a lot of audio logs as part of banking data, and we also gather a lot of additional information from social media forums such as Twitter and Facebook, wherever people state their opinions on a particular product or service from a bank. By applying NLP algorithms to all this information, we can discover and parse customer sentiment and anticipate concerns about a particular product or service. This helps banks take proactive measures and mitigate the risk of future failure of a product or service.

So these are the various areas where we can make use of NLP in the banking industry. In the upcoming slides, let us also look at the actual concepts behind NLP.

The implementation of NLP using Python: here we are going to cover the NLP concepts. These are some of the important concepts we need to know to apply NLP algorithms on text data. The first one is text processing, then tokenization, normalization, stemming, bag of words, POS tagging, and classification. Let us look at each of these in detail.

Text processing: what is meant by that? We are all very familiar with text because we use it every day to read and write, and NLP treats this text as raw data. The most important source of text is undoubtedly the internet, where we gather a humongous amount of text. The functions and parsers available within NLP algorithms and libraries help us to access this text, either from local directories or from websites, and to process it using these various algorithms. So text processing is the first step in building NLP-based models.

The next concept is tokenization. What is meant by tokenization? To state it as a definition, tokenization is the process of cutting a string into identifiable language units. What exactly is this, in simple terms?

We are splitting a sentence into words using separators. In a sentence, every word is separated by a space, or it can be separated by a tab if it comes from something like an Excel sheet, and there might be newline characters separating the text. If you use these kinds of separators to split your sentence into words, it is known as tokenization.

The next concept is normalization. What is meant by normalization? Mathematically, normalization is about bringing your data onto a common scale. Similarly, in the context of text, converting the text into a common form for further processing is normalization. For instance, we may need to convert all uppercase words to lowercase for ease of manipulating the text in further machine learning models; in that case, we normalize our text into all lowercase words. Or we may have a lot of abbreviated text in our raw data, so we need to expand it into its full form and then use it in further processing. That can also be called normalization. We may also be handling different kinds of data: for instance, Apple is a fruit, but Apple is also an organization name.
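The separator-based splitting and the lowercase and abbreviation steps described here can be sketched in a few lines of plain Python. This is only a minimal illustration, not what any particular NLP library does internally: the function names `tokenize` and `normalize`, the `ABBREVIATIONS` table, and the sample sentence are all made up for this example.

```python
# Hypothetical abbreviation table, invented for this illustration.
ABBREVIATIONS = {"acct": "account", "txn": "transaction", "stmt": "statement"}


def tokenize(text):
    """Split text into tokens on whitespace: spaces, tabs, and newlines."""
    return text.split()


def normalize(tokens):
    """Lowercase each token, strip surrounding punctuation,
    and expand abbreviations found in the table above."""
    normalized = []
    for token in tokens:
        word = token.lower().strip(".,!?;:\"'()")
        if not word:
            continue  # the token was punctuation only
        normalized.append(ABBREVIATIONS.get(word, word))
    return normalized


raw = "My Acct shows\ta failed TXN.\nPlease check the stmt"
tokens = tokenize(raw)
print(tokens)
# ['My', 'Acct', 'shows', 'a', 'failed', 'TXN.', 'Please', 'check', 'the', 'stmt']
print(normalize(tokens))
# ['my', 'account', 'shows', 'a', 'failed', 'transaction', 'please', 'check', 'the', 'statement']
```

Calling `split()` with no arguments treats any run of spaces, tabs, or newlines as one separator, which matches the separators mentioned above. Real text usually needs more careful handling of punctuation than this sketch attempts.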

If your raw data has the term Apple, and it implies the fruit and not the organization, then your algorithm should handle it as a fruit instead of an organization. These are the kinds of functions we can program within normalization. There is no single method to normalize text, so knowledge of the context in which you are applying NLP is very important in programming the normalization logic.

The next concept is stemming. What is meant by stemming? To put it in simple terms, you are just stripping off the prefixes and suffixes of your text according to your need; that is known as stemming. Similar to normalization, stemming does not have a single method or process. We can do it in any way that suits the application where we are applying it.

The next concept is bag of words. What is meant by bag of words? You have a huge set of data, which is obviously a huge collection of words, and you are trying to identify the potential features in that complete set of data. In text analytics or natural language processing, the features in the data are the words used within your raw text. So bag of words is a feature extraction technique where a sentence is split into words, similar to tokenization, but in the bag of words model the number of occurrences of each word is maintained in a dictionary.
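Going back to stemming for a moment, the idea of stripping prefixes and suffixes can be sketched as below. This is a deliberately naive version under stated assumptions: the affix lists, the `min_stem` threshold, and the name `naive_stem` are invented for this example, and it is not the algorithm used by established stemmers.

```python
# Illustrative affix lists; a real application would choose its own.
PREFIXES = ("un", "re")
SUFFIXES = ("ing", "ed", "ly", "s")


def naive_stem(word, min_stem=3):
    """Strip at most one known prefix and one known suffix,
    as long as at least `min_stem` characters remain."""
    word = word.lower()
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_stem:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            word = word[:-len(suffix)]
            break
    return word


for w in ["payments", "reopened", "quickly", "loans", "red"]:
    print(w, "->", naive_stem(w))
# payments -> payment
# reopened -> open
# quickly -> quick
# loans -> loan
# red -> red
```

A rule set this crude over-strips some words (for example, "report" becomes "port"), which illustrates why the choice of stemming method depends on the application.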

So it is stored as a key-value pair: the key is the word that occurs in a sentence, and the value is the number of occurrences of that particular word in your data. After removing grammar and stop words, such as conjunctions and prepositions, from the sentences, we form a corpus out of the complete set of valid words within the raw text data. That is what the bag of words model performs.

Once a corpus is formed, these words are converted into feature vectors. What is meant by feature vectors? Every machine learning model basically accepts data in the form of numbers.
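The word-count dictionary and the conversion into numbers can be sketched roughly as follows. This is a minimal illustration using only the standard library: the stop word list is a small made-up sample, and the names `bag_of_words`, `build_vocabulary`, and `to_vector`, along with the example sentences, are assumptions for this sketch rather than part of any library.

```python
from collections import Counter

# A tiny, illustrative stop word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "of", "on", "in", "to", "is", "was", "my"}


def bag_of_words(sentence):
    """Return a word -> count mapping for one sentence, ignoring stop words."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return Counter(w for w in words if w and w not in STOP_WORDS)


def build_vocabulary(bags):
    """Collect every word seen across all bags, in sorted order."""
    return sorted(set().union(*bags))


def to_vector(bag, vocabulary):
    """Turn one bag into a list of counts, one position per vocabulary word."""
    return [bag[word] for word in vocabulary]


sentences = [
    "The loan was approved and the loan was paid",
    "My card payment failed",
]
bags = [bag_of_words(s) for s in sentences]
vocabulary = build_vocabulary(bags)
vectors = [to_vector(bag, vocabulary) for bag in bags]

print(dict(bags[0]))  # {'loan': 2, 'approved': 1, 'paid': 1}
print(vocabulary)     # ['approved', 'card', 'failed', 'loan', 'paid', 'payment']
print(vectors)        # [[1, 0, 0, 2, 1, 0], [0, 1, 1, 0, 0, 1]]
```

Each sentence ends up as a list of numbers of the same length, with one position per word in the vocabulary, which is the kind of numeric input a machine learning model can accept.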
