Natural Language Processing Engineer



“Natural Language Processing Engineer related Frequently Asked Questions by expert members with job experience as Natural Language Processing Engineer. These questions and answers will help you strengthen your technical skills, prepare for the new job interview and quickly revise your concepts”



78 Natural Language Processing Engineer Questions And Answers

22⟩ Tell me what is batch statistical learning?

Statistical learning techniques allow learning a function or predictor from a set of observed data that can make predictions about unseen or future data. These techniques provide guarantees on the performance of the learned predictor on the future unseen data based on a statistical assumption on the data generating process.


26⟩ You have created a document term matrix of the data, treating every tweet as one document. Which of the following is correct, in regards to document term matrix?

1. Removal of stopwords from the data will affect the dimensionality of data
2. Normalization of words in the data will reduce the dimensionality of data
3. Converting all the words in lowercase will not affect the dimensionality of the data

A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1, 2 and 3

D) 1 and 2

Statements 1 and 2 are correct: stopword removal decreases the number of features in the matrix, and normalization of words also removes redundant features. Statement 3 is false, because converting all words to lowercase merges case variants of the same word and therefore also decreases the dimensionality.
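The effect of each cleaning step on dimensionality can be sketched by counting the distinct terms (the columns of a document term matrix). This is a minimal illustration only: the sample tweets, the tiny stopword list, and the crude plural stripping that stands in for normalization are all assumptions made for the example, not part of the original question.

```python
import re

# Hypothetical sample tweets, chosen only to illustrate the effect.
tweets = [
    "The model is GREAT",
    "the models are great",
    "A great Model",
]

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "is", "are", "a"}


def vocabulary(docs, lowercase=False, remove_stopwords=False, normalize=False):
    """Return the set of distinct terms, i.e. the columns of the matrix."""
    vocab = set()
    for doc in docs:
        for token in re.findall(r"[A-Za-z]+", doc):
            if lowercase:
                token = token.lower()
            if remove_stopwords and token.lower() in STOPWORDS:
                continue
            if normalize and len(token) > 3 and token.lower().endswith("s"):
                token = token[:-1]  # crude plural stripping, stands in for stemming
            vocab.add(token)
    return vocab


raw = vocabulary(tweets)
lowered = vocabulary(tweets, lowercase=True)
no_stop = vocabulary(tweets, lowercase=True, remove_stopwords=True)
normalized = vocabulary(tweets, lowercase=True, remove_stopwords=True, normalize=True)

# Each step shrinks the number of distinct terms for this sample.
print(len(raw), len(lowered), len(no_stop), len(normalized))
```

On this sample every step reduces the number of columns, which matches the reasoning that statements 1 and 2 hold and statement 3 does not.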


28⟩ Retrieval based models and Generative models are the two popular techniques used for building chatbots. Which of the following is an example of retrieval model and generative model respectively?

A) Dictionary based learning and Word 2 vector model
B) Rule-based learning and Sequence to Sequence model
C) Word 2 vector and Sentence to Vector model
D) Recurrent neural network and convolutional neural network

B) Rule-based learning and Sequence to Sequence model

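Choice B is the best fit: a rule-based system picks its reply from predefined responses, as a retrieval based model does, while a sequence to sequence model generates the reply token by token.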


29⟩ Basic Natural Language Processing Engineer Job Interview Questions

☛ Do you know about latent semantic indexing? Where can you apply it?

☛ Is it possible to find all the occurrences of quoted text in an article? If yes, explain how?

☛ What is a POS tagger? Explain the simplest approach to build a POS tagger?

☛ Which is a better algorithm for POS tagging – SVM or hidden Markov models?

☛ What is the difference between shallow parsing and dependency parsing?

☛ What package are you aware of in python which is used in NLP and ML?

☛ Explain one application in which stop words should be removed.

☛ How will you train a model to identify whether the word “Raymond” in a sentence represents a person’s name or a company?


30⟩ Common Natural Language Processing Engineer Job Interview Questions

☛ As a beginner in Natural Language processing, from where should I start?

☛ What is the relation between sentiment analysis, natural language processing and machine learning?

☛ What is the current state of the art in natural language processing?

☛ What is the state of the art in natural language understanding?

☛ Which publications would you recommend reading for someone interested in natural language processing?

☛ What are the basics of natural language processing?

☛ Could you please explain the choice constraints of the pros/cons while choosing Word2Vec, GloVe or any other thought vectors you have used?

☛ How do you explain NLP to a layman?

☛ How do I explain NLP, text mining, and their difference in layman’s terms?

☛ What is the relationship between N-gram and Bag-of-words in natural language processing?

☛ Is deep learning suitable for NLP problems like parsing or machine translation?

☛ What is a simple explanation of a language model?

☛ What is the definition of word embedding (word representation)?

☛ How is Computational Linguistics different from Natural Language Processing?

☛ Natural Language Processing: What is a useful method to generate vocabulary for large corpus of data?

☛ How do I learn Natural Language Processing?

☛ Natural Language Processing: What are good algorithms related to sentiment analysis?

☛ What makes natural language processing difficult?

☛ What are the ten most popular algorithms in natural language processing?

☛ What is the most interesting new work in deep learning for NLP in 2017?

☛ How is word2vec different from the RNN encoder decoder?

☛ How does word2vec work?

☛ What’s the difference between word vectors, word representations and vector embeddings?

☛ What are some interesting Word2Vec results?

☛ How do I measure the semantic similarity between two documents?

☛ What is the state of the art in word sense disambiguation?

☛ What is the main difference between word2vec and fastText?

☛ In layman’s terms, how would you explain the Skip-Gram word embedding model in natural language processing (NLP)?

☛ In layman’s terms, how would you explain the continuous bag of words (CBOW) word embedding technique in natural language processing (NLP)?

☛ What is natural language processing pipeline?

☛ What are the available APIs for NLP (Natural Language Processing)?

☛ How does perplexity function in natural language processing?

☛ How is deep learning used in sentiment analysis?


31⟩ General Natural Language Processing Engineer Job Interview Questions

☛ Differentiate regular grammar and regular expression.

☛ How will you estimate the entropy of the English language?

☛ Describe dependency parsing?

☛ What do you mean by Information rate?

☛ Explain Discrete Memoryless Channel (DMC).

☛ How does correlation work in text mining?

☛ How to calculate TF*IDF for a single new document to be classified?

☛ How to build ontologies?

☛ What is an N-gram in the context of text mining?

☛ What do you know about linguistic resources such as WordNet?

☛ Explain the tools you have used for training NLP models?


32⟩ Fresh Natural Language Processing Engineer Job Interview Questions

☛ Artificial Intelligence: What is an intuitive explanation for recurrent neural networks?

☛ How are RNNs storing ‘memory’?

☛ What are encoder-decoder models in recurrent neural networks?

☛ Why do Recurrent Neural Networks (RNN) combine the input and hidden state together and not separately?

☛ What is an intuitive explanation of LSTMs and GRUs?

☛ Are GRU (Gated Recurrent Unit) a special case of LSTM?

☛ How many time-steps can LSTM RNNs remember inputs for?

☛ How does attention model work using LSTM?

☛ How do RNNs differ from Markov Chains?

☛ For modelling sequences, what are the pros and cons of using Gated Recurrent Units in place of LSTMs?

☛ What is exactly the attention mechanism introduced to RNN (recurrent neural network)? It would be nice if you could make it easy to understand!

☛ Is there any intuitive or simple explanation for how attention works in the deep learning model of an LSTM, GRU, or neural network?

☛ Why is it a problem to have exploding gradients in a neural net (especially in an RNN)?

☛ For a sequence-to-sequence model in RNN, does the input have to contain only sequences or can it accept contextual information as well?

☛ Can “generative adversarial networks” be used in sequential data in recurrent neural networks? How effective would they be?

☛ What is the difference between states and outputs in LSTM?

☛ What is the advantage of combining Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN)?

☛ Which is better for text classification: CNN or RNN?

☛ How are recurrent neural networks different from convolutional neural networks?


33⟩ Professional Natural Language Processing Engineer Interview Questions

☛ What is part of speech (POS) tagging? What is the simplest approach to building a POS tagger that you can imagine?

☛ How would you build a POS tagger from scratch given a corpus of annotated sentences? How would you deal with unknown words?

☛ How would you train a model that identifies whether the word “Apple” in a sentence belongs to the fruit or the company?

☛ How would you find all the occurrences of quoted text in a news article?

☛ How would you build a system that auto corrects text that has been generated by a speech recognition system?

☛ What is latent semantic indexing and where can it be applied?

☛ How would you build a system to translate English text to Greek and vice-versa?

☛ How would you build a system that automatically groups news articles by subject?

☛ What are stop words? Describe an application in which stop words should be removed.

☛ How would you design a model to predict whether a movie review was positive or negative?

☛ What is entropy? How would you estimate the entropy of the English language?

☛ What is a regular grammar? Does this differ in power to a regular expression and if so, in what way?

☛ What is the TF-IDF score of a word and in what context is this useful?

☛ How does the PageRank algorithm work?

☛ What is dependency parsing?

☛ What are the difficulties in building and using an annotated corpus of text such as the Brown Corpus and what can be done to mitigate them?

☛ What tools for training NLP models (nltk, Apache OpenNLP, GATE, MALLET etc…) have you used?

☛ Do you have any experience in building ontologies?

☛ Are you familiar with WordNet or other related linguistic resources?

☛ Do you speak any foreign languages?


35⟩ Tell me what is ‘Training set’ and ‘Test set’?

In machine learning and other areas of information science, the set of data used to discover a potentially predictive relationship is known as the ‘Training set’. The training set is the set of examples given to the learner, while the test set is the set of examples held back from the learner and used to test the accuracy of the hypotheses it generates. The training set is distinct from the test set.
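A minimal sketch of holding back a test set, assuming a simple random split; the function name `train_test_split`, the 20% test fraction, and the fixed seed are illustrative choices, not something the original answer prescribes.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold back a fraction as the test set."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(examples)      # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test_set = shuffled[:n_test]   # held back from the learner
    train_set = shuffled[n_test:]  # given to the learner
    return train_set, test_set


examples = list(range(10))         # stand-in for labelled examples
train_set, test_set = train_test_split(examples)
print(len(train_set), len(test_set))
```

The two sets share no examples, so accuracy measured on the test set reflects data the learner has not seen.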


37⟩ In a corpus of N documents, one document is randomly picked. The document contains a total of T terms and the term “data” appears K times. What is the correct value for the product of TF (term frequency) and IDF (inverse-document-frequency), if the term “data” appears in approximately one-third of the total documents?

A) KT * Log(3)
B) K * Log(3) / T
C) T * Log(3) / K
D) Log(3) / KT

B) K * Log(3) / T

The formula for TF is K / T.

The formula for IDF is log(total docs / number of docs containing “data”)

= log(1 / (⅓))

= log(3)

Hence the correct choice is K * Log(3) / T.
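The same arithmetic can be checked numerically. This is a small sketch assuming the plain TF = K / T and IDF = log(N / document frequency) definitions used in the answer; the function name and the sample values of K, T and N are made up for illustration.

```python
import math


def tf_idf(k, t, n_docs, docs_with_term):
    """TF-IDF using TF = k / t and IDF = log(n_docs / docs_with_term)."""
    tf = k / t
    idf = math.log(n_docs / docs_with_term)
    return tf * idf


# Illustrative values: the term appears K = 4 times in a document of T = 100
# terms, and occurs in one third of N = 300 documents.
K, T, N = 4, 100, 300
score = tf_idf(K, T, N, N // 3)

# Matches the closed form K * log(3) / T from the answer.
print(math.isclose(score, K * math.log(3) / T))
```

Libraries often use smoothed or differently scaled IDF variants, so their numbers may differ from this plain formula.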


38⟩ How many trigram phrases can be generated from the following sentence, after performing the following text cleaning steps?

1. Stopword removal
2. Replacing punctuation by a single space

“#Analytics-vidhya is a great source to learn @data_science.”

A) 3
B) 4
C) 5
D) 6
E) 7

C) 5

After performing stopword removal and punctuation replacement the text becomes: “Analytics vidhya great source learn data science”

Trigrams – Analytics vidhya great, vidhya great source, great source learn, source learn data, learn data science
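The cleaning steps and the trigram count can be reproduced with a short sketch. It assumes that the underscore counts as punctuation (as the cleaned text in the answer implies) and uses a stopword set containing only the stopwords present in this sentence; both are assumptions made for the example.

```python
import re

# Only the stopwords that occur in this sentence (illustrative, not a full list).
STOPWORDS = {"is", "a", "to"}


def trigrams(text):
    """Replace punctuation with spaces, drop stopwords, return word trigrams."""
    # Treat every non-word, non-space character and the underscore as punctuation.
    cleaned = re.sub(r"[^\w\s]|_", " ", text)
    tokens = [w for w in cleaned.split() if w.lower() not in STOPWORDS]
    return [" ".join(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


sentence = "#Analytics-vidhya is a great source to learn @data_science."
result = trigrams(sentence)
for phrase in result:
    print(phrase)
```

A sequence of n tokens yields n - 2 trigrams, so the 7 remaining tokens give 5.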


39⟩ Google Search’s feature – “Did you mean”, is a mixture of different techniques. Which of the following techniques are likely to be ingredients?

1. Collaborative Filtering model to detect similar user behaviors (queries)
2. Model that checks for Levenshtein distance among the dictionary terms
3. Translation of sentences into multiple languages

A) 1
B) 2
C) 1, 2
D) 1, 2, 3

C) 1, 2

Collaborative filtering can be used to find patterns in the queries people actually type, and Levenshtein distance measures how far a query term is from the dictionary terms.
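As an illustration of the Levenshtein part only, the sketch below computes edit distance with the standard dynamic-programming recurrence and picks the closest dictionary term. The `suggest` helper and the small dictionary are hypothetical; nothing here is claimed to reflect how Google implements the feature.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + cost,         # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def suggest(query, dictionary):
    """Return the dictionary term with the smallest edit distance to the query."""
    return min(dictionary, key=lambda word: levenshtein(query, word))


# Hypothetical dictionary, for illustration only.
dictionary = ["language", "luggage", "lounge"]
print(suggest("langauge", dictionary))
```

A real spelling suggester would typically combine such a distance with query-frequency signals rather than rely on edit distance alone.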


40⟩ Solve the equation according to the sentence “I am planning to visit New Delhi to attend Analytics Vidhya Delhi Hackathon”.

A = (# of words with Noun as the part of speech tag)
B = (# of words with Verb as the part of speech tag)
C = (# of words with frequency count greater than one)

What are the correct values of A, B, and C?

A) 5, 5, 2
B) 5, 5, 0
C) 7, 5, 1
D) 7, 4, 2
E) 6, 4, 3

D) 7, 4, 2

Nouns: I, New, Delhi, Analytics, Vidhya, Delhi, Hackathon (7)

Verbs: am, planning, visit, attend (4)

Words with frequency counts > 1: to, Delhi (2)

Hence option D is correct.
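The value of C can be verified with a simple word count. This sketch covers only the frequency part; A and B depend on a part-of-speech tagger and its tag set, so they are not computed here.

```python
from collections import Counter

sentence = "I am planning to visit New Delhi to attend Analytics Vidhya Delhi Hackathon"

# Count each whitespace-separated word (case-sensitive, as in the answer).
counts = Counter(sentence.split())

# Words that occur more than once.
repeated = {word: n for word, n in counts.items() if n > 1}
print(repeated)
```

Two words repeat, which gives C = 2.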
