Issues and Challenges in Natural Language Processing
(Irregularity, Ambiguity, Productivity)
Natural Language Processing (NLP) deals with the complexities of human language. However, human language is
full of inconsistencies, ambiguities, and innovations, which present significant challenges for computational models.
The three major areas where these challenges arise are: Irregularity, Ambiguity, and Productivity.
1. Irregularity
Irregularity refers to the phenomenon where certain words or word forms do not follow regular patterns of
morphology (word structure) or syntax (sentence structure).
→ This is a challenge for NLP algorithms, which are often built on rules, patterns, or regularities in data. When
words break these patterns, the model may fail to interpret or process them correctly.
(i) Irregular Verbs & Nouns
Some English verbs and nouns do not follow the standard rules for tense or plural forms. These are known as
irregular inflections, and they need to be handled as exceptions in NLP systems.
Examples:
Tense              Verb: choose    Verb: bite    Verb: go
Present            choose          bite          go
Past               chose           bit           went
Past Participle    chosen          bitten        gone
These irregular forms cannot be derived by simply applying suffix rules like -ed for past tense or -en for participle.
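One common way to handle such exceptions is to consult a lexicon first and fall back to a suffix rule only when the verb is not listed. The sketch below is a minimal illustration, not a complete morphological generator: the names IRREGULAR_VERBS and inflect_verb are hypothetical, and the regular rule is deliberately simplified (it ignores consonant doubling and the -y to -ied change).

```python
from typing import Tuple

# Hypothetical exception lexicon: base form -> (past, past participle).
# The entries are the three verbs from the table above.
IRREGULAR_VERBS = {
    "choose": ("chose", "chosen"),
    "bite": ("bit", "bitten"),
    "go": ("went", "gone"),
}


def inflect_verb(base: str) -> Tuple[str, str]:
    """Return (past, past participle) for an English verb base form."""
    # Step 1: irregular forms are looked up, never derived.
    if base in IRREGULAR_VERBS:
        return IRREGULAR_VERBS[base]
    # Step 2: simplified regular rule (assumption): add -d after a
    # final "e", otherwise add -ed.
    regular = base + "d" if base.endswith("e") else base + "ed"
    return regular, regular


print(inflect_verb("walk"))  # ('walked', 'walked')
print(inflect_verb("go"))    # ('went', 'gone')
```

Without the lookup step, the same rule would produce "goed" for go, which is the failure described above.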
(ii) Exceptional Inflection (Adjectives)
Some adjectives do not follow the regular -er and -est forms for comparative and superlative degrees.
Examples:
• big → bigger → biggest (regular)
• dark → darker → darkest (regular)
• good → better → best (irregular: no suffix rule applies)
Irregular adjectives such as good must be stored explicitly in the system. A purely rule-based system
cannot infer them without a lexical lookup.
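To see what goes wrong without the lookup, the sketch below applies a simplified -er/-est rule with and without an exception list. The names IRREGULAR_ADJECTIVES and compare_adjective, and the use_lexicon switch, are hypothetical and exist only for this illustration. The consonant-doubling check is a rough approximation of English spelling, not a complete rule.

```python
VOWELS = set("aeiou")

# Hypothetical exception list: adjective -> (comparative, superlative).
IRREGULAR_ADJECTIVES = {"good": ("better", "best")}


def compare_adjective(adj, use_lexicon=True):
    """Return (comparative, superlative) for a short English adjective."""
    if use_lexicon and adj in IRREGULAR_ADJECTIVES:
        return IRREGULAR_ADJECTIVES[adj]
    stem = adj
    # Approximate spelling rule: double the final consonant of a
    # consonant-vowel-consonant ending (big -> bigg-), but not w, x, or y.
    if (
        len(adj) >= 3
        and adj[-1] not in VOWELS
        and adj[-1] not in "wxy"
        and adj[-2] in VOWELS
        and adj[-3] not in VOWELS
    ):
        stem = adj + adj[-1]
    return stem + "er", stem + "est"


print(compare_adjective("big"))    # ('bigger', 'biggest')
print(compare_adjective("dark"))   # ('darker', 'darkest')
print(compare_adjective("good"))   # ('better', 'best')
print(compare_adjective("good", use_lexicon=False))  # ('gooder', 'goodest')
```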
2. Ambiguity
Ambiguity occurs when a word, phrase, or sentence has more than one interpretation.
→ In NLP, ambiguity poses a major challenge because machines lack the commonsense knowledge and context
awareness that humans naturally use to disambiguate.
→ Words that look the same but have different meanings or functions are called homonyms.
→ Ambiguity can arise at different levels, including morphology, syntax, semantics, and pragmatics.
Types of Ambiguity:
(i) Word-Sense Ambiguity
Some words have multiple meanings, and the correct interpretation depends on context.
Examples:
bank
• A financial institution
• The land beside a river
bat
• A nocturnal flying animal
• A club used in cricket or baseball
In NLP tasks like machine translation or information retrieval, selecting the correct sense of a word is
crucial.
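One classic family of approaches picks the sense whose dictionary definition shares the most words with the surrounding sentence (a simplified form of the Lesk algorithm). The sketch below assumes a tiny hand-written sense inventory: SENSES, its glosses, STOPWORDS, and the function names are all made up for illustration and do not come from any real dictionary or library.

```python
# Hypothetical mini sense inventory: word -> {sense label: gloss}.
SENSES = {
    "bank": {
        "financial institution": "an institution that accepts deposits and lends money",
        "river side": "the land beside a river or stream",
    },
    "bat": {
        "animal": "a nocturnal flying animal",
        "sports equipment": "a club used to hit the ball in cricket or baseball",
    },
}

# Small, assumed stopword list so that function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "at"}


def tokenize(text):
    """Lowercase, strip basic punctuation, and drop stopwords."""
    return {w.strip(".,!?").lower() for w in text.split()} - STOPWORDS


def disambiguate(word, sentence):
    """Pick the sense whose gloss overlaps most with the sentence."""
    context = tokenize(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & tokenize(gloss))
        if overlap > best_overlap:  # ties keep the first-listed sense
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "She deposited money at the bank"))  # financial institution
print(disambiguate("bank", "We sat on the bank of the river"))  # river side
```

Exact word overlap is brittle (here "deposited" does not match "deposits"), which is one reason practical systems rely on richer context representations.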
(ii) Part-of-Speech Ambiguity
A word may serve different grammatical roles depending on the sentence.
Examples:
• I run every day. → run is a verb
• He went for a run. → run is a noun
Part-of-speech taggers need to consider surrounding words to assign the correct tag.
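As a rough illustration of using surrounding words, the sketch below tags an ambiguous word from the single word before it. The word lists and the function tag_ambiguous are hypothetical, and real taggers use much wider context and statistical or neural models rather than two hand-written rules.

```python
# Assumed, deliberately tiny word lists for illustration.
DETERMINERS = {"a", "an", "the"}
PRONOUNS = {"i", "you", "we", "they", "he", "she", "it"}


def tag_ambiguous(sentence, target):
    """Guess NOUN or VERB for `target` from the preceding word."""
    tokens = [w.strip(".,!?").lower() for w in sentence.split()]
    position = tokens.index(target)
    previous = tokens[position - 1] if position > 0 else ""
    if previous in DETERMINERS:
        return "NOUN"  # "a run", "the run"
    if previous in PRONOUNS or previous == "to":
        return "VERB"  # "I run", "to run"
    return "UNKNOWN"   # the two rules above do not cover this context


print(tag_ambiguous("I run every day.", "run"))    # VERB
print(tag_ambiguous("He went for a run.", "run"))  # NOUN
```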
(iii) Structural Ambiguity
A single sentence may have multiple valid syntactic structures, leading to different interpretations.
Example:
I saw the man with the telescope.
Possible meanings:
• I used a telescope to see the man.
• The man I saw had a telescope.
Parsing systems must choose the most likely structure, often using statistical models or syntactic rules.
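The two readings can be made concrete by counting how many parse trees a small grammar assigns to the sentence. The sketch below uses a CKY-style chart over a toy grammar in Chomsky normal form. The grammar, the lexicon, and the function count_parses are assumptions written for this one sentence, not a general English grammar.

```python
from collections import defaultdict

# Toy grammar (assumption): parent -> left child, right child.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # PP attaches to the verb: seeing with a telescope
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),  # PP attaches to the noun: the man has a telescope
    ("PP", "P", "NP"),
]

# Toy lexicon (assumption): word -> possible categories.
LEXICON = {
    "i": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "with": ["P"],
    "telescope": ["N"],
}


def count_parses(sentence, root="S"):
    """Count distinct parse trees for the sentence under the toy grammar."""
    tokens = [w.strip(".,!?").lower() for w in sentence.split()]
    n = len(tokens)
    chart = defaultdict(int)  # (start, end, symbol) -> number of trees
    for i, token in enumerate(tokens):
        for symbol in LEXICON.get(token, []):
            chart[(i, i + 1, symbol)] += 1
    # Build longer spans from pairs of shorter, already completed spans.
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for split in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, split, left)] * chart[(split, end, right)]
                    )
    return chart[(0, n, root)]


print(count_parses("I saw the man with the telescope."))  # 2
print(count_parses("I saw the man."))                     # 1
```

The count of 2 corresponds to the two meanings listed above. A statistical parser would additionally attach a probability to each rule and keep the higher-scoring tree.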
(iv) Referential Ambiguity
Sometimes, it's unclear which entity a pronoun or noun phrase refers to.
Example:
Ravi told Ramesh that he was late.
Who was late? Ravi or Ramesh?
Such ambiguity is common in coreference resolution tasks, where the goal is to identify entities referred to by pronouns or noun
phrases.
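A first step in many coreference pipelines is to list the candidate antecedents for a pronoun and filter them by agreement features. The sketch below does only that step, using a made-up gender lexicon: NAME_GENDER, PRONOUN_GENDER, candidate_antecedents, and the extra name Sita are hypothetical additions for illustration.

```python
# Hypothetical lexicons; a real system would use a much larger resource.
NAME_GENDER = {"Ravi": "m", "Ramesh": "m", "Sita": "f"}
PRONOUN_GENDER = {"he": "m", "him": "m", "she": "f", "her": "f"}


def candidate_antecedents(sentence, pronoun):
    """List names before the pronoun that agree with it in gender."""
    tokens = [w.strip(".,!?") for w in sentence.split()]
    position = tokens.index(pronoun)
    gender = PRONOUN_GENDER[pronoun]
    return [t for t in tokens[:position] if NAME_GENDER.get(t) == gender]


print(candidate_antecedents("Ravi told Ramesh that he was late.", "he"))
# ['Ravi', 'Ramesh']  -> still ambiguous
print(candidate_antecedents("Ravi told Sita that she was late.", "she"))
# ['Sita']            -> agreement alone resolves it
```

When more than one candidate survives, as in the first sentence, agreement features are not enough and the system needs context or world knowledge to choose.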
3. Productivity
Productivity is the capacity of a language to create new words or forms using known linguistic rules.
→ It reflects the creative and generative nature of human language.
→ NLP systems must be capable of handling novel word formations that were not present in the training data.
Example: The Word "Google"
The word Googol (1 followed by 100 zeros) inspired the name of the tech company Google.
From the root "Google", many new words were productively formed:
• googling: the act of searching on Google
• googlicious: playfully derived adjective (made-up, but linguistically valid)
• googleology: study of all things Google
Such neologisms are formed by regular morphological rules (like adding -ing or -ology) but may not exist in
standard dictionaries.
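One way a system can cope with an unseen word is to strip a known suffix and check whether the remainder is a known root. The sketch below is a minimal illustration under stated assumptions: KNOWN_ROOTS, SUFFIXES, and analyze are hypothetical names, the suffix list covers only the examples above, and the only spelling repair is restoring a dropped final "e".

```python
# Assumed vocabulary of roots the system already knows.
KNOWN_ROOTS = {"google", "search"}

# Assumed suffix inventory with rough glosses.
SUFFIXES = {
    "ing": "the act of doing",
    "ology": "the study of",
    "icious": "playful adjective, by analogy with 'delicious'",
}


def analyze(word):
    """Split an unseen word into (root, suffix), or return None."""
    word = word.lower()
    for suffix in SUFFIXES:
        if not word.endswith(suffix):
            continue
        stem = word[: -len(suffix)]
        # Try the stem as is, then with a restored final "e"
        # (googl + ing -> google).
        for root in (stem, stem + "e"):
            if root in KNOWN_ROOTS:
                return root, suffix
    return None


print(analyze("googling"))     # ('google', 'ing')
print(analyze("googleology"))  # ('google', 'ology')
print(analyze("googlicious"))  # ('google', 'icious')
print(analyze("selfie"))       # None
```

The last call shows the limit of this approach: a neologism built from a root or suffix outside the inventory is still unanalyzable.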
Challenges in Productivity:
• Proper nouns: Names of people, places, and organizations often have unique or arbitrary structures.
• Neologisms: Newly coined words constantly enter the language (e.g., selfie, metaverse).
• Domain-specific vocabulary: Specialized fields invent technical jargon (e.g., in medicine, tech).
• Low-resource languages: Productivity patterns might not be well-documented or well-supported.