Unstructured text is everywhere: emails, news articles, customer reviews. But software cannot interpret it without help. To a program with no language processing, the sentence "Elon Musk bought Twitter in California" is just a string of 38 characters.
Named Entity Recognition (NER) is a task in Natural Language Processing (NLP) that addresses this. It identifies the key names in a text and assigns each one a category.
The Named Entity Extractor automatically scans your text and pulls out:
- PERSON: People's names (e.g., "Alice", "Obama").
- ORG: Companies, agencies (e.g., "Google", "FBI").
- GPE/LOC: Countries, cities, states (e.g., "Paris", "Texas").
- DATE: Absolute or relative dates (e.g., "January 1st", "tomorrow").
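To make the output concrete, here is a minimal Python sketch of what an extractor might return for the sentence from the opening paragraph. The `Entity` class and the hand-written spans are assumptions made for illustration; they are not this tool's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str   # surface form as it appears in the input
    label: str  # entity type, e.g. PERSON, ORG, GPE
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


sentence = "Elon Musk bought Twitter in California"

# Hand-annotated spans for illustration; a real model would predict these.
entities = [
    Entity("Elon Musk", "PERSON", 0, 9),
    Entity("Twitter", "ORG", 17, 24),
    Entity("California", "GPE", 28, 38),
]

# Without NER, a program only knows the length of the string.
print(len(sentence))  # 38

# Each span's offsets index back into the original string.
for ent in entities:
    assert sentence[ent.start:ent.end] == ent.text
    print(ent.label, ent.text)
```

Storing character offsets alongside the label lets later steps, such as highlighting or redaction, find the exact span again.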
How It Works (Under the Hood)
NER models aren't just looking up words in a dictionary (a "gazetteer"). They use context.
Example: "Apple is watching you." vs "The apple is tasty."
- In sentence 1, "Apple" is most likely an ORG: it has no article and is the subject of "is watching", an action a company can perform.
- In sentence 2, "apple" is not a named entity (it is an ordinary noun for a food item): the article "The" and the adjective "tasty" point to the fruit.
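The difference between a plain gazetteer lookup and a context-aware decision can be sketched with a toy example. The lookup table, the determiner list, and both function names below are hypothetical. The hand-written rule only stands in for cues that a real NER model would learn from training data.

```python
import re

# Hypothetical lookup table (a tiny "gazetteer").
GAZETTEER = {"apple": "ORG", "google": "ORG"}

# Words that usually introduce a common noun rather than a name.
DETERMINERS = {"the", "a", "an", "this", "that"}


def gazetteer_tag(sentence):
    """Label every token found in the lookup table, ignoring context."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    return [(tok, GAZETTEER[tok.lower()]) for tok in tokens
            if tok.lower() in GAZETTEER]


def context_tag(sentence):
    """Same lookup, but skip a match that directly follows a determiner."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    tagged = []
    for i, tok in enumerate(tokens):
        prev = tokens[i - 1].lower() if i > 0 else ""
        if tok.lower() in GAZETTEER and prev not in DETERMINERS:
            tagged.append((tok, GAZETTEER[tok.lower()]))
    return tagged


print(gazetteer_tag("The apple is tasty."))    # [('apple', 'ORG')]  (wrong)
print(context_tag("The apple is tasty."))      # []
print(context_tag("Apple is watching you."))   # [('Apple', 'ORG')]
```

A single rule like this breaks easily (for example, "The Apple keynote"), which is why practical systems rely on statistical models rather than hand-written conditions.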
Use Cases
1. Customer Support Automation
Automatically tag tickets by the products or locations they mention. For example, "My iPhone is broken in London" -> Tags: PRODUCT:iPhone, LOC:London.
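As a rough sketch of the tagging step, the Python snippet below turns extracted entities into ticket tags. The `tags_from_entities` helper and the hard-coded entity list are assumptions; in practice the list would come from the extractor.

```python
def tags_from_entities(entities):
    """Turn (text, label) pairs into sorted, de-duplicated 'LABEL:text' tags."""
    return sorted({f"{label}:{text}" for text, label in entities})


ticket = "My iPhone is broken in London"

# Hypothetical extractor output for the ticket above.
entities = [("iPhone", "PRODUCT"), ("London", "LOC")]

print(tags_from_entities(entities))  # ['LOC:London', 'PRODUCT:iPhone']
```

Building the tags through a set means a product mentioned several times in one ticket still yields a single tag.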
2. Content Classification
News aggregators use NER to group articles by topic. If "Microsoft" and "Activision" often appear together, the articles likely cover the same story, such as a merger or acquisition.
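One simple way to spot such pairings is to count how often two organizations occur in the same article. The sketch below assumes the ORG entities have already been extracted. The four sample articles are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical ORG entities already extracted from four articles.
articles = [
    {"Microsoft", "Activision"},
    {"Microsoft", "Activision", "Sony"},
    {"Google"},
    {"Microsoft", "Activision"},
]

pair_counts = Counter()
for orgs in articles:
    # Sort so each pair has one canonical order regardless of set ordering.
    for pair in combinations(sorted(orgs), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common(1))  # [(('Activision', 'Microsoft'), 3)]
```

Pairs with a high count are candidates for grouping under one topic. Deciding what the topic actually is still requires reading the articles or further analysis.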
3. Privacy & Redaction
Identify names and locations in a document so you can anonymize them before sharing the data.
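Once entity spans are known, redaction is mostly string slicing. The following sketch assumes non-overlapping spans given as (start, end, label) character offsets. The `redact` function and the sample spans are illustrative, not the tool's own code.

```python
def redact(text, spans):
    """Replace each (start, end, label) span with a [LABEL] placeholder."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


doc = "Alice met Obama in Paris."

# Hand-written spans for illustration; an extractor would supply these.
spans = [(0, 5, "PERSON"), (10, 15, "PERSON"), (19, 24, "GPE")]

print(redact(doc, spans))  # [PERSON] met [PERSON] in [GPE].
```

Replacing from the end of the text backwards avoids recomputing offsets, since each substitution changes the length of the string only after the spans still to be processed.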
Client-Side Privacy
NLP models often run on remote servers, which means uploading your text. This tool instead uses lightweight, optimized models that run entirely in your browser. This matters for privacy: if you paste a confidential legal contract here to extract names, that contract never leaves your computer.