A Python-based application that scans images, extracts text using OCR, and identifies useful information such as email addresses, phone numbers, and URLs.
The application allows users to upload an image through a simple frontend interface. The backend processes the image, extracts text, and detects structured information using pattern matching.
- Upload images through a web interface
- Extract text from images using OCR
- Detect email addresses in the extracted text
- Detect phone numbers in the extracted text
- Detect URLs (website links) in the extracted text
- Backend API built with FastAPI
- Python
- FastAPI
- Tesseract OCR
- Pillow
- Regular Expressions (Regex)
- HTML / CSS / JavaScript (Frontend)
- The user uploads an image through the frontend interface
- The backend processes the image and extracts text using OCR
- The extracted text is analyzed using regular expressions
- Emails, phone numbers, and URLs are returned as structured data
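
The OCR step in the flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the `pytesseract` wrapper is used to call Tesseract (the stack only names "Tesseract OCR"), and the names `extract_text`, `_tesseract_ocr`, and the `ocr` parameter are hypothetical.

```python
import io
from typing import Callable, Optional


def _tesseract_ocr(image_bytes: bytes) -> str:
    """Run Tesseract on raw image bytes (assumes Pillow and pytesseract are installed)."""
    # Imported lazily so this module still loads where the OCR stack is missing.
    from PIL import Image
    import pytesseract

    image = Image.open(io.BytesIO(image_bytes))
    return pytesseract.image_to_string(image)


def extract_text(image_bytes: bytes,
                 ocr: Optional[Callable[[bytes], str]] = None) -> str:
    """Return the text found in an uploaded image.

    `ocr` is a hypothetical hook for swapping the OCR engine, e.g. in tests.
    """
    if not image_bytes:
        raise ValueError("empty upload")
    engine = ocr or _tesseract_ocr
    # OCR output often carries stray leading/trailing whitespace.
    return engine(image_bytes).strip()
```

In a FastAPI route, the bytes would typically come from `await file.read()` on an `UploadFile`. Passing the engine as a parameter is only a design convenience here: it lets the text-handling logic be exercised without a Tesseract install.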
{
  "emails": ["example@email.com"],
  "phones": ["+91-9876543210"],
  "urls": ["https://example.com"]
}
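
A response of this shape could be built from the OCR text with a few regular expressions. The sketch below is illustrative only: the patterns and the `extract_contacts` name are assumptions rather than the project's actual rules, and they are deliberately loose, so expect some false positives (for example, long digit runs matched as phone numbers).

```python
import re

# Hypothetical patterns for illustration; real-world rules are usually stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def _unique(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def extract_contacts(text: str) -> dict:
    """Find emails, phone numbers, and URLs in OCR text."""
    # Trailing sentence punctuation is not part of the URL, so trim it.
    urls = [u.rstrip(".,;:)") for u in URL_RE.findall(text)]
    return {
        "emails": _unique(EMAIL_RE.findall(text)),
        "phones": _unique(PHONE_RE.findall(text)),
        "urls": _unique(urls),
    }


if __name__ == "__main__":
    sample = ("Reach us at example@email.com or +91-9876543210. "
              "Site: https://example.com.")
    print(extract_contacts(sample))
```

Returning plain lists keyed by `emails`, `phones`, and `urls` keeps the result directly serializable as the JSON shown above.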
- Extract contact information from business cards
- Identify links and contact details from screenshots
- Automate simple data extraction from images