✅ Overview
We’ll build receipt_reader.py in 3 clear parts, plus 1 optional export step.
🧱 PART 1: Import & Setup
Paste this at the top of your Python file:
import pytesseract
import cv2
import csv
from PIL import Image
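# Windows only: point pytesseract at the Tesseract executable.
# A raw string (r'...') needs single backslashes.
# On macOS/Linux, delete this line if tesseract is already on your PATH.
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'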
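✅ Explanation: pytesseract runs the OCR, cv2 (OpenCV) loads and preprocesses the image, csv writes the results, and PIL (Pillow) is the image library that pytesseract depends on.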
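The path in Part 1 only fits a default Windows install. Here is a minimal sketch of one way to locate Tesseract on other systems, assuming it may already be on your PATH. The helper name find_tesseract and the WINDOWS_DEFAULT constant are made up for illustration and are not part of pytesseract.

```python
import os
import shutil

# Default install location used in Part 1 (an assumption; adjust if you installed elsewhere).
WINDOWS_DEFAULT = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

def find_tesseract():
    """Return a path to the tesseract executable, or None if it cannot be found."""
    # shutil.which searches the PATH, which covers most macOS/Linux installs.
    on_path = shutil.which('tesseract')
    if on_path:
        return on_path
    # Fall back to the default Windows location if that file exists.
    if os.path.isfile(WINDOWS_DEFAULT):
        return WINDOWS_DEFAULT
    return None

tesseract_path = find_tesseract()
if tesseract_path is None:
    print("Tesseract not found. Install it or set the path by hand.")
else:
    print("Using Tesseract at:", tesseract_path)
    # In receipt_reader.py you would then set:
    # pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

If this prints "Tesseract not found", install Tesseract first or keep the hard-coded path from Part 1.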
🧱 PART 2: Load Image and Extract Text
Paste this after Part 1:
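img = cv2.imread('receipt.jpg')
# cv2.imread returns None (it does not raise an error) when the file cannot be read
if img is None:
    raise FileNotFoundError("Could not read receipt.jpg")
# Grayscale input usually gives cleaner OCR results
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
raw_text = pytesseract.image_to_string(gray)
# Drop blank lines and surrounding whitespace
lines = [line.strip() for line in raw_text.split('\n') if line.strip()]
print("Cleaned Lines:")
for line in lines:
    print(line)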
✅ Explanation: This reads your image, converts it to grayscale, runs OCR on it, and removes blank lines and stray whitespace from the result.
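To see what the cleaning step does without running OCR, here is a small sketch that applies the same logic to a made-up string. The function name clean_lines and the sample text are illustrative only; real pytesseract output will look different.

```python
def clean_lines(raw_text):
    """Split OCR output into lines, dropping blanks and surrounding whitespace."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]

# Made-up sample standing in for pytesseract output (not from a real receipt).
sample_text = "  SHOP NAME \n\nRice ₦1,500\n   \nBread ₦800\n"
lines = clean_lines(sample_text)
print(len(lines), "lines kept")
```

Here the two blank lines are dropped and the padding around "SHOP NAME" is removed, leaving three lines.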
🧱 PART 3: Extract Items and Prices
Paste this after Part 2:
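expenses = []
for line in lines:
    # Split at the LAST ₦ so the item name may itself contain the symbol
    parts = line.rsplit('₦', 1)
    if len(parts) == 2:
        item = parts[0].strip()
        # Remove thousands separators: "1,500" -> "1500"
        price_str = parts[1].replace(',', '').strip()
        try:
            price = int(price_str)
            expenses.append({'item': item, 'price': price})
        except ValueError:
            # Not a whole number (e.g. "1500.50" or OCR noise): skip this line
            pass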
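✅ Explanation: Each line that contains ₦ is split at the last ₦ into an item name and a price, and the results are stored as a list of dictionaries. Lines without ₦, or with a price that is not a whole number, are skipped.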
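Part 3 skips any price that int() cannot parse, which includes prices with kobo such as 800.50. If your receipts have decimals, one possible variation is sketched below. It swaps rsplit and int for a regular expression and Decimal, which is a different technique from the code above. The names PRICE_RE and parse_line and the sample lines are made up for illustration.

```python
import re
from decimal import Decimal

# Item text, then ₦, then digits with optional thousands commas and optional decimals.
PRICE_RE = re.compile(r'^(?P<item>.*?)\s*₦\s*(?P<price>\d[\d,]*(?:\.\d+)?)\s*$')

def parse_line(line):
    """Return {'item': ..., 'price': Decimal} for a priced line, or None otherwise."""
    match = PRICE_RE.match(line)
    if match is None:
        return None
    item = match.group('item').strip()
    price = Decimal(match.group('price').replace(',', ''))
    return {'item': item, 'price': price}

# Made-up sample lines standing in for the cleaned OCR output.
sample_lines = ['SHOP NAME', 'Rice ₦1,500', 'Bread ₦800.50', 'Total ₦abc']
expenses = [parsed for parsed in map(parse_line, sample_lines) if parsed is not None]
for e in expenses:
    print(e['item'], e['price'])
```

Decimal is used instead of float so that money values are not affected by binary rounding. If you save these prices with Part 4, they are written to the CSV as text such as 800.50.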
🧱 PART 4 (Optional): Show and Save Results
Paste this after Part 3:
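print("\nStructured Data:")
for e in expenses:
    print(f"{e['item']} - ₦{e['price']}")
# newline='' stops the csv module from adding blank rows on Windows;
# encoding='utf-8' keeps non-ASCII item names intact
with open('expenses.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(['Item', 'Price'])
    for e in expenses:
        writer.writerow([e['item'], e['price']])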
✅ Explanation: This prints the results and saves them to expenses.csv with an Item, Price header row.
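To check that the export worked, you can read the file back and compare it with what you wrote. The sketch below does this with made-up data and a temporary folder, so it does not touch your real expenses.csv. The helper names save_expenses and load_expenses are illustrative only.

```python
import csv
import os
import tempfile

# Made-up sample data in the same shape as the expenses list from Part 3.
expenses = [{'item': 'Rice', 'price': 1500}, {'item': 'Bread', 'price': 800}]

def save_expenses(rows, path):
    """Write rows to a CSV file with an Item, Price header."""
    with open(path, 'w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        writer.writerow(['Item', 'Price'])
        for row in rows:
            writer.writerow([row['item'], row['price']])

def load_expenses(path):
    """Read the CSV back into the same list-of-dictionaries shape."""
    with open(path, newline='', encoding='utf-8') as file:
        return [{'item': row['Item'], 'price': int(row['Price'])}
                for row in csv.DictReader(file)]

csv_path = os.path.join(tempfile.mkdtemp(), 'expenses.csv')
save_expenses(expenses, csv_path)
loaded = load_expenses(csv_path)
print("Round trip matches:", loaded == expenses)
```

The CSV stores every value as text, which is why load_expenses converts the price back with int().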
✅ Final File Summary
# PART 1
import pytesseract, cv2, csv
from PIL import Image
# Windows only; remove this line if tesseract is already on your PATH
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
# PART 2
img = cv2.imread('receipt.jpg')
if img is None:  # imread returns None when the file cannot be read
    raise FileNotFoundError("Could not read receipt.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
raw_text = pytesseract.image_to_string(gray)
lines = [line.strip() for line in raw_text.split('\n') if line.strip()]
# PART 3
expenses = []
for line in lines:
    parts = line.rsplit('₦', 1)  # split at the last ₦
    if len(parts) == 2:
        item = parts[0].strip()
        price_str = parts[1].replace(',', '').strip()
        try:
            price = int(price_str)
            expenses.append({'item': item, 'price': price})
        except ValueError:
            pass  # price is not a whole number: skip the line
# PART 4
print("\nStructured Data:")
for e in expenses:
    print(f"{e['item']} - ₦{e['price']}")
with open('expenses.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(['Item', 'Price'])
    for e in expenses:
        writer.writerow([e['item'], e['price']])
📦 File/Folder Structure:
receipt_reader.py
receipt.jpg ← your actual image file
expenses.csv ← will be created after run
✅ Run It:
python receipt_reader.py
You should see the cleaned lines and the structured data printed, and expenses.csv saved in the same folder.