🎯 What Are We Doing?
We are assigning a category (like Food, Cleaning, Personal Care) to each receipt item, based on keywords found in the item name.
🧠 Example
You have this item from OCR:
"Sunlight Detergent"
We want to assign it the category:
"Cleaning"
✅ Step-by-Step Breakdown
🔹 Step 1: Create a keyword-to-category dictionary
category_map = {
    "tooth": "Personal Care",
    "detergent": "Cleaning",
    "indomie": "Food",
    "milk": "Beverage",
    "oil": "Food",
    # ...
}
This is a lookup table. If any of these keywords appear in an item name, we assign that category.
🔹 Step 2: Write a function to search each item name
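def categorize(item_name):
    # Lowercase the name so matching is case-insensitive
    item_name = item_name.lower()
    for keyword, category in category_map.items():
        # Substring test: the first keyword found decides the category
        if keyword in item_name:
            return category
    return "Uncategorized"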
This function checks the item name against each keyword in turn and returns the category of the first keyword it finds. If no keyword matches, it returns "Uncategorized".
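One thing to watch for: `keyword in item_name` is a plain substring test, so "oil" also matches inside "Toilet Paper". Below is a minimal sketch of one possible refinement, assuming you only want a keyword to count when it starts a word. The name `categorize_word_start` and the use of the `re` module are illustrative choices, not part of the original code.

```python
import re

category_map = {
    "tooth": "Personal Care",
    "detergent": "Cleaning",
    "indomie": "Food",
    "milk": "Beverage",
    "oil": "Food",
}

def categorize(item_name):
    """Original approach: plain substring search."""
    item_name = item_name.lower()
    for keyword, category in category_map.items():
        if keyword in item_name:
            return category
    return "Uncategorized"

def categorize_word_start(item_name):
    """Hypothetical variant: the keyword must begin a word."""
    item_name = item_name.lower()
    for keyword, category in category_map.items():
        # \b is a word boundary, so "oil" no longer matches inside "toilet",
        # while "tooth" still matches at the start of "toothpaste"
        if re.search(r"\b" + re.escape(keyword), item_name):
            return category
    return "Uncategorized"

print(categorize("Toilet Paper"))                   # substring match on "oil"
print(categorize_word_start("Toilet Paper"))        # no keyword starts a word here
print(categorize_word_start("Colgate Toothpaste"))  # "tooth" starts "toothpaste"
```

Whether this stricter matching is worth it depends on your receipts. Plain substring search is simpler and more forgiving of OCR noise, but it can produce surprising matches like the one above.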
🔍 Example Walkthrough
categorize("Colgate Toothpaste") # returns "Personal Care"
categorize("Indomie Chicken Flavor 120g") # returns "Food"
categorize("GOHH LAF OMIO") # returns "Uncategorized"
✅ Where This Happens in Your Code
for e in expenses:
    e['category'] = categorize(e['item'])
🔄 Final Output
{
    'item': 'Indomie Chicken Flavor 120g',
    'price': 150,
    'category': 'Food'
}
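To see the whole flow in one place, here is a small self-contained sketch. The sample `expenses` list is made up for illustration. In your code it would come from the OCR step, and the prices other than the Indomie one are assumptions.

```python
category_map = {
    "tooth": "Personal Care",
    "detergent": "Cleaning",
    "indomie": "Food",
    "milk": "Beverage",
    "oil": "Food",
}

def categorize(item_name):
    item_name = item_name.lower()
    for keyword, category in category_map.items():
        if keyword in item_name:
            return category
    return "Uncategorized"

# Hypothetical sample data standing in for the OCR output
expenses = [
    {"item": "Indomie Chicken Flavor 120g", "price": 150},
    {"item": "Sunlight Detergent", "price": 300},
    {"item": "GOHH LAF OMIO", "price": 50},
]

# Add a 'category' key to every expense in place
for e in expenses:
    e["category"] = categorize(e["item"])

for e in expenses:
    print(e)
```

The loop modifies each dictionary in place, so after it runs every entry in `expenses` carries its own 'category' value, including "Uncategorized" for names the OCR garbled.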
✅ Summary
We use a dictionary of known keywords like "tooth", "detergent", and "oil" and match them against each item name using basic string search in Python. If a keyword is found, we assign the item that keyword's category. If not, we tag the item as Uncategorized.