Phase 6 – Keyword Matching Logic

🔥 Make your categorization accurate and easy to explain to any director, investor, or audience. This section walks through the Python logic step by step.

🎯 What Are We Doing?

We are assigning a category (like Food, Cleaning, Personal Care) to each receipt item, based on keywords found in the item name.

🧠 Example

You have this item from OCR:

"Sunlight Detergent"

We want to assign it the category:

"Cleaning"

✅ Step-by-Step Breakdown

🔹 Step 1: Create a keyword-to-category dictionary

category_map = {
    "tooth": "Personal Care",
    "detergent": "Cleaning",
    "indomie": "Food",
    "milk": "Beverage",
    "oil": "Food",
    # ...
}

This is a lookup table. If any of these keywords appears in an item name, we assign the corresponding category. The keys are written in lowercase because item names are lowercased before matching.

🔹 Step 2: Write a function to search each item name

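def categorize(item_name):
    """Return the category of the first keyword found in item_name."""
    # Lowercase so matching is case-insensitive ("Toothpaste" -> "toothpaste")
    item_name = item_name.lower()
    for keyword, category in category_map.items():
        # Plain substring check: the keyword may appear anywhere in the name
        if keyword in item_name:
            return category
    # No keyword matched
    return "Uncategorized"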

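This function lowercases the item name, then checks the keywords in the order they appear in the dictionary and returns the category of the first one found anywhere in the name. If no keyword matches, it returns "Uncategorized".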
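One caveat with plain substring search: a short keyword such as "oil" also sits inside unrelated words, so "Toilet Cleaner" would come back as Food. Below is a minimal sketch of one possible way to tighten this, assuming a keyword should only count when it starts a word (so "tooth" still matches "Toothpaste"). The name categorize_word_start is hypothetical and not part of the original code.

```python
import re

category_map = {
    "tooth": "Personal Care",
    "detergent": "Cleaning",
    "indomie": "Food",
    "milk": "Beverage",
    "oil": "Food",
}

def categorize(item_name):
    """Original approach: keyword may appear anywhere in the name."""
    item_name = item_name.lower()
    for keyword, category in category_map.items():
        if keyword in item_name:
            return category
    return "Uncategorized"

def categorize_word_start(item_name):
    """Hypothetical variant: keyword must appear at the start of a word."""
    name = item_name.lower()
    for keyword, category in category_map.items():
        # \b is a word boundary; re.escape guards against special characters
        if re.search(r"\b" + re.escape(keyword), name):
            return category
    return "Uncategorized"

print(categorize("Toilet Cleaner"))                # Food (false match on "oil")
print(categorize_word_start("Toilet Cleaner"))     # Uncategorized
print(categorize_word_start("Colgate Toothpaste")) # Personal Care
```

The trade-off is that word-start matching misses keywords buried in the middle of a word, so it is worth checking against real receipt data before switching.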

🔍 Example Walkthrough

categorize("Colgate Toothpaste")  # returns "Personal Care"
categorize("Indomie Chicken Flavor 120g")  # returns "Food"
categorize("GOHH LAF OMIO")  # returns "Uncategorized"
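To make each result easier to explain, the function could also report which keyword triggered the match. This is a small sketch under that assumption; categorize_with_reason is a hypothetical helper, not part of the original code.

```python
category_map = {
    "tooth": "Personal Care",
    "detergent": "Cleaning",
    "indomie": "Food",
    "milk": "Beverage",
    "oil": "Food",
}

def categorize_with_reason(item_name):
    """Hypothetical helper: return (category, matched keyword or None)."""
    name = item_name.lower()
    for keyword, category in category_map.items():
        if keyword in name:
            return category, keyword
    return "Uncategorized", None

for item in ["Colgate Toothpaste", "Indomie Chicken Flavor 120g", "GOHH LAF OMIO"]:
    category, keyword = categorize_with_reason(item)
    print(f"{item} -> {category} (matched keyword: {keyword})")
```

With the matched keyword in hand, you can answer "why was this item put in Food?" by pointing at a specific entry in the lookup table.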

✅ Where This Happens in Your Code

for e in expenses:
    e['category'] = categorize(e['item'])
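Here is a self-contained sketch of that loop running end to end. The expenses list is sample data for illustration: only the Indomie price comes from this guide, the other prices are made up. The needs_review list is a hypothetical addition for collecting items that no keyword matched.

```python
category_map = {
    "tooth": "Personal Care",
    "detergent": "Cleaning",
    "indomie": "Food",
    "milk": "Beverage",
    "oil": "Food",
}

def categorize(item_name):
    item_name = item_name.lower()
    for keyword, category in category_map.items():
        if keyword in item_name:
            return category
    return "Uncategorized"

# Sample data for illustration (assumed shape: dicts with 'item' and 'price')
expenses = [
    {"item": "Indomie Chicken Flavor 120g", "price": 150},
    {"item": "Sunlight Detergent", "price": 500},  # made-up price
    {"item": "GOHH LAF OMIO", "price": 200},       # made-up price, garbled OCR text
]

# Attach a category to every expense in place
for e in expenses:
    e["category"] = categorize(e["item"])

# Hypothetical extra step: list items that need a human look or a new keyword
needs_review = [e["item"] for e in expenses if e["category"] == "Uncategorized"]

print(expenses[0])
print(needs_review)
```

Reviewing the uncategorized items from time to time is a simple way to decide which keywords to add to the lookup table next.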

🔄 Final Output

{
  'item': 'Indomie Chicken Flavor 120g',
  'price': 150,
  'category': 'Food'
}

✅ Summary

We use a dictionary that maps known keywords like "tooth", "detergent", and "oil" to categories, and match those keywords against each item name using a case-insensitive substring search in Python. If a keyword is found, we assign the item that keyword's category. If not, we tag it as Uncategorized.