Bank Statement Parser

Parse Mexican bank statement PDFs (American Express, BBVA, HSBC) into CSV files compatible with personal finance apps such as Sure and Monarch Money.

Features

  • Multi-bank support: Amex, BBVA, HSBC Mexico with auto-detection
  • Smart categorization: SQLite-backed rules with priority, bank-specific patterns, and ~80 pre-seeded rules
  • Multiple export formats: Generic (all fields), Sure/Maybe Finance, Monarch Money
  • MSI tracking: Installment info ("cargo X de Y") is preserved for both "sin intereses" and "con intereses" plans
  • Foreign currency: Original amount, currency code, and exchange rate preserved
  • Multi-cardholder: Distinguishes titular vs additional cardholders (Amex)
  • Filtering: By cardholder, transaction type, fees, MSI, charges-only
  • OCR fallback: HSBC statements with CID-encoded fonts are parsed via Tesseract OCR
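
As a rough illustration of how priority-ordered, bank-aware rules could be resolved in SQLite, here is a minimal sketch. The `rules` table, its columns, the sample rules, and the `categorize` helper are all hypothetical; the project's actual schema and matching logic may differ.

```python
import sqlite3

# Hypothetical schema: the project's real tables and columns may differ.
SCHEMA = """
CREATE TABLE rules (
    pattern  TEXT NOT NULL,              -- substring to find in the description
    category TEXT NOT NULL,
    bank     TEXT,                       -- NULL means the rule applies to every bank
    priority INTEGER NOT NULL DEFAULT 0
)
"""


def categorize(conn, description, bank):
    """Return the category of the highest-priority matching rule, or None."""
    row = conn.execute(
        """
        SELECT category FROM rules
        WHERE instr(lower(?), lower(pattern)) > 0
          AND (bank IS NULL OR bank = ?)
        ORDER BY priority DESC, bank IS NULL  -- bank-specific rules win ties
        LIMIT 1
        """,
        (description, bank),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO rules (pattern, category, bank, priority) VALUES (?, ?, ?, ?)",
    [
        ("uber", "Transport", None, 10),
        ("uber eats", "Food", None, 20),
        ("spei", "Transfers", "bbva", 10),
    ],
)

print(categorize(conn, "UBER EATS MX", "amex"))  # Food
```

In this sketch the more specific "uber eats" rule wins because it carries a higher priority, and the "spei" rule only fires for statements detected as BBVA.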

Installation

git clone https://github.com/Perafan18/bank-statement-parser.git
cd bank-statement-parser
pip install -e ".[dev]"

OCR support (required for HSBC)

HSBC Mexico statements use CID-encoded fonts that pdfplumber cannot decode. To parse HSBC statements, install the OCR dependencies:

# Python packages
pip install -e ".[ocr]"

# System packages (Ubuntu/Debian)
sudo apt install tesseract-ocr tesseract-ocr-spa poppler-utils

# macOS
brew install tesseract poppler

Without these, BBVA and Amex statements still parse normally; HSBC parsing fails with an error message that includes install instructions.
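
To check the system packages before parsing an HSBC statement, something like the following sketch may help. It assumes the OCR path relies on the `tesseract` and `pdftoppm` executables (the latter ships with poppler); the `missing_ocr_tools` helper is hypothetical and not part of the project.

```python
import shutil

# Executables assumed to come from the system packages listed above:
# `tesseract` from tesseract-ocr, `pdftoppm` from poppler-utils / poppler.
REQUIRED_TOOLS = ("tesseract", "pdftoppm")


def missing_ocr_tools(which=shutil.which):
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in REQUIRED_TOOLS if which(tool) is None]


missing = missing_ocr_tools()
if missing:
    print("Missing OCR tools:", ", ".join(missing))
else:
    print("OCR system dependencies found")
```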

Quick Start

# Parse a statement (auto-detects bank)
bankparse parse statement.pdf

# Specify bank and format
bankparse parse statement.pdf --bank amex --format sure

# Multiple files at once
bankparse parse *.pdf -f sure -o all_transactions.csv

# Only actual purchases (no fees, interest, MSI)
bankparse parse statement.pdf --charges-only

# Exclude fees and interest but keep everything else
bankparse parse statement.pdf --no-fees

# Exclude MSI installments
bankparse parse statement.pdf --no-msi

# Filter by cardholder name (substring match)
bankparse parse statement.pdf --cardholder garcia
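
One way to read the filter flags above is as predicates applied to each parsed transaction. The sketch below is illustrative only: the `Transaction` record, its fields, and the `keep` function are hypothetical, and treating the cardholder match as case-insensitive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical record; the parser's real model may use other fields.
    description: str
    cardholder: str
    kind: str  # assumed values: "charge", "fee", "interest"
    is_msi: bool = False


def keep(tx, *, charges_only=False, no_fees=False, no_msi=False, cardholder=None):
    """Return True if the transaction passes the selected filters."""
    # --charges-only and --no-fees both drop fees and interest.
    if (charges_only or no_fees) and tx.kind in ("fee", "interest"):
        return False
    # --charges-only and --no-msi both drop MSI installments.
    if (charges_only or no_msi) and tx.is_msi:
        return False
    # Substring match; case-insensitivity is an assumption of this sketch.
    if cardholder is not None and cardholder.lower() not in tx.cardholder.lower():
        return False
    return True


txs = [
    Transaction("SUPERMERCADO", "JUAN GARCIA", "charge"),
    Transaction("COMISION ANUALIDAD", "JUAN GARCIA", "fee"),
    Transaction("TIENDA 03 DE 12", "ANA LOPEZ", "charge", is_msi=True),
]

print([t.description for t in txs if keep(t, charges_only=True)])  # ['SUPERMERCADO']
```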

See Usage for the full CLI reference and Architecture for project internals.