MongoDB Sync

Sync PropertyMe data to MongoDB for analytics, AI/ML pipelines, and system integrations.

Installation

uv add pypropertyme[mongodb]

Configuration

Set these environment variables (in .env or your shell):

Variable          Description
MONGODB_URI       MongoDB connection string (e.g., mongodb://localhost:27017)
MONGODB_DATABASE  Database name
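
For scripts that need the same settings, a minimal sketch of reading them from the environment might look like the following. `MongoSettings` and `load_mongo_settings` are hypothetical names used for illustration and are not part of pypropertyme; loading a .env file is left out.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class MongoSettings:
    """Connection settings read from the environment (hypothetical helper)."""

    uri: str
    database: str


def load_mongo_settings(env=None) -> MongoSettings:
    """Read MONGODB_URI and MONGODB_DATABASE, failing loudly if either is missing."""
    env = os.environ if env is None else env
    required = ("MONGODB_URI", "MONGODB_DATABASE")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return MongoSettings(uri=env["MONGODB_URI"], database=env["MONGODB_DATABASE"])
```

Failing early with the names of the missing variables tends to be easier to debug than a connection error raised later by the driver.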

Quick Start

# 1. Set environment variables
export MONGODB_URI="mongodb://localhost:27017"
export MONGODB_DATABASE="pypropertyme"

# 2. Sync data
pypropertyme-mongodb sync

# 3. Check document counts
pypropertyme-mongodb status

CLI Commands

pypropertyme-mongodb sync

Syncs all records from PropertyMe to MongoDB.

pypropertyme-mongodb sync [--dry-run] [--collection <COLLECTION>]

Option        Description
--dry-run     Preview sync without writing to MongoDB
--collection  Sync specific collection only (e.g., contacts, properties)

pypropertyme-mongodb status

Shows collection statistics and document counts.

pypropertyme-mongodb status

pypropertyme-mongodb test-connection

Tests connections to both PropertyMe and MongoDB.

pypropertyme-mongodb test-connection

Collections

The sync creates 7 collections in MongoDB:

Collection        Description
contacts          Owners, tenants, suppliers
members           Agency staff
properties        Managed properties
tenancy_balances  Financial data (rent, arrears, bond)
jobs              Maintenance jobs
inspections       Property inspections
tasks             Tasks and reminders

Note

tenancy_balances is a superset of tenancy data, containing all tenancies (active and closed) plus financial balance information. A separate tenancies collection is not needed.

How It Works

Document Structure

Each document is the complete JSON serialization of its Pydantic model, stored without transformation. MongoDB adds its own _id field:

{
  "_id": ObjectId("..."),
  "id": "abea0092-b4f4-c250-9bc4-eba5ca44a35b",
  "name_text": "Gary Jarrel",
  "email": "gary@example.com",
  "is_owner": true,
  "roles": ["owner"]
}
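
As a rough illustration of how a model becomes such a document, the sketch below dumps a trimmed-down model to a plain dict. The `Contact` class here is hypothetical (the real pypropertyme models carry many more fields), and the code assumes Pydantic v2, where `model_dump` is available.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class Contact(BaseModel):
    """Hypothetical, trimmed-down stand-in for a pypropertyme contact model."""

    id: str
    name_text: str
    email: Optional[str] = None
    is_owner: bool = False
    roles: List[str] = Field(default_factory=list)


contact = Contact(
    id="abea0092-b4f4-c250-9bc4-eba5ca44a35b",
    name_text="Gary Jarrel",
    email="gary@example.com",
    is_owner=True,
    roles=["owner"],
)

# mode="json" converts values such as datetimes and UUIDs to JSON-safe types.
document = contact.model_dump(mode="json")
```

The resulting dict has no `_id` key; MongoDB generates one when the document is first inserted.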

Upsert Logic

The sync uses MongoDB's bulk_write with ReplaceOne operations keyed by the id field, so each document is inserted if it is new and replaced in full if it already exists.

Stale Record Refresh

After syncing, documents in MongoDB that weren't included in the PropertyMe sync response are refreshed individually:

  • For each stale document, the sync fetches its current state from PropertyMe
  • Documents that return 404 (deleted from PropertyMe) are logged and skipped
  • Documents that have become archived/closed are updated with their true state
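
The refresh steps above can be sketched as a set difference followed by a per-record fetch. This is an illustration only: `NotFound`, `find_stale_ids`, `refresh_stale`, and the `fetch_one` callable are hypothetical stand-ins, and the code is synchronous for brevity, whereas the real sync is async.

```python
import logging

logger = logging.getLogger(__name__)


class NotFound(Exception):
    """Stand-in for whatever error the API client raises on a 404 (hypothetical)."""


def find_stale_ids(stored_ids, synced_ids):
    """IDs present in MongoDB but absent from the latest sync response."""
    return set(stored_ids) - set(synced_ids)


def refresh_stale(stale_ids, fetch_one):
    """Fetch each stale record; return (refreshed documents, ids that returned 404)."""
    refreshed, missing = [], []
    for record_id in sorted(stale_ids):
        try:
            refreshed.append(fetch_one(record_id))
        except NotFound:
            # Deleted upstream: log the id and move on without touching MongoDB.
            logger.warning("Record %s no longer exists in PropertyMe", record_id)
            missing.append(record_id)
    return refreshed, missing
```

The refreshed documents can then go through the same upsert path as the main sync, which is how archived or closed records end up with their current state.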

Indexing

A unique index on the id field is created for each collection to ensure efficient upserts and prevent duplicates.
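
With pymongo, an index like this can be created using `create_index`. The sketch below assumes `db` is a pymongo `Database` handle; `ensure_id_indexes` is a hypothetical helper name, and the collection list is taken from the table under Collections.

```python
COLLECTIONS = [
    "contacts",
    "members",
    "properties",
    "tenancy_balances",
    "jobs",
    "inspections",
    "tasks",
]


def ensure_id_indexes(db, collection_names=COLLECTIONS):
    """Create a unique index on `id` for each named collection."""
    for name in collection_names:
        # Re-running this is safe: creating an identical index is a no-op.
        db[name].create_index("id", unique=True)
```

With the unique index in place, an insert that would duplicate an existing `id` fails with a duplicate key error instead of silently creating a second copy.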

Module Structure

src/pypropertyme/sync/mongodb/
├── __init__.py  # Module exports
├── cli.py       # CLI commands (pypropertyme-mongodb)
└── sync.py      # Core sync logic

Programmatic Usage

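import asyncio

from pypropertyme.client import Client
from pypropertyme.sync.mongodb import MongoDBSync


async def main(token):
    # Create PropertyMe client first (token is your PropertyMe access token)
    pme_client = Client.get_client(token)

    # Create sync instance
    sync = MongoDBSync(pme_client, "mongodb://localhost:27017", "pypropertyme")

    # Run sync
    await sync.sync_all()

    # Sync specific collection
    await sync.sync_collection("contacts")

    # Get status
    status = await sync.get_status()
    print(status)


# The sync methods are coroutines, so run them inside an event loop.
# Supply your own token, then call:
# asyncio.run(main(token))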

Querying Data

Once synced, you can query the data using MongoDB's query language:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["pypropertyme"]

# Find all owners
owners = list(db.contacts.find({"is_owner": True}))

# Find properties with rent > 500
high_rent = list(db.properties.find({"rent_amount": {"$gt": 500}}))

# Find open jobs
open_jobs = list(db.jobs.find({"status": {"$in": ["Assigned", "Quoted"]}}))

# Find tenancies more than 14 days in arrears
overdue = list(db.tenancy_balances.find({"arrears_days": {"$gt": 14}}))

Use Cases

Analytics Dashboard

# Aggregate rent by property type
pipeline = [
    {"$group": {
        "_id": "$property_type",
        "total_rent": {"$sum": "$rent_amount"},
        "count": {"$sum": 1}
    }}
]
results = list(db.properties.aggregate(pipeline))

AI/ML Integration

import pandas as pd

# Export to DataFrame for ML
contacts_df = pd.DataFrame(list(db.contacts.find()))
properties_df = pd.DataFrame(list(db.properties.find()))

# Feature engineering
features = properties_df[["rent_amount", "bedrooms", "bathrooms"]]

Full-Text Search

# Create text index
db.contacts.create_index([("name_text", "text"), ("email", "text")])

# Search contacts whose name or email matches "smith"
results = list(db.contacts.find({"$text": {"$search": "smith"}}))