A beginner-friendly guide — no programming knowledge needed. Just follow the steps.
We'll create a small program that:
Searches for commercial properties on RealCommercial
Grabs the details (price, agent, description, demographics)
Prints them nicely in the terminal
The program uses two strategies: a fast API call for search results, and a real browser as backup when the website blocks automated requests.
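In miniature, that fallback pattern looks like this (a sketch with stand-in helpers; `fetch_via_api` and `fetch_via_browser` are hypothetical names for the pieces this guide builds out later):

```python
from dataclasses import dataclass


@dataclass
class Response:
    status_code: int
    text: str


# Stand-ins for the real helpers built later in this guide (hypothetical here).
def fetch_via_api(url: str) -> Response:
    return Response(429, "")  # pretend the site blocked our plain request


def fetch_via_browser(url: str) -> str:
    return "<html>rendered by a real browser</html>"


def fetch(url: str) -> str:
    response = fetch_via_api(url)      # fast path: plain HTTP request
    if response.status_code == 429:    # 429 = rate-limited / blocked
        return fetch_via_browser(url)  # slow path: drive a real browser
    return response.text


print(fetch("https://example.com"))
```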
Windows:
Go to python.org/downloads
Click the yellow button that says "Download Python 3.12.x"
Run the downloaded .exe file
IMPORTANT: Check the box that says "Add Python to PATH" at the bottom
Click "Install Now" and wait for it to finish
macOS:
Go to python.org/downloads
Click the yellow button for "Download Python 3.12.x"
Open the downloaded .pkg file and follow the installer
Ubuntu / Debian Linux:
sudo apt update
sudo apt install python3.12 python3.12-venv
Fedora Linux:
sudo dnf install python3.12
Verify it worked: Open a terminal (Command Prompt on Windows, Terminal on Mac/Linux) and type:
python3.12 --version

You should see: Python 3.12.x
We need Brave or Google Chrome installed. Either works.
Brave Browser (recommended):
Go to brave.com/download
Download and install like any other program
Google Chrome:
Go to google.com/chrome
Download and install
Important for Linux users: If you install Brave via the terminal, note where it's installed. The default is usually /usr/bin/brave-browser. If it's somewhere else, you'll update one line of code later.
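To check where your browser actually lives on Linux, you can ask the shell (the binary names below are the common package defaults; yours may differ):

```shell
# Prints the first browser binary found on your PATH, or a warning if none is.
command -v brave-browser || command -v brave || command -v google-chrome || echo "no browser found on PATH"
```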
uv is a fast tool that installs Python libraries. We use it instead of pip.
Windows (PowerShell):
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
macOS / Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
After installing, close and reopen your terminal.
Verify:
uv --version

You should see something like: uv 0.x.x
Open your terminal and run these commands one at a time:
# Create the main folder
mkdir realcommercial-scraper
cd realcommercial-scraper
# Create subfolders
mkdir browser
mkdir models
Now create the following files. Copy-paste each one exactly.
pyproject.toml:

```toml
[project]
name = "realcommercial-scraper"
version = "0.1.0"
description = "Scraper for realcommercial.com.au"
readme = "README.md"
requires-python = ">=3.12"
dependencies = [
    "httpx",
    "pydantic",
    "psutil",
    "websocket-client",
]

[tool.ruff.lint]
fixable = ["ALL"]
select = ["I", "B", "E"]

[tool.pyright]
typeCheckingMode = "strict"
venvPath = "."
venv = ".venv"
```
constants.py:

```python
from pathlib import Path

HOME = Path.home()
BASE_DIR = Path(__file__).parent

# Browser location — CHANGE THIS if your browser is elsewhere
DEBUG_BROWSER_PATH = "/usr/bin/brave-browser"

# Profile folder (cookies & settings are saved here)
USER_PROFILE_DIR = BASE_DIR / "browser" / "browser_profile"

# Browser window size
WIN_W = 720
WIN_H = 760
```
For Windows users, change the browser path to something like:

```python
DEBUG_BROWSER_PATH = "C:\\Program Files\\BraveSoftware\\Brave-Browser\\Application\\brave.exe"
```

For macOS users:

```python
DEBUG_BROWSER_PATH = "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"
```

If you use Google Chrome instead, point the path at your Chrome executable.
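If you'd rather not edit constants.py by hand on every machine, you could pick a default automatically. This is a sketch; the paths below are common install locations and may differ on your system:

```python
import platform

# Common default install locations (assumptions; adjust for your machine).
DEFAULT_PATHS = {
    "Linux": "/usr/bin/brave-browser",
    "Darwin": "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser",
    "Windows": "C:\\Program Files\\BraveSoftware\\Brave-Browser\\Application\\brave.exe",
}

# Fall back to the Linux default if the platform is unrecognized.
DEBUG_BROWSER_PATH = DEFAULT_PATHS.get(platform.system(), "/usr/bin/brave-browser")
print(DEBUG_BROWSER_PATH)
```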
api.py:

```python
import httpx

from browser.browser import render_html
from listing_parse import parse_listing
from models.browser import Browser
from models.listing import PropertyDetailResponse
from models.request import SearchPayload
from models.search import SearchResponse


class RealCommercialClient:
    def __init__(self) -> None:
        self.client: httpx.Client = httpx.Client(
            headers={
                "Accept": "*/*",
                "Accept-Language": "en-US,en;q=0.6",
                "Content-Type": "application/json",
                "Origin": "https://www.realcommercial.com.au",
                "Referer": "https://www.realcommercial.com.au/",
                "User-Agent": (
                    "Mozilla/5.0 (X11; Linux x86_64) "
                    "AppleWebKit/537.36 (KHTML, like Gecko) "
                    "Chrome/146.0.0.0 Safari/537.36"
                ),
            },
            timeout=30.0,
        )

    def __enter__(self) -> "RealCommercialClient":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        self.close()

    def search(self, request: SearchPayload) -> SearchResponse:
        response = self.client.post(
            "https://api.realcommercial.com.au/listing-ui/searches",
            json=request.model_dump(
                by_alias=True, exclude_none=True, exclude_unset=True
            ),
        )
        response.raise_for_status()
        return SearchResponse.model_validate(response.json())

    def update_headers(self) -> None:
        # Switch to browser-like headers before fetching a listing page.
        self.client.headers.update({
            "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
            "accept-encoding": "gzip, deflate, br",
            "accept-language": "en-US,en;q=0.9",
            "cache-control": "no-cache",
            "dnt": "1",
            "pragma": "no-cache",
            "priority": "u=0, i",
            "referer": "https://www.realcommercial.com.au/leased/property-678-high-street-thornbury-vic-3071-505055376",
            "sec-ch-ua": '"Chromium";v="146", "Not-A.Brand";v="24", "Brave";v="146"',
            "sec-ch-ua-mobile": "?0",
            "sec-ch-ua-platform": '"Linux"',
            "sec-fetch-dest": "document",
            "sec-fetch-mode": "navigate",
            "sec-fetch-site": "same-origin",
            "sec-gpc": "1",
            "upgrade-insecure-requests": "1",
            "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.0.0 Safari/537.36",
        })

    def listing_details(
        self, pdp_url: str, browser: Browser, timeout: int = 5
    ) -> PropertyDetailResponse:
        self.update_headers()
        if pdp_url.startswith("/"):
            pdp_url = pdp_url[1:]
        full_url = f"https://www.realcommercial.com.au/{pdp_url}"
        response = self.client.get(full_url)
        if response.status_code == 429:
            # Blocked: fall back to rendering the page in a real browser.
            html = render_html(browser, full_url, timeout)
            return parse_listing(html)
        response.raise_for_status()
        return parse_listing(response.text)

    def close(self) -> None:
        self.client.close()
```
listing_parse.py:

```python
import json

from models.listing import PropertyDetailResponse


def isolate_data_json(html_content: str) -> str:
    # The page embeds its data as a JSON blob: " REA.pageData = {...};</script>"
    first_part = " REA.pageData = "
    second_part = ";</script>"
    chopped = html_content.split(first_part)[-1]
    chopped = chopped.split(second_part)[0]
    return chopped


def parse_listing(html: str) -> PropertyDetailResponse:
    json_data = isolate_data_json(html)
    return PropertyDetailResponse(**json.loads(json_data))


if __name__ == "__main__":
    with open("index.html", "r") as f:
        html_content = f.read()
    result = parse_listing(html_content)
    print(f"Title: {result.listing.title}")
    print(f"Description: {result.listing.description[:200]}...")
```
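To see the splitting trick in isolation, here's a standalone toy run with made-up HTML (the real page embeds a much larger JSON blob in exactly the same shape):

```python
import json

# A made-up page in the same shape the real site uses.
html = '<script> REA.pageData = {"listing": {"title": "Shop & Retail"}};</script>'

# Same two splits as isolate_data_json above.
chopped = html.split(" REA.pageData = ")[-1]
chopped = chopped.split(";</script>")[0]

data = json.loads(chopped)
print(data["listing"]["title"])  # Shop & Retail
```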
main.py:

```python
from api import RealCommercialClient
from browser.browser import get_browser_instance
from models.listing import DemographicData
from models.request import SearchPayload as SearchRequest
from models.request import SimpleFilters as SearchFilters
from models.search import SearchListing


def print_separator(char: str = "=", length: int = 10) -> None:
    print(char * length)


def print_listing_summary(listing: SearchListing) -> None:
    print(f"{listing.title}")
    print(f"  Address: {listing.address.suburb_address}")
    area = getattr(listing.attributes, "area", "N/A")
    agent_name = listing.agents[0].name if listing.agents else "N/A"
    print(f"  Area: {area}")
    print(f"  Agent: {agent_name}")
    print(f"  URL: {listing.pdp_url}")


def print_demographics(demographic_data: DemographicData) -> None:
    print("  Demographics:")
    for insight in demographic_data.insights:
        print(f"    • {insight.label}: {insight.value}")


def main() -> None:
    search_config = SearchRequest(
        channel="leased",
        filters=SearchFilters(
            within_radius="includesurrounding",
            surrounding_suburbs=True,
        ),
        page=1,
        page_size=100,
    )
    browser = get_browser_instance(9999)
    with RealCommercialClient() as client:
        response = client.search(search_config)
        print(f"Available results: {response.available_results}")
        print(f"Returned listings: {len(response.listings)}")
        print_separator()
        for listing in response.listings:
            print_listing_summary(listing)
            print_separator("-")
            data = client.listing_details(listing.pdp_url, browser)
            description = data.listing.description
            print("Details:")
            print(f"{description[:300]}...")
            if data.demographic_data:
                print_demographics(data.demographic_data)
            break  # only fetch details for the first listing


if __name__ == "__main__":
    main()
```
Create each file below in its correct folder. Start with two empty files (they mark the folders as Python packages):

browser/__init__.py
models/__init__.py
models/request.py:

```python
from pydantic import BaseModel


class SimpleFilters(BaseModel):
    within_radius: str = "includesurrounding"
    surrounding_suburbs: bool = True


class SearchPayload(BaseModel):
    channel: str
    filters: SimpleFilters
    page: int = 1
    page_size: int = 100
```
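To see what the search request will send, you can dump the model to a dictionary; this is essentially what api.py does before posting. A standalone toy run:

```python
from pydantic import BaseModel


class SimpleFilters(BaseModel):
    within_radius: str = "includesurrounding"
    surrounding_suburbs: bool = True


class SearchPayload(BaseModel):
    channel: str
    filters: SimpleFilters
    page: int = 1
    page_size: int = 100


# Build a payload and inspect the dictionary that would be sent as JSON.
payload = SearchPayload(channel="leased", filters=SimpleFilters())
print(payload.model_dump())
```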
models/search.py:

```python
from typing import Any

from pydantic import BaseModel, Field, HttpUrl


class BrandingLogo(BaseModel):
    alt: str
    url: HttpUrl
    url_template: HttpUrl = Field(alias="urlTemplate")


class Branding(BaseModel):
    color: str
    logo: BrandingLogo


class Address(BaseModel):
    street_address: str = Field(alias="streetAddress")
    suburb_address: str = Field(alias="suburbAddress")


class Phone(BaseModel):
    display: str
    dial: str


class Photo(BaseModel):
    alt: str
    url: HttpUrl
    url_template: HttpUrl | None = Field(None, alias="urlTemplate")


class Agent(BaseModel):
    id: str
    name: str
    image_path: HttpUrl | None = Field(None, alias="imagePath")
    image_url_template: HttpUrl | None = Field(None, alias="imageUrlTemplate")
    enquiry_uri: str | None = Field(None, alias="enquiryUri")
    phone: Phone | None = None


class Agency(BaseModel):
    id: str
    name: str
    additional_branding: bool = Field(alias="additionalBranding")
    branding: Branding
    phone: Phone | None = None
    salespeople: list[Agent] = []


class ListingAttributes(BaseModel):
    area: str | None = None


class ConjunctionalAgency(BaseModel):
    pass


class Omniture(BaseModel):
    pass


class ListingDetails(BaseModel):
    pass


class Advertising(BaseModel):
    pass


class SearchListing(BaseModel):
    highlights: list[str] = []
    agencies: list[Agency] = []
    days_active: int = Field(alias="daysActive")
    agents: list[Agent] = []
    attributes: ListingAttributes
    id: str
    has_tour: bool = Field(alias="hasTour")
    pdp_url: str = Field(alias="pdpUrl")
    title: str
    conjunctional_agencies: list[ConjunctionalAgency] = Field(
        default_factory=list, alias="conjunctionalAgencies"
    )
    product: str
    other_agencies: list[str] = Field(default_factory=list, alias="otherAgencies")
    omniture: Omniture = Field(default_factory=Omniture)
    status: str
    address: Address
    details: ListingDetails = Field(default_factory=ListingDetails)
    branding: Branding
    photos: list[Photo] = []


class SearchResponse(BaseModel):
    listings: list[SearchListing]
    surrounding_suburb_listings: list[Any] = Field(alias="surroundingSuburbListings")
    resolved_locations: list[Any] = Field(alias="resolvedLocations")
    available_results: int = Field(alias="availableResults")
    advertising: Advertising = Field(default_factory=Advertising)
```
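Most of these models exist to translate the API's camelCase keys into Python's snake_case via Field(alias=...). A tiny standalone example of how that translation works:

```python
from pydantic import BaseModel, Field


class Address(BaseModel):
    street_address: str = Field(alias="streetAddress")
    suburb_address: str = Field(alias="suburbAddress")


# The API sends camelCase keys; the model exposes snake_case attributes.
addr = Address.model_validate(
    {"streetAddress": "678 High St", "suburbAddress": "Thornbury, VIC 3071"}
)
print(addr.street_address)  # 678 High St
```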
models/listing.py:

```python
from datetime import datetime
from typing import Any

from pydantic import BaseModel, Field, HttpUrl


class FullAddress(BaseModel):
    street_address: str = Field(alias="streetAddress")
    suburb_address: str = Field(alias="suburbAddress")
    state: str
    postcode: str
    suburb: str
    marketing_region: str | None = Field(None, alias="marketingRegion")
    marketing_suburb: str | None = Field(None, alias="marketingSuburb")


class PriceInfo(BaseModel):
    display: str
    is_price_hidden: bool = Field(alias="isPriceHidden")


class ListingPrice(BaseModel):
    leased: PriceInfo | None = None


class Photo(BaseModel):
    alt: str
    url: HttpUrl
    url_template: HttpUrl | None = Field(None, alias="urlTemplate")


class FloorPlan(BaseModel):
    alt: str
    url: HttpUrl
    url_template: HttpUrl | None = Field(None, alias="urlTemplate")


class Attribute(BaseModel):
    id: str
    label: str
    value: str


class MapData(BaseModel):
    zoom_level: int = Field(alias="zoomLevel")
    thumbnail: HttpUrl
    lat: float
    lng: float
    precision: str


class PropertyType(BaseModel):
    id: str
    url: str
    old_url: str = Field(alias="oldUrl")
    agency_url: str = Field(alias="agencyUrl")
    marketing: str
    display_text: str = Field(alias="displayText")
    long_display_text: str = Field(alias="longDisplayText")
    sentence_display_text: str = Field(alias="sentenceDisplayText")
    pdp_title: str = Field(alias="pdpTitle")
    icon_name: str | None = Field(None, alias="iconName")


class TenureType(BaseModel):
    key: str
    omniture: str
    display_text: str = Field(alias="displayText")


class AvailableChannel(BaseModel):
    id: str
    price: str
    omniture: str
    tealium: str
    campaign: str
    krux: str
    url: str
    ad_area: str = Field(alias="adArea")
    short_human_readable: str = Field(alias="shortHumanReadable")
    display_in_select: str = Field(alias="displayInSelect")
    human_readable: str = Field(alias="humanReadable")
    title_variant: str = Field(alias="titleVariant")
    title: str


class DescriptionMetadata(BaseModel):
    phone_numbers: list[str] = Field(default_factory=list, alias="phoneNumbers")


class SimilarListing(BaseModel):
    id: str
    title: str
    pdp_url: str = Field(alias="pdpUrl")
    address: Any
    area: str | None = None
    price: ListingPrice
    main_photo: Photo = Field(alias="mainPhoto")
    branding: Any
    product: str
    property_type_objects: list[PropertyType] = Field(alias="propertyTypeObjects")


class DemographicInsightItem(BaseModel):
    icon: str
    label: str
    value: str
    typename: str = Field(alias="__typename")


class DemographicData(BaseModel):
    summary: str
    insights: list[DemographicInsightItem]


class Listing(BaseModel):
    id: str
    title: str
    description: str
    canonical_path: str = Field(alias="canonicalPath")
    product: str
    status: str | None = None
    days_active: int = Field(alias="daysActive")
    last_updated_at: datetime = Field(alias="lastUpdatedAt")
    address: FullAddress
    price: ListingPrice
    photos: list[Photo] = []
    floor_plans: list[FloorPlan] = Field(alias="floorPlans")
    main_photo: Photo | None = None
    attributes: list[Attribute] = []
    highlights: list[str] = []
    map: MapData
    agencies: list[Any] = []
    branding: Any
    property_type_objects: list[PropertyType] = Field(alias="propertyTypeObjects")
    tenure_type_object: TenureType = Field(alias="tenureTypeObject")
    available_channel_objects: list[AvailableChannel] = Field(
        alias="availableChannelObjects"
    )
    similar_listings: list[SimilarListing] = Field(alias="similarListings")
    description_metadata: DescriptionMetadata = Field(
        default_factory=DescriptionMetadata, alias="descriptionMetadata"
    )
    high_quality_listing: bool = Field(alias="highQualityListing")
    multiple_properties: bool = Field(alias="multipleProperties")
    tours: list[Any] = []
    websites: list[Any] = []


class PropertyDetailResponse(BaseModel):
    listing: Listing
    demographic_data: DemographicData | None = Field(None, alias="demographicData")
```
models/browser.py:

```python
from typing import Any

from pydantic import BaseModel
from websocket import WebSocket


class DebuggerInfo(BaseModel):
    last_updates: int = -1


class Browser(BaseModel):
    ws: WebSocket = WebSocket()
    process_id: int = 0
    process: Any = None
    debugger_info: DebuggerInfo = DebuggerInfo()

    class Config:
        arbitrary_types_allowed = True

    def connect(self) -> None:
        if not self.ws.connected:
            self.ws.connect(self.ws.url)
```
browser/utils.py:

```python
import json
import subprocess
import urllib.request

from constants import DEBUG_BROWSER_PATH, USER_PROFILE_DIR, WIN_H, WIN_W


def spawn_debug_browser(debug_port: int, headless: bool = False):
    cmd = [
        DEBUG_BROWSER_PATH,
        f"--remote-debugging-port={debug_port}",
        f"--user-data-dir={USER_PROFILE_DIR}",
        f"--window-size={WIN_W},{WIN_H}",
        "--no-first-run",
        "--no-default-browser-check",
    ]
    if headless:
        cmd.append("--headless")
    process = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return process.pid, process


def attach_debugger(browser, debug_port: int):
    try:
        resp = urllib.request.urlopen(f"http://localhost:{debug_port}/json/version")
        data = json.loads(resp.read())
        ws_url = data.get("webSocketDebuggerUrl")
        if ws_url:
            browser.ws.url = ws_url
            browser.ws.connect(ws_url)
            browser.debugger_info.last_updates = 1
    except Exception:
        browser.debugger_info.last_updates = -1
```
browser/actions.py:

```python
import json
import time
import uuid


def _send(browser, method: str, params: dict | None = None):
    # Send one Chrome DevTools Protocol command and wait for a reply.
    msg = {"id": uuid.uuid4().int % 1000000, "method": method}
    if params:
        msg["params"] = params
    browser.ws.send(json.dumps(msg))
    return json.loads(browser.ws.recv())


def browser_init_domains(browser):
    _send(browser, "Page.enable")
    _send(browser, "DOM.enable")
    _send(browser, "Runtime.enable")
    _send(browser, "Network.enable")


def browser_open_url(browser, url: str):
    _send(browser, "Page.navigate", {"url": url})
    # Poll until the page reports it has finished loading
    while True:
        resp = _send(
            browser, "Runtime.evaluate", {"expression": "document.readyState"}
        )
        state = resp.get("result", {}).get("result", {}).get("value", "")
        if state == "complete":
            break
        time.sleep(0.2)  # don't hammer the debugger in a tight loop


def dom_element_html(browser, selector: str, outer_html: bool = True):
    doc = _send(browser, "DOM.getDocument")
    root = doc["result"]["root"]["nodeId"]
    node = _send(browser, "DOM.querySelector", {"nodeId": root, "selector": selector})
    node_id = node["result"]["nodeId"]
    if outer_html:
        html = _send(browser, "DOM.getOuterHTML", {"nodeId": node_id})
    else:
        html = _send(browser, "DOM.getInnerHTML", {"nodeId": node_id})
    return html["result"]["outerHTML"] if outer_html else html["result"]["innerHTML"]
```
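One file remains: api.py and main.py import get_browser_instance and render_html from browser/browser.py, which appears in the folder tree but isn't shown above. Here is a minimal sketch reconstructed from the helpers in utils.py and actions.py; treat it as a starting point rather than the exact original (the two-second startup wait and the fixed render delay are assumptions):

```python
# browser/browser.py — reconstructed sketch; ties together the helpers above.
import time

from browser.actions import browser_init_domains, browser_open_url, dom_element_html
from browser.utils import attach_debugger, spawn_debug_browser
from models.browser import Browser


def get_browser_instance(debug_port: int) -> Browser:
    """Launch the browser with remote debugging enabled and attach to it."""
    browser = Browser()
    browser.process_id, browser.process = spawn_debug_browser(debug_port)
    time.sleep(2)  # give the browser a moment to start listening (assumption)
    attach_debugger(browser, debug_port)
    browser_init_domains(browser)
    return browser


def render_html(browser: Browser, url: str, timeout: int = 5) -> str:
    """Open a URL in the debugged browser and return the rendered page HTML."""
    browser_open_url(browser, url)
    time.sleep(timeout)  # let client-side scripts finish rendering (assumption)
    return dom_element_html(browser, "html")
```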
Make sure you're in the realcommercial-scraper folder, then run:
uv sync
This reads pyproject.toml and installs:
httpx — makes web requests
pydantic — handles data cleanly
psutil — manages the browser process
websocket-client — talks to the browser
You should see something like Resolved N packages in Xms (the count includes the four libraries above plus their own dependencies), followed by installation messages. If you see errors, double-check that you're in the right folder.
In your terminal (still in the realcommercial-scraper folder):
uv run python main.py
What happens:
uv creates a virtual environment (one-time setup)
A Brave/Chrome browser window opens (this is normal)
The program searches RealCommercial for leased properties
It fetches details for the first listing
Results print in your terminal
```
Available results: 2437
Returned listings: 100
==========
Shop & Retail Premises • 148m²
  Address: Thornbury, VIC 3071
  Area: 148m²
  Agent: John Smith
  URL: /leased/property-678-high-street-thornbury-vic-3071-505055376
----------
Details:
Prime retail space located on High Street, Thornbury...
  Demographics:
    • Total Population: 19,200
    • Median Age: 38
    • Average Household Income: $95,400/yr
```
"Command not found: uv"
→ Close and reopen your terminal. If it still doesn't work, restart your computer.
"python3.12: command not found"
→ On Windows, try python instead of python3.12. On Mac, make sure you installed from python.org (not the built-in one).
Browser doesn't open / "connection refused"
→ Check constants.py. Make sure DEBUG_BROWSER_PATH points to where your browser is actually installed. For Windows, use double backslashes: C:\\Program Files\\...
"No module named 'models'"
→ Make sure you created the models/__init__.py file (it can be empty). Same for browser/__init__.py.
Browser opens but nothing happens
→ Wait up to 30 seconds. The first run is slow because it creates a browser profile. Subsequent runs are faster.
To get more than the first 100 results, replace main.py's main() function with:
```python
def main() -> None:
    search_config = SearchRequest(
        channel="leased",
        filters=SearchFilters(
            within_radius="includesurrounding",
            surrounding_suburbs=True,
        ),
        page=1,
        page_size=100,
    )
    browser = get_browser_instance(9999)
    with RealCommercialClient() as client:
        page = 1
        while True:
            search_config.page = page
            response = client.search(search_config)
            listings = response.listings
            if not listings:
                break
            print(f"Page {page}: {len(listings)} listings")
            for listing in listings:
                print(f"  - {listing.title}")
            page += 1
```
This loops through all pages until there are no more results.
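Since the search reports available_results up front, you can also estimate how many pages the loop will make before running it:

```python
import math

available_results = 2437  # taken from the example output earlier in the guide
page_size = 100

# Ceiling division: a partial final page still counts as a page.
pages = math.ceil(available_results / page_size)
print(pages)  # 25
```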
Your folder should look exactly like this:
```
realcommercial-scraper/
├── pyproject.toml
├── constants.py
├── api.py
├── listing_parse.py
├── main.py
├── browser/
│   ├── __init__.py
│   ├── browser.py
│   ├── actions.py
│   └── utils.py
└── models/
    ├── __init__.py
    ├── browser.py
    ├── listing.py
    ├── request.py
    └── search.py
```
You're done! If every file is in place and you ran uv sync, the scraper should work. Run it anytime with:
cd realcommercial-scraper
uv run python main.py
Questions? Double-check file names and paths — 95% of issues are a typo or a missing file.