Ctrl K

Sub-City Level Rent Yield Analysis

This notebook studies Dubai Land Department transaction and rental contract data to estimate rental yield and cumulative property value growth across major residential projects and master communities.

The analysis is designed at the sub-city level, focusing on comparable flat segments across projects so that sale prices, annual rents, and implied gross yields can be compared on a more consistent basis.

Data sources

The primary source for this data is the Dubai Land Department service portal:

  • Dubai Land Department real estate transaction services
  • Dubai Pulse open data portal for DLD historical data

At the time this notebook was prepared, the original open data provider was unavailable, so this analysis uses an earlier locally saved version of the dataset. We are monitoring the original open data source and will update the links and replication steps when the source becomes available again.

Replication note

To reproduce this workflow at the moment, use the currently available public Kaggle mirror that includes both transaction and rent data:

  • Dubai real estates 1969-2023

This can be used as a temporary substitute while the original provider remains down.

Data availability note

Because the original source dataset license is not currently obtainable, the exact dataset version used in this notebook cannot be redistributed here. Please download a comparable version from the currently available public source above and adapt file paths as needed.

Datasets used

  • Transactions.csv - property sales records
  • Rent_Contracts.csv - Ejari rental contract records

Scope of analysis

  • Property type: Flat, existing properties only
  • Room types: 2 B/R and 3 B/R
  • Area range: 70 to 160 sqm
  • Period: January 2014 to July 2024
  • Projects with fewer than 10 recorded transactions are excluded
Item Value
Geography Major residential projects and master communities in Dubai
Main use case Sub-city level rent yield and price growth analysis
Main files Transactions.csv, Rent_Contracts.csv
Original provider status Currently unavailable at the time of writing
Redistribution Not provided here due to source licensing uncertainty
Replication path Use the currently available Kaggle mirror until the original source returns
In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import warnings
warnings.filterwarnings('ignore')

pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', '{:.3f}'.format)
%matplotlib inline

plt.rcParams.update({
    'figure.dpi': 300,
    'axes.spines.top': False,
    'axes.spines.right': False,
    'axes.grid': True,
    'grid.alpha': 0.3,
    'font.size': 11,
})

Sales Tansactions Data

We load the full transactions file, parse dates, and apply filters to focus on a comparable segment: existing flat sales in the 2 B/R and 3 B/R range between 70 and 160 sqm.

In [2]:
buy = pd.read_csv('data/Transactions.csv')

buy['year'] = buy['instance_date'].map(lambda x: x.split('-')[2]).astype(int)
buy = buy[buy['year'] >= 2000]

buy['date'] = pd.to_datetime(buy['instance_date'], format='%d-%m-%Y')
buy['count'] = 1
buy = buy.set_index('date').sort_index()

selected = [
    'trans_group_en', 'procedure_name_en', 'property_type_en',
    'property_sub_type_en', 'property_usage_en', 'reg_type_en',
    'area_name_en', 'building_name_en', 'project_name_en',
    'master_project_en', 'rooms_en', 'procedure_area',
    'actual_worth', 'meter_sale_price', 'count'
]
buy = buy[selected]
buy = buy.loc['2014-01-01':'2024-07-31']

buy = buy[buy['property_sub_type_en'] == 'Flat']
buy = buy[buy['reg_type_en'] == 'Existing Properties']
buy = buy[(buy['rooms_en'] == '2 B/R') | (buy['rooms_en'] == '3 B/R')]
buy = buy[(buy['procedure_area'] >= 70) & (buy['procedure_area'] <= 160)]

project_counts = buy['project_name_en'].value_counts()
top_projects = project_counts[project_counts > 10].index
buy = buy[buy['project_name_en'].isin(top_projects)]
buy = buy[buy['master_project_en'].notnull()]

buy_summary = pd.DataFrame([
    {
        'Metric': 'Transactions loaded',
        'Value': f'{len(buy):,}'
    },
    {
        'Metric': 'Date range',
        'Value': f'{buy.index.min().date()} to {buy.index.max().date()}'
    },
    {
        'Metric': 'Unique projects',
        'Value': f'{buy["project_name_en"].nunique():,}'
    },
    {
        'Metric': 'Master communities',
        'Value': f'{buy["master_project_en"].nunique():,}'
    }
])

top_projects_df = (
    buy['project_name_en']
    .value_counts()
    .head(15)
    .rename_axis('Project')
    .reset_index(name='Transactions')
)

top_communities_df = (
    buy['master_project_en']
    .value_counts()
    .head(15)
    .rename_axis('Master community')
    .reset_index(name='Transactions')
)

display(buy_summary.head(5))

print('Top projects by transaction count')
display(top_projects_df.head(5))

print('Top master communities by transaction count')
display(top_communities_df.head(5))
Metric Value
0 Transactions loaded 52,341
1 Date range 2014-01-02 to 2024-07-17
2 Unique projects 500
3 Master communities 52
Top projects by transaction count
Project Transactions
0 REMRAAM 2456
1 SKY COURTS 1540
2 TOWN SQUARE ZAHRA 846
3 THE POLO RESIDENCE 682
4 1 Residences 623
Top master communities by transaction count
Master community Transactions
0 Dubai Marina 6525
1 Burj Khalifa 3470
2 Dubai Sports City 3109
3 Jumeirah Village Circle 3056
4 Business Bay 2989

Area Distribution

After filtering to the 70 to 160 sqm range, the distribution below shows the most common unit sizes in the sales data. The peak around 115 to 125 sqm reflects the typical 2 B/R flat footprint in Dubai's established residential projects.

In [3]:
room_counts = buy['procedure_area']

plt.figure(figsize=(8, 4.5))
plt.hist(
    room_counts,
    bins=30,
    color='steelblue',
    edgecolor='white',
    linewidth=0.5
)
plt.xlabel('Procedure area (sqm)')
plt.ylabel('Number of transactions')
plt.title('Distribution of unit sizes for sales')
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

area_bins_df = pd.DataFrame({
    'Procedure area': room_counts.describe()
}).reset_index()
area_bins_df.columns = ['Metric', 'Value']

display(area_bins_df)
No description has been provided for this image
Metric Value
0 count 52341.000
1 mean 121.122
2 std 20.893
3 min 70.070
4 25% 105.030
5 50% 121.640
6 75% 138.490
7 max 160.000

Annual Transaction Volume

Transaction volume by year reveals market cycles. The COVID-related dip in 2020 and the strong recovery from 2021 onward are both clearly visible.

In [4]:
annual_volume = buy.resample('YE')['count'].sum()

plt.figure(figsize=(8, 4.5))
plt.bar(
    annual_volume.index.year,
    annual_volume.values,
    color='steelblue',
    width=0.6
)
plt.xlabel('Year')
plt.ylabel('Number of transactions')
plt.title('Annual sales volume for the filtered segment')
plt.gca().yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'{int(x):,}')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

annual_volume_df = pd.DataFrame({
    'Year': annual_volume.index.year,
    'Transactions': annual_volume.values
})

display(annual_volume_df.tail(5))
No description has been provided for this image
Year Transactions
6 2020 3237
7 2021 5609
8 2022 7302
9 2023 10377
10 2024 6203

Median Price per Sqm Over Time

The median meter sale price (AED/sqm) across all projects shows the broad market price trend over the analysis period.

In [5]:
annual_price = buy.resample('YE')['meter_sale_price'].median()

plt.figure(figsize=(8, 4.5))
plt.plot(
    annual_price.index,
    annual_price.values,
    marker='o',
    color='steelblue',
    linewidth=2
)
plt.xlabel('Year')
plt.ylabel('Median price (AED/sqm)')
plt.title('Median sale price per sqm for all filtered projects')
plt.gca().yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'{int(x):,}')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

annual_price_df = pd.DataFrame({
    'Year': annual_price.index.year,
    'Median price (AED/sqm)': annual_price.values
})

display(annual_price_df.tail(5))
No description has been provided for this image
Year Median price (AED/sqm)
6 2020 7534.720
7 2021 9095.160
8 2022 10710.385
9 2023 11370.770
10 2024 11115.520

Rental Data

We load the Ejari rental contracts, applying matching filters to ensure comparability with the sales segment: new contracts only, residential flats, 2 or 3 bedrooms, 70 to 160 sqm.

In [6]:
rent = pd.read_csv('data/Rent_Contracts.csv')

rent['date'] = pd.to_datetime(rent['contract_start_date'], format='%d-%m-%Y')
rent = rent.set_index('date').sort_index()
rent = rent.loc['2014-01-01':'2024-07-31']

selected_cols = [
    'contract_reg_type_en', 'contract_amount', 'annual_amount',
    'contract_end_date', 'line_number', 'is_free_hold',
    'ejari_bus_property_type_en', 'ejari_property_type_en',
    'ejari_property_sub_type_en', 'property_usage_en',
    'project_name_en', 'master_project_en', 'area_name_en',
    'tenant_type_en', 'actual_area'
]
rent = rent[selected_cols]
rent['count'] = 1

rent = rent[rent['contract_reg_type_en'] == 'New']
rent = rent[rent['ejari_bus_property_type_en'] == 'Unit']
rent = rent[rent['ejari_property_type_en'] == 'Flat']
rent = rent[
    (rent['ejari_property_sub_type_en'] == '2 bed rooms+hall') |
    (rent['ejari_property_sub_type_en'] == '3 bed rooms+hall')
]
rent = rent[(rent['actual_area'] >= 70) & (rent['actual_area'] <= 160)]
rent = rent[rent['property_usage_en'] == 'Residential']

project_counts_rent = rent['project_name_en'].value_counts()
top_projects_rent = project_counts_rent[project_counts_rent > 10].index
rent = rent[rent['project_name_en'].isin(top_projects_rent)]
rent = rent[rent['master_project_en'].notnull()]

rent['annual_rent_m2'] = rent['annual_amount'] / rent['actual_area']

rent_summary = pd.DataFrame([
    {
        'Metric': 'Rental contracts loaded',
        'Value': f'{len(rent):,}'
    },
    {
        'Metric': 'Date range',
        'Value': f'{rent.index.min().date()} to {rent.index.max().date()}'
    },
    {
        'Metric': 'Unique projects',
        'Value': f'{rent["project_name_en"].nunique():,}'
    },
    {
        'Metric': 'Master communities',
        'Value': f'{rent["master_project_en"].nunique():,}'
    }
])

top_projects_rent_df = (
    rent['project_name_en']
    .value_counts()
    .head(15)
    .rename_axis('Project')
    .reset_index(name='Contracts')
)

top_communities_rent_df = (
    rent['master_project_en']
    .value_counts()
    .head(15)
    .rename_axis('Master community')
    .reset_index(name='Contracts')
)

display(rent_summary.head(5))

print('Top projects by rent contract count')
display(top_projects_rent_df.head(5))

print('Top master communities by rent contract count')
display(top_communities_rent_df.head(5))
Metric Value
0 Rental contracts loaded 102,363
1 Date range 2014-01-01 to 2024-07-31
2 Unique projects 591
3 Master communities 52
Top projects by rent contract count
Project Contracts
0 REMRAAM 4617
1 SKY COURTS 3645
2 TOWN SQUARE ZAHRA 1595
3 TORCH TOWER 1518
4 SULAFA TOWER 1073
Top master communities by rent contract count
Master community Contracts
0 Dubai Marina 15178
1 Jumeirah Lakes Towers 7399
2 Burj Khalifa 7389
3 Business Bay 6642
4 Dubai Sports City 6355

Rental Area Distribution

In [7]:
rent_area = rent['actual_area']

plt.figure(figsize=(8, 4.5))
plt.hist(
    rent_area,
    bins=30,
    color='#e07b39',
    edgecolor='white',
    linewidth=0.5
)
plt.xlabel('Actual area (sqm)')
plt.ylabel('Number of contracts')
plt.title('Distribution of unit sizes for rental contracts')
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

rent_area_df = pd.DataFrame({
    'Metric': rent_area.describe().index,
    'Value': rent_area.describe().values
})

display(rent_area_df)
No description has been provided for this image
Metric Value
0 count 102363.000
1 mean 119.466
2 std 20.830
3 min 70.000
4 25% 103.000
5 50% 120.000
6 75% 136.000
7 max 160.000

Annual Rent per Sqm Over Time

The sharp post-2021 rise in median rent per sqm reflects both demand recovery and the broader price cycle observed in the sales data.

In [8]:
annual_rent = rent.resample('YE')['annual_rent_m2'].median()

plt.figure(figsize=(8, 4.5))
plt.plot(
    annual_rent.index,
    annual_rent.values,
    marker='o',
    color='#e07b39',
    linewidth=2
)
plt.xlabel('Year')
plt.ylabel('Median annual rent (AED/sqm)')
plt.title('Median rental rate per sqm for all filtered projects')
plt.gca().yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'{int(x):,}')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

annual_rent_df = pd.DataFrame({
    'Year': annual_rent.index.year,
    'Median annual rent (AED/sqm)': annual_rent.values
})

display(annual_rent_df)
No description has been provided for this image
Year Median annual rent (AED/sqm)
0 2014 860.215
1 2015 902.778
2 2016 868.094
3 2017 796.460
4 2018 715.849
5 2019 643.521
6 2020 543.478
7 2021 557.450
8 2022 685.599
9 2023 886.076
10 2024 1037.398

Rental Yield by Project

Rental yield is computed as the ratio of annual rent per sqm to sale price per sqm, aggregated at the yearly level per project. Only projects appearing in both the sales and rental datasets are included. Yield observations below 2% or above 20% are removed as they typically reflect data sparsity rather than true market conditions.

In [9]:
grouped_buy = buy.groupby('project_name_en').resample('YE')['meter_sale_price'].median().reset_index()
pivot_buy = grouped_buy.pivot_table(index='date', columns='project_name_en', values='meter_sale_price')

grouped_rent = rent.groupby('project_name_en').resample('YE')['annual_rent_m2'].median().reset_index()
pivot_rent = grouped_rent.pivot_table(index='date', columns='project_name_en', values='annual_rent_m2')

common_projects = pd.Series(pivot_buy.columns)[pd.Series(pivot_buy.columns).isin(pivot_rent.columns)]
pivot_buy  = pivot_buy[common_projects]
pivot_rent = pivot_rent[common_projects]

roi_projects = pivot_rent / pivot_buy
roi_projects = roi_projects.map(lambda v: np.nan if v is not None and (v > 0.20 or v < 0.02) else v)
roi_projects = roi_projects.ffill()

print(f'Projects with yield data: {roi_projects.shape[1]}')
print(f'Years covered:            {roi_projects.shape[0]}')
Projects with yield data: 473
Years covered:            11

Yield Uncertainty via Error Propagation

Since both rent and sale price carry variability, we propagate their uncertainty into a yield standard error estimate using the standard ratio error formula: sigma_roi / roi = sqrt( (sigma_rent/rent)^2 + (sigma_price/price)^2 ).

In [10]:
rent_mean = pivot_rent.mean()
buy_mean  = pivot_buy.mean()
rent_std  = pivot_rent.std()
buy_std   = pivot_buy.std()

roi_mean = rent_mean / buy_mean
roi_std_propagated = roi_mean * np.sqrt((rent_std / rent_mean)**2 + (buy_std / buy_mean)**2)
roi_std_df = roi_std_propagated.rename('roi_std_err_propagation').reset_index()
roi_std_df.columns = ['project_name_en', 'roi_std_err_propagation']

Project Yield Summary (2020 to 2024)

Summary statistics per project over the 2020 to 2024 window, sorted by median yield. The 2023 and 2024 columns show the most recent annual yield observations.

In [11]:
roi_summary = roi_projects.loc['2020-12-31':].describe().T
roi_summary['2024'] = roi_projects.iloc[-1]
roi_summary['2023'] = roi_projects.iloc[-2]
roi_summary = roi_summary.sort_values('50%', ascending=False)
roi_summary = roi_summary[roi_summary['count'] > 2]

display_cols = ['count', 'mean', 'std', '25%', '50%', '75%', '2023', '2024']
display(
    round(roi_summary[display_cols].iloc[:20] * 100, 2).rename(columns={
        'count': 'N years', 'mean': 'Mean %', 'std': 'Std %',
        '25%': 'P25 %', '50%': 'Median %', '75%': 'P75 %',
        '2023': '2023 %', '2024': '2024 %'
    }).head(10)
)
N years Mean % Std % P25 % Median % P75 % 2023 % 2024 %
project_name_en
UNIESTATE SPORTS TOWER 500.000 12.200 1.320 10.760 13.170 13.170 10.760 10.760
ZENITH TOWER A2 500.000 12.000 2.380 11.480 11.960 12.810 11.960 11.480
HANOVER SQUARE 500.000 10.400 2.020 9.630 11.280 11.690 12.190 11.280
HDS SUNSTAR 1 500.000 10.240 1.440 9.410 11.240 11.240 8.090 9.410
DIAMOND VIEWS 2 500.000 13.160 3.170 10.990 10.990 16.620 10.990 10.990
PALACE TOWER 500.000 10.760 0.880 10.850 10.950 11.200 10.850 11.200
LINCOLN PARK 500.000 10.200 1.080 9.600 10.870 10.870 11.080 9.600
EATON PLACE 500.000 10.800 3.470 8.650 10.620 10.810 8.650 10.810
Rigel 400.000 10.480 1.090 9.900 10.600 11.190 10.160 11.050
QPOINT LIWAN-PLOT R092 500.000 10.190 1.340 10.190 10.590 10.600 10.590 11.570

Top Projects by Median Yield

Error bars represent the adjusted yield uncertainty: propagated standard error is used when it is small and the observed standard deviation is used as a fallback when propagated error is large, which typically occurs in projects with sparse data.

In [12]:
master_map = buy[['master_project_en', 'project_name_en']].drop_duplicates().set_index('project_name_en')
roi_ext = roi_summary.merge(master_map, left_index=True, right_index=True, how='left')
roi_ext = roi_ext.merge(roi_std_df.set_index('project_name_en'), left_index=True, right_index=True, how='left')

roi_ext['adj_roi_std'] = np.where(
    roi_ext['roi_std_err_propagation'] > 0.10,
    roi_ext['std'],
    roi_ext['roi_std_err_propagation']
)

top10 = roi_ext.iloc[:10].copy()

plt.figure(figsize=(8, 4.5))
plt.bar(
    range(len(top10)),
    top10['50%'] * 100,
    yerr=top10['adj_roi_std'] * 100,
    capsize=4,
    color='steelblue',
    alpha=0.85,
    error_kw={'elinewidth': 1.2, 'alpha': 0.6}
)
plt.xticks(range(len(top10)), top10.index, rotation=45, ha='right')
plt.ylabel('Median rental yield (%)')
plt.title('Top 10 projects by median rental yield (2020 to 2024)')
plt.gca().yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'{x:.1f}%')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

top10_df = top10.reset_index().rename(columns={'index': 'project_name_en'})
top10_df = top10_df[[
    'project_name_en',
    'master_project_en',
    '50%',
    'adj_roi_std'
]].copy()

top10_df.columns = [
    'Project',
    'Master community',
    'Median rental yield',
    'Yield std'
]

top10_df['Median rental yield'] = (top10_df['Median rental yield'] * 100).round(2)
top10_df['Yield std'] = (top10_df['Yield std'] * 100).round(2)

display(top10_df.head(5))
No description has been provided for this image
Project Master community Median rental yield Yield std
0 UNIESTATE SPORTS TOWER Dubai Sports City 13.170 2.860
1 ZENITH TOWER A2 Dubai Sports City 11.960 3.070
2 HANOVER SQUARE Jumeirah Village Circle 11.280 3.360
3 HDS SUNSTAR 1 International City Phase 1 11.240 2.330
4 DIAMOND VIEWS 2 Jumeirah Village Circle 10.990 4.010

Rental Yield by Master Community

Aggregating yields to the master project level gives a regional view for comparing broad investment zones.

In [13]:
roi_regions = roi_ext.groupby('master_project_en')[['mean', 'std', '2024', '2023', 'adj_roi_std']].mean()
roi_regions = roi_regions.sort_values('mean', ascending=False)

display(
    round(roi_regions * 100, 2).rename(columns={
        'mean': 'Avg Yield %', 'std': 'Std %',
        '2024': '2024 %', '2023': '2023 %', 'adj_roi_std': 'Adj Std %'
    }).head(10)
)
Avg Yield % Std % 2024 % 2023 % Adj Std %
master_project_en
Badra 10.350 0.510 10.520 10.090 3.890
Majan 9.960 1.950 8.070 7.770 2.570
Dubai Studio City 9.430 1.240 9.640 9.360 2.490
Jumeirah Village Triangle 9.150 1.480 9.640 9.510 2.880
Jumeirah Islands 9.140 0.720 9.000 10.050 3.140
Dubai Sports City 9.050 1.280 9.850 9.520 2.990
Jumeirah Village Circle 8.880 1.130 9.340 9.120 2.400
Silicon Oasis 8.800 1.250 9.070 9.590 2.500
Liwan 8.780 1.230 10.150 9.230 2.650
Dubai Health Care City Phase 2 8.720 0.870 9.520 8.810 2.240
In [14]:
top10_regions = roi_regions.head(10).copy()

plt.figure(figsize=(8, 4.5))
plt.bar(
    range(len(top10_regions)),
    top10_regions['mean'] * 100,
    yerr=top10_regions['adj_roi_std'] * 100,
    capsize=4,
    color='#5b8db8',
    alpha=0.85,
    error_kw={'elinewidth': 1.2, 'alpha': 0.6}
)
plt.xticks(range(len(top10_regions)), top10_regions.index, rotation=45, ha='right')
plt.ylabel('Average rental yield (%)')
plt.title('Top 10 master communities by rental yield')
plt.gca().yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'{x:.1f}%')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

top10_regions_df = top10_regions.reset_index()
top10_regions_df.columns = ['Master community', 'Mean yield', 'Median yield', 'Std', 'Yield std', 'Adjusted yield std']

for col in ['Mean yield', 'Median yield', 'Std', 'Yield std', 'Adjusted yield std']:
    top10_regions_df[col] = (top10_regions_df[col] * 100).round(2)

display(top10_regions_df.head(5))
No description has been provided for this image
Master community Mean yield Median yield Std Yield std Adjusted yield std
0 Badra 10.350 0.510 10.520 10.090 3.890
1 Majan 9.960 1.950 8.070 7.770 2.570
2 Dubai Studio City 9.430 1.240 9.640 9.360 2.490
3 Jumeirah Village Triangle 9.150 1.480 9.640 9.510 2.880
4 Jumeirah Islands 9.140 0.720 9.000 10.050 3.140

Historical Property Value Growth by Region

Using the sales data, we estimate annual percentage change in median price per sqm for each master community. The chart shows the average recent return over the last 4 years with historical standard deviation as error bars. Year-on-year changes above 40% are treated as outliers and removed, as they usually reflect thin transaction coverage in a given year.

In [15]:
grouped_region = buy.groupby('master_project_en').resample('YE')['meter_sale_price'].median().reset_index()
pivot_region = grouped_region.pivot_table(index='date', columns='master_project_en', values='meter_sale_price')

pivot_region = pivot_region.dropna(thresh=5, axis=1).ffill()
pct_change = pivot_region.pct_change()
pct_change = pct_change.where(pct_change <= 0.40, np.nan)

regional_means = pct_change.iloc[-4:].mean()
regional_stds = pct_change.std()

sorted_idx = regional_means.sort_values(ascending=False).index
sorted_means = regional_means[sorted_idx]
sorted_stds = regional_stds[sorted_idx]

top10_idx = sorted_idx[:10]
top10_means = sorted_means.loc[top10_idx]
top10_stds = sorted_stds.loc[top10_idx]

plt.figure(figsize=(8, 4.5))
plt.bar(
    range(len(top10_means)),
    top10_means * 100,
    yerr=top10_stds * 100,
    capsize=4,
    color='steelblue',
    alpha=0.75,
    error_kw={'elinewidth': 1.2, 'alpha': 0.5}
)
plt.xticks(range(len(top10_means)), top10_idx, rotation=45, ha='right')
plt.ylabel('Mean annual price change (%)')
plt.title('Top 10 regions by recent 4 year average price change')
plt.gca().yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'{x:.1f}%')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

regional_growth_df = pd.DataFrame({
    'Master community': top10_idx,
    'Mean annual price change (%)': (top10_means.values * 100).round(2),
    'Std (%)': (top10_stds.values * 100).round(2)
})

display(regional_growth_df.head(5))
No description has been provided for this image
Master community Mean annual price change (%) Std (%)
0 DUBAI HILLS - PARK 24.460 11.050
1 Burj Khalifa 21.600 17.360
2 DUBAI HILLS 17.680 6.250
3 International City Phase 3 16.550 23.570
4 The Greens 15.150 14.010

10-Year Cumulative Return Projection

Using the recent 4-year average regional price appreciation as a starting point, we project cumulative returns over a 10-year horizon. The model assumes a gradual mean-reversion toward a long-run equilibrium return of approximately 3.1% per year, composed of 1.5% real estate real return and 1.6% inflation, reached by year 5.

Two reference lines are shown for context:

  • Inflation baseline - cumulative CPI-based erosion of purchasing power
  • RE market estimate - baseline nominal return (inflation + long-run real return)
In [16]:
long_run = 0.015 + 0.016

yearly_return = pd.DataFrame(
    np.zeros((len(regional_means), 11)),
    index=regional_means.index,
    columns=[f'year_{i}' for i in range(11)]
)

for idx in range(len(yearly_return)):
    start = regional_means.iloc[idx]
    path  = np.concatenate([np.linspace(start, long_run, 6), np.full(4, long_run)])
    yearly_return.iloc[idx, 1:] = path

yearly_return = yearly_return.T
cumulative = (yearly_return + 1).cumprod()
cumulative = cumulative.reset_index().rename(columns={'index': 'year'})
cumulative.head().T.head(10)
Out[16]:
0 1 2 3 4
master_project_en
year year_0 year_1 year_2 year_3 year_4
Al Furjan 1.000 1.068 1.134 1.194 1.249
Arjan 1.000 1.022 1.047 1.074 1.104
Badra 1.000 1.098 1.192 1.277 1.351
Burj Khalifa 1.000 1.216 1.434 1.637 1.809
Business Bay 1.000 1.119 1.232 1.335 1.423
City Walk 1.000 1.106 1.206 1.298 1.377
Culture Village 1.000 1.090 1.176 1.254 1.322
DAMAC HILLS 1.000 1.056 1.111 1.162 1.210
DUBAI HILLS 1.000 1.177 1.351 1.511 1.646
In [17]:
inf_start = 0.0381
inf_path = np.concatenate([
    [0],
    np.linspace(inf_start, long_run - 0.016, 6),
    np.full(4, long_run - 0.016)
])
re_path = inf_path + 0.016

years = cumulative['year'].values
inf_cum = (inf_path + 1).cumprod()
re_cum = (re_path + 1).cumprod()

region_cols = [c for c in cumulative.columns if c != 'year']

plt.figure(figsize=(8, 4.5))

for col in region_cols:
    plt.plot(
        years,
        cumulative[col].values,
        color='steelblue',
        alpha=0.25,
        linewidth=1.2
    )

plt.plot(
    years,
    inf_cum,
    color='black',
    linestyle='--',
    linewidth=2,
    label='Inflation baseline'
)
plt.plot(
    years,
    re_cum,
    color='#444444',
    linestyle=':',
    linewidth=2,
    label='RE market estimate'
)

plt.xlabel('Year')
plt.ylabel('Cumulative return (1.0 = no change)')
plt.title('10 year cumulative property value projection by region')
plt.legend(fontsize=10)
plt.gca().yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'{x:.2f}x')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

final_projection_df = pd.DataFrame({
    'Region': region_cols,
    'Final cumulative return': [cumulative[col].values[-1] for col in region_cols]
}).sort_values('Final cumulative return', ascending=False)

final_projection_df['Final cumulative return'] = final_projection_df['Final cumulative return'].round(3)

benchmark_df = pd.DataFrame([
    {
        'Series': 'Inflation baseline',
        'Final cumulative return': round(inf_cum[-1], 3)
    },
    {
        'Series': 'RE market estimate',
        'Final cumulative return': round(re_cum[-1], 3)
    }
])

print('Final year projection by region')
display(final_projection_df.head(5))

print('Benchmark paths')
display(benchmark_df)
No description has been provided for this image
Final year projection by region
Region Final cumulative return
9 DUBAI HILLS - PARK 2.421
3 Burj Khalifa 2.251
8 DUBAI HILLS 2.032
19 International City Phase 3 1.972
41 The Greens 1.900
Benchmark paths
Series Final cumulative return
0 Inflation baseline 1.242
1 RE market estimate 1.474

End-of-Term Distribution

The histogram below shows the spread of projected 10-year cumulative returns across all regions under the mean-reversion model.

In [18]:
end_values = cumulative.iloc[-1][region_cols]

plt.figure(figsize=(8, 4.5))
plt.hist(
    end_values,
    bins=15,
    color='steelblue',
    edgecolor='white',
    linewidth=0.5
)
plt.axvline(
    end_values.median(),
    color='black',
    linestyle='--',
    linewidth=1.5,
    label=f'Median: {end_values.median():.2f}x'
)
plt.xlabel('Cumulative return at year 10')
plt.ylabel('Number of regions')
plt.title('Distribution of projected 10 year returns across regions')
plt.legend()
plt.tick_params(length=0)
plt.tight_layout()
plt.show()

end_values_summary = pd.DataFrame([
    {
        'Metric': 'Mean',
        'Value': f'{end_values.mean():.3f}x'
    },
    {
        'Metric': 'Median',
        'Value': f'{end_values.median():.3f}x'
    },
    {
        'Metric': 'Std',
        'Value': f'{end_values.std():.3f}'
    },
    {
        'Metric': 'Min',
        'Value': f'{end_values.min():.3f}x'
    },
    {
        'Metric': 'Max',
        'Value': f'{end_values.max():.3f}x'
    },
    {
        'Metric': 'P25',
        'Value': f'{end_values.quantile(0.25):.3f}x'
    },
    {
        'Metric': 'P75',
        'Value': f'{end_values.quantile(0.75):.3f}x'
    }
])

display(end_values_summary)
No description has been provided for this image
Metric Value
0 Mean 1.571x
1 Median 1.569x
2 Std 0.296
3 Min 1.008x
4 Max 2.421x
5 P25 1.356x
6 P75 1.729x

Illustrative Mortgage Cost Model

A simplified cost model for a representative 1.8M AED flat purchase, illustrating total cost of ownership over a 10-year holding period. This is for reference only and does not account for individual tax situations, service charges above the estimate, or rental income offsets.

Assumptions:

  • Property price: AED 1,800,000
  • Down payment: 20%
  • Loan fees and DLD transfer: approximately 7.7% of property value
  • Loan term: 5 years at 4.5% annual interest
  • Annual running cost (service charge, maintenance): AED 30,000
  • Holding period: 10 years
In [19]:
property_price = 1_800_000
fees_rate = 0.077
down_payment_rate = 0.20
loan_years = 5
annual_interest_rate = 0.045
yearly_running_cost = 30_000
holding_years = 10

down_payment = property_price * down_payment_rate
fees = property_price * fees_rate
total_initial_payment = down_payment + fees

loan_amount = property_price * (1 - down_payment_rate)
loan_months = loan_years * 12
monthly_rate = annual_interest_rate / 12
monthly_payment = (loan_amount * monthly_rate) / (1 - (1 + monthly_rate) ** -loan_months)
total_repaid = monthly_payment * loan_months
total_interest = total_repaid - loan_amount
total_running = yearly_running_cost * holding_years
total_cost = total_initial_payment + total_interest + total_running

ownership_cost_df = pd.DataFrame([
    {
        'Metric': 'Down payment + fees',
        'Value': f'AED {total_initial_payment:,.0f}'
    },
    {
        'Metric': 'Loan amount',
        'Value': f'AED {loan_amount:,.0f}'
    },
    {
        'Metric': 'Monthly repayment',
        'Value': f'AED {monthly_payment:,.0f}'
    },
    {
        'Metric': 'Total repaid (5 year loan)',
        'Value': f'AED {total_repaid:,.0f}'
    },
    {
        'Metric': 'Total interest paid',
        'Value': f'AED {total_interest:,.0f}'
    },
    {
        'Metric': 'Total running costs',
        'Value': f'AED {total_running:,.0f}'
    },
    {
        'Metric': 'Total cost of ownership',
        'Value': f'AED {total_cost:,.0f}'
    }
])

display(ownership_cost_df)
Metric Value
0 Down payment + fees AED 498,600
1 Loan amount AED 1,440,000
2 Monthly repayment AED 26,846
3 Total repaid (5 year loan) AED 1,610,757
4 Total interest paid AED 170,757
5 Total running costs AED 300,000
6 Total cost of ownership AED 969,357