Sub-City Level Rent Yield Analysis
This notebook studies Dubai Land Department transaction and rental contract data to estimate rental yield and cumulative property value growth across major residential projects and master communities.
The analysis is designed at the sub-city level, focusing on comparable flat segments across projects so that sale prices, annual rents, and implied gross yields can be compared on a more consistent basis.
Data sources
The primary source for this data is the Dubai Land Department service portal:
- Dubai Land Department real estate transaction services
- Dubai Pulse open data portal for DLD historical data
At the time this notebook was prepared, the original open data provider was unavailable, so this analysis uses an earlier locally saved version of the dataset. We are monitoring the original open data source and will update the links and replication steps when the source becomes available again.
Replication note
To reproduce this workflow at the moment, use the currently available public Kaggle mirror that includes both transaction and rent data:
This can be used as a temporary substitute while the original provider remains down.
Data availability note
Because the original source dataset license is not currently obtainable, the exact dataset version used in this notebook cannot be redistributed here. Please download a comparable version from the currently available public source above and adapt file paths as needed.
Datasets used
Transactions.csv- property sales recordsRent_Contracts.csv- Ejari rental contract records
Scope of analysis
- Property type: Flat, existing properties only
- Room types: 2 B/R and 3 B/R
- Area range: 70 to 160 sqm
- Period: January 2014 to July 2024
- Projects with fewer than 10 recorded transactions are excluded
| Item | Value |
|---|---|
| Geography | Major residential projects and master communities in Dubai |
| Main use case | Sub-city level rent yield and price growth analysis |
| Main files | Transactions.csv, Rent_Contracts.csv |
| Original provider status | Currently unavailable at the time of writing |
| Redistribution | Not provided here due to source licensing uncertainty |
| Replication path | Use the currently available Kaggle mirror until the original source returns |
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import warnings
warnings.filterwarnings('ignore')
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', '{:.3f}'.format)
%matplotlib inline
plt.rcParams.update({
'figure.dpi': 300,
'axes.spines.top': False,
'axes.spines.right': False,
'axes.grid': True,
'grid.alpha': 0.3,
'font.size': 11,
})
Sales Tansactions Data
We load the full transactions file, parse dates, and apply filters to focus on a comparable segment: existing flat sales in the 2 B/R and 3 B/R range between 70 and 160 sqm.
buy = pd.read_csv('data/Transactions.csv')
buy['year'] = buy['instance_date'].map(lambda x: x.split('-')[2]).astype(int)
buy = buy[buy['year'] >= 2000]
buy['date'] = pd.to_datetime(buy['instance_date'], format='%d-%m-%Y')
buy['count'] = 1
buy = buy.set_index('date').sort_index()
selected = [
'trans_group_en', 'procedure_name_en', 'property_type_en',
'property_sub_type_en', 'property_usage_en', 'reg_type_en',
'area_name_en', 'building_name_en', 'project_name_en',
'master_project_en', 'rooms_en', 'procedure_area',
'actual_worth', 'meter_sale_price', 'count'
]
buy = buy[selected]
buy = buy.loc['2014-01-01':'2024-07-31']
buy = buy[buy['property_sub_type_en'] == 'Flat']
buy = buy[buy['reg_type_en'] == 'Existing Properties']
buy = buy[(buy['rooms_en'] == '2 B/R') | (buy['rooms_en'] == '3 B/R')]
buy = buy[(buy['procedure_area'] >= 70) & (buy['procedure_area'] <= 160)]
project_counts = buy['project_name_en'].value_counts()
top_projects = project_counts[project_counts > 10].index
buy = buy[buy['project_name_en'].isin(top_projects)]
buy = buy[buy['master_project_en'].notnull()]
buy_summary = pd.DataFrame([
{
'Metric': 'Transactions loaded',
'Value': f'{len(buy):,}'
},
{
'Metric': 'Date range',
'Value': f'{buy.index.min().date()} to {buy.index.max().date()}'
},
{
'Metric': 'Unique projects',
'Value': f'{buy["project_name_en"].nunique():,}'
},
{
'Metric': 'Master communities',
'Value': f'{buy["master_project_en"].nunique():,}'
}
])
top_projects_df = (
buy['project_name_en']
.value_counts()
.head(15)
.rename_axis('Project')
.reset_index(name='Transactions')
)
top_communities_df = (
buy['master_project_en']
.value_counts()
.head(15)
.rename_axis('Master community')
.reset_index(name='Transactions')
)
display(buy_summary.head(5))
print('Top projects by transaction count')
display(top_projects_df.head(5))
print('Top master communities by transaction count')
display(top_communities_df.head(5))
| Metric | Value | |
|---|---|---|
| 0 | Transactions loaded | 52,341 |
| 1 | Date range | 2014-01-02 to 2024-07-17 |
| 2 | Unique projects | 500 |
| 3 | Master communities | 52 |
Top projects by transaction count
| Project | Transactions | |
|---|---|---|
| 0 | REMRAAM | 2456 |
| 1 | SKY COURTS | 1540 |
| 2 | TOWN SQUARE ZAHRA | 846 |
| 3 | THE POLO RESIDENCE | 682 |
| 4 | 1 Residences | 623 |
Top master communities by transaction count
| Master community | Transactions | |
|---|---|---|
| 0 | Dubai Marina | 6525 |
| 1 | Burj Khalifa | 3470 |
| 2 | Dubai Sports City | 3109 |
| 3 | Jumeirah Village Circle | 3056 |
| 4 | Business Bay | 2989 |
Area Distribution
After filtering to the 70 to 160 sqm range, the distribution below shows the most common unit sizes in the sales data. The peak around 115 to 125 sqm reflects the typical 2 B/R flat footprint in Dubai's established residential projects.
room_counts = buy['procedure_area']
plt.figure(figsize=(8, 4.5))
plt.hist(
room_counts,
bins=30,
color='steelblue',
edgecolor='white',
linewidth=0.5
)
plt.xlabel('Procedure area (sqm)')
plt.ylabel('Number of transactions')
plt.title('Distribution of unit sizes for sales')
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
area_bins_df = pd.DataFrame({
'Procedure area': room_counts.describe()
}).reset_index()
area_bins_df.columns = ['Metric', 'Value']
display(area_bins_df)
| Metric | Value | |
|---|---|---|
| 0 | count | 52341.000 |
| 1 | mean | 121.122 |
| 2 | std | 20.893 |
| 3 | min | 70.070 |
| 4 | 25% | 105.030 |
| 5 | 50% | 121.640 |
| 6 | 75% | 138.490 |
| 7 | max | 160.000 |
Annual Transaction Volume
Transaction volume by year reveals market cycles. The COVID-related dip in 2020 and the strong recovery from 2021 onward are both clearly visible.
annual_volume = buy.resample('YE')['count'].sum()
plt.figure(figsize=(8, 4.5))
plt.bar(
annual_volume.index.year,
annual_volume.values,
color='steelblue',
width=0.6
)
plt.xlabel('Year')
plt.ylabel('Number of transactions')
plt.title('Annual sales volume for the filtered segment')
plt.gca().yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'{int(x):,}')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
annual_volume_df = pd.DataFrame({
'Year': annual_volume.index.year,
'Transactions': annual_volume.values
})
display(annual_volume_df.tail(5))
| Year | Transactions | |
|---|---|---|
| 6 | 2020 | 3237 |
| 7 | 2021 | 5609 |
| 8 | 2022 | 7302 |
| 9 | 2023 | 10377 |
| 10 | 2024 | 6203 |
Median Price per Sqm Over Time
The median meter sale price (AED/sqm) across all projects shows the broad market price trend over the analysis period.
annual_price = buy.resample('YE')['meter_sale_price'].median()
plt.figure(figsize=(8, 4.5))
plt.plot(
annual_price.index,
annual_price.values,
marker='o',
color='steelblue',
linewidth=2
)
plt.xlabel('Year')
plt.ylabel('Median price (AED/sqm)')
plt.title('Median sale price per sqm for all filtered projects')
plt.gca().yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'{int(x):,}')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
annual_price_df = pd.DataFrame({
'Year': annual_price.index.year,
'Median price (AED/sqm)': annual_price.values
})
display(annual_price_df.tail(5))
| Year | Median price (AED/sqm) | |
|---|---|---|
| 6 | 2020 | 7534.720 |
| 7 | 2021 | 9095.160 |
| 8 | 2022 | 10710.385 |
| 9 | 2023 | 11370.770 |
| 10 | 2024 | 11115.520 |
Rental Data
We load the Ejari rental contracts, applying matching filters to ensure comparability with the sales segment: new contracts only, residential flats, 2 or 3 bedrooms, 70 to 160 sqm.
rent = pd.read_csv('data/Rent_Contracts.csv')
rent['date'] = pd.to_datetime(rent['contract_start_date'], format='%d-%m-%Y')
rent = rent.set_index('date').sort_index()
rent = rent.loc['2014-01-01':'2024-07-31']
selected_cols = [
'contract_reg_type_en', 'contract_amount', 'annual_amount',
'contract_end_date', 'line_number', 'is_free_hold',
'ejari_bus_property_type_en', 'ejari_property_type_en',
'ejari_property_sub_type_en', 'property_usage_en',
'project_name_en', 'master_project_en', 'area_name_en',
'tenant_type_en', 'actual_area'
]
rent = rent[selected_cols]
rent['count'] = 1
rent = rent[rent['contract_reg_type_en'] == 'New']
rent = rent[rent['ejari_bus_property_type_en'] == 'Unit']
rent = rent[rent['ejari_property_type_en'] == 'Flat']
rent = rent[
(rent['ejari_property_sub_type_en'] == '2 bed rooms+hall') |
(rent['ejari_property_sub_type_en'] == '3 bed rooms+hall')
]
rent = rent[(rent['actual_area'] >= 70) & (rent['actual_area'] <= 160)]
rent = rent[rent['property_usage_en'] == 'Residential']
project_counts_rent = rent['project_name_en'].value_counts()
top_projects_rent = project_counts_rent[project_counts_rent > 10].index
rent = rent[rent['project_name_en'].isin(top_projects_rent)]
rent = rent[rent['master_project_en'].notnull()]
rent['annual_rent_m2'] = rent['annual_amount'] / rent['actual_area']
rent_summary = pd.DataFrame([
{
'Metric': 'Rental contracts loaded',
'Value': f'{len(rent):,}'
},
{
'Metric': 'Date range',
'Value': f'{rent.index.min().date()} to {rent.index.max().date()}'
},
{
'Metric': 'Unique projects',
'Value': f'{rent["project_name_en"].nunique():,}'
},
{
'Metric': 'Master communities',
'Value': f'{rent["master_project_en"].nunique():,}'
}
])
top_projects_rent_df = (
rent['project_name_en']
.value_counts()
.head(15)
.rename_axis('Project')
.reset_index(name='Contracts')
)
top_communities_rent_df = (
rent['master_project_en']
.value_counts()
.head(15)
.rename_axis('Master community')
.reset_index(name='Contracts')
)
display(rent_summary.head(5))
print('Top projects by rent contract count')
display(top_projects_rent_df.head(5))
print('Top master communities by rent contract count')
display(top_communities_rent_df.head(5))
| Metric | Value | |
|---|---|---|
| 0 | Rental contracts loaded | 102,363 |
| 1 | Date range | 2014-01-01 to 2024-07-31 |
| 2 | Unique projects | 591 |
| 3 | Master communities | 52 |
Top projects by rent contract count
| Project | Contracts | |
|---|---|---|
| 0 | REMRAAM | 4617 |
| 1 | SKY COURTS | 3645 |
| 2 | TOWN SQUARE ZAHRA | 1595 |
| 3 | TORCH TOWER | 1518 |
| 4 | SULAFA TOWER | 1073 |
Top master communities by rent contract count
| Master community | Contracts | |
|---|---|---|
| 0 | Dubai Marina | 15178 |
| 1 | Jumeirah Lakes Towers | 7399 |
| 2 | Burj Khalifa | 7389 |
| 3 | Business Bay | 6642 |
| 4 | Dubai Sports City | 6355 |
Rental Area Distribution
rent_area = rent['actual_area']
plt.figure(figsize=(8, 4.5))
plt.hist(
rent_area,
bins=30,
color='#e07b39',
edgecolor='white',
linewidth=0.5
)
plt.xlabel('Actual area (sqm)')
plt.ylabel('Number of contracts')
plt.title('Distribution of unit sizes for rental contracts')
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
rent_area_df = pd.DataFrame({
'Metric': rent_area.describe().index,
'Value': rent_area.describe().values
})
display(rent_area_df)
| Metric | Value | |
|---|---|---|
| 0 | count | 102363.000 |
| 1 | mean | 119.466 |
| 2 | std | 20.830 |
| 3 | min | 70.000 |
| 4 | 25% | 103.000 |
| 5 | 50% | 120.000 |
| 6 | 75% | 136.000 |
| 7 | max | 160.000 |
Annual Rent per Sqm Over Time
The sharp post-2021 rise in median rent per sqm reflects both demand recovery and the broader price cycle observed in the sales data.
annual_rent = rent.resample('YE')['annual_rent_m2'].median()
plt.figure(figsize=(8, 4.5))
plt.plot(
annual_rent.index,
annual_rent.values,
marker='o',
color='#e07b39',
linewidth=2
)
plt.xlabel('Year')
plt.ylabel('Median annual rent (AED/sqm)')
plt.title('Median rental rate per sqm for all filtered projects')
plt.gca().yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'{int(x):,}')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
annual_rent_df = pd.DataFrame({
'Year': annual_rent.index.year,
'Median annual rent (AED/sqm)': annual_rent.values
})
display(annual_rent_df)
| Year | Median annual rent (AED/sqm) | |
|---|---|---|
| 0 | 2014 | 860.215 |
| 1 | 2015 | 902.778 |
| 2 | 2016 | 868.094 |
| 3 | 2017 | 796.460 |
| 4 | 2018 | 715.849 |
| 5 | 2019 | 643.521 |
| 6 | 2020 | 543.478 |
| 7 | 2021 | 557.450 |
| 8 | 2022 | 685.599 |
| 9 | 2023 | 886.076 |
| 10 | 2024 | 1037.398 |
Rental Yield by Project
Rental yield is computed as the ratio of annual rent per sqm to sale price per sqm, aggregated at the yearly level per project. Only projects appearing in both the sales and rental datasets are included. Yield observations below 2% or above 20% are removed as they typically reflect data sparsity rather than true market conditions.
grouped_buy = buy.groupby('project_name_en').resample('YE')['meter_sale_price'].median().reset_index()
pivot_buy = grouped_buy.pivot_table(index='date', columns='project_name_en', values='meter_sale_price')
grouped_rent = rent.groupby('project_name_en').resample('YE')['annual_rent_m2'].median().reset_index()
pivot_rent = grouped_rent.pivot_table(index='date', columns='project_name_en', values='annual_rent_m2')
common_projects = pd.Series(pivot_buy.columns)[pd.Series(pivot_buy.columns).isin(pivot_rent.columns)]
pivot_buy = pivot_buy[common_projects]
pivot_rent = pivot_rent[common_projects]
roi_projects = pivot_rent / pivot_buy
roi_projects = roi_projects.map(lambda v: np.nan if v is not None and (v > 0.20 or v < 0.02) else v)
roi_projects = roi_projects.ffill()
print(f'Projects with yield data: {roi_projects.shape[1]}')
print(f'Years covered: {roi_projects.shape[0]}')
Projects with yield data: 473 Years covered: 11
Yield Uncertainty via Error Propagation
Since both rent and sale price carry variability, we propagate their uncertainty into a yield standard error estimate using the standard ratio error formula: sigma_roi / roi = sqrt( (sigma_rent/rent)^2 + (sigma_price/price)^2 ).
rent_mean = pivot_rent.mean()
buy_mean = pivot_buy.mean()
rent_std = pivot_rent.std()
buy_std = pivot_buy.std()
roi_mean = rent_mean / buy_mean
roi_std_propagated = roi_mean * np.sqrt((rent_std / rent_mean)**2 + (buy_std / buy_mean)**2)
roi_std_df = roi_std_propagated.rename('roi_std_err_propagation').reset_index()
roi_std_df.columns = ['project_name_en', 'roi_std_err_propagation']
Project Yield Summary (2020 to 2024)
Summary statistics per project over the 2020 to 2024 window, sorted by median yield. The 2023 and 2024 columns show the most recent annual yield observations.
roi_summary = roi_projects.loc['2020-12-31':].describe().T
roi_summary['2024'] = roi_projects.iloc[-1]
roi_summary['2023'] = roi_projects.iloc[-2]
roi_summary = roi_summary.sort_values('50%', ascending=False)
roi_summary = roi_summary[roi_summary['count'] > 2]
display_cols = ['count', 'mean', 'std', '25%', '50%', '75%', '2023', '2024']
display(
round(roi_summary[display_cols].iloc[:20] * 100, 2).rename(columns={
'count': 'N years', 'mean': 'Mean %', 'std': 'Std %',
'25%': 'P25 %', '50%': 'Median %', '75%': 'P75 %',
'2023': '2023 %', '2024': '2024 %'
}).head(10)
)
| N years | Mean % | Std % | P25 % | Median % | P75 % | 2023 % | 2024 % | |
|---|---|---|---|---|---|---|---|---|
| project_name_en | ||||||||
| UNIESTATE SPORTS TOWER | 500.000 | 12.200 | 1.320 | 10.760 | 13.170 | 13.170 | 10.760 | 10.760 |
| ZENITH TOWER A2 | 500.000 | 12.000 | 2.380 | 11.480 | 11.960 | 12.810 | 11.960 | 11.480 |
| HANOVER SQUARE | 500.000 | 10.400 | 2.020 | 9.630 | 11.280 | 11.690 | 12.190 | 11.280 |
| HDS SUNSTAR 1 | 500.000 | 10.240 | 1.440 | 9.410 | 11.240 | 11.240 | 8.090 | 9.410 |
| DIAMOND VIEWS 2 | 500.000 | 13.160 | 3.170 | 10.990 | 10.990 | 16.620 | 10.990 | 10.990 |
| PALACE TOWER | 500.000 | 10.760 | 0.880 | 10.850 | 10.950 | 11.200 | 10.850 | 11.200 |
| LINCOLN PARK | 500.000 | 10.200 | 1.080 | 9.600 | 10.870 | 10.870 | 11.080 | 9.600 |
| EATON PLACE | 500.000 | 10.800 | 3.470 | 8.650 | 10.620 | 10.810 | 8.650 | 10.810 |
| Rigel | 400.000 | 10.480 | 1.090 | 9.900 | 10.600 | 11.190 | 10.160 | 11.050 |
| QPOINT LIWAN-PLOT R092 | 500.000 | 10.190 | 1.340 | 10.190 | 10.590 | 10.600 | 10.590 | 11.570 |
Top Projects by Median Yield
Error bars represent the adjusted yield uncertainty: propagated standard error is used when it is small and the observed standard deviation is used as a fallback when propagated error is large, which typically occurs in projects with sparse data.
master_map = buy[['master_project_en', 'project_name_en']].drop_duplicates().set_index('project_name_en')
roi_ext = roi_summary.merge(master_map, left_index=True, right_index=True, how='left')
roi_ext = roi_ext.merge(roi_std_df.set_index('project_name_en'), left_index=True, right_index=True, how='left')
roi_ext['adj_roi_std'] = np.where(
roi_ext['roi_std_err_propagation'] > 0.10,
roi_ext['std'],
roi_ext['roi_std_err_propagation']
)
top10 = roi_ext.iloc[:10].copy()
plt.figure(figsize=(8, 4.5))
plt.bar(
range(len(top10)),
top10['50%'] * 100,
yerr=top10['adj_roi_std'] * 100,
capsize=4,
color='steelblue',
alpha=0.85,
error_kw={'elinewidth': 1.2, 'alpha': 0.6}
)
plt.xticks(range(len(top10)), top10.index, rotation=45, ha='right')
plt.ylabel('Median rental yield (%)')
plt.title('Top 10 projects by median rental yield (2020 to 2024)')
plt.gca().yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'{x:.1f}%')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
top10_df = top10.reset_index().rename(columns={'index': 'project_name_en'})
top10_df = top10_df[[
'project_name_en',
'master_project_en',
'50%',
'adj_roi_std'
]].copy()
top10_df.columns = [
'Project',
'Master community',
'Median rental yield',
'Yield std'
]
top10_df['Median rental yield'] = (top10_df['Median rental yield'] * 100).round(2)
top10_df['Yield std'] = (top10_df['Yield std'] * 100).round(2)
display(top10_df.head(5))
| Project | Master community | Median rental yield | Yield std | |
|---|---|---|---|---|
| 0 | UNIESTATE SPORTS TOWER | Dubai Sports City | 13.170 | 2.860 |
| 1 | ZENITH TOWER A2 | Dubai Sports City | 11.960 | 3.070 |
| 2 | HANOVER SQUARE | Jumeirah Village Circle | 11.280 | 3.360 |
| 3 | HDS SUNSTAR 1 | International City Phase 1 | 11.240 | 2.330 |
| 4 | DIAMOND VIEWS 2 | Jumeirah Village Circle | 10.990 | 4.010 |
Rental Yield by Master Community
Aggregating yields to the master project level gives a regional view for comparing broad investment zones.
roi_regions = roi_ext.groupby('master_project_en')[['mean', 'std', '2024', '2023', 'adj_roi_std']].mean()
roi_regions = roi_regions.sort_values('mean', ascending=False)
display(
round(roi_regions * 100, 2).rename(columns={
'mean': 'Avg Yield %', 'std': 'Std %',
'2024': '2024 %', '2023': '2023 %', 'adj_roi_std': 'Adj Std %'
}).head(10)
)
| Avg Yield % | Std % | 2024 % | 2023 % | Adj Std % | |
|---|---|---|---|---|---|
| master_project_en | |||||
| Badra | 10.350 | 0.510 | 10.520 | 10.090 | 3.890 |
| Majan | 9.960 | 1.950 | 8.070 | 7.770 | 2.570 |
| Dubai Studio City | 9.430 | 1.240 | 9.640 | 9.360 | 2.490 |
| Jumeirah Village Triangle | 9.150 | 1.480 | 9.640 | 9.510 | 2.880 |
| Jumeirah Islands | 9.140 | 0.720 | 9.000 | 10.050 | 3.140 |
| Dubai Sports City | 9.050 | 1.280 | 9.850 | 9.520 | 2.990 |
| Jumeirah Village Circle | 8.880 | 1.130 | 9.340 | 9.120 | 2.400 |
| Silicon Oasis | 8.800 | 1.250 | 9.070 | 9.590 | 2.500 |
| Liwan | 8.780 | 1.230 | 10.150 | 9.230 | 2.650 |
| Dubai Health Care City Phase 2 | 8.720 | 0.870 | 9.520 | 8.810 | 2.240 |
top10_regions = roi_regions.head(10).copy()
plt.figure(figsize=(8, 4.5))
plt.bar(
range(len(top10_regions)),
top10_regions['mean'] * 100,
yerr=top10_regions['adj_roi_std'] * 100,
capsize=4,
color='#5b8db8',
alpha=0.85,
error_kw={'elinewidth': 1.2, 'alpha': 0.6}
)
plt.xticks(range(len(top10_regions)), top10_regions.index, rotation=45, ha='right')
plt.ylabel('Average rental yield (%)')
plt.title('Top 10 master communities by rental yield')
plt.gca().yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'{x:.1f}%')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
top10_regions_df = top10_regions.reset_index()
top10_regions_df.columns = ['Master community', 'Mean yield', 'Median yield', 'Std', 'Yield std', 'Adjusted yield std']
for col in ['Mean yield', 'Median yield', 'Std', 'Yield std', 'Adjusted yield std']:
top10_regions_df[col] = (top10_regions_df[col] * 100).round(2)
display(top10_regions_df.head(5))
| Master community | Mean yield | Median yield | Std | Yield std | Adjusted yield std | |
|---|---|---|---|---|---|---|
| 0 | Badra | 10.350 | 0.510 | 10.520 | 10.090 | 3.890 |
| 1 | Majan | 9.960 | 1.950 | 8.070 | 7.770 | 2.570 |
| 2 | Dubai Studio City | 9.430 | 1.240 | 9.640 | 9.360 | 2.490 |
| 3 | Jumeirah Village Triangle | 9.150 | 1.480 | 9.640 | 9.510 | 2.880 |
| 4 | Jumeirah Islands | 9.140 | 0.720 | 9.000 | 10.050 | 3.140 |
Historical Property Value Growth by Region
Using the sales data, we estimate annual percentage change in median price per sqm for each master community. The chart shows the average recent return over the last 4 years with historical standard deviation as error bars. Year-on-year changes above 40% are treated as outliers and removed, as they usually reflect thin transaction coverage in a given year.
grouped_region = buy.groupby('master_project_en').resample('YE')['meter_sale_price'].median().reset_index()
pivot_region = grouped_region.pivot_table(index='date', columns='master_project_en', values='meter_sale_price')
pivot_region = pivot_region.dropna(thresh=5, axis=1).ffill()
pct_change = pivot_region.pct_change()
pct_change = pct_change.where(pct_change <= 0.40, np.nan)
regional_means = pct_change.iloc[-4:].mean()
regional_stds = pct_change.std()
sorted_idx = regional_means.sort_values(ascending=False).index
sorted_means = regional_means[sorted_idx]
sorted_stds = regional_stds[sorted_idx]
top10_idx = sorted_idx[:10]
top10_means = sorted_means.loc[top10_idx]
top10_stds = sorted_stds.loc[top10_idx]
plt.figure(figsize=(8, 4.5))
plt.bar(
range(len(top10_means)),
top10_means * 100,
yerr=top10_stds * 100,
capsize=4,
color='steelblue',
alpha=0.75,
error_kw={'elinewidth': 1.2, 'alpha': 0.5}
)
plt.xticks(range(len(top10_means)), top10_idx, rotation=45, ha='right')
plt.ylabel('Mean annual price change (%)')
plt.title('Top 10 regions by recent 4 year average price change')
plt.gca().yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'{x:.1f}%')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
regional_growth_df = pd.DataFrame({
'Master community': top10_idx,
'Mean annual price change (%)': (top10_means.values * 100).round(2),
'Std (%)': (top10_stds.values * 100).round(2)
})
display(regional_growth_df.head(5))
| Master community | Mean annual price change (%) | Std (%) | |
|---|---|---|---|
| 0 | DUBAI HILLS - PARK | 24.460 | 11.050 |
| 1 | Burj Khalifa | 21.600 | 17.360 |
| 2 | DUBAI HILLS | 17.680 | 6.250 |
| 3 | International City Phase 3 | 16.550 | 23.570 |
| 4 | The Greens | 15.150 | 14.010 |
10-Year Cumulative Return Projection
Using the recent 4-year average regional price appreciation as a starting point, we project cumulative returns over a 10-year horizon. The model assumes a gradual mean-reversion toward a long-run equilibrium return of approximately 3.1% per year, composed of 1.5% real estate real return and 1.6% inflation, reached by year 5.
Two reference lines are shown for context:
- Inflation baseline - cumulative CPI-based erosion of purchasing power
- RE market estimate - baseline nominal return (inflation + long-run real return)
long_run = 0.015 + 0.016
yearly_return = pd.DataFrame(
np.zeros((len(regional_means), 11)),
index=regional_means.index,
columns=[f'year_{i}' for i in range(11)]
)
for idx in range(len(yearly_return)):
start = regional_means.iloc[idx]
path = np.concatenate([np.linspace(start, long_run, 6), np.full(4, long_run)])
yearly_return.iloc[idx, 1:] = path
yearly_return = yearly_return.T
cumulative = (yearly_return + 1).cumprod()
cumulative = cumulative.reset_index().rename(columns={'index': 'year'})
cumulative.head().T.head(10)
| 0 | 1 | 2 | 3 | 4 | |
|---|---|---|---|---|---|
| master_project_en | |||||
| year | year_0 | year_1 | year_2 | year_3 | year_4 |
| Al Furjan | 1.000 | 1.068 | 1.134 | 1.194 | 1.249 |
| Arjan | 1.000 | 1.022 | 1.047 | 1.074 | 1.104 |
| Badra | 1.000 | 1.098 | 1.192 | 1.277 | 1.351 |
| Burj Khalifa | 1.000 | 1.216 | 1.434 | 1.637 | 1.809 |
| Business Bay | 1.000 | 1.119 | 1.232 | 1.335 | 1.423 |
| City Walk | 1.000 | 1.106 | 1.206 | 1.298 | 1.377 |
| Culture Village | 1.000 | 1.090 | 1.176 | 1.254 | 1.322 |
| DAMAC HILLS | 1.000 | 1.056 | 1.111 | 1.162 | 1.210 |
| DUBAI HILLS | 1.000 | 1.177 | 1.351 | 1.511 | 1.646 |
inf_start = 0.0381
inf_path = np.concatenate([
[0],
np.linspace(inf_start, long_run - 0.016, 6),
np.full(4, long_run - 0.016)
])
re_path = inf_path + 0.016
years = cumulative['year'].values
inf_cum = (inf_path + 1).cumprod()
re_cum = (re_path + 1).cumprod()
region_cols = [c for c in cumulative.columns if c != 'year']
plt.figure(figsize=(8, 4.5))
for col in region_cols:
plt.plot(
years,
cumulative[col].values,
color='steelblue',
alpha=0.25,
linewidth=1.2
)
plt.plot(
years,
inf_cum,
color='black',
linestyle='--',
linewidth=2,
label='Inflation baseline'
)
plt.plot(
years,
re_cum,
color='#444444',
linestyle=':',
linewidth=2,
label='RE market estimate'
)
plt.xlabel('Year')
plt.ylabel('Cumulative return (1.0 = no change)')
plt.title('10 year cumulative property value projection by region')
plt.legend(fontsize=10)
plt.gca().yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'{x:.2f}x')
)
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
final_projection_df = pd.DataFrame({
'Region': region_cols,
'Final cumulative return': [cumulative[col].values[-1] for col in region_cols]
}).sort_values('Final cumulative return', ascending=False)
final_projection_df['Final cumulative return'] = final_projection_df['Final cumulative return'].round(3)
benchmark_df = pd.DataFrame([
{
'Series': 'Inflation baseline',
'Final cumulative return': round(inf_cum[-1], 3)
},
{
'Series': 'RE market estimate',
'Final cumulative return': round(re_cum[-1], 3)
}
])
print('Final year projection by region')
display(final_projection_df.head(5))
print('Benchmark paths')
display(benchmark_df)
Final year projection by region
| Region | Final cumulative return | |
|---|---|---|
| 9 | DUBAI HILLS - PARK | 2.421 |
| 3 | Burj Khalifa | 2.251 |
| 8 | DUBAI HILLS | 2.032 |
| 19 | International City Phase 3 | 1.972 |
| 41 | The Greens | 1.900 |
Benchmark paths
| Series | Final cumulative return | |
|---|---|---|
| 0 | Inflation baseline | 1.242 |
| 1 | RE market estimate | 1.474 |
End-of-Term Distribution
The histogram below shows the spread of projected 10-year cumulative returns across all regions under the mean-reversion model.
end_values = cumulative.iloc[-1][region_cols]
plt.figure(figsize=(8, 4.5))
plt.hist(
end_values,
bins=15,
color='steelblue',
edgecolor='white',
linewidth=0.5
)
plt.axvline(
end_values.median(),
color='black',
linestyle='--',
linewidth=1.5,
label=f'Median: {end_values.median():.2f}x'
)
plt.xlabel('Cumulative return at year 10')
plt.ylabel('Number of regions')
plt.title('Distribution of projected 10 year returns across regions')
plt.legend()
plt.tick_params(length=0)
plt.tight_layout()
plt.show()
end_values_summary = pd.DataFrame([
{
'Metric': 'Mean',
'Value': f'{end_values.mean():.3f}x'
},
{
'Metric': 'Median',
'Value': f'{end_values.median():.3f}x'
},
{
'Metric': 'Std',
'Value': f'{end_values.std():.3f}'
},
{
'Metric': 'Min',
'Value': f'{end_values.min():.3f}x'
},
{
'Metric': 'Max',
'Value': f'{end_values.max():.3f}x'
},
{
'Metric': 'P25',
'Value': f'{end_values.quantile(0.25):.3f}x'
},
{
'Metric': 'P75',
'Value': f'{end_values.quantile(0.75):.3f}x'
}
])
display(end_values_summary)
| Metric | Value | |
|---|---|---|
| 0 | Mean | 1.571x |
| 1 | Median | 1.569x |
| 2 | Std | 0.296 |
| 3 | Min | 1.008x |
| 4 | Max | 2.421x |
| 5 | P25 | 1.356x |
| 6 | P75 | 1.729x |
Illustrative Mortgage Cost Model
A simplified cost model for a representative 1.8M AED flat purchase, illustrating total cost of ownership over a 10-year holding period. This is for reference only and does not account for individual tax situations, service charges above the estimate, or rental income offsets.
Assumptions:
- Property price: AED 1,800,000
- Down payment: 20%
- Loan fees and DLD transfer: approximately 7.7% of property value
- Loan term: 5 years at 4.5% annual interest
- Annual running cost (service charge, maintenance): AED 30,000
- Holding period: 10 years
property_price = 1_800_000
fees_rate = 0.077
down_payment_rate = 0.20
loan_years = 5
annual_interest_rate = 0.045
yearly_running_cost = 30_000
holding_years = 10
down_payment = property_price * down_payment_rate
fees = property_price * fees_rate
total_initial_payment = down_payment + fees
loan_amount = property_price * (1 - down_payment_rate)
loan_months = loan_years * 12
monthly_rate = annual_interest_rate / 12
monthly_payment = (loan_amount * monthly_rate) / (1 - (1 + monthly_rate) ** -loan_months)
total_repaid = monthly_payment * loan_months
total_interest = total_repaid - loan_amount
total_running = yearly_running_cost * holding_years
total_cost = total_initial_payment + total_interest + total_running
ownership_cost_df = pd.DataFrame([
{
'Metric': 'Down payment + fees',
'Value': f'AED {total_initial_payment:,.0f}'
},
{
'Metric': 'Loan amount',
'Value': f'AED {loan_amount:,.0f}'
},
{
'Metric': 'Monthly repayment',
'Value': f'AED {monthly_payment:,.0f}'
},
{
'Metric': 'Total repaid (5 year loan)',
'Value': f'AED {total_repaid:,.0f}'
},
{
'Metric': 'Total interest paid',
'Value': f'AED {total_interest:,.0f}'
},
{
'Metric': 'Total running costs',
'Value': f'AED {total_running:,.0f}'
},
{
'Metric': 'Total cost of ownership',
'Value': f'AED {total_cost:,.0f}'
}
])
display(ownership_cost_df)
| Metric | Value | |
|---|---|---|
| 0 | Down payment + fees | AED 498,600 |
| 1 | Loan amount | AED 1,440,000 |
| 2 | Monthly repayment | AED 26,846 |
| 3 | Total repaid (5 year loan) | AED 1,610,757 |
| 4 | Total interest paid | AED 170,757 |
| 5 | Total running costs | AED 300,000 |
| 6 | Total cost of ownership | AED 969,357 |