
Dataset:

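You can use any publicly available Airbnb listings dataset that contains information about property listings, prices, location, reviews, and more. The code below assumes the columns property_type, room_type, accommodates, price, neighborhood, and availability_365; rename them to match your file if they differ.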

Project Goals:

  1. Load and preprocess the Airbnb dataset.
  2. Explore the distribution of property types and room types.
  3. Analyze the relationship between prices and different attributes.
  4. Identify popular neighborhoods and their average prices.
  5. Visualize the availability of properties throughout the year.

Python Code:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the Airbnb dataset
data = pd.read_csv('airbnb_data.csv')  # Replace with your dataset file

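# Data Preprocessing
# Assumption: some exports store price as text such as "$1,200.00"; convert it to a number if so
if not pd.api.types.is_numeric_dtype(data['price']):
    cleaned = data['price'].astype(str).str.replace(r'[$,]', '', regex=True)
    data['price'] = pd.to_numeric(cleaned, errors='coerce')

# Drop rows that are missing the columns used in the plots below
data = data.dropna(subset=['price', 'property_type', 'room_type'])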

# Distribution of property types and room types
property_counts = data['property_type'].value_counts()
room_counts = data['room_type'].value_counts()

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
property_counts.plot(kind='bar')
plt.title('Distribution of Property Types')
plt.xlabel('Property Type')
plt.ylabel('Number of Listings')
plt.xticks(rotation=45)

plt.subplot(1, 2, 2)
room_counts.plot(kind='bar')
plt.title('Distribution of Room Types')
plt.xlabel('Room Type')
plt.ylabel('Number of Listings')

plt.tight_layout()
plt.show()

# Analyze prices vs. attributes
sns.scatterplot(x='accommodates', y='price', data=data)
plt.title('Price vs. Number of Guests Accommodated')
plt.xlabel('Guests Accommodated')
plt.ylabel('Price')
plt.show()

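# Popular neighborhoods (the 10 with the most listings) and their average prices
top_neighborhoods = data['neighborhood'].value_counts().head(10).index
neighborhood_avg_prices = (
    data[data['neighborhood'].isin(top_neighborhoods)]
    .groupby('neighborhood')['price']
    .mean()
    .sort_values(ascending=False)
)

plt.figure(figsize=(10, 6))
neighborhood_avg_prices.plot(kind='bar')
plt.title('Average Price in the 10 Most-Listed Neighborhoods')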
plt.xlabel('Neighborhood')
plt.ylabel('Average Price')
plt.xticks(rotation=45)
plt.show()

# Visualize availability throughout the year
# Store the percentage in a new column so the original day counts stay intact
data['availability_pct'] = (data['availability_365'] / 365) * 100

plt.figure(figsize=(10, 6))
sns.histplot(data=data, x='availability_pct', bins=30, kde=True)
plt.title('Availability of Properties Throughout the Year')
plt.xlabel('Availability (%)')
plt.ylabel('Number of Listings')
plt.show()

Remember to replace ‘airbnb_data.csv’ with the actual dataset file you have. You can expand this project by exploring other attributes like reviews, amenities, or performing clustering analysis to group similar listings together. As always, adapt the code to your dataset’s structure and your specific analysis goals.
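
Since column names vary between Airbnb datasets, it can help to check the file's header before running the analysis. The snippet below is a minimal sketch using only the standard library; the helper name missing_columns and the sample data are made up for illustration, and the required list simply mirrors the columns used in the code above.

```python
import csv
import io

# Columns the analysis code above relies on
REQUIRED_COLUMNS = [
    "property_type", "room_type", "accommodates",
    "price", "neighborhood", "availability_365",
]

def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]

# Hypothetical sample: this file spells the column "neighbourhood"
# and has no "accommodates" column at all.
sample = io.StringIO(
    "id,property_type,room_type,price,neighbourhood,availability_365\n"
    "1,Apartment,Entire home/apt,$120.00,Downtown,200\n"
)
print(missing_columns(sample))
```

With a real file, you would pass an open file handle instead, for example with open('airbnb_data.csv', newline='') as f: print(missing_columns(f)). Any names it reports are the ones to rename in your data or in the code before plotting.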

