Deep Dive into Amazon Quick Sight to discover the insights from Airbnb Public Dataset
In our previous blog post, we transform data from csv to parquet using AWS Glue then used Amazon Athena for analytics and used Quick Sight for data visualization. In this post we will walk you through all the features available in the Quick Sight and how to use them to build complex dashboards and make advance data visualizations to get some insights from the Airbnb Public Dataset.
First, we will learn about the data which is being used here for visualization. In the next steps we will load the data into spice engine and learn about the functionalities available in Quick Sight to create complex data visualization.
1. Description of Data:
We will be using a publicly available dataset from Airbnb, which describes the listing activity and metrics of hosts, guests, locations, properties, reviews, prices, etc. in NYC for the year 2019. Airbnb is an online marketplace that connects hosts who want to rent out their homes/properties with guests who are looking for accommodations in that locale. Guests and hosts have used Airbnb to expand on traveling possibilities and present a more unique, personalized way of experiencing the world.
We will be able to visualize the following insights from the data step by step using all functionalities of Quick Sight:
- Total number of rental houses listed per neighborhood group.
- 5 most popular hosts based on the number of reviews.
- The average rental price for private rooms, shared rooms, and entire homes for each of the neighborhoods and for a single neighborhood group so that one can pick an economic location to book a rental house.
- Total number of reviews for a given date range.
- List of host IDs with their respective neighborhood group, room type, price, and price status.
- Average rental price for Expensive, Normal, and Cheap properties in a neighborhood group.
- Average rental price per Room Type.
2. Loading the Dataset into Spice Engine for Visualization:
We will import the data into Spice Engine for visualization.
3. Creating Visualization:
We will start creating our visualization, and we will be using all the functionalities available in Quick Sight one by one and apply them to the given scenario.
We can see all the features listed below:
- a. Add a Visual.
- b. Creating a Top and Bottom Filter.
- c. Creating a Parameter and Control to Filter the Visuals.
- d. Applying Date Range Filter to the Control.
- e. Adding a Calculated Field.
- f. Traversing from one screen to another or Drill Down to a Specific Field
- g. Format each individual Visual.
3a. Add a Visual:
We can add visuals to create a chart or graph based on the requirement by clicking on Add on top of the window. In the given chart, we are using a Pie Chart which is showing the Total Number of Houses Listed per neighborhood group.
3b. Creating a Top and Bottom Filter:
First, we need to select the visual in which we want to apply the filter.
1. We will create a filter and apply that filter to one column, which will filter the values. Here in this case, we are taking the host ID column as a filtered column.
2. By clicking on the small square icon, we can edit the filter and get an option to isolate it from other visuals by choosing Only this visual option.
3. From the filter list, we will choose the Top and Bottom filter. For Show top, we are not using any parameter to select the value, so we will type 5 and in By, we will select the number of reviews (sum) and Apply that filter.
As soon as we apply this filter to this visual, we can see that we are getting the top 5 host IDs based on the number of reviews.
3c. Creating a Parameter and Control to Filter the Visuals:
We can create a new parameter by choosing two options: Add parameter from the top or create a parameter from the left pane. We will name it and choose the data type of this parameter. Since we will be using this parameter for selecting the neighborhood group, we will choose a string data type. In Values, we will choose Single value. In this case, we will visualize the Average Rental Price of Houses listed per Room type according to the selected neighborhood group, which will be our control parameter.
2. Once our parameter is created, we need to connect this parameter to control or filter or calculated field or URL action. In this case, we are connecting it with control.
3. After selecting the control, we need to add a control for the parameter where we have to choose the neighborhood group as Display name, choose Single select drop-down in Style, and Link to a dataset field in Values. Then we will select our dataset and finally the column which will be used as a control parameter.
We have added a Control here named neighborhood group which will be used for filtering the visual based on different neighborhood groups. You can see here this Control is showing all the Values of the neighborhood group column.
4. In the visual, you can see that before applying this control, this chart is showing the Average Rental Price of Houses per Room type for all the neighborhood groups. If we want to see this chart only for Brooklyn or any other neighborhood group, we need to add a filter to this control.
5. We will create a new filter and select Custom filter and Equals in Filter type, then Use the parameter that we created earlier and Apply.
6. You can see here selecting Manhattan from the Control our Visual has been filtered and only showing values related to Manhattan.
3d. Applying Date Range Filter to the Control:
To filter the values for a visual between two given dates, we need to create two parameters, one for Start Date and the other for End Date. In the following screenshot, I am creating an End Date parameter in the same way you would be creating a Start Date parameter.
2. After creating parameters, we need to add controls for these two parameters.
Once we are done with creating parameters and their respective controls, we need to add a Time range filter which will use these two parameters to select the needed Data Range from the given Controls.
3. In the following screenshot, you can see that the Total number of reviews showing for the date range from Jun 16 to Jun 30. In this chart, you will see data for Jun 16 is not available as I have not marked on Include start date on the left side pane, and data for Jun 30 is available as I have marked Include end date.
3e. Adding a Calculated field: I have added a calculated field/column which will calculate the value of a new column based on your requirement and condition.
1. In this case, I have created a price status column which will categorize price values as Cheap/Normal/Expensive based on a certain condition.
For creating a Calculated field, you need to choose Add calculated field option from Add. You will be redirected to the following page where you have the option to write your Calculated field name and write a Formula according to your requirement and condition.
You will also get the option to use required functions from the Function list and required fields from Field list.
2. You can see in this visual I have used a Pivot Table from the chart and chosen host id, price status, neighborhood group, and room type fields as Rows and price(Average) as Values. which shows host id 2,845 have two types of rental houses/properties which are Shared room and Entire home/apt in Manhattan neighborhood_group. You can clearly see that Shared room price is $99.00 which has been categorized as Cheap and Entire home/apt price is $225.00 which has been categorized as Expensive using our calculated field price status.
3f. Traversing from one screen to another or Drill Down to a Specific Field: We have an option to drill down to a specific field so that we can further analyze the data based on that specific field. In this case, we have a Pie Chart of the total number of houses listed per neighborhood group. Now we want to drill down our chart based on the field neighborhood group, which will take us to the screen where we can further explore new visualization based on the selected neighborhood group.
1. For that first, we need to choose URL actions.
After choosing this URL actions, it will take us to the following page where we have to write the action name followed by field name in Action name whereas in URL we need to copy the URL link of the screen we want to be redirected and paste it here then followed by#p.parametername=<<field name.
2. We will be using the same browser tab.
3. You can see here we are drilling down this chart based on the neighborhood group field. In this case, we are drilling down to Brooklyn (just click on the chart in Brooklyn area) then you will see an option to go to Price Status Screen and drill down to Brooklyn. By clicking on this, you will be redirected to the Price Status Screen and Chart/Visual presented in that screen will be filtered out by Brooklyn (You need to apply the filter in that screen where the parameter name for filter control is neighborhood). We will be using this parameter name in the URL action as you already saw in the previous screenshot.
4. You can see here by drilling down to Brooklyn, Filter control in this Price Status Screen automatically selects Brooklyn and based on that chart has been filtered by Brooklyn. In this chart, you can see that the Average Price for Expensive rental house is $380, Average price for Normal rental houses is $170, and Average price for Cheap rental house is $88.
3g. Format each individual Visual: We can format our visuals by using this option. We can modify the X-Axis/Y-Axis labels, also we can show data labels and modify some of the parameters related to it. In this scenario, we are using a Horizontal bar chart which is showing Average Rental Price of house per Room type.
4. Summary: In this blog, we are exploring Airbnb public dataset and discovering useful insights and showing them in visual form by using different functionalities of Quick Sight. Here we are learning how to apply filters with and without parameters. Apart from that, we are exploring all the features available in the Quick Sight and usage of them. For more information on Quick Sight and Other Technical Services please check out our ITTStar Technology Blogs.