This is a capstone project I completed for a course I was enrolled in through an online platform. I hope you find some value in it. Feel free to comment and add your ideas and reviews.
It’s so much fun to explore a new city: the food culture, the local sights, the coffee shops and much more. But many of us don’t have a lot of time to explore, especially on a business trip or when meeting a friend or date for a single day. It would be nice to get a recommendation for a hotel that is as close as possible to our places of interest. This would greatly reduce travel time between spots and leave us more time with our loved ones.
Meet Mia. She works in Colorado, US, but she is visiting Toronto for a business meeting. Mia is a fitness enthusiast. She loves coffee and is a great fan of a good brunch. Mia ran her first full marathon just 24 hours before her flight to Toronto, so her legs aren’t in the best shape for walking around much. She has to work the whole day and only finds time to explore the neighborhood in the evening. She likes a good brunch and a great coffee, and prefers to unwind over a glass of wine and a fancy dinner. If she randomly books a hotel or inn without searching for nearby brunch places, coffee shops or bars, she could spend a lot of time in traffic moving from one part of the neighborhood to another, which is not a great idea.
If we could recommend a hotel to Mia before she arrives in Toronto that is very close to all three (a great breakfast place, a coffee shop and a bar-cum-dinner spot), it would be so much better, as she wouldn’t have to waste time in traffic. The idea here is to pick the hotel that is closest to this cluster, which has All: Brunch, Coffee and Dinner (A-B-C-D).
This could be done by using the Foursquare API to search for the nearest coffee shops, breakfast places and dinner places within a small radius, and then plotting them on a Folium map along with the hotel we selected.
Then we can introduce a function that calculates the total distance one needs to cover around the neighborhood to get dinner, coffee and brunch. Finally, we can use the min() function to see which hotel has the least walking distance, and that is our choice of hotel.
Data used here:
- Postal codes and borough names data
- Latitude and longitude data for each neighborhood
To find the ABCD cluster and the hotels, we use the Toronto postal code data from the Wikipedia page. The table consists of the postal codes and the name of each borough and neighborhood of Toronto, Canada. We clean the data, remove the rows with an unassigned borough, and give any neighborhood with no name the name of its borough, for the sake of clustering points here. We can scrape the HTML page using Python’s Beautiful Soup and get the table into a data frame.
Alternatively, we can fetch the data directly into pandas using pd.read_html(‘wiki page here’). This parses every table on the page and returns a list of data frames, from which we pick the postal code table.
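The cleaning step described above could look roughly like the sketch below. The column names, the "Not assigned" placeholder and the sample rows are assumptions made for illustration, not the actual scraped table.

```python
import pandas as pd

# Hypothetical sample rows standing in for the scraped Wikipedia table.
# Column names and the "Not assigned" placeholder are assumptions.
raw = pd.DataFrame(
    {
        "Postal Code": ["M1A", "M3A", "M5A", "M7A"],
        "Borough": ["Not assigned", "North York", "Downtown Toronto", "Queen's Park"],
        "Neighborhood": ["Not assigned", "Parkwoods", "Regent Park", "Not assigned"],
    }
)


def clean_postal_codes(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows without a borough; name unnamed neighborhoods after their borough."""
    cleaned = df[df["Borough"] != "Not assigned"].copy()
    unnamed = cleaned["Neighborhood"] == "Not assigned"
    cleaned.loc[unnamed, "Neighborhood"] = cleaned.loc[unnamed, "Borough"]
    return cleaned.reset_index(drop=True)


toronto = clean_postal_codes(raw)
print(toronto)
```

With the real table, `raw` would be the data frame picked from the list that pd.read_html returns.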
Then, from geocoder, we get the latitude and longitude data for the city and the neighborhoods. We could find the latitude and longitude of places using Foursquare too, but here we use geocoder. As part of the course material, we were also given the Toronto latitude and longitude data as a readily available file, and we make use of that here.
After loading the file into a data frame, we concatenate the two data frames: postal codes, and latitude and longitude. Before that, we sort the postal codes data frame so that the postal code column is aligned row by row in both data frames.
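Sorting and then concatenating works as long as both data frames contain exactly the same postal codes. As an alternative to that approach (not the method used in the project), the two frames can be joined on the postal code itself, which does not depend on row order. This is a minimal sketch; the column names and coordinate values are illustrative assumptions.

```python
import pandas as pd

# Hypothetical sample data; column names and values are for illustration only.
postal = pd.DataFrame(
    {
        "Postal Code": ["M3A", "M1B", "M5A"],
        "Borough": ["North York", "Scarborough", "Downtown Toronto"],
        "Neighborhood": ["Parkwoods", "Malvern", "Regent Park"],
    }
)
coords = pd.DataFrame(
    {
        "Postal Code": ["M1B", "M5A", "M3A"],  # deliberately in a different order
        "Latitude": [43.8067, 43.6543, 43.7533],
        "Longitude": [-79.1944, -79.3606, -79.3297],
    }
)

# Join on the shared key; validate raises if a postal code appears twice on either side.
merged = postal.merge(coords, on="Postal Code", how="inner", validate="one_to_one")
print(merged)
```

Because the join matches rows by key, a missing or duplicated postal code shows up as a dropped row or an error rather than as silently misaligned coordinates.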
The final data looks something like this:
Here we use three basic Python tools:
- EDA (exploratory data analysis using pandas, numpy and sklearn)
- Folium, to visualize the maps
- Geocoder, to fetch the latitude and longitude of places
Another important tool we used here is the Foursquare API (a RESTful API).
We made calls to the Foursquare API to find coffee shops, hotels, bars, nightlife and venue details.
And last but not least, a function with the following features:
- Checks that a place satisfies all of the ABCD (All-Brunch-Coffee-Dinner) conditions
- Calculates the closest proximity around the hotel
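A function with these two features could be sketched as follows. It assumes that "total distance" means the sum of the distances from the hotel to the nearest venue in each category, and that a hotel missing any category is ruled out. The hotel names and distances are made up for illustration and are not the project’s results.

```python
from typing import Dict, List, Optional, Tuple

# The three venue categories every candidate hotel must have nearby (assumed labels).
REQUIRED = ("brunch", "coffee", "dinner")


def total_distance(nearby: Dict[str, List[float]]) -> Optional[float]:
    """Sum the distance to the nearest venue per category; None if a category is missing."""
    total = 0.0
    for category in REQUIRED:
        distances = nearby.get(category, [])
        if not distances:
            return None  # fails the ABCD condition
        total += min(distances)
    return total


def best_hotel(hotels: Dict[str, Dict[str, List[float]]]) -> Tuple[str, float]:
    """Return the (name, total distance) of the eligible hotel with the smallest total."""
    scores = {name: total_distance(nearby) for name, nearby in hotels.items()}
    eligible = {name: score for name, score in scores.items() if score is not None}
    if not eligible:
        raise ValueError("no hotel has brunch, coffee and dinner venues nearby")
    return min(eligible.items(), key=lambda item: item[1])


# Made-up distances in meters from each hypothetical hotel to nearby venues.
hotels = {
    "Hotel A": {"brunch": [300, 520], "coffee": [150], "dinner": [400]},
    "Hotel B": {"brunch": [120], "coffee": [80, 200], "dinner": [140]},
    "Hotel C": {"brunch": [60], "coffee": [], "dinner": [90]},
}

name, distance = best_hotel(hotels)
print(name, distance)
```

Hotel C is excluded even though its venues are close, because it has no coffee shop within range.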
The following methodology was used:
- Finding and scraping the Toronto data into a data frame
- Sourcing latitude and longitude data for Toronto
- Sorting the data and removing NaNs and unassigned attributes
- Importing all libraries
- Mapping Toronto and locating boroughs
- Initializing the Foursquare API by setting up the client ID and client secret
- Selecting 4 hotels based on proximity to Mia’s office
- Setting up the address of each hotel, finding its location, running API calls to locate all the shops nearby and plotting them on the map
- Writing a function to calculate which hotel has the least total distance to the venues, and coming to a conclusion
Once we set up a Foursquare account, we can use the API to fetch all the details around the neighborhood. Let’s say Mia’s work is in Scarborough. The 4 hotels very close to Mia’s workplace are:
- Monte Carlo Inn & Suites, 7255 Warden Ave
- Days Inn by Wyndham, 2151 Kingston Road
- Delta Hotels by Marriott, 2035 Kennedy Road
- Knights Inn, 4694 Kingston Road
Once we have the names of the hotels, we can run queries against the Foursquare API for each of them.
The process includes:
- Finding the coordinates of the hotel
- Searching for venues of a specific category and their distance from the hotel
- Defining the URL for the API call
- Sending the GET request and examining the results
- Transforming the results into a data frame
- Defining the information of interest and filtering the data frame
- Visualizing the places nearby
- Creating a function to get the best hotel
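The URL and result-handling steps above might look something like this sketch. It assumes the legacy Foursquare v2 venues search endpoint and its response layout; the text only says "Foursquare API", so the API version is an assumption. The coordinates, credentials and sample response are placeholders, and no network call is made here.

```python
from urllib.parse import urlencode

# Assumed endpoint: the legacy Foursquare v2 venue search.
SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(lat, lng, query, radius_m, limit,
                     client_id, client_secret, version="20180604"):
    """Build the search URL for venues matching `query` around a point."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,  # API version date expected by the v2 API
        "ll": f"{lat},{lng}",
        "query": query,
        "radius": radius_m,
        "limit": limit,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def venues_to_rows(payload):
    """Flatten a search response into name/lat/lng/distance rows, nearest first."""
    rows = []
    for venue in payload.get("response", {}).get("venues", []):
        location = venue.get("location", {})
        if "distance" not in location:
            continue  # skip venues without a distance from the search point
        rows.append({
            "name": venue.get("name"),
            "lat": location.get("lat"),
            "lng": location.get("lng"),
            "distance": location["distance"],
        })
    rows.sort(key=lambda row: row["distance"])
    return rows


# Placeholder hotel coordinates and credentials.
url = build_search_url(43.775, -79.283, "coffee", radius_m=500, limit=30,
                       client_id="YOUR_CLIENT_ID", client_secret="YOUR_CLIENT_SECRET")

# Made-up response in the assumed v2 shape, standing in for requests.get(url).json().
sample = {"response": {"venues": [
    {"name": "Cafe Two", "location": {"lat": 43.7760, "lng": -79.2810, "distance": 210}},
    {"name": "Cafe One", "location": {"lat": 43.7751, "lng": -79.2829, "distance": 95}},
]}}
rows = venues_to_rows(sample)
print(rows)
```

The resulting rows can be passed to pd.DataFrame(rows) for filtering, and each row’s lat and lng can be plotted as a marker on the Folium map.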
As we can see from the function, the best option here is Delta Hotels by Marriott, which is the closest to each of our interests: a coffee shop, a brunch spot, and a bar and dinner place.
As the total distance across the coffee place, brunch place and dinner place is only 250 meters (the ABCD square), we can reasonably recommend that Mia stay at Delta Hotels by Marriott.
Here we have chosen the hotels based on their proximity to Mia’s office. But we could make this project even better by simply asking a person for the borough, shortlisting the nearest hotels using KNN, and then filtering them by user ratings.
Or, if someone asks for a coffee shop nearby, we could apply a function to fetch the closest and best-reviewed coffee place.
Finding the best-fitting location based on certain search terms and proximity is helpful, as it removes the manual effort of listing and searching for every place of interest. In a world where every moment is precious, nobody wants to waste time stuck in traffic on the way somewhere.
The idea here was to find a location that is the least distant from certain points of interest: the best ABCD (All Brunch, Coffee and Dinner) places, all near the hotel. We were successful in finding that one hotel, which was Delta Hotels by Marriott.