Vacation Rentals: Can Machine Learning accurately predict the ‘goodness’ of a photo? [Part 1]

In the vacation rental industry, picking the right photos for your property can be really tough. Some property managers who are super talented at business development, customer service, or any number of other tasks will readily admit they don’t have a great eye for design. Others think they have a great eye for design and photography when, in actuality, they don’t.

The Challenge

The honest reality is that choosing ‘good’ photos for your vacation rental listing, or knowing the right photographer to hire, is an incredibly subjective process. I have had conversations with numerous property managers and private vacation rental owners who say that they usually just “go with their gut” when choosing which photos to list.

But what if there was a way we could objectively, quantifiably rate a photo on its goodness using intelligent algorithms? While the process would inevitably involve some subjectivity, I think there are clear improvements that can be made by leveraging technology.

Said another way, can we train a computer to understand that Photo A is a ‘good’ photo and that Photo B is a ‘bad’ photo?

Photo A
Photo B

For the sake of argument and this study, ‘good’ here does not refer to whether the photo is framed well, whether the property is ‘nice’, and so on. For this specific example we’re dealing primarily with one question: is the photo generally well exposed, with good lighting throughout? In the example photos above, the key difference between the two is that Photo A is really well exposed, appears professionally shot, and is visually appealing through the depth of the photo. Photo B, on the other hand, isn’t doing the property any favors: it’s poorly exposed, the back half of the room is extremely dark, and overall it makes the room feel uninviting.

So how do we train a computer to accurately know the difference?

The Process

Unfortunately, most machine learning and artificial intelligence algorithms require a lot of data to generate highly accurate results. For our purposes, though, we can validate the idea with an image set of no more than 100 photos.

  • Step 1: Go grab ~100 images from major listing sites.
  • Step 2: Group these images into buckets of ‘Good’ and ‘Bad’. (Yes, this is a fairly subjective process. In a production environment we may want more graduated options, e.g. “Really Good”, “Good”, “Okay”, “Not So Good”, “Bad”, etc.)
  • Step 3: Crop all images to the exact same dimensions.
  • Step 4: Push images through a rich feature extraction algorithm. For this test, we generated 2048 features for each photo.
  • Step 5: Run our labeled image data through a Support Vector Machine (SVM). SVMs are EXTREMELY powerful at classifying data of all types and are widely used in everything from image classification to email spam detection at huge email companies.
  • Step 6: Check the results!
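
Steps 4 through 6 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used for this post: the feature vectors are synthetic stand-ins (8 made-up numbers per photo instead of 2048 extracted features), and the classifier is a tiny linear SVM trained with stochastic subgradient descent on the hinge loss rather than a library SVM. All function names here are hypothetical.

```python
import random


def make_synthetic_features(n_per_class, n_features, rng):
    """Stand-in for Step 4: fake feature vectors, NOT real image features.

    'Good' photos (label +1) cluster around +1.0 in every feature,
    'bad' photos (label -1) cluster around -1.0.
    """
    X, y = [], []
    for label in (+1, -1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * 1.0, 0.5) for _ in range(n_features)])
            y.append(label)
    return X, y


def score(w, b, x):
    """Signed distance-like score: positive leans 'good', negative leans 'bad'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=50, seed=0):
    """Step 5: linear SVM (hinge loss + L2 penalty) via subgradient descent."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            violated = y[i] * score(w, b, X[i]) < 1  # inside the margin?
            w = [wj * (1 - lr * lam) for wj in w]    # L2 shrinkage
            if violated:                             # hinge-loss subgradient step
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    return 1 if score(w, b, x) >= 0 else -1


rng = random.Random(42)
X_train, y_train = make_synthetic_features(20, 8, rng)  # photos used for training
X_test, y_test = make_synthetic_features(10, 8, rng)    # held-out photos

w, b = train_linear_svm(X_train, y_train)

# Step 6: check the results on photos the model has not seen.
correct = sum(predict(w, b, x) == t for x, t in zip(X_test, y_test))
accuracy = correct / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The synthetic classes are deliberately easy to separate, so this sketch says nothing about how hard real photos are. Its only purpose is to show the shape of the pipeline: features in, labels in, a trained decision boundary out, and a check against held-out examples.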

So how did we do?

The Positive

For a large part of the data set we were able to predict whether the photo was ‘good’ or ‘bad’ with over 90% accuracy! Here are some cases where the algorithm predicted ‘goodness’ or ‘badness’ with at least 90% accuracy:

The Negative

So while our model was able to predict with 90% accuracy whether a photo was a ‘good’ photo, it struggled to recognize what really makes a ‘bad’ photo. Unfortunately, when it came to ‘bad’ photos, our model predicted that they were ‘good’ 48% of the time, even though we had previously labeled them ‘bad’.

The photos above were originally labeled ‘bad’, but our model incorrectly predicted that they were ‘good’. What does this mean? It means that when a photo is really good, the model reliably recognizes it. But when a photo is bad, the model gets it right only about 52% of the time, which is barely better than a coin flip. It also shows that telling a good photo from a bad one is hugely subjective!
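
To see how these two numbers combine, here is a small worked calculation. The post does not state how many photos were in each bucket, so the 50/50 split below is purely an assumption for illustration, and it reads the 90% figure as the share of ‘good’ photos that were labeled correctly.

```python
# ASSUMPTION: 50 'good' and 50 'bad' photos. The real split is not given.
n_good, n_bad = 50, 50

good_correct_rate = 0.90        # 'good' photos predicted 'good' (from the post)
bad_predicted_good_rate = 0.48  # 'bad' photos predicted 'good' (from the post)

# Confusion-matrix counts implied by those rates.
good_as_good = round(n_good * good_correct_rate)      # correctly called 'good'
good_as_bad = n_good - good_as_good                   # missed 'good' photos
bad_as_good = round(n_bad * bad_predicted_good_rate)  # 'bad' photos that slipped through
bad_as_bad = n_bad - bad_as_good                      # correctly called 'bad'

# Share of all photos classified correctly.
overall_accuracy = (good_as_good + bad_as_bad) / (n_good + n_bad)

# Of everything the model calls 'good', how much really is 'good'?
precision_good = good_as_good / (good_as_good + bad_as_good)

print(f"overall accuracy:    {overall_accuracy:.0%}")
print(f"precision on 'good': {precision_good:.0%}")
```

Under that assumed split, overall accuracy comes out near 71%, and only about two out of three photos the model calls ‘good’ were actually labeled ‘good’. With a different split the numbers would shift, which is one more reason to report per-class results rather than a single accuracy figure.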

For a production-ready model we would likely need a MUCH larger dataset, as well as some thoughtful pre-categorization (interior vs. exterior, etc.) before really trying to train a model. Still, for such a small dataset the results are very promising, even if they are not quite ready for production use.

Why is this helpful?

There are a number of ways that a tool which could accurately score or predict the ‘goodness’ or ‘badness’ of a photo could be useful.

  1. Quality Scores. It goes without saying that major listing sites are constantly looking for differentiating factors to use in their ranking algorithms. A model that could accurately classify an image as ‘good’ or ‘bad’ would help identify which listings are likely to generate the most clicks and bookings (great photos generate better bookings!).
  2. No more subjectivity. A tool that could accurately inform property managers about the quality of their photos could go a long way toward helping the less creatively inclined among them know whether they have good photos or not.
  3. Outsourcing. A tool that could accurately tell you whether your photos are good or not would also aid the decision to hire a professional. While I would always recommend using a professional, it could also help you choose the right one by letting you grade the photos in their portfolio before you make the first phone call!

Wrapping Up

While there are many implications for a tool like this, it’s really only the beginning.

In the coming posts of this series I am going to cover alternative algorithmic approaches we could build with even better results, as well as discuss other novel ways to use image classification tools for quality assurance of your listings among other things.

Stay tuned!