Making Recommendations with Apache Mahout Presentation
Last month I gave a presentation about making recommendations with Apache Mahout. Since then, Manning Publications has released the final version of its book, Mahout in Action, which should be an even better resource than the book I was using when I prepared my slides and presentation. Here are the slides:
Anthony, great to know someone who has used Mahout. I am not much of an advanced programmer, but I’m still very interested in learning about solutions like this. My business partner and I (from the business I started this summer) worked on a recommendation engine that used just Lucene, and through that I came across Mahout.
My question (from as non-technical of a perspective as possible) is: how much work is it to get a basic recommendation system off the ground? This may be a poor question, since maybe everyone’s needs are different and thus there is no simple solution. But I’ve been curious about this for a while now.
Thanks Matt, you ask a great question! It doesn’t take very long to get some kind of recommendation going. Maybe a week or two if you are technically oriented. It took me a little longer because I did a good amount of general research in the area and tried to gather some seed data by having Mechanical Turk workers rate things. The nice thing about Mahout is that it is open source and has a book written about it, so there are resources out there to help you get going.
However, it probably takes a bit longer to get something you are happy with that makes great recommendations. The nice thing is that you can get some super-basic recommendations fairly quickly (maybe just by averaging ratings) and then refine them over time. When you get an improvement to your algorithm, you can make that the production version and keep iterating on it.
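To make the "just averaging ratings" idea concrete, here is a minimal sketch in plain Python. It does not use Mahout, and the function names, the (user, item, rating) tuple format, and the sample data are all made up for illustration. It recommends the items with the highest average rating that the user has not rated yet.

```python
from collections import defaultdict

# Hypothetical sample data: (user, item, rating) tuples invented for this sketch.
SAMPLE_RATINGS = [
    ("alice", "book_a", 5.0),
    ("alice", "book_b", 3.0),
    ("bob", "book_a", 4.0),
    ("bob", "book_c", 2.0),
    ("carol", "book_b", 4.0),
    ("carol", "book_c", 5.0),
    ("carol", "book_d", 4.0),
]


def average_ratings(ratings):
    """Return a dict mapping each item to its mean rating across all users."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in ratings:
        totals[item] += rating
        counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}


def recommend(ratings, user, n=3):
    """Recommend up to n items the user has not rated, best average first."""
    ratings = list(ratings)
    averages = average_ratings(ratings)
    seen = {item for u, item, _rating in ratings if u == user}
    candidates = [(avg, item) for item, avg in averages.items() if item not in seen]
    # Sort by average rating (descending), then by item name to break ties.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _avg, item in candidates[:n]]


if __name__ == "__main__":
    print(recommend(SAMPLE_RATINGS, "alice"))
```

This baseline is not personalized beyond skipping items the user has already rated, which is why it works as a starting point: it gives you something to compare each refinement against before promoting a new version to production.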
Additionally, you might consider whether the concierge model would be useful for very early recommendations. You have probably heard about this. Basically, you pretend that there is a magic algorithm out there giving people fantastic recommendations, when in reality it is you manually picking things that people might actually like. This might seem like a lot of work, but the thinking is:
1) you are learning a lot about your users by giving them personalized recommendations
2) you don’t need to build a full recommendation system
3) you don’t have a lot of users yet, so it is not resource intensive to do
4) this would work best in a domain where there are not many things to recommend, or that you know well enough to quickly recommend
5) it would also be nice if the recommendations were not very time-sensitive
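If it helps to picture how the concierge approach could sit next to a simple algorithm, here is a rough sketch in plain Python. Everything in it is hypothetical (the function name, the dictionary of hand-picked items, and the fallback list). The idea is only that hand-picked items come first and a generic list fills any remaining slots.

```python
def concierge_recommend(user, manual_picks, fallback, n=3):
    """Return up to n items: hand-picked ones first, then generic fallback items.

    manual_picks: dict mapping a user to a list of items chosen by a person.
    fallback: list of generic items (for example, the top-rated items overall).
    """
    picks = list(manual_picks.get(user, []))
    for item in fallback:
        if len(picks) >= n:
            break
        if item not in picks:
            picks.append(item)
    return picks[:n]


if __name__ == "__main__":
    # Hypothetical data, invented for this sketch.
    manual_picks = {"alice": ["book_d"]}
    fallback = ["book_a", "book_d", "book_c"]
    print(concierge_recommend("alice", manual_picks, fallback, n=2))
    print(concierge_recommend("zed", manual_picks, fallback, n=2))
```

From the user's side both paths look the same, so you can replace the manual picks with a real algorithm later without changing anything they see.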
Hope that helps, let me know if I can provide more details.
Thanks for your detailed response, Anthony! Much appreciated. I’m sure this will come in handy at some point in the near future. I’ve already forwarded this on to a friend who is building a recommendation engine.