LCBO API

LCBO API Crawler Blocked (Resolved)

January 5, 2016 heycarsten

UPDATE: (Jan 6) The issue has been resolved and LCBO API is functioning normally again. A big thanks to the LCBO team!


UPDATE: (Jan 6) It appears that the blockage was NOT intentional, and the LCBO is working to resolve the issue.


On January 5th, 2016, the LCBO blocked LCBO API’s crawler from accessing LCBO websites. This means that LCBO API cannot update its data.

It is still unknown whether this action was deliberate or an unintended side-effect of something else.

What is known is that for the past 7 years LCBO API has been crawling LCBO.com and serving a machine-readable version of it. I have attempted to get in touch with LCBO management (unsuccessfully) and have yet to be contacted by the LCBO in any way (ever). If this makes it to anyone at the LCBO, please contact me.

To the LCBO: I feel that blocking LCBO API is not a good idea. It’s not good for Ontarians and it’s not good for the LCBO. LCBO API is used heavily and is a source of goodwill and value for your corporation. I am open to any discussion you wish to have and I am happy to work with you in any way. Frankly, I’ve always wished you were fully involved!

To all of the creators of innovative applications that use LCBO API to integrate LCBO store, product, and inventory data into their products: I am sorry that this happened. Hang in there, though.

— Carsten

Follow @lcboapi on Twitter for updates.


V2 Update, Ontario Craft Brewers Identification

February 17, 2015 heycarsten

Well, it’s been a busy couple of months since my last post. In two months, exactly 99 of you have registered and created API keys. I am humbled and impressed! LCBO API saw 200% more throughput this January compared to last January, for a total of 2.3 million requests served. Pretty wild for the slowest time of the year!

In other news, I’ve been hard at work on all of the new stuff I have planned for V2 of LCBO API. Categories are now a first-class resource, as are producers, and UPC lookups are now functional.

After I normalized producers, it got me thinking about how the official LCBO site allows people to filter products by VQA designation. I thought it would be nice to identify products that are produced by Ontario Craft Brewers, so I now cross-reference that data with the LCBO product catalog.

I’ve backported this functionality into V1 so you can start using it immediately. Products now have a boolean field called is_ocb that identifies if the product is produced by an OCB member. You can also filter on this field with the standard V1 filtering parameter /products?where=is_ocb or exclude OCB products by using /products?where_not=is_ocb. I’m excited to see how people integrate this data into their products. It’s a simple addition and one that reflects the direction that LCBO API is taking.
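As a rough illustration of the two filters described above, here is a minimal Python sketch that builds the request URLs and picks OCB products out of an already-decoded list. The host in `BASE_URL` and the helper names are assumptions made for this example; only the `/products` path, the `where` and `where_not` parameters, and the `is_ocb` field come from the post.

```python
from urllib.parse import urlencode

# Assumed API root for illustration only; substitute the real one.
BASE_URL = "https://lcboapi.com"


def products_url(where=None, where_not=None):
    """Build a /products URL using the V1 where / where_not filters.

    Each argument is a single boolean field name such as "is_ocb".
    """
    params = {}
    if where:
        params["where"] = where
    if where_not:
        params["where_not"] = where_not
    query = urlencode(params)
    return f"{BASE_URL}/products" + (f"?{query}" if query else "")


def only_ocb(products):
    """Keep the products whose is_ocb field is true (client-side filter)."""
    return [p for p in products if p.get("is_ocb")]


# Hypothetical usage with made-up product records:
ocb_url = products_url(where="is_ocb")
non_ocb_url = products_url(where_not="is_ocb")
sample = [{"name": "A", "is_ocb": True}, {"name": "B", "is_ocb": False}]
ocb_names = [p["name"] for p in only_ocb(sample)]
```

Filtering on the server with `where=is_ocb` keeps responses small, while the client-side helper is handy when you already hold a mixed list.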

Finally, I’m going to be in Portland, Oregon for the first week of March to attend Ember Conf. I realized that this conflicts with my original release date of March 1st for LCBO API V2, so I’m bumping the date by a week to March 8th. I don’t want to release V2 when I’m possibly not available to answer questions and deal with potential issues.

That’s it for now, take care everyone!

— Carsten


What’s next for LCBO API

December 16, 2014 heycarsten

Before I get into the shape of LCBO API to come, I think it’s about time I told the story of LCBO API. It’s been quite the journey so far, and it’s also a bit of a lengthy read, so if you’d rather skip ahead to where I talk about the new stuff, I won’t mind.

This December marks six years since I picked up Rod Phillips’ book The 500 Best-Value Wines in the LCBO.

I thought, “Wouldn’t it be cool if I could use my phone to see a list of the wines in this book at the store I’m standing in?” Then I wouldn’t have to run through each item in the book and look for it, I’d just be presented with a list of wines available in the store I’m in. Oh, just imagine the efficiency! A decent bottle of red to go with dinner in mere seconds, and I always have my phone on me.

I really wanted to build it, but before I did anything I’d need a way to access the LCBO product catalog, inventory data, and store directory. It needed to be fast and minimal so that mobile phones (which were still pretty slow at the time) could quickly load and parse the responses. But most importantly, it needed to exist, and it didn’t.

An API is Born

The following weeks consisted of me hacking on a crawler after work to transform the pages on LCBO.com into usable, normalized data. I knew that if I was going to do this, it had to be done really well. At the time, all of the pages on LCBO.com were table-based and very hard to parse reliably. The character encodings were all over the place, everything was UPPERCASE, and all requests were via form posts — it was a blast!

I also wanted it to be released as a publicly available service so that others could use it and build cool things without having to solve this problem again and again. The first version of LCBO API was released in April 2009. Over the following months I refined the API and wrote documentation for it. In early 2010, V1 was released. By this point I had invested nearly 600 hours of my time into the project. Interest was growing, and just maintaining the crawler, adding useful features, and responding to emails from interested parties was keeping me completely busy in my spare time.

An API is Used

I honestly never thought LCBO API would become as popular as it has. Last month (November) it served 1.4 million requests and over 100 dataset downloads. I thought developers might use it to build LCBO apps for various mobile devices, and I thought reviewers might use it to integrate availability data into their sites and blogs. But I never could have foreseen all the things that have happened over the past six years.

Developers have built apps, lots of apps, mostly mobile apps, but also web apps. Students and hobbyists have fiddled and hacked to learn about REST and JSON and how to consume an API. Beer and wine lovers with an interest in coding have hacked together scripts to alert them when their favorite drinks become available at nearby stores. Independent brewers and winemakers have analyzed and identified how their products are doing and where the most active markets are to get their products closer to the people that might buy them.

One of the most exciting use-cases I ever received was from a statistician at Harvard who was using the historical datasets as fixtures for testing different algorithms in their research. It was really humbling, and it drove home the fact that LCBO API isn’t just a machine-readable representation of LCBO.com, it holds time and place for an entire retail sector in a large market. This doesn’t really exist at this scale anywhere else in the world — it’s exciting!

I’ve had the pleasure of meeting all of these incredible people doing interesting things through my work on LCBO API. As much as I’d like to end it there, in order to tell the whole story I also have to tell you about some of the not-so-enjoyable experiences I’ve had running LCBO API.

An API is Abused

It’s not a cakewalk producing wine and beer in Ontario, it’s a very challenging place for small producers to succeed.

Once or twice a year someone will reach out to me and pitch me on how I could work with them to resell portions of LCBO API inventory data to small producers as a report, charging them a premium for this valuable insight — insight that’s available on LCBO.com to anyone with a spreadsheet application.

Schemes like this are depressingly uncreative, contribute toward a toxic ecosystem and stifle innovation by trying to create a walled garden around data that should be and already is available to everyone through the LCBO’s official sales and marketing insight reports.

This is not the reason I created the LCBO API and I have no interest in helping people realize such self-serving goals.

In the spring of 2012, changes were being made to LCBO.com on a fairly routine basis. There were a couple of occasions where I was not able to update the crawler for days on end, and the data became stale. It was a very frustrating time: due to external factors I was unable to spend time updating the crawler, even though I was desperate to do so.

During this period I received a couple of undiplomatic emails complaining about the lapse in data updates. People were understandably upset that the data was not up to date for their “paying customers”. I’m a professional software developer, but this is a passion project and I only have so much free time. If you’re using and commercializing my work, at the very least I would hope you would be polite when talking to me!

This is some of the dark side of running LCBO API, but you know what? The good days far outnumber the bad ones, and it’s those good days, and emails, and stories, and projects that stoke my passion for working on LCBO API.

An API Grows Up

I give LCBO API the utmost attention and care. It’s a hardened platform built on thousands of hours of work, and I take every aspect of it very seriously. Going forward, I’m going to make sure that this level of commitment and quality is properly communicated. In addition, I want to build features and tools that allow non-technical users to benefit from LCBO API so that it’s delivering value to all members of the ecosystem, not just ones who can code.

The look and feel of the old site didn’t reflect any of this very well and I’ve wanted to update it for years, so I did. I’ve also made a number of changes under the hood. Here’s what to expect soon:

Unlimited Anonymous Access is Deprecated

For the sake of my sanity, and to provide a better service and not hinder the potential of LCBO API, I need to have an understanding of who is using it, for what and where. This is why I have introduced the concept of Access Keys to LCBO API.

As of March 8th, 2015 anonymous API access will be rate-limited.

Anonymous access remains but, as of March 8th, 2015, it will be rate-limited. This means that you won’t need an Access Key for playing around or learning, and it means that existing mobile and JavaScript apps will continue to work. If you’re using LCBO API for anything beyond fiddling, you’ll want to acquire an Access Key.

In addition to lifting the rate limit, an Access Key also gives you insights and statistics related to your account.

I plan to build out the management panel further and provide some other useful features in the future.

LCBO API continues to be a labour of love

I don’t want to sound like Jimmy Wales here, but outside of simply charging for API access on a subscription model, I’m hard-pressed to come up with a way to financially support the project. The hard costs aren’t crazy. Right now LCBO API consists of a load balancer, app server, worker server, and database server. It averages about $100/month in hosting costs plus another $60/month for AWS, monitoring, and backups.

The reality is that, like everyone, I have bills to pay and a family to support. I can’t spend as much time on LCBO API as I’d like to because at some point it eats into time that must be spent earning an income. Responding to project-related emails, maintaining the crawlers and ensuring updates happen on a daily basis eats up a lot of my available free time. This leaves very little bandwidth to actually improve LCBO API and develop the new and exciting things that fuel my passion for the project in the first place.

I really don’t enjoy talking about these things, but now they’re out there and very clear, no secrets. LCBO API costs about $160/month plus a lot of my time to run, and it generates $0/month in income. If LCBO API is making it easier for you to do your job, run your business, or build an app, please consider supporting it financially.

LCBO API V2

Now for the exciting stuff. As I said before, LCBO API was introduced in 2009 and the visible API hasn’t really changed since that time. UNTIL NOW.

HTTPS & CORS

These features have been backported into LCBO API V1 and are already live. Check out the V1 documentation for more details.

UPC Support

Finally. You’ll be able to look up products by barcode.

JSON API Compliance

The JSON structure of the V1 API was born out of necessity. LCBO API V2 complies with the JSON API open standard, making it easier to consume the API.

Category and Producer APIs

For the sake of completeness and to make it easier to implement discovery / browsing interfaces, I’ll be normalizing category and producer data and providing API endpoints for them.

Store(s) with Product(s) Feature

This is a doozy. I’ve been asked for this feature a handful of times, and I’ve even been told how easy it is to implement (it’s not). That said, it’s required functionality if you want to build something like a great shopping-list feature. It’s a worthwhile ocean to boil, and I’m excited to bring it to LCBO API.

Historical Metrics

These are aggregate metrics such as turnover rate and confidence in inventories. This will allow developers to alert users if it looks like a product might not actually be available. For example, on average, consumption begins to increase on Thursday and peaks on Saturday. If a product is selling consistently throughout the week, and there are only a few left on Saturday morning, it’s very likely that come Saturday evening it won’t be available anymore. Conversely, some products are stocked at very low levels and have very low turnover; this also has to be considered to avoid false positives.
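To make the reasoning above concrete, here is a minimal Python sketch of an availability check. It is not how LCBO API computes its metrics; the `InventorySnapshot` type, the field names, and both thresholds are assumptions chosen for illustration. It estimates how long the remaining stock will last at the average sales rate, and it skips low-turnover products so that a shelf with two slow-moving bottles does not raise a false alarm.

```python
from dataclasses import dataclass


@dataclass
class InventorySnapshot:
    quantity: int           # units on hand right now
    avg_daily_sales: float  # average units sold per day, taken from history


def days_until_sold_out(snapshot):
    """Estimate days of stock left, or None when the product is not selling."""
    if snapshot.avg_daily_sales <= 0:
        return None
    return snapshot.quantity / snapshot.avg_daily_sales


def likely_unavailable(snapshot, horizon_days=0.5, min_daily_sales=1.0):
    """Return True when stock will probably run out within horizon_days."""
    # Low-turnover products are routinely stocked at low levels, so a small
    # quantity on its own is not a warning sign for them.
    if snapshot.avg_daily_sales < min_daily_sales:
        return False
    remaining = days_until_sold_out(snapshot)
    return remaining is not None and remaining <= horizon_days


# Hypothetical Saturday-morning readings:
fast_seller = InventorySnapshot(quantity=3, avg_daily_sales=12.0)
slow_seller = InventorySnapshot(quantity=2, avg_daily_sales=0.1)
alerts = [likely_unavailable(fast_seller), likely_unavailable(slow_seller)]
```

A fuller version would use per-weekday sales rates rather than a single average, since consumption rises toward the weekend.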

Intelligent Crawler

A few months ago I tested an accessory crawler that analyzes various blogs and news sites for LCBO product numbers. The plan was to then use that information to perform priority crawling for pricing and inventory data of those products. It seems to work quite well, so this will be officially rolled into LCBO API proper as soon as time allows.

Webhooks

Now that LCBO API has the concept of accounts and Access Keys, adding support for webhooks is a much less daunting task. You’ll be able to register against numerous events such as when products are added or removed, when prices change, and when product availability changes. This will make adding notification functionality to apps a lot easier and more reliable.
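On the receiving side, an app would typically decode each webhook body and route it by event type. The Python sketch below shows that pattern only. The webhook format is not described in this post, so the event names, the `event` and `data` keys, and the `product_id` field are all hypothetical placeholders.

```python
import json


# Hypothetical handlers: the post only says events will cover products being
# added or removed, price changes, and availability changes.
def on_price_changed(data):
    return f"Price changed for product {data['product_id']}"


def on_availability_changed(data):
    return f"Availability changed for product {data['product_id']}"


# Hypothetical event names mapped to their handlers.
HANDLERS = {
    "product.price_changed": on_price_changed,
    "product.availability_changed": on_availability_changed,
}


def dispatch(raw_body):
    """Decode a webhook body and route it to the matching handler.

    Returns the handler's result, or None for events we do not handle.
    """
    event = json.loads(raw_body)
    handler = HANDLERS.get(event.get("event"))
    if handler is None:
        return None
    return handler(event.get("data", {}))


# Hypothetical payload for illustration:
body = '{"event": "product.price_changed", "data": {"product_id": 123}}'
message = dispatch(body)
```

Ignoring unknown events, rather than raising, keeps a receiver working if new event types are added later.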

Products Meta API

I’d love to integrate with top-notch products like Untappd to incorporate ratings and other useful data so that it can be used in queries. This data would only become visible if you ask for it in requests. Imagine something like /products?meta=untappd returning:

{
  "products": [
    {
      "name": "Amsterdam Boneshaker",
      "meta": {
        "untappd": {
          "rating_score": 3.72,
          "rating_count": 4837
        }
      }
    }
  ]
}

This would enable all sorts of cool uses and possibilities, and could even be opened up to allow 3rd parties to write custom metadata. If you’re interested in discussing such a partnership, please get in touch.
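As one example of such a use, a client could rank products by their Untappd score. The Python sketch below assumes each entry in `products` is an object shaped like the imagined response above; since the meta API is only an idea at this point, the response shape, the `top_rated` helper, and the two placeholder products are illustrative assumptions.

```python
import json

# Sample body modelled on the imagined /products?meta=untappd response.
# The two "Placeholder" products and their numbers are invented for the demo.
SAMPLE = """
{
  "products": [
    {"name": "Amsterdam Boneshaker",
     "meta": {"untappd": {"rating_score": 3.72, "rating_count": 4837}}},
    {"name": "Placeholder Lager",
     "meta": {"untappd": {"rating_score": 4.10, "rating_count": 12}}},
    {"name": "Placeholder Stout",
     "meta": {"untappd": {"rating_score": 3.90, "rating_count": 500}}}
  ]
}
"""


def top_rated(response_text, min_ratings=100):
    """Return product names sorted by Untappd score, highest first.

    Products with fewer than min_ratings ratings (or no Untappd metadata)
    are skipped, since a score based on a handful of votes is unreliable.
    """
    products = json.loads(response_text)["products"]
    rated = [
        p for p in products
        if p.get("meta", {}).get("untappd", {}).get("rating_count", 0) >= min_ratings
    ]
    rated.sort(key=lambda p: p["meta"]["untappd"]["rating_score"], reverse=True)
    return [p["name"] for p in rated]


ranking = top_rated(SAMPLE)
```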

A Novel is Written

Today, the original app idea that ignited the motivation to build LCBO API in the first place has been far surpassed by entire product companies like WineAlign and Natalie Maclean. I actually find that really cool.

It shows that ideas by themselves are so often futile. You have to act on them — create with them — and when you do, what they become is never exactly what you had in mind. Reality always wiggles its way into the equation somehow.

I started with the idea to build a $2.99 iPhone app and ended up with a cloud service. A service that’s used to help students learn, to help enthusiasts locate specialty drinks, to help small producers gain some insight for their product line, to enable native mobile applications on any platform, to provide a large-scale realistic dataset for research projects. That’s something to be proud of, and I am. Thanks for listening.

— Carsten