Archive for the ‘WOTS’ Category

Does open geodata matter?

Monday, April 19th, 2010

I said this quite a while ago, but it looks like others are finally getting the idea too:

Here is the problem: These efforts at creating an underlying database of places are duplicative, and any competitive advantage any single company gets from being more comprehensive than the rest will be short-lived at best. It is time for an open database of places which all companies and developers can both contribute to and borrow from.

Indeed. But why should the driver be whether it is to these companies’ competitive advantage or not? How about thinking about what is to the advantage of the rest of us – i.e. the people contributing the information in the first place?

But in order for such a database to be useful, the biggest and fastest-growing Geo companies need to contribute to it.

Well, not really. All it takes is for people to stop shoving their place information into proprietary silos, and put it into genuinely open-licensed efforts instead.

Word on the Street blog

Saturday, October 10th, 2009

Word on the Street now has a blog.

Word on the Street for Mac

Sunday, September 13th, 2009

Now you can run Word on the Street on Mac OSX (Snow Leopard) too:

https://wordonthestreethq.com/wotsmac/

Make GeoKit use the spatial features of the underlying database

Sunday, August 30th, 2009

GeoKit is a great rails plugin and ruby gem for dealing with geodata for mapping applications. I use it on the backend for the Word on the Street app (link is to iTunes). However, with millions of mappable objects, geokit’s performance takes a nose dive.

For example, this query finds all places within 10 miles of a given point:

Place.find(:all,:origin=>latlng,:within=>10, :order=>'distance asc')

On a database of about a million rows, that takes about 14s on a mac mini. And, according to the explain plan, the right indexes are being used, yet mysql is nonetheless examining 87554 rows – even though the number of places in the result is 1135 (and the number of places actually within the bounding box that geokit applies is just 1323).

I spent a little time finding out why this is. GeoKit adds a bounding box to speed up queries, together with a combined index on latitude and longitude. This helps, and the index is used, but it doesn’t help as much as it should. It turns out that the database (mysql in my case) is trying to search using two ranges (latitude and longitude), using a BTree index which is not suitable for this kind of query. This is not specific to geo based queries but actually applies to any two range query (e.g. age and height in a DB of the general population).
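To make the two-range problem concrete, here is a minimal pure-Ruby sketch. It is not MySQL and not geokit: the sorted array only stands in for a combined (lat, lng) index, and the grid data is made up for illustration. It shows that a range on the first column narrows the scan to a band, but every row in that band still has to be checked against the second range.

```ruby
# Toy model of a combined (lat, lng) B-tree index: rows sorted by lat, then lng.
# Assumption: synthetic grid data; this only illustrates which rows a range
# scan has to visit, not how MySQL actually stores or reads an index.
Row = Struct.new(:lat, :lng)

# 100 x 100 grid of points: lat 0..99, lng 0..99
rows  = (0...100).flat_map { |lat| (0...100).map { |lng| Row.new(lat, lng) } }
index = rows.sort_by { |r| [r.lat, r.lng] }

lat_range = (40..49)
lng_range = (40..49)

# The scan can seek to the first matching lat and stop after the last one,
# but inside that band the lng values are not contiguous, so each row is examined.
examined = index.select { |r| lat_range.cover?(r.lat) }
matched  = examined.select { |r| lng_range.cover?(r.lng) }

puts "rows examined: #{examined.size}"  # rows examined: 1000
puts "rows matched:  #{matched.size}"   # rows matched:  100
```

Ten times more rows are examined than returned, which is the same kind of gap the explain plan above shows.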

From an answer to a question I asked on stackoverflow, by Quassnoi:

Plain B-Tree indexes are not too good for the queries like this.

For your query, the range access method is used on the following condition:

places.lat > 51.3373601471464 AND places.lat < 51.6264998528536
, this doesn’t even take lon into account.

If you want to use spatial abilities, you should keep your places as Points, create a SPATIAL index of them and use MBRContains to filter the bounding box

Following this suggestion does indeed massively improve performance (the same query executes in less than a second if spatial indexing is used). However, getting the spatial stuff to play nice with rails is a bit of a pain. After a lot of trial and error, I came up with the code below so I thought I’d share it in case it is useful for anyone else.

Note: this code requires GeoRuby, the spatial_adapter plugin, MySql 5.0/5.1, and rails 2.3.2. It should be possible to adapt it to other databases and versions; however, I find that spatial_adapter doesn’t work with rails 2.3.3.

Desired interface

Rather than patch geokit I decided to implement a named scope called bounded that would allow manual addition of a second bounding box to queries, such that the DB would correctly identify the spatial index as the one to use.

Given a mappable object (lat and lng), and a radius, the named scope works out a bounding box to apply to the query (much the same as geokit does internally, except this one will hit the spatial index). So, for example, to redo the above query:


Place.bounded(latlng,10).find(:all, :origin=>latlng, :within=>10, :order=>'distance asc')

The :origin and :within parameters are actually a little redundant in the above query, if all you wanted was a bounding box search, but this is just to show that all the usual geokit options are still available. In fact, geokit also adds its own bounding box but this doesn’t hurt. Plus in the case above, Geokit also adds the trig calculations to make it a true radial search. It would also be possible to patch geokit to apply the spatial query, and then the api would be more elegant, but then this would need to be redone for every new release of geokit.
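For anyone curious how a bounding box and a true radial search fit together, here is a rough pure-Ruby sketch. It does not use geokit: the constants, the Point struct and the method names are my own assumptions, and the haversine formula used here may differ from the formula geokit applies. It also ignores the poles and the 180° meridian.

```ruby
EARTH_RADIUS_MILES   = 3958.8  # approximate mean Earth radius
MILES_PER_DEGREE_LAT = 69.0    # approximate

Point = Struct.new(:lat, :lng)

def to_rad(deg)
  deg * Math::PI / 180.0
end

# South-west and north-east corners of a box enclosing the search circle.
def bounding_box(centre, radius_miles)
  dlat = radius_miles / MILES_PER_DEGREE_LAT
  dlng = radius_miles / (MILES_PER_DEGREE_LAT * Math.cos(to_rad(centre.lat)))
  [Point.new(centre.lat - dlat, centre.lng - dlng),
   Point.new(centre.lat + dlat, centre.lng + dlng)]
end

# Great-circle distance in miles (haversine formula).
def distance_miles(a, b)
  dlat = to_rad(b.lat - a.lat)
  dlng = to_rad(b.lng - a.lng)
  h = Math.sin(dlat / 2)**2 +
      Math.cos(to_rad(a.lat)) * Math.cos(to_rad(b.lat)) * Math.sin(dlng / 2)**2
  2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(h))
end

# Cheap box filter first, then the exact distance check, nearest first.
def within_radius(places, centre, radius_miles)
  sw, ne = bounding_box(centre, radius_miles)
  in_box = places.select do |p|
    p.lat.between?(sw.lat, ne.lat) && p.lng.between?(sw.lng, ne.lng)
  end
  in_box.select { |p| distance_miles(centre, p) <= radius_miles }
        .sort_by { |p| distance_miles(centre, p) }
end
```

The box filter is the part a spatial index can answer quickly; the trig is then only run on the few rows that survive it. Points in the corners of the box are further away than the radius, which is why the second step is still needed for a true radial search.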

Migrating the database to use spatial

To add the necessary DB bits to get at the spatial features, add this migration for your mappable model. This is for mysql and uses direct SQL on the database, however you could instead use the spatial adapter features to make the migration more readable and database independent. I did it this way because I found that the geometry/spatial stuff is very sensitive to the mysql version – the method below is tested to work with mysql 5.0 and 5.1


class ConvertPlacesToSpatial < ActiveRecord::Migration
  # Switch the storage engine of a table (spatial indexes need MyISAM)
  def self.table_engine(table, engine='InnoDB')
    execute "ALTER TABLE `#{table}` ENGINE = #{engine}"
  end

  # Add a geometry column, fill it from the existing lat/lng columns,
  # then index it
  def self.add_spatial_to_mappable(table)
    execute "ALTER TABLE `#{table}` ADD geom GEOMETRY not null"
    execute "UPDATE `#{table}` set geom=POINTFROMTEXT(CONCAT('POINT(',lat,' ',lng,')'))"
    execute "CREATE SPATIAL INDEX index_#{table}_on_geom on #{table}(geom)"
  end

  def self.up
    table_engine :places, 'MyISAM'
    add_spatial_to_mappable(:places)
  end

  def self.down
    remove_index :places,:geom
    remove_column :places,:geom
    table_engine :places, 'InnoDB'
  end
end

The above migration changes the table storage engine to MyISAM, which is needed for true spatial indexing (but MyISAM also has some drawbacks – e.g. no transactions). The migration also sets up a geometry column and initialises it from the lat and lng columns of existing rows in the DB, then creates a spatial index for fast bounding box queries.

Adding the named scope to your ‘acts_as_mappable’ rails models

Finally, at the rails layer you need some ‘before_save’ code to keep the geometry column in sync with changes to lat and lon. This is also where you implement the named scope for adding the bounding box (again, you could instead hack geokit to do it – I did it this way so that I could update geokit later without having to patch it again).

Below is the code that I added to my mappable models to make this work. You’ll need the GeoRuby gem and the spatial adapter plugin for this. At the time of writing I find that the spatial adapter plugin only works well with Rails 2.3.2, and not at all with 2.3.3, so this may not work with other versions of rails.


require 'rubygems'
require 'geo_ruby/simple_features/point'

include GeoRuby::SimpleFeatures

class Place < ActiveRecord::Base
  ...

  # Restricts a query to the bounding box around latlng, so that the
  # spatial index on geom is used
  named_scope :bounded,
    lambda { |latlng, radius|
      { :conditions=> "MBRContains(#{boundsLineString(latlng,radius)},geom)" }}

  # Keep the geometry column in sync with lat and lng
  before_save :spatialize

  def self.boundsLineString(latlng,radius)
    bounds=GeoKit::Bounds.from_point_and_radius(latlng,radius)
    return "GeomFromText('LineString(#{bounds.sw.lat} #{bounds.sw.lng},#{bounds.ne.lat} #{bounds.ne.lng})')"
  end

  def spatialize
    mygeom=GeoRuby::SimpleFeatures::Point.from_x_y(lat,lng)
    self.geom=mygeom
  end
end

(The boundsLineString helper method could be moved somewhere else to avoid repeating it for every mappable model)
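One possible way to share the helper between models is a small module, sketched below. The module name SpatialBounded and the line_string_sql method are my own inventions for illustration (they are not part of geokit or spatial_adapter), and I have only reasoned this through rather than run it against every rails version. The rails hooks are only wired up when the including class actually has named_scope, so the string-building part can be tried outside rails.

```ruby
# Hypothetical mixin; the name and layout are assumptions, not an existing API.
module SpatialBounded
  def self.included(base)
    base.extend(ClassMethods)
    # Only add the scope and callback inside an ActiveRecord model.
    if base.respond_to?(:named_scope)
      base.named_scope :bounded, lambda { |latlng, radius|
        { :conditions => "MBRContains(#{base.bounds_line_string(latlng, radius)},geom)" }
      }
      base.before_save :spatialize
    end
  end

  module ClassMethods
    # Builds the GeomFromText expression from two corners responding to lat/lng.
    def line_string_sql(sw, ne)
      "GeomFromText('LineString(#{sw.lat} #{sw.lng},#{ne.lat} #{ne.lng})')"
    end

    # Same calculation as boundsLineString above; needs geokit to be loaded.
    def bounds_line_string(latlng, radius)
      bounds = GeoKit::Bounds.from_point_and_radius(latlng, radius)
      line_string_sql(bounds.sw, bounds.ne)
    end
  end

  # Keeps the geometry column in sync with lat and lng; needs GeoRuby to be loaded.
  def spatialize
    self.geom = GeoRuby::SimpleFeatures::Point.from_x_y(lat, lng)
  end
end

# In a model this would be used roughly as:
#   class Place < ActiveRecord::Base
#     acts_as_mappable
#     include SpatialBounded
#   end
```

Each mappable model then only needs the include line, and the bounded scope reads the same as before.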

And that’s it! Just use the ‘bounded’ named scope on any queries that are running slowly, so that the search is limited to a particular centre and radius, and you should see dramatic speed gains.

Word on the Street now available on the app store

Thursday, August 20th, 2009

Get it here (itunes link)

Word on the Street on twitter

Sunday, August 16th, 2009

Anyone interested in following Word on the Street news, I’ve set up a twitter feed here: @wots_news.

More Word on the Street beta stats

Sunday, August 16th, 2009

Some more stats on the Word on the Street beta.

I wanted to measure what the entry coverage was geographically, and how deep the coverage was. Even with a small number of users and entries so far, it’s not that bad and the growth is pretty steady.

The graphs below estimate coverage in square miles – in other words, how many square miles of the globe have a wots entry somewhere in that area.

This graph shows growth in square miles with any coverage at all:

Sq mile coverage over the last year Sat Aug 15 20-48-04 UTC 2009.png

And the next graph shows growth in coverage with at least 5 places and upwards:

Sq mile coverage over the last year (2) Sat Aug 15 20-48-04 UTC 2009.png

Not too shabby for only 40 people, with fewer than ten of them very actively posting entries.
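For reference, a coverage estimate like the one graphed above could be computed roughly as follows. This is only a sketch of one possible approach, not necessarily the calculation behind the graphs: it assumes entries are plain [lat, lng] pairs, snaps each one to a grid cell about a mile on a side, and counts the cells that hold at least a given number of entries. The method names and the 69 miles per degree figure are my assumptions.

```ruby
MILES_PER_DEGREE_LAT = 69.0  # approximate

# Maps a coordinate to the [row, col] of a grid cell roughly one mile on a side.
# The longitude scale is taken from the middle of the row so that every cell
# in a row has the same width.
def cell_for(lat, lng)
  row     = (lat * MILES_PER_DEGREE_LAT).floor
  row_lat = (row + 0.5) / MILES_PER_DEGREE_LAT
  miles_per_degree_lng = MILES_PER_DEGREE_LAT * Math.cos(row_lat * Math::PI / 180.0)
  col = (lng * miles_per_degree_lng).floor
  [row, col]
end

# Number of (roughly) square-mile cells holding at least min_places entries.
def coverage(entries, min_places = 1)
  counts = Hash.new(0)
  entries.each { |lat, lng| counts[cell_for(lat, lng)] += 1 }
  counts.count { |_cell, n| n >= min_places }
end

entries = [[51.5, -0.12]] * 5 + [[51.6, -0.12], [40.0, -100.0]]
puts coverage(entries)     # cells with any coverage at all
puts coverage(entries, 5)  # cells with at least 5 entries
```

Running the same count over snapshots of the data at different dates would give the growth curves.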

Word on the Street beta test review

Sunday, August 16th, 2009

My Word on the Street iPhone app is now in the queue for approval into the Apple app store (just over 1 week and counting).

Meanwhile I’ve taken the opportunity to write up some experiences and stats from the alpha/beta period, which started around the end of April this year. This is mainly for my own benefit but may be of use to others who are looking to set up a beta test for a service based app, so I’m sharing it here.

What the app does

The app allows people to leave notes wherever they are. You can also search for notes that other people have left nearby. You can rate locations, tag them, and share them with people, even if they don’t have an iPhone. You can also rate and edit other people’s entries. The idea is to share local knowledge not just about restaurants and bars etc, but about absolutely anything.

I’ve also added a feature which allows charities and non-profits who need volunteers at a physical location to post their requests, so that app users in the area can respond. If you know of any charity/non-profit that may be interested in this, please send them here.

Beta participation

About 80 users expressed an interest in testing and signed up via ibetatest.com, invitation, and other venues, but mostly via ibetatest. Their devices were provisioned in the ‘ad hoc distribution’ system that Apple requires.

Of these, only about half got as far as downloading the program and registering as users. Registration for an account is optional, so I don’t have stats on users that used the program without registering. However I believe that almost everyone who got as far as installing the program also registered, since some features like writing and editing entries require an account.

I suspect that a lot of the reason behind the difference in numbers registering for the beta, versus numbers getting as far as installing the program and registering for an account, is the godawful ‘ad hoc distribution’ method Apple provides for distributing apps outside the app store. This is notoriously buggy and fussy and I suspect many users got errors, or couldn’t follow the pretty convoluted steps, and simply gave up somewhere along the road. And who can blame them. Life is too short. Many of the users who did successfully install the program could only do so with some handholding from me. Some were able to install some versions and not others. Around the time OS3.0 came out, my developer certificate had to be renewed, and some users were no longer able to install the beta. A nightmare.

I also had to turn some users away because I hit the Apple imposed 100 device limit. Even though some users upgraded to the new iPhone 3GS and no longer needed their 3G device provisioned, deleting their old device did not free up a provisioning slot in Apple’s system – this is actually by design on Apple’s part! At least one user who upgraded to 3GS was unable to participate further because of this.

So in short, due to badly implemented restrictions imposed by Apple, I probably had less than 40% of the active testers I could have had, even sticking to a 100 device limit. And of course if the device limit was not so ridiculously low or was not present at all I could have had many more. Please Apple, come up with something better for this. Preventing abuse should not mean preventing use. Given this is a free app, the rationale for limiting to 100 devices – which is that it might be sold outside the app store – makes no sense here.

How active were users?

Of the users that signed up, some were more active than others. The graph below shows total users over time, versus users who had connected within time windows of 1 to 8 weeks.

Active vs Total registered users over the last year Sat Aug 15 20-48-04 UTC 2009.png

(This graph is set up to show a window of a year; however, the beta started around 4 months ago, so there is only 4 months of data so far.)
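The "active within a window" numbers can be derived from nothing more than each user’s last connection date. Here is a small sketch of that idea, assuming a hash of user id to last-seen date; the data shape and the method name are hypothetical, not the actual server code.

```ruby
require 'date'

# last_seen: hash of user id => Date of most recent connection (assumed shape).
# Returns a hash of window length in weeks => number of users seen in that window.
def active_counts(last_seen, today, windows_in_weeks = (1..8).to_a)
  windows_in_weeks.each_with_object({}) do |weeks, result|
    cutoff = today - weeks * 7
    result[weeks] = last_seen.count { |_user, date| date >= cutoff }
  end
end

today     = Date.new(2009, 8, 15)
last_seen = { :a => today - 3, :b => today - 10, :c => today - 60 }
counts    = active_counts(last_seen, today)
puts "total users: #{last_seen.size}"
puts "active in the last week: #{counts[1]}"
puts "active in the last 8 weeks: #{counts[8]}"
```

Plotting these counts per day next to the total gives the kind of active versus total comparison shown in the graph.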

It looks like somewhere between 25% and 50% of the users tried the program on a semi-regular basis and at least half only tried it once or twice and then gave up. There is also a small spike in interest any time there is a new version of the app to test, but this diminishes over time. Also, though not shown on the graph, not everyone installs each new version and a lot of returning users stuck with older versions.

It is difficult to ascribe reasons why some users didn’t return much or at all, as it may have simply been that users didn’t like the app and lost interest. However, there are some other possible reasons behind this:

  • Installing new versions via ad hoc provisioning is a pain for users, and it doesn’t always work. A proportion of users had trouble with some versions and not others and some users upgraded to 3GS and couldn’t be provisioned. That 100 device limit again.
  • Initial versions of the app were more alpha than beta – many features were unfinished and buggy. Even though this was made pretty clear, probably some users gave up because of this.
  • Some users only had iPod touches – the program is usable but a lot less useful on these, not because these devices lack GPS, but because currently the app really needs on the move internet. Again, some users probably gave up because of this.
  • The program is less interesting until it has a lot of users as ‘nearby’ entries tend to be your own. I’ve addressed this issue as much as I can – for example the program will show entries that are far away if necessary – but still it is a basic limitation until there are more users.

Lesson: Choose your testers carefully. Also bear in mind that ‘release early, release often’, may not work so well for an iPhone app, especially not within the 100 device limit imposed by Apple. The ‘release often’ part helps a little, but not that much.

You can also see the spike in tester signups due to ibetatest.com on this graph.

How much data?

The graphs below show growth in the amount of data collected over the beta period.

You can see that even though the number of users hit the maximum pretty early on, growth in the data itself has remained pretty steady, though it has tailed off recently with fewer users posting entries. This is pretty much what I expected, as once you have written about places near your usual haunts, the tendency is then to add entries only when you make an unusual trip.

It is also sometimes difficult to add entries when abroad as data roaming charges on iPhone are crazy, so you need to be able to get on free wifi, which currently limits what you can post about. I have ideas to address this in future versions.

Growth over the last year Sat Aug 15 20-48-04 UTC 2009.png
Growth over the last month Sat Aug 15 20-48-04 UTC 2009.png
Growth over the last week Sat Aug 15 20-48-04 UTC 2009.png

Who wrote what?

The graph below shows number of entries per user.

Most users posted at least one entry. However, by far the most entries were posted by me :-)

Aside from that, even within the small set of beta testers there is a clear Long Tail effect with a small percentage of users accounting for almost all other entries. This is pretty much what I expected and is yet another reason why Apple’s 100 device limit is a pain in the ass – you need an awful lot of users to get a lot of active posters.

Entries vs User (all time, 1 or more entries) Sat Aug 15 20-48-04 UTC 2009.png

Who rated place descriptions?

I was surprised that few users rated the helpfulness of other people’s entries. The graph below shows who rated what.

Again, most ratings were done by me – the data set is small enough so far that I was able to rate everything that other people posted.

But, only 10 users (25%) rated entries at all, and though there is not yet enough data to really say for sure, again there is a sign of a Long Tail effect, with a small percentage of users accounting for almost all ratings.

Ratings vs User (all time, 1 or more ratings) Sun Aug 16 10-17-21 UTC 2009.png

Note, these ratings are only about helpfulness of descriptions. This is separate from ratings of the places themselves, which are not shown on the above graph. I get into that later. In fact slightly more users rated places than posted entries about them. Generally, anyone who posted an entry also rated the place because the app generally requires it.

Initially, some users tried to rate their own entries. Although I planned to allow this as I think it can be useful to know whether a user thinks their own entry is better or worse than their other entries, I wound up discarding these ratings and disabling them. This was because it was too much of a pain to filter out self-rating when doing other calculations about entry helpfulness and user ‘karma’ etc, to prevent gaming of the system. However I will probably reinstate this capability at some point.
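To give an idea of the kind of filtering involved, here is a tiny sketch of a helpfulness average that ignores self-ratings. The Rating struct and its field names are made up for the example; the real calculations for helpfulness and karma are more involved than this.

```ruby
# Hypothetical record: who rated, who wrote the entry, and the score given.
Rating = Struct.new(:rater_id, :author_id, :score)

# Average helpfulness score, skipping ratings that users gave to their own entries.
# Returns nil when nobody else has rated the entry.
def helpfulness(ratings)
  counted = ratings.reject { |r| r.rater_id == r.author_id }
  return nil if counted.empty?
  counted.map(&:score).inject(:+) / counted.size.to_f
end

ratings = [Rating.new(1, 1, 5), Rating.new(2, 1, 3), Rating.new(3, 1, 4)]
puts helpfulness(ratings)  # 3.5
```

The same reject step would have to be repeated in every calculation that touches ratings, which is the nuisance described above.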

How helpful were the place descriptions?

With the caveat that not many people rated place descriptions, the graph below shows the breakdown of how helpful the descriptions were.

Place description helpfulness Sun Aug 16 10-37-56 UTC 2009.png

You can see that most entries were not rated at all (this is because most people didn’t rate descriptions at all), but of the ones that were, they were mostly at least “OK”. Only a handful were rated less than OK or close to ‘didn’t help at all’.

Of course it is difficult to draw any conclusions with this small sample but it looks like most entries are pretty helpful overall. Most of the ones that didn’t help at all were people posting about their own home and rating it awesome :-) This is really pretty common and I expect a lot of it once the app goes to a wider audience.

Surprisingly, a handful of entries also had a completely incorrect location and were voted down because of that. These were posted using an iPod touch, which does not have a GPS and relies on WiFi triangulation and some kind of WiFi database to get a fix. It looks like for some areas, this database is completely broken. For example, some entries that were posted in Vietnam showed up on the map in Kansas! The iPhone reports an ‘accuracy’ estimate, but this too was completely broken for these entries.

Again, I expect to see more of this once the app goes to a wider audience and because of this I distinguish between entries posted using an iPod versus ones posted using a phone.

How good were the places?

The graph below shows the breakdown of ratings on the places themselves, rounded to the nearest star on a 5 star scale – i.e. how good was the location/restaurant/bar etc, according to the poster and anyone else who rated it.

Place rating Sat Aug 15 20-48-04 UTC 2009.png

Again, this is a small sample so it is hard to draw conclusions. But interestingly more places are rated better than just “OK” – perhaps because users are generally more motivated to write about places they like.

Very few people rated a location as really bad, and no place was rated in the region of no stars at all. Far more people rated places as ‘very good’ or ‘awesome’ (5 stars)!

Meet your new volunteers

Wednesday, August 5th, 2009

Word on the Street lets charities and non-profits post calls for volunteers free of charge.

If you need volunteers at a physical location, users of my forthcoming iPhone app and website will see your call when they are in the neighborhood.

volunteersnap.png

I’m putting the finishing touches to the app and website right now.

Meanwhile, if you’d like to be told how to add your calls for volunteers when I’m ready, add yourself to the list here.


Word on the street beta for OS3.0

Sunday, June 21st, 2009

A new beta of Word on the Street is now available for download.

Changes:

  • Now requires OS3.0
  • Bug fixes
  • Animated maps with automatic zooming to show current location and surrounding areas
  • Edit, rate, and retag entries (your own and other people’s)
  • Popular tags
  • Tips
  • More responsive editing
  • Place lists should update correctly when moving
  • Bigger fonts
  • Find entries you haven’t rated yet
  • Find edits others have made to your entries
  • Better launch screen to give visual cue when the app is ready to use
  • Empty headings removed from place lists – e.g. if there is nothing within 10 miles, that heading won’t appear
  • Plus all the changes in the previous version:

  • Basic search (search on any tags, whether they exist nearby or not)
  • There is now an option to use the metric system rather than imperial (i.e. km not miles). See under the Settings… menu to turn this on.
  • Support for badges. Badges are awarded to you on various criteria, for example if you’ve added a place that other people find helpful. At the moment a handful of badges are supported on the server to illustrate this. More badges and different criteria will be added in future.
  • Support for ‘karma’ points. You can generally increase your karma and badges by adding and rating more notes and locations, especially if other users rate your stuff highly. More tags and longer place notes also give more karma, as long as other users rate your entry as helpful.
  • ‘My Places’ now works
  • ‘Me’ now displays information about you, including karma received, your rank compared to other users, and any badges you’ve been awarded.
  • You can change your display name to be something different than your login name (under the ‘Me’ menu item)
  • Rate places you’ve visited without being there (click ‘I’ve been there’ menu item when viewing details of any place)
  • ‘My Favorite Places’ now works. Click the heart button when viewing any place to add it to your favorites (or to remove it if it is already on your favorites).
  • Note, the user interface for favorites is not yet finished and there is currently no feedback as to whether a given place is already on your favorites, other than looking at your favorite list from the main screen.