Geokit Gem 1.5 and Ruby 1.9.2 => "incompatible character encodings: UTF-8 and ASCII-8BIT"

I am currently writing a Rails app using bleeding-edge components: Rails 3, RSpec 2, Ruby 1.9.2 and Geokit 1.5.0. When I try to geocode addresses that contain non-ASCII characters, I get this error:

incompatible character encodings: UTF-8 and ASCII-8BIT

The trace looks like this:

1) Spot Basic Validations should calculate lat and lng
    Failure/Error: spot = Spot.create!({
    incompatible character encodings: UTF-8 and ASCII-8BIT
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/geokit-1.5.0/lib/geokit/geocoders.rb:435:in `do_geocode'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/geokit-1.5.0/lib/geokit/geocoders.rb:126:in `geocode'
    # ./app/models/spot.rb:26:in `geocode_address'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activesupport-3.0.0.rc/lib/active_support/callbacks.rb:409:in `_run_validation_callbacks'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activemodel-3.0.0.rc/lib/active_model/validations/callbacks.rb:53:in `run_validations!'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activemodel-3.0.0.rc/lib/active_model/validations.rb:168:in `valid?'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/validations.rb:55:in `valid?'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/validations.rb:75:in `perform_validations'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/validations.rb:49:in `save!'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/attribute_methods/dirty.rb:30:in `save!'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:242:in `block in save!'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:289:in `block in with_transaction_returning_status'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/connection_adapters/abstract/database_statements.rb:139:in `transaction'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:204:in `transaction'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:287:in `with_transaction_returning_status'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:242:in `save!'
    # /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/validations.rb:34:in `create!'
    # ./spec/models/spot_spec.rb:13:in `block (2 levels) in <top (required)>'

I used # coding: utf-8 in all of the related files (specs, factories and model), yet I still get this error when I use an address like "Elsassers Straße 27".

Any hints? I thought Geokit was already compatible with Ruby 1.9.1 and therefore with the new encoding handling.
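
For readers who want to see this class of error in isolation, here is a minimal sketch. It is not Geokit's code; it only assumes that somewhere a UTF-8 string gets combined with a string tagged as ASCII-8BIT (binary) that contains non-ASCII bytes. The helper name try_concat is made up for illustration, and String#b needs Ruby 2.0 or later.

```ruby
# encoding: utf-8

utf8_text   = "Straße"            # UTF-8 string literal
binary_text = "Stra\xC3\x9Fe".b   # same bytes, but tagged as ASCII-8BIT (binary)

# Hypothetical helper: returns the exception instead of raising it
def try_concat(a, b)
  a + b
rescue Encoding::CompatibilityError => e
  e
end

result = try_concat(utf8_text, binary_text)
puts result.class
puts result.message

# Re-tagging the binary bytes as UTF-8 makes the two strings compatible
fixed = utf8_text + binary_text.dup.force_encoding('UTF-8')
puts fixed
```

Whether force_encoding is the right fix depends on where the binary string comes from; it only relabels the bytes and does not convert them.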

Answers

Using CGI.escape is not a good idea, as it gives unexpected results. Try "Oslo, Norway" with and without CGI.escape and you'll see what I mean.
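
A rough illustration of what probably goes wrong, assuming the geocoder escapes the address a second time internally. The second CGI.escape call below is only a stand-in for that internal escaping, not Geokit's actual code.

```ruby
require 'cgi'

location = "Oslo, Norway"

# Escaping once turns the comma and the space into URL-safe sequences
escaped_once = CGI.escape(location)

# Escaping the already-escaped string mangles it: "%" becomes "%25", "+" becomes "%2B"
escaped_twice = CGI.escape(escaped_once)

puts escaped_once   # Oslo%2C+Norway
puts escaped_twice  # Oslo%252C%2BNorway
```

The geocoding service would then receive the literal text "Oslo%2C+Norway" rather than "Oslo, Norway".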

A better solution is to use Iconv on the location:

require 'iconv'
# Drops every character that has no US-ASCII equivalent
ic = Iconv.new('US-ASCII//IGNORE', 'UTF-8')
utf8location = ic.iconv(location)

Cheers!

EDIT: Wes Gamble suggested an edit here, which I think is relevant:

Using //IGNORE will remove any non-ASCII characters. But in many (most) cases, you may want to transliterate certain characters such as umlauts (e.g. "Zürich" will become "Zurich") or carons (e.g. "Niš" will become "Nis") in order to geocode them successfully. If you ignore non-ASCII characters, then "Zürich" will become "Zrich" and "Niš" will become "Ni", neither of which will geocode successfully.

For this you want to use

ic = Iconv.new('US-ASCII//TRANSLIT', 'UTF-8')

Note that the conversion will raise an exception if the transliteration cannot be completed, so make sure you handle that.
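
Iconv was removed from the standard library in Ruby 2.0, so on newer Rubies the same idea needs a different tool. The following is a sketch of one possible substitute, not the approach from this answer: it uses String#unicode_normalize (Ruby 2.2+) to split accented letters into base letter plus combining mark, strips the marks, and then encodes to US-ASCII. The method name ascii_transliterate and the SPECIAL_CASES table are made up for illustration, and the table is deliberately tiny.

```ruby
# encoding: utf-8

# Hypothetical table for characters that do not decompose into base letter + accent
SPECIAL_CASES = { 'ß' => 'ss', 'ø' => 'o', 'Ø' => 'O' }.freeze

# Returns an ASCII approximation of text, or nil if some character
# cannot be transliterated (the exception is handled here).
def ascii_transliterate(text)
  # NFD splits "ü" into "u" plus a combining diaeresis; \p{Mn} matches such marks
  stripped = text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
  # Characters still outside ASCII are looked up in the table; a miss raises
  stripped.encode('US-ASCII', fallback: SPECIAL_CASES)
rescue Encoding::UndefinedConversionError
  nil
end

puts ascii_transliterate("Zürich")
puts ascii_transliterate("Niš")
puts ascii_transliterate("Elsassers Straße 27")
p ascii_transliterate("東京")
```

Returning nil on failure is just one way to handle the exception; a caller could equally fall back to sending the original UTF-8 string.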

CGI.escape seems to be more accurate than Geokit::Inflector::url_escape.

Here are the results of encoding "Elsassers Straße 27"

>> CGI.escape(address)

=> "Elsassers+Stra%C3%9Fe+27"

While

>> Geokit::Inflector::url_escape(address)

=> "Elsassers+Stra%C3e+27"

The letter ß should be encoded as %C3%9F, i.e. the UTF-8 bytes C3 9F (as per http://www.utf8-chartable.de/unicode-utf8-table.pl).
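
This can be checked without the chart. The short sketch below prints the UTF-8 bytes of ß and the CGI.escape result for the address from the question; the variable names are arbitrary.

```ruby
# encoding: utf-8
require 'cgi'

address = "Elsassers Straße 27"

# Each byte of the UTF-8 encoding of "ß", formatted as two hex digits
eszett_bytes = "ß".bytes.map { |b| format('%02X', b) }
puts eszett_bytes.join(' ')   # C3 9F

# CGI.escape percent-encodes both bytes, so none of them is lost
puts CGI.escape(address)      # Elsassers+Stra%C3%9Fe+27
```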

In addition, the debug statement was blowing up (I knew there was a reason to check whether debug logging is enabled :)

So here is my solution for GoogleGeocoder3; I guess the other geocoders will have a similar problem:

module Geokit
  module Geocoders
    class GoogleGeocoder3 < Geocoder
      def self.do_geocode(address, options = {})
        bias_str = options[:bias] ? construct_bias_string_from_options(options[:bias]) : ''
        address_str = address.is_a?(GeoLoc) ? address.to_geocodeable_s : address
        # Use CGI.escape (require 'cgi') instead of Geokit::Inflector::url_escape
        url = "http://maps.google.com/maps/api/geocode/json?sensor=false&address=#{CGI.escape(address_str)}#{bias_str}"
        res = self.call_geocoder_service(url)
        return GeoLoc.new if !res.is_a?(Net::HTTPSuccess)
        json = res.body
        # escape results of json
        logger.debug "Google geocoding. Address: #{address}. Result: #{CGI.escape(json)}"
        return self.json2GeoLoc(json, address)
      end
    end
  end
end

Are you using Postgres and pg gem v0.8? Upgrade to 0.9

I know this is a very late answer, but I have written a Google geocoder for the Geokit gem that handles all of these incompatibility errors. This geocoder uses the newest V3 API of Google's geocoding service. The advantage is that it now parses JSON rather than XML, which is faster, and paired with the required gem Yajl (a very fast JSON parser for Ruby) it is faster still. My benchmarks show it is about 1.5x faster than the old way.

https://github.com/rubymaniac/geokit-gem

I had the same problem and solved it by wrapping the address in CGI.escape(), like this:

geo = Geokit::Geocoders::MultiGeocoder.geocode(CGI.escape(address))



