You’re using Hpricot to parse web content, but it throws an error like the following, which kills the process outright (crashing your app or your background task, as the case may be):
/usr/local/lib/ruby/gems/1.8/gems/hpricot-0.8.2/lib/hpricot/parse.rb:33: [BUG] Bus Error
ruby 1.8.7 (2009-04-08 patchlevel 160) [i686-darwin8.11.1]
This resource suggests that the problem occurs when the retrieved content is exactly 16384 bytes long. That was not the problem in my case.
My problem is replicated in this gist. Requesting the URL with curl -i showed that the server was returning a 302 redirect:
HTTP/1.1 302 Found
Date: Thu, 12 Nov 2009 14:50:53 GMT
Set-Cookie: ASP.NET_SessionId=p2s0dljru11tiwer3e01jfq2; path=/; HttpOnly
Set-Cookie: Forum2backURL=/tm.aspx?m=1859288#1859354; path=/
Set-Cookie: Forum2preURL=; path=/
Expires: Wed, 11 Nov 2009 13:50:53 GMT
Content-Type: text/html; charset=utf-8
I am not sure why the open method from Ruby’s OpenURI did not follow this redirect. However, I determined that the file returned by open() had a size of zero bytes, and that this was what caused Hpricot to blow up.
My workaround is to check the size of the file returned by open() and to parse it only if that size is greater than 0:
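If you would rather confirm the redirect from Ruby than from curl, something like the following sketch may help. It relies on the standard library’s Net::HTTP.get_response, which does not follow redirects, so a 3xx response can be inspected directly. The helper name summarize_response and the URL in the usage comment are made up for illustration.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of OpenURI or Hpricot): describes a
# Net::HTTPResponse, reporting the Location header for redirects.
def summarize_response(res)
  if res.is_a?(Net::HTTPRedirection)
    "redirect (#{res.code}) to #{res['location'].inspect}"
  else
    "#{res.code} #{res.message}"
  end
end

# Example usage (performs a real request; the URL is a placeholder):
#   res = Net::HTTP.get_response(URI.parse("http://example.com/some/page"))
#   puts summarize_response(res)
```

A redirect reported with a nil location would mean the server sent no Location header at all.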
require 'open-uri'
require 'hpricot'
f = open(file_or_uri)
if f.size > 0
  doc = Hpricot(f)
else
  raise "Could not retrieve content due to zero-sized file, possibly due to site redirect."
end
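If you fetch content in more than one place, the same guard can be pulled into a small helper. This is only a sketch: read_nonempty_body is a made-up name, and StringIO stands in here for whatever open() returns so that the example runs without network access or Hpricot installed.

```ruby
require 'stringio'

# Hypothetical helper (not part of Hpricot or OpenURI): returns the body of
# an IO-like object as a String, or raises if the body is empty.
def read_nonempty_body(io)
  body = io.read
  if body.nil? || body.empty?
    raise ArgumentError, "Could not retrieve content: empty body, possibly due to a site redirect."
  end
  body
end

# In-memory stand-in for a successful fetch.
html = read_nonempty_body(StringIO.new("<html><body>hi</body></html>"))
# html can now be handed to the parser, e.g. Hpricot(html)

# In-memory stand-in for the zero-byte case described above.
begin
  read_nonempty_body(StringIO.new(""))
rescue ArgumentError => e
  warn "Skipped: #{e.message}"
end
```

Raising a specific exception class lets a background task rescue just this case and move on to the next URL instead of dying.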