Wednesday, November 14, 2012

Node.js event loop does not poll

Node.js uses a well-known event loop, but does it work by polling? Some have that impression.

An event loop [1] works by requesting its events from a message pump (per Wikipedia).

Here's how the event loop is implemented: 'Internally, node.js relies on libev to provide the event loop, which is supplemented by libeio[,] which uses pooled threads to provide asynchronous I/O.' [2]

Here's Wikipedia's article on polling [3] and another definition [4].

Now, 'poll' is a system call which asks Unix to check a set of file descriptors:

'poll, ppoll - wait for some event on a file descriptor...If none of the events requested (and no error) has occurred for any of the file descriptors, then poll() blocks until one of the events occurs.' [5]

Possibly, the system call's name has misled people into thinking a userland program is doing polling. Nevertheless, invoking the Unix system call 'poll' is not, in itself, polling.

Hypothetically, in order to get information from a message pump, an event loop could employ the Unix system call 'poll' to check a file descriptor to which the message pump writes events.

Ultimately, this may be the source of the conceptual confusion here; or perhaps the cause is that (actual) polling is simply the easiest method to think of when programming.

For our case in particular: if an event loop calls Unix 'poll', the event loop is not thereby polling anything. Neither node.js nor any other event loop polls the file descriptors; only Unix does (if indeed it even really does, anymore).

Anyway, an event loop, such as node.js's, does not poll its message pump. Instead, it merely makes a (blocking) request to it. Calling just any request 'polling' pollutes the meaning of the word (and that may be what's happening here).
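
To make the distinction concrete, here is a minimal Ruby sketch (my own illustration, nothing to do with node.js internals) contrasting real polling with a single blocking wait:

reader, writer = IO.pipe
Thread.new { sleep 0.5; writer.write 'x' }

# Polling: ask 'anything yet?' over and over, with a zero timeout.
until IO.select [reader], nil, nil, 0
  sleep 0.05 # the program itself keeps waking up to check
end
reader.read 1

# Waiting: one blocking request; the program sleeps until an event occurs.
Thread.new { sleep 0.5; writer.write 'y' }
IO.select [reader], nil, nil, nil # nil timeout means block indefinitely
puts 'woke exactly when the event arrived'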

tl;dr – So, let's try not to say anymore that node.js is polling its events—okay? Instead, let's simply say that node.js waits for its events. (A lost cause, I know—but at least I've said it.)

[1] http://en.wikipedia.org/wiki/Event_loop
[2] http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/
[3] http://en.wikipedia.org/wiki/Polling_%28computer_science%29
[4] http://whatis.techtarget.com/definition/polling
[5] http://linux.die.net/man/2/poll

Copyright (c) 2012 Mark D. Blackwell.

Wednesday, November 7, 2012

Install Opa language on 32-bit Debian squeeze, howto

The coolest feature of the Opa web programming language is that it automatically divides developers' programs into server and client sides, compiling to JavaScript.

Though the Opa compiler (as of this writing) doesn't have a 32-bit binary for Windows, I got it working easily on (32-bit) Debian squeeze, after upgrading my nodejs installation.

Following Opa's instructions to install as a user (under the heading, Other Linux Distribution), I downloaded and ran their 32-bit Linux self-extracting package. When prompted, I chose to install it into ~/progra/mlstate-opa.

Then, after navigating to A tour of Opa in the sidebar, under the heading, Easy Workflow, I found their sample program and typed it into a file, 'hello.opa'. The command:

$ opa hello.opa --

errored out, asking for more npm modules to be installed.

Rather than exactly following their suggested course of action, which would have installed node modules into root-owned directories, I typed:

$ npm install mongodb formidable nodemailer simplesmtp imap

After that the compiler worked just fine.

Copyright (c) 2012 Mark D. Blackwell.

Friday, November 2, 2012

PC timer for online tests, howto

I needed a timer to take online tests with (on an IBM PC).

And I found PC Timer. It seems ideal for this purpose:

http://www.brothersoft.com/the-pc-timer-1862.html

In order to alert me when a timed test is nearly over, I configured it to run this simple batch file:

@echo off
mode con: cols=100 lines=8
color 4e
echo .
echo .
echo "Time's up"
echo .
echo .
pause

Simple!

Presumably, it will be useful in timing other things, as well.

Beyond the countdown timer, it also has two alarms (set for clock times rather than durations, as of version 4.0).

Copyright (c) 2012 Mark D. Blackwell.

Monday, October 29, 2012

Free private hosting, howto

Recently, I came across this discussion of inexpensive alternatives for project source code hosting of private projects:

http://stackoverflow.com/questions/109440/best-git-repository-hosting-for-a-commercial-project

I investigated, and of those with apparent substance, the cheapest (actually free of charge for five users) is Bitbucket:

https://bitbucket.org/plans

I thought someone might appreciate a free alternative to GitHub for their private projects. If so, I recommend you follow these steps:

1. Download a backup of the current version of the source code, and save it on your computer. It should be available here:

https://github.com/{your username}/{your private project}/zipball/master

2. Open a free account with Bitbucket and there create a free, private repository (under your control, for your safety).
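
3. (Optional) To move the full revision history (rather than just a snapshot), a mirror clone and push should work; this is a hypothetical sketch, using the same placeholder names in braces:

$ git clone --mirror git@github.com:{your username}/{your private project}.git
$ cd {your private project}.git
$ git push --mirror git@bitbucket.org:{your username}/{your private project}.git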

Possibly Bitbucket will let your developers themselves push to the original repository, just as they can on GitHub.

Copyright (c) 2012 Mark D. Blackwell.

Friday, September 28, 2012

Frontend experience

Recently, I acquired some practical website frontend experience—which took quite a bit of learning!

For an initial demo for a startup, I analyzed, selected and set up all the infrastructure (Rails, Heroku & Amazon). I wrote all the CSS frontend. I also wrote all the working database backend.

See the demo! See how its layout is fluid?

It doesn't have multiple user capability yet; it's just a demo, at this time.

I made this in the pursuit of becoming a does-everything website developer.

Copyright (c) 2012 Mark D. Blackwell.

Monday, September 17, 2012

Website page layouts, proofs of concept

A big part of frontend website development is implementing webpage layouts using CSS stylesheets (of course).

Recently, I've been doing a great deal more business in the area of layouts (specifically for Rails websites), especially the work of implementing these layouts by developing CSS stylesheets—whether or not this is really programming! (Well, I think it is.)

I find it much less efficient to run the Rails server, and much more efficient to 'web-browse' the local filesystem. The work progresses much more quickly, in other words, when it is isolated from any complicating factors arising from our misunderstanding of the Rails server, jQuery, ERB/HAML, and perhaps even Sass. The weightiest reason for this improvement (by far) is the troubleshooting principle: 'divide and conquer'. Less important is that the filesystem is also simply faster.

It is much more doable (dare I say, even feasible) to get isolated layouts working using pure CSS and HTML (while keeping class names simple). And the same is true while paring down a stylesheet to be as simple and clean as possible.

Of course, further simplifying cross-browser development is the use of a CSS-reset stylesheet. Also, for HTML5's semantic tags (header, footer, nav, etc.), it is essential to include a (JavaScript) HTML5 shim (or 'shiv') script. So I include both of these best practices.

I have prepared a repository of my CSS (layout) proofs of concept on GitHub—including nine(!) useful proofs (as of now, September, 2012).

These layout proofs contain stylesheet code the way I write for Rails projects as much as possible (without actually including Rails).

Copyright (c) 2012 Mark D. Blackwell.

Monday, August 27, 2012

Crisp image edges in web browsers, howto

Sometimes, website creation frontend work involves extracting images from pages rendered by browsers. These pages may be wireframes, for instance.

Of course, it is appropriate that web pages (displayed in a browser) contain some blurring for good looks (which becomes plainly visible if blown up to 1600% by Photoshop, etc.)

Of course, it is appropriate also that some images in a wireframe (such as icons) be blurred, because icons are normally created by a dithering process.

Although image blurring (for demonstration purposes) is appropriate and looks good, such additional blurring is bad when images are extracted for reuse on a webpage, because the blurring will then happen twice (the result is doubly blurred).

To avoid this double-blurred problem, and for pixel art, the following method will set up for you a web browser which does not blur images:
  1. Download and install the latest SeaMonkey web browser:

    http://www.seamonkey-project.org/releases/

  2. For your particular operating system, locate your profile folder by reading:

    http://www.gemal.dk/mozilla/profile.html

  3. Immediately below your profile folder, make sure a folder exists named, 'chrome' (not the Google browser), and that a file exists in the chrome folder called, 'userContent.css' (or create them).

  4. Append to userContent.css the following lines; all of them select resampling of images by the desired (in this case) nearest-neighbor method:

    (Note: I leave intact (below) some other browsers' settings for this, just in case you want to add these lines to your particular browser, in whatever way.)
/*
Gecko (Firefox & Seamonkey)
Webkit (Chrome & Safari)
*/
img {
image-rendering: optimizeSpeed;             /* Older Gecko */
image-rendering: optimize-contrast;         /* CSS3 draft proposal */
image-rendering: -webkit-optimize-contrast; /* Webkit */
image-rendering: crisp-edges;               /* CSS3 draft proposal */
image-rendering: -moz-crisp-edges;          /* Gecko */
image-rendering: -o-crisp-edges;            /* Opera */
-ms-interpolation-mode: nearest-neighbor;   /* IE8+ */
}
References:
http://help.dottoro.com/lcuiiosk.php
https://github.com/thoughtbot/bourbon/pull/102
http://productforums.google.com/forum/#!topic/chrome/AIihdmfPNvE
https://bugzilla.mozilla.org/show_bug.cgi?id=41975
https://developer.mozilla.org/en-US/docs/CSS/Image-rendering
http://www-archive.mozilla.org/unix/customizing.html#usercss
http://stackoverflow.com/questions/7615009/disable-interpolation-when-scaling-a-canvas
http://nullsleep.tumblr.com/post/16417178705/how-to-disable-image-smoothing-in-modern-web-browsers
http://www.w3.org/TR/2011/WD-css3-images-20110712/#image-rendering

Copyright (c) 2012 Mark D. Blackwell.

Saturday, August 11, 2012

Simple webserver for troubleshooting, howto

Here is an extremely simple web server to use in troubleshooting your code, derived from Yohanes Santoso's wonderful Gnome's Guide to WEBrick.

It serves any directory tree you're working on (including HTML properly) without any complexity arising from Rails, Sinatra, or any other web frameworks. (Isolation is a good thing when troubleshooting.)

Just place this in your tools directory (e.g., ~/t/serve-files — making sure it's executable):

#!/usr/bin/env ruby
require 'webrick'

program_name = $0
puts "#{program_name} #{ARGV.join ' '}"

puts "Running Ruby #{RUBY_VERSION}"

include WEBrick

options = {
  :BindAddress => '0.0.0.0',
  :Port => 3000,
  :DocumentRoot => Dir.pwd,
}
server = WEBrick::HTTPServer.new options

%w[INT TERM].each{|e| trap(e){server.shutdown}}

server.start
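
Then, to serve a directory tree, change into it, run the script, and browse to http://localhost:3000 (a hypothetical project directory shown):

$ cd ~/my-project
$ ~/t/serve-files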

Copyright (c) 2012 Mark D. Blackwell.

Friday, August 10, 2012

Fern Hill, choral work by John Corigliano

I just heard a marvelous choral work by John Corigliano on WQXR's Q2 Music program, 'The Choral Mix With Kent Tritle'.

It is Fern Hill, set to the poem by Dylan Thomas, composed in 1959 when Corigliano was 21.

This performance (directed by Kent Tritle) starts one third of the way into the program. (Press the Full Player button; find the August 5, 2012 program; click 'ADD THIS'.) To position to Fern Hill: pause the WQXR website player; wait awhile for buffering. Then click just left of the host's last name (in the series title).

There's a review of a previous performance in the New York Times.

Copyright (c) 2012 Mark D. Blackwell.

Saturday, July 28, 2012

Select online wikis and discussion boards for startups

In GitHub, there's no email notification associated with wiki changes. That's not very good! They specialize in source control, instead. For more:

http://rants.arantius.com/github-sucks

Stackexchange here discusses GitHub wiki change notifications, but the solution requires setting up another server and programming it. Certainly a startup doesn't have time for that:

http://developer.github.com/v3/repos/hooks/

The old Google Groups was bad. And even the new Google Docs lacks change notification.

So today I researched online wikis and discussion boards for startups, and found this Stackexchange answer.

Zoho's products seem excellent to me! Generally, their collaboration apps are widely used and integrated with Google Docs. They seem quite good (and cheap) both for their wiki and their discussion forum:

http://www.zoho.com/collaboration-apps.html

Zoho wiki: 'Access controls' (private?). Notifications. Comment threads. Free of charge for three (3) users, $12 per month for four (4) users:

http://www.zoho.com/wiki/wiki-pricing.html
http://www.zoho.com/wiki/wiki-notifications.html
http://www.zoho.com/wiki/enterprise-level-security.html
http://www.zoho.com/wiki/google-apps.html

Zoho online forums: Private (by Google Docs integration). Notifications, but I don't know if they're universal. One forum is free of charge:

http://discussions.zoho.com/
http://www.zoho.com/discussions/features.html
http://www.zoho.com/discussions/intranet-discussions.html
http://www.zoho.com/discussions/features.html#topicadministration
http://www.zoho.com/discussions/solutions.html
https://www.google.com/enterprise/marketplace/viewListing?productListingId=2533+14374228673760475061

--------------
Other online wiki sites:

Wikidot: No integration with Google Docs. Non-public. Private for $50 per year ($4 per month equivalent). Free of charge with advertising:

http://www.wikidot.com/plans
http://www.wikidot.com/faq:private-sites

--------------
Other online forum sites:

ProBoards: $7 per month for ad-free. Oriented to public access; seems somewhat disreputable:

http://www.proboards.com/premium-forum-features

QuickTopic: $49 per year ($4 per month equivalent):

http://www.quicktopic.com/gopro?ref=faq

Teamlab: No pricing found!

Wetpaint: No access control.

Wikispaces: $20 per month for restricted access.

Wikimatrix, for comparing wiki software, said it includes online offerings, but they seem to be installable software. It might not be worth running one's own wiki server.

http://www.wikimatrix.org/wizard.php?d[branding]=&d[domain]=&d[flag]=2&d[language]=&d[support]=&d[wysiwyg]=yes&d[history]=yes&d[go]=1&x=77&y=14

Copyright (c) 2012 Mark D. Blackwell.

Tuesday, July 24, 2012

Getting started with Ajax live updating, howto

Complexity is the enemy of troubleshooting. Remove as many dimensions as possible—it helps in a big way!

To create (with Ajax) your first live-updated web page, this applies no less. (Of course, these kinds of web pages don't require full page refreshes to change their appearance—familiar, right?)

The chosen web stack also makes its own recommendations. For getting Ajax to work, the elements of complexity include:
  • Understanding what normally happens in Ajax on the server, and getting that to work, especially in an evolving web stack like Rails. (Googling gives obsolete information.)
  • Learning the details of the recommended language for the browser (naturally it gets compiled into Javascript). This is CoffeeScript for Rails now.
  • Learning the details of the particular Javascript library; Rails now recommends jQuery.
  • Getting the chosen Javascript library from an external server into the browser (which might silently not be working).
  • Learning in the chosen Javascript library (jQuery, e.g.) how to do Ajax.
  • Learning in (raw) Javascript how to do Ajax in a combination of browsers (using try-and-catch).
  • Learning in (raw) Javascript in a chosen, single browser how to do Ajax (e.g., doing XMLHttpRequest).
  • Learning to troubleshoot Javascript in a browser.
  • Learning the (browser) Document Object Model.
  • Learning Javascript.
For getting first-time Ajax working, all but the last four are unnecessarily complicated dimensions for troubleshooting.

Obviously if Ajax isn't working and nothing's happening, one had better remove all possible dimensions and get one thing working at a time!

So I have written a simple Ajax testbed server for the cloud:
  • It is always running.
  • It is extremely simple.
  • It doesn't have to be installed or configured by you.
  • It responds absolutely identically to:
    • GET and POST requests.
    • Header-specified HTML, JSON, XML and JS requests.
    • URL extension-specified HTML, JSON, XML and JS requests.
  • It simply returns the same JSON no matter what you do.
The testbed is running. Fork it on GitHub! Or tell me how it's not working for you.
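
For reference, such a server can be tiny. Here is a minimal sketch in Sinatra (my own reconstruction, not necessarily the testbed's actual code; it omits the extension-specified routes, and the '/ajax' route and message merely match the client code below):

require 'sinatra'
require 'json'

# Answer GET and POST identically, returning the same JSON no matter what.
%w[get post].each do |verb|
  send(verb, '/ajax') do
    content_type :json
    headers 'Access-Control-Allow-Origin' => '*' # allow test pages on other hosts
    { message: 'Hello from the Ajax testbed' }.to_json
  end
end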

Here's some browser client code to match (it runs on Mozilla's browsers):

<!DOCTYPE html>
<html> <head>
<title>bare Ajax</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<script type="text/javascript">
<!--

//Browser support code:

function getAjaxObject(){
  var result;
  try{ //Gecko: Mozilla, Firefox; Webkit: Chrome, Safari; ?: Opera; (etc.?) browsers:
    result = new XMLHttpRequest();
  } catch (e){ //Internet Explorer browsers:
    try{
      result = new ActiveXObject('Msxml2.XMLHTTP');
    } catch (e) {
      try{
        result = new ActiveXObject('Microsoft.XMLHTTP');
      } catch (e){ //Something went wrong:
        alert('Unable to obtain an Ajax object.');
        return false;
      }
    }
  }
  return result;
}

//Non-browser support code:

function targetId() {
  return 'replaceable';
}

function Target(id) {
  this.id = id;
}

function alterTarget(s) {
  var target = new Target(targetId());
  var elem = document.getElementById(target.id);
  elem.innerHTML = s;
}

function requestAjaxUpdate() {
  var req = getAjaxObject();
  //req.open('GET', 'http://localhost:5000/ajax',false);
  req.open('GET', 'http://ajax-testbed-simple.herokuapp.com/ajax',false); //Synchronous.
  req.send();
  //alert(req.status);
  var myJSON = JSON.parse(req.responseText);
  var s = myJSON.message;
  alert(s);
  alterTarget(s);
}

// When the DOM is fully loaded:
window.document.addEventListener('DOMContentLoaded', function() {
  //alert('window document ready');
}, false);

// When images, etc. are fully loaded:
window.onload = function() {
  //alert('window load');
  requestAjaxUpdate();
};

//-->
</script> </head> <body>

<div id="replaceable"  >replaceable-content</div>

</body> </html>

References:

http://ajaxpatterns.org/XMLHttpRequest_Call
https://developer.mozilla.org/en/AJAX/Getting_Started
https://developer.mozilla.org/en/DOM/
https://developer.mozilla.org/en/DOM/About_the_Document_Object_Model
https://developer.mozilla.org/en/DOM/DOM_event_reference/DOMContentLoaded
https://developer.mozilla.org/en/DOM/XMLHttpRequest
https://developer.mozilla.org/en/DOM/XMLHttpRequest/Using_XMLHttpRequest
https://developer.mozilla.org/en/DOM_Client_Object_Cross-Reference/DOM_Events
https://developer.mozilla.org/en/JavaScript/
https://developer.mozilla.org/en/JavaScript/A_re-introduction_to_JavaScript
https://developer.mozilla.org/en/JavaScript/Guide
https://developer.mozilla.org/en/JavaScript/Reference/
https://developer.mozilla.org/en/JavaScript_technologies_overview
https://developer.mozilla.org/en/Server-Side_Access_Control
http://en.wikipedia.org/wiki/Ajax_(programming)
http://en.wikipedia.org/wiki/JSON
http://en.wikipedia.org/wiki/XMLHttpRequest
http://linuxgazette.net/123/smith.html
http://molily.de/weblog/domcontentloaded
http://stackoverflow.com/questions/1457/modify-address-bar-url-in-ajax-app-to-match-current-state
http://stackoverflow.com/questions/6410951/how-to-simplify-render-to-string-in-rails-3
http://www.hunlock.com/blogs/Mastering_JSON_(_JavaScript_Object_Notation_)
http://www.json.org/js.html
http://www.mousewhisperer.co.uk/ajax_page.html
Ajax Tutorial – http://www.tizag.com/
http://www.w3.org/TR/XMLHttpRequest/
http://www.w3schools.com/json/default.asp
http://www.w3schools.com/xml/xml_http.asp
http://www.xml.com/pub/a/2005/02/09/xml-http-request.html

Copyright (c) 2012 Mark D. Blackwell.

Wednesday, July 18, 2012

Back up your KeePass database daily automatically (with timestamp), howto

Many people place their KeePass database in a Dropbox folder.

It's a good idea, but doesn't it seem just a little dangerous? Who knows—resynchronization might lose a password you just created, or worse.

So I wrote a utility which backs up your KeePass database with a timestamp, every time you start your computer.
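
The heart of it is just a timestamped copy. A minimal sketch (hypothetical paths, and not the utility's actual code):

require 'fileutils'

# Hypothetical locations; point these at your Dropbox folder and a backup directory.
source     = File.expand_path 'Dropbox/passwords.kdbx', Dir.home
backup_dir = File.expand_path 'keepass-backups', Dir.home
FileUtils.mkdir_p backup_dir

stamp = Time.now.strftime '%Y-%m-%d-%H%M%S'
FileUtils.cp source, File.join(backup_dir, "passwords-#{stamp}.kdbx")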

It's for Windows users. I've been using it safely since March, 2011.

Just now, I realized I could improve its usability by removing (from its installation instructions) a difficult step setting up the shortcut (formerly a show-stopper). It now uses Ruby from your PATH (instead of the kluge of a fixed location the user has to manage).

I've made the change, so now it's on GitHub.

Copyright (c) 2012 Mark D. Blackwell.

Saturday, July 7, 2012

Manage long-running external webservice requests from Rails apps (on cloud servers), howto

Case: as long as Rails is synchronous, requests to external webservices drive the use of server resources to impossible levels, even when webservices behave normally—let alone when they are long delayed.

Plan: two web apps (one Rails, the other async Sinatra) can fairly easily manage the problem of external web service requests by minimizing use of server resources—without abandoning normal, threaded, synchronous Rails. The async Sinatra web app can be a separate business, even a moneymaking one.

This solution uses RabbitMQ, Memcache and PusherApp.

The async Sinatra web dynos (on the one hand) comprise external webservice request brokers. Also they have browser-facing functionality for signing up webmasters.

The Rails web dynos (on the other hand) don't wait for external webservices, and they aren't short-polled by browsers.

This attempts to be efficient and robust. It should speed up heavily loaded servers while remaining within the mainstream of the Rails Way as much as possible.

E.g., it tries hard not to Pusherize browsers more than once when a cached response to an external webservice was missed, but relies on browser short-polling after perhaps a 10-second timeout to cover these and other unusual cases.

But in the normal case browser short-polling will be avoided so Rails server response time should be peppy.

It tries to delete its temporary work from memcache but even if something is missed, memcache times out its data eventually so too much garbage won't pile up there.

Note: this is for web services without terribly large responses (thus appropriate for memcaching). Very large responses and non-idempotent services should be handled another way such as supplying them directly to the browser.

Method: the Rails web app dynos immediately use memcached external webservice responses if the URLs match.

Otherwise they push the URL of each external webservice request and an associated PusherApp channel ID (for eventually informing the browser) to a RabbitMQ Exchange.

For security purposes, minimal information is passed through PusherApp to the browser (only suggesting a short-poll now, not where).

The Rails web dynos (if necessary) return an incomplete page to the browser as usual (for completion with AJAX).

To cover cases where something got dropped the browser should short-poll the Rails app after a longish timeout—its length should be set by an environment variable and may be shortened to half a second when the Rails website is not terribly active, or when the async Sinatra web dynos are scaled down to off.

Each async Sinatra web dyno attaches a queue to the Rails app's RabbitMQ exchange for accepting messages without confirmation.

With each queued message, an async Sinatra web dyno:
  1. Checks the memcache for the external webservice request (with response)—if present, it:
    • Drops the message. (Some may slip through and be multiply-processed, but that's okay.)
    • Frees memcache of the request (without response) if it still exists (see below).
    Otherwise, it checks the memcache for the external webservice request without response; if that was memcached recently (perhaps within 10 seconds), it drops the message. (Again, some may slip through and be multiply-processed, but that's okay.)
    Otherwise, it makes the request to the external webservice, setting a generous response timeout (maybe 60 seconds). (A Ruby sketch of this memcache logic follows the list.)
  2. Memcaches the external webservice request (without response) with the current time (not in the key).
  3. If the request times out, drops it in favor of letting the browser handle the problem, but leaves the memcached external webservice request (without response) for later viewing by async Sinatra web dynos.
  4. (Usually) receives a response from the external webservice request.
  5. Again checks memcache for the external webservice request (combined with the same response). If it's not there:
    • Pusherizes the appropriate browser. (Some requests may be multiply-processed, but that's okay.)
    • Memcaches the external webservice request (with response).
    • Clears from memcache the external webservice request without response.
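
In Ruby, the memcache side of steps 1 and 2 might look roughly like this (a sketch using the Dalli gem; the key names, URL and timeouts are placeholders of mine):

require 'dalli'

cache = Dalli::Client.new 'localhost:11211'
url = 'http://api.example.com/quote/123' # a stand-in external webservice URL

if cache.get("response:#{url}")                # the response is already memcached
  cache.delete("pending:#{url}")               # free the request-without-response marker
elsif (started = cache.get("pending:#{url}")) && Time.now - started < 10
  # Recently claimed by some dyno: drop the message.
else
  cache.set("pending:#{url}", Time.now, 60)    # record the request, without response
  # ...make the external request (generous timeout), then memcache the
  # response and clear the pending marker, Pusherizing the browser if needed.
end
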
The browser then asks the Rails web dyno to supply all available AJAX updates.

The Rails web dyno returns to the browser the set of still-needed AJAX responses that are memcached (usually incomplete; some may have been dropped, but that's okay), for further completion with AJAX.

Or (if all were memcached) the Rails web dynos return the complete set of outstanding AJAX responses to the browser.

I'm starting to implement this here, now.

Copyright (c) 2012 Mark D. Blackwell.

Thursday, July 5, 2012

User-story presence flags ease split-test metrics for lean startups, howto

This morning I read 'Measure', a chapter of the book, The Lean Startup. It discusses cohort analysis, split-testing, and the triple-A of metrics: Actionable, Accessible and Auditable.

Then I got an idea regarding split-testing the user-story stream in website development.

For split-testing newly deployed stories, it's easy to include (in the logs) a (growing) bitstring with one indicator per user story, marking its presence (with/without, or after/before); the story's ordinal number (perhaps from PivotalTracker) is implicit in the bit position. All are kept in the same central place (in the source code) usually used for configuration.

Packed together by story number (using standard Base64 encoding), each log line includes them as a short string. (They take up only a single character for every six stories, i.e. 64 combinations, of course.)
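
For instance, the packing can index directly into the standard Base64 alphabet (a sketch; the bit-ordering convention is my own):

# One Base64 character encodes six story-presence bits (64 combinations).
ALPHABET = [*'A'..'Z', *'a'..'z', *'0'..'9', '+', '/']

# flags[i] is true when story number i is deployed.
def story_flags_string(flags)
  flags.each_slice(6).map do |group|
    value = group.each_with_index.inject(0) { |acc, (on, i)| on ? acc | (1 << i) : acc }
    ALPHABET[value]
  end.join
end

p story_flags_string [true, false, true, true] #=> "N"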

With current aggregated logging, remembering which log records came from which active set of stories might be difficult. But at the first level this method eases split-testing (for the impact of) each newly-deployed story.

Going deeper, the flags in the logs categorize the comparison data cleanly and safely, especially if we ever want something more complex (in the current context)—such as to reassess an old story. To disable an earlier story, some special programming is required, but our log data will indicate clearly which stories are active.

For split-testing, we can filter the log data by these story-presence strings. We can split-test for various configurations (of user stories), new-user activation (or usage rates or whatever we desire).

Perhaps we might want to remove an old feature, and split-test that, before we expend the effort to develop an incompatible new feature—good idea? And arbitrary configurations of features can be split-tested.

Copyright (c) 2012 Mark D. Blackwell.

Monday, July 2, 2012

Scale Rails in the cloud by handling external webservice delays, howto

Background:

With Rails, each simultaneous browser request maintains its use of server memory till Rails responds—for unthreaded Rails, this is several tens of MB. For threaded Rails or JRuby, it's still substantial.

Webservers often (or usually) queue requests for a relatively small number of Rails instances, perhaps a dozen or two.

Freezing (in I/O wait) thousands of Rails instances for long-delayed webservices (located elsewhere on the Internet) is impracticable.

A naive Rails website design (without AJAX) which obtained all needed results from external services first (before responding to the browser) would provide little server throughput.

Usually for website scalability, developers offload (to other worker programs) what, in a naive or low-traffic design, Rails instances (themselves) might do.

In more scalable and sophisticated designs, Rails responds rapidly to initial requests (thus freeing its instance). Then, AJAX finishes the webpage (i.e., the browser polls the server).

Case:

It is a generally established software principle that events are better than polling.

Webpage content delivered by AJAX can be short-polled from the browser. However, short-polling gives unnecessarily slow response. From a user experience (UX) perspective, it is either too slow, especially at peak times, or slower than it could be. Also, short-polling loads up servers with extra running Rails instances (maybe queued up); yet ultimately, most such requests determine (regrettably) that there is nothing new.

Furthermore, during each polling request, Rails reacquires all relevant information from its database cache (due to the stateless nature of the web). Therefore, each Rails short-polling response takes a terrible amount of server resources—yet only calculates the time for the next poll.

Plan:

Long-polling (or websockets) should be used by Rails websites which access external services.

This can be accomplished efficiently if the browser long-polls, not Rails, but instead an asynchronous webserver such as (Ruby) EventMachine, configured for something other than Rails: for instance, asynchronous Sinatra::Synchrony.

A RabbitMQ exchange (in a cloud environment such as Heroku) can then provide a queue to each front-facing asynchronous webserver instance (dyno), carrying the desired information about (all) the worker programs performing tasks which Rails instances offload.

Because the notifications don't need to be stored permanently, it's better to use a message system than a database.

Probably the information (simply) would be notification that each (single) task was complete. The exchange would connect all worker instances to all server instances, in an instance-agnostic design typical of the cloud. A notification would include the user's session ID; then each asynchronous webserver could filter the notifications down to the (presently active) long-polling connections it (itself) owns. Browsers can provide the session IDs (perhaps in a header) while setting up the long-poll.

Normally a webserver will keep long-poll HTTP connections open for quite a long time; however, if for any reason a connection has been broken, it doesn't matter much; the RabbitMQ queue (configurably) doesn't keep a message once the webserver has received it (so they won't pile up anywhere). Also if the webserver is restarted, the old queue automatically will be deleted.

This is because, in RabbitMQ, (named) exchanges actually manage the flow of messages; each receiver creates a queue of its own (from that exchange) which receives all the messages (applying to all server instances on the website).
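
Concretely, the receiving side might look like this (a sketch with the Bunny gem's current API; the exchange name, environment variable and one-session-ID-per-message convention are my assumptions):

require 'bunny'

connection = Bunny.new ENV['RABBITMQ_URL'] # hypothetical; e.g. set by an add-on
connection.start

channel  = connection.create_channel
exchange = channel.fanout 'task-completions'

# A server-named, exclusive queue: one per webserver instance, bound to the
# shared exchange; it is deleted automatically when the connection closes.
queue = channel.queue('', exclusive: true).bind(exchange)

queue.subscribe do |_delivery_info, _properties, session_id|
  # If this instance holds an open long-poll for session_id, return from it now.
end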

Receipt confirmation also is unnecessary. If some messages might be dropped when the server is busy, so what? Nothing much bad will happen further; in that case the user may refresh the webpage—so the scheme is quite tolerant of individual failures.

After getting the new messages, the asynchronous webservers merely return from those long polls. After return, AJAX in the browser knows it can make a short-poll Rails request and be guaranteed of receiving more information for the webpage. Even if the connection merely is broken accidentally, the overall process is still safe, and will result in valid short-poll AJAX results. In other words (for simplicity), the normal way Rails responds to AJAX short-polling should not change.

This cycle of long-poll, short-poll should automatically repeat till Rails effectively tells the AJAX code (in the browser) all the work is done for the page—i.e., till no worker jobs are queued.

Perhaps (the default schedule of) AJAX short-polling can most easily be put off by increasing (to some large value) the time delay on the first short-poll. Presumably this is configurable in Rails. Long-polling of the other (asynchronous) webserver should be added to Rails page view layouts.

Thin (and Ruby EventMachine) are asynchronous and non-blocking, just like Node.js. They can accept thousands of simultaneous HTTP connections (to browsers), each consuming only a few KB of memory. That Thin is based partly on Ruby EventMachine demonstrates the latter's quality.

The job queue for Rails worker programs also probably should be a RabbitMQ exchange, since we're using it.

Various other asynchronous Ruby servers are: cramp, goliath, rainbows! and puma.

Actually, probably the best asynchronous webserver for this purpose is the paid service Pusher (or the open-source Slanger equivalent, to keep it in-house).

References:

blog.headius.com/2008/08/qa-what-thread-safe-rails-means.html
confreaks.com/videos/727-rockymtnruby2011-real-time-rack
github.com/igrigorik/async-rails/
github.com/igrigorik/em-synchrony
github.com/jjb/threaded-rails-example
jordanhollinger.com/2011/04/22/how-to-use-thin-effectivly [sic]
www.igvita.com/2009/05/13/fibers-cooperative-scheduling-in-ruby/
thechangelog.com/post/927103350/episode-0-3-1-websockets
www.igvita.com/2010/03/22/untangling-evented-code-with-ruby-fibers/
www.igvita.com/2010/06/07/rails-performance-needs-an-overhaul/
www.igvita.com/2011/03/08/goliath-non-blocking-ruby-19-web-server/
www.tumblr.com/tagged/pusher?before=1323905509
yehudakatz.com/2010/08/14/threads-in-ruby-enough-already/

Copyright (c) 2012 Mark D. Blackwell.

Thursday, June 28, 2012

Planning and developing

Regarding planning and developing in startups:

We should keep open what is open, and go forward with the unproblematic when there is no real choice. Some decisions are best made early, some late. For instance, using Amazon for storage is certain, because it's cheapest.

For optional aspects without measured customer interest, action is best delayed.

Developing some aspects sometimes puts constraints on other aspects. Some aspects, though certainly needed, are best delayed if multiple possible implementations exist (while other crucial aspects are still undone).

Obtaining the fastest possible overall development mandates choosing (based on actuality) what to work on, and what not to work on yet.

Copyright (c) 2012 Mark D. Blackwell.

Install Postgres on Debian squeeze for Rails, howto

Intro


Here's how to install Postgres, the popular, open-source database server, for Rails 3 development on Debian squeeze (I used Rails 3.2.6). I switched to Postgres for maximum compatibility with Heroku (it's one of their '12-factor app' development principles).

This installation procedure keeps safety particularly in mind for anyone (like me) who has never before used Postgres.

Install


You needn't settle for Debian squeeze's normal Postgres version (8.4), unlike (apparently) the case on Mac OS.

Don't install the latest Postgres from source though postgresql.org recommends it—because its setup for Debian is difficult. Likely you will get mysterious errors such as:

$ bundle exec rake db:create:all --trace
  rake aborted!
  libpq.so.5: cannot open shared object file: No such file or directory - /home/mark/.rvm/gems/ruby-1.9.2-p320@global/gems/pg-0.14.0/lib/pg_ext.so
  /home/mark/.rvm/gems/ruby-1.9.2-p320@global/gems/pg-0.14.0/lib/pg.rb:4:in `require'

Instead use the most recent Postgres, backported to squeeze (currently 9.1.4). Here's how:

Remove old versions of Postgres software (e.g. 8.4) with:
$ apt-get purge libpq-dev libpq5 postgresql postgresql-client postgresql-common

Pay attention to messages, especially those warning about directory names containing 'postgres'. Take appropriate action to remove those directories.

Then clean up if you want to:
$ apt-get clean

Note that libpq is part of Postgres. Edit where Debian gets packages:
$ nano /etc/apt/sources.list

Include backports by adding:
  deb http://backports.debian.org/debian-backports squeeze-backports main

Ruby needs package 'libpq-dev' to connect to Postgres. Get the latest backported Postgres packages:
$ apt-get update
$ apt-get upgrade
$ apt-get -t squeeze-backports install postgresql libpq-dev

Expect to see the message, 'Configuring postgresql.conf to use port 5432'. That is the proper port for PostgreSQL, *not* port 5433, which can come about if Debian gets confused.

Automatically, installation should start the Postgres server—look at what's actually running to confirm the port:
$ ls -a /var/run/postgresql

Instead of 5433 you should see:
  .s.PGSQL.5432

Minimally alter one of Postgres's configuration files to accept (app-specific) connections from Rails...:
$ nano /etc/postgresql/9.1/main/pg_hba.conf

by changing 'peer' to 'md5' where it says:

  # "local" is for Unix domain socket connections only
  #local   all         all                               peer
  local   all         all                               md5

Restart the Postgres server:
$ sudo /etc/init.d/postgresql stop
$ sudo /etc/init.d/postgresql start

New app


Now that you have Postgres installed, you can use it to create a new Rails app (which I'll call, 'APP'; replace this with something in lower case):
$ rails new APP -d postgresql; cd APP

Tell Rails you'll be using a database password from environment variables:

$ nano config/database.yml

Do this by changing the relevant lines (without moving them) to:

database:   <%=   ENV['DATABASE_USERNAME']   %>_development
database:   <%=   ENV['DATABASE_USERNAME']   %>_test
database:   <%=   ENV['DATABASE_USERNAME']   %>_production

username:   <%=   ENV['DATABASE_USERNAME']   %>
password:   <%=   ENV['DATABASE_PASSWORD']   %>

Repeat the username and password lines three times, once for each Rails environment.
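
For instance, the development block might end up looking like this (a sketch; Rails generates the surrounding keys):

development:
  adapter: postgresql
  encoding: unicode
  database: <%= ENV['DATABASE_USERNAME'] %>_development
  pool: 5
  username: <%= ENV['DATABASE_USERNAME'] %>
  password: <%= ENV['DATABASE_PASSWORD'] %>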

Decide upon (or generate) a new database password for your app. Create the two environment variables above and set them somehow. If you're using foreman, you can set these in your .env file.
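
For instance, the .env file might contain (placeholder values):

DATABASE_USERNAME=app
DATABASE_PASSWORD=your-generated-password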

Unfortunately, Rails wants to drop the whole test database, not just its tables. Because it seems difficult to change this, we'll let Rails handle database creation:

Create the app's safe Postgres user. This asks you to enter your new app's password twice:
$ sudo -u postgres createuser --echo --encrypted --pwprompt --no-superuser --no-inherit --createdb --no-createrole APP

(If you made a mistake):
$ sudo -u postgres dropuser APP

Confirm your new app is included in the list of existing databases and users (called 'roles'):
$ sudo -u postgres psql
=> \dg
=> \l
=> \q

Rails should now be working. Test it by something like:
$ foreman run bundle exec rake db:create:all

If Rails merely complains that the three databases already exist, then this setup is working fine.

If Rails didn't work and the final error message started 'Couldn't create database' then scroll up: if you see...:

  could not connect to server: No such file or directory
  Is the server running locally and accepting connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

then go back and redo the above part regarding installing for the correct port 5432 (not 5433).

If after scrolling you see:

  password authentication failed for user {APP}

then go back and redo the above part regarding creating the 'role' (user) for the new app.

Rake


To run tests, with foreman setting up your local environment, remember that you should be running rake by:

$ foreman run bundle exec bin/rake ...

E.g., to run database migrations, do:

$ foreman run bundle exec bin/rake db:migrate

(And similarly for some other commands.)

Heroku


Heroku (e.g.) rewrites your database.yml to use only the single value in DATABASE_URL, which makes it effectively your production database, so beware. Don't set the DATABASE_URL environment variable (locally) in your .env file.

If you want to troubleshoot Heroku's access to your database with a local setup closer to Heroku's method, make a script (call it: my-foreman) containing this:

#!/bin/sh
environment=$1
shift 1
export DATABASE_URL=\
postgres://$DATABASE_USERNAME:$DATABASE_PASSWORD@\
localhost:5432/$DATABASE_USERNAME'_'$environment
foreman start "$@"

Then you can run foreman in your desired environment with (e.g.)

./my-foreman production -p 5001

Alternative


Alternatively, you could take this approach:
* Make your own Unix username be a Postgres superuser;
* Keep the local authentication line in pg_hba.conf as,  'peer'; and
* Change the database username in new Rails apps to your Unix username.

But that approach is less safe, especially across apps. If you make a mistake in a new app, you don't want it to overwrite the database of another of your apps.

References

http://backports-master.debian.org/Instructions/
http://railscasts.com/episodes/342-migrating-to-postgresql?view=asciicast
http://wiki.debian.org/Backports
http://wiki.debian.org/PostgreSql#Installation
http://www.jquantlib.org/index.php/Upgrading_PostgreSQL_8.2_to_8.3_on_Debian
http://www.postgresql.org/docs/9.1/interactive/install-short.html
http://www.postgresql.org/docs/9.1/static/auth-methods.html
http://www.thegeekstuff.com/2009/04/linux-postgresql-install-and-configure-from-source/
http://xtremekforever.blogspot.com/2011/05/setup-rails-project-with-postgresql-on.html

Copyright (c) 2012 Mark D. Blackwell.

Saturday, March 31, 2012

Offline Rdoc generation, howto

Recently, I wanted to work with a Ruby gem called 'devise'.

Like many people installing gems, usually I don't include their Rdoc because of the time it takes.

My Internet connection happened to be down, so for my already installed gem devise, I wanted to generate its Ruby documentation (Rdoc) offline. The documented command for this unfortunately wanted to generate the Rdoc for all my gems:
rdoc {gem-name}

This happened with these versions:
ruby 1.9.2-p290
rubygems 1.8.10
rdoc 3.12


I found this workaround to generate a single gem's Rdocs:
cd ~/.rvm/gems/{ruby-version}/gems/{gem-name}-{version}
rdoc .
mv doc/* ~/.rvm/gems/{ruby-version}/doc/{gem-name}-{version}/rdoc
cd ~
gem server
# In SeaMonkey:
open localhost:8808
click {gem-name}
click 'rdoc'


Copyright (c) 2012 Mark D. Blackwell.

Thursday, March 1, 2012

Bret Victor - Inventing on Principle

Bret Victor's recent, extremely interesting and captivating video for creators of all kinds is based on a guiding principle: "Creators need an immediate connection." (His title is 'Inventing on Principle'.)

His examples are visually amazing:

Coding a picture (2:46 to 9:39)
Coding a game (12:15 to 14:22)
Discovering new games (15:01 to 16:24)
Coding an algorithm (18:05 to 22:39)
Circuit design (23:02 to 28:05)
Making a video (29:19 to 34:07)
General (34:07 to 36:47)

The rest is interesting in a different way--he talks about principle and personal identity (36:47 onward). I think he organizes his life by means of the first choice, though he says they can be combined. (And in what follows, I paraphrase.)

Choices:
* Stand for a guiding principle; fight for a cause
* Craftsman
* Problem solver

Examples in software:

Larry Tesler (PARC) 38:07 to 44:22
* Vision: "Personal computing."
* Guiding principle: "No person should be trapped in a mode."

People who fight for a guiding principle, unlike Thomas Edison, are not well described primarily as inventors.

A guiding principle embodies a specific nugget of insight. Both Tesler and Elizabeth Cady Stanton (see next):
* Recognized a cultural wrong;
* Envisioned a world without that wrong; and
* Dedicated themselves to fighting for a principle.

Elizabeth Cady Stanton 43:32 to 44:22
* Goal, vision and guiding principle: "Women should vote."

Doug Engelbart 44:22 to 45:25
* Goal: "Enable mankind to solve the world's urgent problems."
* Vision: "Knowledge workers, using complex powerful information tools, which harness their collective intelligence."
* Guiding principle: "Interactive computing."

Alan Kay (PARC) 45:25 to 46:35
* Goal: "Amplify human reach and bring new ways of thinking to a faltering civilization that desperately needs it."
* Vision: "If children became fluent in thinking in the medium of the computer, then they'd become adults with new forms of critical thought and new ways of understanding the world, and we'd have a more-enlightened society, similar to the difference brought by literacy."
* Guiding principle: "Children, fluent in the medium of the computer."

Everything Kay did, and invented, came out of pursuing this guiding principle (vision and goal) with children, following principles that he adopted from Piaget, Montessori, [Papert] and Jerome Bruner. (See also Discovery learning.)

[BTW, this puts _why the lucky stiff's interest in programming by children (via Hackety Hack) into context.]

Richard Stallman 46:45 to 47:10
* Goal, vision and guiding principle: "Software must be free."

Copyright (c) 2012 Mark D. Blackwell.

Thursday, February 16, 2012

Methods as first-class objects in Ruby, howto

Recently, I became interested in the general concept of methods as first-class objects in Ruby.

Often, this is useful because such objects delay their execution beyond argument-list evaluation time.

Procs and lambdas exist, but I noticed that lambdas are slower than ordinary methods.

The following is a scheme for simply using modules as first-class method objects.

Here are some methods to pass around, using the scheme:

module Double; def self.call(x) x * 2 end; end
module Triple; def self.call(x) x * 3 end; end
double, triple = Double, Triple

You can invoke them in your Ruby code with a syntax similar to that of lambdas:

p triple.call 'a' #=> "aaa"

If you care about lambda's bracket syntax, you have to add:

module Triple; instance_eval { alias :[] :call }; end
p triple['a'] #=> "aaa"

Here's how to use this module scheme to pass methods as first-class objects:

double=Double
def using_module(m) m.call(1) end
using_module(double)

Some testing:

methods=[Double,Triple]
def take_both(source,m) m.call(source) end
a=[[1],'a'].product(methods).map{|source,method| take_both source, method}
p a #=> [[1, 1], [1, 1, 1], "aa", "aaa"]

Here are some of the alternatives:

module Methods; def self.double(x) x * 2 end; end
def using_symbol(m) Methods.send(m,1) end
using_symbol(:double)

double=lambda{|x| x*2}
def using_lambda(m) m.call(1) end
using_lambda(double)

I tested the speed of the various alternatives (in the 1.9.3, 1.9.2 and 1.8.7 versions of Ruby).
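
A benchmark along these lines can reproduce the comparison (a sketch, not my original test script; it assumes the definitions above):

require 'benchmark'

n = 1_000_000
l = lambda { |x| x * 2 }

Benchmark.bm(8) do |b|
  b.report('module') { n.times { using_module Double } }
  b.report('symbol') { n.times { using_symbol :double } }
  b.report('lambda') { n.times { using_lambda l } }
end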

The result? In 1.9.3, the scheme is about fifteen percent faster than passing symbols, which in turn is about twenty percent faster than lambdas.

Copyright (c) 2012 Mark D. Blackwell.

Tuesday, January 17, 2012

`My song is love unknown' (hymn)

Recently, I heard a moving hymn, `My song is love unknown' (1664) by Samuel Crossman (1624-1683) and, searching the web, found this post by Rupert Christiansen (in the U.K.'s The Telegraph) for a story behind it.

Actually, what moved me was its tune, Love Unknown (1918) by John Ireland (1879-1962). Set to that tune, some contemporary churches better know the words, `Oft when of God we ask' by English Congregational minister Thomas Toke Lynch (1818-1871, more here).

During the 254 years that passed before Ireland wrote his, I wonder which tune the Crossman hymn used? There's currently no answer in Wikipedia's article. I found a list of alternate tunes for the text, but all are of poorer quality IMO for the words compared to Ireland's; it seems no wonder he was inspired to compose it, perhaps!

BTW, Crossman as a family name, one might speculate, would have suggested thinking about religion to a boy. However, that speculation remains unconfirmed by Wikipedia's article on Crossman.

Copyright (c) 2012 Mark D. Blackwell.

Tuesday, January 10, 2012

Lisp & Ruby metaprogramming

I just reread Paul Graham's article, Beating the Averages, on Lisp being the most powerful computer programming language because of Lisp macros (which, BTW, are not like assembly-language macros). It led me to the obvious perception that, because Ruby lacks Lisp macros, metaprogramming in it is weaker than in Lisp.

He explains the essence of Lisp macros: '[I]n general, for application software, you want to be using the most powerful ...language you can get, and using anything else is a mistake. ...Lisp code, after it's read by the parser, is made of data structures that you can traverse. If you understand how compilers work, [in Lisp you] write programs in the parse trees that get generated within the compiler when other languages are parsed. But these parse trees are fully accessible to your programs. You can write programs that manipulate them. In Lisp, these programs are called macros. [P]ower ...refers to features you could only get in [a] less powerful language by writing an interpreter for [a] more powerful language in it.'

So, for someone who wants to know where to go next after Ruby, and who thinks that a high proportion of metaprogramming code in a Ruby code base (as some say Rails 3 has these days) results in awkwardness, the next step is Lisp. Lisp, apparently, is the better metaprogramming language.

Copyright (c) 2012 Mark D. Blackwell.