Saturday, July 28, 2012

Select online wikis and discussion boards for startups

GitHub has no email notification for wiki changes. That's not very good! But then, they specialize in source control. For more:

http://rants.arantius.com/github-sucks

GitHub's repository hooks could provide wiki change notifications, but only by setting up another server and programming it. Certainly a startup doesn't have time for that:

http://developer.github.com/v3/repos/hooks/

The old Google Groups was bad. And not even the new Google Docs has change notification.

So today I researched online wikis and discussion boards for startups, and found a helpful Stack Exchange answer.

Zoho's products seem excellent to me! Their collaboration apps are widely used and integrate with Google Docs. Both their wiki and their discussion forum seem quite good (and cheap):

http://www.zoho.com/collaboration-apps.html

Zoho Wiki: 'access controls' (private?), notifications, and comment threads. Free of charge for up to three users; $12 per month for four users:

http://www.zoho.com/wiki/wiki-pricing.html
http://www.zoho.com/wiki/wiki-notifications.html
http://www.zoho.com/wiki/enterprise-level-security.html
http://www.zoho.com/wiki/google-apps.html

Zoho online forums: private (via Google Docs integration). Notifications, though I don't know whether they're universal. One forum is free of charge:

http://discussions.zoho.com/
http://www.zoho.com/discussions/features.html
http://www.zoho.com/discussions/intranet-discussions.html
http://www.zoho.com/discussions/features.html#topicadministration
http://www.zoho.com/discussions/solutions.html
https://www.google.com/enterprise/marketplace/viewListing?productListingId=2533+14374228673760475061

--------------
Other online wiki sites:

Wikidot: no Google Docs integration. Non-public sites available: private for $50 per year ($4 per month equivalent), or free of charge with advertising:

http://www.wikidot.com/plans
http://www.wikidot.com/faq:private-sites

--------------
Other online forum sites:

ProBoards: $7 per month for ad-free. Oriented to public access; seems somewhat disreputable:

http://www.proboards.com/premium-forum-features

QuickTopic: $49 per year ($4 per month equivalent):

http://www.quicktopic.com/gopro?ref=faq

Teamlab: No pricing found!

Wetpaint: No access control.

Wikispaces: $20 per month for restricted access.

Wikimatrix (for comparing wiki software) claims to cover hosted wikis, but its entries seem mostly to be installable software. Running one's own wiki server might not be worth it:

http://www.wikimatrix.org/wizard.php?d[branding]=&d[domain]=&d[flag]=2&d[language]=&d[support]=&d[wysiwyg]=yes&d[history]=yes&d[go]=1&x=77&y=14

Copyright (c) 2012 Mark D. Blackwell.

Tuesday, July 24, 2012

Getting started with Ajax live updating, howto

Complexity is the enemy of troubleshooting. Remove as many dimensions as possible—it helps in a big way!

This applies no less to creating your first live-updated (Ajax) web page. (These are the familiar kind of web pages that change their appearance without full page refreshes.)

The chosen web stack also makes recommendations of its own. For getting Ajax to work, the dimensions of complexity include:
  • Understanding what normally happens on the server during Ajax, and getting that to work, especially in an evolving web stack like Rails. (Googling turns up obsolete information.)
  • Learning the details of the recommended browser language (which compiles into Javascript); for Rails this is now CoffeeScript.
  • Learning the details of the particular Javascript library; Rails now recommends jQuery.
  • Getting the chosen Javascript library into the browser from an external server (which might silently fail).
  • Learning how to do Ajax in the chosen Javascript library (e.g., jQuery).
  • Learning how to do Ajax in raw Javascript across a combination of browsers (using try/catch).
  • Learning how to do Ajax in raw Javascript in a single, chosen browser (e.g., via XMLHttpRequest).
  • Learning to troubleshoot Javascript in a browser.
  • Learning the (browser) Document Object Model.
  • Learning Javascript.
For getting first-time Ajax working, all but the last four are unnecessarily complicated dimensions for troubleshooting.

Obviously if Ajax isn't working and nothing's happening, one had better remove all possible dimensions and get one thing working at a time!

So I have written a simple Ajax testbed server for the cloud:
  • It is always running.
  • It is extremely simple.
  • It doesn't have to be installed or configured by you.
  • It responds absolutely identically to:
    • GET and POST requests.
    • Header-specified HTML, JSON, XML and JS requests.
    • URL extension-specified HTML, JSON, XML and JS requests.
  • It simply returns the same JSON no matter what you do.
The testbed is running. Fork it on GitHub! Or tell me how it's not working for you.
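The deployed testbed's code isn't reproduced here, but a server meeting those requirements can be sketched as a bare Rack-style app (the response body below is an assumption for illustration, not the testbed's exact JSON):

```ruby
require 'json'

# A Rack-compatible app: it ignores the HTTP method, the Accept header,
# and any URL extension, and always returns the same JSON body.
AJAX_TESTBED = lambda do |_env|
  body = JSON.generate('message' => 'Hello from the Ajax testbed!')
  [200,
   { 'Content-Type' => 'application/json',
     # Allow cross-origin requests from any page under test:
     'Access-Control-Allow-Origin' => '*' },
   [body]]
end

# Direct invocation with a minimal Rack env hash; no webserver needed:
status, headers, body = AJAX_TESTBED.call('REQUEST_METHOD' => 'POST',
                                          'PATH_INFO' => '/ajax.xml')
```

Because the lambda never inspects its env, identical responses to GET/POST and to any header or extension fall out for free; in production the same object would simply be handed to `run` in a config.ru.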

Here's some browser client code to match (it runs on Mozilla's browsers):

<!DOCTYPE html>
<html> <head>
<title>bare Ajax</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<script type="text/javascript">
<!--

//Browser support code:

function getAjaxObject(){
  var result;
  try{ //Gecko: Mozilla, Firefox; Webkit: Chrome, Safari; ?: Opera; (etc.?) browsers:
    result = new XMLHttpRequest();
  } catch (e){ //Internet Explorer browsers:
    try{
      result = new ActiveXObject('Msxml2.XMLHTTP');
    } catch (e) {
      try{
        result = new ActiveXObject('Microsoft.XMLHTTP');
      } catch (e){ //Something went wrong:
        alert('Unable to obtain an Ajax object.');
        return false;
      }
    }
  }
  return result;
}

//Non-browser support code:

function targetId() {
  return 'replaceable';
}

function Target(id) {
  this.id = id;
}

function alterTarget(s) {
  var target = new Target(targetId());
  var elem = document.getElementById(target.id);
  elem.innerHTML = s;
}

function requestAjaxUpdate() {
  var req = getAjaxObject();
  //req.open('GET', 'http://localhost:5000/ajax',false);
  req.open('GET', 'http://ajax-testbed-simple.herokuapp.com/ajax',false); //Synchronous.
  req.send();
  //alert(req.status);
  var myJSON = JSON.parse(req.responseText);
  var s = myJSON.message;
  alert(s);
  alterTarget(s);
}

// When the DOM is fully loaded:
window.document.addEventListener('DOMContentLoaded', function() {
  //alert('window document ready');
}, false);

// When images, etc. are fully loaded:
window.onload = function() {
  //alert('window load');
  requestAjaxUpdate();
};

//-->
</script> </head> <body>

<div id="replaceable">replaceable-content</div>

</body> </html>

References:

http://ajaxpatterns.org/XMLHttpRequest_Call
https://developer.mozilla.org/en/AJAX/Getting_Started
https://developer.mozilla.org/en/DOM/
https://developer.mozilla.org/en/DOM/About_the_Document_Object_Model
https://developer.mozilla.org/en/DOM/DOM_event_reference/DOMContentLoaded
https://developer.mozilla.org/en/DOM/XMLHttpRequest
https://developer.mozilla.org/en/DOM/XMLHttpRequest/Using_XMLHttpRequest
https://developer.mozilla.org/en/DOM_Client_Object_Cross-Reference/DOM_Events
https://developer.mozilla.org/en/JavaScript/
https://developer.mozilla.org/en/JavaScript/A_re-introduction_to_JavaScript
https://developer.mozilla.org/en/JavaScript/Guide
https://developer.mozilla.org/en/JavaScript/Reference/
https://developer.mozilla.org/en/JavaScript_technologies_overview
https://developer.mozilla.org/en/Server-Side_Access_Control
http://en.wikipedia.org/wiki/Ajax_(programming)
http://en.wikipedia.org/wiki/JSON
http://en.wikipedia.org/wiki/XMLHttpRequest
http://linuxgazette.net/123/smith.html
http://molily.de/weblog/domcontentloaded
http://stackoverflow.com/questions/1457/modify-address-bar-url-in-ajax-app-to-match-current-state
http://stackoverflow.com/questions/6410951/how-to-simplify-render-to-string-in-rails-3
http://www.hunlock.com/blogs/Mastering_JSON_(_JavaScript_Object_Notation_)
http://www.json.org/js.html
http://www.mousewhisperer.co.uk/ajax_page.html
Ajax Tutorial (tizag.com)
http://www.w3.org/TR/XMLHttpRequest/
http://www.w3schools.com/json/default.asp
http://www.w3schools.com/xml/xml_http.asp
http://www.xml.com/pub/a/2005/02/09/xml-http-request.html

Copyright (c) 2012 Mark D. Blackwell.

Wednesday, July 18, 2012

Back up your KeePass database daily automatically (with timestamp), howto

Many people place their KeePass database in a Dropbox folder.

It's a good idea, but doesn't it seem just a little dangerous? Who knows—resynchronization might lose a password you just created, or worse.

So I wrote a utility which backs up your KeePass database with a timestamp, every time you start your computer.

It's for Windows users. I've been using it safely since March, 2011.

Just now, I realized I could improve its usability by removing, from its installation instructions, a difficult step: setting up the shortcut (formerly a show-stopper). It now uses Ruby from your PATH, instead of the kluge of a fixed location the user had to manage.

I've made the change, and it's now on GitHub.
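The utility itself is on GitHub; its core idea, copying the database to a timestamped file at startup, can be sketched like this (the paths in the comment are placeholders, not the utility's actual configuration):

```ruby
require 'fileutils'

# Copy a KeePass database to a backup directory, appending a timestamp
# so that successive backups never overwrite one another.
def backup_keepass(database, backup_dir)
  FileUtils.mkdir_p(backup_dir)
  stamp  = Time.now.strftime('%Y-%m-%d-%H%M%S')
  base   = File.basename(database, '.*')   # e.g. "passwords"
  ext    = File.extname(database)          # e.g. ".kdbx"
  target = File.join(backup_dir, "#{base}-#{stamp}#{ext}")
  FileUtils.cp(database, target)
  target
end

# Run from a shortcut in the Windows Startup folder, e.g.:
#   backup_keepass('C:/Users/me/Dropbox/passwords.kdbx', 'C:/Backups/KeePass')
```

Since the timestamp is part of the filename, a bad Dropbox resynchronization can clobber at most the live copy, never the dated backups.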

Copyright (c) 2012 Mark D. Blackwell.

Saturday, July 7, 2012

Manage long-running external webservice requests from Rails apps (on cloud servers), howto

Case: as long as Rails is synchronous, requests to external webservices drive server resource use to impossible levels, even when the webservices behave normally, let alone when they are long delayed.

Plan: two web apps (one Rails, the other async Sinatra) can fairly easily manage the problem of external web service requests by minimizing use of server resources—without abandoning normal, threaded, synchronous Rails. The async Sinatra web app can be a separate business, even a moneymaking one.

This solution uses RabbitMQ, Memcache and PusherApp.

The async Sinatra web dynos, on the one hand, act as brokers for external webservice requests; they also have browser-facing functionality for signing up webmasters.

The Rails web dynos, on the other hand, don't wait for external webservices, and they aren't short-polled by browsers.

This attempts to be efficient and robust. It should speed up heavily loaded servers while remaining within the mainstream of the Rails Way as much as possible.

E.g., it tries hard not to Pusherize browsers more than once when a cached external webservice response was missed, but relies on browser short-polling, after perhaps a 10-second timeout, to cover these and other unusual cases.

But in the normal case browser short-polling will be avoided so Rails server response time should be peppy.

It tries to delete its temporary work from memcache but even if something is missed, memcache times out its data eventually so too much garbage won't pile up there.

Note: this is for web services without terribly large responses (thus appropriate for memcaching). Very large responses and non-idempotent services should be handled another way such as supplying them directly to the browser.

Method: the Rails web app dynos immediately use memcached external webservice responses if the URLs match.

Otherwise they push the URL of each external webservice request and an associated PusherApp channel ID (for eventually informing the browser) to a RabbitMQ Exchange.

For security purposes, minimal information is passed through PusherApp to the browser (only suggesting a short-poll now, not where).

The Rails web dynos (if necessary) return an incomplete page to the browser as usual (for completion with AJAX).

To cover cases where something got dropped, the browser should short-poll the Rails app after a longish timeout; its length should be set by an environment variable, and may be shortened to half a second when the Rails website is not terribly active, or when the async Sinatra web dynos are scaled down to off.

Each async Sinatra web dyno attaches a queue to the Rails app's RabbitMQ exchange for accepting messages without confirmation.

With each queued message, an async Sinatra web dyno:
  1. Checks the memcache for the external webservice request (with response)—if present, it:
    • Drops the message. (Some may slip through and be multiply-processed, but that's okay.)
    • Frees memcache of the request (without response) if it still exists (see below).
    Otherwise it checks the memcache for the external webservice request—without response. If recently memcached (perhaps within 10 seconds) it drops the message. (Some may slip through and be multiply-processed, but that's okay.)
    Otherwise it makes the request to the external webservice, setting a generous response timeout (maybe 60 seconds).
  2. Memcaches the external webservice request (without response) with the current time (not in the key).
  3. If the request times out, drops it in favor of letting the browser handle the problem, but leaves the memcached external webservice request (without response) for later viewing by async Sinatra web dynos.
  4. (Usually) receives a response from the external webservice request.
  5. Again checks memcache for the external webservice request (combined with the same response). If it's not there:
    • Pusherizes the appropriate browser. (Some requests may be multiply-processed, but that's okay.)
    • Memcaches the external webservice request (with response).
    • Clears from memcache the external webservice request without response.
The browser then asks the Rails web dyno to supply all available AJAX updates.

The Rails web dyno returns to the browser a (usually incomplete) set of the still-needed AJAX responses: whatever is memcached. (Some may have been dropped, but that's okay.) The page is then further completed with AJAX.

Or (if all were memcached) the Rails web dynos return the complete set of outstanding AJAX responses to the browser.
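The per-message steps above can be sketched in plain Ruby (a Hash stands in for memcache, and the `fetch` and `pusherize` callables stand in for the external webservice and Pusher; all names are invented for illustration):

```ruby
PENDING_WINDOW = 10 # seconds: treat a request as in-flight for this long

# Process one queued message: a webservice URL plus a Pusher channel ID.
def process_message(cache, url, channel_id, fetch:, pusherize:)
  if cache.key?([:response, url])         # 1. Response already cached:
    cache.delete([:pending, url])         #    free the pending marker,
    return :dropped                       #    and drop the message.
  end
  started = cache[[:pending, url]]        # 1. Recently marked in-flight?
  return :dropped if started && Time.now - started < PENDING_WINDOW

  cache[[:pending, url]] = Time.now       # 2. Mark the request in-flight.
  response = fetch.call(url)              # 3-4. Call the external service.
  return :timed_out if response.nil?      # 3. On timeout, leave the marker.

  unless cache.key?([:response, url])     # 5. First responder wins:
    pusherize.call(channel_id)            #    notify the browser,
    cache[[:response, url]] = response    #    cache the response,
    cache.delete([:pending, url])         #    and clear the pending marker.
  end
  :done
end
```

Note that duplicates which slip past the checks merely repeat idempotent work, matching the "that's okay" tolerance above.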

I'm starting to implement this here, now.
Copyright (c) 2012 Mark D. Blackwell.

Thursday, July 5, 2012

User-story presence flags ease split-test metrics for lean startups, howto

This morning I read 'Measure', a chapter of the book, The Lean Startup. It discusses cohort analysis, split-testing, and the triple-A of metrics: Actionable, Accessible and Auditable.

Then I got an idea regarding split-testing the user-story stream in website development.

For split-testing newly deployed stories, it's easy to include in the logs a growing bitstring with one indicator per user story. Each indicator marks the story's presence (with/without, or after/before), and its position implicitly gives the story's ordinal number (perhaps from PivotalTracker). All are kept in the central place in the source code usually used for configuration.

Packed together by story number using standard Base64 encoding, each log line includes them as a short string. (Each Base64 character carries six story flags, so even hundreds of stories take only a short run of characters.)
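A sketch of such packing (the helper names are invented; 'standard Base64' means the usual alphabet, effectively six flag bits per character):

```ruby
require 'base64'

# Pack story-presence flags (indexed by story ordinal, 0-based) into a
# compact Base64 string for inclusion in each log line.
def pack_story_flags(flags)
  bytes = Array.new((flags.length / 8.0).ceil, 0)
  flags.each_with_index do |present, i|
    bytes[i / 8] |= (1 << (i % 8)) if present
  end
  Base64.strict_encode64(bytes.pack('C*'))
end

# Recover the flags from a logged string, given the story count.
def unpack_story_flags(encoded, count)
  bytes = Base64.strict_decode64(encoded).unpack('C*')
  (0...count).map { |i| (bytes[i / 8] >> (i % 8)) & 1 == 1 }
end
```

The decoder is what makes the log data auditable later: filtering log lines by a story's bit recovers exactly which feature configurations each record came from.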

With current aggregated logging, remembering which log records came from which active set of stories might be difficult. But at the first level, this method eases split-testing the impact of each newly deployed story.

Going deeper, the flags in the logs categorize the comparison data cleanly and safely, especially if we ever want something more complex (in the current context)—such as to reassess an old story. To disable an earlier story, some special programming is required, but our log data will indicate clearly which stories are active.

For split-testing, we can filter the log data by these story-presence strings. We can split-test for various configurations (of user stories), new-user activation (or usage rates or whatever we desire).

Perhaps we might want to remove an old feature, and split-test that, before we expend the effort to develop an incompatible new feature—good idea? And arbitrary configurations of features can be split-tested.

Copyright (c) 2012 Mark D. Blackwell.

Monday, July 2, 2012

Scale Rails in the cloud by handling external webservice delays, howto

Background:

With Rails, each simultaneous browser request holds server memory until Rails responds; for unthreaded Rails, this is several tens of MB. For threaded Rails or JRuby, it's still substantial.

Webservers often (or usually) queue requests for a relatively small number of Rails instances, perhaps a dozen or two.

Freezing (in I/O wait) thousands of Rails instances for long-delayed webservices (located elsewhere on the Internet) is impracticable.

A naive Rails website design (without AJAX) which obtained all needed results from external services first (before responding to the browser) would provide little server throughput.

For website scalability, developers usually offload to other worker programs what, in a naive or low-traffic design, Rails instances themselves might do.

In more scalable and sophisticated designs, Rails responds rapidly to initial requests (thus freeing its instance). Then, AJAX finishes the webpage (i.e., the browser polls the server).

Case:

It is a generally established software principle that events are better than polling.

Webpage content delivered by AJAX can be short-polled from the browser. However, short-polling gives unnecessarily slow response. From a user experience (UX) perspective it is either too slow, especially at peak times, or slower than it could be. Also, short-polling loads up servers with extra running Rails instances (maybe queued up), yet ultimately most such requests determine, regrettably, that there is nothing new.

Furthermore, during each polling request Rails reacquires all relevant information from its database cache (due to the stateless nature of the web). Therefore, each Rails short-polling response takes a terrible amount of server resources, yet often only calculates the time for the next poll.

Plan:

Long-polling (or websockets) should be used by Rails websites which access external services.

This can be accomplished efficiently if the browser long-polls not Rails but an asynchronous webserver based on (Ruby) EventMachine and running something other than Rails: for instance, asynchronous Sinatra::Synchrony.

A RabbitMQ exchange (in a cloud environment such as Heroku) can then provide each front-facing asynchronous server instance (dyno) with a queue carrying the desired information about the worker programs performing tasks that Rails instances offload.

Because the notifications don't need to be stored permanently, it's better to use a message system than a database.

Probably the information would simply be notification that each single task was complete. The exchange would connect all worker instances to all server instances, in an instance-agnostic design typical of the cloud. A notification would include the user's session ID; each asynchronous webserver could then filter the notifications down to the presently active long-polling connections it owns. Browsers can provide the session IDs (perhaps in a header) while setting up the long poll.
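The session-ID filtering step can be sketched in plain Ruby (an in-memory stand-in: `open_long_polls` maps session IDs to this instance's held connections, and the connection objects here are just callables, not real sockets):

```ruby
require 'json'

# Each async webserver instance holds its own open long-poll connections,
# keyed by session ID. On each task-complete notification from its
# RabbitMQ queue, it responds only to connections it owns itself.
def handle_notification(open_long_polls, raw_message)
  message    = JSON.parse(raw_message)
  session_id = message['session_id']
  connection = open_long_polls.delete(session_id)
  return :not_mine if connection.nil?  # Another dyno owns this session.
  connection.call(:task_complete)      # Return from the long poll; the
  :answered                            # browser then short-polls Rails.
end
```

Since every instance's queue sees every notification, dropping messages for sessions an instance doesn't own is the normal case, not an error.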

Normally a webserver will keep long-poll HTTP connections open for quite a long time; however, if for any reason a connection has been broken, it doesn't matter much; the RabbitMQ queue (configurably) doesn't keep a message once the webserver has received it (so they won't pile up anywhere). Also if the webserver is restarted, the old queue automatically will be deleted.

This is because, in RabbitMQ, (named) exchanges actually manage the flow of messages; each receiver creates a queue of its own (from that exchange) which receives all the messages (applying to all server instances on the website).

Receipt confirmation also is unnecessary. If some messages might be dropped when the server is busy, so what? Nothing much bad will happen further; in that case the user may refresh the webpage—so the scheme is quite tolerant of individual failures.

After getting the new messages, the asynchronous webservers merely return from those long polls. After return, AJAX in the browser knows it can make a short-poll Rails request and be guaranteed of receiving more information for the webpage. Even if the connection merely is broken accidentally, the overall process is still safe, and will result in valid short-poll AJAX results. In other words (for simplicity), the normal way Rails responds to AJAX short-polling should not change.

This cycle of long-poll, short-poll should automatically repeat till Rails effectively tells the AJAX code (in the browser) all the work is done for the page—i.e., till no worker jobs are queued.

Perhaps (the default schedule of) AJAX short-polling can most easily be put off by increasing (to some large value) the time delay on the first short-poll. Presumably this is configurable in Rails. Long-polling of the other (asynchronous) webserver should be added to Rails page view layouts.

Thin (and Ruby EventMachine) are asynchronous and non-blocking, just like Node.js. They can accept thousands of simultaneous HTTP connections to browsers, each consuming only a few KB of memory. That Thin is based partly on Ruby EventMachine demonstrates the latter's quality.

The job queue for Rails worker programs also probably should be a RabbitMQ exchange, since we're using it.

Various other asynchronous Ruby servers include Cramp, Goliath, Rainbows! and Puma.

Actually, probably the best asynchronous webserver for this purpose is the paid service Pusher (or the open-source Slanger equivalent, to keep it in-house).

References:

blog.headius.com/2008/08/qa-what-thread-safe-rails-means.html
confreaks.com/videos/727-rockymtnruby2011-real-time-rack
github.com/igrigorik/async-rails/
github.com/igrigorik/em-synchrony
github.com/jjb/threaded-rails-example
jordanhollinger.com/2011/04/22/how-to-use-thin-effectivly [sic]
www.igvita.com/2009/05/13/fibers-cooperative-scheduling-in-ruby/
thechangelog.com/post/927103350/episode-0-3-1-websockets
www.igvita.com/2010/03/22/untangling-evented-code-with-ruby-fibers/
www.igvita.com/2010/06/07/rails-performance-needs-an-overhaul/
www.igvita.com/2011/03/08/goliath-non-blocking-ruby-19-web-server/
www.tumblr.com/tagged/pusher?before=1323905509
yehudakatz.com/2010/08/14/threads-in-ruby-enough-already/

Copyright (c) 2012 Mark D. Blackwell.