
Tastypie caching kinda sucks

The problems:
I’m using tastypie on Google’s App Engine, which is preventing me from using Varnish, as suggested on the tastypie caching docs.  
Tastypie’s caching mechanism is based on Django’s caching mechanism, which is mostly designed for hosting web pages rather than APIs.  Django uses a single cache timeout for both memcache and HTTP caching of views, so the two are tied together when using Django’s cache middleware.
As an example of the tastypie/Django interaction:
Django's caching middleware looks for the Cache-Control: max-age=X header to decide how long to cache a view.  Tastypie's NoCache() class doesn't specify a max-age, so the default timeout for the Django installation is used.  This means that the serialized form of the query will be cached by Django in memcache, as well as on the browser/proxies via HTTP caching.
As a default setting for APIs, this kinda sucks because you can’t see the latest updates until the cache expires.
Tastypie’s SimpleCache() is also based on cache timeouts, and there’s no functionality for invalidating caches.
Tastypie has some support for ETags, which is beneficial for saving bandwidth by sending 304 Not Modified responses.  However, it still requires fetching from the database and serializing the results to compare against the ETag hash, so you still pay the request latency and the server processing time.
The solution design:
In an API world, it seems to me that the preferred behavior would be:
1. Show the latest data as soon as possible.  This generally means minimizing HTTP caching.
2. Serve requests from memcache where possible to minimize latency and database load.  This generally means maximizing the time requests live in memcache.
3. As a corollary to #1 and #2, aggressively invalidate items in memcache when resources change.  This allows setting a huge memcache timeout while still showing recent results.
4. Use ETags to minimize bandwidth, but combine this with #2 to minimize hits to the database (see the sketch right after this list).
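
To make #4 concrete, here is a minimal sketch of answering a conditional GET from memcache without touching the database.  The cache key scheme and the fetch_and_serialize() helper are my own placeholders, not tastypie's built-in behavior:

from django.core.cache import cache
from django.http import HttpResponse, HttpResponseNotModified

THIRTY_DAYS = 60 * 60 * 24 * 30

def cached_etag_view(request, resource_uri):
    # Hypothetical keys: one entry for the ETag, one for the serialized body.
    etag = cache.get(resource_uri + ":etag")
    if etag and request.META.get("HTTP_IF_NONE_MATCH") == etag:
        # Answered entirely from cache: no database hit, no body sent.
        return HttpResponseNotModified()

    body = cache.get(resource_uri + ":serialized")
    if body is None:
        body, etag = fetch_and_serialize(resource_uri)  # the only database hit
        cache.set(resource_uri + ":serialized", body, THIRTY_DAYS)
        cache.set(resource_uri + ":etag", etag, THIRTY_DAYS)

    response = HttpResponse(body, content_type="application/json")
    response["ETag"] = etag
    return response
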
Invalidation is probably the most difficult and crucial step.  Tastypie doesn’t cache queries because it’s too hard to predict when resources need to be invalidated (but then SimpleCache doesn’t do any invalidating at all, even for resource fetches).
The constraints:
I’ve found that with my mobile/web e-commerce app, the types of queries that are run are very predictable because I control the front end.
For example, an id__in filter could be written an infinite number of ways:
/api/v1/carcolors/?id__in=[1,2,3]
/api/v1/carcolors/?id__in=[2,1,3]
/api/v1/carcolors/?id__in=3,2,1
etc.
However, because I control the frontend, I know that I’d only use the first query.
Also, since I have a finite number of cars, the number of color sets that I'd query is rather limited.
So when I update a carcolor resource, I know there's a limited number of id__in queries that MUST be invalidated, so I can manually invalidate those.
In this case, I have to manually write the invalidation code for each resource class, but the set of queries is generally fairly limited, and once the boilerplate tools for invalidation are available, it's pretty easy.
The code:
Here's an example cache class and Resource class.
from django.conf import settings
from django.core.cache import cache
from tastypie.cache import SimpleCache

# CachedEtagJsonResource, JsonModelResource, EatPublicMetaBase, Option,
# MenuItem and invalidate_tasty_cache() come from my own project (see the
# notes below).

class SimpleNoHTTPCache(SimpleCache):
    """
    Tastypie's SimpleCache, with HTTP caching disabled and invalidate exposed.
    """
    def cache_control(self):
        return {
            'no_cache': True,
            'max_age': 0,
        }

    def invalidate(self, key):
        return cache.incr_version(key)



class OptionResource(CachedEtagJsonResource):
    class Meta(EatPublicMetaBase):
        queryset = Option.objects.filter(deleted=False)
        filtering = { 'id' : ['in'], 'choices' : ['in'] }
        excludes = ['deleted', 'undelete']
        cache = SimpleNoHTTPCache(timeout=settings.THIRTY_DAYS, varies=[])

    def get_list(self, request, **kwargs):
        if (len(request.GET) == 1 and request.GET.get("id__in")):
            # This represents the cached version
            return super(OptionResource, self)._get_list(request, **kwargs)
        else:
            # Grandparent is uncached
            return JsonModelResource.get_list(self, request, **kwargs)

    def invalidate(self, **kwargs):
        optionid = kwargs.get("optionid")
        menuitems = kwargs.get("menuitems")
        invalidated_strings = {}
        if optionid:
            # Invalidate cache.
            tmpstr = "/api/v1/option/?id__in=[" + str(optionid) + "]"
            self._invalidate_cache('serializedlist', **{"id__in": [unicode("[%s]"%optionid)]})
            self._invalidate_cache('serializedlist', **{"id__in": [unicode(optionid)]})
            self._invalidate_cache('serializeddetails', **{"pk":optionid})
            invalidate_tasty_cache(tmpstr, 'etag')
            invalidated_strings[tmpstr] = True
            if not menuitems:
                menuitems = MenuItem.objects.filter(options=optionid, deleted=False)
        for menuitem in (menuitems or []):
            tmpstr = "/api/v1/option/?id__in=[" + ",".join([str(i) for i in menuitem.options]) + "]"
            if not tmpstr in invalidated_strings:
                self._invalidate_cache('serializedlist', **{"id__in":[u"[%s]" % ",".join([str(i) for i in menuitem.options])]})
                invalidate_tasty_cache(tmpstr, 'etag')
                invalidated_strings[tmpstr] = True
Notes: 
- CachedEtagJsonResource is my base class that handles all the caching of the serialized data and ETags.  It's only slightly modified from the basic tastypie ModelResource class.  It derives from JsonModelResource, which is a standard tastypie ModelResource with the format forced to JSON.
- Each resource class needs to define get_list() to specify which queries should be cached.  Other queries pass through and are uncached.
- SimpleNoHTTPCache() is tastypie's SimpleCache with HTTP caching disabled (no-cache, max-age 0) and an invalidate() method exposed.
- My API isn't fully REST-like, and my resources are modified by other calls, so I expose an invalidate() call on the resource.  Some logic would be needed in the POST/PUT/UPDATE handlers to call invalidate() from there if you're using those to update your resources.
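
The invalidate_tasty_cache() helper used above isn't shown.  A minimal sketch of what it might look like, assuming cached entries are stored under keys built from the request path plus a suffix (the key format is my assumption and has to match however CachedEtagJsonResource builds its keys):

from django.core.cache import cache

def invalidate_tasty_cache(path, suffix):
    # Hypothetical key scheme: "<request path>:<suffix>", e.g.
    # "/api/v1/option/?id__in=[1]:etag".  Deleting the entry forces the next
    # request to regenerate the serialized data and ETag.
    cache.delete("%s:%s" % (path, suffix))
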
If you’ve read all the way here, I’d love feedback.

Uploading SSL certificate from GoDaddy to Google App Engine

Getting an SSL certificate onto Google App Engine was kind of a pain in the ass.  I remember it took a while to generate the certificate signing request to get a cert from GoDaddy, but I don't remember the process.

When renewing the cert, I ran into a problem uploading it to App Engine.  I had kept the original private key file.  I tried downloading the cert from GoDaddy using ‘Other’ as the option.

The cert in this case is actually in the unencrypted PEM format and can be uploaded to GAE as is.

My private key file was generated by openssl and wasn’t in the unencrypted PEM format.  I needed to convert it:

openssl rsa -in privateKey.key -text > private.pem

Addendum:

I noticed that my HTTPS requests were working fine on desktop Chrome, but failing with a bad certificate on Android.  Thankfully, there was a StackOverflow answer to this.  According to the GAE docs, I had to append the intermediate cert provided by GoDaddy to my domain’s cert before uploading to App Engine.  Seems to be fine on Android now.

Sending from gmail on Google App Engine’s dev_appserver

The dev_appserver has a mail stub that can send mail via an SMTP server.  You can use Gmail, but you have to hack the SDK to enable TLS for the SMTP connection.  This just requires adding 3 lines to mail_stub.py in the App Engine SDK.  The annoying bit is that you’ll have to go add this hack each time you update the SDK.

Look for the following lines:

smtp = smtp_lib()
try:
    smtp.connect(self._smtp_host, self._smtp_port)
    if self._smtp_user:
        smtp.login(self._smtp_user, self._smtp_password)

Add 3 lines:

smtp = smtp_lib()
try:
    smtp.connect(self._smtp_host, self._smtp_port)
    smtp.ehlo()
    smtp.starttls()
    smtp.ehlo()

    if self._smtp_user:
        smtp.login(self._smtp_user, self._smtp_password)

(Source: code.google.com)

Wrangling timezones.

You would have figured that dealing with timezones is a common enough task that there are clean libraries to deal with them.  That's mostly true, but there are still edge cases where things get pretty ugly.

Consider the case where I want to build a system where any store can publish their store hours.  As an example, a store in New York is open on Fridays: 11AM-2AM.  I’d like to be able to tell any web viewer whether the store is currently open or closed, whether they are in New York or in Hong Kong.

It sounds simple enough, but it's an annoyingly hairy problem with a non-ideal solution.

It’s possible to use server-side libraries to calculate whether the store is open or closed, but this would increase server load as well as latency.  It would be better to somehow store and transfer the store hours in a static manner so it would be largely cacheable.
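
For reference, here is roughly what the server-side check looks like with pytz.  The store-hours representation is just an illustration (it ignores the day of the week); the point is that this runs on every request, which is what I'd rather avoid:

from datetime import datetime, time
import pytz

def store_is_open(tz_name, open_t, close_t):
    # Current wall-clock time in the store's own time zone.
    now = datetime.now(pytz.timezone(tz_name)).time()
    if open_t < close_t:
        return open_t <= now < close_t
    # Hours that wrap past midnight, e.g. 11AM-2AM.
    return now >= open_t or now < close_t

# The New York store that's open Fridays 11AM-2AM:
is_open = store_is_open("America/New_York", time(11, 0), time(2, 0))
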

The biggest factor is probably the JavaScript Date object, which has a getTimezoneOffset() function that's useful for converting the current local time to UTC, but has no built-in way to convert to an arbitrary destination time zone.

Unfortunately, calculating the destination time isn't as simple as storing the UTC offset for the destination location.  In order to convert to the destination time, we'd need to know the destination time zone, as well as the details of when daylight saving time kicks in for that zone.

There is in fact a JavaScript library to perform this calculation: https://github.com/mde/timezone-js.  However, it relies on a timezone database, and downloading the database adds to the page weight.  One way to deal with this is to break the database into smaller chunks, and then download the appropriate chunk dynamically based on the destination time zone.

Note that Google does provide a time zone API, but it limits use to 2500 calls per day… which might actually work if it's the client calling it.

The most annoying IE bug

I guess since I learned web development on Chrome/Firebug, I've grown used to using console.log().

It turns out IE doesn’t have console.log().

Unless the debugger is enabled.

Whoever designed that made a truly awful decision.

Say you accidentally left some innocuous call to console.log() in your script.  This is no big deal on Firefox or Chrome.  In IE, this throws an exception, so your script stops running and your page behaves badly.  Crap.  You think, damn, IE sucks.

Well, hopefully this is easy to fix: just pull up the debugger.  And then the problem no longer happens, because the call to console.log() now works and no longer throws an exception.

Really?

(Source: stackoverflow.com)

Google Autologin in an Android WebView

The Android browser has a nifty feature that lets you autologin to sites that use Google accounts, using accounts that you’ve already logged into on the phone (at least on Honeycomb+).  This is really convenient, because a user can just click on the “Sign in” button instead of typing in his full account name/password on a crappy mobile keyboard.

Unfortunately, this functionality did not make it into the WebView, so if you're building your own WebView-based app, you're out of luck.  You gotta do it yourself.

Fortunately, Android is open source, and we can look at the Android Browser source, and get the code.  The source is hosted by the Android Open Source Project.  The browser source is here:

https://android.googlesource.com/platform/packages/apps/Browser/

The autologin function is mostly contained within two classes:

DeviceAccountLogin.java

AutologinBar.java

You’ll also need the resource file for the autologin bar:

res/layout/title_bar_autologin.xml

You may also need to copy over res/anim/autologin_enter.xml and res/anim/autologin_exit.xml if you want the autologin bar to animate in and out.  In addition, there’s a handful of strings you’ll need to pull from the Browser project’s string.xml.

Those are pretty much all the pieces needed.  Now comes the hard part: integrating it into your own project.

Integrating the layout was pretty easy for me.  I have a pretty simple layout that just contains my WebView and a progress bar.  In the original Browser project, the autologin bar is integrated with the title bar.  You can look in title_bar.xml and TitleBar.java to get a sense of how it’s instantiated.

All I had to do was to add the ViewStub for the autologin bar to my Layout:

<ViewStub  android:layout="@layout/title_bar_autologin"  android:id="@+id/autologin_stub"  android:layout_width="match_parent" android:layout_height="wrap_content" android:paddingTop="3dip" />

The harder part is the code.  The Browser is pretty well componentized into a View-Controller structure, and different parts of the UI are broken up into their own components, notably tabs and the title bar (which contains the autologin bar).

My own case was much simpler, I just have an Activity containing a WebView, so I took all the Autologin functionality related to the tab, controller, and title bar, and stuck it all in my main activity.  It actually wasn’t too messy.

First, a brief overview of how autologin works in the Android Browser.

  1. When the WebView browses to a Google login page, the WebViewClient receives a callback onReceivedLoginRequest() (see the Browser’s Tab.java for details).
  2. In onReceivedLoginRequest(), we instantiate DeviceAccountLogin and call handleLogin().
  3. handleLogin() calls the Controller to display the autologin UI.
  4. the Controller calls the BaseUI to display the autologin UI (BaseUI::showAutoLogin()).
  5. the BaseUI calls the TitleBar to display the autologin UI (TitleBar::updateAutoLogin()).
  6. the TitleBar instantiates the AutologinBar and calls AutologinBar.updateAutoLogin(), which is a key step.  
  7. At this point the AutologinBar fetches the DeviceAccountLogin from the Tab and gets the account names.
  8. The AutologinBar calls back to the Title Bar to display the autologin bar UI (TitleBar::showAutoLogin())

In the case of my app, I didn’t have a Controller, BaseUI or TitleBar, so I merged all that functionality into my MainActivity.  The actual work involved is:

  • Inside DeviceAccountLogin.java replace references to the WebViewController and Tab to point to MainActivity.
  • Inside AutologinBar.java replace references to TitleBar and Tab to point to MainActivity.

In MainActivity, I'll need to implement the functions from the Controller, BaseUI, TitleBar and Tab that are actually needed.  It's not actually much.  The one thing to be wary of is that the BaseUI and TitleBar both have functions called showAutoLogin() and hideAutoLogin(), but they're functionally different.  We'll have to be careful to rename one pair.

Here are all the functions I added to MainActivity to get this to work:

    private void inflateAutoLoginBar() {
        if (m_autologin != null) {
            return;
        }
        ViewStub stub = (ViewStub) findViewById(R.id.autologin_stub);
        m_autologin = (AutologinBar) stub.inflate();
        m_autologin.setActivity(this);
    }
	
    public void setDeviceAccountLogin(DeviceAccountLogin dal) {
        m_dal = dal;
    }

    public DeviceAccountLogin getDeviceAccountLogin() {
        return m_dal;
    }
	
    /* First pair of showAutoLogin()/hideAutoLogin() taken from Android Browser BaseUI.java
     * They simply call m_autologin.updateAutoLogin()
     */
    public void showAutoLogin()
    {
        updateAutoLogin(false);
    }
	
    public void hideAutoLogin()
    {
        updateAutoLogin(false);
    }
	
    private void updateAutoLogin(boolean animate)
    {
        if(m_autologin == null) {
            if(getDeviceAccountLogin() == null) {
                return;
            }
            inflateAutoLoginBar();
        }
        m_autologin.updateAutoLogin(this, animate);
    }

    /* Second pair of showAutoLogin()/hideAutoLogin() taken from Android Browser TitleBar.java       */
	
    public void hideAutoLogin(boolean animate) {
        if (animate) {
            Animation anim = AnimationUtils.loadAnimation(getBaseContext(),
                    R.anim.autologin_exit);
            anim.setAnimationListener(new AnimationListener() {
                @Override
                public void onAnimationEnd(Animation a) {
                    m_autologin.setVisibility(View.GONE);
                    m_webview.invalidate();
                }

                @Override
                public void onAnimationStart(Animation a) {
                }

                @Override
                public void onAnimationRepeat(Animation a) {
                }
            });
            m_autologin.startAnimation(anim);
        } else /*if (m_autologin.getAnimation() == null)*/ {
            m_autologin.setVisibility(View.GONE);
            m_webview.invalidate();
        }
    }

    public void showAutoLogin(boolean animate) {
        if (m_autologin == null) {
            inflateAutoLoginBar();
        }
        m_autologin.setVisibility(View.VISIBLE);
        if (animate) {
            m_autologin.startAnimation(AnimationUtils.loadAnimation(
                    this, R.anim.autologin_enter));
        }
    }

Android Development on Arch Linux

I’ve been using Arch Linux because it’s small.  My entire development environment, including the OS, is a 6GB VMWare image.  This zips down to under 4GB.  Moving between different dev machines is a breeze.

On the flip side, getting things to work is sometimes a hassle.  Before I start complaining about today's painful process, I have to say there are often times when things work well on Arch (like on other Linux flavors).  Usually a single call to pacman -S package will install a given package with no hassle.

Today however, I had to deal with what seems to be the bane of Linux platforms - library compatibility issues - on top of a year-old Arch Linux installation.

My goal was to get started with Android development.  Usually this includes installing Eclipse (which I already had installed), the Android SDK (which I also happened to already have, since I was using adb), and the Eclipse plugin for Android (aka ADT).  Note that I’ve gone through this process before on Windows, and while time consuming, following the instructions worked.

1.  Given that I had the first two components, it should have been simple to follow the instructions for installing the plugin.  Unfortunately, the installation failed with an error that looked like this:

Software being installed: Android Development Tools 11.0.0.v201105251008-128486 (com.android.ide.eclipse.adt.feature.group 11.0.0.v201105251008-128486) Missing requirement: Android Development Tools 11.0.0.v201105251008-128486 (com.android.ide.eclipse.adt.feature.group 11.0.0.v201105251008-128486) requires ‘org.eclipse.wst.sse.core 0.0.0’ but it could not be found

(Actually, the version number was different, I stole that quote from the solution I finally found on StackOverflow)

Instead of searching for a solution online, I assumed it was because my Android SDK was old and out of date.  Never assume.  So I tried to update the SDK by running android.  That failed with a cryptic error message:

Unsupported major.minor version 51.0

A Google search showed me that I needed JDK7, but I had JDK6 installed.

2.  Eventually I figured out that I needed to uninstall JDK6 (forcing pacman to ignore dependencies, since it complained that I had Java applications installed) and install JDK7.  Fortunately, this only required two pacman commands and a couple of minutes of downloading.  Finally, the Android SDK updater worked, and I got the latest SDK builds.

3. OK, back to installing the Android plugin in Eclipse, and I ran into the same error.  This time I searched online and found a solution.  The ADT has some dependencies that Eclipse should automatically find, download, and install before proceeding.  However, somehow my Eclipse installation didn't have the automatic update links… so I had to find them myself.

4. I managed to find the update links online and added them in Eclipse under “Help->Install New Software”.  I tried to go back, check for updates, and install them, and then got an error saying I had insufficient permissions.

5. It looks like pacman had installed Eclipse as root, so I tried running Eclipse as root to update.  This worked.  Not only did it work, I noticed that the upgrade URLs showed up when running as root.

6. I went back to non-root Eclipse to install ADT again.  No luck; the update wasn't the missing piece.  However, this gave me a hint to try the update URLs.  I launched Eclipse as root again, copied the update URLs, and pasted them into the non-root Eclipse.  Installing ADT after that worked!

That took way longer than it should have.

Eventual Consistency + Memcache: Part 2

This is a continuation of the previous post.  In summary: For entities that don’t change frequently, you may want to cache the result for a long time to reduce your actual datastore requests.  If you update your data with a PUT request, you would typically invalidate your cache, so the next GET request would issue a datastore query and get fresh results.  With the eventually consistent datastore, you may yet receive stale results, and end up caching those for a long time.

So right now, I see three possible solutions.  I’m going to explore the pros and cons of each.

  1. Don’t cache, or expire your cache quickly.
  2. Instead of simply expiring your memcache results, explicitly update the memcache results when you update the datastore data.
  3. Use transactions in your queries.

Minimize Caching

This one's simple: make your cache expire quickly, so it behaves more like an eventually consistent query.  You won't get “bad” results cached for a long time, but you also increase your costs and reduce your performance.  I don't really consider this a real solution.

Update Memcache

Instead of simply invalidating the cache on a PUT request, we'll need to actually update the memcache data to reflect what the new results of a query should be.  While this sounds simple, I've found that it can be much more complicated than it sounds, especially when your serialized data isn't absolutely straightforward.

First, you can’t simply issue a new query and re-serialize the results.  You’ll have to take your memcached results, and somehow modify the serialized version.  This means you’ll have to explicitly check if an updated entity was modified in such a way that it was added/removed/modified in the results.  If your results were sorted, that’s a lot of extra work to make sure that the updated memcached data maintains the new sort order.

In my particular case, my serializer adds metadata like the number of entities returned, so I have to be careful to update that as well.

Overall, there’s a fair bit of work involved with this solution, mostly duplicating the effort of the datastore sorting and the serializer.  If you’re dealing with paginated results, I could see this being even more of a nightmare.

Oh, and keep in mind that the memcache operations should be atomic, so updating memcache should use compare-and-set functionality in a retry loop.
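
On App Engine that looks roughly like this (a minimal sketch; update_serialized_results() is a placeholder for whatever logic patches the cached, serialized query results):

from google.appengine.api import memcache

def update_cached_results(key, changed_entity):
    client = memcache.Client()
    for _ in range(10):  # bounded retries
        cached = client.gets(key)  # gets() remembers the CAS id on the client
        if cached is None:
            return  # nothing cached; the next GET will rebuild it
        updated = update_serialized_results(cached, changed_entity)  # placeholder
        if client.cas(key, updated):
            return  # nobody else modified the entry in the meantime
    client.delete(key)  # keep losing the race: fall back to invalidation
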

Even after all this effort, there are still potential holes in this system.  On requests where there is no existing memcache data (either your first request, or after your memcache entry gets flushed), you'll be forced to run a query, whose results cannot be guaranteed.

Use Transactions

This leaves us with transactions - in other words, avoid eventually consistent queries.  If you’re still designing your data structures, this would be the preferred route.  Structuring your data for strongly consistent queries would save you a lot of effort in the future.

If, like me, you've already structured your data without using ancestors, this means you'll need to rewrite the data in your datastore to use ancestors.  That effort seems to roughly offset the effort required to update my cached, serialized data, and it guarantees that I won't run into the corner cases of getting inconsistent data when the cache is flushed.
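
For reference, here's what the ancestor approach looks like with the raw App Engine datastore API (a sketch, not the Django-style models used in these posts).  Entities created with a parent key live in the same entity group, and ancestor queries against that group are strongly consistent:

from google.appengine.ext import db

class Group(db.Model):
    pass

class Thing(db.Model):
    activated = db.BooleanProperty(default=False)

group = Group(key_name="group-1")
group.put()

# Passing the Group as the parent puts both entities in one entity group.
thing = Thing(parent=group, activated=True)
thing.put()

# Ancestor queries are strongly consistent: this query sees the put() above.
active = Thing.all().ancestor(group).filter("activated =", True).fetch(100)
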

Pitfalls of Eventual Consistency + Memcache on App Engine: A Case Study

Google App Engine's High Replication Datastore (HRD) operates with an eventual consistency model.  This means that when you make a change to the datastore, the next query may not actually see that change, but eventually a query will.  There are certain benefits to eventual consistency: it enables larger scaling.  The problems with it are probably obvious, but you can usually design around them.

I’ve realized it takes a fair bit of foresight to predict all potential problems.  The following is a situation I ran into this week, and the history of how I got here.  In particular, using memcache can exacerbate eventual consistency behavior if you don’t do it right.

I have two API entry points (HTTP request handlers): GET returns the list of active Things that belong to a Group, and PUT activates/deactivates individual Things.  It's pretty clear that there could be a potential issue.

If it helps illustrate, the data models look like this (using Django syntax):

import json

from django.db import models
from google.appengine.ext import db

class Group(models.Model):
    id = models.IntegerField(primary_key=True)

class Thing(models.Model):
    id = models.IntegerField(primary_key=True)
    group = models.ForeignKey(Group)
    activated = models.BooleanField()

def get(group_id):
    return json.dumps(list(Thing.objects.filter(group=group_id, activated=True).values()))

@db.transactional
def put(thing_id, activated):
    t = Thing.objects.get(id=thing_id)
    t.activated = activated
    t.put()

With the two separate APIs, the obvious problem is:

  1. If PUT activates Thing A (that was previously not activated),
  2. a subsequent GET call may return without Thing A

Eventually, after enough time passes (usually a second or less), a GET call will return with Thing A.

In order to use eventual consistency, you must convince yourself that this is acceptable in your use case.  With human-facing web applications, it often is: there's an expectation online of some delay, and when something is updated, the user may need to reload their page once or twice to see the change.

Now say GET requests happen very frequently, and PUT requests happen rarely.  This is often the case on the web.  Each GET issues a datastore query, for which you pay for the query and the entities returned.  It'll save you a bunch of money if you cache the results instead of constantly hitting the datastore.  You generally want to cache the result for a long time.

LONG_TIME = 1000000
def get(group_id):
    result = memcache.get(group_id)
    if result is None:
        result = json.dumps(list(Thing.objects.filter(group=group_id, activated=True).values()))
        memcache.add(group_id, result, LONG_TIME)
    return result

Ideally, you want to cache it forever so you never have to hit the datastore.  However, when a PUT request makes a change, you need to invalidate the cache.  My initial attempt did just that:

@db.transactional
def put(thing_id, activated):
    t = Thing.objects.get(id=thing_id)
    t.activated = activated
    t.put()
    memcache.incr_version(t.group) # Easy way to invalidate the Django cache

That was simple!  But unfortunately wrong.  And even worse, it works most of the time.  A later GET would not find the result in memcache, would fetch the newly updated list of Things, cache it, and be on its merry way.

There's a problem if the GET comes quickly after the PUT: it may read the old, stale results - AND CACHE THEM FOR A LONG TIME.

So you need to update the cached results in the PUT.  What's more, you have to be careful that you don't simply update the datastore and then run a new datastore query and serialize the results, since that query may not be accurate yet.

@db.transactional
def put(thing_id, activated):
    t = Thing.objects.get(id=thing_id)
    t.activated = activated
    t.put()
    result = Thing.objects.filter(group=t.group_id, activated=True)
    # At this point you need to explicitly verify that t is either in
    # or not in result, as necessary.  If the query is incorrect, you
    # need to explicitly fix it.
    memcache.set(t.group_id, json.dumps(list(result.values())), LONG_TIME)

At this point, I’m bitter because unlike this pseudo-code, my GET request isn’t quite a simple json.dumps(). It is actually handled by a framework that automatically serializes and caches requests. It’s a bunch of pain to figure out how the framework serializes the data, and also how the framework generates the cache keys.

But that's not all!  What if two PUT requests come in a row, activating or deactivating two Things within the same Group?  The query inside the PUT handler can still potentially get the wrong result and cache it.  Things are getting untenable.

I can see two ways to deal with this.

  1. Work with the results in memcache.  This means working with the serialized results (roughly sketched below).
  2. Use transactions, but that would probably mean all the entities need to be modified to add ancestors.
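
Here's a minimal sketch of option 1, assuming the cached value is the JSON list of entity dicts that get() builds above and using the App Engine memcache API directly.  The patching rules are simplified: a real version also has to preserve sort order and any serializer metadata, and should wrap the update in memcache compare-and-set to stay atomic:

import json
from google.appengine.api import memcache

LONG_TIME = 1000000

def patch_cached_list(group_id, thing_dict, activated):
    key = str(group_id)
    cached = memcache.get(key)
    if cached is None:
        return  # nothing cached; the next GET rebuilds it
    things = json.loads(cached)
    # Drop any stale copy of this Thing, then re-add it if it's now active.
    things = [t for t in things if t["id"] != thing_dict["id"]]
    if activated:
        things.append(thing_dict)
    memcache.set(key, json.dumps(things), LONG_TIME)
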

The pain!

Django-tastypie Authentication/Authorization

I've been using Tastypie for a while, but it took me some time to figure out how Authentication and Authorization work.

Actually, Authorization proved immediately useful.

Use the is_authorized() function to limit requests that should fail based on the caller.

Use the apply_limits() function to filter the results of a query before they are returned.

Authentication appears to perform a similar role with broader strokes, and blocks requests before they get to the Authorization stage.

The biggest difference I can find between the two is that Authentication is used to block access to the /schema API.  That portion of the API can’t be blocked using the Authorization component.
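
As a rough sketch of how the two Authorization hooks fit together (this follows the older tastypie Authorization API described above; the user field in apply_limits() is an assumption about your model):

from tastypie.authorization import Authorization

class OwnerAuthorization(Authorization):
    def is_authorized(self, request, object=None):
        # Fail whole requests up front, e.g. anonymous writes.
        if request.method != 'GET' and not request.user.is_authenticated():
            return False
        return True

    def apply_limits(self, request, object_list):
        # Trim the query results so callers only see their own objects.
        if request and hasattr(request, 'user'):
            return object_list.filter(user=request.user)
        return object_list
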

Defer Javascript loading

Here’s my script for deferred loading of Javascript.  This basically allows you to draw your page without waiting for all the scripts to get downloaded and parsed first.  The parsing is pretty quick on desktop browsers, but adds a significant delay in mobile.

Here’s the script.

    <script>
    function loadScripts() {
      var scripts = document.getElementsByTagName('script');
      var scriptIndex = 0;

      function getScript(script, next) {
        var done = false;
        var newscript = document.createElement("script");
        newscript.src = script.getAttribute("data-src");
        newscript.type = 'text/javascript';
        // Fire next() exactly once, whether the browser reports onload or
        // readystatechange for the dynamically added script.
        newscript.onload = newscript.onreadystatechange = function(event) {
          if (!done && (!this.readyState || this.readyState == 'loaded' || this.readyState == 'complete')) {
            done = true;
            newscript.onload = newscript.onreadystatechange = null;
            next();
          }
        };
        // Swap the dummy tag for a real script element so the browser
        // downloads and parses it now.
        script.parentNode.removeChild(script);
        document.body.appendChild(newscript);
      }

      function nextScript() {
        for (var i = 0, len = scripts.length; i < len; i++) {
          var script = scripts[scriptIndex];
          if (script && script.type == 'notjs') {
            // Load this one; nextScript() is called again when it finishes.
            getScript(script, nextScript);
            return;
          } else {
            scriptIndex++;
          }
        }
      }

      nextScript();
    }

    window.onload = loadScripts;
    </script>

In order to use this, add your scripts like this:

<script type="notjs" data-src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>

This is really just a dummy script tag.  Because the type is not “text/javascript”, it isn't parsed.  It's really just to make things easy if you're using some framework that's already injecting scripts.  Note also that instead of specifying a “src”, I'm faking it with “data-src”.  This is because if “src” is specified, the file is actually downloaded (which is good), but on Firefox the onload event isn't fired after we set the type to text/javascript and parse it.  The onload event only fires if the script is loaded dynamically.  A Chrome-only version would be much simpler.

The deferred loading script will look for these script tags and load them, in order.  Keep in mind that window.onload() will already have been triggered, so be careful not to depend on that in the deferred scripts.

(Source: developers.google.com)

Javascript Hoisting dangers.

Ran into a bug last week that didn’t seem obvious up front.

// 1.
// f();

function f(bar) {
    if (!f.text) {
        f.text = bar;
    }
    Y.one("#main").setContent(f.text);
}
f.text = "Test";
    
// 2.
// f();

The bug occurs when running f() at position 1.  f() will run without error when called from either position 1 or position 2, because the function declaration is hoisted to the top of the script.  However, the assignment to f.text is not hoisted.

If f() expects f.text to be set, it won't be at position 1.


Sticky header, footer, and full sized contents.

I wanted to build a page that looks like an application’s window, with a header bar across the top, a footer across the bottom, and a work area that filled up the rest of the screen, regardless of the browser size.

The goal was to avoid the browser showing its scroll bars.

The final product is in this fiddle.  Here’s the breakdown.

1. In order to get a sticky footer, you need a “wrapper” to take care of everything above the footer.

    <div id="footerwrapper"></div>
    <div id="footer">Footer</div>

    #footerwrapper {
        height: 100%;
    }
    #footer {
        position: relative;
        margin-top: -50px;
        height: 50px;
        background-color: #000088;
    }

This almost works, but in order for the 100% height to be effective, the height of the containing element (i.e. the <body> element) must be specified.

    html, body { height: 100%; }

Note that since the <body> element is contained by the <html> element, the <html> element must have its height specified too.  If any element in the chain doesn’t have its height specified, this doesn’t work.

After this, specifying a header and body are simple.

    <div id="footerwrapper">
      <div id="header">Header</div>
      <div id="main">
        Main
      </div>
    </div>
    <div id="footer">Footer</div>

    #header {
        background-color: #008800;
        height: 50px;
        width: 100%;
        position: fixed;
    }
    #main {
        background-color: #880000;
        padding-top: 50px;
        overflow: hidden;
    }

Now the header and footer are sticky.  The size of the main area is undefined, though.  If it's less than the size of the window, there will be blank space.  If it's greater than the size of the window, we'll see a scroll bar to scroll the main area (the footer should stick to the bottom, though).

We need to define the main area to be the size of the window, minus the header and footer.  I think this would be possible in pure CSS if the header and footer were defined in terms of percentages; then the body could be a percentage too.  But that would have the effect of the header and footer getting bigger as the window gets bigger, which isn't the effect I want in this case.

So it's impossible to do in CSS alone if I define the height of the header and footer by anything other than percentages.  I can, however, set the height using JavaScript.

    function f() {
        Y.one("#main").setStyle("height", String(window.innerHeight - 100) + "px");
    }
    window.onresize = f;
    f();

The required Javascript is pretty simple.  I’ll leave it up to the reader to handle the height of the header and footer without hardcoding it.



Manipulating History for Fun and Profit.

I was stuck for a good day and a half debugging some navigation issues on my site caused by some incorrect usage of the HTML5 history API.  I think they’re pretty easy mistakes for a beginner to make.

First, a quick description of my app.  It's implemented as one HTML page, but internally I keep track of a stack of “views”, with one view visible at a time.  The user can navigate forward and backward between the views.  The views are actually individual <div> elements, but only the “top” one is visible at a time.  I have my own pushView() and popView() functions.  It would be nice for the “back” button to mimic the view navigation, i.e. it would call popView().

It seems straightforward…

The solution seems pretty simple.  Every time I navigate forward, I call:

history.pushState(data, title, url);

Then I need an onpopstate handler to handle the back button click.  If I'm navigating back from one of my views, I would call popView().

Now, I noticed that in Chrome I would sometimes get onpopstate events at weird times, like first thing after loading a page.  It's easy to ignore these as long as I provide history data (the first param to pushState).  Then I can check that an onpopstate event has data before I act on it.

Confusing the states

The first thing that caught me is that onpopstate fires with the state data of the destination state, which is most likely NOT the last state you pushed.  Here's an example.  When you first land on a page, your history stack would look something like this:

1. Landed on page state.  state data = null.

Then you call pushState(“mydata”).  Your history state would then be:

1. Landed on page state.  state data = null.

2. pushState(“mydata”).  state data = “mydata”.

You then click the browser “Back” button.  To me at least, intuitively, I expected an onpopstate event with “mydata”.  Instead, I get null.  What happens is that state 2  is popped, and state 1 is the page that should be visible now.  The onpopstate event carries the state data for state 1, which is null.

I had a bug where I couldn’t navigate back to state 1 because I was checking for state data.  The fix for me was to use history.replaceState(“main”) to set the state data for state 1.

Unexpected states from links

My second source of confusion was that I used a combination of <button> and <a href=”#”> elements to allow users to navigate my web app.  In both cases, Javascript handlers actually did the navigation.

The problem was the <a href=”#”> elements.  Even though they did not navigate away from the page, clicking on them would have the side effect of adding an unexpected entry to the history state queue.  Make sure you add event.preventDefault() to the click handler for <a href=”#”> links to avoid this.

Navigating back multiple pages is tricky

At some points in my app, I wanted to navigate back two steps.  So I tried calling history.back() twice within a click handler.  This didn't work; I only got one onpopstate event, going back one page.

The second approach I tried was to call history.go(-2) instead of calling history.back() twice.  The result of this was again a single onpopstate event, but this time, it gave me the state data from two pages back.  I would have to have my own code to check the state data and figure out how many times I really wanted to pop my views.  I haven’t figured out how to handle the case where a view may be in the stack more than once.

(Source: diveintohtml5.info)


Gitbits.

I’ve been using git for well over a year, but I hardly know how to do anything beyond the basic pull, modify, commit, push, with the occasional merge.  Every time I think I need to do something else, I find that it’s possible, once you’ve figured out how to do it.

e.g. you want to get a particular version of a file:

git checkout <branch or commit> -- <path to file>

Next task: rebasing something that I’ve already committed.