These notes come from the WWDC 2010 and 2011 talks, API documentation, a bit of experience, advice from others, etc. I’m not an experienced iOS engineer by any stretch of the imagination, so if you find any mistakes, please add comments. Thanks. I made these notes for myself and then figured that others might find them useful, so I’ve shared them.
Each view has a backing layer, which contains an image of the view, excluding subviews. The view's contents are rendered onto the layer when the view is added on screen. Whenever the view needs to be drawn, the system tries to use the GPU to composite the layer onto the screen rather than asking the view to render its contents again, which requires the CPU.
The biggest bottlenecks to graphics performance are offscreen rendering and blending. Both happen for every frame of an animation and can cause choppy scrolling.
Note that the layer itself is effectively an offscreen buffer, but that's not what we're referring to here. Sometimes two or more layers need to be composited into an offscreen buffer, which is then composited onto the screen, as opposed to the normal flow of compositing layers directly onto the screen.
For example, this happens when you set a mask for a view. The entire view hierarchy needs to be rendered onto an offscreen buffer or, in other words, the layers for all these views need to be composited into the buffer, on which the mask is applied. Then the buffer is composited on screen.
The Core Animation Instrument has an option to detect offscreen rendering, as well as blended views. If an offscreen buffer is going to be needed, see if you can at least cache it between frames of the animation, by calling [layer setShouldRasterize:YES].
Make sure that you don't enable rasterization of a view without a good reason, because you might force offscreen rendering (slow) when it's not needed.
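As a rough illustration of the two paragraphs above, here is a minimal sketch that masks a view (which forces offscreen rendering) and optionally asks Core Animation to cache the resulting buffer. The helper name ApplyRoundedMask and its parameters are made up for this example; mask, shouldRasterize and rasterizationScale are real CALayer properties.

```objc
#import <UIKit/UIKit.h>
#import <QuartzCore/QuartzCore.h>

// Hypothetical helper (not from the talks): round a view's corners with a mask
// layer and optionally cache the offscreen buffer between frames.
static void ApplyRoundedMask(UIView *view, CGFloat radius, BOOL cacheBuffer) {
    // Setting a mask means the view hierarchy is composited offscreen first.
    CAShapeLayer *mask = [CAShapeLayer layer];
    mask.path = [UIBezierPath bezierPathWithRoundedRect:view.bounds
                                           cornerRadius:radius].CGPath;
    view.layer.mask = mask;

    // Ask Core Animation to keep the composited bitmap around between frames.
    view.layer.shouldRasterize = cacheBuffer;
    if (cacheBuffer) {
        // Match the screen scale so the cached bitmap is not blurry on Retina.
        view.layer.rasterizationScale = [UIScreen mainScreen].scale;
    }
}
```

The cached bitmap has to be regenerated whenever the view's contents change, so this only pays off for content that stays the same across the frames of the animation.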
Other optimization techniques
Flatten your hierarchy: See if you can flatten your view hierarchy. If your view has subviews A and B, and B has subviews B1 and B2, see if you can eliminate B and add B1 and B2 directly as subviews. Flattening your view hierarchy means that there are fewer backing stores, so you save both space and time, including on the GPU.
Consider drawRect: If you can't flatten your view hierarchy, Apple recommends that you use subviews to create your layout rather than drawing each item individually in drawRect.
However, there are cases when you should go the drawRect route and draw stuff manually, using Core Graphics, NSString’s drawInRect, etc.
One trick is to create subviews but not actually add them to the superview. For example, if your view needs two labels and images, create the UILabel and UIImage objects without adding the labels to any superview and, in your drawRect:, invoke UILabel's drawTextInRect: and UIImage’s drawInRect: (never invoke drawRect: directly). Note that Apple documents drawTextInRect: as an override point rather than something to call yourself, so treat this as a workaround.
Or create your view hierarchy, invoke [view.layer renderInContext:context] on the root view, and throw away the view hierarchy.
Whichever of these three techniques you use, you have a backing store that effectively holds the view and all its subviews together instead of having separate backing stores for each view, or, worse, having a separate backing store for each view *and* one for the view hierarchy (if you call setShouldRasterize). This saves memory and time, both on the CPU and GPU. In particular, your view hierarchy is composited together before animations, rather than for each frame.
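Here is a minimal sketch of the manual-drawing route, assuming a row that shows one icon and one title. The class name FlatRowView, its properties, and the layout numbers are invented for illustration. It uses NSString's drawInRect:withAttributes:, which needs iOS 7 or later; on the SDKs of the WWDC 2010/2011 era you would use drawInRect:withFont: instead.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view: draws an icon and a title into ONE backing store
// instead of hosting a UIImageView and a UILabel as subviews.
@interface FlatRowView : UIView
@property (nonatomic, copy) NSString *title;
@property (nonatomic, strong) UIImage *icon;
@end

@implementation FlatRowView

- (void)drawRect:(CGRect)rect {
    CGRect bounds = self.bounds;
    CGFloat side = CGRectGetHeight(bounds);

    // Draw the icon in a square at the left edge.
    [self.icon drawInRect:CGRectMake(0, 0, side, side)];

    // Draw the title in the remaining width, 8 points to the right of the icon.
    NSDictionary *attributes = @{
        NSFontAttributeName: [UIFont systemFontOfSize:17],
        NSForegroundColorAttributeName: [UIColor blackColor]
    };
    CGRect textRect = CGRectMake(side + 8, 0,
                                 CGRectGetWidth(bounds) - side - 8, side);
    [self.title drawInRect:textRect withAttributes:attributes];
}

@end
```

Because the system only calls drawRect: when the view needs display, you have to call setNeedsDisplay yourself after changing title or icon.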
Create an image directly: Create a UIImage or CGLayer (which is cached better on the GPU), and draw whatever you want into it.
The downside of this technique compared with the drawRect one (above) is that you're eagerly creating your backing store, whereas drawRect is invoked only when the view is added to a window. But if that’s not a consideration, this is simpler than the drawRect solution because instead of defining a UIView subclass, defining drawRect, and having the system invoke it to render your view into a layer, you create an image directly.
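A minimal sketch of creating an image directly, assuming you just want a filled circle. The function name MakeBadgeImage is hypothetical; the UIGraphics* calls are the standard UIKit image-context functions (newer SDKs also offer UIGraphicsImageRenderer for the same job).

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: draws a filled circle into a new UIImage, eagerly,
// without any UIView subclass or drawRect: involved.
static UIImage *MakeBadgeImage(CGSize size, UIColor *color) {
    // opaque = NO keeps the corners transparent; scale 0 means "use the screen scale".
    UIGraphicsBeginImageContextWithOptions(size, NO, 0);

    [color setFill];
    CGRect circleRect = CGRectMake(0, 0, size.width, size.height);
    [[UIBezierPath bezierPathWithOvalInRect:circleRect] fill];

    UIImage *image = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    return image;
}
```

You could then hand the result to a UIImageView, or assign its CGImage to a layer's contents.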
Scroll view callbacks: UIScrollView works by updating the contentOffset every 1/60th of a second. If your UIScrollViewDelegate callbacks take longer than 1/60th of a second (about 16.7 ms) to return, you’re guaranteed to drop frames.
This is different from Core Animation animations, which run smoothly on the GPU even if the CPU is busy.
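One way to keep the delegate inside that budget, sketched under the assumption that the expensive work can wait until scrolling stops. The class name FeedViewController and the method loadExpensiveContent are placeholders; the three delegate methods are the real UIScrollViewDelegate callbacks.

```objc
#import <UIKit/UIKit.h>

// Hypothetical controller: does only trivial work per scroll tick and
// defers anything costly until the scroll view has come to rest.
@interface FeedViewController : UIViewController <UIScrollViewDelegate>
@end

@implementation FeedViewController {
    CGFloat _lastOffsetY;
}

- (void)scrollViewDidScroll:(UIScrollView *)scrollView {
    // Called for every contentOffset change: keep this to cheap bookkeeping.
    _lastOffsetY = scrollView.contentOffset.y;
}

- (void)scrollViewDidEndDragging:(UIScrollView *)scrollView
                  willDecelerate:(BOOL)decelerate {
    // If there is no deceleration phase, scrolling has already stopped.
    if (!decelerate) {
        [self loadExpensiveContent];
    }
}

- (void)scrollViewDidEndDecelerating:(UIScrollView *)scrollView {
    [self loadExpensiveContent];
}

- (void)loadExpensiveContent {
    // Placeholder (assumption): image decoding, heavy layout, etc. would go here.
}

@end
```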