Vine Loop Counter View

Recently Vine has launched Loops, and one of the fun parts that I took on was building loop animation that shows as Loops and thought I'd share it as part of Vine's open source efforts: 


setKnownCount() where you can give it the current count, the time the current count is obtained, and a velocity that the count should be increasing at. 

setExtraCount() where you can give it extra counts independent of the known counts variables.

Animation Modes

The gist version supports three different parameters for AnimationModes:

continuousAnimation: if the increments will be continuous, if this is false animation will run once to the current number and then stop until the next time the count is changed.

pedometerAnimation: if true, the digits will move up 1 by 1, instead of skipping when animation increment is > 1 for that digit.

alphaAnimation: if true, the alpha will be changed as percentage of the animation completion. 

The default on the gist: non-continuous, non-pedometer, alpha-on, which is the way it is on Vine for Android as of Version 2.1.0.

Other Customizations

You can of course change and play with the digit spacings, animation durations, typefaces with either the given methods or change the constants. Test with the usual velocities that you wanna give and you will see interesting effects. I had a demo app working with all the different variations but I think I'll leave it to the reader to play with. 

How it works

On count invalidation, count will be checked against the current count and adjust the digit sizes. The current count based on the starting count and starting time and extra count will be calculated and produce individual states for each digit. Each digit keeps track of its own animation state. And then onDraw of View will be triggered via view invalidation.

On View invalidation, onDraw will simply draw out each digit one by one using the state it is in, and then post a runnable to calculate the updated states again. Note that frame rate is adjustable as well via a constant as the delay of re-calculating the states. 


Comments? Bugs? Suggestions? Feel free to leave it here or on the gist. 

Allocating Camera memory faster on Android Part Two

Part 1 Part 2

In Part 1, I talked about how to avoid the GCs so you can get reasonable speeds when using the frames from Android Camera's onPreviewFrame method and process them without losing any, it was basically as follows: (let's call this Method A)

1. Get faster memory allocation with tricks mentioned for small pieces of memories (bytes[]). The number of byte[] needed for the slowest device is the maximum number of frames to process (N);

2. Put the frame from onPreviewFrame to another thread.

3. The other thread process the data, and then give it back to the Camera. 

It turns out, there is another way to do it that's much faster, Method B:

1. Get faster memory allocation with tricks mentioned for small pieces of memories (bytes[]). The number of byte[] needed for the slowest device is about 10.

2. Put the frame from onPreviewFrame into a shared large ByteBuffer Queue that's big enough to fit maximum number of frames to process (Generate this Queue with ByteBuffer.allocateDirect(N*singleFrameSize) and then give back the buffer.

3. Another thread will manage the queue independent of the onPreviewFrame thread (process frames, drop frames when on pressure, etc.)

Comparison on cold launch: 

Method A: Requires generation of N byte[] in Java, and a total of N * singleFrameSize bytes. 

Method B: Requires generation of 10 byte[] in Java, (N + 1)* singleFrameSize in memory block allocation, and a total of (N + 11) * singleFrameSize bytes. 

If done right, Method A can trigger lots of GCs, average about (0.1 * N), so for 180 frames and assuming only the last 20% would GC with the large chunk first trick will make it about 3s. Method B, the allocation will basically be about 0.1 * 11, making it only use about 1s time. 


Allocating Camera memory faster on Android Part One

Part 1 Part 2 

One thing that was learned while building the capturing part of Vine for Android was dealing with all the raw buffers in order to satisfy the stop motion requirements. (According to Instagram, they were able to use the native MediaRecorder with 700ms+ delay on start time and a minimum duration, but Vine can't afford that in order to do stop motion) And because we can't use MediaRecorder, there are other libraries that are linked in order to do the encodings. 

In order to use the raw buffers, setPreviewCallbackBuffer  will be used in place of setPreviewCallback and addCallbackBuffer must be called with a minimum number of frames added prior/during to preview. This way buffers will not be generated during run time so that there is no lag during recording (which causes serious frame drops). For Vine, we take the frames, put them on a concurrent queue, another thread will take the buffers from the queue, process that frame, and then put the buffer back to Camera. So for a 6 second 30fps video, a maximum of 180 frames will be needed if the user records one single long clip. There goes the problem, 180 frames of raw bytes is pretty big to allocate at first as each frame is about 1MB big to allocate them at once will likely cause OOM and turns out to be really slow. But let's look the iteration that we did to minimize the problem as well as how to make everything else faster. 


Naive solution:  Add 180 frames prior to startPreview, guarantee 180 frames for all phones.  Doing all the allocations and initialization of classes and objects. when user starts recording. 

Result: GC_ALLOC happens, OOM happens on some phones, and frag increase of heap causes the allocation to go up to 10 - 30 seconds on certain phones. Takes 1 - 2 seconds before allocation happens. 


First thing I tried was to identify the bottlenecks during recording so that we don't need that many frames. Can process be faster so that we don't need that many frames?

Processing a frame really consists of four small steps so it was not hard to time them. 

(all the times are relative to the paragraph and to each other instead of real times since it varies by device) 

1. Convert a NV21 frame to a Bitmap for manipulation. (Time: 50x)

2. Doing bitmap manipulation on the converted Bitmap. (Time: 5x)

3. Encode the bitmap. (Time: 20x)

4. Write to the container.  (Time: 1x)


Optimize processing:  

1. If conversion in Java takes about 50x, can we do it better in native? Or is there a better solution. It turns out, if we do color conversion on GPU via an intrinsic RenderScript (super optimized conversion script), we can make it go from 50x to 1x with just a few lines of code. Unfortunately, this is Android 4.2+ only at time of writing but a support library may come in the future to back port this to older Android devices. 

2. All the bitmap manipulations were separated (rotation, clip, inversion), if we use a single Matrix, time was modified from 5x to 2x. 

3. Encoding, there isn't that much we can do here since the encoding algorithm is already optimized. If we use MediaCodec, time would be down from 20x to 10x, but this is 4.1+ and there is no sign that a support library may support this in the future. 

4. Writing it to the container is super fast, nothing to be done here for now. 

What did this was that we can now cut down from 180 requirement to a 140 requirement on certain devices, and 120 requirement on 4.2+ devices. (We have a device profiling system for this). 



Improved processing solution:  Add 140 frames prior to startPreview, guarantee 140 frames for all phones.  Doing all the allocations and initialization of classes and objects. when user starts recording. 

Result: GC_ALLOC happens less, OOM happens on some phones but less, and frag increase of heap causes the allocation to go up to 5 seconds on certain phones before they can start recording. Takes 1 - 2 seconds before allocation happens. (The big improvement here happen because GC on the last 40 frames is usually the slowest). 

This is still unacceptable. 


Improve allocation speed: Lying to get more memory is good. 

Why does GC happen? Why is growing heap even needed if we know how much we need?  

GC happens when the allocated heap is hitting about 70% capacity. And heap grows in frag because we only asks for a small byte[] at a time.  

It turns out, right before adding small buffers, I can add the following code to make it 100x faster:

temp = new byte[140 * requiredSize * 1.5] ;

temp[0] = 1; 

temp = null; //Explicit. 

This makes GCALLOC happens much much less (sometimes only once) and no more heap growing more than once. 


Result: GC_ALLOC happens much less, OOM happens faster, allocation time to go up to 2 seconds on certain phones before they can start recording. Takes 1 - 2 seconds before allocation happens. 

Much better, but can we do better? 






The rest of the improvements that we did we around using a service that maintains class loadings, using a bytebuffer queue when they restart recording so that we don't have to allocate more buffers, eventually bring the OOMs down to a very very small number, and allocation times to about 1.5s. The details are not important but what's important is that there was so much room for improvements and at many places that we did not expect to make a huge impact. Timing the execution and using GMAT like tools were very important at first for us to identify the bottle necks. 


Is Android fragmentation an issue?

For consumers? No. 

Consumers want the best phone they can afford. Android does exactly that by providing lots of options in a the entire price range. Do they really care if the phone have 512MB of RAM, 1.4Ghz Duo CPU, or another phone with 1GB RAM and a 1.9Ghz? They can't really tell the real benefits provided by the different phones. And they certainly don't care of a specific app is not on a certain phone if it means to pay $100 more (provided that most of the most popular apps are compatible with most phones).

For developers? Yes, but not really.

No because most apps will work just fine if you follow the best practices for Android. Unless you are doing something wrong, you won't run into any issues. There is a lot of gotcha's but answers are mostly on StackOverflow. Porting apps to different Android devices are not nearly as hard as coding in another platform.

Yes, and if you run into weird problems, there is not much help you can get, especially if you are using the newer APIs. After developing a dozen apps in different categories, SleepBot and Vine accounts for almost all the hardest problems because they interact with Camera, MediaRecorder, MediaPlayer, and opengl components. On the other hand, Squarespace and the other apps had no problem adopting all the platforms and devices, and at most you will be dealing some mistakes made on database related issues. I remember that one of the features for Vine encountered a different problem on each flavor of S2s because some functions were not implemented according to the SDK. This has gotten significantly better with 4.1+ devices, which is why Instagram is only 4.1+ when video was released as well. 

That being said, if you are not using any special hardware components or the more specialized APIs, there is nothing to worry about. Making a todo app is just as easy on Android than on iOS.