Tech Tip – Efficient loop optimization with unrolling and Duff’s device


One popular method of optimizing loops is to ‘unroll’ the loop so that more work is done for each cycle through the loop.
For instance, consider the following (trivial) example that accumulates the values of the elements in an array:
      long total = 0;
      unsigned char data[256];
      for (int i = 0; i < 256; i++) {
            total += data[i];
      }
With so little occurring in the loop body, it's not hard to believe that the overhead of the loop itself is greater than the work of accumulating the data. Generating the assembly code (/FAs for Microsoft compilers) produces the following:
      mov   DWORD PTR _i$5081[ebp], 0
      jmp   SHORT $LN3@Loop1
$LN2@Loop1:
      mov   eax, DWORD PTR _i$5081[ebp]
      add   eax, 1
      mov   DWORD PTR _i$5081[ebp], eax
$LN3@Loop1:
      cmp   DWORD PTR _i$5081[ebp], 256         ; 00000100H
      jge   SHORT $LN1@Loop1
; 12   :          total += data[i];
      mov   eax, DWORD PTR _i$5081[ebp]
      movzx ecx, BYTE PTR _data$[ebp+eax]
      add   ecx, DWORD PTR _total$[ebp]
      mov   DWORD PTR _total$[ebp], ecx
; 13   :    }
      jmp   SHORT $LN2@Loop1
$LN1@Loop1:
(In the listing above, the instructions that initialize, increment and test i, together with the jumps, are loop overhead; the four instructions that follow the "total += data[i]" comment are the actual work.)
If we assume that the cost of each instruction is the same (which it isn't, but it will simplify the discussion), then we could say that the loop overhead is twice that of the actual work, based on there being twice as many overhead instructions (eight) as work instructions (four). We can also calculate the total work as 12 + (11 * 255) = 2817 instructions.
We need a way to do more work on each pass through the loop. Consider the following:
      long total = 0;
      unsigned char data[256];
      for ( int i = 0 […]
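
To make the idea concrete, here is a minimal sketch of the same accumulation written three ways: the plain loop, a version unrolled by four, and a Duff's device variant. The unroll factor of four and the function names (sum_simple, sum_unrolled4, sum_duff4) are assumptions chosen for illustration; they are not taken from the original listing.

```c
#include <stddef.h>

/* Baseline: one element per pass, so one increment and one test per add. */
long sum_simple(const unsigned char *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}

/* Unrolled by four (an assumed factor): one counter update and one test
   now cover four additions. A second loop mops up any leftover elements. */
long sum_unrolled4(const unsigned char *data, size_t n)
{
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; i++) {
        total += data[i];
    }
    return total;
}

/* Duff's device: the switch jumps into the middle of the unrolled body to
   consume the n % 4 leftovers first, then the do/while runs full passes. */
long sum_duff4(const unsigned char *data, size_t n)
{
    long total = 0;
    if (n == 0) {
        return 0;               /* the do/while below always runs once */
    }
    size_t passes = (n + 3) / 4;
    switch (n % 4) {
    case 0: do { total += *data++;
    case 3:      total += *data++;
    case 2:      total += *data++;
    case 1:      total += *data++;
            } while (--passes > 0);
    }
    return total;
}
```

All three functions return the same total for any length. Whether the unrolled forms are actually faster depends on the compiler and target, since an optimizing build may already unroll the simple loop on its own, so it is worth measuring before committing to either.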

July 9th, 2013 | Best Practices, Software

Performance Matters #1, or, effective optimization requires focus

My first foray into performance optimization came when attempting to write computer games for the Intel 80x86 family. The clunky graphics of CGA and MCGA gave way to the higher resolutions but fewer colours of EGA and VGA. For PC games at that golden point in time, though, it was all about 320x200x8bpp and, more specifically, the infamous ModeX.

This post isn’t about ModeX, or any of the other modes really; it’s about how the drive to produce faster graphics, to enable more realistic animation and game play, led to a lifelong obsession with getting every last piece of performance out of a usually reluctant piece of hardware.

But enough history. The intent of these posts is to identify approaches and techniques, or perhaps establish some principles, that may help you the next time you are asked to go and optimize something. When I was spending all my time cranking through different assembly implementations of sprite drawing routines, looking for the magic combination of opcodes (CPU instructions) to yield the best possible performance, I was guilty of the cardinal sin of Performance Optimization – I optimized the code that I found most interesting rather than the code that most needed optimizing.

What I should have done is determine how much of each pass through the game loop was actually being spent rendering sprites versus other activities such as maintaining object lists, handling input, updating state, etc. Those results would have told me exactly where the time was being spent, and where I should focus the optimization.

Back in those dark ages, few of the tools that we now take for granted existed, and those that were available were very expensive. So the measurement code I […]
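
As a rough illustration of that kind of measurement (not the measurement code referred to above), the sketch below accumulates CPU time per phase of a game loop using the standard C clock() function and reports where the time went. The phase names and the profile_* helpers are hypothetical, invented for this example.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical phases of a game loop; the names are illustrative only. */
enum phase { PHASE_INPUT, PHASE_UPDATE, PHASE_RENDER, PHASE_COUNT };

static const char *const phase_names[PHASE_COUNT] = { "input", "update", "render" };
static clock_t phase_ticks[PHASE_COUNT];

/* Add elapsed clock ticks to one phase's running total. */
void profile_add(enum phase p, clock_t ticks)
{
    phase_ticks[p] += ticks;
}

/* Run one phase (any function taking no arguments) and charge its cost. */
void profile_run(enum phase p, void (*work)(void))
{
    clock_t start = clock();
    work();
    profile_add(p, clock() - start);
}

/* Return the phase that has accumulated the most time so far. */
enum phase profile_hottest(void)
{
    enum phase hottest = PHASE_INPUT;
    for (int p = 1; p < PHASE_COUNT; p++) {
        if (phase_ticks[p] > phase_ticks[hottest]) {
            hottest = (enum phase)p;
        }
    }
    return hottest;
}

/* Print each phase's share of the total measured time. */
void profile_report(void)
{
    double total = 0.0;
    for (int p = 0; p < PHASE_COUNT; p++) {
        total += (double)phase_ticks[p];
    }
    for (int p = 0; p < PHASE_COUNT; p++) {
        double share = total > 0.0
            ? 100.0 * (double)phase_ticks[p] / total
            : 0.0;
        printf("%-8s %5.1f%%\n", phase_names[p], share);
    }
}
```

Each stage of the loop would be wrapped in a call such as profile_run(PHASE_RENDER, draw_sprites), where draw_sprites stands in for whatever function does that stage's work, with profile_report() called after a few thousand frames. The phase returned by profile_hottest() is the one that deserves the optimization effort, however uninteresting its code may be.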

July 9th, 2013 | Uncategorized

Software Regressions and Best Practices for Minimizing and Handling

We have all seen it, and most likely all inadvertently done it: introduced a software regression in our product or project. It happens to the best of us. A software regression is never good, and I’d like to think that most teams try to fix regressions as quickly as possible, especially if they are very noticeable. I’ve seen a few recently with Android that have me disappointed and concerned, and they have me thinking about how development teams can avoid creating a software regression, or at least minimize the chances of doing so.

Before I start railing on the good folks at Google and AOSP, let me say this: the Android platform has advanced in terms of features, robustness and development tools/docs since it was first made available to the public.  I get it.  Sometimes things go sideways when you are running at full throttle on a rocket ship.

Now, back to the software regression examples. The first one is with the Android emulator. When Android 1.0 and then 1.5 (anyone else remember Cupcake?) hit the scene, the platform had an excellent feature set and was stable and workable. This new kid on the block came with good documentation and integration with Eclipse, a popular IDE in the Linux and open source world. It also included a device emulator for ARM which was surprisingly snappy. Having worked on Windows CE, Windows Mobile and Symbian, I had seen other device emulators, and the Android emulator blew them out of the water. It rocked. It was responsive, available on Windows, Linux and Mac, and supported emulated hardware keys, sound, etc. Fast forward to today. Frankly, the Android emulator for 4.2 is abysmal. The default armeabi-v7a system image (that’s version 7 of the ARM architecture) is slooooowwwww. Painfully slow, even […]

June 28th, 2013 | Android, Best Practices, Software

Security in Today’s Embedded and Mobile Platforms

We have all seen articles in the news, seemingly every day, about security holes and vulnerabilities exposed in Embedded and Mobile Platforms and the software that powers them. These security holes impact the quality of the product, the company that produced the product and the end user. Sometimes these security holes are introduced when a product development effort gets into a “rushed” state and, as a result, fundamental security measures are skipped. While in this “rushed” state, some security measures are overlooked, and sometimes they are thrown out because they are perceived as unnecessary and an additional expense.

Security is a huge topic, too big to cover completely in a single blog, but I want developers to think more about the basic building blocks that are the foundation for a stable and secure platform. In this blog, I will keep the subject matter generic enough to cover more than “just platform” or “just application” development. I will also avoid discussing product security processes and policy management. For process and policy management, I recommend reading the Microsoft Press book The Security Development Lifecycle (SDL). The book and supporting web pages contain a great deal of information about how to handle processes and procedures with respect to security issues during the product life cycle.

In the section that follows, I will go over what I consider the basics of a secure implementation and set the stage for future blog articles where I can provide more details on individual aspects of security. Below is what I consider the top 15 areas a developer/architect should always consider in a solution design and implementation. (The tips below mainly cover C/C++ development and the Windows CE/Windows Embedded platforms; however, several of these rules apply to other […]

May 8th, 2013 | Debug, Enterprise, mobile

Tech Tip – Using Android RenderScript ScriptGroup

Android API 17 added a very powerful RenderScript feature: the ability to group scripts together and execute them as one. This helps greatly with performance and management of the scripts, as the RenderScript engine works out the optimum way to exchange data between the scripts and removes the overhead of coordinating them from Java. Unfortunately, the documentation is not very clear on how this feature can be leveraged for anything beyond the new intrinsics.

Let’s say you have a script which generates an RGB image into a bitmap Allocation, but you really need it in YUV format once it is done. To do this you can tie your custom script, fancy_image_gen, to the ScriptIntrinsicColorMatrix using a group. Start in the usual way: create a RenderScript object and your instances of the scripts.
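// Create the RenderScript context and the custom script instance.
RenderScript rsCtx = RenderScript.create(context);
ScriptC_fancy_image_gen myGen =
        new ScriptC_fancy_image_gen(rsCtx,
                                    context.getResources(),
                                    R.raw.fancy_image_gen);
// The color matrix intrinsic works on 4-channel, 8-bit elements.
ScriptIntrinsicColorMatrix mtrx =
        ScriptIntrinsicColorMatrix.create(rsCtx,
                                          Element.RGBA_8888(rsCtx));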
Now you need to create the ScriptGroup to pull them together. The framework provides a ScriptGroup.Builder class to do this. First create a ScriptGroup.Builder instance, then add your script instances to it and connect the […]

May 8th, 2013 | Android, Jelly Bean

Casting Debug Nets

Just the other day I was debugging a particularly troublesome problem on a project. We were seeing a CPU exception with no discernible reason. It got me thinking about debugging and how sometimes it feels more like art than science. I’ve seen more than one developer get “stuck in a rut” trying to debug a problem because of the environment or tools and an inability to push past it. Let’s face it, debugging is a frustrating but necessary evil in our work. Some people loathe it, some people love it. In case you’re wondering, I fall into the latter camp – I absolutely love testing and debugging the code I’ve written. There’s something very satisfying about testing and verifying all the hard work you’ve put into designing and developing something. That’s not to say it is not aggravating. Like most things in life, how you apply your experiences really helps get you through tough debugging sessions. How dependent on source-level debuggers are you? Can you be effective without one? What if you only have printf-like functionality and it tells you nothing? What if you don’t even have print capability? Many software engineers throw up their hands at these things, scoff and say, “in this day and age that never happens!” To this I laugh and wonder what exactly they have been working on.

In the recent past I’ve encountered more than one project with no available source-level debugging, limited print/log capability and strange debug output at the time of the problem. All difficult but not impossible problems. This is where you cast what I like to think of as “debug nets”. Think fishing. Even if you’re not a fisherman or outdoorsy type, you […]

March 5th, 2013 | Debug

Who is ready for some Key Lime Pie (Android 5.0)?

With Google I/O now on the horizon, there are rumors that Key Lime Pie (Android 5.0) will be introduced to the world at that time. Google has announced the dates for Google I/O, which runs from May 15th to May 17th, 2013. Some of the rumors floating around for Key Lime Pie include:

Video Chat – Today’s smartphones include a front-facing camera but require third-party software such as Google Talk, Skype, Qik, etc. for video chatting. It is anticipated that Key Lime Pie will include an application to support video chatting, much like FaceTime from Apple.
Improved Battery Life – As Android platforms move to quad core and beyond, power management becomes extremely important as these platforms require more power when running. It is anticipated that Key Lime Pie will include major improvements in power management in order to support these new high-performing platforms.
Improved Performance – Quad core platforms and beyond are on the map for Android. In order to support these platforms, the Android OS must be optimized to handle these high-performing multi-core processors. This may include changes to the Dalvik engine to efficiently support multi-core processors. It is expected that there will be significant improvements to the UI.
Miracast – Sharing your desktop, sharing your screen with a conference room projector and watching live programs from your home cable box are all possible with Miracast, which appears to be a big addition to Key Lime Pie.
Google Cloud – Google will continue to push all aspects of Android towards the cloud, and it is believed they will blend Google TV, Android@Home, and the Android OS as a whole into one finely tuned machine.

What ingredients do you think should be included in the recipe for Key Lime Pie?

January 8th, 2013 | Android

More Connected Devices, More Spectrum

We all know that just about every electronic device is becoming a connected device these days. Mobile device (smartphone, tablet) usage continues to grow at a breakneck pace. Recent stats from Black Friday 2012 show mobile traffic was up 98% compared to 2011 and average page visits up 74%. That is a lot of wireless network connectivity and a huge expansion in a single year! The FCC has decided to do something about it: release additional spectrum which previously was licensed space, but do it in a managed way. This is important for a few different reasons. First, as more devices get connected, the current spectrum congestion (3G/4G carrier and Wifi) will become a problem. Remember those old ads criticizing cable modem internet because everyone shared the bandwidth? Similar concept: there’s only so much bandwidth per channel and so much available spectrum, and thus more devices and users == more congestion. Second, because the new spectrum is still managed in some way, it will prevent a flood of users/providers leading us back to congestion issues. Third, because it is still managed, it will prevent a “wild west” problem like we have with Wifi/Bluetooth/cordless phones, etc. OK, so they really all boil down to a single thing: managing congestion.

It will be interesting to see how the FCC implements the “reservation system” for the new spectrum and how it will affect end users.  This will also likely present some interesting problems to solve regarding the “priority” access the FCC says will be supported.  Ideally, the end user should really only notice this in a single way: better wireless connectivity and performance.  In reality, there will be growing pains that the end users will likely see.  Regardless, this is sure to usher in an expansion […]

December 18th, 2012 | FCC, mobile, regulatory, spectrum, Wifi, wireless

Android 4.2 (Jelly Bean) – Stepping Stone to the Enterprise

Consumer smartphones, including Android-based devices, are often criticized for their lack of readiness for enterprise deployment. BlackBerry previously held a strong foothold in this space due to its back-end integration and security features.

Over time many workers started to side-step IT departments and bring their own devices (BYOD) because companies did not always provide phones to employees and employees generally own more feature-rich devices. This trend continues with users bringing their Android- and iOS-based devices into the workplace. This presents numerous issues for enterprise IT departments as they try to manage data and security on devices they do not physically control. Security within IT boils down to control. In some cases it is just the control of data and/or apps, and in other cases it is control of the actual devices themselves. IT departments are capricious and each company establishes its own policies. Prior to the Android 4.2 release, the platform was partially solving the control problem with the Device Administration API. This API helps IT managers dictate and enforce policy, particularly for outside devices. However, the API itself does not do this enforcement: like any good platform, it simply provides you the means by which to do it. Enter third-party vendors and the ecosystem. Of course, this same API has its place with IT-managed devices as well as user-provided devices. Google tends to stay in tune with customer trends and demands (let’s face it, they specialize in collecting and processing data!). Enterprise is big business and big opportunity, and Apple continues to show little interest in this space. This is called market opportunity, folks, and Google/Android is working to jump on this.

Android 4.2 introduces the concept of multiple users on tablets. If […]

December 11th, 2012 | Android, Enterprise, Jelly Bean, JellyBean