How to Identify Client-Side Performance Bottlenecks

June 25, 2019
By Ilya Dreytser

Ilya: So today's webinar is all about identifying mobile performance issues with Nimble, a HeadSpin company. Just to get started, we're going to have an introduction by one of the founders of the Nimble technology, and then we're going to jump into the details of measuring performance within mobile apps and how to do that at build time. We're going to talk about the product capabilities and do a full live demo so you can see for yourself. Then we're going to cover some customer case studies. And so without any further ado, I'll pass this over to Junfeng Yang. Are you with us?

Junfeng: Excellent. Hello everyone. I'm Junfeng Yang, Chief Scientist at HeadSpin. Prior to that, I was the cofounder and CEO of NimbleDroid, which was acquired by HeadSpin. Before that, I was a professor at Columbia University, worked at Microsoft, and got my PhD from Stanford University. But for all of these 20 or so years, my passion has always been building better tools to help engineers build software faster and improve the quality of that software. That's basically my career passion.

Junfeng: Ilya asked me to talk a little bit about the background of our product and how it came into existence. Around the year 2013, I had a bunch of students who graduated, got into the job market, and started working in mobile engineering. They came back and complained that the mobile tooling ecosystem is so broken: there are so many different devices out there, the frameworks are still pretty young and fast-evolving, and there just aren't good tools around to help them figure out performance bottlenecks and improve the user experience.

So, my then-PhD student Younghoon and I started working on some very cool technology to automatically diagnose performance issues using very deep systems techniques along with machine learning and AI techniques, and the resulting system gained a lot of attention in the mobile developer community. That's when we thought about creating a product out of the technology so that we could benefit millions of mobile developers. That's how the technology became the product, NimbleDroid, and then the company. That's the background. So yeah, Ilya, anything else that you wanted me to discuss?

Ilya: Thank you Junfeng, that was perfect as an introduction on where this came from. Thank you very much sir.

So Junfeng is a Columbia professor, so he’s in New York City and thank you for joining us.

So let’s talk about the actual product as it exists today. So specifically, let’s start with performance. So we’re talking about performance of mobile apps. They could be native apps. They could be hybrid apps. So we’re basically talking about Android and iOS.

And why is performance so critical for your business? Now, there aren't a ton of studies out there, but the key analyses done by folks like Amazon and Google basically agree that any time your users experience a delay in using the app, they're unhappy.

Amazon famously found that every 100 milliseconds of latency costs them about 1% in sales.

Google found that an extra 500 milliseconds of latency caused a 20% drop in user requests. From our customers, anecdotally, we've heard that anywhere from 250 to 500 milliseconds is significant enough that users will perceive it.

So if a user of your app is used to doing a certain function: think of a shopping app where you're going to search for a shirt, or a finance app where you're going to check your investment portfolio.

If I’m used to that screen appearing in a certain time and there’s a new release of the app, and all of a sudden it takes about 500 milliseconds more, that’s enough that it’s noticeable. Obviously the worse the delay, the less happy customers you have.

So why is that? You know, mobile performance is obviously complex, and as Junfeng said, his team looked at some of the reasons: he named things like the ecosystem being complex and the sheer number of different devices.

Here at Nimble, as a product company, we've looked at what our customers are telling us, and what we found is that the three biggest reasons for this complexity in understanding performance are:

  1. The developers are frequently making code changes. And so by the time you realize that there’s a performance impact, many commits have already happened. It’s very hard to track it down.
  2. The second problem is just testing for performance regressions, right? A human tester doing manual testing won't necessarily notice that an app got 300 milliseconds slower. But if you think about three subsequent releases, each with a 300 millisecond slowdown, now we're talking about nearly one second of slowdown, which clearly is perceivable by your users.
  3. And the third problem is not specific to performance, right? It's a generic thing in software development, but software development, especially these days, means you're using a lot of third-party code: SDKs and libraries developed by open source developers or even by other teams in your organization. So there's poor visibility into how the performance of those SDKs or third-party libraries impacts the overall app performance.

And we actually have some examples of each of these three things to show you. So the other thing that we struggled with a lot is just tools, right? There’s lots of tools on the market and so they all do different things and they are better suited for certain things than others. So, APM tools for example, they’re great. They give you a lot of visibility into what your real users are doing. But when it comes to actually detecting issues, it’s too late in the cycle, right? The app is already out and real users are encountering those issues. So if you think about the common term these days, it’s shift left.

If you want to shift left on performance, you can’t wait until your users are experiencing some of these issues. Then the other thing we talked about already is just accurately measuring the performance. When you’re talking about a few hundred milliseconds of difference, it’s very hard to do that reliably because if you open the same app a couple of times on whatever device you happen to have, it’s not going to take the same amount of time.

So, identifying whether performance actually got worse, versus just a glitch or a network problem, is kind of a big deal. And we can show you what that looks like.

And then of course Android and iOS have very different tooling and very different ecosystems and come with their own challenges. So there’s that element of it and there’s a lot of costs in trying to do this yourself in terms of getting devices that are optimized for this kind of testing specifically for performance and getting people who are trained to look at performance issues.

One of the things that we encounter commonly with our customers is that many organizations do not have a dedicated performance team, although some do. And it's really a question of people, time, engineering hours, and ultimately dollars: how you can easily test your apps for performance, and why you should.

So the solution that Nimble put together is meant to provide visibility and control into the performance of a given mobile app in the development stage. We basically integrate with CI so we can continuously monitor every build and quickly alert when there is a regression, as soon as some problematic code is introduced. It then becomes a lot easier to go back and understand which commit caused the problem.

And then the other thing which I'll show you today is our product's fine-grained diagnostics. They really help developers pinpoint exactly which method or which part of their code is causing the slowdown, so they know where to look or where to start improving things, and they can understand what kind of performance impact that will have.

And so the idea again is to shift left and to be able to identify issues earlier in the cycle. And that’s what we’re talking about today.

Now typically, our users use our product in a few different ways:

  1. One is a developer may want to just check their latest build against performance numbers and they can do that. The typical use case is CI integration where every time there is a pull request and a new build gets generated, it gets profiled and we can detect regressions in performance.
  2. Also, some of our customers use it as release criteria. They set up an actual budget of what performance numbers are acceptable for certain typical UI interactions, and then they can check builds against those numbers (see the sketch after this list).
  3. And then finally, in production when you’re talking about a customer reported issue, you can actually rerun it through Nimble and get some results to look at.
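
To make the budget idea concrete, here is a minimal sketch of what such a release-criteria check could look like as a plain JUnit test in Kotlin. The budget value and the measureColdStartupMs() helper are hypothetical stand-ins, not a real Nimble or Android API; in practice the measurement would come from your profiling infrastructure, and Nimble's own budget checks are configured in the product rather than hand-written like this.

```kotlin
import org.junit.Assert.assertTrue
import org.junit.Test

// Hypothetical release-criteria check: fail the build if cold startup
// exceeds an agreed budget. measureColdStartupMs() is a placeholder for
// whatever measurement your profiling tooling provides.
class ColdStartupBudgetTest {

    private val budgetMs = 2000L // e.g. Google's ~2 second cold startup guidance

    @Test
    fun coldStartupStaysWithinBudget() {
        val measuredMs = measureColdStartupMs()
        assertTrue(
            "Cold startup took ${measuredMs}ms, over the ${budgetMs}ms budget",
            measuredMs <= budgetMs
        )
    }

    // Placeholder measurement; wire this up to your real profiling data,
    // ideally averaged over several runs on identical devices.
    private fun measureColdStartupMs(): Long = 1850L
}
```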

But as I said, the typical integration that we offer is a very, very easy and seamless integration into the CI workflow. So you can see some logos here of some popular CI systems. We actually integrate with any CI system – we haven’t met one yet that we cannot integrate with. And it’s a very, very simple process. And as I explain further what the performance metrics look like, you’ll see for yourselves. It’s a very simple thing. So, that’s what we’re going to talk about today.

So I’m going to actually switch to a live demo and go through some of the scenarios so that you can see for yourself what our product looks like.

And again, for those folks who have joined us since the beginning, just a reminder, if you have a question, please use the questions tab on the right hand side – you can actually put your question in. And what we’re going to do is address them at the end of the Webinar. Please go ahead and do that if you have any questions.

Okay. So let’s talk about a few specific things that we’re going to look at today. So, identifying slowdowns in third party code is very important. It’s one of the three main challenges that we found.

And here's an example from a real customer we have: this is actually from Aaron's personal blog, but Aaron is an engineer who works at a company that is one of our customers. What he's calling out is essentially that building this adapter in an initializer could take up to two seconds. And he's pointing out that there's a very simple fix: you can just make the adapter lazily initialized, and that saves anywhere from one to two seconds of startup time.
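
As a rough illustration of that lazy-initialization pattern (not Aaron's actual code), here is what the eager and lazy versions can look like in Kotlin with Moshi. The Moshi calls are real APIs, but the Profile class and the ProfileParser wrapper are invented for the example.

```kotlin
import com.squareup.moshi.Moshi
import com.squareup.moshi.kotlin.reflect.KotlinJsonAdapterFactory

data class Profile(val name: String, val email: String) // invented example type

class ProfileParser {
    // Eager version (commented out): the adapter is built at construction
    // time, so if this object is created during app launch, the reflective
    // setup cost lands directly on your cold startup path.
    //
    // private val adapter = Moshi.Builder()
    //     .add(KotlinJsonAdapterFactory())
    //     .build()
    //     .adapter(Profile::class.java)

    // Lazy version: the adapter is only built the first time it is used,
    // moving the setup cost off the startup path entirely.
    private val adapter by lazy {
        Moshi.Builder()
            .add(KotlinJsonAdapterFactory())
            .build()
            .adapter(Profile::class.java)
    }

    fun parse(json: String): Profile? = adapter.fromJson(json)
}
```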

So let me actually show you what Nimble does and how it shows these kinds of issues so you can see for yourself. So this is the first thing we’re going to do.

So, as I mentioned, Nimble integrates with your CI system, and every time there is an upload of the app, Nimble is going to analyze that app.

So this first example I'm showing you is a shopping app. One of the main things that we look at, honestly one of the first things that any organization that starts caring about performance is going to look at, is the cold startup numbers. Cold startup is literally from when you launch the app until the app UI is usable by your users. And each of these dots represents a specific build, taken probably from their release branch.

And you can see the results in terms of how long it took to get to the app and make it usable. And other scenarios over here, like adding an item to the shopping cart, for example, or fetching some recommendations – these are custom test scenarios, which either we, Nimble, or our customers themselves can build. And we’ll talk a little bit more in detail about the test frameworks that we support.

But the idea again is that every time there's a new build, we get these performance numbers. And what that allows an organization to do is, as soon as there's a performance regression, compare that build to the previous one, which makes it very simple for developers to understand what exactly is causing the regression.

In fact, if we expand this out to, say, all 22 uploads and reload the dashboard, you'll see for yourselves that for cold startup the numbers stay fairly similar build after build. They're not exactly the same as code changes come in and out, but the regressions or fixes are very minimal.

And then suddenly cold startup jumps to 2.5 seconds and then even 2.6 seconds. So each of these two data points would have generated an alert to the team. And when the developer looks at this, it makes it very simple to understand what exactly regressed because I can simply click on this and say, compare this upload with the current.

And by the way, you can compare any two arbitrary points, but the typical use case is comparing the current or the next build – the latest build – with the previous builds. And so that’s what we see here. So the app didn’t change too much. If you look at the file size or the method count, they’re very similar. But the timing of the cold startup definitely regressed and went from 1.13 seconds to 2.5 seconds.

And so as I keep saying, Nimble makes it very simple to see why the code regressed. And there’s two ways we do that.

1) One is we actually identify specific reasons for slowdowns. So in this case we’re looking at three different things.

  • One is anything that’s on the UI thread, which is going to actually block or hang the UI.
  • Then we're also able to find hung-wait methods, which basically means a method on the UI thread is waiting for some background thread to finish (see the sketch after this list).
  • And then the last thing is basically any method which runs in the background, but for longer than 100 milliseconds.
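
To make the hung-wait category concrete, here is a minimal Kotlin sketch of the pattern; this is a generic illustration of my own, not code from the demo app. The calling thread (think: the UI thread) kicks off background work but then blocks on its result, so the UI is frozen for as long as the background task runs.

```kotlin
import java.util.concurrent.Executors

fun main() {
    val executor = Executors.newSingleThreadExecutor()

    // Background work: imagine parsing, disk I/O, or a network call.
    val future = executor.submit<String> {
        Thread.sleep(500) // simulate 500 ms of background work
        "result"
    }

    // Hung wait: the calling thread blocks here until the background task
    // finishes. The work runs "in the background", but this thread (on
    // Android, the UI thread) is frozen just the same.
    val result = future.get()

    println("Got $result, but the calling thread was blocked the whole time")
    executor.shutdown()
}
```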

So again, if you remember that the left side used to be fast and the right side is where things got slow: it's very clear that the splash activity onCreate exists in both cases and the timing is about the same, 254 versus 253 milliseconds.

But there's a new main activity onCreate which was added here, and it's very long. Specifically, if you look at the call stack for this, the main activity onCreate method calls the main activity initialize method, which itself takes 2,100 milliseconds.

And if you want more information than that, we can actually pull up the full call stack for the entire application, like this. And so again, what you see is that the UI thread on the left, for example, was very quick: 312 milliseconds. And I can just zoom in and show everything that happens as part of the UI thread. And here it's almost 2.5 seconds, and you can see that the method that takes the longest time to run is a BigInteger add method.

So what this boils down to is a developer made a code change. They submitted their build, we realized that there’s a performance problem, and by looking at the previous build versus this build, it’s very clear that this new method which was added is in fact the method that is causing the problem.

And so now the developer can go back and change this method, get rid of it, rewrite the code, whatever the appropriate remediation is, but they understand that this method and this timing information is the biggest problem here. There are other methods being called by the onCreate method, but they're very quick, so they're not causing any problems.

We can also pull up a timeline view, though it's not really necessary here. Plus, of course, this is a demo, so not a real customer example.

But the timeline view basically shows us the threads and it also shows us when certain threads are launched, and if the UI thread is waiting on something to finish, it shows that as well.

So here the UI thread is actually waiting for a couple of items from a background thread to finish, and this is causing some delays as well. So to summarize, a typical use case for Nimble is that you tie it into CI. Every time there's a new build, it gets profiled for cold startup as well as any custom tests, and that information is communicated to the team, so that as soon as there is a performance regression, the team knows about it.

Now to further explain what’s happening here, actually one of the main things that we’ve discovered is if you look at performance and you run the app a couple of times on a couple of different devices, you will get very different readings.

So another thing that Nimble offers, and it's really the foundation of our platform, is that these dots, these numbers, are very reliable run over run. And there are two reasons they're so reliable.

1) One is we have actual physical real devices in a farm which are used to install the app, run it and profile it. And those devices are all exactly the same. So if you think of a typical device farm, it may have lots of different devices and that may be useful for answering the question of how is my app doing on this device versus that device.

But for this kind of build-over-build performance testing, in order for it to make sense, you have to have exactly the same situation for this build as for that build. Otherwise the numbers won't match up on the trend graph.

And so Nimble provides that by having a device farm comprised of exactly the same devices, configured in exactly the same way, sitting on the same network pipe in the same location. So when you submit a new build, if the numbers are different, as they are in this case, there are only two possibilities:

  1. Either you have network traffic that took a different amount of time
  2. or you’ve made a change in the code and we can show you that.

And in fact if you look at the timing information on the call stack, it becomes very easy to tell whether this is network related or CPU related, because what we’re showing here is the CPU timing for how long each method takes to execute. So if these two were exactly the same, then the issue would be on the network side. If these two are drastically different, then clearly the issue is in the code itself.

So another thing that Nimble provides is this functionality for iOS. Now, as I'm sure anybody who's worked on iOS knows, Apple throws some curveballs when it comes to third-party tooling and running various things. We've basically dealt with all of that. So we also have a device farm of iOS devices, and we can do exactly the same thing on iOS that we do for Android.

So, as you can see here, it's the same sort of shopping app, but in an iOS environment, and I just want you to see that it works exactly the same. Every build comes in, it gets profiled. We can certainly do a "compare this upload with current" so you can see the difference. (I actually compared this upload with current, sorry about that.)

So you see that in this case it's an IPA that's being run, there might be some differences in this one, and the timing went down. And again, you can pull up a call stack and see what the differences are side by side. So it makes it very easy to see this on iOS as well as Android.

Recently, we’ve had some requests for web apps. And so this is a demo which we do on Kohl’s. So all of these are basically pulling up the Kohl’s website inside a mobile browser. And again, we can basically profile it for timing information and show when there are differences. And when you look at the details for this, you can see that we’re also identifying slowdowns.

In the case of a web app, they're basically JavaScript functions. In the case of Android, it's Java code, and in the case of iOS, it's either Swift or Objective-C.

But basically, by looking at the call stack and the method timing, whatever the language of each platform, we can identify these slowdowns. And more importantly, we can profile this and give you numbers that are comparable from one run to another.

Another thing that commonly happens is when a customer wants to look at improving the functionality of a certain app, they can look at the results that we provide, not by a build over build dashboard, but by actually looking at a unique set of results. And so I’d like to show you what that looks like.

This is an app that I pulled from the app store. It's not the latest, but one of the recent versions of an app called Runtastic. We grabbed the APK from the Google Play Store and analyzed it.

One of the things that you can see here is a login that takes 4.6 seconds, and arguably if I wanted to improve that time, the question naturally is: what’s causing it to be long and where should I focus my efforts?

What Nimble does, as you can see here, is start measuring when the user clicks the login button and stop when the actual logged-in screen of the app is fully populated. This is driven by an automation test, which in this case we wrote. There are 10 slowdowns, and we can look at the actual details and see the actual methods. These are on the UI thread, if you recall, because these are the ones that are actually capable of hanging the UI.

These are background methods. And so looking at this, we can say: if you want to start working on improving this particular app, and this particular flow specifically, this is the path in the code that is taken.

If you want to start improving this app's performance for login, this is where you would focus. Here's the full call stack, so you can see for yourself what's involved in making these calls.

For example, here we're pulling in some data from maps, and this is on the UI thread and takes almost a hundred milliseconds, right? So if we can offload that to a background thread, that would be great, and that would give us a performance improvement here.
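
As a generic sketch of that kind of fix (not the Runtastic code itself), here is the usual Kotlin coroutines pattern for moving work like this off the main thread. The loadMapData() and render() names are invented placeholders, and Dispatchers.Main assumes the Android coroutines artifact is on the classpath.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

// Hypothetical placeholders for the work described in the demo:
// ~100 ms of map-data loading that was observed on the UI thread.
fun loadMapData(): List<String> = listOf("tile1", "tile2")
fun render(data: List<String>) = println("rendering ${data.size} tiles")

fun CoroutineScope.showMap() = launch(Dispatchers.Main) {
    // The heavy work hops to a background dispatcher...
    val data = withContext(Dispatchers.Default) { loadMapData() }
    // ...and only the cheap UI update runs back on the main thread.
    render(data)
}
```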

But there’s some other interesting things we can see here.

We talked a little bit about third party code. And the thing about third party code is, whether it’s open source or in a large organization – maybe it’s a different team that’s creating some capabilities – traditionally, it’s hard to understand if the app performance is due to “my code” or “somebody else’s code.”

In this case, this is how easy Nimble makes it. So this is Gson, an open source library for parsing JSON. And in this scenario, which if you recall takes about four and a half seconds, almost a full second is actually spent parsing JSON using this Gson library. That's kind of a strange thing; you would expect this to be very quick.

So again, if we go back to the original slowdowns, we can actually trace it down that way. So I'm looking at some of these methods, and you can actually see where that Gson parsing is taking place.

Here's a great example of that, and I'll just go to the beginning of it up here. This is a background thread, so it's not directly going to freeze the UI thread, but obviously it has to finish processing in order for the app to be useful.

And this is what it’s doing – you can see that Gson is being called and the reason that it’s so slow, relatively speaking, is because Gson uses reflection and on Android, reflection is actually incredibly slow.
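
For context, here is a hedged sketch in Kotlin of why that matters. The User class is invented, but the Gson APIs are real: the one-line version discovers the class's fields through reflection at parse time, while a registered TypeAdapter reads the JSON directly with no reflective lookups, which is essentially the trade that faster approaches make.

```kotlin
import com.google.gson.Gson
import com.google.gson.GsonBuilder
import com.google.gson.TypeAdapter
import com.google.gson.stream.JsonReader
import com.google.gson.stream.JsonWriter

data class User(val name: String, val age: Int) // invented example type

// Reflective path: Gson inspects User's fields at runtime via reflection,
// which is the slow part on Android.
val reflective: User =
    Gson().fromJson("""{"name":"a","age":1}""", User::class.java)

// Hand-written adapter: no reflection when parsing.
object UserAdapter : TypeAdapter<User>() {
    override fun read(reader: JsonReader): User {
        var name = ""
        var age = 0
        reader.beginObject()
        while (reader.hasNext()) {
            when (reader.nextName()) {
                "name" -> name = reader.nextString()
                "age" -> age = reader.nextInt()
                else -> reader.skipValue()
            }
        }
        reader.endObject()
        return User(name, age)
    }

    override fun write(writer: JsonWriter, value: User) {
        writer.beginObject()
        writer.name("name").value(value.name)
        writer.name("age").value(value.age.toLong())
        writer.endObject()
    }
}

val nonReflective: User = GsonBuilder()
    .registerTypeAdapter(User::class.java, UserAdapter)
    .create()
    .fromJson("""{"name":"a","age":1}""", User::class.java)
```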

In fact, we have a couple things further in this webinar that explain that a little bit better. We also provide you some links that you can look at.

But in general, the Gson third-party library is causing some slowdowns here, and I can track down into the code and see that this is due to that reflection. But the thing I find most interesting is actually, if you look at the name here, com.newrelic.agent: not only are we identifying a problem in a third-party library, which is Gson, but the really interesting thing is that it's not even being introduced by the app's own functionality. It's actually because they have New Relic, and New Relic is creating some slowdowns.

I mentioned this earlier, New Relic is an APM tool. They provide a lot of useful information. So it may be the case that they’re willing to accept that slow down because it provides so much useful information and that’s great.

But what you can provide to New Relic now is a breakdown, and actually say, "Hey, we love your product, but you're causing us some problems. Is there anything you can do to speed this up?"

And New Relic would actually be wise to swap out Gson, because there are other libraries for parsing JSON, and in the call stack it's actually very clear that this happens a lot.

So here's a background thread with a little over one second of runtime. And ultimately, if you look at this, it's opening up some web server requests, which ultimately calls the New Relic agent, which ultimately calls this toJson method, which is 300 milliseconds here. And then guess what, there's another New Relic call that's 358 milliseconds over here, and yet again over here for another 200. So you see how this all adds up. And even though this is a background thread, it's actually responsible for almost a full second of CPU time.

So that’s the kind of thing that you can get from Nimble. Just to summarize, you plug it into CI or you could do it as a one-off build. We talked about both of these scenarios and what it gives you is this level of detail so that you can either work on improving the performance of a certain task or you can track it, build over build, then make sure that it does not regress.

In fact, a lot of our customers will talk about creating a baseline and then just making sure that nothing regresses or gets slower from that baseline. So there’s a couple of different ways of using this. And, that’s kind of the live demo.

I’m going to go back into the slides and cover the customer case studies and such. And again, if you have any questions or if you want to look at anything else, by all means, please put the questions into the question panel.

This is an example of identifying a third party problem in Moshi Kotlin and some explanation there.

This is the Gson example that I already showed you.

I have another case study here. This is from one of our customers who did not agree to share their name publicly, so I had to redact some of it, but this is a real world example where we’re looking at three different versions of an iOS app, very popular one.

It basically has this huge spike, which was later fixed. And so this is the interesting one. And again, if we compare the before and after, it's pretty clear to see that on the left-hand side the hot methods are 300 milliseconds, 280 milliseconds. On the right-hand side, there's some database access going on and it's just very slow.

This is the kind of thing you can see with Nimble. This is an iOS app, and this is what the full call stack looks like. And again, you can see that under UIApplicationMain, it went from 2,000 milliseconds to seven seconds. So from two seconds to seven seconds, that's a five second change, and believe me, five seconds is very perceivable to a user. I use this app; I'm sure a lot of you have too.

This is the sort of thing that we can identify with Nimble, and this is how it makes it possible for a developer to look at that and say: okay, I get it, I need to focus on changing this method. Ultimately, it's either this method or that one that are going to be my two biggest contributors. And that's what happens.

So that’s one real example. But we have some case studies here.

So Flipkart is a great customer of ours; they've been using us for over two years now. They use this exactly the way I described: it's built into CI, they analyze every build, and whenever we detect a regression for them, they use Nimble to analyze what the issue is. You can see they fixed it within two or three builds, improved it, and made sure that it went back to where it should be.

Flipkart is a big supporter because we deliver a lot of value to them. Of course, Flipkart has many customers, and in a shopping app, when things get slow, people tend to abandon their shopping carts, get frustrated, and go use something else. So that's a big thing for Flipkart.

Here's another case study. They didn't let us use their name, but this is a very large company with lots of different teams.

What we found is that when you look at how frequently that many engineers make code changes, the sooner you can find a problem, the better it is. After six months, we had basically proven this out. If you think about the engineering time that's necessary to go back and understand why the performance changed, then the more commits you have to wade through and the more individual code changes you have to analyze, the harder it's going to be.

Being able to look at it on a build-by-build level, where the change is pretty small, is much better than only being able to make version-to-version comparisons.

Here's a great case study: the New York Times did this test with us. They wanted to improve their cold startup. Google recommends two seconds or less for cold startup, and as you can see here, theirs was hovering at 4.3 seconds when they started working with us to improve it. Actually, it was even 5.6 seconds initially.

So there is a link here; I'll give you the URL at the end of the webinar, and we'll obviously provide it with our follow-up. They have a whole blog post where they explain exactly what kinds of issues they found. This actually happened almost two years ago, but it's such a great case study.

For me as a customer-facing engineer, what's interesting is that even though this happened two years ago, I go to customers today and I see the same kinds of things. And just a spoiler: Gson and reflection were a big reason for the slow startup time, which they fixed.

This blog post explains exactly why, and how they fixed it. As you just saw from the latest Runtastic version, people are still hitting those problems on Android today. So this is not something that went away: even though this particular case study is two years old, it's still entirely valid based on what I see in the field. We'll share the link at the end of the webinar and in the follow-up. I recommend you take a look at it if you do any Android development.

The other case study, a little bit newer and also a great blog post, comes from our friends at Pinterest. They actually use us on their performance team, and we are a big part of providing them a platform to run Android and iOS tests and identify performance issues early.

They have this nice blog post where they explain some of the things that they've seen with us and what kinds of issues they've been able to find and fix. And again, we'll share the link.

The three things that I showed in the webinar: these are the URLs if you want to take a look at them, and I do recommend you do. We'll share these links with you along with the recording.

At this point, just to summarize, Nimble enables our customers to accurately monitor and profile every critical user flow of Android and iOS apps, and now even web apps. We integrate pretty seamlessly into your existing workflow, which means it’s a set and forget kind of thing. You put it into CI and then it just works.

Nimble is now part of HeadSpin. We provide a lot of value in addition to the performance capabilities that HeadSpin already had – we’re two sides of the same coin, if you will.

If you haven't already attended or watched the recording, we did do a performance webinar [on May 21st, 2019] for HeadSpin. I would recommend that you watch that as well, because performance is a complex topic. Being able to attack it from the code level, the network level, and the screen-view level: there are a lot of different ways that you can get to the data, which ultimately ends up giving you a faster app that your customers will love. And that's why we're here.

So that’s it for me. I’m going to open up the floor to any questions that folks have and then we’ll call it a day.

Q&A

Q: Does it work on iOS?

A: Yeah, absolutely. A lot of the content I’ve shared today is talking about Android. Android is in some ways easier for us to work with in the sense that you can just go to the Google Play Store and download an APK and then we can analyze it.

What Nimble did a couple of years ago is actually analyze the top thousand apps for their cold startup numbers. That's still posted on the site, so it's interesting to look at.

iOS is a newer product; we launched it a little over a year ago. We absolutely support iOS apps. We have a device cloud of iOS devices, and we have customers looking at the performance of their iOS apps, again build over build. It is absolutely supported. And I will say that a lot of vendors with mobile solutions kind of struggle with iOS, so I'm happy to say that HeadSpin and Nimble have iOS very well in hand and are adding a lot of value there. Thank you for asking that question.

Q: How easy is CI integration? We use Circle CI.

A: Sure. Circle CI is great. As I mentioned at the beginning of the webinar, we support many different CI systems. And if you recall from the dashboard screen, every dot represents a build. So whatever CI system you use, ultimately the integration is a post-build step where the result of the build, either the APK or the IPA or the web app URL, is essentially sent to Nimble.

With all of our customers, whether it's Jenkins or Circle CI or one of the other tools, it's really just adding a single post-build step which grabs the result of the build and sends it over to Nimble for analysis. It's a very simple process.
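
For a flavor of what that single step can look like, here is a hypothetical Gradle (Kotlin DSL) sketch that uploads the built APK with curl. The endpoint URL, the PROFILER_API_KEY variable, and the task wiring are all invented for illustration; they are not Nimble's actual API.

```kotlin
// build.gradle.kts: hypothetical post-build upload step. The endpoint and
// auth variable are placeholders, not Nimble's real interface.
tasks.register("uploadForProfiling") {
    dependsOn("assembleRelease")
    doLast {
        val apk = layout.buildDirectory
            .file("outputs/apk/release/app-release.apk").get().asFile
        project.exec {
            commandLine(
                "curl", "--fail",
                "-H", "Authorization: Bearer ${System.getenv("PROFILER_API_KEY")}",
                "-F", "build=@${apk.absolutePath}",
                "https://profiler.example.com/v1/uploads" // placeholder endpoint
            )
        }
    }
}
```

However your CI is set up, the point is the same: it's one extra step at the end of an existing build job.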

The results, again, are provided via an alerting system or via dashboards, and there are a couple of other ways that folks get their results. CI is a very important part of driving this product because, again, it's set and forget: you integrate into CI and it just goes from there. So thank you to whoever asked; that was a great question.

Q: How many builds does Nimble get per day?

A: Oh, that's a great question. I don't have the exact answer; I'd have to look. I know that some of our biggest customers are sending us anywhere from 10 to 50 builds per day. And that actually multiplies out because, for example, you might want to run the same build on Android 5 (SDK 21) versus Android 6, or on iOS, where we support versions as old as iOS 10, so it could be iOS 10 or 11.

In other words, we have a lot of builds. The other interesting thing I should bring up (this wasn't explicitly asked) is that when starting a proof of concept or a trial with a prospect, we jointly decide what makes sense in their organization, because some organizations are producing hundreds of builds per day.

A lot of organizations will have that typical hockey stick where on Friday afternoon there’s suddenly a ton of builds and then people go home.

So, what I find is that the right place to add Nimble is either a release branch or some kind of integration branch. The number we're looking for is anywhere from, say, 2 or 3 up to maybe 10 builds per day; probably 2 to 5 builds per day is typical.

The organization that does one build a week (and we have some customers like that) is not going to catch these issues soon enough. If you think about where in your branching scheme you want to add Nimble, it should be somewhere where we're going to be able to run at least one, if not a few, scans per day.

The point is, if there is a problem or a regression, you want to be able to map it to the last few code commits while they’re still fresh in the developer’s memories. You want to do that pretty close to when the developers submit that code.

Q: This is exciting. Is there a rate card? How can I get in touch?

A: Absolutely. Get in touch with us at HeadSpin's helpdesk; that's the easiest way. I'm on the tech side, so I try to stay far, far away from contracting questions, but rates and details on how you can try it for yourself, whether through a trial or a POC or some other mechanism: we have all that, and we'd be happy to work with you.

Q: I’m a HeadSpin customer and looking at Nimble right now: will there be a dashboard integration with HeadSpin so we can get end-to-end performance visibility?

A: That is a great question. I hope that whoever this customer is also attended the previous webinar. The plan to integrate the dashboards exists; the details are being worked out right now. As I mentioned, Nimble looks at the method-level data, while HeadSpin's performance product traditionally looks at the network traffic and also what's on the screen. They are really two sides of the same coin.

So absolutely, our long-term plan is to integrate both technologies so that you can take a scenario (like the login scenario we looked at earlier), run it on Nimble, run it on HeadSpin, and then in a single UI see how long it took, which parts were attributed to the network, and which parts were attributed to method or CPU time (execution, if you will). All of that will eventually be on the same dashboard. That's a great question, and yes, we're definitely working on putting all the pieces together.

Q: Are you compatible with all test frameworks?

A: That's a great question. Nimble started out working with the native tools, so that would be Espresso or UI Automator for Android and XCTest for iOS, and we still support those. We've also added support for Appium: the iOS and Android products, and also web, can use regular open source Appium. There's nothing proprietary there; we use just the usual Appium capabilities that are in the open source version. Both of those are options.
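
For a flavor of what one of those scenario scripts looks like, here is a minimal Espresso sketch of a login flow like the one profiled earlier. The R.id.* view IDs and the credentials are placeholders from a hypothetical app; a real script would target your app's actual views.

```kotlin
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.action.ViewActions.click
import androidx.test.espresso.action.ViewActions.typeText
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withId
import org.junit.Test

// Hypothetical Espresso scenario: the R.id.* identifiers below belong to an
// imaginary app and exist only to show the shape of such a test.
class LoginScenarioTest {
    @Test
    fun loginFlow() {
        onView(withId(R.id.username)).perform(typeText("demo@example.com"))
        onView(withId(R.id.password)).perform(typeText("hunter2"))
        onView(withId(R.id.login_button)).perform(click())

        // A profiler measures from the click above until the logged-in
        // screen below is fully populated.
        onView(withId(R.id.portfolio_list)).check(matches(isDisplayed()))
    }
}
```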

Q: From your experience, what is the single largest source of poor client performance? What are the 1-2 immediate steps developers can take to reduce time to load app?

A: Okay, that's a good question. Everybody's apps are very different: we looked at a shopping app versus a run-tracking app versus another app (I can't tell you what it is, but it's a commonly used one).

What we find across the customers is when you’re looking at the method level issues, there are some things that are specific to the constraints of the language.

For example, we talked about reflection in Gson. That's a very clear problem, to be honest, that we saw years ago and that still exists today, and you can read all about it in the blog. So there are certain things that are specific to a language or a feature. We also talked about the Moshi Kotlin issue of putting that adapter together.

That's one area to look at. But the other thing that we see is actually much more generic: it's simply that people make code changes, and stuff happens. If you have the visibility of seeing the results build over build, it becomes much quicker to get to the root cause of a problem by looking at just the last few commits, instead of the hundreds of commits that go into a version right before you put it into the App Store.

Just being able to get a baseline and then improve it over time turns out to be a great driving factor. But the kinds of issues that we see are all over the map. And again, if you look at the whole coin, the method-level information from Nimble plus the network information from HeadSpin, there's a lot of interaction there, right?

Because a method could be coded in such a way that if a network call times out, the method doesn't just keep going; it sits there and waits, and now you're relying on the back end to time out the call.
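
As an assumed illustration of the fix for that pattern, here is how explicit client-side timeouts look with OkHttp, a common Android HTTP client (the webinar doesn't say which client the customer used):

```kotlin
import java.util.concurrent.TimeUnit
import okhttp3.OkHttpClient
import okhttp3.Request

// Without explicit timeouts, a request to a back end that never terminates
// the connection can leave the calling code waiting far longer than any
// user would tolerate.
val client = OkHttpClient.Builder()
    .connectTimeout(5, TimeUnit.SECONDS)
    .readTimeout(5, TimeUnit.SECONDS) // fail fast instead of hanging
    .writeTimeout(5, TimeUnit.SECONDS)
    .build()

fun fetch(url: String): String? =
    client.newCall(Request.Builder().url(url).build()).execute().use { resp ->
        if (resp.isSuccessful) resp.body?.string() else null
    }
```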

One issue we saw with a customer of ours is that the tests would report a fairly small regression, say 200 milliseconds.

But when you dig into it, it turned out that out of, say, 10 attempts, 8 were actually really fast and 2 would time out for at least 10 seconds. The backend was just not terminating the connection properly, and the way the method was coded, it would simply sit and wait.

So what it boils down to is this: you have a majority of users who are very happy, and a few users reporting huge delays that seem very unrealistic. Those are very hard to diagnose. If you can look at the network and at the method-level information together, it becomes much more likely that you'll catch that earlier, as opposed to digging into it using traditional debugging techniques.

