Good morning everyone, or good afternoon, or evening, or midnight, depending on where you might be in the world. I’m really excited to be here in what is my morning, talking about Appium drivers and Appium 2.0. So, Jason already gave a very complete introduction for me, so I’ll just add this picture. I’m even wearing the same hat today. This is, in fact, what I look like, and I do work on Appium, both the open-source side as well as helping my clients be successful with Appium and mobile automation.
So, today we’re going to have a brief introduction and tour of four drivers which are part of the Appium driver bucket that you might not have used before. We’re going to talk about the Espresso driver, the Windows Driver, the macOS driver, and the Raspberry Pi driver, which is for testing IoT devices. That’s definitely the most out-there of all we’re going to look at.
And of course, we will also touch on Appium 2.0. I’ve been working on Appium 2.0 a little bit, and I thought it was time to start circulating my ideas about what Appium 2.0 is, what’s the vision, and what are some of the technical possibilities and changes that you might expect when it’s released sometime next year.
So first of all, let’s drive a little bit beyond mobile. Let’s go beyond the typical Android and iOS drivers that you probably use with Appium and talk about some of the other possibilities. The basic idea here is that Appium uses something called the WebDriver protocol to facilitate the communication between your test script and the Appium server, which then ultimately performs the automation commands that need to happen.
The WebDriver protocol, as you can tell from the name, was originally designed for automating web browsers. And, you know, it’s already the case that Appium uses this same protocol to automate mobile applications. We simply extended it in a few ways and added some additional capabilities, but by and large, if you look at an Appium test script, it looks and reads very similar to a Selenium test script.
So the question is: why not go further? Why not go beyond mobile to other types of applications? When you get right down to it, applications are pretty much the same regardless of the platform, whether an app is running in a web browser or mobile device or on a desktop or laptop computer. They pretty much work the same way.
So that is, in fact, what Appium has done with some of these not-so-well-known drivers. We’ve extended the WebDriver protocol to support many other platforms, including desktop applications. The really nice feature of this capability of Appium is that you can write your test scripts in any language, mix and match them with any framework, and target pretty much any platform that’s out there.
Whereas, of course, if you wanted to automate Mac outside of Appium, you would be looking at hacking together your own system events accessibility scripts and things like this, which would work, but then they’re not going to have anything in common with scripts that you might already have for the web version of your app or the mobile version of your app. So you get to leverage a lot of knowledge, a lot of code, and a lot of framework infrastructure and reuse all of that stuff when you rely on Appium as your automation framework. So, that’s part of Appium’s vision. So, let’s look at some of these drivers and what they can do.
First of all, let’s talk about the Espresso driver. Now, Espresso is the latest Android automation framework from Google. It’s not very new anymore. It’s been out for years now, but it is the approach Google recommends for automating Android applications. So if you’re just an Android developer using Android Studio, and just reading the Google developer documentation, you’ll see a lot about Espresso and using Espresso to automate your app for testing purposes.
Key things to understand about Espresso. First of all, Espresso relies on a gray box testing model rather than the black box testing model that Appium usually uses with all of its other platforms.
So, the difference here is that a black box testing model means the automation comes in from the outside, treating the application as a black box, and it can only do those sorts of things with the application that a user can do like: tapping on elements or reading things off the screen to get information about the state of the application. It doesn’t have access to the application internals. Now, in gray box testing or white box testing, application internals are available to the test script. So, the test script is running in a context where it has access to the application source code and can therefore trigger different commands or get different information about the state of the application from inside of the application itself.
It’s like being able to look a little bit into the application and do some things there. Another great feature of Espresso is what’s called idle synchronization – and this helps to address a huge problem in any kind of UI automation – which is that the automation is often trying to do things when the UI is not actually in a resting state.
So, if you’ve ever encountered element-not-found exceptions, it’s probably because your test code made an assumption about the state of the application and the application hadn’t painted that view yet or hadn’t navigated to that webpage yet. In other words, the application was doing something when you tried to interact with it, and it wasn’t ready. And, a human user uses their eyes and their brain to detect when an application is doing something, either there’s a spinner or maybe in a browser the little loading progress bar is still going across the top of the page, or maybe they’re looking for a specific element, and it’s not there yet, so they’re just waiting for it to show up.
What Espresso does is it actually detects when the application is in the middle of doing something, so that it knows that the application isn’t quite ready for user interaction. So if you send in a command to Espresso while it’s waiting, it just kind of keeps that command and doesn’t execute it until the app is actually ready. So this helps add stability and robustness to tests, because they don’t experience the same kind of flaky errors due to the app state not being what you expect. There’s some complexity here, and Espresso doesn’t know about all the different ways that your app could be busy. So, sometimes you have to teach it when your app is busy so that you can take advantage of this idle synchronization, but it’s a nice feature.
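To make the idea of holding commands until the app is idle more concrete, here is a minimal toy model in Python. It is not how Espresso is implemented; the class name `IdleGate` and its methods are made up for illustration, and the "teach it when your app is busy" part is modelled as the app calling `task_started` and `task_finished`.

```python
from collections import deque


class IdleGate:
    """Toy model of idle synchronization: hold commands while the app is busy."""

    def __init__(self):
        self.busy_count = 0       # number of in-flight tasks the app has reported
        self.pending = deque()    # commands waiting for the app to become idle
        self.executed = []        # commands that have actually run, in order

    def task_started(self):
        # The app tells the gate it has begun some work (e.g. loading a view).
        self.busy_count += 1

    def task_finished(self):
        # The app reports that a piece of work is done; run held commands if idle.
        self.busy_count -= 1
        self._flush_if_idle()

    def submit(self, command):
        # A test command arrives; it only runs once nothing is in flight.
        self.pending.append(command)
        self._flush_if_idle()

    def _flush_if_idle(self):
        while self.busy_count == 0 and self.pending:
            self.executed.append(self.pending.popleft())


gate = IdleGate()
gate.task_started()              # app starts loading a view
gate.submit("tap login")         # held, because the app is busy
held = list(gate.executed)       # snapshot: nothing has run yet
gate.task_finished()             # app goes idle, so the held command runs
```

The test script never has to sleep or poll in this model; the command simply waits inside the gate until the busy count drops back to zero.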
One of the other takeaways for Espresso is that because it is tied in with the application itself, it is isolated to the app under test. So, you can’t use Espresso to automate all aspects of the device UI, the home screen and things like that. For that you would need to use the existing UIAutomator2 driver, which works by automating the accessibility layer of the device which is kind of a layer that sits above all the applications. So it’s not tied to any specific application. With Espresso, we also have access to some advanced ways of getting ahold of elements. We can even find elements that haven’t shown up on the screen yet, if they’re part of a data set that is bound to, for example, a data grid or a list view of some kind.
We can also find elements by a view tag, which is an Android specific piece of information about elements that can be added by developers. This is especially helpful for React Native, because React Native puts the test IDs into the view tag on Android. And, right now the Espresso driver is the only Appium driver that actually gives you access to the view tag of an element.
If you want to use the Espresso driver, these are the kinds of capabilities you need to worry about. Basically, they’re just like any other Android automation capabilities with Appium, except that you need to set the automation name to Espresso.
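As a rough sketch of what such a capability set could look like, the Python snippet below builds one as a plain dictionary. The helper name `espresso_capabilities`, the device name, and the app path are placeholders chosen for illustration; only the automation name of Espresso comes from the talk.

```python
import json


def espresso_capabilities(app_path, device_name="Android Emulator"):
    """Build a capability dictionary for an Espresso session (illustrative only)."""
    return {
        "platformName": "Android",
        "deviceName": device_name,        # placeholder device name
        "automationName": "Espresso",     # the one setting that selects the Espresso driver
        "app": app_path,                  # placeholder path to the app under test
    }


caps = espresso_capabilities("/path/to/app-under-test.apk")
print(json.dumps(caps, indent=2))
```

A dictionary like this would then be handed to whichever Appium client you use when starting the session.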
On this screen, we have an example of a command which is only available on the Espresso driver. If you look kind of in the middle of the screen here, we have something that reads “driver.executeScript” and we’re executing this Appium script “mobile:backdoor.”
This is the backdoor method which enables us to get inside of our application code and call methods internal to our application that a user would never see from the outside. So this is kind of what I was talking about earlier with the gray box testing model. This is how Appium gives you access to this gray box aspect of Espresso.
In this case, to figure out what we’re trying to do in the application, we can take a look at the scriptArgs variable here, which is basically just a big map of maps, and we’re basically saying: “we want to call a method that exists on the application. The method’s name is raiseToast.” If it’s appropriately named, it’s going to show some message on the screen. And, the argument that we are passing to this method is a string and it has the value “Hello, from the test script!”
If you look down at the very bottom line, you’d see what we would write in Java if we were writing code inside of the application that leverages this raiseToast method. We’d basically be calling the raiseToast method on the main application Java class and would be calling it with this string argument.
The scriptArgs variable above is just our way of encoding all the information so that we can pass it to Appium, which can then make the call within the Espresso context. I’ll show you a video of what it looks like when that code is executed.
You can see it happened really quickly, but we saw this “Hello, from the test script!” message pop up. Let me see if I can show you this again. My app very quickly pops up and then we get this message: “Hello, from the test script!” That was not triggered by any kind of UI interaction; it was triggered by directly calling that method internal to the application using the Espresso driver.
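The "map of maps" described above can be sketched in Python as follows. The key names used here (`target`, `methods`, `name`, `args`, `value`, `type`) are my assumption about the payload layout and should be checked against the Espresso driver documentation; the helper name `backdoor_args` is made up for illustration.

```python
def backdoor_args(method_name, value, value_type="String", target="application"):
    """Encode a single-argument method call as nested maps (assumed layout)."""
    return {
        "target": target,                 # which object receives the call
        "methods": [
            {
                "name": method_name,      # method to invoke inside the app
                "args": [{"value": value, "type": value_type}],
            }
        ],
    }


script_args = backdoor_args("raiseToast", "Hello, from the test script!")

# With a live Espresso session, a client would send this along the lines of:
#   driver.execute_script("mobile: backdoor", script_args)
# No session is started here, so the call is left as a comment.
```

Keeping the payload construction in a small helper like this makes it easier to reuse the same backdoor call from several tests.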
If you want to learn more about the Espresso driver, I have an article about it – specifically this back door method – here.
Okay, let’s talk about the Windows Driver. The Windows Driver is pretty awesome in that it’s actually powered by a tool provided by Microsoft itself. So, Microsoft decided to build an Appium-compatible automation tool. They call it WinAppDriver. So, the Appium Windows Driver is basically just a very small bridge to WinAppDriver, which is maintained by Microsoft. Actually, it’s up on GitHub, so you can go and check it out and ask for improvements if you want.
Some key things to understand about the Windows Driver. First of all, it requires developer mode to be turned on for the machine, or for the user you’re logged in as. It also requires Appium to be run in a console which has administrator privileges. You need to do the whole trick where you hold down Ctrl and Shift as you launch the command prompt from the Windows Start menu. Whatever you’re running Appium in, whether that’s the typical command prompt or PowerShell or some kind of bash equivalent on Windows, you need to make sure it has admin privileges.
The way that you launch applications with the Windows Driver is by an application ID. Usually with Appium you pass a path to the application on disk, but this doesn’t work for the Windows Driver. Instead, every application that’s installed and registered with the system has a particular ID, and these IDs can look pretty crazy. They’re unique IDs.
I found this command you can run in PowerShell called Get-StartApps, and it will list all of the apps that are available and each of their IDs. If you want to automate a particular application, run this command, look for the app in the list, and you can get its ID. As far as I know these IDs, especially for Windows applications, are the same across all Windows installs. It’s not like they differ from computer to computer, but they’re unique in the world of Windows applications.
The other thing to be aware of with automating Windows applications is that apps don’t have an accessibility ID on Windows. But, Microsoft implemented this attribute called “AutomationId”, which is specifically for WinAppDriver and other automation tools.
It’s pretty nice that they built this in. If you get your page source from a Windows application, look through it, and see an attribute labeled “AutomationId”, you can find the element with that attribute using Appium’s accessibility ID locator strategy, so driver.findElement(MobileBy.AccessibilityId(...)) and so on. That’s how you would use that.
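Looking through a page source for that attribute can be sketched with nothing but the Python standard library. The XML below is a made-up stand-in for a Windows page source, not real output from the weather app, and the helper name `automation_ids` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page source; a real one would come from the running session.
PAGE_SOURCE = """
<Window Name="Weather">
  <Button Name="Search" AutomationId="SearchButton" />
  <List Name="Daily forecast" AutomationId="DailyList">
    <ListItem Name="Monday" />
  </List>
</Window>
"""


def automation_ids(page_source):
    """Return every AutomationId value found in the page source, in document order."""
    root = ET.fromstring(page_source.strip())
    return [el.get("AutomationId") for el in root.iter() if el.get("AutomationId")]


ids = automation_ids(PAGE_SOURCE)
print(ids)
```

Each value found this way is a candidate to pass to the accessibility ID locator strategy in a test.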
For the Windows Driver, the capabilities look something like this: platform name Windows, platform version 10.
This only works on Windows 10. Although it can automate Windows apps which are quite old, I think it has to run on a relatively new machine. Device name is Windows PC, and then here’s an example of an app. In this case, it’s the weather app that comes with Windows 10, and this was the ID that I found using the Get-StartApps command. This is what a test of the weather application on a Windows machine could look like.
You can see it looks basically like any other Appium test. We’re finding some things by accessibility ID and some other things by XPath. This particular test is trying to find the different days which are displayed in the weather application and get some information about each of the days, for example the sunrise and the sunset of that particular day. When we run this example, it prints that information out to the console.
It’s RPA for Windows, getting some weather data from the weather app. Obviously, there are more efficient ways to get weather data. We could use an API, but this is a fun demonstration of Windows automation. This is what it looks like when it’s running.
We actually tap through each of these different days, and then you can see sunrise is down in the day details. So, that’s what we’re scraping off as we tap through each of these different days. If you want to learn more about Windows Automation and get the full code for automating that weather app on Windows, check out Appium Pro Edition 81.
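For reference, the Windows capabilities mentioned earlier could be assembled like this in Python. The app ID is deliberately left as a placeholder, since the real value is whatever Get-StartApps reports on your machine; the helper name `windows_capabilities` is made up, and the device name is written as `WindowsPC`, which is my assumption about the exact spelling.

```python
def windows_capabilities(app_id):
    """Build a capability dictionary for a Windows Driver session (illustrative only)."""
    return {
        "platformName": "Windows",
        "platformVersion": "10",
        "deviceName": "WindowsPC",   # assumed spelling of the "Windows PC" device name
        "app": app_id,               # an application ID, not a path on disk
    }


# Placeholder: substitute the ID that Get-StartApps lists for the app you want.
WEATHER_APP_ID = "<app ID reported by Get-StartApps>"

caps = windows_capabilities(WEATHER_APP_ID)
```

The only structural difference from a mobile capability set is that `app` carries an ID rather than a file path.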
Okay, let’s move on to the macOS driver. The Mac driver, as we call it, relies on the system accessibility frameworks to give control over the entire desktop. macOS comes with a system events framework that allows applications to basically see everything that’s on the screen and to interact by moving the mouse or by clicking on different things. It’s a very powerful kind of automation.
However, the downside is that apps have to be specifically given the permissions to control the computer in this way. This is really good for security purposes. You don’t want random applications controlling your computer without your knowledge, but it does mean that there’s some manual setup that’s required to make sure that Appium has the ability to automate the system in this way.
Key things to understand about the Mac driver. First of all, there is an actual Mac application, a .app that runs on a Mac called AppiumForMac, and this has to be manually downloaded and installed and put into your applications folder.
AppiumForMac must also be granted accessibility control over the system, and it’s actually not just AppiumForMac that has to be granted accessibility control, but whatever is the parent process of AppiumForMac. So, if you’re running Appium from the terminal application and you’re running a Mac test, then Appium will be running inside the terminal, and it will attempt to start AppiumForMac as a subprocess. The parent process of AppiumForMac is the terminal. That means you have to grant the terminal accessibility control over the system.
This is something you may not want to do on your own machine for security reasons, or you might want to turn it off. Because if the terminal has accessibility control over your system, then you could imagine if somebody tricks you into running a malicious script in your terminal outside of your knowledge, something might happen on your machine.
It’s important to consider your device under test to be something separate from your development and test development workstation.
You probably want to have a separate Mac Mini lying around or use MacStadium or something like this in order to run an Appium test. And then you might not care so much what happens on that system, or you can find ways to ensure that malicious things don’t happen, using networking protocols or keeping it internal to your local office network.
Okay, so the way that you open and launch applications with the Mac driver is simply using driver.get(), so instead of putting in a URL for a web browser, we’re putting in the name of an application. If I say driver.get(‘Calculator’), it will look for Calculator.app in my applications folder. So that’s pretty simple.
One other wrinkle about the Mac driver is that there’s only one locator strategy, which is XPath. While in your test code you’ll be writing things like driver.findElement(By.xpath()), the actual flavor of XPath used by the Mac driver is something called Absolute XPath, or AXPath: XPath without any relative nodes or queries or searches.
Every XPath query must be fully qualified. I’ll show you an example of what this looks like. What this means is that the selector for different elements can be quite long and quite tedious, and in some cases potentially quite brittle.
You have to do some imaginative thinking to make sure that your XPath locators are going to be robust. To figure out what the locator of an element is, there’s a special feature that AppiumForMac has: if you have it loaded, you can put the mouse pointer over any element on the macOS desktop or operating system or other applications and hold down the function key for a few seconds. It copies the XPath for that particular element to the clipboard.
Capabilities for this driver are pretty straightforward. Platform and device name should both be Mac, and if you want an app to start automatically, you can just put its name as the app capability.
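A minimal Python sketch of those capabilities, assuming the capability keys are spelled `platformName`, `deviceName`, and `app`; the helper name `mac_capabilities` is hypothetical.

```python
def mac_capabilities(app_name=None):
    """Build a capability dictionary for a Mac driver session (illustrative only)."""
    caps = {"platformName": "Mac", "deviceName": "Mac"}
    if app_name is not None:
        # Optional: name of the app to start automatically with the session.
        caps["app"] = app_name
    return caps


no_app_caps = mac_capabilities()
calculator_caps = mac_capabilities("Calculator")
```

Leaving `app` out corresponds to starting a session and then opening an application yourself with driver.get().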
Here’s what a sample test looks like for the Mac driver. I’m testing the activity monitor application in this particular example, so you can see that I have a bunch of strings here where I’m defining XPath selectors by building them up from a base accessibility or absolute XPath.
The base XPath, in this example, is AXApplication/AXTitle=Activity Monitor/AXWindow. And, the AXTitle=Activity Monitor portion is extremely important, because that is what ensures that all of my other selectors, which I build off of this base string, take place within my application and not within some other application.
Then I’ve got something called a tabSelectorTemplate, which helps me to select different tabs in the application, so the different tabs like: memory, energy, disk, network, or CPU. You can see again how I’m using this XPath filter, making sure that the accessibility title of that particular tab is memory, or energy, or disk, or network, or CPU. This is a pretty reliable and robust way of finding these elements, time and time again, without relying completely on their position within the application.
All I’m doing in this little test is tapping through the various bits of the activity monitor application. Eventually, I am typing something into a search field and then getting the text of that search field and asserting that is what I ultimately typed into it. So, it doesn’t really do anything particularly useful, but you could see how you could potentially automate applications in a useful way using these commands.
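Building selectors up from a base string and a template might look like the Python sketch below. The bracketed attribute syntax and the intermediate path segments (`AXToolbar`, `AXGroup`, and so on) are my assumptions about what the slide contained, so treat the exact strings as illustrative rather than copied from the real test.

```python
# Base path: every selector is anchored to the Activity Monitor window.
BASE_AXPATH = "/AXApplication[@AXTitle='Activity Monitor']/AXWindow"

# Template for a tab button; the path below the window is an assumed layout.
TAB_SELECTOR_TEMPLATE = (
    BASE_AXPATH + "/AXToolbar/AXGroup/AXRadioGroup/AXRadioButton[@AXTitle='{tab}']"
)

TABS = ["CPU", "Memory", "Energy", "Disk", "Network"]


def tab_selector(tab):
    """Fill the template with a tab title to get a fully qualified selector."""
    return TAB_SELECTOR_TEMPLATE.format(tab=tab)


selectors = {tab: tab_selector(tab) for tab in TABS}
for tab, selector in selectors.items():
    print(tab, "->", selector)
```

Filtering on the title rather than on position is what keeps these long selectors from breaking when the order of elements changes.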
Here’s an example. Tab through all the different aspects of the activity monitor and then type into the search field. This was us controlling a Mac application using Appium. If you want to learn more about this, check out Appium Pro’s Edition 52.
Okay, let’s move on to our last driver of the day: the Raspberry Pi driver. So, this is a special one. The idea behind this driver, and other similar drivers, is that applications don’t actually always have user interfaces. What about IoT devices?
These are physical things that take some kind of sensory input, whether it’s electrical or pressure or temperature or moisture or anything like that, and use that input to send data to something online or control something in your house using a feedback system. So, these are not traditional devices with user interfaces. But, of course, they could still be tested, and they could still be tested in an automated fashion. So, why not use Appium to do this?
This was the experiment that I tried to do for myself when I was developing my presentation for Appium Conference earlier this year in Bengaluru, India. I took this circuit playground express from Adafruit Industries.
This is basically a little hackable circuit board that has a bunch of sensors built into it and can do different things like turn on or off a light or output some sound. I thought: “okay, I want to figure out how to test IoT devices with Appium, but first of all, I need to have an IoT device to test. So I need to build some kind of IoT device to begin with.”
So, I built this little drum machine. When I tap those different buttons, the light changes on the circuit playground express. Also, you can’t hear it on the webinar, but some sound is emitted from the headphone jack. Actually, it’s not a jack. It’s one of these little electronic pads that we can connect a headphone to with some wires and things like that. A little drum sample is played when I hit each of these buttons. I’ve got a kick, a snare, a hi-hat, and a tom that are attached to each of these different buttons. It’s a little drum machine that I developed.
Then I asked myself: “okay, I’ve got a drum machine. Now how would I want to test this?” There are a bunch of ways to think about testing this. There’s software running on the circuit board, which I wrote, so I could just write some other software that would send the same commands as the real software does whenever it waits for someone to tap the button.
That would be kind of like one layer of testing, but I wanted to go a layer more real. What I thought is that each of these buttons works by changing the electrical signal that is going into the circuit playground board. Whenever I tap one of the buttons, with this particular design, a circuit is broken, and the circuit playground express software is listening for that circuit to be broken. And when it’s broken, it emits the appropriate sound and changes the light.
I thought, “I don’t need to physically tap a button to make this happen. I could just send electrical signals into the same points on the circuit board. Whether that signal counts as a high or a low signal will then cause the circuit playground software to believe that a button has been pressed.” This is like a functional approach to testing a circuit board: sending in or removing real electrical signals from the physical ports on this board.
So to do that, I got something called a Raspberry Pi. A Raspberry Pi is a little computer that’s just printed on a circuit board. It’s pretty awesome. I recommend playing around with them. The important part about this Raspberry Pi is it has something called the GPIO header. It’s up at the top here. It’s in two different rows of pins and GPIO stands for “general purpose input output”.
These are pins that the Raspberry Pi can use to send electrical signals through. What I wanted to do was take this Raspberry Pi and connect wires from the little pins here to the ports on the circuit playground circuit board, and then have the ability to send electrical signals from the Raspberry Pi, mimicking me physically pressing one of the drum machine’s buttons.
What I was able to do then is develop an Appium driver for the Raspberry Pi that lets me write an Appium test to say when these different pins on the Raspberry Pi should emit or stop emitting electrical signals. By doing that I was able to construct an Appium test script that drove the circuit playground IoT application of a drum machine without actually having to tap the drum buttons myself, but also without mocking the connection. I’m still sending electrical signals the exact same way that tapping the button would.
These are the elements that correspond to the different I/O ports on the circuit playground express circuit board, and I’m making electrical signals either go into or not go into those ports by using the sendKeys command. I can send a 0, which means no signal, or I can send a 1, which means signal.
That’s how this works. Let’s see. Now, what’s happening in the other terminal window is that I’m starting up the Appium server on the Raspberry Pi itself.
And, here’s a video of how it’s all connected. I’ve got the wires coming out of the Raspberry Pi connected on top of the other wires coming from the buttons because I’m not modifying my app under test, and now I’m running my test script.
You can see that as the commands are being called, the different drum lights are being activated. We can’t hear it, but the different drum sounds are being emitted as well. So that was how we ran an Appium test of an IoT device using electrical signals, a Raspberry Pi, without modifying the drum machine device under test. That was pretty fun.
The key thing to understand about this Raspberry Pi driver is that it’s not an official driver yet. It basically runs in standalone mode. It runs on the Raspberry Pi itself. So you have to install an operating system on the Raspberry Pi and clone and build this driver, which is written in Node.js. You have to have Node.js installed as well.
The idea of an app is a bit different in that there’s no software application, instead what we’re doing is defining a set of electrical inputs and outputs. So the app capability is actually just a JSON object, which defines which pins have which names and what their initial states should be – whether they should be high or low.
Then we can use the driver to find an element by ID, that is, a pin by the name that we gave it in the app capability, and then all we do is send a 0 or 1 to that pin using sendKeys to set the state of the pin to low or high. So, it’s pretty simple. There’s not a whole lot you can do with electrical signals like this.
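To illustrate the idea of an app capability that is just a JSON description of pins, here is a small Python sketch. The schema (a `pins` map with `pin` and `init` fields), the pin numbers, and the initial states are all hypothetical; the real driver’s format may well differ. Only the pin names and the 0 or 1 convention come from the talk.

```python
import json


def drum_machine_app():
    """Describe named pins and their initial states (hypothetical schema)."""
    return {
        "pins": {
            "kick":  {"pin": 11, "init": 1},   # pin numbers are made up
            "snare": {"pin": 13, "init": 1},
            "hihat": {"pin": 15, "init": 1},
            "tom":   {"pin": 16, "init": 1},
        }
    }


def level_keys(high):
    """Text to send to a pin element: "1" for signal, "0" for no signal."""
    return "1" if high else "0"


app_capability = json.dumps(drum_machine_app())

# In a test, a pin would be found by the name given above and then sent
# level_keys(True) or level_keys(False) through the client's send-keys call.
```

Because the whole "app" is data, pointing the same test at a different circuit is mostly a matter of changing this description.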
If you want to read the full scoop on this, I’ve got two editions on Appium Pro starting at number 74.
Okay. Now let us wrap up by discussing Appium 2.0, and then we can take some questions. The idea behind Appium 2.0 in my mind – the big picture – is that Appium goes from being a tool and automation library to becoming a platform for a whole automation ecosystem that spans devices and platforms and frameworks and everything else.
So, the idea is that moving forward, Appium itself is going to be this one small piece of the puzzle and then we’ll have many different drivers and many different plugins – all of which have their own independent existence and development trajectories and everything else, but they are integrated with Appium as and when you need them to be. So they’re not all bundled together into one big package the way that they are now.
So, with Appium 2.0, we’ll have a whole new set of command line instructions you can run. For example, you’ll have the set of Appium driver commands. So you’ll be able to list which drivers are installed or available to install. You’ll be able to install a specific driver from the standard repository or from anywhere on npm, or anywhere on GitHub, or anywhere on your local file system.
You’ll be able to uninstall drivers. You’ll be able to update them. So, you’ll be able to take just one driver, just the XCUITest driver, and say: okay Appium, update this driver to the latest for me, but the UIAutomator2 driver, I like it how it is. You know, I don’t want to upgrade to the next version of that, because it has some breaking changes, and I’m not ready for them. So just update this one driver for me.
The idea here is that Appium’s different drivers are for completely unrelated platforms. They have completely unrelated development cycles and different technologies that are used in their development.
They shouldn’t really be bundled together in a way which combines a certain version of the XCUITest driver with a certain version of the UIAutomator2 driver, the way it is now in Appium, because those are just unrelated things. So they should be able to vary with respect to one another. We also will have a command to verify your driver manifest to make sure that all the drivers that Appium thinks are installed are actually installed and so on.
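The point about drivers varying independently can be shown with a toy Python model of a driver manifest. This is not Appium’s actual manifest format or code; the class, its methods, and the version numbers are invented purely to show one driver being updated while another stays put.

```python
class DriverManifest:
    """Toy record of installed drivers and their versions (not Appium's real manifest)."""

    def __init__(self):
        self.installed = {}

    def install(self, name, version):
        self.installed[name] = version

    def update(self, name, latest_version):
        # Only the named driver changes; every other entry is left untouched.
        if name not in self.installed:
            raise KeyError(f"driver {name!r} is not installed")
        self.installed[name] = latest_version

    def uninstall(self, name):
        self.installed.pop(name, None)


manifest = DriverManifest()
manifest.install("xcuitest", "2.0.0")        # version numbers are made up
manifest.install("uiautomator2", "1.4.0")
manifest.update("xcuitest", "3.0.0")         # take the new release for one driver only
print(manifest.installed)
```

With bundled drivers, the equivalent of `update` would have to move every entry at once, which is exactly the coupling the new model is meant to remove.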
Let’s talk a little bit more about the new driver model for Appium 2.0. The basic idea here is that not just the Appium team, but anybody can create and publish a driver that anybody else can use. So right now, if you want to create a driver for your team to use you can do that, but you have to modify the Appium source code to get it plugged in, or you have to convince the Appium team that your driver is useful enough to be considered one of the standard drivers that’s a part of Appium. So the Appium team has been hand importing and coding connections to those drivers into the Appium source code, and that’s not going to happen in the future.
Instead, the Appium team will just maintain a list of supported drivers and what their names are. So, the official drivers will just be a list that everybody will be able to install by name, but that doesn’t keep you from installing anything else – any unsupported or unofficial driver as well. So as I said before, any driver that you create would also be installable from npm, from GitHub, from any other git repository you have access to, or even the local file system, if you just have a driver that you’re playing with or that you’re distributing around your team.
Also, as I said earlier, drivers will have their own independent versioning unrelated to Appium versioning. So, the main drivers right now are versioned separately. So, the XCUITest driver has its own version and the UIAutomator2 driver has its own version. But, when we release a new version of Appium, one particular version of each of those drivers is bundled in, and that’s what you get when you install Appium. And, this is not what we want to happen.
So that’s why Appium itself will not come with any drivers by default. So whenever you install Appium, the next thing you’ll do is appium driver install xcuitest, and then you’ll get that driver, and if you need more you can install more. But that’s the basic idea.
In addition to drivers, Appium 2.0 will also have the concept of plugins. I’ll say a little bit about plugins here in a moment, but they’ll basically have all of the same commands available as drivers do, so anybody will be able to write and publish plugins that will be installable using appium plugin install and friends.
So what are plugins? Plugins basically let you add arbitrary functionality before and/or after Appium commands. So if you think about the findElement command, you could write a plugin so that when the findElement command comes into Appium, instead of sending that command on to the typical driver, you do something else with it.
So for example, a plugin that I developed a while back allows you to search for an element by a semantic label, and then a machine learning model is employed to figure out which elements might match that label based on their appearance. So this would be an example of a plugin that would just kind of pay attention to the findElement command and add additional functionality before or after that command.
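The "before and/or after a command" idea can be sketched in Python as a chain of wrappers around a driver method. Since the plugin interface was not finalized at the time of the talk, everything here (`FakeDriver`, `LoggingPlugin`, `run_command`, and the `handle` signature) is a hypothetical illustration of the pattern, not Appium’s interface.

```python
from functools import partial


class FakeDriver:
    """Stand-in for a real driver; returns a string instead of a real element."""

    def find_element(self, strategy, selector):
        return f"element({strategy}={selector})"


class LoggingPlugin:
    """Hypothetical plugin that does something before and after a command."""

    def __init__(self):
        self.log = []

    def handle(self, next_handler, command, *args):
        self.log.append(f"before {command}")
        result = next_handler(*args)          # pass the command along the chain
        self.log.append(f"after {command}")
        return result


def run_command(driver, plugins, command, *args):
    """Wrap the driver's method with each active plugin, outermost first."""
    handler = getattr(driver, command)
    for plugin in reversed(plugins):
        handler = partial(plugin.handle, handler, command)
    return handler(*args)


plugin = LoggingPlugin()
result = run_command(FakeDriver(), [plugin], "find_element", "id", "login")
print(result, plugin.log)
```

A plugin that wanted to replace the command entirely, as in the semantic-label example, would simply return its own result without calling `next_handler`.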
Plugins can also modify the Appium server itself. So if you have an idea for a totally new commands that you would like Appium to have and you want to write a little client to access those commands that you distribute among your team. You can also add routes and handlers to Appium itself.
As I mentioned before, plugins will be distributable and insoluble and all the same ways as drivers are. So, if you have it up on GitHub anybody will be able to install, or if you have it up on npm, anybody will be able to install it. So basically you can think about plugins as tiny pieces of functionality, which can be small or they can be large, but they can be shared with others and you can mix and match them with a variety of other plug-in.
So you can choose which plugins are active at any given point in time in your Appium install. Of course, just like drivers, plugins will need to implement a very specific interface that Appium documents in order for them to be compatible with Appium. And there’s a lot more to think through with plugins. This is still very early days.
So once we actually have a beta out, I certainly hope that a lot of people will jump on the opportunity to develop plugins so we can figure out what’s not working, whether there are any security concerns we need to be worried about, and whether there are additional types of plugins or plugin features that need to be enabled in the plugin interface to help people build what they want to build. There’s a lot that needs to be worked on, but this in my mind is one of the most exciting things coming up in Appium 2.0.
Of course, this is the first time we’ll have bumped Appium’s major version in seven or five or six years. I can’t remember. I think it was 2014 when we released Appium 1.0. So yeah, it’ll be five or six years.
So we’ll also squeeze in any breaking changes that we need to. It will be easier to make breaking changes in the future and also respect semantic versioning, because each of the different drivers that the Appium team maintains will be on its own completely independent versioning scheme. So we can upgrade the XCUITest driver from version 2 to version 3 without worrying that it will pull in breaking changes for people installing the next patch version of Appium or something like that. Breaking changes will be isolated to the particular drivers or plugins or server that they correspond to, and won’t kind of leak into people using them from other projects.
So I’ll give you a little demo of Appium 2.0. I don’t have a whole lot already done in it, but I’ll show you what there is. So, I’m in my Appium development directory, so instead of typing “appium”, I’m going to type “node .” to actually run the code here.
So the first thing I’ll do is run driver list, and that will show me all the available drivers. I actually don’t have any of the available drivers installed right now, so if I just start Appium, I get this message: “No drivers have been installed. Use the ‘appium driver’ command to install the ones you want to use.” Okay. Well, that’s what I’ll do.
So, the first thing I’m going to do is install one of the drivers that were listed as available. I’m going to install the UIAutomator2 driver. So, here Appium will attempt to find it, and it’s installing it silently here under the hood using npm. This takes a moment based on my network, and of course with npm, there are lots of files to download and unpack.
I was very proud of this little loading animation with the dots here. Inordinately proud. I’m going to make it nyan cat at some point here.
Okay, so it says driver UIAutomator2 successfully installed at this particular version, and now it gives me the UI automation name, or sorry, the automation name that I need to use in my capabilities to trigger this driver to be used for a particular session, and it also gives me the platform names that this driver supports, which of course is just Android. So now if I run the Appium server, I get a message that I have just one driver available, and it tells me what its automation name is.
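As a rough illustration of how the automation name and platform name in your capabilities could pick out an installed driver, here is a small Python sketch. The matching logic is an assumption about the general idea, not Appium's actual implementation. InstalledDriver and pick_driver are hypothetical names. The capability keys “automationName” and “platformName” are the ones Appium test scripts already use.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class InstalledDriver:
    """What the server would need to remember about one installed driver (hypothetical shape)."""
    name: str
    automation_name: str
    platform_names: Tuple[str, ...]


def pick_driver(installed: Iterable[InstalledDriver], caps: Dict[str, str]) -> InstalledDriver:
    """Return the first driver whose automation name and platform match the capabilities."""
    wanted_automation = caps.get("automationName", "").lower()
    wanted_platform = caps.get("platformName", "").lower()
    for driver in installed:
        platforms = [p.lower() for p in driver.platform_names]
        if driver.automation_name.lower() == wanted_automation and wanted_platform in platforms:
            return driver
    raise LookupError(
        f"no installed driver for automationName={caps.get('automationName')!r}, "
        f"platformName={caps.get('platformName')!r}"
    )


# The one driver installed in the demo so far.
installed = [InstalledDriver("uiautomator2", "UiAutomator2", ("Android",))]

caps = {"platformName": "Android", "automationName": "UiAutomator2"}
print(pick_driver(installed, caps).name)
```

A session request asking for a platform that no installed driver supports would raise LookupError here, which mirrors the point that Appium can only start sessions for drivers you have installed.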
So I can install drivers by their name. That would be one of the officially supported Appium drivers.
Oops, I’ll run driver list to show you again that the UIAutomator2 driver has been installed. The other drivers are not yet installed.
So, what I’ll do next is install a driver from GitHub. So, I know that I have the Safari driver. I’ll say “driver install --source=github” and then I’ll just put in the GitHub organization and project name here: appium/appium-safari-driver. This is a driver for Safari on Mac, and it’s attempting to find it, and it’s now installing it from GitHub.
So again, this will take a little bit of time. Okay. Now, we have the Safari driver successfully installed, and you can see that it actually potentially has multiple platforms that it supports. So if I list again, I can even just list the ones that are installed.
And there it is. And we see that the Safari driver was actually cloned from this GitHub URL instead of installed via npm. So, this is all that I can really demonstrate right now. Basically all I’ve got working in a demoable fashion is this driver management system. If you’re wondering how Appium keeps track of these different drivers, well, it now creates a kind of Appium home directory, which defaults to “.appium” in your home directory.
And, inside there’s something called “drivers.yaml”, so we can take a look at this, and we can see that this is the little file that Appium has created that has a list of all the drivers that are installed and where they were installed from and their versions and things like this. So this helps with the management of Appium.
So you could imagine that if you have an Appium setup that you’d like to replicate on dozens of different machines, you could set it up on one machine, take the “drivers.yaml” file, make sure that it gets in the right place on all the different machines you want to install Appium on, and then just have Appium kind of rehydrate the installed drivers based on that YAML file. And of course, plugins will be specified here as well. So it’ll be a nice way to kind of have your Appium configuration in code rather than in a set of commands that you need to run.
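The rehydration idea could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the manifest layout (a mapping from driver name to “source” and “spec” fields) is hypothetical and is not the real “drivers.yaml” schema. The manifest is written as a plain dictionary to avoid needing a YAML parser. The command-line forms follow what was typed in the demo and may differ in a released Appium 2.0.

```python
from typing import Dict, List

# Hypothetical stand-in for the parsed contents of a driver manifest.
# The real drivers.yaml schema is not shown in the talk.
manifest: Dict[str, Dict[str, str]] = {
    "uiautomator2": {"source": "official"},
    "safari": {"source": "github", "spec": "appium/appium-safari-driver"},
}


def install_commands(entries: Dict[str, Dict[str, str]]) -> List[List[str]]:
    """Build one install command (as an argument list) per manifest entry."""
    commands: List[List[str]] = []
    for name, entry in sorted(entries.items()):
        source = entry.get("source", "official")
        if source == "official":
            # Officially supported drivers are installed by name.
            commands.append(["appium", "driver", "install", name])
        else:
            # Other sources need the source type plus a location, as in the demo.
            commands.append(["appium", "driver", "install", f"--source={source}", entry["spec"]])
    return commands


for command in install_commands(manifest):
    print(" ".join(command))
```

The sketch only prints the commands. On a fresh machine you could review them and then run each one, for example with subprocess.run from the standard library.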
Q: So basically, you don’t need to use explicit waits for Espresso?
A: Yeah, great question. I mean that’s basically the idea. In a lot of cases, I found that to be true. There are some cases where Espresso doesn’t really know that your app is doing something. By default, it knows if a view is transitioning to another view. I think even maybe if a network request is happening, it decides that the application is in the middle of something. But in some cases it might not know that your application is in the middle of something. So, in those cases, I’ve still seen a few examples where it’s safest to use an explicit wait, but I typically just write my test without explicit waits and then see if something breaks.
In the case where something breaks, I then add an explicit wait around it. There’s also the ability to actually write custom Java code that will teach Espresso about your application and have your application let Espresso know when it’s doing something. So, that is still possible to do. It’s just not part of Appium itself. It’s something you have to do in your source code.
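For readers unfamiliar with the term, an explicit wait just polls for a condition until it holds or a timeout expires. Here is a minimal, library-free Python sketch of that pattern. The wait_until helper and the fake condition are hypothetical. In a real test you would more likely use your client library’s own wait utility, such as WebDriverWait in Selenium’s Python bindings.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(condition: Callable[[], Optional[T]],
               timeout: float = 10.0,
               interval: float = 0.25) -> T:
    """Call condition repeatedly until it returns something truthy, or raise on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Fake condition standing in for "the element has appeared":
# it only succeeds on the third attempt.
attempts = {"count": 0}


def element_appeared() -> Optional[str]:
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


print(wait_until(element_appeared, timeout=2.0, interval=0.01))
```

Wrapping only the step that broke in a call like this keeps the rest of the test free of waits, which matches the approach described in the answer.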
Q: Can I run the backdoor Espresso feature on any APK resident on an Android device even if I have no direct access to the source? I’m thinking about various preloaded APKs that are a part of the OS.
A: Yeah, great question. Unfortunately, I don’t believe this is possible. The reason is that you have to have a debug APK so it can’t be just an APK that’s already on the device or downloaded from the app store. It has to be one that you built in debug mode, otherwise Espresso will not be able to automate it. This is one of the downsides. It’s kind of like the power that you get of looking inside the application is also a really big security risk. So, Espresso only allows you to do it to applications that you have debug access over. Whereas the other driver, the UIAutomator2 driver, can’t look inside the application. It basically can do what a user can do, so there is no security concern. So it has basically full accessibility access over all of the applications and aspects of the device.
Q: For IoT, is there gray box testing available in addition to the white box GPIO testing?
A: I wouldn’t call GPIO white-box testing, to be honest. I would say that white box testing would be like, if I took my drum kit software and I created test code that just lived inside of my drum kit software and triggered drum noises without any kind of external input driving that, so kind of as if I were writing a unit test. That’s more of what I would consider a white box test. I would consider what we’re doing with the GPIO stuff to be basically black box testing. We’re just sending in electrical signals, which is all that the buttons themselves do, so we’re basically mimicking a functional interface to the device.
The only way we could make it more end-to-end is by building a robot that actually taps the different buttons the way a human would. But what we’re doing, getting one layer beneath the physical button tap and sending electrical signals, in my view is exactly the same type of black box testing that Appium does with XCUITest or UIAutomator2, where it’s not physically tapping the screen, but it is getting kind of one layer below the glass, so to speak, and synthesizing that tap event as if a user were actually tapping it.
Q: What are the main benefits of using UIAutomator2 instead of Espresso?
A: The main benefit of UIAutomator2 is that you’re not limited to your own app that you have a debug APK for. You can automate any application, you can automate apps you don’t own or didn’t develop, and you can automate various aspects of the device itself: buttons and other things like that that just aren’t really part of Espresso, because Espresso’s designed for specific automation of a specific app.
I think the Appium team is investigating the ability to combine both Espresso and UIAutomator2 into one session so that you can kind of have the best of both worlds. So that when it comes to your application, you have this kind of gray box access to it and all the benefits of Espresso, but you also then have the ability to tap around other applications or other parts of the device. So, this isn’t available at the moment, but it’s something we’ve been exploring and it looks like it might technically be possible. So that’s kind of in an experimental phase.
Q: Do you need admin privilege to work with Appium 2.0?
A: No. The permissions you need for Appium won’t change. I mean, some of the specific drivers have their own permission model. So for example, like we talked about, the Windows Driver requires admin permissions, but that’s not Appium itself, and all the driver installation and management happens in your own user directories. So, unless you choose to put all of the drivers and plugins somewhere that your user doesn’t have access to, Appium shouldn’t require elevated privileges to do all of its basic operations.