Espresso is an Android test automation library maintained by Google. It has a number of advantages, for example built-in view synchronization that ensures element finding happens during idle periods in your app. (Curious to learn more about Espresso and how it compares to Appium and other frameworks? Check out my webinar, The Shifting Landscape of Mobile Automation, for a more in-depth discussion.) Most people assume Espresso is an alternative to Appium; you'd pick either Appium or Espresso, not both. That's pretty much been the case, up until now.
Part of Appium's vision is the incorporation of any good technology into Appium itself. Our goal with Appium is not to compete on fundamental automation technology, when what exists is good. Instead, our vision is of a single, unified API for all app automation, based on the official WebDriver spec. As I put it in the aforementioned webinar, here's what I think the Appium of the future looks like:
From this perspective, there's nothing odd about creating an Appium Espresso driver. And that is exactly what we've done. For some time we've been working on a very rough beta (really alpha) version of an Appium driver that runs Espresso under the hood. This means the same kind of Appium script you're used to writing, but running on Google's top-tier automation technology. This new driver is still in its infancy, and is not recommended for production. However, the beta is moving along enough that I think it's time more people had a chance to play with it. In this article, we'll see exactly how to do that!
In essence, it's pretty simple: we just change the automationName capability to Espresso (instead of, say, UiAutomator2 if you're using the current standard Android driver for Appium). By designating this automation name, Appium will know to start an Espresso session instead of something else. The Espresso driver is so new, however, that you will really need to be running Appium from source (by cloning the GitHub project and running npm install to get the latest dependencies, including the latest Espresso driver), or running the latest Appium beta (npm install -g appium@beta). At this point, simply running appium (or node . if running from source) will spin up an Appium server that knows about the most recent Espresso driver beta.
The best way to show off what you can currently do with the Espresso beta is with a comparison. The code for this article is therefore the same test (of a basic login flow) run on both UiAutomator2 and Espresso drivers. Let's take a look at the code for the standard UiAutomator2 driver first:
```java
@Test
public void testLogin_UiAutomator2() throws MalformedURLException {
    AndroidDriver driver = getDriver("UiAutomator2");
    WebDriverWait wait = new WebDriverWait(driver, 10);

    ExpectedCondition<WebElement> loginScreenReady =
        ExpectedConditions.presenceOfElementLocated(loginScreen);
    ExpectedCondition<WebElement> usernameReady =
        ExpectedConditions.presenceOfElementLocated(username);
    ExpectedCondition<WebElement> verificationReady =
        ExpectedConditions.presenceOfElementLocated(verificationTextUiAuto2);

    try {
        wait.until(loginScreenReady).click();
        wait.until(usernameReady).sendKeys("alice");
        driver.findElement(password).sendKeys("mypassword");
        driver.findElement(loginBtn).click();
        wait.until(verificationReady);
    } finally {
        driver.quit();
    }
}
```
The first thing to observe about this snippet is that we have a helper method, getDriver, which simply takes the automation name and gets us an instance of AndroidDriver. This is so we can reduce code duplication when we do the same thing for the Espresso version of the test (and to show that all the capabilities are the same, other than automationName). Next, we set up 3 expected conditions for use later on in the test. Finally, our test flow itself is 5 steps long, utilizing pre-defined locator fields like password. The steps themselves should be familiar from other articles: (1) get to the login prompt, (2) enter the username, (3) enter the password, (4) tap the log in button, and (5) verify that an element with the correct logged-in text is present.
So far, so good! Now let's take a look at the same test, but written for the Espresso driver:
```java
@Test
public void testLogin_Espresso() throws MalformedURLException {
    AndroidDriver driver = getDriver("Espresso");
    WebDriverWait wait = new WebDriverWait(driver, 10);

    ExpectedCondition<WebElement> loginScreenReady =
        ExpectedConditions.presenceOfElementLocated(loginScreen);

    try {
        wait.until(loginScreenReady).click();
        driver.findElement(username).sendKeys("alice");
        driver.findElement(password).sendKeys("mypassword");
        driver.findElement(loginBtn).click();
        driver.findElement(verificationTextEspresso);
    } finally {
        driver.quit();
    }
}
```
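Can you spot the differences? There are just two: we need only 1 explicit wait instead of 3, and we use a different locator for the verification element.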
Otherwise, the test code is exactly the same! This is great, because it means that, for the most part, changes were not required to migrate this particular test to the Espresso driver. Now, why do we only need 1 explicit wait instead of 3 as before? We needed them in the UiAutomator2 example because any time we try to find an element after a view transition, we have no guarantees about the timing of when the new view will show up, and we have to hedge our bets with an explicit wait. One of the benefits of Espresso, however, is synchronization, which as I explained before means that Espresso itself will hold off on finding any elements until it believes the app is in an idle state. What this means is that, for the most part, we don't need to worry about waits in Espresso! (We do still need the first wait because synchronization is not in effect until the app itself is fully loaded and instrumented by Espresso, and Appium doesn't know exactly when that happens).
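To make the idea of an explicit wait concrete, here is a minimal sketch of the polling loop that such a wait boils down to. It is not Selenium's actual implementation (WebDriverWait does more, such as ignoring certain exceptions while polling); the class name, the `waitUntil` helper, and the timings are all hypothetical and use only the standard library.

```java
import java.util.function.Supplier;

class ExplicitWaitSketch {

    /**
     * Polls the condition until it returns a non-null value or the timeout elapses.
     * A null result means "not ready yet", loosely mirroring how an expected
     * condition keeps being retried until the element shows up.
     */
    static <T> T waitUntil(Supplier<T> condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            T result = condition.get();
            if (result != null) {
                return result;
            }
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("Condition not met within " + timeoutMs + " ms");
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a view that only "appears" 300 ms after the test starts.
        final long appearsAt = System.currentTimeMillis() + 300;
        String found = waitUntil(
            () -> System.currentTimeMillis() >= appearsAt ? "username field" : null,
            2000, 50);
        System.out.println("Found: " + found);
    }
}
```

With UiAutomator2, the test has to run a loop like this after every view transition. With Espresso's synchronization, the driver itself holds off until the app is idle, so the test code can usually skip it.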
The second difference we mentioned was that we needed a different verification locator. What are these two locators and how do they differ? Here is how they are defined as fields on the test class:
```java
private By verificationTextEspresso = By.xpath(
    "//com.facebook.react.views.text.ReactTextView[@text='You are logged in as alice']");

private By verificationTextUiAuto2 = By.xpath(
    "//android.widget.TextView[contains(@text, 'alice')]");
```
Interestingly, the Espresso driver has access to app-internal class names. We can tell, for example, that I used React Native to develop the test application, whereas with UiAutomator2, all we know is that we have a text view of some kind. This specificity in the Espresso driver is nice, but it comes potentially at a cost of reducing the cross-platform nature of the element class names. The Appium team will be looking into ways to sort this out as we move forward with work on the Espresso driver beta. Meanwhile, we also note that if we wanted, we could have written a more general XPath query that works across both drivers (something like //*[contains(@text, 'alice')]).
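As a rough illustration of why the general query works for both drivers, the sketch below evaluates it with the JDK's built-in XPath engine against two tiny page sources. Both XML strings are hypothetical and heavily simplified (real driver output has many more nodes and attributes), and the class and method names are made up for this example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class GeneralXPathSketch {

    // Hypothetical, simplified page sources: same text, different class names.
    static final String ESPRESSO_SOURCE =
        "<hierarchy><com.facebook.react.views.text.ReactTextView"
        + " text='You are logged in as alice'/></hierarchy>";
    static final String UIAUTOMATOR2_SOURCE =
        "<hierarchy><android.widget.TextView"
        + " text='You are logged in as alice'/></hierarchy>";

    /** Returns how many nodes in the XML match the XPath query. */
    static int countMatches(String xml, String query) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                query, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or XML: " + query, e);
        }
    }

    public static void main(String[] args) {
        String general = "//*[contains(@text, 'alice')]";
        String specific = "//android.widget.TextView[contains(@text, 'alice')]";
        System.out.println("general on Espresso source: " + countMatches(ESPRESSO_SOURCE, general));
        System.out.println("general on UiAutomator2 source: " + countMatches(UIAUTOMATOR2_SOURCE, general));
        System.out.println("specific on Espresso source: " + countMatches(ESPRESSO_SOURCE, specific));
    }
}
```

The wildcard query matches in both trees because it ignores the class name entirely, whereas the class-specific query only matches the tree that uses that class name.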
Other than the view change synchronization, are there any other benefits to using the Espresso driver? In my experiments so far, it appears to be about 25% faster than the UiAutomator2 driver, though it would take a fair amount of work to ensure a "clean-room" environment for the experiment and corroborate that figure.
So, if you like living on the cutting edge of Appium and mobile automation, I encourage you to check out the Espresso driver. No doubt you will find that it doesn't work quite as you expect in one (or many) ways. That's part of why I'm directing you to try it; I think we're in a state now where we could really use some solid feedback and bug reports! So, fire it up and let us know on the Appium issue tracker if you encounter any issues. You can also follow along with the Espresso driver development on GitHub.
The full code for the comparison tests we looked at in this article is below, and as always can be found on GitHub as well.
```java
import io.appium.java_client.MobileBy;
import io.appium.java_client.android.AndroidDriver;
import java.net.MalformedURLException;
import java.net.URL;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class Edition018_Espresso_Beta {

    private String APP = "https://github.com/cloudgrey-io/the-app/releases/download/v1.5.0/TheApp-v1.5.0.apk";

    private By loginScreen = MobileBy.AccessibilityId("Login Screen");
    private By username = MobileBy.AccessibilityId("username");
    private By password = MobileBy.AccessibilityId("password");
    private By loginBtn = MobileBy.AccessibilityId("loginBtn");
    private By verificationTextEspresso = By.xpath(
        "//com.facebook.react.views.text.ReactTextView[@text='You are logged in as alice']");
    private By verificationTextUiAuto2 = By.xpath(
        "//android.widget.TextView[contains(@text, 'alice')]");

    private AndroidDriver getDriver(String automationName) throws MalformedURLException {
        DesiredCapabilities capabilities = new DesiredCapabilities();
        capabilities.setCapability("platformName", "Android");
        capabilities.setCapability("deviceName", "Android Emulator");
        capabilities.setCapability("automationName", automationName);
        capabilities.setCapability("app", APP);
        return new AndroidDriver<>(new URL("http://localhost:4723/wd/hub"), capabilities);
    }

    @Test
    public void testLogin_Espresso() throws MalformedURLException {
        AndroidDriver driver = getDriver("Espresso");
        WebDriverWait wait = new WebDriverWait(driver, 10);

        ExpectedCondition<WebElement> loginScreenReady =
            ExpectedConditions.presenceOfElementLocated(loginScreen);

        try {
            wait.until(loginScreenReady).click();
            driver.findElement(username).sendKeys("alice");
            driver.findElement(password).sendKeys("mypassword");
            driver.findElement(loginBtn).click();
            driver.findElement(verificationTextEspresso);
        } finally {
            driver.quit();
        }
    }

    @Test
    public void testLogin_UiAutomator2() throws MalformedURLException {
        AndroidDriver driver = getDriver("UiAutomator2");
        WebDriverWait wait = new WebDriverWait(driver, 10);

        ExpectedCondition<WebElement> loginScreenReady =
            ExpectedConditions.presenceOfElementLocated(loginScreen);
        ExpectedCondition<WebElement> usernameReady =
            ExpectedConditions.presenceOfElementLocated(username);
        ExpectedCondition<WebElement> verificationReady =
            ExpectedConditions.presenceOfElementLocated(verificationTextUiAuto2);

        try {
            wait.until(loginScreenReady).click();
            wait.until(usernameReady).sendKeys("alice");
            driver.findElement(password).sendKeys("mypassword");
            driver.findElement(loginBtn).click();
            wait.until(verificationReady);
        } finally {
            driver.quit();
        }
    }
}
```
Appium isn't limited to automating mobile systems! As long as there is an open way to interact with a system, a driver can be written for it and included in Appium. Using a project called Appium For Mac, Appium can automate native macOS apps.
Appium comes bundled with a macOS driver, but the actual AppiumForMac binary is not included, so we need to install it ourselves first:
AppiumForMac uses the system Accessibility API in order to automate apps. We need to give expanded permissions to AppiumForMac in order for it to work. Here are instructions from the README:
Open System Preferences > Security & Privacy. Click the Privacy tab. Click Accessibility in the left hand table. If needed, click the lock to make changes. If you do not see AppiumForMac.app in the list of apps, then drag it to the list from Finder. Check the checkbox next to AppiumForMac.app.
I'm using the latest version of macOS (10.14.2); if you are using an earlier version, specific instructions are included in the AppiumForMac README.
The example code for AppiumForMac already automates the calculator app, so let's do something different and automate the Activity Monitor instead.
In order to automate a macOS app, we only need to set the following desired capabilities:
```json
{
  "platformName": "Mac",
  "deviceName": "Mac",
  "app": "Activity Monitor"
}
```
We specify Mac as the platform and set app to the name of the installed app we want to run. Once a test has been started, an app can also be launched using the GET command, e.g.:
```java
driver.get("Calculator");
```
AppiumForMac is a little tricky, since elements can only be found using a special kind of XPath selector called "absolute AXPath". All the AXPath selectors use Accessibility API identifiers and properties. I'm including the exact rules for AXPath selectors below, but don't be afraid if they do not make sense at first; in the next section I describe some tools for finding AXPath selectors.
Here are the rules for a valid Absolute AXPath selector:
- Uses OS X Accessibility properties, e.g. AXMenuItem or AXTitle. You cannot use any property of an element besides these.
- Must begin with /AXApplication.
- Must contain at least one other node following /AXApplication.
- Does not contain "//", or use a wildcard, or specify multiple paths using |.
- Uses predicates with a single integer as an index, or one or more string comparisons using = and !=.
- May use boolean operators and or or in between multiple comparisons, but may not include both and and or in a single statement. and and or must be surrounded by spaces.
- Does not use predicate strings containing braces [] or parentheses ().
- Uses single quotes, not double quotes for attribute strings.
- Does not contain spaces except inside quotes and surrounding the and and or operators.
Any XPath selector that follows the above rules will work as an absolute AXPath selector. Be warned: if your AXPath selector breaks the rules, you won't get a special error and instead will get an ElementNotFound exception. It can be difficult to identify whether your selectors are failing because the AXPath is invalid or the element simply is not on the screen.
The README contains the following examples as guidance:
Good examples:
"/AXApplication[@AXTitle='Calculator']/AXWindow[0]/AXGroup[1]/AXButton[@AXDescription='clear']"
"/AXApplication[@AXTitle='Calculator']/AXMenuBarItems/AXMenuBarItem[@AXTitle='View']/AXMenu/AXMenuItem[@AXTitle='Scientific']"
"/AXApplication/AXMenuBarItems/AXMenuBarItem[@AXTitle='View']/AXMenu/AXMenuItem[@AXTitle='Basic' and @AXMenuItemMarkChar!='']"
Bad examples:
"//AXButton[@AXDescription='clear']"
(does not begin with /AXApplication, and contains //)
"/AXApplication[@AXTitle='Calculator']/AXWindow[0]/AXButton[@AXDescription='clear']"
(not an absolute path: missing AXGroup)
"/AXApplication[@AXTitle="Calculator"]/AXWindow[0]"
(a predicate string uses double quotes)
"/AXApplication[@AXTitle='Calculator']"
(path does not contain at least two nodes)
"/AXApplication[@AXTitle='Calculator']/AXMenuBar/AXMenuBarItems/AXMenuBarItem[@AXTitle='(Window)']"
(a predicate string contains forbidden characters)
"/AXApplication[@AXTitle='Calculator']/AXWindow[0]/AXGroup[1]/AXButton[@AXDescription ='clear']"
(a predicate contains a space before the =)
"/AXApplication[@AXTitle='Calculator']/AXWindow[position()>3]/AXGroup[1]/AXButton[@AXDescription='clear']"
(a predicate is not a simple string or integer, and specifies more than one node)
"/AXApplication/AXMenuBarItems/AXMenuBarItem[@AXTitle='View']/AXMenu/AXMenuItem[@AXTitle='Basic' and@AXMenuItemMarkChar!='']"
(leading and trailing spaces required for the boolean operator)
"/AXApplication[@AXTitle="Calculator"]/AXWindow[0]/AXButton[@AXDescription='clear' and @AXEnabled='YES' or @AXDescription='clear all']"
(predicate uses multiple kinds of boolean operators; use one or more 'and', or, use one or more 'or', but not both)
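Since a malformed AXPath only surfaces as an ElementNotFound exception, it can help to sanity-check selectors before sending them to the driver. The sketch below approximates the syntactic rules above with a regular expression. It is my own approximation, not part of AppiumForMac; the class name is hypothetical, and it assumes every node and attribute name starts with AX.

```java
import java.util.regex.Pattern;

class AXPathSyntaxCheck {

    // One comparison: @AXName='value' or @AXName!='value'. The value may not
    // contain single quotes, brackets, or parentheses.
    private static final String COMPARISON = "@AX\\w+(?:=|!=)'[^'\\[\\]()]*'";

    // A predicate is an integer index, or comparisons joined only by " and ",
    // or comparisons joined only by " or " (never a mix of both).
    private static final String PREDICATE =
        "\\[(?:\\d+"
        + "|" + COMPARISON + "(?: and " + COMPARISON + ")*"
        + "|" + COMPARISON + "(?: or " + COMPARISON + ")*)\\]";

    // A node is /AXSomething with an optional predicate.
    private static final String NODE = "/AX\\w+(?:" + PREDICATE + ")?";

    // The path must start with /AXApplication and contain at least one more node.
    private static final Pattern AXPATH =
        Pattern.compile("/AXApplication(?:" + PREDICATE + ")?(?:" + NODE + ")+");

    /** Returns true if the selector passes the syntactic checks sketched above. */
    static boolean looksValid(String axPath) {
        return AXPATH.matcher(axPath).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "/AXApplication[@AXTitle='Calculator']/AXWindow[0]/AXGroup[1]/AXButton[@AXDescription='clear']",
            "//AXButton[@AXDescription='clear']",
            "/AXApplication[@AXTitle='Calculator']",
            "/AXApplication[@AXTitle='Calculator']/AXWindow[position()>3]/AXGroup[1]"
        };
        for (String candidate : candidates) {
            System.out.println(looksValid(candidate) + "  " + candidate);
        }
    }
}
```

A check like this only covers syntax. It cannot tell that a path skips an intermediate node (the "missing AXGroup" example above passes it), so a passing selector can still fail to find anything.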
This special AXPath restriction is tricky to work with, but we have some tools at our disposal.
AppiumForMac provides a tool for generating the AXPath of any element on the screen. First, launch the AppiumForMac app manually using Finder or Launchpad. It won't display a window, but will appear in the dock. If you hold down the fn key on your keyboard for about three seconds, AppiumForMac will find the AXPath string that selects whichever element your mouse pointer is currently hovering over. It copies the AXPath selector to your clipboard, so you can paste it into your test code. You'll know it has worked when the AppiumForMac icon jumps out of the dock.
This behavior will work anywhere on your screen, because AppiumForMac can actually automate anything which is available to the Accessibility API.
(NB: Third-party keyboards may not work with this functionality.)
I found the AXPath strings generated by AppiumForMac to be pretty long. Make sure to organize your test so common parts of the string can be reused. I also removed many of the predicates since they were too specific and not necessary.
Another tool which can help with AXPath strings is the Accessibility Inspector. This tool will show the hierarchy of accessibility elements, allow you to click on an element to inspect it, and view properties on elements.
As a last resort, you can try to dump the entire view hierarchy by calling driver.getPageSource(). This worked for me on simple apps, but hung indefinitely on the Activity Monitor app, most likely because its UI is constantly updating.
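If you do want to try a source dump without risking a stuck test, one option is to run the call on a worker thread and give up after a deadline. The sketch below shows the general pattern with the standard library only. The class and method names are hypothetical, and the lambdas in main merely stand in for a real call such as driver.getPageSource().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SourceDumpTimeoutSketch {

    /** Runs the call on a worker thread; returns the fallback if it does not finish in time. */
    static String callWithTimeout(Callable<String> call, long timeoutMs, String fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable, "source-dump");
            worker.setDaemon(true); // do not keep the JVM alive if the call never returns
            return worker;
        });
        Future<String> future = executor.submit(call);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker; a blocked network call may still linger
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Source dump failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a source dump that returns quickly.
        String quick = callWithTimeout(() -> "<AXApplication/>", 1000, "timed out");
        // Stand-in for a source dump that hangs.
        String slow = callWithTimeout(() -> {
            Thread.sleep(5000);
            return "<never/>";
        }, 200, "timed out");
        System.out.println("quick: " + quick);
        System.out.println("slow: " + slow);
    }
}
```

Note that this only frees the test thread. Whether the server side recovers after an abandoned call is something you would have to verify in your own setup.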
Here's an example test which starts the Activity Monitor, switches between tabs, and performs a search:
```java
import io.appium.java_client.AppiumDriver;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.remote.DesiredCapabilities;

import java.io.IOException;
import java.net.URL;
import java.util.concurrent.TimeUnit;

public class Edition052_Automate_Mac {

    private AppiumDriver driver;

    @Before
    public void setUp() throws IOException {
        DesiredCapabilities caps = new DesiredCapabilities();
        caps.setCapability("platformName", "Mac");
        caps.setCapability("deviceName", "Mac");
        caps.setCapability("app", "Activity Monitor");
        caps.setCapability("newCommandTimeout", 300);
        driver = new AppiumDriver(new URL("http://localhost:4723/wd/hub"), caps);
        driver.manage().timeouts().implicitlyWait(5, TimeUnit.SECONDS);
    }

    @After
    public void tearDown() {
        try {
            driver.quit();
        } catch (Exception ign) {}
    }

    @Test
    public void testActivityMonitor() {
        String baseAXPath = "/AXApplication[@AXTitle='Activity Monitor']/AXWindow";
        String tabSelectorTemplate = baseAXPath + "/AXToolbar/AXGroup/AXRadioGroup/AXRadioButton[@AXTitle='%s']";

        driver.findElementByXPath(String.format(tabSelectorTemplate, "Memory")).click();
        driver.findElementByXPath(String.format(tabSelectorTemplate, "Energy")).click();
        driver.findElementByXPath(String.format(tabSelectorTemplate, "Disk")).click();
        driver.findElementByXPath(String.format(tabSelectorTemplate, "Network")).click();
        driver.findElementByXPath(String.format(tabSelectorTemplate, "CPU")).click();

        WebElement searchField = driver.findElementByXPath(baseAXPath + "/AXToolbar/AXGroup/AXTextField[@AXSubrole='AXSearchField']");
        searchField.sendKeys("Activity Monitor");

        WebElement firstRow = driver.findElementByXPath(baseAXPath + "/AXScrollArea/AXOutline/AXRow[0]/AXStaticText");
        Assert.assertEquals(" Activity Monitor", firstRow.getText());
    }
}
```
(As always, the full code sample is also up on GitHub)
AppiumForMac is rough around the edges, probably because it does not currently have a lot of community use. Check the GitHub issues if you get stuck.
One of the great things about Appium is that it can be used for far more than mobile app automation. While iOS and Android remain the most popular use case for Appium, it is also possible to use Appium to automate a host of other platforms, including Windows and Mac desktop applications. In this article, we'll take a look at how to use Appium to automate Windows desktop apps.
Automation of Windows apps is actually quite a special thing in the Appium world, since Microsoft itself supports this automation via the development of a tool called WinAppDriver. WinAppDriver is essentially an Appium-compatible automation interface, which Appium automatically includes if you specify the appropriate desired capabilities for your test.
What do you need to run automated tests of native Windows apps? Well, it goes without saying that you need a Windows PC to host and run your applications, as well as to run the Appium server that will perform the automation. (Your client script, of course, can run anywhere you like as long as it can connect to Appium running on Windows over the network.)
Here's everything that I needed to do to get Appium set up to automate Windows apps:
At this point, my system is now ready to go, with an Appium server running on the default port of 4723. All that's left is to decide what to automate!
Which capabilities are necessary for use with Windows app automation? Pretty much just the typical ones:
For an app that you haven't created, like a built-in system app, I discovered a neat trick of opening up a PowerShell window and running the following command:
```powershell
Get-StartApps
```
This will list all the installed apps along with their App IDs. The app I chose to automate for this article was the built-in Weather app, which had the ID Microsoft.BingWeather_8wekyb3d8bbwe!App.
(The WinAppDriver docs also list several other capabilities that might be useful for your purposes, so check them out too.)
Ultimately, when put into Java client form, my capabilities look like:
```java
DesiredCapabilities caps = new DesiredCapabilities();
caps.setCapability("platformName", "Windows");
caps.setCapability("platformVersion", "10");
caps.setCapability("deviceName", "WindowsPC");
caps.setCapability("app", "Microsoft.BingWeather_8wekyb3d8bbwe!App");
driver = new AppiumDriver<>(new URL("http://localhost:4723/wd/hub"), caps);
```
I found the Weather app to be quite well-instrumented with automation-ready IDs and labels. After I was able to launch a session using the capabilities above, I ran driver.getPageSource() to have a look at the source XML. I found it to be quite useful, with plenty of sections like this one, which clued me in to the most helpful available attributes:
```xml
<ListItem AcceleratorKey="" AccessKey="" AutomationId="" ClassName="ListBoxItem"
FrameworkId="XAML" HasKeyboardFocus="False" HelpText="" IsContentElement="True"
IsControlElement="True" IsEnabled="True" IsKeyboardFocusable="True"
IsOffscreen="False" IsPassword="False" IsRequiredForForm="False" ItemStatus=""
ItemType="" LocalizedControlType="list item"
Name="Wednesday 7 
high 26° 
Low 17° 
Sunny"
Orientation="None" ProcessId="14476" RuntimeId="42.657656.4.141" x="410" y="749"
width="240" height="316" IsSelected="False"
SelectionContainer="{, ListBox, 42.657656.4.71}" IsAvailable="True">
    <Custom AcceleratorKey="" AccessKey="" AutomationId=""
    ClassName="Microsoft.Msn.Weather.Controls.DailyForecastBadge" FrameworkId="XAML"
    HasKeyboardFocus="False" HelpText="" IsContentElement="True" IsControlElement="True"
    IsEnabled="True" IsKeyboardFocusable="False" IsOffscreen="False" IsPassword="False"
    IsRequiredForForm="False" ItemStatus="" ItemType="" LocalizedControlType="custom"
    Name="" Orientation="None" ProcessId="14476" RuntimeId="42.657656.4.151" x="438"
    y="777" width="168" height="193"/>
</ListItem>
```
This is the representation of a ListItem element which shows a particular day of the week along with a little weather summary. We can see that the Name attribute has most of the information I might want, including the date, the high and low temperatures, and a weather forecast. I could easily find this element via the name locator strategy, or (as I ended up doing), using xpath.
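If you wanted to make real assertions on this data, the Name attribute could be split into its parts. Here is a minimal sketch that assumes the four-line layout seen in the sample above (day, high, low, summary); the class and field names are hypothetical, and other locales or app versions may well format the string differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ForecastNameParser {

    /** Simple value holder for one day's forecast. */
    static final class Forecast {
        final String day;
        final int high;
        final int low;
        final String summary;

        Forecast(String day, int high, int low, String summary) {
            this.day = day;
            this.high = high;
            this.low = low;
            this.summary = summary;
        }
    }

    private static final Pattern NUMBER = Pattern.compile("-?\\d+");

    /** Parses a Name value laid out as: day, "high N°", "Low N°", summary (one per line). */
    static Forecast parse(String name) {
        // Split on line breaks, dropping the whitespace around them.
        String[] lines = name.trim().split("\\s*\\R\\s*");
        if (lines.length != 4) {
            throw new IllegalArgumentException("Expected 4 lines but got " + lines.length);
        }
        return new Forecast(lines[0], firstNumber(lines[1]), firstNumber(lines[2]), lines[3]);
    }

    private static int firstNumber(String text) {
        Matcher matcher = NUMBER.matcher(text);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No number in: " + text);
        }
        return Integer.parseInt(matcher.group());
    }

    public static void main(String[] args) {
        // \u00B0 is the degree sign, written as an escape to avoid source-encoding issues.
        Forecast f = parse("Wednesday 7 \nhigh 26\u00B0 \nLow 17\u00B0 \nSunny");
        System.out.println(f.day + ": " + f.summary + ", high " + f.high + ", low " + f.low);
    }
}
```

In a test, the input string would come from something like el.getAttribute("Name") on the ListItem element.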
Other non-dynamic elements had the AutomationId attribute set, and for these elements, we can use the corresponding attribute as the selector for the accessibility id locator strategy.
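To get a quick overview of which elements can be found by accessibility id, you could scan the page source for non-empty AutomationId attributes. The sketch below does this with the JDK's XML parser. The sample XML is a hypothetical, stripped-down stand-in for real WinAppDriver output (the element names and Name values are invented); only the two AutomationId values are taken from the test in this article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AutomationIdScan {

    /** Returns every non-empty AutomationId in the page source, in document order. */
    static List<String> automationIds(String pageSourceXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(pageSourceXml)));
            NodeList all = doc.getElementsByTagName("*");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                // getAttribute returns an empty string when the attribute is missing.
                String id = ((Element) all.item(i)).getAttribute("AutomationId");
                if (!id.isEmpty()) {
                    ids.add(id);
                }
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse page source", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical, simplified page source.
        String source =
            "<Window Name='Weather'>"
            + "<Text AutomationId='NameAndConditions' Name='placeholder'/>"
            + "<ListItem AutomationId='' Name='Wednesday 7'/>"
            + "<Text AutomationId='Almanac_Sunrise' Name='placeholder'/>"
            + "</Window>";
        System.out.println(automationIds(source));
    }
}
```

Elements that come back with an empty AutomationId, like the ListItem here, are the ones where falling back to the name or xpath strategies makes sense.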
Once we know how to find elements, there's really not much more we need to know to write our test! The only wrinkle I discovered is that, unlike Appium's behavior with mobile apps, WinAppDriver does not reset the state of applications when a session starts. This is both a blessing and a curse. It meant I could manually open the Weather app and click through all the prompts and ads, then trust that the state would remain the same when I launched an automated test. But it also means that you can't necessarily assume the app will always be in the same state across different systems (say in a CI environment).
Also, because I was running the test on the computer I was using, I of course had to stop work while the test was running, so as not to disturb it (WinAppDriver steals the mouse and moves it around just like a user would).
Without further ado, I present my very useless test of the Weather app, which simply finds every day which is listed in the app, clicks on each one, and then prints out the weather forecast (along with sunrise/sunset) for that particular day. Again, it's not a true "test" in the sense that I'm not making any verifications, but I am showcasing how easy it is to automate a Windows application using the Appium API.
```java
import io.appium.java_client.AppiumDriver;
import io.appium.java_client.MobileBy;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

import java.io.IOException;
import java.net.URL;

public class Edition081_Windows {

    private AppiumDriver<WebElement> driver;

    @Before
    public void setUp() throws IOException {
        DesiredCapabilities caps = new DesiredCapabilities();
        caps.setCapability("platformName", "Windows");
        caps.setCapability("platformVersion", "10");
        caps.setCapability("deviceName", "WindowsPC");
        caps.setCapability("app", "Microsoft.BingWeather_8wekyb3d8bbwe!App");
        driver = new AppiumDriver<>(new URL("http://localhost:4723/wd/hub"), caps);
    }

    @After
    public void tearDown() {
        try {
            driver.quit();
        } catch (Exception ign) {}
    }

    @Test
    public void testWeatherApp() {
        WebDriverWait wait = new WebDriverWait(driver, 10);
        wait.until(ExpectedConditions.presenceOfElementLocated(MobileBy.AccessibilityId("NameAndConditions")));

        for (WebElement el : driver.findElements(By.xpath("//ListItem[contains(@Name, 'day ')]"))) {
            el.click();
            WebElement sunrise = driver.findElement(MobileBy.AccessibilityId("Almanac_Sunrise"));
            WebElement sunset = driver.findElement(MobileBy.AccessibilityId("Almanac_Sunset"));
            System.out.println(el.getAttribute("Name"));
            System.out.println("Sunrise: " + sunrise.getAttribute("Name"));
            System.out.println("Sunset: " + sunset.getAttribute("Name"));
            System.out.println("----------------------------");
        }
    }
}
```