Designing a Cross-Platform Swipe/Scroll Helper

March 5, 2020
Jonathan Lipps

One common requirement for testing is to scroll a list until an element is found or some other condition is reached. Scrolling can be achieved in a variety of ways, but the most "low-level" way of doing so is using the Actions API. We've looked in depth at this API in the earlier edition on how to automate complex gestures. In that article, we looked at how to draw shapes! But what about something simpler, like scrolling a list?

The principles are all the same, but scrolling is such a common action that it's a good idea to build some convenience methods for it. That's what this article is about. Here are the requirements for our convenience methods:

  • We should be able to scroll a list view
  • We should be able to swipe in an arbitrary fashion, for an arbitrary amount of time
  • We should be able to define a swipe in screen-relative terms, not merely absolute pixels
  • Our actions should do the same thing on both iOS and Android
  • The methods should have equivalents with sane defaults

Let's dive into how to do this!

The base class

Because we want to share this functionality between test classes, we make sure to put it on some kind of base class. Here's the base class I'll be using (sans imports; see the full file on GitHub):

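abstract public class Edition107_Base {
    protected AppiumDriver driver;

    private By listView = MobileBy.AccessibilityId("List Demo");
    private By firstCloud = MobileBy.AccessibilityId("Altocumulus");
    private WebDriverWait wait;
    private Dimension windowSize;

    // Overridden by each test class to return its own iOS or Android driver.
    protected AppiumDriver getDriver() {
        return driver;
    }

    // Quit the session, ignoring any error raised during cleanup.
    public void tearDown() {
        try {
            getDriver().quit();
        } catch (Exception ign) {}
    }

    // Open the list view and wait for its first item to appear.
    protected void navToList() {
        wait  = new WebDriverWait(getDriver(), 10);
        wait.until(ExpectedConditions.presenceOfElementLocated(listView)).click();
        wait.until(ExpectedConditions.presenceOfElementLocated(firstCloud));
    }
}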

So far, the base class just takes care of navigating to the list view in our application as well as tearing down the app. Note that we have a function called getDriver, which will be overridden by the specific test cases and allow each test case to return its own driver (which might be an iOS driver or an Android driver). Now let's start building out our convenience methods.
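The override pattern described here can be sketched without Appium at all. In the minimal example below, every type (FakeDriver, FakeAndroidDriver, FakeIosDriver, BaseTest and the two test classes) is a hypothetical stand-in for the real Appium classes, and getDriver is declared abstract for brevity, whereas the article's base class provides a default implementation. The point is only that shared helpers call getDriver() and never need to know which platform-specific test class they are running in.

```java
// Stand-in types for illustration only; the article's real classes come
// from the Appium Java client (AppiumDriver and its platform subclasses).
class DriverOverrideSketch {
    interface FakeDriver { String platform(); }

    static class FakeAndroidDriver implements FakeDriver {
        public String platform() { return "Android"; }
    }

    static class FakeIosDriver implements FakeDriver {
        public String platform() { return "iOS"; }
    }

    // Plays the role of the shared base class: helpers only call getDriver().
    abstract static class BaseTest {
        protected abstract FakeDriver getDriver();

        // A platform check of the kind shared helpers may need.
        boolean isAndroid() {
            return getDriver() instanceof FakeAndroidDriver;
        }
    }

    // Each concrete test class supplies its own driver.
    static class AndroidTest extends BaseTest {
        private final FakeDriver driver = new FakeAndroidDriver();
        @Override protected FakeDriver getDriver() { return driver; }
    }

    static class IosTest extends BaseTest {
        private final FakeDriver driver = new FakeIosDriver();
        @Override protected FakeDriver getDriver() { return driver; }
    }

    public static void main(String[] args) {
        System.out.println("AndroidTest isAndroid: " + new AndroidTest().isAndroid());
        System.out.println("IosTest isAndroid: " + new IosTest().isAndroid());
    }
}
```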

Low-level swipe

Everything we want to do (scrolling, swiping) can be conceived of at the root as an action with basically 4 steps:

  1. Move the finger to a location above the screen
  2. Touch the finger to the screen
  3. While touching the screen, move the finger to another location, taking a certain amount of time to do so
  4. Lift the finger off the screen

Encoding these steps will be our first task and will result in the method we build on top of for everything else. Let's walk through the implementation:

protected void swipe(Point start, Point end, Duration duration)

We'll call this method 'swipe', and allow the user to define the start point, end point, and duration.

AppiumDriver d = getDriver();
boolean isAndroid = d instanceof AndroidDriver;

To perform the swipe, we need the driver instance, which we get, and we also give ourselves a handy way of referring to whether the driver is iOS or Android.

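PointerInput input = new PointerInput(Kind.TOUCH, "finger1");
Sequence swipe = new Sequence(input, 0);
// Move to the start point instantly, then touch the screen.
swipe.addAction(input.createPointerMove(Duration.ZERO, Origin.viewport(), start.x, start.y));
swipe.addAction(input.createPointerDown(MouseButton.LEFT.asArg()));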

In the section above, we begin to construct our action sequence. First of all, we define our pointer input and the sequence object which will represent the swipe. The next two actions are common to both iOS and Android. We initially move the pointer to the start point (taking no explicit time duration to do so), and then lower the pointer to touch the screen. Now, we'll take a look at some code that changes a bit depending on platform:

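if (isAndroid) {
    // Android moves take longer than requested, so shorten the duration.
    duration = duration.dividedBy(ANDROID_SCROLL_DIVISOR);
} else {
    // iOS encodes the duration as a pause before a zero-duration move.
    swipe.addAction(new Pause(input, duration));
    duration = Duration.ZERO;
}
swipe.addAction(input.createPointerMove(duration, Origin.viewport(), end.x, end.y));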

This is the part of the function responsible for moving to the end point over a certain duration. Here there is a quirk on each platform that we have to worry about. For Android, I've noticed that the actual time taken by the action is always much greater than the amount of time we specify. This may not be true across the board, but I defined a static constant ANDROID_SCROLL_DIVISOR to compensate for it (the value we're using here is 3).

On iOS, the duration of the pointer move is not actually encoded on the move action at all; it is encoded on a pause action which is inserted before the move action, which is why we are adding that action here if we're using iOS (before then setting the duration to zero for the actual move). This is a bit of an odd situation on iOS, and I wouldn't be surprised if we can find a way to make sure these APIs are equivalent in the future.


swipe.addAction(input.createPointerUp(MouseButton.LEFT.asArg()));
d.perform(Arrays.asList(swipe));

The last bit of the method is above, where we finally lift the finger from the screen and tell the driver to actually perform the sequence we've encoded.
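To recap the walkthrough, here is a minimal sketch of how the same requested duration turns into a different action sequence on each platform. It does not call Appium; it only lists the steps as strings. The class and method names (SwipePlanSketch, plan) are hypothetical, and the divisor of 3 is simply the value mentioned earlier, which may well need tuning on other devices.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

class SwipePlanSketch {
    // Same value the article reports using; an empirical constant, not a rule.
    static final int ANDROID_SCROLL_DIVISOR = 3;

    // Returns the ordered steps of one swipe as readable strings.
    static List<String> plan(boolean isAndroid, Duration duration) {
        List<String> steps = new ArrayList<>();
        steps.add("move to start in 0 ms");
        steps.add("pointer down");
        Duration moveDuration;
        if (isAndroid) {
            // Android: shrink the requested duration to offset the observed slowdown.
            moveDuration = duration.dividedBy(ANDROID_SCROLL_DIVISOR);
        } else {
            // iOS: the time goes on a pause before the move; the move itself takes zero.
            steps.add("pause " + duration.toMillis() + " ms");
            moveDuration = Duration.ZERO;
        }
        steps.add("move to end in " + moveDuration.toMillis() + " ms");
        steps.add("pointer up");
        return steps;
    }

    public static void main(String[] args) {
        Duration requested = Duration.ofMillis(600);
        System.out.println("Android: " + plan(true, requested));
        System.out.println("iOS:     " + plan(false, requested));
    }
}
```

With a requested duration of 600 ms, the Android plan has four steps and a 200 ms move, while the iOS plan has five steps: the extra one is the 600 ms pause placed before a zero-duration move.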

Swiping relatively

Our swipe method is great, but it requires absolute screen coordinates. What if we're running the same test on different devices with different screen dimensions? It would be much better if we could specify our swipe in terms relative to the height and width of the screen. Let's do that now.

// Fetch the window size once and cache it for later calls.
private Dimension getWindowSize() {
    if (windowSize == null) {
        windowSize = getDriver().manage().window().getSize();
    }
    return windowSize;
}

// Each *Pct argument is a fraction of the screen width or height (0 to 1).
protected void swipe(double startXPct, double startYPct, double endXPct, double endYPct, Duration duration) {
    Dimension size = getWindowSize();
    Point start = new Point((int)(size.width * startXPct), (int)(size.height * startYPct));
    Point end = new Point((int)(size.width * endXPct), (int)(size.height * endYPct));
    swipe(start, end, duration);
}

Here we have defined a new version of swipe that takes start and end values as percentages of the screen (expressed as fractions between 0 and 1), not in absolute terms. To make this work, we need another helper function which retrieves (and caches) the window size. Using the window size (height and width), we are able to calculate absolute coordinates for the swipe.
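To make the arithmetic concrete, here is a small standalone sketch of the same conversion. It uses java.awt.Point and java.awt.Dimension purely as stand-ins for the Selenium classes of the same names (they expose the same x, y, width and height fields), and the two screen sizes are made-up examples, not devices from this project.

```java
import java.awt.Dimension;
import java.awt.Point;

class RelativeSwipeSketch {
    // Converts screen-relative fractions (0.0 to 1.0) into absolute pixels,
    // truncating toward zero like the (int) casts in the article's code.
    static Point toAbsolute(Dimension size, double xPct, double yPct) {
        return new Point((int) (size.width * xPct), (int) (size.height * yPct));
    }

    public static void main(String[] args) {
        Dimension phone = new Dimension(1080, 1920);   // hypothetical screen sizes
        Dimension tablet = new Dimension(1600, 2560);
        for (Dimension size : new Dimension[] {phone, tablet}) {
            // The same relative swipe: from 75% down the screen up to 25%.
            Point start = toAbsolute(size, 0.5, 0.75);
            Point end = toAbsolute(size, 0.5, 0.25);
            System.out.println(size.width + "x" + size.height + ": ("
                + start.x + "," + start.y + ") -> (" + end.x + "," + end.y + ")");
        }
    }
}
```

The same relative swipe resolves to (540,1440) -> (540,480) on the 1080x1920 screen and to (800,1920) -> (800,640) on the 1600x2560 one, which is exactly why the test code no longer needs to know the device dimensions.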


Now we are in a good position to build up our scroll-related convenience methods. What is a scroll exactly? Well, it's technically a swipe where we don't care about one of the dimensions. If I'm scrolling a list down, it means I'm performing a slow upwards swipe action in reality, where the only change in motion that matters is change in the y-axis.

Conceptually, we have 4 directions we can scroll, so we can create an enum to help us define that:

public enum ScrollDirection {
    UP, DOWN, LEFT, RIGHT
}

Now, we can construct a scroll method which takes one of these directions. It's also helpful to make the scroll amount configurable. Do we want a short scroll or a long scroll? What we can do is allow the user to define this, again, in terms relative to the screen height or width. So if I say I want a scroll amount of 1, that means I should scroll the equivalent of a full screen. 0.5 would mean the equivalent of half a screen. Let's look at the implementation:

protected void scroll(ScrollDirection dir, double distance) {
    if (distance < 0 || distance > 1) {
        throw new Error("Scroll distance must be between 0 and 1");
    }
    Dimension size = getWindowSize();
    Point midPoint = new Point((int)(size.width * 0.5), (int)(size.height * 0.5));
    // Candidate endpoints: half the scroll distance on either side of the midpoint.
    int top = midPoint.y - (int)((size.height * distance) * 0.5);
    int bottom = midPoint.y + (int)((size.height * distance) * 0.5);
    int left = midPoint.x - (int)((size.width * distance) * 0.5);
    int right = midPoint.x + (int)((size.width * distance) * 0.5);
    // The finger moves opposite to the scroll direction (scrolling down swipes up).
    if (dir == ScrollDirection.UP) {
        swipe(new Point(midPoint.x, top), new Point(midPoint.x, bottom), SCROLL_DUR);
    } else if (dir == ScrollDirection.DOWN) {
        swipe(new Point(midPoint.x, bottom), new Point(midPoint.x, top), SCROLL_DUR);
    } else if (dir == ScrollDirection.LEFT) {
        swipe(new Point(left, midPoint.y), new Point(right, midPoint.y), SCROLL_DUR);
    } else {
        swipe(new Point(right, midPoint.y), new Point(left, midPoint.y), SCROLL_DUR);
    }
}

Basically, what's going on here is that we're using the distance parameter to define where the possible start and end points of the swipe we're constructing should be. Conceptually, the start point of a scroll-swipe is half the scroll distance away from the midpoint of the screen along the appropriate axis, and the end point is the same distance away on the opposite side. With all of these points defined, we can then simply start and end at the appropriate point corresponding to the direction we want to scroll!
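A worked example may help here. The sketch below reproduces only the endpoint calculation, returning the start and end points instead of performing a swipe. It again uses java.awt.Point and java.awt.Dimension as stand-ins for the Selenium types; the class name, the endpoints method and the 1000x2000 screen are hypothetical.

```java
import java.awt.Dimension;
import java.awt.Point;

class ScrollEndpointsSketch {
    enum ScrollDirection { UP, DOWN, LEFT, RIGHT }

    // Returns {start, end} for a scroll covering `distance` screens in direction `dir`.
    static Point[] endpoints(Dimension size, ScrollDirection dir, double distance) {
        if (distance < 0 || distance > 1) {
            throw new IllegalArgumentException("Scroll distance must be between 0 and 1");
        }
        int midX = size.width / 2;
        int midY = size.height / 2;
        // Half of the scroll distance lies on each side of the midpoint.
        int halfX = (int) (size.width * distance * 0.5);
        int halfY = (int) (size.height * distance * 0.5);
        Point top = new Point(midX, midY - halfY);
        Point bottom = new Point(midX, midY + halfY);
        Point left = new Point(midX - halfX, midY);
        Point right = new Point(midX + halfX, midY);
        // The finger travels opposite to the scroll direction.
        switch (dir) {
            case UP:   return new Point[] {top, bottom};
            case DOWN: return new Point[] {bottom, top};
            case LEFT: return new Point[] {left, right};
            default:   return new Point[] {right, left};
        }
    }

    public static void main(String[] args) {
        Dimension size = new Dimension(1000, 2000);   // hypothetical screen
        Point[] down = endpoints(size, ScrollDirection.DOWN, 0.5);
        System.out.println("DOWN: (" + down[0].x + "," + down[0].y + ") -> ("
            + down[1].x + "," + down[1].y + ")");
    }
}
```

On this 1000x2000 screen, scrolling DOWN by half a screen starts the finger at (500,1500) and ends it at (500,500): a 1000-pixel upward swipe centered on the midpoint.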

Nice default scrolling

In a lot of cases, we don't need to specify the distance exactly, so we can rely on a sane default:

protected void scroll(ScrollDirection dir) {
    scroll(dir, SCROLL_RATIO);
}

(In this project, the default distance is 0.8, to make sure we don't accidentally scroll even a pixel past content we might care about.)

In general, we usually care about scrolling lists down, so we can also have a super convenient method for doing that:

protected void scroll() {
    scroll(ScrollDirection.DOWN, SCROLL_RATIO);
}

That's it! That's the set of helper methods that tick all our requirement boxes above.
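These helpers also make it straightforward to cover the use case mentioned at the start of the article: scrolling until an element is found or some other condition is reached. The sketch below is one possible shape for that loop, not code from the project. The scrollUntil name, its parameters and the simulated list are all hypothetical; in a real test the condition might check whether a lookup on the driver returns any elements, and the scroll action might simply call scroll().

```java
import java.util.function.BooleanSupplier;

class ScrollUntilSketch {
    // Calls scrollOnce until isDone reports true or maxScrolls is used up.
    // Returns true if the condition was eventually met.
    static boolean scrollUntil(BooleanSupplier isDone, Runnable scrollOnce, int maxScrolls) {
        for (int i = 0; i < maxScrolls; i++) {
            if (isDone.getAsBoolean()) {
                return true;
            }
            scrollOnce.run();
        }
        // Check once more after the final scroll.
        return isDone.getAsBoolean();
    }

    public static void main(String[] args) {
        // Simulated list: the target becomes visible after three scrolls.
        int[] scrolls = {0};
        boolean found = scrollUntil(() -> scrolls[0] >= 3, () -> scrolls[0]++, 10);
        System.out.println("found=" + found + ", scrolls=" + scrolls[0]);
    }
}
```

Capping the number of scrolls keeps a test from looping forever when the element never appears, for example when the end of the list has been reached.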

Putting it all together

Let's take a look at a cross-platform example using these methods, basically scrolling down and then back up in a list:

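public void testGestures() {
    navToList();
    scroll();                    // scroll the list down using the defaults
    scroll(ScrollDirection.UP);  // then scroll back up
}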

This method can be used in either an iOS or an Android test, and I've written one for each platform. You can review the full code on GitHub.
