iOS-Specific Touch Action Methods

August 15, 2018
Jonathan Lipps

In a previous edition of Appium Pro, we saw how to use the W3C Actions API to automate complex gestures, including drawing some (amazing) stick figure art. The same API can of course perform simple gestures like swiping, pinching, and zooming. However, there's occasionally a downside to using these general methods: they bypass the officially-recognized APIs for standard complex actions provided by the underlying mobile automation tool (XCUITest, in the case of iOS).

If you ever run into difficulty using the W3C Actions API, Appium provides direct access to these vendor-supported action methods as well. In this article we'll take a look at the ones available for iOS. Because these are not part of the WebDriver spec, Appium provides this access by overloading the executeScript command, as you'll see in the examples below.

mobile: swipe

This command ultimately calls the XCUIElement.swipe* family of methods provided by XCUITest, and thus takes two parameters: a direction (whether to swipe up, down, left, or right), and the ID of an element within which the swipe is to take place (Appium defaults to the entire Application element if no element is specified).

Note: for this command and all other mobile: commands which take an element as a parameter, the value to supply is the internal ID of the element, which is not normally needed as part of Selenium/Appium testing. To get it in the Java client, you can call element.getId() (potentially needing to cast element to RemoteWebElement first).


// swipe up then down
Map<String, Object> args = new HashMap<>();
args.put("direction", "up");
driver.executeScript("mobile: swipe", args);
args.put("direction", "down");
driver.executeScript("mobile: swipe", args);

Unfortunately, XCUITest does not provide any parameters to modify the speed or distance of the swipe. For that, use the more general Actions API.
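
To confine a swipe to a particular element, the same argument map can carry an element entry as well. The following is only a sketch: SwipeArgs and its build method are hypothetical helpers (not part of Appium), and the element ID string is a placeholder for whatever getId() returns in your session.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Appium): builds the argument map for "mobile: swipe".
class SwipeArgs {
    private static final List<String> DIRECTIONS = Arrays.asList("up", "down", "left", "right");

    // elementId may be null, in which case Appium falls back to the application element.
    static Map<String, Object> build(String direction, String elementId) {
        if (!DIRECTIONS.contains(direction)) {
            throw new IllegalArgumentException("direction must be one of " + DIRECTIONS);
        }
        Map<String, Object> args = new HashMap<>();
        args.put("direction", direction);
        if (elementId != null) {
            args.put("element", elementId);
        }
        return args;
    }

    public static void main(String[] cliArgs) {
        // "placeholder-element-id" stands in for the value returned by element.getId().
        Map<String, Object> args = build("left", "placeholder-element-id");
        System.out.println(args);
        // With a live session: driver.executeScript("mobile: swipe", args);
    }
}
```

Validating the direction up front means a typo fails in the test code itself rather than as a server-side error.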

mobile: scroll

If you want each gesture to move the view by the height of the scrollable content, or if you want to scroll until a particular element is visible, try mobile: scroll. It works similarly to mobile: swipe but takes more parameters:

  • element: the ID of the element to scroll within (the application element by default). Call this the "bounding element".
  • direction: the opposite of how direction is used in mobile: swipe. A swipe "up" scrolls the view contents down, which is exactly what a scroll "down" does.
  • name: the accessibility ID of an element to scroll to within the bounding element
  • predicateString: the NSPredicate of an element to scroll to within the bounding element
  • toVisible: if true, and if element is set to a custom element, then simply scroll to the first visible child of element
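
Because the two commands use direction in opposite senses, it can help to convert between them explicitly when porting a swipe-based test to mobile: scroll. The sketch below assumes a hypothetical ScrollDirection helper (not part of Appium). It also assumes that the horizontal directions invert in the same way as the vertical ones described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Appium): converts a "mobile: swipe" direction
// into the "mobile: scroll" direction that moves the content the same way.
class ScrollDirection {
    private static final Map<String, String> SWIPE_TO_SCROLL = new HashMap<>();

    static {
        SWIPE_TO_SCROLL.put("up", "down");
        SWIPE_TO_SCROLL.put("down", "up");
        // Assumption: horizontal directions invert the same way as vertical ones.
        SWIPE_TO_SCROLL.put("left", "right");
        SWIPE_TO_SCROLL.put("right", "left");
    }

    static String forSwipe(String swipeDirection) {
        String scrollDirection = SWIPE_TO_SCROLL.get(swipeDirection);
        if (scrollDirection == null) {
            throw new IllegalArgumentException("unknown direction: " + swipeDirection);
        }
        return scrollDirection;
    }

    public static void main(String[] cliArgs) {
        Map<String, Object> args = new HashMap<>();
        args.put("direction", forSwipe("up"));
        System.out.println(args);
        // With a live session: driver.executeScript("mobile: scroll", args);
    }
}
```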


// scroll down then up
Map<String, Object> args = new HashMap<>();
args.put("direction", "down");
driver.executeScript("mobile: scroll", args);
args.put("direction", "up");
driver.executeScript("mobile: scroll", args);

// scroll to the last item in the list by accessibility id
args.put("direction", "down");
args.put("name", "Stratus");
driver.executeScript("mobile: scroll", args);

// scroll back to the first item in the list
MobileElement list = (MobileElement) driver.findElement(By.className("XCUIElementTypeScrollView"));
args.put("direction", "up");
args.remove("name"); // drop the accessibility id so only direction and element are sent
args.put("element", list.getId());
driver.executeScript("mobile: scroll", args);
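
The examples above cover direction, name, and element, but not predicateString. As a rough sketch of what that might look like: the ScrollToPredicate helper is hypothetical, and the predicate name == 'Stratus' is an illustrative guess at an NSPredicate that matches the same list item by its name attribute. Whether direction also needs to be sent alongside the predicate is not something this sketch settles.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Appium): builds "mobile: scroll" arguments
// that locate the target with an NSPredicate string.
class ScrollToPredicate {
    // boundingElementId may be null to scroll within the application element.
    static Map<String, Object> build(String predicate, String boundingElementId) {
        if (predicate == null || predicate.isEmpty()) {
            throw new IllegalArgumentException("predicate must not be empty");
        }
        Map<String, Object> args = new HashMap<>();
        args.put("predicateString", predicate);
        if (boundingElementId != null) {
            args.put("element", boundingElementId);
        }
        return args;
    }

    public static void main(String[] cliArgs) {
        // Illustrative predicate; adjust the attribute and value to your own app.
        Map<String, Object> args = build("name == 'Stratus'", null);
        System.out.println(args);
        // With a live session: driver.executeScript("mobile: scroll", args);
    }
}
```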

mobile: pinch

To pinch (described by a two-finger gesture where the fingers start far apart and come together) or to zoom (described by the inverse gesture where fingers start together and expand outward), use mobile: pinch, which calls XCUIElement.pinch under the hood. As with the other methods described so far, you can pass in an element parameter defining the element in which the pinch will take place (the entire application by default).

The only required parameter is scale:

  • Values between 0 and 1 refer to a "pinch"
  • Values greater than 1 refer to a "zoom"

An additional optional parameter velocity can be sent, which corresponds to "the velocity of the pinch in scale factor per second" according to Apple's docs.


// zoom in on something
Map<String, Object> args = new HashMap<>();
args.put("scale", 5);
driver.executeScript("mobile: pinch", args);
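
Since a single scale value decides whether the gesture is a pinch or a zoom, a small wrapper can make the intent more obvious at the call site. This is a sketch only: PinchArgs is a hypothetical helper, and the velocity value used in the example is arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Appium): builds the argument map for "mobile: pinch".
class PinchArgs {
    // Values greater than 1 zoom; values between 0 and 1 pinch.
    static boolean isZoom(double scale) {
        return scale > 1;
    }

    // velocity may be null to omit it from the arguments.
    static Map<String, Object> build(double scale, Double velocity) {
        if (scale <= 0) {
            throw new IllegalArgumentException("scale must be greater than 0");
        }
        Map<String, Object> args = new HashMap<>();
        args.put("scale", scale);
        if (velocity != null) {
            args.put("velocity", velocity);
        }
        return args;
    }

    public static void main(String[] cliArgs) {
        Map<String, Object> zoom = build(2.0, 1.5); // zoom in, with an arbitrary velocity
        Map<String, Object> pinch = build(0.5, null); // pinch, velocity omitted
        System.out.println(zoom + " zoom? " + isZoom(2.0));
        System.out.println(pinch + " zoom? " + isZoom(0.5));
        // With a live session: driver.executeScript("mobile: pinch", zoom);
    }
}
```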

mobile: tap

The best way to tap on an element is using element.click(). So why do we have mobile: tap? This method allows for extra parameters x and y signifying the coordinate at which to click. The nice thing is that this coordinate is either screen-relative (the default, if an element parameter is not included) or element-relative (if an element parameter is included).

This means that if you want to tap at the very top left corner of an element rather than dead center, you can!


// tap an element very near its top left corner
Map<String, Object> args = new HashMap<>();
args.put("element", ((MobileElement) element).getId());
args.put("x", 2);
args.put("y", 2);
driver.executeScript("mobile: tap", args);

mobile: doubleTap

There's more to tapping than single-tapping! And while you can certainly build a double-tap action using the Actions API, XCUITest provides an XCUIElement.doubleTap method for this purpose, and it could presumably have greater reliability than synthesizing your own action.

In terms of parameters, you should send in either an element parameter, with the ID of the element you want to tap, or both an x and y value representing the screen coordinate you wish to tap.


// double-tap the screen at a specific point
Map<String, Object> args = new HashMap<>();
args.put("x", 100);
args.put("y", 200);
driver.executeScript("mobile: doubleTap", args);

mobile: twoFingerTap

Not to be confused with a double-tap, a two-finger-tap is a single tap using two fingers! This method has only one parameter, which is required: good old element (it only works in the context of an element, not a point on the screen).
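
A two-finger tap follows the same pattern as the other element-based commands. In the sketch below, TwoFingerTapArgs is a hypothetical helper (not part of Appium), and the element ID is a placeholder for the value returned by getId().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper (not part of Appium): builds the argument map for "mobile: twoFingerTap".
class TwoFingerTapArgs {
    // The element ID is required, since this command does not accept screen coordinates.
    static Map<String, Object> build(String elementId) {
        Objects.requireNonNull(elementId, "element ID is required for twoFingerTap");
        Map<String, Object> args = new HashMap<>();
        args.put("element", elementId);
        return args;
    }

    public static void main(String[] cliArgs) {
        // "placeholder-element-id" stands in for ((MobileElement) element).getId().
        Map<String, Object> args = build("placeholder-element-id");
        System.out.println(args);
        // With a live session: driver.executeScript("mobile: twoFingerTap", args);
    }
}
```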


mobile: touchAndHold

Many iOS apps allow a user to trigger special behavior by tapping and holding the finger down on a certain UI element. You can specify all the same parameters as for doubleTap (element, x, and y) with the same semantics. In addition, you must set the duration parameter to specify how many seconds you want the touch to be held.

// touch and hold an element
Map<String, Object> args = new HashMap<>();
args.put("element", ((MobileElement) element).getId());
args.put("duration", 1.5);
driver.executeScript("mobile: touchAndHold", args);
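
Because duration is given in seconds, while Java code often carries time spans as java.time.Duration values, a small conversion step can avoid unit mix-ups. The following sketch uses a hypothetical TouchAndHoldArgs helper (not part of Appium) and targets a screen coordinate instead of an element.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Appium): builds "mobile: touchAndHold" arguments
// for a screen coordinate, converting a Duration into fractional seconds.
class TouchAndHoldArgs {
    static Map<String, Object> atPoint(int x, int y, Duration hold) {
        if (hold.isNegative() || hold.isZero()) {
            throw new IllegalArgumentException("hold duration must be positive");
        }
        Map<String, Object> args = new HashMap<>();
        args.put("x", x);
        args.put("y", y);
        // The command expects seconds, so convert from milliseconds.
        args.put("duration", hold.toMillis() / 1000.0);
        return args;
    }

    public static void main(String[] cliArgs) {
        Map<String, Object> args = atPoint(100, 200, Duration.ofMillis(1500));
        System.out.println(args);
        // With a live session: driver.executeScript("mobile: touchAndHold", args);
    }
}
```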

mobile: dragFromToForDuration

Another commonly-implemented app gesture is "drag-and-drop". As with all of these gestures, it's possible to build a respectable drag-and-drop using the Actions API, but if for some reason this doesn't work, XCUITest has provided a method directly for this purpose. It's a method on the XCUICoordinate class, and in my opinion the name 'dragFromToForDuration' isn't the most accurate representation of it.

Really, what's going on is that we're defining a start and an end coordinate, and also the duration of the hold on the start coordinate. In other words, we have no control over the drag duration itself, only over how long the first coordinate is held before the drag happens. What parameters do we use?

  • element: an element ID, which if provided will cause Appium to treat the coordinates as relative to this element. Absolute screen coordinates otherwise.
  • duration: the number of seconds (between 0.5 and 6.0) that the start coordinates should be held
  • fromX: the x-coordinate of the start position
  • fromY: the y-coordinate of the start position
  • toX: the x-coordinate of the end position
  • toY: the y-coordinate of the end position


// touch, hold, and drag based on coordinates
Map<String, Object> args = new HashMap<>();
args.put("duration", 1.5);
args.put("fromX", 100);
args.put("fromY", 100);
args.put("toX", 300);
args.put("toY", 600);
driver.executeScript("mobile: dragFromToForDuration", args);
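
The same command can take element-relative coordinates when an element ID is supplied. The sketch below wraps that in a hypothetical DragArgs helper (not part of Appium) that also checks the 0.5 to 6.0 second range listed above before anything is sent to the server. The element ID is a placeholder.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Appium): builds "mobile: dragFromToForDuration" arguments.
class DragArgs {
    // elementId may be null, in which case the coordinates are absolute screen coordinates.
    static Map<String, Object> build(String elementId, double holdSeconds,
                                     int fromX, int fromY, int toX, int toY) {
        if (holdSeconds < 0.5 || holdSeconds > 6.0) {
            throw new IllegalArgumentException("duration must be between 0.5 and 6.0 seconds");
        }
        Map<String, Object> args = new HashMap<>();
        if (elementId != null) {
            args.put("element", elementId);
        }
        args.put("duration", holdSeconds);
        args.put("fromX", fromX);
        args.put("fromY", fromY);
        args.put("toX", toX);
        args.put("toY", toY);
        return args;
    }

    public static void main(String[] cliArgs) {
        // Hold near the element's top left corner for one second, then drag down and right.
        Map<String, Object> args = build("placeholder-element-id", 1.0, 10, 10, 150, 300);
        System.out.println(args);
        // With a live session: driver.executeScript("mobile: dragFromToForDuration", args);
    }
}
```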

And with that our tour of the special iOS-specific gesture methods is complete! If you want to see a working example of some of the scroll and swipe functionality, check out this article's code on GitHub, which makes use of a new scrolling list view added to The App!
