
Boilerplate code:
https://github.com/praveendvd/webdriverIO_winappdriver_boilerplate
Read the theory:
Let us start by understanding how easy device automation is.
Let's see what automation is all about.
Old implementation:
Previously there were no standards for device or web automation. Each vendor released a utility or tool called a driver that knows how to automate their product.
For example, for automating Windows apps we have WinAppDriver, for automating Chrome we have ChromeDriver, and so on. These tools contain all the implementation and code logic needed to automate their product (yes, you are right, everything is already done for you).
They expose these implementations through a REST API. A REST API is nothing but a public method that listens for HTTP calls and triggers the actual code logic under the hood when the user calls that endpoint. It hides all the complex internal implementation so that users don't get scared.
An example API:
Yes, an API is just a function! Here we are saying: if the user does a POST to /findElement, do all the complex things and return the element.
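app.post('/findElement', (req, res) => {
  // do all the complex implementation of finding the element
  const element = findElement(req.body); // findElement is a hypothetical internal helper
  res.json(element); // send the element back in the HTTP response
});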
Challenge:
The challenge of not having a standard was that each vendor implemented its API in a different way, which caused problems for cross-device and cross-browser testing: you could not use the same code to test different browsers or devices.
Solution:
So solutions like Appium and Selenium came up with a wrapper server that adds another abstraction and exposes everything using the JSON Wire Protocol (it's still a REST API, it just enforces how the endpoints, requests and responses should look), so you can use the same API to automate all devices and browsers. Routing commands to the correct browser or device is handled by the Selenium or Appium server. Again, everything is already done for you!

New implementation:
Now things have changed for good! At last the W3C, the organisation that defines web standards, has brought in the W3C WebDriver protocol (it's still a REST API, it just enforces how the endpoints, requests and responses should look). All vendors now need to follow this standard when exposing their implementation through an API, so once the W3C standard is fully implemented we will no longer need the Selenium or Appium server.

So, coming back to what all this automation is about:
- Set up the Selenium or Appium server (just start it, nothing else!) until the full W3C standard is implemented by all vendors. Once that's done, we no longer need the intermediary servers.
- After that, Selenium, WebdriverIO and the rest are just HTTP client libraries that call these W3C or JSON Wire Protocol endpoints from your script. So the Selenium and Appium clients are just HTTP client libraries, nothing else. It's just like the math library you use for things like Math.pow.
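To make the "just an HTTP client" point from the bullets above concrete, here is a rough sketch of what such a library does under the hood. The function names (newSessionRequest, findElementRequest, send) are made up for illustration, and the base URL in the comment is only an assumed local Appium address; the endpoint paths and body shapes follow the W3C WebDriver protocol as I understand it.

```javascript
// Describe the W3C "New Session" call: POST {baseUrl}/session
function newSessionRequest(baseUrl, capabilities) {
  return {
    method: 'POST',
    url: `${baseUrl}/session`,
    body: { capabilities: { alwaysMatch: capabilities } },
  };
}

// Describe the W3C "Find Element" call: POST {baseUrl}/session/{id}/element
function findElementRequest(baseUrl, sessionId, using, value) {
  return {
    method: 'POST',
    url: `${baseUrl}/session/${sessionId}/element`,
    body: { using, value },
  };
}

// Send a described request with the global fetch (Node 18+) and return the
// "value" field that W3C responses wrap their payload in.
async function send(request) {
  const response = await fetch(request.url, {
    method: request.method,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(request.body),
  });
  const payload = await response.json();
  return payload.value;
}

// Usage (needs a running server; the URL is an assumption):
//   const base = 'http://127.0.0.1:4723/wd/hub';
//   const session = await send(newSessionRequest(base, { platformName: 'windows' }));
//   const el = await send(findElementRequest(base, session.sessionId, 'xpath', '//Button'));
```

A client library such as WebdriverIO wraps calls like these in friendlier methods, so your script never has to build the requests by hand.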
Difference between Appium and WebdriverIO
Selenium/Appium:
Selenium and Appium are umbrella projects. Selenium consists of Selenium IDE, the Selenium library and Selenium Server; Appium consists of Appium Inspector, Appium Server and the Appium libraries. Once the W3C standard is fully implemented, Selenium and Appium will be just HTTP client libraries and we will no longer need the intermediary server.
So Selenium and Appium are just libraries that let you automate devices by calling the W3C or JWP APIs. They don't assert anything (you can't test with them alone) and they don't generate reports. You have to use other test frameworks, reporting tools, assertion libraries, etc. to get this done.
WebdriverIO:
WDIO comes in two flavours. One is the standalone library, which is the same as Selenium: again just an HTTP client library that calls W3C endpoints from your script. The other is the WDIO runner, which provides a framework with reporting, assertion libraries, test frameworks, etc. already set up and ready for you to use.
Final verdict:
Don't be scared of test automation: it is just about using an HTTP client library to make HTTP calls (JSON Wire or W3C) from your script.
Now that you have got rid of the fear, let's dive into WinApp automation.
Or jump right into implementation:
1. Install WinAppDriver:
The first thing to do is install WinAppDriver:
https://github.com/microsoft/WinAppDriver/releases
Download the latest release and install it.
2. Install tools for inspecting the elements:
You can inspect Windows apps using 3 main tools:
- Inspect.exe. This is shipped together with Visual Studio and sits inside the SDK directory, like this: “C:\Program Files (x86)\Windows Kits\10\bin\10.0.16299.0\x64\inspect.exe”
- UI Recorder. A standalone tool that aims to give users a simpler way of creating automation scripts by recording the UI events performed by the user and generating XPath queries and C# code on the fly.
Download from: https://github.com/microsoft/WinAppDriver/releases (just search for "recorder" and download the latest zip).
- Appium Desktop App. It's a graphical interface for the Appium server, and also an inspector that helps you look at an application's elements.
See: Inspecting UI Elements for WinAppDriver automation using Appium Desktop (licanhua.medium.com)
3. My favourite is UI Recorder:
Start UI Recorder by unzipping the release we downloaded, then click Record. That's it:


Now you get the full XPath information.
4. Appium capabilities:
We now know that all the logic for routing commands to a specific driver or device is handled internally by Appium, Selenium or the vendor driver. But the question is how, and the answer is: using capabilities.
The capabilities tell the server or driver which browser, device or platform we are targeting, so that it can route the commands and apply any other logic internally.
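To picture the routing idea, here is a toy sketch of how a server could pick a backend driver from the capabilities. This is not how Appium is actually implemented; the DRIVERS table, its URLs and the pickDriver name are all made up for illustration.

```javascript
// Hypothetical routing table: which backend handles which automationName.
// The URLs are placeholders, not real defaults.
const DRIVERS = {
  windows: 'http://127.0.0.1:4724',   // e.g. a WinAppDriver instance
  uiautomator2: 'http://127.0.0.1:8200', // e.g. an Android driver
};

// Look at the capabilities and decide where the commands should go.
function pickDriver(capabilities) {
  const raw = capabilities['appium:automationName'] || capabilities.automationName || '';
  const name = String(raw).toLowerCase();
  const target = DRIVERS[name];
  if (!target) {
    throw new Error(`No driver registered for automationName "${name}"`);
  }
  return target;
}
```

Calling pickDriver({ 'appium:automationName': 'windows' }) returns the Windows entry from the table, and an unknown name fails fast instead of silently going to the wrong driver.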
WinAppDriver's W3C support is not yet mature enough, so we use the Appium server to route requests to WinAppDriver, since WebdriverIO supports only W3C.
Note: for W3C capabilities you have to prefix the capability names with the vendor name (appium:); if it's JSON Wire, you can remove the appium: part.
capabilities: [{
    "appium:platformName": "windows",
    "appium:automationName": "windows",
    // You can get the app id by running the command below in PowerShell:
    //   Get-StartApps | Select-String "your app name"
    // For a native Windows app you can pass the exe location directly:
    //   "appium:app": "C:/Windows/System32/notepad.exe"
    "appium:app": "Microsoft.WindowsCalculator_8wekyb3d8bbwe!App"
}],
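If you are converting an old JSON Wire capability set, the prefixing rule from the note above can be sketched as a small helper. The name toW3cCapabilities and the default vendor argument are my own for illustration. As far as I know the W3C spec only asks for a prefix on names it does not define itself, so this sketch leaves standard names such as platformName bare, while the config above prefixes that one too.

```javascript
// Capability names defined by the W3C WebDriver spec; these need no prefix.
const W3C_STANDARD = new Set([
  'browserName', 'browserVersion', 'platformName', 'acceptInsecureCerts',
  'pageLoadStrategy', 'proxy', 'setWindowRect', 'timeouts',
  'strictFileInteractability', 'unhandledPromptBehavior',
]);

// Add "<vendor>:" to every capability that is neither standard nor already prefixed.
function toW3cCapabilities(capabilities, vendor = 'appium') {
  const result = {};
  for (const [key, value] of Object.entries(capabilities)) {
    const needsPrefix = !key.includes(':') && !W3C_STANDARD.has(key);
    result[needsPrefix ? `${vendor}:${key}` : key] = value;
  }
  return result;
}

// Example: JSON Wire style in, W3C style out.
const w3cCaps = toW3cCapabilities({
  platformName: 'windows',
  automationName: 'windows',
  app: 'Microsoft.WindowsCalculator_8wekyb3d8bbwe!App',
});
```

Here w3cCaps ends up with platformName unchanged and with appium:automationName and appium:app carrying the prefix.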
As mentioned, you can get the app id by running the command below in PowerShell:
Get-StartApps | Select-String "your app name"
For a native Windows app you can give the exe location directly.
That's it! Now you can use the same methods as in web/mobile app automation, like find element, etc.
Note that WinAppDriver supports the locators listed here:
https://github.com/microsoft/WinAppDriver/blob/master/Docs/AuthoringTestScripts.md

Appium locators and WinAppDriver locators map to each other as described here:
https://appium.io/docs/en/drivers/windows/

So you can use the Appium accessibility id locator to find Windows elements with a specific automation id.
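As a rough sketch of what that looks like in a WebdriverIO script: a selector starting with "~" uses the accessibility id strategy. The helper names below are made up, and the Calculator automation ids (num7Button, plusButton, num1Button, equalButton, CalculatorResults) are assumptions, so check them with UI Recorder or inspect.exe on your own machine.

```javascript
// Build a WebdriverIO selector for the accessibility id strategy.
// A leading "~" tells WebdriverIO to look up by accessibility id,
// which is how a Windows automation id is addressed here.
function byAutomationId(automationId) {
  return `~${automationId}`;
}

// Click 7 + 1 = on the Windows Calculator and return the display text.
// `browser` is a WebdriverIO session object; the ids are assumptions.
async function addSevenAndOne(browser) {
  for (const id of ['num7Button', 'plusButton', 'num1Button', 'equalButton']) {
    const button = await browser.$(byAutomationId(id));
    await button.click();
  }
  const display = await browser.$(byAutomationId('CalculatorResults'));
  return display.getText();
}
```

Inside a WDIO test you would call it as `const text = await addSevenAndOne(browser);` and then assert on the returned text with whichever assertion library your runner is set up with.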