How does Android get the element information on the screen?

I have seen some automation tools online that can control a phone automatically. I have a few questions about them:

  1. Do these automation tools simply record and replay clicks and swipes, or are they "smart" enough to identify the elements on the screen?
  2. My own guess: the first kind certainly exists, but I am not sure whether the second can be implemented. Can it? For example, in a scenario like this:

    To automatically launch an app from the home screen: different phones have different screen sizes and response speeds, so the first method has obvious limitations. Is there a method, similar to inspecting web page elements, that can traverse all the elements on the screen, get their information (icon text, coordinates, etc.), and click automatically?
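The element-traversal idea the question describes does exist on Android: `adb shell uiautomator dump` writes the current screen hierarchy as XML, where every node carries attributes such as `text`, `resource-id`, `class`, and `bounds`. A minimal sketch of traversing such a dump, using a hypothetical two-icon excerpt (the node contents below are invented for illustration):

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical excerpt of the XML produced by `adb shell uiautomator dump`.
DUMP = """<hierarchy>
  <node text="Settings" resource-id="com.android.launcher:id/icon"
        class="android.widget.TextView" bounds="[48,210][312,474]"/>
  <node text="Camera" resource-id="com.android.launcher:id/icon"
        class="android.widget.TextView" bounds="[360,210][624,474]"/>
</hierarchy>"""

def list_elements(xml_text):
    """Return (text, resource-id, center point) for every node in the dump."""
    elements = []
    for node in ET.fromstring(xml_text).iter("node"):
        # bounds look like "[x1,y1][x2,y2]"; the center is a tap target
        x1, y1, x2, y2 = map(int, re.findall(r"-?\d+", node.get("bounds")))
        center = ((x1 + x2) // 2, (y1 + y2) // 2)
        elements.append((node.get("text"), node.get("resource-id"), center))
    return elements

for text, rid, center in list_elements(DUMP):
    print(text, rid, center)
```

Feeding the center coordinates to `adb shell input tap` would then click the element regardless of where that element sits on a particular device's screen.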

Jul.07,2021

  • Both exist, and mature technologies are available for each:

    • The first category is the simplest kind of software, such as Button Wizard (按键精灵), which simulates finger operations on the screen, sends key events, and so on. It can trigger actions by using find-image functions to search the screen for the content you want. This kind of software generally requires root permission to work properly.
    • The second category is more professional and is generally used for automated testing, for example appium, which supports multiple languages and platforms such as Java and Python. It identifies controls by analyzing the resource IDs in the package; you configure the package name, class, and so on. It can also trigger events on each control and read its properties. In theory, this category can meet your second requirement.
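A rough sketch of the difference between the two categories, assuming the standard `adb shell input tap` command and a uiautomator-style bounds string (the concrete coordinates and bounds are invented for illustration):

```python
import re

def center(bounds):
    """Center point of a uiautomator-style bounds string like "[42,96][1038,312]"."""
    x1, y1, x2, y2 = map(int, re.findall(r"-?\d+", bounds))
    return (x1 + x2) // 2, (y1 + y2) // 2

# Category 1: blind replay of coordinates recorded on one device;
# this breaks when another phone has a different screen size or layout.
recorded_tap = "adb shell input tap 540 204"

# Category 2: look up the element at runtime (by resource-id, text, etc.),
# read its bounds from the UI hierarchy, and tap its center.
x, y = center("[42,96][1038,312]")
resolved_tap = f"adb shell input tap {x} {y}"
print(resolved_tap)
```

The second approach is what appium and similar frameworks do under the hood: the coordinates are derived from the element's properties on the actual device, not hard-coded at recording time.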