How does the Huawei Ark compiler solve the problem? The principle of the Ark compiler

Before answering this question, let's look at the timeline of Huawei's work on the Ark compiler:

In 2009, Huawei began basic research on 5G technology and started building a compiler team. The first group of researchers from China and abroad joined.
In 2013, Huawei launched HCC, its self-developed compiler for base stations, and formally proposed the compiler framework.
In 2014, many experts from China and abroad joined Huawei, and the Ark project was officially launched.
In 2016, the Compiler and Programming Language Lab was established.
In 2017, the first Java program, "HelloWorld", ran on the Ark compiler.
One week before the Spring Festival in 2018, the Ark compiler got all of the Android system's background services running and was successfully ported to a phone.
In April 2019, the Huawei Ark compiler was announced at the P30 series launch event in China.

So, how does the Ark compiler work in principle?

In fact, Huawei's so-called "Ark compiler" is less a compiler than a compile-and-run system. It depends on cooperation between the developer environment and the terminal (that is, the smartphone). Its purpose is to bypass the virtual machine that an app must run on in the Android operating system: mixed Java/C/C++ code is compiled into machine code that runs directly on the phone. This does away with Java JNI overhead and with the application pauses caused by the virtual machine's GC memory reclamation, and so ultimately makes the Android operating system run smoothly.

As described above, implementing the Ark compiler requires solving four problems.

First: Compile Java code directly into machine code

As things stand, the obstacle to compiling Java into machine code is Java's dynamic semantics (as opposed to its static semantics, which can be handled by ahead-of-time translation). Static semantics are those whose meaning is fully determined by the program text, while dynamic semantics are those that can only be resolved in context, at run time. Many people therefore consider it impossible to compile dynamic semantics the way static semantics are compiled.

This supposedly impossible problem is exactly what Huawei solved while developing the Ark compiler.

Specifically, the Ark compiler tackles the two major difficulties of statically compiling dynamic semantics by working on both the compile phase and the run phase. The first difficulty is how to design the data model; the second is how to obtain dynamic information efficiently at run time. The Ark compiler team went through essentially all of Java's dynamic semantics and modeled them on a large scale. At the same time, it greatly improved the precision of compile-time analysis of dynamic semantics, especially where cross-language calls are involved. In addition, Huawei designed a dynamic semantic matching mechanism, covered by core patents, that effectively reduces the run-time overhead of dynamic semantics.
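
To make "dynamic semantics" concrete, here is a minimal Java sketch of two features whose targets are only known at run time: virtual dispatch and reflection. It does not show how the Ark compiler handles them; the class and method names (Shape, Circle, Square, DynamicSemanticsDemo, areaOf, callByName) are made up for illustration.

```java
import java.lang.reflect.Method;

interface Shape { double area(); }

final class Circle implements Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

final class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class DynamicSemanticsDemo {
    // Virtual dispatch: which area() body runs depends on the run-time class of s,
    // so a static compiler cannot always pick a single call target here.
    static double areaOf(Shape s) {
        return s.area();
    }

    // Reflection: the class and method are named by strings that may only be
    // known at run time (read from a config file, sent over the network, ...).
    static Object callByName(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            Object instance = cls.getDeclaredConstructor().newInstance();
            Method method = cls.getMethod(methodName);
            return method.invoke(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(areaOf(new Square(2.0)));                  // prints 4.0
        System.out.println(callByName("java.util.ArrayList", "size")); // prints 0
    }
}
```

A virtual machine resolves both calls while the program runs. A static compiler has to either prove the target ahead of time or emit metadata and lookup code for the runtime, which is where the data model and the run-time matching described above come in.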

As a result, the Ark compiler can compile Java code into machine code that the device executes directly. Huawei said that with the Ark compiler, apps no longer need to be compiled on the phone and no longer depend on the virtual machine, which brings an Android experience that rivals or even surpasses iOS.

Second: Resolve JNI overhead for mixed languages

Because 95% of top apps are written in a mix of languages such as Java, C, and C++, the Ark compiler also needs to eliminate the JNI overhead of cross-language calls.

This involves the term IR (intermediate representation), the data structure used to represent code. It is the "protocol and common language" that the compiler's modules and related tools use to exchange information, and it is also the carrier on which program transformations and compiler optimization algorithms operate. It is the "brain" of the compiler and directly determines the quality of the compiler's output, which is why it is the hardest part.

The Huawei Ark compiler team worked on the IR for five years, gradually working out how every nerve and every neuron of this "brain" behaves, and on that basis developed a core patented technique. Code written in different languages is compiled in the developer environment into a single body of directly executable machine code, which completely eliminates the overhead of calls between languages.

In other words, the Huawei Ark compiler implements a unified intermediate representation (IR) for mixed languages, which is like one person being able to understand every language in the world. Behind this, of course, lie the team's deep understanding of multiple programming languages and a great deal of research and development.
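
The idea of a unified IR can be sketched with a toy example. The stack-based instruction set below is purely hypothetical and is not the Ark IR, and the sketch interprets the IR instead of compiling it to machine code. It only illustrates the point made above: once functions from different source languages are lowered into the same representation, a call between them is an ordinary call with no bridging layer.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Opcodes of a toy stack-based IR (hypothetical, for illustration only).
enum Op { ARG, ADD, MUL, CALL }

final class Instr {
    final Op op;
    final int value;      // ARG: argument index, CALL: number of arguments
    final String callee;  // CALL only

    private Instr(Op op, int value, String callee) {
        this.op = op;
        this.value = value;
        this.callee = callee;
    }

    static Instr of(Op op) { return new Instr(op, 0, null); }
    static Instr arg(int index) { return new Instr(Op.ARG, index, null); }
    static Instr call(String callee, int argCount) { return new Instr(Op.CALL, argCount, callee); }
}

final class IrModule {
    // After lowering, the module no longer cares which language a function came from.
    private final Map<String, List<Instr>> functions = new HashMap<>();

    void define(String name, Instr... body) {
        functions.put(name, Arrays.asList(body));
    }

    int call(String name, int... args) {
        List<Instr> body = functions.get(name);
        if (body == null) throw new IllegalArgumentException("unknown function: " + name);
        Deque<Integer> stack = new ArrayDeque<>();
        for (Instr in : body) {
            switch (in.op) {
                case ARG: stack.push(args[in.value]); break;
                case ADD: stack.push(stack.pop() + stack.pop()); break;
                case MUL: stack.push(stack.pop() * stack.pop()); break;
                case CALL: {
                    // Pop the arguments in reverse order, then make a plain IR-level call.
                    int[] callArgs = new int[in.value];
                    for (int k = in.value - 1; k >= 0; k--) callArgs[k] = stack.pop();
                    stack.push(call(in.callee, callArgs));
                    break;
                }
            }
        }
        return stack.pop();
    }
}

final class UnifiedIrDemo {
    static IrModule buildModule() {
        IrModule module = new IrModule();
        // Imagine a C front end lowered: int square(int x) { return x * x; }
        module.define("square", Instr.arg(0), Instr.arg(0), Instr.of(Op.MUL));
        // Imagine a Java front end lowered:
        //   static int sumOfSquares(int a, int b) { return square(a) + square(b); }
        module.define("sumOfSquares",
                Instr.arg(0), Instr.call("square", 1),
                Instr.arg(1), Instr.call("square", 1),
                Instr.of(Op.ADD));
        return module;
    }

    public static void main(String[] args) {
        System.out.println(buildModule().call("sumOfSquares", 3, 4)); // prints 25
    }
}
```

In a JNI setup, the call from the "Java" function into the "C" function would cross a language boundary and pay for argument marshalling and state transitions. In this sketch it is the same CALL instruction used for any other function.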

Third: Code optimization outside of unified IR

Huawei's Ark compiler moves code optimization from the mobile phone to the developer environment, and may move it to the cloud in the future. With the greater computing power of the developer environment, more advanced and sophisticated optimization algorithms can be applied to achieve better results. Huawei said that in many specific scenarios the resulting gains are even disruptive.

It is worth mentioning that developers who use the Ark compiler do not need to change their coding habits. They can write their own code optimization algorithms or simply rely on the algorithms preset in the Ark compiler. In the future, Huawei will also provide code tuning tools. Developers can then adjust their code according to the tools' optimization suggestions and, together with the Ark compiler, obtain better execution results.

Fourth: Solve the stutter caused by Android memory reclamation

To solve this problem, the Ark compiler uses reference counting (RC) to reclaim memory in real time, together with a dedicated cycle-elimination algorithm (which handles objects that cannot be reclaimed because they reference each other). This avoids the system stutter caused by the GC's centralized collection. Compared with GC, Ark's memory reclamation is continuous rather than centralized and does not need to pause the application process, which largely eliminates the stutter.
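
The difference from centralized collection can be illustrated with a small simulation of reference counting. This is a sketch of the general technique, not Ark's runtime; RcObject, RcDemo, and their methods are hypothetical names. Each object carries a counter, and it is freed at the moment the counter reaches zero, without a separate collection pass.

```java
import java.util.ArrayList;
import java.util.List;

// A simulated heap object with an explicit reference count (illustration only).
final class RcObject {
    final String name;
    private int refCount = 0;
    private boolean freed = false;
    private final List<RcObject> fields = new ArrayList<>(); // references this object holds

    RcObject(String name) { this.name = name; }

    boolean isFreed() { return freed; }

    // Called whenever a new reference to this object is created.
    void retain() { refCount++; }

    // Called whenever a reference to this object goes away.
    void release() {
        if (refCount <= 0) throw new IllegalStateException(name + " released too often");
        refCount--;
        if (refCount == 0) {
            // Reclaim immediately, then drop the references this object was holding.
            freed = true;
            for (RcObject field : fields) field.release();
            fields.clear();
        }
    }

    // Models "this.field = target".
    void store(RcObject target) {
        target.retain();
        fields.add(target);
    }
}

final class RcDemo {
    // Returns { item freed before release, list freed after, item freed after }.
    static boolean[] run() {
        RcObject list = new RcObject("list");
        RcObject item = new RcObject("item");
        list.retain();       // a local variable starts referring to list
        list.store(item);    // list now holds the only reference to item
        boolean itemFreedBefore = item.isFreed();
        list.release();      // the local variable goes out of scope
        return new boolean[] { itemFreedBefore, list.isFreed(), item.isFreed() };
    }
}
```

Here both objects are reclaimed inside the single release() call, as part of normal execution. The cost of reclamation is spread across the program instead of being concentrated in pauses.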

In addition, software has a problem comparable to the familiar infinite loop, in which a program that never stops running ties up the computer's resources. In memory management, this kind of "loop" is called a "circular reference". To keep circular references from eating up the phone's memory, the Ark compiler introduces annotation-based "warning" flags that mark the reference cycles in base classes.

Of course, Java programmers can also mark cycles in their business code. Extensive practical testing shows that this mechanism prevents most reference cycles from arising. In addition, the Ark compiler introduces an efficient cycle reclamation mechanism at run time, which can selectively and intelligently reclaim the memory an app occupies and improves on traditional cycle collection algorithms.
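
Why cycles need special treatment under reference counting, and how marking a reference can help, can be sketched as follows. The sketch assumes that a marked reference is simply not counted (a weak back reference); the article does not say how Ark's flags are actually used, and CycleNode, CycleDemo, setStrong, and setWeak are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// A simulated reference-counted object that can hold counted (strong)
// and uncounted (weak, "marked") references. Illustration only.
final class CycleNode {
    final String name;
    private int refCount = 0;
    private boolean freed = false;
    private final List<CycleNode> strongFields = new ArrayList<>();
    private final List<CycleNode> weakFields = new ArrayList<>();

    CycleNode(String name) { this.name = name; }

    boolean isFreed() { return freed; }

    void retain() { refCount++; }

    void release() {
        if (refCount <= 0) throw new IllegalStateException(name + " released too often");
        refCount--;
        if (refCount == 0) {
            freed = true;
            for (CycleNode field : strongFields) field.release();
            strongFields.clear();
            weakFields.clear(); // weak references were never counted
        }
    }

    void setStrong(CycleNode target) { target.retain(); strongFields.add(target); }

    // Assumption: a reference marked as part of a cycle does not bump the count.
    void setWeak(CycleNode target) { weakFields.add(target); }
}

final class CycleDemo {
    // Builds parent -> child and child -> parent, drops the only outside
    // reference to parent, and reports whether parent was reclaimed.
    static boolean parentFreed(boolean backReferenceIsWeak) {
        CycleNode parent = new CycleNode("parent");
        CycleNode child = new CycleNode("child");
        parent.retain();            // a local variable refers to parent
        parent.setStrong(child);    // parent.child = child
        if (backReferenceIsWeak) {
            child.setWeak(parent);  // child.parent = parent, marked
        } else {
            child.setStrong(parent); // child.parent = parent, counted
        }
        parent.release();           // the local variable goes out of scope
        return parent.isFreed();
    }

    public static void main(String[] args) {
        System.out.println(parentFreed(false)); // prints false
        System.out.println(parentFreed(true));  // prints true
    }
}
```

With two counted references, each object keeps the other's count above zero and neither is ever reclaimed. Marking the back reference breaks the cycle ahead of time. Cycles that are not marked have to be found and reclaimed by a separate run-time mechanism such as the one mentioned above.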

In summary, for the four kinds of problems in the existing Android system (code compilation, execution, IR, and memory reclamation), the Huawei Ark compiler offers its own solutions. Its core innovations are the unified intermediate representation for mixed languages and fully static compilation. More important, though, are Huawei's new ideas for solving the app problems of the Android operating system, and its courage to invest in realizing them.
