# Algorithm

This week’s LeetCode problem is 50. Pow(x, n)

Implement pow(x, n), which calculates x raised to the power n (i.e., xn).

`Input：x = 2.00000, n = 10Output：1024.00000Input：x = 2.00000, n = -2Output：0.25000Explanation：2^-2 = (1/2)^2 = 1/4 = 0.25`

In order to use the dichotomy method, it is necessary to first ensure that n is always positive, so it is necessary to convert xn. First, judge n. If n < 0, the n-th power of x is equal to pow(1/x, -n). When n = 0 or 1, pow(x, n) is always x, only when n>1 , the nth power of x is to calculate pow(x,n). In the function `calPow(double x, long n)` that actually calculates xn, if `n==0`, it returns 1, and the result tmp of the xn/2 is obtained by recursion first, if n is Even, the final result is `tmp*tmp`; if n is odd, then the final result is `tmp*tmp*x`, the code is as follows:

`class Solution {    public double myPow(double x, int n) {        if (x == 0 || x == 1) {            return x;        }        if (n > 0) {            return calPow(x, n);        } else if (n < 0) {            return calPow(1/x, -n);        } else {            return 1;        }    }    // use long type: because abs(Integer.MIN_VALUE) > abs(Integer.MAX_VALUE)    public double calPow(double x, long n) {        if (n == 0) {            return 1;        }        double tmp = calPow(x, n/2);        if (n % 2 == 0) {            return tmp * tmp;        } else {            return tmp * tmp * x;        }    }}`

# Review

This week’s Review is for the following article: C Runtime Overhead

The author introduces what to do when the C language library (libc) becomes the bottleneck of program execution.

The author traced an empty program with `strace -tt` and found that it took up to 9.4ms between executions on his machine, which is entirely the overhead of the linker and the C language runtime

`// strace output for `int main() { return 0; }`00:32:27.539873 execve("./overhead", ["./overhead"], [/* 15 vars */]) = 000:32:27.541060 brk(0) = 0xa7b00000:32:27.541466 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)00:32:27.541834 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 300:32:27.542199 fstat(3, {st_mode=S_IFREG|0644, st_size=36586, ...}) = 000:32:27.542438 mmap(NULL, 36586, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f4f640be00000:32:27.542865 close(3) = 000:32:27.543109 open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 300:32:27.543455 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\ 0`\1\2\0\0\0\0\0"..., 832) = 83200:32:27.543782 fstat(3, {st_mode=S_IFREG|0755, st_size=1984416, ...}) = 000:32:27.544162 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f640bd00000:32:27.544466 mmap(NULL, 3813200, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f4f63b0300000:32:27.544796 mprotect(0x7f4f63c9c000, 2097152, PROT_NONE) = 000:32:27.545104 mmap(0x7f4f63e9c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x199000) = 0x7f4f63e9c00000:32:27.545503 mmap(0x7f4f63ea2000, 16208, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f4f63ea200000:32:27.545775 close(3) = 000:32:27.546087 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f640bc00000:32:27.546386 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f640bb00000:32:27.546664 arch_prctl(ARCH_SET_FS, 0x7f4f640bc700) = 000:32:27.547373 mprotect(0x7f4f63e9c000, 16384, PROT_READ) = 000:32:27.548209 mprotect(0x7f4f640c7000, 4096, PROT_READ) = 000:32:27.548533 munmap(0x7f4f640be000, 36586) = 000:32:27.548949 exit_group(0) = ?00:32:27.549272 +++ exited with 0 +++`

Then the author tries to use inline assembly and `-ffreestanding -nostdlib` to test the check:

`// gcc -m32 -ffreestanding -nostdlibvoid _start() {    /* exit system call */    asm("movl \$1,%eax;"        "xorl %ebx,%ebx;"        "int \$0x80"    );}// strace -tt output// 00:41:38.884122 execve("./blank", ["./blank"], [/* 15 vars */]) = 0// 00:41:38.884552 exit(0) = ?// 00:41:38.884629 +++ exited with 0 +++`

It can be seen that this program without any system calls only runs for 0.5ms. Then the author tried to compile the program statically, and the running time was reduced to about 8ms. The output obtained from the trace is as follows:

`00:35:19.095150 execve("./main-static", ["./main-static"], [/* 15 vars */]) = 000:35:19.095973 uname({sys="Linux", node="vagrant-arch", ...}) = 000:35:19.096373 brk(0) = 0x16cf00000:35:19.096724 brk(0x16d01c0) = 0x16d01c000:35:19.097065 arch_prctl(ARCH_SET_FS, 0x16cf880) = 000:35:19.097387 readlink("/proc/self/exe", "/vagrant/level0/main-static", 4096) = 2700:35:19.097779 brk(0x16f11c0) = 0x16f11c000:35:19.098140 brk(0x16f2000) = 0x16f200000:35:19.098539 fstat(0, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 000:35:19.098837 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8d5a7cb00000:35:19.099192 read(0, "Collaboratively administrative emp"..., 4096) = 139800:35:19.099544 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 000:35:19.099893 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8d5a7ca000`

Finally, the author wrote a minimal libc library to compile with the `-ffreestanding -nostdlib` flag, and the running time was reduced to 1ms, which is close to 1/ the running time of glibc 10, which has only five system calls:

`00:53:41.882305 execve("./main-43.1", ["./main-43.1"], [/* 15 vars */]) = 000:53:41.882920 read(0, "Collaboratively administrative emp"..., 4096) = 139800:53:41.882989 read(0, "", 4096) = 000:53:41.883006 write(1, "Collaboratively administratively emp"..., 1449) = 144900:53:41.883023 _exit(0) = ?00:53:41.883356 +++ exited with 0 +++`

Summary: If you feel that the program or process startup time is slow and really needs to be optimized from the library level, then it is worth your time to try another libc implementation (such as musl libc or Diet libc)

# Tip

In the example of callback function in C language. the callback function is to treat one function as a parameter of another function. The most classic application is to customize the function of comparison size in the sorting algorithm. The following is a simple example, the `operate` function calls the `add` and `multi` functions to achieve the effect of addition and multiplication respectively.

`#include <stdio.h>int operate(int a, int b, int (* func) (int a, int b));int add(int a, int b);int multi(int a, int b);int main(){    int a = 3, b = 4;    printf("%d + %d = %d\n", a, b, operate(a, b, add));    printf("%d * %d = %d\n", a, b, operate(a, b, multi));    return 0;}int operate(int a, int b, int (* func) (int a, int b)){    return func(a, b);}int add(int a, int b){    return a+b;}int multi(int a, int b){    return a*b;}`

The result of running the program is as follows:

`\$ gcc test.c\$ ./a.out3 + 4 = 73 * 4 = 12`

# Share

Now I have an idea about the topic of a non-ARTS technical blog post, which is roughly divided into three parts. I am currently working hard on writing, come on!