Android System (245): How the SystemServer Process Is Created
Android Process Series, Part 3: How the SystemServer Process Is Created
1. Content Preview
Figure: Startup of the SystemServer process (SystemServer进程的启动.png)
2. Overview
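Two articles in the process series have been published so far. This article (based on the Android O source code) covers the first half of the SystemServer story: how the process is created. The second half will go through the startup phases after the process is created and the core services it runs.
Android Process Series, Part 1: Process Basics
Android Process Series, Part 2: How the Zygote Process Is Created
A brief recap of the key points of the previous article: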
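- The Zygote process is essentially a client/server design: Zygote is the server and handles process-creation requests that clients from all over the system send through a socket;
- We summarized the socket communication framework: the init process adds the socket's fd, the Zygote process gets that fd and creates a LocalServerSocket from it;
- We summarized why the Zygote process serves as the parent of all application processes;
- We summarized how the Zygote process preloads resources, and why it cannot load them on a child thread.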
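This article is about the creation of the SystemServer process. SystemServer is Zygote's eldest disciple: it is the first process Zygote forks. Together, Zygote and SystemServer hold up half the sky of the Java world, and if either of them dies the Java world collapses. Most of the freeze-and-reboot problems we see also originate in the SystemServer process. SystemServer runs dozens of core services. To keep application processes from damaging the system, they have no permission to access system resources directly and can only reach them through SystemServer acting as a proxy. All of this shows how important the SystemServer process is.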
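3. The SystemServer Creation Flow
Figure: Creation of the SystemServer process (SystemServer进程的创建.png)
3.1 ZygoteInit's main method
The figure above is the sequence diagram for creating SystemServer. We again start from ZygoteInit's main method and once more bring out the "template" code below.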
frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

    public static void main(String argv[]) {
        // 1. Create the ZygoteServer
        ZygoteServer zygoteServer = new ZygoteServer();
        try {
            // 2. Create the server-side socket
            zygoteServer.registerServerSocket(socketName);
            // 3. Preload the classes and resources for the process
            preload(bootTimingsTraceLog);
            if (startSystemServer) {
                // 4. Start the SystemServer process: the zygote's first "cell division"
                startSystemServer(abiList, socketName, zygoteServer);
            }
            // 5. Enter an endless loop that listens for messages from clients
            zygoteServer.runSelectLoop(abiList);
            // 6. Close the server socket
            zygoteServer.closeServerSocket();
        } catch (Zygote.MethodAndArgsCaller caller) {
            // 7. Catch the exception and call MethodAndArgsCaller's run method
            caller.run();
        } catch (Throwable ex) {
            Log.e(TAG, "System zygote died with exception", ex);
            zygoteServer.closeServerSocket();
            throw ex;
        }
    }
ZygoteInit's main method has seven key points. Points 1, 2 and 3 were covered in the previous article, so the analysis now starts at point 4.
    /**
     * Prepare the arguments and fork for the system server process.
     */
    private static boolean startSystemServer(String abiList, String socketName, ZygoteServer zygoteServer)
            throws Zygote.MethodAndArgsCaller, RuntimeException {
        .........
        /* Hardcoded command line to start the system server */
        String args[] = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,1032,3001,3002,3003,3006,3007,3009,3010",
            "--capabilities=" + capabilities + "," + capabilities,
            "--nice-name=system_server",
            "--runtime-args",
            "com.android.server.SystemServer",
        };
        ZygoteConnection.Arguments parsedArgs = null;

        int pid;

        try {
            parsedArgs = new ZygoteConnection.Arguments(args);
            ZygoteConnection.applyDebuggerSystemProperty(parsedArgs);
            ZygoteConnection.applyInvokeWithSystemProperty(parsedArgs);

            // Create the system process; this ends up calling fork(), see section 3.2
            pid = Zygote.forkSystemServer(
                    parsedArgs.uid, parsedArgs.gid,
                    parsedArgs.gids,
                    parsedArgs.debugFlags,
                    null,
                    parsedArgs.permittedCapabilities,
                    parsedArgs.effectiveCapabilities);
        } catch (IllegalArgumentException ex) {
            throw new RuntimeException(ex);
        }

        // fork() returns twice; pid == 0 means we are running in the newly created child
        if (pid == 0) {
            // If the device also supports 32-bit apps, wait until the 32-bit Zygote can be reached
            if (hasSecondZygote(abiList)) {
                waitForSecondaryZygote(socketName);
            }
            // Close the socket inherited from the Zygote process
            zygoteServer.closeServerSocket();
            // Handle the rest of the SystemServer process's startup, see section 3.4
            handleSystemServerProcess(parsedArgs);
        }

        return true;
    }
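- 1. The args array is converted into a ZygoteConnection.Arguments object, which essentially just assigns values to the member fields of ZygoteConnection.Arguments. So what do these arguments mean?

        String args[] = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,1032,3001,3002,3003,3006,3007,3009,3010",
            "--capabilities=" + capabilities + "," + capabilities,
            "--nice-name=system_server",
            "--runtime-args",
            "com.android.server.SystemServer",
        };

The uid and gid of the SystemServer process are both set to 1000, setgroups lists the groups the process belongs to, capabilities sets the process's capabilities, nice-name is the process name, and the class to run is com.android.server.SystemServer.
- 2. forkSystemServer forks the system process. Underneath it still calls the C-level fork function (which relies on copy-on-write). A return value of pid == 0 means the code is running in the newly forked system process.
- 3. When Zygote forks a new process, the child starts out sharing Zygote's memory (copy-on-write) and inherits its open file descriptors. The server socket created in the Zygote process is of no use to the new process, so the new process calls zygoteServer.closeServerSocket() to close it.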
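To make the mapping from the hardcoded command line to fields more concrete, here is a minimal Java sketch. `ForkArgs` is a hypothetical, simplified stand-in written for this article; it is not the AOSP `ZygoteConnection.Arguments` class, and it models only a few of the options shown above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for ZygoteConnection.Arguments. */
final class ForkArgs {
    int uid = -1;
    int gid = -1;
    int[] gids = new int[0];
    String niceName;
    final List<String> remainingArgs = new ArrayList<>();

    static ForkArgs parse(String[] args) {
        ForkArgs parsed = new ForkArgs();
        int i = 0;
        for (; i < args.length; i++) {
            String arg = args[i];
            if (arg.startsWith("--setuid=")) {
                parsed.uid = Integer.parseInt(valueAfterEquals(arg));
            } else if (arg.startsWith("--setgid=")) {
                parsed.gid = Integer.parseInt(valueAfterEquals(arg));
            } else if (arg.startsWith("--setgroups=")) {
                String[] parts = valueAfterEquals(arg).split(",");
                parsed.gids = new int[parts.length];
                for (int j = 0; j < parts.length; j++) {
                    parsed.gids[j] = Integer.parseInt(parts[j]);
                }
            } else if (arg.startsWith("--nice-name=")) {
                parsed.niceName = valueAfterEquals(arg);
            } else if (arg.startsWith("--")) {
                // Options this sketch does not model (--capabilities,
                // --runtime-args, ...) are simply skipped.
            } else {
                break; // first non-option argument: the class to run
            }
        }
        // Everything from the class name onwards is passed through untouched.
        for (; i < args.length; i++) {
            parsed.remainingArgs.add(args[i]);
        }
        return parsed;
    }

    private static String valueAfterEquals(String arg) {
        return arg.substring(arg.indexOf('=') + 1);
    }
}
```

Parsing the array from startSystemServer with this sketch yields uid and gid 1000, the group list as an int array, the nice name system_server, and com.android.server.SystemServer as the only remaining argument.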
3.2 Zygote's forkSystemServer method
/frameworks/base/core/java/com/android/internal/os/Zygote.java

    public static int forkSystemServer(int uid, int gid, int[] gids, int debugFlags,
            int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
        VM_HOOKS.preFork();
        // Resets nice priority for zygote process.
        resetNicePriority();
        int pid = nativeForkSystemServer(
                uid, gid, gids, debugFlags, rlimits, permittedCapabilities, effectiveCapabilities);
        // Enable tracing as soon as we enter the system_server.
        if (pid == 0) {
            Trace.setTracingEnabled(true);
        }
        VM_HOOKS.postForkCommon();
        return pid;
    }
nativeForkSystemServer is a JNI method. It is registered from AndroidRuntime.cpp, which calls register_com_android_internal_os_Zygote() in com_android_internal_os_Zygote.cpp to set up the mapping for the native methods.
/frameworks/base/core/jni/com_android_internal_os_Zygote.cpp

    static jint com_android_internal_os_Zygote_nativeForkSystemServer(
            JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
            jint debug_flags, jobjectArray rlimits, jlong permittedCapabilities,
            jlong effectiveCapabilities) {
      pid_t pid = ForkAndSpecializeCommon(env, uid, gid, gids,
                                          debug_flags, rlimits,
                                          permittedCapabilities, effectiveCapabilities,
                                          MOUNT_EXTERNAL_DEFAULT, NULL, NULL, true, NULL,
                                          NULL, NULL, NULL);
      if (pid > 0) {
          // The zygote process checks whether the child process has died or not.
          ALOGI("System server process %d has been created", pid);
          gSystemServerPid = pid;
          // There is a slight window that the system server process has crashed
          // but it went unnoticed because we haven't published its pid yet. So
          // we recheck here just to make sure that all is well.
          int status;
          if (waitpid(pid, &status, WNOHANG) == pid) {
              ALOGE("System server process %d has died. Restarting Zygote!", pid);
              RuntimeAbort(env, __LINE__, "System server process has died. Restarting Zygote!");
          }
      }
      return pid;
    }
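The waitpid function deserves an explanation here:
- If the child being waited for has already stopped or terminated when waitpid() is called, waitpid() returns immediately. If the child has not stopped or terminated yet, the parent that called waitpid() blocks and is suspended.
- The status parameter receives the child's status information. With it the parent can find out why the child exited: normally, or because of some error. The status is written only if status is not a null pointer.
- The third parameter of waitpid() takes options, two of which matter here. The first is WNOHANG: if the child identified by pid has not terminated, waitpid() returns 0 immediately instead of blocking; if it has terminated, waitpid() returns that child's process id. The second is WUNTRACED: waitpid() also returns right away if the child has entered the stopped state.
So when waitpid(pid, &status, WNOHANG) == pid holds, the SystemServer process has died and the Zygote process needs to be restarted. Next, look at the ForkAndSpecializeCommon function.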
    // Utility routine to fork zygote and specialize the child process.
    static pid_t ForkAndSpecializeCommon(JNIEnv* env, uid_t uid, gid_t gid, jintArray javaGids,
                                         jint debug_flags, jobjectArray javaRlimits,
                                         jlong permittedCapabilities, jlong effectiveCapabilities,
                                         jint mount_external,
                                         jstring java_se_info, jstring java_se_name,
                                         bool is_system_server, jintArray fdsToClose,
                                         jintArray fdsToIgnore,
                                         jstring instructionSet, jstring dataDir) {
      // Install the handler for the child-process signal, see section 3.3
      SetSigChldHandler();
      ......
      // fork the child process
      pid_t pid = fork();

      if (pid == 0) {
        // The child process.
        ......
        if (!is_system_server) {
            int rc = createProcessGroup(uid, getpid());
            if (rc != 0) {
                if (rc == -EROFS) {
                    ALOGW("createProcessGroup failed, kernel missing CONFIG_CGROUP_CPUACCT?");
                } else {
                    ALOGE("createProcessGroup(%d, %d) failed: %s", uid, pid, strerror(-rc));
                }
            }
        }

        SetGids(env, javaGids);  // set the groups

        SetRLimits(env, javaRlimits);  // set the resource limits

        int rc = setresgid(gid, gid, gid);
        if (rc == -1) {
          ALOGE("setresgid(%d) failed: %s", gid, strerror(errno));
          RuntimeAbort(env, __LINE__, "setresgid failed");
        }

        rc = setresuid(uid, uid, uid);  // set the uid
        .......

        SetCapabilities(env, permittedCapabilities, effectiveCapabilities, permittedCapabilities);

        SetSchedulerPolicy(env);  // set the scheduling policy
        .......
        // set up the SELinux context
        rc = selinux_android_setcontext(uid, is_system_server, se_info_c_str, se_name_c_str);
        .......
      } else if (pid > 0) {
        .......
        }
      }
      return pid;
    }
    }  // anonymous namespace
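Note that SetSigChldHandler is called before fork. SetSigChldHandler installs the signal handler SigChldHandler, so when a SIGCHLD signal arrives, execution enters the signal handler shown in section 3.3.
3.3 SystemServer and Zygote live and die together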
    // Configures the SIGCHLD handler for the zygote process. This is configured
    // very late, because earlier in the runtime we may fork() and exec()
    // other processes, and we want to waitpid() for those rather than
    // have them be harvested immediately.
    //
    // This ends up being called repeatedly before each fork(), but there's
    // no real harm in that.
    static void SetSigChldHandler() {
      struct sigaction sa;
      memset(&sa, 0, sizeof(sa));
      sa.sa_handler = SigChldHandler;

      int err = sigaction(SIGCHLD, &sa, NULL);
      if (err < 0) {
        ALOGW("Error setting SIGCHLD handler: %s", strerror(errno));
      }
    }

    // This signal handler is for zygote mode, since the zygote must reap its children
    static void SigChldHandler(int /*signal_number*/) {
      pid_t pid;
      int status;

      // It's necessary to save and restore the errno during this function.
      // Since errno is stored per thread, changing it here modifies the errno
      // on the thread on which this signal handler executes. If a signal occurs
      // between a call and an errno check, it's possible to get the errno set
      // here.
      // See b/23572286 for extra information.
      int saved_errno = errno;

      while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
         // Log process-death status that we care about. In general it is
         // not safe to call LOG(...) from a signal handler because of
         // possible reentrancy. However, we know a priori that the
         // current implementation of LOG() is safe to call from a SIGCHLD
         // handler in the zygote process. If the LOG() implementation
         // changes its locking strategy or its use of syscalls within the
         // lazy-init critical section, its use here may become unsafe.
        if (WIFEXITED(status)) {
          if (WEXITSTATUS(status)) {
            ALOGI("Process %d exited cleanly (%d)", pid, WEXITSTATUS(status));
          }
        } else if (WIFSIGNALED(status)) {
          if (WTERMSIG(status) != SIGKILL) {
            ALOGI("Process %d exited due to signal (%d)", pid, WTERMSIG(status));
          }
          if (WCOREDUMP(status)) {
            ALOGI("Process %d dumped core.", pid);
          }
        }

        // If the just-crashed process is the system_server, bring down zygote
        // so that it is restarted by init and system server will be restarted
        // from there.
        if (pid == gSystemServerPid) {
          ALOGE("Exit zygote because system server (%d) has terminated", pid);
          kill(getpid(), SIGKILL);
        }
      }

      // Note that we shouldn't consider ECHILD an error because
      // the secondary zygote might have no children left to wait for.
      if (pid < 0 && errno != ECHILD) {
        ALOGW("Zygote SIGCHLD error in waitpid: %s", strerror(errno));
      }

      errno = saved_errno;
    }
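The system_server process is zygote's eldest disciple, the first process zygote forks. zygote and system_server can be said to hold up half the sky of the Java world, and if either of them dies the Java world collapses. So if the child process SystemServer dies, Zygote kills itself, which causes init to restart Zygote. In other words, Zygote and SystemServer live and die together.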
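The status checks in SigChldHandler are plain bit tests on the integer that waitpid() fills in. The Java sketch below reproduces that arithmetic under the assumption that the status word uses the usual Linux layout (termination signal in the low 7 bits, core-dump flag in bit 7, exit code in bits 8 to 15). `WaitStatus` and its method names are hypothetical helpers for illustration; they are not part of Android or the JDK.

```java
/** Hypothetical helper that mirrors the Linux W* macros (assumed status layout). */
final class WaitStatus {
    /** WIFEXITED: the low 7 bits are zero when the child called exit(). */
    static boolean exited(int status) {
        return (status & 0x7f) == 0;
    }

    /** WEXITSTATUS: the exit code is stored in bits 8..15. */
    static int exitCode(int status) {
        return (status >> 8) & 0xff;
    }

    /** WIFSIGNALED: low 7 bits hold a signal number (0x7f marks "stopped"). */
    static boolean signaled(int status) {
        int sig = status & 0x7f;
        return sig != 0 && sig != 0x7f;
    }

    /** WTERMSIG: the terminating signal number. */
    static int termSignal(int status) {
        return status & 0x7f;
    }

    /** WCOREDUMP: bit 7 is set when a core file was produced. */
    static boolean coreDumped(int status) {
        return (status & 0x80) != 0;
    }

    /** The handler's final decision: zygote exits only if the reaped child is system_server. */
    static boolean zygoteMustExit(int reapedPid, int systemServerPid) {
        return reapedPid == systemServerPid;
    }
}
```

For example, a status of 0x0100 decodes as a normal exit with code 1, while a status of 9 decodes as termination by signal 9 (SIGKILL), which is the signal zygote sends to itself in the handler.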
3.4 handleSystemServerProcess handles the newly forked process
/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

    /**
     * Finish remaining work for the newly forked system server process.
     */
    private static void handleSystemServerProcess(
            ZygoteConnection.Arguments parsedArgs)
            throws Zygote.MethodAndArgsCaller {

        // set umask to 0077 so new files and directories will default to owner-only permissions.
        Os.umask(S_IRWXG | S_IRWXO);
        // Set the name of the new process
        if (parsedArgs.niceName != null) {
            Process.setArgV0(parsedArgs.niceName);
        }
        // Get systemServerClasspath
        final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
        if (systemServerClasspath != null) {
            // Optimize the dex files on systemServerClasspath, see "Further reading"
            performSystemServerDexOpt(systemServerClasspath);
            // Capturing profiles is only supported for debug or eng builds since selinux normally
            // prevents it.
            boolean profileSystemServer = SystemProperties.getBoolean(
                    "dalvik.vm.profilesystemserver", false);
            if (profileSystemServer && (Build.IS_USERDEBUG || Build.IS_ENG)) {
                try {
                    File profileDir = Environment.getDataProfilesDePackageDirectory(
                            Process.SYSTEM_UID, "system_server");
                    File profile = new File(profileDir, "primary.prof");
                    profile.getParentFile().mkdirs();
                    profile.createNewFile();
                    String[] codePaths = systemServerClasspath.split(":");
                    VMRuntime.registerAppInfo(profile.getPath(), codePaths);
                } catch (Exception e) {
                    Log.wtf(TAG, "Failed to set up system server profile", e);
                }
            }
        }
        // invokeWith is null here, so the else branch is taken
        if (parsedArgs.invokeWith != null) {
            String[] args = parsedArgs.remainingArgs;
            // If we have a non-null system server class path, we'll have to duplicate the
            // existing arguments and append the classpath to it. ART will handle the classpath
            // correctly when we exec a new process.
            if (systemServerClasspath != null) {
                String[] amendedArgs = new String[args.length + 2];
                amendedArgs[0] = "-cp";
                amendedArgs[1] = systemServerClasspath;
                System.arraycopy(args, 0, amendedArgs, 2, args.length);
                args = amendedArgs;
            }

            WrapperInit.execApplication(parsedArgs.invokeWith,
                    parsedArgs.niceName, parsedArgs.targetSdkVersion,
                    VMRuntime.getCurrentInstructionSet(), null, args);
        } else {
            ClassLoader cl = null;
            if (systemServerClasspath != null) {
                cl = createPathClassLoader(systemServerClasspath, parsedArgs.targetSdkVersion);

                Thread.currentThread().setContextClassLoader(cl);
            }

            /*
             * Pass the remaining arguments to SystemServer. See section 3.5
             */
            ZygoteInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
        }

        /* should never reach here */
    }
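The umask comment at the top of handleSystemServerProcess can be checked with a little arithmetic. The sketch below assumes the standard POSIX values for S_IRWXG (070) and S_IRWXO (007); `UmaskDemo` is a hypothetical class used only for this illustration and does not call into the OS.

```java
/** Hypothetical illustration of how a umask narrows the requested file mode. */
final class UmaskDemo {
    static final int S_IRWXG = 0070; // read/write/execute for the group
    static final int S_IRWXO = 0007; // read/write/execute for others

    /** The kernel clears every permission bit that is set in the umask. */
    static int effectiveMode(int requestedMode, int umask) {
        return requestedMode & ~umask;
    }

    /** Formats a mode as a four-digit octal string, e.g. 0600. */
    static String octal(int mode) {
        return String.format("%04o", mode);
    }
}
```

With the mask S_IRWXG | S_IRWXO (octal 0077), a file requested with mode 0666 ends up as 0600 and a directory requested with 0777 ends up as 0700, which is the owner-only default the source comment describes.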
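Further reading:
In Android, all of an app's code lives in a Dex file. A Dex file is a Jar-like archive that stores all of the compiled Java bytecode. Because Android uses the Dalvik virtual machine, the class files produced by the Java compiler have to be converted into class files that Dalvik can execute. It is worth stressing that a Dex file, like a Jar, is an archive, and that what it contains is still the bytecode corresponding to the Java code. When Android launches an app, one of the steps is to optimize the Dex file. This is handled by a dedicated tool called DexOpt, which runs the first time the Dex file is loaded and produces an ODEX file, that is, an Optimised Dex. Executing the ODEX is considerably more efficient than executing the Dex file directly. In early Android versions, however, DexOpt had a problem: it collects the method ids of every class into a linked-list structure whose length is stored in a short, so the number of method ids cannot exceed 65536. For a large enough project this upper limit is clearly too low. Although DexOpt fixed the problem in newer Android versions, older systems still have to be supported.
Android provides dexopt, a dedicated tool for verifying and optimizing dex files. Its source is in the dalvik/dexopt directory of the Android source tree. The contents of the classpath are as follows:
systemServerClasspath = /system/framework/services.jar:/system/framework/ethernet-service.jar:/system/framework/wifi-service.jar
These three jars are then extracted from the path and checked to see whether they need dexopt optimization. If they do, installer is called to optimize them.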
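Extracting the jars from the path is a simple split on the ':' separator, the same split handleSystemServerProcess applies to build codePaths. A minimal sketch, where `ClasspathSplit` is a hypothetical helper and the check for whether a jar actually needs dexopt is deliberately left out:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits a ':'-separated classpath into its jars. */
final class ClasspathSplit {
    static List<String> elements(String classpath) {
        List<String> jars = new ArrayList<>();
        if (classpath == null) {
            return jars; // SYSTEMSERVERCLASSPATH may be unset
        }
        for (String element : classpath.split(":")) {
            if (!element.isEmpty()) {
                jars.add(element);
            }
        }
        return jars;
    }
}
```

Applied to the value shown above, this yields three entries: services.jar, ethernet-service.jar and wifi-service.jar, each with its /system/framework/ prefix.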
3.5 The zygoteInit method
/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

    /**
     * The main function called when started through the zygote process. This
     * could be unified with main(), if the native code in nativeFinishInit()
     * were rationalized with Zygote startup.
     *
     * Current recognized args:
     * <ul>
     *   <li> [--] <start class name> <args>
     * </ul>
     *
     * @param targetSdkVersion target SDK version
     * @param argv arg strings
     */
    public static final void zygoteInit(int targetSdkVersion, String[] argv,
            ClassLoader classLoader) throws Zygote.MethodAndArgsCaller {
        if (RuntimeInit.DEBUG) {
            Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
        }

        Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
        // See section 3.5.1
        RuntimeInit.redirectLogStreams();
        // See section 3.5.2
        RuntimeInit.commonInit();
        // See section 3.5.3
        ZygoteInit.nativeZygoteInit();
        // See section 3.5.4
        RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
    }
3.5.1 RuntimeInit's redirectLogStreams method
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    /**
     * Redirect System.out and System.err to the Android log.
     */
    public static void redirectLogStreams() {
        System.out.close();
        System.setOut(new AndroidPrintStream(Log.INFO, "System.out"));
        System.err.close();
        System.setErr(new AndroidPrintStream(Log.WARN, "System.err"));
    }

This initializes the Android log output streams: System.out and System.err are closed, and both are then redirected to the Android log.
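The same redirection idea can be tried on a plain JVM. The sketch below is an assumption-laden analogue, not AndroidPrintStream: `LineTaggingStream` is a hypothetical class that collects complete lines into a list and prefixes each with a tag, standing in for a call into the Android log.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.List;

/** Hypothetical sink that forwards each complete line, prefixed with a tag. */
final class LineTaggingStream extends OutputStream {
    private final String tag;
    private final List<String> sink;
    private final ByteArrayOutputStream line = new ByteArrayOutputStream();

    LineTaggingStream(String tag, List<String> sink) {
        this.tag = tag;
        this.sink = sink;
    }

    @Override
    public synchronized void write(int b) {
        if (b == '\n') {
            // End of line: hand the buffered text to the sink and start over.
            sink.add(tag + ": " + new String(line.toByteArray()));
            line.reset();
        } else if (b != '\r') {
            line.write(b);
        }
    }

    /** Wraps the sink in an auto-flushing PrintStream suitable for System.setOut. */
    static PrintStream asPrintStream(String tag, List<String> sink) {
        return new PrintStream(new LineTaggingStream(tag, sink), true);
    }
}
```

Calling `System.setOut(LineTaggingStream.asPrintStream("System.out", lines))` would then route every `System.out.println(...)` into the list, in the same spirit as the two setOut and setErr calls above.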
3.5.2 RuntimeInit's commonInit method
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    protected static final void commonInit() {
        if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");

        /*
         * set handlers; these apply to all threads in the VM. Apps can replace
         * the default handler, but not the pre handler.
         */
        // Set how the process handles uncaught exceptions. LoggingHandler is installed by
        // default and prints the stack trace of the failure. See section 3.5.2.1
        Thread.setUncaughtExceptionPreHandler(new LoggingHandler());
        // Enter the crash-handling flow and tell AMS to show a dialog, see section 3.5.2.2
        Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler());

        /*
         * Install a TimezoneGetter subclass for ZoneInfo.db (sets the time zone)
         */
        TimezoneGetter.setInstance(new TimezoneGetter() {
            @Override
            public String getId() {
                return SystemProperties.get("persist.sys.timezone");
            }
        });
        TimeZone.setDefault(null);

        /*
         * Sets handler for java.util.logging to use Android log facilities.
         * The odd "new instance-and-then-throw-away" is a mirror of how
         * the "java.util.logging.config.class" system property works. We
         * can't use the system property here since the logger has almost
         * certainly already been initialized.
         */
        LogManager.getLogManager().reset();
        new AndroidConfig();

        /*
         * Sets the default HTTP User-Agent used by HttpURLConnection.
         */
        String userAgent = getDefaultUserAgent();
        System.setProperty("http.agent", userAgent);

        /*
         * Wire socket tagging to traffic stats.
         */
        NetworkManagementSocketTagger.install();

        /*
         * If we're running in an emulator launched with "-trace", put the
         * VM into emulator trace profiling mode so that the user can hit
         * F9/F10 at any time to capture traces. This has performance
         * consequences, so it's not something you want to do always.
         */
        String trace = SystemProperties.get("ro.kernel.android.tracing");
        if (trace.equals("1")) {
            Slog.i(TAG, "NOTE: emulator trace profiling enabled");
            Debug.enableEmulatorTraceOutput();
        }

        initialized = true;
    }
3.5.2.1 How a process's crash stack trace is captured
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    /**
     * Logs a message when a thread encounters an uncaught exception. By
     * default, {@link KillApplicationHandler} will terminate this process later,
     * but apps can override that behavior.
     */
    private static class LoggingHandler implements Thread.UncaughtExceptionHandler {
        @Override
        public void uncaughtException(Thread t, Throwable e) {
            // Don't re-enter if KillApplicationHandler has already run
            if (mCrashing) return;
            if (mApplicationObject == null) {
                // The "FATAL EXCEPTION" string is still used on Android even though
                // apps can set a custom UncaughtExceptionHandler that renders uncaught
                // exceptions non-fatal.
                Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
            } else {
                StringBuilder message = new StringBuilder();
                // The "FATAL EXCEPTION" string is still used on Android even though
                // apps can set a custom UncaughtExceptionHandler that renders uncaught
                // exceptions non-fatal.
                message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
                final String processName = ActivityThread.currentProcessName();
                if (processName != null) {
                    message.append("Process: ").append(processName).append(", ");
                }
                message.append("PID: ").append(Process.myPid());
                Clog_e(TAG, message.toString(), e);
            }
        }
    }
A Java crash in an application starts with FATAL EXCEPTION, for example:

    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: FATAL EXCEPTION: main
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: Process: com.xiaomi.scanner, PID: 17635
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: java.lang.IllegalArgumentException: View=DecorView@77ff3a0[] not attached to window manager
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.view.WindowManagerGlobal.findViewLocked(WindowManagerGlobal.java:491)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.view.WindowManagerGlobal.removeView(WindowManagerGlobal.java:400)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.view.WindowManagerImpl.removeViewImmediate(WindowManagerImpl.java:125)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.app.Dialog.dismissDialog(Dialog.java:374)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.app.Dialog.dismiss(Dialog.java:357)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity.b(Unknown Source:14)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.a.d.c(Unknown Source:39)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.a.d.a(Unknown Source:53)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity.b(Unknown Source:30)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity.b(Unknown Source:0)
    01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity$6.onJsPrompt(Unknown
A Java crash in the system process starts with *** FATAL EXCEPTION IN SYSTEM PROCESS, for example:

    logcat.log.01:2211: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: *** FATAL EXCEPTION IN SYSTEM PROCESS: android.bg
    logcat.log.01:2212: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: java.lang.NullPointerException: Attempt to get length of null array
    logcat.log.01:2213: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: at com.android.server.net.NetworkPolicyManagerService.isUidIdle(NetworkPolicyManagerService.java:2318)
    logcat.log.01:2214: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: at com.android.server.net.NetworkPolicyManagerService.updateRuleForAppIdleLocked(NetworkPolicyManagerService.java:2244)
    logcat.log.01:2215: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: at com.android.server.net.NetworkPolicyManagerService.updateRulesForTempWhitelistChangeLocked(NetworkPolicyManagerService.java:2298)
    logcat.log.01:2216: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: at com.android.server.net.NetworkPolicyManagerService$3.run(NetworkPolicyManagerService.java:572)
    logcat.log.01:2217: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: at android.os.Handler.handleCallback(Handler.java:739)
    logcat.log.01:2218: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:95)
    logcat.log.01:2219: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: at android.os.Looper.loop(Looper.java:148)
    logcat.log.01:2220: 08-27 16:41:16.664 2999 3026 E AndroidRuntime: at android.os.HandlerThread.run(HandlerThread.java:61)
    logcat.log.01:2221: 08-27 16:41:16.665 2999 3026 I am_crash: [2999,0,system_server,-1,java.lang.NullPointerException,Attempt to get length of null array,NetworkPolicyManagerService.java,2318]
    logcat.log.01:2224: 08-27 16:41:16.696 2999 3026 I MitvActivityManagerService: handleApplicationCrash, processName: system_server
    logcat.log.01:2225: 08-27 16:41:16.696 2999 3026 I Process : Sending signal. PID: 2999 SIG: 9
3.5.2.2 When a Java exception (JE) occurs, a dialog notifies the user

    private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
        public void uncaughtException(Thread t, Throwable e) {
            try {
                // Don't re-enter -- avoid infinite loops if crash-reporting crashes.
                if (mCrashing) return;
                mCrashing = true;

                // Try to end profiling. If a profiler is running at this point, and we kill the
                // process (below), the in-memory buffer will be lost. So try to stop, which will
                // flush the buffer. (This makes method trace profiling useful to debug crashes.)
                if (ActivityThread.currentActivityThread() != null) {
                    ActivityThread.currentActivityThread().stopProfiling();
                }

                // Bring up crash dialog, wait for it to be dismissed (this asks AMS to show the dialog)
                ActivityManager.getService().handleApplicationCrash(
                        mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
            } catch (Throwable t2) {
                if (t2 instanceof DeadObjectException) {
                    // System process is dead; ignore
                } else {
                    try {
                        Clog_e(TAG, "Error reporting crash", t2);
                    } catch (Throwable t3) {
                        // Even Clog_e() fails!  Oh well.
                    }
                }
            } finally {
                // Try everything to make sure this process goes away.
                Process.killProcess(Process.myPid());
                System.exit(10);
            }
        }
    }
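The re-entrancy guard used by both handlers (the mCrashing flag) can be sketched with the standard Java API alone. `CrashGuardHandler` below is a hypothetical class for illustration. It only records a report instead of contacting AMS and killing the process, and it does not use setUncaughtExceptionPreHandler, which is not part of the public Java API.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical handler that reports only the first crash it sees. */
final class CrashGuardHandler implements Thread.UncaughtExceptionHandler {
    private final AtomicBoolean crashing = new AtomicBoolean(false);
    final List<String> reports = new CopyOnWriteArrayList<>();

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        // Don't re-enter: if crash reporting itself crashes, or a second
        // thread dies at the same time, only the first call gets through.
        if (!crashing.compareAndSet(false, true)) {
            return;
        }
        reports.add("FATAL EXCEPTION: " + t.getName() + ": " + e);
        // A real handler would now report the crash and terminate the
        // process; this sketch deliberately stops at recording it.
    }
}
```

Installing it process-wide would look like `Thread.setDefaultUncaughtExceptionHandler(new CrashGuardHandler())`. A second crash arriving while the first is being handled is then ignored, just as the `if (mCrashing) return;` checks do above.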
3.5.3 ZygoteInit's nativeZygoteInit method
nativeZygoteInit is a JNI method that is registered in AndroidRuntime.cpp.
/frameworks/base/core/jni/AndroidRuntime.cpp

    static const RegJNIRec gRegJNI[] = {
        REG_JNI(register_com_android_internal_os_RuntimeInit),
        REG_JNI(register_com_android_internal_os_ZygoteInit),
        .....

/frameworks/base/core/jni/AndroidRuntime.cpp

    int register_com_android_internal_os_ZygoteInit(JNIEnv* env)
    {
        const JNINativeMethod methods[] = {
            { "nativeZygoteInit", "()V",
                (void*) com_android_internal_os_ZygoteInit_nativeZygoteInit },
        };
        return jniRegisterNativeMethods(env, "com/android/internal/os/ZygoteInit",
            methods, NELEM(methods));
    }
So the function that actually gets called is com_android_internal_os_ZygoteInit_nativeZygoteInit.
/frameworks/base/core/jni/AndroidRuntime.cpp

    static void com_android_internal_os_ZygoteInit_nativeZygoteInit(JNIEnv* env, jobject clazz)
    {
        gCurRuntime->onZygoteInit();
    }
com_android_internal_os_ZygoteInit_nativeZygoteInit calls AndroidRuntime's onZygoteInit function. onZygoteInit is a virtual function, however, and its implementation is in app_main.cpp.
/frameworks/base/cmds/app_process/app_main.cpp

    virtual void onZygoteInit()
    {
        sp<ProcessState> proc = ProcessState::self();
        ALOGV("App process: starting thread pool.\n");
        // Start the Binder thread pool
        proc->startThreadPool();
    }
/frameworks/native/libs/binder/ProcessState.cpp

    void ProcessState::startThreadPool()
    {
        AutoMutex _l(mLock);
        if (!mThreadPoolStarted) {
            mThreadPoolStarted = true;
            spawnPooledThread(true);
        }
    }

/frameworks/native/libs/binder/ProcessState.cpp

    void ProcessState::spawnPooledThread(bool isMain)
    {
        if (mThreadPoolStarted) {
            String8 name = makeBinderThreadName();
            ALOGV("Spawning new pooled thread, name=%s\n", name.string());
            sp<Thread> t = new PoolThread(isMain);
            t->run(name.string());
        }
    }

/frameworks/native/libs/binder/ProcessState.cpp

    String8 ProcessState::makeBinderThreadName() {
        int32_t s = android_atomic_add(1, &mThreadPoolSeq);
        pid_t pid = getpid();
        String8 name;
        name.appendFormat("Binder:%d_%X", pid, s);
        return name;
    }
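The naming scheme in makeBinderThreadName explains the Binder:<pid>_<hex> thread names that show up in traces. Here is a small Java analogue, assuming that the sequence counter starts at 1 and that android_atomic_add returns the value from before the addition. `PoolThreadNamer` is a hypothetical class, not a Binder API.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical analogue of ProcessState::makeBinderThreadName. */
final class PoolThreadNamer {
    // Assumption: the native counter starts at 1.
    private final AtomicInteger seq = new AtomicInteger(1);
    private final long pid;

    PoolThreadNamer(long pid) {
        this.pid = pid;
    }

    /** Returns the next name; the sequence number is printed in upper-case hex. */
    String next() {
        // getAndIncrement returns the value before the increment, which is
        // the behaviour this sketch assumes for android_atomic_add.
        int s = seq.getAndIncrement();
        return String.format(Locale.ROOT, "Binder:%d_%X", pid, s);
    }
}
```

Under these assumptions, a process with pid 2999 names its first pooled thread Binder:2999_1 and its tenth Binder:2999_A.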
3.5.4 RuntimeInit's applicationInit method
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    protected static void applicationInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
            throws Zygote.MethodAndArgsCaller {
        // If the application calls System.exit(), terminate the process
        // immediately without running any shutdown hooks.  It is not possible to
        // shutdown an Android application gracefully.  Among other things, the
        // Android runtime shutdown hooks close the Binder driver, which can cause
        // leftover running threads to crash before the process actually exits.
        nativeSetExitWithoutCleanup(true);

        // We want to be fairly aggressive about heap utilization, to avoid
        // holding on to a lot of memory that isn't needed.
        VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
        VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);

        final Arguments args;
        try {
            // Assigns com.android.server.SystemServer to startClass
            args = new Arguments(argv);
        } catch (IllegalArgumentException ex) {
            Slog.e(TAG, ex.getMessage());
            // let the process exit
            return;
        }

        // The end of of the RuntimeInit event (see #zygoteInit).
        Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);

        // Remaining arguments are passed to the start class's static main
        invokeStaticMain(args.startClass, args.startArgs, classLoader);
    }
After the Arguments constructor runs in applicationInit, args.startClass holds com.android.server.SystemServer.
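The Javadoc of zygoteInit gives the accepted shape of argv as `[--] <start class name> <args>`. The sketch below shows one way such a parse can work. `StartArgs` is a hypothetical class modelled on that description; it is not a copy of RuntimeInit.Arguments.

```java
import java.util.Arrays;

/** Hypothetical parser for "[--] <start class name> <args>". */
final class StartArgs {
    final String startClass;
    final String[] startArgs;

    StartArgs(String[] args) {
        int cur = 0;
        for (; cur < args.length; cur++) {
            String arg = args[cur];
            if (arg.equals("--")) {
                cur++; // explicit end of options; the class name follows
                break;
            }
            if (!arg.startsWith("--")) {
                break; // first non-option argument is the class name
            }
        }
        if (cur == args.length) {
            throw new IllegalArgumentException("Missing class name argument");
        }
        startClass = args[cur++];
        // Whatever is left is handed to the start class's main method.
        startArgs = Arrays.copyOfRange(args, cur, args.length);
    }
}
```

For the system server the only remaining argument is com.android.server.SystemServer, so startClass takes that value and startArgs is empty.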
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    private static void invokeStaticMain(String className, String[] argv, ClassLoader classLoader)
            throws Zygote.MethodAndArgsCaller {
        Class<?> cl;

        try {
            cl = Class.forName(className, true, classLoader);
        } catch (ClassNotFoundException ex) {
            throw new RuntimeException(
                    "Missing class when invoking static main " + className,
                    ex);
        }

        Method m;
        try {
            m = cl.getMethod("main", new Class[] { String[].class });
        } catch (NoSuchMethodException ex) {
            throw new RuntimeException(
                    "Missing static main on " + className, ex);
        } catch (SecurityException ex) {
            throw new RuntimeException(
                    "Problem getting static main on " + className, ex);
        }

        int modifiers = m.getModifiers();
        if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException(
                    "Main method is not public and static on " + className);
        }

        /*
         * This throw gets caught in ZygoteInit.main(), which responds
         * by invoking the exception's run() method. This arrangement
         * clears up all the stack frames that were required in setting
         * up the process.
         */
        throw new Zygote.MethodAndArgsCaller(m, argv);
    }
This loads the bytecode of com.android.server.SystemServer, looks up the class's main method by reflection to obtain a Method object, and throws a Zygote.MethodAndArgsCaller exception. That brings us back to ZygoteInit's main method, where we started. The call chain is ZygoteInit.main-->ZygoteInit.startSystemServer-->Zygote.forkSystemServer-->com_android_internal_os_Zygote_nativeForkSystemServer-->ForkAndSpecializeCommon-->fork-->ZygoteInit.handleSystemServerProcess-->ZygoteInit.zygoteInit-->RuntimeInit.applicationInit-->RuntimeInit.invokeStaticMain. It ends in invokeStaticMain, which throws a Zygote.MethodAndArgsCaller exception that is caught by ZygoteInit.main.
frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

    public static void main(String argv[]) {
        // 1. Create the ZygoteServer
        ZygoteServer zygoteServer = new ZygoteServer();
        try {
            // 2. Create the server-side socket
            zygoteServer.registerServerSocket(socketName);
            // 3. Preload the classes and resources for the process
            preload(bootTimingsTraceLog);
            if (startSystemServer) {
                // 4. Start the SystemServer process: the zygote's first "cell division"
                startSystemServer(abiList, socketName, zygoteServer);
            }
            // 5. Enter an endless loop that listens for messages from clients
            zygoteServer.runSelectLoop(abiList);
            // 6. Close the server socket
            zygoteServer.closeServerSocket();
        } catch (Zygote.MethodAndArgsCaller caller) {
            // 7. Catch the exception and call MethodAndArgsCaller's run method
            caller.run();
        } catch (Throwable ex) {
            Log.e(TAG, "System zygote died with exception", ex);
            zygoteServer.closeServerSocket();
            throw ex;
        }
    }
/frameworks/base/core/java/com/android/internal/os/Zygote.java

    public static class MethodAndArgsCaller extends Exception
            implements Runnable {
        /** method to call */
        private final Method mMethod;

        /** argument array */
        private final String[] mArgs;

        public MethodAndArgsCaller(Method method, String[] args) {
            mMethod = method;  // constructor: stores SystemServer's main method in mMethod
            mArgs = args;
        }

        public void run() {
            try {
                // Invoke SystemServer's main method, which enters SystemServer.main
                mMethod.invoke(null, new Object[] { mArgs });
            } catch (IllegalAccessException ex) {
                throw new RuntimeException(ex);
            } catch (InvocationTargetException ex) {
                Throwable cause = ex.getCause();
                if (cause instanceof RuntimeException) {
                    throw (RuntimeException) cause;
                } else if (cause instanceof Error) {
                    throw (Error) cause;
                }
                throw new RuntimeException(ex);
            }
        }
    }
    }
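The reflective lookup in invokeStaticMain and the later `mMethod.invoke(null, ...)` call can be tried on any JVM. The sketch below uses only the standard reflection API; `MainFinder` and its nested `DemoEntry` target are hypothetical names introduced for this example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Hypothetical helper that locates a public static main(String[]) by name. */
final class MainFinder {
    static Method findStaticMain(String className, ClassLoader loader) {
        Class<?> cl;
        try {
            cl = Class.forName(className, true, loader);
        } catch (ClassNotFoundException ex) {
            throw new RuntimeException("Missing class " + className, ex);
        }
        Method m;
        try {
            m = cl.getMethod("main", String[].class);
        } catch (NoSuchMethodException ex) {
            throw new RuntimeException("Missing static main on " + className, ex);
        }
        int modifiers = m.getModifiers();
        if (!(Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException("Main method is not public and static on " + className);
        }
        return m;
    }

    /** Hypothetical entry point standing in for com.android.server.SystemServer. */
    public static final class DemoEntry {
        static String[] lastArgs;

        public static void main(String[] args) {
            lastArgs = args; // record the call so it can be observed
        }
    }
}
```

A caller would obtain the method with `MainFinder.findStaticMain(MainFinder.DemoEntry.class.getName(), MainFinder.class.getClassLoader())` and run it with `m.invoke(null, new Object[] { args })`. The argument array has to be wrapped in an Object[] so that it is passed as the single String[] parameter, which is also why MethodAndArgsCaller.run wraps mArgs.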
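- Question: why is SystemServer's main method invoked by throwing an exception?
Forking a process starting from ZygoteInit's main goes through many layers of calls, so quite a few stack frames have piled up on the call stack by then. To give the new process a clean start, those frames need to be cleared away, and that is why the exception is thrown.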
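The effect on the stack can be measured with a small experiment. The sketch below is an illustration under simplified assumptions: `Trampoline` and `Deferred` are hypothetical names, the recursive `setup` method stands in for the layered startup calls, and stack depth is approximated by the length of the current stack trace.

```java
/** Hypothetical demo: deferring a call via an exception unwinds the setup frames. */
final class Trampoline {
    /** Carries the deferred work up the stack, like MethodAndArgsCaller does. */
    static final class Deferred extends Exception {
        final Runnable task;

        Deferred(Runnable task) {
            super("deferred call");
            this.task = task;
        }
    }

    static int depthWhenCalledDirectly;
    static int depthAfterUnwinding;

    static int currentDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    /** Simulates several layers of setup calls before the real work starts. */
    static void setup(int level, boolean unwind) throws Deferred {
        if (level > 0) {
            setup(level - 1, unwind);
            return;
        }
        if (unwind) {
            // Throw instead of calling: every setup frame is discarded.
            throw new Deferred(() -> depthAfterUnwinding = currentDepth());
        }
        depthWhenCalledDirectly = currentDepth();
    }

    static void run() {
        try {
            setup(20, false); // do the work at the bottom of the setup frames
            setup(20, true);  // defer the work to the catch block below
        } catch (Deferred d) {
            d.task.run();     // runs with only this method's caller chain beneath it
        }
    }
}
```

After `Trampoline.run()`, depthWhenCalledDirectly is larger than depthAfterUnwinding because the 21 setup frames are gone by the time the deferred task runs. ZygoteInit.main obtains the same effect by catching MethodAndArgsCaller and calling caller.run().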
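4. Summary
This article went through how the SystemServer process is started, which is the zygote's first "cell division". There are a few key points to take away:
- 1. The special use of waitpid
- 2. SystemServer and Zygote live and die together
- 3. How a process's crash stack trace is printed, and how the error dialog is shown
- 4. Why SystemServer's main method is invoked by throwing an exception
The next article will go through what SystemServer's main method does.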