Kernel/make compilation time and how to improve compile times/compile the Linux kernel faster without hardware upgrades

I thought only a faster CPU and SSD would help, but I already have a quad-core CPU and it wasn't being maxed out. The actual tests, though, were performed on an AMD-V enabled dual-core VMware container with 128MB of RAM.

There is a flag that can be passed to make to run multiple jobs in parallel (I'll loosely call them threads below). By specifying 4 threads I was able to reduce the whole from-scratch kernel compilation time by about 50% (65 minutes vs. 31 minutes). *Yes, I did do a make clean before each compilation too!

*Part of the slow kernel compile time is that I use the slow method of making my own initramfs (not pre-compressed, so the kernel compile takes something like 10x longer for the same thing I could do with a script, which I normally do).

Normal Make (single thread by default):

make

real 65m18.956s

Threaded Make (4 threads):

make -j 4

real 31m57.877s

second run:

real 27m28.745s
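The "about 50%" figure can be checked from the timings above. This is just a small sketch of the arithmetic; the function names to_seconds and reduction_pct are made up for illustration.

```shell
#!/bin/sh
# Convert a `time`-style value such as 65m18.956s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the percentage reduction from a baseline time to a new time.
reduction_pct() {
    base=$(to_seconds "$1")
    new=$(to_seconds "$2")
    awk -v b="$base" -v n="$new" 'BEGIN { printf "%.1f\n", (b - n) / b * 100 }'
}

reduction_pct 65m18.956s 31m57.877s   # plain make vs. make -j 4, first run
reduction_pct 65m18.956s 27m28.745s   # plain make vs. make -j 4, second run
```

The first run works out to a reduction of roughly 51% and the second to roughly 58%, which matches the "about 50%" claim.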

Threaded Make (8 threads):

*I believe the worse result is likely due to swapping, since I only had 128MB of RAM. Perhaps a lot more RAM could improve things too.

real 58m29.142s
user 33m3.616s
sys 19m13.064s

By increasing RAM to 512MB, here are the results (when compiling, RAM is more important than CPU and disk speed):

real 18m46.933s
user 20m11.776s
sys 6m51.334s

With 1GB of RAM

real 18m38.608s
user 20m31.857s
sys 7m46.141s

I believe the time was disappointing because of the initramfs creation.

With pre-created initramfs linked into kernel:

real 10m47.362s
user 18m24.837s
sys 1m48.095s

With 12 threads:

real 10m34.550s

Clearly the threads no longer help once the CPU is maxed out. I didn't check, but considering that with 8 threads I was often at 80-90% CPU, the CPU is now the bottleneck. I'm going to increase my cores to 4 and try again.

It only shaved off 13 seconds, but the crazy thing is that initramfs takes 8 minutes to create alone! That's how inefficient the routine from the kernel is. The same initramfs is created through a script in about 1 minute or less!


Snapshot of top with 8 threads showing high iowait:

You can really see iowait starting to become a factor (40-60% on both cores on average). I'm already running RAID 1 with 7200 RPM 1TB drives. I believe an SSD would make a huge improvement with the iowait. The CPU io often hits 70-80%, but I believe the main culprit is the high iowait. With 8 threads the system is quite unresponsive, even to shell commands and typing.

11:24:56 up 54 days, 1:29, 5 users, load average: 11.80, 10.36, 6.08

top - 11:22:12 up 54 days, 1:27, 5 users, load average: 11.79, 9.53, 5.04
Tasks: 121 total, 6 running, 97 sleeping, 18 stopped, 0 zombie
Cpu0 : 18.1% us, 13.3% sy, 0.0% ni, 0.0% id, 67.7% wa, 0.0% hi, 0.9% si
Cpu1 : 13.3% us, 19.5% sy, 0.0% ni, 0.0% id, 53.6% wa, 1.1% hi, 12.4% si
Mem: 126980k total, 116308k used, 10672k free, 788k buffers
Swap: 377488k total, 107036k used, 270452k free, 5180k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7970 root 18 0 17596 14m 2976 D 18.2 11.5 0:00.75 cc1
7932 root 18 0 19220 13m 1612 D 17.8 10.9 0:00.94 cc1
7923 root 18 0 24152 13m 3140 R 13.4 11.0 0:01.37 cc1
7938 root 18 0 19448 12m 1612 R 10.8 10.4 0:00.83 cc1
14 root 10 -5 0 0 0 S 8.0 0.0 5:17.26 kblockd/1
7687 root 18 0 36760 13m 3036 R 7.6 10.7 0:05.45 cc1
7905 root 18 0 21188 14m 3064 D 7.0 11.5 0:00.64 cc1
118 root 10 -5 0 0 0 D 3.5 0.0 15:23.71 kswapd0
7915 root 18 0 27164 17m 1616 D 2.9 14.1 0:00.69 cc1
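To pull just the iowait numbers out of a saved snapshot like the one above, something along these lines should work. It is only a sketch: the function name iowait_pct is made up, and it assumes the older top layout shown in this post ("67.7% wa,"); newer versions of top format the Cpu lines differently.

```shell
#!/bin/sh
# iowait_pct: read top output on stdin and print "<cpu> <iowait %>"
# for each per-CPU line. Assumes the "NN.N% wa," layout shown above.
iowait_pct() {
    awk '/^Cpu[0-9]+/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^wa,?$/) {
                v = $(i - 1)       # the value sits just before the "wa," label
                sub(/%/, "", v)    # drop the trailing percent sign
                print $1, v
            }
    }'
}

# The two Cpu lines from the 128MB snapshot above.
iowait_pct <<'EOF'
Cpu0 : 18.1% us, 13.3% sy, 0.0% ni, 0.0% id, 67.7% wa, 0.0% hi, 0.9% si
Cpu1 : 13.3% us, 19.5% sy, 0.0% ni, 0.0% id, 53.6% wa, 1.1% hi, 12.4% si
EOF
```

With kswapd0 also showing up in the process list, high numbers here point at swapping rather than at the compile itself.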

With 512MB of RAM instead of 128MB

real 18m46.933s
user 20m11.776s
sys 6m51.334s

Things don't feel lagged at all on the system, unlike last time when it had 128MB of RAM.

The load is lower and iowait is virtually non-existent.

03:52:52 up 4 min, 2 users, load average: 10.55, 5.14, 2.02

04:02:46 up 14 min, 2 users, load average: 1.85, 5.12, 4.13

top - 03:53:11 up 5 min, 2 users, load average: 10.30, 5.43, 2.18
Tasks: 93 total, 12 running, 81 sleeping, 0 stopped, 0 zombie
Cpu0 : 90.0% us, 9.7% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.3% hi, 0.0% si
Cpu1 : 89.3% us, 10.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.7% si
Mem: 516820k total, 431108k used, 85712k free, 176120k buffers
Swap: 377488k total, 0k used, 377488k free, 111164k cached

top - 04:02:42 up 14 min, 2 users, load average: 1.84, 5.18, 4.14
Tasks: 55 total, 2 running, 53 sleeping, 0 stopped, 0 zombie
Cpu0 : 10.0% us, 40.2% sy, 0.0% ni, 49.8% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu1 : 11.6% us, 37.9% sy, 0.0% ni, 49.5% id, 1.0% wa, 0.0% hi, 0.0% si

Free Memory gets low sometimes:

total used free shared buffers cached
Mem: 504 486 18 0 4 427

With 1GB

Clearly 1GB is the sweet spot. I'm tempted to turn the threads up from 8 to at least 12 or 16.

We can also see that CPU usage gets higher, so it is a factor, and that iowait when compiling is usually caused by swapping because of too little RAM.

Cpu0 : 92.0% us, 8.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu1 : 93.4% us, 6.6% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si

total used free shared buffers cached
Mem: 1012 408 603 0 37 291

04:24:24 up 13 min, 2 users, load average: 2.43, 6.33, 4.47

04:33:13 up 21 min, 2 users, load average: 10.58, 7.00, 5.06

Even with the high load the system is very responsive, unlike at 128MB of RAM.

Free Mem does get low still:

total used free shared buffers cached
Mem: 1012 986 26 0 7 900
-/+ buffers/cache: 77 934
Swap: 368 0 368
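A low "free" number on its own is not necessarily a problem, since buffers and cache can be given back when programs need the memory (that is what the "-/+ buffers/cache" row reflects). As a rough sketch, the usable figure can be worked out from the Mem: row. The function name usable_mb is made up, and the column positions assume the older free layout shown in this post.

```shell
#!/bin/sh
# usable_mb: read `free -m` style output on stdin and print
# free + buffers + cached from the "Mem:" row.
# Assumed columns: Mem: total used free shared buffers cached
usable_mb() {
    awk '$1 == "Mem:" { print $4 + $6 + $7 }'
}

# The Mem: rows from the 512MB and 1GB runs above.
printf '%s\n' 'Mem: 504 486 18 0 4 427' | usable_mb
printf '%s\n' 'Mem: 1012 986 26 0 7 900' | usable_mb
```

For the 1GB run this gives 933, within rounding of the 934 shown in the "-/+ buffers/cache" row, so most of the "used" memory is really just cache.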

*Note that if you specify -j with no number, make places no limit on the number of threads, and in my experience this basically causes gcc to crash/fail. Perhaps with more memory this wouldn't have happened. I'm not sure exactly what caused it, other than my system being unable to handle unlimited threads.

You'll get errors like this if specifying unlimited threads:


CC kernel/time/timekeeping.o
gcc: gcc: Internal error: Killed (program cc1)
Please submit a full bug report.
See for instructions.
For Debian GNU/Linux specific bug reporting instructions, see
.

Internal error: Killed (program cc1)
Please submit a full bug report.
See for instructions.
For Debian GNU/Linux specific bug reporting instructions, see
.

make[2]: make[1]: *** [arch/x86/kernel/setup.o] Killed
*** [fs/file_table.o] Killed
gcc: gcc: make[2]: *** [arch/x86/kernel/x86_init.o] Killed
make[1]: *** [fs/super.o] Killed
gcc: gcc: Internal error: Killed (program cc1)
Please submit a full bug report.
See for instructions.
For Debian GNU/Linux specific bug reporting instructions, see
.
Internal error: Killed (program cc1)
Please submit a full bug report.
See for instructions.
For Debian GNU/Linux specific bug reporting instructions, see
.
Internal error: Killed (program cc1)
Please submit a full bug report.
See for instructions.
For Debian GNU/Linux specific bug reporting instructions, see
.
Internal error: Killed (program cc1)
Please submit a full bug report.
See for instructions.
For Debian GNU/Linux specific bug reporting instructions, see
.
make[2]:
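
One way to avoid hitting this by accident is a small wrapper that refuses to run make unless it is given an explicit number. This is only a sketch: the names is_job_count and build are made up, and also passing the same number to make's -l (load average limit) option is my own choice here, not something tested in this post.

```shell
#!/bin/sh
# is_job_count VALUE: succeed only if VALUE is a positive integer.
is_job_count() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -gt 0 ]
}

# build JOBS: run make with an explicit job count, or refuse.
build() {
    if ! is_job_count "$1"; then
        echo "refusing to run make -j without a job count" >&2
        return 1
    fi
    # -l stops make from starting new jobs while the load average is
    # already at or above the given value (an optional extra safeguard).
    make -j "$1" -l "$1"
}
```

Running build 4 inside the kernel tree would then be the same as make -j 4 -l 4, while build with no argument stops before make is ever started.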

Conclusion/What I Learned

The single most important factor for faster compiling is RAM, and 1GB+ is preferable. The first wall I hit was high iowait due to insufficient RAM and swapping. With more RAM the iowait virtually disappears and you can see the CPUs getting loaded to 80-90%.

Basically, using make with more threads decreases the compile time dramatically (roughly half with 4 threads in my tests), but only so long as you have enough RAM to support those threads. The next bottleneck then becomes CPU processing power, and the key at that point is to add more cores. Having a quad-core or even 6-core CPU with lots of RAM would give you the best performance and faster compiling. I feel disk speed has little impact when compiling, so an SSD wouldn't make much of a difference.
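
The "only as many threads as RAM can support" idea can be sketched as a small helper. Everything here is an assumption for illustration: the function name clamp_jobs is made up, and the 64MB-per-job budget is a guess, not a measured value. The right budget depends on the kernel config and compiler.

```shell
#!/bin/sh
# clamp_jobs WANTED MEM_MB PER_JOB_MB
# Print WANTED, reduced if needed so that WANTED * PER_JOB_MB fits in
# MEM_MB, and never below 1. PER_JOB_MB is an assumed per-job budget.
clamp_jobs() {
    wanted=$1
    mem_mb=$2
    per_job_mb=$3
    by_ram=$(( mem_mb / per_job_mb ))
    jobs=$wanted
    if [ "$by_ram" -lt "$jobs" ]; then jobs=$by_ram; fi
    if [ "$jobs" -lt 1 ]; then jobs=1; fi
    echo "$jobs"
}

# Show (without running) the make command for this machine, asking for
# 8 jobs with the hypothetical 64MB-per-job budget.
if [ -r /proc/meminfo ]; then
    mem_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
    echo "make -j $(clamp_jobs 8 "$mem_mb" 64)"
fi
```

With that budget, 128MB of RAM would cap the build at 2 threads, while 512MB or more would allow all 8.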

