Additional Tweaks
BBR (TCP Congestion Control Algorithm)
Google developed a TCP congestion control algorithm (CCA) called TCP Bottleneck Bandwidth and Round-trip propagation time (BBR) that overcomes many of the issues found in both Reno and CUBIC (the default CCAs). BBR not only achieves significant bandwidth improvements, but also lower latency, and it is built for the congestion of the modern internet. TCP BBR is already employed on Google's servers, and you can apply it to your VPS so long as your Linux machine is running kernel 4.9 or newer.
Further Reading:
https://medium.com/google-cloud/tcp-bbr-magic-dust-for-network-performance-57a5f1ccf437
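The kernel requirement can be checked before going any further. The following is only a minimal sketch: the function name kernel_supports_bbr is made up for illustration, and it compares nothing more than the major and minor numbers reported by uname -r.

```shell
# Succeed if the given kernel release (default: the running kernel)
# is version 4.9 or newer. The function name is hypothetical.
kernel_supports_bbr() (
    release=${1:-$(uname -r)}
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of that text
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "${minor:-0}" -ge 9 ]; }
)

if kernel_supports_bbr; then
    echo "kernel $(uname -r) is new enough for BBR"
else
    echo "kernel $(uname -r) is older than 4.9"
fi
```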
To check which CCA is currently being used, run:
sudo sysctl net.ipv4.tcp_congestion_control
It will probably display:
net.ipv4.tcp_congestion_control = cubic
To switch over to BBR, open the sysctl config file:
sudo nano /etc/sysctl.conf
and add the following two lines at the bottom of the file:
net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr
and reload sysctl with:
sudo sysctl -p
Now check again with:
sudo sysctl net.ipv4.tcp_congestion_control
and you should see confirmation:
net.ipv4.tcp_congestion_control = bbr
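A fuller check looks at three values: the active CCA, the list of available CCAs, and the default qdisc. The sketch below reads them from /proc/sys. The function name bbr_status and its messages are my own, and the optional file arguments exist only so that it can be tried against sample files.

```shell
# Print "ok: bbr with fq" when BBR is available and active and fq is the
# default qdisc. Arguments default to the live /proc/sys entries.
bbr_status() {
    cc=$(cat "${1:-/proc/sys/net/ipv4/tcp_congestion_control}")
    avail=$(cat "${2:-/proc/sys/net/ipv4/tcp_available_congestion_control}")
    qdisc=$(cat "${3:-/proc/sys/net/core/default_qdisc}")
    case " $avail " in
        *" bbr "*) ;;
        *) echo "bbr is not available; the tcp_bbr module may need loading"
           return 1 ;;
    esac
    if [ "$cc" = bbr ] && [ "$qdisc" = fq ]; then
        echo "ok: bbr with fq"
    else
        echo "not active: congestion control=$cc, qdisc=$qdisc"
        return 1
    fi
}
```

Run bbr_status with no arguments on the server itself. If it reports that bbr is not available, loading the module with sudo modprobe tcp_bbr and reloading sysctl may help, depending on how your kernel was built.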
Adjust Open Files Limit
Linux provides ways of limiting the resources that each user and process may consume. The two limits covered here are the maximum number of processes per user (nproc) and the maximum number of open files (nofile). The default open files limit is 1024, which is very low, especially for a busy web server that may be hosting multiple sites driven by SQL databases.
The current limit can be checked with the command:
ulimit -n
To increase this, we need to make changes in three files. Firstly...
sudo nano /etc/sysctl.conf
and add the following line:
fs.file-max = 65535
Then edit the limits.conf file:
sudo nano /etc/security/limits.conf
And add the following:
* soft nproc 65535
* hard nproc 65535
* soft nofile 65535
* hard nofile 65535
root soft nproc 65535
root hard nproc 65535
root soft nofile 65535
root hard nofile 65535
Limits can be hard or soft. Hard limits are set by the root user. Only the root user can increase hard limits, though other users can decrease them. Soft limits can be set and changed by other users, but they cannot exceed the hard limits.
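The difference between the two kinds of limit is easy to see from an unprivileged shell. This is a small sketch with made-up function names: the first prints both limits, the second lowers the soft limit inside a subshell so the parent shell is not affected.

```shell
# Print the current soft and hard open-file limits for this shell.
show_nofile_limits() {
    printf 'soft=%s hard=%s\n' "$(ulimit -Sn)" "$(ulimit -Hn)"
}

# Lower the soft limit to $1 inside a subshell and print the result.
# The limits of the calling shell are left untouched.
lowered_soft_limit() (
    ulimit -Sn "$1" && ulimit -Sn
)

show_nofile_limits
```

Calling lowered_soft_limit 256 should print 256, since any user may move the soft limit around below the hard limit. Asking for a value above the hard limit as a non-root user fails with an error instead.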
And finally we edit the pam.d config:
sudo nano /etc/pam.d/common-session
Add the following line:
session required pam_limits.so
We then apply the changes with:
sudo sysctl -p
Reboot, then after logging back in, check the new limit with:
ulimit -n
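Note that ulimit -n only reports the limit of the current login shell. Services started by systemd take their limit from the unit's LimitNOFILE setting rather than from limits.conf, so it can be worth reading the limit of a running process straight from /proc. A minimal sketch, with a function name of my own choosing:

```shell
# Print the soft and hard "Max open files" values from a limits file.
# Defaults to the limits of the process that reads the file.
max_open_files() {
    awk '/^Max open files/ { print "soft=" $4 " hard=" $5 }' \
        "${1:-/proc/self/limits}"
}
```

For example, max_open_files /proc/1234/limits shows the values for process ID 1234 (a placeholder; substitute the PID of your web server or database).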
Raise Minimum Amount of Entropy
Explained simply by user Janne Pikkarainen on Serverfault.com:
"Your system gathers some "real" random numbers by keeping an eye about different events: network activity, hardware random number generator (if available; for example VIA processors usually has a "real" random number generator), and so on. If feeds those to kernel entropy pool, which is used by /dev/random. Applications which need some extreme security tend to use /dev/random as their entropy source, or in other words, the randomness source. If /dev/random runs out of available entropy, it's unable to serve out more randomness and the application waiting for the randomness stalls until more random stuff is available."
https://serverfault.com/questions/172337/explain-in-plain-english-about-entropy-available
Virtual machines have issues with the above sources of random input because they lack keyboard and mouse interrupts and may have some or all of the hardware virtualized, making it more predictable. This is especially problematic if you are hosting a web app with TLS everywhere and it doesn't receive much traffic that would help the machine generate additional entropy.
Further Reading:
https://alcidanalytics.com/p/entropy-on-cloud-vps
To raise the minimum amount of entropy, we can install a service called Haveged.
The haveged project is an attempt to provide an easy-to-use, unpredictable random number generator based upon an adaptation of the HAVEGE algorithm. Haveged was created to remedy low-entropy conditions in the Linux random device that can occur under some workloads, especially on headless servers.
http://www.issihosts.com/haveged/
To install:
sudo apt update && sudo apt install haveged apparmor-utils
And edit the config:
sudo nano /etc/default/haveged
Make sure the line reads:
DAEMON_ARGS="-w 1024"
And prevent AppArmor from stopping the service from starting:
sudo aa-complain /usr/sbin/haveged
Start the service and configure to start on boot:
sudo service haveged start
sudo update-rc.d haveged defaults
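To see whether the service is making a difference, compare the kernel's entropy estimate with the watermark configured above. This is only a sketch with a hypothetical function name. As far as I know, recent kernels (5.18 and later) cap the reported figure at 256, so the comparison is only meaningful on older kernels.

```shell
# Compare the kernel's entropy estimate with a watermark.
# $1: watermark in bits (default 1024, matching the -w value above)
# $2: file to read (default: the live kernel value)
entropy_check() {
    avail=$(cat "${2:-/proc/sys/kernel/random/entropy_avail}")
    if [ "$avail" -ge "${1:-1024}" ]; then
        echo "ok: $avail bits available"
    else
        echo "low: $avail bits available"
        return 1
    fi
}
```

Running entropy_check before installing haveged and again once the service has started gives a rough before-and-after comparison.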
Additional Kernel Tweaks
Here are a few more tweaks that have helped, especially with the softIRQ budget warnings I was constantly being notified of by my monitoring system. I wouldn't advise just copying and pasting the whole lot into your config. Take the time to read the following pages first:
http://www.nateware.com/linux-network-tuning-for-2013.html
https://blog.cloudflare.com/the-story-of-one-latency-spike/
https://wiki.archlinux.org/index.php/sysctl
http://www.cyberciti.biz/faq/linux-kernel-etcsysctl-conf-security-hardening/
https://gist.github.com/gburd/6f555a7b1197260d8472e24a35b7752c
Edit the sysctl config file:
sudo nano /etc/sysctl.conf
Add the lines below if necessary (please do your research first!). I have included the lines we added earlier for the BBR Congestion Control Algorithm, and for increasing the open files limit.
#Reboot the machine soon after a kernel panic
kernel.panic=10
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename
# Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1
# Protects against creating or following links under certain conditions
fs.protected_hardlinks=1
fs.protected_symlinks=1
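#Enable ExecShield protection
#Set value to 1 or 2 (recommended)
#ExecShield is a Red Hat kernel patch; on kernels without it this key does not
#exist and sysctl -p reports an error, so only uncomment it if your kernel has it
#kernel.exec-shield = 2
#Randomize the address space layout (ASLR)
kernel.randomize_va_space=2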
# increase system file descriptor limit
fs.file-max = 65535
#Allow for more PIDs
kernel.pid_max = 65536
#Disable zone reclaim
vm.zone_reclaim_mode = 0
#Reduce swap usage
vm.swappiness = 10
###############################################
########## IPv4 networking start ##############
###############################################
# Only routers need to send ICMP redirects. This is just a server,
# so no routing is allowed
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
# Accept packets with SRR option? No
net.ipv4.conf.all.accept_source_route = 0
# Accept Redirects? No, this is not router
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 1
#Ignore bad ICMP errors
net.ipv4.icmp_ignore_bogus_error_responses=1
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# TCP window scaling (RFC 1323) allows receive windows larger than 64 KB,
# which is needed to make full use of fast or high-latency links.
net.ipv4.tcp_window_scaling = 1
# If enabled, assume that no receipt of a window-scaling option means that the
# remote TCP is broken and treats the window as a signed quantity. If
# disabled, assume that the remote TCP is not broken even if we do not receive
# a window scaling option from it.
net.ipv4.tcp_workaround_signed_windows = 1
# TCP SACK (selective acknowledgement) is defined in RFC 2018. FACK (forward
# acknowledgement) builds on SACK to improve loss recovery. These are meant to
# get you your data without excessive losses. Newer kernels document tcp_fack
# as a legacy option that no longer has any effect.
net.ipv4.tcp_sack = 1
net.ipv4.tcp_fack = 1
# The latency setting is 1 if you prefer more packets vs bandwidth, or 0 if you
# prefer bandwidth. More packets are ideal for things like Remote Desktop and
# VOIP: less for bulk downloading.
#net.ipv4.tcp_low_latency = 0
# I found RFC 2923, which is a good review of PMTU. IPv6 uses PMTU by default
# to avoid segmenting packets at the router level, but it's optional for
# IPv4. PMTU is meant to inform routers of the best packet sizes to use between
# links, but it's a common admin practice to block ICMP (to stop pinging),
# thus breaking this mechanism. Linux tries to use it, and so do I: if
# you have problems, you have a problem router, and can change the "no" setting
# to 1. "MTU probing" is also a part of this: 1 means try, and 0 means don't.
#net.ipv4.ip_no_pmtu_disc = 0
#net.ipv4.tcp_mtu_probing = 1
# FRTO is a mechanism in newer Linux kernels to optimize for wireless hosts:
# use it if you have them; delete the setting, or set to 0, if you don't.
#net.ipv4.tcp_frto = 2
#net.ipv4.tcp_frto_response = 2
# Log packets with impossible addresses to kernel log? yes
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
# Ignore all ICMP ECHO and TIMESTAMP requests sent to it via broadcast/multicast
net.ipv4.icmp_echo_ignore_broadcasts = 1
#Increase system IP port limits
net.ipv4.ip_local_port_range = 15000 65000
# Disable TCP slow start on idle connections
net.ipv4.tcp_slow_start_after_idle = 0
# Enable TCP/IP SYN cookies, see http://lwn.net/Articles/277146/
# Note: This may impact IPv6 TCP sessions too.
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 5
# Enable source validation by reversed path, as specified in RFC 1812. This
# turns on Source Address Verification on all interfaces to prevent some
# spoofing attacks.
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
# RFC 1337, TIME-WAIT Assassination Hazards in TCP, a fix written in 1992
# for some theoretically-possible failure modes for TCP connections. To this
# day this RFC still has people confused if it negatively impacts performance
# or not or is supported by any decent router. Murphy's Law is that the only
# router that it would even have trouble with, is most likely your own.
net.ipv4.tcp_rfc1337 = 1
###############################################
########## IPv6 networking start ##############
###############################################
# Controls packet forwarding for IPv6. Setting this to 1 enables forwarding
# and disables Stateless Address Autoconfiguration based on Router
# Advertisements for this host. Leave it at 0 on a server that is not a router.
#net.ipv6.conf.all.forwarding = 0
# Number of Router Solicitations to send until assuming no routers are present.
# This is host and not router
net.ipv6.conf.default.router_solicitations = 0
# Accept packets with SRR option? No
net.ipv6.conf.all.accept_source_route = 0
# Accept Router Preference in RA?
net.ipv6.conf.default.accept_ra_rtr_pref = 0
# Learn Prefix Information in Router Advertisement
net.ipv6.conf.default.accept_ra_pinfo = 0
# Learn the default router from Router Advertisements?
net.ipv6.conf.default.accept_ra_defrtr = 0
#router advertisements can cause the system to assign a global unicast address to an interface
net.ipv6.conf.default.autoconf = 0
#how many neighbor solicitations to send out per address?
net.ipv6.conf.default.dad_transmits = 0
# How many global unicast IPv6 addresses can be assigned to each interface?
net.ipv6.conf.default.max_addresses = 1
# Do not accept ICMP redirects (prevent MITM attacks)
net.ipv6.conf.default.accept_redirects = 0
net.ipv6.conf.all.accept_redirects = 0
# The kernel has no secure_redirects key for IPv6, so this line stays disabled
#net.ipv6.conf.all.secure_redirects = 1
############################################
##### TCP Tuning ###########################
############################################
# Increase Linux autotuning TCP buffer limits
# Set max to 16MB for 1GE and 32M (33554432) or 54M (56623104) for 10GE
# Don't set tcp_mem itself! Let the kernel scale it based on RAM.
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.optmem_max = 40960
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Make room for more TIME_WAIT sockets due to more clients,
# and allow them to be reused if we run out of sockets
# Also increase the max packet backlog
net.ipv4.tcp_max_syn_backlog = 30000
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 10
net.core.netdev_max_backlog = 60000
net.core.netdev_budget = 60000
net.core.netdev_budget_usecs = 6000
# If your servers talk UDP, also up these limits
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192
# Change Congestion Control Algorithm to BBR
net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr
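Not every key in a list like this exists on every kernel version, and sysctl -p reports an error for each one it cannot find. Before applying the file, it may be worth listing the keys that have no matching entry under /proc/sys. The sketch below is one way to do that. The function name is made up, and it assumes plain key = value lines (it does not handle interface names that contain dots).

```shell
# Print every key in a sysctl config file that has no matching entry
# under /proc/sys on the running kernel.
# $1: config file (default /etc/sysctl.conf)
# $2: sysctl root directory (default /proc/sys)
missing_sysctl_keys() {
    conf=${1:-/etc/sysctl.conf}
    root=${2:-/proc/sys}
    # Drop comment lines and lines without "=", keep the text before "=",
    # and strip whitespace so only the key remains.
    sed -e '/^[[:space:]]*[#;]/d' -e '/=/!d' -e 's/=.*//' \
        -e 's/[[:space:]]//g' "$conf" |
    while read -r key; do
        # net.ipv4.tcp_sack maps to $root/net/ipv4/tcp_sack
        path=$root/$(printf '%s' "$key" | tr . /)
        [ -e "$path" ] || echo "$key"
    done
}
```

Running missing_sysctl_keys with no arguments checks /etc/sysctl.conf against the live kernel. Any key it prints is a candidate for commenting out.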
When finished, apply the new configuration. A reboot is recommended:
sudo sysctl -p
sudo reboot