linux / backbone-sources

To clone this repository:

git clone git@git.backbone.ws:linux/backbone-sources.git

To push to this repository:

# Add a new remote
git remote add origin git@git.backbone.ws:linux/backbone-sources.git

# Push the master branch to the newly added origin, and configure
# this remote and branch as the default:
git push -u origin master

# From now on you can push master to the "origin" remote with:
git push

Diffs from 569dbb8 to 3738ba1

Commits

Jan Alexander Steffens (heftig) ZEN: Implement zen-tune v4.12 3f98cea 3 months ago
Jan Alexander Steffens (heftig) ZEN: Allow setting the number of available virtual TTYs b7c1ec5 3 months ago
Jan Alexander Steffens (heftig) Add BFQ-v8r12 122fc4c 3 months ago
Jan Alexander Steffens (heftig) Add extra checks related to entity scheduling 3a5fc91 3 months ago
Jan Alexander Steffens (heftig) block, bfq: reset in_service_entity if it becomes idle 2ea20ae 3 months ago
Jan Alexander Steffens (heftig) block, bfq: consider also in_service_entity to state whether an entity is active dfae7a4 3 months ago
Jan Alexander Steffens (heftig) block, bfq: improve and refactor throughput-boosting logic af82bd2 3 months ago
Jan Alexander Steffens (heftig) FIRST BFQ-MQ COMMIT: Copy bfq-sq-iosched.c as bfq-mq-iosched.c b0cf4f5 3 months ago
Jan Alexander Steffens (heftig) Add config and build bits for bfq-mq-iosched f4d33e1 3 months ago
Jan Alexander Steffens (heftig) Increase max policies for io controller 214ed83 3 months ago
Jan Alexander Steffens (heftig) Copy header file bfq.h as bfq-mq.h cb2cdec 3 months ago
Jan Alexander Steffens (heftig) Move thinktime from bic to bfqq 6999458 3 months ago
Jan Alexander Steffens (heftig) Embed bfq-ioc.c and add locking on request queue 8cbd577 3 months ago
Jan Alexander Steffens (heftig) Modify interface and operation to comply with blk-mq-sched 6268320 3 months ago
Jan Alexander Steffens (heftig) Add checks and extra log messages - Part I 6368610 3 months ago
Jan Alexander Steffens (heftig) Add lock check in bfq_allow_bio_merge e17ec0b 3 months ago
Jan Alexander Steffens (heftig) bfq-mq: execute exit_icq operations immediately c62353b 3 months ago
Jan Alexander Steffens (heftig) Unnest request-queue and ioc locks from scheduler locks faeb604 3 months ago
Jan Alexander Steffens (heftig) Add checks and extra log messages - Part II bfcb95b 3 months ago
Jan Alexander Steffens (heftig) Fix unbalanced increment of rq_in_driver c5991f4 3 months ago
Jan Alexander Steffens (heftig) Add checks and extra log messages - Part III 9cb6c59 3 months ago
Jan Alexander Steffens (heftig) TESTING: Check wrong invocation of merge and put_rq_priv functions 32d966f 3 months ago
Jan Alexander Steffens (heftig) Complete support for cgroups 08435d8 3 months ago
Jan Alexander Steffens (heftig) Remove all get and put of I/O contexts 457d4ca 3 months ago
Jan Alexander Steffens (heftig) BUGFIX: Remove unneeded and deadlock-causing lock in request_merged d4588ed 3 months ago
Jan Alexander Steffens (heftig) Fix wrong unlikely 710cf25 3 months ago
Jan Alexander Steffens (heftig) Change cgroup params prefix to bfq-mq for bfq-mq f1fb545 3 months ago
Jan Alexander Steffens (heftig) Add tentative extra tests on groups, reqs and queues 5a2eca3 3 months ago
Jan Alexander Steffens (heftig) block, bfq-mq: access and cache blkg data only when safe 3e35a27 3 months ago
Jan Alexander Steffens (heftig) bfq-mq: fix macro name in conditional invocation of policy_unregister b6b17de 3 months ago
Jan Alexander Steffens (heftig) Port of "blk-mq-sched: unify request finished methods" e4c990a 3 months ago
Jan Alexander Steffens (heftig) Port of "bfq-iosched: fix NULL ioc check in bfq_get_rq_private" a09643e 3 months ago
Jan Alexander Steffens (heftig) Port of "blk-mq-sched: unify request prepare methods" 86e70b2 3 months ago
Jan Alexander Steffens (heftig) Add list of bfq instances to documentation d0b8b97 3 months ago
Jan Alexander Steffens (heftig) bfq-sq: fix prefix of names of cgroups parameters 70f8b42 3 months ago
Jan Alexander Steffens (heftig) Add to documentation that bfq-mq and bfq-sq contain last fixes too 279fb53 3 months ago
Jan Alexander Steffens (heftig) Improve most frequently used no-logging path b75012a 3 months ago
Jan Alexander Steffens (heftig) bfq-sq: fix commit "Remove all get and put of I/O contexts" in branch bfq-mq 94c64be 3 months ago
Jan Alexander Steffens (heftig) bfq-sq-mq: make lookup_next_entity push up vtime on expirations f633145 3 months ago
Jan Alexander Steffens (heftig) bfq-sq-mq: remove direct switch to an entity in higher class b258361 3 months ago
Jan Alexander Steffens (heftig) bfq-sq-mq: guarantee update_next_in_service always returns an eligible entity 9472a39 3 months ago
Jan Alexander Steffens (heftig) doc, block, bfq: fix some typos and stale sentences e423a1b 3 months ago
Jan Alexander Steffens (heftig) doc, block, bfq: better describe how to properly configure bfq ae1584b 3 months ago
Jan Alexander Steffens (heftig) ZEN: Add ZEN branding 5cb4e6b 3 months ago
Jan Alexander Steffens (heftig) ZEN: Add a choice of boot logos 41037ae 3 months ago
Jan Alexander Steffens (heftig) ZEN: Add Thinkpad SMAPI driver 8655b46 3 months ago
Jan Alexander Steffens (heftig) ZEN: Add Ubuntu ureadahead support f32cb36 3 months ago
Jan Alexander Steffens (heftig) ZEN: Add VHBA driver 0c2d7b5 3 months ago
Jan Alexander Steffens (heftig) bfq: Remove added EXTRAVERSION dcd7583 3 months ago
Jan Alexander Steffens (heftig) ZEN: synaptics: Add support for some old Dell clickpads 362337f 3 months ago
Jan Alexander Steffens (heftig) ZEN: Add exFAT support 268b35e 3 months ago
Jan Alexander Steffens (heftig) ZEN: adbhid: Support absolute mode in adb-base trackpads 5a8ad95 3 months ago
Jan Alexander Steffens (heftig) ZEN: Enable additional CPU Optimizations for GCC v4.9+ / Kernel v4.13+ 2caf80b 3 months ago
Jan Alexander Steffens (heftig) ZEN: Add a CONFIG option that sets -O3 fcb7463 3 months ago
Jan Alexander Steffens (heftig) ZEN: Allow TCP YeAH as default congestion control 5a4b1dd 3 months ago
Jan Alexander Steffens (heftig) Merge remote-tracking branch 'github/4.13/bfq' into HEAD 14c4898 3 months ago
Jan Alexander Steffens (heftig) Merge remote-tracking branch 'github/4.13/misc' into HEAD 2602fb3 3 months ago
Jan Alexander Steffens (heftig) Merge remote-tracking branch 'github/4.13/zen-tune' into HEAD e8987ea 3 months ago
Kolan Sh EXTRAVERSION updated. 36f1cdd 3 months ago
Greg Kroah-Hartman usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard 970974a 3 months ago
Greg Kroah-Hartman USB: serial: option: add support for D-Link DWM-157 C1 520369b 3 months ago
Greg Kroah-Hartman usb: Add device quirk for Logitech HD Pro Webcam C920-C f6f8eb1 3 months ago
Greg Kroah-Hartman usb:xhci:Fix regression when ATI chipsets detected 02fa872 3 months ago
Greg Kroah-Hartman USB: musb: fix external abort on suspend 26be105 3 months ago
Greg Kroah-Hartman ANDROID: binder: add padding to binder_fd_array_object. df33897 3 months ago
Greg Kroah-Hartman ANDROID: binder: add hwbinder,vndbinder to BINDER_DEVICES. 5da7c0c 3 months ago
Greg Kroah-Hartman USB: core: Avoid race of async_completed() w/ usbdev_release() 9bf1256 3 months ago
Greg Kroah-Hartman staging/rts5208: fix incorrect shift to extract upper nybble a55273d 3 months ago
Greg Kroah-Hartman staging: ccree: save ciphertext for CTS IV 2d94a1e 3 months ago
Greg Kroah-Hartman staging: fsl-dpaa2/eth: fix off-by-one FD ctrl bitmaks 34c874a 3 months ago
Greg Kroah-Hartman iio: adc: ti-ads1015: fix incorrect data rate setting update cc06f5a 3 months ago
Greg Kroah-Hartman iio: adc: ti-ads1015: fix scale information for ADS1115 dac6ce3 3 months ago
Greg Kroah-Hartman iio: adc: ti-ads1015: enable conversion when CONFIG_PM is not set 86b6c05 3 months ago
Greg Kroah-Hartman iio: adc: ti-ads1015: avoid getting stale result after runtime resume 1c68b99 3 months ago
Greg Kroah-Hartman iio: adc: ti-ads1015: don't return invalid value from buffer setup callbacks d9320af 3 months ago
Greg Kroah-Hartman iio: adc: ti-ads1015: add adequate wait time to get correct conversion 51a39e2 3 months ago
Greg Kroah-Hartman driver core: bus: Fix a potential double free 5beb744 3 months ago
Greg Kroah-Hartman HID: wacom: Do not completely map WACOM_HID_WD_TOUCHRINGSTATUS usage 8d89833 3 months ago
Greg Kroah-Hartman binder: free memory on error 5f9463e 3 months ago
Greg Kroah-Hartman crypto: caam/qi - fix compilation with CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y 224aec7 3 months ago
Greg Kroah-Hartman crypto: caam/qi - fix compilation with DEBUG enabled 9808d1a 3 months ago
Greg Kroah-Hartman thunderbolt: Fix reset response_type f89830d 3 months ago
Greg Kroah-Hartman fpga: altera-hps2fpga: fix multiple init of l3_remap_lock 328082e 3 months ago
Greg Kroah-Hartman intel_th: pci: Add Cannon Lake PCH-H support c388e61 3 months ago
Greg Kroah-Hartman intel_th: pci: Add Cannon Lake PCH-LP support e35d21f 3 months ago
Greg Kroah-Hartman ath10k: fix memory leak in rx ring buffer allocation 2e78447 3 months ago
Greg Kroah-Hartman drm/vgem: Pin our pages for dmabuf exports 419a7f1 3 months ago
Greg Kroah-Hartman drm/ttm: Fix accounting error when fail to get pages for pool 410ef18 3 months ago
Greg Kroah-Hartman drm/dp/mst: Handle errors from drm_atomic_get_private_obj_state() correctly 00d0e93 3 months ago
Greg Kroah-Hartman rtlwifi: rtl_pci_probe: Fix fail path of _rtl_pci_find_adapter 3d9dc09 3 months ago
Greg Kroah-Hartman Bluetooth: Add support of 13d3:3494 RTL8723BE device 28c300f 3 months ago
Greg Kroah-Hartman iwlwifi: pci: add new PCI ID for 7265D 6977721 3 months ago
Greg Kroah-Hartman dlm: avoid double-free on error path in dlm_device_{register,unregister} 6b42a3c 3 months ago
Greg Kroah-Hartman mwifiex: correct channel stat buffer overflows 24bb35f 3 months ago
Greg Kroah-Hartman MCB: add support for SC31 to mcb-lpc e9b8f63 3 months ago
Greg Kroah-Hartman s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangs e044181 3 months ago
Greg Kroah-Hartman drm/nouveau/pci/msi: disable MSI on big-endian platforms by default a603495 3 months ago
Greg Kroah-Hartman drm/nouveau: Fix error handling in nv50_disp_atomic_commit 209db16 3 months ago
Greg Kroah-Hartman workqueue: Fix flag collision 3060960 3 months ago
Greg Kroah-Hartman ahci: don't use MSI for devices with the silly Intel NVMe remapping scheme c67efb0 3 months ago
Greg Kroah-Hartman cs5536: add support for IDE controller variant ba25748 3 months ago
Greg Kroah-Hartman scsi: sg: protect against races between mmap() and SG_SET_RESERVED_SIZE 200afc4 3 months ago
Greg Kroah-Hartman scsi: sg: recheck MMAP_IO request length with lock held 9cbbaf1 3 months ago
Greg Kroah-Hartman of/device: Prevent buffer overflow in of_device_modalias() d29e6c2 3 months ago
Greg Kroah-Hartman rtlwifi: Fix memory leak when firmware request fails cf37a1b 3 months ago
Greg Kroah-Hartman rtlwifi: Fix fallback firmware loading 4870e16 3 months ago
Greg Kroah-Hartman Linux 4.13.1 94cd0e9 3 months ago
Kolan Sh Linux 4.13 merged 3738ba1 3 months ago

Summary

  • Documentation/block/bfq-iosched.txt (122) -----------------------------------------+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • Documentation/tp_smapi.txt (267) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • Makefile (8) --++++++
  • arch/s390/include/asm/pgtable.h (2) -+
  • arch/s390/mm/gmap.c (39) -------++++++++++++++++++++++++++++++++
  • arch/x86/Kconfig.cpu (224) -------------------------------+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • arch/x86/Makefile (33) ---++++++++++++++++++++++++++++++
  • arch/x86/Makefile_32.cpu (23) --+++++++++++++++++++++
  • arch/x86/include/asm/module.h (38) ++++++++++++++++++++++++++++++++++++++
  • block/Kconfig.iosched (50) ++++++++++++++++++++++++++++++++++++++++++++++++++
  • block/Makefile (2) ++
  • block/bfq-cgroup-included.c
  • block/bfq-ioc.c (36) ++++++++++++++++++++++++++++++++++++
  • block/bfq-mq-iosched.c
  • block/bfq-mq.h
  • block/bfq-sched.c
  • block/bfq-sq-iosched.c
  • block/bfq.h
  • block/elevator.c (4) ++++
  • drivers/android/Kconfig (2) -+
  • drivers/android/binder.c (8) --++++++
  • drivers/ata/ahci.c (9) -++++++++
  • drivers/ata/pata_amd.c (1) +
  • drivers/ata/pata_cs5536.c (1) +
  • drivers/base/bus.c (2) -+
  • drivers/bluetooth/btusb.c (1) +
  • drivers/cpufreq/cpufreq_ondemand.c (12) --++++++++++
  • drivers/crypto/caam/caamalg.c (66) ---------------------------------------------------+++++++++++++++
  • drivers/crypto/caam/caamalg_qi.c (6) ---+++
  • drivers/crypto/caam/error.c (40) ++++++++++++++++++++++++++++++++++++++++
  • drivers/crypto/caam/error.h (4) ++++
  • drivers/crypto/caam/qi.c (2) -+
  • drivers/fpga/altera-hps2fpga.c (4) ---+
  • drivers/gpu/drm/drm_dp_mst_topology.c (8) ----++++
  • drivers/gpu/drm/nouveau/nv50_display.c (7) --+++++
  • drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c (4) ++++
  • drivers/gpu/drm/ttm/ttm_page_alloc.c (2) -+
  • drivers/gpu/drm/vgem/vgem_drv.c (81) ---------------------++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • drivers/gpu/drm/vgem/vgem_drv.h (4) ++++
  • drivers/hid/wacom_wac.c (8) -+++++++
  • drivers/hwtracing/intel_th/pci.c (10) ++++++++++
  • drivers/iio/adc/ti-ads1015.c (123) ---------------------------------------------------++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • drivers/input/mouse/synaptics.c (4) -+++
  • drivers/input/mouse/synaptics.h (1) +
  • drivers/macintosh/Kconfig (7) +++++++
  • drivers/macintosh/adbhid.c (83) --------+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • drivers/mcb/mcb-lpc.c (15) +++++++++++++++
  • drivers/net/wireless/ath/ath10k/core.c (12) ------++++++
  • drivers/net/wireless/intel/iwlwifi/pcie/drv.c (1) +
  • drivers/net/wireless/marvell/mwifiex/cfg80211.c (2) -+
  • drivers/net/wireless/marvell/mwifiex/scan.c (6) ++++++
  • drivers/net/wireless/realtek/rtlwifi/pci.c (4) --++
  • drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c (2) ++
  • drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c (2) ++
  • drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.c (4) ++++
  • drivers/net/wireless/realtek/rtlwifi/rtl8192de/sw.c (2) ++
  • drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c (2) ++
  • drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c (2) ++
  • drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c (2) ++
  • drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c (15) ----------+++++
  • drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.c (19) ----------+++++++++
  • drivers/of/device.c (2) ++
  • drivers/platform/x86/Kconfig (19) +++++++++++++++++++
  • drivers/platform/x86/Makefile (2) ++
  • drivers/platform/x86/hdaps.c (933) ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • drivers/platform/x86/thinkpad_ec.c
  • drivers/platform/x86/tp_smapi.c (1493) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • drivers/scsi/Kconfig (2) ++
  • drivers/scsi/Makefile (1) +
  • drivers/scsi/sg.c (19) -----++++++++++++++
  • drivers/scsi/vhba/Kconfig (9) +++++++++
  • drivers/scsi/vhba/Makefile (4) ++++
  • drivers/scsi/vhba/vhba.c
  • drivers/staging/ccree/ssi_cipher.c (40) ----++++++++++++++++++++++++++++++++++++
  • drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h (4) --++
  • drivers/staging/rts5208/rtsx_scsi.c (2) -+
  • drivers/thunderbolt/ctl.c (2) -+
  • drivers/tty/Kconfig (13) +++++++++++++
  • drivers/usb/core/devio.c (4) --++
  • drivers/usb/core/quirks.c (6) -+++++
  • drivers/usb/host/pci-quirks.c (35) -----------------++++++++++++++++++
  • drivers/usb/musb/musb_core.c (18) --------++++++++++
  • drivers/usb/serial/option.c (1) +
  • drivers/video/logo/Kconfig (95) --------------+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • drivers/video/logo/Makefile (12) ++++++++++++
  • drivers/video/logo/logo.c (190) -------------------------------------------------------------------+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • drivers/video/logo/logo_arch_clut224.ppm (43204) 
  • drivers/video/logo/logo_bsd_clut224.ppm
  • drivers/video/logo/logo_debian_clut224.ppm
  • drivers/video/logo/logo_exherbo_clut224.ppm (963) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • drivers/video/logo/logo_fbsd_clut224.ppm
  • drivers/video/logo/logo_fedoraglossy_clut224.ppm
  • drivers/video/logo/logo_fedorasimple_clut224.ppm
  • drivers/video/logo/logo_gentoo_clut224.ppm
  • drivers/video/logo/logo_oldzen_clut224.ppm
  • drivers/video/logo/logo_slackware_clut224.ppm (1123) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • drivers/video/logo/logo_tits_clut224.ppm
  • drivers/video/logo/logo_zen_clut224.ppm
  • fs/Kconfig (1) +
  • fs/Makefile (1) +
  • fs/dlm/user.c (4) ++++
  • fs/exec.c (6) -+++++
  • fs/exfat/Kconfig (39) +++++++++++++++++++++++++++++++++++++++
  • fs/exfat/LICENSE (339) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/Makefile (54) ++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/README.md (98) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/dkms.conf (7) +++++++
  • fs/exfat/exfat-km.mk (11) +++++++++++
  • fs/exfat/exfat_api.c (528) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_api.h (206) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_bitmap.c (63) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_bitmap.h (55) +++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_blkdev.c (197) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_blkdev.h (73) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_cache.c (784) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_cache.h (85) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_config.h (69) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_core.c
  • fs/exfat/exfat_core.h
  • fs/exfat/exfat_data.c (77) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_data.h (58) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_nls.c (448) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_nls.h (91) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_oal.c (196) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_oal.h (74) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_super.c
  • fs/exfat/exfat_super.h (171) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_upcase.c (405) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  • fs/exfat/exfat_version.h (19) +++++++++++++++++++
  • fs/open.c (4) ++++
  • include/linux/blkdev.h (10) -+++++++++
  • include/linux/linux_logo.h (12) ++++++++++++
  • include/linux/pci_ids.h (1) +
  • include/linux/thinkpad_ec.h (47) +++++++++++++++++++++++++++++++++++++++++++++++
  • include/linux/workqueue.h (2) -+
  • include/trace/events/fs.h (53) +++++++++++++++++++++++++++++++++++++++++++++++++++++
  • include/uapi/linux/android/binder.h (2) ++
  • include/uapi/linux/vt.h (15) -++++++++++++++
  • init/Kconfig (39) +++++++++++++++++++++++++++++++++++++++
  • kernel/configs/android-base.config (1) +
  • kernel/sched/fair.c (25) +++++++++++++++++++++++++
  • mm/page-writeback.c (8) ++++++++
  • net/ipv4/Kconfig (4) ++++
  • scripts/mkcompile_h (4) --++

Documentation/block/bfq-iosched.txt

 11  11   groups (switching back to time distribution when needed to keep
 12  12   throughput high).
 13  13
     14 + If bfq-mq patches have been applied, then the following three
     15 + instances of BFQ are available (otherwise only the first instance):
     16 + - bfq: mainline version of BFQ, for blk-mq
     17 + - bfq-mq: development version of BFQ for blk-mq; this version contains
     18 + also all latest features and fixes not yet landed in mainline, plus many
     19 + safety checks
     20 + - bfq-sq: BFQ for legacy blk; also this version contains latest features
     21 + and fixes, as well as safety checks
     22 +
 14  23   In its default configuration, BFQ privileges latency over
 15  24   throughput. So, when needed for achieving a lower latency, BFQ builds
 16  25   schedules that may lead to a lower throughput. If your main or only
 17  26   goal, for a given device, is to achieve the maximum-possible
 18  27   throughput at all times, then do switch off all low-latency heuristics
 19     - for that device, by setting low_latency to 0. Full details in Section 3.
     28 + for that device, by setting low_latency to 0. See Section 3 for
     29 + details on how to configure BFQ for the desired tradeoff between
     30 + latency and throughput, or on how to maximize throughput.
 20  31
 21  32   On average CPUs, the current version of BFQ can handle devices
 22  33   performing at most ~30K IOPS; at most ~50 KIOPS on faster CPUs. As a
 23  34   reference, 30-50 KIOPS correspond to very high bandwidths with
 24  35   sequential I/O (e.g., 8-12 GB/s if I/O requests are 256 KB large), and
 25     - to 120-200 MB/s with 4KB random I/O. BFQ has not yet been tested on
 26     - multi-queue devices.
     36 + to 120-200 MB/s with 4KB random I/O. BFQ is currently being tested on
     37 + multi-queue devices too.
 27  38
 28     - The table of contents follow. Impatients can just jump to Section 3.
     39 + The table of contents follows. Impatients can just jump to Section 3.
 29  40
 30  41   CONTENTS
 31  42
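
The bandwidth figures quoted in the hunk above follow from multiplying the IOPS rate by the request size. The short Python sketch below redoes that arithmetic as a sanity check. It assumes decimal units (1 KB = 1000 bytes), which the documentation text does not state, and the function name is hypothetical.

```python
# Convert an IOPS rate and a request size into bandwidth, to check the
# figures quoted in the documentation text.
# Assumption: decimal units (1 KB = 1000 bytes); the text does not say
# which convention it uses.

KB = 1000
MB = 1000 * KB
GB = 1000 * MB


def bandwidth_bytes_per_s(iops, request_bytes):
    """Bytes transferred per second at a given IOPS rate and request size."""
    return iops * request_bytes


for iops in (30_000, 50_000):
    sequential = bandwidth_bytes_per_s(iops, 256 * KB) / GB
    random_io = bandwidth_bytes_per_s(iops, 4 * KB) / MB
    print(f"{iops} IOPS: {sequential:.2f} GB/s at 256 KB, "
          f"{random_io:.0f} MB/s at 4 KB")
```

Under these assumptions the 4 KB case gives exactly 120 and 200 MB/s, and the 256 KB case gives 7.68 and 12.80 GB/s, which the text appears to round to 8-12 GB/s.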
 44  44   1-1 Personal systems
 45  45   1-2 Server systems
 46  46   2. How does BFQ work?
 47     - 3. What are BFQ's tunable?
     47 + 3. What are BFQ's tunables and how to properly configure BFQ?
 48  48   4. BFQ group scheduling
 49  49   4-1 Service guarantees provided
 50  50   4-2 Interface
156 156   contrast, BFQ may idle the device for a short time interval,
157 157   giving the process the chance to go on being served if it issues
158 158   a new request in time. Device idling typically boosts the
159     - throughput on rotational devices, if processes do synchronous
160     - and sequential I/O. In addition, under BFQ, device idling is
161     - also instrumental in guaranteeing the desired throughput
162     - fraction to processes issuing sync requests (see the description
163     - of the slice_idle tunable in this document, or [1, 2], for more
164     - details).
    159 + throughput on rotational devices and on non-queueing flash-based
    160 + devices, if processes do synchronous and sequential I/O. In
    161 + addition, under BFQ, device idling is also instrumental in
    162 + guaranteeing the desired throughput fraction to processes
    163 + issuing sync requests (see the description of the slice_idle
    164 + tunable in this document, or [1, 2], for more details).
165 165
166 166   - With respect to idling for service guarantees, if several
167 167   processes are competing for the device at the same time, but
168     - all processes (and groups, after the following commit) have
169     - the same weight, then BFQ guarantees the expected throughput
170     - distribution without ever idling the device. Throughput is
171     - thus as high as possible in this common scenario.
    168 + all processes and groups have the same weight, then BFQ
    169 + guarantees the expected throughput distribution without ever
    170 + idling the device. Throughput is thus as high as possible in
    171 + this common scenario.
172 172
    173 + - On flash-based storage with internal queueing of commands
    174 + (typically NCQ), device idling happens to be always detrimental
    175 + for throughput. So, with these devices, BFQ performs idling
    176 + only when strictly needed for service guarantees, i.e., for
    177 + guaranteeing low latency or fairness. In these cases, overall
    178 + throughput may be sub-optimal. No solution currently exists to
    179 + provide both strong service guarantees and optimal throughput
    180 + on devices with internal queueing.
    181 +
173 182   - If low-latency mode is enabled (default configuration), BFQ
174 183   executes some special heuristics to detect interactive and soft
175 184   real-time applications (e.g., video or audio players/streamers),
211 211   - Queues are scheduled according to a variant of WF2Q+, named
212 212   B-WF2Q+, and implemented using an augmented rb-tree to preserve an
213 213   O(log N) overall complexity. See [2] for more details. B-WF2Q+ is
214     - also ready for hierarchical scheduling. However, for a cleaner
215     - logical breakdown, the code that enables and completes
216     - hierarchical support is provided in the next commit, which focuses
217     - exactly on this feature.
    214 + also ready for hierarchical scheduling, details in Section 4.
218 215
219 216   - B-WF2Q+ guarantees a tight deviation with respect to an ideal,
220 217   perfectly fair, and smooth service. In particular, B-WF2Q+
266 266 the Idle class, to prevent it from starving.
267 267
268 268
269 3. What are BFQ's tunable?
270 ==========================
269 3. What are BFQ's tunables and how to properly configure BFQ?
270 =============================================================
271 271
272 The tunables back_seek-max, back_seek_penalty, fifo_expire_async and
273 fifo_expire_sync below are the same as in CFQ. Their description is
274 just copied from that for CFQ. Some considerations in the description
275 of slice_idle are copied from CFQ too.
272 Most BFQ tunables affect service guarantees (basically latency and
273 fairness) and throughput. For full details on how to choose the
274 desired tradeoff between service guarantees and throughput, see the
275 parameters slice_idle, strict_guarantees and low_latency. For details
276 on how to maximise throughput, see slice_idle, timeout_sync and
277 max_budget. The other performance-related parameters have been
278 inherited from, and have been preserved mostly for compatibility with
279 CFQ. So far, no performance improvement has been reported after
280 changing the latter parameters in BFQ.
276 281
282 In particular, the tunables back_seek_max, back_seek_penalty,
283 fifo_expire_async and fifo_expire_sync below are the same as in
284 CFQ. Their description is just copied from that for CFQ. Some
285 considerations in the description of slice_idle are copied from CFQ
286 too.
287
277 288 per-process ioprio and weight
278 289 -----------------------------
279 290
313 313
314 314 Setting slice_idle to 0 will remove all the idling on queues and one
315 315 should see an overall improved throughput on faster storage devices
316 like multiple SATA/SAS disks in hardware RAID configuration.
316 like multiple SATA/SAS disks in hardware RAID configuration, as well
317 as flash-based storage with internal command queueing (and
318 parallelism).
317 319
318 320 So depending on storage and workload, it might be useful to set
319 321 slice_idle=0. In general for SATA/SAS disks and software RAID of
320 322 SATA/SAS disks keeping slice_idle enabled should be useful. For any
321 323 configurations where there are multiple spindles behind single LUN
322 (Host based hardware RAID controller or for storage arrays), setting
323 slice_idle=0 might end up in better throughput and acceptable
324 latencies.
324 (Host based hardware RAID controller or for storage arrays), or with
325 flash-based fast storage, setting slice_idle=0 might end up in better
326 throughput and acceptable latencies.
325 327
326 328 Idling is however necessary to have service guarantees enforced in
327 329 case of differentiated weights or differentiated I/O-request lengths.
342 342 where it is beneficial also for throughput, idling can severely impact
343 343 throughput. One important case is random workload. Because of this
344 344 issue, BFQ tends to avoid idling as much as possible, when it is not
345 beneficial also for throughput. As a consequence of this behavior, and
346 of further issues described for the strict_guarantees tunable,
347 short-term service guarantees may be occasionally violated. And, in
348 some cases, these guarantees may be more important than guaranteeing
349 maximum throughput. For example, in video playing/streaming, a very
350 low drop rate may be more important than maximum throughput. In these
351 cases, consider setting the strict_guarantees parameter.
345 beneficial also for throughput (as detailed in Section 2). As a
346 consequence of this behavior, and of further issues described for the
347 strict_guarantees tunable, short-term service guarantees may be
348 occasionally violated. And, in some cases, these guarantees may be
349 more important than guaranteeing maximum throughput. For example, in
350 video playing/streaming, a very low drop rate may be more important
351 than maximum throughput. In these cases, consider setting the
352 strict_guarantees parameter.
352 353
353 354 strict_guarantees
354 355 -----------------
451 451 to the maximum number of sectors that can be served during
452 452 timeout_sync, according to the estimated peak rate.
453 453
454 For specific devices, some users have occasionally reported to have
455 reached a higher throughput by setting max_budget explicitly, i.e., by
456 setting max_budget to a higher value than 0. In particular, they have
457 set max_budget to higher values than those to which BFQ would have set
458 it with auto-tuning. An alternative way to achieve this goal is to
459 just increase the value of timeout_sync, leaving max_budget equal to 0.
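
The auto-tuning rule described above (max_budget equal to the number of sectors that can be served during timeout_sync at the estimated peak rate) comes down to a single multiplication. The sketch below only illustrates that arithmetic: the function name, the units (sectors per millisecond, milliseconds) and the sample figures are assumptions made for this example, not values or identifiers taken from the BFQ code.

```python
def auto_max_budget(peak_rate_sectors_per_ms: float, timeout_sync_ms: float) -> int:
    """Illustrative model of the auto-tuned budget.

    Hypothetical helper: returns the number of sectors that could be
    served in timeout_sync_ms at the given estimated peak rate.
    """
    if peak_rate_sectors_per_ms < 0 or timeout_sync_ms < 0:
        raise ValueError("rate and timeout must be non-negative")
    return int(peak_rate_sectors_per_ms * timeout_sync_ms)


# Assumed sample figures: 100 sectors/ms peak rate, 125 ms timeout.
budget = auto_max_budget(100, 125)
# Doubling the timeout doubles the budget, with max_budget left at 0 (auto).
larger_budget = auto_max_budget(100, 250)
```

Under this model, raising timeout_sync scales the auto-tuned budget proportionally, which is why it can stand in for setting max_budget by hand.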
460
454 461 weights
455 462 -------
456 463
548 548 BFQ must of course be the active scheduler for that device.
549 549
550 550 Within each group directory, the names of the files associated with
551 BFQ-specific cgroup parameters and stats begin with the "bfq."
552 prefix. So, with cgroups-v1 or cgroups-v2, the full prefix for
553 BFQ-specific files is "blkio.bfq." or "io.bfq." For example, the group
554 parameter to set the weight of a group with BFQ is blkio.bfq.weight
551 BFQ-specific cgroup parameters and stats begin with the "bfq.",
552 "bfq-sq." or "bfq-mq." prefix, depending on which instance of bfq you
553 want to use. So, with cgroups-v1 or cgroups-v2, the full prefix for
554 BFQ-specific files is "blkio.bfqX." or "io.bfqX.", where X can be ""
555 (i.e., null string), "-sq" or "-mq". For example, the group parameter
556 to set the weight of a group with the mainline BFQ is blkio.bfq.weight
555 557 or io.bfq.weight.
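
The naming scheme above can be sketched as a small helper that builds the weight file name from the cgroups version and the BFQ instance. The function name and its argument conventions are hypothetical; only the resulting file names follow the text.

```python
def bfq_weight_file(cgroup_version: int, variant: str = "") -> str:
    """Return the cgroup weight file name for a BFQ instance.

    cgroup_version: 1 (blkio controller) or 2 (io controller).
    variant: "" for the mainline bfq, "-sq" or "-mq" otherwise.
    """
    prefixes = {1: "blkio", 2: "io"}
    if cgroup_version not in prefixes:
        raise ValueError("cgroup_version must be 1 or 2")
    if variant not in ("", "-sq", "-mq"):
        raise ValueError('variant must be "", "-sq" or "-mq"')
    return f"{prefixes[cgroup_version]}.bfq{variant}.weight"


mainline_v1 = bfq_weight_file(1)        # blkio.bfq.weight
mq_v2 = bfq_weight_file(2, "-mq")       # io.bfq-mq.weight
```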
556 558
557 559 Parameters to set
561 561
562 562 For each group, there is only the following parameter to set.
563 563
564 weight (namely blkio.bfq.weight or io.bfq-weight): the weight of the
564 weight (namely blkio.bfqX.weight or io.bfqX.weight): the weight of the
565 565 group inside its parent. Available values: 1..10000 (default 100). The
566 566 linear mapping between ioprio and weights, described at the beginning
567 567 of the tunable section, is still valid, but all weights higher than
1 tp_smapi version 0.40
2 IBM ThinkPad hardware functions driver
3
4 Author: Shem Multinymous <multinymous@gmail.com>
5 Project: http://sourceforge.net/projects/tpctl
6 Wiki: http://thinkwiki.org/wiki/tp_smapi
7 List: linux-thinkpad@linux-thinkpad.org
8 (http://mailman.linux-thinkpad.org/mailman/listinfo/linux-thinkpad)
9
10 Description
11 -----------
12
13 ThinkPad laptops include a proprietary interface called SMAPI BIOS
14 (System Management Application Program Interface) which provides some
15 hardware control functionality that is not accessible by other means.
16
17 This driver exposes some features of the SMAPI BIOS through a sysfs
18 interface. It is suitable for newer models, on which SMAPI is invoked
19 through IO port writes. Older models use a different SMAPI interface;
20 for those, try the "thinkpad" module from the "tpctl" package.
21
22 WARNING:
23 This driver uses undocumented features and direct hardware access.
24 It thus cannot be guaranteed to work, and may cause arbitrary damage
25 (especially on models it wasn't tested on).
26
27
28 Module parameters
29 -----------------
30
31 thinkpad_ec module:
32 force_io=1 lets thinkpad_ec load on some recent ThinkPad models
33 (e.g., T400 and T500) whose BIOS's ACPI DSDT reserves the ports we need.
34 tp_smapi module:
35 debug=1 enables verbose dmesg output.
36
37
38 Usage
39 -----
40
41 Control of battery charging thresholds (in percents of current full charge
42 capacity):
43
44 # echo 40 > /sys/devices/platform/smapi/BAT0/start_charge_thresh
45 # echo 70 > /sys/devices/platform/smapi/BAT0/stop_charge_thresh
46 # cat /sys/devices/platform/smapi/BAT0/*_charge_thresh
47
48 (This is useful since Li-Ion batteries wear out much faster at very
49 high or low charge levels. The driver also keeps the thresholds
50 across suspend-to-disk with AC disconnected; this isn't done
51 automatically by the hardware.)
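
The same threshold writes can be scripted. The sketch below is a minimal wrapper around the two sysfs attributes shown above. The function names, the `root` parameter (which allows a dry run against an ordinary directory) and the 0 <= start < stop <= 100 sanity check are assumptions of this example; the driver or firmware may enforce different limits.

```python
from pathlib import Path

SMAPI_ROOT = Path("/sys/devices/platform/smapi")


def set_charge_thresholds(start, stop, battery="BAT0", root=SMAPI_ROOT):
    """Write the start/stop charge thresholds (in percent) for one battery."""
    # Assumed sanity check, not taken from the driver.
    if not (0 <= start < stop <= 100):
        raise ValueError("expected 0 <= start < stop <= 100")
    bat = Path(root) / battery
    (bat / "start_charge_thresh").write_text(f"{start}\n")
    (bat / "stop_charge_thresh").write_text(f"{stop}\n")


def get_charge_thresholds(battery="BAT0", root=SMAPI_ROOT):
    """Read back the (start, stop) thresholds as integers."""
    bat = Path(root) / battery
    return (
        int((bat / "start_charge_thresh").read_text()),
        int((bat / "stop_charge_thresh").read_text()),
    )
```

On a real system the writes need root privileges, e.g. `set_charge_thresholds(40, 70)` followed by `get_charge_thresholds()`.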
52
53 Inhibiting battery charging for 17 minutes (overrides thresholds):
54
55 # echo 17 > /sys/devices/platform/smapi/BAT0/inhibit_charge_minutes
56 # echo 0 > /sys/devices/platform/smapi/BAT0/inhibit_charge_minutes # stop
57 # cat /sys/devices/platform/smapi/BAT0/inhibit_charge_minutes
58
59 (This can be used to control which battery is charged when using an
60 Ultrabay battery.)
61
62 Forcing battery discharging even if AC power available:
63
64 # echo 1 > /sys/devices/platform/smapi/BAT0/force_discharge # start discharge
65 # echo 0 > /sys/devices/platform/smapi/BAT0/force_discharge # stop discharge
66 # cat /sys/devices/platform/smapi/BAT0/force_discharge
67
68 (When AC is connected, forced discharging will automatically stop
69 when battery is fully depleted -- this is useful for calibration.
70 Also, this attribute can be used to control which battery is discharged
71 when both a system battery and an Ultrabay battery are connected.)
72
73 Misc read-only battery status attributes (see note about HDAPS below):
74
75 /sys/devices/platform/smapi/BAT0/installed # 0 or 1
76 /sys/devices/platform/smapi/BAT0/state # idle/charging/discharging
77 /sys/devices/platform/smapi/BAT0/cycle_count # integer counter
78 /sys/devices/platform/smapi/BAT0/current_now # instantaneous current
79 /sys/devices/platform/smapi/BAT0/current_avg # last minute average
80 /sys/devices/platform/smapi/BAT0/power_now # instantaneous power
81 /sys/devices/platform/smapi/BAT0/power_avg # last minute average
82 /sys/devices/platform/smapi/BAT0/last_full_capacity # in mWh
83 /sys/devices/platform/smapi/BAT0/remaining_percent # remaining percent of energy (set by calibration)
84 /sys/devices/platform/smapi/BAT0/remaining_percent_error # error range of remaining_percent (not reset by calibration)
85 /sys/devices/platform/smapi/BAT0/remaining_running_time # in minutes, by last minute average power
86 /sys/devices/platform/smapi/BAT0/remaining_running_time_now # in minutes, by instantaneous power
87 /sys/devices/platform/smapi/BAT0/remaining_charging_time # in minutes
88 /sys/devices/platform/smapi/BAT0/remaining_capacity # in mWh
89 /sys/devices/platform/smapi/BAT0/design_capacity # in mWh
90 /sys/devices/platform/smapi/BAT0/voltage # in mV
91 /sys/devices/platform/smapi/BAT0/design_voltage # in mV
92 /sys/devices/platform/smapi/BAT0/charging_max_current # max charging current
93 /sys/devices/platform/smapi/BAT0/charging_max_voltage # max charging voltage
94 /sys/devices/platform/smapi/BAT0/group{0,1,2,3}_voltage # see below
95 /sys/devices/platform/smapi/BAT0/manufacturer # string
96 /sys/devices/platform/smapi/BAT0/model # string
97 /sys/devices/platform/smapi/BAT0/barcoding # string
98 /sys/devices/platform/smapi/BAT0/chemistry # string
99 /sys/devices/platform/smapi/BAT0/serial # integer
100 /sys/devices/platform/smapi/BAT0/manufacture_date # YYYY-MM-DD
101 /sys/devices/platform/smapi/BAT0/first_use_date # YYYY-MM-DD
102 /sys/devices/platform/smapi/BAT0/temperature # in milli-Celsius
103 /sys/devices/platform/smapi/BAT0/dump # see below
104 /sys/devices/platform/smapi/ac_connected # 0 or 1
105
106 The BAT0/group{0,1,2,3}_voltage attribute refers to the separate cell groups
107 in each battery. For example, on the ThinkPad 600, X3x, T4x and R5x models,
108 the battery contains 3 cell groups in series, with each group consisting of 2
109 or 3 cells connected in parallel. The voltage of each group is given by these
110 attributes, and their sum (roughly) equals the "voltage" attribute.
111 (The effective performance of the battery is determined by the weakest group,
112 i.e., the one whose voltage changes most rapidly during dis/charging.)
113
114 The "BAT0/dump" attribute gives a hex dump of the raw status data, which
115 contains additional data not in the above (if you can figure it out). Some
116 unused values are autodetected and replaced by "--".
117
118 In all of the above, replace BAT0 with BAT1 to address the 2nd battery (e.g.
119 in the UltraBay).
120
121
122 Raw SMAPI calls:
123
124 /sys/devices/platform/smapi/smapi_request
125 This performs raw SMAPI calls. It uses a bad interface that cannot handle
126 multiple simultaneous access. Don't touch it, it's for development only.
127 If you did touch it, you would do something like
128 # echo '211a 100 0 0' > /sys/devices/platform/smapi/smapi_request
129 # cat /sys/devices/platform/smapi/smapi_request
130 and notice that in the output "211a 34b b2 0 0 0 'OK'", the "4b" in the 2nd
131 value, converted to decimal is 75: the current charge stop threshold.
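
The decoding step described above can be sketched as follows. This assumes the reply fields are whitespace-separated hexadecimal values and that the threshold sits in the low byte of the second field, which is what the "34b" to "4b" example suggests; the function name is hypothetical.

```python
def stop_threshold_from_reply(reply: str) -> int:
    """Extract the charge stop threshold from a raw smapi_request reply."""
    fields = reply.split()
    second_value = int(fields[1], 16)   # e.g. "34b" -> 0x34b
    return second_value & 0xFF          # keep the low byte, e.g. 0x4b


threshold = stop_threshold_from_reply("211a 34b b2 0 0 0 'OK'")
```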
132
133
134 Model-specific status
135 ---------------------
136
137 Works (at least partially) on the following ThinkPad models:
138 * A30
139 * G41
140 * R40, R50p, R51, R52
141 * T23, T40, T40p, T41, T41p, T42, T42p, T43, T43p, T60
142 * X24, X31, X32, X40, X41, X60
143 * Z60t, Z61m
144
145 Not all functions are available on all models; for detailed status, see:
146 http://thinkwiki.org/wiki/tp_smapi
147
148 Please report success/failure by e-mail or on the Wiki.
149 If you get a "not implemented" or "not supported" message, your laptop
150 probably just can't do that (at least not via the SMAPI BIOS).
151 For negative reports, follow the bug reporting guidelines below.
152 If you send me the necessary technical data (i.e., SMAPI function
153 interfaces), I will support additional models.
154
155
156 Additional HDAPS features
157 -------------------------
158
159 The modified hdaps driver has several improvements on the one in mainline
160 (beyond resolving the conflict with thinkpad_ec and tp_smapi):
161
162 - Fixes reliability and improves support for recent ThinkPad models
163 (especially *60 and newer). Unlike the mainline driver, the modified hdaps
164 correctly follows the Embedded Controller communication protocol.
165
166 - Extends the "invert" parameter to cover all possible axis orientations.
167 The possible values are as follows.
168 Let X,Y denote the hardware readouts.
169 Let R denote the laptop's roll (tilt left/right).
170 Let P denote the laptop's pitch (tilt forward/backward).
171 invert=0: R= X P= Y (same as mainline)
172 invert=1: R=-X P=-Y (same as mainline)
173 invert=2: R=-X P= Y (new)
174 invert=3: R= X P=-Y (new)
175 invert=4: R= Y P= X (new)
176 invert=5: R=-Y P=-X (new)
177 invert=6: R=-Y P= X (new)
178 invert=7: R= Y P=-X (new)
179 It's probably easiest to just try all 8 possibilities and see which yields
180 correct results (e.g., in the hdaps-gl visualisation).
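
The table above can be written as a lookup from the "invert" value to the (R, P) pair. This is a sketch of the mapping only, not the driver's implementation; the function name is hypothetical.

```python
def apply_invert(x, y, invert):
    """Map hardware readouts (X, Y) to (roll, pitch) for a given invert value."""
    table = {
        0: (x, y),      # R= X  P= Y
        1: (-x, -y),    # R=-X  P=-Y
        2: (-x, y),     # R=-X  P= Y
        3: (x, -y),     # R= X  P=-Y
        4: (y, x),      # R= Y  P= X
        5: (-y, -x),    # R=-Y  P=-X
        6: (-y, x),     # R=-Y  P= X
        7: (y, -x),     # R= Y  P=-X
    }
    if invert not in table:
        raise ValueError("invert must be in 0..7")
    return table[invert]


roll, pitch = apply_invert(3, 5, 6)
```

Values 4 to 7 swap the two axes; within each group of four, the sign combinations follow the same order as for values 0 to 3.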
181
182 - Adds a whitelist which automatically sets the correct axis orientation for
183 some models. If the value for your model is wrong or missing, you can override
184 it using the "invert" parameter. Please also update the tables at
185 http://www.thinkwiki.org/wiki/tp_smapi and
186 http://www.thinkwiki.org/wiki/List_of_DMI_IDs
187 and submit a patch for the whitelist in hdaps.c.
188
189 - Provides new attributes:
190 /sys/devices/platform/hdaps/sampling_rate:
191 This determines the frequency at which the host queries the embedded
192 controller for accelerometer data (and informs the hdaps input devices).
193 Default=50.
194 /sys/devices/platform/hdaps/oversampling_ratio:
195 When set to X, the embedded controller is told to do physical accelerometer
196 measurements at a rate that is X times higher than the rate at which
197 the driver reads those measurements (i.e., X*sampling_rate). This
198 makes the readouts from the embedded controller more fresh, and is also
199 useful for the running average filter (see next). Default=5
200 /sys/devices/platform/hdaps/running_avg_filter_order:
201 When set to X, reported readouts will be the average of the last X physical
202 accelerometer measurements. Current firmware allows 1<=X<=8. Setting to a
203 high value decreases readout fluctuations. The averaging is handled by the
204 embedded controller, so no CPU resources are used. Higher values make the
205 readouts smoother, since it averages out both sensor noise (good) and abrupt
206 changes (bad). Default=2.
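
How the three attributes interact can be modelled on the host side as below. This is purely illustrative: the real averaging is done by the embedded controller, and the class and function names are hypothetical. The defaults (50, 5, 2) and the 1..8 range are the ones stated above.

```python
from collections import deque


def physical_rate(sampling_rate=50, oversampling_ratio=5):
    """Rate of physical measurements: oversampling_ratio * sampling_rate."""
    return sampling_rate * oversampling_ratio


class RunningAverage:
    """Average of the last `order` measurements (order in 1..8)."""

    def __init__(self, order=2):
        if not 1 <= order <= 8:
            raise ValueError("order must be in 1..8")
        self._window = deque(maxlen=order)

    def update(self, measurement):
        # The oldest measurement drops out automatically once the window is full.
        self._window.append(measurement)
        return sum(self._window) / len(self._window)


filt = RunningAverage(order=2)
readouts = [filt.update(m) for m in (10, 20, 40)]
```

With order 2, the abrupt jump from 20 to 40 is reported as 30: the trade-off between smoothing noise and damping real changes noted above.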
207
208 - Provides a second input device, which publishes the raw accelerometer
209 measurements (without the fuzzing needed for joystick emulation). This input
210 device can be matched by a udev rule such as the following (all on one line):
211 KERNEL=="event[0-9]*", ATTRS{phys}=="hdaps/input1",
212 ATTRS{modalias}=="input:b0019v1014p5054e4801-*",
213 SYMLINK+="input/hdaps/accelerometer-event"
214
215 A new version of the hdapsd userspace daemon, which uses the input device
216 interface instead of polling sysfs, is available separately. Using this reduces
217 the total interrupts per second generated by hdaps+hdapsd (on tickless kernels)
218 to 50, down from a value that fluctuates between 50 and 100. Set the
219 sampling_rate sysfs attribute to a lower value to further reduce interrupts,
220 at the expense of response latency.
221
222 Licensing note: all my changes to the HDAPS driver are licensed under the
223 GPL version 2 or, at your option and to the extent allowed by derivation from
224 prior works, any later version. My version of hdaps is derived work from the
225 mainline version, which at the time of writing is available only under
226 GPL version 2.
227
228 Bug reporting
229 -------------
230
231 Mail <multinymous@gmail.com>. Please include:
232 * Details about your model,
233 * Relevant "dmesg" output. Make sure thinkpad_ec and tp_smapi are loaded with
234 the "debug=1" parameter (e.g., use "make load HDAPS=1 DEBUG=1").
235 * Output of "dmidecode | grep -C5 Product"
236 * Does the failed functionality work under Windows?
237
238
239 More about SMAPI
240 ----------------
241
242 For hints about what may be possible via the SMAPI BIOS and how, see:
243
244 * IBM Technical Reference Manual for the ThinkPad 770
245 (http://www-307.ibm.com/pc/support/site.wss/document.do?lndocid=PFAN-3TUQQD)
246 * Exported symbols in PWRMGRIF.DLL or TPPWRW32.DLL (e.g., use "objdump -x").
247 * drivers/char/mwave/smapi.c in the Linux kernel tree.
248 * The "thinkpad" SMAPI module (http://tpctl.sourceforge.net).
249 * The SMAPI_* constants in tp_smapi.c.
250
251 Note that in the above Technical Reference and in the "thinkpad" module,
252 SMAPI is invoked through a function call to some physical address. However,
253 the interface used by tp_smapi and the above mwave driver, and apparently
254 required by newer ThinkPads, is different: you set the parameters up in the
255 CPU's registers and write to ports 0xB2 (the APM control port) and 0x4F; this
256 triggers an SMI (System Management Interrupt), causing the CPU to enter
257 SMM (System Management Mode) and run the BIOS firmware; the results are
258 returned in the CPU's registers. It is not clear how the two variants of
259 SMAPI are related, though the assignment of error codes seems to be
260 similar.
261
262 In addition, the embedded controller on ThinkPad laptops has a non-standard
263 interface at IO ports 0x1600-0x161F (mapped to LPC channel 3 of the H8S chip).
264 The interface provides various system management services (currently known:
265 battery information and accelerometer readouts). For more information see the
266 thinkpad_ec module and the H8S hardware documentation:
267 http://documentation.renesas.com/eng/products/mpumcu/rej09b0300_2140bhm.pdf
1 1 VERSION = 4
2 2 PATCHLEVEL = 13
3 SUBLEVEL = 0
4 EXTRAVERSION =
3 SUBLEVEL = 1
4 EXTRAVERSION = -backbone
5 5 NAME = Fearless Coyote
6 6
7 7 # *DOCUMENTATION*
638 638 KBUILD_CFLAGS += $(call cc-option,-Oz,-Os)
639 639 KBUILD_CFLAGS += $(call cc-disable-warning,maybe-uninitialized,)
640 640 else
641 ifdef CONFIG_CC_OPTIMIZE_HARDER
642 KBUILD_CFLAGS += -O3 $(call cc-disable-warning,maybe-uninitialized,)
643 else
641 644 ifdef CONFIG_PROFILE_ALL_BRANCHES
642 645 KBUILD_CFLAGS += -O2 $(call cc-disable-warning,maybe-uninitialized,)
643 646 else
644 647 KBUILD_CFLAGS += -O2
648 endif
645 649 endif
646 650 endif
647 651
505 505 * In the case that a guest uses storage keys
506 506 * faults should no longer be backed by zero pages
507 507 */
508 #define mm_forbids_zeropage mm_use_skey
508 #define mm_forbids_zeropage mm_has_pgste
509 509 static inline int mm_use_skey(struct mm_struct *mm)
510 510 {
511 511 #ifdef CONFIG_PGSTE
2121 2121 }
2122 2122
2123 2123 /*
2124 * Remove all empty zero pages from the mapping for lazy refaulting
2125 * - This must be called after mm->context.has_pgste is set, to avoid
2126 * future creation of zero pages
2127 * - This must be called after THP was enabled
2128 */
2129 static int __zap_zero_pages(pmd_t *pmd, unsigned long start,
2130 unsigned long end, struct mm_walk *walk)
2131 {
2132 unsigned long addr;
2133
2134 for (addr = start; addr != end; addr += PAGE_SIZE) {
2135 pte_t *ptep;
2136 spinlock_t *ptl;
2137
2138 ptep = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
2139 if (is_zero_pfn(pte_pfn(*ptep)))
2140 ptep_xchg_direct(walk->mm, addr, ptep, __pte(_PAGE_INVALID));
2141 pte_unmap_unlock(ptep, ptl);
2142 }
2143 return 0;
2144 }
2145
2146 static inline void zap_zero_pages(struct mm_struct *mm)
2147 {
2148 struct mm_walk walk = { .pmd_entry = __zap_zero_pages };
2149
2150 walk.mm = mm;
2151 walk_page_range(0, TASK_SIZE, &walk);
2152 }
2153
2154 /*
2124 2155 * switch on pgstes for its userspace process (for kvm)
2125 2156 */
2126 2157 int s390_enable_sie(void)
2168 2168 mm->context.has_pgste = 1;
2169 2169 /* split thp mappings and disable thp for future mappings */
2170 2170 thp_split_mm(mm);
2171 zap_zero_pages(mm);
2171 2172 up_write(&mm->mmap_sem);
2172 2173 return 0;
2173 2174 }
2181 2181 static int __s390_enable_skey(pte_t *pte, unsigned long addr,
2182 2182 unsigned long next, struct mm_walk *walk)
2183 2183 {
2184 /*
2185 * Remove all zero page mappings,
2186 * after establishing a policy to forbid zero page mappings
2187 * following faults for that page will get fresh anonymous pages
2188 */
2189 if (is_zero_pfn(pte_pfn(*pte)))
2190 ptep_xchg_direct(walk->mm, addr, pte, __pte(_PAGE_INVALID));
2191 2184 /* Clear storage key */
2192 2185 ptep_zap_key(walk->mm, addr, pte);
2193 2186 return 0;
115 115 config MPENTIUM4
116 116 bool "Pentium-4/Celeron(P4-based)/Pentium-4 M/older Xeon"
117 117 depends on X86_32
118 select X86_P6_NOP
118 119 ---help---
119 120 Select this for Intel Pentium 4 chips. This includes the
120 121 Pentium 4, Pentium D, P4-based Celeron and Xeon, and
148 148 -Paxville
149 149 -Dempsey
150 150
151
152 151 config MK6
153 bool "K6/K6-II/K6-III"
152 bool "AMD K6/K6-II/K6-III"
154 153 depends on X86_32
155 154 ---help---
156 155 Select this for an AMD K6-family processor. Enables use of
157 157 flags to GCC.
158 158
159 159 config MK7
160 bool "Athlon/Duron/K7"
160 bool "AMD Athlon/Duron/K7"
161 161 depends on X86_32
162 162 ---help---
163 163 Select this for an AMD Athlon K7-family processor. Enables use of
165 165 flags to GCC.
166 166
167 167 config MK8
168 bool "Opteron/Athlon64/Hammer/K8"
168 bool "AMD Opteron/Athlon64/Hammer/K8"
169 169 ---help---
170 170 Select this for an AMD Opteron or Athlon64 Hammer-family processor.
171 171 Enables use of some extended instructions, and passes appropriate
172 172 optimization flags to GCC.
173 173
174 config MK8SSE3
175 bool "AMD Opteron/Athlon64/Hammer/K8 with SSE3"
176 ---help---
177 Select this for improved AMD Opteron or Athlon64 Hammer-family processors.
178 Enables use of some extended instructions, and passes appropriate
179 optimization flags to GCC.
180
181 config MK10
182 bool "AMD 61xx/7x50/PhenomX3/X4/II/K10"
183 ---help---
184 Select this for an AMD 61xx Eight-Core Magny-Cours, Athlon X2 7x50,
185 Phenom X3/X4/II, Athlon II X2/X3/X4, or Turion II-family processor.
186 Enables use of some extended instructions, and passes appropriate
187 optimization flags to GCC.
188
189 config MBARCELONA
190 bool "AMD Barcelona"
191 ---help---
192 Select this for AMD Family 10h Barcelona processors.
193
194 Enables -march=barcelona
195
196 config MBOBCAT
197 bool "AMD Bobcat"
198 ---help---
199 Select this for AMD Family 14h Bobcat processors.
200
201 Enables -march=btver1
202
203 config MJAGUAR
204 bool "AMD Jaguar"
205 ---help---
206 Select this for AMD Family 16h Jaguar processors.
207
208 Enables -march=btver2
209
210 config MBULLDOZER
211 bool "AMD Bulldozer"
212 ---help---
213 Select this for AMD Family 15h Bulldozer processors.
214
215 Enables -march=bdver1
216
217 config MPILEDRIVER
218 bool "AMD Piledriver"
219 ---help---
220 Select this for AMD Family 15h Piledriver processors.
221
222 Enables -march=bdver2
223
224 config MSTEAMROLLER
225 bool "AMD Steamroller"
226 ---help---
227 Select this for AMD Family 15h Steamroller processors.
228
229 Enables -march=bdver3
230
231 config MEXCAVATOR
232 bool "AMD Excavator"
233 ---help---
234 Select this for AMD Family 15h Excavator processors.
235
236 Enables -march=bdver4
237
238 config MZEN
239 bool "AMD Zen"
240 ---help---
241 Select this for AMD Family 17h Zen processors.
242
243 Enables -march=znver1
244
174 245 config MCRUSOE
175 246 bool "Crusoe"
176 247 depends on X86_32
323 323
324 324 config MPSC
325 325 bool "Intel P4 / older Netburst based Xeon"
326 select X86_P6_NOP
326 327 depends on X86_64
327 328 ---help---
328 329 Optimize for Intel Pentium 4, Pentium D and older Nocona/Dempsey
333 333 using the cpu family field
334 334 in /proc/cpuinfo. Family 15 is an older Xeon, Family 6 a newer one.
335 335
336 config MATOM
337 bool "Intel Atom"
338 select X86_P6_NOP
339 ---help---
340
341 Select this for the Intel Atom platform. Intel Atom CPUs have an
342 in-order pipelining architecture and thus can benefit from
343 accordingly optimized code. Use a recent GCC with specific Atom
344 support in order to fully benefit from selecting this option.
345
336 346 config MCORE2
337 bool "Core 2/newer Xeon"
347 bool "Intel Core 2"
348 select X86_P6_NOP
338 349 ---help---
339 350
340 351 Select this for Intel Core 2 and newer Core 2 Xeons (Xeon 51xx and
353 353 family in /proc/cpuinfo. Newer ones have 6 and older ones 15
354 354 (not a typo)
355 355
356 config MATOM
357 bool "Intel Atom"
356 Enables -march=core2
357
358 config MNEHALEM
359 bool "Intel Nehalem"
360 select X86_P6_NOP
358 361 ---help---
359 362
360 Select this for the Intel Atom platform. Intel Atom CPUs have an
361 in-order pipelining architecture and thus can benefit from
362 accordingly optimized code. Use a recent GCC with specific Atom
363 support in order to fully benefit from selecting this option.
363 Select this for 1st Gen Core processors in the Nehalem family.
364 364
365 Enables -march=nehalem
366
367 config MWESTMERE
368 bool "Intel Westmere"
369 select X86_P6_NOP
370 ---help---
371
372 Select this for the Intel Westmere formerly Nehalem-C family.
373
374 Enables -march=westmere
375
376 config MSILVERMONT
377 bool "Intel Silvermont"
378 select X86_P6_NOP
379 ---help---
380
381 Select this for the Intel Silvermont platform.
382
383 Enables -march=silvermont
384
385 config MSANDYBRIDGE
386 bool "Intel Sandy Bridge"
387 select X86_P6_NOP
388 ---help---
389
390 Select this for 2nd Gen Core processors in the Sandy Bridge family.
391
392 Enables -march=sandybridge
393
394 config MIVYBRIDGE
395 bool "Intel Ivy Bridge"
396 select X86_P6_NOP
397 ---help---
398
399 Select this for 3rd Gen Core processors in the Ivy Bridge family.
400
401 Enables -march=ivybridge
402
403 config MHASWELL
404 bool "Intel Haswell"
405 select X86_P6_NOP
406 ---help---
407
408 Select this for 4th Gen Core processors in the Haswell family.
409
410 Enables -march=haswell
411
412 config MBROADWELL
413 bool "Intel Broadwell"
414 select X86_P6_NOP
415 ---help---
416
417 Select this for 5th Gen Core processors in the Broadwell family.
418
419 Enables -march=broadwell
420
421 config MSKYLAKE
422 bool "Intel Skylake"
423 select X86_P6_NOP
424 ---help---
425
426 Select this for 6th Gen Core processors in the Skylake family.
427
428 Enables -march=skylake
429
365 430 config GENERIC_CPU
366 431 bool "Generic-x86-64"
367 432 depends on X86_64
434 434 Generic x86-64 CPU.
435 435 Run equally well on all x86-64 CPUs.
436 436
437 config MNATIVE
438 bool "Native optimizations autodetected by GCC"
439 ---help---
440
441 GCC 4.2 and above support -march=native, which automatically detects
442 the optimum settings to use based on your processor. -march=native
443 also detects and applies additional settings beyond -march specific
444 to your CPU (e.g. -msse4). Unless you have a specific reason not to
445 (e.g. distcc cross-compiling), you should probably be using
446 -march=native rather than anything listed below.
447
448 Enables -march=native
449
437 450 endchoice
438 451
439 452 config X86_GENERIC
471 471 config X86_L1_CACHE_SHIFT
472 472 int
473 473 default "7" if MPENTIUM4 || MPSC
474 default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
474 default "6" if MK7 || MK8 || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MJAGUAR || MPENTIUMM || MCORE2 || MNEHALEM || MWESTMERE || MSILVERMONT || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MNATIVE || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
475 475 default "4" if MELAN || M486 || MGEODEGX1
476 476 default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
477 477
502 502
503 503 config X86_INTEL_USERCOPY
504 504 def_bool y
505 depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2
505 depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK8SSE3 || MK7 || MEFFICEON || MCORE2 || MK10 || MBARCELONA || MNEHALEM || MWESTMERE || MSILVERMONT || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MNATIVE
506 506
507 507 config X86_USE_PPRO_CHECKSUM
508 508 def_bool y
509 depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM
509 depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MK10 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MK8SSE3 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MNEHALEM || MWESTMERE || MSILVERMONT || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MATOM || MNATIVE
510 510
511 511 config X86_USE_3DNOW
512 512 def_bool y
513 513 depends on (MCYRIXIII || MK7 || MGEODE_LX) && !UML
514 514
515 #
516 # P6_NOPs are a relatively minor optimization that require a family >=
517 # 6 processor, except that it is broken on certain VIA chips.
518 # Furthermore, AMD chips prefer a totally different sequence of NOPs
519 # (which work on all CPUs). In addition, it looks like Virtual PC
520 # does not understand them.
521 #
522 # As a result, disallow these if we're not compiling for X86_64 (these
523 # NOPs do work on all x86-64 capable chips); the list of processors in
524 # the right-hand clause are the cores that benefit from this optimization.
525 #
526 515 config X86_P6_NOP
527 def_bool y
528 depends on X86_64
529 depends on (MCORE2 || MPENTIUM4 || MPSC)
516 default n
517 bool "Support for P6_NOPs on Intel chips"
518 depends on (MCORE2 || MPENTIUM4 || MPSC || MATOM || MNEHALEM || MWESTMERE || MSILVERMONT || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MNATIVE)
519 ---help---
520 P6_NOPs are a relatively minor optimization that require a family >=
521 6 processor, except that it is broken on certain VIA chips.
522 Furthermore, AMD chips prefer a totally different sequence of NOPs
523 (which work on all CPUs). In addition, it looks like Virtual PC
524 does not understand them.
530 525
526 As a result, disallow these if we're not compiling for X86_64 (these
527 NOPs do work on all x86-64 capable chips); the list of processors in
528 the right-hand clause are the cores that benefit from this optimization.
529
530 Say Y if you have an Intel CPU newer than Pentium Pro, N otherwise.
531
531 532 config X86_TSC
532 533 def_bool y
533 depends on (MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MATOM) || X86_64
534 depends on (MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MK8SSE3 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MNEHALEM || MWESTMERE || MSILVERMONT || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MNATIVE || MATOM) || X86_64
534 535
535 536 config X86_CMPXCHG64
536 537 def_bool y
537 depends on X86_PAE || X86_64 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MATOM
538 depends on X86_PAE || X86_64 || MCORE2 || MNEHALEM || MWESTMERE || MSILVERMONT || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MATOM || MNATIVE
538 539
539 540 # this should be set for all -march=.. options where the compiler
540 541 # generates cmov.
541 542 config X86_CMOV
542 543 def_bool y
543 depends on (MK8 || MK7 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX)
544 depends on (MK8 || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MJAGUAR || MK7 || MCORE2 || MNEHALEM || MWESTMERE || MSILVERMONT || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MNATIVE || MATOM || MGEODE_LX)
544 545
545 546 config X86_MINIMUM_CPU_FAMILY
546 547 int
121 121 KBUILD_CFLAGS += $(call cc-option,-mskip-rax-setup)
122 122
123 123 # FIXME - should be integrated in Makefile.cpu (Makefile_32.cpu)
124 cflags-$(CONFIG_MNATIVE) += $(call cc-option,-march=native)
124 125 cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8)
126 cflags-$(CONFIG_MK8SSE3) += $(call cc-option,-march=k8-sse3,-mtune=k8)
127 cflags-$(CONFIG_MK10) += $(call cc-option,-march=amdfam10)
128 cflags-$(CONFIG_MBARCELONA) += $(call cc-option,-march=barcelona)
129 cflags-$(CONFIG_MBOBCAT) += $(call cc-option,-march=btver1)
130 cflags-$(CONFIG_MJAGUAR) += $(call cc-option,-march=btver2)
131 cflags-$(CONFIG_MBULLDOZER) += $(call cc-option,-march=bdver1)
132 cflags-$(CONFIG_MPILEDRIVER) += $(call cc-option,-march=bdver2)
133 cflags-$(CONFIG_MSTEAMROLLER) += $(call cc-option,-march=bdver3)
134 cflags-$(CONFIG_MEXCAVATOR) += $(call cc-option,-march=bdver4)
135 cflags-$(CONFIG_MZEN) += $(call cc-option,-march=znver1)
125 136 cflags-$(CONFIG_MPSC) += $(call cc-option,-march=nocona)
126 137
127 138 cflags-$(CONFIG_MCORE2) += \
128 $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
129 cflags-$(CONFIG_MATOM) += $(call cc-option,-march=atom) \
130 $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))
139 $(call cc-option,-march=core2,$(call cc-option,-mtune=core2))
140 cflags-$(CONFIG_MNEHALEM) += \
141 $(call cc-option,-march=nehalem,$(call cc-option,-mtune=nehalem))
142 cflags-$(CONFIG_MWESTMERE) += \
143 $(call cc-option,-march=westmere,$(call cc-option,-mtune=westmere))
144 cflags-$(CONFIG_MSILVERMONT) += \
145 $(call cc-option,-march=silvermont,$(call cc-option,-mtune=silvermont))
146 cflags-$(CONFIG_MSANDYBRIDGE) += \
147 $(call cc-option,-march=sandybridge,$(call cc-option,-mtune=sandybridge))
148 cflags-$(CONFIG_MIVYBRIDGE) += \
149 $(call cc-option,-march=ivybridge,$(call cc-option,-mtune=ivybridge))
150 cflags-$(CONFIG_MHASWELL) += \
151 $(call cc-option,-march=haswell,$(call cc-option,-mtune=haswell))
152 cflags-$(CONFIG_MBROADWELL) += \
153 $(call cc-option,-march=broadwell,$(call cc-option,-mtune=broadwell))
154 cflags-$(CONFIG_MSKYLAKE) += \
155 $(call cc-option,-march=skylake,$(call cc-option,-mtune=skylake))
156 cflags-$(CONFIG_MATOM) += $(call cc-option,-march=bonnell) \
157 $(call cc-option,-mtune=bonnell,$(call cc-option,-mtune=generic))
131 158 cflags-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=generic)
132 159 KBUILD_CFLAGS += $(cflags-y)
133 160
22 22 # Please note, that patches that add -march=athlon-xp and friends are pointless.
23 23 # They make zero difference whatsoever to performance at this time.
24 24 cflags-$(CONFIG_MK7) += -march=athlon
25 cflags-$(CONFIG_MNATIVE) += $(call cc-option,-march=native)
25 26 cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8,-march=athlon)
27 cflags-$(CONFIG_MK8SSE3) += $(call cc-option,-march=k8-sse3,-march=athlon)
28 cflags-$(CONFIG_MK10) += $(call cc-option,-march=amdfam10,-march=athlon)
29 cflags-$(CONFIG_MBARCELONA) += $(call cc-option,-march=barcelona,-march=athlon)
30 cflags-$(CONFIG_MBOBCAT) += $(call cc-option,-march=btver1,-march=athlon)
31 cflags-$(CONFIG_MJAGUAR) += $(call cc-option,-march=btver2,-march=athlon)
32 cflags-$(CONFIG_MBULLDOZER) += $(call cc-option,-march=bdver1,-march=athlon)
33 cflags-$(CONFIG_MPILEDRIVER) += $(call cc-option,-march=bdver2,-march=athlon)
34 cflags-$(CONFIG_MSTEAMROLLER) += $(call cc-option,-march=bdver3,-march=athlon)
35 cflags-$(CONFIG_MEXCAVATOR) += $(call cc-option,-march=bdver4,-march=athlon)
36 cflags-$(CONFIG_MZEN) += $(call cc-option,-march=znver1,-march=athlon)
26 37 cflags-$(CONFIG_MCRUSOE) += -march=i686 -falign-functions=0 -falign-jumps=0 -falign-loops=0
27 38 cflags-$(CONFIG_MEFFICEON) += -march=i686 $(call tune,pentium3) -falign-functions=0 -falign-jumps=0 -falign-loops=0
28 39 cflags-$(CONFIG_MWINCHIPC6) += $(call cc-option,-march=winchip-c6,-march=i586)
42 42 cflags-$(CONFIG_MVIAC3_2) += $(call cc-option,-march=c3-2,-march=i686)
43 43 cflags-$(CONFIG_MVIAC7) += -march=i686
44 44 cflags-$(CONFIG_MCORE2) += -march=i686 $(call tune,core2)
45 cflags-$(CONFIG_MATOM) += $(call cc-option,-march=atom,$(call cc-option,-march=core2,-march=i686)) \
46 $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))
45 cflags-$(CONFIG_MNEHALEM) += -march=i686 $(call tune,nehalem)
46 cflags-$(CONFIG_MWESTMERE) += -march=i686 $(call tune,westmere)
47 cflags-$(CONFIG_MSILVERMONT) += -march=i686 $(call tune,silvermont)
48 cflags-$(CONFIG_MSANDYBRIDGE) += -march=i686 $(call tune,sandybridge)
49 cflags-$(CONFIG_MIVYBRIDGE) += -march=i686 $(call tune,ivybridge)
50 cflags-$(CONFIG_MHASWELL) += -march=i686 $(call tune,haswell)
51 cflags-$(CONFIG_MBROADWELL) += -march=i686 $(call tune,broadwell)
52 cflags-$(CONFIG_MSKYLAKE) += -march=i686 $(call tune,skylake)
53 cflags-$(CONFIG_MATOM) += $(call cc-option,-march=bonnell,$(call cc-option,-march=core2,-march=i686)) \
54 $(call cc-option,-mtune=bonnell,$(call cc-option,-mtune=generic))
47 55
48 56 # AMD Elan support
49 57 cflags-$(CONFIG_MELAN) += -march=i486
15 15 #define MODULE_PROC_FAMILY "586MMX "
16 16 #elif defined CONFIG_MCORE2
17 17 #define MODULE_PROC_FAMILY "CORE2 "
18 #elif defined CONFIG_MNATIVE
19 #define MODULE_PROC_FAMILY "NATIVE "
20 #elif defined CONFIG_MNEHALEM
21 #define MODULE_PROC_FAMILY "NEHALEM "
22 #elif defined CONFIG_MWESTMERE
23 #define MODULE_PROC_FAMILY "WESTMERE "
24 #elif defined CONFIG_MSILVERMONT
25 #define MODULE_PROC_FAMILY "SILVERMONT "
26 #elif defined CONFIG_MSANDYBRIDGE
27 #define MODULE_PROC_FAMILY "SANDYBRIDGE "
28 #elif defined CONFIG_MIVYBRIDGE
29 #define MODULE_PROC_FAMILY "IVYBRIDGE "
30 #elif defined CONFIG_MHASWELL
31 #define MODULE_PROC_FAMILY "HASWELL "
32 #elif defined CONFIG_MBROADWELL
33 #define MODULE_PROC_FAMILY "BROADWELL "
34 #elif defined CONFIG_MSKYLAKE
35 #define MODULE_PROC_FAMILY "SKYLAKE "
18 36 #elif defined CONFIG_MATOM
19 37 #define MODULE_PROC_FAMILY "ATOM "
20 38 #elif defined CONFIG_M686
51 51 #define MODULE_PROC_FAMILY "K7 "
52 52 #elif defined CONFIG_MK8
53 53 #define MODULE_PROC_FAMILY "K8 "
54 #elif defined CONFIG_MK8SSE3
55 #define MODULE_PROC_FAMILY "K8SSE3 "
56 #elif defined CONFIG_MK10
57 #define MODULE_PROC_FAMILY "K10 "
58 #elif defined CONFIG_MBARCELONA
59 #define MODULE_PROC_FAMILY "BARCELONA "
60 #elif defined CONFIG_MBOBCAT
61 #define MODULE_PROC_FAMILY "BOBCAT "
62 #elif defined CONFIG_MBULLDOZER
63 #define MODULE_PROC_FAMILY "BULLDOZER "
64 #elif defined CONFIG_MPILEDRIVER
65 #define MODULE_PROC_FAMILY "PILEDRIVER "
66 #elif defined CONFIG_MSTEAMROLLER
67 #define MODULE_PROC_FAMILY "STEAMROLLER "
68 #elif defined CONFIG_MJAGUAR
69 #define MODULE_PROC_FAMILY "JAGUAR "
70 #elif defined CONFIG_MEXCAVATOR
71 #define MODULE_PROC_FAMILY "EXCAVATOR "
72 #elif defined CONFIG_MZEN
73 #define MODULE_PROC_FAMILY "ZEN "
54 74 #elif defined CONFIG_MELAN
55 75 #define MODULE_PROC_FAMILY "ELAN "
56 76 #elif defined CONFIG_MCRUSOE
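Review note: each branch of the `#elif` chain above defines `MODULE_PROC_FAMILY` as a string literal with a trailing space, which suggests it is meant to be concatenated with neighbouring literals into a longer version string. The userspace sketch below illustrates that mechanism only. The `DEMO_VERMAGIC` macro, its "4.12.0 SMP mod_unload " prefix, and the forced `CONFIG_MSKYLAKE` definition are assumptions made for the demo and are not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Assumption for this demo: pretend the kernel was configured for Skylake. */
#define CONFIG_MSKYLAKE 1

/* Shortened copy of the selection chain; the first defined symbol wins. */
#if defined CONFIG_MCORE2
#define MODULE_PROC_FAMILY "CORE2 "
#elif defined CONFIG_MNATIVE
#define MODULE_PROC_FAMILY "NATIVE "
#elif defined CONFIG_MSKYLAKE
#define MODULE_PROC_FAMILY "SKYLAKE "
#elif defined CONFIG_MATOM
#define MODULE_PROC_FAMILY "ATOM "
#else
#error unknown processor family
#endif

/*
 * Hypothetical version prefix. Adjacent string literals are merged by the
 * compiler, so the trailing space in MODULE_PROC_FAMILY acts as a separator.
 */
#define DEMO_VERMAGIC "4.12.0 SMP mod_unload " MODULE_PROC_FAMILY

/* Return the composed string so a caller can compare or print it. */
const char *demo_vermagic(void)
{
	return DEMO_VERMAGIC;
}
```

Because the family name becomes part of the string, a module built with a different processor option would carry a different string than the one composed here.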
39 39 ---help---
40 40 Enable group IO scheduling in CFQ.
41 41
42 config IOSCHED_BFQ_SQ
43 tristate "BFQ-SQ I/O scheduler"
44 default n
45 ---help---
46 The BFQ-SQ I/O scheduler (for legacy blk: SQ stands for
47 SingleQueue) distributes bandwidth among all processes
48 according to their weights, regardless of the device
49 parameters and with any workload. It also guarantees a low
50 latency to interactive and soft real-time applications.
51 Details in Documentation/block/bfq-iosched.txt
52
53 config BFQ_SQ_GROUP_IOSCHED
54 bool "BFQ-SQ hierarchical scheduling support"
55 depends on IOSCHED_BFQ_SQ && BLK_CGROUP
56 default n
57 ---help---
58
59 Enable hierarchical scheduling in BFQ-SQ, using the blkio
60 (cgroups-v1) or io (cgroups-v2) controller.
61
42 62 choice
43 63
44 64 prompt "Default I/O scheduler"
73 73 config DEFAULT_CFQ
74 74 bool "CFQ" if IOSCHED_CFQ=y
75 75
76 config DEFAULT_BFQ_SQ
77 bool "BFQ-SQ" if IOSCHED_BFQ_SQ=y
78 help
79 Selects BFQ-SQ as the default I/O scheduler which will be
80 used by default for all block devices.
81 The BFQ-SQ I/O scheduler aims at distributing the bandwidth
82 as desired, independently of the disk parameters and with
83 any workload. It also tries to guarantee low latency to
84 interactive and soft real-time applications.
85
76 86 config DEFAULT_NOOP
77 87 bool "No-op"
78 88
92 92 string
93 93 default "deadline" if DEFAULT_DEADLINE
94 94 default "cfq" if DEFAULT_CFQ
95 default "bfq-sq" if DEFAULT_BFQ_SQ
95 96 default "noop" if DEFAULT_NOOP
97
98 config MQ_IOSCHED_BFQ
99 tristate "BFQ-MQ I/O Scheduler"
100 default y
101 ---help---
102 BFQ I/O scheduler for BLK-MQ. BFQ-MQ distributes bandwidth
103 among all processes according to their weights, regardless of
104 the device parameters and with any workload. It also
105 guarantees a low latency to interactive and soft real-time
106 applications. Details in Documentation/block/bfq-iosched.txt
107
108 config MQ_BFQ_GROUP_IOSCHED
109 bool "BFQ-MQ hierarchical scheduling support"
110 depends on MQ_IOSCHED_BFQ && BLK_CGROUP
111 default n
112 ---help---
113
114 Enable hierarchical scheduling in BFQ-MQ, using the blkio
115 (cgroups-v1) or io (cgroups-v2) controller.
96 116
97 117 config MQ_IOSCHED_DEADLINE
98 118 tristate "MQ deadline I/O scheduler"
23 23 obj-$(CONFIG_MQ_IOSCHED_KYBER) += kyber-iosched.o
24 24 bfq-y := bfq-iosched.o bfq-wf2q.o bfq-cgroup.o
25 25 obj-$(CONFIG_IOSCHED_BFQ) += bfq.o
26 obj-$(CONFIG_IOSCHED_BFQ_SQ) += bfq-sq-iosched.o
27 obj-$(CONFIG_MQ_IOSCHED_BFQ) += bfq-mq-iosched.o
26 28
27 29 obj-$(CONFIG_BLOCK_COMPAT) += compat_ioctl.o
28 30 obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o
1 /*
2 * BFQ: CGROUPS support.
3 *
4 * Based on ideas and code from CFQ:
5 * Copyright (C) 2003 Jens Axboe <axboe@kernel.dk>
6 *
7 * Copyright (C) 2008 Fabio Checconi <fabio@gandalf.sssup.it>
8 * Paolo Valente <paolo.valente@unimore.it>
9 *
10 * Copyright (C) 2015 Paolo Valente <paolo.valente@unimore.it>
11 *
12 * Copyright (C) 2016 Paolo Valente <paolo.valente@linaro.org>
13 *
14 * Licensed under the GPL-2 as detailed in the accompanying COPYING.BFQ
15 * file.
16 */
17
18 #ifdef BFQ_GROUP_IOSCHED_ENABLED
19
20 /* bfqg stats flags */
21 enum bfqg_stats_flags {
22 BFQG_stats_waiting = 0,
23 BFQG_stats_idling,
24 BFQG_stats_empty,
25 };
26
27 #define BFQG_FLAG_FNS(name) \
28 static void bfqg_stats_mark_##name(struct bfqg_stats *stats) \
29 { \
30 stats->flags |= (1 << BFQG_stats_##name); \
31 } \
32 static void bfqg_stats_clear_##name(struct bfqg_stats *stats) \
33 { \
34 stats->flags &= ~(1 << BFQG_stats_##name); \
35 } \
36 static int bfqg_stats_##name(struct bfqg_stats *stats) \
37 { \
38 return (stats->flags & (1 << BFQG_stats_##name)) != 0; \
39 } \
40
41 BFQG_FLAG_FNS(waiting)
42 BFQG_FLAG_FNS(idling)
43 BFQG_FLAG_FNS(empty)
44 #undef BFQG_FLAG_FNS
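Review note: `BFQG_FLAG_FNS(name)` uses token pasting to stamp out a mark, a clear, and a test helper for each bit in `enum bfqg_stats_flags`. The following userspace sketch reproduces the same pattern so it can be compiled and tried outside the kernel. All `demo_`/`DEMO_` names are hypothetical stand-ins, and `struct demo_stats` keeps only the flags word.

```c
#include <assert.h>

/* Hypothetical stand-in for struct bfqg_stats: only the flags word. */
struct demo_stats {
	unsigned int flags;
};

/* Bit positions, in the same order as enum bfqg_stats_flags. */
enum demo_stats_flags {
	DEMO_stats_waiting = 0,
	DEMO_stats_idling,
	DEMO_stats_empty,
};

/* Generate mark/clear/test helpers for one flag via token pasting. */
#define DEMO_FLAG_FNS(name) \
static void demo_stats_mark_##name(struct demo_stats *stats) \
{ \
	stats->flags |= (1u << DEMO_stats_##name); \
} \
static void demo_stats_clear_##name(struct demo_stats *stats) \
{ \
	stats->flags &= ~(1u << DEMO_stats_##name); \
} \
static int demo_stats_##name(const struct demo_stats *stats) \
{ \
	return (stats->flags & (1u << DEMO_stats_##name)) != 0; \
}

DEMO_FLAG_FNS(waiting)
DEMO_FLAG_FNS(idling)
DEMO_FLAG_FNS(empty)
#undef DEMO_FLAG_FNS

/* Exercise every generated helper once and return the final flags word. */
unsigned int demo_flags_roundtrip(void)
{
	struct demo_stats s = { 0 };

	demo_stats_mark_waiting(&s);
	demo_stats_mark_idling(&s);
	demo_stats_mark_empty(&s);
	assert(demo_stats_waiting(&s) && demo_stats_idling(&s) &&
	       demo_stats_empty(&s));

	demo_stats_clear_waiting(&s);
	demo_stats_clear_idling(&s);
	demo_stats_clear_empty(&s);
	assert(s.flags == 0);

	demo_stats_mark_idling(&s);
	return s.flags;	/* only the idling bit (bit 1) is set */
}
```

The sketch uses `1u` rather than `1` for the shifts to keep the arithmetic unsigned; that is a choice made here, not something the patch does.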
45
46 #ifdef BFQ_MQ
47 /* This should be called with the scheduler lock held. */
48 #else
49 /* This should be called with the queue_lock held. */
50 #endif
51 static void bfqg_stats_update_group_wait_time(struct bfqg_stats *stats)
52 {
53 unsigned long long now;
54
55 if (!bfqg_stats_waiting(stats))
56 return;
57
58 now = sched_clock();
59 if (time_after64(now, stats->start_group_wait_time))
60 blkg_stat_add(&stats->group_wait_time,
61 now - stats->start_group_wait_time);
62 bfqg_stats_clear_waiting(stats);
63 }
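Review note: the function above closes an open "waiting" interval. It returns early if the flag is not set, adds the elapsed time only when the clock has actually advanced, and clears the flag either way. A minimal userspace sketch of that accounting follows. `struct demo_wait` and the `demo_` functions are hypothetical, and `demo_time_after64()` is written here as a signed-difference comparison on the assumption that this matches the intent of the kernel's `time_after64()`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * True if a is later than b. The signed difference keeps the comparison
 * meaningful even if the 64-bit counter wraps around.
 */
static int demo_time_after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

struct demo_wait {
	int waiting;	/* stands in for the "waiting" flag bit */
	uint64_t start;	/* stands in for start_group_wait_time */
	uint64_t total;	/* stands in for the group_wait_time counter */
};

/* Close an open wait interval at time `now`, if there is one. */
void demo_update_wait_time(struct demo_wait *w, uint64_t now)
{
	if (!w->waiting)
		return;

	/* Skip the addition if the clock did not move forward. */
	if (demo_time_after64(now, w->start))
		w->total += now - w->start;
	w->waiting = 0;
}
```

Calling `demo_update_wait_time()` a second time without re-arming `waiting` leaves `total` unchanged, which mirrors the early return in the original.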
64
65 #ifdef BFQ_MQ
66 /* This should be called with the scheduler lock held. */
67 #else
68 /* This should be called with the queue_lock held. */
69 #endif
70 static void bfqg_stats_set_start_group_wait_time(struct bfq_group *bfqg,
71 struct bfq_group *curr_bfqg)
72 {
73 struct bfqg_stats *stats = &bfqg->stats;
74
75 if (bfqg_stats_waiting(stats))
76 return;
77 if (bfqg == curr_bfqg)
78 return;
79 stats->start_group_wait_time = sched_clock();
80 bfqg_stats_mark_waiting(stats);
81 }
82
83 #ifdef BFQ_MQ
84 /* This should be called with the scheduler lock held. */
85 #else
86 /* This should be called with the queue_lock held. */
87 #endif
88 static void bfqg_stats_end_empty_time(struct bfqg_stats *stats)
89 {
90 unsigned long long now;
91
92 if (!bfqg_stats_empty(stats))
93 return;
94
95 now = sched_clock();
96 if (time_after64(now, stats->start_empty_time))
97 blkg_stat_add(&stats->empty_time,
98 now - stats->start_empty_time);
99 bfqg_stats_clear_empty(stats);
100 }
101
102 static void bfqg_stats_update_dequeue(struct bfq_group *bfqg)
103 {
104 blkg_stat_add(&bfqg->stats.dequeue, 1);
105 }
106
107 static void bfqg_stats_set_start_empty_time(struct bfq_group *bfqg)
108 {
109 struct bfqg_stats *stats = &bfqg->stats;
110
111 if (blkg_rwstat_total(&stats->queued))
112 return;
113
114 /*
115 * group is already marked empty. This can happen if bfqq got new
116 * request in parent group and moved to this group while being added
117 * to service tree. Just ignore the event and move on.
118 */
119 if (bfqg_stats_empty(stats))
120 return;
121
122 stats->start_empty_time = sched_clock();
123 bfqg_stats_mark_empty(stats);
124 }
125
126 static void bfqg_stats_update_idle_time(struct bfq_group *bfqg)
127 {
128 struct bfqg_stats *stats = &bfqg->stats;
129
130 if (bfqg_stats_idling(stats)) {
131 unsigned long long now = sched_clock();
132
133 if (time_after64(now, stats->start_idle_time))
134 blkg_stat_add(&stats->idle_time,
135 now - stats->start_idle_time);
136 bfqg_stats_clear_idling(stats);
137 }
138 }
139
140 static void bfqg_stats_set_start_idle_time(struct bfq_group *bfqg)
141 {
142 struct bfqg_stats *stats = &bfqg->stats;
143
144 stats->start_idle_time = sched_clock();
145 bfqg_stats_mark_idling(stats);
146 }
147
148 static void bfqg_stats_update_avg_queue_size(struct bfq_group *bfqg)
149 {
150 struct bfqg_stats *stats = &bfqg->stats;
151
152 blkg_stat_add(&stats->avg_queue_size_sum,
153 blkg_rwstat_total(&stats->queued));
154 blkg_stat_add(&stats->avg_queue_size_samples, 1);
155 bfqg_stats_update_group_wait_time(stats);
156 }
157
158 static struct blkcg_policy blkcg_policy_bfq;
159
160 /*
161 * blk-cgroup policy-related handlers
162 * The following functions help in converting between blk-cgroup
163 * internal structures and BFQ-specific structures.
164 */
165
166 static struct bfq_group *pd_to_bfqg(struct blkg_policy_data *pd)
167 {
168 return pd ? container_of(pd, struct bfq_group, pd) : NULL;
169 }
170
171 static struct blkcg_gq *bfqg_to_blkg(struct bfq_group *bfqg)
172 {
173 return pd_to_blkg(&bfqg->pd);
174 }
175
176 static struct bfq_group *blkg_to_bfqg(struct blkcg_gq *blkg)
177 {
178 struct blkg_policy_data *pd = blkg_to_pd(blkg, &blkcg_policy_bfq);
179
180 return pd_to_bfqg(pd);
181 }
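Review note: the three converters above rely on `struct blkg_policy_data` being embedded inside `struct bfq_group`, so `container_of()` can step back from the member to the enclosing object. The sketch below shows the idea in userspace with a simplified `demo_container_of` built on `offsetof`. The struct layouts and all `demo_` names are hypothetical and much smaller than the real ones.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified userspace container_of: recover the enclosing struct from a
 * pointer to one of its members by subtracting the member's offset.
 */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_pd {		/* stands in for struct blkg_policy_data */
	int plid;
};

struct demo_group {		/* stands in for struct bfq_group */
	int weight;
	struct demo_pd pd;	/* embedded, not pointed to */
};

/* NULL-tolerant conversion, in the style of pd_to_bfqg(). */
struct demo_group *demo_pd_to_group(struct demo_pd *pd)
{
	return pd ? demo_container_of(pd, struct demo_group, pd) : NULL;
}
```

The NULL check matters because plain pointer arithmetic on a NULL member pointer would produce a bogus non-NULL group pointer.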
182
183 /*
184 * bfq_group handlers
185 * The following functions help in navigating the bfq_group hierarchy:
186 * they find the parent of a bfq_group, or the bfq_group associated
187 * with a bfq_queue.
188 */
189
190 static struct bfq_group *bfqg_parent(struct bfq_group *bfqg)
191 {
192 struct blkcg_gq *pblkg = bfqg_to_blkg(bfqg)->parent;
193
194 return pblkg ? blkg_to_bfqg(pblkg) : NULL;
195 }
196
197 static struct bfq_group *bfqq_group(struct bfq_queue *bfqq)
198 {
199 struct bfq_entity *group_entity = bfqq->entity.parent;
200
201 return group_entity ? container_of(group_entity, struct bfq_group,
202 entity) :
203 bfqq->bfqd->root_group;
204 }
205
206 /*
207 * The following two functions handle get and put of a bfq_group by
208 * wrapping the related blk-cgroup hooks.
209 */
210
211 static void bfqg_get(struct bfq_group *bfqg)
212 {
213 #ifdef BFQ_MQ
214 bfqg->ref++;
215 #else
216 blkg_get(bfqg_to_blkg(bfqg));
217 #endif
218 }
219
220 static void bfqg_put(struct bfq_group *bfqg)
221 {
222 #ifdef BFQ_MQ
223 bfqg->ref--;
224
225 BUG_ON(bfqg->ref < 0);
226 if (bfqg->ref == 0)
227 kfree(bfqg);
228 #else
229 blkg_put(bfqg_to_blkg(bfqg));
230 #endif
231 }
232
233 #ifdef BFQ_MQ
234 static void bfqg_and_blkg_get(struct bfq_group *bfqg)
235 {
236 /* see comments in bfq_bic_update_cgroup for why refcounting bfqg */
237 bfqg_get(bfqg);
238
239 blkg_get(bfqg_to_blkg(bfqg));
240 }
241
242 static void bfqg_and_blkg_put(struct bfq_group *bfqg)
243 {
244 blkg_put(bfqg_to_blkg(bfqg));
245
246 bfqg_put(bfqg);
247 }
248 #endif
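Review note: when two reference drops are paired and the second object is reached through the first, the pointer to the second object has to be obtained while the first is still guaranteed to be alive. The userspace sketch below illustrates that rule with a hand-rolled counter. The `rc_` types and functions are hypothetical, use plain `malloc`/`free`, and do not model kernel locking or RCU.

```c
#include <assert.h>
#include <stdlib.h>

struct rc_blkg {		/* stands in for struct blkcg_gq */
	int ref;
};

struct rc_group {		/* stands in for struct bfq_group */
	int ref;
	struct rc_blkg *blkg;
};

/* Drop one reference on the blkg stand-in (never freed in this demo). */
static void rc_blkg_put(struct rc_blkg *blkg)
{
	blkg->ref--;
}

/*
 * Drop one reference on the group and free it when the count reaches
 * zero. Returns 1 if the group was freed, 0 otherwise.
 */
static int rc_group_put(struct rc_group *g)
{
	assert(g->ref > 0);
	if (--g->ref == 0) {
		free(g);
		return 1;
	}
	return 0;
}

/* Read g->blkg first, then drop both references. */
int rc_group_and_blkg_put(struct rc_group *g)
{
	struct rc_blkg *blkg = g->blkg;	/* g is still alive here */
	int freed = rc_group_put(g);	/* g may be gone after this */

	rc_blkg_put(blkg);		/* does not dereference g */
	return freed;
}
```

Dropping the dependent reference before the one that may free its holder achieves the same safety without the local variable.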
249
250 static void bfqg_stats_update_io_add(struct bfq_group *bfqg,
251 struct bfq_queue *bfqq,
252 unsigned int op)
253 {
254 blkg_rwstat_add(&bfqg->stats.queued, op, 1);
255 bfqg_stats_end_empty_time(&bfqg->stats);
256 if (!(bfqq == ((struct bfq_data *)bfqg->bfqd)->in_service_queue))
257 bfqg_stats_set_start_group_wait_time(bfqg, bfqq_group(bfqq));
258 }
259
260 static void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op)
261 {
262 blkg_rwstat_add(&bfqg->stats.queued, op, -1);
263 }
264
265 static void bfqg_stats_update_io_merged(struct bfq_group *bfqg, unsigned int op)
266 {
267 blkg_rwstat_add(&bfqg->stats.merged, op, 1);
268 }
269
270 static void bfqg_stats_update_completion(struct bfq_group *bfqg,
271 uint64_t start_time, uint64_t io_start_time,
272 unsigned int op)
273 {
274 struct bfqg_stats *stats = &bfqg->stats;
275 unsigned long long now = sched_clock();
276
277 if (time_after64(now, io_start_time))
278 blkg_rwstat_add(&stats->service_time, op,
279 now - io_start_time);
280 if (time_after64(io_start_time, start_time))
281 blkg_rwstat_add(&stats->wait_time, op,
282 io_start_time - start_time);
283 }
284
285 /* @stats = 0 */
286 static void bfqg_stats_reset(struct bfqg_stats *stats)
287 {
288 /* queued stats shouldn't be cleared */
289 blkg_rwstat_reset(&stats->merged);
290 blkg_rwstat_reset(&stats->service_time);
291 blkg_rwstat_reset(&stats->wait_time);
292 blkg_stat_reset(&stats->time);
293 blkg_stat_reset(&stats->avg_queue_size_sum);
294 blkg_stat_reset(&stats->avg_queue_size_samples);
295 blkg_stat_reset(&stats->dequeue);
296 blkg_stat_reset(&stats->group_wait_time);
297 blkg_stat_reset(&stats->idle_time);
298 blkg_stat_reset(&stats->empty_time);
299 }
300
301 /* @to += @from */
302 static void bfqg_stats_add_aux(struct bfqg_stats *to, struct bfqg_stats *from)
303 {
304 if (!to || !from)
305 return;
306
307 /* queued stats shouldn't be transferred */
308 blkg_rwstat_add_aux(&to->merged, &from->merged);
309 blkg_rwstat_add_aux(&to->service_time, &from->service_time);
310 blkg_rwstat_add_aux(&to->wait_time, &from->wait_time);
311 blkg_stat_add_aux(&to->time, &from->time);
312 blkg_stat_add_aux(&to->avg_queue_size_sum, &from->avg_queue_size_sum);
313 blkg_stat_add_aux(&to->avg_queue_size_samples,
314 &from->avg_queue_size_samples);
315 blkg_stat_add_aux(&to->dequeue, &from->dequeue);
316 blkg_stat_add_aux(&to->group_wait_time, &from->group_wait_time);
317 blkg_stat_add_aux(&to->idle_time, &from->idle_time);
318 blkg_stat_add_aux(&to->empty_time, &from->empty_time);
319 }
320
321 /*
322 * Transfer @bfqg's stats to its parent's dead_stats so that the ancestors'
323 * recursive stats can still account for the amount used by this bfqg after
324 * it's gone.
325 */
326 static void bfqg_stats_xfer_dead(struct bfq_group *bfqg)
327 {
328 struct bfq_group *parent;
329
330 if (!bfqg) /* root_group */
331 return;
332
333 parent = bfqg_parent(bfqg);
334
335 lockdep_assert_held(bfqg_to_blkg(bfqg)->q->queue_lock);
336
337 if (unlikely(!parent))
338 return;
339
340 bfqg_stats_add_aux(&parent->stats, &bfqg->stats);
341 bfqg_stats_reset(&bfqg->stats);
342 }
343
344 static void bfq_init_entity(struct bfq_entity *entity,
345 struct bfq_group *bfqg)
346 {
347 struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
348
349 entity->weight = entity->new_weight;
350 entity->orig_weight = entity->new_weight;
351 if (bfqq) {
352 bfqq->ioprio = bfqq->new_ioprio;
353 bfqq->ioprio_class = bfqq->new_ioprio_class;
354 #ifdef BFQ_MQ
355 /*
356 * Make sure that bfqg and its associated blkg do not
357 * disappear before entity.
358 */
359 bfqg_and_blkg_get(bfqg);
360 #else
361 bfqg_get(bfqg);
362 #endif
363 }
364 entity->parent = bfqg->my_entity; /* NULL for root group */
365 entity->sched_data = &bfqg->sched_data;
366 }
367
368 static void bfqg_stats_exit(struct bfqg_stats *stats)
369 {
370 blkg_rwstat_exit(&stats->merged);
371 blkg_rwstat_exit(&stats->service_time);
372 blkg_rwstat_exit(&stats->wait_time);
373 blkg_rwstat_exit(&stats->queued);
374 blkg_stat_exit(&stats->time);
375 blkg_stat_exit(&stats->avg_queue_size_sum);
376 blkg_stat_exit(&stats->avg_queue_size_samples);
377 blkg_stat_exit(&stats->dequeue);
378 blkg_stat_exit(&stats->group_wait_time);
379 blkg_stat_exit(&stats->idle_time);
380 blkg_stat_exit(&stats->empty_time);
381 }
382
383 static int bfqg_stats_init(struct bfqg_stats *stats, gfp_t gfp)
384 {
385 if (blkg_rwstat_init(&stats->merged, gfp) ||
386 blkg_rwstat_init(&stats->service_time, gfp) ||
387 blkg_rwstat_init(&stats->wait_time, gfp) ||
388 blkg_rwstat_init(&stats->queued, gfp) ||
389 blkg_stat_init(&stats->time, gfp) ||
390 blkg_stat_init(&stats->avg_queue_size_sum, gfp) ||
391 blkg_stat_init(&stats->avg_queue_size_samples, gfp) ||
392 blkg_stat_init(&stats->dequeue, gfp) ||
393 blkg_stat_init(&stats->group_wait_time, gfp) ||
394 blkg_stat_init(&stats->idle_time, gfp) ||
395 blkg_stat_init(&stats->empty_time, gfp)) {
396 bfqg_stats_exit(stats);
397 return -ENOMEM;
398 }
399
400 return 0;
401 }
402
403 static struct bfq_group_data *cpd_to_bfqgd(struct blkcg_policy_data *cpd)
404 {
405 return cpd ? container_of(cpd, struct bfq_group_data, pd) : NULL;
406 }
407
408 static struct bfq_group_data *blkcg_to_bfqgd(struct blkcg *blkcg)
409 {
410 return cpd_to_bfqgd(blkcg_to_cpd(blkcg, &blkcg_policy_bfq));
411 }
412
413 static struct blkcg_policy_data *bfq_cpd_alloc(gfp_t gfp)
414 {
415 struct bfq_group_data *bgd;
416
417 bgd = kzalloc(sizeof(*bgd), gfp);
418 if (!bgd)
419 return NULL;
420 return &bgd->pd;
421 }
422
423 static void bfq_cpd_init(struct blkcg_policy_data *cpd)
424 {
425 struct bfq_group_data *d = cpd_to_bfqgd(cpd);
426
427 d->weight = cgroup_subsys_on_dfl(io_cgrp_subsys) ?
428 CGROUP_WEIGHT_DFL : BFQ_WEIGHT_LEGACY_DFL;
429 }
430
431 static void bfq_cpd_free(struct blkcg_policy_data *cpd)
432 {
433 kfree(cpd_to_bfqgd(cpd));
434 }
435
436 static struct blkg_policy_data *bfq_pd_alloc(gfp_t gfp, int node)
437 {
438 struct bfq_group *bfqg;
439
440 bfqg = kzalloc_node(sizeof(*bfqg), gfp, node);
441 if (!bfqg)
442 return NULL;
443
444 if (bfqg_stats_init(&bfqg->stats, gfp)) {
445 kfree(bfqg);
446 return NULL;
447 }
448
449 #ifdef BFQ_MQ
450 /* see comments in bfq_bic_update_cgroup for why refcounting */
451 bfqg_get(bfqg);
452 #endif
453 return &bfqg->pd;
454 }
455
456 static void bfq_pd_init(struct blkg_policy_data *pd)
457 {
458 struct blkcg_gq *blkg;
459 struct bfq_group *bfqg;
460 struct bfq_data *bfqd;
461 struct bfq_entity *entity;
462 struct bfq_group_data *d;
463
464 blkg = pd_to_blkg(pd);
465 BUG_ON(!blkg);
466 bfqg = blkg_to_bfqg(blkg);
467 bfqd = blkg->q->elevator->elevator_data;
468 BUG_ON(bfqg == bfqd->root_group);
469 entity = &bfqg->entity;
470 d = blkcg_to_bfqgd(blkg->blkcg);
471
472 entity->orig_weight = entity->weight = entity->new_weight = d->weight;
473 entity->my_sched_data = &bfqg->sched_data;
474 bfqg->my_entity = entity; /*
475 * the root_group's will be set to NULL
476 * in bfq_init_queue()
477 */
478 bfqg->bfqd = bfqd;
479 bfqg->active_entities = 0;
480 bfqg->rq_pos_tree = RB_ROOT;
481 }
482
483 static void bfq_pd_free(struct blkg_policy_data *pd)
484 {
485 struct bfq_group *bfqg = pd_to_bfqg(pd);
486
487 bfqg_stats_exit(&bfqg->stats);
488 #ifdef BFQ_MQ
489 bfqg_put(bfqg);
490 #else
491 kfree(bfqg);
492 #endif
493 }
494
495 static void bfq_pd_reset_stats(struct blkg_policy_data *pd)
496 {
497 struct bfq_group *bfqg = pd_to_bfqg(pd);
498
499 bfqg_stats_reset(&bfqg->stats);
500 }
501
502 static void bfq_group_set_parent(struct bfq_group *bfqg,
503 struct bfq_group *parent)
504 {
505 struct bfq_entity *entity;
506
507 BUG_ON(!parent);
508 BUG_ON(!bfqg);
509 BUG_ON(bfqg == parent);
510
511 entity = &bfqg->entity;
512 entity->parent = parent->my_entity;
513 entity->sched_data = &parent->sched_data;
514 }
515
516 static struct bfq_group *bfq_lookup_bfqg(struct bfq_data *bfqd,
517 struct blkcg *blkcg)
518 {
519 struct blkcg_gq *blkg;
520
521 blkg = blkg_lookup(blkcg, bfqd->queue);
522 if (likely(blkg))
523 return blkg_to_bfqg(blkg);
524 return NULL;
525 }
526
527 static struct bfq_group *bfq_find_set_group(struct bfq_data *bfqd,
528 struct blkcg *blkcg)
529 {
530 struct bfq_group *bfqg, *parent;
531 struct bfq_entity *entity;
532
533 bfqg = bfq_lookup_bfqg(bfqd, blkcg);
534
535 if (unlikely(!bfqg))
536 return NULL;
537
538 /*
539 * Update chain of bfq_groups as we might be handling a leaf group
540 * which, along with some of its relatives, has not been hooked yet
541 * to the private hierarchy of BFQ.
542 */
543 entity = &bfqg->entity;
544 for_each_entity(entity) {
545 bfqg = container_of(entity, struct bfq_group, entity);
546 BUG_ON(!bfqg);
547 if (bfqg != bfqd->root_group) {
548 parent = bfqg_parent(bfqg);
549 if (!parent)
550 parent = bfqd->root_group;
551 BUG_ON(!parent);
552 bfq_group_set_parent(bfqg, parent);
553 }
554 }
555
556 return bfqg;
557 }
558
559 static void bfq_pos_tree_add_move(struct bfq_data *bfqd,
560 struct bfq_queue *bfqq);
561
562 static void bfq_bfqq_expire(struct bfq_data *bfqd,
563 struct bfq_queue *bfqq,
564 bool compensate,
565 enum bfqq_expiration reason);
566
567 /**
568 * bfq_bfqq_move - migrate @bfqq to @bfqg.
569 * @bfqd: queue descriptor.
570 * @bfqq: the queue to move.
571 * @bfqg: the group to move to.
572 *
573 * Move @bfqq to @bfqg, deactivating it from its old group and reactivating
574 * it on the new one. Avoid putting the entity on the old group idle tree.
575 *
576 #ifdef BFQ_MQ
577 * Must be called under the scheduler lock, to make sure that the blkg
578 * owning @bfqg does not disappear (see comments in
579 * bfq_bic_update_cgroup on guaranteeing the consistency of blkg
580 * objects).
581 #else
582 * Must be called under the queue lock; the cgroup owning @bfqg must
583 * not disappear (by now this just means that we are called under
584 * rcu_read_lock()).
585 #endif
586 */
587 static void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
588 struct bfq_group *bfqg)
589 {
590 struct bfq_entity *entity = &bfqq->entity;
591
592 BUG_ON(!bfq_bfqq_busy(bfqq) && !RB_EMPTY_ROOT(&bfqq->sort_list));
593 BUG_ON(!RB_EMPTY_ROOT(&bfqq->sort_list) && !entity->on_st);
594 BUG_ON(bfq_bfqq_busy(bfqq) && RB_EMPTY_ROOT(&bfqq->sort_list)
595 && entity->on_st &&
596 bfqq != bfqd->in_service_queue);
597 BUG_ON(!bfq_bfqq_busy(bfqq) && bfqq == bfqd->in_service_queue);
598
599 /* If bfqq is empty, then bfq_bfqq_expire also invokes
600 * bfq_del_bfqq_busy, thereby removing bfqq and its entity
601 * from data structures related to current group. Otherwise we
602 * need to remove bfqq explicitly with bfq_deactivate_bfqq, as
603 * we do below.
604 */
605 if (bfqq == bfqd->in_service_queue)
606 bfq_bfqq_expire(bfqd, bfqd->in_service_queue,
607 false, BFQ_BFQQ_PREEMPTED);
608
609 BUG_ON(entity->on_st && !bfq_bfqq_busy(bfqq)
610 && &bfq_entity_service_tree(entity)->idle !=
611 entity->tree);
612
613 BUG_ON(RB_EMPTY_ROOT(&bfqq->sort_list) && bfq_bfqq_busy(bfqq));
614
615 if (bfq_bfqq_busy(bfqq))
616 bfq_deactivate_bfqq(bfqd, bfqq, false, false);
617 else if (entity->on_st) {
618 BUG_ON(&bfq_entity_service_tree(entity)->idle !=
619 entity->tree);
620 bfq_put_idle_entity(bfq_entity_service_tree(entity), entity);
621 }
622 #ifdef BFQ_MQ
623 bfqg_and_blkg_put(bfqq_group(bfqq));
624 #else
625 bfqg_put(bfqq_group(bfqq));
626 #endif
627
628 entity->parent = bfqg->my_entity;
629 entity->sched_data = &bfqg->sched_data;
630 #ifdef BFQ_MQ
631 /* pin down bfqg and its associated blkg */
632 bfqg_and_blkg_get(bfqg);
633 #else
634 bfqg_get(bfqg);
635 #endif
636
637 BUG_ON(RB_EMPTY_ROOT(&bfqq->sort_list) && bfq_bfqq_busy(bfqq));
638 if (bfq_bfqq_busy(bfqq)) {
639 bfq_pos_tree_add_move(bfqd, bfqq);
640 bfq_activate_bfqq(bfqd, bfqq);
641 }
642
643 if (!bfqd->in_service_queue && !bfqd->rq_in_driver)
644 bfq_schedule_dispatch(bfqd);
645 BUG_ON(entity->on_st && !bfq_bfqq_busy(bfqq)
646 && &bfq_entity_service_tree(entity)->idle !=
647 entity->tree);
648 }
649
650 /**
651 * __bfq_bic_change_cgroup - move @bic to @blkcg.
652 * @bfqd: the queue descriptor.
653 * @bic: the bic to move.
654 * @blkcg: the blk-cgroup to move to.
655 *
656 #ifdef BFQ_MQ
657 * Move bic to blkcg, assuming that bfqd->lock is held; which makes
658 * sure that the reference to cgroup is valid across the call (see
659 * comments in bfq_bic_update_cgroup on this issue)
660 #else
661 * Move bic to blkcg, assuming that bfqd->queue is locked; the caller
662 * has to make sure that the reference to cgroup is valid across the call.
663 #endif
664 *
665 * NOTE: an alternative approach might have been to store the current
666 * cgroup in bfqq and getting a reference to it, reducing the lookup
667 * time here, at the price of slightly more complex code.
668 */
669 static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
670 struct bfq_io_cq *bic,
671 struct blkcg *blkcg)
672 {
673 struct bfq_queue *async_bfqq = bic_to_bfqq(bic, 0);
674 struct bfq_queue *sync_bfqq = bic_to_bfqq(bic, 1);
675 struct bfq_group *bfqg;
676 struct bfq_entity *entity;
677
678 bfqg = bfq_find_set_group(bfqd, blkcg);
679
680 if (unlikely(!bfqg))
681 bfqg = bfqd->root_group;
682
683 if (async_bfqq) {
684 entity = &async_bfqq->entity;
685
686 if (entity->sched_data != &bfqg->sched_data) {
687 bic_set_bfqq(bic, NULL, 0);
688 bfq_log_bfqq(bfqd, async_bfqq,
689 "bic_change_group: %p %d",
690 async_bfqq,
691 async_bfqq->ref);
692 bfq_put_queue(async_bfqq);
693 }
694 }
695
696 if (sync_bfqq) {
697 entity = &sync_bfqq->entity;
698 if (entity->sched_data != &bfqg->sched_data)
699 bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
700 }
701
702 return bfqg;
703 }
704
705 static void bfq_bic_update_cgroup(struct bfq_io_cq *bic, struct bio *bio)
706 {
707 struct bfq_data *bfqd = bic_to_bfqd(bic);
708 struct bfq_group *bfqg = NULL;
709 uint64_t serial_nr;
710
711 rcu_read_lock();
712 serial_nr = bio_blkcg(bio)->css.serial_nr;
713
714 /*
715 * Check whether blkcg has changed. The condition may trigger
716 * spuriously on a newly created bic but there's no harm.
717 */
718 if (unlikely(!bfqd) || likely(bic->blkcg_serial_nr == serial_nr))
719 goto out;
720
721 bfqg = __bfq_bic_change_cgroup(bfqd, bic, bio_blkcg(bio));
722 #ifdef BFQ_MQ
723 /*
724 * Update blkg_path for bfq_log_* functions. We cache this
725 * path, and update it here, for the following
726 * reasons. Operations on blkg objects in blk-cgroup are
727 * protected with the request_queue lock, and not with the
728 * lock that protects the instances of this scheduler
729 * (bfqd->lock). This exposes BFQ to the following sort of
730 * race.
731 *
732 * The blkg_lookup performed in bfq_get_queue, protected
733 * through rcu, may happen to return the address of a copy of
734 * the original blkg. If this is the case, then the
735 * bfqg_and_blkg_get performed in bfq_get_queue, to pin down
736 * the blkg, is useless: it does not prevent blk-cgroup code
737 * from destroying both the original blkg and all objects
738 * directly or indirectly referred by the copy of the
739 * blkg.
740 *
741 * On the bright side, destroy operations on a blkg invoke, as
742 * a first step, hooks of the scheduler associated with the
743 * blkg. And these hooks are executed with bfqd->lock held for
744 * BFQ. As a consequence, for any blkg associated with the
745 * request queue this instance of the scheduler is attached
746 * to, we are guaranteed that such a blkg is not destroyed, and
747 * that all the pointers it contains are consistent, while we
748 * are holding bfqd->lock. A blkg_lookup performed with
749 * bfqd->lock held then returns a fully consistent blkg, which
750 * remains consistent as long as this lock is held.
751 *
752 * Thanks to the last fact, and to the fact that: (1) bfqg has
753 * been obtained through a blkg_lookup in the above
754 * assignment, and (2) bfqd->lock is being held, here we can
755 * safely use the policy data for the involved blkg (i.e., the
756 * field bfqg->pd) to get to the blkg associated with bfqg,
757 * and then we can safely use any field of blkg. After we
758 * release bfqd->lock, even just getting blkg through this
759 * bfqg may cause dangling references to be traversed, as
760 * bfqg->pd may not exist any more.
761 *
762 * In view of the above facts, here we cache, in the bfqg, any
763 * blkg data we may need for this bic, and for its associated
764 * bfq_queue. As of now, we need to cache only the path of the
765 * blkg, which is used in the bfq_log_* functions.
766 *
767 * Finally, note that bfqg itself needs to be protected from
768 * destruction on the blkg_free of the original blkg (which
769 * invokes bfq_pd_free). We use an additional private
770 * refcounter for bfqg, to let it disappear only after no
771 * bfq_queue refers to it any longer.
772 */
773 blkg_path(bfqg_to_blkg(bfqg), bfqg->blkg_path, sizeof(bfqg->blkg_path));
774 #endif
775 bic->blkcg_serial_nr = serial_nr;
776 out:
777 rcu_read_unlock();
778 }
779
780 /**
781 * bfq_flush_idle_tree - deactivate any entity on the idle tree of @st.
782 * @st: the service tree being flushed.
783 */
784 static void bfq_flush_idle_tree(struct bfq_service_tree *st)
785 {
786 struct bfq_entity *entity = st->first_idle;
787
788 for (; entity ; entity = st->first_idle)
789 __bfq_deactivate_entity(entity, false);
790 }
791
792 /**
793 * bfq_reparent_leaf_entity - move leaf entity to the root_group.
794 * @bfqd: the device data structure with the root group.
795 * @entity: the entity to move.
796 */
797 static void bfq_reparent_leaf_entity(struct bfq_data *bfqd,
798 struct bfq_entity *entity)
799 {
800 struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
801
802 BUG_ON(!bfqq);
803 bfq_bfqq_move(bfqd, bfqq, bfqd->root_group);
804 }
805
806 /**
807 * bfq_reparent_active_entities - move to the root group all active
808 * entities.
809 * @bfqd: the device data structure with the root group.
810 * @bfqg: the group to move from.
811 * @st: the service tree with the entities.
812 */
813 static void bfq_reparent_active_entities(struct bfq_data *bfqd,
814 struct bfq_group *bfqg,
815 struct bfq_service_tree *st)
816 {
817 struct rb_root *active = &st->active;
818 struct bfq_entity *entity = NULL;
819
820 if (!RB_EMPTY_ROOT(&st->active))
821 entity = bfq_entity_of(rb_first(active));
822
823 for (; entity ; entity = bfq_entity_of(rb_first(active)))
824 bfq_reparent_leaf_entity(bfqd, entity);
825
826 if (bfqg->sched_data.in_service_entity)
827 bfq_reparent_leaf_entity(bfqd,
828 bfqg->sched_data.in_service_entity);
829 }
830
831 /**
832 * bfq_pd_offline - deactivate the entity associated with @pd,
833 * and reparent its children entities.
834 * @pd: descriptor of the policy going offline.
835 *
836 * blkio already grabs the queue_lock for us, so no need to use
837 * RCU-based magic
838 */
839 static void bfq_pd_offline(struct blkg_policy_data *pd)
840 {
841 struct bfq_service_tree *st;
842 struct bfq_group *bfqg;
843 struct bfq_data *bfqd;
844 struct bfq_entity *entity;
845 #ifdef BFQ_MQ
846 unsigned long flags;
847 #endif
848 int i;
849
850 BUG_ON(!pd);
851 bfqg = pd_to_bfqg(pd);
852 BUG_ON(!bfqg);
853 bfqd = bfqg->bfqd;
854 BUG_ON(bfqd && !bfqd->root_group);
855
856 entity = bfqg->my_entity;
857
858 if (!entity) /* root group */
859 return;
860
861 #ifdef BFQ_MQ
862 spin_lock_irqsave(&bfqd->lock, flags);
863 #endif
864
865 /*
866 * Empty all service_trees belonging to this group before
867 * deactivating the group itself.
868 */
869 for (i = 0; i < BFQ_IOPRIO_CLASSES; i++) {
870 BUG_ON(!bfqg->sched_data.service_tree);
871 st = bfqg->sched_data.service_tree + i;
872 /*
873 * The idle tree may still contain bfq_queues belonging
874 * to exited tasks because they never migrated to a different
875 * cgroup from the one being destroyed now.
876 */
877 bfq_flush_idle_tree(st);
878
879 /*
880 * It may happen that some queues are still active
881 * (busy) upon group destruction (if the corresponding
882 * processes have been forced to terminate). We move
883 * all the leaf entities corresponding to these queues
884 * to the root_group.
885 * Also, it may happen that the group has an entity
886 * in service, which is disconnected from the active
887 * tree: it must be moved, too.
888 * There is no need to put the sync queues, as the
889 * scheduler has taken no reference.
890 */
891 bfq_reparent_active_entities(bfqd, bfqg, st);
892 BUG_ON(!RB_EMPTY_ROOT(&st->active));
893 BUG_ON(!RB_EMPTY_ROOT(&st->idle));
894 }
895 BUG_ON(bfqg->sched_data.next_in_service);
896 BUG_ON(bfqg->sched_data.in_service_entity);
897
898 __bfq_deactivate_entity(entity, false);
899 bfq_put_async_queues(bfqd, bfqg);
900
901 #ifdef BFQ_MQ
902 spin_unlock_irqrestore(&bfqd->lock, flags);
903 #endif
904 /*
905 * @blkg is going offline and will be ignored by
906 * blkg_[rw]stat_recursive_sum(). Transfer stats to the parent so
907 * that they don't get lost. If IOs complete after this point, the
908 * stats for them will be lost. Oh well...
909 */
910 bfqg_stats_xfer_dead(bfqg);
911 }
912
913 static void bfq_end_wr_async(struct bfq_data *bfqd)
914 {
915 struct blkcg_gq *blkg;
916
917 list_for_each_entry(blkg, &bfqd->queue->blkg_list, q_node) {
918 struct bfq_group *bfqg = blkg_to_bfqg(blkg);
919 BUG_ON(!bfqg);
920
921 bfq_end_wr_async_queues(bfqd, bfqg);
922 }
923 bfq_end_wr_async_queues(bfqd, bfqd->root_group);
924 }
925
926 static int bfq_io_show_weight(struct seq_file *sf, void *v)
927 {
928 struct blkcg *blkcg = css_to_blkcg(seq_css(sf));
929 struct bfq_group_data *bfqgd = blkcg_to_bfqgd(blkcg);
930 unsigned int val = 0;
931
932 if (bfqgd)
933 val = bfqgd->weight;
934
935 seq_printf(sf, "%u\n", val);
936
937 return 0;
938 }
939
940 static int bfq_io_set_weight_legacy(struct cgroup_subsys_state *css,
941 struct cftype *cftype,
942 u64 val)
943 {
944 struct blkcg *blkcg = css_to_blkcg(css);
945 struct bfq_group_data *bfqgd = blkcg_to_bfqgd(blkcg);
946 struct blkcg_gq *blkg;
947 int ret = -ERANGE;
948
949 if (val < BFQ_MIN_WEIGHT || val > BFQ_MAX_WEIGHT)
950 return ret;
951
952 ret = 0;
953 spin_lock_irq(&blkcg->lock);
954 bfqgd->weight = (unsigned short)val;
955 hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) {
956 struct bfq_group *bfqg = blkg_to_bfqg(blkg);
957
958 if (!bfqg)
959 continue;
960 /*
961 * Setting the prio_changed flag of the entity
962 * to 1 with new_weight == weight would re-set
963 * the value of the weight to its ioprio mapping.
964 * Set the flag only if necessary.
965 */
966 if ((unsigned short)val != bfqg->entity.new_weight) {
967 bfqg->entity.new_weight = (unsigned short)val;
968 /*
969 * Make sure that the above new value has been
970 * stored in bfqg->entity.new_weight before
971 * setting the prio_changed flag. In fact,
972 * this flag may be read asynchronously (in
973 * critical sections protected by a different
974 * lock than that held here), and finding this
975 * flag set may cause the execution of the code
976 * for updating parameters whose value may
977 * depend also on bfqg->entity.new_weight (in
978 * __bfq_entity_update_weight_prio).
979 * This barrier makes sure that the new value
980 * of bfqg->entity.new_weight is correctly
981 * seen in that code.
982 */
983 smp_wmb();
984 bfqg->entity.prio_changed = 1;
985 }
986 }
987 spin_unlock_irq(&blkcg->lock);
988
989 return ret;
990 }
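The ordering requirement described in the comment above (store the new weight first, then raise the flag) can be illustrated in user space. The following is a minimal sketch, not the kernel code: it swaps smp_wmb() for C11 release/acquire atomics, and struct demo_entity, demo_init, demo_set_weight and demo_consume_weight are hypothetical names introduced only for illustration. It assumes at most one pending update at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two bfq_entity fields discussed above. */
struct demo_entity {
	_Atomic unsigned short new_weight; /* payload, written before the flag */
	atomic_int prio_changed;           /* flag, read asynchronously */
};

static void demo_init(struct demo_entity *e, unsigned short weight)
{
	atomic_init(&e->new_weight, weight);
	atomic_init(&e->prio_changed, 0);
}

/* Writer: store the payload, then publish the flag with release ordering. */
static void demo_set_weight(struct demo_entity *e, unsigned short val)
{
	/* Set the flag only if the weight actually changes. */
	if (val == atomic_load_explicit(&e->new_weight, memory_order_relaxed))
		return;
	atomic_store_explicit(&e->new_weight, val, memory_order_relaxed);
	/* Release store: the weight store above is visible before the flag. */
	atomic_store_explicit(&e->prio_changed, 1, memory_order_release);
}

/* Reader: returns true and fills *out if a weight change was pending. */
static bool demo_consume_weight(struct demo_entity *e, unsigned short *out)
{
	/* Acquire load pairs with the release store in demo_set_weight(). */
	if (!atomic_load_explicit(&e->prio_changed, memory_order_acquire))
		return false;
	*out = atomic_load_explicit(&e->new_weight, memory_order_relaxed);
	atomic_store_explicit(&e->prio_changed, 0, memory_order_relaxed);
	return true;
}
```

A reader that observes the flag set is guaranteed to observe the weight stored before it, which is the property the barrier in bfq_io_set_weight_legacy is there to provide.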
991
992 static ssize_t bfq_io_set_weight(struct kernfs_open_file *of,
993 char *buf, size_t nbytes,
994 loff_t off)
995 {
996 u64 weight;
997 /* The first unsigned 64-bit value found in the file is used */
998 int ret = kstrtoull(strim(buf), 0, &weight);
999
1000 if (ret)
1001 return ret;
1002
1003 return bfq_io_set_weight_legacy(of_css(of), NULL, weight);
1004 }
1005
1006 static int bfqg_print_stat(struct seq_file *sf, void *v)
1007 {
1008 blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_stat,
1009 &blkcg_policy_bfq, seq_cft(sf)->private, false);
1010 return 0;
1011 }
1012
1013 static int bfqg_print_rwstat(struct seq_file *sf, void *v)
1014 {
1015 blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_rwstat,
1016 &blkcg_policy_bfq, seq_cft(sf)->private, true);
1017 return 0;
1018 }
1019
1020 static u64 bfqg_prfill_stat_recursive(struct seq_file *sf,
1021 struct blkg_policy_data *pd, int off)
1022 {
1023 u64 sum = blkg_stat_recursive_sum(pd_to_blkg(pd),
1024 &blkcg_policy_bfq, off);
1025 return __blkg_prfill_u64(sf, pd, sum);
1026 }
1027
1028 static u64 bfqg_prfill_rwstat_recursive(struct seq_file *sf,
1029 struct blkg_policy_data *pd, int off)
1030 {
1031 struct blkg_rwstat sum = blkg_rwstat_recursive_sum(pd_to_blkg(pd),
1032 &blkcg_policy_bfq,
1033 off);
1034 return __blkg_prfill_rwstat(sf, pd, &sum);
1035 }
1036
1037 static int bfqg_print_stat_recursive(struct seq_file *sf, void *v)
1038 {
1039 blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
1040 bfqg_prfill_stat_recursive, &blkcg_policy_bfq,
1041 seq_cft(sf)->private, false);
1042 return 0;
1043 }
1044
1045 static int bfqg_print_rwstat_recursive(struct seq_file *sf, void *v)
1046 {
1047 blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
1048 bfqg_prfill_rwstat_recursive, &blkcg_policy_bfq,
1049 seq_cft(sf)->private, true);
1050 return 0;
1051 }
1052
1053 static u64 bfqg_prfill_sectors(struct seq_file *sf, struct blkg_policy_data *pd,
1054 int off)
1055 {
1056 u64 sum = blkg_rwstat_total(&pd->blkg->stat_bytes);
1057
1058 return __blkg_prfill_u64(sf, pd, sum >> 9);
1059 }
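The `sum >> 9` in bfqg_prfill_sectors converts a byte count into 512-byte sectors. A minimal stand-alone sketch of that conversion follows, assuming 512-byte sectors; DEMO_SECTOR_SHIFT and demo_bytes_to_sectors are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a sector is 512 bytes, i.e. 1 << 9. */
#define DEMO_SECTOR_SHIFT 9

/* Convert a byte count to whole sectors; any partial sector is dropped. */
static uint64_t demo_bytes_to_sectors(uint64_t bytes)
{
	return bytes >> DEMO_SECTOR_SHIFT;
}
```

The shift truncates, so byte counts that are not a multiple of 512 are rounded down.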
1060
1061 static int bfqg_print_stat_sectors(struct seq_file *sf, void *v)
1062 {
1063 blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
1064 bfqg_prfill_sectors, &blkcg_policy_bfq, 0, false);
1065 return 0;
1066 }
1067
1068 static u64 bfqg_prfill_sectors_recursive(struct seq_file *sf,
1069 struct blkg_policy_data *pd, int off)
1070 {
1071 struct blkg_rwstat tmp = blkg_rwstat_recursive_sum(pd->blkg, NULL,
1072 offsetof(struct blkcg_gq, stat_bytes));
1073 u64 sum = atomic64_read(&tmp.aux_cnt[BLKG_RWSTAT_READ]) +
1074 atomic64_read(&tmp.aux_cnt[BLKG_RWSTAT_WRITE]);
1075
1076 return __blkg_prfill_u64(sf, pd, sum >> 9);
1077 }
1078
1079 static int bfqg_print_stat_sectors_recursive(struct seq_file *sf, void *v)
1080 {
1081 blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
1082 bfqg_prfill_sectors_recursive, &blkcg_policy_bfq, 0,
1083 false);
1084 return 0;
1085 }
1086
1087
1088 static u64 bfqg_prfill_avg_queue_size(struct seq_file *sf,
1089 struct blkg_policy_data *pd, int off)
1090 {
1091 struct bfq_group *bfqg = pd_to_bfqg(pd);
1092 u64 samples = blkg_stat_read(&bfqg->stats.avg_queue_size_samples);
1093 u64 v = 0;
1094
1095 if (samples) {
1096 v = blkg_stat_read(&bfqg->stats.avg_queue_size_sum);
1097 v = div64_u64(v, samples);
1098 }
1099 __blkg_prfill_u64(sf, pd, v);
1100 return 0;
1101 }
1102
1103 /* print avg_queue_size */
1104 static int bfqg_print_avg_queue_size(struct seq_file *sf, void *v)
1105 {
1106 blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
1107 bfqg_prfill_avg_queue_size, &blkcg_policy_bfq,
1108 0, false);
1109 return 0;
1110 }
1111
1112 static struct bfq_group *
1113 bfq_create_group_hierarchy(struct bfq_data *bfqd, int node)
1114 {
1115 int ret;
1116
1117 ret = blkcg_activate_policy(bfqd->queue, &blkcg_policy_bfq);
1118 if (ret)
1119 return NULL;
1120
1121 return blkg_to_bfqg(bfqd->queue->root_blkg);
1122 }
1123
1124 #ifdef BFQ_MQ
1125 #define BFQ_CGROUP_FNAME(param) "bfq-mq."#param
1126 #else
1127 #define BFQ_CGROUP_FNAME(param) "bfq-sq."#param
1128 #endif
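BFQ_CGROUP_FNAME relies on two preprocessor features: `#param` turns the macro argument into a string literal, and adjacent string literals are concatenated at compile time. The sketch below reproduces the pattern outside the kernel; DEMO_MQ, DEMO_CGROUP_FNAME and demo_weight_file are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical switch playing the role of BFQ_MQ. */
#define DEMO_MQ

#ifdef DEMO_MQ
/* "bfq-mq." followed by the stringified argument, joined at compile time. */
#define DEMO_CGROUP_FNAME(param) "bfq-mq."#param
#else
#define DEMO_CGROUP_FNAME(param) "bfq-sq."#param
#endif

/* Returns the file name that a "weight" entry would get. */
static const char *demo_weight_file(void)
{
	return DEMO_CGROUP_FNAME(weight);
}
```

Because the prefix is chosen by the preprocessor, the same table of entries yields differently named files depending on which variant is built.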
1129
1130 static struct cftype bfq_blkcg_legacy_files[] = {
1131 {
1132 .name = BFQ_CGROUP_FNAME(weight),
1133 .flags = CFTYPE_NOT_ON_ROOT,
1134 .seq_show = bfq_io_show_weight,
1135 .write_u64 = bfq_io_set_weight_legacy,
1136 },
1137
1138 /* statistics, covers only the tasks in the bfqg */
1139 {
1140 .name = BFQ_CGROUP_FNAME(time),
1141 .private = offsetof(struct bfq_group, stats.time),
1142 .seq_show = bfqg_print_stat,
1143 },
1144 {
1145 .name = BFQ_CGROUP_FNAME(sectors),
1146 .seq_show = bfqg_print_stat_sectors,
1147 },
1148 {
1149 .name = BFQ_CGROUP_FNAME(io_service_bytes),
1150 .private = (unsigned long)&blkcg_policy_bfq,
1151 .seq_show = blkg_print_stat_bytes,
1152 },
1153 {
1154 .name = BFQ_CGROUP_FNAME(io_serviced),
1155 .private = (unsigned long)&blkcg_policy_bfq,
1156 .seq_show = blkg_print_stat_ios,
1157 },
1158 {
1159 .name = BFQ_CGROUP_FNAME(io_service_time),
1160 .private = offsetof(struct bfq_group, stats.service_time),
1161 .seq_show = bfqg_print_rwstat,
1162 },
1163 {
1164 .name = BFQ_CGROUP_FNAME(io_wait_time),
1165 .private = offsetof(struct bfq_group, stats.wait_time),
1166 .seq_show = bfqg_print_rwstat,
1167 },
1168 {
1169 .name = BFQ_CGROUP_FNAME(io_merged),
1170 .private = offsetof(struct bfq_group, stats.merged),
1171 .seq_show = bfqg_print_rwstat,
1172 },
1173 {
1174 .name = BFQ_CGROUP_FNAME(io_queued),
1175 .private = offsetof(struct bfq_group, stats.queued),
1176 .seq_show = bfqg_print_rwstat,
1177 },
1178
1179 /* the same statistics, covering the bfqg and its descendants */
1180 {
1181 .name = BFQ_CGROUP_FNAME(time_recursive),
1182 .private = offsetof(struct bfq_group, stats.time),
1183 .seq_show = bfqg_print_stat_recursive,
1184 },
1185 {
1186 .name = BFQ_CGROUP_FNAME(sectors_recursive),
1187 .seq_show = bfqg_print_stat_sectors_recursive,
1188 },
1189 {
1190 .name = BFQ_CGROUP_FNAME(io_service_bytes_recursive),
1191 .private = (unsigned long)&blkcg_policy_bfq,
1192 .seq_show = blkg_print_stat_bytes_recursive,
1193 },
1194 {
1195 .name = BFQ_CGROUP_FNAME(io_serviced_recursive),
1196 .private = (unsigned long)&blkcg_policy_bfq,
1197 .seq_show = blkg_print_stat_ios_recursive,
1198 },
1199 {
1200 .name = BFQ_CGROUP_FNAME(io_service_time_recursive),
1201 .private = offsetof(struct bfq_group, stats.service_time),
1202 .seq_show = bfqg_print_rwstat_recursive,
1203 },
1204 {
1205 .name = BFQ_CGROUP_FNAME(io_wait_time_recursive),
1206 .private = offsetof(struct bfq_group, stats.wait_time),
1207 .seq_show = bfqg_print_rwstat_recursive,
1208 },
1209 {
1210 .name = BFQ_CGROUP_FNAME(io_merged_recursive),
1211 .private = offsetof(struct bfq_group, stats.merged),
1212 .seq_show = bfqg_print_rwstat_recursive,
1213 },
1214 {
1215 .name = BFQ_CGROUP_FNAME(io_queued_recursive),
1216 .private = offsetof(struct bfq_group, stats.queued),
1217 .seq_show = bfqg_print_rwstat_recursive,
1218 },
1219 {
1220 .name = BFQ_CGROUP_FNAME(avg_queue_size),
1221 .seq_show = bfqg_print_avg_queue_size,
1222 },
1223 {
1224 .name = BFQ_CGROUP_FNAME(group_wait_time),
1225 .private = offsetof(struct bfq_group, stats.group_wait_time),
1226 .seq_show = bfqg_print_stat,
1227 },
1228 {
1229 .name = BFQ_CGROUP_FNAME(idle_time),
1230 .private = offsetof(struct bfq_group, stats.idle_time),
1231 .seq_show = bfqg_print_stat,
1232 },
1233 {
1234 .name = BFQ_CGROUP_FNAME(empty_time),
1235 .private = offsetof(struct bfq_group, stats.empty_time),
1236 .seq_show = bfqg_print_stat,
1237 },
1238 {
1239 .name = BFQ_CGROUP_FNAME(dequeue),
1240 .private = offsetof(struct bfq_group, stats.dequeue),
1241 .seq_show = bfqg_print_stat,
1242 },
1243 { } /* terminate */
1244 };
1245
1246 static struct cftype bfq_blkg_files[] = {
1247 {
1248 .name = BFQ_CGROUP_FNAME(weight),
1249 .flags = CFTYPE_NOT_ON_ROOT,
1250 .seq_show = bfq_io_show_weight,
1251 .write = bfq_io_set_weight,
1252 },
1253 {} /* terminate */
1254 };
1255
1256 #undef BFQ_CGROUP_FNAME
1257
1258 #else /* BFQ_GROUP_IOSCHED_ENABLED */
1259
1260 static inline void bfqg_stats_update_io_add(struct bfq_group *bfqg,
1261 struct bfq_queue *bfqq, unsigned int op) { }
1262 static inline void
1263 bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op) { }
1264 static inline void
1265 bfqg_stats_update_io_merged(struct bfq_group *bfqg, unsigned int op) { }
1266 static inline void bfqg_stats_update_completion(struct bfq_group *bfqg,
1267 uint64_t start_time, uint64_t io_start_time,
1268 unsigned int op) { }
1269 static inline void
1270 bfqg_stats_set_start_group_wait_time(struct bfq_group *bfqg,
1271 struct bfq_group *curr_bfqg) { }
1272 static inline void bfqg_stats_end_empty_time(struct bfqg_stats *stats) { }
1273 static inline void bfqg_stats_update_dequeue(struct bfq_group *bfqg) { }
1274 static inline void bfqg_stats_set_start_empty_time(struct bfq_group *bfqg) { }
1275 static inline void bfqg_stats_update_idle_time(struct bfq_group *bfqg) { }
1276 static inline void bfqg_stats_set_start_idle_time(struct bfq_group *bfqg) { }
1277 static inline void bfqg_stats_update_avg_queue_size(struct bfq_group *bfqg) { }
1278
1279 static void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
1280 struct bfq_group *bfqg) {}
1281
1282 static void bfq_init_entity(struct bfq_entity *entity,
1283 struct bfq_group *bfqg)
1284 {
1285 struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
1286
1287 entity->weight = entity->new_weight;
1288 entity->orig_weight = entity->new_weight;
1289 if (bfqq) {
1290 bfqq->ioprio = bfqq->new_ioprio;
1291 bfqq->ioprio_class = bfqq->new_ioprio_class;
1292 }
1293 entity->sched_data = &bfqg->sched_data;
1294 }
1295
1296 static void bfq_bic_update_cgroup(struct bfq_io_cq *bic, struct bio *bio) {}
1297
1298 static void bfq_end_wr_async(struct bfq_data *bfqd)
1299 {
1300 bfq_end_wr_async_queues(bfqd, bfqd->root_group);
1301 }
1302
1303 static struct bfq_group *bfq_find_set_group(struct bfq_data *bfqd,
1304 struct blkcg *blkcg)
1305 {
1306 return bfqd->root_group;
1307 }
1308
1309 static struct bfq_group *bfqq_group(struct bfq_queue *bfqq)
1310 {
1311 return bfqq->bfqd->root_group;
1312 }
1313
1314 static struct bfq_group *
1315 bfq_create_group_hierarchy(struct bfq_data *bfqd, int node)
1316 {
1317 struct bfq_group *bfqg;
1318 int i;
1319
1320 bfqg = kmalloc_node(sizeof(*bfqg), GFP_KERNEL | __GFP_ZERO, node);
1321 if (!bfqg)
1322 return NULL;
1323
1324 for (i = 0; i < BFQ_IOPRIO_CLASSES; i++)
1325 bfqg->sched_data.service_tree[i] = BFQ_SERVICE_TREE_INIT;
1326
1327 return bfqg;
1328 }
1329 #endif
1 /*
2 * BFQ: I/O context handling.
3 *
4 * Based on ideas and code from CFQ:
5 * Copyright (C) 2003 Jens Axboe <axboe@kernel.dk>
6 *
7 * Copyright (C) 2008 Fabio Checconi <fabio@gandalf.sssup.it>
8 * Paolo Valente <paolo.valente@unimore.it>
9 *
10 * Copyright (C) 2010 Paolo Valente <paolo.valente@unimore.it>
11 */
12
13 /**
14 * icq_to_bic - convert iocontext queue structure to bfq_io_cq.
15 * @icq: the iocontext queue.
16 */
17 static struct bfq_io_cq *icq_to_bic(struct io_cq *icq)
18 {
19 /* bic->icq is the first member, %NULL will convert to %NULL */
20 return container_of(icq, struct bfq_io_cq, icq);
21 }
22
23 /**
24 * bfq_bic_lookup - search into @ioc a bic associated to @bfqd.
25 * @bfqd: the lookup key.
26 * @ioc: the io_context of the process doing I/O.
27 *
28 * Queue lock must be held.
29 */
30 static struct bfq_io_cq *bfq_bic_lookup(struct bfq_data *bfqd,
31 struct io_context *ioc)
32 {
33 if (ioc)
34 return icq_to_bic(ioc_lookup_icq(ioc, bfqd->queue));
35 return NULL;
36 }
1 /*
2 * Budget Fair Queueing (BFQ) I/O scheduler.
3 *
4 * Based on ideas and code from CFQ:
5 * Copyright (C) 2003 Jens Axboe <axboe@kernel.dk>
6 *
7 * Copyright (C) 2008 Fabio Checconi <fabio@gandalf.sssup.it>
8 * Paolo Valente <paolo.valente@unimore.it>
9 *
10 * Copyright (C) 2015 Paolo Valente <paolo.valente@unimore.it>
11 *
12 * Copyright (C) 2017 Paolo Valente <paolo.valente@linaro.org>
13 *
14 * Licensed under the GPL-2 as detailed in the accompanying COPYING.BFQ
15 * file.
16 *
17 * BFQ is a proportional-share I/O scheduler, with some extra
18 * low-latency capabilities. BFQ also supports full hierarchical
19 * scheduling through cgroups. The next paragraphs provide an introduction
20 * to BFQ's inner workings. Details on BFQ benefits and usage can be
21 * found in Documentation/block/bfq-iosched.txt.
22 *
23 * BFQ is a proportional-share storage-I/O scheduling algorithm based
24 * on the slice-by-slice service scheme of CFQ. But BFQ assigns
25 * budgets, measured in number of sectors, to processes instead of
26 * time slices. The device is not granted to the in-service process
27 * for a given time slice, but until it has exhausted its assigned
28 * budget. This change from the time to the service domain enables BFQ
29 * to distribute the device throughput among processes as desired,
30 * without any distortion due to throughput fluctuations, or to device
31 * internal queueing. BFQ uses an ad hoc internal scheduler, called
32 * B-WF2Q+, to schedule processes according to their budgets. More
33 * precisely, BFQ schedules queues associated with processes. Thanks to
34 * the accurate policy of B-WF2Q+, BFQ can afford to assign high
35 * budgets to I/O-bound processes issuing sequential requests (to
36 * boost the throughput), and yet guarantee a low latency to
37 * interactive and soft real-time applications.
38 *
39 * NOTE: if the main or only goal, with a given device, is to achieve
40 * the maximum-possible throughput at all times, then do switch off
41 * all low-latency heuristics for that device, by setting low_latency
42 * to 0.
43 *
44 * BFQ is described in [1], which also references the initial, more
45 * theoretical paper on BFQ. The interested reader can find
46 * in the latter paper full details on the main algorithm, as well as
47 * formulas of the guarantees and formal proofs of all the properties.
48 * With respect to the version of BFQ presented in these papers, this
49 * implementation adds a few more heuristics, such as the one that
50 * guarantees a low latency to soft real-time applications, and a
51 * hierarchical extension based on H-WF2Q+.
52 *
53 * B-WF2Q+ is based on WF2Q+, which is described in [2], together with
54 * H-WF2Q+, while the augmented tree used to implement B-WF2Q+ with O(log N)
55 * complexity derives from the one introduced with EEVDF in [3].
56 *
57 * [1] P. Valente, A. Avanzini, "Evolution of the BFQ Storage I/O
58 * Scheduler", Proceedings of the First Workshop on Mobile System
59 * Technologies (MST-2015), May 2015.
60 * http://algogroup.unimore.it/people/paolo/disk_sched/mst-2015.pdf
61 *
62 * http://algogroup.unimo.it/people/paolo/disk_sched/bf1-v1-suite-results.pdf
63 *
64 * [2] Jon C.R. Bennett and H. Zhang, ``Hierarchical Packet Fair Queueing
65 * Algorithms,'' IEEE/ACM Transactions on Networking, 5(5):675-689,
66 * Oct 1997.
67 *
68 * http://www.cs.cmu.edu/~hzhang/papers/TON-97-Oct.ps.gz
69 *
70 * [3] I. Stoica and H. Abdel-Wahab, ``Earliest Eligible Virtual Deadline
71 * First: A Flexible and Accurate Mechanism for Proportional Share
72 * Resource Allocation,'' technical report.
73 *
74 * http://www.cs.berkeley.edu/~istoica/papers/eevdf-tr-95.pdf
75 */
76 #include <linux/module.h>
77 #include <linux/slab.h>
78 #include <linux/blkdev.h>
79 #include <linux/cgroup.h>
80 #include <linux/elevator.h>
81 #include <linux/jiffies.h>
82 #include <linux/rbtree.h>
83 #include <linux/ioprio.h>
84 #include <linux/sbitmap.h>
85 #include <linux/delay.h>
86
87 #include "blk.h"
88 #include "blk-mq.h"
89 #include "blk-mq-tag.h"
90 #include "blk-mq-sched.h"
91 #include "bfq-mq.h"
92
93 /* Expiration time of sync (0) and async (1) requests, in ns. */
94 static const u64 bfq_fifo_expire[2] = { NSEC_PER_SEC / 4, NSEC_PER_SEC / 8 };
95
96 /* Maximum backwards seek, in KiB. */
97 static const int bfq_back_max = (16 * 1024);
98
99 /* Penalty of a backwards seek, in number of sectors. */
100 static const int bfq_back_penalty = 2;
101
102 /* Idling period duration, in ns. */
103 static u32 bfq_slice_idle = (NSEC_PER_SEC / 125);
104
105 /* Minimum number of assigned budgets for which stats are safe to compute. */
106 static const int bfq_stats_min_budgets = 194;
107
108 /* Default maximum budget values, in sectors and number of requests. */
109 static const int bfq_default_max_budget = (16 * 1024);
110
111 /*
112 * Async to sync throughput distribution is controlled as follows:
113 * when an async request is served, the entity is charged the number
114 * of sectors of the request, multiplied by the factor below
115 */
116 static const int bfq_async_charge_factor = 10;
117
118 /* Default timeout values, in jiffies, approximating CFQ defaults. */
119 static const int bfq_timeout = (HZ / 8);
120
121 static struct kmem_cache *bfq_pool;
122
123 /* Below this threshold (in ns), we consider thinktime immediate. */
124 #define BFQ_MIN_TT (2 * NSEC_PER_MSEC)
125
126 /* hw_tag detection: parallel requests threshold and min samples needed. */
127 #define BFQ_HW_QUEUE_THRESHOLD 4
128 #define BFQ_HW_QUEUE_SAMPLES 32
129
130 #define BFQQ_SEEK_THR (sector_t)(8 * 100)
131 #define BFQQ_SECT_THR_NONROT (sector_t)(2 * 32)
132 #define BFQQ_CLOSE_THR (sector_t)(8 * 1024)
133 #define BFQQ_SEEKY(bfqq) (hweight32(bfqq->seek_history) > 32/8)
134
135 /* Min number of samples required to perform peak-rate update */
136 #define BFQ_RATE_MIN_SAMPLES 32
137 /* Min observation time interval required to perform a peak-rate update (ns) */
138 #define BFQ_RATE_MIN_INTERVAL (300*NSEC_PER_MSEC)
139 /* Target observation time interval for a peak-rate update (ns) */
140 #define BFQ_RATE_REF_INTERVAL NSEC_PER_SEC
141
142 /* Shift used for peak rate fixed precision calculations. */
143 #define BFQ_RATE_SHIFT 16
144
145 /*
146 * By default, BFQ computes the duration of the weight raising for
147 * interactive applications automatically, using the following formula:
148 * duration = (R / r) * T, where r is the peak rate of the device, and
149 * R and T are two reference parameters.
150 * In particular, R is the peak rate of the reference device (see below),
151 * and T is a reference time: given the systems that are likely to be
152 * installed on the reference device according to its speed class, T is
153 * about the maximum time needed, under BFQ and while reading two files in
154 * parallel, to load typical large applications on these systems.
155 * In practice, the slower/faster the device at hand is, the more/less it
156 * takes to load applications with respect to the reference device.
157 * Accordingly, the longer/shorter BFQ grants weight raising to interactive
158 * applications.
159 *
160 * BFQ uses four different reference pairs (R, T), depending on:
161 * . whether the device is rotational or non-rotational;
162 * . whether the device is slow, such as old or portable HDDs, as well as
163 * SD cards, or fast, such as newer HDDs and SSDs.
164 *
165 * The device's speed class is dynamically (re)detected in
166 * bfq_update_peak_rate() every time the estimated peak rate is updated.
167 *
168 * In the following definitions, R_slow[0]/R_fast[0] and
169 * T_slow[0]/T_fast[0] are the reference values for a slow/fast
170 * rotational device, whereas R_slow[1]/R_fast[1] and
171 * T_slow[1]/T_fast[1] are the reference values for a slow/fast
172 * non-rotational device. Finally, device_speed_thresh are the
173 * thresholds used to switch between speed classes. The reference
174 * rates are not the actual peak rates of the devices used as a
175 * reference, but slightly lower values. The reason for using these
176 * slightly lower values is that the peak-rate estimator tends to
177 * yield slightly lower values than the actual peak rate (it can yield
178 * the actual peak rate only if there is only one process doing I/O,
179 * and the process does sequential I/O).
180 *
181 * Both the reference peak rates and the thresholds are measured in
182 * sectors/usec, left-shifted by BFQ_RATE_SHIFT.
183 */
184 static int R_slow[2] = {1000, 10700};
185 static int R_fast[2] = {14000, 33000};
186 /*
187 * To improve readability, a conversion function is used to initialize the
188 * following arrays, which entails that they can be initialized only in a
189 * function.
190 */
191 static int T_slow[2];
192 static int T_fast[2];
193 static int device_speed_thresh[2];
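The formula duration = (R / r) * T from the comment above, together with the fixed-point rate representation (sectors/usec left-shifted by BFQ_RATE_SHIFT), can be sketched as follows. This is an illustration under stated assumptions, not the scheduler's implementation: the demo_* names are hypothetical, the product R * T is divided by r to avoid truncating R / r in integer arithmetic, and falling back to T when no rate estimate exists is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors BFQ_RATE_SHIFT above: rates carry 16 fractional bits. */
#define DEMO_RATE_SHIFT 16

/* Express a rate of sectors per usecs in fixed point. */
static uint64_t demo_rate_to_fixed(uint64_t sectors, uint64_t usecs)
{
	assert(usecs != 0);
	return (sectors << DEMO_RATE_SHIFT) / usecs;
}

/*
 * duration = (R / r) * T, evaluated as (R * T) / r.
 * ref_rate and peak_rate must use the same fixed-point unit;
 * the result is in the unit of ref_time.
 */
static uint64_t demo_wr_duration(uint64_t ref_rate, uint64_t ref_time,
				 uint64_t peak_rate)
{
	if (peak_rate == 0)
		return ref_time; /* assumption: no estimate yet, use T as is */
	return ref_rate * ref_time / peak_rate;
}
```

With R = 14000 (the R_fast[0] value above) and a made-up reference time T = 8000, a device measured at half the reference rate gets twice the duration, and one at twice the reference rate gets half of it.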
194
195 #define BFQ_SERVICE_TREE_INIT ((struct bfq_service_tree) \
196 { RB_ROOT, RB_ROOT, NULL, NULL, 0, 0 })
197
198 #define RQ_BIC(rq) ((struct bfq_io_cq *) (rq)->elv.priv[0])
199 #define RQ_BFQQ(rq) ((rq)->elv.priv[1])
200
201 /**
202 * icq_to_bic - convert iocontext queue structure to bfq_io_cq.
203 * @icq: the iocontext queue.
204 */
205 static struct bfq_io_cq *icq_to_bic(struct io_cq *icq)
206 {
207 /* bic->icq is the first member, %NULL will convert to %NULL */
208 return container_of(icq, struct bfq_io_cq, icq);
209 }
210
211 /**
212 * bfq_bic_lookup - search into @ioc a bic associated to @bfqd.
213 * @bfqd: the lookup key.
214 * @ioc: the io_context of the process doing I/O.
215 * @q: the request queue.
216 */
217 static struct bfq_io_cq *bfq_bic_lookup(struct bfq_data *bfqd,
218 struct io_context *ioc,
219 struct request_queue *q)
220 {
221 if (ioc) {
222 unsigned long flags;
223 struct bfq_io_cq *icq;
224
225 spin_lock_irqsave(q->queue_lock, flags);
226 icq = icq_to_bic(ioc_lookup_icq(ioc, q));
227 spin_unlock_irqrestore(q->queue_lock, flags);
228
229 return icq;
230 }
231
232 return NULL;
233 }
234
235 /*
236 * Schedule a run of the queue, if there are requests pending and none in the
237 * driver that will restart queueing.
238 */
239 static void bfq_schedule_dispatch(struct bfq_data *bfqd)
240 {
241 if (bfqd->queued != 0) {
242 bfq_log(bfqd, "schedule dispatch");
243 blk_mq_run_hw_queues(bfqd->queue, true);
244 }
245 }
246
247 #define BFQ_MQ
248 #include "bfq-sched.c"
249 #include "bfq-cgroup-included.c"
250
251 #define bfq_class_idle(bfqq) ((bfqq)->ioprio_class == IOPRIO_CLASS_IDLE)
252 #define bfq_class_rt(bfqq) ((bfqq)->ioprio_class == IOPRIO_CLASS_RT)
253
254 #define bfq_sample_valid(samples) ((samples) > 80)
255
256 /*
257 * Lifted from AS - choose which of rq1 and rq2 is best served now.
258 * We choose the request that is closest to the head right now. Distance
259 * behind the head is penalized and only allowed to a certain extent.
260 */
261 static struct request *bfq_choose_req(struct bfq_data *bfqd,
262 struct request *rq1,
263 struct request *rq2,
264 sector_t last)
265 {
266 sector_t s1, s2, d1 = 0, d2 = 0;
267 unsigned long back_max;
268 #define BFQ_RQ1_WRAP 0x01 /* request 1 wraps */
269 #define BFQ_RQ2_WRAP 0x02 /* request 2 wraps */
270 unsigned int wrap = 0; /* bit mask: requests behind the disk head? */
271
272 if (!rq1 || rq1 == rq2)
273 return rq2;
274 if (!rq2)
275 return rq1;
276
277 if (rq_is_sync(rq1) && !rq_is_sync(rq2))
278 return rq1;
279 else if (rq_is_sync(rq2) && !rq_is_sync(rq1))
280 return rq2;
281 if ((rq1->cmd_flags & REQ_META) && !(rq2->cmd_flags & REQ_META))
282 return rq1;
283 else if ((rq2->cmd_flags & REQ_META) && !(rq1->cmd_flags & REQ_META))
284 return rq2;
285
286 s1 = blk_rq_pos(rq1);
287 s2 = blk_rq_pos(rq2);
288
289 /*
290 * By definition, 1KiB is 2 sectors.
291 */
292 back_max = bfqd->bfq_back_max * 2;
293
294 /*
295 * Strict one way elevator _except_ in the case where we allow
296 * short backward seeks which are biased as twice the cost of a
297 * similar forward seek.
298 */
299 if (s1 >= last)
300 d1 = s1 - last;
301 else if (s1 + back_max >= last)
302 d1 = (last - s1) * bfqd->bfq_back_penalty;
303 else
304 wrap |= BFQ_RQ1_WRAP;
305
306 if (s2 >= last)
307 d2 = s2 - last;
308 else if (s2 + back_max >= last)
309 d2 = (last - s2) * bfqd->bfq_back_penalty;
310 else
311 wrap |= BFQ_RQ2_WRAP;
312
313 /* Found required data */
314
315 /*
316 * By doing switch() on the bit mask "wrap" we avoid having to
317 * check two variables for all permutations: --> faster!
318 */
319 switch (wrap) {
320 case 0: /* common case for CFQ: rq1 and rq2 not wrapped */
321 if (d1 < d2)
322 return rq1;
323 else if (d2 < d1)
324 return rq2;
325
326 if (s1 >= s2)
327 return rq1;
328 else
329 return rq2;
330
331 case BFQ_RQ2_WRAP:
332 return rq1;
333 case BFQ_RQ1_WRAP:
334 return rq2;
335 case (BFQ_RQ1_WRAP|BFQ_RQ2_WRAP): /* both rqs wrapped */
336 default:
337 /*
338 * Since both rqs are wrapped,
339 * start with the one that's further behind head
340 * (--> only *one* back seek required),
341 * since back seek takes more time than forward.
342 */
343 if (s1 <= s2)
344 return rq1;
345 else
346 return rq2;
347 }
348 }
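The distance rule used by bfq_choose_req() can be tried out in user space. The following is a minimal sketch, not the kernel code: it ignores the sync and REQ_META checks, works on bare sector numbers instead of struct request, and the names seek_params, seek_cost and choose_pos are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical stand-ins for bfqd->bfq_back_max (in KiB) and bfqd->bfq_back_penalty. */
struct seek_params {
	unsigned long back_max_kib;
	unsigned long back_penalty;
};

/*
 * Compute the cost of moving the head from `last` to `pos`.
 * Returns false if the position "wraps", i.e. lies further behind the
 * head than the allowed backward distance.
 */
static bool seek_cost(const struct seek_params *p, sector_t pos,
		      sector_t last, sector_t *cost)
{
	/* 1 KiB is 2 sectors of 512 bytes */
	sector_t back_max = (sector_t)p->back_max_kib * 2;

	if (pos >= last) {
		*cost = pos - last;		/* forward seek: plain distance */
		return true;
	}
	if (pos + back_max >= last) {
		*cost = (last - pos) * p->back_penalty;	/* short backward seek */
		return true;
	}
	return false;				/* too far behind the head */
}

/* Return 1 if position s1 should be served first, 2 otherwise. */
static int choose_pos(const struct seek_params *p, sector_t s1, sector_t s2,
		      sector_t last)
{
	sector_t d1 = 0, d2 = 0;
	bool ok1 = seek_cost(p, s1, last, &d1);
	bool ok2 = seek_cost(p, s2, last, &d2);

	if (ok1 && ok2) {
		if (d1 != d2)
			return d1 < d2 ? 1 : 2;
		return s1 >= s2 ? 1 : 2;	/* tie: prefer the higher sector */
	}
	if (ok1)
		return 1;
	if (ok2)
		return 2;
	/* both wrapped: start with the one further behind the head */
	return s1 <= s2 ? 1 : 2;
}
```

With a penalty of 2, a request 10 sectors behind the head costs 20, so it still beats a request 100 sectors ahead but loses to one 10 sectors ahead.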
349
350 static struct bfq_queue *
351 bfq_rq_pos_tree_lookup(struct bfq_data *bfqd, struct rb_root *root,
352 sector_t sector, struct rb_node **ret_parent,
353 struct rb_node ***rb_link)
354 {
355 struct rb_node **p, *parent;
356 struct bfq_queue *bfqq = NULL;
357
358 parent = NULL;
359 p = &root->rb_node;
360 while (*p) {
361 struct rb_node **n;
362
363 parent = *p;
364 bfqq = rb_entry(parent, struct bfq_queue, pos_node);
365
366 /*
367 * Sort strictly based on sector. Smallest to the left,
368 * largest to the right.
369 */
370 if (sector > blk_rq_pos(bfqq->next_rq))
371 n = &(*p)->rb_right;
372 else if (sector < blk_rq_pos(bfqq->next_rq))
373 n = &(*p)->rb_left;
374 else
375 break;
376 p = n;
377 bfqq = NULL;
378 }
379
380 *ret_parent = parent;
381 if (rb_link)
382 *rb_link = p;
383
384 bfq_log(bfqd, "rq_pos_tree_lookup %llu: returning %d",
385 (unsigned long long) sector,
386 bfqq ? bfqq->pid : 0);
387
388 return bfqq;
389 }
390
391 static void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
392 {
393 struct rb_node **p, *parent;
394 struct bfq_queue *__bfqq;
395
396 if (bfqq->pos_root) {
397 rb_erase(&bfqq->pos_node, bfqq->pos_root);
398 bfqq->pos_root = NULL;
399 }
400
401 if (bfq_class_idle(bfqq))
402 return;
403 if (!bfqq->next_rq)
404 return;
405
406 bfqq->pos_root = &bfq_bfqq_to_bfqg(bfqq)->rq_pos_tree;
407 __bfqq = bfq_rq_pos_tree_lookup(bfqd, bfqq->pos_root,
408 blk_rq_pos(bfqq->next_rq), &parent, &p);
409 if (!__bfqq) {
410 rb_link_node(&bfqq->pos_node, parent, p);
411 rb_insert_color(&bfqq->pos_node, bfqq->pos_root);
412 } else
413 bfqq->pos_root = NULL;
414 }
415
416 /*
417 * Tell whether there are active queues or groups with differentiated weights.
418 */
419 static bool bfq_differentiated_weights(struct bfq_data *bfqd)
420 {
421 /*
422 * For weights to differ, at least one of the trees must contain
423 * at least two nodes.
424 */
425 return (!RB_EMPTY_ROOT(&bfqd->queue_weights_tree) &&
426 (bfqd->queue_weights_tree.rb_node->rb_left ||
427 bfqd->queue_weights_tree.rb_node->rb_right)
428 #ifdef BFQ_GROUP_IOSCHED_ENABLED
429 ) ||
430 (!RB_EMPTY_ROOT(&bfqd->group_weights_tree) &&
431 (bfqd->group_weights_tree.rb_node->rb_left ||
432 bfqd->group_weights_tree.rb_node->rb_right)
433 #endif
434 );
435 }
436
437 /*
438 * The following function returns true if every queue must receive the
439 * same share of the throughput (this condition is used when deciding
440 * whether idling may be disabled, see the comments in the function
441 * bfq_bfqq_may_idle()).
442 *
443 * Such a scenario occurs when:
444 * 1) all active queues have the same weight,
445 * 2) all active groups at the same level in the groups tree have the same
446 * weight,
447 * 3) all active groups at the same level in the groups tree have the same
448 * number of children.
449 *
450 * Unfortunately, keeping the necessary state for evaluating exactly the
451 * above symmetry conditions would be quite complex and time-consuming.
452 * Therefore this function evaluates, instead, the following stronger
453 * sub-conditions, for which it is much easier to maintain the needed
454 * state:
455 * 1) all active queues have the same weight,
456 * 2) all active groups have the same weight,
457 * 3) all active groups have at most one active child each.
458 * In particular, the last two conditions are always true if hierarchical
459 * support and the cgroups interface are not enabled, thus no state needs
460 * to be maintained in this case.
461 */
462 static bool bfq_symmetric_scenario(struct bfq_data *bfqd)
463 {
464 return !bfq_differentiated_weights(bfqd);
465 }
466
467 /*
468 * If the weight-counter tree passed as input contains no counter for
469 * the weight of the input entity, then add that counter; otherwise just
470 * increment the existing counter.
471 *
472 * Note that weight-counter trees contain few nodes in mostly symmetric
473 * scenarios. For example, if all queues have the same weight, then the
474 * weight-counter tree for the queues may contain at most one node.
475 * This holds even if low_latency is on, because weight-raised queues
476 * are not inserted in the tree.
477 * In most scenarios, the rate at which nodes are created/destroyed
478 * should be low too.
479 */
480 static void bfq_weights_tree_add(struct bfq_data *bfqd,
481 struct bfq_entity *entity,
482 struct rb_root *root)
483 {
484 struct rb_node **new = &(root->rb_node), *parent = NULL;
485
486 /*
487 * Do not insert if the entity is already associated with a
488 * counter, which happens if:
489 * 1) the entity is associated with a queue,
490 * 2) a request arrival has caused the queue to become both
491 * non-weight-raised, and hence change its weight, and
492 * backlogged; in this respect, each of the two events
493 * causes an invocation of this function,
494 * 3) this is the invocation of this function caused by the
495 * second event. This second invocation is actually useless,
496 * and we handle this fact by exiting immediately. More
497 * efficient or clearer solutions might possibly be adopted.
498 */
499 if (entity->weight_counter)
500 return;
501
502 while (*new) {
503 struct bfq_weight_counter *__counter = container_of(*new,
504 struct bfq_weight_counter,
505 weights_node);
506 parent = *new;
507
508 if (entity->weight == __counter->weight) {
509 entity->weight_counter = __counter;
510 goto inc_counter;
511 }
512 if (entity->weight < __counter->weight)
513 new = &((*new)->rb_left);
514 else
515 new = &((*new)->rb_right);
516 }
517
518 entity->weight_counter = kzalloc(sizeof(struct bfq_weight_counter),
519 GFP_ATOMIC);
520
521 /*
522 * In the unlucky event of an allocation failure, we just
523 * exit. This will cause the weight of entity to not be
524 * considered in bfq_differentiated_weights, which, in its
525 * turn, causes the scenario to be deemed wrongly symmetric in
526 * case entity's weight would have been the only weight making
527 * the scenario asymmetric. On the bright side, no imbalance
528 * will however occur when entity becomes inactive again (the
529 * invocation of this function is triggered by an activation
530 * of entity). In fact, bfq_weights_tree_remove does nothing
531 * if !entity->weight_counter.
532 */
533 if (unlikely(!entity->weight_counter))
534 return;
535
536 entity->weight_counter->weight = entity->weight;
537 rb_link_node(&entity->weight_counter->weights_node, parent, new);
538 rb_insert_color(&entity->weight_counter->weights_node, root);
539
540 inc_counter:
541 entity->weight_counter->num_active++;
542 }
543
544 /*
545 * Decrement the weight counter associated with the entity, and, if the
546 * counter reaches 0, remove the counter from the tree.
547 * See the comments to the function bfq_weights_tree_add() for considerations
548 * about overhead.
549 */
550 static void bfq_weights_tree_remove(struct bfq_data *bfqd,
551 struct bfq_entity *entity,
552 struct rb_root *root)
553 {
554 if (!entity->weight_counter)
555 return;
556
557 BUG_ON(RB_EMPTY_ROOT(root));
558 BUG_ON(entity->weight_counter->weight != entity->weight);
559
560 BUG_ON(!entity->weight_counter->num_active);
561 entity->weight_counter->num_active--;
562 if (entity->weight_counter->num_active > 0)
563 goto reset_entity_pointer;
564
565 rb_erase(&entity->weight_counter->weights_node, root);
566 kfree(entity->weight_counter);
567
568 reset_entity_pointer:
569 entity->weight_counter = NULL;
570 }
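The reference-counting scheme behind bfq_weights_tree_add() and bfq_weights_tree_remove() can be sketched outside the kernel. The version below is an illustration under simplifying assumptions: it replaces the red-black tree with a sorted singly linked list, uses calloc/free instead of kzalloc/kfree, and all type and function names (weight_counter, entity, weights_add, weights_remove, differentiated_weights) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* One counter per distinct weight; num_active counts the entities using it. */
struct weight_counter {
	int weight;
	unsigned int num_active;
	struct weight_counter *next;	/* sorted list stands in for the rbtree */
};

struct entity {
	int weight;
	struct weight_counter *counter;
};

/* Attach the entity to the counter for its weight, creating it if needed. */
static void weights_add(struct weight_counter **head, struct entity *e)
{
	struct weight_counter **p = head, *c;

	if (e->counter)			/* already counted: nothing to do */
		return;

	while (*p && (*p)->weight < e->weight)
		p = &(*p)->next;

	if (*p && (*p)->weight == e->weight) {
		c = *p;
	} else {
		c = calloc(1, sizeof(*c));
		if (!c)			/* allocation failure: entity stays uncounted */
			return;
		c->weight = e->weight;
		c->next = *p;
		*p = c;
	}
	e->counter = c;
	c->num_active++;
}

/* Drop the entity's reference; free the counter when nobody uses it. */
static void weights_remove(struct weight_counter **head, struct entity *e)
{
	struct weight_counter **p = head;

	if (!e->counter)
		return;

	assert(e->counter->num_active > 0);
	if (--e->counter->num_active == 0) {
		while (*p != e->counter)
			p = &(*p)->next;
		*p = e->counter->next;
		free(e->counter);
	}
	e->counter = NULL;
}

/* Weights differ only if at least two distinct counters exist. */
static bool differentiated_weights(const struct weight_counter *head)
{
	return head && head->next;
}
```

As in the listing, an entity whose counter could not be allocated is simply not counted, and removal is a no-op for it.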
571
572 /*
573 * Return expired entry, or NULL to just start from scratch in rbtree.
574 */
575 static struct request *bfq_check_fifo(struct bfq_queue *bfqq,
576 struct request *last)
577 {
578 struct request *rq;
579
580 if (bfq_bfqq_fifo_expire(bfqq))
581 return NULL;
582
583 bfq_mark_bfqq_fifo_expire(bfqq);
584
585 rq = rq_entry_fifo(bfqq->fifo.next);
586
587 if (rq == last || ktime_get_ns() < rq->fifo_time)
588 return NULL;
589
590 bfq_log_bfqq(bfqq->bfqd, bfqq, "check_fifo: returned %p", rq);
591 BUG_ON(RB_EMPTY_NODE(&rq->rb_node));
592 return rq;
593 }
594
595 static struct request *bfq_find_next_rq(struct bfq_data *bfqd,
596 struct bfq_queue *bfqq,
597 struct request *last)
598 {
599 struct rb_node *rbnext = rb_next(&last->rb_node);
600 struct rb_node *rbprev = rb_prev(&last->rb_node);
601 struct request *next, *prev = NULL;
602
603 BUG_ON(list_empty(&bfqq->fifo));
604
605 /* Follow expired path, else get first next available. */
606 next = bfq_check_fifo(bfqq, last);
607 if (next) {
608 BUG_ON(next == last);
609 return next;
610 }
611
612 BUG_ON(RB_EMPTY_NODE(&last->rb_node));
613
614 if (rbprev)
615 prev = rb_entry_rq(rbprev);
616
617 if (rbnext)
618 next = rb_entry_rq(rbnext);
619 else {
620 rbnext = rb_first(&bfqq->sort_list);
621 if (rbnext && rbnext != &last->rb_node)
622 next = rb_entry_rq(rbnext);
623 }
624
625 return bfq_choose_req(bfqd, next, prev, blk_rq_pos(last));
626 }
627
628 /* see the definition of bfq_async_charge_factor for details */
629 static unsigned long bfq_serv_to_charge(struct request *rq,
630 struct bfq_queue *bfqq)
631 {
632 if (bfq_bfqq_sync(bfqq) || bfqq->wr_coeff > 1)
633 return blk_rq_sectors(rq);
634
635 /*
636 * If there are no weight-raised queues, then amplify service
637 * by just the async charge factor; otherwise amplify service
638 * by twice the async charge factor, to further reduce latency
639 * for weight-raised queues.
640 */
641 if (bfqq->bfqd->wr_busy_queues == 0)
642 return blk_rq_sectors(rq) * bfq_async_charge_factor;
643
644 return blk_rq_sectors(rq) * 2 * bfq_async_charge_factor;
645 }
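The charging rule in bfq_serv_to_charge() reduces to a three-way case split. A minimal user-space sketch follows; the value 10 for the async charge factor is an assumption made for this example (the real bfq_async_charge_factor is defined elsewhere in the file), and the flattened parameter list is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value, for illustration only. */
#define ASYNC_CHARGE_FACTOR 10UL

/*
 * Return the service to charge for a request of `sectors` sectors.
 * sync:           the queue is synchronous
 * wr_coeff:       weight-raising coefficient of the queue (1 == not raised)
 * wr_busy_queues: number of busy weight-raised queues on the device
 */
static unsigned long serv_to_charge(unsigned long sectors, bool sync,
				    unsigned int wr_coeff, int wr_busy_queues)
{
	/* Sync or weight-raised queues are charged the real service. */
	if (sync || wr_coeff > 1)
		return sectors;

	/* Async queue, no weight-raised queue around: amplify once. */
	if (wr_busy_queues == 0)
		return sectors * ASYNC_CHARGE_FACTOR;

	/* Async queue competing with weight-raised queues: amplify twice. */
	return sectors * 2 * ASYNC_CHARGE_FACTOR;
}
```

Under the assumed factor, an 8-sector async request is charged 80 sectors when no queue is weight-raised and 160 sectors otherwise.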
646
647 /**
648 * bfq_updated_next_req - update the queue after a new next_rq selection.
649 * @bfqd: the device data the queue belongs to.
650 * @bfqq: the queue to update.
651 *
652 * If the first request of a queue changes we make sure that the queue
653 * has enough budget to serve at least its first request (if the
654 * request has grown). We do this because if the queue has not enough
655 * budget for its first request, it has to go through two dispatch
656 * rounds to actually get it dispatched.
657 */
658 static void bfq_updated_next_req(struct bfq_data *bfqd,
659 struct bfq_queue *bfqq)
660 {
661 struct bfq_entity *entity = &bfqq->entity;
662 struct bfq_service_tree *st = bfq_entity_service_tree(entity);
663 struct request *next_rq = bfqq->next_rq;
664 unsigned long new_budget;
665
666 if (!next_rq)
667 return;
668
669 if (bfqq == bfqd->in_service_queue)
670 /*
671 * In order not to break guarantees, budgets cannot be
672 * changed after an entity has been selected.
673 */
674 return;
675
676 BUG_ON(entity->tree != &st->active);
677 BUG_ON(entity == entity->sched_data->in_service_entity);
678
679 new_budget = max_t(unsigned long, bfqq->max_budget,
680 bfq_serv_to_charge(next_rq, bfqq));
681 if (entity->budget != new_budget) {
682 entity->budget = new_budget;
683 bfq_log_bfqq(bfqd, bfqq, "updated next rq: new budget %lu",
684 new_budget);
685 bfq_requeue_bfqq(bfqd, bfqq, false);
686 }
687 }
688
689 static unsigned int bfq_wr_duration(struct bfq_data *bfqd)
690 {
691 u64 dur;
692
693 if (bfqd->bfq_wr_max_time > 0)
694 return bfqd->bfq_wr_max_time;
695
696 dur = bfqd->RT_prod;
697 do_div(dur, bfqd->peak_rate);
698
699 /*
700 * Limit duration between 3 and 13 seconds. Tests show that
701 * higher values than 13 seconds often yield the opposite of
702 * the desired result, i.e., worsen responsiveness by letting
703 * non-interactive and non-soft-real-time applications
704 * preserve weight raising for too long a time interval.
705 *
706 * On the other end, lower values than 3 seconds make it
707 * difficult for most interactive tasks to complete their jobs
708 * before weight-raising finishes.
709 */
710 if (dur > msecs_to_jiffies(13000))
711 dur = msecs_to_jiffies(13000);
712 else if (dur < msecs_to_jiffies(3000))
713 dur = msecs_to_jiffies(3000);
714
715 return dur;
716 }
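The clamping done by bfq_wr_duration() can be illustrated with plain integers. This sketch assumes, for simplicity, that all durations are expressed in milliseconds rather than jiffies and that the quotient of the two inputs is already in that unit; wr_duration_ms and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * wr_max_time_ms: user-configured duration, 0 means "compute automatically"
 * rt_prod:        reference-time product (stand-in for bfqd->RT_prod)
 * peak_rate:      estimated peak rate (stand-in for bfqd->peak_rate)
 */
static uint64_t wr_duration_ms(uint64_t wr_max_time_ms, uint64_t rt_prod,
			       uint64_t peak_rate)
{
	uint64_t dur;

	if (wr_max_time_ms > 0)
		return wr_max_time_ms;	/* explicit setting wins */

	assert(peak_rate > 0);		/* avoid division by zero */
	dur = rt_prod / peak_rate;

	/* Keep the automatic duration between 3 and 13 seconds. */
	if (dur > 13000)
		dur = 13000;
	else if (dur < 3000)
		dur = 3000;

	return dur;
}
```

The faster the device (higher peak rate), the shorter the computed weight-raising period, down to the 3-second floor.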
717
718 static void
719 bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd,
720 struct bfq_io_cq *bic, bool bfq_already_existing)
721 {
722 unsigned int old_wr_coeff;
723 bool busy = bfq_already_existing && bfq_bfqq_busy(bfqq);
724
725 if (bic->saved_has_short_ttime)
726 bfq_mark_bfqq_has_short_ttime(bfqq);
727 else
728 bfq_clear_bfqq_has_short_ttime(bfqq);
729
730 if (bic->saved_IO_bound)
731 bfq_mark_bfqq_IO_bound(bfqq);
732 else
733 bfq_clear_bfqq_IO_bound(bfqq);
734
735 if (unlikely(busy))
736 old_wr_coeff = bfqq->wr_coeff;
737
738 bfqq->ttime = bic->saved_ttime;
739 bfqq->wr_coeff = bic->saved_wr_coeff;
740 bfqq->wr_start_at_switch_to_srt = bic->saved_wr_start_at_switch_to_srt;
741 BUG_ON(time_is_after_jiffies(bfqq->wr_start_at_switch_to_srt));
742 bfqq->last_wr_start_finish = bic->saved_last_wr_start_finish;
743 bfqq->wr_cur_max_time = bic->saved_wr_cur_max_time;
744 BUG_ON(time_is_after_jiffies(bfqq->last_wr_start_finish));
745
746 if (bfqq->wr_coeff > 1 && (bfq_bfqq_in_large_burst(bfqq) ||
747 time_is_before_jiffies(bfqq->last_wr_start_finish +
748 bfqq->wr_cur_max_time))) {
749 bfq_log_bfqq(bfqq->bfqd, bfqq,
750 "resume state: switching off wr (%lu + %lu < %lu)",
751 bfqq->last_wr_start_finish, bfqq->wr_cur_max_time,
752 jiffies);
753
754 bfqq->wr_coeff = 1;
755 }
756
757 /* make sure weight will be updated, however we got here */
758 bfqq->entity.prio_changed = 1;
759
760 if (likely(!busy))
761 return;
762
763 if (old_wr_coeff == 1 && bfqq->wr_coeff > 1) {
764 bfqd->wr_busy_queues++;
765 BUG_ON(bfqd->wr_busy_queues > bfqd->busy_queues);
766 } else if (old_wr_coeff > 1 && bfqq->wr_coeff == 1) {
767 bfqd->wr_busy_queues--;
768 BUG_ON(bfqd->wr_busy_queues < 0);
769 }
770 }
771
772 static int bfqq_process_refs(struct bfq_queue *bfqq)
773 {
774 int process_refs, io_refs;
775
776 lockdep_assert_held(&bfqq->bfqd->lock);
777
778 io_refs = bfqq->allocated;
779 process_refs = bfqq->ref - io_refs - bfqq->entity.on_st;
780 BUG_ON(process_refs < 0);
781 return process_refs;
782 }
783
784 /* Empty burst list and add just bfqq (see comments to bfq_handle_burst) */
785 static void bfq_reset_burst_list(struct bfq_data *bfqd, struct bfq_queue *bfqq)
786 {
787 struct bfq_queue *item;
788 struct hlist_node *n;
789
790 hlist_for_each_entry_safe(item, n, &bfqd->burst_list, burst_list_node)
791 hlist_del_init(&item->burst_list_node);
792 hlist_add_head(&bfqq->burst_list_node, &bfqd->burst_list);
793 bfqd->burst_size = 1;
794 bfqd->burst_parent_entity = bfqq->entity.parent;
795 }
796
797 /* Add bfqq to the list of queues in current burst (see bfq_handle_burst) */
798 static void bfq_add_to_burst(struct bfq_data *bfqd, struct bfq_queue *bfqq)
799 {
800 /* Increment burst size to take into account also bfqq */
801 bfqd->burst_size++;
802
803 bfq_log_bfqq(bfqd, bfqq, "add_to_burst %d", bfqd->burst_size);
804
805 BUG_ON(bfqd->burst_size > bfqd->bfq_large_burst_thresh);
806
807 if (bfqd->burst_size == bfqd->bfq_large_burst_thresh) {
808 struct bfq_queue *pos, *bfqq_item;
809 struct hlist_node *n;
810
811 /*
812 * Enough queues have been activated shortly after each
813 * other to consider this burst as large.
814 */
815 bfqd->large_burst = true;
816 bfq_log_bfqq(bfqd, bfqq, "add_to_burst: large burst started");
817
818 /*
819 * We can now mark all queues in the burst list as
820 * belonging to a large burst.
821 */
822 hlist_for_each_entry(bfqq_item, &bfqd->burst_list,
823 burst_list_node) {
824 bfq_mark_bfqq_in_large_burst(bfqq_item);
825 bfq_log_bfqq(bfqd, bfqq_item, "marked in large burst");
826 }
827 bfq_mark_bfqq_in_large_burst(bfqq);
828 bfq_log_bfqq(bfqd, bfqq, "marked in large burst");
829
830 /*
831 * From now on, and until the current burst finishes, any
832 * new queue being activated shortly after the last queue
833 * was inserted in the burst can be immediately marked as
834 * belonging to a large burst. So the burst list is not
835 * needed any more. Remove it.
836 */
837 hlist_for_each_entry_safe(pos, n, &bfqd->burst_list,
838 burst_list_node)
839 hlist_del_init(&pos->burst_list_node);
840 } else /*
841 * Burst not yet large: add bfqq to the burst list. Do
842 * not increment the ref counter for bfqq, because bfqq
843 * is removed from the burst list before freeing bfqq
844 * in put_queue.
845 */
846 hlist_add_head(&bfqq->burst_list_node, &bfqd->burst_list);
847 }
848
849 /*
850 * If many queues belonging to the same group happen to be created
851 * shortly after each other, then the processes associated with these
852 * queues typically have a common goal. In particular, bursts of queue
853 * creations are usually caused by services or applications that spawn
854 * many parallel threads/processes. Examples are systemd during boot,
855 * or git grep. To help these processes get their job done as soon as
856 * possible, it is usually better to not grant either weight-raising
857 * or device idling to their queues.
858 *
859 * In this comment we describe, firstly, the reasons why this fact
860 * holds, and, secondly, the next function, which implements the main
861 * steps needed to properly mark these queues so that they can then be
862 * treated in a different way.
863 *
864 * The above services or applications benefit mostly from a high
865 * throughput: the quicker the requests of the activated queues are
866 * cumulatively served, the sooner the target job of these queues gets
867 * completed. As a consequence, weight-raising any of these queues,
868 * which also implies idling the device for it, is almost always
869 * counterproductive. In most cases it just lowers throughput.
870 *
871 * On the other hand, a burst of queue creations may be caused also by
872 * the start of an application that does not consist of a lot of
873 * parallel I/O-bound threads. In fact, with a complex application,
874 * several short processes may need to be executed to start-up the
875 * application. In this respect, to start an application as quickly as
876 * possible, the best thing to do is in any case to privilege the I/O
877 * related to the application with respect to all other
878 * I/O. Therefore, the best strategy to start as quickly as possible
879 * an application that causes a burst of queue creations is to
880 * weight-raise all the queues created during the burst. This is the
881 * exact opposite of the best strategy for the other type of bursts.
882 *
883 * In the end, to take the best action for each of the two cases, the
884 * two types of bursts need to be distinguished. Fortunately, this
885 * seems relatively easy, by looking at the sizes of the bursts. In
886 * particular, we found a threshold such that only bursts with a
887 * larger size than that threshold are apparently caused by
888 * services or commands such as systemd or git grep. For brevity,
889 * hereafter we simply call these bursts 'large'. BFQ *does not*
890 * weight-raise queues whose creation occurs in a large burst. In
891 * addition, for each of these queues BFQ performs or does not perform
892 * idling depending on which choice boosts the throughput more. The
893 * exact choice depends on the device and request pattern at
894 * hand.
895 *
896 * Unfortunately, false positives may occur while an interactive task
897 * is starting (e.g., an application is being started). The
898 * consequence is that the queues associated with the task do not
899 * enjoy weight raising as expected. Fortunately these false positives
900 * are very rare. They typically occur if some service happens to
901 * start doing I/O exactly when the interactive task starts.
902 *
903 * Turning back to the next function, it implements all the steps
904 * needed to detect the occurrence of a large burst and to properly
905 * mark all the queues belonging to it (so that they can then be
906 * treated in a different way). This goal is achieved by maintaining a
907 * "burst list" that holds, temporarily, the queues that belong to the
908 * burst in progress. The list is then used to mark these queues as
909 * belonging to a large burst if the burst does become large. The main
910 * steps are the following.
911 *
912 * . when the very first queue is created, the queue is inserted into the
913 * list (as it could be the first queue in a possible burst)
914 *
915 * . if the current burst has not yet become large, and a queue Q that does
916 * not yet belong to the burst is activated shortly after the last time
917 * at which a new queue entered the burst list, then the function appends
918 * Q to the burst list
919 *
920 * . if, as a consequence of the previous step, the burst size reaches
921 * the large-burst threshold, then
922 *
923 * . all the queues in the burst list are marked as belonging to a
924 * large burst
925 *
926 * . the burst list is deleted; in fact, the burst list already served
927 * its purpose (keeping temporarily track of the queues in a burst,
928 * so as to be able to mark them as belonging to a large burst in the
929 * previous sub-step), and now is not needed any more
930 *
931 * . the device enters a large-burst mode
932 *
933 * . if a queue Q that does not belong to the burst is created while
934 * the device is in large-burst mode and shortly after the last time
935 * at which a queue either entered the burst list or was marked as
936 * belonging to the current large burst, then Q is immediately marked
937 * as belonging to a large burst.
938 *
939 * . if a queue Q that does not belong to the burst is created a while
940 * after (i.e., not shortly after) the last time at which a queue
941 * either entered the burst list or was marked as belonging to the
942 * current large burst, then the current burst is deemed finished and:
943 *
944 * . the large-burst mode is reset if set
945 *
946 * . the burst list is emptied
947 *
948 * . Q is inserted in the burst list, as Q may be the first queue
949 * in a possible new burst (then the burst list contains just Q
950 * after this step).
951 */
952 static void bfq_handle_burst(struct bfq_data *bfqd, struct bfq_queue *bfqq)
953 {
954 /*
955 * If bfqq is already in the burst list or is part of a large
956 * burst, or finally has just been split, then there is
957 * nothing else to do.
958 */
959 if (!hlist_unhashed(&bfqq->burst_list_node) ||
960 bfq_bfqq_in_large_burst(bfqq) ||
961 time_is_after_eq_jiffies(bfqq->split_time +
962 msecs_to_jiffies(10)))
963 return;
964
965 /*
966 * If bfqq's creation happens late enough, or bfqq belongs to
967 * a different group than the burst group, then the current
968 * burst is finished, and related data structures must be
969 * reset.
970 *
971 * In this respect, consider the special case where bfqq is
972 * the very first queue created after BFQ is selected for this
973 * device. In this case, last_ins_in_burst and
974 * burst_parent_entity are not yet significant when we get
975 * here. But it is easy to verify that, whether or not the
976 * following condition is true, bfqq will end up being
977 * inserted into the burst list. In particular the list will
978 * happen to contain only bfqq. And this is exactly what has
979 * to happen, as bfqq may be the first queue of the first
980 * burst.
981 */
982 if (time_is_before_jiffies(bfqd->last_ins_in_burst +
983 bfqd->bfq_burst_interval) ||
984 bfqq->entity.parent != bfqd->burst_parent_entity) {
985 bfqd->large_burst = false;
986 bfq_reset_burst_list(bfqd, bfqq);
987 bfq_log_bfqq(bfqd, bfqq,
988 "handle_burst: late activation or different group");
989 goto end;
990 }
991
992 /*
993 * If we get here, then bfqq is being activated shortly after the
994 * last queue. So, if the current burst is also large, we can mark
995 * bfqq as belonging to this large burst immediately.
996 */
997 if (bfqd->large_burst) {
998 bfq_log_bfqq(bfqd, bfqq, "handle_burst: marked in burst");
999 bfq_mark_bfqq_in_large_burst(bfqq);
1000 goto end;
1001 }
1002
1003 /*
1004 * If we get here, then a large-burst state has not yet been
1005 * reached, but bfqq is being activated shortly after the last
1006 * queue. Then we add bfqq to the burst.
1007 */
1008 bfq_add_to_burst(bfqd, bfqq);
1009 end:
1010 /*
1011 * At this point, bfqq either has been added to the current
1012 * burst or has caused the current burst to terminate and a
1013 * possible new burst to start. In particular, in the second
1014 * case, bfqq has become the first queue in the possible new
1015 * burst. In both cases last_ins_in_burst needs to be moved
1016 * forward.
1017 */
1018 bfqd->last_ins_in_burst = jiffies;
1019
1020 }
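The burst-detection steps implemented by bfq_handle_burst(), bfq_add_to_burst() and bfq_reset_burst_list() can be modelled as a small state machine. The sketch below is a simplified user-space model, not the kernel code: time is a plain tick counter passed in by the caller, the hlist is replaced by a fixed-size array, groups are integers, and the split_time check is omitted. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BURST 64	/* capacity of the illustrative burst list */

struct queue {
	bool in_burst_list;
	bool in_large_burst;
	int group;
};

struct burst_state {
	struct queue *list[MAX_BURST];
	int list_len;
	int size;		/* models burst_size */
	int large_thresh;	/* models bfq_large_burst_thresh */
	unsigned long interval;	/* models bfq_burst_interval, in ticks */
	unsigned long last_ins;	/* models last_ins_in_burst */
	int group;		/* models burst_parent_entity */
	bool large;		/* models large_burst */
};

static void clear_list(struct burst_state *b)
{
	for (int i = 0; i < b->list_len; i++)
		b->list[i]->in_burst_list = false;
	b->list_len = 0;
}

static void list_add(struct burst_state *b, struct queue *q)
{
	assert(b->list_len < MAX_BURST);
	b->list[b->list_len++] = q;
	q->in_burst_list = true;
}

static void handle_burst(struct burst_state *b, struct queue *q,
			 unsigned long now)
{
	if (q->in_burst_list || q->in_large_burst)
		return;

	if (now > b->last_ins + b->interval || q->group != b->group) {
		/* Late activation or different group: start a new burst. */
		b->large = false;
		clear_list(b);
		list_add(b, q);
		b->size = 1;
		b->group = q->group;
	} else if (b->large) {
		/* Burst already large: mark the newcomer immediately. */
		q->in_large_burst = true;
	} else if (++b->size == b->large_thresh) {
		/* Threshold reached: mark everyone, then drop the list. */
		b->large = true;
		for (int i = 0; i < b->list_len; i++)
			b->list[i]->in_large_burst = true;
		q->in_large_burst = true;
		clear_list(b);
	} else {
		list_add(b, q);
	}
	b->last_ins = now;
}
```

In this model, as in the listing, the list is only needed until the threshold is hit; afterwards the `large` flag alone decides how newly arriving queues are treated.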
1021
1022 static int bfq_bfqq_budget_left(struct bfq_queue *bfqq)
1023 {
1024 struct bfq_entity *entity = &bfqq->entity;
1025
1026 return entity->budget - entity->service;
1027 }
1028
1029 /*
1030 * If enough samples have been computed, return the current max budget
1031 * stored in bfqd, which is dynamically updated according to the
1032 * estimated disk peak rate; otherwise return the default max budget
1033 */
1034 static int bfq_max_budget(struct bfq_data *bfqd)
1035 {
1036 if (bfqd->budgets_assigned < bfq_stats_min_budgets)
1037 return bfq_default_max_budget;
1038 else
1039 return bfqd->bfq_max_budget;
1040 }
1041
1042 /*
1043 * Return min budget, which is a fraction of the current or default
1044 * max budget (trying with 1/32)
1045 */
1046 static int bfq_min_budget(struct bfq_data *bfqd)
1047 {
1048 if (bfqd->budgets_assigned < bfq_stats_min_budgets)
1049 return bfq_default_max_budget / 32;
1050 else
1051 return bfqd->bfq_max_budget / 32;
1052 }
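The relationship between bfq_max_budget() and bfq_min_budget() can be made explicit with a short sketch. The constants below are assumed values chosen for this example (the real bfq_stats_min_budgets and bfq_default_max_budget are defined elsewhere in the file), and struct budget_state is a hypothetical stand-in for the relevant fields of struct bfq_data.

```c
#include <assert.h>

/* Assumed values, for illustration only. */
#define STATS_MIN_BUDGETS  194
#define DEFAULT_MAX_BUDGET (16 * 1024)

struct budget_state {
	int budgets_assigned;	/* number of budgets handed out so far */
	int max_budget;		/* dynamically estimated max budget */
};

/* Use the estimate only once enough samples have been collected. */
static int max_budget(const struct budget_state *s)
{
	assert(s);
	if (s->budgets_assigned < STATS_MIN_BUDGETS)
		return DEFAULT_MAX_BUDGET;
	return s->max_budget;
}

/* The min budget is 1/32 of whichever max budget is in effect. */
static int min_budget(const struct budget_state *s)
{
	return max_budget(s) / 32;
}
```

Expressing the min budget in terms of the max budget keeps the two sample-count checks from drifting apart; the listing instead repeats the check in both functions.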
1053
1054 static void bfq_bfqq_expire(struct bfq_data *bfqd,
1055 struct bfq_queue *bfqq,
1056 bool compensate,
1057 enum bfqq_expiration reason);
1058
1059 /*
1060 * The next function, invoked after the input queue bfqq switches from
1061 * idle to busy, updates the budget of bfqq. The function also tells
1062 * whether the in-service queue should be expired, by returning
1063 * true. The purpose of expiring the in-service queue is to give bfqq
1064 * the chance to possibly preempt the in-service queue, and the reason
1065 * for preempting the in-service queue is to achieve one of the two
1066 * goals below.
1067 *
1068 * 1. Guarantee to bfqq its reserved bandwidth even if bfqq has
1069 * expired because it has remained idle. In particular, bfqq may have
1070 * expired for one of the following two reasons:
1071 *
1072 * - BFQ_BFQQ_NO_MORE_REQUEST bfqq did not enjoy any device idling and
1073 * did not manage to issue a new request before its last request
1074 * was served;
1075 *
1076 * - BFQ_BFQQ_TOO_IDLE bfqq did enjoy device idling, but did not issue
1077 * a new request before the expiration of the idling-time.
1078 *
1079 * Even if bfqq has expired for one of the above reasons, the process
1080 * associated with the queue may nonetheless be issuing requests greedily,
1081 * and thus be sensitive to the bandwidth it receives (bfqq may have
1082 * remained idle for other reasons: CPU high load, bfqq not enjoying
1083 * idling, I/O throttling somewhere in the path from the process to
1084 * the I/O scheduler, ...). But if, after every expiration for one of
1085 * the above two reasons, bfqq has to wait for the service of at least
1086 * one full budget of another queue before being served again, then
1087 * bfqq is likely to get a much lower bandwidth or resource time than
1088 * its reserved ones. To address this issue, two countermeasures need
1089 * to be taken.
1090 *
1091 * First, the budget and the timestamps of bfqq need to be updated in
1092 * a special way on bfqq reactivation: they need to be updated as if
1093 * bfqq did not remain idle and did not expire. In fact, if they are
1094 * computed as if bfqq expired and remained idle until reactivation,
1095 * then the process associated with bfqq is treated as if, instead of
1096 * being greedy, it stopped issuing requests when bfqq remained idle,
1097 * and restarts issuing requests only on this reactivation. In other
1098 * words, the scheduler does not help the process recover the "service
1099 * hole" between bfqq expiration and reactivation. As a consequence,
1100 * the process receives a lower bandwidth than its reserved one. In
1101 * contrast, to recover this hole, the budget must be updated as if
1102 * bfqq was not expired at all before this reactivation, i.e., it must
1103 * be set to the value of the remaining budget when bfqq was
1104 * expired. Along the same line, timestamps need to be assigned the
1105 * value they had the last time bfqq was selected for service, i.e.,
1106 * before last expiration. Thus timestamps need to be back-shifted
1107 * with respect to their normal computation (see [1] for more details
1108 * on this tricky aspect).
1109 *
1110 * Secondly, to allow the process to recover the hole, the in-service
1111 * queue must be expired too, to give bfqq the chance to preempt it
1112 * immediately. In fact, if bfqq has to wait for a full budget of the
1113 * in-service queue to be completed, then it may become impossible to
1114 * let the process recover the hole, even if the back-shifted
1115 * timestamps of bfqq are lower than those of the in-service queue. If
1116 * this happens for most or all of the holes, then the process may not
1117 * receive its reserved bandwidth. In this respect, it is worth noting
1118 * that, since the service of outstanding requests is unpreemptible, a
1119 * small fraction of the holes may still be unrecoverable, thereby
1120 * causing a small loss of bandwidth.
1121 *
1122 * The last important point is detecting whether bfqq does need this
1123 * bandwidth recovery. In this respect, the next function deems the
1124 * process associated with bfqq greedy, and thus allows it to recover
1125 * the hole, if: 1) the process is waiting for the arrival of a new
1126 * request (which implies that bfqq expired for one of the above two
1127 * reasons), and 2) such a request has arrived soon. The first
1128 * condition is controlled through the flag non_blocking_wait_rq,
1129 * while the second through the flag arrived_in_time. If both
1130 * conditions hold, then the function computes the budget in the
1131 * above-described special way, and signals that the in-service queue
1132 * should be expired. Timestamp back-shifting is done later in
1133 * __bfq_activate_entity.
1134 *
1135 * 2. Reduce latency. Even if timestamps are not back-shifted to let
1136 * the process associated with bfqq recover a service hole, bfqq may
1137 * however happen to have, after being (re)activated, a lower finish
1138 * timestamp than the in-service queue. That is, the next budget of
1139 * bfqq may have to be completed before the one of the in-service
1140 * queue. If this is the case, then preempting the in-service queue
1141 * allows this goal to be achieved, apart from the unpreemptible,
1142 * outstanding requests mentioned above.
1143 *
1144 * Unfortunately, regardless of which of the above two goals one wants
1145 * to achieve, service trees need first to be updated to know whether
1146 * the in-service queue must be preempted. To have service trees
1147 * correctly updated, the in-service queue must be expired and
1148 * rescheduled, and bfqq must be scheduled too. This is one of the
1149 * most costly operations (in future versions, the scheduling
1150 * mechanism may be redesigned so that it becomes possible to
1151 * know whether preemption is needed without updating service
1152 * trees). In addition, queue preemptions almost always cause random
1153 * I/O, and thus loss of throughput. Because of these facts, the next
1154 * function adopts the following simple scheme to avoid both costly
1155 * operations and too frequent preemptions: it requests the expiration
1156 * of the in-service queue (unconditionally) only for queues that need
1157 * to recover a hole, or that either are weight-raised or deserve to
1158 * be weight-raised.
1159 */
1160 static bool bfq_bfqq_update_budg_for_activation(struct bfq_data *bfqd,
1161 struct bfq_queue *bfqq,
1162 bool arrived_in_time,
1163 bool wr_or_deserves_wr)
1164 {
1165 struct bfq_entity *entity = &bfqq->entity;
1166
1167 if (bfq_bfqq_non_blocking_wait_rq(bfqq) && arrived_in_time) {
1168 /*
1169 * We do not clear the flag non_blocking_wait_rq here, as
1170 * the latter is used in bfq_activate_bfqq to signal
1171 * that timestamps need to be back-shifted (and is
1172 * cleared right after).
1173 */
1174
1175 /*
1176 * In the next assignment we rely on the fact that
1177 * neither entity->service nor entity->budget is updated
1178 * on expiration if bfqq is empty (see
1179 * __bfq_bfqq_recalc_budget). Thus both quantities
1180 * remain unchanged after such an expiration, and the
1181 * following statement therefore assigns to
1182 * entity->budget the remaining budget on such an
1183 * expiration. For clarity, entity->service is not
1184 * updated on expiration in any case, and, in normal
1185 * operation, is reset only when bfqq is selected for
1186 * service (see bfq_get_next_queue).
1187 */
1188 BUG_ON(bfqq->max_budget < 0);
1189 entity->budget = min_t(unsigned long,
1190 bfq_bfqq_budget_left(bfqq),
1191 bfqq->max_budget);
1192
1193 BUG_ON(entity->budget < 0);
1194 return true;
1195 }
1196
1197 BUG_ON(bfqq->max_budget < 0);
1198 entity->budget = max_t(unsigned long, bfqq->max_budget,
1199 bfq_serv_to_charge(bfqq->next_rq, bfqq));
1200 BUG_ON(entity->budget < 0);
1201
1202 bfq_clear_bfqq_non_blocking_wait_rq(bfqq);
1203 return wr_or_deserves_wr;
1204 }
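
The budget choice made by bfq_bfqq_update_budg_for_activation can be tried out in user space with the sketch below. It is only an illustration under stated assumptions: struct sketch_queue and the sketch_* functions are hypothetical stand-ins, not kernel types; the remaining budget is assumed to be budget minus service; and next_rq_charge stands for the value that bfq_serv_to_charge would return for the next request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the bfq_queue fields used here. */
struct sketch_queue {
	int max_budget;            /* upper bound for a normal budget */
	int budget;                /* budget currently assigned to the queue */
	int service;               /* service received since last selection */
	bool non_blocking_wait_rq; /* queue expired while waiting for a request */
};

/* Assumption: the remaining budget is what was assigned minus what was used. */
static int sketch_budget_left(const struct sketch_queue *q)
{
	return q->budget - q->service;
}

/*
 * Returns true if the in-service queue should be expired so that q can
 * preempt it. next_rq_charge models the service charged for q's next request.
 */
static bool sketch_update_budget_for_activation(struct sketch_queue *q,
						int next_rq_charge,
						bool arrived_in_time,
						bool wr_or_deserves_wr)
{
	assert(q->max_budget >= 0);

	if (q->non_blocking_wait_rq && arrived_in_time) {
		/*
		 * Hole recovery: resume from the budget left at expiration,
		 * capped by max_budget. The flag is deliberately left set, as
		 * it later signals that timestamps must be back-shifted.
		 */
		int left = sketch_budget_left(q);

		q->budget = left < q->max_budget ? left : q->max_budget;
		assert(q->budget >= 0);
		return true;
	}

	/* Normal activation: a full budget, large enough for the next request. */
	q->budget = q->max_budget > next_rq_charge ?
		    q->max_budget : next_rq_charge;
	q->non_blocking_wait_rq = false;
	return wr_or_deserves_wr;
}
```

In the first branch the function asks for preemption regardless of wr_or_deserves_wr; in the second, preemption is requested only for queues that are weight-raised or deserve to be.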
1205
1206 static void bfq_update_bfqq_wr_on_rq_arrival(struct bfq_data *bfqd,
1207 struct bfq_queue *bfqq,
1208 unsigned int old_wr_coeff,
1209 bool wr_or_deserves_wr,
1210 bool interactive,
1211 bool in_burst,
1212 bool soft_rt)
1213 {
1214 if (old_wr_coeff == 1 && wr_or_deserves_wr) {
1215 /* start a weight-raising period */
1216 if (interactive) {
1217 bfqq->wr_coeff = bfqd->bfq_wr_coeff;
1218 bfqq->wr_cur_max_time = bfq_wr_duration(bfqd);
1219 } else {
1220 bfqq->wr_start_at_switch_to_srt = jiffies;
1221 bfqq->wr_coeff = bfqd->bfq_wr_coeff *
1222 BFQ_SOFTRT_WEIGHT_FACTOR;
1223 bfqq->wr_cur_max_time =
1224 bfqd->bfq_wr_rt_max_time;
1225 }
1226 /*
1227 * If needed, further reduce budget to make sure it is
1228 * close to bfqq's backlog, so as to reduce the
1229 * scheduling-error component due to a too large
1230 * budget. Do not care about throughput consequences,
1231 * but only about latency. Finally, do not assign a
1232 * too small budget either, to avoid increasing
1233 * latency by causing too frequent expirations.
1234 */
1235 bfqq->entity.budget = min_t(unsigned long,
1236 bfqq->entity.budget,
1237 2 * bfq_min_budget(bfqd));
1238
1239 bfq_log_bfqq(bfqd, bfqq,
1240 "wrais starting at %lu, rais_max_time %u",
1241 jiffies,
1242 jiffies_to_msecs(bfqq->wr_cur_max_time));
1243 } else if (old_wr_coeff > 1) {
1244 if (interactive) { /* update wr coeff and duration */
1245 bfqq->wr_coeff = bfqd->bfq_wr_coeff;
1246 bfqq->wr_cur_max_time = bfq_wr_duration(bfqd);
1247 } else if (in_burst) {
1248 bfqq->wr_coeff = 1;
1249 bfq_log_bfqq(bfqd, bfqq,
1250 "wrais ending at %lu, rais_max_time %u",
1251 jiffies,
1252 jiffies_to_msecs(bfqq->
1253 wr_cur_max_time));
1254 } else if (soft_rt) {
1255 /*
1256 * The application is now or still meeting the
1257 * requirements for being deemed soft rt. We
1258 * can then correctly and safely (re)charge
1259 * the weight-raising duration for the
1260 * application with the weight-raising
1261 * duration for soft rt applications.
1262 *
1263 * In particular, doing this recharge now, i.e.,
1264 * before the weight-raising period for the
1265 * application finishes, reduces the probability
1266 * of the following negative scenario:
1267 * 1) the weight of a soft rt application is
1268 * raised at startup (as for any newly
1269 * created application),
1270 * 2) since the application is not interactive,
1271 * at a certain time weight-raising is
1272 * stopped for the application,
1273 * 3) at that time the application happens to
1274 * still have pending requests, and hence
1275 * is destined to not have a chance to be
1276 * deemed soft rt before these requests are
1277 * completed (see the comments to the
1278 * function bfq_bfqq_softrt_next_start()
1279 * for details on soft rt detection),
1280 * 4) these pending requests experience a high
1281 * latency because the application is not
1282 * weight-raised while they are pending.
1283 */
1284 if (bfqq->wr_cur_max_time !=
1285 bfqd->bfq_wr_rt_max_time) {
1286 bfqq->wr_start_at_switch_to_srt =
1287 bfqq->last_wr_start_finish;
1288 BUG_ON(time_is_after_jiffies(bfqq->last_wr_start_finish));
1289
1290 bfqq->wr_cur_max_time =
1291 bfqd->bfq_wr_rt_max_time;
1292 bfqq->wr_coeff = bfqd->bfq_wr_coeff *
1293 BFQ_SOFTRT_WEIGHT_FACTOR;
1294 bfq_log_bfqq(bfqd, bfqq,
1295 "switching to soft_rt wr");
1296 } else
1297 bfq_log_bfqq(bfqd, bfqq,
1298 "moving forward soft_rt wr duration");
1299 bfqq->last_wr_start_finish = jiffies;
1300 }
1301 }
1302 }
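
The weight-raising transitions in bfq_update_bfqq_wr_on_rq_arrival can be summarized as a small state update. The sketch below is a simplified user-space model, not the kernel code: the sketch_* types and the value of SKETCH_SOFTRT_WEIGHT_FACTOR are assumptions made for illustration, the old coefficient is read from the state itself instead of being passed separately, and the budget clamp, the timestamps (wr_start_at_switch_to_srt, last_wr_start_finish) and logging are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value, used only to make the sketch concrete. */
#define SKETCH_SOFTRT_WEIGHT_FACTOR 100

/* Hypothetical stand-in for the per-device tunables. */
struct sketch_wr_params {
	unsigned int base_coeff;            /* models bfqd->bfq_wr_coeff */
	unsigned long interactive_duration; /* models bfq_wr_duration(bfqd) */
	unsigned long rt_max_time;          /* models bfqd->bfq_wr_rt_max_time */
};

/* Hypothetical stand-in for the per-queue weight-raising state. */
struct sketch_wr_state {
	unsigned int coeff;         /* 1 means "not weight-raised" */
	unsigned long cur_max_time; /* duration of the current period */
};

static void sketch_update_wr_on_arrival(struct sketch_wr_state *s,
					const struct sketch_wr_params *p,
					bool wr_or_deserves_wr,
					bool interactive,
					bool in_burst,
					bool soft_rt)
{
	assert(s->coeff >= 1);

	if (s->coeff == 1 && wr_or_deserves_wr) {
		/* Start a weight-raising period. */
		if (interactive) {
			s->coeff = p->base_coeff;
			s->cur_max_time = p->interactive_duration;
		} else {
			s->coeff = p->base_coeff * SKETCH_SOFTRT_WEIGHT_FACTOR;
			s->cur_max_time = p->rt_max_time;
		}
	} else if (s->coeff > 1) {
		if (interactive) {
			/* Refresh coefficient and duration. */
			s->coeff = p->base_coeff;
			s->cur_max_time = p->interactive_duration;
		} else if (in_burst) {
			/* Queues in a large burst lose weight raising. */
			s->coeff = 1;
		} else if (soft_rt && s->cur_max_time != p->rt_max_time) {
			/* Switch an already raised queue to soft rt raising. */
			s->cur_max_time = p->rt_max_time;
			s->coeff = p->base_coeff * SKETCH_SOFTRT_WEIGHT_FACTOR;
		}
	}
}
```

A queue that starts as interactive and later meets the soft real-time requirements moves from the base coefficient to the boosted one, and falls back to 1 once it is found to belong to a large burst.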
1303
1304 static bool bfq_bfqq_idle_for_long_time(struct bfq_data *bfqd,
1305 struct bfq_queue *bfqq)
1306 {
1307 return bfqq->dispatched == 0 &&
1308 time_is_before_jiffies(
1309 bfqq->budget_timeout +
1310 bfqd->bfq_wr_min_idle_time);
1311 }
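
The idle check above relies on a jiffies comparison that stays correct when the tick counter wraps around. A minimal user-space sketch of that idea follows, assuming a two's-complement platform where converting an out-of-range unsigned long to long wraps; sketch_time_after and sketch_idle_for_long_time are hypothetical names, and the current time is passed in as a parameter instead of being read from jiffies.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True if tick count a is later than b, even across a counter wraparound,
 * as long as the two values are less than half the counter range apart.
 */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Models the check: nothing dispatched, and more than min_idle_time ticks
 * have passed since budget_timeout.
 */
static bool sketch_idle_for_long_time(unsigned long now,
				      int dispatched,
				      unsigned long budget_timeout,
				      unsigned long min_idle_time)
{
	assert(dispatched >= 0);
	return dispatched == 0 &&
	       sketch_time_after(now, budget_timeout + min_idle_time);
}
```

Comparing through the signed difference, instead of writing now > deadline directly, is what keeps the result correct when the deadline was computed just before the counter wrapped.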
1312
1313 static void bfq_bfqq_handle_idle_busy_switch(struct bfq_data *bfqd,
1314 struct bfq_queue *bfqq,
1315 int old_wr_coeff,
1316 struct request *rq,
1317 bool *interactive)
1318 {
1319 bool soft_rt, in_burst, wr_or_deserves_wr,
1320 bfqq_wants_to_preempt,
1321 idle_for_long_time = bfq_bfqq_idle_for_long_time(bfqd, bfqq),
1322 /*
1323 * See the comments on
1324 * bfq_bfqq_update_budg_for_activation for
1325 * details on the usage of the next variable.
1326 */
1327 arrived_in_time = ktime_get_ns() <=
1328 bfqq->ttime.last_end_request +
1329 bfqd->bfq_slice_idle * 3;
1330
1331 bfq_log_bfqq(bfqd, bfqq,
1332 "bfq_add_request non-busy: "
1333 "jiffies %lu, in_time %d, idle_long %d busyw %d "
1334 "wr_coeff %u",
1335 jiffies, arrived_in_time,
1336 idle_for_long_time,
1337 bfq_bfqq_non_blocking_wait_rq(bfqq),
1338 old_wr_coeff);
1339
1340 BUG_ON(bfqq->entity.budget < bfqq->entity.service);
1341
1342 BUG_ON(bfqq == bfqd->in_service_queue);
1343 bfqg_stats_update_io_add(bfqq_group(RQ_BFQQ(rq)), bfqq, rq->cmd_flags);
1344
1345 /*
1346 * bfqq deserves to be weight-raised if:
1347 * - it is sync,
1348 * - it does not belong to a large burst,
1349 * - it has been idle for enough time or is soft real-time,
1350 * - it is linked to a bfq_io_cq (it is not shared in any sense).
1351 */
1352 in_burst = bfq_bfqq_in_large_burst(bfqq);
1353 soft_rt = bfqd->bfq_wr_max_softrt_rate > 0 &&
1354 !in_burst &&
1355 time_is_before_jiffies(bfqq->soft_rt_next_start);
1356 *interactive =
1357 !in_burst &&
1358 idle_for_long_time;
1359 wr_or_deserves_wr = bfqd->low_latency &&
1360 (bfqq->wr_coeff > 1 ||
1361 (bfq_bfqq_sync(bfqq) &&
1362 bfqq->bic && (*interactive || soft_rt)));
1363
1364 bfq_log_bfqq(bfqd, bfqq,
1365 "bfq_add_request: "
1366 "in_burst %d, "
1367 "soft_rt %d (next %lu), inter %d, bic %p",
1368 bfq_bfqq_in_large_burst(bfqq), soft_rt,
1369 bfqq->soft_rt_next_start,
1370 *interactive,
1371 bfqq->bic);
1372
1373 /*
1374 * Using the last flag, update budget and check whether bfqq
1375 * may want to preempt the in-service queue.
1376 */
1377 bfqq_wants_to_preempt =
1378 bfq_bfqq_update_budg_for_activation(bfqd, bfqq,
1379 arrived_in_time,
1380 wr_or_deserves_wr);
1381
1382 /*
1383 * If bfqq happened to be activated in a burst, but has been
1384 * idle for much more than an interactive queue, then we
1385 * assume that, in the overall I/O initiated in the burst, the
1386 * I/O associated with bfqq is finished. So bfqq does not need
1387 * to be treated as a queue belonging to a burst
1388 * anymore. Accordingly, we reset bfqq's in_large_burst flag
1389 * if set, and remove bfqq from the burst list if it's
1390 * there. We do not decrement burst_size, because the fact
1391 * that bfqq does not need to belong to the burst list any
1392 * more does not invalidate the fact that bfqq was created in
1393 * a burst.
1394 */
1395 if (likely(!bfq_bfqq_just_created(bfqq)) &&
1396 idle_for_long_time &&
1397 time_is_before_jiffies(
1398 bfqq->budget_timeout +
1399 msecs_to_jiffies(10000))) {
1400 hlist_del_init(&bfqq->burst_list_node);
1401 bfq_clear_bfqq_in_large_burst(bfqq);
1402 }
1403
1404 bfq_clear_bfqq_just_created(bfqq);
1405
1406 if (!bfq_bfqq_IO_bound(bfqq)) {
1407 if (arrived_in_time) {
1408 bfqq->requests_within_timer++;
1409 if (bfqq->requests_within_timer >=
1410 bfqd->bfq_requests_within_timer)
1411 bfq_mark_bfqq_IO_bound(bfqq);
1412 } else
1413 bfqq->requests_within_timer = 0;
1414 bfq_log_bfqq(bfqd, bfqq, "requests in time %d",
1415 bfqq->requests_within_timer);
1416 }
1417
1418 if (bfqd->low_latency) {
1419 if (unlikely(time_is_after_jiffies(bfqq->split_time)))
1420 /* wraparound */
1421 bfqq->split_time =
1422 jiffies - bfqd->bfq_wr_min_idle_time - 1;
1423
1424 if (time_is_before_jiffies(bfqq->split_time +
1425 bfqd->bfq_wr_min_idle_time)) {
1426 bfq_update_bfqq_wr_on_rq_arrival(bfqd, bfqq,
1427 old_wr_coeff,
1428 wr_or_deserves_wr,
1429 *interactive,
1430 in_burst,
1431 soft_rt);
1432
1433 if (old_wr_coeff != bfqq->wr_coeff)
1434 bfqq->entity.prio_changed = 1;
1435 }
1436 }
1437
1438 bfqq->last_idle_bklogged = jiffies;
1439 bfqq->service_from_backlogged = 0;