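A script to automatically adjust fan speed by temperature for the MI50 on Ubuntu

The MI50 is passively cooled by design, so using it in an ordinary consumer PC requires retrofitting a cooler.

My setup is a 3D-printed shroud plus a blower fan, with the fan's 4-pin connector plugged into a motherboard fan header.

Because the duct blows air straight through the heatsink, cooling is quite good. But a blower fan spins very fast and is extremely loud at full speed.

On Windows, the FanControl tool lets you configure temperature-based fan control visually. So when the MI50 is deployed under Linux, can the fan speed also follow the GPU temperature automatically, and start with the system? Yes, my friend, it can.

First, install lm-sensors: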

sudo apt install lm-sensors

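# Run sensor detection and answer YES to every question
sudo sensors-detect

# This eventually detects all chips that expose temperature readings and PWM control.
# When asked whether to add the detected chip drivers to the module configuration automatically, type yes as well (the default is NO),
Do you want to add these lines automatically to /etc/modules? (yes/NO)

# Reboot the system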
reboot

After the reboot, run:

sudo sensors
lm96163-i2c-13-4c
Adapter: SMBus I801 adapter at f000
temp1:        +42.0°C  (high = +70.0°C)
temp2:        +50.6°C  (low  =  +0.0°C, high = +85.0°C)
                       (crit = +110.0°C, hyst = +100.0°C)  sensor = CPU diode

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +29.0°C  (high = +90.0°C, crit = +100.0°C)
Core 0:        +24.0°C  (high = +90.0°C, crit = +100.0°C)
Core 1:        +23.0°C  (high = +90.0°C, crit = +100.0°C)
Core 2:        +24.0°C  (high = +90.0°C, crit = +100.0°C)
Core 3:        +24.0°C  (high = +90.0°C, crit = +100.0°C)
Core 4:        +24.0°C  (high = +90.0°C, crit = +100.0°C)
Core 5:        +25.0°C  (high = +90.0°C, crit = +100.0°C)
Core 6:        +23.0°C  (high = +90.0°C, crit = +100.0°C)
Core 8:        +22.0°C  (high = +90.0°C, crit = +100.0°C)
Core 9:        +23.0°C  (high = +90.0°C, crit = +100.0°C)
Core 10:       +23.0°C  (high = +90.0°C, crit = +100.0°C)
Core 11:       +24.0°C  (high = +90.0°C, crit = +100.0°C)
Core 12:       +24.0°C  (high = +90.0°C, crit = +100.0°C)
Core 13:       +23.0°C  (high = +90.0°C, crit = +100.0°C)
Core 14:       +24.0°C  (high = +90.0°C, crit = +100.0°C)

nvme-pci-0200
Adapter: PCI adapter
Composite:    +39.9°C  (low  = -273.1°C, high = +84.8°C)
                       (crit = +84.8°C)
Sensor 1:     +39.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +34.9°C  (low  = -273.1°C, high = +65261.8°C)

nct6793-isa-0a20
Adapter: ISA adapter
in0:                     1.81 V  (min =  +0.00 V, max =  +1.74 V)  ALARM
in1:                     1.22 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                     3.33 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                     3.34 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                   248.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                   128.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                     1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                     3.31 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                     3.25 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                     1.06 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                  152.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                  128.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                    1.22 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                    1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                  168.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                     0 RPM  (min =    0 RPM)
fan2:                  1296 RPM  (min =    0 RPM)
fan3:                     0 RPM  (min =    0 RPM)
fan4:                     0 RPM  (min =    0 RPM)
fan5:                     0 RPM  (min =    0 RPM)
SYSTIN:                +117.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = thermistor
CPUTIN:                 +49.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermal diode
AUXTIN0:                +25.5°C    sensor = thermistor
AUXTIN1:               +127.0°C    sensor = thermistor
AUXTIN2:               +127.0°C    sensor = thermistor
AUXTIN3:               +127.0°C    sensor = CPU diode
PECI Agent 0:            +8.5°C  
PCH_CHIP_CPU_MAX_TEMP:   +0.0°C  
PCH_CHIP_TEMP:           +0.0°C  
PCH_CPU_TEMP:            +0.0°C  
PCH_MCH_TEMP:            +0.0°C  
Agent0 Dimm0 :           +0.0°C  
TSI2_TEMP:             +3892314.0°C  
TSI3_TEMP:             +3892314.0°C  
TSI4_TEMP:             +3892314.0°C  
TSI5_TEMP:             +3892314.0°C  
TSI6_TEMP:             +3892314.0°C  
TSI7_TEMP:             +3892314.0°C  
intrusion0:            OK
intrusion1:            ALARM
beep_enable:           disabled

amdgpu-pci-0500
Adapter: PCI adapter
vddgfx:      743.00 mV 
fan1:           0 RPM  (min =    0 RPM, max = 3850 RPM)
edge:         +49.0°C  (crit = +100.0°C, hyst = -273.1°C)
                       (emerg = +105.0°C)
junction:     +53.0°C  (crit = +100.0°C, hyst = -273.1°C)
                       (emerg = +105.0°C)
mem:          +54.0°C  (crit = +94.0°C, hyst = -273.1°C)
                       (emerg = +99.0°C)
PPT:          30.00 W  (cap = 300.00 W)
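
The chip names in this output (for example nct6793 and amdgpu) correspond to directories under /sys/class/hwmon, and those directories are what a script has to read and write. A minimal sketch for listing which hwmonN belongs to which chip is shown here; the function name list_hwmon and its optional root argument are assumptions made for illustration. The hwmonN index follows driver load order, so it is worth re-checking after kernel or hardware changes.

```shell
#!/usr/bin/env bash
# Print each hwmon directory together with the chip name the kernel reports.
# $1 (optional): root directory to scan; defaults to the real sysfs location.
list_hwmon() {
    local root="${1:-/sys/class/hwmon}" dir
    for dir in "$root"/hwmon*; do
        # Skip entries without a readable name file (or an unmatched glob).
        [[ -r "$dir/name" ]] || continue
        printf '%s\t%s\n' "$dir" "$(cat "$dir/name")"
    done
}

list_hwmon
```

The directory whose name is amdgpu holds the GPU temperature files, and the one named after the motherboard's Super I/O chip holds the pwmN files for the fan headers.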

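Based on this information, I wrote a script that reads the amdgpu temperature every 5 seconds, keeps the PWM value within a set range, and starts automatically at boot. The script also works for devices such as the NVIDIA Tesla P100 and V100 that have been retrofitted with active cooling. The output looks like this: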

10:14:21 root@dev ~  journalctl -f -u fan-control-gpu.service 
Sep 24 10:13:20 dev fan-control-gpu.sh[16433]: [2025-09-24 10:13:20] 🌡️ 温度: 67°C  PWM: 151
Sep 24 10:13:25 dev fan-control-gpu.sh[16441]: [2025-09-24 10:13:25] 🌡️ 温度: 68°C  PWM: 155
Sep 24 10:13:30 dev fan-control-gpu.sh[16449]: [2025-09-24 10:13:30] 🌡️ 温度: 69°C  PWM: 158
Sep 24 10:13:35 dev fan-control-gpu.sh[16457]: [2025-09-24 10:13:35] 🌡️ 温度: 70°C  PWM: 162
Sep 24 10:13:40 dev fan-control-gpu.sh[16465]: [2025-09-24 10:13:40] 🌡️ 温度: 71°C  PWM: 166
Sep 24 10:13:55 dev fan-control-gpu.sh[16494]: [2025-09-24 10:13:55] 🌡️ 温度: 72°C  PWM: 170
Sep 24 10:14:15 dev fan-control-gpu.sh[16523]: [2025-09-24 10:14:15] 🌡️ 温度: 73°C  PWM: 173
Sep 24 10:14:20 dev fan-control-gpu.sh[16547]: [2025-09-24 10:14:20] 🌡️ 温度: 68°C  PWM: 155
Sep 24 10:14:25 dev fan-control-gpu.sh[16560]: [2025-09-24 10:14:25] 🌡️ 温度: 65°C  PWM: 143
Sep 24 10:14:30 dev fan-control-gpu.sh[16568]: [2025-09-24 10:14:30] 🌡️ 温度: 63°C  PWM: 136
Sep 24 10:14:35 dev fan-control-gpu.sh[16578]: [2025-09-24 10:14:35] 🌡️ 温度: 61°C  PWM: 128
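
The log above comes from a systemd unit named fan-control-gpu.service, but the unit file itself is not shown. The following is a minimal sketch of what such a unit could look like, not the author's actual file. The helper name print_unit, the install path /usr/local/bin/fan-control-gpu.sh, and the Restart settings are all assumptions.

```shell
#!/usr/bin/env bash
# Print a minimal systemd unit that keeps a fan-control script running.
# $1: full command line for ExecStart (path and arguments are assumptions).
print_unit() {
    local exec_cmd="$1"
    cat <<EOF
[Unit]
Description=GPU temperature based fan control

[Service]
Type=simple
ExecStart=${exec_cmd}
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}

# Example: unit text for a script installed at a hypothetical path.
print_unit "/usr/local/bin/fan-control-gpu.sh"
```

To install it, the output can be piped to `sudo tee /etc/systemd/system/fan-control-gpu.service`, followed by `sudo systemctl daemon-reload` and `sudo systemctl enable --now fan-control-gpu.service`.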

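Script parameters:

PWM_DEVICE="/sys/class/hwmon/hwmon3/pwm2" # PWM control file
PWM_ENABLE="/sys/class/hwmon/hwmon3/pwm2_enable" # PWM enable (mode) file
TEMP_SENSOR="/sys/class/hwmon/hwmon1/temp1_input" # temperature sensor file


INTERVAL_TIME=5 # poll every 5 seconds; adjust as needed, e.g. once per second

MIN_TEMP=40000    # 40°C, in millidegrees Celsius
MAX_TEMP=80000    # 80°C, the maximum temperature, mapped to the maximum fan speed
MIN_PWM=50 # minimum speed; can be set to 0, which may stop the fan completely depending on the fan's PWM control logic
MAX_PWM=200 # maximum speed; can go up to 255. A blower at full speed is too loud, but if temperatures still cannot be held down, set it to the maximum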
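
Before automating anything, it helps to confirm by hand that the chosen PWM file really drives the GPU blower. A minimal sketch follows; the function name set_pwm is an assumption, and it relies on the standard hwmon sysfs convention that writing 1 to pwmN_enable selects manual control. It must run as root on real hardware.

```shell
#!/usr/bin/env bash
# Switch a PWM channel to manual mode and apply a duty value (0-255).
# $1: pwm sysfs file, e.g. /sys/class/hwmon/hwmon3/pwm2
# $2: duty value, integer 0..255
set_pwm() {
    local pwm="$1" value="$2"
    # Reject non-numeric input and values above the 8-bit PWM range.
    if ! [[ "$value" =~ ^[0-9]+$ ]] || (( 10#$value > 255 )); then
        echo "PWM value must be an integer in 0..255" >&2
        return 1
    fi
    echo 1 > "${pwm}_enable"   # 1 = manual control in the hwmon sysfs interface
    echo "$value" > "$pwm"
}
```

For example, running `set_pwm /sys/class/hwmon/hwmon3/pwm2 120` as root and then checking `sensors` should show the RPM of the matching fan change; if a different fan reacts, the PWM path points at the wrong header.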

Bash script code:
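
The original script is not reproduced here. What follows is a minimal sketch, not the author's code, built from the parameters listed above. It assumes the PWM value is a linear interpolation between MIN_PWM and MAX_PWM with integer division, which is consistent with the logged values (67°C gives 151, 73°C gives 173). The function names, the `run` argument, and the exit trap that sets the fan to full duty are assumptions of this sketch.

```shell
#!/usr/bin/env bash
# Sketch of a temperature-driven fan loop; not the author's original script.
PWM_DEVICE="/sys/class/hwmon/hwmon3/pwm2"
PWM_ENABLE="/sys/class/hwmon/hwmon3/pwm2_enable"
TEMP_SENSOR="/sys/class/hwmon/hwmon1/temp1_input"
INTERVAL_TIME=5
MIN_TEMP=40000
MAX_TEMP=80000
MIN_PWM=50
MAX_PWM=200

# Map a temperature in millidegrees Celsius to a PWM value by linear
# interpolation, clamped to [MIN_PWM, MAX_PWM]. Integer division truncates.
compute_pwm() {
    local temp="$1"
    if (( temp <= MIN_TEMP )); then
        echo "$MIN_PWM"
    elif (( temp >= MAX_TEMP )); then
        echo "$MAX_PWM"
    else
        echo $(( MIN_PWM + (temp - MIN_TEMP) * (MAX_PWM - MIN_PWM) / (MAX_TEMP - MIN_TEMP) ))
    fi
}

control_loop() {
    # Assumed safety choice: leave the fan at full duty if the loop ever exits.
    trap 'echo 255 > "$PWM_DEVICE"' EXIT
    trap 'exit 0' INT TERM
    echo 1 > "$PWM_ENABLE"   # 1 = manual PWM control
    local temp pwm
    while true; do
        temp=$(cat "$TEMP_SENSOR")
        pwm=$(compute_pwm "$temp")
        echo "$pwm" > "$PWM_DEVICE"
        printf '[%s] temp: %d°C  PWM: %d\n' "$(date '+%F %T')" "$(( temp / 1000 ))" "$pwm"
        sleep "$INTERVAL_TIME"
    done
}

# Start the loop only when asked to, so the functions can be tried on their own.
if [[ "${1:-}" == "run" ]]; then
    control_loop
fi
```

Run it as root with the `run` argument to start the loop; without arguments it only defines the functions, which makes `compute_pwm 67000` easy to try on its own. The hwmon paths must match the machine at hand.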
