BlindCat

Time is blind; humans are foolish

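Notes on innovative ideas to build

  • [] AI-ability: load AI abilities dynamically
  • [] Turn a résumé into an agent/skill and let AI handle the interview
  • [] VS Code plugin: MR review, clean-code issues, automatic fixes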
Read more »

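What are Agent Skills?

In one sentence: SKILL.md is an optimized version of AGENT.md. It turns AGENT.md into a callable, encapsulated function that can be loaded progressively instead of all at once.
AGENT.md: a flat script, not reusable
SKILL.md: wrapped as a function; callers need not know the details, and it is reusable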

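Skill management problems

Skills are handy, but managing them raises the following problems:

  • Installation: every agent client has its own skill directory, so each skill has to be copied into a different directory for each client.
  • Updates: a skill's remote repository may change and the local copy may evolve, so the copies held by different agent clients drift out of date and have to be updated and synchronized.

npx skills addresses these management problems.

Core mechanism

Skills are downloaded into one central local store and then symlinked into each agent's directory, which is how they take effect.
Advantages:
  • Update once, available everywhere
  • Central management, no synchronization needed
  • A simple, easy-to-use skill marketplace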
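
The central-store-plus-symlink mechanism can be sketched in a few lines of Python. This is only an illustration of the idea, not how npx skills is implemented: the function name `link_skill` and the directory layout are assumptions made for this example.

```python
import os
from pathlib import Path


def link_skill(store_dir: Path, skill_name: str, agent_skill_dirs: list[Path]) -> list[Path]:
    """Symlink one centrally stored skill into every agent's skill directory.

    All paths are hypothetical; each real agent client defines its own layout.
    """
    source = store_dir / skill_name
    if not source.is_dir():
        raise FileNotFoundError(f"skill not found in central store: {source}")

    links = []
    for agent_dir in agent_skill_dirs:
        agent_dir.mkdir(parents=True, exist_ok=True)
        link = agent_dir / skill_name
        # Replace a stale symlink so that re-running the function is idempotent.
        # A real directory at this path is left alone and makes os.symlink fail.
        if link.is_symlink():
            link.unlink()
        os.symlink(source, link, target_is_directory=True)
        links.append(link)
    return links
```

Because every agent directory only holds a link, updating the copy in the central store is immediately visible to all agents.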

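Still unsolved: on-demand dynamic loading of skills

Like MCP servers, skills eventually blow up the context once enough of them are installed.
Granted, each skill's metadata layer is only about 50 tokens, but with enough skills the management problem comes back.
Should every session enable all skills by default, or should you hand-pick the skills for each session and then restart it?
Neither. Skills also need a mechanism for dynamic search, dynamic loading, and dynamic invocation.
The industry has not yet reached a consensus on this open problem.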

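My approach

I designed a solution to this problem that unifies skills and MCP servers, and unifies dynamic search and dynamic invocation across all abilities.

Build a single MCP server that manages both skills and MCP servers, and that searches and invokes them dynamically.
In essence, skills and MCP tools are both just context injection, so managing them through one service is a natural fit.
The service returns the context that a given skill or MCP tool would return, just as a local call would.
Skill management and MCP management are then delegated entirely to this service and, from the agent client's point of view, cease to exist.
No management problem, no update problem, no inconsistency, no skills and no MCP: only the AI and its abilities.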
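
As a rough sketch of what such a service could look like, the Python below keeps only a name and a short description per ability and loads the full context on demand. The names `Ability` and `AbilityRegistry` and the keyword-based search are assumptions made for this illustration; they are not the actual design, and a real service might rank by embeddings instead.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Ability:
    """A skill or an MCP tool, reduced to a name, a description, and a loader."""
    name: str
    description: str
    kind: str                # "skill" or "mcp"
    load: Callable[[], str]  # returns the context to inject, only when called


class AbilityRegistry:
    """Hypothetical single entry point for searching and invoking abilities."""

    def __init__(self) -> None:
        self._abilities: dict[str, Ability] = {}

    def register(self, ability: Ability) -> None:
        self._abilities[ability.name] = ability

    def search(self, query: str, limit: int = 3) -> list[Ability]:
        """Naive keyword match over names and descriptions."""
        words = query.lower().split()
        scored = []
        for ability in self._abilities.values():
            text = f"{ability.name} {ability.description}".lower()
            score = sum(word in text for word in words)
            if score:
                scored.append((score, ability.name, ability))
        # Highest score first; ties are broken by name for stable output.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [ability for _, _, ability in scored[:limit]]

    def invoke(self, name: str) -> str:
        """Load the full context of one ability on demand."""
        return self._abilities[name].load()
```

The agent would only ever see `search` and `invoke`, so the number of installed abilities no longer affects the size of the initial context.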

Read more »

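Scenario

opencode can run in web mode, which makes it easy to oversee the conversations of every workspace in one place.
When run under WSL, opencode keeps dropping offline because WSL runs out of resources, so I want a daemon that restarts it automatically.
The most standard way to run opencode as a daemon is systemd.

Step 1: Create the systemd service file

Create a file named opencode.service in /etc/systemd/system/ (root privileges required):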

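[Unit]
Description=OpenCode AI Server Daemon
After=network.target

[Service]
# Prefer a dedicated non-root user; root is used here only as an example
User=root
Group=root
# Working directory
WorkingDirectory=/home/workspace
# Start opencode in web mode
ExecStart=/usr/bin/opencode web --port 4096 --hostname 0.0.0.0
# Restart automatically whenever the process exits
Restart=always
# Environment variables (API keys can also be set in opencode.json)
Environment="HTTP_PROXY=http://your-proxy:your-port"
Environment="HTTPS_PROXY=http://your-proxy:your-port"
# Always set NO_PROXY so that local traffic does not go through the proxy.
# systemd does not support trailing comments, so this note sits on its own line.
Environment="NO_PROXY=localhost,127.0.0.1"

[Install]
WantedBy=multi-user.target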

Note: adjust the opencode path in ExecStart to match your installation; which opencode usually shows it.

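Step 2: Enable and start the service

Run the following commands to apply the configuration.
Reload the configuration:
sudo systemctl daemon-reload
Start the service and enable it at boot:
sudo systemctl enable --now opencode.service
Check the status:
sudo systemctl status opencode.service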
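
Besides systemctl status, it can be useful to confirm that the server actually accepts connections. The following is a minimal sketch using only the Python standard library; the helper name `is_listening` is made up for this example, and 4096 is simply the port passed to opencode web in the unit file above.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("opencode reachable:", is_listening("127.0.0.1", 4096))
```

This only checks that the port is open; it says nothing about whether the web UI behind it is healthy.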

Step 3: How to interact with it

Web mode

Read more »

Problem description

The build system is orchestrated by Python code, and it hung during packaging.

Solution

  1. Check the process state

    # ps -ef|grep python3
    root 22459 30393 0 15:13 ? 00:00:00 python3 packet_ci_board.py 2488hv6
    root 22805 22459 0 15:15 ? 00:00:01 python3 -B packet.py -b 2488hv6
    root 22821 22805 0 15:15 ? 00:00:00 python3 -B packet.py -b 2488hv6
    root 30393 30392 0 14:59 ? 00:00:00 python3 build_one_click.py --board 2488hv6 --sign
    root 30822 29995 0 16:23 pts/3 00:00:00 grep --color=auto python3

    The process chain is 30392 -> 30393 -> 22459 -> 22805 -> 22821, and the hung process is 22821.

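  2. Try to debug the process
    Following this blog post[^1], check whether gdb and python-dbg are available in the environment. The former was present; the latter was not.

  3. Install python-dbg

    The production environment is SUSE 12 SP5, so python-dbg cannot be installed from the package repositories. As the blog post[^1] points out, installing it really just means copying one source file onto the machine.

    • Check the Python version in the environment: 3.8.5

    • Get libpython.py from the 3.8.5 source[^2] and upload it to the environment by copying and pasting the text.

    • Copy it to the gdb auto-load directory that mirrors the Python installation path: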

      # which python3
      /opt/buildtools/python-3.8.5/bin/python3

      # mkdir -p /usr/share/gdb/auto-load/opt/buildtools/python-3.8.5/bin/
      # cp libpython.py /usr/share/gdb/auto-load/opt/buildtools/python-3.8.5/bin/python3.8-gdb.py
  4. Debug with gdb
    Start debugging

    # gdb python3 22821 
    (gdb)
    # After the steps above, the `py-*` commands should be available in gdb; if not, try `source libpython.py` in gdb to load them manually
    (gdb) source libpython.py
    # Type py and press Tab to auto-complete
    (gdb) py
    py-bt py-down py-locals py-up python-interactive
    py-bt-full py-list py-print python
    # To avoid disturbing the process, dump it first, then debug the dump with gdb python3 ./core.22821
    (gdb) generate-core-file
    # Show the C call stack
    (gdb) bt
    #0 0x00007f8458abb5f4 in do_futex_wait.constprop () from /lib64/libpthread.so.0
    #1 0x00007f8458abb6f8 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
    #2 0x00007f8454b7f388 in semlock_acquire (self=0x7f84575bd5b0, args=<optimized out>, kwds=<optimized out>)
    at /tmp/python-3.8.5/Python-3.8.5/Modules/_multiprocessing/semaphore.c:319
    #3 0x000000000044018a in cfunction_call_varargs (kwargs=<optimized out>, args=<optimized out>,
    func=<built-in method acquire of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>)
    at Objects/call.c:742
    #4 PyCFunction_Call (
    func=<built-in method acquire of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>,
    args=<optimized out>, kwargs=<optimized out>) at Objects/call.c:772
    #5 0x000000000043d156 in _PyObject_MakeTpCall (
    callable=callable@entry=<built-in method acquire of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>, args=args@entry=0x21978d0, nargs=<optimized out>, keywords=keywords@entry=0x0) at Objects/call.c:159
    #6 0x000000000042dbc2 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>,
    args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:125
    #7 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x1ebc3c0)
    at Python/ceval.c:4963
    #8 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3469
    #9 0x00000000004259b8 in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=1,
    globals=<optimized out>) at Objects/call.c:283
    #10 0x000000000043fd7c in PyVectorcall_Call (callable=<function at remote 0x7f84575bb940>,
    tuple=<optimized out>, kwargs=<optimized out>) at Objects/call.c:199
    #11 0x000000000042a749 in do_call_core (kwdict={},
    callargs=(<Lock(_semlock=<_multiprocessing.SemLock at remote 0x7f84575bd5b0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>) at remote 0x7f84576d09d0>,),
    func=<function at remote 0x7f84575bb940>, tstate=0x1ebc3c0) at Python/ceval.c:5010
    #12 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3559
    #13 0x00000000004259b8 in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=1,
    globals=<optimized out>) at Objects/call.c:283
    #14 0x0000000000428733 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>,
    args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:127
    #15 trace_call_function (kwnames=<optimized out>, nargs=<optimized out>, args=<optimized out>,
    func=<optimized out>, tstate=<optimized out>) at Python/ceval.c:4944
    #16 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x1ebc3c0)
    at Python/ceval.c:4960
    #17 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486
    #18 0x00000000004f04dd in PyEval_EvalFrameEx (throwflag=0,
    f=Frame 0x2169550, for file /opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/process.py, line 571, in _bootstrap (self=<Process(_identity=(13,), _config={'authkey': <AuthenticationString at remote 0x7f84576c9dc0>, 'semprefix': '/mp'}, _parent_pid=22805, _parent_name='MainProcess', _popen=None, _closed=False, _target=<function at remote 0x7f84575bb940>, _args=(<Lock(_semlock=<_multiprocessing.SemLock at remote 0x7f84575bd5b0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>) at remote 0x7f84576d09d0>,), _kwargs={}, _name='Process-13') at remote 0x7f845757b250>, parent_sentinel=30, util=<module at remote 0x7f84575e7540>, context=<module at remote 0x7f84577a2a40>)) at Python/ceval.c:741
    #19 _PyEval_EvalCodeWithName (_co=<code at remote 0x7f84576d6920>, globals=<optimized out>,
    locals=locals@entry=0x0, args=<optimized out>, argcount=1, kwnames=0x7f845756d6b8,
    --Type <RET> for more, q to quit, c to continue without paging--q

    # Show the Python call stack
    (gdb) py-bt
    Traceback (most recent call first):
    <built-in method acquire of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>
    File "packet.py", line 3533, in create_tosupport_pkg
    File "/opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
    File "/opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/process.py", line 571, in _bootstrap
    File "/opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/popen_fork.py", line 75, in _launch
    code = process_obj._bootstrap(parent_sentinel=child_r)
    File "/opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
    File "/opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
    File "/opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
    File "/opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/process.py", line 633, in start
    File "packet.py", line 3706, in create_pkg
    File "packet.py", line 3475, in <module>
    (gdb) py-up
    #8 Frame 0x21976c0, for file packet.py, line 3533, in create_tosupport_pkg (lock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x7f84575bd5b0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>) at remote 0x7f84576d09d0>, logger=<Logger(filters=[], name='create_tosupport_pkg', level=10, parent=<RootLogger(filters=[], name='root', level=30, parent=None, propagate=True, handlers=[], disabled=False, _cache={}) at remote 0x7f845771dd00>, propagate=True, handlers=[<FileHandler(baseFilename='/usr1/jenkins/workspace/Version_pipeline_compile_iBMC/V2R2_trunk/temp/log/packet_log/create_tosupport_pkg.log', mode='a', encoding=None, delay=False, filters=[], _name=None, level=0, formatter=<Formatter(_style=<StrFormatStyle(_fmt=' [{levelname} {pathname}:{lineno} {funcName:4}] {message}') at remote 0x7f845757b460>, _fmt=' [{levelname} {pathname}:{lineno} {funcName:4}] {message}', datefm...(truncated)
    (gdb) py-locals
    lock = <Lock(_semlock=<_multiprocessing.SemLock at remote 0x7f84575bd5b0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x7f84575bd5b0>) at remote 0x7f84576d09d0>
    logger = <Logger(filters=[], name='create_tosupport_pkg', level=10, parent=<RootLogger(filters=[], name='root', level=30, parent=None, propagate=True, handlers=[], disabled=False, _cache={}) at remote 0x7f845771dd00>, propagate=True, handlers=[<FileHandler(baseFilename='/usr1/jenkins/workspace/Version_pipeline_compile_iBMC/V2R2_trunk/temp/log/packet_log/create_tosupport_pkg.log', mode='a', encoding=None, delay=False, filters=[], _name=None, level=0, formatter=<Formatter(_style=<StrFormatStyle(_fmt=' [{levelname} {pathname}:{lineno} {funcName:4}] {message}') at remote 0x7f845757b460>, _fmt=' [{levelname} {pathname}:{lineno} {funcName:4}] {message}', datefmt=None) at remote 0x7f845757b430>, lock=<_thread.RLock at remote 0x7f845757b540>, stream=<_io.TextIOWrapper at remote 0x7f84575faba0>) at remote 0x7f845757b4f0>], disabled=False, _cache={20: True}, manager=<Manager(root=<...>, disable=0, emittedNoHandlerWarning=False, loggerDict={'/usr1/jenkins/workspace/Version_pipeline_compile_iBMC/V2R2_trunk/application/build/utils...(truncated)
    release_dir = '2488hv6_tosupport_release'
    archive_emmc_nand_dir = '/usr1/jenkins/workspace/Version_pipeline_compile_iBMC/V2R2_trunk/application/build/utils/../../../application/src/resource/board/2488hv6/archive_emmc_nand'
    sw_code = '05022XYJ'
    ar_ver_file = '/usr1/jenkins/workspace/Version_pipeline_compile_iBMC/V2R2_trunk/application/build/utils/../../../application/src/resource/board/2488hv6/archive_emmc_nand/05022XYJ_version.ini'
    ver_num = '3.03.10.01'
    xml_ver_num = '3.03.10.01'
    hpm_file_name = '2488HV6-iBMC_3.03.10.01.hpm'
    Fusion_pod_v5_board_tuple = ('dh120v5', 'dh141v5')
    hpm_size = 57418095
    max_size = 73400320
    special_board_tuple = ('TaiShan2280v2', 'MM920', 'MM921')
    target_pkg = '2488HV6-iBMC_3.03.10.01'
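  5. Analysis

    • The C stack (do_futex_wait.constprop ()) and the Python stack (_multiprocessing.SemLock) show that the process is waiting to acquire a lock.

    • File "packet.py", line 3533, in create_tosupport_pkg gives the file name and the function name.

    • The code shows that create_tosupport_pkg shares a lock, passed in as an argument, with other functions. The suspicion was that another function never released it; investigation pointed to create_mib_pkg.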

      # Simplified logic
      def create_tosupport_pkg(lock):
          lock.acquire()
          create_localization_support_pkg()
          lock.release()
          return
      def create_mib_pkg(lock):
          lock.acquire()
          create_localization_mib_pkg()
          lock.release()  # never reached if the call above raises
          return
    • The log shows that create_localization_mib_pkg raised an exception that was never handled.

      [2022-03-03T04:04:15.263Z]  [INFO packet.py:2273 delete_unification_unnecessary_files] delete_unification_unnecessary_files  ----------> [begin]
      [2022-03-03T04:04:15.890Z] 当前进度: [ 66.67%] #
      Process Process-14:
      [2022-03-03T04:04:15.890Z] Traceback (most recent call last):
      [2022-03-03T04:04:15.890Z] File "/opt/buildtools/python-3.8.5/lib/python3.8/shutil.py", line 788, in move
      [2022-03-03T04:04:15.890Z] os.rename(src, real_dst)
      [2022-03-03T04:04:15.890Z] FileNotFoundError: [Errno 2] No such file or directory: "/usr1/jenkins/workspace/Version_pipeline_compile_iBMC/V2R2_trunk/application/build/utils/../../../temp/ZOOM Hard'Server 2488H V6-BMC_3.03.10.01_MIB/ZOOM Hard'Server 2488H V6-BMC_3.03.10.01_MIB.zip" -> '2488H V6/ToSupportE/C'
      [2022-03-03T04:04:15.890Z]
      [2022-03-03T04:04:15.890Z] During handling of the above exception, another exception occurred:
      [2022-03-03T04:04:15.890Z]
      [2022-03-03T04:04:15.890Z] Traceback (most recent call last):
      [2022-03-03T04:04:15.890Z] File "/opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
      [2022-03-03T04:04:15.890Z] self.run()
      [2022-03-03T04:04:15.890Z] File "/opt/buildtools/python-3.8.5/lib/python3.8/multiprocessing/process.py", line 108, in run
      [2022-03-03T04:04:15.890Z] self._target(*self._args, **self._kwargs)
      [2022-03-03T04:04:15.890Z] File "packet.py", line 3203, in create_mib_pkg
      [2022-03-03T04:04:15.890Z] create_localization_mib_pkg(logger, "zoom")
      [2022-03-03T04:04:15.890Z] File "packet.py", line 3151, in create_localization_mib_pkg
      [2022-03-03T04:04:15.890Z] shutil.move(f"{g_temp_dir}/{target_pkg}/{target_pkg}.zip", f"{localization_archive_dic['localization_tosupport_mib_dir']}")
      [2022-03-03T04:04:15.890Z] File "/opt/buildtools/python-3.8.5/lib/python3.8/shutil.py", line 802, in move
      [2022-03-03T04:04:15.890Z] copy_function(src, real_dst)
      [2022-03-03T04:04:15.890Z] File "/opt/buildtools/python-3.8.5/lib/python3.8/shutil.py", line 432, in copy2
      [2022-03-03T04:04:15.890Z] copyfile(src, dst, follow_symlinks=follow_symlinks)
      [2022-03-03T04:04:15.890Z] File "/opt/buildtools/python-3.8.5/lib/python3.8/shutil.py", line 261, in copyfile
      [2022-03-03T04:04:15.890Z] with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
      [2022-03-03T04:04:15.890Z] FileNotFoundError: [Errno 2] No such file or directory: '2488H V6/ToSupportE/C'
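      • git blame on the create_localization_mib_pkg code confirmed that a commit from three days earlier introduced the bug; its author fixed it.
      • With that, the problem was solved.
  6. Conclusion

    • The create*_pkg functions run the build in parallel as separate processes (multiprocessing).
    • create_tosupport_pkg and create_mib_pkg share a lock.
    • create_mib_pkg calls create_localization_mib_pkg, which raised an unhandled exception.
    • create_mib_pkg therefore never released the lock it held, so the create_tosupport_pkg process waited forever.
    • The parallel job never finished, which looked like a hang.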

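Reflections

  • The Python code was handled carelessly and irresponsibly at every stage: technology choice, design, implementation, check-in, and maintenance. Within a single year the original code turned into unreadable, untestable garbage. This is heavy technical debt, it made debugging much harder, and the previous chief engineer bears responsibility for it.
    Problem areas (TODO)
    • Python converted to shell
    • Python calling shell
    • Python calling Python
    • Sloppily written Python
    • Code checked in at will, with no gatekeeping
    • So-called "configuration-driven" design
    • So-called log splitting
    • So-called return-code checking (a textbook misuse of exceptions)
    • Global variables shared across files
  • When using a lock, wrap the critical section in try/except/finally so that the lock is always released.
  • Fail fast: the build system and its code should have zero tolerance for errors and fail as early as possible, so that errors do not propagate downstream or produce piles of junk logs.
  • Check the logs first. To pull useful information out of a mass of junk logs, do not rely on manual searching alone; automate keyword checks as well.
  • Be careful when routing and splitting logs; never hide them.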
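
The locking advice above can be sketched as follows. This is a minimal illustration rather than the actual fix applied to packet.py; the helper names `run_locked` and `run_locked_with` are made up for the example, and `task` stands in for a call such as create_localization_mib_pkg.

```python
def run_locked(lock, task):
    """Run task() while holding lock, releasing it even if task() raises."""
    lock.acquire()
    try:
        return task()
    finally:
        # Runs on both normal return and exception, so waiters are never stranded.
        lock.release()


def run_locked_with(lock, task):
    """Equivalent form: the context manager releases the lock on exit."""
    with lock:
        return task()
```

Both forms work with multiprocessing.Lock and threading.Lock. The exception still propagates to the caller, which keeps the failure visible instead of turning it into a silent hang.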

[^1]: Debugging CPython with gdb (gdb调试cpython) | Meteorix's Blog
[^2]: python-dbg

Read more »

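Root cause

In an automation scenario, the Python library paramiko is used to simulate an SSH terminal login. Occasionally the terminal type cannot be determined, so terminal_get_size fails to obtain the terminal size and reads it as 65535*65535. terminal_change_size then tries to allocate memory for a 65535*65535 display and fails.

After that, when re_clear_display refreshes the screen, it accesses memory out of bounds, which causes an intermittent core dump.

Problem log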

#0 0xb6a1e7e8 in re_clear_display () from /usr/lib/libedit.so.0
(gdb) bt
#0 0xb6a1e7e8 in re_clear_display () from /usr/lib/libedit.so.0
#1 0xb6a1bf10 in read_prepare (el=0xd36c18) at read.c:432
#2 0xb6a1c1cc in el_wgets (el=0xd36c18, nread=0xbecbd8fc) at read.c:508
#3 0xb6a150ac in el_gets () from /usr/lib/libedit.so.0
#4 0xb6a3251c in readline (p=0x40541c ' ' <repeats 12 times>, "Copyright(C) 2013-2
#5 0x004022f8 in get_user_input () at /home/workspace/V2R2_trunk/application/src/a
#6 main (argc=, argv=) at /home/workspace/V2R2_trunk
(gdb) frame #1
Invalid character '#' in expression.
(gdb) frame 1
#1 0xb6a1bf10 in read_prepare (el=0xd36c18) at read.c:432
432 read.c: No such file or directory.
(gdb) print el->el_terminal
$1 = {t_name = 0x0, t_size = {h = 65535, v = 65535}, t_flags = 0, t_buf = 0xd37fe8
t_fkey = 0xd38ff8}
(gdb) frame 4
#4 0xb6a3251c in readline (p=0x40541c ' ' <repeats 12 times>, "Copyright(C) 2013-2
455 readline.c: No such file or directory.

Call graph

graph TD
a[get_user_input]-->b[readline]
b-->rl_initialize-->el_init_internal-->terminal_init-->terminal_set-->terminal_change_size-->terminal_rebuffer_display-->terminal_alloc_display-->terminal_alloc_buffer--memory allocation fails-->c
b-->el_gets-->el_wgets-->read_prepare-->re_clear_display-->c["accesses el->el_display at address 65535"]

Fix

 src/terminal.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/terminal.c b/src/terminal.c
index 2e62f7b..96bc9b1 100644
--- a/src/terminal.c
+++ b/src/terminal.c
@@ -952,6 +952,16 @@ terminal_get_size(EditLine *el, int *lins, int *cols)
}
}
#endif
+ if (*cols > 2000 ) {
+ *cols = 2000;
+ } else if (*cols < 2) {
+ *cols = 80;
+ }
+ if (*lins > 2000) {
+ *lins = 2000;
+ } else if (*lins < 1) {
+ *lins = 24;
+ }
return Val(T_co) != *cols || Val(T_li) != *lins;
}

@@ -966,8 +976,20 @@ terminal_change_size(EditLine *el, int lins, int cols)
/*
* Just in case
*/
- Val(T_co) = (cols < 2) ? 80 : cols;
- Val(T_li) = (lins < 1) ? 24 : lins;
+ if (cols < 2) {
+ Val(T_co) = 80;
+ } else if (cols > 2000) {
+ Val(T_co) = 2000;
+ } else {
+ Val(T_co) = cols;
+ }
+ if (lins < 1) {
+ Val(T_li) = 24;
+ } else if (lins > 2000) {
+ Val(T_li) = 2000;
+ } else {
+ Val(T_li) = lins;
+ }

/* re-make display buffers */
if (terminal_rebuffer_display(el) == -1)
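
The bounds that the patch applies in both functions can be restated as one small function. The Python below is only a restatement for readability, not part of libedit; the name `clamp_terminal_size` is made up for this example, while the limits (2000, and the 24x80 fallback) are taken from the diff.

```python
MAX_DIM = 2000  # upper bound chosen in the patch


def clamp_terminal_size(lins: int, cols: int) -> tuple[int, int]:
    """Mirror the patched bounds: fall back to 24x80 when too small, cap at 2000."""
    if cols > MAX_DIM:
        cols = MAX_DIM
    elif cols < 2:
        cols = 80
    if lins > MAX_DIM:
        lins = MAX_DIM
    elif lins < 1:
        lins = 24
    return lins, cols
```

With these bounds, the bogus 65535*65535 size is capped at 2000*2000, so the display buffers requested by terminal_change_size stay bounded.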

GitHub link

Read more »

Problem description

A PME version upgrade updated libcrypto, after which syslog-ng with one-way authentication could no longer parse messages and reported unknown ca.
The error looks like this:

Sep 3 14:09:33 syslog4 syslog-ng[45709]: Syslog connection accepted; fd='5', client='AF_INET(71.41.5.127:38057)', local='AF_INET(0.0.0.0:2460)'
Sep 3 14:09:33 syslog4 syslog-ng[45709]: SSL error while reading stream; tls_error='SSL routines:ssl3_read_bytes:tlsv1 alert unknown ca', location='/etc/syslog-ng/syslog-ng.conf:254:9'
Sep 3 14:09:33 syslog4 syslog-ng[45709]: I/O error occurred while reading; fd='5', error='Connection reset by peer (104)'
Sep 3 14:09:33 syslog4 syslog-ng[45709]: Syslog connection closed; fd='5', client='AF_INET(71.41.5.127:38057)', local='AF_INET(0.0.0.0:2460)'

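Troubleshooting plan

  1. Reproduce the problem first
  2. Bisect to find the offending code
    1. git bisect -> build libcrypto -> replace the file used by the syslog server with this .so -> check whether the problem is still present -> git bisect
  3. Add logging to the changed code and compare the two code paths
  4. On the client, debug syslog-ng with gdb to find where the error is raised and how it calls into OpenSSL's libcrypto
  5. On the client, capture packets with tcpdump and compare the good and bad cases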
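
The bisect loop in step 2 is a binary search over the commit history. The sketch below shows the idea in Python under stated assumptions: `first_bad_commit` and `is_bad` are names invented for this example, and in practice `is_bad` would stand for the manual cycle described above (build libcrypto at that commit, swap the .so into the syslog server, and check whether unknown ca still appears).

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an ordered commit list for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad (the assumption git bisect makes).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

The number of build-and-verify rounds grows with the logarithm of the number of commits, which is what makes this practical when each round requires a rebuild.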

Reproducing the problem

syslog server: 71.47.142.13

Restart the syslog service to apply the configuration: killall /sbin/syslog-ng; /sbin/syslog-ng -f /etc/syslog-ng/syslog-ng.conf

Client: 71.41.5.127

killall /usr/sbin/syslog-ng;/usr/sbin/syslog-ng -f /etc/syslog-ng/syslog-ng.conf -p /var/run/syslogd.pid -F -R /opt/pme/pram/syslog-ng.persist

Read more »

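Based on

Goal

Study with questions in mind:

  • What problem does it solve?
  • How does it solve it (the more concrete the better)?
  • What did I learn (syntax, design, refactoring)?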

c1

ref

Introduction | 《Design Patterns in Modern C++》 | ZenLian
https://zenlian.github.io/posts/design-pattern-in-modern-cpp/introduction/

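Summary

  • Learned several principles

    SOLID is an acronym for the following five design principles:
    Single Responsibility Principle (SRP)
    Open-Closed Principle (OCP)
    Liskov Substitution Principle (LSP)
    Interface Segregation Principle (ISP)
    Dependency Inversion Principle (DIP)

  • Spot what changes and factor out what stays the same. This is really how I already write code; the author is simply able to articulate and summarize it, with plenty of experience to draw examples from. Given time I could write a similar summary, though not as complete and not as good.
  • C++ features can simplify the code that implements these principles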
Read more »

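Abstract

This post uses C/C++ preprocessor macros, specifically the ## token-pasting trick, to control log strings at the binary level without changing the interface used by a large body of existing code.

Background

The Release build must strip debug logs from the binary, while the debug build stays unchanged.
A large amount of code already uses the old logging interface, and that existing code must not change at all.

Implementation

Removal at the binary level cannot be done with a runtime condition inside a function; it has to be done with preprocessor macros.

Suppose there is an existing logging macro that writes to a file by calling log_func.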

// Definition
#define debug_log(level, format, arg...) \
    do { \
        log_func((level), __FILE__, __LINE__, format, ##arg); \
    } while (0)

// Call sites
debug_log(DLOG_DEBUG, "int_var=%d", int_var);
debug_log(DLOG_INFO, "int_var=%d", int_var);
debug_log(DLOG_ERROR, "int_var=%d", int_var);

Token pasting with ## plus conditional compilation isolates the log macros per level while leaving the interface unchanged for callers.

// Definition unchanged
#define debug_log(level, format, arg...) \
    do \
    { \
        log_func((level), __FILE__, __LINE__, format, ##arg); \
    } while (0)
// Newly added macro
#define LOG_MARCO(level, format, arg...) \
    do \
    { \
        log_func((level), __FILE__, __LINE__, format, ##arg); \
    } while (0)

#ifdef RELEASE
// Paste the level token into a macro name; unwanted levels expand to nothing,
// the rest map back to a real call with the original level
#undef debug_log
#define debug_log(level, format, arg...) LOG_##level(format, ##arg)
// Empty definitions: these levels are removed from the binary
#define LOG_DLOG_DEBUG(format, arg...)
#define LOG_DLOG_MASS(format, arg...)
// Levels that still log
#define LOG_DLOG_INFO(format, arg...) \
    LOG_MARCO(DLOG_INFO, format, ##arg)
#define LOG_DLOG_ERROR(format, arg...) \
    LOG_MARCO(DLOG_ERROR, format, ##arg)
#endif

// Call sites unchanged (the level must be written as the literal DLOG_* token)
debug_log(DLOG_DEBUG, "int_var=%d", int_var);
debug_log(DLOG_INFO, "int_var=%d", int_var);
debug_log(DLOG_ERROR, "int_var=%d", int_var);
Read more »

1.

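Sometimes, out on the open ocean, several rain clouds gathered from months of rising vapor all happen not to have fallen as rain. On some day when the storms at sea have eased a little, they collide by chance and become one enormous cloud mass, pressing down on dozens of square kilometers of sea.

The surging seawater and the raging cumulonimbus nearly clash at the surface, and so the wandering winds gather from every direction, turning left or right with the planet's rotation, and form a wall of wind around the whole sea of cloud. Fish, too, are carried here by the currents and, as if answering some summons, swim with all their strength toward the heart of the cloud.

At that moment all the vapor the sky has hoarded reaches its critical point and pours down at once. The abundant rising air the ocean has given the sky over the past months turns into a dozen or more thick columns of water. Such columns are highly unstable; most of them twist, turn turbulent, and dissipate within a very short time. Only the one at the very center, with the largest cross-section and the greatest volume of water, is lucky enough to stay stable and balanced, held by the opposing spin of the outer wind and water and fed by the vast cloud mass.

Here at the center of the storm, this hard-won, stable pillar to the sky briefly joins the violent ocean and the churning sky, forming a passage. Among the shoals that have gathered for reasons unknown, some fish swim against the flow, up toward the sky.

Going straight up is impossible. The fish must keep to the edge between the water column and the spinning water and, in a delicate balance, use the strength of their tail fins to climb and leap upward in a spiral staircase, approaching the clouds, swimming toward their sky.

Even when the cloud mass is pressed low by the weight of its water, even when the sea is swept up toward the high sky by the waves, no fish can climb those hundreds of meters of water column. But in four billion years there is always a day, always that one time, when a lucky and sturdy fish, by extraordinary perseverance, by the experience and instinct written into its genes from everything every shoal before it learned from every current and every death, finally swims up into the sky.

And so a fish swims in the cloud, and all the wind and water that flows past its tail fin leaves the sky of that moment.

2.

In the culture of fish, the sky is a kind of ethereal sea; some fish never even realize in their whole lives that the sky exists.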

Read more »