Date: Fri, 13 Aug 2021 15:33:40 +0800 (CST)
From: youyan  <hyouyan@....com>
To: lkrg-users@...ts.openwall.com
Subject: Re:Re: Re:deadlock happen on
 p_rb_hash[i].p_lock.lock

Hi Adam,
   The deadlock is hard to reproduce: it takes dozens of machines and several weeks. The machine is also already in mass production, so I cannot switch to the new LKRG code before a full verification test.
   On my machine I captured the following function ftrace. Could you help me review it? Could this situation cause a deadlock: an earlier p_cmp_tasks call already holds the rwlock, and another CPU wants the same rwlock for writing? Thanks!
 
1)  awbctrl-3361  =>  kworker-3331 
 ------------------------------------------


 1)               |  p_cmp_tasks [sidkm]() {
 1)   ==========> |
 1)               |  gic_handle_irq() {
 1)               |    handle_IPI() {
 1)               |      irq_enter() {
 1)   0.808 us    |        rcu_irq_enter();
 1)   0.230 us    |        preempt_count_add();
 1)   6.307 us    |      }
 1)               |      __wake_up() {
 1)               |        __wake_up_common_lock() {
 1)               |          _raw_spin_lock_irqsave() {
 1)   0.539 us    |            preempt_count_add();
 1)   0.307 us    |            do_raw_spin_lock();
 1)   4.731 us    |          }
 1)               |          __wake_up_common() {
 1)               |            autoremove_wake_function() {
 1)               |              default_wake_function() {
 1)               |                try_to_wake_up() {
 1)               |                  _raw_spin_lock_irqsave() {
 1)   0.230 us    |                    preempt_count_add();
 1)   0.462 us    |                    do_raw_spin_lock();
 1)   4.461 us    |                  }
 1)               |                  select_task_rq_fair() {
 1)   0.231 us    |                    __rcu_read_lock();
 1)   0.270 us    |                    idle_cpu();
 1)   0.269 us    |                    target_load();
 1)   0.269 us    |                    source_load();
 1)   0.346 us    |                    task_h_load();
 1)   0.231 us    |                    idle_cpu();
 1)   0.385 us    |                    idle_cpu();
 1)   0.269 us    |                    idle_cpu();
 1)   0.385 us    |                    idle_cpu();
 1)   0.230 us    |                    __rcu_read_unlock();
 1)   0.230 us    |                    __rcu_read_lock();
 1)   0.230 us    |                    __rcu_read_unlock();
 1)   0.231 us    |                    nohz_balance_exit_idle();
 1) + 31.231 us   |                  }
 1)   0.308 us    |                  cpus_share_cache();
 1)               |                  _raw_spin_lock() {
 1)   0.230 us    |                    preempt_count_add();
 1)   0.231 us    |                    do_raw_spin_lock();
 1)   4.346 us    |                  }
 1)   0.423 us    |                  update_rq_clock();
 1)               |                  ttwu_do_activate() {
 1)               |                    activate_task() {
 1)               |                      psi_task_change() {
 1)   0.539 us    |                        record_times();
 1)   3.154 us    |                      }
 1)               |                      enqueue_task_fair() {
 1)               |                        update_curr() {
 1)   0.269 us    |                          update_min_vruntime();
 1)               |                          cpuacct_charge() {
 1)   0.577 us    |                            __rcu_read_lock();
 1)   0.231 us    |                            __rcu_read_unlock();
 1)   5.346 us    |                          }
 1)   9.885 us    |                        }
 1)   0.346 us    |                        __update_load_avg_se();
 1)   0.385 us    |                        __update_load_avg_cfs_rq();
 1)   0.231 us    |                        update_cfs_shares();
 1)   0.346 us    |                        account_entity_enqueue();
 1)   0.269 us    |                        check_spread();
 1)   0.231 us    |                        __rcu_read_lock();
 1)   0.231 us    |                        __rcu_read_unlock();
 1)   0.231 us    |                        hrtick_update();
 1) + 30.462 us   |                      }
 1) + 37.692 us   |                    }
 1)               |                    optimized_callback() {
 1)               |                      opt_pre_handler() {
 1)               |                        pre_handler_kretprobe() {
 1)               |                          _raw_spin_lock_irqsave() {
 1)   0.231 us    |                            preempt_count_add();
 1)   0.461 us    |                            do_raw_spin_lock();
 1)   4.693 us    |                          } /* _raw_spin_lock_irqsave */
 1)               |                          _raw_spin_unlock_irqrestore() {
 1)   0.307 us    |                            do_raw_spin_unlock();
 1)   0.270 us    |                            preempt_count_sub();
 1)   4.461 us    |                          }
 1)               |                          p_ttwu_do_wakeup_entry [sidkm]() {
 1)               |                            _raw_read_trylock() {
 1)   0.231 us    |                              preempt_count_add();
 1)   0.539 us    |                              do_raw_read_trylock();
 1)   4.769 us    |                            }
 1)               |                            p_ed_validate_from_running [sidkm]() {
 1)               |                              p_validate_task_from_running [sidkm]() {
 1)   0.231 us    |                                __rcu_read_lock();
 1)   0.538 us    |                                p_rb_find_ed_pid [sidkm]();
 1)               |                                p_cmp_tasks [sidkm]() {
 1)   0.577 us    |                                  p_ed_pcfi_validate_sp [sidkm]();
 1)               |                                  p_cmp_creds [sidkm]() {
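
To make the question concrete, here is a minimal sketch of the scenario in Python. It is a toy model, not LKRG or kernel code: ToyRwlock, its "fair" flag, and simulate() are names invented for illustration, and which policy the real rwlock on this kernel follows is not established here. It replays the order of events in the trace: an outer reader (the interrupted p_cmp_tasks), a writer waiting on another CPU, then an IPI on the reader's CPU whose handler wants the read lock again.

```python
from dataclasses import dataclass


@dataclass
class ToyRwlock:
    """Toy reader/writer lock state (hypothetical, not the kernel's rwlock_t)."""
    readers: int = 0
    writer_waiting: bool = False
    writer_holding: bool = False

    def read_trylock(self, fair: bool) -> bool:
        # An active writer always excludes readers. Under the assumed "fair"
        # policy, a waiting writer also turns new readers away.
        if self.writer_holding or (fair and self.writer_waiting):
            return False
        self.readers += 1
        return True

    def read_unlock(self) -> None:
        if self.readers <= 0:
            raise RuntimeError("read_unlock without a matching read lock")
        self.readers -= 1

    def write_trylock(self) -> bool:
        if self.readers or self.writer_holding:
            self.writer_waiting = True  # the writer keeps spinning
            return False
        self.writer_waiting = False
        self.writer_holding = True
        return True


def simulate(fair: bool, nested_blocks: bool) -> str:
    """Replay: outer reader, remote writer, then a nested reader in an IPI."""
    lock = ToyRwlock()
    lock.read_trylock(fair)   # outer p_cmp_tasks path holds the read lock
    lock.write_trylock()      # another CPU starts waiting to write
    # The IPI arrives on the reader's CPU; its handler wants the read lock too.
    if lock.read_trylock(fair):
        lock.read_unlock()    # nested validation ran and released the lock
    elif nested_blocks:
        # A blocking read lock would spin here, but the outer reader cannot
        # run until the handler returns, so the lock is never released.
        return "deadlock"
    # Otherwise the trylock failed and the handler skipped its check.
    lock.read_unlock()        # outer reader finishes
    return "writer proceeds" if lock.write_trylock() else "writer stuck"


if __name__ == "__main__":
    for fair in (False, True):
        for nested_blocks in (False, True):
            print(f"fair={fair} nested_blocks={nested_blocks}: "
                  f"{simulate(fair, nested_blocks)}")
```

In this model the nested read only hangs when it uses a blocking read lock on a lock that makes new readers wait behind a pending writer. A trylock, like the _raw_read_trylock in the trace, either succeeds or gives up, so on its own it cannot produce the hang in this toy setting.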

thanks and best regards
ethan

At 2021-07-16 01:39:45, "Adam Zabrocki" <pi3@....com.pl> wrote:
>Can you try LKRG from git TOT ?
>
>On Thu, Jul 15, 2021 at 08:52:49PM +0800, youyan wrote:
>> Hi all
>>      I am sorry, I did not notice that pictures cannot be displayed directly on the mailing list. Here is the same information in words:
>>   cpu0 and cpu1 wait for a lock that is held on cpu2.
>>   cpu2 waits for kretprobe_table_locks[hash].lock, which is held by cpu3.
>>   cpu3 waits for p_rb_hash[i].p_lock.lock.
>>   The value of p_rb_hash[i].p_lock.lock is 0x01, which means this lock is held through a read lock.
>>    
>> At 2021-07-15 20:20:50, "youyan" <hyouyan@....com> wrote:
>> 
>> Hi all
>>        I hit a deadlock: p_rb_hash[i].p_lock.lock is never unlocked. The LKRG version is 0.8, the software is Android 10, and the hardware is a Unisoc SL8541E.
>>  The following pictures are the TRACE32 call stacks and registers:
>>  1: cpu0
>>  2: cpu1
>>  3: cpu2
>>  4: cpu3
>> 
>>      From the above, I think that somewhere read_lock is taken on p_rb_hash[i].p_lock.lock and never released, or that some code after taking the lock may cause a schedule. Going through the LKRG code, I cannot find such a case.
>> Reproducing this issue takes at least two weeks.
>>     Has anybody met a similar issue?
>> 
>> 
>> thanks and best regards
>> ethan
>
>-- 
>pi3 (pi3ki31ny) - pi3 (at) itsec pl
>http://pi3.com.pl
