Interrupt is caught in kernel, but not handled

Post by ranshalit » 2019/11/29 06:14:09

Hello,

I installed CentOS 8 and am using uio_pci_generic to communicate with a Xilinx FPGA over PCIe from userspace.

The CPU is an Atom.

I first register the device ID with the driver:
echo "10ee 0007" > /sys/bus/pci/drivers/uio_pci_generic/new_id

I use a userspace application that waits for the interrupt, as described in the code example here:
https://www.kernel.org/doc/html/v4.1...uio-howto.html
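
A minimal sketch of the wait pattern from that howto example (not the exact application code; the helper name `wait_one_irq` is made up for illustration). It assumes one descriptor opened on the UIO device node and one opened read/write on the device's sysfs config file:

```c
#define _XOPEN_SOURCE 700
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: re-enable INTx, then block until the next interrupt.
 * uiofd    - open fd for the UIO device node (e.g. /dev/uio0)
 * configfd - open read/write fd for the device's sysfs config file
 * Returns 0 and stores the event count on success, -1 on error. */
static int wait_one_irq(int uiofd, int configfd, uint32_t *count)
{
    unsigned char command_high;

    /* Byte 5 of config space is the high byte of the Command register;
     * bit 2 of that byte is Interrupt Disable (bit 10 of the register). */
    if (pread(configfd, &command_high, 1, 5) != 1)
        return -1;
    command_high &= (unsigned char)~0x4;

    /* uio_pci_generic masks INTx in its handler, so unmask before waiting. */
    if (pwrite(configfd, &command_high, 1, 5) != 1)
        return -1;

    /* read() blocks until the UIO core signals an interrupt, then returns
     * the 4-byte total event count. */
    if (read(uiofd, count, sizeof(*count)) != (ssize_t)sizeof(*count))
        return -1;
    return 0;
}
```

The application calls this in a loop and prints the count after each return. If the kernel handler never returns IRQ_HANDLED, the read() never completes, which would match the missing output.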

I then trigger an interrupt from the FPGA, but the userspace application prints nothing and the kernel logs the following trace:

irq 23: nobody cared (try booting with the "irqpoll" option)
[   91.030760] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.18.16 #6
[   91.037037] Hardware name:  /conga-MA5, BIOS MA50R000 10/30/2019
[   91.043302] Call Trace:
[   91.045881]  <IRQ>
[   91.048002]  dump_stack+0x5c/0x80
[   91.051464]  __report_bad_irq+0x35/0xaf
[   91.055465]  note_interrupt.cold.9+0xa/0x63
[   91.059823]  handle_irq_event_percpu+0x68/0x70
[   91.064470]  handle_irq_event+0x37/0x57
[   91.068481]  handle_fasteoi_irq+0x97/0x150
[   91.072758]  handle_irq+0x1a/0x30
[   91.076230]  do_IRQ+0x44/0xd0
[   91.079333]  common_interrupt+0xf/0xf
[   91.083154]  </IRQ>
[   91.085360] RIP: 0010:cpuidle_enter_state+0x7d/0x220
[   91.090563] Code: e8 18 1a 45 00 41 89 c4 e8 d0 50 b1 ff 65 8b 3d d9 db e5 5f e8 44 4f b1 ff 31 ff 48 89 c3 e8 ea 61 b1 ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 4c 29 eb ba ff ff ff 7f 48 89 d9 48
[   91.110283] RSP: 0018:ffffb20e806b7ea8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffffda
[   91.118203] RAX: ffff90133fd214c0 RBX: 00000015316faa09 RCX: 000000000000001f
[   91.125662] RDX: 00000015316faa09 RSI: 0000000000000000 RDI: 0000000000000000
[   91.133133] RBP: ffff90133fd2b300 R08: 0000011741eb842c R09: 0000000000000006
[   91.140605] R10: 00000000ffffffff R11: ffff90133fd205a8 R12: 0000000000000001
[   91.148068] R13: 00000015316f98b9 R14: 0000000000000001 R15: 0000000000000000
[   91.155523]  ? cpuidle_enter_state+0x76/0x220
[   91.160088]  do_idle+0x221/0x260
[   91.163470]  cpu_startup_entry+0x6a/0x70
[   91.167588]  start_secondary+0x1a4/0x1f0
[   91.171676]  secondary_startup_64+0xb7/0xc0
[   91.176043] handlers:
[   91.178419] [<00000000ec05b056>] uio_interrupt
[   91.183054] Disabling IRQ #23

So I started debugging this in the kernel's uio_pci_generic.c. According to the prints below, the IRQ is caught but not delivered to userspace:

static irqreturn_t irqhandler(int irq, struct uio_info *info)
{
        struct uio_pci_generic_dev *gdev = to_uio_pci_generic_dev(info);

        printk("i'm here 1\n");  /* this is printed */
        if (!pci_check_and_mask_intx(gdev->pdev))
                return IRQ_NONE;
        printk("i'm here 2\n");  /* but this is NOT printed */
        /* UIO core will signal the user process. */
        return IRQ_HANDLED;
}

From the documentation, pci_check_and_mask_intx checks whether the Interrupt Status bit is set in the device's PCI configuration space. Since it appears to return 0, it must be finding that this bit is not set.
But how can the IRQ fire while the status bit is not set in configuration space?
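
To make that check concrete, here is a rough sketch of the bit test as I understand it, not the kernel's actual code. The helper names `intx_pending` and `intx_masked` are hypothetical. The bit positions are the standard PCI ones: Interrupt Status is bit 3 of the Status register, Interrupt Disable is bit 10 of the Command register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard PCI bit values (the kernel names them PCI_COMMAND_INTX_DISABLE
 * and PCI_STATUS_INTERRUPT). */
#define CMD_INTX_DISABLE 0x0400u /* Command register, bit 10 */
#define STATUS_INTERRUPT 0x0008u /* Status register, bit 3 */

/* Hypothetical helpers. cmd_status is the 32-bit value at config offset
 * 0x04: Command in the low 16 bits, Status in the high 16 bits. */
static bool intx_pending(uint32_t cmd_status)
{
    uint16_t status = (uint16_t)(cmd_status >> 16);
    return (status & STATUS_INTERRUPT) != 0;
}

static bool intx_masked(uint32_t cmd_status)
{
    uint16_t command = (uint16_t)(cmd_status & 0xffffu);
    return (command & CMD_INTX_DISABLE) != 0;
}
```

For example, a device reporting only Cap+ in Status and Mem+ in Command reads back as 0x00100002. For that value `intx_pending` is false, which is the case where the handler returns IRQ_NONE.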

The device appears as follows in lspci -vv:

02:00.0 RAM memory: Xilinx Corporation Default PCIe endpoint ID
        Subsystem: Xilinx Corporation Default PCIe endpoint ID
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 23
        Region 0: Memory at 91200000 (32-bit, non-prefetchable) [size=1M]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [58] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 1, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s unlimited
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [100 v1] Device Serial Number 00-00-00-00-00-00-00-00
        Kernel driver in use: uio_pci_generic

I also verified that no other device uses IRQ 23; our device is the only one on it.

How can the IRQ be caught but not delivered to userspace?

Thank you for any idea,
ranran
