OpenCL on BeagleBoard X15 #102

Closed
gtortone opened this issue Sep 21, 2016 · 4 comments

Comments

@gtortone

Hi,
I'm trying to use OpenCL on the AM57xx-EVM (BeagleBoard X15) and with a recent kernel (e.g. 4.4.21-ti-r45) it does not work; after loading the cmemk module, every OpenCL example fails with:

root@arm:/usr/share/ti/examples/opencl/simple# ./simple
Unable to allocate OCL MSMC memory from 0x40500000

(With kernel 4.4.21-ti-r45, U-Boot loads the am57xx-beagle-x15-revb1 device tree, but I don't know if this is correct.)

Reading the BeagleBoard X15 wiki, I realized that OpenCL needs Linux kernel 4.1.10-ti-r23/4.1.10-ti-rt-r23 or greater from the v4.1.x-ti branch, so I installed 4.1.10-ti-r23. U-Boot did not find the am57xx-beagle-x15-revb1 device tree, so I forced U-Boot to load am57xx-beagle-x15.dtb (I don't know if this is correct).
With that, the OpenCL examples work, but sometimes I find this in dmesg:

[  453.129412] ------------[ cut here ]------------
[  453.129435] WARNING: CPU: 0 PID: 6622 at drivers/bus/omap_l3_noc.c:147 l3_interrupt_handler+0x284/0x39c()
[  453.129444] 44000000.ocp:L3 Custom Error: MASTER MPU TARGET DMM_P1 (Read): Data Access in User mode during Functional access
[  453.129450] Modules linked in: rpmsg_proto cmemk(O) 8021q garp mrp stp llc rpmsg_rpc virtio_rpmsg_bus snd_soc_simple_card ti_vpe videobuf2_dma_contig ti_vpdma v4l2_mem2mem snd_soc_omap_hdmi_audio videobuf2_memops pruss_remoteproc snd_soc_tlv320aic3x videobuf2_core v4l2_common videodev media snd_soc_davinci_mcasp snd_soc_edma omap_rng rng_core omap_remoteproc uio_pdrv_genirq uio bnep bluetooth rfkill
[  453.129554] CPU: 0 PID: 6622 Comm: dsplib_fft Tainted: G        W  O    4.1.10-ti-r23 #1
[  453.129560] Hardware name: Generic DRA74X (Flattened Device Tree)
[  453.129578] [<c001a180>] (unwind_backtrace) from [<c0014b38>] (show_stack+0x20/0x24)
[  453.129590] [<c0014b38>] (show_stack) from [<c09b54ec>] (dump_stack+0x8c/0xcc)
[  453.129603] [<c09b54ec>] (dump_stack) from [<c00479c8>] (warn_slowpath_common+0x98/0xc8)
[  453.129613] [<c00479c8>] (warn_slowpath_common) from [<c0047a38>] (warn_slowpath_fmt+0x40/0x48)
[  453.129623] [<c0047a38>] (warn_slowpath_fmt) from [<c05471f8>] (l3_interrupt_handler+0x284/0x39c)
[  453.129638] [<c05471f8>] (l3_interrupt_handler) from [<c009f1e8>] (handle_irq_event_percpu+0xb0/0x254)
[  453.129649] [<c009f1e8>] (handle_irq_event_percpu) from [<c009f3d8>] (handle_irq_event+0x4c/0x6c)
[  453.129658] [<c009f3d8>] (handle_irq_event) from [<c00a23c8>] (handle_fasteoi_irq+0xf0/0x1ac)
[  453.129668] [<c00a23c8>] (handle_fasteoi_irq) from [<c009e70c>] (generic_handle_irq+0x3c/0x4c)
[  453.129677] [<c009e70c>] (generic_handle_irq) from [<c009ea38>] (__handle_domain_irq+0x8c/0xfc)
[  453.129686] [<c009ea38>] (__handle_domain_irq) from [<c000952c>] (gic_handle_irq+0x34/0x6c)
[  453.129697] [<c000952c>] (gic_handle_irq) from [<c09bae40>] (__irq_svc+0x40/0x74)
[  453.129703] Exception stack(0xedcf5c68 to 0xedcf5cb0)
[  453.129710] 5c60:                   000000ff eb4a265f 00000000 000007ff ffeff000 00000000
[  453.129718] 5c80: c0025144 00080000 eb782140 ecde1b88 00000207 edcf5ccc 000eb4a2 edcf5cb0
[  453.129724] 5ca0: c0004000 c00244d4 20070013 ffffffff
[  453.129736] [<c09bae40>] (__irq_svc) from [<c00244d4>] (kmap_atomic+0x110/0x15c)
[  453.129745] [<c00244d4>] (kmap_atomic) from [<c0022110>] (__flush_dcache_page+0xa0/0x108)
[  453.129755] [<c0022110>] (__flush_dcache_page) from [<c00221f0>] (flush_dcache_page+0x78/0x7c)
[  453.129767] [<c00221f0>] (flush_dcache_page) from [<c0173150>] (__get_user_pages+0x408/0x4d0)
[  453.129779] [<c0173150>] (__get_user_pages) from [<c0173548>] (get_user_pages_unlocked+0x13c/0x1c8)
[  453.129791] [<c0173548>] (get_user_pages_unlocked) from [<c0166ef4>] (get_user_pages_fast+0x50/0x58)
[  453.129802] [<c0166ef4>] (get_user_pages_fast) from [<c00c66e0>] (get_futex_key+0xdc/0x244)
[  453.129811] [<c00c66e0>] (get_futex_key) from [<c00c6928>] (futex_wake+0x40/0x148)
[  453.129819] [<c00c6928>] (futex_wake) from [<c00c8af8>] (do_futex+0x140/0xa74)
[  453.129827] [<c00c8af8>] (do_futex) from [<c00c94b8>] (SyS_futex+0x8c/0x178)
[  453.129836] [<c00c94b8>] (SyS_futex) from [<c0045334>] (mm_release+0xf0/0x110)
[  453.129845] [<c0045334>] (mm_release) from [<c004a3e4>] (do_exit+0x16c/0x9e8)
[  453.129855] [<c004a3e4>] (do_exit) from [<c004acac>] (do_group_exit+0x0/0xd4)
[  453.129861] ---[ end trace baa5f7cd7094a3af ]---

Is there a known working combination of kernel release and device tree that runs OpenCL without any issues?

Thanks a lot
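
For anyone reproducing this, a minimal sketch for recording which kernel release and device tree model a board is actually running, since both were changed above. It assumes a Linux system that exposes `/proc/device-tree/model`, and it reads the wiki requirement quoted above as "on the v4.1.x-ti branch, at 4.1.10-ti-r23 or later". The helper names are illustrative and not part of any TI or BeagleBoard tool.

```python
import platform
import re
from pathlib import Path

def parse_ti_release(release):
    """Split a release such as '4.4.21-ti-r45' into (4, 4, 21, 45); None if it does not match."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-ti(?:-rt)?-r(\d+)", release)
    return tuple(int(g) for g in m.groups()) if m else None

def meets_wiki_minimum(release):
    """True if the release is on the v4.1.x-ti branch at 4.1.10-ti-r23 or later (assumed reading of the wiki)."""
    parsed = parse_ti_release(release)
    if parsed is None or parsed[:2] != (4, 1):
        return False
    return parsed >= (4, 1, 10, 23)

def device_tree_model(path="/proc/device-tree/model"):
    """Return the model string of the loaded device tree, or None if it cannot be read."""
    try:
        return Path(path).read_bytes().rstrip(b"\x00").decode(errors="replace")
    except OSError:
        return None

release = platform.release()
print("kernel:", release, "| meets wiki minimum:", meets_wiki_minimum(release))
print("device tree model:", device_tree_model())
```

Posting these two lines together with the failing output makes it easier to tell whether a problem follows the kernel or the device tree.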

@RobertCNelson
Member

@gtortone sorry, I haven't worked on this issue this week. I've been working on work-arounds for the bbgw wlan0 fun-ness.

For v4.1.x, yes, OpenCL dumped a lot of messages to the kernel log; you can ignore them. For the v4.1.x -> v4.4.x transition some things are still broken, and when I tested the last TI SDK for the x15, the OpenCL speedup was only 0.8x, versus the 3x to 4x we saw in v4.1.x.

I've noticed quite a few OpenCL updates coming across the TI mailing list, so it might be working again. (I just need to find a free moment.)

Regards,

@RobertCNelson
Member

@gtortone there's a new build:

http://arago-project.org/git/projects/?p=oe-layersetup.git;a=blob;f=configs/processor-sdk/processor-sdk-03.00.00.04-config.txt;hb=HEAD

Can you dl and test:

http://software-dl.ti.com/processor-sdk-linux/esd/AM57X/latest/index_FDS.html

Run the opencl-mpp OpenCL demo; it compares the dual A15s against the DSPs.

If it's back in the 2x/3x speedup range, I'll start porting the stuff over again.

Regards,

@Shim-Apan

Shim-Apan commented Oct 3, 2016

@RobertCNelson

Hi, I am wondering the same thing.
I've downloaded the newest Processor SDK and installed it. However, I could not find the opencl-mpp demo. I did, however, run the examples from the OpenCL example folder (dgemm and matmpy).

Their results are:

root@am57xx-evm:/usr/share/ti/examples/opencl/dgemm# ./dgemm
Generating random data ... Done. Starting Dgemm.
cMaj C[2048,2048] = A[2048,2048] * B[2048,2048]:
2 DSPs: 3.641 Gflops (4.718682s)
1 CPU : 2.217 Gflops (7.747983s) with ATLAS library
PASS!

and

root@am57xx-evm:/usr/share/ti/examples/opencl/matmpy# ./matmpy
float C[512][512] = float A[512][512] x float B[512][512]
OpenCL dispatching to 1 DSP(S): 0.2874 secs
OpenMP dispatching to 4 CPU(S): 0.2582 secs
Passed!

I'm not sure whether this shows an increase or a decrease in speed compared to the CPU. dgemm supports your findings of a 0.8x speedup, but I'm not sure what to make of matmpy's result.

-Shim

edit: formatting
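
As a rough cross-check of the two results above, the speedup can be computed as CPU time divided by DSP time. This is only a sketch over the figures quoted in the output; the variable and function names are illustrative.

```python
# Timings in seconds, copied from the dgemm and matmpy output quoted above.
dgemm_dsp_s, dgemm_cpu_s = 4.718682, 7.747983    # 2 DSPs vs 1 CPU (ATLAS)
matmpy_dsp_s, matmpy_cpu_s = 0.2874, 0.2582      # 1 DSP (OpenCL) vs 4 CPUs (OpenMP), as printed

def speedup(cpu_seconds, dsp_seconds):
    """DSP speedup over the CPU: above 1.0 means the DSP run finished sooner."""
    return cpu_seconds / dsp_seconds

print(f"dgemm : {speedup(dgemm_cpu_s, dgemm_dsp_s):.2f}x")     # 1.64x
print(f"matmpy: {speedup(matmpy_cpu_s, matmpy_dsp_s):.2f}x")   # 0.90x
```

By this measure dgemm comes out at about 1.64x in favour of the DSPs and matmpy at about 0.90x, so it is matmpy rather than dgemm that lands near the 0.8x figure. The two examples also pit different numbers of DSPs and CPU cores against each other, so the ratios are not directly comparable.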

crow-misia pushed a commit to crow-misia/linux that referenced this issue Jun 26, 2019
commit 5cdb0ef upstream.

In case USB disconnect happens at the moment transmitting workqueue is in
progress the underlying interface may be gone causing a NULL pointer
dereference. Add synchronization of the workqueue destruction with the
detach implementation in core so that the transmitting workqueue is stopped
during detach before the interfaces are removed.

Fix following Oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = 9e6a802d
[00000008] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in: nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle
xt_connmark xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
iptable_filter ip_tables x_tables usb_f_mass_storage usb_f_rndis u_ether
usb_serial_simple usbserial cdc_acm brcmfmac brcmutil smsc95xx usbnet
ci_hdrc_imx ci_hdrc ulpi usbmisc_imx 8250_exar 8250_pci 8250 8250_base
libcomposite configfs udc_core
CPU: 0 PID: 7 Comm: kworker/u8:0 Not tainted 4.19.23-00076-g03740aa-dirty #102
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: brcmf_fws_wq brcmf_fws_dequeue_worker [brcmfmac]
PC is at brcmf_txfinalize+0x34/0x90 [brcmfmac]
LR is at brcmf_fws_dequeue_worker+0x218/0x33c [brcmfmac]
pc : [<7f0dee64>]    lr : [<7f0e4140>]    psr: 60010093
sp : ee8abef0  ip : 00000000  fp : edf38000
r10: ffffffed  r9 : edf38970  r8 : edf3800
r7 : edf3e970  r6 : 00000000  r5 : ede69000  r4 : 00000000
r3 : 00000a97  r2 : 00000000  r1 : 0000888e  r0 : ede69000
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 7d03c04a  DAC: 00000051
Process kworker/u8:0 (pid: 7, stack limit = 0x24ec3e04)
Stack: (0xee8abef0 to 0xee8ac000)
bee0:                                     ede69000 00000000 ed56c3e0 7f0e4140
bf00: 00000001 00000000 edf3800 edf3e99c ed56c3e0 80d03d00 edfea43a edf3e970
bf20: ee809880 ee804200 ee971100 00000000 edf3e974 00000000 ee804200 80135a70
bf40: 80d03d00 ee804218 ee809880 ee809894 ee804200 80d03d00 ee804218 ee8aa000
bf60: 00000088 80135d5c 00000000 ee829f00 ee829dc0 00000000 ee809880 80135d30
bf80: ee829f1c ee873eac 00000000 8013b1a0 ee829dc0 8013b07c 00000000 00000000
bfa0: 00000000 00000000 00000000 801010e8 00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<7f0dee64>] (brcmf_txfinalize [brcmfmac]) from [<7f0e4140>] (brcmf_fws_dequeue_worker+0x218/0x33c [brcmfmac])
[<7f0e4140>] (brcmf_fws_dequeue_worker [brcmfmac]) from [<80135a70>] (process_one_work+0x138/0x3f8)
[<80135a70>] (process_one_work) from [<80135d5c>] (worker_thread+0x2c/0x554)
[<80135d5c>] (worker_thread) from [<8013b1a0>] (kthread+0x124/0x154)
[<8013b1a0>] (kthread) from [<801010e8>] (ret_from_fork+0x14/0x2c)
Exception stack(0xee8abfb0 to 0xee8abff8)
bfa0:                                     00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
Code: e1530001 0a000007 e3560000 e1a00005 (05942008)
---[ end trace 079239dd31c86e90 ]---

Signed-off-by: Piotr Figiel <p.figiel@camlintechnologies.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
@pdp7
Contributor

pdp7 commented Jun 10, 2020

Please re-open if still an issue with our current Debian images:
https://github.com/beagleboard/Latest-Images

@pdp7 pdp7 closed this as completed Jun 10, 2020