aboutsummaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'next' of ↵Michael Ellerman2016-03-14146-4699/+5439
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Freescale updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt bits, and minor fixes/cleanup."
| * powerpc/fsl/dts: Add "jedec,spi-nor" flash compatibleHou Zhiqiang2016-03-1134-43/+43
| | | | | | | | | | | | | | | | | | | | | | | | Starting with commit <8947e396a829> ("Documentation: dt: mtd: replace "nor-jedec" binding with "jedec, spi-nor"") we have "jedec,spi-nor" binding indicating support for JEDEC identification. Use it for all flashes that are supposed to support READ ID op according to the datasheets. Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@freescale.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/T104xRDB: add tdm riser card node to device treeZhao Qiang2016-03-111-0/+5
| | | | | | | | | | Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: PAGE_EXEC required for inittextChristophe Leroy2016-03-111-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PAGE_EXEC is required for inittext, otherwise CONFIG_DEBUG_PAGEALLOC ends up with an Oops [ 0.000000] Inode-cache hash table entries: 8192 (order: 1, 32768 bytes) [ 0.000000] Sorting __ex_table... [ 0.000000] bootmem::free_all_bootmem_core nid=0 start=0 end=2000 [ 0.000000] Unable to handle kernel paging request for instruction fetch [ 0.000000] Faulting instruction address: 0xc045b970 [ 0.000000] Oops: Kernel access of bad area, sig: 11 [#1] [ 0.000000] PREEMPT DEBUG_PAGEALLOC CMPC885 [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.25-local-dirty #1673 [ 0.000000] task: c04d83d0 ti: c04f8000 task.ti: c04f8000 [ 0.000000] NIP: c045b970 LR: c045b970 CTR: 0000000a [ 0.000000] REGS: c04f9ea0 TRAP: 0400 Not tainted (3.18.25-local-dirty) [ 0.000000] MSR: 08001032 <ME,IR,DR,RI> CR: 39955d35 XER: a000ff40 [ 0.000000] GPR00: c045b970 c04f9f50 c04d83d0 00000000 ffffffff c04dcdf4 00000048 c04f6b10 GPR08: c04f6ab0 00000001 c0563488 c04f6ab0 c04f8000 00000000 00000000 b6db6db7 GPR16: 00003474 00000180 00002000 c7fec000 00000000 000003ff 00000176 c0415014 GPR24: c0471018 c0414ee8 c05304e8 c03aeaac c0510000 c0471018 c0471010 00000000 [ 0.000000] NIP [c045b970] free_all_bootmem+0x164/0x228 [ 0.000000] LR [c045b970] free_all_bootmem+0x164/0x228 [ 0.000000] Call Trace: [ 0.000000] [c04f9f50] [c045b970] free_all_bootmem+0x164/0x228 (unreliable) [ 0.000000] [c04f9fa0] [c0454044] mem_init+0x3c/0xd0 [ 0.000000] [c04f9fb0] [c045080c] start_kernel+0x1f4/0x390 [ 0.000000] [c04f9ff0] [c0002214] start_here+0x38/0x98 [ 0.000000] Instruction dump: [ 0.000000] 2f150000 7f968840 72a90001 3ad60001 56b5f87e 419a0028 419e0024 41a20018 [ 0.000000] 807cc20c 38800000 7c638214 4bffd2f5 <3a940001> 3a100024 4bffffc8 7e368b78 [ 0.000000] ---[ end trace dc8fa200cb88537f ]--- Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/mpc85xx: Add pcsphy nodes to FManV3 device treeIgal Liberman2016-03-1118-0/+90
| | | | | | | | | | | | | | This patch adds pcsphy node to FManV3 device tree. Signed-off-by: Igal Liberman <igal.liberman@freescale.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)Igal Liberman2016-03-1119-19/+2198
| | | | | | | | | | | | | | | | | | | | | | Describe the PHY topology for all configurations supported by each board Based on prior work by Andy Fleming <afleming@freescale.com> Signed-off-by: Shruti Kanetkar <Shruti@freescale.com> Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Igal Liberman <Igal.Liberman@freescale.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/86xx: Introduce and use common dtsiAlessio Igor Bogani2016-03-118-1443/+394
| | | | | | | | | | Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/86xx: Update device treeAlessio Igor Bogani2016-03-116-288/+169
| | | | | | | | | | | | | | | | Avoid duplication of the interrupt-parent, migrate to 4 interrupt-cells and set the right clock-frequency for pcie (100 Mhz). Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/86xx: Move dts files to fsl directoryAlessio Igor Bogani2016-03-116-0/+0
| | | | | | | | | | Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/86xx: Switch to kconfig fragments approachAlessio Igor Bogani2016-03-1111-1563/+126
| | | | | | | | | | Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/86xx: Update defconfigsAlessio Igor Bogani2016-03-116-547/+734
| | | | | | | | | | | | | | | | This patch show how defconfigs appear if the kconfig fragment approach is used. Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/86xx: Consolidate common platform codeAlessio Igor Bogani2016-03-119-163/+53
| | | | | | | | | | Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: Remove one insn in mulhduChristophe Leroy2016-03-111-6/+5
| | | | | | | | | | | | | | Remove one instruction in mulhdu Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: small optimisation in flush_icache_range()Christophe Leroy2016-03-111-3/+2
| | | | | | | | | | | | | | | | Inlining of _dcache_range() functions has shown that the compiler does the same thing a bit better with one insn less Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc: Simplify test in __dma_sync()Christophe Leroy2016-03-111-1/+1
| | | | | | | | | | | | | | | | This simplification helps the compiler. We now have only one test instead of two, so it reduces the number of branches. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: move xxxxx_dcache_range() functions inlineChristophe Leroy2016-03-113-68/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | flush/clean/invalidate _dcache_range() functions are all very similar and are quite short. They are mainly used in __dma_sync() perf_event locate them in the top 3 consumming functions during heavy ethernet activity They are good candidate for inlining, as __dma_sync() does almost nothing but calling them Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: Remove clear_pages() and define clear_page() inlineChristophe Leroy2016-03-113-20/+14
| | | | | | | | | | | | | | | | | | | | | | | | clear_pages() is never used expect by clear_page, and PPC32 is the only architecture (still) having this function. Neither PPC64 nor any other architecture has it. This patch removes clear_pages() and moves clear_page() function inline (same as PPC64) as it only is a few isns Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc: add inline functions for cache related instructionsChristophe Leroy2016-03-111-0/+19
| | | | | | | | | | | | | | | | This patch adds inline functions to use dcbz, dcbi, dcbf, dcbst from C functions Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/8xx: rewrite flush_instruction_cache() in CChristophe Leroy2016-03-112-6/+11
| | | | | | | | | | | | | | | | | | | | On PPC8xx, flushing instruction cache is performed by writing in register SPRN_IC_CST. This registers suffers CPU6 ERRATA. The patch rewrites the fonction in C so that CPU6 ERRATA will be handled transparently Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/8xx: rewrite set_context() in CChristophe Leroy2016-03-112-44/+34
| | | | | | | | | | | | | | | | | | There is no real need to have set_context() in assembly. Now that we have mtspr() handling CPU6 ERRATA directly, we can rewrite set_context() in C language for easier maintenance. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/8xx: remove special handling of CPU6 errata in set_dec()Christophe Leroy2016-03-112-23/+1
| | | | | | | | | | | | | | | | CPU6 ERRATA is now handled directly in mtspr(), so we can use the standard set_dec() fonction in all cases. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macroChristophe Leroy2016-03-112-0/+84
| | | | | | | | | | | | | | | | | | MPC8xx has an ERRATA on the use of mtspr() for some registers This patch includes the ERRATA handling directly into mtspr() macro so that mtspr() users don't need to bother about that errata Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/8xx: Add missing SPRN defines into reg_8xx.hChristophe Leroy2016-03-112-2/+13
| | | | | | | | | | | | | | | | | | | | | | Add missing SPRN defines into reg_8xx.h Some of them are defined in mmu-8xx.h, so we include mmu-8xx.h in reg_8xx.h, for that we remove references to PAGE_SHIFT in mmu-8xx.h to have it self sufficient, as includers of reg_8xx.h don't all include asm/page.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: remove ioremap_baseChristophe Leroy2016-03-114-14/+2
| | | | | | | | | | | | | | ioremap_base is not initialised and is nowhere used so remove it Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: Remove useless/wrong MMU:setio progress messageChristophe Leroy2016-03-111-4/+0
| | | | | | | | | | | | | | | | | | Commit 771168494719 ("[POWERPC] Remove unused machine call outs") removed the call to setup_io_mappings(), so remove the associated progress line message Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() togetherChristophe Leroy2016-03-114-42/+22
| | | | | | | | | | | | | | | | | | | | x_mapped_by_bats() and x_mapped_by_tlbcam() serve the same kind of purpose, and are never defined at the same time. So rename them x_block_mapped() and define them in the relevant places Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: Fix pte_offset_kernel() to return NULL for bad pagesChristophe Leroy2016-03-111-1/+2
| | | | | | | | | | | | | | | | | | | | The fixmap related functions try to map kernel pages that are already mapped through Large TLBs. pte_offset_kernel() has to return NULL for LTLBs, otherwise the caller will try to access level 2 table which doesn't exist Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.cChristophe Leroy2016-03-112-19/+17
| | | | | | | | | | | | | | | | Now we have a 8xx specific .c file for that so put it in there as other powerpc variants do Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/8xx: Map linear kernel RAM with 8M pagesChristophe Leroy2016-03-114-14/+120
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a live running system (VoIP gateway for Air Trafic Control), over a 10 minutes period (with 277s idle), we get 87 millions DTLB misses and approximatly 35 secondes are spent in DTLB handler. This represents 5.8% of the overall time and even 10.8% of the non-idle time. Among those 87 millions DTLB misses, 15% are on user addresses and 85% are on kernel addresses. And within the kernel addresses, 93% are on addresses from the linear address space and only 7% are on addresses from the virtual address space. MPC8xx has no BATs but it has 8Mb page size. This patch implements mapping of kernel RAM using 8Mb pages, on the same model as what is done on the 40x. In 4k pages mode, each PGD entry maps a 4Mb area: we map every two entries to the same 8Mb physical page. In each second entry, we add 4Mb to the page physical address to ease life of the FixupDAR routine. This is just ignored by HW. In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry will point to the first page of the area. The DTLB handler adds the 3 bits from EPN to map the correct page. With this patch applied, we now get only 13 millions TLB misses during the 10 minutes period. The idle time has increased to 313s and the overall time spent in DTLB miss handler is 6.3s, which represents 1% of the overall time and 2.2% of non-idle time. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/8xx: Save r3 all the time in DTLB miss handlerChristophe Leroy2016-03-111-9/+4
| | | | | | | | | | | | | | | | | | | | | | We are spending between 40 and 160 cycles with a mean of 65 cycles in the DTLB handling routine (measured with mftbl) so make it more simple althought it adds one instruction. With this modification, we get three registers available at all time, which will help with following patch. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/p5040: Add device node for RAID EngineXuelin Shi2016-03-092-0/+7
| | | | | | | | | | | | | | | | add the missing RAID Engine device node for p5040. otherwise, the device can not be detected. Signed-off-by: Xuelin Shi <xuelin.shi@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc: optimise csum_partial() call when len is constantChristophe Leroy2016-03-094-28/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | csum_partial is often called for small fixed length packets for which it is suboptimal to use the generic csum_partial() function. For instance, in my configuration, I got: * One place calling it with constant len 4 * Seven places calling it with constant len 8 * Three places calling it with constant len 14 * One place calling it with constant len 20 * One place calling it with constant len 24 * One place calling it with constant len 32 This patch renames csum_partial() to __csum_partial() and implements csum_partial() as a wrapper inline function which * uses csum_add() for small 16bits multiple constant length * uses ip_fast_csum() for other 32bits multiple constant * uses __csum_partial() in all other cases Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/fsl-lbc: Modify suspend/resume entry sequenceRaghav Dogra2016-03-091-11/+38
| | | | | | | | | | | | | | | | | | | | Modify platform driver suspend/resume to syscore suspend/resume. This is because p1022ds needs to use localbus when entering the PCIE resume. Signed-off-by: Raghav Dogra <raghav.dogra@nxp.com> [scottwood: dropped makefile churn] Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/8xx: CONFIG_DEBUG_PAGEALLOC requires ITLBmiss for kernel addressesChristophe Leroy2016-03-091-1/+1
| | | | | | | | | | | | | | | | | | When CONFIG_DEBUG_PAGEALLOC is activated, the initial TLB mapping gets flushed to track accesses to wrong areas. Therefore, kernel addresses will also generate ITLB misses. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/885: set SDCR to 0x40Christophe Leroy2016-03-091-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | The MPC885 reference manual says that SDCR shall have value 0x40, but most exemples set SDCR to 0x1 With 0x1 in SDCR, we observe TX underruns on SCC when using it in QMC mode. According the NXP technical support, this is a copy/paste error from MPC860 reference manual, 0x40 being the only value supported by the MPC885 HW. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/86xx: disable IDE subsystem in mpc8610_hpcd_defconfigBartlomiej Zolnierkiewicz2016-03-091-1/+0
| | | | | | | | | | | | | | | | | | This patch disables deprecated IDE subsystem in mpc8610_hpcd_defconfig (no IDE host drivers are selected in this config so there is no valid reason to enable IDE subsystem itself). Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/85xx: disable IDE subsystem in stx_gp3_defconfigBartlomiej Zolnierkiewicz2016-03-091-2/+0
| | | | | | | | | | | | | | | | | | | | | | This patch disables deprecated IDE subsystem in stx_gp3_defconfig (no IDE host drivers are selected in this config so there is no valid reason to enable IDE subsystem itself). Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/85xx: disable IDE subsystem in ksi8560_defconfigBartlomiej Zolnierkiewicz2016-03-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | This patch disables deprecated IDE subsystem in ksi8560_defconfig (no IDE host drivers are selected in this config so there is no valid reason to enable IDE subsystem itself). Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/83xx: disable IDE subsystem in mpc834x_itx_defconfigBartlomiej Zolnierkiewicz2016-03-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | This patch disables deprecated IDE subsystem in mpc834x_itx_defconfig (no IDE host drivers are selected in this config so there is no valid reason to enable IDE subsystem itself). Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/mpc85xx: Add CPU hotplug support for E6500chenhui zhao2016-03-044-31/+124
| | | | | | | | | | | | | | Support Freescale E6500 core-based platforms, like t4240. Support disabling/enabling individual CPU thread dynamically. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>
| * powerpc/mpc85xx: Add hotplug support on E5500 and E500MC coreschenhui zhao2016-03-044-90/+118
| | | | | | | | | | | | | | | | | | | | | | Freescale E500MC and E5500 core-based platforms, like P4080, T1040, support disabling/enabling CPU dynamically. This patch adds this feature on those platforms. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com> Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com> [scottwood: removed unused pr_fmt] Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/mpc85xx: refactor the PM operationschenhui zhao2016-03-044-54/+127
| | | | | | | | | | | | | | | | | | | | | | Freescale CoreNet-based and Non-CoreNet-based platforms require different PM operations. This patch extracted existing PM operations on Non-CoreNet-based platforms to a new file which can accommodate both platforms. In this way, PM operation codes are clearer structurally. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com> Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/rcpm: add RCPM driverchenhui zhao2016-03-048-0/+466
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a RCPM (Run Control/Power Management) in Freescale QorIQ series processors. The device performs tasks associated with device run control and power management. The driver implements some features: mask/unmask irq, enter/exit low power states, freeze time base, etc. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com> Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com> [scottwood: remove __KERNEL__ ifdef] Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/cache: add cache flush operation for various e500chenhui zhao2016-03-047-78/+128
| | | | | | | | | | | | | | | | | | | | | | | | Various e500 core have different cache architecture, so they need different cache flush operations. Therefore, add a callback function cpu_flush_caches to the struct cpu_spec. The cache flush operation for the specific kind of e500 is selected at init time. The callback function will flush all caches inside the current cpu. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com> Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc/mm: any thread in one core can be the first to setup TLB1chenhui zhao2016-03-042-3/+9
| | | | | | | | | | | | | | | | | | | | | | On e6500, in the case of cpu hotplug, either thread in one core may be the first thread initilzing the TLB1. The subsequent threads must not setup it again. The code is derived from the comment of Scott Wood. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc: simplify csum_add(a, b) in case a or b is constant 0Christophe Leroy2016-03-041-0/+6
| | | | | | | | | | | | | | Simplify csum_add(a, b) in case a or b is constant 0 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: optimise csum_partial() loopChristophe Leroy2016-03-041-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | On the 8xx, load latency is 2 cycles and taking branches also takes 2 cycles. So let's unroll the loop. This patch improves csum_partial() speed by around 10% on both: * 8xx (single issue processor with parallel execution) * 83xx (superscalar 6xx processor with dual instruction fetch and parallel execution) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: optimise a few instructions in csum_partial()Christophe Leroy2016-03-041-20/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | r5 does contain the value to be updated, so lets use r5 all way long for that. It makes the code more readable. To avoid confusion, it is better to use adde instead of addc The first addition is useless. Its only purpose is to clear carry. As r4 is a signed int that is always positive, this can be done by using srawi instead of srwi Let's also remove the comment about bdnz having no overhead as it is not correct on all powerpc, at least on MPC8xx In the last part, in our situation, the remaining quantity of bytes to be proceeded is between 0 and 3. Therefore, we can base that part on the value of bit 31 and bit 30 of r4 instead of anding r4 with 3 then proceding on comparisons and substractions. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc32: rewrite csum_partial_copy_generic() based on copy_tofrom_user()Christophe Leroy2016-03-041-111/+209
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | csum_partial_copy_generic() does the same as copy_tofrom_user and also calculates the checksum during the copy. Unlike copy_tofrom_user(), the existing version of csum_partial_copy_generic() doesn't take benefit of the cache. This patch is a rewrite of csum_partial_copy_generic() based on copy_tofrom_user(). The previous version of csum_partial_copy_generic() was handling errors. Now we have the checksum wrapper functions to handle the error case like in powerpc64 so we can make the error case simple: just return -EFAULT. copy_tofrom_user() only has r12 available => we use it for the checksum r7 and r8 which contains pointers to error feedback are used, so we stack them. On a TCP benchmark using socklib on the loopback interface on which checksum offload and scatter/gather have been deactivated, we get about 20% performance increase. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
| * powerpc: inline ip_fast_csum()Christophe Leroy2016-03-044-56/+38
| | | | | | | | | | | | | | | | | | | | | | | | In several architectures, ip_fast_csum() is inlined There are functions like ip_send_check() which do nothing much more than calling ip_fast_csum(). Inlining ip_fast_csum() allows the compiler to optimise better Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [scottwood: whitespace and cast fixes] Signed-off-by: Scott Wood <oss@buserror.net>