aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
Commit message (Collapse)AuthorAgeFilesLines
* tools lib: Adopt zalloc()/zfree() from tools/perfArnaldo Carvalho de Melo2019-07-091-1/+1
| | | | | | | | | | Eroding a bit more the tools/perf/util/util.h hodpodge header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Add CBR value to decoder stateAdrian Hunter2019-06-251-0/+1
| | | | | | | | | | For convenience, add the core-to-bus ratio (CBR) value to the decoder state. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Cater for CBR change in PSB+Adrian Hunter2019-06-251-0/+7
| | | | | | | | | | | | PSB+ provides status information only so the core-to-bus ratio (CBR) in PSB+ will not have changed from its previous value. However, cater for the possibility of a another CBR change that gets caught up in the PSB+ anyway. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Decoder to output CBR changes immediatelyAdrian Hunter2019-06-251-10/+6
| | | | | | | | | | | | | The core-to-bus ratio (CBR) provides the CPU frequency. With branches enabled, the decoder was outputting CBR changes only when there was a branch. That loses the correct time of the change if the trace is not in context (e.g. not tracing kernel space). Change to output the CBR change immediately. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190622093248.581-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Add decoder support for PEBS via PTAdrian Hunter2019-06-171-1/+77
| | | | | | | | | | | PEBS data is encoded in Block Item Packets (BIP). Populate a new structure intel_pt_blk_items with the values and, upon a Block End Packet (BEP), report them as a new Intel PT sample type INTEL_PT_BLK_ITEMS. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Add new packets for PEBS via PTAdrian Hunter2019-06-171-4/+34
| | | | | | | | | | | | | | | | | | | | | | | Add 3 new packets to supports PEBS via PT, namely Block Begin Packet (BBP), Block Item Packet (BIP) and Block End Packet (BEP). PEBS data is encoded into multiple BIP packets that come between BBP and BEP. The BEP packet might be associated with a FUP packet. That is indicated by using a separate packet type (INTEL_PT_BEP_IP) similar to other packets types with the _IP suffix. Refer to the Intel SDM for more information about PEBS via PT: https://software.intel.com/en-us/articles/intel-sdm May 2019 version: Vol. 3B 18.5.5.2 PEBS output to Intel® Processor Trace Decoding of BIP packets conflicts with single-byte TNT packets. Since BIP packets only occur in the context of a block (i.e. between BBP and BEP), that context must be recorded and passed to the packet decoder. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190610072803.10456-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge tag 'perf-core-for-mingo-5.3-20190611' of ↵Ingo Molnar2019-06-171-43/+286
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: perf record: Alexey Budankov: - Allow mixing --user-regs with --call-graph=dwarf, making sure that the minimal set of registers for DWARF unwinding is present in the set of user registers requested to be present in each sample, while warning the user that this may make callchains unreliable if more that the minimal set of registers is needed to unwind. yuzhoujian: - Add support to collect callchains from kernel or user space only, IOW allow setting the perf_event_attr.exclude_callchain_{kernel,user} bits from the command line. perf trace: Arnaldo Carvalho de Melo: - Remove x86_64 specific syscall numbers from the augmented_raw_syscalls BPF in-kernel collector of augmented raw_syscalls:sys_{enter,exit} payloads, use instead the syscall numbers obtainer either by the arch specific syscalltbl generators or from audit-libs. - Allow 'perf trace' to ask for the number of bytes to collect for string arguments, for now ask for PATH_MAX, i.e. the whole pathnames, which ends up being just a way to speficy which syscall args are pathnames and thus should be read using bpf_probe_read_str(). - Skip unknown syscalls when expanding strace like syscall groups. This helps using the 'string' group of syscalls to work in arm64, where some of the syscalls present in x86_64 that deal with strings, for instance 'access', are deprecated and this should not be asked for tracing. Leo Yan: - Exit when failing to build eBPF program. perf config: Arnaldo Carvalho de Melo: - Bail out when a handler returns failure for a key-value pair. This helps with cases where processing a key-value pair is not just a matter of setting some tool specific knob, involving, for instance building a BPF program to then attach to the list of events 'perf trace' will use, e.g. augmented_raw_syscalls.c. perf.data: Kan Liang: - Read and store die ID information available in new Intel processors in CPUID.1F in the CPU topology written in the perf.data header. perf stat: Kan Liang: - Support per-die aggregation. Documentation: Arnaldo Carvalho de Melo: - Update perf.data documentation about the CPU_TOPOLOGY, MEM_TOPOLOGY, CLOCKID and DIR_FORMAT headers. Song Liu: - Add description of headers HEADER_BPF_PROG_INFO and HEADER_BPF_BTF. Leo Yan: - Update default value for llvm.clang-bpf-cmd-template in 'man perf-config'. JVMTI: Jiri Olsa: - Address gcc string overflow warning for strncpy() core: - Remove superfluous nthreads system_wide setup in perf_evsel__alloc_fd(). Intel PT: Adrian Hunter: - Add support for samples to contain IPC ratio, collecting cycles information from CYC packets, showing the IPC info periodically, because Intel PT does not update the cycle count on every branch or instruction, the incremental values will often be zero. When there are values, they will be the number of instructions and number of cycles since the last update, and thus represent the average IPC since the last IPC value. E.g.: # perf record --cpu 1 -m200000 -a -e intel_pt/cyc/u sleep 0.0001 rounding mmap pages size to 1024M (262144 pages) [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 2.208 MB perf.data ] # perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid # <SNIP + add line numbering to make sense of IPC counts e.g.: (18/3)> 1 cc1 63501.650479626: 7f5219ac27bf _int_free+0x3f jnz 0x7f5219ac2af0 IPC: 0.81 (36/44) 2 cc1 63501.650479626: 7f5219ac27c5 _int_free+0x45 cmp $0x1f, %rbp 3 cc1 63501.650479626: 7f5219ac27c9 _int_free+0x49 jbe 0x7f5219ac2b00 4 cc1 63501.650479626: 7f5219ac27cf _int_free+0x4f test $0x8, %al 5 cc1 63501.650479626: 7f5219ac27d1 _int_free+0x51 jnz 0x7f5219ac2b00 6 cc1 63501.650479626: 7f5219ac27d7 _int_free+0x57 movq 0x13c58a(%rip), %rcx 7 cc1 63501.650479626: 7f5219ac27de _int_free+0x5e mov %rdi, %r12 8 cc1 63501.650479626: 7f5219ac27e1 _int_free+0x61 movq %fs:(%rcx), %rax 9 cc1 63501.650479626: 7f5219ac27e5 _int_free+0x65 test %rax, %rax 10 cc1 63501.650479626: 7f5219ac27e8 _int_free+0x68 jz 0x7f5219ac2821 11 cc1 63501.650479626: 7f5219ac27ea _int_free+0x6a leaq -0x11(%rbp), %rdi 12 cc1 63501.650479626: 7f5219ac27ee _int_free+0x6e mov %rdi, %rsi 13 cc1 63501.650479626: 7f5219ac27f1 _int_free+0x71 shr $0x4, %rsi 14 cc1 63501.650479626: 7f5219ac27f5 _int_free+0x75 cmpq %rsi, 0x13caf4(%rip) 15 cc1 63501.650479626: 7f5219ac27fc _int_free+0x7c jbe 0x7f5219ac2821 16 cc1 63501.650479626: 7f5219ac2821 _int_free+0xa1 cmpq 0x13f138(%rip), %rbp 17 cc1 63501.650479626: 7f5219ac2828 _int_free+0xa8 jnbe 0x7f5219ac28d8 18 cc1 63501.650479626: 7f5219ac28d8 _int_free+0x158 testb $0x2, 0x8(%rbx) 19 cc1 63501.650479628: 7f5219ac28dc _int_free+0x15c jnz 0x7f5219ac2ab0 IPC: 6.00 (18/3) <SNIP> - Allow using time ranges with Intel PT, i.e. these features, already present but not optimially usable with Intel PT, should be now: Select the second 10% time slice: $ perf script --time 10%/2 Select from 0% to 10% time slice: $ perf script --time 0%-10% Select the first and second 10% time slices: $ perf script --time 10%/1,10%/2 Select from 0% to 10% and 30% to 40% slices: $ perf script --time 0%-10%,30%-40% cs-etm (ARM): Mathieu Poirier: - Add support for CPU-wide trace scenarios. s390: Thomas Richter: - Fix missing kvm module load for s390. - Fix OOM error in TUI mode on s390 - Support s390 diag event display when doing analysis on !s390 architectures. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf intel-pt: Add intel_pt_fast_forward()Adrian Hunter2019-06-101-0/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel PT decoding is done in time order. In order to support efficient time interval filtering, add a facility to "fast forward" towards a particular timestamp. That involves finding the right buffer, stepping to that buffer, and then stepping forward PSBs. Because decoding must begin at a PSB, "fast forward" stops at the last PSB that has a timestamp before the target timestamp. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf intel-pt: Add reposition parameter to intel_pt_get_data()Adrian Hunter2019-06-101-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | When the decoder gets the next trace buffer, some state is reset if the buffer is not consecutive to the previous buffer. Add a parameter 'reposition' so that can be done also to support a "fast forward" facility. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf intel-pt: Factor out intel_pt_reposition()Adrian Hunter2019-06-101-4/+9
| | | | | | | | | | | | | | | | | | | | Factor out intel_pt_reposition() so it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf intel-pt: Factor out intel_pt_8b_tsc()Adrian Hunter2019-06-101-9/+17
| | | | | | | | | | | | | | | | | | | | Factor out intel_pt_8b_tsc() so it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf intel-pt: Add lookahead callbackAdrian Hunter2019-06-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | Add a callback function to enable the decoder to lookahead at subsequent trace buffers. This will be used to implement a "fast forward" facility which will be needed to support efficient time interval filtering. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packetsAdrian Hunter2019-06-051-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CYC packets are not available, it is still possible to count cycles using TSC/TMA/MTC timestamps. As the timestamp increments in TSC ticks, convert to CPU cycles using the current core-to-bus ratio. Do not accumulate cycles when control flow packet generation is not enabled, nor when time has been "lost", typically due to mwait, which is indicated by a TSC/TMA packet that is not part of PSB+. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ipAdrian Hunter2019-06-051-6/+17
| | | | | | | | | | | | | | | | | | | | To make it easier to add new code for different TIP cases, separate each case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf intel-pt: Record when decoding PSB+ packetsAdrian Hunter2019-06-051-10/+31
| | | | | | | | | | | | | | | | | | | | In preparation for using MTC packets to count cycles, record whether decoding is between a PSB and PSBEND packets. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf intel-pt: Accumulate cycle count from CYC packetsAdrian Hunter2019-06-051-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for providing instructions-per-cycle (IPC) information, accumulate cycle count from CYC packets. Although CYC packets are optional (requires config term 'cyc' to enable cycle-accurate mode when recording), the simplest way to count cycles is with CYC packets. The first complication is that cycles must be counted only when also counting instructions. That means when control flow packet generation is enabled i.e. between TIP.PGE and TIP.PGD packets. Also, sampling the cycle count follows the same rules as sampling the timestamp, that is, not before the instruction to which the decoder is walking is reached. In addition, the cycle count is not accurate for any but the first branch of a TNT packet. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf intel-pt: Factor out intel_pt_update_sample_timeAdrian Hunter2019-06-051-8/+10
| | | | | | | | | | | | | | | | | | | | To eliminate some duplication and make the code more understandable, factor out intel_pt_update_sample_time. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190520113728.14389-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288Thomas Gleixner2019-06-051-10/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 263 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* perf intel-pt: Fix sample timestamp wrt non-taken branchesAdrian Hunter2019-05-161-1/+4
| | | | | | | | | | | | | | | | | | | | The sample timestamp is updated to ensure that the timestamp represents the time of the sample and not a branch that the decoder is still walking towards. The sample timestamp is updated when the decoder returns, but the decoder does not return for non-taken branches. Update the sample timestamp then also. Note that commit 3f04d98e972b5 ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp"). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: 3f04d98e972b ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix improved sample timestampAdrian Hunter2019-05-161-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The decoder uses its current timestamp in samples. Usually that is a timestamp that has already passed, but in some cases it is a timestamp for a branch that the decoder is walking towards, and consequently hasn't reached. The intel_pt_sample_time() function decides which is which, but was not handling TNT packets exactly correctly. In the case of TNT, the timestamp applies to the first branch, so the decoder must first walk to that branch. That means intel_pt_sample_time() should return true for TNT, and this patch makes that change. However, if the first branch is a non-taken branch (i.e. a 'N'), then intel_pt_sample_time() needs to return false for subsequent taken branches in the same TNT packet. To handle that, introduce a new state INTEL_PT_STATE_TNT_CONT to distinguish the cases. Note that commit 3f04d98e972b5 ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp"). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: 3f04d98e972b5 ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix instructions sampling rateAdrian Hunter2019-05-161-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The timestamp used to determine if an instruction sample is made, is an estimate based on the number of instructions since the last known timestamp. A consequence is that it might go backwards, which results in extra samples. Change it so that a sample is only made when the timestamp goes forwards. Note this does not affect a sampling period of 0 or sampling periods specified as a count of instructions. Example: Before: $ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 10 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 8 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 6 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 4 instructions:u: 7fac71e2dab2 _dl_cache_libcmp+0xd2 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16423 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222734: 12731 instructions:u: 7fac71e27938 _dl_name_match_p+0x68 (/lib/x86_64-linux-gnu/ld-2.28.so) ... After: $ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16479 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: f4aa081949e7b ("perf tools: Add Intel PT decoder") Link: http://lkml.kernel.org/r/20190510124143.27054-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix TSC slipAdrian Hunter2019-03-281-12/+8
| | | | | | | | | | | | | | A TSC packet can slip past MTC packets so that the timestamp appears to go backwards. One estimate is that can be up to about 40 CPU cycles, which is certainly less than 0x1000 TSC ticks, but accept slippage an order of magnitude more to be on the safe side. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: 79b58424b821c ("perf tools: Add Intel PT support for decoding MTC packets") Link: http://lkml.kernel.org/r/20190325135135.18348-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Packet splitting can happen only on 32-bitAdrian Hunter2019-02-061-1/+1
| | | | | | | | | | | | Data is copied when the trace is stopped, so packets are never split between buffers except when processing if the buffer cannot fit in the address space which can only happen on 32-bit systems. Change the logic to reflect that. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190206103947.15750-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix CYC timestamp calculation after OVFAdrian Hunter2019-02-061-1/+0
| | | | | | | | | | | | | | CYC packet timestamp calculation depends upon CBR which was being cleared upon overflow (OVF). That can cause errors due to failing to synchronize with sideband events. Even if a CBR change has been lost, the old CBR is still a better estimate than zero. So remove the clearing of CBR. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix overlap calculation for paddingAdrian Hunter2019-02-061-2/+34
| | | | | | | | | | | Auxtrace records might have up to 7 bytes of padding appended. Adjust the overlap accordingly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Add MTC and CYC timestamps to debug logAdrian Hunter2018-11-051-0/+4
| | | | | | | | | | One cause of decoding errors is un-synchronized side-band data. Timestamps are needed to debug such cases. TSC packet timestamps are logged. Log also MTC and CYC timestamps. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/20181105073505.8129-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Implement decoder flags for trace begin / endAdrian Hunter2018-09-201-11/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Have the Intel PT decoder implement the new Intel PT decoder flags for trace begin / end. Previously, the decoder would indicate begin / end by a branch from / to zero. That hides useful information, in particular when a trace ends with a call. That happens when using address filters, for example: $ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.031 MB perf.data ] Before: $ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns 7249.622183310: tr strt 0 [unknown] => 401590 main+0x0 7249.622183311: call 4015b9 main+0x29 => 0 [unknown] 7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e 7249.622183714: call 4015c8 main+0x38 => 0 [unknown] 7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d 7249.622247760: call 4015d7 main+0x47 => 0 [unknown] 7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c 7249.622248341: call 4015e1 main+0x51 => 0 [unknown] 7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56 7249.622248682: call 4015eb main+0x5b => 0 [unknown] 7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60 7249.622248971: call 401612 main+0x82 => 0 [unknown] 7249.622249757: tr strt 0 [unknown] => 401617 main+0x87 7249.622249770: call 401847 main+0x2b7 => 0 [unknown] 7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc 7249.622250612: call 4019bf main+0x42f => 0 [unknown] 7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434 7249.622256863: call 4019f5 main+0x465 => 0 [unknown] 7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a 7249.622264235: call 401832 main+0x2a2 => 0 [unknown] After: $ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns 7249.622183310: tr strt 0 [unknown] => 401590 main+0x0 7249.622183311: tr end call 4015b9 main+0x29 => 401ef0 set_program_name+0x0 7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e 7249.622183714: tr end call 4015c8 main+0x38 => 4014b0 setlocale@plt+0x0 7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d 7249.622247760: tr end call 4015d7 main+0x47 => 4012d0 bindtextdomain@plt+0x0 7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c 7249.622248341: tr end call 4015e1 main+0x51 => 4012b0 textdomain@plt+0x0 7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56 7249.622248682: tr end call 4015eb main+0x5b => 404340 atexit+0x0 7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60 7249.622248971: tr end call 401612 main+0x82 => 401320 getopt_long@plt+0x0 7249.622249757: tr strt 0 [unknown] => 401617 main+0x87 7249.622249770: tr end call 401847 main+0x2b7 => 401360 uname@plt+0x0 7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc 7249.622250612: tr end call 4019bf main+0x42f => 401b10 print_element+0x0 7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434 7249.622256863: tr end call 4019f5 main+0x465 => 401340 __overflow@plt+0x0 7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a 7249.622264235: tr end call 401832 main+0x2a2 => 401520 exit@plt+0x0 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20180920130048.31432-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix "Unexpected indirect branch" errorAdrian Hunter2018-06-061-2/+15
| | | | | | | | | | | | Some Atom CPUs can produce FUP packets that contain NLIP (next linear instruction pointer) instead of CLIP (current linear instruction pointer). That will result in "Unexpected indirect branch" errors. Fix by comparing IP to NLIP in that case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix MTC timing after overflowAdrian Hunter2018-06-061-1/+0
| | | | | | | | | | | | On some platforms, overflows will clear before MTC wraparound, and there is no following TSC/TMA packet. In that case the previous TMA is valid. Since there will be a valid TMA either way, stop setting 'have_tma' to false upon overflow. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIPAdrian Hunter2018-06-061-1/+4
| | | | | | | | | | It is possible to have a CBR packet between a FUP packet and corresponding TIP packet. Stop treating it as an error. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix timestamp following overflowAdrian Hunter2018-03-081-0/+1
| | | | | | | | | | | | | | | | | | | timestamp_insn_cnt is used to estimate the timestamp based on the number of instructions since the last known timestamp. If the estimate is not accurate enough decoding might not be correctly synchronized with side-band events causing more trace errors. However there are always timestamps following an overflow, so the estimate is not needed and can indeed result in more errors. Suppress the estimate by setting timestamp_insn_cnt to zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix error recovery from missing TIP packetAdrian Hunter2018-03-081-0/+1
| | | | | | | | | | | | | | When a TIP packet is expected but there is a different packet, it is an error. However the unexpected packet might be something important like a TSC packet, so after the error, it is necessary to continue from there, rather than the next packet. That is achieved by setting pkt_step to zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix overlap detection to identify consecutive buffers correctlyAdrian Hunter2018-03-081-33/+29
| | | | | | | | | | | | | Overlap detection was not not updating the buffer's 'consecutive' flag. Marking buffers consecutive has the advantage that decoding begins from the start of the buffer instead of the first PSB. Fix overlap detection to identify consecutive buffers correctly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Do not use TSC packets for calculating CPU cycles to TSCAdrian Hunter2017-06-301-0/+14
| | | | | | | | | | | | | | | CBR (core-to-bus ratio) packets provide an indication of CPU frequency. A more accurate measure can be made by counting the cycles (given by CYC packets) in between other timing packets (either MTC or TSC). Using TSC packets has at least 2 issues: 1) timing might have stopped (e.g. mwait) or 2) TSC packets within PSB+ might slip past CYC packets. For now, simply do not use TSC packets for calculating CPU cycles to TSC. That leaves the case where 2 MTC packets are used, otherwise falling back to the CBR value. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-37-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Add decoder support for CBR eventsAdrian Hunter2017-06-211-0/+19
| | | | | | | | | | Add decoder support for informing the tools of changes to the core-to-bus ratio (CBR). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-16-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Add reserved byte to CBR packet payloadAdrian Hunter2017-06-211-1/+1
| | | | | | | | | | Future proof CBR packet decoding by passing through also the undefined 'reserved' byte in the packet payload. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-15-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Add decoder support for ptwrite and power event packetsAdrian Hunter2017-06-211-8/+168
| | | | | | | | | | | Add decoder support for PTWRITE, MWAIT, PWRE, PWRX and EXSTOP packets. This patch only affects the decoder, so the tools still do not select or consume the new information. That is added in subsequent patches. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-14-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Allow decoding with branch tracing disabledAdrian Hunter2017-06-211-0/+13
| | | | | | | | | | | | | | | The kernel now supports the disabling of branch tracing, however the decoder assumes branch tracing is always enabled. Pass through a parameter to indicate whether branch tracing is enabled and use it to avoid cases when the decoder is expecting branch packets. There are 2 such cases. First, FUP packets which can bind to an IP even when there is no branch tracing. Secondly, the decoder will try to use branch packets to find an IP to start decoding or to recover from errors. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-11-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Add missing __fallthroughAdrian Hunter2017-06-211-1/+1
| | | | | | | | | | perf tools uses __fallthrough. Add missing __fallthrough to a switch statement. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Clear FUP flag on errorAdrian Hunter2017-06-211-0/+2
| | | | | | | | | | | | Sometimes a FUP packet is associated with a TSX transaction and a flag is set to indicate that. Ensure that flag is cleared on any error condition because at that point the decoder can no longer assume it is correct. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Use FUP always when scanning for an IPAdrian Hunter2017-06-211-8/+4
| | | | | | | | | | | | | The decoder will try to use branch packets to find an IP to start decoding or to recover from errors. Currently the FUP packet is used only in the case of an overflow, however there is no reason for that to be a special case. So just use FUP always when scanning for an IP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zeroAdrian Hunter2017-06-211-3/+5
| | | | | | | | | | | | | Intel PT uses IP compression based on the last IP. For decoding purposes, 'last IP' is not updated when a branch target has been suppressed, which is indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so ensure never to set 'last_ip' when packet 'count' is zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix last_ip usageAdrian Hunter2017-06-211-2/+11
| | | | | | | | | | | | | | | Intel PT uses IP compression based on the last IP. For decoding purposes, 'last IP' is considered to be reset to zero whenever there is a synchronization packet (PSB). The decoder wasn't doing that, and was treating the zero value to mean that there was no last IP, whereas compression can be done against the zero value. Fix by setting last_ip to zero when a PSB is received and keep track of have_last_ip. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IPAdrian Hunter2017-06-211-0/+1
| | | | | | | | | | | A value of zero is used to indicate that there is no IP. Ensure the value is zero when the state is INTEL_PT_STATE_NO_IP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix missing stack clearAdrian Hunter2017-06-211-0/+1
| | | | | | | | | | | The return compression stack must be cleared whenever there is a PSB. Fix one case where that was not happening. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Improve sample timestampAdrian Hunter2017-06-211-3/+31
| | | | | | | | | | | | | | The decoder uses its current timestamp in samples. Usually that is a timestamp that has already passed, but in some cases it is a timestamp for a branch that the decoder is walking towards, and consequently hasn't reached. Improve that situation by using the pkt_state to determine when to use the current or previous timestamp. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Move decoder error setting into one conditionAdrian Hunter2017-06-211-4/+7
| | | | | | | | | | | | Move decoder error setting into one condition. Cc'ed to stable because later fixes depend on it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Use __fallthroughArnaldo Carvalho de Melo2017-02-091-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | To address new warnings emmited by gcc 7, e.g.:: CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o CC /tmp/build/perf/tests/parse-events.o util/intel-pt-decoder/intel-pt-pkt-decoder.c: In function 'intel_pt_pkt_desc': util/intel-pt-decoder/intel-pt-pkt-decoder.c:499:6: error: this statement may fall through [-Werror=implicit-fallthrough=] if (!(packet->count)) ^ util/intel-pt-decoder/intel-pt-pkt-decoder.c:501:2: note: here case INTEL_PT_CYC: ^~~~ CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o cc1: all warnings being treated as errors Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-mf0hw789pu9x855us5l32c83@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt/bts: Report instruction bytes and length in sampleAndi Kleen2016-10-241-0/+2
| | | | | | | | | | | | | | | | | | | | Change Intel PT and BTS to pass up the length and the instruction bytes of the decoded or sampled instruction in the perf sample. The decoder already knows this information, we just need to pass it up. Since it is only a couple of movs it is not very expensive. Handle instruction cache too. Make sure ilen is always initialized. Used in the next patch. [Adrian: re-base on top (and adjust for) instruction buffer size tidy-up] [Adrian: add BTS support and adjust commit message accordingly] Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/1475847747-30994-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf intel-pt: Fix MTC timestamp calculation for large MTC periodsAdrian Hunter2016-10-051-0/+36
| | | | | | | | | | | | | | | | The MTC packet provides a 8-bit slice of CTC which is related to TSC by the TMA packet, however the TMA packet only provides the lower 16 bits of CTC. If mtc_shift > 8 then some of the MTC bits are not in the CTC provided by the TMA packet. Fix-up the last_mtc calculated from the TMA packet by copying the missing bits from the current MTC assuming the least difference between the two, and that the current MTC comes after last_mtc. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.3+ Link: http://lkml.kernel.org/r/1475062896-22274-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>