### Demo

#### Timing Analysis Solutions for Multicore Systems

#### **Luis Miguel Pinho**

CISTER-TR-170601

CISTER Research Centre
Polytechnic Institute of Porto (ISEP-IPP)
Rua Dr. António Bernardino de Almeida, 431
4200-072 Porto, Portugal
Tel.: +351.22.8340509, Fax: +351.22.8321159
E-mail: http://www.cister.isep.ipp.pt

### Timing analysis solutions for multicore systems

Luis Miguel Pinho, ISEP

- P-SOCRATES at a glance
- Time predictability and high-performance on parallel architectures
- Quick review of WCET estimation techniques
  - Pros & cons and their applicability in P-SOCRATES
- The methodology in the MANANAS/SDK

### Quick fact sheet

- P-SOCRATES: Parallel SOftware framework for time-CRitical mAny-core sysTEmS
- Three-year FP7 STREP project (Oct-2013, Dec-2016)
- Website: www.p-socrates.eu
- Budget: 3.6 M€
- Partners

### Industrial Advisory Board

- Review and prioritize requirements, ensure that the project is kept on focus, analyze and validate the results
- Members:

Embedded Computing and High Performance Computing: demand for increased **performance** with **guaranteed processing times**

- Next-generation embedded many-core accelerators
- Real-time methodologies to provide time predictability
- Programmability of many-core from high-performance computing

A generic framework, integrating models, tools and system software, to parallelize applications with high performance and real-time requirements.

### P-SOCRATES TA Objectives

Figure (system classes): Safety-critical systems, Business-critical systems, Performance-critical systems. Labels: "Simple" functions; Programming and design guidelines; No guidelines; Well written and structured code; Simple and predictable hardware; More powerful.

### Reviewed WCET techniques

- Static WCET techniques
- Measurement-based techniques
- Hybrid techniques
- Probabilistic WCET
### Static WCET techniques

#### Phase 1: Flow analysis

Identify the feasible execution paths in a program.

- Pros
  - No need to have the actual hardware available
  - Years of experience, reliability proven for simple embedded processors -> very efficient for SC applications
- Cons
  - Little support for multicores
  - Long time-to-market due to the inherent complexity
  - V&V issues and associated cost
  - Accuracy for more complex platforms? (How to deal with IPs in COTS?)

### Measurement-based techniques

- Pros
  - Estimations available immediately (also average, etc.)
  - No need to design an accurate model -> reduced effort and cost
- Cons
  - Requires the hardware to be available, which may not be the case if the HW is developed in parallel with the SW
  - Difficult to set up an environment which acts like the final system
  - Intrusive instrumentation code
  - Exhaustive testing is impossible

### Hybrid techniques

Combine the merits of static and measurement-based analysis.

- Pros
  - Do not rely on complex models
  - Provide accurate estimates
- Cons
  - Intrusive instrumentation code
  - Exhaustive testing is impossible

### Probabilistic WCET techniques

Measurement-based.

- Pros
  - Allow estimates to be derived with a confidence level
- Cons
  - Require specific hardware support (randomization)
  - Not mature enough, and controversial

Assessed existing tools and methodologies against these new settings and requirements.

#### Static analysis is out the window

Not because of the complexity of the architecture!

Because of the architecture, the programming model, the OS and runtime, the man-power, the rapid evolution of the hardware...
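In its simplest form, the measurement-based approach reviewed above reduces to summarizing observed execution times. The sketch below is only an illustration, not the P-SOCRATES tooling: the helper name `summarize_measurements`, the `margin` parameter and the cycle counts are all assumptions made for the example. It reports the average and the high-water mark (the largest observation), then inflates the latter by an engineering margin. Because exhaustive testing is impossible, the result is an estimate and not a proven bound.

```python
from statistics import mean


def summarize_measurements(samples_cycles, margin=0.0):
    """Summarize observed execution times of one task (hypothetical helper).

    samples_cycles: observed execution times, e.g. in cycles.
    margin: relative inflation applied to the high-water mark (0.2 = +20%).
    """
    if not samples_cycles:
        raise ValueError("at least one measurement is required")
    if margin < 0:
        raise ValueError("margin must be non-negative")
    high_water_mark = max(samples_cycles)  # largest execution time observed
    return {
        "min": min(samples_cycles),
        "average": mean(samples_cycles),
        "high_water_mark": high_water_mark,
        # Inflated estimate: the margin is an engineering choice, not a proof.
        "estimate": high_water_mark * (1.0 + margin),
    }


# Made-up cycle counts, for illustration only.
observed = [1040, 1100, 985, 1320, 1075]
summary = summarize_measurements(observed, margin=0.2)
print(summary)
```

The high-water mark only reflects the paths and interference actually exercised during the runs, which is why the choice of margin matters.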
✓ Portable (out-of-the-box): the provided tools should be "easily" portable from one platform to another

### What we have done

#### Develop a measurement-based trace-collecting tool

- Collecting runtime execution traces is **fully automatic** (a process of 12 subsequent steps for the Kalray MPPA)
- Every step is well defined and is adaptable to other platforms with minimal effort
- Written in Python 3.4
  - Cross-platform language
  - Can be easily combined with other programming languages

#### A new approach to tackle the interference problem

Not one but two WCET estimates:

- One estimate (ISO) is obtained by running every task in complete isolation (runs on 1 core, the rest of the system stays quiet)
- The other (CONT) is obtained by running every task in complete contention (runs on 1 core, the rest of the system does everything possible to interfere with its execution)

The gap between ISO and CONT is sometimes huge!
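The two estimates described above can be sketched as follows. This is a hedged illustration, not the project's tool: the function name `wcet_estimates`, the trace layout (a dictionary mapping task names to observed execution times) and all numbers are assumptions. For each task it takes the largest observation in isolation and in contention, and derives the slow-down factor between them.

```python
def wcet_estimates(iso_traces, cont_traces):
    """Return {task: (iso, cont, slowdown)} from observed execution times.

    iso_traces / cont_traces: dicts mapping a task name to the execution
    times observed in isolation / in full contention (hypothetical layout).
    """
    results = {}
    for task, iso_samples in iso_traces.items():
        cont_samples = cont_traces[task]
        if not iso_samples or not cont_samples:
            raise ValueError("no measurements for task {!r}".format(task))
        iso = max(iso_samples)    # high-water mark, rest of the system quiet
        cont = max(cont_samples)  # high-water mark, rest of the system interfering
        results[task] = (iso, cont, cont / iso)
    return results


# Made-up execution times (arbitrary units), for illustration only.
iso_traces = {"filter": [100, 104, 98], "fft": [250, 260, 255]}
cont_traces = {"filter": [700, 760, 730], "fft": [1900, 2080, 1950]}

results = wcet_estimates(iso_traces, cont_traces)
for task in sorted(results):
    iso, cont, slowdown = results[task]
    print("{}: ISO={} CONT={} slow-down={:.2f}x".format(task, iso, cont, slowdown))
```

Keeping both values per task, rather than only the larger one, lets a later analysis reason about how sensitive each task is to interference.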
- Slow-down factor between 7 and 8 on average
- Very negative impact on the global schedulability analysis

- Measurements taken in isolation are not safe, as the execution time is subject to variation due to the shared resources
- Measurements taken in a totally congested system are not meaningful
  - ⇒ Design processes that create a controllable interference on every shared resource
  - ⇒ **Investigate** how we can re-create a system activity similar to that of the final system

- Processes to perform schedulability analysis
  - Based on both intrinsic and extrinsic WCET estimates
  - One process for the dynamic approach
    - Task-to-thread mapping uses a global queue
    - Thread scheduling is global with limited preemption
    - Maximize average performance
  - Another for the static approach
    - Fixed task-to-thread mapping (heuristics to minimize makespan)
    - Partitioned per-core scheduling (with limited preemption)
    - Minimize guaranteed response time

Figure: The big picture (dynamic).

Figure: The big picture (static).

### Reduce the pessimism of the WCET estimates

- Intrinsic: missing a path that leads to the WCET
  - Similar problem as on single-core systems
  - Little we can do here apart from improving the "path exploration" process
  - Powerful tools exist to guarantee code coverage. Those may turn out to be useful to help find the longest path.
- Extrinsic: not observing the maximum interference
  - Extremely likely to happen, if not certain
  - Can we use this information? Can we "extrapolate" the observations to guess the worst case and possibly adjust/define the safety margin accordingly?

From the traces obtained in isolation and contention modes, we want to analyze how sensitive the analyzed task really is to concurrent activity:

- Define the safety margin accordingly
- Make recommendations to set up the environment in an appropriate way: dynamic vs. static mapping, PREM, ...
- Restrict the runtime, capture the maximum activity and map it to a pre-defined level of interference intensity created by a "tunable IG"

http://www.upscale-sdk.com/

Post-project work partially supported by National Funds through FCT/MEC (Portuguese Foundation for Science and Technology) and co-financed by ERDF (European Regional Development Fund) under the PT2020 Partnership, within the CISTER Research Unit (CEC/04234).