万字整理 - systrace的实现

作者简介:伟林,中年码农,从事过电信、手机、安全、芯片等行业,目前依旧从事Linux方向开发工作,个人爱好Linux相关知识分享,个人微博CSDN pwl999。

上图基本就能说清systrace的整个框架:

  • 1、systrace调用atrace抓取目标机的trace数据;

  • 2、systrace把trace数据和'prefix.html'、'suffix.html'、'systrace_trace_viewer.html'合成一个'trace.html'文件;

  • 3、使用chrome浏览器打开'trace.html'就可以非常方便的以图形化的形式来查看和分析trace数据。背后是Trace-Viewer的脚本在运行;

内核态和用户态的存储trace数据的实现:

  • 1、内核trace信息,通过trace event记录到ftrace的buffer中;

  • 2、用户态(app/java framework/native)是通过使用Trace类来记录trace信息的,也是记录到内核ftrace buffer当中的,是通过"/sys/kernel/debug/tracing/trace_marker"接口记录的。

1、systrace的使用

在定位Android性能问题的时候,我们经常会用到systrace工具。配置到adb,连接上目标手机,在主机侧使用类似命令启动systrace:

ASOP_ROOT/external/chromium-trace/catapult/systrace/bin$ ./systrace -o trace.html -t 10 gfx view wm am dalvik input sched freq idle
Starting tracing (10 seconds)
Tracing completed. Collecting output...
Outputting Systrace results...
Tracing complete, writing results

Wrote trace HTML file: file:///ASOP_ROOT/external/chromium-trace/catapult/systrace/bin/trace.html

systrace命令在ASOP源码包的ASOP_ROOT/external/chromium-trace/catapult/systrace/bin路径下。上述命令的参数:

  • '-t 10',指定了抓取trace的时长为10s;

  • '-o trace.html',指定了trace的输出文件;

  • 'gfx view wm am dalvik input sched freq idle',指定了需要抓取的事件;

在命令启动以后,我们就可以在目标机上进行滑动、启动app等一系列操作,10s内的这些操作都会被记录下来最后dump进trace.html文件。我们可以通过google的chrome浏览器来查看、分析trace.html:'google-chrome trace.html'。

可以通过'-l'选项来查看目标机支持的systrace事件全集:

$ ./systrace -l
gfx - Graphics
input - Input
view - View System
webview - WebView
wm - Window Manager
am - Activity Manager
sm - Sync Manager
audio - Audio
video - Video
camera - Camera
hal - Hardware Modules
app - Application
res - Resource Loading
dalvik - Dalvik VM
rs - RenderScript
bionic - Bionic C Library
power - Power Management
pm - Package Manager
ss - System Server
database - Database
network - Network
adb - ADB
pdx - PDX services
sched - CPU Scheduling
irq - IRQ Events
freq - CPU Frequency
idle - CPU Idle
disk - Disk I/O
workq - Kernel Workqueues
memreclaim - Kernel Memory Reclaim
regulators - Voltage and Current Regulators
binder_driver - Binder Kernel driver
binder_lock - Binder global lock trace
pagecache - Page cache

NOTE: more categories may be available with adb root

还可以通过'--help'选项来查看systrace命令的详细选项:

$ ./systrace --help
Usage: systrace [options] [category1 [category2 ...]]

Example: systrace -b 32768 -t 15 gfx input view sched freq

Options:
-h, --help show this help message and exit
-o FILE write trace output to FILE
-j, --json write a JSON file
--link-assets (deprecated)
--asset-dir=ASSET_DIR
(deprecated)
-e DEVICE_SERIAL_NUMBER, --serial=DEVICE_SERIAL_NUMBER
adb device serial number
--timeout=TIMEOUT timeout for start and stop tracing (seconds)
--collection-timeout=COLLECTION_TIMEOUT
timeout for data collection (seconds)
-t N, --time=N trace for N seconds
--target=TARGET choose tracing target (android or linux)
-b N, --buf-size=N use a trace buffer size of N KB
-l, --list-categories
list the available categories and exit

Atrace options:
--atrace-categories=ATRACE_CATEGORIES
Select atrace categories with a comma-delimited list,
e.g. --atrace-categories=cat1,cat2,cat3
-k KFUNCS, --ktrace=KFUNCS
specify a comma-separated list of kernel functions to
trace
--no-compress Tell the device not to send the trace data in
compressed form.
-a APP_NAME, --app=APP_NAME
enable application-level tracing for comma-separated
list of app cmdlines
--from-file=FROM_FILE
read the trace from a file (compressed) rather than
running a live trace

BattOr trace options:
--battor-categories=BATTOR_CATEGORIES
Select battor categories with a comma-delimited list,
e.g. --battor-categories=cat1,cat2,cat3
--serial-map=SERIAL_MAP
File containing pregenerated map of phone serial
numbers to BattOr serial numbers.
--battor-path=BATTOR_PATH
specify a BattOr path to use
--battor Use the BattOr tracing agent.

Ftrace options:
--ftrace-categories=FTRACE_CATEGORIES
Select ftrace categories with a comma-delimited list,
e.g. --ftrace-categories=cat1,cat2,cat3

WALT trace options:
--walt Use the WALT tracing agent. WALT is a device for
measuring latency of physical sensors on phones and
computers. See https://github.com/google/walt

2、主机systrace命令的实现(python)

systrace实质上是一个python文件,整个python相关包在ASOP_ROOT/external/chromium-trace/catapult/路径下,以此路径为根(root=ASOP_ROOT/external/chromium-trace/catapult),我们来分析它的实现过程:

./systrace/bin/systrace:

import os
import sys

# (1) 将'./systrace/'路径加入到python的搜索路径 #
_SYSTRACE_DIR = os.path.abspath(
os.path.join(os.path.dirname(__file__), os.path.pardir))
sys.path.insert(0, _SYSTRACE_DIR)

# (2) 这样就能找到'./systrace/systrace/run_systrace.py'模块并导入 #
from systrace import run_systrace

# (3) 调用run_systrace.py模块中的main函数 #
if __name__ == '__main__':
sys.exit(run_systrace.main)

./systrace/systrace/run_systrace.py:


# (3.1) main函数继续调用main_impl函数 #
def main:
main_impl(sys.argv)



def main_impl(arguments):
# Parse the command line options.

# (3.1.1) 将用户输入的参数解析为options和categories #
options, categories = parse_options(arguments)

# Override --atrace-categories and --ftrace-categories flags if command-line
# categories are provided.

# (3.1.2) 如果解析出了trace事件categories #
# 根据options.target指定的平台'android'/'linux',来给options中的选项赋值 #
# 平台是使用sytrace命令的'--target=TARGET'选项来指定的,如果没有指定默认值是'android' #
if categories:
if options.target == 'android':
options.atrace_categories = categories
elif options.target == 'linux':
options.ftrace_categories = categories
else:
raise RuntimeError('Categories are only valid for atrace/ftrace. Target '
'platform must be either Android or Linux.')


./systrace/systrace/run_systrace.py:
# Include atrace categories by default in Systrace.
# (3.1.3) 如果平台是'android',且没有指定trace事件,给出默认的trace事件 #
if options.target == 'android' and not options.atrace_categories:
options.atrace_categories = atrace_agent.DEFAULT_CATEGORIES

# (3.1.4) 如果平台是'android',且不是从文件中读取数据,那么就是从实际的目标机中读取数据了 #
if options.target == 'android' and not options.from_file:
# 初始化adb #
initialize_devil
# 如果没有指定目标机的serialnumber,尝试读取 #
if not options.device_serial_number:
devices = [a.GetDeviceSerial() for a in adb_wrapper.AdbWrapper.Devices()]
if len(devices) == 0:
raise RuntimeError('No ADB devices connected.')
elif len(devices) >= 2:
raise RuntimeError('Multiple devices connected, serial number required')
options.device_serial_number = devices[0]

# If list_categories is selected, just print the list of categories.
# In this case, use of the tracing controller is not necessary.
# (3.1.5) 如果当前是'systrace -l'命令,列出目标机支持的所有trace事件后直接返回 #
if options.list_categories:
if options.target == 'android':
# 调用'systrace/systrace/tracing_agents/atrace_agent.py'文件中的list_categories函数 #
# 最后调用的是`adb shell atrace --list_categories'命令 #
atrace_agent.list_categories(options)
elif options.target == 'linux':
ftrace_agent.list_categories(options)
return

# Set up the systrace runner and start tracing.
# (3.1.6) 如果是普通的trace命令,根据'systrace/systrace/systrace_runner.py'模块中的SystraceRunner类来创建对象 #
controller = systrace_runner.SystraceRunner(
os.path.dirname(os.path.abspath(__file__)), options)

# (3.1.6.1) 开始tracing #
controller.StartTracing

# Wait for the given number of seconds or until the user presses enter.
# pylint: disable=superfluous-parens
# (need the parens so no syntax error if trying to load with Python 3)
if options.from_file is not None:
print('Reading results from file.')
elif options.trace_time:
print('Starting tracing (%d seconds)' % options.trace_time)
time.sleep(options.trace_time)
else:
raw_input('Starting tracing (stop with enter)')

# Stop tracing and collect the output.
print('Tracing completed. Collecting output...')

# (3.1.6.2) 停止tracing #
controller.StopTracing

print('Outputting Systrace results...')

# (3.1.6.3) 输出tracing结果到文件中 #
controller.OutputSystraceResults(write_json=options.write_json)

我们可以看到trace过的重点最后落在SystraceRunner对象的创建,以及.StartTracing/.StopTracing/.OutputSystraceResults几个方法上。

下面我们详细分析一下这几个步骤的实现过程。

2.1、SystraceRunner类的初始化

SystraceRunner类在systrace_runner.py文件当中,我们来查看起具体实现。

./systrace/systrace/systrace_runner.py:

AGENT_MODULES = [android_process_data_agent, atrace_agent,
atrace_from_file_agent, battor_trace_agent,
ftrace_agent, walt_agent]

class SystraceRunner(object):
def __init__(self, script_dir, options):
"""Constructor.

Args:
script_dir: Directory containing the trace viewer script
(systrace_trace_viewer.html)
options: Object containing command line options.
"""
# Parse command line arguments and create agents.
self._script_dir = script_dir
self._out_filename = options.output_file

# (1) #
agents_with_config = tracing_controller.CreateAgentsWithConfig(
options, AGENT_MODULES)

# (2) #
controller_config = tracing_controller.GetControllerConfig(options)

# Set up tracing controller.

# (3) #
self._tracing_controller = tracing_controller.TracingController(
agents_with_config, controller_config)
  • 1、分析tracing_controller.CreateAgentsWithConfig的实现;

./systrace/systrace/tracing_controller.py:

def CreateAgentsWithConfig(options, modules):
"""Create tracing agents.

This function will determine which tracing agents are valid given the
options and create those agents along with their corresponding configuration
object.
Args:
options: The command-line options.
modules: The modules for either Systrace or profile_chrome.
TODO(washingtonp): After all profile_chrome agents are in
Systrace, this parameter will no longer be valid.
Returns:
A list of AgentWithConfig options containing agents and their corresponding
configuration object.
"""
result =

# (1.1) 遍历modules中的agent调用相应的函数。对'android'来说,基本只会调用到atrace_agent、android_process_data_agent #
for module in modules:

# (1.1.1) #
config = module.get_config(options)

# (1.1.2) #
agent = module.try_create_agent(config)

if agent and config:
# (1.1.3) 创建一个AgentWithConfig类的对象,用来存储config和agent #
result.append(AgentWithConfig(agent, config))
return [x for x in result if x and x.agent]

|→

./systrace/systrace/tracing_agents/atrace_agent.py:

# (1.1.1) 创建一个AtraceConfig类的对象,用来存储optios中的相关配置 #
def get_config(options):
return AtraceConfig(options.atrace_categories,
options.trace_buf_size, options.kfuncs,
options.app_name, options.compress_trace_data,
options.from_file, options.device_serial_number,
options.trace_time, options.target)



class AtraceConfig(tracing_agents.TracingConfig):
def __init__(self, atrace_categories, trace_buf_size, kfuncs,
app_name, compress_trace_data, from_file,
device_serial_number, trace_time, target):
tracing_agents.TracingConfig.__init__(self)
self.atrace_categories = atrace_categories
self.trace_buf_size = trace_buf_size
self.kfuncs = kfuncs
self.app_name = app_name
self.compress_trace_data = compress_trace_data
self.from_file = from_file
self.device_serial_number = device_serial_number
self.trace_time = trace_time
self.target = target
# (1.1.2) 创建一个AtraceAgent类的对象 #
def try_create_agent(config):
"""Create an Atrace agent.

Args:
config: Command line config.
"""

# 根据config的配置,判断当前Atrace agent是否需要创建 #
if config.target != 'android':
return None
if config.from_file is not None:
return None

if not config.atrace_categories:
return None

# Check device SDK version.
device_sdk_version = util.get_device_sdk_version
if device_sdk_version < version_codes.JELLY_BEAN_MR2:
print ('Device SDK versions < 18 (Jellybean MR2) not supported. '
'Your device SDK version is %d.' % device_sdk_version)
return None

return AtraceAgent(device_sdk_version)



class AtraceAgent(tracing_agents.TracingAgent):

def __init__(self, device_sdk_version):
super(AtraceAgent, self).__init__
self._device_sdk_version = device_sdk_version
self._adb = None
self._trace_data = None
self._tracer_args = None
self._collection_thread = None
self._device_utils = None
self._device_serial_number = None
self._config = None
self._categories = None

|→

./systrace/systrace/tracing_agents/android_process_data_agent.py:

def try_create_agent(config):
if config.target != 'android':
return None
if config.from_file is not None:
return None
return AndroidProcessDataAgent



class AndroidProcessDataAgent(tracing_agents.TracingAgent):
def __init__(self):
super(AndroidProcessDataAgent, self).__init__
self._trace_data = ""
self._device = None
  • 2、分析tracing_controller.GetControllerConfig的实现;

./systrace/systrace/tracing_controller.py:

# (2.1) 创建一个TracingControllerConfig类的对象,用来存储optios中的相关配置 #
def GetControllerConfig(options):
return TracingControllerConfig(options.output_file, options.trace_time,
options.write_json,
options.link_assets, options.asset_dir,
options.timeout, options.collection_timeout,
options.device_serial_number, options.target)



class TracingControllerConfig(tracing_agents.TracingConfig):
def __init__(self, output_file, trace_time, write_json,
link_assets, asset_dir, timeout, collection_timeout,
device_serial_number, target):
tracing_agents.TracingConfig.__init__(self)
self.output_file = output_file
self.trace_time = trace_time
self.write_json = write_json
self.link_assets = link_assets
self.asset_dir = asset_dir
self.timeout = timeout
self.collection_timeout = collection_timeout
self.device_serial_number = device_serial_number
self.target = target

  • 3、分析tracing_controller.TracingController的实现;

./systrace/systrace/tracing_controller.py:

# (3.1) 创建一个TracingController类的对象,稍后的start/stop/output操作都使用该对象的方法 #
class TracingController(object):
def __init__(self, agents_with_config, controller_config):
"""Create tracing controller.

Create a tracing controller object. Note that the tracing
controller is also a tracing agent.

Args:
agents_with_config: List of tracing agents for this controller with the
corresponding tracing configuration objects.
controller_config: Configuration options for the tracing controller.
"""
self._child_agents = None

# (3.1.1) 存储agent对象和config #
self._child_agents_with_config = agents_with_config
# (3.1.2) 新建一个TracingControllerAgent类的对象 #
self._controller_agent = TracingControllerAgent
# (3.1.3) 存储controller层级的config #
self._controller_config = controller_config
self._trace_in_progress = False
self.all_results = None

2.2、SystraceRunner.StartTracing

入口是SystraceRunner类的.StartTracing方法,./systrace/systrace/systrace_runner.py:

class SystraceRunner(object):

def StartTracing(self):
# (1) #
self._tracing_controller.StartTracing

继续调用到TracingController类的.StartTracing方法,./systrace/systrace/tracing_controller.py:

class TracingController(object):

def StartTracing(self):
"""Start tracing for all tracing agents.

This function starts tracing for both the controller tracing agent
and the child tracing agents.

Returns:
Boolean indicating whether or not the start tracing succeeded.
Start tracing is considered successful if at least the
controller tracing agent was started.
"""

# (1.1) 设置对象中的tracing启动标志 #
assert not self._trace_in_progress, 'Trace already in progress.'
self._trace_in_progress = True

# Start the controller tracing agents. Controller tracing agent
# must be started successfully to proceed.
# (1.2) 启动成员对象_controller_agent的.StartAgentTracing方法 #
# 记录一些log #
if not self._controller_agent.StartAgentTracing(
self._controller_config,
timeout=self._controller_config.timeout):
print 'Unable to start controller tracing agent.'
return False

# Start the child tracing agents.
succ_agents =
# (1.3) 逐个启动agent list中agent的.StartAgentTracing方法 #
# 对'Android'来说,就一个agent:AtraceAgent #
for agent_and_config in self._child_agents_with_config:
agent = agent_and_config.agent
config = agent_and_config.config

# 启动agent #
if agent.StartAgentTracing(config,
timeout=self._controller_config.timeout):
# 把启动成功的agent加入succ_agents list #
succ_agents.append(agent)
else:
print 'Agent %s not started.' % str(agent)

# Print warning if all agents not started.
na = len(self._child_agents_with_config)
ns = len(succ_agents)
if ns < na:
print 'Warning: Only %d of %d tracing agents started.' % (ns, na)
self._child_agents = succ_agents
return True

|→

继续调用到AtraceAgent类的.StartTracing方法,./systrace/systrace/tracing_agents/atrace_agent.py:

class AtraceAgent(tracing_agents.TracingAgent):

@py_utils.Timeout(tracing_agents.START_STOP_TIMEOUT)
def StartAgentTracing(self, config, timeout=None):
assert config.atrace_categories, 'Atrace categories are missing!'
self._config = config

# systrace命令下达的需要trace的事件 #
self._categories = config.atrace_categories
if isinstance(self._categories, list):
self._categories = ','.join(self._categories)

# 使用'atrace --list_categories'获取的目标机支持的可trace事件 #
avail_cats = get_available_categories(config, self._device_sdk_version)

# (1.3.1) 判断命令中是否有目标机不支持的trace事件 #
unavailable = [x for x in self._categories.split(',') if
x not in avail_cats]
self._categories = [x for x in self._categories.split(',') if
x in avail_cats]
if unavailable:
print 'These categories are unavailable: ' + ' '.join(unavailable)
self._device_utils = device_utils.DeviceUtils(config.device_serial_number)
self._device_serial_number = config.device_serial_number

# (1.3.2) 构造参数:'atrace ... categories' #
# '...'包括以下选项: #
# '-z': compress_trace_data #
# '-t trace_time' #
# '-b trace_buf_size' #
# '-a app_name' #
# '-k kfuncs' #
self._tracer_args = _construct_atrace_args(config,
self._categories)
# (1.3.3) 执行命令:'atrace ... categories --async_start' #
self._device_utils.RunShellCommand(
self._tracer_args + ['--async_start'], check_return=True)
return True

|→

继续调用到AndroidProcessDataAgent类的.StartTracing方法,./systrace/systrace/tracing_agents/android_process_data_agent.py:

class AndroidProcessDataAgent(tracing_agents.TracingAgent):

@py_utils.Timeout(tracing_agents.START_STOP_TIMEOUT)
def StartAgentTracing(self, config, timeout=None):
self._device = device_utils.DeviceUtils(config.device_serial_number)
self._trace_data += self._get_process_snapshot
return True



def _get_process_snapshot(self):
use_legacy = False
try:
dump = self._device.RunShellCommand( \
PS_COMMAND_PROC, check_return=True, as_root=True, shell=True)
except AdbShellCommandFailedError:
use_legacy = True

# Check length of 2 as we execute two commands, which in case of failure
# on old devices output 1 line each.
if use_legacy or len(dump) == 2:
logging.debug('Couldn\'t parse ps dump, trying legacy method ...')
dump = self._device.RunShellCommand( \
PS_COMMAND_PROC_LEGACY, check_return=True, as_root=True, shell=True)
if len(dump) == 2:
logging.error('Unable to extract process data!')
return ""

return ' '.join(dump) + ' '

# 实际上就是在StartTracing和StopTracing时,各调用一次如下的ps命令 #
PS_COMMAND_PROC = "ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM" \
"&& ps -AT -o USER,PID,TID,CMD"

2.3、SystraceRunner.StopTracing

入口是SystraceRunner类的.StopTracing方法,./systrace/systrace/systrace_runner.py:

class SystraceRunner(object):

def StopTracing(self):
# (1) #
self._tracing_controller.StopTracing

继续调用到TracingController类的.StopTracing方法,./systrace/systrace/tracing_controller.py:

class TracingController(object):

def StopTracing(self):
"""Issue clock sync marker and stop tracing for all tracing agents.

This function stops both the controller tracing agent
and the child tracing agents. It issues a clock sync marker prior
to stopping tracing.

Returns:
Boolean indicating whether or not the stop tracing succeeded
for all agents.
"""

# (1.1) 清除对象中的tracing启动标志 #
assert self._trace_in_progress, 'No trace in progress.'
self._trace_in_progress = False

# Issue the clock sync marker and stop the child tracing agents.
self._IssueClockSyncMarker
succ_agents =
# (1.2) 逐个调用agent list中agent的.StopAgentTracing方法 #
# 对'Android'来说,就一个agent:AtraceAgent #
for agent in self._child_agents:
if agent.StopAgentTracing(timeout=self._controller_config.timeout):
succ_agents.append(agent)
else:
print 'Agent %s not stopped.' % str(agent)

# Stop the controller tracing agent. Controller tracing agent
# must be stopped successfully to proceed.
# (1.3) 调用成员对象_controller_agent的.StartAgentTracing方法 #
# 记录一些log #
if not self._controller_agent.StopAgentTracing(
timeout=self._controller_config.timeout):
print 'Unable to stop controller tracing agent.'
return False

# Print warning if all agents not stopped.
na = len(self._child_agents)
ns = len(succ_agents)
if ns < na:
print 'Warning: Only %d of %d tracing agents stopped.' % (ns, na)
self._child_agents = succ_agents

# Collect the results from all the stopped tracing agents.
all_results =
# (1.4) 汇总agent list中agent和成员对象_controller_agent所有的result #
for agent in self._child_agents + [self._controller_agent]:
try:
result = agent.GetResults(
timeout=self._controller_config.collection_timeout)
if not result:
print 'Warning: Timeout when getting results from %s.' % str(agent)
continue
if result.source_name in [r.source_name for r in all_results]:
print ('Warning: Duplicate tracing agents named %s.' %
result.source_name)
all_results.append(result)
# Check for exceptions. If any exceptions are seen, reraise and abort.
# Note that a timeout exception will be swalloed by the timeout
# mechanism and will not get to that point (it will return False instead
# of the trace result, which will be dealt with above)
except:
print 'Warning: Exception getting results from %s:' % str(agent)
print sys.exc_info[0]
raise
self.all_results = all_results
return all_results

|→

继续调用到AtraceAgent类的.StopAgentTracing方法,./systrace/systrace/tracing_agents/atrace_agent.py:

class AtraceAgent(tracing_agents.TracingAgent):

@py_utils.Timeout(tracing_agents.START_STOP_TIMEOUT)
def StopAgentTracing(self, timeout=None):
"""Stops tracing and starts collecting results.

To synchronously retrieve the results after calling this function,
call GetResults.
"""

# (1.2.1) 启动stop tracing的线程 #
self._collection_thread = threading.Thread(
target=self._collect_and_preprocess)
self._collection_thread.start
return True



def _collect_and_preprocess(self):
"""Collects and preprocesses trace data.

Stores results in self._trace_data.
"""

# (1.2.1.1) dump trace数据,并stop tracing #
trace_data = self._collect_trace_data

# (1.2.1.2) 对获取的trace数据进行预处理 #
self._trace_data = self._preprocess_trace_data(trace_data)



def _collect_trace_data(self):
"""Reads the output from atrace and stops the trace."""

# (1.2.1.1.1) dump出trace数据 #
# 执行命令:'atrace ... categories --async_dump' #
dump_cmd = self._tracer_args + ['--async_dump']
result = self._device_utils.RunShellCommand(
dump_cmd, raw_output=True, large_output=True, check_return=True)

# (1.2.1.1.2) 找到trace数据中'TRACE\:'开头的位置 #
data_start = re.search(TRACE_START_REGEXP, result)
if data_start:
data_start = data_start.end(0)
else:
raise IOError('Unable to get atrace data. Did you forget adb root?')

# (1.2.1.1.3) 清除trace数据中类似无效数据:r'^capturing trace\.\.\. done|^capturing trace\.\.\.' #
output = re.sub(ADB_IGNORE_REGEXP, '', result[data_start:])

# (1.2.1.1.4) stop tracing #
# 执行命令:'atrace ... categories --async_stop' #
self._stop_trace
return output

def _preprocess_trace_data(self, trace_data):
"""Performs various processing on atrace data.

Args:
trace_data: The raw trace data.
Returns:
The processed trace data.
"""

# (1.2.1.2.1) 对trace数据进行一些strp和解压 #
if trace_data:
trace_data = strip_and_decompress_trace(trace_data)

if not trace_data:
print >> sys.stderr, ('No data was captured. Output file was not '
'written.')
sys.exit(1)

# (1.2.1.2.2) 修复MISSING_TGIDS #
if _FIX_MISSING_TGIDS:
# Gather proc data from device and patch tgids
procfs_dump = self._device_utils.RunShellCommand(
'echo -n /proc/[0-9]*/task/[0-9]*',
shell=True, check_return=True)[0].split(' ')
pid2_tgid = extract_tgids(procfs_dump)
trace_data = fix_missing_tgids(trace_data, pid2_tgid)

# (1.2.1.2.3) 修复CIRCULAR_TRACES #
if _FIX_CIRCULAR_TRACES:
trace_data = fix_circular_traces(trace_data)

return trace_data

|→

继续调用到AndroidProcessDataAgent类的.StopAgentTracing方法,./systrace/systrace/tracing_agents/android_process_data_agent.py:

class AndroidProcessDataAgent(tracing_agents.TracingAgent):

@py_utils.Timeout(tracing_agents.START_STOP_TIMEOUT)
def StopAgentTracing(self, timeout=None):
self._trace_data += self._get_process_snapshot
return True



def _get_process_snapshot(self):
use_legacy = False
try:
dump = self._device.RunShellCommand( \
PS_COMMAND_PROC, check_return=True, as_root=True, shell=True)
except AdbShellCommandFailedError:
use_legacy = True

# Check length of 2 as we execute two commands, which in case of failure
# on old devices output 1 line each.
if use_legacy or len(dump) == 2:
logging.debug('Couldn\'t parse ps dump, trying legacy method ...')
dump = self._device.RunShellCommand( \
PS_COMMAND_PROC_LEGACY, check_return=True, as_root=True, shell=True)
if len(dump) == 2:
logging.error('Unable to extract process data!')
return ""

return ' '.join(dump) + ' '

# 实际上就是在StartTracing和StopTracing时,各调用一次如下的ps命令 #
PS_COMMAND_PROC = "ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM" \
"&& ps -AT -o USER,PID,TID,CMD"

2.4、SystraceRunner.OutputSystraceResults

入口是SystraceRunner类的.OutputSystraceResults方法,./systrace/systrace/systrace_runner.py:

class SystraceRunner(object):

def OutputSystraceResults(self, write_json=False):
"""Output the results of systrace to a file.

If output is necessary, then write the results of systrace to either (a)
a standalone HTML file, or (b) a json file which can be read by the
trace viewer.

Args:
write_json: Whether to output to a json file (if false, use HTML file)
"""
print 'Tracing complete, writing results'
if write_json:
# (1) 输出成jason格式 #
result = output_generator.GenerateJSONOutput(
self._tracing_controller.all_results,
self._out_filename)
else:
# (2) 输出成html格式 #
# all_results是上一步stoptracing时得到的trace结果 #
# _out_filename是使用'-o'选项指定的文件名 #
result = output_generator.GenerateHTMLOutput(
self._tracing_controller.all_results,
self._out_filename)
print ' Wrote trace %s file: file://%s ' % (('JSON' if write_json
else 'HTML'), result)

继续调用到output_generator模块中的GenerateHTMLOutput函数,./systrace/systrace/output_generator.py:

def GenerateHTMLOutput(trace_results, output_file_name):
"""Write the results of systrace to an HTML file.

Args:
trace_results: A list of TraceResults.
output_file_name: The name of the HTML file that the trace viewer
results should be written to.
"""
def _ReadAsset(src_dir, filename):
return open(os.path.join(src_dir, filename)).read

# TODO(rnephew): The tracing output formatter is able to handle a single
# systrace trace just as well as it handles multiple traces. The obvious thing
# to do here would be to use it all for all systrace output: however, we want
# to continue using the legacy way of formatting systrace output when a single
# systrace and the tracing controller trace are present in order to match the
# Java verison of systrace. Java systrace is expected to be deleted at a later
# date. We should consolidate this logic when that happens.

# (2.1) 如果result list中成员个数大于3,使用新方法NewGenerateHTMLOutput #
if len(trace_results) > 3:
NewGenerateHTMLOutput(trace_results, output_file_name)
return os.path.abspath(output_file_name)

systrace_dir = os.path.abspath(os.path.dirname(__file__))

# (2.2) 尝试更新./systrace/systrace/systrace_trace_viewer.html文件 #
try:
from systrace import update_systrace_trace_viewer
except ImportError:
pass
else:
update_systrace_trace_viewer.update

trace_viewer_html = _ReadAsset(systrace_dir, 'systrace_trace_viewer.html')

# Open the file in binary mode to prevent python from changing the
# line endings, then write the prefix.

# (2.3) 读出'prefix.html','suffix.html','systrace_trace_viewer.html'文件的内容备用 #
systrace_dir = os.path.abspath(os.path.dirname(__file__))
html_prefix = _ReadAsset(systrace_dir, 'prefix.html')
html_suffix = _ReadAsset(systrace_dir, 'suffix.html')
trace_viewer_html = _ReadAsset(systrace_dir,
'systrace_trace_viewer.html')

# Open the file in binary mode to prevent python from changing the
# line endings, then write the prefix.
# (2.4) 打开一个名为'xxx.html'的输出文件,准备写入 #
html_file = open(output_file_name, 'wb')

# (2.4.1) 首先写入'prefix.html'的内容 #
# 并且把其中的'{{SYSTRACE_TRACE_VIEWER_HTML}}'字符用'systrace_trace_viewer.html'的文件内容替换 #
html_file.write(html_prefix.replace('{{SYSTRACE_TRACE_VIEWER_HTML}}',
trace_viewer_html))

# Write the trace data itself. There is a separate section of the form
#
# for each tracing agent (including the controller tracing agent).

# (2.4.2) 逐个按格式写入trace_results中的trace数据 #
html_file.write(' ')
for result in trace_results:
html_file.write(' ')
html_file.write(' ')

# Write the suffix and finish.
# (2.4.3) 最后写入'suffix.html'的内容 #
html_file.write(html_suffix)
html_file.close

final_path = os.path.abspath(output_file_name)
return final_path

2.4、执行的命令

我们在devil/devil/android/device_utils.py的RunShellCommand.run和devil/devil/utils/cmd_helper.py的Popen处加上调试打印,看到’./systrace -o trace.html -t 10 gfx view wm am dalvik input sched freq idle‘命令的具体执行过程如下:

ASOP_ROOT/external/chromium-trace/catapult/systrace/bin/systrace -o trace.html -t 10 gfx view wm am dalvik input sched freq idle
Popen: [u'/usr/bin/adb', 'devices']
run: ( c=/data/local/tmp/cache_token;echo $EXTERNAL_STORAGE;cat $c 2>/dev/||echo;echo "636d6fb8-d59c-11e8-ae24-b8ca3a959992">$c &&getprop )>/data/local/tmp/temp_file-4aec8d95e60eb
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( ( c=/data/local/tmp/cache_token;echo $EXTERNAL_STORAGE;cat $c 2>/dev/||echo;echo "636d6fb8-d59c-11e8-ae24-b8ca3a959992">$c &&getprop )>/data/local/tmp/temp_file-4aec8d95e60eb );echo %$?']
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'pull', '/data/local/tmp/temp_file-4aec8d95e60eb', '/tmp/tmple_FOq/tmp_ReadFileWithPull']
run: su 0 ls /root && ! ls /root
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', Popen: [u'/usr/bin/adb', 'shell''-s', , '8'( su 0 ls72QADT5KWKRG', /root && ! ls /root 'shell', );echo'rm -f /data/local/tmp/ %$?']temp_file-4aec8d95e60eb']

run: ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM&& ps -AT -o USER,PID,TID,CMD
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM&& ps -AT -o USER,PID,TID,CMD );echo %$?']
run: atrace --list_categories
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( atrace --list_categories );echo %$?']
run: atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_start
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_start );echo %$?']
Starting tracing (10 seconds)
Tracing completed. Collecting output...
run: ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM&& ps -AT -o USER,PID,TID,CMD
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM&& ps -AT -o USER,PID,TID,CMD );echo %$?']
run: ( atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_dump )>/data/local/tmp/temp_file-57edea8625b51
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( ( atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_dump )>/data/local/tmp/temp_file-57edea8625b51 );echo %$?']
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'pull', '/data/local/tmp/temp_file-57edea8625b51', '/tmp/tmp_preLk/tmp_ReadFileWithPull']
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', 'rm -f /data/local/tmp/temp_file-57edea8625b51']
run: atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_stop
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_stop );echo %$?']
run: echo -n /proc/[0-9]*/task/[0-9]*
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( echo -n /proc/[0-9]*/task/[0-9]* );echo %$?']
Outputting Systrace results...
Tracing complete, writing results

Wrote trace HTML file: file:///home/pengweilin/sdb2/m1872/code/repo/external/chromium-trace/catapult/systrace/bin/pwl1.html

精简起来是调用目标机上的atrace命令实现的:

adb shell atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_start
adb shell atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_dump > /data/local/tmp/temp_file-57edea8625b51
adb pull /data/local/tmp/temp_file-57edea8625b51 /tmp/tmp_preLk/tmp_ReadFileWithPull
adb shell rm -f /data/local/tmp/temp_file-57edea8625b51
adb shell atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_stop

3、目标机atrace命令的实现(c++)

从上一节可以看到主机的systrace命令实际最后都是调用目标机上的atrace命令实现的。从atrace的命令格式上看,分为两部分:

  • [options]:就是'-'和'--'打头的选项;

  • [categories...]:指定需要trace哪些事件。就是类似'gfx view wm am dalvik input sched freq idle'这些;

$ atrace --help
usage: atrace [options] [categories...]
options include:
-a appname enable app-level tracing for a comma separated list of cmdlines
-b N use a trace buffer size of N KB
-c trace into a circular buffer
-f filename use the categories written in a file as space-separated
values in a line
-k fname,... trace the listed kernel functions
-n ignore signals
-s N sleep for N seconds before tracing [default 0]
-t N trace for N seconds [default 5]
-z compress the trace dump
--async_start start circular trace and return immediately
--async_dump dump the current contents of circular trace buffer
--async_stop stop tracing and dump the current contents of circular
trace buffer
--stream stream trace to stdout as it enters the trace buffer
Note: this can take significant CPU time, and is best
used for measuring things that are not affected by
CPU performance, like pagecache usage.
--list_categories
list the available tracing categories
-o filename write the trace to the specified file instead
of stdout.

atrace的源码在ASOP_ROOT/frameworks/native/cmds/atrace/atrace.cpp,我们来具体分析一下它的实现:

int main(int argc, char **argv)
{
bool async = false;
bool traceStart = true;
bool traceStop = true;
bool traceDump = true;
bool traceStream = false;

/* (1) atrace跟"--help"选项,印出帮助信息 */
if (argc == 2 && 0 == strcmp(argv[1], "--help")) {
showHelp(argv[0]);
exit(0);
}

/* (2) ftrace目录是否存在 */
if (!findTraceFiles) {
fprintf(stderr, "No trace folder found ");
exit(-1);
}

/* (3) 逐条解析atrace的命令参数 */
for (;;) {
int ret;
int option_index = 0;
static struct option long_options = {
{"async_start", no_argument, 0, 0 },
{"async_stop", no_argument, 0, 0 },
{"async_dump", no_argument, 0, 0 },
{"list_categories", no_argument, 0, 0 },
{"stream", no_argument, 0, 0 },
{ 0, 0, 0, 0 }
};

/* (3.1) 尝试使用option来解析atrace的命令参数 */
ret = getopt_long(argc, argv, "a:b:cf:k:ns:t:zo:",
long_options, &option_index);

/* (3.2) 如果使用option解析失败,尝试使用category来进行解析 */
if (ret < 0) {
for (int i = optind; i < argc; i++) {
if (!setCategoryEnable(argv[i], true)) {
fprintf(stderr, "error enabling tracing category \"%s\" ", argv[i]);
exit(1);
}
}
break;
}

/* (3.3) 根据解析出来的option,设置flag */
switch(ret) {
case 'a':
g_debugAppCmdLine = optarg;
break;

case 'b':
g_traceBufferSizeKB = atoi(optarg);
break;

case 'c':
g_traceOverwrite = true;
break;

case 'f':
g_categoriesFile = optarg;
break;

case 'k':
g_kernelTraceFuncs = optarg;
break;

case 'n':
g_nohup = true;
break;

case 's':
g_initialSleepSecs = atoi(optarg);
break;

case 't':
g_traceDurationSeconds = atoi(optarg);
break;

case 'z':
g_compress = true;
break;

case 'o':
g_outputFile = optarg;
break;

case 0:
if (!strcmp(long_options[option_index].name, "async_start")) {
async = true;
traceStop = false;
traceDump = false;
g_traceOverwrite = true;
} else if (!strcmp(long_options[option_index].name, "async_stop")) {
async = true;
traceStart = false;
} else if (!strcmp(long_options[option_index].name, "async_dump")) {
async = true;
traceStart = false;
traceStop = false;
} else if (!strcmp(long_options[option_index].name, "stream")) {
traceStream = true;
traceDump = false;
} else if (!strcmp(long_options[option_index].name, "list_categories")) {
listSupportedCategories;
exit(0);
}
break;

default:
fprintf(stderr, " ");
showHelp(argv[0]);
exit(-1);
break;
}
}

registerSigHandler;

if (g_initialSleepSecs > 0) {
sleep(g_initialSleepSecs);
}

bool ok = true;
/* (4) "-- async_start" option命令的实际执行动作 */
if (traceStart) {
/* (4.1) */
ok &= setUpTrace;

/* (4.2) */
ok &= startTrace;
}

if (ok && traceStart) {
if (!traceStream) {
printf("capturing trace...");
fflush(stdout);
}

// We clear the trace after starting it because tracing gets enabled for
// each CPU individually in the kernel. Having the beginning of the trace
// contain entries from only one CPU can cause "begin" entries without a
// matching "end" entry to show up if a task gets migrated from one CPU to
// another.
/* (4.3) 清除trace内容 */
ok = clearTrace;

/* (4.4) 通过trace_marker文件接口,写入同步信息 */
writeClockSyncMarker;

/* (4.5) async==false的支持 */
if (ok && !async && !traceStream) {
// Sleep to allow the trace to be captured.
struct timespec timeLeft;
timeLeft.tv_sec = g_traceDurationSeconds;
timeLeft.tv_nsec = 0;
do {
if (g_traceAborted) {
break;
}
} while (nanosleep(&timeLeft, &timeLeft) == -1 && errno == EINTR);
}

/* (4.6) traceStream的支持 */
if (traceStream) {
streamTrace;
}
}

// Stop the trace and restore the default settings.
/* (5) "-- async_stop" option命令的实际执行动作 */
if (traceStop)
stopTrace;

/* (6) "-- async_dump" option命令的实际执行动作 */
if (ok && traceDump) {
if (!g_traceAborted) {
printf(" done ");
fflush(stdout);
int outFd = STDOUT_FILENO;
if (g_outputFile) {
outFd = open(g_outputFile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
}
if (outFd == -1) {
printf("Failed to open '%s', err=%d", g_outputFile, errno);
} else {
dprintf(outFd, "TRACE: ");
dumpTrace(outFd);
if (g_outputFile) {
close(outFd);
}
}
} else {
printf(" trace aborted. ");
fflush(stdout);
}
clearTrace;
} else if (!ok) {
fprintf(stderr, "unable to start tracing ");
}

// Reset the trace buffer size to 1.
if (traceStop)
cleanUpTrace;

return g_traceAborted ? 1 : 0;
}

3.1、categories

categories指的是可被trace的事件的类别,它的全集如下:

/* Tracing categories */
static const TracingCategory k_categories = {
{ "gfx", "Graphics", ATRACE_TAG_GRAPHICS, {
{ OPT, "events/mdss/enable" },
} },
{ "input", "Input", ATRACE_TAG_INPUT, { } },
{ "view", "View System", ATRACE_TAG_VIEW, { } },
{ "webview", "WebView", ATRACE_TAG_WEBVIEW, { } },
{ "wm", "Window Manager", ATRACE_TAG_WINDOW_MANAGER, { } },
{ "am", "Activity Manager", ATRACE_TAG_ACTIVITY_MANAGER, { } },
{ "sm", "Sync Manager", ATRACE_TAG_SYNC_MANAGER, { } },
{ "audio", "Audio", ATRACE_TAG_AUDIO, { } },
{ "video", "Video", ATRACE_TAG_VIDEO, { } },
{ "camera", "Camera", ATRACE_TAG_CAMERA, { } },
{ "hal", "Hardware Modules", ATRACE_TAG_HAL, { } },
{ "app", "Application", ATRACE_TAG_APP, { } },
{ "res", "Resource Loading", ATRACE_TAG_RESOURCES, { } },
{ "dalvik", "Dalvik VM", ATRACE_TAG_DALVIK, { } },
{ "rs", "RenderScript", ATRACE_TAG_RS, { } },
{ "bionic", "Bionic C Library", ATRACE_TAG_BIONIC, { } },
{ "power", "Power Management", ATRACE_TAG_POWER, { } },
{ "pm", "Package Manager", ATRACE_TAG_PACKAGE_MANAGER, { } },
{ "ss", "System Server", ATRACE_TAG_SYSTEM_SERVER, { } },
{ "database", "Database", ATRACE_TAG_DATABASE, { } },
{ "network", "Network", ATRACE_TAG_NETWORK, { } },
{ "adb", "ADB", ATRACE_TAG_ADB, { } },
{ k_coreServiceCategory, "Core services", 0, { } },
{ k_pdxServiceCategory, "PDX services", 0, { } },
{ "sched", "CPU Scheduling", 0, {
{ REQ, "events/sched/sched_switch/enable" },
{ REQ, "events/sched/sched_wakeup/enable" },
{ OPT, "events/sched/sched_waking/enable" },
{ OPT, "events/sched/sched_blocked_reason/enable" },
{ OPT, "events/sched/sched_cpu_hotplug/enable" },
{ OPT, "events/cgroup/enable" },
} },
{ "irq", "IRQ Events", 0, {
{ REQ, "events/irq/enable" },
{ OPT, "events/ipi/enable" },
} },
{ "i2c", "I2C Events", 0, {
{ REQ, "events/i2c/enable" },
{ REQ, "events/i2c/i2c_read/enable" },
{ REQ, "events/i2c/i2c_write/enable" },
{ REQ, "events/i2c/i2c_result/enable" },
{ REQ, "events/i2c/i2c_reply/enable" },
{ OPT, "events/i2c/smbus_read/enable" },
{ OPT, "events/i2c/smbus_write/enable" },
{ OPT, "events/i2c/smbus_result/enable" },
{ OPT, "events/i2c/smbus_reply/enable" },
} },
{ "freq", "CPU Frequency", 0, {
{ REQ, "events/power/cpu_frequency/enable" },
{ OPT, "events/power/clock_set_rate/enable" },
{ OPT, "events/power/cpu_frequency_limits/enable" },
} },
{ "membus", "Memory Bus Utilization", 0, {
{ REQ, "events/memory_bus/enable" },
} },
{ "idle", "CPU Idle", 0, {
{ REQ, "events/power/cpu_idle/enable" },
} },
{ "disk", "Disk I/O", 0, {
{ OPT, "events/f2fs/f2fs_sync_file_enter/enable" },
{ OPT, "events/f2fs/f2fs_sync_file_exit/enable" },
{ OPT, "events/f2fs/f2fs_write_begin/enable" },
{ OPT, "events/f2fs/f2fs_write_end/enable" },
{ OPT, "events/ext4/ext4_da_write_begin/enable" },
{ OPT, "events/ext4/ext4_da_write_end/enable" },
{ OPT, "events/ext4/ext4_sync_file_enter/enable" },
{ OPT, "events/ext4/ext4_sync_file_exit/enable" },
{ REQ, "events/block/block_rq_issue/enable" },
{ REQ, "events/block/block_rq_complete/enable" },
} },
{ "mmc", "eMMC commands", 0, {
{ REQ, "events/mmc/enable" },
} },
{ "load", "CPU Load", 0, {
{ REQ, "events/cpufreq_interactive/enable" },
} },
{ "sync", "Synchronization", 0, {
{ REQ, "events/sync/enable" },
} },
{ "workq", "Kernel Workqueues", 0, {
{ REQ, "events/workqueue/enable" },
} },
{ "memreclaim", "Kernel Memory Reclaim", 0, {
{ REQ, "events/vmscan/mm_vmscan_direct_reclaim_begin/enable" },
{ REQ, "events/vmscan/mm_vmscan_direct_reclaim_end/enable" },
{ REQ, "events/vmscan/mm_vmscan_kswapd_wake/enable" },
{ REQ, "events/vmscan/mm_vmscan_kswapd_sleep/enable" },
{ REQ, "events/lowmemorykiller/enable" },
} },
{ "regulators", "Voltage and Current Regulators", 0, {
{ REQ, "events/regulator/enable" },
} },
{ "binder_driver", "Binder Kernel driver", 0, {
{ REQ, "events/binder/binder_transaction/enable" },
{ REQ, "events/binder/binder_transaction_received/enable" },
{ OPT, "events/binder/binder_set_priority/enable" },
} },
{ "binder_lock", "Binder global lock trace", 0, {
{ OPT, "events/binder/binder_lock/enable" },
{ OPT, "events/binder/binder_locked/enable" },
{ OPT, "events/binder/binder_unlock/enable" },
} },
{ "pagecache", "Page cache", 0, {
{ REQ, "events/filemap/enable" },
} },
};

这些categories可以简单分为3类:

  • 1、使用内核ftrace的trace event实现的内核事件。例如"sched",.sysfiles字段指定了trace event的路径;

  • 2、使用Trace类实现的用户态事件(app/java framework/native)。例如"input",.tags字段指定了trace tag;

  • 3、service类型的事件。如k_coreServiceCategory和k_pdxServiceCategory;

具体目标机可能只支持categories全集中的一部分,所以atrace会对命令参数中指定的categories本机是否支持进行判断:

main -> setCategoryEnable:

static bool setCategoryEnable(const char* name, bool enable)
{
/* (1) 逐个判断命令参数中的categories,本机是否支持 */
for (size_t i = 0; i < arraysize(k_categories); i++) {
const TracingCategory& c = k_categories[i];
if (strcmp(name, c.name) == 0) {

/* (2) 判断categories是否支持 */
if (isCategorySupported(c)) {
g_categoryEnables[i] = enable;
return true;
} else {
/* (3) 如果本机不支持categories,原因是否是没有root权限? */
if (isCategorySupportedForRoot(c)) {
fprintf(stderr, "error: category \"%s\" requires root "
"privileges. ", name);

/* (4) 本机不支持categories,报错 */
} else {
fprintf(stderr, "error: category \"%s\" is not supported "
"on this device. ", name);
}
return false;
}
}
}
fprintf(stderr, "error: unknown tracing category \"%s\" ", name);
return false;
}



static bool isCategorySupported(const TracingCategory& category)
{
/* (2.1) 如果category是k_coreServiceCategory,判断name是否为"core_services",判断"ro.atrace.core.services"的property是否存在 */
if (strcmp(category.name, k_coreServiceCategory) == 0) {
return !android::base::GetProperty(k_coreServicesProp, "").empty;
}


/* (2.2) 如果category是k_pdxServiceCategory,判断name是否为"pdx" */
if (strcmp(category.name, k_pdxServiceCategory) == 0) {
return true;
}

bool ok = category.tags != 0;
/* (2.2) 如果category是使用了内核的trace event,判断trace4event相应的文件是否存在是否可写 */
for (int i = 0; i < MAX_SYS_FILES; i++) {
const char* path = category.sysfiles[i].path;
bool req = category.sysfiles[i].required == REQ;
if (path != ) {
if (req) {
if (!fileIsWritable(path)) {
return false;
} else {
ok = true;
}
} else {
ok = true;
}
}
}
return ok;
}

我们也可以使用atrace命令的"--list_categories"选项,列出当前目标机支持的所有categories:

main -> listSupportedCategories:

static voidlistSupportedCategories
{
/* (1) 逐个遍历k_categories数组中的所有category */
for (size_t i = 0; i < arraysize(k_categories); i++) {
const TracingCategory& c = k_categories[i];

/* (2) 判断当前目标机系统是否支持 */
if (isCategorySupported(c)) {
printf(" %10s - %s ", c.name, c.longname);
}
}
}

3.2、'--async_start'

当atrace命令使用了'--async_start'选项以后,会启动指定的所有categories:

static bool setUpTrace
{
bool ok = true;

// Set up the tracing options.
/* (4.1.1) 如果categories是使用文件指定的,解析文件中的categories并使能 */
ok &= setCategoriesEnableFromFile(g_categoriesFile);

/* (4.1.2) 根据'-c'选项,配置trace event的overwrite:"/sys/kernel/debug/tracing/options/overwrite" */
ok &= setTraceOverwriteEnable(g_traceOverwrite);

/* (4.1.3) 根据'-b'选项,配置trace event的buffer size:"/sys/kernel/debug/tracing/buffer_size_kb" */
ok &= setTraceBufferSizeKB(g_traceBufferSizeKB);
// TODO: Re-enable after stabilization
//ok &= setCmdlineSize;

/* (4.1.4) 根据选项,配置:"/sys/kernel/debug/tracing/trace_clock" */
ok &= setClock;

/* (4.1.5) 使能:"/sys/kernel/debug/tracing/options/print-tgid" */
ok &= setPrintTgidEnableIfPresent(true);

/* (4.1.6) 根据'-k'选项指定的需要trace的kernel 函数,
配置当前tracer为function_graph:"/sys/kernel/debug/tracing/current_tracer"
在filter中配置需要trace的函数名:"/sys/kernel/debug/tracing/set_ftrace_filter"
*/
ok &= setKernelTraceFuncs(g_kernelTraceFuncs);

// Set up the tags property.
uint64_t tags = 0;
/* (4.1.7) 把categories中需要trace的tag进行'或'操作,并配置到"debug.atrace.tags.enableflags"的property中 */
for (size_t i = 0; i < arraysize(k_categories); i++) {
if (g_categoryEnables[i]) {
const TracingCategory &c = k_categories[i];
tags |= c.tags;
}
}
ok &= setTagsProperty(tags);

bool coreServicesTagEnabled = false;
for (size_t i = 0; i < arraysize(k_categories); i++) {
if (strcmp(k_categories[i].name, k_coreServiceCategory) == 0) {
coreServicesTagEnabled = g_categoryEnables[i];
}

// Set whether to poke PDX services in this session.
if (strcmp(k_categories[i].name, k_pdxServiceCategory) == 0) {
g_tracePdx = g_categoryEnables[i];
}
}

/* (4.1.8) 根据'-a'选项指定的需要trace的app name,最多指定16个,
配置到"debug.atrace.app_%d"的property当中
*/
std::string packageList(g_debugAppCmdLine);
if (coreServicesTagEnabled) {
if (!packageList.empty) {
packageList += ",";
}
packageList += android::base::GetProperty(k_coreServicesProp, "");
}
ok &= setAppCmdlineProperty(&packageList[0]);

/* (4.1.9) 让defaultServiceManager中的所有service重新读取property,使修改的配置生效 */
ok &= pokeBinderServices;

/* (4.1.10) 让::android::hardware::defaultServiceManager中的所有service重新读取property,使修改的配置生效 */
pokeHalServices;

/* (4.1.11) 如果categories指定了"pdx",让pdx service重新读取property,使修改的配置生效 */
if (g_tracePdx) {
ok &= ServiceUtility::PokeServices;
}

// Disable all the sysfs enables. This is done as a separate loop from
// the enables to allow the same enable to exist in multiple categories.
/* (4.1.12) disable掉所有的trace event */
ok &= disableKernelTraceEvents;

// Enable all the sysfs enables that are in an enabled category.
/* (4.1.13) 根据categories数组中的配置,使能相应的trace event */
for (size_t i = 0; i < arraysize(k_categories); i++) {
if (g_categoryEnables[i]) {
const TracingCategory &c = k_categories[i];
for (int j = 0; j < MAX_SYS_FILES; j++) {
const char* path = c.sysfiles[j].path;
bool required = c.sysfiles[j].required == REQ;
if (path != ) {
if (fileIsWritable(path)) {
ok &= setKernelOptionEnable(path, true);
} else if (required) {
fprintf(stderr, "error writing file %s ", path);
ok = false;
}
}
}
}
}

return ok;
}
static bool startTrace
{
return setTracingEnabled(true);
}



// Enable or disable kernel tracing.
static bool setTracingEnabled(bool enable)
{
/* (4.2.1) 打开总开关:"/sys/kernel/debug/tracing/tracing_on" */
return setKernelOptionEnable(k_tracingOnPath, enable);
}

3.3、'--async_dump'

当atrace命令使用了'--async_dump'选项以后,会dump出trace内容:

main
{

/* (6) "-- async_dump" option命令的实际执行动作 */
if (ok && traceDump) {
if (!g_traceAborted) {
printf(" done ");
fflush(stdout);

/* (6.1) 默认的dump信息输出句柄 */
int outFd = STDOUT_FILENO;

/* (6.2) 使用'-o'选项指定的dump信息输出句柄 */
if (g_outputFile) {
outFd = open(g_outputFile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
}
if (outFd == -1) {
printf("Failed to open '%s', err=%d", g_outputFile, errno);
} else {
dprintf(outFd, "TRACE: ");

/* (6.3) 读出"/sys/kernel/debug/tracing/trace"的内容dump到输出句柄 */
dumpTrace(outFd);
if (g_outputFile) {
close(outFd);
}
}
} else {
printf(" trace aborted. ");
fflush(stdout);
}
clearTrace;
} else if (!ok) {
fprintf(stderr, "unable to start tracing ");
}

}



3.4、'--async_stop'

当atrace命令使用了'--async_stop'选项以后,会停止所有categories的trace信息抓取:

static void stopTrace
{
setTracingEnabled(false);
}



static bool setTracingEnabled(bool enable)
{
/* (5.1) 关闭总开关:"/sys/kernel/debug/tracing/tracing_on" */
return setKernelOptionEnable(k_tracingOnPath, enable);
}

4、目标机Trace类的使用(java/c++)

atrace只是一个控制命令,实际的trace数据路径是这样分工的:

  • 1、内核trace信息,通过trace event记录到ftrace的buffer中;

  • 2、用户态(app/java framework/native)是通过使用Trace类来记录trace信息的,实际上是通过"/sys/kernel/debug/tracing/trace_marker"接口记录到内核ftracebuffer当中的。

我们查看atrace dump出来的原始信息,可以看到对应的"tracing_mark_write"字段:

 ndroid.systemui-1856 ( 1856) [001] ...1 89888.553074: tracing_mark_write: S|1856|animator:alpha|62928891
ndroid.systemui-1856 ( 1856) [001] ...1 89888.553096: tracing_mark_write: F|1856|animator:alpha|62928891
ndroid.systemui-1856 ( 1856) [001] ...1 89888.553110: tracing_mark_write: S|1856|animator:scaleX|37096049
ndroid.systemui-1856 ( 1856) [001] ...1 89888.553131: tracing_mark_write: F|1856|animator:scaleX|37096049

这样内核态和用户态的trace数据最后都写入到了ftrace buffer当中,也实现了时间同步,最后通过"/sys/kernel/debug/tracing/trace"接口读出。

在用户态的各个层次是这样使用Trace类的:

4.1、app

import android.os.Trace;
Trace.beginSection(String sectionName)
Trace.EndSection

详细例子可以参考:systrace

public class MyAdapter extends RecyclerView.Adapter {
...
@Override
public MyViewHolder onCreateViewHolder(ViewGroup parent, int viewType) {
Trace.beginSection("MyAdapter.onCreateViewHolder");
MyViewHolder myViewHolder;
try {
myViewHolder = MyViewHolder.newInstance(parent);
} finally {
// In 'try...catch' statements, always call endSection
// in a 'finally' block to ensure it is invoked even when an exception
// is thrown.
Trace.endSection;
}
return myViewHolder;
}



@Override
public void onBindViewHolder(MyViewHolder holder, int position) {
Trace.beginSection("MyAdapter.onBindViewHolder");
try {
try {
Trace.beginSection("MyAdapter.queryDatabase");
RowItem rowItem = queryDatabase(position);
mDataset.add(rowItem);
} finally {
Trace.endSection;
}
holder.bind(mDataset.get(position));
} finally {
Trace.endSection;
}
}
...
}

然后通过python systrace.py --app=sectionName 指定apk,或者通过ddms选择指定apk,抓取systrace分析。

4.2、Java framework

import android.os.Trace;
Trace.traceBegin(long traceTag, String methodName)
Trace.traceEnd(long traceTag)

4.3、Native framework

#include 
ATRACE_CALL

5、主机Trace-Viewer的实现(js)

从systrace python文件的解析我们可以看到,最后把trace数据和'prefix.html'、'suffix.html'、'systrace_trace_viewer.html'合成一个'trace.html'文件,使用chrome浏览器打开'trace.html'就可以非常方便的以图形化的形式来查看和分析trace数据。

这里面的关键就是Trace-Viewer,核心是其中的js脚本。该项目的github主页。

本人js功底太差,看了一堆资料还是不得要领。如有感兴趣的同学可以继续深入研究,Trace-Viewer功能非常强大,用来做trace数据的分析和实现再合适不过了,如果可以利用起来自己修改非常强大。

发表评论
留言与评论(共有 0 条评论) “”
   
验证码:

相关文章

推荐文章