Profiling Python
DataKit Python profiling supports dd-trace-py and py-spy.
Requirements
Install DataKit and enable the profile input.
Use dd-trace-py
- Install dd-trace-py library
Info
DataKit is currently compatible with dd-trace-py 1.14.x and earlier; later versions have not been tested.
- Profiling by launching the application with ddtrace-run
DD_PROFILING_ENABLED=true \
DD_ENV=dev \
DD_SERVICE=my-web-app \
DD_VERSION=1.0.3 \
DD_TRACE_AGENT_URL=http://127.0.0.1:9529 \
ddtrace-run python app.py
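When the launch needs to be scripted, the same environment can be assembled from Python. The following is only a minimal sketch: `build_profiling_env` and `launch_with_profiling` are hypothetical helper names that are not part of DataKit or dd-trace-py, and the variable names and values simply mirror the command above.

```python
import os
import subprocess


def build_profiling_env(service, env, version,
                        agent_url="http://127.0.0.1:9529", base=None):
    """Return a copy of the environment with the profiling variables set.

    Hypothetical helper: the variable names are the ones used in the
    shell command above; everything else is illustrative.
    """
    result = dict(os.environ if base is None else base)
    result.update(
        {
            "DD_PROFILING_ENABLED": "true",
            "DD_ENV": env,
            "DD_SERVICE": service,
            "DD_VERSION": version,
            "DD_TRACE_AGENT_URL": agent_url,
        }
    )
    return result


def launch_with_profiling(script, **kwargs):
    """Start `ddtrace-run python <script>` with the profiling environment."""
    return subprocess.Popen(
        ["ddtrace-run", "python", script],
        env=build_profiling_env(**kwargs),
    )
```

Under these assumptions, `launch_with_profiling("app.py", service="my-web-app", env="dev", version="1.0.3")` would start the application in the same way as the shell command and return the `Popen` handle.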
- Profiling by writing code
import time

import ddtrace
from ddtrace.profiling import Profiler

# Point the tracer at the local DataKit listener.
ddtrace.tracer.configure(
    https=False,
    hostname="localhost",
    port=9529,
)

prof = Profiler()
# Positional arguments: stop_on_exit=True, profile_children=True
prof.start(True, True)

# your code here ...
# while True:
#     time.sleep(1)
With this approach, there is no need to use the ddtrace-run command; start the application as usual.
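Both approaches point at the same DataKit endpoint, so it can be convenient to derive the in-code settings from a single URL. The sketch below uses only the standard library; `agent_settings_from_url` is a hypothetical helper, and the key names it returns are assumed to match the keyword arguments passed to `tracer.configure` in the snippet above.

```python
from urllib.parse import urlparse


def agent_settings_from_url(url):
    """Split a DataKit URL into https / hostname / port settings.

    Hypothetical helper: the returned keys mirror the keyword arguments
    used in the configuration snippet above.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme in {url!r}")
    https = parsed.scheme == "https"
    # Fall back to the scheme's standard port when the URL omits one.
    default_port = 443 if https else 80
    return {
        "https": https,
        "hostname": parsed.hostname,
        "port": parsed.port or default_port,
    }
```

For example, the result of `agent_settings_from_url("http://127.0.0.1:9529")` could be unpacked into the configuration call with `**`, keeping the address in one place.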
View Profile
After a minute or two, you can view your profiles under APM -> Profile.
Use py-spy
py-spy is a non-invasive sampling profiler for Python provided by the open source community. It runs independently of the target program and has little impact on its load. By default, py-spy writes sampling data to a local file in the format selected by the given parameters. To simplify the integration of py-spy and DataKit, a forked version, py-spy-for-datakit, is provided; it makes minor modifications to the original and automatically sends profiling data to DataKit.
- Installation
Installing with pip is the recommended way.
The steps below install the precompiled version, using the Linux x86_64 platform as an example (other platforms are similar):
# after download binary
# use pip to install
pip3 install --force-reinstall --no-index --find-links . py-spy-for-datakit
# confirm successful installation
py-spy-for-datakit help
If your machine has Rust and cargo installed, you can also install it with cargo.
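To confirm the installation from a script rather than interactively, one option is to check that the executable is on `PATH`. This is a minimal sketch that assumes the installer places a `py-spy-for-datakit` executable on `PATH`; `find_executable` is a hypothetical helper name.

```python
import shutil


def find_executable(name="py-spy-for-datakit"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)
```

A return value of `None` suggests that the install step did not complete or that the install location is not on `PATH`.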
- Usage
py-spy-for-datakit adds a datakit subcommand to the original py-spy subcommands; it is used specifically to send sampling data to DataKit. Run py-spy-for-datakit help datakit to see its usage:
| Option | Description | Default |
|---|---|---|
| -H, --host | DataKit listening host | 127.0.0.1 |
| -P, --port | DataKit listening port | 9529 |
| -S, --service | Your service name | unnamed-service |
| -E, --env | Your app deploy environment | unnamed-env |
| -V, --version | Your app version | unnamed-version |
| -p, --pid | Target process PID | None; you must set either this option or a command |
| -d, --duration | Profiling duration | 60 |
| -r, --rate | Profiling rate | 100 |
| -s, --subprocesses | Whether to profile subprocesses | false |
| -i, --idle | Whether to profile idle threads | false |
To profile a program that is already running, pass the PID of the Python process to py-spy-for-datakit with the --pid <PID> or -p <PID> parameter.
Suppose your target process PID is 12345 and DataKit is listening at 127.0.0.1:9529:
py-spy-for-datakit datakit \
--host 127.0.0.1 \
--port 9529 \
--service <your-service-name> \
--env testing \
--version v0.1 \
--duration 60 \
--pid 12345
If needed, prefix the command with sudo.
py-spy-for-datakit can also launch the Python program itself, in which case no PID is needed and sampling begins as soon as the program starts. The command is similar:
py-spy-for-datakit datakit \
--host 127.0.0.1 \
--port 9529 \
--service your-service-name \
--env testing \
--version v0.1 \
-d 60 \
-- python3 server.py # Note the space between -- and python3
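The two invocation modes differ only in their final arguments, which the following sketch makes explicit. It is illustrative only: `build_datakit_command` is a hypothetical helper, it uses only options listed in the table above, and requiring exactly one of `pid` or `command` is a simplification chosen for this sketch.

```python
def build_datakit_command(
    service,
    env,
    version,
    pid=None,
    command=None,
    host="127.0.0.1",
    port=9529,
    duration=60,
):
    """Build the argument list for `py-spy-for-datakit datakit`.

    Pass `pid` to attach to a running process, or `command` (a list of
    strings) to launch a new program. Exactly one must be given.
    """
    if (pid is None) == (command is None):
        raise ValueError("specify exactly one of pid or command")

    args = [
        "py-spy-for-datakit", "datakit",
        "--host", host,
        "--port", str(port),
        "--service", service,
        "--env", env,
        "--version", version,
        "--duration", str(duration),
    ]
    if pid is not None:
        args += ["--pid", str(pid)]
    else:
        # "--" separates py-spy-for-datakit options from the target command.
        args += ["--", *command]
    return args
```

The resulting list can be passed to `subprocess.run(args, check=True)`. Building a list instead of a shell string avoids quoting problems in the target command.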
After a minute or two, you can view your profiles under APM -> Profile.