Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mysql-connector instrumentor support for sqlcommenting #3023

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

tammy-baylis-swi
Copy link
Contributor

@tammy-baylis-swi tammy-baylis-swi commented Nov 19, 2024

Description

Updates mysql-connector instrumentor to support sqlcommenting.

To leverage existing code and reduce duplication, this updates DB-API integration instrumentor with new abstract classes BaseTracedConnectionProxy and BaseTracedCursorProxy, while keeping uses by aiopg, psycopg2, psycopg instrumentors the same.

To perform checks specific to mysql-connector, its instrumentor has new custom subclasses and methods that extend from DB-API integration, sort of like what psycopg instrumentor does. This is to enable_commenter at the db client cursor level (instead of DB-API level) and support mixes of cursor types instrumented. See below for More background.

Fixes #2902

Replaces other PR #2943

More background

tl;dr: Injecting a unique sqlcomment into a MySQL statement, i.e. with traceparent, causes mysql-connector to treat it as new, which imposes new, extra overhead if the cursor is enabled for server-side prepared statements unless we skip it.

db client cursors, prepared statements, and sqlcommenting

If an app-side, mysql-connector cursor is set with prepared=True, mysql-connector will perform native prepared statement execution (see docs here). Here is an example of 'prepared' cursor setup and usage in an instrumented db client app (without sqlcommenting):

import mysql.connector
from opentelemetry.instrumentation.mysql import MySQLInstrumentor

MySQLInstrumentor().instrument()
cnx = mysql.connector.connect(
    host="foo-host",
    port="3306",
    user="foo-user",
    password="bar-password",
    database="foo-db",
)
cursor = cnx.cursor(prepared=True)
stmt = "SELECT * FROM city WHERE id = %s"  # (1)
cursor.execute(stmt, (1818,))              # (2)
cursor.execute(stmt, (1900,))              # (3)
cursor.execute(stmt, (2000,))              # (4)

The mysql-connector cursor will correctly optimize by only doing a Prepare phase for the execute marked with (2). The execute and (3) and (4) will skip Prepare and only do Execute on the server. Here are the MySQL general logs that result:

2024-11-25T20:19:14.260210Z	   13 Prepare	SELECT * FROM city WHERE id = ?
2024-11-25T20:19:14.260479Z	   13 Reset stmt
2024-11-25T20:19:14.260896Z	   13 Execute	SELECT * FROM city WHERE id = 1818
2024-11-25T20:19:14.262569Z	   13 Reset stmt
2024-11-25T20:19:14.262897Z	   13 Execute	SELECT * FROM city WHERE id = 1900
2024-11-25T20:19:14.263474Z	   13 Reset stmt
2024-11-25T20:19:14.263810Z	   13 Execute	SELECT * FROM city WHERE id = 2000

If we use mysql-connector cursor with prepared=True and we enable sqlcommenting, native prepared statement execution would introduce a new performance penalty unless we take precautions (see further down for Solution). Here is the same client app setup but with enable_commenter=True:

import mysql.connector
from opentelemetry.instrumentation.mysql import MySQLInstrumentor

MySQLInstrumentor().instrument(
    enable_commenter=True,  # !!!
)
cnx = mysql.connector.connect(
    host="foo-host",
    port="3306",
    user="foo-user",
    password="bar-password",
    database="foo-db",
)
cursor = cnx.cursor(prepared=True)
stmt = "SELECT * FROM city WHERE id = %s"  # (1)
cursor.execute(stmt, (1818,))              # (2)
cursor.execute(stmt, (1900,))              # (3)
cursor.execute(stmt, (2000,))              # (4)

This time, the mysql-connector cursor will NOT correctly optimize and do more work because of the introduction of sqlcomment uniqueness! Every client-side execute ((2), (3), and (4)) results in a Prepare + Execute, plus a Close in between! More processing, more caching, more memory use, and increased latency:

2024-11-25T20:28:37.423818Z	   35 Prepare	SELECT * FROM city WHERE id = ? /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-b80c8f24dd4f8025-01'*/
2024-11-25T20:28:37.423935Z	   35 Reset stmt
2024-11-25T20:28:37.424463Z	   35 Execute	SELECT * FROM city WHERE id = 1818 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-b80c8f24dd4f8025-01'*/
2024-11-25T20:28:37.425580Z	   35 Close stmt
2024-11-25T20:28:37.425800Z	   35 Prepare	SELECT * FROM city WHERE id = ? /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-bdccd0b081753e66-01'*/
2024-11-25T20:28:37.425907Z	   35 Reset stmt
2024-11-25T20:28:37.426158Z	   35 Execute	SELECT * FROM city WHERE id = 1900 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-bdccd0b081753e66-01'*/
2024-11-25T20:28:37.426905Z	   35 Close stmt
2024-11-25T20:28:37.427029Z	   35 Prepare	SELECT * FROM city WHERE id = ? /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-72dec2e23f36f12f-01'*/
2024-11-25T20:28:37.427220Z	   35 Reset stmt
2024-11-25T20:28:37.427675Z	   35 Execute	SELECT * FROM city WHERE id = 2000 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-cdd1e98d2821c94d4ae17c92a2f4c6e8-72dec2e23f36f12f-01'*/

This does not happen if mysql-connector cursor is set with the default prepared=False, without nor with sqlcommenting (logs shown respectively)

2024-11-25T20:33:43.151505Z	   46 Query	SELECT * FROM city WHERE id = 1818
2024-11-25T20:33:43.152935Z	   46 Query	SELECT * FROM city WHERE id = 1900
2024-11-25T20:33:43.153776Z	   46 Query	SELECT * FROM city WHERE id = 2000
2024-11-25T20:34:51.497504Z	   49 Query	SELECT * FROM city WHERE id = 1818 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-8e131b2efc67b5d49a373bcd074ea04b-1687fb39738d86f9-01'*/
2024-11-25T20:34:51.499008Z	   49 Query	SELECT * FROM city WHERE id = 1900 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-8e131b2efc67b5d49a373bcd074ea04b-ad727cb2192ad5a0-01'*/
2024-11-25T20:34:51.499537Z	   49 Query	SELECT * FROM city WHERE id = 2000 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-8e131b2efc67b5d49a373bcd074ea04b-76fa05d5df541c4f-01'*/

Solution: the new mysql-connector instrumentor implementation of TracedConnectionProxy.cursor() is a precaution that checks and disables sqlcommenting if prepared=True, and only for that cursor. Tracing is still enabled because that's still correct; we just don't do sqlcommenting or else performance penalty. Other cursors (i.e. prepared=False) will still have both tracing and sqlcommenting performed. This check is in the mysql-connector instrumentor and not the DB-API integration because native prepared statement execution as described above is only a mysql-connector feature, and not in mysqlclient nor PyMySQL.

Here is an example client app where instrumentation is equipped with new TracedConnectionProxy.cursor under the hood while user sets enable_commenter=True, one mysql-connector cursor prepared=True, then another cursor after with prepared=False:

import mysql.connector
from opentelemetry.instrumentation.mysql import MySQLInstrumentor
# ^^ includes `TracedConnectionProxy.cursor` kwargs check

MySQLInstrumentor().instrument(
    enable_commenter=True,
)
cnx = mysql.connector.connect(
    host="foo-host",
    port="3306",
    user="foo-user",
    password="bar-password",
    database="foo-db",
)
cursor = cnx.cursor(prepared=True)
stmt = "SELECT * FROM city WHERE id = %s"  # (1)
cursor.execute(stmt, (1818,))              # (2)
cursor.execute(stmt, (1900,))              # (3)
cursor.execute(stmt, (2000,))              # (4)

cursor = cnx.cursor(prepared=False)
stmt = "SELECT * FROM city WHERE id = %s"  # (1)
cursor.execute(stmt, (1818,))              # (2)
cursor.execute(stmt, (1900,))              # (3)
cursor.execute(stmt, (2000,))              # (4)

Here is the resulting MySQL server general log. The server-side 1 Prepare + 3 Execute trigger is preserved by the prepared mysql-connector cursor, and we cannot support sqlcommenting. Meanwhile non-prepared cursors do support sqlcommenting to preserve as much expected functionality as possible:

2024-11-25T20:39:04.609978Z	   60 Prepare	SELECT * FROM city WHERE id = ?
2024-11-25T20:39:04.610178Z	   60 Reset stmt
2024-11-25T20:39:04.610480Z	   60 Execute	SELECT * FROM city WHERE id = 1818
2024-11-25T20:39:04.611404Z	   60 Reset stmt
2024-11-25T20:39:04.611749Z	   60 Execute	SELECT * FROM city WHERE id = 1900
2024-11-25T20:39:04.612438Z	   60 Reset stmt
2024-11-25T20:39:04.612731Z	   60 Execute	SELECT * FROM city WHERE id = 2000
2024-11-25T20:39:06.552106Z	   60 Query	SELECT * FROM city WHERE id = 1818 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-7e9698e9b4b0eae1489ac894a35d5cf9-af5bca03efb81666-01'*/
2024-11-25T20:39:06.553814Z	   60 Query	SELECT * FROM city WHERE id = 1900 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-7e9698e9b4b0eae1489ac894a35d5cf9-2c285a7e5545b2cb-01'*/
2024-11-25T20:39:06.555004Z	   60 Query	SELECT * FROM city WHERE id = 2000 /*db_driver='mysql.connector%%3A8.4.0',dbapi_level='2.0',dbapi_threadsafety=1,driver_paramstyle='pyformat',mysql_client_version='8.4.0',traceparent='00-7e9698e9b4b0eae1489ac894a35d5cf9-85516f0bcb231251-01'*/

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Unit tests

  • tox -e py312-test-instrumentation-dbapi
  • tox -e py312-test-instrumentation-mysql-0
  • tox -e py312-test-instrumentation-mysql-1

Manual testing

  • App instrumented with mysql-connector, cursor prepared=True
  • App instrumented with mysql-connector, cursor prepared=False
  • Apps instrumented with mysqlclient, PyMySQL, and sqlite3 to check that DB-API functionality still works
  • Apps instrumented with psycopg2, psycopg to check that DB-API inheritance still works elsewhere

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@tammy-baylis-swi tammy-baylis-swi changed the title Add mysql-connector instrumentor support for sqlcommenting WIP/PoC Add mysql-connector instrumentor support for sqlcommenting Nov 21, 2024
@tammy-baylis-swi tammy-baylis-swi changed the title WIP/PoC Add mysql-connector instrumentor support for sqlcommenting Add mysql-connector instrumentor support for sqlcommenting Nov 21, 2024
Comment on lines +215 to +218
db_api_integration_factory: The `DatabaseApiIntegration` to use. If none is passed the
default one is used.
get_cnx_proxy: Method to get traced connextion proxy. If none is passed the
default one is used.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding these optional kwargs to reduce code duplication. mysql instrumentor doesn't need to implement its own instrument_connection but can still customize TracedConnectionProxy.cursor to check user-specified cursor params. Open to other ideas too!

@tammy-baylis-swi tammy-baylis-swi marked this pull request as ready for review November 28, 2024 20:31
@tammy-baylis-swi tammy-baylis-swi requested a review from a team as a code owner November 28, 2024 20:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add sqlcommenting support to MySQL driver instrumentors
1 participant