This worklog has been replaced with mariadb.org/jira

This site is here for historical purposes only. Do not add or edit tasks here!


Title:               Enhance RQG to support comparative benchmarks
Task ID:             178
Queue:
Version:
Status:              Done
Priority:
Copies to:           Sergei, Pstoev
Created by:          Mdcallag, 13 Mar 2011
Supervisor:
Lead Architect:
Architecture Review:
Implementor:
Code Review:
QA:
Documentation:
 High-Level Description
Enhance RQG so that others can use it on production databases and queries to 
confirm there are no serious regressions when upgrading to MariaDB.

By regression I mean there is either a large change in execution time or in the 
values from some SHOW SESSION STATUS counters (rows processed, handler_read* 
values). The SHOW STATUS counters to check should be configurable. The amount a 
value must change to be considered a regression should also be configurable.

It should also check for correct results.

When there is a change, the tool should log the query text, query plans, index 
statistics and, when the results differ, the query results.

The way I plan to use such a tool is:
1) set up two mysqld instances, one with the old version and one with the new version
2) let them replicate from a master and then stop at the same point in replication
3) sample some queries from production
4) run RQG on the queries from step #3
5) go to step 2
 Task Dependencies
Others waiting for Task 178: (none listed)
Task 178 is waiting for: (none listed)
 
 High-Level Specification
Effort estimates:

* Improvements to the core RQG - 16 hours
* Performance gathering infrastructure - 32 hours
* Statistics aggregation - 20 hours 

Total without query plan preservation or obfuscation: 72 hours

* Query Plan Preservation - 36 hours
* Obfuscation - 28 hours
 Low-Level Design
==Improvements to the core RQG==

The following improvements to the core RQG will be required to properly 
accommodate performance comparison benchmarks. This will also serve to remove any 
bias towards using queries that are randomly generated based on a grammar.

* Allow per-module properties to be specified on the command line or in a unified 
config file and then read by individual modules. This will remove the need to 
implement a separate config file just for this particular feature;

* Create a new Generator that will read queries based on the client's 
specification: either individual flat files with one query per file, placed in a 
directory, or multiple queries in a single file with a proper separator to allow 
for newlines within the query text;

* Make the existing random Generator optional, along with the --gendata, 
--gentest, etc. RQG command-line options;

* An alternative compare() function that will use hashes or otherwise avoid 
loading the entire dataset in memory in order to compare it;

* Implement a checker that will prevent the tool from running if given more than 
SELECT permissions.

* A general interface for walking query parse trees in order to perform 
transformations such as extracting literals and replacing them with "?" 
placeholders.
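The literal-replacement transformation can be sketched as follows. This is an 
illustrative Python sketch using regular expressions; the actual implementation 
would walk the parse tree through the interface described above, and the 
function name is an assumption:

```python
import re

def normalize_query(query):
    """Replace string and numeric literals with '?' placeholders.

    A real implementation would walk the parse tree; this regex sketch
    only illustrates the intended transformation."""
    # Replace quoted string literals first, so digits inside them are
    # not touched by the numeric rule below.
    query = re.sub(r"'(?:[^'\\]|\\.)*'", "?", query)
    query = re.sub(r'"(?:[^"\\]|\\.)*"', "?", query)
    # Replace standalone integer and decimal literals; \b keeps digits
    # inside identifiers such as t1 intact.
    query = re.sub(r"\b\d+(?:\.\d+)?\b", "?", query)
    return query
```

Queries that differ only in their literals then normalize to the same text, 
which is also what the binning scheme below relies on.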

==Performance Gathering Infrastructure==

The RQG will need to run each query twice against two different servers and 
package the obtained time measurements and counters for further processing.

* A "Performance" mode for executing queries that will:
** Run FLUSH STATUS before executing the query;
** Run SHOW STATUS after execution;
** Handle InnoDB status variables separately, since they are not FLUSH-able;
** Use mysql_send_result to measure the time it takes to receive the first row 
of the dataset and the time to receive the entire dataset;
** Handle additional variables available from the Facebook, Google, Percona, 
InnoDB Plugin or MariaDB flavours;
** Bundle the information thus obtained for easy access and reporting.

* New flags for the Executor that direct it to:
** run the query in a performance mode and collect performance data;
** run the same query twice, for "cold" and "warm" execution;
** set a timeout and enforce it, either using alarm() or with the event 
scheduler

* Performance reporting
** Report the performance data on queries that are above a pre-defined 
threshold immediately after they are run;
** Preserve performance data on all other queries using binning (described 
below)

==Query Plan Preservation==

It is assumed that it will not be possible to export any customer data for the 
developers to reproduce any problematic queries. Therefore, the focus is not on 
gathering the original dataset and obfuscating it but instead on gathering as 
much information about the query as possible in order to debug the problem 
without having the original data.

The interaction between the Optimizer and the storage engine happens through the 
info() and records_in_range() functions of the storage engine API. The 
information supplied by info() is readily available through SHOW INDEXES. The 
information from records_in_range() is indirectly accessible by running EXPLAIN 
and SELECT COUNT(*) on each part of the WHERE clause separately and recording 
the results.

All data gathered, even though it does not include the original dataset, is 
subject to obfuscation of all participating database, table, field and index 
names, as well as any literals that are used in the query as described below.

Query plan preservation will occur under the following circumstances:
* While the performance comparison test is running, on all queries that are 
above a pre-defined performance threshold;
* When a better specimen of a particular optimization is found, in order to 
replace the information recorded for a previous query. For example, if a 10-fold 
performance regression is observed for a particular optimization X, plan 
information for the query will be preserved and will replace the information 
from the previously worst query for optimization X, where the regression was 
only 10%.
* A stand-alone tool will perform query plan preservation for individual queries 
specified in a file or on the command line. Server caches and the selectivity 
estimates may have changed in the meantime, so the emphasis is on collecting all 
query plan data at the same time as the performance comparisons are being made.

Gathering activities are of two types:
* Non-intrusive, such as EXPLAIN and SHOW INDEXES that can be executed without 
side-effects to the server;
* Intrusive, such as SELECT COUNT(*), that may take a long time or cause caches 
to be thrashed. Such gathering will only be executed if explicitly requested.

The following modules are needed:

* A procedure for processing EXPLAINs that will:
** Convert EXPLAIN output to a generic object that can then be processed;
** Execute EXPLAIN EXTENDED and then SHOW WARNINGS and preserve the output;
** Identify participating tables, indexes and views, for further use such as 
extra statistics gathering.
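A minimal sketch of such a generic EXPLAIN object, assuming the EXPLAIN rows 
have already been fetched into dictionaries keyed by column name (the function 
name and return shape are illustrative assumptions):

```python
def summarize_explain(rows):
    """Condense EXPLAIN output (a list of per-table row dicts) into the
    pieces the tool needs: the participating tables, and the
    optimizations named in the 'Extra' column."""
    tables = [r["table"] for r in rows if r.get("table")]
    extras = set()
    for r in rows:
        # 'Extra' lists optimizations separated by semicolons,
        # e.g. "Using where; Using index".
        for item in (r.get("Extra") or "").split(";"):
            item = item.strip()
            if item:
                extras.add(item)
    return {"tables": tables, "extra": extras}
```

The table list feeds the metadata and index-statistics gathering below, and the 
"Extra" items feed the per-optimization binning.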

* A procedure for processing server metadata that will (non-intrusive):
** Preserve SHOW CREATE TABLE information on participating tables, including 
views;
** Preserve SHOW INDEXES and other information relevant to index usage and plan 
optimization;
** Preserve SHOW VARIABLES, in particular variables important to query 
optimization and execution;

* A procedure for preserving all selectivity estimates that may influence the 
Optimizer (intrusive):
** Deconstruct the join conditions and the WHERE clause of each query in order 
to extract index conditions;
** Convert index conditions into SELECTs so that their selectivity can be 
estimated;
** Execute the SELECTs via EXPLAIN or SELECT COUNT(*) in order to obtain 
estimates and actual row counts, respectively. The SELECT COUNT(*) method will 
only run if explicitly requested (otherwise it would thrash the InnoDB caches) 
and will be subject to a timeout;
** Package the information thus obtained for proper processing, transmission and 
display;
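The conversion of extracted index conditions into selectivity probes can be 
sketched as follows, following the EXPLAIN versus SELECT COUNT(*) split above; 
the function and parameter names are illustrative assumptions:

```python
def selectivity_queries(table, conditions, intrusive=False):
    """For each extracted index condition, build the statements used to
    estimate its selectivity: EXPLAIN for the optimizer's row estimate,
    and, only when explicitly requested, SELECT COUNT(*) for the actual
    row count (the intrusive variant)."""
    stmts = []
    for cond in conditions:
        stmts.append("EXPLAIN SELECT * FROM %s WHERE %s" % (table, cond))
        if intrusive:
            stmts.append("SELECT COUNT(*) FROM %s WHERE %s" % (table, cond))
    return stmts
```

Comparing the EXPLAIN estimate against the actual count for each condition 
shows where records_in_range() misled the Optimizer.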

* A procedure for obtaining actual index statistics (intrusive):
** A SELECT with COUNT(*), COUNT(DISTINCT ), MIN() and MAX() will be run 
against the entirety of each participating table;
** An appropriate SELECT will be used to obtain the selectivity for prefixes of 
each participating index;

==Statistics Aggregation==

Even though each query will be executed and processed individually, a more 
general mechanism for aggregating statistics would be required in order to spot 
trends that may not be obvious from any single query.

For this purpose, the statistics for each executed query will be placed in one 
or more bins. Each bin will maintain a running set of statistical attributes 
that will be recalculated without the need to revisit any previous queries.

* Binning algorithm: Each query will be recorded in one or more of the following 
bins:
** Each query will be in its own bin. Repeated executions of identical queries 
will either be skipped or aggregated in the same bin
** The query after removal of literals. This way almost identical queries will 
be aggregated together;
** The items of the "Extra" field of the EXPLAIN will be used to aggregate 
queries based on the optimization that they triggered, such as "Using index 
condition";
** Queries that return no rows or result in an error will be counted in separate 
bins; The number of queries in this bin will be reported at the end of the test 
in order to make sure a sufficient number of queries did return rows;
** All queries will additionally be aggregated in one single large bin in order 
to obtain an overall percentage of speedup/slowdown between the servers that are 
being compared.
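The binning rules above can be sketched as a key-derivation function; the 
literal-stripping regex and the key prefixes are illustrative assumptions:

```python
import re

def bin_keys(query, extra_items, row_count, error=False):
    """Return the keys of every bin a query falls into, per the scheme
    above: its own text, its literal-stripped form, one bin per 'Extra'
    optimization, the empty-result/error bins, and a global bin."""
    # Strip string and integer literals so near-identical queries share a bin.
    normalized = re.sub(r"'[^']*'|\b\d+\b", "?", query)
    keys = ["query:" + query, "normalized:" + normalized, "all"]
    keys += ["extra:" + e for e in extra_items]
    if error:
        keys.append("errors")
    elif row_count == 0:
        keys.append("empty")
    return keys
```

Each returned key then indexes a bin holding the running statistics described 
under binning storage.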

* Binning storage: The following statistics will be recorded for each bin:
** The count of queries that have been placed in the bin;
** Average, min, max, standard deviation for the increase or decrease in running 
time;
** Average, min, max, standard deviation for each of the counter 
increase/decrease that is being measured;
** Full query plan information and original query text for the best/worst 
queries in each bin for each of the counters being measured;
** The number of individual queries and bins that can be stored in 1GB of Perl 
memory will be optimized, calculated and documented; it is important that the 
tool does not cause excessive memory usage or thrashing;

* Statistics reports
** Each bin should be viewable in human-readable HTML
** A filtering capability via short phrases of Perl code will be provided in 
order to filter out and display only particular bins (similar to the approach 
taken by the mk-query-digest from Maatkit)

==Obfuscation==

It is assumed that obfuscation must be applied to all non-integer data to be 
taken out of the system, that is, data that contains names of database objects 
and literal strings. Result sets will not be taken out of the system at all;

Information to be obfuscated:
* SHOW CREATE output
* SHOW INDEXES (column and index names)
* query text
* comments will be stripped from each query before further processing

Information that will not be obfuscated:
* SHOW STATUS and SHOW VARIABLES
* SHOW INDEXES and EXPLAIN (index selectivity estimates and other integers)

Obfuscation dictionary:
* A dictionary will maintain the mapping between each obfuscatable identifier 
and its obfuscated representation, either an MD5 hash or a monotonically 
increasing identifier such as "table1", "table2", "table3", etc.;
* The dictionary will be populated using all participating database objects in 
the query and all literals from the query itself;
* The dictionary will be used to obfuscate all mentions of database objects in 
any context, as well as string literals, in all future queries.
 File Attachments
 Name | Type | Size | By | Date
 User Comments
(Mdcallag - 2011-04-21 14:52:51
    Time to first row and time to all rows are both interesting for query plans that can have blocking operations (sort).
 Time Estimates
Name | Hours Worked | Last Updated
Total | 0 |

Hrs Worked | Progress | Current | Original
Total | 0 | 0 | 0
 
 Funding and Votes
Votes: 0: 0%

Funding: 0 offers, total 0 Euro
 Progress Reports
(Pstoev - Thu, 21 Apr 2011, 12:25
    
Low Level Design modified.
--- /tmp/wklog.178.old.10361	2011-04-21 12:25:30.000000000 +0000
+++ /tmp/wklog.178.new.10361	2011-04-21 12:25:30.000000000 +0000
@@ -126,10 +126,10 @@
 ** An appropriate SELECT will be used to obtain the selectivity for prefixes of 
 each participating index;
 
-==Statistics Gathering==
+==Statistics Aggregation==
 
 Even though each query will be executed and processed individually, a more 
-general mechanism for gathering statistics would be required in order to spot 
+general mechanism for aggregating statistics would be required in order to spot 
 trends that may not be obvious from running a simple query.
 
 For the purpose, the statistics for each executed query will be placed in one or 

(Pstoev - Thu, 21 Apr 2011, 12:24
    
High-Level Specification modified.
--- /tmp/wklog.178.old.10252	2011-04-21 12:24:58.000000000 +0000
+++ /tmp/wklog.178.new.10252	2011-04-21 12:24:58.000000000 +0000
@@ -2,7 +2,7 @@
 
 * Improvements to the core RQG - 16 hours
 * Performance gathering infrastructure - 32 hours
-* Statistics gathering - 20 hours 
+* Statistics aggregation - 20 hours 
 
 Total without query plan preservations or obfuscation: 72 hours
 

(Pstoev - Thu, 21 Apr 2011, 12:23
    
High-Level Specification modified.
--- /tmp/wklog.178.old.10136	2011-04-21 12:23:07.000000000 +0000
+++ /tmp/wklog.178.new.10136	2011-04-21 12:23:07.000000000 +0000
@@ -1,2 +1,11 @@
+Effort estimates:
 
+* Improvements to the core RQG - 16 hours
+* Performance gathering infrastructure - 32 hours
+* Statistics gathering - 20 hours 
+
+Total without query plan preservations or obfuscation: 72 hours
+
+* Query Plan Preservation - 36 hours
+* Obfuscation - 28 hours
 

(Pstoev - Thu, 21 Apr 2011, 12:21
    
Version updated.
--- /tmp/wklog.178.old.10080	2011-04-21 12:21:45.000000000 +0000
+++ /tmp/wklog.178.new.10080	2011-04-21 12:21:45.000000000 +0000
@@ -1,2 +1,2 @@
-9.x
+Server-9.x
 

(Pstoev - Thu, 21 Apr 2011, 12:21
    
Supervisor updated:  -> Igor
Implementor updated: Psergey -> Pstoev

(Pstoev - Thu, 21 Apr 2011, 12:21
    
Version updated.
No change.

(Pstoev - Thu, 21 Apr 2011, 12:20
    
Low Level Design modified.

(Pstoev - Tue, 05 Apr 2011, 06:24
    
Title modified.
--- /tmp/wklog.178.old.1248	2011-04-05 06:24:36.000000000 +0000
+++ /tmp/wklog.178.new.1248	2011-04-05 06:24:36.000000000 +0000
@@ -1,2 +1,2 @@
-Enhance RQG
+Enhance RQG to support comparative benchmarks
 

(Pstoev - Tue, 05 Apr 2011, 06:24
    
Category updated.
--- /tmp/wklog.178.old.1248	2011-04-05 06:24:36.000000000 +0000
+++ /tmp/wklog.178.new.1248	2011-04-05 06:24:36.000000000 +0000
@@ -1,2 +1,2 @@
-Client-RawIdeaBin
+Client-Sprint
 
-- View All Progress Notes (15 total) --


WorkLog v4.0.0
  © 2010  Sergei Golubchik and Monty Program AB
  © 2004  Andrew Sweger <yDNA@perlocity.org> and Addnorya
  © 2003  Matt Wagner <matt@mysql.com> and MySQL AB