@@ -7,102 +7,110 @@ This library takes MySQL `CREATE TABLE` statements and returns a data structure
MySQL syntax [version 5.7](https://dev.mysql.com/doc/refman/5.7/en/create-table.html) is supported.
This library does not try to validate input - the goal is to deconstruct valid `CREATE TABLE` statements.

-
## Installation

You can install this package using composer. To add it to your `composer.json`:

- composer require iamcal/sql-parser
+ ```plain
+ composer require iamcal/sql-parser
+ ```

You can then load it using the composer autoloader:

- require_once 'vendor/autoload.php';
- use iamcal\SQLParser;
+ ```php
+ require_once 'vendor/autoload.php';
+ use iamcal\SQLParser;

- $parser = new SQLParser();
+ $parser = new SQLParser();
+ ```

If you don't use composer, you can skip the autoloader and include `src/SQLParser.php` directly.

-
## Usage

To extract the tables defined in SQL:

- $parser = new SQLParser();
- $parser->parse($sql);
+ ```php
+ $parser = new SQLParser();
+ $parser->parse($sql);

- print_r($parser->tables);
+ print_r($parser->tables);
+ ```

- The `tables` property is an array of tables, each of which is a nested array structure defining the
+ The `tables` property is an array of tables, each of which is a nested array structure defining the
table's structure:

- CREATE TABLE `achievements_counts` (
-   `achievement_id` int(10) unsigned NOT NULL,
-   `num_players` int(10) unsigned NOT NULL,
-   `date_updated` int(10) unsigned NOT NULL,
-   PRIMARY KEY (`achievement_id`)
- ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-
-
- [
-     'achievements_counts' => [
-         'name' => 'achievements_counts',
-         'fields' => [
-             [
-                 'name' => 'achievement_id',
-                 'type' => 'INT',
-                 'length' => '10',
-                 'unsigned' => true,
-                 'null' => false,
-             ],
-             [
-                 'name' => 'num_players',
-                 'type' => 'INT',
-                 'length' => '10',
-                 'unsigned' => true,
-                 'null' => false,
-             ],
-             [
-                 'name' => 'date_updated',
-                 'type' => 'INT',
-                 'length' => '10',
-                 'unsigned' => true,
-                 'null' => false,
-             ],
-         ],
-         'indexes' => [
-             [
-                 'type' => 'PRIMARY',
-                 'cols' => [
-                     [
-                         'name' => 'achievement_id',
-                     ],
-                 ],
-             ],
-         ],
-         'props' => [
-             'ENGINE' => 'InnoDB',
-             'CHARSET' => 'utf8',
-         ],
-     ],
- ]
+ ```SQL
+ CREATE TABLE `achievements_counts` (
+   `achievement_id` int(10) unsigned NOT NULL,
+   `num_players` int(10) unsigned NOT NULL,
+   `date_updated` int(10) unsigned NOT NULL,
+   PRIMARY KEY (`achievement_id`)
+ ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
+ ```
+
+ ```php
+ [
+     'achievements_counts' => [
+         'name' => 'achievements_counts',
+         'fields' => [
+             [
+                 'name' => 'achievement_id',
+                 'type' => 'INT',
+                 'length' => '10',
+                 'unsigned' => true,
+                 'null' => false,
+             ],
+             [
+                 'name' => 'num_players',
+                 'type' => 'INT',
+                 'length' => '10',
+                 'unsigned' => true,
+                 'null' => false,
+             ],
+             [
+                 'name' => 'date_updated',
+                 'type' => 'INT',
+                 'length' => '10',
+                 'unsigned' => true,
+                 'null' => false,
+             ],
+         ],
+         'indexes' => [
+             [
+                 'type' => 'PRIMARY',
+                 'cols' => [
+                     [
+                         'name' => 'achievement_id',
+                     ],
+                 ],
+             ],
+         ],
+         'props' => [
+             'ENGINE' => 'InnoDB',
+             'CHARSET' => 'utf8',
+         ],
+     ],
+ ]
+ ```
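Once parsed, the nested array can be walked with ordinary PHP array functions. The sketch below is illustrative only: `summarize_fields()` and `primary_key_columns()` are hypothetical helpers, not part of the library, and `$table` is a hand-written stand-in for `$parser->tables['achievements_counts']` that assumes the shape shown above.

```php
<?php
// Hypothetical helper: build a one-line summary for each field of a table.
// Assumes $table follows the array shape documented above.
function summarize_fields(array $table): array
{
    $lines = [];
    foreach ($table['fields'] as $field) {
        $line = $field['name'] . ' ' . $field['type'];
        if (isset($field['length'])) {
            $line .= '(' . $field['length'] . ')';
        }
        if (!empty($field['unsigned'])) {
            $line .= ' UNSIGNED';
        }
        // 'null' => false means the column was declared NOT NULL.
        if (isset($field['null']) && $field['null'] === false) {
            $line .= ' NOT NULL';
        }
        $lines[] = $line;
    }
    return $lines;
}

// Hypothetical helper: names of the columns in the primary key, if any.
function primary_key_columns(array $table): array
{
    foreach ($table['indexes'] ?? [] as $index) {
        if ($index['type'] === 'PRIMARY') {
            return array_column($index['cols'], 'name');
        }
    }
    return [];
}

// Hand-written stand-in for $parser->tables['achievements_counts'].
$table = [
    'name' => 'achievements_counts',
    'fields' => [
        ['name' => 'achievement_id', 'type' => 'INT', 'length' => '10', 'unsigned' => true, 'null' => false],
        ['name' => 'num_players', 'type' => 'INT', 'length' => '10', 'unsigned' => true, 'null' => false],
    ],
    'indexes' => [
        ['type' => 'PRIMARY', 'cols' => [['name' => 'achievement_id']]],
    ],
];

print_r(summarize_fields($table));
print_r(primary_key_columns($table));
```

With real input, you would pass an entry of `$parser->tables` (keyed by table name) instead of the hand-written array.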

You can also use the lexer directly to work with other pieces of SQL:

- $parser = new SQLParser();
- $parser->lex($sql);
+ ```php
+ $parser = new SQLParser();
+ $parser->lex($sql);

- print_r($parser->tokens);
+ print_r($parser->tokens);
+ ```

- The `tokens` property contains an array of tokens. SQL keywords are returned as uppercase,
+ The `tokens` property contains an array of tokens. SQL keywords are returned as uppercase,
with multi-word terms (e.g. `DEFAULT CHARACTER SET`) as a single token. Strings and escaped
identifiers are not further processed; they are returned exactly as expressed in the input SQL.

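Because escaped identifiers come back exactly as written, callers that want the bare name have to strip the backticks themselves. A minimal sketch, assuming tokens are plain strings; `unquote_identifier()` is a hypothetical helper, not a library function, and the token list is written by hand rather than produced by `lex()`:

```php
<?php
// Hypothetical helper: strip the surrounding backticks from an escaped
// identifier token and collapse doubled backticks, which is how MySQL
// escapes a backtick inside a quoted identifier. Any other token is
// returned unchanged.
function unquote_identifier(string $token): string
{
    $quoted = strlen($token) >= 2
        && $token[0] === '`'
        && substr($token, -1) === '`';
    if (!$quoted) {
        return $token;
    }
    return str_replace('``', '`', substr($token, 1, -1));
}

// Hand-written example tokens (an assumption, not real lexer output).
$tokens = ['`achievements_counts`', '`odd``name`', 'DEFAULT CHARACTER SET'];
print_r(array_map('unquote_identifier', $tokens));
```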
- By default, the tokenizer will ignore unterminated comments and strings, and stop parsing at
+ By default, the tokenizer will ignore unterminated comments and strings, and stop parsing at
that point, producing no further tokens. You can set `$parser->throw_on_bad_syntax = true;` to
throw an exception of type `iamcal\SQLParserSyntaxException` instead.

-
## Performance

My test target is an 88K SQL file containing 114 tables from Glitch's main database.
@@ -113,18 +121,16 @@ seconds just to lex the input. This was obviously not a great option.
The current implementation uses a hand-written lexer which takes around 140ms to lex the same
input and imposes fewer odd restrictions. This seems to be the way to go.

-
## History

This library was created to parse multiple `CREATE TABLE` schemas and compare them, to
figure out what needs to be done to migrate one to the other.

This is based on the system used at b3ta, Flickr and then Tiny Speck to check the differences
- between production and development databases and between shard instances. The original system
+ between production and development databases and between shard instances. The original system
just showed a diff (see [SchemaDiff](https://github.com/iamcal/SchemaDiff)), but that was a bit
of a pain.

-
## Unsupported features

MySQL table definitions have a *lot* of options, so some things just aren't supported. They include:
@@ -141,20 +147,19 @@ MySQL table definitions have a *lot* of options, so some things just aren't supp
If you need support for one of these features, open an issue or (better) send a pull request with tests.

The specs for each of the four field groupings can be found here:
+
* https://dev.mysql.com/doc/refman/5.7/en/numeric-type-syntax.html
* https://dev.mysql.com/doc/refman/5.7/en/date-and-time-type-syntax.html
* https://dev.mysql.com/doc/refman/5.7/en/string-type-syntax.html
* https://dev.mysql.com/doc/refman/5.7/en/spatial-type-overview.html

-
## Alternatives

If you're using PHP, then [Modyllic](https://github.com/onlinebuddies/modyllic) is a great SQL parser and set of schema management tools.

If you're using Hack, then [Hack SQL Fake](https://github.com/slackhq/hack-sql-fake) allows you to parse SQL and create a fake MySQL
server for testing, with many (but not all!) features of MySQL.

-
## Publishing

To publish a new version: