Exploring SQLite Internals: Virtual Database Engine

Published on Oct. 16, 2012 by Matt Cottingham.

Follow @mattrco

SQLite is an embedded SQL database engine used in a formidable list of high-profile projects and deployed in countless applications. In this first article exploring SQLite's internals, we look at SQLite's Virtual Database Engine (VDBE), which executes code generated from SQL commands to store and retrieve data.

The VDBE is described in the SQLite architecture documentation as:

a virtual computer that runs a program in its virtual machine language. The goal of each program is to interrogate or change the database. Toward this end, the machine language that the VDBE implements is specifically designed to search, read, and modify databases. http://www.sqlite.org/vdbe.html

To put it another way, the VDBE encapsulates a state machine which is manipulated by opcodes. Although these are specialized opcodes for storing and retrieving data, it's meaningful to think of the VDBE as a virtual machine, and VDBE opcodes as an assembly langauge.

To show the VDBE structure in detail, here is an extract from vdbeInt.h where the Vdbe struct is defined.

As we can see, a VDBE instance holds pointers to the program it is running (*aOp), memory locations (and the number allocated), an array of program results, as well as a pointer to the program counter and current instruction return value. This fits well with the analogy that a VDBE is a virtual machine instance intended to perform database operations.

The program which the VDBE runs is not composed of SQL commands. Instead, SQL commands are used to generate opcodes (the code generation box in our architecture diagram), and these opcodes are what the VDBE executes. For a list of opcodes, see the full opcode documentation. So, what does a generated VDBE program look like? Here's our simple example schema, a table with Int and String columns.

Now let's look at the VDBE opcodes generated to execute this statement.

As shown, after dealing with a Trace instruction (which is used for passing information back to the caller), the first instruction is to jump to line 10 of our program. Here, a transaction is opened, VerifyCookie is used to check the schema is what we expect, and a TableLock is obtained for the appropriate table. We then jump to instruction 2 of our program, where a write cursor is opened. A new record is constructed from our inputted data, and the record is inserted. After closing the write cursor, Halt is executed which ends the transaction and exits our program.

Having seen a simple insert statement in action, let's look at a more complex VDBE program &emd; adding an index. Explain what the code does here and what the btree.h/btree.c implementation shows.

In summary, we've had a detailed look at how SQLite's virtual database engine executes programs generated from SQL commands. For a look at even more opcode programs, take a look at the (outdated, but still useful) VDBE tutorial. Don't forget to check the up to date opcode reference afterwards, though.

Next, I hope to take a closer look at how SQL statements are tokenized, parsed, and assembled into the opcode instructions we saw above. I then intend to explore SQLite's btree implementation and how data is persisted to disk.