Using SQLite in memory from Python

SQLite is a handy, lightweight database, but under heavy use its speed can become a concern.

SQLite also supports in-memory databases, so loading an existing SQLite database file into memory and using it read-only can speed up access.
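As a minimal illustration of what "in-memory" means here (a sketch, not part of the benchmark in this article): connecting to the special name `:memory:` creates a database that lives only in RAM and belongs to that one connection. The table name `demo` is made up for this example.

```python
import sqlite3

# ":memory:" asks SQLite for a private database held entirely in RAM.
mcon = sqlite3.connect(":memory:")
mcon.execute("CREATE TABLE demo (key INTEGER PRIMARY KEY, val INTEGER)")
mcon.execute("INSERT INTO demo VALUES (?, ?)", (1, 2))
mcon.commit()

row = mcon.execute("SELECT val FROM demo WHERE key = ?", (1,)).fetchone()
print(row[0])

# A second ":memory:" connection gets its own, empty database.
other = sqlite3.connect(":memory:")
tables = other.execute("SELECT name FROM sqlite_master").fetchall()
print(tables)

mcon.close()
other.close()
```

Because the data disappears when the connection is closed, this mode suits a read-only working copy of a database that still exists on disk.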

Preparing the database

First, prepare a reasonably large database.


#!/usr/bin/env python
# coding:utf-8

import sqlite3

con=sqlite3.connect("test.db")
con.cursor().execute("CREATE TABLE test( key integer, val integer , primary key(key))")
con.commit()

# Insert 10 million rows (key, key+1) in a single transaction
for key in range(1,10000000+1):
  con.cursor().execute("insert into test values(?,?)",(key,key+1,))
con.commit()

This creates a database with 10 million records.


$ sqlite3 test.db 
SQLite version 3.7.6.3
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select count(*) from test;
10000000
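For reference, the same table can also be filled with `executemany` and checked from Python instead of the `sqlite3` shell. This is only a sketch: the helper name `build_test_db`, the file name `demo.db`, and the much smaller row count are assumptions made so the example runs quickly.

```python
import os
import sqlite3
import tempfile

def build_test_db(path, n_rows):
    """Create the test table at `path` and fill it with keys 1..n_rows."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE test (key INTEGER, val INTEGER, PRIMARY KEY (key))")
    # executemany consumes the generator; all inserts share one transaction
    con.executemany(
        "INSERT INTO test VALUES (?, ?)",
        ((key, key + 1) for key in range(1, n_rows + 1)),
    )
    con.commit()
    count = con.execute("SELECT count(*) FROM test").fetchone()[0]
    con.close()
    return count

# Small demo in a temporary directory so the real test.db is left untouched.
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo.db")
    demo_count = build_test_db(demo_path, 1000)
    print(demo_count)
```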

The program

Using this as a reference, I wrote the following program.


#!/usr/bin/env python
# coding:utf-8

import sqlite3
from io import StringIO
import time
import random

# Load the file database into memory via an SQL dump
con=sqlite3.connect("test.db")
tempfile=StringIO()
for line in con.iterdump():
    tempfile.write("%s\n" % line)
tempfile.seek(0)
mcon=sqlite3.connect(":memory:")
mcon.cursor().executescript(tempfile.read())
mcon.commit()
mcon.row_factory=sqlite3.Row


# Current wall-clock time in milliseconds
current_milli_times = lambda: int(round(time.time() * 1000))

print("ready..")
# normal (file-backed connection)
# select 10000 times
N=10000
nstart=current_milli_times()
for i in range(N):
    # keys in the table run from 1 to 10000000 inclusive
    key=random.randrange(1,10000000+1)
    res=con.cursor().execute("select * from test where key=?",(key,))

nend=current_milli_times()
print("normal:"+str(nend-nstart))


# inmemory
mstart=current_milli_times()
for i in range(N):
    key=random.randrange(1,10000000+1)
    res=mcon.cursor().execute("select * from test where key=?",(key,))
mend=current_milli_times()
print("inmemory:"+str(mend-mstart))

The connection con is ordinary access to the database file, and mcon is the in-memory copy of that file. The program runs 10,000 random SELECTs against each.
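On Python 3.7 or later, `sqlite3.Connection.backup` offers another way to copy a file database into memory without going through an SQL text dump. The following is a sketch of that alternative, not the method used for the measurements in this article. The helper name `load_into_memory` and the small demo database are assumptions for illustration, and `PRAGMA query_only` assumes the bundled SQLite library is version 3.8.0 or newer.

```python
import os
import sqlite3
import tempfile

def load_into_memory(path):
    """Return an in-memory connection holding a copy of the database at `path`."""
    disk = sqlite3.connect(path)
    mem = sqlite3.connect(":memory:")
    disk.backup(mem)   # page-level copy of the whole database (Python 3.7+)
    disk.close()
    mem.row_factory = sqlite3.Row
    # query_only makes writes through this connection fail, i.e. read-only use
    mem.execute("PRAGMA query_only = ON")
    return mem

with tempfile.TemporaryDirectory() as tmpdir:
    # Build a tiny stand-in for test.db
    path = os.path.join(tmpdir, "demo.db")
    src = sqlite3.connect(path)
    src.execute("CREATE TABLE test (key INTEGER PRIMARY KEY, val INTEGER)")
    src.executemany("INSERT INTO test VALUES (?, ?)", [(1, 2), (2, 3), (3, 4)])
    src.commit()
    src.close()

    mcon = load_into_memory(path)
    val = mcon.execute("SELECT val FROM test WHERE key = ?", (2,)).fetchone()["val"]
    print(val)
    try:
        mcon.execute("INSERT INTO test VALUES (4, 5)")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True
    print(write_blocked)
    mcon.close()
```

Copying pages directly avoids building the whole dump as one string in Python, which may matter for a table with 10 million rows.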

Results

ready..
normal:414
inmemory:171

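I ran this on a university supercomputer, and the in-memory database was more than twice as fast (414 ms versus 171 ms for 10,000 SELECTs).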
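The benchmark loops above call `execute` without fetching the row. If you want the measurement to include reading each result, a helper along the following lines could be used. It is a sketch with a hypothetical name (`time_lookups`) and a tiny in-memory demo table, so its numbers are not comparable to the results above.

```python
import random
import sqlite3
import time

def time_lookups(con, n_queries, max_key, seed=0):
    """Run n_queries random primary-key lookups; return (elapsed_ms, rows_found)."""
    rng = random.Random(seed)          # fixed seed: same keys for every connection
    found = 0
    start = time.perf_counter()
    for _ in range(n_queries):
        key = rng.randint(1, max_key)  # keys are 1..max_key inclusive
        row = con.execute("SELECT * FROM test WHERE key = ?", (key,)).fetchone()
        if row is not None:
            found += 1
    elapsed_ms = (time.perf_counter() - start) * 1000
    return elapsed_ms, found

# Tiny demo table standing in for test.db
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE test (key INTEGER PRIMARY KEY, val INTEGER)")
demo.executemany("INSERT INTO test VALUES (?, ?)", ((k, k + 1) for k in range(1, 101)))
demo.commit()

elapsed_ms, found = time_lookups(demo, 1000, 100)
print("found %d rows in %.1f ms" % (found, elapsed_ms))
demo.close()
```

Using the same seed for both connections means the file-backed and in-memory runs look up an identical sequence of keys.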
