Data Model
Adding new Tables
Interested in adding a new table to our schema? Check out this reference PR: https://github.com/internetarchive/openlibrary/pull/7928/files
Querying for Data
The bookshelves core model demonstrates how to use a database connection on the backend to query for data.
from openlibrary.core import db
oldb = db.get_db() # i.e. web.database(**web.config.db_parameters)
query = "SELECT count(*) from bookshelves_books"
oldb.query(query)
Fetching Things Individually or in Bulk
From within routers/controllers, it's much more common to use the web.ctx.site object to fetch individual or multiple records.
doc = web.ctx.site.get("/works/OL5285479W")
keys = ["/works/OL5285479W", "/works/OL257943W", "/works/OL27448W"]
docs = web.ctx.site.get_many(keys)
Understanding Infogami, Infobase, and Web.py
Open Library is built using a wiki engine called infogami, which sits on top of the web.py Python micro-web framework (comparable to Flask). Web.py uses a variable called web.ctx to maintain the application context during an HTTP request. Web.py also maintains a PostgreSQL database connection using web.db. Infogami extends web.db by offering a system called infobase, which behaves like an ORM (database wrapper) to define arbitrary data types such as works, editions, and authors.
At its core, Infobase relies on two tables, things and data:
- things assigns every object in the system an ID, a type, and a reference to its data in the data table.
- data is just a massive catalog of JSON data that can be accessed by querying and joining with the things table.
Infogami injects a utility called site into web.py's web.ctx variable (see web.py ctx documentation), which maintains information and connections specific to the current client. The web.ctx.site utility handles queries and joins, allowing you to request any key from the things table, fetch its corresponding data, and leverage the models and functions defined for that thing's type.
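The kind of lookup that web.ctx.site performs on your behalf can be sketched roughly as a join between the two tables. This is only an illustration under stated assumptions, not Infobase's actual implementation: it uses an in-memory SQLite database as a stand-in for PostgreSQL, names the tables thing and data after the sample tables later on this page, keeps only the columns the lookup needs, and uses a hypothetical key (/works/OL1W) and a hypothetical helper (get_latest).

```python
import json
import sqlite3

# SQLite stands in for PostgreSQL here; columns follow the sample
# thing and data tables on this page, trimmed to what the lookup needs.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE thing (
        id INTEGER PRIMARY KEY,
        key TEXT UNIQUE,
        type INTEGER,
        latest_revision INTEGER
    );
    CREATE TABLE data (
        thing_id INTEGER REFERENCES thing(id),
        revision INTEGER,
        data TEXT
    );
    """
)

# Hypothetical rows: two types, plus one work that has two revisions.
conn.executemany(
    "INSERT INTO thing VALUES (?, ?, ?, ?)",
    [
        (1, "/type/type", 1, 1),
        (2, "/type/work", 1, 1),
        (3, "/works/OL1W", 2, 2),
    ],
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        (3, 1, json.dumps({"key": "/works/OL1W", "title": "Draft"})),
        (3, 2, json.dumps({"key": "/works/OL1W", "title": "Final"})),
    ],
)


def get_latest(conn, key):
    """Return the latest revision's JSON for `key` as a dict, or None."""
    row = conn.execute(
        """
        SELECT d.data
        FROM thing AS t
        JOIN data AS d
          ON d.thing_id = t.id AND d.revision = t.latest_revision
        WHERE t.key = ?
        """,
        (key,),
    ).fetchone()
    return json.loads(row[0]) if row else None


doc = get_latest(conn, "/works/OL1W")
print(doc["title"])
```

In application code you would normally call web.ctx.site.get(key) instead of writing this join yourself.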
Infogami Database Schema
Every Infogami page on Open Library (anything with a URL) has an associated type. Each type contains a schema that defines which fields can be used and their formats. These schemas generate view and edit templates, which can be further customized as needed. Infogami provides a generic way to create new types through its wiki interface.
Aside from the tables listed here, Open Library essentially has only two database tables. By default, they provide basic functionality through Infogami.
Thing table
The thing table defines types such as editions, works, authors, users, and languages. It also tracks instances of things by their identifiers, registering their IDs in the table.
Entries in a sample thing table
| id | key | type | latest_revision | created | last_modified |
|---|---|---|---|---|---|
| 2 | /type/key | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
| 3 | /type/string | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
| 4 | /type/text | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
| 5 | /type/int | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
Data table
The data table maps each thing, identified by thing_id and revision, to its associated JSON data.
Entry in a sample data table
| thing_id | revision | data |
|---|---|---|
| 1 | 1 | {"created": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "last_modified": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "latest_revision": 1, "key": "/type/type", "type": {"key": "/type/type"}, "id": 1, "revision": 1} |
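To make the shape of the data column concrete, here is a small sketch that parses a copy of the sample row's JSON with Python's standard library. The variable names are illustrative only, and it assumes nothing beyond what the sample row shows: references to other things appear as nested objects with a key, and datetimes carry a type tag next to their value.

```python
import json
from datetime import datetime

# A copy of the data column from the sample row, stored as JSON text.
raw = (
    '{"created": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, '
    '"last_modified": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, '
    '"latest_revision": 1, "key": "/type/type", "type": {"key": "/type/type"}, '
    '"id": 1, "revision": 1}'
)

record = json.loads(raw)

# A reference to another thing is a nested object holding that thing's key.
type_key = record["type"]["key"]

# A typed value keeps its type tag alongside the value itself.
created = datetime.fromisoformat(record["created"]["value"])

print(record["key"], type_key, created.date())
```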
Read more about Infogami and types at: https://openlibrary.org/dev/docs/infogami
Open Library Feature Tables
Open Library has a number of additional tables that are used to support a variety of features. The DDL for these tables can be found here.
bookshelves and bookshelves_books
These tables store the books on patrons' "Want to Read", "Currently Reading", and "Already Read" reading log shelves. The bookshelves_books table contains most of this data, with bookshelves serving as a lookup table for shelf names.
bookshelves.py provides functions which interact with the reading log tables.
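The lookup-table relationship can be sketched as follows. Treat this as an illustration only: the column names (id, name, username, work_id, bookshelf_id), the sample patrons, and the shelf_counts helper are assumptions made for the example, and SQLite stands in for PostgreSQL. Check the DDL and bookshelves.py for the real definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed, simplified columns; the real DDL may differ.
    CREATE TABLE bookshelves (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bookshelves_books (
        username TEXT,
        work_id INTEGER,
        bookshelf_id INTEGER REFERENCES bookshelves(id)
    );
    """
)
conn.executemany(
    "INSERT INTO bookshelves VALUES (?, ?)",
    [(1, "Want to Read"), (2, "Currently Reading"), (3, "Already Read")],
)
# Hypothetical reading log entries for two made-up patrons.
conn.executemany(
    "INSERT INTO bookshelves_books VALUES (?, ?, ?)",
    [("alice", 101, 1), ("alice", 102, 3), ("alice", 103, 3), ("bob", 101, 2)],
)


def shelf_counts(conn, username):
    """Count a patron's books per shelf, including shelves with no books."""
    rows = conn.execute(
        """
        SELECT s.name, COUNT(b.work_id)
        FROM bookshelves AS s
        LEFT JOIN bookshelves_books AS b
          ON b.bookshelf_id = s.id AND b.username = ?
        GROUP BY s.id, s.name
        ORDER BY s.id
        """,
        (username,),
    ).fetchall()
    return dict(rows)


counts = shelf_counts(conn, "alice")
print(counts)
```

Filtering on the patron inside the LEFT JOIN condition, rather than in a WHERE clause, keeps shelves the patron has not used in the result with a count of zero.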
yearly_reading_goals
This table stores the target number of books a patron commits to reading in a given year. Functions that interact with the yearly_reading_goals table are in yearly_reading_goals.py.
bookshelves_events
A patron can track the last date they finished any book on their "Already Read" shelf. The bookshelves_events table stores these dates and may later be used to track additional dates (such as when they started reading, or start and finish dates of re-reads).
Related code can be found in bookshelves_events.py.
observations
Patrons can provide structured reviews by attaching pre-defined tags to a work. These are stored in the observations table.
The code that interacts with this table, along with the tag definitions, is found in observations.py.
booknotes
A patron can add private notes to any work. The booknotes table stores these notes. booknotes.py contains the code that interacts with this table.
ratings
Patrons can submit star ratings for works. The ratings table stores these ratings. See ratings.py for related code.
community_edits_queue
This table holds librarian requests, which populate the librarian request table at https://openlibrary.org/merges. Code that interacts directly with this table is in edits.py.