Backend General
Code Organization
- openlibrary/core: core Open Library functionality, imported and used by www
- openlibrary/plugins: additional models, controllers, and view helpers
- openlibrary/views: views for rendering web pages
- openlibrary/templates: all the templates used in the website
- openlibrary/macros: reusable template fragments, callable from wikitext
Architecture
Open Library is built on the Infogami wiki system, which is itself built on the web.py Python web framework and the Infobase database framework.
Memcache
Infobase queries are cached in memcached. In the dev instance, a single-node memcached instance is available for testing:
$ docker compose run --rm home python
Python 3.10.5 (main, Jun 23 2022, 17:14:57)
[Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import yaml
>>> from openlibrary.utils import olmemcache
>>> with open('/openlibrary/conf/openlibrary-docker.yml') as in_file:
...     y = yaml.safe_load(in_file)
...
>>> mc = olmemcache.Client(y['memcache_servers'])
To get a memcached entry:
>>> mc.get('/authors/OL18319A')
'{"bio": {"type": "/type/text", "value": "Mark Twain, was an American author and humorist. Twain is noted for his novels Adventures of Huckleberry Finn (1884), which has been called \\"the Great American Novel\\", and The Adventures of Tom Sawyer (1876). He is extensively quoted. Twain was a friend to presidents, artists, industrialists, and European royalty. ([Source][1].)\\r\\n\\r\\n[1]:http://en.wikipedia.org/wiki/Mark_Twain"}, "photograph": "/static/files//697/OL2622189A_photograph_1212404607766697.jpg", "name": "Mark Twain", "marc": ["1 \\u001faTwain, Mark,\\u001fd1835-1910.\\u001e"], "alternate_names": ["Mark TWAIN", "M. Twain", "TWAIN", "Twain", "Twain, Mark (pseud)", "Twain, Mark (Spirit)", "Twain, Mark, 1835-1910", "Mark (Samuel L. Clemens) Twain", "Samuel Langhorne Clemens (Mark Twain)", "Samuel Langhorne Clemens", "mark twain "], "death_date": "21 April 1910", "wikipedia": "http://en.wikipedia.org/wiki/Mark_Twain", "created": {"type": "/type/datetime", "value": "2013-03-28T07:50:47.897206"}, "last_modified": {"type": "/type/datetime", "value": "2013-03-28T07:50:47.897206"}, "latest_revision": 1, "key": "/authors/OL18319A", "birth_date": "30 November 1835", "title": "(pseud)", "personal_name": "Mark Twain", "type": {"key": "/type/author"}, "revision": 1}'
To delete a memcached entry:
>>> mc.delete('/authors/OL18319A')
You can also look up memcached items by their Internet Archive ID (import memcache instead of olmemcache):
>>> import yaml
>>> import memcache
>>> with open('openlibrary.yml') as in_file:
...     y = yaml.safe_load(in_file)
...
>>> mc = memcache.Client(y['memcache_servers'])
>>> mc.get('ia.get_metadata-"houseofscorpion00farmrich"')
Logs
To view logs from Docker containers:
# Follow all service logs
docker compose logs -f
# Follow logs for a specific service
docker compose logs -f web
Database
You should not normally work with the database directly: all data is managed by Open Library through Infobase. If you do need direct access, use the commands below.
Open Library's data is stored in a triplestore database running on Postgres.
To connect to the database, run:
su postgres
psql openlibrary
Open Library's entities are stored as things in the thing table:
| id | key | type | latest_revision | created | last_modified |
|---|---|---|---|---|---|
It is useful to identify the id of specific types: /type/author, /type/work, /type/edition, /type/user
SELECT * FROM thing WHERE key='/type/author' OR key='/type/edition' OR key='/type/work' OR key='/type/user';
This query returns something like:
| id | key | type | latest_revision | created | last_modified |
|---|---|---|---|---|---|
| 17872418 | /type/work | 1 | 14 | 2008-08-18 22:51:38.685066 | 2010-08-09 23:37:25.678493 |
| 22 | /type/user | 1 | 5 | 2008-03-19 16:44:20.354477 | 2009-03-16 06:21:53.030443 |
| 52 | /type/edition | 1 | 33 | 2008-03-19 16:44:24.216334 | 2009-09-22 10:44:06.178888 |
| 58 | /type/author | 1 | 11 | 2008-03-19 16:44:24.216334 | 2009-06-29 12:35:31.346997 |
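Because the numeric ids in this table can differ from one instance to another, it may be convenient to resolve a type by its key inside the counting query instead of hard-coding the id. The following is a minimal sketch of that idea, not part of Open Library itself: it uses an in-memory SQLite table as a stand-in for the Postgres thing table, models only three of its columns, and fills it with invented sample rows. The helper name count_things is hypothetical.

```python
import sqlite3

# In-memory stand-in for the Postgres `thing` table. Assumption: only the
# id, key and type columns are modelled, and every row is made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE thing (id INTEGER PRIMARY KEY, key TEXT UNIQUE, type INTEGER)"
)
conn.executemany(
    "INSERT INTO thing (id, key, type) VALUES (?, ?, ?)",
    [
        (58, "/type/author", 1),    # type rows, ids borrowed from the table above
        (52, "/type/edition", 1),
        (100, "/authors/OL1A", 58),  # two sample authors
        (101, "/authors/OL2A", 58),
        (102, "/books/OL1M", 52),    # one sample edition
    ],
)


def count_things(conn, type_key):
    """Count rows whose type is the thing identified by `type_key`."""
    query = """
        SELECT count(*) FROM thing
        WHERE type = (SELECT id FROM thing WHERE key = ?)
    """
    return conn.execute(query, (type_key,)).fetchone()[0]


print(count_things(conn, "/type/author"))   # 2
print(count_things(conn, "/type/edition"))  # 1
print(count_things(conn, "/type/work"))     # 0, no such type in the sample data
```

The subquery form is plain SQL, so the same SELECT should carry over to psql with the placeholder replaced by a quoted key such as '/type/author'.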
To count authors (58 is the id of /type/author in the sample output above; use the id returned on your own instance):
SELECT count(*) as count FROM thing WHERE type='58';
To count works:
SELECT count(*) as count FROM thing WHERE type='17872418';
To count editions:
SELECT count(*) as count FROM thing WHERE type='52';
To count users:
SELECT count(*) as count FROM thing WHERE type='22';
Caching
The home page is cached by default. To clear the cache for any page, restart memcached with the following command. Note that this drops every cached entry, not just a single page:
docker compose restart memcached