openclaw · claude-code · v1.0.0
Litdb Expert Skill
@jkitchin · ⭐ 74 stars · last commit 4mo ago · 4 open issues
You are an expert assistant for litdb, a literature and document database tool designed to help researchers curate and search their collection of scientific literature.
7.3/10 · Verified · Mar 9, 2026
// RATINGS
ProSkills Score: 7.3/10 (AI Verified)
Not yet listed on ClawHub or SkillsMP
// README
#+title: litdb - a literature and document database
#+attr_org: :width 600
[[./litdb.png]]
#+BEGIN_html
<a href="https://github.com/jkitchin/litdb/actions/workflows/build.yml">
<img src="https://github.com/jkitchin/litdb/actions/workflows/build.yml/badge.svg" alt="Build Status">
</a>
<img src="https://img.shields.io/badge/tests-133_passed,_66_skipped-brightgreen" alt="Tests">
#+END_HTML
* litdb concept
litdb is a tool to help you curate and use your collection of scientific literature. You use it to collect and search papers. You can use it to collect older articles, and to keep up with newer articles. litdb uses https://openalex.org for searching the scientific literature, and https://turso.tech/libsql to store results in a local database.
The idea is you add papers to your database, and then you can search it with natural language queries, and interact with it via an ollama GPT application. It will show you the papers that best match your query. You can read those papers, get bibtex entries for them, or add new papers based on the references, papers that cite that paper, or related papers. You can also set up filters that you update when you want to get new papers created since the last time you checked.
** videos
1. https://www.youtube.com/live/e-J3Bh2Uti4 Introduction to litdb
2. https://www.youtube.com/live/teW68WogulU local files (volume is very low for some reason)
3. https://youtube.com/live/3LltpiiQaR8 CrossRef, reviewer suggestions, COA
4. https://youtube.com/live/ZkKKuvVUWkE litdb and Emacs
5. https://youtube.com/live/j7rItPwWDaY litdb and Jupyter Lab
6. https://youtube.com/live/SUtvtc7l6y0 litdb + GPT enhancements
7. https://youtube.com/live/3FZ1ROnCC6Y litdb + LiteLLM and streamlit
8. https://www.youtube.com/live/IKKTQSTXQmc litdb + YouTube and audio
9. https://youtube.com/live/MEf9rPI0Z1M litdb + Image search with text and image queries using CLIP
10. https://youtube.com/live/C4qCam0shf8 litdb + deep research
11. https://youtube.com/live/6Wpy7KM3wIM litdb + LLM-augmented search of OpenAlex
12. https://www.youtube.com/live/eS0D-Aje_6A litdb + Claude Desktop
** installation
litdb is on PyPI.
#+BEGIN_SRC sh
pip install litdb
#+END_SRC
To get the latest development version, you can install it directly from GitHub.
#+BEGIN_SRC sh
pip install git+https://github.com/jkitchin/litdb
#+END_SRC
litdb relies on a lot of ML-related packages, and version conflicts are common. If you have any issues, I recommend you install it with uv (https://github.com/astral-sh/uv) like this:
#+BEGIN_SRC sh
uv venv --python 3.13
source .venv/bin/activate
uv pip install litdb
#+END_SRC
You have to activate that virtual env to use litdb, which may be annoying.
** Recent additions
*** Version 2.1.8
- *fromtext command*: Extract and add references from pasted text using LLM parsing and CrossRef matching. See [[./FROMTEXT_USAGE.md][detailed documentation]].
- *summary command*: Generate newsletter-style summaries of recent articles with automatic topic extraction and classification.
- *extract command*: Extract tables from PDF files.
- *schema command*: Extract structured data from documents using a flexible schema DSL or JSON format.
- *Bug fix*: Corrected OpenAlex API parameter from ~email~ to ~mailto~ for proper API compliance.
- *Security fix*: Replaced unsafe ~eval()~ with ~ast.literal_eval()~ in schema parsing.
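The security fix above can be illustrated with a minimal sketch. It assumes a hypothetical helper called ~parse_literal~ (not litdb's actual schema parser) and only shows the general difference: ~ast.literal_eval()~ accepts plain Python literals and rejects anything that would run code.

#+BEGIN_SRC python
import ast


def parse_literal(text):
    """Parse a Python literal from text without executing any code.

    Hypothetical helper for illustration only; not part of litdb.
    """
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as err:
        raise ValueError(f"not a plain literal: {text!r}") from err


# A dict literal is parsed into a normal Python object.
fields = parse_literal("{'name': 'str', 'year': 'int'}")

# A function call is rejected instead of being executed, unlike eval().
try:
    parse_literal("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
#+END_SRC

With ~eval()~ the second string would have been executed as code; here it only raises an error.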
** configuration
You have to create a TOML configuration file called litdb.toml. The directory containing this file is considered the root directory. Every command starts in the current working directory and searches upward through the parent directories to find this file. You can put the file in your home directory, or keep separate files in sub-directories, e.g. one litdb per project.
There are a few choices you have to make. You have to choose a SentenceTransformer model and specify the size of the vectors it produces. You also have to specify the ~chunk_size~ and ~chunk_overlap~ settings that are used to break documents into chunks for computing document-level embedding vectors.
You will need an OpenAlex premium key if you want to use the ~update-filters~ feature.
#+BEGIN_EXAMPLE
[embedding]
# SentenceTransformer model for vector search
model = 'all-MiniLM-L6-v2'
cross-encoder = 'cross-encoder/ms-marco-MiniLM-L-6-v2'
chunk_size = 1000
chunk_overlap = 200
[openalex]
# Email for OpenAlex API (polite pool access)
email = "you@example.com"
# Optional: Premium API key for update-filters feature
api_key = "..."
[gpt]
# Model for the 'litdb gpt' command (uses ollama)
model = "llama2"
[llm]
# Model for LiteLLM-based commands (chat, fromtext, summary, schema, research)
# Supports any provider: ollama/*, openai/*, anthropic/*, google_genai/*, etc.
model = "ollama/llama2"
#+END_EXAMPLE
Configuration sections and their usage:
- *[embedding]*: Used by all vector search operations (~vsearch~, ~hybrid-search~, etc.)
- *[openalex]*: Required for OpenAlex searches and updates
- *[gpt]*: Used by the ~litdb gpt~ command (ollama-only)
- *[llm]*: Used by ~chat~, ~fromtext~, ~summary~, ~schema~, and ~research~ commands (supports all LiteLLM providers)
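To make the ~chunk_size~ and ~chunk_overlap~ settings concrete, here is a minimal sketch of overlapping chunking. It assumes sizes are counted in characters and uses a hypothetical function ~chunk_text~; litdb's actual splitter may work differently (for example, by splitting on separators or tokens).

#+BEGIN_SRC python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    Hypothetical helper for illustration; not litdb's implementation.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# A 2500-character document with the settings from the example config.
chunks = chunk_text("x" * 2500, chunk_size=1000, chunk_overlap=200)
sizes = [len(c) for c in chunks]
#+END_SRC

A larger overlap keeps more context shared between neighboring chunks, at the cost of producing more chunks to embed.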
You can set the LITDB_ROOT environment variable to the root of your default litdb project. This should be a directory with a litdb.toml file in it.
#+BEGIN_SRC sh
export LITDB_ROOT="/path/to/your/default/litdb"
#+END_SRC
When you run a litdb command, it first looks for a dominating litdb.toml file (one in the current directory or any of its parents); finding one means you are running the command inside a litdb project. If none is found, it checks the LITDB_ROOT environment variable and uses that directory if it is set. Finally, if that is not set either, it prompts you to make a new project in the current directory.
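The lookup order described above can be sketched roughly as follows. The function name ~find_project_root~ is hypothetical, and checking that the LITDB_ROOT directory actually contains a litdb.toml is an assumption; this is an illustration of the idea, not litdb's own code.

#+BEGIN_SRC python
import os
from pathlib import Path

CONFIG_NAME = "litdb.toml"


def find_project_root(start, env=None):
    """Return the directory that holds litdb.toml, or None.

    1. Walk from `start` up through its parents (the dominating file).
    2. Fall back to the LITDB_ROOT environment variable.
    Returning None is the point where a tool would offer to create
    a new project. Hypothetical helper; not part of litdb.
    """
    env = os.environ if env is None else env
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        if (directory / CONFIG_NAME).is_file():
            return directory
    fallback = env.get("LITDB_ROOT")
    # Assumption: the fallback only counts if it contains litdb.toml.
    if fallback and (Path(fallback) / CONFIG_NAME).is_file():
        return Path(fallback).resolve()
    return None
#+END_SRC

Calling ~find_project_root(Path.cwd())~ from a nested project directory would return the nearest enclosing directory that contains litdb.toml.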
* Using litdb
Your litdb starts out empty. You have to add articles that are relevant to you. The best way to build a litdb is an open question, and the answer surely depends on your aim. You have to trade off breadth and depth against the size of the database. The CLI makes this pretty easy.
litdb has a CLI with an entry command of ~litdb~ and subcommands (like git) for interacting with it. You can see all the options with this command.
#+BEGIN_SRC sh :dir example
litdb --help
#+END_SRC
** Command Organization
litdb's 43 commands are organized into logical groups:
- *Management*: init, add, remove, index, reindex, update_embeddings
- *Search*: vsearch, fulltext, hybrid_search, lsearch, similar, image_search, screenshot
- *Export*: bibtex, citation, show, visit, about, sql
- *Tags*: add_tag, rm_tag, delete_tag, show_tag, list_tags
- *Review*: review, summary
- *Filters*: add_filter, rm_filter, update_filters, list_filters
- *OpenAlex*: openalex, author_search, follow, watch, citing, related, unpaywall
- *Research*: fhresearch, research, suggest_reviewers
- *Data Processing*: crossref, fromtext, extract, schema, crawl
- *Utilities*: web, audio, chat, app, version, coa
** Searching the web
You have to start somewhere. You can use this to open a search in OpenAlex.
#+BEGIN_SRC sh
litdb web query
#+END_SRC
You can also open searches with these options:
| option                | source         |
|-----------------------+----------------|
| -g, --google          | Google         |
| -gs, --google-scholar | Google Scholar |
| -ar, --arxiv          | arXiv          |
| -pm, --pubmed         | PubMed         |
| -cr, --chemrxiv       | ChemRxiv       |
| -br, --biorxiv        | bioRxiv        |
| -a, --all             | All            |
You can find starting points this way.
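Conceptually, opening a web search amounts to building a search URL and handing it to the browser. The sketch below shows that idea for a few of the sources in the table. The URL templates and the names ~SEARCH_URLS~, ~search_url~ and ~open_search~ are assumptions chosen for illustration; they are not taken from litdb.

#+BEGIN_SRC python
import webbrowser
from urllib.parse import quote_plus

# Assumed public search URL patterns; litdb may use different ones.
SEARCH_URLS = {
    "google": "https://www.google.com/search?q={q}",
    "google-scholar": "https://scholar.google.com/scholar?q={q}",
    "arxiv": "https://arxiv.org/search/?query={q}&searchtype=all",
    "pubmed": "https://pubmed.ncbi.nlm.nih.gov/?term={q}",
}


def search_url(source, query):
    """Build the search URL for one source, URL-encoding the query."""
    return SEARCH_URLS[source].format(q=quote_plus(query))


def open_search(source, query):
    """Open the search in the default web browser."""
    return webbrowser.open(search_url(source, query))


url = search_url("pubmed", "oxygen evolution")
#+END_SRC

Keeping the templates in one dictionary makes it straightforward to loop over every source, which is roughly what an "all" option would need to do.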
*** Fine-tuned search in OpenAlex
This is the default query in OpenAlex. It does not change your litdb; it just runs a simple text search on works.
#+BEGIN_SRC sh
litdb openalex query
#+END_SRC
You can get more specific with a filter:
#+BEGIN_SRC sh
litdb openalex -f 'author.orcid:https://orcid.org/0000-0003-2625-9232'
#+END_SRC
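For orientation, the same kind of filter can be sent to the public OpenAlex REST API directly. The sketch below only builds the request URL; it assumes the ~-f~ option maps onto the API's ~filter~ query parameter, and the helper name ~openalex_url~ and the placeholder email are hypothetical.

#+BEGIN_SRC python
from urllib.parse import urlencode

OPENALEX_API = "https://api.openalex.org"


def openalex_url(endpoint="works", search=None, filter_expr=None, mailto=None):
    """Build an OpenAlex API URL from optional query parameters.

    Hypothetical helper for illustration; not part of litdb.
    """
    params = {}
    if search:
        params["search"] = search
    if filter_expr:
        params["filter"] = filter_expr
    if mailto:
        # OpenAlex identifies polite-pool requests by the mailto parameter.
        params["mailto"] = mailto
    return f"{OPENALEX_API}/{endpoint}?{urlencode(params)}"


url = openalex_url(
    filter_expr="author.orcid:https://orcid.org/0000-0003-2625-9232",
    mailto="you@example.com",  # placeholder address
)
#+END_SRC

Fetching that URL with any HTTP client should return JSON describing the matching works.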
You can also search other endpoints and use filters. Here we search Sources for display names that contain the word discovery.
#+BEGIN_SRC sh
litdb openalex -e sources -f d
#+END_SRC
// HOW IT'S BUILT
TECHNOLOGY STACK
JavaScript
This skill is built with JavaScript and includes automated tests.
KEY FILES
README.org, SKILL.md, TESTING.md, test_fromtext.sh
// PROSKILLS SCORE
7.3/10 (Good)
BREAKDOWN
Code Quality: 7/10
Documentation: 7/10
Functionality: 7.5/10
Maintenance: 8/10
Security: 7.5/10
Uniqueness: 7/10
Usefulness: 7/10