About Paul Graham GPT

Paul Graham GPT is an AI-powered tool for searching and chatting with all of Paul Graham’s essays. It is an open-source project developed by Mckay Wrigley. Rather than reading the essays one at a time, users can query the whole collection and ask questions about it conversationally.

Features of Paul Graham GPT

  1. AI-Powered Search and Chat Interface:
    • The search interface is built with OpenAI Embeddings (text-embedding-ada-002). The tool loops over the essays and generates an embedding for each chunk of text. When a user enters a search query, the system generates an embedding for the query and compares it, using cosine similarity, against the stored vectors in the database. The most similar passages are ranked by similarity score and presented to the user.
    • The chat interface builds upon the search feature. It uses the search results to create a prompt that is then fed into GPT-3.5-turbo. This allows for a chat-like experience where users can ask questions about the essays and receive answers.
  2. Open-Source Dataset:
    • The dataset used by Paul Graham GPT is a CSV file containing all of the text and embeddings. It is fully open source, and anyone can download and use it.
  3. Integration with OpenAI and Supabase:
    • The tool uses OpenAI to generate embeddings. For data storage, it uses a Postgres database with the pgvector extension hosted on Supabase. Supabase is recommended for its ease of use, but users can opt for other storage methods if they prefer.
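
The search-then-chat flow described in the features above can be illustrated with a minimal Python sketch. It is not the project’s actual code: the function names, the toy three-dimensional vectors, and the prompt wording are assumptions made for illustration, and the in-memory list stands in for the pgvector database. In the real tool the embeddings would come from text-embedding-ada-002 and the prompt would be sent to GPT-3.5-turbo.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def rank_passages(query_embedding, passages, top_k=3):
    """Score every stored passage against the query and keep the best top_k."""
    scored = [
        (cosine_similarity(query_embedding, p["embedding"]), p["text"])
        for p in passages
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def build_prompt(question, ranked):
    """Combine the top passages and the question into one chat prompt (hypothetical wording)."""
    context = "\n\n".join(text for _, text in ranked)
    return (
        "Use the following passages to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Toy data: placeholder passages with made-up embeddings (assumption, not real output).
passages = [
    {"text": "Passage A (about startups)", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Passage B (about writing)", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Passage C (about founders)", "embedding": [0.7, 0.6, 0.1]},
]
query_embedding = [1.0, 0.2, 0.0]  # stands in for the embedded user query

top = rank_passages(query_embedding, passages, top_k=2)
prompt = build_prompt("What should a startup build?", top)
```

With these toy vectors, Passage A ranks first and Passage C second, so only those two appear in the prompt. A production setup would typically let the database perform the similarity search rather than scoring every row in application code.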

Additional Features

  1. Running Locally:
    • Developers who want to run the tool locally can follow the instructions in the repository. Setup requires OpenAI (to generate embeddings) and Supabase (to create the database), and the repository includes a schema.sql file to help set up the database. Once that is done, developers can clone the repository, install dependencies, and run the application.
  2. Scraping and Embedding Scripts:
    • The tool comes with scripts that scrape all the essays from Paul Graham’s website and save them as a JSON file. Another script reads this JSON file, generates embeddings for each text chunk, and saves the results to the database.
  3. Credits and Contact:
    • Mckay Wrigley credits Paul Graham for his essays, noting that they inspired him to learn to code, which changed his life. For questions, Mckay Wrigley can be reached on Twitter.
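
The embedding script described under Scraping and Embedding Scripts works on chunks of text, but the chunking rule itself is not stated here. The Python sketch below shows one plausible approach, assuming sentence-based chunks capped at a character limit; the function name chunk_text and the max_chars parameter are hypothetical and not taken from the repository.

```python
import re


def chunk_text(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars characters.

    Assumption: a single sentence longer than max_chars is kept as its own
    oversized chunk rather than being cut mid-sentence.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


essay = (
    "First sentence here. Second sentence follows. "
    "Third one is a bit longer than the others."
)
chunks = chunk_text(essay, max_chars=50)
```

Each resulting chunk would then be sent to the embedding model and stored alongside its text, so that a search can return the specific passage rather than a whole essay.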