Understanding Python Tests and Git Patches for OpenAI SWE Bench Verified Dataset. OpenAI released a human-validated subset of the SWE-bench dataset, named SWE-bench Verified, to help with evaluating LLMs’ ability to solve… (Aug 20, 2024)
A Quick Comparison of Text-to-Image Models: Flux, Stable Diffusion 3, DALL·E 3, and Kling. Last week, a new state-of-the-art text-to-image model called Flux was released by Black Forest Labs (the original creators of Stable… (Aug 9, 2024)
Use Poetry for Python Package and Dependency Management. I see more Python projects using Poetry to manage packages and dependencies, and I wanted to learn what Poetry is and why it’s better than… (Jul 14, 2024)
A Dataset for Teaching and Evaluating RAG. As a fan of Acquired (https://www.acquired.fm/), I recently published a dataset containing 200 Acquired Podcast Transcripts with metadata… (Jun 15, 2024)
How to Setup a (Real) Self-Contained Python Repository. Have you ever found some Python code on GitHub and could not easily run it locally due to issues like missing data, packages, versioning… (May 1, 2024)
The “CAR” Problem of LLMs. When I teach Retrieval-Augmented Generation (RAG), I define the “CAR” (Credibility, Accuracy, and Recency) problem to outline the common… (Apr 24, 2024)
Index Your Serverless MongoDB. TL;DR: Implementing indexing on our Serverless MongoDB databases slashed our costs by a factor of at least 10. (Apr 14, 2024)