Harry Wang | Understanding Python Tests and Git Patches for OpenAI SWE Bench Verified Dataset | OpenAI released a human-validated subset of the SWE-bench dataset, named SWE-bench Verified, to help with evaluating LLMs’ ability to solve… | Aug 20
Harry Wang | A Quick Comparison of Text-to-Image Models: Flux, Stable Diffusion 3, DALL·E 3, and Kling | Last week, a new state-of-the-art text-to-image model called Flux was released by Black Forest Labs (the original creators of Stable… | Aug 9
Harry Wang | Use Poetry for Python Package and Dependency Management | I see more Python projects using Poetry to manage packages and dependencies and want to learn what Poetry is and why it’s better than… | Jul 14
Harry Wang | A Dataset for Teaching and Evaluating RAG | As a fan of Acquired (https://www.acquired.fm/), I recently published a dataset containing 200 Acquired Podcast Transcripts with metadata… | Jun 15
Harry Wang | How to Setup a (Real) Self-Contained Python Repository | Have you ever found some Python code on GitHub and could not easily run it locally due to issues like missing data, packages, versioning… | May 1
Harry Wang | The “CAR” Problem of LLMs | When I teach Retrieval-Augmented Generation (RAG), I define the “CAR” (Credibility, Accuracy, and Recency) problem to outline the common… | Apr 24
Harry Wang | Index Your Serverless MongoDB | TL;DR: Implementing indexing on our Serverless MongoDB databases slashed our costs by a factor of at least 10. | Apr 14