Hallucinations in LLMs: Mitigating Factors and Mechanistic Interpretation

Hallucinations in Large Language Models

Introduction

Large Language Models (LLMs) have shown remarkable capabilities in generating human-like text, but they are prone to producing “hallucinations”: confident assertions that are factually incorrect or nonsensical. This post explores the phenomenon of hallucinations in LLMs, examining their underlying causes, mitigation strategies, and attempts at mechanistic interpretation.

LLM Poisoning: Attacks and Defenses Against Large Language Models

LLM Poisoning: Attacks and Defenses

Introduction

Large Language Models are vulnerable to various forms of poisoning attacks that can compromise their integrity, reliability, and safety. Understanding these attack vectors and developing robust defenses is crucial for maintaining trust in AI systems.

LLM SEO: Optimizing Content for Large Language Models

Introduction

As Large Language Models become central to information retrieval and content generation, a new form of Search Engine Optimization (SEO) has emerged. LLM SEO focuses on optimizing content to be effectively indexed, ranked, and retrieved by AI systems, rather than traditional search engines.

LLM Steering: Controlling Model Behavior and Outputs

LLM Steering: Controlling Model Behavior

Introduction

As Large Language Models become more powerful, the ability to reliably steer their behavior becomes increasingly important. LLM steering encompasses various techniques to guide model outputs toward desired outcomes while maintaining coherence and usefulness.

Prompt Trajectory: Human vs Machine Generated Prompts

Introduction

The way prompts are crafted and processed by Large Language Models reveals fundamental differences between human and machine-generated content. Understanding prompt trajectories - how prompts evolve and are interpreted through the AI system - provides insights into human-AI interaction patterns and optimization opportunities.

Scaling Factors and Emergent Behavior in Large Language Models

Scaling Factors and Emergent Behavior in LLMs

Introduction

The dramatic improvements in Large Language Model performance have largely been driven by scaling - increasing model size, training data, and computational resources. This scaling has led to the emergence of unexpected capabilities, challenging our understanding of intelligence and learning.

Part 2: Deep Learning FPGA Acceleration with Python - 'Inference'

Overview

In this part we build out our hardware acceleration library to handle inference for an MLP neural network using the Fisher Iris dataset. We are building a library that generates Verilog for FPGA synthesis.

Part 1: Deep Learning FPGA Acceleration with Python - 'Quantization and Data Pipelines'

Overview

This is a project aimed at building custom per-model hardware accelerators for machine learning systems to reduce overall time and power cost in inference and training.

Part 2: Deep Learning FPGA Acceleration with Python - 'Inference'

Overview

In this part we build out our hardware acceleration library to handle training for an MLP model using the Fisher Iris dataset. We will go through each layer to determine what modifications are needed for training and gradient calculations.

How to finetune a model then deploy it to the web: Part 2 - 'Inference with Python and JavaScript'

Overview

This is part two of a two-part blog series. If you haven’t read part 1, I recommend you do so first to get caught up. All the code referenced can be found in full working order here.

How to finetune a model then deploy it to the web: Part 1 - 'Training by finetuning'

Overview

One problem with getting started with ML nowadays is how daunting some of the ‘interesting’ things can be. The models are large, often too large for regular computers. This is a two-part blog on how one might work with a larger model in a manageable way. I’m on a MacBook Pro M2 with 16 GB of RAM, which admittedly is quite powerful, but this method should be repeatable on most hardware. It should serve as something of an introduction to ML, and will be especially helpful if you have prior programming experience but have yet to apply it to ML.

What the Heck is Serverless

Whatever your technical expertise, this will be a nice introduction to serverless and its pros and cons.

Machine Learning (Usually) isn't the Right Solution

This is a weird one. As someone who has made their living from machine learning, this probably isn’t what you’d expect from me. But really, most problems don’t need machine learning! I’ll help you build an intuition for the types of problems machine learning tends to solve well. This applies equally whether or not you are a developer; no technical know-how is required.

